Drowning in Data, Starving for Insight

A look at the eye care data crisis

 


I am an optimist about the future of AI in eye care. However, I have to be brutally honest about our present limitations. We, as an industry, are standing in our own way. We are generating petabytes of ophthalmic data—OCT scans, fundus photos, biometry measurements—but we have locked it all away in isolated digital silos. The grand vision of large language models (LLMs) and predictive AI transforming clinical decisions is being choked by a single, self-inflicted wound: the catastrophic lack of data standardization and connectedness. We are our own biggest problem!

The Problem of Connectedness

The challenge of acquiring a sufficiently large and high-quality dataset to train a robust LLM is not a problem of volume; it is a problem of connectedness. Right now, we are living through a true data disaster. Our data resides in tens of thousands of disconnected silos called electronic health records (EHRs), systems that have evolved remarkably little in the 30 years since their development.

Digital Chaos

There are two inherent problems with this aging technology. The first is non-standardized digital chaos. Within these digital filing systems, the information we collect is not standardized. For example, one practice may record “mild dry eye” in a free-text note; another might use a structured ICD-10 code; a third might use a proprietary severity scale. This non-uniformity renders aggregated data virtually useless for AI training until countless hours are spent cleaning, normalizing, and harmonizing the input. A model cannot learn if it cannot consistently define the disease.
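The fragmentation problem can be made concrete with a small sketch. The three records below are hypothetical (the field names, the ICD-10 code, and the 0–4 scale are all assumptions for illustration), but they show why naive aggregation fails: the same diagnosis arrives under three incompatible representations, so there is not even a common field to group on.

```python
# Illustration only: three hypothetical records describing the same condition.
# Field names, codes, and the proprietary scale are assumptions, not a real schema.
records = [
    {"source": "practice_a", "note": "pt reports mild dry eye, worse in evenings"},  # free text
    {"source": "practice_b", "icd10": "H04.123"},       # structured diagnosis code
    {"source": "practice_c", "dryness_scale": 2},       # proprietary 0-4 severity scale
]

# A naive aggregation cannot even agree on which field carries the diagnosis:
fields = {key for rec in records for key in rec if key != "source"}
print(fields)  # three different representations of one diagnosis
```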

 

Worse still, the systems themselves (or at least those in charge of them) are hostile to connection. Dozens of proprietary EHR vendors often make data transfer difficult, incomplete, and expensive. This is not just a failure of interoperability between systems, but a failure of connectedness within systems. A patient’s record might be fractured across two clinics that use the same EHR vendor simply because the practices operate as separate entities with no mandated data-sharing infrastructure. This insular approach is the very antithesis of modern, large-scale data science.

This fragmentation is not just annoying; it is a direct inhibitor to the evolution of care. We cannot aggregate the evidence needed to identify game-changing trends because that data is scattered and non-comparable. This is simply unacceptable in a world of globally integrated data.

Creating a Holistic Data Model

The real tragedy here is the inability to create the holistic, multimodal dataset across the industry that is required to advance evidence-based medicine (EBM) in eye care. True clinical advancement requires fusing all aspects of a patient’s journey. There are three main areas of “fusion” that need to occur.

 

First, we need to overcome our diagnostic disconnect. We cannot easily link the precise quantitative data from a diagnostic technology (e.g., an OCT-A vascular density map) to a provider’s clinical observation (e.g., “macular edema resolved”) and the patient’s subjective history (e.g., “vision is now 20/20”). These remain separate files, often in separate systems, with separate, non-linked identifiers, making it nearly impossible to correlate various pieces of the holistic picture.

 

Second, there is a void in genotype-phenotype correlation. The true frontier of EBM lies in connecting a patient’s genotype (their genetic makeup) to their phenotype (their observable disease presentation and progression). This is impossible when clinical data is siloed and genetic data is managed (if we want to call it that) by a third party with no standardized, consent-driven pipeline to bring the two together for large-scale analysis.

 

And third, perhaps the most egregious gap, is our collective failure to rigorously and uniformly track treatment success or failure over time. We initiate therapy (e.g., a specific anti-VEGF agent, a glaucoma surgery protocol, or myopia management), but the outcome is often buried in unstructured follow-up notes. LLMs thrive on identifying patterns of response and non-response. Without clean, longitudinal data that definitively links intervention to outcome, we cannot develop the robust, predictive EBM treatment protocols needed to revolutionize patient care. We are forcing clinicians to practice based on small-scale trial data or, worse yet, their own dataset with an “N” of one, rather than the collective, real-world experience of the entire global eye care community.
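The longitudinal linkage described above can be sketched in a few lines. Once an intervention record and its later structured outcome share a common patient identifier, response patterns become trivially computable; without that link, the same question requires manual chart review. All agents, field names, and values below are synthetic placeholders.

```python
# Minimal sketch: join interventions to structured follow-up outcomes by a
# shared patient identifier, then tally response per agent. All data synthetic.
from collections import defaultdict

interventions = [
    {"patient": "p1", "agent": "anti_vegf_a"},
    {"patient": "p2", "agent": "anti_vegf_a"},
    {"patient": "p3", "agent": "anti_vegf_b"},
]
outcomes = [  # structured endpoints, linked by patient identifier
    {"patient": "p1", "edema_resolved": True},
    {"patient": "p2", "edema_resolved": False},
    {"patient": "p3", "edema_resolved": True},
]

outcome_by_patient = {o["patient"]: o["edema_resolved"] for o in outcomes}
tally = defaultdict(lambda: [0, 0])  # agent -> [responders, treated]
for rx in interventions:
    tally[rx["agent"]][1] += 1
    tally[rx["agent"]][0] += outcome_by_patient[rx["patient"]]

for agent, (responders, treated) in sorted(tally.items()):
    print(agent, f"{responders}/{treated} responded")
```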

 

We need solutions! 

Those solutions demand a unified effort involving technology, regulation and a complete change in mindset. We can overcome this fragmentation and unlock the potential of LLMs and AI if we meet a few goals.

FHIR Adoption

First, we need mandatory adoption of the Fast Healthcare Interoperability Resources (FHIR) standard. We must move beyond “information blocking” regulations and the proprietary mindsets of data-gatekeeper companies to mandate the adoption of FHIR by all EHR and device vendors. Crucially, the industry must collaboratively establish a specific, rigorous FHIR profile to ensure the standardized capture of key eye care data elements (e.g., visual acuity, intraocular pressure, OCT layer thickness) so that a “mild diabetic retinopathy” observation is coded identically everywhere. This will require not just accepted standardization, but entirely new and more expansive code sets that capture detail ICD-10, much less the generally archaic and simplistic 5-digit CPT, could never hope to.
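To make the idea of standardized capture concrete, here is a sketch of a FHIR R4 Observation for intraocular pressure built as a plain dictionary. The resource shape (status, code.coding, valueQuantity) follows the FHIR specification; the LOINC code is deliberately left as a placeholder to be replaced with the real concept from a terminology server. The point is that every vendor would emit the same structure, regardless of its internal storage.

```python
import json

# Sketch of a FHIR R4 Observation for intraocular pressure. The LOINC code is
# a placeholder ("XXXXX-X"); "mm[Hg]" is the UCUM unit code for mmHg.
iop = {
    "resourceType": "Observation",
    "status": "final",
    "code": {"coding": [{"system": "http://loinc.org",
                         "code": "XXXXX-X",          # placeholder, not a real code
                         "display": "Intraocular pressure"}]},
    "subject": {"reference": "Patient/example"},     # illustrative reference
    "valueQuantity": {"value": 17, "unit": "mmHg",
                      "system": "http://unitsofmeasure.org", "code": "mm[Hg]"},
}
print(json.dumps(iop, indent=2))
```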

Open Architecture Knowledge Networks

A key component of interoperability is the development and universal acceptance of open-architecture knowledge networks. Instead of physically moving sensitive patient data, we can borrow from federated learning: a central AI model is trained across thousands of decentralized data silos. The data never leaves the practice’s system, but the knowledge—the updated model weights—is shared, providing the scale needed for training LLMs while preserving privacy.
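A toy round of federated averaging illustrates the mechanism: each site computes a weight update on its own data, and only those updates, never the records, are aggregated centrally. The clinics, gradients, and learning rate below are synthetic.

```python
# Toy federated-averaging round: sites share model updates, not patient data.
def local_update(weights, site_gradient, lr=0.1):
    """One gradient step computed locally at a site; data stays on-site."""
    return [w - lr * g for w, g in zip(weights, site_gradient)]

global_weights = [0.0, 0.0]
site_gradients = {   # synthetic per-site gradients standing in for local training
    "clinic_a": [1.0, -2.0],
    "clinic_b": [3.0, 0.0],
    "clinic_c": [-1.0, 1.0],
}

local_models = [local_update(global_weights, g) for g in site_gradients.values()]
# The central server averages the returned weights; no records ever moved.
global_weights = [sum(ws) / len(ws) for ws in zip(*local_models)]
print(global_weights)
```

In a real deployment the "gradient" would come from training on each clinic's records, and secure aggregation would hide even the individual updates from the server; the averaging step shown is the core of the technique.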

Fighting AI’s Data Problem

This knowledge system will require AI harmonization tools to fight AI’s data problem. New LLM tools can be developed specifically to ingest non-standardized free text and structured data from disparate sources and automatically harmonize and normalize it into a single “clean” format. This acts as a necessary bridge to the legacy EHR data that currently exists. However, it may be just as efficient to develop an entirely new knowledge system that does not inherit the incomplete EHR datasets of today.
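A rule-based stand-in shows what such a harmonization layer does: map heterogeneous inputs (free text, a diagnosis code, a proprietary scale) onto one internal label. A production system would use an LLM or a terminology service rather than hand-written rules, and every mapping below, including the code prefix and scale threshold, is illustrative.

```python
import re

# Toy harmonizer: collapse three source formats onto one internal label.
# All rules, codes, and thresholds here are illustrative assumptions.
def harmonize(record):
    if "icd10" in record and record["icd10"].startswith("H04.12"):
        return "dry_eye_mild"
    if "dryness_scale" in record:  # hypothetical proprietary 0-4 scale
        return "dry_eye_mild" if record["dryness_scale"] <= 2 else "dry_eye_severe"
    if "note" in record and re.search(r"\bmild dry eye\b", record["note"], re.I):
        return "dry_eye_mild"
    return "unmapped"

inputs = [
    {"note": "Pt reports mild dry eye symptoms"},
    {"icd10": "H04.123"},
    {"dryness_scale": 2},
]
print([harmonize(r) for r in inputs])  # all three collapse to one label
```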

 

To “entice” all parties to participate, regulatory bodies like CMS, along with other players, may want to consider establishing financial and quality-based incentives tied to the structured, longitudinal tracking of treatment endpoints. For example, when an anti-VEGF injection is given, the system should require specific, standardized data inputs regarding anatomical and functional outcomes at every follow-up visit, creating the definitive success/failure datasets needed for EBM protocol development. This, of course, would be much easier if interoperable, standardized data over a single universal knowledge system were the accepted norm.
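The "required inputs" idea can be sketched as a validated record type: a follow-up visit cannot be saved without standardized anatomical and functional endpoints, and implausible values are rejected at entry. The field names and plausibility ranges are assumptions for illustration, not a proposed clinical schema.

```python
from dataclasses import dataclass

# Sketch: a follow-up record that cannot exist without its required endpoints.
# Field names and plausibility ranges are illustrative assumptions.
@dataclass
class AntiVegfFollowUp:
    patient_id: str
    central_retinal_thickness_um: int  # anatomical endpoint (OCT)
    visual_acuity_logmar: float        # functional endpoint
    edema_resolved: bool

    def __post_init__(self):
        if not (100 <= self.central_retinal_thickness_um <= 1500):
            raise ValueError("implausible retinal thickness")
        if not (-0.3 <= self.visual_acuity_logmar <= 3.0):
            raise ValueError("implausible visual acuity")

visit = AntiVegfFollowUp("p1", 280, 0.1, True)
print(visit)
```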

Patient Empowerment is Key

However, the most fundamental solution will be transparent, patient-directed data portability. Empowering the patient is key. We need to establish an infrastructure that allows patients to easily and securely aggregate and share their own complete, interconnected records, including all clinical, imaging, and genetic data, via APIs that bypass many of the current practice-level data-sharing bottlenecks. This places the ultimate control and responsibility for data flow with the individual, accelerating knowledge while upholding privacy.
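At its simplest, patient-directed aggregation means the patient's app pulls records from each provider's API and merges them into one chronological record under the patient's own control. The endpoints below are mocked as lists and every record is fictional; a real implementation would authenticate against each provider's patient-access API.

```python
# Mocked provider endpoints; in reality these would be authenticated API calls.
clinic_api  = [{"date": "2024-01-10", "type": "exam",     "finding": "IOP 17 mmHg"}]
imaging_api = [{"date": "2024-01-10", "type": "oct",      "finding": "CRT 280 um"}]
lab_api     = [{"date": "2024-02-02", "type": "genetics", "finding": "variant report"}]

def aggregate(*sources):
    """Merge records from all sources into one chronologically ordered record."""
    merged = [rec for src in sources for rec in src]
    return sorted(merged, key=lambda r: (r["date"], r["type"]))

record = aggregate(clinic_api, imaging_api, lab_api)
print([r["type"] for r in record])  # exam, oct, genetics in date order
```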

 

The biggest point I want to make is that solving the eye care data crisis is not merely a technical challenge. It is a political and cultural one; in other words, it is a human one. It is time to tear down the silos and build the interoperable data infrastructure that our patients, and the future of eye care, deserve. Personal health information systems are in development that will do just that. As a health care industry, we must be ready to get out of our own way and work together. Let’s lead the charge for a better knowledge system to evolve health care.

Author

  • Scot Morris, OD

Scot Morris, OD, has practiced for 25 years in various clinical settings and served as a technology author, magazine chief optometric editor, corporate advisor, practice consultant, and prominent educator. He started or cofounded multiple companies within the eye care industry and participated in multiple clinical trials. Among the challenges he consistently hears about from providers, patients, companies, and the health system are inefficient care delivery, clinical decision-making errors, rising costs, access issues, and failure to provide connected care.

Through his various roles, Dr. Morris has focused on how to improve system efficiencies and how to market and teach improved care delivery to his peers. His peers voted him one of the 50 most influential people in eye care and one of the top 250 innovators in the industry. Driven to always find a better way and share that knowledge to make people and processes better, Dr. Morris has spent his entire career thinking about health care challenges, how to solve them, and how to educate others to do the same. As a result, he has spent the last few years focusing on these issues and codeveloping a knowledge platform called the AMI Knowledge System (AMIKnowS) to share and evolve knowledge, in hopes of solving many health care issues and enabling the delivery of accessible and unbiased health care regardless of income, education, or geography.


