The American healthcare system holds the largest collection of structured clinical data ever assembled. Decades of diagnoses, labs, prescriptions, procedures, and outcomes for hundreds of millions of people — a longitudinal record of what happened when they got sick, got treated, and either got better or didn't. Taken seriously, this data could tell us which treatments work for which patients, which billions in annual spending produce no measurable benefit, and which populations are dying of conditions we already know how to treat.

Almost none of it is being used, because the infrastructure for data access was built for a world that no longer exists.

AI doesn't submit projects. It sends prompts.

For decades, the unit of data access in healthcare has been the project. A researcher drafts a protocol, an IRB reviews it, a privacy officer evaluates the data request, a data use agreement is negotiated, an engineer prepares the tables. Six to eighteen months pass. Many requests are quietly outlasted until the investigator moves on and the question goes unanswered. The researcher who survives the process begins.

If this is how your institution shares data with its researchers, AI will bury you.

When an AI agent is asked to analyze readmission patterns, it writes a query, examines the result, refines the query, and writes another — a hundred times, perhaps, in the course of answering a single question. The unit of data access is no longer a multi-month project. It is a prompt. And the volume will grow by orders of magnitude, because demand will no longer track with analyst headcount. It will track with available compute.

Researcher era · 4 per week

Analytics era · 20 per week

Agent era · 100 per minute

No review process scales to AI.

The instinct is to reach for operational fixes. Faster committees. Better triage. More reviewers. None of them are sufficient, because the mismatch is mathematical: capacity grows linearly, demand grows exponentially. You can double your review staff — at ten to thirty thousand dollars per request in institutional overhead — and buy yourself a few months of headroom. The demand curve will consume it inside a quarter.

There is no faster-committee future. The committee model has a fixed ceiling, the demand curve does not, and the ceiling has already been reached.

This isn't about training data.

The conversation about AI and health data has focused, to its detriment, on whether clinical records should be used to train foundation models. The data access crisis arriving in healthcare has almost nothing to do with training.

Training is a discrete event: a model is built once, approved once. Inference is continuous — a deployed agent querying structured data with every prompt, every refinement, every follow-up, for questions that cannot be anticipated in advance. Structured clinical data is a weak substrate for training large language models. It is extraordinarily valuable at inference time. What does a proposed payer contract mean for revenue across last year's case volume? Which service lines are outliers on risk-adjusted readmissions? How many patients matching twelve inclusion criteria were seen here in the last three years? Each of those questions hits a data warehouse at inference time, and for each one the current compliance infrastructure offers the same answer: get in line.

Training · one-time event

Inference · continuous

Make the data safe before anyone asks.

If demand grows exponentially and review capacity grows linearly, the shape of the answer is settled before anyone writes a line of code. You cannot put a faster gate in front of an exponential flow. You move safety from the point of use to the point of origin — which is how critical infrastructure problems have been solved for a century.

Before treatment plants, cities accepted that thousands of residents would die from unsafe water. They replaced inspectors at the tap with infrastructure at the source: treatment plants that made every drop safe before it entered the pipe. The inspection moved from consumption to production, and then it disappeared.

Health data is ready for the same transformation: compliance that is a property of every output, not a gate in front of it.

Every request certified at origin

Work with us to build this future.

Become a design partner

Or see the platform →