Over 1,400 AI medical devices are on the market, and premarket studies are rarely prospective, randomized, multisite, or representative of diverse patient populations.
About 90% of physicians believe randomized trials are conducted before an AI medical device reaches clinical use, an expectation the premarket evidence rarely supports.
Writing in Annals of Internal Medicine, Kyra Rosen and Kenneth Mandl, MD, detail how AI/ML devices actually reach the market. Nearly all — 97% — are authorized through the 510(k) pathway, which requires only substantial equivalence to a previously authorized device rather than an independent demonstration of safety or effectiveness. Predicate chains can include older or even recalled devices, and about one third of AI/ML devices rely on non-AI comparators.
Together, these gaps point to a broader concern: a mismatch between what clinicians assume FDA clearance signals and what it is designed to establish.
Existing postmarket surveillance systems are poorly suited to capturing AI/ML performance. The MAUDE database may capture as few as 0.5% of adverse events, clinicians may not recognize when probabilistic models underperform, and updates authorized under predetermined change control plans can proceed without independent review of postmodification testing.
Many of these limitations stem from gaps in data infrastructure, surveillance systems, and reporting incentives. The systems needed to track real-world performance remain underdeveloped, and industry participation in monitoring programs is limited. Meanwhile, proposed federal policy changes could reduce transparency for some AI tools outside the FDA device pathway, including by eliminating “model card” requirements for certain technologies.
As the authors put it: "Physicians and health delivery systems need to understand which AI/ML tools fall under FDA regulation and what clearance actually entails to integrate them into clinical practice effectively."
Procurement teams and health systems should not treat FDA clearance as the end of due diligence. The authors recommend demanding training data characteristics, subgroup performance, and validation populations — and approaching with skepticism any tool whose validation populations differ from the patients the health system serves.
Disclosure forms are available with the article online.
Source: Annals of Internal Medicine