Nearly one in four artificial intelligence (AI)-enabled medical devices approved by the U.S. Food and Drug Administration (FDA) were reported as having no clinical performance studies at the time of approval.
In a cross-sectional study, investigators assessed the clinical generalizability of AI-enabled medical devices approved by the FDA as of August 31, 2024. Lead study author Daniel Windecker, BMed, of the Department of Diagnostic and Interventional Neuroradiology at the University of Bern in Switzerland, and colleagues analyzed publicly available data on 903 devices listed in the FDA’s AI-enabled medical device database to determine the extent to which clinical performance and subgroup-specific data were reported at the time of regulatory approval.
Of the 903 devices, 505 (55.9%) included clinical performance studies in their FDA documentation, 218 (24.1%) explicitly noted that no such studies were conducted, and 180 (19.9%) did not specify whether a study had been performed. Among the clinical studies reported for approved devices, retrospective designs were most common (193 studies [38.2%]), followed by prospective studies (41 [8.1%]) and randomized trials (12 [2.4%]). Most devices (664 [73.5%]) were software only, and a small subset (6 [0.7%]) were implantable.
Discriminatory performance metrics were inconsistently reported. Among the 505 devices with clinical performance studies, sensitivity was provided for 183 (36.2%), specificity for 176 (34.9%), and area under the curve for 82 (16.2%). Subgroup data were infrequent: only 145 of these devices (28.7%) reported sex-specific outcomes, and 117 (23.2%) included age-related data.
Of all 903 devices, most (692 [76.6%]) were intended for radiology, followed by cardiovascular medicine (91 [10.1%]) and neurology (29 [3.2%]). Nearly all (877 [97.1%]) were approved via the 510(k) pathway, with limited use of the de novo pathway (22 devices [2.4%]).
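The percentages in this summary rest on two different denominators, which is easy to miss when reading the figures in sequence. The short Python sketch below recomputes them from the counts quoted above. The grouping of each count under a denominator (all 903 devices, or the 505 devices with clinical performance studies) is an assumption inferred from the arithmetic, and the names `pct`, `OF_ALL_DEVICES`, and `OF_DEVICES_WITH_STUDIES` are illustrative only.

```python
# Counts are copied from the summary; assigning each one to a denominator
# is an inference from the arithmetic, not something taken from the study.
TOTAL_DEVICES = 903   # all devices in the FDA database analyzed
WITH_STUDIES = 505    # devices that reported clinical performance studies

# label -> (count, percentage quoted in the summary)
OF_ALL_DEVICES = {
    "clinical studies reported": (505, 55.9),
    "no studies conducted": (218, 24.1),
    "study inclusion not specified": (180, 19.9),
    "software only": (664, 73.5),
    "implantable": (6, 0.7),
    "radiology": (692, 76.6),
    "cardiovascular medicine": (91, 10.1),
    "neurology": (29, 3.2),
    "510(k) pathway": (877, 97.1),
    "de novo pathway": (22, 2.4),
}

OF_DEVICES_WITH_STUDIES = {
    "retrospective": (193, 38.2),
    "prospective": (41, 8.1),
    "randomized": (12, 2.4),
    "sensitivity": (183, 36.2),
    "specificity": (176, 34.9),
    "area under the curve": (82, 16.2),
    "sex-specific outcomes": (145, 28.7),
    "age-related data": (117, 23.2),
}


def pct(count, denominator):
    """Return count as a percentage of denominator, rounded to one decimal."""
    return round(100 * count / denominator, 1)


def mismatches(table, denominator):
    """List the labels whose recomputed percentage differs from the quoted one."""
    return [
        label
        for label, (count, quoted) in table.items()
        if pct(count, denominator) != quoted
    ]


if __name__ == "__main__":
    print("vs. all devices:", mismatches(OF_ALL_DEVICES, TOTAL_DEVICES))
    print("vs. devices with studies:", mismatches(OF_DEVICES_WITH_STUDIES, WITH_STUDIES))
```

Under this grouping every quoted percentage is reproduced. Sensitivity, for example, was reported for 183 devices, which is 36.2% of the 505 devices with clinical studies but only about 20% of all 903 devices.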
The investigators also identified 43 devices (4.8%) that had been recalled, with a median time to recall of 1.2 years. Among the recalled devices, 13 (30.2%) had reported clinical performance studies, but only a minority included subgroup data.
The findings highlight the need for enhanced transparency, continuous monitoring, and reevaluation to ensure safe and effective clinical integration of AI technologies.
Disclosures are available in the full study.
Source: JAMA Network Open