Advanced artificial intelligence (AI)-driven methods increased real-world data accuracy from 59.5% to 93.4%, completeness from 46.7% to 95.6%, and traceability from 11.5% to 77.3%, according to a recent study.
The quality improvement study, published in JAMA Network Open, evaluated methods for assessing data reliability in real-world evidence, focusing on accuracy, completeness, and traceability. Investigators analyzed data from 58 hospitals and more than 1,180 outpatient clinics in the United States, examining records of 120,616 patients with asthma treated between 2014 and 2022. They compared traditional data sources, including medical and pharmacy claims, with advanced methodologies incorporating electronic health records and AI-extracted unstructured data.
Accuracy was quantified using the F1 score, completeness was determined as a weighted mean of available data sources per patient-year, and traceability was measured as the proportion of data elements linked to clinical source documentation. The traditional approach yielded an accuracy of 59.5%, a completeness rate of 46.7%, and a traceability rate of 11.5%. In contrast, the advanced approach demonstrated significantly improved metrics, with an accuracy of 93.4%, a completeness rate of 95.6%, and a traceability rate of 77.3%.
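For readers unfamiliar with these metrics, the brief Python sketch below illustrates how measures of this kind might be computed in principle. The data layout, field names, and equal source weights are illustrative assumptions rather than the study's actual implementation, and the toy numbers are not the study's data.

```python
# Illustrative sketch only: the data layout, field names, and equal source
# weights are assumptions, not the study's actual implementation.

def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 score: harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def completeness(patient_years: list[dict], weights: dict[str, float]) -> float:
    """Weighted mean of available data sources per patient-year (hypothetical weighting)."""
    total = sum(weights.values())
    per_py = [
        sum(w for source, w in weights.items() if py.get(source)) / total
        for py in patient_years
    ]
    return sum(per_py) / len(per_py)

def traceability(elements: list[dict]) -> float:
    """Proportion of data elements linked to clinical source documentation."""
    linked = sum(1 for e in elements if e.get("source_document_id"))
    return linked / len(elements)

# Toy example (numbers are not from the study):
print(round(f1_score(tp=90, fp=10, fn=5), 3))  # 0.923
weights = {"claims": 1.0, "ehr_structured": 1.0, "ehr_unstructured": 1.0}
print(completeness(
    [{"claims": True, "ehr_structured": True},
     {"claims": True, "ehr_structured": True, "ehr_unstructured": True}],
    weights,
))  # ~0.833
print(traceability(
    [{"source_document_id": "note-123"}, {"source_document_id": None}]
))  # 0.5
```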
The findings indicated that systematic measurement of data reliability may be feasible and may align with U.S. Food and Drug Administration guidance on real-world evidence. Led by Daniel Jay Riskin, MD, of Verantos and the Stanford University School of Medicine, the investigators emphasized that accuracy, completeness, and traceability may each provide distinct insights into data quality, and that incorporating multiple data sources with advanced analytic techniques could enhance reliability.
The study's limitations included its focus on asthma, which may limit generalizability to other diseases and clinical outcomes, and its reliance on AI-driven unstructured data extraction, which may not be uniformly applicable across health systems. Nonetheless, the study highlighted the role of structured and unstructured data integration in real-world evidence reliability.
Full disclosures can be found in the published study.