A retrospective study at a US academic center found that an artificial intelligence (AI) algorithm correctly localized nearly one-third of interval breast cancers that were previously missed on digital breast tomosynthesis screening. Interval cancers—those that emerge with symptoms following a negative mammogram—are linked to poorer outcomes due to aggressive tumor biology and rapid growth.
Led by Manisha Bahl of the Department of Radiology at Massachusetts General Hospital in Boston, the researchers reviewed 224 cases of interval cancer that were diagnosed after negative digital breast tomosynthesis (DBT) screenings from February 2011 to June 2023. An FDA-cleared AI algorithm was applied retrospectively to these examinations and assigned lesion scores from 0 to 100, with scores of 10 or greater considered positive. The algorithm correctly localized 73 of the 224 cancers (33%).
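The scoring rule and headline figure can be reproduced in a few lines of Python. This is only an illustrative sketch: the function and constant names are hypothetical, and the threshold check covers only the score cutoff described in the study summary, not the additional step of confirming that a flagged lesion matches the cancer's location.

```python
# Hypothetical names; only the threshold (10) and the counts (73 of 224)
# come from the study summary.
POSITIVE_THRESHOLD = 10  # scores of 10 or greater were considered positive


def is_positive(lesion_score: float) -> bool:
    """Return True when a 0-100 lesion score meets the positive threshold."""
    if not 0 <= lesion_score <= 100:
        raise ValueError("lesion score must be between 0 and 100")
    return lesion_score >= POSITIVE_THRESHOLD


def localization_rate(localized: int, total: int) -> float:
    """Percentage of cancers that were correctly localized."""
    return 100 * localized / total


rate = localization_rate(73, 224)
print(f"{rate:.1f}%")  # 32.6%, which rounds to the reported 33%
```

The unrounded value, about 32.6%, is why the result is described as nearly one-third rather than a full third.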
Interval cancers detected by AI were larger at surgery (mean 37 mm vs 22 mm; P < .001) and more likely to involve lymph node positivity (41% vs 23%; P = .01) than those AI did not detect. They were also more likely to exhibit architectural distortion (21% vs 9%; P = .02) and to have visible findings on mammography (96% vs 74%; P < .001). No significant differences in age, race, or breast density were observed between detected and undetected interval cancers.
The algorithm’s performance was further evaluated on 1,000 additional DBT exams: 334 true-positive (TP), 333 true-negative (TN), and 333 false-positive (FP) cases. AI correctly localized 84% of TP cancers and categorized 86% of TN and 73% of FP cases as negative. In an analysis of 152 asymptomatic false-negative (FN) cancers that were mostly identified via high-risk screening MRI, AI localized 18% of cases. Patients with AI-detected asymptomatic FN cancers were younger (mean 52 vs 59 years; P < .01), and these cancers were more likely to show calcifications. In the TP cohort, AI was more likely to detect invasive ductal carcinoma and masses, and less likely to detect ductal carcinoma in situ and calcifications. No differences were observed in tumor grade, hormone receptor status, or lymph node involvement between detected and undetected TP cancers.
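For readers who prefer counts to percentages, the reported rates for the 1,000-exam set can be converted into approximate case numbers. The sketch below is an assumption-laden back-calculation: the article gives only rounded percentages, so the resulting counts are estimates and may differ by a case or two from the figures in the published study. All variable names are hypothetical.

```python
# Cohort sizes and rounded percentages as reported in the article.
# Each entry: (number of exams, reported fraction, what the fraction measures)
cohorts = {
    "TP": (334, 0.84, "correctly localized by AI"),
    "TN": (333, 0.86, "categorized as negative by AI"),
    "FP": (333, 0.73, "categorized as negative by AI"),
}

# Estimated counts only: derived from rounded percentages, not study data.
implied = {name: round(n * frac) for name, (n, frac, _) in cohorts.items()}

for name, (n, frac, meaning) in cohorts.items():
    print(f"{name}: about {implied[name]} of {n} exams {meaning}")
```

Under these assumptions, the estimates come to roughly 281 of 334 TP cancers localized, and roughly 286 TN and 243 FP exams categorized as negative.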
The data set was drawn from a single high-volume institution, so the findings may not generalize to practices that differ in radiologist experience and screening protocols. However, the sample of 224 interval cancers is among the largest from a single site. The authors noted that implementation requires radiologists to review and act on AI findings during screening.
They concluded: "Further research, including retrospective evaluation of this AI algorithm using FN data sets from other institutions and monitoring of outcomes after AI deployment, are needed to fully understand the impact of AI on FN rates and other performance metrics, including false-positive rates."
Full disclosures can be found in the published study.
Source: RSNA Radiology