Integrating an artificial intelligence (AI)–based ultrasound model with molecular testing preserved high sensitivity while improving specificity and positive predictive value in patients with indeterminate thyroid nodules, according to a retrospective study published in Endocrine Practice.
For indeterminate thyroid nodules, molecular testing reportedly provides high negative predictive value (NPV), helping to minimize missed malignancies, but its limited positive predictive value (PPV) may result in potentially avoidable surgeries.
In the primary analysis of 42 nodules with surgical pathology, sensitivity was 95% both for the ThyroSeq version 3 (ThyroSeq) genomic classifier alone and for ThyroSeq combined with the AIBx version 2 (AIBx) imaging model, while specificity rose from 45% with ThyroSeq alone to 60% with the combined approach. PPV increased from 66% to 72%, and the area under the receiver operating characteristic curve (AUC) rose from 0.70 to 0.78.
According to co-senior author Anupam Kotwal, MD, of University of Nebraska Medical Center, Omaha, and colleagues, “The continued development of AI-driven imaging and its integration with molecular diagnostics may advance precision medicine in thyroidology.”
Methodology
The researchers retrospectively analyzed consecutive thyroid nodules with indeterminate cytology (Bethesda III and IV) that underwent fine needle aspiration followed by ThyroSeq testing between May 1, 2021, and May 30, 2023. Of 114 nodules initially identified, 108 met inclusion criteria.
ThyroSeq assesses 112 genes and reports both a binary classification (positive vs negative for malignancy) and an estimated malignancy probability. For this study, nodules were considered positive if classified as malignant or if genetic alterations were associated with at least a 50% estimated malignancy probability. Noninvasive follicular thyroid neoplasm with papillary-like nuclear features was categorized as malignant because diagnosis requires surgery.
Ultrasound images were analyzed using AIBx for a binary malignancy prediction (benign vs malignant). The hierarchical algorithm deferred to imaging results when ThyroSeq classified a nodule as “malignant” but the estimated probability of malignancy was below 50%.
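The decision rule described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the researchers' implementation: the `Nodule` fields and the `combined_call` function are hypothetical names, and the sketch assumes that a negative ThyroSeq result is never overridden by imaging, which the description implies but does not state outright.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nodule:
    """Hypothetical record for one indeterminate nodule (field names are assumptions)."""
    thyroseq_positive: bool        # ThyroSeq binary call (positive vs negative)
    malignancy_probability: float  # ThyroSeq estimated malignancy probability, 0.0 to 1.0
    aibx_malignant: bool           # AIBx binary imaging prediction


def combined_call(nodule: Nodule, threshold: float = 0.5) -> bool:
    """Return True when the hierarchical rule calls the nodule positive."""
    if not nodule.thyroseq_positive:
        # Assumption: a negative molecular result stands regardless of imaging.
        return False
    if nodule.malignancy_probability >= threshold:
        # Molecular result at or above the 50% threshold is taken at face value.
        return True
    # ThyroSeq-positive but below the threshold: defer to the imaging model.
    return nodule.aibx_malignant
```

Under this rule, imaging can only move a ThyroSeq-positive nodule to negative, never the reverse, which is one way to read why the combined approach could gain specificity without identifying any additional malignancies.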
Surgical pathology was available for 42 nodules. The remaining 66 nodules, most of which underwent follow-up imaging and remained stable, were classified as benign for analytic purposes in the secondary analysis.
Expanded Performance Results
In the surgical pathology subset, ThyroSeq demonstrated 95% sensitivity and 45% specificity, with NPV of 90% and PPV of 66%. AIBx showed 77% sensitivity and 60% specificity; the NPV and PPV were 70% and 68%, respectively. Compared with ThyroSeq alone, the addition of AIBx preserved 95% sensitivity, improved specificity to 60%, and increased NPV to 92% and PPV to 72%.
The AUCs were 0.69 for AIBx, 0.70 for ThyroSeq, and 0.78 for the combined approach.
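For readers who want to see how these figures relate to one another, the sketch below computes the standard metrics from a 2x2 table. The counts are not reported in this article; they are hypothetical values (22 malignant and 20 benign nodules among the 42) chosen because they are consistent with the reported percentages. The AUC line assumes a single binary call per nodule, in which case the ROC curve has one operating point and the trapezoidal AUC reduces to the mean of sensitivity and specificity; whether the researchers computed AUC this way is also an assumption.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute diagnostic accuracy metrics from 2x2 confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # AUC of a single binary call: mean of sensitivity and specificity.
        "auc": (sensitivity + specificity) / 2,
    }


# Hypothetical counts, back-calculated from the reported percentages.
thyroseq = diagnostic_metrics(tp=21, fp=11, fn=1, tn=9)
combined = diagnostic_metrics(tp=21, fp=8, fn=1, tn=12)

for name, metrics in (("ThyroSeq", thyroseq), ("Combined", combined)):
    summary = ", ".join(f"{key}={value:.2f}" for key, value in metrics.items())
    print(f"{name}: {summary}")
```

With these assumed counts, the only change between the two rows is three false positives becoming true negatives, which is enough to move specificity, PPV, NPV, and AUC while leaving sensitivity untouched.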
McNemar testing showed no discordant pairs for sensitivity, according to the researchers, indicating that the combined approach identified exactly the same malignant cases as ThyroSeq alone. The combined approach correctly classified three additional benign nodules as benign, although this difference did not reach statistical significance.
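McNemar's test compares two classifiers on the same cases using only the discordant pairs, that is, the cases on which the two disagree. The sketch below uses the exact (binomial) form of the test; the article does not say which variant the researchers used, so that choice is an assumption, as is the function name `mcnemar_exact`. It also assumes that the three additional benign nodules were the only disagreements among benign cases.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: cases where method A is correct and method B is wrong.
    c: cases where method B is correct and method A is wrong.
    """
    n = b + c
    if n == 0:
        # No discordant pairs: the data show no difference at all.
        return 1.0
    k = min(b, c)
    # Probability of a split at least this lopsided if each direction is equally likely.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Malignant cases: no discordant pairs between the combined approach and ThyroSeq.
p_sensitivity = mcnemar_exact(0, 0)

# Benign cases (assumed): 3 nodules correct only with the combined approach, 0 the other way.
p_specificity = mcnemar_exact(3, 0)  # 2 * (1/2)**3 = 0.25 in this sketch
```

With only three discordant pairs, even a fully one-sided split cannot fall below the conventional 0.05 threshold in this form of the test, which illustrates why a sample of this size leaves little room to show a significant difference.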
In the full cohort of 108 nodules, the integrated strategy maintained 95% sensitivity, matching ThyroSeq and exceeding the 77% sensitivity of AIBx. Specificity increased to 91%, compared with 87% for ThyroSeq and 77% for AIBx. PPV improved to 72% from 66% with ThyroSeq and 46% with AIBx, while NPV was 99% with both the combined approach and ThyroSeq and 93% with AIBx. The AUC increased to 0.93, compared with 0.91 for ThyroSeq and 0.77 for AIBx.
The Road Ahead
Asked in an interview conducted by the American Association of Clinical Endocrinology (AACE) in partnership with Conexiant whether the findings could translate into meaningful patient-level benefits beyond improved diagnostic metrics, Dr. Kotwal said, “This combined approach has the potential to improve patient outcomes (complications of surgery, quality of life due to surgical complications or hypothyroidism) without negatively impacting long-term cancer-related outcomes.”
However, he emphasized that the effect on surgical rates remains uncertain. “This approach needs to be validated in a larger patient sample of indeterminate cytology thyroid nodules. Additionally, AIBx also needs to be tested with other molecular tests in diverse populations before the improvement in specificity demonstrated in this study can have a meaningful impact on real-world surgical rates.”
Regarding whether incorporating AI could shift the level of cancer risk at which clinicians feel comfortable recommending or avoiding surgery, Dr. Kotwal said it would not. However, he told AACE and Conexiant, “the AI combined with molecular testing approach has the potential to better inform the cancer risk than either approach alone in terms of guiding decision-making for patients with indeterminate cytology thyroid nodules.”
The researchers described the findings as exploratory and hypothesis-generating, noting that the limited sample size did not provide enough statistical power to detect significant differences across performance metrics. Classifying nodules without surgical pathology as benign may have introduced misclassification and partial verification bias, potentially inflating specificity and NPV while underestimating sensitivity. Results were limited to nodules tested with ThyroSeq and may not generalize to other molecular platforms. Echoing Dr. Kotwal’s call for larger and more diverse validation studies, the researchers added that future work “could also attempt to better identify [noninvasive follicular thyroid neoplasms with papillary-like nuclear features] preoperatively.”
The researchers reported no financial conflicts of interest.
Source: Endocrine Practice