Artificial intelligence has been used in thyroid disease research for about 3 decades, but advances in machine and deep learning have rapidly expanded its clinical applications and exposed persistent barriers to real-world adoption, according to a systematic review published in Frontiers in Endocrinology. The study, co-led by Qing Lu, MD, of Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China, examined current progress in AI-driven thyroid disease management, challenges to clinical implementation, and priorities for future development.
“Previously, [AI studies] mainly concentrated on the diagnosis of thyroid function and distinguishing benign from malignant thyroid nodules,” the researchers explained. “AI [is now] widely used in multiple areas of thyroid disease management, including image analysis, pathologic diagnosis, personalized treatment, patient monitoring, and follow-up.”
Review Methods
The researchers searched PubMed, Scopus, and Web of Science for studies published between 2019 and 2025 using keywords related to thyroid disease, imaging, artificial intelligence, pathology, and personalized treatment. To be eligible, studies had to involve clinical research in humans, validate their models in at least 50 patients, and report performance metrics.
Of 1,837 records initially identified, 30 studies met the inclusion criteria following screening.
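As a quick arithmetic check on the screening figures above, the share of identified records that met the inclusion criteria can be computed directly. This is only a minimal sketch; the function name `inclusion_rate` is illustrative and does not come from the review.

```python
def inclusion_rate(identified: int, included: int) -> float:
    """Return the percentage of identified records that met the inclusion criteria."""
    if identified <= 0:
        raise ValueError("identified must be a positive count")
    return 100.0 * included / identified


# Figures reported in the article: 1,837 records identified, 30 studies included.
rate = inclusion_rate(1837, 30)
print(f"{rate:.1f}% of identified records were included")
```

By this calculation, fewer than 2 in every 100 identified records were retained after screening.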
Imaging and Pathology
According to the researchers, advances in deep-learning algorithms have markedly enhanced image-processing capabilities, allowing AI to analyze complex ultrasound images with greater accuracy and improve diagnostic sensitivity and specificity.
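For readers less familiar with the terms, sensitivity, specificity, and accuracy are all derived from the same four counts in a diagnostic confusion matrix. The sketch below shows the standard definitions; the counts are hypothetical values chosen for illustration and are not taken from the review, and the class and function names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # malignant nodules correctly called malignant
    fn: int  # malignant nodules missed
    tn: int  # benign nodules correctly called benign
    fp: int  # benign nodules wrongly called malignant


def sensitivity(c: ConfusionCounts) -> float:
    """Share of truly malignant nodules that the test flags."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of truly benign nodules that the test clears."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Share of all nodules classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Hypothetical counts for illustration only (not data from the review).
counts = ConfusionCounts(tp=45, fn=5, tn=140, fp=10)
print(f"sensitivity: {sensitivity(counts):.3f}")
print(f"specificity: {specificity(counts):.3f}")
print(f"accuracy:    {accuracy(counts):.3f}")
```

Because accuracy pools both classes, it can look high even when sensitivity or specificity is weak, which is why the three figures are typically reported together.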
AI-assisted ultrasound systems achieved diagnostic accuracies above 90% for identifying thyroid nodules. When combined with radiomics, they reduced the rate of unnecessary fine-needle aspiration biopsies from a range of approximately 30% to 38% down to about 5%, compared with conventional risk stratification approaches.
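To put the biopsy figures in perspective, the drop can be expressed both in percentage points and as a relative reduction. The sketch below uses only the rates quoted above (approximately 30% to 38% before, about 5% after); the function name `relative_reduction` is illustrative, and the results are as approximate as the inputs.

```python
def relative_reduction(before_pct: float, after_pct: float) -> float:
    """Return the relative reduction, in percent, from before_pct to after_pct."""
    if before_pct <= 0:
        raise ValueError("before_pct must be positive")
    return 100.0 * (before_pct - after_pct) / before_pct


AFTER_PCT = 5.0  # approximate unnecessary-biopsy rate with AI plus radiomics

# Lower and upper ends of the approximate baseline range quoted in the article.
for before_pct in (30.0, 38.0):
    absolute = before_pct - AFTER_PCT
    relative = relative_reduction(before_pct, AFTER_PCT)
    print(
        f"baseline {before_pct:.0f}%: "
        f"{absolute:.0f} percentage points lower, "
        f"{relative:.0f}% relative reduction"
    )
```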
Beyond ultrasound, the researchers reported that AI demonstrated predictive capability for preoperative cervical lymph node metastasis in thyroid cancer using computed tomography, outperforming senior radiologists. Combining AI with radiologist assessments further enhanced diagnostic efficacy and supported surgical planning. Magnetic resonance imaging–based radiomics models also showed value in predicting extrathyroidal extension.
At the tissue level, AI systems were able to detect subtle changes in cellular morphology and tissue structure, improving the diagnostic accuracy of fine-needle aspiration biopsies. In one comparison cited in the review, an AI model exceeded the average expert cytopathologist in both accuracy and specificity by more than 2 standard deviations.
Treatment and Monitoring
According to the review, AI and radiomic models supported data-driven personalized treatment by guiding surgical decision-making through risk stratification and predictions of tumor invasiveness, lymph node metastasis, and the need for preventive lymph node dissection. A guideline-based clinical decision support system for routine surgical practice matched real-world treatment decisions in approximately 79% of cases.
The researchers also reported that AI supported targeted therapy by analyzing genetic mutations, identifying biologic pathways involved in drug responses, and highlighting potential druggable targets and interacting compounds.
AI also played an expanding role in patient monitoring and follow-up. The review highlighted studies that used it to predict recurrence from clinical data, imaging features, and biomarkers. Other studies applied it to remote monitoring, using smartphone and wearable data to identify potential complications and recurrence risks and to provide real-time clinical decision support.
Insights and Opportunities
Despite these advances, the review highlighted several limitations. Most studies relied on single-center, hospital-based cohorts (93%), focused on classical papillary thyroid carcinoma (90%), and evaluated models trained on Asian data sets (83%), raising concerns about generalizability.
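The percentages above can be translated into approximate study counts. The sketch below assumes each percentage is a share of the 30 included studies, which the article does not state explicitly, so the counts are an inference rather than reported figures. The labels and the function name `implied_count` are illustrative.

```python
INCLUDED_STUDIES = 30  # number of studies that met the inclusion criteria

# Percentages as reported in the article.
reported_shares = {
    "single-center, hospital-based cohorts": 93,
    "classical papillary thyroid carcinoma": 90,
    "models trained on Asian data sets": 83,
}


def implied_count(share_pct: float, total: int) -> int:
    """Nearest whole number of studies implied by a percentage of the total."""
    return round(share_pct * total / 100)


for label, share_pct in reported_shares.items():
    count = implied_count(share_pct, INCLUDED_STUDIES)
    print(f"{label}: about {count} of {INCLUDED_STUDIES} studies ({share_pct}%)")
```

Under that assumption, only a handful of the included studies fall outside each of these categories, which underlines the generalizability concern.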
The researchers also noted that the “black-box” nature of AI models remains a critical barrier to clinical adoption. In addition, algorithmic development was often poorly integrated with clinical workflows, and unresolved ethical and legal issues—such as liability for AI misdiagnoses and informed consent for predictive genomic models—continued to hinder real-world implementation.
The authors identified several priorities for future research, including cross-modal data-fusion architectures integrating ultrasound, pathomics, and multiomics data to support interpretable multitask learning frameworks; algorithmic improvements to enhance predictive fairness in heterogeneous thyroid nodule populations; rapid implementation pipelines for clinical use; and prospective randomized controlled trials to assess the real-world impact of AI systems on health care costs and patient outcomes. They concluded that addressing these priorities could help bridge the gap between AI innovation and equitable, ethically grounded clinical practice.
The study was funded by grants from the Wuhan Knowledge Innovation Project and the Technology Innovation Project of Hubei Province. The researchers reported no conflicts of interest.
Source: Frontiers in Endocrinology