Machine learning (ML) applications in endocrinology have expanded rapidly over the past two decades, with thyroid-related research dominating the field, according to a comprehensive narrative review published in Endocrine.
The authors, led by Alicja Hubalewska-Dydejczyk, MD, who chairs the department of endocrinology at Jagiellonian University Medical College, Kraków, Poland, identified 1,130 original studies published between January 2000 and December 2024 that applied ML methods to non-diabetic endocrine disorders. The review highlights growing use of ML in imaging, risk prediction, and treatment-response modeling, alongside persistent limitations in validation and clinical implementation.
The investigators searched PubMed for English-language, full-text original research that used ML techniques in thyroid, pituitary, adrenal, and parathyroid diseases. They excluded diabetology-focused studies because that area has already been covered extensively. Included studies spanned imaging-based ML, radiomics, electronic health record analyses, and molecular and omics-based modeling.
Of the 1,130 analyzed studies, the majority were related to thyroid diseases (68%), followed by pituitary disorders (20%), adrenal disorders (7%), and parathyroid diseases (5%). Most studies were retrospective and single center.
Thyroid disorders
ML was most frequently applied to ultrasound-based evaluation of thyroid nodules. Several deep learning models demonstrated diagnostic performance comparable to or exceeding that of expert radiologists. In multicenter work cited in the review, a deep learning system reduced unnecessary fine-needle aspiration biopsies by 27% while maintaining diagnostic accuracy. ML models also showed high performance in malignancy prediction, lymph node metastasis detection, and molecular risk stratification, including prediction of BRAF V600E mutations. In cytology, ML analysis combining refractive index data with stained images achieved up to 100% accuracy in distinguishing benign from malignant samples in small studies.
Pituitary disorders
In pituitary imaging, ML-based radiomics differentiated cystic pituitary adenomas from Rathke cleft cysts with an area under the curve (AUC) of 0.848. Texture-based ML models predicted response to first-generation somatostatin receptor ligands in acromegaly with an AUC of 0.847. Models predicting postoperative outcomes, including remission, hypopituitarism, hyponatremia, and diabetes insipidus, were also reported; most relied on preoperative imaging and clinical variables.
Adrenal disorders
Although fewer in number, adrenal ML studies showed high diagnostic performance in selected applications. Radiomics-based ML differentiated lipid-poor adenomas from malignant lesions and pheochromocytomas, with some models achieving AUC values above 0.94. In primary aldosteronism, ML-based clinical scoring systems exceeded 90% sensitivity and could reduce unnecessary screening tests by up to 32.7% without missing surgically curable cases. Steroid profiling combined with ML classified adrenal tumor subtypes with balanced accuracies as high as 97%.
Parathyroid disorders
ML applications in parathyroid disease focused on improving detection and surgical outcomes. Deep learning applied to fluorocholine PET/CT identified hyperfunctioning parathyroid tissue with 83% detection accuracy. Intraoperative ML-assisted imaging techniques achieved sensitivities up to 100% and specificities above 90% for parathyroid identification. A random forest model predicting postoperative hypocalcemia after thyroidectomy reached an AUC of 0.928 in validation cohorts.
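For readers less familiar with the performance measures quoted in the sections above, the following is a minimal Python sketch of how sensitivity, specificity, balanced accuracy, and AUC are conventionally computed. The counts and scores are invented purely for illustration and are not drawn from the review or from any study it cites; the function names are likewise hypothetical.

```python
def sensitivity(tp, fn):
    """Share of truly positive cases the model flags as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of truly negative cases the model flags as negative."""
    return tn / (tn + fp)


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity; less distorted by class imbalance."""
    return (sensitivity(tp, fn) + specificity(tn, fp)) / 2


def auc(pos_scores, neg_scores):
    """Rank-based AUC: the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Illustrative confusion-matrix counts (assumed, not from the review).
tp, fn, tn, fp = 45, 5, 90, 10
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
bal_acc = balanced_accuracy(tp, fn, tn, fp)

# Illustrative model scores for positive and negative cases (assumed).
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

Because balanced accuracy weights both classes equally, it is often reported when malignant or rare cases are heavily outnumbered, a situation related to the data imbalance the authors flag as a limitation.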
Limitations
The authors noted recurring limitations across subspecialties, including lack of model transparency, data imbalance, small sample sizes, and heavy reliance on retrospective designs. External validation and standardized reporting were infrequent. The concentration of studies in thyroid disease underscores a research imbalance that may limit ML-driven advances in rarer endocrine disorders.
“Like in other medical fields, in endocrinology there is a need for high-quality, well-designed ML-based research,” the authors noted. “Validation of ML models and integration into clinical practice demands attention and careful supervision of specialists. Interdisciplinary collaboration between healthcare professionals, data scientists and AI experts is crucial in realizing the full potential of this technology. In this way, despite challenges, ML technology might provide remarkable benefits to the endocrine field.”
The authors reported having no relevant conflicts.