A deep-learning model estimated retinal age from fundus photographs with a mean absolute error of about 3 years in internal and primary external validation, and larger retinal age gaps were associated with cardiometabolic conditions, according to findings published in Communications Medicine.
Researchers developed an ensemble multitask learning model using 50,595 quality-controlled fundus photographs from 27,214 disease-free adults. The model used fundus images alone during inference but incorporated glycated hemoglobin during training as an auxiliary signal. It was internally validated in 7,288 eyes and externally tested in cohorts of 135 and 4,992 eyes.
Model performance was assessed by mean absolute error. The ensemble achieved a mean absolute error of 2.78 years in internal validation and 3.39 years in the primary external cohort. In a larger external cohort derived from the AlzEye study, error increased to 8.63 years, indicating reduced performance in a more heterogeneous population.
Although these results demonstrate strong performance, prior studies have reported mean absolute errors of approximately 3.3 to 4.0 years, suggesting that the improvement is an incremental refinement rather than a step change in accuracy.
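For readers unfamiliar with the metric, mean absolute error is the average size of the difference between predicted and chronological age, in years. The following is a minimal sketch of that calculation, not the researchers' code; the function name and the three example ages are made up for illustration.

```python
def mean_absolute_error(predicted_ages, chronological_ages):
    """Average of |predicted - chronological| across eyes, in years."""
    if len(predicted_ages) != len(chronological_ages):
        raise ValueError("inputs must have the same length")
    if not predicted_ages:
        raise ValueError("inputs must be non-empty")
    errors = [abs(p - c) for p, c in zip(predicted_ages, chronological_ages)]
    return sum(errors) / len(errors)


# Made-up ages for three eyes, for illustration only (not study data).
predicted = [52.0, 61.5, 47.0]
chronological = [50.0, 65.0, 47.5]
mae = mean_absolute_error(predicted, chronological)
print(f"MAE: {mae:.2f} years")
```

Because the metric averages absolute differences, overestimates and underestimates do not cancel out.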
In the primary external cohort, the model outperformed comparator models, including a registry-based retinal age model and RETFound-DINOv2. In the AlzEye cohort, comparisons were limited to the registry-based model because RETFound-DINOv2 had been pretrained on overlapping data.
Researchers also examined the retinal age gap, defined as predicted age minus chronological age, in 8,467 patients following age- and sex-matched propensity-score analysis. The retinal age gap was approximately 1 year higher among patients taking diabetes medication than among matched controls. Higher retinal age gaps were also observed among patients with a history of stroke or cardiac disease. No statistically significant differences were observed for lipid-lowering therapy or chronic renal disease. Hypertension showed no overall association, although a signal emerged in a lower-uncertainty subgroup, a finding that may reflect limited statistical reliability.
The magnitude of these differences was modest, and their clinical significance remains uncertain. In addition, “cardiac disease” was broadly defined and did not distinguish between conditions such as arrhythmia and prior myocardial infarction, limiting interpretability for clinical risk stratification.
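The retinal age gap and the between-group difference described above can be illustrated with a short sketch. It assumes the matching has already been done and does not reproduce the study's propensity-score procedure or its statistical testing; the function names and all ages are hypothetical.

```python
from statistics import mean


def retinal_age_gap(predicted_age, chronological_age):
    """Predicted age minus chronological age, in years."""
    return predicted_age - chronological_age


def mean_gap_difference(cases, controls):
    """Mean gap in cases minus mean gap in already-matched controls.

    Each argument is a list of (predicted_age, chronological_age) pairs.
    """
    case_gaps = [retinal_age_gap(p, c) for p, c in cases]
    control_gaps = [retinal_age_gap(p, c) for p, c in controls]
    return mean(case_gaps) - mean(control_gaps)


# Made-up values for illustration only (not study data).
cases = [(58.0, 55.0), (63.0, 62.0)]      # gaps of 3.0 and 1.0 years
controls = [(56.0, 55.0), (62.0, 62.0)]   # gaps of 1.0 and 0.0 years
difference = mean_gap_difference(cases, controls)
print(f"Difference in mean retinal age gap: {difference:.1f} years")
```

A positive gap means the retina appears older than the patient's chronological age; a positive difference means the gap is larger, on average, in the case group.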
The model also incorporated an internal measure of prediction uncertainty based on disagreement across ensemble models. Eyes with lower uncertainty had lower prediction error and showed stronger associations with disease. This approach could allow outputs to include both an estimated retinal age and a reliability indicator, although thresholds for clinical use have not been validated.
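As a rough illustration of how ensemble disagreement can serve as an uncertainty signal, the sketch below averages the predictions of several models and uses their standard deviation as the disagreement measure. The article does not specify the measure the researchers used, so the standard deviation is an assumption, and the threshold of 2.0 years is hypothetical and has no clinical validation.

```python
from statistics import mean, pstdev


def ensemble_estimate(member_predictions):
    """Return (estimated age, disagreement) from per-model predictions.

    Disagreement here is the population standard deviation across ensemble
    members; this choice is an assumption, not the published method.
    """
    if len(member_predictions) < 2:
        raise ValueError("need at least two ensemble members")
    return mean(member_predictions), pstdev(member_predictions)


def is_low_uncertainty(uncertainty, threshold):
    """Flag an estimate as lower-uncertainty under a hypothetical cutoff."""
    return uncertainty <= threshold


# Made-up predictions from three ensemble members for one eye.
age, uncertainty = ensemble_estimate([60.0, 62.0, 64.0])
reliable = is_low_uncertainty(uncertainty, threshold=2.0)
print(f"Estimated retinal age: {age:.1f} years, disagreement: {uncertainty:.2f}")
```

An output in this form pairs each age estimate with a reliability indicator, which is the kind of reporting the paragraph above describes.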
Several design features may limit generalizability. The model was trained exclusively on a relatively healthy population that excluded patients with hypertension, cardiovascular disease, abnormal body mass index, elevated glycated hemoglobin, or use of related medications. As a result, many patients encountered in routine clinical practice may fall outside the model’s training distribution.
The study was observational and cross-sectional, and systemic disease status was based partly on self-reported diagnoses and medication use. Matching was limited to age and sex, leaving potential residual confounding. The cohorts were predominantly Asian, and performance declined in the larger external data set, further highlighting the need for validation in more diverse populations.
“Retinal age derived from a single fundus photograph could provide a scalable biomarker of biological ageing,” wrote Takahiro Ninomiya, of Tohoku University Graduate School of Medicine, and colleagues, describing a potential future clinical role. Prospective studies are needed to determine whether retinal age gap can guide clinical decision-making or improve patient outcomes.
Before clinical adoption, further work will be required to establish validated thresholds, confirm performance across diverse populations, and evaluate whether integrating retinal age estimates into routine ophthalmic workflows improves risk stratification.
Disclosures: The researchers reported no competing interests.
Source: Communications Medicine