A machine-learning analysis of routine conversations between primary care physicians and older patients could help identify cognitive impairment with moderate accuracy, according to a study. Researchers reported that the highest-performing model achieved an area under the receiver operating characteristic curve (AUROC) of 0.73 in both development and validation cohorts, supporting the feasibility of passive speech-based screening during routine clinical care.
The researchers recorded primary care visits involving 966 English-speaking patients aged 55 years and older who had no documented diagnosis of mild cognitive impairment or dementia at the time of their visit. The study included a development cohort of 787 patients recruited from practices in New York and an external validation cohort of 179 patients recruited from practices in Chicago. Cognitive impairment was defined as a Montreal Cognitive Assessment (MoCA) score at least 1 standard deviation below age- and education-adjusted norms. Machine-learning models were trained using acoustic features extracted from 30-second segments of recorded patient-physician conversations.
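The impairment definition above amounts to a z-score cutoff, which can be sketched in a few lines. This is only an illustration: the function name `is_impaired` and the normative mean and standard deviation used below are hypothetical placeholders, not values from the study, and real age- and education-adjusted norms would come from published normative tables.

```python
def is_impaired(moca_score, norm_mean, norm_sd, threshold_sd=1.0):
    """Flag a score that is at least `threshold_sd` standard deviations
    below the norm for the patient's age and education group."""
    z_score = (moca_score - norm_mean) / norm_sd
    return z_score <= -threshold_sd


# Hypothetical norm for one age/education group (illustrative only).
NORM_MEAN = 26.0
NORM_SD = 2.5

low_score_flag = is_impaired(23, NORM_MEAN, NORM_SD)   # z = -1.2
near_norm_flag = is_impaired(25, NORM_MEAN, NORM_SD)   # z = -0.4
print(low_score_flag, near_norm_flag)
```

Under these placeholder norms, a score of 23 falls more than 1 standard deviation below the mean and is flagged, while a score of 25 is not.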
Among the evaluated approaches, models using Whisper-derived acoustic features produced the strongest classification performance. The best-performing models achieved an AUROC of 0.73 in both the development and external validation cohorts.
The researchers also evaluated the model as a potential screening tool. In the validation cohort, the best-performing algorithm achieved a positive predictive value of 30%, sensitivity of 68%, and specificity of 64%. Cognitive impairment prevalence in the study population was 21%.
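Positive predictive value depends on prevalence as well as on sensitivity and specificity, and the relationship follows from Bayes' rule. The sketch below is a back-of-the-envelope check, not the researchers' calculation. It assumes the 21% prevalence reported for the overall study population also applies to the validation cohort, which the article does not state, so the result need not match the reported 30% exactly.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of positive screens that are true positives (Bayes' rule)."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1 - specificity) * (1 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Reported sensitivity and specificity; prevalence is an assumption
# (the study-wide figure, applied here to the validation cohort).
ppv = positive_predictive_value(sensitivity=0.68, specificity=0.64, prevalence=0.21)
print(f"Estimated PPV: {ppv:.2f}")
```

With these inputs the estimate comes out near one in three, which illustrates why most positive screens at this prevalence would be false positives.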
Deep neural network–derived acoustic features generally outperformed expert-defined acoustic measures, including prosodic and eGeMAPS features. According to the researchers, preprocessing approaches that reduced background noise while preserving conversational structure yielded the most consistent performance during external validation.
Model interpretation analyses suggested that greater variability in pause duration and increased energy in unvoiced speech were associated with a higher likelihood of cognitive impairment classification. Features associated with cognitively normal classification included higher pitch-related measures, greater voicing rates, and stronger energy in voiced speech segments, according to the researchers.
In an accompanying editorial, Gabriela Meade, PhD, CCC-SLP, and Hugo Botha, MBChB, both of the Department of Neurology at the Mayo Clinic, noted that classifiers performed better when recordings included both patient and physician speech than when patient speech was analyzed alone. The editorial authors wrote that preserving conversational dynamics may improve model performance but also raises the possibility that physicians alter their speech patterns during encounters with patients who have cognitive impairment, potentially contributing to the predictive signal detected by the models.
The editorial authors also highlighted broader implementation concerns. False-positive screening results could lead to additional evaluations and referrals, while potential racial, ethnic, and linguistic biases remain important considerations because both speech-recognition systems and cognitive screening tools may perform differently across patient populations.
The researchers acknowledged several limitations. Data were collected within two affiliated health systems using a common protocol, which may limit generalizability. Cognitive impairment was defined using MoCA performance rather than comprehensive neuropsychological assessment, and the models analyzed acoustic characteristics of speech without incorporating lexical or semantic content.
Overall, the findings suggested that routine clinical conversations contain acoustic signals associated with cognitive impairment and that machine-learning models may be able to identify patients at elevated risk without requiring dedicated screening tasks. However, additional validation will be needed before such tools can be integrated into clinical practice.
"These findings support the feasibility of passive, speech-based screening during routine primary care," wrote lead study author Joseph T. Colonel, PhD, of the Department of Psychiatry at the Icahn School of Medicine at Mount Sinai, and colleagues.
The study was supported by the National Institute on Aging and other National Institutes of Health funding sources. Full author disclosures are available in the published study.
Source: JAMA Neurology, Editorial