A supervised machine-learning model analyzing smartphone-captured tympanic membrane images could detect middle ear effusion with high accuracy in a prospective case-control study of pediatric patients.
Otitis media with effusion, defined as fluid accumulation in the middle ear without infection, is frequently misdiagnosed in clinical practice; prior studies have reported a diagnostic accuracy of 46% among general practitioners. In the study, conducted at a single tertiary center, the researcher aimed to develop and internally validate a supervised machine-learning model to differentiate normal tympanic membranes (TMs) from those with otitis media with effusion using smartphone-based imaging.
The researcher evaluated 111 TM images from pediatric patients younger than 18 years. Ground-truth diagnoses were established by consensus between two otolaryngologists and confirmed with portable tympanometry prior to image acquisition. The images were captured with a smartphone fitted with a video-otoscope under standardized conditions, cropped to isolate the TM, and processed in the red-green-blue color space.
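As a minimal illustration of the preprocessing step, cropping a frame to a region of interest before color-space processing might look like the sketch below; the frame dimensions and bounding box are hypothetical, not taken from the study.

```python
import numpy as np

# Hypothetical stand-in for one smartphone video-otoscope frame:
# a height x width x 3 array in red-green-blue (RGB) channel order.
frame = np.zeros((480, 640, 3), dtype=np.uint8)

# Crop to a hypothetical bounding box isolating the tympanic membrane,
# as the study did before feature extraction.
top, bottom, left, right = 120, 360, 200, 440
tm_patch = frame[top:bottom, left:right, :]
print(tm_patch.shape)  # (240, 240, 3)
```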
The model used a bag-of-features approach: a Speeded-Up Robust Features (SURF) algorithm detected image features, and k-means clustering grouped them into a visual dictionary of 50 words. A support vector machine classifier was then trained to categorize each image as normal or otitis media with effusion.
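The pipeline described above can be sketched end to end. The sketch is illustrative rather than the study's code: real SURF descriptors extracted from otoscope images are replaced with synthetic 64-dimensional vectors, and scikit-learn stands in for whatever toolchain the researcher actually used.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical stand-in for SURF output: each image yields a set of
# 64-dimensional descriptors. We simulate 20 "normal" and 20 "effusion"
# images with slightly shifted descriptor distributions.
def fake_descriptors(n_images, shift):
    return [rng.normal(shift, 1.0, size=(100, 64)) for _ in range(n_images)]

images = fake_descriptors(20, 0.0) + fake_descriptors(20, 0.6)
labels = np.array([0] * 20 + [1] * 20)  # 0 = normal, 1 = effusion

# Step 1: k-means over all descriptors builds the 50-word visual dictionary.
all_desc = np.vstack(images)
kmeans = KMeans(n_clusters=50, n_init=10, random_state=0).fit(all_desc)

# Step 2: each image becomes a 50-bin histogram of visual-word counts.
def bag_of_words(desc):
    words = kmeans.predict(desc)
    return np.bincount(words, minlength=50)

X = np.array([bag_of_words(d) for d in images], dtype=float)

# Step 3: a support vector machine separates normal from effusion histograms.
clf = SVC(kernel="linear").fit(X, labels)
print(clf.score(X, labels))
```

The key idea is that k-means quantizes local descriptors into a shared vocabulary, so images of different sizes and feature counts all map to fixed-length histograms the SVM can consume.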
Among 54 images used for training, the model achieved 96% sensitivity, 81% specificity, and 89% accuracy. When tested on a separate set of 57 images from the same cohort, performance declined to 87% sensitivity, 74% specificity, and 81% accuracy. The model also achieved a balanced accuracy of 80.4% and an F1 score of 82.5% on the test data set.
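The reported balanced accuracy follows directly from the test-set sensitivity and specificity, and a precision figure can be backed out of the reported F1 score; note that the derived precision is an inference from the article's numbers, not a value the study reported.

```python
# Test-set metrics reported in the article.
sensitivity = 0.87   # recall on effusion images
specificity = 0.74
f1 = 0.825

# Balanced accuracy is the mean of sensitivity and specificity:
# (0.87 + 0.74) / 2 = 0.805, matching the reported 80.4% up to rounding.
balanced_accuracy = (sensitivity + specificity) / 2

# Precision implied by the reported F1 and sensitivity,
# rearranging F1 = 2*P*R / (P + R)  =>  P = F1*R / (2*R - F1).
precision = f1 * sensitivity / (2 * sensitivity - f1)
print(round(balanced_accuracy, 3), round(precision, 3))  # 0.805 0.784
```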
The data set included 54 normal images and 57 images with effusion, with an average participant age of 4 years. Images from both ears were included and randomly split at the image level, allowing images from the same patient to appear in both the training and testing sets.
The researcher noted that the gap between training and testing performance may reflect the limited sample size and potential overfitting. Additional limitations included the absence of external validation, the lack of comparison with clinician diagnostic performance, and potential information leakage from image-level data splitting. The researcher also noted that using the same smartphone and lighting conditions produced uniform image quality, improving internal validity at the cost of device heterogeneity and, with it, generalizability to other devices and settings.
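The image-level leakage limitation is conventionally avoided by splitting on patient identity rather than on individual images, so that both ears of one child land on the same side of the split. A minimal sketch with scikit-learn's GroupShuffleSplit, using made-up patient IDs and features:

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(1)

# Hypothetical setup: 111 images, with some patients contributing both
# ears, so several images share a patient ID.
n_images = 111
patient_ids = rng.integers(0, 70, size=n_images)  # ~70 hypothetical patients
X = rng.normal(size=(n_images, 50))               # e.g. bag-of-words histograms
y = rng.integers(0, 2, size=n_images)

# GroupShuffleSplit keeps every image from a given patient on the same
# side of the split, preventing patient-level information leakage.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.5, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=patient_ids))

# No patient appears in both sets.
overlap = set(patient_ids[train_idx]) & set(patient_ids[test_idx])
print(len(overlap))  # 0
```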
“The supervised [machine-learning] algorithm developed in this study showed a promising result in detecting middle ear effusion when analyzing TM images captured by a smartphone,” wrote lead study author Mohammed K. Alnoury, of the Department of Otolaryngology–Head and Neck Surgery in the Faculty of Medicine at King Abdulaziz University in Saudi Arabia.
The researcher reported no conflicts of interest.