Researchers who analyzed voice recordings found that specific acoustic features, particularly harmonic-to-noise ratio and pitch, may help identify benign vocal fold lesions.
In the overall sample, statistically significant differences were observed between the benign lesion and no voice disorder cohorts in mean harmonic-to-noise ratio (HNR; P = .019), HNR variability (standard deviation, P = .028), and fundamental frequency (P = .012). HNR variability also differed between benign lesions and laryngeal cancer (P = .028). No significant differences were found for jitter or shimmer. When stratified by sex, differences in HNR and HNR variability between benign lesions and no voice disorder, as well as HNR variability between benign lesions and laryngeal cancer, were found only among cisgender men. No statistically significant differences were observed among cisgender women, which the authors attributed to the limited overall sample size.
The analysis included 176 participants recruited across 5 North American sites. Participants were analyzed in two sets of comparisons. The first compared those with laryngeal cancer (n = 10), benign lesions (n = 13), or no voice disorder (n = 122). The second compared those with benign or malignant lesions but no other voice disorder (n = 17) to participants with spasmodic dysphonia (n = 8) or unilateral vocal fold paralysis (n = 26).
Voice samples were taken from the Rainbow Passage, a paragraph that contains all phonemes in American English and is commonly used by speech pathologists to assess voice function. Acoustic features were extracted using openSMILE software and included:
- Fundamental frequency (F0): the rate of vocal fold vibration, which conveys pitch.
- Jitter: cycle-to-cycle variation in F0; associated with reduced control of vocal fold vibration.
- Shimmer: cycle-to-cycle variation in amplitude; often linked to breathiness and glottal resistance.
- Harmonic-to-noise ratio (HNR): the ratio of periodic (regular glottal pulses during phonation) to aperiodic (noise from turbulence as air flows through the glottis) components of the voice signal; reflects how much of the voice is tonal vs noisy.
- HNR variability (HNR SD): variation in HNR across phonation; measures consistency of vocal production.
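The feature definitions above can be illustrated with a minimal Python sketch. It is not the openSMILE implementation and is not taken from the study: it assumes the glottal cycle durations, per-cycle peak amplitudes, and per-frame harmonic and noise power have already been estimated by some other tool, and it uses common textbook "local" formulas for jitter and shimmer. All function names and input values are hypothetical.

```python
import math
import statistics


def mean_f0(periods_s):
    """Mean fundamental frequency in Hz from cycle durations in seconds."""
    return statistics.fmean(1.0 / p for p in periods_s)


def local_jitter(periods_s):
    """Mean absolute difference between consecutive cycle durations,
    divided by the mean cycle duration (assumed textbook definition)."""
    diffs = [abs(b - a) for a, b in zip(periods_s, periods_s[1:])]
    return statistics.fmean(diffs) / statistics.fmean(periods_s)


def local_shimmer(amplitudes):
    """Mean absolute difference between consecutive cycle peak amplitudes,
    divided by the mean amplitude (assumed textbook definition)."""
    diffs = [abs(b - a) for a, b in zip(amplitudes, amplitudes[1:])]
    return statistics.fmean(diffs) / statistics.fmean(amplitudes)


def hnr_db(harmonic_power, noise_power):
    """Harmonic-to-noise ratio in decibels for one analysis frame."""
    return 10.0 * math.log10(harmonic_power / noise_power)


def hnr_sd(hnr_frames_db):
    """Sample standard deviation of per-frame HNR values (HNR variability)."""
    return statistics.stdev(hnr_frames_db)


# Hypothetical inputs, chosen only to exercise the functions.
periods = [0.010, 0.011, 0.010, 0.011]       # seconds per glottal cycle
amplitudes = [1.00, 0.90, 1.00, 0.90]        # peak amplitude per cycle
frames = [(100.0, 1.0), (50.0, 1.0), (100.0, 2.0)]  # (harmonic, noise) power

hnr_values = [hnr_db(h, n) for h, n in frames]
print("mean F0 (Hz):", mean_f0(periods))
print("jitter:", local_jitter(periods))
print("shimmer:", local_shimmer(amplitudes))
print("HNR SD (dB):", hnr_sd(hnr_values))
```

A perfectly regular voice signal would give jitter and shimmer of 0 in this sketch, and identical HNR in every frame would give an HNR SD of 0. Production toolkits typically add voicing detection, smoothing, and windowing steps that are omitted here.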
The median age of participants was 59 years. The lesion-present group had a median weight approximately 20 pounds higher than the lesion-absent group and included 13% more African American participants. Overall, the dataset was predominantly White, heterosexual, and female.
“Our preliminary analysis of the Bridge2AI-Voice data set shows early promise that there are vocal features that can act as a biomarker for vocal fold lesions,” wrote lead author Phillip Jenkins, PhD, of the Division of Informatics and Clinical Epidemiology at Oregon Health & Science University, and colleagues.
The authors noted that HNR variability may be relevant for monitoring lesion progression or identifying early vocal changes that are related to laryngeal cancer. However, distinguishing lesions from other voice disorders such as spasmodic dysphonia or vocal fold paralysis was more challenging because no significant differences were observed in those comparisons.
The study’s limitations included the small number of lesion cases, incomplete lesion histories, and the absence of objective measures of recording quality. The dataset reflected a research population that may not represent all patient demographics or lesion types. The authors recommended future research with larger and more diverse cohorts, lesion-specific details such as size and type, and exploration of additional acoustic features.
No conflicts of interest were reported.
Source: Frontiers