Machine learning models trained on electronic health records identified postpartum depression with modest accuracy while reducing racial and ethnic disparities in screening outcomes, according to a recent study.
Researchers at Cedars-Sinai Medical Center evaluated machine learning models for detecting perinatal mood and anxiety disorders (PMADs), including postpartum depression (PPD). The study, published in JAMA Network Open, analyzed data from 19,430 postpartum patients aged 14 to 59 years who had live births between 2020 and 2023. The researchers aimed to assess the predictive performance and fairness of models trained on electronic health records and to address potential racial and ethnic biases.
The patients were screened using either the Patient Health Questionnaire–9 or the Edinburgh Postnatal Depression Scale. Machine learning models, including logistic regression, random forest, and extreme gradient boosting, were trained to predict moderate- to high-risk PPD outcomes. Predictor variables included demographic information, prior mental health diagnoses, and delivery-related data. Reweighing techniques were applied during preprocessing to reduce racial and ethnic disparities.
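As an illustration of how reweighing during preprocessing can work, the following Python sketch computes per-sample weights using one common formulation, in which each sample is weighted by P(group) × P(label) / P(group, label). This is an assumption for illustration only: the article does not specify the exact formula the researchers used, and the group names, labels, and function names below are hypothetical toy values, not study data.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Return one weight per sample: P(group) * P(label) / P(group, label).

    Assumed formulation for illustration; not confirmed as the study's method.
    """
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


def weighted_positive_rate(groups, labels, weights, group):
    """Weighted share of positive labels within one group."""
    num = sum(w for g, y, w in zip(groups, labels, weights) if g == group and y == 1)
    den = sum(w for g, w in zip(groups, weights) if g == group)
    return num / den


# Toy, made-up data: group "A" has positive labels more often than group "B".
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

weights = reweighing_weights(groups, labels)

# After reweighing, the weighted positive rate is the same in both groups
# (approximately 0.5 each for this toy data).
print(weighted_positive_rate(groups, labels, weights, "A"))
print(weighted_positive_rate(groups, labels, weights, "B"))
```

In a sketch like this, the weights would typically be passed to a model's training routine as per-sample weights, so that group membership and outcome are statistically independent in the weighted training data.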
Performance was measured using the area under the receiver operating characteristic curve (AUROC). Baseline models achieved modest accuracy, with AUROCs ranging from 0.610 to 0.635. Reweighed models showed slightly lower AUROCs (0.602 to 0.622) but achieved significant reductions in the magnitude of demographic parity differences (from 0.238 to 0.022, P < .001) and false-negative rate differences (from −0.184 to 0.018, P < .001). Without reweighing, the models were biased: minority patients had higher positive screening rates and lower false-negative rates than non-Hispanic White patients.
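The two fairness metrics reported above can be sketched in a few lines of Python. The sign convention here (minority group minus reference group) is an assumption inferred from the reported signs, and the group codes, labels, and predictions are made-up toy values rather than study data.

```python
def positive_rate(preds):
    """Share of patients predicted positive."""
    return sum(preds) / len(preds)


def false_negative_rate(labels, preds):
    """Share of truly positive patients that the model predicted negative."""
    preds_for_positives = [p for y, p in zip(labels, preds) if y == 1]
    missed = sum(1 for p in preds_for_positives if p == 0)
    return missed / len(preds_for_positives)


def select(values, groups, group):
    """Keep only the values that belong to the given group."""
    return [v for v, g in zip(values, groups) if g == group]


def demographic_parity_difference(preds, groups, group, reference):
    """Positive prediction rate of `group` minus that of `reference`."""
    return positive_rate(select(preds, groups, group)) - positive_rate(
        select(preds, groups, reference)
    )


def fnr_difference(labels, preds, groups, group, reference):
    """False-negative rate of `group` minus that of `reference`."""
    return false_negative_rate(
        select(labels, groups, group), select(preds, groups, group)
    ) - false_negative_rate(
        select(labels, groups, reference), select(preds, groups, reference)
    )


# Toy, made-up data: "M" is a hypothetical minority group, "W" the reference group.
groups = ["M"] * 5 + ["W"] * 5
labels = [1, 1, 1, 0, 0, 1, 1, 1, 0, 0]
preds = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

dpd = demographic_parity_difference(preds, groups, "M", "W")
fnrd = fnr_difference(labels, preds, groups, "M", "W")
print(dpd, fnrd)
```

In this toy example the demographic parity difference is positive and the false-negative rate difference is negative, the same pattern the baseline models showed; values near zero for both would indicate more equal treatment across the two groups.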
While the predictive performance of the models was modest, the researchers emphasized the potential of machine learning to supplement traditional screening tools. The study underscored the need for models that account for and mitigate biases to help reduce disparities in PMAD detection and treatment.
The researchers acknowledged study limitations, including the models’ modest accuracy and the risk that reweighing could introduce new bias against certain groups. They noted that future research should focus on optimizing model parameters and addressing systemic barriers to improve access to mental health care.
The study also highlighted the importance of equitable approaches to developing predictive models, with an emphasis on fairness metrics such as demographic parity and false-negative rate differences. The findings indicated that machine learning may complement existing psychometric tools to support routine PPD screening, with the potential to improve detection and treatment outcomes, especially for underserved populations.
Full disclosures can be found in the published study.