An analysis found that large language models may produce significantly different clinical recommendations depending on patients' sociodemographic characteristics, potentially contributing to health disparities. The study evaluated nine models on emergency department cases and found disparities across sociodemographic groups in recommendations for triage, testing, treatment, and mental health assessment. The magnitude and consistency of these differences suggest that language model outputs may be influenced more by demographic attributes than by clinical need.
Source: Nature Medicine