January 2025
Journal of Racial and Ethnic Health Disparities
This study evaluated algorithmic fairness in low birthweight predictive models. It analyzed insurance claims (n = 9,990,990; 2013–2021) linked with birth certificates (n = 173,035; 2014–2021) from the Arkansas All Payers Claims Database (APCD). Low birthweight (< 2500 g) predictive models used four approaches (logistic regression, elastic net, linear discriminant analysis, and gradient boosting machines [GBM]), each fit with and without racial/ethnic information. Model performance was assessed overall, among Hispanic individuals, and among non-Hispanic White, Black, Native Hawaiian/Other Pacific Islander, and Asian individuals using multiple measures of predictive performance (i.e., AUC [area under the receiver operating characteristic curve] scores, calibration, sensitivity, and specificity). AUC scores were lower (i.e., the models underperformed) for Black and Asian individuals relative to White individuals. In the strongest-performing model (GBM), the AUC scores for Black (0.718 [95% CI: 0.705–0.732]) and Asian (0.655 [95% CI: 0.582–0.728]) populations were lower than the AUC for White individuals (0.764 [95% CI: 0.754–0.775]). Model performance measured using AUC was comparable between models that included and excluded race/ethnicity; however, when race/ethnicity was excluded, sensitivity (i.e., the percent of records correctly predicted as “low birthweight” among those who actually had low birthweight) was lower and calibration was weaker, indicating underprediction for Black individuals. This study found that race-blind models produced underprediction and reduced algorithmic performance, measured using sensitivity and calibration, for Black populations. Such underprediction could unfairly decrease the allocation of resources needed to reduce perinatal health inequities. Population health management programs should carefully consider algorithmic fairness in predictive models and in associated resource allocation decisions.
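The evaluation strategy described in the abstract (fitting models with and without race/ethnicity and comparing AUC, sensitivity, and calibration within racial/ethnic subgroups) can be illustrated with a short sketch. The code below is not the authors' pipeline: it uses synthetic data, and the predictor names, the scikit-learn gradient boosting implementation, and the 0.5 classification threshold are all assumptions made for illustration only.

```python
# Minimal sketch of subgroup fairness evaluation for a low-birthweight classifier:
# fit a model with and without race/ethnicity, then compare AUC, sensitivity, and
# a crude calibration summary (mean predicted risk vs. observed rate) by group.
# Data are synthetic; nothing here reproduces the study's APCD/birth-certificate data.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score, recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 20_000
df = pd.DataFrame({
    "maternal_age": rng.normal(29, 6, n),          # assumed predictor
    "prior_preterm": rng.binomial(1, 0.08, n),     # assumed predictor
    "smoking": rng.binomial(1, 0.12, n),           # assumed predictor
    "race_eth": rng.choice(["White", "Black", "Hispanic", "Asian", "NHOPI"],
                           n, p=[0.55, 0.20, 0.15, 0.05, 0.05]),
})
# Synthetic binary outcome standing in for low birthweight (< 2500 g).
logit = (-3 + 0.8 * df["prior_preterm"] + 0.5 * df["smoking"]
         + 0.3 * (df["race_eth"] == "Black").astype(int))
df["lbw"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

def fit_and_score(include_race: bool) -> pd.DataFrame:
    drop_cols = ["lbw"] + ([] if include_race else ["race_eth"])
    X = pd.get_dummies(df.drop(columns=drop_cols), drop_first=True)
    X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
        X, df["lbw"], df["race_eth"],
        test_size=0.3, random_state=0, stratify=df["lbw"])
    model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
    p = model.predict_proba(X_te)[:, 1]
    rows = []
    for grp in sorted(g_te.unique()):
        m = (g_te == grp).to_numpy()
        if y_te[m].nunique() < 2:
            continue  # AUC is undefined when a subgroup has only one class
        rows.append({
            "group": grp,
            "auc": roc_auc_score(y_te[m], p[m]),
            # Sensitivity at an assumed 0.5 threshold (recall for the positive class)
            "sensitivity": recall_score(y_te[m], (p[m] >= 0.5).astype(int)),
            # Crude calibration check: mean predicted risk vs. observed LBW rate
            "mean_pred": p[m].mean(),
            "observed_rate": y_te[m].mean(),
        })
    return pd.DataFrame(rows).assign(race_included=include_race)

print(pd.concat([fit_and_score(True), fit_and_score(False)]).round(3))
```

In this kind of comparison, overall AUC can look similar across the two specifications while subgroup-level sensitivity and the gap between mean predicted risk and observed rate reveal underprediction for a specific group, which is the pattern the study reports for Black individuals when race/ethnicity is excluded.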