Dolaand Valderrama
BMC Medical Informatics and Decision Making (2024) 24:367
https://doi.org/10.1186/s12911-024-02783-x
RESEARCH Open Access
© The Author(s) 2024. Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0
International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long
as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if
you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or
parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated
otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To
view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
Exploring parental factors influencing low birth weight on the 2022 CDC natality dataset
Sumaiya Sultana Dola1† and Camilo E. Valderrama1,2*†
Abstract
Background and aims Low birth weight (LBW), defined as a newborn weighing less than 2500 g, is a growing concern in the United States (US). Previous studies have identified several contributing factors, but many have analyzed these variables in isolation, limiting their ability to capture the combined influence of multiple factors. Moreover, past research has predominantly focused on maternal health, demographics, and socioeconomic conditions, often neglecting paternal factors such as age, educational level, and ethnicity. Additionally, most studies have utilized localized datasets, which may not reflect the diversity of the US population. To address these gaps, this study leverages machine learning to analyze the 2022 Centers for Disease Control and Prevention’s National Natality Dataset, identifying the most significant factors contributing to LBW across the US.
Methods We combined anthropometric, socioeconomic, maternal, and paternal factors to train logistic regression, random forest, XGBoost, conditional inference tree, and attention mechanism models to predict LBW and normal birth weight (NBW) outcomes. These models were interpreted using odds ratio analysis, feature importance, partial dependence plots (PDP), and Shapley Additive Explanations (SHAP) to identify the factors most strongly associated with LBW.
Results Across all five models, the factors most consistently associated with birth weight were maternal height, pre-pregnancy weight, weight gain during pregnancy, and parental ethnicity. Other pregnancy-related factors, such as prenatal visits and avoiding smoking, also significantly influenced birth weight.
Conclusion The relevance of maternal anthropometric factors, pregnancy weight gain, and parental ethnicity can help explain the current differences in LBW and NBW rates among various ethnic groups in the US. Ethnicities with shorter average statures, such as Asians and Hispanics, are more likely to have newborns below the World Health Organization’s 2500-gram threshold. Additionally, ethnic groups with historical challenges in accessing nutrition and perinatal care face a higher risk of delivering LBW infants.
Keywords Low birth weight, Machine learning, Interpretable predictive models, Parental factors, Maternal health,
Statistical analysis
Sumaiya Sultana Dola and Camilo E. Valderrama contributed equally to this
work.
*Correspondence:
Camilo E. Valderrama
c.valderrama@uwinnipeg.ca
Full list of author information is available at the end of the article
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Background
The birth weight of a newborn is a crucial determinant of their survival chances because, according to the World Health Organization (WHO), a newborn weighing less than 2500 grams is at increased risk of dying in the first 28 days of life [1]. Moreover, low birth weight (LBW) is associated with morbidity because those who survive may experience long-term physiological, neuropsychiatric, cognitive, and social challenges that persist into adulthood [2].
LBW is currently a public health issue in the United States (US), which reports more cases than any Western European country [3]. Recent data show a relative increase of about 1% in LBW, from 8.52% in 2021 to 8.60% in 2022, with a rise of 20% since 1980 [4]. As of 2022, 8.6% of US newborns were born with LBW, with Black newborns experiencing the highest LBW rate (14.0%), followed by Asian/Pacific Islanders (9.0%), American Indian/Alaska Natives (8.3%), and Whites (7.2%). Notably, the likelihood of LBW among Black newborns was double that of White newborns [5]. These ethnic differences were also reported by Paige et al. [6] after analyzing LBW incidence in more than 113,760 singleton live births in King County, Washington, from 2008 to 2012. The results showed that women from certain ethnic groups who were born outside of the US had a lower chance of having an LBW newborn than women who were born in the US, even after adjusting for common pregnancy complications. The lowest rates of LBW were found in White, Chinese, and Korean women. On the other hand, the highest rates of LBW were found in Filipino, Asian Indian, and non-Hispanic Black women (6.8–7.6%).
According to Morisaki et al. [7], the disparities in birth weight between ethnicities are not attributable to traditional factors like maternal age, socioeconomic status, and behavioral characteristics (e.g., smoking) but to maternal anthropometric factors. They reached this conclusion after reviewing singleton US live births between 2009 and 2012, finding that height, BMI, and specific pregnancy-related factors such as gestational weight gain and preterm birth rates were the most significant factors influencing LBW. Given the strong association between maternal body composition, including height, and birth weight [8], previous studies have suggested the need for alternative methods to identify LBW, as the 2500 g cut-off may not be appropriate for newborns of non-European descent [9].
Similar studies in other countries have also reported an association between maternal physical, socioeconomic, and health factors and LBW newborns. Sharma et al. [10], after reviewing 193 neonates in Chandigarh, India, reported an LBW prevalence of 23.8%, with higher rates observed among newborns whose mothers were under 20 (50.0%), were poorly educated (32.6%), or had a pre-pregnancy weight of less than 45 kg (50.0%).
Other factors contributing to LBW include health concerns, inadequate prenatal care, lower socioeconomic status, and limited education [11]. These factors negatively impact both the physical and mental health of the mother during pregnancy. The sex of the newborn is also an LBW contributor due to the inherent biological differences in growth patterns between male and female fetuses. According to Broere-Brown [12], male and female fetuses differ in weight and other biometrics, which leads to different body proportions. Male newborns generally weigh more, are longer, and have larger head circumferences than their female counterparts.
These previous studies have identified some relevant factors influencing LBW, such as maternal age, education, socioeconomic status, and ethnicity [4, 5, 10, 11]. In addition, one study has highlighted the strong influence of maternal anthropometric factors on birth weight outcomes [7]. However, although these studies have outlined factors shaping birth weight, they have not evaluated the extent to which these factors intersect to create a parental profile associated with a higher risk of having LBW newborns. Furthermore, their focus has primarily been on maternal health, demographics, and socioeconomic factors, often overlooking potential paternal influences such as the father’s age, education level, and ethnicity. Additionally, most of the previous research has been restricted to specific local populations in the US, neglecting the diversity across the US population. Therefore, there is a need for a more comprehensive analysis that incorporates various factors, including paternal predictors, to identify the most significant contributors to LBW across all 50 US states.
One way to correlate different factors and identify those more associated with LBW is to leverage machine learning (ML) and deep learning (DL) predictive models. Unlike traditional statistical methods and statistical hypothesis tests, which cannot accommodate interactions among many variables simultaneously, are limited in their ability to handle collinearity, and require a priori hypotheses about how variables relate to one another [13–17], ML and DL models can handle multiple correlated predictors simultaneously, yielding highly interpretable outcomes [18, 19]. In this way, ML and DL models provide a practical way to identify population subgroups with a high proportion of LBW.
This study presents an approach based on ML and DL models to correlate multiple factors, including anthropometric, socioeconomic, and demographic factors from both mothers and fathers, to predict LBW in a national US newborn dataset provided by the Centers for Disease Control and Prevention (CDC) [20]. To that aim, we use a range of predictive models, including logistic regression, random forest, XGBoost, conditional inference tree, and attention mechanism layers, to determine which factors most significantly influence LBW. Furthermore, to explain our models and enhance interpretability, we apply Shapley additive explanations (SHAP) and partial dependence plots (PDP) to the outputs of these predictive models, allowing us to identify both direct and inverse relationships between the factors and birth weight.
Methods
Data source
For this study, we used the 2022 National Natality Dataset, a publicly available file provided by the Centers for Disease Control and Prevention (CDC) [20, 21]. The dataset comprises information for 3,675,606 births registered in the US in 2022. For each newborn, 227 features are provided, including maternal anthropometrics (height and weight), parental demographics (race and education), and birth weight. The data were collected from the delivery admission form filled out by the mothers, as well as from the medical records collected before and during delivery, such as the first prenatal care visit date, pregnancy risk factors, and delivery mode.
Predictor variables
The 2022 National Natality Dataset provides 227 features describing births that occurred in the US, to both residents and non-residents. To reduce collinearity between the predictors, as well as the computational cost of building the predictive models, we selected 20 of the 227 variables. Our selection was based on previous studies suggesting significant factors influencing birth weight [5, 8, 11, 22, 23]. These variables fall into anthropometric, maternal, paternal, socioeconomic, and ethnicity categories.
Anthropometric variables generally reflect an individual’s physical and biological development through body measurements like height, weight, and body mass index (BMI) [24]. These measurements provide information about the mother’s nutrition and health, which are important indicators of the newborn’s health. BMI helps identify pregnancy complications caused by being underweight or overweight, which may impact birth weight [25]. Maternal height and pre-pregnancy weight jointly influence fetal growth. Taller mothers experience accelerated fetal growth in the first and second trimesters, likely due to genetic factors, whereas maternal weight status increasingly influences intrauterine growth in the third trimester [26]. Overall, taller and heavier mothers tend to give birth to larger newborns.
Parental factors, particularly the mother’s age, play a critical role in determining birth outcomes. Both younger and older mothers often face increased complications, such as preterm birth and LBW [27]. Similarly, older paternal age is associated with more genetic abnormalities in offspring. Compared with fathers aged 20 to 34, those older than 34 years have a 90% higher chance of having an LBW newborn, and teenage fathers have a 20% lower chance [28]. In addition, maternal smoking during pregnancy affects fetal development by shortening the gestation period and reducing fetal growth, leading to LBW [29].
Pregnancy history, including previous live births, stillbirths, or neonatal deaths, also provides insight into potential risks. Mothers who have had two or more successful pregnancies tend to have more newborns with normal birth weight than nulliparous women [30–32]. In contrast, a history of previous fetal loss has been linked to a higher occurrence of abnormalities in pregnancies [33]. Such a loss can affect a mother physically and mentally [34, 35], which can in turn lead to adverse outcomes.
Parental education levels significantly influence birth outcomes by affecting access to resources and health literacy [11, 22, 23]. Mothers and fathers with more education tend to get better prenatal care and make healthier lifestyle choices, leading to more favorable birth outcomes. Prenatal care and the frequency of prenatal visits are critical [36], as they ensure timely monitoring and intervention, which are essential for identifying and mitigating risks during pregnancy.
Various studies indicate that birth outcomes are not consistent across different ethnicities [5, 37, 38]. Moreover, the origin of the parents can affect the health of the fetus. Lebron et al. [39] investigated the significant influence of a mother’s origin on healthcare access, educational opportunities, and economic stability among Hispanic subgroups. These factors are all related to socioeconomic status and have an impact on maternal and newborn health outcomes, such as breastfeeding, birth weight, and newborn mortality. That study also describes how sociopolitical factors, particularly immigrant policies, directly and indirectly affect these health outcomes through stress, limited healthcare access, and other mechanisms.
Outcome variable
We aim to analyze the factors that influence newborns’ birth weight. As such, we used the birth weight (DBWT) column to determine the outcome variable. As 2500 grams is the WHO’s established cut-off for LBW, we divided the birth records into two classes. Newborns with a birth weight lower than 2500 grams were labeled as “Low Birth Weight” (LBW), and those whose birth weight was higher than 2500 grams were labeled as “Normal Birth Weight” (NBW).
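The labeling rule above can be sketched in a few lines. This is a minimal illustration on toy weights, not the authors' code; it assumes the NBW = 1 / LBW = 0 coding used later for model training, and it assumes that a weight of exactly 2500 g counts as NBW, a boundary case the text does not spell out.

```python
import pandas as pd

LBW_CUTOFF_G = 2500  # WHO cut-off cited in the text


def label_birth_weight(dbwt_grams: pd.Series) -> pd.Series:
    """Return 1 for NBW and 0 for LBW."""
    # Assumption: exactly 2500 g is treated as NBW (WHO defines LBW as < 2500 g).
    return (dbwt_grams >= LBW_CUTOFF_G).astype(int)


# Toy records, not taken from the natality file.
toy = pd.DataFrame({"DBWT": [1980, 2499, 2500, 3350]})
toy["label"] = label_birth_weight(toy["DBWT"])
print(toy)
```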
Data filtering
In this study, we focused exclusively on newborns with a gestational age of at least 37 weeks (i.e., COMBGEST ≥ 37) due to the strong correlation between preterm births and low birth weight (LBW) [40–43]. Newborns born before 37 weeks typically have a birth weight below 2500 grams, and including them could skew our analysis. We also excluded non-singleton records, as indicated by the column ‘DPLURAL’, to prevent confounding factors associated with multiple pregnancies. Records from parents identified as mixed race were excluded to avoid ambiguity in the interpretation of results among ethnicities. Furthermore, we only included infants reported to be alive at the time of the report to avoid bias in our predictions due to medical complications. To assess fetal well-being against the predictor variables, we removed any birth records lacking an APGAR score at 5 minutes. Finally, records with unknown values for the selected predictor and outcome variables were also excluded.
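A rough sketch of such a filter in pandas follows. Only COMBGEST and DPLURAL are column names given in the text; the columns `ILIVE` and `APGAR5`, the code `1` for singletons, and the code `"Y"` for living infants are assumptions made for illustration and should be checked against the dataset documentation. The mixed-race exclusion is omitted because its coding is not described here.

```python
import pandas as pd


def filter_births(df: pd.DataFrame) -> pd.DataFrame:
    """Keep term, singleton, living births with a recorded 5-minute APGAR score."""
    keep = (
        (df["COMBGEST"] >= 37)     # term births only
        & (df["DPLURAL"] == 1)     # assumed code: 1 = singleton
        & (df["ILIVE"] == "Y")     # hypothetical column and code: alive at report
        & df["APGAR5"].notna()     # hypothetical column: 5-minute APGAR present
    )
    # Drop any remaining rows with missing values in the kept columns.
    return df.loc[keep].dropna().reset_index(drop=True)


# Toy records, one passing row and four rows that each fail a different rule.
toy = pd.DataFrame({
    "COMBGEST": [39, 35, 40, 38, 41],
    "DPLURAL": [1, 1, 2, 1, 1],
    "ILIVE": ["Y", "Y", "Y", "N", "Y"],
    "APGAR5": [9, 8, 9, 7, None],
})
filtered = filter_births(toy)
print(filtered)
```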
Figure1 shows the data filtering process. Initially, our
dataset included 3,675,606 newborn newborns. After
filtering out instances based on gestational age, plural-
ity records, mixed races, infant living at the time of the
report, and unknown values, the final dataset contained
2,303,722 instances.
Distribution ofthepredictor variables
Tables1 and 2 show the distribution of the 20 predic-
tor variables, separated into numerical and categorical
variables, respectively. For the numerical variables, the
mean and standard deviation are provided, while for the
Fig. 1 Data filtering process
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 5 of 24
Dolaand Valderrama BMC Medical Informatics and Decision Making (2024) 24:367
categorical variables, the number of samples and the rela-
tive frequency for each category are displayed. e differ-
ent subgroups that six parental ethnicities encompassed
are displayed in Table3.
Data preparation
Training and test sets
The final dataset containing 2,303,722 instances was split into training and test sets. The training set comprised 80% of the data, and the test set comprised the remaining 20%. Although our major goal was to combine parental factors to identify those most associated with birth weight outcomes, we used the test set to evaluate the generalization capacity of the models. Because the test set was not used to fit the predictive models, assessing the models on these independent samples provided a reliable means of evaluating the identified patterns.
To tune the hyperparameters of the machine learning models, we further split the training set into two sets: training and validation. Each hyperparameter configuration was used to train the model, and the validation set was used for performance evaluation. The hyperparameters with the highest performance were selected to train the final model, which was then evaluated on the independent, held-out test data.
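The split-and-tune procedure can be illustrated as follows. This is a sketch on synthetic data, not the study's pipeline: the stratified splitting, the 25% validation fraction, the grid of `C` values, and the use of macro recall as the selection metric are all assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the birth records.
X, y = make_classification(n_samples=2000, n_features=8,
                           weights=[0.1, 0.9], random_state=0)

# 80/20 split into training data and held-out test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0)

# Further split of the training data into fitting and validation parts.
X_fit, X_val, y_fit, y_val = train_test_split(
    X_train, y_train, test_size=0.25, stratify=y_train, random_state=0)

# Try each configuration, score it on the validation set, keep the best one.
candidates = [0.01, 0.1, 1.0, 10.0]
best_C, best_score = None, -1.0
for C in candidates:
    model = LogisticRegression(C=C, max_iter=1000).fit(X_fit, y_fit)
    score = recall_score(y_val, model.predict(X_val), average="macro")
    if score > best_score:
        best_C, best_score = C, score

# Refit with the selected configuration and evaluate once on the test set.
final_model = LogisticRegression(C=best_C, max_iter=1000).fit(X_train, y_train)
test_macro_recall = recall_score(y_test, final_model.predict(X_test),
                                 average="macro")
print(best_C, round(test_macro_recall, 3))
```

The test set is touched only once, after the configuration has been fixed, which is what makes it an independent check.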
To train and evaluate the performance of the predictive models, NBW was labeled as ‘1’, while LBW was labeled as ‘0’. As the dataset was imbalanced, with LBW being the minority class, the models were trained to prioritize the accurate prediction of LBW. This focus was driven by the fact that LBW is a critical health condition that requires proper identification. Consequently, treating LBW as the condition to be detected, our models were optimized to reduce false negatives (newborns predicted as NBW when they were actually LBW) in preference to false positives (newborns predicted as LBW when they were actually NBW).
Data preprocessing
The predictor variables were separated into numerical and categorical variables. The categorical variables were converted into dummy variables using one-hot encoding. The numerical variables were scaled using min-max normalization.
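As an illustration, the two preprocessing steps can be combined with scikit-learn's `ColumnTransformer`. The column names come from Tables 1 and 2, but the toy values and category labels are invented for the example, and the choice of scikit-learn is an assumption rather than a statement of the tooling used in the study.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

numeric_cols = ["M_Ht_In", "WTGAIN"]
categorical_cols = ["SEX", "MBSTATE_REC"]

# Toy rows with illustrative values only.
toy = pd.DataFrame({
    "M_Ht_In": [60.0, 64.0, 68.0],
    "WTGAIN": [10.0, 30.0, 50.0],
    "SEX": ["M", "F", "M"],
    "MBSTATE_REC": ["US born", "born outside", "US born"],
})

preprocess = ColumnTransformer(
    [
        ("num", MinMaxScaler(), numeric_cols),  # rescale each column to [0, 1]
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0,  # always return a dense array
)
X = preprocess.fit_transform(toy)
print(X)
```

In practice the transformer would be fitted on the training set only and then applied to the validation and test sets, so that the scaling ranges do not leak information from held-out data.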
Resampling
Because LBW cases accounted for only around 3% of the training set, the training set was imbalanced. To address this issue, we employed Random Over Sampling (ROS) to obtain a more balanced distribution of classes in the training set.
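Random oversampling duplicates minority-class rows, drawn with replacement, until the classes are closer in size. The sketch below uses NumPy only and balances the classes fully (1:1); the target ratio is an assumption, since the text only states that the distribution became more balanced.

```python
import numpy as np


def random_oversample(X, y, seed=0):
    """Resample every class with replacement up to the majority-class size."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    parts = []
    for cls, count in zip(classes, counts):
        cls_idx = np.flatnonzero(y == cls)
        # Draw the missing number of rows from the existing rows of this class.
        extra = rng.choice(cls_idx, size=target - count, replace=True)
        parts.append(np.concatenate([cls_idx, extra]))
    idx = rng.permutation(np.concatenate(parts))
    return X[idx], y[idx]


# Toy training set with 3% LBW (label 0) and 97% NBW (label 1).
y = np.array([0] * 3 + [1] * 97)
X = np.arange(100).reshape(-1, 1)
X_res, y_res = random_oversample(X, y)
print(np.bincount(y_res))
```

Oversampling is applied to the training set only; the validation and test sets keep their original class proportions.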
Predictive models
We used logistic regression, random forest, XGBoost, conditional inference tree, and attention mechanisms to predict the two birth weight classes. These models relate the predictor variables to the outcome in different ways, including non-linear relationships, to classify between LBW and NBW newborns. Together, these five models offer a robust approach for identifying relevant predictors of birth weight, highlighting those that consistently emerged as significant across all predictive methods.
Table 1 Description of the 14 numerical predictor variables selected for predicting normal birth weight against low birth weight. For each variable, the mean and standard deviation (SD) are provided

| Category | Variable | Description | Mean (SD) |
|---|---|---|---|
| Anthropometric | M_Ht_In | Maternal height (inches) | 64.2 (2.8) |
| | BMI | Body Mass Index | 27.5 (6.7) |
| | PWgt_R | Pre-pregnancy weight (pounds) | 161.4 (41.4) |
| Paternal Factor | FAGECOMB | Paternal age (years) | 32.0 (6.6) |
| Maternal Factor | MAGER | Maternal age (years) | 29.8 (5.5) |
| | WTGAIN | Weight gain (pounds) | 29.3 (14.7) |
| | CIG_0 | Daily cigarettes before pregnancy | 0.5 (3.1) |
| | CIG_1 | Daily cigarettes during 1st trimester | 0.3 (2.3) |
| | CIG_2 | Daily cigarettes during 2nd trimester | 0.2 (1.9) |
| | CIG_3 | Daily cigarettes during 3rd trimester | 0.2 (1.8) |
| Previous Pregnancies | PRIORLIVE | Prior births now living (count) | 1.1 (1.2) |
| | PRIORDEAD | Prior births now dead (count) | 0.0 (0.2) |
| Prenatal care | PREVIS_REC | Number of prenatal visits (count) | 6.9 (1.8) |
| | PRECARE5 | Month prenatal care began | 2.8 (1.4) |

Table 2 Description of the 6 categorical predictor variables selected for predicting normal birth weight against low birth weight. For each variable, the number of samples and the percentage in each category are provided

| Category | Variable | Description | Level | Number (percent) |
|---|---|---|---|---|
| Ethnicity | MRACE15 | Maternal ethnicity | White | 1,357,994 (59.0) |
| | | | Hispanic | 491,418 (21.3) |
| | | | Black | 267,922 (11.6) |
| | | | Asian | 167,357 (7.3) |
| | | | Indigenous | 13,677 (0.6) |
| | | | Pacific Islanders | 5,354 (0.2) |
| | FRACE15 | Paternal ethnicity | White | 1,356,042 (58.9) |
| | | | Hispanic | 457,292 (19.9) |
| | | | Black | 319,525 (13.9) |
| | | | Asian | 151,372 (6.6) |
| | | | Indigenous | 13,534 (0.6) |
| | | | Pacific Islanders | 5,957 (0.3) |
| Newborn sex | SEX | Newborn’s sex | Male | 1,173,414 (50.9) |
| | | | Female | 1,130,308 (49.1) |
| Socioeconomic | MEDUC | Maternal education level | 8th grade or less | 51,656 (2.2) |
| | | | 9th through 12th grade with no diploma | 124,089 (5.3) |
| | | | High school graduate or GED completed | 530,308 (23.0) |
| | | | Some college credit, but not a degree | 401,453 (17.4) |
| | | | Associate degree (AA, AS) | 209,008 (9.1) |
| | | | Bachelor’s degree (BA, AB, BS) | 603,961 (26.2) |
| | | | Master’s degree (MA, MS, MEng, MEd, MSW, MBA) | 295,846 (12.8) |
| | | | Doctorate (PhD, EdD) or Professional Degree (MD, DDS, DVM, LLB, JD) | 87,401 (3.8) |
| | FEDUC | Paternal education level | 8th grade or less | 62,893 (2.7) |
| | | | 9th through 12th grade with no diploma | 155,774 (6.8) |
| | | | High school graduate or GED completed | 683,298 (29.7) |
| | | | Some college credit, but not a degree | 404,908 (17.6) |
| | | | Associate degree (AA, AS) | 174,720 (7.6) |
| | | | Bachelor’s degree (BA, AB, BS) | 527,983 (22.9) |
| | | | Master’s degree (MA, MS, MEng, MEd, MSW, MBA) | 204,495 (8.9) |
| | | | Doctorate (PhD, EdD) or Professional Degree (MD, DDS, DVM, LLB, JD) | 89,651 (3.9) |
| | MBSTATE_REC | Maternal origin | US born | 1,819,958 (79.0) |
| | | | Born outside | 483,764 (21.0) |

Table 3 Detailed breakdown of ethnic categories of parents

| Ethnicity of parents | Categories |
|---|---|
| White | |
| Black | |
| Asian | Asian Indian, Chinese, Filipino, Japanese, Korean, Vietnamese, Other Asian |
| Hispanic | Mexican, Puerto Rican, Cuban, Central or South American, Dominican, Other and unknown Hispanic |
| Indigenous | American Indian and Alaska Native |
| Pacific Islander | Native Hawaiian and Other Pacific Islander |

Logistic regression converted the combination of predictor variables into probabilities using the sigmoid function, thus indicating which combinations had higher odds of belonging to the NBW class. To train logistic regression, we used the majority category of each categorical variable as a reference (see Table 2). Thus, for both maternal and paternal ethnicity, the reference category was White; for newborn sex, the reference was male; for maternal education level, the reference was a Bachelor’s degree; for paternal education, the reference was a high school graduate; and for maternal origin, the reference was US-born.
Random forest (RF) and XGBoost built multiple decision trees to identify rules more associated with LBW and NBW. Each tree was built using a randomly selected subset of the training data and of the predictor variables. The difference between RF and XGBoost lies in the way the individual trees were combined. RF used a bagging strategy, in which the trees were trained independently. In contrast, XGBoost used a boosting strategy, in which trees were trained sequentially so that each new tree corrected the mistakes made by the previous ones.
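The contrast between bagging and boosting can be seen in a small scikit-learn example. Note the substitution: scikit-learn's `GradientBoostingClassifier` stands in for XGBoost here so that the sketch needs no extra dependency; it follows the same sequential idea but is not the XGBoost implementation. The data and hyperparameters are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1500, n_features=10, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Bagging: 100 trees trained independently on bootstrap samples,
# each split considering a random subset of the features.
bagged = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                random_state=0).fit(X_tr, y_tr)

# Boosting: 100 trees trained one after another, each fitted to the
# errors left by the trees before it (stand-in for XGBoost).
boosted = GradientBoostingClassifier(n_estimators=100, subsample=0.8,
                                     max_features="sqrt",
                                     random_state=0).fit(X_tr, y_tr)

# Test accuracy of the boosted ensemble after each additional tree.
staged_acc = [float((pred == y_te).mean())
              for pred in boosted.staged_predict(X_te)]

print("RF accuracy:", bagged.score(X_te, y_te))
print("Boosting accuracy after 1 and 100 trees:", staged_acc[0], staged_acc[-1])
```

Because boosted trees depend on their predecessors, the staged accuracies show how the ensemble's predictions change as trees are added, something that has no direct analogue for independently trained bagged trees.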
The conditional inference tree (CIT) built a tree relating the predictors based on their capacity to separate samples into two groups that differed with statistical significance [44]. To that aim, the CIT evaluated multiple hypothesis tests with Bonferroni correction to find the predictor variable that produced the lowest p value for discriminating between LBW and NBW cases.
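The variable-selection step at a single node can be sketched as follows. This is a simplified stand-in, not the conditional inference framework itself: it uses Welch's two-sample t-test from SciPy in place of the permutation-based tests, and the function name `select_split_variable`, the toy data, and the 0.05 threshold are assumptions for illustration.

```python
import numpy as np
from scipy import stats


def select_split_variable(X, y, names, alpha=0.05):
    """Pick the predictor with the smallest Bonferroni-adjusted p value."""
    pvals = np.array([
        stats.ttest_ind(X[y == 0, j], X[y == 1, j], equal_var=False).pvalue
        for j in range(X.shape[1])
    ])
    # Bonferroni: multiply each p value by the number of tests, cap at 1.
    adjusted = np.minimum(pvals * len(pvals), 1.0)
    best = int(np.argmin(adjusted))
    if adjusted[best] > alpha:
        return None, adjusted  # no significant association: stop splitting
    return names[best], adjusted


# Toy data: height differs between the two classes, the noise column does not.
rng = np.random.default_rng(0)
n = 400
y = rng.integers(0, 2, size=n)
height = rng.normal(64, 2.8, size=n) + 2.0 * y
noise = rng.normal(0, 1, size=n)
X = np.column_stack([height, noise])

chosen, adjusted = select_split_variable(X, y, ["M_Ht_In", "noise"])
print(chosen, adjusted)
```

A full tree would repeat this selection recursively within each resulting subgroup until no predictor passes the significance threshold.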
The attention layer mechanism is a deep learning model that identifies the variables that a model focuses on the most when making predictions [45]. This was achieved using three matrices, query (Q), key (K), and value (V), which were correlated to assign an attention weight to each input feature as follows:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$

where $d_k$ was the dimension of the keys, and the softmax function normalized the attention weights so that they sum to 1. The attention weight highlighted the importance of different input features relative to discriminating between LBW and NBW cases.
Evaluation performance
To evaluate the performance of the models, we computed the recall for each individual class, together with the macro average across classes of recall, precision, F1-score, receiver operating characteristic area under the curve (ROC AUC), and precision-recall area under the curve (PR AUC).
Data analysis and interpretation
After training the predictive models, we applied various post-processing methodologies to interpret them. Table 4 shows the different methodologies used. These interpretation methods allowed us to identify the common factors that consistently emerged as relevant across all analyses in distinguishing between NBW and LBW cases.
Odds‑ratio analysis
For logistic regression, we performed odds-ratio analysis
to determine which variables significantly correlated with
birth weight outcomes, thus identifying those that were
strongly associated with LBW.
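In a logistic regression, the odds ratio of a predictor is the exponential of its coefficient: values above 1 raise the odds of the class labeled ‘1’ (here NBW), and values below 1 lower them. The sketch below illustrates this on simulated data; the variables `height_z` and `smoker`, the simulated effect sizes, and the use of scikit-learn are assumptions for the example. Significance testing (p values) would require an additional tool and is not shown.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
height_z = rng.normal(size=n)        # standardized maternal height (toy)
smoker = rng.integers(0, 2, size=n)  # 1 = smoker, 0 = reference (toy)

# Simulated log-odds of NBW: taller raises the odds, smoking lowers them.
logit = 2.0 + 0.5 * height_z - 0.8 * smoker
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)  # 1 = NBW

X = np.column_stack([height_z, smoker])
# A very large C makes the default L2 penalty negligible.
model = LogisticRegression(C=1e6, max_iter=1000).fit(X, y)

odds_ratios = dict(zip(["height_z", "smoker"], np.exp(model.coef_[0])))
print(odds_ratios)
```

For a dummy-coded categorical variable such as `smoker`, the odds ratio compares that category against the reference category, which is why the choice of reference matters for interpretation.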
Feature importance
For the ensemble models, we conducted a feature importance analysis to identify the most influential factors contributing to the predictions. The ensemble models computed importance scores by weighting, summing, and averaging each feature’s contribution across all decision trees, identifying the factors that were most sensitive and critical for prediction performance.
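For a random forest in scikit-learn, this averaging can be made explicit: the forest-level importance is the mean of the per-tree impurity-based importances. The sketch below shows this on synthetic data. It covers only scikit-learn's random forest; XGBoost offers several importance definitions that are computed differently, and which one was used is not stated here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=6, n_informative=3,
                           n_redundant=0, random_state=0)
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Each tree reports how much each feature reduced impurity in its splits,
# normalized to sum to 1 within the tree.
per_tree = np.array([tree.feature_importances_ for tree in forest.estimators_])

# Averaging across trees gives the forest-level importance.
averaged = per_tree.mean(axis=0)
ranking = np.argsort(averaged)[::-1]  # feature indices, most important first

print(averaged.round(3), ranking)
```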
Attention weights
Similarly, for the attention mechanism, we visualized the attention scores assigned to each predictor after training the model. Higher attention scores indicated that a particular feature was more relevant for the prediction task.
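To make the attention scores concrete, the following NumPy sketch computes scaled dot-product attention, softmax(QKᵀ/√d_k)V, for random toy matrices. The matrix sizes are arbitrary, and the final step that averages the weights into one score per feature is a hypothetical aggregation chosen for illustration; the text does not specify how the per-feature scores were derived from the trained layer.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def scaled_dot_product_attention(Q, K, V):
    """Return the attention output and the attention weight matrix."""
    d_k = K.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))  # each row sums to 1
    return weights @ V, weights


rng = np.random.default_rng(0)
n_features, d_k = 5, 4  # toy sizes, one row per input feature
Q, K, V = (rng.normal(size=(n_features, d_k)) for _ in range(3))

output, weights = scaled_dot_product_attention(Q, K, V)

# Hypothetical aggregation: average attention each feature receives.
feature_scores = weights.mean(axis=0)
print(feature_scores.round(3))
```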
Conditional inference tree
We visualized the branches of the CIT, with each branch
representing a classification rule that offers insights into
how different predictor variables are combined to classify
NBW and LBW cases. By analyzing these branches, we
identified parental profiles associated with the lowest and
highest proportions of LBW newborns.
Table 4 Interpretability methodologies to post-process trained predictive models

| Technique | Models |
|---|---|
| Odds Ratio Analysis | LR |
| Feature Importance | RF, XGBoost |
| Attention Weights | Attention Layer |
| Partial Dependence Plot (PDP) | LR |
| Conditional Inference Tree | CIT |
| SHAP Values | LR, RF, XGBoost |

Partial dependence plots
To visualize the marginal impact of a single feature on LBW and NBW cases, we implemented Partial Dependence Plots (PDP) [46] using the logistic regression model. PDPs illustrate how a feature influences the predicted outcome by displaying the average prediction as that feature varies, averaging over the observed values of the other features. Unlike feature importance techniques, PDPs can reveal both the direction and nature of the relationship between a feature and the prediction outcome.
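A partial dependence curve can be computed by hand: for each grid value, overwrite the feature of interest in every row, predict, and average. The sketch below does this for a logistic regression fitted on simulated data; the helper name `partial_dependence_1d`, the data, and the grid are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def partial_dependence_1d(model, X, feature, grid):
    """Average predicted probability of class 1 as one feature is varied."""
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value  # set the feature to the grid value everywhere
        averages.append(model.predict_proba(X_mod)[:, 1].mean())
    return np.array(averages)


rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
logit = 1.5 + 1.0 * X[:, 0] - 0.5 * X[:, 1]
y = (rng.random(2000) < 1 / (1 + np.exp(-logit))).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)
grid = np.linspace(-2, 2, 9)
pd_curve = partial_dependence_1d(model, X, feature=0, grid=grid)
print(pd_curve.round(3))
```

A rising curve indicates that larger values of the feature increase the average predicted probability of the class labeled ‘1’; a falling curve indicates the opposite.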
Shapley additive exPlanations
We employed Shapley Additive Explanations (SHAP) to further analyze the predictive rules of logistic regression, RF, and XGBoost. SHAP analysis quantified the contribution of each feature to individual predictions, offering a detailed understanding of the models’ behavior, and generated summary plots that visualize these contributions. The summary plots displayed each variable vertically, with the x-axis representing the range of SHAP values. Positive values on the x-axis indicated a higher likelihood of the predicted outcome, while negative values suggested a lower likelihood. For a specific feature, red points (high feature values) on the right indicated a positive contribution to the likelihood of achieving NBW, whereas blue points (low feature values) on the left indicated a negative impact, reducing the likelihood of NBW. When a feature exhibited a marked contrast between red and blue across both positive and negative SHAP values, it suggested that the feature’s effect on the prediction varied considerably across its range.
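To make the idea of a per-feature contribution concrete, the sketch below computes exact Shapley values for a single instance by enumerating all feature coalitions. It is an illustration only: the scoring function, instance, and baseline are hypothetical, features outside a coalition are simply replaced by a baseline value (an independence assumption), and SHAP libraries generally rely on faster estimators rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance.

    Features absent from a coalition take their baseline value, a common
    simplification that treats features as independent.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def linear_score(row):
    """Hypothetical linear score standing in for a trained model's output."""
    return 0.5 + 2.0 * row[0] - 1.0 * row[1] + 0.5 * row[2]


x = [1.0, 0.0, 2.0]         # instance being explained (hypothetical)
baseline = [0.0, 1.0, 1.0]  # reference instance (hypothetical)
phi = shapley_values(linear_score, x, baseline)

# Efficiency property: contributions add up to f(x) - f(baseline).
total_contribution = sum(phi)
difference = linear_score(x) - linear_score(baseline)
```

For a linear score, each contribution reduces to the coefficient times the distance of the feature from its baseline, so the sign of a SHAP value shows whether that feature pushed the prediction up or down for this instance.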
Results
Model evaluation on the testing set
Table 5 shows the performance on the held-out, inde-
pendent test samples. All the models achieved an accu-
racy greater than 64%, with XGBoost showing the highest
performance. Overall, the predictive models performed
better for predicting NBW than LBW. e macro preci-
sion, F1-score, and PR AUC were the lowest metrics due
to the high imbalance between NBW and LBW classes, as
well as the fact that the models were trained to prioritize
the prediction of LBW cases. Consequently, the models
obtained a false positive rate for the LBW class around
34%, which, given the high ratio between LBW and NBW
samples (1:30), resulted in a low precision for the LBW
class. Nevertheless, the average ROC AUC across the
models was nearly 70%, indicating that the models were
able to effectively distinguish between LBW and NBW
cases [47].
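The link between the false positive rate, the class ratio, and the low LBW precision can be checked with a short calculation. The sketch below uses the average recalls from Table 5 (63.5% for LBW, 65.6% for NBW) and the 1:30 class ratio quoted above; the function name is ours, and the result is only an approximation derived from these rounded averages, not a figure reported by the study.

```python
def precision_from_rates(recall, false_positive_rate, negatives_per_positive):
    """Precision for a class, per one positive sample of that class."""
    true_positives = recall * 1.0
    false_positives = false_positive_rate * negatives_per_positive
    return true_positives / (true_positives + false_positives)


lbw_recall = 0.635  # average LBW recall in Table 5
nbw_recall = 0.656  # average NBW recall in Table 5

# LBW as the positive class: 30 NBW samples per LBW sample,
# and an NBW sample is a false positive when it is not recalled as NBW.
lbw_precision = precision_from_rates(lbw_recall, 1.0 - nbw_recall, 30)

# NBW as the positive class: 1/30 LBW sample per NBW sample.
nbw_precision = precision_from_rates(nbw_recall, 1.0 - lbw_recall, 1 / 30)

macro_precision = (lbw_precision + nbw_precision) / 2
```

Under these assumptions the LBW precision comes out at roughly 6% and the NBW precision at roughly 98%, so their mean is close to the macro precision of about 52% in Table 5. This shows how a moderate false positive rate translates into a low minority-class precision when the classes are this imbalanced.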
Odds ratio analysis
Table 6 shows significant factors (
p value <0.05
)
obtained by the logistic regression for predicting NBW
(class labeled as ‘1’) and LBW (class labeled as ‘0’). Mater-
nal anthropometrics showed a strong association with
the odds of having NBW newborns. Specifically, taller
mothers and those with higher pre-pregnancy weight had
higher odds of delivering NBW newborns. In addition
to anthropometric factors, the chronological age of both
the mother and father showed a negative association
with NBW, as the odds of delivering an NBW newborn
decreased with increasing parental age.
e logistic regression analysis showed that parental
ethnicity correlated with birth weight outcomes. Par-
ents who identified as Black or Asian had higher odds of
having LBW offspring than their White counterparts. In
contrast, Hispanic mothers were more likely to have new-
borns with NBW compared to White mothers. Interest-
ingly, mothers who were born outside the US were more
associated with NBW newborns than US-born mothers.
Actions taken during pregnancy and previous pregnancy history were significantly associated with the odds of delivering NBW infants. For instance, gaining adequate weight during pregnancy and attending prenatal visits were positively associated with having NBW newborns. Conversely, smoking during pregnancy negatively impacted the odds of NBW, particularly in the first trimester, where an increase of one unit in daily cigarette consumption decreased the odds of delivering an NBW newborn by 76%. Additionally, the number of previous living births emerged as a critical indicator of NBW outcomes, suggesting that mothers with a history of successful pregnancies have a higher likelihood of delivering NBW newborns.

Table 5 Performance of the predictive models for classifying low birth weight (LBW) and normal birth weight (NBW). Individual recall for each class is presented, along with accuracy, macro recall, macro precision, macro F1-score, macro area under the receiver operating characteristic curve (ROC AUC), and macro area under the precision-recall curve (PR AUC). All values are percentages
Model                LBW recall  NBW recall  Accuracy  Macro recall  Macro precision  Macro F1-score  Macro ROC AUC  Macro PR AUC
LR                   64.0        66.0        66.0      65.0          52.0             44.0            70.4           52.9
RF                   62.0        66.0        66.0      64.0          52.0             44.0            69.5           52.7
XGBoost              66.0        68.0        68.3      67.0          52.0             46.0            73.4           53.9
CIT                  61.6        61.8        61.8      61.7          51.4             43.4            63.8           51.3
Attention Mechanism  64.0        66.0        66.3      65.0          52.0             45.0            70.5           52.9
Average              63.5        65.6        65.7      64.3          51.9             44.5            69.52          52.7

The educational levels of both mothers and fathers were significantly associated with the likelihood of a newborn having an NBW. Mothers with an education level of an associate degree or lower exhibited lower odds of delivering NBW newborns compared to those with a bachelor’s degree. Similarly, fathers who completed at least a bachelor’s degree had approximately 33% to 41% higher odds of having an NBW newborn than those who graduated from high school (Table 6).
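The odds ratios discussed in this section follow directly from the logistic regression coefficients: the odds ratio is the exponential of the coefficient, and the percentage change in the odds is the odds ratio minus one. The short sketch below applies this to three coefficients copied from Table 6; the helper names are ours, and small differences from the published odds ratios are expected because the coefficients in the table are rounded to two decimals.

```python
import math


def odds_ratio(coefficient):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(coefficient)


def percent_change_in_odds(coefficient):
    """Percentage change in the odds of NBW per one-unit increase in the predictor."""
    return (math.exp(coefficient) - 1.0) * 100.0


# Coefficients as reported (rounded) in Table 6.
table6_coefficients = {
    "Daily cigarettes in the 1st trimester": -1.39,
    "Father - Bachelor's degree": 0.34,
    "Mother - Hispanic": 0.13,
}

summary = {
    name: (odds_ratio(beta), percent_change_in_odds(beta))
    for name, beta in table6_coefficients.items()
}

for name, (ratio, change) in summary.items():
    print(f"{name}: OR = {ratio:.2f}, change in odds = {change:+.0f}%")
```

A coefficient below zero gives an odds ratio below one (lower odds of NBW), while a coefficient above zero gives an odds ratio above one. The "one unit" refers to the scale on which each predictor entered the model.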
Table 6 Odds ratio analysis for the logistic regression coefficients. All coefficients were significant at the significance level of 0.05. The top 10 significant features are mothers who were born outside of the US, Asian fathers, fathers with a bachelor’s degree, female newborns, Black mothers, month prenatal care began, number of prenatal care visits, number of previous living births, and weight gain
Variable                                            Coefficient  95% CI            Odds ratio  P value
Anthropometric
  Maternal height                                   3.70         (3.45, 4.00)      41.40       < 0.001
  Pre-pregnancy weight                              1.54         (1.16, 1.92)      4.66        < 0.001
Ethnicity (White as reference)
  Mother - Black                                    −0.53        (−0.55, −0.51)    0.59        0.0
  Father - Asian                                    −0.53        (−0.55, −0.51)    0.59        < 0.001
  Father - Black                                    −0.29        (−0.31, −0.27)    0.75        < 0.001
  Mother - Hispanic                                 0.13         (0.11, 0.15)      1.14        < 0.001
  Mother - Asian                                    −0.14        (−0.16, −0.11)    0.87        < 0.001
  Mother - Indigenous                               −0.28        (−0.39, −0.17)    0.75        < 0.001
  Father - Pacific Islander                         −0.11        (−0.17, −0.05)    0.90        < 0.001
Maternal education (Bachelor’s degree as reference)
  Mother - 9th through 12th grade with no diploma   −0.40        (−0.42, −0.38)    0.66        < 0.001
  Mother - High school graduate or GED completed    −0.20        (−0.21, −0.18)    0.81        < 0.001
  Mother - Some college credit, but not a degree    −0.15        (−0.16, −0.13)    0.86        < 0.001
  Mother - Associate degree                         −0.10        (−0.13, −0.09)    0.90        < 0.001
  Mother - 8th grade or less                        −0.17        (−0.20, −0.13)    0.84        < 0.001
  Mother - Master’s degree                          0.05         (0.04, 0.07)      1.06        < 0.001
Paternal education (High school graduate or GED completed as reference)
  Father - Bachelor’s degree                        0.34         (0.33, 0.36)      1.41        0.0
  Father - Master’s degree                          0.29         (0.27, 0.31)      1.33        < 0.001
  Father - Some college credit, but not a degree    0.17         (0.16, 0.18)      1.18        < 0.001
  Father - Doctorate or Professional Degree         0.33         (0.30, 0.35)      1.39        < 0.001
  Father - Associate degree                         0.18         (0.16, 0.20)      1.20        < 0.001
  Father - 9th through 12th grade with no diploma   −0.06        (−0.08, −0.04)    0.93        < 0.001
  Father - 8th grade or less                        0.11         (0.08, 0.14)      1.12        < 0.001
Paternal age
  Paternal age                                      −0.26        (−0.34, −0.17)    0.77        < 0.001
Maternal factors
  Weight gain                                       2.63         (2.60, 2.67)      13.93       0.0
  Maternal age                                      −0.38        (−0.43, −0.34)    0.68        < 0.001
  Daily cigarettes before pregnancy                 −1.64        (−1.82, −1.45)    0.19        < 0.001
  Daily cigarettes in the 1st trimester             −1.39        (−1.78, −0.99)    0.24        < 0.001
  Daily cigarettes in the 3rd trimester             −1.20        (−1.74, −0.66)    0.30        < 0.001
  Daily cigarettes in the 2nd trimester             −1.23        (−1.86, −0.61)    0.29        < 0.001
Newborn sex (male as reference)
  Female (1: ‘yes’, 0: ‘no’)                        −0.21        (−0.21, −0.19)    0.81        0.0
Previous pregnancies
  Previous living births                            3.62         (3.54, 3.69)      37.29       0.0
Prenatal care
  Number of prenatal visits                         1.03         (1.01, 1.06)      2.81        0.0
  Month prenatal care started                       0.60         (0.57, 0.63)      1.82        0.0
Mother origin (Born in the US as reference)
  Born outside the US (1: ‘yes’, 0: ‘no’)           0.39         (0.39, 0.40)      1.48        0.0

Ensemble models relevant features
Figures 2 and 3 illustrate the feature importance for the random forest and XGBoost models, respectively. For both ensemble models, weight gain during pregnancy emerged as the most important predictor of NBW and LBW cases. Additionally, pre-pregnancy weight, maternal height, number of prenatal care visits, and previous living births ranked among the top ten features in both models. The random forest and XGBoost also highlighted the significance of paternal factors in predicting birth weight outcomes, revealing that the father’s ethnicity (White or Black) and age were critical for classifying LBW and NBW. Notably, neither model included educational factors in its top ten rankings based on feature importance.
Fig. 2 Feature importance for the random forest (RF) model. The top ten predictors identified as most relevant for birth weight predictions were weight gain (WTGAIN), Black parents, maternal height (M_Ht_in), pre-pregnancy weight (PWgt_R), number of previous living births (PRIORLIVE), White parents, number of prenatal care visits (PREVIS_REC), and female infants
Attention mechanism layer
Figure 4 shows the attention scores assigned by the self-attention mechanism to each variable. The bar chart ranks features according to their importance scores, with taller bars indicating greater significance for predicting birth weight. Among the features, the education level of parents exhibited the highest importance. Additionally, the number of prenatal care visits, the presence of Asian fathers, Black mothers, mothers born in the US, the month prenatal care commenced, pre-pregnancy weight, maternal height, and weight gain during pregnancy were also among the features that received the highest attention weights.
Partial dependence plots
Figures5, 6, 7, 8, 9, 10, 11, 12 and13 show the PDP for
nine parental factors based on the logistic regression
output, namely: weight gain during pregnancy, maternal
height, maternal pre-pregnancy weight, as well as paren-
tal ethnicity and education. In the plots, the x-axis repre-
sents the range of values for each feature, with numerical
features grouped into bins and categorical features repre-
sented by individual categories. e distribution of fea-
ture values was also displayed along the x-axis. e y-axis
shows the predicted change in the model output, with
the leftmost value on the x-axis serving as the reference
point. To aid interpretation, the PDP of the reference
value was set to zero, highlighting relative changes across
the feature values.
Figures5, 6, 7 display maternal anthropometric factors
effect on the chances of delivering an NBW newborn. Fig-
ure5 shows a significant upward trend with weight gain
during pregnancy, indicating that higher weight gains
Fig. 3 Feature importance for the XGBoost model. The top ten predictors identified as most relevant for birth weight predictions were weight gain
(WTGAIN), number of prenatal care visits (PREVIS_REC), number of previous living births (PRIORLIVE), pre-pregnancy weight (PWgt_R), BMI, month
prenatal care began (PRECARE), parental age (MAGER and FAGECOMB), maternal height (M_Ht_in), BMI, mothers who were born in the US
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 12 of 24
Dolaand Valderrama BMC Medical Informatics and Decision Making (2024) 24:367
were strongly related to NBW outcomes. For maternal
height, Fig.6 indicates that mothers between 61 and 64
inches (155163 cm) had similar probabilities of deliver-
ing an NBW newborn, but these probabilities increased
steadily for mothers taller than 64 inches, suggesting that
taller mothers were more likely to deliver NBW new-
borns. In terms of pre-pregnancy weight (Fig.7), there
was an increasing trend, indicating that heavier mothers
had more chances to deliver NBW newborns.
Figures 8 and 9 show the effect of parental age on
the birth weight prediction. e trend for both parents
was inverse, indicating that the older parents were, the
lower the probability of having an NBW newborn was.
Figures10 and 11 show the impact of parental ethnic-
ity on NBW outcomes. In general, White parents had
a higher probability of having an NBW newborn than
fathers from other ethnicities. Asian and Black par-
ents were those with the highest risk of having an LBW
newborn. Among ethnicities, Hispanic mothers were
the only group with a higher likelihood of delivering an
NBW newborn compared to White mothers.
Figures12 and 13 show the influence of parental edu-
cation on birth weight outcomes. Mothers with at least
a bachelor’s degree were more likely to deliver a new-
born with normal birth weight (NBW) compared to
those with only a high school diploma or some college
credits. Regarding fathers, those who had completed at
least an associate degree showed a significantly higher
probability of having an NBW newborn.
Fig. 4 Feature importance from the attention mechanism layer, based on attention scores assigned to each predictor variable. As a reference, equal relevance for all predictors would result in a score of $1/46 \approx 2.2 \times 10^{-2}$. Variables with scores higher than $2.2 \times 10^{-2}$ contributed the most to the birth weight predictions
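The reference score in the caption of Fig. 4 can be reproduced with a small sketch. It assumes that the attention scores are normalized with a softmax so that they sum to one across the 46 predictor variables; this normalization is our assumption for illustration, and the raw scores below are hypothetical rather than taken from the trained model.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to one."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


n_predictors = 46
uniform_reference = 1.0 / n_predictors  # about 2.2e-2

# Equal raw scores give every predictor the uniform reference weight.
uniform_weights = softmax([0.0] * n_predictors)

# Hypothetical raw scores in which two predictors stand out.
raw_scores = [0.0] * n_predictors
raw_scores[0] = 1.5
raw_scores[1] = 0.8
weights = softmax(raw_scores)

# Indices of predictors whose weight exceeds the uniform reference.
above_reference = [i for i, w in enumerate(weights) if w > uniform_reference]
```

Because the weights sum to one, any predictor above the uniform reference necessarily takes attention away from the others, which is why the threshold of about 2.2 × 10⁻² is a natural cut-off for "most relevant" variables.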
Conditional inference tree
Figure14 displays the conditional inference tree when its
maximum height was constrained to three levels. Among
the different predictor variables, the tree identified that
the most critical variables to discriminate between NBW
and LBW cases were maternal ethnicity, maternal height,
and maternal weight gain.
Fig. 5 PDP for maternal weight gain during pregnancy
Fig. 6 PDP for maternal height
Fig. 7 PDP for pre-pregnancy weight
Fig. 8 PDP for maternal age
Fig. 9 PDP for paternal age
Based on maternal ethnicity, the tree was split into two groups: one group included White, Hispanic, Pacific Islander, and Indigenous mothers, whereas the other encompassed Black and Asian mothers. For the White, Hispanic, Pacific Islander, and Indigenous mothers, the node with the highest proportion of LBW cases corresponded to mothers shorter than 63 inches who gained less than 28 lbs during pregnancy and whose pre-pregnancy weight was lower than 131 lbs (Node 5; 68.8%). For Black and Asian mothers, the node with the highest proportion of LBW cases was for mothers who gained less than 28 lbs (Node 9; 69.0%). The node with the highest proportion of NBW newborns (Node 18; 73.6%) corresponded to White, Hispanic, Pacific Islander, and Indigenous mothers taller than 63 inches who gained more than 27 lbs and held a bachelor’s, master’s, PhD, or professional degree.
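The three leaves described above can be read as explicit decision rules. The sketch below encodes only those three paths, exactly as paraphrased in the text; it is not the full tree of Fig. 14, the function and label names are ours, and the handling of a height of exactly 63 inches is left open because the text does not specify it.

```python
GROUP_A = {"White", "Hispanic", "Pacific Islander", "Indigenous"}
GROUP_B = {"Black", "Asian"}
HIGHER_EDUCATION = {"Bachelor's", "Master's", "PhD or professional"}


def described_leaf(ethnicity, height_in, gain_lb, pre_weight_lb, education):
    """Return (node, majority class, reported share) for the three leaves
    described in the text, or None for any other path through the tree."""
    if ethnicity in GROUP_B and gain_lb < 28:
        return ("Node 9", "LBW", 0.690)
    if ethnicity in GROUP_A:
        if height_in < 63 and gain_lb < 28 and pre_weight_lb < 131:
            return ("Node 5", "LBW", 0.688)
        if height_in > 63 and gain_lb > 27 and education in HIGHER_EDUCATION:
            return ("Node 18", "NBW", 0.736)
    return None  # path not described in the text


# Hypothetical profiles used only to exercise the three rules.
example_node_5 = described_leaf("Hispanic", 61, 20, 120, "HS")
example_node_9 = described_leaf("Asian", 65, 20, 140, "HS")
example_node_18 = described_leaf("White", 66, 35, 150, "Bachelor's")
example_other = described_leaf("Black", 66, 30, 150, "HS")
```

Written this way, it is easy to see that weight gain appears in all three rules, while height and pre-pregnancy weight only refine the split for the first group of mothers.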
SHAP analysis
Figures15, 16, and 17 show the top 20 factors based on
SHAP values for the logistic regression, random for-
est, and XGBoost models, respectively. e SHAP sum-
mary plots revealed consistent patterns across all models
for predicting birth weight. Notably, weight gain during
Fig. 10 PDP for maternal ethnicity (Mother - white as reference)
Fig. 11 PDP for paternal ethnicity (Father - white as reference)
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 16 of 24
Dolaand Valderrama BMC Medical Informatics and Decision Making (2024) 24:367
pregnancy emerged as the most influential predictor,
with higher weight gain being strongly associated with
delivering an NBW. Additionally, all SHAP analyses high-
lighted the positive relationship between maternal height
(M_Ht_In), body mass index (BMI), pre-pregnancy
weight (PWgt_R), and the likelihood of having an NBW
newborn.
Parental factors, including ethnicity, age, and education, played a pivotal role in birth weight predictions. In terms of ethnicity, Black, Hispanic, and Asian fathers were more frequently related to LBW predictions, whereas White parents and Hispanic mothers tended to correlate more with NBW predictions. Regarding age, the SHAP analyses indicated that the older the parents were, the higher the chances of having an LBW newborn. Finally, mothers and fathers who had higher education levels, such as master’s and bachelor’s degrees, were found to have a higher likelihood of giving birth to NBW infants.

Fig. 12 PDP for maternal education (Mother - bachelor’s degree as reference)
Fig. 13 PDP for paternal education (Father - high school graduate or GED completed as reference)
Previous pregnancy history, particularly the number of living births (PRIORLIVE), was strongly associated with NBW predictions. Likewise, regular prenatal checkups (PREVIS_REC) were positively linked to NBW outcomes. Conversely, negative factors such as maternal smoking during pregnancy (CIG_0, CIG_1, CIG_2, and CIG_3) were associated with LBW predictions. Additionally, the sex of the newborn emerged as a significant factor, with male newborns (SEX_M) more likely to be predicted as NBW, while female newborns (SEX_F) were associated with higher rates of LBW.
Effect of maternal height, ethnicity and birth weight
To further explore the strong association between birth weight outcomes, maternal height, and ethnicity indicated by the predictive models, we conducted a descriptive analysis comparing birth weights ranging from 2200 to 2550 g against newborn well-being, based on the APGAR 5 score, and average maternal height (Fig. 18). For birth weights near the WHO’s LBW cutoff of 2500 g, White and Black newborns exhibited higher rates of abnormal APGAR 5 scores (APGAR 5 < 6) compared to their Asian and Hispanic counterparts. Notably, within this birth weight range, White and Black mothers were, on average, taller than Asian and Hispanic mothers.
This pattern suggests that the WHO’s LBW cutoff of 2500 g may represent a greater risk for offspring of ethnic groups with taller average maternal heights, such as White and Black mothers, compared to infants born to shorter mothers, such as Asian or Hispanic mothers.
Discussion
Our findings indicate that there are critical parental factors that are strongly associated with birth weight outcomes in the US population. Across all the analyses, nutritional and maternal anthropometric factors, such as maternal height, weight gain during pregnancy, pre-pregnancy weight, and parental ethnicity, consistently emerged as critical determinants of newborn weight. These findings align with previous research, which also reports that nutritional status and maternal anthropometrics are significantly correlated with birth weight and length of the newborn [7, 48, 49].
The relationship between maternal height, weight gain during pregnancy, pre-pregnancy weight, and maternal ethnicity helps explain why some women are more likely to deliver LBW newborns. For example, women of shorter stature and lower body mass are at greater risk of delivering a baby weighing less than 2500 g. Similarly, women with a pre-pregnancy BMI below 24.9 are more likely to have an LBW newborn, as they are recommended to gain between 11 and 18 kg during pregnancy to achieve an NBW outcome [50], which can be a challenge for some.
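Because the natality variables discussed in this work are expressed in inches and pounds while the weight-gain recommendation above is given in kilograms, a short conversion helps to relate the two. The sketch below uses the standard imperial BMI formula (703 × weight in lb / height in in²) and evaluates it at the 63-inch and 131-lb split values mentioned for the conditional inference tree; pairing these two values is purely illustrative and the helper name is ours.

```python
LB_PER_KG = 2.20462


def bmi_from_imperial(weight_lb, height_in):
    """Body mass index from weight in pounds and height in inches."""
    return 703.0 * weight_lb / (height_in ** 2)


# Split values mentioned for the conditional inference tree (illustrative pairing).
bmi_at_splits = bmi_from_imperial(131, 63)  # about 23.2, below the 24.9 threshold
below_threshold = bmi_at_splits < 24.9

# Recommended gain of 11 to 18 kg, expressed in pounds.
recommended_gain_lb = (11 * LB_PER_KG, 18 * LB_PER_KG)  # about 24.3 to 39.7 lb
```

Under this conversion, the 28-lb weight-gain split reported for the tree falls inside the recommended range of roughly 24 to 40 lb, close to its lower end.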
Our findings also emphasize the importance of adopting healthy habits during pregnancy to improve birth weight outcomes. It is important to ensure that mothers have access to perinatal care and follow proper nutrition, which supports healthy weight gain, as these factors strongly contribute to the likelihood of delivering an NBW infant. Other habits, like smoking, should be avoided, as smoking is a strong determinant of LBW. Moreover, pregnancy history also needs to be considered, as mothers who have had several successful births are more likely to deliver an NBW newborn. Finally, parental age also matters, as both older mothers and older fathers are at an increased risk of having an LBW infant.

Fig. 14 Conditional Inference Tree for detecting NBW and LBW newborns. For maternal education, the following abbreviations were used: ‘8th’, for 8th grade or less; ‘9th’, for 9th through 12th grade with no diploma; ‘HS’, for High school graduate or GED completed; ‘SC’, for some college credit, but not a degree; ‘AD’, for Associate degree (AA, AS); ‘Bs’, for Bachelor’s degree (BA, AB, BS); ‘MS’, for Master’s degree (MA, MS, MEng, MEd, MSW, MBA); ‘PhD or PD’, for Doctorate (PhD, EdD) or Professional Degree (MD, DDS, DVM, LLB, JD)
One of the most intriguing relationships identified in our study is between maternal height, pre-pregnancy weight, weight gain during pregnancy, ethnicity, and birth weight (Fig. 14). Given that maternal anthropometric factors (height, weight, BMI) significantly influence birth weight [49], and that newborns from White parents have higher odds of having NBW (see Table 6), the WHO’s cut-off for defining LBW (2500 g) may be biased towards the Caucasian population. This bias arises because, except for Black parents, White parents have a higher average height than other ethnicities in the US [51–54]. This finding aligns with other studies that advocate for a review of the WHO’s global cut-off threshold for LBW [55], which was originally established due to the higher risk of mortality for newborns of European descent weighing less than 2500 g [9]. Therefore, birth weights less than 2500 g for non-White newborns do not necessarily indicate a high-risk condition (see Fig. 18). It is also essential to consider other factors, such as intrauterine growth restriction, maternal health history, and preterm birth [56, 57].
Fig. 15 Top 20 variables ranked by SHAP values for logistic regression
The large difference between Black and White birth weights seems more related to socioeconomic factors than anthropometrics, as the average heights for both groups are similar (163 cm for females and 178 cm for males [51]). In the US, Black communities have historically been concentrated in low-income areas due to social, economic, and cultural reasons. One contributing factor to this birth weight disparity is nutrition, as Black communities tend to have poorer diets with higher consumption of salt and sugar [58]. Since nutrition is crucial during pregnancy, the lower birth weights in Black newborns compared to their White counterparts may result from this nutritional dissimilarity. Moreover, other socioeconomic factors, such as education and income, play an important role in predicting newborn weight outcomes. Parents with a bachelor’s degree tend to have NBW newborns more often than those with lower education levels. More years of education can make parents more aware of nutrition and lifestyle choices. Moreover, pregnant women with higher levels of education are more likely to earn higher incomes [59], leading to less stressful pregnancies, better adherence to medical advice, and more regular prenatal checkups.

Fig. 16 Top 20 variables ranked by SHAP values for random forest

The identification of weight gain, maternal height, pre-pregnancy weight, and parental ethnicity as crucial factors influencing birth weight outcomes aligns with the findings of Morisaki et al. [7], who emphasized that anthropometric factors are the major factor explaining LBW disparities among ethnicities. However, our study enhances this perspective by indirectly incorporating paternal anthropometrics, noting that paternal ethnicity is correlated with paternal height [51]. Thus, our study provides a more comprehensive understanding of both maternal and paternal factors in predicting LBW outcomes, as paternal height also affects the newborn’s anthropometrics. Furthermore, we expand upon the work of Morisaki et al. [7] by showing that when average heights are comparable between ethnicities, such as White and Black parents in the US, disparities in birth weight outcomes are predominantly attributed to other factors, particularly access to adequate nutrition. This finding highlights the critical need to consider socioeconomic factors alongside anthropometric measures to fully comprehend LBW outcomes.

Fig. 17 Top 20 variables ranked by SHAP values for XGBoost
Strengths andlimitations
is is the first study, as far as we know, to use predictive
models to analyze various factors and identify the ones
most strongly linked to LBW in a nationwide US dataset.
Unlike prior studies, we also considered paternal factors
in our analysis, demonstrating how parental ethnicity,
age, and education level influence birth weight outcomes.
e generalization of our findings was evaluated on an
independent test set (see Table5), yielding an average
accuracy of approximately 64% and a macro ROC AUC
of nearly 70% for distinguishing between NBW and LBW
newborns. is evaluation metric suitably supports the
extension of our findings presented in this work. e lim-
itation for achieving a higher accuracy may be attributed
to the highly imbalanced dataset, with LBW cases con-
stituting only about 3% of the training data. Nonethe-
less, our primary objective was to identify critical factors
influencing birth weight outcomes rather than solely
maximizing accuracy. e comprehensive dataset, which
encompasses information from diverse populations
across all 50 US states, supports the findings presented
in this study.
We note that our analysis was confined to a single dataset collected in 2022. Our rationale was to identify the most relevant predictors using the most current data available from the CDC, thereby reflecting the contemporary situation in the US. This choice makes our study a cross-sectional analysis, which restricts our ability to examine evolving trends between birth weight and parental predictors, as longitudinal studies would. Moreover, although recent research suggests that the COVID-19 pandemic did not significantly impact the dynamics of prenatal care visits in the US in 2022 [60], we note that the pandemic may have affected access to perinatal care services for certain households. Future research could explore how the influence of the factors identified in this study has evolved over the past decade concerning birth weight outcomes in the US.

Fig. 18 Birth weight compared to (a) newborn well-being, represented by the percentage of abnormal Apgar 5 scores, and (b) average maternal height, categorized by ethnic group
We also recognize that the dataset used in this study lacks factors that may be relevant to determining birth weight outcomes. For instance, key features such as income [61] and paternal factors like height and weight [62] were not included, which could have offered additional insights into the socioeconomic and anthropometric influences on LBW. Future research should address these gaps by incorporating a broader range of datasets and variables to achieve a more comprehensive understanding of the determinants of LBW.
Finally, we note that our analysis identified factors influencing birth weight outcomes based on associations rather than causality. Although machine learning models can capture complex, nonlinear relationships among multiple predictors and the response variable, they do not establish cause-and-effect relationships. Therefore, our study does not imply causality. Instead, the machine learning models identified key anthropometric, ethnic, educational, and pregnancy-related factors that are commonly associated with parents of LBW newborns.
Conclusion
This study analyzed various factors to determine which ones are most strongly associated with the birth weight of newborns in the US. To achieve that aim, we used machine learning and deep learning models to create predictive models based on 20 factors, including maternal, paternal, socioeconomic, ethnicity, and neonatal factors. Our models showed that certain fixed factors, like maternal height and parents’ ethnicity, are significantly associated with birth weight. Taller and White parents are more likely to have NBW newborns. However, because White parents tend to be taller than parents from other ethnicities, this result should be interpreted with caution. Indeed, as reported by previous studies, the WHO’s cut-off for LBW may not be appropriate for non-White ethnicities. Additionally, our findings indicate that pregnancy-related factors, such as nutrition, smoking habits, and access to perinatal care, are crucial for birth weight. Our findings emphasize the importance of proper nutrition, avoiding smoking, and accessing prenatal care. This is especially crucial for vulnerable communities in the US, such as Black communities, which are significantly more associated with LBW newborns.
Acknowledgements
Not applicable.
Authors’ contributions
S.S.D. and C.E.V. designed the methodology of the study. Both implemented the code and analyzed the results. S.S.D. drafted the manuscript, and C.E.V. reviewed and edited. C.E.V. is the supervisor of S.S.D.
Funding
This project was unfunded.
Data availability
The study was conducted using a publicly available dataset provided by the Centers for Disease Control and Prevention (CDC). The data can be accessed at the following URL: https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm.
Declarations
Ethics approval and consent to participate
All experiments were performed according to relevant guidelines and regulations (such as the Declaration of Helsinki).
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Author details
1 Applied Computer Science Department, University of Winnipeg, 515 Portage
Avenue, Winnipeg R3B 2E9, MB, Canada. 2 Department of Community Health
Sciences, Cumming School of Medicine, University of Calgary, 3280 Hospital
Drive NW, Calgary T2N 4Z6, AB, Canada.
Received: 26 August 2024 Accepted: 25 November 2024
References
1. World Health Organization. UNICEF-WHO low birthweight estimates: levels and trends 2000–2015. World Health Organization; 2019. https://www.unicef.org/reports/UNICEF-WHO-low-birthweight-estimates-2019. Accessed 15 July 2024.
2. Mathewson KJ, Burack JA, Saigal S, Schmidt LA. Tiny Babies Grow Up: The Long-Term Effects of Extremely Low Birth Weight. In: Wazana A, Székely E, Oberlander TF, editors. Prenatal Stress and Child Development. Cham: Springer International Publishing; 2021. pp. 469–490. https://doi.org/10.1007/978-3-030-60159-1_16.
3. Paneth NS. The Problem of Low Birth Weight. Futur Child. 1995;5:19–34. http://www.jstor.org/stable/1602505.
4. Osterman MJK, Hamilton BE, Martin JA, Driscoll AK, Valenzuela CP. Births: Final data for 2022. Natl Vital Stat Rep. 2024;73. Retrieved from National Center for Health Statistics. https://www.cdc.gov/nchs/data/nvsr/nvsr73/nvsr73-02.pdf. Accessed 15 Aug 2024.
5. March of Dimes. Low Birthweight by Race: United States, 2020-2022 Average. 2024. https://www.marchofdimes.org/peristats/data?reg=99&top=4&stop=45&lev=1&slev=1&obj=1. Accessed 15 Aug 2024.
6. Wartko PD, Wong EY, Enquobahrie DA. Maternal birthplace is associated with low birth weight within racial/ethnic groups. Matern Child Health J. 2017;21:1358–66.
7. Morisaki N, Kawachi I, Oken E, Fujiwara T. Social and anthropometric factors explaining racial/ethnical differences in birth weight in the United States. Sci Rep. 2017;7(1):46657.
8. Arabzadeh H, Doosti-Irani A, Kamkari S, Farhadian M, Elyasi E, Mohammadi Y. The maternal factors associated with infant low birth weight: an umbrella review. BMC Pregnancy Childbirth. 2024;24(1):316.
9. McCormick MC. The contribution of low birth weight to infant mortality and childhood morbidity. N Engl J Med. 1985;312(2):82–90.
10. Sharma MK, Kumar D, Huria A, Gupta P. Maternal risk factors of low birth weight in Chandigarh India. Internet J Health. 2009;9:10–2.
11. Shi L, Macinko J, Starfield B, et al. Primary care, infant mortality, and
low birth weight in the states of the USA. J Epidemiol Commun Health.
2004;58:374–80.
12. Broere-Brown ZA, Baan E, Schalekamp-Timmermans S, Verburg BO, Jaddoe VW,
Steegers EA. Sex-specific differences in fetal and infant growth
patterns: a prospective population-based cohort study. Biol Sex Differ.
2016;7:1–9.
13. Bowleg L. When Black + lesbian + woman ≠ Black lesbian woman: The
methodological challenges of qualitative and quantitative intersectionality
research. Sex Roles. 2008;59:312–25.
14. Bowleg L. The problem with the phrase women and minorities:
intersectionality—an important theoretical framework for public health.
Am J Public Health. 2012;102(7):1267–73.
15. Bauer GR. Incorporating intersectionality theory into population health
research methodology: challenges and the potential to advance health
equity. Soc Sci Med. 2014;110:10–7.
16. Evans CR, Williams DR, Onnela JP, Subramanian S. A multilevel
approach to modeling health inequalities at the intersection of multiple
social identities. Soc Sci Med. 2018;203:64–73.
17. Evans CR. Adding interactions to models of intersectional health
inequalities: comparing multilevel and conventional methods. Soc Sci
Med. 2019;221:95–105.
18. Strobl C, Malley J, Tutz G. An introduction to recursive partitioning:
rationale, application, and characteristics of classification and
regression trees, bagging, and random forests. Psychol Methods.
2009;14(4):323.
19. Stevens LM, Mortazavi BJ, Deo RC, Curtis L, Kao DP. Recommendations
for reporting machine learning analyses in clinical research. Circ
Cardiovasc Qual Outcomes. 2020;13(10):e006556.
20. National Center for Health Statistics. User Guide to the 2022 Natality
Public Use File. 2022. National Center for Health Statistics website.
https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/DVS/natality/UserGuide2022.pdf.
Accessed 15 Aug 2024.
21. National Center for Health Statistics. Vital statistics online data portal:
Birth data files. https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm.
Accessed 21 Aug 2024.
22. Blumenshine PM, Egerter SA, Libet ML, Braveman PA. Father’s education:
an independent marker of risk for preterm birth. Matern Child
Health J. 2011;15:60–7.
23. Mao Y, Zhang C, Wang Y, Meng Y, Chen L, Dennis CL, et al. Association
between paternal age and birth weight in preterm and full-term birth:
a retrospective study. Front Endocrinol. 2021;12:706369.
24. Casadei K, Kiel J. Anthropometric Measurement. 2024. Updated 2022
Sep 26. StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing.
https://www.ncbi.nlm.nih.gov/books/NBK537315/. Accessed 15 Aug
2024.
25. Wallace J, Horgan G, Bhattacharya S. Placental weight and efficiency
in relation to maternal body mass index and the risk of pregnancy
complications in women delivering singleton babies. Placenta.
2012;33(8):611–8.
26. Pölzlberger E, Hartmann B, Hafner E, Stümpflein I, Kirchengast S. Maternal
height and pre-pregnancy weight status are associated with fetal growth
patterns and newborn size. J Biosoc Sci. 2017;49(3):392–407.
27. Koo YJ, Ryu HM, Yang JH, Lim JH, Lee JE, Kim MY, et al. Pregnancy outcomes
according to increasing maternal age. Taiwan J Obstet Gynecol.
2012;51(1):60–5.
28. Reichman NE, Teitler JO. Paternal age as a risk factor for low birthweight.
Am J Public Health. 2006;96(5):862–6.
29. Wang X, Zuckerman B, Pearson C, Kaufman G, Chen C, Wang G, et al.
Maternal cigarette smoking, metabolic gene polymorphism, and infant
birth weight. JAMA. 2002;287(2):195–202.
30. Garces A, Perez W, Harrison MS, Hwang KS, Nolen TL, Goldenberg RL, et al.
Association of parity with birthweight and neonatal death in five sites:
The Global Network’s Maternal Newborn Health Registry study. Reprod
Health. 2020;17:1–7.
31. Momeni M, Danaei M, Kermani AJ, Bakhshandeh M, Foroodnia S,
Mahmoudabadi Z, Amirzadeh R, Safizadeh H. Prevalence and Risk Factors
of Low Birth Weight in the Southeast of Iran. Int J Prev Med. 2017;8(1):12.
https://doi.org/10.4103/ijpvm.IJPVM_112_16.
32. Gebrehawerya T, Gebreslasie K, Admasu E, Gebremedhin M. Determinants
of low birth weight among mothers who gave birth in
Debremarkos referral hospital, Debremarkos town, east Gojam, Amhara
region, Ethiopia. Neonat Pediatr Med. 2018;4(1):145.
33. Manniello RL, Farrell PM. Analysis of United States neonatal mortality
statistics from 1968 to 1974, with specific reference to changing trends in
major causalities. Am J Obstet Gynecol. 1977;129(6):667–74.
34. Quenby S, Gallos ID, Dhillon-Smith RK, Podesek M, Stephenson MD,
Fisher J, et al. Miscarriage matters: the epidemiological, physical,
psychological, and economic costs of early pregnancy loss. Lancet.
2021;397(10285):1658–67.
35. Gerber-Epstein P, Leichtentritt RD, Benyamini Y. The experience of miscarriage
in first pregnancy: the women’s voices. Death Stud. 2008;33(1):1–29.
36. Alexander GR, Kotelchuck M. Quantifying the adequacy of prenatal care:
a comparison of indices. Public Health Rep. 1996;111(5):408.
37. Conley D, Bennett NG. Race and the inheritance of low birth weight. Soc
Biol. 2000;47(1–2):77–93.
38. Zephyrin L, Seervai S, Lewis C, Katon JG. Community-Based Models
to Improve Maternal Health Outcomes and Promote Health Equity. The
Commonwealth Fund; 2021.
https://www.commonwealthfund.org/publications/issue-briefs/2021/mar/community-models-improve-maternal-outcomes-equity.
Accessed 15 Aug 2024.
39. Lebron CN, Mitsdarffer M, Parra A, Chavez JV, Behar-Zusman V. Latinas
and Maternal and Child Health: Research, Policy, and Representation.
Matern Child Health J. 2023. https://doi.org/10.1007/s10995-023-03662-z.
40. Spinillo A, Capuzzo E, Piazzi G, Baltaro F, Stronati M, Ometto A. Significance
of low birthweight for gestational age among very preterm infants.
BJOG: Int J Obstet Gynaecol. 1997;104(6):668–73.
41. Armstrong B, Nolin A, McDonald A. Work in pregnancy and birth weight
for gestational age. Occup Environ Med. 1989;46(3):196–9.
42. Velaphi S, Mokhachane M, Mphahlele R, Beckh-Arnold E, Kuwanda
M, Cooper P. Survival of very-low-birth-weight infants according to
birth weight and gestational age in a public hospital. S Afr Med J.
2005;95(7):504–9.
43. Tsai LY, Chen YL, Tsou KI, Mu SC, Group TPIDCS, et al. The impact of small-
for-gestational-age on neonatal outcome among very-low-birth-weight
infants. Pediatr Neonatol. 2015;56(2):101–7.
44. Hothorn T, Hornik K, Zeileis A. Unbiased recursive partitioning: a conditional
inference framework. J Comput Graph Stat. 2006;15(3):651–74.
45. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gómez AN, Kaiser Ł,
Polosukhin I. Attention is all you need. Advances in Neural Information
Processing Systems: Proceedings of the 31st International Conference on
Neural Information Processing Systems (NIPS 2017). Long Beach, CA,
USA; 2017.
46. Molnar C. 8.1 Partial Dependence Plot (PDP) | Interpretable Machine
Learning. 2024. https://christophm.github.io/interpretable-ml-book/pdp.html.
Accessed 15 Aug 2024.
47. Fawcett T. An introduction to ROC analysis. Pattern Recogn Lett.
2006;27(8):861–74.
48. Patra S, Sarangi G. Association between maternal anthropometry and
birth outcome. J Pediatr Assoc India. 2017;6(2):85–94.
49. Devaki G, Shobha R. Maternal anthropometry and low birth weight: a
review. Biomed Pharmacol J. 2018;11(2):815–20.
50. Cunningham FG, Leveno KJ, Bloom SL, Spong CY, Dashe JS, Hoffman BL,
et al. Williams Obstetrics. 24th ed. New York: McGraw-Hill; 2014.
51. Komlos J, Baur M. From the tallest to (one of) the fattest: the enigmatic
fate of the American population in the 20th century. Econ Hum Biol.
2004;2(1):57–74.
52. Denney JT, Krueger PM, Rogers RG, Boardman JD. Race/ethnic and sex
differentials in body mass among US adults. Ethn Dis. 2004;14(3):389–98.
53. Silva AM, Shen W, Heo M, Gallagher D, Wang Z, Sardinha LB, et al. Ethnicity-related
skeletal muscle differences across the lifespan. Am J Hum Biol:
Off J Hum Biol Assoc. 2010;22(1):76–82.
54. Yin L, Annett-Hitchcock K. Comparison of body measurements
between Chinese and U.S. females. The Journal of The Textile Institute.
2019;110(12):1716–24. https://doi.org/10.1080/00405000.2019.1617531.
55. Lucas M. Low birth weight–the less than 2500g cut-off: is it applicable to
Sri Lanka? Sri Lanka J Perinat Med. 2023;4(2):6-17.
https://doi.org/10.4038/sljpm.v4i2.70.
56. Valderrama CE, Ketabi N, Marzbanrad F, Rohloff P, Clifford GD. A review
of fetal cardiac monitoring, with a focus on low-and middle-income
countries. Physiol Meas. 2020;41(11):11TR01.
57. Valderrama CE, Marzbanrad F, Hall-Clifford R, Rohloff P, Clifford GD. A
proxy for detecting IUGR based on gestational age estimation in a Guatemalan
rural population. Front Artif Intell. 2020;3:56.
58. Stephenson BJK, Willett WC. Racial and ethnic heterogeneity in diets of
low-income adult females in the United States: results from National
Health and Nutrition Examination Surveys from 2011 to 2018. Am J Clin
Nutr. 2023;117(3):625–34.
59. Barrett H, Browne A. Health, hygiene and maternal education: Evidence
from The Gambia. Soc Sci Med. 1996;43(11):1579–90.
60. Osterman MJ, Hamilton BE, Martin JA, Driscoll AK, Valenzuela CP. Births:
Final Data for 2022. Natl Vital Stat Rep: Cent Dis Control Prev Natl Cent
Health Stat Natl Vital Stat Syst. 2024;73(2):1–56.
61. Aregay M, Lawson AB, Faes C, Kirby RS, Carroll R, Watjou K. Impact of
Income on Small Area Low Birth Weight Incidence Using Multiscale
Models. AIMS Public Health. 2015;2:667–680. Epub 2015 Oct 10. PMID:
27398390; PMCID: PMC4936536. https://doi.org/10.3934/publichealth.2015.4.667.
62. Griffiths LJ, Dezateux C, Cole TJ, et al. Differential parental weight and
height contributions to offspring birthweight and weight gain in infancy.
Int J Epidemiol. 2007;36(1):104–7. https://doi.org/10.1093/ije/dyl210.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published
maps and institutional affiliations.
... The predictive models were created with the XGBoost algorithm, which consists of an iterative process combining multiple decision trees, with each subsequent tree trained to minimize the residual errors of the preceding ensemble model [17]. It has shown strong performance in medicine [18], finance [19], and intelligence systems [20]. Shapley Additive Explanations (SHAP) [21,22] were further employed to elucidate the predictive mechanisms of XGBoost by systematically quantifying the contributions of risk or protective factors to model predictions [23] and producing diagrams for visualization. ...
... Second, the study was a cross-sectional analysis, limiting the capacity to perform longitudinal analyses that explore the changing trends between birth weight and parental predictors. We recognize that the dataset used in this research lacked some socioeconomic [41] and paternal factors [21], which could be relevant to assessing birth weight outcomes. To derive more robust and reliable conclusions, it is imperative to conduct multi-center studies across diverse populations, incorporating a broader range of relevant factors. ...
Background: Low birth weight (LBW), defined as a newborn weighing less than 2500 grams, is an increasingly significant public health concern. Exploring the risk and protective factors for LBW is becoming increasingly important. This study aimed to utilize predictive models to identify the most critical factors associated with LBW in singleton pregnancies. Methods: A retrospective cohort study was conducted at the Binzhou Medical University Hospital, China, from 2022 to 2023. Singleton pregnancies with gestational age exceeding 27 weeks were included, while multiple pregnancies and fetal anomalies were excluded. Logistic regression (LR) and extreme gradient boosting (XGBoost) algorithms were employed to predict LBW and normal birth weight (NBW) outcomes. The LR model was interpreted using odds ratio analysis and nomograms, whereas the XGBoost model was elucidated through Shapley Additive Explanations (SHAP) values to determine the factors most strongly associated with LBW. Results: In this cohort of 10,227 deliveries, 237 cases were classified as LBW. The XGBoost model demonstrated superior performance in predicting LBW, achieving an AUROC of 0.797. Both the LR and XGBoost models identified maternal age, gestational age, BMI, hypertensive disorders of pregnancy (HDP), and fetal distress as the critical factors associated with LBW. Additionally, a follow-up study of LBW identified that LBW infants encounter significant health challenges, including a high rate of hospitalization and complex neonatal complications, including congenital anomalies, NRDS, and neonatal hyperbilirubinemia. Conclusion: This study demonstrated that the XGBoost model showed promising predictive accuracy for LBW deliveries. Pregnant women with a gestational age of less than 37 weeks, gestational BMI below 18 kg/m², maternal age younger than 25 years, or maternal comorbidities such as HDP or fetal distress are at an increased risk of delivering LBW infants. 
These findings highlight potential contributors to LBW disparities in China and underscore the utility of ML in maternal health research.
... Furthermore, Dola et al. used the 2022 Centers for Disease Control and Prevention's National Natality Dataset to predict LBW based on several factors, including anthropometric, socioeconomic, and demographic data from parents. They concluded that the threshold for LBW in Asians and Hispanics may be lower than the WHO definition (<2500 g) [23]. race and ethnicity did not influence fetal growth. ...
... Furthermore, Dola et al. used the 2022 Centers for Disease Control and Prevention's National Natality Dataset to predict LBW based on several factors, including anthropometric, socioeconomic, and demographic data from parents. They concluded that the threshold for LBW in Asians and Hispanics may be lower than the WHO definition (<2500 g) [23]. and 3000 g in Denmark in 2007. ...
Low birth weight (LBW) is a significant concern not only because of its association with perinatal outcomes, but also because of its long-term impact on future health. Despite the physical differences among individuals of different ethnicities, the definition of LBW remains the same for all ethnicities. This study aimed to explore and discuss this issue. We compiled national data from several countries and found that maternal height was negatively correlated with LBW incidence. We discovered the INTERGROWTH-21st chart may not be suitable for the Japanese population, as the Japanese birth weight chart differs from the INTERGROWTH-21st chart. Researchers have reported different LBW cutoff values used to assess adverse perinatal outcomes for different countries. However, there is currently no definition of LBW independent of the mother’s country of origin that can be used for predicting the risk of adverse health outcomes. Therefore, the current era of personalized healthcare may be the perfect time to establish a standard definition of LBW which is independent of the mother’s country of origin. Considering the future of healthcare, it seems an apt time to discuss the development of a more meaningful definition of LBW that can be applied across ethnicities. Further research is needed to investigate the cutoff values of LBW in every ethnicity.
Preterm birth and low birth weight remain major contributors to neonatal morbidity and mortality, yet the underlying mechanisms are not fully understood. Maternal microbiota has been implicated in adverse pregnancy outcomes, but key mediators remain unidentified. We previously showed that the microbiota-derived peptide corisin induces epithelial apoptosis via mitochondrial membrane depolarization and reactive oxygen species accumulation. In this retrospective preliminary study, we evaluated the association between maternal serum corisin levels and pregnancy outcomes in 84 eligible women. Among them, 10 experienced preterm birth, and 22 delivered low-birth-weight infants. Corisin levels were significantly elevated in these groups compared with women with full-term, normal-weight deliveries. Preterm birth was associated with increased tissue factor, while low birth weight correlated with higher thrombin–antithrombin complex and soluble thrombomodulin and lower fibrinogen levels. Corisin concentrations showed negative correlations with maternal BMI, birth weight and length, and estimated fetal weight. Positive correlations were observed between corisin, myeloperoxidase, and several coagulation markers. These preliminary findings suggest that elevated maternal corisin levels are associated with adverse pregnancy outcomes and may reflect underlying mechanisms involving oxidative stress and coagulation activation. Further investigation is warranted to clarify its potential role as a microbiota-derived biomarker in pregnancy complications.
Background In this umbrella review, we systematically evaluated the evidence from meta-analyses and systematic reviews of maternal factors associated with low birth weight. Methods PubMed, Scopus, and Web of Science were searched to identify all relevant published studies up to August 2023. We included all meta-analysis studies (based on cohort, case-control, cross-sectional studies) that examined the association between maternal factors (15 risk factors) and risk of LBW, regardless of publication date. A random-effects meta-analysis was conducted to estimate the summary effect size along with the 95% confidence interval (CI), 95% prediction interval, and heterogeneity (I²) in all meta-analyses. Hedges’ g was used as the effect size metric. The effects of small studies and excess significance biases were assessed using funnel plots and the Egger’s test, respectively. The methodological quality of the included studies was assessed using the AMSTAR 2 tool. Results We included 13 systematic reviews with 15 meta-analyses in our study based on the inclusion criteria. The following 13 maternal factors were identified as risk factors for low birth weight: crack/cocaine (odds ratio [OR] 2.82, 95% confidence interval [CI] 2.26–3.52), infertility (OR 1.34, 95% CI 1.2–1.48), smoking (OR 2.00, 95% CI 1.76–2.28), periodontal disease (OR 2.41, 95% CI 1.67–3.47), depression (OR 1.84, 95% CI 1.34–2.53), anemia (OR 1.32, 95% CI 1.13–1.55), caffeine/coffee (OR 1.34, 95% CI 1.14–1.57), heavy physical workload (OR 1.87, 95% CI 1.00–3.47), lifting ≥ 11 kg (OR 1.59, 95% CI 1.02–2.48), underweight (OR 1.79, 95% CI 1.20–2.67), alcohol (OR 1.23, 95% CI 1.04–1.46), hypertension (OR 3.90, 95% CI 2.73–5.58), and hypothyroidism (OR 1.40, 95% CI 1.01–1.94). A significant negative association was also reported between antenatal care and low birth weight. 
Conclusions This umbrella review identified drug use (such as crack/cocaine), infertility, smoking, periodontal disease, depression, caffeine and anemia as risk factors for low birth weight in pregnant women. These findings suggest that pregnant women can reduce the risk of low birth weight by maintaining good oral health, eating a healthy diet, managing stress and mental health, and avoiding smoking and drug use.
Over the last 50 years, the Latino population in the US has grown and changed. Latinos are the nation’s largest minority group and among this group, there is incredible diversity. Much of Latino health research and outcomes have been treated interchangeably with immigrant health, but as the US Latino population evolves so should the focus of Latino health research. We contend that as maternal and child health (MCH) outcomes are an utmost important indicator of a country’s health, and as Latinos make up 18% of the US’s population, it is imperative that we move past dated research frameworks to a more nuanced understanding of the health of Latina women and children. We summarize how acculturation has been used to describe differences in MCH outcomes, discuss how the umbrella term “Latino” masks subgroups differences, explore Afro-Latinidad in MCH, examine the effects of the sociopolitical climate on the health of families, and demonstrate the limited representation of Latinos in MCH research. We conclude that a deeper understanding of Latino health is necessary to achieve health equity for Latina women and their children.
There is limited evidence regarding the utility of fetal monitoring during pregnancy, particularly during labor and delivery. Developed countries rely on consensus 'best practices' of obstetrics and gynecology professional societies to guide their protocols and policies. Protocols are often driven by the desire to be as safe as possible and avoid litigation, regardless of the cost of downstream treatment. In high-resource settings, there may be a justification for this approach. In low-resource settings, in particular, interventions can be costly and lead to adverse outcomes in subsequent pregnancies. Therefore, it is essential to consider the evidence and cost of different fetal monitoring approaches, particularly in the context of treatment and care in low-to-middle income countries. This article reviews the standard methods used for fetal monitoring, with particular emphasis on fetal cardiac assessment, which is a reliable indicator of fetal well-being. An overview of fetal monitoring practices in low-to-middle income countries, including perinatal care access challenges, is also presented. Finally, an overview of how mobile technology may help reduce barriers to perinatal care access in low-resource settings is provided.
Purpose While it is well documented that maternal adverse exposures contribute to a series of defects in offspring health according to the Developmental Origins of Health and Disease (DOHaD) theory, paternal evidence is still insufficient. Advanced paternal age is associated with multiple metabolic and psychiatric disorders. Birth weight is the most direct marker to evaluate fetal growth. Therefore, we designed this study to explore the association between paternal age and birth weight among infants born at term and preterm (<37 weeks gestation). Methods A large retrospective study was conducted using population-based hospital data from January 2015 to December 2019 that included 69,964 cases of singleton infant births with complete paternal age data. The primary outcome was infant birth weight stratified by sex and gestational age including small for gestational age (SGA, 10th percentile) and large for gestational age (LGA, 90th percentile). Birth weight percentiles by gestational age were based on those published in the INTERGROWTH-21st neonatal weight-for-gestational-age standard. Logistic regression analysis and linear regression model were used to estimate the association between paternal age and infant birth weight. Results Advanced paternal age was associated with a higher risk for a preterm birth [35–44 years: adjusted odds ratio (OR) = 1.13, 95%CI (1.03 to 1.24); >44 years: OR = 1.36, 95%CI (1.09 to 1.70)]. Paternal age exerted an opposite effect on birth weight with an increased risk of SGA among preterm infants [35–44 years: OR = 1.85, 95%CI (1.18 to 2.89)] and a decreased risk among term infants [35–44 years: OR = 0.81, 95%CI (0.68 to 0.98); >44 years: OR = 0.50, 95%CI (0.26 to 0.94)]. 
U-shaped associations were found in that LGA risk among term infants was higher in both younger (<25 years) (OR = 1.32; 95%CI, 1.07 to 1.62) and older (35–44 years) (OR = 1.07; 95% CI, 1.01 to 1.14) fathers in comparison to those who were 25 to 34 years old at the time of delivery. Conclusions Our study found advanced paternal age increased the risk of SGA among preterm infants and for LGA among term infants. These findings likely reflect a pathophysiology etiology and have important preconception care implications and suggest the need for antenatal monitoring.
Objectives: This report presents 2022 data on U.S. births by selected characteristics. Trends in fertility patterns and maternal and infant characteristics are described. Methods: Descriptive tabulations based on birth certificates of the 3.67 million births registered in 2022 are shown by maternal age, live-birth order, race and Hispanic origin, marital status, tobacco use, prenatal care, source of payment for the delivery, method of delivery, gestational age, birthweight, and plurality. Selected data by mother's state of residence and birth rates also are shown. Trends for 2010 to 2022 are presented for selected items, and by race and Hispanic origin for 2016-2022. Results: A total of 3,667,758 births occurred in the United States in 2022, essentially unchanged from 2021. The general fertility rate declined 1% from 2021 to 56.0 births per 1,000 females ages 15-44 in 2022. The birth rate for females ages 15-19 declined 2% from 2021 to 2022; birth rates fell 7% for women ages 20-24, rose 1% to 5% for women ages 25-29 and 35-44, and rose 12% for women ages 45-49 (the first increase since 2016). The total fertility rate declined less than 1% to 1,656.5 births per 1,000 women in 2022. Birth rates declined for unmarried women but increased for married women from 2021 to 2022. Prenatal care beginning in the first trimester declined to 77.0% in 2022; the percentage of women who smoked during pregnancy declined to 3.7%. The cesarean delivery rate was unchanged in 2022 (32.1%); Medicaid was the source of payment for 41.3% of births. The preterm birth rate declined 1% to 10.38%; the low birthweight rate rose 1% to 8.60%. The twin birth rate was unchanged in 2022 (31.2 per 1,000 births); the 2% decrease in the triplet and higher-order multiple birth rate.
Background: Poor diet is a major risk factor of cardiovascular and chronic diseases, particularly for low-income female adults. However, the pathways by which race and ethnicity plays a role in this risk factor have not been fully explored. Objectives: This observational study aimed to identify dietary consumption differences by race and ethnicity of US female adults living at or below the 130% poverty income level from 2011 to 2018. Methods: A total of 2917 adult females aged 20 to 80 years from the National Health and Nutrition Examination Survey (2011-2018) living at or below the 130% poverty income level with at least one complete 24-hour dietary recall were classified into 5 self-identified racial and ethnic subgroups (Mexican, other Hispanic, non-Hispanic [NH]-White, NH-Black, and NH-Asian). Dietary consumption patterns were defined by 28 major food groups summarized from the Food Pattern Equivalents Database and derived via a robust profile clustering model, which identifies foods that share consumption patterns across all low-income female adults and foods that differ in consumption patterns based on the racial and ethnic subgroups. Results: All food consumption patterns were identified at the local level, defined by racial and ethnic subgroups. Legumes and cured meats were the most differentiating foods identified across all racial and ethnic subgroups. Higher consumption levels of legumes were observed among Mexican-American and other Hispanic females. Higher consumption levels of cured meat were observed among NH-White and Black females. NH-Asian females had the most uniquely characterized patterns with a higher consumption of prudent foods (fruits, vegetables, and whole grains). Conclusions: Differences among the consumption behaviors of low-income female adults were found along racial and ethnic lines. 
Efforts to improve the nutritional health of low-income female adults should consider racial and ethnic differences in diets to appropriately focus interventions.
Article
Miscarriage is generally defined as the loss of a pregnancy before viability. An estimated 23 million miscarriages occur every year worldwide, translating to 44 pregnancy losses each minute. The pooled risk of miscarriage is 15·3% (95% CI 12·5-18·7%) of all recognised pregnancies. The population prevalence of women who have had one miscarriage is 10·8% (10·3-11·4%), two miscarriages is 1·9% (1·8-2·1%), and three or more miscarriages is 0·7% (0·5-0·8%). Risk factors for miscarriage include very young or older female age (younger than 20 years and older than 35 years), older male age (older than 40 years), very low or very high body-mass index, Black ethnicity, previous miscarriages, smoking, alcohol, stress, working night shifts, air pollution, and exposure to pesticides. The consequences of miscarriage are both physical, such as bleeding or infection, and psychological. Psychological consequences include increases in the risk of anxiety, depression, post-traumatic stress disorder, and suicide. Miscarriage, and especially recurrent miscarriage, is also a sentinel risk marker for obstetric complications, including preterm birth, fetal growth restriction, placental abruption, and stillbirth in future pregnancies, and a predictor of longer-term health problems, such as cardiovascular disease and venous thromboembolism. The costs of miscarriage affect individuals, health-care systems, and society. The short-term national economic cost of miscarriage is estimated to be £471 million per year in the UK. As recurrent miscarriage is a sentinel marker for various obstetric risks in future pregnancies, women should receive care in preconception and obstetric clinics specialising in patients at high risk. As psychological morbidity is common after pregnancy loss, effective screening instruments and treatment options for mental health consequences of miscarriage need to be available. 
We recommend that miscarriage data are gathered and reported to facilitate comparison of rates among countries, to accelerate research, and to improve patient care and policy development.
Chapter
Human fetal regulatory systems are highly sensitive to maternal stress, toxins, nutritional deprivation, maternal infection, and inflammation. The developmental origins of health and disease (DOHaD) hypothesis posits that fetal adaptation to stressful intrauterine conditions may alter organ development and function, increasing the risk for disease later in life. Adverse intrauterine conditions are reflected in shortened gestation, smaller body size at birth, and, in some cases, fetal growth restriction as well. Extremely preterm birth is a complex, multi-faceted risk factor that may warrant significant management across the lifespan. As adults, survivors continue to face significant physiological and psychological stresses that affect their health, education, mobility, work, and social functioning. In the present chapter, we compare long-term functional outcomes in adult survivors of extremely low birth weight (ELBW; ≤1000 g) and adults born at normal birth weight (NBW; ≥2500 g) in the McMaster ELBW Cohort, the oldest known prospectively followed cohort of ELBW survivors, born from 1977 to 1982. While some consequences of these early stresses are not subject to mitigation, strategies exist to manage their long-term effects. We examine naturalistic and structural models of acquired resilience that may reduce the impact of stressors associated with extremely preterm birth.