Stef van Buuren’s research while affiliated with Utrecht University and other places


Publications (271)


Impact of maternal prepregnancy body mass index on neonatal outcomes following extremely preterm birth
  • Article

February 2025 · 19 Reads · Obesity

Charlotte Girard · Jennifer Zeitlin · [...]

Objective: Extremes of prepregnancy maternal BMI increase neonatal mortality and morbidity at term. They also increase the risk of extremely preterm (EP, i.e., <27 weeks' gestational age) births. However, the association between maternal BMI and outcomes for EP babies is poorly understood. Methods: We used a cross-country design, bringing together the following three population-based, prospective, national EP birth cohorts: EXPRESS (Sweden, 2004–2007); EPICure 2 (UK, 2006); and EPIPAGE 2 (France, 2011). We included all singleton births at 22 to 26 weeks' gestational age with a live fetus at maternal hospital admission. Our exposure was maternal prepregnancy BMI, i.e., underweight, reference, overweight, or obesity. Odds ratios (OR) for survival without severe neonatal morbidity to hospital discharge according to maternal BMI were calculated using logistic regression. Results: A total of 1396 babies were born to mothers in the reference group, 140 to those with underweight, 719 to those with overweight, 556 to those with obesity, and 445 to those with missing BMI information. There was no difference in survival without major neonatal morbidity (reference, 22%; underweight, 26%, OR, 1.31, 95% CI: 0.82–2.08; overweight, 23%, OR, 1.00, 95% CI: 0.77–1.29; obesity, 19%, OR, 0.94, 95% CI: 0.70–1.25). Conclusions: No associations were seen between maternal BMI and outcomes for EP babies.


Figure 1. Schematic of a roadmap for conducting a NARFCS sensitivity analysis
Summary of variables relevant to the motivating example (n=4882)
Sensitivity analysis for multivariable missing data using multiple imputation: a tutorial
  • Preprint
  • File available

February 2025 · 3 Reads

Multiple imputation is a popular method for handling missing data, with fully conditional specification (FCS) being one of the predominant imputation approaches for multivariable missingness. Unbiased estimation with standard implementations of multiple imputation depends on assumptions concerning the missingness mechanism (e.g. that data are "missing at random"). The plausibility of these assumptions can only be assessed using subject-matter knowledge, and not data alone. It is therefore important to perform sensitivity analyses to explore the robustness of results to violations of these assumptions (e.g. if the data are in fact "missing not at random"). In this tutorial, we provide a roadmap for conducting sensitivity analysis using the Not at Random Fully Conditional Specification (NARFCS) procedure for multivariate imputation. Using a case study from the Longitudinal Study of Australian Children, we work through the steps involved, from assessing the need to perform the sensitivity analysis, and specifying the NARFCS models and sensitivity parameters, through to implementing NARFCS using FCS procedures in R and Stata.
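The idea of a sensitivity parameter can be illustrated with a much simpler setting than the one the tutorial covers. The following is a minimal sketch, not the NARFCS procedure itself and not the R or Stata implementation the abstract refers to: it imputes a single incomplete variable from a linear regression on one covariate, shifts every imputed value by a fixed offset `delta`, and pools the estimated mean with Rubin's rules. The function names, the use of `None` for missing values, and the choice to keep the regression parameters fixed across imputations (rather than drawing them, as a proper imputation method would) are all assumptions made for illustration.

```python
import random
import statistics


def rubin_pool(estimates, variances):
    """Pool m point estimates and their variances with Rubin's rules."""
    m = len(estimates)
    q_bar = statistics.fmean(estimates)        # pooled point estimate
    within = statistics.fmean(variances)       # mean within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance
    total = within + (1 + 1 / m) * between     # total variance
    return q_bar, total


def delta_adjusted_mean(x, y, delta, m=20, seed=0):
    """Pooled mean of y when missing entries (None) are imputed from a
    regression on x and then shifted by the sensitivity parameter delta.

    Simplification (assumption): regression parameters are estimated once
    and not redrawn per imputation.
    """
    rng = random.Random(seed)
    obs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    mean_x = statistics.fmean(xi for xi, _ in obs)
    mean_y = statistics.fmean(yi for _, yi in obs)
    sxx = sum((xi - mean_x) ** 2 for xi, _ in obs)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in obs) / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in obs)
    sigma = (rss / (len(obs) - 2)) ** 0.5      # residual standard deviation

    estimates, variances = [], []
    for _ in range(m):
        completed = [
            yi if yi is not None
            else intercept + slope * xi + rng.gauss(0, sigma) + delta
            for xi, yi in zip(x, y)
        ]
        estimates.append(statistics.fmean(completed))
        variances.append(statistics.variance(completed) / len(completed))
    return rubin_pool(estimates, variances)
```

Setting `delta = 0` corresponds to imputing under "missing at random"; rerunning with a range of non-zero `delta` values shows how far the pooled estimate moves as the assumed departure from that mechanism grows.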


Enhancing comparability in early child development assessment with the D-score

November 2024 · 29 Reads · International Journal of Behavioral Development

The lack of a valid and interpretable score to track early child development over time is a primary reason for neglecting child development in policymaking. Many instruments exist, but there is no accepted method for comparing their scores across different ages, samples, and instruments. This paper aims (1) to enhance the Development Score (D-score), a unidimensional scale for early child development, to compare measurements across ages, samples, and instruments, (2) to develop a conversion key that enables the transformation of measurements obtained from existing instruments into a D-score, and (3) to investigate two new measures designed to optimize the quantification of the D-score. Study 1 gathered data from 51 sources in 32 countries among 66,075 children using 18 instruments with 2,211 items. Subject matter experts used the output of the Study-1 true score equating model to create the Global Scales for Early Development Short Form (GSED SF) and Long Form (GSED LF). Study 2 collected additional data on the GSED LF and GSED SF in three countries among 4,374 children. The Study-2 model enables the conversion of measurements into a D-score for 20 different instruments. We propose the D-score as a unifying evaluation unit to reduce fragmentation, simplify measurement, and enhance comparability.


Example of three instruments linked by common items (i.e. equate clusters)
Illustration of the misalignment parameter: the left panel illustrates a large misalignment between two instruments (γ = 4) and the right panel a small misalignment (γ = 0.05)
Difficulty estimates from model without active equate clusters (ρ = 0.85; γ = 1.94) (left) and model with active equate clusters (ρ = 0.99; γ = -0.03) (right) where difficulty ranges are close, cohort abilities differ and five equate clusters are spread through both instruments. Difficulty estimates are coloured by instrument
Latent ability in logits for age for the three studies. The left panel results from the model without equate clusters and the right panel results from the model with equate clusters
Percentage pass for ability in the data for the equate clusters
Harmonizing measurements: establishing a common metric via shared items across instruments

November 2024 · 22 Reads · 1 Citation · Population Health Metrics

Background: The proliferation of instruments that define instrument-specific metrics impedes progress in comparative assessment across populations. This paper explores a method to extract a common metric from related but different instruments and transform the original measurements into scores with a standard unit of measurement. Methods: Existing data from four assessment instruments of child development, collected from three different samples of children, were used to create “equate clusters” of items that measure the same behaviour in (slightly) different ways. A probability model was formulated to identify best items and groups to serve as anchors linking the instruments, assuming that items in an anchoring or “active” equate cluster are psychometrically equivalent. Quantification and inspection of item characteristic curves were used to resolve which equate clusters should be active. We simulated the impact of various analytic choices. Results: Simulation confirmed the feasibility of creating a common metric from data collected with different instruments from respondent samples with different abilities. The method performed as expected in an application in early childhood development. Conclusions: The use of equate clusters is an intuitive and flexible way to establish a common metric across instruments and facilitates the transformation of measurements obtained to a standardized scale. Standardizing instrument scores to a common metric allows for population-level comparisons on a global scale.


Evaluating the median p-value method for assessing the statistical significance of tests when using multiple imputation

October 2024 · 12 Reads


Stability of neurodevelopmental trajectories in moderately late and early preterm children born 15 years apart

April 2024 · 8 Reads · Pediatric Research

Background: Neurodevelopmental trajectories of preterm children may have changed due to changes in care and in society. We aimed to compare neurodevelopmental trajectories in early and moderately late preterm children, measured using the Developmental (D)-score, in two cohorts born 15 years apart. Methods: We included early preterm and moderately late preterm children from two Dutch cohorts (LOLLIPOP, 2002-2003 and ePREM, 2016-2017). ePREM counterparts were matched to LOLLIPOP participants by gestational age and sex. D-score trajectories were summarized by a multilevel model with random intercepts and random slopes, and multigroup analyses were used to test if the intercepts and slopes differed across cohorts. Results: We included 1686 preterm children (1071 moderately late preterm, 615 early preterm) from LOLLIPOP, and matched these with 1686 ePREM counterparts. The neurodevelopmental trajectories of the two cohorts were mostly similar. For early preterm children, we found no statistically significant differences. For moderately late preterm children, both the intercept (43.0 vs. 42.3, p < 0.001) and slope (23.5 vs. 23.9, p = 0.002) showed some, but only clinically minor, differences. Conclusion: Developmental trajectories, measured using the D-score, in the first four years of life are comparable and stable across a period of 15 years for both early and moderately late preterm children. Impact: Neurodevelopmental trajectories are similar for early and moderately late preterm children born 15 years apart and thus seem quite stable in time. The validated Developmental score visualizes these trajectories based on developmental milestone attainment. Because of its stability over time, the Developmental score trajectory may aid clinicians in neurodevelopmental assessment of preterm children as this simplifies monitoring and interpretation, similar to a growth chart.


Infants dying in foundling homes in European cities or nations and infants placed into care homes in the countryside. Data from Peiper (Peiper 1955).
Characteristics of fast and slow life history strategies. HRV (heart rate variability).
Networks in Auxology – proceedings of the 31st Aschauer Soiree, held at Aschau, Germany, June 17th 2023

December 2023 · 140 Reads · 1 Citation · Human Biology and Public Health

Twenty-seven scientists met for the annual Auxological conference held at Aschau, Germany, to particularly discuss the interaction between social factors and human growth, and to highlight several topics of general interest to the regulation of human growth. Humans are social mammals. Humans show and share personal interests and needs, and are able to strategically adjust size according to social position, with love and hope being prime factors in the regulation of growth. In contrast to Western societies, where body size has been shown to be an important predictor of socioeconomic status, egalitarian societies without formalized hierarchy and material wealth-dependent social status do not appear to similarly integrate body size and social network. Social network structures can be modeled by Monte Carlo simulation. Modeling dominance hierarchies suggests that winner-loser effects play a pivotal role in robust self-organization that transcends the specifics of the individual. Further improvements of the St. Nicolas House analysis using re-sampling/bootstrap techniques yielded encouraging results for exploring dense networks of interacting variables. Customized pediatric growth references, and approaches towards a Digital Rare Disease Growth Chart Library were presented. First attempts with a mobile phone application were presented to investigate the associations between maternal pre-pregnancy overweight, gestational weight gain, and the child’s future motor development. Clinical contributions included growth patterns of individuals with Silver-Russell syndrome, and treatment burden in children with growth hormone deficiency. Contributions on sports highlighted the fallacy inherent in disregarding the biological maturation status when interpreting physical performance outcomes. The meeting explored the complex influence of nutrition and lifestyle on menarcheal age of Lithuanian girls and emphasized regional trends in height of Austrian recruits. 
Examples of the psychosocial stress caused by the forced migration of modern Kyrgyz children and Polish children after World War II were presented, as well as the effects of nutritional stress during and after World War I. The session concluded with a discussion of recent trends in gun violence affecting children and adolescents in the United States, and aspects of life history theory using the example of "Borderline Personality Disorder." The features of this disorder are consistent with the notion that it reflects a "fast" life history strategy, with higher levels of allostatic load, higher levels of aggression, and greater exposure to both childhood adversity and chronic stress. The results were discussed in light of evolutionary guided research. In all contributions presented here, written informed consent was obtained from all participants in accordance with institutional Human investigation committee guidelines in accordance with the Declaration of Helsinki amended October 2013, after information about the procedures used.





Citations (52)


... Preterm infants are beset by a variety of health problems that require lengthy stays in the NICU ranging from a few weeks to a few months [7]. Term infants with congenital anomalies, sepsis, or hypoxic-ischemic encephalopathy might also spend several days or weeks in the NICU. ...

Reference:

Financial incentives for family members of hospitalized neonates for improving family presence
Changes in neonatal morbidity, neonatal care practices, and length of hospital stay of surviving infants born very preterm in the Netherlands in the 1980s and in the 2000s: a comparison analysis with identical characteristics definitions

BMC Pediatrics

... Following this, missing exposure data were imputed using the Multivariate Imputation by Chained Equations (MICE) package in R software (van Buuren & Groothuis-Oudshoorn, 2011). The imputed datasets were generated using Predictive Mean Matching (PMM) for both continuous and binary exposure data (Austin & van Buuren, 2023). To test the robustness of our imputation strategy, we adjusted the number of imputations (m = 10, 20) and the maximum number of iterations (maxit = 20, 30). ...

Logistic regression vs. predictive mean matching for imputing binary covariates

... 30 However, we will also develop methodology to model longitudinal change at the individual level, whilst accounting for potentially confounding factors, based on approaches developed within the pediatric growth charting literature. 39,41,42 This framework will also accommodate potential biases specific to longitudinal cognitive studies, for example those that may be due to practice or motivational effects (e.g. where individuals with a mental illness may have a higher or lower motivation to perform the tests). ...

Evaluation and prediction of individual growth trajectories

... (ii) The fit of the imputation model may be verified with the help of a posterior predictive check (Nguyen et al., 2017; Zhao, 2022). A straightforward posterior predictive check for imputation methodology is the multiple over-imputation of observed data values (Cai et al., 2022). If the statistical properties of the overimputed values are equivalent to those of the observed data values, one could infer that the imputation model fits the observed part of the incomplete data reasonably well. ...

Graphical and numerical diagnostic tools to assess multiple imputation models by posterior predictive checking
  • Citing Article
  • June 2023

Heliyon

... For this reason, we present two datasets for all dictyopterans: one excluding ratios with coxae and tarsus measurements altogether to avoid introducing biases; in the second one, fossil species without coxa and tarsus were given an approximate value using the relation between coxa or tarsus length and femoral length in phylogenetically and temporally close fossil species of a similar size (see Table S1). The number of components included in the final interpretation of the analysis was based on the broken-stick model (van Buuren, 2023). Components were retained if their cumulative weight explained more than 70% of the total variance (Table S3). ...

Broken Stick Model for Irregular Longitudinal Data

Journal of Statistical Software

... There are two versions of the GSED with each serving a different purpose: the GSED short form version that is a caregiver-reported measure intended for population-level monitoring and the GSED long form a directly administered tool designed more for program evaluation purposes. The WHO team has been conducting a multi-country validation study of the GSED tools in seven countries and implemented in two rounds [10]. Round 1 included Bangladesh, Pakistan, Tanzania, and study results supporting GSED validity in these contexts have been published in the GSED technical report [11]. ...

Protocol for validation of the Global Scales for Early Development (GSED) for children under 3 years of age in seven countries

BMJ Open

... Building upon the CREDI, a new open-access tool has been developed recently by the World Health Organization (WHO) in collaboration with various global technical experts called the Global Scales for Early Development (GSED) [8]. Launched publicly in 2023, the GSED aims to become a global standardized tool that can provide international comparable data on ECD measured in terms of a single holistic score (D-score) among children 0-3 years of age [9]. There are two versions of the GSED with each serving a different purpose: the GSED short form version that is a caregiver-reported measure intended for population-level monitoring and the GSED long form a directly administered tool designed more for program evaluation purposes. ...

The creation of the Global Scales for Early Development (GSED) for children aged 0–3 years: combining subject matter expert judgements with big data

... However, as patients with knee OA experience often repeated hospital admissions, the resulting readmission data constitutes a form of recurrent event data. Owing to the autocorrelation of the recurrent event data, the classical Cox proportional risk model should not be applied for survival analysis; rather, a frailty model should be used [21]. The frailty model has been used as a common method for dealing with longitudinal data since it was proposed [6,22]. ...

Longitudinal individual predictions from irregular repeated measurements data

... In the case where the test sample obeys the normal distribution with the unknown mean and variance, the joint prior distribution of mean μ and variance σ² can be assumed to be the normal-inverse gamma distribution [2,3,11], as shown in Eq. (8). ...

Joint distribution properties of fully conditional specification under the normal linear model with normal inverse-gamma priors

... along the middle of the probability continuum), for example, a person with low ACE severity endorsed an item that is less common for someone in that group to endorse. Ideally, the infit and outfit statistics for items should be between 0.5 and 1.5 [29,31]. For both statistics, being higher than this range is a larger concern as it is an indication that the item is not fitting well, whereas being lower than 0.5 might be an indication of overfitting [30,31]. ...

Child development with the D-score: turning milestones into measurement