RefCurv: A Software for the Construction of
Pediatric Reference Curves
Christian Winkler
University of Bonn
Katharina Linden
University of Bonn
Andreas Mayr
University of Bonn
Thomas Schultz
University of Bonn
Thomas Welchowski
University of Bonn
Johannes Breuer
University of Bonn
Ulrike Herberg
University of Bonn
Abstract
In medicine, reference curves serve as an important tool for everyday clinical practice.
Pediatricians assess the growth process of children with the help of percentile curves
serving as norm references. The mathematical methods for the construction of these
reference curves are sophisticated and often require technical knowledge beyond the scope
of physicians. An easy-to-use software for life scientists and physicians is missing. As
a consequence, most medical publications do not document the construction properly.
This project aims to develop a software that enables non-technical users to apply modern
statistical methods to create and analyze reference curves.
In this paper, we present RefCurv, a software that facilitates the construction of
reference curves. The software comprises functionalities to select and visualize data. Users
can fit models to the data and graphically present them as percentile curves. Furthermore,
the software provides features to highlight possible outliers, perform model selection, and
analyze the sensitivity.
RefCurv is an open-source software with a graphical user interface (GUI) written in
Python. It uses R and the gamlss add-on package (Rigby and Stasinopoulos (2005)) as
the underlying statistical engine.
RefCurv simplifies the process of creating percentile curves with a broad set of data processing
features. The tool can help to standardize the procedure and plan the acquisition
of data. An exemplary analysis of the robustness of the underlying statistical methods is
shown in case scenarios. Also, a method to design studies concerning the required sample
size and the model setting is demonstrated.
In summary, RefCurv is the first software based on the gamlss package, which enables
practitioners to construct and analyze reference curves in a user-friendly GUI. In broader
terms, the software brings together the fields of statistical learning and medical application.
Consequently, RefCurv can help to establish the construction of reference curves in
other medical fields.
Keywords: reference curves, percentile curves, centile estimation, z-scores, LMS method, pediatrics,
echocardiography, open source, R, gamlss, Python.
arXiv:1901.09775v1 [stat.AP] 28 Jan 2019
1. Introduction
Reference curves and charts are standard tools to describe the normal range of a parameter.
In clinical practice, physicians use percentile curves (or z-score curves) to evaluate measured
values of patients. Comparing the measurement to a reference helps to quantify the severity
of the disease and diagnose the condition of a patient. In this context, percentile curves
have been established for most common physiological and anthropometric parameters. For
children, the curves can be used to assess the growth process. In the literature, several reference
curves and charts for pediatric parameters are available. One prominent parameter is the
Body Mass Index (BMI) (Cole, Freeman, and Preece (1995); Fredriks, van Buuren, Wit, and
Verloove-Vanhorick (2000b)).
Figure 1 shows an example of pediatric reference curves for the BMI, which we fitted to
a dataset from a previous study (Fredriks, Van Buuren, Burgmeijer, Meulmeester, Beuker,
Brugman, Roede, Verloove-Vanhorick, and Wit (2000a)). Furthermore, there have been studies
on weight, height and head circumference (Cole, Freeman, and Preece (1998); Group and
de Onis (2006); Cacciari, Milani, Balsamo, Spada, Bona, Cavallo, Cerutti, Gargantini, Greggio,
Tonini et al. (2006); Neuhauser, Schienkiewitz, Rosario, Dortschy, and Kurth (2013)).
Figure 1: Reference curves for BMI based on a dataset of healthy Dutch boys
(Fredriks et al. (2000a)). RefCurv was used to fit a model to the data points and depict
it in the form of percentile curves. The labels indicate the percentiles, e.g. "P3" stands for
the third percentile.
Another broad field of application for reference curves comprises echocardiographical parameters
(Kobayashi, Fuse, Sakamoto, Mikami, Ogawa, Hamaoka, Arakaki, Nakamura, Nagasawa,
Kato et al. (2016); Dallaire and Dahdah (2011); Cantinotti, Kutty, Franchi, Paterni, Scalese,
Iervasi, and Koestenberger (2017)). Echocardiography has become an essential support for
cardiological examination in children. Due to its noninvasiveness and fast application, it has
been established as a standard technology in everyday clinical practice. Cardiologists use reference
curves to detect cardiac pathologies and plan surgical treatments. A recent literature
review on this growing field is given by Mawad, Drolet, Dahdah, and Dallaire (2013). In the
present paper, the focus is on the cardiological application of our software in children and
examples are based on echocardiographical measurements.
The mathematical methods for the construction of pediatric reference curves have been shaped
by the publications of Cole and Green (Cole (1990); Cole and Green (1992); Cole et al.
(1998)). In Cole (1990), the author proposes an algorithm for fitting smooth curves to
data by using a penalized likelihood. Furthermore, Cole and Green describe the Box-Cox
Cole Green (BCCG) distribution for pediatric growth curves and show the application on a
dataset for BMI (Cole et al. (1995)). This approach has since been called the LMS method (or
Lambda-Mu-Sigma method) and has been applied in many studies (Mul, Fredriks, Van Buuren,
Oostdijk, Verloove-Vanhorick, and Wit (2001); Fredriks et al. (2000a); Katzmarzyk
(2004); Nysom, Mølgaard, Hutchings, and Michaelsen (2001); Ataei, Hosseini, Fayaz, Navidi,
Taghiloo, Kalantari, and Ataei (2016); Hirschler, Molinari, Maccallini, Hidalgo, Gonzalez,
and de los Cobres Study Group (2016); Khadilkar, Ekbote, Chiplonkar, Khadilkar, Kajale,
Kulkarni, Parthasarathy, Arya, Bhattacharya, and Agarwal (2014)).
Subsequently, the group of Cole and Green implemented a program called LMSchartmaker,
which enables practitioners to apply the LMS method. This tool has been used in multiple
studies, but we found two issues regarding scientific practice. First, LMSchartmaker is
not open source and its implementation is not described. Second, an accompanying
scientific publication and references are missing.
At the same time, Rigby and Stasinopoulos developed and implemented the R add-on package
for "Generalized Additive Models for Location Scale and Shape" (gamlss, Rigby and
Stasinopoulos (2005); Stasinopoulos, Rigby et al. (2007)). The gamlss package contains the
LMS method and algorithms by Cole and Green. In addition, it extends the method by providing
other model classes and diagnostic tools to assess the fitted reference curves. Unlike
LMSchartmaker, the gamlss package is open-source, scientifically well-documented and free
of charge. However, using gamlss requires intensive study of the related mathematical
methods as well as programming skills in R.
Despite the availability of multiple statistical methods, most medical publications do not document
the construction of reference curves properly or omit important information such as
details about the model selection. One reason might be the complex application of statistical
methods, which is a challenge for physicians and data analysts alike.
Researchers often cannot reproduce study results in the form of reference curves because
datasets are not published by the authors.
This project aims to develop RefCurv, an easy-to-use software for the construction of reference
curves. With this tool, we want to enable non-technical users to create and analyze
percentile curves for clinical usage. Moreover, it is intended to help experts with the advanced
analysis of reference curves. In addition, we provide features to plan the study design,
such as estimating the required sample size.
RefCurv uses R and the gamlss add-on package as the underlying statistical engine. Users
can apply the LMS method to their data or use a customized GAMLSS model for their calculations.
The graphical user interface (GUI) is written in Python using its features in data
visualization and processing. Users can define model settings in the GUI. This information
is passed on to an R-script using functions from the gamlss package. After computation,
the results are delivered back to the GUI. Functionalities for data selection, model selection,
and model validation are provided. The software is designed to simplify data processing and
model fitting. With RefCurv, users are guided through a simple workflow from acquired data
to reference curves. The software is intended mainly for users without any specific programming
or mathematical skills.
Furthermore, an echocardiographical dataset was acquired in previous studies by our research
group. We use these data to demonstrate and explain the application of RefCurv. The given
examples can be considered as recommended steps for the construction of reference curves.
2. Methods
The main focus of RefCurv lies on the LMS method by Cole, using the gamlss package in
R for the statistical computations. The model used for the LMS method is a special case
of a GAMLSS model. The model class is defined by the Box-Cox Cole Green (BCCG)
distribution and penalized splines as smoothing for the distribution parameters L, M and S.
Penalized splines are implemented in R as the pb() function. Each penalized spline has
degrees of freedom (df) that are predefined by the user. These three values (L_df, M_df and S_df)
are arguments of the pb() function and are therefore treated as hyperparameters. We
chose a setting of L_df = 0, M_df = 1 and S_df = 0 as the default model.
Consequently, the model fitting according to the LMS method is implemented as follows:
LMS_model <- gamlss(y ~ pb(x, df = M_df),
                    sigma.formula = ~ pb(x, df = S_df),
                    nu.formula = ~ pb(x, df = L_df),
                    family = "BCCG",
                    method = RS(),
                    data = dataset_training)
The Rigby and Stasinopoulos algorithm, RS(), is used for the fitting (Stasinopoulos et al.
2007).
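To make the role of the three distribution parameters concrete, the following minimal Python sketch evaluates percentiles of the BCCG distribution for a single covariate value using the standard LMS relation y_p = M (1 + L S z_p)^(1/L), with y_p = M exp(S z_p) for L = 0. The function name lms_percentile and the numerical values of L, M and S are hypothetical and chosen only for illustration; they are not taken from RefCurv or from a fitted model.

```python
import math
from statistics import NormalDist


def lms_percentile(L, M, S, p):
    """Value of the p-th percentile (0 < p < 100) for BCCG parameters L, M, S."""
    z = NormalDist().inv_cdf(p / 100.0)  # z-score that belongs to percentile p
    if abs(L) < 1e-12:
        # Limiting case L = 0 (log-normal)
        return M * math.exp(S * z)
    return M * (1.0 + L * S * z) ** (1.0 / L)


# Hypothetical parameter values at one covariate value (illustration only)
L, M, S = -0.5, 30.0, 0.2
for p in (3, 10, 25, 50, 75, 90, 97):
    print(f"P{p}: {lms_percentile(L, M, S, p):.2f}")
```

Evaluating this relation over a grid of covariate values, with L, M and S taken from the fitted smooth curves, yields one percentile curve per chosen p.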
The LMS method has been established as a standard procedure for pediatric reference curves.
Therefore, it is set as default for the model fitting (Appendix A). Apart from that, the advanced
settings in RefCurv allow the user to fit a broader set of univariate GAMLSS models to the
data.
Details about the installation and software architecture of RefCurv are given in the appendix
(Appendix B). The statistical engine is the gamlss package and RefCurv consequently inherits
its limitations.
RefCurv’s model fitting is based on a model class with a BCCG distribution. One limitation
is that this model class is developed and tested for positive data values only. Furthermore,
methods might be sensitive to outliers and model fitting might fail if data is distributed
unevenly. These limitations and how to address them will be discussed in the application
section.
In this section, we will describe RefCurv in four parts. After giving information about the
repository and documentation (2.1), we will present its graphical user interface (2.2). Next,
RefCurv’s features and functions are presented (2.3). Finally, we will recommend steps for
the construction of reference curves (2.4).
2.1. GitHub repository and documentation
RefCurv is open source and currently available as version 0.4.2. The source code and binaries
are provided on GitHub: https://github.com/xi2pi/RefCurv
The related GitHub Wiki contains a quick guide and instructions for the application. The
example datasets, which were used in this paper, can be accessed through the software directly.
The repository will be kept up to date and news will be announced on GitHub. Developers
can exchange information in the related forum for issues.
In addition to the source code, we created a website (https://refcurv.com) and video tutorials
(https://vimeo.com/user93523411).
RefCurv has been developed and tested on Windows and Linux. More information about
package versions is given in Appendix B.
2.2. Graphical user interface
Figure 2 shows RefCurv’s graphical user interface (GUI) consisting of a table viewer (left)
and a plot viewer (right). Users can select data columns in the table viewer and visualize
them in the plot viewer as a scatter plot. In this example, we demonstrate the application on
an echocardiographical dataset. The end-systolic volume of the left ventricle (ESV) is plotted
against the age. The default model was fitted and graphically depicted as curves in the plot
viewer. Each curve represents a percentile of the underlying distribution (3rd, 10th, 25th,
50th, 75th, 90th, 97th) and is labeled accordingly (e.g. "P3" stands for the third percentile).
Percentile curves can be easily converted into z-score curves.
Users can navigate in RefCurv through the toolbar at the top of the window. The toolbar
consists of a set of buttons for categories such as "Model" for the model processing. RefCurv
is a Multiple Document Interface application, meaning that users can adjust settings in sub-windows
for most functions.
Figure 2: RefCurv’s graphical user interface with a table viewer (left) and a plot
viewer (right). Users can navigate through the features by using the upper toolbar.
2.3. Features
Import of data
RefCurv allows the import of data tables in the form of CSV files ("File" → "Load Data").
The following structure of the data table is required: columns contain measured variables,
while rows represent the cases. The first row of the table should be a header indicating the
names of the measured variables.
Subjects     Variable 1    Variable 2
Subject 1    ...           ...
Subject 2    ...           ...
...          ...           ...
Table 1: Structure of the input table for RefCurv
After import, users can inspect the data in the table viewer as highlighted in Figure 2.
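The required table layout can be illustrated with a short Python sketch that parses a small CSV text with a header row and one case per row. The column names (subject, age, esv), the values and the comma delimiter are assumptions made for this example; the helper read_table is hypothetical and not part of RefCurv.

```python
import csv
import io

# Hypothetical file content: header row first, then one case per row.
example_csv = """subject,age,esv
1,2.5,11.3
2,7.0,24.8
3,12.4,41.0
"""


def read_table(text):
    """Return (header, rows); each row maps a column name to a float value."""
    reader = csv.DictReader(io.StringIO(text))
    rows = [{name: float(value) for name, value in row.items()} for row in reader]
    return reader.fieldnames, rows


header, rows = read_table(example_csv)
x = [row["age"] for row in rows]  # covariate, e.g. for the x-axis
y = [row["esv"] for row in rows]  # response, e.g. for the y-axis
print(header, x, y)
```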
Data selection
After the data are loaded, RefCurv provides functions to select data points and exclude
them in case they are considered anomalies. For that, users can choose two variables in
the lower right drop-down menu, one for the x-axis and one for the y-axis. Chosen columns
are highlighted in the table viewer. By clicking the "Plot" button, a scatter plot is created.
Data points are highlighted in the scatter plot when chosen in the table viewer. By toggling
the check box in the table, subjects can be excluded or included. The selected
data serve as the training dataset and can be used for the model fitting.
Figure 3: RefCurv’s model fitting. The model fitting window shows the model parameters
and gives a summary in the text output field.
Model fitting
For model fitting, the selected data are passed as the training dataset to the gamlss function. Users
can specify the hyperparameters in the model fitting window ("Model" → "Model Fitting").
The hyperparameters for the LMS method are the degrees of freedom (df) for the penalized
splines of L, M, and S. After the fitting, a text output field provides a summary of the fitting
results. We recommend a value between 0 and 5 for each df. The effect of different
settings for the hyperparameters L_df, M_df and S_df on the resulting percentile curves
is shown in Figure 4. The higher the df of a penalized spline, the more
flexible the corresponding curves will be.
(a) L_df = 0, M_df = 1, S_df = 0 (b) L_df = 1, M_df = 2, S_df = 1 (c) L_df = 4, M_df = 4, S_df = 4
Figure 4: Model fitting with different settings for the hyperparameters L_df,
M_df and S_df.
The plot viewer depicts resulting percentile curves after the computation. The text output
shows the output of the gamlss() function. The output gives information about the fitting
results and diagnostic values such as the global deviance.
Advanced model fitting
In the advanced model fitting ("Model" → "Model Fitting (advanced)"), GAMLSS model
settings can be customized. Table 2 shows a list of distributions and smoothing functions for
GAMLSS models.
Distribution                  R function
Box-Cox Cole and Green        BCCG()
Box-Cox power exponential     BCPE()
Box-Cox-t                     BCT()
Smoothing function            R function
Cubic splines                 cs()
Polynomials                   poly()
Penalized splines             pb()
Table 2: Distributions and smoothing functions for GAMLSS models.
A full list of distributions and smoothing functions is presented in Stasinopoulos et al.
(2007).
An example GAMLSS model with a BCCG distribution and cubic splines as smoothing
function is:
GAMLSS_model <- gamlss(y ~ cs(x, df = 1),
                       sigma.formula = ~ cs(x, df = 0),
                       nu.formula = ~ cs(x, df = 0),
                       family = "BCCG",
                       method = RS(),
                       data = dataset_training)
In RefCurv, the model fitting with this setting can be realized by typing the command into the
input text field of the advanced model fitting window (Figure 5).
Figure 5: RefCurv’s advanced model fitting.
The features model selection and sensitivity analysis are only available for LMS models.
GAMLSS models with other settings are so far not supported.
Outlier detection
Outliers in the training dataset might have an adverse effect on the model fitting. Datasets
might contain outliers because of transcription errors, for instance. RefCurv offers a fast
way to detect and analyze potential outliers. Users can then decide to exclude individual
outliers. The outlier detection is based on a model fitting result. After a predefined
model is fitted, the residuals are calculated. RefCurv’s outlier detection feature
highlights data points according to their residuals. Limits for highlighting data points can be
chosen individually. In the example in Figure 6, we set the limits to 10% and 90%
("Setting" → "Outliers Setting"). As a result, data points above the 90th percentile curve and
below the 10th percentile curve are highlighted in yellow. Afterwards, users can deselect data
points that they consider outliers. Residuals are added as a column in the table so that a
quantitative assessment is possible.
Figure 6: RefCurv’s outlier detection.
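The highlighting rule can be sketched in a few lines of Python, under the assumption that the residuals are z-scores (normalized quantile residuals), so that a point lying exactly on the p-th percentile curve has the residual Φ⁻¹(p/100). The function flag_outliers and the residual values are hypothetical and only illustrate the idea; RefCurv's internal implementation is not shown in this paper.

```python
from statistics import NormalDist


def flag_outliers(z_scores, lower_pct=10.0, upper_pct=90.0):
    """Return True for every residual outside the chosen percentile limits."""
    z_low = NormalDist().inv_cdf(lower_pct / 100.0)
    z_high = NormalDist().inv_cdf(upper_pct / 100.0)
    return [z < z_low or z > z_high for z in z_scores]


# Hypothetical residuals of five subjects (illustration only)
residuals = [-2.1, -0.3, 0.0, 1.1, 1.9]
flags = flag_outliers(residuals)
print(flags)
```

Flagged points are only candidates; whether a highlighted point is actually excluded remains the analyst's decision.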
Model selection
As shown before, the LMS method can have different outcomes depending on the degrees of
freedom (df) for the penalized splines of the three parameters L, M and S. The task of the
model selection is to find an appropriate setting for the df that balances the trade-off between
goodness of fit and complexity. This step helps to avoid overfitting. RefCurv
provides two different ways of model selection.
The first model selection method uses the Bayesian Information Criterion (BIC) as decision
support for selection (Appendix C). A grid search is performed to find the best model with respect
to the BIC. The model selection window in RefCurv allows users to set the limits for the df of
L, M and S. The default step size is 1. The output of the model selection is a list of models
ordered by BIC. The df setting of the model with the lowest BIC is considered the best for
the chosen dataset.
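The grid search described above can be sketched as follows. The sketch assumes the usual definition BIC = global deviance + ln(n) · (number of effective parameters) and takes the model fit as a callable, because the actual fit is performed by gamlss in R. The function toy_fit is a made-up stand-in that merely mimics a deviance that stops improving beyond M_df = 4; its numbers and its parameter count are illustrative assumptions, not results.

```python
import itertools
import math


def bic(global_deviance, n_params, n_obs):
    """Bayesian Information Criterion: deviance plus ln(n) per parameter."""
    return global_deviance + math.log(n_obs) * n_params


def grid_search(fit, n_obs, l_range, m_range, s_range):
    """Evaluate every (L_df, M_df, S_df) combination; return them sorted by BIC."""
    results = []
    for l_df, m_df, s_df in itertools.product(l_range, m_range, s_range):
        global_deviance, n_params = fit(l_df, m_df, s_df)
        results.append((bic(global_deviance, n_params, n_obs), l_df, m_df, s_df))
    return sorted(results)


def toy_fit(l_df, m_df, s_df):
    """Hypothetical stand-in for a gamlss fit (not a real model)."""
    deviance = 2000.0 - 20.0 * min(m_df, 4)  # no gain beyond M_df = 4
    n_params = 3 + l_df + m_df + s_df        # assumed parameter count
    return deviance, n_params


ranking = grid_search(toy_fit, n_obs=300,
                      l_range=range(0, 6), m_range=range(0, 6), s_range=range(0, 6))
best_bic, best_l, best_m, best_s = ranking[0]
print(best_l, best_m, best_s)
```

With df ranges of 0 to 5 for each parameter, the grid contains 6 × 6 × 6 = 216 candidate models, which explains why an automated search is convenient.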
The second method for model selection is based on cross-validation. RefCurv uses the gamlss
function gamlssCV() for this task. Since datasets are often small in the field of pediatrics, we
decided to implement a 10-fold cross-validation. The validation is performed on the training
dataset. For that, the dataset is split into ten folds. As a next step, the model is trained on
nine folds of the dataset, while the remaining fold serves as a validation dataset. Afterwards,
the global deviance for the validation dataset is computed, which gives information about the
generalization error of the model. This is repeated until each of the ten folds has served as
the validation dataset once. Finally, the overall generalization error is computed as the mean
of the global deviances.
Figure 7: RefCurv’s model selection. The ranges for df are set to L_df = 0,...,5; M_df
= 0,...,5; S_df = 0,...,5
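The fold logic of the cross-validation described above can be illustrated with a self-contained Python sketch. It does not call gamlssCV(); instead, a plain normal distribution fitted to the training folds serves as a stand-in model, and the global deviance is taken as minus twice the log-likelihood of the held-out fold. The function names and the simulated data are assumptions made for illustration, and the aggregation follows the description in the text (mean over the folds).

```python
import math
import random
from statistics import NormalDist, mean, stdev


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and distribute them over k folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def global_deviance(model, values):
    """Minus twice the log-likelihood of the values under the model."""
    return -2.0 * sum(math.log(model.pdf(v)) for v in values)


def cross_validate(values, k=10, seed=0):
    """Mean validation deviance over k folds (stand-in model: normal distribution)."""
    deviances = []
    for fold in k_fold_indices(len(values), k, seed):
        held_out = set(fold)
        train = [v for i, v in enumerate(values) if i not in held_out]
        valid = [values[i] for i in fold]
        model = NormalDist(mean(train), stdev(train))  # stand-in for the gamlss fit
        deviances.append(global_deviance(model, valid))
    return mean(deviances)


rng = random.Random(1)
data = [rng.gauss(50.0, 5.0) for _ in range(200)]  # simulated measurements
cv_deviance = cross_validate(data)
print(cv_deviance)
```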
A cross-validation can be time-consuming due to its computational effort. The model selection
based on the BIC is faster and therefore computationally more efficient. Furthermore,
RefCurv’s BIC method is automated in the form of a grid search. For practical application,
we currently recommend the BIC method for model selection to users with little
statistical background knowledge.
Sensitivity analysis
In order to analyze the sensitivity of the fitting method, RefCurv offers a feature to add noise
to data points. This kind of uncertainty could be caused by measurement errors. Figure 8
shows the concept of the sensitivity analysis.
Users can choose single or multiple data points, which are depicted in black. The variations
Δy_up and Δy_down can be applied to the chosen response variable y. As a result,
there are three different datasets (black, green, red), which are used as training data. The
method then fits a model and shows the percentile curves for each of the three cases. In Figure
8, the 50th percentile curve is depicted for each case (black, green, red).
Figure 9 shows an example of the sensitivity analysis in RefCurv. Chosen data points with
variation are highlighted in yellow. The values for the variation can be set by the user in the
text fields below.
This feature also allows users to examine the influence of data points on the percentile curves. By
varying one data point, for example, we can analyze the effect on the 50th percentile curve.
[Figure 8 schematic: axes x and y; three P50 curves; labels yt, Δy_up and Δy_down]
Figure 8: Concept of the sensitivity analysis. A model is fitted to each of the training
datasets (red, black, green), which symbolically consists of four data points in this figure.
The 50th percentile curve is shown for each of the three cases in the corresponding color.
Figure 9: RefCurv’s sensitivity analysis. Chosen data points with variation are highlighted
in yellow. The curves represent the 50th percentile curve for the three cases: variation
up, variation down and no variation.
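The construction of the three training datasets used in the sensitivity analysis can be sketched as follows. The function perturbed_datasets and the example values are hypothetical; each of the returned response vectors would subsequently be passed to the same model fit so that the resulting 50th percentile curves can be compared.

```python
def perturbed_datasets(y, selected, delta_up, delta_down):
    """Build the response vectors for the cases: no variation, up, down.

    y          -- list of response values
    selected   -- indices of the data points chosen for the variation
    delta_up   -- amount added to the chosen points (variation up)
    delta_down -- amount subtracted from the chosen points (variation down)
    """
    chosen = set(selected)
    up = [v + delta_up if i in chosen else v for i, v in enumerate(y)]
    down = [v - delta_down if i in chosen else v for i, v in enumerate(y)]
    return {"none": list(y), "up": up, "down": down}


# Hypothetical response values; the point with index 1 is varied
datasets = perturbed_datasets([1.0, 2.0, 3.0], selected=[1],
                              delta_up=0.5, delta_down=0.25)
print(datasets)
```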
Model comparison
In the model comparison window, users can compare the percentile curves of two models. As
an example, models with different settings for the df can be fitted and compared afterwards
to analyze the effect.
Reverse computation
For decision support, clinicians often use reference curves or charts from the literature. One
problem is that the distribution parameters (L, M, and S for the BCCG distribution) are
often missing. RefCurv’s reverse computation feature enables users to approximate L, M,
and S values for given reference curves. With this method, it is possible to express any
reference curve as an LMS model. To achieve that, the method fits a BCCG distribution to
the reference curves. The results are the distribution parameters L, M, and S for each value
of the covariate. More mathematical details about this approach are given in Appendix D.
Export
Resulting reference curves can be exported ("File" → "Save Reference Curves") as a graph
(all common graphical formats) or as a table (CSV file). The values for L, M, and S are
automatically exported so that percentiles or z-scores can be computed manually. For clinical
use, the values for L, M, and S are essential to compute the z-score of a new case using Cole’s
formula (Appendix A).
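As an illustration of how exported L, M and S values can be used, the following Python sketch computes the z-score of a measurement y with the standard LMS relation z = ((y/M)^L − 1) / (L S), and z = ln(y/M) / S for L = 0. The function name lms_z_score and the parameter values are hypothetical; in practice, L, M and S would be read from the exported table at the covariate value of the new case.

```python
import math


def lms_z_score(y, L, M, S):
    """z-score of a measurement y for given BCCG parameters L, M and S."""
    if abs(L) < 1e-12:
        # Limiting case L = 0
        return math.log(y / M) / S
    return ((y / M) ** L - 1.0) / (L * S)


# Hypothetical parameter values (illustration only)
print(lms_z_score(11.0, 1.0, 10.0, 0.1))
print(lms_z_score(10.0, -0.5, 10.0, 0.2))
```

A measurement equal to M always yields a z-score of 0, since M is the median of the distribution.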
Z-Score/Percentile converter
Percentiles can be converted into z-scores and vice versa. Which of the two is used by
the clinician depends on the examination. RefCurv offers a converter to deal with both
definitions ("Calculator" → "Z-score/Percentile Converter"). In Figure 10, we converted the
percentile value of 75 to a z-score. In that case, we obtain a z-score of 0.67449 as the result.
Figure 10: RefCurv’s Z-score/Percentile converter
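The conversion itself relies only on the standard normal distribution: a z-score is the standard normal quantile of the percentile, and a percentile is 100 times the standard normal cumulative probability of the z-score. A minimal Python sketch with the standard library is given below; the function names are chosen for illustration and are not RefCurv's.

```python
from statistics import NormalDist


def percentile_to_z(percentile):
    """Standard normal quantile for a percentile between 0 and 100 (exclusive)."""
    return NormalDist().inv_cdf(percentile / 100.0)


def z_to_percentile(z):
    """Percentile (0 to 100) that corresponds to a z-score."""
    return 100.0 * NormalDist().cdf(z)


print(round(percentile_to_z(75), 5))  # 0.67449, the value reported in the text
print(z_to_percentile(0.0))
```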
Z-Score calculator
Clinicians obtain percentile and z-score values of patients as a diagnostic parameter. These
values have to be computed from measured data for a given reference curve. Currently,
there is a large number of web and smartphone applications to compute the z-score. With
RefCurv’s z-score calculator ("Calculator" → "Z-score Calculator"), it is possible to compute
z-score values of patients directly after the construction of reference curves.
Figure 11 shows an example where the z-score for the entered data point is 1.473.
Figure 11: RefCurv’s z-score calculator. The z-score of the entered data point (x =
100, y = 40) is 1.473.
Monte Carlo Simulation
RefCurv’s Monte Carlo Simulation is a feature that could help researchers to design a study.
The goal is to plan the required sample size for the construction of reference curves. The
simulation is based on a GAMLSS model to create a random sample. Users can enter the
simulated sample size in this step.
Next, this simulated random sample can be used for model fitting and analysis with
different settings. Based on this approach, users gather information about the behavior of
models fitted to samples of the chosen size. As a result, users can estimate an appropriate
sample size for the construction of reference curves.
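The idea of drawing a random sample from a given model can be sketched in Python under several assumptions that are not taken from RefCurv: the covariate is drawn uniformly from its range, the response is generated by inverting the LMS relation for a standard normal z, and draws outside the support of the distribution are simply repeated. The curves returned by hypothetical_lms are made up for illustration; how RefCurv generates its samples internally is not described here.

```python
import math
import random


def simulate_lms_sample(n, x_min, x_max, lms_at, seed=0):
    """Draw n (x, y) pairs from an LMS model given by the callable lms_at(x)."""
    rng = random.Random(seed)
    sample = []
    while len(sample) < n:
        x = rng.uniform(x_min, x_max)      # assumed uniform covariate
        L, M, S = lms_at(x)
        z = rng.gauss(0.0, 1.0)
        if abs(L) < 1e-12:
            y = M * math.exp(S * z)
        else:
            base = 1.0 + L * S * z
            if base <= 0.0:
                continue                   # outside the support: draw again
            y = M * base ** (1.0 / L)
        sample.append((x, y))
    return sample


def hypothetical_lms(x):
    """Made-up parameter curves L(x), M(x), S(x) for illustration."""
    return 0.5, 10.0 + 2.0 * x, 0.15


sample = simulate_lms_sample(100, 0.0, 18.0, hypothetical_lms)
print(len(sample), sample[0])
```

Repeating the simulation for several values of n and refitting the model each time gives an impression of how stable the percentile curves are at a given sample size.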
2.4. Recommended steps for the modeling of pediatric reference curves with
the LMS method
The steps for constructing reference curves depend on the analyst’s choice. The data analyst
could choose the order: model selection, model fitting, outlier analysis. Alternatively,
the outlier analysis could be performed first.
Consequently, different approaches can lead to different results and none of these is objective or
ideal. However, a unified workflow can improve reliability, comparability and reproducibility.
1. Data Preparation: data visualization, outlier detection (output: training dataset)
2. Model Selection: model class, hyperparameter tuning (output: model setting)
3. Model Fitting: fitting of model parameters (output: GAMLSS model)
4. Model Testing / Evaluation: validation on test dataset
Figure 12: Recommended steps for the modeling of pediatric reference curves
The gamlss package offers different options for the individual steps of the LMS method,
including model selection and cross-validation. The documentation is very comprehensive
(Stasinopoulos, Rigby, Heller, Voudouris, and De Bastiani (2017)). However, we found that
gamlss methods are applied in arbitrary order. A guideline for practitioners seems to be
missing. We therefore suggest steps for the modeling of reference curves with the LMS method,
which can be carried out with RefCurv. Figure 12 highlights our four recommended steps.
1. Data preparation is the first step of reference curve modeling. The data visualization
is crucial to get an overview of the data distribution. We recommend depicting the data in
a scatter plot and using descriptive statistics to analyze their behavior. The dataset could
contain outliers that might have a negative effect on the construction. By highlighting
possible outliers, users can reassess, filter and correct the data. We have to make sure
that the data serve as a good training set for the model fitting.
The output of this modeling step is the training dataset that can be used for the model
fitting.
2. As a next step, we recommend performing model selection. The task of this step is to
define the model class (distribution family, smoothing functions and hyperparameters),
which will be fitted to the data. The decision for the model class should be based on
the data distribution, sample size and other data characteristics. Therefore, this step
requires experience with modeling.
The LMS method uses penalized splines and therefore belongs to the group of non-parametric
models. These models can be used if the amount of data is large and a priori
knowledge about the data distribution is missing. In RefCurv, this class is set as default,
so that users do not have to deal with complicated model selection tasks.
Furthermore, the LMS method contains the hyperparameters L_df, M_df and S_df.
These hyperparameters have to be tuned during the model selection.
3. The model fitting follows after the hyperparameters have been found. In this step,
the model parameters are fitted. For the LMS method, the model parameters are the
vectors L, M and S. The final result of this modeling step is a generalized LMS model
that describes the behavior of the data.
4. The last step is the model testing / evaluation. In this step, the model has to be
validated on an independent test dataset from the population. As a result, users can
compute the prediction error for this test dataset, which explains the quality of the
model.
3. Application
In this section, we will show how to apply RefCurv on an example dataset, which was acquired
in a previous study of our group (Krell, Laser, Dalla-Pozza, Winkler, Hildebrandt, Kececioglu,
Breuer, and Herberg (2018)). The dataset is accessible for users through the software. First,
we will highlight an example where we apply the recommended steps for modeling, which
were listed in the previous section. Second, we will go through a case scenario to emphasize
the advantages of RefCurv. Last, we will demonstrate how a study design in terms of sample
size can be planned.
3.1. Example
After loading the file ("Examples" → "Echo example"), users can observe the data of 351
healthy children in the table viewer. Measured variables are age, weight, height, end-systolic
volume (ESV), end-diastolic volume (EDV) and stroke volume (SV) of the left ventricle. The
left ventricle is one of the large chambers of the heart and cardiologists measure its volume
and shape with echocardiograms. Data from both genders were combined for this example.
1. Data preparation
First, we examined the data in the table viewer and plotted them as a scatter plot to analyze
the data distribution. Age and ESV were selected as variables in the main window. Selecting
data points in the table highlights them in the scatter plot (Figure 13 (a)).
As a next step, we highlighted possible outliers by fitting a standard model (L_df = 0, M_df
= 1, S_df = 0) to the data. The limit for highlighting possible outliers was set to the 3rd and
the 97th percentile curve. In the interest of simplification, all data below the 3rd percentile
curve and above the 97th percentile curve were deselected for this example (Figure 13 (b)).
Please note that only some of the highlighted data points - the ones that the analyst assesses
as abnormal - should be considered as outliers.
(a) Data visualization. (b) Outlier detection.
Figure 13: Data preparation. Data are visualized (a) and possible outliers are
highlighted (b).
2. Model selection
In order to optimize the hyperparameters L_df, M_df, and S_df, RefCurv’s BIC model selection
was performed. The range for each df was set to 0 to 5.
The result of the model selection is shown in Figure 14. The model with setting M_df=4,
S_df=0 and L_df=0 had the lowest BIC (1940.633). This model was chosen as the best
model, and its settings were used in the model fitting window to create the new prediction.
Figure 14: Model selection. The ranges for df are set to L_df = 0,...,5; M_df = 0,...,5;
S_df = 0,...,5
3. Model fitting
The model was fitted with the tuned hyperparameters (M_df=4, S_df=0, L_df=0).
Figure 15 shows the results of the model fitting process.
Figure 15: Model fitting. The df are set to M_df=4, S_df=0, and L_df=0
4. Model testing
As the last step, the model was validated with the implemented 10-fold cross-validation
function. In the model validation window, the df values for L, M, and S found through the
model selection process were entered (Figure 16). The cross-validated global deviance was
1902.871 for this case.
Figure 16: Model testing. We used a 10-fold cross-validation to determine the
cross-validated global deviance of 1902.871 for the model. The global deviance for the training of
the last model during the cross-validation (10th iteration step) was 1645.632.
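The idea behind the k-fold procedure can be sketched as follows. The sketch assumes that the cross-validated global deviance is the sum of the deviances computed on the held-out folds; `fit` and `deviance` are hypothetical callables, and the toy example uses a mean model with squared error as a stand-in for a real GAMLSS fit and its deviance.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and split them into k folds of near-equal size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_deviance(data, fit, deviance, k=10, seed=0):
    """Sum the held-out deviance over k folds.

    fit      -- hypothetical callable: training rows -> model
    deviance -- hypothetical callable: (model, test rows) -> -2 * log likelihood
    """
    total = 0.0
    for fold in k_fold_indices(len(data), k, seed):
        held_out = set(fold)
        train = [row for i, row in enumerate(data) if i not in held_out]
        test = [data[i] for i in fold]
        total += deviance(fit(train), test)
    return total


def fit_mean(train):
    """Stand-in model: the mean of the training values."""
    return sum(train) / len(train)


def squared_error(model, test):
    """Stand-in for a deviance: sum of squared errors on the held-out values."""
    return sum((y - model) ** 2 for y in test)


values = [float(v) for v in range(1, 21)]
cv_score = cross_validated_deviance(values, fit_mean, squared_error, k=10)
```

Because every observation is held out exactly once, the score reflects predictive performance, which is why it can differ from the training deviance of any single fold.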
3.2. Case scenario
In order to display the other features of RefCurv, we present a case scenario for the construc-
tion of reference curves. First, we studied the impact of reducing data points and creating a
gap in the age range. We also investigated the effect of the data points on the sides (edges)
of the measuring range.
Figure 17: Case scenario. We reduced the number of data points (from left to right)
creating a gap in the data cloud.
In this scenario, data points from the training dataset were gradually excluded in the middle
of the data cloud. This resulted in a gap that can cause computational problems. With this
procedure, the feasibility and robustness of the LMS method were tested. Figure 17 shows
the procedure. When the number of data points fell below 274, the LMS method produced
unsatisfactory reference curves with low smoothness. Changing the hyperparameters df did
not improve the smoothness.
To solve this issue, we used RefCurv’s "Advanced model fitting". We defined GAMLSS_model
with the following setting:
GAMLSS_model <- gamlss(y ~ poly(x, 2),
                       sigma.formula = ~ poly(x, 1),
                       nu.formula = ~ poly(x, 1),
                       family = "BCCG",
                       method = RS(),
                       data = dataset_training)
where poly() is the R function for evaluating orthogonal polynomials of the given degree.
Figure 18 shows the result of the fitting.
Figure 18: Advanced model fitting. We defined a new gamlss model with poly(x)
functions for the curve fitting.
3.3. Design of study
RefCurv’s Monte Carlo simulation can be used to visualize the impact of the sample size.
Let us take a look at the resulting reference curves from the previous example (Figure 15). The
curves can be loaded into the simulation window, and samples of different sizes can be created
for the simulation. We chose a sample size of 500 and reduced it to 100 (Figure 19).
(a) n = 500 (b) n = 100
Figure 19: Monte Carlo simulation. Different sample sizes were created from a previ-
ously chosen model.
We continued with the lower sample size (n = 100), fitted an LMS model with the standard
setting, and compared it to the original model (Figure 20) by computing the difference between
the 50th percentile curves of both models. The absolute difference is never larger
than 1 milliliter. From these results, users could conclude that a sample size of 100 might be
sufficient to create percentile curves. We recommend similar analyses, such as computing
the difference of the other percentile curves, to corroborate this assumption.
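One way to generate such a sample from given LMS curves is to draw standard normal values and back-transform them with the LMS parameters at each age. The following sketch assumes uniformly distributed ages and a hypothetical callable `lms_at`; it is not RefCurv's simulation code, and the constant toy model is made up.

```python
import random
from math import exp
from statistics import median


def sample_lms(L, M, S, rng):
    """Draw one value by back-transforming a standard normal draw."""
    while True:
        z = rng.gauss(0.0, 1.0)
        if L == 0:
            return M * exp(S * z)
        base = 1.0 + L * S * z
        if base > 0:  # reject draws outside the truncated range of Z
            return M * base ** (1.0 / L)


def simulate_dataset(lms_at, x_min, x_max, n, seed=0):
    """Simulate n (x, y) pairs; ages are drawn uniformly (an assumption)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(x_min, x_max)
        L, M, S = lms_at(x)
        rows.append((x, sample_lms(L, M, S, rng)))
    return rows


def toy_lms(age):
    """Made-up model with constant parameters, for illustration only."""
    return 1.0, 45.0, 0.1


sample_500 = simulate_dataset(toy_lms, 0.0, 18.0, 500)
sample_100 = simulate_dataset(toy_lms, 0.0, 18.0, 100, seed=1)
median_500 = median(y for _, y in sample_500)
```

Refitting a model to each simulated sample and comparing the resulting percentile curves with the original ones then indicates how much precision is lost at the smaller sample size.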
[Figure 20 panels: original model (Example 3.1); Monte Carlo simulation (n = 100); difference
between original model and simulation for the 50th percentile]
Figure 20: Model comparison. We used the model from example 3.1 and created a
sample (n = 100) by Monte Carlo simulation. This sample served as the training dataset to
fit a model. The 50th percentiles of both models are compared.
We illustrated that RefCurv’s features might help to design studies before data is acquired.
This can be achieved by going through a case scenario like the presented one. Questions about
the number of data points, data distribution, robustness of the LMS method and impact of
outliers can be analyzed in advance. Issues like missing data could be considered during the
planning.
4. Discussion
In the field of medical research, physicians and life scientists lack easy-to-use software for
the construction of reference curves. Most modern statistical approaches, such as those in the
gamlss package, are implemented in R and require programming skills. Furthermore, the
number of steps for the statistical analysis is high and hampers a quick analysis. Thus, there
is a gap between the statistical methods and end-users. To address this issue, we presented
RefCurv, a software that enables the construction and analysis of reference curves for children.
In this article, we focused on medical and particularly echocardiographical data, where
reference curves are broadly discussed. Dallaire and Dahdah (2011) presented a very detailed
analysis of modeling approaches. In their study, they focused on parametric regression models
and analyzed features such as goodness of fit for the model and data distribution. A similar
approach was presented by Kobayashi et al. (2016), who also added nonparametric
regression models to their analysis. These studies point to the need for a tool that can
simplify and automate the computation. Our project focused on the LMS
method and GAMLSS models because of their good quality and reliability. In this context,
the gamlss package provides a broad set of distributions and smoothing functions.
With this project, we lay the foundation for further analysis of reference curves. RefCurv
helps to solve issues that were discussed in multiple articles before. Cantinotti, Scalese,
Franchi, Corana, Viacava, Assanta, Santoro, and Koestenberger (2018) propose, for example,
to develop a uniform approach to data normalization. The same authors developed an appli-
cation for smartphones, BabyNorm, which enables and simplifies the use of medical reference
values in clinics (Cantinotti et al. (2017)). The advantage of BabyNorm is the possibility
to choose between different published reference curves. Clinicians can compare patient data
to normal values, which are given in the journal articles. We found that a quality index for
the published reference curves is missing. Users mostly have to choose references from studies
arbitrarily, without knowing details about the reference values provided by the study,
such as the number of data points, the statistical method, or the goodness of fit. RefCurv can
help address this problem by offering methods such as cross-validation to rank different models.
The development process of RefCurv is ongoing in order to improve the functionality. An
automated computation of the sample size is planned. So far, the software has been tested with
multiple datasets and has been found to be stable. However, stability and convergence issues
might occur, as stated in the documentation of the gamlss package (Rigby and Stasinopoulos (2005)).
The handling of negative data points has not yet been considered but will be addressed
in future versions. In the future, RefCurv will also be tested on large and highly distributed
data to identify its limitations.
RefCurv can help to standardize procedures and plan the acquisition of data. The design
of the study can be planned in advance through exemplary case scenarios. For example,
researchers could simulate the impact of the sample size on their reference curves to find out:
(i) the minimum number of data points required, (ii) the effect of an increasing number of
data points, (iii) the correct choice of predictor for the curves, (iv) the necessity to stratify by
gender or other variables. Constructing percentile curves with RefCurv can reveal
the impact of these factors on the study results.
A fundamental problem in pediatrics is the low number of measurements because data
acquisition is time-consuming, expensive, and difficult. Consequently, sample sizes are often small,
which is discussed in multiple articles (Tanaka (1987); Cantinotti et al. (2017); Williams,
Thomson, Seto, Contopoulos-Ioannidis, Ioannidis, Curtis, Constantin, Batmanabane, Hartling,
and Klassen (2012)). One solution to this issue is to acquire data in multicenter studies, as
was done for our example dataset (Krell et al. (2018)). Merging of data can be easily managed
with RefCurv, and the effect of different training datasets on the resulting reference curves can
be quickly tested.
4.1. Reuse potential
We demonstrated the software on echocardiographical data of children. Apart from pediatric
applications, reference curves play an important role in many other disciplines of medicine.
As an example, they could be used to describe the growth process of organs or the effect of
a drug. Beyond that, every natural science or technical environment requires references in
order to analyze and improve processes. Thus, RefCurv’s application field is flexible.
The application of the LMS method has proven valuable for pediatric reference
curves. In addition, other model classes and distributions, as listed in Stasinopoulos et al. (2007),
are accessible through the advanced model fitting in RefCurv. Therefore, this software is flexible
and can be adjusted according to research hypothesis, theory, or purpose.
Due to its easy-to-use GUI, the application does not need any extended training but can be
applied quickly. After the installation, data visualization, model fitting and reference curve
analysis are intuitive.
RefCurv uses GAMLSS models in a Python environment, which opens the door for
combinations with other Python software packages. The SimVascular project (Updegrove, Wilson,
Merkow, Lan, Marsden, and Shadden (2017)), for example, offers methods to model the
cardiovascular system and provides a Python interface. Computation results in SimVascular are,
however, complex and difficult for physicians to understand. RefCurv could help to translate
SimVascular’s output into reference curves, which are easy to understand for pediatric
cardiologists, for example. We are currently working on a connection to the SimVascular framework.
Altogether, we can recommend this software for students and researchers in any field who
plan to construct reference curves. Likewise, the software can be used for educational purposes
at all levels. For clinicians, this tool can help to understand the methods underlying the
construction of percentile curves and their challenges, such as the tuning of hyperparameters.
In science and especially in medical science, the usage of proprietary software with restricted
access to the code is unfortunately standard practice. This makes it hard for researchers
to understand and reproduce the results of other publications. It also restricts
scientists from sharing and contributing to other works. This project is entirely open-source,
and the source code was released under GPLv3. We encourage other working groups
to develop RefCurv further and share their knowledge about reference curves so that the
scientific community can profit from it.
4.2. Conclusion
In this paper, we presented RefCurv, a software package that enables the construction of
reference curves. The software uses the statistical methods of the gamlss package in R and provides
a user-friendly GUI for data visualization written in Python. Combining both,
RefCurv provides a clearly structured workflow from data to reference curves. The main
features of this software are model fitting, model selection, sensitivity analysis, and model
validation.
In the present article, we showed by example how RefCurv can improve the application of
GAMLSS models. As a result, this package can now also be used by physicians and
non-technical users.
Due to these advantages, RefCurv could help to improve clinical studies and to reduce time and
costs. We showed how to systematically design studies according to sample size, subject group,
and medical parameters. In conclusion, a well-designed plan can help to create high-quality
reference curves.
Acknowledgments
This study is funded by Fördergemeinschaft Deutsche Kinderherzzentren e.V.
We thank Rupert Hammen and Jochen Kunkel for their help in testing RefCurv.
References
Ataei N, Hosseini M, Fayaz M, Navidi I, Taghiloo A, Kalantari K, Ataei F (2016). “Blood
pressure percentiles by age and height for children and adolescents in Tehran, Iran.” Journal
of human hypertension, 30(4), 268.
Cacciari E, Milani S, Balsamo A, Spada E, Bona G, Cavallo L, Cerutti F, Gargantini L,
Greggio N, Tonini G, et al. (2006). “Italian cross-sectional growth charts for height, weight
and BMI (2 to 20 yr).” Journal of endocrinological investigation, 29(7), 581–593.
Cantinotti M, Kutty S, Franchi E, Paterni M, Scalese M, Iervasi G, Koestenberger M (2017).
“Pediatric echocardiographic nomograms: what has been done and what still needs to be
done.” Trends in cardiovascular medicine, 27(5), 336–349.
Cantinotti M, Scalese M, Franchi E, Corana G, Viacava C, Assanta N, Santoro G,
Koestenberger M (2018). “Why Use Percentiles and Not Z Scores to Calculate Pediatric
Echocardiographic Nomograms? The Need for a Uniform Approach to Data Normalization.”
Journal of the American Society of Echocardiography.
Cole TJ (1990). “The LMS method for constructing normalized growth standards.” European
journal of clinical nutrition, 44(1), 45–60.
Cole TJ, Freeman JV, Preece MA (1995). “Body mass index reference curves for the UK,
1990.” Archives of disease in childhood, 73(1), 25–29.
Cole TJ, Freeman JV, Preece MA (1998). “British 1990 growth reference centiles for weight,
height, body mass index and head circumference fitted by maximum penalized likelihood.”
Statistics in medicine, 17(4), 407–429.
Cole TJ, Green PJ (1992). “Smoothing reference centile curves: the LMS method and
penalized likelihood.” Statistics in medicine, 11(10), 1305–1319.
Dallaire F, Dahdah N (2011). “New equations and a critical appraisal of coronary artery Z
scores in healthy children.” Journal of the American Society of Echocardiography, 24(1),
60–74.
Fenton T, Sauve R (2007). “Using the LMS method to calculate z-scores for the Fenton
preterm infant growth chart.” European journal of clinical nutrition, 61(12), 1380.
Fredriks AM, Van Buuren S, Burgmeijer RJ, Meulmeester JF, Beuker RJ, Brugman E, Roede
MJ, Verloove-Vanhorick SP, Wit JM (2000a). “Continuing positive secular growth change
in The Netherlands 1955–1997.” Pediatric research, 47(3), 316.
Fredriks AM, van Buuren S, Wit JM, Verloove-Vanhorick S (2000b). “Body index
measurements in 1996–7 compared with 1980.” Archives of disease in childhood, 82(2), 107–112.
Group WMGRS, de Onis M (2006). “WHO Child Growth Standards based on length/height,
weight and age.” Acta paediatrica, 95, 76–85.
Hirschler V, Molinari C, Maccallini G, Hidalgo M, Gonzalez C, de los Cobres Study Group SA
(2016). “Waist circumference percentiles in indigenous Argentinean school children living
at high altitudes.” Childhood Obesity, 12(1), 77–85.
Katzmarzyk P (2004). “Waist circumference percentiles for Canadian youth 11–18 y of age.”
European journal of clinical nutrition, 58(7), 1011.
Khadilkar A, Ekbote V, Chiplonkar S, Khadilkar V, Kajale N, Kulkarni S, Parthasarathy L,
Arya A, Bhattacharya A, Agarwal S (2014). “Waist circumference percentiles in 2-18 year
old Indian children.” The Journal of pediatrics, 164(6), 1358–1362.
Kobayashi T, Fuse S, Sakamoto N, Mikami M, Ogawa S, Hamaoka K, Arakaki Y, Nakamura
T, Nagasawa H, Kato T, et al. (2016). “A new Z score curve of the coronary arterial
internal diameter using the lambda-mu-sigma method in a pediatric population.” Journal
of the American Society of Echocardiography, 29(8), 794–801.
Krell K, Laser KT, Dalla-Pozza R, Winkler C, Hildebrandt U, Kececioglu D, Breuer J,
Herberg U (2018). “Real-Time Three-Dimensional Echocardiography of the Left
Ventricle—Pediatric Percentiles and Head-to-Head Comparison of Different Contour-Finding
Algorithms: A Multicenter Study.” Journal of the American Society of Echocardiography,
31(6), 702–711.
Mawad W, Drolet C, Dahdah N, Dallaire F (2013). “A review and critique of the statistical
methods used to generate reference values in pediatric echocardiography.” Journal of the
American Society of Echocardiography, 26(1), 29–37.
Mul D, Fredriks AM, Van Buuren S, Oostdijk W, Verloove-Vanhorick SP, Wit JM (2001).
“Pubertal development in the Netherlands 1965–1997.” Pediatric research, 50(4), 479.
Neuhauser H, Schienkiewitz A, Rosario AS, Dortschy R, Kurth BM (2013).
“Referenzperzentile für anthropometrische Maßzahlen und Blutdruck aus der Studie zur
Gesundheit von Kindern und Jugendlichen in Deutschland (KiGGS).”
Nysom K, Mølgaard C, Hutchings B, Michaelsen KF (2001). “Body mass index of 0 to
45-y-old Danes: reference values and comparison with published European reference values.”
International journal of obesity, 25(2), 177.
Rigby RA, Stasinopoulos DM (2005). “Generalized additive models for location, scale and
shape.” Journal of the Royal Statistical Society: Series C (Applied Statistics), 54(3),
507–554.
Stasinopoulos DM, Rigby RA, et al. (2007). “Generalized additive models for location scale
and shape (GAMLSS) in R.” Journal of Statistical Software, 23(7), 1–46.
Stasinopoulos MD, Rigby RA, Heller GZ, Voudouris V, De Bastiani F (2017). Flexible
regression and smoothing: using GAMLSS in R. Chapman and Hall/CRC.
Tanaka JS (1987). “‘How big is big enough?’: Sample size and goodness of fit in structural
equation models with latent variables.” Child development, pp. 134–146.
Updegrove A, Wilson NM, Merkow J, Lan H, Marsden AL, Shadden SC (2017). “SimVascular:
An open source pipeline for cardiovascular simulation.” Annals of biomedical engineering,
45(3), 525–541.
Williams K, Thomson D, Seto I, Contopoulos-Ioannidis DG, Ioannidis JP, Curtis S,
Constantin E, Batmanabane G, Hartling L, Klassen T (2012). “Standard 6: age groups for
pediatric trials.” Pediatrics, 129(Supplement 3), S153–S160.
A. The LMS method by Cole
The LMS method is a special case of a generalized additive model and was originally proposed
by Cole (1990). In summary, the approach can be formulated as a univariate nonparametric GAM.
Let $Y = (y_1, y_2, \ldots, y_n)^T$, with $y_i > 0$ for all $i$, be a positive random variable
with $n$ observations. The explanatory variable is defined by $X = (x_1, x_2, \ldots, x_n)^T$.
The model is defined by the parameters $L$, $M$, and $S$, where $L$ is the skewness parameter,
$M$ the location parameter, and $S$ the scale parameter.
$Y$ is assumed to follow a Box-Cox Cole and Green (BCCG) distribution, denoted by
BCCG($M$, $S$, $L$). A transformed random variable $Z$ is given by
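$$
Z =
\begin{cases}
\dfrac{1}{LS}\left[\left(\dfrac{Y}{M}\right)^{L} - 1\right], & \text{if } L \neq 0 \\[2ex]
\dfrac{1}{S}\log\left(\dfrac{Y}{M}\right), & \text{if } L = 0
\end{cases}
\tag{1}
$$
for $0 < Y < \infty$, where $M > 0$, $S > 0$, and $-\infty < L < \infty$, and where the random
variable $Z$ is assumed to follow a truncated standard normal distribution.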
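As a worked illustration of transformation (1), the following Python sketch computes the z-score of a single measurement and converts it to a percentile rank. The function names are chosen for this example only, and the percentile rank uses the untruncated standard normal cdf, which is an approximation when the truncation of $Z$ matters.

```python
from math import log
from statistics import NormalDist


def lms_z_score(y, L, M, S):
    """Z-score of a positive measurement y according to transformation (1)."""
    if y <= 0 or M <= 0 or S <= 0:
        raise ValueError("y, M and S must be positive")
    if L == 0:
        return log(y / M) / S
    return ((y / M) ** L - 1.0) / (L * S)


def percentile_rank(y, L, M, S):
    """Percentile rank (0-100) of y, ignoring the truncation of Z."""
    return 100.0 * NormalDist().cdf(lms_z_score(y, L, M, S))


# Example with L = 1, M = 45, S = 0.1: z = (49.5 / 45 - 1) / 0.1, i.e. about 1.
z = lms_z_score(49.5, 1.0, 45.0, 0.1)
rank = percentile_rank(49.5, 1.0, 45.0, 0.1)
```

A measurement equal to $M$ always has a z-score of zero, regardless of $L$ and $S$, which is why $M$ traces the median curve.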
The probability density function for one observation $y$ and its transform $z$ is given by
$$
f_Y(y) = \frac{y^{L-1} \exp\left(-\frac{1}{2} z^2\right)}
              {M^{L} S \sqrt{2\pi}\, \Phi\left(\frac{1}{S|L|}\right)}
\tag{2}
$$
where $\Phi(\cdot)$ is the cumulative distribution function (cdf) of a standard normal distribution.
Figure 21 shows the probability density function for different values of L, M, and S.
Figure 21: The probability density function $f_Y(y)$ for the BCCG distribution
with different values for L, M, and S. Parameter values: (a) L = 1, M = (40, 45, 50), S
= 0.1; (b) L = (1, 10, 15), M = 45, S = 0.1; (c) L = 1, M = 45, S = (0.08, 0.1, 0.14).
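Density (2) is straightforward to evaluate numerically. The sketch below is a plain-Python illustration rather than the gamlss implementation; for $L = 0$ it uses the log-normal limit, in which the truncation factor equals one. A simple Riemann sum checks that the density integrates to approximately one for the parameter values of panel (a).

```python
from math import exp, log, pi, sqrt
from statistics import NormalDist


def bccg_pdf(y, L, M, S):
    """Density (2) of the BCCG distribution at y > 0."""
    if L == 0:
        z = log(y / M) / S
        truncation = 1.0  # limit of Phi(1 / (S |L|)) as L approaches 0
    else:
        z = ((y / M) ** L - 1.0) / (L * S)
        truncation = NormalDist().cdf(1.0 / (S * abs(L)))
    numerator = y ** (L - 1.0) * exp(-0.5 * z * z)
    return numerator / (M ** L * S * sqrt(2.0 * pi) * truncation)


# Riemann sum over y = 0.01, 0.02, ..., 100 for L = 1, M = 45, S = 0.1.
step = 0.01
grid = [step * i for i in range(1, 10001)]
area = sum(bccg_pdf(y, 1.0, 45.0, 0.1) for y in grid) * step
```

For $L = 1$ the density reduces to a normal density with mean $M$ and standard deviation $M S$, truncated at zero, which matches the symmetric curves in panel (a).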
Choosing BCCG, the additive model has the form
$$
\begin{aligned}
M &= h_1(x) \\
\log(S) &= h_2(x) \\
L &= h_3(x)
\end{aligned}
\tag{3}
$$
where $h_i(\cdot)$ (for $i = 1, 2, 3$) are nonparametric smoothing functions. Originally, cubic
splines cs() were used as smoothing functions. As an alternative to the classic approach,
penalized splines were introduced by Eilers and Marx (1996). Penalized splines (or P-splines)
are piecewise polynomials defined by B-spline basis functions in the explanatory variable,
where the coefficients of the basis functions are penalized to guarantee sufficient smoothness
(Stasinopoulos et al. (2007)). The gamlss package offers the function pb() for fitting penalized
splines, where df is the desired equivalent number of degrees of freedom.
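Once the smooth functions in (3) have been estimated, percentile curves follow by solving transformation (1) for $Y$ at a fixed z-score. The short derivation below makes this step explicit; the symbols $C_\alpha(x)$ and $z_\alpha$ are notation introduced here for illustration.

```latex
% Solving (1) for Y at Z = z_\alpha, where z_\alpha = \Phi^{-1}(\alpha):
C_\alpha(x) =
\begin{cases}
M(x)\,\bigl[1 + L(x)\,S(x)\,z_\alpha\bigr]^{1/L(x)}, & \text{if } L(x) \neq 0 \\[1ex]
M(x)\,\exp\bigl(S(x)\,z_\alpha\bigr), & \text{if } L(x) = 0
\end{cases}
% Example: \alpha = 0.5 gives z_\alpha = 0 and therefore C_{0.5}(x) = M(x).
```

In particular, the 50th percentile curve coincides with $M(x)$, and the 3rd and 97th percentile curves correspond to $z_\alpha \approx \mp 1.88$.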
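The model with the nonparametric functions $h_k$ ($k = 1, 2, 3$) is fitted by maximizing the
penalized log likelihood function $l_p$, which is defined as
$$
\begin{aligned}
l_p &= l_d - \frac{1}{2} \sum_{k=1}^{3} \lambda_k \int_{-\infty}^{\infty} \left[h_k''(x)\right]^2 dx \\
    &= l_d - \frac{1}{2} \lambda_1 \int_{-\infty}^{\infty} \left[h_1''(x)\right]^2 dx
           - \frac{1}{2} \lambda_2 \int_{-\infty}^{\infty} \left[h_2''(x)\right]^2 dx
           - \frac{1}{2} \lambda_3 \int_{-\infty}^{\infty} \left[h_3''(x)\right]^2 dx
\end{aligned}
\tag{4}
$$
where $h_k''(x)$ is the second derivative of $h_k(x)$ with respect to $x$. $\lambda_1$,
$\lambda_2$, and $\lambda_3$ are smoothing parameters, which have to be predefined.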
The log likelihood function of the data is
$$
l_d = \sum_{i=1}^{n} l_i
\tag{5}
$$
where $l_i$ is the log likelihood function of observation $y_i$, which can be computed with (2).
The penalized log likelihood function (4) is maximized iteratively using either the RS() algorithm
(Rigby and Stasinopoulos (2005)) or the CG() algorithm (Cole and Green (1992)), both of which in
turn use a backfitting algorithm to perform each step of the Fisher scoring procedure.
In summary, the LMS method can be applied to a training dataset dataset_training by using
the following piece of code:
LMS_model <- gamlss(y ~ pb(x, df = M_df),
                    sigma.formula = ~ pb(x, df = S_df),
                    nu.formula = ~ pb(x, df = L_df),
                    family = "BCCG",
                    method = RS(),
                    data = dataset_training)
B. RefCurv - Installation and Software Architecture
RefCurv is currently available as version 0.4.2 for Windows (32-bit) and Linux. Installation
instructions for all systems can be found at https://refcurv.com. The source code for each
version can be found in the GitHub repository of RefCurv.
For Windows, RefCurv 0.4.2 comes as a complete package and does not require any other
dependencies to be installed. We tested the software with the versions mentioned below.
The main program is written in Python (3.4.0, 32-bit) and relies on the following packages
(versions in parentheses):
• numpy (1.14.2)
• scipy (1.1.0)
• matplotlib (2.2.2)
• pandas (0.22.0)
• PyQt4 (4.11.4)
Furthermore, RefCurv is based on R (3.5.2, 32-bit) and the gamlss add-on package (5.1-2) as
the statistical engine.
C. Bayesian Information Criterion (BIC)
The Bayesian information criterion (BIC) or Schwarz information criterion (also SIC, SBC,
SBIC) is a criterion for model selection. It is typically used to choose among models with
different hyperparameter settings. The model with the lowest BIC is preferred.
The BIC is defined as
$$
\mathrm{BIC} = \ln(n)\,k - 2\,\hat{l}_d
\tag{6}
$$
where $\hat{l}_d$ is the maximized value of the log likelihood function $l_d$ (5), $n$ is the
number of observations, and $k$ is the number of parameters estimated by the model.
The BIC can help to find a compromise between model complexity and goodness of fit. On
the one hand, it penalizes high complexity with the term $\ln(n)\,k$. On the other hand, the
goodness of fit is represented by the term $-2\,\hat{l}_d$. A high goodness of fit will result
in a low BIC.
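The trade-off can be made concrete with a small numerical example, written here in terms of the maximized log likelihood. The two fits and all numbers below are made up for illustration; they are not results from the example dataset.

```python
from math import log


def bic(log_likelihood, n_observations, n_parameters):
    """BIC = ln(n) * k - 2 * (maximized log likelihood)."""
    return log(n_observations) * n_parameters - 2.0 * log_likelihood


# Two hypothetical fits to the same n = 300 observations.
bic_simple = bic(log_likelihood=-960.0, n_observations=300, n_parameters=4)
bic_flexible = bic(log_likelihood=-955.0, n_observations=300, n_parameters=9)
preferred = "simple" if bic_simple < bic_flexible else "flexible"
```

Here the flexible model fits slightly better, but its five additional parameters cost more than the gain in log likelihood, so the simpler model has the lower BIC.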
D. LMS parameter estimation from percentile curves
Fenton and Sauve (2007) proposed using Cole’s methods to estimate the LMS parameters
from percentile curves. They used the Fenton growth chart for preterm infants and generated
new percentile curves from the estimated and smoothed LMS parameters. As a result, they
found the new curves to be similar to the original curves.
This approach can help to use existing charts for z-score prediction of new subjects.
Therefore, we implemented an automated feature to estimate the LMS parameter values for a
given chart.
Figure 22 shows percentile curves and the BCCG probability density functions at three different
positions of the covariate, x = (44.2, 110.6, 177.0). LMS parameter values were estimated
by fitting the probability density function to the percentile curves. The results of the
estimation from percentile curves are L, M, and S over the range of the covariate, as shown in
Figure 23.
Figure 22: LMS parameter estimation from percentile curves. The BCCG distribution
was fitted to the percentile values. The density functions for three different positions
of the covariate x = (44.2, 110.6, 177.0) are highlighted.
Figure 23: LMS parameter values against the covariate.
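To illustrate the principle at a single covariate position, the following sketch recovers L, M, and S from only three percentile values (3rd, 50th, and 97th). This is a simplified approach chosen for illustration; it is neither the procedure of Fenton and Sauve (2007) nor RefCurv's implementation, which fits the density to the percentile curves. It uses the fact that, by (1), the two symmetric percentiles satisfy $(p_{97}/M)^L + (p_{3}/M)^L = 2$, and solves this equation for $L$ by bisection.

```python
from math import log
from statistics import NormalDist


def estimate_lms(p_low, p_median, p_high, tail=0.03):
    """Recover (L, M, S) from the tail-th, 50th and (1 - tail)-th percentile values."""
    z0 = NormalDist().inv_cdf(1.0 - tail)
    M = p_median
    a, b = p_high / M, p_low / M  # a > 1 > b > 0

    def g(L):
        # From (1): a**L = 1 + L*S*z0 and b**L = 1 - L*S*z0, so a**L + b**L = 2.
        return a ** L + b ** L - 2.0

    slope = log(a * b)  # derivative of g at the trivial root L = 0
    if abs(slope) < 1e-6:  # symmetric on the log scale: treat as L = 0
        return 0.0, M, log(a) / z0
    sign = -1.0 if slope > 0 else 1.0  # side of zero with the non-trivial root
    near, far = sign * 1e-6, sign * 20.0
    if g(near) >= 0 or g(far) <= 0:
        raise ValueError("no root of g found in the search interval")
    for _ in range(200):  # bisection: keep g(near) < 0 <= g(far)
        mid = 0.5 * (near + far)
        if g(mid) < 0:
            near = mid
        else:
            far = mid
    L = 0.5 * (near + far)
    S = (a ** L - 1.0) / (L * z0)
    return L, M, S


# Round trip: percentiles generated from L = 1, M = 45, S = 0.1 (made-up values).
z0 = NormalDist().inv_cdf(0.97)
p3, p50, p97 = 45.0 * (1.0 - 0.1 * z0), 45.0, 45.0 * (1.0 + 0.1 * z0)
L_hat, M_hat, S_hat = estimate_lms(p3, p50, p97)
```

Repeating such an estimate at many covariate positions and smoothing the results would give parameter curves of the kind shown in Figure 23; using more than three percentiles, as a least-squares fit would, makes the estimate less sensitive to rounding in the published chart.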