Social Indicators Research (2022) 161:907–936
https://doi.org/10.1007/s11205020024255
ORIGINAL RESEARCH
Composite-Based Path Modeling for Conditional Quantiles
Prediction. An Application to Assess Health Differences
at Local Level in a Well-Being Perspective
Cristina Davino¹ · Pasquale Dolce² · Stefania Taralli³ · Domenico Vistocco⁴
Accepted: 28 June 2020 / Published online: 24 July 2020
© The Author(s) 2020
Abstract
Quantile composite-based path modeling is a recent extension of conventional partial
least squares path modeling. It estimates the effects that predictors exert on the whole
conditional distributions of the outcomes involved in path models and provides a
comprehensive view of the structure of the relationships among the variables. This method can
also be used in a predictive way, as it estimates model parameters for each quantile of
interest and provides conditional quantile predictions for the manifest variables of the outcome
blocks. Quantile composite-based path modeling is shown in action on real data concerning
well-being indicators. Health outcomes are assessed taking into account the effects
of Economic well-being and Education. In fact, to support an accurate evaluation of
regional performances, the conditions within which the outcomes arise should be properly
considered. Assessing health inequalities in this multidimensional perspective can highlight
unobserved heterogeneity and contribute to advances in knowledge about the dynamics
producing well-being outcomes at the local level.
Keywords PLS path modeling · Quantile composite-based path modeling · Conditional
quantile prediction · Well-being · Territorial inequalities · Health indicators
* Cristina Davino
cristina.davino@unina.it
Pasquale Dolce
pasquale.dolce@unina.it
Stefania Taralli
taralli@istat.it
Domenico Vistocco
domenico.vistocco@unina.it
¹ Department of Economics and Statistics, University of Naples Federico II, Naples, Italy
² Department of Public Health, University of Naples Federico II, Naples, Italy
³ ISTAT, Roma, Italy
⁴ Department of Political Science, University of Naples Federico II, Naples, Italy
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
1 Introduction
This paper outlines the methodology and the results of a study concerning the relationships
among three well-being domains (Education, Economic Well-Being and Health) measured
on Italian provinces. Data come from the Italian system of indicators on Equitable and
Sustainable Well-Being (Benessere Equo e Sostenibile, BES) proposed by the National
Institute of Statistics (ISTAT 2018). BES represents a well-established reference database
in the national and international debate on the search for alternative well-being measures.
The present paper advances the work presented in Davino et al. (2017, 2018), where a
hierarchical composite model was used to study the relationships among components of the
BES. The proposal exploits quantile regression (QR) (Koenker and Bassett 1978; Koenker 2005)
to obtain the best predictions in a network of simultaneous equations.
The introduction of structural equation modeling has been a turning point in the analysis
of complex relationships among unobservable variables, first in its hard modeling, or
covariance-based, approach (Jöreskog 1978), and then in the soft modeling, or
composite-based, approach (Wold 1982; Tenenhaus et al. 2005). This paper embraces the soft
modeling approach, which does not require any distributional assumption on the variables and
exploits non-parametric methods to estimate the model parameters. Partial Least Squares
Path Modeling (PLS–PM) (Wold 1985; Tenenhaus et al. 2005) is one of the most widespread
composite-based methods for structural equation modeling. Nowadays, PLS–PM is a
well-established method both in the statistical literature (Esposito Vinzi et al. 2010) and in
applied research in several disciplines (Henseler et al. 2009; Hair et al. 2012;
Sarstedt et al. 2017; Di Napoli et al. 2019).
Recently, a quantile approach to PLS–PM called Quantile Composite-based Path Modeling
(QC–PM) was proposed by Davino and Esposito Vinzi (2016) to broaden the potential
of PLS–PM. To this end, QC–PM exploits QR (Koenker and Bassett 1978; Koenker 2005)
in all the steps of the PLS–PM estimation algorithm. This makes it possible to highlight
whether and how the relationships among variables change according to the quantile of
interest. It is worth emphasizing that QC–PM is not an alternative to PLS–PM, but rather its
ideal complement. PLS–PM aims to estimate the effect of the involved variables on the
conditional mean of the responses, whereas QC–PM extends the focus to the whole conditional
distribution; PLS–PM provides an effective summary of the dependence structure, whereas
QC–PM is a useful tool to magnify it.
The present study deals with the use of composite-based models for predictive purposes.
In fact, PLS–PM cannot easily be used for predictive modeling, because the network of
relationships is complex and the identification of a single direction to be explained
is troublesome. In this regard, see Evermann and Tate (2016), Shmueli et al. (2016) and
Dolce et al. (2017). Starting from the explicit and general formulation of the predictive
model introduced by Dolce and Hanafi (2017) and used in Dolce et al. (2018), the present
paper proposes a predictive-oriented QC–PM, namely a model able to provide the best
prediction for each statistical unit. It extends the approach proposed by Davino and
Vistocco (2015, 2018), which was aimed at identifying a typology in a dependence model. The
authors introduced the “best quantile” for each unit, i.e. the quantile associated with the
conditional model that provides the best estimate of the response variable (the best model
for each unit). This paper exploits the “best quantile” approach in composite-based path
modeling in order to obtain accurate predictions by estimating several path models at
different quantiles.
Even if the risk of overfitting is always lurking when we deal with prediction, it is not
a concern in the case considered here, since we aim to provide the best in-sample
predictions and not to generalize to different data. Indeed, the analyzed dataset contains all
the population units, namely the Italian provinces, and we aim to define the most accurate
model for each Health indicator and the best predictive model for each province.
2 The Reference Framework
The empirical application focuses on the prediction of health levels and health inequalities
in a regional well-being perspective. BES is the reference framework. It consists of a wide
set of about 130 statistical indicators produced by the Italian National Institute of Statistics
(Istat) to describe and monitor the progress of Italian society from a social and
environmental point of view in a comprehensive way (ISTAT 2013, 2018).
Health is one of the 12 domains of well-being considered in the BES framework,
together with Education and training, Work and life balance, Economic well-being, Social
relationships, Politics and Institutions, Safety, Subjective well-being, Landscape and
cultural heritage, Environment, Innovation, research and creativity, and Quality of services.
In the BES framework, Health is seen as a central element in life and an essential
condition for people’s well-being and the prosperity of populations. In fact, Health outcomes
are related to many dimensions of individual and social well-being. Among the multiple
relationships that link Health to the other BES domains and assets, we focus here on
two related domains: Education and training and Economic well-being. In a well-being
perspective, Education does not only have an intrinsic value, but it also directly affects other
well-being domains. People with higher education levels have higher standards of living
and more possibilities to find work; they live longer and better because they have healthier
lifestyles, easier access to services and more opportunities to find less risky jobs. Similarly,
Economic well-being is both an asset of BES and a driver of the well-being outcomes in
other domains. Indeed, earning capacities and economic resources ensure that an individual
can obtain and support a specific standard of living.
Recent studies (Costa et al. 2014; ISTAT 2019b; Murtin et al. 2017; Petrelli et al. 2019)
focused on the relationships between health and socio-economic conditions at the individual
level. They confirm that regional disparities in health outcomes are still marked in Italy,
both in terms of life expectancy at birth and mortality risk. Health inequalities among
Italian regions arise regardless of age, gender and socio-economic status, but they clearly
appear to be related to socio-economic factors, as they have a higher impact in the poorer
southern regions of Italy. Furthermore, lower education levels explain a considerable
proportion of mortality risk, although with different effects by geographical area: males with
a lower education level throughout Italy have a life expectancy at birth that is 3 years less
than those with higher education; residents in southern Italy lose an additional year in life
expectancy, regardless of education level. Other studies highlight that health inequalities
are more severe within the southern Italian regions than within the northern ones. The
single and joint effects of education and income factors are remarkable: mortality inequalities
between better educated and less educated people explain overall about 25% of deaths
among men and more than 10% among women. The differences in health outcomes among
the Italian regions also result from local policies, as the Italian Regional Administrations
have the main power to regulate and organize the public health services; at the
sub-regional level, instead, the other differentiating factors gain greater importance.
Therefore, the impact of economic factors and of education levels deserves closer
analysis. In particular, it would be useful to obtain a model to predict health outcomes
controlling for the factors that affect Education and Economic Well-Being.
3 Methodological Framework
3.1 Partial Least Squares Path Modeling
PLS–PM, originally developed by Wold (1982, 1985), is a powerful multivariate statistical
method that can be applied to the study of the relationships among K blocks of observed
variables. Such blocks of observed or manifest variables (MVs),
$\mathbf{X} = [\mathbf{X}_1, \ldots, \mathbf{X}_K]$, measure K latent variables (LVs),
$\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_K$, usually named components or composites.
PLS–PM is commonly considered an alternative approach to the covariance structure
analysis (Jöreskog 1978) of Structural Equation Modeling (SEM), although these two
approaches belong to two different families of statistical methods.
PLS–PM focuses on LV scores computation, accounting for variances of MVs and
correlations between LVs. Each block of MVs is summarized in a component, or composite
(i.e. an exact linear combination of the MVs), that maximizes the explained variance of the
set of MVs. Therefore, PLS–PM is commonly referred to as a component-based,
composite-based or variance-based approach. Great flexibility, robustness, and few demands
concerning distributional assumptions and identification requirements are the main features of
PLS–PM, and underpin its widespread dissemination in many areas (Esposito Vinzi et al.
2010; Hair et al. 2014).
More formally, let us consider that P variables are collected in a data table $\mathbf{X}$
partitioned in K blocks:

$$\mathbf{X} = [\mathbf{X}_1, \ldots, \mathbf{X}_J, \mathbf{X}_{J+1}, \ldots, \mathbf{X}_{J+Q}, \mathbf{X}_{J+Q+1}, \ldots, \mathbf{X}_K]$$

Let $\mathbf{X}_k = \{x_{ip_k}\}$ be the generic block, where
– the input blocks are in the first J positions,
– the intermediate blocks run from block $J+1$ to block $J+Q$,
– the output blocks run from block $J+Q+1$ to block K,
– $i = 1, \ldots, n$, with n denoting the number of observations,
– $p_k = 1, \ldots, P_k$, with $P_k$ being the number of MVs in the k-th block.
We denote by $\boldsymbol{\xi}_k = \{\xi_{ik}\}$ the corresponding LV for each block of
variables. A generic MV is instead denoted by $\mathbf{x}_{p_k} = \{x_{ip_k}\}$. The path
diagram in Fig. 1 shows an example of a simple path model with an input, an intermediate and
an output block of manifest variables.

Fig. 1 A path model with an input, an intermediate and an output block of manifest variables

The general model consists of two submodels: the inner (or structural) model and the
outer (or measurement) model. The measurement model relates each MV to its own LV by
the following equation:

$$\mathbf{x}_{p_k} = \lambda_{p_k 0} + \lambda_{p_k} \boldsymbol{\xi}_k + \boldsymbol{\epsilon}_{p_k}, \tag{1}$$

where $\lambda_{p_k 0}$ is a location parameter, $\lambda_{p_k}$ is the loading coefficient
that captures the effect of $\boldsymbol{\xi}_k$ on $\mathbf{x}_{p_k}$, and
$\boldsymbol{\epsilon}_{p_k}$ is the measurement error variable. The structural model captures
and specifies the dependence relationships among LVs. A generic dependent LV is linked to
the corresponding explanatory LVs by the following model:

$$\boldsymbol{\xi}_k = \beta_{k0} + \sum_{k' \rightarrow k} \beta_{kk'} \boldsymbol{\xi}_{k'} + \boldsymbol{\zeta}_k, \tag{2}$$

where $\beta_{kk'}$ is the so-called path coefficient that captures the effect of the predictor
LV $\boldsymbol{\xi}_{k'}$ on the dependent LV $\boldsymbol{\xi}_k$, and
$\boldsymbol{\zeta}_k$ is the inner residual variable.
The following weighting relation defines the case-wise scores of each LV as a weighted
aggregate of its own MVs:

$$\hat{\boldsymbol{\xi}}_k = \sum_{p=1}^{P_k} \hat{w}_{kp} \mathbf{x}_{p_k}, \tag{3}$$

where $\hat{w}_{kp}$ is the outer weight obtained through the PLS–PM iterative algorithm.
Since in PLS–PM there are different kinds of residual variables, a set of partial (or local)
least squares (PLS) criteria is defined and the optimal solution is found by an iterative
algorithm (Lohmöller 1989). In particular, the estimation of the model parameters in
Equations (1) and (2) proceeds in two stages. The first stage computes the outer weight
vectors $\hat{\mathbf{w}}_k$ in Equation (3), and consequently the composite
$\boldsymbol{\xi}_k$, through an iterative algorithm alternating simple and multiple OLS
regressions. The second stage estimates the loading coefficients $\lambda_{p_k}$ and the path
coefficients $\beta_{kk'}$ through classical OLS regressions.
The statistical and numerical properties of PLS–PM have been deeply investigated, and
interesting results were found in terms of global optimization criteria and convergence
properties (Glang 1988; Mathes 1993; Hanafi 2007; Krämer 2007; Tenenhaus and Tenenhaus
2011). Furthermore, recent methodological developments introduced features of the method
that make it possible to generate predictions from PLS path models.
The following subsection details prediction through PLS–PM.
3.2 Predictive‑Oriented PLS‑PM
PLS–PM is a powerful method for both explorative and predictive purposes. This is a
distinctive feature compared to covariance-based SEM, which mainly focuses on obtaining
valid inferences for population parameters. However, since its origin PLS–PM has been
almost exclusively used as an explanation-oriented technique. Only in recent years has the
predictive ability of PLS–PM started to attract increasing interest from researchers
(Evermann and Tate 2016; Shmueli et al. 2016; Dolce et al. 2017; Danks and Ray 2018; Shmueli
et al. 2019; Sharma et al. 2019). The restriction of PLS–PM to explanatory modeling
was due to the lack of an explicit formulation of the predictive model, because of the
complexity of the PLS path model. Two models are considered in PLS–PM, i.e. Equations (1)
and (2), and data are partitioned into three kinds of blocks: input (only used for prediction),
intermediate (used for prediction and as dependent blocks) and output blocks (only used as
dependent blocks). As a matter of fact, prediction in PLS–PM has been considered a
difficult task because a choice must be made between prediction from the measurement
model and prediction from the structural model. Moreover, prediction of individual
observations may refer either to individual LV score observations or to individual observations
of the MVs in the dependent blocks. Finally, intermediate blocks pose a special challenge in
the predictive context, because they play a twofold role in the model: they are both predictor
variable blocks and dependent variable blocks.
Lohmöller (1989) defines five different sorts of predictions from PLS–PM:
1. communality prediction: each MV is predicted by the corresponding LV (Equation 1);
2. structural prediction: the prediction of each LV is obtained using the related predictor
LVs (Equation 2);
3. validity prediction: the prediction of each LV is obtained using its own MVs (Equation 3);
4. redundancy prediction: each MV is predicted by the predictor LVs that are directly
connected to its own LV;
5. operative prediction: each MV is predicted using only the MVs of the predictor blocks
(all the LVs are replaced with their corresponding weight relation, Equation 3).
Despite this complexity, it is possible to generate predictions from PLS–PM, since
appropriate schemes were recently proposed (Shmueli et al. 2016; Dolce and Hanafi 2017).
The present paper uses the explicit and general formulation of the predictive model
proposed in Dolce and Hanafi (2017), which incorporates both the measurement and structural
models in a single model, and requires only MVs as predictors and outcomes, following
the operative prediction defined in Lohmöller (1989).
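As an illustration only, the chain of equations behind an operative prediction can be sketched in Python for the simplest case of one input block and one output block. This is not the general formulation of Dolce and Hanafi (2017), which also handles intermediate blocks; the function names and every numeric value below are hypothetical.

```python
from typing import List, Sequence, Tuple


def composite_score(mvs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighting relation (Eq. 3): LV score as a weighted sum of its own MVs."""
    if len(mvs) != len(weights):
        raise ValueError("one outer weight per manifest variable is required")
    return sum(w * x for w, x in zip(weights, mvs))


def operative_prediction(
    input_mvs: Sequence[float],
    outer_weights: Sequence[float],
    path_intercept: float,
    path_coef: float,
    loadings: Sequence[Tuple[float, float]],
) -> List[float]:
    """Predict the output-block MVs of one unit from its input-block MVs only.

    loadings holds one (location, loading) pair per output MV.
    """
    xi_in = composite_score(input_mvs, outer_weights)   # Eq. (3)
    xi_out = path_intercept + path_coef * xi_in         # Eq. (2), residual dropped
    return [lam0 + lam * xi_out for lam0, lam in loadings]  # Eq. (1), error dropped


# Hypothetical, made-up parameter values for a single unit.
preds = operative_prediction(
    input_mvs=[1.0, 2.0],
    outer_weights=[0.5, 0.25],
    path_intercept=0.0,
    path_coef=2.0,
    loadings=[(1.0, 0.5), (0.0, 1.0)],
)
print(preds)
```

The point of the sketch is that no LV appears among the inputs or outputs: the composite is only an intermediate quantity computed from the MVs of the predictor block.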
3.3 Quantile Composite‑Based Path Modeling
QC–PM was proposed by Davino and Esposito Vinzi (2016) to complement PLS–PM.
QC–PM exploits QR to explore the whole distribution of the dependent variables as a function
of the set of predictors. Since PLS–PM is based on simple and multiple OLS regressions,
its coefficients focus on the conditional means of the dependent variables. Even if this
provides an effective summary, in some cases the estimates of the coefficients may vary along
the distribution of the dependent variable. This happens in the presence of heteroscedastic
errors or highly skewed dependent variables. In such cases, PLS–PM may
give an incomplete picture of the relationships among variables. The quantile approach is
instead able to model the location, the scale and the shape of the responses.
QR was introduced by Koenker and Bassett (1978; see also Koenker 2005) to extend the
regression model from the conditional mean to any conditional quantile of interest. In linear
models, QR estimates have the same interpretation as those of any other linear model. The
intercept measures the response value when all the regressors are set to zero, and the slopes
measure the rates of change in the response per unit change in the value of the corresponding
regressor, keeping all the others constant. Since QR estimates a set of coefficients (intercept
and slopes) for each considered quantile, coefficients must be interpreted in terms of the
quantiles of the response. A dense grid of equally spaced quantiles can provide a fairly
accurate approximation of the whole quantile process (Furno and Vistocco 2018) and a
reconstruction of the whole conditional distribution of the response variable, because each
conditional quantile predicts the corresponding location of the response variable
(Davino et al. 2013).
QC–PM uses QR instead of OLS regression in all the estimation steps. In particular,
QC–PM introduces QR in the steps described in Sect. 3.1 to estimate the model
parameters in Equations (1), (2) and (3). For each quantile $\theta$ of interest,
$\theta \in (0, 1)$, the first stage computes the outer weight vectors $\mathbf{w}_k$ and each
composite $\boldsymbol{\xi}_k$ through an iterative algorithm alternating simple and
multiple QR. The second stage estimates the loading coefficients $\lambda_{p_k}(\theta)$ and
the path coefficients $\beta_{kk'}(\theta)$ through QR:

$$\mathbf{x}_{p_k}(\theta) = \lambda_{p_k 0}(\theta) + \lambda_{p_k}(\theta) \boldsymbol{\xi}_k + \boldsymbol{\epsilon}_{p_k}, \tag{4}$$

$$\boldsymbol{\xi}_k(\theta) = \beta_{k0}(\theta) + \sum_{k' \rightarrow k} \beta_{kk'}(\theta) \boldsymbol{\xi}_{k'} + \boldsymbol{\zeta}_k. \tag{5}$$

Similarly to the unconditional quantile minimisation (Fox and Rubin 1964), the conditional
quantile estimator is obtained by minimizing a weighted sum of absolute residuals. The
function to minimize in the case of Equation (4) is:

$$\hat{\boldsymbol{\lambda}}(\theta) = \underset{\boldsymbol{\lambda}(\theta)}{\operatorname{argmin}} \sum_{i=1}^{n} \rho_\theta \left( x_i - \boldsymbol{\xi}_i^{T} \boldsymbol{\lambda}(\theta) \right) \tag{6}$$

where the first element of $\boldsymbol{\xi}_i$ is equal to 1 to include the intercept, and
$\rho_\theta(\cdot)$ is the check function, which asymmetrically weights positive and negative
residuals, namely:

$$\rho_\theta(u) = \begin{cases} \theta u & \text{if } u > 0 \\ (\theta - 1) u & \text{if } u \le 0 \end{cases} \tag{7}$$

The same holds in the case of Equation (5).
For each quantile of interest, QC–PM provides a set of outer weights, loadings and path
coefficients. Therefore, it offers a more complete picture of the relationships among
variables both in the outer model (as the outer weights measure the effects of each MV on
the corresponding construct) and in the inner model (as the path coefficients quantify the
impact of lower-order constructs on higher-order constructs). However, in order to compare
path coefficients estimated over quantiles, measurement invariance has to be satisfied in the
models (Henseler et al. 2016). In other words, for each MV, all the loadings should be very
similar across quantiles and similar to the one estimated by PLS–PM, because the same
LV should be measured across quantiles. If loadings change across quantiles, there is no
guarantee that an LV is measuring the same concept, and path coefficients estimated at
different quantiles cannot be reliably compared.
In this situation, a possible solution may be to fix the quantile to the median in the
measurement model and let the quantiles change only in the structural part, following the
approach proposed by Wang et al. (2016) in factor-based structural equation modeling.
This approach is justified if we consider that the role of the measurement model is to relate
the MVs to the LV and to construct a score for the LV. Finally, measures of goodness of fit
and tests for evaluating the statistical significance of the coefficients typically used in
PLS–PM can be easily extended to the QC–PM approach (see Davino et al. 2016 for details).
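To make the role of the check function in Equation (7) concrete, the following Python sketch implements it and uses it for the simplest (unconditional) case, where the quantile of a sample is the value that minimizes the total check loss. The function names are illustrative assumptions, and the brute-force search over the observed values is chosen for readability; it is not how QR software estimates conditional quantiles.

```python
from typing import Iterable


def check_loss(u: float, theta: float) -> float:
    """Check function rho_theta(u): theta*u if u > 0, (theta - 1)*u otherwise."""
    return theta * u if u > 0 else (theta - 1.0) * u


def sample_quantile_by_check_loss(values: Iterable[float], theta: float) -> float:
    """Return the observed value that minimizes the total check loss.

    Restricting the search to observed values is enough, since a minimizer
    of the piecewise-linear objective can always be found among them.
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    data = sorted(values)
    return min(data, key=lambda q: sum(check_loss(x - q, theta) for x in data))


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(sample_quantile_by_check_loss(sample, 0.5))   # median of the sample
print(sample_quantile_by_check_loss(sample, 0.25))  # first quartile of the sample
```

Positive residuals are weighted by θ and negative ones by 1 − θ, so lowering θ pulls the minimizer towards the lower part of the distribution.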
4 Predictive‑Oriented QC‑PM
Exploiting the ability of QR to model the whole conditional distribution of a dependent
variable, Quantile Composite-based Path Modeling can be used in a prediction
perspective. As stated above, the use of a dense grid of equally spaced quantiles provides an
accurate approximation of the whole quantile process (Davino et al. 2013). The predictive
model and the proposed procedure aim to provide the best in-sample predictions (often
called fitted values). The computation of in-sample predictions requires estimating the
parameters of a PLS model by using a given data sample and then using the model to
predict values for cases of the same sample (Shmueli et al. 2016).
This paper proposes a two-step procedure to provide the best predictions of the outcome
MVs:
1. Estimation of several path models
The first step aims to estimate the specified path model for a dense grid of equally
spaced quantiles through the QC–PM algorithm. QC–PM provides m estimates for each
parameter of the model, m LV scores and m predictions of the outcome MVs, where m
is the number of the chosen quantiles. In particular, for the empirical analysis proposed
in this paper, we exploited a grid of quantiles $\theta$ varying from 0.01 to 0.99 with a
step of 0.01.
2. Identification of the best model for each outcome MV and for each unit
The second step aims to define the most predictive model for each outcome MV and
the most accurate model for each statistical observation (henceforth best quantile). To
achieve this goal it is necessary to compute the predictions corresponding to each
quantile. Considering the outcome blocks, a partitioned table of predictions
$\hat{\mathbf{X}}$ is obtained for each $\theta$:
$\hat{\mathbf{X}}(\theta) = [\hat{\mathbf{X}}_{J+Q+1}(\theta), \ldots, \hat{\mathbf{X}}_K(\theta)]$.
The generic element of a MV prediction,
$\hat{\mathbf{x}}_{p_k}(\theta) = \{\hat{x}_{ip_k}(\theta)\}$, represents the predicted value
of the MV $\mathbf{x}_{p_k}$ for the i-th unit according to the $\theta$-th quantile.
The best model for each unit i and for each dependent MV of the k-th block,
$\mathbf{x}_{p_k}$, is identified by the quantile that best predicts the variable, namely by
the quantile which minimizes the absolute difference between the observed value and the
estimated value:

$$\theta^{best}_{ip_k} = \underset{\theta = 1, \ldots, m}{\operatorname{argmin}} \left| x_{ip_k} - \hat{x}_{ip_k}(\theta) \right| \tag{8}$$

where $\theta^{best}_{ip_k}$ represents the quantile associated with the best predictive model
for each unit i and each indicator $p_k$ in each block k, while
$\hat{x}_{ip_k}(\theta^{best}_{ip_k})$ is the corresponding best prediction for $x_{ip_k}$.
The denser the quantile grid is, the more accurate the forecasts provided by the
predictive approach to QC–PM are. The best quantile provides an estimate of the unit's
position in the conditional distribution of the outcome variable. A comparison between the
vector of the best quantiles for a given dependent variable and the corresponding unconditional
quantile (namely the position of each unit in the observed MV) reveals the effect of the
structure of relationships on the prediction of each outcome variable.
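The selection rule in Equation (8) can be sketched in a few lines of Python. The sketch assumes that the QC–PM predictions for one unit and one outcome MV are already available as a mapping from quantile to predicted value; the helper names and the numbers in the usage example are hypothetical and do not come from the paper's data.

```python
from typing import Dict, List, Tuple


def quantile_grid() -> List[float]:
    """Grid of quantiles from 0.01 to 0.99 with a step of 0.01 (m = 99)."""
    return [k / 100 for k in range(1, 100)]


def best_quantile(
    observed: float, predictions_by_quantile: Dict[float, float]
) -> Tuple[float, float]:
    """Return (theta_best, best prediction) for one unit and one MV (Eq. 8).

    Ties are resolved in favour of the smallest theta, which is an
    arbitrary choice made only for this sketch.
    """
    theta_best = min(
        sorted(predictions_by_quantile),
        key=lambda t: abs(observed - predictions_by_quantile[t]),
    )
    return theta_best, predictions_by_quantile[theta_best]


# Hypothetical predictions of one outcome MV for one unit at three quantiles.
preds = {0.25: 79.8, 0.50: 80.5, 0.75: 81.3}
theta_best, x_hat = best_quantile(observed=81.0, predictions_by_quantile=preds)
print(theta_best, x_hat)
```

In a full application the mapping would hold one prediction for each of the m quantiles of the grid, and the function would be applied to every unit and every outcome MV.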
A very simple example can be used to clarify the added value of the proposal and the
different information that unconditional and conditional quantiles convey. Figure 2 shows
the scatterplot of two variables observed on a sample of 3000 units. The dependent
variable, represented on the vertical axis, has been generated from a Gamma model. Its
unconditional distribution is represented through a boxplot, a dotplot and a histogram on the
right-hand side of the figure, in gray. The regressor, represented on the horizontal
axis, is a numerical variable with six values. The conditional distributions of the response
on the six values of the regressor are depicted through the first six dotplots, starting from
the left-hand side. The plot portrays a scenario in which the standard deviation of the
response variable grows linearly as the regressor increases, while the skewness and the
excess kurtosis are constant and positive. Two observations are highlighted as an
example: the diamond denotes an observation located above the median of the dependent
variable distribution, with an unconditional quantile equal to 0.61; the triangle is placed
in the lower tail, its unconditional quantile being 0.35. This is evident from the position of
the two points in the gray right-hand boxplot. Starting from a dense grid of quantiles, the
corresponding QR models were estimated in order to identify the best quantile for each
observation, namely the one that minimizes the function in Equation (8). The two lines
depicted in Fig. 2 are the best QR models identified for the two example observations,
at the quantiles 0.31 for the diamond and 0.58 for the triangle, respectively. In this simple
example, it is easy to verify that the identified best quantiles correspond to the estimated
positions of the two points in the conditional distributions. The two points are indeed
highlighted according to their regressor value, namely in correspondence of the second and
fifth conditional distribution (from left). They lie on the two best models, whose $\theta$
values are equivalent to their position in the conditional distribution. This is further
highlighted by superimposing a boxplot in order to assess their position.

Fig. 2 An illustrative scatterplot to visualize the unconditional rank-quantiles (right grey dotplot) of two
points (diamond and triangle) and their corresponding positions in the conditional distributions (second and
fourth dotplot, from the left)
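The distinction between unconditional and conditional positions can be reproduced on a toy dataset. The Python sketch below is a simplification: because the regressor is discrete, it reads the conditional position from the empirical distribution within each regressor level rather than from fitted QR models, and the data are invented for illustration.

```python
from typing import Iterable


def empirical_position(value: float, sample: Iterable[float]) -> float:
    """Share of sample values that are less than or equal to value."""
    data = list(sample)
    return sum(1 for v in data if v <= value) / len(data)


# Toy data as (regressor level, response); spread grows with the regressor.
data = [
    (1, 1.0), (1, 2.0), (1, 3.0), (1, 4.0),
    (2, 3.0), (2, 5.0), (2, 7.0), (2, 9.0),
]
unit = (1, 4.0)

# Position of the unit among all responses.
unconditional = empirical_position(unit[1], [y for _, y in data])
# Position of the unit among responses sharing its regressor level.
conditional = empirical_position(unit[1], [y for x, y in data if x == unit[0]])
print(unconditional, conditional)
```

The selected unit sits near the middle of the overall distribution but at the top of the distribution conditional on its regressor value, which is the kind of gap the best quantile is meant to reveal.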
5 An Empirical Analysis: The Prediction of Health Indicators
5.1 Data Description
This paper proposes an exploration of the relationships among Health, Education and
training (henceforth EDU) and Economic well-being (henceforth ECO) using the “BES
measures at local level” (ISTAT 2019a), a subset of the BES indicators that are measured
and regularly updated by Istat on the 110 Italian provinces and metropolitan cities (i.e. at
NUTS3 level¹). The interest in such an application concerns both advances in knowledge
about the dynamics producing the well-being outcomes at local level (multiplier effects or
trade-offs) and a more complete evaluation of regional inequalities in well-being.
At the local level the well-being assets can strengthen each other, resulting in multiple
disadvantages or advantages. For this reason it is important to consider both the levels and
the relationships among BES indicators. Furthermore, equal results can be achieved in very
different contexts and conditions. So, in assessing or comparing the well-being outcomes,
the conditions within which outcomes arise should be properly considered, adopting a
multidimensional approach able to support an accurate evaluation of the regional performances.
Figure 3 shows the specified network of relations. EDU, ECO and Health are the
unobserved complex concepts that are measured as composites of the corresponding MVs
(squares in Fig. 3, detailed in Tables 3 and 4 in the “Appendix”). Even though the model
could be enriched by including further measures or domains, it still considers most of
the BES indicators that are currently produced by Istat at the NUTS3 level.
In the path model in Fig. 3, Health variables are placed as response variables of EDU
and ECO. The underlying hypothesis, supported by literature and empirical studies (Costa
et al. 2014; Mackenbach et al. 2008; Murtin et al. 2017; Petrelli et al. 2019), is that EDU
has a direct effect on Health and an indirect effect mediated by ECO. In fact, human capital
is a factor of both economic competitiveness and well-being, as higher education offers
more income opportunities and promotes lower vulnerability to health risks.
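In a linear path model, a direct effect and an effect mediated by a third construct are commonly combined with the product-of-coefficients rule: the indirect effect is the product of the path coefficients along the mediated route, and the total effect is the sum of direct and indirect effects. The Python sketch below illustrates this rule for the EDU, ECO and Health structure; the function name is an assumption and the coefficients are made-up numbers, not estimates from this study.

```python
from typing import Dict


def effect_decomposition(
    direct: float, edu_to_eco: float, eco_to_health: float
) -> Dict[str, float]:
    """Split the effect of EDU on Health into direct, indirect and total parts."""
    indirect = edu_to_eco * eco_to_health  # route EDU -> ECO -> Health
    return {"direct": direct, "indirect": indirect, "total": direct + indirect}


# Hypothetical path coefficients, for illustration only.
effects = effect_decomposition(direct=0.25, edu_to_eco=0.5, eco_to_health=0.5)
print(effects)
```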
With respect to the MVs, life expectancy at birth of males (O.1.1M) and females
(O.1.1F) and infant mortality rate (O.1.2.MEAN_aa) are the three indicators used to
measure the main global outcomes in the Health domain. EDU consists of indicators of
qualification (O.2.2; O.2.3), competences (O_2.7_2.8; O_2.7_2.8_AA), and participation in
education and lifelong learning (O.2.4; O.2.5aa; O.2.6). ECO is measured by indicators of
income and wealth (O.4.1, O.4.2, O.4.3, O.4.5) and economic difficulties (O.4.4aa; O.4.6aa).
All the indicators were positively oriented towards the BES (the higher the indicator value,
the greater the BES) to provide an easier interpretation of the results. The “aa” suffix
denotes those indicators that we reversed for this purpose². Data refer to the latest update
available (reference year is given in the last column of Table 4 in the “Appendix”).

¹ The acronym NUTS (from the French “Nomenclature des unités territoriales statistiques”) stands
for Nomenclature of Territorial Units for Statistics, that is, the European Statistical System official
classification for the territorial units. The NUTS is a partitioning of the EU territory for statistical
purposes based on local administrative units. The NUTS codes for Italy have three hierarchical levels:
NUTS1 (Groups of regions); NUTS2 (Regions); NUTS3 (Provinces and Metropolitan Cities).
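The exact reversal formula is not spelled out in the text. As a hedged illustration, the Python sketch below assumes that a negatively oriented indicator is reversed as the maximum minus the observed value, which keeps the original units and maps the worst original value to zero; this is an assumption of the sketch, and the rates used are invented.

```python
from typing import Iterable, List


def reverse_indicator(values: Iterable[float]) -> List[float]:
    """Reverse polarity as max(values) - v (assumed formula, see lead-in)."""
    data = list(values)
    top = max(data)
    return [top - v for v in data]


# Hypothetical rates for three provinces (higher original value = worse).
rates = [2.0, 3.5, 5.0]
print(reverse_indicator(rates))
```

Under this assumption, the range of the reversed indicator equals the range of the original one, and higher values consistently denote better outcomes.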
A preliminary analysis of the distribution of the MVs provides an examination of the
heterogeneity that can be observed in the distribution of well-being in the Italian provinces.
Figures 4, 5 and 6 show the violin plots, a combination of a box plot and a density plot. A
violin plot is obtained by rotating a density plot and mirroring it on each side. The length of
the vertical axis of each graph shows the range of the observed values, while
the shape highlights how values are distributed in terms of variability and skewness. The
black dot in the middle is the median value. Note that violin plots in different panels are not
always comparable, as the variables have different units of measurement and scales.
As expected, life expectancy at birth has a similar distribution for females (O.1.1F) and
males (O.1.1M). However, the median value is lower for men (80.5 years) than for women
(85.0 years). The gender gap (4.5 years comparing median values) is wide even looking at
the ranges of the distributions: the maximum for men (82.1 years) is smaller than the
minimum for women (82.8 years). In both cases the Italian provinces fall into ranges of
equal width, but male life expectancy has a more regular shape. Infant mortality is a rare
phenomenon, so the corresponding MV (O.1.2.MEAN_aa) has a high territorial and temporal
variability; for this reason the model was calculated on a three-year average. Even this
more aggregate measure reveals large differences among the Italian provinces. The range is
5.1 points per thousand between the province with the worst result (equal to zero in the
chart because the indicator was reversed) and the one with the best result. Most of the
Italian provinces cluster in the centre of the distribution, with few cases in the upper
and lower ends. Therefore the major differences concern a small number of cases. The
territorial heterogeneity in health outcomes does not have a clear geographical gradient.
The provinces of central and northern Italy more often have better results than those of
the south and islands, with many positive and negative exceptions (ISTAT 2019b).
Considering the MVs of the Health predictors, those of ECO are clearly the most
discriminating. In particular, all the indicators of income and wealth (O.4.1, O.4.2, O.4.3,
O.4.5) have very asymmetric and polarized distributions, showing a sharp division between
the group of provinces with the best economic outcomes and the group of the most penalized
ones. The density of the median class is always lower compared to these two opposite
groups. The shape of these graphs reflects a clear separation between the richer provinces
of northern and central Italy and the group of the southern and island ones. The same
division concerns the territorial distribution of low-income pensioners (O.4.4aa), which is
very asymmetrical. In the Education block, the widest asymmetries emerge for the
competences of young students (O_2.7_2.8) and the participation in education (O.2.5aa);
both these measures oppose the more disadvantaged southern Italian provinces to the
northern and central ones. Looking at the participation in lifelong learning (O.2.6) and at
the highest qualification levels of the population (O.2.3), the territorial heterogeneity
has a quite different shape, which becomes thinner and longer moving towards the best
outcomes; these therefore concern a few leading Italian provinces (namely the north-eastern
ones for the O.2.6 indicator and the northern metropolitan cities for the O.2.3 indicator).
Fig. 3 The PLS–PM model
2 To reverse the indicators we used the max-min method (OECD 2008).
Fig. 4 Violin plot of the Health indicators (Health block; panels: O.1.1F, O.1.1M, O.1.2.MEAN_aa)
Fig. 5 Violin plot of the Education indicators (Education block; panels: O.2.2, O.2.3, O.2.4,
O.2.5aa, O.2.6, O_2.7_2.8, O_2.7_2.8_AA)
After examining the distribution of each MV, it is necessary to check the internal
consistency of each block of MVs through Cronbach’s α and Dillon–Goldstein’s ρ, which need
to be greater than 0.7. For Cronbach’s α, confidence intervals are also reported following
a recent approach (Trinchera et al. 2018). Moreover, the average variance extracted (AVE)
is also considered (Tenenhaus et al. 2005). Table 1 shows that, for all the blocks,
internal consistency is satisfied and all the AVE values are greater than 0.5.
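As a rough check of two of these measures, the sketch below recomputes the AVE and Dillon–Goldstein’s ρ for the Education block from the rounded PLS–PM loadings reported in Table 2. It assumes standardized MVs, so that the error variance of each MV is 1 − λ², which is the usual textbook form and not necessarily the exact computation behind Table 1. Cronbach’s α needs the raw data and is not reproduced. Function names are illustrative.

```python
def average_variance_extracted(loadings):
    """AVE for standardized MVs: mean of the squared loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def dillon_goldstein_rho(loadings):
    """Dillon-Goldstein's rho for standardized MVs.

    rho = (sum of loadings)^2 / ((sum of loadings)^2 + sum of (1 - loading^2))
    """
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l * l for l in loadings)
    return squared_sum / (squared_sum + error_variance)


# PLS-PM loadings of the Education block as printed in Table 2 (two decimals).
education_loadings = [0.86, 0.85, 0.63, 0.91, 0.71, 0.85, 0.78]

ave = average_variance_extracted(education_loadings)
rho = dillon_goldstein_rho(education_loadings)
print(f"AVE = {ave:.3f}, Dillon-Goldstein rho = {rho:.3f}")
```

With loadings rounded to two decimals, both values land within rounding distance of the Education row of Table 1 (AVE 0.647, ρ 0.927).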
Fig. 6 Violin plot of the Economic well-being indicators (Economic well-being block; panels:
O.4.1, O.4.2, O.4.3, O.4.4aa, O.4.5, O.4.6aa)
5.2 PLS–PM and QC–PM Results
The model in Fig. 3 was estimated using the classical PLS–PM and QC–PM fixed to the
three quartiles (θ = [0.25, 0.5, 0.75]). Given the territorial heterogeneity expressed by
the model MVs, the aim is to explore whether estimates vary across different parts of the
variable distributions.
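As background for what estimating “at a quantile θ” means, the sketch below shows the check (pinball) loss that underlies quantile regression: the constant that minimizes its sum over a sample is a sample quantile of order θ. This is only the basic building block, not the QC–PM estimation algorithm, and the scores are hypothetical.

```python
def check_loss(residual, theta):
    """Check (pinball) loss: theta * r for r >= 0, (theta - 1) * r for r < 0."""
    return residual * (theta - (1 if residual < 0 else 0))


def quantile_by_check_loss(values, theta):
    """Return the observed value that minimizes the total check loss."""
    return min(
        values,
        key=lambda q: sum(check_loss(y - q, theta) for y in values),
    )


# Hypothetical composite scores for nine provinces.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
quartiles = {theta: quantile_by_check_loss(scores, theta)
             for theta in (0.25, 0.5, 0.75)}
print(quartiles)
```

Replacing the constant with a linear function of the predictors and minimizing the same loss gives a quantile regression at θ, which is why different values of θ can yield different coefficients.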
Firstly, the estimated loadings in the measurement models are examined. Table 2 presents
the loadings for each MV estimated using conventional PLS–PM and QC–PM at the three
defined quantiles.
Except for a few cases, loadings are very similar across quantiles and to those estimated
by PLS–PM. However, for each block of MVs, measurement invariance should be verified
before evaluating any potential difference in the structural relationships across the
various quantiles (i.e., before comparing path coefficients). QC–PM still lacks a
statistical test for measurement invariance, but we applied the variant of QC–PM that fixes
the quantile in the measurement model to the median and found no relevant differences in
results. Hence the measurement of the LVs remains essentially the same for all quantiles,
which allows a reliable comparison of path coefficients across the various quantiles.
Bars in each panel of Fig. 7 represent (from the top to the bottom) the path coefficients
and the standard errors measuring respectively the effects on the conditional average and
on the conditional quartiles of Health. It is interesting to note how QC–PM results
complement PLS–PM results. Looking at the average of the distribution (PLS–PM), Education
is the most important driver of Health, but QC–PM reveals that its importance is greater
where Health is lower (that is, lower than or equal to the median), while it decreases as
Health grows. In essence, the effect of EDU on health conditions increases moving from
provinces with good results to those with worse results. With regard to ECO, the additional
information provided by QC–PM is also interesting: the PLS–PM results suggest that ECO
does not contribute to Health, while QC–PM reveals a high path coefficient in those
provinces where Health scores are the highest. Therefore ECO plays a discriminating role
in explaining the best absolute outcomes.
The territorial heterogeneity of the MVs is often associated with geographical
differences; therefore it may be useful to add the geographical location of the provinces
to the analysis of the model results (Davino et al. 2017). A possible source of
heterogeneity could be, for example, the geographical area, considering that Italian
provinces are usually grouped into four areas: north-east (20%), north-west (23%), centre
(20%) and south and islands (37%). Figure 8 shows all the possible scatter plots combining
the three composites obtained by estimating the model in Fig. 3 with PLS–PM. In each panel
a different color and shape of the points is used to distinguish the geographical area. The
lines represent the regression lines estimated in the four subgroups of provinces, taking
the variable on the vertical axis as the dependent variable. The boxplots on the right
(top) side of each panel show the distribution of the composite represented on the vertical
(horizontal) axis, distinguishing once again the provinces by geographical area.
Considering the univariate distributions of the composites, it is clear that the southern
provinces are lagging behind in all three contexts analysed; the third quartile for the
southern provinces is always well below the first quartile for the other provinces. The gap
is broader for EDU and ECO.

Table 1 Reliability and internal consistency measures
                      Cronbach’s α (95% C.I.)   Dillon–Goldstein’s ρ   AVE
Education             0.907 (0.885; 0.928)      0.927                  0.647
Economic well-being   0.953 (0.942; 0.964)      0.963                  0.813
Health                0.733 (0.640; 0.827)      0.853                  0.666

Table 2 Measurement model results: loadings (λ) from PLS–PM and from the estimation at
different quantiles
Block                 MV                PLS–PM   θ=0.25   θ=0.5   θ=0.75
Health                O.1.1.F           0.93     0.98     0.91    0.94
                      O.1.1.M           0.89     0.92     0.88    0.85
                      O.1.1.2.Mean_aa   0.57     0.65     0.61    0.29
Education             O.2.2             0.86     0.93     0.89    0.82
                      O.2.3             0.85     0.81     0.79    0.77
                      O.2.4             0.63     0.61     0.62    0.65
                      O.2.5.aa          0.91     0.90     0.96    0.94
                      O.2.6             0.71     0.63     0.81    0.82
                      O._2.7_2.8        0.85     0.76     0.85    0.85
                      O._2.7_2.8_AA     0.78     0.78     0.77    0.75
Economic Well-Being   O.4.1             0.98     0.93     0.97    1.03
                      O.4.4aa           0.90     0.91     0.92    0.90
                      O.4.5             0.93     0.85     0.91    1.02
                      O.4.6aa           0.76     0.85     0.68    0.75
                      O.4.2             0.92     0.92     0.92    0.96
                      O.4.3             0.91     0.87     0.88    0.89

Fig. 7 Path coefficients linking Economic well-being and Education to Health (panels:
economic well-being, education; rows: θ = 0.25, θ = 0.5, θ = 0.75, PLS–PM). Bar labels in the
order extracted: economic well-being 0.03, 0.07, 0.24, 0.08 (s.e. 0.15, 0.23, 0.16, 0.13);
education 0.79, 0.69, 0.44, 0.64 (s.e. 0.15, 0.25, 0.17, 0.13)
As expected, given the distribution of the EDU and ECO MVs (Sect. 5.1), the distribution
of the composites has a clear geographical orientation: values increase moving from the
south to the north of Italy, with the north-east group leading. Conversely, the Health
distribution follows a different geographical progression. The group of central Italian
provinces, although very heterogeneous, tends to have better scores than all the other
groups, including the north-eastern one.
Fig. 8 Education, Economic well-being and Health distributions according to the geographic
area (north-east, north-west, center, south and islands). PLS–PM results. Panels: health vs
economic well-being, health vs education, economic well-being vs education
Considering the relationships among Health, ECO and EDU, the three composites are highly
correlated at the national level but differently at the local level. In Fig. 8, for each
pair of scores, a regression line is estimated in the four geographical areas and
superimposed on the scatter plot. The simultaneous representation of the scatter plot and
regression lines captures both the trend and the heterogeneity of the relationship. The
correlation between the ECO and EDU composites is equally strong in all four geographical
areas despite the heterogeneity observed within each group of provinces (Fig. 8, bottom
right panel). The correlation between Health and ECO is by far the strongest in the south
and islands group, according to the greater path coefficient of ECO on the worst Health
outcomes (note that the south and islands provinces always lie in the tail of the Health
distribution, with just one exception). Instead, the strongest correlation between Health
and EDU scores is that of the centre group. Looking at the PLS–PM results shown in Fig. 8,
the assumptions underlying the model are not confirmed for all the north-eastern provinces:
also due to the high dispersion and heterogeneity of these provinces, no correlation arises
between Health and EDU outcomes, while that between Health and ECO (very weak) still has a
negative sign.
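The per-area regression lines described for Fig. 8 can be sketched as one ordinary least-squares fit per geographical area. The composite scores below are hypothetical and chosen only to show that the slope may differ, or vanish, from one area to another; the function names are illustrative.

```python
from collections import defaultdict


def ols_line(xs, ys):
    """Slope and intercept of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lines_by_area(records):
    """Fit one regression line per area from (area, x, y) records."""
    groups = defaultdict(lambda: ([], []))
    for area, x, y in records:
        groups[area][0].append(x)
        groups[area][1].append(y)
    return {area: ols_line(xs, ys) for area, (xs, ys) in groups.items()}


# Hypothetical (area, EDU score, Health score) triples.
records = [
    ("south_islands", -1.5, -1.2), ("south_islands", -1.0, -0.7),
    ("south_islands", -0.5, -0.2),
    ("north-east", 0.5, 0.4), ("north-east", 1.0, 0.4), ("north-east", 1.5, 0.4),
]
lines = lines_by_area(records)
for area, (slope, intercept) in lines.items():
    print(f"{area}: slope={slope:.2f}, intercept={intercept:.2f}")
```

A flat line in one area next to a steep line in another is the kind of local difference that a single national correlation would hide.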
The results of the QC–PM can provide a better definition of the characteristics of this
heterogeneity. Focusing on the Health composite and on its three conditional quartiles, it
is possible to analyse similarities and differences among the geographical areas at
different health conditions. Figure 9 shows the distribution of the Health composite for
each area (different panels) and for each model (rows in each panel). The density plot, the
dot diagram and the boxplots make it possible to explore all the features of the
distributions. In each row a segment joins the averages of the composite at the three
quartiles. Considering that the global averages of the composites provided by the PLS–PM
and by QC–PM at each quartile are equal respectively to 0, 0.47, 0.01 and 0.47, it is
possible to note that the averages of the southern provinces distributions are always below
the global average, while north-eastern provinces (and partially also the north-western
ones) show an opposite behavior.
Fig. 9 Distribution of the Health composite from a PLS–PM (top in each panel) and QC–PM
estimated at the three quartiles, according to the geographic area
5.3 Prediction Results
In order to exploit the model in Fig. 3 in an operative predictive perspective, it is
necessary to consider that Health is composed of three outcome variables: life expectancy
at birth of males (O.1.1M) and females (O.1.1F) and infant mortality rate (O.1.2.MEAN_aa).
In presenting the prediction results in more depth, we focus only on O.1.1M, which, as seen
above (Sect. 5.1), is the most informative among the three Health indicators used in the
model. In fact it is more robust than the infant mortality rate (which is affected by
extra-variability) and, compared to the female rate, it is better able to explain both the
differences among Italian provinces and the improvements achieved over time in the general
level of total life expectancy in Italy. As we deal with in-sample prediction, the
post-analysis of the results can be based on the comparison between the observed and the
predicted values. As described in the methodological sections, the proposed best quantile
approach computes the predicted vector of each MV by selecting the quantile model providing
the best prediction for each province. In the case of a simple regression model (with one
dependent variable and one regressor), the identification of the best quantile makes it
possible to reconstruct the observed variable exactly. In a more complex model such as the
network of relationships in Fig. 3, the goal is to identify the best prediction.
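The selection step of the best quantile approach can be sketched as follows, assuming that “best” means the smallest absolute difference between the observed value and the prediction of each quantile model. That criterion, the function name and the predicted values are assumptions made for illustration; the methodological sections of the paper define the actual procedure.

```python
def best_quantile(observed, predictions_by_theta):
    """Pick the quantile whose prediction is closest to the observed value.

    predictions_by_theta maps each quantile theta to the value predicted
    for one unit by the model estimated at that quantile.
    """
    theta = min(
        predictions_by_theta,
        key=lambda t: abs(observed - predictions_by_theta[t]),
    )
    return theta, predictions_by_theta[theta]


# Hypothetical predictions of male life expectancy (years) for one province.
predictions = {0.25: 80.9, 0.5: 81.3, 0.75: 81.8}
theta, predicted = best_quantile(81.5, predictions)
print(theta, predicted)
```

A finer grid of quantiles gives each unit more candidate models, so the selected prediction can get closer to the observed value.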
Figure 10 shows a smoothed version (using a linear smoother) of the scatterplot of
observed and predicted values for the O.1.1M variable, where the predicted values derive
from a PLS–PM (solid line) and from the best quantile approach (dashed line). The gray
line depicts the bisector, namely the locus of points where the observed values and the
predicted values coincide perfectly. The best quantile predictions turn out to be much
more accurate than PLS–PM, but the marginal gain in accuracy decreases at the distribution
tails. This consideration is also confirmed in the analysis by geographical area (Fig. 11):
the higher accuracy of the best quantile predictions is particularly evident for the
north-east and south and islands areas (the dashed lines are very close to the bisector),
while the lower accuracy of the best quantile predictions in the extreme parts of the
distribution is more evident in the north-west (low tail) and centre (high tail).
Fig. 10 A smoothed version (using a linear smoother) of the scatterplot of observed and
predicted values for the O.1.1M variable obtained using a PLS–PM (solid line) and the best
quantile approach (dashed line)
The proposed quantile model-based prediction approach can provide useful information to
understand, for each province, the contribution of the system of relationships in the
model in Fig. 3 to the prediction of health levels. In essence, the comparison between
conditional and unconditional quantiles for the Health MVs tells us whether the observed
results are in line with the starting conditions (in terms of ECO and EDU).
Fig. 11 A smoothed version (using a linear smoother) of the scatterplot of observed and
predicted values for the O.1.1M variable obtained using a PLS–PM (solid line) and the best
quantile approach (dashed line) according to the geographic area
The scatter plot in Fig. 12 visualizes all the provinces according to the assigned best
quantiles and to the unconditional quantiles. The unconditional quantile (horizontal axis)
is the position of each province in the MV distribution without controlling for the effect
of EDU and ECO, while the best quantile (vertical axis) represents the position of each
province compared to all the other provinces that have similar EDU and ECO levels (i.e.,
the position in the conditional distribution).
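One common way to obtain the unconditional quantile of a unit is the empirical distribution function: the share of units whose value does not exceed its own. The sketch below uses this convention as an assumption, since the exact definition is not given in this section, and the life expectancy values are hypothetical.

```python
def unconditional_quantiles(values):
    """Empirical position of each unit: share of units with value <= its own."""
    n = len(values)
    return [sum(1 for other in values if other <= v) / n for v in values]


# Hypothetical male life expectancy (years) for five provinces.
life_expectancy = [79.9, 80.5, 81.5, 81.5, 82.1]
positions = unconditional_quantiles(life_expectancy)
print(positions)
```

Units with the same observed value share the same unconditional quantile, whatever their EDU and ECO levels; only the conditional (best) quantile can tell them apart.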
On the one hand, looking at the unconditional quantiles of O.1.1M, one can find vertically
aligned provinces that have the same value of male life expectancy even when starting from
different EDU and ECO levels; on the other hand, looking at the horizontal alignment of the
points, one can find provinces that perform similarly while having different
socio-economic levels. The points furthest from the bisector of the scatterplot identify
the territories where the differences between observed and “potential” results are
greatest. These divergences are the most interesting cases to study. Numbering the
quadrants counterclockwise, starting from the quadrant at the top right, we can identify
two “critical” situations: at the top left (second quadrant) fall those provinces that get
better results than expected (the best quantile is greater than the unconditional one), and
at the bottom right (fourth quadrant) fall those territories with an important negative gap
(the best quantile is much lower than the unconditional one). The geographical area has
some influence on the relationship between unconditional and conditional quantiles. In
fact, in the second quadrant we find almost exclusively southern provinces (there are only
Latina, Frosinone and Fermo for the centre), while in the fourth quadrant we find only
northern provinces together with Rome. The scatter plot confirms the advantage of the
centre and northern areas and the penalisation of the south and islands also for male life
expectancy at birth. However, in this multidimensional perspective the dimensions of the
advantages and disadvantages are quite different from those we could appreciate in a
context of univariate analysis (i.e. considering the single indicators and not also their
interrelationships).
Fig. 12 The scatter plot of the provinces according to the unconditional and conditional
best quantiles of O.1.1M (life expectancy at birth, male). The color and shape of the points
represent the geographic area. Horizontal axis: unconditional quantile; vertical axis: best
quantile
For a more analytical illustration of the additional information that the model provides,
an extreme example can be isolated: Bologna vs Ravenna. Bologna and Ravenna share quite
similar health outcomes, considering the O.1.1M variable, but the predicted values are
quite different, as shown by their positions respectively in the fourth quadrant (Bologna)
and in the second one (Ravenna). If we consider the subset of the provinces of the
north-eastern area, the effect induced by the model on the Health results of the two
provinces is more evident. The slope graph in Fig. 13 shows the subgroup of provinces in
the north-east of Italy ranked according to their position in the original (unconditional,
left-hand side) and estimated (conditional, right-hand side) distribution of O.1.1M. The
slope of the lines joining the unconditional and conditional position of each province
clearly visualizes how much taking into account the levels of EDU and ECO can affect the
life expectancy of males. The limit case is represented by a horizontal line: it would mean
that ECO and EDU levels make no contribution to the knowledge of Health. For both Bologna
and Ravenna, the results in terms of life expectancy of males, considered on their own, are
excellent, among the highest: the two provinces share the 83rd percentile in Italy.
However, the effects of the estimated model are different in the two provinces, resulting
in an improvement in the position occupied by Ravenna (increasing slope of the stick) and a
worsening for Bologna (decreasing slope of the stick), the best quantiles being 0.89 for
Ravenna and 0.20 for Bologna.
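The comparison can be reduced to a single number per province: the gap between the best (conditional) quantile and the unconditional one. The sketch below uses the shared unconditional position of 0.83 and assigns the best quantile 0.20 to Bologna and 0.89 to Ravenna, the assignment consistent with Bologna’s worsening and Ravenna’s improvement; the function name and the labels are illustrative.

```python
def quantile_gap(unconditional, best):
    """Gap between conditional (best) and unconditional position of a unit."""
    # Rounding only removes floating-point noise from the subtraction.
    return round(best - unconditional, 2)


# (unconditional quantile, best quantile) for male life expectancy, O.1.1M.
provinces = {"Bologna": (0.83, 0.20), "Ravenna": (0.83, 0.89)}

gaps = {name: quantile_gap(u, b) for name, (u, b) in provinces.items()}
for name, gap in gaps.items():
    label = "better than expected" if gap > 0 else "worse than expected"
    print(f"{name}: gap = {gap:+.2f} ({label})")
```

The sign of the gap corresponds to the direction of the stick in the slope graph, and its size to the steepness.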
Fig. 13 The slope graph of the subgroup of provinces in the north-east of Italy. Provinces
are ranked according to their position in the original (unconditional, left-hand side) and
estimated (conditional, right-hand side) distribution of O.1.1M
To interpret the different effect played by the model, with similar observed results, it
is necessary to go back to the distribution of the original indicators. Figures 14 and 15
show some univariate statistics of ECO and EDU through parallel coordinates. The grey
double lines join the quartiles of the indicators, while the thin lines represent the
averages by geographical area. The broken lines representing Ravenna and Bologna are
highlighted and help to explain why the performance of Bologna appears less brilliant. On
the one hand, the levels of EDU and ECO in Bologna are among the highest; on the other
hand, within the group of provinces with these highest levels of EDU and ECO, Bologna
ranks among the last in terms of life expectancy of males, as highlighted by the best
quantile, which is much lower than the unconditional one. This difference gives us a
measure of the gap between potential and actual results, which in this case is negative.
Reading the results from the point of view of Ravenna, it appears that Ravenna has the same
life expectancy of males as Bologna (81.5 years), but a much less favorable ECO context,
especially concerning the disposable income of households (O.4.1), the incomes of employees
(O.4.2) and pensioners (O.4.3), and the wealth of families (O.4.5); it is shown above that
in the model ECO has a higher path coefficient on the higher part of the Health
distribution. The gaps in terms of EDU are even more marked, in particular for the rates of
graduates (O.2.3), transition to university (O.2.4) and participation in lifelong learning
(O.2.6); moreover, we know that EDU in the model has a high path coefficient on the higher
part of the Health distribution. Given its advantage over Ravenna in terms of EDU and ECO,
Bologna should have a far better result in terms of life expectancy of males than what is
observed.
Fig. 14 Distribution of the univariate statistics of the Economic well-being indicators:
quartiles (grey double lines), averages by geographical area (thin lines). The broken lines
representing Ravenna and Bologna are highlighted
Fig. 15 Distribution of the univariate statistics of the Education indicators: quartiles
(grey double lines), averages by geographical area (thin lines). The broken lines
representing Ravenna and Bologna are highlighted
An opposite pattern can be exemplified by Cosenza and Catanzaro (Fig. 16). Cosenza falls
in the group of provinces with low results (81st with 79.9 years), but gets a better
position than expected, given ECO and EDU, as its conditional quantile is higher than the
unconditional one and the positive gap is quite wide. So we could say that Cosenza performs
better than Catanzaro, as it gets the same result but in a more unfavorable context, mostly
due to the lower levels of the ECO indicators (see Fig. 17).
6 Conclusions and Further Developments
The analysis of the relationships among complex and unobservable factors can be enhanced
using a quantile approach to PLS–PM, which highlights the unobserved heterogeneity that
could be overlooked by the classic estimation of the average effects. In this paper QC–PM
is also proposed in a predictive perspective, providing the best estimation, and thus the
best model, associated with each statistical unit. For a given statistical unit, the
quantile associated with the best model, named “best quantile” in the paper, condenses the
effect played by the regressors on the position of the unit in the conditional distribution
of the dependent variable. QC–PM still lacks a statistical test for measurement invariance,
which would allow a reliable comparison among path coefficient estimates over quantiles.
However, as mentioned above, a possible variant of QC–PM can be used, fixing the quantile
in the measurement model to the median and changing only the quantile in the structural
model. We applied this variant, but we did not find relevant differences in results,
probably because there are no evident differences among loadings over quantiles.
Fig. 16 Distribution of the univariate statistics of the Education indicators: quartiles
(grey double lines), averages by geographical area (thin lines). The broken lines
representing Catanzaro and Cosenza are highlighted
The potential arising from a joint use of PLS–PM and QC–PM is exploited to explore the
relationships among the Health outcomes and the levels of Economic well-being and
Education in Italian provinces. The model is defined using a subset of dimensions and
indicators of the well-known BES dataset produced by ISTAT at NUTS3 level (ISTAT 2019a).
The underlying idea is that health levels and health inequalities at local level can be
assessed in more depth by taking into account both the observed and the unobserved
heterogeneity. In fact, similar levels of health can result from very different
performances when they are achieved in different socio-economic conditions. The study
provided a multidimensional analysis of health inequalities at local level, in an effort to
capture the unobserved heterogeneity that can be explained by taking into account the
relationships among Health, Economic well-being and Education.
The results of the PLS–PM confirmed that there is a relationship between Education and
Health, as we hypothesized in the theoretical model. The QC–PM also revealed the existence
of a relevant relationship between Economic well-being and high levels of Health, and a
decreasing contribution of Education as levels of Health increase. The geographical area
also provided useful information for understanding differences in Health levels and in the
relations among Health, Economic well-being and Education. Globally, the PLS–PM confirmed
that the three well-being domains are highly correlated in all geographical areas, except
the north-east. Deepening the analysis in a predictive perspective, the best quantile
predictions proved much more accurate than those of PLS–PM, especially for the north-east
subgroup. The observed health results of each province could then be assessed taking into
account jointly its placement in the unconditional distribution, the results of the best
quantile prediction and its geographical location. Looking at life expectancy of males,
many provinces, almost all located in the south and islands, get low but better results
than expected; in contrast, the provinces that get high but lower results than expected are
less numerous and none of them is southern or insular.
Fig. 17 Distribution of the univariate statistics of the Economic well-being indicators:
quartiles (grey double lines), averages by geographical area (thin lines). The broken lines
representing Catanzaro and Cosenza are highlighted
Future research will explore the strengths of QC–PM for prediction outside the data
sample used for estimating the model (out-of-sample prediction) and will try to deepen the
knowledge about the determinants of health differences at the local level, including in the
model the relationships among health and other well-being assets, such as the environment,
the quality of health services, the exposure to risky jobs and other vulnerability factors.
Acknowledgements Open access funding provided by Università degli Studi di Napoli Federico II
within the CRUI-CARE Agreement.
Compliance with ethical standards
Conflict of interest The authors declare that they have no relevant or material financial
interests that relate to the research described in the paper.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International
License, which permits use, sharing, adaptation, distribution and reproduction in any medium
or format, as long as you give appropriate credit to the original author(s) and the source,
provide a link to the Creative Commons licence, and indicate if changes were made. The
images or other third party material in this article are included in the article’s Creative
Commons licence, unless indicated otherwise in a credit line to the material. If material is
not included in the article’s Creative Commons licence and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain
permission directly from the copyright holder. To view a copy of this licence, visit
http://creativecommons.org/licenses/by/4.0/.
Appendix
See Tables 3 and 4.
Table 3 Manifest variables for the Health block and the Education block
Each entry lists: label, indicator (MV), description, unit of measurement, reference year.

Health
O.1.1.F  Life expectancy at birth (females)
  Description: Life expectancy expresses the average number of years that a child born in a
  given calendar year can expect to live if exposed during his whole life to the risks of
  death observed in the same year at different age
  Unit of measurement: Average number of years
  Reference year: 2017
O.1.1M  Life expectancy at birth (males)
O.1.2.MEAN_aa  Infant mortality rate
  Description: Ratio of children dead during the first year of life to the total number of
  children born in the same year
  Unit of measurement: Per 1.000 born alive (3 years mean)
  Reference year: 2014–2016

Education
O.2.2  People with at least upper secondary education level (25–64 years old)
  Description: Ratio of people aged 25–64 years having completed at least upper secondary
  education (ISCED[1] level not below 3) to the total of people aged 25–64 years.
  [1] ISCED is the UNESCO International Standard Classification of Education for degree
  programs and related degrees. Level 3 is the Upper secondary education degree, Level 5 is
  the First stage of tertiary education degree.
  Unit of measurement: Percentage
  Reference year: 2018
O.2.3  People having completed tertiary education (30–34 years old)
  Description: Ratio of people aged 30–34 years having completed tertiary education (ISCED
  5, 6, 7 or 8) to the total of people aged 30–34 years.
  Unit of measurement: Percentage
  Reference year: 2018
O.2.4  First-time entry rate to university by cohort of upper secondary graduates
  Description: Proportion of new graduates from upper secondary education enrolled for the
  first time at university in the same year of upper secondary graduation
  Unit of measurement: Cohort-specific percentage rate
  Reference year: 2017
O.2.5.aa  People not in education, employment or training (Neet)
  Description: Ratio of people aged 15–29 years that are not in education, employment, or
  training to the total people aged 15–29 years
  Unit of measurement: Percentage
  Reference year: 2018
O.2.6  Participation in long-life learning
  Description: Ratio of people aged 25–64 years participating in formal or non-formal
  education to the total people aged 25–64 years
  Unit of measurement: Percentage
  Reference year: 2018
O_2.7_2.8  Level of literacy and numeracy
  Description: Scores obtained in the tests of functional skills of the students in the II
  classes of upper secondary education
  Unit of measurement: Average score on a 0–200 scale
  Reference year: 2018
O_2.7_2.8_AA Gender diﬀerences in the level of numeracy
and literacy
Diﬀerences between males and females
students in the level of numeracy and
literacy
Absolute diﬀerence between spe
ciﬁc average scores
2018
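The infant mortality rate in Table 3 is expressed per 1,000 live births as a three-year mean. The arithmetic can be sketched as follows. This is only an illustration: the table does not say whether the annual rates are averaged or whether deaths and births are pooled over the three years, so averaging the annual rates is an assumption here, and the function names and counts are hypothetical rather than taken from the paper or from the Istat data.

```python
from statistics import mean


def rate_per_thousand(deaths: int, live_births: int) -> float:
    """Deaths in the first year of life per 1,000 live births in one year."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return 1000.0 * deaths / live_births


def three_year_mean_rate(yearly_counts) -> float:
    """Average the annual rates over (deaths, live_births) pairs.

    Assumption: the mean is taken over annual rates, not over pooled counts.
    """
    return mean(rate_per_thousand(d, b) for d, b in yearly_counts)


# Hypothetical counts for one province in 2014, 2015 and 2016.
counts = [(12, 4000), (10, 4000), (8, 4000)]
imr = three_year_mean_rate(counts)
print(f"Infant mortality rate (3-year mean): {imr:.2f} per 1,000 live births")
```

When the number of births changes from year to year, the mean of the annual rates and the pooled ratio differ, so the choice should follow the definition used by the data provider.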
Table 4 Manifest variables for the Well-being block

| Domain | Label | Indicator (MV) | Description | Unit of measurement | Reference year |
|---|---|---|---|---|---|
| Well-being | O.4.1 | Per capita disposable income | Ratio of total disposable income of households to the total number of residents | Euro | 2016 |
| | O.4.4aa | Pensioners with low pension amount | Pensioners who receive a monthly gross pension of less than 500 Euros to the total number of pensioners | Percentage | 2017 |
| | O.4.5 | Per capita net wealth | Ratio of total net wealth of households to the total number of residents | Thousands of euro | 2016 |
| | O.4.6aa | Rate of bad debts of the bank loans to families | Ratio of the amounts of new non-performing loans in the year (loans to subjects declared insolvent or difficult to recover during the year) to the total stock of non-performing loans during the year | Percentage | 2017 |
| | O.4.2 | Average annual salary of employees | Ratio of the total annual remuneration (gross of personal income tax) of non-agricultural private sector employees to the number of employees | Euro | 2017 |
| | O.4.3 | Average annual amount of pension income per capita | Ratio of the total amount of pensions paid in the year to the total number of pensioners | Euro | 2017 |
References
Costa, G., Bassi, M., Gensini, G. F., Marra, M., Nicelli, A. L., & Zengarini, N. (2014). L’equità nella salute
in Italia. Secondo rapporto sulle disuguaglianze sociali in sanità. Roma: Franco Angeli.
Danks, N., & Ray, S. (2018). Predictions from partial least squares models. In F. Ali, S. Rasoolimanesh, &
C. Cobanoglu (Eds.), Applying partial least squares in tourism and hospitality research (pp. 35–52).
Emerald Publishing Limited.
Davino, C., Esposito Vinzi, V., & Dolce, P. (2016). Assessment and validation in quantile compositebased
path modeling. In H. Abdi, V. Esposito Vinzi, G. Russolillo, G. Saporta, & L. Trinchera, (Eds.), The
multiple facets of partial least squares and related methods (pp. 169–185). Springer Proceedings in
Mathematics & Statistics. New York: Springer.
Davino, C., Dolce, P., & Taralli, S. (2017). Quantile compositebased model: A recent advance in plspm.
a preliminary approach to handle heterogeneity in the measurement of equitable and sustainable well
being. In H. Latan & R. Noonan (Eds.), Partial least squares path modeling: basic concepts, methodo
logical issues and applications (pp. 81–108). Cham: Springer.
Davino, C., Dolce, P., Taralli, S., & Esposito Vinzi, V. (2018). A quantile compositeindicator approach for
the measurement of equitable and sustainable wellbeing: A case study of the italian provinces. Social
Indicators Research, 136, pp. 999–1029, Dordrecht, Kluwer Academic Publishers.
Davino, C., & Esposito Vinzi, V. (2016). Quantile compositebased path modelling. Advances in Data Anal
ysis and Classiﬁcation, 10(4), 491–520.
Davino, C., Furno, M., & Vistocco, D. (2013). Quantile regression: Theory and applications. Wiley, Wiley
Series in Probability and Statistics.
Davino, C., & Vistocco, D. (2015). Quantile regression for clustering and modeling data. In I. Morlini, T.
Minerva, & M. Vichi (Eds.), Advances in statistical models for data analysis: studies in classiﬁcation,
data analysis, and knowledge organization (pp. 85–96). Heidelberg: Springer.
Davino, C., & Vistocco, D. (2018). Handling heterogeneity among units in quantile regression. Investigating
the impact of students’ features on University outcome. Statistics & Its Interface, 11, 541–556.
Di Napoli, I., Dolce, P., & Arcidiacono, C. (2019). Community trust: A social indicator related to commu
nity engagement. Social Indicators Research, 145(2), 551–579.
Dolce, P., Esposito Vinzi, V., & Lauro, C. N. (2017). Predictive path modeling through PLS and other com
ponentbased approaches: Methodological issues and performance evaluation. In H. Latan & R. Noo
nan (Eds.), Partial least squares path modeling: Basic concepts, methodological issues and applica
tions (pp. 153–172). Cham: Springer.
Dolce, P., & Hanaﬁ, M. (2017). Multidimensional blocks in predictive path modeling, 9th international
conference on pls and related methods (PLS’17), Macau, China, 17–19 June 2017.
Dolce, P., Esposito Vinzi, V., & Lauro, C. (2018). Nonsymmetrical compositebased path modeling.
Advances in Data Analysis and Classiﬁcation, 12(3), 759–784.
Esposito Vinzi, V., Chin, W. W., Henseler J., & Wang, H. (Eds.). (2010). Handbook of partial least squares.
Springer.
Evermann, J., & Tate, M. (2016). Assessing the predictive performance of structural equation model estima
tors. Annals of Mathematical Statistics, 35(3), 1019–1030.
Fox, M., & Rubin, H. (1964). Admissibility of quantile estimates of a single location parameter. Journal of
Business Research, 69(10), 4565–4582.
Furno, M., & Vistocco, D. (2018). Quantile regression: Estimation and simulation. Wiley, Wiley Series in
Probability and Statistics.
Glang, M. (1988). Maximierung der Summe erklärter Varianzen in linearrekursiven Strukturgleichun
gsmodellen mit multiple Indikatoren: Eine Alternative zum Schäatzmodus B des PartialLeastSquares
Verfahren. Phd thesis, Universität Hamburg, Hamburg, Germany.
Hair, J. F., Hult, G. T. M., Ringle, C. M., & Sarstedt, M. (2014). A primer on partial least squares structural
equation modeling (PLSSEM) (2nd ed.). Thousand Oaks, CA: Sage.
Hair, J., Sarstedt, M., Pieper, T., & Ringle, C. (2012). The use of partial least squares structural equation
modeling in strategic management research: A review of past practices and recommendations for
future applications. Long Range Planning, 45, 320–340.
Hanaﬁ, M. (2007). PLS path modeling: Computation of latent variables with the estimation mode B. Com
putational Statistics, 22, 275–292.
Henseler, J., Ringle, C. M., & Sarstedt, M. (2016). Testing measurement invariance of composites using
partial least squares. International Marketing Review, 33(3), 405–431.
Henseler, J., Ringle, C. M., & Sinkovics, R. R. (2009). The use of partial least squares path modeling in
international marketing. Advances in International Marketing, 20, 277–319.
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
936
C.Davino et al.
1 3
ISTAT. (2013). Rapporto Bes 2013. Il benessere equo e sostenibile in Italia. Roma, Istat. https ://www.istat
.it/it/archi vio/84348 .
ISTAT. (2018). Bes report 2018: Equitable and sustainable wellbeing in Italy. Rome, Istat. https ://www.
istat .it/en/archi vio/22514 0.
ISTAT. (2019a). Misure del Benessere dei territori. Tavole di dati. Rome, Istat. https ://www.istat .it/it/archi
vio/23062 7.
ISTAT. (2019b). Le diﬀerenze territoriali di benessere  Una lettura a livello provinciale. Rome, Istat. https
://www.istat .it/it/archi vio/23324 3.
Koenker, R. (2005). Quantile regression. Cambridge: Cambridge University Press.
Koenker, R., & Basset, G. (1978). Regression quantiles. Econometrica, 46, 33–50.
Jöreskog, K. G. (1978). Structural analysis of covariance and correlation matrices. Psychometrika, 43(4),
443–477.
Krämer, N. (2007). Analysis of highdimensional data with partial least squares and boosting. Phd thesis,
Technische Universität Berlin, Berlin, Germany.
Lohmöller, J. B. (1989). Latent variable path modeling with partial least squares. Heildelberg:
PhysicaVerlag.
Mackenbach, J. P., Stirbu, I., Roskam, A. J., Schaap, M. M., Menvielle, G., Leinsalu, M., et al. (2008).
European union working group on socioeconomic inequalities in health. Socioeconomic inequalities in
health in 22 European countries. The New England Journal of Medicine, 358, 2468–2481.
Mathes, H. (1993). Global optimisation criteria of the PLSalgorithm in recursive path models with latent
variables. In K. Haagen, D. Bartholomew, & M. Deister (Eds.), Statistical modelling and latent vari
ables. Amsterdam: Elsevier Science.
Murtin, F., Mackenbach, J., Jasilionis, D., & Mira d’Ercole, M. (2017). Inequalities in longevity by educa
tion in OECD countries:Insights from new OECD estimates”, OECD Statistics Working Papers, No.
2017/02, OECD Publishing, Paris.
OECD. (2008). Handbook on constructing composite indicators: Methodology and user guide. Paris:
OECD.
Petrelli, A., Di Napoli, A., Sebastiani, G., Rossi, A., Rossi, P. Giorgi, Demuru, E., Costa, G., Zengarini, N.,
Alicandro, G., Marchetti, S., Marmot, M., & Frova, L. (2019). Italian Atlas of mortality inequalities by
education level. Epidemiologia e prevenzione, 43, 1S1: 1–120.
Sarstedt, M., Ringle, C. M., & Hair, J. F. (2017). Partial least squares structural equation modeling. In C.
Homburg etal. (Eds.), Handbook of Market Research.
Sharma, P. N., Shmueli, G., Sarstedt, M., Danks, N., & Ray, S. (2019). Predictionoriented model selec
tion in partial least squares path modeling. Decision Sciences. https ://doi.org/10.1111/deci.12329 .
(forthcoming).
Shmueli, G., Ray, S., Velasquez Estrada, J. M., & Chatla, S. B. (2016). The elephant in the room: Predictive
performance of PLS models. Journal of Business Research, 69(10), 4552–4564.
Shmueli, G., Sarstedt, M., Hair, J. F., Cheah, J.H., Ting, H., Vaithilingam, S., et al. (2019). Predictive
model assessment in PLSSEM: Guidelines for using PLSpredict. European Journal of Marketing,
forthcoming.
Tenenhaus, M., Vinzi, V. E., Chatelin, Y. M., & Lauro, C. (2005). PLS path modeling. Computational sta
tistics and data analysis, 159–205.
Tenenhaus, A., & Tenenhaus, M. (2011). Regularized generalized canonical correlation analysis. Psycho
metrika, 76(2), 257–284.
Trinchera, L., Marie, N., & Marcoulides, G. A. (2018). A distribution free interval estimate for Coeﬃcient
Alpha. Structural Equation Modeling: A Multidisciplinary Journal, 25(6), 876–887.
Wang, Y., Feng, X. N., & Song, X. Y. (2016). Bayesian quantile structural equation models. Structural
Equation Modeling: A Multidisciplinary Journal, 23(2), 246–258.
Wold, H. (1982). Soft modeling: The basic design and some extensions. In K. Jöreskog & H. Wold (Eds.),
Systems under indirect observation (Vol. 2, pp. 1–54). Amsterdam: NorthHolland.
Wold, H. (1985). Partial least squares. In S. Kotz & N. L. Johnson (Eds.), Encyclopedia of statistical sci
ences. Hoboken: Wiley.
Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.