Abstract

Disability and dependency (lack of autonomy in performing common everyday actions) affect health status and quality of life and are therefore significant public health issues. The main purpose of this study is to establish the relationship among different variables (continuous, categorical and binary) relating to children between 3 and 6 years old and their functional dependence in basic activities of daily living. We combine different types of information via weighted related metric scaling to obtain homogeneous profiles for dependent Spanish children. The redundant information between groups of variables is modeled with an interaction parameter that can be optimized according to several criteria. In this paper, the goal is to obtain maximum explained variability in a Euclidean configuration. Data come from the Survey about Disabilities, Personal Autonomy and Dependence Situations, EDAD 2008 (Spanish National Institute of Statistics, 2008).
Working Paper 11-36
Statistics and Econometrics Series 28
November 2011
Departamento de Estadística
Universidad Carlos III de Madrid
Calle Madrid, 126
28903 Getafe (Spain)
Fax (34) 91 624-98-49
PROFILE IDENTIFICATION VIA WEIGHTED RELATED METRIC SCALING:
AN APPLICATION TO DEPENDENT SPANISH CHILDREN
Irene Albarrán(1), Pablo Alonso(2) and Aurea Grané(1)
Keywords: ADL, Disability, Mixed-type Data, Public Health, Related Metric Scaling.
AMS subject classification: 62-07, 62-09, 62H20, 62H99, 62P05.
Authors’ address: (1) Statistics Department, Universidad Carlos III de Madrid, C/ Madrid
126, 28903 Getafe (Madrid), Spain. (2) Statistics Department, Universidad de Alcalá, Pza.
San Diego, s/n - 28801 Alcalá de Henares (Madrid).
E-mail: A. Grané, aurea.grane@uc3m.es (Corresponding author), I. Albarrán,
irene.albarran@uc3m.es, P. Alonso, pablo.alonso@uah.es .
This work has been partially supported by Spanish grant MTM2010-17323 (Spanish
Ministry of Science and Innovation).
1 Introduction
Living is not always an easy task, especially when a person's physical or mental condition
is impaired. This is the case when people are not able to carry out certain activities that
are common in daily life because they suffer from some disability. This situation refers to
the negative aspects of the interaction between the individual and the environment, i.e.,
deficits, limitations in activity and restrictions in his/her social participation (WHO
2001b). This set of obstacles becomes a tougher situation when a third person is required to
perform these activities, which is the case of dependency. Disability has traditionally been
a marginalized concern of public health and has usually been viewed as a failure of primary
prevention. However, disparities in health behaviours, health access, and health status
between people with and without disabilities suggest that opportunities exist for public
health to engage people with disabilities to improve their overall health (Crews and Lollar
2008). Disability is a large public health problem in the main developed countries. It has
been analysed in the US (Pope and Tarlov 1991) and in the Western European countries (Haveman
and Wolfe 2000). The International Classification of Functioning, Disability and Health
(ICF) (WHO 2001b) tries to establish a consensus in its understanding by distinguishing
between the basic activities of daily life and the instrumental activities of daily life.
The former are defined as those activities that are essential for an independent life. There
are many ways to define dependency. One of the most widely accepted is that included in
Resolution R(98) of the Council of Europe, which defines it as “such state in which people,
whom for reason connected to the lack or loss of physical, mental or intellectual autonomy,
require assistance and/or extensive help in order to carry out common everyday actions”.
Footnote: Authors’ address: (1) Statistics Department, Universidad Carlos III de Madrid,
C/ Madrid 126, 28903 Getafe, Spain. (2) Statistics Department, Universidad de Alcalá, Plaza
de la Victoria 2, 28802 Alcalá de Henares, Spain. E-mails: I. Albarrán,
irene.albarran@uc3m.es, P. Alonso, pablo.alonsog@uah.es, A. Grané, aurea.grane@uc3m.es.
Corresponding author: A. Grané; Date: December 1, 2011.
This general definition has been translated into national legislations in a heterogeneous
way (Kamette 2011). In fact, a person may be considered dependent in one country but not in
another. For instance, a 58-year-old dependent man in Spain or in Germany will not be
considered dependent in France, because the French system requires the person to be over 60
years old. Alternatively, the intensity of the legally recognized dependence may differ
among countries (see Albarrán, Alonso, and Bolancé 2009). If we consider the Spanish case,
the definition
of dependency is that included in article 2 of Act 39/2006, of 14th December, on
the Promotion, Personal Autonomy and care for Dependent persons. It is defined
as a “permanent state in which persons that for reasons derived from age, illness or
disability and linked to the lack or loss of physical, mental, intellectual or sensorial
autonomy require the care of another person/other people or significant help in or-
der to perform basic activities of daily living or, in the case of people with mental
disabilities or illness, other support for personal autonomy”.
Another key element is the assessment of dependency. This question is usually solved
using specific scales that take into account the disabilities suffered by the person
jointly with their intensity. Intensity is not easy to measure; one of the most widespread
solutions is to evaluate the time that a third person devotes to helping him/her with
certain activities, such as dressing or eating. This is how it is assessed in several
national systems, such as those currently in force in Spain or in Germany. The evaluation
of dependency in Spain is ruled by Royal Decree 504/2007, of 20th April, which approves the
scale for the assessment of the situations of dependency set by Act 39/2006. According to
it, the scale goes from 0 to 100 points and at least 25 points are needed to acknowledge
entitlement to the benefits of the System. In Table 1 we show the dependency graduation
following Spanish legislation.

Table 1: Dependency graduation following Spanish legislation

Dependency       Degree   Level   Scale values
Non dependant    -        -       [0,25)
Moderate         I        1       [25,40)
Moderate         I        2       [40,50)
Severe           II       1       [50,65)
Severe           II       2       [65,75)
Major            III      1       [75,90)
Major            III      2       [90,100]

Moderate dependency: The person needs help in order to perform various basic ADL(1)
at least once a day or the person needs intermittent or limited support for his/her
personal autonomy.
Severe dependency: The person needs help in order to perform various basic ADL two or
three times a day, but he/she does not want the permanent support of a carer or he/she
needs extensive support for his/her personal autonomy.
Major dependency: The person needs help in order to perform various basic ADL several
times a day or he/she needs the indispensable and continuous support of another person
or he/she needs generalised support for his/her personal autonomy.
(1) ADL stands for Activities of Daily Living.

According to the scale value reached by an individual, Act 39/2006 establishes a minimum
level of protection, which is defined and financially guaranteed by the General State
Administration. Dependent persons shall be entitled to access under equal conditions to the
benefits and services foreseen in this Act, according to the terms laid down in it. For
instance, regarding dependent children, some family benefits and social service benefits in
re-education and rehabilitation can be obtained. This scale is used in all cases when
individuals are over 3 years old. Nevertheless, when the individuals are under 6, the
International Classification of Functioning (ICF) mentioned before is replaced by its
version for children and youth (ICF-CY) (WHO 2001a). This version is based on the same model
as ICF with added content adapted to these groups of people. ICF is meant to provide a
common language to professionals and other stakeholders involved in facilitating functioning
for persons with body impairments and activity limitations. Besides, in this case the term
limitation is used instead of disability. However, the US federal Maternal and Child Health
Bureau (MCHB) defines children with special health care needs as those who have or are at
increased risk for a chronic physical, developmental, behavioral, or emotional condition and
who also require health and related services of a type or amount beyond that required by
children generally (see USDHHS 2004).
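The graduation in Table 1 amounts to a lookup from a scale value to a dependency degree and level. The following is a minimal sketch of that lookup in Python; the function name `dependency_grade` and the tuple layout are illustrative assumptions, not part of the Spanish legislation or of EDAD 2008.

```python
from bisect import bisect_right

# Lower bounds of the scale intervals in Table 1; the last interval [90,100] is closed.
_LOWER_BOUNDS = [0, 25, 40, 50, 65, 75, 90]
# (dependency, degree, level) for each interval, in the same order as the bounds.
_GRADES = [
    ("Non dependant", None, None),
    ("Moderate", "I", 1),
    ("Moderate", "I", 2),
    ("Severe", "II", 1),
    ("Severe", "II", 2),
    ("Major", "III", 1),
    ("Major", "III", 2),
]


def dependency_grade(scale_value):
    """Map a scale value in [0, 100] to (dependency, degree, level) as in Table 1."""
    if not 0 <= scale_value <= 100:
        raise ValueError("scale value must lie in [0, 100]")
    # bisect_right counts the lower bounds <= scale_value; subtract 1 for the index.
    return _GRADES[bisect_right(_LOWER_BOUNDS, scale_value) - 1]
```

With this convention a child scoring exactly 25 points falls in the first level of moderate dependency, which is the threshold for entitlement mentioned above.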
Dependence is the main impact factor on health and quality of life (Millán-Calenti
2006). There are many studies about dependence in people over 65 (see Grammenos 2003,
Lafortune and Balestat 2007, Giannakouris 2009). However, concerning children, there is
a lack of research using internationally accepted measurements. Benavente and Pfeiffer
(2002) pointed out that research on children with disabilities is a large area still to be
covered. In particular, our contribution is to study those children with specific
disabilities linked to some limitations (see Annex I) who are considered dependent
according to the Spanish legislation. Becoming dependent during the first stages of life
implies that the individual is going to need special care during the rest of his/her
life (Claeson and Waldman 2000 and Hauser-Cram, Warfield, and Shonkoff 2001).
Nowadays, thanks to the combined efforts of medicine, public health and policy, children
with chronic conditions or disability live to adulthood, often with a life span similar
to the general population (Cannell, Brumback, Bouldin, Hess, Wood, Sloyer, Reiss, and
Andresen 2011). For this reason, it seems necessary to know as well as possible how the
Spanish child population lives in these circumstances, that is, we are interested in
establishing homogeneous profiles. Each of these groups will have different needs (for
instance, medical, psychological or social care) with different economic consequences.
The aim of this paper is to classify into different groups the dependent Spanish
population between 3 and 6 years old. These groups will be
created depending on certain inherent characteristics such as sex, suffered limitation
and its severity or weekly hours of personal assistance. The statistical information
comes from the Survey about Disabilities, Personal Autonomy and Dependence Sit-
uations, 2008 (EDAD 2008, according to its Spanish acronym). This question, which
is crucial in actuarial science, has been usually solved by classical segmentation tech-
niques, such as k-means (see, for instance, Anderberg 1973 and Morgan and Ray
1995). Applications to class rating definitions may be found in Loimaranta, Jacobs-
son, and Lonka (1980), Campbell (1986), Boj, Claramunt, and Fortiana (2001) and
Boj, Claramunt, and Fortiana (2004). However, relevant technical problems arise because
of (i) the sampling design and (ii) the nature of the variables observed. Indeed, in EDAD
2008 a two-stage sampling was conducted by the Spanish National Institute of Statistics
(INE), leading to individuals that represent population groups of different sizes. We
refer to this situation as a weighted context, in contrast to the classical iid sampling.
Regarding the nature of the variables, data collected through questionnaires are often of
mixed type, obtained as measures of variables at different levels, e.g. quantitative,
multi-scale categorical and binary variables. The repertory of statistical techniques
suitable for this mixed-type data is scarce. Among them, multidimensional scaling appears
to be one of the most flexible techniques for dealing with mixed-type data. Bearing in
mind the setting of homogeneous profiles, the definition of a proper distance (or
dissimilarity) function among individuals becomes crucial. The well-known Gower’s general
similarity coefficient (see Gower 1971, Cox and Cox 2000) considers mixtures of
quantitative, multi-scale categorical and binary variables. Nevertheless, the additive
treatment of the variables in similarity coefficients based on Gower’s results in a lack
of consideration of the association between variables. Grané and Romera (2009) proposed to
construct a joint metric via related metric scaling (Cuadras and Fortiana 1998) from
three different distance matrices computed on quantitative, multi-scale categorical
and binary variables, respectively. Through a case study these authors show the potential
of this technique, which outperforms the classical Gower’s metric, and study
the sensitivity and robustness of their proposal through cross-validation procedures.
This paper extends their proposal in two directions. Firstly, we consider related met-
ric scaling in the weighted context, in the sense that each individual can represent
a group of individuals, and secondly we model the redundant information between
groups of variables through an interaction parameter, that can be optimized accord-
ing to several criteria (see also Esteve 2003). For example, in the case study of data
coming from EDAD 2008, since we are interested in obtaining homogeneous profiles
for dependent Spanish children, the interaction parameter is achieved by imposing
maximum explained variability in the Euclidean configuration.
Searching for profiles of homogeneous dependent Spanish children, we find that the
Spanish scale is not sufficient to measure properly the severity of the situation of
dependency, since the time devoted to caring for dependent children and the intensity of
the dependency are not directly related. This is quite relevant, since the scale value
reached by an individual determines his/her access to the benefits established by the
Spanish Act 39/2006. Finally, we have observed that the definition and classification
that the International Classification of Functioning for children and youth (WHO
2001a) establishes can be refined and complemented according to USDHHS (2004).
The rest of the paper is organised as follows: the Spanish database is described in
Section 2. Section 3 is devoted to multidimensional scaling methodology for mixed-type
data in the weighted context, where the new proposal of weighted related metric
scaling is introduced. An application of this new methodology can be found in Section
4, where we obtain different homogeneous profiles for dependent Spanish children using
data coming from EDAD 2008. Finally, we conclude in Section 5.
2 Database used in the analysis
Three surveys about disability have been undertaken by INE (Spanish National Institute
of Statistics) during the last 25 years in Spain. The first one, carried out in 1986,
was the Survey about Disabilities, Impairments and Handicaps (EDDM 1986, according to its
Spanish acronym). The next one, the Survey about Disabilities, Impairments and Health
Status (EDDES 1999, according to its Spanish acronym), was prepared using data from 1999.
Finally, the last one was the Survey about Disabilities, Personal Autonomy and Dependence
Situations (EDAD 2008, according to its Spanish acronym). Although all of them deal with
disabilities, it is impossible to track this phenomenon in a homogeneous way over the
years, because the definition of the concept has changed depending on the classification
used to prepare each survey.
2.1 Recent disability survey in Spain: EDAD 2008
In order to provide reliable estimates at the national level, the survey was performed
around the country using sampling. In particular, a two-stage sampling was per-
formed, stratified and proportional to the size of the Spanish autonomous regions
(with stratified sampling distribution proportional to population size in stratum,
within each Spanish province). See INE (2010) for more details on the sampling
methodology.
EDAD 2008 gives information about people with disabilities who were living either
in private homes or in institutions. In the first case, the survey was prepared by
interviewing 260,000 people who were living in 96,000 different houses, whereas for
institutionalized people, 11,000 people in 800 centers were asked about their situation.
This survey is based on the concept of self-perceived disability, in accordance with
the recommendations of the World Health Organization. So, the target population is
identified through a set of questions about the possible difficulties they can find
in doing some specific activities. Despite its drawbacks, the main advantage of this
strategy is that it is focused on the daily activities of the individuals and the problems
they may have while doing them, with no consideration of medical matters. That is,
it puts the attention of both interviewer and interviewee on functional affairs, since
they are key aspects when talking about disability (Jiménez and Huete 2010).
According to EDAD 2008, there are more than 4.1 million Spanish people suffering from
at least one kind of disability, 3.85 million of them living with their relatives or
in their own homes, whereas the remaining 0.27 million are in specialized centers.
Although the global prevalence rate is 9.1%, the rate for people living at home is
lower than that for people living in institutions (8.5% and 17.7%, respectively).
Disability is mainly related to two variables: sex and age. Up to 45 years old, the
male prevalence is greater than the female one. After that age, the relative incidence
is greater for women. In general terms, more than 50% of people with this problem are
at least 65 years old, most of them being women. Figures about affected people who
are living at home can be seen in Table 2.

Table 2: People with disability living at home: number and prevalence rate

                      Number (000)                    Prevalence rate (%)
Ages (years)          Total      Men       Women      Total    Men      Women
Under 6               60.4       36.4      24.0       2.2%     2.5%     1.8%
Between 6 and 44      608.1      345.1     263.0      2.5%     2.8%     2.3%
Between 45 and 64     951.8      409.0     542.8      8.7%     7.6%     9.8%
65 or more            2,227.5    756.7     1,470.8    30.3%    24.1%    34.9%
Total                 3,847.8    1,547.2   2,300.6    8.5%     6.9%     10.1%
Source: own elaboration using EDAD 2008

Despite the fact that the survey includes the term “dependence” into its denomina-
tion, the questionnaire does not consider questions on this topic. In fact, if we looked
for the number of dependants reflected in the survey, we would not be able to know
how many individuals would be in this situation. So, the only way to answer this
question is trying to apply as best as possible both the definition incorporated in ar-
ticle 2 of Act 39/2006 and the assessment scale regulated by Royal Decree 504/2007.
Hence, the result is an estimation.
Besides this problem, there is another aspect that makes the study of this contingency
in children even more difficult. The analysis of this phenomenon for the population up
to 6 years old has to be done using the ICF-CY Classification, where the concept of
disability is replaced by that of limitation, because children are dependent by nature.
For instance, it makes no sense to talk about self care in children. Moreover, there are
other limitations that can only be seen during the growth of a child, e.g., some
difficulties in speaking. Therefore, it is not surprising that the proportion of children
with limitations increases with age (Grupo de Atención Temprana 2000). In addition,
children at the earliest ages (0-2 years old) are assessed with a special scale whose
concepts are not reflected in EDAD 2008. This is the reason why this paper is focused on
children between 3 and 6 years old. In a strict sense, the considered ages are those
between 36 and 71 months old.
2.2 Description of the data set
After having filtered the information with the definition of dependency included in
Act 39/2006, the number of records to be analysed is 84. Taking into account the
sampling methodology, the Spanish National Institute of Statistics estimates that
the number of dependent Spanish children with possibilities of receiving public aid is
13,296. Their number and prevalence rate by age and gender are shown in Table 3.

Table 3: Dependent children: number and prevalence rate by age and gender

                      Number                       Prevalence rate (%)
Ages (months)         Total    Boys     Girls      Total    Boys     Girls
Between 36 and 47     2,751    1,702    1,049      0.6%     0.7%     0.5%
Between 48 and 59     5,621    3,473    2,148      1.2%     1.5%     0.9%
Between 60 and 71     4,923    3,272    1,652      1.1%     1.4%     0.8%
Source: own elaboration using EDAD 2008

In Table 4 we briefly describe the twenty-four mixed-type variables considered in the
analysis. They consist of three continuous variables such as the age of the child (in
months), the scale value reached by the child and weekly hours of attention, three
multi-state categorical variables such as some information about the respondent,
the type of received aids and the severity of limitations to perform activities of daily
living (ADL) and, finally, eighteen binary variables such as sex and several limitations
described in the Annex.

Table 4: Variables included in the analysis with their possible values

Type(1)  Description                    Values/categories (% frequency distribution)
B        sex                            male (66.6%), female (33.3%)
B        lim 1(2)                       yes (13.1%), no (86.9%)
B        lim 2                          yes (21.5%), no (78.5%)
B        lim 3                          yes (26.6%), no (73.4%)
B        lim 5                          yes (7.6%), no (92.4%)
B        lim 6                          yes (2.0%), no (98.0%)
B        lim 7                          yes (14.4%), no (85.6%)
B        lim 8                          yes (24.5%), no (75.5%)
B        lim 9                          yes (36.1%), no (63.9%)
B        lim 10                         yes (17.1%), no (82.9%)
B        lim 11                         yes (80.9%), no (19.1%)
B        lim 12                         yes (7.6%), no (92.4%)
B        lim 13                         yes (33.4%), no (66.6%)
B        lim 14                         yes (30.2%), no (69.8%)
B        lim 15                         yes (7.9%), no (92.1%)
B        lim 16                         yes (70.1%), no (29.9%)
B        lim 17                         yes (76.5%), no (25.5%)
B        lim 18                         yes (79.3%), no (20.7%)
C        age (months)                   from 36 to 71
C        scale                          from 0 to 100
C        hours-week                     from 0 to 168
CT       inf-relac (respondent)         parents (92.8%), tutor (2.4%), grandparents (4.8%)
CT       B-2 (received aids)            only personal assistance (67.1%),
                                        personal assistance and aids (32.9%),
                                        only technical aids (0.0%)
CT       B-5 (severity of limitations   moderate (69.9%), severe (19.2%),
         to perform ADL(3))             cannot perform ADL (10.9%)
(1) B=binary, C=continuous, CT=categorical. (2) See Annex I for the definition of limitations lim 1 to lim 18.
(3) ADL stands for Activities of Daily Living.

3 Weighted Multidimensional Scaling
for mixed-type data
Multidimensional Scaling (MDS) is a multivariate technique closely related to Prin-
cipal Component Analysis (PCA) and Correspondence Analysis (CA), well-known
techniques and widely used by applied researchers. The objective of these techniques
is the description and the pictorial representation of a data set. The information
provided by the data set may be a matrix of observations corresponding to a set of
continuous variables, which is the case of PCA, a contingency table obtained from the
classification of a set of objects according to categorical variables, which is the case
of CA, and for the MDS the data set is a square matrix of dissimilarities between a
set of objects. The main advantage of MDS is that it is able to cope with variables of
any type (binary, categorical, numerical, functional, ...) or even a mixture of them,
since using a proper “distance” function one can obtain the matrix of dissimilarities
between the set of objects (see Ramsay 1980). In particular, the purpose of MDS
is to construct a set of points in a Euclidean space whose interdistances are either
equal (metric or classical MDS) or approximately equal (nonmetric MDS) to those
in a given matrix of dissimilarities, in such a way that the interpoint distances
approximate the interobject dissimilarities as closely as possible. That is, given an
$n \times n$ matrix $\Delta$, containing the squares of dissimilarities between $n$
objects, the goal is to obtain an $n$-point configuration onto orthogonal axes (called
Euclidean configuration/map or MDS configuration), so that the squared $L_2$-distances
between the coordinates of these $n$ points coincide with the corresponding entries in
$\Delta$. These coordinates are called a metric scaling representation of $\Delta$.
Various possible measures of approximation between interpoint distances and interobject
dissimilarities can be used, each resulting in a different MDS configuration. In this
work, these coordinates are obtained via spectral decomposition. General context
references are Borg and Groenen (2005), Cox and Cox (2000) and Krzanowski and Marriott
(1994) as well as Gower and Hand (1996).
In the following we review the extension of classical MDS concepts to the weighted
context, derived by Boj, Claramunt, and Fortiana (2001). Recall that in the weighted
context each individual can represent a population group of different size.
Given $n$ $p$-dimensional vectors $\{z_i, 1 \le i \le n\}$ containing the information of
the $n$ different individuals we compute a squared distances matrix $\Delta$, with entries
$\delta^2(z_i, z_j)$, for $1 \le i, j \le n$. Since this information can be either of
qualitative or quantitative nature, or both, the adequacy of the dissimilarity function
used in the computation of $\Delta$ is crucial. Additionally, we have
$w = (w_1, \dots, w_n)'$, a vector of weights such that $w_i > 0$, for $i = 1, \dots, n$,
and $1'w = 1$, where $1$ is the $n \times 1$ vector of ones.
Suppose that we are interested in obtaining a metric scaling representation of $\Delta$,
provided that $\Delta$ satisfies the Euclidean requirement. Given $w$, define
$D_w = \mathrm{diag}(w)$, an $n \times n$ diagonal matrix whose diagonal is the vector of
weights, and $K_w = 1 w'$; then $J_w = I - K_w$ is the $w$-centering matrix, which is an
orthogonal projector with respect to $D_w$, idempotent and self-adjoint with respect to
$D_w$. Then, the doubly $w$-centered inner-product matrix is
\[ G_w = -\frac{1}{2} J_w \Delta J_w' \]
and
\[ F_w = D_w^{1/2} G_w D_w^{1/2} \tag{1} \]
is the standardized inner-product matrix. The Euclidean requirement is equivalent
to the positive semi-definiteness of $G_w$, hence to the existence of an $X_w$ such that
$G_w = X_w X_w'$, called in the weighted context a $w$-centered Euclidean representation
of $\Delta$, meaning that $w' X_w = 0$ and that the squared Euclidean interdistances
between the rows of $X_w$ coincide with the corresponding entries in $\Delta$ (see
footnote 1).
This matrix $X_w$ is the $w$-weighted metric scaling representation of $\Delta$, which is
obtained through the spectral decomposition of (1) as
\[ X_w = D_w^{-1/2} U \Lambda, \tag{2} \]
where $\Lambda^2$ is a diagonal matrix containing the eigenvalues of $F_w$, ordered in
decreasing order, and $U$ is the matrix whose columns are the corresponding eigenvectors.
The rows of $X_w$ contain the principal coordinates of the $n$ individuals and its columns
are the principal axes of this representation.
Footnote 1: If some of the eigenvalues of $G_w$ are negative, then $\Delta$ does not admit
a Euclidean configuration, which means that some of the axes in the representation are
imaginary. In this case, a possible solution (still valid in the weighted context, since
$w$ is an eigenvector of $G_w$ of 0 eigenvalue) is to consider the transformation
$\tilde{\Delta} = \Delta + c\,(1_n 1_n' - I_n)$, where $c \ge 2|\lambda|$ and $\lambda$ is
the negative eigenvalue of maximum module, which assures a Euclidean configuration for
$\tilde{\Delta}$.
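The construction in equations (1) and (2) can be sketched numerically. The following NumPy function is a minimal illustration, assuming that the input is a matrix of squared distances satisfying the Euclidean requirement; the function name `weighted_metric_scaling` and the tolerance `tol` are illustrative choices, not part of the original proposal.

```python
import numpy as np


def weighted_metric_scaling(delta, w, tol=1e-10):
    """Principal coordinates X_w of a squared-distance matrix delta under weights w."""
    delta = np.asarray(delta, dtype=float)
    w = np.asarray(w, dtype=float)
    w = w / w.sum()                          # enforce 1'w = 1
    n = w.size
    J = np.eye(n) - np.outer(np.ones(n), w)  # w-centering matrix J_w = I - 1 w'
    G = -0.5 * J @ delta @ J.T               # doubly w-centered inner products G_w
    s = np.sqrt(w)
    F = s[:, None] * G * s[None, :]          # F_w = D_w^{1/2} G_w D_w^{1/2}, eq. (1)
    eigval, U = np.linalg.eigh((F + F.T) / 2)
    order = np.argsort(eigval)[::-1]         # eigenvalues in decreasing order
    eigval, U = eigval[order], U[:, order]
    scale = np.abs(eigval).max()
    if eigval[-1] < -tol * scale:
        raise ValueError("delta does not satisfy the Euclidean requirement")
    keep = eigval > tol * scale              # drop numerically null axes
    # X_w = D_w^{-1/2} U Lambda, eq. (2): rows divided by sqrt(w_i),
    # columns multiplied by the square roots of the eigenvalues.
    X = (U[:, keep] * np.sqrt(eigval[keep])) / s[:, None]
    return X, eigval
```

The rows of the returned `X` are the principal coordinates; their weighted mean `w @ X` is numerically zero, and the ratio of the leading eigenvalues to their sum gives the proportion of variability explained by the first axes.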
In the following we describe two ways of obtaining a $w$-weighted metric scaling
representation of $\Delta$ from $p$ continuous and categorical variables measured on a set
of $n$ individuals with a weight vector $w$. The first one is the classical approach and
proceeds by computing Gower’s general similarity coefficient in the weighted context,
whereas the second one is called weighted related metric scaling and extends
the proposal of Cuadras and Fortiana (1998) to the weighted context.
3.1 Classical approach: Gower’s general similarity coefficient
After a review of the specialized literature, we found that the most popular similarity
measure in the context of mixed-type data is the well-known Gower’s general similarity
coefficient (see Gower 1971), which for two $p$-dimensional vectors $z_i$ and $z_j$ is
equal to
\[ s_{ij} = \frac{\sum_{h=1}^{p_1} \left(1 - |z_{ih} - z_{jh}|/R_h\right) + a + \alpha}{p_1 + (p_2 - d) + p_3}, \tag{3} \]
where $p = p_1 + p_2 + p_3$, $p_1$ is the number of continuous variables, $a$ and $d$ are
the number of positive and negative matches, respectively, for the $p_2$ binary variables,
$\alpha$ is the number of matches for the $p_3$ multi-state categorical variables, and
$R_h$ is the range of the $h$-th continuous variable. The entries of matrix $\Delta$ are
computed as
\[ \delta^2(z_i, z_j) = 1 - s_{ij}. \tag{4} \]
Gower (1971) proved that (4) satisfies the Euclidean requirement.
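As an illustration of formulas (3) and (4), the following sketch computes Gower's coefficient for one pair of individuals. It assumes that the three types of variables are passed separately; the function name and argument layout are hypothetical.

```python
import numpy as np


def gower_similarity(cont_i, cont_j, ranges, bin_i, bin_j, cat_i, cat_j):
    """Gower's general similarity coefficient s_ij of eq. (3) for one pair."""
    cont_i = np.asarray(cont_i, dtype=float)
    cont_j = np.asarray(cont_j, dtype=float)
    ranges = np.asarray(ranges, dtype=float)   # R_h, range of each continuous variable
    bin_i = np.asarray(bin_i, dtype=bool)
    bin_j = np.asarray(bin_j, dtype=bool)
    cont_term = np.sum(1.0 - np.abs(cont_i - cont_j) / ranges)
    a = np.sum(bin_i & bin_j)                  # positive matches (both present)
    d = np.sum(~bin_i & ~bin_j)                # negative matches (both absent)
    alpha = sum(x == y for x, y in zip(cat_i, cat_j))  # categorical matches
    denom = cont_i.size + (bin_i.size - d) + len(cat_i)
    if denom == 0:
        raise ValueError("no comparable variables for this pair")
    return (cont_term + a + alpha) / denom


def gower_squared_distance(*args):
    """Squared distance of eq. (4): delta^2 = 1 - s_ij."""
    return 1.0 - gower_similarity(*args)
```

Negative matches on binary variables are removed from the denominator, so two children who both lack a given limitation are not counted as more similar on that account.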
3.2 Weighted Related Metric Scaling
Like all distance functions satisfying additivity with respect to variables, the distance
based on Gower’s general similarity coefficient implicitly ignores any association (e.g.
correlation) between variables (Gower 1992, Krzanowski 1994). Alternative metrics
have been proposed in the literature to overcome that problem; among them, we decided
to extend Related Metric Scaling (Cuadras and Fortiana 1998) to the weighted
context. Related metric scaling is a multivariate technique that allows one to obtain
a unique representation of a set of individuals from several distance matrices computed
on the same set of individuals. The method is based on the construction of a
joint metric that satisfies several axioms related to the property of identifying and
discarding redundant or repeated information.
Given a set of $m \ge 2$ matrices of squared distances measured on the same group of
$n$ individuals, $\{\Delta_\alpha\}_{\alpha=1,\dots,m}$, and a vector of weights $w$, the first requirement in the
construction of the w-joint metric is that all matrices $\Delta_\alpha$ have the same geometric
variability. This concept was introduced by Cuadras and Fortiana (1995) as a variant
of Rao's diversity coefficient (Rao 1982a, Rao 1982b) and, given a squared distances
matrix $\Delta_\alpha = (\delta^2(z_i, z_j))_{1 \le i,j \le n}$, its sample version in the weighted context is:
$$V_w(\delta_\alpha) = \frac{1}{2}\, w' \Delta_\alpha w = \mathrm{tr}(F_{w,\alpha}), \qquad (5)$$
where $F_{w,\alpha}$ is the corresponding standardized inner-product matrix.
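The identity in formula (5) can be checked numerically. The sketch below is only illustrative: it assumes that the weights sum to one, that the squared-distance matrix is symmetric, and that the standardized matrix is obtained with $D_w = \mathrm{diag}(w)$, so that its trace is $\sum_i w_i g_{ii}$. The function names and the three-point example are hypothetical.

```python
def w_centered_inner_product(delta2, w):
    """Doubly w-centered inner-product matrix G_w of a squared-distance matrix.

    Entry-wise form of G_w = -1/2 (I - 1w') Delta (I - w1'), assuming sum(w) == 1
    and a symmetric delta2.
    """
    n = len(w)
    row = [sum(w[k] * delta2[i][k] for k in range(n)) for i in range(n)]  # Delta w
    tot = sum(w[i] * row[i] for i in range(n))                            # w' Delta w
    return [[-0.5 * (delta2[i][j] - row[i] - row[j] + tot) for j in range(n)]
            for i in range(n)]


def geometric_variability(delta2, w):
    """Left-hand side of formula (5): (1/2) w' Delta w."""
    n = len(w)
    return 0.5 * sum(w[i] * w[j] * delta2[i][j] for i in range(n) for j in range(n))


def trace_standardized(delta2, w):
    """tr(F_w) with F_w = D_w^{1/2} G_w D_w^{1/2}, that is, sum_i w_i g_ii."""
    g = w_centered_inner_product(delta2, w)
    return sum(w[i] * g[i][i] for i in range(len(w)))


# Hypothetical example: three points at 0, 1 and 3 on a line.
weights = [0.5, 0.3, 0.2]
delta2 = [[0.0, 1.0, 9.0],
          [1.0, 0.0, 4.0],
          [9.0, 4.0, 0.0]]
v = geometric_variability(delta2, weights)
t = trace_standardized(delta2, weights)
```

For points on a line, both quantities coincide with the weighted variance of the coordinates, which is 1.29 in this example.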
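For each squared distances matrix $\Delta_\alpha$, $\alpha = 1,\dots,m$, we consider its doubly w-centered
inner-product matrix $G_{w,\alpha}$ and its standardized inner-product matrix
$$F_{w,\alpha} = D_w^{1/2}\, G_{w,\alpha}\, D_w^{1/2}, \quad \text{for } \alpha = 1,\dots,m,$$
and obtain the w-joint metric as that whose standardized inner-product matrix is:
$$F_w = \sum_{\alpha=1}^{m} F_{w,\alpha} - \lambda \sum_{\alpha \neq \beta} F_{w,\alpha}^{1/2} F_{w,\beta}^{1/2}, \qquad (6)$$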
where $\lambda$ is an interaction parameter that can be optimized according to several
criteria. For example, in this work, we are interested in obtaining maximum explained
variability in a Euclidean configuration, and in Section 4 we refer to the Euclidean
configuration obtained by this procedure as optimum weighted related metric scaling. See
Esteve (2003) for a broad and rigorous study on the construction of metrics.
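A minimal NumPy sketch of formula (6) follows. It assumes $D_w = \mathrm{diag}(w)$ with weights summing to one, takes the matrix square roots as symmetric roots obtained from an eigendecomposition, and expects the input matrices to have already been rescaled to a common geometric variability. All function names are hypothetical.

```python
import numpy as np


def standardized_inner_product(delta2, w):
    """F_w = D_w^{1/2} G_w D_w^{1/2}, with G_w = -1/2 (I - 1w') Delta (I - w1')."""
    n = len(w)
    J = np.eye(n) - np.outer(np.ones(n), w)   # I - 1 w'
    G = -0.5 * J @ delta2 @ J.T
    d = np.sqrt(w)
    return d[:, None] * G * d[None, :]


def psd_sqrt(F):
    """Symmetric square root of a positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh((F + F.T) / 2.0)
    vals = np.clip(vals, 0.0, None)           # clear round-off negatives
    return (vecs * np.sqrt(vals)) @ vecs.T


def joint_standardized_matrix(Fs, lam=None):
    """Formula (6) for a list of standardized inner-product matrices.

    lam defaults to 1/m, the value for which the listed properties hold.
    """
    m = len(Fs)
    lam = 1.0 / m if lam is None else lam
    roots = [psd_sqrt(F) for F in Fs]
    joint = sum(Fs)
    for a in range(m):
        for b in range(m):
            if a != b:                         # sum over ordered pairs alpha != beta
                joint = joint - lam * (roots[a] @ roots[b])
    return joint


# Hypothetical configuration of four weighted points in the plane.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
w = np.array([0.1, 0.2, 0.3, 0.4])
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
F = standardized_inner_product(d2, w)
F_joint = joint_standardized_matrix([F, F])   # two identical sources
```

Feeding the same matrix twice returns that matrix unchanged, which is the sense in which the cross term discards repeated information.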
The second summand of formula (6) is the key tool for eliminating redundant information
coming from different sources (different variable types, in our case). Roughly
speaking, this second term makes the difference with Gower's metric and provides
the desired flexibility when dealing with mixed-type data. Formula (6) extends formula (8)
of Cuadras (1998) to the weighted context; in Cuadras (1998) the interaction parameter
was fixed to $\lambda = 1/m$ and inner-product matrices were used instead of standardized
inner-product matrices. Formula (6) is constructed so that the following properties are
fulfilled when $\lambda = 1/m$. We state them explicitly for $m = 2$:
1. If $\Delta_1 = 0$ then $F_w = F_{w,2}$,
2. If $\Delta_1 = \Delta_2$ then $F_w = F_{w,1} = F_{w,2}$,
3. If the Euclidean configurations associated with $\Delta_1$ and $\Delta_2$ generate orthogonal
subspaces of $\mathbb{R}^n$, then $F_w = F_{w,1} + F_{w,2}$,
4. $F_w \ge 0$.
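The first three properties can be verified directly from formula (6). The short derivation below is a sketch for $m = 2$ and $\lambda = 1/2$, assuming symmetric square roots of the standardized inner-product matrices:

```latex
% m = 2 and \lambda = 1/2 in formula (6):
F_w = F_{w,1} + F_{w,2}
      - \tfrac{1}{2}\left( F_{w,1}^{1/2} F_{w,2}^{1/2} + F_{w,2}^{1/2} F_{w,1}^{1/2} \right).

% Property 1: \Delta_1 = 0 gives F_{w,1} = 0, so both cross terms vanish:
F_w = F_{w,2}.

% Property 2: \Delta_1 = \Delta_2 gives F_{w,1} = F_{w,2} = F, so
F_w = 2F - \tfrac{1}{2}\,(F + F) = F.

% Property 3: orthogonal subspaces give F_{w,1}^{1/2} F_{w,2}^{1/2} = 0, so
F_w = F_{w,1} + F_{w,2}.
```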
Principal coordinates are computed directly from matrix $F_w$ of (6) but, if
necessary, matrix $\Delta$ can be recovered with the following formula:
$$\Delta = g_w 1' + 1 g_w' - 2\, G_w, \qquad (7)$$
where $G_w = D_w^{-1/2} F_w D_w^{-1/2}$ and $g_w = \mathrm{diag}(G_w)$.
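Formula (7) can be applied entry by entry, as in the following sketch. It assumes $D_w = \mathrm{diag}(w)$ with strictly positive weights; the function name and the two-point example are hypothetical.

```python
import math


def squared_distances_from_F(F, w):
    """Recover Delta from a standardized inner-product matrix, formula (7).

    G_w = D_w^{-1/2} F_w D_w^{-1/2}, then delta2_ij = g_ii + g_jj - 2 g_ij.
    """
    n = len(w)
    G = [[F[i][j] / math.sqrt(w[i] * w[j]) for j in range(n)] for i in range(n)]
    return [[G[i][i] + G[j][j] - 2.0 * G[i][j] for j in range(n)] for i in range(n)]


# Hypothetical example: two points at 0 and 2 with weights 0.25 and 0.75.
# The weighted mean is 1.5, so the centered coordinates are -1.5 and 0.5.
w = [0.25, 0.75]
G_true = [[2.25, -0.75], [-0.75, 0.25]]
F = [[math.sqrt(w[i] * w[j]) * G_true[i][j] for j in range(2)] for i in range(2)]
delta2 = squared_distances_from_F(F, w)
```

The recovered off-diagonal entry is 2.25 + 0.25 + 1.5 = 4, the squared distance between the two original points.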
4 A case study
In this Section we apply the two techniques described above to the data matrix $Z$,
whose rows are 84 records representing 13,296 dependent children with possibilities of
receiving public aid (see Section 2.2 for the data set description). Hereafter, Gower's
metric denotes the distance matrix derived using formula (4) and w-joint metric
denotes the distance matrix obtained from formula (7).
In this study, the w-joint metric is constructed from $m = 3$ different squared distances
matrices measured on the same set of $n = 84$ individuals. We call them $\Delta_1$, $\Delta_2$ and
$\Delta_3$. The vector of weights, $w$, is estimated by INE from the survey, taking into
account the sampling design.
Matrix $\Delta_1$ contains the information related to the three numerical variables considered
in the study: the age of the child, the scale value reached by the child
and the weekly hours of attention. In particular, we compute matrix $\Delta_1$ using a robust
version of Mahalanobis' distance
$$\delta^2(z_i, z_j) = (z_i - z_j)'\, S^{-1} (z_i - z_j),$$
that consists of estimating the entries in the covariance matrix $S$ in a robust way.
The variance of the j-th continuous variable is estimated from a 5%-trimmed sample,
as suggested by Tukey (1960). A robust estimator for the covariance between
variables $Z_j$ and $Z_k$ is obtained from
$$s_{jk} = \frac{1}{4}\left(\hat\sigma^2_{+} - \hat\sigma^2_{-}\right),$$
where $\hat\sigma^2_{+}$ and $\hat\sigma^2_{-}$ are robust estimators of the variances of $Z_j + Z_k$ and $Z_j - Z_k$,
respectively (see Gnanadesikan 1997).
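The covariance estimator above can be sketched as follows. The trimming rule used here (dropping a fraction `trim` of the observations from each tail before computing the sample variance) is an assumption, since the text does not specify how the 5%-trimmed sample is formed; the function names and the data are hypothetical. The sketch covers only the entries of $S$, not the inversion needed for the Mahalanobis distance.

```python
def trimmed_variance(x, trim=0.05):
    """Sample variance after dropping a fraction `trim` of the values from each tail.

    Assumed trimming rule; with trim=0 this is the ordinary sample variance.
    """
    xs = sorted(x)
    k = int(trim * len(xs))
    kept = xs[k:len(xs) - k] if k > 0 else xs
    n = len(kept)
    mean = sum(kept) / n
    return sum((v - mean) ** 2 for v in kept) / (n - 1)


def robust_covariance(zj, zk, trim=0.05):
    """s_jk = (1/4) (var(Z_j + Z_k) - var(Z_j - Z_k)), with trimmed variances."""
    plus = [a + b for a, b in zip(zj, zk)]
    minus = [a - b for a, b in zip(zj, zk)]
    return 0.25 * (trimmed_variance(plus, trim) - trimmed_variance(minus, trim))


# Hypothetical data; with no trimming the estimator equals the usual covariance.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]
cov_xy = robust_covariance(x, y, trim=0.0)
```

Without trimming, the expression reduces to the identity cov(X, Y) = (var(X + Y) - var(X - Y))/4, which gives 1.0 for the data above; trimming replaces the two variances with estimates that are less sensitive to outliers.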
Matrix $\Delta_2$ contains the information concerning three multi-state categorical variables,
that is, information about the respondent, the type of aid received and the
severity of limitations to perform ADL. In this case, we start by computing a similarity
matrix $S_2$, that contains Sokal-Michener's pairwise similarities, and obtain
$\Delta_2 = 2\,(1\,1' - S_2)$.
Matrix $\Delta_3$ contains the information of eighteen binary variables (sex and the limitations
described in the Annex). In this case, we compute a similarity matrix $S_3$,
whose entries are Jaccard's pairwise similarities, and obtain $\Delta_3 = 2\,(1\,1' - S_3)$.
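The two similarity coefficients and the transformation into squared distances can be sketched as follows. The function names and the example records are hypothetical, binary variables are assumed to be coded 0/1, and the value returned by the Jaccard coefficient when both records are all zeros is an assumed convention.

```python
def sokal_michener(u, v):
    """Simple matching: share of variables on which two records agree."""
    return sum(1 for a, b in zip(u, v) if a == b) / len(u)


def jaccard(u, v):
    """Jaccard similarity for 0/1 vectors: joint presences over non joint-absences."""
    both = sum(1 for x, y in zip(u, v) if x == 1 and y == 1)
    mismatches = sum(1 for x, y in zip(u, v) if x != y)
    if both + mismatches == 0:
        return 1.0  # both records all zero: treated as identical (assumed convention)
    return both / (both + mismatches)


def squared_distance_matrix(records, similarity):
    """Delta = 2 (1 1' - S), computed entry by entry."""
    n = len(records)
    return [[2.0 * (1.0 - similarity(records[i], records[j])) for j in range(n)]
            for i in range(n)]


# Hypothetical records.
categorical = [("mother", "economic", "moderate"),
               ("mother", "none", "moderate")]
binary = [(1, 1, 0, 0), (1, 0, 1, 0)]
delta2_cat = squared_distance_matrix(categorical, sokal_michener)
delta2_bin = squared_distance_matrix(binary, jaccard)
```

The categorical records agree on two of three variables, so their squared distance is 2(1 - 2/3); the binary records share one joint presence and differ on two variables, so their squared distance is 2(1 - 1/3).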
Finally, from formula (6) we construct the w-joint metric and obtain the Euclidean
configurations shown in Figures 1–4. In this study, the interaction parameter $\lambda$ is
optimized so that these configurations have maximum explained variability.
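One possible way to carry out this optimization is a grid search over $\lambda$, sketched below with NumPy. Several choices here are assumptions rather than the procedure actually used in the study: the grid itself, the definition of explained variability as the share of the $k$ largest eigenvalues of $F_w$ in their total, and the rule of discarding values of $\lambda$ for which $F_w$ is not positive semi-definite (non-Euclidean configurations). All names and the example data are hypothetical.

```python
import numpy as np


def psd_sqrt(F):
    """Symmetric square root of a positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh((F + F.T) / 2.0)
    vals = np.clip(vals, 0.0, None)            # clear round-off negatives
    return (vecs * np.sqrt(vals)) @ vecs.T


def joint_matrix(Fs, lam):
    """Formula (6) for a list of standardized inner-product matrices."""
    roots = [psd_sqrt(F) for F in Fs]
    joint = sum(Fs)
    for a in range(len(Fs)):
        for b in range(len(Fs)):
            if a != b:
                joint = joint - lam * (roots[a] @ roots[b])
    return (joint + joint.T) / 2.0             # enforce exact symmetry


def optimum_lambda(Fs, grid, k=3, tol=1e-9):
    """Grid search: lambda maximising the share of the k largest eigenvalues."""
    best_lam, best_share = None, -np.inf
    for lam in grid:
        vals = np.linalg.eigvalsh(joint_matrix(Fs, lam))
        if vals.min() < -tol * max(1.0, vals.max()):
            continue                            # not Euclidean: skip this lambda
        share = np.sort(vals)[::-1][:k].sum() / vals.sum()
        if share > best_share:
            best_lam, best_share = float(lam), float(share)
    return best_lam, best_share


def standardized_from_points(X, w):
    """F_w for Euclidean points X: D_w^{1/2} Xc Xc' D_w^{1/2}, Xc w-centered."""
    Xc = X - w @ X
    Y = np.sqrt(w)[:, None] * Xc
    return Y @ Y.T


# Hypothetical data: two sources of information on five weighted individuals.
w = np.array([0.1, 0.2, 0.3, 0.2, 0.2])
F1 = standardized_from_points(np.array([[0., 0.], [1., 0.], [0., 2.], [3., 1.], [2., 2.]]), w)
F2 = standardized_from_points(np.array([[1.], [0.], [2.], [5.], [3.]]), w)
lam_opt, share_opt = optimum_lambda([F1, F2], np.linspace(0.0, 1.0, 11), k=1)
```

The value $\lambda = 0$ always yields a Euclidean configuration, because the sum of positive semi-definite matrices is positive semi-definite, so a grid that contains zero never comes back empty.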
4.1 Euclidean configurations
In Figure 1 we depict three-dimensional principal coordinate representations of the
data set obtained via the classical approach (panels (a1) and (a2)) and through
weighted related metric scaling technique (panels (b1) and (b2)). We give two differ-
ent views of each representation for better comparison.
Two main advantages can be observed when using the new proposal based on the
w-joint metric. Firstly, there is an increase in the percentage of explained variability
(47.08% versus 37.17% for the classical approach) and, secondly, the group of
non-dependent children is quite well identified. A possible
explanation is that the information contained in the continuous variables is
better incorporated with the w-joint metric than with the classical approach. Hence,
hereafter, and with the aim of defining homogeneous profiles of dependent children,
we focus our attention on the w-joint metric representation.
4.2 Looking for influential variables
Next, we are interested in capturing the underlying structure of the groups obtained
in a natural way through weighted related metric scaling. To determine which vari-
ables are more powerful in explaining the homogeneity within groups we compute the
[Figure 1: Euclidean maps obtained via (a) classical MDS and (b) optimum weighted
related metric scaling. Panels (a1) and (a2): Gower's metric, views 1 and 2 (classical
approach, 37.17% of explained variability). Panels (b1) and (b2): optimum w-joint metric,
views 1 and 2 (weighted related metric scaling, 47.08% of explained variability).
Axes: x1, x2, x3. Legend: No dependency, Degree II level 1, Degree II level 2,
Degree III level 1, Degree III level 2.]
correlation coefficients between the original variables and the first three principal
axes, shown in Table 5. We consider Pearson's correlation coefficient for continuous
variables, whereas Spearman's correlation coefficient is computed for categorical
variables.
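The two coefficients can be sketched in a few lines. The version below is unweighted, which is an assumption (the text does not say whether the survey weights enter the correlations), and it computes Spearman's coefficient as Pearson's coefficient on average ranks. The function names and the data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(x):
    """Ranks starting at 1, with ties replaced by their average rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's coefficient: Pearson's coefficient computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: a binary limitation against a principal coordinate.
limitation = [0, 0, 1, 1]
coordinate = [0.1, 0.2, 0.3, 0.4]
rho = spearman(limitation, coordinate)
```

In the example the tied values of the binary variable receive ranks 1.5 and 3.5, and the resulting coefficient is 4/sqrt(20).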
From Table 5 it can be seen that continuous variables, such as age, scale or hours-week,
are more correlated with the principal axes obtained from the w-joint metric
than with those obtained via Gower's metric. Moreover, categorical variable B5 and
binary variables lim 2, lim 3, lim 9 and lim 11 also have influence on the principal axes.
For instance, when using the w-joint metric, the first principal coordinate is mostly
determined by variables lim 2, lim 3, lim 9, lim 11, B5, scale and hours-week, whereas
age and hours-week are influential variables for the second principal axis. Finally,
lim 11, age and scale have great influence on the third principal coordinate. Such
information will be valuable in the definition of homogeneous profiles. Bearing this
Table 5: Correlation coefficients (Pearson for continuous variables and Spearman for
categorical ones) between the principal coordinates and the considered variables.

                       Gower's metric               Optimum w-joint metric
Variable      1st P.C.   2nd P.C.   3rd P.C.   1st P.C.   2nd P.C.   3rd P.C.
                (x1)       (x2)       (x3)       (x1)       (x2)       (x3)
sex            0.3187     0.3291     0.2791    -0.0844     0.1614     0.0771
lim 1         -0.5261*   -0.0153     0.1754     0.3529    -0.2787    -0.3602
lim 2         -0.6452*    0.0450     0.1500     0.5226*   -0.2826    -0.2988
lim 3         -0.7548*   -0.0622     0.0908     0.5269*   -0.2890    -0.4046
lim 5         -0.3040     0.0550     0.1006     0.2376    -0.0197    -0.0322
lim 6         -0.0507    -0.0784     0.3159    -0.0438    -0.0922    -0.0715
lim 7          0.0409     0.0288     0.1031    -0.0667     0.0152     0.1167
lim 8         -0.5914*   -0.2052    -0.0842     0.3816    -0.1556    -0.2870
lim 9         -0.7910*   -0.1824    -0.0318     0.5082*   -0.2490    -0.4591
lim 10        -0.3183     0.3550    -0.2097     0.3781    -0.0794    -0.0156
lim 11        -0.5475*    0.4665    -0.0522     0.5745*    0.1485    -0.6378*
lim 12        -0.1840     0.1405    -0.4131     0.2392    -0.2743     0.0468
lim 13        -0.1332     0.6302*   -0.4703     0.2756     0.0154    -0.0133
lim 14        -0.2475     0.6663*   -0.1949     0.4333    -0.0048     0.0070
lim 15        -0.1164     0.2514     0.0755     0.0471    -0.2443    -0.3135
lim 16         0.0854     0.7082*    0.4083     0.1135     0.2739    -0.0187
lim 17         0.1268     0.4842     0.5499*    0.0427     0.2801    -0.0749
lim 18        -0.2546     0.3283    -0.2625     0.3271     0.0232    -0.0244
age            0.2712     0.0286     0.0448     0.2062     0.6345*    0.7274*
scale         -0.6824*    0.5046*   -0.0776     0.7425*    0.1188    -0.6761*
hours-week    -0.5723*    0.1520    -0.4066     0.6839*   -0.5928*    0.0779
inf-relac     -0.1459    -0.1216    -0.1411     0.1557     0.0309     0.1409
B2            -0.3906    -0.3593     0.5739*    0.1021    -0.1416    -0.2125
B5            -0.4657     0.4320    -0.1206     0.5812*   -0.2005    -0.1699

Notes: Asterisks mark coefficients greater than 0.5 in absolute value.
Source: Own elaboration.
objective in mind, in Figures 2–3 we plot several projections (in dimension two) of
the principal coordinate representations shown in panels (b1) and (b2) of Figure 1,
using the information of those variables more correlated with the principal axes (lim
2, lim 3, lim 9, lim 11, B5, age, scale, hours-week) to color the individuals. In this
way, groups of homogeneous individuals will become apparent.
After analysing all possible projections, we decided to include only the most representative
ones. For this reason, Figure 2 contains the principal coordinate representation
(3rd P.C. versus 1st P.C.) obtained from the w-joint metric. Looking at panel (f),
and comparing it with panel (a), we can see that lim 11 (the child can hardly do the
things that other children do at the same age) is crucial in splitting the individuals
into two groups: dependants and non-dependants. In fact, when a child is declared as
dependant, lim 11 is almost always present (95.1% and 90.8% in Degree II level 1
and 2, respectively, and 100% in Degree III). The remaining variables with correlation
coefficients greater than 0.50 in absolute value (lim 2, lim 3, lim 9, age, scale,
hours-week and B5) are quite useful for constructing dependency profiles. That is,
children not affected by dependency show a moderate severity in limitations linked to
ADL. Besides, they suffer neither the limitations associated with vertical movements
(lim 2, lim 3 and lim 9) nor lim 11. Their scale value is fully identified using the 1st
and 3rd principal coordinates and, in most cases, their ages are between 48 and 71
months.
[Figure 2: Principal coordinate representations obtained via optimum weighted related
metric scaling. Projections of Figure 1 (panels (b1) and (b2)) configurations onto 1st
and 3rd P.C. (axes x1 and x3). Individuals colored by: (a) scale, (b) B5,
(c) limitation 2, (d) limitation 3, (e) limitation 9, (f) limitation 11.]
The most important features in dependent children can be summarized as follows:
none of them is under 50 points in the dependency scale; all exhibit moderate severity,
at least; lim 2, lim 3 and lim 9 are not suffered by those with a scale value between
50 and 75 and they are hardly manifested in children with Degree III (37.9% and
5.2% in level 1 and 2, respectively). That is, these three limitations are associated to
severe dependency cases. Summing up, we can establish the following profiles:
[Pr1] Non-dependent children (19%): Moderate difficulty to perform ADL, none of
limitations 2, 3, 9, 11 are suffered, less than 56 weekly hours of attention are
needed by 79% of them.
[Pr2] Severe dependent children (40.6%): Composed by 100% of children in Degree
II (level 1,2). None of limitations 2, 3, 9 are suffered, almost all (93%) suffer
limitation 11 and 59% of them need less than 56 weekly hours of attention.
[Pr3] Major dependent children (25.4%): Composed by 75% of children in Degree III
level 1. All of them suffer limitations 9, 11 and 53% of them need more than
56 weekly hours of attention.
[Pr4] Utmost dependent children (15%): Composed by 100% of children in Degree III
level 2 and 25% of children in Degree III level 1. Severe difficulties to perform
ADL or cannot perform them. All of them suffer limitation 11 and almost all
(95%) suffer limitations 2, 3, 9, 82% of them need more than 56 weekly hours
of attention, 88% of whom need more than 155 weekly hours of attention.
Profiles [Pr1] and [Pr4] clearly describe opposite situations, whereas profiles [Pr2]
and [Pr3] consider different realities under the same Degree of dependency. In fact,
it is possible to find individuals with the same scale value but with huge differences
in difficulties to perform ADL, age and necessity of attention.
If we focus the attention on the relationship between the intensity of severities (B5)
and the scale value reached by each child, we see that both variables are directly
related. In fact, children in Degree II level 1 show a moderate severity and 78.8%
of those in Degree III level 2 cannot perform ADL. On the other hand, looking at
Figure 2 it seems that it is difficult to distinguish between severe difficulty to perform
ADL and cannot perform ADL (there are groups of individuals with same values for
limitations 2, 3, 9, 11, but not for B5). This may lead us to conclude that it would
be better to join those categories in only one.
Figure 3 contains several projections of the principal coordinate representation ob-
tained from the w-joint metric. Panels (a1)–(c1) depict 2nd P.C. versus 1st P.C.,
panels (a2)-(c2) show 3rd P.C. versus 1st P.C. and, finally, panels (a3)–(c3) contain
3rd P.C. versus 2nd P.C. We prefer to include again variable scale (panels (a1)–(a3)),
for better comparison. For example, we can see the usefulness of variable age in panel
(b3). A special case is that of variable hours-week, which seems to be contradictory
with the groups defined by variable scale.
Although the Spanish Act establishes a direct link between the amount of time devoted
to caring for dependent people and the intensity of the dependency, one of the most
surprising results is that there is no direct relationship between the number of weekly
hours of care and the level of dependency. In fact, there are individuals that need
more than 155 hours per week in opposite situations (61% of children in Degree III
level 2 versus 21.2% that are non-dependent). See Figure 4 and also panels (a3) and
(c3) of Figure 3. This same effect was noticed by Albarrán and Alonso (2006) and
Gispert Magarolas, Clot-Razquin, Rivero Fernández, Freitas Ramírez, Ruíz-Ramos,
Ruíz Luque, Busquets Bou, and Argimón Pallàs (2008). Similar results were found
in Bihan and Martin (2006) when studying some European systems of assistance to
dependent people. This fact reflects that the Spanish scale is not properly measuring
the severity of the situation of dependency, which is quite worrying, since the scale
value reached by an individual determines his/her access to the benefits established by
the Spanish Act 39/2006.
5 Concluding remarks
Disability is a large public health problem even in developed countries. Dependence
is the main impact factor on health and quality of life. Dependency is most common
among people over 65. For this reason, some authors
join both states, dependency and ageing (see Casado and López 2001, Moragas and
Cristofol 2003, and López (Dir.), Comas, Monteverde, Casado, Caso, and Ibem 2005).
However, it is not true that all people over a certain age would be included in the
group of population affected by this contingency. It would be possible to prevent
[Figure 3: Principal coordinate representations obtained via optimum weighted related
metric scaling. Projections of Figure 1 (panels (b1) and (b2)) configurations onto two
principal axes. Columns: 2nd P.C. vs. 1st P.C., 3rd P.C. vs. 1st P.C., 3rd P.C. vs. 2nd P.C.
Rows: (a1)–(a3) scale, (b1)–(b3) age (in months), (c1)–(c3) hours-week.]
this situation if people followed a healthy way of life, if the health care system were
more efficient than it is today and if we were able to have an early diagnosis of
chronic illnesses (Zunzunegui 1998). Although the former is true, it must be said
that it is possible to find people in dependency at any age during the lifetime, even
in childhood. There are many studies about dependence among people over 65;
however, concerning children, there is a lack of research. This study contributes to
this line. Our main purpose is to establish the existing relationship among different
[Figure 4: Principal coordinate representations obtained via optimum weighted related
metric scaling (axes x1, x2, x3). Individuals identified by (a) scale and (b) hours-week.]
variables (numerical, categorical and binary) referring to Spanish children between 3
and 6 years old and their functional dependence in basic activities of daily living.
Data come from the Survey about Disabilities, Personal Autonomy and Dependence
Situations, EDAD 2008 (Spanish National Institute of Statistics, 2008), where each
individual represents a number of similar individuals. The number of multivariate
techniques that can cope with mixed-type and weighted data is quite scarce. In this
paper we propose a multivariate methodology for mixed-type data to search for
homogeneous profiles. In particular, we extend the work of Grané and Romera (2009) to
the weighted context. Moreover, we include an interaction parameter which provides
more flexibility when dealing with mixed-type data. The main findings are as follows. Firstly,
this new technique outperforms the classical one based on Gower's metric, in the sense
that homogeneous groups are better separated. This may be due to the possibility
of constructing an ad-hoc metric with the property of discarding redundant information.
Secondly, limitation 11 (the child can hardly do the things that other children do
at the same age) seems to be crucial in splitting the individuals into
dependants and non-dependants. This finding is in line with USDHHS (2004)
and complements the universal definition and classification established by the
International Classification of Functioning for children and youth (WHO 2001a). Thirdly,
the time devoted to the care of dependent children and the intensity of the dependency are
not directly related. This was also found by Albarrán and Alonso (2006), among
others, and reinforces the finding that the Spanish scale is not properly measuring
the severity of the situation of dependency. This is quite relevant, since the scale
value reached by an individual determines his/her access to the benefits established by
the Spanish Act 39/2006.
References
Albarrán, I. and P. Alonso (2006). Dependent Individuals Classification based on
the 1999 Disabilities, Impairments and Health Status Survey. Revista Española
de Salud Pública 80 (4), 349–360.
Albarrán, I., P. Alonso, and C. Bolancé (2009). A Comparison of the Spanish, the
French and the German Valuation Scales to Measure Dependency and Public
Support for People with Disabilities. Revista Española de Salud Pública 83 (3),
379–392.
Anderberg, M. (1973). Cluster analysis for applications. Technical report, DTIC
Document.
Benavente, G. and D. Pfeiffer (2002). Bibliography–an annotated bibliography on
children with disabilities. Disability Studies Quarterly 22 (2), 135–159.
Bihan, B. and C. Martin (2006). A comparative case study of care systems for frail
elderly people: Germany, Spain, France, Italy, United Kingdom and Sweden.
Social Policy & Administration 40 (1), 26–46.
Boj, E., M. Claramunt, and J. Fortiana (2001). Herramientas estadísticas para el
estudio de perfiles de riesgo. Instituto de Actuarios Españoles, Tercera Época (7),
59–89.
Boj, E., M. Claramunt, and J. Fortiana (2004). Análisis multivariante aplicado a
la selección de factores de riesgo en la tarificación, Volume 88. Cuadernos de
la Fundación MAPFRE Estudios.
Borg, I. and P. J. F. Groenen (2005). Modern Multidimensional Scaling: Theory
and Applications (2nd ed.). New York: Springer.
Campbell, M. (1986). An integrated system for estimating the risk premium of
individual car models in motor insurance. ASTIN Bulletin 16 (2), 165–183.
Cannell, M., B. Brumback, E. Bouldin, J. Hess, D. Wood, P. Sloyer, J. Reiss, and
E. Andresen (2011). Age group differences in healthcare access for people with
disabilities: Are young adults at increased risk? Results from the 2007 Florida
Behavioral Risk Factor Surveillance System. Journal of Adolescent Health 49,
219–221.
Casado, D. and G. López (2001). Vejez, dependencia y cuidados de larga duración.
Barcelona: Fundación La Caixa.
Claeson, M. and R. J. Waldman (2000). The evolution of child health programmes
in developing countries: from targeting diseases to targeting people. Bulletin
of the World Health Organization 78 (10), 1234–1245.
Cox, T. F. and M. A. A. Cox (2000). Multidimensional Scaling (2nd ed.).
London, Boca Raton, FL.: Chapman & Hall.
Crews, J. and D. Lollar (2008). The public health dimensions of disability.
International Encyclopedia of Public Health, 422–432.
Cuadras, C. M. (1998). Multidimensional dependencies in ordination and classification.
In K. Fernández and A. Morineau (Eds.), Analyses Multidimensionelles
des Données, Saint-Mandé (France), pp. 15–25. CISIA-CERESTA.
Cuadras, C. M. and J. Fortiana (1995). A continuous metric scaling solution for a
random variable. Journal of Multivariate Analysis 52, 1–14.
Cuadras, C. M. and J. Fortiana (1998). Visualizing categorical data with related
metric scaling. In J. Blasius and M. Greenacre (Eds.), Visualization of Categorical
Data, London. Academic Press.
Esteve, A. (2003). Distancias estadísticas y relaciones de dependencia entre conjuntos
de variables. Ph.D. thesis, Universitat de Barcelona.
Giannakouris, K. (2009). Ageing characterises the demographic perspectives of the
European societies. Eurostat: Statistics in Focus. Retrieved 9, 8–72.
Gispert Magarolas, R., G. Clot-Razquin, A. Rivero Fernández, A. Freitas Ramírez,
M. Ruíz-Ramos, C. Ruíz Luque, E. Busquets Bou, and J. Argimón Pallàs
(2008). Dependence Profile in Spain: An Analysis from the Disability Survey
of 1999. Revista Española de Salud Pública 82 (6), 653–665.
Gnanadesikan, R. (1997). Methods for Statistical Data Analysis of Multivariate
Observations. New York: John Wiley & Sons.
Gower, J. C. (1971). A general coefficient of similarity and some of its properties.
Biometrics 27, 857–874.
Gower, J. C. (1992). Generalized biplots. Biometrika 79, 475–493.
Gower, J. C. and D. Hand (1996). Biplots. London, UK: Chapman & Hall.
Grammenos, S. (2003). Feasibility Study. Comparable statistics in the area of care
of dependent adults in the European Union. European Commission. Luxembourg.
Grané, A. and R. Romera (2009). Sensitivity and robustness in MDS configurations
for mixed-type data: A study of the economic crisis impact on socially
vulnerable Spanish people. Working Paper 10-35, Universidad Carlos III de
Madrid. Statistics and Econometrics Series 19.
Grupo de Atención Temprana (2000). Real Patronato de Prevención y Atención a
Personas con Minusvalía. Madrid.
Hauser-Cram, P., M. E. Warfield, and J. P. Shonkoff (2001). Children with disabilities:
A longitudinal study of child development and parent well-being, Volume 66.
Wiley-Blackwell.
Haveman, R. and B. Wolfe (2000). The economics of disability and disability policy.
Handbook of Health Economics 1, 995–1051.
INE (2010). Encuesta sobre Discapacidad, Autonomía personal y Situaciones de
Dependencia (EDAD), Metodología. Ed. Subdirección General de Estadísticas
Sociales Sectoriales (INE), Madrid, España.
Jiménez, A. and A. Huete (2010). Estadísticas y otros registros sobre discapacidad
en España. Política y Sociedad 47 (1), 165–174.
Kamette, F. (2011). Dependency Care in the EU: a comparative analysis. Foundation
Robert Schumann. Social Issues European Issue, 196.
Krzanowski, W. J. (1994). Ordination in the presence of group structure for
general multivariate data. Journal of Classification 11, 195–207.
Krzanowski, W. J. and F. H. C. Marriott (1994). Multivariate Analysis. Part 1,
Volume Distributions, ordination and inference. London: Edward Arnold.
Lafortune, G. and G. Balestat (2007). Trends in severe Disability among elderly
people: assessing the evidence in 12 OECD countries and the future implications.
Health Working Papers.
Loimaranta, K., J. Jacobsson, and H. Lonka (1980). On the use of mixture models
in clustering multivariate frequency data. In Transactions of the 21st International
Congress of Actuaries, Volume 2, pp. 147–161.
López (Dir.), G., A. Comas, M. Monteverde, D. Casado, J. R. Caso, and P. Ibem
(2005). Envejecimiento y dependencia. Situación actual y retos de futuro.
Barcelona: Caixa Catalunya.
Millán-Calenti, J. (2006). Principios de geriatría y gerontología. McGraw-Hill
Interamericana de España.
Moragas, R. and R. Cristofol (2003). El coste de la dependencia al envejecer.
Barcelona: Herder.
Morgan, B. and A. Ray (1995). Non-uniqueness and inversions in cluster analysis.
Applied Statistics 44, 117–134.
Pope, A. and A. Tarlov (1991). Disability in America: Toward a national agenda
for prevention. National Academy Press, 2201 Constitution Ave., NW,
Washington, DC.
Ramsay, J. O. (1980). Joint analysis of direct ratings, pairwise preferences and
dissimilarities. Psychometrika 45, 149–165.
Rao, C. R. (1982a). Diversity and dissimilarity coefficients: A unified approach.
Theoretical Population Biology 21, 24–43.
Rao, C. R. (1982b). Diversity: Its measurement, decomposition, apportionment
and analysis. Sankhyā. The Indian Journal of Statistics, Series A 44, 1–22.
Tukey, J. W. (1960). A survey of sampling from contaminated distributions. In
I. O. et al. (Ed.), Contributions to Probability and Statistics, pp. 448–485.
Stanford University Press.
USDHHS (2004). National agenda for children with special health care needs:
Achieving the goals 2000. US Department of Health.
WHO (2001a). International Classification of Functioning, Disability and Health -
Child and Youth version (ICF-CY). WHO (World Health Organization).
Geneva.
WHO (2001b). International Classification of Functioning, Disability and Health
(ICF). WHO (World Health Organization). Geneva.
Zunzunegui, M. (1998). Envejecimiento y salud. Madrid: Informe de la Sociedad
Española de Salud Pública y Administración Sanitaria.
Annex I
Definition of limitations
lim 1   For children over 9 months old: the child has trouble staying seated without help
lim 2   For children over 9 months old: the child has trouble staying standing without help
lim 3   For children over 9 months old: the child has trouble walking on his/her own
lim 5   The child can hardly see
lim 6   The child is fully deaf
lim 7   It seems that the child can hardly hear
lim 8   The child has trouble moving his/her arms
lim 9   The child has some weakness or stiffness in the legs
lim 10  The child sometimes has convulsions, goes rigid or loses consciousness
lim 11  The child can hardly do the things that other children do at the same age
lim 12  The child is frequently sad or depressed
lim 13  The child can hardly mix with other children, as children of the same age do
lim 14  For children over 2 years old: the child can hardly understand simple instructions
lim 15  For children between 2-3 years old: the child can hardly recognize and name objects
lim 16  For children between 3-5 years old: the child can hardly speak
lim 17  The child is in a specialized education system for stimulation
lim 18  The child has been diagnosed by a doctor or a psychologist with an illness
        that lasts more than one year
Source: ICF-CY Classification
21
... In this framework, classical methods of data analysis which assume simple random sampling may no longer be valid, and weighting may appear as the only or best alternative. Albarrán et al. (2015) [9] reviewed the extension of classical MDS concepts to the weighted context. ...
... In this framework, classical methods of data analysis which assume simple random sampling may no longer be valid, and weighting may appear as the only or best alternative. Albarrán et al. (2015) [9] reviewed the extension of classical MDS concepts to the weighted context. ...
... Other metrics for mixed data can be considered, although the k-prototypes algorithm should be modified accordingly. For instance, a more robust metric that can overcome some of the shortcomings of Gower's is related metric scaling (RelMS) by Cuadras (1998) [11], which was used in [9] to obtain robust profiles in weighted and mixed datasets. However, in this work, we prefer to illustrate our methodology by using Gower's metric due to the computational complexity of RelMS. ...
Article
Full-text available
This work provides a procedure with which to construct and visualize profiles, i.e., groups of individuals with similar characteristics, for weighted and mixed data by combining two classical multivariate techniques, multidimensional scaling (MDS) and the k-prototypes clustering algorithm. The well-known drawback of classical MDS in large datasets is circumvented by selecting a small random sample of the dataset, whose individuals are clustered by means of an adapted version of the k-prototypes algorithm and mapped via classical MDS. Gower’s interpolation formula is used to project remaining individuals onto the previous configuration. In all the process, Gower’s distance is used to measure the proximity between individuals. The methodology is illustrated on a real dataset, obtained from the Survey of Health, Ageing and Retirement in Europe (SHARE), which was carried out in 19 countries and represents over 124 million aged individuals in Europe. The performance of the method was evaluated through a simulation study, whose results point out that the new proposal solves the high computational cost of the classical MDS with low error.
... Unnecessary explanations can make boring what is evidently beautiful and simple. 7 It is an interesting point of view in the Schoenfield theory which Aurea Grané follows in her jobs ( [7] and [8]). ...
... Unnecessary explanations can make boring what is evidently beautiful and simple. 7 It is an interesting point of view in the Schoenfield theory which Aurea Grané follows in her jobs ( [7] and [8]). ...
Article
Full-text available
Category and Type Theory, and its natural evolution to Homotopy Type Theory, are essentials on the mathematical language for research-level in Computer Science and beyond. This paper shows a brief simplification of the very useful notions of these theories for constructing innovations that help to open our minds and find new ways for discovering knowledge.
... The present work is a novel approach to solving the same problem and, as far as we know, this is the first time that dependency evolution has been used to characterize individuals in order to enhance the regular estimation of health expectancy. Other recent studies on dependency are Albarrán et al. (2015) and Albarrán-Lozano et al. (2017), regarding dependent children. ...
Article
Full-text available
Population aging is perhaps the most important problem that developed countries must face in the near future. Dependency can be seen as a consequence of the process of gradual aging. In a health context, this contingency is defined as a lack of autonomy in performing basic activities of daily living that requires the care of another person or significant help. In Europe in general and in Spain in particular, this phenomenon represents a problem with economic, political, social and demographic implications. The prevalence of dependency in the population, as well as its intensity and evolution over the course of a person’s life, are issues of the greatest importance that should be addressed. The aim of this work is the estimation of life expectancy free of dependency (LEFD) based on functional trajectories to enhance the regular estimation of health expectancy. Using information from the Spanish survey EDAD 2008, we estimate the number of years spent free of dependency for disabled people according to gender, dependency degree (moderate, severe, major) and the earlier or later onset of dependency compared to a central trend. The main findings are as follows: first, we show evidence that estimating LEFD while ignoring the information provided by the functional trajectories may lead to non-representative LEFD estimates; second, in general, dependency-free life expectancy is higher for women than for men, although its intensity is higher in women with later onset of dependency; third, the loss of autonomy is higher (and more abrupt) in men than in women. Finally, the diversity of patterns observed at later onset of dependency tends toward an extreme dependency pattern in both genders.
Article
Full-text available
In this study, we propose a hyper-simplified indicator of health and well-being for data visualization purposes in large datasets and apply it to SHARE survey data, the largest macro survey on health, ageing and retirement for 18 European countries. The indicator is based on four thematic sub-indicators, each focussing on a particular issue, which are obtained from more than twenty mixed variables measured on more than 60,000 respondents. Next, PCA is used to summarize their information in order to find and visualize profiles of “healthy ageing” across Europe. As a result, EU countries are classified into three groups that segment the database from the individuals least at risk to those most at risk in terms of health and well-being. The methodology we propose is general enough to be extended to other surveys or disciplines.
Article
Full-text available
The main objective of this paper is to visualize profiles of older Europeans to better understand differing levels of dependency across Europe. Data come from wave 6 of the Survey of Health, Ageing and Retirement in Europe (SHARE), carried out in 18 countries and representing over 124 million aged individuals in Europe. Using the information of around 30 mixed-type variables, we design four composite indices of wellbeing for each respondent: self-perception of health, physical health and nutrition, mental agility, and level of dependency. Next, by implementing the k-prototypes clustering algorithm, profiles are created by combining those indices with a collection of socio-economic and demographic variables about the respondents. Five profiles are established that segment the dataset from the individuals least at risk to those most at risk in terms of health and socio-economic wellbeing. The methodology we propose is general enough to be extended to other surveys or disciplines.
Article
Some statistical models, quite different in the symbolic mathematical sense, may provide similar results. After commenting on two probability examples, we discuss and compare multiple factor analysis (MFA) with related metric scaling (RMDS), two multivariate procedures dealing with mixed data. Each data set can be quantitative, binary, qualitative or nominal, and all have been observed on the same individuals but come from several sources. MFA and RMDS are then two approaches for representing the individuals. We study the analogies and differences between both methodologies to guide users interested in performing multidimensional representations of mixed-type data. Though in general MFA and RMDS provide similar results, we prove that RMDS takes into account the association between the different sets of variables, providing, in some cases, better and more coherent representations. We also propose a parametric RMDS which includes MFA as a particular case. Article in memory of John C. Gower (1930-2019).
Article
The International Classification of Functioning, Disability and Health (ICF) was published by the World Health Organization in 2001 to provide a standardized description of health and health-related states (WHO 2001). The ICF classifies functioning and disability associated with health conditions at the levels of body/body parts, the whole person and the person in their environmental context. The ICF is a multipurpose classification that can be used by different disciplines and sectors to provide a scientific basis and common language for the description of health and health-related states, outcomes and determinants. ICF data enable comparison across countries, disciplines and time, and provide a coding system for health information (WHO 2001). The ICF is not a tool for assessment; rather, it provides the basis for such tools as well as a framework to which these tools can be related, thus building up a more complete picture of how a person lives. This paper provides an overview of this international classification, gives examples of its use in national data collections and details its relevance to ergonomic research and practice.
Chapter
Disability traditionally has been a marginalized concern of public health and has largely been viewed as a failure of primary prevention. However, disparities in health behaviors, healthcare access, and health status between people with and without disabilities suggest that opportunities exist for public health to engage people with disabilities to improve their overall health. In this entry, we address case definition of disability, conceptual dimensions of disability that have led to modeling of the experience of disability, as well as US and international estimates of the population of people with disabilities. We also discuss discrete age groups—children, adults, and older adults—and evidence of health disparities between people with and without disabilities. Finally, we discuss health promotion directions to improve the health of this population.
Chapter
This chapter discusses the visualization of categorical data with related metric scaling. Data that can be likened to distances are common in multivariate statistics, where they are often called “dissimilarities.” A dissimilarity matrix is a square, symmetric matrix of nonnegative data with zeros on its diagonal. Metric scaling, also called principal coordinate analysis, is a technique that allows construction of a map or Euclidean configuration from a matrix of dissimilarities. Because the same set of distances can be obtained from several Euclidean configurations of points, one of them is selected as the usual metric scaling solution. The main advantage of metric scaling becomes apparent when the dissimilarity matrix being processed has not been obtained from actual measurements on a map. The chapter also discusses the basics of metric scaling, its graphic representation, and its methodology, along with an empirical application.
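The construction of a Euclidean configuration from a dissimilarity matrix, as described in this abstract, can be sketched as follows. This is a minimal illustration of classical metric scaling, not the chapter's own code: it double-centres the squared dissimilarities and extracts only the first principal coordinate by power iteration, which assumes a dominant positive eigenvalue. The function names and the toy matrix are hypothetical; in practice a full symmetric eigendecomposition (for example `numpy.linalg.eigh`) would normally be used to obtain all coordinates.

```python
from math import sqrt

def double_center(d):
    """Return B = -1/2 * J D2 J, where D2 holds squared dissimilarities
    and J is the centring matrix; B is the inner-product matrix."""
    n = len(d)
    sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]   # row means (= column means, D symmetric)
    grand = sum(row) / n             # overall mean
    return [[-0.5 * (sq[i][j] - row[i] - row[j] + grand) for j in range(n)]
            for i in range(n)]

def leading_coordinate(b, iters=500):
    """First principal coordinate of B via power iteration.
    Assumes B is non-zero with a dominant positive eigenvalue."""
    n = len(b)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start
    for _ in range(iters):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(b[i][j] * v[j] for j in range(n))
                     for i in range(n))
    # Coordinates are the eigenvector scaled by sqrt(eigenvalue).
    return [sqrt(eigenvalue) * x for x in v]

# Toy dissimilarities: three points on a line at positions 0, 3 and 4.
D = [[0.0, 3.0, 4.0],
     [3.0, 0.0, 1.0],
     [4.0, 1.0, 0.0]]

coords = leading_coordinate(double_center(D))
```

Because the toy dissimilarities come from points on a line, a single coordinate reproduces them: the absolute differences between the entries of `coords` recover the original distances, up to the usual indeterminacy of sign and origin.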
Article
Reduction of Dimensionality. Development and Study of Multivariate Dependencies. Multidimensional Classification and Clustering. Assessment of Specific Aspects of Multivariate Statistical Models. Summarization and Exposure. References. Appendix. Indexes.