Area-based studies and the evaluation of multilevel influences on health outcomes

Chapter 12
Graham Moon 1
SV Subramanian 2
Kelvyn Jones 3
Craig Duncan 1
Liz Twigg 1
Introduction
At root, much health research is concerned with individual people, events,
interventions or programmes. For example, interest is frequently focussed on the
identification of regularities underpinning variations in the mortality or morbidity of
individuals. Equally, a comparison might be made between the therapeutic impact on
individuals of a new drug and a standard treatment regime using the traditional
randomised controlled trial (see Chapter 5). These ‘individual-level’ studies can be
contrasted with those that have an explicit focus on geographical areas: what are often
termed ‘ecological’ studies (Box 1). This second type of study manifests itself most
simply through the production of choropleth maps comparing health outcomes
between areas. Ecological studies are most common in health geography but are also
typical of descriptive epidemiology and studies of health inequality. Good recent
examples include Shaw et al. (1999) and Mitchell et al. (2000).
<<Box 1 about here>>
This chapter looks beyond the simplistic dichotomisation of the individual and the
ecological and offers an assessment of current directions in health research concerned
with areas, places or geographies. This assessment focuses mainly on methodological
issues regarding the analysis of area effects. An initial section briefly considers the
nature of ecological analyses. This is used to introduce a more substantial assessment
of the shortcomings of the ecological approach and the difficulties entailed in
untangling area effects on health. From this assessment, and the concern to look
beyond the dichotomisation of the individual and the aggregate, it is concluded that an
understanding of area effects must implicate both individual and ecological factors.
To this end, the third substantive section of the chapter examines multilevel
modelling, explaining what it does and outlining its scope in health research. A short
conclusion summarises the contentions of the chapter.
1 Institute for the Geography of Health, University of Portsmouth.
2 Harvard School of Public Health, Harvard University
3 School of Geographical Sciences, University of Bristol
Ecological Studies
Numerous health research studies have used an ecological design to examine
variations or inequalities from a geographical perspective. Subject matter has ranged
across mortality, morbidity, health-related behaviour and health service topics. Source
data have generally been derived from routine officially-collected sources. The
finding of area differences in the health status of populations has long been
established. Research has consistently shown that people in different geographical
areas apparently experience different degrees and types of ill health. Curtis and Jones
(1998) provide a useful review of the field.
In the case of mortality the spatial scale of these ecological studies has been both
national and more localised with variations being considered between relatively large
regions as well as between smaller spatial units. Curtis (1995) summarises this work.
Illsley and Le Grand (1993) exemplify national scale work. They examined ten-year
age-specific death rates in England at seven points from the 1930 to 1990 and found
that, while a north-south gradient persisted in older age groups, it faded in younger
age groups. Further work on this theme has corroborated the continued existence of a
north-south divide in mortality at certain ages and indicated the lack of recent
progress in narrowing this divide (Drever and Whitehead 1995). Sub-regional
analyses document similar, often entrenched, patterns of inequality, as do more
localised studies. Both are conventionally based on comparisons between local
government units though local studies may equally use census or postal geographies.
An urban penalty is a frequent finding: urban and metropolitan areas tend to
experience higher rates of mortality and morbidity than rural areas (Mullen, 1992).
What factors underpin these observed geographical variations? Why does a map of
mortality (or any other health issue) reveal geographical variation? Ecological studies
examine this question by ‘associative analyses’ seeking correlations with other
variables measured at the same geographical scale or building multiple regression
models that seek to summarise the relationship between an areal outcome measure
and a set of candidate predictors. Candidature in both these approaches is driven by
theory and, nowadays, has increasingly implicated the social environment,
particularly via measures of deprivation and structural perspectives on the causation
of ill-health (Townsend et al. 1988a; Acheson 1998; Shaw et al. 1999). To avoid
problems arising from small numbers and ensure statistical reliability, both
approaches commonly work with extremely coarse geographies.
The majority of correlational associative analyses link deprivation to the calculation
of simple areal rates that control for population composition by disaggregating or
standardising on the basis of personal characteristics, most typically to allow for the
impact of age and sex on the health variable. Examples from research on health-
related behaviour include Blaxter (1990), Dunbar and Morgan (1987) and Balarajan
and Yuen (1986). Since this work only considers a very narrow range of
characteristics at any one time, it provides only a limited insight into the nature of
areal variation. To address this limitation, a smaller number of studies have adopted the more
sophisticated regression modelling strategy, allowing for varying population
compositions by including age and sex as predictor variables alongside other variables
(Braddon et al., 1988; Whichelow et al., 1991). This latter approach has the advantage
of making clear the contribution of demography to health inequalities; Asthana et al.
(2004) show how the contribution of age and sex to health variations usually exceeds
that of deprivation.
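
The logic of standardising an areal rate can be illustrated with a minimal Python sketch of indirect standardisation. It is not taken from any of the studies cited: the age bands, reference rates, populations and death counts are all invented, and the function names are hypothetical. Observed deaths in an area are compared with the deaths that would be expected if reference age-specific rates applied to the area's own age structure.

```python
# Illustrative only: all figures below are invented for the example.

def expected_deaths(population_by_age, reference_rates):
    """Deaths expected if the reference age-specific rates applied to this area."""
    return sum(population_by_age[age] * reference_rates[age]
               for age in population_by_age)


def standardised_mortality_ratio(observed, population_by_age, reference_rates):
    """Indirectly standardised ratio, scaled so that 100 equals the reference level."""
    return 100.0 * observed / expected_deaths(population_by_age, reference_rates)


# Hypothetical reference death rates per person per year, by age band.
reference_rates = {"0-44": 0.001, "45-64": 0.005, "65+": 0.040}

# Two hypothetical areas of 10,000 residents with different age structures.
young_area = {"0-44": 8000, "45-64": 1500, "65+": 500}
old_area = {"0-44": 4000, "45-64": 3000, "65+": 3000}
young_observed, old_observed = 40, 139

crude_young = young_observed / sum(young_area.values())
crude_old = old_observed / sum(old_area.values())
smr_young = standardised_mortality_ratio(young_observed, young_area, reference_rates)
smr_old = standardised_mortality_ratio(old_observed, old_area, reference_rates)

print(f"crude rates: young {crude_young:.4f}, old {crude_old:.4f}")
print(f"standardised ratios: young {smr_young:.1f}, old {smr_old:.1f}")
```

In this made-up case the older area has the higher crude death rate, yet once age is allowed for it sits at about the reference level while the younger area shows an excess. The ranking of areas can therefore change when composition is controlled.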
The measures of deprivation employed in these analyses are many and varied,
capturing different aspects of deprivation such as those derived from educational
disadvantage, poor housing, tenure status or unemployment. Mostly these data are
obtained from national censuses though some may be aggregated from larger routine
surveys, providing sample sizes are sufficient to permit generalisation at the desired
spatial scale of analysis. In UK-based ecological analyses, much use is also made of
composite indicators of deprivation. These offer a single-figure summary of several
separate deprivation measures. Perhaps the best known of these composite indicators
is the Jarman ‘under-privileged area index’ (Jarman 1984). This uses standard scores,
weighting and linearisation to summarise eight measures in a single composite
indicator designed to capture family doctor workload; over time it has mutated into an
indicator of deprivation though it is not without its critics (Senior, 1991). Other
similar indicators include the Townsend (Townsend et al. 1988b; Ben-Shlomo et
al. 1996) and Carstairs indices (Carstairs and Morris, 1991). Both are less
controversial as indicators of deprivation. A rather different methodological strategy
underpins the construction of the UK Office of the Deputy Prime Minister’s Index of
Deprivation, but with similar results (Box 2).
<<Box 2 about here>>
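
The general idea behind a composite indicator, standardising each component measure and then summing the results, can be sketched in a few lines of Python. This is a deliberately simplified, hypothetical illustration. It does not reproduce the Jarman, Townsend or Carstairs indices, whose variable lists, transformations and weights are not set out here; the indicator names, values and equal weights below are assumptions.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standard scores: distance from the mean in standard deviation units.

    Assumes the values are not all identical (non-zero standard deviation).
    """
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def composite_index(indicators, weights=None):
    """Weighted sum of standardised indicators, one score per area.

    `indicators` maps an indicator name to a list of area values, with
    areas in the same order in every list. Equal weights are assumed
    unless `weights` is supplied.
    """
    names = list(indicators)
    if weights is None:
        weights = {name: 1.0 for name in names}
    standardised = {name: z_scores(indicators[name]) for name in names}
    n_areas = len(indicators[names[0]])
    return [sum(weights[name] * standardised[name][i] for name in names)
            for i in range(n_areas)]


# Invented percentages for three hypothetical areas.
indicators = {
    "unemployment_pct": [4.0, 8.0, 12.0],
    "overcrowding_pct": [1.0, 2.0, 6.0],
}
scores = composite_index(indicators)
print([round(s, 2) for s in scores])
```

Because each component is centred on its mean, the scores sum to zero across areas; positive values indicate above-average deprivation on the chosen measures. The choice of components and weights is exactly where such indices attract criticism.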
Using data on mortality for 1989-1993 in each of over 350 English local authorities,
Drever and Whitehead (1995) exemplify the use of a composite deprivation indicator
as an aid to understanding health variations. They found a significant relationship
between mortality and the then Department of the Environment’s 1991 deprivation
index, equating high mortality with greater deprivation. This relationship was
strongest for males and was broadly consistent across age groups. Another example,
at a more localised spatial scale, is provided by Eames et al. (1993). They relate
several alternative social deprivation indices to premature mortality using over 8000
English electoral wards as their ecological unit. They suggest that the effects of
deprivation are not limited to the poorest areas and also note that the impact of
deprivation on health varies across the country, being greatest in parts of the North
and in Central South-West London and least in East Anglia. There is a consistent
relationship between ill health and deprivation, with the highest mortality rates found
in inner-city areas and suburban local-authority housing estates.
As an alternative to single or composite measures of deprivation, some ecological
studies use area typologies. Examples of UK area typologies include those derived by
National Statistics from Census data and commercial classifications such as the ACORN
classification developed by CACI Limited (CACI 2003) or MOSAIC (Experian
2004). These typologies classify areas on the basis of a cluster analysis of area-based
statistical measures. Clusters, or groups of areas, are internally homogeneous but
different from other clusters. They are generally given names that reflect their key
characteristics. Cluster identification depends on the variables selected and the
clustering methodology as well as choice of areal unit, but the resulting area
typologies highlight underlying ecological structuring within a population that is
related to many aspects of human behaviour, including health. Meltzer et al. (2000)
provide a recent example. They used ACORN to examine the proportion of children
with a mental disorder in relation to area type. They found that areas characterised as
‘striving’ (low income, less prosperous) had rates of childhood mental disorder nearly
250% above those found in ‘thriving’ (wealthy) areas. Other examples of the use of
multivariate classifications include Shouls et al. (1996) and Wiggins et al. (1998).
Whilst this work has successfully linked a health differential to the social milieu of an
area type, it is not clear what aspect of these classifications it is that brings about a
health differential.
Traditionally in health and deprivation research, ecological studies have been used to
help develop explanations of health inequalities. The areal basis to ecological analysis
has however attracted little attention in its own right. ‘Geography’ has generally been
used simply as a framework for identifying patterns or as a means of organising data.
Yet this usage is far from unproblematic. Mitchell (2001) notes the tendency to
conflate small-area administrative divisions with more organic, functional and
theoretically relevant notions of neighbourhood. Mitchell et al. (2000) have
emphasised the advantages that come from working with areas of relatively equal
population size or social homogeneity. Perhaps most importantly, the work of
Openshaw (1983) on the modifiable areal unit problem (Box 3) serves as a reminder
that the size and configuration of areal units can directly influence the results of
associative analyses, to the extent of rendering statistically significant associations
insignificant and, at the extreme, changing the direction of relationships.
<<Box 3 about here>>
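
The sensitivity of associative analyses to zoning can be demonstrated with a small invented example. The Python sketch below is an illustration under stated assumptions, not a reconstruction of Openshaw's analysis: eight hypothetical small areas of equal population are laid out in a two-by-four grid and merged into four larger zones in two different ways. All values and names are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def aggregate(values, zones):
    """Mean value per zone; equal small-area populations are assumed."""
    return [sum(values[i] for i in zone) / len(zone) for zone in zones]


# Invented scores: indices 0-3 form the top row of the grid, 4-7 the bottom row.
deprivation = [1, 2, 3, 4, 5, 6, 7, 8]
illness = [11, 12, 13, 14, 1, 2, 3, 4]

# Zoning A merges each top-row area with the area directly below it.
columns = [[0, 4], [1, 5], [2, 6], [3, 7]]
# Zoning B merges horizontally adjacent pairs within each row.
row_pairs = [[0, 1], [2, 3], [4, 5], [6, 7]]

r_columns = pearson(aggregate(deprivation, columns), aggregate(illness, columns))
r_row_pairs = pearson(aggregate(deprivation, row_pairs), aggregate(illness, row_pairs))

print(f"zoning A: r = {r_columns:+.2f}")
print(f"zoning B: r = {r_row_pairs:+.2f}")
```

The same underlying data yield a positive association under one zoning and a negative one under the other, because each zoning averages away a different part of the spatial pattern.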
Notwithstanding these difficulties, it is clear that ecological studies imply variations
between places in terms of health outcomes. Places are (seemingly) different.
However, a key question is hidden in this innocuous statement. Is it place that makes
the difference? At issue is the extent to which the spatial differences evident in
ecological studies are simply a reflection of the differing social profiles of the resident
populations within their areal units or whether there is something about an area which
has an independent effect on health in its own right: a so-called ‘area effect’. It is to
area effects that attention now turns.
Area Effects
If place makes a difference to health, health outcomes depend not only on individual
characteristics (age, gender, class, and so on) but also on place, the setting, ‘ecology’,
or surrounding environment in which individuals live and work. A key interpretive
question is the extent to which the observed place differences are ‘area’ or
‘ecological’ effects or merely a result of different types of people living in these
places. To put it another way, do people of similar characteristics experience different
health outcomes in different places?
Much ecological research is unable to answer such questions because it has conflated
the genuinely ecological and the ‘aggregate’. Hampson (1991, p. 26) directly equates
the two claiming that the unique distinguishing feature of ecological studies is that
they are empirical investigations involving a group of individuals as the unit of
analysis. Real ecological effects would indisputably operate at the areal level
reflecting predictors and associated mechanisms operating at, and solely at, an areal
level. The search for such measures and their assessment in properly designed studies
is a current area of considerable research (Macintyre and Ellaway 2000; Ellen et al.
2001; Macintyre et al., 2002). Among the candidate measures for genuine ecological
status might be those that capture the efficiency and effectiveness of area-based policy
or the procedures of local service providers, the closure or out-migration of significant
employers, measures of community stress, and indicators of community social capital.
In a crude sense these ecological measures, particularly the last two, provide the
empirical case for community development as a strategy for raising or maintaining the
chances of good health.
Aggregate area effects, in contrast, equate the effect of an area with the sum of the
many individual effects associated with the people living within the area. In this
situation the key interpretive question posed above becomes particularly apposite.
If common membership of an area by a set of individuals brings about an effect that is
over and above that resulting from individual characteristics then there may indeed be
an area effect: the whole may be more than the sum of the parts. If this does not
happen, it is individual factors that matter, not area effects. To assume an area effect
on the basis of evidence derived from aggregate data is to commit the ‘ecological
fallacy’ of transferring results from aggregates to individuals (Box 4). As Figure 1
shows, the aggregate relation may even be of the opposite sign to the individual
relations on which it is based. As Susser (1973) points out, the term ‘ecological
fallacy’ is really a misnomer; he prefers ‘aggregative fallacy’. Shared residence in an
area does not necessarily mean that individuals will draw the same influences from it;
those who are spatial neighbours are not always social neighbours, and individuals vary in
their interactions with those who live close to them.
<<Figure 1 about here>>
<<Box 4 about here>>
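
The sign reversal that underlies the ecological fallacy can be reproduced with a few invented numbers. In the Python sketch below, which is a hypothetical illustration rather than the data behind any figure in this chapter, the relation between an exposure and an illness score is negative among the individuals within every area, yet positive when only the area means are compared.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def average(values):
    return sum(values) / len(values)


# Invented (exposure, illness) values for three individuals in each of three areas.
areas = {
    "A": {"exposure": [1, 2, 3], "illness": [3, 2, 1]},
    "B": {"exposure": [4, 5, 6], "illness": [6, 5, 4]},
    "C": {"exposure": [7, 8, 9], "illness": [9, 8, 7]},
}

# Individual-level relation, computed separately within each area.
within = {name: pearson(d["exposure"], d["illness"]) for name, d in areas.items()}

# Aggregate relation, computed from one pair of means per area.
mean_exposure = [average(d["exposure"]) for d in areas.values()]
mean_illness = [average(d["illness"]) for d in areas.values()]
between = pearson(mean_exposure, mean_illness)

print("within areas:", {k: round(v, 2) for k, v in within.items()})
print("between area means:", round(between, 2))
```

An analyst with access only to the three area means would infer that higher exposure goes with more illness, the opposite of what holds for the individuals within each area.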
Many, perhaps the majority, of ecological analyses are in fact analyses of aggregate
data. Area effects are potentially of importance, but they cannot be assessed through
aggregate analysis, which cannot distinguish the difference a place makes from what is
in a place (Jones and Moon, 1993). As the classic paper by Robinson showed, there is
no reason to suggest that a relationship that occurs at the aggregate level can be held
to exist at the individual level (Robinson, 1950). Taking a hypothetical example, high
levels of illness may be associated with high levels of unemployment at the regional
level but people who are ill in regions with high unemployment may be those in work.
One result of the widespread recognition of the ecological fallacy was a turn to survey
methods and studies using the individual as a unit of analysis. Yet work at the individual
level risks missing important area effects. If no attempt is made to incorporate
measures relating to area characteristics then the ‘atomistic fallacy’ (Alker, 1969) is
committed in which research completely ignores the situated nature of individual
action and outcomes. This decontextualised research strategy not only denies area
effects (and other collective impacts on health) but, in its individualistic approach, has
had undoubted resonance with much neo-liberal health policy-making. Thus, Thomas
(1993) declared that The Health of the Nation (DoH 1988), a policy developed by the
British Conservative party, viewed health-related behaviour as “a matter of individual
responsibility” and that “behaviours are not placed in context” but rather are
conceptualized in traditionally narrow epidemiological terms.
Composition and context
As a further basis for unpacking ideas about area effects, Macintyre (Macintyre et al.,
1993; Macintyre 1995) has usefully distinguished between ‘compositional’,
‘collective’ and ‘contextual effects’ (Box 5). The first refers to the idea that place
differences are an artefact of differential socioeconomic composition. A purely
compositional explanation for observed area variations in smoking would be that the
sort of people whose personal and household characteristics are associated with high
smoking rates tend to live in certain sorts of regions or localities, and this is why rates
there are high; the people would smoke wherever they lived, and the place itself has
no effect on their likelihood of smoking. A compositional effect has been invoked to
explain differences in mortality between subjects in the Paisley/Renfrew Study (in the
West of Scotland) and in the Whitehall Study (in London and the South East of
England); it was argued that observed differences between the study populations were
due to differences in the distribution of types of people in the two study populations,
not to other differences between the West of Scotland and South East of England
(Davey Smith et al., 1995).
<<Box 5 about here>>
Collective effects relate to the ‘social miasma’ by which individuals conform to the
behaviour of the dominant group living in an area. The third term, contextual, represents
the situation where place characteristics have a direct effect and can be equated with
the ‘genuine ecological effects’ referred to above. Contextual effects may either
impact on all residents equally, or more on some types of residents than others (for
instance males as compared with females). Such influences might include climate,
soil or water conditions, the built environment, publicly or privately provided
amenities or facilities, sociocultural factors such as predominant religion or history,
and the reputation of the area (Macintyre et al., 1993). Contextual explanations are
perhaps most intuitively plausible when we compare countries: cultural differences in
attitudes to smoking may in large part be the reason why middle class professional
men are far more likely to smoke in Paris than in San Francisco; and people in
countries bordering the Mediterranean may be more likely to eat fresh fruit and salads
all the year round than are those in Russia or Scotland, because of greater availability.
The question is whether such explanations also apply to variations within countries,
whether by regions, districts or neighbourhoods.
How might this come about? Work in social theory provides a start point. Anthony
Giddens’ structuration theory (Giddens, 1984) draws attention to the way in which
knowledgeable individuals draw upon social structures in their day-to-day living.
Human agents operate within particular socio-cultural milieux that contain a number
of specific structural factors (conceptualised as rules and resources by Giddens) that
stimulate and shape behaviour and constrain outcomes. By drawing on these, society
and its constituent individuals produce patterns of behaviours and outcomes that also
reproduce the structural factors that were involved. However, as these factors are as
much a result of processes as a medium of those processes, there is always the
possibility that they will be recreated differently as circumstances change. This
recursive process is fundamentally context-specific and Giddens’ theory emphasizes
the way in which the interplay between individuals (agency) and social factors
(structure/context) will be constituted differently at different times and in different
settings. Giddens’ work is complex and abstract but its (over) simplification here
offers an important general reminder of the vitally important connections that exist
between phenomena at a number of different levels. Processes operating at macro-
levels need to be set alongside knowledgeable and capable human agents behaving at
a micro-level.
Less conceptually, it is possible to discern four sets of mechanisms that may produce
place differences: processes concerned with the physical environment, the cultural
milieux, place deprivation and selective mobility. These are often highly interrelated.
The first recognises that people living in one part of a city or a region are likely to
share common water supplies and experience similar levels of environmental
pollution. The importance of such physical variables has, as noted earlier, long been
recognized (Gardner, 1973) but much of this literature has been based on aggregate
data. Such ecological variables may then interact with household or individual
characteristics such as lead plumbing or tobacco consumption to produce differential
health outcomes.
Differences resulting from individual interaction with specific local cultures
acknowledge that local specificity must be seen as integral to explanations of general
social processes. Processes can never operate on the head of a pin in a geographically
undifferentiated world; rather, social processes literally ‘take place’. The juxtaposition
of disparate forces in a place can create a qualitatively distinct setting which of itself
can influence and modify the general processes. People and places exist in a recursive
dialectic relationship: people create structures in the context of places; those
structures then condition the making of people. People act individually and
communally to create these local cultures through their everyday routines and
institutional practices. This frames the local context which then provides the setting in
which people learn to interpret and respond to general societal structures. As a
‘general’ process unfolds across space it interacts with places that have both a
distinctive history and culture; as a result people change, places change and the
uniqueness of place is maintained.
The basic distinction between composition and context is recalled in identifying area
effects stemming from differences resulting from processes associated with place
deprivation. Smith (1977) long ago summarized the distinction between people and
place deprivation. In the former, people are deprived by virtue of their position in the
socio-economic system. In contrast, place-based deprivation refers to poor access to
locationally specific goods and services. In the health literature, these arguments have
been given a particular twist by Wilkinson (1996). He argues that material
infrastructure and societal arrangements are geared through market forces to ‘average’
people. Thus, food retailing is increasingly arranged for those with a car.
Consequently, those without cars have to shop at the more expensive remaining
corner shops with their restricted range. Additional costs are incurred and additional
effort is required to manage day-to-day tasks. Cummins and Macintyre (2002) and
Wrigley et al. (2002) provide empirical support for this argument in their work on the
health effects of ‘food deserts’. The basic conclusion is that, whatever one’s personal
characteristics, the opportunity structures in poorer areas are less conducive to health
than those in better-off areas.
Differences resulting from the processes of selective mobility distinguish area effects
stemming from the cumulative effects of individual mobility and the push and pull
area effects of government policy and the market economy. As a result, certain groups
of people are constrained while others are enabled; some places attract while others
are regarded as ‘sink estates’ where few would choose to live. Different place-specific
health outcomes may then occur as a result of the differential mobility of these
different groups. These mechanisms for producing apparent differences between
places have long been recognized as potentially important (Hill, 1925) and research
continues into such notions as the ‘healthy migrant’ hypothesis (Strachan et al. 1995;
Brimblecombe et al. 1999, 2000; Dorling et al. 2000). Of course, contextual
differences based purely on selective mechanisms may be seen as something of an
artefact: studies of area effects on mortality are notoriously bedeviled by mobility in
the years immediately prior to death. However, they also represent an important
‘geographical sorting’ of the population on which other processes of contextual
differentiation may then operate.
These notions of contextuality have general relevance as they apply not only when the
focus is upon context as geographical setting but also when context is seen in terms of
temporal (for example, different time periods), administrative (for example, health
care administration areas), or institutional (for example, hospital or clinic) settings. In
terms of the last two cases, there are extremely important implications for health services
research as they connect with the use of performance indicators. Variations in the
performance of health service activities between different settings can be attributed to
both the type of clients particular units serve (compositional effects) as well as the
nature of the environment from and in which the service is provided (contextual or
area effects) (Jones and Moon 1993; Leyland and Goldstein 2001).
Understanding area effects on health thus entails an acknowledgment of the existence
of both compositional and contextual effects. The two are intricately connected
(Macintyre and Ellaway 2000; Macintyre et al. 2002). Though there may be some
genuine ecological effects, even these may be mediated or modified by composition.
Much health-related data can therefore be expected to exhibit considerable non-
stationarity (relationships between variables will not be consistent across all the
geographic regions covered by the data); the parameters of association will vary
spatially. While this differentiation can be due to model mis-specification or random
sampling variation (Fotheringham, 1997), it can also be reflective of area effects. To
test for that eventuality requires an appropriate analytical strategy. Moreover, that
strategy needs to recognise that area effects can be reactive or consensual. In the
reactive case, individual and ecological effects operate in the ‘opposite direction’
while, in the consensual case, the effects are mutually reinforcing.
Some evidence for (real) area effects
Operationalising ideas about composition and context and isolating the ‘reality’ of
area effects has proved to be difficult and contentious. On the basis of a large-scale
empirical study using the UK Health and Lifestyle Survey, Blaxter felt convinced that
contextual effects are important and that places matter in their own right. She wrote:
“while the health of manual men and women was almost always poorer than that of
non-manual, it is clear that types of living area do make a difference” (Blaxter, 1990,
p. 82). Fox et al. (1984) and Britton et al. (1990) replicated this conclusion using the
UK Longitudinal Study and Webber and Craig’s (1978) national typology of 36
different types of wards. In contrast, Sloggett and Joshi, also using the Longitudinal
Study, have suggested that: “excess mortality associated with residence in areas
designated as deprived...is wholly explained by the concentration in those areas of
people with adverse personal or household socio-economic characteristics” (Sloggett
and Joshi, 1994). They found that a positive linear relationship between ward level
deprivation and premature mortality largely disappeared when account was taken of
individual socio-economic characteristics. They support a compositional explanation
for geographical variations in mortality although there remains a substantial and
significant North/South difference in mortality in their analysis even after controlling
for deprivation and individual social characteristics.
The evidence for contextual effects is therefore suggestive but equivocal. As a further
example, regional differences in smoking and drinking in Britain have been observed
for both men and women, standardised for age and socioeconomic group (Balarajan
and Yuen, 1986). However, these regional differences varied by sex and according to
which measures of these behaviours were used. For example, the proportion that had
never smoked did not differ by region for males, though it did for females. Both sexes
showed significant variation by region in the proportions that had given up smoking.
Standardised smoking ratios for heavy smoking showed a gradient from the North
West to South East for both sexes. The suggestion from this and many other studies is
that the effects of area or aspects of area on health vary according to individual
characteristics, such as gender. This is also consistent with findings on
socio-economic and area variations in health as well as work on health-related behaviours.
For example, studies have shown particularly marked health differences between rich
and poor people in generally affluent places (Curtis, 1995; Ecob, 1996).
Thus the existing literature suggests that there may be some variation between areas
not accounted for by compositional factors, and that the level of disadvantage or
deprivation of the local area may have predictive power over and above individual
factors. As Phillimore (1993) has argued: “[The] Characteristics of places may be as
important as the characteristics of people for an understanding of particular patterns of
health” (p. 176). He suggests that distinctive differences between places (in this case,
Sunderland and Middlesbrough) remain unexplained by standard accounts of the
illness/deprivation nexus.
Moving forward
The arguments above represent the basic case for area effects on health. Just because
health outcomes vary geographically, area effects are not necessarily present:
compositional factors must be controlled but, once this is done, any remaining
variation constitutes an area effect. Control must however be effective and
comprehensive. Taken to an extreme, this position can be resolved to one in which
area effects might be expected to ‘disappear’ once full and effective control for
composition is made. To paraphrase the methodologist Gary King: “if we really
understood [health variations], we would not need to know much of contextual
effects…” (King 1996: 161). His argument finds its empirical parallel with the
conclusion of Sloggett and Joshi (1994) that area effects reduce to the impact of
composition.
King’s is an important challenge to people interested in area effects. It can however
be countered. It remains intuitively sensible to test for the possibility of area effects
and intuitively plausible that the impact of composition will vary by context. Unless
contextual variables are considered, their direct effects and their indirect mediation by
compositional variables cannot be identified. Moreover, composition itself has an
areal dimension. Compositional estimates will be inefficient and biased if their
inevitable spatial autocorrelation is not recognised in modelling strategies. Of course,
controlling for individual factors can identify area effects, but the fact that individual
(compositional) factors may ‘explain’ between-place variation serves as a reminder
that real understanding of area effects is complex. Essentially King’s concentration on
individuals as separate from their ecologies misses the important point that
individuals’ actions, choices and experiences are situated in the social-geographical
places where they live their lives.
Undoubtedly, then, a key imperative in health research should be a declared aim of
articulating the connections between the actions of individuals and the socio-
ecological context in which these actions are performed. This argument suggests an
approach that is multilevel, examining the circumstances of individuals at one level,
and the contexts or ecologies in which they are located at another level and, crucially,
doing so simultaneously. To gain a better understanding of area effects, all the
relevant levels of analysis need to be considered at the same time. Macintyre et al.
(1993) posed the question: “Should we be focusing on places or people?” They went
on to argue that much health research has overplayed the distinction, given too much
emphasis to the compositional and underplayed context. It should not be a case of one
or the other, or too much of one and not enough of the other. Simultaneous
consideration is essential for a proper understanding of the mechanisms by which
places can affect people (Duncan et al., 1993).
Multilevel Modeling
In the past fifteen years multilevel modeling has emerged as the approach of choice
within health research for the effective simultaneous consideration of composition and
context and thus the definitive assessment of area effects on health-related outcomes
(Box 6). Its key advantage is that it is conceptually realistic as it handles the micro-
scale of composition and the macro-scale of contexts simultaneously within one
model. By distinguishing the different ‘levels’ at which the determinants of health
operate, multilevel models are able to treat the contexts identified in any one model as
a random sample drawn from a larger underlying population. The procedures then
make inferences about the variation among all contexts in the population using this
random sample of contexts. Consequently, the variation between contexts is not
treated as being fixed but, rather, as a random property that relates to a larger
population (DiPrete and Forristal, 1994; Hox and Kreft, 1994; Blakely and
Woodward, 2000).
<<Box 6 about here>>
By the mid 1990s the value of the multilevel approach was beginning to be recognised
within many different research areas associated with health and health care. Duncan et
al. (1998), Rice and Leyland (1996) and Von Korff et al. (1992) provided reviews.
Early health applications focused on institutional performance (Jones and Moon,
1990, 1991; Jones et al. 1991; Leyland, 1995; Leyland and Boddy, 1997) and the
geography of health-related behaviour (Duncan et al. 1993, 1996, 1998, 1999). More
recently, the growth in output has been exponential and multilevel analysis has
become a standard part of the quantitative health research armamentarium (Diez
Roux, 1998, 2000, 2002). The research emphasis has shifted to the complexity of
area effects and elucidation of the impacts of social capital and income inequality on
health (Subramanian, Lochner and Kawachi, 2003; Subramanian and Kawachi, 2004).
To this end the approach is now established across a range of disciplines concerned
with social epidemiology and software options are widely available (Box 7).
<<Box 7 about here>>
Multilevel basics
Multilevel analysis requires data that contain measures of both composition and
context. At the least, observations within a data set need to have identifiers that
distinguish the contextual setting(s) in which each observation is to be found. Such
data are fairly widely available. Surveys, multicentre trials, and the products of record
linkage exercises typically possess the necessary properties. In Britain, examples
include the Longitudinal Study (LS), the 1991 Census Sample of Anonymised
Records (SAR), the Health Survey for England (HSE) and the General Household
Survey. British researchers have mainly used data from the routine large-scale surveys
deposited in the national data archive. These surveys offer large sample sizes,
generally in excess of 15,000 observations each year, and a well-founded design with
standardised question formats. Importantly, these designs incorporate hierarchical
sampling with respondents drawn at random from randomly selected areas; this
multilevel structure should be recognised in any analysis, notwithstanding its
additional ability to enable the isolation of area effects.
Data for multilevel analysis are therefore relatively widely available. There are
however two general problems that need to be noted and which apply in different
ways to most sources. First, there is the matter of confidentiality restrictions. These
may limit the availability of individual data, restricting release to a small number of
cross-tabulations with limited dimensionality. For example, the dangers of identifying
individuals in the census mean that only a few three-way cross tabulations are
produced for small areas. Confidentiality may also mean that the identifiers of higher-
level sampling units (areas) are not released. It may be known that observations come
from different areas, with one set drawn from one area and another set from another,
but the actual identity of the areas may be confidential. In this situation the
scope for linking in further data to describe an area is lost. The characteristics of an
area can only be estimated by aggregating the characteristics of the observations that
lie within.
The second data problem that is frequently encountered in multilevel analysis
concerns the definition of the higher-level areal units. As noted briefly above in the
discussion of the shortcomings of ecological studies, the areas employed in such work
rarely have much sociological meaning, though they often have administrative
relevance. The same criticism can be applied to multilevel data sources. Area
identifiers linked to sampling, such as postcode sectors, enable an insight into local
area effects, but they work with areas that mean little to the public. Where such
information is not available, perhaps for confidentiality reasons, the researcher can be
left with very high-level administrative geographies. In an extreme case, the available
areal indicator in the SAR is a relatively coarse, arbitrary sub-regional geography
(Gould and Jones, 1996).
Four graphical typologies can be used to outline the multilevel approach and show
how it provides a framework for contextual analysis in health research that is
technically robust. Consider a simple regression model in which it is hypothesized
that cigarette consumption (the response variable) is a function of a person's age (the
predictor variable). A traditional single-level OLS regression analysis might generate
the relationship shown in Fig. 2(a). Here the cigarette consumption/age relationship is
shown as a straight line with a positive slope: older people consume more cigarettes.
In this model the context in which the behaviour occurs is completely ignored: one
single relationship is held to exist everywhere. In effect the model has explained
everything in general and nothing in particular.
This can be rectified by recognising the communities in which individuals live and
using a two-level model with individuals at level-1 nested within communities at
level-2. One possible result is shown in Fig. 2(b), a two-level 'random-intercepts'
model. Here each of six different communities has its own cigarette consumption/age
relation represented by a separate line. The single, thicker line represents the general
relationship across all six communities. The parallel lines imply that, while cigarette
consumption increases with age at the same rate in each place, some places have
uniformly higher consumption rates than others. With the multilevel approach,
therefore, we can see both the general relationship across all places and the particular
relationship in specific places. In Fig. 2(c) and (d) the situation is more complicated as
the steepness of the lines varies from place to place. In Fig. 2(c) the pattern is such
that place makes very little difference for the elderly but there is a high degree of
between-community variation in the cigarette consumption of the young. In Fig. 2(d)
there is a complex interaction between age and place. In some communities it is the
young who have relatively high rates; in others it is the old.
<<Figure 2 about here>>
The differing patterns of Fig. 2(b)-(d) are simply achieved by varying the slopes and
intercepts of the lines. If the vertical axis is positioned at the mean age of individuals, the
intercept represents the number of cigarettes consumed by a person of average age.
The slope represents the increase in cigarette consumption associated with a unit
increase in age. The key feature of multilevel models is that the communities are
treated as a sample drawn from a population and their potentially different intercepts
and slopes are treated as coming from two distributions at a higher level. A multilevel
analysis summarizes these higher-level distributions in terms of two parts: a ‘fixed’
part that is unchanging across contexts, and a ‘random’ part that is allowed to vary.
The fixed part gives the mean value of each distribution: the average slope and
intercept across all communities (shown by the thick lines in Fig. 2). The random part
consists of variances that summarize the degree to which the community-specific
slopes and intercepts differ from these average values.
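The split into fixed and random parts can be made concrete with a small simulation. The following is a minimal sketch, not the authors' procedure: it assumes normally distributed and mutually independent intercept and slope departures (so no intercept-slope covariance), and the function name and all parameter values are hypothetical.

```python
import random


def simulate_community_lines(n_communities, beta0, beta1, sd_u0, sd_u1, seed=0):
    """Return (intercept, slope) pairs, one per community.

    beta0 and beta1 form the fixed part (average intercept and slope);
    sd_u0 and sd_u1 are the standard deviations of the random part.
    """
    rng = random.Random(seed)
    lines = []
    for _ in range(n_communities):
        u0 = rng.gauss(0.0, sd_u0)                      # intercept departure
        u1 = rng.gauss(0.0, sd_u1) if sd_u1 > 0 else 0.0  # slope departure
        lines.append((beta0 + u0, beta1 + u1))
    return lines


# Random intercepts only: parallel lines of the kind shown in Fig. 2(b).
parallel = simulate_community_lines(6, beta0=10.0, beta1=0.2, sd_u0=2.0, sd_u1=0.0)

# Random intercepts and slopes: lines of differing steepness, as in Fig. 2(c) and (d).
varying = simulate_community_lines(6, beta0=10.0, beta1=0.2, sd_u0=2.0, sd_u1=0.1)
```

Setting the slope standard deviation to zero forces every community to share the average slope; any non-zero value lets the steepness of the lines vary from place to place.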
By adopting a multilevel approach researchers are no longer restricted to working at a
single level and this provides a number of substantive advantages. First, by combining
individual and aggregate levels together in one analysis both the ecological fallacy
(Robinson, 1950) and the atomistic fallacy (Alker 1969) can be avoided. Working
solely at an individual level means the context of local cultures is ignored, whilst
working just at the aggregate level fails to capture individual variation fully. Second,
by working at more than one level, the approach can start to separate compositional
from contextual differences. Taking the example of smoking consumption, there may,
nationally, be a tendency for older males to be heavy smokers. Consequently, high
smoking places may simply result from the concentration of older males in certain
locations. Alternatively, they could be a result of regional cultures that encourage
smoking in all types of people. The former is a compositional difference related to the
type of people contained within particular contexts. The latter is a contextual effect
and refers to the difference arising irrespective of composition. A multilevel approach
is able to separate these two effects and therefore has an important role to play in the
examination of regional behavioral stereotypes.
By working at several levels simultaneously it becomes possible to allow for
contextual variation in the predictor variables. Similar types of people (on the basis of
individual characteristics) may not necessarily be behaving in the same way
everywhere or experiencing the same health outcomes. The example above talks of
the behaviour of people on average across all places and their specific behaviour in
particular places. It is possible to model any variation found between places by
including fixed variables at higher levels that reflect contextual characteristics. For
example, a measure of community deprivation might be included to see whether it
was a significant predictor of the variation between places in cigarette consumption
by a person of average age. Additionally, cross-level interaction variables can be
included as fixed effects. These capture situations where the characteristics of people
and the characteristics of contexts interact to produce substantively different
expressions of behaviour. For example, people of low social status may consume
varying amounts of cigarettes depending upon the social composition of the area in
which they live.
In terms of the potential for multilevel models to identify area effects, it is evident,
thus far, that the basic methodological gain concerns the ability to ‘control for’ the
impact on some health-related outcome of both the mean level of individual
characteristics and the variability of those individual characteristics. The impact on
area effects of this control can be evidenced in five fundamental ways. First, fixed
part measures of the slopes associated with variables measuring higher-level
characteristics indicate the independent effect of such measures having allowed for
composition. Thus, to follow through the example above, the inclusion of an area
measure of community deprivation in a model would enable the identification of the
average effect of area deprivation on cigarette consumption, given the age of people
living in an area.
Second, fixed part interactions between compositional and contextual variables
(cross-level interactions) can be used to show how, on average, people with particular
characteristics experience differential outcomes in the sorts of places indicated by the
higher-level variable. Consider a two-level model (individuals in areas) with the
response being the probability of an individual reporting limiting long term illness and
an individual predictor variable identifying low social class as opposed to high social
class and an area-level predictor, the percentage of high social class in an area. Cross-
level interaction between the individual and the areal-level predictors can reveal a
number of possibilities. There may be marked differences between low and high
social class individuals in terms of their relationship to the area effect: it might even
be negative for low social class but positive for high social class.
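One way to write the two-level model just described is sketched below. The notation is an assumption introduced for illustration: \pi_{ij} is the probability that individual i in area j reports limiting long term illness, x_{ij} is a dummy variable for low social class, W_j is the percentage of high social class in area j, \alpha_1 and \alpha_2 are hypothetical fixed coefficients, and a logit link is assumed.

```latex
\[
\operatorname{logit}(\pi_{ij}) = \beta_0 + \beta_1 x_{ij} + \alpha_1 W_j
  + \alpha_2 x_{ij} W_j + u_{0j}
\]
```

Under this sketch the effect of the area measure is \alpha_1 for high social class individuals and \alpha_1 + \alpha_2 for low social class individuals, so the two groups show effects of opposite sign whenever \alpha_2 has the opposite sign to \alpha_1 and is larger in magnitude.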
The three remaining parameters of interest when identifying area effects concern the
random part of the model. It is important to stress that area effects are not only
evidenced through fixed part terms; well-designed studies need to consider area
effects in terms of the structure of variation. The random part of any model
summarises the variability of slopes, intercepts and their covariance - the degree to
which slopes and intercepts are related. Returning to Figure 2, the cases shown would
each have the same level 1 random part: a variance term summarizing residual
individual variation in cigarette consumption. However, they would differ in terms of
their level-2 random parts. Figure 2(a) is the result of a single non-zero intercept and
slope. This is a single-level model and so there are no higher-level distributions: there
is no variability between areas as the possibility has not been entertained in the model.
Figure 2(b) has a set of intercepts but a single slope common to all areas. This simple
third type of area effect would be captured by the random-part measure of intercept
variation. The central empirical question concerning area effects is whether this
higher-level variation remains significant when a range of appropriate and relevant
individual variables (e.g.: age, gender, income, class, employment status, housing
tenure, educational background etc.) are included in an overall model to allow for the
population composition of particular areas.
Figures 2(c) and (d) show variation in both intercepts and slopes. Measures capturing
that variation would indicate area effects that imply both that areas differ and that they
differ in relation to age. The fourth type of area effect thus concerns the extent to
which slopes differ between areas: in Figure 2(c) age impacts on cigarette
consumption much more in some areas than others. The degree of variation between
areas changes according to age and does so in an increasing fashion.
The final type of area effect concerns the covariation of slope and intercept variation
at the area level. In Figure 2(c), the cigarette consumption/age relation is strongest in
areas where consumption is higher on average; a steep slope is associated with a high
intercept. The complex criss-crossing of Fig. 2(d) is the result of a lack of pattern in
the relationship between the variations in the slopes and intercepts. Overall therefore,
the variation in area effects can be summarised by only three terms in the random part
of the model. Importantly, this situation prevails whether there are 20 contexts or 200.
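Because the random slope departure multiplies age, these three terms together imply a between-area variance that is a quadratic function of centred age. The sketch below evaluates that function; it relies only on the standard identity for the variance of a weighted sum, and the parameter values are hypothetical rather than estimates from any study.

```python
def between_area_variance(x, var_u0, var_u1, cov_u01):
    """Variance across areas of (u0j + u1j * x) at centred age x.

    Var(u0 + u1*x) = var_u0 + 2*x*cov_u01 + x**2 * var_u1
    """
    return var_u0 + 2.0 * x * cov_u01 + (x ** 2) * var_u1


# Hypothetical values: intercept variance 4, slope variance 0.01,
# and an intercept-slope covariance of -0.1.
for centred_age in (-10, 0, 10):
    v = between_area_variance(centred_age, var_u0=4.0, var_u1=0.01, cov_u01=-0.1)
    print(centred_age, round(v, 3))
```

With these illustrative values the between-area variance is larger ten years below the mean age than ten years above it; a positive covariance would reverse that pattern.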
As well as looking at complexity in terms of between-area variation, it is also possible
to look at variation between individuals. This is beyond the scope of this chapter but,
besides being of substantive interest in its own right, complex between-individual
heterogeneity can have important implications for estimates of between-area
heterogeneity as there may be confounding across levels. What may appear to be
higher-level area variability may in fact be between-individual, within-area
heterogeneity (Bullen et al., 1997).
Just a little algebra
Thus far a graphical approach has been used to introduce the capabilities of multilevel
models as a means of identifying area effects on health-related outcomes. Re-
expressing these contentions as algebra is relatively straightforward, at least as far as
they have been taken above. The fully-random models of Figure 2(c) and 2(d) in which
both slopes and intercepts vary can be summarised in the equation:
Yij = β0 + β1Xij + u0j + u1jXij + e0ij
There is a single outcome, cigarette consumption (Y) and a single individual-level
predictor variable: age (X) centred about its mean. Yij represents the cigarette
consumption of individual i in place j, Xij is the centred age of individual i in place j.
The terms β0 and β1 comprise the fixed part of the model and identify, respectively,
the intercept (β0) and the age-related slope (β1). The β parameters can be interpreted
as follows: β0 is the consumption for individuals of average age; β1 is the linear increase
of consumption with age.
The random part of the model is identified by e0ij, u0j and u1j. These are the ‘random’
departures from β0 and β1. The e0ij capture the variations in individual cigarette
consumption that are not accounted for by age: the individual ‘random’ term. They
can be summarized by a single variance term, σ²e0. The u0j identify the variation in
consumption at the area level, taking into account individual age and the ‘residual’
variation at the individual level; they can also be summarised by a single variance term,
σ²u0. Finally, the u1j distinguish the variation in the strength of the relationship between
consumption and age across areas: the area-specific variability of the age-related
slope (β1). Their variance is σ²u1. Without this term, the graph of the equation would be
that shown in Figure 2(b). Not shown in the above equation, but implicitly present, is
the covariation of the area-level slopes and intercepts: σu10.
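For reference, the full specification can be set out compactly as below. This is a sketch under the conventional assumption, not stated explicitly above, that the random terms are normally distributed with zero means; the random slope departure is written as multiplying centred age, and \Omega_u is a label introduced here for the area-level covariance matrix.

```latex
\[
\begin{aligned}
Y_{ij} &= \beta_0 + \beta_1 X_{ij} + u_{0j} + u_{1j} X_{ij} + e_{0ij} \\
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix} &\sim N(0, \Omega_u), \qquad
\Omega_u = \begin{pmatrix} \sigma^2_{u0} & \sigma_{u10} \\ \sigma_{u10} & \sigma^2_{u1} \end{pmatrix} \\
e_{0ij} &\sim N(0, \sigma^2_{e0})
\end{aligned}
\]
```

Constraining u_{1j} to zero for every area removes the slope variance and the covariance, leaving the random-intercepts case with its parallel lines.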
This basic notation builds on the notions of a standard regression model and indicates
all that is needed to interpret a simple multilevel model. It can readily be extended by
adding additional fixed effects, β terms, measured on either a continuous scale or as a
dummy variable, such as sex or the presence/absence of some areal attribute. It should
also be noted that a range of different types of response variables can be handled
(Rasbash et al., 2003). As well as the standard Gaussian model for continuous
responses that has been introduced here, logit, log-log and probit models can be
specified to model proportions and binary outcomes. Poisson and negative binomial
distribution models are available to model counts, and multinomial and ordered
multinomial models to model multiple categories.
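As an illustration of one of these alternatives, a binary outcome can be handled by placing the same fixed and random terms on the logit scale. The sketch below is an assumption-laden example (hypothetical coefficient values, random intercept only) showing how an area departure shifts a predicted probability.

```python
import math


def predicted_probability(x, beta0, beta1, u0j):
    """Inverse-logit of the linear predictor beta0 + beta1*x + u0j."""
    eta = beta0 + beta1 * x + u0j
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical values: the same individual (centred age x = 0) placed in an
# average area and in an area whose intercept departure is +0.5 logits.
p_average = predicted_probability(0.0, beta0=-1.0, beta1=0.03, u0j=0.0)
p_higher = predicted_probability(0.0, beta0=-1.0, beta1=0.03, u0j=0.5)
```

The area departure is additive on the logit scale, so its effect on the probability scale depends on where the individual starts.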
A multitude of multilevel models
Figure 3 reveals the further generalisability of a multilevel approach by showing how
the basic two-level structure can be extended to reflect a number of other more
complex, yet frequently occurring and substantively interesting, data structures. The
two-level structure of Fig. 3(a) can be readily extended to the three-level structure of
Fig. 3(b) with individuals at level-1 nested within local neighbourhoods at level-2 and
regions at level-3. Variables can be included at each of the three levels making it
possible, for example, to examine the cigarette consumption/age relation in the
context of both local economic prosperity and regional economic prosperity. This
extension of the framework to many levels is important as it ensures that any area
effects are apportioned to the relevant level. Rice et al. (1998) and Weich et al.
(2004a,b) have focussed attention on household membership as a level between
individuals and neighbourhoods.
<<Figure 3 about here>>
There are many examples of these ‘spatial’ models in which people nest within areas.
Recent examples include Subramanian and Kawachi (2003), Subramanian et al.
(2003) and Subramanian et al. (2004); Twigg et al. (2000a) demonstrate the utility of
the approach in synthetic estimation. Moon and Barnett (2003) used a generalised
binomial proportional-response model to compare Maori and Pakeha (European)
smoking in New Zealand. They were able to work with a four-level model of
individuals nested in census tracts nested in larger census area units nested in local
government districts. The work concluded that, while Maori tend to smoke more than
Pakeha, they actually smoke less than might be expected in districts where they form
a large fraction of the population. This intriguing finding was linked to notions of
relative inequality and prompted an ongoing project on the impact of segregation on
smoking behaviour. An earlier example of the basic spatial multilevel model was
Duncan et al. (1993). This work sparked a number of increasingly complex inquiries
into area effects on smoking and drinking through its controversial conclusion that
area was of negligible influence on either behaviour. This conclusion was derived
from an analysis of HALS at three levels: individual, ward (local electoral districts,
average population size around 5000) and region (22 in UK, average population
around 2.5 million). Controlling for a number of individual characteristics reduced
regional differences in smoking prevalence to 3% either side of the national average,
and average alcohol consumption to 10% either side of the national average.
In passing, at this stage, it should be noted that the key issue in multilevel analysis is
hierarchy. So far the focus has been on spatial hierarchies that build from people to
local areas to larger regions. Each level in the hierarchy nests within a higher level.
There is no reason however why this nesting need be spatial in an areal sense. The
classic development of multilevel modelling was in educational research and saw
pupils nested within classes within schools (Aitkin and Longford 1986; Goldstein
1995). The obvious analogy is patients in wards in hospitals, or people in family
doctor lists. These nestings are spatial in a ‘point’ sense: they link people to point
locations rather than areas. Jones and Moon (1991) undertook such an analysis in their
study of vaccination uptake and its variation between UK general practices after
controlling for the individual characteristics of list members.
A further point to note is that hierarchies need not build from the individual. Area-
based outcomes can be considered within higher-level regional contexts. In the
absence of individual data, this approach can provide a sound way to control for both
‘local’ circumstances and the possibility of higher-level regional variations. Langford
(1995) used this approach in a study of district-level variation in childhood leukaemia
mortality. He worked with 1469 districts nested within 62 higher-level ‘counties’ and
found little variation at the higher-level once district variation had been accounted for.
The small amount of higher-level variation was greatest for rural areas. The outcome
measure was also associated with a high level of people in the armed forces. Langford
used these findings to draw tentative conclusions about the relationship of leukaemia
to population mixing in small communities.
Health researchers are also obviously interested in how outcome measures change
over time (Goldstein et al. 1994). Moon et al. (2002) provide an exploratory example
but the multilevel framework itself can be modified to understand context in terms of
temporal settings. Figure 3 (c) shows how a repeated cross-sectional design can be
represented as a multilevel structure. Here level-3 in the hierarchy identifies areas,
level-2 is years and level-1 is individuals. Thus, level 2 represents repeated
measurements of places. Such a structure can be used to examine outcome trends
within higher-level units taking account of their changing compositional make-up. A
recent example of this design is provided by Jones and Jen (2004) who used data
from the World Values Survey with self-rated health as the response variable and
individuals at level-1, survey years at level-2 and the countries of the World Values
Survey at level-3. One of the bonuses in this design is that it is relatively robust to
imbalance; countries do not have to report for every survey year.
Figure 3(d) shows another way in which time as context can be built into a multilevel
analysis. In a repeated individual measures or ‘panel design’, level-1 is the
measurement occasion indexed by its time, level-2 is the individual and level-3 refers
to areas. Thus, level-1 represents repeated measurements of a group of individuals at
particular times while the characteristics of the individuals are recorded at level-2.
Such a structure allows the assessment of individual change within a contextual setting.
Unlike conventional repeated measures methods which require a fixed set of repeated
observations for all persons, both the number of observations per person and the
temporal spacing among the observations may vary in a multilevel approach. A recent
example of the repeated individual measures design is provided by Neuendorfer et
al. (2001) in a study of the effects, over time, of depressive symptoms in persons with
Alzheimer's disease on depression in their family caregivers. They found that the rate
of increase in caregiver depression was predicted by the rate of increase in patient
depressive symptoms and by increases in patient dependency in activities of daily
living. Other examples are provided by Hardy et al. (2003) and Sithole and Jones
(2002); an extensive discussion is available in Singer and Willett (2003) whose fourth
chapter gives explicit consideration to a multilevel panel model of alcohol
consumption.
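The tolerance of unbalanced designs follows from storing one record per measurement occasion. The sketch below uses invented records (identifiers, times and values are all hypothetical) to show occasions at level-1 grouped within individuals at level-2, with areas at level-3, where individuals differ in both the number and the spacing of occasions.

```python
from collections import defaultdict

# One row per measurement occasion: (area, person, time in months, outcome).
records = [
    ("A", "p1", 0, 4.0), ("A", "p1", 6, 5.0), ("A", "p1", 12, 5.5),
    ("A", "p2", 0, 3.0), ("A", "p2", 9, 3.5),
    ("B", "p3", 3, 6.0),
]

occasions = defaultdict(list)  # level-2 unit -> its level-1 occasions
area_of = {}                   # level-2 unit -> level-3 unit
for area, person, month, value in records:
    occasions[person].append((month, value))
    area_of[person] = area

# Number of occasions contributed by each individual; these need not be equal.
counts = {person: len(obs) for person, obs in occasions.items()}
```

Nothing in this layout requires a fixed set of occasions per person, which is why the multilevel panel structure accommodates missing or irregularly spaced measurements.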
A further example of bringing in time as context in a multilevel framework concerns
survival models where attention is focused on the time to an event. In these cases, the
concern is with the survival of individuals (with particular characteristics) in
particular contexts. An example would be the time a respondent stays alive after the
beginning of a trial; this could be related to both individual and contextual
characteristics. Such an approach requires special methods as the complete survival
time is often unknown for many respondents. Jones et al. (2004) discuss this research
direction.
Bringing in greater complexity
Conceptually, the same response measured at different times is no different from
many responses measured at one time. Consequently, the multilevel framework can
also be used to represent several different, though related, response variables. In the
case of health research, this enables researchers to examine several different
measures/dimensions of health status simultaneously. These different measures form a
set of response variables at level-1, which nest within individuals at level-2, who nest
within communities at level-3. This form of multilevel structure is shown in Fig. 3(e).
It is not necessary for measurements to be made on all individuals for all responses
and the model can accommodate sets of responses that are a mixture of both
categorical and continuous variables as well as situations where the responses are
measured in the same way.
The multivariate multilevel model has great potential as researchers are often
interested in two (or more) different, though related, dimensions of health. Duncan et
al. (1996) and Twigg et al. (2000b) provide case studies of the approach. The former
examined the inter-relation of the decision to smoke and the amount smoked finding
that areas with many smokers also tend to have people who smoke more. The latter
investigated the simultaneous effect of individual demographic characteristics and
socio-structural factors on self-reported problem drinking as revealed by CAGE
scores and 'unsafe' levels of alcohol consumption. Whilst the influence of key
sociostructural variables was broadly similar for both unsafe alcohol consumption and
high CAGE scores, there were notable exceptions when results were examined by
tenure group: those in the rented sector were more likely to be problem drinkers as
revealed by CAGE, but less likely to consume (unsafe) amounts of alcohol. Both
dimensions of drinking behaviour were influenced by the consumption patterns of
others in the household, with both likelihoods increasing as the average consumption
of others in the household rose. The authors also found that the proportion of the
population whose drinking behaviour might be classed as (potentially) problematic
via the CAGE responses was substantially less than the proportion consuming above
recommended 'safe' levels.
All the models considered so far have been strictly hierarchical, and contextual effects
have nested within each other. Such a conception is frequently unrealistic and does
not exhaust the possibilities of the multilevel formulation. Contextual sources of
variation overlap and more than one may exist at each level. The resulting structure
may be hierarchical, but the hierarchy is complex. Each lower-level unit may belong
to more than one unit at the next higher level. In the case of health outcomes, an
individual’s health status may be influenced both by where they live and where they
work. This can be modelled using a cross-classified structure with individuals at
level-1 and both neighbourhood and workplace at level-2. This structure is shown in
Fig. 3(f). Explanatory variables can be included for individual-level characteristics
and for both level-2 units. Substantively, this allows different contexts to be
simultaneously modelled making it possible to identify contextual settings that are
having a confounding influence (Rasbash and Browne 2001). What appears as
between-workplace variation may in fact really be between-neighbourhood variation.
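Whether a data set calls for a cross-classified rather than a strictly hierarchical specification can be checked directly from the unit identifiers. The following sketch, using invented identifiers and a hypothetical helper name, tests whether every workplace draws its members from a single neighbourhood and vice versa; if neither holds, the two level-2 classifications are crossed.

```python
from collections import defaultdict


def nests_within(pairs):
    """True if each lower unit in (lower, higher) pairs maps to one higher unit."""
    seen = defaultdict(set)
    for lower, higher in pairs:
        seen[lower].add(higher)
    return all(len(highers) == 1 for highers in seen.values())


# Hypothetical individuals: (person, neighbourhood, workplace).
people = [
    ("p1", "n1", "w1"),
    ("p2", "n1", "w2"),
    ("p3", "n2", "w1"),  # workplace w1 also draws from neighbourhood n2
]

workplace_in_neighbourhood = nests_within((w, n) for _, n, w in people)
neighbourhood_in_workplace = nests_within((n, w) for _, n, w in people)
crossed = not (workplace_in_neighbourhood or neighbourhood_in_workplace)
```

In this invented example neither classification nests within the other, so a cross-classified structure of the kind shown in Fig. 3(f) would be the appropriate representation.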
Langford and Bentham (1996) provide an example of a multilevel cross-classified
analysis. Their level-1 was a geographical area, English health authorities, and their
outcome measure was the standardised mortality ratio for males and females for each
district. At level-2 they had crossed measures for the region and for an area typology,
the ACORN family. The main point of interest was that a district in any one region
might be in any of the several ACORN families, and vice-versa. From their study,
they were able to conclude that region accounts for approximately four times more
variation in SMRs than is explained by the ACORN classification and that a clear North-
South divide in excess mortality remains when both region and socio-economic
classification are modeled simultaneously.
Recently attention has focussed on the additional complexity that can result when
higher-level units are defined by a membership made of lower-level units and that
membership changes over time (Goldstein 2003). In the simple example of people
nested within areas, an individual might move several times and any individual-level
outcome might reflect area effects drawn from several contexts. Subramanian (2004)
elaborates this model. Figure 4 provides a visual summary. Time measurements
(level-1) are nested within individuals (level-2) who are in turn nested in
neighbourhoods (level-3). Importantly, individuals are assigned different weights for
the time spent in each neighbourhood. Thus, individual 25 moved from neighbourhood
1 to neighbourhood 25 during time period t1-t2, spending 20% of her time in
neighbourhood 1 and 80% in her new neighbourhood. This multiple-membership
design should allow control of changing context as well as changing composition. It
can also be extended to enable consideration of weighted effects of proximate
contexts (Langford et al. 1999). So, for example, the geographical distribution of
disease can be seen not only as a matter of composition and the immediate context in
which an outcome occurs, but also a consequence of the impact of nearby contexts
with nearer areas being more influential than more distant ones. As perhaps the
ultimate extension of this idea, Goldstein (2003) talks of the modelling of area-level
outcomes in which the area is conceptualised as having a multiple membership of
individuals.
<<Figure 4 about here>>
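The weighting idea in the multiple-membership design can be illustrated with a minimal Python sketch. The function name and the area effect values are hypothetical and chosen only for illustration; the 20%/80% weights are those given above for individual 25. The sketch assumes that an individual's contextual contribution is the weighted sum of the effects of the areas lived in, with weights summing to one.

```python
def multiple_membership_effect(weights, area_effects):
    """Return the weighted sum of area-level effects for one individual.

    weights: mapping from area name to the share of time spent there.
    area_effects: mapping from area name to that area's (hypothetical) effect.
    """
    total = sum(weights.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("membership weights must sum to 1")
    return sum(w * area_effects[area] for area, w in weights.items())


# Hypothetical area effects: illustrative values, not estimates from any study.
area_effects = {"neighbourhood 1": 0.5, "neighbourhood 25": -0.25}

# Individual 25 spent 20% of the period in neighbourhood 1 and 80% in neighbourhood 25.
weights_individual_25 = {"neighbourhood 1": 0.2, "neighbourhood 25": 0.8}

effect = multiple_membership_effect(weights_individual_25, area_effects)
print(effect)
```

In a fitted model the area effects would be random effects estimated from the data rather than fixed numbers; the sketch shows only how the weights combine them.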
If multilevel models are constructed that are ‘realistically complex’ (Best et al. 1996),
a better understanding of area effects and the ecology of health may eventuate.
Reaching this goal is likely to require a combination of different structures in any
analysis. Incorporating both time and area in an analysis is central to this task. Few
people stay forever in the same single context; most move from one to another. Meeting
this challenge underlines the need for high-quality geo-referenced data.
Cautionary notes
Despite the considerable research attention that has focussed on multilevel analysis,
the approach is not without its problems. Nor is it without its critics and outstanding
challenges. Three issues can be isolated. The first relates to sample sizes. To obtain
reliable estimates of both within-area and between-area variation, we require many
individuals from many areas. The former allows a precise assessment of within-area
relationships, the latter a precise assessment of between-area relationships, and the
two together allow one to be distinguished from the other. Of course, there can be
some compromise between the number of higher and lower level units, but the net
result is that data requirements are substantial. Many studies will not be able to satisfy
these criteria, particularly with regard to the number of contexts (sampled and
distinguishable areas). It is important moreover to note that there are substantial
technical implications if multilevel analyses are carried out when the number of
higher-level units is small, including downwardly biased variance estimates. Recent
developments in Markov chain Monte Carlo (MCMC) approaches to multilevel modelling
(Rasbash et al. 2003; Goldstein 2003) can alleviate this problem somewhat but with a
concomitant computational overhead.
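The downward bias with few higher-level units can be illustrated with a small simulation. This is a simplified sketch under stated assumptions, not the procedure used in any study cited here: it assumes a balanced design with no predictors, true between-area and within-area variances both equal to 1, and the closed-form maximum likelihood (ML) estimator of the between-area variance for that balanced case. The function names, seed and replication counts are arbitrary choices made for illustration.

```python
import random
import statistics


def ml_between_variance(groups):
    """ML estimate of between-group variance for a balanced one-way design."""
    n_groups = len(groups)
    n_per_group = len(groups[0])
    means = [statistics.fmean(g) for g in groups]
    grand = statistics.fmean(means)  # equals the grand mean when groups are balanced
    ss_between = n_per_group * sum((m - grand) ** 2 for m in means)
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ms_within = ss_within / (n_groups * (n_per_group - 1))
    # ML divides the between-group sum of squares by J rather than J - 1,
    # which is the source of the downward bias when J is small.
    return max(0.0, (ss_between / n_groups - ms_within) / n_per_group)


def mean_estimate(n_groups, n_per_group, reps, rng, sigma_u=1.0, sigma_e=1.0):
    """Average ML between-group variance estimate over simulated datasets."""
    estimates = []
    for _ in range(reps):
        groups = []
        for _ in range(n_groups):
            u = rng.gauss(0.0, sigma_u)  # area effect shared by everyone in the area
            groups.append([u + rng.gauss(0.0, sigma_e) for _ in range(n_per_group)])
        estimates.append(ml_between_variance(groups))
    return statistics.fmean(estimates)


rng = random.Random(2005)
mean_few = mean_estimate(n_groups=5, n_per_group=10, reps=2000, rng=rng)
mean_many = mean_estimate(n_groups=50, n_per_group=10, reps=500, rng=rng)
print(mean_few, mean_many)  # true between-area variance is 1.0 in both cases
```

Under these assumptions the average estimate with only five areas should fall noticeably below the true value of 1, while the average with fifty areas should lie close to it.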
A second issue relates to operationalising context in terms of the hierarchical structure
of particular datasets (Sampson and Morenoff 2002). This problem is most formidable
in terms of analyses focusing on indeterminate spatial structures rather than more
clearly defined institutional ones. Massey (1991) makes the central point: “localities
are not simple areas you can easily draw a line around”. Yet many health-based
multilevel studies pay little attention to this warning and use the structure of the
dataset to define higher-level units. These structures often derive from administrative
boundaries and, whilst they may capture some notion of context, they often have no
explicit theoretical justification in terms of the outcomes being studied.
Third, multilevel studies face a set of challenges in terms of their operationalisation.
The need for realistic complexity has already been noted. So too has the need to
recognise that area effects are seldom captured in a simple hierarchy but are
themselves subject to autocorrelation with adjoining and overlapping areas. More
straightforwardly however, there is a need to counter a tendency to focus on the fixed
effects in models by an enhanced consideration of the random part (Subramanian et
al. 2003b). Complex variance-covariance structures are often necessary in order to
understand fully the nature of area effects. An inevitable counterpoint of this tendency
to complexity, necessary or not, must also be the development of robust procedures
for model testing and diagnostic analysis (Langford and Lewis 1998; Leyland and
Goldstein 2001).
Conclusion
This chapter has moved from an assessment of ecological studies, through a
consideration of methodological issues associated with the notion of area effects, to
an outline of the utility of multilevel models as a coherent framework for framing and
testing ideas about contextuality and variability. A key strength of the multilevel
framework is its considerable generality. It can be used to tackle a number of different
and important questions of interest to health researchers. It is now increasingly
recognized that area effects are complex. As Jones and Duncan (1996, 80) note, there
has been too much stress in health research on the stereotypical and the average and
not enough on variability. There is seldom a single area effect; reality is
heterogeneous: there are multiple area effects.
Summary
• Area-based studies run the risk of the ecological fallacy. There is no certainty that
an observed difference between areas applies to all individuals within those areas.
• Individual studies risk the atomistic fallacy. They are decontextualised and imply
that everything is the same, everywhere.
• Research into area effects (also known as contextual effects) needs to control for
composition (the characteristics of the individuals within an area).
• Multilevel models provide a means of achieving such control and can be applied
in a range of different designs that capture spatial, temporal and multiple settings.
• Realistically complex modelling is required to gain a better understanding of area
effects.
Further Reading
The statistical basis of multilevel modeling is set out in Goldstein (1995/2001),
Raudenbush and Bryk (2002), Snijders and Bosker (1999), Hox (2002), and Kreft and
de Leeuw (1998). This latter is perhaps the simplest entry-level text. The websites of
Goldstein’s multilevel models project (http://multilevel.ioe.ac.uk) and Bryk and
Raudenbush’s ‘hierarchical linear modeling’ project
(http://www.ssicentral.com/hlm/hlm.htm) are both helpful, particularly the former.
Goldstein has collaborated with Leyland on a recent text setting out health
applications in some detail (Leyland and Goldstein 2001). Throughout the chapter
reference has been made to the authors’ work applying multilevel models; a set of
papers from the mid and late 1990s outline this work in the area of health-related
behaviour (Duncan et al. 1993, 1996, 1998, 1999). Useful recent reviews of health-
related multilevel modeling include Diez Roux (2000, 2001), Pickett and Pearl (2001)
and Subramanian et al. (2003b). On area effects more generally, see Macintyre et al.
(1993). For an alternative perspective on ecological analysis and a relief from health
research, read King (1996).
References