Using Linked Consumer Registers to Estimate Residential Moves in the United Kingdom

Abstract

This paper argues that frequently updated data on the nature of residential moves and the circumstances of movers in the United Kingdom are insufficient for many research purposes. Accordingly, we develop previous research reported in this Journal to re-purpose consumer and administrative data in order to develop annual estimates of residential mobility between all UK neighbourhoods. We use a unique digital corpus of linked individual and household-level consumer registers compiled by the UK Consumer Data Research Centre, comprising over 143 million unique address records pertaining to the entire UK adult population over the period 1997–2016. We describe how records pertaining to individuals vacating a property can be assigned to their most probable residential destination, based on novel methods of matching names, assessing household composition, and using information on the date and probable distance of residential moves. We believe that the results of this analysis contribute highly granular, frequently updated estimates of residential moves that can be used to chart population-wide outcomes of residential mobility and migration behaviour, as well as the socio-spatial characteristics of the sedentary population.

KEYWORDS
consumer data, data linkage, internal migration, linked consumer registers, residential mobility
J R Stat Soc Series A. 2021;00:1–23.
wileyonlinelibrary.com/journal/rssa
Received: 13 January 2020 | Accepted: 7 May 2021
DOI: 10.1111/rssa.12713
ORIGINAL ARTICLE
Justin T.van Dijk
|
GuyLansley
|
Paul A.Longley
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
© 2021 The Authors. Journal of the Royal Statistical Society: Series A (Statistics in Society) published by John Wiley & Sons Ltd on behalf of
Royal Statistical Society
Department of Geography, University College London, London, UK
Correspondence
Paul A. Longley, Department of Geography, University College London, Gower Street, London WC1E 6BT, UK.
Email: p.longley@ucl.ac.uk
Funding information
Engineering and Physical Sciences Research Council, Grant/Award Number: EP/M023583/1; Economic and Social Research Council, Grant/Award Number: ES/L011840/1
1 | INTRODUCTION AND OVERVIEW
Every year, in the process of registering the right to vote, most of the UK’s adult resident population assents to inclusion on the public version of the Electoral Roll, and others consent to inclusion in contact lists in the course of acquiring goods or services. Subject to appropriate consents being given, these lists are then used by local governments and businesses in further aspects of business and service planning. When concatenated within annual time periods, these data can provide highly granular inventories of local populations and their characteristics, on faster refresh cycles and at higher spatial granularity than many conventional statistical sources. In a previous paper published in this Journal, we described the linkage and analysis of consumer and administrative data sources and the provenance of these ‘linked consumer registers’ (LCRs: Lansley et al., 2019). That paper described how, for each year over the period 1997–2016, the registers provide comprehensive, highly disaggregate and frequently updateable representations of population size and structure, along with reliable estimates of incompleteness and possible bias. These registers were linked only in cross section from numerous sources, and models were developed to impute gaps in them when sources failed to detect continuity of residence because data for particular years were missing. The paper appraised the applicability and value of the resulting unique data resource through the derivation of an annual small area household change index.
Our research agenda is to further develop and evaluate these registers along with other new data sources to create timely and pertinent nationally representative datasets for policy analysis (see Longley et al., 2018 for a consolidated statement). Our endeavours using conventional statistics alongside consumer data suggest very high levels of population coverage, and triangulation with conventional statistical sources makes it possible to investigate potential bias and other data quality issues (see Hand, 2018). These innovations can be seen as part of a wider movement to re-purpose new Big Data sources in order to supplement conventional statistics and permit richer analysis of known populations of interest. The specific motivation for this paper is to build on the data infrastructure of the linked consumer registers by explicitly linking all individual records throughout the 20-year period that they cover, in order to better understand the outcomes of intra-national migration and residential mobility.
The task of constructing any geographically extensive longitudinal dataset can be hugely challenging and time-consuming: in the United Kingdom it is additionally complex because population data are captured by three separate statistical agencies covering England and Wales, Scotland and Northern Ireland, while periodic boundary changes can limit direct comparison of aggregated data (Lomax & Stillwell, 2017). Detailed and reliable origin–destination figures on moves between small administrative geographies are available through decennial UK Censuses of Population, although truly population-wide updates have not been collected since the 2011 Census, with subsequent local level estimates derived from administrative sources, principally National Health Service (NHS) records. Greater granularity and specification of origin–destination flows in estimating internal migration is highly desirable in order to provide appropriate levels of services, especially where precise geographic targeting is an issue (Travers et al., 2007). In this context, internal migration plays an important role in shaping the current population structure at the local level (Lomax et al., 2014).
Data on population movements can contribute to understanding of related processes of internal migration (typically motivated by employment opportunities) and other residential mobility (typically motivated by adjustment of housing requirements to changed household circumstances): see Coulter et al. (2016). In response, the social composition of neighbourhood areas may be considered to endure or change over time, depending on whether the limited numbers of movers are replaced with others of similar ilk (Timms, 1975) or whether structural changes in residential composition manifest
neighbourhood gentrification or deterioration (Hamnett, 1991; Harvey, 1973; Park & Burgess, 1925). In both cases, population movement (or the lack of it) is the engine of stasis or change (see Smith & Denholm, 2006). At coarser regional scales, the net effects of neighbourhood change may both drive and be driven by local labour markets and may be associated with changing patterns of social inequalities (Fielding, 1992) as well as wider economic, political, cultural and environmental contexts (Smith et al., 2015).
A primary motivation for residential moves is the desire of households to improve or adapt to changing living circumstances, rendering residential mobility an indicator of social mobility (Fielding, 1992) as well as family life cycle change (Stapleton, 1980) or life course transitions (Tyrrell & Kraftl, 2015). Where some places offer greater opportunities than others, as manifest through high employment levels, they typically attract young migrants (Fielding, 1992; Rees et al., 1996), resulting in positive net migration flows (as first postulated by Ravenstein, 1885). Young adults are typically more geographically mobile as they tend to move independently rather than as a household: Bell et al. (2002) report that young adults have the highest propensities to move, and that they subsequently become increasingly sedentary with age until eventual retirement. This said, Fielding (2012) also suggests that the relationship between life course and migration is intricate and that spatial outcomes can differ between places as a result of some young adults postponing their entry into the labour market or delaying family formation. Taken together, issues of labour market differentiation, kinship and identity are all manifest in residential moves (see Clark & Moore, 1980; Fielding, 2012; Finney & Simpson, 2008; Smith et al., 2015), and highly granular measurement of household structure and origin–destination attributes is essential to improve understanding of motivation and process.
In the absence of other comprehensive and frequently updated data on the nature of local residential moves and the circumstances of movers, this paper proposes a highly disaggregate framework for measuring residential moves in the United Kingdom. Building on the LCRs, we develop explicit longitudinal linkage of all records throughout the 20-year period covered by the registers in order to uncover the outcomes of intra-national migration and residential mobility decisions. Ascertaining whether moves are motivated by domestic circumstances (typically described as residential mobility) or household economics (typically defined as migration) is difficult (see Coulter et al., 2016), and we consider that either can result in address transitions between annual updates of the consumer registers. We describe how records pertaining to individuals that vacate a property are assigned to their most probable destination address, based on novel methods of name matching, assessment of household composition and use of information on the date and probable distance of residential moves. The individual level results of this migration model are aggregated and, in the case of the 12 months prior to the 2011 Census, are compared with official statistics. The results of this analysis contribute highly granular, frequently updated estimates of residential moves and can be used to chart population-wide outcomes of residential mobility behaviour, and to examine the characteristics of individuals and households that do or do not move.
2 | HOUSEHOLD MOVES AND THE LINKED CONSUMER REGISTERS
The temporal and spatial granularity of conventional statistical sources is widely understood to be insufficient to understand the outcomes of a number of residential mobility or migration processes (Lomax et al., 2013; Lomax & Stillwell, 2017). While cross-sectional Census data are only available at 10-yearly intervals, longitudinal census records link successive censuses, albeit for just 1%
of the population (Champion & Shuttleworth, 2017). Small sample sizes also limit the geographic granularity of other Office for National Statistics (ONS) surveys such as the Labour Force Survey and the General Household Survey, with deleterious consequences for understanding the detail of neighbourhood change. As a consequence, researchers and practitioners have sought alternative data sources which might reliably inform about residential mobility occurrences on a more regular basis.
One of the most popular alternative sources of data on migration, used by the ONS to compile Mid-Year Population Estimates, has been the NHS Central Register (NHSCR), outputs of which record the moves of patients between, but not within, health authority areas (Lomax & Stillwell, 2017). Unfortunately, this source was discontinued in February 2016 (ONS, 2020a) and the subsequent Patient Register Data Service (PRDS) for England and Wales was left as the main dataset feeding into the annual population estimates (Lomax & Stillwell, 2017; ONS, 2020a). Where the Patient Register does not cover within-year moves, the Personal Demographics Service (PDS), which replaced the NHSCR in mid-2017, is also now used in calculating Mid-Year Population Estimates by the ONS. Similar to the NHSCR, the PDS records weekly updates on the movements of patients and, together with the PRDS, is used to estimate residential moves between local authorities. The PDS also makes it possible to estimate cross-border flows between England and Wales, Scotland and Northern Ireland (ONS, 2020b). However, even where service organizations have a mandate for universal coverage, as with the NHS, address records are patchy in quality, some transient groups (especially young adults) are heavily under-recorded and, more generally, short distance moves are underrepresented (Lomax et al., 2013; Lomax & Stillwell, 2017). Moreover, these data are only made available for research purposes at relatively coarse spatial aggregations (Stillwell & Thomas, 2016).
Unlike NHS data sources, the LCRs are compiled from administrative and consumer sources and are created using a blend of both deterministic and fuzzy procedures in order to link public versions of the UK Electoral Register and consumer files over the period 1997–2016. The first 6 years of the series are drawn from the full Electoral Register prior to the introduction of opt-out provisions in 2003, and numerous consumer sources are used for subsequent years in order to supplement the public (post-opt-out) registers. The component registers were obtained in annual releases from a range of industry value-added resellers, although the identities of the different private sector providers are not known. The combined coverage of the registers wanes over time, principally because consumer data sources do not fully compensate for increasing rates of opt-out from the public Electoral Register. Additionally, the early years of the LCRs have known ‘non-voter’ bias (see Electoral Commission, 2016; Hoinville & Jowell, 1978), but the completeness and provenance of the post-2003 registers is largely undocumented. A full discussion of remedial steps to address these issues is provided in Lansley et al. (2019).
Each annual LCR was created by linking the component annual registers to the best available address frame, comprising Ordnance Survey AddressBase Premium (Ordnance Survey, 2020) and the Royal Mail’s Postcode Address File (PAF: Royal Mail, 2020). AddressBase Premium is the most comprehensive geographic dataset of addresses available in the United Kingdom. The Postcode Address File is a list of all postal addresses in the United Kingdom, owned and maintained by Royal Mail. In addition to addresses, the LCRs record adults’ given and family names and the first and last time they were observed at any given recorded address. The total numbers of adults recorded in each year of the LCRs were within 2% of official ONS Mid-Year Population Estimates, after including the very small number of imputations used when house sales were known to have occurred but no new residents were found in any later time period.
The LCRs have many useful applications that can be tailored to bespoke temporal and geographic aggregations, subject to strict disclosure controls. Estimates of neighbourhood population turnover have been created by detecting when new household units join an address (Lansley et al., 2019). Elsewhere, name-based tools have been used to infer ethnicity (Kandt & Longley, 2018) and hence create neighbourhood estimates of changing ethnic composition (Lan et al., 2019); similar tools have been used to chart intergenerational population change (Kandt et al., 2020). In important respects, the benefits of the LCRs accrue from their granularity and their regular annual update cycle.
3 | RECORD MATCHING PROCEDURES AND THE SPACE–TIME GEOGRAPHY OF RECORDED MOVES
The LCRs comprise only names, addresses and timestamps indicating probable start and end dates of residence. In this paper, we present a novel methodology to link those leaving an address to their most probable destinations, using timestamps, relative locations and combinations of names within households recorded at addresses. The methodology essentially entails linking apparent disappearances of named individuals at one address and reappearance at a different one within predefined timeframes. This procedure works very effectively where combinations of individuals identify a unique household, or where an individual’s given and surname combination is rare (or even unique). In other cases, it is necessary to use a probabilistic method to assign names that could link multiple origins and destinations, incorporating time and distance factors. Once completed, our results are compared with official statistics from the 2011 Census and other aggregate migration estimates. Here we set out the procedures used to construct internal migration estimates from the LCRs and the empirical findings arising from the matching process.
3.1 | Matching of forename–surname pairings
Names are tokens that provide reliable ways of tracing individual and household movements because they are usually retained throughout their bearers’ life courses, unless changed upon marriage or gender reassignment. While most forename–surname pairs are not unique, they are usually very uncommon: see Figure 1.
While it is thus not always possible conclusively to identify a unique individual from their full name, household combinations at unique addresses are very likely to be distinctive. For example, LCR data identify the most common full name bearer combination at a single address in 2016 as John Smith and Margaret Smith living in the same household, occurring at 170 addresses: the component names John Smith and Margaret Smith occurred individually 9,853 and 6,122 times respectively. Viewed in each of the 20 years for which LCR data are available, an average of 62.6% of households comprised unique name compositions, a figure that rises to 85% for households comprising two or more persons. The proportion of unique household combinations has gradually increased since 1997, and only 7% of households share name combinations with more than 50 other households, the majority being lone adults. The LCRs document an apparent decrease in average household size, with lone adults increasing from 36% to 46% over the 20-year period: this figure is slightly higher than the 41% recorded by the 2011 Census of Population, and may manifest replacement of (self-assigned) head of household registration for the Electoral Roll with individual registration in 2014. Failure to match addresses drawn from different LCR sources may also contribute to the higher LCR figures. Against the background of these characteristics, individuals’ names and their household aggregations were
used to trace likely movements between addresses across the 20-year LCR period, resulting in either unique matches or a shortlist of candidates for further consideration.
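The share of households with a unique name composition, as discussed above, could be computed along the following lines. This is a minimal sketch and not the authors' implementation: the input layout (a mapping from an address identifier to the full names recorded there in one year), the function names and the toy data are all assumptions made for illustration.

```python
from collections import Counter

def household_key(names):
    """Order-independent key for the set of full names at one address."""
    return tuple(sorted(n.strip().lower() for n in names))

def share_unique_households(households):
    """households: dict mapping address id -> list of full names (one year).

    Returns the fraction of addresses whose name combination occurs
    at no other address in the same year.
    """
    keys = {addr: household_key(names) for addr, names in households.items()}
    counts = Counter(keys.values())
    unique = sum(1 for key in keys.values() if counts[key] == 1)
    return unique / len(keys) if keys else 0.0

# Toy data, invented for illustration only
households = {
    "A1": ["John Smith", "Margaret Smith"],
    "A2": ["Margaret Smith", "John Smith"],  # same combination as A1
    "A3": ["John Smith"],                    # lone adult, distinct from A1/A2
    "A4": ["Zofia Quill"],
}
print(share_unique_households(households))
```

Treating the household as an unordered set of names is what makes a common name such as John Smith far more identifying once it is paired with co-residents.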
3.2 | Temporal lags in recording moves
The format and precision with which residence at addresses is recorded varies between the constituent LCR data sources. Land Registry data record changes in property ownership, albeit not always the precise dates on which sales take effect. Many residential moves take place in ‘housing chains’, wherein house finance requires coordination of multiple moves on the same date, which can render linkage of household movements straightforward. Address-level house sales data from the Land Registry (Open Data for England and Wales) and Registers of Scotland (obtained from the Urban Big Data Centre) were linked to the LCRs in order to cross-validate some of the apparent moves and to hone estimates of the dates upon which house sales in the owner-occupier sector triggered moves. Just under 14 million entries in the LCRs were linked to at least one property sale using Land Registry (England and Wales) or Registers of Scotland price paid data. Voter registration data are of more mixed quality, but their provenance is quite well understood: the Electoral Commission (2016) has identified that the majority of electors re-register within 2 years of changing address, although lags of 2 to 3 years are not uncommon, particularly between General Elections. However, it is also estimated that 17% of eligible voters (9.4 million adults) in Great Britain are not correctly registered at their current address, and that 11% of the full register entries are inaccurate, affecting up to 5.6 million adults (Electoral Commission, 2019). Comprehensive accuracy assessment is therefore difficult, since the timestamps from the various consumer data sources are also of largely unknown provenance.
FIGURE 1 Percentages of the adult population captured in the linked consumer registers bearing forename–surname pairings that occur 1, 2–9, 10–49 and 50 or more times over the period 1997–2016 (source: Author calculations)
3.3 | Probable distances of moves
The attenuating effect of distance upon numbers of residential moves has been broadly understood for over 125 years (Lee, 1966; Ravenstein, 1885). Today it remains the case that the majority of moves occur over short distances: the 2011 Census records that 57.1% of the individuals aged 16 and over that changed address within the preceding 12 months moved within the same Local Authority District (LAD). Previous use of conventional aggregate statistics has not established detailed distributions of distances moved (Stillwell & Thomas, 2016), although evidence suggests that residential mobility over longer distances has become less common since the 1970s (Champion & Shuttleworth, 2017). In recent years the growth of the private rental sector (where short-term lets are common) is also understood to have increased the frequency of moves, although such moves often remain focused upon the same employment centres (House of Commons, 2013).
4 | IMPLEMENTATION OF MATCHING PROCEDURES
We developed a two-stage model in order to estimate the origin–destination pairings of as many apparent movers as possible. The tension in this Big Data problem, applied to the circa 143 million individual name and address records in the LCRs, was to capture as many actual moves as possible, while remaining cognizant of the risk of assigning false positives in an incomplete dataset of variable data quality. Processes of household formation and dissolution needed also to be considered, as well as the effects of international migration. In the first stage of the procedure, deterministic assignment was used to link uniquely named lone individuals and their household members between origin–destination pairs in each time period. The second stage used a probabilistic approach to match remaining individuals who appeared to have moved, by developing a highly computationally intensive procedure in which every possible interaction within a 3-year window was identified and the most probable pairs were matched.
Lansley et al. (2019) describe how the LCRs are engineered as a smoothed time series in which any gaps in annual records of a named individual at an address between their dates of moving in and out are simply filled using that individual’s name. They also describe how household characteristics are imputed in later years where individuals are known to have moved out but the names of replacement residents are not known (e.g. because they have not yet registered as voters). In such instances, the number of replacement residents is noted but the records are not used in the residential mobility analysis reported here: the number of cases increases in the final years of the time series because there is only a short run of records over which gaps may be plugged. A consequence of more frequent resort to imputation in later years is that the residential mobility results likely become less complete, pending inclusion of post-2016 data that plug gaps by picking up new residents, for example through lagged voter registration.
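The gap-filling idea described above can be illustrated with a short sketch. It assumes, purely for illustration, that sightings arrive as (name, address identifier, year) tuples; the function names are hypothetical, and the actual procedure is the one documented in Lansley et al. (2019).

```python
def fill_gaps(observed_years):
    """Return every year from the first to the last sighting, inclusive."""
    if not observed_years:
        return []
    return list(range(min(observed_years), max(observed_years) + 1))

def smooth_register(sightings):
    """sightings: iterable of (name, address_id, year) tuples.

    Returns {(name, address_id): [years]} with any missing years between
    the first and last sighting at that address filled in.
    """
    by_residence = {}
    for name, address_id, year in sightings:
        by_residence.setdefault((name, address_id), set()).add(year)
    return {key: fill_gaps(years) for key, years in by_residence.items()}

# Toy example: a resident seen in 2001 and 2004 but missing in between
print(smooth_register([("Ann Lee", "A1", 2001), ("Ann Lee", "A1", 2004)]))
```

Filling only between observed sightings explains why the final years of the series are harder to smooth: there are few or no later records to bound a gap.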
Our initial premise was that the start and end dates of every residence history recorded in the LCR could be used to identify a move within the United Kingdom, without addressing additions or losses to the system arising from comings of age, marriages (if associated with a name change), deaths and international migration. Intra-UK migration indeed accounts for the majority of transitions, as evidenced by 2011 Census estimates that identify that over five million adults changed UK address in the preceding 12 months. As mentioned above, Land Registry property sales data were used, where possible, to calibrate the start and end dates of LCR residential histories where single or multiple residential moves were identifiable from financial transactions. In other cases, the ‘first seen’ and ‘last seen’ dates recorded in the LCRs were used. The frequency distribution of the recoded residential histories that result is shown in Table 1.
In this Table, the ‘last observation’ figure relates to all individual records that are not recorded at the same address in any subsequent year, while all other records are defined as ‘first observations’. Thus the ‘last observation’ figure for 1997 indicates that there were 6,209,906 individual records that are not recorded at the same address in any of the subsequent years and are therefore entered into the linking procedure. A feature of each annual consumer register is that different sources contribute towards them and, under the terms of data supply, detailed metadata of non-Electoral Register sources are not provided. Reduction in the volume of consumer data provided for 2011 has the consequence of inflating the ‘last observation’ figure, and the ‘first observation’ figure is reduced. This largely accounts for the different sizes of the ‘first observation’ and ‘last observation’ figures across the years.
4.1 | Stage 1
The ‘first seen’ and ‘last seen’ data, verified where possible using Land Registry property sales data, were used to bound residences for all forename–surname pairings with precisely two occurrences within the LCRs using the following steps:
1. If a forename–surname pairing disappeared from an origin and subsequently reappeared at a destination and remained there for more than a year, the records were linked as a move. All such linkages were considered as definitive and were therefore excluded from the second stage of the matching procedure.
2. For each uniquely matched forename–surname pairing, the non-unique names of any associated household members were retrieved and added if they matched the same origin–destination pairing. Household members were defined as any individual that shared the same address at any time during the residence of the uniquely named individual at the address. These individuals were also removed from the second stage of the matching procedure.

TABLE 1 Frequencies of ‘last seen’ and ‘first seen’ occurrences in each year of the linked consumer registers (LCRs)

Year    Last observation    First observation
1997     6,209,906          45,342,404
1998     6,065,950           8,123,676
1999     8,142,354           6,528,756
2000     4,045,262           8,437,748
2001     6,288,883           3,055,384
2002     5,744,780           7,433,461
2003     5,001,650           5,099,172
2004     5,578,480           4,553,431
2005     5,108,975           4,665,936
2006     3,979,450           4,793,242
2007     3,542,251           4,698,893
2008     4,438,117           3,654,831
2009     5,325,528           6,003,444
2010     6,727,176           6,203,975
2011    13,744,674           4,779,294
2012     7,575,166           5,016,716
2013     6,415,383           6,292,786
2014     6,586,815           2,111,463
2015     6,814,220           2,409,062
2016    26,454,029           4,585,375

In total, Stage 1 identified 3,045,108 moves, an average of 152,255 moves per year. Figure 2a shows that a large share of matches, accounting for 43.1% of all matched records, were made over consecutive years, consistent with Electoral Commission findings that most individuals re-register within 2 years of changing address, albeit that only a minority re-register within 12 months of moving (Electoral Commission, 2016). Figure 2b confirms a strong distance attenuation effect upon mobility, despite there being no distance weighting included in the matching procedure. Over the 20-year LCR period, 62.3% of individual moves captured in this stage of the model occurred within the same local authority: the equivalent figure from the 2011 Census for individuals aged 16 and over is 57.1%.
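The two Stage 1 steps can be sketched as follows. This is an illustrative reading rather than the authors' code: the record layout (dictionaries with `name`, `address`, `first_seen` and `last_seen` years), the function names and the toy data are hypothetical, and 'remained there for more than a year' is interpreted here as appearing in at least two annual registers.

```python
from collections import defaultdict

def stage1_links(records):
    """Link names that occur exactly twice: one origin, one later destination."""
    by_name = defaultdict(list)
    for rec in records:
        by_name[rec["name"]].append(rec)
    links = []
    for name, recs in by_name.items():
        if len(recs) != 2:  # only pairings with precisely two occurrences
            continue
        origin, dest = sorted(recs, key=lambda r: r["first_seen"])
        reappears_later = dest["first_seen"] >= origin["last_seen"]
        # Assumed reading of "more than a year": seen in two or more registers
        stays = dest["last_seen"] - dest["first_seen"] >= 1
        if origin["address"] != dest["address"] and reappears_later and stays:
            links.append((name, origin["address"], dest["address"]))
    return links

def overlaps(a, b):
    """True if two residence spells share at least one year."""
    return a["first_seen"] <= b["last_seen"] and b["first_seen"] <= a["last_seen"]

def household_comovers(link, records):
    """Names co-resident with the linked person at both origin and destination."""
    name, origin_addr, dest_addr = link
    mine = {r["address"]: r for r in records if r["name"] == name}

    def cohabitants(addr):
        return {r["name"] for r in records
                if r["address"] == addr and r["name"] != name
                and overlaps(r, mine[addr])}

    return sorted(cohabitants(origin_addr) & cohabitants(dest_addr))

# Toy data, invented for illustration only
records = [
    {"name": "Zofia Quill", "address": "A1", "first_seen": 2000, "last_seen": 2005},
    {"name": "Zofia Quill", "address": "B7", "first_seen": 2006, "last_seen": 2010},
    {"name": "John Smith", "address": "A1", "first_seen": 2001, "last_seen": 2005},
    {"name": "John Smith", "address": "B7", "first_seen": 2006, "last_seen": 2009},
    {"name": "John Smith", "address": "C3", "first_seen": 1999, "last_seen": 2012},
]
links = stage1_links(records)
print(links, household_comovers(links[0], records))
```

In the toy data the rare name anchors the move, and the common name John Smith is carried along only because it shares both the origin and the destination address with the anchor.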
4.2 | Stage 2
The Stage 2 model was devised to match as many unmatched individuals leaving an address as possible with their most probable name-matched pairings. This entailed attempted linkage of all remaining names observed to depart from an address to every bearer of the same name that joined any other address within a 3-year window, including the same year. There were c. 7.8 billion candidate non-unique pairings over the 20-year period, including multiple moves by the same individuals. For example, 1,243 John Smiths were last observed at addresses in 2013, making potential matches for any of the 14,238 John Smiths that were first recorded at other addresses during a subsequent 3-year window.
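To give a sense of scale, the John Smith example alone implies up to 1,243 × 14,238 = 17,697,834 candidate pairings. Candidate generation of this kind might be sketched as below; the tuple layouts, the function name and the toy data are assumptions for illustration, and the window test is one plausible reading of 'within a 3-year window, including the same year'.

```python
from collections import defaultdict

WINDOW_YEARS = 3  # window length stated in the text

def candidate_pairs(departures, arrivals, window=WINDOW_YEARS):
    """Pair each departure with same-name arrivals elsewhere within the window.

    departures: list of (name, address, last_seen_year)
    arrivals:   list of (name, address, first_seen_year)
    """
    arrivals_by_name = defaultdict(list)
    for arrival in arrivals:
        arrivals_by_name[arrival[0]].append(arrival)
    pairs = []
    for departure in departures:
        name, origin, last_seen = departure
        for arrival in arrivals_by_name.get(name, []):
            _, destination, first_seen = arrival
            lag = first_seen - last_seen
            if destination != origin and 0 <= lag <= window:
                pairs.append((departure, arrival))
    return pairs

# Toy data, invented for illustration only
departures = [("John Smith", "A1", 2013), ("John Smith", "A2", 2013)]
arrivals = [
    ("John Smith", "B1", 2013),  # same year: inside the window
    ("John Smith", "B2", 2015),  # two years later: inside the window
    ("John Smith", "B3", 2018),  # five years later: outside the window
    ("Jane Doe", "B4", 2014),    # different name: never a candidate
]
print(len(candidate_pairs(departures, arrivals)))
```

Grouping arrivals by name first avoids comparing every departure with every arrival, although the number of pairs for very common names still grows multiplicatively.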
In addressing this huge combinatorial problem, we draw upon Gale and Shapley’s (1962) ‘stable marriage problem’, which develops an algorithm to allocate men and women into suitable marriages using the ranked preferences of each individual. Our adaptation of this approach allocates individuals believed to have left addresses to the most probable vacated properties. Arranged by order of importance, our score was based on the number of residents’ names that matched, whether both properties were sold on the same day, the distance between the two properties, and the time lag between the last observation at the origin address and the first observation at the new one, taking into account normal time lags in detection of new voter registrations, etc. The precise score comprised:
a. a count of the number of full names that matched at each candidate address;
b. whether linked Land Registry or Registers of Scotland data indicated that the origin and
destination properties were sold on the same day (weighted 1), sold in the same year (weighted 0.1)
or neither (weighted 0);
c. the distance separating the origin and destination, measured as the inverse of the straight-line
distance connecting origin and destination, range standardized from a theoretical maximum distance
of 1,200 km to a value between 0 and 1; and
d. the time elapsed between the ‘last seen’ date at the origin address and the earliest ‘first seen’
occurrence at the destination, rescaled to values between 0 and 0.1 and with moves within the same
year weighted the same as moves matched across 2 years.
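One possible reading of this score, and of its use in a deferred-acceptance allocation in the spirit of Gale and Shapley (1962), is sketched below. The additive combination of the four components, the linear form of the distance term, the lag weights in `LAG_WEIGHT` and all function names are assumptions made for illustration; the text specifies only the components and their ranges.

```python
MAX_DISTANCE_KM = 1200.0  # theoretical maximum distance given in the text

# Assumed lag weights on the 0-0.1 scale: same-year moves are weighted the
# same as moves matched across 2 years, as stated; other values are guesses.
LAG_WEIGHT = {0: 0.05, 1: 0.1, 2: 0.05, 3: 0.0}


def sale_weight(same_day, same_year):
    """Component (b): 1 for same-day sales, 0.1 for same-year, else 0."""
    if same_day:
        return 1.0
    return 0.1 if same_year else 0.0


def distance_weight(distance_km):
    """Component (c), assumed linear: 1 at 0 km falling to 0 at 1,200 km."""
    clipped = min(max(distance_km, 0.0), MAX_DISTANCE_KM)
    return 1.0 - clipped / MAX_DISTANCE_KM


def match_score(names_matched, same_day, same_year, distance_km, lag_years):
    """Assumed additive score; name counts dominate because they are integers."""
    return (names_matched + sale_weight(same_day, same_year)
            + distance_weight(distance_km) + LAG_WEIGHT.get(lag_years, 0.0))


def stable_match(scores):
    """Deferred acceptance over {(mover, destination): score}.

    Movers propose in descending score order; each destination holds the
    highest-scoring proposer seen so far. Returns {mover: destination}.
    """
    prefs = {}
    for (mover, dest), score in scores.items():
        prefs.setdefault(mover, []).append((score, dest))
    for ranked in prefs.values():
        ranked.sort(key=lambda item: item[0], reverse=True)

    next_choice = {mover: 0 for mover in prefs}
    held = {}  # destination -> mover currently held
    free = list(prefs)
    while free:
        mover = free.pop()
        if next_choice[mover] >= len(prefs[mover]):
            continue  # no candidates left: the mover stays unmatched
        score, dest = prefs[mover][next_choice[mover]]
        next_choice[mover] += 1
        rival = held.get(dest)
        if rival is None:
            held[dest] = mover
        elif score > scores[(rival, dest)]:
            held[dest] = mover
            free.append(rival)
        else:
            free.append(mover)
    return {mover: dest for dest, mover in held.items()}


# Hypothetical scores for two movers competing for two destinations.
toy_scores = {("m1", "x"): 2.5, ("m1", "y"): 1.0, ("m2", "x"): 3.0, ("m2", "y"): 0.5}
allocation = stable_match(toy_scores)
```

Because the number of matched names enters as an integer while the other components are bounded by 1, 1 and 0.1, this additive form reproduces the stated order of importance.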
FIGURE 2 (a) Lag time (years) required in order to match moves using names; and (b) distribution of
straight-line distances moved by matched individuals in Stage 1
4.3 | Additional data cleaning and record selection
The combination of Stages 1 and 2 returned c. 47.5 million moves across the 20-year period covered
by the LCR. Analysis of origin–destination pairs in Stages 1 and 2, however, identified many
apparent moves over very short distances, including c. 3.5 million moves within the same unit
postcode. Exploratory analysis indicated that some of these addresses appeared in multiple formats
in successive LCR entries, despite the best efforts to standardize them. Three diagnostic checks
were therefore devised in order to remove probable duplicate addresses in the combined results of
Stage 1 and Stage 2:
1. full string matching of the first line of the address (irrespective of postcodes or other
inconsistencies);
2. use of Soundex fuzzy matching (Stanier, 1990) to filter out addresses that were very similar when
read aloud (and may have been captured incorrectly where consumer addresses were entered through
dictation); and
3. identification of addresses that differed by just one or two characters, as identified using the
Levenshtein Distance measure.
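The three checks can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the implementation used here: the token-wise application of Soundex to the first address line, the lower-casing, the two-edit threshold parameter `max_edits` and the function names are all hypothetical choices.

```python
def soundex(word):
    """Standard four-character American Soundex code for a single word."""
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for letter in letters:
            codes[letter] = digit
    letters = [ch for ch in word.upper() if ch.isalpha()]
    if not letters:
        return ""
    encoded = letters[0]
    previous = codes.get(letters[0], "")
    for ch in letters[1:]:
        code = codes.get(ch, "")
        if code and code != previous:
            encoded += code
        if ch not in "HW":  # H and W do not separate repeated codes
            previous = code
    return (encoded + "000")[:4]


def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, 1):
        current = [i]
        for j, ch_b in enumerate(b, 1):
            current.append(min(previous[j] + 1,                   # deletion
                               current[j - 1] + 1,                # insertion
                               previous[j - 1] + (ch_a != ch_b)))  # substitution
        previous = current
    return previous[-1]


def address_key(first_line):
    """Assumed phonetic key: Soundex per word, numeric tokens kept as-is."""
    return " ".join(tok if tok.isdigit() else soundex(tok)
                    for tok in first_line.split())


def probable_duplicate(addr_a, addr_b, max_edits=2):
    """Return the name of the first check that fires, or None."""
    a, b = addr_a.strip().lower(), addr_b.strip().lower()
    if a == b:
        return "exact"
    if address_key(a) == address_key(b):
        return "soundex"
    if levenshtein(a, b) <= max_edits:
        return "levenshtein"
    return None
```

For example, `probable_duplicate("14 Smith Road", "14 Smyth Road")` is caught by the phonetic check, whereas a stray comma in an otherwise identical first line is caught only by the edit-distance check.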
Together, these additional checks matched 6.7 % of the moves originally identified in Stages 1 and
2, which were reclassified as pertaining to the same address. The probability density distribution of
distances over which these moves took place is shown in Figure 3. The majority of probable duplicate
addresses identified by the address matching, Soundex matching or Levenshtein matching are moves
over <100 metres, suggesting that these moves were indeed a result of formatting differences in the
LCRs. We then took out all moves that arrived at a destination in 1997 or left an origin in 2016:
these are, respectively, the beginning and end of the time series, for which we do not know whether
arrivals at destinations or departures from origins are the result of a move or simply an artefact
of the time series, and so might be recorded as false positives. After removing the probable
duplicate addresses and updating the records to exclude moves ending in 1997 or beginning in 2016,
41,658,922 moves were identified over the 20-year period covered by the LCRs, yielding an average
of 2,082,946 moves per annum.
FIGURE 3 Density distributions of assumed similar addresses by distance of apparent move
5 | RESULTS
5.1 | Aggregate estimates
Although the procedures set out here were applied consistently to all of the LCRs, the results
inevitably reflect (a) the completeness of the raw source data and (b) the extent to which it is
possible to replace gaps in the time series, which is greater in earlier time periods. The apparent
fluctuation over time in LCR estimates of median distances moved grouped by year of arrival
(Figure 4) should be viewed in this context. The early years (1997–2002) of the LCRs derive from a
single administrative source (albeit collected by multiple local authorities) and return the
smallest median distances of moves for these years. (Note that 1997 is excluded from Figure 4
because, as the start of the time series, it is not used as a year of first arrival.) Subsequent
‘opt out’ provisions for the Electoral Roll source and use of multiple consumer data sources to top
up the registers create greater uncertainty about the completeness and provenance of the data, and
this is associated with increased estimated median values of distances moved, particularly in 2014
and 2015. Gaps in the register in any year can be filled by carrying forward observations from
previous years or carrying back observations from subsequent years. Post 2013 the ‘carry back’
window is increasingly truncated by the impending end of the time series and so the numbers of
unfilled gaps progressively increase the probabilities of mismatching individual records. This is
likely to account for at least part of the increase in estimated distances moved in the later years,
although of course that increase may also arise because of increases in the actual distances moved.
Accordingly, and as we discuss below, it may be appropriate to consider the later distance
statistics as provisional, in much the same way as some conventional ONS statistics are badged as
provisional.
FIGURE 4 Median distance moved estimates (boxes) and interquartile ranges (whiskers) grouped by year
of first arrival, 1998–2016
2011 Census estimates (ONS, 2014) of residential moves within and between local authorities of
residents aged 16+ are broadly comparable: the UK-wide Pearson correlation coefficient comparing
the origin–destination counts recorded in the 2011 Census with the assigned LCR origin–destination
counts for the corresponding year is 0.97. This value is calculated using the pairwise interaction
matrix of 380 harmonized LADs and accounts for the flows in both directions between every pair of
LADs. In terms of coverage, the LCR matches are equivalent to 63.4 % of the total number of moves
recorded in the 2011 Census: 52.3 % of the total number of recorded intra-Local Authority moves and
78.4 % of the total number of recorded inter-Local Authority moves. For 2011, the median distance
of moves between all origins and destinations using the LCRs is 8.14 km, which is almost twice the
4.2 km captured in the 2011 Census flow data (ONS, 2015): this is likely to be because the LCR
captures a larger share of inter-Local Authority moves. The 2011 LCR median intra-Local Authority
distance moved is 1.7 km, while that captured by the 2011 Census is 1.7 km. For the inter-Local
Authority moves these median distance values are 37.3 and 30.6 km respectively.
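The comparison of two origin–destination matrices can be sketched as follows. The toy flows, the area labels and the function names are hypothetical, and whether intra-area (diagonal) cells enter the correlation is not stated in the text, so the sketch exposes that choice as the assumed parameter `include_diagonal`.

```python
import math


def od_vector(flows, areas, include_diagonal=False):
    """Flatten {(origin, destination): count} over every ordered pair of areas.

    Pairs absent from `flows` count as zero, so both directions between
    every pair of areas are represented.
    """
    return [flows.get((o, d), 0)
            for o in areas for d in areas
            if include_diagonal or o != d]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def coverage(lcr_flows, census_flows):
    """Total LCR moves as a share of total Census moves."""
    return sum(lcr_flows.values()) / sum(census_flows.values())


# Hypothetical flows between three areas (not real Census or LCR counts).
areas = ["A", "B", "C"]
census = {("A", "B"): 100, ("B", "A"): 80, ("A", "C"): 40,
          ("C", "A"): 20, ("B", "C"): 10, ("C", "B"): 30}
lcr = {pair: count / 2 for pair, count in census.items()}

r = pearson(od_vector(lcr, areas), od_vector(census, areas))
share = coverage(lcr, census)
```

In this toy case the LCR flows are exactly half the Census flows, so the correlation is perfect while coverage is only one half, which illustrates why the two measures are reported separately.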
No UK-wide origin–destination tables are available for inter-Local Authority moves post 2011,
although pairwise annual mid-year estimates of moves between LADs in England and Wales, derived
from NHS patient registers, are available (ONS, 2020b). The ONS also releases annual local area
estimates of the total number of internal migrants flowing into and out of each UK District (ONS,
2019b). In Figure 5 we present estimates of moves into and out of each District, expressed as ratios
between the LCR and ONS estimates. Here, a ratio of unity identifies perfect correspondence between
the estimates. LCR under-predictions outnumber over-predictions, but where outward moves are
over-estimated so too are inward moves, and vice versa. ONS UK-wide District Mid-Year Population
Estimates pertain to all individuals (ONS, 2019a) rather than adults, and as such are expected to be
larger than their LCR counterparts. The Pearson correlation coefficients between these total inflows
and total outflows for the post-Census years of the LCRs remain stable at 0.93 ± 0.01. Figure 5 also
shows that the correspondence between estimates endures over time: the mean ratios for inflows and
outflows in 2011 are 0.72 ± 0.18 and 0.71 ± 0.17 respectively, and the corresponding figures for
2016 are 0.65 ± 0.15 and 0.65 ± 0.16. The slight decrease in the mean figures over time can most
likely be ascribed to the LCR recording a lower fraction of the total adult population in the last
years covered by the Registers.
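A minimal sketch of how such District-level ratios and their summary statistics might be computed is given below. The District labels and totals are hypothetical, and treating the reported ± values as sample standard deviations is an assumption, since the text does not define them.

```python
from statistics import mean, stdev


def flow_ratios(lcr_totals, ons_totals):
    """Ratio of LCR to ONS totals per District; unity means perfect agreement.

    Districts missing from the LCR totals or with a zero ONS total are skipped.
    """
    return {district: lcr_totals[district] / ons_total
            for district, ons_total in ons_totals.items()
            if district in lcr_totals and ons_total > 0}


def summarise(ratios):
    """Mean and (assumed) sample standard deviation of the ratios."""
    values = list(ratios.values())
    return mean(values), stdev(values)


# Hypothetical inflow totals for three Districts.
lcr_inflows = {"X": 70, "Y": 80, "Z": 60}
ons_inflows = {"X": 100, "Y": 100, "Z": 100}

ratios = flow_ratios(lcr_inflows, ons_inflows)
mean_ratio, spread = summarise(ratios)
```

Ratios below unity are expected here because the ONS figures cover all individuals whereas the LCRs cover adults only.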
Figure 6 confirms that the 2011 patterns of residential mobility between LADs estimated using the
LCRs are broadly consistent with those recorded by the Census, although not all longer distance
moves from and to London are captured, most likely because the migration model trades off distance
and household composition when evaluating text string matches in order to reduce the likelihood of
false positive matches. However, shorter distance moves from Northern Ireland to destinations in
mainland United Kingdom close to Liverpool and Manchester are more common than recorded in the
Census. The left-hand map shows the 2011 flows reported by ONS (2014) with a minimum two-way flow
of 200 and the right-hand map shows the LCR estimates with a lower minimum two-way flow of 150, in
recognition that the LCRs pick up 63.4 % of all moves and 78.4 % of all inter-Local Authority
flows.
A great strength of these estimates is their granularity and local specificity, but the lack of these
qualities in most frequently updated statistics frustrates many direct comparisons. Figure 7
nevertheless shows two social aggregations. Figure 7a illustrates that the 2011 Consumer Register has
a consistent coverage of all Output Area Classification (OAC: Gale et al., 2016) Super Groups when
compared to the Census flow data (ONS, 2015). Estimated in- and out-migration rates are balanced
and consistent with general processes of household formation, progression through the housing
market and dissolution. Figure 7b shows similar consistency when moves are broken down by Index of
Multiple Deprivation (IMD) deciles for Great Britain (2019 IMD for England and Wales, 2020 IMD
for Scotland), and changes in the balance between in-migration and out-migration suggest general
upward filtering of households through the housing market. The overall message is thus of
consistency of LCR estimates with the 2011 Census. These findings provide a platform for further use
of the data for locality studies, to which we now turn.
5.2 | Disaggregate estimates
FIGURE 5 Estimates of ratios between the linked consumer register (LCR) and Office for National
Statistics (ONS) Local Area estimates of total inflows and total outflows, for selected years
A major advantage of the grounding of the LCRs at the level of the individual is that, subject to
disclosure control, estimates of residential mobility may be generated for any convenient
aggregation. To this end, Figure 8 presents mobility patterns between Middle Super Output Areas
(MSOAs) in Greater London, subject to minimum two-way flows of 10 or more individuals. These results
identify that many Outer London Boroughs (e.g. Kingston-upon-Thames and Sutton) host large numbers
of moves within their boundaries and that these outnumber moves further afield. Physical or
psychological barriers remain associated with diminished mobility, such as moves across the Thames
or across the Lea Valley Park separating Waltham Forest from Enfield, Haringey and Hackney.
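The minimum two-way flow threshold applied to the MSOA pairs can be illustrated with a short Python sketch. The function name, the area labels and the treatment of intra-area moves (dropped before thresholding) are assumptions for illustration only.

```python
def two_way_flows(flows, minimum=10):
    """Sum flows in both directions per unordered pair and apply a threshold.

    `flows` maps (origin, destination) to a count of movers. Moves within a
    single area are ignored, and pairs whose combined flow falls below
    `minimum` are suppressed, mimicking a disclosure control threshold.
    """
    totals = {}
    for (origin, destination), count in flows.items():
        if origin == destination:
            continue
        pair = tuple(sorted((origin, destination)))
        totals[pair] = totals.get(pair, 0) + count
    return {pair: total for pair, total in totals.items() if total >= minimum}


# Hypothetical flows between three areas.
toy_flows = {("A", "B"): 6, ("B", "A"): 5, ("A", "C"): 4, ("C", "C"): 50}
publishable = two_way_flows(toy_flows)
```

Here the A–B pair is retained because its combined flow is 11, while the A–C pair (4 movers) is suppressed.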
In a similar vein, overall trends may be disaggregated at local level, in order to envision the
precise ways in which residential filtering occurs. Knowledge of the precise origins and
destinations of moves makes it possible to characterize the ways in which local housing markets
function across a full range of scales. For example, the innermost ring of Figure 9a details how the
gentrifying area of Spitalfields and Banglatown in East London attracts an estimated 24.3 % of
movers from UK origins outside London, shown as the sectioned arc across the top of the innermost
ring. Those moving out, shown in Figure 9b, are less likely (20.6 %) to select destinations outside
of London, shown by the shorter sectioned arc in a similar position on the innermost ring. Over the
full period, the migration model captured 7,847 moves going into Spitalfields and Banglatown and
9,202 moves going out.
Selected popular origins and destinations are labelled in this Figure, subject to disclosure control
thresholds, but all are identifiable using the LCRs. In Figure 10, origin (Figure 10a) and destination
(Figure 10b) locations of movers into and out of this area are characterized as more or less deprived
as measured using the 2019 IMD for England and Wales. Although net population change is quite
small (Table 2), it is quite clear that incomers to Spitalfields are somewhat more likely to come from
areas in the least deprived IMD deciles, while the destinations of out-movers over the period
1997–2016 are less salubrious. Such measures make it possible to profile changes in the
characteristics of neighbourhood residents and to relate these to the trajectories of their
neighbourhoods (cf. Rabe & Taylor, 2010).
FIGURE 6 UK-wide principal migration flows between Local Authority origin–destination pairs in 2011:
(a) as reported by ONS (2014) (panel titled ‘Census’); and (b) as estimated using the LCRs (panel
titled ‘Migration Model’). Darker lines indicate larger flows
FIGURE 7 (a) 2011 Census and linked consumer register (LCR) adult population estimates for each of the
2011 Output Area Classification Super Groups, 2011 LCR estimates of adult individual moves into and
out of these categories, and 2011 Census counts of individual moves into and out of these
categories; and (b) proportions of the 2011 LCR falling into each Index of Multiple Deprivation
(IMD) decile and associated estimates of moves into and out of each decile. Panel titles: (a) 2011
Output Area Classification Super Groups; (b) Index of Multiple Deprivation
6 | DISCUSSION AND CONCLUSIONS
The LCRs enable a new UK-wide exploration of local outcomes arising from internal migration and
residential mobility over the period 1997–2016, in unprecedented spatio-temporal detail. In
particular, they make possible analysis of the origins and destinations of a large proportion of
residential moves and facilitate detailed analysis of the processes of residential filtering
consequent upon residential mobility.
The innovation of this research is to use UK nationwide lists of names and addresses, compiled
from consumer and administrative sources, to link the successive addresses occupied by individuals
and households over a 20-year period, with annual updates. The source data are individual level and
recorded at very high geographic and temporal granularity, and triangulation with periodic Census
sources and Mid-Year Population Estimates suggests that they are representative of underlying
population movements and the structure of the internal migration system. The main caveat to this is
that the period from which ‘carry back’ observations can be drawn to plug gaps in the Registers is
reduced for later years, with the implication that estimated distances of residential moves slightly
increase if local matches are undetectable. The diversity of extant surnames and of naming practices
otherwise renders full names an effective means of linking records over time and space, albeit that
the procedures of linkage are necessarily complex and computationally intensive. This methodology
might be adopted for different time periods and in different parts of the world in order to devise
generalized and timely snapshots of the outcomes of residential mobility. This approach is
especially timely, in view of ever-growing interest in repurposing administrative and consumer data
sources to supplement conventional population statistics derived from censuses and social surveys.
FIGURE 8 Aggregated post-2011 intra-London movers between MSOAs, subject to a minimum disclosure
threshold of 10
Conventional Census statistics provide comprehensive coverage of residential mobility throughout
the United Kingdom, albeit only every 10 years. Patient Register Data Service and Personal
Demographic data do much to fill this gap, but are only available to researchers at much coarser
granularity. Subject to disclosure control procedures, the LCRs offer annual updates at any
convenient level of spatial granularity, yet, using the assumptions made in this paper, they
identify only 63.4 % of the total number of adult residential movers when compared with the figures
from the 2011 Census. Clearly, the analysis reported here does not detect all adult movers, but our
results nevertheless break new ground by providing disaggregate annual updates that cannot be
gleaned from any other statistical source. The relative importance of individual name text string
matching, household composition and distance of move might also be changed to create versions of the
register that require higher matches for particular types of moves, such as those over long
distances, in applications where the risk of false positives can be managed. Further research,
drawing on the results of extending the run of the LCRs, should be conducted in order to examine the
degree to which estimates for the later years can be refined, and our own research is presently
seeking to extend the series to 2020 and beyond. Further research will also examine the degrees to
which our matching procedures should be relaxed or strengthened in order to represent the
residential mobility and migration behaviour of specific groups, such as students or the elderly.
FIGURE 9 (a) Origins of moves to Spitalfields and Banglatown Ward in East London; and (b)
destinations of outward moves, 1997–2016. In each case the innermost ring (in lightest hue)
identifies regions, the middle ring identifies constituent Districts and the outermost ring
identifies constituent Electoral Wards
FIGURE 10 (a) Origins of movers to Spitalfields and Banglatown Ward in East London by 2019 English
Index of Multiple Deprivation (IMD) deciles; and (b) destinations of outward movers by 2019 English
IMD deciles, 1997–2016. IMD deciles 1–10 range from the most to the least deprived IMD deciles
These initial findings suggest a number of other directions for future research. First, the changes
in data collection in recent years have potential implications for the generality of our migration
model and we continue to undertake sensitivity analyses in order to ascertain whether bias is
amplified or reduced over time. This work includes investigation of whether Zoopla rental listings
data for individual properties can be used to apportion change to the rental market. Second, more
sophisticated distance measures might be used in order to better represent the attenuating effect of
distance upon moves, using 2011 Census data to calibrate the matching procedure (cf. Stillwell &
Thomas, 2016). Third, processes of household formation and dissolution might be modelled in order to
fill the gaps in origin and destination data that become increasingly apparent in the later years of
the LCRs, and post-2016 updates might be used to assist in this process. Fourth, our own research is
investigating the identifiable correspondences between names and ethnicity, age and gender, as well
as related issues of household structure: we see this as a promising way of evaluating the accuracy
of our models and of extending their usefulness into a number of important domains. For example, the
materials developed here might be used to model changes in local incidence of multi-generation
households from ethnic minorities, viewed in the context of the Covid-19 pandemic.
TABLE 2 Origins and destinations of movers to and from Spitalfields and Banglatown Ward in East
London by 2019 English IMD deciles. Index of Multiple Deprivation (IMD) deciles 1–10 range from the
most to the least deprived

IMD Decile   Percentage of movers moving into the area   Percentage of movers moving out of the area
1            3.9                                         4.6
2            18.8                                        20.5
3            20.4                                        22.8
4            14.9                                        16.1
5            9.6                                         9.2
6            8.6                                         7.2
7            7.1                                         6.2
8            6.4                                         5.0
9            5.8                                         4.8
10           4.5                                         3.6
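The contrast between incomers and out-movers can be read off Table 2 by summing decile shares, as in the short Python sketch below. The percentages are those of Table 2; the grouping into the three most deprived (1–3) and three least deprived (8–10) deciles is an assumption made for illustration.

```python
# Table 2 percentages: decile -> (moving into the area, moving out of the area)
TABLE_2 = {
    1: (3.9, 4.6), 2: (18.8, 20.5), 3: (20.4, 22.8), 4: (14.9, 16.1),
    5: (9.6, 9.2), 6: (8.6, 7.2), 7: (7.1, 6.2), 8: (6.4, 5.0),
    9: (5.8, 4.8), 10: (4.5, 3.6),
}
INTO, OUT_OF = 0, 1  # column positions within each tuple


def decile_share(deciles, column):
    """Combined percentage of movers for a group of IMD deciles."""
    return round(sum(TABLE_2[d][column] for d in deciles), 1)


least_deprived = (8, 9, 10)  # assumed grouping
most_deprived = (1, 2, 3)    # assumed grouping

in_least = decile_share(least_deprived, INTO)
out_least = decile_share(least_deprived, OUT_OF)
in_most = decile_share(most_deprived, INTO)
out_most = decile_share(most_deprived, OUT_OF)
```

On this grouping, a larger share of incomers than out-movers is associated with the least deprived deciles, and a smaller share with the most deprived deciles, which is consistent with the pattern described for Spitalfields.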
The last of these challenges speaks to a shift in modelling focus from aggregations to the human
individual, extracting information from given and family names. Research using other data sources
such as consolidated name and date of birth data alongside published records of baby names has
modelled the age and sex characteristics of individuals (Lansley & Longley, 2016), while given and
family name pairings have also been associated with census data in order to predict ethnicity (Kandt
& Longley, 2018). This focus on prediction at the individual level offers the prospect of adding a
range of probable individual and household covariates that are otherwise available only from
censuses or (in more aggregate form) from health registers.
For now, the first full version of the LCR migration estimates (‘CDRC Migration Model 1.0’)
provides new opportunities to better understand residential mobility patterns and their associated
social implications. In developing data on residential movements in this paper we have restricted
ourselves to providing brief illustrations of potential applications through linkage to conventional
small area deprivation and geodemographic indicators. These illustrative examples are indicative
of the potential full range of geodemographic indicators that might be used alongside LCR
estimates. Subject to disclosure control procedures, the high spatial granularity of the LCRs makes
it possible to contribute to analysis framed using any convenient geography. The data source makes
possible comprehensive analysis of population flows and neighbourhood changes. In our own
future research, we propose to investigate residential moves within and between different niche
housing markets, in order to better understand the local, regional and national patterns of housing
market differentiation across the United Kingdom today. In this paper, we have focused only upon
identifying the timing and location of residential moves and have not addressed the ways in which
the LCRs enhance understanding of the nature and timing of household moves, or the characteristics
of households that remain sedentary over the life course. As such, we also propose to use
individuals’ names as tokens of identity that can be used to describe household structure as well
as residential mobility history, and thus to develop scale-free geographic representations of social
mobility outcomes.
ACKNOWLEDGEMENTS
This research was funded by the UK Engineering and Physical Sciences Research Council grant reference
EP/M023583/1 (funder of the Urban Dynamics Lab: www.ucl.ac.uk/urban-dynamics-lab/) and the UK
Economic and Social Research Council grant ES/L011840/1 (funder of the Consumer Data Research
Centre: www.cdrc.ac.uk/). The raw data and aggregated outputs from this project may be obtained on
successful application through the Consumer Data Research Centre Data Service (cdrc.ac.uk). The
Registers of Scotland data for this research were provided by the Urban Big Data Centre, Glasgow.
Census output data contain Crown copyright. Census Crown copyright material is reproduced with the
permission of the Controller of HMSO and the Queen's Printer for Scotland.
CODE DISCLOSURE
Our processing of the data falls under the public interest derogation for research under Article
89 of the General Data Protection Regulation. Although formed from proprietary component
data sources, the resulting LCRs are available for bona fide research purposes on successful
application by accredited safe researchers to the UK Economic and Social Research Council
Consumer Data Research Centre (cdrc.ac.uk). This enables access to the code that has been
used to link the individuals across different years. Aggregated data products which have been
run through disclosure controls will be made available to the research community and public
institutions to improve the availability of statistics for further research and end uses in
providing public services.
ORCID
Justin T. van Dijk https://orcid.org/0000-0001-5496-425X
Guy Lansley https://orcid.org/0000-0002-3406-178X
Paul A. Longley https://orcid.org/0000-0002-4727-6384
REFERENCES
Bell, M., Blake, M., Boyle, P., Duke-Williams, O., Rees, P., Stillwell, J. et al. (2002) Cross-national comparison of
internal migration: Issues and measures. Journal of the Royal Statistical Society: Series A (Statistics in Society),
165(3), 435–464. https://doi.org/10.1111/1467-985X.t01-1-00247
Champion, A.G. & Shuttleworth, I. (2017) Are people moving address less? An analysis of migration within England
and Wales, 1971–2011, by distance of move. Population, Space and Place, 23(3), e2026.
https://doi.org/10.1002/psp.2026
Clark, W.A. & Moore, E.G. (1980) Residential mobility and public policy. Beverly Hills: Sage Publications.
Electoral Commission. (2016) The December 2015 electoral registers in Great Britain: Accuracy and completeness
of the registers in Great Britain and the transition to Individual Electoral Registration. London: The Electoral
Commission. Available from:
https://www.electoralcommission.org.uk/sites/default/files/pdf_file/The-December-2015-electoral-registers-in-Great-Britain-REPORT.pdf
(accessed date September 26, 2019).
Electoral Commission. (2019) Accuracy and completeness of the 2018 electoral registers in Great Britain. London: The
Electoral Commission. Available from:
https://www.electoralcommission.org.uk/who-we-are-and-what-we-do/our-views-and-research/our-research/accuracy-and-completeness-electoral-registers/2019-report-accuracy-and-completeness-2018-electoral-registers-great-britain
(accessed date September 26, 2019).
Coulter, R., Van Ham, M. & Findlay, A.M. (2016) Re-thinking residential mobility: Linking lives through time and
space. Progress in Human Geography, 40, 352–374. https://doi.org/10.1177/0309132515575417
Fielding, A.J. (1992) Migration and social mobility: South East England as an escalator region. Regional Studies, 26(1),
1–15. https://doi.org/10.1080/00343409212331346741
Fielding, A.J. (2012) Migration in Britain, paradoxes of the present, prospects for the future. Cheltenham: Edward Elgar.
Finney, N. & Simpson, L. (2008) Internal migration and ethnic groups: Evidence for Britain from the 2001 Census.
Population, Space and Place, 14(2), 63–83. https://doi.org/10.1002/psp.481
Gale, C.G., Singleton, A.D., Bates, A.G. & Longley, P.A. (2016) Creating the 2011 area classification for output areas
(2011 OAC). Journal of Spatial Information Science, 12, 1–27. https://doi.org/10.5311/JOSIS.2016.12.232
Gale, D. & Shapley, L.S. (1962) College admissions and the stability of marriage. The American Mathematical Monthly,
69(1), 9–15.
Hamnett, C. (1991) The blind men and the elephant: The explanation of gentrification. Transactions of the Institute of
British Geographers, 16(2), 173–189.
Hand, D.J. (2018) Statistical challenges of administrative and transaction data. Journal of the Royal Statistical Society:
Series A (Statistics in Society), 181(3), 555–605. https://doi.org/10.1111/rssa.12315
Harvey, D. (1973) Social justice and the city. Athens: University of Georgia Press.
Hoinville, G. & Jowell, R. (1978) Survey research practice. London: Heinemann Educational Books.
House of Commons (2013) The Private Rented Sector. First Report of Session 2013–14. London: The Stationery Office
Limited.
Kandt, J. & Longley, P.A. (2018) Ethnicity estimation using family naming practices. PLoS One, 13(8), e0201774.
https://doi.org/10.1371/journal.pone.0201774
Kandt, J., Van Dijk, J.T. & Longley, P.A. (2020) Family names origins and inter-generational demographic change in
Great Britain. Annals of the American Association of Geographers, 110(6), 1726–1742.
Lan, T., Kandt, J. & Longley, P.A. (2019) Geographic scales of residential segregation in English cities. Urban
Geography, 41(1), 103–123. https://doi.org/10.1080/02723638.2019.1645554
Lansley, G., Li, W. & Longley, P.A. (2019) Creating a linked consumer register for granular demographic analysis.
Journal of the Royal Statistical Society: Series A (Statistics in Society), 182(4), 1587–1605.
https://doi.org/10.1111/rssa.12476
Lansley, G. & Longley, P.A. (2016) Deriving age and gender from forenames for consumer analytics. Journal of Retailing
and Consumer Services, 30, 271–278. https://doi.org/10.1016/j.jretconser.2016.02.007
Lee, E.S. (1966) A theory of migration. Demography, 3(1), 47–57.
Lomax, N., Norman, P., Rees, P. & Stillwell, J. (2013) Subnational migration in the United Kingdom: Producing a
consistent time series using a combination of available data and estimates. Journal of Population Research, 30,
265– 288. https://doi.org/10.1007/s1254 6- 013- 9115- z
Lomax, N. & Stillwell, J. (2017) United Kingdom: Temporal change in internal migration. In: Champion, T., Cooke,
T. & Shuttleworth, I. (Eds.) Internal migration in the Developed World: Are we becoming less mobile?. London:
Routledge, pp. 120– 146.
Lomax, N., Stillwell, J., Norman, P. & Rees, P. (2014) Internal migration in the United Kingdom: Analysis of an es-
timated inter- district time series, 2001– 2011. Applied Spatial Analysis, 7, 25– 45. https://doi.org/10.1007/s1206
1- 013- 9098- 3
Longley, P.A., Cheshire, J. & Singleton, A. (2018) Consumer Data Research. London: UCL Press.
ONS. (2015). 2011 Census: Flow Data. [data collection]. UK Data Service. SN: 7713- 2011 Special Migration Statistics
- OA Level. https://doi.org/10.5255/UKDA- SN- 7713- 1
ONS. (2014) 2011 Census: Origin- destination statistics on migration for local authorities in the United Kingdom and
on workplace for Output Areas and workplace zones, England and Wales. Available from: https://www.nomis
web.co.uk/censu s/2011/all_table s?relea se=UK.1 (accessed date December 18, 19). Office for National Statistics.
ONS. (2019a). Mid- year population estimates quality and methodology information. Available from: https://www.ons.
gov.uk/peopl epopu latio nandc ommun ity/popul ation andmi grati on/popul ation estim ates/metho dolog ies/annua lmidy
earpo pulat iones timat esqmi (accessed July 07, 2019). Office for National Statistics.
ONS. (2019b). Local area migration indicators, UK. Available from: https://www.ONS.gov.uk/peopl epopu latio nandc
ommun ity/popul ation andmi grati on/migra tionw ithin theuk/ datas ets/local aream igrat ionin dicat orsun itedk ingdom
(accessed date December 12, 2019). Office for National Statistics.
ONS. (2020a). Population estimates for the UK, mid- 2019: methods guide. Available from: https://www.ons.gov.uk/
peopl epopu latio nandc ommun ity/popul ation andmi grati on/popul ation estim ates/metho dolog ies/metho dolog yguid
eform id201 5ukpo pulat iones timat eseng landa ndwal esjun e2016 #appen dix- 4- histo ric- geogr aphy- chang es- 2009-
to- 2019 (accessed date September 09, 2020). Office for National Statistics.
ONS. (2020b). Internal migration: detailed estimates by origin and destination local authorities, age and sex. Available
from: https://www.ons.gov.uk/peopl epopu latio nandc ommun ity/popul ation andmi grati on/migra tionw ithin theuk/
datas ets/inter nalmi grati onbyo rigin andde stina tionl ocala uthor ities sexan dsing leyea rofag edeta ilede stima tesda taset
(accessed date September 14, 2020). Office for National Statistics.
Park, R.E. & Burgess, E.W. (1925) The city: Suggestions for investigation of human behaviour in the urban environ-
ment. Chicago: The University of Chicago Press.
Rabe, B. & Taylor, M. (2010) Residential mobility, quality of neighbourhood and life course events.
Journal of the Royal Statistical Society: Series A (Statistics in Society), 173(3), 531– 555. https://doi.
org/10.1111/j.1467- 985X.2009.00626.x
Ravenstein, E.G. (1885) The laws of migration. Journal of the Statistical Society of London, 48(2), 167– 235.
Rees, P., Stillwell, J., Convey, A. & Kupiszewski, M. (1996) Population migration in the European Union. Chichester:
John Wiley and Sons Ltd.
Royal Mail. (2020) PAF. Available from: https://www.power edbyp af.com/produ ct/paf/ (accessed date September 09,
2020). Royal Mail.
Smith, D.P. & Denholm, J. (2006) Studentification: A guide to challenges, opportunities and practice. London:
Universities UK.
Smith, D.P., Finney, N., Halfacree, K. & Walford, N. (2015) Internal migration: Geographical perspectives and processes.
Farnham: Ashgate Limited.
Stanier, A. (1990) How accurate is Soundex matching. Computers in Genealogy, 3(7), 286–288.
Stapleton, C.M. (1980) Reformulation of the family life-cycle concept: Implications for residential mobility.
Environment and Planning A, 12(10), 1103–1118.
Stillwell, J. & Thomas, M. (2016) How far do internal migrants really move? Demonstrating a new method for the
estimation of intra-zonal distance. Regional Studies, Regional Science, 3(1), 28–47.
https://doi.org/10.1080/21681376.2015.1109473
Ordnance Survey. (2020). Address data. Available from:
https://www.ordnancesurvey.co.uk/business-government/address-data?gclid=Cj0KCQjw-uH6BRDQARIsAI3I-UdeqEO4wLCL2CzNWbHTq-A5QWhSCTSLiEtoJ_CYvOgNiIGD9qKTuRUaArFPEALw_wcB
(accessed date September 09, 2020). Ordnance Survey.
Timms, D. (1975). The urban mosaic: Towards a theory of residential differentiation. Cambridge Geographical Studies
No. 2. London: Cambridge University Press.
Travers, T., Tunstall, R., Whitehead, C. & Pruvot, S. (2007) Population mobility and service provision: A report for
London Councils. London: London School of Economics.
Tyrrell, N. & Kraftl, P. (2015) Lifecourse and internal migration. In: Smith, D.P., Finney, N., Halfacree, K. & Walford,
N. (Eds.) Internal migration: Geographical perspectives and processes. Farnham: Ashgate Limited.
How to cite this article: Van Dijk, J., Lansley, G. & Longley, P.A. (2021) Using linked
consumer registers to estimate residential moves in the United Kingdom. Journal of the Royal
Statistical Society: Series A (Statistics in Society), 00, 1–23. https://doi.org/10.1111/rssa.12713