
What is… Iterative Proportional Fitting?

Authors:
Nik Lomax
School of Geography, University of Leeds
British Society for Population Studies Annual Conference, Cardiff
9 September 2019
Iterative Proportional Fitting (IPF) is…
A technique for reweighting a known multidimensional array (e.g.
cross-tabulated data) to target marginal totals
Used by demographers, transport planners, economists and computer
scientists
Can be done in a wide range of software, from Excel to bespoke
packages
You might know it by another name
RAS in economics (see Bacharach 1965)
Cross-Fratar (Fratar 1954) or Furness (Furness 1965) in transport
engineering
Raking in computer science and statistics (Cohen 2008)
IPF has also been referred to as rim-weighting or
structure-preserving estimation (Simpson and Tranmer 2005).
Simply a way of reweighting a distribution
Lomax, N., Norman, P., Rees, P. et al. (2013) Subnational migration in the United Kingdom: producing a
consistent time series using a combination of available data and estimates, J Pop Research, 30: 265.
https://doi.org/10.1007/s12546-013-9115-z
Background
First (demographic) use of IPF widely attributed to Deming and
Stephan (1940), who applied the technique to data from the 1940
U.S. census
Although there were complete counts of the population for certain
characteristics, when these characteristics were cross-tabulated the
output was limited to a sample of the population.
They used this sample as the starting distribution (the seeds) and
applied IPF to derive an estimate of these cross-tabulated
characteristics for the whole population.
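The seed-and-margins idea described above can be sketched in a few lines of Python. This is a minimal two-dimensional illustration, not the procedure used by Deming and Stephan or by any particular package: the function name ipf_2d, the seed values and the target totals are all assumptions made up for the example.

```python
def ipf_2d(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Rescale a 2-D seed table so its row and column sums match the targets.

    Hypothetical helper for illustration; assumes the row targets and the
    column targets add up to the same grand total.
    """
    table = [[float(cell) for cell in row] for row in seed]
    for _ in range(max_iter):
        # Step 1: scale each row so it sums to its target row total.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            if row_sum > 0:
                factor = target / row_sum
                table[i] = [cell * factor for cell in table[i]]
        # Step 2: scale each column so it sums to its target column total.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            if col_sum > 0:
                factor = target / col_sum
                for row in table:
                    row[j] *= factor
        # Columns now match; stop once the rows also match within tolerance.
        worst = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if worst < tol:
            break
    return table


# Illustrative seed (e.g. a sample cross-tabulation) and known margins.
seed = [[1, 2],
        [3, 4]]
fitted = ipf_2d(seed, row_targets=[40, 60], col_targets=[35, 65])
for row in fitted:
    print([round(cell, 2) for cell in row])
```

The fitted table reproduces both sets of margins while keeping the cross-product (odds) ratio of the seed, which is why the method is sometimes described as structure-preserving.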
Some examples of IPF in action
To estimate the characteristics of residents of small geographical
areas (Birkin and Clarke 1988)
To update the age and sex structure of small area populations in the
UK (Rees 1994)
To estimate small area population counts of car ownership and tenure
type using 1991 Census data (Simpson and Tranmer 2005)
To disaggregate migration data by age and sex (Willekens, Por, and
Raquillet 1981; Willekens 1982)
To estimate missing cross-border migration data for the United
Kingdom (Lomax et al. 2013)
Example: estimating UK migration
Extension to multiple dimensions: Age - Ethnicity - Health
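The same cycle of adjustments carries over to three or more dimensions. The sketch below is one possible way to write it in Python, assuming only one-dimensional target totals for each dimension. The function name ipf_nd, the category labels and all numbers are hypothetical and are not taken from the presentation.

```python
from itertools import product


def ipf_nd(seed, margins, tol=1e-9, max_iter=1000):
    """Fit an N-dimensional table to one-dimensional margins.

    seed:    dict mapping a tuple of categories (one per dimension) to a weight
    margins: list with one dict per dimension, mapping category -> target total
    Hypothetical helper for illustration; assumes all margins share one total.
    """
    table = {cell: float(weight) for cell, weight in seed.items()}
    for _ in range(max_iter):
        worst = 0.0
        for dim, targets in enumerate(margins):
            # Current totals along this dimension.
            sums = {category: 0.0 for category in targets}
            for cell, weight in table.items():
                sums[cell[dim]] += weight
            # Track the largest gap between current and target totals.
            for category, target in targets.items():
                worst = max(worst, abs(sums[category] - target))
            # Scale every cell by the ratio target / current for its category.
            for cell in table:
                current = sums[cell[dim]]
                if current > 0:
                    table[cell] *= targets[cell[dim]] / current
        if worst < tol:
            break
    return table


# Made-up categories and totals for an age x ethnicity x health table.
age = {"young": 60, "old": 40}
ethnicity = {"A": 70, "B": 30}
health = {"good": 80, "poor": 20}

# A uniform seed carries no prior information about interactions.
seed = {cell: 1.0 for cell in product(age, ethnicity, health)}
fitted = ipf_nd(seed, [age, ethnicity, health])
```

With a uniform seed the result is simply the independence table implied by the margins. In practice the seed would usually come from a sample or an earlier cross-tabulation so that its interaction structure is carried into the estimate.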
Software solutions
Modules and user-produced syntax are available for Excel, SAS,
Matlab, Stata, and SPSS.
I like the mipfp package in R
References
Bacharach, M. 1965. Estimating nonnegative matrices from marginal data. International Economic Review, 6(3): 294-310.
Birkin, M., and M. Clarke. 1988. SYNTHESIS - a synthetic spatial information system for urban and regional analysis: Methods and
examples. Environment and Planning A, 20(12): 1645-71.
Cohen, M. 2008. Raking. In Encyclopedia of survey research methods, ed. P. Lavrakas, 672-74. Thousand Oaks, CA: Sage.
Fratar, T. J. 1954. Vehicular trip distribution by successive approximations. Traffic Quarterly, 8(1): 53-65.
Furness, K. P. 1965. Time function iteration. Traffic Engineering and Control, 7(7): 458-60.
Lomax, N., Norman, P., Rees, P. et al. 2013. Subnational migration in the United Kingdom: producing a consistent time series using
a combination of available data and estimates, J Pop Research, 30: 265. https://doi.org/10.1007/s12546-013-9115-z
Lomax, N., and P. Norman. 2016. Estimating Population Attribute Values in a Table: "Get Me Started in" Iterative Proportional Fitting,
The Professional Geographer, 68(3): 451-461, DOI: 10.1080/00330124.2015.1099
Rees, P. 1994. Estimating and projecting the populations of urban communities. Environment and Planning A, 26: 1671-97.
Simpson, L., and M. Tranmer. 2005. Combining sample and census data in small area estimates: Iterative proportional fitting with
standard software. The Professional Geographer, 57(2): 222-34.
Willekens, F. 1982. Multidimensional population analysis with incomplete data. In Multidimensional mathematical demography,
ed. K. Land and A. Rogers, 43-111. New York: Academic.
Willekens, F., A. Por, and R. Raquillet. 1981. Entropy, multiproportional, and quadratic techniques for inferring patterns of
migration from aggregate data. In Advances in multiregional demography, ed. A. Rogers, 84-106. Laxenburg, Austria: International
Institute for Applied Systems Analysis.