J Econ Inequal
DOI 10.1007/s10888-015-9298-y
The Luxembourg Income Study
Martin Ravallion
Received: 4 February 2015 / Accepted: 16 February 2015
© Springer Science+Business Media New York 2015
Abstract The Luxembourg Income Study (now known as LIS) provides public access for
research purposes to harmonized unit-record data sets for multiple countries, in addition to
providing summary statistics from those data, including poverty and inequality measures.
LIS is a well-managed and undeniably important global public institution for research on
inequality and social policy in rich countries. However, LIS’s eligibility criteria, country
coverage, timeliness and some of its measurement practices limit its usefulness for many
other purposes. The paper identifies a number of issues that would need to be addressed by
a truly global micro-data base for studying poverty and inequality.
Keywords Income · Poverty · Inequality · Household surveys · Data access
1 Introduction
The Luxembourg Income Study (officially renamed “LIS” in 2011) is a non-profit organi-
zation, founded in 1983. LIS’s primary aim is to facilitate the creation of, and access to,
“harmonized” data files at unit-record level on household incomes, employment and other
characteristics for multiple countries. LIS started with a small set of rich countries but has
since expanded to include some middle-income developing countries.
Over 20 years ago I reviewed Smeeding et al. (1990), the first published volume from
LIS (Ravallion 1992). My review began by agreeing with Tony Atkinson’s (1990, p.xvii)
comment (in the opening sentence of his introduction to the Smeeding et al. volume) that
LIS is “. . . one of the most exciting developments in applied economic research in recent
years.” This paper asks whether LIS is still exciting at 30 years of age.
M. Ravallion
Department of Economics, Georgetown University, Washington, DC 20057, USA
The paper begins with an overview of LIS and its usage. It then looks at the coverage,
timeliness, accessibility and quality of LIS.1 In keeping with the topic of this special issue,
the focus will be on LIS’s use for international comparisons of income distributions. Other
applications, such as in studying labor supply, are not explicitly considered, although some
of the material here will still be relevant to those applications.
2 An overview of LIS
The idea for LIS emerged at an international conference on poverty in 1982 when a number
of researchers using household-level data realized that there were gains from pooling their
knowledge in the interest of making better cross-national comparisons (Smeeding et al.
1985). The founding director was Tim Smeeding, who has been closely associated with the
project since its inception.2 Janet Gornick has been the director since 2006.
LIS’s main office is in Luxembourg, with a satellite in New York, at CUNY. The website
lists 18 staff, 12 in the Luxembourg office (though many are jointly funded with non-LIS
functions). LIS is not only a data provider but also an independent research center, doing
original research on LIS data.
The current budgeted cost of LIS’s core data and research activities appears to be about
€1 million per year.3 The largest donor is the Luxembourg Government; additional funds
come from about 25 governments and institutions, all in the rich world. Aside from the
Luxembourg Government, there appears to be little or no official donor commitment to
long-term funding of LIS. Funds are renewed on an ad hoc basis. (There is limited cost
recovery from users.) One would not normally expect the overall level of the voluntary
contributions attained this way to be sufficient to assure that a public good such as LIS is
supplied at what could reasonably be considered the optimal level. In my judgment, some
(but not all) of the deficiencies identified by this review are consistent with the view that
LIS is indeed underfunded.
2.1 How is LIS different to other data sources?
The LIS output closest to the various databases reviewed in this special issue is LIS’s “Key
Figures.” This provides summary tables on selected country-level poverty and inequality
measures, such as the Gini index. The summary statistics in Key Figures are calculated by
LIS staff from the LIS micro data, which cover 40 countries at the time of writing.4 Of
course, there are other places to obtain such summary statistics. The World Income Inequal-
ity Database (WIID) of the World Institute for Development Economics Research (WIDER)
gives 5,300 Gini indices for 145 countries. It is known that the comparability of the numbers
1 The LIS office also supports the Luxembourg Wealth Study (LWS) which provides micro data on assets and
debts. There are currently 12 countries in LWS, all “rich countries.” LWS is not covered by this review.
2 Others closely involved in creating LIS include Serge Allegrezza, Marc Cigrang, John Coder, Robert
Erikson, Lee Rainwater (LIS’s first research director) and Gaston Schaber.
3 For the Luxembourg office, this is expected to be €1.3 million in 2014, of which 60 % is for the core LIS
activities with which this paper is concerned. (The rest comprise various self-financing activities such as
training workshops.) A similar accounting of the cost of the CUNY office is not available but Janet Gornick
thought an extra 20 % would be reasonable. This brings the total for the core activities to about €940,000.
4 This is the number of countries listed on the LIS website early October 2013, under “LIS Database by
in such compilations is questionable (see, for example, Atkinson and Brandolini 2001). The
World Bank’s PovcalNet is the source of the summary measures on poverty and inequality
found in the Bank’s World Development Indicators (WDI), as in, for example, World
Bank (2013). PovcalNet provides both fixed and some user-driven calculations of poverty
and inequality measures for “developing countries,” defined as Part 2 member countries
of the World Bank (at the time the dataset began around 1990).5 There are pros and cons
of each source. In terms of coverage, while WIID is the largest (by far), it is probably the
least methodologically consistent internally, while LIS is the smallest but most consistent.
PovcalNet and the WDI are somewhere between the two.
However, the distinctive feature and value-added of LIS is not Key Figures but rather
the access it provides to a set of harmonized micro data files derived from survey data sets
supplied by participating (governmental and non-governmental) statistical agencies at the
country level. To preserve confidentiality, access to the micro data is remote, meaning that
program code is sent to LIS rather than being run by the user directly on the data.
While LIS was way ahead of its time in the 1980s, today there are a number of similar
products, though serving somewhat different needs. The University of Michigan’s ICPSR
provides a valuable archive of deposited data sets used in past research studies, mostly
academic. These include numerous survey data sets, though they tend to be more specialized
surveys, and there is no attempt to have broad country coverage or to harmonize data. The
University of Minnesota’s IPUMS-I provides access to Census data at the micro level for
74 countries. The World Bank’s Comparative Living Standards Project (CLSP) provides
a harmonized set of files for the surveys for developing countries done under the Living
Standards Measurement Study (LSMS). The FAO’s Rural Income Generating Activities
(RIGA) database provides access to income and selected other constructed variables in unit-
record form for 19 developing countries. There are also some regional initiatives such as
SEDLAC, which provides summary statistics on poverty and inequality across 24 countries
in Latin America.
LIS is probably more standardized than any other library of micro data sets, with the
possible exception of CLSP. Of course, there are limits to the harmonization. For example,
some surveys do not ask for income taxes paid or social security contributions since respon-
dents quite often do not know these things (and, in any case, net income is considered the
better welfare indicator). Harmonization is also costly and naturally there is a trade-off with
limited resources. Other compilations of data have done less (some far less) than LIS to
standardize the data, but have attained greater (often much greater) coverage of countries
and over time. Where the data provider chooses to be in this trade-off depends (of course)
on the objective of the data tool.
Many users will not appreciate how much work is required to create properly documented
and reasonably harmonized micro data from the diverse source files. Harmonization
requires common units and common definitions, although there are constraints
to the latter, such that the harmonization is never complete or even well-defined. There are
naturally limits to harmonization in the LIS model. In the short-term, LIS must of course
take the survey as given. However, LIS does provide feedback to the various source agen-
cies on any problems it encounters and LIS is directly involved in various internal forums
5 In the interests of full disclosure I co-founded (with Shaohua Chen) PovcalNet while working in the World
Bank’s research department. The data tool was devised to allow replication of the World Bank’s global
poverty measures, and to allow users to test their sensitivity to alternative assumptions, and to support other
poverty monitoring efforts.
on data standards.6 This is nonetheless a somewhat different model to the LSMS and CLSP.
The LSMS entails facilitating and implementing surveys on the ground in the countries
concerned. This difference stems from the fact that the LSMS has a mandate for statisti-
cal capacity building at country level. LSMS surveys use a similar format across countries,
though it is unclear that one could reasonably call them “harmonized.” In practice there is
variation across the surveys, stemming from the fact that staff work with local counterparts
to modify the survey instrument to fit local needs. There is a trade-off between harmoniza-
tion across countries in the data collection stage and the relevance of the survey to local
conditions, including local policy concerns.
2.2 Usage and influence of LIS
LIS appears to have a loyal core group of around 200 regular users, year-after-year, with
varying numbers of new users joining each year and others leaving.7 Over 2007–10, the
total count of users with new or renewed registrations in the current year (what LIS calls
“active users” although it is unclear just how active they are) was a reasonably steady 400
per annum. Over 2010–12, this rose to over 1,000 active users.8 This sharp rise in registrants
came mainly from new users, and could well reflect interest in the social impacts of
the Great Recession and the Euro crisis. The number of processed jobs (specific task submissions
to LIS) averaged 51,000 per year over 2007–12. This is roughly double the number
in 2002 (26,000), although the count did not show any net increase over 2010–12.9 Users
come mainly (91 %) from LIS’s contributing countries. So this tool appears to be mainly of
interest to “rich world” researchers, which will not be surprising when we look more closely
at coverage in Section 3.
LIS research outputs can be tracked using LIS’s own working paper series (WPS) since
researchers using LIS data must publish their results in this series (while not preventing
other publication outlets).10 There are over 600 Working Papers on the LIS WPS site at the
time of writing, with 565 different authors. (Tim Smeeding is clearly the most prolific of
the LIS authors, with 82 LIS working papers having him as an author or co-author). The
main topics of the LIS papers have been the measurement of poverty and inequality, and the
incidence of social spending, often from a comparative perspective. Oddly, the search tool
on the LIS’s WPS website does not allow one to enter a personal search request (one is con-
fined to pre-set “key words.”) So I created my own file of all the abstracts of LIS WPS and
searched on that file. In terms of frequency, the top 30 substantive words in the abstracts are
(with word counts):11 income (336), countries (318), poverty (297), inequality (277), social
(259), differences (226), States (199), data (187), distribution (158), policy (150), national
(177), welfare (136), family (130), household (124), comparative (119), effects (114), work
6 This includes the Canberra Group meetings in the 1990s, and various OECD expert groups and meetings at
the United Nations Statistical Commission.
7 Over 2007–12, the mean count of continuing users (active users less new users) was 242 with a standard
error of only 7.2.
8 The information in this paragraph is drawn from a presentation made to the LIS Board in 2013 by Thierry
Kruten of LIS, which was kindly passed onto me by Janet Gornick.
9 This is puzzling. On the presumption that the new users tend to be less able programmers one might expect
them to need more, not fewer, runs. Possibly the newly-attracted registrants were just not as active.
10 The count includes various series, the one labeled “LIS” being the largest with 597 papers at the time of
11 I ignore some words for obvious reasons, like “paper,” “study,” “using” and “LIS.”
(109), women (97), children (94), measures (91), Luxembourg (88), cross (85), rates (82),
redistribution (79), changes (77), analysis (75), employment (72), age (70), transfers (67)
and earnings (60).
A second source of data on LIS’s influence on research is the set of journal articles men-
tioning LIS. An “all text” search for “Luxembourg Income Study” in EconLit identifies 527
articles in academic journals.12 The first was in 1987, although this was only an announce-
ment about LIS, in the Review of Income and Wealth (RIW). The first LIS paper published
in an academic journal was Buhmann et al. (1988), by LIS staff and also published in RIW,
which has since published 120 papers that refer to LIS, more than any other journal.
Figure 1 plots the counts of papers, for both WPS and articles in EconLit mentioning
LIS.13 The time profiles make sense, given the lags between a WP and the final journal
article. Both series show signs of a recent decline in output although this is not as easily
identified using the count of journal articles, possibly reflecting publication lags.14 It is
puzzling that the decline in research output has not come with declining usage of the
data. Possibly usage has shifted from publishing researchers to others, such as users in
governments or international agencies.
LIS has found applications in research across the political and social sciences, not just
economics; for example, in 2005 LIS won an award from the American Political Science
Association for its contribution to the field of comparative politics. And the applications
have not been confined to academic research. So in assessing LIS usage it is important to
cast a wider net than is possible with EconLit or scanning the more academic WPs.
Objective indicators with a broad coverage can be found in two Google tools. The first
is the Google Ngram Viewer, which counts the incidence of any word or phrase in digitized
writings on the internet. This shows a steady rise in the incidence of references to “Luxem-
bourg Income Study” in digitized books from 1983 until about 2000.15 There has been a
leveling off since then, though the incidence has stayed fairly firm. Second, a search for ref-
erences to “Luxembourg Income Study” in Google Scholar (GS) delivers a count of 8,700,
of which 349 are found in material on the internet dated in LIS’s first 10 years (1983–1992),
2,420 are in the next 10 years (1993–2002), with 5,740 since 2003.16 So citations to LIS
have continued to accumulate, with no sign of any recent decline; indeed, two-thirds of the
citations are from internet documents in the last 10 years. Of course, such a rise in citation
counts partly reflects the growth of digitized content with the expansion of the internet.17
12 Entering just “LIS” retrieves a large amount of irrelevant content. However, it appears to be rare to refer to
LIS without at some point using the phrase “Luxembourg Income Study.”
13 At the time of writing the EconLit journal entries only go up to June 2013; the data entry for 2013 in the
Figure is set at three times the recorded count for that year.
14 The Figure also gives a non-parametric kernel regression. Using instead a parametric regression of the
WPS count on year and year-squared both coefficients are significant at the 0.01 % level. For the journal
article count both coefficients are significant at the 3 % level. (The zeros for the first three years were
dropped.) Dropping the observation for 2013, the coefficients are only significant at the 6 % level.
15 By “incidence” I mean references to “Luxembourg Income Study” as a % of all words in Google Books
that year.
16 Google Scholar casts a wider net than other bibliographic tools that are confined to more academic working
papers and journal articles. Using “LIS” instead does not adequately identify the relevant citations.
17 For example, about the same share of the total number of references in Google Scholar to “living standards
measurement” and “poverty measurement” are since 2003.
Fig. 1 Counts of LIS working papers and journal articles mentioning LIS by year (series: LIS Working Papers; journal articles in EconLit; horizontal axis: year, 1984 to 2012)
3 Coverage, frequency and timeliness
The 40 countries in LIS at the time of writing represent 62 % of the world’s population.
They are by no means a random sample of the world. This can be seen from Fig. 2, which
gives the probabilities of being included in LIS and PovcalNet. Inclusion is plotted against
log GNI per capita, for all 175 countries in the world for which GNI is available in the
WDI. The Figure also gives non-parametric regression functions, interpretable as the mean
probability of inclusion at a given income level. While there is a clear “rich-country bias”
in LIS, it is the opposite for PovcalNet, reflecting the World Bank’s mandate of fighting
absolute poverty in the developing world. However, it remains that 70 % of countries are
in the region of common support (the interval of GNI which contains representatives from
Thirty years ago one might have explained the rich-country bias of LIS evident in
Fig. 2 on the grounds that survey data were not readily available for poor countries. But that
is clearly no longer true, as is also evident from Fig. 2. At the time of writing, PovcalNet
includes distributional data on consumption or income from 850 national household surveys
for 127 developing countries. Granted, the micro data are not publicly available for many
of these countries, though nor are they for all current LIS countries (which is why they
have the confidentiality restrictions that determine their mode of data access). And while
data quality is uneven, that is clearly also true amongst current LIS countries. While one
still hears rather poorly informed claims about the “low quality” data in developing coun-
tries, I would contend that the majority of the national household surveys done by statistical
offices in the developing world today meet the standards expected of surveys done in the
rich world. Poor data quality in developing countries is now a lame excuse for excluding
those countries.
18 Outliers exaggerate this somewhat; if one eliminates the top two GNIs in the PovcalNet countries then the
region of common support drops to an interval of GNI per capita containing 52 % of the world’s countries.
Fig. 2 Probability of being included in LIS and PovcalNet (series: PovcalNet, LIS; horizontal axis: gross national income per capita, 2010, $PPP, log scale; vertical axis: probability of being included in the dataset; second panel: density of log GNI per capita)
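As a rough illustration of the kind of calculation that lies behind a figure such as Fig. 2, the sketch below estimates the probability of inclusion as a kernel-weighted mean of a 0/1 inclusion indicator at each income level, and then computes a region of common support. It is a sketch under stated assumptions, not the procedure used for the Figure: the Nadaraya-Watson estimator, the Gaussian kernel, the bandwidth, the function names and the simulated "archives" A and B are all assumptions introduced here, and no LIS, PovcalNet or WDI data are used.

```python
import numpy as np

def kernel_mean(x, y, grid, bandwidth):
    """Nadaraya-Watson estimate of E[y | x] at each grid point (Gaussian kernel)."""
    # Rows index grid points, columns index observations.
    z = (grid[:, None] - x[None, :]) / bandwidth
    w = np.exp(-0.5 * z ** 2)
    return (w * y[None, :]).sum(axis=1) / w.sum(axis=1)

def common_support(x, in_a, in_b):
    """Interval of x that contains members of both datasets."""
    lo = max(x[in_a].min(), x[in_b].min())
    hi = min(x[in_a].max(), x[in_b].max())
    return lo, hi

# Simulated, illustrative data only: 175 hypothetical countries.
rng = np.random.default_rng(0)
log_gni = rng.uniform(6.0, 11.5, size=175)
# Hypothetical inclusion rules: archive A favours richer countries,
# archive B favours poorer ones.
in_a = rng.random(175) < 1.0 / (1.0 + np.exp(-2.0 * (log_gni - 9.5)))
in_b = rng.random(175) < 1.0 / (1.0 + np.exp(2.0 * (log_gni - 9.5)))

grid = np.linspace(6.0, 11.5, 50)
p_a = kernel_mean(log_gni, in_a.astype(float), grid, bandwidth=0.5)
p_b = kernel_mean(log_gni, in_b.astype(float), grid, bandwidth=0.5)

# Share of all countries whose income lies in the region of common support.
lo, hi = common_support(log_gni, in_a, in_b)
share = np.mean((log_gni >= lo) & (log_gni <= hi))
print(f"common support: [{lo:.2f}, {hi:.2f}], share of countries: {share:.0%}")
```

Because the common-support interval is defined by minima and maxima, it is sensitive to a few extreme observations; dropping the highest incomes in one archive before calling `common_support` is a simple way to check that sensitivity.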
LIS plans to expand coverage further in the coming years, but one would have to say
that progress has been slow so far. Ten years ago, LIS included 130 surveys for 29 countries
(Smeeding 2004). At the time of writing it has 211 surveys from 40 countries. However, by
my count only 27 surveys from 2002 onwards are for developing countries as defined by
PovcalNet. Yet over the same period, the coverage of PovcalNet went from 450 surveys for
97 countries to 850 surveys for 127 countries. So it is clear that only a small proportion –
less than 10 % – of the extra surveys for developing countries that became available over this
period made it into LIS. Some of the expansion in survey availability was for “low-income
countries” which LIS has never aimed to represent, although that is a choice LIS has made,
not something imposed on LIS. Today’s world has a continuum of countries at different
income levels, and it is no longer very meaningful to make a sharp distinction between “low-”
and “middle-income” countries, or even “middle-” and “high-income.” The separation of
“rich-country” databases from “poor-country” databases is hard to defend today.
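One way to arrive at a figure below 10 % from the numbers quoted above is sketched here. It assumes that the increase in PovcalNet's survey count is a reasonable proxy for the extra developing-country surveys that became available, and that all 27 LIS surveys counted from 2002 onwards belong in the numerator; both are simplifications, and the variable names are purely illustrative.

```python
# Survey counts quoted in the text.
povcalnet_before, povcalnet_after = 450, 850
lis_developing_since_2002 = 27

# Extra developing-country surveys, proxied by PovcalNet's growth (an assumption).
extra_surveys = povcalnet_after - povcalnet_before
share_in_lis = lis_developing_since_2002 / extra_surveys

print(f"{extra_surveys} extra surveys; {share_in_lis:.2%} reached LIS")
# prints: 400 extra surveys; 6.75% reached LIS
```

On this reading roughly 7 % of the additional surveys entered LIS, which is consistent with the bound stated in the text.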
Most LIS countries now have multiple survey rounds going back to the early 1980s. But
this too has an economic gradient, as seen in Fig. 3.19 Richer countries have more surveys
in LIS, reflecting its origins as a “rich-world” data archive. There is a mean of 4.2 surveys
per country (median of 4). Italy has the most, at 11.
Over half (60 %) of LIS’s global population coverage is accounted for by the two
most populous countries, China and India. However, any knowledgeable user interested in
those two countries will be disappointed by LIS’s survey coverage. Both countries have
only one survey. A closer scrutiny of why this is so points to some of the problems LIS faces.
The China survey is not from the governmental statistics office – the National Bureau of
Statistics (NBS); alas, NBS micro data sets are deemed to be a State secret, which restricts
learning within and outside China. This is unfortunate and arguably inconsistent with “the
intellectual approach of seeking truth from facts” (Du Runsheng 2006, p.2) that has been
prominent since the reform path started by Deng Xiaoping in 1978. By my understanding
19 The figure only counts surveys with a live link.
Fig. 3 Number of surveys in LIS by country (horizontal axis: gross national income per capita, 2010, $PPP, log scale; vertical axis: number of live links to surveys in LISSY)
of the law, the only way a “LIS-type” solution using remote access could work is that the
micro data continued to sit on NBS’s servers, but the software allows users to have remote
access without being able to download the micro data. This is technically feasible.
Similarly to China, the one survey for India, for 2004, is not the Government’s National
Sample Survey (NSS) data, now in its 68th round with about 20 surveys since 1970. The
NSS is one of the oldest and most respected surveys in the developing world. The most
recent micro data have just become available, for the 68th round for 2010–11. Unlike China,
these data have been public access since the mid-1990s (with some user fees). The NSS
surveys are clearly not in LIS because they rely on collecting data on consumption, rather
than income.
Timeliness of data is important for most users. The most recent micro data sets for a
number of countries are in the early to mid-2000s. Averaging across all 40 countries in LIS
at the time of writing, the mean year of the latest survey is 2006; the median is 2005 and
the range is 1997 to 2010. Figure 4 gives the histogram of the latest year of the surveys in
LIS. Bi-modality is evident. There is one group of 14 countries with survey data for 2010,
which is reasonably recent by current international standards.20 Then there is a lower mode
at 2004, with a sizeable spread around this mode. However, many countries do have more
recent surveys than found in LIS.21
These lags will clearly put off many users when they know that more recent surveys are
available elsewhere for many of these countries. Given that other data compilations draw on
more recent data for overlapping countries, LIS’s own processing needs are clearly part of
the reason. While the harmonization process takes time, some potential users will no doubt
20 The 14 countries are Colombia, Germany, Greece, Ireland, Israel, Italy, Luxembourg, Russia, Slovak
Republic, Slovenia, South Africa, Spain, the United Kingdom and the United States.
21 I can verify this for Brazil, China, Hungary, India, Mexico, Peru, Poland and Romania; there may be other
surveys I am unaware of. And in all of these cases the “LIS lag” exceeds two years; the largest is 14 years,
for Romania.
Fig. 4 Histogram of the year of the latest survey in LIS (horizontal axis: year of latest survey, 1997 to 2011)
wonder why the raw data are not put up in advance, also recognizing that some users will not
need the LISSified files.22
4 Accessing LIS data
Users of LIS do not have direct access to the micro data. The aim is that users get the
summary tables they want from the micro data but do not actually have any contact with
the micro data. This is done to respect confidentiality agreements with the governmental
statistics offices or other agencies from which LIS obtained the primary data. For some
countries the micro data are already public; it is not clear why LIS does not provide users
with the harmonized unit-record data in such cases. The relevant data tool in LIS is called LISSY.
One must register to obtain access to LISSY but this is easy. For students (anywhere)
and non-student researchers in certain countries, access to the micro data is free. Whether
non-students are liable to pay a fee depends on where they live. If your country of residence
(or institution) has registered for LIS and paid the required fees then you have free access.
At the time of writing there are only 19 countries for which both student and non-student
researchers have free access.23 Users from designated “high-income countries” pay €500
per quarter (or €2000 per year), while those from designated “middle-income countries”
and “low-income countries” pay half and one quarter of these amounts respectively. LIS
does not allow commercial users at any price. This is a requirement of some data providers
and donors (rather than LIS). It is a puzzling restriction with no clear justification to my
mind. What is wrong with commercial usage, at reasonable fees?
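The fee rules described above can be summarized in a few lines of code. This is only a sketch of the schedule as stated in the text: the function and dictionary names are hypothetical, the amounts are assumed to be in euros, and the `covered` flag stands in for the case where a user's country or institution has already paid (or the user is a student).

```python
# Fee for users from designated high-income countries (amounts assumed to be euros).
HIGH_INCOME_FEE = {"quarter": 500.0, "year": 2000.0}
# Multipliers relative to the high-income fee, as described in the text.
MULTIPLIER = {"high": 1.0, "middle": 0.5, "low": 0.25}

def access_fee(income_group: str, period: str, covered: bool = False) -> float:
    """Fee for one user; zero when access is already free for that user."""
    if covered:
        return 0.0
    return HIGH_INCOME_FEE[period] * MULTIPLIER[income_group]

print(access_fee("middle", "year"))                 # 1000.0
print(access_fee("low", "quarter"))                 # 125.0
print(access_fee("high", "year", covered=True))     # 0.0
```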
To submit a data request users first obtain a password and username and sign a pledge to
abide by LIS rules. Users need to submit their own programs to run remotely on the desired
22 The website does identify surveys received but for which harmonization is in progress.
23 At the time of writing I did not include those countries for which the LIS website says “free until date x”
when x has passed. Since then LIS has provided extensions and updated the site.
LIS micro data set. The request (in text format with a standardized header) must use SAS,
SPSS or Stata (by far the most popular package now for LIS submissions). (LIS is planning
to add R, the only truly open source statistical package.) If the code is free of errors then
one usually obtains the results quickly – in a matter of minutes unless a manual check is
deemed necessary. Data results are returned to the same email address and must be in a
summary form that preserves confidentiality. The system is available 24 hours a day, 7 days
a week. Delays arise if manual intervention is called for, such as when a user account has to
be renewed or the submitted program risks violating LIS rules.
One attraction of LISSY is that one can pool data for multiple countries in a single run;
for example, one might combine all the surveys for (say) Europe, or all the surveys for
that matter. One can then estimate a cross-country regression for (say) household income,
with country and year-specific variables (or country/year fixed effects) as well as the usual
household characteristics. Capacity problems can also be expected with large jobs. One
LISSY user reported to me an example of a very large job of this sort that crashed with
the (rather unhelpful) message: “Your job has been refused.” However, with the help of LIS
staff it was possible for this user to partition the job to run properly. The impression I get
from the users I consulted in preparing this paper is that LIS staff are quite responsive to
queries, albeit with the inevitable delays (such as due to different time zones).
I did a trial run of LISSY for the purpose of this review. I chose to compare spending
on social transfers in Australia (my country of origin) with the United States (my country
of residence). Using the latest survey rounds, 2010 for the U.S. and 2003 for Australia, I
regressed “social security transfers” (essentially public transfers for social protection and
assistance, including public pensions but excluding private pensions) as a share of total
income on (log) household size, a dummy variable for the US, and the interaction of these
two variables. A simple Stata program returned my results in two or three minutes. (I found
that the share of income from social transfers declines significantly with household size in
both countries, though slightly less quickly in the US, and the share of income from social
transfers is lower in the U.S. than Australia at any given household size.)
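The specification described above can be sketched as follows. This is an illustrative Python sketch on synthetic data, not the Stata program that was submitted to LISSY. The function name `fit_transfer_share`, the simulated coefficients and the sample size are all assumptions made for the example, and the printed numbers say nothing about Australia or the US.

```python
import numpy as np


def fit_transfer_share(share, hh_size, is_us):
    """OLS of the transfer share on log household size, a US dummy and their interaction."""
    log_size = np.log(np.asarray(hh_size, dtype=float))
    us = np.asarray(is_us, dtype=float)
    # Design matrix: constant, log(size), US dummy, US x log(size).
    X = np.column_stack([np.ones_like(log_size), log_size, us, us * log_size])
    beta, *_ = np.linalg.lstsq(X, np.asarray(share, dtype=float), rcond=None)
    names = ["const", "log_size", "us", "us_x_log_size"]
    return dict(zip(names, beta))


# Synthetic households (hypothetical values, NOT LIS data or the paper's estimates).
rng = np.random.default_rng(0)
n = 2000
hh_size = rng.integers(1, 7, size=n)   # household sizes 1..6
is_us = rng.integers(0, 2, size=n)     # 1 = "US", 0 = "Australia"
share = (0.30 - 0.10 * np.log(hh_size) - 0.08 * is_us
         + 0.02 * is_us * np.log(hh_size) + rng.normal(0.0, 0.05, size=n))

coef = fit_transfer_share(share, hh_size, is_us)
for name, value in coef.items():
    print(f"{name:>14s}: {value: .4f}")
```

In this setup a negative `log_size` coefficient means the transfer share falls with household size, and a positive `us_x_log_size` coefficient means it falls less quickly in the country coded 1.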
To those of us who are more used to having the micro data files, LISSY’s remote access
method may feel a little clumsy, although it works quite well. One does sometimes like
to look at the unit-record data to check something, but losing this option is probably not
a concern for most users. Some data manipulations that are easy with direct access to the
micro data would clearly be a challenge with LISSY’s remote access mode, such as merging
household-level data bases with external geographically-referenced data, such as on public
spending or infrastructure.
There are two other ways users can access the LIS data, with the choice depend-
ing on what one wants from the data. While the main value-added of LIS is clearly the
access to micro data that it facilitates, Key Figures provides summary statistics across
the 40 LIS countries for multiple years in most cases. Key Figures is the LIS out-
put that most resembles the other data compilations reviewed in this special issue, and
it appears to be widely consulted as an authoritative source of poverty and inequality
No registration with LIS is required to access Key Figures. The data compilations are
almost certainly more internally consistent than other compendiums, although this comes
at the price of more limited coverage. However, there are concerns about some of the
measurement practices underlying these data, which I will return to in Section 6.
24Key Figures also provides a set of employment data by gender that is not reviewed here.
The Luxembourg Income Study
The second route to the LIS data is the Web Tabulator, to which this review now turns.
5 Democratizing research: Web Tabulator
To reap the benefits of open data the community of users also needs open tools for data
analysis, without which the cost of entry can be high for many potential users. I expect that
there are a great many people who could learn from LIS data but do not know how to write
code in Stata (say).
Fortunately the technology has improved, so it is now possible to give users access to the
data for analysis without being able to “take away” the data. Web Tabulator does just that.
Users can make their own tabulations from the primary data (only household data at this
point) and this can be done quite easily. One does not need to know any software program.
This is an important innovation for LIS.
I also took Web Tabulator for a test drive for the purpose of this review, again comparing
social transfers in Australia and the US. The latest survey rounds in Web Tabulator were for
2003 and 2004 – a 10-year lag that is a lot longer than those currently working on developing
countries have come to expect. And the Web Tabulator data for the U.S. (2004) lag 6 years behind
the LISSY data (2010).
Table 1 is the result of my test drive. The table gives mean social transfers as a % of
mean gross income for the “poor” and “non-poor” defined by a relative poverty line set at
50 % of the country median (following a standard LIS practice that I will return to below). I
also stratified by household size. We see that the poor depend more on social transfers than
the non-poor. While this holds in both countries, the shares are higher for Australia at all
household sizes. In both countries, we see again that the share of social transfers in income
tends to be lower for larger households. I found Web Tabulator to be easy to use and I had
my table within 10 minutes of opening the tool.
My trial suggested that Web Tabulator could be improved greatly. There are small annoy-
ances, such as the fact that the variables are not defined within the data tool – one needs to
go outside it to search. And “social security transfers” are abbreviated as “social transfers”
in Web Tabulator, which creates confusion.
Table 1 An example of the calculations that can be done using LIS’s Web Tabulator

Mean social transfers as a % of mean gross income, by poverty status (relative to 50 % of median)

Household size        Country and year      Not poor     Poor
One person            Australia 2003           30.81   106.74
                      United States 2004       33.32    80.27
Two persons           Australia 2003           27.07   108.50
                      United States 2004       23.19    65.17
Three persons         Australia 2003           15.61    91.57
                      United States 2004        7.67    42.20
Four persons          Australia 2003           10.23    76.82
                      United States 2004        5.32    35.87
Five persons or more  Australia 2003           13.90    73.07
                      United States 2004        6.88    33.83
The biggest limitation is that (as yet) there is not that much one can actually do analyti-
cally with Web Tabulator – just simple descriptive tabulations of the pre-set variables. It is
no substitute for LISSY, but Web Tabulator could be much more powerful without requiring
knowledge of the software used by LISSY. One cannot even do simple manipulations of the
small set of “canned” variables.
In any future developments to the Web Tabulator idea more variables should be included,
anticipating user applications. And the tool needs to include more options for manipulating
the variables and more analytic functions. With limited resources, LIS clearly faces a hard
choice between investing in Web Tabulator and addressing the pressing coverage and time-
liness issues for its core database in LISSY. But more could be done with existing resources
if LIS adopted new software tools from other sources. There would seem to be a strong case
for replacing the existing Web Tabulator with a tool serving the same end but based instead
on the entire LISSY data base. The World Bank’s research department has devised a soft-
ware solution called ODAT (an extended version of the analytic front end of CLSP) with
powerful statistical capabilities yet preserving full confidentiality and allowing users with-
out any knowledge of statistical software to derive versions of Table 2, and much more.25
Web Tabulator users would have a more powerful tool, no harder to use. And I expect that
for many users of LISSY, this new version of Web Tabulator will suffice. This could be up
and running quickly at modest cost.
6 LIS practices
Since its inception, LIS has emphasized standardization to enhance international compa-
rability and policy relevance. This has entailed imposing common protocols. Here I am
concerned that some LIS practices are open to question or even hard to defend – not least
in the developing countries of the world that LIS has been aiming to embrace, albeit with
limited success so far.
Before turning to specifics, I note that, beyond some early contributions (such as Buhmann et al.
1988), the LIS website and “LIS literature” in the WPs seem a little short on methodological
content. I would have liked to see more work documenting and critically assessing LIS
protocols, such as in constructing “comparable” income aggregates and welfare metrics and
dealing with thorny issues such as the valuation of income-in-kind. Such methodological
work is important for both current users (given that the methodological choices can matter
to, for example, policy inferences) and future efforts at data harmonization by others.
6.1 Processing protocols
Prior to around 2000, LIS used an odd coding rule that replaced missing monetary variables
in the primary data with a zero, so that a zero became ambiguous – it could either mean
zero or that the data are missing (which are of course very different things). Thankfully,
this has been replaced by better coding practices. However, it remains that LIS does not do
its own imputations for missing values. The problem for the types of comparisons that LIS
25ODAT was developed by Qinghua Zhao at the Bank. ODAT is in trials with China’s NBS and is planned to
become publicly available (free of charge) for other applications late 2013. ODAT has the full capabilities
of R, but does not require programming in R. A simple user-friendly interface is used.
Table 2 Five different households who are equally well off according to LIS

Household               Household income  No.     No.       Income      Income per equivalent    Income per equivalent
                        ($/day)           adults  children  per person  single person            single adult
                                                                        (LIS square-root scale)  (alternative scale)
A Small                 $5.00             1       0         $5.00       5.00                     5.00
B Medium, nuclear       $10.00            2       2         $2.50       5.00                     3.76
C Medium + grandparent  $10.00            3       1         $2.50       5.00                     3.51
D Large, nuclear        $15.00            2       7         $1.67       5.00                     3.20
E Large and extended    $15.00            6       3         $1.67       5.00                     2.81

Note: The alternative scale assumes that children cost 70 % of adults and that the scale parameter is 0.8
is designed for is that the extent of missing values (such as for income) is unlikely to be
random, within or between countries. Furthermore, imputation practices vary across survey
data providers, and are sometimes less than ideal. Various imputation/matching methods
address item nonresponse by exploiting the questions that are in fact answered (see, for
example, Little and Rubin 1987). However, the devil is in the details, and I have come
across examples in which the imputation method was far from sound, due to the absence of
obvious covariates for matching (to find otherwise “similar” comparators for imputing the
missing values), even when in fact the variables are in the data.
It would be a lot of work to systematically impute using best practices from the literature.
I can’t judge whether the benefits would outweigh the costs. However, it is surely odd that
LIS puts such emphasis on assuring comparability of the income data when it is not missing
but ignores the potential biases from diverse and often absent imputation methods.
Another issue that users should be aware of is that LIS summary statistics on income distribution
are based on incomes that have been deliberately truncated. Incomes are top-coded,
meaning that they are truncated at 10 times the median (Lustig et al. 2013). Along with
others, I have an aversion to top coding of reported incomes (or expenditures). My gen-
eral presumption is that high-income respondents to household surveys under-state their
incomes, so LIS’s top-coding makes matters worse. LIS also employs bottom coding in Key
Figures, at 1 % of mean income. Also households with zero income are excluded from the
calculations reported in Key Figures. These practices are puzzling and I can see no justifica-
tion. Some households do have zero or low income at some times; consumption, of course,
is another matter.26
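The effect of these coding rules on a summary measure can be illustrated with a small sketch. It applies a ceiling at 10 times the median, a floor at 1 % of the mean, and the exclusion of zero incomes to a made-up income vector, then compares the Gini index before and after. The function names, the order in which the steps are applied, and the incomes themselves are assumptions for illustration; this is not LIS's own processing code.

```python
import numpy as np


def gini(x):
    """Gini index of a non-negative income vector (equal weights)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * x) / (n * np.sum(x)) - (n + 1.0) / n


def recode_incomes(income, top_mult=10.0, bottom_frac=0.01, drop_zeros=True):
    """Apply top coding, bottom coding and (optionally) exclusion of zero incomes.

    Assumption: zeros are dropped first, and the median and mean are then
    computed on the remaining incomes. The source does not state the order.
    """
    y = np.asarray(income, dtype=float)
    if drop_zeros:
        y = y[y != 0.0]
    top = top_mult * np.median(y)      # ceiling at 10 times the median
    bottom = bottom_frac * np.mean(y)  # floor at 1 % of the mean
    return np.clip(y, bottom, top)


# Hypothetical incomes with one zero and one very high value.
incomes = np.array([0.0, 1.0, 10.0, 20.0, 30.0, 40.0, 1000.0])
recoded = recode_incomes(incomes)

print("Gini, incomes as reported:", round(gini(incomes), 3))
print("Gini, after recoding:     ", round(gini(recoded), 3))
```

With this made-up vector the recoded Gini is lower than the Gini of the incomes as reported, which is the direction of concern raised in the text.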
These coding practices leave me skeptical of summary measures on inequality and
poverty in Key Figures. A strength of LIS is that users can construct their own measures
from the micro data, but this adds greatly to the costs facing those users who simply want
reliable, comparable, summary statistics.
6.2 LIS’s favored welfare indicator
The best single indicator of economic welfare available in LIS is probably “disposable
household income” defined as “total monetary and non-monetary current income net of
income taxes and social security contributions.” With recent changes in LIS, this concept
now includes income-in-kind (with appropriate backward revisions as well), although it still
does not include imputed rent or the value of public health and schooling services.
LIS excludes surveys that do not have household income data. This might seem a natu-
ral restriction; this is after all the Luxembourg Income Study. However, we should remind
ourselves why we care about income in this context, which is its bearing on human wel-
fare, for the purpose of assessing the extent of poverty or inequality. There are users of LIS
(myself included) who prefer a standard (comprehensive) consumption aggregate over cur-
rent income as a metric of welfare. We might well agree that permanent income is the most
relevant income concept for judging welfare, and that it is also more relevant for measur-
ing the true level of poverty or inequality, given that current income includes predictable
transient or life-cycle income differences. (Policies too are rarely concerned with reason-
ably predictable short-term income losses.) However, while current income measured in a
survey would only rarely accord with permanent income, current consumption does reveal
permanent income under certain conditions. That is an attractive property.
26Web Tabulator has a box one can tick to include the zeros; deleting them is the default.
It must be acknowledged that this property of consumption does require some strong
assumptions, notably that credit markets work well enough to allow households to smooth
consumption as they see fit. In reality there are borrowing constraints stemming from asym-
metric information. Advocates of using income data in preference to consumption often
point to these constraints. However, recognizing the existence of credit market imperfections
does not justify believing that current income is a better welfare metric than consump-
tion. We need not presume that markets are perfect to still expect that consumption will be
smoothed to some extent in the face of income fluctuations. Households can save and they
do have foresight.
Another argument sometimes made for preferring income measures is that they better
reflect “potential consumption.” But this too is questionable. First, we can question whether
potential consumption is a valid welfare indicator. A poor farmer may get a bumper harvest
once in 20 years, but he can hardly be judged to be no longer poor, even in that fortunate
year. Second, even if we accepted that potential consumption was what we are after, income
is hardly a good measure; we would surely want to know liquid wealth, and here too actual
consumption may well be revealing about potential consumption.
On top of these more theoretical arguments, household consumption is probably better
measured than income in many cases, especially in developing countries. When incomes
fluctuate over time (such as due to seasonality in agriculture) it is very difficult to measure
them in surveys, and this is often not even attempted. Incomes also contain illegal earn-
ings or components hidden from taxation that households will be disinclined to reveal in a
survey. Consumption surveys are less prone to these problems. The gaps between grossed-
up aggregates from surveys and the closest comparable measures in the National Accounts
have often been of concern (though reflecting errors in both sources, as well as different
concepts). However, these gaps appear to be a bigger problem in general for income sur-
veys in developing countries than consumption surveys (Ravallion 2003), although poor
survey instruments for measuring income are also a problem (in the worst case, asking
“what is your income?” will hardly ever give a reliable answer; a much better practice
is to build it up from specific detailed components, which can take many pages of a
The choice will clearly also depend on the context. For example, when rain-fed agricul-
ture is a more important source of income, consumption surveys will be more appealing.
The fact that LIS started out in rich countries (and still does not represent poor countries
well) has influenced its choice, although it has been argued that consumption is also a better
measure of economic welfare for LIS-type purposes in rich countries (Slesnick 2001, on
the U.S.). What is clear is that a truly “global LIS” would need to embrace consumption sur-
veys, which have been more popular (for good reasons) in the developing world (including
in the Bank’s LSMS).27
Another source of concern relates to how LIS adjusts for differences in household demo-
graphics. What is called “equivalized income” in LIS is household income divided by the
square root of household size. LIS’s use of the square-root scale appears to have been influ-
ential, with non-LIS researchers adopting this scale on the grounds that LIS does so. Such
equivalence scales are invariably rather arbitrary. There are well-known, and deep, identifi-
cation problems in estimating such scales based on observed demand behavior alone (Pollak
and Wales 1979).
27For further discussion on measuring consumption properly see the excellent overview in Deaton and Zaidi
It is not clear why LIS adopted the square root scale and not something else, and why
there is no allowance for the fact that children tend to have lower consumption demands
than adults. Taking the square root implies substantial economies of scale, especially given
that imputed rents for housing are excluded (noting that the scope for economies of scale is
greater for housing than most other demands on incomes).
Lay users of the summary data will not easily understand the implications of this square-root
scale, and if they did they may well question it. Providing users with something like my
Table 2 would help. Here I give stylized data on household income and demographic composition
for five households. According to LIS, members of all five households are equally
well off. I am sure I am not the only person who would question that claim. Household E
with 6 adults and 3 children and a total income of $15.00 per day is deemed by LIS to be no
worse off than a single adult with an income of $5.00 per day. To test sensitivity I also give
income per equivalent single adult for an alternative scale that allows for lower expenditure
needs of children than adults and more modest scale economies in consumption. (Specifically,
the scale is \((N_A + 0.7 N_C)^{0.8}\), where \(N_A\) is the number of adults and \(N_C\) is the number
of children.)
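The two scales can be reproduced with a few lines of code. The sketch below recomputes the per-person and equivalized incomes for the five stylized households, using the square-root scale and the alternative scale with a child weight of 0.7 and a scale parameter of 0.8. The function and variable names are illustrative assumptions.

```python
import math


def lis_equivalized(income, n_adults, n_children):
    """Household income divided by the square root of household size."""
    return income / math.sqrt(n_adults + n_children)


def alt_equivalized(income, n_adults, n_children, child_weight=0.7, theta=0.8):
    """Household income divided by (adults + child_weight * children) ** theta."""
    return income / (n_adults + child_weight * n_children) ** theta


# (label, household income in $/day, adults, children) for the stylized households.
households = [
    ("A", 5.0, 1, 0),
    ("B", 10.0, 2, 2),
    ("C", 10.0, 3, 1),
    ("D", 15.0, 2, 7),
    ("E", 15.0, 6, 3),
]

for label, income, adults, children in households:
    per_person = income / (adults + children)
    print(label,
          round(per_person, 2),
          round(lis_equivalized(income, adults, children), 2),
          round(alt_equivalized(income, adults, children), 2))
```

Setting `child_weight=1.0` and `theta=0.5` in `alt_equivalized` recovers the square-root scale, so the same function can be used to trace how rankings change as either parameter moves.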
The kind of sensitivity to scale parameters evident in Table 2 is well understood amongst
researchers, and is also known to have potentially important implications for social policies
(especially policies that involve demographic characteristics, such as family allowances,
which will become less “pro-poor” in their incidence the more one allows for economies
of scale). The first academic paper using LIS (and by LIS staff) tested the sensitivity
of measures of poverty and inequality to the choice of scale parameters and concluded:
“The results of our analysis indicate that choice of equivalence scale can sometimes
systematically affect absolute and relative levels of poverty and inequality and
therefore rankings of countries (or population subgroups within countries). Because
of these sensitivities, one must carefully consider summary statements and policy
implications derived from cross-national comparisons of poverty and/or inequality.”
(Buhmann et al. 1988, p. 115)
Unfortunately, many users of Web Tabulator or Key Figures will not be aware of this
warning. It seems that LIS’s research arm might have better informed its data arm!
Nor is it clear that consistent comparisons of real income distributions across countries
should use a common equivalence scale. The parameters of such a scale – the differential
allowance for children’s needs and the adjustment for economies of scale – can be expected
to vary with consumption patterns. For example, in richer countries, the share of income
spent on food (for which there is no scale economy) tends to be lower while the share for
housing (with more potential for economies of scale) is higher. Users with the required
programming skills can construct their own scales from LISSY but those using Key Figures
6.3 Prices
My 1992 review complained that LIS had not used Purchasing Power Parity (PPP) rates for
exchange rate conversions. This becomes especially important when LIS expands to include
developing countries where official exchange rate conversions understate real incomes
given the existence of non-traded goods (especially services). There has been some progress
on this front. Web Tabulator uses PPP rates. However, the means and medians given in Key
Figures are not in PPP $’s but local currency units at the time of the survey. It would be
more useful to give the mean and median in constant prices at PPP.
There does not appear to be any adjustment for spatial differences in price levels within
countries, as is now common for developing countries. Again context matters; most goods
prices probably vary less spatially in rich countries than poor ones (given poorer infras-
tructure), although housing is an important exception and has a high budget share in rich
It is puzzling why adjustments for inflation for data collected at different times are only
done if the inflation rate exceeds 10 %. This is creating unnecessary measurement error.
6.4 Poverty measures
From a global perspective, the bulk of past poverty measurement practice has been polarized
between a “rich world” in which income-based measures of relative poverty dominate and
a “poor world” in which absolute poverty measures have been the norm and consumption-
based welfare measurement has been seen as best practice. LIS Key Figures (and the poverty
classifications in Web Tabulator) are in the former camp.28 LISSY users are (of course)
free to make absolute comparisons by introducing their own (sub-national or national) price
However, research using LIS has mainly relied on making comparisons of relative distri-
bution across countries. The phrase “absolute poverty” is mentioned nowhere on the LIS site
and by my count no more than 40 of the 600 LIS WPs make absolute poverty comparisons.
Nor are LIS’s relative poverty measures in Key Figures and Web Tabulator beyond ques-
tion. The LIS tradition has been to use a poverty line set at half the country’s median. (Key
Figures also gives results for 40 % and 60 % of the median.) There is an extensive litera-
ture on such relative poverty measures.29 There has also been much debate. One issue in
the literature is whether the poverty line should be fixed relative to the mean or relative to
the median (Saunders and Smeeding 2002; Easton 2002; De Mesnard 2007). The median
is more robust to measurement errors at the extremes, although poverty lines set as a con-
stant proportion of the median can have perverse properties when the Lorenz curve shifts,
as demonstrated by De Mesnard (2007).
However, whether the line is set at a constant proportion of the mean or median, there
is a more serious objection, namely that such a poverty measure will have the seemingly
perverse property that if all incomes grow by the same proportion the measure will be
unchanged. This yields some surprising poverty comparisons. We are told in Key Figures
that the incidence of poverty in India is lower than in China, and only slightly higher than for
the United States. (The website gives a poverty rate for India in 2004 of 20 % as compared
to 17 % for the same year for the US; the poverty rate given for China in 2002 is 25 %.) Of
course, once you set a poverty line as a constant proportion of the mean or median you end
up with a measure of relative distribution, divorced from absolute levels of living. Pooling
all the observations given in Key Figures, the correlation coefficient between the LIS poverty rate
and the Gini index is very high, at 0.92 (n = 211).
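The invariance property discussed above is easy to verify numerically. The sketch below computes a headcount rate with the line set at 50 % of the median, doubles every income, and recomputes. The incomes, the fixed line of 4.5 and the function names are hypothetical and chosen only for illustration.

```python
import numpy as np


def relative_headcount(income, frac=0.5):
    """Share of units below a line set at `frac` times the median."""
    y = np.asarray(income, dtype=float)
    line = frac * np.median(y)
    return float(np.mean(y < line))


def absolute_headcount(income, line):
    """Share of units below a fixed line."""
    y = np.asarray(income, dtype=float)
    return float(np.mean(y < line))


# Hypothetical incomes; every income then grows by the same proportion.
incomes = np.array([1.0, 2.0, 3.0, 8.0, 10.0, 12.0, 20.0, 40.0])
doubled = 2.0 * incomes

print("relative, before and after:", relative_headcount(incomes), relative_headcount(doubled))
print("absolute, before and after:", absolute_headcount(incomes, 4.5), absolute_headcount(doubled, 4.5))
```

The relative measure is identical in the two cases because the line scales with the median, while the measure based on a fixed line falls as incomes grow.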
Users who want to make inter-country comparisons might be advised to augment the LIS
poverty measures in Key Figures with absolute measures from other sources. Users should
28The mean and median income are provided in Key Figures, but with no real comparability across countries
or over time (as already noted).
29Ravallion (2012) provides a commentary and references.
also be wary of using these data for comparisons over time for a given country. For example,
we are told that the poverty rate for the United Kingdom rose from 5 % in 1969 to 15 % in
2010. And I suspect that if I was to dig a bit in the LIS data I would find situations in which
measured poverty fell during recessions.
The usual defense for such measurement practices is to say that these measures allow
for relative deprivation – that people care about their income relative to the mean or median
of their country of residence. But as long as we think that poverty is absolute in the space
of welfare (or capabilities) one can only derive these strongly relative poverty measures
if welfare only depends on relative income (own income relative to the median). In other
words, one needs to assume that welfare does not depend on own-income at given relative
income. This must surely be considered a very strong assumption. Maybe it can be defended as
plausible in some very rich European countries – maybe Luxembourg? – but it can hardly
be plausible in most of the current LIS countries, let alone the largely excluded developing
Another defense one hears of these measures is that they embrace the costs of “social
inclusion,” which are deemed to rise with average income. Here too one might readily agree
that there are such costs and that they rise with average income, but question whether they
are directly proportional to the mean or median. And when one carries this argument to
poor countries, one is likely to under-state the costs of social inclusion using such measures
(Ravallion 2012).
There are now better ways of measuring relative poverty. Ravallion and Chen (2011)
propose a “weakly relative poverty measure” that can allow for relative deprivation and
costs of social inclusion but does not entail setting the line at a constant proportion of the
mean or median.30 The elasticity of the poverty line to the mean starts out at zero in the
poorest countries (purely absolute poverty measures) but then rises with the mean, though
only reaching an elasticity of unity as mean income goes to infinity. Ravallion and Chen
show that their schedule of weakly relative lines fits the data on national poverty lines well.
Truly global measures of poverty following this approach are now available (Ravallion and
Chen 2013).
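The elasticity property described above can be made explicit with a short derivation. The piecewise-linear schedule below is one functional form consistent with that description, written here as an assumption: the symbols \(z^{*}\) (an absolute line), \(\alpha\) (a positive lower bound on the cost of social inclusion) and \(k\) (the slope in the mean \(m\)) are illustrative and are not parameter values taken from this review.

```latex
% Assumed (illustrative) schedule of weakly relative lines.
z(m) = \max\{\, z^{*},\ \alpha + k m \,\}, \qquad 0 < \alpha < z^{*},\ k > 0.

% Elasticity of the poverty line with respect to the mean:
\eta(m) \equiv \frac{d \ln z}{d \ln m} =
\begin{cases}
  0, & m < (z^{*} - \alpha)/k, \\[4pt]
  \dfrac{k m}{\alpha + k m}, & m > (z^{*} - \alpha)/k,
\end{cases}
\qquad \lim_{m \to \infty} \eta(m) = 1.

% Strongly relative line, for comparison: z = k m, so \eta(m) = 1 at every m.
```

Because \(\alpha > 0\), the elasticity stays strictly below unity at any finite mean, whereas a line set at a constant proportion of the mean or median has unit elasticity everywhere.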
LIS’s preferred measurement practices accord with practices in a number of the LIS
countries but certainly not all. Indeed, even in 1990, it could not be argued that absolute
poverty is irrelevant to the seven rich countries then in LIS, since one of those countries (the
United States) uses absolute poverty lines. Today’s LIS includes many more countries that
also use absolute measures. If LIS Key Figures and Web Tabulator were to be relevant to
all the LIS countries then they would need to include absolute as well as relative measures.
And this would clearly be essential if LIS was to be considered a globally relevant data
7 Some problems under the surface
There are some deeper problems in LIS’s underlying survey data that matter to some of the
uses made of LIS, including in international comparisons. LIS should not be singled out
30This builds on an approach proposed by Atkinson and Bourguignon (2001), but with the important differ-
ence that our measure allows the cost of social inclusion to have a positive lower bound. Foster (1998) also
proposed a “hybrid line” given by the weighted geometric mean of an absolute and a (strongly) relative line.
While this is also “weakly relative,” it has a constant elasticity, whereas the elasticity rises from zero to unity
in the Ravallion and Chen proposal – consistently with the data on national lines.
for criticism in not addressing these problems, as they are shared by virtually all data sets.
Nonetheless, users should be well aware of them. We might also expect prominent data
providers such as LIS to show leadership in flagging these problems and pointing to
possible solutions.
A case in point is survey non-response bias. Some proportion of those households sam-
pled for a survey either refuse to be interviewed or are impossible to reach at home. This is
often called “unit nonresponse” (as distinct from item nonresponse, as discussed in Section
6.1). Some surveys make efforts to avoid unit nonresponse, using “call-backs” to non-
responding households and fees paid to those who agree to be interviewed. Nonetheless,
the problem is practically unavoidable and nonresponse rates of 10 % or higher are now
common; indeed, I know of national surveys for which 30 % of those sampled did not
The bigger concern is that nonresponse is clearly not random. More likely, compliance
falls with income; this is consistent with evidence for the U.S. (Groves and Couper 1998,
based on compliance with the long schedule of the U.S. Census administered to a random
There are methods that can be used to correct for selective response. Using the same
survey data for the U.S. found in LIS, Korinek, Mistiaen and Ravallion (KMR) (2006)
show how the latent income effect on compliance can be estimated consistently with
the available data on average response rates and the measured distribution of income
across geographic areas.31 This allowed KMR to re-weight the data. They found a sig-
nificant negative income effect on U.S. survey compliance. After correcting for this,
measured inequality is higher, with about 5 percentage points added to the Gini index. A higher
overall mean is also called for. In terms of the impact on measures of poverty, the
downward bias in the mean tends to offset the downward bias in measured inequal-
ity. The tendency for low-income groups to be over-represented (because of their higher
compliance probabilities) still means that the poverty rate tends to be over-estimated,
though KMR find that the impact is small up to poverty lines normally used in the
I would hazard to guess that these problems are at least as severe for the international
comparability of measures of poverty and inequality as the problems addressed by the
standardizations done by LIS, though the effects do not necessarily go in the same direc-
tion. Consider the Gini index of income inequality in the U.S. for 2004. The “raw” Gini
index based on income per person from the Current Population Survey reported by KMR is
44.8 %. After LISification, the Gini index for the same year from the same survey is 37.2 %
(from Key Figures). That is a large downward revision. By contrast, the KMR correction
for selective compliance in the CPS brings the Gini index for 2004 up to 49.2 %. The same
upward adjustment to the LIS Gini index would bring it to 41.6 %, a few points below the
unstandardized figure.
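The logic of re-weighting for selective compliance can be illustrated with a deliberately simple sketch. It assumes the response probabilities are known, whereas KMR estimate the income effect on compliance from area-level response rates, so this is not their estimator. The population, the response probabilities and the function name `weighted_gini` are hypothetical.

```python
import numpy as np


def weighted_gini(income, weights):
    """Gini index from the weighted Lorenz curve (trapezoid rule)."""
    y = np.asarray(income, dtype=float)
    w = np.asarray(weights, dtype=float)
    order = np.argsort(y)
    y, w = y[order], w[order]
    pop_share = np.concatenate([[0.0], np.cumsum(w) / np.sum(w)])
    inc_share = np.concatenate([[0.0], np.cumsum(y * w) / np.sum(y * w)])
    area = np.sum(np.diff(pop_share) * (inc_share[1:] + inc_share[:-1]) / 2.0)
    return 1.0 - 2.0 * area


# Hypothetical population: 80 low-income and 20 high-income households.
population = np.array([10.0] * 80 + [100.0] * 20)

# Assumed response probabilities: every low-income household responds, but only
# half of the high-income households do (a negative income effect on compliance).
respondents = np.array([10.0] * 80 + [100.0] * 10)
response_prob = np.where(respondents > 50.0, 0.5, 1.0)

raw = weighted_gini(respondents, np.ones_like(respondents))
corrected = weighted_gini(respondents, 1.0 / response_prob)
truth = weighted_gini(population, np.ones_like(population))

print("raw:", round(raw, 4), "corrected:", round(corrected, 4), "population:", round(truth, 4))
```

Weighting each respondent by the inverse of its response probability restores the population shares in this example, so the corrected Gini matches the population value and exceeds the unweighted one.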
The traditional household survey faces many new challenges today. There is a per-
ception that survey response rates are declining across the globe, although I have not
seen evidence. In addition to concerns about under-reporting and selective compliance,
there are worries about out-of-date sample frames, infrequency of surveys, lags in pro-
cessing, questionnaire length, robustness to changes in questionnaire design, heterogeneity
in the interpretation of survey questions including subjective scales, and weak integra-
tion with other data sources. There is not much discussion of these deeper problems
31Korinek et al. (2007) go more deeply into the estimation method and its robustness.
on the LIS website, or the working papers.32 More work on these topics is surely
8 Conclusions
The landing page of the LIS website says that “Our mission is to enable, facilitate, promote,
and conduct cross-national comparative research on socio-economic outcomes and on the
institutional factors that shape those outcomes.” If we take “cross-national” to embrace the
world as a whole, then LIS can only claim partial success in its mission. While LIS has
made some progress toward expanding country coverage to include “middle-income” coun-
tries and adding multiple surveys over time, it has not made as much progress on these
fronts as it might have, especially given the huge expansion in survey-data availability glob-
ally. Indeed, very few of the new national household surveys available globally have made
it into LIS. While we are seeing an overlap in the “poor world” and “rich world” data
sets, alongside the convergence in their economies, the “rich-country bias” of LIS is still
The reality today is that LIS is still a tool serving inequality and social policy analysts
doing strictly relativist comparative work on income distribution across mainly rich
countries. There is a clear niche for LIS in this role and the service that LIS provides its
community of users in rich countries for about €1 million per year makes it good value.
Credit must go to LIS’s donors, management and staff for providing this important public
However, neither the present LIS nor the various data tools that have emerged in recent
years serve the large and clearly growing demand for a truly global survey-data archive,
linking up to other databases at national and sub-national levels. Resources have undoubtedly
constrained progress, but so have LIS’s long-standing practices and protocols. LIS can
stay the valued tool that it is for comparative research on rich countries. However, to become
an exciting research tool for global micro-data analysis in today’s world, LIS would have
to broaden its coverage considerably, become more timely, and more flexible and eclectic
in its eligibility criteria and measurement practices, and more open to non-specialist
users in its modes of data access. Importantly, LIS would need to embrace measurement
practices more common, and often more defensible, in poorer countries. That need not preclude
keeping many of its current practices, although some of those (such as the use of
strongly relative poverty measures) would seem hard to defend even in rich countries. The
key point is that a globally relevant LIS would need to be more open to practices across the
The questions remain: Can LIS rise to this new challenge, recognizing that this
would require many changes? Or will the new comparative data tools that have emerged
for the developing world morph into the new global tool for comparative analysis
that is needed today? The Gornick et al. (2015) response to the present paper does
not offer much hope that LIS will rise to the challenge anytime soon. But who
32A foreword by Tony Atkinson to a new LIS volume is one of the few references to these problems I could
find: see Atkinson (2013).
Acknowledgments The author thanks Nora Lustig and Francisco Ferreira for encouraging him to write the
paper and Janet Gornick for promptly answering many questions on LIS. Helpful comments were received
from Kathleen Beegle, Andrea Brandolini, Gero Carletto, Shaohua Chen, Francisco Ferreira, Janet Gornick,
Markus Jäntti, Thierry Kruten, Teresa Munzi, Prem Sangraula and Dominique van de Walle. The data analysis
for this paper was done in early October 2013 and does not allow for any updates after that date. To assure
accuracy on purely factual matters, the Director of LIS, Janet Gornick, was given the opportunity to read this
paper before it was made public and she also consulted with her senior staff.
References
Atkinson, A.B.: Introduction. In Smeeding et al. (1990)
Atkinson, A.B.: Foreword to Gornick and Jäntti (2013)
Atkinson, A.B., Bourguignon, F.: Poverty and Inclusion from a World Perspective. In: Stiglitz, J.E., Muet,
P.-A. (eds.) Governance, Equity and Global Markets. Oxford University Press, Oxford, England (2001)
Atkinson, A.B., Brandolini, A.: Promise and Pitfalls in the Use of “Secondary” Data-Sets: Income Inequality
in OECD Countries as a Case Study. J. Econ. Lit. 39, 771–799 (2001)
Buhmann, B., Rainwater, L., Schmaus, G., Smeeding, T.: Equivalence Scales, Well-Being, Inequality, and
Poverty: Sensitivity Estimates across Ten Countries Using the Luxembourg Income Study Database.
Rev. Income Wealth 34, 115–42 (1988)
Deaton, A., Zaidi, S.: Guidelines for Constructing Consumption Aggregates for Welfare Analysis. Living
Standards Measurement Study Working Paper, vol. 135. World Bank, Washington, DC (2002)
De Mesnard, L.: Poverty Reduction: The Paradox of the Endogenous Poverty Line, LEG - Document de
travail - Economie 2007-05, LEG, Laboratoire d’Economie et de Gestion, CNRS UMR 5118. Université
de Bourgogne (2007)
Du, R.: The Course of China’s Rural Reform. International Food Policy Research Institute, Washington, DC
Easton, B.: Beware the Median. Soc. Policy Res. Cent. Newsl. 82, 6–7 (2002)
Foster, J.E.: Absolute versus relative poverty. Am. Econ. Rev., Papers Proc. 88(2), 335–341 (1998)
Gornick, J.C., Jäntti, M. (eds.): Income Inequality: Economic Disparities and the Middle Class in Affluent
Countries. Stanford University Press (2013)
Gornick, J.C., Jäntti, M., Munzi, T., Kruten, T.: Response to The Luxembourg Income Study review by
Martin Ravallion, Journal of Economic Inequality (2015)
Groves, R.E., Couper, M.P.: Nonresponse in Household Interview Surveys. Wiley, New York (1998)
Korinek, A., Mistiaen, J., Ravallion, M.: Survey Nonresponse and the Distribution of Income. J. Econ.
Inequal. 4(2), 33–55 (2006)
Korinek, A., Mistiaen, J., Ravallion, M.: An econometric method of correcting for unit nonresponse bias in
surveys. J. Econ. 136, 213–235 (2007)
Little, R.J.A., Rubin, D.B.: Statistical Analysis with Missing Data. Wiley, New York (1987)
Lustig, N., Teles, D., Huang, J.: Describing International Datasets of Inequality Indicators, Appendix to this
special issue on international databases (2013)
Pollak, R., Wales, T.: Welfare Comparison and Equivalence Scale. Am. Econ. Rev. 69, 216–21 (1979)
Ravallion, M.: Review of Smeeding et al., 1990, Journal of Economic Literature, September (1992)
Ravallion, M.: Measuring Aggregate Economic Welfare in Developing Countries: How Well do National
Accounts and Surveys Agree? Rev. Econ. Stat. 85(3), 645–652 (2003)
Ravallion, M.: Poverty Lines Across the World. In: Jefferson, P.N. (ed.) The Oxford Handbook of the
Economics of Poverty. Oxford University Press, Oxford (2012)
Ravallion, M., Chen, S.: Weakly relative poverty. Rev. Econ. Stat. 93(4), 1251–1261 (2011)
Ravallion, M., Chen, S.: A proposal for truly global poverty measures. Glob. Policy 4(3), 258–265 (2013)
Saunders, P., Smeeding, T.: Beware the Mean! Soc. Policy Res. Cent. Newsl. 81(1), 4–5 (2002)
Slesnick, D.: Consumption and Social Welfare. Living Standards and their Distribution in the United States.
Cambridge University Press, Cambridge, England (2001)
Smeeding, T.: Twenty Years of Research on Income Inequality, Poverty, and Redistribution in the Developed
World: Introduction and Overview. Socio-Econ. Rev. 2, 149–163 (2004)
Smeeding, T., O’Higgins, M., Rainwater, L.: Poverty, Inequality and Income Distribution in Comparative
Perspective: The Luxembourg Income Study (LIS). The Urban Institute, Washington DC (1990)
Smeeding, T., Schmaus, G., Allegrezza, S.: An introduction to LIS. LIS Work. Paper, 1 (1985)
Smeeding, T., Schmaus, G., O’Higgins, M.: The LIS database: Technical and methodological aspects. In
Smeeding et al. (1990)
World Bank: World Development Indicators. World Bank, Washington DC (2013)