No. 0323
Social Protection Discussion Paper Series
Data Sources for Microeconometric Risk
and Vulnerability Assessments
John Hoddinott and Agnes Quisumbing
December 2003
Social Protection Unit
Human Development Network
The World Bank
Social Protection Discussion Papers are not formal publications of the World Bank. They present preliminary and
unpolished results of analysis that are circulated to encourage discussion and comment; citation and the use of such a
paper should take account of its provisional character. The findings, interpretations, and conclusions expressed in this
paper are entirely those of the author(s) and should not be attributed in any manner to the World Bank, to its affiliated
organizations or to members of its Board of Executive Directors or the countries they represent.
For free copies of this paper, please contact the Social Protection Advisory Service, The World Bank, 1818 H Street,
N.W., Washington, D.C. 20433 USA. Telephone: (202) 458-5267, Fax: (202) 614-0471, E-mail:
socialprotection@worldbank.org. Or visit the Social Protection website at http://www.worldbank.org/sp.
DATA SOURCES FOR MICROECONOMETRIC RISK AND
VULNERABILITY ASSESSMENTS
John Hoddinott and Agnes Quisumbing
International Food Policy Research Institute
Washington, D.C.
December 2003
John Hoddinott and Agnes Quisumbing are Senior Research Fellows in the Food Consumption and
Nutrition Division, International Food Policy Research Institute.
Contents
1. Introduction
2. Information Needs for Vulnerability Measurement
  2.1 Conceptualizing Information Needs Using the “Risk Chain”
  2.2 An Overview of Data Issues
    Degree of Covariance of Risks and Shocks
    Ex Post versus Ex Ante Mechanisms
    Timing and Frequency of Surveys
    Cross-Validation of Responses
    Types of Data and Methods of Data Collection
3. Household Data From Survey-Based Methods
  3.1 Single-Cross Section of Households
    Uses of This Data Source
    Advantages and Disadvantages
    Issues and Innovations
  3.2 Repeated Cross-Sections
    Advantages and Disadvantages
    Issues and Innovations
  3.3 Panel data
    Advantages and Disadvantages
    Issues and Innovations
4. Locality Information and Data from Contextual Methods
  4.1 Community Information
    Issues and Innovations
  4.2 Secondary Sources
    Issues and Innovations
  4.3 Contextual information
5. Concluding Remarks
Annex 1: Modules on Risks and Shocks
References
Tables
1 Typical data sources for the identification of risks, risk exposure, and risk realization
2 Typical data sources on ex post and ex ante risk management instruments
3 Typical data sources for outcomes
Examples
1 Risk and vulnerability assessment in Guatemala
2 Risk-related hardship faced by rural households in Ethiopia
3 Examining vulnerability to risk in Kenya
4 Welfare losses from poverty and risk in Bulgaria
5 Effects of shocks on adult body mass in rural Ethiopia
6 Shocks and schooling attainment in Guatemala
7 Integrating quantitative and qualitative research in studying vulnerability
1. INTRODUCTION
The increasing recognition that there are considerable flows into and out of the
poverty pool (e.g., see Baulch and Hoddinott 2000) has focused interest on household
vulnerability as the basis for a social protection strategy. As Holzmann and Jørgensen (2000)
note, in a dynamic environment where adverse economic shocks are easily transmitted across
geographic borders, a social protection scheme might protect households from the adverse
effects of poverty more effectively by adopting a forward-looking approach that identifies
not only the groups of households that are presently poor but also the households that are
vulnerable to economic shocks and other risks such as natural disasters and climate
conditions.
The task of undertaking a risk and vulnerability assessment can be complicated both
by multiple definitions of vulnerability and by the scarcity of data with which to undertake
vulnerability measurements.
The feasibility of applying a particular empirical approach is often dictated by data
concerns. Analysts recognize that vulnerability involves both welfare losses due to low
consumption or poverty as well as those due to uncertainty. However, analysts undertaking a
risk and vulnerability assessment often only have a cross-sectional household survey at their
disposal. Panel data, while ideal, are expensive, and oftentimes an urgent policy question
needs to be answered before a new survey can be fielded. What are the options available to
such analysts in the field? How can they make use of existing data sources to come up with
measures of vulnerability? What are the pros and cons of various data sources? What are
cost-effective ways of collecting additional information to enhance existing data sources?
This “toolkit” is designed to assist practitioners undertaking vulnerability assessments
by identifying data sources, assessing their suitability for risk and vulnerability measurement,
and proposing suggestions for data collection to supplement existing sources. It
complements “Methods for Microeconometric Risk and Vulnerability Assessments: A
Review with Empirical Examples” (Hoddinott and Quisumbing 2003), which discusses
techniques for assessing vulnerability and issues relating to their application. The emphasis
in both toolkits is on quantitative, survey-based methods for vulnerability assessment,
although this document will also discuss contextual methods similar to those used in
livelihoods-based approaches.1 Section 2 maps data sources to stages of the “risk chain,” and
presents an overview of data issues. This section also shows how information on risk, risk
management, and outcomes can be extracted from LSMS-type surveys at both the household
and community levels. Section 3 discusses the use of household data from survey-based
methods in risk and vulnerability assessments, while Section 4 deals with locality data (from
community interviews and secondary sources) and data collected using contextual methods.
Both Sections 3 and 4 illustrate the use of each data source with examples from recent risk
and vulnerability assessments, discuss their advantages and disadvantages, and propose
innovations to improve their usefulness for vulnerability assessments. Section 5 concludes.
An Annex contains sample modules for assessing risk and vulnerability over the long-,
medium-, and short-term.
2. INFORMATION NEEDS FOR VULNERABILITY MEASUREMENT
2.1 Conceptualizing Information Needs Using the “Risk Chain”
The “risk chain” (Heitzmann, Canagarajah, and Siegel 2002) is a useful approach for
conceptualizing vulnerability. It decomposes household vulnerability into three
components: (1) risk, or uncertain events; (2) options for managing risk, or risk responses;
and (3) the outcome in terms of welfare loss. Tables 1, 2, and 3 map data sources to
information needed to analyze each stage of the risk chain (Heitzmann, Canagarajah, and
Siegel 2002) and are organized according to the following row headings:
• Risk, risk exposure, risk realization (Natural and environmental; Social and
political; Lifecycle/demographic; Economic; Health).
• Ex ante and ex post risk management (Asset accumulation and diversification;
Social capital and social relations; Acquisition of knowledge; Livelihood choices;
Preventive actions; Credit; Private Insurance; Private transfers; Public transfers).
• Outcomes (Consumption; Health; Nutrition; Schooling). Note that these
outcomes are a subset of those enumerated under the Millennium Development
Goals.
1 This paper draws heavily on papers and discussions at an International Food Policy Research Institute/World
Bank (IFPRI/WB) Workshop on Risk and Vulnerability: Estimation and Policy Implications held at the
International Food Policy Research Institute in September 2002.
The organization within each heading follows a similar pattern. Each begins with a
set of general categories, then breaks these down into specific examples. For example, one
risk management mechanism is asset accumulation and diversification. Within this group,
we enumerate the following instruments: financial assets, property, equipment, and
household durables.
All three tables contain column headings for data sources: single cross-sections,
panel surveys, community surveys, and secondary sources. Each table may have columns
which are specific to it (for example, a column on the degree of risk covariance in Table 1),
but the data source columns are common to all tables. Because LSMS-type surveys are
increasingly being used for vulnerability assessments, we include two columns listing the
relevant LSMS modules and the reference in the LSMS prototype questionnaire found in the
edited volume by Grosh and Glewwe (2000). Under “Natural and environmental risks” in
Table 1, for example, we list whether data can be found in cross-sections, panels, and
community surveys; the relevant LSMS module would be the community questionnaire.
Under “Property” (Table 2), for example, we list the following LSMS modules containing
relevant information on property holdings: savings, housing, and agriculture (with agriculture
further subdivided into land and livestock). “Reference in LSMS prototype questionnaire”
refers to the section and question numbers within the relevant LSMS module. So
“Agriculture E Q1-Q6” refers to the Agriculture module of the prototype LSMS
questionnaire, part E, Questions 1-6. Note that some of these prototype modules come in
“short,” “standard,” and “expanded” versions. In all cases, we use the “expanded version” or,
where that is not available, the “standard version.” Also note that in a few cases, these
modules are divided into “sections,” or more confusingly, “modules,” as indicated under the
reference to the LSMS prototype questionnaire.
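The reference notation described above can be unpacked mechanically when the tables are used to locate questions in a questionnaire. The following is a minimal sketch, assuming references follow the pattern of the example “Agriculture E Q1-Q6” (module name, part letter, question list); the function names `expand_questions` and `parse_reference` are hypothetical and are not part of any LSMS software.

```python
import re
from typing import Dict, List


def expand_questions(spec: str) -> List[int]:
    """Expand a question list such as 'Q1, Q2, Q4-Q7' into [1, 2, 4, 5, 6, 7]."""
    numbers: List[int] = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        # A single question (Q2) or a range (Q4-Q7); hyphen or en dash accepted.
        match = re.fullmatch(r"Q(\d+)(?:\s*[-–]\s*Q?(\d+))?", item)
        if match is None:
            raise ValueError(f"unrecognized question reference: {item!r}")
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) else start
        if end < start:
            raise ValueError(f"descending range: {item!r}")
        numbers.extend(range(start, end + 1))
    return numbers


def parse_reference(reference: str) -> Dict[str, object]:
    """Split 'Agriculture E Q1-Q6' into module, part, and question numbers.

    Assumed layout (inferred from the example in the text only):
    <module name> <part letter, optionally with a digit and a period> <questions>
    """
    match = re.fullmatch(
        r"(?P<module>.+?)\s+(?P<part>[A-Z]\d?)\.?\s+(?P<questions>Q\d.*)",
        reference.strip(),
    )
    if match is None:
        raise ValueError(f"unrecognized reference: {reference!r}")
    return {
        "module": match.group("module"),
        "part": match.group("part"),
        "questions": expand_questions(match.group("questions")),
    }


if __name__ == "__main__":
    print(parse_reference("Agriculture E Q1-Q6"))
```

References that name a “Section” or “Module” instead of a part letter (as in some rows of the tables) would need an extended pattern; the sketch deliberately rejects them rather than guessing.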
Tables 1 to 3 provide a broad overview of the data needs; the specifics by data source
will be discussed in Sections 3 and 4.
2.2 An Overview of Data Issues
The completeness of information on various stages of the risk chain depends crucially
on the available data, with some stages of the chain covered better than others. Heitzmann,
Canagarajah, and Siegel (2002) and Alwang, Siegel, and Jørgensen (2001) argue that data
and statistics on different types of risks, risk exposure, and outcomes are more readily
available than detailed information on risk responses, which is the most difficult part of the
risk chain to identify and quantify. Part of the difficulty in conducting risk and vulnerability
assessments with existing data is that these data were often not intended for this purpose, and
thus measures of risks and responses to risk will be imperfect. Moreover, without data
collected before and after a shock, it will be difficult to identify whether an action is a
risk-prevention (ex ante) mechanism or a risk-mitigation (ex post) response.
Degree of Covariance of Risks and Shocks
Although econometric approaches do not require any a priori classification of shocks
according to their degree of covariance, such a classification may be useful for systematically
identifying sources of data on risk and degree of risk exposure (Table 1). Shocks can be
classified into idiosyncratic (specific to the individual or household) or spatially covariate
(covering a wider area such as a community, region, or even the nation). For spatially
covariate risks and shocks, community information and secondary sources such as rainfall
and administrative data on wages and prices are a very valuable complement to household
data. By contrast, information on risk management instruments (Table 2) and outcomes
(Table 3) is more likely to be available at the household level, although some risk-
management institutions may operate at the community level, such as public works
programs. One problem with matching household data with secondary data is the difficulty
of mapping and matching localities—often one loses households from surveys because they
do not match the spatially referenced data. Administrative boundaries may also be
misleading when matching rainfall data, where
Table 1. Typical data sources for the identification of risks, risk exposure, and risk realization
Type of risk | Degree of covariance | Cross-sectional individual or household survey | Panel survey | Community survey | Secondary sources | Relevant LSMS modules | Reference in LSMS prototype questionnaire | Comments
Natural and
environmental
Physical location Community yes yes yes Community Section 2. Q1,
Q2, Q4-Q7 Information on
these risks can also
be collected as
retrospective
questions in cross-
section and panel
surveys
Weather shocks (heavy
rainfall, droughts,
hurricanes)
Community retrospective yes yes rainfall data,
weather data Community Section 8. Q12
Natural disasters
(landslides, volcanic
eruptions, earthquakes,
floods)
Community retrospective yes yes historical data,
seismological data Community Section 8. Q2,
Q13-Q16
Crop losses to rodents,
pests Idiosyncratic,
Community yes yes yes Agriculture C1. Q2, Q6
Loss of infrastructure Community yes yes Community Section 3. Q14,
Q16
Pollution Community,
idiosyncratic yes yes yes Community Section 11. Q1-
Q8
Housing B. Q39 – Q41
Environment (air
pollution) Module 3. Q1,
Q2, Q6, Q8, Q9,
Q11, Q13
Water, water pollution Idiosyncratic,
Community yes yes yes Environment (water
pollution) Module 4. Q1-
Q10, Q21, Q22,
Q24, Q25
Environment (water
source) Module 4. Q20,
Q21, Q24
Community Section 7. Q12-
Q24
Housing B. Q1-Q7, Q10-
Q20
Sanitation Idiosyncratic,
Community yes yes yes Community Section 7. Q32,
Q36-Q41
Environment
Module 5. Q4,
Q9, Q18, Q19,
Q45, Q48, Q50,
Q64
Housing B. Q21-Q26
Deforestation Idiosyncratic,
Community retrospective yes yes satellite
photographs
Social and political risks
Ethnic fractionalization Idiosyncratic,
community yes yes yes Community Section 2. Q10 Not collected in
most surveys,
though information
on race, ethnicity,
and religion are
collected
Religious
fractionalization Idiosyncratic,
community yes yes yes Community Section 2. Q11
Linguistic
fractionalization Idiosyncratic,
community yes yes yes Community Section 2. Q12
Crime, gangs Idiosyncratic,
community yes yes yes crime statistics
Domestic violence Idiosyncratic yes yes yes crime statistics Not usually in
survey
questionnaires,
often
underreported in
crime statistics
Terrorism, civil strife,
war Community yes yes yes news reports
Risks in policy
environment: credibility
and commitment to
continue policies
National,
Community yes yes Difficult to
measure, but could
be gleaned from
monitoring policy
changes
Life cycle/demographic
Household size, number
of dependents, recent
births, gender of head,
old age, deaths in family,
family dissolution, etc.
Idiosyncratic yes yes Household roster A. Q1-Q7
National statistics
on infant mortality
rates, maternal
mortality rates, and
life expectancy can
also be a good
source of
information
Women’s access to
resources Community yes anthropological
accounts Community Section 2. Q14 Good candidate for
using qualitative
methods
Economic
Macro shocks: BOP,
financial crisis, currency
crisis, terms of trade
National national accounts
Loss in value of financial
assets or pension funds
linked to inflation, stock
market, or exchange rate
collapses
Usually
covariate, could
be idiosyncratic
yes yes
Risk in asset returns Can be both
idiosyncratic and
covariate
yes yes yes yes Secondary data on
land prices, stock
prices could
provide
information on
returns to different
types of assets
Access to common
property resources;
unclear commitments
regarding public goods
Both
idiosyncratic and
covariate
yes yes yes Community (access
to land) Section 2. Q19-
Q23
Price risk Community yes
Business failure or
indebtedness Idiosyncratic,
community yes yes Household
enterprises H. Q23, Q24
Resettlement Idiosyncratic,
community yes yes
Unemployment Idiosyncratic,
community yes yes labor statistics Employment A. Q12, Q18-
Q21
D. Q27-Q29
Indebtedness Idiosyncratic yes yes Credit A. Q3, Q4, Q6-
Q8, Q10-Q12,
Q14-Q16, Q18-
Q20, Q23, Q25
B. Q1, Q20, Q28,
Q29, Q44, Q52,
Q53, Q68, Q76
Household
enterprises
H. Q23, Q24
Uncertain access to
inputs and cash flow
support during
production
Idiosyncratic yes yes Credit (denial of
credit) C. Q1-Q6
Idiosyncratic yes yes Community (access
to land) Section 2. Q19-
Q23
Idiosyncratic yes yes Household
enterprises
(problems
obtaining inputs)
H. Q4, Q5, Q12,
Q13
Other constraints on
production Idiosyncratic yes yes Household
enterprises
(constraints on
enterprises)
E. Q26-Q33
Idiosyncratic yes yes Household
enterprises (threat
of foreign
competition)
H. Q22
Security of tenure,
property rights Idiosyncratic,
community yes yes yes Agriculture (land
rights) A1. Q9
A3. Q10
Idiosyncratic,
community yes yes yes Agriculture
(livestock died,
stolen, lost)
E. Q2, Q8
Idiosyncratic,
community yes yes yes Household
enterprises
(security of title)
G. Q4, Q5
Idiosyncratic,
community yes yes yes Household
enterprises
(problems
associated with
registering
business)
F. Q31, Q32
Housing
(ownership and
security of title)
C. Q1, Q2, Q6-
Q10
Imperfect enforcement of
contracts and informal
arrangements
Idiosyncratic,
community yes yes yes Employment
(contractual
security)
C. Q16, Q17,
Q85, Q86
D. Q9, Q11,
Q12, Q38, Q39
Uncertainty regarding
rationing in public
support (exclusion from
social safety net)
Idiosyncratic,
community yes yes yes
Health risks
Illness, injury, and
disability Idiosyncratic yes yes Health (self-
reported morbidity) A. Q30-Q47
Epidemic (e.g. malaria) Idiosyncratic,
Community yes yes yes Health (self-
reported morbidity) A. Q30-Q47
Malnutrition Idiosyncratic yes yes national nutrition
surveys Anthropometry Q1-Q8
Substance use Idiosyncratic yes yes possible Health (tobacco
consumption) B. Q2-Q11 Communities may
have perceptions
about the degree of
substance abuse in
their area, although
they may not be
willing to reveal
them in surveys
Health (alcohol
consumption) B. Q14-Q16
Sources: Authors’ compilations; Dercon (2001); Heitzmann, Canagarajah, and Siegel (2002).
Table 2. Typical data sources on ex post and ex ante risk management instruments
Risk management mechanism | Instrument | Type of institution | Cross-section survey | Panel survey | Community survey | Secondary sources | Relevant LSMS modules | Reference in LSMS prototype questionnaire
Asset accumulation,
diversification, and
disposal
Financial assets Savings and
contributions to
savings accounts
Private yes yes Savings (liquid
assets) B. Q1, Q2, Q4
C. Q1-Q3
E. Q1, Q2
Consumption
(contributions to
savings accounts)
C. Q1-Q4
Participation in
ROSCAs Private yes yes Savings (ROSCAs) D. Q1, Q3
Consumption
(contributions to
ROSCAs, tontines,
etc.)
C. Q1-Q4
Credit (loans to
others) Private yes yes Credit (loans to
others) D. Q1, Q2, Q7
Property (acquisition
and disposal) Savings Private yes yes Savings A. Q2, Q3
Housing Private yes yes Housing
(characteristics,
Ownership and
value of dwelling)
A.Q1-Q5, Q11,
Q13-Q16
C. Q1, Q2, Q6-
Q12
Agriculture (value
of landholdings) A1. Q2, Q4, Q9,
Q10
A3. Q2, Q6, Q10,
Q11
Agriculture
(livestock) E. Q1-Q6
Equipment Private yes yes Agriculture (value
of farm capital) B1. Q1-Q6
B2. Q1-Q2
Household
enterprises (Value
of equipment,
inputs, inventory)
G. Q1-Q6, Q37-
Q39
H. Q2, Q10
Household durables Private yes yes Consumption E. Q1, Q2, Q7
Social capital and
social relations Private yes yes
Community
gatherings
Private yes yes yes Community
Section 2. Q16-Q18
Existence of
cooperatives Private yes yes yes Community Section 5. Q22,
Q23
Participation in
community or social
activities
Private yes yes Time use Version 2. Q21,
Q22
Gift giving Private yes yes Savings A. Q13, Q14
Consumption (gifts,
charity) C. Q1-Q4
D. Q2, Q14-Q20
Agriculture (land) A3. Q2, Q17
Dowry, brideprice Private yes yes Consumption C. Q1-Q4
Acquisition of
knowledge Private yes yes
Access to new
technologies Private, public yes yes yes Community Section 5. Q25,
Q26
Access to agricultural
extension Public yes yes yes Community Section 5. Q19-
Q21
Agriculture F. Q1-Q4, Q7-
Q10, Q13-Q15,
Q18, Q19
Knowledge of STDs,
HIV/AIDS Private, public yes yes yes Health B. Q24-Q26
Community
(provision of
information about
AIDS)
Section 9. Q18
Livelihood choices
Access to
employment outside
the community
Private yes yes yes Community Section 4. Q8-Q20
Household roster
(migration of
household
members)
A.Q8-Q11
C. Q1-Q14
Occupational choice
and diversification,
including child labor
Private yes yes yes Employment B. Q1-Q4
C. Q1-Q4
D. Q1-Q4, Q31-
Q34
Time use Version 2. Q1-Q3
Agricultural
diversification Private yes yes yes Agriculture A1. Q6
Enterprise
diversification Private yes yes yes Household
enterprises A.Q1, Q2
B. Q1, Q3
C. Q6, Q8
D. Q1, Q3, Q4,
Q5, Q6, Q7, Q9,
Q10, Q11, Q14-
Q17
Preventive actions Risk minimization in
agriculture Private yes yes yes Community
(availability of
irrigated land)
Section 5. Q29-
Q33
Agriculture
(household access
to irrigation)
A1. Q4, Q7
A3. Q6, Q8
Agriculture (land
improvement) A1. Q12.
A3. Q13-Q15
Agriculture (use of
pesticides,
herbicides,
fungicides)
D2, Q19, Q20,
Q30, Q31
Treating water
supplies Private, public yes yes yes Data on
infrastructure Environment Module 4. Q20,
Q21, Q24
Housing B. Q8, Q9
Sanitation Private, public yes yes yes Data on
infrastructure Environment Module 5. Q4,
Q26, Q39
Housing B. Q21-Q26
Health and child care
(checkups, proper
feeding and weaning
practices)
Private yes yes yes Health facility
data, especially
service statistics
Time use (child
care) Version 2. Q14
Time use (health
care) Version 2. Q15,
Q20
Health (child
immunization) C. Q1-Q11
Fertility (pre-natal
care) B. Q2-Q5
Fertility
(breastfeeding) B. Q2, Q7-Q9
Physical fitness Private yes yes yes
Health (physical exercise, participation in sports) B. Q17-Q23
Contraception Private yes yes Fertility C. Q1-Q4
Community health
programs Private, public yes yes yes Community
(general) Section 8. Q7-Q8
Community
(outreach and
immunization)
Section 9. Q10-
Q16
Credit
Community access to
savings instruments Public, private yes Community Section 6. Q1
Community access to
public/private credit
institutions
Public, private yes Community Section 6. Q2-Q9
Household access to
credit Private yes yes yes Credit B. Q53, Q68
Private Insurance
Health insurance Private, public yes yes Health D. Q2-Q17
Sick leave Private yes yes Employment C. Q48
Pension entitlement Public, private yes yes Employment C. Q49
Sharecropping Private yes yes Agriculture A2. Q13-Q17
Private transfers
Receipt of private
transfers Private yes yes Transfers and non-
labor income A. Q2-Q20
Consumption (food
gifts received) B. Q1, Q10
Consumption (non-food gifts received) C. Q1, Q5, Q6
Education (payment
of school fees) B. Q3-Q5
Access to transfers
from siblings Private yes yes Household roster
(siblings living
elsewhere)
D. Q1-Q11
Public transfers
Existence of public
works programs Public yes yes yes Community Section 4. Q32-
Q35
Entitlement to social
security Public, private yes yes Employment B. Q6
C. Q8, Q81
D. Q8
Receipt of public
transfers (including
food aid and disaster
relief)
Public yes yes yes Transfers and non-
labor income B. Q1-Q9
Receipt of
scholarship,
conditional cash
transfers
Public, private yes yes yes Education B. Q6, Q7
Sources: Authors’ compilations; Heitzmann, Canagarajah, and Siegel (2002).
Table 3. Typical data sources for outcomes
Outcome and indicator | Single cross-section | Panel | Community | Secondary | Relevant LSMS modules | Reference in LSMS prototype questionnaire
Consumption
Food consumption¹ | yes | yes | | | Consumption | A. Q3, Q4; B. Q1-Q10
Consumption of nonfood goods¹ | yes | yes | | | Consumption | A. Q1, Q2; C. Q1-Q6
Expenditures on schooling | yes | yes | | | Education | B. Q2; E. Q18
Expenditures on health (fees, transport, medicines) | yes | yes | | | Health | E. Q8, Q14-Q19, Q27, Q33-Q38, Q46, Q52-Q57, Q65-Q68, Q76-Q79, Q87-Q94
Expenditures on housing² | yes | yes | | | Housing | C. Q3-Q5, Q16-Q40
Employment-related expenditures | yes | yes | | | Employment | C. Q13, Q23, Q42
Health
Self-assessment of health status | yes | yes | | | Health | A. Q3-Q4
Self-reported limitations in daily activities | yes | yes | | | Health | A. Q11-Q29
Self-reported morbidity | yes | yes | | | Health | A. Q30-Q47
Incidence/severity of diarrhea | yes | yes | | | Health | A. Q6-Q9
Respiratory illness | yes | yes | | | Environment | Module 3. Q5-Q8
Observed activities of daily living | yes | yes | | | Health | G. Q1-Q11
Cognitive functioning³ | | | | | | H. Q1-Q32
Nutrition
Child height for age⁴ | yes | yes | | | Anthropometry | Q1-Q7
Child weight for height⁴ | yes | yes | | | Anthropometry | Q6-Q8
Child weight for age⁴ | yes | yes | | | Anthropometry | Q1-Q5, Q8
Adult BMI | yes | yes | | | |
Schooling
Grades attended | yes | yes | | official education statistics | Education | A. Q3-Q26
Attending school | yes | yes | | official education statistics | Education | A. Q28-Q50
Notes:
1. Note that total household consumption must be “built up” from these LSMS modules. See Deaton and Grosh (2000) for a discussion.
2. Note that imputing a value to owner-occupied housing is difficult. See Deaton and Grosh (2000) and Malpezzi (2000) for a discussion.
3. This is administered only to respondents over 40.
4. These measures should be converted to z scores that standardize measures across children of different ages and sex.
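As an illustration of the z-score conversion mentioned in the note on child anthropometry, a common simplified form is given below. The notation is ours, not the paper's: $x_i$ is child $i$'s measurement (for example, height), and $m_{\mathrm{ref}}$ and $\sigma_{\mathrm{ref}}$ are the median and standard deviation of that measurement in a reference population of children of the same age $a_i$ and sex $s_i$. The exact computation depends on the growth reference adopted, so this should be read as a sketch rather than a prescribed formula.

```latex
z_i = \frac{x_i - m_{\mathrm{ref}}(a_i, s_i)}{\sigma_{\mathrm{ref}}(a_i, s_i)}
```

Because the reference median and standard deviation vary with age and sex, the resulting $z_i$ values are comparable across children in a way that raw heights and weights are not.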
topography plays a more important role. Prior knowledge regarding the degree of
covariance of shocks may help inform data designers regarding the level of aggregation:
if shocks are highly covariate, it may be more cost-effective to collect data at a higher
level of aggregation.
In practice, however, even within well-defined rural communities, variance
decomposition analysis reveals that few risks are purely idiosyncratic or common.
Variance decomposition analysis involves computing the contribution of village-level
variance to total variance: the lower its contribution, the more idiosyncratic the shock.
Dercon (2002), drawing from his work in rural Ethiopia (Dercon and Krishnan 2000),
finds that most shocks have both idiosyncratic and common parts. In Ethiopia, for
example, rainfall variation had a large covariate component (the village-level variance
accounted for 41 percent of the variation in the individual rainfall index), while number
of days lost due to illness was clearly an idiosyncratic shock (the village-level variance
accounted for only 5.2 percent of total variance). A priori classifications may also
misclassify shocks. In the Guatemala poverty assessment, shocks were classified a priori
into idiosyncratic or covariate (Tesliuc and Lindert 2002). However, a variance
decomposition test showed that location alone explained less than 25 percent of all
shocks that were classified as covariate (except inflation). The shocks with a high degree
of covariance at the local level were bad harvests and income losses, which were
classified a priori as idiosyncratic. Respondent reports of the impact of shocks may have
systematic biases. Again, in the Guatemala poverty assessment, respondents tended to
“complain” about covariate shocks and to be more “honest” about the impact of
idiosyncratic shocks (the share of covariate shocks that had no negative impact on
household income or wealth was significantly larger than the equivalent share of
idiosyncratic shocks).
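The variance decomposition described above can be sketched in a few lines. This is a minimal illustration, assuming the village-level contribution is measured as the between-village sum of squares divided by the total sum of squares (equivalent to the R-squared from regressing the shock on village indicators); the studies cited may have used a different estimator, and the function name `village_variance_share` is hypothetical.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def village_variance_share(values: Sequence[float], villages: Sequence[Hashable]) -> float:
    """Share of total variation in a shock measure explained by village means.

    Values near 1 suggest a largely covariate shock; values near 0 suggest a
    largely idiosyncratic one.
    """
    if len(values) != len(villages) or len(values) == 0:
        raise ValueError("values and villages must be non-empty and of equal length")

    grand_mean = sum(values) / len(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    if total_ss == 0:
        raise ValueError("the shock measure does not vary across households")

    # Group household-level observations by village.
    by_village = defaultdict(list)
    for value, village in zip(values, villages):
        by_village[village].append(value)

    # Between-village sum of squares, weighted by village sample size.
    between_ss = sum(
        len(obs) * (sum(obs) / len(obs) - grand_mean) ** 2
        for obs in by_village.values()
    )
    return between_ss / total_ss


if __name__ == "__main__":
    # Illustrative (made-up) household shock index in two villages.
    shock = [1.0, 3.0, 3.0, 5.0]
    village = ["A", "A", "B", "B"]
    print(village_variance_share(shock, village))
```

With very few households per village, this share is biased upward because sampling noise in the village means is counted as between-village variation, so small shares are more informative than large ones in thin samples.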
Ex Post versus Ex Ante Mechanisms
There are a number of reasons why information on risk responses is difficult to
obtain. First, obtaining information on expectations is inherently difficult. A person’s
answer regarding questions regarding “expected yield” or whether it was a “normal” year
involves three elements: the person’s understanding of the objective distribution of risks,
the person’s own response, and the person’s own risk preferences. How does one get at
the probability distribution of the risk? Is it a one-time event, or something that occurs
with some regularity? In Bangladesh, for example, floods occur yearly, but the 1998
floods were memorable because of their severity. By asking about well-defined, specific
recent events, one can get some idea about the risk distribution that the person faces.
Discrete events can also be recalled over a longer period than recurring events. Second,
depending on the timing of the survey (see the next point below), a response could be
identified either as an ex ante or an ex post response. Take as an example membership in
a rotating savings and credit association (ROSCA). Suppose that the member was interviewed prior
to a shock that prompted her to withdraw funds from the ROSCA. In that case,
membership in the ROSCA would be interpreted as an ex ante risk management
mechanism. However, suppose she was interviewed after a shock, and she had just
withdrawn funds from the ROSCA. Without knowing the date that she joined the
ROSCA in reference to the timing of the shock, it would be difficult to establish whether
a particular mechanism was used ex ante or ex post.
Timing and Frequency of Surveys
As already hinted at above, the timing of the survey work is important. Shocks
work with time lags, and have different distributions. Because shocks are, by definition,
unanticipated, it is often pure coincidence that a survey will be able to capture
information on shocks (particularly if it is a one-time shock) unless the survey was
conducted for that purpose. A case can certainly be made for shorter surveys that are
fielded more frequently for monitoring purposes, rather than long surveys that are fielded
at longer intervals.
Cross-Validation of Responses
Cross-validation is important if different data sources are inconsistent. For
example, there may be disagreement between household-level data and cluster-level data.
Depending on how geographic boundaries are drawn and where the household is
actually located, administrative data may not be relevant to households in a particular
cluster, if households (for example) obtain public services from a municipality other than
their official place of residence. Cross-validation within the household may also be
necessary. Often, we rely on the head of the household to report on assets or risk
responses of other household members. Evidence from Indonesia (Thomas and
Frankenberg 1999) suggests that husbands tend to underestimate their wives’ asset
holdings and vice versa.
Types of Data and Methods of Data Collection2
As mentioned in the introduction, the emphasis in this toolkit is on data from
household surveys, supplemented by data from secondary sources. It is useful to
distinguish types of data from methods of data collection. Data can be classified into
quantitative or qualitative; methods into noncontextual and contextual. In a survey-based
context, quantitative data measure the degree to which a feature is present, while
qualitative data are numeric observations that denote the presence or absence of a
characteristic or membership in a particular category. Qualitative data can be analyzed
using quantitative methods, e.g., they can be used to calculate percentages, frequencies,
chi-squares, or other statistics (Chung 2000, p. 337). Qualitative data are also defined in
terms of textual or visual data that have been derived from interviews, observations,
documents, or records. While these data are often associated with methods that require
“intensive, often repeated encounters with small numbers of people in their natural
environment” (Chung 2000, p. 337), a distinction between survey-based and contextual
methods (Hentschel 1999; Moser 2001) is more useful. Contextual methods are those
that attempt to understand human behavior within the social, cultural, economic, and
political environment of a locality (Hentschel 1999, p. 71). Survey-based methods, on
the other hand, involve structured interviews of a representative household sample to
obtain information on a range of questions, and preformulated, closed-ended, and
codifiable questions are usually asked to one household member (often the head) during
one or two visits. In the remainder of this paper we discuss six types of data and the
ability of each of these data sources to shed light on each stage of the risk chain.
2 For a more comprehensive discussion of types of data and methods of data collection, see Booth et al.
(1998), Hentschel (1999), and Moser (2000). A thorough discussion of qualitative methods in the context of
the LSMS is found in Chung (2000).
3. HOUSEHOLD DATA FROM SURVEY-BASED METHODS
3.1 Single Cross-Section of Households
A cross-section survey of households, conducted at a single point in time, is often
the only data source for conducting risk and vulnerability assessments. While adequate
for a poverty assessment, a single cross-section is problematic for measuring
vulnerability because of the absence of data from more than one point in time—that is,
this data set does not have any intertemporal variability. Consequently, users of single
cross-sections have used cross-sectional variability as a proxy for intertemporal
variability, as explained in our companion paper, Hoddinott and Quisumbing (2003).
However, single cross-sections can still be used for vulnerability assessments if
they are supplemented with other data sources, such as historical or time-series data on
cropping patterns and weather events. These could inform the simulations described in
Section 5 of Hoddinott and Quisumbing (2003). They can also be supplemented by
qualitative, contextual studies. If the analyst knows beforehand that risk and
vulnerability measurement is one of the objectives of conducting the household survey,
retrospective questions can be included to capture, albeit imperfectly, information about
past shocks as well as ex ante coping mechanisms.
Uses of This Data Source
We use an example from a recent vulnerability assessment conducted by the
World Bank in Guatemala (Example 1) and a study of risk-coping in Ethiopia (Example
2) to illustrate the uses of this data source.
Advantages and Disadvantages
The main advantage of a single cross-sectional household survey is its
availability, and, if it is designed to be nationally representative, its ability to capture the
diversity of living conditions of households within a country. As shown in Tables 1, 2,
and 3, most household surveys under the Living Standards Measurement Study (LSMS)
Example 1. Risk and vulnerability assessment in Guatemala
The recently completed vulnerability assessment in Guatemala (Tesliuc and Lindert 2002) illustrates
how creative use of a single cross-section survey, combined with a qualitative study, can provide a wealth
of information on risks and coping mechanisms. This study combined quantitative data from the Living
Standards Measurement Study and qualitative information from an in-depth qualitative study of poverty
and exclusion conducted in 10 villages in Guatemala. Both data sources were designed to capture issues
related to vulnerability, risks, and risk management.
The quantitative survey included a risks and shocks module, in which households were asked to report
if they had experienced a shock during the previous 12 months, using precoded questions for 28 economic,
natural, social/political, and life-cycle shocks. These shocks were classified ex ante into covariant and
idiosyncratic shocks. Households also reported: (1) whether these shocks triggered a reduction or loss of
their income or wealth; (2) the main strategy that they used to cope with their welfare loss; (3) if they had
succeeded in reversing the reduction or loss in their welfare by the time of the survey, and (4) the estimated
time that had elapsed until successful resolution of the situation. Information on covariant shocks was also
collected from the community questionnaire at the survey cluster level.
The vulnerability assessment included several types of analysis of shocks and their impact, including
(1) factor analysis to understand the correlation structure or “bunching” of shocks; (2) a multivariate
logistic model to examine the association between a household’s characteristics and location and the
probability that it reports a shock or incurs wealth and income losses due to the shock and the probability
that it has recovered from the negative impact of the shock by the time of the interview; (3) nonparametric
density estimation to estimate the counterfactual density of consumption or income; (4) multiple regression
analysis to estimate the cost of shocks; (5) propensity score matching to estimate the cost of shocks; and (6)
multiple regression analysis to estimate vulnerability to consumption poverty.
Example 2. Risk-related hardship faced by rural households in Ethiopia
A retrospective module on shocks that caused serious hardship in the last 20 years was administered
during the third round of the Ethiopian Rural Household Survey (Dercon and Krishnan 2000). The list was
based on pilots using open-ended questions. Respondents were asked whether each event had caused very serious hardship in the
last 20 years and, if so, to name the years in which it occurred; simple landmark dates were used to
help respondents date events during the interviews. The largest percentage of households mentioned
harvest failure due to environmental factors, with the modal year being 1984, the year of the famine. Illness and
deaths affecting household labor and shocks to livestock holdings were also very common. But
outcomes due to the individual-specific effects of “policy” are also high—higher than war or banditry,
despite years of civil war in this period. While rural policy was especially restrictive in this period, this
effect also measures the unpredictability of the impact of policy, with specific measures at the village level,
in the form of villagization or taxation (often in the form of forced deliveries of grain) affecting people in
unexpected and often arbitrary ways.
Events causing hardship                                       Percentage of households seriously
                                                              affected in last 20 years
Harvest failure (drought, flooding, frost, etc.)              78
Policy shock (taxation, forced labor, ban on migration...)    42
Labor problems (illness or deaths)                            40
Oxen problems (diseases, deaths)                              39
Other livestock (diseases, deaths)                            35
Land problems (villagization, land reform)                    17
Asset losses (fire, loss)                                     16
War                                                            7
Crime/banditry (theft, violence)                               3
Source: Dercon 2002.
or patterned after the LSMS, even if not designed specifically to examine risk and
vulnerability, are quite rich in information on various parts of the risk chain. In
particular, they provide information on consumption or measures of household well-
being, household demographic characteristics, livelihood strategies, and coping
mechanisms.
The primary drawback of using a single cross section is, of course, the absence of
temporal variability. As explained in Hoddinott and Quisumbing (2003), identifying the
household characteristics that are associated with vulnerability requires making strong
assumptions about the stochastic process generating consumption, in particular assuming
that the cross-sectional variance can be used to estimate inter-temporal variance. While
the cross-sectional variance can explain that portion of intertemporal variance due to
idiosyncratic components or cluster-specific shocks, it will not capture intertemporal or
aggregate (household invariant but time-varying) shocks. It may produce good estimates
of vulnerability if the distribution of risks and risk management instruments is similar
over time (Tesliuc and Lindert 2002), if the macroeconomic environment is stable and if
shocks do not generate survivorship bias. Single cross-sections are less well suited to capturing the
impact of large aggregate shocks, such as the late 1990s East Asia crisis, unless the
sample size is especially large and the data on consumption especially detailed (Friedman
and Levinsohn 2002; Hoddinott and Quisumbing 2003).
Issues and Innovations
With a single cross-section, analysts will have to depend on a good retrospective
module to obtain a history of shocks and responses to shocks and secondary data
(administrative data, rainfall data) to get an idea of the distribution of the shocks.
• The phrasing of the question has important implications for the accuracy of
the response.
• Amendments of basic questionnaire modules to capture risk and vulnerability
issues might be cost effective, for example, amending the household roster to
get more information on orphans (e.g., whether one parent is deceased, when
the child joined the household, etc.).
• To get an idea about the distribution of risks:
(a) Make a list of potential events and go through the list with the respondent;
(b) Ask when the event occurred in relation to seasonal events (for example,
did the malaria episode hit during the peak labor demand season?);
(c) Ask what people are doing to prevent the risky event or to cope with it;
(d) Ask about the likelihood of the shock occurring again.
• Evaluate the pros and cons of prelisting risks and asking respondents to rank
them in importance. Soliciting a list from the respondent may bring up things that
the researcher is unaware of, but the list may exclude important risks that are
likely to be sensitive (e.g., risk of domestic violence). Also, lists are likely to
be highly correlated with the immediate past. Risk-ranking exercises are
useful to find out about what has happened recently, but may not be a good
predictor of vulnerability to future events.
• Adjust the recall period to the frequency of the shock, with longer periods for
catastrophic shocks and shorter periods for frequent shocks.
• In designing the questionnaire, use known, discrete events as “signposts” for
recalling smaller shocks (e.g., date of Independence, the 1998 floods in
Bangladesh, Hurricane Mitch).
• Collect information on the frequency and severity of shocks and their
correlation with other shocks. Households may report the same shock, even
though the characteristics and severity of the shock may be different. Self-
reported information is often subject to respondent bias, with richer
respondents tending to “complain” more than poorer respondents (Hoddinott
and Quisumbing 2003; Tesliuc and Lindert 2002).
• Cross-validate reported shocks with secondary sources such as administrative
data, rainfall data, seismological data, and historical records. Cross-validate
household reports with community reports.
• Use a specially designed qualitative/contextual study to get at aspects of risk
and vulnerability that are difficult to capture using household surveys.
3.2 Repeated Cross-Sections
A number of countries have household surveys that are undertaken at regular
intervals, but that are not panel surveys because they do not return to the same
households. Examples of these are the Family Income and Expenditure Surveys in the
Philippines, the Welfare Monitoring Surveys in Kenya (Example 3), and the SUSENAS
surveys in Indonesia. The Living Standards Measurement Studies in Côte d’Ivoire used a
rotating panel that results in a mix of longitudinal and repeated cross-sectional data. If
the repeated cross-sections are drawn from the same sampling frame, then cluster panels
can be created, permitting an analysis of intertemporal variation within the cluster, even
if the households covered within each cluster may be different.
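The construction of a cluster panel can be sketched as follows. This is an illustrative outline, not the procedure used in any of the surveys named above: it assumes each household record carries a cluster identifier, a survey year, and one variable of interest, and it ignores sampling weights. The function names and the data are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def cluster_means(households):
    """Average a household variable within each (cluster, year) cell."""
    cells = defaultdict(list)
    for cluster, year, value in households:
        cells[(cluster, year)].append(value)
    return {cell: fmean(values) for cell, values in cells.items()}


def cluster_panel(households, years):
    """Return {cluster: [mean in each year]} for clusters seen in every year."""
    means = cluster_means(households)
    clusters = {cluster for cluster, _ in means}
    return {
        c: [means[(c, y)] for y in years]
        for c in sorted(clusters)
        if all((c, y) in means for y in years)
    }


# Hypothetical records: (cluster, survey year, consumption per adult equivalent).
# Different households are interviewed in each round, but clusters repeat.
households = [
    ("c1", 1994, 90.0), ("c1", 1994, 110.0), ("c1", 1997, 120.0),
    ("c2", 1994, 80.0), ("c2", 1997, 70.0), ("c2", 1997, 90.0),
    ("c3", 1997, 60.0),  # cluster surveyed in one round only, so it is dropped
]

panel = cluster_panel(households, years=[1994, 1997])
for cluster, series in panel.items():
    print(cluster, series)
```

Each retained cluster then has one observation per survey round, so changes over time can be studied at the cluster level even though no household is observed twice.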
Advantages and Disadvantages
Repeated cross-sections clearly have an advantage over a single cross-section by
being able to capture intertemporal variation. Unlike panel data, which are relatively
rare, repeated cross-sections are more readily available, being part of many countries’
regular statistical activities. Construction of cluster or community averages also is a way
of creating observations on a whole range of variables over time, when panel data are not
available. If the sample sizes are large enough, repeated cross-sections can be used to
create pseudo-panels of cohorts.
Issues and Innovations
How useful are cluster data for making inferences about household vulnerability?3
The basic assumption underlying this approach is that each cluster represents a
“representative household.” One concern is that each “representative household” in fact
represents a different number of households. This would be a problem if households varied
widely in their characteristics and behavior across clusters, and if clusters were given
equal weights in the regression analysis. However, even if each cluster did not consist of
the same number of households, or if clusters were of different size, this concern can be
3 This discussion draws heavily from Christiaensen and Subbarao (2001).
Example 3. Examining vulnerability to risk in Kenya
Christiaensen and Subbarao (2001) use the 1994 and 1997 Welfare Monitoring Surveys in Kenya to
construct a vulnerability profile and examine the determinants of vulnerability. Both the 1994 and 1997
WMS were collected on the same clusters, even though the households covered in each cluster may have
differed. The authors estimate the coefficients of the determinants of the ex ante mean and variance of
each cluster’s average per adult equivalent household consumption in 1997 based on its average household
and locality characteristics in 1994. The dependent variable is real expenditure per adult equivalent in
1997, while the regressors fall into three categories: risk factors, risk exposure, and coping capacity (see
below). The analysis is conducted separately for non-arid and arid/semi-arid areas. The authors then
construct a vulnerability profile by substituting the 1994 socioeconomic characteristics of each community
in the estimated regression equations. Assuming that consumption is lognormally distributed and using the
official poverty line as a reference, a community was considered to fall under the poverty line if future
consumption of its households was on average below the poverty line. In 1994, communities, on average,
had a 35 percent chance of falling below the poverty line (V0) in 1997. Using a threshold of 50 percent, a
community is classified as vulnerable if it has a chance of 50 percent or more to fall below the poverty line
in the future. Almost a fifth (19.3 percent) of all communities fall into this category. The authors also
compute different statistics based on the variants of the Pα measure: the expected gap (V1), the
conditional expected gap (V1/V0), and the normalized expected gap squared (V2).
How useful are these estimates for vulnerability analysis? The usefulness of these measures depends
on their ability to predict future poverty. The Spearman correlation coefficients between the 1994
vulnerability measures V0, V1, and V2 and actual average consumption of the community in 1997 are
-0.47, -0.44, and -0.42, respectively. The average probability of shortfall (35 percent) is nearly identical to
the proportion of communities that fell below the poverty line in 1997, and the average expected shortfall
is also close to the actual consumption gap in 1997. Contingency table analysis also shows that more than
two-thirds of the clusters are correctly classified as poor or nonpoor by the vulnerability measure, although
the classification is sensitive to the vulnerability cutoff used.
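The probability-of-shortfall measure (V0) and the 50 percent classification rule can be sketched as below. The sketch assumes that the predicted mean and variance of log consumption for each community are already available from regressions of the kind described above, and that log consumption is normally distributed. The function names, the poverty line, and the three sets of predicted values are hypothetical and are not taken from Christiaensen and Subbarao (2001).

```python
from math import log, sqrt
from statistics import NormalDist, fmean


def prob_below_line(mean_log_c, var_log_c, poverty_line):
    """V0: probability that consumption falls below the poverty line
    when log consumption is Normal(mean_log_c, var_log_c)."""
    z_score = (log(poverty_line) - mean_log_c) / sqrt(var_log_c)
    return NormalDist().cdf(z_score)


def is_vulnerable(v0, threshold=0.5):
    """Classify a community as vulnerable if V0 reaches the threshold."""
    return v0 >= threshold


POVERTY_LINE = 1000.0  # hypothetical poverty line in local currency

# Hypothetical predictions: (community, mean of log consumption, variance).
predictions = [
    ("community_1", log(1500.0), 0.16),
    ("community_2", log(1000.0), 0.25),
    ("community_3", log(800.0), 0.09),
]

v0_by_community = {
    name: prob_below_line(mu, var, POVERTY_LINE)
    for name, mu, var in predictions
}
vulnerable = [name for name, v0 in v0_by_community.items() if is_vulnerable(v0)]

print({name: round(v0, 3) for name, v0 in v0_by_community.items()})
print("average V0:", round(fmean(v0_by_community.values()), 3))
print("vulnerable communities:", vulnerable)
```

Changing the threshold argument shows how sensitive the classification is to the vulnerability cutoff, which is the caveat noted at the end of the example.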
Variables related to risk factors, risk exposure, and coping capacity of rural, nonpastoralist
communities in Kenya are
• Dependent variable
    Real expenditure (Ksh) per adult equivalent in 1997
• Risk factors
    1996 rainfall below 70 percent of historical average
    Percent adult members/household with fever/malaria during last two weeks in 1994
• Risk exposure
    Landholdings (acres) per adult equivalent in 1994
    Fertilizer use per adult equivalent in 1994
    Percent adult unskilled public-sector workers/household in 1994
    Percent adult skilled private-sector workers/household in 1994
    Percent adult unskilled private-sector workers/household in 1994
    Income share from pensions in 1994
    Income share from nonagricultural activities (excluding pensions and transfers) in 1994
• Coping capacity
    Household size in 1994
    Dependency ratio in 1994
    Percent literate adults/household in 1994
    Number of animals per household adult equivalent
    Use of electricity for lighting or cooking in 1994
    Time to food market (as recorded in 1997 WMS)
addressed in the regression analysis using sampling weights. A second concern, raised in
the context of cross-country studies (Behrman and Deolalikar 1988) is that the use of
average data may be misleading if the distributional issues are important and if the
distribution is different across clusters. Even though households within clusters tend to
be more homogeneous than households within countries—and thus distributional
differences are of less concern—it may be advisable to do a variance decomposition for
some measures of interest to see whether intra-cluster variability is greater than inter-
cluster variability. So long as the distribution of “representative households” reflects the
distribution of household and locality characteristics, the estimated coefficients of the ex
ante mean and variance of future consumption will provide a good indication of the
relative importance of the determinants of household vulnerability.
In the example above, Christiaensen and Subbarao estimate vulnerability
(expected poverty) in 1997 using 1994 values as regressors. To construct a vulnerability
profile at a later point in time, one would need to recalibrate the model using the latest
household survey data. The timeliness of a vulnerability profile constructed using earlier
data depends on the temporal stability of three factors: (1) estimated returns to
demographic and social characteristics of households and their communities as well as
natural shocks occurring between the period on which analysis is based and the future
period for which vulnerability is being assessed; (2) rainfall and weather patterns between
the two time periods; and (3) household and community characteristics themselves. If
shocks cause widespread dissolution of households, changes in household structure, and
changes in the structure of communities, vulnerability profiles based on an earlier
distribution of characteristics will no longer be valid.
To summarize, repeated cross-sections can be a useful tool for vulnerability
analysis if:
• The same sampling frame is used for both cross-sections, and clusters are
relatively homogeneous;
• The appropriate sampling weights are used in the analysis;
• No significant changes took place in economic, social, and political
institutions that may alter the returns to household and community
characteristics;
• Rainfall and weather patterns are stable;
• Household and community characteristics are relatively stable in the face of
shocks which occurred in the intervening period;
• Where more recent data are available, for example, on rainfall, coefficients
can be estimated based on the more recent data; and
• The model is recalibrated for the time period for which a vulnerability
prediction is required based on the most recently available data.
3.3 Panel data4
A number of the vulnerability measures discussed in the companion document to
this toolkit, such as those that examine both ex ante poverty and household responses
to risk, are best estimated using panel data (Examples 4 and 5).
Example 4. Welfare losses from poverty and risk in Bulgaria
Ligon and Schechter (2002) define vulnerability as the sum of both losses due to poverty and losses
due to risk exposure. They use monthly data from the Household Budget
Survey in Bulgaria, collected over 12 months, to estimate their vulnerability measure. They divide
idiosyncratic risk into three parts: risk arising from variation in the income stream, from changes in the
number of pensioners in each household, and from changes in the number of unemployed persons in the
household. They also attempt to measure the contribution of various components of the vulnerability
measure to overall vulnerability, using both total consumption and food consumption. They find that for
both measures, poverty is the largest single component of vulnerability. After that, unexplained risk is the
second largest component, and aggregate risk is the third largest component—thus aggregate risks are
more important than idiosyncratic sources of risk. They then regress each element of vulnerability on a set
of observable household characteristics. They find that households headed by an employed, educated male
are less vulnerable to aggregate shocks than are other households. They also find that the correlates of
vulnerability are extremely similar to the correlates of poverty; moreover, the correlates of aggregate risk
are the same as the correlates of poverty. This is not surprising since aggregate shocks are, by definition,
the same for all households, and so poorer households will experience a greater impact on their utility from
this component of risk.
4 This discussion draws heavily from Glewwe and Jacoby (2000).
Example 5. Effects of shocks on adult body mass in rural Ethiopia
Dercon and Krishnan (2000) use panel data to explore determinants of adult Body Mass Index (BMI)
movements across seasons in Ethiopia. They argue that predictable movements in relative prices and wages
could affect the optimal path of nutritional status over the seasons, with price variability and differential
returns to labor in off-peak and peak seasons encouraging the use of the body as a store of energy, provided
that returns to other liquid assets are low, resulting in different ‘optimal’ weights across the seasons: feast
when prices are low and fast when prices are high. They find evidence that this occurs; BMI increases by
about 0.5 percent during periods of peak labor needs and up to 0.5 percent in the period immediately after
the harvest. The effects are typically only significant and large for households with low landholdings, so
that “feast now, fast later” is a strategy typically used by poorer households. This use of body weight as a
consumption-smoothing device is consistent with imperfections in risk management strategies, suggesting
that returns to assets, food stocks and returns to using the body as a store of energy are not integrated and
arbitrage is profitable. Further, Dercon and Krishnan (2000) show that rainfall shocks significantly affect
BMI. Even though rainfall was relatively favorable in the period of their study, relatively poor rainfall in
some communities lowered BMI by about 0.9 percent for households with low landholdings, suggesting
that the absence of effective risk management strategies was costly in terms of adult health.
The impact of shocks during 1994/95 on the BMI of Ethiopian adults
Source of shock and fluctuation (by group)                Estimated coefficient   t-statistic
Selected community and household-level shocks
  Peak labor period for males                              0.0039                  1.70*
  Peak labor period for females                            0.0050                  2.10**
  Postharvest period (land rich household)                 0.0015                  0.69
  Postharvest period (land poor household)                 0.0049                  2.45***
  Rain shock (land rich households)                       -0.0012                 -0.14
  Rain shock (land poor households)                        0.0089                  2.01**
Individual specific shocks
  Own illness if male in South (land rich household)      -0.0010                 -0.95
  Own illness if male in South (land poor household)       0.0001                  0.04
  Own illness if female in South (land rich household)    -0.0022                 -1.17
  Own illness if female in South (land poor household)    -0.0042                 -5.90***
  Own illness if male in North (land rich household)       0.0013                  1.17
  Own illness if male in North (land poor household)      -0.0016                 -1.35
  Own illness if female in North (land rich household)     0.0004                  0.32
  Own illness if female in North (land poor household)    -0.0007                 -0.81
Notes:
1. The dependent variable is the natural logarithm of BMI. Sample size is 1,787.
2. The results are based on a model regressing the change in BMI on the previous level of BMI, shocks
and a number of time-varying control variables, as well as controlling for fixed effects in the change in
BMI. The Arellano-Bond GMM estimator is used. Group-specific effects are obtained via interaction
terms.
3. Land poor households have less than the median level of land per adult per village.
4. * Significant at the 10 percent level; ** significant at the 5 percent level.
5. Further details and full results in Dercon and Krishnan (2000).
Advantages and Disadvantages
Although a series of repeated cross-sections could lend itself to synthetic cohort
analysis, panel data have a number of advantages for undertaking risk and vulnerability
assessments: (1) in the absence of measurement error, panel data enable more precise
estimation of changes in variable means; (2) they are suited to estimating changes at the
individual level whereas repeated cross-sectional surveys only permit comparisons over
time across broad groups; (3) they provide more accurate data on past events than
retrospective surveys; and (4) they may be cheaper to collect than repeated cross-sections
since a subset of basic information will not need to be collected, but rather updated.
Since panel data are, essentially, a series of surveys on the same cross-sectional
units, any type of analysis that can be done with a single cross-section, or a series of
cross-sections, can also be done with panel data. One advantage of panel data is that they
permit correction for unobserved household-level characteristics that may be correlated
with the error terms in the regression. The standard procedure used has been fixed-
effects estimation. However, fixed-effects estimates based on panel data may not be the
best or only solution to the problem of unobservables. First, fixed-effects estimates rely
on the assumption that the unobserved factors that may affect outcomes are fixed over
time, which is not necessarily the case. Second, fixed-effects procedures do not
necessarily require individual-level panel data—repeated surveys of communities with
different households could tease out unobserved community effects, while retrospective
data could help generate “past” observations on exposure to a program. Third,
nonrandom sample attrition needs to be considered. Attrition may lead to bias if
households or individuals that remain in the sample differ in unobserved ways from those
that have left. Sample attrition will not lead to bias if the characteristics that lead to
attrition (or selective migration) do not change over time.
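The way fixed-effects estimation removes time-invariant unobservables can be shown with a small sketch. It assumes a single explanatory variable, a household effect that is constant over time and correlated with that variable, and no measurement error. The data are constructed by hand for illustration (true slope of 2), and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def ols_slope(x, y):
    """Slope from a simple regression of y on x with an intercept."""
    mean_x, mean_y = fmean(x), fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def within_slope(ids, x, y):
    """Fixed-effects (within) slope: subtract household means, then run OLS."""
    rows_by_household = defaultdict(list)
    for row, household in enumerate(ids):
        rows_by_household[household].append(row)

    x_demeaned = [0.0] * len(x)
    y_demeaned = [0.0] * len(y)
    for rows in rows_by_household.values():
        mean_x = fmean(x[r] for r in rows)
        mean_y = fmean(y[r] for r in rows)
        for r in rows:
            x_demeaned[r] = x[r] - mean_x
            y_demeaned[r] = y[r] - mean_y
    return ols_slope(x_demeaned, y_demeaned)


# Hypothetical two-round panel of three households.
# Outcome = 2 * x + household effect, with effects 0, 10, and 20
# that rise with x, so pooled OLS overstates the slope.
ids = [1, 1, 2, 2, 3, 3]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.0, 16.0, 18.0, 30.0, 32.0]

pooled = ols_slope(x, y)
within = within_slope(ids, x, y)
print("pooled OLS slope:", round(pooled, 3))
print("within (fixed-effects) slope:", round(within, 3))
```

Because the household effect is constant across rounds, subtracting household means removes it entirely. If the unobserved factor changed over time, as the first caveat above warns, the within estimate would no longer be free of bias.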
Some caveats also need to be pointed out if panel data are used to estimate models
in which some of the right-hand-side variables are endogenous. First, there are
difficulties in finding credible instrumental variables and in correctly specifying the
unobserved heterogeneity. Second, one must distinguish between transitory shocks and
measurement error in the data. This is especially important when making inferences
about transitory and chronic poverty. Third, one must deal with the problem of panel
attrition. Other problems with fixed-effects estimation have to do with the loss of
statistical degrees of freedom, the loss of the ability to estimate coefficients on time-
invariant variables (which will drop out in the fixed-effects estimation), incompatibility
of fixed effects with some estimation methods, and the possibility that differencing will
worsen the problem of measurement error. Despite these caveats, Glewwe and Jacoby
(2000) argue that sound questionnaire design can be used to reduce the drawbacks of
panel data, particularly in generating suitable instruments for estimation and repeated
contemporaneous measures of outcomes of interest to avoid measurement error.
From a survey logistics standpoint, collecting panel data requires dealing with
respondent fatigue, which could be a factor leading to attrition due to nonresponse or
unwillingness to be surveyed. Panel data based on a sampling frame of dwellings may
miss groups like pastoralists. Panel data based on a household sampling frame will have
to face issues like drastic changes in household structure due to death or migration, or
simply aging. Also, panel data can be expensive. Lastly, over time, the panel will no
longer be representative of the population, unless households are added to maintain the
representativeness of the panel.
Issues and Innovations
Some changes can be made at the survey design stage to maximize the usefulness
of panel data for risk and vulnerability assessments.
• When designing a survey, work into the first round the ability to find people
subsequently by having complete addresses, a complete household roster, and
a name list of respondents. The latter does not need to be released for
confidentiality reasons, but the ability to track respondents should be ensured.
• If it is not feasible to field a large household survey (like the LSMS) on a
regular basis, consider having a small panel component to understand
dynamic issues.
• Because panels typically cover only a few years, retrospective modules can
help bridge the gap between survey years.
• If panel data were (fortuitously) collected before and after a shock, use
various modules from the earlier period to examine ex ante mechanisms, and
from the later period to analyze ex post response. For example, information
on diversification, low-risk activities, migration, nonmonetary savings, and
risk-sharing groups provides insights into ex ante mechanisms, while changes
in labor supply, remittances, and informal transfers are ex post mechanisms.
4. LOCALITY INFORMATION AND DATA FROM CONTEXTUAL METHODS
Locality data collected from community questionnaires and secondary sources
provide important information on the household’s environment. Locality information can
be obtained from a variety of sources: “community questionnaires” on local
infrastructure, health and education facilities; administrative sources; market price
surveys; archives; rainfall stations; focus groups and key informants detailing local
histories; and, where appropriate, other primary data sources such as Demographic and
Health Surveys. Data from contextual methods also provide insights into the social and
cultural environment of households, and may be extremely useful in examining
individual perceptions of risk and vulnerability and sensitive issues that are less suitable
for survey-based methods. In cases where the analyst has no other household-level data
source but a cross-sectional survey, locality data may be the only source of information
on intertemporal variation. Contextual methods can get at people’s perceptions of risk
and vulnerability, and explore issues that may be less amenable to survey questionnaires,
including sensitive issues such as intrahousehold relations, crime, illness, magic, and
politics, as well as more “complicated,” multidimensional issues such as power
relationships, trust, and belief systems. Contextual methods can also be especially useful
in drawing up a timeline of shocks and major events affecting the community.
4.1 Community Information
Information collected from “community questionnaires” on local infrastructure,
health and education facilities, and market prices is a valuable complement to household
surveys in undertaking risk and vulnerability assessments.5 Although data from
secondary sources (such as administrative data) are also collected at the community level,
we use the term “community information” to mean that collected from interviews with
key informants and community members, using a structured questionnaire (we discuss
5 This section draws heavily on a detailed discussion of the design of community questionnaires in
Frankenberg (2000).
secondary data below). In combination with secondary sources, such as administrative
data and information collected by government statistical services, community
questionnaires are an important source of information on public institutions and
interventions that may affect a household’s vulnerability to, and ability to cope with,
shocks. Over half of LSMS data sets collected before 1997 included community and
price questionnaires, covering topics such as demographics, the economy and
infrastructure, education, health, agriculture, and the prices of food and nonfood goods
(Grosh and Glewwe 2000). A smaller number of surveys also collected facility data.
Most existing LSMS data sets contain information on the availability of sanitation
facilities, power, water supply and public works such as road and transport networks and,
in some cases, irrigation systems, health and education facilities and the distance to travel
to them (Frankenberg 2000: 317). By combining data on access to schools or facilities
with household-level data, it is possible to produce descriptive statistics on households’
access to health care, educational opportunities, and other public services (Example 6).
Questions about when facilities opened are also useful, when linked to retrospective data
from the household about past behavior. These can be used to relate intracommunity or
intrafamily changes in behavior or outcomes over time to changes in access to services.
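As a minimal sketch of this kind of linkage, assume the household file carries the same community identifier as the community questionnaire. All records, field names, and the 5 km threshold below are hypothetical. The function reports the share of households within a given distance of a health facility, separately by poverty status, and households whose community has no record are listed rather than silently dropped.

```python
# Hypothetical community questionnaire: distance (km) to the nearest health facility.
community = {
    "C01": {"km_to_health_facility": 2.0},
    "C02": {"km_to_health_facility": 12.5},
}

# Hypothetical household records carrying the community identifier.
households = [
    {"hh": 1, "community": "C01", "poor": True},
    {"hh": 2, "community": "C01", "poor": False},
    {"hh": 3, "community": "C02", "poor": True},
    {"hh": 4, "community": "C02", "poor": True},
    {"hh": 5, "community": "C03", "poor": False},  # no matching community record
]

def share_with_access(households, community, max_km, poor):
    """Share of matched households of a given poverty status living within `max_km` of a facility."""
    matched = [h for h in households if h["poor"] == poor and h["community"] in community]
    if not matched:
        return None
    near = [h for h in matched
            if community[h["community"]]["km_to_health_facility"] <= max_km]
    return len(near) / len(matched)

unmatched = [h["hh"] for h in households if h["community"] not in community]
poor_share = share_with_access(households, community, 5.0, poor=True)
nonpoor_share = share_with_access(households, community, 5.0, poor=False)

print("poor households with access:", poor_share)
print("nonpoor households with access:", nonpoor_share)
print("households without community data:", unmatched)
```

Reporting the unmatched households alongside the shares makes it easier to judge whether the usable subsample is likely to be nonrandom.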
Community questionnaires can also be used to obtain information about local
institutions and norms. For example, land tenure relations are often not codified in areas
governed by customary tenure. Using community interviews in Ghana and Sumatra
(Quisumbing et al. 2001; Suyanto et al. 2001a), researchers were able to define the
strength of property rights attached to different types of land depending on the mode of
acquisition based on customary law. Community questionnaires can also yield data on
access to informal networks as well as the presence of “safety nets” in times of crises.
Finally, community questionnaires can also be designed to obtain a history of local
covariate shocks.
Issues and Innovations
Some of the issues surrounding the use of community data have to do with its use
in analysis, and others to do with survey design and administration. First, communities
with access to infrastructure, health facilities, and schools may well have a number of
Example 6. Shocks and schooling attainment in Guatemala
Stein et al. (2003) present a preliminary examination of determinants of completed grades of schooling
amongst adults born in four villages in eastern Guatemala. Recognizing that this attainment would reflect
past events such as shocks, they drew on a specially commissioned anthropological study of these villages.
This work included discussions with key informants, focus groups as well as a review of records held by
the schools in the four villages. These data were used to construct three measures of school quality
(whether all six grades of primary schooling were offered; the ratio of teachers to grades taught and the
availability of a preschool) and three covariant shocks (whether schools were closed after the 1976
earthquake; a terms-of-trade shock that rendered cash crop production in one village unprofitable; and a
positive employment shock—the availability of high-paying wage jobs in a cement factory). Controlling
for individuals’ sex, cohort effects, and locality fixed effects, they find that two measures of school quality
and all three covariant shocks had statistically significant impacts on the number of grades attained. An
analysis that had not used qualitative data would have missed the impact of these shocks completely.
other attributes that contribute to positive outcomes. Unless the analysis controls for
these other attributes, the effects of access to facilities will be overstated. Second,
community-level measures tend to be highly collinear. Third, missing data at the individual,
household, community, or facility level means that analysts can only use a significantly
smaller and possibly nonrandom subset of observations. A fourth problem is the
potential endogeneity between community characteristics and individual or household-
level behavior and outcomes. Governments may place programs in communities where
households have certain characteristics, or households may migrate to places where
publicly provided infrastructure is present. While these issues are not a disadvantage of
community data, analysts need to take them into account.
In designing community questionnaires to be fielded as a complement to a
household survey, survey designers may want to take note of the following points:
• Pay special attention to the definition of the “community.” Cluster boundaries
may not have any significance to people actually living in the community.
Ideally, communities should be defined in accordance with the particular
outcome of interest to survey designers and the characteristics of survey
clusters. This may be more problematic in urban areas. From a pragmatic
point of view, it is helpful to be able to match communities to the units for
which administrative data have been collected.
• Key informants at very local levels, and persons who have lived in the
community for a longer period of time will be valuable sources of information
for building up a history of shocks and events in the community.
• Questions on shocks can be asked at both the household and community level
for cross-validation. If shocks are both multiple and covariant, community
information can provide the context for individual responses to be interpreted.
Questions on public responses to shocks should also be asked at the
community level, to capture such responses as public works, local disaster
relief schemes, etc.
• When collecting information on public safety nets, go beyond ascertaining
their presence or absence; look into the probability of actually receiving
support or, conversely, of being excluded from the safety net. For example,
community questionnaires on public works programs could ascertain criteria
for eligibility as well as explore the possibility that eligible persons may be
excluded from the program due to rationing or explicit or implicit forms of
social exclusion.
• Community-level indicators can, of course, be constructed from household
averages. The usefulness of household averages depends on the size of the
geographical area to be characterized, the number of respondents per
community, and the degree of heterogeneity across potential sampling units in
the area.
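The last point can be sketched as follows. The records, the variable name, and the minimum-respondent cutoff are hypothetical choices made for illustration. The function computes a community mean from household responses and keeps the number of respondents next to it, so that averages resting on very few households can be flagged.

```python
from collections import defaultdict

def community_averages(records, community_key, var):
    """Return {community: (mean of var, number of households with a nonmissing value)}."""
    values = defaultdict(list)
    for rec in records:
        if rec.get(var) is not None:  # households with missing values are skipped
            values[rec[community_key]].append(rec[var])
    return {c: (sum(v) / len(v), len(v)) for c, v in values.items()}

# Hypothetical household records: minutes walked to the nearest water source.
households = [
    {"community": "C01", "minutes_to_water": 10.0},
    {"community": "C01", "minutes_to_water": 20.0},
    {"community": "C01", "minutes_to_water": 30.0},
    {"community": "C02", "minutes_to_water": 5.0},
    {"community": "C02", "minutes_to_water": None},
    {"community": "C02", "minutes_to_water": 15.0},
]

MIN_RESPONDENTS = 3  # hypothetical cutoff; the right value depends on the survey design

averages = community_averages(households, "community", "minutes_to_water")
thin = sorted(c for c, (_, n) in averages.items() if n < MIN_RESPONDENTS)

print(averages)
print("communities with too few respondents:", thin)
```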
4.2 Secondary Sources
Secondary sources of data, such as administrative data, data from rainfall stations,
price data from market surveys, and even publicly available data sets such as the
Demographic and Health Surveys, can all be used for risk and vulnerability analysis.
One advantage of many of these secondary data sources is that, in many countries, they
have been collected for several years, although their quality will need to be verified on a
case-by-case basis. Indeed, when the analyst only has a single cross-section survey to
work with, information from secondary sources will be valuable in capturing
intertemporal dimensions of risk. For example, the extent of a weather shock could be
proxied by using the percent deviation of rainfall during the survey year from its long-run
trend, or an indicator of rainfall variability could be constructed using the variance of
monthly or yearly rainfall.
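A minimal sketch of both proxies is given below. The rainfall figures and the survey year are invented for illustration, and the long-run mean of the series stands in for the long-run trend, which is a simplifying assumption; with a trending series one would fit a trend first.

```python
from statistics import mean, pvariance

# Hypothetical annual rainfall totals (mm) recorded at one rainfall station.
annual_rainfall_mm = {1995: 800.0, 1996: 900.0, 1997: 1000.0, 1998: 1100.0, 1999: 1200.0}
survey_year = 1999  # hypothetical survey year

def percent_deviation(series, year):
    """Percent deviation of rainfall in `year` from the mean over all years in the series."""
    long_run_mean = mean(series.values())
    return 100.0 * (series[year] - long_run_mean) / long_run_mean

def rainfall_variability(series):
    """Variance of yearly rainfall, treating the series as the full record (population variance)."""
    return pvariance(list(series.values()))

shock = percent_deviation(annual_rainfall_mm, survey_year)
variability = rainfall_variability(annual_rainfall_mm)

print("percent deviation in survey year:", shock)
print("variance of yearly rainfall:", variability)
```

Either measure can then be attached to the households in the catchment area of the station as a locality-level variable.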
Issues and Innovations
Administrative data. Some issues that need to be taken into account when using
secondary data are (1) the geographic coverage of the data; (2) the age and timing of the
data; (3) whether the data have sufficient information with which to construct variables
that enable analysis of relevant policy questions; (4) availability of policy or control
variables in the data; and (5) measurement error (Frankenberg 2000: 323). The analyst
may want to ask the following questions:
• Do the data cover the entire country or specific geographic areas? Even if the
country coverage is nationwide, matching secondary data to household data
may be difficult if a different geographical coding scheme was used in the
secondary data set than the one used in the household survey. Ideally, both
data sets should contain the names of the administrative areas about which the
data were collected as well as a common set of codes.
• In matching data related to weather and agriculture, pay attention to
topography: the presence or absence of mountain ranges may affect rainfall
patterns and create microclimates that may not be captured by mean rainfall
collected at a rainfall station.
• Are the data too old to be of any use? Are they gathered frequently enough to
make a meaningful time series, e.g., of rainfall, possible?
• Do the data have enough information on policy relevant variables? Have they
been collected at too high a level of aggregation? Do they report the
conditions that actually prevail in a community, or what is supposed to be at a
facility?
• Is there information on private institutions, not just public facilities for which
administrative data are more readily available?
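The matching problem raised in the first question can be sketched as follows, assuming both data sets record the names of administrative areas but use different codes. The area names, codes, and rainfall values are hypothetical. Names are normalized before matching, and survey areas that find no counterpart are listed for manual review.

```python
def normalize(name):
    """Normalize an area name for matching: lowercase with whitespace collapsed."""
    return " ".join(name.lower().split())

# Hypothetical household survey areas and secondary records with different coding schemes.
survey_areas = [
    {"survey_code": "101", "area_name": "North  District"},
    {"survey_code": "102", "area_name": "East District"},
    {"survey_code": "103", "area_name": "Lake District"},
]
secondary_areas = [
    {"admin_code": "ND-7", "area_name": "north district", "mean_rainfall_mm": 950.0},
    {"admin_code": "ED-2", "area_name": "East District", "mean_rainfall_mm": 700.0},
]

# Build a crosswalk from survey codes to secondary-source codes via the shared names.
lookup = {normalize(r["area_name"]): r for r in secondary_areas}
crosswalk, unmatched = {}, []
for area in survey_areas:
    record = lookup.get(normalize(area["area_name"]))
    if record is None:
        unmatched.append(area["survey_code"])
    else:
        crosswalk[area["survey_code"]] = record["admin_code"]

print("crosswalk:", crosswalk)
print("survey areas with no secondary data:", unmatched)
```

Name matching of this kind is fragile when spellings vary, which is why a common set of codes in both data sets is preferable.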
GIS data. GIS data have much potential for risk and vulnerability assessment,
because they enable units of information to be spatially referenced. This would enable
better visualization of the spatial distribution of the data, the stratification of sampling,
identification of spatial correlates of vulnerability, geographical targeting, and assessment
of the local and nonlocal (externality) impacts of some types of shocks (Wood and
Rhinehart 2002). However, there are many outstanding issues that need to be recognized
when using GIS data. First, most socioeconomic data are not geo-referenced, although
newer surveys are beginning to take geographical coordinates at the time of interview.
Second, the scale of GIS data is often inappropriate, e.g., 8-20 kilometer grids for NDVI
and climate data. Third, the accuracy of GIS products, such as maps of rainfall
surfaces, is usually unstated and often unassessed. Most GIS information also describes status or condition, with little
on trends and dynamics (e.g., migration, changing rainfall variability), although
information on land cover change from satellite and aerial photographs is improving.
Thus, most GIS information is relevant to static vulnerability assessments or short-term
(current season) risks and interventions. Most GIS data also lack spatial predictive power
for most risk variables in a time frame relevant for policy design, with the exception of
some aspects of climate change and land cover change. However, the spatial analysis
framework can be used to help assess the potential impact of some types of interventions,
such as changing land use practices, technologies, and infrastructure.
Many of the suggestions for improving the GIS information system are probably
not directly relevant to analysts and survey designers. However, one key
recommendation that is very easy to implement is simply to geo-reference socioeconomic
data. With handheld GPS units, it is now relatively easy to get latitude and longitude
measurements of dwellings and facilities. While these can be used to compute distance
to facilities (and, by taking measurements as a sequence of readings, one can get more
accurate estimates for distance compared to straight-line distance), the potentially most
useful possibility would be the ability to link into digitized maps of the areas in which the
survey is being conducted. As more and more data series become digitized and geo-
referenced, it will be increasingly possible to address the spatial aspects of risk and
vulnerability in both measurement and analysis.
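The distance calculation can be sketched with the haversine great-circle formula. The coordinates below are hypothetical, and the Earth is treated as a sphere with a mean radius of 6,371 km, which is an approximation. The straight-line distance between a dwelling and a facility is compared with the distance summed over a sequence of GPS readings taken along the route.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def path_km(readings):
    """Distance along a route, summing the legs between consecutive (lat, lon) readings."""
    return sum(haversine_km(*a, *b) for a, b in zip(readings, readings[1:]))

# Hypothetical readings from a dwelling (first) to a facility (last) along a road.
route = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.1)]

straight = haversine_km(*route[0], *route[-1])
along_route = path_km(route)

print("straight-line distance (km):", round(straight, 2))
print("distance along the route (km):", round(along_route, 2))
```

The route distance is never shorter than the straight-line distance, and the gap between the two grows as roads or paths become more indirect.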
Census and other demographic data. Census data and demographic surveys (such
as the Demographic and Health Surveys) are especially valuable for characterizing life-
cycle risks. Census data can give an idea of the size of particular age cohorts as well as
their geographic distribution. Matching the geographic distribution of the population to,
say, rainfall and seismological data could identify population groups that might be most
vulnerable to weather and earthquake shocks.
Agricultural census and crop reporting data. Although agricultural censuses are
typically collected once every ten years (if at all), such data can be used to generate
profiles of cropping patterns throughout the country, which can be used as a proxy for
regional crop diversification. If matched with rainfall data, these can identify regions that
are ex ante vulnerable to rainfall shocks. If certain regions are devoted to specific export
crops, these regions can also be identified as ex ante vulnerable to terms of trade shocks
or changes in world market prices.
Nutrition and health surveys. Similar to agricultural census data, nationwide
nutrition and health surveys are not conducted frequently. However, they can provide
information on regions that are more likely to have high prevalence of malnutrition as
well as high incidence of contagious diseases.
4.3 Contextual Information6
Many outcomes of interest are not amenable to measurement using standard
quantitative survey techniques, particularly when one is interested in processes as much
as outcomes. In assessing the likelihood of poverty, for example, one can use “objective”
quantitative measures such as the probability of falling below the poverty line. However,
since vulnerability, like poverty, is a multidimensional concept, quantitative measures
will not necessarily capture “subjective” issues like the effect of vulnerability on behavior
and the effect of vulnerability on well-being. Other approaches such as livelihoods
analysis and participatory approaches yield rich, contextual data, although doubts can be
raised (rightly or wrongly) about their generality and representativeness (Dercon 2001).
6 This section draws from Adato and Meinzen-Dick (2002b).
Concerns regarding lack of representativeness can, unfortunately, be used as an excuse to
ignore valid findings.
In practice, most survey researchers use some form of qualitative research or
contextual method to understand the political, economic, or cultural context in which
their surveys are conducted (Chung 2000), ranging from ad hoc efforts (informal
conversations with villagers) to more systematic contextual studies conducted in tandem
with the survey. Chung (2000) points to three ways in which contextual methods can be
used to improve household surveys.7 First, contextual methods can be used to produce
hypotheses and to shape a survey’s conceptual framework. Second, contextual methods
can be used to clarify the questions and terms that are used in a survey. Third, contextual
methods can be used to explain counterintuitive or inconclusive survey findings.
The emerging consensus is that both survey-based and contextual approaches can
be useful for vulnerability analyses. For example, open-ended questions can be used to
identify sources of vulnerability, which can then be used to formulate questions in a
quantitative survey. They can also be used to explore topics that are less amenable to
survey questionnaires, including sensitive issues such as intrahousehold relations, crime,
illness, magic, and politics, as well as more “complicated,” multidimensional issues such
as power relationships, trust, and belief systems. An example where both survey and
contextual approaches were combined is found in the CGIAR study of the impact of
modern agricultural technologies (see Example 7). Moreover, by addressing the issue of
representativeness head on, it is possible to combine approaches in the same study.
Moser (2001), for example, argues that research using participatory approaches can
be quantified and made representative. This would involve careful choice of
communities and post-coding of answers into patterns. For example,
the sampling frame used in the household survey could be used to generate the
7 Chung’s exposition is in terms of “qualitative methods,” but in this toolkit we use “contextual methods”
in line with the distinction between qualitative and quantitative data, and contextual versus noncontextual
methods.
Example 7. Integrating quantitative and qualitative research in studying
vulnerability
In 2000 and 2001, IFPRI researchers—in collaboration with other CGIAR centers—undertook a
multicountry study assessing the social impact of new agricultural technologies in Bangladesh, Kenya,
Mexico, and Zimbabwe (Adato and Meinzen-Dick 2002a). These studies combined quantitative
household surveys, focus groups, key informant interviews, in-depth household case studies, and
secondary data. The case studies combined social and economic (as well as some biophysical), qualitative
and quantitative, participatory, and conventional (or extractive) data.
All the case studies made use of focus groups to elicit collective experience and opinions. Separate
groups were convened for men and women of different wealth/poverty categories. For example, in
Bangladesh, six focus groups were held in each selected village (men and women separately for the very
poor, poor, and nonpoor categories of households). Preexisting survey data helped in the disaggregation of
wealth groupings for the focus groups, particularly in communities where a wealth ranking exercise may
be divisive or difficult to carry out (e.g., because of large community size or time limitations that prevent
researchers from getting sufficiently acquainted with a community to comfortably carry out such an
exercise). Where possible, households that were selected for the surveys were included in the focus groups
to improve the comparability of the information obtained by the different sources.
During the focus group meetings, a range of participatory and extractive data collection activities was
conducted: seasonality mapping, identification and ranking of livelihood activities and sources of
vulnerability, as well as discussions of the technologies being studied and dissemination approaches. In
some of the studies (e.g., Kenya, Zimbabwe, and Mexico), focus groups were used following a series of
household case studies to further investigate issues raised (including the experiences of households not
included in these studies), check whether the findings resonate or contradict, and receive feedback on the
research findings. In other studies (especially in Bangladesh), focus groups were the primary means of
qualitative data collection, but were followed up with in-depth interviews or case studies of individuals
who participated in those groups.
Key informant interviews allowed the research team to follow up in more detail with individuals who
have specialized knowledge. These may include researchers from CGIAR and national centers, NGO,
community organization, or government project staff, extension agents, local seed distributors and shops,
agricultural researchers from the private sector, community elders, chiefs, early adopters, etc.
Semi-structured interviews allowed the researchers to go in with a core set of information that they hoped to
collect, but also to follow up on relevant topics that emerged during the course of the discussion.
In-depth household case studies provided more detail on the complexity of household livelihood
strategies, particularly in the Kenya, Zimbabwe, and Mexico cases. Researchers lived in sample villages
for three to six months, spending time in the homes of a subsample of the survey households, conducting
informal interviews, observing and participating in their daily activities, such as farming, extension field
days, and social interactions and activities. Such participant observation can provide insights that are not
available from other methods and inform and refine the questions asked in other, more structured, data
collection.
In the case of Zimbabwe, for example, ethnographic fieldwork revealed that the vast majority of
people perceived vulnerability of crops to be due to magic or witchcraft. For example, yields can be stolen
through magic, while high yields are similarly achieved through magic. People are unwilling to show
interest in or to observe others’ crops (which could be a way of learning from each other) to avoid
suspicions of witchcraft. Farmers were unwilling to discuss yields and prices, and discounted scientific
explanations. There was also a widespread belief that lent or borrowed animals or implements could be
bewitched. The researchers concluded that the perceived vulnerability to magic can inhibit the spread of
agricultural technology, since diffusion of new technologies involves experimentation and sharing of
lessons learned from the experiment. Moreover, the vulnerability to accusations of magic has implications
for the formation of social capital.
subsamples for further study using contextual approaches. In her study of individual,
household, and community vulnerability in the context of economic crises, Moser (2001)
used three types of data collection methods in the same communities: a random sample
survey to collect statistically quantifiable data, a subsample survey using both structured
and open-ended questions to collect qualitative data, and a community survey using
contextual methods such as participant observation, triangulations, and interviews to
collect qualitative community-level data. An example using both survey-based and
contextual methods is shown in Example 7.
Drawing from the work of Hentschel (1999) and Moser (2001), we propose some
guidelines for using contextual methods in risk and vulnerability assessments:
• Some methods will be better than others in generating particular types of data.
Some types of information on risk and vulnerability can be obtained through
contextual methods of data collection only. In these instances, strict statistical
representativeness will have to give way to inductive conclusion, internal
validity, and replicability of results.
• Think of contextual methods and quantitative surveys as part of an iterative
process. Contextual methods can be used to design appropriate noncontextual
data collection tools. If the contextual study can be conducted prior to the
quantitative survey, it can help define terms and categories for the quantitative
questionnaire.
• Where information requires survey-based methods, or if the contextual study
is conducted after the household survey, contextual methods can play an
important role for assessing the validity of the results at the local level, and
interpreting the survey results.
• In cases where different data collection methods can be used to probe general
results, formal links between the methods can—and need to—be established.
For example, preexisting survey data can be used to disaggregate wealth
groupings for focus groups, particularly in communities where a wealth
ranking exercise may be divisive or difficult to carry out (e.g., because of
large community size or time limitations that prevent researchers from getting
sufficiently acquainted with a community to comfortably carry out such an
exercise).
• Where possible, use the same sampling frame for the household surveys and
contextual methods. Households included in surveys can be selected
for focus groups to improve the comparability of the information obtained
through different sources. Group responses tend to provide information on
norms, whereas individual answers may reveal deviation from those norms,
depending on individual or household circumstances.
• Pay attention to linking different sources of data. Depending on the sequence
of data collection, insights from the surveys might be followed up in the focus
group or key informant interviews and participant observation, or vice versa.
• Assemble a research team with the proper mix of skills. The team can include
an economist, technical scientists, a social scientist (a sociologist or
anthropologist) with extensive experience in the region, and national
economics and social analysis experts who guide the data collection and
analysis, and who work with teams of less experienced researchers, engaging
in training and capacity building. The field staff require strong analytic and
facilitation skills in order to conduct the focus group and household case
studies, while the key informant interviews are often conducted by the
national or international social or economics experts.
5. CONCLUDING REMARKS
This “toolkit” set out to assist practitioners undertaking vulnerability assessments
by identifying data sources, assessing their suitability for risk and vulnerability
measurement, and proposing suggestions for data collection to supplement existing
sources. Using the “risk chain” as an organizing principle, we mapped data sources into
the three stages of the risk chain: (1) risk, or uncertain events; (2) options for managing
risk, or risk responses; and (3) the outcome in terms of welfare loss. These data sources
included cross-sectional surveys, panel surveys, community information, and secondary
sources, as well as LSMS-type household surveys. We then discussed the uses of six
types of data for risk and vulnerability assessments, pointing out their advantages and
disadvantages, as well as suggesting innovations in their collection and use. To
summarize, these types of data are:
• Single cross-section (Section 3.1)
• Repeated cross-sections (Section 3.2)
• Panel data (Section 3.3)
• Community information (Section 4.1)
• Secondary sources (Section 4.2)
• Data from contextual methods (Section 4.3)
Because the objectives of each risk and vulnerability assessment will differ, and
because the analyst will be faced with different types of data constraints in each particular
situation, we offer three very general conclusions:
• Use multiple data sources, but be aware of the advantages and limitations of each
source. Each data source will have its own advantages in terms of information
content, reliability, and availability. Understanding the features of each data source
in relation to information requirements at each stage of the risk chain will help the
analyst choose among sources. Always cross-validate information from different
sources, and seek expert opinion, especially from in-country experts who know both
the data-generating infrastructure and the country’s policy environment.
• Let the question drive the methods and data, not vice versa. The companion
document on methods for risk and vulnerability assessment suggests that all
vulnerability analyses attempt to do the following: (1) identify the correlates of
vulnerability; (2) examine the sources of vulnerability by characterizing risks and
shocks faced by the population as well as the distribution of those shocks; and (3)
determine the gaps between risks and risk management mechanisms. Using these
general guidelines will enable the analyst to identify the data required, and to conduct
the analyses, given data constraints.
• Be open to contextual methods to complement analyses using household survey data.
Contextual methods can be used at different stages of the risk and vulnerability
assessment—whether to produce hypotheses and to shape a survey’s conceptual
framework, to clarify the questions and terms that are used in a survey, or to explain
counterintuitive or inconclusive survey findings. Using multiple methods may
improve validity and relevance of the vulnerability assessment, particularly in cases
where panel data are not available and discussions of poverty and vulnerability may
be sensitive.
ANNEX 1. MODULES ON RISKS AND SHOCKS
The modules presented in this section are draft modules on risks and shocks that correspond to different recall periods. The first module, patterned
after the third round of the Ethiopian Rural Household Survey, collects information on long-term shocks and coping mechanisms but has been
modified somewhat to be equally applicable to households residing in urban areas. The second module, which is similar in design to the long-term
shocks module, has a recall period of 12 months. In addition, we also provide a module specifically tailored to agricultural shocks that could be
inserted into a section of the household questionnaire on agriculture.
As with any survey module, it is vitally important to pretest these questions to ensure that they are appropriate to the country/locality in which they
will be used. Many of these questions have been used in one or two surveys, but the module as a whole has seen considerably less use
than many LSMS-type modules. While we have erred on the side of being exhaustive in what is listed here, survey designers should be
aware that their respondents’ time is valuable and that the quality of information collected deteriorates if respondents are requested to participate in
an excessively lengthy survey. Good pilot testing should make it possible to reduce the number of questions being asked; further, in some
circumstances, it may make sense to move some of the topics listed here to a community-level questionnaire. Also, survey planners may want to
consider more carefully how much detail they need on the specifics of shocks. For example, the questions on shocks in the previous 20 years could
be made more general, with the more specific questions reserved for shocks in the last 12 months. If survey planners are administering both the
long-term and the medium-term shocks modules, they may want to specify that the 20-year recall does not include the past 12 months.
A. MODULE ON LONG-TERM SHOCKS AND COPING MECHANISMS
IN THE LAST 20 YEARS, SINCE (name an event which is a widely known landmark date in the history of the country) has this household been
affected by a shock—an event that led to a reduction in your asset holdings, caused your household income to fall or resulted in a significant
reduction in consumption? We would like to learn more about the worst shocks in the last 20 years.
Columns:

NATURE OF SHOCK
SPECIFICS OF SHOCK
CODE

In what years did these shocks occur? (List the three worst shock years, in descending order of severity)

Did these shocks result in:
1. Loss of productive assets
2. Loss of household income
3. Reduction in household consumption
4. Asset & income loss
5. Asset loss & reduced consumption
6. Income loss & reduced consumption
7. Asset, income loss & reduced consumption

How widespread was this shock?
1. Only affected my household
2. Affected some households in this village
3. Affected all households in this village
4. Affected this village and other villages nearby
6. Affected the region
7. Affected the whole country

Rows (specifics of shock and code, grouped by nature of shock):

1. …has there been a weather or environmental shock?
  Drought 100
  Too much rain or flood 101
  Earthquake 102
  Volcanic eruption 103
  Landslides 104
  Erosion 105
  Frost and hailstorm 106
  Pests or diseases that affected crops before they were harvested 107
  Pests or diseases that led to storage losses 108
  Pests or diseases that affected livestock 109

2. …has there been war, civil conflict, banditry, crime?
  Destruction, confiscation or theft of tools or inputs for production 201
  Theft of cash 202
  Theft of stored crops 203
  Destruction or theft of housing 204
  Destruction or theft of consumer goods 205
  Death of working adult household members 206
  Death of other household members 207
  Disablement of working adult household members 208
  Disablement of other household members 209
  Conscription, abduction or draft of working adult household members 210

3. …have there been negative political, social or legal events?
  Confiscation of land 301
  Confiscation of other assets 302
  Land reform 303
  Resettlement, villagization or forced migration 304
  Bans on migration 305
  Forced labor 306
  Forced contributions or arbitrary taxation 307
  Imprisonment for political reasons 308
  Discrimination for political reasons 309
  Discrimination for social or ethnic reasons 310
  Contract dispute or default affecting access to land 311
  Contract dispute or default affecting access to other inputs 312
  Contract dispute or default affecting sale of products 313

4. …have there been economic shocks?
  Lack of financing/capital 401
  Lack of access to inputs 402
  Increase in input prices 403
  Decrease in output prices 404
  Lack of demand or inability to sell agricultural products 405
  Lack of demand or inability to sell nonagricultural products 406
  Unemployment 407

5. …have there been other events or shocks?
  Death of husband 501
  Death of wife 502
  Other death (Specify ______________________) 503
  Illness of husband 504
  Illness of wife 505
  Other illness (Specify ______________________) 506
  Divorce 507
  Abandonment 508
  Disputes with extended family members regarding land 509
  Disputes with extended family members regarding other assets 510

6. …have there been other events or shocks that we have not listed here? (Specify)
  ______________________ 601
  ______________________ 602
  ______________________ 603
  ______________________ 604

Ask the respondent to list the five worst crises from the list above. What did the household do to compensate, resolve or address this loss of
assets, loss of income and/or reduction in consumption?

Columns:

Five most important crises
List the five most important crises from the modules above. For each, record: Code, Year.

What did the household do to compensate or resolve this decrease or loss of income and/or inheritance? (List the three most important activities in descending order of importance)
  Spent savings or investments……1
  Pawned goods……2
  Mortgaged house or land……3
  Cashed in securities……4
  Worked more, if already working……5
  Other members went to work……6
  Applied for a cash loan from a private bank……7
  Applied for a cash loan from a state bank……8
  Asked for a cash loan from a family member……9
  Asked for a cash loan from a friend……10
  Asked for a cash loan from a moneylender……11
  Asked for a cash loan from work……12
  Sold house or land……13
  Sold animals……14
  Sold appliances, equipment, machines……15
  Sold jewelry……16
  Sold the harvest in advance……17
  With help from government organizations……18
  With help from private entities……19
  With help from international entities……20
  With help from NGOs……21
  With help from the neighbors or friends……22
  Reduce food consumption……23
  Stop consuming some products or services……24
  Didn’t do anything……25
  Other, what?……26

How much time did it take to go back to the position you were in before the crisis?
  Less than a year……1
  One year to five years……2
  Six to 10 years……3
  More than 10 years……4
  Never recovered……5
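
Once the module has been fielded, the numeric codes have to be unpacked before analysis. The sketch below is one possible way to do this in Python, not part of the original module. It assumes that the hundreds digit of each shock code identifies the section of the module it came from (as the code lists above suggest) and that the seven effect codes are combinations of three outcomes: asset loss, income loss and reduced consumption. The names ShockRecord, EFFECT_FLAGS and SHOCK_GROUPS, and the short group labels, are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass

# Effect codes 1-7 ("Did these shocks result in"), unpacked into
# (asset loss, income loss, reduced consumption).
EFFECT_FLAGS = {
    1: (True, False, False),
    2: (False, True, False),
    3: (False, False, True),
    4: (True, True, False),
    5: (True, False, True),
    6: (False, True, True),
    7: (True, True, True),
}

# Assumption: the hundreds digit of a shock code gives the module section.
# The labels are shorthand for the six "nature of shock" questions.
SHOCK_GROUPS = {
    1: "weather or environmental",
    2: "war, civil conflict, banditry, crime",
    3: "political, social or legal",
    4: "economic",
    5: "other events or shocks (listed)",
    6: "other events or shocks (specified by respondent)",
}


@dataclass(frozen=True)
class ShockRecord:
    """One reported shock: its three-digit code and its effect code."""
    shock_code: int
    effect_code: int

    @property
    def group(self) -> str:
        # Integer division by 100 recovers the section number (e.g. 107 -> 1).
        return SHOCK_GROUPS[self.shock_code // 100]

    @property
    def effects(self) -> dict:
        asset, income, consumption = EFFECT_FLAGS[self.effect_code]
        return {
            "asset_loss": asset,
            "income_loss": income,
            "reduced_consumption": consumption,
        }


if __name__ == "__main__":
    record = ShockRecord(shock_code=107, effect_code=6)
    print(record.group, record.effects)
```

Storing the three outcomes as separate indicators, rather than the single 1 to 7 code, makes it easier to tabulate, for example, all shocks that involved any loss of assets.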
B. MODULE ON MEDIUM-TERM SHOCKS AND COPING MECHANISMS
IN THE LAST 12 MONTHS, has this household been affected by a shock—an event that led to a reduction in your asset holdings, caused your
household income to fall or resulted in a significant reduction in consumption? We would like to learn more about these events.
Columns:

NATURE OF SHOCK
SPECIFICS OF SHOCK
CODE

When did these occur?
  In the last four weeks
  Between one month and six months ago
  Between six and 12 months ago

Did these shocks result in:
1. Loss of productive assets
2. Loss of household income
3. Reduction in household consumption
4. Asset & income loss
5. Asset loss & reduced consumption
6. Income loss & reduced consumption
7. Asset, income loss & reduced consumption

How widespread was this shock?
1. Only affected my household
2. Affected some households in this village
3. Affected all households in this village
4. Affected this village and other villages nearby
6. Affected the region
7. Affected the whole country

Rows (specifics of shock and code, grouped by nature of shock):

1. …has there been a weather or environmental shock?
  Drought 100
  Too much rain or flood 101
  Earthquake 102
  Volcanic eruption 103
  Landslides 104
  Erosion 105
  Frost and hailstorm 106
  Pests or diseases that affected crops before they were harvested 107
  Pests or diseases that led to storage losses 108
  Pests or diseases that affected livestock 109

2. …has there been war, civil conflict, banditry, crime?
  Destruction, confiscation or theft of tools or inputs for production 201
  Theft of cash 202
  Theft of stored crops 203
  Destruction or theft of housing 204
  Destruction or theft of consumer goods 205
  Death of working adult household members 206
  Death of other household members 207
  Disablement of working adult household members 208
  Disablement of other household members 209
  Conscription, abduction or draft of working adult household members 210

3. …have there been negative political, social or legal events?
  Confiscation of land 301
  Confiscation of other assets 302
  Land reform 303
  Resettlement, villagization or forced migration 304
  Bans on migration 305
  Forced labor 306
  Forced contributions or arbitrary taxation 307
  Imprisonment for political reasons 308
  Discrimination for political reasons 309
  Discrimination for social or ethnic reasons 310
  Contract dispute or default affecting access to land 311
  Contract dispute or default affecting access to other inputs 312
  Contract dispute or default affecting sale of products 313

4. …have there been economic shocks?
  Lack of financing/capital 401
  Lack of access to inputs 402
  Increase in input prices 403
  Decrease in output prices 404
  Lack of demand or inability to sell agricultural products 405
  Lack of demand or inability to sell nonagricultural products 406
  Unemployment 407

5. …have there been other events or shocks?
  Death of husband 501
  Death of wife 502
  Other death (Specify ______________________) 503
  Illness of husband 504
  Illness of wife 505
  Other illness (Specify ______________________) 506
  Divorce 507
  Abandonment 508
  Disputes with extended family members regarding land 509
  Disputes with extended family members regarding other assets 510

6. …have there been other events or shocks that we have not listed here? (Specify)
  ______________________ 601
  ______________________ 602
  ______________________ 603
  ______________________ 604

Ask the respondent to list the five worst crises from the list above. What did the household do to compensate, resolve or address this loss of
assets, loss of income and/or reduction in consumption?

Columns:

Five most important crises
List the five most important crises from the modules above. For each, record: Code, When occurred.

What did the household do to compensate or resolve this decrease or loss of income and/or inheritance? (List the three most important activities in descending order of importance)
  Spent savings or investments……1
  Pawned goods……2
  Mortgaged house or land……3
  Cashed in securities……4
  Worked more, if already working……5
  Other members went to work……6
  Applied for a cash loan from a private bank……7
  Applied for a cash loan from a state bank……8
  Asked for a cash loan from a family member……9
  Asked for a cash loan from a friend……10
  Asked for a cash loan from a moneylender……11
  Asked for a cash loan from work……12
  Sold house or land……13
  Sold animals……14
  Sold appliances, equipment, machines……15
  Sold jewelry……16
  Sold the harvest in advance……17
  With help from government organizations……18
  With help from private entities……19
  With help from international entities……20
  With help from NGOs……21
  With help from the neighbors or friends……22
  Reduce food consumption……23
  Stop consuming some products or services……24
  Didn’t do anything……25
  Other, what?……26

How much time did it take to go back to the position you were in before the crisis?
  Less than a month……1
  Between one and six months……2
  Between six and 12 months……3
  Not yet recovered……4
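
The "How widespread was this shock?" column can be used to separate shocks that hit a single household from those shared with neighbors or wider areas. The following Python sketch shows one way this might be tabulated; it is illustrative only. The function names and the labels "household-only", "wider" and "unknown" are assumptions made here, not part of the module. Because the printed response list has no code 5, the sketch treats 5 (and any other unlisted value) as unknown rather than guessing its meaning.

```python
from collections import Counter

# Codes from "How widespread was this shock?" as printed in the module.
HOUSEHOLD_ONLY_CODES = {1}        # only affected my household
WIDER_CODES = {2, 3, 4, 6, 7}     # some/all of village, nearby villages, region, country


def classify_extent(code: int) -> str:
    """Map an extent code to a coarse, hypothetical category."""
    if code in HOUSEHOLD_ONLY_CODES:
        return "household-only"
    if code in WIDER_CODES:
        return "wider"
    # Code 5 is absent from the printed list, so it is not interpreted here.
    return "unknown"


def tabulate_extent(codes) -> Counter:
    """Count reported shocks by coarse extent category."""
    return Counter(classify_extent(code) for code in codes)


if __name__ == "__main__":
    # Made-up extent codes for four reported shocks, for demonstration only.
    print(tabulate_extent([1, 2, 7, 5]))
```

Flagging unlisted codes instead of dropping them keeps data-entry problems visible during cleaning.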
SEASONALITY AND AGRICULTURE-RELATED SHOCKS
Note: This module should be adapted to use the local names for the agricultural seasons.
Please think back to the last main rainy season (or major growing season in ____________). We would like to know whether any of the
following events happened to you that affected the growth of your crops and the harvest. These questions should be asked of all farmers who
harvest during the main rainy season, all farmers who grow permanent crops, and any other farmers for whom these rains are relevant.
Permanent crop growers should be asked in general about the growing season preceding the last main harvest. If the crop is continuously
harvested, ask for a general assessment of the last growing season.
1. Is the main rainy season important for your crops?
  Yes……1
  Not very important……2
  No……3
2. According to your own plans, did the main rains come on time?
  On time……1
  Too late……2
  Too early……3
3. Was there enough rain on your fields at the beginning of the rainy season?
  Enough……1
  Too much……2
  Too little……3
4. Did the rains stop on time on your fields?
  On time……1
  Stopped too late……2
  Stopped too early……3
5. Did it rain near the harvest time?
  Yes……1
  No……2
Did your crops suffer from any of the following? (For each: Yes……1, No……2)
  Low temperatures
  Wind/storm
  Flooding/waterlogging
  Plant diseases
  Insects
  Livestock eating/trampling crops
  Weed damage
  Other (specify)
Please mention which crops were most affected by the above factors during the last rainy season, and mention whether they were moderately or very
badly affected (up to three crops). Give comments if necessary.

Columns:

Crop affected (crop codes)

Please list the three most important factors in descending order of importance.
  Low temperatures……1
  Wind/storm……2
  Flooding/waterlogging……3
  Plant diseases……4
  Insects……5
  Livestock eating/trampling crops……6
  Weed damage……7
  Other (specify)……8

Overall, how badly affected was this crop?
  Slightly affected……1
  Moderately……2
  Severely……3
  Totally destroyed……4

Comments