Article

Estimating Food Consumption and Poverty Indices with Mobile Phone Data

Abstract and Figures

Recent studies have shown the value of mobile phone data for tackling problems related to economic development and humanitarian action. In this research, we assess the suitability of indicators derived from mobile phone data as proxies for food security indicators. We compare measures extracted from call detail records and airtime credit purchases with the results of a nationwide household survey conducted at the same time. Results show high correlations (> 0.8) between indicators derived from mobile phone data and several relevant food security variables, such as expenditure on food or vegetable consumption. This correspondence suggests that, in the future, proxies derived from mobile phone data could be used to provide valuable, up-to-date operational information on food security throughout low- and middle-income countries.
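The comparison described in the abstract can be illustrated with a minimal sketch of a Pearson correlation between a phone-derived indicator and a survey variable. The region-level numbers below are invented purely for illustration, and the function name `pearson` is an assumption; the paper's actual aggregation and statistical procedure are not given in this text.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical values, one entry per region (NOT data from the study):
# mean airtime purchases from phone records, mean food expenditure from a survey.
airtime_per_region = [1.2, 2.5, 3.1, 4.8, 5.0]
food_expenditure_per_region = [10.0, 14.0, 15.5, 22.0, 21.0]

r = pearson(airtime_per_region, food_expenditure_per_region)
print(f"Pearson r = {r:.3f}")
```

A coefficient above 0.8 on real, spatially aggregated data is the kind of correspondence the abstract reports; a correlation computed this way says nothing about causation.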
... Many of the datasets are also recent, frequently updated and in some cases openly accessible (e.g., certain types of satellite imagery and Twitter data); they are also available over significant spatial scales, as well as at fine spatial resolutions (e.g., individual-level generated data, high-resolution aerial imagery). These datasets therefore have significant advantages for use within many applications relevant to sustainable development, including but not limited to: estimating poverty levels (Smith-Clarke and Capra, 2016; Steele et al., 2017); studying the migration patterns resulting from climate stresses (Lu et al., 2016); predicting food insecurity (Decuyper and Rutherford, 2014); and determining the greatest influencers in the spread of disease (Tatem et al., 2014). ...
... The second major challenge in using these types of novel datasets is understanding what relevant information can be extracted and what are the best approaches and methods to do so. For example, a study in 2014 tested whether a person's expenditure on a mobile phone could be related to their food security (Decuyper and Rutherford, 2014). When comparing this expenditure to different patterns of food consumption, the study found that the consumption of vitamin rich vegetables, rice, bread, sugar and fresh meat did have a positive correlation with airtime purchases, whilst the consumption of white sweet potato had a significant negative correlation (Decuyper and Rutherford, 2014). ...
... However, broadly cultivated items like cassava and beans had no relation with the expenditure on mobile phones (Decuyper and Rutherford, 2014). ...
Thesis
Reducing the risk of populations to disaster is a key priority for those working within sustainable development, as highlighted by global policies including the Sustainable Development Goals and the Sendai Framework for Disaster Risk Reduction. Consequently, there is a need to understand where disaster risk is at its greatest, yet its quantification has proven difficult. Disaster risk is a function of the likely occurrence and exposure of a hazard, the vulnerability of the population to the hazard, and their (in)ability to prepare for, absorb and build back from the adverse impacts of the hazard, often understood as their resilience. The quantification of the latter two aspects, vulnerability and resilience, is not straightforward, with both having multiple definitions as well as approaches to their measurement. Within the wider resilience field, an alternative approach to its measurement is evolving, which specifically focuses on social networks as the unit of analysis. The premise is that greater social connectivity will directly enhance resilience, can be evaluated through a singular approach, and can be quantified using social network analysis. This approach has however been limited by the availability of data at substantive spatial and temporal scales. This PhD proposes that there is a significant opportunity to utilise Call Detail Records (CDRs), the metadata generated from the use of a mobile phone, to address these data limitations. The overall aim of this thesis is to assess the feasibility of using CDRs to create a social connectivity dataset that can be used specifically within disaster resilience estimation for disaster risk reduction. To substantiate the creation of this dataset from CDRs, the theoretical framework behind using social connectivity for disaster resilience estimation is first established, including a systematic review that evaluates the importance of social networks for disaster risk reduction in Nepal. 
The thesis then accounts for the representativeness of the CDR dataset by analysing the changing geo-demographics of mobile phone ownership in Nepal. In the last decade, household ownership has grown substantially in Nepal across different socio-economic groups, whilst individual ownership stood at 82% in 2016. As a result, the CDR dataset is likely to be representative of a substantial cross-section of Nepal’s population. The feasibility of using CDRs to represent real-world social networks is then addressed by mapping the spatial distribution of the social communities detected within the CDR network. The study finds that the social communities are spatially concentrated; within these distributions, geographic communities, such as towns and cities, can be identified. The thesis then evaluates whether CDRs can be used for improved mapping and measurement of social connectivity for disaster resilience and risk estimation, creating a social connectivity index using novel CDR data and social network analysis. The index and its variables show that there are clear geographical patterns to social connectivity, with the peri-urban middle Hill regions expected to demonstrate the greatest resilience due to their sizeable and strong bonding and bridging networks. The thesis then addresses the limitations of each of the analyses presented and identifies opportunities for further research. The thesis concludes that CDRs and the emerging body of literature on social connectivity and social network analysis present a significant opportunity to rethink the current methods of measuring disaster resilience for disaster risk reduction.
... Data can be used to monitor the risk of food insecurity in real time, to forecast near-term shortages, and to identify areas at risk in the long-term, all of which can guide interventions. For real-time and near-term systems, it is possible to distill relevant signals from mobile phones, credit card transactions, and social media data [171,434,645]. These have emerged as low-cost, high-reach alternatives to manual surveying. ...
Article
Climate change is one of the greatest challenges facing humanity, and we, as machine learning (ML) experts, may wonder how we can help. Here we describe how ML can be a powerful tool in reducing greenhouse gas emissions and helping society adapt to a changing climate. From smart grids to disaster management, we identify high impact problems where existing gaps can be filled by ML, in collaboration with other fields. Our recommendations encompass exciting research questions as well as promising business opportunities. We call on the ML community to join the global effort against climate change.
... Logistical information from companies operating in the region can supplement mapping and access data. The World Food Programme working with UN Global Pulse has demonstrated that data from mobile phone signal can be used to estimate multidimensional poverty indices and food demand [142]. Information such as aerial imaging, which might be held primarily by private sector organisations, can be invaluable in a crisis [143]. ...
... As the physical access of mobile phone use becomes more widespread, data on ownership and usage offer a new source of information relating to household demographics and spending choices. For instance, regional aggregated measures of phone penetration and use have been shown to correlate with regionally aggregated population statistics from census and household surveys [17,18]. In Nepal, the 2016 Nepal Demographic Health Surveys reported widespread distribution of mobile phone ownership by gender (73% of women and 89% of men), and location (87% of the rural and 90% of urban dwellers) [19]. ...
Article
Household food insecurity remains a major policy challenge in low-income countries. Identifying accurate measures that are relatively easy to collect has long been an important priority for governments seeking to better understand and fund solutions for communities in remote settings. Conventional approaches based on surveys can be time-consuming and costly, while data derived from satellite imagery represent proxies focused on biological processes (such as rainfall and crop growth) and lack granularity in terms of human behaviors. As a result, there has recently been interest in tapping into the large digital footprint offered by mobile phone usage. This paper explores empirical relationships between data relating to mobile phones (ownership and spending on service use) and food insecurity in rural Nepal. The work explores models for estimating community-level food insecurity through aggregated mobile phone variables in a proof-of-concept approach. In addition, sensitivity analyses were performed by considering the performance of the models under different settings. The results suggest that mobile phone variables on ownership and expenditure can be used to estimate food insecurity with reasonable accuracy. This suggests that such an approach can be used in and beyond Nepal as an option for collecting timely food insecurity information, either alone or in combination with conventional approaches.
Thesis
Progress in the fight against hunger was significant in West Africa and Burkina Faso between 2000 and 2014, before the food situation deteriorated. The reasons are multiple and interdependent: more frequent extreme weather events and population growth tend to reduce food availability; population displacement due to conflict leads to a fall in agricultural production and the disruption of distribution channels; and the structural poverty of populations is aggravated by a difficult global economic context. To monitor, analyse and forecast situations of food insecurity, food security systems (FSS) mainly integrate agroclimatic data derived from satellite images and nutrition, production and economic indicators derived from household surveys. These surveys are essential for producing key indicators of food security (FS), but they are costly in both money and time. The objective of this thesis is to provide innovative approaches for estimating FS indicators and their determinants, using heterogeneous, publicly accessible data and approaches based on artificial intelligence, with a view to supporting the methods used by FSS. To this end, several research questions are addressed: which indicators should be relied on to measure FS, and what are their limits? How should the thematic, temporal and spatial heterogeneity of the data be handled? How can explanatory elements be extracted from the data? To answer these questions, this thesis proposes three contributions. First, we review the many indicators used to quantify the complex notion of FS. 
We then focus on FS indicators derived from household surveys and study what they reveal about FS, their spatial and temporal validity, and the biases to which they may be subject. We show that, despite their inherent biases, these indicators contain consistent spatial and inter-annual information that can be exploited to monitor food crises at the sub-national level. Second, we propose original approaches combining machine learning and deep learning methods (i.e., random forests, convolutional neural networks, recurrent neural networks) to obtain approximations of FS indicators derived from household surveys. These approaches integrate and combine heterogeneous explanatory data. The explanatory data are quantitative variables (e.g., meteorological data), images (e.g., population densities, land use) and GPS points (e.g., hospitals, schools, violent events) with different spatio-temporal granularities. We highlight the relevance of machine learning approaches depending on the data to be processed and note the significant contribution of variables from varied domains. Third, we study the contribution of textual data, which have strong explanatory potential, to a qualitative analysis of FS based on a corpus of Burkinabe newspapers. We examine the ability of text mining methods to automatically extract qualitative information on the global, regional and annual food situation from this corpus. 
This work yielded specific qualitative information on the theme of FS and on its spatial and temporal characteristics. Through these three contributions, this thesis considers the problem of the heterogeneity of FS-related data, emphasising the spatio-temporal and thematic dimensions they convey. The generic methodological frameworks proposed can be extended and adapted to other domains.
Article
Background: Policy makers need access to reliable data to monitor and evaluate the progress of development outcomes and targets such as the Sustainable Development Goals (SDGs). However, significant data and evidence gaps remain. Lack of resources, limited capacity within governments and logistical difficulties in collecting data are some of the reasons for the data gaps. Big data, that is, data that is digitally generated, passively produced and automatically collected, offers great potential for answering some of the data needs. Satellites and sensors, mobile phone call detail records, online transactions and search data, and social media are some examples of big data. Integrating big data with traditional household surveys and administrative data can complement data availability, quality, granularity, accuracy and frequency, and help measure development outcomes temporally and spatially in a number of new ways. The study maps different sources of big data onto development outcomes (based on SDGs) to identify the current evidence base, use and gaps. The map provides a visual overview of existing and ongoing studies. This study also discusses the risks, biases and ethical challenges in using big data for measuring and evaluating development outcomes. The study is a valuable resource for evaluators, researchers, funders, policymakers and practitioners in their effort to contribute to evidence-informed policy making and to achieving the SDGs.
Objectives: Identify and appraise rigorous impact evaluations (IEs), systematic reviews and studies that have innovatively used big data to measure any development outcomes, with special reference to difficult contexts.
Search Methods: A number of general and specialised databases and repositories of organisations were searched by an information specialist using keywords related to big data.
Selection Criteria: The studies were selected on the basis of whether they used big data sources to measure or evaluate development outcomes.
Data Collection and Analysis: Data collection was conducted using a data extraction tool, and all extracted data was entered into Excel and then analysed using Stata. The data analysis involved looking at trends and descriptive statistics only.
Main Results: The search yielded over 17,000 records, which we then screened down to 437 studies, which became the foundation of our systematic map. We found that, overall, there is a sizable and rapidly growing number of measurement studies using big data but a much smaller number of IEs. We also see that the bulk of the big data sources are machine-generated (mostly satellites). We find that satellite data was used in over 70% of the measurement studies and in over 80% of the IEs.
Authors' Conclusions: This map gives us a sense that there is a lot of work being done to develop appropriate measures using big data, which could subsequently be used in IEs. Information on costs, ethics and transparency is lacking in the studies, and more work is needed in this area to understand the efficacies related to the use of big data. There are a number of outcomes which are not being studied using big data, either due to a lack of applicability, as in education, or due to a lack of awareness about the new methods and data sources. The map points to a number of gaps as well as opportunities where future researchers can conduct research.
Article
The use of big data promises to drive economic growth and development and can therefore be a value-adding factor, but compared to private or public organisations, the country level is rarely investigated, and that is even more evident for developing countries. Another topic hardly ever considered in the big data research field is ‘big data readiness’, which means the level of preparation and willingness to exploit big data. We address these shortcomings in the literature and focus on the big data readiness of developing countries. Thus, the first research question is: what components are required for an index measuring big data readiness, and how can such an index be designed? We use a design science approach to develop the “Big Data Readiness Index” (BDRI), which is then applied to all African countries to answer our second research question: how do African countries perform in terms of the BDRI? Our analysis yields country rankings that show relatively high BDRI scores for coastal countries, such as South Africa, Kenya and Namibia, and for islands, such as Mauritius. Related implications for both research and policy are discussed.
Chapter
In this chapter, the authors develop a quality assessment for home detection methods from call detail record (CDR) data. They argue that little research exists on the validity and related errors of home detection methods and that the sensitivity of results to researcher choices when setting up home detection algorithms (HDAs) is poorly understood. The authors present an extensive empirical analysis of home detection methods when performed on a nationwide CDR dataset of traces from about 18 million mobile phone users in France in 2007. They analyze the validity of nine different HDAs and assess different sources of uncertainty that relate to them and the obtained results. The authors discuss different measures for validation and investigate the sensitivity of results to researcher choices such as HDA parameter choice and observation period restriction. They use aggregated population counts from census data as a ground truth dataset to compare against aggregated user counts.
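To make the idea of a home detection algorithm concrete, here is a minimal sketch of one simple heuristic: assign each user to the cell tower where they were most active at night. This rule, the 19:00 to 07:00 window, the record layout and the name `detect_homes` are all assumptions made for illustration; the text does not say which nine HDAs the chapter evaluates.

```python
from collections import Counter, defaultdict
from datetime import datetime


def detect_homes(records, night_start=19, night_end=7):
    """Guess a home tower per user from (user_id, tower_id, timestamp) records.

    Heuristic (an assumption, not necessarily one of the chapter's HDAs):
    pick the tower with the most events between night_start and night_end;
    fall back to the overall busiest tower if the user has no night events.
    """
    night_counts = defaultdict(Counter)
    all_counts = defaultdict(Counter)
    for user, tower, ts in records:
        all_counts[user][tower] += 1
        if ts.hour >= night_start or ts.hour < night_end:
            night_counts[user][tower] += 1

    homes = {}
    for user, counts in all_counts.items():
        chosen = night_counts.get(user) or counts
        homes[user] = chosen.most_common(1)[0][0]
    return homes


# Hypothetical toy records (NOT data from the chapter).
toy_records = [
    ("u1", "A", datetime(2007, 5, 1, 22)),
    ("u1", "A", datetime(2007, 5, 2, 23)),
    ("u1", "B", datetime(2007, 5, 1, 12)),
    ("u1", "B", datetime(2007, 5, 2, 13)),
    ("u1", "B", datetime(2007, 5, 3, 14)),
    ("u2", "C", datetime(2007, 5, 1, 10)),
]
toy_homes = detect_homes(toy_records)
print(toy_homes)
```

The `night_start` and `night_end` parameters are exactly the kind of researcher choice whose influence on results the chapter sets out to measure; counting detected homes per area and comparing the totals with census population counts is the validation step the abstract describes.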
Article
Well-being is an important value in people’s lives, and it can be considered an index of societal progress. Researchers have suggested two main approaches to the overall measurement of well-being: objective and subjective well-being. Both approaches, as well as their relevant dimensions, have traditionally been captured with surveys. During the last decades, new data sources have been suggested as an alternative or complement to traditional data. This paper aims to present the theoretical background of well-being by distinguishing between objective and subjective approaches, their relevant dimensions, the new data sources used for their measurement, and relevant studies. We also intend to shed light on still barely explored dimensions and data sources that could potentially contribute to public policy and social development.
Article
This study leverages mobile phone data to analyze human mobility patterns in a developing nation, especially in comparison to those of a more industrialized nation. Developing regions, such as the Ivory Coast, are marked by a number of factors that may influence mobility, such as less infrastructural coverage and maturity, fewer economic resources and less stability, and in some cases, more cultural and language-based diversity. By comparing mobile phone data collected from the Ivory Coast to similar data collected in Portugal, we are able to highlight both qualitative and quantitative differences in mobility patterns, such as differences in likelihood to travel and in the time required to travel, that are relevant to policy, infrastructure, and economic development. Our study illustrates how cultural and linguistic diversity in developing regions (such as the Ivory Coast) can present challenges to mobility models that perform well in, and were conceptualized for, less culturally diverse regions. Finally, we address these challenges by proposing novel techniques to assess the strength of borders in a regional partitioning scheme and to quantify the impact of border strength on mobility model accuracy.
Article
Significance Knowing where people are is critical for accurate impact assessments and intervention planning, particularly those focused on population health, food security, climate change, conflicts, and natural disasters. This study demonstrates how data collected by mobile phone network operators can cost-effectively provide accurate and detailed maps of population distribution over national scales and any time period while guaranteeing phone users’ privacy. The methods outlined may be applied to estimate human population densities in low-income countries where data on population distributions may be scarce, outdated, and unreliable, or to estimate temporal variations in population density. The work highlights how facilitating access to anonymized mobile phone data might enable fast and cheap production of population maps in emergency and data-scarce situations.
Article
The D4D-Senegal challenge is an open innovation data challenge on anonymous call patterns of Orange's mobile phone users in Senegal. The goal of the challenge is to help address society development questions in novel ways by contributing to the socio-economic development and well-being of the Senegalese population. Participants in the challenge are given access to three mobile phone datasets. This paper describes the three datasets. The datasets are based on Call Detail Records (CDR) of phone calls and text exchanges between more than 9 million of Orange's customers in Senegal between January 1, 2013 and December 31, 2013. The datasets are: (1) antenna-to-antenna traffic for 1666 antennas on an hourly basis, (2) fine-grained mobility data on a rolling 2-week basis for a year with bandicoot behavioral indicators at individual level for about 300,000 randomly sampled users, (3) one year of coarse-grained mobility data at arrondissement level with bandicoot behavioral indicators at individual level for about 150,000 randomly sampled users.
Article
The spatial dissemination of a directly transmitted infectious disease in a population is driven by population movements from one region to another allowing mixing and importation. Public health policy and planning may thus be more accurate if reliable descriptions of population movements can be considered in the epidemic evaluations. Next to census data, generally available in developed countries, alternative solutions can be found to describe population movements where official data is missing. These include mobility models, such as the radiation model, and the analysis of mobile phone activity records providing individual geo-temporal information. Here we explore to what extent mobility proxies, such as mobile phone data or mobility models, can effectively be used in epidemic models for influenza-like-illnesses and how they compare to official census data. By focusing on three European countries, we find that phone data matches the commuting patterns reported by census well but tends to overestimate the number of commuters, leading to a faster diffusion of simulated epidemics. The order of infection of newly infected locations is however well preserved, whereas the pattern of epidemic invasion is captured with higher accuracy by the radiation model for centrally seeded epidemics and by phone proxy for peripherally seeded epidemics.
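For readers unfamiliar with the radiation model mentioned above, a minimal sketch of its usual parameter-free form follows. The expected flux from location i to location j is T_i * m_i * n_j / ((m_i + s_ij) * (m_i + n_j + s_ij)), where T_i is the number of commuters leaving i, m_i and n_j are the populations of origin and destination, and s_ij is the population inside the circle centred on i with radius equal to the i to j distance, excluding both endpoints. The function name and the toy numbers are assumptions for illustration; the text does not state how the study calibrated or applied the model.

```python
def radiation_flux(commuters_out, pop_origin, pop_dest, pop_between):
    """Expected commuter flux from origin i to destination j (radiation model).

    commuters_out : total commuters leaving the origin (T_i)
    pop_origin    : population of the origin (m_i)
    pop_dest      : population of the destination (n_j)
    pop_between   : population within distance r_ij of the origin,
                    excluding origin and destination (s_ij)
    """
    if pop_origin <= 0 or pop_dest < 0 or pop_between < 0:
        raise ValueError("populations must be non-negative and origin > 0")
    numerator = pop_origin * pop_dest
    denominator = (pop_origin + pop_between) * (pop_origin + pop_dest + pop_between)
    return commuters_out * numerator / denominator


# Hypothetical toy values (NOT data from the study).
near = radiation_flux(commuters_out=10, pop_origin=100, pop_dest=50, pop_between=0)
far = radiation_flux(commuters_out=10, pop_origin=100, pop_dest=50, pop_between=400)
print(near, far)
```

The flux to a destination falls as more people live closer to the origin than that destination does, which is why the model needs only population counts and no fitted distance parameter.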
Conference Paper
The deep penetration of mobile phones offers cities the ability to opportunistically monitor citizens' interactions and use data-driven insights to better plan and manage services. In this context, transit operators can leverage pervasive mobile sensing to better match observed demand for travel with their service offerings. With large-scale data on mobility patterns, operators can move away from the costly and resource-intensive four-step transportation planning processes prevalent in the West to a more data-centric view that places the instrumented user at the center of development. In this framework, using mobile phone data to perform transit analysis and optimization represents a new frontier with significant societal impact, especially in developing countries. AllAboard is a system to optimize the planning of a public transit network using mobile phone data with the goal of improving ridership and user satisfaction. Mobile phone location data is used to infer origin-destination flows in the city, which are then converted to ridership on the existing transit network. Sequential travel patterns from individual call location data are used to propose new candidate transit routes. An optimization model evaluates how to improve the existing transit network to increase ridership and user satisfaction, both in terms of travel and wait time. The system has been tested for the city of Abidjan, Ivory Coast, with a focus on improving the existing SOTRA transit network.
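The origin-destination inference step can be sketched very roughly as counting transitions between consecutive locations in each user's call history. This is a simplified illustration under assumed inputs (the tuple layout, zone labels and the name `od_flows` are hypothetical), not the AllAboard method itself, whose details are not given in this text.

```python
from collections import Counter, defaultdict


def od_flows(events):
    """Count zone-to-zone transitions from (user_id, timestamp, zone_id) events.

    For each user, events are ordered by timestamp and every change of zone
    between two consecutive events adds one trip to the (origin, destination)
    pair. Consecutive events in the same zone are ignored.
    """
    visits_by_user = defaultdict(list)
    for user, timestamp, zone in events:
        visits_by_user[user].append((timestamp, zone))

    flows = Counter()
    for visits in visits_by_user.values():
        visits.sort()  # chronological order per user
        for (_, origin), (_, destination) in zip(visits, visits[1:]):
            if origin != destination:
                flows[(origin, destination)] += 1
    return flows


# Hypothetical toy events with integer timestamps (NOT data from the study).
toy_events = [
    ("u1", 1, "A"), ("u1", 2, "A"), ("u1", 3, "B"), ("u1", 4, "A"),
    ("u2", 1, "A"), ("u2", 5, "B"),
]
toy_flows = od_flows(toy_events)
print(dict(toy_flows))
```

Because phones are only located when they are used, flows counted this way are a sparse sample of real trips; any practical system would need to filter noise such as tower switching without movement, which this sketch does not attempt.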
Article
Reliable statistical information is important to make political decisions on a sound basis and to help measure the impact of policies. Unfortunately, statistics offices in developing countries have scarce resources and statistical censuses are therefore conducted sporadically. Based on mobile phone communications and history of airtime credit purchases, we estimate the relative income of individuals, the diversity and inequality of income, and an indicator for socioeconomic segregation for fine-grained regions of an African country. Our study shows how to use mobile phone datasets as a starting point to understand the socio-economic state of a country, which can be especially useful in countries with few resources to conduct large surveys.
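One common way to summarise inequality in a region is the Gini coefficient. The sketch below computes it over per-user airtime purchase totals as a stand-in for income; both the choice of the Gini coefficient and the use of purchase totals as the input are assumptions for illustration, since the abstract does not name the inequality measure the authors used.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = equal, near 1 = concentrated)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0 or xs[0] < 0:
        raise ValueError("need non-negative values with a positive sum")
    # Rank-weighted sum over the sorted values, ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical monthly airtime purchase totals per user in two regions
# (NOT data from the study).
region_equal = [5, 5, 5, 5]
region_unequal = [0, 0, 0, 10]
print(gini(region_equal), gini(region_unequal))
```

Computed per fine-grained region, a statistic of this kind gives a map of relative inequality, though purchase amounts reflect phone usage habits as well as income.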
Conference Paper
Human mobility is one of the key factors underlying the spread of diseases in a population. Containment strategies are usually devised using movement scenarios based on coarse-grained assumptions. Mobile phone data provide a unique opportunity for building models and defining strategies based on very precise information about the movement of people in a region or in a country. Another very important aspect is the underlying social structure of a population, which might play a fundamental role in devising information campaigns to promote vaccination and preventive measures, especially in countries with a strong family (or tribal) structure. In this paper we analyze a large-scale dataset describing the mobility and the call patterns of a large number of individuals in Ivory Coast. We present a model that describes how diseases spread across the country by exploiting mobility patterns of people extracted from the available data. Then, we simulate several epidemic scenarios and evaluate mechanisms to contain the epidemic spreading of diseases, based on the information about people's mobility and social ties, also gathered from the phone call data. More specifically, we find that restricting mobility does not delay the occurrence of an endemic state and that an information campaign based on one-to-one phone conversations among members of social groups might be an effective countermeasure.
Article
Governments and other organisations often rely on data collected by household surveys and censuses to identify areas in most need of regeneration and development projects. However, due to the high cost associated with the data collection process, many developing countries conduct such surveys very infrequently and include only a rather small sample of the population, thus failing to accurately capture the current socio-economic status of the country's population. In this paper, we address this problem by means of a methodology that relies on an alternative source of data from which to derive up to date poverty indicators, at a very fine level of spatio-temporal granularity. Taking two developing countries as examples, we show how to analyse the aggregated call detail records of mobile phone subscribers and extract features that are strongly correlated with poverty indexes currently derived from census data.
Article
The ubiquitous presence of cell phones in emerging economies has brought about a wide range of cell phone-based services for low-income groups. Often, the success of such technologies depends heavily on their adaptation to the needs and habits of each social group. To understand how cell phones are being used by citizens, we present a large-scale study analyzing the relationship between specific socio-economic factors and the way people use cell phones in an emerging economy in Latin America. We propose a novel analytical approach that combines large-scale datasets of cell phone records with countrywide census data to reveal findings at a national level. Our main results show correlations between socio-economic levels and social network or mobility patterns, among others. We also provide analytical models to accurately approximate census variables from cell phone records with R² ≈ 0.82.