The impact of biases in mobile phone ownership on estimates of human mobility

Department of Engineering and Public Policy, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15221, USA.
Journal of The Royal Society Interface (Impact Factor: 3.92). 01/2013; 10(81):20120986. DOI: 10.1098/rsif.2012.0986
Source: PubMed


Mobile phone data are increasingly being used to quantify the movements of human populations for a wide range of social, scientific and public health research. However, making population-level inferences using these data is complicated by differential ownership of phones among different demographic groups that may exhibit variable mobility. Here, we quantify the effects of ownership bias on mobility estimates by coupling two data sources from the same country during the same time frame. We analyse mobility patterns from one of the largest mobile phone datasets studied, representing the daily movements of nearly 15 million individuals in Kenya over the course of a year. We couple this analysis with the results from a survey of socioeconomic status, mobile phone ownership and usage patterns across the country, providing regional estimates of population distributions of income, reported airtime expenditure and actual airtime expenditure across the country. We match the two data sources and show that mobility estimates are surprisingly robust to the substantial biases in phone ownership across different geographical and socioeconomic groups.



Available from: Abdisalan Noor, Jul 28, 2014
    • "Researchers are studying migration movements following disasters as a way to understand the spread of infectious disease. For instance, Buckee and her research team (Wesolowski et al., 2013; Buckee et al., 2013) used location data from mobile phones to explore the human moving patterns in Kenya to prevent malaria and other diseases from spreading. Information collected on human travel patterns from mobile phone usage are being utilized to develop predictive models to further fight malaria in the region (Talbot, 2013). "
    ABSTRACT: The number of smartphone users in Malaysia had already exceeded 50% of total mobile service subscribers in 2014, and this number is growing rapidly (Google, 2014). According to a joint study by Google and TNS, Malaysia ranked first worldwide for smartphone Internet access exclusivity: 35% of smartphone users rely on their devices as their only means of accessing the Internet (Google, 2014; Lee, 2014). In fact, Malaysia, Singapore, Hong Kong, China and South Korea are the only nations worldwide that use smartphones more than computers as the primary device for accessing the Internet (Google, 2014; Lee, 2014). Based on Google’s consumer barometer, more than 60% of Malaysians across all ages use the Internet every day, and more than 70% of those in their 20s connect to the Internet every day (Google, 2014). Clearly, Malaysians are at the forefront of smartphone and Internet usage. This level of smartphone usage implies a mass of data flowing through the telecommunication network every millisecond via user activities such as calls, messages, application downloads and usage, social networking posts, geo-position data, web browsing history, and billing records. These digital footprints are rich data that can be used to create innovative services for end users. For instance, telcos Orange (for Abidjan, Cote d'Ivoire) and Korea Telecom offered access to anonymized data containing user records of text messages exchanged and local calls to support transport planning, reduce traffic congestion and optimize night bus routes (FutureGov, 2014). Similarly, XO Communications in the US and Ufone in Pakistan used consumer telecommunications data to analyze the churn rate and identify factors that increase customer retention (IBM, 2015). Smartphone data can also be used to analyze migration patterns for managing and monitoring the impacts of global and local socio-economic crises. 
Therefore, data from user smartphone activities could provide ample business opportunities to improve the quality of services and the quality of life. Telcos, with their ability to collect, store, process, and maintain customer data, are bound to be the biggest beneficiaries of this big data trend (Brown et al., 2011). However, the extent to which telcos can currently leverage data is limited by consumer data privacy and protection as well as security challenges. The consequences for privacy in the Big Data era are not fully understood and the policies are under-developed (Kshetri, 2014). Recently, developed countries such as Japan, the UK, the USA, and South Korea have initiated specific guidelines for big data personal information protection via standards committees and working groups (KISDI, 2014). Malaysia, however, still lags behind other countries in enacting specific guidelines and policies on consumer privacy protection. Even though its government officially enforced the Personal Data Protection Act 2010 (PDPA) on November 15th, 2013, its implementation remains challenging and its effect unclear amid today’s fast-changing technologies such as cloud computing, big data analytics, and the Internet of Things (IoT). The immediate task now is to understand the state of the PDPA implementation before the industry can move forward to leverage consumer data. This understanding is novel and essential because policy is the core facilitating factor in promoting better personal information protection (Xu et al., 2011). Therefore, the objective of this research is to examine the state of the PDPA implementation and its effect on the Malaysian telecommunications sector from different stakeholder perspectives. We seek to answer the following questions: • What is the state of the current PDPA implementation in the telecommunications sector? How does it compare to the privacy protection efforts in other countries? • What are the challenges faced in the implementation from different stakeholders’ perspectives? 
What would be the suggested solutions? Giddens’ (1976) Structuration Theory (ST) and the Competing Value Framework (CVF) (Quinn and Rohrbaugh, 1983) will serve as the theoretical foundations for our research. ST analyzes the interplay between the structure and the agents to understand how both dimensions influence each other. When applied to the context of the telecommunications sector, the structure is the policies and regulations enacted by the government, while the agents are stakeholders such as the telcos and the consumers who function within the structure. The duality concept in ST suggests that the structure imposes certain limitations on how the agents will act, but the agents could, over time, alter the structure to adapt to new changes. Since the structure and the agents (and even the agents among themselves) have competing interests, CVF will help to dissect how different interests conflict with each other. We will adopt a two-phase research approach. In phase one, we will conduct an intensive literature review on policies, standards, and practices of personal information and privacy protection in different countries. We will compare the strengths and weaknesses of the Malaysian privacy protection policy with the policies of other countries. In phase two, we will follow the Delphi methodology to conduct in-depth interviews with policy makers and the management groups of the four largest telcos in Malaysia (Celcom, Maxis, Digi, and U-Mobile). The goal is two-fold: (1) gauge stakeholders’ viewpoints on the state of PDPA implementation and the challenges faced, and (2) understand stakeholders’ expectations of privacy protection. ST and the CVF will frame our interview questions. The expected results of this study are (1) a comparison of the regulations and policies on data privacy protection within the telecommunications sector in Malaysia and other countries; and (2) a list of the challenges and expectations different stakeholders have regarding data privacy protection.
    No preview · Conference Paper · Jun 2015
    • "In fact, recent research examines heterogeneous mobile phone owners according to user attributes, such as gender and socioeconomic status, comparing mobile users to the general population [5][17]. This allows us to understand that such heterogeneity impacts the estimation results of human mobility analyses using CDRs [18]. This means that the interpretation and analysis of results may be misleading if there is no clear understanding of which parts of society the data represent. "
    ABSTRACT: The understanding of mass population movements has greatly advanced with the rapid spread of ubiquitous devices. Anonymized call detail records (CDRs) for mobile phones have enabled us to not only trace individual trajectories but also approximate activity patterns, including significant locations such as homes and workplaces. The majority of studies analyzing CDRs attempt to utilize the mobility patterns of anonymized crowds to improve transportation and public health. This is quite reasonable because CDRs can capture the movements of people at given times and places, whereas general statistics usually account for a population based on their locations of residence. However, it has also been pointed out that there are discrepancies between the movements of people as observed through CDRs and those of an entire population in a given area. This is because CDRs only represent device users. In fact, we can never learn about the population that is unobservable through CDRs only by analyzing CDRs. Therefore, this study attempts to provide clues to help us understand the whereabouts of the unobservable population by analyzing two months of the CDRs for 58 volunteers with mobile device service from a major telecommunications company in combination with field survey data from Dhaka. We surveyed the personal and household attributes of mobile users in relation to their calling behavior. The analysis results show that per mobile user observed in CDRs, there is an average of roughly 2.4 to 2.8 unobservable people. Their age groups and gender composition are also provided. We find that male and female users exhibit opposite trends in call locations according to the presence of children within the household. In addition, based on field observations, we find that the location and time distributions of small children follow some specific routines. 
Our findings contribute to the understanding of the whereabouts of the unobservable population, the majority of whom are children and are considered to be vulnerable to disasters or infectious diseases but are difficult to locate through CDRs alone.
    Full-text · Conference Paper · Mar 2015
    • "05/13 Orange Ivory Coast 5 months (2012) 500 K Liu [85] 05/13 Orange Ivory Coast 5 months (2012) 500 K Kung [140] 05/13 Orange, STC, AirSage Several countries months (2006-2013) 18 M Validation Tizzoni [77] 09/13 Orange 3 cities – 6.8 M Douglass [120] 05/15 Telecom Italia Milan, Italy 2 months (2013 – Wesolowski [122] "
    ABSTRACT: This report surveys the literature on analyses of mobile traffic collected by operators within their network infrastructure. This is a recently emerged research field, and, apart from a few outliers, relevant works cover the period from 2005 to date, with a marked densification over the last three years. We provide a thorough review of the multidisciplinary activities that rely on mobile traffic datasets, identifying major categories and sub-categories in the literature, so as to outline a hierarchical classification of research lines. When detailing the works pertaining to each class, we balance a comprehensive view of state-of-the-art results with targeted focuses on the methodological aspects. Our approach provides a complete introductory guide to the research based on mobile traffic analysis. It allows summarizing the main findings of the current state-of-the-art, as well as pinpointing important open research directions.
    Preview · Article · Mar 2015 · IEEE Communications Surveys & Tutorials