The impact of biases in mobile phone ownership on estimates of human mobility

Department of Engineering and Public Policy, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15221, USA.
Journal of The Royal Society Interface (Impact Factor: 3.92). 01/2013; 10(81):20120986. DOI: 10.1098/rsif.2012.0986
Source: PubMed


Mobile phone data are increasingly being used to quantify the movements of human populations for a wide range of social, scientific and public health research. However, making population-level inferences using these data is complicated by differential ownership of phones among different demographic groups that may exhibit variable mobility. Here, we quantify the effects of ownership bias on mobility estimates by coupling two data sources from the same country during the same time frame. We analyse mobility patterns from one of the largest mobile phone datasets studied, representing the daily movements of nearly 15 million individuals in Kenya over the course of a year. We couple this analysis with the results from a survey of socioeconomic status, mobile phone ownership and usage patterns across the country, providing regional estimates of population distributions of income, reported airtime expenditure and actual airtime expenditure across the country. We match the two data sources and show that mobility estimates are surprisingly robust to the substantial biases in phone ownership across different geographical and socioeconomic groups.



Available from: Abdisalan Noor, Jul 28, 2014
    • "Researchers are studying migration movements following disasters as a way to understand the spread of infectious disease. For instance, Buckee and her research team (Wesolowski et al., 2013; Buckee et al., 2013) used location data from mobile phones to explore the human moving patterns in Kenya to prevent malaria and other diseases from spreading. Information collected on human travel patterns from mobile phone usage are being utilized to develop predictive models to further fight malaria in the region (Talbot, 2013). "
    ABSTRACT: The number of smartphone users in Malaysia had already reached over 50% of total mobile service subscribers in 2014, and this number is growing rapidly (Google, 2014). According to a joint study by Google and TNS, Malaysia ranked first worldwide for smartphone Internet access exclusivity, with 35% of smartphone users relying on their devices as their only means of accessing the Internet (Google, 2014; Lee, 2014). In fact, Malaysia, Singapore, Hong Kong, China and South Korea are the only nations worldwide where smartphones are used more than computers as the primary device for accessing the Internet (Google, 2014; Lee, 2014). Based on Google's consumer barometer, more than 60% of Malaysians across all ages use the Internet every day, and more than 70% of those in their 20s connect to the Internet every day (Google, 2014). Clearly, Malaysians are at the forefront of smartphone and Internet usage. This level of smartphone usage implies a mass of data flowing through the telecommunications network every millisecond via user activities such as calls, messages, application downloads and usage, social networking posts, geo-position data, web browsing history, and billing records. These digital footprints are rich data that can be used to create innovative services for end users. For instance, the telcos Orange (for Abidjan, Cote d'Ivoire) and Korea Telecom offered access to anonymized data containing user records of text messages exchanged and local calls to support transport planning, reduce traffic congestion and optimize night bus routes (FutureGov, 2014). Similarly, XO Communications in the US and Ufone in Pakistan used consumer telecommunications data to analyze churn rates and identify factors that increase customer retention (IBM, 2015). Smartphone data can also be used to analyze migration patterns for managing and monitoring the impacts of global and local socio-economic crises. Therefore, data from user smartphone activities could provide ample business opportunities to improve the quality of services and the quality of life. Telcos, with their ability to collect, store, process, and maintain customer data, are bound to be the biggest beneficiaries of this big data trend (Brown et al., 2011). However, telcos' current ability to leverage data is limited by consumer data privacy and protection as well as security challenges. The consequences for privacy in the Big Data era are not fully understood and the policies are under-developed (Kshetri, 2014). Recently, developed countries such as Japan, the UK, the USA, and South Korea have initiated specific guidelines for big data personal information protection via standards committees and working groups (KISDI, 2014). Malaysia, however, still lags behind other countries in enacting specific guidelines and policies on consumer privacy protection. Even though its government officially enforced the Personal Data Protection Act 2010 (PDPA) on November 15th, 2013, its implementation remains challenging and its effect unclear given today's fast-changing technologies such as cloud computing, big data analytics, and the Internet of Things (IoT). The immediate task now is to understand the state of the PDPA implementation before the industry can move forward to leverage consumer data. This understanding is novel and essential because policy is the core facilitating factor in promoting better personal information protection (Xu et al., 2011). Therefore, the objective of this research is to examine the state of the PDPA implementation and its effect on the Malaysian telecommunications sector from different stakeholder perspectives. We seek to answer the following questions: (1) What is the state of the current PDPA implementation in the telecommunications sector, and how does it compare with privacy protection efforts in other countries? (2) What are the challenges faced in the implementation from different stakeholders' perspectives, and what would be the suggested solutions? Giddens' (1976) Structuration Theory (ST) and the Competing Value Framework (CVF) (Quinn and Rohrbaugh, 1983) will serve as the theoretical foundations for our research. ST analyzes the interplay between the structure and the agents to understand how both dimensions influence each other. When applied to the context of the telecommunications sector, the structure is the policies and regulations enacted by the government, while the agents are stakeholders such as the telcos and the consumers who function within the structure. The duality concept in ST suggests that the structure imposes certain limitations on how the agents will act, but the agents can, over time, alter the structure to adapt to new changes. Since the structure and the agents (and even the agents among themselves) have competing interests, CVF will help to dissect how different interests conflict with each other. We will adopt a two-phase research approach. In phase one, we will conduct an intensive literature review on policies, standards, and practices of personal information and privacy protection in different countries. We will compare the strengths and weaknesses of the Malaysian privacy protection policy with the policies of other countries. In phase two, we will follow the Delphi methodology to conduct in-depth interviews with policy makers and the management groups of the four largest telcos in Malaysia (Celcom, Maxis, Digi, and U-Mobile). The goal is two-fold: (1) gauge stakeholders' viewpoints on the state of PDPA implementation and the challenges faced, and (2) understand stakeholders' expectations of privacy protection. ST and the CVF will frame our interview questions. The expected results of this study are (1) a comparison of the regulations and policies on data privacy protection within the telecommunications sector in Malaysia and other countries; and (2) a list of the challenges and expectations different stakeholders have regarding data privacy protection.
    26th European Regional Conference of the International Telecommunications Society, San Lorenzo de El Escorial, Spain; 06/2015
    • "In fact, recent research examines heterogeneous mobile phone owners according to user attributes, such as gender and socioeconomic status, comparing mobile users to the general population [5][17]. This allows us to understand that such heterogeneity impacts the estimation results of human mobility analyses using CDRs [18]. This means that the interpretation and analysis of results may be misleading if there is no clear understanding of which parts of society the data represent. "
    ABSTRACT: The understanding of mass population movements has greatly advanced with the rapid spread of ubiquitous devices. Anonymized call detail records (CDRs) for mobile phones have enabled us to not only trace individual trajectories but also approximate activity patterns, including significant locations such as homes and workplaces. The majority of studies analyzing CDRs attempt to utilize the mobility patterns of anonymized crowds to improve transportation and public health. This is quite reasonable because CDRs can capture the movements of people at given times and places, whereas general statistics usually account for a population based on their locations of residence. However, it has also been pointed out that there are discrepancies between the movements of people as observed through CDRs and those of an entire population in a given area. This is because CDRs only represent device users. In fact, we can never learn about the population that is unobservable through CDRs only by analyzing CDRs. Therefore, this study attempts to provide clues to help us understand the whereabouts of the unobservable population by analyzing two months of the CDRs for 58 volunteers with mobile device service from a major telecommunications company in combination with field survey data from Dhaka. We surveyed the personal and household attributes of mobile users in relation to their calling behavior. The analysis results show that per mobile user observed in CDRs, there is an average of roughly 2.4 to 2.8 unobservable people. Their age groups and gender composition are also provided. We find that male and female users exhibit opposite trends in call locations according to the presence of children within the household. In addition, based on field observations, we find that the location and time distributions of small children follow some specific routines. Our findings contribute to the understanding of the whereabouts of the unobservable population, the majority of whom are children and are considered to be vulnerable to disasters or infectious diseases but are difficult to locate through CDRs alone.
    2015 IEEE International Conference on Pervasive Computing and Communications (PerCom2015), St. Louis, USA; 03/2015
    • "Recent data on the typical ages of mobile phone owners in Namibia were obtained from the Universal Service Baseline Study of the Communications Regulatory Authority of Namibia [41] and showed that while the majority of users were between 20 and 30 years old, there was a broad spread across age groups (Additional file 2). Moreover, recent analyses suggest that such biases may have a limited effect on general estimates of human mobility [43]. "
    ABSTRACT: As successful malaria control programmes re-orientate towards elimination, the identification of transmission foci, targeting of attack measures to high-risk areas and management of importation risk become high priorities. When resources are limited and transmission is varying seasonally, approaches that can rapidly prioritize areas for surveillance and control can be valuable, and the most appropriate attack measure for a particular location is likely to differ depending on whether it exports or imports malaria infections. Methods/Results: Here, using the example of Namibia, a method for targeting of interventions using surveillance data, satellite imagery, and mobile phone call records to support elimination planning is described. One year of aggregated movement patterns for over a million people across Namibia are analyzed, and linked with case-based risk maps built on satellite imagery. By combining case-data and movement, the way human population movements connect transmission risk areas is demonstrated. Communities that were strongly connected by relatively higher levels of movement were then identified, and net export and import of travellers and infection risks by region were quantified. These maps can aid the design of targeted interventions to maximally reduce the number of cases exported to other regions while employing appropriate interventions to manage risk in places that import them. The approaches presented can be rapidly updated and used to identify where active surveillance for both local and imported cases should be increased, which regions would benefit from coordinating efforts, and how spatially progressive elimination plans can be designed. With improvements in surveillance systems linked to improved diagnosis of malaria, detailed satellite imagery being readily available and mobile phone usage data continually being collected by network providers, the potential exists to make operational use of such valuable, complementary and contemporary datasets on an ongoing basis in infectious disease control and elimination.
    Malaria Journal 02/2014; 13(1):52. DOI:10.1186/1475-2875-13-52 · 3.11 Impact Factor