Conference Paper

REAPER: Real-time App Analysis for Augmenting the Android Permission System

Abstract

Android's app ecosystem relies heavily on third-party libraries as they facilitate code development and provide a steady stream of revenue for developers. However, while Android has moved towards a more fine-grained run time permission system, users currently lack the required resources for deciding whether a specific permission request is actually intended for the app itself or is requested by possibly dangerous third-party libraries. In this paper we present Reaper, a novel dynamic analysis system that traces the permissions requested by apps in real time and distinguishes those requested by the app's core functionality from those requested by third-party libraries linked with the app. We implement a sophisticated UI automator and conduct an extensive evaluation of our system's performance and find that Reaper introduces negligible overhead, rendering it suitable both for end users (by integrating it in the OS) and for deployment as part of an official app vetting process. Our study on over 5K popular apps demonstrates the large extent to which personally identifiable information is being accessed by libraries and highlights the privacy risks that users face. We find that an impressive 65% of the permissions requested do not originate from the core app but are issued by linked third-party libraries, 37.3% of which are used for functionality related to ads, tracking, and analytics. Overall, Reaper enhances the functionality of Android's run time permission model without requiring OS or app modifications, and provides the necessary contextual information that can enable users to selectively deny permissions that are not part of an app's core functionality.
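
The abstract does not say how a permission request is attributed to the core app or to a library. One plausible approach, shown here only as a minimal sketch, is to inspect the call stack captured when a permission-protected API is hooked and let the first non-framework frame decide the origin. The prefix table, the framework prefix list, and the function name `attribute_request` are all hypothetical and are not taken from the paper.

```python
# Hypothetical library package prefixes mapped to a category (assumption:
# a real deployment would use a much larger, curated list).
KNOWN_LIBRARY_PREFIXES = {
    "com.google.android.gms.ads": "ads",
    "com.flurry": "analytics",
}

# Frames from the platform itself say nothing about who initiated the call.
FRAMEWORK_PREFIXES = ("android.", "java.", "com.android.", "dalvik.")


def attribute_request(stack, app_package):
    """Attribute one hooked, permission-protected call to its origin.

    `stack` lists fully qualified method names, innermost frame first.
    Returns ("core", None), ("library", category) or ("unknown", None).
    """
    for frame in stack:
        if frame.startswith(FRAMEWORK_PREFIXES):
            continue  # skip platform frames and keep walking outwards
        for prefix, category in KNOWN_LIBRARY_PREFIXES.items():
            if frame.startswith(prefix + "."):
                return ("library", category)
        if frame.startswith(app_package + "."):
            return ("core", None)
        # Neither the app's own package nor a listed library.
        return ("library", "unknown")
    return ("unknown", None)


if __name__ == "__main__":
    trace = [
        "android.location.LocationManager.getLastKnownLocation",
        "com.flurry.sdk.a.b",
        "com.example.app.MainActivity.onCreate",
    ]
    print(attribute_request(trace, "com.example.app"))
```

In this sketch the library frame sits closer to the protected call than the app's own frame, so the request is attributed to the library even though app code appears further up the stack.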

... JIT-MF relies on a combination of static and dynamic instrumentation of compiled code to implement trigger points as in-line function hooks and the memory dumping process in the form of instrumentation code. Binary instrumentation also underpins various other Android security techniques where it is required to operate within real-time parameters (Diamantaris et al., 2019;Heuser et al., 2014;Chen et al., 2017;Li et al., 2015). Furthermore, JIT-MF considers the temporal aspect of volatile memory forensic collection concerned with the timely collection of ephemeral artifacts. ...
... Yet, other attack vectors may also present the same opportunity for LOtL tactics. Zygote and binder infection combined with a rooting exploit (Kaspersky, 2016), as well as app-level virtualization frameworks (Shi et al., 2019) and third-party library infections (Diamantaris et al., 2019) provide further attack vectors, resulting in similarly stealthy attacks and for which JIT-MF could be a solution in terms of incident response. ...
Preprint
Full-text available
Digital investigations of stealthy attacks on Android devices pose particular challenges to incident responders. Whereas consequential late detection demands accurate and comprehensive forensic timelines to reconstruct all malicious activities, reduced forensic footprints with minimal malware involvement, such as when Living-Off-the-Land (LOtL) tactics are adopted, leave investigators little evidence to work with. Volatile memory forensics can be an effective approach since app execution of any form is always bound to leave a trail of evidence in memory, even if perhaps ephemeral. Just-in-Time Memory Forensics (JIT-MF) is a recently proposed technique that describes a framework to process memory forensics on existing stock Android devices, without compromising their security by requiring them to be rooted. Within this framework, JIT-MF drivers are designed to promptly dump in-memory evidence related to app usage or misuse. In this work, we primarily introduce a conceptualized presentation of JIT-MF drivers. Subsequently, through a series of case studies involving the hijacking of widely-used messaging apps, we show that when the target apps are forensically enhanced with JIT-MF drivers, investigators can generate richer forensic timelines to support their investigation, which are on average 26% closer to ground truth.
... Users who installed the apps with AFP configuration can identify and customize fine-grained permission levels on private or confidential resources [79,80]. In the same year, [81] introduced a dynamic analysis approach that monitors the permissions requested by apps at run time and distinguishes those requested by the app's fundamental functionality from those demanded by third-party libraries linked with the app [81]. In 2016, [8] proposed a framework that enforces fine-grained security and privacy policies and enables users to manage applications' access to sensitive elements. ...
Article
Full-text available
Android mobile apps gain access to numerous users’ private data. Users of different Android mobile apps have little control over their sensitive data during installation and at run time. Too often, these apps take data privacy less seriously than users expect. Many mobile apps misbehave and upload users’ data without permission, which confirms the possibility of privacy leakage through different network channels. The literature has proposed various approaches to protect users’ data and avoid privacy violations. In this paper, we provide a comprehensive overview of state-of-the-art research on Android user privacy and data flow control. The aim is to highlight the main trends, pinpoint the main methodologies applied, and enumerate the privacy violations faced by Android users. We also shed some light on the directions where the research community’s effort is still needed. To this end, we conduct a Systematic Literature Review (SLR) during which we surveyed 114 relevant research papers published in leading conferences and journals. Our thorough examination of the relevant literature has led to a critical analysis of the proposed solutions with a focus on user privacy extensions and mechanisms for the Android mobile platform. Furthermore, possible solutions and research directions have been discussed.
... Several studies investigated Android application permission system. REAPER [36] is a tool that traces the permissions requested by apps in real time and discriminates them from other permissions requested by third-party libraries linked with the app. Fasano et al. [37] proposed a formal method to detect the exact point in the code of an Android application where a permission is invoked at runtime. ...
Article
Full-text available
Android applications have recently witnessed a pronounced progress, making them among the fastest growing technological fields to thrive and advance. However, such level of growth does not evolve without some cost. This particularly involves increased security threats that the underlying applications and their users usually fall prey to. As malware becomes increasingly more capable of penetrating these applications and exploiting them in suspicious actions, the need for active research endeavors to counter these malicious programs becomes imminent. Some of the studies are based on dynamic analysis, and others are based on static analysis, while some are completely dependent on both. In this paper, we studied static, dynamic, and hybrid analyses to identify malicious applications. We leverage machine learning classifiers to detect malware activities as we explain the effectiveness of these classifiers in the classification process. Our results prove the efficiency of permissions and the action repetition feature set and their influential roles in detecting malware in Android applications. Our results show empirically very close accuracy results when using static, dynamic, and hybrid analyses. Thus, we use static analyses due to their lower cost compared to dynamic and hybrid analyses. In other words, we found the best results in terms of accuracy and cost (the trade-off) make us select static analysis over other techniques.
... Designed to generate descriptions for security-relevant implementation parts, the approach of Zhang et al. [34] evaluates a graphbased app representation. To recognize critical app behavior at runtime, Diamantaris et al. [6] hook calls to privacy-relevant APIs and inspect security-critical parameter values via backtracking. ...
Conference Paper
Full-text available
Permissions are a key factor in Android to protect users' privacy. As it is often not obvious why applications require certain permissions, developer-provided descriptions in Google Play and third-party markets should explain to users how sensitive data is processed. Reliably recognizing whether app descriptions cover permission usage is challenging due to the lack of enforced quality standards and a variety of ways developers can express privacy-related facts. We introduce a machine learning-based approach to identify critical discrepancies between developer-described app behavior and permission usage. By combining state-of-the-art techniques in natural language processing (NLP) and deep learning, we design a convolutional neural network (CNN) for text classification that captures the relevance of words and phrases in app descriptions in relation to the usage of dangerous permissions. Our system predicts the likelihood that an app requires certain permissions and can warn about descriptions in which the requested access to sensitive user data and system features is textually not represented. We evaluate our solution on 77,000 real-world app descriptions and find that we can identify individual groups of dangerous permissions with a precision between 71% and 93%. To highlight the impact of individual words and phrases, we employ a model explanation algorithm and demonstrate that our technique can successfully bridge the semantic gap between described app functionality and its access to security- and privacy-sensitive resources.
... They ranked permissions according to its usage and used a priori association rules to extract significant permissions for classification of malign and benign apps. Diamantaris et al. [11] distinguished permissions required by the core functions of the Android apps and integrated by third-party libraries. They showed 30 topmost permissions used by core and third-party libraries. ...
... A smaller number of works go a step further by collecting and analyzing more contextual information to determine a potential privacy violation. For example, some of them focus on detecting who is accessing the personal data (by distinguishing third-party libraries from the app itself) [35]-[37], while others focus on discriminating whether access to certain personal data is required for the app's core functionality or another secondary (third-party) task [38]. The Google Play Protect (GPP) approach [39] is also aligned with this paradigm, aiming at detecting potentially harmful behaviour in the Android ecosystem, including the disclosure of personal data off the device via spyware. ...
Article
Full-text available
The pervasiveness of Android mobile applications and the services they support allow the personal data of individuals to be collected and shared worldwide. However, data protection legislations usually require all participants in a personal data flow to ensure an equivalent level of personal data protection, regardless of location. In particular, the European General Data Protection Regulation constrains cross-border transfers of personal data to non-EU countries and establishes specific requirements to carry them out. This paper presents a method to systematically assess compliance of Android mobile apps with the requirements for cross-border transfers established by the European data protection regulation. We have validated the method with one hundred Android apps, finding an outstanding 66% of ambiguous, inconsistent and omitted cross-border transfer disclosures.
... These studies on Android permissions have mainly leveraged static analysis techniques to understand the role of a given permission [7], [22], [25], potential privacy violation incurred by overprivileged apps [22], [59], permission circumvention [55], description-to-permission fidelity [53], and improve mapping of Android permissions to framework/SDK API methods [8], [2]. Some recent research efforts also utilize dynamic analysis systems to distinguish and trace the permissions requested by apps at the runtime and those requested by the app's core functionality [17] and generate a more precise call graph enabling the system to extract the permission specification and improve the mapping [41]. Our study complements these studies by showing the scales and the prevalence of private information collection in the real world devices. ...
Preprint
Full-text available
Mobile phones enable the collection of a wealth of private information, from unique identifiers (e.g., email addresses), to a user's location, to their text messages. This information can be harvested by apps and sent to third parties, which can use it for a variety of purposes. In this paper we perform the largest study of private information collection (PIC) on Android to date. Leveraging an anonymized dataset collected from the customers of a popular mobile security product, we analyze the flows of sensitive information generated by 2.1M unique apps installed by 17.3M users over a period of 21 months between 2018 and 2019. We find that 87.2% of all devices send private information to at least five different domains, and that actors active in different regions (e.g., Asia compared to Europe) are interested in collecting different types of information. The United States (62% of the total) and China (7% of total flows) are the countries that collect most private information. Our findings raise issues regarding data regulation, and would encourage policymakers to further regulate how private information is used by and shared among the companies and how accountability can be truly guaranteed.
... These studies on Android permissions have mainly leveraged static analysis techniques to understand the role of a given permission [8], [23], [26], potential privacy violation incurred by overprivileged apps [23], [64], permission circumvention [59], description-to-permission fidelity [56], and improve mapping of Android permissions to framework/SDK API methods [9], [2]. Some recent research efforts also utilize dynamic analysis systems to distinguish and trace the permissions requested by apps at runtime and those requested by the app's core functionality [18] and generate a more precise call graph enabling the system to extract the permission specification and improve the mapping [43]. Our study complements these studies by showing the scales and the prevalence of private information collection in the real world devices. ...
Conference Paper
Full-text available
Mobile phones enable the collection of a wealth of private information, from unique identifiers (e.g., email addresses), to a user's location, to their text messages. This information can be harvested by apps and sent to third parties, which can use it for a variety of purposes. In this paper we perform the largest study of private information collection (PIC) on Android to date. Leveraging an anonymized dataset collected from the customers of a popular mobile security product, we analyze the flows of sensitive information generated by 2.1M unique apps installed by 17.3M users over a period of 21 months between 2018 and 2019. We find that 87.2% of all devices send private information to at least five different domains, and that actors active in different regions (e.g., Asia compared to Europe) are interested in collecting different types of information. The United States (62% of the total) and China (7% of total flows) are the countries that collect most private information. Our findings raise issues regarding data regulation, and would encourage policymakers to further regulate how private information is used by and shared among the companies and how accountability can be truly guaranteed.
... Similar to monitors like REAPER [11] and MOSES [25], JIT-MF uses trigger points which, rather than being indicators for malicious events, such as permission misuse, are indicators of benign events that may be misused by an attacker. In contrast to typical monitors, JIT-MF dumps necessary memory contents for post-analysis at runtime, which is less costly than online analysis. ...
Chapter
Full-text available
Attackers regularly target Android phones and come up with new ways to bypass detection mechanisms to achieve long-term stealth on a victim’s phone. One way attackers do this is by leveraging critical benign app functionality to carry out specific attacks. In this paper, we present a novel generalised framework, JIT-MF (Just-in-time Memory Forensics), which aims to address the problem of timely collection of short-lived evidence in volatile memory to solve the stealthiest of Android attacks. The main components of this framework are i) Identification of critical data objects in memory linked with critical benign application steps that may be misused by an attacker; and ii) Careful selection of trigger points, which identify when memory dumps should be taken during benign app execution. The effectiveness and cost of trigger point selection, a cornerstone of this framework, are evaluated in a preliminary qualitative study using Telegram and Pushbullet as the victim apps targeted by stealthy malware. Our study identifies that JIT-MF is successful in dumping critical data objects on time, providing evidence that eludes all other forensic sources. Experimentation offers insight into identifying categories of trigger points that can strike a balance between the effort required for selection and the resulting effectiveness and storage costs. Several optimisation measures for the JIT-MF tools are presented, considering the typical resource constraints of Android devices.
... The results show that Aper can significantly outperform existing tools, and find real bugs in popular Android projects with useful debugging information. In the future, we plan to extend Aper to support more types of runtime permission bugs, such as library-induced [39] or device-specific [65] bugs. ...
Preprint
The Android platform introduced the runtime permission model in version 6.0. The new model greatly improves data privacy and user experience, but brings new challenges for app developers. First, it allows users to freely revoke granted permissions. Hence, developers cannot assume that the permissions granted to an app will remain granted. Instead, they should make their apps carefully check the permission status before invoking dangerous APIs. Second, the permission specification keeps evolving, bringing new types of compatibility issues into the ecosystem. To understand the impact of these challenges, we conducted an empirical study on 13,352 popular Google Play apps. We found that 86.0% of apps used dangerous APIs asynchronously after permission management and 61.2% of apps used evolving dangerous APIs. If an app does not properly handle permission revocations or platform differences, unexpected runtime issues may happen and even cause app crashes. We call such Android Runtime Permission issues ARP bugs. Unfortunately, existing runtime permission issue detection tools cannot effectively deal with the ARP bugs induced by asynchronous permission management and permission specification evolution. To fill the gap, we designed a static analyzer, Aper, that performs reaching definition and dominator analysis on Android apps to detect the two types of ARP bugs. To compare Aper with existing tools, we built a benchmark, ARPfix, from 60 real ARP bugs. Our experiment results show that Aper significantly outperforms two academic tools, ARPDroid and RevDroid, and an industrial tool, Lint, on ARPfix, with an average improvement of 46.3% on F1-score. In addition, Aper successfully found 34 ARP bugs in 214 open-source Android apps, most of which can result in abnormal app behaviors (such as app crashes) according to our manual validation.
... Each mobile HTML5 WebAPI is associated with a low-level Android API call. In order to validate the results of the JavaScript interception and to identify which ones require a permission, we use the PermissionHarvester [26] module that hooks every Android permission protected API call. Since access to some of the sensors does not require an Android permission, we also manually identified and hooked the functions that give access to non-permission-protected sensor data. ...
Conference Paper
Smartphone sensors can be leveraged by malicious apps for a plethora of different attacks, which can also be deployed by malicious websites through the HTML5 WebAPI. In this paper we provide a comprehensive evaluation of the multifaceted threat that mobile web browsing poses to users, by conducting a large-scale study of mobile-specific HTML5 WebAPI calls used in the wild. We build a novel testing infrastructure consisting of actual smartphones on top of a dynamic Android app analysis framework, allowing us to conduct an end-to-end exploration. Our study reveals the extent to which websites are actively leveraging the WebAPI for collecting sensor data, with 2.89% of websites accessing at least one mobile sensor. To provide a comprehensive assessment of the potential risks of this emerging practice, we create a taxonomy of sensor-based attacks from prior studies, and present an in-depth analysis by framing our collected data within that taxonomy. We find that 1.63% of websites could carry out at least one of those attacks. Our findings emphasize the need for a standardized policy across browsers and the ability for users to control what sensor data each website can access.
Chapter
The Android ecosystem is dynamic and diverse. Controls have been set in place to allow mobile device users to regulate exchanged data and restrict apps from accessing sensitive personal information and system resources. Modern versions of the operating system implement the run-time permission model which prompts users to allow access to protected resources the moment an app attempts to utilize them. It is assumed that, in general, the run-time permission model, compared to its predecessor, enhances users’ security awareness. In this paper we show that installed apps on Android devices are able to employ the systems’ public assets and extract users’ permission settings. Then we utilize permission data from 71 Android devices to create privacy profiles based on users’ interaction with permission dialogues initiated by the system during run-time. Therefore, we demonstrate that any installed app that runs on the foreground can perform an endemic live digital forensic analysis on the device and derive similar privacy profiles of the user. Moreover, focusing on the human factors of security, we show that although in theory users can control the resources they make accessible to apps, they eventually fail to successfully recall these settings, even for the apps that they regularly use. Finally, we briefly discuss our findings derived from a pen-and-paper exercise showcasing that users are more likely to allow apps to access their location data on contemporary mobile devices (running version Android 10).
Article
Modern smartphone sensors can be leveraged for providing novel functionality and greatly improving the user experience. However, sensor data can be misused by privacy-invasive or malicious entities. Additionally, a wide range of other attacks that use mobile sensor data have been demonstrated; while those attacks have typically relied on users installing malicious apps, browsers have eliminated that constraint with the deployment of HTML5 WebAPI. In this article, we conduct a comprehensive evaluation of the multifaceted threat that mobile web browsing poses to users by conducting a large-scale study of mobile-specific HTML5 WebAPI calls across more than 183K of the most popular websites. We build a novel testing infrastructure consisting of actual smartphones on top of a dynamic Android app analysis framework, allowing us to conduct an end-to-end exploration. In detail, our system intercepts and tracks data access in real time, from the WebAPI JavaScript calls down to the Android system calls. Our study reveals the extent to which websites are actively leveraging the WebAPI for collecting sensor data, with 2.89% of websites accessing at least one sensor. To provide a comprehensive assessment of the risks of this emerging practice, we create a taxonomy of sensor-based attacks from prior studies and present an in-depth analysis by framing our collected data within that taxonomy. We find that 1.63% of websites can carry out at least one attack and emphasize the need for a standardized policy across all browsers and the ability for users to control what sensor data each website can access.
Article
Full-text available
Software Quality Control (SQC) techniques are widely used throughout the software development process with the objective of assessing and detecting anomalies that affect the quality of an information system. Privacy is one quality attribute of software systems for which several SQC techniques have been proposed in recent years. However, research has been carried out from different perspectives and, consequently, it has led to a growing body of knowledge scattered across different domains. To bridge this gap, we have carried out a systematic mapping study to provide practitioners and researchers with an overview of the state-of-the-art techniques to carry out software quality control of information systems focusing on aspects of privacy. Our results show a steady growth in the research efforts in this field. The European General Data Protection Regulation seems to have a significant influence on this growth, since 37% of techniques that focus on assessing compliance derive their assessment criteria from this legal framework. The maturity of the techniques varies between the type of technique: Formal verification techniques exhibit the lowest level of maturity while the combination of techniques has demonstrated its successful application in real-world scenarios. The latter seems a promising avenue of research as it provides better results in terms of coverage, precision and effectiveness than the application of individual, isolated techniques. In this paper, we describe the existing SQC techniques focusing on privacy and provide a suitable basis for identifying future research directions.
Article
Most Android applications include third‐party libraries (3PLs) to make revenues, to facilitate their development, and to track user behaviors. 3PLs generally require specific permissions to realize their functionalities. Current Android systems manage permissions in app (process) granularity. As a result, the permission sets of apps with 3PLs (3PL‐apps) may be augmented, introducing overprivilege risks. In this paper, we firstly study how severe the problem is by analyzing the permission sets of 27 718 real‐world Android apps with and without 3PLs downloaded in both 2016 and 2017. We find that the usage of 3PLs and the permissions required by 3PL‐apps have increased over time. As a result, the possibility of overprivilege risks increases. We then propose Perman, a fine‐grained permission management mechanism for Android. Perman isolates the permissions of the host app and those of the 3PLs through dynamic code instrumentation. It allows users to manage permission requests of different modules of 3PL‐apps during app runtime. Unlike existing tools, Perman does not need to redesign Android apps and systems. Therefore, it can be applied to millions of existing apps and various Android devices. We conduct experiments to evaluate the effectiveness and efficiency of Perman. The experimental results verify that Perman is capable of managing permission requests of the host app and those of the 3PLs. We also confirm that the overhead introduced by Perman is comparable to that by existing commercial permission management tools.
Conference Paper
Full-text available
Mobile apps are ubiquitous, operate in complex environments and are developed under time-to-market pressure. Ensuring their correctness and reliability thus becomes an important challenge. This paper introduces Stoat, a novel guided approach to perform stochastic model-based testing on Android apps. Stoat operates in two phases: (1) Given an app as input, it uses dynamic analysis enhanced by a weighted UI exploration strategy and static analysis to reverse engineer a stochastic model of the app's GUI interactions; and (2) it adapts Gibbs sampling to iteratively mutate/refine the stochastic model and guides test generation from the mutated models toward achieving high code and model coverage and exhibiting diverse sequences. During testing, system-level events are randomly injected to further enhance the testing effectiveness. Stoat was evaluated on 93 open-source apps. The results show (1) the models produced by Stoat cover 17~31% more code than those by existing modeling tools; (2) Stoat detects 3X more unique crashes than two state-of-the-art testing tools, Monkey and Sapienz. Furthermore, Stoat tested the 1661 most popular Google Play apps, and detected 2110 previously unknown and unique crashes. So far, 43 developers have responded that they are investigating our reports. Twenty of the reported crashes have been confirmed, and 8 have already been fixed.
Conference Paper
Full-text available
We analyze the software stack of popular mobile advertising libraries on Android and investigate how they protect the users of advertising-supported apps from malicious advertising. We find that, by and large, Android advertising libraries properly separate the privileges of the ads from the host app by confining ads to dedicated browser instances that correctly apply the same origin policy. We then demonstrate how malicious ads can infer sensitive information about users by accessing external storage, which is essential for media-rich ads in order to cache video and images. Even though the same origin policy prevents confined ads from reading other apps' external-storage files, it does not prevent them from learning that a file with a particular name exists. We show how, depending on the app, the mere existence of a file can reveal sensitive information about the user. For example, if the user has a pharmacy price-comparison app installed on the device, the presence of external-storage files with certain names reveals which drugs the user has looked for. We conclude with our recommendations for redesigning mobile advertising software to better protect users from malicious advertising.
Conference Paper
Full-text available
Mobile applications are increasingly integrating third-party libraries to provide various features, such as advertising, analytics, social networking, and more. Unfortunately, such integration with third-party libraries comes with the cost of potential privacy violations of users, because Android always grants a full set of permissions to third-party libraries as their host applications. Unintended accesses to users' private data are underestimated threats to users' privacy, as complex and often obfuscated third-party libraries make it hard for application developers to estimate the correct behaviors of third-party libraries. More critically, a wide adoption of native code (JNI) and dynamic code executions such as Java reflection or dynamic code reloading, makes it even harder to apply state-of-the-art security analysis. In this work, we propose FLEXDROID, a new Android security model and isolation mechanism, that provides dynamic, fine-grained access control for third-party libraries. With FLEXDROID, application developers not only can gain a full control of third-party libraries (e.g., which permissions to grant or not), but also can specify how to make them behave after detecting a privacy violation (e.g., providing a mock user's information or kill). To achieve such goals, we define a new notion of principals for third-party libraries, and develop a novel security mechanism, called inter-process stack inspection that is effective to JNI as well as dynamic code execution. Our usability study shows that developers can easily adopt FLEXDROID's policy to their existing applications. Finally, our evaluation shows that FLEXDROID can effectively restrict the permissions of third-party libraries with negligible overheads.
Conference Paper
Full-text available
Since the appearance of Android, its permission system was central to many studies of Android security. For a long time, the description of the architecture provided by Enck et al. in [31] was immutably used in various research papers. The introduction of highly anticipated runtime permissions in Android 6.0 forced us to reconsider this model. To our surprise, the permission system evolved with almost every release. After analysis of 16 Android versions, we can confirm that the modifications, especially introduced in Android 6.0, considerably impact the aptness of old conclusions and tools for newer releases. For instance, since Android 6.0 some signature permissions, previously granted only to apps signed with a platform certificate, can be granted to third-party apps even if they are signed with a non-platform certificate; many permissions considered before as threatening are now granted by default. In this paper, we review in detail the updated system, introduced changes, and their security implications. We highlight some bizarre behaviors, which may be of interest for developers and security researchers. We also found a number of bugs during our analysis, and provided patches to AOSP where possible.
Article
Full-text available
The packaging model of Android apps requires the entire code necessary for the execution of an app to be shipped into one single apk file. Thus, an analysis of Android apps often visits code which is not part of the functionality delivered by the app. Such code is often contributed by the common libraries which are used pervasively by all apps. Unfortunately, Android analyses, e.g., for piggybacking detection and malware detection, can produce inaccurate results if they do not take into account the case of library code, which constitutes noise in app features. Despite some efforts on investigating Android libraries, the momentum of Android research has not yet produced a complete set of common libraries to further support in-depth analysis of Android apps. In this paper, we leverage a dataset of about 1.5 million apps from Google Play to harvest potential common libraries, including advertisement libraries. With several steps of refinements, we finally collect by far the largest set of 1,113 libraries supporting common functionalities and 240 libraries for advertisement. We use the dataset to investigate several aspects of Android libraries, including their popularity and their proportion in Android app code. Based on these datasets, we have further performed several empirical investigations to confirm the motivations behind our work.
Conference Paper
Full-text available
Smartphone usage is tightly coupled with the use of apps that can be either free or paid. Numerous studies have investigated the tracking libraries associated with free apps. Only a limited number of these have focused on paid apps. As expected, these investigations indicate that tracking is happening to a lesser extent in paid apps, yet there is no conclusive evidence. This paper provides the first large-scale study of paid apps. We analyse top paid apps obtained from four different countries: Australia, Brazil, Germany, and US, and quantify the level of tracking taking place in paid apps in comparison to free apps. Our analysis shows that 60% of the paid apps are connected to trackers that collect personal information compared to 85%--95% in free apps. We further show that approximately 20% of the paid apps are connected to more than three trackers. With tracking being pervasive in both free and paid apps, we then quantify the aggregated privacy leakages associated with individual users. Using the data of user installed apps of over 300 smartphone users, we show that 50% of the users are exposed to more than 25 trackers which can result in significant leakages of privacy.
Article
Full-text available
Antivirus companies, mobile application marketplaces, and the security research community employ techniques based on dynamic code analysis to detect and analyze mobile malware. In this paper, we present a broad range of anti-analysis techniques that malware can employ to evade dynamic analysis in emulated Android environments. Our detection heuristics span three different categories based on (i) static properties, (ii) dynamic sensor information, and (iii) VM-related intricacies of the Android Emulator. To assess the effectiveness of our techniques, we incorporated them in real malware samples and submitted them to publicly available Android dynamic analysis systems, with alarming results. We found all tools and services to be vulnerable to most of our evasion techniques. Even trivial techniques, such as checking the value of the IMEI, are enough to evade some of the existing dynamic analysis frameworks. We propose possible countermeasures to improve the resistance of current dynamic analysis tools against evasion attempts.
Article
Full-text available
Android uses a permission-based security model to restrict applications from accessing private data and privileged resources. However, the permissions are assigned at the application level, so even untrusted third-party libraries, such as advertisement, once incorporated, can share the same privileges as the entire application, leading to over-privileged problems. We present AFrame, a developer friendly method to isolate untrusted third-party code from the host applications. The isolation achieved by AFrame covers not only the process/permission isolation, but also the display and input isolation. Our AFrame framework is implemented through a minimal change to the existing Android code base; our evaluation results demonstrate that it is effective in isolating the privileges of untrusted third-party code from applications with reasonable performance overhead.
Article
Full-text available
Modern smartphone operating systems (OSs) have been developed with a greater emphasis on security and protecting privacy. One of the mechanisms these systems use to protect users is a permission system, which requires developers to declare what sensitive resources their applications will use, has users agree to this request when they install the application, and constrains the application to the requested resources during runtime. As these permission systems become more common, questions have arisen about their design and implementation. In this paper, we perform an analysis of the permission system of the Android smartphone OS in an attempt to begin answering some of these questions. Because the documentation of Android's permission system is incomplete and because we wanted to be able to analyze several versions of Android, we developed PScout, a tool that extracts the permission specification from the Android OS source code using static analysis. PScout overcomes several challenges, such as scalability due to Android's 3.4 million line code base, accounting for permission enforcement across processes due to Android's use of IPC, and abstracting Android's diverse permission checking mechanisms into a single primitive for analysis. We use PScout to analyze 4 versions of Android spanning version 2.2 up to the recently released Android 4.0. Our main findings are that while Android has over 75 permissions, there is little redundancy in the permission specification. However, if applications could be constrained to only use documented APIs, then about 22% of the non-system permissions are actually unnecessary. Finally, we find that a trade-off exists between enabling least-privilege security with fine-grained permissions and maintaining stability of the permission specification as the Android OS evolves.
Conference Paper
Android applications are frequently plagiarized or repackaged, and software obfuscation is a recommended protection against these practices. However, there is very little data on the overall rates of app obfuscation, the techniques used, or the factors that lead developers to choose to obfuscate their apps. In this paper, we present the first comprehensive analysis of the use of and challenges to software obfuscation in Android applications. We analyzed 1.7 million free Android apps from Google Play to detect various obfuscation techniques, finding that only 24.92% of apps are obfuscated by the developer. To better understand this rate of obfuscation, we surveyed 308 Google Play developers about their experiences and attitudes about obfuscation. We found that while developers feel that apps in general are at risk of plagiarism, they do not fear theft of their own apps. Developers also report difficulties obfuscating their own apps. To better understand these difficulties, we conducted a follow-up study where the vast majority of 70 participants failed to obfuscate a realistic sample app even while many mistakenly believed they had been successful. These findings have broad implications both for improving the security of Android apps and for all tools that aim to help developers write more secure software.
Conference Paper
When accessing online private resources (e.g., user profiles, photos, shopping carts) from a client (e.g., a desktop web-browser or a mobile app), the service providers must implement proper access control, which typically involves both authentication and authorization. However, not all of the service providers follow the best practice, resulting in various access control vulnerabilities. To understand such a threat in a large scale, and identify the vulnerable access control implementations in online services, this paper introduces AuthScope, a tool that is able to automatically execute a mobile app and pinpoint the vulnerable access control implementations, particularly the vulnerable authorizations, in the corresponding online service. The key idea is to use differential traffic analysis to recognize the protocol fields and then automatically substitute the fields and observe the server response. One of the key challenges for a large scale study lies in how to obtain the post-authentication request-and-response messages for a given app. We have thus developed a targeted dynamic activity explorer to perform an in-context analysis and drive the app execution to automatically log in the service. We have tested AuthScope with 4,838 popular mobile apps from Google Play, and identified 597 0-day vulnerable authorizations that map to 306 apps.
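The substitution idea in the abstract above can be illustrated with a minimal Python sketch. It is not AuthScope's implementation: requests are modelled as flat dictionaries, and the field names (`token`, `user_id`, `action`) and the helper names are hypothetical. Fields whose values differ between two users' otherwise identical requests are treated as user-specific, and each non-session field is swapped to build a probe whose server response would then be inspected.

```python
def differing_fields(req_a, req_b):
    """Fields present in both requests whose values differ between the two users."""
    shared = req_a.keys() & req_b.keys()
    return sorted(f for f in shared if req_a[f] != req_b[f])


def substitution_probes(req_a, req_b, session_fields):
    """Build probe requests: user A's request with one user-specific field
    replaced by user B's value. Session fields (assumed to be known) are
    left untouched so the probe still authenticates as user A."""
    probes = []
    for field in differing_fields(req_a, req_b):
        if field in session_fields:
            continue
        probe = dict(req_a)
        probe[field] = req_b[field]
        probes.append((field, probe))
    return probes


# Hypothetical post-authentication requests captured for two test accounts.
alice = {"token": "tA", "user_id": "101", "action": "get_profile"}
bob = {"token": "tB", "user_id": "202", "action": "get_profile"}
probes = substitution_probes(alice, bob, session_fields={"token"})
```

If the server answers such a probe with the second user's data, the authorization check for that field would be a candidate vulnerability under this simplified model.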
Article
The enormous popularity of smartphones, their rich sensing capabilities, and the data they have about their users have led to millions of apps being developed and used. However, these capabilities have also led to numerous privacy concerns. Platform manufacturers, as well as researchers, have proposed numerous ways of mitigating these concerns, primarily by providing fine-grained visibility and privacy controls to the user on a per-app basis. In this paper, we show that this per-app permission approach is suboptimal for many apps, primarily because most data accesses occur due to a small set of popular third-party libraries which are common across multiple apps. To address this problem, we present the design and implementation of ProtectMyPrivacy (PmP) for Android, which can detect critical contextual information at runtime when privacy-sensitive data accesses occur. In particular, PmP infers the purpose of the data access, i.e. whether the data access is by a third-party library or by the app itself for its functionality. Based on crowdsourced data, we show that there are in fact a set of 30 libraries which are responsible for more than half of private data accesses. Controlling sensitive data accessed by these libraries can therefore be an effective mechanism for managing their privacy. We deployed our PmP app to 1,321 real users, showing that the number of privacy decisions that users have to make is significantly reduced. In addition, we show that our users are better protected against data leakage when using our new library-based blocking mechanism as compared to the traditional app-level permission mechanisms.
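The abstract above does not say how an access is attributed to a library or to the app. One plausible approach, sketched here in Python purely as an assumption, is to walk the call stack captured at the sensitive API call and match frames against known library package prefixes. All package names and the function name are hypothetical.

```python
def attribute_access(stack_frames, app_package, library_prefixes):
    """Attribute a sensitive API call to a library or to the app.

    stack_frames: fully qualified method names, innermost call first.
    Returns ("library", prefix), ("app", app_package) or ("unknown", None).
    """
    for frame in stack_frames:
        # Check library prefixes first: libraries are bundled in the same
        # apk and may even live under the app's own package namespace.
        for prefix in library_prefixes:
            if frame.startswith(prefix + "."):
                return ("library", prefix)
        if frame.startswith(app_package + "."):
            return ("app", app_package)
        # Framework frames (e.g. android.*) match neither and are skipped.
    return ("unknown", None)


# Hypothetical stack captured when a location API is invoked.
frames = [
    "android.location.LocationManager.getLastKnownLocation",
    "com.adlib.sdk.Tracker.collect",
    "com.example.app.MainActivity.onCreate",
]
result = attribute_access(frames, "com.example.app", ["com.adlib.sdk"])
```

Prefix matching is the simplest choice and fails when package names are obfuscated, which is one reason obfuscation-resilient library detection is studied separately in this literature.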
Article
Mobile apps frequently request access to sensitive data, such as location and contacts. Understanding the purpose of why sensitive data is accessed could help improve privacy as well as enable new kinds of access control. In this article, we propose a text mining based method to infer the purpose of sensitive data access by Android apps. The key idea we propose is to extract multiple features from app code and then use those features to train a machine learning classifier for purpose inference. We present the design, implementation, and evaluation of two complementary approaches to infer the purpose of permission use, first using purely static analysis, and then using primarily dynamic analysis. We also discuss the pros and cons of both approaches and the trade-offs involved.
Conference Paper
While much effort has been made to detect and measure the privacy leakage caused by the advertising (ad) libraries integrated in mobile applications (i.e., apps), analytics libraries, which are also widely used in mobile apps, have not been systematically studied for their privacy risks. Unlike ad libraries, the main function of analytics libraries is to collect users’ in-app actions. Hence, by design, analytics libraries are more likely to leak users’ private information. In this work, we study what information is collected by the analytics libraries integrated in popular Android apps. We design and implement a tool called “Alde”. Given an app, Alde employs both static analysis and dynamic analysis to detect the data collected by analytics libraries. We also study what private information can be leaked by the apps that use the same analytics library. Moreover, we analyze apps’ privacy policies to see whether app developers have notified the users that their in-app action information is collected by analytics libraries. Finally, we select 8 widely used analytics libraries to study and apply our method on 300 apps downloaded from both Chinese app markets and Google Play. Our experimental results highlight the emerging need for better regulating the use of analytics libraries in Android apps.
Conference Paper
Mobile computing has experienced enormous growth in market share and computational power in recent years. As a result, mobile malware is becoming more sophisticated and more prevalent, leading to research into dynamic sandboxes as a widespread approach for detecting malicious applications. However, the event-driven nature of Android applications renders critical the capability to automatically generate deterministic and intelligent user interactions to drive analysis subjects and improve code coverage. In this paper, we present CuriousDroid, an automated system for exercising Android application user interfaces in an intelligent, user-like manner. CuriousDroid operates by decomposing application user interfaces on-the-fly and creating a context-based model for interactions that is tailored to the current user layout. We integrated CuriousDroid with Andrubis, a well-known Android sandbox, and conducted a large-scale evaluation of 38,872 applications taken from different data sets. Our evaluation demonstrates significant improvements in both end-to-end sample classification as well as increases in the raw number of elicited behaviors at runtime.
Conference Paper
The vast majority of online services nowadays provide both a mobile-friendly website and a mobile application to their users. Both are usually released for free, with their developers gaining revenue by allowing advertisements from ad networks to be embedded into their content. In order to provide more personalized and thus more effective advertisements, ad networks usually deploy pervasive user tracking, raising significant privacy concerns. As a consequence, when deciding whether to access a service through the web or an app, users have to consider not only their convenience but also which option harms their privacy the least. In this paper, we aim to answer this question: which of the two options, apps or browsers, best protects users' privacy? To tackle this question, we study a broad range of privacy-related leaks in a comparison of several popular apps and their web counterparts. These leaks may contain not only personally identifying information (PII) but also device-specific information that can be used to track the user across applications and sites and allows third parties to link web sessions with app sessions. Finally, we propose an anti-tracking mechanism that enables users to access an online service through a mobile app without risking their privacy. Our evaluation shows that our approach is able to preserve the privacy of the user by reducing the leaking identifiers of apps by 27.41% on average, while it imposes a practically negligible latency of less than 1 millisecond per request.
Conference Paper
Third-party libraries on Android have been shown to be security and privacy hazards by adding security vulnerabilities to their host apps or by misusing inherited access rights. Correctly attributing improper app behavior either to app or library developer code or isolating library code from their host apps would be highly desirable to mitigate these problems, but is impeded by the absence of a third-party library detection that is effective and reliable in spite of obfuscated code. This paper proposes a library detection technique that is resilient against common code obfuscations and that is capable of pinpointing the exact library version used in apps. Libraries are detected with profiles from a comprehensive library database that we generated from the original library SDKs. We apply our technique to the top apps on Google Play and their complete histories to conduct a longitudinal study of library usage and evolution in apps. Our results particularly show that app developers only slowly adopt new library versions, exposing their end-users to large windows of vulnerability. For instance, we discovered that two long-known security vulnerabilities in popular libs are still present in the current top apps. Moreover, we find that misuse of cryptographic APIs in advertising libs, which increases the host apps' attack surface, affects 296 top apps with a cumulative install base of 3.7bn devices according to Play. To the best of our knowledge, our work is the first to quantify the security impact of third-party libs on the Android ecosystem.
Conference Paper
Understanding the purpose of why sensitive data is used could help improve privacy as well as enable new kinds of access control. In this paper, we introduce a new technique for inferring the purpose of sensitive data usage in the context of Android smartphone apps. We extract multiple kinds of features from decompiled code, focusing on app-specific features and text-based features. These features are then used to train a machine learning classifier. We have evaluated our approach in the context of two sensitive permissions, namely ACCESS_FINE_LOCATION and READ_CONTACT_LIST, and achieved an accuracy of about 85% and 94% respectively in inferring purposes. We have also found that text-based features alone are highly effective in inferring purposes.
Conference Paper
Many popular, free online services provide cross-platform interfaces via Web browsers as well as apps on iOS and Android. To monetize these services, many additionally include tracking and advertising libraries that gather information about users with significant privacy implications. Given that the Web-based and mobile-app-based ecosystems evolve independently, an important open question is how these platforms compare with respect to user privacy. In this paper, we conduct the first head-to-head study of 50 popular, free online services to understand which is better for privacy---Web or app? We conduct manual tests, extract personally identifiable information (PII) shared over plaintext and encrypted connections, and analyze the data to understand differences in user-data collection across platforms for the same service. While we find that all platforms expose users' data, there are still opportunities to significantly limit how much information is shared with other parties by selectively using the app or Web version of a service.
Conference Paper
Mobile operating systems like Android have failed to provide sufficient protection for personal data, and privacy leakage has become a major concern. To understand the security risks and privacy leakage, analysts have to carry out data-flow analysis. In 2014, Android was upgraded with a fundamentally new design known as the Android RunTime (ART) environment in Android 5.0. ART adopts an ahead-of-time compilation strategy and replaces the previous virtual-machine-based Dalvik. Unfortunately, many data-flow analysis systems like TaintDroid were designed for the legacy Dalvik environment. This makes data-flow analysis of new apps and malware infeasible. We design a multi-level information-flow tracking system for the new Android system called TaintART. TaintART employs a multi-level taint analysis technique to minimize the taint tag storage. Therefore, taint tags can be stored in processor registers to provide efficient taint propagation operations. We also customize the ART compiler to maximize performance gains of the ahead-of-time compilation optimizations. Based on the general design of TaintART, we also implement a multi-level privacy enforcement to prevent sensitive data leakage. We demonstrate that TaintART incurs less than 15% overhead on a CPU-bound microbenchmark and negligible overhead on built-in or third-party applications. Compared to the legacy Dalvik environment in Android 4.4, TaintART achieves about 99.7% faster performance on a Java runtime benchmark.
Conference Paper
Third-party libraries are widely used in Android application development. While they extend functionality, third-party libraries are likely to pose a threat to users. Firstly, third-party libraries enjoy the same permissions as the applications; therefore libraries are over-privileged. Secondly, third-party libraries and applications share the same internal file space, so that applications’ files are exposed to third-party libraries. To solve these problems, a considerable amount of effort has been made. Unfortunately, the requirement for a modified Android framework makes their methods impractical. In this paper, a developer-friendly tool called LibCage is proposed, to prohibit permission abuse of third-party libraries and protect user privacy without modifying the Android framework or libraries’ bytecode. At its core, LibCage builds a sandbox for each third-party library in order to ensure that each library is subject to a separate permission set assigned by developers. Moreover, each library is allocated an isolated file space and has no access to other space. Importantly, LibCage works on Java reflection as well as dynamic code execution, and can defeat several possible attacks. We test on real-world third-party libraries, and the results show that LibCage is capable of enforcing a flexible policy on third-party libraries at run time with a modest performance overhead.
Article
We present ARTist, a compiler-based application instrumentation solution for Android. ARTist is based on the new ART runtime and the on-device dex2oat compiler of Android, which replaced the interpreter-based managed runtime (DVM) from Android version 5 onwards. Since dex2oat is yet uncharted, our approach required first and foremost a thorough study of the compiler suite's internals and in particular of the new default compiler backend Optimizing. We document the results of this study in this paper to facilitate independent research on this topic and exemplify the viability of ARTist by realizing two use cases. Moreover, given that seminal works like TaintDroid hitherto depend on the now abandoned DVM, we conduct a case study on whether taint tracking can be re-instantiated using a compiler-based instrumentation framework. Overall, our results provide compelling arguments for preferring compiler-based instrumentation over alternative bytecode or binary rewriting approaches.
Conference Paper
It is well known that apps running on mobile devices extensively track and leak users' personally identifiable information (PII); however, these users have little visibility into PII leaked through the network traffic generated by their devices, and have poor control over how, when and where that traffic is sent and handled by third parties. In this paper, we present the design, implementation, and evaluation of ReCon: a cross-platform system that reveals PII leaks and gives users control over them without requiring any special privileges or custom OSes. ReCon leverages machine learning to reveal potential PII leaks by inspecting network traffic, and provides a visualization tool to empower users with the ability to control these leaks via blocking or substitution of PII. We evaluate ReCon's effectiveness with measurements from controlled experiments using leaks from the 100 most popular iOS, Android, and Windows Phone apps, and via an IRB-approved user study with 92 participants. We show that ReCon is accurate, efficient, and identifies a wider range of PII than previous approaches.
Conference Paper
Today's feature-rich smartphone apps intensively rely on access to highly sensitive (personal) data. This puts the user's privacy at risk of being violated by overly curious apps or libraries (like advertisements). Central app markets conceptually represent a first line of defense against such invasions of the user's privacy, but unfortunately we are still lacking full support for automatic analysis of apps' internal data flows and supporting analysts in statically assessing apps' behavior. In this paper we present a novel slice-optimization approach to leverage static analysis of Android applications. Building on top of precise application lifecycle models, we employ a slicing-based analysis to generate data-dependent statements for arbitrary points of interest in an application. As a result of our optimization, the produced slices are, on average, 49% smaller than standard slices, thus facilitating code understanding and result validation by security analysts. Moreover, by re-targeting strings, our approach enables automatic assessments for a larger number of use-cases than prior work. We consolidate our improvements on statically analyzing Android apps into a tool called R-Droid and conducted a large-scale data-leak analysis on a set of 22,700 Android apps from Google Play. R-Droid managed to identify a significantly larger set of potential privacy-violating information flows than previous work, including 2,157 sensitive flows of password-flagged UI widgets in 256 distinct apps.
Conference Paper
More and more people rely on mobile devices to access the Internet, which also increases the amount of private information that can be gathered from people's devices. Although today's smartphone operating systems are trying to provide a secure environment, they fail to provide users with adequate control over and visibility into how third-party applications use their private data. Whereas there are a few tools that alert users when applications leak private information, these tools are often hard to use by the average user or have other problems. To address these problems, we present PrivacyGuard, an open-source VPN-based platform for intercepting the network traffic of applications. PrivacyGuard requires neither root permissions nor any knowledge about VPN technology from its users. PrivacyGuard does not significantly increase the trusted computing base since PrivacyGuard runs in its entirety on the local device and traffic is not routed through a remote VPN server. We implement PrivacyGuard on the Android platform by taking advantage of the VPNService class provided by the Android SDK. PrivacyGuard is configurable, extensible, and useful for many different purposes. We investigate its use for detecting the leakage of multiple types of sensitive data, such as a phone's IMEI number or location data. PrivacyGuard also supports modifying the leaked information and replacing it with crafted data for privacy protection. According to our experiments, PrivacyGuard can detect more leakage incidents by applications and advertisement libraries than TaintDroid. We also demonstrate that PrivacyGuard has reasonable overhead on network performance and almost no overhead on battery consumption.
Conference Paper
The proliferation of mobile apps is due in part to the advertising ecosystem which enables developers to earn revenue while providing free apps. Ad-supported apps can be developed rapidly with the availability of ad libraries. However, today's ad libraries essentially have access to the same resources as the parent app, and this has caused significant privacy concerns. In this paper, we explore efficient methods to de-escalate privileges for ad libraries where the resource access privileges for ad libraries can be different from that of the app logic. Our system, PEDAL, contains a novel machine classifier for detecting ad libraries even in the presence of obfuscated code, and techniques for automatically instrumenting bytecode to effect privilege de-escalation even in the presence of privilege inheritance. We evaluate PEDAL on a large set of apps from the Google Play store and demonstrate that it has a 98% accuracy in detecting ad libraries and imposes less than 1% runtime overhead on apps.
Conference Paper
Advertising is the primary source of revenue for many mobile apps. One important goal of the ad delivery process is targeting users, based on criteria like users' geolocation, context, demographics, long-term behavior, etc. In this paper we report an in-depth study that broadly characterizes what targeting information mobile apps send to ad networks and how effectively, if at all, ad networks utilize the information for targeting users. Our study is based on a novel tool, called MadScope, that can (1) quickly harvest ads from a large collection of apps, (2) systematically probe an ad network to characterize its targeting mechanism, and (3) emulate user profiles of specific preferences and interests to study behavioral targeting. Our analysis of 500K ad requests from 150K Android apps and 101 ad networks indicates that apps do not yet exploit the full potential of targeting: even though ad controls provide APIs to send a lot of information to ad networks, much key targeting information is optional and is often not provided by app developers. We also use MadScope to systematically probe top 10 in-app ad networks to harvest over 1 million ads and find that while targeting is used by many of the top networks, there remain many instances where targeting information or behavioral profile does not have a statistically significant impact on how ads are chosen. We also contrast our findings with a recent study of targeted in-browser ads.
Article
Prior works have shown that the list of apps installed by a user reveals a lot about the user's interests and behavior. These works rely on the semantics of the installed apps and show that various user traits can be learnt automatically using off-the-shelf machine-learning techniques. In this work, we focus on the re-identifiability issue and thoroughly study the unicity of smartphone apps on a dataset containing 54,893 Android users collected over a period of 7 months. Our study finds that any 4 apps installed by a user are enough (more than 95% of the time) to re-identify the user in our dataset. As the complete list of installed apps is unique for 99% of the users in our dataset, it can easily be used to track or profile users by a service such as Twitter that has access to the whole list of a user's installed apps. As our analyzed dataset is small compared to the total population of Android users, we also study how unicity would vary with larger datasets. This work emphasizes the need for better privacy guards against the collection, use and release of the list of installed apps.
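The unicity measure in this abstract can be sketched as follows: draw k apps from a user's installed set and check whether any other user also has all k. This is a minimal illustration under assumptions, not the authors' method: the function name `unicity`, the single random draw per user, and the toy data are hypothetical.

```python
import random


def unicity(user_apps: dict, k: int, seed: int = 0) -> float:
    """Estimate the fraction of users singled out by k of their installed apps.

    user_apps maps a user id to the set of app names installed by that user.
    Users with fewer than k apps are skipped.
    """
    rng = random.Random(seed)
    eligible = 0
    unique = 0
    for apps in user_apps.values():
        if len(apps) < k:
            continue
        eligible += 1
        # Draw k apps from this user (sorted first so the draw is reproducible).
        sample = set(rng.sample(sorted(apps), k))
        # Count the users whose installed set contains all k sampled apps.
        matches = sum(1 for other in user_apps.values() if sample <= other)
        if matches == 1:  # only the user themselves matches
            unique += 1
    return unique / eligible if eligible else 0.0


# Toy data, for illustration only.
toy = {
    "u1": {"maps", "chat", "bank"},
    "u2": {"maps", "chat", "game"},
    "u3": {"maps", "news", "mail"},
}
print(unicity(toy, k=3))
```

With k equal to the size of every user's list, the toy data gives full unicity because no two lists are identical; smaller k values depend on which apps are drawn.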
Article
Today's smartphones are a ubiquitous source of private and confidential data. At the same time, smartphone users are plagued by carelessly programmed apps that leak important data by accident, and by malicious apps that exploit their given privileges to copy such data intentionally. While existing static taint-analysis approaches have the potential of detecting such data leaks ahead of time, all approaches for Android use a number of coarse-grain approximations that can yield high numbers of missed leaks and false alarms. In this work we thus present FlowDroid, a novel and highly precise static taint analysis for Android applications. A precise model of Android's lifecycle allows the analysis to properly handle callbacks invoked by the Android framework, while context, flow, field and object-sensitivity allows the analysis to reduce the number of false alarms. Novel on-demand algorithms help FlowDroid maintain high efficiency and precision at the same time. We also propose DroidBench, an open test suite for evaluating the effectiveness and accuracy of taint-analysis tools specifically for Android apps. As we show through a set of experiments using SecuriBench Micro, DroidBench, and a set of well-known Android test applications, FlowDroid finds a very high fraction of data leaks while keeping the rate of false positives low. On DroidBench, FlowDroid achieves 93% recall and 86% precision, greatly outperforming the commercial tools IBM AppScan Source and Fortify SCA. FlowDroid successfully finds leaks in a subset of 500 apps from Google Play and about 1,000 malware apps from the VirusShare project.
Article
Recent years have witnessed incredible growth in the popularity and prevalence of smart phones. A flourishing mobile application market has evolved to provide users with additional functionality such as interacting with social networks, games, and more. Mobile applications may have a direct purchasing cost or be free but ad-supported. Unlike in-browser ads, the privacy implications of ads in Android applications have not been thoroughly explored. We start by comparing the similarities and differences of in-browser ads and in-app ads. We examine the effect on user privacy of thirteen popular Android ad providers by reviewing their use of permissions. Worryingly, several ad libraries checked for permissions beyond the required and optional ones listed in their documentation, including dangerous permissions like CAMERA, WRITE_CALENDAR and WRITE_CONTACTS. Further, we discover the insecure use of Android's JavaScript extension mechanism in several ad libraries. We identify fields in ad requests for private user information and confirm their presence in network data obtained from a tier-1 network provider. We also show that users can be tracked by a network sniffer across ad providers and by an ad provider across applications. Finally, we discuss several possible solutions to the privacy issues identified above.
Article
Android applications often include third-party libraries written in native code. However, current native components are not well managed by Android's security architecture. We present NativeGuard, a security framework that isolates native libraries from other components in Android applications. Leveraging the process-based protection in Android, NativeGuard isolates native libraries of an Android application into a second application where unnecessary privileges are eliminated. NativeGuard requires neither modifications to Android nor access to the source code of an application. It addresses multiple technical issues to support various interfaces that Android provides to the native world. Experimental results demonstrate that our framework works well with a set of real-world applications, and incurs only modest overhead on benchmark programs.
Article
Mobile app ecosystems have experienced tremendous growth in the last six years. This has triggered research on dynamic analysis of performance, security, and correctness properties of the mobile apps in the ecosystem. Exploration of app execution using automated UI actions has emerged as an important tool for this research. However, existing research has largely developed analysis-specific UI automation techniques, wherein the logic for exploring app execution is intertwined with the logic for analyzing app properties. PUMA is a programmable framework that separates these two concerns. It contains a generic UI automation capability (often called a Monkey) that exposes high-level events for which users can define handlers. These handlers can flexibly direct the Monkey's exploration, and also specify app instrumentation for collecting dynamic state information or for triggering changes in the environment during app execution. Targeted towards operators of app marketplaces, PUMA incorporates mechanisms for scaling dynamic analysis to thousands of apps. We demonstrate the capabilities of PUMA by analyzing seven distinct performance, security, and correctness properties for 3,600 apps downloaded from the Google Play store.
Conference Paper
We present Dynodroid, a system for generating relevant inputs to unmodified Android apps. Dynodroid views an app as an event-driven program that interacts with its environment by means of a sequence of events through the Android framework. By instrumenting the framework once and for all, Dynodroid monitors the reaction of an app upon each event in a lightweight manner, using it to guide the generation of the next event to the app. Dynodroid also allows interleaving events from machines, which are better at generating a large number of simple inputs, with events from humans, who are better at providing intelligent inputs. We evaluated Dynodroid on 50 open-source Android apps, and compared it with two prevalent approaches: users manually exercising apps, and Monkey, a popular fuzzing tool. Dynodroid, humans, and Monkey covered 55%, 60%, and 53%, respectively, of each app's Java source code on average. Monkey took 20X more events on average than Dynodroid. Dynodroid also found 9 bugs in 7 of the 50 apps, and 6 bugs in 5 of the top 1,000 free apps on Google Play.
Conference Paper
Smartphones in general and Android in particular are increasingly shifting into the focus of cybercriminals. For understanding the threat to security and privacy it is important for security researchers to analyze malicious software written for these systems. The exploding number of Android malware calls for automation in the analysis. In this paper, we present Mobile-Sandbox, a system designed to automatically analyze Android applications in two novel ways: (1) it combines static and dynamic analysis, i.e., results of static analysis are used to guide dynamic analysis and extend coverage of executed code, and (2) it uses specific techniques to log calls to native (i.e., "non-Java") APIs. We evaluated the system on more than 36,000 applications from Asian third-party mobile markets and found that 24% of all applications actually use native calls in their code.