Conference Paper

REAPER: Real-time App Analysis for Augmenting the Android Permission System

Authors:
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

Android's app ecosystem relies heavily on third-party libraries as they facilitate code development and provide a steady stream of revenue for developers. However, while Android has moved towards a more fine-grained run time permission system, users currently lack the required resources for deciding whether a specific permission request is actually intended for the app itself or is requested by possibly dangerous third-party libraries. In this paper we present Reaper, a novel dynamic analysis system that traces the permissions requested by apps in real time and distinguishes those requested by the app's core functionality from those requested by third-party libraries linked with the app. We implement a sophisticated UI automator and conduct an extensive evaluation of our system's performance and find that Reaper introduces negligible overhead, rendering it suitable both for end users (by integrating it in the OS) and for deployment as part of an official app vetting process. Our study on over 5K popular apps demonstrates the large extent to which personally identifiable information is being accessed by libraries and highlights the privacy risks that users face. We find that an impressive 65% of the permissions requested do not originate from the core app but are issued by linked third-party libraries, 37.3% of which are used for functionality related to ads, tracking, and analytics. Overall, Reaper enhances the functionality of Android's run time permission model without requiring OS or app modifications, and provides the necessary contextual information that can enable users to selectively deny permissions that are not part of an app's core functionality.

No full-text available

Request Full-text Paper PDF

To read the full-text of this research,
you can request a copy directly from the authors.

... 3,293 users in 136 countries worldwide contributed to their research by allowing the researchers to continuously monitor open ports on their smartphones. By installing an [47], [48], [49] [50], [51], [52] [53], [54] [55], [56], [57], [ 2) SSL/TLS verification SSL/TLS is a fundamental security mechanism that provides a secure transmission channel. Implementation errors in SS-L/TLS can expose the risk of man-in-the-middle (MITM) attacks. ...
... However, it is not possible for users to figure out what part of the code of an application requested the permission -code from a third-party library or code from the author of the application. This fact lead Diamantaris et al. [54] to study how frequently third-party libraries would request permissions. Their approach, named Reaper, is able to trace permissions in realtime in order to map which permissions are used by third-party libraries. ...
... Fuzzing [55], [46], [58], [54], [80], [86], [61] [47], [84], [86] [53], [59], [60], [48] [13], [70], [71] [68], [64], [62] [74] ...
Article
Full-text available
Dynamic analysis is a technique that is used to fully understand the internals of a system at runtime. On Android, dynamic security analysis involves real-time assessment and active adaptation of an app’s behaviour, and is used for various tasks, including network monitoring, system-call tracing, and taint analysis. The research on dynamic analysis has made significant progress in the past years. However, to the best of our knowledge, there is a lack in secondary studies that analyse the novel ideas and common limitations of current security research. The main aim of this work is to understand dynamic security analysis research on Android to present the current state of knowledge, highlight research gaps, and provide insights into the existing body of work in a structured and systematic manner. We conduct a systematic literature review (SLR) on dynamic security analysis for Android. The systematic review establishes a taxonomy, defines a classification scheme, and explores the impact of advanced Android app testing tools on security solutions in software engineering and security research. The study’s key findings centre on tool usage, research objectives, constraints, and trends. Instrumentation and network monitoring tools play a crucial role, with research goals focused on app security, privacy, malware detection, and software testing automation. Identified limitations include code coverage constraints, security-related analysis obstacles, app selection adequacy, and non-deterministic behaviour. Our study results deepen the understanding of dynamic analysis in Android security research by an in-depth review of 43 publications. The study highlights recurring limitations with automated testing tools and concerns about detecting or obstructing dynamic analysis.
... Our primary concern is Android malware that exhibits long-term stealth by evading early detection mechanisms, which eventually is only detected much later through its consequences (if ever). Established stealthy attack vectors, such as accessibility [4] or app-level virtualisation misuse [5], and several others [6]- [8], have become increasingly popular among malware authors, with many recent incidents gaining worldwide reach, leaving devastating effects [9]- [12]. More seriously for incident response, the resulting reduced forensic footprint for any attack employing such attack vectors has also been demonstrated [13], [14]. ...
... It enables malicious apps, requesting only the INTERNET permission, to read and write to multiple sensor data files at will, thus circumventing the Android sensor security model to stealthily sniff as well as manipulate many of the Android's restricted sensors (even touch input). Zygote and binder infection combined with a rooting exploit [6], as well as app-level virtualization frameworks [7] and third-party library infections [8] provide further attack vectors, potentially resulting in similarly stealthy attacks which render any efforts by classifier-based malware detectors futile and for which JIT-MF could be a solution in terms of incident response, provided a forensic readiness stage. ...
... Crucially for incident response, the resulting reduced forensic footprint for recent stealthy attacks has also been demonstrated [13]. Zygote and binder infection combined with a rooting exploit [6], as well as app-level virtualisation frameworks [7] and third-party library infections [8] provide further attack vectors, resulting in similarly stealthy attacks which require a solution in terms of incident response, due to likely late detection and minimal (if at all) forensic footprint stored on the device. ...
Article
Full-text available
The increasing dominance of Android smartphones for everyday communication and data processing makes long-term stealthy malware an even more dangerous threat. Recent malware campaigns like Flubot demonstrate that by employing stealthy malware techniques even at minimal capacity, malware is highly effective in making its way to millions of devices with little resistance from existing detection mechanisms. Consequential late detection demands comprehensive forensic timelines to reconstruct all malicious activities. However, the reduced forensic footprint of stealthy attacks with minimal malware involvement leaves investigators little evidence to work with even when utilising state-of-the-art digital forensics tools. Volatile memory forensics can be effective in such scenarios since app execution of any form is always bound to leave a trail of evidence in memory, even if it is short-lived. In this work, we motivate the need for JIT-MF (Just-in-time Memory Forensics), a technique that aims to address the challenges that arise with timely collection of short-lived evidence in volatile memory to solve the stealthiest of Android attacks. By taking an incident-response-centric approach, focused on protecting stock Android device users rather than treating them as potential adversaries, we show that JIT-MF tools can collect elusive attack steps in volatile memory without requiring device rooting. Furthermore, we build MobFor , a JIT-MF based tool focusing on capturing evidence related to messaging hijack attacks. This tool provides a context to explore solutions for JIT-MF implementation challenges, aiming to render JIT-MF tools practical for real-world requirements. Finally, we demonstrate that when compared to state-of-the-art digital forensic tools Belkasoft and XRY in a realistic attack scenario involving an enhanced version of the WhatsApp Pink malware and stock Android devices, only MobFor can recover the contents of messages sent by the malware, hence decisively contributing to an enriched forensic timeline.
... This not only enables the spread of malware on the official Google Playstore [1][2][3] but also allows the malware to evade detection on the victim's device, resulting in devastating effects (such as the unauthorized transfer of funds from legitimate banking apps), whenever attacks are noticed only from the consequences of their successful execution [4][5][6][7][8][9]. Established attack vectors such as accessibility [10] and several others [11][12][13][14][15][16] allow malware to attain stealth by leveraging living-off-the-land tactics that enable malware to offload critical attack steps to benign legitimate app functionality. For instance, the benign functionality of sensitive app categories such as messaging and financial apps can be instrumental to stealthy attacks aiming to hijack this functionality to offload attack steps such as malware propagation through messaging or seemingly legitimate fund transfers. ...
... Crucially, the threat model presented requires zero permissions, thus minimizing the malware component and bypassing permission-based detection mechanisms. Zygote and binder infection combined with rooting exploit [13], and third-party library infections [15] provide further attack vectors, potentially resulting in similarly stealthy attacks that render any efforts by classifier-based malware detectors futile. Tap'n Ghost [50] presents yet another relevant attack vector. ...
Article
Full-text available
The ubiquity of Android smartphones makes them targets of sophisticated malware, which maintain long-term stealth, particularly by offloading attack steps to benign apps. Such malware leaves little to no trace in logs, and the attack steps become difficult to discern from benign app functionality. Endpoint detection and response (EDR) systems provide live forensic capabilities that enable anomaly detection techniques to detect anomalous behavior in application logs after an app hijack. However, this presents a challenge, as state-of-the-art EDRs rely on device and third-party application logs, which may not include evidence of attack steps, thus prohibiting anomaly detection techniques from exposing anomalous behavior. While, theoretically, all the evidence resides in volatile memory, its ephemerality necessitates timely collection, and its extraction requires device rooting or app repackaging. We present VEDRANDO, an enhanced EDR for Android that accomplishes (i) the challenge of timely collection of volatile memory artefacts and (ii) the detection of a class of stealthy attacks that hijack benign applications. VEDRANDO leverages memory forensics and app virtualization techniques to collect timely evidence from memory, which allows uncovering attack steps currently uncollected by the state-of-the-art EDRs. The results showed that, with less than 5% CPU overhead compared to normal usage, VEDRANDO could uniquely collect and fully reconstruct the stealthy attack steps of ten realistic messaging hijack attacks using standard anomaly detection techniques, without requiring device or app modification.
... JIT-MF relies on a combination of static and dynamic instrumentation of compiled code to implement trigger points as in-line function hooks and the memory dumping process in the form of instrumentation code. Binary instrumentation also underpins various other Android security techniques where it is required to operate within real-time parameters (Diamantaris et al., 2019;Heuser et al., 2014;Chen et al., 2017;Li et al., 2015). Furthermore, JIT-MF considers the temporal aspect of volatile memory forensic collection concerned with the timely collection of ephemeral artifacts. ...
... Yet, other attack vectors may also present the same opportunity for LOtL tactics. Zygote and binder infection combined with a rooting exploit (Kaspersky, 2016), as well as app-level virtualization frameworks (Shi et al., 2019) and third-party library infections (Diamantaris et al., 2019) provide further attack vectors, resulting in similarly stealthy attacks and for which JIT-MF could be a solution in terms of incident response. ...
Preprint
Full-text available
Digital investigations of stealthy attacks on Android devices pose particular challenges to incident responders. Whereas consequential late detection demands accurate and comprehensive forensic timelines to reconstruct all malicious activities, reduced forensic footprints with minimal malware involvement, such as when Living-Off-the-Land (LOtL) tactics are adopted, leave investigators little evidence to work with. Volatile memory forensics can be an effective approach since app execution of any form is always bound to leave a trail of evidence in memory, even if perhaps ephemeral. Just-in-Time Memory Forensics (JIT-MF) is a recently proposed technique that describes a framework to process memory forensics on existing stock Android devices, without compromising their security by requiring them to be rooted. Within this framework, JIT-MF drivers are designed to promptly dump in-memory evidence related to app usage or misuse. In this work, we primarily introduce a conceptualized presentation of JIT-MF drivers. Subsequently, through a series of case studies involving the hijacking of widely-used messaging apps, we show that when the target apps are forensically enhanced with JIT-MF drivers, investigators can generate richer forensic timelines to support their investigation, which are on average 26% closer to ground truth.
... Users who installed the apps with AFP configuration can identify and customize finegrained permission levels on private or confidential resources [79,80]. In the same year, [81] introduced a dynamic analysis approach that monitors the permissions requested by apps during the run-time and recognizes those requested permissions by the app's fundamental functionality from those demanded by third-party libraries linked with the app [81]. In 2016, [8] proposed a framework that enforces fine-grained security privacy policies and enables users to manage access of applications to sensitive elements. ...
... Users who installed the apps with AFP configuration can identify and customize finegrained permission levels on private or confidential resources [79,80]. In the same year, [81] introduced a dynamic analysis approach that monitors the permissions requested by apps during the run-time and recognizes those requested permissions by the app's fundamental functionality from those demanded by third-party libraries linked with the app [81]. In 2016, [8] proposed a framework that enforces fine-grained security privacy policies and enables users to manage access of applications to sensitive elements. ...
Article
Full-text available
Android mobile apps gain access to numerous users’ private data. Users of different Android mobile apps have less control over their sensitive data during their installation and run-time. Too often, these apps consider data privacy less serious than users’ expectations. Many mobile apps misbehave and upload users’ data without permission which confirmed the possibility of privacy leakage through different network channels. The literature has proposed various approaches to protect user’s data and avoid privacy violations. In this paper, we provide a comprehensive overview of state-of-art research on Android user privacy, and data flow control. the aim is to highlight the main trends, pinpoint the main methodologies applied, and enumerate the privacy violations faced by Android users. We also shed some light on the directions where the researcher’s community effort is still needed. To this end, we conduct a Systematic Literature Review (SLR) during which we surveyed 114 relevant research papers published in leading conferences and journals. Our thorough examination of the relevant literature has led to a critical analysis of the proposed solutions with a focus on user privacy extensions and mechanism for the Android mobile platform. Furthermore, possible solutions and research directions have been discussed.
... In addition to the studies focusing on WATs, it is important to recognize research that examined smartphone permissions and proposed related privacy-enhancing technologies. Researchers analyzed mobile apps and spotted the overprivileged ones [24,35,55,80]. Such apps often ask for more permissions than are required, as engineers may misunderstand some permission data scope [96]. ...
Article
Full-text available
Wearable activity trackers (WATs) have recently gained worldwide popularity, with over a billion devices collecting a range of personal data. To receive additional services, users commonly share this data with third-party applications (TPAs). However, this practice poses potential privacy risks. Privacy-enhancing technologies have been developed to address these concerns, but they often lack user-centered design, and therefore, are less likely to be directly related to users’ concerns and to be widely adopted. This study takes a participatory design approach involving 𝑁 = 26 experienced WAT users who share data with TPAs. Through a series of design sessions, participants conceptualized 19 solutions, from which we identified seven different design features. We further analyze and discuss how these features can be combined to assist users in managing their data sharing with TPAs and, therefore, enhancing their privacy. Finally, we selected the three most promising features, namely partial sharing, reminder, and revocation assistance, and conducted an online survey with 𝑁 = 201 WAT users to better understand the potential effectiveness and usability of these features. This work makes an important contribution by offering user-centered solutions and valuable insights for integrating privacy-enhancing technologies into the WAT ecosystem.
... Moreover, research work such as BorderPatrol [48] and Reaper [16] make use of some unique techniques to fight against the malicious application. While the former attempts to protect the device by generating a customized Mobile Device Manager which leverages finegrained contextual information, the later provides a real-time analysis of the android application, augmenting and complementing the android permission system, hence potentially countering the ongoing attack. ...
Preprint
Full-text available
Presently, Bangladesh is expending substantial efforts to digitize its national infrastructure, with a significant emphasis on achieving this goal through mobile applications that facilitate online payments and banking system advancements. Despite the lack of knowledge about the security level of these systems, they are currently in frequent use without much consideration. To observe whether they follow the minimum global set standards, we choose to conduct static and dynamic analysis of the applications using available open-source analyzers and open-source tools. This allows us to attempt to extract sensitive information, if possible, and determine whether the applications adhere to the standards of MASVS set by OWASP. We show how we analyzed 17 .apks and a SDK using open source scanner and discover security flaws to the applications, such as weaknesses related to data storage, vulnerable cryptographic elements, insecure network communications, and unsafe utilization of WebViews, detected by the scanner. These outputs demonstrate the need for extensive manual analysis of the application through source code review and dynamic analysis. We further implement reverse engineering and dynamic approach to verify the outputs and expose some applications do not comply with the standard method of network communication. Moreover, we attempt to verify the rest of the potential vulnerabilities in the next phase of our ongoing investigation.
... It detects malicious actions the app might perform at runtime. Diamantaris et al [22] presents a dynamic analysis system that tracks permission requests made by an Android app in realtime as part of its core functionality, and separates those permission requests from requests made by thirdparty libraries linked with the Android app. The objective was to counter confidential information leakage attacks committed by third-party libraries linked to Android apps. ...
... An example of this is when a password manager extension has host permission for one site, then it cannot access any other site. When it comes to API permissions, the extension core can only access it when it is guarded by permissions if it has the corresponding permissions in its manifest [22][23] [24]. To add on, gaining access to certain Chrome extension APIs and browser APIs are constrained by host permissions. ...
Conference Paper
Full-text available
In today's world, web browsers are used by most everyone daily. Many of us take this incredible functionality for granted, and don't recognize the potential risks that are involved with simple internet use. These risks exist in both the browsers and their extensions. Google Chrome has cemented itself as one of the go-to browsers for both commercial and everyday consumer use. In fact, Google Chrome is currently the most used web browser in the world. But with such heavy global use, comes a feeling of false security. Just like many applications and browsers that are widely available and free to use, Google Chrome has its weaknesses and vulnerabilities. These exist within the browser itself, but for the purposes of this research, Chrome's extensions will be the main focus. This research explores the potential risks that are involved in utilizing browser extensions for Google Chrome. We will also look at a variety of attacks that extensions can execute, and exactly how they work. These attacks aren't guaranteed to cause malicious behavior, but we will also discuss ways of increasing user safety when operating a web browser, and specifically browser extensions. The objective of this research is to test a variety of different attacks using browser extensions on Google Chrome. By researching and implementing a variety of different attacks, authors plan to find where Chrome is susceptible to allowing malicious extensions. This research will help to inform browser users of the dangers that exist when using extensions, and how threat actors may be deceiving them to perform malicious activities on their computers. This research also shows the different vulnerabilities that exist within browsers, and demonstrates the privileges that browsers have on a computer. It is expected the research output to show that browser vulnerabilities aren't a one size fits all type of attack and also expected some websites to have lower levels of protection allowing for poor security against malicious extensions.
... Reaper (Diamantaris et al., 2019) provides real-time analysis of Android apps, in an effort to augment and complement the Android Permission System, thus potentially counteracting ongoing attacks. As with DyPolDroid, Reaper leverages dynamic analysis of Android APK files to detect permission abuse, and also uses stack trace info of the running process for further processing. ...
Article
Full-text available
Android applications are extremely popular, as they are widely used for banking, social media, e-commerce, etc. Such applications typically leverage a series of Permissions, which serve as a convenient abstraction for mediating access to security-sensitive functionality within the Android Ecosystem, e.g., sending data over the Internet. However, several malicious applications have recently deployed attacks such as data leaks and spurious credit card charges by abusing the Permissions granted initially to them by unaware users in good faith. To alleviate this pressing concern, we present DyPolDroid, a dynamic and semi-automated security framework that builds upon Android Enterprise, a device-management framework for organizations, to allow for users and administrators to design and enforce so-called Counter-Policies, a convenient user-friendly abstraction to restrict the sets of Permissions granted to potential malicious applications, thus effectively protecting against serious attacks without requiring advanced security and technical expertise. Additionally, as a part of our experimental procedures, we introduce Laverna, a fully operational application that uses permissions to provide benign functionality at the same time it also abuses them for malicious purposes. To fully support the reproducibility of our results, and to encourage future work, the source code of both DyPolDroid and Laverna is publicly available as open-source.
... Several studies investigated Android application permission system. REAPER [36] is a tool that traces the permissions requested by apps in real time and discriminates them from other permissions requested by third-party libraries linked with the app. Fasano et al. [37] proposed a formal method to detect the exact point in the code of an Android application where a permission is invoked at runtime. ...
Article
Full-text available
Android applications have recently witnessed a pronounced progress, making them among the fastest growing technological fields to thrive and advance. However, such level of growth does not evolve without some cost. This particularly involves increased security threats that the underlying applications and their users usually fall prey to. As malware becomes increasingly more capable of penetrating these applications and exploiting them in suspicious actions, the need for active research endeavors to counter these malicious programs becomes imminent. Some of the studies are based on dynamic analysis, and others are based on static analysis, while some are completely dependent on both. In this paper, we studied static, dynamic, and hybrid analyses to identify malicious applications. We leverage machine learning classifiers to detect malware activities as we explain the effectiveness of these classifiers in the classification process. Our results prove the efficiency of permissions and the action repetition feature set and their influential roles in detecting malware in Android applications. Our results show empirically very close accuracy results when using static, dynamic, and hybrid analyses. Thus, we use static analyses due to their lower cost compared to dynamic and hybrid analyses. In other words, we found the best results in terms of accuracy and cost (the trade-off) make us select static analysis over other techniques.
... The results show that Aper can significantly outperform existing tools, and find real bugs in popular Android projects with useful debugging information. In the future, we plan to extend Aper to support more types of runtime permission bugs, such as library-induced [39] or device-specific [65] bugs. ...
Preprint
Full-text available
The Android platform introduces the runtime permission model in version 6.0. The new model greatly improves data privacy and user experience, but brings new challenges for app developers. First, it allows users to freely revoke granted permissions. Hence, developers cannot assume that the permissions granted to an app would keep being granted. Instead, they should make their apps carefully check the permission status before invoking dangerous APIs. Second, the permission specification keeps evolving, bringing new types of compatibility issues into the ecosystem. To understand the impact of the challenges, we conducted an empirical study on 13,352 popular Google Play apps. We found that 86.0% apps used dangerous APIs asynchronously after permission management and 61.2% apps used evolving dangerous APIs. If an app does not properly handle permission revocations or platform differences, unexpected runtime issues may happen and even cause app crashes. We call such Android Runtime Permission issues as ARP bugs. Unfortunately, existing runtime permission issue detection tools cannot effectively deal with the ARP bugs induced by asynchronous permission management and permission specification evolution. To fill the gap, we designed a static analyzer, Aper, that performs reaching definition and dominator analysis on Android apps to detect the two types of ARP bugs. To compare Aper with existing tools, we built a benchmark, ARPfix, from 60 real ARP bugs. Our experiment results show that Aper significantly outperforms two academic tools, ARPDroid and RevDroid, and an industrial tool, Lint, on ARPfix, with an average improvement of 46.3% on F1-score. In addition, Aper successfully found 34 ARP bugs in 214 opensource Android apps, most of which can result in abnormal app behaviors (such as app crashes) according to our manual validation.
... Similar to monitors like REAPER [11] and MOSES [25], JIT-MF uses trigger points which, rather than being indicators for malicious events, such as permission misuse, are indicators of benign events that may be misused by an attacker. In contrast to typical monitors, JIT-MF dumps necessary memory contents for post-analysis at runtime, which is less costly than online analysis. ...
Chapter
Full-text available
Attackers regularly target Android phones and come up with new ways to bypass detection mechanisms to achieve long-term stealth on a victim’s phone. One way attackers do this is by leveraging critical benign app functionality to carry out specific attacks. In this paper, we present a novel generalised framework, JIT-MF ( Just-in-time Memory Forensics ), which aims to address the problem of timely collection of short-lived evidence in volatile memory to solve the stealthiest of Android attacks. The main components of this framework are i) Identification of critical data objects in memory linked with critical benign application steps that may be misused by an attacker; and ii) Careful selection of trigger points, which identify when memory dumps should be taken during benign app execution. The effectiveness and cost of trigger point selection, a cornerstone of this framework, are evaluated in a preliminary qualitative study using Telegram and Pushbullet as the victim apps targeted by stealthy malware. Our study identifies that JIT-MF is successful in dumping critical data objects on time, providing evidence that eludes all other forensic sources. Experimentation offers insight into identifying categories of trigger points that can strike a balance between the effort required for selection and the resulting effectiveness and storage costs. Several optimisation measures for the JIT-MF tools are presented, considering the typical resource constraints of Android devices.
... These studies on Android permissions have mainly leveraged static analysis techniques to understand the role of a given permission [7], [22], [25], potential privacy violation incurred by overprivileged apps [22], [59], permission circumvention [55], description-to-permission fidelity [53], and improve mapping of Android permissions to framework/SDK API methods [8], [2]. Some recent research efforts also utilize dynamic analysis systems to distinguish and trace the permissions requested by apps at the runtime and those requested by the app's core functionality [17] and generate a more precise call graph enabling the system to extract the permission specification and improve the mapping [41]. Our study complements these studies by showing the scales and the prevalence of private information collection in the real world devices. ...
Preprint
Full-text available
Mobile phones enable the collection of a wealth of private information, from unique identifiers (e.g., email addresses), to a user's location, to their text messages. This information can be harvested by apps and sent to third parties, which can use it for a variety of purposes. In this paper we perform the largest study of private information collection (PIC) on Android to date. Leveraging an anonymized dataset collected from the customers of a popular mobile security product, we analyze the flows of sensitive information generated by 2.1M unique apps installed by 17.3M users over a period of 21 months between 2018 and 2019. We find that 87.2% of all devices send private information to at least five different domains, and that actors active in different regions (e.g., Asia compared to Europe) are interested in collecting different types of information. The United States (62% of the total) and China (7% of total flows) are the countries that collect most private information. Our findings raise issues regarding data regulation, and would encourage policymakers to further regulate how private information is used by and shared among the companies and how accountability can be truly guaranteed.
... These studies on Android permissions have mainly leveraged static analysis techniques to understand the role of a given permission [8], [23], [26], potential privacy violation incurred by overprivileged apps [23], [64], permission circumvention [59], description-to-permission fidelity [56], and improve mapping of Android permissions to framework/SDK API methods [9], [2]. Some recent research efforts also utilize dynamic analysis systems to distinguish and trace the permissions requested by apps at the runtime and those requested by the apps core functionality [18] and generate a more precise call graph enabling the system to extract the permission specification and improve the mapping [43]. Our study complements these studies by showing the scales and the prevalence of private information collection in the real world devices. ...
Conference Paper
Full-text available
Mobile phones enable the collection of a wealth of private information, from unique identifiers (e.g., email addresses), to a user's location, to their text messages. This information can be harvested by apps and sent to third parties, which can use it for a variety of purposes. In this paper we perform the largest study of private information collection (PIC) on Android to date. Leveraging an anonymized dataset collected from the customers of a popular mobile security product, we analyze the flows of sensitive information generated by 2.1M unique apps installed by 17.3M users over a period of 21 months between 2018 and 2019. We find that 87.2% of all devices send private information to at least five different domains, and that actors active in different regions (e.g., Asia compared to Europe) are interested in collecting different types of information. The United States (62% of the total) and China (7% of total flows) are the countries that collect most private information. Our findings raise issues regarding data regulation, and would encourage policymakers to further regulate how private information is used by and shared among the companies and how accountability can be truly guaranteed.
... A smaller number of works go a step further by collecting and analyzing more contextual information to determine a potential privacy violation. For example, some of them focus on detecting who is accessing the personal data (by distinguishing third-party libraries from the app itself [35]- [37], while others focus on discriminating whether access to certain personal data is required for the app's core functionality or another secondary (third-party) task [38]. The Google Play Protect (GPP) approach [39] is also aligned with this paradigm aiming at detecting potential harmful behaviour in the Android ecosystem, including the disclosure of personal data off the device via Spyware. ...
Article
Full-text available
The pervasiveness of Android mobile applications and the services they support allow the personal data of individuals to be collected and shared worldwide. However, data protection legislations usually require all participants in a personal data flow to ensure an equivalent level of personal data protection, regardless of location. In particular, the European General Data Protection Regulation constrains cross-border transfers of personal data to non-EU countries and establishes specific requirements to carry them out. This paper presents a method to systematically assess compliance of Android mobile apps with the requirements for cross-border transfers established by the European data protection regulation. We have validated the method with one hundred Android apps, finding an outstanding 66% of ambiguous, inconsistent and omitted cross-border transfer disclosures.
... They ranked permissions according to its usage and used a priori association rules to extract significant permissions for classification of malign and benign apps. Diamantaris et al. [11] distinguished permissions required by the core functions of the Android apps and integrated by third-party libraries. They showed 30 topmost permissions used by core and third-party libraries. ...
... Designed to generate descriptions for security-relevant implementation parts, the approach of Zhang et al. [34] evaluates a graphbased app representation. To recognize critical app behavior at runtime, Diamantaris et al. [6] hook calls to privacy-relevant APIs and inspect security-critical parameter values via backtracking. ...
Conference Paper
Full-text available
Permissions are a key factor in Android to protect users' privacy. As it is often not obvious why applications require certain permissions, developer-provided descriptions in Google Play and third-party markets should explain to users how sensitive data is processed. Reliably recognizing whether app descriptions cover permission usage is challenging due to the lack of enforced quality standards and a variety of ways developers can express privacy-related facts. We introduce a machine learning-based approach to identify critical discrepancies between developer-described app behavior and permission usage. By combining state-of-the-art techniques in natural language processing (NLP) and deep learning, we design a convolutional neural network (CNN) for text classification that captures the relevance of words and phrases in app descriptions in relation to the usage of dangerous permissions. Our system predicts the likelihood that an app requires certain permissions and can warn about descriptions in which the requested access to sensitive user data and system features is textually not represented. We evaluate our solution on 77,000 real-world app descriptions and find that we can identify individual groups of dangerous permissions with a precision between 71% and 93%. To highlight the impact of individual words and phrases, we employ a model explanation algorithm and demonstrate that our technique can successfully bridge the semantic gap between described app functionality and its access to security- and privacy-sensitive resources.
... Each mobile HTML5 WebAPI is associated with a low-level Android API call. In order to validate the results of the JavaScript interception and to identify which ones require a permission, we use the PermissionHarvester [26] module that hooks every Android permission protected API call. Since access to some of the sensors does not require an Android permission, we also manually identified and hooked the functions that give access to non-permission-protected sensor data. ...
Conference Paper
Smartphone sensors can be leveraged by malicious apps for a plethora of different attacks, which can also be deployed by malicious websites through the HTML5 WebAPI. In this paper we provide a comprehensive evaluation of the multifaceted threat that mobile web browsing poses to users, by conducting a large-scale study of mobile-specific HTML5 WebAPI calls used in the wild. We build a novel testing infrastructure consisting of actual smartphones on top of a dynamic Android app analysis framework, allowing us to conduct an end-to-end exploration. Our study reveals the extent to which websites are actively leveraging the WebAPI for collecting sensor data, with 2.89% of websites accessing at least one mobile sensor. To provide a comprehensive assessment of the potential risks of this emerging practice, we create a taxonomy of sensor-based attacks from prior studies, and present an in-depth analysis by framing our collected data within that taxonomy. We find that 1.63% of websites could carry out at least one of those attacks. Our findings emphasize the need for a standardized policy across browsers and the ability for users to control what sensor data each website can access.
Article
Within the Android mobile operating system, Android permissions act as a system of safeguards designed to restrict access to potentially sensitive data and privileged components. Multiple research studies indicate flaws and limitations of the Android permission system, prompting Google to implement a more regulated and fine-grained permission model. This newly-introduced complexity creates confusion for developers leading to incorrect permissions and a significant risk to users security and privacy. We present a systematic study of theoretical and practical misuse of permissions. For this analysis we derive the unified permissions and call mappings that represent theoretical requirements of permissions and calls. We develop PChecker, an approach that identifies the discrepancies between the official Android permissions documentation and permission implementation in the Android platform source code based on these mappings. We evaluate four versions of the Android Open Source Project code (major versions 10–13) and shed light on the prevalence of discrepancies between the official Android guidelines for permissions and their implementation in the Android platform source code. We further show that these discrepancies result in miscompliance in third-party Android apps.</p
Conference Paper
Data mining is capable of giving and providing the hidden, unknown, and interesting information in terms of knowledge in healthcare industry. It is useful to form decision support systems for the disease prediction and valid diagnosis of health issues. The concepts of data mining can be used to recommend the solutions and suggestions in medical sector for precautionary measures to control the disease origin at early stage. Today, diabetes is a most common and life taking syndrome found all over theworld. The presence of diabetes itself is a cause tomany other health issues in the form of side effects in human body. In such cases when considered, a need is to find the hidden data patterns from diabetic data to discover the knowledge so as to reduce the invisible health problems that arise in diabetic patients. Many studies have shown that AssociativeClassification concept of dataminingworks well and can derive good outcomes in terms of prediction accuracy. This research work holds the experimental results of the work carried out to predict and detect the by-diseases in diabetic patients with the application of Associative Classification, and it discusses an improved algorithmic method of Associative Classification named Associative Classification using Maximum Threshold and Super Subsets (ACMTSS) to achieve accuracy in better terms. Keywords Knowledge · By-disease · Maximum threshold · Super subsets · ACMTSS · Associative Classification
Chapter
Android applications are extremely popular, as they are widely used for banking, social media, e-commerce, etc. Such applications typically leverage a series of Permissions, which serve as a convenient abstraction for mediating access to security-sensitive functionality, e.g., sending data over the Internet, within the Android Ecosystem. However, several malicious applications have recently deployed attacks such as data leaks and spurious credit card charges by abusing the Permissions granted initially to them by unaware users in good faith. To alleviate this pressing concern, we present DyPolDroid, a dynamic and semi-automated security framework that builds upon Android Enterprise, a device-management framework for organizations, to allow for users and administrators to design and enforce so-called Counter-Policies, a convenient user-friendly abstraction to restrict the sets of Permissions granted to potential malicious applications, thus effectively protecting against serious attacks without requiring advanced security and technical expertise. Additionally, as a part of our experimental procedures, we introduce Laverna, a fully operational application that uses permissions to provide benign functionality at the same time it also abuses them for malicious purposes. To fully support the reproducibility of our results, and to encourage future work, the source code of both DyPolDroid and Laverna is publicly available as open-source.KeywordsPermission-abuse attacksAccess controlAndroid Enterprise
Chapter
The Android ecosystem is dynamic and diverse. Controls have been set in place to allow mobile device users to regulate exchanged data and restrict apps from accessing sensitive personal information and system resources. Modern versions of the operating system implement the run-time permission model which prompts users to allow access to protected resources the moment an app attempts to utilize them. It is assumed that, in general, the run-time permission model, compared to its predecessor, enhances users’ security awareness. In this paper we show that installed apps on Android devices are able to employ the systems’ public assets and extract users’ permission settings. Then we utilize permission data from 71 Android devices to create privacy profiles based on users’ interaction with permission dialogues initiated by the system during run-time. Therefore, we demonstrate that any installed app that runs on the foreground can perform an endemic live digital forensic analysis on the device and derive similar privacy profiles of the user. Moreover, focusing on the human factors of security, we show that although in theory users can control the resources they make accessible to apps, they eventually fail to successfully recall these settings, even for the apps that they regularly use. Finally, we briefly discuss our findings derived from a pen-and-paper exercise showcasing that users are more likely to allow apps to access their location data on contemporary mobile devices (running version Android 10).
Article
Modern smartphone sensors can be leveraged for providing novel functionality and greatly improving the user experience. However, sensor data can be misused by privacy-invasive or malicious entities. Additionally, a wide range of other attacks that use mobile sensor data have been demonstrated; while those attacks have typically relied on users installing malicious apps, browsers have eliminated that constraint with the deployment of HTML5 WebAPI. In this article, we conduct a comprehensive evaluation of the multifaceted threat that mobile web browsing poses to users by conducting a large-scale study of mobile-specific HTML5 WebAPI calls across more than 183K of the most popular websites. We build a novel testing infrastructure consisting of actual smartphones on top of a dynamic Android app analysis framework, allowing us to conduct an end-to-end exploration. In detail, our system intercepts and tracks data access in real time, from the WebAPI JavaScript calls down to the Android system calls. Our study reveals the extent to which websites are actively leveraging the WebAPI for collecting sensor data, with 2.89% of websites accessing at least one sensor. To provide a comprehensive assessment of the risks of this emerging practice, we create a taxonomy of sensor-based attacks from prior studies and present an in-depth analysis by framing our collected data within that taxonomy. We find that 1.63% of websites can carry out at least one attack and emphasize the need for a standardized policy across all browsers and the ability for users to control what sensor data each website can access.
Article
Full-text available
Software Quality Control (SQC) techniques are widely used throughout the software development process with the objective of assessing and detecting anomalies that affect the quality of an information system. Privacy is one quality attribute of software systems for which several SQC techniques have been proposed in recent years. However, research has been carried out from different perspectives and, consequently, it has led to a growing body of knowledge scattered across different domains. To bridge this gap, we have carried out a systematic mapping study to provide practitioners and researchers with an overview of the state-of-the-art techniques to carry out software quality control of information systems focusing on aspects of privacy. Our results show a steady growth in the research efforts in this field. The European General Data Protection Regulation seems to have a significant influence on this growth, since 37% of techniques that focus on assessing compliance derive their assessment criteria from this legal framework. The maturity of the techniques varies between the type of technique: Formal verification techniques exhibit the lowest level of maturity while the combination of techniques has demonstrated its successful application in real-world scenarios. The latter seems a promising avenue of research as it provides better results in terms of coverage, precision and effectiveness than the application of individual, isolated techniques. In this paper, we describe the existing SQC techniques focusing on privacy and provide a suitable basis for identifying future research directions.
Article
Most Android applications include third‐party libraries (3PLs) to make revenues, to facilitate their development, and to track user behaviors. 3PLs generally require specific permissions to realize their functionalities. Current Android systems manage permissions in app (process) granularity. As a result, the permission sets of apps with 3PLs (3PL‐apps) may be augmented, introducing overprivilege risks. In this paper, we firstly study how severe the problem is by analyzing the permission sets of 27 718 real‐world Android apps with and without 3PLs downloaded in both 2016 and 2017. We find that the usage of 3PLs and the permissions required by 3PL‐apps have increased over time. As a result, the possibility of overprivilege risks increases. We then propose Perman, a fine‐grained permission management mechanism for Android. Perman isolates the permissions of the host app and those of the 3PLs through dynamic code instrumentation. It allows users to manage permission requests of different modules of 3PL‐apps during app runtime. Unlike existing tools, Perman does not need to redesign Android apps and systems. Therefore, it can be applied to millions of existing apps and various Android devices. We conduct experiments to evaluate the effectiveness and efficiency of Perman. The experimental results verify that Perman is capable of managing permission requests of the host app and those of the 3PLs. We also confirm that the overhead introduced by Perman is comparable to that by existing commercial permission management tools.
Conference Paper
Full-text available
Mobile apps are ubiquitous, operate in complex environments and are developed under the time-to-market pressure. Ensuring their correctness and reliability thus becomes an important challenge. This paper introduces Stoat, a novel guided approach to perform stochastic model-based testing on Android apps. Stoat operates in two phases: (1) Given an app as input, it uses dynamic analysis enhanced by a weighted UI exploration strategy and static analysis to reverse engineer a stochastic model of the app's GUI interactions; and (2) it adapts Gibbs sampling to iteratively mutate/refine the stochastic model and guides test generation from the mutated models toward achieving high code and model coverage and exhibiting diverse sequences. During testing, system-level events are randomly injected to further enhance the testing effectiveness. Stoat was evaluated on 93 open-source apps. The results show (1) the models produced by Stoat cover 17~31% more code than those by existing modeling tools; (2) Stoat detects 3X more unique crashes than two state-of-the-art testing tools, Monkey and Sapienz. Furthermore, Stoat tested 1661 most popular Google Play apps, and detected 2110 previously unknown and unique crashes. So far, 43 developers have responded that they are investigating our reports. 20 of reported crashes have been confirmed, and 8 already fixed.
Conference Paper
Full-text available
Since the appearance of Android, its permission system was central to many studies of Android security. For a long time, the description of the architecture provided by Enck et al. in [31] was immutably used in various research papers. The introduction of highly anticipated runtime permissions in Android 6.0 forced us to reconsider this model. To our surprise, the permission system evolved with almost every release. After analysis of 16 Android versions, we can confirm that the modifications, especially introduced in Android 6.0, considerably impact the aptness of old conclusions and tools for newer releases. For instance, since Android 6.0 some signature permissions, previously granted only to apps signed with a platform certificate, can be granted to third-party apps even if they are signed with a non-platform certificate; many permissions considered before as threatening are now granted by default. In this paper, we review in detail the updated system, introduced changes, and their security implications. We highlight some bizarre behaviors, which may be of interest for developers and security researchers. We also found a number of bugs during our analysis, and provided patches to AOSP where possible.
Article
Full-text available
The packaging model of Android apps requires the entire code necessary for the execution of an app to be shipped into one single apk file. Thus, an analysis of Android apps often visits code which is not part of the functionality delivered by the app. Such code is often contributed by the common libraries which are used pervasively by all apps. Unfortunately, Android analyses, e.g., for piggybacking detection and malware detection, can produce inaccurate results if they do not take into account the case of library code, which constitute noise in app features. Despite some efforts on investigating Android libraries, the momentum of Android research has not yet produced a complete set of common libraries to further support in-depth analysis of Android apps. In this paper, we leverage a dataset of about 1.5 million apps from Google Play to harvest potential common libraries, including advertisement libraries. With several steps of refinements, we finally collect by far the largest set of 1,113 libraries supporting common functionalities and 240 libraries for advertisement. We use the dataset to investigates several aspects of Android libraries, including their popularity and their proportion in Android app code. Based on these datasets, we have further performed several empirical investigations to confirm the motivations behind our work.
Conference Paper
Full-text available
Smartphone usage is tightly coupled with the use of apps that can be either free or paid. Numerous studies have investigated the tracking libraries associated with free apps. Only a limited number of these have focused on paid apps. As expected, these investigations indicate that tracking is happening to a lesser extent in paid apps, yet there is no conclusive evidence. This paper provides the first large-scale study of paid apps. We analyse top paid apps obtained from four different countries: Australia, Brazil, Germany, and US, and quantify the level of tracking taking place in paid apps in comparison to free apps. Our analysis shows that 60% of the paid apps are connected to trackers that collect personal information compared to 85%--95% in free apps. We further show that approximately 20% of the paid apps are connected to more than three trackers. With tracking being pervasive in both free and paid apps, we then quantify the aggregated privacy leakages associated with individual users. Using the data of user installed apps of over 300 smartphone users, we show that 50% of the users are exposed to more than 25 trackers which can result in significant leakages of privacy.
Article
Full-text available
Due to the amount of data that smartphone applications can potentially access, platforms enforce permission systems that allow users to regulate how applications access protected resources. If users are asked to make security decisions too frequently and in benign situations, they may become habituated and approve all future requests without regard for the consequences. If they are asked to make too few security decisions, they may become concerned that the platform is revealing too much sensitive information. To explore this tradeoff, we instrumented the Android platform to collect data regarding how often and under what circumstances smartphone applications are accessing protected resources regulated by permissions. We performed a 36-person field study to explore the notion of "contextual integrity," that is, how often are applications accessing protected resources when users are not expecting it? Based on our collection of 27 million data points and exit interviews with participants, we examine the situations in which users would like the ability to deny applications access to protected resources. We found out that at least 80% of our participants would have preferred to prevent at least one permission request, and overall, they thought that over a third of requests were invasive and desired a mechanism to block them.
Article
Full-text available
Antivirus companies, mobile application marketplaces, and the security research community, employ techniques based on dynamic code analysis to detect and analyze mobile malware. In this paper, we present a broad range of anti-analysis techniques that malware can employ to evade dynamic analysis in emulated Android environments. Our detection heuristics span three different categories based on (i) static properties, (ii) dynamic sensor information, and (iii) VM-related intricacies of the Android Emulator. To assess the effectiveness of our techniques, we incorporated them in real malware samples and submitted them to publicly available Android dynamic analysis systems, with alarming results. We found all tools and services to be vulnerable to most of our evasion techniques. Even trivial techniques, such as checking the value of the IMEI, are enough to evade some of the existing dynamic analysis frameworks. We propose possible countermeasures to improve the resistance of current dynamic analysis tools against evasion attempts.
Article
Full-text available
Android uses a permission-based security model to restrict applications from accessing private data and privileged resources. However, the permissions are assigned at the application level, so even untrusted third-party libraries, such as advertisement, once incorporated, can share the same privileges as the entire application, leading to over-privileged problems. We present AFrame, a developer friendly method to isolate untrusted third-party code from the host applications. The isolation achieved by AFrame covers not only the process/permission isolation, but also the display and input isolation. Our AFrame framework is implemented through a minimal change to the existing Android code base; our evaluation results demonstrate that it is effective in isolating the privileges of untrusted third-party code from applications with reasonable performance overhead.
Article
Full-text available
Modern smartphone operating systems (OSs) have been developed with a greater emphasis on security and protecting privacy. One of the mechanisms these systems use to protect users is a permission system, which requires developers to declare what sensitive resources their applications will use, has users agree with this request when they install the application and constrains the application to the requested resources during runtime. As these permission systems become more common, questions have risen about their design and implementation. In this paper, we perform an analysis of the permission system of the Android smartphone OS in an attempt to begin answering some of these questions. Because the documentation of Android's permission system is incomplete and because we wanted to be able to analyze several versions of Android, we developed PScout, a tool that extracts the permission specification from the Android OS source code using static analysis. PScout overcomes several challenges, such as scalability due to Android's 3.4 million line code base, accounting for permission enforcement across processes due to Android's use of IPC, and abstracting Android's diverse permission checking mechanisms into a single primitive for analysis. We use PScout to analyze 4 versions of Android spanning version 2.2 up to the recently released Android 4.0. Our main findings are that while Android has over 75 permissions, there is little redundancy in the permission specification. However, if applications could be constrained to only use documented APIs, then about 22% of the non-system permissions are actually unnecessary. Finally, we find that a trade-off exists between enabling least-privilege security with fine-grained permissions and maintaining stability of the permission specification as the Android OS evolves.
Conference Paper
Full-text available
Smartphones in general and Android in particular are increasingly shifting into the focus of cybercriminals. For understanding the threat to security and privacy it is important for security researchers to analyze malicious software written for these systems. The exploding number of Android malware calls for automation in the analysis. In this paper, we present Mobile-Sandbox, a system designed to automatically analyze Android applications in two novel ways: (1) it combines static and dynamic analysis, i.e., results of static analysis are used to guide dynamic analysis and extend coverage of executed code, and (2) it uses specific techniques to log calls to native (i.e., "non-Java") APIs. We evaluated the system on more than 36,000 applications from Asian third-party mobile markets and found that 24% of all applications actually use native calls in their code.
Conference Paper
Android applications are frequently plagiarized or repackaged, and software obfuscation is a recommended protection against these practices. However, there is very little data on the overall rates of app obfuscation, the techniques used, or factors that lead to developers to choose to obfuscate their apps. In this paper, we present the first comprehensive analysis of the use of and challenges to software obfuscation in Android applications. We analyzed 1.7 million free Android apps from Google Play to detect various obfuscation techniques, finding that only 24.92% of apps are obfuscated by the developer. To better understand this rate of obfuscation, we surveyed 308 Google Play developers about their experiences and attitudes about obfuscation. We found that while developers feel that apps in general are at risk of plagiarism, they do not fear theft of their own apps. Developers also report difficulties obfuscating their own apps. To better understand, we conducted a follow-up study where the vast majority of 70 participants failed to obfuscate a realistic sample app even while many mistakenly believed they had been successful. These findings have broad implications both for improving the security of Android apps and for all tools that aim to help developers write more secure software.
Conference Paper
When accessing online private resources (e.g., user profiles, photos, shopping carts) from a client (e.g., a desktop web-browser or a mobile app), the service providers must implement proper access control, which typically involves both authentication and authorization. However, not all of the service providers follow the best practice, resulting in various access control vulnerabilities. To understand such a threat in a large scale, and identify the vulnerable access control implementations in online services, this paper introduces AuthScope, a tool that is able to automatically execute a mobile app and pinpoint the vulnerable access control implementations, particularly the vulnerable authorizations, in the corresponding online service. The key idea is to use differential traffic analysis to recognize the protocol fields and then automatically substitute the fields and observe the server response. One of the key challenges for a large scale study lies in how to obtain the post-authentication request-and-response messages for a given app. We have thus developed a targeted dynamic activity explorer to perform an in-context analysis and drive the app execution to automatically log in the service. We have tested AuthScope with 4,838 popular mobile apps from Google Play, and identified 597 0-day vulnerable authorizations that map to 306 apps.
Article
The enormous popularity of smartphones, their rich sensing capabilities, and the data they have about their users have lead to millions of apps being developed and used. However, these capabilities have also led to numerous privacy concerns. Platform manufacturers, as well as researchers, have proposed numerous ways of mitigating these concerns, primarily by providing fine-grained visibility and privacy controls to the user on a per-app basis. In this paper, we show that this per-app permission approach is suboptimal for many apps, primarily because most data accesses occur due to a small set of popular third-party libraries which are common across multiple apps. To address this problem, we present the design and implementation of ProtectMyPrivacy (PmP) for Android, which can detect critical contextual information at runtime when privacy-sensitive data accesses occur. In particular, PmP infers the purpose of the data access, i.e. whether the data access is by a third-party library or by the app itself for its functionality. Based on crowdsourced data, we show that there are in fact a set of 30 libraries which are responsible for more than half of private data accesses. Controlling sensitive data accessed by these libraries can therefore be an effective mechanism for managing their privacy. We deployed our PmP app to 1,321 real users, showing that the number of privacy decisions that users have to make are significantly reduced. In addition, we show that our users are better protected against data leakage when using our new library-based blocking mechanism as compared to the traditional app-level permission mechanisms.
Article
Mobile apps frequently request access to sensitive data, such as location and contacts. Understanding the purpose of why sensitive data is accessed could help improve privacy as well as enable new kinds of access control. In this article, we propose a text mining based method to infer the purpose of sensitive data access by Android apps. The key idea we propose is to extract multiple features from app code and then use those features to train a machine learning classifier for purpose inference. We present the design, implementation, and evaluation of two complementary approaches to infer the purpose of permission use, first using purely static analysis, and then using primarily dynamic analysis. We also discuss the pros and cons of both approaches and the trade-offs involved.
Conference Paper
While much effort has been made to detect and measure the privacy leakage caused by the advertising (ad) libraries integrated in mobile applications (i.e., apps), analytics libraries, which are also widely used in mobile apps have not been systematically studied for their privacy risks. Different from ad libraries, the main function of analytics libraries is to collect users’ in-app actions. Hence, by design, analytics libraries are more likely to leak users’ private information. In this work, we study what information is collected by the analytics libraries integrated in popular Android apps. We design and implement a tool called “Alde”. Given an app, Alde employs both static analysis and dynamic analysis to detect the data collected by analytics libraries. We also study what private information can be leaked by the apps that use the same analytics library. Moreover, we analyze apps’ privacy policies to see whether app developers have notified the users that their in-app action information is collected by analytics libraries. Finally, we select 8 widely used analytics libraries to study and apply our method on 300 apps downloaded from both Chinese app markets and Google play. Our experimental results request the emerging need for better regulating the use of analytics libraries in Android apps.
Conference Paper
Mobile computing has experienced enormous growth in market share and computational power in recent years. As a result, mobile malware is becoming more sophisticated and more prevalent, leading to research into dynamic sandboxes as a widespread approach for detecting malicious applications. However, the event-driven nature of Android applications renders critical the capability to automatically generate deterministic and intelligent user interactions to drive analysis subjects and improve code coverage. In this paper, we present CuriousDroid, an automated system for exercising Android application user interfaces in an intelligent, user-like manner. CuriousDroid operates by decomposing application user interfaces on-the-fly and creating a context-based model for interactions that is tailored to the current user layout. We integrated CuriousDroid with Andrubis, a well-known Android sandbox, and conducted a large-scale evaluation of 38,872 applications taken from different data sets. Our evaluation demonstrates significant improvements in both end-to-end sample classification as well as increases in the raw number of elicited behaviors at runtime.
Conference Paper
Smartphone usage is tightly coupled with the use of apps that can be either free or paid. Numerous studies have investigated the tracking libraries associated with free apps. Only a limited number of these have focused on paid apps. As expected, these investigations indicate that tracking is happening to a lesser extent in paid apps, yet there is no conclusive evidence. This paper provides the first large-scale study of paid apps. We analyse top paid apps obtained from four different countries: Australia, Brazil, Germany, and US, and quantify the level of tracking taking place in paid apps in comparison to free apps. Our analysis shows that 60% of the paid apps are connected to trackers that collect personal information compared to 85%--95% in free apps. We further show that approximately 20% of the paid apps are connected to more than three trackers. With tracking being pervasive in both free and paid apps, we then quantify the aggregated privacy leakages associated with individual users. Using the data of user installed apps of over 300 smartphone users, we show that 50% of the users are exposed to more than 25 trackers which can result in significant leakages of privacy.
Conference Paper
Mobile applications are increasingly integrating third-party libraries to provide various features, such as advertising , analytics, social networking, and more. Unfortunately, such integration with third-party libraries comes with the cost of potential privacy violations of users, because Android always grants a full set of permissions to third-party libraries as their host applications. Unintended accesses to users' private data are underestimated threats to users' privacy, as complex and often obfuscated third-party libraries make it hard for application developers to estimate the correct behaviors of third-party libraries. More critically, a wide adoption of native code (JNI) and dynamic code executions such as Java reflection or dynamic code reloading, makes it even harder to apply state-of-the-art security analysis. In this work, we propose FLEXDROID, a new Android security model and isolation mechanism, that provides dynamic , fine-grained access control for third-party libraries. With FLEXDROID, application developers not only can gain a full control of third-party libraries (e.g., which permissions to grant or not), but also can specify how to make them behave after detecting a privacy violation (e.g., providing a mock user's information or kill). To achieve such goals, we define a new notion of principals for third-party libraries, and develop a novel security mechanism, called inter-process stack inspection that is effective to JNI as well as dynamic code execution. Our usability study shows that developers can easily adopt FLEXDROID's policy to their existing applications. Finally, our evaluation shows that FLEXDROID can effectively restrict the permissions of third-party libraries with negligible overheads.
Conference Paper
The vast majority of online services nowadays, provide both a mobile friendly website and a mobile application to their users. Both of these choices are usually released for free, with their developers, usually gaining revenue by allowing advertisements from ad networks to be embedded into their content. In order to provide more personalized and thus more effective advertisements, ad networks usually deploy pervasive user tracking, raising this way significant privacy concerns. As a consequence, the users do not have to think only their convenience before deciding which choice to use while accessing a service: web or app, but also which one harms their privacy the least. In this paper, we aim to respond to this question: which of the two options protects the users' privacy in the best way apps or browsers? To tackle this question, we study a broad range of privacy related leaks in a comparison of several popular apps and their web counterpart. These leaks may contain not only personally identifying information (PII) but also device-specific information, able to cross-application and cross-site track the user into the network, and allow third parties to link web with app sessions. Finally, we propose an anti-tracking mechanism that enable the users to access an online service through a mobile app without risking their privacy. Our evaluation shows that our approach is able to preserve the privacy of the user by reducing the leaking identifiers of apps by 27.41% on average, while it imposes a practically negligible latency of less than 1 millisecond per request.
Conference Paper
Third-party libraries on Android have been shown to be security and privacy hazards by adding security vulnerabilities to their host apps or by misusing inherited access rights. Correctly attributing improper app behavior either to app or library developer code or isolating library code from their host apps would be highly desirable to mitigate these problems, but is impeded by the absence of a third-party library detection that is effective and reliable in spite of obfuscated code. This paper proposes a library detection technique that is resilient against common code obfuscations and that is capable of pinpointing the exact library version used in apps. Libraries are detected with profiles from a comprehensive library database that we generated from the original library SDKs. We apply our technique to the top apps on Google Play and their complete histories to conduct a longitudinal study of library usage and evolution in apps. Our results particularly show that app developers only slowly adapt new library versions, exposing their end-users to large windows of vulnerability. For instance, we discovered that two long-known security vulnerabilities in popular libs are still present in the current top apps. Moreover, we find that misuse of cryptographic APIs in advertising libs, which increases the host apps' attack surface, affects 296 top apps with a cumulative install base of 3.7bn devices according to Play. To the best of our knowledge, our work is first to quantify the security impact of third-party libs on the Android ecosystem.
Conference Paper
Understanding the purpose of why sensitive data is used could help improve privacy as well as enable new kinds of access control. In this paper, we introduce a new technique for inferring the purpose of sensitive data usage in the context of Android smartphone apps. We extract multiple kinds of features from decompiled code, focusing on app-specific features and text-based features. These features are then used to train a machine learning classifier. We have evaluated our approach in the context of two sensitive permissions, namely ACCESS_FINE_LOCATION and READ_CONTACT_LIST, and achieved an accuracy of about 85% and 94% respectively in inferring purposes. We have also found that text-based features alone are highly effective in inferring purposes.
Conference Paper
Many popular, free online services provide cross-platform interfaces via Web browsers as well as apps on iOS and Android. To monetize these services, many additionally include tracking and advertising libraries that gather information about users with significant privacy implications. Given that the Web-based and mobile-app-based ecosystems evolve independently, an important open question is how these platforms compare with respect to user privacy. In this paper, we conduct the first head-to-head study of 50 popular, free online services to understand which is better for privacy---Web or app? We conduct manual tests, extract personally identifiable information (PII) shared over plaintext and encrypted connections, and analyze the data to understand differences in user-data collection across platforms for the same service. While we find that all platforms expose users' data, there are still opportunities to significantly limit how much information is shared with other parties by selectively using the app or Web version of a service.
Conference Paper
Mobile operating systems like Android failed to provide sufficient protection on personal data, and privacy leakage becomes a major concern. To understand the security risks and privacy leakage, analysts have to carry out data-flow analysis. In 2014, Android upgraded with a fundamentally new design known as Android RunTime (ART) environment in Android 5.0. ART adopts ahead-of-time compilation strategy and replaces previous virtual-machine-based Dalvik. Unfortunately, many data-flow analysis systems like TaintDroid were designed for the legacy Dalvik environment. This makes data-flow analysis of new apps and malware infeasible. We design a multi-level information-flow tracking system for the new Android system called TaintART. TaintART employs a multi-level taint analysis technique to minimize the taint tag storage. Therefore, taint tags can be stored in processor registers to provide efficient taint propagation operations. We also customize the ART compiler to maximize performance gains of the ahead-of-time compilation optimizations. Based on the general design of TaintART, we also implement a multi-level privacy enforcement to prevent sensitive data leakage. We demonstrate that TaintART only incurs less than 15% overheads on a CPU-bound microbenchmark and negligible overhead on built-in or third-party applications. Compared to legacy Dalvik environment in Android 4.4, TaintART achieves about 99.7% faster performance for Java runtime benchmark.
Conference Paper
Third-party libraries are widely used in Android application development. While they extend functionality, third-party libraries are likely to pose a threat to users. Firstly, third-party libraries enjoy the same permissions as the applications; therefore libraries are over-privileged. Secondly, third-party libraries and applications share the same internal file space, so that applications’ files are exposed to third-party libraries. To solve these problems, a considerable amount of effort has been made. Unfortunately, the requirement for a modified Android framework makes their methods impractical. In this paper, a developer-friendly tool called LibCage is proposed, to prohibit permission abuse of third-party libraries and protect user privacy without modifying the Android framework or libraries’ bytecode. At its core, LibCage builds a sandbox for each third-party library in order to ensure that each library is subject to a separate permission set assigned by developers. Moreover, each library is allocated an isolated file space and has no access to other space. Importantly, LibCage works on Java reflection as well as dynamic code execution, and can defeat several possible attacks. We test on real-world third-party libraries, and the results show that LibCage is capable of enforcing a flexible policy on third-party libraries at run time with a modest performance overhead.
Article
We present ARTist, a compiler-based application instrumentation solution for Android. ARTist is based on the new ART runtime and the on-device dex2oat compiler of Android, which replaced the interpreter-based managed runtime (DVM) from Android version 5 onwards. Since dex2oat is yet uncharted, our approach required first and foremost a thorough study of the compiler suite's internals and in particular of the new default compiler backend Optimizing. We document the results of this study in this paper to facilitate independent research on this topic and exemplify the viability of ARTist by realizing two use cases. Moreover, given that seminal works like TaintDroid hitherto depend on the now abandoned DVM, we conduct a case study on whether taint tracking can be re-instantiated using a compiler-based instrumentation framework. Overall, our results provide compelling arguments for preferring compiler-based instrumentation over alternative bytecode or binary rewriting approaches.
Conference Paper
It is well known that apps running on mobile devices extensively track and leak users' personally identifiable information (PII); however, these users have little visibility into PII leaked through the network traffic generated by their devices, and have poor control over how, when and where that traffic is sent and handled by third parties. In this paper, we present the design, implementation, and evaluation of ReCon: a cross-platform system that reveals PII leaks and gives users control over them without requiring any special privileges or custom OSes. ReCon leverages machine learning to reveal potential PII leaks by inspecting network traffic, and provides a visualization tool to empower users with the ability to control these leaks via blocking or substitution of PII. We evaluate ReCon's effectiveness with measurements from controlled experiments using leaks from the 100 most popular iOS, Android, and Windows Phone apps, and via an IRB-approved user study with 92 participants. We show that ReCon is accurate, efficient, and identifies a wider range of PII than previous approaches.
Conference Paper
Today's feature-rich smartphone apps intensively rely on access to highly sensitive (personal) data. This puts the user's privacy at risk of being violated by overly curious apps or libraries (like advertisements). Central app markets conceptually represent a first line of defense against such invasions of the user's privacy, but unfortunately we are still lacking full support for automatic analysis of apps' internal data flows and supporting analysts in statically assessing apps' behavior. In this paper we present a novel slice-optimization approach to leverage static analysis of Android applications. Building on top of precise application lifecycle models, we employ a slicing-based analysis to generate data-dependent statements for arbitrary points of interest in an application. As a result of our optimization, the produced slices are, on average, 49% smaller than standard slices, thus facilitating code understanding and result validation by security analysts. Moreover, by re-targeting strings, our approach enables automatic assessments for a larger number of use-cases than prior work. We consolidate our improvements on statically analyzing Android apps into a tool called R-Droid and conducted a large-scale data-leak analysis on a set of 22,700 Android apps from Google Play. R-Droid managed to identify a significantly larger set of potential privacy-violating information flows than previous work, including 2,157 sensitive flows of password-flagged UI widgets in 256 distinct apps.
Conference Paper
The packaging model of Android apps requires the entire code necessary for the execution of an app to be shipped into one single apk file. Thus, an analysis of Android apps often visits code which is not part of the functionality delivered by the app. Such code is often contributed by the common libraries which are used pervasively by all apps. Unfortunately, Android analyses, e.g., for piggybacking detection and malware detection, can produce inaccurate results if they do not take into account the case of library code, which constitute noise in app features. Despite some efforts on investigating Android libraries, the momentum of Android research has not yet produced a complete set of common libraries to further support in-depth analysis of Android apps. In this paper, we leverage a dataset of about 1.5 million apps from Google Play to harvest potential common libraries, including advertisement libraries. With several steps of refinements, we finally collect by far the largest set of 1,113 libraries supporting common functionality and 240 libraries for advertisement. We use the dataset to investigates several aspects of Android libraries, including their popularity and their proportion in Android app code. Based on these datasets, we have further performed several empirical investigations to confirm the motivations behind our work.
Conference Paper
More and more people rely on mobile devices to access the Internet, which also increases the amount of private information that can be gathered from people's devices. Although today's smartphone operating systems are trying to provide a secure environment, they fail to provide users with adequate control over and visibility into how third-party applications use their private data. Whereas there are a few tools that alert users when applications leak private information, these tools are often hard to use by the average user or have other problems. To address these problems, we present PrivacyGuard, an open-source VPN-based platform for intercepting the network traffic of applications. PrivacyGuard requires neither root permissions nor any knowledge about VPN technology from its users. PrivacyGuard does not significantly increase the trusted computing base since PrivacyGuard runs in its entirety on the local device and traffic is not routed through a remote VPN server. We implement PrivacyGuard on the Android platform by taking advantage of the VPNService class provided by the Android SDK. PrivacyGuard is configurable, extensible, and useful for many different purposes. We investigate its use for detecting the leakage of multiple types of sensitive data, such as a phone's IMEI number or location data. PrivacyGuard also supports modifying the leaked information and replacing it with crafted data for privacy protection. According to our experiments, PrivacyGuard can detect more leakage incidents by applications and advertisement libraries than TaintDroid. We also demonstrate that PrivacyGuard has reasonable overhead on network performance and almost no overhead on battery consumption.
Conference Paper
The proliferation of mobile apps is due in part to the advertising ecosystem which enables developers to earn revenue while providing free apps. Ad-supported apps can be developed rapidly with the availability of ad libraries. However, today?s ad libraries essentially have access to the same resources as the parent app, and this has caused signi?cant privacy concerns. In this paper, we explore ef?cient methods to de-escalate privileges for ad libraries where the resource access privileges for ad libraries can be different from that of the app logic. Our system, PEDAL, contains a novel machine classi?er for detecting ad libraries even in the presence of obfuscated code, and techniques for automatically instrumenting bytecode to effect privilege de-escalation even in the presence of privilege inheritance. We evaluate PEDAL on a large set of apps from the Google Play store and demonstrate that it has a 98% accuracy in detecting ad libraries and imposes less than 1% runtime overhead on apps.
Conference Paper
Advertising is the primary source of revenue for many mobile apps. One important goal of the ad delivery process is targeting users, based on criteria like users' geolocation, context, demographics, long-term behavior, etc. In this paper we report an in-depth study that broadly characterizes what targeting information mobile apps send to ad networks and how effectively, if at all, ad networks utilize the information for targeting users. Our study is based on a novel tool, called MadScope, that can (1) quickly harvest ads from a large collection of apps, (2) systematically probe an ad network to characterize its targeting mechanism, and (3) emulate user profiles of specific preferences and interests to study behavioral targeting. Our analysis of 500K ad requests from 150K Android apps and 101 ad networks indicates that apps do not yet exploit the full potential of targeting: even though ad controls provide APIs to send a lot of information to ad networks, much key targeting information is optional and is often not provided by app developers. We also use MadScope to systematically probe top 10 in-app ad networks to harvest over 1 million ads and find that while targeting is used by many of the top networks, there remain many instances where targeting information or behavioral profile does not have a statistically significant impact on how ads are chosen. We also contrast our findings with a recent study of targeted in-browser ads.
Article
Prior works have shown that the list of apps installed by a user reveal a lot about user interests and behavior. These works rely on the semantics of the installed apps and show that various user traits could be learnt automatically using off-the-shelf machine-learning techniques. In this work, we focus on the re-identifiability issue and thoroughly study the unicity of smartphone apps on a dataset containing 54,893 Android users collected over a period of 7 months. Our study finds that any 4 apps installed by a user are enough (more than 95% times) for the re-identification of the user in our dataset. As the complete list of installed apps is unique for 99% of the users in our dataset, it can be easily used to track/profile the users by a service such as Twitter that has access to the whole list of installed apps of users. As our analyzed dataset is small as compared to the total population of Android users, we also study how unicity would vary with larger datasets. This work emphasizes the need of better privacy guards against collection, use and release of the list of installed apps.
Article
Today's smartphones are a ubiquitous source of private and confidential data. At the same time, smartphone users are plagued by carelessly programmed apps that leak important data by accident, and by malicious apps that exploit their given privileges to copy such data intentionally. While existing static taint-analysis approaches have the potential of detecting such data leaks ahead of time, all approaches for Android use a number of coarse-grain approximations that can yield high numbers of missed leaks and false alarms. In this work we thus present FlowDroid, a novel and highly precise static taint analysis for Android applications. A precise model of Android's lifecycle allows the analysis to properly handle callbacks invoked by the Android framework, while context, flow, field and object-sensitivity allows the analysis to reduce the number of false alarms. Novel on-demand algorithms help FlowDroid maintain high efficiency and precision at the same time. We also propose DroidBench, an open test suite for evaluating the effectiveness and accuracy of taint-analysis tools specifically for Android apps. As we show through a set of experiments using SecuriBench Micro, DroidBench, and a set of well-known Android test applications, FlowDroid finds a very high fraction of data leaks while keeping the rate of false positives low. On DroidBench, FlowDroid achieves 93% recall and 86% precision, greatly outperforming the commercial tools IBM AppScan Source and Fortify SCA. FlowDroid successfully finds leaks in a subset of 500 apps from Google Play and about 1,000 malware apps from the VirusShare project.
Article
Recent years have witnessed incredible growth in the popularity and prevalence of smart phones. A flourishing mobile application market has evolved to provide users with additional functionality such as interacting with social networks, games, and more. Mobile applications may have a direct purchasing cost or be free but ad-supported. Unlike in-browser ads, the privacy im-plications of ads in Android applications has not been thoroughly explored. We start by comparing the similarities and differences of in-browser ads and in-app ads. We examine the effect on user privacy of thirteen popular Android ad providers by reviewing their use of permissions. Worryingly, several ad libraries checked for permissions beyond the required and optional ones listed in their documentation, including dangerous permissions like CAMERA, WRITE CALENDAR and WRITE CONTACTS. Further, we discover the insecure use of Android's JavaScript extension mechanism in several ad libraries. We identify fields in ad requests for private user information and confirm their presence in network data obtained from a tier-1 network provider. We also show that users can be tracked by a network sniffer across ad providers and by an ad provider across applications. Finally, we discuss several possible solutions to the privacy issues identified above.
Article
Android applications often include third-party libraries written in native code. However, current native components are not well managed by Android's security architecture. We present NativeGuard, a security framework that isolates native libraries from other components in Android applications. Leveraging the process-based protection in Android, NativeGuard isolates native libraries of an Android application into a second application where unnecessary privileges are eliminated. NativeGuard requires neither modifications to Android nor access to the source code of an application. It addresses multiple technical issues to support various interfaces that Android provides to the native world. Experimental results demonstrate that our framework works well with a set of real-world applications, and incurs only modest overhead on benchmark programs.
Article
Mobile app ecosystems have experienced tremendous growth in the last six years. This has triggered research on dynamic analysis of performance, security, and correctness properties of the mobile apps in the ecosystem. Exploration of app execution using automated UI actions has emerged as an important tool for this research. However, existing research has largely developed analysis-specific UI automation techniques, wherein the logic for exploring app execution is intertwined with the logic for analyzing app properties. PUMA is a programmable framework that separates these two concerns. It contains a generic UI automation capability (often called a Monkey) that exposes high-level events for which users can define handlers. These handlers can flexibly direct the Monkey's exploration, and also specify app instrumentation for collecting dynamic state information or for triggering changes in the environment during app execution. Targeted towards operators of app marketplaces, PUMA incorporates mechanisms for scaling dynamic analysis to thousands of apps. We demonstrate the capabilities of PUMA by analyzing seven distinct performance, security, and correctness properties for 3,600 apps downloaded from the Google Play store.
Article
Today's smartphones are a ubiquitous source of private and confidential data. At the same time, smartphone users are plagued by carelessly programmed apps that leak important data by accident, and by malicious apps that exploit their given privileges to copy such data intentionally. While existing static taint-analysis approaches have the potential of detecting such data leaks ahead of time, all approaches for Android use a number of coarse-grain approximations that can yield high numbers of missed leaks and false alarms. In this work we thus present FlowDroid, a novel and highly precise static taint analysis for Android applications. A precise model of Android's lifecycle allows the analysis to properly handle callbacks invoked by the Android framework, while context, flow, field and object-sensitivity allows the analysis to reduce the number of false alarms. Novel on-demand algorithms help FlowDroid maintain high efficiency and precision at the same time. We also propose DroidBench, an open test suite for evaluating the effectiveness and accuracy of taint-analysis tools specifically for Android apps. As we show through a set of experiments using SecuriBench Micro, DroidBench, and a set of well-known Android test applications, FlowDroid finds a very high fraction of data leaks while keeping the rate of false positives low. On DroidBench, FlowDroid achieves 93% recall and 86% precision, greatly outperforming the commercial tools IBM AppScan Source and Fortify SCA. FlowDroid successfully finds leaks in a subset of 500 apps from Google Play and about 1,000 malware apps from the VirusShare project.
Conference Paper
We present a system Dynodroid for generating relevant inputs to unmodified Android apps. Dynodroid views an app as an event-driven program that interacts with its environment by means of a sequence of events through the Android framework. By instrumenting the framework once and for all, Dynodroid monitors the reaction of an app upon each event in a lightweight manner, using it to guide the generation of the next event to the app. Dynodroid also allows interleaving events from machines, which are better at generating a large number of simple inputs, with events from humans, who are better at providing intelligent inputs. We evaluated Dynodroid on 50 open-source Android apps, and compared it with two prevalent approaches: users manually exercising apps, and Monkey, a popular fuzzing tool. Dynodroid, humans, and Monkey covered 55%, 60%, and 53%, respectively, of each app's Java source code on average. Monkey took 20X more events on average than Dynodroid. Dynodroid also found 9 bugs in 7 of the 50 apps, and 6 bugs in 5 of the top 1,000 free apps on Google Play.