All content in this area was uploaded by David Syman on Nov 11, 2021
Mobile Apps Analysis: A Hybrid Approach using Dynamic Syntax Tree
Elina Petrovna
Department of Software Engineering
BitBrainery University, London - UK
Ruth Goldberg, David Syman
Security Reviewer Srl, Grosseto - Italy
[2015- 15th of October]
Abstract – Mobile devices are spreading rapidly in response to
users' needs, and they pose a potential risk to personal and
professional privacy because of widespread malware and
growing security issues and threats. With frequent
application releases and updates, conducting a
complete security analysis of mobile Apps becomes crucial.
In this paper, we present a different approach to ensuring the
security of these devices, starting from App
development. We examine the two major platforms in the
mobile space, iOS and Android, and for each we provide a
thorough investigation of existing and historical security
features, evidence-based discussion of known security
bypass techniques, and concrete recommendations for
remediation.
Keywords - Mobile analysis, Sandboxing, Hybrid analysis, code
analysis, SAST, MAST, Vulnerability prediction
I INTRODUCTION
Most Mobile App analysis techniques address only the
following three aspects:
APP PERMISSIONS
The top five permissions [4] that Apps request are:
• Storage
• Photos
• Camera
• Location
• Microphone
An App is vulnerable if it requests more permissions than it
really requires, or if it uses such permissions
improperly [2].
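The over-request check described above can be sketched as a set difference between the permissions an App requests and those it actually needs. This is a minimal illustration only: the function name and the sample permission names are hypothetical, and deciding which permissions are "really required" is assumed to happen elsewhere.

```python
def excess_permissions(requested, required):
    """Return the permissions an App requests but does not need, sorted by name."""
    return sorted(set(requested) - set(required))


# Hypothetical sample data, not taken from a real App.
requested = ["CAMERA", "LOCATION", "MICROPHONE", "STORAGE"]
required = ["CAMERA", "STORAGE"]
print(excess_permissions(requested, required))
```

A non-empty result flags the App for review; it does not by itself prove misuse.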
VULNERABILITIES
A vulnerability comes from either the application’s source
code [3] or the external libraries it uses, in most cases
older versions of open-source libraries. Hundreds of code
vulnerabilities are referenced by the National Vulnerability
Database (NVD), the OWASP Mobile Security Project and US-CERT.
UNEXPECTED BEHAVIORS
A mobile application can perform unwanted actions
because of the external libraries adopted, compromised by
malware [1] [5], or because of a development mistake.
Therefore, development techniques based on current
programming languages cannot ensure security if they rely
only on the languages’ security constructs or on external
libraries, because these do not provide the complete set of
information needed for analysis.
The main contribution of this paper is a description of an
automated, all-in-one, mobile application (Android and iOS)
penetration testing, malware analysis and security
assessment framework capable of performing and
correlating static and dynamic analyses. This framework,
hereby named Hybrid Analyzer, supports mobile app
binaries (APK, XAPK, IPA) along with zipped source code and
provides REST APIs for seamless integration with CI/CD
or DevOps pipelines. The Hybrid Analyzer also supports
runtime security assessment and interactive
instrumented testing.
II HYBRID ANALYSIS TECHNIQUES
Beyond the Mobile App analysis techniques listed above,
the Hybrid Analyzer also covers the following:
• Android Hardcoded Secrets- Threats mitigated by Android:
rooted phone, hard disk forensics, ADB backup, debuggable
application allowing run-as. Threats not mitigated by
Android: social engineering, bugs in the application
allowing data exfiltration, bugs in the application
allowing arbitrary command execution.
• Apple iOS Key Security Flaws- Limited benefit of
encryption for powered-on devices, Evidence of past
hardware (SEP) compromise, Limitations of “end-to-end
encrypted” cloud services.
• iOS IPA and APK/XAPK binary analysis- .ipa and .apk files
are zip archives that include the application
executable along with other resources. In most cases they
are not fully encrypted and contain images, web
pages, db files, configuration files and even zipped
source code.
• Image Interpreting- detects Async Inputs, Application
Status Changes, Lost Connections, Unhandled Errors by
interpreting the images dynamically
• Decompilation- Decrypts and decompiles binaries to
obtain readable source code in Java/Dalvik or
Objective-C/opcode.
• Static Analysis- Static Analysis of decrypted code,
configuration files, db files
• Protocol testing- Finds references to security protocols
in the code and executes Dynamic Analysis of TLS/SSL
security, SSL Pinning Bypass, LGTM issues, and exposed
REST APIs, including their TLS/SSL configuration.
• Intent/app extension dumper- Through auxiliary Frida
script.
• Correlation- between Static Analysis and Dynamic
Analysis.
In the following sections we briefly describe the hybrid
analysis techniques listed above. This is necessary for a better
understanding of the proposed new hybrid approach, based
on the Dynamic Syntax Tree [6], which is the real scope of
this publication.
Android Hardcoded Secrets
Google’s Android operating system, and many third-party
phones that use Android, incorporate several security
features that are analogous to those provided by Apple
devices. Unlike Apple, Google does not fully control the
hardware and software stack on all Android-compatible
smartphones: some Google Android phones are
manufactured entirely by Google, while other devices are
manufactured by third parties. Moreover, device
manufacturers routinely modify the Android operating
system prior to deployment. To determine the level of
security currently provided by these Android devices against
sophisticated attackers, we considered the full scope of
Google’s public documentation, postings from mobile
forensics companies, and other documents in the public
record. Our main findings are as follows:
Limited benefit of encryption for powered-on devices. Like
Apple iOS, Google Android provides encryption for files and
data stored on disk. However, Android’s encryption
mechanisms provide fewer gradations of protection. Android
provides no equivalent of Apple’s Complete Protection (CP)
encryption class, which evicts decryption keys from memory
shortly after the phone is locked. Therefore, Android
decryption keys always remain in memory after “first unlock,”
and user data is potentially vulnerable to forensic capture.
De-prioritization of end-to-end encrypted backup. Android
incorporates an end-to-end encrypted backup service based
on physical hardware devices stored in Google’s datacenters.
The design of this system ensures that recovery of backups
can only occur if initiated by a user who knows the backup
passcode, an on-device key protected by the user’s PIN or
other authentication factor. Unfortunately, the end-to-end
encrypted backup service must be opted-in to by app
developers and is paralleled by the opt-out Android Auto-
Backup, which simply synchronizes app data to Google Drive,
encrypted with keys held by Google.
Large attack surface. Android is the composition of systems
developed by various organizations and companies. The
Android kernel has Linux at its core, but also contains chip
vendor- and device manufacturer-specific modifications. Apps,
along with support libraries, integrate with system
components and provide their own services to the rest of the
device. Because the development of these components is not
centralized, cohesively integrating security for all of Android
would require significant coordination, and in many cases
such efforts are lacking or nonexistent.
Limited use of end-to-end encryption. End-to-end encryption
for messages in Android is only provided by default in third-
party messaging applications. Native Android applications do
not provide end-to-end encryption: the only exception being
Google Duo, which provides end-to-end encrypted video calls.
The current lack of default end-to-end encryption for
messages allows the service provider (for example, Google) to
view messages and logs, potentially putting user data at risk
from hacking, unwanted targeted advertising, subpoena, and
surveillance systems.
Availability of data in services. Android has deep integration
with Google services, such as Drive, Gmail, and Photos.
Android phones that utilize these services (most of them)
send data to Google, which stores the data under keys it
controls - effectively an extension of the lack of end-to-end
encryption beyond just messaging services. These services
accumulate rich sets of information on users that can be
exfiltrated either by knowledgeable criminals (via system
compromise) or by law enforcement (via subpoena power).
Android also makes heavy use of SELinux to further sandbox
applications, including its own services so that compromising
them doesn’t necessarily mean full root access. In some cases,
these protections are already strong enough to require no
extra encryption.
Rooted Phone. The case of a rooted phone is straightforward:
Android does not allow a regular user to root their phone
easily. Given this, we know that a user with a rooted phone
has either done so willingly or has fallen victim to
sophisticated malware that exploits Android itself. In the first
case we have two options: employ rooting detection and
change the application's behavior on a rooted phone, or
accept the risk that a user who rooted their own phone
is more vulnerable. The second case requires a more
sophisticated attacker, so we need to evaluate the
probability that our application will be a target for this
attacker. This case is particularly difficult to deal with,
but Android can still potentially protect some application
secrets even in the case of rooting.
Hard Disk Forensics would be an issue if a user’s phone were
to fall into the hands of someone who knows how to image
and read a hard disk. This isn’t as hard as it sounds, so it is a
worthwhile method to consider. Luckily, Android provides
file-based or full-disk encryption to help here.
Social Engineering is always a possibility for stealing a user’s
secrets and encryption doesn’t necessarily help here. While it
would help in the case where the attacker is trying to convince
someone to install some malware on their phone that targets
your application, that is also mitigated through general
Android security practices. If this is our only reason for
encrypting application data and we think it is valid, then it
might be better to tackle this problem at a different level.
ADB Backup is a well-publicized issue in which users with
Android debugging enabled could inadvertently allow
application data to be backed up through a malicious USB
port or by an attacker with physical access to their
phone. In the former case, the user plugs the phone into
a USB port that “looks safe” but
attempts to use the adb backup command. Neither attack
is particularly easy to pull off anymore, since the phone
needs to be unlocked and user interaction is required
to start the backup. This can also be mitigated through other
means, particularly a configuration option in the
AndroidManifest.xml file.
run-as. Applications that are debuggable allow anyone who
can get a shell on the phone to use the run-as {application}
command to access the filesystem as if they were the
application itself. Android warns you when applications are
hard coded to be debuggable, but the mitigation here is
simple: don’t deliver a debuggable application.
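The two manifest-level mitigations discussed above (for ADB backup and run-as) lend themselves to a simple automated check. The sketch below assumes that the configuration option in question is android:allowBackup, and that the manifest has already been decoded to plain-text XML (inside an APK it is stored in a binary form). The function name and sample manifest are hypothetical.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"


def risky_manifest_flags(manifest_xml: str) -> list[str]:
    """Report backup/debug flags that are enabled on the <application> element."""
    root = ET.fromstring(manifest_xml)
    app = root.find("application")
    if app is None:
        return []
    findings = []
    # android:allowBackup is treated as true when the attribute is absent.
    if app.get(f"{{{ANDROID_NS}}}allowBackup", "true") == "true":
        findings.append("allowBackup")
    # android:debuggable is treated as false when the attribute is absent.
    if app.get(f"{{{ANDROID_NS}}}debuggable", "false") == "true":
        findings.append("debuggable")
    return findings


# Hypothetical decoded manifest used only for illustration.
SAMPLE_MANIFEST = """<manifest
    xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.app">
  <application android:allowBackup="false" android:debuggable="true"/>
</manifest>"""

print(risky_manifest_flags(SAMPLE_MANIFEST))
```

Treating a missing allowBackup attribute as enabled is a deliberately conservative choice for a security check.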
Application Bugs are an interesting vector to consider. The
risk here is that data will be exfiltrated from the application
due to an issue with application logic. Exposed IPC services –
things such as exported Services, Broadcast Receivers, and
Content Providers – would be a good place to look for these
types of issues. These are some of the things that make
Android so versatile but are also a major target for malware.
If we have exported IPC services, we can bet that is the first
place an attacker will look for an entry point into our
application.
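Exposed IPC components of this kind can be listed directly from a decoded manifest. The following is a minimal sketch under stated assumptions: the manifest is available as plain-text XML, and a component that declares an intent filter without an explicit android:exported attribute is treated as exported (the default has varied across Android versions, so this is a conservative simplification). The function name and sample manifest are hypothetical.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"
IPC_TAGS = ("service", "receiver", "provider")


def exported_components(manifest_xml: str) -> list[tuple[str, str]]:
    """Return (tag, name) pairs for IPC components reachable by other apps."""
    root = ET.fromstring(manifest_xml)
    found = []
    for tag in IPC_TAGS:
        for node in root.iter(tag):
            exported = node.get(f"{{{ANDROID_NS}}}exported")
            has_filter = node.find("intent-filter") is not None
            # Assumption: no explicit flag plus an intent filter means exported.
            if exported == "true" or (exported is None and has_filter):
                found.append((tag, node.get(f"{{{ANDROID_NS}}}name", "?")))
    return found


# Hypothetical decoded manifest used only for illustration.
SAMPLE_MANIFEST = """<manifest
    xmlns:android="http://schemas.android.com/apk/res/android">
  <application>
    <service android:name=".SyncService" android:exported="true"/>
    <receiver android:name=".BootReceiver">
      <intent-filter><action android:name="BOOT"/></intent-filter>
    </receiver>
    <provider android:name=".NotesProvider" android:exported="false"/>
  </application>
</manifest>"""

print(exported_components(SAMPLE_MANIFEST))
```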
Code Execution Vulnerabilities in the application itself could
be game over for any application data without a sufficiently
robust encryption mechanism. We could consider attempting
to mitigate the damage this causes to application data
integrity and secrecy, but it is difficult in general.
Apple iOS Key Security Flaws
Apple is also noteworthy for three reasons: (1) the company
has overall control of both the hardware and operating
system software deployed on its devices, (2) the company’s
business model closely restricts which software can be
installed on the device, and (3) Apple management has, in the
past, expressed vocal opposition to making technical changes
in their devices that would facilitate law enforcement access.
To determine the level of security currently provided by
Apple against sophisticated attackers, we considered the full
scope of Apple’s public documentation. Our main findings are
as follows:
Limited benefit of encryption for powered-on devices. Apple
advertises the broad use of encryption to protect user data
stored on-device. However, we observed that a surprising
amount of sensitive data maintained by built-in applications
is protected using a weak “available after first unlock” (AFU)
protection class, which does not evict decryption keys from
memory when the phone is locked. The impact is that most
sensitive user data from Apple’s built-in applications can be
accessed from a phone that is captured and logically
exploited while it is in a powered-on (but locked) state. We
also found circumstantial evidence from a 2014 update to
Apple’s documentation that the company has, in the past,
reduced the protection class assurances regarding certain
system data, to unknown effect. Finally, we found
circumstantial evidence in both the DHS procedures and
investigative documents that law enforcement now routinely
exploits the availability of decryption keys to capture large
amounts of sensitive data from locked phones. Documents
acquired by Upturn, a privacy advocacy organization, support
these conclusions, documenting law enforcement records of
passcode recovery against both powered-off and simply
locked iPhones of all generations.
Weaknesses of cloud backup and services. Apple’s iCloud
service provides cloud-based device backup and real-time
synchronization features. By default, this includes photos,
email, contacts, calendars, reminders, notes, text messages
(iMessage and SMS/MMS), Safari data (bookmarks, search
and browsing history), Apple Home data, Game Center data,
and cloud storage for installed apps. We examine the current
state of data protection for iCloud and determine
(unsurprisingly) that activation of these features transmits an
abundance of user data to Apple’s servers, in a form that can
be accessed remotely by criminals who gain unauthorized
access to a user’s cloud account, as well as authorized law
enforcement agencies with subpoena power. More
surprisingly, we identify several counter-intuitive features of
iCloud that increase the vulnerability of this system. As one
example, Apple’s “Messages in iCloud” feature advertises the
use of an Apple-inaccessible “end-to-end” encrypted
container for synchronizing messages across devices.
However, activation of iCloud Backup in tandem causes the
decryption key for this container to be uploaded to Apple’s
servers in a form that Apple (and potential attackers, or law
enforcement) can access. Similarly, we observe that Apple’s
iCloud Backup design results in the transmission of device-
specific file encryption keys to Apple. Since these keys are the
same keys used to encrypt data on the device, this
transmission may pose a risk if a device is subsequently
physically compromised. More generally, we find that the
documentation and user interface of these backup and
synchronization features are confusing and may lead to users
unintentionally transmitting certain classes of data to Apple’s
servers.
Evidence of past hardware (SEP) compromise. iOS devices
place strict limits on passcode guessing attacks through the
assistance of a dedicated processor known as the Secure
Enclave processor (SEP). We examined the public
investigative record to review evidence that strongly
indicates that as of 2018, passcode guessing attacks were
feasible on SEP-enabled iPhones using special tools. To our
knowledge, this most likely indicates that a software bypass
of the SEP was available in-the-wild during this timeframe.
We also reviewed more recent public evidence and were not
able to find dispositive evidence that this exploit is still in use
for more recent phones (or whether exploits still exist for
older iPhones). Given how critical the SEP is to the ongoing
security of the iPhone product line, we flag this uncertainty
as a serious risk to consumers.
Limitations of “end-to-end encrypted” cloud services. Several
Apple iCloud services advertise “end-to-end” encryption in
which only the user (with knowledge of a password or
passcode) can access cloud-stored data. These services are
optionally provided in Apple’s CloudKit containers and via the
iCloud Keychain backup service. Implementation of this
feature is accomplished via the use of dedicated Hardware
Security Modules (HSMs) provisioned at Apple’s data centers.
These devices store encryption keys in a form that can only
be accessed by a user and are programmed by Apple such
that cloud service operators cannot transfer information out
of an HSM without user permission.
As noted above, our finding is that the end-to-end
confidentiality of some encrypted services is undermined
when used in tandem with the iCloud backup service. More
critically, we observe that Apple’s documentation and user
settings blur the distinction between “encrypted” (such that
Apple has access) and “end-to-end encrypted” in a manner
that makes it difficult to understand which data is available
to Apple. Finally, we observe a fundamental weakness in the
system: Apple can easily cause user data to be re-provisioned
to a new (and possibly compromised) HSM simply by
presenting a single dialog on a user’s phone. We discuss
techniques for mitigating this vulnerability.
Based on these findings, our overall conclusion is that data
for iOS devices is highly available to both sophisticated
criminals and law enforcement actors with either cloud or
physical access. This is due to a combination of the weak
protections offered by current Apple iCloud services, and
weak defaults used for encrypting sensitive user data on-
device. The impact of these choices is that Apple’s data
protection is fragile: once certain software or cloud
authentication features are breached, attackers can access
most sensitive user data on device. Later in this work we
propose improvements aimed at improving the resilience of
Apple’s security measures.
iOS IPA and APK/XAPK binary analysis
App binary files are zip files from which a lot of sensitive data
can be extracted.
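Because these packages are ordinary zip archives, a first inventory of their contents needs nothing more than a zip reader. The sketch below counts archive entries per category; the extension-to-category map is an illustrative assumption, not the Hybrid Analyzer's actual rule set, and the demo package is built in memory rather than taken from a real App.

```python
import io
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# Assumption: illustrative mapping only, far from exhaustive.
CATEGORIES = {
    ".png": "image", ".jpg": "image",
    ".html": "web page", ".js": "web page",
    ".db": "database", ".sqlite": "database",
    ".xml": "configuration", ".plist": "configuration",
    ".zip": "nested archive",
}


def inventory(package) -> Counter:
    """Count entries per category; `package` is a path or a binary file object."""
    counts = Counter()
    with zipfile.ZipFile(package) as archive:
        for name in archive.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            suffix = PurePosixPath(name).suffix.lower()
            counts[CATEGORIES.get(suffix, "other")] += 1
    return counts


def build_demo_package() -> io.BytesIO:
    """Create a tiny in-memory archive that mimics an App package layout."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in ("classes.dex", "res/icon.png", "assets/index.html",
                     "assets/cache.db", "assets/src.zip"):
            archive.writestr(name, b"")
    buffer.seek(0)
    return buffer


print(inventory(build_demo_package()))
```

Entries classified as databases, configuration files, or nested archives are natural candidates for the Static Analysis step.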
List of Targets for NIST Mobile Device Acquisition Forensics
• Cellular network subscriber information: IMEI, MEID/ESN
• Personal Information Management (PIM) data: address
book/contacts, calendar, memos, etc
• Call logs: incoming, outgoing, missed
• Text messages: SMS, MMS (audio, graphic, video)
• Instant messages
• Stand-alone files: audio, documents, graphic, video
• E-mail
• Web activity: history, bookmarks
• GPS and geo-location data
• Social media data: accounts, content
• SIM/UICC data: provider, IMSI, MSISDN, etc.
In accordance with NIST standards, DHS tests forensic
software for mobile device acquisition of the above categories
of data [Source: NIST].
Furthermore, the Images extracted during Dynamic Analysis can
be an important source of sensitive information (see the
relevant section below).
Decompilation
App binaries are easy to decompile or reverse engineer to
Android Dalvik or iOS opcode. The resulting source code
cannot be recompiled, but running a Static Analysis on it can
detect many vulnerabilities that really exist in the App.
The human-readable Dalvik or opcode is a great aid for
understanding executable code. For example, all iOS opcode
selector names reside in the __objc_methname section of the
__TEXT segment. Reverse engineering is a helpful approach
that can be used for investigating and analyzing software code
to research malware, fix software issues, ensure software
compatibility, simplify support for undocumented legacy
code, etc. To reverse engineer a piece of software, you need
to know the basic binary executable structure and have a set
of tools for browsing and disassembling executables.
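As an illustration of the selector-name example above: the __objc_methname section holds NUL-terminated C strings, so once its raw bytes are available the names can be recovered by splitting on the NUL byte. The sketch assumes the section contents have already been extracted by a Mach-O parser (not shown); the function name and sample bytes are hypothetical.

```python
def selector_names(section_bytes: bytes) -> list[str]:
    """Split the raw contents of a C-string section into its entries."""
    return [
        chunk.decode("utf-8", errors="replace")
        for chunk in section_bytes.split(b"\x00")
        if chunk  # drop empty pieces produced by padding or the final NUL
    ]


# Hypothetical section contents used only for illustration.
raw = b"viewDidLoad\x00setPassword:\x00application:didFinishLaunchingWithOptions:\x00"
for name in selector_names(raw):
    print(name)
```

Selector names such as setPassword: give an analyst quick hints about where sensitive logic lives in the binary.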
The Hybrid Analyzer will apply:
• Reversing open-source code
• Getting an executable to reverse engineer
• Reversing emulator binaries
• Finding the cause of application-specific issues
• Reverse engineering using private or internal functionality
• Communicating with a daemon
Static Analysis
Static Analysis performs source code-based analysis without
running the application, so that it does not depend on the
runtime environment. It can therefore be used in step with
application development. Static testing is most effective
when carried out regularly at predetermined intervals, so that
every update or release of the App has already been tested
without having to run the application.
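A static check of this kind can be as simple as scanning decompiled source or configuration text for hardcoded secrets, without ever running the App. The sketch below is a minimal example; the three patterns are illustrative assumptions, and a real rule set would be considerably larger and more precise.

```python
import re

# Assumption: illustrative patterns only.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    "hardcoded_password": re.compile(r'(?i)\bpassword\s*=\s*"[^"]+"'),
}


def scan_source(text: str) -> list[tuple[int, str]]:
    """Return (line number, rule name) for every pattern hit in `text`."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, rule))
    return findings


# Hypothetical decompiled snippet used only for illustration.
SAMPLE_SOURCE = (
    'String user = "demo";\n'
    'String password = "hunter2";\n'
    'String key = "AKIAABCDEFGHIJKLMNOP";\n'
)

print(scan_source(SAMPLE_SOURCE))
```

Because the scan needs only text, it can run on every commit or release as part of a CI/CD pipeline.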
Protocol testing
Hybrid Analyzer introduces SSL/TLS traffic decryption on the
fly. A traditional packet-capture tool is of little use when
you need to inspect the contents of an encrypted session,
whether the App is downloading a webpage or contacting an
unknown server for an obscure reason. Today, virtually all
network traffic is encrypted. Hybrid Analyzer intercepts
SSL and TLS traffic and displays the contents as if you were
capturing an unencrypted TCP session.
Intent/app extension dumper
During Dynamic Analysis, Intents (Android) or App Extensions
(iOS) are detected using Frida or Needle, without the need for a
jailbroken or rooted mobile device. The dumper is able to detect
data storage usage, inter-process communication, network
communications, hooking and binary protections.
Correlation
The results of the Static and Dynamic analyses are combined
and correlated using the Dynamic Syntax Tree. This makes it
possible to identify which vulnerabilities are truly
exploitable and should be at the top of the remediation list.
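The prioritization idea can be illustrated with a plain join between the two result sets: a static finding that was also observed at runtime is ranked ahead of one that was not. This sketch does not use the Dynamic Syntax Tree itself; the Finding type, its location key, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    location: str  # hypothetical key, e.g. "Class.method"
    issue: str


def correlate(static_findings, dynamic_findings):
    """Split static findings into those confirmed at runtime and the rest."""
    observed = {(f.location, f.issue) for f in dynamic_findings}
    confirmed = [f for f in static_findings if (f.location, f.issue) in observed]
    unconfirmed = [f for f in static_findings if (f.location, f.issue) not in observed]
    return confirmed, unconfirmed


# Hypothetical findings used only for illustration.
static_results = [
    Finding("LoginActivity.onCreate", "cleartext-http"),
    Finding("Crypto.encrypt", "weak-cipher"),
]
dynamic_results = [Finding("LoginActivity.onCreate", "cleartext-http")]

confirmed, unconfirmed = correlate(static_results, dynamic_results)
print("fix first:", confirmed)
print("review later:", unconfirmed)
```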
III DYNAMIC SYNTAX TREE
Mapping constructs from a dynamic language into a
Dynamic Syntax Tree (DST) is a kind of semantic analysis.
Information implied by the syntax is analyzed and the
results are inserted back into the same tree, in the form
of complete static information. This preprocessing
enables us to work with the syntax tree of dynamic code
as if it were static code, with some limitations: certain
constructs are not resolvable until runtime in dynamic
languages. For that reason, we also provide binary analysis.
Binaries are sandboxed to collect dynamic information at
runtime, using a very fast algorithm that will be discussed
in a future paper. Combining source code and binary analysis
overcomes the above-mentioned limitations.
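The general idea of annotating a syntax tree with static information and falling back to runtime observations can be sketched with Python's standard ast module. This is an illustration of the concept only, not the DST algorithm of [6]: the function name is hypothetical, and the runtime_types dictionary is a stand-in for whatever a sandboxed run would actually record.

```python
import ast


def annotate(source: str, runtime_types: dict[str, str]) -> dict[str, str]:
    """Map each simply-assigned variable to a type name.

    Literal assignments are resolved statically from the syntax tree; anything
    else is looked up in `runtime_types` (hypothetical sandbox observations).
    """
    resolved = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign) and isinstance(node.targets[0], ast.Name):
            name = node.targets[0].id
            if isinstance(node.value, ast.Constant):
                inferred = type(node.value.value).__name__
            else:
                inferred = runtime_types.get(name, "unresolved")
            # Write the result back into the same tree node.
            node.inferred_type = inferred
            resolved[name] = inferred
    return resolved


# Hypothetical dynamic-language snippet used only for illustration.
CODE = (
    "port = 8080\n"
    "host = 'example.org'\n"
    "handler = make_handler(port)\n"
    "token = load()\n"
)

print(annotate(CODE, {"handler": "RequestHandler"}))
```

In the sample, handler cannot be typed from syntax alone and is resolved from the runtime record, while token stays unresolved because no runtime observation covers it.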
Future Work
The Hybrid Analyzer sandboxing feature will be used in an
EU-funded project, the MOBILE Lab - PANOPTESEC project led by
BitBrainery University, together with NVISO Labs and
Security Reviewer Srl.
ACKNOWLEDGMENTS
REFERENCES
[1] Olalere, M., et al., A review of bring your own
device on security issues. Sage Open, 2015. 5(2):
p. 2158244015580372.
https://doi.org/10.1177/2158244015580372
[2] Altuwaijri, H. and S. Ghouzali, Android data storage security: A review.
Journal of King Saud University-Computer and Information Sciences,
2020. 32(5): p. 543–552. https://doi.org/10.1016/j.jksuci.2018.07.004
[3] Wu, L., X. Du, and X. Fu, Security threats to mobile multimedia
applications: Camera-based attacks on mobile phones. IEEE
Communications Magazine, 2014. 52(3): p. 80–87.
https://doi.org/10.1109/MCOM.2014.6766089
[4] Alepis, E. and C. Patsakis, Unravelling security issues of runtime
permissions in android. Journal of Hardware and Systems Security,
2019. 3(1): p. 45–63. https://doi.org/10.1007/s41635-018-0053-2
[5] Doğru, İ.A. and Ö. Kiraz, Web-based android malicious software
detection and classification system. Applied Sciences, 2018. 8(9): p.
1622. https://doi.org/10.3390/app8091622
[6] Moses, T. and D. Syman, Dynamic Syntax Tree: Implementation Results,
2012.
https://www.researchgate.net/publication/304704212_Dynamic_Syntax_Tree_Implementation_Results