Towards Firmware Analysis of Industrial Internet of Things (IIoT) - Applying Symbolic Analysis to IIoT Firmware Vetting

Geancarlo Palavicini Jr, Josiah Bryan, Eaven Sheets, Megan Kline, John San Miguel
US Department of Defense, SPAWAR Systems Center Pacific, San Diego, California, U.S.A.
{geancarlo.palavicini, josiah.bryan, eaven.sheets, megan.kline, john.m.sanmiguel}@navy.mil
Keywords: Industrial Internet of Things, Firmware Vetting, Internet of Things, Cybersecurity, Vulnerability Research,
Embedded Systems, Security, Malware, Emerging Threats, Binary Analysis, Virtualization.
Abstract: Embedded systems and Industrial Internet of Things (IIoT) devices are rapidly increasing in number and
complexity. The subset IIoT refers to Internet of Things (IoT) devices that are used in manufacturing and industrial
control systems and that are actively being connected to larger networks and the public internet. As a result,
cyber-physical attacks are becoming an increasingly common tactic employed to cause economic and physical damage.
This work aims to perform near-automated firmware analysis on embedded systems, Industrial Control Systems
(focusing on Programmable Logic Controllers), Industrial Internet of Things devices, and other cyber-physical
systems in search of malicious functionality. This paper explores the use of binary analysis tools such as angr,
the cyber reasoning system (CRS) ’Mechanical Phish’, and American Fuzzy Lop (AFL), as well as virtualization
tools such as OpenPLC, firmadyne, and QEMU, to uncover hidden vulnerabilities, find ways to mitigate those
vulnerabilities, and enhance the security posture of the Industrial Internet of Things.
1 INTRODUCTION
Embedded systems and Industrial Internet of Things
(IIoT) devices are rapidly increasing in number and
complexity. Industrial Internet of Things devices are
a part of a large family of Internet of Things (IoT) de-
vices, which are simply non-traditional devices that
are now being connected to the public Internet. The
subset IIoT refers to IoT devices that are used in man-
ufacturing and industrial control systems. As a re-
sult, cyber-physical attacks are becoming an increas-
ingly common tactic employed to cause economic
and physical damage (Sadeghi et al., 2015). Combined
with a lack of efficient cybersecurity analysis,
increased connectivity, sloppy programming, and
speed-to-market pressures, cyber-physical attacks have
created a dangerous climate for Operational Technology
(OT) networks and Internet of Things devices.
As embedded technology and capabilities increase,
the firmware for these systems becomes increasingly
difficult to analyze in search of malicious
functionality. Our goal is to perform near-automated
firmware analysis on embedded systems, Industrial
Control Systems (focusing on Programmable Logic
Controllers), Industrial Internet of Things devices,
and other cyber-physical systems in search of
malicious functionality. Our work adapts UC Santa
Barbara’s binary analysis framework ’angr’
(Shoshitaishvili et al., 2016), as well as components
from their cyber reasoning system (CRS) ’Mechanical
Phish’, to perform semi-automated analysis on IIoT
systems. This paper makes the following
contributions:
- Leverages proven open-source technologies and
  approaches for traditional software and firmware
  analysis for use on IIoT firmware
- Extends the work of Shoshitaishvili et al. by
  incorporating custom architecture backends for
  non-standard and proprietary architectures into the
  angr framework
- Lays out a proposed approach for automating
  portions of our firmware analysis
This paper is organized as follows. Section 2 dis-
cusses the current state of the art for IoT binary anal-
ysis. Section 3 dives into the approach taken to ana-
lyze IIoT devices in search of vulnerabilities and the
effort to automate the processes, as well as challenges
and mitigations. Section 4 discusses the initial results
and findings of applying dynamic symbolic execution
and symbolic assisted fuzzing to analyzing IIoT de-
vice firmware. Section 5 summarizes the paper with
the conclusion and Section 6 closes out the work pre-
sented in this paper with a glance toward future work.
2 RELATED WORK
The runtime-monitoring framework presented by
(Janicke et al., 2015) addresses subtle changes to crit-
ical system behavior through semantic attacks. Tra-
ditional protection systems such as Intrusion Detec-
tion Systems (IDS) have difficulty identifying such
classes of attacks. These subtle changes create
states in program execution that could cause machinery
to violate safety requirements, such as a drill
operating at a location where no product is present.
The execution state of the machinery specifies the
current state of the system and the behavior that
should be exhibited in the next state. Behavior
outside of this specification is deemed to be outside
of normal execution, and thus malicious.
Cruz et al. present the Shadow Security Unit (SSU)
in (Cruz et al., 2015) as an in-line network monitor.
The lightweight device sits in parallel to a PLC
or RTU and passively collects communications and
control data to detect active attacks in a live
environment. This is presented as a solution to the
specific challenge of timing sensitivity that is
present in ICS and SCADA systems, and it is a
contribution to the larger CockpitCI project described
in (Cruz et al., 2014). The SSU is a minimally invasive
behavior monitor and correlation engine for anomaly
detection that focuses on live run-time environments.
The work most closely related to ours is that of
Almgren et al., under the CRISALIS Consortium
(Almgren et al., 2015). They are working on
automated vulnerability discovery for Critical
Infrastructure (CI) environments. Their approach
includes application-level protocol fuzzing, emulation
and dynamic analysis of embedded devices, and security
testing prioritization by means of vulnerability
indicators. They outline five challenges to large-scale
vulnerability analysis of embedded firmware, namely
’building a representative dataset’, ’firmware
identification’, ’unpacking and custom formats’,
’scalability and computational limits’, and ’results
confirmation’ (Costin et al., 2014). Our work aims to
ease two of those challenges: ’unpacking and custom
formats’ and ’results confirmation’. We are tackling
the ’unpacking and custom formats’ challenge by
developing custom architecture backends for the angr
framework and by automating our extraction process as
much as possible. We are easing the ’results
confirmation’ challenge by emulating firmware to
verify the exploitability of discovered
vulnerabilities. A major difference in our approach
stems from our use of static, dynamic, and symbolic
analysis of the firmware samples with the angr
framework, as well as symbolic-assisted fuzzing
through Driller’s AFL/angr hybrid approach.
McLaughlin et al. introduce the Trusted Safety
Verifier (TSV) in (McLaughlin et al., 2014), which
supports control system security by reducing the
trusted computing base for safe process execution
throughout the entire system. TSV is deployed as an
in-line device that lifts instruction list code and
translates it to the Instruction List Intermediate
Language (ILIL) for ease of processing. It is designed
to handle PLC-specific features, such as function
blocks, timers, counters, master control relays, data
blocks, and edge detection, in order to detect the
potential for data injection attacks and PLC firmware
exploitation.
3 APPROACH
Our proposed approach for automated firmware
analysis focuses on three major tasks: preparation of
the firmware image for loading into the angr
framework, emulation for verification of discovered
vulnerabilities, and analysis of the firmware sample
’angr style’. Once the image is loaded in the
framework, we can benefit from angr’s reliance on the
Python programming language to further automate the
process. The stages of our approach are as follows
(Figure 1 shows a graphical representation of this
process):
- Extraction and cleanup;
- Emulation;
- Analysis angr style.
The firmware must first be extracted and loaded
into the angr framework. For the extraction portion
we will make use of tools such as binwalk (devttys0,
2016a) and firmware-mod-kit (Collake and Heffner,
2013) to extract the packaged firmware.
Once the firmware is extracted, we will run software
that we have developed to add or remove content
from the binary and prepare it for efficient analysis.
Cleaning up the packaged firmware is extremely
important for the many resource-intensive analyses in
angr. One example is symbolic execution, which is
prone to path explosion when analyzing complex
binaries and which we discuss further in
Section 3.4.1.
Once the firmware has been successfully extracted
and cleaned up, we will emulate the firmware using
OpenPLC (for PLC emulation) (Alves et al., 2014),
QEMU (Bellard, 2017), and/or Firmadyne (Chen
et al., 2016). After these initial steps, it will be time
to load the firmware into angr using the framework’s
default loader CLE or utilizing IDA’s binary loader.
Once the binary is loaded we can use a combination
of various binary analysis techniques (e.g., fuzzing
and symbolic execution) to discover the functionality
and identify malicious behavior such as backdoors,
information leakage, or botnet code.
To aid the analysis, we will add vendor-specific
conventions and libraries to the knowledge base
that angr populates after recovering a Control Flow
Graph (CFG) of the firmware. Angr’s knowledge base
is a shared repository of information discovered by
the angr framework as it progresses through the
analysis of a particular sample (Shoshitaishvili et
al., 2016).
3.1 Firmware Extraction
Extraction of firmware from IIoT devices makes it
possible to replicate the device’s behavior through
emulation.
Three techniques are explored: downloading the
firmware from the vendor’s website or additional
sources, capturing it during a device update, and
extracting it from the device.
Accessing firmware images through vendors can
be a trying process, due to protection mechanisms im-
plemented by manufacturers. Given the vast number
of devices to analyze, this method is not feasible, as
it suffers from a lack of scalability. The remaining feasible
options are constrained by physical possession of the
IIoT device.
IIoT firmware updates are almost exclusively ap-
plied automatically, which can lead to hurdles in
firmware retrieval based on how the updates are
pushed to the device. The challenge of extracting a
working firmware image from the device is not a
uniform process and is in many instances unique to the
specific device.
Potential interfaces to extract the firmware consist
of JTAG, UART, and in-circuit serial programming
(ICSP). Vendors often take steps to block debug
interfaces such as the ones listed above, in which case
dumping the flash chip directly may be the only option.
With the firmware image extracted, the task turns
to extracting the core components: boot loader, ker-
nel image, and file system. There is a wide spectrum
of vendors with no explicit standard on how firmware
images should be structured. The consequence is hav-
ing to reverse engineer and analyze each component
to determine what is pertinent to the system for emu-
lation.
Luckily, Linux provides utilities to aid in this
endeavor. The file utility identifies whether the
image is a recognized format, such as a compressed
file, or unstructured data. Running the strings and
hexdump utilities can reveal insightful information
such as the firmware version, operating system, and
boot loader. The information provided by these
utilities contributes to a preliminary blueprint of
the operation of an IIoT device.
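As a rough illustration of what the strings utility contributes at this stage, the following Python sketch pulls printable ASCII runs out of a raw image. The function name and the minimum length of four characters are assumptions made for this example, not part of the tooling described in this paper.

```python
import re
from typing import List


def extract_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII at least min_len bytes long."""
    # 0x20-0x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

Version banners and boot loader prompts typically surface in this kind of output, which is why it is a useful first pass before any structured unpacking.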
Binwalk, a firmware analysis tool (devttys0,
2016a), can be executed against the firmware image
to identify embedded files and executable code and to
perform recursive file system extraction. Two Linux
file systems commonly associated with IIoT device
firmware are squashfs (devttys0, 2016b), a
compressed read-only file system, and jffs2 (Gupta,
2016) (OWASP, 2016), a log-structured file system for
use with flash memory devices.
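The core idea behind signature-based identification can be sketched in a few lines of Python. This is a simplified illustration, not binwalk's implementation: the table holds a handful of well-known magic values, the labels and function name are chosen for this example, and short magics such as the two-byte jffs2 marker will produce false positives that a real tool filters with additional checks.

```python
from typing import Iterator, Tuple

# A few well-known magic byte sequences; labels are illustrative only.
SIGNATURES = {
    b"hsqs": "squashfs (little-endian)",
    b"sqsh": "squashfs (big-endian)",
    b"\x85\x19": "jffs2 node (little-endian)",
    b"\x19\x85": "jffs2 node (big-endian)",
    b"\x27\x05\x19\x56": "uImage header",
    b"HDR0": "TRX header",
    b"\x1f\x8b\x08": "gzip stream",
}


def scan_signatures(image: bytes) -> Iterator[Tuple[int, str]]:
    """Yield (offset, label) for every signature hit, ordered by offset."""
    hits = []
    for magic, label in SIGNATURES.items():
        start = image.find(magic)
        while start != -1:
            hits.append((start, label))
            start = image.find(magic, start + 1)
    yield from sorted(hits)
```

The offsets reported by such a scan are what an extraction step would use to carve the file system out of the surrounding image.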
There are two open-source projects, sasquatch
(devttys0, 2016c) and jefferson (sviehb, 2016) respec-
tively, that add modifications to existing decompres-
sion utilities for the file systems. The file system is
composed of binaries and initialization scripts that
can be used to investigate disassembly for static anal-
ysis and emulate behaviors of the IoT device for dy-
namic analysis.
In addition, vendors sometimes modify file systems to
prevent them from being extracted. The ’Firmware Mod
Kit’ (Collake and Heffner, 2013) is designed to attempt
the extraction of non-standard squashfs and cramfs
file systems from firmware images that use TRX or
uImage headers. This tool is critical to the
extraction process because it can also rebuild what
has been disassembled. Alterations can be made to the
firmware to exhibit malicious intent, so that
exploitation of the rebuilt device firmware can be
tested. The tool has the capability to re-flash the
IIoT device with the rebuilt binaries, and the result
can be verified through emulation.
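To make the role of the uImage header concrete, the sketch below parses the 64-byte legacy U-Boot image header and checks its header CRC. It is an illustrative reading of the well-known header layout rather than code from the Firmware Mod Kit; the class and function names are hypothetical.

```python
import struct
import zlib
from typing import NamedTuple

UIMAGE_MAGIC = 0x27051956
# Legacy header: seven big-endian 32-bit words, four bytes, 32-byte name.
_HEADER = struct.Struct(">7I4B32s")  # 64 bytes in total


class UImageHeader(NamedTuple):
    size: int
    load_addr: int
    entry_point: int
    os_id: int
    arch: int
    image_type: int
    compression: int
    name: str
    header_crc_ok: bool


def parse_uimage_header(blob: bytes) -> UImageHeader:
    """Parse the legacy uImage header found at the start of blob."""
    if len(blob) < _HEADER.size:
        raise ValueError("input shorter than a uImage header")
    (magic, hcrc, _time, size, load, entry, _dcrc,
     os_id, arch, img_type, comp, name) = _HEADER.unpack_from(blob)
    if magic != UIMAGE_MAGIC:
        raise ValueError("uImage magic not found")
    # The header CRC is computed with the CRC field itself set to zero.
    zeroed = blob[:4] + b"\x00" * 4 + blob[8:_HEADER.size]
    crc_ok = (zlib.crc32(zeroed) & 0xFFFFFFFF) == hcrc
    return UImageHeader(size, load, entry, os_id, arch, img_type, comp,
                        name.rstrip(b"\x00").decode("ascii", "replace"),
                        crc_ok)
```

A rebuilt image must carry a recomputed header CRC, which is one reason a rebuild tool is needed rather than a hex editor alone.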
3.2 Firmware Emulation
Firmware emulation provides an operational
environment separated from device hardware. QEMU is a
machine emulator and virtualization platform that
includes capabilities for full-system and user-mode
emulation (Bellard, 2017). QEMU achieves this by
emulating CPUs through dynamic binary translation.
Various CPU architectures are supported for emulation,
including IA-32, x86-64, MIPS, and ARM. QEMU is the
standard emulation tool for firmware binaries and
images, and can be leveraged for deep analysis of
firmware behavior. Understanding the differences in
behavior between non-malicious and malicious binaries
is essential to determining potential vulnerabilities
in firmware. With the increased popularity of IIoT
devices, assurance of non-malicious behavior is vital
in securing these devices from adversaries. This
requires emulation of the underlying firmware to
verify any discovered vulnerability.
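For user-mode emulation of a single extracted binary, the matching QEMU executable depends on the target architecture and byte order, both of which can be read from the ELF header. The sketch below is a minimal illustration under that assumption; the mapping table covers only the architectures named above, and the function name and paths are hypothetical.

```python
import struct
from typing import List

# ELF e_machine -> (little-endian emulator, big-endian emulator).
_QEMU_BY_MACHINE = {
    3: ("qemu-i386", None),            # EM_386
    8: ("qemu-mipsel", "qemu-mips"),   # EM_MIPS
    40: ("qemu-arm", "qemu-armeb"),    # EM_ARM
    62: ("qemu-x86_64", None),         # EM_X86_64
}


def qemu_user_command(elf_header: bytes, rootfs: str, binary: str) -> List[str]:
    """Build a QEMU user-mode command line for an extracted ELF binary."""
    if len(elf_header) < 20 or elf_header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    little = elf_header[5] == 1  # EI_DATA: 1 = little-endian, 2 = big-endian
    (machine,) = struct.unpack_from("<H" if little else ">H", elf_header, 18)
    if machine not in _QEMU_BY_MACHINE:
        raise ValueError(f"unsupported e_machine value {machine}")
    le_name, be_name = _QEMU_BY_MACHINE[machine]
    emulator = le_name if little else be_name
    if emulator is None:
        raise ValueError("no emulator listed for this byte order")
    # -L makes QEMU resolve the target's dynamic loader and shared
    # libraries from the extracted root file system.
    return [emulator, "-L", rootfs, binary]
```

The returned list can be passed to `subprocess.run`; full-system emulation, as used by Firmadyne, additionally requires a kernel and device model and is not covered by this sketch.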
Figure 1: Approach: Extraction & cleanup, Emulation, Analysis angr style
3.2.1 Emulating PLCs
Because Programmable Logic Controllers (PLCs), the
control units of ICS, were traditionally deployed as
’off-network’ devices, security was not an area of
focus in their design. The merging of physical devices
such as PLCs with the internet has been termed the
Industrial Internet of Things. The concept of IoT
complements the functionality of a PLC, which controls
other machines, including sensors and other devices,
across a network with the aim of operating
autonomously. The main test subjects for this research
are PLCs: commercially available PLC devices as well
as open-source technologies and PLC virtualization
platforms.
OpenPLC is a fully functional, standardized,
open-source PLC (Alves et al., 2014), capable of
supporting all five of the programming languages in
the IEC 61131-3 specification (ST, IL, LADDER, FBD
and SFC). It allows research to be done on physical
hardware and processes in conjunction with an
emulation environment such as Linux. The project
includes a Modbus/TCP communication capability
that can interface with any human machine interface
(HMI) software that supports Modbus/TCP. It also
includes a Node.js environment for interfacing
OpenPLC with the hardware. Modbus is a serial
communication protocol designed by Modicon and
commonly used for PLC communication (Modbus, 2012).
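To show what a Modbus/TCP request looks like on the wire, the following sketch builds a standard 'Write Single Coil' frame (function code 0x05). The function name and the example identifiers are chosen for illustration; the field layout follows the public Modbus specification.

```python
import struct

WRITE_SINGLE_COIL = 0x05


def build_write_coil(transaction_id: int, unit_id: int,
                     coil: int, on: bool) -> bytes:
    """Build a Modbus/TCP 'Write Single Coil' request (MBAP header + PDU)."""
    # PDU: function code, coil address, 0xFF00 for ON or 0x0000 for OFF.
    pdu = struct.pack(">BHH", WRITE_SINGLE_COIL, coil,
                      0xFF00 if on else 0x0000)
    # MBAP header: transaction id, protocol id (always 0),
    # number of bytes that follow (unit id + PDU), unit id.
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu
```

Every field in the frame is addressing or payload; none carries a credential or session token, which is relevant to the replay behavior discussed in Section 4.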
3.3 angr Framework
The angr framework will be used for its unique abil-
ity to combine static, dynamic, and symbolic anal-
ysis. This research leverages angr’s capabilities ap-
plied towards Programmable Logic Controller (PLC)
firmware, with the goal of discovering any hidden vul-
nerability in the underlying software controlling the
hardware, known as the firmware. Although angr’s
loader (CLE) can handle most common hardware ar-
chitectures, PLC firmware images can pose interest-
ing challenges with respect to this task. Therefore,
before we can leverage the binary analysis capabili-
ties of the angr framework, we must first overcome
the challenge of loading the PLC firmware image into
angr for further processing.
The angr framework is a Python-based tool for
analyzing binaries, developed by researchers from UC
Santa Barbara. It is a product of the Defense
Advanced Research Projects Agency’s (DARPA) Vetting
Commodity IT Software and Firmware (VET) program
and was further enhanced through DARPA’s call for
autonomous cybersecurity systems known as the
Cyber Grand Challenge (CGC) (DARPA, 2016). The
angr development team finished in the top three in
the CGC final competition. The goal of the CGC
was to achieve autonomous systems capable of
testing for vulnerabilities, exploiting the
vulnerabilities found, generating security patches,
and applying those patches.
UCSB’s cyber reasoning system (CRS), Mechanical
Phish (Shellphish, 2016), combined angr with
American Fuzzy Lop (AFL) (lcamtuf, 2017) to meet the
challenges set forth by DARPA. The ability of angr to
automate a collection of analyses makes it an ideal
tool for analyzing firmware behavior.
Angr comprises four main components:
CLE, VEX, Claripy, and Simuvex. The framework in-
corporates a binary loader, CLE, responsible for mak-
ing the binary easy for angr to analyze, whether static,
dynamic, or symbolic analysis is to be performed.
Loaded binaries are converted to an intermediate
representation (IR), Valgrind’s IR ’VEX’. This con-
version abstracts away differences in architecture to
allow a single analysis to be run on all loaded bina-
ries.
The solver engine, Claripy, is used for constraint
solving of concrete and symbolic expressions which
are necessary for symbolic execution.
The final component, the simulation engine Simu-
vex, provides a semantic understanding of VEX IR
in combination with program state in order to accom-
plish static, dynamic, and symbolic analysis.
These four components are fundamental to allowing
angr to perform dynamic symbolic execution and
various static analyses on binaries within an
easy-to-use, extensible framework.
3.3.1 Firmware loading with angr
Correctly loading the firmware into angr is the first
and arguably most important step in our process of
analyzing embedded systems. By default, angr’s loader
CLE can load most common architectures, including
ARM, MIPS, x86, AMD64, AArch64, and PPC
(Shoshitaishvili et al., 2016).
If the binary we are analyzing is not in a form that
CLE recognizes (for example, a binary blob), we can
extend CLE to perform more fine-grained discovery of
the entry point where we can begin analysis. This will
require us to both develop our own software and
integrate other open-source tools to aid CLE.
One technique that angr currently supports is mak-
ing use of IDA’s loader to load binaries that CLE can-
not, and leveraging the feedback from IDA to begin
analysis. Any vendor-specific information that we
find when loading binaries will be added to the knowl-
edge base to aid future loading and entry point discov-
ery efforts.
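When the architecture is already supported but the image is a headerless blob, CLE can be told how to map it. The sketch below shows one way this might look; the option names follow the angr documentation as we understand it and may differ between releases, and the architecture name and addresses in the usage note are hypothetical values that would come from datasheets or reverse engineering.

```python
from typing import Any, Dict


def blob_load_options(arch: str, base_addr: int,
                      entry_point: int) -> Dict[str, Any]:
    """Options asking CLE to treat the main binary as a raw blob."""
    return {
        "backend": "blob",           # skip format detection
        "arch": arch,                # e.g. an archinfo name such as "ARMEL"
        "base_addr": base_addr,      # where the image is mapped in memory
        "entry_point": entry_point,  # where analysis should start
    }


def load_firmware_blob(path: str, arch: str, base_addr: int,
                       entry_point: int):
    """Create an angr project for a headerless firmware image."""
    import angr  # third-party dependency, imported only when needed
    return angr.Project(
        path, main_opts=blob_load_options(arch, base_addr, entry_point))
```

A call such as `load_firmware_blob("fw.bin", "ARMEL", 0x08000000, 0x08000000)` would then return a project on which the analyses of the next section can run, assuming those values are correct for the device.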
3.3.2 Firmware analysis angr style
Angr offers many different analysis techniques that
can be performed on a loaded binary. For our pur-
poses we will focus on Control Flow Graph (CFG) re-
covery, Program Slicing, the solver engine, AFL, and
Symbolic Execution (Shoshitaishvili et al., 2015).
Without a CFG most of the other analyses will not
work, and recovering a CFG from complex binaries
can be quite difficult. This further stresses the impor-
tance of narrowing down our areas of interest within
a binary, creating a comprehensive firmware knowl-
edge base, summarizing as much code as possible
(such as Standard C Libraries), and removing unnec-
essary content.
Once we have extracted the CFG we will per-
form fuzzing and symbolic execution. For this por-
tion we will leverage Driller, which augments fuzzing
with symbolic execution to discover paths within the
program which lead to an authenticated state, while
recording the constraints required to reach that state.
Driller switches back and forth between fuzzing and
symbolic execution to work its way through a pro-
gram (Stephens et al., 2016). The fuzzing portion is
handled by AFL, while the symbolic execution is han-
dled by angr.
For example, if a backdoor existed within the
firmware of a smart plug, Driller could be used to
find the backdoor along with the constraints required
to reach that authenticated state, and the built-in
solver engine could then find the input required to
access the backdoor based on those constraints.
Another feature of angr that we plan to make use
of is program slicing. A program slice is a subset of
statements from the original program that can affect
the values computed at a point of interest. While
analyzing embedded systems, there are certain parts of
the system that we are more concerned with than
others, which is where program slicing comes in: it
gives us the ability to focus exclusively on a
particular piece of the program.
3.3.3 Driller
In this section we give a brief description of the
hybrid symbolic-assisted fuzzing approach implemented
by UC Santa Barbara in Driller (Stephens et al.,
2016). We do not take any credit for their
implementation.
Driller uses an instrumented fuzzing engine to
drive the dynamic symbolic execution. Once the
fuzzer reaches a point where it cannot find any other
paths, Driller switches to angr’s symbolic execution
engine to find the values needed to satisfy the
constraints of the branches that the fuzzer cannot
solve. It then switches back to the AFL fuzzer
with the new inputs needed to get through the next
portion of the binary. This process takes place as
many times as needed to reach a program crash point.
One of the strengths of Driller’s fuzz-guided
symbolic execution is that it does not require test
cases to be supplied at the start of the analysis,
although supplying AFL with initial test cases speeds
up the analysis. Driller can generate its own input
test cases by leveraging angr’s symbolic execution
engine.
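The alternation between the two engines can be illustrated with a toy model. Everything in the sketch below is a stand-in: the target program, the single-byte mutator that plays the role of AFL, and in particular `solve_magic`, which simply writes the required bytes where Driller would invoke angr's symbolic execution and constraint solver. It only shows the control flow of fuzzing until stuck, solving one hard branch, and resuming fuzzing.

```python
import random
from typing import List, Optional, Set, Tuple

MAGIC = b"\xde\xad\xbe\xef"  # hypothetical 4-byte check guarding deeper code


class Crash(Exception):
    pass


def target(data: bytes) -> Set[str]:
    """Toy program under test; returns the branches it covered."""
    covered = {"entry"}
    if data[:4] == MAGIC:                      # 1 in 2**32 for random bytes
        covered.add("magic_ok")
        if len(data) > 4 and data[4] > 0x7F:   # easy for random mutation
            raise Crash(data)
    return covered


def fuzz(corpus: List[bytes], seen: Set[str], rng: random.Random,
         budget: int) -> Optional[bytes]:
    """Mutate one byte at a time; keep inputs that reach new branches."""
    for _ in range(budget):
        data = bytearray(rng.choice(corpus))
        data[rng.randrange(len(data))] = rng.randrange(256)
        try:
            covered = target(bytes(data))
        except Crash:
            return bytes(data)
        if not covered <= seen:
            seen |= covered
            corpus.append(bytes(data))
    return None


def solve_magic(seed: bytes) -> bytes:
    """Stand-in for the symbolic stage: satisfy the equality check."""
    return MAGIC + seed[4:]


def hybrid(seed: bytes, rounds: int = 4,
           budget: int = 500) -> Tuple[Optional[bytes], int]:
    """Alternate fuzzing and solving; return (crashing input, solver calls)."""
    rng = random.Random(0)
    corpus, seen, solver_calls = [seed], set(target(seed)), 0
    for _ in range(rounds):
        crash = fuzz(corpus, seen, rng, budget)
        if crash is not None:
            return crash, solver_calls
        # The fuzzer is stuck: solve the hard branch, then resume fuzzing.
        corpus.append(solve_magic(corpus[0]))
        solver_calls += 1
    return None, solver_calls
```

Starting from `hybrid(bytes(5))`, the mutator alone cannot pass the four-byte comparison, while the solver stand-in alone never explores the byte that triggers the crash; the division of labor between cheap mutation and expensive solving is the point of the hybrid design.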
The concept of combining fuzzing with symbolic
execution is not novel, but Driller’s ability to auto-
matically test portions of the code to replace them
with the symbolic summaries without user interven-
tion is a novel approach that we are leveraging to aid
in automating the analysis of IIoT firmware. Interest-
ingly, this approach deals with both the path explosion
challenge as well as the challenge of automating any
and all possible portions of our approach, which we’ll
cover in the next subsection.
3.4 Challenges and Mitigations
There are two overarching challenges faced by our
current approach. The first challenge is inherent in
any solution based on dynamic symbolic execution,
namely the path explosion problem. The second chal-
lenge stems from the lack of automation in both ex-
traction and analysis of firmware images. We’ll cover
both of these challenges and the chosen mitigations in
the following subsections.
3.4.1 Path Explosion
The path explosion problem is a well-known
limitation of concolic execution or dynamic symbolic
execution approaches (Shoshitaishvili et al., 2016).
Any time the analysis engine reaches a branch, it
solves the constraints required to take both sides of
the branch. As the analysis engine discovers an
ever-growing number of branches (and solves the
constraints required to take both paths of each
branch), the number of paths begins to grow at an
exponential rate (Shoshitaishvili et al., 2016).
Several approaches have been proposed in the
literature to mitigate the path explosion problem,
such as program slicing (?), instrumenting the
symbolic analysis engine (Stephens et al., 2016), path
merging, and under-constrained symbolic execution
(Shoshitaishvili et al., 2016).
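The growth rate is easy to see on a toy program with a sequence of independent two-way branches: every additional branch doubles the number of distinct paths. The sketch below enumerates them by brute force; the program model and function names are assumptions made for this illustration.

```python
from itertools import product
from typing import Iterable, Tuple


def run(decisions: Iterable[bool]) -> Tuple[str, ...]:
    """Toy program: one independent two-way branch per decision."""
    return tuple(f"b{i}:{'T' if taken else 'F'}"
                 for i, taken in enumerate(decisions))


def count_paths(num_branches: int) -> int:
    """Enumerate all branch outcomes and count the distinct paths."""
    outcomes = product((False, True), repeat=num_branches)
    return len({run(decisions) for decisions in outcomes})
```

Ten such branches already give `count_paths(10) == 1024` paths, and loops whose bounds depend on symbolic input make real firmware considerably worse than this idealized case.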
We are investigating the efficacy of program slic-
ing and fuzz-guided symbolic analysis provided by
the angr framework and the Driller hybrid approach
developed by UCSB (Stephens et al., 2016), as well
as reducing complexity and the amount of code ana-
lyzed with symbolic summaries.
Symbolic summaries are not a new concept in
binary analysis (Stephens et al., 2016). They are
manual descriptions of the state changes caused by a
given function to a particular program execution. They
summarize the expected output and end result of ex-
ecuting the function, expressed to the analysis engine
in similar form to a binary instruction, thus allowing
the analysis to skip that particular function’s code. In-
stead of having to analyze the entire function to reach
that changed state, the summary is supplied to the
analysis engine.
Symbolic summaries help the analysis by reducing
the amount of code to be analyzed and thus the
complexity of the analysis. They enable the analysis
to drive deeper into the program and aid in mitigating
some of the path explosion issues inherent in
symbolic-execution-based approaches.
The one drawback of symbolic summaries is
that since those portions of code are not analyzed, it’s
possible that a flaw in that summarized portion of the
code will be missed. This is a trade-off that we accept
in order to reach into deeper portions of the firmware.
The other technique relied upon by our work is the
automated fuzz-guided symbolic analysis approach, or
symbolic-assisted fuzzing, embodied by the Driller
solution developed by UC Santa Barbara for the DARPA
VET and CGC programs (DARPA, 2016) and discussed in
Section 3.3.3.
3.4.2 Process Automation
As we discussed in Section 3.1, firmware extraction
can be a difficult and at times heavily manual
process. Part of the challenge of automating this
process is that PLC firmware can come in many
different specialized architectures and proprietary
implementations that complicate extraction, emulation,
and analysis (e.g., Asymmetric Multi Processing
architectures, specialized instruction set
architectures, and ARM Cortex). Although the angr
framework’s loader (CLE) can load binaries from
several different architectures, including ARM, it can
have problems loading custom firmware samples. The
framework has little to no support for SPARC, PIC,
or AVR, among other specialized embedded systems
architectures.
One of our approaches to mitigating this challenge
is the development of additional architecture
support for the angr framework. The use of the
framework in itself allows us to develop Python
scripts to automate our process. In combination with
the AFL fuzzer, the framework further automates the
discovery of vulnerabilities, which we leverage to
further our automation efforts. As such, we are in the
early stages of developing custom architecture
backends that extend angr’s ability to load PLC
firmware easily and with as much automation as
possible.
For this task we are focusing on the following
steps (angr, 2017):
1. Adding the architecture information to the appro-
priate files in the angr framework.
2. Adding an intermediate representation (IR)
translation to work with VEX IR. This may be either
an extension to PyVEX, producing IRSBs, or another
IR entirely.
3. Adding a simuvex.SimEngine to support the IR if
it is not VEX.
4. Adding a calling convention (simuvex.SimCC) to
support SimProcedures (including system calls)
5. Adding or modifying an angr.SimOS to support
initialization activities.
6. Creating a CLE backend to load binaries, or ex-
tending the CLE ELF backend to know about the
new architecture if the binary format is ELF.
The other approach involves developing modules
and scripts to automate as much of our manual
firmware extraction process as possible. Our
exploration and use of the firmadyne solution by
(Chen et al., 2016) attempts to leverage previous
solutions aimed at automating this difficult task. As
we continue to explore the automation of PLC firmware
extraction and analysis, we will continue pursuing the
potential expansion of firmadyne to further automate
the tools and extend its capabilities.
4 INITIAL RESULTS
A prototype system was developed in order to per-
form early analysis of Programmable Logic Con-
troller firmware vetting utilizing the angr framework.
The prototype system leverages the OpenPLC (Alves
et al., 2014) project on a Raspberry Pi 3 Model B.
The OpenPLC project can be used to emulate a simi-
lar process running on a Siemens S7-1200 PLC.
The prototype process is composed of the emulated
PLC running a ladder logic program that controls the
on/off functions of a fan, as well as the speed at
which the fan operates. We performed an angr-based
analysis of both the extracted firmware image
and the ladder logic program controlling the process.
We successfully extracted the PLC’s firmware.
Recall from our discussion of emulation in Section 3
that the purpose of the emulation step in our approach
is to aid in the verification of any discovered
vulnerability and to help us determine the viability
of exploiting the vulnerabilities found. Given that we
already have the emulated process, we will rely on
this process to verify the exploitability of any
discovered vulnerabilities.
Once the firmware was extracted, we loaded the
firmware sample image into the angr framework for
analysis. We conducted several analyses on the sample,
including dynamic symbolic execution. Using the angr
framework, we extracted a data dependency graph and a
recovered Control Flow Graph and added them to the
framework’s knowledge base. The recovered Control
Flow Graph information includes:
- function list
- node list
- predecessors list
- successors list
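The kind of information listed above can be gathered with a few calls once a project is loaded. The sketch below is one possible way to do so and is not necessarily the procedure used for the prototype: it assumes angr's CFGFast analysis, relies on `cfg.graph` being a networkx-style directed graph, and uses hypothetical function names.

```python
from typing import Any, Dict, List


def summarize_cfg(graph: Any) -> Dict[Any, Dict[str, List[Any]]]:
    """Collect predecessors and successors for every node of a digraph.

    `graph` is expected to behave like a networkx.DiGraph, which is
    what angr exposes as `cfg.graph`.
    """
    return {
        node: {
            "predecessors": list(graph.predecessors(node)),
            "successors": list(graph.successors(node)),
        }
        for node in graph.nodes()
    }


def recover_cfg(path: str):
    """Return (function list, per-node neighbors) for a binary."""
    import angr  # third-party dependency, imported only when needed
    project = angr.Project(path, auto_load_libs=False)
    cfg = project.analyses.CFGFast()
    functions = [(addr, func.name)
                 for addr, func in project.kb.functions.items()]
    return functions, summarize_cfg(cfg.graph)
```

Keeping `summarize_cfg` independent of angr means the same helper can be applied to any graph with the same interface, for example one restored from a previous analysis run.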
We also identified the locations of the ladder logic
program, within the firmware image, that controls the
fan speed (in the OpenPLC/Raspberry Pi prototype).
The analysis showed a lack of stack protection
mechanisms, such as Data Execution Prevention (DEP) or
Address Space Layout Randomization (ASLR). This
lack of protection mechanisms makes the class of
stack-based vulnerabilities, such as overflows, a
possible attack vector.
The second major observation from the analysis
results was the existence of an authentication bypass
vulnerability similar to the Siemens SIMATIC S7-1200
PLC Systems Replay Security Bypass and Denial of
Service Vulnerabilities (Cert, 2014) (Beresford,
2011). In terms of this vulnerability, we can
conclude that a write to the modbus coil from any
process will be accepted as valid input (allowing
changes to the fan’s operation and speed). An attacker
could send specially crafted packets (in our case
modbus packets) to the program to change values in the
registers, and the process would then be altered based
on the false values provided by the attacker.
There are some differences between our emulated
process and an S7 PLC that we would like to point
out. OpenPLC uses modbus as its communication
method, whereas the public exploits for the S7 oper-
ate against the iso-tsap protocol for communications.
Thus, Siemens PLCs with older firmware versions are
vulnerable to replay attacks over iso-tsap, whereas
OpenPLC is vulnerable to replay attacks over mod-
bus.
5 CONCLUSION
This work leverages the binary analysis framework
’angr’, portions of the cyber reasoning system (CRS)
’Mechanical Phish’ (Shoshitaishvili et al., 2016)
(Shoshitaishvili et al., 2015) (Shellphish, 2016), as
well as firmware extraction and modification tech-
niques and tools to automate the discovery of vulner-
abilities in IIoT devices. We have chosen to use PLCs
as our initial IIoT test subject.
Our approach includes extraction and emulation
of PLC firmware, as well as analysis using angr, AFL,
and Driller. This approach has helped us uncover vul-
nerabilities, enabling us to devise solutions to mitigate
those vulnerabilities in order to enhance the security
posture of the Industrial Internet of Things. Our
early results reveal vulnerabilities in Industrial
Internet of Things devices emulated in our laboratory
environment, namely lack of stack protection and
authentication bypass. As more analyses are conducted
and verified, we will update the community on findings
and proposed mitigations to the discovered vulnerabilities.
6 FUTURE WORK
Given the early results discussed in this paper, we have
begun expanding our analysis of PLC firmware on
several brands of controllers. We have started to an-
alyze a few versions of the Siemens S7 controller,
as well as several different models of Allen Bradley
PLCs. We are also exploring the potential to improve
the performance of angr through the use of symbolic
summaries. We are working towards expanding the
angr framework’s ability to load other architectures
specific to PLC manufacturers, and exploring the po-
tential to extend the firmadyne tool to further auto-
mate the analysis of PLC firmware.
REFERENCES
Almgren, M., Balzarotti, D., Stijohann, J., and Zambon,
E. (2015). Runtime-monitoring for industrial control
systems. Electronics, 4(3):995 – 1017.
Alves, T. R., Buratto, M., de Souza, F. M., and Rodrigues,
T. V. (2014). Openplc: An open source alternative
to automation. In Proc. IEEE Global Humanitarian
Technology Conf. (GHTC 2014), pages 585–589.
angr (2017). angr-docs. Contributing to the framework.
Bellard, F. (2017). Qemu.
Beresford, D. (2011). Siemens simatic s7-1200 plc systems
replay security bypass and denial of service vulnera-
bilities.
Cert, I. (2014). Siemens s7-1200 plc vulnerabilities.
Chen, D. D., Egele, M., Woo, M., and Brumley, D. (2016).
Towards automated dynamic analysis for linux-based
embedded firmware. In ISOC Network and Dis-
tributed System Security Symposium (NDSS).
Collake, J. and Heffner, C. (2013). Firmware modification
kit.
Costin, A., Zaddach, J., Francillon, A., Balzarotti, D., and
Antipolis, S. (2014). A large-scale analysis of the se-
curity of embedded firmwares. In USENIX Security,
pages 95–110.
Cruz, T., Barrigas, J., Proença, J., Graziano, A., Panzieri, S.,
Lev, L., and Simões, P. (2015). Improving network security
monitoring for industrial control systems. In Integrated
Network Management (IM), 2015 IFIP/IEEE
International Symposium on, pages 878–881. IEEE.
Cruz, T., Proença, J., Simões, P., Aubigny, M., Ouedraogo,
M., Graziano, A., and Yasakhetu, L. (2014). Improving
cyber-security awareness on industrial control systems:
The cockpitci approach. In 13th European Conference
on Cyber Warfare and Security ECCWS-2014
The University of Piraeus Piraeus, Greece, page 59.
DARPA (2016). Darpa cyber grand challenge.
devttys0 (2016a). Binwalk. Firmware Analysis Tool.
devttys0 (2016b). Reverse engineering firmware: Linksys
wag120n. SquashFS common file system for IoT.
devttys0 (2016c). Sasquatch. Set of patches to the standard
unsquashfs utility.
Gupta, A. (2016). Firmware analysis for iot devices.
Janicke, H., Nicholson, A., Webber, S., and Cau, A. (2015).
Runtime-monitoring for industrial control systems.
Electronics, 4(3):995 – 1017.
lcamtuf (2017). American fuzzy lop.
McLaughlin, S. E., Zonouz, S., Pohly, D., and McDaniel, P.
(2014). A trusted safety verifier for process controller
code. In NDSS, volume 14.
Modbus (2012). MODBUS Protocol Specification. Modi-
con, v1.1b3 edition.
OWASP (2016). Iot firmware analysis.
Sadeghi, A. R., Wachsmann, C., and Waidner, M. (2015).
Security and privacy challenges in industrial internet
of things. In Proc. 52nd ACM/EDAC/IEEE Design
Automation Conf. (DAC), pages 1–6.
Shellphish, U. (2016). Mechanical phish. Cyber Reasoning
System for DARPA Cyber Grand Challenge.
Shoshitaishvili, Y., Wang, R., Hauser, C., Kruegel, C., and
Vigna, G. (2015). Firmalice - Automatic Detection
of Authentication Bypass Vulnerabilities in Binary
Firmware. In Proceedings of the 2015 Network and
Distributed System Security Symposium.
Shoshitaishvili, Y., Wang, R., Salls, C., Stephens, N.,
Polino, M., Dutcher, A., Grosen, J., Feng, S., Hauser,
C., Kruegel, C., and Vigna, G. (2016). Sok: State of
the art of war: Offensive techniques in binary analysis.
In IEEE Symposium on Security and Privacy.
Stephens, N., Grosen, J., Salls, C., Dutcher, A., Wang, R.,
Corbetta, J., Shoshitaishvili, Y., Kruegel, C., and Vi-
gna, G. (2016). Driller: Augmenting fuzzing through
selective symbolic execution. In Proceedings of the
2016 Network and Distributed System Security Sym-
posium.
sviehb (2016). Jefferson. JFFS2 filesystem extraction tool.