All content in this area was uploaded by Clark L. Coleman on Nov 12, 2019
A System For The Security Protection Of Embedded
Binary Programs
Jack W. Davidson1,2, Jason D. Hiser2, Anh Nguyen-Tuong2, Clark L. Coleman1, William H. Hawkins2,
John C. Knight2,3, Benjamin D. Rodes3, Ashlie B. Hocking3
1Zephyr Software LLC, 2University of Virginia, 3Dependable Computing LLC
Charlottesville, VA USA
Abstract—Software for which development artifacts are
missing is increasingly common and difficult to avoid, including
in embedded systems. The lack of development artifacts leaves
doubt about whether the software possesses critical security
properties and makes enhancement of the software extremely
difficult. Embedded systems often have strict resource
constraints, making the application of security
enhancements especially difficult. In this paper, we present
details of a system that is being developed to provide significant
protection against security exploits of embedded systems. The
system operates on binary programs. No source code or other
development artifacts are required, and the typical size and time
constraints of embedded systems are accounted for in the
analysis and processing of subject binary programs. Formal
verification of security properties is used to eliminate
unnecessary security transformations, and transformations are
applied by a highly efficient static binary rewriter.
Keywords—Embedded system security, formal verification,
static binary rewriting
I. INTRODUCTION
Modern sophisticated embedded systems are often built
with software from a variety of sources. For example, an
embedded system might use code libraries for which
development details including the source code are unavailable,
i.e., the libraries are Software Of Unknown Provenance
(SOUP). Whether such software is adequately secure is
difficult to determine because of the lack of necessary
development documentation.
For security-critical applications, various techniques have
been developed to enhance the security properties of software
for which access to development artifacts is unavailable [8].
Embedded systems often have strict resource restrictions
making the application of security enhancements difficult.
In this paper, we present details of a system that is being
developed by Zephyr Software, the University of Virginia and
Dependable Computing that is designed to provide significant
protection against security exploits of embedded software. We
summarize the project technology, present preliminary results,
and discuss future plans.
The system combines the Zephyr Security Toolkit (ZeST)
and Dependable Computing’s security-case toolkit (SCT), and
interfaces with AdaCore’s SPARK Pro toolset. The system
operates on binary programs, i.e., no source code, debug
information or other development artifacts are required, and the
typical size and time constraints of embedded systems are
accounted for in the analysis and processing of subject binary
programs.
The basic approach being used is in five parts:
• For a certain class of vulnerabilities, formal
verification of security properties is attempted. If it
succeeds, the property is verified. If the proof fails,
either one or more vulnerabilities exist or inadequate
proof strategies were used. In either case, targeted
changes are made to the program until the proof
succeeds.
• For vulnerabilities for which formal verification is not
possible, transformations are applied to constrain
execution.
• For vulnerabilities for which execution constraints are
not suitable, and to provide broad protection against
unknown vulnerabilities, the system applies artificial
diversity.
• A high-performance static binary rewriter applies the
transformations indicated by the previous three parts of
the system to the subject binary program.
• A rigorous security case is developed to allow system
stakeholders to determine whether the overall system
security and resulting system performance is sufficient
for their needs.
The initial implementation platform is a Linux system
running on x86-64 (64-bit) hardware, with future plans to port
to other platforms of interest.
An overview of the system is shown in Fig. 1. The
processing of the system is defined by a security policy
specification that the user provides interactively. The policy
specification system uses overhead estimates obtained from
benchmarking experiments to assist in the selection of
defenses. The policy specification also defines whether the
defense should terminate execution upon defeating an
attempted exploit, or try to recover and resume execution.
Fig. 1. System operational overview.
The original binary program is read by a static analyzer
(STARS), disassembled and analyzed to provide information
needed by the rest of the system. The original program and the
results of analysis are stored in the Intermediate Representation
Database (IRDB). Next, proofs are attempted on a variety of
security-related theorems. The results of the proof attempts are
returned to STARS and used to determine the necessary
defenses. Defenses are selected from the available constrained
execution and artificial diversity sets. Finally, transformations
that will implement the necessary defenses are defined and
applied with the Zipr binary rewriter. In concert with the
enhancements to the subject software, a rigorous security case
is developed.
II. STATIC ANALYSIS
The STARS static analyzer reads the binary file and:
• Disassembles the binary program (using IDA Pro).
• Produces a variety of analyses of the binary program
including the control-flow and data-flow analyses,
stack-frame analyses, recovered control structures
from the original source program, invariants for the
various loops, and security-related function pre- and
post-conditions.
• Performs initial vulnerability analyses that are
improved in a second pass using the proof results.
• Enters the binary program into the Intermediate
Representation Database (IRDB), a Postgres database.
Each instruction in the program is stored as a separate
record.
• Translates the binary program into SPARK Ada [2]
including a translation of data structures into Ada
syntax, and development of security-based SPARK
constraints (pre- and post-conditions, invariants, etc.).
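The one-record-per-instruction storage described above can be pictured with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Postgres, and the table name, column names, and addresses are hypothetical; the actual IRDB schema is not given in this paper.

```python
import sqlite3

# Hypothetical rows: (address, func_name, raw_bytes, fallthrough_addr, target_addr).
EXAMPLE_ROWS = [
    (0x401000, "main", bytes.fromhex("55"), 0x401001, None),      # push rbp
    (0x401001, "main", bytes.fromhex("4889e5"), 0x401004, None),  # mov rbp, rsp
    (0x401004, "main", bytes.fromhex("c3"), None, None),          # ret
]

def build_irdb(rows):
    """Create an in-memory database holding one record per instruction."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE instruction (
               address          INTEGER PRIMARY KEY,
               func_name        TEXT NOT NULL,
               raw_bytes        BLOB NOT NULL,
               fallthrough_addr INTEGER,  -- next instruction, if control falls through
               target_addr      INTEGER   -- direct branch target, if any
           )"""
    )
    db.executemany("INSERT INTO instruction VALUES (?, ?, ?, ?, ?)", rows)
    return db

db = build_irdb(EXAMPLE_ROWS)
for address, size in db.execute(
    "SELECT address, length(raw_bytes) FROM instruction ORDER BY address"
):
    print(hex(address), size)
```

Storing the fall-through and target links explicitly, rather than relying on address order, is one plausible way to let later passes reorder instructions without losing control flow.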
III. PROOF SYSTEM
The output of static analysis includes the entire binary
program translated into SPARK Ada [2]. SPARK Ada is used
as an intermediate representation of the binary program,
because the language supports specification of security
properties and is supported by a high-quality proof
infrastructure, the SPARK Pro analysis tools [6].
Security properties of interest about each function in the
software are defined as putative theorems and proofs of the
theorems are attempted. Current properties stated include: (a)
integrity of the stack pointer, (b) freedom from changes to the
return address referenced by the stack pointer, and (c) the
inability to call setuid() with an argument of zero.
If a theorem can be proved, then the associated function has
the desired property for all possible executions. Thus,
vulnerabilities such as buffer overflows do not exist in the
function and no additional security protection is required for
those vulnerabilities.
If the theorem cannot be proved, then the function might
contain one or more vulnerabilities. By examining the reason
for the proof failure, the location within the function that needs
to be enhanced (so as to make the proof possible) can often be
identified and suitable enhancements defined for the software.
The results of the proof attempts are returned to STARS for
use in customizing the protections that will be applied to the
subject program.
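A minimal Python sketch of how proof results might steer the choice of protections follows. The property identifiers mirror the three properties listed above, but the `ProofResult` type, the `select_defenses` helper, and the mapping from an unproven property to a defense are illustrative assumptions, not the interface of STARS or SPARK Pro.

```python
from dataclasses import dataclass, field

# Hypothetical identifiers for the three properties named in the text.
STACK_POINTER_INTEGRITY = "stack_pointer_integrity"
RETURN_ADDRESS_UNCHANGED = "return_address_unchanged"
NO_SETUID_ZERO = "no_setuid_zero"

# Assumed mapping: which defense covers a property that could not be proved.
FALLBACK_DEFENSE = {
    STACK_POINTER_INTEGRITY: "SLX",
    RETURN_ADDRESS_UNCHANGED: "SCFI",
    NO_SETUID_ZERO: "guard_instruction",
}

@dataclass
class ProofResult:
    function: str
    proved: set = field(default_factory=set)

def select_defenses(result):
    """Return the defenses a function still needs; proved properties need none."""
    unproved = [p for p in FALLBACK_DEFENSE if p not in result.proved]
    return sorted({FALLBACK_DEFENSE[p] for p in unproved})

results = [
    ProofResult("parse_header", set(FALLBACK_DEFENSE)),  # every proof succeeded
    ProofResult("copy_input", {NO_SETUID_ZERO}),         # two proofs failed
]
for r in results:
    print(r.function, select_defenses(r))
```

A function whose theorems are all proved receives no transformations, which is where the savings in overhead come from.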
Multiple binary programs ranging in size from 11
instructions to over 2,000 have been converted into SPARK
Ada and several security properties proven for each. Original
machine instructions are usually translated into 1 to 4 SPARK
statements. We have also examined binary programs with: (a)
different numbers and types of data structures, control
structures, function calls and nested loops, and (b) increasingly
complex security properties, such as those involving passing
references to buffers of variable size and scaled and indirect
memory accesses. At
present, proof times have ranged from 6 to approximately 600
seconds, with an average time of 46 seconds.
These proof activities stimulated the development of
heuristics for automating the translation process and improving
the speed of the proof process. In addition, preliminary
scalability analysis has been performed to identify heuristics
that have manageable growth curves. For example, various
heuristics have enabled the proof of a MiBench benchmark
containing 8 loops, one of which contains 4 nested loops, in
approximately 13 seconds. Initially, we were unable to prove a
complete version of the program, because the prover timed out
after 10,000 seconds.
IV. EXECUTION CONSTRAINTS
Execution constraints ensure that certain undesirable
execution sequences are either blocked or detected.
[Fig. 1 diagram. Labels: Original Binary File (untrusted); STARS Static Analyzer; Intermediate Representation Database (IRDB); SPARK Ada Representation; Security Theorems; SPARK Pro Proof System; Proof Results; Policy Specification; Policy Specification System; Overhead Estimates; User Specified Security Enhancements; Zipr Binary Rewriter; Transformed Binary File (protected); Evidence Collection; Security Case Manager; GSN Argument Editor; Security Case; System Stakeholders; Zephyr Security Toolkit; Dependable Computing Security Case Toolkit.]
The major execution constraint that is implemented is
called Selective Control Flow Integrity (SCFI). SCFI limits
control transfers to legitimate targets. It is called selective,
because, unlike other binary-only control flow integrity
techniques, not all indirect branches need to be instrumented.
Instead, we can select not to protect indirect branches that we
have proven to be secure.
Our basic SCFI allows an indirect branch to jump to any
instruction that might be an indirect branch target. Basic SCFI
incurs 3% additional run-time overhead on average and
provides a 99% reduction in the attack surface.
We have a refined version of SCFI that analyzes indirect
branches more thoroughly. This refined version detects that
some indirect branches use a jump table (as is often used in
implementing a switch/case statement in high-level languages).
It also detects call sites in the program, and limits individual
return instructions to a specific set of targets.
These improvements in granularity help significantly. For
example, in the examples we have analyzed, more than 60% of
returns are fully analyzed. These refinements leave the
overhead unchanged while adding several orders of magnitude
reduction in the attack surface for return address hijackings.
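The difference between the basic and refined target sets can be illustrated with a toy model in Python. All addresses, the fixed call-instruction length, and the helper names are assumptions made for illustration; real instrumentation is emitted as machine code by the rewriter, not as Python.

```python
# Toy program model; every address here is invented for illustration.
CALL_SITES = {0x1000: "f", 0x1020: "f", 0x1040: "g"}  # call address -> callee
CALL_LENGTH = 5                                        # assumed call instruction size
FUNCTION_ENTRIES = {0x2000, 0x3000}
JUMP_TABLE_TARGETS = {0x2010, 0x2018}

def basic_targets():
    """Basic SCFI: one shared set holding every possible indirect-branch target."""
    return_sites = {addr + CALL_LENGTH for addr in CALL_SITES}
    return return_sites | FUNCTION_ENTRIES | JUMP_TABLE_TARGETS

def refined_return_targets(callee):
    """Refined SCFI: a return in `callee` may reach only its own return sites."""
    return {addr + CALL_LENGTH for addr, name in CALL_SITES.items() if name == callee}

def needs_check(function, proven_safe):
    """Selective part: skip instrumentation where a proof already succeeded."""
    return function not in proven_safe

def check_transfer(target, allowed):
    """Model of the check placed before an instrumented indirect branch."""
    if target not in allowed:
        raise RuntimeError(f"control-flow violation: {target:#x}")
    return target

print(len(basic_targets()), len(refined_return_targets("f")))
```

In this toy model a return from `f` has seven permitted targets under the basic scheme and two under the refined scheme, while the check itself is the same set-membership test in both cases.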
Typically, when a control flow violation is detected,
continued execution is not possible since no valid target is
available. However, since SCFI limits targets so effectively, we
can choose a reasonable target from the set. We have
developed a simple algorithm for attempting to continue
execution properly and are in the process of evaluating its
effectiveness.
V. ARTIFICIAL DIVERSITY
Cyber attacks can be made more difficult by randomizing
a characteristic of the target program. Randomization does not
eliminate vulnerabilities; it merely changes information that the
adversary previously knew thereby denying the adversary
information critical for the exploit. Such randomization is
referred to as artificial diversity [3]. At present, the system
applies two artificial-diversity techniques: Stack Layout
Transformation and Block-Level Instruction Layout
Randomization.
Stack Layout Transformation (SLX) is designed to thwart
stack-based buffer overflow attacks, including intra-frame
overflows and non-control data attacks [10]. SLX includes
three separate transforms that are applied to the stack frames of
individual functions: (a) randomizing the order of local
variables, (b) adding random-length padding between
variables, and (c) placing canaries, i.e., random values that are
checked periodically during execution.
To effect these transforms requires that: (1) the stack layout
be redefined, (2) instructions that access the stack be modified
to use the redefined layout, and (3) instructions be inserted to
set and check canary values during execution.
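The three transforms can be sketched on a simulated stack frame in Python. This is a simplified model under stated assumptions: one canary is placed after every variable, canaries are 8 bytes, padding is at most 16 bytes, and the function names are hypothetical. The paper does not specify these details.

```python
import random

CANARY_SIZE = 8  # bytes; an assumed size

def transform_stack_layout(local_vars, rng, max_padding=16):
    """local_vars: list of (name, size). Returns (layout, canaries, frame_size)."""
    order = list(local_vars)
    rng.shuffle(order)                               # (a) randomize variable order
    layout, canaries, offset = {}, {}, 0
    for index, (name, size) in enumerate(order):
        offset += rng.randrange(max_padding + 1)     # (b) random-length padding
        layout[name] = offset
        offset += size
        canary = f"__canary_{index}"                 # (c) canary after the variable
        layout[canary] = offset
        canaries[canary] = rng.getrandbits(8 * CANARY_SIZE)
        offset += CANARY_SIZE
    return layout, canaries, offset

def install_canaries(frame, layout, canaries):
    """Models the inserted instructions that set canary values."""
    for name, value in canaries.items():
        start = layout[name]
        frame[start:start + CANARY_SIZE] = value.to_bytes(CANARY_SIZE, "little")

def canaries_intact(frame, layout, canaries):
    """Models the inserted instructions that check canary values."""
    return all(
        int.from_bytes(frame[layout[name]:layout[name] + CANARY_SIZE], "little") == value
        for name, value in canaries.items()
    )

rng = random.Random()
layout, canaries, frame_size = transform_stack_layout([("buf", 16), ("count", 4)], rng)
frame = bytearray(frame_size)
install_canaries(frame, layout, canaries)
print(layout, canaries_intact(frame, layout, canaries))
```

Every stack access in the function would then be rewritten to use the offsets in `layout`, which corresponds to requirement (2) above.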
Block-Level Instruction Layout Randomization (BILR)
randomizes the order of code at the basic-block level and is
conceptually similar to Instruction Layout Randomization
(ILR) [5]. The reordering can be based on a variety of
objectives such as best fit, maximization of the distance
between originally adjacent blocks, etc. BILR can significantly
help with some types of arc-injection and ROP style attacks.
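A minimal Python sketch of block-level reordering follows. It assumes a uniform random shuffle rather than a placement objective such as best fit, keeps the entry block first for simplicity, and represents instructions as plain strings; none of this is taken from the Zipr plugin itself.

```python
import random

def randomize_block_layout(blocks, rng):
    """blocks: list of (label, instructions) in original order.

    Returns the blocks in a new order. Any block that could fall through
    to its successor first receives an explicit jump to that successor,
    so that reordering does not change the control flow."""
    pinned = []
    for index, (label, instructions) in enumerate(blocks):
        body = list(instructions)
        ends_in_transfer = bool(body) and body[-1].split()[0] in ("jmp", "ret")
        if not ends_in_transfer and index + 1 < len(blocks):
            body.append(f"jmp {blocks[index + 1][0]}")  # preserve fall-through
        pinned.append((label, body))
    entry, rest = pinned[0], pinned[1:]
    rng.shuffle(rest)  # assumed objective: uniform random order
    return [entry] + rest

original = [
    ("B0", ["cmp eax, 0", "je B2"]),
    ("B1", ["add eax, 1"]),
    ("B2", ["ret"]),
]
for label, body in randomize_block_layout(original, random.Random()):
    print(label, body)
```

The added jumps are the price of the reordering; a placement objective that keeps some blocks adjacent could omit some of them.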
VI. STATIC BINARY REWRITING
Zipr is a general purpose, static binary rewriter that uses a
novel reassembly technique to generate compact, transformed
versions of original programs (statically or dynamically linked)
and libraries, relying only on access to their binary code for
rewriting [4].
In order to guarantee the proper execution of rewritten
programs and libraries in all cases, other static binary rewriters
such as SecondWrite [1] encapsulate a copy of the original
program within the rewritten version and incur an overhead of
at least 100% in disk size. Zipr does not require such
duplication, because of the development of a new reassembly
technique. Programs in the SPEC2006 benchmark suite
rewritten using Zipr increase from their native size by an
average of only 4%, and Zipr-rewritten binaries carry an
average performance overhead of only 5%, a significant
improvement on the state of the art.
For the system described in this paper, Zipr's most
important feature is its plugin architecture that allows users to
control the rewriting algorithm. Through plugins, Zipr users
can direct the rewriter to add the security features and defenses
described herein. For example, there are existing Zipr plugins
for BILR and stack-layout transformation.
VII. SECURITY CASE
Overhead estimates guide the user’s selection of security
techniques that the system is to apply. But a trade-off is
required between the inevitable overhead incurred and the
efficacy of the resulting security protection. Different
techniques incur different amounts of overhead and provide
different forms of protection.
To allow stakeholders to assess the overall efficacy of the
security enhancements applied and to support the necessary
trade-off, the system includes support for the development and
analysis of a rigorous security case [9]. A security case
provides a comprehensive, valid and compelling argument that
a system is adequately secure for a given application in a given
environment. A security case documents the rationale for belief
that the system is adequately secure thereby facilitating
scrutiny of that rationale.
The overall organization of a security case is shown in Fig.
2. The security claim defines the goal of the analysis and
transformation of the subject binary program. Evidence is
collected during the enhancement process about the subject
software, the various analyses, and the transformations. A
rigorous argument is developed that is designed to compel
belief in the security claim based on the available evidence.
Arguments are documented graphically using the Goal
Structuring Notation (GSN) [7].
Fig. 2. Overall organization of a rigorous security case.
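The organization of a security case can be represented as a small tree structure, sketched here in Python. The `Goal` type, its field names, and the example claims and evidence descriptions are hypothetical; they follow the elements of Fig. 2 and are not the data model of the security-case toolkit.

```python
from dataclasses import dataclass, field

@dataclass
class Goal:
    claim: str
    context: list = field(default_factory=list)      # context in which the claim is made
    assumptions: list = field(default_factory=list)  # assumptions about the system
    subgoals: list = field(default_factory=list)     # argument: decomposition of the claim
    evidence: list = field(default_factory=list)     # evidence supporting a leaf claim

def unsupported_goals(goal):
    """Return the claims of leaf goals that cite no evidence."""
    if not goal.subgoals:
        return [] if goal.evidence else [goal.claim]
    found = []
    for sub in goal.subgoals:
        found.extend(unsupported_goals(sub))
    return found

# Hypothetical example; the claims and evidence items are placeholders.
case = Goal(
    claim="Transformed binary is adequately secure",
    context=["Deployment environment (placeholder)"],
    assumptions=["Static analysis results are sound (placeholder)"],
    subgoals=[
        Goal("Return addresses cannot be hijacked", evidence=["SCFI applied"]),
        Goal("Stack-based overflows are detected"),
    ],
)
print(unsupported_goals(case))
```

A check of this kind points reviewers to the parts of the argument that still lack evidence.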
VIII. FUTURE WORK
Planned future work for the system described includes:
• Formal verification: Planned improvements to the
formal verification system include: (a) increasing the
size of programs that can be automatically translated
and analyzed to a target of 1,000,000 instructions, (b)
keeping analysis and proof times to practical levels, and
(c) complementing the proof technology being
developed with a “repair” mechanism that will insert
guard instructions into the program automatically
when the initial proof fails.
• Binary rewriting: Planned improvements to the static
binary rewriter include reduction of the memory
overhead of Zipr by development of more
sophisticated code placement and code packing
algorithms.
• Continued execution: Planned improvements to
continued execution include recovery from memory
overwriting exploits by developing techniques to
restore corrupted program state beyond the shadowed
values (return addresses, function pointers, critical
arguments) that are restored in the current
implementation.
• Defenses: SCFI will be extended to cross-file control
flow (e.g., among the main executable file and the
shared objects), and the granularity of SCFI will be
further improved with improvements in the static
analyses in STARS.
IX. SUMMARY
The use of binary software for which source code and
development artifacts are unavailable has become widespread
and is difficult to avoid, even in embedded systems. Important
properties, such as critical security properties, cannot be
assured because of the lack of necessary development
documentation.
The system described is designed to provide substantial
security hardening of embedded software when only the binary
form of the software is available. Formal verification is used to
identify locations where either additional hardening is not
required or where security enhancements can be targeted.
Execution constraints and artificial diversity are applied to
supplement the targeted security enhancements, and a static
binary rewriter applies all of the necessary modifications to the
binary program.
Raytheon Integrated Defense Systems has conducted a
preliminary Red Team attack on specimen software to which
the described system had been applied. The SCFI defense
initially defeated most hijacking attacks and defeated all in-
scope hijacking exploits after the attack surface reduction for
return addresses was improved.
ACKNOWLEDGMENTS
The authors gratefully acknowledge the technical support
provided to this project by AdaCore.
REFERENCES
[1] K. Anand, M. Smithson, A. Kotha, K. Elwazeer, R. Barua,
“Decompilation to compiler high IR in a binary rewriter,” Tech. rep.,
University of Maryland, November 2010,
http://www.ece.umd.edu/~barua/high-IR-technical-report10.pdf
[2] J. Barnes, SPARK: The Proven Approach to High Integrity Software.
Altran Praxis, 2012.
[3] J. Davidson, J. Hiser, A. Nguyen-Tuong, M. Co, B. Rodes, J. Knight,
“Security protection of binary programs,” IET System Safety and Cyber
Security Conference, Bristol, UK 2015.
[4] W. Hawkins, J. Hiser, A. Nguyen-Tuong, C. Coleman, J. Davidson,
“Efficient static binary rewriting,” Department of Computer Science
Technical Report CS2015-5, University of Virginia, 2015.
[5] J. Hiser, A. Nguyen-Tuong, M. Co, M. Hall, J. Davidson, “ILR: where’d
my gadgets go?,” IEEE Symposium on Security and Privacy, 2012, pp.
571-585.
[6] A. Hocking, B. Rodes, J. Knight, J. Davidson, C. Coleman, “A proof
infrastructure for binary programs,” in submission to NASA Formal
Methods (NFM), 2016.
[7] T. Kelly, R. Weaver, “The Goal Structuring Notation–a safety
argument notation,” in Dependable Systems and Networks Workshop on
Assurance Cases, Florence, Italy, 2004.
[8] B. Rodes and J. Knight, “Speculative software modification and its use
in securing SOUP,” in the European Dependable Computing
Conference, Newcastle upon Tyne, UK, 2014, pp. 210-221.
[9] B. Rodes, J. Knight, A. Nguyen-Tuong, J. Hiser, M. Co, J. Davidson, “A
case study of security case development,” 23rd Safety-critical Systems
Symposium, Bristol, UK, 2015.
[10] B. Rodes, A. Nguyen-Tuong, J. Hiser, J. Knight, M. Co, J. Davidson,
“Defense against stack-based attacks using speculative stack layout
transformation,” Third International Conference on Runtime
Verification, Istanbul, Turkey, 2012, pp. 308-313.
[Fig. 2 diagram. Labels: Security Claim; Context Within Which Claim Is Made; Assumptions Made About The System; Body of Evidence About The System Including Details Of Transformations, Analysis, Testing; Rigorous Argument Linking The Body of Evidence to the Security Claim.]