A System for the Security Protection of Embedded Binary Programs

Jack W. Davidson (1,2), Jason D. Hiser (2), Anh Nguyen-Tuong (2), Clark L. Coleman (1), William H. Hawkins (2),
John C. Knight (2,3), Benjamin D. Rodes (3), Ashlie B. Hocking (3)
(1) Zephyr Software LLC, (2) University of Virginia, (3) Dependable Computing LLC
Charlottesville, VA USA
Abstract: Software for which development artifacts are
missing is increasingly common and difficult to avoid, including
in embedded systems. The lack of development artifacts leaves
doubt about whether the software possesses critical security
properties and makes enhancement of the software extremely
difficult. Embedded systems often have strict resource
constraints, which make the application of security
enhancements especially difficult. In this paper, we present
details of a system that is being developed to provide significant
protection against security exploits of embedded systems. The
system operates on binary programs. No source code or other
development artifacts are required, and the typical size and time
constraints of embedded systems are accounted for in the
analysis and processing of subject binary programs. Formal
verification of security properties is used to eliminate
unnecessary security transformations, and transformations are
applied by a highly efficient static binary rewriter.
Keywords: Embedded system security, formal verification,
static binary rewriting
I. INTRODUCTION
Modern sophisticated embedded systems are often built
with software from a variety of sources. For example, an
embedded system might use code libraries for which
development details including the source code are unavailable,
i.e., the libraries are Software Of Unknown Provenance
(SOUP). Whether such software is adequately secure is
difficult to determine because of the lack of necessary
development documentation.
For security-critical applications, various techniques have
been developed to enhance the security properties of software
for which access to development artifacts is unavailable [8].
Embedded systems often have strict resource restrictions,
which make the application of security enhancements difficult.
In this paper, we present details of a system, being
developed by Zephyr Software, the University of Virginia and
Dependable Computing, that is designed to provide significant
protection against security exploits of embedded software. We
summarize the project technology, present preliminary results,
and discuss future plans.
The system combines the Zephyr Security Toolkit (ZeST)
and Dependable Computing’s security-case toolkit (SCT), and
interfaces with AdaCore’s SPARK Pro toolset. The system
operates on binary programs, i.e., no source code, debug
information or other development artifacts are required, and the
typical size and time constraints of embedded systems are
accounted for in the analysis and processing of subject binary
programs.
The basic approach being used has five parts:
• For a certain class of vulnerabilities, formal
  verification of security properties is attempted. If it
  succeeds, the property is verified. If the proof fails,
  either one or more vulnerabilities exist or inadequate
  proof strategies were used. In either case, targeted
  changes are made to the program until the proof
  succeeds.
• For vulnerabilities for which formal verification is not
  possible, transformations are applied to constrain
  execution.
• For vulnerabilities for which execution constraints are
  not suitable, and to provide broad protection against
  unknown vulnerabilities, the system applies artificial
  diversity.
• A high-performance static binary rewriter applies the
  transformations indicated by the previous three parts of
  the system to the subject binary program.
• A rigorous security case is developed to allow system
  stakeholders to determine whether the overall system
  security and resulting system performance are sufficient
  for their needs.
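The ordering of the first three parts can be sketched as a small decision function. This is only an illustration under stated assumptions: the names `ProofResult` and `select_defense` and the example function names are hypothetical, and the sketch picks a single action per function, whereas the system described also applies artificial diversity broadly against unknown vulnerabilities.

```python
from enum import Enum


class ProofResult(Enum):
    PROVED = "proved"
    FAILED = "failed"
    NOT_ATTEMPTED = "not_attempted"  # property outside the provable class


def select_defense(proof: ProofResult, constraint_applicable: bool) -> str:
    """Pick one action for a function, following the order of the parts above."""
    if proof is ProofResult.PROVED:
        return "none"                  # property holds for all executions
    if proof is ProofResult.FAILED:
        return "targeted_change"       # modify the program, then re-attempt the proof
    if constraint_applicable:
        return "execution_constraint"  # e.g., SCFI
    return "artificial_diversity"      # e.g., SLX or BILR


# Hypothetical functions of a subject binary.
plan = {
    name: select_defense(proof, constraint_ok)
    for name, proof, constraint_ok in [
        ("parse_header", ProofResult.PROVED, True),
        ("copy_payload", ProofResult.FAILED, True),
        ("dispatch", ProofResult.NOT_ATTEMPTED, True),
        ("log_event", ProofResult.NOT_ATTEMPTED, False),
    ]
}
```

In this sketch a proved function receives no additional instrumentation, which is how proof results can reduce the overhead of the protected binary.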
The initial implementation platform is a Linux system
running on x86-64 (64-bit) hardware, with future plans to port
to other platforms of interest.
An overview of the system is shown in Fig. 1. The
processing of the system is defined by a security policy
specification that the user provides interactively. The policy
specification system uses overhead estimates obtained from
benchmarking experiments to assist in the selection of
defenses. The policy specification also defines whether the
defense should terminate execution upon defeating an
attempted exploit, or try to recover and resume execution.
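A policy of this kind could be represented roughly as follows. The sketch assumes that overhead estimates simply add up and uses a greedy, cheapest-first selection; both are assumptions made for illustration, as are the names `Policy`, `Defense` and `choose_defenses` and the overhead figures for SLX and BILR.

```python
from dataclasses import dataclass
from enum import Enum


class OnViolation(Enum):
    TERMINATE = "terminate"  # stop execution when an exploit attempt is defeated
    RECOVER = "recover"      # try to recover and resume execution


@dataclass(frozen=True)
class Defense:
    name: str
    est_overhead_pct: float  # run-time overhead estimate from benchmarking


@dataclass(frozen=True)
class Policy:
    max_overhead_pct: float
    on_violation: OnViolation


def choose_defenses(candidates, policy):
    """Keep defenses, cheapest first, while the summed estimate fits the budget."""
    chosen, total = [], 0.0
    for d in sorted(candidates, key=lambda d: d.est_overhead_pct):
        if total + d.est_overhead_pct <= policy.max_overhead_pct:
            chosen.append(d.name)
            total += d.est_overhead_pct
    return chosen, total


policy = Policy(max_overhead_pct=8.0, on_violation=OnViolation.TERMINATE)
candidates = [
    Defense("BILR", 6.0),  # hypothetical estimate
    Defense("SCFI", 3.0),  # basic SCFI average reported in Section IV
    Defense("SLX", 4.0),   # hypothetical estimate
]
chosen, total = choose_defenses(candidates, policy)
```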
Fig. 1. System operational overview.
The original binary program is read by a static analyzer
(STARS), disassembled and analyzed to provide information
needed by the rest of the system. The original program and the
results of analysis are stored in the Intermediate Representation
Database (IRDB). Next, proofs are attempted on a variety of
security-related theorems. The results of the proof attempts are
returned to STARS and used to determine the necessary
defenses. Defenses are selected from the available constrained
execution and artificial diversity sets. Finally, transformations
that will implement the necessary defenses are defined and
applied with the Zipr binary rewriter. In concert with the
enhancements to the subject software, a rigorous security case
is developed.
II. STATIC ANALYSIS
The STARS static analyzer reads the binary file and:
• Disassembles the binary program (using IDA Pro).
• Produces a variety of analyses of the binary program
  including the control-flow and data-flow analyses,
  stack-frame analyses, recovered control structures
  from the original source program, invariants for the
  various loops, and security-related function pre- and
  post-conditions.
• Performs initial vulnerability analyses that are
  improved in a second pass using the proof results.
• Enters the binary program into the Intermediate
  Representation Database (IRDB), a Postgres database.
  Each instruction in the program is stored as a separate
  record.
• Translates the binary program into SPARK Ada [2]
  including a translation of data structures into Ada
  syntax, and development of security-based SPARK
  constraints (pre- and post-conditions, invariants, etc.).
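Storing each instruction as a separate database record can be pictured with the sketch below. It uses Python's built-in `sqlite3` module in place of Postgres so that it runs standalone, and the table and column names are hypothetical; the actual IRDB schema is not given here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE instruction (
        id          INTEGER PRIMARY KEY,
        function    TEXT NOT NULL,
        address     INTEGER NOT NULL,  -- address in the original binary
        disassembly TEXT NOT NULL,
        fallthrough INTEGER REFERENCES instruction(id),
        target      INTEGER REFERENCES instruction(id)
    )
    """
)

# One record per instruction of a hypothetical function "f".
rows = [
    (1, "f", 0x401000, "push rbp", 2, None),
    (2, "f", 0x401001, "mov rbp, rsp", 3, None),
    (3, "f", 0x401004, "pop rbp", 4, None),
    (4, "f", 0x401005, "ret", None, None),
]
conn.executemany("INSERT INTO instruction VALUES (?, ?, ?, ?, ?, ?)", rows)


def instructions_of(function_name):
    """Return the disassembly of one function in original address order."""
    cur = conn.execute(
        "SELECT disassembly FROM instruction WHERE function = ? ORDER BY address",
        (function_name,),
    )
    return [row[0] for row in cur.fetchall()]
```

Linking records through `fallthrough` and `target` rather than through addresses is one way to let later transformations move or insert instructions without renumbering everything.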
III. PROOF SYSTEM
The output of static analysis includes the entire binary
program translated into SPARK Ada [2]. SPARK Ada is used
as an intermediate representation of the binary program,
because the language supports specification of security
properties and is supported by a high-quality proof
infrastructure, the SPARK Pro analysis tools [6].
Security properties of interest about each function in the
software are defined as putative theorems and proofs of the
theorems are attempted. Current properties stated include: (a)
integrity of the stack pointer, (b) freedom from changes to the
return address referenced by the stack pointer, and (c) the
inability to call setuid() with an argument of zero.
If a theorem can be proved, then the associated function has
the desired property for all possible executions. Thus,
vulnerabilities such as buffer overflows do not exist in the
function and no additional security protection is required for
those vulnerabilities.
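To give a feel for property (a), the sketch below checks that a toy, straight-line function leaves the stack pointer where it found it when `ret` executes. It is a much weaker stand-in for a SPARK proof: it follows a single path, supports only a handful of hypothetical pseudo-operations, and the function names are illustrative.

```python
WORD = 8  # bytes per stack slot on x86-64


def stack_delta_at_ret(instructions):
    """Return the stack-pointer offset from its entry value when `ret` is reached.

    Handles straight-line code only; returns None if no `ret` is found.
    """
    delta = 0
    for op, *args in instructions:
        if op == "push":
            delta -= WORD      # the stack grows downward
        elif op == "pop":
            delta += WORD
        elif op == "sub_rsp":
            delta -= args[0]
        elif op == "add_rsp":
            delta += args[0]
        elif op == "ret":
            return delta
    return None


def preserves_stack_pointer(instructions):
    """True if the stack pointer is back at its entry value at `ret`."""
    return stack_delta_at_ret(instructions) == 0


balanced = [("push",), ("sub_rsp", 32), ("add_rsp", 32), ("pop",), ("ret",)]
unbalanced = [("push",), ("sub_rsp", 32), ("add_rsp", 24), ("pop",), ("ret",)]
```

A proof covers all possible executions, including loops and data-dependent stack adjustments, which this path-based check does not attempt.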
If the theorem cannot be proved, then the function might
contain one or more vulnerabilities. By examining the reason
for the proof failure, the location within the function that needs
to be enhanced (so as to make the proof possible) can often be
identified and suitable enhancements defined for the software.
The results of the proof attempts are returned to STARS for
use in customizing the protections that will be applied to the
subject program.
Multiple binary programs ranging in size from 11
instructions to over 2,000 have been converted into SPARK
Ada and several security properties proven for each. Each
original machine instruction is usually translated into 1 to 4
SPARK statements. We have also examined binary programs
with: (a) different numbers and types of data structures,
control structures, function calls and nested loops, and (b)
increasingly complex security properties, such as passing
references to buffers of variable size and scaled and indirect
memory accesses. At present, proof times have ranged from 6
to approximately 600 seconds, with an average time of 46
seconds.
These proof activities stimulated the development of
heuristics for automating the translation process and improving
the speed of the proof process. In addition, preliminary
scalability analysis has been performed to identify heuristics
that have manageable growth curves. For example, various
heuristics have enabled the proof of a MiBench benchmark
containing 8 loops, one of which contains 4 nested loops, in
approximately 13 seconds. Initially, we were unable to prove a
complete version of the program, because the prover timed out
after 10,000 seconds.
IV. EXECUTION CONSTRAINTS
Execution constraints ensure that certain undesirable
execution sequences are either blocked or detected.
[Fig. 1 labels: Original Binary File (untrusted); STARS Static Analyzer; Intermediate Representation Database (IRDB); SPARK Ada Representation; Security Theorems; SPARK Pro Proof System; Proof Results; Policy Specification System; Policy Specification; Overhead Estimates; User Specified Security Enhancements; Zipr Binary Rewriter; Transformed Binary File (protected); Evidence Collection; Security Case Manager; GSN Argument Editor; Security Case; System Stakeholders. Groupings: Zephyr Security Toolkit; Dependable Computing Security Case Toolkit.]
The major execution constraint that is implemented is
called Selective Control Flow Integrity (SCFI). SCFI limits
control transfers to legitimate targets. It is called selective,
because, unlike other binary-only control flow integrity
techniques, not all indirect branches need to be instrumented.
Instead, we can choose not to instrument indirect branches that
have been proven to be secure.
Our basic SCFI allows an indirect branch to jump to any
instruction that might be an indirect branch target. Basic SCFI
incurs 3% additional run-time overhead on average and
provides a 99% reduction in the attack surface.
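The basic check can be sketched as a membership test against a single set of permitted targets. The names below are hypothetical, and the measure of attack-surface reduction (the fraction of instruction addresses that are no longer reachable through an indirect branch) is an assumption made for illustration, since the metric is not defined here.

```python
class ControlFlowViolation(Exception):
    """Raised when an indirect branch aims at an address that is not permitted."""


def make_basic_scfi_check(allowed_targets):
    """Build a check shared by every instrumented indirect branch."""
    allowed = frozenset(allowed_targets)

    def check(target):
        if target not in allowed:
            raise ControlFlowViolation(hex(target))
        return target

    return check


def surface_reduction(total_addresses, allowed_count):
    """Fraction of addresses removed from the set of reachable branch targets."""
    return 1 - allowed_count / total_addresses


check = make_basic_scfi_check({0x401000, 0x401020})
```

With hypothetical figures of 100,000 instruction addresses and 1,000 permitted targets, `surface_reduction(100_000, 1_000)` is about 0.99, which is the scale of reduction reported above.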
We have a refined version of SCFI that analyzes indirect
branches more thoroughly. This refined version detects that
some indirect branches use a jump table (as is often used in
implementing a switch/case statement in high-level languages).
It also detects call sites in the program, and limits individual
return instructions to a specific set of targets.
These improvements in granularity help significantly. For
example, in the programs we have analyzed, more than 60% of
returns are fully analyzed. These refinements leave the
overhead unchanged while reducing the attack surface for
return address hijacking by several orders of magnitude.
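One way to picture the per-return refinement: a return from a function may only land on the instruction that follows one of the call sites of that function. The sketch below assumes direct calls with known lengths, and all names and addresses are hypothetical.

```python
def return_targets(call_sites):
    """Map each callee to the set of addresses its returns may go to.

    call_sites: iterable of (call_address, call_length, callee_name).
    """
    targets = {}
    for address, length, callee in call_sites:
        # The legitimate return address is the instruction after the call.
        targets.setdefault(callee, set()).add(address + length)
    return targets


def return_allowed(targets, callee, return_address):
    """True if `callee` is permitted to return to `return_address`."""
    return return_address in targets.get(callee, set())


calls = [
    (0x1000, 5, "parse"),
    (0x1040, 5, "parse"),
    (0x1080, 5, "log"),
]
targets = return_targets(calls)
```

Compared with one shared target set for the whole program, each return instruction now has only a few permitted destinations, which is where the additional reduction for return-address hijacking comes from.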
Typically, when a control flow violation is detected,
continued execution is not possible since no valid target is
available. However, since SCFI limits targets so effectively, we
can choose a reasonable target from the set. We have
developed a simple algorithm for attempting to continue
execution properly and are in the process of evaluating its
effectiveness.
V. ARTIFICIAL DIVERSITY
Cyber attacks can be made more difficult by randomizing
a characteristic of the target program. Randomization does not
eliminate vulnerabilities; it merely changes information that the
adversary previously knew thereby denying the adversary
information critical for the exploit. Such randomization is
referred to as artificial diversity [3]. At present, the system
applies two artificial-diversity techniques: Stack Layout
Transformation and Block-Level Instruction Layout
Randomization.
Stack Layout Transformation (SLX) is designed to thwart
stack-based buffer overflow attacks, including intra-frame
overflows and non-control data attacks [10]. SLX includes
three separate transforms that are applied to the stack frames of
individual functions: (a) randomizing the order of local
variables, (b) adding random-length padding between
variables, and (c) placing canaries, i.e., random values that are
checked periodically during execution.
To effect these transforms requires that: (1) the stack layout
be redefined, (2) instructions that access the stack be modified
to use the redefined layout, and (3) instructions be inserted to
set and check canary values during execution.
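The three transforms and requirement (1) can be sketched as a function that computes a new frame layout. The function name, the padding bound and the choice of one canary after every variable are assumptions for illustration; rewriting the stack-accessing instructions and inserting canary checks, requirements (2) and (3), are not shown.

```python
import random


def transform_layout(variables, rng, max_pad=16, canary_size=8):
    """Compute a randomized layout for one stack frame.

    variables: list of (name, size_in_bytes).
    Returns ({name: offset}, [canary offsets], frame_size); offsets are byte
    distances from the start of the local-variable area.
    """
    order = list(variables)
    rng.shuffle(order)                           # (a) randomize variable order
    layout, canaries, offset = {}, [], 0
    for name, size in order:
        offset += rng.randrange(0, max_pad + 1)  # (b) random-length padding
        layout[name] = offset
        offset += size
        canaries.append(offset)                  # (c) canary right after the variable
        offset += canary_size
    return layout, canaries, offset


frame_vars = [("buf", 64), ("len", 8), ("flag", 4)]
layout, canaries, frame_size = transform_layout(frame_vars, random.Random(7))
```

An overflow of `buf` in the transformed frame first runs into padding and a canary rather than into a neighbor whose position the attacker knows.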
Block-Level Instruction Layout Randomization (BILR)
randomizes the order of code at the basic-block level and is
conceptually similar to Instruction Location Randomization
(ILR) [5]. The reordering can be based on a variety of
objectives such as best fit, maximization of the distance
between originally adjacent blocks, etc. BILR can significantly
help with some types of arc-injection and ROP-style attacks.
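Reordering basic blocks is only safe if implicit fall-through edges are made explicit first. The sketch below does that on a toy representation and then shuffles the blocks uniformly at random; keeping the entry block first and using a plain shuffle instead of a best-fit or distance-maximizing objective are simplifying assumptions, and the names are hypothetical.

```python
import random


def randomize_blocks(blocks, rng):
    """Shuffle basic blocks while preserving control flow.

    blocks: list of (label, instructions) in original order; entry block first.
    """
    terminators = ("jmp", "ret")  # instructions that never fall through
    patched = []
    for i, (label, instrs) in enumerate(blocks):
        instrs = list(instrs)
        falls_through = not instrs or not instrs[-1].startswith(terminators)
        if falls_through and i + 1 < len(blocks):
            # Make the fall-through edge explicit so the block can move.
            instrs.append(f"jmp {blocks[i + 1][0]}")
        patched.append((label, instrs))
    entry, rest = patched[0], patched[1:]
    rng.shuffle(rest)
    return [entry] + rest


original = [
    ("B0", ["cmp rdi, 0", "je B2"]),
    ("B1", ["inc rax"]),
    ("B2", ["ret"]),
]
shuffled = randomize_blocks(original, random.Random(3))
```

The added jumps are one source of the size and run-time overhead that a rewriter has to keep small.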
VI. STATIC BINARY REWRITING
Zipr is a general purpose, static binary rewriter that uses a
novel reassembly technique to generate compact, transformed
versions of original programs (statically or dynamically linked)
and libraries, relying only on access to their binary code for
rewriting [4].
In order to guarantee the proper execution of rewritten
programs and libraries in all cases, other static binary rewriters
such as SecondWrite [1] encapsulate a copy of the original
program within the rewritten version and incur an overhead of
at least 100% in disk size. Zipr does not require such
duplication, because of the development of a new reassembly
technique. Programs in the SPEC2006 benchmark suite
rewritten using Zipr increase from their native size by an
average of only 4%, and Zipr-rewritten binaries carry an
average performance overhead of only 5%, a significant
improvement on the state of the art.
For the system described in this paper, Zipr's most
important feature is its plugin architecture that allows users to
control the rewriting algorithm. Through plugins, Zipr users
can direct the rewriter to add the security features and defenses
described herein. For example, there are existing Zipr plugins
for BILR and stack-layout transformation.
VII. SECURITY CASE
Overhead estimates guide the user’s selection of security
techniques that the system is to apply. But a trade-off is
required between the inevitable overhead incurred and the
efficacy of the resulting security protection. Different
techniques incur different amounts of overhead and provide
different forms of protection.
To allow stakeholders to assess the overall efficacy of the
security enhancements applied and to support the necessary
trade-off, the system includes support for the development and
analysis of a rigorous security case [9]. A security case
provides a comprehensive, valid and compelling argument that
a system is adequately secure for a given application in a given
environment. A security case documents the rationale for belief
that the system is adequately secure thereby facilitating
scrutiny of that rationale.
The overall organization of a security case is shown in Fig.
2. The security claim defines the goal of the analysis and
transformation of the subject binary program. Evidence is
collected during the enhancement process about the subject
software, the various analyses, and the transformations. A
rigorous argument is developed that is designed to compel
belief in the security claim based on the available evidence.
Arguments are documented graphically using the Goal
Structuring Notation (GSN) [7].
Fig. 2. Overall organization of a rigorous security case.
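The relationship between claim, argument and evidence can be sketched as a small goal tree with a check for claims that nothing supports. This is a simplification: GSN also has strategy, context and assumption elements, and the class name, claims and evidence labels below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    claim: str
    evidence: list = field(default_factory=list)  # evidence items cited directly
    subgoals: list = field(default_factory=list)  # supporting sub-claims


def unsupported(goal):
    """Return the claims of leaf goals that cite no evidence."""
    if not goal.subgoals:
        return [] if goal.evidence else [goal.claim]
    missing = []
    for sub in goal.subgoals:
        missing.extend(unsupported(sub))
    return missing


case = Goal(
    "The protected binary is adequately secure against control-flow hijacking",
    subgoals=[
        Goal("Stack pointer integrity holds in proved functions",
             evidence=["proof results"]),
        Goal("Unproved indirect branches are constrained",
             evidence=["SCFI instrumentation report"]),
        Goal("Residual risk is acceptable to stakeholders"),
    ],
)
gaps = unsupported(case)
```

A check like this only finds structural gaps; whether the cited evidence actually compels belief in a claim remains a judgment for the stakeholders.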
VIII. FUTURE WORK
Planned future work for the system described includes:
• Formal verification: Planned improvements to the
  formal verification system include: (a) increasing the
  size of programs that can be automatically translated
  and analyzed to a target of 1,000,000 instructions, (b)
  keeping analysis and proof times at practical levels, and
  (c) complementing the proof technology being
  developed with a “repair” mechanism that will insert
  guard instructions into the program automatically
  when the initial proof fails.
• Binary rewriting: Planned improvements to the static
  binary rewriter include reduction of the memory
  overhead of Zipr by development of more
  sophisticated code placement and code packing
  algorithms.
• Continued execution: Planned improvements to
  continued execution include recovery from memory
  overwriting exploits by developing techniques to
  restore corrupted program state beyond the shadowed
  values (return addresses, function pointers, critical
  arguments) that are restored in the current
  implementation.
• Defenses: SCFI will be extended to cross-file control
  flow (e.g., among the main executable file and the
  shared objects), and the granularity of SCFI will be
  further improved with improvements in the static
  analyses in STARS.
IX. SUMMARY
The use of binary software for which source code and
development artifacts are unavailable has become widespread
and is difficult to avoid, even in embedded systems. Important
properties, such as critical security properties, cannot be
assured because of the lack of necessary development
documentation.
The system described is designed to provide substantial
security hardening of embedded software when only the binary
form of the software is available. Formal verification is used to
identify locations where either additional hardening is not
required or where security enhancements can be targeted.
Execution constraints and artificial diversity are applied to
supplement the targeted security enhancements, and a static
binary rewriter applies all of the necessary modifications to the
binary program.
Raytheon Integrated Defense Systems has conducted a
preliminary Red Team attack on specimen software to which
the described system had been applied. The SCFI defense
initially defeated most hijacking attacks and defeated all
in-scope hijacking exploits after the attack surface reduction
for return addresses was improved.
ACKNOWLEDGMENTS
The authors gratefully acknowledge the technical support
provided to this project by AdaCore.
REFERENCES
[1] K. Anand, M. Smithson, A. Kotha, K. Elwazeer, R. Barua,
“Decompilation to compiler high IR in a binary rewriter,” Tech. rep.,
University of Maryland, November 2010,
http://www.ece.umd.edu/~barua/high-IR-technical-report10.pdf
[2] J. Barnes, SPARK: The Proven Approach to High Integrity Software.
Altran Praxis, 2012.
[3] J. Davidson, J. Hiser, A. Nguyen-Tuong, M. Co, B. Rodes, J. Knight,
“Security protection of binary programs,” IET System Safety and Cyber
Security Conference, Bristol, UK, 2015.
[4] W. Hawkins, J. Hiser, A. Nguyen-Tuong, C. Coleman, J. Davidson,
“Efficient static binary rewriting,” Department of Computer Science
Technical Report CS2015-5, University of Virginia, 2015.
[5] J. Hiser, A. Nguyen-Tuong, M. Co, M. Hall, J. Davidson, “ILR: where’d
my gadgets go?,” IEEE Symposium on Security and Privacy, 2012, pp.
571-585.
[6] A. Hocking, B. Rodes, J. Knight, J. Davidson, and C. Coleman, “A proof
infrastructure for binary programs,” in submission to NASA Formal
Methods (NFM), 2016.
[7] T. Kelly and R. Weaver, “The goal structuring notation: a safety
argument notation,” in Dependable Systems and Networks Workshop on
Assurance Cases, Florence, Italy, 2004.
[8] B. Rodes and J. Knight, “Speculative software modification and its use
in securing SOUP,” in the European Dependable Computing
Conference, Newcastle upon Tyne, UK, 2014, pp. 210-221.
[9] B. Rodes, J. Knight, A. Nguyen-Tuong, J. Hiser, M. Co, J. Davidson, “A
case study of security case development,” 23rd Safety-critical Systems
Symposium, Bristol, UK, 2015.
[10] B. Rodes, A. Nguyen-Tuong, J. Hiser, J. Knight, M. Co, J. Davidson,
“Defense against stack-based attacks using speculative stack layout
transformation,” Third International Conference on Runtime
Verification, Istanbul, Turkey, 2012, pp. 308-313.
[Fig. 2 elements: Security Claim; Context Within Which Claim Is Made; Assumptions Made About The System; Rigorous Argument Linking The Body of Evidence to the Security Claim; Body of Evidence About The System Including Details Of Transformations, Analysis, Testing.]