Figure 2 - uploaded by Jason Hiser
Source publication
Current software development methodologies and practices, while enabling the production of large complex software systems, can have a serious negative impact on software quality. These negative impacts include excessive and unnecessary software complexity, higher probability of software vulnerabilities, diminished execution performance in both time...
Citations
... Helix++'s binary rewriting is based on Zipr. (15)(16)(17)(18)(19)(20) Zipr's core functionality supports block-level instruction layout randomization (BILR), similar to Zhan et al. (21) Zipr achieves this functionality by performing deep binary analysis and building an IR. ...
The open-source Helix++ project improves the security posture of computing platforms by applying cutting-edge cybersecurity techniques to diversify and harden software automatically. A distinguishing feature of Helix++ is that it does not require source code or build artifacts; it operates directly on software in binary form, even stripped executables and libraries. This feature is key, as rebuilding applications from source is a time-consuming and often frustrating process. Diversification breaks the software monoculture and makes attacks harder to execute, as the information needed for a successful attack will have changed unpredictably. Diversification also forces attackers to customize an attack for each target instead of crafting an exploit that works reliably on all similarly configured targets. Hardening directly targets key attack classes. The combination of diversity and hardening provides defense-in-depth, as well as a moving target defense, to secure the Nation's cyber infrastructure.
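To picture the block-level instruction layout randomization attributed to Zipr in the context snippet above, here is a minimal, purely illustrative sketch. The BasicBlock type, the randomize_layout helper, and the textual "jmp" patching are assumptions made for illustration, not Zipr's IR or API; the point is only that basic blocks can be placed in a new order as long as every broken fall-through edge is reconnected with an explicit jump.

```python
import random
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class BasicBlock:
    """Toy basic block: a label, its instructions, and the label of the block it
    originally fell through to (None if it ends in an unconditional transfer)."""
    label: str
    instructions: List[str]
    fallthrough: Optional[str] = None

def randomize_layout(blocks: List[BasicBlock], seed: Optional[int] = None) -> List[str]:
    """Shuffle block placement while preserving control flow: any fall-through edge
    whose successor is no longer adjacent gets an explicit jump inserted."""
    rng = random.Random(seed)
    entry, rest = blocks[0], blocks[1:]   # keep the entry block first
    rng.shuffle(rest)
    order = [entry] + rest

    listing = []
    for i, blk in enumerate(order):
        listing.append(f"{blk.label}:")
        listing.extend(f"    {ins}" for ins in blk.instructions)
        next_label = order[i + 1].label if i + 1 < len(order) else None
        if blk.fallthrough is not None and blk.fallthrough != next_label:
            listing.append(f"    jmp {blk.fallthrough}")   # reconnect the broken fall-through
    return listing

if __name__ == "__main__":
    blocks = [
        BasicBlock("entry", ["cmp eax, 0", "je done"], fallthrough="loop"),
        BasicBlock("loop",  ["dec eax", "jne loop"],   fallthrough="done"),
        BasicBlock("done",  ["ret"]),
    ]
    print("\n".join(randomize_layout(blocks, seed=7)))
```

Running the example prints the three blocks in a shuffled order, with a jmp inserted wherever a fall-through successor is no longer physically adjacent.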
... While relocating direct (i.e., absolute and PC-relative) control flow is generally trivial, attempting to do so for indirect transfers is undecidable and risks corrupting the resulting binary, as their respective targets cannot be identified with any generalizable accuracy [39,54]. ZAFL addresses this challenge conservatively via address pinning [25,27], which "pins" any unmovable items (including but not limited to indirectly-called function entries, callee-to-caller return targets, data, or items that cannot be precisely disambiguated as being either code or data) to their original addresses, while safely relocating the remaining movable items around these pins (often via chained jumps). Though address pinning will likely over-approximate the set of unmovable items at a slight cost to binary performance and/or space efficiency (particularly for exceedingly complex binaries with an abundance of jump tables, handwritten assembly, or data-in-code), its general-purpose soundness, speed, and scalability [38] make it promising for facilitating coverage-preserving CGT. ...
Coverage-guided fuzzing's aggressive, high-volume testing has helped reveal tens of thousands of software security flaws. While executing billions of test cases mandates fast code coverage tracing, the nature of binary-only targets leads to reduced tracing performance. A recent advancement in binary fuzzing performance is Coverage-guided Tracing (CGT), which brings orders-of-magnitude gains in throughput by restricting the expense of coverage tracing to only when new coverage is guaranteed. Unfortunately, CGT suits only a basic-block coverage granularity, yet most fuzzers require finer-grain coverage metrics: edge coverage and hit counts. It is this limitation that prohibits nearly all of today's state-of-the-art fuzzers from attaining the performance benefits of CGT. This paper tackles the challenges of adapting CGT to fuzzing's most ubiquitous coverage metrics. We introduce and implement a suite of enhancements that expand CGT's introspection to fuzzing's most common code coverage metrics, while maintaining its orders-of-magnitude speedup over conventional always-on coverage tracing. We evaluate their trade-offs with respect to fuzzing performance and effectiveness across 12 diverse real-world binaries (8 open- and 4 closed-source). On average, our coverage-preserving CGT attains near-identical speed to the present block-coverage-only CGT, UnTracer; and outperforms leading binary- and source-level coverage tracers QEMU, Dyninst, RetroWrite, and AFL-Clang by 2-24x, finding more bugs in less time.
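The address pinning summarized in the context snippet above can be sketched at a very high level, assuming a simplified flat model of a binary's items. The Item type, the pin_and_relocate helper, and the recorded "stub" entries are illustrative assumptions, not ZAFL's or Zipr's actual data structures: pinned items keep their original addresses, movable items are packed into a fresh region, and each moved item conceptually leaves a redirect at its old location.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Item:
    """Toy layout item: an address range that is either pinned (unmovable) or movable."""
    name: str
    orig_addr: int
    size: int
    movable: bool

def pin_and_relocate(items: List[Item], relocation_base: int) -> Tuple[Dict[str, int], Dict[int, int]]:
    """Keep pinned items at their original addresses; place movable items into a new
    region starting at relocation_base and record a jump stub at each old location."""
    layout = {}   # name -> address the item ends up at
    stubs = {}    # orig_addr -> new address the stub redirects to
    cursor = relocation_base
    for item in sorted(items, key=lambda i: i.orig_addr):
        if item.movable:
            layout[item.name] = cursor
            stubs[item.orig_addr] = cursor   # e.g. a 'jmp new_addr' left behind
            cursor += item.size
        else:
            layout[item.name] = item.orig_addr   # pinned: its address must not change
    return layout, stubs

if __name__ == "__main__":
    items = [
        Item("indirect_call_target", 0x1000, 0x40, movable=False),  # pinned
        Item("plain_block_a",        0x1040, 0x20, movable=True),
        Item("jump_table_data",      0x1060, 0x10, movable=False),  # pinned
        Item("plain_block_b",        0x1070, 0x30, movable=True),
    ]
    layout, stubs = pin_and_relocate(items, relocation_base=0x20000)
    for name, addr in layout.items():
        print(f"{name:22s} -> {addr:#x}")
    for old, new in stubs.items():
        print(f"stub at {old:#x}: jmp {new:#x}")
```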
... It lifts binary code into LLVM IR, then translates the IR back to machine code after adding the new code. Zipr [24] and Zipr++ [25] are designed in a similar way to enforce the security of the original code. Dyninst [6] disassembles the binary function and extracts its control flow graph, then inserts new basic blocks into the graph. ...
The security of binary programs is significantly threatened by software vulnerabilities. When vulnerabilities are found, those applications are exposed to malicious attacks which exploit the known vulnerabilities. Thus, it is necessary to patch them as soon as possible once vulnerabilities are reported to the public. However, locating and correcting the corresponding defective code in binary programs still relies heavily on manual work. In order to raise productivity and ensure software security, it becomes imperative to automate the process. In this paper, we propose BINPATCH to automatically patch known vulnerabilities in binary programs. It first locates the defective function, which contains the vulnerability, via similar code comparison. Then, it reuses the corresponding code from the correct version of the defective function as the patch code and inserts it into the defective function via binary rewriting. BINPATCH is evaluated on eight real-world vulnerabilities, and the experimental results show that it is able to not only locate the defective code effectively, but also patch the code correctly.
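As a rough illustration of the first BINPATCH stage described above (locating the defective function via similar code comparison), the sketch below scores candidate functions against a known-vulnerable one by comparing instruction mnemonic sequences with Python's difflib. The function names, mnemonic lists, and the use of SequenceMatcher are toy assumptions, not the paper's actual similarity analysis; the later stages (extracting the fixed code and splicing it in via binary rewriting) are only indicated in comments.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

def most_similar_function(target_insns: List[str],
                          candidates: Dict[str, List[str]]) -> Tuple[str, List[str]]:
    """Stage 1 (toy): find the function in the target binary most similar to a
    known-vulnerable function, by comparing instruction mnemonic sequences."""
    def score(insns: List[str]) -> float:
        return SequenceMatcher(None, target_insns, insns).ratio()
    return max(candidates.items(), key=lambda kv: score(kv[1]))

if __name__ == "__main__":
    vulnerable = ["push", "mov", "call strcpy", "leave", "ret"]
    binary_functions = {
        "parse_header": ["push", "mov", "call strncpy", "leave", "ret"],
        "log_message":  ["push", "sub", "call printf", "add", "ret"],
        "copy_field":   ["push", "mov", "call strcpy", "leave", "ret"],
    }
    name, _ = most_similar_function(vulnerable, binary_functions)
    print(f"most likely defective function: {name}")
    # Stage 2/3 (not shown): take the corresponding code from the fixed version of
    # that function and splice it into the target binary with a binary rewriter.
```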
Recurrent neural networks are increasingly employed in safety-critical applications, such as control in cyber-physical systems, and therefore their verification is crucial for guaranteeing reliability and correctness. We present a novel approach for verifying the dynamic behavior of long short-term memory networks (LSTMs), a popular type of recurrent neural network (RNN). Our approach employs the satisfiability modulo theories (SMT) solver iSAT, which solves complex Boolean combinations of linear and non-linear constraint formulas (including transcendental functions), and it is therefore able to verify safety properties of these networks. Keywords: formal verification, recurrent neural networks, LSTM, SMT solving, iSAT
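For context, the non-linear and transcendental constraints mentioned in this abstract arise from the activations of the standard LSTM cell update, commonly written as below. This is the textbook formulation, not necessarily the exact encoding the paper feeds to iSAT:

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i), &
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o), &
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c), \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, &
h_t &= o_t \odot \tanh(c_t), \qquad \sigma(z) = \tfrac{1}{1 + e^{-z}}.
\end{aligned}
```

The sigmoid and tanh terms are what make the resulting constraint formulas non-linear and transcendental, which is why a solver such as iSAT, rather than a purely linear SMT theory, is required.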
Enterprise software updates depend on the interaction between user and developer organizations. This interaction becomes especially complex when a single developer organization writes software that services hundreds of different user organizations. Miscommunication during patching and deployment efforts leads to insecure or malfunctioning software installations. While developers oversee the code, the update process starts and ends outside their control. Since developer test suites may fail to capture buggy behavior, finding and fixing these bugs starts with user-generated bug reports and third-party disclosures. The process ends when the fixed code is deployed in production. Any friction between user and developer results in a delay in patching critical bugs.
Two common causes of friction are a failure to replicate the user-specific circumstances that cause buggy behavior and incompatible software releases that break critical functionality. Existing test generation techniques are insufficient: they fail to test candidate patches against post-deployment bugs and to test whether a new release adversely affects customer workloads. With existing test generation and deployment techniques, users cannot choose (nor validate) compatible portions of new versions while retaining their previous version's functionality.
We present two new technologies to alleviate this friction. First, Test Generation for Ad Hoc Circumstances transforms buggy executions into test cases. Second, Binary Patch Decomposition allows users to select the compatible pieces of update releases. By sharing specific context around buggy behavior, developers can create specific test cases that demonstrate whether their fixes are appropriate. When fixes are distributed with this extra context, users can incorporate only the updates that guarantee compatibility between buggy and fixed versions.
We use change analysis in combination with binary rewriting to transform the old executable and the buggy execution into a test case that includes the developer's prospective changes, which lets us generate and run targeted tests for the candidate patch. We also provide analogous support to users, allowing them to selectively validate and patch their production environments with only the desired bug fixes from new version releases.
This paper presents a new patching workflow that allows developers to validate prospective patches and users to select which updates they would like to apply, along with two new technologies that make it possible. We demonstrate that our technique constructs test cases more effectively and more efficiently than traditional test generation on a collection of real-world bugs, and that it provides the ability for flexible updates in real-world scenarios.
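The Binary Patch Decomposition idea described above can be pictured with a deliberately simplified sketch: treat an update as a set of per-function patch units and let the user apply only the units they trust. The function-level granularity, the dictionary representation, and the string stand-ins for code are assumptions made for illustration; the real system operates on binaries using change analysis and binary rewriting.

```python
from typing import Dict, Iterable

def decompose_patch(old_funcs: Dict[str, str], new_funcs: Dict[str, str]) -> Dict[str, str]:
    """Split a whole-release update into independent per-function patch units
    (only the functions whose code actually changed)."""
    return {
        name: new_funcs[name]
        for name in new_funcs
        if name not in old_funcs or old_funcs[name] != new_funcs[name]
    }

def apply_selected(old_funcs: Dict[str, str], patch_units: Dict[str, str],
                   selected: Iterable[str]) -> Dict[str, str]:
    """Apply only the patch units the user opted into; everything else keeps the
    old version's code, preserving functionality the user depends on."""
    patched = dict(old_funcs)
    for name in selected:
        patched[name] = patch_units[name]
    return patched

if __name__ == "__main__":
    old = {"parse": "v1-code", "render": "v1-code", "auth": "v1-code"}
    new = {"parse": "v2-code", "render": "v1-code", "auth": "v2-code"}
    units = decompose_patch(old, new)                      # {'parse': ..., 'auth': ...}
    print(apply_selected(old, units, selected=["auth"]))   # take only the auth fix
```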
Code diversification techniques are a popular defense against code-reuse attacks. The majority of code diversification research focuses on analyzing non-functional properties, such as whether the technique improves security. This paper provides a methodology to verify functional equivalence between the original and a diversified binary. We present a formal notion of binary equivalence that is resilient to diversification. Moreover, we present an algorithm that checks whether two binaries, one original and one diversified, satisfy that notion of equivalence. The purpose of our work is to allow untrusted diversification techniques in a safety-critical context. We apply the methodology to three state-of-the-art diversification techniques used on the GNU Coreutils package. Overall, the results show that our method can prove functional equivalence for 85,315 functions in the analyzed binaries.
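As a generic illustration of what such an equivalence notion has to capture (this is not the paper's precise definition, which must additionally abstract away diversification-induced differences such as relocated code addresses), functional equivalence of an original binary P and its diversified counterpart P' can be stated as equality of observable behavior on every input:

```latex
\forall\, i \in \mathit{Inputs}:\quad \mathrm{obs}\big(P(i)\big) \;=\; \mathrm{obs}\big(P'(i)\big)
```

Here obs projects an execution onto its observable effects (outputs, exit status, externally visible state) and deliberately ignores internal details, such as instruction addresses, that diversification is allowed to change.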
Binary rewriting is the process of changing the semantics of a program without having the source code at hand. It is used for diverse purposes, such as emulation (e.g., QEMU), optimization (e.g., DynInst), observation (e.g., Valgrind), and hardening (e.g., control-flow integrity enforcement). This survey gives detailed insight into the development and state of the art of binary rewriting by reviewing 67 publications from 1966 to 2018. Starting from these publications, we provide an in-depth investigation of the challenges and respective solutions involved in accomplishing binary rewriting. Based on our findings, we establish a thorough categorization of binary rewriting approaches with respect to their use case, applied analysis technique, code-transformation method, and code-generation technique. We contribute a comprehensive mapping between binary rewriting tools, applied techniques, and their domain of application. Our findings emphasize that, although much work has been done over the past decades, most of the effort went into improvements aimed at rewriting general-purpose applications, ignoring other challenges such as altering throughput-oriented programs or software with real-time requirements, which are often used in the emerging field of the Internet of Things. To the best of our knowledge, our survey is the first comprehensive overview of the complete binary rewriting process.