Conference Paper

Revery: From Proof-of-Concept to Exploitable


Abstract

Automatic exploit generation is an open challenge. Existing solutions usually explore in depth the crashing paths, i.e., paths taken by proof-of-concept (POC) inputs triggering vulnerabilities, and generate exploits when exploitable states are found along the paths. However, exploitable states do not always exist in crashing paths. Moreover, existing solutions heavily rely on symbolic execution and are not scalable in path exploration and exploit generation. In addition, few solutions could exploit heap-based vulnerabilities. In this paper, we propose a new solution, Revery, to search for exploitable states in paths diverging from crashing paths, and generate control-flow hijacking exploits for heap-based vulnerabilities. It adopts three novel techniques: (1) a digraph to characterize a vulnerability's memory layout and its contributor instructions; (2) a fuzz solution to explore diverging paths, which have memory layouts similar to those of the crashing paths, in order to search for more exploitable states and generate corresponding diverging inputs; (3) a stitch solution to stitch crashing paths and diverging paths together, and synthesize EXP inputs able to trigger both vulnerabilities and exploitable states. We have developed a prototype of Revery based on the binary analysis engine angr, and evaluated it on a set of 19 real-world CTF (capture the flag) challenges. Experiment results showed that it could generate exploits for 9 (47%) of them, and generate EXP inputs able to trigger exploitable states for another 5 (26%) of them.
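As a rough illustration of technique (1), a memory layout can be modeled as a directed graph whose nodes are memory objects and whose edges are points-to relations, each annotated with the instructions that created it. The sketch below is only an assumption about how such a structure might look: the class name `LayoutGraph`, its methods, and the Jaccard-based `layout_similarity` metric are hypothetical and are not taken from the Revery prototype.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class LayoutGraph:
    """Directed graph: nodes are memory objects, edges are points-to relations."""

    # edges[src][dst] -> set of instruction addresses that contributed the edge
    edges: dict = field(default_factory=lambda: defaultdict(dict))

    def add_pointer(self, src: str, dst: str, contributor: int) -> None:
        """Record that `contributor` made object `src` point to object `dst`."""
        self.edges[src].setdefault(dst, set()).add(contributor)

    def contributors(self, src: str, dst: str) -> set:
        """Return the instructions that contributed the edge src -> dst."""
        return self.edges.get(src, {}).get(dst, set())

    def shape(self) -> set:
        """Return the edge set without contributors, for comparing layouts."""
        return {(s, d) for s, dsts in self.edges.items() for d in dsts}


def layout_similarity(a: LayoutGraph, b: LayoutGraph) -> float:
    """Jaccard similarity of two edge sets (an assumed, illustrative metric)."""
    sa, sb = a.shape(), b.shape()
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)
```

A fuzzer guided by layout could, under these assumptions, prefer inputs whose graph scores close to the crashing path's graph; how Revery actually measures similarity is not stated in the abstract.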


... AEG (automatic exploit generation) was introduced to quickly generate exploits and improve the ability to assess the exploitability of software vulnerabilities [20][21][22][23]. In recent years, heap-based AEGs have improved, and some simple exploitations for user-data corruption can be processed [24,25]. However, the exploitation of metadata requires a high degree of skill, and so this process is usually combined with human expertise [26]. ...
... The existing AEG methods lack the expertise of the intermediate exploitation process and use a saltatory strategy to complete the migration of memory state (MMS). For example, Revery [24] jumps directly from the panic state to the memory state including arbitrary address writing (AAW). However, in most cases, the exploitation of heap vulnerabilities, especially metadata corruption, needs to be closely combined with human expertise, and it is necessary to traverse intermediate memory states in a step-by-step strategy. ...
... The existing AEGs perform badly in refining the heap layout and often "jump" during the processing of the heap layout. For example, Revery [24] jumps directly from the state of triggering a vulnerability to the state of hijacking the control flow. The limitation is that they adopt a random exploration strategy in the heap layout process, and they are not guided sufficiently by human expertise. ...
Article
Full-text available
In recent years, increased attention is being given to software quality assurance and protection. With considerable verification and protection schemes proposed and deployed, today’s software unfortunately still fails to be protected from cyberattacks, especially in the presence of insecure organization of heap metadata. In this paper, we aim to explore whether heap metadata could be corrupted and exploited by cyberattackers, in an attempt to assess the exploitability of vulnerabilities and ensure software quality. To this end, we propose RELAY, a software testing framework to simulate human exploitation behavior for metadata corruption at the machine level. RELAY employs the heap layout serialization method to construct exploit patterns from human expertise and decomposes complex exploit-solving problems into a series of intermediate state-solving subproblems. With the heap layout procedural method, RELAY makes use of the fewer resources consumed to solve a layout problem according to the exploit pattern, activates the intermediate state, and generates the final exploit. Additionally, RELAY can be easily extended and can continuously assimilate human knowledge to enhance its ability for exploitability evaluation. Using 20 CTF&RHG programs, we then demonstrate that RELAY has the ability to evaluate the exploitability of metadata corruption vulnerabilities and works more efficiently compared with other state-of-the-art automated tools.
... The company needs to overview the challenges, obstacles, and tasks before implementing Proof of Concept (PoC). The success and most important stages in RPA, based on the background study, include Process Assessment-Business Case-Proof of Concept-Project Design and Build-RPA Life cycle [9] (see Figure 2). and bureaucracy necessary from another department [10]. ...
Article
Full-text available
Automation technology is changing and transforming innovation into the industrial landscape and Human Resources (HR) should ensure to adapt and practice its deployment to realise its benefits in time and for cost savings. The implementation of Robotic Process Automation (RPA) in HR can help to offer better service to ensure compliance of the processes with standards and regulations. RPA is a software technology that manages software robots to emulate human actions when interacting with digital platforms. RPA is a solution that could perform repetitions to take over activities carried out by humans. However, a robot is not thought to be able to replace the HR but is, instead, useful to support driven processes. The purpose of the study is to prove the efficiency and effectiveness of RPA in the Human Resource Management System (HRMS) compared to the manual process performed by a human. Different types of components and characteristics were identified to adopt RPA in HRMS based on the data measurement in the implementation process. This study designs and develops an HRMS model using RPA tools to achieve the target process. The model was developed based on a case study of an existing model of RPA in HRMS from an IT consultancy industry. In the HR process, the project uses an application focusing on the parameters of gathering, storing and accessing employees’ information from other modules. Lastly, the gaps in the HRMS to improve productivity are evaluated and explained.
... To the best of our knowledge, there are neither tools for statically modeling and comparing static CFI defenses against each other, nor static CRA crafting tools which are aware of a set of applied defenses. Existing tools, including static pattern-based gadget searching tools [12,55] and dynamic attack construction tools [10,14,22,51,53], all lack deeper knowledge of the protected program. As such, they can find CRA gadgets, but cannot determine if the gadgets are usable after a defense was deployed. ...
... Revery [53] crafts attacks by analyzing a vulnerable program and by collecting runtime information on the crashing path as for example taint attributes of variables. Revery fails in some cases to generate an attack due to complicated defense mechanisms of which the tool is not aware. ...
Preprint
Full-text available
Control-flow hijacking attacks are used to perform malicious computations. Current solutions for assessing the attack surface after a control flow integrity (CFI) policy was applied can measure only indirect transfer averages in the best case without providing any insights w.r.t. the absolute calltarget reduction per callsite, and gadget availability. Further, tool comparison is underdeveloped or not possible at all. CFI has proven to be one of the most promising protections against control flow hijacking attacks, thus many efforts have been made to improve CFI in various ways. However, there is a lack of systematic assessment of existing CFI protections. In this paper, we present LLVM-CFI, a static source code analysis framework for analyzing state-of-the-art static CFI protections based on the Clang/LLVM compiler framework. LLVM-CFI works by precisely modeling a CFI policy and then evaluating it within a unified approach. LLVM-CFI helps determine the level of security offered by different CFI protections, after the CFI protections were deployed, thus providing an important step towards exploit creation/prevention and stronger defenses. We have used LLVM-CFI to assess eight state-of-the-art static CFI defenses on real-world programs such as Google Chrome and Apache Httpd. LLVM-CFI provides a precise analysis of the residual attack surfaces, and accordingly ranks CFI policies against each other. LLVM-CFI also successfully paves the way towards construction of COOP-like code reuse attacks and elimination of the remaining attack surface by disclosing protected calltargets under eight restrictive CFI policies.
... Regarding automated EXP generation, researchers manage to generate EXP samples that require high precision by sketching complex exploitation-related constraints via symbolic expressions and solving constraints to yield working EXPs. Previous work shows symbolic execution is one of the most effective approaches to do this work [4][5][6][7][9][10][11]. AEMB also employs symbolic execution to generate EXPs. ...
... Revery [10] attempts to exploit the vulnerability provided with non-exploitable PoCs and proposes a novel layout-oriented fuzzing and a control-flow stitching solution. It also contributed to the exploitation of heap memory vulnerabilities. ...
Article
Full-text available
Modern operating systems deploy exploit mitigations to thwart exploits, which has also become a barrier to automated exploit generation (AEG). Many current AEG solutions do not fully account for exploit mitigations, and as a result, they are unable to accurately assess the exploitability of vulnerabilities in such settings. This paper proposes AEMB, an automated solution for bypassing exploit mitigations and generating usable exploits (EXPs). Initially, AEMB identifies exploit mitigations in the system based on characteristics of the program execution environment. Then, AEMB generates mitigation-bypassing payloads by modeling expert experience and constructs the corresponding constraints. Next, during the program's execution, AEMB uses symbolic execution to collect symbolic information and create exploit constraints. Finally, AEMB utilizes a solver to solve the constraints, including payload constraints and exploit constraints, to generate the EXP. In this paper, we evaluated a prototype of AEMB on six test programs and seven real-world applications. Furthermore, we conducted 54 sets of experiments on six different combinations of exploit mitigations. Experiment results indicate that AEMB can automatically overcome exploit mitigations and produce successful exploits for 11 out of 13 applications.
... Specifically, AEG [6] was the first tool to automatically search for an exploitable program vulnerability [34] and to generate a control-flow hijacking attack. Revery [49] is an extension of AEG that addressed additional challenges. However, it can only automatically create return-to-stack and return-to-libc exploits. ...
... Revery [49] is a dynamic attack crafting tool that analyzes a vulnerable program and collects runtime information on the crashing path as for example taint attributes of variables. Revery is an extension of AEG but goes beyond by focusing on other challenges. ...
Conference Paper
Full-text available
Exploiting a program requires a security analyst to manipulate data in program memory with the goal to obtain control over the program counter and to escalate privileges. However, this is a tedious and lengthy process as: (1) the analyst has to massage program data such that a logical reliable data passing chain can be established, and (2) depending on the attacker goal certain in-place fine-grained protection mechanisms need to be bypassed. Previous work has proposed various techniques to facilitate exploit development. Unfortunately, none of them can be easily used to address the given challenges. This is due to the fact that data in memory is difficult to be massaged by an analyst who does not know the peculiarities of the program as the attack specification is most of the time only textually available, and not automated at all. In this paper, we present indirect transfer oriented programming (iTOP), a framework to automate the construction of control-flow hijacking attacks in the presence of strong protections including control flow integrity, data execution prevention, and stack canaries. Given a vulnerable program, iTOP automatically builds an exploit payload with a chain of viable gadgets with solved SMT-based memory constraints. One salient feature of iTOP is that it contains 13 attack primitives powered by a Turing complete payload specification language, ESL. It also combines virtual and non-virtual gadgets using COOP-like dispatchers. As such, when searching for gadget chains, iTOP can respect, for example, a previously enforced CFI policy, by using only legitimate control flow transfers. We have evaluated iTOP with a variety of programs and demonstrated that it can successfully generate working exploits with the developed attack primitives.
Conference Paper
Full-text available
Control-flow hijacking attacks are used to perform malicious computations. Current solutions for assessing the attack surface after a control flow integrity (CFI) policy was applied can measure only indirect transfer averages in the best case without providing any insights w.r.t. the absolute calltarget reduction per callsite, and gadget availability. Further, tool comparison is underdeveloped or not possible at all. CFI has proven to be one of the most promising protections against control flow hijacking attacks, thus many efforts have been made to improve CFI in various ways. However, there is a lack of systematic assessment of existing CFI protections. In this paper, we present LLVM-CFI, a static source code analysis framework for analyzing state-of-the-art static CFI protections based on the Clang/LLVM compiler framework. LLVM-CFI works by precisely modeling a CFI policy and then evaluating it within a unified approach. LLVM-CFI helps determine the level of security offered by different CFI protections, after the CFI protections were deployed, thus providing an important step towards exploit creation/prevention and stronger defenses. We have used LLVM-CFI to assess eight state-of-the-art static CFI defenses on real-world programs such as Google Chrome and Apache Httpd. LLVM-CFI provides a precise analysis of the residual attack surfaces, and accordingly ranks CFI policies against each other. LLVM-CFI also successfully paves the way towards construction of COOP-like code reuse attacks and elimination of the remaining attack surface by disclosing protected calltargets under eight restrictive CFI policies.
... Automatic discovery of heap exploit techniques is a small step toward AEG's ambitious vision [10,14], but it is worth emphasizing its importance and difficulty. Despite several attempts to accomplish fully automated exploit generation [10,14,15,36,47,55,56,66], AEG, particularly for heap vulnerabilities, is so sophisticated and difficult that all the state-of-the-art cyber reasoning systems from DARPA CGC, (i.e., systems finding and exploiting vulnerabilities automatically [24,33,58,63]), failed to address; according to organizers, only a single heap vulnerability was successfully exploited in the CGC final event. Recently, Repel et al. [55] proposes a symbolic-execution-based approach aiming at AEG for heap vulnerabilities, but only works for old allocators without security checks. ...
Preprint
Heap exploitation techniques to abuse the metadata of allocators have been widely studied since they are application independent and can be used in restricted environments that corrupt only metadata. Although prior work has found several interesting exploitation techniques, they are ad-hoc and manual, which cannot effectively handle changes or a variety of allocators. In this paper, we present a new naming scheme for heap exploitation techniques that systematically organizes them to discover the unexplored space in finding the techniques and ArcHeap, the tool that finds heap exploitation techniques automatically and systematically regardless of their underlying implementations. For that, ArcHeap generates a set of heap actions (e.g. allocation or deallocation) by leveraging fuzzing, which exploits common designs of modern heap allocators. Then, ArcHeap checks whether the actions result in impact of exploitations such as arbitrary write or overlapped chunks that efficiently determine if the actions can be converted into the exploitation technique. Finally, from these actions, ArcHeap generates Proof-of-Concept code automatically for an exploitation technique. We evaluated ArcHeap with real-world allocators --- ptmalloc, jemalloc, and tcmalloc --- and custom allocators from the DARPA Cyber Grand Challenge. ArcHeap successfully found 14 out of 16 known exploitation techniques and found five new exploitation techniques in ptmalloc. Moreover, ArcHeap found several exploitation techniques for jemalloc, tcmalloc, and even for the custom allocators. Further, ArcHeap can automatically show changes in exploitation techniques along with version change in ptmalloc using differential testing.
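One of the impacts mentioned above, overlapped chunks, can be checked with a simple interval test over the chunks an allocator currently reports as live. The following is a minimal sketch under that assumption; the `Chunk` alias and `find_overlap` function are hypothetical names for illustration and do not reflect how ArcHeap is implemented.

```python
from typing import List, Optional, Tuple

Chunk = Tuple[int, int]  # (start address, size in bytes)


def find_overlap(chunks: List[Chunk]) -> Optional[Tuple[Chunk, Chunk]]:
    """Return a pair of live chunks whose address ranges overlap, or None."""
    furthest: Optional[Chunk] = None  # chunk reaching the highest end address so far
    for cur in sorted(chunks):
        # Chunks are visited by ascending start address, so an overlap exists
        # exactly when the current start lies before the furthest end seen.
        if furthest is not None and cur[0] < furthest[0] + furthest[1]:
            return furthest, cur
        if furthest is None or cur[0] + cur[1] > furthest[0] + furthest[1]:
            furthest = cur
    return None
```

In a fuzzing loop, a check like this would run after each sequence of heap actions; two allocations returned by a correct allocator should never overlap, so a non-`None` result signals a candidate exploitation technique.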
... The dynamic vulnerability mining method represented by fuzzing has been widely deployed [24] since its introduction in the early 1990s [23]. It has evolved from simple robustness testing to self-feedback based on code coverage or other information from the execution process [27,28]. With the development of artificial intelligence, vulnerability analysis, and exploit technology, fuzzing has gradually become more intelligent [29], and software vulnerability discovery as a whole is gradually maturing and moving towards automation [30,31]. ...
Article
Full-text available
With the rapid development of network security and the frequent appearance of CPU vulnerabilities, CPU security has gradually attracted great attention and become a crucial issue in the computer field. Undocumented instructions, as one of the important threats to system security, are an important entry point for CPU security research. Fuzzing technology can automatically test the CPU instruction set and discover potential undocumented instructions, but the existing methods suffer from slow search speed and low accuracy. Therefore, this paper designs an efficient fuzzing method (UISFuzz) for undocumented instruction searching. This method has the following merits: (1) the instruction search speed is greatly improved by automatic instruction format recognition, as the inefficient part covering known instruction formats is skipped and the instruction search space is therefore much narrowed; (2) the false positive rate is reduced by a recheck mechanism based on an expert knowledge database that filters out wrongly found instructions; (3) the overhead of the method is decreased by optimizing the result analysis program, and the scope of the system is expanded, so that more processors with lower performance are compatible. Experimental results on typical CPUs show that UISFuzz can successfully find undocumented instructions in the CPUs and simultaneously improve the time efficiency by 5 times compared with existing tools.
... With the facilitation of forward and backward taint analysis, Mothe et al. devised a technical approach to craft working exploits for simple vulnerabilities in user-mode applications [26]. Utilizing various dynamic analysis methods, a team from the UK and a team from China crafted working exploits for those heap overflow vulnerabilities residing in the userland applications [35,44]. Using various program analysis, the Shellphish team at UCSB developed two systems (PovFuzzer and Rex) which give a security analyst the ability to turn a crash to a working exploit [38,39,41]. ...
Conference Paper
To determine the exploitability of a kernel vulnerability, a security analyst usually has to manipulate the slab and thus demonstrate the capability of obtaining control over a program counter or performing privilege escalation. However, this is a lengthy process because (1) an analyst typically has no clue about what objects and system calls are useful for kernel exploitation and (2) he lacks the knowledge of manipulating a slab and obtaining the desired layout. In the past, researchers have proposed various techniques to facilitate exploit development. Unfortunately, none of them can be easily applied to address these challenges. On the one hand, this is because of the complexity of the Linux kernel. On the other hand, this is due to the dynamics and non-determinism of slab variations. In this work, we tackle the challenges above from two perspectives. First, we use static and dynamic analysis techniques to explore the kernel objects and the corresponding system calls useful for exploitation. Second, we model commonly adopted exploitation methods and develop a technical approach to facilitate the slab layout adjustment. By extending LLVM as well as Syzkaller, we implement our techniques and name their combination SLAKE. We evaluate SLAKE by using 27 real-world kernel vulnerabilities, demonstrating that it could not only diversify the ways to perform kernel exploitation but also sometimes escalate the exploitability of kernel vulnerabilities.
... In recent years, various AEG solutions for different objectives have emerged. The solutions related to heap vulnerability are [13][14][15][16][17]. And [18][19][20][21] are solutions for format string vulnerability. ...
Article
Full-text available
Stack buffer overflow vulnerability is a common software vulnerability that can overwrite function return addresses and hijack program control flow, causing serious system problems. Existing automated exploit generation (AEG) solutions cannot bypass position-independent executable (PIE) exploit mitigation and cannot cope with the situation where the standard output function is not introduced into the program. In this paper, we propose a solution to alleviate the above difficulties: BofAEG, which is based on symbolic execution and dynamic analysis to automatically detect stack buffer overflow vulnerabilities and generate exploits. We used capture the flag (CTF) and Common Vulnerabilities and Exposures (CVE) programs for experiments. Results show that BofAEG can not only detect vulnerabilities and generate exploits effectively but also implements more exploit techniques and is faster than existing AEG solutions.
... Existing tools, including static pattern-based gadget searching tools [51], [45] and dynamic attack construction tools [40], [25], [62], [56], [24], all lack deeper knowledge of the protected program. As such, they can find CRA gadgets, but cannot determine if the gadgets are usable after a defense was deployed. ...
Preprint
Full-text available
Protecting programs against control-flow hijacking attacks recently has become an arms race between defenders and attackers. While certain defenses, e.g., Control Flow Integrity (CFI), restrict the targets of indirect control-flow transfers through static and dynamic analysis, attackers could search the program for available gadgets that fall into the legitimate target sets to bypass the defenses. There are several tools helping both attackers in developing exploits and analysts in strengthening their defenses. Yet, these tools fail to adequately (1) model the deployed defenses, (2) compare them in a head-to-head way, and (3) use program semantic information to help craft the attack and the countermeasures. Control Flow Integrity (CFI) has proved to be one of the promising defenses against control flow hijacks and tons of efforts have been made to improve CFI in various ways in the past decade. However, there is a lack of a systematic assessment of the existing CFI defenses. In this paper, we present Reckon, a static source code analysis tool for assessing state-of-the-art static CFI defenses, by first precisely modeling them and then evaluating them in a unified framework. Reckon helps determine the level of security offered by different CFI defenses, and find usable code gadgets even after the CFI defenses were applied, thus providing an important step towards successful exploits and stronger defenses. We have used Reckon to assess eight state-of-the-art static CFI defenses on real-world programs such as Google's Chrome and Apache Httpd. Reckon provides precise measurements of the residual attack surfaces, and accordingly ranks CFI policies against each other. It also successfully paves the way to construct code reuse attacks and to eliminate the remaining attack surface, by disclosing calltargets under one of the most restrictive CFI defenses.
Chapter
Automatic exploit generation for heap vulnerabilities is an open challenge. Current studies require a sensitive pointer on the heap to hijack the control flow and pay little attention to vulnerabilities with limited capabilities. In this paper, we propose HAEPG, an automatic exploit framework that can utilize known exploitation techniques to guide exploit generation. We implemented a prototype of HAEPG based on the symbolic execution engine S2E [15] and provided four exploitation techniques for it as prior knowledge. HAEPG takes crashing inputs, programs, and prior knowledge as input, and generates exploits for vulnerabilities with limited capabilities by abusing heap allocator’s internal functionalities.
Article
Full-text available
Decompilation aims to analyze and transform low-level program language (PL) codes such as binary code or assembly code to obtain an equivalent high-level PL. Decompilation plays a vital role in the cyberspace security fields such as software vulnerability discovery and analysis, malicious code detection and analysis, and software engineering fields such as source code analysis, optimization, and cross-language cross-operating system migration. Unfortunately, the existing decompilers mainly rely on experts to write rules, which leads to bottlenecks such as low scalability, development difficulties, and long cycles. The generated high-level PL codes often violate the code writing specifications. Further, their readability is still relatively low. The problems mentioned above hinder the efficiency of advanced applications (e.g., vulnerability discovery) based on decompiled high-level PL codes. In this paper, we propose a decompilation approach based on the attention-based neural machine translation (NMT) mechanism, which converts low-level PL into high-level PL while acquiring legibility and keeping functionally similar. To compensate for the information asymmetry between the low-level and high-level PL, a translation method based on basic operations of low-level PL is designed. This method improves the generalization of the NMT model and captures the translation rules between PLs more accurately and efficiently. Besides, we implement a neural decompilation framework called Neutron. The evaluation of two practical applications shows that Neutron’s average program accuracy is 96.96%, which is better than the traditional NMT model.
Conference Paper
We present the first approach to automatic exploit generation for heap overflows in interpreters. It is also the first approach to exploit generation in any class of program that integrates a solution for automatic heap layout manipulation. At the core of the approach is a novel method for discovering exploit primitives---inputs to the target program that result in a sensitive operation, such as a function call or a memory write, utilizing attacker-injected data. To produce an exploit primitive from a heap overflow vulnerability, one has to discover a target data structure to corrupt, ensure an instance of that data structure is adjacent to the source of the overflow on the heap, and ensure that the post-overflow corrupted data is used in a manner desired by the attacker. Our system addresses all three tasks in an automatic, greybox, and modular manner. Our implementation is called GOLLUM, and we demonstrate its capabilities by producing exploits from 10 unique vulnerabilities in the PHP and Python interpreters, 5 of which do not have existing public exploits.
Conference Paper
Full-text available
Fuzzing is an effective software testing technique to find bugs. Given the size and complexity of real-world applications, modern fuzzers tend to be either scalable, but not effective in exploring bugs that lie deeper in the execution, or capable of penetrating deeper in the application, but not scalable. In this paper, we present an application-aware evolutionary fuzzing strategy that does not require any prior knowledge of the application or input format. In order to maximize coverage and explore deeper paths, we leverage control-and data-flow features based on static and dynamic analysis to infer fundamental properties of the application. This enables much faster generation of interesting inputs compared to an application-agnostic approach. We implement our fuzzing strategy in VUzzer and evaluate it on three different datasets: DARPA Grand Challenge binaries (CGC), a set of real-world applications (binary input parsers), and the recently released LAVA dataset. On all of these datasets, VUzzer yields significantly better results than state-of-the-art fuzzers, by quickly finding several existing and new bugs.
Conference Paper
Full-text available
We tackle the problem of automated exploit generation for web applications. In this regard, we present an approach that significantly improves the state-of-art in web injection vulnerability identification and exploit generation. Our approach for exploit generation tackles various challenges associated with typical web application characteristics: their multi-module nature, interposed user input, and multi-tier architectures using a database backend. Our approach develops precise models of application workflows, database schemas, and native functions to achieve high quality exploit generation. We implemented our approach in a tool called Chainsaw. Chainsaw was used to analyze 9 open source applications and generated over 199 first- and second-order injection exploits combined, significantly outperforming several related approaches.
Conference Paper
Full-text available
Temporal memory safety errors, such as dangling pointer dereferences and double frees, are a prevalent source of software bugs in unmanaged languages such as C. Existing schemes that attempt to retrofit temporal safety for such languages have high runtime overheads and/or are incomplete, thereby limiting their effectiveness as debugging aids. This paper presents CETS, a compile-time transformation for detecting all violations of temporal safety in C programs. Inspired by existing approaches, CETS maintains a unique identifier with each object, associates this metadata with the pointers in a disjoint metadata space to retain memory layout compatibility, and checks that the object is still allocated on pointer dereferences. A formal proof shows that this is sufficient to provide temporal safety even in the presence of arbitrary casts if the program contains no spatial safety violations. Our CETS prototype employs both temporal check removal optimizations and traditional compiler optimizations to achieve a runtime overhead of just 48% on average. When combined with a spatial-checking system, the average overall overhead is 116% for complete memory safety.
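The identifier-based check described above can be mimicked in a few lines. This is a simplified simulation, not the CETS compiler transformation: the `CheckedHeap` class, its method names, and the use of pointer names as metadata keys are all hypothetical choices made for illustration. Each allocation receives a never-reused identifier, pointer metadata lives in a separate table, and every dereference or free verifies that the identifier still matches the live object.

```python
import itertools


class TemporalError(Exception):
    """Raised when a pointer is used after its object was freed."""


class CheckedHeap:
    def __init__(self):
        self._ids = itertools.count(1)  # object identifiers, never reused
        self._live = {}                 # address -> identifier of the live object
        self._meta = {}                 # disjoint metadata: pointer name -> (address, identifier)
        self._free_addrs = []           # freed addresses, reused like a real allocator
        self._next_addr = 0x1000

    def malloc(self, ptr):
        """Allocate an object and bind pointer `ptr` to it."""
        if self._free_addrs:
            addr = self._free_addrs.pop()
        else:
            addr = self._next_addr
            self._next_addr += 0x10
        ident = next(self._ids)
        self._live[addr] = ident
        self._meta[ptr] = (addr, ident)

    def copy(self, dst, src):
        """Pointer assignment: the metadata travels with the pointer."""
        self._meta[dst] = self._meta[src]

    def _check(self, ptr):
        addr, ident = self._meta[ptr]
        if self._live.get(addr) != ident:
            raise TemporalError(f"{ptr} refers to a dead object")
        return addr

    def deref(self, ptr):
        """Return the address only if the pointed-to object is still allocated."""
        return self._check(ptr)

    def free(self, ptr):
        """Free the object; a stale pointer here means a double free."""
        addr = self._check(ptr)
        del self._live[addr]
        self._free_addrs.append(addr)
```

Because identifiers are never reused, a dangling pointer is still caught after its address has been handed out again to a new object, which an address-only liveness check would miss.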
Conference Paper
Full-text available
The serious bugs and security vulnerabilities facilitated by C/C++'s lack of bounds checking are well known, yet C and C++ remain in widespread use. Unfortunately, C's arbitrary pointer arithmetic, conflation of pointers and arrays, and programmer-visible memory layout make retrofitting C/C++ with spatial safety guarantees extremely challenging. Existing approaches suffer from incompleteness, have high runtime overhead, or require non-trivial changes to the C source code. Thus far, these deficiencies have prevented widespread adoption of such techniques. This paper proposes SoftBound, a compile-time transformation for enforcing spatial safety of C. Inspired by HardBound, a previously proposed hardware-assisted approach, SoftBound similarly records base and bound information for every pointer as disjoint metadata. This decoupling enables SoftBound to provide spatial safety without requiring changes to C source code. Unlike HardBound, SoftBound is a software-only approach and performs metadata manipulation only when loading or storing pointer values. A formal proof shows that this is sufficient to provide spatial safety even in the presence of arbitrary casts. SoftBound's full checking mode provides complete spatial violation detection with 67% runtime overhead on average. To further reduce overheads, SoftBound has a store-only checking mode that successfully detects all the security vulnerabilities in a test suite at the cost of only 22% runtime overhead on average.
Article
Full-text available
We present a technique for finding security vulnerabilities in Web applications. SQL Injection (SQLI) and cross-site scripting (XSS) attacks are widespread forms of attack in which the attacker crafts the input to the application to access or modify user data and execute malicious code. In the most serious attacks (called second-order, or persistent, XSS), an attacker can corrupt a database so as to cause subsequent users to execute malicious code. This paper presents an automatic technique for creating inputs that expose SQLI and XSS vulnerabilities. The technique generates sample inputs, symbolically tracks taints through execution (including through database accesses), and mutates the inputs to produce concrete exploits. Ours is the first analysis of which we are aware that precisely addresses second-order XSS attacks. Our technique creates real attack vectors, has few false positives, incurs no runtime overhead for the deployed application, works without requiring modification of application code, and handles dynamic programming-language constructs. We implemented the technique for PHP, in a tool Ardilla. We evaluated Ardilla on five PHP applications and found 68 previously unknown vulnerabilities (23 SQLI, 33 first-order XSS, and 12 second-order XSS).
Article
Memory safety in C and C++ remains largely unresolved. A technique usually called "memory tagging" may dramatically improve the situation if implemented in hardware with reasonable overhead. This paper describes two existing implementations of memory tagging: one is the full hardware implementation in SPARC; the other is a partially hardware-assisted compiler-based tool for AArch64. We describe the basic idea, evaluate the two implementations, and explain how they improve memory safety. This paper is intended to initiate a wider discussion of memory tagging and to motivate the CPU and OS vendors to add support for it in the near future.
Conference Paper
Existing Greybox Fuzzers (GF) cannot be effectively directed, for instance, towards problematic changes or patches, towards critical system calls or dangerous locations, or towards functions in the stack-trace of a reported vulnerability that we wish to reproduce. In this paper, we introduce Directed Greybox Fuzzing (DGF) which generates inputs with the objective of reaching a given set of target program locations efficiently. We develop and evaluate a simulated annealing-based power schedule that gradually assigns more energy to seeds that are closer to the target locations while reducing energy for seeds that are further away. Experiments with our implementation AFLGo demonstrate that DGF outperforms both directed symbolic-execution-based whitebox fuzzing and undirected greybox fuzzing. We show applications of DGF to patch testing and crash reproduction, and discuss the integration of AFLGo into Google's continuous fuzzing platform OSS-Fuzz. Due to its directedness, AFLGo could find 39 bugs in several well-fuzzed, security-critical projects like LibXML2. 17 CVEs were assigned.
Conference Paper
Coverage-based Greybox Fuzzing (CGF) is a random testing approach that requires no program analysis. A new test is generated by slightly mutating a seed input. If the test exercises a new and interesting path, it is added to the set of seeds; otherwise, it is discarded. We observe that most tests exercise the same few "high-frequency" paths and develop strategies to explore significantly more paths with the same number of tests by gravitating towards low-frequency paths. We explain the challenges and opportunities of CGF using a Markov chain model which specifies the probability that fuzzing the seed that exercises path i generates an input that exercises path j. Each state (i.e., seed) has an energy that specifies the number of inputs to be generated from that seed. We show that CGF is considerably more efficient if energy is inversely proportional to the density of the stationary distribution and increases monotonically every time that seed is chosen. Energy is controlled with a power schedule. We implemented the exponential schedule by extending AFL. In 24 hours, AFLFAST exposes 3 previously unreported CVEs that are not exposed by AFL and exposes 6 previously unreported CVEs 7x faster than AFL. AFLFAST produces at least an order of magnitude more unique crashes than AFL.
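The loop this abstract describes (mutate a seed, keep the test if it exercises a new path, and let a power schedule decide how many tests each seed gets) can be sketched as a toy. This is not AFL or AFLFAST code: `target`, `mutate`, `energy`, and `fuzz` are hypothetical names, the program under test is a three-branch stand-in, and the energy formula is only an assumed exponential-style form that grows each time a seed is chosen and shrinks for frequently exercised paths.

```python
import random


def target(data: bytes) -> tuple:
    """Toy program under test: returns the branch outcomes (its 'path')."""
    return (
        len(data) > 2,
        len(data) > 0 and data[0] == ord("F"),
        len(data) > 1 and data[1] == ord("U"),
    )


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Slightly mutate a seed: overwrite one byte or append one byte."""
    buf = bytearray(seed)
    if buf and rng.random() < 0.7:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        buf.append(rng.randrange(256))
    return bytes(buf)


def energy(times_chosen: int, path_hits: int, cap: int = 64) -> int:
    """Assumed schedule: doubles per selection, divided by path frequency."""
    return max(1, min(cap, (2 ** times_chosen) // max(1, path_hits)))


def fuzz(initial: bytes, rounds: int, rng_seed: int = 0):
    rng = random.Random(rng_seed)
    seeds = [initial]
    times_chosen = {initial: 0}            # how often each seed was picked
    seed_path = {initial: target(initial)}  # path exercised by each seed
    path_hits = {seed_path[initial]: 1}     # how many tests hit each path
    for _ in range(rounds):
        seed = rng.choice(seeds)
        times_chosen[seed] += 1
        budget = energy(times_chosen[seed], path_hits[seed_path[seed]])
        for _ in range(budget):
            test = mutate(seed, rng)
            path = target(test)
            if path not in path_hits:
                # New path: keep the test as a seed; otherwise it is discarded.
                path_hits[path] = 0
                seeds.append(test)
                times_chosen[test] = 0
                seed_path[test] = path
            path_hits[path] += 1
    return seeds, path_hits
```

Calling `fuzz(b"A", 50)` returns the retained seeds and a per-path hit count; every retained seed exercises a distinct path, since a test is only kept when its path has not been seen before.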
Conference Paper
Memory access bugs, including buffer overflows and uses of freed heap memory, remain a serious problem for programming languages like C and C++. Many memory error detectors exist, but most of them are either slow or detect a limited set of bugs, or both. This paper presents AddressSanitizer, a new memory error detector. Our tool finds out-of-bounds accesses to heap, stack, and global objects, as well as use-after-free bugs. It employs a specialized memory allocator and code instrumentation that is simple enough to be implemented in any compiler, binary translation system, or even in hardware. AddressSanitizer achieves efficiency without sacrificing comprehensiveness. Its average slowdown is just 73% yet it accurately detects bugs at the point of occurrence. It has found over 300 previously unknown bugs in the Chromium browser and many bugs in other software.
Article
In this paper we present Mayhem, a new system for automatically finding exploitable bugs in binary (i.e., executable) programs. Every bug reported by Mayhem is accompanied by a working shell-spawning exploit. The working exploits ensure soundness and that each bug report is security-critical and actionable. Mayhem works on raw binary code without debugging information. To make exploit generation possible at the binary-level, Mayhem addresses two major technical challenges: actively managing execution paths without exhausting memory, and reasoning about symbolic memory indices, where a load or a store address depends on user input. To this end, we propose two novel techniques: 1) hybrid symbolic execution for combining online and offline (concolic) execution to maximize the benefits of both techniques, and 2) index-based memory modeling, a technique that allows Mayhem to efficiently reason about symbolic memory at the binary level. We used Mayhem to find and demonstrate 29 exploitable vulnerabilities in both Linux and Windows programs, 2 of which were previously undocumented.
Article
Prior work has shown that return oriented programming (ROP) can be used to bypass W⊕X, a software defense that stops shellcode, by reusing instructions from large libraries such as libc. Modern operating systems have since enabled address randomization (ASLR), which randomizes the location of libc, making these techniques unusable in practice. However, modern ASLR implementations leave smaller amounts of executable code unrandomized and it has been unclear whether an attacker can use these small code fragments to construct payloads in the general case. In this paper, we show defenses as currently deployed can be bypassed with new techniques for automatically creating ROP payloads from small amounts of unrandomized code. We propose using semantic program verification techniques for identifying the functionality of gadgets, and design a ROP compiler that is resistant to missing gadget types. To demonstrate our techniques, we build Q, an end-to-end system that automatically generates ROP payloads for a given binary. Q can produce payloads for 80% of Linux /usr/bin programs larger than 20KB. We also show that Q can automatically perform exploit hardening: given an exploit that crashes with defenses on, Q outputs an exploit that bypasses both W⊕X and ASLR. We show that Q can harden nine real-world Linux and Windows exploits, enabling an attacker to automatically bypass defenses as deployed by industry for those programs.
Conference Paper
Attackers commonly exploit buggy programs to break into computers. Security-critical bugs pave the way for attackers to install trojans, propagate worms, and use victim computers to send spam and launch denial-of-service attacks. A direct way, therefore, to make computers more secure is to find security-critical bugs before they are exploited by attackers. Unfortunately, bugs are plentiful. For example, the Ubuntu Linux bug-management database listed more than 103,000 open bugs as of January 2013. Specific widely used programs (such as the Firefox Web browser and the Linux 3.x kernel) list 7,597 and 1,293 open bugs in their public bug trackers, respectively. Other projects, including those that are closed-source, likely involve similar statistics. These are just the bugs we know; there is always the persistent threat of zero-day exploits, or attacks against previously unknown bugs. Among the thousands of known bugs, which should software developers fix first? Which are exploitable?
Conference Paper
The automatic patch-based exploit generation problem is: given a program P and a patched version of the program P', automatically generate an exploit for the potentially unknown vulnerability present in P but fixed in P'. In this paper, we propose techniques for automatic patch-based exploit generation, and show that our techniques can automatically generate exploits for 5 Microsoft programs based upon patches provided via Windows Update. Although our techniques may not work in all cases, a fundamental tenet of security is to conservatively estimate the capabilities of attackers. Thus, our results indicate that automatic patch-based exploit generation should be considered practical. One important security implication of our results is that current patch distribution schemes which stagger patch distribution over long time periods, such as Windows Update, may allow attackers who receive the patch first to compromise the significant fraction of vulnerable hosts that have not yet received the patch.
OSS-Fuzz - Google's continuous fuzzing service for open source software
  • Kostya Serebryany
Kostya Serebryany. 2017. OSS-Fuzz - Google's continuous fuzzing service for open source software. (2017).
Heap feng shui in javascript
  • Alexander Sotirov
Alexander Sotirov. 2007. Heap feng shui in javascript. Black Hat Europe (2007).
Automatic Generation of Data-Oriented Exploits
  • Hong Hu
  • Zheng Leong Chua
  • Sendroiu Adrian
  • Prateek Saxena
  • Zhenkai Liang
Hong Hu, Zheng Leong Chua, Sendroiu Adrian, Prateek Saxena, and Zhenkai Liang. 2015. Automatic Generation of Data-Oriented Exploits. In USENIX Security Symposium. 177-192.
The automated exploitation grand challenge
  • Julien Vanegue
Julien Vanegue. 2013. The automated exploitation grand challenge. Presented at H2HC Conference.
FUZE: Towards Facilitating Exploit Generation for Kernel Use-After-Free Vulnerabilities
  • Wei Wu
  • Yueqi Chen
  • Jun Xu
  • Xinyu Xing
  • Xiaorui Gong
  • Wei Zou
Wei Wu, Yueqi Chen, Jun Xu, Xinyu Xing, Xiaorui Gong, and Wei Zou. 2018. FUZE: Towards Facilitating Exploit Generation for Kernel Use-After-Free Vulnerabilities. In 27th USENIX Security Symposium (USENIX Security 18). USENIX Association.
Automatically assessing crashes from heap overflows
  • Liang He
  • Yan Cai
  • Hong Hu
  • Purui Su
  • Zhenkai Liang
  • Yi Yang
  • Huafeng Huang
  • Jia Yan
  • Xiangkun Jia
  • Dengguo Feng
SoftBound: Highly Compatible and Complete Spatial Memory Safety for C
  • Santosh Nagarakatte
  • Jianzhou Zhao
  • Milo M. K. Martin
  • Steve Zdancewic
Santosh Nagarakatte, Jianzhou Zhao, Milo M.K. Martin, and Steve Zdancewic. 2009. SoftBound: Highly Compatible and Complete Spatial Memory Safety for C. In Intl. Conf. on Programming Language Design and Implem.
Directed greybox fuzzing
  • Marcel Böhme
  • Van-Thuan Pham
  • Manh-Dung Nguyen
  • Abhik Roychoudhury
Marcel Böhme, Van-Thuan Pham, Manh-Dung Nguyen, and Abhik Roychoudhury. 2017. Directed greybox fuzzing. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2329-2344.
Cyber grand challenge
  • DARPA
DARPA. 2014. Cyber grand challenge. Retrieved June 6, 2014.
Exploitation and state machines
  • Thomas Dullien
  • Halvar Flake
Thomas Dullien and Halvar Flake. 2011. Exploitation and state machines. Proceedings of Infiltrate (2011).
New features in AddressSanitizer
  • Alexey Samsonov
  • Kostya Serebryany
Alexey Samsonov and Kostya Serebryany. 2013. New features in AddressSanitizer. (2013).
MemorySanitizer: fast detector of uninitialized memory use in C++
  • Evgeniy Stepanov
  • Konstantin Serebryany
Evgeniy Stepanov and Konstantin Serebryany. 2015. MemorySanitizer: fast detector of uninitialized memory use in C++. In Code Generation and Optimization (CGO), 2015 IEEE/ACM International Symposium on. IEEE, 46-55.
Automatic exploit generation
  • Thanassis Avgerinos
  • Sang Kil Cha
  • Alexandre Rebert
  • Edward J Schwartz
  • Maverick Woo
  • David Brumley
Thanassis Avgerinos, Sang Kil Cha, Alexandre Rebert, Edward J Schwartz, Maverick Woo, and David Brumley. 2014. Automatic exploit generation. Commun. ACM 57, 2 (2014), 74-84.
Konstantin Serebryany Derek Bruening Alexander Potapenko and Dmitriy Vyukov
  • Konstantin Serebryany
  • Derek Bruening
  • Alexander Potapenko
  • Dmitriy Vyukov
Online: accessed 01-May-2018. Michal Zalewski
  • Michal Zalewski