Article

Language-independent sandboxing of just-in-time compilation and self-modifying code

Authors:
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

When dealing with dynamic, untrusted content, such as on the Web, software behavior must be sandboxed, typically through use of a language like JavaScript. However, even for such specially-designed languages, it is difficult to ensure the safety of highly-optimized, dynamic language runtimes which, for efficiency, rely on advanced techniques such as Just-In-Time (JIT) compilation, large libraries of native-code support routines, and intricate mechanisms for multi-threading and garbage collection. Each new runtime provides a new potential attack surface and this security risk raises a barrier to the adoption of new languages for creating untrusted content. Removing this limitation, this paper introduces general mechanisms for safely and efficiently sandboxing software, such as dynamic language runtimes, that make use of advanced, low-level techniques like runtime code modification. Our language-independent sandboxing builds on Software-based Fault Isolation (SFI), a traditionally static technique. We provide a more flexible form of SFI by adding new constraints and mechanisms that allow safety to be guaranteed despite runtime code modifications. We have added our extensions to both the x86-32 and x86-64 variants of a production-quality, SFI-based sandboxing platform; on those two architectures SFI mechanisms face different challenges. We have also ported two representative language platforms to our extended sandbox: the Mono common language runtime and the V8 JavaScript engine. In detailed evaluations, we find that sandboxing slowdown varies between different benchmarks, languages, and hardware platforms. Overheads are generally moderate and they are close to zero for some important benchmark/platform combinations.

No full-text available

Request Full-text Paper PDF

To read the full-text of this research,
you can request a copy directly from the authors.

... However, they only provide a probabilistic defense. NaCl-JIT [20] and RockJIT [21] enforce coarse-grained CFI on the JIT compiled code with sandbox techniques. Their performance overheads, however, are too high. ...
... The NaCl-JIT [20] solution is the first CFI solution deployed on JIT compiled code. It uses the Software Fault Isolation (SFI) [31] based sandbox technique to enforce that all indirect control transfer in JIT compiled code only jump to aligned code address. ...
... The NaCl-JIT [20] and RockJIT [21] solutions can mitigate this type of attacks by sandboxing all memory write operations to eliminate unauthorized write operations. The solution [39] enforces the JIT memory to be non-writable, and delegates all JIT memory write operations to a trusted process that shares the JIT memory with the browser. ...
Conference Paper
Full-text available
Web browsers are one of the most important end-user applications to browse, retrieve, and present Internet resources. Malicious or compromised resources may endanger Web users by hijacking web browsers to execute arbitrary malicious code in the victims' systems. Unfortunately, the widely-adopted Just-In-Time compilation (JIT) optimization technique, which compiles source code to native code at runtime, significantly increases this risk. By exploiting JIT compiled code, attackers can bypass all currently deployed defenses. In this paper, we systematically investigate threats against JIT compiled code, and the challenges of protecting JIT compiled code. We propose a general defense solution, JITScope, to enforce Control-Flow Integrity (CFI) on both statically compiled and JIT compiled code. Our solution furthermore enforces the W⊕X policy on JIT compiled code, preventing the JIT compiled code from being overwritten by attackers. We show that our prototype implementation of JITScope on the popular Firefox web browser introduces a reasonably low performance overhead, while defeating existing real-world control flow hijacking attacks.
... Thus, if at any instance during the execution of the process, a new executable page is allocated, the hardware monitor will flag an alarm alerting the kernel that the new page does not have any corresponding golden hash. Therefore, our proposed architecture currently limits the support of applications that allow for just-in-time (JIT) compilation and run-time code relocation [35]. ...
... In our future work, we plan to address some of the challenges introduced when supporting applications that allow for just-in-time (JIT) compilation and run-time code relocation [35]. One possibility is to include page hash generation both in software as well as in the hardware root-of-trust as an enrollment process for pages with code modifications. ...
Article
Full-text available
Attacks on embedded devices are becoming more and more prevalent, primarily due to the extensively increasing plethora of software vulnerabilities. One of the most dangerous types of these attacks targets application code at run-time. Techniques to detect such attacks typically rely on software due to the ease of implementation and integration. However, these techniques are still vulnerable to the same attacks due to their software nature. In this work, we present a novel hardware-assisted run-time code integrity checking technique where we aim to detect if executable code resident in memory is modified at run-time by an adversary. Specifically, a hardware monitor is designed and attached to the device’s main memory system. The monitor creates page-based signatures (hashes) of the code running on the system at compile-time and stores them in a secure database. It then checks for the integrity of the code pages at run-time by regenerating the page-based hashes (with data segments zeroed out) and comparing them to the legitimate hashes. The goal is for any modification to the binary of a user-level or kernel-level process that is resident in memory to cause a comparison failure and lead to a kernel interrupt which allows the affected application to halt safely.
... One top priority is to write a verifier (similar to [28]) to remove the compiler from the TCB. Another interesting improvement to our SFI is to support Just-In-Time (JIT) compilation and self-modifying code, which can be done in a way similar to [7]. This is important to support language runtimes for high-level programming languages, e.g., Java and JavaScript. ...
Conference Paper
One cornerstone of computer security is hardware-based isolation mechanisms, among which an emerging technology named Intel Software Guard Extensions (SGX) offers arguably the strongest security on x86 architecture. Intel SGX enables user-level code to create trusted memory regions named enclaves, which are isolated from the rest of the system, including privileged system software. This strong isolation of SGX, however, forbids sharing any trusted memory between enclaves, making it difficult to implement any features or techniques that must share code or data between enclaves. This dilemma between isolation and sharing is especially challenging to system software for SGX (e.g., library OSes), to which both properties are highly desirable. To resolve the tension between isolation and sharing in system software for SGX, especially library OSes, we propose a single-address-space approach, which runs all (user-level) processes and the library OS in a single enclave. This single-enclave architecture enables various memory-sharing features or techniques, thus improving both performance and usability. To enforce inter-process isolation and user-privilege isolation inside the enclave, we design a multi-domain software fault isolation (SFI) scheme, which is unique in its support for two types of domains: 1) data domains, which enable process isolation, and 2) code domains, which enable shared libraries. Our SFI is implemented efficiently by leveraging Intel Memory Protection Extensions (MPX). Experimental results show an average overhead of 10%, thus demonstrating the practicality of our approach.
... Shying the complexity of JIT engines, few CFI schemes have been tested on JIT compilers. One of the notable exceptions is NaCl SFI [1], which provides a coarsegrained CFI implementation for JIT engines, but faces an overhead of 51% on x64 systems. Similarly, RockJIT instruments JIT-compiled code with coarse-grained checks, verifying the control flow instruction targets at runtime. ...
... Nowadays, using JavaScript maliciously has become a major threat to client Machines. [9,39,35,22,14] Nature of problems is same for Smartphones, Desktops and Clouds. All the above security related issues are major threats to Smartphones, Desktops and Clouds. ...
... We model a JIT engine such as a browser environment that performs dynamic binary transformations. Like most browser environments and other code randomization defenses [35,36] that employ dynamic binary instrumentation, we assume that the JIT engine is checked for vulnerabilities and its address space is protected by memory protection mechanisms such as code signing [78], sandboxing [79], and Intel Software Guard Extensions (SGX) [80]. ...
Conference Paper
Full-text available
Heterogeneous Chip Multiprocessors have been shown to provide significant performance and energy efficiency gains over homogeneous designs. Recent research has expanded the dimensions of heterogeneity to include diverse Instruction-Set Architectures, called Heterogeneous-ISA Chip Multi-processors. This work leverages such an architecture to realize substantial new security benefits, and in particular, to thwart Return-Oriented Programming. This paper proposes a novel security defense called HIPStR – Heterogeneous-ISA Program State Relocation – that performs dynamic randomization of run-time program state, both within and across ISAs. This technique outperforms the state-of-the-art just-in-time code reuse (JIT-ROP) defense by an average of 15.6%, while simultaneously providing greater security guarantees against classic return-into-libc, ROP, JOP, brute force, JIT-ROP, and several evasive variants.
Article
Fay is a flexible platform for the efficient collection, processing, and analysis of software execution traces. Fay provides dynamic tracing through use of runtime instrumentation and distributed aggregation within machines and across clusters. At the lowest level, Fay can be safely extended with new tracing primitives, including even untrusted, fully optimized machine code, and Fay can be applied to running user-mode or kernel-mode software without compromising system stability. At the highest level, Fay provides a unified, declarative means of specifying what events to trace, as well as the aggregation, processing, and analysis of those events. We have implemented the Fay tracing platform for Windows and integrated it with two powerful, expressive systems for distributed programming. Our implementation is easy to use, can be applied to unmodified production systems, and provides primitives that allow the overhead of tracing to be greatly reduced, compared to previous dynamic tracing platforms. To show the generality of Fay tracing, we reimplement, in experiments, a range of tracing strategies and several custom mechanisms from existing tracing frameworks. Fay shows that modern techniques for high-level querying and data-parallel processing of disagreggated data streams are well suited to comprehensive monitoring of software execution in distributed systems. Revisiting a lesson from the late 1960s [Deutsch and Grant 1971], Fay also demonstrates the efficiency and extensibility benefits of using safe, statically verified machine code as the basis for low-level execution tracing. Finally, Fay establishes that, by automatically deriving optimized query plans and code for safe extensions, the expressiveness and performance of high-level tracing queries can equal or even surpass that of specialized monitoring tools.
Article
Heterogeneous Chip Multiprocessors have been shown to provide significant performance and energy efficiency gains over homogeneous designs. Recent research has expanded the dimensions of heterogeneity to include diverse Instruction Set Architectures, called Heterogeneous-ISA Chip Multiprocessors. This work leverages such an architecture to realize substantial new security benefits, and in particular, to thwart Return-Oriented Programming. This paper proposes a novel security defense called HIPStR -- Heterogeneous-ISA Program State Relocation -- that performs dynamic randomization of run-time program state, both within and across ISAs. This technique outperforms the state-of-the-art just-in-time code reuse (JIT-ROP) defense by an average of 15.6%, while simultaneously providing greater security guarantees against classic return-into-libc, ROP, JOP, brute force, JIT-ROP, and several evasive variants.
Article
Heterogeneous Chip Multiprocessors have been shown to provide significant performance and energy efficiency gains over homogeneous designs. Recent research has expanded the dimensions of heterogeneity to include diverse Instruction Set Architectures, called Heterogeneous-ISA Chip Multiprocessors. This work leverages such an architecture to realize substantial new security benefits, and in particular, to thwart Return-Oriented Programming. This paper proposes a novel security defense called HIPStR -- Heterogeneous-ISA Program State Relocation -- that performs dynamic randomization of run-time program state, both within and across ISAs. This technique outperforms the state-of-the-art just-in-time code reuse (JIT-ROP) defense by an average of 15.6%, while simultaneously providing greater security guarantees against classic return-into-libc, ROP, JOP, brute force, JIT-ROP, and several evasive variants.
Article
Heterogeneous Chip Multiprocessors have been shown to provide significant performance and energy efficiency gains over homogeneous designs. Recent research has expanded the dimensions of heterogeneity to include diverse Instruction Set Architectures, called Heterogeneous-ISA Chip Multiprocessors. This work leverages such an architecture to realize substantial new security benefits, and in particular, to thwart Return-Oriented Programming. This paper proposes a novel security defense called HIPStR -- Heterogeneous-ISA Program State Relocation -- that performs dynamic randomization of run-time program state, both within and across ISAs. This technique outperforms the state-of-the-art just-in-time code reuse (JIT-ROP) defense by an average of 15.6%, while simultaneously providing greater security guarantees against classic return-into-libc, ROP, JOP, brute force, JIT-ROP, and several evasive variants.
Article
Managed languages such as JavaScript are popular. For performance, modern implementations of managed languages adopt Just-In-Time (JIT) compilation. The danger to a JIT compiler is that an attacker can often control the input program and use it to trigger a vulnerability in the JIT compiler to launch code injection or JIT spraying attacks. In this paper, we propose a general approach called RockJIT to securing JIT compilers through Control-Flow Integrity (CFI). RockJIT builds a fine-grained control-flow graph from the source code of the JIT compiler and dynamically updates the control-flow policy when new code is generated on the fly. Through evaluation on Google's V8 JavaScript engine, we demonstrate that RockJIT can enforce strong security on a JIT compiler, while incurring only modest performance overhead (14.6% on V8) and requiring a small amount of changes to V8's code. Key contributions of RockJIT are a general architecture for securing JIT compilers and a method for generating fine-grained control-flow graphs from C++ code.
Article
Control-Flow Integrity (CFI) is a software-hardening technique. It inlines checks into a program so that its execution always follows a predetermined Control-Flow Graph (CFG). As a result, CFI is effective at preventing control-flow hijacking attacks. However, past fine-grained CFI implementations do not support separate compilation, which hinders its adoption. We present Modular Control-Flow Integrity (MCFI), a new CFI technique that supports separate compilation. MCFI allows modules to be independently instrumented and linked statically or dynamically. The combined module enforces a CFG that is a combination of the individual modules' CFGs. One challenge in supporting dynamic linking in multithreaded code is how to ensure a safe transition from the old CFG to the new CFG when libraries are dynamically linked. The key technique we use is to have the CFG represented in a runtime data structure and have reads and updates of the data structure wrapped in transactions to ensure thread safety. Our evaluation on SPECCPU2006 benchmarks shows that MCFI supports separate compilation, incurs low overhead of around 5%, and enhances security.
Article
Web browsers have become a de facto universal operating system, and JavaScript its instruction set. Unfortunately, running other languages in the browser is not generally possible. Translation to JavaScript is not enough because browsers are a hostile environment for other languages. Previous approaches are either non-portable or require extensive modifications for programs to work in a browser. This paper presents Doppio, a JavaScript-based runtime system that makes it possible to run unaltered applications written in general-purpose languages directly inside the browser. Doppio provides a wide range of runtime services, including a file system that enables local and external (cloud-based) storage, an unmanaged heap, sockets, blocking I/O, and multiple threads. We demonstrate DOPPIO's usefulness with two case studies: we extend Emscripten with Doppio, letting it run an unmodified C++ application in the browser with full functionality, and present DoppioJVM, an interpreter that runs unmodified JVM programs directly in the browser. While substantially slower than a native JVM (between 24X and 42X slower on CPU-intensive benchmarks in Google Chrome), DoppioJVM makes it feasible to directly reuse existing, non compute-intensive code.
Article
Testing large software packages can become very time intensive. To address this problem, researchers have investigated techniques such as Test Suite Minimization. Test Suite Minimization reduces the number of tests in a suite by removing tests that appear redundant, at the risk of a reduction in fault-finding ability since it can be difficult to identify which tests are truly redundant. We take a completely different approach to solving the same problem of long running test suites by instead reducing the time needed to execute each test, an approach that we call Unit Test Virtualization. With Unit Test Virtualization, we reduce the overhead of isolating each unit test with a lightweight virtualization container. We describe the empirical analysis that grounds our approach and provide an implementation of Unit Test Virtualization targeting Java applications. We evaluated our implementation, VMVM, using 20 real-world Java applications and found that it reduces test suite execution time by up to 97% (on average, 62%) when compared to traditional unit test execution. We also compared VMVM to a well known Test Suite Minimization technique, finding the reduction provided by VMVM to be four times greater, while still executing every test with no loss of fault-finding ability.
Article
For performance and for incorporating legacy libraries, many Java applications contain native-code components written in unsafe languages such as C and C++. Native-code components interoperate with Java components through the Java Native Interface (JNI). As native code is not regulated by Java's security model, it poses serious security threats to the managed Java world. We introduce a security framework that extends Java's security model and brings native code under control. Leveraging software-based fault isolation, the framework puts native code in a separate sandbox and allows the interaction between the native world and the Java world only through a carefully designed pathway. Two different implementations were built. In one implementation, the security framework is integrated into a Java Virtual Machine (JVM). In the second implementation, the framework is built outside of the JVM and takes advantage of JVM-independent interfaces. The second implementation provides JVM portability, at the expense of some performance degradation. Evaluation of our framework demonstrates that it incurs modest runtime overhead while significantly enhancing the security of Java applications.
Article
Full-text available
1. Abstract The ability of spyware to circumvent common security practices, surreptitiously exporting confidential information to remote parties and illicitly consuming system resources, is a rising security concern in government, corporate, and home computing environments. While it is the common perception that spyware infection is the result of high risk Internet surfing behavior, our research shows main-stream web sites listed in popular search engines contribute to spyware infection irrespective of patch levels and despite "safe" Internet surfing practices. Experiments conducted in July of 2005 revealed the presence of spyware in several main-stream Internet sectors as evidenced in the considerable infection of both patched and unpatched Windows XP test beds. Although the experiment emulated conservative web surfing practices by not interacting with web page links, images, or banner advertisements, spyware infection of Internet Explorer based test beds occurred swiftly through cross-domain scripting and ActiveX exploits. As many as 71 different spyware programs were identified among 6 Internet sectors. Real-estate and online travel-related web sites infected the test beds with as many as 14 different spyware programs and one bank-related web site appeared to be the source of a resource consuming dialing program. Empirical analysis suggests that spyware infection via drive-by-download attacks has thus far been unabated by security patches or even prudent web surfing behavior. At least for the moment, it appears the choice of web browser applications is the single most effective measure in preventing spyware infection via drive-by-downloads.
Conference Paper
Full-text available
Java applications often need to incorporate native-code components for efficiency and for reusing legacy code. However, it is well known that the use of native code defeats Java's security model. We describe the design and implementation of Robusta, a complete framework that provides safety and security to native code in Java applications. Starting from software-based fault isolation (SFI), Robusta isolates native code into a sandbox where dynamic linking/loading of libraries in supported and unsafe system modification and confidentiality violations are prevented. It also mediates native system calls according to a security policy by connecting to Java's security manager. Our prototype implementation of Robusta is based onNative Client and OpenJDK. Experiments in this prototype demonstrate Robusta is effective and efficient, with modest runtime overhead on a set of JNI benchmark programs. Robusta can be used to sandbox native libraries used in Java's system classes to prevent attackers from exploiting bugs in the libraries. It can also enable trustworthy execution of mobile Java programs with native libraries. The design of Robusta should also be applicable when other type-safe languages (e.g., C#, Python) want to ensure safe interoperation with native libraries
Conference Paper
Full-text available
DCG (Dynamic Code Generation) technologies have found widely applications in the Web 2.0 era, Dion Blazakis recently presented a Flash JIT-Spraying attack against Adobe Flash Player that easily circumvented DEP and ASLR protection mechanisms built in modern operating systems. We have generalized and extended JIT Spraying into DCG Spraying. Based our analyses on this abstract model of DCG Spraying, we have found that all mainstream DCG implementations (Java/ JavaScript/ Flash/ .Net/ SilverLight) are vulnerable against DCG Spraying attack, and none of the existing ad hoc defenses such as compilation optimization, random NOP padding and constant splitting provides effective protection. Furthermore, we propose a new protection method, INSeRT, which combines randomization of intrinsic elements of machine instructions and randomly planted special trapping snippets. INSeRT practically renders the "sprayed code" ineffective, while alerts the host program of ongoing attacking attempts. We implemented a prototype of INSeRT on the V8 JavaScript engine, and the performance overhead is less than 5%, which should be acceptable in practical application.
Conference Paper
Full-text available
Many of today's web sites contain substantial amounts of client-side code, and consequently, they act more like pro- grams than simple documents. This creates robustness and performance challenges for web browsers. To give users a robust and responsive platform, the browser must identify program boundaries and provide isolation between them. We provide three contributions in this paper. First, we present abstractions of web programs and program in- stances, and we show that these abstractions clarify how browser components interact and how appropriate program boundaries can be identified. Second, we identify backwards compatibility tradeoffs that constrain how web content can be divided into programs without disrupting existing web sites. Third, we present a multi-process browser architect ure that isolates these web program instances from each other, improving fault tolerance, resource management, and perfor- mance. We discuss how this architecture is implemented in Google Chrome, and we provide a quantitative performance evaluation examining its benefits and costs.
Conference Paper
Full-text available
Software Fault Isolation (SFI) is an effective approach to sandboxing binary code of questionable provenance, an interesting use case for native plugins in a Web browser. We present software fault isolation schemes for ARM and x86-64 that provide control-flow and memory integrity with average performance overhead of under 5% on ARM and 7% on x86-64. We believe these are the best known SFI implementations for these architectures, with significantly lower overhead than previous systems for similar architectures. Our experience suggests that these SFI implementations benefit from instruction-level parallelism, and have particularly small impact for work- loads that are data memory-bound, both properties that tend to reduce the impact of our SFI systems for future CPU implementations.
Conference Paper
Full-text available
When a program uses Software TransactionalMemory (STM) to synchronize accesses to shared memory, the performance often depends on which STM implementation is used. Implementation vary greatly in their underlying mechanisms, in the features they provide, and in the assumptions they make about the common case. Consequently, the best choice of algorithm is workload-dependent. Worse yet, for workload composed of multiple phases of execution, the "best choice of implementation may change during execution. We present a low-overhead system for adapting between STM implementations. Like previous work, our system enable adaptivity between different parameterizations of a given algorithm, and it allows adapting between the use of transactions and coarse-grained locks. In addition, we support dynamic switching between fundamentally different STM implementations. We also explicitly support irrevocability retry-based condition synchronization, and privatization. Through a series of experiments, we show that our system introduces negligible overhead. We also present a candidate use of dynamic adaptivity, as a replacement for contention management. When using adaptivity in this manner, STM implementations can be simplified to a great degree without lowering throughput or introducing a risk of pathological slowdown, even for challenging workloads.
Conference Paper
Full-text available
We present a way to run Objective Caml programs on a standard, unmodified web browser, with a compatible data representation and execution model, including concurrency. To achieve this, we designed a bytecode interpreter in JavaScript, as well as an implementation of the runtime library. Since the web browser does not provide the same interaction mechanisms as a typical Objective Caml environment, we provide an add-on to the standard library, enabling interaction with the web page. As a result, one can now build the client side of a web application with the standard Objective Caml compiler and run it on any modern web browser.
Conference Paper
Full-text available
In this paper, we present a new method to protect software against illegal acts of hacking. The key idea is to add a mechanism of self-modifying codes to the original program, so that the original program becomes hard to be analyzed. In the binary program obtained by the proposed method, the original code fragments we want to protect are camouflaged by dummy instructions. Then, the binary program autonomously restores the original code fragments within a certain period of execution, by replacing the dummy instructions with the original ones. Since the dummy instructions are completely different from the original ones, code hacking fails if the dummy instructions are read as they are. Moreover, the dummy instructions are scattered over the program, therefore, they are hard to be identified. As a result, the proposed method helps to construct highly invulnerable software without special hardware.
Conference Paper
Full-text available
Bugs in kernel extensions remain one of the main causes of poor operating system reliability despite proposed tech- niques that isolate extensions in separate protection domains to contain faults. We believe that previous fault isolation techniques are not widely used because they cannot iso- late existing kernel extensions with low overhead on stan- dard hardware. This is a hard problem because these ex- tensions communicate with the kernel using a complex in- terface and they communicate frequently. We present BGI (Byte-Granularity Isolation), a new software fault isolation technique that addresses this problem. BGI uses efficient byte-granularity memory protection to isolate kernel exten- sions in separate protection domains that share the same address space. BGI ensures type safety for kernel objects and it can detect common types of errors inside domains. Our results show that BGI is practical: it can isolate Win- dows drivers without requiring changes to the source code and it introduces a CPU overhead between 0 and 16%. BGI can also find bugs during driver testing. We found 28 new bugs in widely used Windows drivers.
Article
Full-text available
Array bound checking refers to determining whether all array references in a program are within their declared ranges. This checking is critical for software verification and validation because subscripting arrays beyond their declared sizes may produce unexpected results, security holes, or failures. It is available in most commercial compilers but current implementations are not as efficient and effective as one may have hoped: (1) the execution times of array bound checked programs are increased by a factor of up to 5, (2) the compilation times are increased, which is detrimental to development and debugging, (3) the related error messages do not usually carry information to locate the faulty references, and (4) the consistency between actual array sizes and formal array declarations is not often checked.This article presents two optimization techniques that deal with Points 1, 2, and 3, and a new algorithm to tackle Point 4, which is not addressed by the current literature. The first optimization technique is based on the elimination of redundant tests, to provide very accurate information about faulty references during development and testing phases. The second one is based on the insertion of unavoidable tests to provide the smallest possible slowdown during the production phase. The new algorithm ensures the absence of bound violations in every array access in the called procedure with respect to the array declarations in the calling procedure. Our experiments suggest that the optimization of array bound checking depends on several factors, not only the percentage of removed checks, usually considered as the best improvement measuring metrics. The debugging capability and compile-time and run-time performances of our techniques are better than current implementations. The execution times of SPEC95 CFP benchmarks with range checking added by PIPS, our Fortran research compiler, are slightly longer, less than 20%, than that of unchecked programs. More problems due to functional and data recursion would have to be solved to extend these results from Fortran to other languages such as C, C++, or Java, but the issues addressed in this article are nevertheless relevant.
Conference Paper
Full-text available
In this paper, we describe the techniques that have been implemented in the IBM TestaRossa (TR) just-in-time (JIT) compiler to safely perform aggressive code patching and collect accurate profiles in the context of a Java application employing multiple threads and dynamic class loading and unloading. Previous work in these areas either did not account for the synchronization cost of safety or dynamic class loading/unloading effects in a heavily multithreaded program or did not consider how different patching techniques may be required for different platforms where instruction cache coherence guarantees vary. We evaluate the space and time overhead to make our profiling framework correct, showing that privatizing the profiling variables to achieve correctness impacts execution time only minimally but it can grow the stack frames for profiled methods by less than 15% on average for the SPECjvm98 and SPECjbb2000 benchmarks. Since methods are profiled for only a brief time and the stack frames themselves are not large, we do not consider this growth to be prohibitive. The techniques reported in this paper are implemented in the 1.5.0 release of the IBM Developer Kit for Java targeting 12 different processor-operating system platforms.
Conference Paper
Full-text available
Recent research has proposed self-checksumming as a method by which a program can detect any possibly malicious modification to its code. Wurster et al. developed an attack against such programs that renders code modifications undetectable to any self-checksumming routine. The attack replicated pages of program text and altered values in hardware data structures so that data reads and instruction fetches retrieved values from different memory pages. A cornerstone of their attack was its applicability to a variety of commodity hardware: they could alter memory accesses using only a malicious operating system. In this paper, we show that their page-replication attack can be detected by self-checksumming programs with self-modifying code. Our detection is efficient, adding less than 1 microsecond to each checksum computation in our experiments on three processor families, and is robust up to attacks using either costly interpretive emulation or specialized hardware.
Article
Most current web browsers employ a monolithic architec- ture that combines \the user" and \the web" into a single protection domain. An attacker who exploits an arbitrary code execution vulnerability in such a browser can steal sen- sitive les or install malware. In this paper, we present the security architecture of Chromium, the open-source browser upon which Google Chrome is built. Chromium has two modules in separate protection domains: a browser kernel, which interacts with the operating system, and a rendering engine, which runs with restricted privileges in a sandbox. This architecture helps mitigate high-severity attacks with- out sacricing compatibility with existing web sites. We dene a threat model for browser exploits and evaluate how the architecture would have mitigated past vulnerabilities.
Conference Paper
Software vulnerabilities have had a devastating effect on the Internet. Worms such as CodeRed and Slammer can compromise hundreds of thousands of hosts within hours or even minutes, and cause millions of dollars of damage (25, 42). To successfully combat these fast auto- matic Internet attacks, we need fast automatic attack de- tection and filtering mechanisms. In this paper we propose dynamic taint analysis for au- tomatic detection of overwrite attacks, which include most types of exploits. This approach does not need source code or special compilation for the monitored program, and hence works on commodity software. To demonstrate this idea, we have implemented TaintCheck, a mechanism that can perform dynamic taint analysis by performing binary rewriting at run time. We show that TaintCheck reliably detects most types of exploits. We found that TaintCheck produced no false positives for any of the many different programs that we tested. Further, we describe how Taint- Check could improve automatic signature generation in several ways.
Article
Executing untrusted code while preserving security requires that the code be prevented from modifying memory or executing instructions except as explicitly allowed. Software-based fault isolation (SFI) or "sandboxing" enforces such a policy by rewriting the untrusted code at the instruction level. However, the original sandboxing technique of Wahbe et al. is applicable only to RISC architectures, and most other previous work is either insecure, or has been not described in enough detail to give confidence in its security properties. We present a new sandboxing technique that can be applied to a CISC architecture like the IA-32, and whose application can be checked at load-time to minimize the TCB. We describe an implementation which provides a robust security guarantee and has low runtime overheads (an average of 21% on the SPECint2000 benchmarks). We evaluate the utility of the technique by applying it to untrusted decompression modules in an archive tool, and its safety by constructing a machine-checked proof that any program approved by the verification algorithm will respect the desired safety property.
Chapter
Malicious web sites perform drive-by download attacks to infect their visitors with malware. Current protection approaches rely on black- or white-listing techniques that are difficult to keep up-to-date. As todays drive-by attacks already employ encryption to evade network level detection we propose a series of techniques that can be implemented in web browsers to protect the user from such threats. In addition, we discuss challenges and open problems that these mechanisms face in order to be effective and efficient.
Conference Paper
This paper presents a method for creating formally correct just-in- time (JIT) compilers. The tractability of our approach is demon- strated through, what we believe is the first, verification of a JIT compiler with respect to a realistic semantics of self-modifying x86 machine code. Our semantics includes a model of the instruction cache. Two versions of the verified JIT compiler are presented: one generates all of the machine code at once, the other one is incre- mental i.e. produces code on-demand. All proofs have been per- formed inside the HOL4 theorem prover.
Conference Paper
The Smalltalk-80* programming language includes dynamic storage allocation, full upward funargs, and universally polymorphic procedures; the Smalltalk-80 programming system features interactive execution with incremental compilation, and implementation portability. These features of modern programming systems are among the most difficult to implement efficiently, even individually. A new implementation of the Smalltalk-80 system, hosted on a small microprocessor-based computer, achieves high performance while retaining complete (object code) compatibility with existing implementations. This paper discusses the most significant optimization techniques developed over the course of the project, many of which are applicable to other languages. The key idea is to represent certain runtime state (both code and data) in more than one form, and to convert between forms when needed.
Conference Paper
This paper presents DTrace, a new facility for dynamic instrumentation of production systems. DTrace features the ability to dynamically instrument both user-level and kernel-level software in a unified and absolutely safe fashion. When not explicitly enabled, DTrace has zero probe effect — the system operates exactly as if DTrace were not present at all. DTrace allows for many tens of thousands of instrumentation points, with even the smallest of systems offering on the order of 30,000 such points in the kernel alone. We have developed a C-like high-level control language to describe the predicates and actions at a given point of instrumentation. The lan- guage features user-defined variables, including thread- local variables and associative arrays. To eliminate the need for most postprocessing, the facility features a scal- able mechanism for aggregating data and a mechanism for speculative tracing. DTrace has been integrated into the Solaris operating system and has been used to find serious systemic performance problems on production systems — problems that could not be found using pre- existing facilities.
Conference Paper
Most software are vulnerable to attacks, so it is easy to attack software vulnerabilities, like buer overflows, format strings, to write data to some locations that are important. So the authors developed an approach that prevent both control data attack and non control data attack to enforce data flow integrity. At first it uses static analysis to compute a data flow graph, and then it instruments the program, so it can ensure that the data flow, then if data flow integrity is violated, it will trigger an exception. Then optional optimizations are described to reduce the resulting additional overheads. So this technique can be used in practical applications for it can be applied to existing C and C++ programs automatically, and this method requires no modifications, also it does not have false positives but has low overhead.
Conference Paper
Self-modifying code (SMC), in this paper, broadly refers to anyprogram that loads, generates, or mutates code at runtime. It is widely used in many of the world's critical software systems tosupport runtime code generation and optimization, dynamic loading and linking, OS boot loader, just-in-time compilation, binary translation,or dynamic code encryption and obfuscation. Unfortunately, SMC is alsoextremely difficult to reason about: existing formal verification techniques-including Hoare logic and type system-consistentlyassume that program code stored in memory is fixedand immutable; this severely limits their applicability and power. This paper presents a simple but novel Hoare-logic-like framework that supports modular verification of general von-Neumann machine code with runtime code manipulation. By dropping the assumption that code memory is fixed and immutable, we are forced to apply local reasoningand separation logic at the very beginning, and treat program code uniformly as regular data structure. We address the interaction between separation and code memory and show how to establish the frame rules for local reasoning even in the presence of SMC. Our frameworkis realistic, but designed to be highly generic, so that it can support assembly code under all modern CPUs (including both x86 andMIPS). Our system is expressive and fully mechanized. We prove itssoundness in the Coq proof assistant and demonstrate its power by certifying a series of realistic examples and applications-all of which can directly run on the SPIM simulator or any stock x86 hardware.
Conference Paper
This paper describes the design, implementation and evaluation of Native Client, a sandbox for untrusted x86 native code. Native Client aims to give browser-based applications the computational performance of native applications without compromising safety. Native Client uses software fault isolation and a secure runtime to direct system interaction and side effects through interfaces managed by Native Client. Native Client provides operating system portability for binary code while supporting performance-oriented features generally absent from Web application programming environments, such as thread support, instruction set extensions such as SSE, and use of compiler intrinsics and hand-coded assembler. We combine these properties in an open architecture that encourages community review and 3rd-party tools.
Article
Cyclone is a type-safe programming language that provides explicit run-time code generation. The Cyclone compiler uses a template-based strategy for run-time code generation in which pre-compiled code fragments are stitched together at run time. This strategy keeps the cost of code generation low, but it requires that optimizations, such as register allocation and code motion, are applied to templates at compile time. This paper describes a principled approach to implementing such optimizations. In particular, we generalize standard flowgraph intermediate representations to support templates, define a mapping from (a subset of) Cyclone to this representation, and describe a dataflow-analysis framework that supports standard optimizations across template boundaries.
Article
Current software attacks often build on exploits that subvert machine-code execution. The enforcement of a basic safety property, control-flow integrity (CFI), can prevent such attacks from arbitrarily controlling program behavior. CFI enforcement is simple and its guarantees can be established formally, even with respect to powerful adversaries. Moreover, CFI enforcement is practical: It is compatible with existing software and can be done efficiently using software rewriting in commodity systems. Finally, CFI provides a useful foundation for enforcing further security policies, as we demonstrate with efficient software implementations of a protected shadow call stack and of access control for memory regions.
Article
Software systems have been using "just-in-time" compilation (JIT) techniques since the 1960s. Broadly, JIT compilation includes any translation performed dynamically, after a program has started execution. We examine the motivation behind JIT compilation and constraints imposed on JIT compilation systems, and present a classification scheme for such systems. This classification emerges as we survey forty years of JIT work, from 1960--2000.
Article
In the literature, performance results are frequently summarized using the arithmetic mean of performance ratios, leading, in some cases, to wrong conclusions or, at best, inappropriate statistics. The authors attempt to elucidate this inadvertent misuse of statistics in reporting results by pointing out why the arithmetic mean should not be used to summarize normalized performance numbers, and showing why the geometric mean is the more appropriate measure. They do this in the form of some simple rules for improved statistical analysis of performance benchmark results.
Article
Thesis (Ph. D.)--University of Washington, 2005. Despite decades of research in fault tolerance, commodity operating systems, such as Windows and Linux, continue to crash. In this dissertation, I describe a new reliability subsystem for operating systems that prevents the most common cause of crashes, device driver failures, without requiring changes to drivers themselves. To date, the subsystem has been used in Linux to prevent system crashes in the presence of driver failures, recover failed drivers transparently to the OS and applications, and update drivers "on the fly" without requiring a system reboot after installation. Measurements show that the system is extremely effective at protecting the OS from driver failures, while imposing little runtime overhead.
Article
The authors demonstrate how their Minimal i386 Software Fault Isolation Tool (MiSFIT) protects applications from end user extensions written in otherwise unsafe languages. They also compare the performance of unprotected code with MiSFIT-protected versions. MiSFIT can be used to fault isolate dynamically linked extensions to Web browsers, operating system extensions, or client code linked to a database server. As performance results show, by providing safety at a reasonably small overhead, MiSFIT is part of an end-to-end solution to the problem of constructing extensible systems
Article
This paper describes the motivation, architecture and performance of SPIN, an extensible operating system. SPIN provides an extension infrastructure together with a core set of extensible services that allow applications to safely change the operating system's interface and implementation. These changes can be specified with finegranularity, allowing applications to achieve a desired level of performance and functionality from the system. Extensions are dynamically linked into the operating system kernel at application runtime, enabling them to access system services with low overhead. A capabilitybased protection model that relies on language and linktime mechanisms enables the system to inexpensively export fine-grained interfaces to system services. SPIN and its extensions are written in Modula-3 and run on DEC Alpha workstations. 1 Introduction SPIN is an operating system that can be dynamically specialized to safely meet the performance and functionality requirements of applic...
Article
this paper in L a T E Xpartly supported by ARPA (ONR) grant N00014-94-1-0775 to Stanford University where John McCarthy has been since 1962. Copied with minor notational changes from CACM, April 1960. If you want the exact typography, look there. Current address, John McCarthy, Computer Science Department, Stanford, CA 94305, (email: jmc@cs.stanford.edu), (URL: http://www-formal.stanford.edu/jmc/ ) by starting with the class of expressions called S-expressions and the functions called S-functions. In this article, we first describe a formalism for defining functions recursively. We believe this formalism has advantages both as a programming language and as a vehicle for developing a theory of computation. Next, we describe S-expressions and S-functions, give some examples, and then describe the universal S-function apply which plays the theoretical role of a universal Turing machine and the practical role of an interpreter. Then we describe the representation of S-expressions in the memory of the IBM 704 by list structures similar to those used by Newell, Shaw and Simon [2], and the representation of S-functions by program. Then we mention the main features of the LISP programming system for the IBM 704. Next comes another way of describing computations with symbolic expressions, and finally we give a recursive function interpretation of flow charts. We hope to describe some of the symbolic computations for which LISP has been used in another paper, and also to give elsewhere some applications of our recursive function formalism to mathematical logic and to the problem of mechanical theorem proving. 2 Functions and Function Definitions
Article
This report describes Proof-Carrying Code, a software mechanism that allows a host system to determine with certainty that it is safe to execute a program supplied by an untrusted source. For this to be possible, the untrusted code supplier must provide with the code a safety proof that attests to the code's safety properties. The code consumer can easily and quickly validate the proof without using cryptography and without consulting any external agents. In order to gain preliminary experience with proof-carrying code, we have performed a series of case studies. In one case study, we write safe assembly-language network packet filters. These filters can be executed with no run-time overhead, beyond a one-time cost of 1 to 3 milliseconds for validating the attached proofs. The net result is that our packet filters are formally guaranteed to be safe and are faster than packet filters created using Berkeley Packet Filters, Software Fault Isolation, or safe languages such as Modula-3. In ...
Article
This paper was published in: This is a preprint of an article published in Software: Practice and Experience 32(3), pages 265-294, 2002 http://www.interscience.wiley.com/ ; in particular, VM code consists of a sequence of VM instructions. In such designs the interpretive system is divided into a front end, i.e., a compiler that produces VM code, and a VM interpreter that executes this code. The advantages of this approach are efficiency (the VM is usually designed to be interpreted with minimal interpreter overhead), and a clean interface between modules of the interpretive system. Well-known examples of virtual machines are Java's JVM [LY99], Prolog's WAM [AK91], and Smalltalk's VM [GR83]
Article
this article are those of the authors and do not reflect the views of these agencies. Authors's addresses: G. Morrisett, D. Walker, and N. Glew: 4130 Upson Hall, Ithaca, NY 148537501, USA; K. Crary: School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA. Permission to make digital/hard copy of all or part of this material without fee is granted provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery, Inc. (ACM). To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee. 2 G. Morrisett et al. 1.
Interpreter exploitation: Pointer inference and JIT spraying
  • D Blazakis
URL http://blog.chromium.org
  • Kevin Millikin
  • Florian Schneider
A new approach to the functional design of a digital computer, Papers presented at the
  • R S Barton
A framework for reducing the cost of instrumented code
  • Matthew Arnold
  • Barbara G Ryder
The structure and performance of interpreters
  • H Theodore
  • Dennis Romer
  • Geoffrey M Lee
  • Alec Voelker
  • Wayne A Wolman
  • Jean-Loup Wong
  • Brian N Baer
  • Henry M Bershad
  • Levy