Conference Paper

Parallelizing security checks on commodity hardware

DOI: 10.1145/1346281.1346321
Conference: Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2008, Seattle, WA, USA, March 1-5, 2008
Source: DBLP


Speck is a system that accelerates powerful security checks on commodity hardware by executing them in parallel on multiple cores. Speck provides an infrastructure that allows sequential invocations of a particular security check to run in parallel without sacrificing the safety of the system. Speck creates parallelism in two ways. First, Speck decouples a security check from an application by continuing the application, using speculative execution, while the security check executes in parallel on another core. Second, Speck creates parallelism between sequential invocations of a security check by running later checks in parallel with earlier ones. Speck provides a process-level replay system to deterministically and efficiently synchronize state between a security check and the original process. We use Speck to parallelize three security checks: sensitive data analysis, on-access virus scanning, and taint propagation. Running on a 4-core and an 8-core computer, Speck improves performance 4x and 7.5x for the sensitive data analysis check, 3.3x and 2.8x for the on-access virus scanning check, and 1.6x and 2x for the taint propagation check.
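The two sources of parallelism described in the abstract can be illustrated with a small Python sketch. This is not Speck's implementation, which works at the process level with operating system support for speculation and replay; it only mirrors the structure. The names `contains_sensitive_data` and `run_with_parallel_checks`, the `SECRET` marker, and the use of a thread pool are assumptions made for illustration. The application keeps going while each check runs on a worker, later checks overlap with earlier ones, and buffered output is committed only after its check and all earlier checks have passed.

```python
from concurrent.futures import ThreadPoolExecutor


def contains_sensitive_data(chunk: bytes) -> bool:
    """Hypothetical security check: flag chunks that contain a marker."""
    return b"SECRET" in chunk


def run_with_parallel_checks(chunks, check, workers=4):
    """Process chunks speculatively while checks run on worker threads.

    The output for each chunk is buffered rather than released. It is
    committed in program order, and only if its check passed.
    Returns (committed_outputs, index_of_first_violation_or_None).
    """
    committed = []
    pending = []  # (index, future for the check, buffered output)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for i, chunk in enumerate(chunks):
            # Start the check without waiting for it, so later checks
            # can overlap with earlier ones.
            future = pool.submit(check, chunk)
            # The "application" continues speculatively past the check.
            output = chunk.upper()
            pending.append((i, future, output))

        # Commit in program order; stop at the first failed check and
        # discard all speculative output from that point on.
        for i, future, output in pending:
            if future.result():
                return committed, i
            committed.append(output)
    return committed, None


if __name__ == "__main__":
    data = [b"hello", b"world", b"a SECRET value", b"never committed"]
    print(run_with_parallel_checks(data, contains_sensitive_data))
```

Threads are used here only to keep the sketch short. A CPU-bound check in CPython would need separate processes to actually occupy several cores.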


Available from: Jason Flinn,
  • Source
    • "In this paper, we focus on solving the latter replay problem for multithreaded programs, but the principles discussed here could be applied to build a deterministic multiprocessor system as well. The ability to faithfully reproduce an execution has proven useful in many areas, including debugging [27] [47], fault tolerance [11], computer forensics [16], dynamic analysis [13] [36], and workload capture [34]. However, past solutions to deterministic replay for shared-memory multiprocessor systems have been unsatisfactory either due to performance costs [17] [29] [49], reliance on custom hardware [24] [32] [33] [54], or lack of sufficiently strong determinism guarantees [2] [38] [51] [56]. "
    ABSTRACT: Chimera uses a new hybrid program analysis to provide deterministic replay for commodity multiprocessor systems. Chimera leverages the insight that it is easy to provide deterministic multiprocessor replay for data-race-free programs (one can just record non-deterministic inputs and the order of synchronization operations), so if we can somehow transform an arbitrary program to be data-race-free, then we can provide deterministic replay cheaply for that program. To perform this transformation, Chimera uses a sound static data-race detector to find all potential data-races. It then instruments pairs of potentially racing instructions with a weak-lock, which provides sufficient guarantees to allow deterministic replay but does not guarantee mutual exclusion. Unsurprisingly, a large fraction of data-races found by the static tool are false data-races, and instrumenting each of them with a weak-lock results in prohibitively high overhead. Chimera drastically reduces this cost from 53x to 1.39x by increasing the granularity of weak-locks without significantly compromising on parallelism. This is achieved by employing a combination of profiling and symbolic analysis techniques that target the sources of imprecision in the static data-race detector. We find that performance overhead for deterministic recording is 2.4% on average for Apache and desktop applications and about 86% for scientific applications.
    Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation; 06/2012
  • Source
    • "Speck. Nightingale et al. proposed a system called Speck [40], which uses a multi-core processor to speculatively execute the untrusted program while concurrently performing security checks on an instrumented copy on another core. To synchronize the copies, Speck records all non-deterministic system calls (e.g., read) made by the instrumented copy and replays their outcome to the uninstrumented process. "
    ABSTRACT: TxBox is a new system for sandboxing untrusted applications. It speculatively executes the application in a system transaction, allowing security checks to be parallelized and yielding significant performance gains for techniques such as on-access anti-virus scanning. TxBox is not vulnerable to TOCTTOU attacks and incorrect mirroring of kernel state. Furthermore, TxBox supports automatic recovery: if a violation is detected, the sandboxed program is terminated and all of its effects on the host are rolled back. This enables effective enforcement of security policies that span multiple system calls.
    Security and Privacy (SP), 2011 IEEE Symposium on; 06/2011
  • Source
    • "Others have shown demand-driven analyses that are only enabled when operating on variables of interest to the analysis [16] [20]. There have also been works on parallelizing security checks [34] and decoupling the act of analysis from the original execution [8] [37]. These techniques do not completely solve the overhead problem. "
    ABSTRACT: This paper presents an argument for distributing dynamic software analyses to large populations of users in order to locate bugs that cause security flaws. We review a collection of dynamic analysis systems and show that, despite a great deal of effort from the research community, their performance is still too low to allow their use in the field. We then show that there are effective sampling mechanisms for accelerating a wide range of powerful dynamic analyses. These mechanisms reduce the rate at which errors are observed by individual analyses, but this loss can be offset by the subsequent increase in test population. Nevertheless, there are unsolved issues in this domain that deserve attention if this technique is to be widely utilized.