Conference Paper

Parallelizing security checks on commodity hardware.

DOI: 10.1145/1346281.1346321 Conference: Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2008, Seattle, WA, USA, March 1-5, 2008
Source: DBLP

ABSTRACT Speck1 is a system that accelerates powerful security checks on commodity hardware by executing them in parallel on multiple cores. Speck provides an infrastructure that allows sequential in- vocations of a particular security check to run in parallel without sacrificing the safety of the system. Speck creates parallelism in two ways. First, Speck decouples a security check from an appli- cation by continuing the application, using speculative execution, while the security check executes in parallel on another core. Sec- ond, Speck creates parallelism between sequential invocations of a security check by running later checks in parallel with earlier ones. Speck provides a process-level replay system to deterministically and efficiently synchronize state between a security check and the original process. We use Speck to parallelize three security checks: sensitive data analysis, on-access virus scanning, and taint propaga- tion. Running on a 4-core and an 8-core computer, Speck improves performance 4x and 7.5x for the sensitive data analysis check, 3.3x and 2.8x for the on-access virus scanning check, and 1.6x and 2x for the taint propagation check.

  • [Show abstract] [Hide abstract]
    ABSTRACT: Device drivers are an Achilles' heel of modern commodity operating systems, accounting for far too many system failures. Previous work on driver reliability has focused on protecting the kernel from unsafe driver side-effects by interposing an invariant-checking layer at the driver interface, but otherwise treating the driver as a black box. In this paper, we propose and evaluate Guardrail, which is a more powerful framework for run-time driver analysis that performs decoupled instruction-grain dynamic correctness checking on arbitrary kernel-mode drivers as they execute, thereby enabling the system to detect and mitigate more challenging correctness bugs (e.g., data races, uninitialized memory accesses) that cannot be detected by today's fault isolation techniques. Our evaluation of Guardrail shows that it can find serious data races, memory faults, and DMA faults in native Linux drivers that required fixes, including previously unknown bugs. Also, with hardware logging support, Guardrail can be used for online protection of persistent device state from driver bugs with at most 10% overhead on the end-to-end performance of most standard I/O workloads.
    Proceedings of the 19th international conference on Architectural support for programming languages and operating systems; 02/2014
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Chimera uses a new hybrid program analysis to provide deterministic replay for commodity multiprocessor systems. Chimera leverages the insight that it is easy to provide deterministic multiprocessor replay for data-race-free programs (one can just record non-deterministic inputs and the order of synchronization operations), so if we can somehow transform an arbitrary program to be data-race-free, then we can provide deterministic replay cheaply for that program. To perform this transformation, Chimera uses a sound static data-race detector to find all potential data-races. It then instruments pairs of potentially racing instructions with a weak-lock, which provides sufficient guarantees to allow deterministic replay but does not guarantee mutual exclusion. Unsurprisingly, a large fraction of data-races found by the static tool are false data-races, and instrumenting them each of them with a weak-lock results in prohibitively high overhead. Chimera drastically reduces this cost from 53x to 1.39x by increasing the granularity of weak-locks without significantly compromising on parallelism. This is achieved by employing a combination of profiling and symbolic analysis techniques that target the sources of imprecision in the static data-race detector. We find that performance overhead for deterministic recording is 2.4% on average for Apache and desktop applications and about 86% for scientific applications.
    Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation; 06/2012
  • [Show abstract] [Hide abstract]
    ABSTRACT: Dynamic information flow tracking (DIFT) has shown to be an effective security measure for detecting both memory corruption attacks and semantic attacks at run-time on a wild range of systems from embedded systems and mobile devices to cloud computing. When applying DIFT to multi-thread applications running on multi-core architectures, the data processing and metadata processing are normally decoupled, i.e., being performed in different places at different times. Therefore, if the metadata access is not in the same order as data access, inconsistency issues may arise, which would reduce the security effectiveness of DIFT. Avoiding such inconsistency between data access and metadata access, i.e., maintaining metadata coherence, has become a challenging issue. In this paper, we propose METACE (METAdata Coherence Enforcement). METACE includes architectural enhancement in the memory management unit and leverages the existing cache coherence hardware and protocol to enforce metadata coherence. It introduces minimum changes to cores, coprocessors, and the memory hierarchy. It covers the complete set of data dependencies without deadlocks and is compatible with different memory consistency models. Our approach does not require modification of the source code. METACE supports out-of-order metadata access resulting in less performance degradation than previous approaches.
    Proceedings of the 2nd International Workshop on Hardware and Architectural Support for Security and Privacy; 06/2013

Full-text (2 Sources)

Available from
Jun 5, 2014