Article

Cache missing for fun and profit

To read the full-text of this research, you can request a copy directly from the author.

Abstract

We describe the construction of a channel between processes via the state of a shared memory cache, and its use in the cryptanalysis of RSA. Unlike earlier side-channel attacks involving memory caches, our attack has the remarkable property of only requiring that a single private key operation be observed. We also discuss other methods in which this channel might be abused, and provide some suggestions to processor designers, operating system vendors, and the authors of cryptographic software as to how this and related attacks could be mitigated or eliminated entirely.

... While traditionally such attacks were implemented using native code [7,29,49,58,60,79,80], recent works have demonstrated that JavaScript code in browsers can also be used to launch such attacks [24,30,57,69]. In an attempt to mitigate JavaScript-based side-channel leakage, browser vendors have mainly focused on restricting the ability of an attacker to precisely measure time [15,16,84]. ...
... The shared use of a processor, therefore, creates the opportunity for information leakage between programs or security domains [22]. Leakage could be via shared state [3,32,44,80] or via contention on either the limited state storage space [27,49,58,60] or the bandwidth of microarchitectural components [2,10,82]. Exploiting this leakage, multiple side-channel attacks have been presented, extracting cryptographic keys [2,10,11,25,32,49,58,60,65,80,82], monitoring user behavior [29,33,57,64,69], and extracting other secret information [7,36,79]. ...
... Leakage could be via shared state [3,32,44,80] or via contention on either the limited state storage space [27,49,58,60] or the bandwidth of microarchitectural components [2,10,82]. Exploiting this leakage, multiple side-channel attacks have been presented, extracting cryptographic keys [2,10,11,25,32,49,58,60,65,80,82], monitoring user behavior [29,33,57,64,69], and extracting other secret information [7,36,79]. Side-channel attacks were shown to allow leaking between processes [32,49,58,60,80], web browser tabs [24,57,69], virtual machines [37,49,80,86], and other security boundaries [7,18,36,44]. ...
Preprint
The "eternal war in cache" has reached browsers, with multiple cache-based side-channel attacks and countermeasures being suggested. A common approach for countermeasures is to disable or restrict JavaScript features deemed essential for carrying out attacks. To assess the effectiveness of this approach, in this work we seek to identify those JavaScript features which are essential for carrying out a cache-based attack. We develop a sequence of attacks with progressively decreasing dependency on JavaScript features, culminating in the first browser-based side-channel attack which is constructed entirely from Cascading Style Sheets (CSS) and HTML, and works even when script execution is completely blocked. We then show that avoiding JavaScript features makes our techniques architecturally agnostic, resulting in microarchitectural website fingerprinting attacks that work across hardware platforms including Intel Core, AMD Ryzen, Samsung Exynos, and Apple M1 architectures. As a final contribution, we evaluate our techniques in hardened browser environments including the Tor browser, DeterFox (Cao et al., CCS 2017), and Chrome Zero (Schwarz et al., NDSS 2018). We confirm that none of these approaches completely defend against our attacks. We further argue that the protections of Chrome Zero need to be more comprehensively applied, and that the performance and user experience of Chrome Zero will be severely degraded if this approach is taken.
... Fortunately, recent years have also seen an increase in the awareness of such attacks, and the availability of countermeasures to mitigate them. To start with, a large number of existing attacks (e.g., [6,8,16,27,28,38,39,81]) can be mitigated by disabling simultaneous multi-threading (SMT) and cleansing the CPU microarchitectural state (e.g., the cache) when context switching between different security domains. Second, cross-core cache-based attacks (e.g., [22,41,65]) can be blocked by partitioning the last-level cache (e.g., with Intel CAT [64,74]), and disabling shared memory between processes in different security domains [102]. ...
... The root cause of these attacks is the limited storage space of the shared microarchitectural resource. Examples of shared resources that can be used for eviction-based channels are the L1 data [63,78,81] and instruction [2,5,115] caches, the TLB [39], the branch target buffer (BTB) [26,27] and the last-level cache (LLC) [22,36,41,42,51,54,65,69,88,111,116]. ...
... In these attacks, the victim and the attacker run on the same core and their execution is interleaved. Simultaneous multithreading (SMT) approaches [3,4,8,39,63,78,81,105] rely on the victim and the attacker executing on the same core in parallel (concurrently). Multicore approaches [22,25,41,42,51,54,65,69,94,109,110,111,116] are the most generic with the victim and ... (Footnote 1: Other classifications exist, such as the historical one into storage or timing channels [1], but our classification is more useful for this paper.) ...
Preprint
We introduce the first microarchitectural side channel attacks that leverage contention on the CPU ring interconnect. There are two challenges that make it uniquely difficult to exploit this channel. First, little is known about the ring interconnect's functioning and architecture. Second, information that can be learned by an attacker through ring contention is noisy by nature and has coarse spatial granularity. To address the first challenge, we perform a thorough reverse engineering of the sophisticated protocols that handle communication on the ring interconnect. With this knowledge, we build a cross-core covert channel over the ring interconnect with a capacity of over 4 Mbps from a single thread, the largest to date for a cross-core channel not relying on shared memory. To address the second challenge, we leverage the fine-grained temporal patterns of ring contention to infer a victim program's secrets. We demonstrate our attack by extracting key bits from vulnerable EdDSA and RSA implementations, as well as inferring the precise timing of keystrokes typed by a victim user.
... Fetching data directly from this component is much faster. However, such timing differences can reveal the victim program's access traces [86,151,155]. ...
... While it is normally observed through cache hits, [32] proposed that the adversary can use cache miss information for better attack efficiency. Prime-Probe was first adopted to attack the AES encryption on the L1 data cache [149,151,155]. Then Acıiçmez et al. [1] applied it to the L1 instruction cache to check whether certain instructions are executed by the victim. ...
... By monitoring the execution trace of those branches, the adversary learns if each window is zero, and further recovers the secret. Such attacks have been realized against RSA [17,155] and ECDSA [7,9,15,72,199]. The second one is a secret-dependent data flow: the access location in the pre-computed table is determined by each window value. ...
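The two leak sources named in the excerpt above (a branch that depends on whether a window is zero, and a table lookup indexed by the window value) can be sketched in a few lines. This is an illustrative toy, not the implementation attacked in any of the cited papers: the function names `windowed_pow` and `exponent_from_trace`, the window width `w=4`, and the idea that the observer sees exact table indices are all assumptions made for the example. A real cache attacker would typically see only which cache lines or sets were touched, with noise.

```python
def windowed_pow(base, exponent, modulus, w=4):
    """Left-to-right fixed-window modular exponentiation.

    Returns (result, trace). The trace is hypothetical instrumentation that
    records, per window, the table index that was read, or None when the
    multiplication was skipped because the window was zero.
    """
    # Precompute table[i] = base**i mod modulus for every w-bit window value.
    table = [1 % modulus] * (1 << w)
    for i in range(1, 1 << w):
        table[i] = (table[i - 1] * base) % modulus

    # Split the exponent into w-bit windows, most significant window first.
    windows = []
    e = exponent
    while e:
        windows.append(e & ((1 << w) - 1))
        e >>= w
    windows.reverse()

    result = 1 % modulus
    trace = []
    for win in windows:
        for _ in range(w):
            result = (result * result) % modulus      # w squarings per window
        if win != 0:                                  # secret-dependent branch
            result = (result * table[win]) % modulus  # secret-dependent index
            trace.append(win)
        else:
            trace.append(None)
    return result, trace


def exponent_from_trace(trace, w=4):
    """Rebuild the exponent from an idealised, noise-free trace."""
    e = 0
    for idx in trace:
        e = (e << w) | (idx or 0)
    return e
```

For example, `windowed_pow(7, 0xA05, 1009)` touches table entries 10 and 5 and skips the middle window, and `exponent_from_trace` turns that trace back into `0xA05`. Constant-time implementations avoid both effects by always multiplying and by reading the table in a way that does not depend on the window value.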
Preprint
Full-text available
Side-channel attacks have become a severe threat to the confidentiality of computer applications and systems. One popular type of such attacks is the microarchitectural attack, where the adversary exploits the hardware features to break the protection enforced by the operating system and steal the secrets from the program. In this paper, we systematize microarchitectural side channels with a focus on attacks and defenses in cryptographic applications. We make three contributions. (1) We survey past research literature to categorize microarchitectural side-channel attacks. Since these are hardware attacks targeting software, we summarize the vulnerable implementations in software, as well as flawed designs in hardware. (2) We identify common strategies to mitigate microarchitectural attacks, from the application, OS and hardware levels. (3) We conduct a large-scale evaluation on popular cryptographic applications in the real world, and analyze the severity, practicality and impact of side-channel vulnerabilities. This survey is expected to inspire the side-channel research community to discover new attacks, and more importantly, propose new defense solutions against them.
... This abstraction also enables CPU vendors to introduce transparent optimizations in the microarchitecture without requiring changes in the architecture. However, these optimizations regularly introduce new side channels that attackers can exploit [3,10,56,69,74,80,86,89]. ...
... (table fragment) Evict+Reload [74]: memory accesses; Prime+Probe [74]: memory accesses ...
Preprint
In the last years, a series of side channels have been discovered on CPUs. These side channels have been used in powerful attacks, e.g., on cryptographic implementations, or as building blocks in transient-execution attacks such as Spectre or Meltdown. However, in many cases, discovering side channels is still a tedious manual process. In this paper, we present Osiris, a fuzzing-based framework to automatically discover microarchitectural side channels. Based on a machine-readable specification of a CPU's ISA, Osiris generates instruction-sequence triples and automatically tests whether they form a timing-based side channel. Furthermore, Osiris evaluates their usability as a side channel in transient-execution attacks, i.e., as the microarchitectural encoding for attacks like Spectre. In total, we discover four novel timing-based side channels on Intel and AMD CPUs. Based on these side channels, we demonstrate exploitation in three case studies. We show that our microarchitectural KASLR break using non-temporal loads, FlushConflict, even works on the new Intel Ice Lake and Comet Lake microarchitectures. We present a cross-core cross-VM covert channel that is not relying on the memory subsystem and transmits up to 1 kbit/s. We demonstrate this channel on the AWS cloud, showing that it is stealthy and noise resistant. Finally, we demonstrate Stream+Reload, a covert channel for transient-execution attacks that, on average, allows leaking 7.83 bytes within a transient window, improving state-of-the-art attacks that only leak up to 3 bytes.
... SMT is for instance implemented in Intel (Intel Corporation, 2017) (named Hyper-threading) or AMD processors (Advanced Micro Devices, Inc., 2017). In literature, Percival (2005) exploits this technology to steal cryptographic keys via the L1 data cache. A spy and a victim process, the latter of which is generating digital signatures with RSA, run on the same core, but in different threads. ...
... Although the following overview focuses on data cache attacks, the majority of approaches also affect the instruction side of the cache hierarchy. In early literature, Percival (2005) introduces a method to observe L1 data cache accesses. The author uses it to retrieve an RSA key by observing the execution of a sliding window modular exponentiation implementation. ...
... Early microarchitectural attacks focused on shared resources that are private to every processor core. The branch prediction unit (Acıiçmez et al., 2006) and core-private cache levels (Percival, 2005) have both successfully been exploited to launch attacks between processes sharing the same core. In scenarios where the adversary cannot influence the processor core affinity, the success of these attacks is considerably limited. ...
Chapter
The Internet of Things (IoT) rapidly closes the gap between the virtual and the physical world. As more and more information is processed through this expanding network, the security of IoT devices and backend services is increasingly important. Yet, side-channel attacks pose a significant threat to systems in practice, as the microarchitectures of processors, their power consumption, and electromagnetic emanation reveal sensitive information to adversaries. This chapter provides an extensive overview of previous attack literature. It illustrates that microarchitectural attacks can compromise the entire IoT ecosystem: from devices in the field to servers in the backend. A subsequent discussion illustrates that many of today's security mechanisms integrated in modern processors are in fact vulnerable to the previously outlined attacks. In conclusion to these observations, new countermeasures are needed that effectively defend against both microarchitectural and power/EM based side-channel attacks.
... This ability can be turned into various security exploitations; for example, leaking kernel memory as part of original Spectre [1] and Meltdown [2] attacks and creating key loggers or covert channels [3]. Such attacks have also been successfully employed to retrieve secret keys from various cryptographic algorithms, such as AES [4], RSA [5] and ECDSA [6]. Therefore, preventing cache attacks is a major challenge for modern micro-architectures. ...
... Cache-based side-channel attacks are particularly powerful because they are not limited to attacks on cryptosystems [4][5][6][32][33][34][35]. Recent works [36][37][38][39] have shown how cache-based side-channel attacks can bypass many countermeasures that are based on the address space layout randomization (ASLR) mechanism. ...
... Several works have shown that cache attacks are possible in all levels and all types of cache memory. The PRIME+PROBE [33] attack was initially performed on first-level data caches to attack AES [5,32,33] or instruction caches [40]. The LLC is a more interesting attack target because adversaries and victims do not need to share the same CPU. ...
Article
Full-text available
Cache timing attacks, i.e., a class of remote side-channel attack, have become very popular in recent years. Eviction set construction is a common step for many such attacks, and algorithms for building them are evolving rapidly. On the other hand, countermeasures are also being actively researched and developed. However, most countermeasures have been designed to secure last-level caches and few of them actually protect the entire memory hierarchy. Cache randomization is a well-known mitigation technique against cache attacks that has a low-performance overhead. In this study, we attempted to determine whether address randomization on first-level caches is worth considering from a security perspective. In this paper, we present the implementation of a noise-free cache simulation framework that enables the analysis of the behavior of eviction set construction algorithms. We show that randomization at the first level of caches (L1) brings about improvements in security but is not sufficient to mitigate all known algorithms, such as the recently developed Prime–Prune–Probe technique. Nevertheless, we show that L1 randomization can be combined with a lightweight random eviction technique in higher-level caches to mitigate known conflict-based cache attacks.
... Many performance optimizations are done in the microarchitecture of the CPU, i.e., the actual implementation of the instruction-set architecture (ISA). Especially data-dependent optimizations, such as caches [76], [39], [80], [117] or branch predictors [1], [2], [26], have been well-studied. These optimizations have been shown to leak meta-data, e.g., memory-access patterns of the processed data, via side channels such as timing differences [30]. ...
... These optimizations have been shown to leak meta-data, e.g., memory-access patterns of the processed data, via side channels such as timing differences [30]. Traditionally, such microarchitectural attacks, e.g., cache attacks [117], [76], were mainly used to attack cryptographic algorithms, where the processing of secrets led to secret-dependent memory accesses [59], [77], [98], [7], [80], [76], [120], [52]. As a result, many cryptographic libraries are nowadays resilient against side-channel attacks [7], [20]. ...
... Cache attacks exploit the access-time differences for cached and uncached data. Since their introduction in ... cryptographic primitives [59], [77], [98], [7], [80], [76], user interactions [89], [62], [108], or as building blocks for transient-execution attacks [64], [58], [13]. ...
Preprint
Full-text available
In the quest for efficiency and performance, edge-computing providers eliminate isolation boundaries between tenants, such as strict process isolation, and instead let them compute in a more lightweight multi-threaded single-process design. Edge-computing providers support a high number of tenants per machine to reduce the physical distance to customers without requiring a large number of machines. Isolation is provided by sandboxing mechanisms, e.g., tenants can only run sandboxed V8 JavaScript code. While this is as secure as a sandbox for software vulnerabilities, microarchitectural attacks can bypass these sandboxes. In this paper, we show that it is possible to mount a Spectre attack on such a restricted environment, leaking secrets from co-located tenants. Cloudflare Workers is one of the top three edge-computing solutions and handles millions of HTTP requests per second worldwide across tens of thousands of web sites every day. We demonstrate a remote Spectre attack using amplification techniques in combination with a remote timing server, which is capable of leaking 120 bit/h. This motivates our main contribution, Dynamic Process Isolation, a process isolation mechanism that only isolates suspicious worker scripts following a detection mechanism. In the worst case of only false positives, Dynamic Process Isolation simply degrades to process isolation. Our proof-of-concept implementation augments a real-world cloud infrastructure framework, Cloudflare Workers, which is used in production at large scale. With a false-positive rate of only 0.61%, we demonstrate that our solution vastly outperforms strict process isolation in terms of performance. In our security evaluation, we show that Dynamic Process Isolation statistically provides the same security guarantees as strict process isolation, fully mitigating Spectre attacks between multiple tenants.
... Disable Hardware Threading [70], [150]: a way to reduce the cost of flushing ...
... Prime+Probe attacks are actually harder to perform on the LLC than on the L1 cache. This is due to the perceptibility of processor-memory activity at the LLC [38], the difficulty of priming and probing the entire LLC [3], [4], [5], [7], [70], [199], [200], the need to classify the cache sets related to the victim's security-critical program, and the probing resolution. The Prime+Probe technique [18], [70] is the usual way of exploiting contemporary set-associative caches. ...
... This is due to the perceptibility of processor-memory activity at the LLC [38], the difficulty of priming and probing the entire LLC [3], [4], [5], [7], [70], [199], [200], the need to classify the cache sets related to the victim's security-critical program, and the probing resolution. The Prime+Probe technique [18], [70] is the usual way of exploiting contemporary set-associative caches. This technique has been used to exploit different levels of cache such as the L1-data (L1-D) cache [18], [70], the L1-instruction (L1-I) cache [201] and the Last Level Cache (LLC) [171]. ...
Thesis
Access-driven cache-based side-channel attacks, a sub-category of SCAs, are strong cryptanalysis techniques that break cryptographic algorithms by targeting their implementations. Despite valiant efforts, mitigation techniques against such attacks are not very effective. This is mainly because most mitigation techniques usually protect against any given specific vulnerability and do not take a system-wide approach. Moreover, these solutions either completely remove or greatly reduce the prevailing performance benefits in computing systems that are hard earned over many decades. This thesis presents arguments in favor of enhancing security and privacy in modern computing architectures while retaining the performance benefits. The thesis argues in favor of a need-based protection, which would allow the operating system to apply mitigation only after successful detection of CSCAs. Thus, detection can serve as a first line of defense against such attacks. However, for detection-based protection strategy to be effective, detection needs to be highly accurate, should incur minimum system overhead at run-time, should cover a large set of attacks and should be capable of early stage detection, i.e., before the attack completes. This thesis proposes a complete framework for detection-based protection. At first, the thesis presents a highly accurate, fast and lightweight detection framework to detect a large set of Cache-based SCAs at run-time under variable system load conditions. In the follow up, the thesis demonstrates the use of this detection framework through the proposition of an OS-level run-time detection-based mitigation mechanism for Linux general-purpose distributions. Though the proposed mitigation mechanism is proposed for Linux general-purpose distributions, which are widely used on commodity hardware, the solution is scalable to other operating systems. We provide extensive experiments to validate the proposed detection framework and mitigation mechanism.
This thesis demonstrates that security and privacy are system-wide concerns and the mitigation solutions must take a holistic approach.
... With the rise of sophisticated secure software, the adversaries have turned to hardware-based attacks for retrieving secret information. One category of hardware-based attacks that have gained a lot of attention in the past decade is: cache-based timing channel attack [1]. Caches have been the major target of these attacks because of the following reasons. ...
... Cryptography algorithms like AES [2], Rivest-Shamir-Adleman encryption (RSA) [3], and Elliptic Curve Digital Signature Algorithm (ECDSA) [4] have been the primary target of these attacks. The attacker uses the cache access pattern during encryption and decryption process to leak the private key of these ciphers [1,5,6]. ...
Article
Full-text available
Cache timing channel attacks have attracted a lot of attention in the last decade. These attacks exploit the timing channel created by the significant time gap between cache and main memory accesses. They have been successfully used to leak the secret keys of various cryptographic algorithms. The latest advancements in cache attacks also exploit other micro-architectural components such as hardware prefetchers, the branch predictor, and the replacement engine, in addition to the cache memory. Detection of these attacks is a difficult task, as the attacker process running on the processor must be detected before a significant portion of the attack is complete. The major challenge for mitigation and defense mechanisms against these attacks is maintaining system performance while disabling or avoiding these attacks. The overhead caused by detection, mitigation and defense mechanisms must not be significant relative to the system's performance. This paper discusses in detail the research carried out in three aspects of cache security: cache timing channel attacks, detection techniques for these attacks, and defense mechanisms.
... Such side channel attacks exploit subtle timing variations resulting from contention on CPU micro-architectural resources to extract otherwise-unavailable secret information [221,42,217,265,292,18,84,146,85,222,104,296,291]. Recent works, Meltdown [172] and Spectre [155], combine micro-architectural attacks with speculative execution, allowing the attacker to read the entire address space of victim processes (if they contain vulnerable code) ...
... Such side channel attacks exploit subtle timing variations resulting from contention on CPU micro-architectural resources to extract otherwise-unavailable secret information. Since their introduction over a decade ago [221,42,217,265], micro-architectural attacks have been used to break numerous cryptographic implementations [293,87,145], track user behaviors [215,171,104], and create covert channels [103,277]. Moreover, recent works combine micro-architectural attacks with speculative execution [155,124,172], allowing the attacker to read the entire address space of victim processes or of the operating system. ...
Thesis
A plethora of major security incidents, in which personal identifiers belonging to hundreds of millions of users were stolen, demonstrate the importance of improving the security of cloud systems. To increase security in the cloud environment, where resource sharing is the norm, we need to rethink existing approaches from the ground up. This thesis analyzes the feasibility and security of trusted execution technologies as the cornerstone of secure software systems, to better protect users' data and privacy. Trusted Execution Environments (TEEs), such as Intel SGX, have the potential to minimize the Trusted Computing Base (TCB), but they also introduce many challenges for adoption. Among these challenges are TEEs' significant impact on applications' performance and the non-trivial effort required to migrate legacy systems to run on these secure execution technologies. Other challenges include managing a trustworthy state across a distributed system and ensuring these individual machines are resilient to micro-architectural attacks. In this thesis, I first characterize the performance bottlenecks imposed by SGX and suggest optimization strategies. I then address two main adoption challenges for existing applications: managing permissions across a distributed system and scaling SGX's mechanism for proving authenticity and integrity. I then analyze the resilience of trusted execution technologies to speculative execution, micro-architectural attacks, which put cloud infrastructure at risk. This analysis revealed a devastating security flaw in Intel's processors which is known as Foreshadow/L1TF. Finally, I propose a new architectural design for out-of-order processors which defeats all known speculative execution attacks.
... Especially caches, buffering memory recently accessed, are a popular attack target, with generic attack techniques like Flush+Reload enabling attacks with a high spatial and temporal resolution and high accuracy. The first cache attacks were focused on cryptographic algorithms [73], [99], [7], [75], [71], [58], [41], [47], [43], [42], [48]. In the last decade, the focus has been extended to non-cryptographic applications that still operate on secret data, e.g., breaking address-space layout randomization [46], [36], [6], [27], [57], [54], attacking secure enclaves [35], [10], [65], [86], [25], spying on websites and user input [61], [89], [107], and covert channels [114], [115], [61], [62], [85]. ...
... Kocher [56] demonstrated that timing attacks on cryptographic primitives are possible i.e., via caches or non constant-time arithmetic operations. Cache attacks were used to attack cryptographic primitives [56], [73], [99], [7], [75], [71], [58], [41], [47], [43], [42], [48], to break the integrity of secure enclaves [35], [10], [65], [86], [25], monitor user interaction and keystrokes [61], [89], [107], and build stealthy and fast covert channels [114], [61], [62], [85]. Two main techniques on cache attacks evolved with Prime+Probe [71] and Flush+Reload [115]. ...
Preprint
Full-text available
Cache template attacks demonstrated automated leakage of user input in shared libraries. However, for large binaries, the runtime is prohibitively high. Other automated approaches focused on cryptographic implementations and media software but are not directly applicable to user input. Hence, discovering and eliminating all user input side-channel leakage on a cache-line granularity within huge code bases are impractical. In this paper, we present a new generic cache template attack technique, LBTA, layered binary templating attacks. LBTA uses multiple coarser-grained side channel layers as an extension to cache-line granularity templating to speed up the runtime of cache templating attacks. We describe LBTA with a variable number of layers with concrete side channels of different granularity, ranging from 64 B to 2MB in practice and in theory beyond. In particular the software-level page cache side channel in combination with the hardware-level L3 cache side channel, already reduces the templating runtime by three orders of magnitude. We apply LBTAs to different software projects and thereby discover data deduplication and dead-stripping during compilation and linking as novel security issues. We show that these mechanisms introduce large spatial distances in binaries for data accessed during a keystroke, enabling reliable leakage of keystrokes. Using LBTA on Chromium-based applications, we can build a full unprivileged cache-based keylogger. Our findings show that all user input to Chromium-based apps is affected and we demonstrate this on a selection of popular apps including Signal, Threema, Discord, and password manager apps like passky. As this is not a flaw of individual apps but the framework, we conclude that all apps that use the framework will also be affected, i.e., hundreds of apps.
... Attackers can also exploit hardware threading to examine a competing thread's L1 cache usage in real time [14] [36]. Simultaneous multithreading (the sharing of the operation resources of a superscalar processor between multiple execution threads) is a feature implemented in Intel Pentium 4 processors. ...
... Such shared access to memory caches can facilitate side channels and enable a malign thread with restricted privilege to scan the operation of another thread. In turn, this results in allowing the attackers to steal cryptographic keys [15] [26] [36]. Additionally, by exploiting side-channel information based on CPU delay, adversaries could potentially mount TBSCAs against the Data Encryption Standard (DES) implemented in some applications. ...
Chapter
Despite its many technological and economic benefits, Cloud Computing poses complex security threats resulting from the use of virtualisation technology. Compromising the security of any component in the cloud virtual infrastructure will negatively affect the security of other elements and so impact the overall system security. By characterising the diversity of cyber-attacks carried out in the Cloud, this paper aims to provide an analysis of both common and underexplored security threats associated with the cloud from a technical viewpoint. Accordingly, the paper will suggest emerging solutions that can help to address such threats. The paper also offers future research directions for cloud security that we hope can inspire the research community to develop more effective security solutions for cloud systems.
... The Prime+Probe [38,41] cache side-channel does not require the existence of shared memory. The attacker primes the cache sets with its own data, and probes whether these cache sets are still occupied after the victim program has been scheduled. ...
... Cache side-channels, e.g., Evict+Time [38], Flush+Flush [23], Flush+Reload [63], Prime+Probe [38,41], utilize the timing variance caused by different memory behaviors. The large latency difference between a cache hit and a cache miss allows for high resolution, low noise observations. ...
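The prime and probe steps described in the excerpts above can be illustrated with a toy cache model. This is a minimal simulation sketch, not an attack: it counts hits and misses in a simulated set-associative LRU cache instead of measuring real access latency, and the geometry (`NUM_SETS`, `WAYS`, `LINE_SIZE`), the class `ToyCache`, and the helper names are all assumptions chosen for readability.

```python
from collections import OrderedDict

LINE_SIZE = 64   # bytes per cache line (assumed)
NUM_SETS = 8     # number of cache sets (assumed, tiny for illustration)
WAYS = 4         # associativity (assumed)


class ToyCache:
    """Set-associative cache with LRU replacement; models hits and misses only."""

    def __init__(self):
        self.sets = [OrderedDict() for _ in range(NUM_SETS)]

    def access(self, addr):
        """Touch addr and return True on a hit, False on a miss."""
        line = addr // LINE_SIZE
        lines = self.sets[line % NUM_SETS]
        if line in lines:
            lines.move_to_end(line)       # refresh the LRU position
            return True
        if len(lines) >= WAYS:
            lines.popitem(last=False)     # evict the least recently used line
        lines[line] = True
        return False


def eviction_addresses(set_index):
    """WAYS attacker-owned addresses that all map to set_index."""
    return [(set_index + way * NUM_SETS) * LINE_SIZE for way in range(WAYS)]


def prime(cache, set_index):
    """Fill one cache set with the attacker's own lines."""
    for addr in eviction_addresses(set_index):
        cache.access(addr)


def probe(cache, set_index):
    """Count misses on the attacker's lines; a nonzero count means the set was disturbed."""
    return sum(not cache.access(addr) for addr in eviction_addresses(set_index))


def observe(victim_addr):
    """Prime every set, let the victim make one access, then probe every set."""
    cache = ToyCache()
    for s in range(NUM_SETS):
        prime(cache, s)
    cache.access(victim_addr)             # the victim runs between prime and probe
    return [s for s in range(NUM_SETS) if probe(cache, s) > 0]
```

Calling `observe((1 << 20) + 5 * LINE_SIZE)` reports set 5 as disturbed, because the victim's line maps to that set and evicts one of the attacker's lines. On real hardware the probe step would time each access and compare it against a hit/miss threshold, which is where the latency difference mentioned above comes in.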
Preprint
Modern processor designs use a variety of microarchitectural methods to achieve high performance. Unfortunately, new side-channels have often been uncovered that exploit these enhanced designs. One area that has received little attention from a security perspective is the processor's hardware prefetcher, a critical component used to mitigate DRAM latency in today's systems. Prefetchers, like branch predictors, hold critical state related to the execution of the application, and have the potential to leak secret information. But up to now, there has not been a demonstration of a generic prefetcher side-channel that could be actively exploited in today's hardware. In this paper, we present AfterImage, a new side-channel that exploits the Intel Instruction Pointer-based stride prefetcher. We observe that, when the execution of the processor switches between different private domains, the prefetcher trained by one domain can be triggered in another. To the best of our knowledge, this work is the first to publicly demonstrate a methodology that is both algorithm-agnostic and also able to leak kernel data into userspace. AfterImage is different from previous works, as it leaks data on the non-speculative path of execution. Because of this, a large class of work that has focused on protecting transient, branch-outcome-based data will be unable to block this side-channel. By reverse-engineering the IP-stride prefetcher in modern Intel processors, we have successfully developed three variants of AfterImage to leak control flow information across code regions, processes and the user-kernel boundary. We find a high level of accuracy in leaking information with our methodology (from 91%, up to 99%), and propose two mitigation techniques to block this side-channel, one of which can be used on hardware systems today.
... A microarchitectural attack makes use of shared hardware resources to leak information across isolation boundaries. They have been used in a variety of applications, from creating covert channels [32], retrieving secret keys of ciphers [13,58], reading Operating System data [38,43], fingerprinting websites [65], logging keystrokes [64], reverse engineering Deep Learning algorithms [30], and breaking Address Space Layout Randomization [11,27]. The last couple of years have seen these attack vectors applied on a range of devices from mobile phones [42] to third-party cloud platforms [64]. ...
... Thread t2 gets flagged for an epoch but recovers, while thread t3 remains unaffected. In a typical micro-architectural attack, the attacker runs a program called the spy that contends with a victim program for shared hardware resources such as a common Cache Memory [6,28,42,45,58], Branch Prediction Unit (BPU) [7,23], Translation Lookaside Buffer (TLB) [26], or DRAM [37,59]. The contention affects the spy's execution time in a manner that correlates with the victim's execution. ...
Preprint
Micro-architectural attacks use information leaked through shared resources to break hardware-enforced isolation. These attacks have been used to steal private information ranging from cryptographic keys to privileged Operating System (OS) data in devices ranging from mobile phones to cloud servers. Most existing software countermeasures either have unacceptable overheads or considerable false positives. Further, they are designed for specific attacks and cannot readily adapt to new variants. In this paper, we propose a framework called LEASH, which works from the OS scheduler to stymie micro-architectural attacks with minimal overheads, negligible impact of false positives, and is capable of handling a wide range of attacks. LEASH works by starving maliciously behaving threads at runtime, providing insufficient time and resources to carry out an attack. The CPU allocation for a falsely flagged thread found to be benign is boosted to minimize overheads. To demonstrate the framework, we modify Linux's Completely Fair Scheduler with LEASH and evaluate it with seven micro-architectural attacks ranging from Meltdown and Rowhammer to a TLB covert channel. The runtime overheads are evaluated with a range of real-world applications and found to be less than 1% on average.
... Since all memory accesses go through the cache hierarchy, this correlation between secret data and memory accesses creates a cache side-channel. RSA (Rivest-Shamir-Adleman) [23] was also shown to be vulnerable to cache side-channel attacks [21,29]. The RSA algorithm requires modular exponentiation operations. ...
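The link between modular exponentiation and leakage can be illustrated with a minimal sketch. It uses the textbook left-to-right square-and-multiply algorithm, which is an assumption made for illustration and not necessarily the implementation targeted by the cited attacks. The function names and the `"S"`/`"M"` trace encoding are hypothetical; the trace stands in for what an attacker might infer from cache activity.

```python
def square_and_multiply(base, exponent, modulus):
    """Textbook left-to-right modular exponentiation.

    Also returns a trace of operations: "S" for a squaring and "M" for a
    multiplication. The multiplication only happens for 1 bits, so the
    sequence of operations depends on the secret exponent.
    """
    result = 1
    trace = []
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus
        trace.append("S")
        if bit == "1":
            result = (result * base) % modulus
            trace.append("M")
    return result, "".join(trace)


def exponent_from_trace(trace):
    """Rebuild the exponent from an idealized, noise-free operation trace."""
    bits = []
    i = 0
    while i < len(trace):
        # Each exponent bit starts with "S"; a following "M" marks a 1 bit.
        if i + 1 < len(trace) and trace[i + 1] == "M":
            bits.append("1")
            i += 2
        else:
            bits.append("0")
            i += 1
    return int("".join(bits), 2)


if __name__ == "__main__":
    value, trace = square_and_multiply(7, 0b1011, 1000003)
    print(value == pow(7, 0b1011, 1000003))
    print(exponent_from_trace(trace) == 0b1011)
```

In this idealized setting a single observed trace determines the whole exponent. Real measurements are noisy, and production libraries typically use windowed exponentiation, so an actual attack recovers only partial information per observation.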
... As highlighted in Figs. 20,21,22,23,24, and 25, the effect of reconfiguration is virtually non-existent after the prime phase is completed. However, this may not be completely accurate, since the results are based on the assumption that the combined effect of reconfiguration and flushing is the sum of the individual effects, which has not been fully validated. ...
Article
Full-text available
Side-channel attacks exploit architectural features of computing systems and algorithmic properties of applications executing on these systems to steal sensitive information. Cache side-channel attacks are more powerful and practical compared to other classes of side-channel attacks due to several factors, such as the ability to be mounted without physical access to the system. Some secure cache architectures have been proposed to counter side-channel attacks. However, they all incur significant performance overheads. This work explores the viability of using adaptive caches, which are conventionally used as a performance-oriented architectural feature, as a defense mechanism against cache side-channel attacks. We conduct an empirical analysis, starting from establishing a baseline for the attacker’s ability to infer information regarding the memory accesses of the victim process when there is no active defense mechanism in place and the attacker is fully aware of all the cache parameters. Then, we analyze the effectiveness of the attack without complete knowledge of the cache configuration. Finally, based on the insight that the success of the attack is heavily dependent on knowledge of the cache configuration, we formulate a cache monitoring and user-defined events detection methodology, implement a generalized run-time cache reconfiguration technique, and observe their effect on successfully detecting and mitigating attacks on the cache subsystem. We observe that reconfiguring different cache parameters during a side-channel attack reduces the accuracy of the attack in detecting cache sets accessed by the victim by 44% on average, with a maximum of 90% reduction.
... Cache attacks are one of the well-known categories of side-channel attacks that seek information about the memory locations that have been accessed by a victim program [1,2,37-44]. Cache attacks exploit the fact that the accesses of a victim process to memory can be monitored by a spy process if a cache is shared between them. ...
Figure 3. Prime+Probe attack: (a) priming of the intended cache set by the attacker with the data that is suspected to be accessed by the victim; (b) victim access to the intended cache set; (c) probing of the cache sets again by the attacker [50].
... January 2022, Volume 14, Number 1 (pp. 27-46) ...
... • Eviction-based attacks: Prime+Probe [45], [49]-[51] can detect eviction of lines in the same cache set(s) as the lines accessed by the victim. • Reload-based attacks: Flush+Reload [46], [52]-[54] and Evict+Time [45] rely on detecting whether a shared line was recently accessed by the victim. ...
Conference Paper
Full-text available
Spectre and Meltdown attacks reveal the perils of speculative execution, a prevalent technique used in modern processors. This paper proposes ReViCe, a hardware technique to mitigate speculation based attacks. ReViCe allows speculative loads to update caches early but keeps any replaced line in the victim cache. In case of misspeculation, replaced lines from the victim cache are used to restore the caches, thereby preventing any cache-based Spectre and Meltdown attacks. Moreover, ReViCe injects jitter to conceal any timing difference due to speculative lines. Together speculation restoration and jitter injection allow ReViCe to make speculative execution secure. We present the design of ReViCe following a set of security principles and evaluate its security based on shared-core and cross-core attacks exploiting various Spectre variants and cache side channels. Our scheme incurs 2-6% performance overhead. This is less than the state-of-the-art hardware approaches. Moreover, ReViCe achieves these results with minimal area and energy overhead (0.06% and 0.02% respectively).
... Instead, the cache is filled with a window of the missing memory line. Percival [40] suggests that systems that exploit hyperthreading are susceptible to any attacks from an adversary. The effect of disabling hyperthreading is notable on the system's throughput, but the trade-off between security and performance is acceptable. ...
Article
Full-text available
Multilevel cache architectures are widely used in modern heterogeneous systems for performance improvement. However, satisfying the performance and security requirements at the same time is a challenge for such systems. A simple and efficient timing attack on the shared portions of multilevel hierarchical caches and its corresponding countermeasure is proposed here. The proposed attack prolongs the execution time of the victim threads by inducing intentional race conditions in shared memory spaces. Then, a thread-mapping algorithm to detect such race conditions between a group of threads and resolve them as a countermeasure against the attack is proposed. The proposed countermeasure dynamically monitors races on cache blocks and distributes existing and new threads on processing cores to minimize cache contention. Upon detection of a high contention rate that might be either due to an attack or a natural race condition, two mechanisms, namely cache access-rate reduction and thread migration, will be used by the countermeasure algorithm to resolve the race situation. Evaluations on SPECCPU 2006 benchmark suite show that the proposed algorithm not only protects the system against the introduced attack but also boosts the overall system performance by an average of 46.35% and 55.92% for the worst and average cases, respectively.
... In cloud service infrastructures, virtual machines share the underlying hardware resources and isolation is usually not provided by default. Stealthy cache attacks such as FLUSH+RELOAD [29,31,70], PRIME+PROBE [33,38,47,56,57], EVICT+TIME [56] and others [18] are considered practical; this allows adversaries to steal secret AES [38], RSA [70], and ElGamal [47] cryptographic keys, spy over encrypted channels [34], and log keys [30]. A similar scenario also occurs in smart phones, where a malicious application can learn side-channel information about the system through shared resources such as caches [43]. ...
Preprint
Users are demanding increased data security. As a result, security is rapidly becoming a first-order design constraint in next generation computing systems. Researchers and practitioners are exploring various security technologies to meet user demand such as trusted execution environments (e.g., Intel SGX, ARM TrustZone), homomorphic encryption, and differential privacy. Each technique provides some degree of security, but differs with respect to threat coverage, performance overheads, as well as implementation and deployment challenges. In this paper, we present a systemization of knowledge (SoK) on these design considerations and trade-offs using several prominent security technologies. Our study exposes the need for software-hardware-security codesign to realize efficient and effective solutions of securing user data. In particular, we explore how design considerations across applications, hardware, and security mechanisms must be combined to overcome fundamental limitations in current technologies so that we can minimize performance overhead while achieving sufficient threat model coverage. Finally, we propose a set of guidelines to facilitate putting these secure computing technologies into practice.
... Memory components such as DRAM [29] and cache [43] are not the only microarchitectural attack surfaces. Spectre attacks on the branch prediction unit [30,38] imply that side channels such as caches can be used as a primitive for more advanced attacks on speculative engines. ...
Conference Paper
Full-text available
Modern microarchitectures incorporate optimization techniques such as speculative loads and store forwarding to improve the memory bottleneck. The processor executes the load speculatively before the stores, and forwards the data of a preceding store to the load if there is a potential dependency. This enhances performance since the load does not have to wait for preceding stores to complete. However, the dependency prediction relies on partial address information, which may lead to false dependencies and stall hazards. In this work, we are the first to show that the dependency resolution logic that serves the speculative load can be exploited to gain information about the physical page mappings. Microarchitectural side-channel attacks such as Rowhammer and cache attacks like Prime+Probe rely on the reverse engineering of the virtual-to-physical address mapping. We propose the SPOILER attack which exploits this leakage to speed up this reverse engineering by a factor of 256. Then, we show how this can improve the Prime+Probe attack by a 4096 factor speed up of the eviction set search, even from sandboxed environments like JavaScript. Finally, we improve the Rowhammer attack by showing how SPOILER helps to conduct DRAM row conflicts deterministically with up to 100% chance, and by demonstrating a double-sided Rowhammer attack with normal user's privilege. The latter is due to the possibility of detecting contiguous memory pages using the SPOILER leakage.
... Workaround -Software partitioning [11], [8], [9]; Noise injection [29], [26]; Auditing hardware counters [34] Probing memory accesses [35]; Auditing hardware counters [30] Unavailable Temporal isolation [27]; Spatial isolation [19], [20], [29], [13]; Restricting clflush [21]; Close-page policy [23] VM clusters [30] -Temporary fix -Disabling SMT [15]; Dedicated instances [1] and takes reactive measures accordingly. Yet, this approach can result in a high number of false positives depending on the workload. ...
Conference Paper
Full-text available
Microarchitectural cross-VM covert channels are software-launched attacks which exploit multi-tenant environments' shared hardware. They enable transmitting information from a compromised system when the information flow policy does not allow it. These attacks represent a threat to the confidentiality and integrity of data processed and stored on cloud platforms. Although potentially severe, covert channels tend to be overlooked due to an allegedly strong adversary model. The literature focuses on mechanisms for encoding information through timing variations, without addressing practical considerations. Furthermore, the field lacks a realistic evaluation framework. Covert channels are usually compared to each other using the channel capacity. While a valuable performance metric, the capacity is inadequate to assess the severity of an attack. In this paper, we conduct a comprehensive study on the severity of microarchitectural covert channels in public clouds. State-of-the-art attacks are evaluated against the Common Vulnerability Scoring System in its most recent version (CVSS v3.1). The study shows that a medium severity score of 5.0 is achieved. In comparison, the SSLv3 POODLE (CVE-2014-3566) and OpenSSL Heartbleed (CVE-2014-0160) vulnerabilities achieved respective scores of 3.1 and 7.5. As such, the paper successfully demonstrates that covert channels are not theoretical threats, and that they require the immediate attention of the community. Furthermore, we devise a new and independent scoring system, the Covert Channel Scoring System (CCSS). The scoring of related works under the CCSS shows that cache-based covert channels, although more and more popular, are the least practical ones to deploy. We encourage authors of future cross-VM covert channel attacks to include a CCSS metric in their study, in order to account for deployment constraints and provide a fair point of comparison for the adversary model.
... In the most famous examples, an adversary exploits physical leakages of an electronic device such as its power consumption [KJJ99] or Electromagnetic (EM) emanations [QS01] to recover a cryptographic key. Many other side-channels have been pointed out in the literature such as timing attacks [BT11], cache monitoring [Per05] or even network packets length analysis [SSH+14]. In any case, the problem can be reduced to the following form: an adversary is able to learn realizations of a leakage variable L, often called a trace, and aims at using it to infer information about another related secret variable S. ...
... With the increase of shared architectures (e.g. infrastructure as a service) arise more powerful attacks, where an attacker can monitor the cache of the victim and recover information on secret-dependent memory accesses [39,195,199]. More generally, microarchitectural attacks enable an attacker to observe changes to the microarchitectural state [3,104,103,5,225,115]. ...
Thesis
Programs commonly perform computations involving secret data, relying on cryptographic code to guarantee their confidentiality. In this context, it is crucial to ensure that cryptographic code cannot be exploited by an attacker to leak secret data. Unfortunately, even if cryptographic algorithms are based upon secure mathematical foundations, their execution in the physical world can produce side-effects that can be exploited to recover secrets. In particular, an attacker can exploit the execution time of a program to leak secret data, or use timing to recover secrets encoded in the microarchitecture using microarchitectural timing attacks. More recently, Spectre attacks showed that it is also possible to exploit processor optimizations, in particular speculation mechanisms, to leak secret data. In this thesis, we develop automated program analyses for checking confidentiality of secret data in cryptographic code. In particular, we target three crucial properties of cryptographic implementations: (1) secret-erasure, which ensures that secret data are erased from memory at the end of the program; (2) constant-time, which protects against microarchitectural timing attacks; (3) speculative constant-time, which protects against Spectre attacks. These properties have two characteristics in common that make them challenging to analyze. First, they are properties of pairs of traces (namely 2-hypersafety), which makes them incompatible with the standard verification framework (designed for safety properties). Second, they are not always preserved by compilers. Our goal in this thesis is to design automatic symbolic analyses for pairs of traces, operating at binary-level, including processor speculations, and that scale on real-world cryptographic code.
Our analyses are built on top of relational symbolic execution (the adaptation of symbolic execution to 2-hypersafety), which we complement with dedicated optimizations: (1) for efficient relational symbolic execution at binary-level, (2) for modeling efficiently the speculative semantics of programs. We implement our analyses into two open-source tools: Binsec/Rel, a tool for bug-finding and bounded-verification of constant-time and secret-erasure; and Binsec/Haunted, a tool for detecting vulnerabilities to Spectre attacks. Our experimental evaluation shows that our optimizations drastically improve performance over prior approaches, allowing for faster analysis, finding more bugs, and enabling bounded-verification on real-world cryptographic primitives whereas prior approaches time out. Using our tools, we analyze a wide range of cryptographic primitives and utility functions from open-source libraries (including Libsodium, OpenSSL, BearSSL and HACL*) for constant-time (338 binaries), secret-erasure (680 binaries), and Spectre (45 binaries). Along the way, we discover a few new bugs, as well as weaknesses in standard protection schemes.
... Prime+Probe: This cache attack methodology measures the difference in time it takes to refill a given cache set. It does not rely on any shared memory addresses between the attacker and the victim [Per05,OST06]. Apart from cryptographic protocols, Prime+Probe has also been demonstrated to be effective at obtaining sensitive information in cloud environments [LYG+15, IAES15, YKSA15]. ...
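The Prime+Probe idea described above can be sketched with a toy simulation. Everything here is an assumption for illustration: `ToyCache` is a hypothetical set-associative cache with LRU replacement, addresses map to sets by a simple modulo, and the probe counts misses directly instead of measuring refill time as a real attacker would.

```python
from collections import OrderedDict


class ToyCache:
    """Hypothetical set-associative cache with LRU replacement."""

    def __init__(self, num_sets=4, ways=2):
        self.num_sets = num_sets
        self.ways = ways
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, address):
        """Return True on a hit and False on a miss, updating LRU order."""
        lines = self.sets[address % self.num_sets]
        if address in lines:
            lines.move_to_end(address)
            return True
        if len(lines) >= self.ways:
            lines.popitem(last=False)  # evict the least recently used line
        lines[address] = None
        return False


def prime_probe(cache, target_set, victim_address=None):
    """Prime one set, let the victim run, then count misses while probing."""
    # Attacker-owned addresses that all map to target_set; no memory is
    # shared with the victim.
    attacker_lines = [target_set + cache.num_sets * (k + 1000)
                      for k in range(cache.ways)]
    for address in attacker_lines:          # prime: fill the set
        cache.access(address)
    if victim_address is not None:          # victim activity, if any
        cache.access(victim_address)
    # probe: a miss means one of the attacker's lines was evicted
    return sum(0 if cache.access(address) else 1 for address in attacker_lines)


if __name__ == "__main__":
    print(prime_probe(ToyCache(), 1, victim_address=5) > 0)   # same set
    print(prime_probe(ToyCache(), 1, victim_address=6) == 0)  # other set
    print(prime_probe(ToyCache(), 1) == 0)                    # victim idle
```

On real hardware the miss count is replaced by a timing measurement, and replacement policies are more complex than plain LRU, so the signal is noisier than in this model.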
Thesis
Public-key cryptosystems are constructed using one-way functions which ensure both the security and the efficiency of the schemes. One of the two main candidates originally considered to construct public-key cryptosystems is modular exponentiation with its hard inverse operation, computing discrete logarithms. In this thesis, we study the security of protocols that make use of modular exponentiation where the exponent is a secret of the protocol. To assess the security of such protocols, one can either estimate the hardness of directly solving the discrete logarithm problem (DLP) in the groups considered by the protocols or look at implementation vulnerabilities from fast exponentiation algorithms. One way of estimating the security of protocols based on the hardness of the discrete logarithm problem is to directly study the complexity of the algorithms that solve the latter. In this thesis, we first study the asymptotic complexity of algorithms that solve DLP over finite fields $\mathbb{F}_{p^n}$, precisely of the form where pairings take their values. These algorithms come from the index-calculus family from which the Number Field Sieve (NFS) is an example. This study allows us to draw conclusions on the security of pairing-based protocols. We also propose a first implementation of the variant Tower Number Field Sieve (TNFS) of NFS, which has better asymptotic complexity, along with a record computation of a discrete logarithm in a 521-bit finite field with TNFS. This variant had never been implemented before due to the difficulty of sieving in higher dimensions, i.e., dimensions greater than two. Finally, the security of deployed protocols not only relies on the hardness of the underlying mathematical problem but also on the implementation of the algorithms involved. Many fast modular exponentiation algorithms have piled up over the years and some implementations have brought vulnerabilities that are exploitable by side-channel attacks, in particular cache attacks.
The second aspect of this thesis thus considers key recovery methods when partial information is recovered from a side channel.
... Some of the exploits relying on cache SCA are presented here. [72] attack AES OpenSSL on x86, whereas [74] and [13] attack the RSA OpenSSL and RSA SGX SDK version. In [39] attackers observed the cache to extract the users' keystrokes, and in [71], they attacked the javascript sandbox. ...
Article
Full-text available
With the advances in the field of the Internet of Things (IoT) and Industrial IoT (IIoT), these devices are increasingly used in daily life or industry. To reduce costs related to the time required to develop these devices, security features are usually not considered. This situation creates a major security concern. Many solutions have been proposed to protect IoT/IIoT against various attacks, most of which are based on attacks involving physical access. However, a new class of attacks has emerged targeting hardware vulnerabilities in the micro-architecture that do not require physical access. We present attacks based on micro-architectural hardware vulnerabilities and the side effects they produce in the system. In addition, we present security mechanisms that can be implemented to address some of these attacks. Most of the security mechanisms target a small set of attack vectors or a single specific attack vector. As many attack vectors exist, solutions must be found to protect against a wide variety of threats. This survey aims to inform designers about the side effects related to attacks and detection mechanisms that have been described in the literature. For this purpose, we present two tables listing and classifying the side effects and detection mechanisms based on the given criteria.
... Noisy Measurements. Another group of defenses aims to impede a successful attack by preventing the adversary from performing precise time measurements, e.g., by restricting the access to timers [73], [75], [67], by injecting noise into the system [95], [41] or deliberately slowing down the system clock [40], [68]. However, workarounds have been found to create timers [84] or to perform attacks without relying on timers [25]. ...
Preprint
Full-text available
Shared cache resources in multi-core processors are vulnerable to cache side-channel attacks. Recently proposed defenses have their own caveats: Randomization-based defenses are vulnerable to the evolving attack algorithms besides relying on weak cryptographic primitives, because they do not fundamentally address the root cause for cache side-channel attacks. Cache partitioning defenses, on the other hand, provide the strict resource partitioning and effectively block all side-channel threats. However, they usually rely on way-based partitioning which is not fine-grained and cannot scale to support a larger number of protection domains, e.g., in trusted execution environment (TEE) security architectures, besides degrading performance and often resulting in cache underutilization. To overcome the shortcomings of both approaches, we present a novel and flexible set-associative cache partitioning design for TEE architectures, called Chunked-Cache. Chunked-Cache enables an execution context to "carve" out an exclusive configurable chunk of the cache if the execution requires side-channel resilience. If side-channel resilience is not required, mainstream cache resources are freely utilized. Hence, our solution addresses the security-performance trade-off practically by enabling selective and on-demand utilization of side-channel-resilient caches, while providing well-grounded future-proof security guarantees. We show that Chunked-Cache provides side-channel-resilient cache utilization for sensitive code execution, with small hardware overhead, while incurring no performance overhead on the OS. We also show that it outperforms conventional way-based cache partitioning by 43%, while scaling significantly better to support a larger number of protection domains.
... They both take the last-level cache as a covert channel for information leakage. Flush+Reload and Flush+Flush rely on the operating system's or hypervisor's shared memory technology, whereas Prime+Probe [7][8][9] does not. Flush+Reload and Flush+Flush pose a great threat to information security, so we study the detection method of these two attacks. ...
Article
Full-text available
Cache side channel attacks, as a type of cryptanalysis, seriously threaten the security of the cryptosystem. These attacks continuously monitor the memory addresses associated with the victim’s secret information, which cause frequent memory access on these addresses. This paper proposes CacheHawkeye, which uses the frequent memory access characteristic of the attacker to detect attacks. CacheHawkeye monitors memory events by CPU hardware performance counters. We proved the effectiveness of CacheHawkeye on Flush+Reload and Flush+Flush attacks. In addition, we evaluated the accuracy of CacheHawkeye under different system loads. Experiments demonstrate that CacheHawkeye not only has good accuracy but can also adapt to various system loads.
... Besides the load-based covert channels described above, a plethora of other mechanisms have been used to transmit information. For example, the CPU caches [30], [41], the DRAM row buffers [31], and CPU functional units [37] are shared resources that can enable covert channels with significantly higher throughput than the covert channel described in this work. Other covert channels often also can be prevented by other countermeasures, though; in these cases, cache partitioning, memory channel isolation, and disabling of hyper-threading can be used to prevent resource sharing. ...
Preprint
Covert channels are communication channels used by attackers to transmit information from a compromised system when the access control policy of the system does not allow doing so. Previous work has shown that CPU frequency scaling can be used as a covert channel to transmit information between otherwise isolated processes. Modern systems either try to save power or try to operate near their power limits in order to maximize performance, so they implement mechanisms to vary the frequency based on load. Existing covert channels based on this approach are either easily thwarted by software countermeasures or only work on completely idle systems. In this paper, we show how the automatic frequency scaling provided by Intel Turbo Boost can be used to construct a covert channel that is both hard to prevent without significant performance impact and can tolerate significant background system load. As Intel Turbo Boost selects the maximum CPU frequency based on the number of active cores, our covert channel modulates information onto the maximum CPU frequency by placing load on multiple additional CPU cores. Our prototype of the covert channel achieves a throughput of up to 61 bit/s on an idle system and up to 43 bit/s on a system with 25% utilization.
Chapter
Security and privacy issues are magnified by the volume, variety, and velocity of big data, such as large-scale cloud infrastructures, diversity of data sources and formats, the streaming nature of data acquisition and high volume inter-cloud migration. In the past, big data was limited to very large organizations such as governments and large enterprises that could afford to create and own the infrastructure necessary for hosting and mining large amounts of data. These infrastructures were typically proprietary and were isolated from general networks. Today, big data is cheaply and easily accessible to organizations large and small through public cloud infrastructure. The purpose of this chapter is to highlight the big data security and privacy challenges and to present some solutions for these challenges, although it does not provide a definitive solution for the problem. Rather, it points to some directions and technologies that might contribute to solving some of the most relevant and challenging big data security and privacy issues.
Article
In recent years, many researchers and professionals have revealed that wireless communication technologies and systems are endangered by various cyberattacks; these attacks harm not only private enterprises but government organizations as well. Attackers devise new techniques to challenge security frameworks and use powerful tools and tricks to break keys of any size, so the security of private and sensitive data is at stake. Many advancements are being developed to mitigate these attacks. In this context, this paper surveys and reviews the various existing advanced cyber security standards, along with the challenges faced by the cyber security domain. New-generation attacks are discussed and documented in detail, and advanced key management schemes are also described. Quantum cryptography is discussed along with its merits and future scope. Overall, the paper serves as a technical report to help new researchers get acquainted with recent advancements in the cyber security domain.
Chapter
Data confidentiality is put at risk on cloud platforms where multiple tenants share the underlying hardware. As multiple workloads are executed concurrently, conflicts in memory resource occur, resulting in observable timing variations during execution. Malicious tenants can intentionally manipulate the hardware platform to devise a covert channel, enabling them to steal the data of co-residing tenants. This paper presents two new microarchitectural covert channel attacks using the memory controller. The first attack allows a privileged adversary (i.e. process) to leak information in a native environment. The second attack is an extension to cross-VM scenarios for unprivileged adversaries. This work is the first instance of leakage channel based on the memory controller. As opposed to previous denial-of-service attacks, we manage to modulate the load on the channel scheduler with accuracy. Both attacks are implemented on cross-core configurations. Furthermore, the cross-VM covert channel is successfully tested across three different Intel microarchitectures. Finally, a comparison against state-of-the-art covert channel attacks is provided, along with a discussion on potential mitigation techniques.
Chapter
We discuss side-channel attacks on CRT-RSA encryption or signature scheme (the RSA scheme with the Chinese remainder theorem) implemented via the sliding window method. The sliding window method calculates exponentiations through repeated squaring and multiplication. These square-and-multiply sequences can be obtained by side-channel attacks, and there is the risk of recovering CRT-RSA secret keys from these sequences. Especially, in CHES 2017, it is proved that we can recover secret keys from the correct square-and-multiply sequences in polynomial time when the window size w is less than 4. However, there are errors in the obtained sequences. Oonishi and Kunihiro proposed a method for recovering secret keys from noisy sequences when \(w=1\). Although this work only addresses the case with \(w=1\), it should be possible to recover secret keys for larger values of w. In this paper, we propose a new method for recovering secret keys from noisy sequences in the sliding window method. Moreover, we clarify the amount of errors for which our method works.
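To make the "square-and-multiply sequence" in the abstract above concrete, here is a minimal Python sketch of left-to-right sliding-window exponentiation that records one `S` per squaring and one `M` per multiplication. The function name `sliding_window_trace` and the trace encoding are illustrative assumptions, not taken from the cited chapter, and real libraries typically skip the leading squarings of 1 that this sketch records.

```python
def sliding_window_trace(base, exponent, modulus, w):
    """Sliding-window exponentiation that also returns the S/M operation trace.

    Hypothetical helper for illustration: 'S' marks a squaring, 'M' marks a
    multiplication by a precomputed odd power of the base.
    """
    base %= modulus
    square = base * base % modulus
    # Precompute odd powers base^1, base^3, ..., base^(2^w - 1).
    odd_powers = {1: base}
    for k in range(3, 1 << w, 2):
        odd_powers[k] = odd_powers[k - 2] * square % modulus

    bits = bin(exponent)[2:]
    result = 1
    trace = []
    i = 0
    while i < len(bits):
        if bits[i] == "0":
            # A zero bit outside any window costs a single squaring.
            result = result * result % modulus
            trace.append("S")
            i += 1
        else:
            # Take the longest window of at most w bits that ends in a 1.
            j = min(i + w, len(bits))
            while bits[j - 1] == "0":
                j -= 1
            window = int(bits[i:j], 2)
            for _ in range(j - i):
                result = result * result % modulus
                trace.append("S")
            result = result * odd_powers[window] % modulus
            trace.append("M")
            i = j
    return result, "".join(trace)
```

With `w = 1` this degenerates to plain binary square-and-multiply, where the trace reveals every exponent bit directly; for larger `w` the trace only constrains the bits, which is why key recovery from such sequences requires further analysis.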
Article
Privacy protection is an essential part of information security. The use of shared resources demands more privacy and security protection, especially in cloud computing environments. Side-channel attacks based on CPU cache utilize shared CPU caches within the same physical device to compromise the system’s privacy (encryption keys, program status, etc.). Information is leaked through channels that are not intended to transmit information, jeopardizing system security. These attacks have the characteristics of both high concealment and high risk. Despite the improvement in architecture, which makes it more difficult to launch system intrusion and privacy leakage through traditional methods, side-channel attacks ignore those defenses because of the shared hardware. Difficult to be detected, they are much more dangerous in modern computer systems. Although some researchers focus on the survey of side-channel attacks, their study is limited to cryptographic modules such as Elliptic Curve Cryptosystems. All the discussions are based on real-world applications (e.g., Curve25519), and there is no systematic analysis for the related attack and security model. Firstly, this paper compares different types of cache-based side-channel attacks. Based on the comparison, a security model is proposed. The model describes the attacks from four key aspects, namely, vulnerability, cache type, pattern, and range. Through reviewing the corresponding defense methods, it reveals from which perspective defense strategies are effective for side-channel attacks. Finally, the challenges and research trends of CPU cache-based side-channel attacks in both attacking and defending are explored. The systematic analysis of CPU cache-based side-channel attacks highlights the fact that these attacks are more dangerous than expected. We believe our survey would draw developers’ attention to side-channel attacks and help to reduce the attack surface in the future.
Chapter
Caches leak information through timing measurements and side-channel attacks. Several attack primitives exist with different requirements and trade-offs. Flush+Flush is a stealthy and fast one that uses the timing of the clflush instruction depending on whether a line is cached. We show that the CPU interconnect plays a bigger role than previously thought in these timings and in Flush+Flush error rate. In this paper, we show that a naive implementation that does not account for the topology of the interconnect yields very high error rates, especially on modern CPUs as the number of cores increases. We therefore reverse-engineer this topology and revisit the calibration phase of Flush+Flush for different attacker models to determine the correct threshold for clflush hits and misses. We show that our method yields close-to-noiseless side-channel attacks by attacking the AES T-tables implementation of OpenSSL, and by building a covert channel. We obtain a maximal capacity of 5.8 Mbit/s with our method, compared to 1.9 Mbit/s with a naive Flush+Flush implementation on an Intel Core i9-9900 CPU.
Article
In this work, we present a novel approach, called Detector+, to detect, isolate, and prevent timing-based side channel attacks (i.e., timing attacks) at runtime. The proposed approach is based on a simple observation that the time measurements required by the timing attacks differ from those required by the benign applications as these attacks need to measure the execution times of typically quite short-running operations. Detector+, therefore, monitors the time readings made by processes and marks consecutive pairs of readings that are close to each other in time as suspicious. In the presence of suspicious time measurements, Detector+ introduces noise into the measurements to prevent the attacker from extracting information by using these measurements. The sequence of suspicious time measurements is then analyzed by using a sliding window based approach to pinpoint the malicious processes at runtime. We have empirically evaluated the proposed approach by using five well known timing attacks, including Meltdown, together with their variations, representing some of the mechanisms that an attacker can employ to become stealthier. In one evaluation setup, each type of attack was carried out concurrently by multiple processes. In the other setup, multiple types of attacks were carried out concurrently. In all the experiments, Detector+ detected all the malicious time measurements with almost perfect accuracy, prevented all the attacks, and correctly pinpointed all the malicious processes involved in the attacks without any false positives after they had made a few time measurements, with an average runtime overhead of 1.56%.
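The core observation in the abstract above (consecutive time readings that are close together are suspicious) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `suspicious_pairs`, the `(pid, timestamp)` event format, and the `max_gap_ns` threshold are hypothetical and are not taken from Detector+ itself, whose noise injection and sliding-window analysis are not modelled here.

```python
from collections import defaultdict


def suspicious_pairs(events, max_gap_ns):
    """Flag consecutive time readings by the same process that are close together.

    events: iterable of (pid, timestamp_ns) tuples in chronological order
            (hypothetical input format chosen for this sketch).
    max_gap_ns: assumed threshold; pairs at most this far apart are flagged.
    Returns a dict mapping pid to a list of (previous, current) timestamp pairs.
    """
    last_reading = {}
    flagged = defaultdict(list)
    for pid, timestamp in events:
        previous = last_reading.get(pid)
        if previous is not None and timestamp - previous <= max_gap_ns:
            flagged[pid].append((previous, timestamp))
        last_reading[pid] = timestamp
    return dict(flagged)
```

For example, a process that reads the clock at 0 ns, 100 ns, and 150 ns would have two flagged pairs under a 200 ns threshold, while a process that reads it only every few microseconds would have none.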
Article
Meltdown, released in 2018, is a hardware vulnerability primarily affecting modern Intel processors. It allows a rogue process to read the kernel data in the CPU L1D cache. To defend against the Meltdown attack in legacy processors, the most effective software-only mitigation approach is to unmap kernel memory from user processes, known as kernel page-table isolation (KPTI). In this paper, we present a novel Meltdown-type attack, named KPTImew, that can defeat KPTI in Linux and reliably dump all the target data in the kernel address space. We observe that there still exists kernel memory mapped in a user process, indicating that the mapped memory content can still be leaked through the Meltdown attack. However, the Meltdown attack is limited to leaking data that must be resident in the CPU L1D cache. To lift the limitation, we propose a new technique, called reDump, as a part of our contribution. reDump exploits speculative execution to load data in the mapped memory into the L1D cache and thus reliably dump the data using the Meltdown attack. To further leak data from the whole kernel including the above mapped memory, KPTImew first establishes a data dependency between the mapped memory and any target kernel memory, and then exploits the data dependency to bring into the L1D cache certain mapped kernel data that is dependent on the targeted kernel data. When the mapped kernel data is leaked, the targeted kernel data can be leaked through the data dependency. We modify an open-source tool, called smatch, to find such gadgets in recent kernels (i.e., 4.17.3 and 5.8.7) for loading the kernel mapped data into the L1D cache and establishing the data dependency, respectively. Specifically, dozens of potential gadgets are found in the default kernel compile configuration while hundreds of gadget candidates are available for the all-yes compile configuration. Our experiments show that reDump leaks 32 B of the mapped data within 6 seconds on average. With the assistance of reDump, KPTImew leaks any 32 B of kernel data within 12 seconds on average. In comparison, KPTImew can also work independently and requires 218 seconds on average to leak 32 B without reDump.
Chapter
In the current, fast-paced development of computer hardware, manufacturers often focus on an expedited time to market or on maximum throughput. This inevitably leads to a number of unintentional hardware vulnerabilities. These vulnerabilities can be exploited to launch devastating hardware attacks and, as a result, compromise the privacy of end users. Microarchitectural attacks, which exploit the microarchitectural behaviour of modern computer systems, are an example of such hardware attacks and the central focus of this paper. This type of attack can exploit the microarchitectural performance of processor implementations, which in turn can potentially expose hidden hardware states. Microarchitectural attacks compromise the security of computational environments even within advanced protection mechanisms such as virtualisation and sandboxes. In light of these security threats against modern computing hardware, a detailed survey of recent attacks that exploit microarchitectural elements in modern, shared computing hardware was performed from a digital forensic perspective. It is demonstrated that the CPU (central processing unit) is an attractive resource to be targeted by attackers, and shown that adversaries could potentially use microarchitectural cache-based side-channel attacks to extract and analytically examine sensitive data from their victims. This study focuses only on cache-based attacks as opposed to other variants of side-channel attacks, which have a broad application range. The paper makes three major contributions to the body of knowledge: first, the breadth of the scope of the analysis and a detailed examination of the means by which the data is analysed for performing side-channel attacks; second, how novel uses of data can facilitate side-channel attacks; and third, the provision of an agenda for directing future research.
Chapter
In 2015, the block cipher Kalyna was approved as the new encryption standard of Ukraine. The cipher is a substitution-permutation network, whose design is based on AES, but includes several different features. Most notably, the key expansion in Kalyna is designed to resist recovering the master key from the round keys. In this paper we present a cache attack on the Kalyna key expansion algorithm. Our attack observes the cache access pattern during key expansion, and uses the obtained information together with one round key to completely recover the master key. We analyze all five parameter sets of Kalyna. Our attack significantly reduces the attack cost and is practical for the Kalyna-128/128 variant, where it is successful for over 97% of the keys and has a complexity of only \(2^{43.58}\). To the best of our knowledge, this is the first attack on the Kalyna key expansion algorithm. To show that the attack is feasible, we run the cache attack on the reference implementation of Kalyna-128/128, demonstrating that we can obtain the required side-channel information. We further perform the key-recovery step on our university’s high-performance compute cluster. We find the correct key within 37 hours and note that the attack requires 50K CPU hours for enumerating all key candidates. As a secondary contribution we observe that the additive key whitening used in Kalyna facilitates first round cache attacks. Specifically, we design an attack that can recover the full first round key with only seven adaptively chosen plaintexts.
Chapter
The seminal work of Heninger and Shacham (Crypto 2009) demonstrated a method for reconstructing secret RSA keys from partial information of the key components. In this paper we further investigate this approach but apply it to a different context that appears in some side-channel attacks. We assume a fixed-window exponentiation algorithm that leaks the equivalence between digits, without leaking the value of the digits themselves. We explain how to exploit the side-channel information with the Heninger-Shacham algorithm. To analyse the complexity of the approach, we model the attack as a Markov process and experimentally validate the accuracy of the model. Our model shows that the attack is feasible in the commonly used case where the window size is 5.
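The leakage model in the abstract above (the attacker learns which window digits are equal, but not their values) can be illustrated with a short Python sketch. The helper names `window_digits` and `equivalence_pattern` are assumptions made for this example; the sketch only shows what the side channel would reveal and does not implement the Heninger-Shacham reconstruction itself.

```python
def window_digits(exponent, w):
    """Split an exponent into fixed w-bit digits, most significant first."""
    mask = (1 << w) - 1
    digits = []
    while exponent:
        digits.append(exponent & mask)
        exponent >>= w
    return digits[::-1] or [0]


def equivalence_pattern(digits):
    """Relabel digits by order of first appearance.

    Two positions share a label exactly when their digits are equal, so the
    output carries the equality relation but not the digit values.
    """
    labels = {}
    return [labels.setdefault(d, len(labels)) for d in digits]
```

For a window size of 5, the exponent `0b10110_00011_10110` has digits `[22, 3, 22]`, and the pattern `[0, 1, 0]` tells the attacker only that the first and third digits coincide.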
Preprint
Recent years have brought microarchitectural security into the spotlight, proving that modern CPUs are vulnerable to several classes of microarchitectural attacks. These attacks bypass the basic isolation primitives provided by the CPUs: process isolation, memory permissions, access checks, and so on. Nevertheless, most of the research was focused on Intel CPUs, with only a few exceptions. As a result, few vulnerabilities have been found in other CPUs, leading to speculations about their immunity to certain types of microarchitectural attacks. In this paper, we provide a black-box analysis of one of these under-explored areas. Namely, we investigate the flaw of AMD CPUs which may lead to a transient execution hijacking attack. Contrary to nominal immunity, we discover that AMD Zen family CPUs exhibit transient execution patterns similar to Meltdown/MDS. Our analysis of exploitation possibilities shows that AMD's design decisions indeed limit the exploitability scope compared to Intel CPUs, yet it may be possible to use them to amplify other microarchitectural attacks.
Article
Introduction
Designers of protection systems are usually preoccupied with the need to safeguard data from unauthorized access or modification, or programs from unauthorized execution. It is known how to solve these problems well enough so that a program can create a controlled environment within which another, possibly untrustworthy program, can be run safely [1, 2]. Adopting terminology appropriate for our particular case, we will call the first program a customer and the second a service. The customer will want to ensure that the service cannot access (i.e. read or modify) any of his data except those items to which he explicitly grants access. If he is cautious, he will only grant access to items which are needed as input or output for the service program. In general it is also necessary to provide for smooth transfers of control, and to handle error conditions. Furthermore, the service must be protected from intrusion by the customer, since the service may be a proprietary
Article
This paper demonstrates complete AES key recovery from known-plaintext timings of a network server on another computer. This attack should be blamed on the AES design, not on the particular AES library used by the server; it is extremely difficult to write constant-time high-speed AES software for common general-purpose computers. This paper discusses several of the obstacles in detail.
Conference Paper
We describe several software side-channel attacks based on inter-process leakage through the state of the CPU's memory cache. This leakage reveals memory access patterns, which can be used for cryptanalysis of cryptographic primitives that employ data-dependent table lookups. The attacks allow an unprivileged process to attack other processes running in parallel on the same processor, despite partitioning methods such as memory protection, sandboxing and virtualization. Some of our methods require only the ability to trigger services that perform encryption or MAC using the unknown key, such as encrypted disk partitions or secure network links. Moreover, we demonstrate an extremely strong type of attack, which requires knowledge of neither the specific plaintexts nor ciphertexts, and works by merely monitoring the effect of the cryptographic process on the cache. We discuss in detail several such attacks on AES, and experimentally demonstrate their applicability to real systems, such as OpenSSL and Linux's dm-crypt encrypted partitions (in the latter case, the full key can be recovered after just 800 writes to the partition, taking 65 milliseconds). Finally, we describe several countermeasures which can be used to mitigate such attacks.
Article
Timing attacks are usually used to attack weak computing devices such as smartcards. We show that timing attacks apply to general software systems. Specifically, we devise a timing attack against OpenSSL. Our experiments show that we can extract private keys from an OpenSSL-based web server running on a machine in the local network. Our results demonstrate that timing attacks against network servers are practical and therefore security systems should defend against them. (c) 2005 Elsevier B.V. All rights reserved.
Conference Paper
We present a method to solve integer polynomial equations in two variables, provided that the solution is suitably bounded. As an application, we show how to find the factors of \(N = PQ\) if we are given the high-order \((1/4)\log_2 N\) bits of \(P\). This compares with Rivest and Shamir’s requirement of \((1/3)\log_2 N\) bits.
Conference Paper
This paper presents the results of applying an attack against the Data Encryption Standard (DES) implemented in some applications, using side-channel information based on CPU delay as proposed in [11]. This cryptanalysis technique uses side-channel information on encryption processing to select and collect effective plaintexts for cryptanalysis, and infers the information on the expanded key from the collected plaintexts. On applying this attack, we found that the cipher can be broken with \(2^{23}\) known plaintexts and \(2^{24}\) calculations at a success rate > 90%, using a personal computer with a 600-MHz Pentium III. We discuss the feasibility of cache attacks on ciphers that need many S-box look-ups, through reviewing the results of our experimental attacks on block ciphers other than DES, such as AES.
Article
We expand on the idea, proposed by Kelsey et al. [14], of cache memory being used as a side-channel which leaks information during the run of a cryptographic algorithm. By using this side-channel, an attacker may be able to reveal or narrow the possible values of secret information held on the target device. We describe an attack which encrypts chosen plaintexts on the target processor in order to collect cache profiles and then performs around computational steps to recover the key. As well as describing and simulating the theoretical attack, we discuss how hardware and algorithmic alterations can be used to defend against such techniques.
Conference Paper
This paper examines simultaneous multithreading, a technique permitting several independent threads to issue instructions to a superscalar's multiple functional units in a single cycle. We present several models of simultaneous multithreading and compare them with alternative organizations: a wide superscalar, a fine-grain multithreaded processor, and single-chip, multiple-issue multiprocessing architectures. Our results show that both (single-threaded) superscalar and fine-grain multithreaded architectures are limited in their ability to utilize the resources of a wide-issue processor. Simultaneous multithreading has the potential to achieve 4 times the throughput of a superscalar, and double that of fine-grain multi-threading. We evaluate several cache configurations made possible by this type of organization and evaluate tradeoffs between them. We also show that simultaneous multithreading is an attractive alternative to single-chip multiprocessors; simultaneous multithreaded processors with a variety of organizations outperform corresponding conventional multiprocessors with similar execution resources. While simultaneous multithreading has excellent potential to increase processor utilization, it can add substantial complexity to the design. We examine many of these complexities and evaluate alternative organizations in the design space.
Article
Cache memories can contribute significant performance advantages because of the gap between CPU and memory speed. They have, however, been considered a source of unpredictability, since the user cannot be sure how much time will elapse when a memory operation is performed. To avoid the problem, real-time practitioners have turned off the cache and run programs in the old-fashioned way. Turning the cache off, however, also makes other features such as instruction pipelining less beneficial, so newer processors do not deliver the speedup they were claimed to give. This paper presents the state of the art in the area and shows some techniques that give cache memory a chance in real-time architectures, so that even high-performance CPUs can be used in the real-time area.
FreeBSD Project. FreeBSD security advisory FreeBSD-SA-05:09.htt, May 2005.
The OpenSSL Project. OpenSSL: The open source toolkit for SSL/TLS. http://www.openssl.org/.