Article

On-Demand Snapshot: An Efficient Versioning File System for Phase-Change Memory

Authors:
  • Soonsil University

Abstract

Versioning file systems are widely used in modern computer systems as they provide system recovery and old-data access by retaining previous file system snapshots. However, existing versioning file systems do not perform well with emerging PCM (phase-change memory) storage because they are optimized for hard disks. Specifically, the large amount of additional writes incurred by maintaining snapshots seriously degrades the performance of PCM, as write operations are PCM's performance bottleneck. This paper presents a novel versioning file system, designed for PCM, that significantly reduces the write overhead of snapshots. Unlike existing versioning file systems that incur cascading writes up to the file system root, our scheme breaks the recursive update chain at the immediate parent level. The proposed file system is implemented on Linux 2.6 as a prototype. Measurement studies with various I/O benchmarks show that the proposed file system improves I/O throughput by 144 percent on average compared to ZFS, a representative versioning file system.
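To make the idea concrete, here is a minimal C sketch, built on hypothetical structures rather than the authors' code, contrasting a conventional copy-on-write update that rewrites every ancestor up to the root with a bounded update that stops at the immediate parent, which is the behavior the abstract describes.

```c
#include <stdlib.h>
#include <string.h>

#define PTRS_PER_NODE 16

struct node {
    struct node *parent;                 /* NULL for the file system root  */
    struct node *child[PTRS_PER_NODE];   /* block pointers                 */
    char         data[4096];             /* payload of leaf (data) blocks  */
};

static struct node *cow_copy(const struct node *old)
{
    struct node *n = malloc(sizeof(*n)); /* new out-of-place location      */
    memcpy(n, old, sizeof(*n));
    return n;
}

/* Conventional copy-on-write: every ancestor is rewritten so that its
 * pointer to the copied child is itself placed out of place, all the way
 * up to the root (child-slot lookup simplified to a fixed index).         */
void update_cascading(struct node *leaf, int slot, const char *buf, size_t len)
{
    struct node *newc = cow_copy(leaf);
    memcpy(newc->data, buf, len);
    for (struct node *p = leaf->parent; p != NULL; p = p->parent) {
        struct node *newp = cow_copy(p);
        newp->child[slot] = newc;        /* relink the copied child        */
        newc = newp;                     /* the parent moved too: repeat   */
    }
}

/* Bounded variant: only the data block is copied; because PCM is
 * byte-addressable, the immediate parent's pointer is updated in place,
 * so no ancestor above it needs to be rewritten.                          */
void update_bounded(struct node *leaf, int slot, const char *buf, size_t len)
{
    struct node *newc = cow_copy(leaf);
    memcpy(newc->data, buf, len);
    leaf->parent->child[slot] = newc;    /* single in-place pointer update */
}
```

The only difference is the loop over ancestors; removing it is where the write savings described in the abstract come from.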


... However, PCM has weaknesses that keep it from becoming a main-memory medium: its access time is slower than DRAM's (about 1-5x for reads and 5-25x for writes) and its write endurance is limited. Hence, it has recently been considered as a high-speed storage medium (e.g., a swap device) or as long-latency memory used together with a DRAM buffer [1,3,4,11,12]. This paper presents a new page replacement policy for the system that uses PCM as a swap device. ...
... Meanwhile, PCM is known to possess significantly different physical characteristics from hard disks. As a result, PCM-specific file systems [11,12] and PCM-specific memory management techniques [1,3,4] have been extensively studied. Unlike these studies, research on virtual memory systems that use PCM as a swap device is in its infancy. ...
... Though the details of the architectures differ, the additional DRAM commonly serves to hide the slow write operations of PCM and to increase PCM's lifespan by absorbing frequent writes. Studies on PCM-specific file systems likewise aim to reduce write traffic to PCM [11,12]. ...
Article
Full-text available
Phase-change memory (PCM) is a promising technology that is anticipated to be used in the memory hierarchy of future computer systems. However, its access time is relatively slower than DRAM and it has limited endurance cycles. For this reason, PCM is being considered as a high-speed storage medium (like a swap device) or as long-latency memory. In this paper, we adopt PCM as a virtual memory swap device and present a new page replacement policy that considers the characteristics of PCM. Specifically, we aim to reduce the write traffic to PCM by considering the dirtiness of pages when making a replacement decision. The proposed replacement policy tracks the dirtiness of a page at the granularity of a sub-page and replaces the least dirty page among pages not recently used. Experimental results with various workloads show that the proposed policy reduces the amount of data written to PCM by 22.9% on average and up to 73.7% compared to CLOCK. It also extends the lifespan of PCM by 49.0% and reduces the energy consumption of PCM by 3.0% on average. © 2015, Institute of Electronics Engineers of Korea. All rights reserved.
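A hedged C sketch of the victim-selection step described above: among pages whose reference bit is clear, evict the one with the fewest dirty sub-pages so that the least data is written back to PCM. The data layout and constants are assumptions for illustration, not the authors' implementation.

```c
#include <stdint.h>

#define NPAGES   1024
#define SUBPAGES 8            /* sub-pages tracked per page (assumed)      */

struct page {
    uint8_t referenced;       /* CLOCK-style reference bit                 */
    uint8_t dirty_map;        /* one bit per sub-page                      */
};

static int popcount8(uint8_t x)
{
    int c = 0;
    while (x) { c += x & 1; x >>= 1; }
    return c;
}

/* Returns the index of the victim page, or -1 if every page was recently
 * referenced (the caller would then clear reference bits and retry).      */
int pick_victim(struct page pages[NPAGES])
{
    int victim = -1, min_dirty = SUBPAGES + 1;
    for (int i = 0; i < NPAGES; i++) {
        if (pages[i].referenced)
            continue;                          /* skip recently used pages */
        int d = popcount8(pages[i].dirty_map);
        if (d < min_dirty) {
            min_dirty = d;
            victim = i;
        }
    }
    return victim;
}
```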
... In this case, by combining the SCM device with the storage system and using the non-volatile characteristics of SCM, one can simplify and accelerate the storage system. For example, Lee et al. [16] presented a versioning file system for PCM that reduces the writing overhead of a snapshot significantly by breaking the recursive update chain at the immediate parent level. Fang et al. [6] presented an SCM-based approach for DBMSs logging, which achieves higher performance by using a simplified system design and better concurrency support. ...
... PostMark (version 1.51) [12], Filebench (version 1.4.9) [26]. Since an SCM device is not yet commercially available and the read and write performance of an SCM device is close to that of memory, most research in the domain uses memory to simulate SCM [17,16,25,6,8,9]. Our tests of SJM also used memory to simulate SCM. ...
Conference Paper
Full-text available
Considering the unique characteristics of storage class memory (SCM), such as non-volatility, fast access speed, byte-addressability, low energy consumption, and in-place modification support, we investigate the behavior of over-writes and append-writes and propose a safe and write-efficient SCM-based journaling mechanism for file systems called SJM. SJM integrates the ordered and journaling modes of traditional journaling mechanisms by storing metadata and over-write data in the SCM-based logging device as a write-ahead log and strictly controlling the data flow. SJM writes the valid log blocks back to the file system according to their access frequency and sequentiality and thus improves write performance. We implemented SJM on Linux 3.12 with ext2, which has no journaling mechanism. Evaluation results show that ext2 with SJM outperforms ext3 with a ramdisk-based journaling device while keeping version consistency, especially under workloads with large write requests.
... To cope with this situation, we adopt PCM as high-speed swap storage and discuss how such systems can be managed efficiently. In traditional HDD-based swap storage, a large amount of adjacent data is loaded together when a page fault occurs, since the seek time of an HDD is very large [6]. To improve CPU utilization, a process that incurs a page fault is blocked and the CPU is switched to another process while the page fault is handled. ...
... We assume that PCM is put on DIMM slots and there is no context switching while handling page faults. The access time of a PCM device consists of a static time component needed for each access and a component proportional to the request size [4,6,10]. Suppose that the proportional time components of a write and a read operation are T^p_write and T^p_read, respectively, and the static time components of a read and a write are T^i_read and T^i_write, respectively. ...
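Written out as formulas (a reconstruction of the model the excerpt implies, not text taken from the paper), the access time of a request of size s is a fixed per-access term plus a size-proportional term:

```latex
T_{\mathrm{read}}(s)  = T^{i}_{\mathrm{read}}  + s \cdot T^{p}_{\mathrm{read}}, \qquad
T_{\mathrm{write}}(s) = T^{i}_{\mathrm{write}} + s \cdot T^{p}_{\mathrm{write}}
```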
Article
Full-text available
This letter explores the performance of PCM-based swap storage and presents three management techniques. First, we attach PCM to DIMM slots to eliminate block I/O overhead and remove context switching while handling page faults. Second, we reduce the page size and turn off the read-ahead option to reflect the high-performance characteristics of PCM. Third, we decrease the DRAM memory size to reduce energy consumption without sacrificing performance.
... The first one is the need for a high programming current density (10^7 A/cm^2) to match the level of a diode. Unarguably, the major concern associated with PCM is its threshold voltage drift and long-term resistance [11,18-25]. Obeying a power law, the resistance of the amorphous phase increases slowly with the applied voltage, thus restricting it from multilevel operation and hindering regular two-state operation if the threshold voltage is raised above the designed operating value. ...
Preprint
Full-text available
Memory has always been a building-block element of information technology. Emerging technologies such as artificial intelligence, big data, and the internet of things require a new kind of memory technology that is energy efficient and has an exceptional data retention period. Among the existing memory technologies, resistive random-access memory (RRAM) answers this need, as it combines the speed of RAM with non-volatility, making it one of the most promising candidates to replace flash memory in next-generation non-volatile RAM applications. This review discusses the existing challenges and technological advancements made with RRAM, including the switching mechanism, device structure, endurance, fatigue resistance, data retention period, and the mechanism of resistive switching in the inorganic oxide materials used as the dielectric layer. Finally, a summary and a perspective on future research are presented.
... The read and write latency of PCM is set to 100 ns and 350 ns, respectively, and the read and write energy of PCM is set to 0.2 nJ/bit and 1.0 nJ/bit, respectively [28]. For simulating DRAM memory, the read and write latencies are both set to 50 ns and the read/write energy is set to 0.1 nJ/bit, following previous studies [26,28]. The static power of DRAM is set to 1 W/GB and the default size of DRAM is set to the entire footprint of the workloads so as not to incur any page faults. ...
Article
Full-text available
As the size of data grows rapidly in modern IoT (Internet-of-Things) and CPS (Cyber-Physical System) applications, the memory power consumption of real-time embedded systems increases dramatically. Unlike general-purpose systems, where memory consumes about 10% of the CPU power, modern real-time systems spend 20-50% of the CPU power on memory. This is because the memory of a real-time system must be large enough to accommodate the entire task set, and thus DRAM refresh operations become a major source of power consumption. In this article, we present a new swap scheme for real-time systems that aims at reducing memory power consumption. To support swap under real-time constraints, we adopt high-speed NVM storage and co-optimize power savings in the CPU and memory. Unlike traditional real-time task models that only consider execution on the CPU, we define an extended task model that also characterizes the memory and storage paths of tasks, and we tightly evaluate the worst-case execution time by formulating the overlapped latency between CPU and memory. By optimizing the CPU supply voltage and the memory swap ratio of a given task set, our scheme reduces the energy consumption of real-time systems by 31.1% on average under various workload conditions.
... This can eliminate a large proportion of the out-of-place updates that occur in copy-on-write file systems. Lee et al. propose another copy-on-write file system for NVM called OND [26]. Unlike conventional copy-on-write file systems, which propagate out-of-place updates up to the file system root, OND breaks the recursive writes at the immediate parent level. ...
Article
Full-text available
Recently, NVM (non-volatile memory) has advanced as a fast storage medium, and traditional memory management systems designed for HDD storage should be reconsidered. In this article, we revisit the page sizing problem in NVM storage, especially focusing on virtualized systems. The page sizing problem has not drawn attention in traditional systems for two reasons. First, memory performance is not sensitive to the page size when HDD is adopted as storage. We show that this is not the case in NVM storage by analyzing the TLB miss rate and the page fault rate, which have a trade-off relation with respect to the page size. Second, changing the page size in traditional systems is not easy, as it entails significant overhead. However, due to the widespread adoption of virtualized systems, the page sizing problem becomes tractable for virtual machines, which are created to execute specific workloads with fixed hardware resources. In this article, we design a page size model that accurately estimates the TLB miss rate and the page fault rate for NVM storage. We then present a method that can estimate the memory access time as the page size is varied, which can guide the choice of a suitable page size for a given environment. By considering workload characteristics together with the given memory and storage resources, we show that the memory performance of virtualized systems can be improved by 38.4% when our model is adopted.
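As a rough illustration of the kind of cost model this abstract describes, the C sketch below combines a page-size-dependent TLB miss rate and page fault rate into one expected access-time figure; the field names, weights, and the linear combination are assumptions for illustration, not the paper's model.

```c
/* Expected memory access time for one candidate page size.               */
struct cost_params {
    double tlb_miss_rate;     /* misses per reference (depends on page size) */
    double page_fault_rate;   /* faults per reference (depends on page size) */
    double t_hit_ns;          /* access time when both TLB and memory hit    */
    double t_tlb_miss_ns;     /* extra cost of a page-table walk             */
    double t_fault_ns;        /* extra cost of servicing a fault from NVM    */
};

double expected_access_time(const struct cost_params *p)
{
    return p->t_hit_ns
         + p->tlb_miss_rate   * p->t_tlb_miss_ns
         + p->page_fault_rate * p->t_fault_ns;
}

/* A larger page typically lowers tlb_miss_rate but raises page_fault_rate
 * (fewer pages fit in memory); the guided choice is the page size that
 * minimizes expected_access_time for the given workload and resources.    */
```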
... The access latency of flash storage is less than fifty microseconds, and hence the performance gap between storage and memory becomes less than three orders of magnitude. This trend has been accelerated by the commercialization of SCM, whose access latency is just one or two orders of magnitude higher than that of DRAM [6,7]. ...
... The typical access time of NAND flash memory is less than 50 microseconds, and thus the speed gap between storage and memory is reduced to three orders of magnitude. This trend has been accelerated by the appearance of NVM, whose access time is about 1-100 times that of DRAM [5,6]. ...
Article
Full-text available
Recently, non-volatile memory (NVM) has advanced as a fast storage medium, and legacy memory subsystems optimized for DRAM (dynamic random access memory) and HDD (hard disk drive) hierarchies need to be revisited. In this article, we explore the memory subsystems that use NVM as an underlying storage device and discuss the challenges and implications of such systems. As storage performance becomes close to DRAM performance, existing memory configurations and I/O (input/output) mechanisms should be reassessed. This article explores the performance of systems with NVM based storage emulated by the RAMDisk under various configurations. Through our measurement study, we make the following findings. (1) We can decrease the main memory size without performance penalties when NVM storage is adopted instead of HDD. (2) For buffer caching to be effective, judicious management techniques like admission control are necessary. (3) Prefetching is not effective in NVM storage. (4) The effect of synchronous I/O and direct I/O in NVM storage is less significant than that in HDD storage. (5) Performance degradation due to the contention of multi-threads is less severe in NVM based storage than in HDD. Based on these observations, we discuss a new PC configuration consisting of small memory and fast storage in comparison with a traditional PC consisting of large memory and slow storage. We show that this new memory-storage configuration can be an alternative solution for ever-growing memory demands and the limited density of DRAM memory. We anticipate that our results will provide directions in system software development in the presence of ever-faster storage devices.
... A victim DRAM page must be selected as the destination for the migrating page (lines 8-11). A free page, an unreferenced page, or an expired page in DRAM is selected as the victim (lines 30-33). A DRAM page that has been accessed consecutively more than 8 times is treated as an expired page (lines 23-24). ...
Article
Full-text available
Using hybrid main memory in embedded systems for image processing applications has become a clear trend. However, the performance deficiencies of Phase Change Memory (PCM), namely lower write endurance and longer write latency, have become the bottleneck for performance improvement of the whole system. To further improve performance in hybrid memory systems, data migration schemes have been proposed to move data between DRAM and PCM. Nevertheless, existing schemes are general-purpose page migration schemes that cannot accurately identify the write-hot pages of image processing applications. In addition, the large number of unnecessary page migrations they incur increases page management latency and destroys application locality, leading to performance degradation. In this paper, focusing on image processing applications, we present a better estimator for predicting future memory writes: inter-reference distance combined with historical write frequency. Based on this observation, we present a novel page migration scheme for hybrid DRAM and PCM memory architectures called WIRD (write frequency with inter-reference distance). The scheme monitors access patterns and migrates pages between DRAM and PCM through the memory controller. Experimental results show that our scheme minimizes unnecessary page migrations, so that most write-hot pages are absorbed efficiently by DRAM. Meanwhile, the efficiency of WIRD decreases the write-to-read switching ratio in the system, which reduces the effective page write access time of PCM to a large extent.
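An illustrative hotness score in the spirit of WIRD (the names, weights, and threshold below are assumptions, not the paper's formulas): a page that is written again after a short inter-reference distance and has a high historical write count is considered write-hot and becomes a candidate for migration to DRAM.

```c
#include <stdint.h>

struct pcm_page {
    uint64_t last_write_time;   /* logical time of the previous write       */
    uint32_t write_count;       /* historical write frequency               */
};

/* Smaller inter-reference distance and larger write count => hotter.      */
double write_hotness(const struct pcm_page *pg, uint64_t now)
{
    uint64_t ird = now - pg->last_write_time + 1;  /* inter-reference distance */
    return (double)pg->write_count / (double)ird;
}

/* Migrate only when the page is clearly hotter than a threshold, to avoid
 * the unnecessary migrations the abstract identifies as a major overhead. */
int should_migrate_to_dram(const struct pcm_page *pg, uint64_t now, double threshold)
{
    return write_hotness(pg, now) > threshold;
}
```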
... To observe the latency sensitivity of CRAST, we emulate a non-volatile memory's write latency by introducing an additional delay after each clflush instruction, as in previous studies [14,15,16]. No delay is added for store instructions, as the CPU cache hides such delays [17]. In our study, we consider two types of non-volatile memory: STT-RAM (spin-transfer torque RAM) and PCM. ...
Article
The rapid pace of innovation in non-volatile memory technologies such as 3D Xpoint [1], NVDIMM [2], and zSSD [3] is set to transform how we build, deploy, and manage data service platforms. In particular, the emergence of byte-addressable, persistent memory changes the landscape of the current storage architecture, consolidating different functionalities of memory and storage into a single layer [4]. To take full advantage of this advanced technology, this letter presents a crash-resilient skip list (CRAST) which serves as an in-memory data management module in a key-value store and provides crash consistency against system failures when running on non-volatile memory. By maintaining persistent in-memory data in a consistent manner, the proposed skip list provides strong reliability and high performance simultaneously in modern data service platforms. We demonstrate the efficacy of CRAST by implementing its prototype in LevelDB. We experimentally show that CRAST provides excellent performance across various workloads compared to the original key-value store, without any compromise on reliability.
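Below is a minimal sketch of the ordering discipline a crash-consistent in-NVM structure such as CRAST's skip list relies on: persist the new node first, then publish it with a single pointer update, so a crash never leaves a reachable but unwritten node. It is simplified to one list level and uses malloc as a stand-in for an NVM allocator; it is not the CRAST code.

```c
#include <emmintrin.h>      /* _mm_clflush, _mm_sfence */
#include <stdlib.h>

struct node {
    long         key;
    struct node *next;
};

static void persist(const void *addr, size_t len)
{
    const char *p = addr;
    for (size_t off = 0; off < len; off += 64)
        _mm_clflush(p + off);           /* write back each cache line        */
    _mm_sfence();                       /* order flushes before later stores */
}

void persistent_insert(struct node *pred, long key)
{
    struct node *n = malloc(sizeof(*n));   /* stand-in for an NVM allocator  */
    n->key  = key;
    n->next = pred->next;
    persist(n, sizeof(*n));             /* 1) node contents made durable     */

    pred->next = n;                     /* 2) single atomic pointer publish  */
    persist(&pred->next, sizeof(pred->next));
}
```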
... At present, there are three main types of PCM-based persistent storage technology: external-memory persistent storage, main-memory persistent storage, and hybrid external-memory technology. (1) External-memory persistent storage: PCM devices serve as secondary storage, residing in the storage system like flash SSDs and disks; this includes data organization techniques such as external-memory file systems [5,6] and space management that addresses wear leveling [7]. (2) Main-memory persistent storage: this line of work discusses how to manage data in PCM given that the CPU accesses PCM directly, including file-system-based and object-based management methods. ...
Article
Full-text available
The era of big data is here, and the demands that massive data place on the storage and processing capabilities of computer systems keep growing. While computing power is sufficient, the performance of storage systems has not improved as much. In this paper, we use DRAM and PCM to build a mixed main memory, and SSD and HDD to build secondary storage, forming a hybrid storage system. Targeting the hit rate of the hybrid main memory and the write lifetime of PCM, a hotness-aware page management algorithm is proposed. We study the hybrid memory architecture based on PCM and DRAM and propose a heat-aware page partition management method. An operating mechanism similar to the traditional CLOCK algorithm ensures the system hit rate. We also introduce a write-distance metric based on recent writes and combine it with page history information to accurately judge whether a page is hot or cold. We then design a page migration mechanism: a write-clock linked list tracks the write heat of pages dynamically, hot pages are moved to DRAM, and the number of PCM writes is reduced to extend PCM's lifetime. Simulation experiments verify that this method reduces the number of writes to PCM by 9.5% while maintaining the hit rate.
... However, PCM has weaknesses that prevent it from substituting for DRAM entirely, as its access time is slower than DRAM's and its write endurance is limited to 10^6-10^8 cycles. Thus, PCM has recently been considered as a high-speed secondary storage medium as well as far memory to be used along with DRAM [3-5]. ...
Article
Full-text available
Due to the recent advances in Phase-Change Memory (PCM) technologies, a new memory hierarchy of computer systems with PCM is expected to appear. In this paper, we present a new page replacement policy that adopts PCM as a high-speed swap device. As PCM has limited write endurance, our goal is to minimize the amount of data written to PCM. To do so, we defer the eviction of dirty pages in proportion to their dirtiness. However, excessive preservation of dirty pages in memory may increase the page fault rate, especially when the memory capacity is not enough to accommodate the full working set. Thus, our policy monitors the current working-set size of the system and controls the deferring level of dirty pages so as not to degrade system performance. Simulation experiments show that the proposed policy reduces the write traffic to PCM by 160% without performance degradation. © 2017, Institute of Electronics Engineers of Korea. All rights reserved.
... However, with current hardware specifications it does not show competitive performance as a memory medium. In particular, the access time of PCM is about 2-5x slower than DRAM for reads and 8-50x slower for writes [7,8]. Hence, it is recently being considered as high-speed storage rather than as a memory medium. ...
Article
Full-text available
This paper presents an optimized adoption of NVM for the storage system of heterogeneous applications. Our analysis shows that the bulk of I/O does not occur on a single storage partition but varies significantly across application categories. In particular, journaling I/O accounts for a dominant portion of total I/O in DB applications like OLTP, whereas swap I/O accounts for a large portion of I/O in graph visualization applications, and file I/O accounts for a large portion in web browsers and multimedia players. Based on these observations, we argue that the maximum performance gain with NVM is not obtained by fixing it to a specific storage partition; the best placement varies widely across applications. Specifically, for graph visualization, DB, and multimedia player applications, using NVM as the swap, journal, and file system partition, respectively, performs well. Our optimized adoption of NVM improves storage performance by 10-61%.
... More recently, PCM is being considered as a high-speed storage medium (like a swap device) as well as far memory that is to be used along with DRAM [1,4,5,18,19]. In this paper, we adopt PCM as a high-speed swap device as shown in Fig. 1(c) and investigate the management issues and potential benefits of PCM swap storage. ...
Article
Full-text available
Due to the recent advances in non-volatile memory technologies such as PCM, a new memory hierarchy of computer systems is expected to appear. In this paper, we explore the performance of PCM-based swap systems and discuss how such systems can be managed efficiently. Specifically, we introduce three management techniques. First, we show that the page fault handling time can be reduced by attaching PCM to DIMM slots, thereby eliminating the software stack overhead of block I/O and the context switch time. Second, we show that it is effective to reduce the page size and turn off the read-ahead option under a PCM swap system where the page fault handling time is sufficiently small. Third, we show that performance is not degraded even with a small DRAM memory under a PCM swap device, which reduces DRAM's energy consumption significantly compared to HDD-based swap systems. We expect that the results of this paper will lead to a transition from the legacy swap structure of "large memory – slow swap" to a new paradigm of "small memory – fast swap". © 2015, Institute of Electronics Engineers of Korea. All rights reserved.
Article
Energy-saving is one of the most important missions in the design of battery-based mobile systems. Many ideas have been suggested for saving energy in different system layers. Specifically, (1) lowering the supply voltage for idle CPU time slots, (2) using hybrid memory to save DRAM refresh power, and (3) task offloading to edge/cloud servers are well-acknowledged techniques used in CPU, memory, and network subsystems. In this paper, we show that co-optimizing these three techniques is necessary for further reducing the energy consumption of mobile real-time systems. To this end, we present an extended task model and formulate the effect of dynamic voltage/frequency scaling (DVFS), hybrid memory allocation, and task offloading problems as a unified measure. We then present a new real-time task scheduling scheme, called Co-TOMS, to co-optimize the energy-saving techniques in CPU, memory, and network subsystems by considering the given task set and resource conditions. The main contributions of our study can be summarized as follows. First, we optimize three energy-saving techniques across different system layers and find that they have significant influence on each other. For example, the effect of DVFS alone is limited in mobile systems, but combining it with offloading greatly amplifies its efficiency. Second, previous studies on offloading usually define a deadline as the maximum allowable latency at the application-level, but we focus on hard real-time systems that must meet task-level deadlines. Third, we design a steady-state genetic algorithm that allows fast convergence with reasonable computation overhead under various resource and workload conditions.
Article
As flash-based solid-state drive (SSD) becomes more prevalent because of the rapid fall in price and the significant increase in capacity, customers expect better data services than disk-based systems. However, higher performance and new characteristics of flash require a rethinking of data services. For example, backup and recovery is an important service in a database system since it protects data against unexpected hardware and software failures. To provide backup and recovery, backup/recovery tools or methods can be used. However, the tools perform time-consuming jobs, and the methods may negatively affect run-time performance during normal operation even though high-performance SSDs are used. To handle these issues, we propose an SSD-assisted backup/recovery scheme for database systems. Our scheme is to utilize the characteristics of flash for backup/recovery operations. To this end, we exploit the resources inside SSD, and we call our SSD BR-SSD. We design and implement the functionality in Samsung enterprise-class SSD for more realistic systems. Furthermore, we exploit and integrate BR-SSDs into database systems in replication and RAID environments, as well as a database system in a single BR-SSD. The experimental result demonstrates that our scheme provides fast backup/recovery but does not negatively affect the run-time performance during normal operation.
Article
The “human dimensions” of energy use in buildings refer to the energy-related behaviors of key stakeholders that affect energy use over the building life cycle. Stakeholders include building designers, operators, managers, engineers, occupants, industry, vendors, and policymakers, who directly or indirectly influence the acts of designing, constructing, living, operating, managing, and regulating the built environments, from individual building up to the urban scale. Among factors driving high-performance buildings, human dimensions play a role that is as significant as that of technological advances. However, this factor is not well understood, and, as a result, human dimensions are often ignored or simplified by stakeholders. This paper presents a review of the literature on human dimensions of building energy use to assess the state-of-the-art in this topic area. The paper highlights research needs for fully integrating human dimensions into the building design and operation processes with the goal of reducing energy use in buildings while enhancing occupant comfort and productivity. This research focuses on identifying key needs for each stakeholder involved in a building’s life cycle and takes an interdisciplinary focus that spans the fields of architecture and engineering design, sociology, data science, energy policy, codes, and standards to provide targeted insights. Greater understanding of the human dimensions of energy use has several potential benefits including reductions in operating cost for building owners; enhanced comfort conditions and productivity for building occupants; more effective building energy management and automation systems for building operators and energy managers; and the integration of more accurate control logic into the next generation of human-in-the-loop technologies. The review concludes by summarizing recommendations for policy makers and industry stakeholders for developing codes, standards, and technologies that can leverage the human dimensions of energy use to reliably predict and achieve energy use reductions in the residential and commercial buildings sectors.
Article
Synchronization has been studied extensively in the context of weakly coupled oscillators using the so-called phase response curve (PRC) which measures how a change of the phase of an oscillator is affected by a small perturbation. This approach was based upon the work of Malkin, and it has been extended to relaxation oscillators. Namely, synchronization conditions were established under the weak coupling assumption, leading to a criterion for the existence of synchronous solutions of weakly coupled relaxation oscillators. Previous analysis relies on the fact that the slow nullcline does not intersect the fast nullcline near one of its fold points, where canard solutions can arise. In the present study we use numerical continuation techniques to solve the adjoint equations and we show that synchronization properties of canard cycles are different than those of classical relaxation cycles. In particular, we highlight a new special role of the maximal canard in separating two distinct synchronization regimes: the Hopf regime and the relaxation regime. Phase plane analysis of slow-fast oscillators undergoing a canard explosion provides an explanation for this change of synchronization properties across the maximal canard.
Conference Paper
Full-text available
In recent years, the explosion of data such as text, images, audio, video, data centers, and backups has led to many problems in both storage and retrieval. Enterprises invest a lot of money in storing data, so efficient techniques are needed to handle it. There are two existing approaches for eliminating redundant data in storage systems: data deduplication and data reduction. Data deduplication is one of the best techniques, as it eliminates redundant data, reduces bandwidth, and minimizes disk usage and cost. Based on a study of the literature, this paper summarizes various storage optimization techniques, concepts, and categories that use data deduplication. In addition, chunk-based data deduplication techniques are surveyed in detail.
Article
A simplified model of the crustacean gastric mill network is considered. Rhythmic activity in this network has largely been attributed to half center oscillations driven by mutual inhibition. We use mathematical modeling and dynamical systems theory to show that rhythmic oscillations in this network may also depend on, or even arise from, a voltage-dependent electrical coupling between one of the cells in the half-center network and a projection neuron that lies outside of the network. This finding uncovers a potentially new mechanism for the generation of oscillations in neuronal networks.
Article
With the dramatic growth of data and the rapid enhancement of computing power, data accesses have become the bottleneck restricting the overall performance of a computer system. Emerging phase-change memory (PCM) is byte-addressable like DRAM, persistent like hard disks and flash SSDs, and about four orders of magnitude faster than hard disks or flash SSDs for typical file system I/Os. The maturity of PCM from research to production provides a new opportunity for improving the I/O performance of a system. However, PCM also has some weaknesses, for example, long write latency, limited write endurance, and high active energy. Existing processor cache systems, main memory systems, and online storage systems are unable to leverage the advantages of PCM and/or to mitigate PCM's drawbacks. The reason is that they are designed and optimized for SRAM, DRAM, and hard drives, respectively, instead of PCM. There have been some efforts to rethink computer architectures and software systems for PCM. This article presents a detailed survey and review of the areas of computer architecture and software systems that are oriented to PCM devices. First, we identify key technical challenges that need to be addressed before this memory technology can be leveraged, in the form of processor caches, main memory, and online storage, to build high-performance computer systems. Second, we examine various designs of computer architectures and software systems that are PCM aware. Finally, we obtain several helpful observations and propose a few suggestions on how to leverage PCM to optimize the performance of a computer system.
Article
In this paper, we adopt PCM (phase-change memory) as a virtual memory swap device and present a new page replacement policy that considers the characteristics of PCM. Specifically, we aim to reduce the write traffic to PCM by considering the dirtiness of pages when making a replacement decision. The proposed policy tracks the dirtiness of a page at the granularity of a sub-page and replaces the least dirty page among the pages not recently used. Experimental results show that the proposed policy reduces the amount of data written to PCM by 22.9% on average and up to 73.7% compared to CLOCK. It also extends the lifespan of PCM by 49.0% and reduces the energy consumption of PCM by 3.0% on average.
Article
Big data has become a hot topic in both academia and industry. However, due to the limitations of current computer system architectures, big data management faces many new challenges with respect to performance, energy, etc. Recently, a new kind of storage medium called phase change memory (PCM) has introduced new opportunities for advancing computer architectures and big data management, due to its non-volatility, byte-addressability, high read speed, low energy, etc. As a non-volatile storage medium, PCM has some unique features of DRAM, such as byte-addressability and high read/write performance, and thus can be regarded as a cross-layer storage medium for re-designing the current storage architecture so as to realize high-performance storage. In this paper, we summarize the features of PCM and present a survey on PCM-based data management. We discuss the related advances in terms of two aspects, namely PCM used as secondary storage and PCM used as main memory. We also introduce current studies on the applications of PCM in various areas. Finally, we propose some future research directions on PCM-based data management so as to provide valuable references for big data storage and management on new storage architectures.
Article
This letter analyzes the characteristics of write operations in file systems for PCM (phase-change memory), and observes that a large proportion of writes only involve a small modification of the previous version. This observation motivates a new file system for PCM called P2FS, which reduces write traffic to PCM by exploiting the similarity of data between versions, while preserving the same reliability level of existing copy-on-write file systems. Experimental results show that P2FS reduces file I/O time by 48%, and energy consumption by 65%, on average, compared to copy-on-write.
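The similarity-exploiting write path described in this abstract can be pictured with a short C sketch (an illustration, not P2FS itself): compare the new block against the previous version word by word and write to PCM only the words that differ. The consistency machinery a real copy-on-write file system also needs is deliberately left out.

```c
#include <stdint.h>
#include <stddef.h>

#define BLOCK_WORDS (4096 / sizeof(uint64_t))

/* Returns the number of 8-byte words written, a proxy for PCM write traffic. */
size_t write_delta(uint64_t *pcm_block, const uint64_t *new_version)
{
    size_t written = 0;
    for (size_t i = 0; i < BLOCK_WORDS; i++) {
        if (pcm_block[i] != new_version[i]) {
            pcm_block[i] = new_version[i];   /* byte-addressable in-place update */
            written++;
        }
    }
    return written;
}
```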
Conference Paper
Business continuity is essential for any enterprise application; remote replication enables customers to store data on a Logical Disk (LDisk) at the local site and replicate it to remote locations. In case of a disaster at the local site, the replicated LDisk (remote copy) at the remote site is marked as the primary copy and is made available without any downtime. Replication to the destination is configured either in sync mode or async mode. In async mode, host IOs are first processed by the source array at the local site. A snapshot of the LDisk is triggered periodically and the new snapshot is replicated to the destination array at the remote site. In this configuration, one particular node of the source array becomes loaded with ongoing host IOs, snapshot, and replication activities. In the scale-out model, a storage array consists of multiple nodes, and hence the replication tasks and responsibilities can be distributed to a different node. We propose a cloning mechanism called DeltaClone, which replicates the incremental changes of an LDisk across nodes. The ownership of an LDisk and of its DeltaClone are assigned to two different nodes, called the master node and the slave node, respectively. When the periodic request is triggered to synchronize the LDisk data with its remote copy, the current DeltaClone is frozen and then merged with the remote copy. Hence, the replication tasks are carried out at the slave node without affecting the performance of the master node and the ongoing host IOs. The slave node is re-elected periodically to ensure dynamic load balancing across the nodes. Our distributed design improves overall storage performance, and simulation results show that the proposed method outperforms the traditional methods.
Conference Paper
The emerging storage class memory (SCM), which offers byte-addressability, persistence, and low power consumption, is expected to replace memory and storage, and a new file system is required in this environment. In the design of file systems, data consistency is one of the most important issues to take into account. To this end, most file systems exploit journaling or shadow paging. Shadow paging employs copy-on-write for consistency, but it incurs heavy copy overhead. To relieve this problem, BPFS proposed short-circuit shadow paging; however, our experiments show that it still incurs many copy-on-write blocks. In this paper, we propose a last-block logging mechanism that improves the performance and lifetime of SCM-based file systems by considerably reducing the number of copy-on-write blocks. Our approach stores only the changed contents in the available space of the last block, instead of performing copy-on-write on the entire block. It also updates the address of the last block and maintains information on the logged data in order to ensure data consistency. The SQLite benchmark shows that the proposed mechanism reduces the overall elapsed time by 14% and the amount of written data by up to 72%, compared to the mechanism of BPFS.
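A hedged sketch of the last-block-logging idea in C: when an update touches only part of the file's last block and free space remains in that block, store the new bytes in the free area instead of copy-on-writing the whole block. The structures are assumptions for illustration, not the authors' layout, and the metadata/log-record bookkeeping the abstract mentions is omitted.

```c
#include <string.h>
#include <stdint.h>

#define BLOCK_SIZE 4096

struct last_block {
    uint32_t used;              /* bytes of valid file data in this block  */
    char     bytes[BLOCK_SIZE]; /* data area; the tail doubles as log space */
};

/* Returns 1 if the update fit into the last block's free space, else 0
 * (the caller would fall back to an ordinary copy-on-write of the block). */
int log_in_last_block(struct last_block *blk, const char *data, uint32_t len)
{
    if (blk->used + len > BLOCK_SIZE)
        return 0;                               /* no room: fall back to COW */
    memcpy(blk->bytes + blk->used, data, len);  /* write only the new bytes  */
    blk->used += len;                           /* commit by updating length */
    return 1;
}
```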
Article
Full-text available
Modern file systems assume the use of disk, a system-wide performance bottleneck for over a decade. Current disk caching and RAM file systems either impose high overhead to access memory content or fail to provide mechanisms to achieve data persistence across reboots. The Conquest file system is based on the observation that memory is becoming inexpensive, which enables all file system services to be delivered from memory, except for providing large storage capacity. Unlike caching, Conquest uses memory with battery backup as persistent storage, and provides specialized and separate data paths to memory and disk. Therefore, the memory data path contains no disk-related complexity. The disk data path consists of optimizations only for the specialized disk usage pattern. Compared to a memory-based file system, Conquest incurs little performance overhead. Compared to several disk-based file systems, Conquest achieves 1.3x to 19x faster memory performance, and 1.4x to 2.0x faster performance when exercising both memory and disk. Conquest realizes most of the benefits of persistent RAM at a fraction of the cost of a RAM-only solution. It also demonstrates that disk-related optimizations impose high overheads for accessing memory content in a memory-rich environment.
Article
Full-text available
Persistent systems support mechanisms which allow programs to create and manipulate arbitrary data structures which outlive the execution of the program which created them. A persistent store supports mechanisms for the storage and retrieval of objects in a uniform manner regardless of their lifetime. Since all data of the system is in this repository it is important that it always be in a consistent state. This property is called integrity. The integrity of the persistent store depends in part on the store being resilient to failures. That is, when an error occurs the store can recover to a previously recorded consistent state. The mechanism for recording this state and performing recovery is called stability. This paper considers an implementation of a persistent store based on a large virtual memory and shows how stability is achieved.
Conference Paper
Full-text available
Nonvolatile memory technology is evolving continuously, and commercial products such as FeRAM and PRAM are now appearing in the market. As Nonvolatile-RAM (NVRAM) has properties of both memory and storage, it can store persistent data objects while allowing fast and random access. To utilize NVRAM for general-purpose storage of frequently updated data across power disruptions, some essential features of a file system, including naming, recovery, and space management, are required while exploiting the memory-like properties of NVRAM. Conventional file systems, including even recently developed NVRAM file systems, show very low space efficiency, wasting more than 50% of the total space in some cases. To efficiently utilize the relatively expensive NVRAM, we design and analyze a new extent-based file system, which we call NEBFS (NVRAM Extent-Based File System). We analyze and compare the space utilization of conventional file systems with NEBFS.
Conference Paper
Full-text available
Phase-change memory (PCM) is becoming widely recognized as the most likely candidate to unify the many memory technologies that exist today (Lee, et al., 2007). The combination of the non-volatile attributes of flash, RAM-like bit-alterability, and fast reads and writes positions PCM to enable changes in the memory subsystems of cellular phones, PCs, and countless embedded and consumer electronics applications. This design's multi-level cell (MLC) capabilities combined with long-term scalability reduce PCM costs as only realized before by hard disk drives. MLC technology is challenged with fitting more cell states (4 in the case of 2 bits per cell), along with distribution spreads due to process, design, and environmental variations, within a limited window. We describe a 256Mb MLC test-chip in a 90nm micro-trench (μtrench) PCM technology, and MLC endurance results from an 8Mb 0.18μm PCM test-chip with the same trench cell structure. A program algorithm achieving tightly placed inner states and experimental results illustrating distinct current distributions are presented to demonstrate MLC capability.
Conference Paper
Full-text available
Technology trends may soon favor building main memory as a hybrid between DRAM and non-volatile memory, such as flash or PC-RAM. We describe how the operating system might manage such hybrid memories, using semantic information not available in other layers. We describe preliminary experiments suggesting that this approach is viable.
Conference Paper
Full-text available
Magnetic RAM (MRAM) is a new memory technology with access and cost characteristics comparable to those of conventional dynamic RAM (DRAM) and the non-volatility of magnetic media such as disk. Simply replacing DRAM with MRAM will make main memory non-volatile, but it will not improve file system performance. However, effective use of MRAM in a file system has the potential to significantly improve performance over existing file systems. The HeRMES file system will use MRAM to dramatically improve file system performance by using it as a permanent store for both file system data and metadata. In particular, metadata operations, which make up over 50% of all file system requests [14], are nearly free in HeRMES because they do not require any disk accesses. Data requests will also be faster, both because of increased metadata request speed and because using MRAM as a non-volatile cache will allow HeRMES to better optimize data placement on disk. Though MRAM capacity is too small to replace disk entirely, HeRMES will use MRAM to provide high-speed access to relatively small units of data and metadata, leaving most file data stored on disk.
Conference Paper
Full-text available
File systems using non-volatile RAM (NVRAM) promise great improvements in file system performance over conventional disk storage. However, current technology allows for a relatively small amount of NVRAM, limiting the effectiveness of such an approach. We have developed a prototype in-memory file system which utilizes data compression on inodes, and which has preliminary support for compression of file blocks. Our file system, MRAMFS, is also based on data structures tuned for storage efficiency in non-volatile memory. This prototype allows us to examine how to use this limited resource more efficiently. Simulations show that inodes can be reduced to 15-20 bytes each at a rate of 250,000 or more inodes per second. This is a space savings of 79-85% over conventional 128-byte inodes. Our prototype file system shows that for metadata operations, inode compression does not significantly impact performance, while significantly reducing the space used by inodes. We also note that a naive block-based implementation of file compression does not perform acceptably either in terms of speed or compression achieved.
Conference Paper
Full-text available
In this paper, we present a scalable and efficient flash file system using a combination of NAND flash and Phase-change RAM (PRAM). Until now, several flash file systems have been developed considering the physical characteristics of NAND flash. However, previous flash file systems still have a high performance overhead and a scalability problem in mounting time and memory usage because, in most cases, metadata updates involve only a few words while writes in NAND flash must be performed in units of a page, which is typically 2 KiB. The proposed flash file system, called PFFS, uses PRAM to mitigate this limitation of NAND flash. PRAM is a next-generation non-volatile memory well suited to word-level reads and writes of small amounts of data. PFFS therefore separates the metadata from the regular data in the file system and saves it in PRAM. Consequently, PFFS manages all files and directories in PRAM and outperforms other flash file systems. The experimental results show that the performance of PFFS is 25% better than YAFFS2 for small-file writes while matching YAFFS2 performance for large writes, and the mounting time and memory usage of PFFS are O(1).
Conference Paper
Full-text available
Modern computer systems have been built around the assumption that persistent storage is accessed via a slow, block-based interface. However, new byte-addressable, persistent memory technologies such as phase change memory (PCM) offer fast, fine-grained access to persistent storage. In this paper, we present a file system and a hardware architecture that are designed around the properties of persistent, byte-addressable memory. Our file system, BPFS, uses a new technique called short-circuit shadow paging to provide atomic, fine-grained updates to persistent storage. As a result, BPFS provides strong reliability guarantees and offers better performance than traditional file systems, even when both are run on top of byte-addressable, persistent memory. Our hardware architecture enforces atomicity and ordering guarantees required by BPFS while still providing the performance benefits of the L1 and L2 caches. Since these memory technologies are not yet widely available, we evaluate BPFS on DRAM against NTFS on both a RAM disk and a traditional disk. Then, we use microarchitectural simulations to estimate the performance of BPFS on PCM. Despite providing strong safety and consistency guarantees, BPFS on DRAM is typically twice as fast as NTFS on a RAM disk and 4-10 times faster than NTFS on disk. We also show that BPFS on PCM should be significantly faster than a traditional disk-based file system.
Conference Paper
Full-text available
The scalability of future massively parallel processing (MPP) systems is challenged by high failure rates. Current hard disk drive (HDD) checkpointing results in overhead of 25% or more at the petascale. With a direct correlation between checkpoint frequencies and node counts, novel techniques that can take more frequent checkpoints with minimum overhead are critical to implement a reliable exascale system. In this work, we leverage the upcoming Phase-Change Random Access Memory (PCRAM) technology and propose a hybrid local/global checkpointing mechanism after a thorough analysis of MPP systems failure rates and failure sources. We propose three variants of PCRAM-based hybrid checkpointing schemes, DIMM+HDD, DIMM+DIMM, and 3D+3D, to reduce the checkpoint overhead and offer a smooth transition from the conventional pure HDD checkpoint to the ideal 3D PCRAM mechanism. The proposed pure 3D PCRAM-based mechanism can ultimately take checkpoints with overhead less than 4% on a projected exascale system.
Article
Full-text available
B-trees are used by many file systems to represent files and directories. They provide guaranteed logarithmic time key-search, insert, and remove. File systems like WAFL and ZFS use shadowing, or copy-on-write, to implement snapshots, crash recovery, write-batching, and RAID. Serious difficulties arise when trying to use b-trees and shadowing in a single system. This article is about a set of b-tree algorithms that respects shadowing, achieves good concurrency, and implements cloning (writeable snapshots). Our cloning algorithm is efficient and allows the creation of a large number of clones. We believe that using our b-trees would allow shadowing file systems to better scale their on-disk data structures.
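A simplified C sketch of how a shadowing b-tree can support cheap clones (in the spirit of this abstract, not its actual algorithm): a clone copies only the root and raises the reference counts of the root's children; a node is physically copied later, on first modification, when its reference count shows it is shared. Node layout and child counting are simplified for illustration.

```c
#include <stdlib.h>
#include <string.h>

#define FANOUT 8

struct bnode {
    int           refcnt;
    int           nkeys;
    long          key[FANOUT];
    struct bnode *child[FANOUT];   /* NULL in leaf nodes (child count simplified) */
};

/* Writable snapshot: duplicate the root and share everything below it.    */
struct bnode *clone_tree(struct bnode *root)
{
    struct bnode *newroot = malloc(sizeof(*newroot));
    memcpy(newroot, root, sizeof(*newroot));
    newroot->refcnt = 1;
    for (int i = 0; i < newroot->nkeys; i++)
        if (newroot->child[i])
            newroot->child[i]->refcnt++;   /* children now shared by both trees */
    return newroot;
}

/* Copy-on-write of one node along a modification path.                    */
struct bnode *cow_node(struct bnode *n)
{
    if (n->refcnt == 1)
        return n;                          /* already private: modify in place */
    struct bnode *copy = malloc(sizeof(*copy));
    memcpy(copy, n, sizeof(*copy));
    copy->refcnt = 1;
    n->refcnt--;                           /* original stays with the other clone */
    for (int i = 0; i < copy->nkeys; i++)
        if (copy->child[i])
            copy->child[i]->refcnt++;      /* the copy references the children too */
    return copy;
}
```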
Article
Full-text available
While NAND flash memory is used in a variety of end-user devices, it has a few disadvantages, such as asymmetric read and write speeds and the inability to perform in-place updates, among others. To overcome these problems, various flash-aware strategies have been suggested in terms of the buffer cache, file system, FTL, and others. Also, the recent development of next-generation nonvolatile memory types such as MRAM, FeRAM, and PRAM provides higher commercial value to non-volatile RAM (NVRAM). At today's prices, however, they are not yet cost-effective. In this paper, we suggest utilizing a small amount of next-generation NVRAM as a write buffer to improve the overall performance of NAND flash memory-based storage systems. We propose various block-based NVRAM write buffer management policies and evaluate the performance improvement of NAND flash memory-based storage systems under each policy. Also, we propose a novel write buffer-aware flash translation layer algorithm, optimistic FTL, which is designed to harmonize well with NVRAM write buffers. Simulation results show that the proposed buffer management policies outperform the traditional page-based LRU algorithm and that the proposed optimistic FTL outperforms previous log block-based FTL algorithms, such as BAST and FAST.
Article
Full-text available
Software transactional memory (STM) has been proposed to simplify the development and to increase the scalability of concurrent programs. One problem of existing STMs is that of having long-running read transactions co-exist with shorter update transactions. This problem is of practical importance and has so far not been addressed by other papers in this domain. We approach this problem by investigating the performance of a STM using snapshot isolation and a novel lazy multi-version snapshot algorithm to decrease the validation costs - which can increase quadratically with the number of objects read in STMs with invisible reads. Our measurements demonstrate that snapshot isolation can increase throughput for workloads with long transactions. In comparison to other STMs with invisible reads, we can reduce the validation costs by using our lazy consistent snapshot algorithm.
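A minimal multi-version read in the snapshot-isolation style this abstract describes (illustrative only, not the paper's algorithm): each object keeps a chain of committed versions, and a transaction reads the newest version whose commit timestamp is not later than the transaction's snapshot timestamp, so long-running readers never need to validate against concurrent updates.

```c
#include <stdint.h>
#include <stddef.h>

struct version {
    uint64_t        commit_ts;   /* commit timestamp of this version       */
    void           *data;
    struct version *older;       /* next older version, or NULL            */
};

struct object {
    struct version *newest;      /* head of the version chain              */
};

/* Lazy snapshot read: walk back to the version visible at snapshot_ts.    */
void *snapshot_read(const struct object *obj, uint64_t snapshot_ts)
{
    for (struct version *v = obj->newest; v != NULL; v = v->older)
        if (v->commit_ts <= snapshot_ts)
            return v->data;
    return NULL;                 /* the object did not exist at that snapshot */
}
```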
Article
Full-text available
Conquest is a disk/persistent-RAM hybrid file system that is incrementally deployable and realizes most of the benefits of cheaply abundant persistent RAM. Conquest consists of two specialized and simplified data paths for in-core and on-disk storage and outperforms popular disk-based file systems by 43% to 97%.
Article
Full-text available
A comprehensive versioning file system creates and retains a new file version for every WRITE or other modification request. The resulting history of file modifications provides a detailed view to tools and administrators seeking to investigate a suspect system state. Conventional versioning systems do not efficiently record the many prior versions that result. In particular, the versioned metadata they keep consumes almost as much space as the versioned data. This paper examines two space-efficient metadata structures for versioning file systems and describes their integration into the Comprehensive Versioning File System (CVFS). Journal-based metadata encodes each metadata version into a single journal entry; CVFS uses this structure for inodes and indirect blocks, reducing the associated space requirements by 80%. Multiversion b-trees extend the per-entry key with a timestamp and keep current and historical entries in a single tree; CVFS uses this structure for directories, reducing the associated space requirements by 99%. Experiments with CVFS verify that its current-version performance is similar to that of non-versioning file systems. Although access to historical versions is slower than conventional versioning systems, checkpointing is shown to mitigate this effect.
Article
DRAM is facing severe scalability challenges in sub-45nm technology nodes due to precise charge placement and sensing hurdles in deep-submicron geometries. Resistive memories, such as phase-change memory (PCM), already scale well beyond DRAM and are a promising DRAM replacement. Unfortunately, PCM is write-limited, and current approaches to managing writes must decommission pages of PCM when the first bit fails. This paper presents dynamically replicated memory (DRM), the first hardware and operating system interface designed for PCM that allows continued operation through graceful degradation when hard faults occur. DRM reuses memory pages that contain hard faults by dynamically forming pairs of complementary pages that act as a single page of storage. No changes are required to the processor cores, the cache hierarchy, or the operating system's page tables. By changing the memory controller, the TLBs, and the operating system to be DRM-aware, we can improve the lifetime of PCM by up to 40x over conventional error-detection techniques.
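An illustrative C check (not the DRM hardware's actual logic) for pairing two partially failed PCM pages: the pair can act as one good page only if no byte offset is faulty in both pages, so that every offset can be served from whichever page is still healthy there. The bitmap layout is an assumption for the sketch.

```c
#include <stdint.h>
#include <stddef.h>

#define PAGE_BYTES  4096
#define FAULT_WORDS (PAGE_BYTES / 64)

struct page_faults {
    uint64_t faulty[FAULT_WORDS];   /* bit i set => byte i has a hard fault */
};

int pages_are_complementary(const struct page_faults *a, const struct page_faults *b)
{
    for (size_t i = 0; i < FAULT_WORDS; i++)
        if (a->faulty[i] & b->faulty[i])
            return 0;               /* both pages broken at the same offset   */
    return 1;                       /* every offset readable from one of them */
}
```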
Article
This paper discusses a number of modelling approaches to simulate phase-transition behaviour in Ge2Sb2Te5 material used in optical and electrical phase-change memories. Such models include the well-known Johnson-Mehl-Avrami-Kolmogorov (JMAK) formalism to calculate the fraction of crystallized material. In the literature this model is widely used, but parameters of the model reported by different investigators vary considerably, and can probably be attributed to the inappropriate use of the JMAK approach. In order to overcome the restrictions imposed by JMAK theory, generalizations based on the classical nucleation theory have been suggested and used by several researchers. A more recent alternative modelling technique is the so-called Cellular Automata (CA) approach. A CA representation can be driven by stochastic or deterministic rules, by theoretical descriptions or by empirical rules derived from experimental observations. Finally, yet another approach is that based on evaluating the population density of 'crystal clusters' using rate-equations. Although it is computationally quite intensive, this technique has yielded favourable and reliable results, and we present various simulation-experiment comparisons to illustrate the capabilities of this approach. We also use the various models to assess the limits of thermodynamic stability in GeSbTe phase-change materials, with a view to using such material in ultra-high density (1 Tbit/sq.in. and beyond) scanning probe-based memories. Some write-read simulations of such a memory at Tbit/sq.in densities are also presented.
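For reference, the JMAK (Avrami) relation mentioned above is commonly written as X(t) = 1 - exp(-K t^n), where X(t) is the crystallized fraction, K is a temperature-dependent rate constant, and n is the Avrami exponent; the widely varying parameter values reported in the literature correspond to different fitted choices of K and n.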
Article
Existing file system benchmarks are deficient in portraying performance in the ephemeral small-file regime used by Internet software, especially: electronic mail; netnews; and web-based commerce. PostMark is a new benchmark to measure performance for this class of application. In this paper, PostMark test results are presented and analyzed for both UNIX and Windows NT application servers. Network Appliance Filers (file server appliances) are shown to provide superior performance (via NFS or CIFS) compared to local disk alternatives, especially at higher loads. Such results are consistent with reports from ISPs (Internet Service Providers) who have deployed NetApp filers to support such applications on a large scale.
Article
The workstation file system for the Cedar programming environment was modified to improve its robustness and performance. Previously, the file system used hardware-provided labels on disk blocks to increase robustness against hardware and software errors. The new system does not require hardware disk labels, yet is more robust than the old system. Recovery is rapid after a crash. The performance of operations on file system metadata, e.g., file creation or open, is greatly improved. The new file system has two features that make it atypical. The system uses a log, as do most database systems, to recover metadata about the file system. To gain performance, it uses group commit, a concept derived from high performance database systems. The design of the system used a simple, yet detailed and accurate, analytical model to choose between several design alternatives in order to provide good disk performance.
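The group-commit idea referenced above is easy to illustrate: metadata log records from many operations are batched and committed with a single disk write rather than one write per operation. The sketch below is a minimal, hypothetical version (fixed batch size, no timer or fsync), not the Cedar implementation.

#include <stdio.h>
#include <string.h>

#define LOG_BATCH 8

/* A pending metadata log record (e.g. "create file", "open"). */
struct log_rec {
    char op[32];
};

static struct log_rec batch[LOG_BATCH];
static int pending = 0;

/* Hypothetical single disk write that commits all batched records. */
static void flush_batch(void)
{
    if (pending == 0)
        return;
    printf("group commit: flushing %d records in one disk write\n", pending);
    pending = 0;
}

/* Append a record; only when the batch fills (or, in a real system,
 * a short timer expires) is the whole group committed together. */
void log_append(const char *op)
{
    strncpy(batch[pending].op, op, sizeof(batch[pending].op) - 1);
    batch[pending].op[sizeof(batch[pending].op) - 1] = '\0';
    if (++pending == LOG_BATCH)
        flush_batch();
}

int main(void)
{
    for (int i = 0; i < 10; i++)
        log_append(i % 2 ? "create" : "open");
    flush_batch();   /* commit the remainder */
    return 0;
}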
Conference Paper
PCM is both a sustaining technology and a disruptive technology. These two aspects can be considered together to speed up PCM market penetration. In addition, PCM can be exploited by the memory system and by the convergence of consumer, computer and communication electronic systems. Some topics of PCM penetration in different memory systems have been described. The caching of the existing memory technologies, reducing the overall system cost and system complexity, will be the compelling motivation. Bandwidth will drive the sustaining side of PCM in code and data transfer applications, while reduction in power dissipation will represent a further added value of this technology.
Conference Paper
In order to reduce the energy dissipation in main memory of computer systems, phase change memory (PCM) has emerged as one of the most promising technologies to incorporate into the memory hierarchy. However, PCM has two critical weaknesses to substitute DRAM memory in its entirety. First, the number of write operations allowed to each PCM cell is limited. Second, write access time of PCM is about 6-10 times slower than that of DRAM. To cope with this situation, hybrid memory architectures that use a small amount of DRAM together with PCM memory have been suggested. In this paper, we present a new memory management technique for hybrid PCM and DRAM memory architecture that efficiently hides the slow write performance of PCM. Specifically, we aim to estimate future write references accurately and then absorb most memory writes into DRAM. To do this, we analyze the characteristics of memory write references and find two noticeable phenomena. First, using write history alone performs better than using both read and write history in estimating future write references. Second, the frequency characteristic is a better estimator than temporal locality but combining these two properties appropriately leads to even better results. Based on these two observations, we present a new page replacement algorithm called CLOCK-DWF (CLOCK with Dirty bits and Write Frequency) that significantly reduces the number of write operations that occur on PCM.
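The sketch below shows a simplified dirty-aware CLOCK-style victim scan in the spirit of this work; it is not the authors' exact CLOCK-DWF policy (which additionally tracks per-page write frequency), and the write-back step is only noted in a comment.

#include <stdbool.h>
#include <stdio.h>

#define NFRAMES 4

struct frame {
    int  page;        /* resident page number            */
    bool referenced;  /* set on every access             */
    bool dirty;       /* set on write accesses           */
};

static struct frame frames[NFRAMES];
static int hand = 0;

/* Pick a victim: prefer a frame that is neither recently used nor dirty,
 * clearing reference bits as the hand sweeps, so clean cold pages are
 * evicted before dirty ones (which would cost an extra write to PCM). */
int choose_victim(void)
{
    for (;;) {
        struct frame *f = &frames[hand];
        if (!f->referenced && !f->dirty)
            return hand;
        if (f->referenced)
            f->referenced = false;   /* second chance                       */
        else
            f->dirty = false;        /* a real policy would schedule the
                                        write-back here before clearing     */
        hand = (hand + 1) % NFRAMES;
    }
}

int main(void)
{
    for (int i = 0; i < NFRAMES; i++)
        frames[i] = (struct frame){ .page = i, .referenced = true,
                                    .dirty = (i % 2 == 0) };
    printf("victim frame: %d\n", choose_victim());   /* picks a clean frame */
    return 0;
}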
Article
The dream of replacing rotating mechanical storage, the disk drive, with solid-state, nonvolatile RAM may become a reality in the near future. Approximately ten new technologies—collectively called storage-class memory (SCM)—are currently under development and promise to be fast, inexpensive, and power efficient. Using SCM as a disk drive replacement, storage system products will have random and sequential I/O performance that is orders of magnitude better than that of comparable disk-based systems and require much less space and power in the data center. In this paper, we extrapolate disk and SCM technology trends to 2020 and analyze the impact on storage systems. The result is a 100- to 1,000-fold advantage for SCM in terms of the data center space and power required.
Conference Paper
The predicted shift to non-volatile, byte-addressable memory (e.g., Phase Change Memory and Memristor), the growth of "big data", and the subsequent emergence of frameworks such as memcached and NoSQL systems require us to rethink the design of data stores. To derive the maximum performance from these new memory technologies, this paper proposes the use of single-level data stores. For these systems, where no distinction is made between a volatile and a persistent copy of data, we present Consistent and Durable Data Structures (CDDSs) that, on current hardware, allows programmers to safely exploit the low-latency and non-volatile aspects of new memory technologies. CDDSs use versioning to allow atomic updates without requiring logging. The same versioning scheme also enables rollback for failure recovery. When compared to a memory-backed Berkeley DB B-Tree, our prototype-based results show that a CDDS B-Tree can increase put and get throughput by 74% and 138%. When compared to Cassandra, a two-level data store, Tembo, a CDDS B-Tree enabled distributed Key-Value system, increases throughput by up to 250%-286%.
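A minimal sketch of the versioning idea follows: each entry records the version range in which it is visible, so updates never overwrite in place and older consistent versions remain available for rollback. The field names and the flat array stand in for the paper's CDDS B-Tree and are assumptions for illustration.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define LIVE UINT64_MAX   /* entry not yet superseded */

/* A versioned entry: valid from start_ver (inclusive) to end_ver (exclusive). */
struct cdds_entry {
    uint64_t key;
    uint64_t value;
    uint64_t start_ver;
    uint64_t end_ver;
};

bool visible_at(const struct cdds_entry *e, uint64_t ver)
{
    return e->start_ver <= ver && ver < e->end_ver;
}

int main(void)
{
    struct cdds_entry log[] = {
        { 7, 100, 1, 3    },  /* value 100 valid in versions [1,3) */
        { 7, 200, 3, LIVE },  /* value 200 valid from version 3 on */
    };
    for (uint64_t v = 1; v <= 4; v++)
        for (int i = 0; i < 2; i++)
            if (visible_at(&log[i], v))
                printf("version %llu: key 7 -> %llu\n",
                       (unsigned long long)v,
                       (unsigned long long)log[i].value);
    return 0;
}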
Conference Paper
Memory scaling is in jeopardy as charge storage and sensing mechanisms become less reliable for prevalent memory technologies, such as DRAM. In contrast, phase change memory (PCM) storage relies on scalable current and thermal mechanisms. To exploit PCM's scalability as a DRAM alternative, PCM must be architected to address relatively long latencies, high energy writes, and finite endurance. We propose, crafted from a fundamental understanding of PCM technology parameters, area-neutral architectural enhancements that address these limitations and make PCM competitive with DRAM. A baseline PCM system is 1.6x slower and requires 2.2x more energy than a DRAM system. Buffer reorganizations reduce this delay and energy gap to 1.2x and 1.0x, using narrow rows to mitigate write energy and multiple rows to improve locality and write coalescing. Partial writes enhance memory endurance, providing 5.6 years of lifetime. Process scaling will further reduce PCM energy costs and improve endurance.
Conference Paper
Using nonvolatile memories in memory hierarchy has been investigated to reduce its energy consumption because nonvolatile memories consume zero leakage power in memory cells. One of the difficulties is, however, that the endurance of most nonvolatile memory technologies is much shorter than the conventional SRAM and DRAM technology. This has limited its usage to only the low levels of a memory hierarchy, e.g., disks, that is far from the CPU. In this paper, we study the use of a new type of nonvolatile memories -- the Phase Change Memory (PCM) as the main memory for a 3D stacked chip. The main challenges we face are the limited PCM endurance, longer access latencies, and higher dynamic power compared to the conventional DRAM technology. We propose techniques to extend the endurance of the PCM to an average of 13 (for MLC PCM cell) to 22 (for SLC PCM) years. We also study the design choices of implementing PCM to achieve the best tradeoff between energy and performance. Our design reduced the total energy of an already low-power DRAM main memory of the same capacity by 65%, and energy-delay² product by 60%. These results indicate that it is feasible to use PCM technology in place of DRAM in the main memory for better energy efficiency.
Conference Paper
The memory subsystem accounts for a significant cost and power budget of a computer system. Current DRAM-based main memory systems are starting to hit the power and cost limit. An alternative memory technology that uses resistance contrast in phase-change materials is being actively investigated in the circuits community. Phase Change Memory (PCM) devices offer more density relative to DRAM, and can help increase main memory capacity of future systems while remaining within the cost and power constraints. In this paper, we analyze a PCM-based hybrid main memory system using an architecture level model of PCM. We explore the trade-offs for a main memory system consisting of PCM storage coupled with a small DRAM buffer. Such an architecture has the latency benefits of DRAM and the capacity benefits of PCM. Our evaluations for a baseline system of 16-cores with 8GB DRAM show that, on average, PCM can reduce page faults by 5X and provide a speedup of 3X. As PCM is projected to have limited write endurance, we also propose simple organizational and management solutions of the hybrid memory that reduces the write traffic to PCM, boosting its lifetime from 3 years to 9.7 years.
Conference Paper
Phase change memory (PCM) is an emerging memory technology for future computing systems. Compared to other non-volatile memory alternatives, PCM is more mature for production, and has a faster read latency and potentially higher storage density. The main roadblock precluding PCM from being used, in particular, in the main memory hierarchy, is its limited write endurance. To address this issue, recent studies proposed to either reduce PCM's write frequency or use wear-leveling to evenly distribute writes. Although these techniques can extend the lifetime of PCM, most of them will not prevent deliberately designed malicious codes from wearing it out quickly. Furthermore, all the prior techniques did not consider the circumstances of a compromised OS and its security implication to the overall PCM design. A compromised OS will allow adversaries to manipulate processes and exploit side channels to accelerate wear-out. In this paper, we argue that a PCM design not only has to consider normal wear-out under normal application behavior; most importantly, it must take the worst-case scenario into account with the presence of malicious exploits and a compromised OS to address the durability and security issues simultaneously. In this paper, we propose a novel, low-cost hardware mechanism called Security Refresh to avoid information leakage by constantly migrating data's physical locations inside the PCM, obfuscating the actual data placement from users and system software. It uses a dynamic randomized address mapping scheme that swaps data using random keys upon each refresh due. The hardware overhead is tiny without using any table. The best lifetime we can achieve under the worst-case malicious attack is more than six years. Also, our scheme incurs around 1% performance degradation for normal program operations.
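As a toy illustration of key-based address randomization in this spirit (not the actual Security Refresh mechanism, which remaps regions incrementally and swaps data pairwise on each refresh), the sketch below maps a logical line to a physical line with an XOR of a per-region random key; changing the key relocates every line.

#include <stdint.h>
#include <stdio.h>

#define REGION_LINES 1024   /* lines per remapped region (power of two) */

/* Map a logical line to a physical line using an XOR of the region's
 * random key; picking a new key on a "refresh" moves every line to a
 * new physical location, obscuring the placement from software. */
static uint32_t remap(uint32_t logical_line, uint32_t key)
{
    return (logical_line ^ key) % REGION_LINES;
}

int main(void)
{
    uint32_t old_key = 0x2a7, new_key = 0x193;   /* hypothetical keys */
    uint32_t line = 5;
    printf("line %u: old physical %u, new physical %u\n",
           line, remap(line, old_key), remap(line, new_key));
    return 0;
}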
Article
Memory scaling is in jeopardy as charge storage and sensing mechanisms become less reliable for prevalent memory technologies, such as dynamic random access memory (DRAM). In contrast, phase change memory (PCM) relies on programmable resistances, as well as scalable current and thermal mechanisms. To deploy PCM as a DRAM alternative and to exploit its scalability, PCM must be architected to address relatively long latencies, high energy writes, and finite endurance. We propose architectural enhancements that address these limitations and make PCM competitive with DRAM. A baseline PCM system is 1.6× slower and requires 2.2× more energy than a DRAM system. Buffer reorganizations reduce this delay and energy gap to 1.2× and 1.0×, using narrow rows to mitigate write energy as well as multiple rows to improve locality and write coalescing. Partial writes mitigate limited memory endurance to provide more than 10 years of lifetime. Process scaling will further reduce PCM energy costs and improve endurance.
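One common way to realize the partial writes mentioned above is a read-compare-write (differential write) that updates only the words of a line that actually changed. The sketch below illustrates that idea only; it is not the paper's row-buffer organization, and the line size is an assumption.

#include <stdint.h>
#include <stdio.h>

#define LINE_WORDS 8   /* 64-byte line modeled as eight 64-bit words */

/* Write back a cache line, touching only the PCM words whose contents
 * differ; returns how many word writes were actually issued. */
int partial_writeback(uint64_t *pcm_line, const uint64_t *new_line)
{
    int written = 0;
    for (int i = 0; i < LINE_WORDS; i++) {
        if (pcm_line[i] != new_line[i]) {   /* read-compare step        */
            pcm_line[i] = new_line[i];      /* write only changed words */
            written++;
        }
    }
    return written;
}

int main(void)
{
    uint64_t pcm[LINE_WORDS]  = { 0, 1, 2, 3, 4, 5, 6, 7 };
    uint64_t line[LINE_WORDS] = { 0, 1, 9, 3, 4, 5, 6, 7 };  /* one word dirty */
    printf("words written: %d of %d\n",
           partial_writeback(pcm, line), LINE_WORDS);
    return 0;
}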
Conference Paper
We discuss novel multi-level write algorithms for phase change memory which produce highly optimized resistance distributions in a minimum number of program cycles. Using a novel integration scheme, a test array at 4 bits/cell and a 32 kb memory page at 2 bits/cell are experimentally demonstrated.
Conference Paper
Modern file systems associate the deletion of a file with the release of the storage associated with that file, and file writes with the irrevocable change of file contents. We propose that this model of file system behavior is a relic of the past, when disk storage was a scarce resource. We believe that the correct model should ensure that all user actions are revocable. Deleting a file should change only the name space and file writes should overwrite no old data. The file system, not the user, should control storage allocation using a combination of user specified policies and information gleaned from file-edit histories to determine which old versions of a file to retain and for how long. The paper presents the Elephant file system, which provides users with a new contract: Elephant will automatically retain all important versions of the users' files. Users name previous file versions by combining a traditional pathname with a time when the desired version of a file or directory existed. Elephant manages storage at the granularity of a file or groups of files using user-specified retention policies. This approach contrasts with checkpointing file systems such as Plan-9, AFS, and WAFL, which periodically generate efficient checkpoints of entire file systems and thus restrict retention to be guided by a single policy for all files within that file system. We also report on the Elephant prototype, which is implemented as a new Virtual File System in the FreeBSD kernel.
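To make the "pathname plus time" naming concrete, the sketch below parses a hypothetical "path@time" name and returns the newest retained version written at or before that time; the '@' separator and the version table are assumptions for illustration, not Elephant's actual syntax.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One retained version of a file: when it became current and where it lives. */
struct file_version {
    uint64_t mtime;        /* time the version became current */
    const char *location;  /* hypothetical block/inode handle  */
};

/* Return the newest retained version with mtime <= t.
 * 'versions' is sorted by mtime ascending. */
const struct file_version *resolve_at(const struct file_version *versions,
                                      int n, uint64_t t)
{
    const struct file_version *best = NULL;
    for (int i = 0; i < n && versions[i].mtime <= t; i++)
        best = &versions[i];
    return best;
}

int main(void)
{
    struct file_version v[] = {
        { 100, "blk#11" }, { 250, "blk#42" }, { 400, "blk#77" },
    };
    const char *name = "/home/a/report.txt@300";   /* hypothetical syntax */
    uint64_t t = strtoull(strchr(name, '@') + 1, NULL, 10);
    const struct file_version *fv = resolve_at(v, 3, t);
    printf("%s -> %s\n", name, fv ? fv->location : "(no version)");
    return 0;
}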
Article
[Garbled table-of-contents excerpt from a WAFL design document: Introduction; Introduction to Snapshots (User Access to Snapshots, Snapshot Administration); WAFL Implementation (Overview, Meta-Data Lives in Files, Tree of Blocks, Snapshots, File System Consistency and Non-Volatile RAM).]
S. Baek, K. Sun, J. Choi, E. Kim, D. Lee, and S.H. Noh, "Taking Advantage of Storage Class Memory Technology through System Software Support," Proc. Workshop on the Interaction between Operating Systems and Computer Architecture (WIOSCA '09), 2009.
B.C. Lee, E. Ipek, O. Mutlu, and D. Burger, "Phase Change Memory Architecture and the Quest for Scalability," Comm. ACM, vol. 53, no. 7, 2010.
SourceForge, "Protected and Persistent RAM Filesystem," http://pramfs.sourceforge.net, 2013.
T. Riegel, C. Fetzer, and P. Felber, "Snapshot Isolation for Software Transactional Memory," Proc. Workshop on Transactional Computing (TRANSACT), 2006.
D. Hitz, J. Lau, and M. Malcolm, "File System Design for an NFS File Server Appliance," Proc. USENIX Winter Technical Conference, 1994.