Heon Yeom
  • Doctor of Philosophy
  • Professor at Seoul National University

About

Publications: 349
Reads: 31,965
Citations: 2,506
Introduction: Systems Software
Current institution: Seoul National University
Current position: Professor

Publications (349)
Article
Full-text available
Recent advancements in memory technology have opened up a wealth of possibilities for innovation in data structures. The emergence of byte-addressable persistent memory (PM) with its impressive capacity and low latency has accelerated the adoption of PM in existing hashing-based indexes. As a result, several new hashing schemes utilizing emulators...
Article
Full-text available
Popular deep learning frameworks like PyTorch utilize GPUs heavily for training, and suffer from out-of-memory (OOM) problems if memory is not managed properly. CUDA Unified Memory (UM) allows the oversubscription of tensor objects in the GPU, but suffers from heavy performance penalties. In this paper, we build upon our UM implementation and creat...
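As background for this entry, the sketch below (not the paper's implementation) shows how CUDA Unified Memory permits allocating more memory than the GPU physically has; the allocation size, page-touch stride, and prefetch amount are arbitrary assumptions, and it presumes a UM-capable GPU, enough host RAM, and compilation with nvcc against the CUDA runtime.

```c
/* Hypothetical sketch: oversubscribing GPU memory with CUDA Unified Memory.
 * cudaMallocManaged lets one allocation exceed physical GPU memory; pages
 * migrate between host and device on demand, which is convenient but can be
 * slow under heavy oversubscription. */
#include <cuda_runtime.h>
#include <stdio.h>

int main(void) {
    size_t free_b = 0, total_b = 0;
    cudaMemGetInfo(&free_b, &total_b);

    /* Ask for 1.5x the device's physical memory (assumption: host RAM can back it). */
    size_t want = total_b + total_b / 2;
    float *buf = NULL;
    if (cudaMallocManaged((void **)&buf, want, cudaMemAttachGlobal) != cudaSuccess) {
        fprintf(stderr, "managed allocation failed\n");
        return 1;
    }

    /* CPU touches one word per 4 KiB page; pages stay host-resident for now. */
    for (size_t i = 0; i < want / sizeof(float); i += 4096 / sizeof(float))
        buf[i] = 1.0f;

    /* Hint the driver to migrate part of the buffer to device 0 ahead of use. */
    cudaMemPrefetchAsync(buf, free_b / 2, 0, 0);
    cudaDeviceSynchronize();

    cudaFree(buf);
    return 0;
}
```

The prefetch hint is one common way to soften the demand-paging penalty mentioned in the abstract.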
Article
Full-text available
Distributed supercomputing is becoming common in industry and academia. Most parallel computing researchers have focused on harnessing the power of commodity processors, and even Internet-connected computers, aggregating their computational power to solve computationally complex problems. Using flexible commodity cluster computers for supercomputi...
Article
Full-text available
The amount of data generated by scientific applications on high-performance computing systems is growing at an ever-increasing pace. Most of the generated data are transferred to storage in remote systems for various purposes such as backup, replication, or analysis. To detect data corruption caused by network or storage failures during data transf...
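As an illustration of the corruption-detection idea (not the paper's method), the sketch below hashes a file with OpenSSL's EVP interface so sender and receiver can compare SHA-256 digests after a bulk transfer; the helper name sha256_file is hypothetical, and it assumes OpenSSL 1.1+ linked with -lcrypto.

```c
#include <openssl/evp.h>
#include <stdio.h>

static int sha256_file(const char *path, unsigned char *out, unsigned int *outlen) {
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    EVP_MD_CTX *ctx = EVP_MD_CTX_new();
    EVP_DigestInit_ex(ctx, EVP_sha256(), NULL);
    unsigned char buf[1 << 16];
    size_t n;
    while ((n = fread(buf, 1, sizeof(buf), f)) > 0)
        EVP_DigestUpdate(ctx, buf, n);       /* stream the file through the digest */
    EVP_DigestFinal_ex(ctx, out, outlen);
    EVP_MD_CTX_free(ctx);
    fclose(f);
    return 0;
}

int main(int argc, char **argv) {
    if (argc != 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }
    unsigned char md[EVP_MAX_MD_SIZE];
    unsigned int len = 0;
    if (sha256_file(argv[1], md, &len) != 0) { perror("sha256_file"); return 1; }
    for (unsigned int i = 0; i < len; i++) printf("%02x", md[i]);
    printf("\n");
    return 0;
}
```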
Article
Full-text available
In file systems, a single write system call can make multiple modifications to data and metadata, but such changes are not flushed in an atomic way. To retain the consistency of file systems, conventional approaches guarantee crash consistency in exchange for sacrificing system performance. To mitigate the performance penalty, non-volatile memory (...
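Because a single write() is not crash-atomic across data and metadata, applications commonly work around the problem themselves; the sketch below shows the classic "write temp file, fsync, rename" pattern (the application-side counterpart of the file-system-level consistency discussed above, not the paper's mechanism). The helper name atomic_replace and the file names are hypothetical; POSIX APIs are assumed.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int atomic_replace(const char *dir, const char *tmp, const char *final,
                   const void *buf, size_t len) {
    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    if (write(fd, buf, len) != (ssize_t)len) { close(fd); return -1; }
    if (fsync(fd) < 0) { close(fd); return -1; }   /* push data and inode to storage */
    close(fd);
    if (rename(tmp, final) < 0) return -1;         /* atomically swap in the new name */
    int dfd = open(dir, O_RDONLY | O_DIRECTORY);   /* make the rename itself durable */
    if (dfd >= 0) { fsync(dfd); close(dfd); }
    return 0;
}

int main(void) {
    const char msg[] = "new configuration\n";
    return atomic_replace(".", "config.tmp", "config.txt", msg, sizeof(msg) - 1);
}
```

The extra fsync calls are exactly the performance cost that journaling and NVM-based approaches try to reduce.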
Article
Full-text available
The emerging computational storage drives (CSDs) provide new opportunities by moving data computation closer to the storage. Performing computation within storage drives enables data pre/post-processing without expensive data transfers. Moreover, large amounts of data can be processed in parallel thanks to the nature of the field-programmable gate...
Article
Monolithic applications span several knowledge areas. Optimizing their CPU or I/O requirements can be challenging because the problem itself is not trivial to identify and improve. There are many approaches to this situation, and a trending one is microservices. As a variant of the service-oriented...
Article
Full-text available
With in-memory databases (IMDBs), where all data sets reside in main memory for fast processing, logging and checkpointing are essential for achieving data persistence. IMDB logging has evolved to reduce run-time overhead, but this increases recovery time. The checkpointing technique compensates for these pr...
Article
Full-text available
The amount of data in modern computing workloads is growing rapidly. Meanwhile, the capacity of main memory is growing slowly; thus, memory management of operating systems plays an increasingly important role in application performance. Recent scientific applications process large amounts of data as well. They tend to manage intermediate data in an...
Article
As the size of data grows in modern applications, the efficient usage of limited resources is becoming crucial. To reorganize large data under memory limitations, many data-intensive applications utilize external sort as a critical component. A storage framework is especially needed since the entire dataset must be loaded and flushed a couple of ti...
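For reference, the sketch below is a textbook two-phase external merge sort of 32-bit integers, not the paper's storage framework: run generation sorts memory-sized chunks and writes them as temporary run files, and the merge phase streams the runs back in. RUN_CAPACITY, MAX_RUNS, and the file names are arbitrary assumptions.

```c
#include <stdio.h>
#include <stdlib.h>

#define RUN_CAPACITY (1 << 20)   /* ints kept in memory per run (assumption) */
#define MAX_RUNS     64          /* caps the number of runs for brevity */

static int cmp_int(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Phase 1: split the input into sorted runs that each fit in memory. */
static int make_runs(FILE *in, char names[][32]) {
    int *buf = malloc(RUN_CAPACITY * sizeof(int));
    int nruns = 0;
    size_t n;
    while (nruns < MAX_RUNS && (n = fread(buf, sizeof(int), RUN_CAPACITY, in)) > 0) {
        qsort(buf, n, sizeof(int), cmp_int);
        snprintf(names[nruns], 32, "run%03d.bin", nruns);
        FILE *out = fopen(names[nruns], "wb");
        fwrite(buf, sizeof(int), n, out);
        fclose(out);
        nruns++;
    }
    free(buf);
    return nruns;
}

/* Phase 2: k-way merge of the runs (linear scan of run heads; a heap scales better). */
static void merge_runs(char names[][32], int nruns, FILE *out) {
    FILE *runs[MAX_RUNS];
    int head[MAX_RUNS], live[MAX_RUNS];
    for (int i = 0; i < nruns; i++) {
        runs[i] = fopen(names[i], "rb");
        live[i] = fread(&head[i], sizeof(int), 1, runs[i]) == 1;
    }
    for (;;) {
        int min = -1;
        for (int i = 0; i < nruns; i++)
            if (live[i] && (min < 0 || head[i] < head[min]))
                min = i;
        if (min < 0) break;
        fwrite(&head[min], sizeof(int), 1, out);
        live[min] = fread(&head[min], sizeof(int), 1, runs[min]) == 1;
    }
    for (int i = 0; i < nruns; i++) fclose(runs[i]);
}

int main(int argc, char **argv) {
    if (argc != 3) { fprintf(stderr, "usage: %s <in.bin> <out.bin>\n", argv[0]); return 1; }
    FILE *in = fopen(argv[1], "rb"), *out = fopen(argv[2], "wb");
    if (!in || !out) { perror("fopen"); return 1; }
    char names[MAX_RUNS][32];
    merge_runs(names, make_runs(in, names), out);
    fclose(in);
    fclose(out);
    return 0;
}
```

The repeated load/flush of the whole dataset in both phases is why external sort stresses the storage stack.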
Article
Edge computing focuses on processing near the source of the data. Edge computing devices using the Tegra SoC architecture provide a physically distinct GPU memory architecture. In order to take advantage of this architecture, different modes of memory allocation need to be considered. Different GPU memory allocation techniques yield different resul...
Article
Full-text available
Distributed ledger technology faces scalability problems due to long commit times, despite recent successes in cryptocurrency. Small-group consensus studies have improved the scalability of distributed ledgers. However, they still have problems in the consensus process itself. For example, most blockchain systems perform serialized block proposal...
Article
Full-text available
The low capacity of main memory has become a critical issue for system performance. Several memory schemes utilizing multiple classes of memory devices are used to mitigate the problem, hiding the small capacity by placing data in the proper memory device based on the hotness of the data. Memory tracers can provide such hotness information, b...
Article
Full-text available
High-performance storage devices, such as Non-Volatile Memory express Solid-State Drives (NVMe SSDs), have been widely adopted in data centers. In particular, multiple storage devices provide higher I/O performance than a single device. However, performance can be reduced in the case of workloads with mixed read and write requests (e.g.,...
Article
Full-text available
RDMA is increasingly becoming popular not only in HPC but also in data centers where high throughput and low latency are critical requirements. RDMA supports several types of transports, each of which has different characteristics, so that users can choose the right one to meet their requirements. Reliable connected (RC) transport has advantages on...
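As a small illustration of transport selection with libibverbs (a generic verbs example, not the paper's code), the sketch below creates a queue pair and picks the transport via qp_type; queue depths are arbitrary assumptions, and it assumes an RDMA-capable NIC and linking with -libverbs.

```c
#include <infiniband/verbs.h>
#include <stdio.h>

int main(void) {
    int n = 0;
    struct ibv_device **devs = ibv_get_device_list(&n);
    if (!devs || n == 0) { fprintf(stderr, "no RDMA devices\n"); return 1; }
    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_pd *pd = ibv_alloc_pd(ctx);
    struct ibv_cq *cq = ibv_create_cq(ctx, 64, NULL, NULL, 0);
    if (!ctx || !pd || !cq) { fprintf(stderr, "verbs setup failed\n"); return 1; }

    struct ibv_qp_init_attr attr = {0};
    attr.send_cq = cq;
    attr.recv_cq = cq;
    attr.qp_type = IBV_QPT_RC;   /* reliable connected; IBV_QPT_UD selects unreliable datagram */
    attr.cap.max_send_wr = 64;
    attr.cap.max_recv_wr = 64;
    attr.cap.max_send_sge = 1;
    attr.cap.max_recv_sge = 1;
    struct ibv_qp *qp = ibv_create_qp(pd, &attr);
    if (!qp) { fprintf(stderr, "ibv_create_qp failed\n"); return 1; }
    printf("created an RC queue pair\n");

    ibv_destroy_qp(qp);
    ibv_destroy_cq(cq);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}
```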
Article
Most modern multi-core processors provide a shared last level cache (LLC) where data from all cores are placed to improve performance. However, this opens a new challenge for cache management, owing to cache pollution. With cache pollution, data with weak temporal locality can evict other data with strong temporal locality when both are mapped into...
Article
Full-text available
Solid-state drives (SSDs) have accelerated the architectural evolution of storage systems through several characteristics (e.g., out-of-place updates) that distinguish them from hard disk drives (HDDs). The out-of-place updates of SSDs can naturally support the transaction mechanisms commonly used in systems to provide crash consistency. Thus, transactional functiona...
Conference Paper
Modern workloads tend to have huge working sets and low locality. Despite this trend, the capacity of DRAM has not been increased enough to accommodate such huge working sets. Therefore, memory management mechanisms optimized for such modern workloads are widely required today. For such optimizations, knowing the data access pattern of given worklo...
Chapter
Full-text available
In this paper, we describe a hybrid MPI implementation of a discontinuous Galerkin scheme in Computational Fluid Dynamics which can utilize all the available processing units (CPU cores or GPU devices) on each computational node. We describe the optimization techniques used in our GPU implementation making it up to 74.88x faster than the single cor...
Article
Full-text available
Non-volatile random access memory (NVRAM) is a promising approach to persistent data storage with outstanding advantages over traditional storage devices, such as hard disk drives (HDDs) and solid state drives (SSDs). Some of its biggest advantages are its DRAM-like read latency and microsecond-level write latency, which are several hundred times f...
Conference Paper
Today, distributed supercomputing is a buzzword for many organizations, and most parallel computing researchers focus on harnessing the power of commodity processors, and even Internet-connected computers, aggregating their computational power to solve computationally complex problems. The cost-effective and scalable nature has benefited researcher...
Article
Full-text available
As modern computer systems face the challenge of large data, filesystems have to deal with a large number of files. This amplifies concerns about metadata operations as well as data operations. Most filesystems manage file metadata by constructing in-memory data structures, such as the directory entry (dentry) and inode. We found inefficienci...
Article
As flash-based solid-state drive (SSD) becomes more prevalent because of the rapid fall in price and the significant increase in capacity, customers expect better data services than disk-based systems. However, higher performance and new characteristics of flash require a rethinking of data services. For example, backup and recovery is an important...
Article
The importance of physically contiguous memory has increased in modern computing environments, including both low- and high-end systems. Existing physically contiguous memory allocators generally have critical limitations. For example, the most commonly adopted solution, the memory reservation technique, wastes a significant amount of memory space....
Article
Cloud-based databases often decouple database instances from physical storage to provide reliability and high availability to users. This design can robustly handle a single point of failure but needs substantial effort to attain good performance. In this paper, we analyze the decoupled architecture and present important optimization issues that we...
Conference Paper
Journaling file systems provide crash-consistency to applications by keeping track of uncommitted changes in the journal area (journaling) and writing committed changes to their original area at a certain point (checkpointing). They generally use coarse-grained locking to access shared data structures and perform I/O operations by a single thread....
Conference Paper
As modern computer systems face the challenge of managing large data, filesystems must deal with a large number of files. This amplifies concerns about metadata and data operations. Filesystems in Linux manage file metadata by constructing in-memory structures such as the directory entry (dentry) and inode. However, we found inefficiencie...
Article
Full-text available
A key-value store is an essential component that is increasingly demanded in many scale-out environments, including social networks, online retail environments, and cloud services. Modern key-value storage engines provide many features, including transaction, versioning, and replication. In storage engines, transaction processing provides atomicity...
Article
Full-text available
Modern storage systems are facing an important challenge of making the best use of fast storage devices. Even though the underlying storage devices are being enhanced, the traditional storage stack falls short of utilizing the enhanced characteristics, as it has been optimized specifically for hard disk drives. In this article, we optimize the stor...
Conference Paper
Emerging next-generation non-volatile memory (NVM) technologies, including PCM and STT-MRAM, provide low latency, high bandwidth, non-volatility, and high capacity. Recently, NVM has drawn much attention from the database community to improve transaction processing (e.g., write-ahead logging). As a complement to existing work, we investigate NVM fo...
Article
Fast non-volatile memory (NVM) technologies (e.g., phase change memory, spin-transfer torque memory, and MRAM) provide high performance to legacy storage systems. These NVM technologies have attractive features, such as low latency and high throughput to satisfy application performance. Accordingly, fast storage devices based on fast NVM lead to a...
Article
Full-text available
Emerging non-volatile memory (NVM) technology with high throughput and scalability has considerable attraction in cloud and enterprise storage systems. The industry and academic communities created the NVMe specification to elicit the highest performance from NVM devices. While the technology is commercially viable, it is important to consider the perfo...
Conference Paper
Full-text available
In this paper, we present a flash solid-state drive (SSD) optimization that provides hints of SSD internal behaviors, such as device I/O time and buffer activities, to the OS in order to mitigate the impact of I/O completion scheduling delays. The hints enable the OS to make reliable latency predictions for each I/O request so that the OS can make...
Article
Full-text available
Purpose: The current study investigates the feasibility of a platform for a nationwide dose monitoring system for dental radiography. The essential elements for an unerring system are also assessed. Materials and methods: An intraoral radiographic machine with 14 X-ray generators and five sensors, 45 panoramic radiographic machines, and 23 cone-...
Article
In modern operating systems, memory-mapped I/O (mmio) is an important access method that maps a file or file-like resource to a region of memory. The mapping allows applications to access data in files through memory semantics (i.e., load/store), and it provides ease of programming. The number of applications that use mmio is increasing because m...
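A minimal sketch of the access method described above, assuming a POSIX system: the file is mapped once with mmap and then read through plain loads rather than read() calls; the line-counting use case is just an illustrative example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv) {
    if (argc != 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }
    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Map the file; pages are faulted in on demand as they are touched. */
    char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    long newlines = 0;
    for (off_t i = 0; i < st.st_size; i++)   /* load semantics: no read() system calls */
        if (p[i] == '\n') newlines++;
    printf("%ld lines\n", newlines);

    munmap(p, st.st_size);
    close(fd);
    return 0;
}
```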
Article
While demand for physically contiguous memory allocation is still alive, especially in embedded systems, existing solutions are insufficient. The most widely adopted solution is the reservation technique. Though it serves allocation well, it can severely degrade memory utilization. There are hardware solutions such as scatter/gather DMA and the IOMMU. However, cost...
Article
In the current era of big data and cloud computing, the amount of data utilized is increasing, and various systems to process this big data rapidly are being developed. A distributed file system is often used to store the data, and GlusterFS is one of the popular distributed file systems. As computer technology has advanced, NAND flash SSDs (Solid Stat...
Article
Cloud engineering leverages innovations from a diverse spectrum of disciplines, from computer science and engineering to business informatics, toward the holistic treatment of key technical and business issues related to clouds.
Article
Lately, the use of fast storage devices has been rapidly increasing in social network services, cloud platforms, etc. Unfortunately, the traditional Linux I/O stack is designed to maximize performance on disk-based storage. Emerging byte-addressable and low-latency non-volatile memory technologies (e.g., phase-change memories, MRAMs, and the memristor) provide ver...
Conference Paper
Emerging high-performance storage devices have attractive features such as low latency and high throughput. This leads to a rapid increase in the demand for fast storage devices in cloud platforms, social network services, etc. However, there are few block-based file systems that are capable of utilizing superior characteristics of fast storage dev...
Article
Current key-value stores rely on DRAM-based in-memory architectures, where scalability is limited by the high power consumption and low density of DRAM. As an alternative, flash SSDs have been explored because of their merits of low power, high density, and high internal parallelism. However, the unpredictable latency caused by SSD internal resource conflicts challenges...
Article
The era of multi-core processors began once the limit of clock speed was reached. These days, multi-core technology is used not only in desktops, servers, and tablet PCs, but also in smartphones. In this architecture, there is always interference between processes because of shared system resources. To address this problem, ca...
Conference Paper
Memory bandwidth is a major resource shared among all CPU cores. The growth of memory bandwidth cannot catch up with the increasing number of CPU cores, so contention for memory bandwidth arises among concurrently executing tasks. In this paper, we present the Bubble Task method, which mitigates memory co...
Conference Paper
In this work, we optimize the highly variable latency of flash SSDs by presenting a host side storage engine, which is capable of cooperating with the SSDs, and augmented by the redundancy of multiple SSD instances. The storage engine schedules I/O and SSD internal operations to data blocks replicated among multiple SSDs exploiting the support of a...
Conference Paper
Lately, the use of fast storage devices has been rapidly increasing in social network services, cloud platforms, etc. Unfortunately, the traditional Linux I/O stack is designed to maximize performance on disk-based storage. Emerging byte-addressable and low-latency non-volatile memory (NVM) technologies (e.g., Phase-change memories, spin-transfer torque MRAMs, and...
Article
Storage I/O in virtual machine (VM) environments, which requires low latency, becomes problematic as fast storage such as solid-state drives (SSDs) comes into use. The low performance in the VM environment is caused by 1) the presence of additional software layers such as the guest OS, 2) context switching between the VM and the host OS, and 3)...
Patent
Full-text available
A data synchronization system is provided. In the data synchronization system, a synchronization message transmitting party transmits a synchronization message with meta information to a synchronization message receiving party, and the synchronization message receiving party interprets and stores the meta information included in the synchronization...
Article
Recently, distributed computing has begun to use both the CPU (central processing unit) and the GPU (graphics processing unit) to improve performance and to overcome the dark-silicon problem, in which not all transistors can be used because of power limitations. In integrated graphics processors, the CPU and GPU share memory and the last level...
Article
An electronic commerce system should provide prompt responses to user requests and continuous service despite partial failures of its subsystems. In addition, scalability and flexibility should be supported as the number of customers increases. Because electronic commerce reaches beyond national borders, the system should be implemented across di...
Article
This paper presents a mathematical model to evaluate the performance of grid resources when the availability of the resources is taken into account. The proposed model uses continuous-time Markov chains (CTMCs) to model the failure-repair behavior of a grid resource. In a grid computing environment, a resource may not only fail during task execution, but...
Article
Fast storage devices are an emerging solution to satisfy data-intensive applications. They provide high transaction rates for DBMS, low response times for Web servers, instant on-demand paging for applications with large memory footprints, and many similar advantages for performance-hungry applications. In spite of the benefits promised by fast har...
Conference Paper
Full-text available
In this paper, we present OS I/O path optimizations for NAND flash solid-state drives, aimed to minimize scheduling delays caused by additional contexts such as interrupt bottom halves and background queue runs. With our optimizations, these contexts are eliminated and merged into hardware interrupts or I/O participating threads without introducing...
Conference Paper
Today's large data centers are becoming cautious about their massive energy consumption. A range of efforts has been made to reduce energy consumption from the hardware to the software level. Meanwhile, advances in technology and engineering have made flash-based drives an affordable, higher-capacity storage option. Though this class of storage drive basi...
