Sungjin Lee

Massachusetts Institute of Technology | MIT · Computer Science and Artificial Intelligence Laboratory

PhD

About

42 Publications
9,150 Reads
1,088 Citations
Citations since 2017
14 Research Items
632 Citations
Additional affiliations
October 2013 - present
Massachusetts Institute of Technology
Position
  • PostDoc Position
September 2007 - August 2013
Seoul National University
Position
  • PhD Student


Publications (42)
Article
Full-text available
Complex data queries, because of their need for random accesses, have proven to be slow unless all the data can be accommodated in DRAM. There are many domains, such as genomics, geological data, and daily Twitter feeds, where the datasets of interest are 5TB to 20TB. For such a dataset, one would need a cluster with 100 servers, each with 128GB to...
Conference Paper
Full-text available
NAND flash-based solid-state drives (SSDs) are increasingly popular in enterprise server systems because of their advantages over hard disk drives such as higher performance and lower power consumption. However, the limited and unpredictable lifetime of SSDs remains a serious obstacle to wider adoption of SSDs in enterprise systems. In this p...
Article
Full-text available
The multi-level cell (MLC) NAND flash memory technology enables multiple bits of information to be stored on a single cell, thus making it possible to increase the density of the memory without increasing the die size. For most MLC flash memories, each cell can be programmed as a single-level cell or a multi-level cell during runtime. Therefore, it...
Article
Full-text available
As flash memory technologies quickly improve, NAND flash memory-based storage devices are becoming a viable alternative as a secondary storage solution for general-purpose computing systems such as personal computers and enterprise server systems. Most existing flash translation layer (FTL) schemes are, however, ill-suited for such systems because...
Article
Recent ultra-large SSDs (e.g., a 32-TB SSD) provide many benefits in building cost-efficient enterprise storage systems. Owing to their large capacity, however, when such SSDs fail in a RAID storage system, a long rebuild overhead is inevitable for RAID reconstruction that requires a huge number of data copies among SSDs. Motivated by modern SSD fa...
Article
As ransomware attacks have become prevalent, it has become crucial to build anti-ransomware solutions that defend against them. In this paper, we propose a new SSD-assisted ransomware defense system, called SSD-Insider++, which prevents users' files from being damaged by ransomware attacks. SSD-Insider++ is composed of two novel feature...
Conference Paper
When data deduplication is used for extending the SSD lifetime inside an SSD, one of the key performance factors is how to manage the fingerprint cache. Since the size of the fingerprint cache is limited, the fingerprint cache should be very selective in choosing which fingerprints should be stored in the cache. In this paper, we show that write pr...
Conference Paper
We present a low-overhead ransomware-proof SSD, called RansomBlocker (RBlocker). RBlocker provides full protection against all possible ransomware attacks by delaying every data deletion until the absence of an attack is guaranteed. To reduce storage overheads of the delayed deletion, RBlocker employs a time-out based backup policy. Based on the fact that r...
Article
Mobile devices, such as smartphones, have become a necessity in our daily life. However, users may notice that after being used for a long time, mobile devices begin to experience sluggish response. Based on an empirical study on a set of aged mobile devices, we identified that file fragmentation is among the key factors that contribute to the prog...
Conference Paper
Data-center applications running on distributed databases often suffer from unexpectedly high response time fluctuation which is caused by long tail latency. In this paper, we find that long tail latency of user writes is mainly created by the interference with garbage collection (GC) tasks running in various system layers. In order to address the...
Conference Paper
In this paper, we analyze and optimize I/O latency of a petabyte scale, high performance all-flash array (AFA) system based on NVMe SSDs. A flash-based SSD itself shows relatively low and consistent latency but, in AFA systems where several tens or hundreds of SSDs are combined in a single host machine, applications often see higher and more diverg...
Conference Paper
Modern mobile systems are designed to run multiple apps simultaneously to provide a better experience for end users. In such a multi-tasking environment, a foreground app that a user is actually interacting with is often delayed by background ones, which results in significant degradation of user-perceived response time and user experience. Based o...
Conference Paper
Recent NAND flash devices have large page sizes. Although large pages are useful in increasing the flash capacity, they can degrade both the performance and lifetime of flash storage systems when small writes are dominant. We propose a new NAND programming scheme, called erase-free sub-page programming (ESP), which allows the same page to be progra...
Article
A key-value store (KVS), such as memcached and Redis, is widely used as a caching layer to augment the slower persistent backend storage in data centers. DRAM-based KVS provides fast key-value access, but its scalability is limited by the cost, power and space needed by the machine cluster to support a large amount of DRAM. This paper offers a 10X...
Article
The decreasing lifetime of NAND flash memory, as a side effect of recent advanced semiconductor process scaling, is emerging as one of the major barriers to the wide adoption of SSDs in high-performance computing systems. In this paper, we propose Dynamic Erase Voltage and Time Scaling (DeVTS), an integrated approach to extend the lifetime (particularl...
Conference Paper
For SSD-based RAID systems, the Diff-RAID technique has been proposed to reduce the probability of correlated multiple failures among SSDs by differentiating the amount of written data to each SSD. Although Diff-RAID works well for workloads with many small random writes (which require frequent parity updates), it does not perform well with recent...
Article
Complex data queries, because of their need for random accesses, have proven to be slow unless all the data can be accommodated in DRAM. There are many domains, such as genomics, geological data, and daily Twitter feeds, where the datasets of interest are 5TB to 20TB. For such a dataset, one would need a cluster with 100 servers, each with 128GB to...
Conference Paper
We propose a new system-level solution that improves both the performance and lifetime of NAND storage systems by exploiting the performance asymmetry of NAND devices. At the device level, we propose a new program sequence, called relaxed program sequence (RPS), which allows more flexible page allocations in a block without compromising NAND reliab...
Article
NAND flash-based Solid-State Drives (SSDs) are becoming a viable alternative as a secondary storage solution for many computing systems. Since the physical characteristics of NAND flash memory are different from conventional Hard-Disk Drives (HDDs), flash-based SSDs usually employ an intermediate software layer, called a Flash Translation Layer (FT...
Article
Complex data queries, because of their need for random accesses, have proven to be slow unless all the data can be accommodated in DRAM. There are many domains, such as genomics, geological data, and daily Twitter feeds, where the datasets of interest are 5TB to 20TB. For such a dataset, one would need a cluster with 100 servers, each with 128GB to...
Conference Paper
Full-text available
For NAND flash-based storage systems, managing garbage collection (GC) efficiently is a critical requirement to achieve both high performance and long lifetimes. In this paper, we propose a just-in-time GC technique, called JIT-GC, which invokes background GC operations only when necessary depending on future write demands. JIT-GC was motivated by...
Article
Full-text available
Flash storage devices behave quite differently from hard disk drives (HDDs); a page on flash has to be erased before it can be rewritten, and the erasure has to be performed on a block which consists of a large number of contiguous pages. It is also important to distribute writes evenly among flash blocks to avoid premature wearing. To achieve inte...
Patent
Full-text available
A method of controlling a storage device, the method including calculating, in a controller of the storage device, data throughput of the storage device in a current period, comparing, in the controller, the data throughput to a reference value and adjusting, with the controller, an operation performance of the storage device in a next period based...
Article
Full-text available
The multi-level cell (MLC) NAND flash memory technology enables multiple bits of information to be stored in a memory cell, thus making it possible to increase the density of flash memory without increasing the die size. In MLC NAND flash memory, each memory cell can be programmed as a single-level cell or a multi-level cell at runtime because of i...
Conference Paper
Full-text available
The cost-per-bit of NAND flash memory has been continuously improved by semiconductor process scaling and multi-leveling technologies (e.g., a 10 nm-node TLC device). However, the decreasing lifetime of NAND flash memory as a side effect of recent advanced technologies is regarded as a main barrier to the wide adoption of NAND flash-based storage sy...
Article
Full-text available
NAND flash-based solid-state drives (SSDs) are increasingly popular in enterprise server systems because of their advantages over hard disk drives such as higher performance and lower power consumption. However, the decreasing write endurance and the unpredictable lifetime remain a serious obstacle to their wider adoption in enterprise syste...
Article
Full-text available
NAND flash-based storage devices are becoming a viable storage solution for mobile and desktop systems. Because of their erase-before-write nature, flash-based storage devices require garbage collection that causes significant performance degradation, incurring a large number of page migrations and block erasures. To improve I/O performance, therefore,...
Conference Paper
Full-text available
We propose a new approach, called dynamic program and erase scaling (DPES), for improving the endurance of NAND flash memory. The DPES approach is based on our key finding that the NAND endurance is dependent on the erase voltage as well as the number of P/E cycles. Since the NAND endurance has a near-linear dependence on the erase voltage, lowerin...
Conference Paper
Full-text available
We propose an efficient software-based out-of-order scheduling technique, called SOS, for high-performance NAND flash-based SSDs. Unlike an existing hardware-based out-of-order technique, our proposed software-based solution, SOS, can make more efficient out-of-order scheduling decisions by exploiting various mapping information and I/O access char...
Conference Paper
Full-text available
As the semiconductor process is scaled down, the endurance of NAND flash memory greatly deteriorates. To overcome such a poor endurance characteristic and to provide a reasonable storage lifetime, system-level endurance enhancement techniques are being rapidly adopted in recent NAND flash-based storage devices like solid-state drives (SSDs). In this pape...
Conference Paper
Full-text available
As the cell size of NAND flash memory shrinks, its physical characteristics such as performance and lifetime are significantly degraded. As effective solutions for overcoming such poor physical characteristics, more cross-layer system-level approaches (such as compression and deduplication techniques) are expected to be developed. These system-...
Article
Full-text available
The performance and lifetime of high-performance solid-state drives (SSDs) can be improved by data compression, which can reduce the amount of data physically transferred from/to flash memory. In this paper, we present our experience of building a high-performance solid-state drive using a hardware-accelerated compression module called BlueZIP. In o...
Conference Paper
Full-text available
NAND flash memory is commonly known as a power-efficient storage medium. Because of the increasing complexity of flash-based storage devices, however, it is more difficult to achieve good power-efficiency without considering an energy-efficient storage device design. In this paper, we investigate the potential benefit of dynamic voltage/frequency s...
Article
Full-text available
In this paper we describe BlueSSD, an open platform for exploring hardware and software for NAND flash-based SSD architectures. We introduce the overall architecture of BlueSSD from a hardware and software perspective and briefly explain our design methodology. Preliminary evaluation shows that BlueSSD delivers performance comparable to commerciall...
Article
Full-text available
With continuing improvements in both the price and the capacity, flash memory-based storage devices are becoming a viable solution for satisfying high-performance storage demands of desktop systems as well as mobile embedded systems. Because of the erase-before-write characteristic of flash memory, a flash memory-based storage system requires a...
Article
Full-text available
A hybrid hard disk employs the advantages of both a hard disk and a NAND flash memory, thus making it a cost-effective fast secondary storage device. In this paper, we improve its I/O performance by combining an intelligent data pinning policy for the flash memory with a caching technique which is aware of access patterns for the flash memory and D...



Projects (2)