DSS: High I/O Bandwidth Disaggregated Object
Storage System for AI Applications
Mahsa Bayati
Memory Solution Lab
Samsung Semiconductor Inc.
San Jose, CA
mahsa.b@samsung.com
Harsh Roogi
Memory Solution Lab
Samsung Semiconductor Inc.
San Jose, CA
h.roogi@samsung.com
Somnath Roy
Memory Solution Lab
Samsung Semiconductor Inc.
San Jose, CA
som.roy@samsung.com
Ron Lee
Memory Solution Lab
Samsung Semiconductor Inc.
San Jose, CA
r2.lee@samsung.com
Abstract—With the exponential growth of data, especially in data-intensive applications such as deep learning and AI, the demand for high-bandwidth, highly scalable storage that supports unstructured data is increasing. To fulfill this need, Samsung proposes the DSS storage solution, which implements an object key-value API on top of NVMe over Fabrics (NVMe-oF) SSD hardware. The new solution retains the attributes of NVMe-oF and is designed explicitly for storing data in object format. Support for remote storage-access protocols (i.e., RDMA) facilitates the disaggregation of storage and computational resources, so storage can be scaled easily. Beyond object storage and scalability, the proposed architecture can provision the bandwidth demanded by each application running on each client server. This paper introduces the DSS storage system, which delivers high bandwidth per unit of capacity for object-format data and scales effortlessly. DSS deterministically provisions bandwidth to client sessions to mitigate contention and starvation, making the design well suited to large, concurrent, multi-session, read-intensive workloads such as AI training. Our design achieves throughput of 180-275 GB/sec for reads and 26-38 GB/sec for writes when evaluated with the S3 benchmark.
Index Terms—Disaggregated storage, object storage, high I/O bandwidth, AI data-intensive applications
I. INTRODUCTION & MOTIVATION
The fast-growing volume of data must be stored on and retrieved from advanced storage devices that can sustain high I/O bandwidth, meet users' desired quality of service, and scale resources easily. Some leading storage companies claim read I/O bandwidth of around 24 GB/sec per node with a maximum capacity of 256 TB across four nodes. The Samsung DSS solution delivers similar bandwidth and scales up conveniently, supporting more than 256 TB with only two nodes.
Moreover, an increasingly large portion of generated data is in object format. Current storage systems use block SSD devices, so there is overhead in converting objects to block format. Samsung recently developed a new key-value API for NVMe-oF devices that stores object-format data directly on the disk, without operating-system involvement in block/object conversion. This key-value design is built on NVMe-oF SSDs, enabling access to storage devices over the fabric in a disaggregated fashion. A disaggregated system scales more easily, by adding resources to either the storage or the computational nodes. Applications such as AI and deep learning employ unstructured data and place aggressive demands on storage; our disaggregated storage is thus an excellent match.
One of the main concerns for a storage server that serves multiple client application sessions simultaneously (e.g., AI training sessions accessing the storage concurrently) is bandwidth inconsistency and congestion. This usually happens when one client session aggressively seizes the bandwidth or when many client sessions compete with one another, making quality of service and run time unpredictable. DSS can deterministically provide each client session with its required quality of service, along with a prediction of its completion time.

Fig. 1. (a) DSS architecture overview (client/object storage servers). (b) DSS back-end and front-end components and the network design.
Accordingly, our storage server provides a congestion-free access design for object-format storage with effortless scalability. The rest of this paper is organized as follows: Sec. II discusses the DSS architectural design, Sec. III presents the results of our system evaluation, and Sec. IV concludes and outlines future work.
II. ARCHITECTURAL DESIGN
DSS has three essential components: (I) object storage servers, (II) client servers, and (III) the network connecting these clients and storage servers together (see Fig. 1 (a)).
A. Architectural Components
1) Storage Server: Our storage servers can be divided into two sections: (I) the front-end and (II) the back-end, i.e., the target. In this paper, we co-locate both components on one server, which reduces cost and simplifies the network and communication setup. As shown in Fig. 1 (b), the front-end mainly comprises modified MinIO [1] software, a well-known Amazon S3-compatible open-source object store that here uses the KV API for data-store access. We have modified stock MinIO to run in a distributed, shared-everything key-value environment for improved scaling and performance. MinIO is responsible for data consistency through erasure coding, which tolerates faulty drives and random bit flips.
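To illustrate the idea behind erasure coding, the toy sketch below shows single-parity (XOR) recovery of a lost shard. This is a deliberately simplified stand-in: MinIO actually uses Reed-Solomon coding across drives, which tolerates multiple failures, and none of the names below come from the DSS or MinIO code bases.

    # Toy XOR-parity sketch (RAID-5 style), illustrative only; MinIO's
    # real erasure coding is Reed-Solomon across data and parity drives.
    def xor_bytes(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    shards = [b"\x01" * 4, b"\x02" * 4, b"\x04" * 4]  # data shards
    parity = b"\x00" * 4
    for s in shards:                  # parity = XOR of all data shards
        parity = xor_bytes(parity, s)

    lost = 1                          # simulate losing shard 1
    rebuilt = parity
    for i, s in enumerate(shards):
        if i != lost:                 # XOR of survivors plus parity
            rebuilt = xor_bytes(rebuilt, s)
    assert rebuilt == shards[lost]    # lost shard reconstructed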
The NVMe-oF target stack is located on the back-end and provides high-performance key-value services over RDMA and IP-based networks. The target application software is designed to run in user mode; it abstracts the SSD devices and performs the key-value operations. To offer a storage-pool abstraction in distributed storage, we employ two components: (I) the KV pool, which can be mapped to one or more SSD devices, and (II) the subsystem, which provides the object-storage abstraction required to pool many SSDs behind one or more namespace(s)/container(s) with aggregated performance. Each subsystem can be exported to a client application as a single namespace or container.
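As a rough illustration of this layering, the sketch below describes how drives might be grouped into KV pools and exported as subsystems. The field names, NQNs, and addresses are hypothetical assumptions made for illustration; the paper does not specify the actual DSS target configuration format.

    # Hypothetical illustration of the KV pool / subsystem layering; all
    # names, NQNs, and addresses are invented for this sketch.
    target_config = {
        "kv_pools": {
            # Each KV pool maps to one or more physical KV-SSDs.
            "pool0": {"devices": ["nvme0n1", "nvme1n1", "nvme2n1", "nvme3n1"]},
            "pool1": {"devices": ["nvme4n1", "nvme5n1", "nvme6n1", "nvme7n1"]},
        },
        "subsystems": {
            # Each subsystem pools the SSDs of a KV pool behind a namespace
            # and is exported to a client over an RDMA (RoCEv2) listener.
            "nqn.2021-01.com.example:subsys0": {
                "kv_pool": "pool0", "transport": "rdma",
                "listen": "192.168.30.10:4420",
            },
            "nqn.2021-01.com.example:subsys1": {
                "kv_pool": "pool1", "transport": "rdma",
                "listen": "192.168.31.10:4420",
            },
        },
    }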
Another essential component of the back-end is the UFM (Unified Fabric Manager). The UFM is a lightweight ecosystem software that manages Samsung devices, as shown in Fig. 2. It can manage any topology, architecture, and storage (in this paper, KV-SSDs). In our architecture, the UFM mainly manages the fabric; it discovers, monitors, and configures devices and networks, and it collects logs and statistics to ensure the cluster is working properly.
Fig. 2. Unified Fabric Manager architecture.
2) Network Setup: DSS supports multiple high-speed Ethernet ports; each storage server has four dual-port NICs. The network software stack therefore supports two different protocols. (I) TCP/IP: the front-end VLANs are set up for interaction between the clients and the storage servers. Clients operate on object data by sending GET/PUT/LIST/DEL requests to MinIO over these VLANs, which carry the S3 traffic (see Fig. 1 (b), VLANs 40:43). (II) RoCEv2: the back-end VLANs are reserved for the targets' RDMA access to the subsystems; clients do not interact with this network and instead send their requests indirectly through MinIO. VLANs 30:34 direct the object traffic coming from the targets. The targets listen on the RDMA ports and manage access to the objects on each drive.
3) Client Servers: Client servers are responsible for running applications and requesting data from storage. To facilitate access to our object storage, we developed the DSS Client Library, which provides APIs that act as a medium between the client and DSS storage. The client library is responsible for loading requested data from storage and for distributing data among the DSS storage servers. It accesses the storage servers and performs the actual S3 operations, such as PUT/GET/DEL/LIST. It takes a cluster configuration containing a list of endpoints as input and maximizes performance by load balancing and distributing user requests across those endpoints.
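The paper does not expose the DSS Client Library's API, so the sketch below only approximates the endpoint load-balancing idea using the generic boto3 S3 client: requests are spread round-robin over a list of MinIO endpoints. The endpoints and credentials are placeholders.

    import itertools
    import boto3

    # Placeholder endpoints/credentials; in DSS these would come from the
    # cluster configuration handed to the client library.
    ENDPOINTS = ["http://dss-node1:9000", "http://dss-node2:9000"]
    clients = [
        boto3.client("s3", endpoint_url=ep,
                     aws_access_key_id="minio",
                     aws_secret_access_key="minio123")
        for ep in ENDPOINTS
    ]
    rr = itertools.cycle(clients)  # round-robin over endpoints

    def put_object(bucket: str, key: str, data: bytes) -> None:
        next(rr).put_object(Bucket=bucket, Key=key, Body=data)

    def get_object(bucket: str, key: str) -> bytes:
        return next(rr).get_object(Bucket=bucket, Key=key)["Body"].read()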
III. EXPERIMENTS & RESULTS
Experimental Setup: We evaluate the DSS architecture using 10 homogeneous storage servers and 16 client servers. Table I lists the hardware specifications of the storage and client servers.
TABLE I
HARDWARE SPECIFICATION

                      Storage Server          Client
CPU Type              AMD EPYC 7742           Dell R740xd
CPU Speed             3.4 GHz                 2.6 GHz
Num of Cores          64                      24
OS                    CentOS Linux            CentOS
NIC                   4x dual-port 200 GbE    2x 100 GbE
Storage Node SSD      PM1733 (16x) 4 TB       N/A
Results: To measure the performance of the DSS system, we ran the S3 benchmark [2] over 30 TB of data with a 1 MB object size. We measured throughput with and without erasure coding (EC). As shown in Table II, our DSS storage cluster achieves around 180 to 275 GB/sec for GET and 26 to 38 GB/sec for PUT operations.
TABLE II
DSS ARCHITECTURE PERFORMANCE USING S3-BENCHMARK

        With EC (GB/sec)    Without EC (GB/sec)
PUT     26.27               38.2
GET     180                 275.4
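For reference, aggregate GET throughput of the kind reported above can be approximated with a simple multi-threaded S3 reader. The sketch below uses boto3 against a placeholder endpoint and assumes the 1 MB objects were pre-loaded; it is a measurement sketch, not the actual s3-benchmark tool [2].

    import time
    from concurrent.futures import ThreadPoolExecutor
    import boto3

    # Placeholder endpoint/credentials; assumes bucket "bench" holds
    # pre-loaded 1 MB objects named obj-000000 ... obj-009999.
    s3 = boto3.client("s3", endpoint_url="http://dss-minio:9000",
                      aws_access_key_id="minio",
                      aws_secret_access_key="minio123")

    def get_one(key: str) -> int:
        return len(s3.get_object(Bucket="bench", Key=key)["Body"].read())

    keys = [f"obj-{i:06d}" for i in range(10_000)]
    start = time.time()
    with ThreadPoolExecutor(max_workers=64) as pool:
        total_bytes = sum(pool.map(get_one, keys))   # fan out GETs
    elapsed = time.time() - start
    print(f"GET throughput: {total_bytes / elapsed / 1e9:.2f} GB/sec")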
IV. CONCLUSIONS & FUTURE WORK
In this paper, we introduced DSS, our new object storage system, which deploys object-storage key-value APIs on NVMe-oF SSDs. DSS is a disaggregated storage system that features deterministic, high I/O bandwidth and scalability for object storage. We measured DSS read and write throughput, which reach the order of hundreds and tens of GB/sec, respectively. In the future, we will improve our storage system's performance further by enabling the S3 service over RDMA to eliminate HTTP/TCP copy overhead. We also plan to offload the S3 Select API to the FPGA unit on smart SSDs and to filter data before transferring it to the client, so that clients retrieve only the data they need. In addition, we will complete our provisioning technique for delivering consistent bandwidth to each client session based on its Quality of Service (QoS) requirements.
REFERENCES
[1] “MinIO object storage cluster,” https://min.io/.
[2] “S3-benchmark,” https://github.com/wasabi-tech/s3-benchmark.