Read/write pipelines of SOTA DNA storage solutions

Source publication
Preprint
The surge in demand for cost-effective, durable long-term archival media, coupled with density limitations of contemporary magnetic media, has resulted in synthetic DNA emerging as a promising new alternative. Today, the limiting factor for DNA-based data archival is the cost of writing (synthesis) and reading (sequencing) DNA. Newer techniques tha...

Contexts in source publication

Context 1
... the rest of this paper, we provide an overview of challenges in DNA storage (Section 2), present the aforementioned aspects of OA-DSM design in detail (Section 3), and demonstrate their ability to achieve better accuracy and higher error tolerance than SOTA methods using both simulation studies and a real wetlab validation experiment in which we successfully encoded and decoded a 1.2 MB compressed TPC-H database archive. Figure 1 provides an overview of SOTA DNA storage pipelines. Digital data is stored on DNA by first encoding bits into quaternary ...
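As a rough illustration of what "encoding bits into quaternary" can look like, the sketch below maps every two bits to one nucleotide. The mapping table and the function names `bytes_to_dna` and `dna_to_bytes` are assumptions made for this example only; this is not the OA-DSM encoding, and real pipelines typically add constraints (for example on homopolymer runs and GC content) and error-correction codes on top.

```python
# Hypothetical 2-bits-per-base mapping, chosen only for illustration:
# 00 -> A, 01 -> C, 10 -> G, 11 -> T.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides, most significant bits first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(seq: str) -> bytes:
    """Invert bytes_to_dna; the sequence length must be a multiple of 4."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for ch in seq[i:i + 4]:
            # BASES.index raises ValueError on any character outside ACGT.
            byte = (byte << 2) | BASES.index(ch)
        out.append(byte)
    return bytes(out)
```

A round trip such as `dna_to_bytes(bytes_to_dna(b"DNA"))` returns the original bytes, which is the minimal property any such mapping needs before error handling is layered on.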
Context 2
... the clustering stage, other SOTA methods apply consensus in each cluster followed by decoding in two separate phases as shown in Figure 1. In OA-DSM, we exploit the motif design and columnar layout of oligos to iteratively perform consensus and decoding in an integrated fashion as shown in Figure 3. Unlike other approaches, OA-DSM processes the reads one column at a time. ...
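For contrast, the separate consensus phase of a conventional pipeline can be pictured as a per-position majority vote over the reads of one cluster. The sketch below is a deliberately simplified stand-in, not the OA-DSM procedure and not any specific published algorithm: it assumes the reads in a cluster are already aligned to the same length (no insertions or deletions), and the name `majority_consensus` is hypothetical.

```python
from collections import Counter


def majority_consensus(reads):
    """Return the per-position majority base over equal-length reads.

    Simplifying assumption: reads are pre-aligned and indel-free, so
    position i of every read refers to the same base of the original oligo.
    """
    if not reads:
        raise ValueError("cluster is empty")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("all reads must have the same length")
    # zip(*reads) yields one tuple of bases per position.
    return "".join(
        Counter(position).most_common(1)[0][0] for position in zip(*reads)
    )
```

In this two-phase style, the consensus string is produced first and only then handed to the decoder, which is the separation that the integrated approach described above avoids.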
Context 3
... see that in this portion, the substitution rate is dominant: it is 3× higher than the insertion and deletion rates. Figure 11 compares our error rates with those reported in prior work on DNA storage [11,14,16,17,24]. While the actual rates vary due to differences in synthesis and sequencing steps, we see that the overall trends are similar. ...
Context 4
... the actual rates vary due to differences in synthesis and sequencing steps, we see that the overall trends are similar. Using the aligned reads, we also report the indel distribution in Figure 10, which shows a histogram of edit distances between the reads and references. As can be seen, 96.97% of reads have an edit distance of less than 10, indicating that the error rate is less than 6%. ...
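A histogram of this kind can be sketched with a standard Levenshtein distance, which counts the substitutions, insertions, and deletions needed to turn a read into its reference. The function names `edit_distance` and `distance_histogram` and the `(read, reference)` pair format are assumptions for illustration; the excerpt does not say which alignment tool or distance implementation the authors used.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(
                min(
                    prev[j] + 1,               # delete ca
                    curr[j - 1] + 1,           # insert cb
                    prev[j - 1] + (ca != cb),  # match or substitute
                )
            )
        prev = curr
    return prev[-1]


def distance_histogram(pairs):
    """Count how many (read, reference) pairs fall at each edit distance."""
    return Counter(edit_distance(read, ref) for read, ref in pairs)
```

Dividing each distance by the length of its reference gives a per-base error rate, which is the kind of quantity the percentage bound above refers to.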