Lee Organick's research while affiliated with University of Washington Seattle and other places

Publications (15)

Article
With the rapidly decreasing cost of array-based oligo synthesis, large-scale oligo pools offer significant benefits for advanced applications including gene synthesis, CRISPR-based gene editing, and DNA data storage. The selective retrieval of specific oligos from these complex pools traditionally uses polymerase chain reaction (PCR). Designing a l...
Preprint
With the rapidly decreasing cost of array-based oligo synthesis, large-scale oligo pools offer significant benefits for advanced applications, including gene synthesis, CRISPR-based gene editing, and DNA data storage. Selectively retrieving specific oligos from these complex pools traditionally uses Polymerase Chain Reaction (PCR), in which any sel...
Article
As global demand for digital storage capacity grows, storage technologies based on synthetic DNA have emerged as a dense and durable alternative to traditional media. Existing approaches leverage robust error correcting codes and precise molecular mechanisms to reliably retrieve specific files from large databases. Typically, files are retrieved us...
Article
DNA sequencing is the molecular-to-digital conversion of DNA molecules, which are made up of a linear sequence of bases (A,C,G,T), into digital information. Central to this conversion are specialized fluidic devices, called sequencing flow cells, that distribute DNA onto a surface where the molecules can be read. As more computing becomes integrate...
Article
Synthetic DNA has recently risen as a viable alternative for long‐term digital data storage. To ensure that information is safely recovered after storage, it is essential to appropriately preserve the physical DNA molecules encoding the data. While preservation of biological DNA has been studied previously, synthetic DNA differs in that it is typic...
Preprint
Synthetic DNA has recently risen as a viable alternative for long-term digital data storage. To ensure that information is safely recovered after storage, it is essential to appropriately preserve the physical DNA molecules encoding the data. While preservation of biological DNA has been studied previously, synthetic DNA differs in that it is typic...
Article
DNA has recently emerged as an attractive medium for archival data storage. Recent work has demonstrated proof-of-principle prototype systems; however, very uneven (biased) sequencing coverage has been reported, which indicates inefficiencies in the storage process. Deviations from the average coverage in the sequence copy distribution can either c...
Article
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
Article
Synthetic DNA is gaining momentum as a potential storage medium for archival data storage. In this process, digital information is translated into sequences of nucleotides and the resulting synthetic DNA strands are then stored for later retrieval. Here, we demonstrate reliable file recovery with PCR-based random access when as few as ten copies pe...
Preprint
Synthetic DNA has been gaining momentum as a potential storage medium for archival data storage (1-9). Digital information is translated into sequences of nucleotides and the resulting synthetic DNA strands are then stored for later individual file retrieval via PCR (7,8,9) (Fig. 1a). Using a previously presented encoding scheme (9) and...
Preprint
DNA has recently emerged as an attractive medium for future digital data storage because of its extremely high information density and potential longevity. Recent work has shown promising results in developing proof-of-principle prototype systems. However, very uneven (biased) sequencing coverage distributions have been reported, which indicates in...
Preprint
Modern next-generation DNA sequencers support multiplex sequencing to improve throughput and decrease costs. This is done by pooling and sequencing samples together in parallel, which are later demultiplexed according to their unique indexes. When reads are assigned to the wrong index, called index cross-talk, information is leaked between samples....
Article
Synthetic DNA is durable and can encode digital data with high density, making it an attractive medium for data storage. However, recovering stored data on a large scale currently requires all the DNA in a pool to be sequenced, even if only a subset of the information needs to be extracted. Here, we encode and store 35 distinct files (over 200 MB o...
Preprint
Current storage technologies can no longer keep pace with exponentially growing amounts of data.¹ Synthetic DNA offers an attractive alternative due to its potential information density of ~10¹⁸ B/mm³, 10⁷ times denser than magnetic tape, and potential durability of thousands of years.² Recent advances in DNA data storage have highlighted te...

Citations

... Data search in an archive or database will become important for DNA storage if conventional storage methods become obsolete. However, most research on DNA data readout so far has focused on retrieving data items by unique identifiers, not by their actual content [90]. Recent works presented a solution to how content-based similarity search using data encoded in DNA could shape future systems. ...
... The lifespan of current digital media storage systems (optical media, magnetic tapes, hard disk drives or flash memory) does not exceed seven years on average. Hence, these data must be regularly copied and kept at constant temperature and humidity, resulting in a colossal energy cost. Data centers consume 2% of the world's electricity, and their carbon footprint exceeds that of civil aviation (1). ...
... Bioinformatics as a whole has made staggering advances in the field of genetics [65]. Challenges that remain unsolved, hindering the benefit of national or global genomics databases, include DNA data storage and random access retrieval [66], data privacy management [67], and predictive genomics analysis methods. Variant filtration in rare disease is based on reference allele frequency, yet the result is not clinically actionable in many cases. ...
... The DNA data storage channel is a complex channel with several types of errors. Most previous studies have focused on substitution and indel errors [28,32,36,45-48]. In practice, however, DNA breaks and rearrangements occur frequently during the preservation of DNA molecules and PCR-based data copying, threatening the robustness of DNA data storage. ...
... It is not necessary to store millions of copies of each DNA strand. While theoretically 455 EB can be stored per gram of DNA [4], it has been shown to be technically possible to fully recover the digital file when 10 copies of each DNA strand are present [22]. This allows for an extremely high information density of 17 EB/g [22]. ...
... The write (synthesis) and read (sequencing) processes are error-prone. Each base (one nucleotide) may incur an error rate of around 1% [8]. To handle these errors, researchers use error-correcting codes (ECC), which results in much higher overhead. ...
... This increasing integration and dependence on the digital space (for example, computer-controlled instruments within biomanufacturing processes) creates a new category of risks between cyber and biological systems. Cyber-biocrime describes criminal activities carried out by combined means of computers/Internet and biological/biochemical material, and was discussed in six studies (Ney et al., 2017, 2018; Wintle et al., 2017; Peccoud et al., 2018; Faezi et al., 2019; Qu, 2019). Peccoud et al. introduce the need for "cyberbiosecurity" to prevent (for example) the manufacture of nefarious products through the tampering of electronic orders of DNA sequences or the interception of shipments. ...
... Subsequently, using error correction, the DNA sequences can be decoded back into the strings of bits that make up the digital file [4,5,22-26]. To date, files of sizes up to 200 MB have been stored in DNA [24-26], and calculations show that, in theory, all information produced globally in one year could be stored in 4 g of DNA [8,13,17,27]. ...