Video is encoded onto the DNA of living bacteria using CRISPR

Images and a GIF were successfully encoded into the genome of a living E. coli cell.

In the study, published in Nature, researchers used nucleotides, DNA’s building blocks, to produce a code that corresponds to the individual pixels of each image. To encode the GIF, they delivered the sequences frame by frame over time to the genome of living E. coli bacteria. Once delivered, the data can be retrieved by sequencing the DNA and reconstructed by reading the pixel nucleotide code, with around 90 percent accuracy. We spoke to Seth Shipman of Harvard University about the study, which also provides new insights into how the CRISPR system functions.

ResearchGate: What motivated this study?

Seth Shipman: With this study, we were aiming to see whether the CRISPR-Cas system could be used to capture complex information with a time component and store that information in the DNA of a living cell's genome. We care about that because we want to make cells that can record information from biological systems and store that information so that we can collect data without having to disrupt the biological system.

To the left are a series of frames showing the mare "Anna G." galloping, which were encoded into nucleotides and captured sequentially over time by the CRISPR adaptation system in living bacteria. To the right are the frames after multiple generations of bacterial growth, recovered by sequencing bacterial genomes. Credit: Seth Shipman.

RG: Can you tell us what you achieved?

Shipman: We found that the molecular recording system we're working on can capture complex information – images – over time. We also found that, even if we do not encode any information about the correct order of the images in the DNA, we can still get the information out correctly from the bacterial genomes. Along the way, we found that there are some rules for designing DNA sequences that will work well with this molecular recording system.

RG: How is this different from previous successes storing files on DNA?

Shipman: A lot of great work has been done storing information in synthesized DNA. However, what we've been doing here is storing information in DNA inside of living cells. It's a trickier problem because, not only do you need to synthesize the coded DNA, but you also need to deliver it to cells and get them to incorporate the new bases into their genomes. We're piloting this system using images and movies, but we hope that it will eventually be used to capture information that we don't already know, like what is going on inside of a cell. That information would also be stored in the cell's genome so that we could go in and retrieve it at a later time. That might allow us to make cells that would help us collect data from the biological systems that we want to study.

RG: How is this possible with the CRISPR system? How did you retrieve the data?

Shipman: We synthesized the DNA, and then we tried two different strategies to encode the data. One was rigid: four colors, with one base corresponding to each color. That code didn't work so well, because it created sequences that were not very compatible with the biology of the system. We ended up using a more flexible code, similar to the code used to make proteins. In this code, we had 21 colors and each could be coded by three different nucleotide codes. That way we could flexibly create the DNA, avoiding problematic sequences.
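
The two strategies Shipman describes can be illustrated with a minimal Python sketch. It is not the code table or the selection rule used in the study. The base assigned to each color in the rigid code, the way triplets are assigned to colors in the flexible code, and the choice of "long runs of one repeated base" as a stand-in for "problematic sequences" are all assumptions made here for illustration.

```python
from itertools import product

BASES = "ACGT"
N_COLORS = 21
CODES_PER_COLOR = 3

# Rigid code: four colors, one base each (which base goes with which
# color is an assumption made for this sketch).
RIGID = {0: "A", 1: "C", 2: "G", 3: "T"}

# Flexible code: a hypothetical table that deals out the first 63 of the
# 64 possible triplets, three per color. Not the table from the study.
ALL_TRIPLETS = ["".join(t) for t in product(BASES, repeat=3)]
COLOR_TO_CODES = {
    color: ALL_TRIPLETS[color * CODES_PER_COLOR:(color + 1) * CODES_PER_COLOR]
    for color in range(N_COLORS)
}
CODE_TO_COLOR = {
    code: color for color, codes in COLOR_TO_CODES.items() for code in codes
}


def longest_run(seq):
    """Return the length of the longest stretch of one repeated base."""
    best = run = 0
    prev = None
    for base in seq:
        run = run + 1 if base == prev else 1
        best = max(best, run)
        prev = base
    return best


def encode_rigid(pixels):
    """Encode color indices 0..3 with exactly one base per pixel."""
    return "".join(RIGID[p] for p in pixels)


def encode_flexible(pixels, window=3):
    """Encode color indices 0..20, greedily picking for each pixel the
    synonymous triplet that keeps repeated-base runs shortest.

    Treating long runs as "problematic" is an assumed stand-in; the
    article does not say which sequences had to be avoided.
    """
    seq = ""
    for color in pixels:
        candidates = COLOR_TO_CODES[color]
        seq += min(candidates, key=lambda c: longest_run(seq[-window:] + c))
    return seq


def decode_flexible(seq):
    """Read the sequence three bases at a time and look up each color."""
    return [CODE_TO_COLOR[seq[i:i + 3]] for i in range(0, len(seq), 3)]


if __name__ == "__main__":
    flat = [0] * 10  # ten pixels of the same color
    print(encode_rigid(flat))     # one base repeated for every pixel
    print(encode_flexible(flat))  # same pixels, with a free choice of triplet
```

With the rigid code, a block of same-colored pixels has exactly one possible sequence, so there is no way to steer around an unwanted pattern. With the flexible code, the encoder has three choices per pixel and the decoder still recovers the same colors, at the cost of three bases per pixel instead of one.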

To the left is an image of a human hand, which was encoded into nucleotides and captured by the CRISPR-Cas adaptation system in living bacteria. To the right is the image after multiple generations of bacterial growth, recovered by sequencing bacterial genomes. Credit: Seth Shipman.

RG: How much information could potentially be stored on DNA?

Shipman: I believe that the current record is 200 MB. We're operating at a much smaller scale, because we're trying to interface with living biology.

RG: What’s next for your research?

Shipman: Currently, we are supplying the information that we want to be recorded. That is necessary for us to understand how the system works, whether it works, and how to make it better. Our next step is to hook this system up to biology so that it might record a process that we don't yet understand, rather than information that we already know.

Featured image courtesy of NIAID.