Chapter 4
The Potential of Plants and Seeds in DNA-Based
Information Storage
Karin Fister, Iztok Fister Jr., and Jana Murovec
Abstract New approaches for data archiving are required due to a constant increase
in digital information production and the lack of a high-capacity, low-maintenance storage
medium. High-density information encoding and longevity are the two important
advantages which have recently made DNA an attractive target for information storage.
However, creating new copies of the same encoded information by producing
new, artificial DNA sequences is not financially viable. Moreover, a naked DNA
molecule can be greatly affected by environmental influences, resulting in DNA
mutations and changes in the stored information. Our approach demonstrates the
great potential of plants and seeds in circumventing these drawbacks. It shows that
artificially encoded data can be stored and multiplied within plants.
4.1 Introduction
Data storage is relevant for keeping track of our history and for accomplishing
tasks during our day-to-day lives. It has evolved significantly from the first written
records of the ancient Sumerians and Egyptians into our time, where it is embedded
in a rapidly expanding data production environment. Stones were replaced by
paper, which is being replaced progressively by electronic storage media. Compared
to printed data, the latter are characterized by relatively small physical space
requirements and by the ease of copying digital data. However, there are some major
drawbacks with current storage technologies. The first drawback is their limited
capacity. For instance, at the time of this writing, the world’s highest capacity
helium drive can only save up to 10 terabytes (TB) of data. Although the storage
capacity of all data storage equipment grew from less than 3 (optimally compressed)
exabytes in 1986 to several zettabytes in 2013 and is doubling roughly every 3 years,
the International Data Corporation reported that, even in 2007, the total
amount of digital data produced on the planet exceeded the amount of available
storage (Hilbert and López 2011; Cisco 2016; Gant et al. 2007). The problem of
data explosion is also evident in the field of modern biology. Biological
data are heterogeneous because they stem from a wide range of experiments. Getting the
most from the data requires comparing them to the relevant prior knowledge. That
means scientists have to store large data sets and analyse them. The European
Bioinformatics Institute (EBI) in the UK, one of the largest biology data repositories
in the world, currently stores 20 petabytes of data (Marx 2013). Efforts to produce
enough storage are resulting in increased production of storage devices and the
building of large data centers, all of which increases environmental contamination
and raises maintenance costs. Another major disadvantage of electronic
storage media is their short lifespan (usually measured in years), which depends on the
frequency of access (Ajwani et al. 2008).

K. Fister (✉)
Faculty of Medicine, University of Maribor, Taborska 8, 2000, Maribor, Slovenia
e-mail: karin.ljubic@student.um.si
I. Fister Jr.
Faculty of Electrical Engineering and Computer Science, University of Maribor,
Smetanova 17, 2000, Maribor, Slovenia
e-mail: iztok.fister1@um.si
J. Murovec
Biotechnical Faculty, University of Ljubljana, Jamnikarjeva 101, 1000, Ljubljana, Slovenia
e-mail: jana.murovec@bf.uni-lj.si
© Springer International Publishing AG 2017
A.J. Schuster (eds.), Understanding Information, Advanced Information
and Knowledge Processing, DOI 10.1007/978-3-319-59090-5_4

Due to the constant increase of digital data production and the above-mentioned
concerns, new approaches for data storage are being sought. DNA has proven to
be useful for archival storage, because it offers some major improvements over
digital storage media, namely information density, stability when stored under
optimal conditions, and small physical size. The first message stored in DNA dates back
to 1988 (Davis 1996). More recently, in the past 4 years, there have been two
major breakthroughs in the field of DNA-based information storage (Church et al.
2012; Goldman et al. 2013). First, novel next-generation DNA synthesis
and sequencing technologies have expanded the boundaries of previous DNA
production approaches, which were able to encode and decode only trivial amounts
of information (Davis 1996) and, in addition, could not be scaled
up (Clelland et al. 1999). The second advance was achieved with the use of the
Huffman code (Brand 2000; MacKay 2003; Ailenberg and Rotstein 2009) as a
compression method for mapping large numbers of bytes onto a smaller number of DNA bases.
This approach has proven to be the most accurate method and could be scaled
beyond the boundaries of current archiving methods (Church et al. 2012; Goldman
et al. 2013; Ailenberg and Rotstein 2009). So far, the sets of computer files encoded in
DNA have amounted to 739 kB (Goldman et al. 2013), 675 kB (Church et al. 2012), and
83 kB (Grass et al. 2015). Encouraged by these achievements, it has been proposed
that DNA-based storage might already be economically viable for archives that are
rarely accessed, such as historical and government records or large-scale science
projects that generate massive amounts of data (Brand 2000; The Economist 2012).
The stability of DNA is highly dependent on the storage conditions, which should
provide constant low temperatures, as in freezers, and protection from atmospheric
water, oxygen and ozone. It has been demonstrated that, at room temperature,
solid-state DNA degradation through depurination, base deamination, and base or
sugar oxidation is affected greatly by water and oxygen (Bonnet et al. 2010), thus
dictating the need for special equipment or preservation procedures. The problem
is even more pronounced because laboratory plasticware is neither moisture-tight nor
airtight and storage in refrigerators is not always possible. DNA shells (Colotte
et al. 2011; Clermont et al. 2014; Liu et al. 2015) have been presented recently
as a means of storing DNA at room temperature. Although they provide an
alternative approach, the technology requires an anoxic and anhydrous atmosphere
in small glass vials fitted into stainless-steel, laser-sealed mini-capsules, all of which
boost storage costs. The same problem is encountered when using dry-state DNA
stabilization systems, such as commercial Biomatrica DNAstable plates, trehalose
and polyvinyl alcohol (PVA) plates (Ivanova and Kuzmina 2013) or inorganic silica
capsules (Grass et al. 2015).
This chapter addresses several of the issues mentioned before. In its essence,
the chapter describes a novel approach for DNA-based data storage that does not
focus on information quantity but rather on a new storage medium that combines
DNA stability and, consequently, information preservation, with low costs for its
conservation and multiplication. We chose a living plant, the widely known model
plant Nicotiana benthamiana, to be the target multi-cellular, eukaryotic organism
for digital information hosting. Reasons for choosing this particular plant include
the plant’s short generation time, its high seed yield and its ease of growing under
natural and controlled environments (Goodin et al. 2008). Further, we selected the
well-known ‘Hello world!’ computer program (Langtangen 2006) in a high-level,
universally used programming language, in our work the Python¹ programming
language, to be encoded and stored in the plant. In order to provide the reader with
a detailed understanding of our work, we organized the remainder of this chapter as
follows. Section 4.2 describes the process of storing digital data into plants in detail.
We start by describing the coding program that was developed in order to transform
the digital information into the DNA sequence. Next, we describe the synthesis of
our artificial ‘Code DNA’ by Integrated DNA Technologies and the process of plant
transformation by co-cultivation with Agrobacterium tumefaciens containing the
binary plasmid. We conclude the section with a description of the screening process
for detecting the presence of our Code DNA in the plant. Section 4.3 describes the
results of our experiment. In Sect. 4.4 we discuss the advantages of storing data into
plants and their seeds. Section 4.5 ends the chapter with a summary.
4.2 Materials and Methods
The aim of this section is to provide the reader with a detailed description of our
work. The section starts with DNA basics, a description of the basic structure of
DNA molecules, as well as the process of transferring information from a mother
cell to its two daughter cells. Knowing these basic ideas is crucial for understanding
the backbone of our experiment. Next, we describe the coding program. The coding
program was developed to transform our digital data in the form of bits into a
sequence of DNA nucleotides. By using the coding program we transformed a
computer program into a sequence of nucleotides. We named this artificial sequence
the Code DNA. The section describes the synthesis of this Code DNA in detail. We
discuss, briefly, the plant material and focus on the process of plant transformation
with Agrobacterium tumefaciens containing the binary plasmid. The end of the
section focuses on the process of extraction of our Code DNA from the leaf tissue.
For a start, Fig. 4.1 presents the key steps of storing data into plants and obtaining
data from them.

¹ Python. https://www.python.org/. Accessed: 2016-06-30.
4.2.1 DNA Basics
A DNA molecule consists of two polynucleotide chains, known as DNA chains or DNA
strands. Each chain consists of four types of nucleotide subunits, and each nucleotide
is composed of a five-carbon sugar to which are attached one or more phosphate
groups and a nitrogen-containing base. In DNA nucleotides, the sugar is deoxyribose
attached to a single phosphate group. Therefore, the three basic parts of
nucleotides are sugar, phosphate group and base. The base may be either Adenine
(A), Cytosine (C), Guanine (G) or Thymine (T). Figure 4.2 presents the structure of
all four nitrogenous bases within DNA nucleotides.

Fig. 4.1 Flow chart illustrating the key processes involved when storing data into plants (top) and
obtaining data from them (bottom)

Fig. 4.2 Elementary nitrogenous bases (adenine, guanine, cytosine, thymine) in the nucleic acid
of DNA, usually represented by the letters A, G, C and T

The backbone of the DNA chain
consists of covalently linked sugars and phosphate groups in alternating fashion.
The two chains are held together by hydrogen bonds between the base portion of
the nucleotides. The double-helix or the three-dimensional structure of DNA arises
from the chemical and structural features of its two chains. The shapes and chemical
structure of the bases allow pairing with hydrogen bonds only between A and T and
between C and G. This pairing is referred to as Watson-Crick base pairing. Because
of these base-pairing requirements each strand of the DNA molecule is exactly
complementary (called Watson-Crick complementarity) to the nucleotide sequence
of its partner strand. Therefore, we can predict the sequence of the second strand
by knowing the sequence of the first strand. The sequence of nucleotides of one or
the other strand carries biological information or, in the case of our experiment,
artificial information that must be copied accurately for transmission to the
next generation each time a cell divides. At each cell division, the double helix
unwinds, allowing each strand to pair with new, complementary bases. Each DNA strand
serves as a template for its own duplication. The ability of each strand of a DNA
molecule to act as a template for producing a complementary strand enables a cell
to copy its genetic information before passing it on to new cells. Therefore, the
artificially inserted DNA sequence is copied too, and the digital information that
this artificial sequence represents is carried to every cell of a plant and to all of its
seeds and progeny.
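The statement that one strand predicts the other can be illustrated with a short Python sketch. The function names are illustrative and not part of the coding program described in this chapter. The sketch only applies the A-T and C-G pairing rules given above.

```python
# Watson-Crick pairing rules: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand written in the conventional 5'->3' direction."""
    return complement(strand)[::-1]


if __name__ == "__main__":
    first_strand = "ATGCGT"
    print(complement(first_strand))          # TACGCA
    print(reverse_complement(first_strand))  # ACGCAT
```

Applying `reverse_complement` twice returns the original strand, which is the sense in which either strand can serve as a template for the other.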
4.2.2 Coding Program
We developed a coding program that first translates text to binary. The whole coding
program is available on-line² and enables coding text into DNA sequences and
decoding DNA sequences into text. The maximum length of inserted characters to
be encoded into DNA is limited to 300, while the maximum length of an inserted
DNA sequence to be decoded back to text is 1,200 bases. Currently, the program
enables coding letters from A to Z and a to z (English alphabet), numbers from
0 to 9 and the special characters hashtag (#) and apostrophe (’). The program offers
two coding options: ‘Classic’ and ‘Compressed1’. The Classic option, which uses
2 bits for coding a base, is as follows: 00 for A, 10 for C, 11 for T and 01 for
G. This encoding scheme helps to avoid sequences that are difficult to
synthesize, sequences with long repeats and sequences with extreme GC content.
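The Classic mapping can be sketched in a few lines of Python. This is an illustration, not the authors' program: the function names are hypothetical, and the sketch assumes that each character is first written as its 8-bit ASCII code. That assumption is consistent with the stated limits (300 characters × 8 bits ÷ 2 bits per base = 1,200 bases), but the chapter does not spell out the text-to-binary step.

```python
# Classic mapping as stated in the text: 00 -> A, 10 -> C, 11 -> T, 01 -> G.
BITS_TO_BASE = {"00": "A", "10": "C", "11": "T", "01": "G"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_classic(text: str) -> str:
    """Encode text as DNA bases, two bits per base.

    Assumption: each character is first written as its 8-bit ASCII code.
    """
    for ch in text:
        if ord(ch) > 127:
            raise ValueError(f"non-ASCII character: {ch!r}")
    bits = "".join(format(ord(ch), "08b") for ch in text)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_classic(dna: str) -> str:
    """Invert encode_classic: bases -> bits -> 8-bit characters."""
    if len(dna) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4 bases")
    bits = "".join(BASE_TO_BITS[base] for base in dna.upper())
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    program = "#begin print 'Hello world' #end"
    dna = encode_classic(program)
    print(len(program), "characters ->", len(dna), "bases")
    print(decode_classic(dna) == program)
```

Under this assumption every character occupies exactly four bases, so the letter H (binary 01001000) becomes GACA.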
The Compressed1 option is upgraded by using the Huffman compression method.
This method reduces the overall number of bits used to encode a string of symbols
inserted in the coding window. The Huffman compression method allows up to
60% higher compression than the Classic option. The percentage depends on the
length of the inserted text. When using the Compressed1 option, users are given a
‘Key’, which has to be used in order to decode the DNA sequence back to text. This
Key links the user to the specific Huffman tree that was used for compression.

² Plant-based data storage project. http://www.storing-data-into-living-plant.net/. Accessed: 2016-07-30.
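The idea behind a Huffman-based option can be sketched generically in Python. This is a textbook Huffman coder followed by the 2-bits-per-base mapping, not a reconstruction of the Compressed1 option. In particular, the returned code table and bit count are only a stand-in for the ‘Key’ mentioned above, whose actual format is not described in the chapter. All function names are hypothetical.

```python
import heapq
from collections import Counter

BITS_TO_BASE = {"00": "A", "10": "C", "11": "T", "01": "G"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def build_huffman_codes(text: str) -> dict:
    """Build a prefix-free code in which frequent symbols get shorter codes."""
    if not text:
        raise ValueError("text must not be empty")
    freq = Counter(text)
    if len(freq) == 1:  # a single distinct symbol still needs one bit
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tiebreak, {symbol: code built so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def huffman_to_dna(text: str):
    """Return (dna, codes, n_bits); codes and n_bits are needed for decoding."""
    codes = build_huffman_codes(text)
    bits = "".join(codes[ch] for ch in text)
    n_bits = len(bits)
    if n_bits % 2:  # pad to a whole number of bases
        bits += "0"
    dna = "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))
    return dna, codes, n_bits


def dna_to_text(dna: str, codes: dict, n_bits: int) -> str:
    """Decode using the same code table that was used for encoding."""
    bits = "".join(BASE_TO_BITS[b] for b in dna)[:n_bits]
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)
```

Because the code table depends on the symbol frequencies of the particular text, the gain over a fixed 8-bit encoding varies from one input to another, which matches the remark above that the percentage depends on the inserted text.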
4.2.3 Code DNA Synthesis and Cloning
The ‘Hello world!’ computer program was structured and written in the form
of the syntax #begin print ‘Hello world’ #end. The syntax was coded into the Code DNA
using the Classic option of our coding program. Primer annealing
sequences were added upstream and downstream of the Code DNA for subsequent
sequencing reactions. The Code DNA was synthesized by Integrated DNA Technologies
(IDT, Leuven, Belgium) and cloned into the multiple cloning site (MCS) of a linearized plasmid
vector pCAMBIA 1302-ZsGreen (Susič et al. 2014) using a Gibson Assembly
Cloning Kit following the manufacturer’s instructions (New England Biolabs,
Ipswich, MA, USA). The binary plasmid pCAMBIA 1302-ZsGreen-Code contained
a hygromycin phosphotransferase (hptII) selectable marker gene and the ZsGreen
reporter gene, both driven by the cauliflower mosaic virus 35S promoter. The
binary plasmid was electroporated into ElectroMAX Agrobacterium tumefaciens
LBA 4404 (Invitrogen).
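Section 4.2.2 notes that sequences with long repeats or extreme GC content are difficult to synthesize. A simple pre-synthesis check of a candidate sequence could look like the Python sketch below. The function names and the thresholds (30–70% GC, runs of at most six identical bases) are illustrative assumptions, not values taken from this chapter or from any synthesis provider.

```python
from itertools import groupby


def gc_content(dna: str) -> float:
    """Fraction of bases that are G or C (0.0 to 1.0)."""
    dna = dna.upper()
    if not dna:
        raise ValueError("empty sequence")
    return (dna.count("G") + dna.count("C")) / len(dna)


def longest_homopolymer(dna: str) -> int:
    """Length of the longest run of one repeated base."""
    if not dna:
        raise ValueError("empty sequence")
    return max(len(list(run)) for _, run in groupby(dna.upper()))


def passes_checks(dna: str, gc_range=(0.3, 0.7), max_run=6) -> bool:
    """Apply both checks; the default thresholds are placeholders."""
    low, high = gc_range
    return low <= gc_content(dna) <= high and longest_homopolymer(dna) <= max_run


if __name__ == "__main__":
    candidate = "GACAGCCGATTCGGAC"
    print(gc_content(candidate), longest_homopolymer(candidate))
    print(passes_checks(candidate))
```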
4.2.4 Plant Material
Here we describe the preparation of plant material for transformation. Seeds of
Nicotiana benthamiana were surface sterilized and germinated in petri dishes on
solid medium containing 2 g L⁻¹ sucrose (Duchefa Biochemie B. V.) and 8 g L⁻¹
Daishin agar (Duchefa Biochemie B. V.). Plant seedlings were sub-cultivated
every four weeks on fresh medium containing Murashige and Skoog macro- and
micro-elements (MS; Duchefa Biochemie B. V.), 2 mg L⁻¹ thiamine HCl (Sigma),
1 mg L⁻¹ pyridoxine HCl (Sigma), 1 mg L⁻¹ nicotinic acid (Sigma), 30 g L⁻¹
sucrose and 8 g L⁻¹ Daishin agar. All media were adjusted to pH 5.8 before
autoclaving and plant tissue cultures were maintained at 23 ± 1 °C and a 16-h
photoperiod.
4.2.5 Plant Transformation
Agrobacterium tumefaciens containing the binary plasmid pCAMBIA 1302-ZsGreen-Code
was grown overnight at 28 °C with shaking in liquid YEB medium, pH
7.0, and prepared for co-cultivation as described in Susič et al. (2014). Explants
were immersed in the bacterial suspension for 15 min with periodic shaking. Inoculated
explants were blotted on sterile filter paper and transferred to regeneration
medium composed of MS macro- and micro-elements, 0.1 mg L⁻¹ Fe-Na2-EDTA
(Sigma), 0.1 g L⁻¹ myo-inositol (Sigma), 0.1 mg L⁻¹ thiamine HCl, 1 mg L⁻¹
6-benzylaminopurine (6-BAP) (Duchefa Biochemie B. V.), 0.1 mg L⁻¹ α-naphthalene
acetic acid (NAA) (Duchefa Biochemie B. V.), 30 g L⁻¹ sucrose, 200 µM
acetosyringone (AS) and 8 g L⁻¹ Daishin agar, pH 5.8. Cultures were incubated in
the dark for four days and then washed in a 200 mg L⁻¹ solution of timentin (Duchefa
Biochemie B. V.), blot-dried on filter paper and transferred to petri dishes containing
regeneration medium without AS but supplemented with 150 mg L⁻¹ of timentin
and 10 mg L⁻¹ hygromycin (Duchefa Biochemie B. V.). Regenerating shoots
were transferred to fresh cultivation medium composed of MS macro- and micro-elements,
2 mg L⁻¹ thiamine HCl, 1 mg L⁻¹ pyridoxine HCl, 1 mg L⁻¹ nicotinic
acid, 30 g L⁻¹ sucrose, 150 mg L⁻¹ of timentin and 8 g L⁻¹ Daishin agar, pH 5.8.
Figure 4.3 illustrates the main steps of plant transformation.

Fig. 4.3 Main steps in the plant transformation process. The starting point in the process is the
gene or synthetic DNA to be transferred. The end product is a genetically modified plant
4.2.6 DNA Isolation and PCR Analysis
Total genomic DNA was extracted from the leaf tissue of plants regenerated on
the selective medium by a modified cetyl trimethylammonium bromide (CTAB)
method. Sequences of primer pairs used in polymerase chain reactions (PCR) for
screening for the presence of the Code DNA and for the presence of selectable
and reporter genes, together with the lengths of amplified fragments, are listed
in Table 4.1. PCR reactions were performed according to Susič et al. (2014).

Table 4.1 Sequences of primers used and lengths of amplified fragments (in bp)

Primer name         Primer sequence 5′–3′              Amplified fragment length (bp)
Code-For            GCA ATG AGC GGT AGG AGT G          172
Code-Rev            ACG GTC AGC ATG TGA CAG TC
HptII-For           ATG ACC GCT GTT ATG CGG CCA TTG    641
HptII-Rev           AAA AAG CCT GAA CTC ACC GCG ACG
ZsGreen-For         AGA ACT CGT GTC CTG CTG GT         208
ZsGreen-Rev         ATG ATC TTC TCG CAG GAT GG
β-actin-qPCR-For    CTG GCA TTG CAG ATC GTA TGA        75
β-actin-qPCR-Rev    GCG CCA CCA CCT TGA TCT T
Code-qPCR-75-For    TCG CAA ATG AGC GGT AGG A          75
Code-qPCR-75-Rev    TTC ACG AGC CGG CGT ACT

The polymerase chain reaction is a technique used to amplify a precisely defined
piece of DNA across several orders of magnitude. This method thus generates
thousands to millions of copies of a particular DNA sequence and makes its
detection and sequencing much easier. Quantitative PCR (qPCR) is
a form of PCR that additionally allows us to determine the
quantity of a target sequence in a sample. Plantlets with positive amplification
results were analyzed further by real-time quantitative PCR on an ABI PRISM 7500
Fast Sequence Detection System with 7500 Software v2.3 (Applied Biosystems,
Foster City, USA). The primers used (see Table 4.1) were designed to amplify
a 75-bp section of the Code DNA and a 75-bp section of the tobacco β-actin
gene (Faize et al. 2010), which was used as the internal reference gene. Each
10 µl reaction was composed of 5 µl FastStart Universal SYBR Green Master (Rox)
(Roche, Basel, Switzerland), 9.5–0.15 ng of DNA and 600 nM of each primer.
Amplification was performed under the following thermal cycling conditions: 95 °C
for 10 min, followed by 40 cycles of 95 °C for 10 s and 60 °C for 30 s. Each reaction was
run in triplicate (technical replicates), and PCR amplification specificity was
confirmed by melting-curve analysis and by agarose electrophoresis. The transgene
copy number was calculated according to the method developed by Weng et al.
(2004).
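The claim that PCR yields thousands to millions of copies follows from its exponential character. The Python sketch below uses the usual idealized model, in which each cycle multiplies the number of copies by (1 + efficiency) and an efficiency of 1.0 corresponds to perfect doubling. The function names and the example efficiency are illustrative assumptions. The model ignores the plateau that real reactions reach in late cycles, and it is not the copy-number method of Weng et al. (2004).

```python
def copies_after(cycles: int, initial_copies: float = 1.0,
                 efficiency: float = 1.0) -> float:
    """Idealized copy number after a given number of PCR cycles."""
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_to_reach(target: float, initial_copies: float = 1.0,
                    efficiency: float = 1.0) -> int:
    """Smallest number of cycles at which the copy number reaches the target."""
    if initial_copies <= 0 or not 0.0 < efficiency <= 1.0:
        raise ValueError("need initial_copies > 0 and 0 < efficiency <= 1")
    copies, cycles = float(initial_copies), 0
    while copies < target:
        copies *= 1.0 + efficiency
        cycles += 1
    return cycles


if __name__ == "__main__":
    print(copies_after(10))            # 1024.0
    print(copies_after(20))            # 1048576.0
    print(cycles_to_reach(1_000_000))  # 20
```

In this idealized model a single template molecule passes one thousand copies after 10 cycles and one million after 20.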
4.2.7 Sanger Sequencing
Plant DNA was first amplified with the primer pair Code-For and Code-Rev using a
standard PCR protocol as described in Susič et al. (2014). Unused primers and
nucleotides were removed from PCR amplification products with ExoSAP-IT and
the sequencing reaction was performed separately for each primer with a BigDye®
Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems), both following the
manufacturer’s instructions. Amplified products were separated by capillary electrophoresis
using an ABI PRISM® 3100 Genetic Analyzer (Applied Biosystems)
and the results analyzed with CodonCode Aligner 4.0.4. The sequence obtained
was decoded with the program described above.
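The step from a sequencing read back to text can be sketched as follows. The sketch assumes that the encoded payload lies between two known flanking sequences on the read and that it uses the Classic 2-bit mapping with 8-bit characters. The flanks in the example are made-up placeholders, not the primer annealing sequences actually used, and the function names are hypothetical.

```python
# Classic mapping from Sect. 4.2.2, inverted for decoding.
BASE_TO_BITS = {"A": "00", "C": "10", "T": "11", "G": "01"}


def extract_between(read: str, upstream: str, downstream: str) -> str:
    """Return the part of a read lying between two flanking sequences.

    Both flanks must be given as they appear on the strand of the read
    (for a reverse primer this is its reverse complement).
    """
    read = read.upper()
    start = read.find(upstream)
    if start == -1:
        raise ValueError("upstream flank not found")
    start += len(upstream)
    end = read.find(downstream, start)
    if end == -1:
        raise ValueError("downstream flank not found")
    return read[start:end]


def decode_classic(dna: str) -> str:
    """Bases -> bits -> 8-bit characters; incomplete trailing bits are ignored."""
    bits = "".join(BASE_TO_BITS[b] for b in dna)
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits) - 7, 8))


if __name__ == "__main__":
    UP, DOWN = "TTGACCTGAC", "CAGTTCGGAA"    # hypothetical flanks
    read = "ACGT" + UP + "GACAGCCG" + DOWN + "TTAG"
    payload = extract_between(read, UP, DOWN)
    print(payload, "->", decode_classic(payload))
```

A real read may contain ambiguous base calls such as N, which this sketch does not handle. In practice, the forward and reverse reads would first be combined into a consensus sequence.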
4.3 Results
This section provides the results of our work. The main outcome is that
the inserted Code DNA was recovered successfully from the leaf of a plant.
4.3.1 Coding Program
The complete program, which enables coding text into DNA sequences
and decoding DNA sequences into text, is available on the Internet.² The syntax of
the ‘Hello world!’ program was coded using the Classic option of the program,
from which the Code DNA was obtained.
4.3.2 Storing Data in N. benthamiana and Reading Data from the Plant
We obtained several N. benthamiana plants and seeds (see Fig. 4.4) with normal
phenotypes and growth, which were PCR positive for the Code DNA, the selectable
marker gene and the reporter gene. They were analyzed further by quantitative
real-time PCR in order to determine the copy number of the inserted Code DNA,
and only transgenic plants (T0) containing one copy (i.e., hemizygous for the Code
DNA) were left for self-pollination.

Fig. 4.4 Nicotiana benthamiana plant, and seeds of Nicotiana benthamiana with incorporated Code DNA

After germination of their seeds, the T1 progeny were analyzed for the presence
of the Code DNA in their genome and a 1:3 segregation of PCR-negative to PCR-positive
plants was observed, thus confirming the hemizygosity of their mother plants. Transgenic
lines (T1) containing two copies of the Code DNA (i.e., homozygous for the Code
DNA) were selected and left to self-pollinate for further storage of their seeds. Some
of the seeds were germinated and the plants (T2) grew normally. The DNA of all
checked T2 plants contained the Code DNA and was sequenced as described in Sect. 4.2.7
(Sanger Sequencing). The obtained DNA sequence was decoded, resulting in the syntax of the
‘Hello world!’ program showing on a display device.
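The segregation argument can be made explicit with a small Python sketch. Selfing a parent that carries the insert on one of two homologous chromosomes gives offspring with two, one or zero copies in a 1:2:1 ratio, so three quarters of the progeny are expected to be PCR positive. The chi-square comparison below is a standard goodness-of-fit test. The function names and the plant counts in the example are invented for illustration and are not data from this study.

```python
from collections import Counter
from itertools import product

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRITICAL_DF1_P05 = 3.841


def selfing_offspring(parent=("T", "-")) -> dict:
    """Genotype fractions after selfing; 'T' = insert present, '-' = absent."""
    counts = Counter("".join(sorted(pair, reverse=True))
                     for pair in product(parent, repeat=2))
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}


def chi_square_3_to_1(positive: int, negative: int) -> float:
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = positive + negative
    expected_pos, expected_neg = 0.75 * total, 0.25 * total
    return ((positive - expected_pos) ** 2 / expected_pos
            + (negative - expected_neg) ** 2 / expected_neg)


def consistent_with_3_to_1(positive: int, negative: int) -> bool:
    """True if the 3:1 hypothesis is not rejected at the 5% level."""
    return chi_square_3_to_1(positive, negative) < CHI2_CRITICAL_DF1_P05


if __name__ == "__main__":
    print(selfing_offspring())             # fractions for TT, T- and --
    print(consistent_with_3_to_1(72, 28))  # hypothetical counts
```

The same enumeration shows why homozygous T1 plants are useful for seed storage: selfing a `("T", "T")` parent yields only offspring that carry the insert.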
4.4 Discussion
DNA-based storage of data has been proposed as a superior replacement for
electronic storage devices, due to its durability and low space requirements (Cox
2001; Church et al. 2012; Goldman et al. 2013; Grass et al. 2015). However,
since artificial DNA (Grass et al. 2015; Colotte et al. 2011; Clermont et al. 2014;
Liu et al. 2015; Ivanova and Kuzmina 2013; Anchordoquy and Molina 2007)
and microorganisms (Gibson et al. 2010; Farzadfard and Lu 2014; Ausländer and
Fussenegger 2014) require specific pretreatments and equipment for their storage,
an alternative approach, i.e., storing data in seeds, is presented here. Seeds are one
of the oldest storage media on Earth and they can preserve genetic information for
thousands of years. Due to their stability and longevity, they are the most frequently
used material for plant genetic resource preservation in the more than 1,750
genebanks worldwide. They are already guardians of our natural and cultural heritage and, with
the implementation of our proof-of-concept study presented in this work, their role
could become even more pronounced.
Storing data in plant seeds is a simple, safe and economic solution for data
storage, since seeds possess a wide range of natural protection mechanisms,
need no special equipment for storage and are easy to grow.
Seeds have already proved their durability over thousands of years. Examples
demonstrating this durability include the 1,600-year-old seeds of Anagyris foetida,
a relict species endemic to the Mediterranean region, which were germinated
successfully (Özgen et al. 2012), and the most ancient viable multicellular plants on
Earth, of the species Silene stenophylla Ledeb. (Caryophyllaceae), which have been
regenerated (Yashina et al. 2012) from approximately 31,800-year-old placenta
fragments. Seeds of Nicotiana spp. are known to preserve their germination ability
for up to 10 years under ambient temperatures and/or relative humidity (Agacka
et al. 2013), while long-term depositories such as the Svalbard Global Seed Vault³
can protect them even from massive natural or man-made cataclysms. Under a
controlled environment with reduced temperature and relative humidity, the need for
seed regeneration would be minimal. Taking into account the estimated spontaneous
mutation rate for Arabidopsis thaliana of 7 × 10⁻⁹ base substitutions per site per
generation (Ossowski et al. 2010) and the estimated N. benthamiana genome size
of 3 Gb (Bombarely et al. 2012), there is a negligible chance of a mutation in the
encoded DNA. However, by increasing the length of DNA insertions (i.e., the amount
of stored data) the chances of unwanted mutations also increase. Therefore, before
implementation of our proof of concept, the maximum length of heterologous DNA
that can be introduced into N. benthamiana genome and the mutation rate of such
long sequences have to be determined. Data storage in seeds goes beyond plant
genome manipulation for biotechnological research and plant breeding or simple
embedded ‘watermarks’ (Liss et al. 2012). It takes advantage of multi-cellular
organisms and serves for propagating the encoded information in daughter cells.
The host organism is able to grow and multiply with the embedded information and
every cell of the organism contains a copy of the encoded information. It avoids the
costs of producing multiple copies of the same encoded information synthetically,
which have been estimated at $12,400 per MB (Goldman et al. 2013).
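The mutation argument can be turned into a back-of-the-envelope calculation. The Python sketch below assumes that every site mutates independently at the Arabidopsis rate quoted above, which is itself only a stand-in for the unknown N. benthamiana rate. It uses 172 bp, the length of the Code-For/Code-Rev amplicon in Table 4.1, as a rough proxy for the insert length. The function names are illustrative.

```python
import math

RATE = 7e-9          # base substitutions per site per generation (Ossowski et al. 2010)
GENOME_SIZE = 3e9    # estimated N. benthamiana genome size in bp (Bombarely et al. 2012)


def expected_substitutions(rate: float, length_bp: float, generations: int = 1) -> float:
    """Expected number of substitutions in a stretch of DNA."""
    return rate * length_bp * generations


def prob_at_least_one(rate: float, length_bp: int, generations: int = 1) -> float:
    """P(at least one substitution) = 1 - (1 - rate) ** (length * generations)."""
    n_sites = length_bp * generations
    # log1p/expm1 keep precision when the rate is very small.
    return -math.expm1(n_sites * math.log1p(-rate))


if __name__ == "__main__":
    insert_bp = 172  # assumption: amplicon length used as a proxy for the insert
    print(expected_substitutions(RATE, GENOME_SIZE))  # genome-wide, per generation
    print(prob_at_least_one(RATE, insert_bp))         # within the insert, per generation
    print(prob_at_least_one(RATE, insert_bp, 100))    # over 100 seed generations
```

For a short insert the probability is close to rate × length × generations, so it grows roughly linearly with the amount of stored data, as the text cautions.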
Insertions of short computer programs within plants could also provide a detailed
description of given varieties, since a need for such labeling has already been
expressed. The incorporation of such information into a plant’s own DNA would
particularly help consumers in terms of satisfying the ever-growing demand for food
quality and origin information. It can also be used as an extremely useful tool for
variety protection (Fister et al. 2017).
In relation to manipulating and storing archives, our approach could provide leverage
for a new look at accessing, browsing and reading information, since hand-held,
single-molecule DNA sequencers are becoming available (Pennisi 2012) and
upgrading them to obtain an encoded sequence directly from a
leaf (Ljubič and Fister 2014) could be the next step.
4.5 Summary
This chapter presented our work on the utilization of a multi-cellular, eukaryotic
organism for storing valuable data. Our work describes a free copy-paste method
that avoids the costs of synthetic production of multiple copies of the same encoded
information, which is currently estimated to be $12,400 per MB for information
storage in naked DNA, with negligible additional computational costs. In contrast
to a naked DNA molecule, which can be affected by unfavorable environmental
conditions, DNA stored in a seed is protected against alterations and degradation
over time without the need for any active maintenance. Our approach demonstrates
that artificially encoded data can be stored and multiplied in plants without affecting
their vigor and fertility. The data are heritable by progeny and authentically reproducible,
while the reduced metabolism of seeds provides additional protection for
encoded DNA archives.

³ Global Crop Diversity Trust. https://www.croptrust.org/what-we-do/svalbard-global-seed-vault/. Accessed: 2016-07-26.
References
Agacka M, Depta A, Börner M, Doroszewska T, Hay FR, Börner A (2013) Viability of Nicotiana
spp. seeds stored under ambient temperature. Seed Sci Technol 41(3):474–478
Ailenberg M, Rotstein OD (2009) An improved Huffman coding method for archiving text, images,
and music characters in DNA. Biotechniques 47(3):747
Ajwani D, Malinger I, Meyer U, Toledo S (2008) Characterizing the performance of flash memory
storage devices and its impact on algorithm design. In: Proceedings of the 7th International
conference on experimental algorithms (WEA’08), Provincetown, pp 208–219
Anchordoquy TJ, Molina MC (2007) Preservation of DNA. Cell Preserv Technol 5(4):180–188
Ausländer S, Fussenegger M (2014) Dynamic genome engineering in living cells. Science
346(6211):813–814
Bombarely A, Rosli HG, Vrebalov J, Moffett P, Mueller LA, Martin GB (2012) A draft
genome sequence of Nicotiana benthamiana to enhance molecular plant-microbe biology
research. Mol Plant-Microbe Interact 25(12):1523–1530
Bonnet J, Colotte M, Coudy D, Couallier V, Portier J, Morin B, Tuffet S (2010) Chain and
conformation stability of solid-state DNA: implications for room temperature storage. Nucleic
Acids Res 38(5):1531–1546
Brand S (2000) Clock of the long now: time and responsibility. Basic Books, New York
Church GM, Gao Y, Kosuri S (2012) Next-generation digital information storage in DNA. Science
337(6102):1628–1628
Cisco (2016) The zettabyte era: trends and analysis. White paper, Cisco Systems, Inc. Available via
http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/vni-hyperconnectivity-wp.html.
Accessed 26 Nov 2016
Clelland CT, Risca V, Bancroft C (1999) Hiding messages in DNA microdots. Nature
399(6736):533–534
Clermont D, Santoni S, Saker S, Gomard M, Gardais E, Bizet C (2014) Assessment of DNA encapsulation,
a new room-temperature DNA storage method. Biopreserv Biobanking 12(3):176–183
Colotte M, Coudy D, Tuffet S, Bonnet J (2011) Adverse effect of air exposure on the stability of
DNA stored at room temperature. Biopreserv Biobanking 9(1):47–50
Cox JPL (2001) Long-term data storage in DNA. Trends Biotechnol 19(7):247–250
Davis J (1996) Microvenus. Art J 55(1):70–74
Faize M, Faize L, Burgos L (2010) Using quantitative real-time PCR to detect chimeras in
transgenic tobacco and apricot and to monitor their dissociation. BMC Biotechnol 10(1):53
Farzadfard F, Lu TK (2014) Genomically encoded analog memory with precise in vivo DNA
writing in living cell populations. Science 346(6211):1256272
Fister K, Fister I, Murovec J, Bohanec B (2017) DNA labelling of varieties covered by patent
protection: a new solution for managing intellectual property rights in the seed industry.
Transgenic Res 26(1):87–95
Gant JF, Reinsel D, Chute C, Schlichting W, McArthur J, Minton S, Xheneti I, Toncheva A,
Manfrediz A (2007) The expanding digital universe. White paper, International Data Corporation. Available via
https://web.archive.org/web/20130310100607/http://www.emc.com/collateral/analyst-reports/expanding-digital-idc-white-paper.pdf.
Accessed 26 Nov 2016
Gibson DG, Glass JI, Lartigue C, Noskov VN, Chuang RY, Algire MA, Benders GA, Montague
MG, Ma L, Moodie MM (2010) Creation of a bacterial cell controlled by a chemically
synthesized genome. Science 329(5987):52–56
Goldman N, Bertone P, Chen S, Dessimoz C, LeProust EM, Sipos B, Birney E (2013) Towards
practical, high-capacity, low-maintenance information storage in synthesized DNA. Nature
494(7435):77–80
Goodin MM, Zaitlin D, Naidu RA, Lommel SA (2008) Nicotiana benthamiana: its history and
future as a model for plant-pathogen interactions. Mol Plant-Microbe Interact 21(8):1015–1026
Grass RN, Heckel R, Puddu M, Paunescu D, Stark WJ (2015) Robust chemical preservation
of digital information on DNA in silica with error-correcting codes. Angew Chem Int Ed
54(8):2552–2555
Hilbert M, López P (2011) The world’s technological capacity to store, communicate, and compute
information. Science 332(6025):60–65
Ivanova NV, Kuzmina ML (2013) Protocols for dry DNA storage and shipment at room
temperature. Mol Ecol Resour 13(5):890–898
Langtangen HP (2006) Python scripting for computational science, 3rd edn. Springer, Berlin
Liss M, Daubert D, Brunner K, Kliche K, Hammes U, Leiherer A, Wagner R (2012) Embedding
permanent watermarks in synthetic genes. PLoS One 7(8):e42465
Liu X, Li Q, Wang X, Zhou X, He X, Liao Q, Zhu F, Cheng L, Zhang Y (2015) Evaluation
of DNA/RNAshells for room temperature nucleic acids storage. Biopreserv Biobanking
13(1):49–55
Ljubič K, Fister I Jr (2014) How to store Wikipedia into a forest tree: initial idea. In: Proceedings
of the first International conference on multimedia, scientific information and visualization for
information systems and metrics (MSIVISM’14), pp 45–52
MacKay DJC (2003) Information theory, inference and learning algorithms. Cambridge University
Press, New York
Marx V (2013) Biology: the big challenges of big data. Nature 498(7453):255–260
Ossowski S, Schneeberger K, Lucas-Lledó JI, Warthmann N, Clark RM, Shaw RG, Weigel D,
Lynch M (2010) The rate and molecular spectrum of spontaneous mutations in Arabidopsis
thaliana. Science 327(5961):92–94
Özgen M, Özdilek A, Birsin MA, Önde S, Şahin D, Açıkgöz E, Kaya Z (2012) Analysis of ancient
DNA from in vitro grown tissues of 1600-year-old seeds revealed the species as Anagyris
foetida. Seed Sci Res 22(4):279–286
Pennisi E (2012) Search for pore-fection. Science 336(6081):534–537
Susič N, Bohanec B, Murovec J (2014) Agrobacterium tumefaciens-mediated transformation of
bush monkey-flower (Mimulus aurantiacus Curtis) with a new reporter gene ZsGreen. Plant
Cell Tissue Organ Cult 116(2):243–251
The Economist (2012) Digital archiving: history flushed. The Economist. Available via
http://www.economist.com/node/21553410. Accessed 26 July 2016
Weng H, Pan A, Yang L, Zhang C, Liu Z, Zhang D (2004) Estimating number of transgene copies
in transgenic rapeseed by real-time PCR assay with HMG I/Y as an endogenous reference gene.
Plant Mol Biol Report 22(3):289–300
Yashina S, Gubin S, Maksimovich S, Yashina A, Gakhova E, Gilichinsky D (2012) Regeneration
of whole fertile plants from 30,000-year-old fruit tissue buried in Siberian permafrost. Proc
Natl Acad Sci USA 109(10):4008–4013