Chapter 4
The Potential of Plants and Seeds in DNA-Based
Information Storage
Karin Fister, Iztok Fister Jr., and Jana Murovec
Abstract New approaches to data archiving are required because digital information
production is growing constantly and no high-capacity, low-maintenance storage
medium is available. High-density information encoding and longevity are the two
important advantages that have recently made DNA an attractive target for information
storage. However, creating new copies of the same encoded information by producing
new, artificial DNA sequences is not financially viable. Moreover, a naked DNA
molecule can be greatly affected by environmental influences, resulting in DNA
mutations and changes in the stored information. Our approach demonstrates the
great potential of plants and seeds in circumventing these drawbacks. It shows that
artificially encoded data can be stored and multiplied within plants.
4.1 Introduction
Data storage is relevant for keeping track of our history and for accomplishing
tasks during our day-to-day lives. It has evolved significantly from the first written
records of the ancient Sumerians and Egyptians into our time, where it is embedded
in a rapidly expanding data production environment. Stones were replaced by
paper, which is being replaced progressively by electronic storage media. Compared
to printed data, the latter are characterized by relatively small physical space
requirements and by the ease of copying digital data. However, there are some major
drawbacks with current storage technologies. The first drawback is their limited
capacity. For instance, at the time of this writing, the world’s highest capacity
K. Fister
Faculty of Medicine, University of Maribor, Taborska 8, 2000, Maribor, Slovenia
e-mail: karin.ljubic@student.um.si
I. Fister Jr.
Faculty of Electrical Engineering and Computer Science, University of Maribor,
Smetanova 17, 2000, Maribor, Slovenia
e-mail: iztok.fister1@um.si
J. Murovec
Biotechnical Faculty, University of Ljubljana, Jamnikarjeva 101, 1000, Ljubljana, Slovenia
e-mail: jana.murovec@bf.uni-lj.si
© Springer International Publishing AG 2017
A.J. Schuster (eds.), Understanding Information, Advanced Information
and Knowledge Processing, DOI 10.1007/978-3-319-59090-5_4
helium drive can only save up to 10 terabytes (TB) of data. Although the storage
capacity of all data storage equipment grew from less than 3 (optimally compressed)
exabytes in 1986 to several zettabytes in 2013 and is doubling roughly every 3 years,
the International Data Corporation has mentioned that, even in 2007, the total
amount of digital data produced on the planet exceeded the amount of available
storage (Hilbert and López 2011; Cisco 2016; Gant et al. 2007). The problem of
data explosion is also evident in the field of modern biology. Biological
data are heterogeneous and stem from a wide range of experiments. Getting the
most from the data requires comparing them to the relevant prior knowledge. That
means scientists have to store large data sets and analyse them. The European
Bioinformatics Institute (EBI) in the UK, one of the largest biology data repositories
in the world, currently stores 20 petabytes of data (Marx 2013). Efforts to produce
enough storage are resulting in increased production of storage devices and the
building of large data centers, all of which increase environmental contamination
and raise the costs of their maintenance. Another major disadvantage of electronic
storage media is their short lifespan (usually in years), which depends on the
frequency of access (Ajwani et al. 2008).
Due to the constant increase of digital data production and the above-mentioned
concerns, new approaches for data storage are being sought. DNA has proven to
be useful for archival storage because it offers major improvements over
digital storage media, namely information density, stability when stored under
optimal conditions, and small physical size. The first message stored in DNA dates back
to 1988 (Davis 1996). More recently, in the past 4 years, there have been two
major breakthroughs in the field of DNA-based information storage (Church et al.
2012; Goldman et al. 2013). First of all, novel next-generation DNA synthesis
and sequencing technologies have expanded the boundaries of previous DNA
production approaches, which were able to encode and decode only trivial amounts
of information (Davis 1996) and, in addition, could not be scaled up
(Clelland et al. 1999). The second advance was achieved with the use of the
Huffman code (Brand 2000; MacKay 2003; Ailenberg and Rotstein 2009) as a
compression method that maps large numbers of bytes onto a smaller number of DNA
bases. This approach has proven to be the most accurate method and could be scaled
beyond the boundaries of current archiving methods (Church et al. 2012; Goldman
et al. 2013; Ailenberg and Rotstein 2009). So far, sets of computer files encoded in
DNA have been 739 kB (Goldman et al. 2013), 675 kB (Church et al. 2012), and
83 kB (Grass et al. 2015). Encouraged by these achievements, it has been proposed
that DNA-based storage might already be economically viable for archives with no
extensive access, such as historical and government records or large-scale science
projects that generate massive amounts of data (Brand 2000; The Economist 2012).
The stability of DNA is highly dependent on the storage conditions, which should
provide constant low temperatures, as in freezers, and protection from atmospheric
water, oxygen and ozone. It has been demonstrated that, at room temperature,
solid-state DNA degradation through depurination, base deamination, and base or
sugar oxidation, is affected greatly by water and oxygen (Bonnet et al. 2010), thus
dictating the need for special equipment or preservation procedures. The problem
is even more pronounced because laboratory plastic ware is neither moisture-tight nor
airtight and storage in refrigerators is not always possible. DNA shells (Colotte
et al. 2011; Clermont et al. 2014; Liu et al. 2015) have been presented recently
as an alternative for DNA storage at room temperature. Although they provide an
alternative approach, the technology assumes an anoxic and anhydrous atmosphere
in small glass vials fitted into stainless-steel, laser-sealed mini-capsules, all of which
boost storage costs. The same problem is encountered when using dry-state DNA
stabilization systems, such as commercial Biomatrica DNA stable plates, trehalose
and polyvinyl alcohol (PVA) plates (Ivanova and Kuzmina 2013) or inorganic silica
capsules (Grass et al. 2015).
This chapter addresses several of the issues mentioned before. In its essence,
the chapter describes a novel approach for DNA-based data storage that does not
focus on information quantity but rather on a new storage medium that combines
DNA stability and, consequently, information preservation, with low costs for its
conservation and multiplication. We chose a living plant, the widely known model
plant Nicotiana benthamiana, to be the target multi-cellular, eukaryotic organism
for digital information hosting. Reasons for choosing this particular plant include
the plant’s short generation time, its high seed yield and its ease of growing under
natural and controlled environments (Goodin et al. 2008). Further, we selected the
well-known ‘Hello world!’ computer program (Langtangen 2006) in a high-level,
universally used programming language, in our work the Python programming
language (see footnote 1), to be encoded and stored in the plant. In order to provide the reader with
a detailed understanding of our work, we organized the remainder of this chapter as
follows. Section 4.2 describes the process of storing digital data into plants in detail.
We start by describing the coding program that was developed in order to transform
the digital information into the DNA sequence. Next, we describe the synthesis of
our artificial ‘Code DNA’ by Integrated DNA Technologies and the process of plant
transformation by co-cultivation with Agrobacterium tumefaciens containing the
binary plasmid. We conclude the section with a description of the screening process
for detecting the presence of our Code DNA in the plant. Section 4.3 describes the
results of our experiment. In Sect. 4.4 we discuss the advantages of storing data in
plants and their seeds. Section 4.5 ends the chapter with a summary.
4.2 Materials and Methods
The aim in this section is to provide the reader with a detailed description of our
work. The section starts with DNA basics, a description of the basic structure of
DNA molecules, as well as the process of transferring information from a mother
cell to its two daughter cells. Knowing these basic ideas is crucial for understanding
the backbone of our experiment. Next, we describe the coding program. The coding
1 Python. https://www.python.org/. Accessed: 2016-06-30.
program was developed to transform our digital data in the form of bits into the
sequence of DNA nucleotides. By using the coding program we transformed a
computer program into a sequence of nucleotides. We named this artificial sequence
the Code DNA. The section describes the synthesis of this Code DNA in detail. We
discuss, briefly, the plant material and focus on the process of plant transformation
with Agrobacterium tumefaciens containing the binary plasmid. The end of the
section focuses on the process of extraction of our Code DNA from the leaf tissue.
To begin, Fig. 4.1 presents the key steps of storing data in plants and obtaining
data from them.
4.2.1 DNA Basics
A DNA molecule consists of two polynucleotide chains: DNA chains or DNA
strands. Each chain consists of four types of nucleotide subunits and each nucleotide
is composed of a five-carbon sugar to which are attached one or more phosphate
groups and a nitrogen-containing base. In DNA nucleotides, the sugar is deoxyribose
attached to a single phosphate group. Therefore, the three basic parts of
nucleotides are sugar, phosphate group and base. The base may be either Adenine
(A), Cytosine (C), Guanine (G) or Thymine (T). Figure 4.2 presents the structure of
Fig. 4.1 Flow chart illustrating the key processes involved in storing data in plants (top) and
obtaining data from them (bottom)
Fig. 4.2 Elementary nitrogenous bases (adenine, guanine, cytosine, thymine) in the nucleic acid
of DNA, usually represented by the letters A, G, C and T
all four nitrogenous bases within DNA nucleotides. The backbone of the DNA chain
consists of covalently linked sugars and phosphate groups in alternating fashion.
The two chains are held together by hydrogen bonds between the base portion of
the nucleotides. The double-helix or the three-dimensional structure of DNA arises
from the chemical and structural features of its two chains. The shapes and chemical
structure of the bases allow pairing with hydrogen bonds only between A and T and
between C and G. This pairing is referred to as Watson-Crick base pairing. Because
of these base-pairing requirements each strand of the DNA molecule is exactly
complementary (called Watson-Crick complementarity) to the nucleotide sequence
of its partner strand. Therefore, we can predict the sequence of the second strand
by knowing the sequence of the first strand. The sequence of nucleotides of one or
the other strand carries biological information or, in the case of our experiment,
artificial information that must be copied accurately for transmission to the
next generation each time a cell divides. At each cell division, the double helix
unwinds, allowing pairing with new, complementary bases. Each DNA strand
serves as a template for its own duplication. The ability of each strand of a DNA
molecule to act as a template for producing a complementary strand enables a cell
to copy its genetic information before passing it on to new cells. Therefore, the
artificially inserted DNA sequence is copied too, and the digital information that
this artificial sequence represents is carried to every cell of a plant and to all of its
seeds and progeny.
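The complementarity and templating described in this section can be illustrated with a minimal Python sketch. It is not part of the software used in this work; the function names are illustrative, and strands are written position by position, so the partner strand returned by `complement` runs in the opposite chemical direction.

```python
# Watson-Crick pairing rules: A pairs with T, C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the partner strand, base by base (same left-to-right positions)."""
    return "".join(PAIR[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the partner strand written in its own 5'-to-3' direction."""
    return complement(strand)[::-1]


def replicate(duplex: tuple[str, str]) -> list[tuple[str, str]]:
    """Use each strand of a duplex as the template for a new partner strand."""
    top, bottom = duplex
    return [(top, complement(top)), (complement(bottom), bottom)]


# A short, arbitrary example sequence and its partner strand.
parent = ("GACAGCCG", complement("GACAGCCG"))
daughters = replicate(parent)
print(parent)
print(daughters)
```

Both daughter duplexes are identical to the parent duplex, which is the property that lets an inserted sequence reach every cell of the plant.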
4.2.2 Coding Program
We developed a coding program that first translates text to binary and then maps
the binary string to DNA bases. The whole coding program is available online
(see footnote 2) and enables coding text into DNA sequences and
decoding DNA sequences into text. Text to be encoded into DNA is limited to
300 characters, while a DNA sequence to be decoded back to text is limited to
1,200 bases. Currently, the program
enables coding letters from A to Z and a to z (English alphabet), numbers from
0 to 9 and the special characters hashtag (#) and apostrophe (’). The program offers
two coding options: ‘Classic’ and ‘Compressed1’. The Classic option, which uses
2 bits for coding a base, is as follows: 00 for A, 10 for C, 11 for T and 01 for
G. This encoding scheme helps to avoid sequences that are difficult to
synthesize, sequences with long repeats and sequences with extreme GC content.
The Compressed1 option is upgraded by using the Huffman compression method.
This method reduces the overall number of bits used to encode a string of symbols
inserted in the coding window. The Huffman compression method allows up to
60% higher compression than the Classic option. The percentage depends on the
2 Plant-based data storage project. http://www.storing-data-into-living-plant.net/. Accessed: 2016-07-30.
length of the inserted text. When using the Compressed1 option, users are given a
‘Key’, which has to be used in order to decode the DNA sequence back to text. This
Key links the user to the specific Huffman tree that was used for compression.
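The Classic option can be sketched in a few lines of Python. This is an illustration only, not the code of the online program: the bit-to-base table is the one quoted above, whereas the use of 8-bit ASCII for the text-to-binary step is an assumption (it is at least consistent with the stated limits, since 300 characters × 8 bits ÷ 2 bits per base gives 1,200 bases). The function names are hypothetical.

```python
# Classic mapping quoted in the text: 00 -> A, 10 -> C, 11 -> T, 01 -> G.
BITS_TO_BASE = {"00": "A", "10": "C", "11": "T", "01": "G"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def text_to_dna(text: str) -> str:
    """Encode text as DNA at two bits per base (assumes 8-bit ASCII characters)."""
    bits = "".join(format(byte, "08b") for byte in text.encode("ascii"))
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_text(dna: str) -> str:
    """Decode a DNA string produced by text_to_dna back into text."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("ascii")


message = "Hi"
encoded = text_to_dna(message)
print(encoded)               # four bases per character
print(dna_to_text(encoded))  # round trip back to the original text
```

Under these assumptions every character costs four bases, so the encoded length grows linearly with the text length.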
4.2.3 Code DNA Synthesis and Cloning
The ‘Hello world!’ computer program was structured and written in the form
of the syntax #begin print ‘Hello world’ #end. The syntax was coded using the
Classic option of our coding program into the Code DNA. Primer annealing
sequences were added upstream and downstream of the Code DNA for subsequent
sequencing reactions. The Code DNA was synthesized by Integrated DNA Technologies
(IDT, Leuven, Belgium) and cloned into the multiple cloning site (MCS) of a linearized plasmid
vector pCAMBIA 1302-ZsGreen (Susič et al. 2014) using a Gibson Assembly
Cloning Kit following the manufacturer’s instructions (New England Biolabs,
Ipswich, MA, USA). The binary plasmid pCAMBIA 1302-ZsGreen-Code contained
a hygromycin phosphotransferase (hptII) selectable marker gene and the ZsGreen
reporter gene, both driven by the cauliflower mosaic virus 35S promoter. The
binary plasmid was electroporated into ElectroMAX Agrobacterium tumefaciens
LBA 4404 (Invitrogen).
4.2.4 Plant Material
Here we describe the preparation of plant material for transformation. Seeds of
Nicotiana benthamiana were surface sterilized and germinated in petri dishes on
solid medium containing 2 g L⁻¹ sucrose (Duchefa Biochemie B. V.) and 8 g L⁻¹
Daishin agar (Duchefa Biochemie B. V.). Plant seedlings were sub-cultivated
every four weeks on fresh medium containing Murashige and Skoog macro- and
micro-elements (MS; Duchefa Biochemie B. V.), 2 mg L⁻¹ thiamine HCl (Sigma),
1 mg L⁻¹ pyridoxine HCl (Sigma), 1 mg L⁻¹ nicotinic acid (Sigma), 30 g L⁻¹
sucrose and 8 g L⁻¹ Daishin agar. All media were adjusted to pH 5.8 before
autoclaving and plant tissue cultures were maintained at 23 ± 1 °C and a 16-h
photoperiod.
4.2.5 Plant Transformation
Agrobacterium tumefaciens containing the binary plasmid pCAMBIA 1302-ZsGreen-Code
was grown overnight at 28 °C with shaking in liquid YEB medium, pH 7.0,
and prepared for co-cultivation as described in Susič et al. (2014). Explants
were immersed in bacterial suspension for 15 min with periodic shaking. Inoculated
Fig. 4.3 Main steps in the plant transformation process. The starting-point in the process is the
gene or synthetic DNA to be transferred. The end-product is a genetically modified plant
explants were blotted on sterile filter paper and transferred to regeneration
medium composed of MS macro- and micro-elements, 0.1 mg L⁻¹ Fe-Na2-EDTA
(Sigma), 0.1 g L⁻¹ myo-inositol (Sigma), 0.1 mg L⁻¹ thiamine HCl, 1 mg L⁻¹
6-benzylaminopurine (6-BAP) (Duchefa Biochemie B. V.), 0.1 mg L⁻¹ α-naphthaleneacetic
acid (NAA) (Duchefa Biochemie B. V.), 30 g L⁻¹ sucrose, 200 µM
acetosyringone (AS), 8 g L⁻¹ Daishin agar, pH 5.8. Cultures were incubated in
the dark for four days and then washed in a 200 mg L⁻¹ solution of timentin (Duchefa
Biochemie B. V.), blot-dried on filter paper and transferred to petri dishes containing
regeneration medium without AS but supplemented with 150 mg L⁻¹ of timentin
and 10 mg L⁻¹ hygromycin (Duchefa Biochemie B. V.). Regenerating shoots
were transferred to fresh cultivation medium composed of MS macro- and
micro-elements, 2 mg L⁻¹ thiamine HCl, 1 mg L⁻¹ pyridoxine HCl, 1 mg L⁻¹ nicotinic
acid, 30 g L⁻¹ sucrose, 150 mg L⁻¹ of timentin and 8 g L⁻¹ Daishin agar, pH 5.8.
Figure 4.3 illustrates the main steps of plant transformation.
4.2.6 DNA Isolation and PCR Analysis
Total genomic DNA was extracted from the leaf tissue of plants regenerated on
the selective medium by a modified cetyl trimethylammonium bromide (CTAB)
method. Sequences of primer pairs used in polymerase chain reactions (PCR) for
screening for the presence of the Code DNA and for the presence of selectable
Table 4.1 Sequences of primers used and lengths of amplified fragments (in bp)
Primer name         Primer sequence 5′–3′              Amplified fragment length (bp)
Code-For            GCA ATG AGC GGT AGG AGT G          172
Code-Rev            ACG GTC AGC ATG TGA CAG TC
HptII-For           ATG ACC GCT GTT ATG CGG CCA TTG    641
HptII-Rev           AAA AAG CCT GAA CTC ACC GCG ACG
ZsGreen-For         AGA ACT CGT GTC CTG CTG GT         208
ZsGreen-Rev         ATG ATC TTC TCG CAG GAT GG
β-actin-qPCR-For    CTG GCA TTG CAG ATC GTA TGA        75
β-actin-qPCR-Rev    GCG CCA CCA CCT TGA TCT T
Code-qPCR-75-For    TCG CAA ATG AGC GGT AGG A          75
Code-qPCR-75-Rev    TTC ACG AGC CGG CGT ACT
and reporter genes, together with the lengths of amplified fragments, are listed
in Table 4.1. PCR reactions were performed according to Susič et al. (2014).
PCR is a technique used to amplify a precisely defined
piece of DNA across several orders of magnitude. It therefore generates
thousands to millions of copies of a particular DNA sequence, which makes its
detection and sequencing much easier. Quantitative PCR (qPCR) is a form of PCR
that additionally allows the quantity of a target sequence in a sample to be
determined. Plantlets with positive amplification
results were analyzed further by real-time quantitative PCR on an ABI PRISM 7500
Fast Sequence Detection System with 7500 Software v2.3 (Applied Biosystems,
Foster City, USA). The primers used (see Table 4.1) were designed to amplify
a 75-bp section of the Code DNA and a 75-bp section of the tobacco β-actin
gene (Faize et al. 2010), which was used as the internal reference gene. Each
10 µl reaction was composed of 5 µl FastStart Universal SYBR Green Master (Rox)
(Roche, Basel, Switzerland), 9.5–0.15 ng of DNA and 600 nM of each primer.
Amplification was performed under the following thermal cycling conditions: 95 °C
for 10 min, followed by 40 cycles of 95 °C for 10 s and 60 °C for 30 s. Each reaction was
run in triplicate (technical replications), and PCR amplification specificity was
confirmed by melting-curve analysis and by agarose electrophoresis. The transgene
copy number was calculated according to the method developed by Weng et al.
(2004).
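The statement that PCR amplifies a target across several orders of magnitude follows from simple arithmetic, sketched below in Python. This is an idealized model and not the copy-number method of Weng et al. (2004): it assumes a constant per-cycle efficiency (1.0 means perfect doubling) and ignores the plateau phase. The function names and default values are illustrative.

```python
import math


def copies_after(cycles: int, initial_copies: float = 1.0,
                 efficiency: float = 1.0) -> float:
    """Idealized copy number after a given number of PCR cycles."""
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_to_reach(target_copies: float, initial_copies: float = 1.0,
                    efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles needed to reach target_copies."""
    ratio = target_copies / initial_copies
    return math.ceil(math.log(ratio) / math.log(1.0 + efficiency))


for n in (10, 20, 30, 40):
    print(n, "cycles:", f"{copies_after(n):.3g}", "copies per starting molecule")

print("cycles needed for a million-fold amplification:", cycles_to_reach(1e6))
```

With perfect doubling, about 10 cycles give a thousand-fold and about 20 cycles a million-fold increase, which is why a twofold difference in starting template shifts the qPCR signal by roughly one cycle.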
4.2.7 Sanger Sequencing
Plant DNA was first amplified with the primer pair Code-For and Code-Rev using a
standard PCR protocol as described in Susič et al. (2014). Unused primers and
nucleotides were removed from PCR amplification products with ExoSap-IT and
the sequencing reaction was performed separately for each primer with a BigDye®
Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems), both following the
manufacturer’s instructions. Amplified products were separated by capillary
electrophoresis using an ABI PRISM® 3100 Genetic Analyzer (Applied Biosystems)
and the results analyzed with CodonCode Aligner 4.0.4. The sequence obtained
was decoded with the program described above.
4.3 Results
This section provides the results of our work. The main outcome, perhaps, is that
inserted Code DNA was obtained successfully from the leaf of a plant.
4.3.1 Coding Program
The complete program, which enables coding text into DNA sequences
and decoding DNA sequences into text, is available on the Internet (see footnote 2).
The syntax of the ‘Hello world!’ program was coded using the Classic option of the
program, from which the Code DNA was obtained.
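The Classic option was used here. As a rough illustration of what a Huffman-based option such as Compressed1 (Sect. 4.2.2) could save on the same syntax, the following Python sketch builds a Huffman tree for the message itself and compares bit counts. Several points are assumptions rather than details of the online program: the Classic baseline is taken as 8 bits per character, the tree is built per message, plain ASCII apostrophes are used, and the cost of communicating the tree (the ‘Key’) is ignored.

```python
import heapq
from collections import Counter

SYNTAX = "#begin print 'Hello world' #end"  # plain ASCII apostrophes assumed


def huffman_code_lengths(text: str) -> dict[str, int]:
    """Return the Huffman code length (in bits) of each symbol in a non-empty text."""
    counts = Counter(text)
    if len(counts) == 1:
        return {symbol: 1 for symbol in counts}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth in subtree}).
    heap = [(weight, order, {symbol: 0})
            for order, (symbol, weight) in enumerate(counts.items())]
    heapq.heapify(heap)
    order = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**left, **right}.items()}
        heapq.heappush(heap, (w1 + w2, order, merged))
        order += 1
    return heap[0][2]


def huffman_bits(text: str) -> int:
    """Total number of bits needed to encode text with its own Huffman code."""
    lengths = huffman_code_lengths(text)
    return sum(lengths[symbol] for symbol in text)


classic_bits = 8 * len(SYNTAX)
compressed_bits = huffman_bits(SYNTAX)
print("Classic:", classic_bits, "bits =", classic_bits // 2, "bases")
print("Huffman:", compressed_bits, "bits =", -(-compressed_bits // 2), "bases")
```

The saving obtained this way depends strongly on the message, in line with the remark that the compression percentage depends on the length of the inserted text.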
4.3.2 Storing Data in N. benthamiana and Reading Data
from the Plant
We obtained several N. benthamiana plants and seeds (see Fig. 4.4) with normal
phenotypes and growth, which were PCR positive for the Code DNA, the selectable
marker gene and the reporter gene. They were analyzed further by quantitative
Fig. 4.4 Nicotiana
benthamiana plant, and seeds
of Nicotiana benthamiana
with incorporated Code DNA
real-time PCR in order to determine the copy number of the inserted Code DNA
and only transgenic plants (T0) containing one copy (i.e., hemizygous for the Code
DNA) were left for self-pollination.
After germination of their seeds, the T1 progeny were analyzed for the presence
of the Code DNA in their genome, and a 1:3 segregation of PCR-negative to
PCR-positive plants was observed, thus confirming the hemizygosity of their mother
plants. Transgenic lines (T1) containing two copies of the Code DNA (i.e., homozygous
for the Code DNA) were selected and left to self-pollinate for further storage of their
seeds. Some of the seeds were germinated and the plants (T2) grew normally. The DNA
of all checked T2 plants contained the Code DNA and was sequenced as described in
Sect. 4.2.7 (Sanger sequencing). The obtained DNA sequence was decoded, resulting in
the syntax of the ‘Hello world!’ program being shown on a display device.
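The segregation expected after self-pollination of a hemizygous T0 plant can be enumerated directly. The Python sketch below assumes a single insertion locus that is inherited in a Mendelian fashion; "T" marks a chromosome carrying the Code DNA and "-" one without it. The names are illustrative.

```python
from collections import Counter
from itertools import product


def self_pollinate(parent: tuple[str, str] = ("T", "-")) -> Counter:
    """Count offspring classes by number of Code DNA copies (0, 1 or 2)."""
    offspring = Counter()
    # Each parent genotype contributes one of its two alleles via pollen and egg.
    for pollen, egg in product(parent, repeat=2):
        copies = (pollen == "T") + (egg == "T")
        offspring[copies] += 1
    return offspring


t1 = self_pollinate(("T", "-"))  # hemizygous T0 parent
pcr_positive = t1[1] + t1[2]
print("T1 classes (copies: count):", dict(t1))
print("PCR-negative : PCR-positive =", t1[0], ":", pcr_positive)

t2 = self_pollinate(("T", "T"))  # homozygous T1 parent
print("T2 classes (copies: count):", dict(t2))
```

Under this assumption a quarter of the T1 plants carry two copies, and all progeny of such a homozygous plant carry the Code DNA, which matches the selection strategy described above.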
4.4 Discussion
DNA-based storage of data has been proposed as a superior replacement for
electronic storage devices, due to its durability and low space requirements (Cox
2001; Church et al. 2012; Goldman et al. 2013; Grass et al. 2015). However,
since artificial DNA (Grass et al. 2015; Colotte et al. 2011; Clermont et al. 2014;
Liu et al. 2015; Ivanova and Kuzmina 2013; Anchordoquy and Molina 2007)
and microorganisms (Gibson et al. 2010; Farzadfard and Lu 2014; Ausländer and
Fussenegger 2014) require specific pretreatments and equipment for their storage,
an alternative approach, i.e., storing data in seeds, is presented here. Seeds are one
of the oldest storage media on Earth and they preserve genetic information for
thousands of years. Due to their stability and longevity, they are the most often
used material for plant genetic resource preservation in more than 1,750
genebanks worldwide. They are already guardians of our natural and cultural heritage and, with
the implementation of our proof-of-concept study presented in this work, their role
could be even more pronounced.
Storing data in plant seeds is a simple, safe and economical solution for data
storage, since seeds do not need special equipment for storage because they
possess a wide range of natural mechanisms of protection and are easy to grow.
Seeds have already proved their durability over thousands of years. Examples
demonstrating this durability include the 1,600-year-old seeds of Anagyris foetida,
a relict species endemic to the Mediterranean region, which were germinated
successfully (Özgen et al. 2012), or the most ancient viable multicellular plants on
Earth, of the species Silene stenophylla Ledeb. (Caryophyllaceae), which have been
regenerated (Yashina et al. 2012) from placental tissue fragments approximately
31,800 years old. Seeds of Nicotiana spp. are known to preserve their germination ability
for up to 10 years under ambient temperatures and/or relative humidity (Agacka
et al. 2013), while long-term depositories such as the Svalbard Global Seed Vault
(see footnote 3) can protect them even from massive natural or man-made cataclysms. Under a
controlled environment with reduced temperature and relative humidity, the need for
seed regeneration would be minimal. Taking into account the estimated spontaneous
mutation rate for Arabidopsis thaliana of 7 × 10⁻⁹ base substitutions per site per
generation (Ossowski et al. 2010) and the estimated N. benthamiana genome size
of 3 Gb (Bombarely et al. 2012), there is a negligible chance of a mutation in the
encoded DNA. However, by increasing the length of DNA insertions (i.e. amount
of stored data) the chances of unwanted mutations also increase. Therefore, before
implementation of our proof of concept, the maximum length of heterologous DNA
that can be introduced into N. benthamiana genome and the mutation rate of such
long sequences have to be determined. Data storage in seeds goes beyond plant
genome manipulation for biotechnological research and plant breeding or simple
embedded ‘watermarks’ (Liss et al. 2012). It takes advantage of multi-cellular
organisms and serves for propagating the encoded information in daughter cells.
The host organism is able to grow and multiply with the embedded information and
every cell of the organism contains a copy of the encoded information. It avoids the
costs of producing multiple copies of the same encoding information synthetically,
which has been estimated to be $12,400 per MB (Goldman et al. 2013).
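The mutation argument made earlier in this paragraph can be put into numbers with a short Python sketch. It rests on several assumptions: the A. thaliana substitution rate is applied to N. benthamiana (as in the text), sites and generations are treated as independent, and the insert length of 1,200 bases is a hypothetical value taken from the decoding limit of the coding program, not the measured length of the Code DNA.

```python
import math

MUTATION_RATE = 7e-9   # substitutions per site per generation (Ossowski et al. 2010)
GENOME_SIZE = 3e9      # estimated N. benthamiana genome size in bases
INSERT_LENGTH = 1200   # hypothetical insert length in bases (assumption)


def p_insert_mutated(insert_length: int, generations: int = 1,
                     rate: float = MUTATION_RATE) -> float:
    """Probability of at least one substitution in the insert.

    Computes 1 - (1 - rate) ** (insert_length * generations) in a
    numerically stable way.
    """
    sites = insert_length * generations
    return -math.expm1(sites * math.log1p(-rate))


def expected_genome_substitutions(genome_size: float = GENOME_SIZE,
                                  rate: float = MUTATION_RATE) -> float:
    """Expected number of substitutions per genome copy per generation."""
    return genome_size * rate


print("genome-wide substitutions per generation:", expected_genome_substitutions())
for generations in (1, 10, 100):
    p = p_insert_mutated(INSERT_LENGTH, generations)
    print(f"{generations:>3} generation(s): P(insert mutated) = {p:.2e}")
```

The probability grows almost linearly with both insert length and number of generations, which is why longer inserts call for the mutation-rate measurements mentioned above.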
Insertions of short computer programs within plants could also provide a detailed
description of given varieties, since a need for such labeling has already been
expressed. The incorporation of such information into a plant’s own DNA would
particularly help consumers in terms of satisfying the ever-growing demand for food
quality and origin information. It can also be used as an extremely useful tool for
variety protection (Fister et al. 2017).
With regard to handling and storing archives, our approach could open up
a new way of accessing, browsing and reading information, since hand-held,
single-molecule DNA sequencers are becoming available (Pennisi 2012), and
upgrading them so that they can read an encoded sequence directly from a
leaf (Ljubič and Fister 2014) could be the next step.
4.5 Summary
This chapter presented our work on the utilization of a multi-cellular, eukaryotic
organism for storing valuable data. Our work describes a free copy-paste method
that avoids the costs of synthetic production of multiple copies of the same encoding
information, which is currently estimated to be $12,400 per MB for information
3 Global Crop Diversity Trust. https://www.croptrust.org/what-we-do/svalbard-global-seed-vault/. Accessed: 2016-07-26.
storage in naked DNA with negligible additional computational costs. In contrast
to a naked DNA molecule, which can be affected by unfavorable environmental
conditions, DNA stored in a seed is protected against alterations and degradation
over time without the need of any active maintenance. Our approach demonstrates
that artificially encoded data can be stored and multiplied in plants without affecting
their vigor and fertility. The encoded information is inherited by the progeny and
can be reproduced faithfully, while the reduced metabolism of seeds provides
additional protection for encoded DNA archives.
References
Agacka M, Depta A, Börner M, Doroszewska T, Hay FR, Börner A (2013) Viability of Nicotiana
spp. seeds stored under ambient temperature. Seed Sci Technol 41(3):474–478
Ailenberg M, Rotstein OD (2009) An improved Huffman coding method for archiving text, images,
and music characters in DNA. Biotechniques 47(3):747
Ajwani D, Malinger I, Meyer U, Toledo S (2008) Characterizing the performance of flash memory
storage devices and its impact on algorithm design. In: Proceedings of the 7th International
conference on experimental algorithms (WEA’08), Provincetown, pp 208–219
Anchordoquy TJ, Molina MC (2007) Preservation of DNA. Cell Preserv Technol 5(4):180–188
Ausländer S, Fussenegger M (2014) Dynamic genome engineering in living cells. Science
346(6211):813–814
Bombarely A, Rosli HG, Vrebalov J, Moffett P, Mueller LA, Martin GB (2012) A draft
genome sequence of Nicotiana benthamiana to enhance molecular plant-microbe biology
research. Mol Plant-Microbe Interact 25(12):1523–1530
Bonnet J, Colotte M, Coudy D, Couallier V, Portier J, Morin B, Tuffet S (2010) Chain and
conformation stability of solid-state DNA: implications for room temperature storage. Nucleic
Acids Res 38(5):1531–1546
Brand S (2000) Clock of the long now: time and responsibility. Basic Books, New York
Church GM, Gao Y, Kosuri S (2012) Next-generation digital information storage in DNA. Science
337(6102):1628–1628
Cisco (2016) The zettabyte era: trends and analysis. White paper, Cisco Systems, Inc. Available via
http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/vni-hyperconnectivity-wp.html.
Accessed 26 Nov 2016
Clelland CT, Risca V, Bancroft C (1999) Hiding messages in DNA microdots. Nature
399(6736):533–534
Clermont D, Santoni S, Saker S, Gomard M, Gardais E, Bizet C (2014) Assessment of DNA
encapsulation, a new room-temperature DNA storage method. Biopreserv Biobanking 12(3):176–183
Colotte M, Coudy D, Tuffet S, Bonnet J (2011) Adverse effect of air exposure on the stability of
DNA stored at room temperature. Biopreserv Biobanking 9(1):47–50
Cox JPL (2001) Long-term data storage in DNA. Trends Biotechnol 19(7):247–250
Davis J (1996) Microvenus. Art J 55(1):70–74
Faize M, Faize L, Burgos L (2010) Using quantitative real-time PCR to detect chimeras in
transgenic tobacco and apricot and to monitor their dissociation. BMC Biotechnol 10(1):53
Farzadfard F, Lu TK (2014) Genomically encoded analog memory with precise in vivo DNA
writing in living cell populations. Science 346(6211):1256272
Fister K, Fister I, Murovec J, Bohanec B (2017) DNA labelling of varieties covered by patent
protection: a new solution for managing intellectual property rights in the seed industry.
Transgenic Res 26(1):87–95
Gant JF, Reinsel D, Chute C, Schlichting W, McArthur J, Minton S, Xheneti I, Toncheva A,
Manfrediz A (2007) The expanding digital universe. White paper, International Data
Corporation. Available via
https://web.archive.org/web/20130310100607/http://www.emc.com/collateral/analyst-reports/expanding-digital-idc-white-paper.pdf.
Accessed 26 Nov 2016
Gibson DG, Glass JI, Lartigue C, Noskov VN, Chuang RY, Algire MA, Benders GA, Montague
MG, Ma L, Moodie MM (2010) Creation of a bacterial cell controlled by a chemically
synthesized genome. Science 329(5987):52–56
Goldman N, Bertone P, Chen S, Dessimoz C, LeProust EM, Sipos B, Birney E (2013) Towards
practical, high-capacity, low-maintenance information storage in synthesized DNA. Nature
494(7435):77–80
Goodin MM, Zaitlin D, Naidu RA, Lommel SA (2008) Nicotiana benthamiana: its history and
future as a model for plant-pathogen interactions. Mol Plant-Microbe Interact 21(8):1015–1026
Grass RN, Heckel R, Puddu M, Paunescu D, Stark WJ (2015) Robust chemical preservation
of digital information on DNA in silica with error-correcting codes. Angew Chem Int Ed
54(8):2552–2555
Hilbert M, López P (2011) The world’s technological capacity to store, communicate, and compute
information. Science 332(6025):60–65
Ivanova NV, Kuzmina ML (2013) Protocols for dry DNA storage and shipment at room
temperature. Mol Ecol Resour 13(5):890–898
Langtangen HP (2006) Python scripting for computational science, 3rd edn. Springer, Berlin
Liss M, Daubert D, Brunner K, Kliche K, Hammes U, Leiherer A, Wagner R (2012) Embedding
permanent watermarks in synthetic genes. PLoS One 7(8):e42465
Liu X, Li Q, Wang X, Zhou X, He X, Liao Q, Zhu F, Cheng L, Zhang Y (2015) Evaluation
of DNA/RNAshells for room temperature nucleic acids storage. Biopreserv Biobanking
13(1):49–55
Ljubič K, Fister I Jr (2014) How to store Wikipedia into a forest tree: initial idea. In: Proceedings
of the first International conference on multimedia, scientific information and visualization for
information systems and metrics (MSIVISM’14), pp 45–52
MacKay DJC (2003) Information theory, inference and learning algorithms. Cambridge University
Press, New York
Marx V (2013) Biology: the big challenges of big data. Nature 498(7453):255–260
Ossowski S, Schneeberger K, Lucas-Lledó JI, Warthmann N, Clark RM, Shaw RG, Weigel D,
Lynch M (2010) The rate and molecular spectrum of spontaneous mutations in Arabidopsis
thaliana. Science 327(5961):92–94
Özgen M, Özdilek A, Birsin MA, Önde S, Şahin D, Açıkgöz E, Kaya Z (2012) Analysis of ancient
DNA from in vitro grown tissues of 1600-year-old seeds revealed the species as Anagyris
foetida. Seed Sci Res 22(4):279–286
Pennisi E (2012) Search for pore-fection. Science 336(6081):534–537
Susič N, Bohanec B, Murovec J (2014) Agrobacterium tumefaciens-mediated transformation of
bush monkey-flower (Mimulus aurantiacus Curtis) with a new reporter gene ZsGreen. Plant
Cell Tissue Organ Cult 116(2):243–251
The Economist (2012) Digital archiving: history flushed. The Economist. Available via http://www.
economist.com/node/21553410. Accessed 26 July 2016
Weng H, Pan A, Yang L, Zhang C, Liu Z, Zhang D (2004) Estimating number of transgene copies
in transgenic rapeseed by real-time PCR assay with HMG I/Y as an endogenous reference gene.
Plant Mol Biol Report 22(3):289–300
Yashina S, Gubin S, Maksimovich S, Yashina A, Gakhova E, Gilichinsky D (2012) Regeneration
of whole fertile plants from 30,000-year-old fruit tissue buried in Siberian permafrost. Proc
Natl Acad Sci USA 109(10):4008–4013