
Storing and Reading Information in Mixtures of Fluorescent Molecules


Abstract

The rapidly increasing use of digital technologies requires the rethinking of methods to store data. This work shows that digital data can be stored in mixtures of fluorescent dye molecules, which are deposited on a surface by inkjet printing, where an amide bond tethers the dye molecules to the surface. A microscope equipped with a multichannel fluorescence detector distinguishes individual dyes in the mixture. The presence or absence of these molecules in the mixture encodes binary information (i.e., “0” or “1”). The use of mixtures of molecules, instead of sequence-defined macromolecules, minimizes the time and difficulty of synthesis and eliminates the requirement of sequencing. We have written, stored, and read a total of approximately 400 kilobits (both text and images) with greater than 99% recovery of information, written at an average rate of 128 bits/s (16 bytes/s) and read at a rate of 469 bits/s (58.6 bytes/s).
Amit A. Nagarkar, Samuel E. Root, Michael J. Fink, Alexei S. Ten, Brian J. Cafferty,
Douglas S. Richardson, Milan Mrksich, and George M. Whitesides*
In order to preserve information over long periods of time,
reduce the energy consumption for storage, and prevent
tampering with stored information, new materials and
strategies for storage of information would be useful and
may be required.
Current devices used to store information
(optical media, magnetic media, and flash memory) have
insufficient operational lifetimes for long-term storage
(typically less than two decades) and require substantial energy
to maintain the stored information.
Molecules (including, but not limited to DNA) can be used
to store information without power, at high areal density, and
are claimed to be stable for thousands of years or more.
For these systems to be applied to store information, however,
several problems must be considered including (i) read/write
speeds, (ii) retention of information, (iii) density of
information, and (iv) cost.
Here, we demonstrate a write-once-read-many (WORM)
molecular information storage approach using mixtures of
fluorescent dye molecules covalently bound to an epoxy
substrate. An inkjet printer enables writing of information at a
rate of 16 bytes/s, and a multichannel fluorescence detector in
a confocal microscope enables reading at a rate of 58
kilobytes/s. Using this approach, we have written 14 075
bytes of digital information on a 7.2 mm × 7.2 mm surface
(resulting in an areal information density of 271.5 bytes/mm²)
and read this information over 1000 times without significant
loss (less than 20%) in fluorescent signal intensity. This
approach enables information storage with high density, fast
read/write speeds, and multiple reads of a single set of
molecules without loss of information, all at an acceptable cost.
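The areal density quoted above follows directly from the stated byte count and substrate dimensions. A minimal Python check, using only the two figures given in the text (the variable names are illustrative):

```python
# Figures taken from the text above.
bytes_written = 14_075   # bytes of digital information stored
side_mm = 7.2            # side length of the square substrate area, in mm

area_mm2 = side_mm * side_mm                       # about 51.84 mm^2
density_bytes_per_mm2 = bytes_written / area_mm2   # bytes per mm^2

print(f"{density_bytes_per_mm2:.1f} bytes/mm^2")   # prints 271.5 bytes/mm^2
```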
Devices currently used to store digital information,
including optical disks, flash drives, and hard disk drives,
have operational lifetimes on the order of decades. An
alternative approach to such technologies is, in principle, to
store information in molecules, as molecule-based storage
systems can have very high theoretical storage densities and
half-lives that can extend to millions of years.
Sequence-defined polymers have been examined for application in data storage, information processing, and product validation. Inspired by how nature stores genetic information, synthetic DNA has become the most popular molecule to be considered for information storage. While synthetic DNA provides one of the densest methods of data storage (10¹⁸ bytes/mm³), storage of information in long DNA strands suffers from several significant problems: (i) DNA sequencing methods (e.g., Next Gen Sequencing) are slow and, even with massive parallelization, typically require multiple hours to decode a simple message. This slow rate of reading makes this technique impractical for many applications where latency (time to access and read the stored information) is important (e.g., data centers); (ii) information systems that use synthetic DNA typically use polymers that are greater than 100 nucleotides in length, which, due to inefficient monomer coupling, leads to multiple truncation products that decrease the information density of the material.
Received: June 17, 2021
ACS Central Science, Research Article
© XXXX The Authors. Published by American Chemical Society
As an alternative to DNA-based systems, several groups have examined nonbiological polymers for molecular information storage. In particular, Lutz and co-workers have encoded binary information into several sequence-defined polymers, including non-natural polyphosphates and oligo(triazole amide)s, and decoded information in these polymers by sequencing them with tandem mass spectrometry. These synthetic polymer systems require extensive synthesis and purification. For these polymers to encode kilobytes of data, the polymer chain must be thousands of units long, but iterative monomer addition suffers from a decrease in the yield of the polymer with each additional monomer.
We have recently demonstrated that information can be stored in the composition of a mixture of oligopeptides, rather than the sequence of a long polymer with individual units covalently bonded to form a chain. The use of smaller fragments, combined with the commercial availability of these units, eliminates the need for time-consuming and expensive synthesis. We have used laser-ionization mass spectrometry to read information stored in molecules on a metal surface. This method has certain limitations: (i) mass spectrometry is a destructive approach, and thus information is destroyed during read-out; (ii) only one location is read at a time, making the process of read-out slow (20 bits/s) and difficult to parallelize; (iii) there is limited potential to scale down the feature size, as a decrease in laser spot size leads to an increase in noise.
The objective of this work is to demonstrate the storage of information in a set of optically distinguishable molecules (rather than oligopeptides distinguishable by molecular weight using a mass spectrometer). Rather than molecular weight, we use the difference in the wavelength of fluorescent emission of commercially available dyes to design an optochemical molecular information storage system. Information is “written” by inkjet printing of solutions of fluorescent dyes onto a reactive polymeric substrate. Information is “read” using a fluorescence microscope equipped with a multichannel fluorescence detector that can resolve, simultaneously and independently, any combination of the dyes on the substrate. This optical read-out technique uses commercially available technologies and takes advantage of parallelized reading. The system enabled by this combination of molecules is fundamentally different from other optical storage methods.
The substrate onto which information is written is a thin film of an epoxy polymer that contains reactive amino groups. The N-hydroxysuccinimide (NHS) functionalized dyes react on the substrate to form stable amide bonds. We demonstrate that these covalently immobilized dyes are stable to more than 1000 reads without significant loss of intensity. In this work, we used commercially available fluorescent dyes that have been optimized to reduce the extent of photobleaching.
There are several advantages of our molecular information storage technique as compared to magnetic tape, which is the state-of-the-art technique for long-term storage: (i) information can be stored, presumably, with lower environmental and power requirements (in magnetic tape, the binder that secures the paramagnetic material to the substrate can fail in humid conditions); (ii) information can be stored less expensively than with magnetic tape (see Supporting Information, section S19); (iii) reading of information is parallelized: a single image file can be used to read the information, unlike sequential reading in magnetic tape; (iv) information can be encrypted with novel schemes (see the Registration section).
Choice of Dye Molecules. We chose seven commercially available fluorescent dye molecules with different emission maxima to demonstrate our strategy (Figure 2). The detection technique, a multichannel fluorescence detector, uses a linear array of detection channels to resolve multiple emission bands in parallel and enables spatially resolved information on the presence or absence of the dye molecules to be obtained in a single scan across the substrate. In principle, this technique can be expanded to incorporate more dyes as well (and encode more information in the same amount of area). The dyes are dissolved in dimethyl sulfoxide (DMSO), filtered through a 0.45 μm polysulfone syringe filter, and injected into the inkjet printer cartridge (see Table S1 for concentrations). Figure 2A lists the dyes used in this study. The optimal concentrations of the dyes were determined empirically by observing their fluorescence intensity in a microscope.
Writing Information. Inkjet printing is a material deposition technique that has enabled high-resolution microfabrication with specialized materials and has been demonstrated to be applicable to areas such as electronics, drug discovery, and others. Inkjet printing has four attractive features: (i) additive operation, where drops are deposited only where needed; (ii) the ability to use a variety of inks (aqueous, organic, nanoparticle composites, biological materials, etc.); (iii) scalability to high throughput and large substrate area; (iv) lower cost than photolithography-based patterning. We use inkjet printing (other technologies like aerosol-jet printing and electrohydrodynamic jet printing provide better printing resolution but are either too expensive or are not commercially available) to print 1 pL droplets with a 30-μm center-to-center distance between adjacent spots on the substrate (Supporting Information, Figure S4). To demonstrate storage of information at high density, we wrote the first section of one of the most seminal research papers in scientific history: “Experimental researches in electricity” by Michael Faraday, Phil. Trans. R. Soc. Lond. 1832, 122, 125–162 (Supporting Information, section S18). This text contains 14 075 characters (i.e., 14 075 bytes of information when converted to ASCII).

Table 1. Comparison of Methods for Archival Data Storage
method | cost ($/GB) | stability | write speed (MB/s) | read speed (MB/s)
magnetic tape (LTO-7) | 0.016 | 10–30 years | 4 × 10² |
DNA | | up to 2000 | |
SAMDI | | not yet determined | 1 × 10⁻⁶ | 3 × 10⁻⁶
fluorescence imaging (this work) | | not yet determined | 16 × 10⁻⁶ | 6 × 10⁻²
Magnetic tape is the most common technology used to store data for archival purposes. DNA data storage and self-assembled monolayer-desorption and ionization (SAMDI) data storage are molecular information storage strategies that have received interest in the research community. This work describes storage of information in mixtures of fluorescent molecules.
Total revenue divided by total data volume of tapes shipped in a year.
Current generation (LTO-
Reports vary ($530,000–$31,250,000 per GB written). DNA sequencing also incurs cost.
Estimated lifetime of DNA encapsulated in silica.
Overall throughput is estimated to be on the order of kB/
Specific values for writing rates are not reported.
Using a single state-of-the-art sequencing device.
Self-assembled monolayers for matrix-assisted laser desorption/ionization.
See Supporting Information, section S19 for detailed calculations.
Reading Information. Fluorescence imaging is a powerful tool for high-resolution characterization of biological samples and materials. The availability of a variety of fluorescent dyes enables unprecedented control in the labeling of specific sites on the sample. Recently, the analysis of spectral data sets and the separation of signals by spectral imaging, combined with linear unmixing, have overcome problems of spectral overlap for fluorescent dyes, and this approach is used widely in biological systems.
We used a Zeiss LSM 880 fluorescence microscope, which has one of the most versatile implementations for spectral imaging. In this technique, fluorescence emission passes through a pinhole and is separated by wavelength by a diffraction grating (Supporting Information, Figure S5). The spectrally resolved light is then projected onto a linear array of 34 detection channels in a photomultiplier detector. The wavelength of emitted light is determined by the position of the channel receiving the photons. This system allows very precise determination of the intensity of peaks separated by only a few nanometers and thus the concentration of the dyes responsible. The presence/absence of a specific fluorescent dye molecule at a specific location on the substrate can thus be determined.
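The idea behind linear unmixing can be illustrated with a small least-squares sketch. The example below is not the microscope software's implementation, which the text does not describe; it assumes a simplified case of two dyes and four detection channels, and the reference spectra, the function name `unmix_two_dyes`, and the pixel values are all hypothetical. The measured spectrum at a pixel is modeled as a weighted sum of the two reference spectra, and the weights are recovered by solving the 2 × 2 normal equations.

```python
def unmix_two_dyes(measured, ref_a, ref_b):
    """Least-squares amounts (a, b) such that measured ~ a*ref_a + b*ref_b."""
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    aa, ab, bb = dot(ref_a, ref_a), dot(ref_a, ref_b), dot(ref_b, ref_b)
    ma, mb = dot(measured, ref_a), dot(measured, ref_b)
    det = aa * bb - ab * ab   # zero only if the two spectra are proportional
    if det == 0:
        raise ValueError("reference spectra are not linearly independent")
    a = (ma * bb - mb * ab) / det
    b = (mb * aa - ma * ab) / det
    return a, b


# Hypothetical, overlapping reference spectra over four detection channels.
ref_a = [0.6, 0.3, 0.1, 0.0]
ref_b = [0.0, 0.2, 0.5, 0.3]

# Synthetic pixel containing 2.0 units of dye A and 0.5 units of dye B.
pixel = [2.0 * x + 0.5 * y for x, y in zip(ref_a, ref_b)]

a, b = unmix_two_dyes(pixel, ref_a, ref_b)
print(round(a, 6), round(b, 6))
```

With more dyes the same approach generalizes to a larger linear system; presence or absence of a dye would then follow from comparing each recovered amount with a threshold.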
Choice of Substrate. Long-term storage of information requires the formation of thermodynamically stable bonds with very long half-lives. An amide bond is one of the most thermodynamically stable bonds available to organic chemists. In our strategy, we used N-hydroxysuccinimide-functionalized dye molecules, which spontaneously react with amino groups to form amide bonds. We synthesized a cross-linked epoxy polymer with an excess of the amine curing agent. This substrate contains reactive secondary amino groups (Supporting Information, Figure S2). The epoxy polymer is processed by hot-pressing a mixture of bisphenol A diglycidyl ether and triethylenetetramine at 120 °C between a glass coverslip and a flat PDMS surface (see Experimental Section). We control the pressure (70 psi) to obtain 50-μm thick films (Supporting Information, Figures S2 and S3). It is important to have a flat surface for the substrate because irregularities in thickness lead to blurring of the image and incomplete focusing in the microscope on reading the information (see Supporting Information, Figure S16, for an example).
Encoding Scheme for Writing of Binary Information. A binary representation of ASCII characters consists of eight bits where each bit is either “0” or “1”. In our encoding scheme, each bit in the binary representation of an ASCII character is assigned a position (positions 1–8, Figure 3). Each position is assigned to a fluorescent dye molecule (here, we assign dyes to positions in the order of increasing emission maxima). The bits at a given position, collected across all characters, form the bit string for that position, and the bit strings for the positions are then used to generate printable patterns. Here, we generate a square pattern out of each bit string, but, in principle, any pattern geometry is possible. These square patterns are then sequentially printed on the substrate using an inkjet printer. “0” indicates absence of a dye molecule, and “1” indicates the presence of a dye molecule. For printable ASCII characters, the first binary digit is always “0”, and hence, the first square pattern is always a blank pattern. Thus, we require only seven dye molecules for data storage of printable eight-bit ASCII characters.
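The encoding scheme described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' code: the function name `encode_to_dye_patterns`, the row-major filling of the square, and the zero-padding of unused grid cells are assumptions made for the example.

```python
import math


def encode_to_dye_patterns(text, n_bits=8):
    """Return one square 0/1 grid per bit position (position 1 = leading bit)."""
    data = text.encode("ascii")
    # Smallest square that holds one spot per character (assumed layout).
    side = math.isqrt(len(data) - 1) + 1 if data else 0
    patterns = []
    for pos in range(n_bits):
        shift = n_bits - 1 - pos
        # Bit at this position for every character, in reading order.
        bits = [(byte >> shift) & 1 for byte in data]
        # Assumption: pad unused cells with 0 (no dye printed).
        bits += [0] * (side * side - len(bits))
        grid = [bits[r * side:(r + 1) * side] for r in range(side)]
        patterns.append(grid)
    return patterns


patterns = encode_to_dye_patterns("Hi")
for position, grid in enumerate(patterns, start=1):
    print(position, grid)
```

For the two-character input "Hi" (binary 01001000 and 01101001), the grid for position 1 contains only zeros, in line with the observation that the first pattern is blank for printable ASCII text, so only positions 2 to 8 need a dye.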
Registration. Figure 3B shows a schematic representation of the fact that registration of the printed grids of fluorescent molecules is not required. The fluorescent molecules, when deposited onto the substrate, lie on a grid where the presence or absence of the molecule at the intersection of the gridlines determines binary information (i.e., “0” or “1”). When these grids are sequentially printed onto the substrate, any offset between grids of different fluorescent molecules does not make a difference to the output obtained on reading through a multichannel fluorescence detector. To help in determining the position of the grid, we place three dots that serve as “calibration spots” as shown in Figure 3B.
We used the Fujifilm Dimatix DMP 2831 inkjet printer to deposit the dye molecules onto the substrate. As this inkjet printer can accommodate only one cartridge at a time, we manually changed the cartridges (each containing one fluorescent dye solution) to print the computer-generated images. We needed 7 manual cartridge changes, and it took 116 s on average to write each pattern for “Experimental researches in electricity” at 30 μm center-to-center spot distance on a 7.2 mm × 7.2 mm substrate area. This time and area translate into a writing speed of 16 bytes/s.
The substrate with the written information was placed in a Zeiss LSM 880 fluorescence microscope in an inverted configuration. Four lasers (405 nm, 488 nm, 561 nm, 633 nm) were chosen to excite all the dyes simultaneously in the visible spectrum region (410–695 nm). Using the in-built spectral imaging function, we could resolve all the patterns with very good spatial resolution for each dye. Figure 4B shows cropped regions of the unmixed images for all seven dye molecules. It took approximately 240 s to record the image, giving an effective reading speed of 58.64 kilobytes/s (469 kilobits/s).
Here we use an inkjet printer and print at the highest resolution, where it is not possible to specify whether the grids are offset or perfectly overlap. The multichannel fluorescence detector is required to show the presence/absence of a dye in the same location as other dyes if they perfectly overlap.
Figure 1. A schematic diagram of the “writing” and “reading” process. ASCII information is converted to a binary bit string which is then encoded into printable patterns and printed with an inkjet printer. The presence or absence of dye molecules at a location represents a byte of data. The information is written on an epoxy substrate which contains free amino groups. Printing of the dyes leads to an amide bond formation between the substrate and the dye and leads to covalent immobilization of the dye onto the substrate at a specific location. Imaging of the printed substrate using a multichannel fluorescence detector represents the “reading” of the written information. The multichannel fluorescence detector can, simultaneously and independently, detect the presence or absence of the dye molecules at a specific location. One very important feature of our approach is that the registration of the dyes with respect to each other is not important for decoding the stored information.
Decoding the Information. It is straightforward to use image analysis to decode the stored information. The individual patterns are read using a simple Python script using the OpenCV computer vision library. We obtained good accuracy (99.64%) of the recovered information (measured as the number of bits read correctly as a percentage of the total number of bits). This accuracy can be improved with more sophisticated image analysis techniques and error correction codes. The most common reason for inaccuracies during reading was dust particles adhering to the substrate.
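The decoding step and the accuracy metric defined above (bits read correctly as a percentage of all bits) can be sketched as follows. This is not the authors' OpenCV script; it assumes that the unmixed image for each bit position has already been sampled into a grid of normalized spot intensities, and the function names, the threshold of 0.5, and the intensity values are hypothetical.

```python
def decode_patterns(intensity_grids, n_chars, threshold=0.5):
    """Rebuild bytes from one intensity grid per bit position (leading bit first)."""
    planes = [
        [1 if value > threshold else 0 for row in grid for value in row]
        for grid in intensity_grids
    ]
    decoded = []
    for i in range(n_chars):
        byte = 0
        for plane in planes:
            byte = (byte << 1) | plane[i]   # append this position's bit
        decoded.append(byte)
    return bytes(decoded)


def bit_accuracy(original, recovered):
    """Percentage of bits in `recovered` that match `original`."""
    total_bits = 8 * len(original)
    wrong_bits = sum(bin(x ^ y).count("1") for x, y in zip(original, recovered))
    return 100.0 * (total_bits - wrong_bits) / total_bits


original = b"Hi"
# Synthetic 2 x 2 grids: 0.9 where a dye is present, 0.1 where it is absent.
grids = [
    [[0.9 if (original[0] >> (7 - pos)) & 1 else 0.1,
      0.9 if (original[1] >> (7 - pos)) & 1 else 0.1],
     [0.1, 0.1]]
    for pos in range(8)
]
recovered = decode_patterns(grids, n_chars=2)

# Simulate a bright artifact (e.g., a dust particle) in the last pattern.
grids[7][0][0] = 0.9
corrupted = decode_patterns(grids, n_chars=2)
print(recovered, corrupted, bit_accuracy(original, corrupted))
```

In this toy case a single false-positive spot flips one of sixteen bits, which changes "H" to "I" and lowers the bit accuracy to 93.75%.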
Stability of the Information. Photobleaching is the attenuation of fluorescence intensity of a fluorophore molecule, primarily due to the cleavage of covalent bonds in the molecule on reaction with oxygen. In our experiments, photobleaching did not significantly affect our recorded information. As compared to traditional biological labeling experiments, we use a high concentration of the fluorescent dye (micromolar quantities). Two benefits of using a high concentration of dyes are that (i) low laser power is sufficient to excite the fluorescent dyes at a location and (ii) lower laser power also decreases the rate of photobleaching. In our experiments, a 2 mm × 2 mm portion of the information was continuously read 1000 times in air without significant loss in intensity (Figure 5). After 1000 reading cycles, dye 425 showed the largest reduction in intensity (21%), while all other dyes showed a <15% change in intensity.
Storage of Digital Images. Our strategy of storage of information for ASCII data can be applied to store non-ASCII data as well. As shown in Figure 6, we converted a 3 kilobyte JPEG image of Michael Faraday into a bit string, encoded the bit string to print in seven fluorescent dyes, and inkjet printed the molecules onto the substrate. In this case, as the data are already in a compressed format (JPEG), the quality of recovered data is much more sensitive to errors than when it is in a loss-less image encoding format. An example of the image with 0.4% printing errors (0.4% bits read wrong as compared to the input bit string) is given in the Supporting Information (Figure S17).
Figure 2. Optically distinguishable fluorescent dyes. (A) Structures of the fluorescent dyes used in this study along with their emission spectra in dimethyl sulfoxide. (B) Reaction scheme for the covalent immobilization of the dye molecule on the substrate. Amino groups in the substrate react with the N-hydroxysuccinimide derivatives of the fluorescent dye to link the fluorescent dye to the substrate with an amide bond.
Our technology for storage of information in mixtures of fluorescent molecules can be expanded with the use of other fluorophores with narrow emission bandwidths (e.g., quantum dots or J-aggregates). An expanded palette of fluorophores will allow for the use of more fluorophores per location and could also allow simpler, band-pass filter-based reading, eliminating the requirement for an expensive multichannel fluorescence detector. More sophisticated drop-on-demand technologies (electrohydrodynamic inkjet printer commercialized by SIJ corporation, Japan, dip-pen lithography, etc.) can print at much higher resolutions (sub-1 μm spot-to-spot distance). At 1 μm spot-to-spot distance with eight fluorescent dyes, the areal storage density will be 5 Gbits/in², which is comparable to the latest generation of magnetic tape (LTO-8, areal density: 8 Gbit/in²). Another area for improvement is the use of error correction codes to decrease error rates (e.g., Reed–Solomon error correction codes have been extensively used to decrease error rates in optical media like compact discs and Blu-ray discs).
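The projected areal density at 1 μm spot-to-spot distance can be checked with a short calculation. The sketch below assumes a square grid of spots with one bit per dye per spot, which is how the figure in the text appears to be derived; the variable names are illustrative.

```python
UM_PER_INCH = 25_400    # 1 inch = 25.4 mm = 25,400 micrometers
spot_pitch_um = 1.0     # spot-to-spot distance from the text
dyes_per_spot = 8       # eight fluorescent dyes, one bit each (assumed)

spots_per_in2 = (UM_PER_INCH / spot_pitch_um) ** 2
bits_per_in2 = spots_per_in2 * dyes_per_spot

print(f"{bits_per_in2 / 1e9:.2f} Gbit/in^2")   # prints 5.16 Gbit/in^2
```

The result, about 5.2 Gbit/in², is consistent with the rounded value of 5 Gbits/in² given above.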
In conclusion, we report a fundamentally new molecular data storage technology that leverages the optical characteristics of conjugated molecules. A multichannel fluorescence detector enables the simultaneous and independent detection of the presence or absence of a molecule in a mixture on a surface. The “writing” process uses inkjet printing wherein molecules that are deposited onto the surface form an amide bond to link the dye molecules to the substrate. An important characteristic of this information storage method is that registration of the individual molecules is not required. This characteristic is, to our best knowledge, unique; it differentiates this method from existing optical data storage technologies.
We also show that multiple (>1000) readouts of such optical molecular information are possible without significant loss of information through bleaching and other mechanisms. This is also unique as compared to other molecular information storage systems which involve destructive reading (e.g., sequencing of DNA or laser ablation of oligopeptides). We have also demonstrated the fastest reading speed of any of the molecular information storage methods (0.469 Mbits/s). Access to newer drop-on-demand technologies like electrohydrodynamic inkjet printing would enable commercially competitive areal information density.
This optical molecular information storage technology presents solutions to important problems that are faced by emerging molecular information storage technologies: energy used for storage, cost, and ability to resist corruption.
Safety. Epoxy resins are known skin sensitizers and should be handled carefully with all safety precautions and personal protective equipment in a well-ventilated fume hood. Caution must be taken while handling the reactive fluorescent dyes as their safety hazards are not fully known.
Materials. AlexaFluor dyes were purchased from Thermo Fisher and used without further purification. Atto 425 dye, dry dimethyl sulfoxide (DMSO), bisphenol A diglycidyl ether, and triethylenetetramine were purchased from Sigma-Aldrich and used without further purification.
Fabrication of the Epoxy Substrate. Bisphenol A diglycidyl ether (2.4 g, 7 mmol) was mixed with 0.6 g of triethylenetetramine (4.2 mmol, 3 equiv). This solution was vigorously stirred for 2 min and degassed under a vacuum (80 mbar) for 5 min. This solution (2.6 mL) was poured onto a glass slide and placed inside a heat-press. A PDMS (Sylgard 184) block (10 cm × 10 cm × 0.5 cm) was placed on top of this solution. The PDMS slab plays two roles: (i) it ensures that the top surface is flat, and (ii) it does not stick to the epoxy film, and hence it is easy to remove after the epoxy polymer has cured. The polymer was cured under 20 psi pressure at 120 °C for 30 min. The PDMS layer was manually removed, and the epoxy film on the glass substrate was cooled down to room temperature. This film was then washed with n-hexane three times to remove any potential residue left by the PDMS.
Figure 3. Encoding and registration. (A) Encoding scheme for storage of data for ASCII characters. The algorithm converts the input ASCII string into binary bit strings. Each specific position in the binary data is combined to generate a separate bit string, which corresponds to a specific fluorescent molecule. These eight bit strings are then converted into a pattern (here, a square pattern, but, in principle, any shape of an array of spots can be used) and printed sequentially onto the substrate with an inkjet printer. (B) A schematic representation demonstrating that the registration of different colors of the printed grids is not required. The grids, when printed onto the substrate, can either be perfectly registered or be printed with an offset. In both cases, as information is read using fluorescence emission at predetermined wavelengths, the patterns can be read independently (1 = presence of the dye and 0 = absence of the dye). Independent read-out from each channel of the fluorescence detector facilitates this “non-registered” information storage.
Instrumentation. Inkjet Printing. Writing was carried out with a Fujifilm Dimatix DMP 2831 printer with a 1 pL printing volume cartridge. The printing parameters are firing voltage of the active nozzle: 16 V; firing voltage of inactive nozzles: 12 V; printing height: 0.5 mm. Cartridges containing the fluorescent dyes were manually changed for each dye.
Atomic Force Microscopy. A sample consisting of an epoxy film on a glass slide was first sonicated in isopropyl alcohol for 10 min and then dried under a nitrogen stream. Atomic force microscope (AFM) images were obtained using an Asylum Research Cypher AFM in tapping mode with a 300 kHz cantilever. Supporting Information, Figure S2 provides the data.
Profilometry. A sample consisting of an epoxy film on a microscope glass slide (VWR, 1 mm thickness) was sliced with a razor blade to introduce trenches into the film, sonicated in isopropyl alcohol for 10 min, and then dried under nitrogen. Profilometry was performed using a Bruker DektakXT profilometer equipped with a 5-μm radius diamond tip and with 3 mg of applied force. Supporting Information, Figure S3 provides the data.
Microscopy. Reading was carried out using a Zeiss LSM 880 confocal microscope with an in-built 34 channel photomultiplier detector. Four lasers were used to excite all the dyes: 405 nm, 488 nm, 561 nm, 633 nm. The in-built multichannel fluorescence detector was calibrated with individual dyes printed on the epoxy substrate.
Figure 4. Reading of information. (A) Fluorescence microscope image of the first section of Faraday’s “Experimental researches in electricity” on a 50 μm epoxy polymer film written using the encoding scheme shown in Figure 3. The image was recorded with excitation using four lasers simultaneously (405 nm, 488 nm, 561 nm, 633 nm). (B) Zoomed-in image of the printed droplets on the epoxy substrate. (C) Linear unmixing of the fluorescence microscope image leads to independent deconvolution of each dye at a location. The panel shows individual grids of fluorescent dye molecules 425, 488, 514, 555, 568, 594, and 647 obtained by spectral unmixing of the original image using the Zeiss Zen Black software.
Figure 5. A subset of the printed area was continuously read 1000 times in the fluorescence microscope. (A) Image of the printed region on the first read. (B) Image of the same region after 1000 reads. All the patterns of the dyes were easily readable after 1000 cycles of reading. (C) Table showing the percentage of the initial fluorescence intensity remaining after reading the data 1000 times in air.
Supporting Information
The Supporting Information is available free of charge at
Absorption and emission spectra of fluorescent dyes, list of concentrations of the fluorescent dyes used, characterization of the substrate, spectrally unmixed images of each dye, computer-generated printing patterns, images of printed droplets, image of problems while focusing in the microscope, stored and decoded image of Michael Faraday containing errors, encoded text from Faraday’s “Experimental Researches in Electricity”, estimation of the lower bound of cost per GB, typical inkjetting waveform, and an estimation of lifetime of the fluorescent dyes (1010 years by extrapolation using the Arrhenius equation) (PDF)
Corresponding Author
George M. Whitesides – Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States;
0001-9451-2442; Email: gwhitesides@
Amit A. Nagarkar – Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States
Samuel E. Root – Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States
Michael J. Fink – Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States;
Alexei S. Ten – Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States;
Brian J. Cafferty – Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States
Douglas S. Richardson – Harvard Center for Biological Imaging, Cambridge, Massachusetts 02138, United States
Milan Mrksich – Department of Chemistry and Department of Biomedical Engineering, Northwestern University, Evanston, Illinois 60208, United States;
Complete contact information is available at:
Author Contributions
G.M.W. conceived the idea of molecular information storage in
mixtures of molecules. A.A.N., B.J.C., S.E.R. and G.M.W.
postulated information storage in mixtures of uorescent
molecules. A.A.N., S.E.R., and D.R. conducted the experiments
and performed the analysis. A.S.T., M.J.F., and M.M. provided
valuable input for improvement of the manuscript. A.A.N.,
S.E.R., M.J.F., and G.M.W. wrote the manuscript with inputs
from all the authors.
Notes
The authors declare the following competing financial
interest(s): A.A.N., A.S.T., and M.J.F. acknowledge an equity
interest in Datacule Inc. G.M.W. acknowledges an equity
interest and a board position in Datacule Inc.
Acknowledgments
This work was supported by the Defense Advanced Research
Projects Agency (DARPA) under Award No. W911NF-18-2-
References
(1) Lloyd, S. Ultimate physical limits to computation. Nature 2000, 406, 1047–1054.
(2) Shulaker, M. M.; et al. Three-dimensional integration of nanotechnologies for computing and data storage on a single chip. Nature 2017, 547, 74–78.
(3) Pansare, A. V.; et al. In Situ Nanoparticle Embedding for Authentication of Epoxy Composites. Adv. Mater. 2018, 30, 1801523.
(4) Salahuddin, S.; Ni, K.; Datta, S. The era of hyper-scaling in electronics. Nature Electronics 2018, 1, 442–450.
(5) Baliga, J.; Ayre, R. W.; Hinton, K.; Tucker, R. S. Green Cloud Computing: Balancing Energy in Processing, Storage, and Transport. Proc. IEEE 2011, 99, 149–167.
(6) Brandner, R.; Pordesch, U.; Wallace, C. Long-Term Archive Service Requirements; The IETF Trust: Internet Requests for Comments, 2007.
(7) Clelland, C. T.; Risca, V.; Bancroft, C. Hiding messages in DNA microdots. Nature 1999, 399, 533–534.
Figure 6. A JPEG image of Michael Faraday (3.95 KB) was converted into patterns for the seven fluorescent dyes used in this study. This
information was printed onto a 4.2 mm × 4.2 mm epoxy substrate. An example of a decoded image with 0.4% printing errors is shown in the
Supporting Information, Figure S17. The image of Michael Faraday has been reproduced with permission from Getty Images.
(8) Green, J. E.; et al. A 160-kilobit molecular electronic memory patterned at 10^11 bits per square centimetre. Nature 2007, 445, 414
(9) Colquhoun, H.; Lutz, J.-F. Information-containing macromolecules. Nat. Chem. 2014, 6, 455–456.
(10) Zhirnov, V.; Zadegan, R. M.; Sandhu, G. S.; Church, G. M.; Hughes, W. L. Nucleic acid memory. Nat. Mater. 2016, 15, 366–370.
(11) Al Ouahabi, A.; Charles, L.; Lutz, J.-F. Synthesis of non-natural sequence-encoded polymers using phosphoramidite chemistry. J. Am. Chem. Soc. 2015, 137 (16), 5629–5635.
(12) Charles, L.; Laure, C.; Lutz, J.-F.; Roy, R. K. MS/MS sequencing of digitally encoded poly(alkoxyamine amide)s. Macromolecules 2015, 48 (13), 4319–4328.
(13) Amalian, J.-A.; Trinh, T. T.; Lutz, J.-F.; Charles, L. MS/MS digital readout: analysis of binary information encoded in the monomer sequences of poly(triazole amide)s. Anal. Chem. 2016, 88 (7), 3715–3722.
(14) Goodwin, C. A.; Ortu, F.; Reta, D.; Chilton, N. F.; Mills, D. P. Molecular magnetic hysteresis at 60 K in dysprosocenium. Nature 2017, 548, 439–442.
(15) Bhat, W. A. Bridging data-capacity gap in big data storage. Future Generation Computer Systems 2018, 87, 538–548.
(16) Gu, M.; Li, X.; Cao, Y. Optical storage arrays: a perspective for future big data storage. Light: Sci. Appl. 2014, 3, e177.
(17) Grass, R. N.; Heckel, R.; Puddu, M.; Paunescu, D.; Stark, W. J. Robust Chemical Preservation of Digital Information on DNA in Silica with Error-Correcting Codes. Angew. Chem., Int. Ed. 2015, 54,
(18) Goodwin, S.; McPherson, J. D.; McCombie, W. R. Coming of age: ten years of next-generation sequencing technologies. Nat. Rev. Genet. 2016, 17, 333–351.
(19) Kosuri, S.; Church, G. M. Large-scale de novo DNA synthesis: technologies and applications. Nat. Methods 2014, 11, 499–507.
(20) Organick, L.; et al. Random access in large-scale DNA data storage. Nat. Biotechnol. 2018, 36, 242–248.
(21) Cafferty, B. J.; Ten, A. S.; Fink, M. J.; Morey, S.; Preston, D. J.; Mrksich, M.; Whitesides, G. M. Storage of Information Using Small Organic Molecules. ACS Cent. Sci. 2019, 5 (5), 911–916.
(22) Feenstra, A. D.; Dueñas, M. E.; Lee, Y. J. Five Micron High Resolution MALDI Mass Spectrometry Imaging with Simple, Interchangeable, Multi-Resolution Optical System. J. Am. Soc. Mass Spectrom. 2017, 28, 434–442.
(23) Catalogue Press Release,
(24) Fontana, R. E.; Decad, G. M. Moore's law realities for recording systems and memory storage components: HDD, tape, NAND, and optical. AIP Adv. 2018, 8, 056506.
(25) Ceze, L.; Nivala, J.; Strauss, K. Molecular digital data storage using DNA. Nat. Rev. Genet. 2019, 20, 456–466.
(26) Meiser, L. C.; Koch, J.; Antkowiak, P. L.; Stark, W. J.; Heckel, R.; Grass, R. N. DNA synthesis for true random number generation. Nat. Commun. 2020, 11, 5869.
(27) Lantz, M. Why the Future of Data Storage Is (Still) Magnetic Tape. IEEE Spectrum: Technology Engineering and Science News, Aug.
(28) Bertram, H.; Cuddihy, E. Kinetics of the humid aging of magnetic recording tape. IEEE Trans. Magn. 1982, 18, 993–999.
(29) Sandsta, O.; Midtstraum, R. Analysis of retrieval of multimedia data stored on magnetic tape. Proceedings International Workshop on Multi-Media Database Management Systems; Cat. No. 98TB100249; Dayton, OH, USA, 1998; pp 54–63.
(30) Sirringhaus, H.; et al. High-resolution inkjet printing of all-polymer transistor circuits. Science 2000, 290, 2123–2126.
(31) Shimoda, T.; et al. Solution-processed silicon films and transistors. Nature 2006, 440, 783–786.
(32) Shimoda, T.; Morii, K.; Seki, S.; Kiguchi, H. Inkjet printing of light-emitting polymer displays. MRS Bull. 2003, 28, 821–827.
(33) Chang, S. C.; et al. Multicolor organic light-emitting diodes processed by hybrid inkjet printing. Adv. Mater. 1999, 11, 734–737.
(34) Lemmo, A. V.; Rose, D. J.; Tisone, T. C. Inkjet dispensing technology: Application in drug discovery. Curr. Opin. Biotechnol. 1998, 9, 615–617.
(35) Heller, M. J. DNA microarray technology: Devices, systems, and applications. Annu. Rev. Biomed. Eng. 2002, 4, 129–153.
(36) Hiller, J.; Mendelsohn, J. D.; Rubner, M. F. Reversibly erasable nanoporous anti-reflection coatings from polyelectrolyte multilayers. Nat. Mater. 2002, 1, 59–63.
(37) Wilkinson, N. J.; Smith, M. A.; Kay, R. W.; Harris, R. A. A review of aerosol jet printing – a non-traditional hybrid process for micro-manufacturing. International Journal of Advanced Manufacturing Technology 2019, 105, 4599–4619.
(38) Park, J.-U.; et al. High-resolution electrohydrodynamic jet printing. Nat. Mater. 2007, 6, 782–789.
(39) Valm, A. M.; Oldenbourg, R.; Borisy, G. G. Multiplexed Spectral Imaging of 120 Different Fluorescent Labels. PLoS One 2016, 11, e0158495.
(40) Martin, R. B. Free energies and equilibria of peptide bond hydrolysis and formation. Biopolymers 1998, 45 (5), 351–353.
(41) Pansare, A. V.; et al. Shape-Coding: Morphology-Based Information System for Polymers and Composites. ACS Appl. Mater. Interfaces 2020, 12, 27555–27561.
(42) Hieronymus, J. L. ASCII phonetic symbols for the world's languages: Worldbet. J. Int. Phonetic Assoc. 1993, 23.
(43) Pulli, K.; Baksheev, A.; Kornyakov, K.; Eruhimov, V. Realtime Computer Vision with OpenCV. Queue 2012, 10, 40–56.
(44) Reed, I. S.; Solomon, G. Polynomial Codes over Certain Finite Fields. J. Soc. Ind. Appl. Math. 1960, 8 (2), 300–304.
(45) Owen, J.; Brus, L. Chemical Synthesis and Luminescence Applications of Colloidal Semiconductor Quantum Dots. J. Am. Chem. Soc. 2017, 139 (32), 10939–10943.
(46) Würthner, F.; Kaiser, T. E.; Saha-Möller, C. R. J-Aggregates: From Serendipitous Discovery to Supramolecular Engineering of Functional Dye Materials. Angew. Chem., Int. Ed. 2011, 50, 3376
(47) Liu, G.; Petrosko, S. H.; Zheng, Z.; Mirkin, C. A. Evolution of Dip-Pen Nanolithography (DPN): From Molecular Patterning to Materials Discovery. Chem. Rev. 2020, 120, 6009–6047.
... Writing (encoding) information at the molecular level can be achieved with either (i) sequences of monomers concatenated within one molecular string [7][8][9][10][11][12] or with (ii) mixtures of individual, unique compounds [19][20][21][22][23]. For both, there is a trade-off between synthetic difficulty, material demands, and information capacity. ...
... Long sequences can be made from a few monomers, but synthetic writing is difficult, slow, and prone to errors [7][8][9]. On the other hand, mixtures are easy to make but require a high number of uniquely distinguishable components, which eventually become a limiting factor [19][20][21][22][23]. Theoretically, an alternative approach, better balancing the need to generate many codes with a small number of components and synthetic steps, is (iii) to first synthesize short sequences and then use them as unique components in mixtures. ...
... Another challenge for spectroscopic methods are signal overlaps that limit the number of distinguishable codes. For example, fluorescent labels, as one of the most important tools for interrogation of biological systems, provide relatively broad signals within a narrow spectral window [23][24][25][26]. Up to ten different fluorescent tags can be distinguished from mixtures with advanced computational methods, but this still amounts only to 2^10 − 1 = 1023 codes [26]. ...
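The count quoted above follows from treating each tag as present or absent in a mixture and excluding the empty mixture. The short Python sketch below illustrates that arithmetic; the function names `count_codes` and `enumerate_codes` are hypothetical and chosen only for illustration, and the sketch assumes every non-empty subset of tags is perfectly distinguishable, which real spectra may not guarantee.

```python
from itertools import combinations


def count_codes(n_tags: int) -> int:
    """Number of non-empty mixtures that n_tags present/absent tags can form."""
    return 2 ** n_tags - 1


def enumerate_codes(tags):
    """Yield every non-empty subset (mixture) of the given tags."""
    tags = list(tags)
    for size in range(1, len(tags) + 1):
        for subset in combinations(tags, size):
            yield subset


if __name__ == "__main__":
    # Ten tags, as in the passage above.
    print(count_codes(10))
    # Brute-force enumeration agrees with the closed form.
    print(sum(1 for _ in enumerate_codes(range(10))))
```

Under the same assumption, seven dyes (the number used in Figure 6 of this work) would give 2^7 − 1 = 127 non-empty mixtures per printed spot.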
Contactless digital tags are increasingly penetrating into many areas of human activities. Digitalization of our environment requires an ever growing number of objects to be identified and tracked with machine-readable labels. Molecules offer immense potential to serve for this purpose, but our ability to write, read, and communicate molecular code with current technology remains limited. Here we show that magnetic patterns can be synthetically encoded into stable molecular scaffolds with paramagnetic lanthanide ions to write digital code into molecules and their mixtures. Owing to the directional character of magnetic susceptibility tensors, each sequence of lanthanides built into one molecule produces a unique magnetic outcome. Multiplexing of the encoded molecules provides a high number of codes that grows double-exponentially with the number of available paramagnetic ions. The codes are readable by nuclear magnetic resonance in the radiofrequency (RF) spectrum, analogously to the macroscopic technology of RF identification. A prototype molecular system capable of 16-bit (65,535 codes) encoding is presented. Future optimized systems can conceivably provide 64-bit (~10^19 codes) or higher encoding to cover the labelling needs in drug discovery, anti-counterfeiting and other areas.
Lanthanide‐based upconverting nanoparticles (UCNPs) are largely sought‐after for biomedical applications ranging from bioimaging to therapy. A straightforward strategy is proposed here using the naturally sourced polymer phytoglycogen to coencapsulate UCNPs with hydrophobic photosensitizers as an optical imaging platform and light‐induced therapeutic agents. The resulting multifunctional sub‐micrometer‐sized luminescent beads are shown to be cytocompatible as carrier materials, which encourages the assessment of their potential in biomedical applications. The loading of UCNPs of various elemental compositions enables multicolor hyperspectral imaging of the UCNP‐loaded beads, endowing these materials with the potential to serve as luminescent tags for multiplexed imaging or simultaneous detection of different moieties under near‐infrared (NIR) excitation. Coencapsulation of UCNPs and Rose Bengal opens the door for potential application of these microcarriers for collagen crosslinking. Alternatively, coloading UCNPs with Chlorin e6 enables NIR‐light triggered generation of reactive oxygen species. Overall, the developed encapsulation methodology offers a straightforward and noncytotoxic strategy yielding water‐dispersible UCNPs while preserving their bright and color‐tunable upconversion emission that would allow them to fulfill their potential as multifunctional platforms for biomedical applications. The assembly and characterization of upconverting nanoparticle (UCNP)‐loaded corn‐derived phytoglycogen beads synthesized through a simple encapsulation method are presented. Those resultant luminescent beads contain UCNPs only or UCNPs in combination with photosensitizer Chlorin e6 and Rose Bengal, respectively. The beads are noncytotoxic and their potential application as multifunctional microcarriers for multicolor imaging, ROS generation, and protein crosslinking is demonstrated.
The emerging challenges of the big data era, both in storage density and security, require the development of the next-generation optical data storage materials. Here, we report for the first time a photo-stimulated luminescence (PSL) material, Ba3Ga2O6: Pr³⁺, for rewritable optical storage and write-once-read-many data preservation. Ba3Ga2O6: Pr³⁺, with an isolated deep trap depth in the range 1.26–1.53eV, has been used for data encoding/decoding under the excitation of 254 nm UV light and by the simulation of an 808 nm NIR laser. Meanwhile, the phosphor allows for high-security write-once-read-many optical memory by taking advantage of the irreversible change of the photoluminescence (PL) color from blue to green (a binary blue ‘0’ and green ‘1’ code) irradiated by 365 nm UV light. The comprehensive investigations indicate that the irreversible PL switching is as a result of the order-disorder structural transition by thermal treatment. The new persistent luminescence material not only exhibits promising applications in sustainable rewritable data storage, but also paves the way for write-once-read-many optical information storage with a high level of security.
Laser-induced forward transfer (LIFT) is a rapid laser-patterning technique for high-throughput combinatorial synthesis directly on glass slides. A lack of automation and precision limited LIFT applications to simple proof-of-concept syntheses of fewer than 100 compounds. Here, we report an automated synthesis instrument that combines laser transfer and robotics for parallel synthesis in a microarray format with up to 10000 individual reactions/cm². An optimized pipeline for amide bond formation is the basis for preparing complex peptide microarrays with thousands of different sequences in high yield with high reproducibility. The resulting peptide arrays are of higher quality than commercial peptide arrays. More than 4800 15-residue peptides resembling the entire Ebola virus proteome on a microarray were synthesized to study the antibody response of an Ebola virus infection survivor. We identified known and unknown epitopes that serve now as a basis for Ebola diagnostic development. The versatility and precision of the synthesizer is demonstrated by in situ synthesis of fluorescent molecules via Schiff base reaction and multi-step patterning of precisely definable amounts of fluorophores. This automated laser transfer synthesis approach opens new avenues for high-throughput chemical synthesis and biological screening.
Liquid crystals as responsive materials, for example in organic electronics; the first nanobelts with acene character; CO2 removed directly from the atmosphere; dioxygen converted organocatalytically to hydrogen peroxide; and quinazolinones that can be produced biocatalytically.
The volume of securely encrypted data transmission required by today’s network complexity of people, transactions and interactions increases continuously. To guarantee security of encryption and decryption schemes for exchanging sensitive information, large volumes of true random numbers are required. Here we present a method to exploit the stochastic nature of chemistry by synthesizing DNA strands composed of random nucleotides. We compare three commercial random DNA syntheses giving a measure for robustness and synthesis distribution of nucleotides and show that using DNA for random number generation, we can obtain 7 million GB of randomness from one synthesis run, which can be read out using state-of-the-art sequencing technologies at rates of ca. 300 kB/s. Using the von Neumann algorithm for data compression, we remove bias introduced from human or technological sources and assess randomness using NIST’s statistical test suite.
Dip-pen nanolithography (DPN) is a nanofabrication technique that can be used to directly write molecular patterns on substrates with high resolution and registration. Over the past two decades, DPN has evolved in its ability to transport molecular and material "inks" (e.g., alkanethiols, biological molecules like DNA, viruses, and proteins, polymers, and nanoparticles) to many surfaces in a high-throughput fashion, enabling the synthesis and study of complex chemical and biological structures. In addition, DPN has laid the foundation for a series of related scanning probe methodologies, for example, polymer pen lithography (PPL), scanning probe block copolymer lithography (SPBCL), and beam-pen lithography (BPL), which do not rely on cantilever tips. Structures prepared with these methodologies have been used to understand the consequences of miniaturization and open a door to new capabilities in catalysis, optics, biomedicine, and chemical synthesis, where, in sum, a process originally intended to compete with tools used by the semiconductor industry for rapid prototyping has transcended that application to advanced materials discovery. This review outlines the major DPN advances, the subsequent methods based on the technique, and the opportunities for future fundamental and technological exploration. Most importantly, it commemorates the 20th anniversary of the discovery of DPN.
Aerosol Jet Printing (AJP) is an emerging contactless direct write approach aimed at the production of fine features on a wide range of substrates. Originally developed for the manufacture of electronic circuitry, the technology has been explored for a range of applications, including, active and passive electronic components, actuators, sensors, as well as a variety of selective chemical and biological responses. Freeform deposition, coupled with a relatively large stand-off distance, is enabling researchers to produce devices with increased geometric complexity compared to conventional manufacturing or more commonly used direct write approaches. Wide material compatibility, high resolution and independence of orientation have provided novelty in a number of applications when AJP is conducted as a digitally driven approach for integrated manufacture. This overview of the technology will summarise the underlying principles of AJP, review applications of the technology and discuss the hurdles to more widespread industry adoption. Finally, this paper will hypothesise where gains may be realised through this assistive manufacturing process.
Synthetic DNA is durable and can encode digital data with high density, making it an attractive medium for data storage. However, recovering stored data on a large-scale currently requires all the DNA in a pool to be sequenced, even if only a subset of the information needs to be extracted. Here, we encode and store 35 distinct files (over 200 MB of data), in more than 13 million DNA oligonucleotides, and show that we can recover each file individually and with no errors, using a random access approach. We design and validate a large library of primers that enable individual recovery of all files stored within the DNA. We also develop an algorithm that greatly reduces the sequencing read coverage required for error-free decoding by maximizing information from all sequence reads. These advances demonstrate a viable, large-scale system for DNA data storage and retrieval.
Computer vision is a rapidly growing field devoted to analyzing, modifying, and high-level understanding of images. Its objective is to determine what is happening in front of a camera and use that understanding to control a computer or robotic system, or to provide people with new images that are more informative or aesthetically pleasing than the original camera images. Application areas for computer-vision technology include video surveillance, biometrics, automotive, photography, movie production, Web search, medicine, augmented reality gaming, new user interfaces, and many more.
Molecular data storage is an attractive alternative for dense and durable information storage, which is sorely needed to deal with the growing gap between information production and the ability to store data. DNA is a clear example of effective archival data storage in molecular form. In this Review, we provide an overview of the process, the state of the art in this area and challenges for mainstream adoption. We also survey the field of in vivo molecular memory systems that record and store information within the DNA of living cells, which, together with in vitro DNA data storage, lie at the growing intersection of computer systems and biotechnology. Throughout evolution, DNA has been the primary medium of biological information storage. In this article, Ceze, Nivala and Strauss discuss how DNA can be adopted as a storage medium for custom data, as a potential future complement to current data storage media such as computer hard disks, optical disks and tape. They discuss strategies for coding, decoding and error correction and give examples of implementation both in vitro and in vivo.
Although information is ubiquitous, and its technology arguably among the highest that humankind has produced, its very ubiquity has posed new types of problems. Three that involve storage of information (rather than computation) include its usage of energy, the robustness of stored information over long times, and its ability to resist corruption through tampering. The difficulty in solving these problems using present methods has stimulated interest in the possibilities available through fundamentally different strategies, including storage of information in molecules. Here we show that storage of information in mixtures of readily available, stable, low-molecular-weight molecules offers new approaches to this problem. This procedure uses a common, small set of molecules (here, 32 oligopeptides) to write binary information. It minimizes the time and difficulty of synthesis of new molecules. It also circumvents the challenges of encoding and reading messages in linear macromolecules. We have encoded, written, stored, and read a total of approximately 400 kilobits (both text and images), coded as mixtures of molecules, with greater than 99% recovery of information, written at an average rate of 8 bits/s, and read at a rate of 20 bits/s. This demonstration indicates that organic and analytical chemistry offer many new strategies and capabilities to problems in long-term, zero-energy, robust information storage.
In the past five decades, the semiconductor industry has gone through two distinct eras of scaling: the geometric (or classical) scaling era and the equivalent (or effective) scaling era. As transistor and memory features approach 10 nanometres, it is apparent that room for further scaling in the horizontal direction is running out. In addition, the rise of data abundant computing is exacerbating the interconnect bottleneck that exists in conventional computing architecture between the compute cores and the memory blocks. Here we argue that electronics is poised to enter a new, third era of scaling — hyper-scaling — in which resources are added when needed to meet the demands of data abundant workloads. This era will be driven by advances in beyond-Boltzmann transistors, embedded non-volatile memories, monolithic three-dimensional integration and heterogeneous integration techniques. This Perspective argues that electronics is poised to enter a new era of scaling – hyper-scaling – driven by advances in beyond-Boltzmann transistors, embedded non-volatile memories, monolithic three-dimensional integration, and heterogeneous integration techniques.