A proposal for a DNA-based computer code

International Invention Journal of Biochemistry and Bioinformatics 10/2013; 1(1):1-4.


DNA has become an attractive medium for storing information in future biocomputers because it can hold a large amount of information in very little physical volume. Over the last decade, the order of nucleotides (nt) has been regarded as the best way to store large amounts of data. However, existing proposals for this approach have weaknesses. I present a new coding system for DNA-based computing that uses 4 nt per symbol. The code converts the ASCII number of each of the 256 computer symbols into a base-4 number and assigns the nucleotides A, T, C, and G to the digits 0, 1, 2, and 3, respectively. The encoding offers uniformity, because every symbol is coded with 4 nt; consistency, because of a one-to-one (biunivocal) relationship between symbols and tetraplets; homogeneity, because similar symbols share the same first nt; intuitiveness in locating reading frames; and error resistance, owing to shorter sequences, homogeneity in the first nt, and almost no nt repeats longer than two. This coding system will provide a more efficient way to implement DNA-based information storage and will thus help in designing upcoming biocomputers.
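The mapping described in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not spell out: each code number is written as exactly four base-4 digits, left-padded with zeros and most significant digit first (which is consistent with similar symbols sharing the same first nt). The function names `encode_symbol`, `encode`, and `decode` are hypothetical and are not taken from the paper.

```python
# Digit-to-nucleotide assignment from the abstract: 0 -> A, 1 -> T, 2 -> C, 3 -> G.
NUCLEOTIDES = "ATCG"


def encode_symbol(code: int) -> str:
    """Return the tetraplet for one symbol code (0..255).

    Assumption: four base-4 digits, most significant digit first.
    """
    if not 0 <= code <= 255:
        raise ValueError("symbol code must be in the range 0..255")
    # Each base-4 digit is two bits; take them from high to low.
    return "".join(NUCLEOTIDES[(code >> shift) & 0b11] for shift in (6, 4, 2, 0))


def encode(data: bytes) -> str:
    """Encode a byte string as a nucleotide sequence, 4 nt per byte."""
    return "".join(encode_symbol(byte) for byte in data)


def decode(dna: str) -> bytes:
    """Invert encode(): read the sequence in frames of 4 nt."""
    if len(dna) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for start in range(0, len(dna), 4):
        value = 0
        for nt in dna[start:start + 4]:
            # str.index raises ValueError for anything other than A, T, C, G.
            value = value * 4 + NUCLEOTIDES.index(nt)
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    message = b"Hi"
    dna = encode(message)
    print(dna)                      # TACATCCT
    print(decode(dna) == message)   # True
```

Under these assumptions, every code from 64 to 127 (which includes the upper- and lowercase Latin letters) has the leading base-4 digit 1 and therefore starts with T, matching the homogeneity property claimed in the abstract.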



Available from: Alfonso Jimenez-Sanchez, Oct 04, 2015
  • Source
    Science 10/2001; 293(5536):1763-5. · 33.61 Impact Factor
  • Source
    ABSTRACT: The unique properties of DNA make it a fundamental building block in the fields of supramolecular chemistry, nanotechnology, nano-circuits, molecular switches, molecular devices, and molecular computing. In our recently introduced autonomous molecular automaton, DNA molecules serve as input, output, and software, and the hardware consists of DNA restriction and ligation enzymes using ATP as fuel. In addition to information, DNA stores energy, available on hybridization of complementary strands or hydrolysis of its phosphodiester backbone. Here we show that a single DNA molecule can provide both the input data and all of the necessary fuel for a molecular automaton. Each computational step of the automaton consists of a reversible software molecule-input molecule hybridization followed by an irreversible software-directed cleavage of the input molecule, which drives the computation forward by increasing entropy and releasing heat. The cleavage uses a hitherto unknown capability of the restriction enzyme FokI, which serves as the hardware, to operate on a noncovalent software-input hybrid. In the previous automaton, software-input ligation consumed one software molecule and two ATP molecules per step. As ligation is not performed in this automaton, a fixed amount of software and hardware molecules can, in principle, process any input molecule of any length without external energy supply. Our experiments demonstrate 3 × 10^12 automata per µl performing 6.6 × 10^10 transitions per second per µl with transition fidelity of 99.9%, dissipating about 5 × 10^-9 W/µl as heat at ambient temperature.
    Proceedings of the National Academy of Sciences 04/2003; 100(5):2191-6. DOI:10.1073/pnas.0535624100 · 9.67 Impact Factor
  • Source
    ABSTRACT: An improved Huffman coding method for information storage in DNA is described. The method entails the utilization of modified unambiguous base assignment that enables efficient coding of characters. A plasmid-based library with efficient and reliable information retrieval and assembly with uniquely designed primers is described. We illustrate our approach by synthesis of DNA that encodes text, images, and music, which could easily be retrieved by DNA sequencing using the specific primers. The method is simple and lends itself to automated information retrieval.
    BioTechniques 09/2009; 47(3):747-54. DOI:10.2144/000113218 · 2.95 Impact Factor