Is junk DNA bunk? A critique of ENCODE

Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS, Canada B3H 4R2.
Proceedings of the National Academy of Sciences (Impact Factor: 9.67). 03/2013; 110(14). DOI: 10.1073/pnas.1221376110
Source: PubMed
Do data from the Encyclopedia of DNA Elements (ENCODE) project render the notion of junk DNA obsolete? Here, I review older arguments for junk grounded in the C-value paradox and propose a thought experiment to challenge ENCODE's ontology. Specifically, what would we expect for the number of functional elements (as ENCODE defines them) in genomes much larger than our own genome? If the number were to stay more or less constant, it would seem sensible to consider the rest of the DNA of larger genomes to be junk or, at least, assign it a different sort of role (structural rather than informational). If, however, the number of functional elements were to rise significantly with C-value, then either (i) organisms with genomes larger than our genome are more complex phenotypically than we are, (ii) ENCODE's definition of functional element identifies many sites that would not be considered functional or phenotype-determining by standard uses in biology, or (iii) the same phenotypic functions are often determined in a more diffuse fashion in larger-genomed organisms. Good cases can be made for propositions ii and iii. A larger theoretical framework, embracing informational and structural roles for DNA, neutral as well as adaptive causes of complexity, and selection as a multilevel phenomenon, is needed.

  • Source
    • "Furthermore, they are often accompanied by mutations in nearby bases [124, 158], further compounding the involvement of mutational mechanisms in their origination. 22 along with some attempts to reconcile the opposing views [34, 33, 93]. as a whole: the network gradually changes as it finds where it can give way under this complex set of forces. Thus, through mutational writing, the network processes a large amount of information under natural selection."
    ABSTRACT: The theory of interaction-based evolution argues that, at the most basic level of analysis, there is a third alternative for how adaptive evolution works besides a) accidental mutation and natural selection and b) Lamarckism, namely, c) information provided by natural selection on the fit between the organism and its environment is absorbed by non-accidental mutation. This non-accidental mutation is non-Lamarckian yet useful for evolution, and is due to evolved and continually evolving mutational mechanisms operating in the germ cells. However, this theory has left a fundamental problem open: If mutational mechanisms are not Lamarckian (if they are not "aware" of the environment and the macroscale phenotype), then how could heritable novelty be due to anything other than accidental mutation? This paper aims to address this question by arguing the following. Mutational mechanisms can be broadly construed as enacting local simplification operations on the DNA in germ cells, along with gene duplication. The joint action of these mutational operations and natural selection provides simplification under performance pressure. This joint action creates from preexisting biological interactions new elements that have the inherent capacity to come together into unexpected useful interactions with other such elements, thus explaining nature's tendency for cooption. Novelty thus arises not from a local genetic accident but from gradual network-level evolution. Many empirical observations are explained from this perspective, from cooption and gene fusion at the molecular level, to the evolution of behavior and instinct at the organismal level. Finally, the nature of mutational mechanisms and the need to study them in detail are described, and a connection is drawn between evolution and learning.
    Preview · Article · May 2016
  • Source
    • "The now well-established phenomenon of pervasive transcription [23, 24, 25], that has triggered the (in)famous debate around the results of the ENCODE project [26, 27, 28, 29, 30], when pitted against the formal considerations outlined above, suggests a radical line of thinking on the nature of biological information and meaning. The indisputable findings that (nearly) all sequences in complex genomes, such as human, are transcribed at some level (at least in some cell types and at some life stages) most likely fit the same population genetic perspective [20, 21, 22]."
    ABSTRACT: Biological information encoded in genomes is fundamentally different from and effectively orthogonal to Shannon entropy. The biologically relevant concept of information has to do with 'meaning', i.e. encoding various biological functions with various degrees of evolutionary conservation. Apart from direct experimentation, the meaning, or biological information content, can be extracted and quantified from alignments of homologous nucleotide or amino acid sequences but generally not from a single sequence, using appropriately modified information theoretical formulae. For short, information encoded in genomes is defined vertically but not horizontally. Informally but substantially, biological information density seems to be equivalent to 'meaning' of genomic sequences that spans the entire range from sharply defined, universal meaning to effective meaninglessness. Large fractions of genomes, up to 90% in some plants, belong within the domain of fuzzy meaning. The sequences with fuzzy meaning can be recruited for various functions, with the meaning subsequently fixed, and also could perform generic functional roles that do not require sequence conservation. Biological meaning is continuously transferred between the genomes of selfish elements and hosts in the process of their coevolution. Thus, in order to adequately describe genome function and evolution, the concepts of information theory have to be adapted to incorporate the notion of meaning that is central to biology.
    Preview · Article · Mar 2016 · Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
  • Source
    • "Our understanding of TF-DNA interactions may therefore be, at best, 70% complete, notwithstanding that many TFs have context-specific binding affinities dependent upon co-factors, cell type, cellular activation status, and/or genetic background [101,102]. Beyond this, we have only a partial catalogue of uDBP recognition sites, and although we now have foundational in vitro chromatin feature data for key cell types, the in vivo relevance of these features and their consistency across genetic backgrounds is not fully established [103]. Addressing these gaps will require continued systematic data aggregation with complementary development of statistical methods, such as improved approaches for modeling TF sequence specificity [104]."
    ABSTRACT: Psoriasis is a cytokine-mediated skin disease that can be treated effectively with immunosuppressive biologic agents. These medications, however, are not equally effective in all patients and are poorly suited for treating mild psoriasis. To develop more targeted therapies, interfering with transcription factor (TF) activity is a promising strategy. Meta-analysis was used to identify differentially expressed genes (DEGs) in the lesional skin from psoriasis patients (n = 237). We compiled a dictionary of 2935 binding sites representing empirically-determined binding affinities of TFs and unconventional DNA-binding proteins (uDBPs). This dictionary was screened to identify "psoriasis response elements" (PREs) overrepresented in sequences upstream of psoriasis DEGs. PREs are recognized by IRF1, ISGF3, NF-kappaB and multiple TFs with helix-turn-helix (homeo) or other all-alpha-helical (high-mobility group) DNA-binding domains. We identified a limited set of DEGs that encode proteins interacting with PRE motifs, including TFs (GATA3, EHF, FOXM1, SOX5) and uDBPs (AVEN, RBM8A, GPAM, WISP2). PREs were prominent within enhancer regions near cytokine-encoding DEGs (IL17A, IL19 and IL1B), suggesting that PREs might be incorporated into complex decoy oligonucleotides (cdODNs). To illustrate this idea, we designed a cdODN to concomitantly target psoriasis-activated TFs (i.e., FOXM1, ISGF3, IRF1 and NF-kappaB). Finally, we screened psoriasis-associated SNPs to identify risk alleles that disrupt or engender PRE motifs. This identified possible sites of allele-specific TF/uDBP binding and showed that PREs are disproportionately disrupted by psoriasis risk alleles. We identified new TF/uDBP candidates and developed an approach that (i) connects transcriptome informatics to cdODN drug development and (ii) enhances our ability to interpret GWAS findings. Disruption of PRE motifs by psoriasis risk alleles may contribute to disease susceptibility.
    Full-text · Article · Dec 2015 · Clinical and Translational Medicine