Gene and genon concept: coding versus regulation

Institut Jacques Monod, CNRS and Univ. Paris 7, 2, place Jussieu, 75251, Paris-Cedex 5, France.
Theory in Biosciences (Impact Factor: 1.23). 11/2007; 126(2-3):65-113. DOI: 10.1007/s12064-007-0012-x
Source: PubMed


We analyse here the definition of the gene in order to distinguish, on the basis of modern insights from molecular biology, what the gene codes for, namely a specific polypeptide, from how its expression is realized and controlled. Before the coding role of DNA was discovered, a gene was identified with a specific phenotypic trait, from Mendel through Morgan up to Benzer. Subsequently, however, molecular biologists ventured to define a gene at the level of the DNA sequence in terms of coding. As is becoming ever more evident, the relations between information stored at DNA level and functional products are very intricate, and the regulatory aspects are as important and essential as the information coding for products. This approach thus led to a conceptual hybrid that confused coding, regulation and functional aspects. In this essay, we develop a definition of the gene that once again starts from the functional aspect. A cellular function can be represented by a polypeptide or an RNA. In the case of the polypeptide, its biochemical identity is determined by the mRNA prior to translation, and that is where we locate the gene. The steps from specific, but possibly separate, sequence fragments at DNA level to that final mRNA can then be analysed in terms of regulation. For that purpose, we coin the new term "genon". In that manner, we can clearly separate product information from regulatory information while keeping the fundamental relation between coding and function, without the need to introduce a conceptual hybrid. In mRNA, the program regulating the expression of a gene is superimposed onto and added to the coding sequence in cis; we call it the genon. The complementary external control of a given mRNA by trans-acting factors is incorporated in its transgenon. A consequence of this definition is that, in eukaryotes, the gene is, in most cases, not yet present at DNA level. Rather, it is assembled by RNA processing, including differential splicing, from various pieces, as steered by the genon. It emerges finally as an uninterrupted nucleic acid sequence at mRNA level just prior to translation, in faithful correspondence with the amino acid sequence to be produced as a polypeptide. After translation, the genon has fulfilled its role and expires. The distinction between the protein-coding information as materialised in the final polypeptide and the processing information represented by the genon allows us to set up a new information-theoretic scheme. The standard sequence information determined by the genetic code expresses the relation between coding sequence and product. Backward analysis asks from which coding region in the DNA a given polypeptide originates. The (more interesting) forward analysis asks in how many polypeptides, and of how many different types, a given DNA segment is expressed. This concerns the control of the expression process, for which we have introduced the genon concept. Thus, the information-theoretic analysis can capture the complementary aspects of coding and regulation, of gene and genon.
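The contrast between backward and forward analysis can be illustrated with a toy lookup. This is only a sketch under stated assumptions, not the authors' information-theoretic scheme: the fragment and polypeptide names and the table `PRODUCT_TO_FRAGMENTS` are hypothetical, and the sketch counts polypeptide types only, not copy numbers.

```python
# Hypothetical toy data (not from the paper): each polypeptide is listed with
# the DNA-level fragments that RNA processing joined into its mRNA coding sequence.
PRODUCT_TO_FRAGMENTS = {
    "polypeptide_A": ("exon1", "exon2", "exon3"),
    "polypeptide_B": ("exon1", "exon3"),  # differential splicing skips exon2
    "polypeptide_C": ("exon4",),
}


def backward_analysis(product):
    """Return the DNA fragments from which a given polypeptide originates."""
    return PRODUCT_TO_FRAGMENTS[product]


def forward_analysis(fragment):
    """Return the polypeptide types in which a given DNA fragment is expressed."""
    return sorted(
        product
        for product, fragments in PRODUCT_TO_FRAGMENTS.items()
        if fragment in fragments
    )


if __name__ == "__main__":
    print(backward_analysis("polypeptide_B"))
    print(forward_analysis("exon1"))
```

In this toy table, the backward question has a single answer per polypeptide, whereas the forward question depends on which fragment combinations processing actually produces, which is the regulatory side that the abstract assigns to the genon.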


Available from: Scherrer Klaus, Nov 12, 2015
    • "As a result, the delimitation and definition of different types of genes is not as straightforward as it is for molecules at the lower molecular level (e.g. Gerstein et al., 2007; Scherrer and Jost, 2007a,b; Prohaska and Stadler, 2008), because it is not reasonable or practical to use overall compositional and structural identity as a criterion. This explains why the annotation and classification of genes is more difficult and controversial than the description of DNA sequence data. "
    ABSTRACT: The present article discusses the need for standardization in morphology in order to increase comparability and communicability of morphological data. We analyse why only morphological descriptions and not character matrices represent morphological data and why morphological terminology must be free of homology assumptions. We discuss why images only support and substantiate data but are not data themselves. By comparing morphological traits and DNA sequence data we reveal fundamental conceptual shortcomings of the former that result from their high average degree of individuality. We argue that the delimitation of morphological units, of datum units, and of evidence units must be distinguished, each of which involves its own specific problems. We conclude that morphology suffers from the linguistic problem of morphology that results from the lack of (i) a commonly accepted standardized morphological terminology, (ii) a commonly accepted standardized and formalized method of description, and (iii) a rationale for the delimitation of morphological traits. Although this is not problematic for standardizing metadata, it hinders standardizing morphological data. We provide the foundation for a solution to the linguistic problem of morphology, which is based on a morphological structure concept. We argue that this structure concept can be represented with knowledge representation languages such as the resource description framework (RDF) and that it can be applied for morphological descriptions. We conclude with a discussion of how online databases can improve morphological data documentation and how a controlled and formalized morphological vocabulary, i.e. a morphological RDF ontology, if it is based on a structure concept, can provide a possible solution to the linguistic problem of morphology.
    Cladistics 06/2010; 26(3):301-325. DOI:10.1111/j.1096-0031.2009.00286.x · 6.22 Impact Factor
    ABSTRACT: One hundred years have passed since the term "gene" was coined in 1909, and the definition of the gene has been revised many times over that period. The gene has changed from an abstract symbol to a specific segment that can produce a protein or a functional RNA, and the word has become one of the most important terms in biology. With the completion of the genome project, and particularly the Encyclopedia of DNA Elements (ENCODE) project, our knowledge of the complexity and diversity of genomic organization and of the dynamics of genomes has posed important challenges to the classical molecular gene concept. As it becomes more evident that the relations between information stored at DNA level and functional products are very intricate, some consider that it is time to redefine the gene. In this review, we briefly outline the history of the gene definition and the current development of the gene concept.
    Hereditas (Beijing) 05/2010; 32(5):448-54. DOI:10.3724/SP.J.1005.2010.00448
    • "We have been glad to see that our paper (Scherrer and Jost 2007) solicited such insightful or supportive commentaries as those of Noble, of Gros, of Prohaska and Stadler, of Forsdyke, and Billeter, as well as the alternative proposal of Stadler et al., and we hope that this will trigger further conceptual discussions about the definition of the gene and inspire further research about programs of gene expression, in the light of recent advances in molecular biology and bioinformatics (Billeter 2009; Forsdyke 2009; Gros 2009; Noble 2009; Prohaska and Stadler 2008; Stadler et al. 2009). The commentaries raise some important issues. "

    Theory in Biosciences 09/2009; 128(3). DOI:10.1007/s12064-009-0069-9 · 1.23 Impact Factor