A model of evolution with constant selective pressure for regulatory DNA sites.

Institute for Information Transmission Problems (the Kharkevich Institute) of RAS, Bolshoi Karetny pereulok, 19, GSP-4, Moscow, 127994, Russia.
BMC Evolutionary Biology (Impact Factor: 3.29). 02/2007; 7:125. DOI:10.1186/1471-2148-7-125
Source: PubMed

ABSTRACT Molecular evolution is usually described assuming a neutral or weakly non-neutral substitution model. Recently, new data have become available on the evolution of sequence regions under selective pressure, e.g. transcription factor binding sites. To reconstruct the evolutionary history of such sequences, one needs evolutionary models that take into account a substantial constant selective pressure.
We present a simple evolutionary model with a single preferred (consensus) nucleotide and a neutral substitution model for all other nucleotides. The model has a rate matrix in which all substitutions that do not involve the consensus nucleotide occur at the same rate. The model has two time scales for reaching the stationary distribution; in the general case only one of the two rate parameters can be evaluated from the stationary distribution. At intermediate times, counterintuitive behavior was observed for some parameter values: the probability of conservation of a non-consensus nucleotide was greater than that of the consensus nucleotide. Such an effect can be observed only when the preference for the consensus nucleotide is weak, that is, when the probability of observing the consensus nucleotide in the stationary distribution is less than 1/2. If the substitution rate is represented as a product of mutation and fixation, only the fixation can be calculated from the stationary distribution. The observed conservation of non-consensus nucleotides does not occur if the elements of the mutation matrix are identical, and can be attributed to a reduced mutation rate between non-consensus nucleotides. This bias can have no effect on the stationary distribution of nucleotide frequencies calculated over an ensemble of multiple alignments, e.g. transcription factor binding sites upstream of different sets of co-regulated orthologous genes.
The derived model can be used as a null model when analyzing the evolution of orthologous transcription factor binding sites. In particular, our findings show that a nucleotide preferred at some position of a multiple alignment of binding sites for some transcription factor in the same genome is not necessarily the most conserved nucleotide in an alignment of orthologous sites from different species. However, this effect can occur only when the elements of the mutation matrix are not identical.
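
The behavior described in the abstract can be explored with a minimal sketch of a four-state rate matrix. The parameterisation below is an assumption made for illustration, not the paper's notation: `a` is the rate from any non-consensus nucleotide to the consensus, `b` is the rate from the consensus to each non-consensus nucleotide, and `g` is the common rate between non-consensus nucleotides. All function names are hypothetical. Under this assumption the two decay rates are `a + 3b` and `a + 3g`, and the stationary probability of the consensus is `a / (a + 3b)`, which depends only on the ratio `b/a`.

```python
import math


def rate_matrix(a, b, g):
    """Return a 4x4 rate matrix with state 0 as the consensus nucleotide.

    Hypothetical parameter names (assumptions, not taken from the paper):
      a: rate from any non-consensus nucleotide to the consensus
      b: rate from the consensus to each non-consensus nucleotide
      g: rate between any two non-consensus nucleotides
    """
    Q = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            if i == j:
                continue
            if i == 0:
                Q[i][j] = b
            elif j == 0:
                Q[i][j] = a
            else:
                Q[i][j] = g
        Q[i][i] = -sum(Q[i])  # each row of a rate matrix sums to zero
    return Q


def stationary_consensus(a, b):
    """Stationary probability of the consensus; depends only on b/a."""
    return a / (a + 3.0 * b)


def conservation(a, b, g, t):
    """Closed-form probability of finding a site in its starting state at time t.

    Returns (consensus, non_consensus) for this parameterisation.
    """
    pi0 = stationary_consensus(a, b)
    mix = math.exp(-(a + 3.0 * b) * t)     # consensus vs non-consensus mixing
    within = math.exp(-(a + 3.0 * g) * t)  # mixing among non-consensus states
    p_cons = pi0 + (1.0 - pi0) * mix
    p_non = ((1.0 - pi0) + pi0 * mix) / 3.0 + 2.0 * within / 3.0
    return p_cons, p_non


def expm_taylor(Q, t, terms=40):
    """Truncated Taylor series for exp(Q t); adequate only for small |Q t|."""
    n = len(Q)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term <- term @ (Q t) / k
        term = [[sum(term[i][m] * Q[m][j] for m in range(n)) * t / k
                 for j in range(n)] for i in range(n)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


if __name__ == "__main__":
    a, b, g, t = 1.0, 0.5, 0.05, 0.3
    p_cons, p_non = conservation(a, b, g, t)
    P = expm_taylor(rate_matrix(a, b, g), t)
    print("stationary consensus probability:", stationary_consensus(a, b))
    print("consensus conserved:", p_cons, "numerical check:", P[0][0])
    print("non-consensus conserved:", p_non, "numerical check:", P[1][1])
```

With the illustrative values `a = 1.0`, `b = 0.5`, `g = 0.05`, the stationary consensus probability is 0.4 (below 1/2), and at `t = 0.3` the non-consensus conservation probability (roughly 0.74) exceeds the consensus one (roughly 0.68). In this sketch the effect appears at short times when `a + 2g < 3b`, which requires `a < 3b` and hence a stationary consensus probability below 1/2; raising `g` to 0.5 removes it.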

  • ABSTRACT: Comparative computer-assisted analysis was used to study putative GlpR regulons responsible for the metabolism of glycerol and glycerol-3-phosphate in genomes of alpha-, beta-, and gamma-proteobacteria. New palindromic GlpR-binding signals were identified in gamma-proteobacteria; the consensus sequences are TGTTCGATAACGAACA for Enterobacteriaceae, wTTTTCGTATACGAAAAw for Pseudomonadaceae, and AATGCTCGATCGAGCATT for Vibrionaceae. Signals were also identified in alpha- and beta-proteobacteria: they contained 3-4 direct TTTCGTT repeats separated by 3-4 nucleotide pairs.
    Molekuliarnaia biologiia 01/2003; 37(5):843-9.
  • ABSTRACT: Most of the sophisticated methods to estimate evolutionary divergence between DNA sequences assume that the two sequences have evolved with the same pattern of nucleotide substitution after their divergence from their most recent common ancestor (homogeneity assumption). If this assumption is violated, the evolutionary distance estimated will be biased, which may result in biased estimates of divergence times and substitution rates, and may lead to erroneous branching patterns in the inferred phylogenies. Here we present a simple modification for existing distance estimation methods to relax the assumption of the substitution pattern homogeneity among lineages when analyzing DNA and protein sequences. Results from computer simulations and empirical data analyses for human and mouse genes are presented to demonstrate that the proposed modification reduces the estimation bias considerably and that the modified method performs much better than the LogDet methods, which do not require the homogeneity assumption in estimating the number of substitutions per site. We also discuss the relationship of the substitution and mutation rate estimates when the substitution pattern is not the same in the lineages leading to the two sequences compared.
    Molecular Biology and Evolution 11/2002; 19(10):1727-36. · 10.35 Impact Factor
  • ABSTRACT: Charles Darwin proposed that evolution occurs primarily by natural selection, but this view has been controversial from the beginning. Two of the major opposing views have been mutationism and neutralism. Early molecular studies suggested that most amino acid substitutions in proteins are neutral or nearly neutral and the functional change of proteins occurs by a few key amino acid substitutions. This suggestion generated an intense controversy over selectionism and neutralism. This controversy is partially caused by Kimura's definition of neutrality, which was too strict (|2Ns| ≤ 1). If we define neutral mutations as the mutations that do not change the function of gene products appreciably, many controversies disappear because slightly deleterious and slightly advantageous mutations are engulfed by neutral mutations. The ratio of the rate of nonsynonymous nucleotide substitution to that of synonymous substitution is a useful quantity to study positive Darwinian selection operating at highly variable genetic loci, but it does not necessarily detect adaptively important codons. Previously, multigene families were thought to evolve following the model of concerted evolution, but new evidence indicates that most of them evolve by a birth-and-death process of duplicate genes. It is now clear that most phenotypic characters or genetic systems such as the adaptive immune system in vertebrates are controlled by the interaction of a number of multigene families, which are often evolutionarily related and are subject to birth-and-death evolution. Therefore, it is important to study the mechanisms of gene family interaction for understanding phenotypic evolution. Because gene duplication occurs more or less at random, phenotypic evolution contains some fortuitous elements, though the environmental factors also play an important role. The randomness of phenotypic evolution is qualitatively different from allele frequency changes by random genetic drift. However, there is some similarity between phenotypic and molecular evolution with respect to functional or environmental constraints and evolutionary rate. It appears that mutation (including gene duplication and other DNA changes) is the driving force of evolution at both the genic and the phenotypic levels.
    Molecular Biology and Evolution 01/2006; 22(12):2318-42. · 10.35 Impact Factor
