Yong E Zhang

Peking University, Beijing, Beijing Shi, China

Publications (10) · 108.66 Total impact

  • Article: Adaptive Evolution and the Birth of CTCF Binding Sites in the Drosophila Genome.
    ABSTRACT: Changes in the physical interaction between cis-regulatory DNA sequences and proteins drive the evolution of gene expression. However, it has proven difficult to accurately quantify evolutionary rates of such binding change or to estimate the relative effects of selection and drift in shaping the binding evolution. Here we examine the genome-wide binding of CTCF in four species of Drosophila separated by between ∼2.5 and 25 million years. CTCF is a highly conserved protein known to be associated with insulator sequences in the genomes of human and Drosophila. Although the binding preference for CTCF is highly conserved, we find that CTCF binding itself is highly evolutionarily dynamic and has adaptively evolved. Between species, binding divergence increased linearly with evolutionary distance, and CTCF binding profiles are diverging rapidly at the rate of 2.22% per million years (Myr). At least 89 new CTCF binding sites have originated in the Drosophila melanogaster genome since the most recent common ancestor with Drosophila simulans. Comparing these data to genome sequence data from 37 different strains of Drosophila melanogaster, we detected signatures of selection in both newly gained and evolutionarily conserved binding sites. Newly evolved CTCF binding sites show a significantly stronger signature for positive selection than older sites. Comparative gene expression profiling revealed that expression divergence of genes adjacent to CTCF binding site is significantly associated with the gain and loss of CTCF binding. Further, the birth of new genes is associated with the birth of new CTCF binding sites. Our data indicate that binding of Drosophila CTCF protein has evolved under natural selection, and CTCF binding evolution has shaped both the evolution of gene expression and genome evolution during the birth of new genes.
    PLoS Biology 11/2012; 10(11):e1001420. · 11.45 Impact Factor
  • Article: Segmental dataset and whole body expression data do not support the hypothesis that non-random movement is an intrinsic property of Drosophila retrogenes.
    ABSTRACT: BACKGROUND: Several studies in Drosophila have shown excessive movement of retrogenes from the X chromosome to autosomes, and that these genes are frequently expressed in the testis. This phenomenon has led to several hypotheses invoking natural selection as the process driving male-biased genes to the autosomes. Metta and Schlotterer (BMC Evol Biol 2010, 10:114) analyzed a set of retrogenes where the parental gene has been subsequently lost. They assumed that this class of retrogenes replaced the ancestral functions of the parental gene, and reported that these retrogenes, although mostly originating from movement out of the X chromosome, showed female-biased or unbiased expression. These observations led the authors to suggest that selective forces (such as meiotic sex chromosome inactivation and sexual antagonism) were not responsible for the observed pattern of retrogene movement out of the X chromosome. RESULTS: We reanalyzed the dataset published by Metta and Schlotterer and found several issues that led us to a different conclusion. In particular, Metta and Schlotterer used a dataset combined with expression data in which significant sex-biased expression is not detectable. First, the authors used a segmental dataset where the genes selected for analysis were less testis-biased in expression than those that were excluded from the study. Second, sex-biased expression was defined by comparing male and female whole-body data and not the expression of these genes in gonadal tissues. This approach significantly reduces the probability of detecting sex-biased expressed genes, which explains why the vast majority of the genes analyzed (parental and retrogenes) were equally expressed in both males and females. Third, the female-biased expression observed by Metta and Schlotterer is mostly found for parental genes located on the X chromosome, which is known to be enriched with genes with female-biased expression. 
    Fourth, using additional gonad expression data, we found that autosomal genes analyzed by Metta and Schlotterer are less up regulated in ovaries and have higher chance to be expressed in meiotic cells of spermatogenesis when compared to X-linked genes. CONCLUSIONS: The criteria used to select retrogenes and the sex-biased expression data based on whole adult flies generated a segmental dataset of female-biased and unbiased expressed genes that was unable to detect the higher propensity of autosomal retrogenes to be expressed in males. Thus, there is no support for the authors' view that the movement of new retrogenes, which originated from X-linked parental genes, was not driven by selection. Therefore, selection-based genetic models remain the most parsimonious explanations for the observed chromosomal distribution of retrogenes.
    BMC Evolutionary Biology 09/2012; 12(1):169. · 3.52 Impact Factor
  • Article: Re-analysis of the larval testis data on meiotic sex chromosome inactivation revealed evidence for tissue-specific gene expression related to the drosophila X chromosome.
    ABSTRACT: Meiotic sex chromosome inactivation (MSCI) during spermatogenesis has been proposed as one of the evolutionary driving forces behind both the under-representation of male-biased genes on, and the gene movement out of, the X chromosome in Drosophila. However, the relevance of MSCI in shaping sex chromosome evolution is controversial. Here we examine two aspects of a recent study on testis gene expression (Mikhaylova and Nurminsky, BMC Biol 2011, 9:29) that failed to support the MSCI in Drosophila. First, Mikhaylova and Nurminsky found no differences between X-linked and autosomal genes based on the transcriptional profiling of the early testis development, and thus concluded that MSCI does not occur in D. melanogaster. Second, they also analyzed expression data from several D. melanogaster tissues and concluded that under-representation on the X chromosome is not an exclusive property of testis-biased genes, but instead, a general property of tissue-specific genes. By re-analyzing the Mikhaylova and Nurminsky's testis data and the expression data on several D. melanogaster tissues, we made two major findings that refuted their original claims. First, the developmental testis data has generally greater experimental error than conventional analyses, which reduced significantly the power to detect chromosomal differences in expression. Nevertheless, our re-analysis observed significantly lower expression of the X chromosome in the genomic transcriptomes of later development stages of the testis, which is consistent with the MSCI hypothesis. Second, tissue-specific genes are also in general enriched with genes more expressed in testes than in ovaries, that is testis-biased genes. By completely excluding from the analyses the testis-biased genes, which are known to be under-represented in the X, we found that all the other tissue-specific genes are randomly distributed between the X chromosome and the autosomes. 
    Our findings negate the original study of Mikhaylova and Nurminsky, which concluded a lack of MSCI and generalized the pattern of paucity in the X chromosome for tissue-specific genes in Drosophila. Therefore, MSCI and other selection-based models such as sexual antagonism, dosage compensation, and meiotic-drive continue to be viable models as driving forces shaping the genomic distribution of male-related genes in Drosophila.
    BMC Biology 06/2012; 10:49; author reply 50. · 5.75 Impact Factor
  • Article: Reshaping of global gene expression networks and sex-biased gene expression by integration of a young gene.
    ABSTRACT: New genes originate frequently across diverse taxa. Given that genetic networks are typically comprised of robust, co-evolved interactions, the emergence of new genes raises an intriguing question: how do new genes interact with pre-existing genes? Here, we show that a recently originated gene rapidly evolved new gene networks and impacted sex-biased gene expression in Drosophila. This 4-6 million-year-old factor, named Zeus for its role in male fecundity, originated through retroposition of a highly conserved housekeeping gene, Caf40. Zeus acquired male reproductive organ expression patterns and phenotypes. Comparative expression profiling of mutants and closely related species revealed that Zeus has recruited a new set of downstream genes, and shaped the evolution of gene expression in germline. Comparative ChIP-chip revealed that the genomic binding profile of Zeus diverged rapidly from Caf40. These data demonstrate, for the first time, how a new gene quickly evolved novel networks governing essential biological processes at the genomic level.
    The EMBO Journal 04/2012; 31(12):2798-809. · 9.20 Impact Factor
  • Article: Accelerated recruitment of new brain development genes into the human genome.
    ABSTRACT: How the human brain evolved has attracted tremendous interests for decades. Motivated by case studies of primate-specific genes implicated in brain function, we examined whether or not the young genes, those emerging genome-wide in the lineages specific to the primates or rodents, showed distinct spatial and temporal patterns of transcription compared to old genes, which had existed before primate and rodent split. We found consistent patterns across different sources of expression data: there is a significantly larger proportion of young genes expressed in the fetal or infant brain of humans than in mouse, and more young genes in humans have expression biased toward early developing brains than old genes. Most of these young genes are expressed in the evolutionarily newest part of human brain, the neocortex. Remarkably, we also identified a number of human-specific genes which are expressed in the prefrontal cortex, which is implicated in complex cognitive behaviors. The young genes upregulated in the early developing human brain play diverse functional roles, with a significant enrichment of transcription factors. Genes originating from different mechanisms show a similar expression bias in the developing brain. Moreover, we found that the young genes upregulated in early brain development showed rapid protein evolution compared to old genes also expressed in the fetal brain. Strikingly, genes expressed in the neocortex arose soon after its morphological origin. These four lines of evidence suggest that positive selection for brain function may have contributed to the origination of young genes expressed in the developing brain. These data demonstrate a striking recruitment of new genes into the early development of the human brain.
    PLoS Biology 10/2011; 9(10):e1001179. · 11.45 Impact Factor
  • Article: A cautionary note for retrocopy identification: DNA-based duplication of intron-containing genes significantly contributes to the origination of single exon genes.
    ABSTRACT: Retrocopies are important genes in the genomes of almost all higher eukaryotes. However, the annotation of such genes is a non-trivial task. Intronless genes have often been considered to be retroposed copies of intron-containing paralogs. Such categorization relies on the implicit premise that alignable regions of the duplicates should be long enough to cover exon-exon junctions of the intron-containing genes, and thus intron loss events can be inferred. Here, we examined the alternative possibility that intronless genes could be generated by partial DNA-based duplication of intron-containing genes in the fruitfly genome. By building pairwise protein-, transcript- and genome-level DNA alignments between intronless genes and their corresponding intron-containing paralogs, we found that alignments do not cover exon-exon junctions in 40% of cases and thus no intron loss could be inferred. For these cases, the candidate parental proteins tend to be partially duplicated, and intergenic sequences or neighboring genes are included in the intronless paralog. Moreover, we observed that it is significantly less likely for these paralogs to show inter-chromosomal duplication and testis-dominant transcription, compared to the remaining 60% of cases with evidence of clear intron loss (retrogenes). These lines of analysis reveal that DNA-based duplication contributes significantly to the 40% of cases of single exon gene duplication. Finally, we performed an analogous survey in the human genome and the result is similar, wherein 34% of the cases do not cover exon-exon junctions. Thus, genome annotation for retrogene identification should discard candidates without clear evidence of intron loss. Contact: mlong@uchicago.edu; zhangy@uchicago.edu
    Bioinformatics 07/2011; 27(13):1749-53. · 5.47 Impact Factor
  • Article: Deficiency of X-linked inverted duplicates with male-biased expression and the underlying evolutionary mechanisms in the Drosophila genome.
    ABSTRACT: Inverted duplicates (IDs) are pervasive in genomes and have been reported to play functional roles in various biological processes. However, the general underlying evolutionary forces that maintain IDs in genomes remain largely elusive. Through a systematic screening of the Drosophila melanogaster genome, 20,223 IDs were detected in nonrepetitive intergenic regions, far more than expectation under the neutrality model. 3,846 of these IDs were identified to have stable hairpin structure (i.e., the structural IDs). Based on whole-genome transcriptome profiling data, we found 628 unannotated expressed structural IDs, which had significantly different genomic distributions and structural properties from the unexpressed IDs. Among the expressed structural IDs, 130 exhibited higher expression in males than in females (i.e., male-biased expression). Compared with sex-unbiased ones, these male-biased IDs were significantly underrepresented on the X chromosome, similar to previously reported pattern of male-biased protein-coding genes. These analyses suggest that a selection-driven process, rather than a purely neutral mutation-driven mechanism, contributes to the maintenance of IDs in the Drosophila genome.
    Molecular Biology and Evolution 05/2011; 28(10):2823-32. · 5.55 Impact Factor
  • Article: New genes in Drosophila quickly become essential.
    Sidi Chen, Yong E Zhang, Manyuan Long
    ABSTRACT: To investigate the origin and evolution of essential genes, we identified and phenotyped 195 young protein-coding genes, which originated 3 to 35 million years ago in Drosophila. Knocking down expression with RNA interference showed that 30% of newly arisen genes are essential for viability. The proportion of genes that are essential is similar in every evolutionary age group that we examined. Under constitutive silencing of these young essential genes, lethality was high in the pupal stage and also found in the larval stages. Lethality was attributed to diverse cellular and developmental defects, such as organ formation and patterning defects. These data suggest that new genes frequently and rapidly evolve essential functions and participate in development.
    Science 12/2010; 330(6011):1682-5. · 31.20 Impact Factor
  • Article: Age-dependent chromosomal distribution of male-biased genes in Drosophila.
    ABSTRACT: We investigated the correlation between the chromosomal location and age distribution of new male-biased genes formed by duplications via DNA intermediates (DNA-level) or by de novo origination in Drosophila. Our genome-wide analysis revealed an excess of young X-linked male-biased genes. The proportion of X-linked male-biased genes then diminishes through time, leading to an autosomal excess of male-biased genes. The switch between X-linked and autosomal enrichment of male-biased genes was also present in the distribution of both protein-coding genes on the D. pseudoobscura neo-X chromosome and microRNA genes of D. melanogaster. These observations revealed that the evolution of male-biased genes is more complicated than the previously detected one-step X→A gene traffic and the enrichment of the male-biased genes on autosomes. The pattern we detected suggests that the interaction of various evolutionary forces such as the meiotic sex chromosome inactivation (MSCI), faster-X effect, and sexual antagonism in the male germline might have shaped the chromosomal distribution of male-biased genes on different evolutionary time scales.
    Genome Research 11/2010; 20(11):1526-33. · 13.61 Impact Factor
  • Article: Chromosomal redistribution of male-biased genes in mammalian evolution with two bursts of gene gain on the X chromosome.
    ABSTRACT: Mammalian X chromosomes evolved under various mechanisms including sexual antagonism, the faster-X process, and meiotic sex chromosome inactivation (MSCI). These forces may contribute to nonrandom chromosomal distribution of sex-biased genes. In order to understand the evolution of gene content on the X chromosome and autosome under these forces, we dated human and mouse protein-coding genes and miRNA genes on the vertebrate phylogenetic tree. We found that the X chromosome recently acquired a burst of young male-biased genes, which is consistent with fixation of recessive male-beneficial alleles by sexual antagonism. For genes originating earlier, however, this pattern diminishes and finally reverses with an overrepresentation of the oldest male-biased genes on autosomes. MSCI contributes to this dynamic since it silences X-linked old genes but not X-linked young genes. This demasculinization process seems to be associated with feminization of the X chromosome with more X-linked old genes expressed in ovaries. Moreover, we detected another burst of gene originations after the split of eutherian mammals and opossum, and these genes were quickly incorporated into transcriptional networks of multiple tissues. Preexisting X-linked genes also show significantly higher protein-level evolution during this period compared to autosomal genes, suggesting positive selection accompanied the early evolution of mammalian X chromosomes. These two findings cast new light on the evolutionary history of the mammalian X chromosome in terms of gene gain, sequence, and expressional evolution.
    PLoS Biology 01/2010; 8(10). · 11.45 Impact Factor