Lesheng Kong

University of Oxford, Oxford, ENG, United Kingdom

Publications (10) · 142.47 Total impact

  • Article: Evidence for conserved post-transcriptional roles of unitary pseudogenes and for frequent bifunctionality of mRNAs.
    ABSTRACT: BACKGROUND: Recent reports have highlighted instances of mRNAs that, in addition to coding for protein, regulate the abundance of related transcripts by altering miRNA availability. These two mRNA roles - one mediated by RNA and the other by protein - are inter-dependent and hence cannot easily be separated. Whether the RNA-mediated role of transcripts is important, per se, or whether it is a relatively innocuous consequence of competition by different transcripts for miRNA-binding remains unknown. RESULTS: Here we took advantage of 48 loci that encoded proteins in the earliest eutherian ancestor, but whose protein-coding capability has since been lost specifically during rodent evolution. Sixty-five percent of such loci, which we term 'unitary pseudogenes', have retained their expression in mouse and their transcripts exhibit conserved tissue expression profiles. The maintenance of these unitary pseudogenes' spatial expression profiles is associated with conservation of their miRNA response elements and these appear to preserve the post-transcriptional roles of their protein-coding ancestor. We used mouse Pbcas4, an exemplar of these transcribed unitary pseudogenes, to experimentally test our genome-wide predictions. We demonstrate that the role of Pbcas4 as a competitive endogenous RNA has been conserved and has outlived its ancestral gene's loss of protein-coding potential. CONCLUSIONS: These results show that post-transcriptional regulation by bifunctional mRNAs can persist over long evolutionary time periods even after their protein-coding ability has been lost.
    Genome biology 11/2012; 13(11):R102. · 6.63 Impact Factor
  • Article: A common ancestry for BAP1 and Uch37 regulators.
    ABSTRACT: To reveal how the polycomb repressive-deubiquitinase (PR-DUB) complex controls substrate selection specificity, we undertook a detailed computational sequence analysis of its components: additional sex combs like 1 (ASXL1) and BRCA1-associated protein 1 (BAP1) proteins. This led to the discovery of two previously unrecognized domains in ASXL1: a forkhead (winged-helix) DNA-binding domain and a deubiquitinase adaptor domain shared with two regulators of ubiquitin carboxyl-terminal hydrolase 37 (Uch37), namely adhesion regulating molecule 1 (ADRM1) and nuclear factor related to kappaB (NFRKB). Our analysis demonstrates a common ancestry for BAP1 and Uch37 regulators in PR-DUB, INO80 chromatin remodelling and proteasome complexes. CONTACT: luis.sanchezpulido@dpag.ox.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
    Bioinformatics 05/2012; 28(15):1953-6. · 5.47 Impact Factor
  • Article: The genome of the green anole lizard and a comparative analysis with birds and mammals.
    ABSTRACT: The evolution of the amniotic egg was one of the great evolutionary innovations in the history of life, freeing vertebrates from an obligatory connection to water and thus permitting the conquest of terrestrial environments. Among amniotes, genome sequences are available for mammals and birds, but not for non-avian reptiles. Here we report the genome sequence of the North American green anole lizard, Anolis carolinensis. We find that A. carolinensis microchromosomes are highly syntenic with chicken microchromosomes, yet do not exhibit the high GC and low repeat content that are characteristic of avian microchromosomes. Also, A. carolinensis mobile elements are very young and diverse, more so than in any other sequenced amniote genome. The GC content of this lizard genome is also unusual in its homogeneity, unlike the regionally variable GC content found in mammals and birds. We describe and assign sequence to the previously unknown A. carolinensis X chromosome. Comparative gene analysis shows that amniote egg proteins have evolved significantly more rapidly than other proteins. An anole phylogeny resolves basal branches to illuminate the history of their repeated adaptive radiations.
    Nature 08/2011; 477(7366):587-91. · 36.28 Impact Factor
  • Article: The genome of a songbird.
    ABSTRACT: The zebra finch is an important model organism in several fields with unique relevance to human neuroscience. Like other songbirds, the zebra finch communicates through learned vocalizations, an ability otherwise documented only in humans and a few other animals and lacking in the chicken, the only bird with a sequenced genome until now. Here we present a structural, functional and comparative analysis of the genome sequence of the zebra finch (Taeniopygia guttata), which is a songbird belonging to the large avian order Passeriformes. We find that the overall structures of the genomes are similar in zebra finch and chicken, but they differ in many intrachromosomal rearrangements, lineage-specific gene family expansions, the number of long-terminal-repeat-based retrotransposons, and mechanisms of sex chromosome dosage compensation. We show that song behaviour engages gene regulatory networks in the zebra finch brain, altering the expression of long non-coding RNAs, microRNAs, transcription factors and their targets. We also show evidence for rapid molecular evolution in the songbird lineage of genes that are regulated during song experience. These results indicate an active involvement of the genome in neural processes underlying vocal communication and identify potential genetic substrates for the evolution and regulation of this behaviour.
    Nature 04/2010; 464(7289):757-62. · 36.28 Impact Factor
  • Article: Accelerated evolution of PAK3- and PIM1-like kinase gene families in the zebra finch, Taeniopygia guttata.
    ABSTRACT: Genes encoding protein kinases tend to evolve slowly over evolutionary time, and only rarely do they appear as recent duplications in sequenced vertebrate genomes. Consequently, it was a surprise to find two families of kinase genes that have greatly and recently expanded in the zebra finch (Taeniopygia guttata) lineage. In contrast to other amniotic genomes (including chicken) that harbor only single copies of p21-activated serine/threonine kinase 3 (PAK3) and proviral integration site 1 (PIM1) genes, the zebra finch genome appeared at first to additionally contain 67 PAK3-like (PAK3L) and 51 PIM1-like (PIM1L) protein kinase genes. An exhaustive analysis of these gene models, however, revealed most to be incomplete, owing to the absence of terminal exons. After reprediction, 31 PAK3L genes and 10 PIM1L genes remain, and all but three are predicted, from the retention of functional sites and open reading frames, to be enzymatically active. PAK3L, but not PIM1L, gene sequences show evidence of recurrent episodes of positive selection, concentrated within structures spatially adjacent to N- and C-terminal protein regions that have been discarded from zebra finch PAK3L genes. At least seven zebra finch PAK3L genes were observed to be expressed in testis, whereas two sequences were found transcribed in the brain, one broadly including the song nuclei and the other in the ventricular zone and in cells resembling Bergmann's glia in the cerebellar Purkinje cell layer. Two PIM1L sequences were also observed to be expressed with broad distributions in the zebra finch brain, one in both the ventricular zone and the cerebellum and apparently associated with glial cells and the other showing neuronal cell expression and marked enrichment in midbrain/thalamic nuclei. These expression patterns do not correlate with zebra finch-specific features such as vocal learning. Nevertheless, our results show how ancient and conserved intracellular signaling molecules can be co-opted, following duplication, thereby resulting in lineage-specific functions, presumably affecting the zebra finch testis and brain.
    Molecular Biology and Evolution 03/2010; 27(8):1923-34. · 5.55 Impact Factor
  • Article: Tandem duplication, circular permutation, molecular adaptation: how Solanaceae resist pests via inhibitors.
    Lesheng Kong, Shoba Ranganathan
    ABSTRACT: The Potato type II (Pot II) family of proteinase inhibitors plays critical roles in the defense system of plants from the Solanaceae family against pests. To better understand the evolution of this family, we investigated the correlation between sequence and structural repeats within this family and the evolution and molecular adaptation of Pot II genes through computational analysis, using the putative ancestral domain sequence as the basic repeat unit. Our analysis yielded the following findings for the Pot II family. (1) We classified the structural domains in the Pot II family into three types (original repeat domain, circularly permuted domain, the two-chain domain) according to the existence of two linkers between the two domain components, which clearly show the circular permutation relationship between the original repeat domain and the circularly permuted domain. (2) The permuted domains appear more stable than the original repeat domain, from available structural information. Therefore, we proposed that a multiple-repeat sequence is likely to adopt the permuted domain from contiguous sequence segments, with the N- and C-termini forming a single non-contiguous structural domain, linking the bracelet of tandem repeats. (3) The analysis of the nonsynonymous/synonymous substitution rate ratio in the Pot II domain revealed heterogeneous selective pressures among amino acid sites: the reactive site is under positive Darwinian selection (providing different specificity to target varieties of proteinases) while the cysteine scaffold is under purifying selection (essential for maintaining the fold). (4) For multi-repeat Pot II genes from the Nicotiana genus, the proteolytic processing site is under positive Darwinian selection (which may improve the cleavage efficiency). This paper provides a comprehensive analysis and characterization of the Pot II family, and advances our understanding of the strategies (gene and domain duplication, structural circular permutation and molecular adaptation) of Solanaceae plants for defending against pathogenic attacks through the evolution of Pot II genes.
    BMC Bioinformatics 02/2008; 9 Suppl 1:S22. · 2.75 Impact Factor
  • Article: MPID-T: database for sequence-structure-function information on T-cell receptor/peptide/MHC interactions.
    ABSTRACT: Normal adaptive immune responses operate under major histocompatibility complex (MHC) restriction by binding to specific, short antigenic peptides and presenting them to appropriate T-cell receptors (TcRs). Sequence-structure-function information is critical in understanding the principles governing peptide/MHC (pMHC) and TcR/pMHC recognition and binding. A new database for sequence-structure-function information on TcR/pMHC interactions, MHC-Peptide Interaction Database version T (MPID-T), is now available with the latest available Protein Data Bank (PDB) data and interaction parameters on TcR/pMHC complexes. MPID-T is a manually curated MySQL database containing experimentally determined structures of 187 pMHC complexes and 16 TcR/pMHC complexes available in the PDB. Each structure is manually verified, classified, and analysed for intermolecular interactions (i) between the MHC and its corresponding bound peptide and (ii) between TcR and its bound pMHC complex where TcR structural information is available. The MPID-T database retrieval system has precomputed interaction parameters that include solvent accessibility, hydrogen bonds, gap volume and gap index. Structural visualisation of the TcR/pMHC complex, pMHC complex, MHC or the bound peptide can be performed using freely available graphics applications such as MDL Chime or RasMol, while structural alignment (based on MHC class and peptide length) can be viewed using the Jmol molecular viewer or an MDL Chime-compatible web browser client. MPID-T contains structural descriptors for in-depth characterisation of TcR/pMHC and pMHC interactions. The ultimate purpose of MPID-T is to enhance the understanding of the binding mechanism underlying TcR/pMHC and pMHC interactions by mapping the TcR footprint on the MHC and its bound peptide, as this eventually determines T-cell recognition and binding. AVAILABILITY: The MPID-T database retrieval system is available at http://surya.bic.nus.edu.sg/mpidt CONTACT: Joo Chuan Tong (jctong@i2r.a-star.edu.sg).
    Applied Bioinformatics 02/2006; 5(2):111-4.
  • Article: SDPMOD: an automated comparative modeling server for small disulfide-bonded proteins.
    ABSTRACT: Small disulfide-bonded proteins (SDPs) are rich sources for therapeutic drugs. Designing drugs from these proteins requires three-dimensional structural information, which is only available for a subset of these proteins. SDPMOD addresses this deficit in structural information by providing a freely available automated comparative modeling service to the research community. For expert users, SDPMOD offers a manual mode that permits the selection of a desired template as well as a semi-automated mode that allows users to select the template from a suggested list. Besides the selection of templates, expert users can edit the target-template alignment, thus allowing further customization of the modeling process. Furthermore, the web service provides model stereochemical quality evaluation using PROCHECK. SDPMOD is freely accessible to academic users via the web interface at http://proline.bic.nus.edu.sg/sdpmod.
    Nucleic Acids Research 08/2004; 32(Web Server issue):W356-9. · 8.03 Impact Factor
  • Article: Delineation of modular proteins: domain boundary prediction from sequence information.
    Lesheng Kong, Shoba Ranganathan
    ABSTRACT: The delineation of domain boundaries of a given sequence in the absence of known 3D structures or detectable sequence homology to known domains benefits many areas in protein science, such as protein engineering, protein 3D structure determination and protein structure prediction. With the exponential growth of newly determined sequences, our ability to predict domain boundaries rapidly and accurately from sequence information alone is both essential and critical from the viewpoint of gene function annotation. Anyone attempting to predict domain boundaries for a single protein sequence is invariably confronted with a plethora of databases that contain boundary information available from the internet and a variety of methods for domain boundary prediction. How are these derived and how well do they work? What definition of 'domain' do they use? We will first clarify the different definitions of protein domains, and then describe the available public databases with domain boundary information. Finally, we will review existing domain boundary prediction methods and discuss their strengths and weaknesses.
    Briefings in Bioinformatics 07/2004; 5(2):179-92. · 5.20 Impact Factor