ABSTRACT: Rice is the most important staple food for a large part of the world's human population and also a key model organism for biological studies of crops and related plants. Here we present RiceWiki (http://ricewiki.big.ac.cn), a wiki-based, publicly editable and open-content platform for community curation of rice genes. Most existing related biological databases are based on expert curation; with the exponentially growing volume of rice knowledge and other relevant data, however, expert curation is increasingly laborious and time-consuming, struggles to keep knowledge up-to-date, accurate and comprehensive, and requires a large number of people to be involved in rice knowledge curation. Unlike existing databases, RiceWiki harnesses collective intelligence in community curation of rice genes, quantifies users' contributions to each curated gene and provides explicit authorship for each contributor to any given gene, with the aim of exploiting the full potential of the scientific community for rice knowledge curation. Based on community curation, RiceWiki has the potential to build a rice encyclopedia by and for the scientific community that harnesses community intelligence for collaborative knowledge curation, covers all aspects of biological knowledge and keeps evolving with new knowledge.
Nucleic Acids Research 10/2013; · 8.28 Impact Factor
ABSTRACT: Long non-coding RNAs (lncRNAs) have been found to perform various functions in a wide variety of important biological processes. To ease the interpretation of lncRNA functionality and to support deep mining of these transcribed sequences, it is convenient to classify lncRNAs into different groups. Here, we summarize classification methods for lncRNAs according to four major features, namely, genomic location and context, effect exerted on DNA sequences, mechanism of functioning, and targeting mechanism. In combination with the currently available functional annotations, we explore potential relationships between different classification categories, and generalize and compare the biological features of different lncRNAs within each category. Finally, we present our view on potential further studies. We believe that the classifications of lncRNAs described above are of fundamental importance for lncRNA studies and helpful for further investigation of specific lncRNAs, for the formulation of new hypotheses based on different features of lncRNAs, and for exploration of the underlying lncRNA functional mechanisms.
ABSTRACT: Immortality and tumorigenicity are two distinct characteristics of cancers. Immortalization has been suggested to precede tumorigenesis. To understand the molecular mechanisms of tumorigenicity and cancer progression in mammary epithelium, we established a tumorigenic cell model by means of heavy-ion radiation of an immortal cell model, which was created by overexpressing the human telomerase reverse transcriptase (hTERT) in normal human mammary epithelial cells. We examined the expression profile of this tumorigenic cell line (T_hMEC) using the hTERT-overexpressing immortal cell line (I_hMEC) as a control. In-depth RNA-seq data were generated using a next-generation sequencing (NGS) platform (Life Technologies SOLiD3). We found that house-keeping (HK) and tissue-specific (TS) genes were differentially regulated during the tumorigenic process: HK genes tended to be activated, while TS genes tended to be repressed. In addition, the HK and TS genes tended to contribute differentially to the variation of gene expression at different RPKM levels (RPKM: gene expression in reads per exon kilobase per million mapped sequence reads). Based on transcriptome analysis of the two cell lines, we defined 7053 differentially expressed genes (DEGs) between immortality and tumorigenicity. Differential expression of 20 manually selected genes was further validated using qRT-PCR. Our observations may help to further our understanding of the cellular mechanism(s) involved in the transition from immortalization to tumorigenesis.
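The RPKM unit used in the abstract above (reads per exon kilobase per million mapped sequence reads) can be made concrete with a small calculation. The following is a minimal sketch of the standard definition, not the authors' pipeline; the function name `rpkm` and the example numbers are hypothetical and chosen only for illustration.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per exon kilobase per million mapped reads.

    read_count         -- reads mapped to the gene's exons (hypothetical input)
    exon_length_bp     -- total exon length of the gene, in base pairs
    total_mapped_reads -- all mapped reads in the sequencing library
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    # Dividing by (length / 1e3) and by (total / 1e6) is the same as
    # multiplying the read count by 1e9 and dividing by both raw values.
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


# Illustrative values only: 500 reads on a gene with 2,000 bp of exons
# in a library of 10 million mapped reads.
example = rpkm(500, 2_000, 10_000_000)
```

Normalizing by both exon length and library size is what allows expression levels to be compared across genes of different lengths and across samples sequenced to different depths.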
ABSTRACT: The trypanosomatid GP63 proteases are known to be involved in parasite-host interaction and exhibit strong sequence and structural similarities to those of their hosts and insect vectors. Based on genome sequences of the three trypanosomatids, Trypanosoma brucei, Trypanosoma cruzi, and Leishmania spp., we annotated all their GP63 proteases and divided the highly duplicated T. cruzi GP63 proteases into four novel groups according to sequence features. In Leishmania spp., we studied the evolutionary dynamics of GP63 proteins and identified 57 amino acid sites that are under significant positive selection. These sites may contribute to the functional variation of the GP63 proteases and provide clues for vaccine development.
Parasitology Research 04/2011; 109(4):1075-84. · 2.85 Impact Factor
ABSTRACT: Transposons are sequence elements widely distributed among genomes of all three kingdoms of life, providing genomic changes and playing significant roles in genome evolution. Trichomonas vaginalis is an excellent model system for transposon study since its genome (~160 Mb) has been sequenced and is composed of ~65% transposons and other repetitive elements. In this study, we primarily report the identification of Kolobok-type transposons (termed tvBac) in T. vaginalis and the results of transposase sequence analysis. We categorized 24 novel subfamilies of the Kolobok element, including one autonomous subfamily and 23 non-autonomous subfamilies. We also identified a novel H2CH motif in tvBac transposases based on multiple sequence alignment. In addition, our phylogenetic analysis suggests that tvBac and Mutator transposons may have evolved independently from a common ancestor. Our results provide basic information for understanding the function and evolution of tvBac transposons in particular and other related transposon families in general.
Journal of Genetics and Genomics 02/2011; 38(2):63-70. · 2.08 Impact Factor
ABSTRACT: There are 48 members of the GP63 protease family in Trichomonas vaginalis according to our annotations; 37 of them are predicted to be transmembrane proteins. Because the GP63 protease family is the largest surface protease family and the second largest surface protein family, its members are most likely to be involved in the interactions between T. vaginalis and host cell surfaces, or to otherwise participate in infection. We performed a preliminary study on the functions of GP63 in T. vaginalis (TvGP63). We demonstrated the cell surface localization of one highly expressed member of TvGP63 using indirect immunofluorescence assays in both isolate T016 and isolate 30236. The specific inhibitor of TvGP63 protease, 1,10-phenanthroline, was found to significantly inhibit the destruction of HeLa cells, whereas another chelator, EDTA, did not. Further tests showed that 1,10-phenanthroline did not inhibit the adherence of T. vaginalis cells to HeLa cells. The results presented here suggest that GP63 protease plays a vital role in the T. vaginalis infection process, but may not be related to the adherence of parasitic cells to their hosts.
Parasitology Research 01/2011; 109(1):71-9. · 2.85 Impact Factor
ABSTRACT: Using DNA microarrays, we generated both mRNA and miRNA expression data from 6 non-small cell lung cancer (NSCLC) tissues and their matching normal controls from adjacent tissues to identify potential miRNA markers for diagnostics. We demonstrated that hsa-miR-96 is significantly and consistently up-regulated in all 6 NSCLCs. We validated this result in an independent set of 35 paired tumors and their adjacent normal tissues, as well as in sera collected before surgical resection or chemotherapy. The results suggested that hsa-miR-96 may play an important role in NSCLC development and has great potential to be used as a noninvasive marker for diagnosing NSCLC. We predicted potential miRNA target mRNAs using two different methods (TargetScan and miRanda). Further classification of miRNA-regulated genes based on their relationship with miRNAs revealed that hsa-miR-96 and certain other miRNAs tend to down-regulate their target mRNAs in NSCLC development, and that these targets have expression levels permissive to direct interaction between miRNAs and their target mRNAs. In addition, we identified a significant correlation between miRNA regulation and genes that coincide with a high density of CpG islands, which suggests that miRNAs may represent a primary regulatory mechanism governing basic cellular functions and cell differentiation, and that such a mechanism may be complementary to DNA methylation in repressing or activating gene expression.
PLoS ONE 01/2011; 6(10):e26502. · 3.73 Impact Factor
ABSTRACT: There are two types of well-characterized DNA sequence periodicities, and both are associated with important molecular mechanisms. One is a 3-nt periodicity corresponding to codon triplets; the other is a 10.5-nt periodicity related to the structure of DNA helices. In the process of analyzing the genome and transcriptome of Trichomonas vaginalis, we observed a 120.9-nt periodicity along DNA sequences. Unlike the 3- and 10.5-nt periodicities, this novel periodicity originates near the 5'-end of transcripts, extends along the direction of transcription, and weakens gradually along transcripts. As a result, codon usage as well as amino acid composition is constrained by this periodicity. Similar periodicities were also identified in other organisms, but with variable lengths associated with the length of nucleosome units. We validated this association experimentally in T. vaginalis, and demonstrated that the periodicity manifests nucleotide variations between linker DNA and wrapping DNA along the nucleosome array. We conclude that this novel DNA sequence periodicity is a signature of nucleosome organization, suggesting that nucleosomes are well positioned with regularity, especially near the 5'-end of transcripts.
Nucleic Acids Research 11/2008; 36(19):6228-36. · 8.28 Impact Factor
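To illustrate what a sequence periodicity such as the 3-nt codon signal described in the abstract above means in practice, here is a minimal sketch that scores how often a sequence matches itself at a given offset. This is a simple self-match measure chosen for illustration, not the method used in the paper; the function names and the toy sequence are hypothetical. Non-integer periods such as 10.5 or 120.9 nt cannot be captured by a single integer lag and would typically call for a spectral approach instead.

```python
def lag_match_fraction(seq, lag):
    """Fraction of positions i where seq[i] equals seq[i + lag]."""
    if lag <= 0 or lag >= len(seq):
        raise ValueError("lag must be between 1 and len(seq) - 1")
    pairs = len(seq) - lag
    matches = sum(1 for i in range(pairs) if seq[i] == seq[i + lag])
    return matches / pairs


def strongest_lag(seq, max_lag):
    """Return the smallest lag in 1..max_lag with the highest match fraction."""
    # max() keeps the first maximal element, so ties resolve to the smallest lag.
    return max(range(1, max_lag + 1), key=lambda lag: lag_match_fraction(seq, lag))


# Toy sequence with a perfect 3-nt repeat (illustrative only, not real data).
toy = "ACG" * 20
best = strongest_lag(toy, 5)
```

On real genomic sequence the match fractions differ only slightly between lags, so a periodic signal shows up as a small, regularly recurring elevation rather than a perfect score.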