ABSTRACT: Plants have developed several morphological and physiological strategies to adapt to phosphate stress. We analyzed the inducible transcripts associated with phosphate starvation and over-abundant phosphate supply to characterize the transcriptome in rice seedlings using the mRNA-Seq strategy. Fifty-three million reads obtained from 16 libraries under various phosphate stress and recovery treatments were uniquely mapped to the rice genome. Transcripts identified specifically tagged to 40,574 (root) and 39,748 (shoot) Rice Annotation Project (RAP) transcripts. Additionally, we detected uniquely 10,388 transcripts with no match to any RAP transcript. These transcripts that showed specific response to Pi stress include those without ORFs that may act as non-protein coding transcripts. With an accompanying browser of the transcriptome under Pi stress, a deeper understanding of the structural and functional features of both annotated and unannotated Pi stress-responsive transcripts can provide useful information in improving Pi acquisition and utilization in rice and other cereal crops.
ABSTRACT: Full-length cDNA (FLcDNA) libraries consisting of 172,000 clones were constructed from a two-row malting barley cultivar (Hordeum vulgare 'Haruna Nijo') under normal and stressed conditions. After sequencing the clones from both ends and clustering the sequences, a total of 24,783 complete sequences were produced. By removing duplicates between these and publicly available sequences, 22,651 representative sequences were obtained: 17,773 were novel barley FLcDNAs, and 1,699 were barley specific. Highly conserved genes were found in the barley FLcDNA sequences for 721 of 881 rice (Oryza sativa) trait genes with 50% or greater identity. These FLcDNA resources from our Haruna Nijo cDNA libraries and the full-length sequences of representative clones will improve our understanding of the biological functions of genes in barley, which is the cereal crop with the fourth highest production in the world, and will provide a powerful tool for annotating the barley genome sequences that will become available in the near future.
ABSTRACT: Centromeres are sites for assembly of the chromosomal structures that mediate faithful segregation at mitosis and meiosis. This function is conserved across species, but the DNA components that are involved in kinetochore formation differ greatly, even between closely related species. To shed light on the nature, evolutionary timing and evolutionary dynamics of rice centromeres, we decoded a 2.25-Mb DNA sequence covering the centromeric region of chromosome 8 of an indica rice variety, 'Kasalath' (Kas-Cen8). Analysis of repetitive sequences in Kas-Cen8 led to the identification of 222 long terminal repeat (LTR)-retrotransposon elements and 584 CentO satellite monomers, which account for 59.2% of the region. A comparison of the Kas-Cen8 sequence with that of japonica rice 'Nipponbare' (Nip-Cen8) revealed that about 66.8% of the Kas-Cen8 sequence was collinear with that of Nip-Cen8. Although the 27 putative genes are conserved between the two subspecies, only 55.4% of the total LTR-retrotransposon elements in 'Kasalath' had orthologs in 'Nipponbare', thus reflecting recent proliferation of a considerable number of LTR-retrotransposons since the divergence of two rice subspecies of indica and japonica within Oryza sativa. Comparative analysis of the subfamilies, time of insertion, and organization patterns of inserted LTR-retrotransposons between the two Cen8 regions revealed variations between 'Kasalath' and 'Nipponbare' in the preferential accumulation of CRR elements, and the expansion of CentO satellite repeats within the core domain of Cen8. Together, the results provide insights into the recent proliferation of LTR-retrotransposons, and the rapid expansion of CentO satellite repeats, underlying the dynamic variation and plasticity of plant centromeres.
The Plant Journal 09/2009; 60(5):805-19. · 6.58 Impact Factor
ABSTRACT: The Rice Annotation Project Database (RAP-DB) was created to provide the genome sequence assembly of the International Rice Genome Sequencing Project (IRGSP), manually curated annotation of the sequence, and other genomics information that could be useful for comprehensive understanding of the rice biology. Since the last publication of the RAP-DB, the IRGSP genome has been revised and reassembled. In addition, a large number of rice-expressed sequence tags have been released, and functional genomics resources have been produced worldwide. Thus, we have thoroughly updated our genome annotation by manual curation of all the functional descriptions of rice genes. The latest version of the RAP-DB contains a variety of annotation data as follows: clone positions, structures and functions of 31 439 genes validated by cDNAs, RNA genes detected by massively parallel signature sequencing (MPSS) technology and sequence similarity, flanking sequences of mutant lines, transposable elements, etc. Other annotation data such as Gnomon can be displayed along with those of RAP for comparison. We have also developed a new keyword search system to allow the user to access useful information. The RAP-DB is available at: http://rapdb.dna.affrc.go.jp/ and http://rapdb.lab.nig.ac.jp/.
Nucleic Acids Research 02/2008; 36(Database issue):D1028-33. · 8.28 Impact Factor
ABSTRACT: We present here the annotation of the complete genome of rice Oryza sativa L. ssp. japonica cultivar Nipponbare. All functional annotations for proteins and non-protein-coding RNA (npRNA) candidates were manually curated. Functions were identified or inferred in 19,969 (70%) of the proteins, and 131 possible npRNAs (including 58 antisense transcripts) were found. Almost 5000 annotated protein-coding genes were found to be disrupted in insertional mutant lines, which will accelerate future experimental validation of the annotations. The rice loci were determined by using cDNA sequences obtained from rice and other representative cereals. Our conservative estimate based on these loci and an extrapolation suggested that the gene number of rice is approximately 32,000, which is smaller than previous estimates. We conducted comparative analyses between rice and Arabidopsis thaliana and found that both genomes possessed several lineage-specific genes, which might account for the observed differences between these species, while they had similar sets of predicted functional domains among the protein sequences. A system to control translational efficiency seems to be conserved across large evolutionary distances. Moreover, the evolutionary process of protein-coding genes was examined. Our results suggest that natural selection may have played a role for duplicated genes in both species, so that duplication was suppressed or favored in a manner that depended on the function of a gene.
Genome Research 03/2007; 17(2):175-83. · 14.40 Impact Factor
ABSTRACT: Rice (Oryza sativa L.) is a model organism for the functional genomics of monocotyledonous plants since the genome size is considerably smaller than those of other monocotyledonous plants. Although highly accurate genome sequences of indica and japonica rice are available, additional resources such as full-length complementary DNA (FL-cDNA) sequences are also indispensable for comprehensive analyses of gene structure and function. We cross-referenced 28.5K individual loci in the rice genome defined by mapping of 578K FL-cDNA clones with the 56K loci predicted in the TIGR genome assembly. Based on the annotation status and the presence of corresponding cDNA clones, genes were classified into 23K annotated expressed (AE) genes, 33K annotated non-expressed (ANE) genes, and 5.5K non-annotated expressed (NAE) genes. We developed a 60mer oligo-array for analysis of gene expression from each locus. Analysis of gene structures and expression levels revealed that the general features of gene structure and expression of NAE and ANE genes were considerably different from those of AE genes. The results also suggested that the cloning efficiency of rice FL-cDNA is associated with the transcription activity of the corresponding genetic locus, although other factors may also have an effect. Comparison of the coverage of FL-cDNA among gene families suggested that FL-cDNA from genes encoding rice- or eukaryote-specific domains, and those involved in regulatory functions were difficult to produce in bacterial cells. Collectively, these results indicate that rice genes can be divided into distinct groups based on transcription activity and gene structure, and that the coverage bias of FL-cDNA clones exists due to the incompatibility of certain eukaryotic genes in bacteria.
PLoS ONE 02/2007; 2(11):e1235. · 3.73 Impact Factor
ABSTRACT: A contig-oriented database for annotation of the rice genome has been constructed to facilitate map-based rice genomics. The Rice Annotation Database has the following functional features: (i) extensive effort of manual annotations of P1-derived artificial chromosome/bacterial artificial chromosome clones can be merged at chromosome and contig-level; (ii) concise visualization of the annotation information such as the predicted genes, results of various prediction programs (RiceHMM, Genscan, Genscan+, Fgenesh, GeneMark, etc.), homology to expressed sequence tag, full-length cDNA and protein; (iii) user-friendly clone / gene query system; (iv) download functions for nucleotide, amino acid and coding sequences; (v) analysis of various features of the genome (GC-content, average value, etc.); and (vi) genome-wide homology search (BLAST) of contig- and chromosome-level genome sequence to allow comparative analysis with the genome sequence of other organisms. As of October 2004, the database contains a total of 215 Mb sequence with relevant annotation results including 30 000 manually curated genes. The database can provide the latest information on manual annotation as well as a comprehensive structural analysis of various features of the rice genome. The database can be accessed at http://rad.dna.affrc.go.jp/.
Nucleic Acids Research 02/2005; 33(Database issue):D651-5. · 8.28 Impact Factor
ABSTRACT: Introduction The Rice Genome Research Program (RGP) has been pursuing the sequencing of the entire genome since 1998 in collaboration with the International Rice Genome Sequencing Project (IRGSP). As of Dec 2002, a high-quality draft sequence of the entire genome has been completed. Currently, we have sequenced and accumulated a total of 239 Mb of six rice chromosomes assigned to RGP. These include about 110 Mb of the non-overlapping, finished sequences. As a next step for the post-genome era, it is extremely essential that the accumulated information be efficiently managed and integrated to facilitate map-based informatics. We have developed a Rice Annotation Database (RAD) in order to store and concisely view the rice genome sequence with relevant annotation data. The current status and future developments are described here. 2 System Architecture and Improvements RAD is a relational database, which facilitates storage, query and visualization of annotation information such as s
ABSTRACT: Understanding the organization of eukaryotic centromeres has both fundamental and applied importance because of their roles in chromosome segregation, karyotypic stability, and artificial chromosome-based cloning and expression vectors. Using clone-by-clone sequencing methodology, we obtained the complete genomic sequence of the centromeric region of rice (Oryza sativa) chromosome 8. Analysis of 1.97 Mb of contiguous nucleotide sequence revealed three large clusters of CentO satellite repeats (68.5 kb of 155-bp repeats) and >220 transposable element (TE)-related sequences; together, these account for approximately 60% of this centromeric region. The 155-bp repeats were tandemly arrayed head to tail within the clusters, which had different orientations and were interrupted by TE-related sequences. The individual 155-bp CentO satellite repeats showed frequent transitions and transversions at eight nucleotide positions. The 40 TE elements with highly conserved sequences were mostly gypsy-type retrotransposons. Furthermore, 48 genes, showing high BLAST homology to known proteins or to rice full-length cDNAs, were predicted within the region; some were close to the CentO clusters. We then performed a genome-wide survey of the sequences and organization of CentO and RIRE7 families. Our study provides the complete sequence of a centromeric region from either plants or animals and likely will provide insight into the evolutionary and functional analysis of plant centromeres.
The Plant Cell 04/2004; 16(4):967-76. · 9.25 Impact Factor