Osamu Ogasawara

National Institute of Genetics · DNA Data Bank Japan Center (DDBJ)

71
Publications
13,122
Reads
1,401
Citations

Publications (71)
Article
Full-text available
The Bioinformation and DDBJ (DNA Data Bank of Japan) Center (DDBJ Center; https://www.ddbj.nig.ac.jp) operates archival databases that collect nucleotide sequences, study and sample information, and distribute them without access restriction to progress life science research as a member of the International Nucleotide Sequence Database Collaboratio...
Article
Full-text available
The Bioinformation and DDBJ Center (DDBJ Center, https://www.ddbj.nig.ac.jp) provides databases that capture, preserve and disseminate diverse biological data to support research in the life sciences. This center collects nucleotide sequences with annotations, raw sequencing data, and alignment information from high-throughput sequencing platforms,...
Article
Full-text available
Studies in human genetics deal with a plethora of human genome sequencing data that are generated from specimens as well as available on public domains. With the development of various bioinformatics applications, maintaining the productivity of research, managing human genome data, and analyzing downstream data is essential. This review aims to gu...
Article
Recently, the prospect of applying machine learning tools for automating the process of annotation analysis of large-scale sequences from next-generation sequencers has raised the interest of researchers. However, finding research collaborators with knowledge of machine learning techniques is difficult for many experimental life scientists. One sol...
Article
Full-text available
The Bioinformation and DDBJ Center (https://www.ddbj.nig.ac.jp) in the National Institute of Genetics (NIG) maintains a primary nucleotide sequence database as a member of the International Nucleotide Sequence Database Collaboration (INSDC) in partnership with the US National Center for Biotechnology Information and the European Bioinformatics Inst...
Article
Full-text available
Background Container virtualization technologies such as Docker are popular in the bioinformatics domain because they improve the portability and reproducibility of software deployment. Along with software packaged in containers, the standardized workflow descriptors Common Workflow Language (CWL) enable data to be easily analyzed on multiple compu...
Article
Full-text available
We have fully integrated public chromatin immunoprecipitation sequencing (ChIP-seq) and DNase-seq data (n > 70,000) derived from six representative model organisms (human, mouse, rat, fruit fly, nematode, and budding yeast), and have devised a data-mining platform, designated ChIP-Atlas (http://chip-atlas.org). ChIP-Atlas is able to show a...
Article
Full-text available
The Genomic Expression Archive (GEA) for functional genomics data from microarray and high-throughput sequencing experiments has been established at the DNA Data Bank of Japan (DDBJ) Center (https://www.ddbj.nig.ac.jp), which is a member of the International Nucleotide Sequence Database Collaboration (INSDC) with the US National Center for Biotechn...
Preprint
Full-text available
Noncoding regions of the human genome possess enhancer activity and harbor risk loci for heritable diseases. Whereas the binding profiles of multiple transcription factors (TFs) have been investigated, integrative analysis with the large body of public data available so as to provide an overview of the function of such noncoding regions has remaine...
Article
Aim: Bioinformatics analysis for the Illumina Infinium Human DNA methylation BeadArray is essential but remains a difficult task for many experimental researchers. We here aimed to develop a browser-accessible bioinformatics tool for analyzing the BeadArray data. Materials & methods: The tool was established as an analytical pipeline using R, P...
Article
The DNA Data Bank of Japan (DDBJ) Center (http://www.ddbj.nig.ac.jp) has been providing public data services for 30 years since 1987. We are collecting nucleotide sequence data and associated biological information from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC), in collaboration with the US Nati...
Article
Full-text available
Gene expression data are exponentially accumulating; thus, the functional annotation of such sequence data from metadata is urgently required. However, life scientists have difficulty utilizing the available data due to its sheer magnitude and complicated access. We have developed a web tool for browsing reference gene expression pattern of mammali...
Article
Full-text available
The DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) has been providing public data services for thirty years (since 1987). We are collecting nucleotide sequence data from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC, http://www.insdc.org), in collaboration with the US National Center for B...
Article
Full-text available
The DNA Data Bank of Japan Center (DDBJ Center; http://www.ddbj.nig.ac.jp) maintains and provides public archival, retrieval and analytical services for biological information. The contents of the DDBJ databases are shared with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the fra...
Article
Full-text available
The DNA Data Bank of Japan Center (DDBJ Center; http://www.ddbj.nig.ac.jp) maintains and provides public archival, retrieval and analytical services for biological information. Since October 2013, DDBJ Center has operated the Japanese Genotype-phenotype Archive (JGA) in collaboration with our partner institute, the National Bioscience Database Cent...
Article
Full-text available
The DNA Data Bank of Japan (DDBJ; http://www.ddbj.nig.ac.jp) maintains and provides archival, retrieval and analytical resources for biological information. This database content is shared with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the framework of the International Nucleo...
Article
Full-text available
The DNA data bank of Japan (DDBJ, http://www.ddbj.nig.ac.jp) maintains a primary nucleotide sequence database and provides analytical resources for biological information to researchers. This database content is exchanged with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the fram...
Article
Full-text available
The DNA Data Bank of Japan (DDBJ; http://www.ddbj.nig.ac.jp) maintains and provides archival, retrieval and analytical resources for biological information. The central DDBJ resource consists of public, open-access nucleotide sequence databases including raw sequence reads, assembly information and functional annotation. Database content is exchang...
Article
Full-text available
The DNA Data Bank of Japan (DDBJ, http://www.ddbj.nig.ac.jp) provides a nucleotide sequence archive database and accompanying database tools for sequence submission, entry retrieval and annotation analysis. The DDBJ collected and released 3 637 446 entries/2 272 231 889 bases between July 2009 and June 2010. A highlight of the released data was arc...
Data
Full-text available
Time development of hypothetical mRNA abundance generated by Monte Carlo simulations of the previous model (L = 0.0). Other model parameters were: M = 20,000, N = 300,000. The line shows y = 0.1/x. (0.37 MB PDF)
Data
Full-text available
Time development of hypothetical mRNA abundance generated by Monte Carlo simulations of the refined neutral model (L = 1.0). Other model parameters were: M = 20,000, N = 300,000. The line shows y = 0.1/x. (0.39 MB PDF)
Article
Full-text available
The relative contributions of natural selection and random genetic drift are a major source of debate in the study of gene expression evolution, which is hypothesized to serve as a bridge from molecular to phenotypic evolution. It has been suggested that the conflict between views is caused by the lack of a definite model of the neutral hypothesis,...
Article
Full-text available
The DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) has collected and released 1 701 110 entries/1 116 138 614 bases between July 2008 and June 2009. A few highlighted data releases from DDBJ were the complete genome sequence of an endosymbiont within protist cells in the termite gut and Cap Analysis Gene Expression tags for human and mou...
Article
Full-text available
DDBJ (http://www.ddbj.nig.ac.jp) collected and released 1 880 115 entries or 1 134 086 245 bases in the period from July 2006 to June 2007. The released data contains the high-throughput cDNAs of cricket and high-quality draft genome of medaka among others. Our computer system has been upgraded since March 2007. Another new aspect is an efficient d...
Article
Full-text available
BodyMap-Xs (http://bodymap.jp) is a database for cross-species gene expression comparison. It was created by the anatomical breakdown of 17 million animal expressed sequence tag (EST) records in DDBJ using a sorting program tailored for this purpose. In BodyMap-Xs, users are allowed to compare the expression patterns of orthologous and paralogous g...
Article
Full-text available
The Human Anatomic Gene Expression Library (H-ANGEL) is a resource for information concerning the anatomical distribution and expression of human gene transcripts. The tool contains protein expression data from multiple platforms that has been associated with both manually annotated full-length cDNAs from H-InvDB and RefSeq sequences. Of the H-Inv...
Article
Full-text available
The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so,...
Article
Full-text available
The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so,...
Data
List of Library Origins of H-Inv cDNAs (182 Libraries) The dataset consists of 41,118 H-Inv cDNAs that were cloned from cDNA libraries derived from 182 varieties of cell and tissue. (33 KB XLS).
Data
List of H-Inv Proteins with Potential EC Numbers (1,892 H-Inv Proteins) The allotted EC numbers are based on the corresponding DNA databank records, UniProt/Swiss-Prot and TrEMBL records that show sequence similarity to the proteins, and InterPro records that the proteins hit. (247 KB XLS).
Data
Full-text available
Gene Structure (A) Gene structure of the cDNAs. (B) The frequencies and varieties of repetitive sequences found in the cDNAs. A list of the 20,899 loci representing cDNAs that RepeatMasker showed contained repetitive elements. (C) The positions (5′ UTR, ORF, and 3′ UTR) of repetitive sequences in the protein-coding cDNAs. A total of 1,863 cDNAs con...
Data
List of Polymorphic Microsatellites Inferred by Comparisons between the H-Inv cDNAs and Genomic Sequences (56 KB XLS).
Data
Full-text available
Size Distribution of Predicted ORFs The size distribution of all H-Inv proteins among the five similarity categories. (24 KB PDF).
Data
Full-text available
Numbers of Representative H-Inv cDNAs That Are Homologous to Proteins in Each Taxonomic Group Two thresholds (E < 10^-5, white bars, and E < 10^-10, black bars) were employed. The “animal” group does not include mammalian species. The “eukaryote” group represents eukaryotic species other than animals, fungi, and plants. (9 KB PDF).
Data
Full-text available
Prediction of ORFs (A) Schematic diagram for the prediction of ORFs. This diagram illustrates the ORF prediction method used on all H-Inv cDNAs. The method was based upon the alignment of similarity searches using FASTY and BLASTX. Gene prediction was carried out using GeneMark. Prior to the prediction of ORFs, we judged if a sequence had any frame...
Data
Full-text available
Scheme of Prediction for Functional Annotation (A) Schematic diagram for determining a representative transcript for each locus. The procedure of computational autoannotation is illustrated. Prior to the human curation of the representative transcript of each H-Inv cluster, we performed computational autoannotation. (B) Schematic diagram for functi...
Data
Full-text available
A Functional Classification of H-Inv Protein Families That Have Homologs in Each Taxonomic Group H-Inv protein families were identified by clustering H-Inv proteins using the single-linkage clustering method. Then, the number of homologs for each H-Inv protein family was calculated. Mammalian species are excluded from the “animal” group. “eukaryote...
Data
Full-text available
H-Inv Annotation Viewers (A) G-integra: A genome mapping viewer. (B) SOUP Locus annotation viewer. (C) SOUP cDNA annotation viewer. (D) SMO Viewer: The similarity, motif, and ORF information viewer. (2,022 KB PDF).
Data
Full-text available
The InterPro IDs Identified in H-Inv Proteins The top 40 InterPro IDs identified in H-Inv proteins and proteins from other species are listed for all types (A) and for each type of family, domain, and repeat (B–D). Analyses were conducted by InterPro ver. 3.1. Nonredundant proteome datasets of other species were obtained from the following sites: f...
Data
Features of Category II Proteins A total of 4,104 H-Inv proteins were classified as Category II based on sequence similarity to functionally validated proteins. The table and figure show source species of proteins in public databases to which the Category II proteins were similar. (9 KB PDF).
Data
Full-text available
H-Inv KEGG Analysis Results (Images of KEGG Pathways) The images illustrate the metabolic pathways of KEGG database based on the EC number assignments to H-Inv proteins. (47 KB PDF).
Data
Full-text available
A Sample View of the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/) A FLcDNA (BC003551) is shown with its detailed annotations, e.g., gene structure, functional annotation, ORF predictions, protein structure prediction by GTOP, etc. The H-InvDB has links to other internal databases (red boxes) such as a genome map viewer (G-integr...
Data
CAI and Codon Usage (A) CAI was measured for all H-Inv proteins. CAI is a measure of biased patterns for synonymous codon usage (http://biobase.dk/embossdocs/cai.html). (B) Codon usage in predicted ORFs of H-Inv proteins. Total tri-nucleotide frequencies (forward strand) for the sequences of each species are shown. Nonredundant proteome datasets fo...
Data
Full-text available
Tissue Library Origins of H-Inv Proteins The results of classification into five similarity categories for each of ten tissue classes. (A) Numbers of H-Inv proteins. (B) Histogram. (10 KB PDF).
Data
Full-text available
List of Newly Assigned Human Enzymes (32 H-Inv Proteins) All these 32 H-Inv proteins were newly assigned enzyme numbers with the support of the KEGG pathway. These enzyme assignments were previously unrepresented in Homo sapiens. (33 KB PDF).
Data
Full-text available
Basic Statistics for UTR Sequences Analyzed (8 KB PDF).
Data
GO Term Assignment to H-Inv Proteins (A) Molecular function. (B) Cellular component. (C) Biological process. (74 KB PDF).
Data
Full-text available
A Functional Classification of Representative H-Inv cDNAs That Have Homologs in Other Species (See also Figure 6.) (9 KB PDF).
Data
Full-text available
UTR Replacements in Primates and Rodents One hundred and forty-seven UTR replacements distributed among different species were detected. (9 KB PDF).
Data
List of the Databases and Software Used in the H-Inv Project (31 KB PDF).
Data
Full-text available
A Detailed Functional Annotation Based on Protein Modules (25 KB PDF).
Article
As a first step toward the quantitative comparison of clinical features of diseases, we indexed the text descriptions in the Clinical Synopsis section of the Online Mendelian Inheritance in Man (OMIM) with concepts for the body parts, organs, and tissues contained in the Metathesaurus of the Unified Medical Language System (UMLS). We also indexed t...
Article
Full-text available
Detailed analysis of human gene expression data reveals several patterns of relationship between transcript frequency and abundance rank. In muscle and liver, organs composed primarily of a homogeneous population of differentiated cells, they obey Zipf's law. In cell lines, epithelial tissue and compiled transcriptome data, only high-rankers deviat...
Article
Full-text available
After the accomplishment of human draft sequence, more and more efforts are being made in the mapping of the data-driven patterns to background knowledge, hoping to efficiently produce hypotheses out of the flood of data. Here we propose a framework of biomedical data and knowledge that has a high adaptability to the automated data interpretation...
Article
Expression of cytochrome P450 cholesterol side chain cleavage (P450scc) and 3beta-hydroxysteroid dehydrogenase (3beta-HSD) mRNAs was examined in chicken embryonic adrenal glands and gonads between days 4 and 12 of incubation. In situ hybridization analysis showed that 3beta-HSD mRNA appeared on day 5 of incubation in the adrenal glands and on day 6...
Article
The primary structure of the N-terminal extracellular region of the follitropin receptor (FSH-R), which is thought to be responsible for hormone binding specificity, was determined in three reptilian species (tortoise, gecko, and lizard). Remarkably low sequence homologies were detected in the C-terminal part of the extracellular domain. This regio...
Article
We have developed an internet-accessible database, TissueDB, which provides a hierarchy of names and synonyms for adult human tissues. There are two goals for TissueDB. The first is to provide a framework within which to store data concerning gene expression, tissue sources of cultured cell lines, and other spatially organized data. The second goal...
