Yasukazu Nakamura

  • Doctor of Science (PhD)
  • Professor at National Institute of Genetics

About

Publications: 270
Reads: 78,097
Citations: 26,672
Current institution
National Institute of Genetics
Current position
  • Professor
Additional affiliations
January 2009 - present
National Institute of Genetics
Position
  • Professor
January 2009 - present
Kazusa DNA Research Institute
Position
  • Visiting Senior Researcher
January 2009 - present
The Graduate University for Advanced Studies, SOKENDAI
Position
  • Professor

Publications

Preprint
Full-text available
The liverwort Marchantia polymorpha is a key model organism for understanding land plant evolution, development, and gene regulation. To support the growing demand for high-quality genomic resources, we present MarpolBase, a comprehensive and integrated genome database that hosts newly assembled, high-accuracy reference genomes for both the male Ta...
Article
Full-text available
Background Accurate taxonomic classification in genome databases is essential for reliable biological research and effective data sharing. Mislabeling or inaccuracies in genome annotations can lead to incorrect scientific conclusions and hinder the reproducibility of research findings. Despite advances in genome analysis techniques, challenges pers...
Article
Full-text available
Nicotiana benthamiana has long served as a crucial plant material extensively used in plant physiology research, particularly in the field of plant pathology, because of its high susceptibility to plant viruses. Additionally, it serves as a production platform to test vaccines and other valuable substances. Among its approximately 3.1 Gb genome, 5...
Article
The Bioinformation and DNA Data Bank of Japan Center (DDBJ Center, https://www.ddbj.nig.ac.jp) provides public databases that cover a wide range of fields in life sciences. As a founding member of the International Nucleotide Sequence Database Collaboration (INSDC), the DDBJ Center accepts and distributes nucleotide sequence data ranging from raw r...
Preprint
Full-text available
Motivation Accurate taxonomic assignments of genomic data are crucial across various biological databases. With a rapid increase in submitted genomes in recent years, ensuring precise classification is important to maintain database integrity. Mislabeled genomes can confuse researchers, hinder analyses, and produce false results. Therefore, there i...
Article
Full-text available
The Bioinformation and DNA Data Bank of Japan (DDBJ) Center (https://www.ddbj.nig.ac.jp) provides database archives that cover a wide range of fields in life sciences. As a founding member of the International Nucleotide Sequence Database Collaboration (INSDC), DDBJ accepts and distributes nucleotide sequence data as well as their study and sample...
Article
Full-text available
Although chemotherapy using CHOP-based protocol induces remission in most cases of canine multicentric high-grade B-cell lymphoma (mhBCL), some cases develop early relapse during the first induction protocol. In this study, we examined the gene expression profiles of canine mhBCL before chemotherapy and investigated their associations with early re...
Article
Full-text available
Although current long-read sequencing technologies produce long reads that facilitate genome assembly, they have high sequence error rates. While various assemblers with different perspectives have been developed, no systematic evaluation of assemblers with long reads for diploid genomes with varying heterozygosity has been perf...
Preprint
Full-text available
Nicotiana benthamiana has long served as a crucial plant material extensively used in plant physiology research, particularly in the field of plant pathology, because of its high susceptibility to plant viruses. Additionally, it serves as a production platform to test vaccines and other valuable substances. Among its approximately 3.1 Gb genome, 57...
Article
Full-text available
Background Plant genome information is fundamental to plant research and development. Along with the increase in the number of published plant genomes, there is a need for an efficient system to retrieve various kinds of genome-related information from many plant species across plant kingdoms. Various plant databases have been developed, but no pub...
Article
Full-text available
Objectives: Autosomal dominant polycystic kidney disease (ADPKD) is a common inherited disease in cats. In most cases, the responsible abnormality is a nonsense single nucleotide polymorphism in exon 29 of the PKD1 gene (chrE3:g.42858112C>A, the conventional PKD1 variant). The aim of this study was to conduct a large-scale epidemiological study of...
Article
Full-text available
In plants, variations in seed size and number are outcomes of different reproductive strategies. Both traits are often environmentally influenced, suggesting that a mechanism exists to coordinate these phenotypes in response to available maternal resources. Yet, how maternal resources are sensed and influence seed size and number are largely unknow...
Preprint
Full-text available
Autosomal dominant polycystic kidney disease (ADPKD) is a common inherited disease in cats. In most cases, the responsible abnormality is a nonsense single nucleotide polymorphism in exon 29 of the PKD1 gene (chrE3:g.42858112C>A, the conventional PKD1 variant). Epidemiological studies on feline ADPKD caused by the conventional PKD1 variant have bee...
Article
Full-text available
Nicotiana benthamiana is widely used as a model plant for dicotyledonous angiosperms. In fact, the strains used in research are highly susceptible to a wide range of viruses. Accordingly, these strains are subject to plant pathology and plant-microbe interactions. In terms of plant-plant interactions, N. benthamiana is one of the plants that exhibi...
Article
Full-text available
Simple Summary We identified the neuropeptides and their genomic loci on the draft genome sequences of Gryllus bimaculatus. These annotations were additionally assigned to the draft genome annotation. This addition to the draft genome annotation improved the convenience of research by consolidating the knowledge of neuropeptides, such as the sequen...
Article
Full-text available
The Bioinformation and DNA Data Bank of Japan (DDBJ) Center (https://www.ddbj.nig.ac.jp) maintains database archives that cover a wide range of fields in life sciences. As a founding member of the International Nucleotide Sequence Database Collaboration (INSDC), our primary mission is to collect and distribute nucleotide sequence data, as well as t...
Article
Full-text available
Perilla frutescens (Lamiaceae) is an important herbal plant with hundreds of bioactive chemicals, among which perillaldehyde and rosmarinic acid are the two major bioactive compounds in the plant. The leaves of red perilla are used as traditional Kampo medicine or food ingredients. However, the medicinal and nutritional uses of this plant could be...
Article
Full-text available
The liverwort Marchantia polymorpha is equipped with a wide range of molecular and genetic tools and resources that have led to its wide use to explore the evo-devo aspects of land plants. Although its diverse transcriptome data are rapidly accumulating, there is no extensive yet user-friendly tool to exploit such a compilation of data and to summa...
Preprint
Full-text available
The liverwort Marchantia polymorpha is equipped with a wide range of molecular and genetic tools and resources that have led to its wide use to explore the evo-devo aspects of land plants. Although its diverse transcriptome data are rapidly accumulating, there is no extensive yet user-friendly tool to exploit such a compilation of data and to summa...
Article
Full-text available
Secondary loss of photosynthesis is observed across almost all plastid-bearing branches of the eukaryotic tree of life. However, genome-based insights into the transition from a phototroph into a secondary heterotroph have so far only been revealed for parasitic species. Free-living organisms can yield unique insights into the evolutionary conseque...
Article
Full-text available
Background OryzaGenome (http://viewer.shigen.info/oryzagenome21detail/index.xhtml), a feature within Oryzabase (https://shigen.nig.ac.jp/rice/oryzabase/), is a genomic database for wild Oryza species that provides comparative and evolutionary genomics approaches for the rice research community. Results Here we release OryzaGenome2.1, the first maj...
Article
Full-text available
Sex determination is a central process for sexual reproduction and is often regulated by a sex determinant encoded on a sex chromosome. Rules that govern the evolution of sex chromosomes via specialization and degeneration following the evolution of a sex determinant have been well studied in diploid organisms. However, distinct predictions apply t...
Article
Full-text available
Cyanobacteria are a diverse group of Gram-negative prokaryotes that perform oxygenic photosynthesis. Cyanobacteria have been used for research on photosynthesis and have attracted attention as a platform for biomaterial/biofuel production. Cyanobacteria are also present in almost all habitats on Earth and have extensive impacts on global ecosystems...
Article
Planktothrix species are distributed worldwide, and these prevalent cyanobacteria occasionally form potentially devastating toxic blooms. Given the ecological and taxonomic importance of Planktothrix agardhii as a bloom species, we set out to determine the complete genome sequence of the type strain Planktothrix agardhii NIES-204. Remarkably, we fo...
Preprint
Full-text available
The domestic cat (Felis catus) is one of the most popular companion animals in the world. Comprehensive genomic resources will aid the development and application of veterinary medicine, including improving feline health and, in particular, enabling precision medicine, which is promising in human applications. However, currently available cat genome a...
Article
Two Gram-stain-positive, rod-shaped, non-motile, non-spore-forming, catalase-negative bacteria, designated strains SG162T and NK01, were isolated from Japanese rice grain silage and total mixed ration silage, respectively. They were initially identified as Lactobacillus buchneri based on the 16S rRNA gene sequence similarities. However, the two str...
Article
Recently, the prospect of applying machine learning tools for automating the process of annotation analysis of large-scale sequences from next-generation sequencers has raised the interest of researchers. However, finding research collaborators with knowledge of machine learning techniques is difficult for many experimental life scientists. One sol...
Article
Clostridium diolis shares high similarity based on 16S rRNA gene sequences and fatty acid composition with Clostridium beijerinckii. In this study, the taxonomic status of C. diolis was clarified using genomic and phenotypic approaches. High similarity was detected among C. diolis DSM 15410T, C. beijerinckii DSM 791T and NCTC 13035T, showing averag...
Article
Full-text available
Genome packaging by nucleosomes is a hallmark of eukaryotes. Histones and the pathways that deposit, remove, and read histone modifications are deeply conserved. Yet, we lack information regarding chromatin landscapes in extant representatives of ancestors of the main groups of eukaryotes, and our knowledge of the evolution of chromatin-related pro...
Preprint
Full-text available
Genome packaging by nucleosomes is a hallmark of eukaryotes. Histones and the pathways that deposit, remove, and read histone modifications are deeply conserved. Yet, we lack information regarding chromatin landscapes in extant representatives of ancestors of the main groups of eukaryotes and our knowledge of the evolution of chromatin related proc...
Chapter
DDBJ Fast Annotation and Submission Tool (DFAST) is a genome annotation pipeline for prokaryotes, which also assists data submission to the public sequence database. It is available both as a web service and as a stand-alone tool that runs on local machines. DFAST can annotate a typical-sized bacterial genome within 5 min. The default annotation wo...
Article
A taxonomic study of a Gram-stain-positive, rod-shaped, non-motile, non-spore-forming, catalase-negative bacterium, strain YK43T, isolated from spent mushroom substrates stored in Nagano, Japan was performed. Growth was detected at 15–45 °C, pH 5.0–8.5, and 0–10 % (w/v) NaCl. The genomic DNA G+C content of strain YK43T was 43.6 mol%. The predominan...
Article
The taxonomic status of Paenibacillus thermophilus was analyzed using genomic and phenotypic approaches. The results of RNA polymerase beta subunit gene sequence comparisons indicated that two type strains of P. thermophilus (DSM 24746T and JCM 17693T) and Paenibacillus macerans ATCC 8244T shared 100 % sequence similarity. By whole-genome sequence...
Article
Full-text available
Oryza officinalis is an accessible alien donor for genetic improvement of rice. Comparison across a representative panel of Oryza species showed that the wild O. officinalis and cultivated O. sativa ssp. japonica have similar cold tolerance potentials. The possibility that either distinct or similar genetic mechanisms are involved in the low temper...
Article
Three strains, JCM 5343T, JCM 5344 and JCM 1130, currently identified as Lactobacillus gasseri, were investigated using a polyphasic taxonomic approach. Although these strains shared high 16S rRNA gene sequence similarities with L. gasseri ATCC 33323T (99.9 %), they formed a clade clearly distinct from ATCC 33323T based on whole-genome relatedness....
Article
Full-text available
We report here the whole-genome sequence of Nostoc cycadae strain WK-1, which was isolated from cyanobacterial colonies growing in the coralloid roots of the gymnosperm Cycas revoluta. It can provide valuable resources to study the mutualistic relationships and the syntrophic metabolisms between the cyanobacterial symbiont and the host plant, C. r...
Article
Full-text available
Satsuma (Citrus unshiu Marc.) is one of the most abundantly produced mandarin varieties of citrus, known for its seedless fruit production and as a breeding parent of citrus. De novo assembly of the heterozygous diploid genome of Satsuma (“Miyagawa Wase”) was conducted by a hybrid assembly approach using short-read sequences, three mate-pair librar...
Article
Full-text available
Novel genomics-based approaches such as genome-wide association studies (GWAS) and genomic selection (GS) are expected to be useful in fruit tree breeding, which requires much time from the cross to the release of a cultivar because of the long generation time. In this study, a citrus parental population (111 varieties) and a breeding population (6...
Article
Full-text available
We developed a prokaryotic genome annotation pipeline, DFAST, that also supports genome submission to public sequence databases. DFAST was originally started as an on-line annotation server, and to date, over 7,000 jobs have been processed since its first launch in 2016. Here, we present a newly implemented background annotation engine for DFAST, w...
Article
Full-text available
The DNA Data Bank of Japan (DDBJ) Center (http://www.ddbj.nig.ac.jp) has been providing public data services for 30 years since 1987. We are collecting nucleotide sequence data and associated biological information from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC), in collaboration with the US Nati...
Article
Full-text available
The evolution of land flora transformed the terrestrial environment. Land plants evolved from an ancestral charophycean alga from which they inherited developmental, biochemical, and cell biological attributes. Additional biochemical and physiological adaptations to land, and a life cycle with an alternation between multicellular haploid and diploi...
Article
The International Nucleotide Sequence Database Collaboration (INSDC) has maintained a primary sequence database that collects experimentally determined nucleotide sequence data directly from researchers. Data deposition to the INSDC is now mandatory for publication in most scientific journals. However, the procedure to deposit data...
Article
Full-text available
Members of the cyanobacterial genus Synechococcus are abundant in marine environments. To better understand the genomic diversity of marine Synechococcus spp., we determined the complete genome sequence of a coastal cyanobacterium, Synechococcus sp. NIES-970. The genome had a size of 3.1 Mb, consisting of one chromosome and four plasmids.
Article
Genome annotation is a fundamental process in sequence analysis, through which biological knowledge is generated from sequenced genomic data. Good annotation not only enhances our own downstream analyses but also promotes subsequent research by others because it can propagate through public sequence databases. In this article, we will show ho...
Article
Full-text available
Whole-genome sequencing was performed for Lactobacillus parakefiri JCM 8573T to confirm its hitherto controversial taxonomic position. Here, we report its first reliable reference genome. Genome-wide metrics, such as average nucleotide identity and digital DNA-DNA hybridization, and phylogenomic analysis based on multiple genes supported its taxono...
Article
Full-text available
With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distr...
Data
Sample number of registered SRA and DNApod by study type. Data as of April 2016. The sample number of the registered SRA was searched using ENA. “Library strategy” is explained on the DDBJ SRA website (http://trace.ddbj.nig.ac.jp/dra/submission_e.html). (DOCX)
Data
Data quantity of each sample. Data quantity is described as the depth after the removal of multiple-hit reads on the genome. The depth of a reference genome is <5-fold in 87% of the DNApod genotypic data. (TIF)
Data
Heterogeneous base-quality raw sequence reads in SRAs. SRAs contain data of various quality values among NGS datasets from individual projects. To detect DNA polymorphisms with uniform reliability, DNApod performs pre-processing to filter out low quality values and detects DNA polymorphisms by using a uniform threshold. (TIF)
Data
Overview of the Galaxy virtual machine. The high-level analysis is configured in the Galaxy platform, which is implemented in the virtual machine image. The virtual machine image of the high-level analysis is launched by the Oracle VirtualBox on the user’s personal computer. The respective tools in high-level analysis are encapsulated in the Docker...
Data
Read loss per read length caused by elimination of multiple-hit reads. Maize exhibits a more profound effect resulting from read loss than do rice and sorghum after the elimination of multiple-hit reads. This predicted that a large-scale syntenic block of maize would cause comparatively higher multiple-hit reads. (TIF)
Article
Full-text available
Oligoflexus tunisiensis Shr3T is the first strain described in the newest (eighth) class Oligoflexia of the phylum Proteobacteria. This strain was isolated from the 0.2-μm filtrate of a suspension of sand gravels collected in the Sahara Desert in the Republic of Tunisia. The genome of O. tunisiensis Shr3T comprises 6,406 protein-coding and 57 R...
Article
Full-text available
Most indigenous citrus varieties are assumed to be natural hybrids, but their parentage has so far been determined in only a few cases because of their wide genetic diversity and the low transferability of DNA markers. Here we infer the parentage of indigenous citrus varieties using simple sequence repeat and indel markers developed from various ci...
Article
Full-text available
The first-ever cyanobacterial genome sequence was determined two decades ago, and CyanoBase (http://genome.microbedb.jp/cyanobase), the first database for cyanobacteria, was simultaneously developed to allow this genomic information to be used more efficiently. Since then, CyanoBase has constantly been extended and has received several updates. Here,...
Article
Genome finishing still remains a laborious work that includes various validation processes requiring both wet and dry knowledge and consideration, although long-read sequencers such as PacBio RSII have largely contributed to lighten the burden. We here introduce a procedure of post-assemble validation in which draft contigs are circularized into co...
Article
Full-text available
The DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) has been providing public data services for thirty years (since 1987). We are collecting nucleotide sequence data from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC, http://www.insdc.org), in collaboration with the US National Center for B...
Article
Full-text available
Quality assurance and correct taxonomic affiliation of data submitted to public sequence databases have been an everlasting problem. The DDBJ Fast Annotation and Submission Tool (DFAST) is a newly developed genome annotation pipeline with quality and taxonomy assessment tools. To enable annotation of ready-to-submit quality, we also constructed cur...
Article
Full-text available
The large-scale genotyping assay is a prerequisite for modern genetic analysis, and single-nucleotide polymorphism (SNP) markers that enable high-throughput genotyping are widely used for genome-wide association studies (GWAS) and genomic selection (GS). However, SNP markers randomly selected from limited genome data often fail in genotyping certai...
Article
Full-text available
Aurantimicrobium minutum type strain KNC(T) is a planktonic ultramicrobacterium isolated from river water in western Japan. Strain KNC(T) has an extremely small, streamlined genome of 1,622,386 bp comprising 1,575 protein-coding sequences. The genome annotation suggests that strain KNC(T) has an actinorhodopsin-based photometabolism.
Article
Long-read sequencing represented by Pacific Biosciences’ single-molecule real-time (SMRT) technology has been widely used for microbial genomes. We overview an analysis procedure of Lactobacillus hokkaidonensis LOOC260T genome using the so-called “PacBio” data. We describe (i) the characteristics of PacBio data, (ii) genome assembly using the HGAP...
Article
Full-text available
Cyanobacterial genus Leptolyngbya comprises genetically diverse species, but the availability of their complete genome information is limited. Here, we isolated Leptolyngbya sp. strain NIES-3755 from soil at the Toyohashi University of Technology, Japan. We determined the complete genome sequence of the NIES-3755 strain, which is composed of one ch...
Article
Genome assembly is a major task of NGS analyses. For this purpose, many assemblers based on the de Bruijn graph have been developed. In the framework, each node represents a series of overlapping k-mers (k nucleotides) and contigs can be obtained as paths solved from the k-mer graph. The most significant parameter is therefore k. We overview conven...
Article
Cyanobacterial phytochrome-class photosensors are emerging optogenetic tools, but the availability of thermoresistant photosensors is limited. We isolated Fischerella sp. strain NIES-3754 from a hot spring at Suwa-shrine, Suwa, Nagano, Japan. We determined the complete genome sequence of the NIES-3754 strain, which is composed of one chromosome and t...
Article
Full-text available
While Marchantia polymorpha has been utilized as a model system to investigate fundamental biological questions for almost two centuries, there is renewed interest in M. polymorpha as a model genetic organism in the genomics era. Here we outline community guidelines for M. polymorpha gene and transgene nomenclature, and we anticipate that thes...
Article
Full-text available
Lactobacillus hokkaidonensis is an obligate heterofermentative lactic acid bacterium, which was isolated from Timothy grass silage in Hokkaido, a subarctic region of Japan. This bacterium is expected to be useful as a silage starter culture in cold regions because of its remarkable psychrotolerance; it can grow at temperatures as low as 4°C. To eluc...
Article
To explore the diverse photoreceptors of cyanobacteria, we isolated Nostoc sp. strain NIES-3756 from soil at Mimomi-Park, Chiba, Japan, and determined its complete genome sequence. The genome consists of one chromosome and two plasmids (total 6,987,571 bp containing no gaps). The NIES-3756 strain carries 7 phytochrome and 12 cyanobacteriochrome gene...
Article
Full-text available
The DNA Data Bank of Japan Center (DDBJ Center; http://www.ddbj.nig.ac.jp) maintains and provides public archival, retrieval and analytical services for biological information. The contents of the DDBJ databases are shared with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the fra...
Article
Full-text available
The species in the genus Oryza, encompassing nine genome types and 23 species, are a rich genetic resource and may have applications in deeper genomic analyses aiming to understand the evolution of plant genomes. With the advancement of NGS technology, a flood of Oryza species reference genomes and genomic variation information has become available...
Article
We release a high-resolution map of genomic transformation-competent artificial chromosome (TAC) clones extending over all Arabidopsis thaliana (Arabidopsis) chromosomes. The Arabidopsis genomic TAC clones have been valuable genetic tools. Previously, we constructed an Arabidopsis genomic TAC library, which consists of more than ten thousand TAC cl...
Article
Full-text available
Bifidobacterium longum 105-A shows high transformation efficiency and allows for the generation of gene knockout mutants through homologous recombination. Here, we report the complete genome sequence of strain 105-A. Genes encoding at least four putative restriction-modification systems were found in this genome, which might contribute to its trans...
Article
Full-text available
The DNA Data Bank of Japan Center (DDBJ Center; http://www.ddbj.nig.ac.jp) maintains and provides public archival, retrieval and analytical services for biological information. Since October 2013, DDBJ Center has operated the Japanese Genotype-phenotype Archive (JGA) in collaboration with our partner institute, the National Bioscience Database Cent...
Article
Full-text available
Although cyanobacteria are photoautotrophs, they have heterotrophic metabolism that enables them to survive in their natural habitat. However, cyanobacterial species that grow heterotrophically in the dark are rare. It remains largely unknown how cyanobacteria regulate heterotrophic activity. The cyanobacterium Leptolyngbya boryana grows heterotrop...
Article
Full-text available
We report the 1.86-Mb draft genome and annotation of Lactobacillus oryzae SG293T isolated from fermented rice grains. This genome information may provide further insights into the mechanisms underlying the fermentation of rice grains.
Article
In this study, the genes expressed in response to low pH stress were identified in the unicellular cyanobacterium Synechocystis sp. PCC 6803 using DNA microarrays. The expression of slr0967 and sll0939 constantly increased throughout 4-h acid stress conditions. Overexpression of these two genes under the control of the trc promoter induced the cell...
Article
Full-text available
Weissella oryzae was originally isolated from fermented rice grains. Here we report the draft genome sequence of the type strain of W. oryzae. This first report on the genomic sequence of this species may help identify the mechanisms underlying bacterial adaptation to the ecological niche of fermented rice grains.
Article
Full-text available
Microbial genome sequence submissions to the International Nucleotide Sequence Database Collaboration (INSDC) have been annotated with organism names that include the strain identifier. Each of these strain-level names has been assigned a unique 'taxid' in the NCBI Taxonomy Database. With the significant growth in genome sequencing, it is not possi...
Article
Full-text available
The colonization of land by plants was a key event in the evolution of life. Here we report the draft genome sequence of the filamentous terrestrial alga Klebsormidium flaccidum (Division Charophyta, Order Klebsormidiales) to elucidate the early transition step from aquatic algae to land plants. Comparison of the genome sequence with that of other...
Article
In forward genetics, identification of mutations is a time-consuming and laborious process. Modern whole-genome sequencing, coupled with bioinformatics analysis, has enabled fast and cost-effective mutation identification. However, for many experimental researchers, bioinformatics analysis is still a difficult aspect of whole-genome sequencing. To...
