Satoshi Tabata

Kazusa DNA Research Institute, Kisarazu, Chiba, Japan

Publications (418) · 2509.39 Total impact

  • PLoS ONE 09/2015; 10(9):e0139127. DOI:10.1371/journal.pone.0139127 · 3.23 Impact Factor
  • Shunichi Kosugi · Hideki Hirakawa · Satoshi Tabata
    ABSTRACT: Genome assemblies generated with next generation sequencing (NGS) reads usually contain a number of gaps. Several tools have recently been developed to close the gaps in these assemblies with NGS reads. Although these gap-closing tools efficiently close the gaps, they entail a high rate of misassembly at gap-closing sites. We have found that the assembly error rates caused by these tools are 20-500-fold higher than the rate of errors introduced into contigs by de novo assemblers. We here describe GMcloser, a tool that accurately closes these gaps with a preassembled contig set or a long read set (i.e., error-corrected PacBio reads). GMcloser uses likelihood-based classifiers calculated from the alignment statistics between scaffolds, contigs, and paired-end reads to correctly assign contigs or long reads to gap regions of scaffolds, thereby achieving accurate and efficient gap closure. We demonstrate with sequencing data from various organisms that the gap-closing accuracy of GMcloser is 3-100-fold higher than those of other available tools, with similar efficiency. GMcloser and an accompanying tool (GMvalue) for evaluating the assembly and correcting misassemblies except SNPs and short indels in the assembly are available at © The Author (2015). Published by Oxford University Press. All rights reserved.
    Bioinformatics 08/2015; DOI:10.1093/bioinformatics/btv465 · 4.98 Impact Factor
  • ABSTRACT: We release a high-resolution map of genomic transformation-competent artificial chromosome (TAC) clones extending over all Arabidopsis thaliana (Arabidopsis) chromosomes. The Arabidopsis genomic TAC clones have been valuable genetic tools. Previously, we constructed an Arabidopsis genomic TAC library, which consists of more than ten thousand TAC clones harboring large genomic DNA fragments extending over the whole genome of Arabidopsis. Here, we determined 13,577 end sequences from 6,987 Arabidopsis TAC clones and mapped 5,937 TAC clones to precise locations, covering approximately 90% of the Arabidopsis chromosomes. We present the large-scale data set of TAC clones with high-resolution mapping information as a Java application tool, the Arabidopsis TAC Position Viewer, which provides ready-to-go transformable genomic DNA clones corresponding to certain loci on the Arabidopsis chromosomes. The TAC clone resources will accelerate genomic DNA cloning, positional walking, complementation of mutants, and DNA transformation for heterologous gene expression. This article is protected by copyright. All rights reserved.
    The Plant Journal 07/2015; 83(6). DOI:10.1111/tpj.12949 · 5.97 Impact Factor
  • ABSTRACT: Ipomoea trifida (H. B. K.) G. Don. is the most likely diploid ancestor of the hexaploid sweet potato, I. batatas (L.) Lam. To assist in analysis of the sweet potato genome, de novo whole-genome sequencing was performed with two lines of I. trifida, namely the selfed line Mx23Hm and the highly heterozygous line 0431-1, using the Illumina HiSeq platform. We classified the sequences thus obtained as either 'core candidates' (common to the two lines) or 'line specific'. The total length of the assembled sequences of Mx23Hm (ITR_r1.0) was 513 Mb, while that of 0431-1 (ITRk_r1.0) was 712 Mb. Of the assembled sequences, 240 Mb (Mx23Hm) and 353 Mb (0431-1) were classified into core candidate sequences. A total of 62,407 (62.4 Mb) and 109,449 (87.2 Mb) putative genes were identified, respectively, in the genomes of Mx23Hm and 0431-1, of which 11,823 were derived from core sequences of Mx23Hm, while 28,831 were from the core candidate sequence of 0431-1. There were a total of 1,464,173 single-nucleotide polymorphisms and 16,682 copy number variations (CNVs) in the two assembled genomic sequences (under the condition of log2 ratio of >1 and CNV size >1,000 bases). The results presented here are expected to contribute to the progress of genomic and genetic studies of I. trifida, as well as studies of the sweet potato and the genus Ipomoea in general. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
    DNA Research 03/2015; 22(2). DOI:10.1093/dnares/dsv002 · 5.48 Impact Factor
  • ABSTRACT: In Batesian mimicry, animals avoid predation by resembling distasteful models. In the swallowtail butterfly Papilio polytes, only mimetic-form females resemble the unpalatable butterfly Pachliopta aristolochiae. A recent report showed that a single gene, doublesex (dsx), controls this mimicry; however, the detailed molecular mechanisms remain unclear. Here we determined two whole-genome sequences of P. polytes and a related species, Papilio xuthus, identifying a single ∼130-kb autosomal inversion, including dsx, between mimetic (H-type) and non-mimetic (h-type) chromosomes in P. polytes. This inversion is associated with the mimicry-related locus H, as identified by linkage mapping. Knockdown experiments demonstrated that female-specific dsx isoforms expressed from the inverted H allele (dsx(H)) induce mimetic coloration patterns and simultaneously repress non-mimetic patterns. In contrast, dsx(h) does not alter mimetic patterns. We propose that dsx(H) switches the coloration of predetermined wing patterns and that female-limited polymorphism is tightly maintained by chromosomal inversion.
    Nature Genetics 03/2015; 47(4). DOI:10.1038/ng.3241 · 29.35 Impact Factor
  • ABSTRACT: Genome-wide mutations induced by ethyl methanesulfonate (EMS) and gamma irradiation in the tomato Micro-Tom genome were identified by a whole-genome shotgun sequencing analysis to estimate the spectrum and distribution of whole-genome DNA mutations and the frequency of deleterious mutations. A total of ~370 Gb of paired-end reads for four EMS-induced mutants and three gamma-ray-irradiated lines as well as a wild-type line were obtained by next-generation sequencing technology. Using bioinformatics analyses, we identified 5,920 induced single nucleotide variations and insertion/deletion (indel) mutations. The predominant mutations in the EMS mutants were C/G to T/A transitions, while in the gamma-ray mutants, C/G to T/A transitions, A/T to T/A transversions, A/T to G/C transitions and deletion mutations were equally common. Biases in the base composition flanking mutations differed between the mutagenesis types. Regarding the effects of the mutations on gene function, >90% of the mutations were located in intergenic regions, and only 0.2% were deleterious. In addition, we detected 1,140,687 spontaneous single nucleotide polymorphisms and indel polymorphisms in wild-type Micro-Tom lines. We also found copy number variation, deletions and insertions of chromosomal segments in both the mutant and wild-type lines. The results provide helpful information not only for mutation research, but also for mutant screening methodology with reverse-genetic approaches. © 2015 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
    Plant Biotechnology Journal 02/2015; DOI:10.1111/pbi.12348 · 5.75 Impact Factor
  • ABSTRACT: In leguminous root nodules, rhizobia differentiate into morphology specific to symbiosis, called bacteroids. As bacteroids are surrounded with peribacteroid membranes filled with peribacteroid solution (PBS), it is considered that PBS contains substances inducing differentiation of rhizobia into bacteroids. In this study, genome-wide expression profiles of Bradyrhizobium japonicum cells cultured in PBS purified from root nodule of soybean (Glycine max L.) were compared with those of bacteroids using macroarray. PBS treatment preferentially induced regions in a large symbiosis island including various symbiosis relevant genes such as nod, fix, nol and noe, in which 75% of regions were commonly induced in bacteroids, while general repressions outside of the symbiosis island seen in bacteroids were not observed in PBS treated cells. The present results suggest that PBS contained some, but not all, substances inducing expression of the genes which are involved in differentiation into bacteroids.
    Soil Science and Plant Nutrition 01/2015; 61(3):1-10. DOI:10.1080/00380768.2014.994470 · 0.73 Impact Factor
  • ABSTRACT: Apomixis in plants generates clonal progeny with a maternal genotype through asexual seed formation. Hieracium subgenus Pilosella (Asteraceae) contains polyploid, highly heterozygous apomictic and sexual species. Within apomictic Hieracium, dominant genetic loci independently regulate the qualitative developmental components of apomixis. In H. praealtum, LOSS OF APOMEIOSIS (LOA) enables formation of embryo sacs without meiosis and LOSS OF PARTHENOGENESIS (LOP) enables fertilization-independent seed formation. A locus required for fertilization-independent endosperm formation (AutE) has been identified in H. piloselloides. Additional quantitative loci appear to influence the penetrance of the qualitative loci, although the controlling genes remain unknown. This study aimed to develop the first genetic linkage maps for sexual and apomictic Hieracium species using simple sequence repeat (SSR) markers derived from expressed transcripts within the developing ovaries. RNA from microdissected Hieracium ovule cell types and ovaries was sequenced and SSRs were identified. Two different F1 mapping populations were created to overcome difficulties associated with genome complexity and asexual reproduction. SSR markers were analysed within each mapping population to generate draft linkage maps for apomictic and sexual Hieracium species. A collection of 14,684 Hieracium expressed SSR markers were developed and linkage maps were constructed for Hieracium species using a subset of the SSR markers. Both the LOA and LOP loci were successfully assigned to linkage groups; however, AutE could not be mapped using the current populations. Comparisons with lettuce (Lactuca sativa) revealed partial macrosynteny between the two Asteraceae species. A collection of SSR markers and draft linkage maps were developed for two apomictic and one sexual Hieracium species. These maps will support cloning of controlling genes at LOA and LOP loci in Hieracium and should also assist with identification of quantitative loci that affect the expressivity of apomixis. Future work will focus on mapping AutE using alternative populations. © The Author 2014. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved.
    Annals of Botany 12/2014; 115(4). DOI:10.1093/aob/mcu249 · 3.65 Impact Factor
  • Dataset: 1456
  • ABSTRACT: Unlike other important Solanaceae crops such as tomato, potato, chili pepper, and tobacco, all of which originated in South America and are cultivated worldwide, eggplant (Solanum melongena L.) is indigenous to the Old World and in this respect it is phylogenetically unique. To broaden our knowledge of the genomic nature of solanaceous plants further, we dissected the eggplant genome and built a draft genome dataset with 33,873 scaffolds termed SME_r2.5.1 that covers 833.1 Mb, ca. 74% of the eggplant genome. Approximately 90% of the gene space was estimated to be covered by SME_r2.5.1 and 85,446 genes were predicted in the genome. Clustering analysis of the predicted genes of eggplant along with the genes of three other solanaceous plants as well as Arabidopsis thaliana revealed that, of the 35,000 clusters generated, 4,018 were exclusively composed of eggplant genes that would perhaps confer eggplant-specific traits. Between eggplant and tomato, 16,573 pairs of genes were deduced to be orthologous, and 9,489 eggplant scaffolds could be mapped onto the tomato genome. Furthermore, 56 conserved synteny blocks were identified between the two species. The detailed comparative analysis of the eggplant and tomato genomes will facilitate our understanding of the genomic architecture of solanaceous plants, which will contribute to cultivation and further utilization of these crops.
    DNA Research 09/2014; 21(6). DOI:10.1093/dnares/dsu027 · 3.05 Impact Factor
  • ABSTRACT: To develop a high density linkage map in faba bean, a total of 1,363 FBES (Faba bean expressed sequence tag [EST] derived simple sequence repeat [SSR]) markers were designed based on 5,090 non-redundant ESTs developed in this study. A total of 109 plants of a ‘Nubaria 2’ × ‘Misr 3’ F2 mapping population were used for map construction. Because the parents were not pure homozygous lines, the 109 F2 plants were divided into three subpopulations according to the original F1 plants. Linkage groups (LGs) generated in each subpopulation were integrated by commonly mapped markers. The integrated ‘Nubaria 2’ × ‘Misr 3’ map consisted of six LGs, representing a total length of 684.7 cM, with 552 loci. Of the mapped loci, 47% were generated from multi-loci diagnostic (MLD) markers. Alignment of homologous sequence pairs along each linkage group revealed obvious syntenic relationships between LGs in faba bean and the genomes of two model legumes, Lotus japonicus and Medicago truncatula. In a polymorphic analysis with ten Egyptian faba bean varieties, 78.9% (384/487) of the FBES markers showed polymorphisms. Along with the EST-SSR markers, the dense map developed in this study is expected to accelerate marker assisted breeding in faba bean.
    Breeding Science 09/2014; 64(3). DOI:10.1270/jsbbs.64.252 · 2.13 Impact Factor
  • ABSTRACT: In order to provide useful genomic information for agronomical plants, we have established a database, the Kazusa Marker DataBase. This database includes information on DNA markers, e.g., SSR and SNP markers, genetic linkage maps, and physical maps, that were developed at the Kazusa DNA Research Institute. Keyword searches for the markers, sequence data used for marker development, and experimental conditions are also available through this database. Currently, 10 plant species have been targeted: tomato (Solanum lycopersicum), pepper (Capsicum annuum), strawberry (Fragaria × ananassa), radish (Raphanus sativus), Lotus japonicus, soybean (Glycine max), peanut (Arachis hypogaea), red clover (Trifolium pratense), white clover (Trifolium repens), and eucalyptus (Eucalyptus camaldulensis). In addition, the number of plant species registered in this database will be increased as our research progresses. The Kazusa Marker DataBase will be a useful tool for both basic and applied sciences, such as genomics, genetics, and molecular breeding in crops.
    Breeding Science 09/2014; 64(3):264-71. DOI:10.1270/jsbbs.64.264 · 2.13 Impact Factor
  • ABSTRACT: The nuclear export of proteins is regulated largely through the exportin/CRM1 pathway, which involves the specific recognition of leucine-rich nuclear export signals (NESs) in the cargo proteins, and modulates nuclear-cytoplasmic protein shuttling by antagonizing the nuclear import activity mediated by importins and the nuclear import signal (NLS). Although the prediction of NESs can help to define proteins that undergo regulated nuclear export, current methods of predicting NESs, including computational tools and consensus-sequence-based searches, have limited accuracy, especially in terms of their specificity. We found that each residue within an NES largely contributes independently and additively to the entire nuclear export activity. We created activity-based profiles of all classes of NESs with a comprehensive mutational analysis in mammalian cells. The profiles highlight a number of specific activity-affecting residues not only at the conserved hydrophobic positions but also in the linker and flanking regions. We then developed a computational tool, NESmapper, to predict NESs by using profiles that had been further optimized by training and combining the amino acid properties of the NES-flanking regions. This tool successfully reduced the considerable number of false positives, and the overall prediction accuracy was higher than that of other methods, including NESsential and Wregex. This profile-based prediction strategy is a reliable way to identify functional protein motifs. NESmapper is available at
    PLoS Computational Biology 09/2014; 10(9):e1003841. DOI:10.1371/journal.pcbi.1003841 · 4.62 Impact Factor
  • ABSTRACT: In this study, the genes expressed in response to low pH stress were identified in the unicellular cyanobacterium Synechocystis sp. PCC 6803 using DNA microarrays. The expression of slr0967 and sll0939 constantly increased throughout 4-h acid stress conditions. Overexpression of these two genes under the control of the trc promoter induced the cells to become tolerant to acid stress. The Δslr0967 and Δsll0939 mutant cells exhibited sensitivity to osmotic and salt stress, whereas the trc mutants of these genes exhibited tolerance to these types of stress. Microarray analysis of the Δslr0967 mutant under acid stress conditions showed that expression of the high light-inducible protein ssr2595 (HliB) and the two-component response regulator slr1214 (rre15) were out of regulation due to gene inactivation, whereas they were upregulated by acid stress in the wild-type cells. Microarray analysis and real-time quantitative reverse transcription-polymerase chain reaction analysis showed that the expression of sll0939 was significantly repressed in the slr0967 deletion mutant. These results suggest that sll0939 is directly involved in the low pH tolerance of Synechocystis sp. PCC 6803 and that slr0967 may be essential for the induction of acid stress-responsive genes.
    Plant Physiology and Biochemistry 08/2014; 81. DOI:10.1016/j.plaphy.2014.02.007 · 2.76 Impact Factor
  • ABSTRACT: Genetic transformation systems using reporter genes in whole plants have a wide variety of applications for molecular biological study including the visualization of expression patterns of particular genes and intracellular biological phenomena as well as the identification of novel genes. In this study, we assessed co-expression of each of three codon-optimized reporter genes and a selectable marker in the nuclear transformation system of whole Pyropia yezoensis, a red marine alga. With the use of an endogenous promoter, both the codon-optimized hygromycin resistance gene and β-glucuronidase gene (PyGUS) were co-expressed in P. yezoensis cells. A high level of GUS activity was observed in 60% of the individuals in hygromycin-resistant lines. A histochemical GUS assay revealed that the PyGUS reporter gene was stably introduced and expressed throughout the alga's life cycle. In addition, two live cell reporters, humanized cyan fluorescent protein from Anemonia majano and luciferase from Gaussia princeps, were successfully expressed in whole P. yezoensis. The development of this transformation system involving three types of reporter genes provides opportunities for monitoring temporal changes in gene expression and for genetic screening in red marine algae.
    Journal of Applied Phycology 08/2014; 26(4):1863-1868. DOI:10.1007/s10811-013-0234-x · 2.56 Impact Factor
  • ABSTRACT: Using a whole-genome-sequencing approach to explore germplasm resources can serve as an important strategy for crop improvement, especially in investigating wild accessions that may contain useful genetic resources that have been lost during the domestication process. Here we sequence and assemble a draft genome of wild soybean and construct a recombinant inbred population for genotyping-by-sequencing and phenotypic analyses to identify multiple QTLs relevant to traits of interest in agriculture. We use a combination of de novo sequencing data from this work and our previous germplasm re-sequencing data to identify a novel ion transporter gene, GmCHX1, and relate its sequence alterations to salt tolerance. Rapid gain-of-function tests show the protective effects of GmCHX1 towards salt stress. This combination of whole-genome de novo sequencing, high-density-marker QTL mapping by re-sequencing and functional analyses can serve as an effective strategy to unveil novel genomic information in wild soybean to facilitate crop improvement.
    Nature Communications 07/2014; 5:4340. DOI:10.1038/ncomms5340 · 11.47 Impact Factor
  • ABSTRACT: Single or double flower type is one of the most important breeding targets in carnation (Dianthus caryophyllus L.). We mapped the D85 locus, which controls flower type, to LG 85P_15-2 using a simple sequence repeat (SSR)-based genetic linkage map constructed using 91 F2 progeny derived from a cross between line 85-11 (double flower) and ‘Pretty Favvare’ (single flower). A positional comparison using SSR markers as anchor loci revealed that the map positions of the D85 locus corresponded to the single locus controlling the single flower type derived from wild D. capitatus ssp. andrzejowskianus. We identified four co-segregating SSR markers on the D85 locus. Verification of the SSR markers in commercial cultivars revealed that two of the four SSR markers (CES0212 and CES1982) were tightly linked to the D85 locus, and amplified a 176-bp and 269-bp allele, respectively, which were common and unique to double flower cultivars. The map positions of the D85 locus and the tightly linked SSR markers will be useful for determining the genetic basis of flower type and for marker-assisted breeding of carnations.
    Euphytica 07/2014; 198(2). DOI:10.1007/s10681-014-1090-8 · 1.39 Impact Factor
  • ABSTRACT: The colonization of land by plants was a key event in the evolution of life. Here we report the draft genome sequence of the filamentous terrestrial alga Klebsormidium flaccidum (Division Charophyta, Order Klebsormidiales) to elucidate the early transition step from aquatic algae to land plants. Comparison of the genome sequence with that of other algae and land plants demonstrates that K. flaccidum acquired many genes specific to land plants. We demonstrate that K. flaccidum indeed produces several plant hormones and homologues of some of the signalling intermediates required for hormone actions in higher plants. The K. flaccidum genome also encodes a primitive system to protect against the harmful effects of high-intensity light. The presence of these plant-related systems in K. flaccidum suggests that, during evolution, this alga acquired the fundamental machinery required for adaptation to terrestrial environments.
    Nature Communications 05/2014; 5:3978. DOI:10.1038/ncomms4978 · 11.47 Impact Factor
  • ABSTRACT: Radish (Raphanus sativus L., n = 9) is one of the major vegetables in Asia. Since the genomes of Brassica and related species including radish underwent genome rearrangement, it is quite difficult to perform functional analysis based on the reported genomic sequence of Brassica rapa. Therefore, we performed genome sequencing of radish. Short reads of genomic sequences of 191.1 Gb were obtained by next-generation sequencing (NGS) for a radish inbred line, and 76,592 scaffolds of ≥300 bp were constructed along with the bacterial artificial chromosome-end sequences. Finally, the whole draft genomic sequence of 402 Mb spanning 75.9% of the estimated genomic size and containing 61,572 predicted genes was obtained. Subsequently, 221 single nucleotide polymorphism markers and 768 PCR-RFLP markers were used together with the 746 markers produced in our previous study for the construction of a linkage map. The map was combined further with another radish linkage map constructed mainly with expressed sequence tag-simple sequence repeat markers into a high-density integrated map of 1,166 cM with 2,553 DNA markers. A total of 1,345 scaffolds were assigned to the linkage map, spanning 116.0 Mb. Bulked PCR products amplified by 2,880 primer pairs were sequenced by NGS, and SNPs in eight inbred lines were identified.
    DNA Research 05/2014; DOI:10.1093/dnares/dsu014 · 5.48 Impact Factor

Publication Stats

26k Citations
2,509.39 Total Impact Points


  • 1994–2015
    • Kazusa DNA Research Institute
      Kisarazu, Chiba, Japan
  • 2013
    • University of Brasília
      Brasília, Federal District, Brazil
    • Kyushu University
      • Department of Bioscience and Biotechnology
      Fukuoka, Fukuoka, Japan
  • 2012
    • National Institute for Basic Biology
      Okazaki, Aichi, Japan
  • 2011
    • Tohoku University
      • Graduate School of Life Sciences
      Sendai, Miyagi, Japan
    • National Institute of Genetics
      Mishima, Shizuoka, Japan
  • 2010
    • National Agriculture and Food Research Organization
      Tsukuba, Ibaraki, Japan
    • Chinese Academy of Sciences
      • Northeast Institute of Geography and Agroecology
      Beijing, China
  • 2008
    • Hokkaido University
      • Research Faculty of Agriculture
      Sapporo, Hokkaidō, Japan
  • 2004–2007
    • National Institute of Agrobiological Sciences
      • Division of Plant Sciences
      Tsukuba, Ibaraki, Japan
  • 2002–2005
    • Aarhus University
      • Centre for Carbohydrate Recognition and Signalling (CARB)
      Aarhus, Central Jutland, Denmark
    • Osaka University
      • Graduate School of Engineering
      Suita, Ōsaka, Japan
  • 2003
    • Nihon University
      • Department of Applied Biological Science
      Tōkyō, Japan
  • 2001
    • Michigan State University
      East Lansing, Michigan, United States
  • 1997–1999
    • Connecticut Agricultural Experiment Station
      New Haven, Connecticut, United States
  • 1996
    • Georgia Institute of Technology
      • School of Biology
      Atlanta, Georgia, United States
  • 1995
    • Nara Institute of Science and Technology
      • Graduate School of Biological Sciences
      Ikoma, Nara, Japan
  • 1988–1994
    • Nagoya University
      • Division of Cell Science
      Nagoya, Aichi, Japan
  • 1984–1987
    • University of California, San Diego
      San Diego, California, United States
  • 1985
    • Kyoto University
      • Institute for Chemical Research
      Kyoto, Kyōto, Japan