Minoru Kanehisa’s research while affiliated with Kyoto University and other places


Publications (484)


Figure 2. One of the local gene order alignments obtained by comparing KO sequences converted from 21,325 human genes and 22,435 mouse genes. Gene identifiers are aligned with matching K numbers in the middle (upper left) and the list of KOs can be viewed from the link (upper right). This particular alignment in human chromosome 11 and mouse chromosome 9 (reversed) contains olfactory receptor repeats at first, and the two genome maps shown here start with hsa:116337 and mmu:208098. The coloring of genes indicates functional categories of KOs. To reproduce this result, access https://www.kegg.jp/kegg/syntax/gnalign.html, enter the organism codes hsa and mmu as Genome1 and Genome2, respectively, and click on the 'Align by KOs' button.
Figure 3. KEGG pathway map for nitrogen cycle (https://www.kegg.jp/pathway/map01310), a new biogeochemical cycle map. KEGG modules in the left panel can be used to display a specific chemical transformation process as a red colored segment, such as M00175 for nitrogen fixation, and also to examine enzyme genes involved. The map contains a link to Taxonomy mapping, which displays in a separate window (top right corner) taxonomic categories of organism groups involved according to the seven constituent modules.
Ortholog table and related tools
KEGG: biological systems database as a model of the real world
  • Article
  • Full-text available

October 2024 · 29 Reads · 250 Citations

Nucleic Acids Research

Minoru Kanehisa · Miho Furumichi · Yoko Sato · [...] · Mari Ishiguro-Watanabe

KEGG (https://www.kegg.jp/) is a database resource for representation and analysis of biological systems. Pathway maps are the primary dataset in KEGG representing systemic functions of the cell and the organism in terms of molecular interaction and reaction networks. The KEGG Orthology (KO) system is a mechanism for linking genes and proteins to pathway maps and other molecular networks. Each KO is a generic gene identifier and each pathway map is created as a network of KO nodes. This architecture enables KEGG pathway mapping to uncover systemic features from KO assigned genomes and metagenomes. Additional roles of KOs include characterization of conserved genes and conserved units of genes in organism groups, which can be done by taxonomy mapping. A new tool has been developed for identifying conserved gene orders in chromosomes, in which gene orders are treated as sequences of KOs. Furthermore, a new dataset called VOG (virus ortholog group) is computationally generated from virus proteins and expanded to proteins of cellular organisms, allowing gene orders to be compared as VOG sequences as well. Together with these datasets and analysis tools, new types of pathway maps are being developed to present a global view of biological processes involving multiple organism groups.
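The idea of treating gene orders as sequences of KOs can be illustrated with a generic local alignment. The sketch below is not the KEGG tool's algorithm, which the abstract does not specify; it is a plain Smith-Waterman alignment over lists of K numbers, and the function name, scoring values and K numbers are all assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple


def local_align_kos(
    seq1: List[Optional[str]],
    seq2: List[Optional[str]],
    match: int = 2,      # assumed score for identical K numbers
    mismatch: int = -1,  # assumed penalty for different K numbers
    gap: int = -1,       # assumed penalty for skipping a gene
) -> Tuple[int, List[Tuple[Optional[str], Optional[str]]]]:
    """Smith-Waterman local alignment of two gene orders given as K numbers.

    Genes without a KO assignment are passed as None and never match.
    """

    def sub(a: Optional[str], b: Optional[str]) -> int:
        return match if a is not None and a == b else mismatch

    n, m = len(seq1), len(seq2)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub(seq1[i - 1], seq2[j - 1]),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    pairs: List[Tuple[Optional[str], Optional[str]]] = []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        if score[i][j] == score[i - 1][j - 1] + sub(seq1[i - 1], seq2[j - 1]):
            pairs.append((seq1[i - 1], seq2[j - 1]))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            pairs.append((seq1[i - 1], None))
            i -= 1
        else:
            pairs.append((None, seq2[j - 1]))
            j -= 1
    pairs.reverse()
    return best, pairs


# Hypothetical gene orders, each gene replaced by its K number.
genome1 = ["K00001", "K00002", "K00003", "K09999"]
genome2 = ["K05555", "K00001", "K00002", "K00003"]
best_score, aligned = local_align_kos(genome1, genome2)
print(best_score, aligned)
```

To look for conserved orders on the opposite strand, as in the reversed mouse chromosome of Figure 2, one of the two lists could be reversed before alignment.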


In a simplified view, the KEGG database consists of molecular networks (PATHWAY, BRITE), molecular building blocks (GENES, COMPOUND, GLYCAN), and the linkage between them (KO). Cellular and organism‐level functions can be revealed from the genome, metagenome, and metabolome through KEGG mapping against the KEGG database.
An example of virus taxonomy mapping, where taxonomic distributions of three types of VOC clusters and the assigned KO are shown for a viral gene, vg:1486428. This can be reproduced as follows: access https://www.kegg.jp/entry/vg:1486428, click on the Voc button and select the Taxonomy link.
An example of finding conserved gene orders in virus genomes. Human cytomegalovirus (human betaherpesvirus 5) genome is compared with other genomes around envelope glycoprotein B (vg:3077424) using the 30% VOC cluster. This can be reproduced as follows: access https://www.kegg.jp/genome/10359, locate the gene 3077424 using the search box, select a set of genes and perform gene cluster search. When conserved gene orders are found, the taxonomic tree can be used to select which genomes to display.
An example of using the automatic KO assignment server GhostKOALA (https://www.kegg.jp/ghostkoala/) for a marine environmental sample (KEGG identifier T30798). The result page contains three pie charts: (a) summary of KO assignments with color coding of KEGG functional categories, (b) taxonomic distributions of proteins in the sample and (c) taxonomic distributions of viral proteins in the sample.
KEGG tools for classification and analysis of viral proteins

November 2023 · 85 Reads · 33 Citations

The KEGG database and analysis tools (https://www.kegg.jp) have been developed mostly for understanding genes and genomes of cellular organisms. The KO (KEGG Orthology) dataset, which is a collection of functional orthologs, plays the role of linking genes in the genome to pathways and other molecular networks, enabling KEGG mapping to uncover hidden features in the genome. Although viruses were part of KEGG for some time, they were not fully integrated in the KEGG analysis tools, because the KO assignment rate is very low for virus genes. To supplement KOs a new dataset named virus ortholog clusters (VOCs) is computationally generated, covering 90% of viral proteins in KEGG. VOCs can be used, in place of KOs, for taxonomy mapping to uncover relationships of sequence similarity groups and taxonomic groups and for identifying conserved gene orders in virus genomes. Furthermore, selected VOCs are used to define tentative KOs for characterizing protein functions. Here an overview of KEGG tools is presented focusing on these extensions for viral protein analysis.


Cataloging natural sialic acids and other nonulosonic acids (NulOs), and their representation using the Symbol Nomenclature for Glycans

January 2023 · 116 Reads · 12 Citations

Glycobiology

Nonulosonic acids or non-2-ulosonic acids (NulOs) are an ancient family of 2-ketoaldonic acids (α-ketoaldonic acids) with a 9-carbon backbone. In nature, these monosaccharides occur either in a 3-deoxy form (referred to as “sialic acids”) or in a 3,9-dideoxy “sialic-acid-like” form. The former sialic acids are most common in the deuterostome lineage, including vertebrates, and mimicked by some of their pathogens. The latter sialic-acid-like molecules are found in bacteria and archaea. NulOs are often prominently positioned at the outermost tips of cell surface glycans, and have many key roles in evolution, biology and disease. The diversity of stereochemistry and structural modifications among the NulOs contributes to more than 90 sialic acid forms and 50 sialic-acid-like variants described thus far in nature. This paper reports the curation of these diverse naturally occurring NulOs at the NCBI sialic acid page (https://www.ncbi.nlm.nih.gov/glycans/sialic.html) as part of the NCBI-Glycans initiative. This includes external links to relevant Carbohydrate Structure Databases. As the amino and hydroxyl groups of these monosaccharides are extensively derivatized by various substituents in nature, the Symbol Nomenclature For Glycans (SNFG) rules have been expanded to represent this natural diversity. These developments help illustrate the natural diversity of sialic acids and related NulOs, and enable their systematic representation in publications and online resources.


Figure 2. Simple URLs, called KEGG weblinks, to retrieve and analyze KEGG objects (database entries). The entry operation retrieves any object specified by the KEGG identifier in the flat-file format. The pathway, brite, module or network operation retrieves a molecular network object specified by the KEGG identifier with optional highlighting of network nodes given in the argument. The genome operation retrieves the genome map of a given organism with optional specification of a gene location. The pathway, brite and genome operations actually launch specialized tools enabling further analysis. In the URL form, www.kegg.jp may be replaced by www.genome.jp for accessing the GenomeNet mirror site.
KEGG for taxonomy-based analysis of pathways and genomes

October 2022 · 370 Reads · 3,225 Citations

Nucleic Acids Research

KEGG (https://www.kegg.jp) is a manually curated database resource integrating various biological objects categorized into systems, genomic, chemical and health information. Each object (database entry) is identified by the KEGG identifier (kid), which generally takes the form of a prefix followed by a five-digit number, and can be retrieved by appending /entry/kid in the URL. The KEGG pathway map viewer, the Brite hierarchy viewer and the newly released KEGG genome browser can be launched by appending /pathway/kid, /brite/kid and /genome/kid, respectively, in the URL. Together with an improved annotation procedure for KO (KEGG Orthology) assignment, an increasing number of eukaryotic genomes have been included in KEGG for better representation of organisms in the taxonomic tree. Multiple taxonomy files are generated for classification of KEGG organisms and viruses, and the Brite hierarchy viewer is used for taxonomy mapping, a variant of Brite mapping in the new KEGG Mapper suite. The taxonomy mapping enables analysis of, for example, how functional links of genes in the pathway and physical links of genes on the chromosome are conserved among organism groups.
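The URL scheme described above is simple enough to sketch in a few lines. The helper below is illustrative only: the function name and the validation step are assumptions, while the operation names and the base URLs come from the abstract and the Figure 2 caption.

```python
BASE_URL = "https://www.kegg.jp"  # the caption notes www.genome.jp may be used for the GenomeNet mirror

# Operations named in the abstract and in the Figure 2 caption.
OPERATIONS = {"entry", "pathway", "brite", "module", "network", "genome"}


def kegg_weblink(operation: str, kid: str, base_url: str = BASE_URL) -> str:
    """Build a KEGG weblink of the form <base_url>/<operation>/<kid>.

    This helper is a hypothetical convenience, not part of any KEGG library.
    """
    if operation not in OPERATIONS:
        raise ValueError(f"unsupported operation: {operation!r}")
    return f"{base_url}/{operation}/{kid}"


print(kegg_weblink("entry", "vg:1486428"))
print(kegg_weblink("pathway", "map01310"))
print(kegg_weblink("genome", "10359", base_url="https://www.genome.jp"))
```

Optional arguments for highlighting network nodes or specifying a gene location are mentioned in the caption but their exact form is not given here, so the sketch leaves them out.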


In the new versions of (a) the KEGG pathway map viewer and (b) the BRITE hierarchy viewer, the side panel is available for various client‐side operations, including KEGG mapping operations. The User data section of the pathway map viewer corresponds to the Color tool of KEGG Mapper applied to a single pathway map. The ID search and Join sections of the BRITE hierarchy viewer correspond to the Search and Join tools applied to a single hierarchy file. The plus sign in each section is used to open a window for user data input.
An example of using the Join tool of KEGG Mapper, where the dataset of prodrug to active substance relations is joined with br‐prefixed BRITE hierarchy files. One of the matching BRITE files, br08303 for the ATC drug classification, is shown here. Since the KEGG Mapper result appears in the Join list, it may be examined by combining with other predefined datasets.
The taxonomy mapping tool shows the distribution of KEGG organisms for a given set of KOs (K numbers) and modules (M numbers) as well as for user‐defined data. The tool works with a specially organized BRITE hierarchy file, br08611 for KEGG organisms in taxonomic groups. Here the mapping result is shown with (a) zooming in to the species level or (b) zooming out to the genus level, revealing what fraction of organisms are matched (colored in red) under the changing resolution of organism groups.
KEGG mapping tools for uncovering hidden features in biological data

August 2021 · 140 Reads · 552 Citations

In contrast to artificial intelligence and machine learning approaches, KEGG (https://www.kegg.jp) has relied on human intelligence to develop “models” of biological systems, especially in the form of KEGG pathway maps that are manually created by capturing knowledge from published literature. The KEGG models can then be used in biological big data analysis, for example, for uncovering systemic functions of an organism hidden in its genome sequence through the simple procedure of KEGG mapping. Here we present an updated version of KEGG Mapper, a suite of KEGG mapping tools reported previously (Kanehisa and Sato, Protein Sci 2020; 29:28–35), together with the new versions of the KEGG pathway map viewer and the BRITE hierarchy viewer. Significant enhancements have been made for BRITE mapping, where the mapping result can be examined by manipulation of hierarchical trees, such as pruning and zooming. The tree manipulation feature has also been implemented in the taxonomy mapping tool for linking KO (KEGG Orthology) groups and modules to phenotypes.
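The zooming behavior of taxonomy mapping, where the fraction of matched organisms is reported at a chosen taxonomic resolution, can be sketched as a simple aggregation. The code below is not the KEGG implementation; the function name, the data layout and the organism codes are hypothetical and serve only to show the computation.

```python
from collections import defaultdict
from typing import Dict, Set


def matched_fraction(
    lineages: Dict[str, Dict[str, str]],
    matched: Set[str],
    level: str,
) -> Dict[str, float]:
    """Fraction of organisms matched within each group at a taxonomic level.

    lineages maps an organism code to its lineage, e.g. {"genus": ..., "species": ...}.
    matched is the set of organism codes that contain the queried KOs or modules.
    """
    totals: Dict[str, int] = defaultdict(int)
    hits: Dict[str, int] = defaultdict(int)
    for org, lineage in lineages.items():
        group = lineage[level]
        totals[group] += 1
        if org in matched:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Hypothetical organisms and lineages (not real KEGG organism codes).
lineages = {
    "orgA": {"genus": "GenusX", "species": "GenusX alpha"},
    "orgB": {"genus": "GenusX", "species": "GenusX alpha"},
    "orgC": {"genus": "GenusX", "species": "GenusX beta"},
    "orgD": {"genus": "GenusY", "species": "GenusY gamma"},
}
matched = {"orgA", "orgB"}

print(matched_fraction(lineages, matched, "species"))  # zoomed in
print(matched_fraction(lineages, matched, "genus"))    # zoomed out
```

Changing the level argument corresponds to zooming in or out: the same set of matched organisms yields different fractions as groups are merged or split.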


Figure 1. KEGG consists of eighteen original databases in four categories. The health information category, called KEGG MEDICUS, is supplemented with two outside databases of drug labels: Japanese drug labels obtained from JAPIC (http://www.japic.or.jp) and FDA drug labels linked to the DailyMed database (https://dailymed.nlm.nih.gov). The identifier of each entry in the KEGG database generally takes the form of a prefix followed by a five-digit number and is called, for example, map number, M number and K number for the PATHWAY, MODULE and KO databases, respectively.
Figure 2. The new pathway map viewer with a side panel for client-side operations. Here the human pathway map hsa00600 for sphingolipid metabolism is shown with the module M00094 for ceramide biosynthesis in red and the network N00642 for saposin (PSAP) stimulation of GBA (3.2.1.45) and GALC (3.2.1.46) in purple. The saposin node and regulatory links are not present in the original map and are displayed only when this network is selected.
Figure 4. The network variation map nt06131 for Apoptosis (viruses and bacteria) showing aligned sets of reference networks in green and variant networks with viral or bacterial proteins in purple. Variant networks are linked to disease types, mostly viral infections but including five bacterial infections.
Figure 5. Selected networks in the variation map nt06131 are shown on the pathway maps: (A) inhibition of apoptosis by KSHV in the pathway map of hsa05167 for Kaposi sarcoma-associated herpesvirus infection and (B) activation of apoptosis by HIV in the pathway map of hsa05170 for Human immunodeficiency virus 1 infection.
Figure 6. The correspondence between the seven groups of Baltimore classification (colored) and the hierarchy of ICTV virus classification consisting of realm (-viria), kingdom (-virae), phylum (-viricota), class (-viricetes) and family (-viridae).
KEGG: Integrating viruses and cellular organisms

October 2020 · 684 Reads · 2,811 Citations

Nucleic Acids Research

KEGG (https://www.kegg.jp/) is a manually curated resource integrating eighteen databases categorized into systems, genomic, chemical and health information. It also provides KEGG mapping tools, which enable understanding of cellular and organism-level functions from genome sequences and other molecular datasets. KEGG mapping is a predictive method of reconstructing molecular network systems from molecular building blocks based on the concept of functional orthologs. Since the introduction of the KEGG NETWORK database, various diseases have been associated with network variants, which are perturbed molecular networks caused by human gene variants, viruses, other pathogens and environmental factors. The network variation maps are created as aligned sets of related networks showing, for example, how different viruses inhibit or activate specific cellular signaling pathways. The KEGG pathway maps are now integrated with network variation maps in the NETWORK database, as well as with conserved functional units of KEGG modules and reaction modules in the MODULE database. The KO database for functional orthologs continues to be improved and virus KOs are being expanded for better understanding of virus-cell interactions and for enabling prediction of viral perturbations.


Table: Comparison of the performance of KofamScan with other tools (KofamScan, BlastKOALA, GhostKOALA, KAAS) on the entire database (40 genomes).
KofamKOALA: KEGG ortholog assignment based on profile HMM and adaptive score threshold

November 2019 · 1,903 Reads · 1,351 Citations

Bioinformatics

KofamKOALA is a web server to assign KEGG Orthologs (KOs) to protein sequences by homology search against a database of profile hidden Markov models (KOfam) with pre-computed adaptive score thresholds. KofamKOALA is faster than existing KO assignment tools, with accuracy comparable to the best-performing tools. Function annotation by KofamKOALA helps link genes to KEGG resources such as the KEGG pathway maps and facilitates molecular network reconstruction. Availability: KofamKOALA, KofamScan, and KOfam are freely available from GenomeNet (https://www.genome.jp/tools/kofamkoala/). Supplementary information: Supplementary data are available at Bioinformatics online.
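The role of a per-KO score threshold can be shown with a small filtering sketch. This is not KofamScan code and does not reproduce its output format; the function name, the rule of keeping the top-scoring hit at or above the threshold, and all scores and thresholds below are assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple


def assign_kos(
    hits: Iterable[Tuple[str, str, float]],
    thresholds: Dict[str, float],
) -> Dict[str, str]:
    """Assign at most one KO per gene from profile HMM hits.

    hits yields (gene_id, ko, score) tuples; thresholds maps each KO to its
    own pre-computed score cutoff. A hit is accepted only if its score
    reaches the cutoff of that KO (assumed rule), and the highest-scoring
    accepted hit wins for each gene.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for gene, ko, score in hits:
        cutoff = thresholds.get(ko)
        if cutoff is None or score < cutoff:
            continue  # no threshold known, or hit too weak for this KO
        if gene not in best or score > best[gene][1]:
            best[gene] = (ko, score)
    return {gene: ko for gene, (ko, _) in best.items()}


# Hypothetical thresholds and hits.
thresholds = {"K00001": 300.0, "K00002": 120.0}
hits = [
    ("gene1", "K00001", 350.0),  # passes the K00001 threshold
    ("gene1", "K00002", 130.0),  # passes too, but scores lower
    ("gene2", "K00001", 250.0),  # below the K00001 threshold
    ("gene3", "K00002", 125.0),  # passes the K00002 threshold
]
print(assign_kos(hits, thresholds))
```

The point of a per-KO cutoff, as opposed to one global cutoff, is visible in the example: a score of 250 is rejected for one KO while a score of 125 is accepted for another.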



The KEGG metabolic pathway map for arginine biosynthesis (map00220), where rectangles and circles represent enzymes and chemical compounds (substrates and products), respectively. Enzymes are identified by functional orthologs called KOs, although EC numbers and gene names are displayed. The KEGG pathway module is a functional unit in the metabolic pathway defined by a set of KOs. Two modules M00028 (upper) and M00763 (lower) are shown by pink‐colored rectangles
The global map of metabolic pathways (map01100), which contains 3,000 chemical compounds and 4,000 enzyme KOs. The coloring distinguishes categories of metabolic pathways as defined by KEGG. The arginine biosynthesis pathway of Figure 1 is highlighted with thick red lines
Correspondence of pathway modules (M numbers in green) defined by KOs and reaction modules (RM numbers in blue) defined by RCs. Reaction modules are more general functional units, each of which may correspond to multiple pathway modules, such as RM001 for 2‐oxocarboxylic chain extension and RM002 and RM033 for conversion to basic and branched‐chain amino acids, respectively
Toward understanding the origin and evolution of cellular organisms

September 2019 · 209 Reads · 3,203 Citations

In this era of high‐throughput biology, bioinformatics has become a major discipline for making sense out of large‐scale datasets. Bioinformatics is usually considered as a practical field developing databases and software tools for supporting other fields, rather than a fundamental scientific discipline for uncovering principles of biology. The KEGG resource that we have been developing is a reference knowledge base for biological interpretation of genome sequences and other high‐throughput data. It is now one of the most utilized biological databases because of its practical values. For me personally, KEGG is a step toward understanding the origin and evolution of cellular organisms.


Knowledge representation of molecular interaction, reaction, and relation networks in KEGG. (a) KEGG pathway map for human MAPK signaling pathway (hsa04010). (b) BRITE hierarchy for transporter classification (ko02000). (c) KEGG module for lysine biosynthesis, 2‐oxoglutarate => 2‐oxoadipate (M00433), corresponding to the highlighted part in lysine biosynthesis pathway (map00300). The logical expression of K numbers is also represented by the graphical diagram.
The result of Reconstruct Pathway from the combined gene set of Homo sapiens (T01001) and a gut metagenome sample (T30003). (a) the result page of KEGG pathway mapping, which shows that 421 matching pathway maps are found. (b) Reconstructed global map of metabolic pathways (map01100), where human specific pathways are shown in green, microbiome specific pathways in red, and shared pathways in blue. (c) the result in (a) shows that 166 complete modules are found. An example is coenzyme A biosynthesis (M00120), in which different gene sets, green for human and pink for microbiome, make this module complete
The result of Reconstruct Pathway from the entire set of disease genes represented by KOs in the KEGG DISEASE database. (a) the result page of mapping against KEGG modules, which shows that 40 complete modules are reconstructed. One of them is M00868 for heme biosynthesis, where all genes are associated with H01763 for porphyria. (b) the pathway module M00868 is shown in pink in the pathway map hsa00860 for porphyrin and chlorophyll metabolism. Eight genes in this module correspond to eight disease‐associated genes in H01763
The result of Search Pathway from the human protein set of drug targets in the KEGG DRUG database. (a) the result page of mapping against BRITE hierarchies for the full set of drug targets, which shows that 42 matching hierarchy files are found with “enzymes” at the top, excluding the KEGG Orthology (KO) file. (b) the result page of mapping against BRITE hierarchies for the targets of monoclonal antibodies, which shows that 24 matching hierarchy files are found with “CD molecules” at the top
KEGG Mapper for inferring cellular functions from protein sequences

August 2019 · 409 Reads · 1,001 Citations

KEGG is a reference knowledge base for biological interpretation of large‐scale molecular datasets, such as genome and metagenome sequences. It accumulates experimental knowledge about high‐level functions of the cell and the organism represented in terms of KEGG molecular networks, including KEGG pathway maps, BRITE hierarchies, and KEGG modules. By the process called KEGG mapping, a set of protein coding genes in the genome, for example, can be converted to KEGG molecular networks enabling interpretation of cellular functions and other high‐level features. Here we report a new version of KEGG Mapper, a suite of KEGG mapping tools available at the KEGG website (https://www.kegg.jp/ or https://www.genome.jp/kegg/), together with the KOALA family tools for automatic assignment of KO (KEGG Orthology) identifiers used in the mapping.
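Checking whether a module is complete for a given set of KOs can be sketched under a simplifying assumption: the module is written as an ordered list of steps, each step being a set of alternative K numbers. Real KEGG module definitions are richer logical expressions, so this is a reduced model rather than the KEGG Mapper implementation, and the function names, module and K numbers below are hypothetical.

```python
from typing import List, Set


def missing_steps(steps: List[Set[str]], kos: Set[str]) -> List[int]:
    """Return indices of steps for which none of the alternative KOs is present."""
    return [i for i, alternatives in enumerate(steps) if not (alternatives & kos)]


def module_complete(steps: List[Set[str]], kos: Set[str]) -> bool:
    """A module is complete when every step has at least one of its KOs."""
    return not missing_steps(steps, kos)


# Hypothetical three-step module; step 2 has two alternative KOs.
module = [{"K10001"}, {"K10002", "K10003"}, {"K10004"}]

# Hypothetical KO sets for a host genome and a metagenome sample.
host_kos = {"K10001", "K10002"}
microbiome_kos = {"K10003", "K10004"}

print(module_complete(module, host_kos))                   # host alone
print(module_complete(module, microbiome_kos))             # microbiome alone
print(module_complete(module, host_kos | microbiome_kos))  # combined gene set
print(missing_steps(module, host_kos))
```

The combined case mirrors the situation described for the human and gut metagenome example, where neither gene set completes a module alone but their union does.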


Citations (74)


... GO and KEGG pathway enrichment analyses of differentially expressed genes (DEGs) were performed using Phyper, with a threshold is set to Qvalue ≤ 0.05. The pathway analysis was performed using the KEGG database [22][23][24] . ...

Reference:

Prospective multicenter study identifying prognostic biomarkers and microbial profiles in severe CAP using BALF, blood mNGS, and PBMC transcriptomics
KEGG: biological systems database as a model of the real world

Nucleic Acids Research

... Plots were colored based on the adjusted p-value; if the threshold exceeded 0.05, the original p-value was used. All visualizations were saved in PDF format for further analysis and presentation [22,23]. ...

KEGG tools for classification and analysis of viral proteins

... Cell surfaces are covered by a glycocalyx [1] composed of complex carbohydrate polymers (also known as glycans) anchored to cell membranes via glycoprotein or glycolipid linkages [2][3][4][5]. Biosynthetically derived from neuraminic acid [6,7], sialic acids represent a family of nine-carbon keto-aldonic acids (nonulosonic acids (NulOs)) [8]. These normally occupy the terminal positions of cell surface glycosaminoglycan moieties [9,10]. ...

Cataloging natural sialic acids and other nonulosonic acids (NulOs), and their representation using the Symbol Nomenclature for Glycans
  • Citing Article
  • January 2023

Glycobiology

... We used the FindAllMarkers function to calculate the differentially expressed genes (DEGs) of each cell clusters among various samples by using the Wilcoxon-Mann-Whitney tests 26 . Then, the gseGO and gseKEGG function in the package of "ClusterProlifer" was used to visualize the enriched pathways of these genes in biological process (BP) and Kyoto Encyclopedia of Genes and Genomes (KEGG) [26][27][28] . ...

KEGG for taxonomy-based analysis of pathways and genomes

Nucleic Acids Research

... The EG consists of a variety of glycoconjugates and glycans, including N-glycans and O-glycans on glycoproteins, glycosaminoglycans on proteoglycans, hyaluronan, glycosphingolipids, and GPI-anchored proteins. The various monosaccharides are defined using the Symbol Nomenclature for Glycans (163) (8,(16)(17)(18). For example, Dolichos biflorus agglutinin, a plant lectin with a preference for α-linked N-acetylgalactosamine, selectively binds to the EG in murine brain and lung capillaries, while Concanavalin A, a lectin that binds to the mannose core of N-glycans, targets the glycocalyx of brain capillaries, particularly near cell-cell junctions. ...

Updates to the Symbol Nomenclature for Glycans guidelines
  • Citing Article
  • September 2019

... This included both up and down-regulated genes. Protein sequences were then uploaded to BlastKOALA and Kyoto Encyclopedia of Genes and Genomes (KEGG) Orthology (KO) identifiers (K numbers) were assigned to genes encoding those proteins (Kanehisa et al., 2016, 2021). K numbers were then used with clusterProfiler v4.2.1 (Wu et al., 2021) in R to conduct KEGG pathway over-representation analysis and KEGG module over-representation analysis. ...

KEGG mapping tools for uncovering hidden features in biological data

... Gene function assignment was extracted from the Mitocarta 3.0 [39] and the STRING databases [40]. Pathway analysis was performed using the 'clusterProfiler' package with the ontology database of the Kyoto Encyclopaedia of Genes and Genomes (KEGG) [41]. This method identifies relevant biological pathways by first mapping input gene sets to the pathways in the KEGG ontology database and then performing enrichment analysis to assess whether specific pathways are significantly overrepresented. ...

KEGG: Integrating viruses and cellular organisms

Nucleic Acids Research

... Tisiphia (Supplementary Table S3), using MAFFT, TrimAl, and IQ-TREE. For metabolic analysis, gene functions were annotated with "anvi-run-kegg-kofams" using the KEGG Orthology database (last updated: 24 November 2020), and the completeness of metabolic pathways was estimated with "anvi-estimate-metabolism" in Anvi'o v7.1 (Aramaki et al., 2020;Eren et al., 2021;Veseli et al., 2023). ...

KofamKOALA: KEGG ortholog assignment based on profile HMM and adaptive score threshold

Bioinformatics

... GO and KEGG pathway enrichment analyses of differentially expressed genes (DEGs) were performed using Phyper, with a threshold is set to Qvalue ≤ 0.05. The pathway analysis was performed using the KEGG database [22][23][24] . ...

Toward understanding the origin and evolution of cellular organisms

... Functional composition was profiled using HUMAnN3 (v3.6) [34] based on the UniRef90 database [35] which was then regrouped into the Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology (KO) profiles [36] using the "humann_regroup_table" script. The annotation of KOs to the KEGG pathway was conducted using the KEGG Mapper [37] and R package Pathview (v1.38.0) [38]. KEGG enrichment analysis for differentially abundant KOs was performed using the R package MicrobiomeProfiler (v1.4.0) [39]. ...

KEGG Mapper for inferring cellular functions from protein sequences