
TrypanoCyc: A community-led biochemical pathways database for Trypanosoma brucei

October 2014
Nucleic Acids Research 43(D1)

DOI:10.1093/nar/gku944

Source
PubMed

License
CC BY-NC 4.0

Authors:

Sanu Shameer

Indian Institute Of Science Education and Research, Thiruvananthapuram

Florence Vinson

L’Institut national de la recherche agronomique (Toulouse)

Ludovic Cottret

French National Institute for Agriculture, Food, and Environment (INRAE)




Sanu Shameer1, Flora J. Logan-Klumpler2, Florence Vinson1, Ludovic Cottret3, Benjamin Merlet1, Fiona Achcar4, Michael Boshart5, Matthew Berriman2, Rainer Breitling6, Frédéric Bringaud7, Peter Bütikofer8, Amy M. Cattanach4, Bridget Bannerman-Chukualim2, Darren J. Creek9, Kathryn Crouch4, Harry P. de Koning4, Hubert Denise10, Charles Ebikeme11, Alan H. Fairlamb12, Michael A. J. Ferguson12, Michael L. Ginger13, Christiane Hertz-Fowler14, Eduard J. Kerkhoven15, Pascal Mäser16, Paul A. M. Michels17, Archana Nayak4, David W. Nes18, Derek P. Nolan19, Christian Olsen20, Fatima Silva-Franco14, Terry K. Smith21, Martin C. Taylor22, Aloysius G. M. Tielens23,24, Michael D. Urbaniak13, Jaap J. van Hellemond24, Isabel M. Vincent4, Shane R. Wilkinson25, Susan Wyllie12, Fred R. Opperdoes26, Michael P. Barrett4,* and Fabien Jourdan1,*

1Institut National de la Recherche Agronomique (INRA), UMR1331, TOXALIM (Research Centre in Food Toxicology), Université de Toulouse, Toulouse, France, 2The Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK, 3Institut National de la Recherche Agronomique (INRA), UMR441, Laboratoire des Interactions Plantes-Microorganismes (LIPM), Auzeville, France, 4University of Glasgow, Glasgow, Scotland, G12 8QQ, UK, 5Ludwig-Maximilians-Universität München, Biocenter, 82152-Martinsried, Germany, 6Manchester Institute of Biotechnology, Faculty of Life Sciences, University of Manchester, Manchester, UK, 7CNRS, Bordeaux, 33076, France, 8University of Bern, Bern, CH-3012, Switzerland, 9Monash Institute of Pharmaceutical Sciences, Monash University, Parkville 3052, Australia, 10European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, UK, 11ISSC, UNESCO, F-75732 CEDEX 15, Paris, France, 12University of Dundee, Dundee, Scotland, DD1 4HN, UK, 13Division of Biomedical and Life Sciences, Lancaster University, Bailrigg, Lancaster, LA1 4YG, UK, 14University of Liverpool, Liverpool, Merseyside L69 3BX, UK, 15Chalmers University of Technology, Kemivägen 10, 412 96, Göteborg, Sweden, 16Swiss Tropical and Public Health Institute, Socinstr. 57, Basel 4051, Switzerland, 17University of Edinburgh, Mayfield Road, Edinburgh EH9 3JU, UK, 18Texas Tech University, Lubbock, TX, USA, 19Trinity College Dublin, Dublin 2, Ireland, 20Biomatters Inc. 60 Park Place, Suite 2100, Newark, NJ, USA, 21University of St Andrews, St Andrews, Scotland, KY16 9ST, UK, 22LSHTM, London, WC1E 7HT, UK, 23Utrecht University, Utrecht, 3508 TD, The Netherlands, 24Erasmus University Medical Center, Rotterdam, 3015 CE, The Netherlands, 25Queen Mary University of London, London E1 4NS, UK and 26University of Louvain, Brussels, B-1200, Belgium

Received August 13, 2014; Revised September 26, 2014; Accepted September 26, 2014

ABSTRACT

The metabolic network of a cell represents the catabolic and anabolic reactions that interconvert small molecules (metabolites) through the activity of enzymes, transporters and non-catalyzed chemical reactions. Our understanding of individual metabolic networks is increasing as we learn more about the enzymes that are active in particular cells under particular conditions and as technologies advance to allow detailed measurements of the cellular metabolome. Metabolic network databases are of increasing importance in allowing us to contextualise data sets emerging from transcriptomic, proteomic and metabolomic experiments. Here we present a dynamic database, TrypanoCyc (http://www.metexplore.fr/trypanocyc/), which describes the generic and condition-specific metabolic network of Trypanosoma brucei, a parasitic protozoan responsible for human and animal African trypanosomiasis. In addition to enabling navigation through the BioCyc-based TrypanoCyc interface, we have also implemented a network-based representation of the information through MetExplore, yielding a novel environment in which to visualise the metabolism of this important parasite.

*To whom correspondence should be addressed. Tel: +33 561 28 57 15; Fax: +33 561 28 52 44; Email: Fabien.Jourdan@toulouse.inra.fr

Correspondence may also be addressed to Michael P. Barrett. Tel: +44 141 330 6904; Fax: +44 141 330 4077; Email: michael.barrett@glasgow.ac.uk

© The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

Nucleic Acids Research Advance Access published October 9, 2014

INTRODUCTION

Trypanosoma brucei is the causative agent of African trypanosomiasis (commonly known as sleeping sickness in humans and Nagana in animals). The disease is fatal if untreated in humans (1) and the economic impact of trypanosomes on agriculture in Africa is immense. The drugs available for the trypanosomiases are inadequate for a number of reasons and better therapeutic options are required (2). Many drugs work through interfering with enzymes involved in cellular metabolism. The only anti-trypanosomal drug whose target is known is eflornithine, an inhibitor of ornithine decarboxylase (3), a key enzyme in the polyamine biosynthetic pathway. A comprehensive understanding of parasite metabolism therefore contributes to current efforts in drug discovery and understanding drug resistance (4).

Global untargeted molecular profiling data sets (e.g. transcriptomics, proteomics and metabolomics data) are now being generated for trypanosomes and the effects of life cycle, environmental perturbation, specific genetic manipulation and drug action are being dissected in a systematic manner (5). Interpreting and integrating these data to allow biological inference and hypothesis generation is a major challenge. Metabolic network-based methods offer a means to contextualise and integrate data to help inference of biological function (6). Reliable, comprehensive databases collating information on metabolic networks and pathways are therefore crucial to optimize understanding derived from postgenomic data sets. Metabolic databases such as LeishCyc (7) (for Leishmania), and the Library of Apicomplexan Metabolic Pathways (8) (for apicomplexan parasites) are among the few examples where this information is available in parasitology.

Creation of a metabolic network database is achieved by gathering information on all of the metabolic transformations an organism can perform (9). A first outline of this information is generally retrieved from genomic orthology. Genes coding for enzymes are identified through sequence similarity searches and then, using enzyme activity information, metabolic reactions catalyzed by these enzymes are added to the network database. Several automatic and semi-automatic tools are available to perform these genome-based metabolic network reconstructions (10–13).

In spite of their undoubted utility, genome-based reconstructions have limitations since they are based primarily on sequence homology comparisons between the organism of interest and databases encompassing information from a multitude of organisms (14). Incorrect annotations readily propagate across databases (15). Moreover, evolution works through modification of function following alteration of genes encoding proteins. For instance, trypanosomes use N1,N8-bis(glutathionyl)spermidine (trypanothione) (16) as a key cellular redox-associated metabolite. Trypanothione is retained in its reduced form by the enzyme trypanothione reductase (EC 1.8.1.12). This enzyme is evolutionarily derived from glutathione reductase (EC 1.8.1.7), with which it shows great homology. In the absence of accompanying biochemical evidence, genome annotations would simply predict trypanosomes as possessing a glutathione reductase, and metabolic reconstructions would assume trypanosomes employ canonical glutathione-based redox balancing. Cases like this highlight the necessity of refining genome-based metabolic reconstructions by incorporating advanced biochemical knowledge (15).
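The trypanothione reductase case can be illustrated with a minimal Python sketch of a homology-based assignment that is later overridden by curated biochemical knowledge. The gene identifier, the identity value and the function name `assign_function` are hypothetical placeholders; only the two EC numbers and enzyme names come from the text, and this is not the reconstruction procedure used by any particular tool.

```python
# Toy illustration: homology-based EC assignment followed by manual curation.
# "gene_A" and the identity value are hypothetical; EC numbers are from the text.

# Best database hit per gene from a sequence-similarity search (made-up values).
homology_hits = {
    "gene_A": {"ec": "1.8.1.7", "name": "glutathione reductase", "identity": 0.41},
}

# Expert overrides backed by biochemical evidence.
curated_overrides = {
    "gene_A": {"ec": "1.8.1.12", "name": "trypanothione reductase"},
}


def assign_function(gene_id):
    """Return the curated function if one exists, else the homology-based guess."""
    if gene_id in curated_overrides:
        return {**curated_overrides[gene_id], "source": "curated"}
    hit = homology_hits[gene_id]
    return {"ec": hit["ec"], "name": hit["name"], "source": "homology"}


automatic = homology_hits["gene_A"]["ec"]
final = assign_function("gene_A")
print(automatic, "->", final["ec"], f"({final['source']})")
```

Without the override table, the gene would keep the glutathione reductase label inherited from its closest database hit, which is the error the paragraph above describes.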

Moreover, simple genome reconstructions do not take into account the sub-cellular localization of the enzymes (although various methods are now being developed to tackle this issue as canonical signals determining cellular localization come to light (17)). Finally, the genome provides a view of the total metabolic capability of an organism, regardless of environmental and genetic conditions. In trypanosomes, however, different metabolic strategies are used at different points in the life cycle. In the tsetse fly, the trypanosome’s main carbon source is proline (18) while in the human host it is glucose (19). Some reactions are active in one condition but not in another. This information is particularly important when looking for potential drug targets.

Web servers such as KEGG (20) and BioCyc (14) represent metabolism as a set of pathways, reflecting classical textbook views of biochemistry. However, the pathway approach fragments metabolism in ways which constrain our ability to decipher the broader impact on the metabolic network; hence, methods that also enable connected network views of metabolism are desirable. We have therefore combined building a pathway-based TrypanoCyc database with its integration into the MetExplore web server (21), to offer both pathway and network-based inference and visualization.

A HIGHLY CURATED DATABASE OF T. BRUCEI METABOLISM

The T. brucei TREU 927 genome is 26 Mb in size, with a karyotype of 11 megabase chromosomes (22) and containing a predicted 9068 protein-coding genes. In a collaborative project between the International Trypanotolerance Centre in The Gambia and the Sanger Institute, the genome sequence was processed using the PathoLogic metabolic network reconstruction tool of Pathway Tools (23), creating a Pathway/Genome Database (PGDB) where gaps (called ‘pathway holes’) in the predicted metabolic pathways were filled by hypothetical reactions, even without an obvious gene association. The result of this first automatic reconstruction was the starting point of the current TrypanoCyc database.

An international consortium of investigators, expert in various aspects of trypanosome metabolism, was assembled to produce a highly annotated TrypanoCyc database. As recommended by Thiele & Palsson (24) we started the TrypanoCyc initiative in 2012 with a two-day ‘jamboree’. Each expert was offered a specific set of pathway(s) in his/her area of expertise to curate. A dedicated web interface, called TrypAnnot (a password protected part of the website available to annotators, not described here) stores submitted annotations in a curation database, making it possible to track all annotations, which are automatically taken from the database and added to the web page of the corresponding reaction. The TrypanoCyc project has so far had 1368 editing events, among which are 653 annotations made on 464 reactions. Furthermore, since the first automated reconstruction in 2008, 17 pathways, 35 enzymatic reactions, 10 transport reactions, 41 enzymes, 2 protein complexes and 104 metabolites have been added to TrypanoCyc. Extended summaries for some pathways have also been made available in the database.

Figure 1. TrypanoCyc page for the 6-phosphogluconate dehydrogenase (1.1.1.44) reaction. (a) Reaction name and GeneDB link (specific to TrypanoCyc), (b) Detailed description of the reaction, (c) Localizations of the reactions as suggested by annotators, (d) Confidence score for the reaction (specific to TrypanoCyc), (e) Annotation tables displaying content of the TrypAnnot database (specific to TrypanoCyc).

T. brucei cells contain multiple membrane-bounded organelles, including the mitochondrion and an unusual peroxisome-related organelle, the glycosome (25,26), in which the first seven steps of glycolysis occur, as well as a series of other pathways (19). Annotators, therefore, specify the sub-cellular localization of reactions, if known, in the annotation interface. Life cycle stage specificity for each reaction is also important, since trypanosomes use different metabolic pathways in different environments; hence annotators can specify one or more developmental stages in which reactions occur. Note that this information is not available in the reconstruction provided by KEGG (see Table 1 for comparison). The level of knowledge on each reaction varies from experimentally verified to indirect evidence of activity regardless of manual curation. To reflect the level of confidence of the annotation we have used the scoring system proposed by Thiele & Palsson (9) (see Table 2). For instance, of the 464 annotated reactions, 84 were annotated based on direct evidence from protein purification, biochemical assays or comparative gene expression studies and hence can be considered with the highest confidence. During curation we found numerous falsely predicted reactions and pathways; 60 pathways, 14 enzymatic reactions, 20 enzymes and 56 metabolites have been removed from the original reconstruction. Nevertheless we retained some reactions if they are known to occur in related trypanosomatids, or else when they have been proposed to exist, erroneously, in the literature. Although such reactions are kept, they are not linked to any pathway and they are assigned a negative confidence score to highlight the fact that according to our present knowledge they are not actually present. For example, a methionine cycle that regenerates methionine from methylthioadenosine resulting from polyamine biosynthesis has been proposed (27). However, metabolic labelling experiments have subsequently indicated that the pathway is not active in trypanosomes, at least in the conditions used (28). The reactions EC 4.2.1.109 (methylthioribulose 1-phosphate dehydratase) and EC 3.1.3.77 (5-(methylthio)-2,3-dioxopentyl-phosphate phosphohydrolase), required to complete the pathway, are included in the database, but assigned negative scores to highlight that they are undetectable in spite of previous predictions in the literature (27). We consider it useful to keep such entries so that users of the database can find an explicit reference to reactions they may look up after reading the literature that proposes them.

Figure 2. Proteomics data loaded in TrypanoCyc using the cellular overview tool. (a) The diagram shows all the metabolic pathways in gray boxes. Colored squares correspond to reactions with associated proteomics values. The color scale is displayed in the ‘Omics Viewer Control Panel’; it can be tuned using dedicated parameters. The ‘REACTION’ dialog appears when clicking on a reaction. (b) It is then possible to get back to the corresponding reaction page and read the annotators’ comments.
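The annotation attributes described above (localization, life cycle stage and confidence score) can be pictured as a small record type. The sketch below is only an illustration in Python: the class `ReactionAnnotation`, the helper `active_reactions`, the compartment assignment and the exact score values are assumptions made for the example, not the TrypAnnot schema. The EC numbers and the stage information for EC 1.10.2.2 are taken from this article.

```python
from dataclasses import dataclass, field


@dataclass
class ReactionAnnotation:
    ec: str
    name: str
    confidence: int  # -2..4, following the confidence score scale
    compartments: list = field(default_factory=list)
    stages: list = field(default_factory=list)


# Scores and compartments below are illustrative assumptions, not database values.
annotations = [
    ReactionAnnotation("1.10.2.2", "ubiquinone-cytochrome c reductase", 4,
                       ["mitochondrion"], ["PCF"]),
    ReactionAnnotation("4.2.1.109", "methylthioribulose 1-phosphate dehydratase", -2),
    ReactionAnnotation("3.1.3.77",
                       "5-(methylthio)-2,3-dioxopentyl-phosphate phosphohydrolase", -2),
]


def active_reactions(records, stage, min_confidence=1):
    """Reactions annotated for a life cycle stage with at least the given score."""
    return [r for r in records if r.confidence >= min_confidence and stage in r.stages]


pcf = [r.ec for r in active_reactions(annotations, "PCF")]
bsf = [r.ec for r in active_reactions(annotations, "BSF")]
rejected = [r.ec for r in annotations if r.confidence < 0]
print(pcf, bsf, rejected)
```

Keeping negatively scored reactions in the same list, rather than deleting them, mirrors the choice described above: they stay searchable but are excluded from any stage-specific view.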

Since metabolic databases focus mainly on pathways and seldom consider sub-cellular compartments, they usually lack information on intracellular transport reactions. Currently, TrypanoCyc contains only 35 such reactions. This is because we did not incorporate transport reactions into our annotation platform and because experimental knowledge on intracellular transport processes is still sparse. However, the dynamic nature of TrypanoCyc means additional annotation and incorporation of measured and probable transport reactions (e.g. taken from existing manually curated metabolic models of the closely related organism Leishmania major (29)) will form part of the iterative process of database refinement. We also perform gap filling in each compartment using graph approaches and testing metabolic scenarios as suggested (9,10) and successfully implemented for other organisms (30).
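One simple graph-based check that can point to gaps within a compartment is the detection of dead-end metabolites, i.e. compounds that are produced but never consumed, or consumed but never produced. The Python sketch below shows this idea on a made-up three-reaction network; it is an assumption about what such a check could look like, not the gap-filling procedure used for TrypanoCyc, and the names `dead_ends`, `reactions` and `boundary` are hypothetical.

```python
# Hypothetical toy reactions within one compartment: id -> (substrates, products)
reactions = {
    "R1": ({"A"}, {"B"}),
    "R2": ({"B"}, {"C"}),
    "R3": ({"D"}, {"E"}),
}

# Metabolites assumed to be exchanged with the environment (uptake or excretion).
boundary = {"A", "E"}


def dead_ends(reactions, boundary):
    """Return (produced but never consumed, consumed but never produced)."""
    produced, consumed = set(), set()
    for substrates, products in reactions.values():
        consumed |= substrates
        produced |= products
    never_consumed = produced - consumed - boundary
    never_produced = consumed - produced - boundary
    return never_consumed, never_produced


never_consumed, never_produced = dead_ends(reactions, boundary)
print("candidate gap between", never_consumed, "and", never_produced)
```

In this toy case the check flags C and D, suggesting that a reaction or transport step connecting them may be missing and should be examined by a curator.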

Many additional databases provide information that can complement metabolic network databases. Linking to these other data sources enhances our ability to learn about an organism’s metabolism. TrypanoCyc, therefore, links to multiple databases including BRENDA (31,32), expasy.org (33), ExplorEnz (34), PubMed and UniProt (35). The TriTrypDB database (36) is the central resource for trypanosomatid genomes and associated functional genomics data, while GeneDB houses the sequence information gathered and annotated through the Wellcome Trust Sanger Institute (37). For each gene, TrypanoCyc offers a direct link to the corresponding TriTrypDB and GeneDB pages.


Figure 3. Navigation between pathway and network representation using MetExplore and TrypanoCyc. (a) Each pathway page has a hyperlink that loads and visualizes the pathway in MetExplore (circled in red on the pathway page screenshot). (b) When this link is clicked on the Glycolysis page, the pathway is loaded in MetExplore; the red box corresponds to the cytosolic part and the green one to the glycosomal part. (c) Using MetExplore, it is then possible to generate a combination of various pathways. TCA cycle, succinate shunt, glycolysis and the pentose phosphate pathway were selected. (d) All reactions of these pathways are added to the cart (red box on the right). A third compartment, mitochondrion, appears (purple box). A reaction allowing transport between cytosol and glycosome appears in the network (red arrow). (e) In the tabular view of MetExplore, a TrypanoCyc button (visible in the third column of the table in [c]) links back to TrypanoCyc.

Table 1. Overview of TrypanoCyc content before and after curation and comparison with the KEGG database

| Reconstruction | Compartments | Life cycle stages | Pathways | Enzymatic reactions | Unique metabolites |
|---|---|---|---|---|---|
| Draft reconstruction 2008 | 1 | 0 | 238 | 1120 | 796 |
| KEGG August 2014 | 1 | 0 | 61 | 656 | 646 |
| TrypanoCyc August 2014 | 9 | 4 | 209 | 1025 | 842 |

The BioCyc library is a collection of 3563 PGDBs. Based on the quality of the PGDBs and the level of manual curation, this central repository classifies them into Tier 1 (highly curated), Tier 2 (moderately curated) and Tier 3 (non-curated) categories. Prior to the release of BioCyc v18.1, only 6 PGDBs (EcoCyc (38), MetaCyc (14), HumanCyc (39), AraCyc (40), YeastCyc and LeishCyc (7)) were published in the Tier 1 category. Due to the quality of information being made available on TrypanoCyc, it was included in BioCyc’s Tier 1 category with the release of BioCyc v18.1 in June 2014.

REACTIONS, PATHWAYS AND NETWORK MINING

Browsing TrypanoCyc content and expert annotations

As a Pathway Tools-based website, TrypanoCyc provides a dedicated web page for each metabolic network entity (pathways, reactions, metabolites, enzymes, proteins and genes). The reaction page architecture was, however, modified in order to allow additional annotation information. This information includes the annotation confidence score (Figure 1d), stage specificity and compartmentation with links to key literature (Figure 1e). A comment box is also included, containing detailed free-text information on the reaction. Figure 1 shows the webpage for the pentose phosphate pathway enzyme, 6-phosphogluconate dehydrogenase (EC 1.1.1.44).

Table 2. Description of the confidence score system used in TrypanoCyc to evaluate the level of curation of each reaction

| Evidence type | Confidence score | Description |
|---|---|---|
| Biochemical data | 4 | Direct evidence for gene product function and biochemical reaction: protein purification, biochemical assays, experimentally solved protein structures and comparative gene-expression studies. |
| Genetic data | 3 | Direct and indirect evidence for gene function: knock-out characterization, knock-in characterization and over expression. |
| Physiological data | 2 | Indirect evidence for biochemical reaction based on physiological data: secretion products or defined medium components serve as evidence for transport and metabolic reactions. |
| Sequence data | 2 | Evidence for gene function: genome annotation, SEED annotation. |
| Modelling data | 1 | No evidence is available but reaction is required for modelling. The included function is a hypothesis and needs experimental verification. The reaction mechanism may be different from the included reaction(s). |
| Not evaluated | 0 | |
| Negative hypothesis | –1 | Although there is no evidence against this reaction, it is expected to not exist |
| Evidence against the reaction | –2 | Direct/indirect evidence against the hypothesis is available |

Search requests on database content can be made through a quick search box found at the top right-hand corner of the interface page, as well as through the advanced search options available from the menu bar. Each pathway representation is available with different levels of detail, the simplest view displaying only the reactions and metabolites while the detailed view displays all available information including the molecular structure of all metabolites involved. Additionally, for every pathway in TrypanoCyc, we provide a link to visualize the pathway in MetExplore.

Mining stage-specific metabolism using cellular overview

To exemplify the integration of molecular profiling data in the TrypanoCyc database we used published results from a Stable Isotope Labelling of Amino acids in Cell culture experiment, comparing protein levels in bloodstream form (BSF) and procyclic form (PCF) trypanosomes (41). The data set contains 3552 gene IDs along with their relative protein levels in the two tested stages of T. brucei (expressed as log PCF/BSF values). A TrypanoCyc cellular overview shows enzymes that differ in abundance between the two life cycle forms (Figure 2; for step by step instructions see Supplementary Data S1).

Mapping other molecular profiling data in TrypanoCyc can be achieved using the Pathway Tools Omics Viewer (42), which displays all pathways in a single representation. Data sets can be loaded using the options listed on the right-hand side of the page (Supplementary Material S2 is a version of this data set in an Omics Viewer-compliant format). Figure 2a shows an image of the overview after loading the proteomics data from (41). Individual reactions can be viewed by moving the mouse over them and clicking the link in the pop-up dialog box. This opens the related reaction page containing the annotation table, giving access to specific TrypanoCyc annotators’ comments about the enzyme and its activity as well as the generic information pertaining to that reaction in the MetaCyc database. For example, the overlaid data clearly show that the respiratory chain is up-regulated in procyclic stages. Browsing the reaction page of any of those up-regulated proteins shows additional information from the annotators. For example, for ubiquinone-cytochrome C reductase (EC 1.10.2.2), two TrypanoCyc annotators report that this reaction is active in the PCF but not in the long slender BSF of T. brucei (see Figure 2b), thus agreeing with the observations from the proteomics experiment.
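Preparing such a data set for overlay amounts to computing one log ratio per gene and writing the result as a delimited text file. The following Python sketch assumes a tab-delimited layout with the gene identifier in the first column and the value in the second, and assumes base-2 logarithms; the article does not state the base, the gene identifiers and abundances are invented, and the exact file layout expected by the Omics Viewer should be checked against its documentation.

```python
import csv
import io
import math

# Hypothetical protein abundances per gene: (PCF level, BSF level).
protein_levels = {
    "gene_1": (200.0, 50.0),
    "gene_2": (10.0, 80.0),
    "gene_3": (30.0, 30.0),
}


def log_ratio(pcf, bsf):
    """log2(PCF/BSF); positive means more abundant in the procyclic form."""
    return math.log2(pcf / bsf)


buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
for gene_id, (pcf, bsf) in sorted(protein_levels.items()):
    writer.writerow([gene_id, f"{log_ratio(pcf, bsf):.3f}"])

table = buffer.getvalue()
print(table)
```

With a log ratio, equal up- and down-regulation are symmetric around zero, which suits a diverging color scale such as the one shown in the overview.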

Using MetExplore to create user-defined sub-networks from TrypanoCyc

To complement the classical pathway-oriented BioCyc representation of data, we also offer a novel way to visualize the content of TrypanoCyc via our MetExplore web server (21) (for step by step instructions see Supplementary Data S3). Each pathway page contains a hyperlink (Figure 3a) that opens MetExplore with the selected pathway (Figure 3b). Importantly, the MetExplore viewer takes into account the localization of reactions. For example, Figure 3b shows how the glycolytic pathway is divided into two compartments (glycosome and cytosol represented by green and red boxes, respectively).

Another advantage of MetExplore is that it provides a tabular representation of compartments, pathways, reactions, enzymes, genes and metabolites in the database. It is also possible to filter these tables by compartments, pathways or reactions. For instance, by filtering simultaneously on the pentose phosphate pathway, TCA cycle, succinate shunt and glycolysis, only reactions and metabolites related to these pathways are displayed in their respective tables (Figure 3c). The user can also add reactions of interest to a ‘cart’ (red box on Figure 3d). It is then possible to visualize the content of this cart in the network representation. From Figure 3d, it is evident that the network perspective is much more effective in representing compartments and transport reactions. Furthermore, the glycosome (green box) and cytosol (red box) are demonstrably connected by a reaction involved in the succinate shunt (marked by a red arrow on Figure 3d). For a more flexible representation MetExplore also offers a downloadable version of the Cytoscape visualization software (43), pre-loaded with the cart content.
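The filter-then-cart workflow described above can be mimicked on a toy table. In the Python sketch below, every reaction record is hypothetical and the tuple layout (identifier, pathway, substrate compartment, product compartment) is an assumption chosen for the example; it does not reflect MetExplore's data model or web service.

```python
# Hypothetical records: (reaction id, pathway, substrate compartment, product compartment)
reactions = [
    ("r1", "glycolysis", "glycosome", "glycosome"),
    ("r2", "glycolysis", "cytosol", "cytosol"),
    ("r3", "succinate shunt", "glycosome", "cytosol"),
    ("r4", "TCA cycle", "mitochondrion", "mitochondrion"),
    ("r5", "sterol biosynthesis", "cytosol", "cytosol"),
]

# Pathways selected in the filter, as in the example of Figure 3c.
selected = {"glycolysis", "succinate shunt", "TCA cycle", "pentose phosphate pathway"}

# The "cart" holds every reaction that belongs to a selected pathway.
cart = [r for r in reactions if r[1] in selected]

# Compartments present in the resulting sub-network.
compartments = sorted({c for r in cart for c in r[2:]})

# Reactions whose substrates and products sit in different compartments
# are the ones that connect compartments in the network view.
links = [r[0] for r in cart if r[2] != r[3]]

print([r[0] for r in cart], compartments, links)
```

Merging pathways before drawing is what exposes the compartment-crossing reaction; in separate pathway diagrams the same reaction would appear without its connecting role being visible.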

Finally, each MetExplore reaction/pathway with a description in TrypanoCyc has been hyperlinked to the corresponding reaction/pathway pages, allowing the user to go back to the expert annotations at any time (Figure 3e).

CONCLUSION

Since 2012, TrypanoCyc has been under extensive curation with the help of the scientific community and is now counted among the seven Tier 1 databases within the BioCyc repository. Collaborative annotations help in improving the quality of the database by reducing errors, reducing the workload for individual annotators and also providing inferences from multiple perspectives given the various types of experts in the community.

T. brucei metabolic plasticity allows the parasite to adapt to divergent nutritional environments offered by different hosts. For drug target identification, for example, focusing on enzymes and metabolic pathways expressed in the parasite stages that are replicative in the mammalian host is critical. TrypanoCyc is the first comprehensive metabolic network database for parasites including stage specificity as a key component of the collected data. LeishCyc (7), for the related parasite L. major, has also been established, and in the future these two databases should, ideally, be linked, given the significant degree of similarity in the metabolic networks of these evolutionarily related parasites.

TrypanoCyc and the related annotation database allow anyone with an interest to join the annotation team. The size of the consortium helps guarantee the sustainability of TrypanoCyc as does the involvement of permanent staff both at INRA, Toulouse, and the University of Glasgow. The Toulouse bioinformatics facility provides the TrypanoCyc server. TrypanoCyc is freely available and is not password protected.

TrypanoCyc database content can be mined in a pathway-oriented manner using the BioCyc-like web interface but also in a network perspective using the MetExplore web server, which allows tailored building and visualization of sub-networks. Two options are available to programmatically access TrypanoCyc: through Pathway Tools using JavaCyc or PerlCyc and through MetExplore using its web service.

TrypanoCyc is a unique knowledge source for people investigating T. brucei metabolism. The availability of SBML (44) files (provided as Supplementary Material S4) based on the curated network reconstruction in TrypanoCyc will underpin efforts to explore trypanosome metabolism using flux balance analysis (45) or other constraint-based techniques. The reconstruction could also serve as a model for the metabolism of early eukaryotes.
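As a minimal illustration of how an SBML file can feed a constraint-based analysis, the sketch below parses a toy SBML-like document, builds the stoichiometric matrix S and checks whether a flux vector v satisfies the steady-state condition S·v = 0 on which flux balance analysis rests. The embedded model, its reaction identifiers and the function names stoichiometric_matrix and is_steady_state are hypothetical and are not taken from Supplementary Material S4; a real analysis would more likely rely on a dedicated SBML library and a linear-programming solver such as those provided by the COBRA Toolbox (45).

```python
import xml.etree.ElementTree as ET

# Toy SBML-like model (hypothetical; NOT the TrypanoCyc supplementary file).
# Reactions: -> A ; A -> 2 B ; B ->
TOY_SBML = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy">
    <listOfCompartments><compartment id="c"/></listOfCompartments>
    <listOfSpecies>
      <species id="A" compartment="c"/>
      <species id="B" compartment="c"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="uptake_A">
        <listOfProducts><speciesReference species="A"/></listOfProducts>
      </reaction>
      <reaction id="A_to_B">
        <listOfReactants><speciesReference species="A"/></listOfReactants>
        <listOfProducts><speciesReference species="B" stoichiometry="2"/></listOfProducts>
      </reaction>
      <reaction id="export_B">
        <listOfReactants><speciesReference species="B"/></listOfReactants>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def stoichiometric_matrix(sbml_text):
    """Return (species ids, reaction ids, S) with S[i][j] the net
    coefficient of species i in reaction j (negative for reactants)."""
    root = ET.fromstring(sbml_text)
    # Reuse whatever XML namespace the root element declares.
    ns = root.tag[: root.tag.index("}") + 1]
    species = [s.get("id") for s in root.iter(ns + "species")]
    row = {sid: i for i, sid in enumerate(species)}
    reactions = list(root.iter(ns + "reaction"))
    S = [[0.0] * len(reactions) for _ in species]
    for j, rxn in enumerate(reactions):
        for tag, sign in (("listOfReactants", -1.0), ("listOfProducts", 1.0)):
            side = rxn.find(ns + tag)
            if side is None:
                continue
            for ref in side.findall(ns + "speciesReference"):
                # A missing stoichiometry attribute is treated as 1.
                coeff = float(ref.get("stoichiometry", "1"))
                S[row[ref.get("species")]][j] += sign * coeff
    return species, [r.get("id") for r in reactions], S


def is_steady_state(S, v, tol=1e-9):
    """True if every species is mass-balanced under fluxes v (S.v = 0)."""
    return all(
        abs(sum(s_ij * v_j for s_ij, v_j in zip(row, v))) <= tol for row in S
    )


species, reactions, S = stoichiometric_matrix(TOY_SBML)
print(species, reactions, S)
print(is_steady_state(S, [1.0, 1.0, 2.0]))
```

Flux balance analysis then searches, among all flux vectors that satisfy this steady-state condition and the chosen flux bounds, for one that maximizes a linear objective; the sketch covers only the mass-balance check, not the optimization.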

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGEMENT

We are grateful to the genotoul bioinformatics platform Toulouse Midi-Pyrenees for providing computing and storage resources required by the TrypanoCyc Database. This study was initiated by the BBSRC-ANR Systryp grant.

FUNDING

European Commission FP7 Marie Curie Initial Training Network ‘ParaMet’ [290080 to S.S.]; ANR project MetaboHub [ANR-11-INBS-0010 to B.M.]; Wellcome Trust [085349]; the work of Fiona Achcar was part of the SysMO SilicoTryp project coordinated by R.B. Funding for open access charge: European Commission FP7 Marie Curie Initial Training Network ‘ParaMet’ [290080].

Conict of interest statement. None declared.

REFERENCES

1. Brun,R., Blum,J., Chappuis,F. and Burri,C. (2010) Human African trypanosomiasis. Lancet, 375, 148–159.
2. Barrett,M.P. (2010) Potential new drugs for human African trypanosomiasis: some progress at last. Curr. Opin. Infect. Dis., 23, 603–608.
3. Vincent,I.M., Creek,D.J., Burgess,K., Woods,D.J., Burchmore,R.J.S. and Barrett,M.P. (2012) Untargeted metabolomics reveals a lack of synergy between nifurtimox and eflornithine against Trypanosoma brucei. PLoS Negl. Trop. Dis., 6, e1618.
4. Creek,D.J. and Barrett,M.P. (2014) Determination of antiprotozoal drug mechanisms by metabolomics approaches. Parasitology, 141, 83–92.
5. Achcar,F., Kerkhoven,E.J. and Barrett,M.P. (2014) Trypanosoma brucei: meet the system. Curr. Opin. Microbiol., 20C, 162–169.
6. Cottret,L. and Jourdan,F. (2010) Graph methods for the investigation of metabolic networks in parasitology. Parasitology, 137, 1393–1407.
7. Doyle,M.A., MacRae,J.I., De Souza,D.P., Saunders,E.C., McConville,M.J. and Likić,V.A. (2009) LeishCyc: a biochemical pathways database for Leishmania major. BMC Syst. Biol., 3, 57.
8. Shanmugasundram,A., Gonzalez-Galarza,F.F., Wastling,J.M., Vasieva,O. and Jones,A.R. (2013) Library of Apicomplexan Metabolic Pathways: a manually curated database for metabolic pathways of apicomplexan parasites. Nucleic Acids Res., 41, D706–D713.
9. Thiele,I. and Palsson,B.Ø. (2010) A protocol for generating a high-quality genome-scale metabolic reconstruction. Nat. Protoc., 5, 93–121.
10. DeJongh,M., Formsma,K., Boillot,P., Gould,J., Rycenga,M. and Best,A. (2007) Toward the automated generation of genome-scale metabolic networks in the SEED. BMC Bioinform., 8, 139.
11. Whitaker,J.W., Letunic,I., McConkey,G.A. and Westhead,D.R. (2009) metaTIGER: a metabolic evolution resource. Nucleic Acids Res., 37, D531–D538.
12. Agren,R., Liu,L., Shoaie,S., Vongsangnak,W., Nookaew,I. and Nielsen,J. (2013) The RAVEN toolbox and its use for generating a genome-scale metabolic model for Penicillium chrysogenum. PLoS Comput. Biol., 9, e1002980.
13. May,J.W., James,A.G. and Steinbeck,C. (2013) Metingear: a development environment for annotating genome-scale metabolic models. Bioinformatics, 29, 2213–2215.
14. Caspi,R., Altman,T., Billington,R., Dreher,K., Foerster,H., Fulcher,C.A., Holland,T.A., Keseler,I.M., Kothari,A., Kubo,A. et al. (2014) The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of Pathway/Genome Databases. Nucleic Acids Res., 42, D459–D471.

15. Ginsburg,H. (2006) Progress in in silico functional genomics: the malaria Metabolic Pathways database. Trends Parasitol., 22, 238–240.
16. Fairlamb,A.H., Blackburn,P., Ulrich,P., Chait,B.T. and Cerami,A. (1985) Trypanothione: a novel bis(glutathionyl)spermidine cofactor for glutathione reductase in trypanosomatids. Science, 227, 1485–1487.
17. Tardif,M., Atteia,A., Specht,M., Cogne,G., Rolland,N., Brugière,S., Hippler,M., Ferro,M., Bruley,C., Peltier,G. et al. (2012) PredAlgo: a new subcellular localization prediction tool dedicated to green algae. Mol. Biol. Evol., 29, 3625–3639.
18. Lamour,N., Rivière,L., Coustou,V., Coombs,G.H., Barrett,M.P. and Bringaud,F. (2005) Proline metabolism in procyclic Trypanosoma brucei is down-regulated in the presence of glucose. J. Biol. Chem., 280, 11902–11910.
19. Michels,P.A., Hannaert,V. and Bringaud,F. (2000) Metabolic aspects of glycosomes in trypanosomatidae - new data and views. Parasitol. Today, 16, 482–489.
20. Kanehisa,M., Goto,S., Sato,Y., Kawashima,M., Furumichi,M. and Tanabe,M. (2014) Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Res., 42, D199–D205.
21. Cottret,L., Wildridge,D., Vinson,F., Barrett,M.P., Charles,H., Sagot,M.-F. and Jourdan,F. (2010) MetExplore: a web server to link metabolomic experiments and genome-scale metabolic networks. Nucleic Acids Res., 38, W132–W137.
22. Berriman,M., Ghedin,E., Hertz-Fowler,C., Blandin,G., Renauld,H., Bartholomeu,D.C., Lennard,N.J., Caler,E., Hamlin,N.E., Haas,B. et al. (2005) The genome of the African trypanosome Trypanosoma brucei. Science, 309, 416–422.
23. Karp,P.D., Latendresse,M. and Caspi,R. (2011) The pathway tools pathway prediction algorithm. Stand. Genomic Sci., 5, 424–429.
24. Thiele,I. and Palsson,B.Ø. (2010) Reconstruction annotation jamborees: a community approach to systems biology. Mol. Syst. Biol., 6, 361.
25. Opperdoes,F.R., Borst,P. and Spits,H. (1977) Particle-bound enzymes in the bloodstream form of Trypanosoma brucei. Eur. J. Biochem., 76, 21–28.
26. Opperdoes,F.R. and Borst,P. (1977) Localization of nine glycolytic enzymes in a microbody-like organelle in Trypanosoma brucei: the glycosome. FEBS Lett., 80, 360–364.
27. Berger,B.J., Dai,W.W., Wang,H., Stark,R.E. and Cerami,A. (1996) Aromatic amino acid transamination and methionine recycling in trypanosomatids. Proc. Natl. Acad. Sci. U.S.A., 93, 4126–4130.
28. Creek,D.J., Chokkathukalam,A., Jankevics,A., Burgess,K.E.V., Breitling,R. and Barrett,M.P. (2012) Stable isotope-assisted metabolomics for network-wide metabolic pathway elucidation. Anal. Chem., 84, 8442–8447.
29. Chavali,A.K., Whittemore,J.D., Eddy,J.A., Williams,K.T. and Papin,J.A. (2008) Systems analysis of metabolism in the pathogenic trypanosomatid Leishmania major. Mol. Syst. Biol., 4, 177.
30. Thiele,I., Swainston,N., Fleming,R.M.T., Hoppe,A., Sahoo,S., Aurich,M.K., Haraldsdottir,H., Mo,M.L., Rolfsson,O., Stobbe,M.D. et al. (2013) A community-driven global reconstruction of human metabolism. Nat. Biotechnol., 31, 419–425.
31. Schomburg,I., Chang,A., Ebeling,C., Gremse,M., Heldt,C., Huhn,G. and Schomburg,D. (2004) BRENDA, the enzyme database: updates and major new developments. Nucleic Acids Res., 32, D431–D433.
32. Schomburg,I., Chang,A., Placzek,S., Söhngen,C., Rother,M., Lang,M., Munaretto,C., Ulas,S., Stelzer,M., Grote,A. et al. (2013) BRENDA in 2013: integrated reactions, kinetic data, enzyme function data, improved disease classification: new options and contents in BRENDA. Nucleic Acids Res., 41, D764–D772.
33. Artimo,P., Jonnalagedda,M., Arnold,K., Baratin,D., Csardi,G., De Castro,E., Duvaud,S., Flegel,V., Fortier,A., Gasteiger,E. et al. (2012) ExPASy: SIB bioinformatics resource portal. Nucleic Acids Res., 40, W597–W603.
34. McDonald,A.G., Boyce,S. and Tipton,K.F. (2009) ExplorEnz: the primary source of the IUBMB enzyme list. Nucleic Acids Res., 37, D593–D597.
35. The UniProt Consortium (2013) Update on activities at the Universal Protein Resource (UniProt) in 2013. Nucleic Acids Res., 41, D43–D47.
36. Aslett,M., Aurrecoechea,C., Berriman,M., Brestelli,J., Brunk,B.P., Carrington,M., Depledge,D.P., Fischer,S., Gajria,B., Gao,X. et al. (2010) TriTrypDB: a functional genomic resource for the Trypanosomatidae. Nucleic Acids Res., 38, D457–D462.
37. Logan-Klumpler,F.J., De Silva,N., Boehme,U., Rogers,M.B., Velarde,G., McQuillan,J.A., Carver,T., Aslett,M., Olsen,C., Subramanian,S. et al. (2012) GeneDB-an annotation database for pathogens. Nucleic Acids Res., 40, D98–D108.
38. Keseler,I.M., Mackie,A., Peralta-Gil,M., Santos-Zavaleta,A., Gama-Castro,S., Bonavides-Martínez,C., Fulcher,C., Huerta,A.M., Kothari,A., Krummenacker,M. et al. (2013) EcoCyc: fusing model organism databases with systems biology. Nucleic Acids Res., 41, D605–D612.
39. Romero,P., Wagg,J., Green,M.L., Kaiser,D., Krummenacker,M. and Karp,P.D. (2005) Computational prediction of human metabolic pathways from the complete human genome. Genome Biol., 6, R2.
40. Zhang,P., Foerster,H., Tissier,C.P., Mueller,L., Paley,S., Karp,P.D. and Rhee,S.Y. (2005) MetaCyc and AraCyc. Metabolic pathway databases for plant research. Plant Physiol., 138, 27–37.
41. Urbaniak,M.D., Guther,M.L.S. and Ferguson,M.A.J. (2012) Comparative SILAC proteomic analysis of Trypanosoma brucei bloodstream and procyclic lifecycle stages. PLoS One, 7, e36619.
42. Paley,S.M. and Karp,P.D. (2006) The Pathway Tools cellular overview diagram and Omics Viewer. Nucleic Acids Res., 34, 3771–3778.
43. Cline,M.S., Smoot,M., Cerami,E., Kuchinsky,A., Landys,N., Workman,C., Christmas,R., Avila-Campilo,I., Creech,M., Gross,B. et al. (2007) Integration of biological networks and gene expression data using Cytoscape. Nat. Protoc., 2, 2366–2382.
44. Hucka,M., Finney,A., Sauro,H.M., Bolouri,H., Doyle,J.C., Kitano,H., Arkin,A.P., Bornstein,B.J., Bray,D., Cornish-Bowden,A. et al. (2003) The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics, 19, 524–531.
45. Schellenberger,J., Que,R., Fleming,R.M.T., Thiele,I., Orth,J.D., Feist,A.M., Zielinski,D.C., Bordbar,A., Lewis,N.E., Rahmanian,S. et al. (2011) Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox v2.0. Nat. Protoc., 6, 1290–1307.

