Systems Biology Resources and Their
Applications to Understand the Cancer
Pawan Kumar Raghav, Zoya Mann, Pranav K. Pandey, and
Sujata Mohanty
Contents
Introduction
Systems Biology Databases Used in Cancer Research
  Expression and Variation Databases
  Immune System and Personalized Medicine Databases Used in Cancer Drug Designing
  Databases Used for Interaction and Pathway Analyses
  Tumor Databases Used for Drug Designing in Experimental and Clinical Studies
Systems Biology Approaches and Tools to Cancer
  Expression and Variation-Based Systems Biology Tools and Approaches for Cancer Prediction
  Common Immunoinformatic and Bioinformatics Tools to Cancer Drug Discovery
  Biomolecular Networks Tools in Cancer
  Text Mining Tools Used in Cancer Research
  Mathematical Modeling and Simulation Tools to Model Cancer Pathways and Networks
  Clinical Applications of Systems Biology Tools and Approaches
Conclusion
References
P. K. Raghav · Z. Mann · S. Mohanty (*)
Stem Cell Facility, DBT-Centre of Excellence for Stem Cell Research, All India Institute of
Medical Sciences, New Delhi, India
P. K. Pandey
Department of Ophthalmic Sciences, Dr. Rajendra Prasad Centre, All India Institute of Medical
Sciences, New Delhi, India
© Springer Nature Singapore Pte Ltd. 2021
S. Chakraborti et al. (eds.), Handbook of Oxidative Stress in Cancer: Mechanistic
Aspects, https://doi.org/10.1007/978-981-15-4501-6_140-1
Abstract
Tumor development is not an abrupt process, but rather begins as a slow and
gradual anomaly in a random cell's machinery. Such an aberration disturbs the
cell's homeostasis and leads to transformation of its niche into a tumor
microenvironment. These changes alter the genetic stability of the surrounding cells
through different molecules and associated pathways. Although tumor niche
development is a random and complicated process, most of the conventional
therapy modes have only a symptomatic effect. Therefore, targeted therapeutic
interventions and a better understanding of tumor complexity are required. This
can be achieved with systems biology approaches by using high-throughput
techniques that assist in the prognosis and diagnosis of cancer. These
approaches work primarily in two interdependent ways: collating the available
data in the form of a database, and extracting data from these platforms via tools
and software to create networks and prediction models. Such tools are efficient in
generating data that can help in personalized medicine through drug discovery.
Therefore, in this chapter, we have reported the publicly available systems
biology databases, software, and tools used in cancer data analysis.
Keywords
Systems biology · Databases · Tools · Software · Mathematical modeling and
simulations · Cancer
Introduction
Cancer occurs due to the abnormal proliferation of cells regulated predominantly by
anti-apoptotic, proapoptotic, and other proteins (Verma et al. 2013; Raghav et al.
2012a, b, 2019). Death from cancer is still prevalent despite advances in
prevention and treatment. Based on predictions, deaths from cancer will continue
to increase, and 11.4 million are expected to die by 2030 (World Health Organization
2012). Early prognosis and diagnosis are tedious tasks due to the lack of specific
molecular markers for cancer, although regulating genes and their products are being
studied to eradicate cancer. The solution to these issues is provided by data
integration from the "Omics" data generation technologies to explain the molecular
mechanisms of cancer pathogenesis. Currently, cancer is considered a systems biology
disease (Hornberg et al. 2006). Systems biology is an integrative approach through
which new molecular events in cancer can be revealed based on high-throughput
"Omics" technologies (Chakraborty et al. 2018). The high-throughput research tools
belong to "Omics" technologies and include genomics, proteomics, and
transcriptomics. Genome sequencing and high-throughput technologies (microarray
and next-generation sequencing) provide extensive datasets stored in several
databases (Kitano 2002). These datasets, containing gene and protein information
extracted from samples, have been analyzed using systems biology tools (Nagaraj
2009). Furthermore, systems biology explains the complex interactions between
genes and networks of all cellular elements (Liu 2005). A comprehensive network
of diverse data, such as gene expression, mutation, DNA-protein, and
protein-protein interactions, can be constructed to understand molecular processes
associated with cancer (Schadt et al. 2009). These complex networks are represented as a
systems biology model that can be used to identify specific regulatory molecules or
pathways involved in cancer (Laubenbacher et al. 2009; Morrow et al. 2010; Mac
et al. 2010). This inclusive systems biology approach is necessary to understand the
interaction between the adaptive antioxidant response and ROS signaling (Kitano
2002).
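The idea of merging heterogeneous interaction data into one network and then looking for candidate regulatory molecules can be illustrated with a minimal Python sketch. It is not a method taken from the cited studies: the node names (GENE_A to GENE_E) and the interaction records are hypothetical placeholders, and node degree is used here only as one simple, assumed proxy for regulatory importance.

```python
from collections import defaultdict

# Hypothetical interaction records: (source, target, evidence type).
# The node names are placeholders, not real genes or measured interactions.
INTERACTIONS = [
    ("GENE_A", "GENE_B", "protein-protein"),
    ("GENE_A", "GENE_C", "DNA-protein"),
    ("GENE_A", "GENE_D", "co-expression"),
    ("GENE_B", "GENE_C", "protein-protein"),
    ("GENE_D", "GENE_E", "protein-protein"),
]


def build_network(interactions):
    """Merge typed interaction records into one undirected adjacency map."""
    network = defaultdict(set)
    for source, target, _evidence in interactions:
        network[source].add(target)
        network[target].add(source)
    return dict(network)


def rank_by_degree(network):
    """Return (node, degree) pairs, highest degree first, ties broken by name."""
    degrees = {node: len(neighbors) for node, neighbors in network.items()}
    return sorted(degrees.items(), key=lambda item: (-item[1], item[0]))


network = build_network(INTERACTIONS)
ranking = rank_by_degree(network)
print(ranking[0])  # the most connected node in this toy network
```

In practice the records would come from the interaction and expression databases listed in Table 1, and a dedicated network library or a more informative centrality measure would likely replace the plain degree count.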
The scope of the current chapter is to provide an overview of web resources,
databases, tools, and software of systems biology used for cancer research (Tables 1
and 2, Fig. 1). This chapter summarizes the advanced bioinformatics tools with
systems biology approaches for "Omics" analysis. Such analysis reports multiple
facets of protein and gene expression, variations, experimental pathways and
interactions, immunoinformatics, drug designing, and GWAS (genome-wide association
studies) (Mac et al. 2010).
Systems Biology Databases Used in Cancer Research
Expression and Variation Databases
The number of bioinformatics data repositories is growing exponentially. The search
for biomarkers and novel cancer targets needs various databases and tools (Table 1).
The high-throughput data obtained from cancer studies have been integrated into
publicly available centralized databases. For cancer therapy, aggregation, analysis,
and integration of clinical data are required to identify novel biomarkers of cancer
and their targets (Hackl et al. 2010). These biological data include both gene
expression and mutation data available at international consortia and portals: TCGA
(The Cancer Genome Atlas, http://cancergenome.nih.gov), ICGC (International
Cancer Genome Consortium, http://icgc.org), cBioPortal (https://www.cbioportal.org/),
and GEN2PHEN (Genotype to Phenotype Database, http://www.gen2phen.org).
The primary aim of these portals is to generate genetic variation data of cancer
patients compared to healthy individuals. However, these databases are used in combination
with other tools to identify cancer-associated mutations (Raghav et al. 2019; Forman
et al. 2010). The TCGA database is funded by the NIH (National Institutes of Health)
and has clinical and high-throughput data for 33 types of cancer (Gao et al. 2019).
The extensive resources GEO (Gene Expression Omnibus) and microarray
informatics at EBI (European Bioinformatics Institute) are available for retrieving,
analyzing, and storing microarray data. The querying abilities of the GEO repository
are enhanced in GEOmetadb. The GEO and ArrayExpress public
repositories provide datasets of microarray gene expression data for cancer
types and subtypes acquired on several platforms and tumor samples, while
PRIDE (PRoteomics IDentifications database) is a proteomics data repository.
Oncomine is another cancer gene expression database that includes cancer genomic
profiles in 15 datasets and 86,733 samples. Data integration, combined with data
mining tools, identifies correlations, which is essential for systems biology studies.
Gene name redundancies and inconsistencies can be reduced using the genetic
nomenclature in GO (Gene Ontology) and the caBIG (cancer Biomedical Informatics
Grid) database. High-throughput analyses pertain to the GWAS method used to
identify cancer biomarkers. However, a protein microarray approach has identified a
set of antigens in one particular colon cancer study (Nam et al. 2003), while the
Table 1 List of systems biology and related databases used in cancer studies
Databases/Webportals | Description | PubMed IDs
Expression and variation databases
ArrayExpress | Archive of the genomics data, based on microarray data, gene expression profiles and their functionality. [https://www.ebi.ac.uk/arrayexpress/] | 17132828
Atlas of Genetics and Cytogenetics in Oncology and Hematology | Genetic mutations related to cancer and respective chromosomal abnormalities. [http://atlasgeneticsoncology.org/] | 18396036
BASE | Web-based microarray database and analysis platform. [http://base.thep.lu.se/] | 19822003
BMERC (BioMolecular Engineering Research Center) | Completed Genomes collected resources. [https://www.bu.edu/bmerc/] | 17604447
caBIG (Cancer Biomedical Informatics Grid) | Large multidisciplinary data sets, analysis tools and other related tools resources for analysis. [https://orbit.nlm.nih.gov/browse-repository/online-community/forum-message/28-cancer-biomedical-informatics-grid-cabig] | 22986455
Cancer GAMAdb | Cancer genetic data from integrated knowledge database of cancer with genome-wide association studies and meta-analyses, where specific information can be searched. [https://omictools.com/cancer-gamadb-tool] | 18396036
Cancer Gene Census | Catalogue of annotation of genes and their mutations. [https://cancer.sanger.ac.uk/census] | 22986455
caGWAS (Cancer Genome Wide Association Scan) | Integrate, analyze, query, and report plausible associations among genetic variations, the respective drug responses, and diseases or other clinical outcomes. [https://www.nitrc.org/projects/cagwas/] | 22986455
cBioPortal for Cancer Genomics | A platform for analysis, visualization, and downloading of the enormous genomic data. [https://www.cbioportal.org/] | 24492837
CGAP (Cancer Genome Anatomy Project) | Genetic expression profiles resource of normal vs. cancer cells. [http://cgap.nci.nih.gov/] | 22986455
CGEMS (Cancer Genetic Markers of Susceptibility) | Identification of commonly inherited genetic mutations linked with the risk of prostate and breast cancer. [https://dl.acm.org/doi/10.1145/965106.965131] | 22986455
COLT-Cancer database | Collection of all the shRNA-based signatures profiles, covering approximately 16000 human genes. [https://omictools.com/colt-cancer-tool] | 18396036
COSMIC (Catalogue of Somatic Mutations in Cancer) | Contains details of sample aberrations or mutations and literature. Also provides mutational range and frequency statistics for any gene of interest and/or cancer phenotype. [https://cancer.sanger.ac.uk/cosmic] | 18396036, 22986455
CRC gene Database (Colorectal Cancer) | Gathers all gene-based studies on CRC with a precise interpretation of all the possible risk factors. [http://colonatlas.org/] | 18396036
caSNP (Database for copy number alterations of cancer genome from SNP array data) | Collection of copy number variations (CNV) from numerous SNP arrays. [https://bioinformaticshome.com/tools/cnv/descriptions/CaSNP.html] | 22986455
dbDEPC (Database of Differentially Expressed Proteins in Human Cancers) | Provides proteomics data for cancer, details about changes in expression at the protein level, and explores the differences in protein profiles among different cancer subtypes. [https://www.scbit.org/dbdepc3/protein.php] | 22986455
EBI (European Bioinformatics Institute) | Maintains SwissProt and EMBL Nucleotide Sequence Database, Europe's primary and most reliable nucleotide sequence database. [https://ebi.ac.uk] | 7937043
EBI-EMBL | Europe's primary sequence data resource that maintains EMBL Nucleotide Sequence Database and SWISS-PROT protein Sequence Database. [https://www.ebi.ac.uk/] | 8594602
Ensembl Genome Browser | Genomic information on Human, archiving all the sequenced genes. [http://www.ensembl.org/] | 11752248
FaCD (Familial Cancer Database) | Aids differential diagnosis in cancer patients at genetic level. [https://www.familialcancerdatabase.nl/] | 18396036
GEM (Grid-Enabled Measures) | A platform by NCI to promote use of standardized measures organized by theoretical constructs and share the harmonized data hence generated. [grids.ucs.indiana.edu/ptliupages/publications/hpjavaapril04.pdf] | 21521586
GEN2PHEN (Genotype to Phenotype Database) | This database collates the G2P data from various resources with the facility of data annotation and user feedback. [http://www.gen2phen.org] | 21438073
GEO (Gene Expression Omnibus) | Stores high-throughput genomic data, comprised of chromatin structure, transcription factor binding information, methylation status, and CNVs. [https://www.ncbi.nlm.nih.gov/geo/] | 22986455
GeneCards | A comprehensive database for all predicted and known human genes. [https://www.genecards.org/] | 24492837
GEOmetadb | SQLite database that contains the metadata associated with the GEO repository. [https://gbnci-abcc.ncifcrf.gov/geo/] | 18842599
GO (Gene Ontology) | The largest database on function of all the genes. Data has been derived from molecular and genetics experiments. [http://geneontology.org/] | 23161678
HGMD (Human Gene Mutation Database) | Collates all the human gene lesions underlying genetic diseases, most of which are published. [http://www.hgmd.org] | 28349240
HGMP Resource (Human Genome Mapping Project) | Homology information for mouse, human and over 70 other species. [http://www.hgmp.mrc.ac.uk/Genome] | 10193186
HNOCDB (Head and Neck and Oral Cancer) | Comprehensive information on microRNAs and genes of the head, neck and oral cancer. [https://omictools.com/hnocdb-tool] | 18396036
ICGC Data Portal (International Cancer Genome Consortium) | Catalogues of genetic aberrations (somatic mutations, epigenetic modifications, abnormal expression of genes) among various types of tumors. Project for "omics" data, ICGC is focused in understanding the genomic abnormalities caused in cancer with detailed information on somatic mutations in 50 different cancer types. [https://icgc.org/] | 18396036, 24492837, 22986455
IGDB NSCLC (Integrated Genomic Database of Non-Small Cell Lung Cancer) | Consolidated database of genetic mutations (LOH, CNA, aCGH, SNP) archived on lung cancer. [http://igdb.nsclc.ibms.sinica.edu.tw/] | 18396036
IMB Jena | Structural genomics data for human, mouse, and primates. [http://genome.imb-jena.de/] | 10592237
Liverome | Curated database of liver cancer-related genes retrieved from available open source proteomics and microarray studies. [http://liverome.kobic.re.kr/] | 18396036
MethDB | Database for environmental epigenetic effects and DNA methylation. [http://www.methdb.net/] | 11125109
MINT (Molecular INTeraction database) | Gene expression patterns and DNA methylation in cancers vs. normal cells. [https://mint.bio.uniroma2.it/] | 18396036
miRCancer | Archived microRNA expression description in different human cancers and subtypes using chi-square sequence and clustering analytical tool. [http://mircancer.ecu.edu/] | 18396036
Mitelman Database of Chromosome Aberrations in Cancer | Chromosomal mutations and tumor characteristics, from either associated or individual cases. [https://mitelmandatabase.isb-cgc.org/] | 18396036
NCBI dbGaP (Database of Genotypes and Phenotypes) | Archives the exposure, genotype, phenotype, and sequence-based data of the individuals and the possible associations among them. [https://www.ncbi.nlm.nih.gov/gap/] | 22986455
Oncomine | Database and data-mining platform built to retrieve microarray data. Collects networks and pathways and gene expression data. [https://www.oncomine.org/resource/login.html] | 18396036, 22986455
Oral Cancer Gene Database Version I and II | Oral cancer associated genes archived in this database, with two versions: Version I has 242 genes; Version II has 374 genes. [http://www.actrec.gov.in/OCDB/] | 18396036
PC-GDB (Pancreatic Cancer gene database) | Details on pancreatic cancer associated genes. [https://www.bioinformatics.org/pcgdb/] | 18396036
PEpiD | Records prostate cancer epigenetic data of experimental rodents and humans. [https://www.pepid.com/] | 18396036
Progenetix database | Aberrations in copy number of genes associated with human cancer. [https://www.progenetix.org/] | 18396036
RCDB (Renal Cancer Gene Database) | Manually collated information about 269 microRNA and 240 protein-coding genes, describing pathogenesis and etiology of different renal cancers. [https://rcdb.com/] | 18396036
RefDIC | Immunoinformatic resources for microarray analyses. [http://refdic.rcai.riken.jp/welcome.cgi] | 17893089
Roche Cancer Genome Database | Comprehensive information on chromosomal aberrations and SNPs archived from techniques like CGH and FISH. [http://rcgdb.bioinf.uni-sb.de/MutomeWeb/] | 18396036
SageBio | Provides a platform for data curation, sharing and running solutions on complex biomedical problems. [https://sagebionetworks.org/] | 24492837
Sanger Institute databases | A set of databases within the institute to store and analyze a large scale of data. [https://www.sanger.ac.uk/science/tools/categories/database-software] | 24492837
dbSNP (Single Nucleotide Polymorphism Database) | Various kinds of databases are categorized based on different mutations like substitutions, deletion, insertion polymorphism, and microsatellite repeats. [https://www.ncbi.nlm.nih.gov/snp/] | 22986455
SNP500Cancer | Main library for sequence verification of SNPs and related assay information. [https://hsls.pitt.edu/obrc/index.php?page=URL1097241151] | 18396036
TCGA (The Cancer Genome Atlas) | Resource designed to understand molecular basis of the cancer, by performing analysis of genetic and miRNA expression, copy number and methylation status of brain, ovarian and lung cancer. [https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga] | 24492837, 22986455
TUMIR | Archives manually collected but experimentally backed data for understanding the role of miRNAs in different cancer types. [https://omictools.com/tumir-tool] | 18396036
Tumor Gene Family Databases | Molecular and cellular data of genes associated with different cancers subtypes. [http://www.tumor-gene.org/tgdf.html] | 18396036
UCSC Xena (University of California Santa Cruz) | Visualizes and hosts functional genomic data. [https://xena.ucsc.edu/welcome-to-ucsc-xena/] | 18396036
Immune system databases
dbLRC-KIR | Database of the human Leukocyte Receptor Complex (LRC). [https://www.ebi.ac.uk/ipd/kir/] | 23193264
dbMHC | Experimental and clinical data on MHC. [https://www.ncbi.nlm.nih.gov/Web/Newsltr/Summer03/dbMHC.html] | 15215374
IEDB (Immune epitope database and analysis resource) | Contains immune epitope data related to all species. [https://www.iedb.org/] | 30357391
IMGT (International ImMunoGeneTics information system) | Resource of the IgSF, MHC, MhcSF, RPI, IG, TR. [http://www.imgt.org/] | 15608269
International HapMap Project | Databases and linkage maps of sequence variations. [http://snp.cshl.org/] | 16251469
IPD (Immuno-polymorphism database) | Database containing information on polymorphic genes of the immune system and their respective variations. [http://www.ebi.ac.uk/ipd/] | 16944494
MHC Haplotype Project | Studies of MHC-linked diseases to find possible associations. [http://www.sanger.ac.uk/HGP/Chr6/MHC/] | 18193213
SNPBinder | The database helps in prediction of the tissue-specific minor histocompatibility antigens. [http://www.sipep.org/] | 16893394
SIGMA (System for Integrative Genomic Microarray Analysis) | Immunoinformatic resources for microarray analyses. [http://sigma.bccrc.ca] | 17192189
TumorHoPe | Collated information on tumor homing peptides and their respective target cells, validated by experimental data. [https://webs.iiitd.edu.in/raghava/tumorhope/help.php] | 18396036
Drug designing databases used for systems biology analysis
BindingDB | Web-based open source database for measuring binding affinities of protein-ligand complexes. [https://www.bindingdb.org/bind/index.jsp] | 26481362
ChEMBL | Freely available database on bioactive molecules that have properties similar to drugs, curating the chemical, biological and genomic data. Primarily used for the purpose of drug discovery. [https://www.ebi.ac.uk/chembl/] | 27899562
DrugBank | Combines information on drug and its details on structure and function with the specific targets, collating bioinformatics and cheminformatics. [https://www.drugbank.ca/] | 18048412
MMDB (Molecular Modeling Database) | Archives 3D structures of biomolecular nucleic acids and proteins derived from PDB and determines their biological functions. [https://www.ncbi.nlm.nih.gov/Structure/MMDB/mmdb.shtml] | 24319143
MOAD (Mother of all databases) | Largest archive of the identified protein-ligand complexes retrieved from PDB. [http://bindingmoad.org/] | 15971202
PDB (Protein Data Bank) | This repository archives information about macromolecules, nucleic acids and proteins. [https://www.rcsb.org/] | 10592235
PharmGKB | Online database predicting clinical data by investigating effect of genetic variations in respective drug responses. [https://www.pharmgkb.org/] | 23824865
PubChem | The largest freely accessible chemistry database, with information about the chemical's name, molecular formula, structure and biological applications. [https://pubchem.ncbi.nlm.nih.gov/] | 26400175
STITCH (Search Tool for Interactions of Chemicals) | Explores and exploits drug-target relationships by integrating data of interaction of proteins and chemicals. [http://stitch.embl.de/] | 18084021
SwissProt | Manually annotated section of UniProt with details retrieved from available literature and curator-evaluated computational analysis. Acts as a protein sequence database. [https://web.expasy.org/docs/swiss-prot_guideline.html] | 10592178
UniProt (Universal Protein Resource) | Information on protein sequence and respective function applied for drug designing. [http://www.uniprot.org/] | 18045787
ZINC (Zinc Is Not Commercial) | Database of commercially available chemical compounds for virtual screening. [https://zinc.docking.org/] | 15667143
Databases for interaction and pathway analyses
CancerProView | Cancer-related Gene/Protein and Disease Pathway Database. [https://omictools.com/cancerproview-tool] | 18396036
DIP (Database of Interacting Proteins) | Catalogs protein-protein interactions and their application and roles in biological networks. [http://dip.doe-mbi.ucla.edu] | 10592249
HPRD (Human Protein Reference Database) | Pathways and protein interaction networks. [http://www.hprd.org/] | 22159132
InnateDB | Interactions and signaling pathways in the innate immune response. [http://www.innatedb.ca/] | 23180781
JenPep | Immunological protein-peptide interactions database. [http://www.jenner.ac.uk/jenpep/] | 11934742
KEGG (Kyoto Encyclopedia of Genes and Genomes) | Collection of databases of genomes, pathways, drugs and chemical structures. [https://www.genome.jp/kegg/] | 10592173
LINCS (Library of Integrated Network-based Cellular Signatures) | LINCS aims to enhance a network-based understanding of biology, cataloguing changes in gene expression, and other cellular processes in response to perturbations. [http://www.lincsproject.org/] | 24492837
OMIM (Online Mendelian Inheritance in Man) | Compendium of human genes and related genetic disorders and traits, focusing on genotype-phenotype relationship, to enhance practice of clinical genetics. [https://omim.org/] | 11752252
PIG (Pathogen Interaction Gateway) | Host-pathogen protein-protein interactions (PPIs) data. [http://molvis.vbi.vt.edu/pig/] | 18984614
Pathway Commons | Pathway Commons is a portal to access biological pathway information collected from public pathway databases. [https://www.pathwaycommons.org/] | 24492837
Reactome | Database for pathways analysis to add information to proteomics data. [www.reactome.org/] | 21067998
VirusMINT | Interactions between human and viral proteins are archived here. [http://mint.bio.uniroma2.it/virusmint/Welcome.do] | 18974184
Databases for experimental and clinical studies
caGWAS (Cancer Genome Wide Association Scan) | Integrate, query, report, and analyze significant associations between genetic variations and disease, drug response, or other clinical outcomes. [https://www.nitrc.org/projects/cagwas/] | 22986455
CancerDR | Provides information about 148 anti-cancer drugs, and their pharmacological profiling across 1000 cancer cell lines. [http://crdd.osdd.net/raghava/cancerdr/] | 18396036
canSAR v 2.0 | Brings together biological, chemical, pharmacological, and disease data, distils them and makes them accessible to cancer research scientists from all disciplines to support translational research and drug discovery. [https://cansarblack.icr.ac.uk/] | 18396036
CDISC (Clinical Data Interchange Standards Consortium) | Standards to support the use of clinical research data and metadata. [http://www.cdisc.org/] | 29888049
CGAP (Cancer Genome Anatomy Project) | Resource of gene expression profiles of normal, precancer, and cancer cells. [http://cgap.nci.nih.gov/] | 22986455
CMAP (Cancer Molecular Analysis Project) | Available for analysis of genes associated with oncogenesis, cancer profiles, clinical trials and therapies. [https://www.g6g-softwaredirectory.com/bio/cross-omics/dbs-kbs/20768-CMAP.php] | 22986455
CPT-4 (Current Procedural Terminology) | Describes medical, surgical, and diagnostic services. [https://catalog.ama-assn.org/Catalog/cpt/cpt_search.jsp] | 24761332
DICOM (Digital Imaging and Communication in Medicine) | A standard for information in medical imaging. [http://medical.nema.org/] | 9147339
eTUMOUR | This project curates a database based on transcriptomic and clinical data from brain tumor patients. [http://www.etumour.net] | 23180768
FaCD (Familial Cancer Database) | Assists genetic differential diagnosis in cancer patients. [https://www.familialcancerdatabase.nl/] | 18396036
FuGE (Functional Genomics Experiments) | Enlists standards for high-throughput biological experiments to maintain repositories and a defined set of standards. [https://omictools.com/fuge-tool] | 17921998
GLIF (GuideLine Interchange Format) | For sharing of clinical practice guidelines. [http://www.glif.org/glif_main.html] | 9670133
HL7 (Health Level Seven) | Standards for interoperability of health information technology. [http://www.hl7.org/] | 30353411
ICD (International Classification of Disease) | Classifications of diseases. [http://www.who.int/classifications/icd/en/] | 25879045
LGA | Public platform that supports research and analysis of molecular data of leukemias. [http://www.leukemia-gene-atlas.org/LGAtlas/] | 18396036
LOINC (Logical Observation Identifiers Names and Codes) | Universal codes and names to identify laboratory and other clinical observations. [http://loinc.org/] | 12651816
MammoGrid | Archives mammograms and related reports of patients. [http://www.cems.uwe.ac.uk/cccs/project.php?name.mammogrid] | 17920862
METAcancer | A consortium of metabolomics, proteomics, and transcriptomic data of breast cancer patients. [http://www.metacancer-fp7.eu] | 22546809
12 P. K. Raghav et al.
Table 1 (continued)
Databases/Webportals Description
PubMed
IDs
MIAPE
(Minimum Information About
a Proteomics Experiment
Uses modules for recording use and
interpretation of proteomics data derived from
protein dependent techniques.
[http://www.psidev.info/miape/]
18688244
MIBBI
(Minimum Information for
Biological and Biomedical
Investigations)
Web-based freely accessible tool for creating
checklists to facilitate coordination with the aim
to build an integrated checklist resource site.
[http://www.dcc.ac.uk/resources/metadata-
standards/mibbi-minimum-information-
biological-and-biomedical-investigations]
18688244
MIFlowCyt
(Minimum Information about a
Flow Cytometry Experiment)
Establishes standards for recording and
reporting information related to ow cytometer
experiments like samples, instrumentation and
data analysis.
[https://isac-net.org/page/MIFlowCyt]
18752282
MINI
(Minimum Information about a
Neuroscience Investigation)
Maintains a checklist that establishes minimum
requirements for the use of electrophysiology in
a neuroscience study.
[http://carmen.org.uk/standards]
18688244
MIAME
(Minimum Information About
a Microarray Experiment)
For the interpretation of microarray
experimental results.
[http://www.mged.org/Workgroups/MIAME/
miame.html]
19484163
MMHCC
(Mouse Models of Human
Cancer Consortium)
Resource for mouse cancer models of mouse
and associated strains that combines the basic
and translational data to derive at genetically
engineered mouse models for cancer research.
[http://emice.nci.nih.gov/]
19259381
NCCN
(Oncology Outcomes
Database)
Network-based data collection, reporting, and
analytic system to describe patterns and
outcomes of care delivered in the management
of patients with cancer.
[https://www.nccn.org/clinical_trials/
SharedResource.aspx]
18396036
NCDB A joint program of the CoC of the ACoS and the
ACS evaluate and compare cancer care
delivered to patients diagnosed and/or treated at
state, regional, and national cancer facilities.
[https://www.facs.org/quality-programs/cancer/
ncdb]
18396036
PRIDE
(PRoteomics IDEntifications
database)
An integrated database for sharing the
proteomics data generated to date
among the proteomics community.
[http://www.ebi.ac.uk/pride]
18592187
SEER
(Surveillance, Epidemiology,
and End Results Program)
Collects cancer incidence, prevalence, and
survival data, further categorized based on
epidemiological features, providing reliable
cancer statistics.
[https://seer.cancer.gov/data-software/linked_databases.html]
30647547
SEER
(Medicare Linked Database)
This database combines clinical information
from different cancer registries for health
services research.
[https://healthcaredelivery.cancer.gov/seermedicare/]
12187163
SNOMED CT
(Systematized Nomenclature
of Medicine Clinical Terms)
A comprehensive, collated clinical terminology.
[http://www.nlm.nih.gov/research/umls/Snomed/snomed_main.html]
28566995
TRANS-BIG Consortium of breast cancer patients' data.
[http://www.breastinternationalgroup.org/Research/TRANSBIG.aspx]
28451965
UMLS
(Unified Medical Language
System)
Terminology, classification and coding
standards.
[http://www.nlm.nih.gov/research/umls/]
14681409
mRNA and protein profiles are stored in the RefDIC database. Several systems
biology gene and protein expression and variation databases related to cancer are
shown in Table 1 (Expression and Variation Databases). The dbSNP (Single
Nucleotide Polymorphism Database) is primarily used to retrieve detailed genetic
sequence and SNP (Single Nucleotide Polymorphism) information.
Immune System and Personalized Medicine Databases Used
in Cancer Drug Designing
Pharmacogenomic analysis and personalized medicine build on the recognized
relationship between disease and genetic variation (Yan 2008a). This relationship
explains why different patient subgroups respond differently to vaccines or drugs
(Wang et al. 2010). Systems biology and immunoinformatic approaches support the
drug and vaccine development required for personalized medicine, treatment, and
prevention of cancer. These data are available at an Immunoinformatics Portal
(http://immune.pharmtao.com). Immune epitopes are among the most intensively
studied components of the mammalian immune system and play an essential role in
designing different cancer vaccines (Gao et al. 2019). Epitopes are the binding
regions of antigens that interact with their respective receptors (Yan 2008b). This
epitope interaction triggers immune responses in the host immune cells. The immune
epitope databases have contributed to vaccine development, targeted drug design,
and interaction analyses (Table 1, Immune System databases). Epitope analysis can
be performed using the IEDB (Immune Epitope Database and Analysis Resource).
IMGT (International ImMunoGeneTics) is a complete resource database with
information about proteins of the human immune system like MHC (Major
Histocompatibility Complex), MhcSF
Table 2 Systems biology tools used in cancer studies
Tools/Software Description PUBMED IDs
Expression and variation tools
ArrayMiner Set of analysis tools for microarray data.
Windows and Mac based commercial
software.
[https://arraymining.net/]
19863798
ArrayWiki A common sharing platform for collection
and storage of microarray experimental data
and meta-analysis results.
[http://www.bio-miblab.org/arraywiki]
18541053
caArray Microarray data management system by NCI
for analyzing and visualizing gene
expression data.
[https://array.nci.nih.gov/caarray/home.
action]
19208739
Camelot
(CAusal Modeling with
Expression Linkage for
cOmplex Traits)
Outputs a linear regression model that uses
genotype and expression to predict
phenotype; powered by regularized linear
regression.
19888205
CARMAweb
(Comprehensive R-based
Microarray Analysis web
service)
Web-based tool for microarray data analysis
by performing data processing, cluster
analysis, and gene ontology term analysis.
[https://www.genepattern.org/]
16845058
Cluster For clustering, SOM of microarray data.
Windows based commercial software.
[http://bonsai.hgc.jp/~mdehoon/software/
cluster/]
31115888
CNAmet Identification of genes that show
simultaneous methylation, copy number,
and expression alterations.
[omictools.com/cnamet-tool]
22986455
Consensus clustering Starting from multiple clusterings (each can
represent a data type), obtaining a single
integrated cluster assignment.
[http://code.google.com/p/consensus-cluster]
20141333
dChip Widely used for the analysis of Affymetrix
gene chip data.
[http://www.dchip.org/]
18528524
ECR Browser
(Evolutionary Conserved
Regions)
Publicly available resource for regulatory
genome data mining for sequence alignment
comparison.
[https://ecrbrowser.dcode.org/]
15215395
Elastic Net Regularized regression method to improve
the overall prediction accuracy, using all
data as covariates.
[https://github.com/kiwtir/RWEN]
29688307
Entrez SNP Search tool to look for SNP mutations in
dbSNP.
[https://www.ncbi.nlm.nih.gov/snp/]
23241512
Expression Profiler Analysis and clustering of gene expression
data.
Web based commercial software.
[https://omictools.com/expression-profiler-tool]
15215431
GEI
(Genes and the Environment
Initiative, Exposure
Biology Program)
A tool that links the genetic and
environmental aspects of any tumor type,
identifying the environmental exposures and
lifestyle factors that make a certain
population more susceptible to
carcinogenesis.
[www.gei.nih.gov/exposurebiology/]
21308768
GenePattern Open source software package that provides
tools in the form of modules for genomic
data analysis.
[https://www.genepattern.org/]
16642009
GeneTrailExpress Analyzes microarray data through standard
normalization procedures and statistical
analysis.
[http://genetrail.bioinf.uni-sb.de]
19099609
GEPAS
(Gene Expression Profile
Analysis Suite)
Web-based microarray data analysis tool
with algorithms for gene selection, class
prediction and functional profiling.
[http://www.gepas.org]
18508806
ILOOP To analyze two-channel microarray data in
different clinical settings.
[http://mcbc.usm.edu/iloop]
18831776
IntegrOmics Identification of relationships between two
omics data sets.
[https://omictools.com/integromics-tool]
22986455
iPAC Integration of copy number and gene
expression to detect genes and associated
pathways or processes that are influenced in
trans by copy number.
[http://bioconductor.org/packages/release/
bioc/html/iPAC.html]
22986455
KOBAS
(KEGG Orthology-Based
Annotation System)
Pathway and disease annotation of gene sets.
[http://kobas.cbi.pku.edu.cn/]
22986455
Lasso Identification of omics features with
predictive ability for a given response (such
as survival), using all data as covariates or
using some data to decide the penalty of
others.
[https://github.com/HaohanWang/thePrecisionLasso]
22986455
Lol
(Lots of Lasso)
Integration of copy number and gene
expression to detect in-cis and in-trans
regulation of gene expression.
[https://rdrr.io/bioc/lol/]
22986455
MAGMA A tool for gene analysis and generalized
gene-set analysis of genome-wide
association study data.
[http://ctglab.nl/software/magma]
25885710
MAPPFinder Tool for gene ontology term annotation of
differentially expressed genes.
[http://www.genmapp.org/]
22986455
MCD
(Multiple Concerted
Disruption analysis)
Identification of subsets of genes that are
affected on multiple levels by some
condition.
20478067
MGSA
(Model-based Gene Set
Analysis)
Identification of active gene sets and their
analysis.
[https://www.bioconductor.org/packages/
release/bioc/html/mgsa.html]
22986455
Microarray Retriever Web-based tool for screening of publicly
available microarray data from GEO and
ArrayExpress.
[http://www.lgtc.nl/MaRe/]
18463138
Netwalker Netwalker is a platform to assist in
functional analyses of large-scale genomics
datasets focused on molecular networks.
[http://bioinfo.vanderbilt.edu/netwalker/]
24492837
omniBioMarker Biomarker detection tool that scans through
NCI Cancer Gene Index through particular
algorithms to improve microarray-based
clinical prediction performance.
[http://omnibiomarker.bme.gatech.edu/]
22893372
PatholOgist A consistency score and an activity score are
calculated for each pathway.
[https://thepathologist.com/inside-the-lab/
bioinformatics]
22986455
PLRS (Piecewise Linear
Regression Splines)
Studying relationships between copy
number and mRNA expression; detection of
copy number-induced sample subgroup-
specific effects.
[http://bioconductor.org/packages/release/
bioc/html/plrs.html]
22986455
RMA Express
(Robust Multichip Average)
Online tool to compute gene expression
summary values for qualitative assessment
using probe-level metrics.
[http://rmaexpress.bmbolstad.com/]
12925520,
19145252
ScanAlyze Processes fluorescent images of microarrays.
Windows based commercial software.
[https://omictools.com/scanalyze-tool]
24298393
SPIA
(Signaling Pathway Impact
Analysis)
Pathway annotation of differentially
expressed genes.
[http://bioconductor.org/packages/release/
bioc/html/SPIA.html]
22986455
SubpathwayMiner Pathway annotation of gene sets.
[https://omictools.com/subpathwayminer-
tool]
22986455
Taverna Combines web services and local tools into
workflow pipelines for high-throughput
omics analyses.
[http://www.taverna.org.uk]
23640334
TreeView Graphically browses and analyzes results of
clustering. Windows based commercial
software.
[https://visualcomposer.com/help/interface/
tree-view/]
18792942
Tumorscape Provides copy number alterations across
multiple cancer types.
[http://portals.broadinstitute.org/tcga/home]
22986455
VIDA
(Visualization of DAta)
Homologous protein families from virus
genomes.
[http://www.biochem.ucl.ac.uk/bsm/virus_
database/VIDA.html]
11125070
Immunoinformatic tools
BLAST
(Basic Local Alignment
Search Tool)
Most widely used web-based sequence
similarity search tool used for comparing
nucleotide and protein queries with their
respective databases, examining multiple
parameters.
[https://blast.ncbi.nlm.nih.gov/Blast.cgi]
18440982
CLUSTAL W Sequence alignment tool for similarities and
differences.
[http://www.ebi.ac.uk/clustalw/]
7984417
CTLPred Tool for CTL epitopes prediction, for
vaccine designing.
[http://crdd.osdd.net/raghava/ctlpred/]
30406342
Cytoscape Cytoscape is an open source platform for
visualizing complex networks and
integrating the networks with other data
types.
[https://cytoscape.org/]
24492837
IntBioSim Integrative multiscale projects relevant to
produce biomolecular simulations.
[http://intbiosim.org/]
16766357
Motif Scan Helps in finding motifs in a sequence.
[http://myhits.isb-sib.ch/cgi-bin/motif_scan]
19351663
Physiome Project Prediction of physiological and functional
dynamics of biomolecules.
[http://physiomeproject.org//]
12539957
PredictProtein Protein secondary structure prediction.
[http://www.predictprotein.org/]
24799431
SiPep Prediction of tissue-specific minor
histocompatibility antigens.
[http://www.sipep.org/]
19513250
SNeP Prediction of SNP-derived epitopes.
[http://elchtools.de/SNEP/]
25852748
Biomolecular Network Tools
ARACNe
(Algorithm for the
Reconstruction of Accurate
Cellular Networks)
Uses microarray expression profiles to
construe functional mechanisms of cellular
processes in mammalian cells by analyzing
transcriptional interactions.
[http://califano.c2b2.columbia.edu/aracne]
16723010
ClueGO It is a Cytoscape app that improves
biological interpretation by integrating GO
and KEGG data to create functionally
grouped terms with similar associated genes
to reduce redundancy.
[http://apps.cytoscape.org/apps/cluego]
19237447
DAVID
(Database for Annotation,
Visualization and Integrated
Discovery)
Tool used for functional interpretation of
genes enlisted and archived on the basis of
genomic studies.
[https://david.ncifcrf.gov/]
17576678
DREAM
(Dialogue for Reverse
Engineering Assessments and
Methods)
DREAM aims to be a catalyzer for the
interaction between experiment and theory
focused on cellular network inference and
quantitative model building.
[http://dreamchallenges.org/]
24492837
Metacore For functional analysis of expression and
genetic variation data.
[https://portal.genego.com/]
29163640
PSORT Prediction tool for protein subcellular
localization sites.
[https://psort.hgc.jp/]
10087920
STRING String is a tool for predicting physical and
functional protein interactions.
[https://string-db.org/]
24492837
Text mining tools
Anni 2.0 Medline interface with retrieved information
of genes, drugs, and diseases to conduct
more efficient oncological research.
[http://biosemantics.org/anni/]
18549479
CAESAR
(CAndidatE Search And
Rank)
Annotates human genes as tumor associated
candidates and ranks them based on a score
system.
[http://visionlab.bio.unc.edu/caesar]
20074336
CARGO
(Cancer And Related Genes
Online)
Web-based visualization tool for integrating
customized biological information, designed
to aid researchers with little or no
bioinformatics background.
[http://cargo-dev.bioinfo.cnio.es/]
17483515
CGMIM Text mining of OMIM to detect genetically
associated cancer and the related genes.
[http://www.bccrc.ca/ccr/CGMIM]
15796777
Coag-MDB
(Coagulation Serine Protease
Mutation Database)
Provides data on point mutations with
structural analyses of coagulation proteases.
[http://www.coagmdb.org/]
18058827
COBRA Text mining tool to construct biochemical
networks and enforce necessary QC
measures.
[http://opencobra.sourceforge.net]
21596791
CoPub Text mining system that uses Medline
abstracts to narrow down keyword
co-occurrences.
[http://www.copub.org]
21622961
ENDEAVOUR Tool for prioritization of candidate genes for
more complex studies by input of already
known genes and integrating the
genomic data.
[https://endeavour.esat.kuleuven.be/]
18508807
FACTA+
(Finding Associated Concepts
with Text Analysis)
Text search engine like PubMed for
visualizing associations between genes,
diseases and chemical compounds.
[http://www.nactem.ac.uk/facta/]
21685059
G2D A tool for prioritizing genes based on their link to a
particular tumor type. Does so by associating
various databases like MEDLINE, STRING,
GO and RefSeq.
[https://omictools.com/g2d-tool]
25392685
GAPscreener An automatic SVM tool for screening the
human genetic associated literature in
PubMed.
[http://www.hugenavigator.net/download/
GAPscreener_src.zip]
18430222
HuGE Navigator A platform for integrated genetic
data extracted from PubMed using text
mining algorithms.
[https://phgkb.cdc.gov/PHGKB/hNHome.action]
23999671
MarkerInfoFinder For direct retrieval of publications related to
a particular genetic marker or set of markers, making
it a more efficient tool for identifying genetic
disorders. Can help in delineating genetic
causes and mechanisms underlying
tissue-specific tumors.
[http://brainarray.mbni.med.umich.edu/
brainarray/datamining/MarkerInfoFinder]
17823133
MedlineR Open source library in R language for
Medline biomedical literature data mining
for more relevant data scanning.
[http://dbsr.duke.edu/pub/MedlineR]
15284107
MeInfoText Database for information on gene
methylation pattern and other epigenetic
modifications associated with a tumor type.
[http://bws.iis.sinica.edu.tw:8081/
MeInfoText2/]
18194557
NetCutter Co-occurrence analysis tool to identify
coordinately deregulated set of genes in a
particular cancer type.
[http://bio.ifom-ieo-campus.it/NetCutter/]
18781200
OSIRISv1.2 Upgraded version of OSIRIS; it is a Named
Entity Recognition System to fish out allelic
variants of genes from MEDLINE literature
to relate SNPs with tissue malignancies.
[http://ibi.imim.es/OSIRISv1.2.html]
18251998
PolySearch Web-based text mining system for
correlating cancer types, associated genetic
mutations, proteins and metabolites that
helps in decoding underlying mechanisms.
[http://polysearch.cs.ualberta.ca]
18487273
Mathematical modeling and simulation tools
BioNetGen Generates physicochemical models of
biological systems, where cellular signaling
from user-specified rules for biomolecular
interactions at the protein domain level is
integrated with tools for reaction network
simulation and analysis.
[http://cellsignaling.lanl.gov/bionetgen/]
27402907
Bio-SPICE Cellular modelling and simulation of
pathways and interaction-network tools for
data analysis.
[https://biospice.org/index.php]
14683613
CellDesigner Gene regulatory network modelling,
supported by SBML format.
[http://celldesigner.org/index.html]
24927840
Cellerator For simulation and analysis of signal
transduction networks in cells and
multicellular tissues.
[http://www-aig.jpl.nasa.gov/public/mls/
cellerator/]
12651737
CoExMiner Used for modeling gene co-expression
patterns from transcriptional data.
19381544
COPASI Simulation of network pathways and
metabolomic analysis.
[http://www.copasi.org/tiki-index.php]
28655634
Dizzy Kinetic modelling of integrated large-scale
genetic, metabolic, and signaling networks.
[http://magnet.systemsbiology.net/software/
Dizzy]
15852513
Grid Cellware Grid-based modeling and simulation tool for
biochemical pathways.
[http://www.bii.a-star.edu.sg/research/sbg/
cellware]
30945248
MCell Monte Carlo simulator for investigating
cellular physiology, particularly ligand
receptor binding, its dynamics and chemical
reactions involved.
[http://www.mcell.cnl.salk.edu and www.
mcell.psc.edu]
30945248
MesoRD Open source C++ software for 3D
simulation of kinetic reactions based on
SBML format.
[http://mesord.sourceforge.net]
15817692
PathwayPro Provides quantitative assessment of gene
expression profiles of ligands and their
respective receptors.
[https://www.accessionhealth.com/digital-
disruption/pathwaypro/]
19381544
SBTOOLBOX Matlab toolbox for prototyping new
algorithms, and building applications for the
analysis and simulation of biological
systems.
[http://www.sbtoolbox.org/]
19381542
SIGTRAN Simulation platform for large-scale reaction
networks with Java swing graphical user
interface and SBML file support.
[https://datatracker.ietf.org/wg/sigtran/
about/]
19381542
SimBiology Matlab toolkit for modeling, simulating, and
analyzing biochemical pathways.
[http://www.mathworks.com/products/
simbiology/description4.html]
19381542
SmartCell C++ platform for the modeling and
simulation of diffusion-reaction networks in
different subcellular compartments.
[http://smartcell.embl.de/introduction.html]
19381542
StochKit C++ simulation tool for intracellular
biochemical processes.
[http://www.engineering.ucsb.edu/~cse/
StochKit/StochKit.html]
19381542
Virtual Cell Model creation and simulation of cell
biological processes, integrating
biochemical and electrophysiological data
for individual reactions with experimental
microscopic image data describing
subcellular locations.
[http://www.vcell.org/]
19381542
Clinical applications of tools
AfCS
(Alliance for Cellular
Signaling)
Reconstructs signaling networks by
combining the experimental data with
genetic and protein annotations, that
ultimately leads to quantitative
understanding of cellular responses and
translational applications.
[http://www.signaling-gateway.org/aboutus/
afcs.html]
12539952
caBIG®
(Cancer Biomedical
Informatics Grid)
An initiative to develop a system of
networks to support translational research in
the field of cancer. It allows extraction of
structured data from pathology reports.
[cabig.nci.nih.gov]
17911733
caMATCH This subsystem of caBIG is used in
recruitment of patients for clinical trials.
[cabig.nci.nih.gov/tools/caMATCH]
20859412
caTIES
(Text Information Extraction
System)
It is an extraction system for generating
clinical information based on surgical
pathology reports. It is a tool used under
caBIG.
[cabig.nci.nih.gov/tools/caties]
19108734
CERTS
(Centers for Education in
Research and Therapeutics)
The most widely used online platform
operated as a collaborative effort of CERTs
is the Clinician-Consumer Health Advisory
Information Network (CHAIN), which is
primarily a resource dissemination program
that provides practical information to health
care professionals for better patient
handling.
[www.certs.hhs.gov/]
17632533
CRN
(Cancer Research Network)
It is a consortium of 14 different research
groups connected by their respective health
care delivery sites with information on a
wide variety of topics like cancer prevention,
detection and treatment.
[crn.cancer.gov]
30972356
CTSAs
(Clinical and Translational
Sciences Awards Programs)
This program has been developed to enhance
the clinical application of collected research
data, to improve the health of the
public.
[ctsaweb.org]
21896519
CVRN
(Cardiovascular Research
Network)
Standardizes data elements like
demographics, diagnosis and pathology
reports in a virtual data warehouse in
conjunction with NCI's caBIG.
[https://cvrn.org/]
18793105
Medi-Class Decodes both free-text and coded records to
detect clinical events in EMRs to optimize
both record keeping and clinical practice.
[http://www.mediclass.ro/]
15905485
PROMIS
(Patient-Reported Outcomes
Measurement Information System)
Patient-centered tool for evaluating physical,
mental, and social health. Primarily used to
assess symptom burden, or impact of the
disease on quality of life through clinical
information.
[outcomes.cancer.gov/tools/promis]
30426667
STKE
(Signal Transduction
Knowledge Environment)
Includes tools to collate information in
interdisciplinary fields of signal transduction
with a more patient-specific approach.
[https://libraries.usc.edu/databases/signal-transduction-knowledge-environment-stke]
12438188
(MHC superfamily), Ig (immunoglobulins), and IgSF (immunoglobulin superfamily).
The IPD (Immuno Polymorphism Database) provides an integrated system to
study gene polymorphism of the immune system and is further divided into
several databases. The IPD-IMGT/HLA (Human Leucocyte Antigen) database
incorporates sequences of the human MHC, which are named by the WHO
Nomenclature Committee for Factors of the HLA System. Another database, IPD-KIR,
contains allelic sequences of human killer-cell immunoglobulin-like receptors (KIR),
while the IPD-MHC database holds MHC information. The dbMHC is an MHC
database that provides an MHC microsatellite sub-database and interactive alignment
visualization for HLA and related genes. The IPD-HPA database includes human
platelet antigen (HPA) information, and data on immunologically characterized
tumor cells are held in the IPD-ESTDAB database. Further, the Sanger MHC
Haplotype Project comprises data related to MHC-linked diseases. HapMap is a genomic
variation database that provides information on individual genotype data.
Fig. 1 Common systems biology databases and tools used in cancer studies
Moreover, an enormous amount of biological data is required to construct
drug-target networks, and these data are available in public chemical databases.
Databases such as PubChem, MOAD (Mother of All Databases), PDB (Protein Data
Bank), ZINC (Zinc Is Not Commercial), and ChEMBL are valuable resources in
drug discovery (Table 1, Drug designing databases). These databases
can be accessed using STITCH (Search Tool for InTeractions of CHemicals), which
integrates relevant information from crystal structures, metabolic pathways,
drug-target relationships, and binding experiments. STITCH 2 additionally incorporates
BindingDB, PharmGKB, comparative toxicogenomics, similarity prediction
based on text mining, and a chemical structure database containing 74,000 compounds
and 2200 drugs. Notably, three-dimensional (3D) structure databases such as
PDB and MMDB (Molecular Modeling Database) hold protein 3D structures of
antibodies, HLA, and TCRs (T Cell Receptors), which are required to develop protein
interaction networks (Wang et al. 2010; Yan 2008b).
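As a minimal sketch of what constructing a drug-target network involves, the Python example below builds a two-way lookup from (drug, target) pairs. The four pairs are well-known kinase inhibitor examples used purely for illustration; they are not taken from this chapter's tables, and the function names `build_network` and `drugs_sharing_target` are hypothetical. A real analysis would load such pairs from an export of a resource like ChEMBL or STITCH.

```python
from collections import defaultdict

# Illustrative (drug, target gene) pairs only; not drawn from this chapter.
INTERACTIONS = [
    ("imatinib", "ABL1"),
    ("imatinib", "KIT"),
    ("gefitinib", "EGFR"),
    ("erlotinib", "EGFR"),
]


def build_network(pairs):
    """Build a bipartite drug-target network as two adjacency maps."""
    drug_to_targets = defaultdict(set)
    target_to_drugs = defaultdict(set)
    for drug, target in pairs:
        drug_to_targets[drug].add(target)
        target_to_drugs[target].add(drug)
    return drug_to_targets, target_to_drugs


def drugs_sharing_target(target_to_drugs):
    """Return {target: sorted drugs} for targets hit by more than one drug."""
    return {t: sorted(d) for t, d in target_to_drugs.items() if len(d) > 1}


drug_to_targets, target_to_drugs = build_network(INTERACTIONS)
shared = drugs_sharing_target(target_to_drugs)
```

Keeping both directions of the mapping makes the two typical questions cheap to answer: which targets a drug hits, and which drugs converge on one target.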
Databases Used for Interaction and Pathway Analyses
The databases for interaction and pathway analyses are specified in Table 1
(Databases for Interaction and Pathways Analyses). The signaling pathways and
molecular interactions associated with the immune system are provided by InnateDB, PIG
(Pathogen Interaction Gateway), and JenPep. Another database, VirusMINT,
contains interaction data between human and viral proteins. Widely used databases
such as Reactome, HPRD (Human Protein Reference Database), and DIP (Database of
Interacting Proteins) are useful for predicting gene networks, protein-protein
interactions, and pathways. The linkage between sequence mutation and function is
reported in the OMIM (Online Mendelian Inheritance in Man) database. The
miRNAs (miRNA-21, miRNA-125b, miRNA-221, and miRNA-326) play a significant
role in cancer regression by modulating the apoptosis pathway (Hur 2015;
Singh and Mo 2013). The miRBase database contains sequences, nomenclature,
annotation, and target predictions for specific miRNAs. Epigenetic mechanistic
studies involving DNA methylation and chromatin modification help to understand
cancer progression and potential cancer therapies (Tomasi et al. 2006). The
epigenetic database MethDB stores methylation patterns and profiles. The central
database, KEGG (Kyoto Encyclopedia of Genes and Genomes,
http://www.genome.jp/kegg/), is used for studying the biological functions of genes or
proteins. KEGG comprises the following sections: KEGG PATHWAY, KEGG BRITE,
KEGG GENES, KEGG COMPOUND, KEGG GLYCAN, KEGG REACTION,
KEGG ENZYME, KEGG NETWORK, KEGG DISEASE, and KEGG DRUG.
The KEGG PATHWAY database is a collection of pathway maps of
molecular reactions and interactions for metabolism (including nucleotide
metabolism), signal transduction, and other cellular processes. KEGG BRITE provides
more extensive interaction pathways with diverse types of relationships. The gene
catalog database, KEGG GENES, retrieves genomic sequences from NCBI RefSeq.
Structures of chemical compounds can be retrieved from KEGG
COMPOUND. KEGG GLYCAN is a repository of glycan structures. KEGG
REACTION is used to find the formulas of chemical reactions, while the
nomenclature of enzymes is contained in KEGG ENZYME. KEGG NETWORK illustrates the
networks of gene variations, KEGG DISEASE is used to visualize the details of the
molecular network between disease types and therapeutic drugs, and KEGG DRUG
covers the chemical structures, target molecules, and therapeutic categories of drugs.
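KEGG sections such as KEGG PATHWAY can also be read programmatically. The sketch below assumes the KEGG REST service at `https://rest.kegg.jp` and its tab-separated `list` output; the URL pattern, the helper names (`parse_kegg_list`, `fetch_pathways`, `find_pathways`), and the two sample lines are assumptions for illustration and should be checked against the current KEGG API documentation and usage terms.

```python
from urllib.request import urlopen

# Assumed endpoint of the KEGG REST service; verify before relying on it.
KEGG_LIST_URL = "https://rest.kegg.jp/list/pathway/{organism}"


def parse_kegg_list(text):
    """Turn 'identifier<TAB>description' lines into a dict."""
    entries = {}
    for line in text.splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        identifier, description = line.split("\t", 1)
        # Some responses prefix identifiers with 'path:'; drop it if present.
        identifier = identifier.split(":", 1)[-1].strip()
        entries[identifier] = description.strip()
    return entries


def fetch_pathways(organism="hsa"):
    """Download and parse the pathway list for one organism (needs network)."""
    url = KEGG_LIST_URL.format(organism=organism)
    with urlopen(url, timeout=30) as response:
        return parse_kegg_list(response.read().decode("utf-8"))


def find_pathways(entries, keyword):
    """Return sorted identifiers whose description mentions the keyword."""
    keyword = keyword.lower()
    return sorted(k for k, v in entries.items() if keyword in v.lower())


# Offline sample in the assumed format, so the parser can be tried without network.
SAMPLE = (
    "hsa04210\tApoptosis - Homo sapiens (human)\n"
    "path:hsa05200\tPathways in cancer - Homo sapiens (human)\n"
)
sample_entries = parse_kegg_list(SAMPLE)
cancer_hits = find_pathways(sample_entries, "cancer")
```

Calling `fetch_pathways("hsa")` would replace the offline sample with the live human pathway list, provided the endpoint behaves as assumed.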
Tumor Databases Used for Drug Designing in Experimental
and Clinical Studies
Multiple-target approaches are used to design drugs against cancer and to identify the
interactions between genes and the molecules that target a particular pathway and
selectively eradicate cancer (Sharom et al. 2004). Moreover, reporting standards
and repositories such as MIBBI (Minimum Information for Biological and Biomedical
Investigations), MINI (Minimum Information about a Neuroscience Investigation),
MIFlowCyt (Minimum Information about a Flow Cytometry Experiment),
MIAPE (Minimum Information About a Proteomics Experiment), and MIAME
(Minimum Information About a Microarray Experiment) are essential for researchers
to retrieve experimental data for analysis. The FuGE (Functional Genomics
Experiments) repository provides a general description of experimental conditions, stored
as the FuGE Object Model (FuGE-OM) or FuGE Markup Language (FuGE-ML).
Proteomic data, in contrast, are stored in a modified database called PRIDE.
Systems biology plays an essential role in identifying crosstalk molecules between
pathways and the transient behavior of tumor clinical data. The multicenter project
IOTA delivers information on characterized ovarian tumors. Also, the
multidisciplinary eTUMOUR project includes clinical omics data used to develop
brain cancer diagnostic tools. The resulting model was then incorporated into
a GUI (Graphical User Interface) to make it user friendly for clinicians.
Additionally, breast cancer consortia such as METAcancer, TRANS-BIG, and
MammoGrid provide solutions to improve breast cancer prognosis and diagnosis.
Besides, the TME.db (Tumor MicroEnvironment database) includes processed
clinical data and R-based statistical tests and approaches for survival analysis.
Systems Biology Approaches and Tools to Cancer
Expression and Variation-Based Systems Biology Tools
and Approaches for Cancer Prediction
These databases support the meta-analysis (aggregation) of comparable data, an essential
step of data integration. This approach expands the sample size and consequently
increases the statistical power of expression profiling analyses
(Mathew et al. 2007). Data integration raises several issues to be considered
during data analysis:
1. Quantile normalization, applied to raw data from any platform (Affymetrix,
Agilent, and Illumina), reduces the batch-specific effects arising from the study
and platform (Orlov et al. 2007).
2. Identification of differentially expressed genes and correction for multiple
hypothesis testing (Motakis et al. 2009).
3. Accurate annotation of probe-sets, transcripts, or genes is vital to compare the
expression levels of transcript isoforms.
4. Consistency should be followed for the clinical sample description and sample
nomenclature in all studies.
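The first two points of this list, and the p-value aggregation that meta-analysis relies on, can be sketched in plain Python. This is an illustration under stated assumptions, not the procedure of the cited studies: the Benjamini-Hochberg adjustment is one common choice for multiple-testing correction, Fisher's method is one common way to combine independent p-values, ties in quantile normalization are broken by position, and all function names are hypothetical.

```python
import math


def quantile_normalize(samples):
    """Map every sample onto a shared distribution (mean of sorted values).

    samples: list of equal-length lists, one list of expression values per array.
    """
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: average across samples at each rank.
    rank_means = [sum(s[i] for s in sorted_samples) / len(samples) for i in range(n)]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # indices, lowest value first
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def fisher_combine(pvalues):
    """Combine independent p-values with Fisher's method.

    X = -2 * sum(ln p) follows a chi-square distribution with 2k degrees of
    freedom, whose upper tail has a closed form for even degrees of freedom.
    """
    k = len(pvalues)
    half_x = -sum(math.log(p) for p in pvalues)
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


# Toy values, invented for illustration only.
arrays = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
combined = fisher_combine([0.05, 0.05])
```

After normalization both toy arrays contain the same set of values, which is the defining property of the method; in practice an established implementation from a statistics package would normally be preferred over hand-written versions.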
Furthermore, microarray data analysis can be performed based on two different
approaches. Either the separate microarray experiments are pooled to form one
dataset, or each microarray experiment is analyzed individually, followed by
statistical analysis of all experiments, with the results then ranked according to
the rank aggregation approach (Pihur et al. 2008). Thus, conventional statistical
methods for meta-analysis or for combining p-values can be applied to microarray
data (Whitehead and Whitehead 1991). However, in the case of meta-regression or
stratified groups, a correlation method is used (Mantel and Haenszel 1959), and the
latent variable approach is an advancement of this method (Mac et al. 2010; Choi
et al. 2007). Meta-analysis primarily identifies cancer biomarkers and prognostic
signatures (Pihur et al. 2008; Rhodes et al. 2004). In addition to gene expression
analysis, meta-analyses are also applied to genomics, genetics, and GWAS data in
cancer (Guerra and Goldstein 2016). Gene expression microarray data at the clinical
level has to be analyzed based on either unusual expression levels in one sample,
varied expression across all the samples, samples with similar expression patterns,
or similar expression patterns over all the samples (Kostka and Spang 2004; Affara
2003). The fundamental evolutionary issues and the influence of the amount and
quality of available data are addressed through comparative genome analysis using
the ECR (Evolutionary Conserved Regions) Browser. Resources and tools for gene
expression and the respective functional analysis are listed in Table 2. DNA
microarrays can be used for mRNA expression assessment, whereas protein expression
is assessed using the 2D-DIGE and MudPIT techniques. Two tools for two-channel
microarrays, ILOOP (Interwoven Loop) and MAGMA, are used to analyze differentially
expressed genes under various conditions. Another tool used to perform microarray
analysis is GEPAS (Gene Expression Profile Analysis Suite), which includes feature
selection, data normalization, unsupervised clustering, and class prediction. Also,
a newer tool for microarray data analysis, CARMAweb (Comprehensive R-based
Microarray Analysis web service), is a Bioconductor-based module that can be
accessed through the R programming language. This package includes background
correction, normalization, differential gene detection, quality control,
visualization, clustering, and dimensionality reduction. GenePattern and caArray
are analysis tools that collect differential gene expression data to integrate
into the caBIG database (cancer Biomedical Informatics Grid,
https://cabig.nci.nih.gov/). Statistical analysis of gene function is performed on
data extracted from the GO database using several available tools or downloadable
packages, like GoMiner, AmiGO, GOStat,
GOEAST, and BiNGO. Furthermore, the CoPub tool mines the literature and visualizes
the links between the searched genes and specific keywords derived from Medline
abstracts (http://services.nbic.nl/cgi-bin/copub/CoPub.pl). Some databases, such as
GEO and ArrayExpress, follow the community data standards established by MIAME,
and community users are allowed to annotate gene expression data using ArrayWiki.
Correspondingly, the Microarray Retriever tool retrieves gene expression data from
ArrayExpress in addition to GEO to increase the sample size in microarray-based
studies. GeneTrailExpress, a web-based application module, is used for
implementation, normalization, interpretation, visualization, and statistical
analysis based on standard methods. Taverna, in turn, is used to build workflows
for web services like caBIG. A web resource, omniBioMarker
(http://omnibiomarker.bme.gatech.edu/), identifies biomarkers based on quality
control and normalization, biological interpretation, feature selection, clinical
prediction, and validation. Several existing applications, such as RMA Express
(Robust Multichip Average), dChip, and caCORRECT, assess microarray data quality
and normalize gene expression. In addition, some efficient commercial software and
source codes are available, such as ScanAlyze, Cluster, and TreeView (Table 2,
Expression and Variation Tool) (Kostka and Spang 2004; Affara
2003).
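To make the idea of combining p-values across independent microarray studies concrete, the following minimal Python sketch applies Fisher's method. This is one conventional choice and is not necessarily the method used by the works cited above. The gene names and p-values are hypothetical, and the studies are assumed to be independent.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values for one gene using Fisher's method."""
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("each p-value must lie in (0, 1]")
    k = len(p_values)
    # Fisher's statistic is X = -2 * sum(ln p_i); we keep X / 2 for convenience.
    half_x = -sum(math.log(p) for p in p_values)
    # Under the null hypothesis X is chi-square distributed with 2k degrees of
    # freedom, whose upper tail has the closed form
    # exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


# Hypothetical per-study p-values for two genes across three experiments.
study_p_values = {
    "GENE_A": [0.01, 0.04, 0.03],
    "GENE_B": [0.60, 0.45, 0.80],
}
combined = {gene: fisher_combined_p(ps) for gene, ps in study_p_values.items()}
ranked = sorted(combined, key=combined.get)  # most significant gene first
```

Sorting genes by their combined p-value gives a simple cross-study ranking. In practice, a multiple-testing correction would still be needed before declaring genes significant.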
Common Immunoinformatic and Bioinformatics Tools to Cancer
Drug Discovery
Several immune epitope prediction tools are listed in Table 2 (Immunoinformatic
Tools). CTLPred predicts cytotoxic T lymphocyte epitopes and is an essential
tool for designing vaccines. The SNP-derived potential T-cell epitopes for mHAgs
(minor histocompatibility antigens) can be predicted using the SNeP tool. Similarly,
SiPep is an mHAg prediction tool used to predict the chances of rejection of skin
graft transplants and tumors from MHC-identical donors, since mHAgs are antigens
that, in addition to the MHC, determine the functional feasibility of
transplantation. Besides, some commonly used bioinformatics tools, such as BLAST
(Basic Local Alignment Search Tool), Motif Scan, and CLUSTAL W, are required
for comparing genetic sequences, building phylogenetic trees, inferring evolutionary
relationships, and analyzing sequence patterns of immune molecules (Table 2,
Expression and Variation Tools) (Affara 2003).
Biomolecular Networks Tools in Cancer
Systems approaches have proved to be of great advantage for cancer studies,
including CSC (Cancer Stem Cell) diagnosis and characterization (Al-Hajj et al.
2003; Ricci-Vitiani et al. 2007; Singh et al. 2004). A systems approach to cancer
identifies crosstalk molecules through global analyses and biological networks
(Alberghina et al. 2004; Stilwell et al. 2007). The most widely used systems
approach is to identify cancer-related gene interaction networks, protein-DNA, and
protein-protein interaction networks based on their differential gene expression.
These interaction networks are used to visualize large data sets, though they cannot
be used to construct drug development and predictive medicine network models
(Price et al. 2008). Biomolecular networks integrate clinical data to turn
high-throughput genomics information into a more comprehensive understanding of
personalized medicine for the respective disease (Baudot et al. 2009). Network
modeling approaches have proved useful for recognizing cancer (Wong et al. 2008;
Kreeger and Lauffenburger 2009). Network modeling constructs the gene co-expression
network, which represents a map of significant gene correlations based on
expression profiles at a specific cut-off for tumor samples. These co-expression
connections can be weighted with a sigmoid function. The connections between
neighboring genes can be measured effectively by hierarchical clustering in cancer.
Furthermore, ARACNe (Algorithm for Reconstruction of Accurate Cellular Networks)
and Relevance Networks are two approaches that measure correlation among genes.
ClueGO and DAVID build gene networks based on gene ontology terms using the
kappa score. STRING predicts protein or gene relationships. The tools widely
used for multiple pathways and interaction analyses in cancer-immune responses
are given in Table 2 (Biomolecular Network Tools). The PSORT tool predicts protein
localization sites within cells. A popular software, Cytoscape, has been used to
model and visualize cancer biomolecular interactions, with options for integration
with other data. In addition to Cytoscape, MetaCore is used to visualize biological
systems and to model complex interactions.
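The construction of a gene co-expression network described above (correlating expression profiles, applying a cut-off, and weighting connections with a sigmoid function) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of any specific tool named in this section. The gene names, expression values, cut-off of 0.8, and the sigmoid parameters alpha and tau are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def sigmoid_weight(r, alpha=10.0, tau=0.5):
    """Map |r| to a soft edge weight in (0, 1); alpha and tau are illustrative."""
    return 1.0 / (1.0 + math.exp(-alpha * (abs(r) - tau)))


def coexpression_network(profiles, cutoff=0.8):
    """Return {(gene_i, gene_j): weight} for gene pairs with |r| >= cutoff."""
    genes = sorted(profiles)
    edges = {}
    for i, gene_i in enumerate(genes):
        for gene_j in genes[i + 1:]:
            r = pearson(profiles[gene_i], profiles[gene_j])
            if abs(r) >= cutoff:
                edges[(gene_i, gene_j)] = sigmoid_weight(r)
    return edges


# Hypothetical expression profiles of three genes across five tumor samples.
profiles = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0, 10.5],
    "GENE_C": [5.0, 1.0, 4.0, 2.0, 3.0],
}
edges = coexpression_network(profiles)
```

The resulting edge dictionary could then be exported as an edge list for visualization in a tool such as Cytoscape. Profiles with zero variance would need to be filtered out beforehand, since the correlation is undefined for them.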
Text Mining Tools Used in Cancer Research
Text mining tools have proved beneficial for identifying relationships between
biological entities and diseases such as cancer. FACTA+ (Finding Associated
Concepts with Text Analysis) is a text-mining tool for analyzing the associations
between genes, proteins, diseases, and chemical compounds. Also, STITCH provides
association links between several drugs and compounds, while NetCutter determines
the significant associations between biological entities from the literature. An
automatic concept recognition software, Anni 2.0, elicits conceptual profiles from
articles and the ontological relationships among genes. The programming language R
has a large-scale archive of statistical packages used for microarray data
analysis. An open-source R library, MedlineR, is designed explicitly for Medline
literature data mining to build the association matrix and network of query terms,
which is further visualized by Pajek. Visualization can also be performed using
another literature mining tool, CARGO (Cancer And Related Genes Online), which
retrieves information on SNPs, genetically inherited diseases, and structures from
iHOP, OMIM, and PDB, respectively. Similarly, other tools like MarkerInfoFinder
and OSIRISv1.2 extract SNP data from the literature and make suitable connections
to cancers and other diseases. In addition, standalone scripts written in
programming languages such as Python and Java extract mutations
reported in the literature, whereas an R-based script is used to create recursive
computational sequential mutations (Raghav et al. 2019). A machine learning-based
approach, the CRF (conditional random field) algorithm, has been implemented in the
VTag software to collect cancer-related mutation information at the amino acid and
nucleic acid levels from text. Likewise, a coagulation protein database, Coag-MDB,
holds mutation data extracted from full-text articles and abstracts, which are
further manually inspected and validated. OMIM collects summaries of Mendelian
disorders that describe the phenotypic and genotypic features of human genes. OMIM
is used extensively by the CGMIM tool to identify cancer-associated genes, whereas
the HuGE Navigator tool identifies these genes from PubMed abstracts using an SVM
text classifier application, GAPscreener. The PolySearch tool extracts gene and
disease association information from abstracts, sentences from the literature, and
multiple databases (e.g., DrugBank, Entrez SNP, SwissProt, and HGMD). MeInfoText
provides a systematic literature search for DNA methylation (hypermethylation and
hypomethylation), methods (MSP and COBRA), and gene methylation-related
pathways associated with cancer. Similarly, the PubMeth database extracts
information on cancer-associated methylated genes from abstracts contained in
PubMed. These articles are indexed based on gene names from Ensembl and Entrez
Gene, incorporated in GeneCards. The text mining tool ENDEAVOUR developed a model
for gene ranking by integrating multiple databases like KEGG (for pathway
analysis), BIND (for interaction analysis), and GO (for functional terms
assessment). The G2D tool, in turn, prioritizes genes depending on the genetically
inherited diseases, and CAESAR (CAndidatE Search And Rank) ranks genes based on
information retrieved from OMIM (Kreeger and Lauffenburger 2009; Moding
et al. 2013).
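The association matrix of query terms mentioned above for literature mining can be illustrated with a simple co-occurrence count over abstracts. The Python sketch below is a toy illustration only and does not reproduce MedlineR or any other tool named here. The abstracts and query terms are hypothetical, and matching is done by naive case-insensitive substring search, which a real pipeline would replace with proper entity recognition.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_matrix(abstracts, terms):
    """Count the abstracts in which each pair of query terms appears together."""
    lowered_terms = [term.lower() for term in terms]
    counts = Counter()
    for text in abstracts:
        text = text.lower()
        # Naive substring matching; sorting keeps each pair in a stable order.
        present = sorted(term for term in lowered_terms if term in text)
        counts.update(combinations(present, 2))
    return counts


# Hypothetical abstracts standing in for records retrieved from Medline.
abstracts = [
    "Bcl-2 binds Bax and blocks apoptosis.",
    "Bax mutations alter apoptosis in tumors.",
    "Bcl-2 expression is elevated in lymphoma.",
]
matrix = cooccurrence_matrix(abstracts, ["Bcl-2", "Bax", "apoptosis"])
```

Each key of the resulting counter is a pair of terms and each value is the number of abstracts mentioning both, which can serve as a weighted edge list for network visualization.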
Mathematical Modeling and Simulation Tools to Model Cancer
Pathways and Networks
Three main approaches used for the treatment of cancer are surgery, radiotherapy,
and chemotherapy (Mitra et al. 2015). A large proportion of cancer patients receive
radiotherapy in combination with other therapies. The proliferating cells within a
tumor are generally more sensitive to DNA mutation or damage caused by radiation.
Nonetheless, some tumor cells become quiescent due to hypoxia and are not
susceptible to radiation therapy, which limits the efficacy of radiotherapy
(Moding et al. 2013). Therefore, the sensitivity of tumor cells to ionizing
radiation depends on the cell cycle phase. Typically, a radiation dose of 2 Gy/day
for 5 days a week, repeated for several weeks, is the standard protocol
for radiation therapy given to patients (Moding et al. 2013). The optimal timing of
radiation doses needs to be set by studying the cumulative effects of radiation
therapies using systems biology, mathematical modeling, and simulation approaches
(Ribba et al. 2006). The number of mathematical models developed to describe tumor
dynamics is increasing enormously. Cancer mathematical models are subdivided into
two broad groups, mechanistic and descriptive (Anderson and Quaranta 2008). The
descriptive models describe the regulation of cancer growth without giving
cellular biological detail (Anderson and Quaranta 2008), while the mechanistic
models represent the mechanism of tumor progression, which helps to initiate cancer
treatment (Anderson and Quaranta 2008; Araujo and McElwain 2004). The mathematical
modeling of biochemical reaction networks demonstrates its utility in constructing
predictive models (Price and Shmulevich 2007). Two computational algorithms,
PathwayPro and CoExMiner, are used to investigate the dynamic behaviors of gene
networks and stable features of gene co-expression. Table 2
(Text Mining Tools) briefly describes some of the existing software tools that can be
applied for the stochastic kinetic simulations of various biological systems.
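As a small descriptive example of studying the cumulative effect of a fractionated schedule, the Python sketch below encodes the protocol of 2 Gy/day for 5 days a week and evaluates the cumulative surviving fraction. It assumes the widely used linear-quadratic survival model, which this chapter does not itself specify. The values of alpha and beta and the choice of 6 weeks are illustrative placeholders. Repopulation, hypoxia, and cell cycle effects are deliberately ignored.

```python
import math


def weekday_schedule(weeks, dose_gy=2.0):
    """Daily doses in Gy: five treatment days, then two rest days, per week."""
    return ([dose_gy] * 5 + [0.0] * 2) * weeks


def surviving_fraction(schedule, alpha=0.3, beta=0.03):
    """Cumulative surviving fraction under the linear-quadratic model.

    alpha (1/Gy) and beta (1/Gy^2) are illustrative placeholders, not fitted
    values. Each fraction d contributes exp(-(alpha * d + beta * d**2)).
    """
    log_sf = 0.0
    for dose in schedule:
        log_sf -= alpha * dose + beta * dose ** 2
    return math.exp(log_sf)


schedule = weekday_schedule(weeks=6)  # "several weeks" is taken as 6 here
total_dose = sum(schedule)            # 30 fractions of 2 Gy each
sf = surviving_fraction(schedule)
```

A mechanistic model would replace the fixed per-fraction survival term with cell-cycle-dependent and oxygen-dependent sensitivity, which is the kind of refinement the cited multiscale models aim for.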
Clinical Applications of Systems Biology Tools and Approaches
Natural language processing services, such as caBIG's caTIES (Cancer Text
Information Extraction System) and Medi-Class (Hazlehurst et al. 2005), examine
pathology records, chart notes, and other free-text elements to measure the
frequency of different cancer subtypes for clinical trials. The programs caBIG and
caMATCH (www.breastcancertrials.org) match patients' EHRs with extensive treatment
trials and enlist eligible cancer patients for clinical trials. Healthcare delivery
systems answer related research questions by partnering with a virtual data
warehouse using resources such as CRN (Cancer Research Network), CVRN
(Cardiovascular Research Network), and CERTs (Centers for Education and Research
on Therapeutics). The SEER-Medicare Linked Database shares knowledge for cancer
prevention and research studies related to treatments (Hazlehurst et al. 2005).
Several clinical bioinformatics tools are being employed for risk assessment,
early diagnosis, prognosis, and classification of cancer (Kapetanovic et al. 2004).
Additionally, open-source software for analyzing and modeling data has been
developed under projects such as CGAP (Cancer Genome Anatomy Project), SBML
(Systems Biology Markup Language, http://sbml.org/index.psp), the MMHCC (Mouse
Models of Human Cancer Consortium), the CellML language
(http://www.cellml.org/public/news/index.html), and the Systems Biology Workbench
(http://www.sbw-sbml.org/oldindex.html). Besides these, notable pathway databases
support the development of systems biology models, for instance, KEGG, STKE
(Signal Transduction Knowledge Environment) (http://stke.sciencemag.org/), and AfCS
(Alliance for Cellular Signaling). In the field of clinical bioinformatics, the
applications of systems biology tools and software support the design of future
personalized medicines.
Conclusion
Cancer is considered to be one of the most lethal diseases in current times (Siegel
et al. 2015). The primary concern is the lack of target-specific medicine because it is
practically impossible to analyze the big data through conventional study models
(Stephens et al. 2012; Mossé et al. 2008; Kumar-Sinha et al. 2008; Greenman et al.
2007). These lacunae are now being identified and addressed through systems biology
approaches. Bioinformatics tools and software are widely used for high-throughput
molecular analyses of different data types, including those generated from
microarray experiments, RNA-seq, WES (Whole-Exome Sequencing), DNA copy
number, DNA methylation assays, pathway delineation, and protein structure
analysis (Kunz et al. 2017). Thorough analysis of these databases and use of
the mentioned tools and software have helped both researchers and healthcare
professionals in understanding cancer statistics and epidemiological studies,
diagnosis, biomarker prediction and detection, drug designing, and, most
importantly, targeted personalized medicine. Network-based models are being
employed and have proved to be a good strategy for augmenting therapeutic efficacy
in cancer treatments. These networks are being used to understand inter-tumoral
heterogeneity and to facilitate data integration among genomic, transcriptomic, and
epigenetic alterations. However, more user-friendly tools with simpler interfaces
need to be developed for their full-scale acceptance and application. The present
work provides a detailed overview of how bioinformatics algorithms and analytic
innovations are being used in the cancer domain to extract maximum biological
information.
References
Affara NA (2003) Resource and hardware options for microarray-based experimentation. Brief
Funct Genomic Proteomic. https://doi.org/10.1093/bfgp/2.1.7
Alberghina L, Chiaradonna F, Vanoni M (2004) Systems biology and the molecular circuits of
cancer. Chembiochem
Al-Hajj M, Wicha MS, Benito-Hernandez A et al (2003) Prospective identification of tumorigenic
breast cancer cells. Proc Natl Acad Sci U S A. https://doi.org/10.1073/pnas.0530291100
Anderson ARA, Quaranta V (2008) Integrative mathematical oncology. Nat Rev Cancer
Araujo RP, McElwain DLS (2004) A history of the study of solid tumour growth: the contribution
of mathematical modelling. Bull Math Biol. https://doi.org/10.1016/j.compbiomed.2009.01.014
Baudot A, Gómez-López G, Valencia A (2009) Translational disease interpretation with molecular
networks. Genome Biol
Chakraborty S, Hosen MI, Ahmed M, Shekhar HU (2018) Onco-multi-OMICS approach: a new
frontier in cancer research. Biomed Res Int. https://doi.org/10.1155/2018/9836256
Choi H, Shen R, Chinnaiyan AM, Ghosh D (2007) A latent variable approach for meta-analysis of
gene expression data from multiple microarray experiments. BMC Bioinformatics.
https://doi.org/10.1186/1471-2105-8-364
Forman MR, Greene SM, Avis NE et al (2010) Bioinformatics. Tools to accelerate population
science and disease control research. Am J Prev Med
Gao GF, Parker JS, Reynolds SM et al (2019) Before and after: comparison of legacy and
harmonized TCGA genomic data commons' data. Cell Syst.
https://doi.org/10.1016/j.cels.2019.06.006
Greenman C, Stephens P, Smith R et al (2007) Patterns of somatic mutation in human cancer
genomes. Nature. https://doi.org/10.1038/nature05610
Guerra R, Goldstein DR (2016) Meta-analysis and combining information in genetics and genomics
Hackl H, Stocker G, Charoentong P et al (2010) Information technology solutions for integration of
biomolecular and clinical data in the identification of new cancer biomarkers and targets for
therapy. Pharmacol Ther
Hazlehurst B, Sittig DF, Stevens VJ et al (2005) Natural language processing in the electronic
medical record: assessing clinician adherence to tobacco treatment guidelines. Am J Prev Med.
https://doi.org/10.1016/j.amepre.2005.08.007
Hornberg JJ, Bruggeman FJ, Westerhoff HV, Lankelma J (2006) Cancer: a systems biology disease.
BioSystems
Hur K (2015) MicroRNAs: promising biomarkers for diagnosis and therapeutic targets in human
colorectal cancer metastasis. BMB Rep
Kapetanovic IM, Rosenfeld S, Izmirlian G (2004) Overview of commonly used bioinformatics
methods and their applications. Annals of the New York
Kitano H (2002) Systems biology: a brief overview. Science
Kostka D, Spang R (2004) Finding disease specific alterations in the co-expression of genes.
Bioinformatics
Kreeger PK, Lauffenburger DA (2009) Cancer systems biology: a network modeling perspective.
Carcinogenesis
Kumar-Sinha C, Tomlins SA, Chinnaiyan AM (2008) Recurrent gene fusions in prostate cancer.
Nat Rev Cancer
Kunz M, Wolf B, Schulze H et al (2017) Non-coding RNAs in lung cancer: contribution of
bioinformatics analysis to the development of non-invasive diagnostic tools. Genes (Basel)
Laubenbacher R, Hower V, Jarrah A et al (2009) A systems biology view of cancer. Biochim
Biophys Acta Rev Cancer
Liu ET (2005) Systems biology, integrative biology, predictive biology. Cell
Mac GF, Annex BH, Popel AS (2010) Gene therapy from the perspective of systems biology. Curr
Opin Mol Ther
Mantel N, Haenszel W (1959) Statistical aspects of the analysis of data from retrospective studies of
disease. J Natl Cancer Inst. https://doi.org/10.1093/jnci/22.4.719
Mathew JP, Taylor BS, Bader GD et al (2007) From bytes to bedside: data integration and
computational biology for translational cancer research. PLoS Comput Biol
Mitra AK, Agrahari V, Mandal A et al (2015) Novel delivery approaches for cancer therapeutics. J
Control Release 219:248–268. https://doi.org/10.1016/j.jconrel.2015.09.067
Moding EJ, Kastan MB, Kirsch DG (2013) Strategies for optimizing the response of cancer and
normal tissues to radiation. Nat Rev Drug Discov
Morrow JK, Tian L, Zhang S (2010) Molecular networks in drug discovery. Crit Rev Biomed Eng
Mossé YP, Laudenslager M, Longo L et al (2008) Identification of ALK as a major familial
neuroblastoma predisposition gene. Nature. https://doi.org/10.1038/nature07261
Motakis E, Ivshina AV, Kuznetsov VA (2009) Data-driven approach to predict survival of cancer
patients: estimation of microarray genes' prediction significance by Cox proportional hazard
regression model. IEEE Eng Med Biol Mag. https://doi.org/10.1109/MEMB.2009.932937
Nagaraj NS (2009) Evolving 'omics' technologies for diagnostics of head and neck cancer. Brief
Funct Genomic Proteomic. https://doi.org/10.1093/bfgp/elp004
Nam MJ, Madoz-Gurpide J, Wang H et al (2003) Molecular profiling of the immune response in
colon cancer using protein microarrays: occurrence of autoantibodies to ubiquitin C-terminal
hydrolase L3. Proteomics. https://doi.org/10.1002/pmic.200300594
Orlov YL, Zhou J, Lipovich L et al (2007) Quality assessment of the Affymetrix U133A&B probe-
sets by target sequence mapping and expression data analysis. In Silico Biol
Pihur V, Datta S, Datta S (2008) Finding common genes in multiple cancer types through
meta-analysis of microarray experiments: a rank aggregation approach. Genomics.
https://doi.org/10.1016/j.ygeno.2008.05.003
Price ND, Shmulevich I (2007) Biochemical and statistical network models for systems biology.
Curr Opin Biotechnol
Price ND, Foltz G, Madan A et al (2008) Systems biology and cancer stem cells. J Cell Mol Med.
https://doi.org/10.1111/j.1582-4934.2007.00151.x
Raghav PK, Verma YK, Gangenahalli GU (2012a) Peptide screening to knockdown Bcl-2's
anti-apoptotic activity: implications in cancer treatment. Int J Biol Macromol.
https://doi.org/10.1016/j.ijbiomac.2011.11.021
Raghav PK, Verma YK, Gangenahalli GU (2012b) Molecular dynamics simulations of the Bcl-2
protein to predict the structure of its unordered flexible loop domain. J Mol Model.
https://doi.org/10.1007/s00894-011-1201-6
Raghav PK, Kumar R, Kumar V, Raghava GPS (2019) Docking-based approach for identification
of mutations that disrupt binding between Bcl-2 and Bax proteins: inducing apoptosis in cancer
cells. Mol Genet Genomic Med. https://doi.org/10.1002/mgg3.910
Rhodes DR, Yu J, Shanker K et al (2004) Large-scale meta-analysis of cancer microarray data
identifies common transcriptional profiles of neoplastic transformation and progression. Proc
Natl Acad Sci U S A. https://doi.org/10.1073/pnas.0401994101
Ribba B, Colin T, Schnell S (2006) A multiscale mathematical model of cancer, and its use in
analyzing irradiation therapies. Theor Biol Med Model. https://doi.org/10.1186/1742-4682-3-7
Ricci-Vitiani L, Lombardi DG, Pilozzi E et al (2007) Identification and expansion of human
colon-cancer-initiating cells. Nature. https://doi.org/10.1038/nature05384
Schadt EE, Zhang B, Zhu J (2009) Advances in systems biology are enhancing our understanding
of disease and moving us closer to novel disease treatments. Genetica.
https://doi.org/10.1007/s10709-009-9359-x
Sharom JR, Bellows DS, Tyers M (2004) From large networks to small molecules. Curr Opin Chem
Biol
Siegel RL, Miller KD, Jemal A (2015) Cancer statistics, 2015. CA Cancer J Clin.
https://doi.org/10.3322/caac.21254
Singh R, Mo YY (2013) Role of microRNAs in breast cancer. Cancer Biol Ther
Singh SK, Hawkins C, Clarke ID et al (2004) Identification of human brain tumour initiating cells.
Nature. https://doi.org/10.1038/nature03128
Stephens PJ, Tarpey PS, Davies H et al (2012) The landscape of cancer genes and mutational
processes in breast cancer. Nature. https://doi.org/10.1038/nature11017
Stilwell JL, Guan Y, Neve RM, Gray JW (2007) Systems biology in cancer research: genomics to
cellomics. Methods Mol Biol. https://doi.org/10.1385/1-59745-217-3:353
Tomasi TB, Magner WJ, Khan ANH (2006) Epigenetic regulation of immune escape genes in
cancer. Cancer Immunol Immunother
Verma YK, Raghav PK, Raj HG et al (2013) Enhanced heterodimerization of Bax by Bcl-2 mutants
improves irradiated cell survival. Apoptosis. https://doi.org/10.1007/s10495-012-0780-8
Wang SS, Gonzalez P, Yu K et al (2010) Common genetic variants and risk for HPV persistence and
progression to cervical cancer. PLoS One. https://doi.org/10.1371/journal.pone.0008667
Whitehead A, Whitehead J (1991) A general parametric approach to the meta-analysis of random-
ized clinical trials. Stat Med. https://doi.org/10.1002/sim.4780101105
Wong DJ, Nuyten DSA, Regev A et al (2008) Revealing targeted therapy for human cancer by gene
module maps. Cancer Res. https://doi.org/10.1158/0008-5472.CAN-07-0382
World Health Organization (2012) Cancer fact sheets. Globocan 2012
Yan Q (2008a) The integration of personalized and systems medicine: bioinformatics support for
pharmacogenomics and drug discovery. Methods Mol Biol.
https://doi.org/10.1007/978-1-59745-205-2_1
Yan Q (2008b) Bioinformatics databases and tools in virology research: an overview. In Silico Biol
Article
Meta-analysis provides a systematic and quantitative approach to the summary of results from randomized studies. Whilst many authors have published actual meta-analyses concerning specific therapeutic questions, less has been published about comprehensive methodology. This article presents a general parametric approach, which utilizes efficient score statistics and Fisher's information, and relates this to different methods suggested by previous authors. Normally distributed, binary, ordinal and survival data are considered. Both the fixed effects and random effects model for treatments are described.
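A minimal sketch of how pooling with efficient score statistics and Fisher's information can look in practice, assuming each study i reports a score statistic Z_i and information V_i, so that its effect estimate is Z_i / V_i with variance 1 / V_i. The function names are hypothetical, and the random effects step uses a standard moment estimator of the between-study variance, which is an assumption here and is not confirmed by the abstract.

```python
import math


def pool_fixed(z, v):
    """Fixed effects estimate: theta = sum(Z) / sum(V), SE = 1 / sqrt(sum(V))."""
    total_v = sum(v)
    return sum(z) / total_v, 1.0 / math.sqrt(total_v)


def pool_random(z, v):
    """Random effects estimate using a moment estimator of tau^2 (assumed)."""
    k = len(z)
    total_v = sum(v)
    # Heterogeneity statistic Q = sum(Z_i^2 / V_i) - (sum Z)^2 / sum V.
    q = sum(zi * zi / vi for zi, vi in zip(z, v)) - sum(z) ** 2 / total_v
    denom = total_v - sum(vi * vi for vi in v) / total_v
    tau2 = max(0.0, (q - (k - 1)) / denom) if denom > 0 else 0.0
    # Each study's variance becomes 1/V_i + tau^2; weight by its inverse.
    weights = [1.0 / (1.0 / vi + tau2) for vi in v]
    estimates = [zi / vi for zi, vi in zip(z, v)]
    total_w = sum(weights)
    theta = sum(w * t for w, t in zip(weights, estimates)) / total_w
    return theta, 1.0 / math.sqrt(total_w), tau2


if __name__ == "__main__":
    # Made-up inputs for illustration only.
    z_stats = [0.0, 10.0]
    v_info = [10.0, 10.0]
    print(pool_fixed(z_stats, v_info))
    print(pool_random(z_stats, v_info))
```

When the studies agree, the estimated between-study variance is zero and the random effects result reduces to the fixed effects one.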
Chapter
Cancers result from large-scale deregulation of genes that leads to cancer pathophysiologies such as increased proliferation, decreased apoptosis, increased motility, increased angiogenesis, and others. Genes that influence proliferation and apoptosis are particularly attractive as therapeutic targets. To identify genes that influence these phenotypes, we have developed simple and rapid methods to measure apoptosis and cell proliferation using high content screening with YO-PRO®-1 and anti-BrdU staining of BrdU pulsed cells, respectively. Key Words: Apoptosis, BrdU, cell cycle, cellomics, high content analysis, high content screening, image analysis, propidium iodide, siRNA, YO-PRO-1