Network Motif Families for Lung Cancer Diagnostics: A World Community Grid Approach

Abstract

Despite smoking cessation and advances in detection and treatment, lung cancer remains the primary cause of cancer-related death. Over recent decades, numerous studies have analysed NSCLC using diverse “omic” platforms to identify a large pool of signatures for detection and prognosis. However, translating these into clinical practice remains challenging; one main difficulty is the exhaustive evaluation of putative signatures. Our project utilized the World Community Grid (WCG) to systematically explore the entire space of potential signature patterns. Within those patterns, we find generalized motif families that give deeper insights into the molecular background of cancers and give rise to more reliable patterns for cancer detection and prognosis.
Anne-Christin Hauschild, Christian A. Cumbaa, Mike Tsay, Igor Jurisica
Princess Margaret Cancer Centre, Toronto, Ontario
Despite smoking cessation, and advances in detection and treatment, lung cancer remains the
primary cause of cancer-related death[1, 2, 3], and non-small cell lung cancer (NSCLC) accounts
for about 80-85% of all cases. The overall survival rate for lung cancer has improved only marginally
in the past decades, from 13% to 16%. Because the disease is often asymptomatic, it is typically not
diagnosed until advanced stages. Over recent decades, numerous studies have
analysed NSCLC using diverse “omic” platforms to identify a large pool of signatures for
detection and prognosis. However, translating these into clinical practice remains challenging;
signatures often do not validate in other cohorts or by different biological assays, and there are
thousands of possible combinations to consider.
Making use of a unique computational resource, the Mapping Cancer Markers (MCM) project
aims to systematically survey the landscape of useful cancer gene signatures for multiple cancers
(diagnosis and prognosis), and thereby establish a benchmark for cancer gene signature
identification and validation. MCM is powered by IBM’s World Community Grid (WCG),
a massive grid of 3.3 million devices (http://www.worldcommunitygrid.org). WCG members
contribute spare compute cycles to problems in health, poverty, and ecology.
Using the WCG, MCM’s lung cancer evaluation sampled 9.8 trillion combinations of fixed-
length gene expression patterns, and evaluated these signatures against a NSCLC diagnostic
gene expression dataset. Using a performance threshold based on the Matthews correlation
coefficient (MCC), approximately 45 million high-performing signatures have been identified.
We characterized the distribution of the high-performing signatures in terms of the frequency
of individual genes, network patterns, and by comprehensive pathway enrichment analysis.
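
As an illustration of the selection criterion, the following minimal sketch computes the Matthews correlation coefficient for the binary predictions of one candidate signature and compares it with a cut-off. The function names, the label encoding (1 = tumour, 0 = normal) and the threshold argument are assumptions made for this example; the abstract does not state the threshold used by MCM or how the underlying classifier was built.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (assumed: 1 = tumour, 0 = normal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(y_true, y_pred):
    """Matthews correlation coefficient in [-1, 1]."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: MCC is set to 0 when a row or column
        # of the confusion matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def is_high_performing(y_true, y_pred, threshold):
    """Hypothetical helper: keep a signature if its MCC reaches the cut-off."""
    return mcc(y_true, y_pred) >= threshold
```

A signature whose predictions match the sample labels exactly has an MCC of 1, while one that predicts the same class for every sample has an MCC of 0 under the convention above, which is why MCC is less easily inflated by class imbalance than plain accuracy.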
Our overall goal is to utilize network patterns to identify generalized motif families that
give deeper insights into the molecular background of cancers, and give rise to more reliable
signatures for cancer detection and prognosis. Using state-of-the-art unsupervised learning
methods, we first partition the gene features into clusters of high connectivity. We then
apply an established frequent-itemset mining algorithm to identify co-occurring terms among these
patterns. The most frequent motif families were further evaluated with frequentist and
Bayesian methods in combination with performance measures such as MCC and AUC. Given
the broad representation of the pattern space, the result of this extensive processing pipeline is
a set of highly informative gene clusters and gene motif families of high predictive power.
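
The frequent-itemset step can be sketched as follows, assuming each high-performing signature has already been rewritten as a set of gene-cluster labels. This is a plain level-wise (Apriori-style) search written for illustration only; the abstract does not name the algorithm actually used, and the function name, the cluster labels and the support threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(signatures, min_support, max_size=3):
    """Return {itemset: support} for itemsets occurring in at least
    `min_support` (a fraction) of the signatures, up to `max_size` items."""
    transactions = [frozenset(s) for s in signatures]
    n = len(transactions)

    # Level 1: support of single cluster labels.
    counts = Counter(item for t in transactions for item in t)
    current = {frozenset([i]) for i, c in counts.items() if c / n >= min_support}
    result = {frozenset([i]): c / n for i, c in counts.items()
              if c / n >= min_support}

    k = 2
    while current and k <= max_size:
        # Only labels that survive in a frequent (k-1)-itemset can extend it.
        alive = {i for itemset in current for i in itemset}
        counts = Counter()
        for t in transactions:
            kept = sorted(i for i in t if i in alive)
            for combo in combinations(kept, k):
                # Apriori pruning: every (k-1)-subset must itself be frequent.
                if all(frozenset(sub) in current
                       for sub in combinations(combo, k - 1)):
                    counts[frozenset(combo)] += 1
        current = {c for c, v in counts.items() if v / n >= min_support}
        for c in current:
            result[c] = counts[c] / n
        k += 1
    return result


# Toy input: four signatures expressed as sets of hypothetical cluster labels.
toy = [{"C1", "C2", "C3"}, {"C1", "C2"}, {"C1", "C2", "C4"}, {"C3", "C4"}]
motifs = frequent_itemsets(toy, min_support=0.5)
```

In this toy input the pair of clusters C1 and C2 co-occurs in three of the four signatures, so it is the only multi-cluster itemset that passes a support threshold of 0.5; in the setting described above, such recurring combinations would be the candidates for motif families.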
Finally, we demonstrate how the discovered clusters and motif families summarize genes of
similar functionality and localization, as well as interaction and pathway networks.
In summary, we demonstrate a “big data” pattern discovery system that can produce more
robust and reliable clinical diagnostics. The presented computational framework carries the
potential for applications in precision oncology.
References
[1] Jeffrey P Kanne. Screening for lung cancer: what have we learned? American Journal of Roentgenology,
202(3):530–535, 2014.
[2] Rebecca Siegel, Deepa Naishadham, and Ahmedin Jemal. Cancer statistics, 2012. CA: a cancer journal for
clinicians, 62(1):10–29, 2012.
[3] Rebecca L Siegel, Kimberly D Miller, and Ahmedin Jemal. Cancer statistics, 2015. CA: a cancer journal
for clinicians, 65(1):5–29, 2015.