Tyler Funnell
Memorial Sloan Kettering Cancer Center | MSKCC · Computational Oncology
Master of Science
About
85 publications
3,234 reads
794 citations
Introduction
I look at patterns of mutation in cancer using machine learning and bulk or single cell whole genome sequencing.
Publications (85)
The extent of cell-to-cell variation in tumor mitochondrial DNA (mtDNA) copy number and genotype, and the phenotypic and evolutionary consequences of such variation, are poorly characterized. Here we use amplification-free single-cell whole-genome sequencing (Direct Library Prep (DLP+)) to simultaneously assay mtDNA copy number and nuclear DNA (nuD...
High-grade serous ovarian cancer (HGSOC) is an archetypal cancer of genomic instability patterned by distinct mutational processes, tumour heterogeneity and intraperitoneal spread. Immunotherapies have had limited efficacy in HGSOC, highlighting an unmet need to assess how mutational processes and the anatomical sites of tumour...
How cell-to-cell copy number alterations that underpin genomic instability in human cancers drive genomic and phenotypic variation, and consequently the evolution of cancer, remains understudied. Here, by applying scaled single-cell whole-genome sequencing to wild-type, TP53-deficient and TP53-deficient;BRCA1-deficient or TP53-deficient;BRCA2-de...
Assessing tumour gene fitness in physiologically-relevant model systems is challenging due to biological features of in vivo tumour regeneration, including extreme variations in single cell lineage progeny. Here we develop a reproducible, quantitative approach to pooled genetic perturbation in patient-derived xenografts (PDXs), by encoding single c...
High-grade serous ovarian cancer (HGSOC) is an archetypal cancer of genomic instability patterned by distinct mutational processes, a high degree of tumor heterogeneity and intraperitoneal spread. As immunotherapies have thus far proven ineffective in this disease, we sought to establish the determinants of immune recognition, avoidance and evasion...
Copy number alterations and structural variants are associated with disease progression, therapeutic response, and metastasis in human cancers, yet the extent and mechanisms driving continued genomic instability remain poorly understood. We generated more than 20,000 single-cell whole genomes from 25 high-grade serous ovarian and triple-negative br...
High-grade serous ovarian cancer (HGSOC) is an archetypal cancer of genomic instability patterned by distinct mutational processes, intratumoral heterogeneity and intraperitoneal spread. We investigated determinants of immune recognition and evasion in HGSOC to elucidate co-evolutionary processes underlying malignant progression and tumor immunity...
Cancer genomes exhibit extensive chromosomal copy number changes and structural variation, yet how allele specific alterations drive cancer genome evolution remains unclear. Here, through application of a new computational approach we report allele specific copy number alterations in 11,097 single cell whole genomes from genetically engineered mamm...
Structural genome alterations are determinants of cancer ontogeny and therapeutic response. While bulk genome sequencing has enabled delineation of structural variation (SV) mutational processes which generate patterns of DNA damage, we have little understanding of how these processes lead to cell-to-cell variations which underlie selection and rat...
A new generation of scalable single cell whole genome sequencing (scWGS) methods allows unprecedented high-resolution measurement of the evolutionary dynamics of cancer cell populations. Phylogenetic reconstruction is central to identifying sub-populations and distinguishing mutational processes. The ability to sequence tens of thousands of singl...
Mutation signatures in cancer genomes reflect endogenous and exogenous mutational processes, offering insights into tumour etiology, features for prognostic and biologic stratification and vulnerabilities to be exploited therapeutically. We present a novel machine learning formalism for improved signature inference, based on multi-modal correlated...
Plate notation for ILDA, ICTM, and IMMCTM.
Graphical models for the (a) ILDA, (b) ICTM and (c) IMMCTM models, with (d) descriptions of their variables. See S1 Text for detailed descriptions.
(PDF)
Comparison of NMF with LDA, CTM, MMCTM, ILDA, ICTM, and IMMCTM, using the ovarian cancer dataset.
Displayed are log likelihood means ± standard error for: (a) 2–15 signatures, and (b) a range of mutation count fractions. Top panels are evaluations on SNV counts, bottom panels are evaluations on SV counts only. NMF: applied to raw counts, NMF-norm: appl...
Log likelihoods across random restarts.
Average per-mutation predictive log-likelihoods from 100 restarts for SNV and SV signatures inferred by each method. Values have been mean-centered.
(PDF)
Mutation processes and mutation signature analysis workflow.
(a) Analysis workflow for the multimodal topic models MMCTM and IMMCTM. (b) Mutation process activity is detected as patterns of mutations, i.e. mutation signatures, in the genome. Samples with common levels of signature probabilities may be grouped, and potentially exhibit similar phenotypes...
Binary heatmap indicating which SNV signatures have cosine similarity ≥ 0.8.
Included are SNV signatures from COSMIC, the breast, and ovarian cancer datasets.
(PDF)
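A binary similarity map of this kind can be sketched as follows. This is a minimal illustration, not the authors' code: the signature names and the 4-term probability vectors are hypothetical toy values, and only the cosine similarity threshold of 0.8 comes from the caption.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length probability vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def binary_similarity_matrix(signatures, threshold=0.8):
    """Map each ordered pair of signature names to True when similarity >= threshold."""
    names = list(signatures)
    return {
        (a, b): cosine_similarity(signatures[a], signatures[b]) >= threshold
        for a in names
        for b in names
    }


# Hypothetical toy signatures over 4 mutation terms (real SNV signatures have many more).
toy_signatures = {
    "sig_A": [0.7, 0.1, 0.1, 0.1],
    "sig_B": [0.6, 0.2, 0.1, 0.1],
    "sig_C": [0.05, 0.05, 0.1, 0.8],
}
similar = binary_similarity_matrix(toy_signatures)
```

In this toy example `sig_A` and `sig_B` are flagged as similar while `sig_C` matches only itself, which is the pattern a binary heatmap of this type would display.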
Sample cluster signature probability comparisons.
Tests compared signature probability means for clusters in the (a) breast, and (b) ovarian cancer datasets. Adjusted p-values >0.05 are not shown. Cluster labels are colored according to those in the associated signature probability heatmap.
(PDF)
Method benchmarking mean absolute error on synthetic breast cancer data.
Columns: value (estimated value type, signature or probability; string), method (signature inference method; string), evaluation (snv or sv; string), signature (signature name; string), subset (1.0-1.0: full counts, 0.01-0.1: 1% SNVs & 10% SVs; string), seed (random seed; inte...
Signature mean absolute errors on synthetic data.
Shown are mean absolute errors per method and per signature for estimated signatures compared to the reference signatures. The experiment was repeated with full mutation counts and with 1% SNVs & 10% SVs. Data is represented as Tufte-like boxplots with the following elements: points (median), gap (f...
SNV signatures for the 560 genomes BRCA-EU dataset.
Mutation and flanking sequence shown on x-axis.
(PDF)
SNV signatures for the ovarian cancer dataset.
Mutation and flanking sequence shown on x-axis.
(PDF)
Shah HGSC cancer mutation signature analysis.
(a) SNV mutation signatures. SNVs are organized according to the SNV type (color). Within each type, SNVs are further organized into the pattern of flanking nucleotides (A—A, A—C, …, T—G, T—T). (b) SV mutation signatures. SVs are grouped by type (DEL: deletion, DUP: tandem duplication, INV: inversion, FBI: f...
Description of mutation signature methods.
(PDF)
Method benchmarking log-likelihood values across a range of the number of signatures.
Columns: method (signature inference method; string), evaluation (snv or sv; string), k (number of signatures; integer), n (cross validation repeat; integer), fold (cross validation fold; integer), ll (log-likelihood; float), dataset (breast, ovary; string).
(TSV)
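A table with these columns could be summarized into per-method means ± standard error roughly as below. This is a sketch under stated assumptions: the column names are taken from the description above, but the rows in `toy_tsv` are invented toy values, and grouping by dataset, method, evaluation and `k` is one plausible choice rather than the published analysis.

```python
import csv
import io
import math
import statistics
from collections import defaultdict

# Hypothetical toy rows; only the column names come from the table description.
rows = [
    ["method", "evaluation", "k", "n", "fold", "ll", "dataset"],
    ["MMCTM", "snv", "5", "1", "1", "-2.0", "breast"],
    ["MMCTM", "snv", "5", "1", "2", "-2.2", "breast"],
    ["MMCTM", "snv", "5", "1", "3", "-2.4", "breast"],
    ["NMF", "snv", "5", "1", "1", "-2.5", "breast"],
    ["NMF", "snv", "5", "1", "2", "-2.7", "breast"],
]
toy_tsv = "\n".join("\t".join(r) for r in rows) + "\n"


def summarize_ll(handle):
    """Return {(dataset, method, evaluation, k): (mean, standard error)} of log-likelihoods."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        key = (row["dataset"], row["method"], row["evaluation"], int(row["k"]))
        groups[key].append(float(row["ll"]))
    summary = {}
    for key, values in groups.items():
        mean = statistics.fmean(values)
        # Standard error of the mean; undefined for a single observation.
        if len(values) > 1:
            se = statistics.stdev(values) / math.sqrt(len(values))
        else:
            se = float("nan")
        summary[key] = (mean, se)
    return summary


summary = summarize_ll(io.StringIO(toy_tsv))
```

Passing an open file handle for the real TSV in place of `io.StringIO(toy_tsv)` would work the same way, provided the file uses the header names listed above.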
Method benchmarking log-likelihood values across a range of mutation count fractions.
Columns: method (signature inference method; string), evaluation (snv or sv; string), k (number of signatures; integer), snv_frac (fraction of retained SNVs; float), sv_frac (fraction of retained SVs; float), n (cross validation repeat; integer), fold (cross valid...
Mutation signature probabilities per sample.
Columns: signature (mutation signature label; string), sample (sample id; string), probability (sample-signature probability; float), dataset (breast, ovary; string).
(TSV)
Sample cluster signature probability enrichment p-values.
Columns: cluster (sample cluster; integer), signature (mutation signature label; string), p_value (enrichment p-value, float), mean_diff (difference between means, float), conf_low (lower bound of confidence interval; float), conf_high (upper bound of confidence interval; float), q_value (BH...
Sample cluster annotation association p-values.
Columns: label (annotation label; string), p_value (enrichment p-value, float), diff (difference between group statistics, float), conf_low (lower bound of confidence interval; float), conf_high (upper bound of confidence interval; float), test (statistical test; string), cluster (sample cluster; inte...
Comparisons of NMF, LDA, CTM, MMCTM, ILDA, ICTM, and IMMCTM, using the 560 breast cancer dataset.
Displayed are SNV and SV signature log likelihood means ± standard error for: (a) 2–12 signatures, and (b) a range of mutation count fractions. (c) Logistic regression accuracy means ± standard error for predicting HRD labels using per-sample signature proba...
ICGC HGSC cancer mutation signature analysis.
(a) SNV mutation signatures. SNVs are organized according to the SNV type (color). Within each type, SNVs are further organized into the pattern of flanking nucleotides (A—A, A—C, …, T—G, T—T). (b) SV mutation signatures. SVs are grouped by type (DEL: deletion, DUP: tandem duplication, INV: inversion, FBI: f...
MMCTM SNV and SV log likelihood means ± standard error across signature number.
Shown for: (a) breast, and (b) ovarian cancer datasets. The chosen signature number is indicated by a green vertical line.
(PDF)
Descriptions of the topic models.
(PDF)
Method benchmarking logistic regression accuracy across a range of the number of signatures.
Columns: score (logistic regression accuracy; float), k (number of signatures; integer), n (cross validation repeat; integer), fold (cross validation fold; integer), method (signature inference method; string), train (training set, either SNV, SV or SNV & S...
Sample clusters.
Columns: sample (sample id; string), cluster (cluster number; integer), dataset (breast, ovary; string).
(TSV)
Heatmap of relative probabilities of signatures in BRCA-EU samples.
Each heatmap column represents a single sample, and is composed of the probabilities of SNV and SV signatures output from the MMCTM model. The values for each signature (row) have been standardized, producing z-scores. Heatmap display has been truncated to ±3. Samples have been hie...
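The row-wise standardization and ±3 truncation described in this caption can be illustrated with a short sketch. The input values are hypothetical, and the use of the population standard deviation is an assumption; the caption does not say which estimator was used.

```python
import statistics


def standardize_rows(matrix, limit=3.0):
    """Convert each row (one signature across samples) to z-scores, clipped to [-limit, limit]."""
    result = []
    for row in matrix:
        mean = statistics.fmean(row)
        sd = statistics.pstdev(row)  # assumption: population SD; a sample SD is equally plausible
        if sd == 0:
            # A constant row carries no relative information; show it as all zeros.
            result.append([0.0 for _ in row])
            continue
        result.append([max(-limit, min(limit, (x - mean) / sd)) for x in row])
    return result


# Hypothetical signature-by-sample probabilities (rows: signatures, columns: samples).
toy_probabilities = [
    [0.1, 0.2, 0.3],
    [0.0] * 19 + [1.0],  # one extreme sample, whose z-score exceeds 3 before clipping
]
z_scores = standardize_rows(toy_probabilities)
```

Clipping keeps a single extreme sample from dominating the color scale, which is presumably why the display is truncated.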
Heatmap of relative probabilities of signatures in ovarian cancer samples.
Each heatmap column represents a single sample, and is composed of the probabilities of SNV and SV signatures output from the MMCTM model. The values for each signature (row) have been standardized, producing z-scores. Heatmap display has been truncated to ±3. Samples have b...
Mutation signatures.
Columns: modality (1: SNV, 2: SV; integer), signature (signature label; string), value (mutation term number; integer), term (mutation term; string), probability (signature-mutation probability; float), dataset (breast, ovary; string).
(TSV)
Mutation signature probability correlations.
Columns: signature_* (mutation signature label; string), correlation (sample-signature probability correlation between two signatures; float), dataset (breast, ovary; string).
(TSV)
High-grade serous ovarian cancer (HGSC) exhibits extensive malignant clonal diversity with widespread but non-random patterns of disease dissemination. We investigated whether local immune microenvironment factors shape tumor progression properties at the interface of tumor-infiltrating lymphocytes (TILs) and cancer cells. Through multi-region stud...
CDC-like kinase phosphorylation of serine/arginine-rich proteins is central to RNA splicing reactions. Yet, the genomic network of CDC-like kinase-dependent RNA processing events remains poorly defined. Here, we explore the connectivity of genomic CDC-like kinase splicing functions by applying graduated, short-exposure, pharmacological CDC-like kin...
High-grade serous ovarian cancer exhibits extensive intratumoral heterogeneity coupled with widespread intraperitoneal disease. Despite this, metastatic spread of tumor clones is non-random, implying the existence of local microenvironmental factors that shape tumor progression. We interrogated the molecular interface between tumor-infiltrating lym...
We present a novel hierarchical Bayes statistical model, xseq, to systematically quantify the impact of somatic mutations on expression profiles. We establish the theoretical framework and robust inference characteristics of the method using computational benchmarking. We then use xseq to analyse thousands of tumour data sets available through The...