Sohrab P. Shah's research while affiliated with Memorial Sloan Kettering Cancer Center and other places

Publications (521)

Preprint
Full-text available
Background: The encoding of cell-intrinsic resistance states in breast cancer reflects the contributions of genomic and non-genomic variation. However, identifying the potential contributions of each requires accurate measurement and subtraction of the contribution of clonal fitness from co-measurement of transcriptional states. Somatic genomic vari...
Article
Full-text available
Deciphering individual cell phenotypes from cell-specific transcriptional processes requires high dimensional single cell RNA sequencing. However, current dimensionality reduction methods aggregate sparse gene information across cells, without directly measuring the relationships that exist between genes. By performing dimensionality reduction with...
Article
Full-text available
Purpose: We report updated clinical outcomes from a phase II study of pembrolizumab, trastuzumab, and chemotherapy (PTC) in metastatic esophagogastric (EG) cancer in conjunction with outcomes from an independent MSK cohort. Experimental design: The significance of pre-treatment 89Zr-trastuzumab PET, plasma circulating tumor DNA (ctDNA) dynamics,...
Article
Although allogeneic hematopoietic cell transplantation (allo-HCT) is curative for high-risk pediatric acute myeloid leukemia (AML), disease relapse remains the primary cause of post-transplant mortality. To identify pressures imposed by allo-HCT on AML cells that escape the graft-versus-leukemia effect, we evaluated immune signatures at diagnosis a...
Article
Full-text available
Chromosomal instability (CIN) and epigenetic alterations are characteristics of advanced and metastatic cancers [1–4], but whether they are mechanistically linked is unknown. Here we show that missegregation of mitotic chromosomes, their sequestration in micronuclei [5,6] and subsequent rupture of the micronuclear envelope [7] profoundly disrupt normal hist...
Article
2037 Background: Central nervous system (CNS) metastasis is a major cause of cancer death and morbidity, but the clinicogenomic covariates of CNS metastasis have been studied in small cohorts. We sought to i) determine whether models predicting patient time to CNS metastasis (ttCNS) trained on a large, automatically annotated clinicogenomic dataset...
Article
Chromosomal instability (CIN) is a major driver of tumor progression and treatment resistance in many cancers. CIN is characterized by ongoing chromosome missegregation, generating copy number heterogeneity that provides a substrate for natural selection. Although CIN has been well studied in model systems, the evolutionary dynamics and genomic imp...
Article
Full-text available
With the aim of producing a 3D representation of tumors, imaging and molecular annotation of xenografts and tumors (IMAXT) uses a large variety of modalities in order to acquire tumor samples and produce a map of every cell in the tumor and its host environment. With the large volume and variety of data produced in the project, we developed automat...
Article
Objectives: While fully supervised learning can yield high-performing segmentation models, the effort required to manually segment large training sets limits practical utility. We investigate whether data-mined line annotations can facilitate brain MRI tumor segmentation model development without requiring manually segmented training data. Method...
Preprint
Full-text available
DNA replication is a highly coordinated cell cycle process that can become dysregulated in cancer, increasing both proliferation and mutation rates. Single-cell whole genome sequencing holds potential for studying replication dynamics of cancer cells; however, computational methods for identifying S-phase cells and inferring single-cell replication...
Article
The digitization of health records and prompt availability of tumor DNA sequencing results offer a chance to study the determinants of cancer outcomes with unprecedented richness; however, abstraction of key attributes from free text presents a major limitation to large-scale analyses. Using natural language processing (NLP), we derived sites of me...
Article
Studying DNA replication dynamics of genomically unstable cancers is important for understanding how genome diversity generating processes shape subclonal evolutionary trajectories. The emergence of single-cell whole genome sequencing (scWGS) has enabled novel interrogation of how copy number aberration (CNA) patterns shape tumor evolution and, sep...
Article
Anticancer therapy changes tumor physiology and genomics, making it a key variable in cancer studies. Although antineoplastics given at a single institution may be available in research-ready format, treatment at external institutions prior to receiving care at academic medical centers, common among patients at these centers, is often only describe...
Article
Whole genome sequencing (WGS) enables the identification of all cancer associated biomarkers in a patient’s tumor genome. Whilst fresh frozen (FF) derived WGS data provides optimal data quality, the majority of clinical biospecimens are from formalin fixed paraffin embedded (FFPE) tissue which results in DNA damage and an increase in artifactual mu...
Article
Recently, novel technologies have enabled spatially resolved multiplexed protein profiling of tumors; however, there is an unmet need for unbiased detection of spatial structures within the tissue. By encoding multiplexed images as a graph, we produce an efficient and well-studied graph representation that allows for the application of sophisticate...
Preprint
Full-text available
Somatic copy number alterations drive aberrant gene expression in cancer cells. In tumors with high levels of chromosomal instability, subclonal copy number alterations (CNAs) are a prevalent feature which often result in heterogeneous cancer cell populations with distinct phenotypes [1]. However, the extent to which subclonal CNAs contribute to clo...
Article
Full-text available
High-grade serous ovarian cancer (HGSOC) is an archetypal cancer of genomic instability [1–4] patterned by distinct mutational processes [5,6], tumour heterogeneity [7–9] and intraperitoneal spread [7,8,10]. Immunotherapies have had limited efficacy in HGSOC [11–13], highlighting an unmet need to assess how mutational processes and the anatomical sites of tumour...
Conference Paper
Objectives: Genomic instability is a hallmark of human cancer, with fundamental relevance to cancer etiology and evolution, anti-tumor immunity and therapeutic response. High-grade serous ovarian cancer (HGSOC) is an archetypal cancer of genomic instability defined by distinct mutational processes, intraperitoneal spread and tumor heterogeneity. As...
Article
Full-text available
Circulating tumor DNA (ctDNA) sequencing guides therapy decisions but has been studied mostly in small cohorts without sufficient follow-up to determine its influence on overall survival. We prospectively followed an international cohort of 1,127 patients with non-small-cell lung cancer and ctDNA-guided therapy. ctDNA detection was associated with...
Article
Full-text available
Follicular lymphoma (FL) is an indolent cancer of mature B-cells but with ongoing risk of transformation to more aggressive histology over time. Recurrent mutations associated with transformation have been identified; however, prognostic features that can be discerned at diagnosis could be clinically useful. We present here comprehensive profiling...
Article
Full-text available
Chromosomal instability is a major challenge to patient stratification and targeted drug development for high-grade serous ovarian carcinoma (HGSOC). Here we show that somatic copy number alterations (SCNAs) in frequently amplified HGSOC cancer genes significantly correlate with gene expression and methylation status. We identify five prevalent clo...
Preprint
Full-text available
Single-cell T cell repertoire sequencing can pair both T cell receptor (TCR) and gene expression sequence data, providing an enriched view of T cell behavior. This powerful tool can identify and characterize specific clonotypes and phenotypes as well as track their changes in response to therapy, such as immune checkpoint blockade (ICB). We present...
Article
Full-text available
How cell-to-cell copy number alterations that underpin genomic instability [1] in human cancers drive genomic and phenotypic variation, and consequently the evolution of cancer [2], remains understudied. Here, by applying scaled single-cell whole-genome sequencing [3] to wild-type, TP53-deficient and TP53-deficient;BRCA1-deficient or TP53-deficient;BRCA2-de...
Article
Full-text available
Immunotherapy is used to treat almost all patients with advanced non-small cell lung cancer (NSCLC); however, identifying robust predictive biomarkers remains challenging. Here we show the predictive capacity of integrating medical imaging, histopathologic and genomic features to predict immunotherapy response using a cohort of 247 patients with ad...
Article
Full-text available
Assessing tumour gene fitness in physiologically-relevant model systems is challenging due to biological features of in vivo tumour regeneration, including extreme variations in single cell lineage progeny. Here we develop a reproducible, quantitative approach to pooled genetic perturbation in patient-derived xenografts (PDXs), by encoding single c...
Article
Full-text available
Patients with high-grade serous ovarian cancer suffer poor prognosis and variable response to treatment. Known prognostic factors for this disease include homologous recombination deficiency status, age, pathological stage and residual disease status after debulking surgery. Recent work has highlighted important prognostic information captured in c...
Article
Copy number alterations and structural variants are associated with disease progression, therapeutic response, and metastasis in human cancers, yet the extent and mechanisms driving continued genomic instability remain poorly understood. We generated more than 20,000 single-cell whole genomes from 25 high-grade serous ovarian and triple-negative br...
Article
High-grade serous ovarian cancer (HGSOC) is an archetypal cancer of genomic instability patterned by distinct mutational processes, a high degree of tumor heterogeneity and intraperitoneal spread. As immunotherapies have thus far proven ineffective in this disease, we sought to establish the determinants of immune recognition, avoidance and evasion...
Article
Response to immune checkpoint blockade (ICB) in non-small cell lung cancers (NSCLC) is associated with recurring mutations in tumor suppressor genes STK11 and TP53. Whereas STK11-mutated patients are mostly insensitive, TP53-mutated patients commonly respond to ICB. Previous studies have linked mutational status in these genes to differences in cel...
Article
Preclinical and clinical studies have shown that intratumoral oncolytic viruses (OVs) can potentiate host anti-tumor immunity and overcome resistance to immune checkpoint blockade, although clinical responses to OVs have been modest to date. While T cell infiltration of tumors is frequently cited as a measure of OV immunogenicity, this measure is n...
Article
Pathologic review of tissue samples is a crucial step in cancer diagnosis and treatment planning. In recent years, quantitative analyses, including artificial intelligence (AI) techniques, have been applied to facilitate the evaluation of histopathologic images. Research in computational pathology comes with numerous engineering challenges: from ma...
Article
Background: Single-cell whole genome sequencing (scWGS) methods such as direct library preparation (DLP) provide amplification-free capture of cells in all cell cycle phases and have enabled rich interrogation of the cell-to-cell genomic diversity of cancer genomes. Previous DLP-driven clonal evolution studies removed S-phase cells as replicated loci...
Article
Neoadjuvant chemotherapy (NAC) is the standard of care for selected patients with high-risk early-stage breast cancer with pathologic complete response (pCR) being the most prominent predictor of favorable outcomes. Here, we sought to study the predictive capacity of integrating orthogonal diagnostic measures on predicting pCR relative to standard...
Article
Chromosomal instability (CIN) and epigenetic alterations are characteristics of advanced and metastatic cancers, yet whether they are mechanistically linked is unknown. Here we show that missegregation of mitotic chromosomes, their sequestration in micronuclei, and subsequent micronuclear envelope rupture profoundly disrupt normal histone post-tran...
Article
Full-text available
Missense driver mutations in cancer are concentrated in a few hotspots [1]. Various mechanisms have been proposed to explain this skew, including biased mutational processes [2], phenotypic differences [3–6] and immunoediting of neoantigens [7,8]; however, to our knowledge, no existing model weighs the relative contribution of these features to tumour evolutio...
Article
9064 Background: Immunotherapy is now given to almost all patients with advanced non-small cell lung cancer (NSCLC). However, developing robust biomarkers to predict benefit remains challenging. We set out to evaluate the predictive capacity of integrating medical imaging, histopathologic, and genomic features to develop a multimodal biomarker for...
Article
Full-text available
Transcription factors ThPOK and Runx3 regulate the differentiation of "helper" CD4+ and "cytotoxic" CD8+ T cell lineages respectively, inducing single positive (SP) T cells that enter the periphery with the expression of either the CD4 or CD8 co-receptor. Despite the expectation that these cell fates are mutually exclusive and that mature CD4+CD8+...
Preprint
Full-text available
Deciphering individual cell phenotypes from cell-specific transcriptional processes requires high dimensional single cell RNA sequencing. However, current dimensionality reduction methods aggregate sparse gene information across similar cells, without directly measuring the relationships that exist between genes. By performing dimensionality reduct...
Article
Full-text available
PRAME is a prominent member of the cancer germline antigen family of proteins, which triggers autologous T-cell mediated immune responses. Integrative genomic analysis in diffuse large B-cell lymphoma (DLBCL) uncovered recurrent and highly focal deletions of 22q11.22 including the PRAME gene, which were associated with poor outcome. PRAME-deleted...
Article
Metastatic progression is the main cause of death in cancer patients, whereas the underlying genomic mechanisms driving metastasis remain largely unknown. Here, we assembled MSK-MET, a pan-cancer cohort of over 25,000 patients with metastatic diseases. By analyzing genomic and clinical data from this cohort, we identified associations between genom...
Article
Background: Artificial intelligence (AI) applications for cancer imaging conceptually begin with automated tumor detection, which can provide the foundation for downstream AI tasks. However, supervised training requires many image annotations, and performing dedicated post hoc image labeling is burdensome and costly. Purpose: To investigate whether c...
Preprint
Full-text available
Chromosomal instability (CIN) and epigenetic alterations are characteristics of advanced and metastatic cancers [1-4], yet whether they are mechanistically linked is unknown. Here we show that missegregation of mitotic chromosomes, their sequestration in micronuclei [5, 6], and subsequent micronuclear envelope rupture [7] profoundly disrupt normal...
Article
Follicular lymphoma (FL) is an indolent lymphoma of mature B-cells but may transform to a more aggressive histology, most commonly diffuse large B cell lymphoma. Recurrent mutations associated with transformation have been identified; however, biological predictors to guide initial therapy have remained elusive. We hypothesized that clonal heteroge...
Article
Full-text available
Significance: Our study provides detailed functional and spatial characteristics of immune cells in the LR-CHL microenvironment at single-cell resolution. We describe detailed T cell subset definitions and, importantly, identified a unique CD4⁺PD-1⁺CXCL13⁺CXCR5⁻ TFH-like subset that surrounds HRS cells, appears in close proximity to CXCR5⁺ B c...
Article
Advances in quantitative biomarker development have accelerated new forms of data-driven insights for patients with cancer. However, most approaches are limited to a single mode of data, leaving integrated approaches across modalities relatively underdeveloped. Multimodal integration of advanced molecular diagnostics, radiological and histological...
Preprint
Full-text available
High-grade serous ovarian cancer (HGSOC) is an archetypal cancer of genomic instability patterned by distinct mutational processes, intratumoral heterogeneity and intraperitoneal spread. We investigated determinants of immune recognition and evasion in HGSOC to elucidate co-evolutionary processes underlying malignant progression and tumor immunity...
Article
Full-text available
Progress in defining genomic fitness landscapes in cancer, especially those defined by copy number alterations (CNAs), has been impeded by lack of time-series single-cell sampling of polyclonal populations and temporal statistical models [1–7]. Here we generated 42,000 genomes from multi-year time-series single-cell whole-genome sequencing of breast e...
Preprint
Full-text available
Progression to metastatic disease remains the main cause of cancer death. Yet, the underlying genomic mechanisms driving metastasis remain largely unknown. Here, we present MSK-MET, an integrated pan-cancer cohort of tumor genomic and clinical outcome data from more than 25,000 patients. We analyzed this dataset to identify associations between tum...
Preprint
Full-text available
With the aim of producing a 3D representation of tumours, IMAXT uses a large variety of modalities in order to acquire tumour samples and produce a map of every cell in the tumour and its host environment. With the large volume and variety of data produced in the project we develop automatic data workflows and analysis pipelines and introduce a res...
Preprint
Full-text available
Cancer genomes exhibit extensive chromosomal copy number changes and structural variation, yet how allele specific alterations drive cancer genome evolution remains unclear. Here, through application of a new computational approach we report allele specific copy number alterations in 11,097 single cell whole genomes from genetically engineered mamm...
Preprint
Structural genome alterations are determinants of cancer ontogeny and therapeutic response. While bulk genome sequencing has enabled delineation of structural variation (SV) mutational processes which generate patterns of DNA damage, we have little understanding of how these processes lead to cell-to-cell variations which underlie selection and rat...
Article
Immune checkpoint blockade (ICB) has been a remarkable clinical advance for cancer; however, the majority of patients do not respond to ICB therapy. We show that metastatic disease in the pleural and peritoneal cavities is associated with poor clinical outcomes after ICB therapy. Cavity-resident macrophages express high levels of Tim-4, a receptor...
Article
Primary mediastinal large B-cell lymphoma (PMBL) is a type of aggressive B-cell lymphoma that typically affects young adults, characterized by presence of a bulky anterior mediastinal mass. Lymphomas with gene expression features of PMBL have been described in non-mediastinal sites, raising questions about how these tumors should be classified. Her...
Article
Purpose: The World Trade Center (WTC) attack of September 11, 2001 created an unprecedented environmental exposure to known and suspected carcinogens. High incidence of multiple myeloma (MM) and precursor conditions has been reported among first responders to the WTC disaster. To expand on our prior screening studies, and to characterize the genom...
Article
Background: With the goal of translating biological discovery into clinical actionability, deciphering crosstalk in the cellular ecosystem of the tumor microenvironment (TME) has emerged as a research focus. Although comparatively little is known about the immune biology of diffuse large B-cell lymphoma (DLBCL), as reflected in clonal selection of...
Article
Purpose: The World Trade Center (WTC) attack of September 11, 2001 created an unprecedented environmental exposure to known and suspected carcinogens. A higher incidence of multiple myeloma (MM) and precursor disease has been reported among first responders to the WTC disaster compared to the unexposed population (Landgren, 2018). To expand on pri...
Article
Introduction: Classic Hodgkin lymphoma (CHL) features a unique crosstalk between malignant cells and different types of normal immune cells in the tumor-microenvironment (TME). On the basis of histomorphologic and immunophenotypic features of the malignant Hodgkin and Reed-Sternberg (HRS) cells and infiltrating immune cells, four histological subty...
Article
Full-text available
Background: Malignant pleural effusions and peritoneal carcinomatosis are associated with poor outcomes in patients with cancer [1–3]. Macrophages in these serous body cavities express the phosphatidylserine receptor Tim-4 [4–8]. Prior reports demonstrated that Tim-4 abrogation is associated with improved anti-tumor activity [9–11]. Whether macrophages ex...
Preprint
The genomic complexity and heterogeneity of high-grade serous ovarian cancer (HGSOC) have hampered the realisation of successful therapies, and effective personalised treatment is an unmet clinical need. Here we show that primary HGSOC spheroid models can be used to predict drug response and use them to demonstrate that somatic copy number alteration...
Article
Full-text available
We present Epiclomal, a probabilistic clustering method arising from a hierarchical mixture model to simultaneously cluster sparse single-cell DNA methylation data and impute missing values. Using synthetic and published single-cell CpG datasets, we show that Epiclomal outperforms non-probabilistic methods and can handle the inherent missing data c...
Article
Endometrial carcinoma, the most common gynaecological cancer, develops from endometrial epithelium, which is composed of secretory and ciliated cells. Pathologic classification is unreliable and there is a need for prognostic tools. We used single cell sequencing to study organoid model systems derived from normal endometrium to discover...
Conference Paper
Endometrial epithelium gives rise to both endometrial and ovarian cancers (of clear-cell and endometrioid subtypes), the latter arising from ectopic endometrium (endometriosis). Endometrial epithelium comprises mainly secretory cells, with a minor ciliated cell population. Due to their scarcity, little is known about the biology or function of endo...
Article
Full-text available
Subsets of breast tumors present major clinical challenges, including triple-negative, metastatic/recurrent disease and rare histologies. Here, we developed 37 patient-derived xenografts (PDX) from these difficult-to-treat cancers to interrogate their molecular composition and functional biology. Whole-genome and transcriptome sequencing and revers...
Preprint
Full-text available
Tumour fitness landscapes underpin selection in cancer, impacting etiology, evolution and response to treatment. Progress in defining fitness landscapes has been impeded by a lack of timeseries perturbation experiments over realistic intervals at single cell resolution. We studied the nature of clonal dynamics induced by genetic and pharmacologic p...
Preprint
A new generation of scalable single-cell whole genome sequencing (scWGS) methods allows unprecedented high-resolution measurement of the evolutionary dynamics of cancer cell populations. Phylogenetic reconstruction is central to identifying sub-populations and distinguishing mutational processes. The ability to sequence tens of thousands of singl...
Article
Full-text available
The functional consequences of somatic non-coding mutations in ovarian cancer (OC) are unknown. To identify regulatory elements (RE) and genes perturbed by acquired non-coding variants, here we establish epigenomic and transcriptomic landscapes of primary OCs using H3K27ac ChIP-seq and RNA-seq, and then integrate these with whole genome sequencing...
Article
Full-text available
Transmembrane protein 30A (TMEM30A) maintains the asymmetric distribution of phosphatidylserine, an integral component of the cell membrane and ‘eat-me’ signal recognized by macrophages. Integrative genomic and transcriptomic analysis of diffuse large B-cell lymphoma (DLBCL) from the British Columbia population-based registry uncovered recurrent bi...
Conference Paper
Introduction: Breast cancer is the most common cancer in women, and triple-negative breast cancer (TNBC) is its most aggressive subtype, with a high rate of recurrence and metastasis. The malignant cells that make up a primary tumor are heterogeneous, and during disease progression selection of tumor cells occurs as a means to adapt and...
Article
Full-text available
The recent boom in microfluidics and combinatorial indexing strategies, combined with low sequencing costs, has empowered single-cell sequencing technology. Thousands, or even millions, of cells analyzed in a single experiment amount to a data revolution in single-cell biology and pose unique data science problems. Here, we outline eleven challenges...