Tzu-Ting Wei’s research while affiliated with Charité Universitätsmedizin Berlin and other places


Publications (5)


Cancer cell calling based on transcriptome information. (A) Anatomical locations and mutational patterns of the samples. C, cecum; A, ascending colon; D, descending colon; S, sigmoid; R, rectum. Mutations (in brackets) A: APC, B: BRAF, C: CTNNB1, K: KRAS, P: TP53. (B) UMAP of all 73,294 cells, colored by three major cell type compartments: Epithelial (blue), immune (orange), and stromal cells (green). (C, D, F) UMAPs of epithelial cells only. (C) Color code by the sample origin and the microsatellite status. Cancer sample (MSI), red; cancer sample (MSS), yellow; normal sample, gray. (D) Color code for cancer sample cells by iCMS assignment; iCMS2 (yellow), iCMS3 (pink), or normal (blue), normal samples (not scored, gray). (F) Color code of cancer sample cells by inferCNV. Copy number status aberrant (CNA; orange), normal (CNN; blue), or not applicable (NA; purple) when the clones in the sample are not differentiable, normal samples (not scored, gray). (E,G) Stacked bar plots summarizing iCMS and inferCNV information, respectively, by cancer sample. (H) Quantification of the agreement between iCMS and inferCNV calls as an upset plot, color‐coded by patient, as indicated.
CCISM identifies cancer cells with somatic single nucleotide variants. (A) Scatterplot of the number of SNVs in whole genome sequencing data and the average number of expressed SNVs per cell in single‐cell RNA sequencing data colored by patient. (B) CCISM's workflow diagram from input data (scRNAseq and bulk DNAseq data), allele count calculation by cellSNP‐lite to CCISM modelling. Benchmark simulations can be generated from input counts (blue). (C) Boxplots of tool performances in simulation data regarding runtime in seconds (right), false positive rate (FPR, left), and true positive rate (TPR, middle) between CCISM (green), cardelinoEM (orange), and vireo (pear). (D) Line plots comparing model performances (CCISM, green circle; cardelinoEM, orange cross; vireo, pear star) as a function of tumor fraction (upper) and mean number of expressed SNVs per cell (lower). (E) Line plot of CCISM's performance (TPR) in single‐cell transcriptomes subsampled to five different mean numbers of reads per cell, color‐coded by patient.
Cancer cell calling based on genomic information. (A,B) UMAPs of epithelial cells. (A) Color‐code by CCISM calls (cancer cell, orange; normal cell, blue). Insets given for inferCNV and iCMS calls. Cells from normal samples are given in gray. (B) Color‐code by Numbat call (cancer cell, orange; normal cell, blue). Cells from normal samples are given in gray. (C) Venn diagram of the intersections of cancer cell calls from iCMS (pink), inferCNV (yellow), CCISM (green), and Numbat (blue). 5,637 cells are called as normal by all four tools. (D) Intersections of cancer cell calls from CCISM and Numbat colored by microsatellite status of the sample (MSI, red; MSS, yellow), given as an upset plot. (E) Heatmaps of the cancer cell scores (0.0, blue; 0.5, dark gray; 1.0, orange) from Numbat (upper) and CCISM (lower) across cancer samples. (F) Decision matrix for consensus cancer cell calls, based on CCISM, Numbat and microsatellite status. (G) Stacked barplot of the consensus derived from CCISM and Numbat (cancer cell, orange; normal cell, blue; undefined, purple). (H) UMAP of the consensus calls, color code as in G, excluding cells with an “unclear” call.
Consensus calls identify a cluster of genomically normal cells unique to left‐sided cancer samples. (A) UMAP of epithelial cells, colored by Louvain clustering. (B) Stacked bar plot of consensus calls across 20 Louvain clusters (cancer sample and genomically cancer, orange; cancer sample and genomically normal, blue; normal sample, grey). (C) Bar plot of cluster homogeneity scores for cancer cell calls by different methods as indicated. (D) Relative fractions of genomically normal cells in cluster 9, by cancer location (see Figure 1A). P‐value from mixed‐effects binomial model, *** P < .001. (E) Pie chart of the epithelial cell types in Louvain cluster 9, as indicated. Color code: Enterocyte (dark green), Enterocyte progenitor (light green), Immature Goblet (light purple), Stem/TA (dark blue), and Stem (light blue). (F) Dot plot of the top 10 marker genes for Louvain cluster 9. The color of each dot represents the mean normalized expression of the gene, and the size of the dot shows the fraction of cells expressing the gene. (G) UMAP colored by expression of PLA2G2A, the top marker gene specific to Louvain cluster 9.
Cell states and developmental trajectories are altered in genomically normal cells of cancer samples compared to normal colon epithelium. (A) Stacked bar plots of epithelial cell types in normal samples (upper) and genomically normal cell populations (lower), including Enterocyte (dark green), Enterocyte progenitor (light green), Goblet (dark purple), Immature Goblet (light purple), Tuft (yellow), Stem/TA (dark blue), and Stem (light blue). (B) Diffusion map with additional histograms of first and second dimensions/axes colored by epithelial cell types. Color code as in A, with the addition of genomically cancer cells (red). (C) Stacked bar plots of the epithelial cell type compositions across binned diffusion map dimension 2 in normal sample and genomically normal cells, as indicated. (D) UMAP colored by Cytotrace developmental pseudotime, from early (0, yellow) to late (1, dark purple) in pseudotime space. (E,F) Violin plots of Cytotrace pseudotime across epithelial cell types and consensus call groups, as indicated.


High‐confidence calling of normal epithelial cells allows identification of a novel stem‐like cell state in the colorectal cancer microenvironment
  • Article
  • Full-text available

July 2024 · 45 Reads · 2 Citations

Tzu‐Ting Wei · [...]

Single‐cell analyses can be confounded by assigning unrelated groups of cells to common developmental trajectories. For instance, cancer cells and admixed normal epithelial cells could adopt similar cell states thus complicating analyses of their developmental potential. Here, we develop and benchmark CCISM (for Cancer Cell Identification using Somatic Mutations) to exploit genomic single nucleotide variants for the disambiguation of cancer cells from genomically normal non‐cancer cells in single‐cell data. We find that our method and others based on gene expression or allelic imbalances identify overlapping sets of colorectal cancer versus normal colon epithelial cells, depending on molecular characteristics of individual cancers. Further, we define consensus cell identities of normal and cancer epithelial cells with higher transcriptome cluster homogeneity than those derived using existing tools. Using the consensus identities, we identify significant shifts of cell state distributions in genomically normal epithelial cells developing in the cancer microenvironment, with immature states increased at the expense of terminal differentiation throughout the colon, and a novel stem‐like cell state arising in the left colon. Trajectory analyses show that the new cell state extends the pseudo‐time range of normal colon stem‐like cells in a cancer context. We identify cancer‐associated fibroblasts as sources of WNT and BMP ligands potentially contributing to increased plasticity of stem cells in the cancer microenvironment. Our analyses advocate careful interpretation of cell heterogeneity and plasticity in the cancer context and the consideration of genomic information in addition to gene expression data when possible.
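The abstract above describes using somatic single nucleotide variants to separate cancer cells from genomically normal cells. As a rough illustration of that general idea, and not the published CCISM model, the sketch below fits a two-component binomial mixture to per-cell allele counts at known somatic SNV positions. Everything in it is an assumption made for illustration: the function name `cancer_posterior`, the fixed background rate `theta_normal`, the starting values, and the simple EM updates.

```python
import numpy as np

def cancer_posterior(alt, tot, theta_normal=0.01, n_iter=50):
    """Posterior probability that each cell is a cancer cell.

    alt: per-cell count of reads carrying a somatic variant allele
    tot: per-cell count of all reads covering somatic SNV positions
    theta_normal: assumed variant-read rate in normal cells (errors only)
    Illustrative two-component binomial mixture, not the CCISM model.
    """
    alt = np.asarray(alt, dtype=float)
    tot = np.asarray(tot, dtype=float)
    theta_cancer = 0.3  # initial guess for the variant-read rate in cancer cells
    pi = 0.5            # initial guess for the fraction of cancer cells
    post = np.full(alt.shape, pi)
    for _ in range(n_iter):
        # E-step: binomial log-likelihoods (the binomial coefficient cancels)
        log_c = (np.log(pi) + alt * np.log(theta_cancer)
                 + (tot - alt) * np.log1p(-theta_cancer))
        log_n = (np.log1p(-pi) + alt * np.log(theta_normal)
                 + (tot - alt) * np.log1p(-theta_normal))
        diff = np.clip(log_n - log_c, -700.0, 700.0)  # avoid overflow in exp
        post = 1.0 / (1.0 + np.exp(diff))
        # M-step: update the mixing fraction and the cancer variant-read rate
        pi = float(np.clip(post.mean(), 1e-6, 1 - 1e-6))
        denom = max(float((post * tot).sum()), 1e-12)
        theta_cancer = float(np.clip((post * alt).sum() / denom,
                                     theta_normal + 1e-3, 1 - 1e-6))
    return post

# Toy input: two cells without variant reads, two cells with many variant reads
posterior = cancer_posterior(alt=[0, 0, 5, 4], tot=[10, 8, 10, 9])
print(np.round(posterior, 3))
```

Cells with no coverage at any SNV position simply receive the prior `pi`, which mirrors the practical limitation that a call is only as informative as the number of expressed SNVs per cell.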


Overview of study subjects and data analysis. (Left) Repartition of the subjects into clinical categories and smoking status. For each category, we show the number of subjects for which RNA-seq (on nasal and bronchial samples) and array-based blood genotyping were performed. Nasal samples from the AEGIS cohort were used as a validation set. (Right) Schematic of the different analyses conducted to stratify patients and identify dysregulated pathways among clinic patients
Smoke injury dynamics. a Plot showing the change of reversibility dynamics for the 749 response genes in the healthy volunteer (left) and clinic (right) donor groups (genes classified as unaffected by smoking in both donor groups were removed). Color bars represent the number of genes in each reversibility class (blue = rapidly reversible, yellow = slowly reversible, red = irreversible, green = cessation associated, grey = unaffected by smoking). b Normalized gene expression over smoking status for 4 exemplar response genes with different post-cessation dynamics in the clinic and healthy groups, with linetype and shape representing donor status (plain line = clinic group, dashed line = healthy volunteer) and colors representing the genes’ assigned reversibility classes (same color code as panel a). See also Fig. S1 for schematic examples
Disease status prediction based on response genes. a, b Risk score distribution for the population test (a) and the clinic test (b) predicted from the clinical variables and the expression of the response genes using a penalized regression (see the ‘Methods’ section). The risk distributions are presented separately for healthy volunteers (green), clinic patients without cancer (orange) and clinic patients with cancer (purple). c, d ROC curves for the population (c) and clinic (d) scores. For each case, we present the ROC curve for the model trained on clinical data (triangles) or on gene expression and clinical data (squares). Each curve is an average obtained across 100 cross-validation (CV) experiments and the grey area surrounding the curve gives the standard error. The color of the curve represents the test threshold corresponding to the represented sensitivity/false-positive rate compromise. (Inset) Area under the ROC curve, in 100 CV rounds, for a clinical-only model (red), the model constructed on the response genes (blue) and a model constructed on a combination of clinical information and response genes (green) for the population (c) and clinic (d) classifiers. P values given above each box are computed using a 2-sample t-test. e The population and clinic classifiers applied to nasal samples from the AEGIS cohort
Pathway analysis and contribution to risk. a Comparison of geneset metascore (average vst-normalized gene expression; see the ‘Methods’ section) over smoking status for 4 immune-related GO terms in healthy (dashed line, triangles) and clinic subjects (plain line, round dots). b Correlation between the population or clinic risk score and geneset metascore for the 8 gene sets representing biological functions altered by smoking, shown separately for current and former smokers (> 12 months); Spearman correlation values are reported (blue = positive correlation, red = negative correlation), as well as the associated p-values (*P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001)
Genotype background influences lung cancer risk. a Combined environmental and genetic effect on the expression of the FXYD5 gene in nasal tissues. For each nasal sample, we present the expression level of the gene FXYD5 separately for never (pink), former (green) and current (blue) smokers. Samples are further stratified depending on the genotype of the subject at the 19:35660670:G:A locus (Ref/Ref: homozygous reference; Ref/Alt: heterozygous; Alt/Alt: homozygous alternative). The p-value gives the significance level of an interaction effect of the smoking status and the genotype at 19:35660670:G:A on the expression of the FXYD5 gene (see the ‘Methods’ section). GWAS enrichment analysis: b Network representation of the 4 bronchial regulons enriched in GWAS genes. The 4 TFs are shown as squares and their target genes in the bronchial network as circles. The colour of the nodes indicates whether the gene/TF is a smoke injury risk gene (blue), a gene that co-localizes with a GWAS hit (i.e. no threshold on eQTL significance) (red) or both (green). The level of overrepresentation for genes in the network of those TFs can be found in Table 1. c Activity level of each of the 4 TFs in nasal tissue, depending on the disease status of the patient (green, healthy volunteer; orange, clinic patient without cancer; purple, clinic patient with cancer). Stars represent the significance of a two-sample t-test (ns, p > 0.05; *p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001; ****p ≤ 0.0001)
Smoking-associated gene expression alterations in nasal epithelium reveal immune impairment linked to lung cancer risk

April 2024 · 45 Reads · 4 Citations

Genome Medicine

Background: Lung cancer is the leading cause of cancer-related death in the world. In contrast to many other cancers, a direct connection to modifiable lifestyle risk in the form of tobacco smoke has long been established. More than 50% of all smoking-related lung cancers occur in former smokers, 40% of which occur more than 15 years after smoking cessation. Despite extensive research, the molecular processes for persistent lung cancer risk remain unclear. We thus set out to examine whether risk stratification in the clinic and in the general population can be improved upon by the addition of genetic data and to explore the mechanisms of the persisting risk in former smokers. Methods: We analysed transcriptomic data from accessible airway tissues of 487 subjects, including healthy volunteers and clinic patients of different smoking statuses. We developed a computational model to assess smoking-associated gene expression changes and their reversibility after smoking is stopped, comparing healthy subjects to clinic patients with and without lung cancer. Results: We find persistent smoking-associated immune alterations to be a hallmark of the clinic patients. Integrating previous GWAS data using a transcriptional network approach, we demonstrate that the same immune- and interferon-related pathways are strongly enriched for genes linked to known genetic risk factors, demonstrating a causal relationship between immune alteration and lung cancer risk. Finally, we used accessible airway transcriptomic data to derive a non-invasive lung cancer risk classifier. Conclusions: Our results provide initial evidence for germline-mediated personalized smoke injury response and risk in the general population, with potential implications for managing long-term lung cancer incidence and mortality.
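The figure captions for this article define a geneset metascore as the average vst-normalized expression of the genes in a set, and report its Spearman correlation with the risk score. A minimal sketch of those two computations follows, assuming a genes-by-samples matrix that is already vst-normalized. The function names, gene labels, and toy numbers are hypothetical and do not come from the study.

```python
import numpy as np

def geneset_metascore(expr, gene_names, geneset):
    """Average normalized expression of the genes in `geneset`, per sample.

    expr: array of shape (n_genes, n_samples), assumed already normalized
    gene_names: gene label for each row of `expr`
    """
    expr = np.asarray(expr, dtype=float)
    rows = [i for i, gene in enumerate(gene_names) if gene in geneset]
    if not rows:
        raise ValueError("no gene of the set is present in the matrix")
    return expr[rows].mean(axis=0)

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    values = np.asarray(values, dtype=float).ravel()
    _, inverse, counts = np.unique(values, return_inverse=True,
                                   return_counts=True)
    last_rank = np.cumsum(counts)  # highest rank within each tie group
    return (last_rank - (counts - 1) / 2.0)[inverse]

def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])

# Toy data: 3 hypothetical genes measured in 3 samples
expr = [[1.0, 2.0, 3.0],
        [3.0, 4.0, 5.0],
        [10.0, 0.0, 0.0]]
genes = ["GENE_A", "GENE_B", "GENE_C"]
score = geneset_metascore(expr, genes, {"GENE_A", "GENE_B"})
risk = [0.1, 0.5, 0.9]  # hypothetical per-sample risk scores
print(score, spearman(score, risk))
```

The sketch returns only the correlation coefficient; the p-values reported in the caption would need a separate significance test.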


Figure 1. Cancer cell calling based on transcriptome information. A Anatomical locations and mutational patterns of the samples. C: cecum, A: ascending colon, D: descending colon, S: sigmoid, and R: rectum. Mutations
High-confidence calling of normal epithelial cells allows identification of a novel stem-like cell state in the colorectal cancer microenvironment

February 2024 · 87 Reads

Single-cell analyses can be confounded by assigning unrelated groups of cells to common developmental trajectories. For instance, cancer cells and admixed normal epithelial cells could potentially adopt similar cell states thus complicating analyses of their developmental potential. Here, we develop and benchmark CCISM (for Cancer Cell Identification using Somatic Mutations) to exploit genomic single nucleotide variants for the disambiguation of cancer cells from genomically normal non-cancer epithelial cells in single-cell data. In colorectal cancer datasets, we find that our method and others based on gene expression or allelic imbalances identify overlapping sets of cancer versus normal epithelial cells, depending on molecular characteristics of individual cancers. Further, we define consensus cell identities of normal and cancer epithelial cells with higher transcriptome cluster homogeneity than those derived using existing tools. Using the consensus identities, we identify significant shifts of cell state distributions in genomically normal epithelial cells developing in the cancer microenvironment, with immature states increased at the expense of terminal differentiation throughout the colon, and a novel stem-like cell state arising in the left colon. Trajectory analyses show that the new cell state extends the pseudo-time range of normal colon stem-like cells in a cancer context. We identify cancer-associated fibroblasts as sources of WNT and BMP ligands potentially contributing to increased plasticity of stem cells in the cancer microenvironment. Our analyses advocate careful interpretation of cell heterogeneity and plasticity in the cancer context and the consideration of genomic information in addition to gene expression data when possible.

Novelty and Impact: Single-cell analyses have become standard to assess cell heterogeneity and developmental hierarchies in cancer tissues. However, these datasets are complex and contain cancer and non-cancer lineage cells. Here, we develop and systematically benchmark tools to distinguish between cancer and non-cancer single-cell transcriptomes, based on gene expression or different levels of genomic information. We provide strategies to combine results of different tools into consensus calls tailored to the biology and genetic characteristics of the individual cancer.


Smoking-dependent expression alterations in nasal epithelium reveal immune impairment linked to germline variation and lung cancer risk

November 2021 · 49 Reads

Lung cancer is the leading cause of cancer-related death in the world. In contrast to many other cancers, a direct connection to lifestyle risk in the form of cigarette smoke has long been established. More than 50% of all smoking-related lung cancers occur in former smokers, often many years after smoking cessation. Despite extensive research, the molecular processes for persistent lung cancer risk are unclear. CT screening of current and former smokers has been shown to reduce lung cancer mortality by up to 26%. To examine whether clinical risk stratification can be improved upon by the addition of genetic data, and to explore the mechanisms of the persisting risk in former smokers, we have analyzed transcriptomic data from accessible airway tissues of 487 subjects. We developed a model to assess smoking associated gene expression changes and their reversibility after smoking is stopped, in both healthy subjects and clinic patients. We find persistent smoking-associated immune alterations to be a hallmark of the clinic patients. Integrating previous GWAS data using a transcriptional network approach, we demonstrate that the same immune and interferon related pathways are strongly enriched for genes linked to known genetic risk factors, demonstrating a causal relationship between immune alteration and lung cancer risk. Finally, we used accessible airway transcriptomic data to derive a non-invasive lung cancer risk classifier. Our results provide initial evidence for germline-mediated personalised smoke injury response and risk in the general population, with potential implications for managing long-term lung cancer incidence and mortality.


Figure 4. MAPK activity is linked to CRC cell differentiation states. A Gene expression of LGR5 and EPHB2, along activity gradients of LGR5-ISC or MAPK target gene signatures. B Cell state distribution of SCN-aberrant CRC cells along gradients of LGR5-ISC or MAPK transcriptional signatures, as in A. C Cell state distribution of SCN-aberrant CRC tumor cells along MAPK signature activity, as in B, per tumor. Correlation between cell state distributions and MAPK target gene was calculated using Pearson's r. For correlations and significances, see Table EV6. Color code as in Fig 4B. D UMAP representations of single-cell transcriptomes derived from P009T or P013T organoids, after MAPK blockade using MEK or combined MEK and EGFR inhibition. Color codes are treatment conditions or expression strength of signature, as indicated. Dashed line in P013T UMAP roughly separates control (DMSO) and MEK/EGFR inhibitor-treated cells.
Mitogen‐activated protein kinase activity drives cell trajectories in colorectal cancer

August 2021 · 157 Reads · 72 Citations

EMBO Molecular Medicine

In colorectal cancer, oncogenic mutations transform a hierarchically organized and homeostatic epithelium into invasive cancer tissue lacking visible organization. We sought to define transcriptional states of colorectal cancer cells and signals controlling their development by performing single-cell transcriptome analysis of tumors and matched non-cancerous tissues of twelve colorectal cancer patients. We defined patient-overarching colorectal cancer cell clusters characterized by differential activities of oncogenic signaling pathways such as mitogen-activated protein kinase and oncogenic traits such as replication stress. RNA metabolic labeling and assessment of RNA velocity in patient-derived organoids revealed developmental trajectories of colorectal cancer cells organized along a mitogen-activated protein kinase activity gradient. This was in contrast to normal colon organoid cells developing along graded Wnt activity. Experimental targeting of EGFR-BRAF-MEK in cancer organoids affected signaling and gene expression contingent on predictive KRAS/BRAF mutations and induced cell plasticity overriding default developmental trajectories. Our results highlight directional cancer cell development as a driver of non-genetic cancer cell heterogeneity and re-routing of trajectories as a response to targeted therapy.

Citations (2)


... Lung cancer is one of the most prevalent cancers globally, with high death rates in both genders. The majority of lung cancers are attributed to non-small cell lung cancer (NSCLC), causing the most cancer-related deaths and ranking as the second most prevalent cancer globally [1][2][3]. In recent years, despite great progress in multidisciplinary treatment including surgery, radiotherapy, chemotherapy and immunotherapy, the prognosis of patients with NSCLC remains unsatisfactory. ...

Reference:

Prognostic and clinicopathological significance of tertiary lymphoid structure in non-small cell lung cancer: a systematic review and meta-analysis
Smoking-associated gene expression alterations in nasal epithelium reveal immune impairment linked to lung cancer risk

Genome Medicine

... Next, we assessed differences in signaling pathway activities of colon cancer cell subpopulations with differential GPA33 mRNA expression, using single-cell transcriptome sequencing data of 12 samples of freshly resected colon cancers [16]. We grouped cancer cells into GPA33-high and -low clusters and looked for differences in pathway activities through assessment of expression of specific target gene signatures (Fig. 1F). ...

Mitogen‐activated protein kinase activity drives cell trajectories in colorectal cancer

EMBO Molecular Medicine