Mathias Wilhelm

Technical University of Munich | TUM · TUM School of Life Sciences

Professor

About

133
Publications
29,609
Reads
8,115
Citations
Introduction
We investigate how mass spectrometric data can be better understood, made usable for the broad scientific community and how findings from it can be translated into research and clinical practice.
Additional affiliations
March 2021 - present
Technical University of Munich
Position
  • Professor
June 2017 - February 2021
Technical University of Munich
Position
  • Group Leader
September 2012 - May 2017
Technical University of Munich
Position
  • PhD Student
Education
August 2012 - April 2017
Technical University of Munich
Field of study
  • Computational Mass Spectrometry
October 2009 - October 2011
Bielefeld University
Field of study
  • Informatics in the Natural Sciences

Publications (133)
Article
Proteomes are characterized by large protein-abundance differences, cell-type- and time-dependent expression patterns and post-translational modifications, all of which carry biological information that is not accessible by genomics or transcriptomics. Here we present a mass-spectrometry-based draft of the human proteome and a public, high-performa...
Article
Full-text available
Calculating the number of confidently identified proteins and estimating false discovery rate (FDR) is a challenge when analyzing very large proteomic datasets such as entire human proteomes. Biological and technical heterogeneity in proteomic experiments further add to the challenge and there are strong differences in opinion regarding the concept...
Article
Full-text available
ProteomicsDB (https://www.ProteomicsDB.org) is a protein-centric in-memory database for the exploration of large collections of quantitative mass spectrometry-based proteomics data. ProteomicsDB was first released in 2014 to enable the interactive exploration of the first draft of the human proteome. To date, it contains quantitative data from 78 p...
Article
Full-text available
An atlas for drug interactions. Kinase inhibitors are an important class of drugs that block certain enzymes involved in diseases such as cancer and inflammatory disorders. There are hundreds of kinases within the human body, so knowing the kinase “target” of each drug is essential for developing successful treatment strategies. Sometimes clinical t...
Article
Full-text available
In mass-spectrometry-based proteomics, the identification and quantification of peptides and proteins heavily rely on sequence database searching or spectral library matching. The lack of accurate predictive models for fragment ion intensities impairs the realization of the full potential of these approaches. Here, we extended the ProteomeTools syn...
Article
Full-text available
Post-translational modifications (PTMs) play pivotal roles in regulating cellular signaling, fine-tuning protein function, and orchestrating complex biological processes. Despite their importance, the lack of comprehensive tools for studying PTMs from a pathway-centric perspective has limited our ability to understand how PTMs modulate cellular pat...
Preprint
Full-text available
It has been shown that integrating peptide property predictions such as fragment intensity into the scoring process of peptide spectrum match can greatly increase the number of confidently identified peptides compared to using traditional scoring methods. Here, we introduce Prosit-XL, a robust and accurate fragment intensity predictor covering the...
Article
Full-text available
The human body contains trillions of cells, classified into specific cell types, with diverse morphologies and functions. In addition, cells of the same type can assume different states within an individual's body during their lifetime. Understanding the complexities of the proteome in the context of a human organism and its many potential states i...
Preprint
Full-text available
Acute myeloid leukemia (AML) is an aggressive blood cancer with a poor prognosis. Although treatments like allogeneic hematopoietic stem cell transplantation and high-dose chemotherapy can potentially cure some younger patients, challenges such as relapse and treatment-related toxicities remain significant. Combination therapy has been a cornerston...
Preprint
Full-text available
Identifying detectable peptides, known as flyers, is key in mass spectrometry-based proteomics. Peptide detectability is strongly related to the peptide sequence and its resulting physicochemical properties. Moreover, the high variability in MS data, particularly in peptide detectability and intensity across multiple analyses and samples, makes t...
Preprint
Citrullination is a critical yet understudied post-translational modification (PTM) implicated in various biological processes. Exploring its role in health and disease requires a comprehensive understanding of the prevalence of this PTM at a proteome-wide scale. Although mass spectrometry has enabled the identification of citrullination sites in c...
Article
Mass-spectrometry-based proteomics has advanced with the integration of experimental and predicted spectral libraries, which have significantly improved peptide identification in complex search spaces. However, challenges persist in distinguishing some peptides with close retention times and nearly identical fragmentation patterns. In this study, w...
Article
Full-text available
Most heritable diseases are polygenic. To comprehend the underlying genetic architecture, it is crucial to discover the clinically relevant epistatic interactions (EIs) between genomic single nucleotide polymorphisms (SNPs) (1–3). Existing statistical computational methods for EI detection are mostly limited to pairs of SNPs due to the combinatoria...
Article
Alternative splicing is a major contributor to transcriptomic complexity, but the extent to which transcript isoforms are translated into stable, functional protein isoforms is unclear. Furthermore, detection of relatively scarce isoform-specific peptides is challenging, with many protein isoforms remaining uncharted due to technical limitations. R...
Article
Proteomics, the study of proteins within biological systems, has seen remarkable advancements in recent years, with protein isoform detection emerging as one of the next major frontiers. One of the primary challenges is achieving the necessary peptide and protein coverage to confidently differentiate isoforms as a result of the protein inference pr...
Article
Full-text available
Plant genomics plays a pivotal role in enhancing global food security and sustainability by offering innovative solutions for improving crop yield, disease resistance, and stress tolerance. As the number of sequenced genomes grows and the accuracy and contiguity of genome assemblies improve, structural annotation of plant genomes continues to be a...
Preprint
Full-text available
Recent developments in machine-learning (ML) and deep-learning (DL) have immense potential for applications in proteomics, such as generating spectral libraries, improving peptide identification, and optimizing targeted acquisition modes. Although new ML/DL models for various applications and peptide properties are frequently published, the rate at...
Article
Rescoring of peptide spectrum matches originating from database search engines enabled by peptide property predictors is exceeding the performance of peptide identification from traditional database search engines. In contrast to the peptide spectrum match scores calculated by traditional database search engines, rescoring peptide spectrum matches...
Preprint
Full-text available
Proteomic workflows generate vastly complex peptide mixtures that are analyzed by liquid chromatography-tandem mass spectrometry (LC-MS/MS), creating thousands of spectra, most of which are chimeric and contain fragment ions from more than one peptide. Because of differences in data acquisition strategies such as data-dependent (DDA), data-independ...
Article
Full-text available
Immunopeptidomics is crucial for immunotherapy and vaccine development. Because the generation of immunopeptides from their parent proteins does not adhere to clear-cut rules, rather than being able to use known digestion patterns, every possible protein subsequence within human leukocyte antigen (HLA) class-specific length restrictions needs to be...
Preprint
Full-text available
Alternative splicing is a major contributor to transcriptomic complexity, but the extent to which transcript isoforms are translated into stable, functional protein isoforms is unclear. Furthermore, detection of relatively scarce isoform-specific peptides is challenging, with many protein isoforms remaining uncharted due to technical limitations. R...
Chapter
Liquid chromatography-coupled mass spectrometry (LC-MS/MS) is the primary method to obtain direct evidence for the presentation of disease- or patient-specific human leukocyte antigen (HLA). However, compared to the analysis of tryptic peptides in proteomics, the analysis of HLA peptides still poses computational and statistical challenges. Recentl...
Article
Full-text available
Unlike for DNA and RNA, accurate and high-throughput sequencing methods for proteins are lacking, hindering the utility of proteomics in applications where the sequences are unknown including variant calling, neoepitope identification, and metaproteomics. We introduce Spectralis, a de novo peptide sequencing method for tandem mass spectrometry. Spe...
Article
Background: Late‐onset Alzheimer’s disease (AD) is the leading cause of dementia with neither cure nor a clearly understood disease trajectory. Previous studies have shown cross‐sectional and longitudinal associations of the blood lipidome and AD. We previously reported on significant sex differences in the associations of lipids with AD biomarkers...
Preprint
Full-text available
Most heritable diseases are polygenic. To comprehend the underlying genetic architecture, it is crucial to discover the clinically relevant epistatic interactions (EIs) between genomic single nucleotide polymorphisms (SNPs). Existing statistical computational methods for EI detection are mostly limited to pairs of SNPs due to the combinatorial expl...
Article
Full-text available
Medicinal chemistry has discovered thousands of potent protein and lipid kinase inhibitors. These may be developed into therapeutic drugs or chemical probes to study kinase biology. Because of polypharmacology, a large part of the human kinome currently lacks selective chemical probes. To discover such probes, we profiled 1,183 compounds from drug...
Article
Mass spectrometry coupled to liquid chromatography is one of the most powerful technologies for proteome quantification in biomedical samples. In peptide-centric workflows, protein mixtures are enzymatically digested to peptides prior to their analysis. However, proteome-wide quantification studies rarely identify all potential peptides for any given...
Article
Machine learning (ML) and deep learning (DL) models for peptide property prediction such as Prosit have enabled the creation of high quality in silico reference libraries. These libraries are used in various applications, ranging from data-independent acquisition (DIA) data analysis to data-driven rescoring of search engine results. Here, we presen...
Preprint
Full-text available
Post-translational modifications (PTMs) play a governing role in regulating cellular signaling, fine-tuning protein function, and orchestrating complex biological processes. Despite their importance, the lack of comprehensive tools for studying PTMs from a pathway-centric perspective has limited our ability to understand how PTMs modulate cellular...
Article
Sample multiplexed quantitative proteomics assays have proved to be a highly versatile means to assay molecular phenotypes. Yet, stochastic precursor selection and precursor coisolation can dramatically reduce the efficiency of data acquisition and quantitative accuracy. To address this, intelligent data acquisition (IDA) strategies have recently b...
Article
Full-text available
Systemic pan-tumor analyses may reveal the significance of common features implicated in cancer immunogenicity and patient survival. Here, we provide a comprehensive multi-omics data set for 32 patients across 25 tumor types for proteogenomic-based discovery of neoantigens. By using an optimized computational approach, we discover a large number of...
Preprint
Full-text available
Immunopeptidomics plays a crucial role in identifying targets for immunotherapy and vaccine development. Because the generation of immunopeptides from their parent proteins does not adhere to clear-cut rules, rather than being able to use known digestion patterns, every possible protein subsequence within HLA class-specific length restrictions need...
Article
Although most cancer drugs modulate the activities of cellular pathways by changing post-translational modifications (PTMs), surprisingly little is known regarding the extent and the time- and dose-response characteristics of drug-regulated PTMs. Here, we introduce a proteomic assay termed decryptM that quantifies drug-PTM modulation for thousands...
Preprint
Sample multiplexed quantitative proteomics has proved to be a highly versatile means to assay molecular phenotypes. Yet, stochastic precursor selection and precursor co-isolation can dramatically reduce the efficiency of data acquisition and quantitative accuracy. To address this, intelligent data acquisition (IDA) strategies have recently been dev...
Article
In recent years machine learning has made extensive progress in modeling many aspects of mass spectrometry data. We brought together proteomics data generators, repository managers, and machine learning experts in a workshop with the goals to evaluate and explore machine learning applications for realistic modeling of data from multidimensional mas...
Preprint
Full-text available
Unlike for DNA and RNA, accurate and high-throughput sequencing methods for proteins are lacking, hindering the utility of proteomics in applications where the sequences are unknown including variant calling, neoepitope identification, and metaproteomics. We introduce Spectralis, a new de novo peptide sequencing method for tandem mass spectrometry....
Conference Paper
Full-text available
Proteomics is the interdisciplinary field focusing on the large-scale study of proteins. Proteins essentially organize and execute all functions within organisms. Today, the bottom-up analysis approach is the most commonly used workflow, where proteins are digested into peptides and subsequently analyzed using Tandem Mass Spectrometry (MS/MS). MS-b...
Article
Full-text available
Estimating false discovery rates (FDRs) of protein identification continues to be an important topic in mass spectrometry-based proteomics, particularly when analyzing very large data sets. One performant method for this purpose is the Picked Protein FDR approach which is based on a target-decoy competition strategy on the protein level that ensure...
Preprint
Full-text available
Systemic pan-tumor analyses may reveal the significance of common features implicated in cancer immunogenicity and patient survival. Here, we provide a comprehensive multi-omics data set for 32 patients across 25 tumor types by combining proteogenomics with phenotypic and functional analyses. By using an optimized computational approach, we discove...
Article
Full-text available
Drugs that target histone deacetylase (HDAC) entered the pharmacopoeia in the 2000s. However, some enigmatic phenotypes suggest off-target engagement. Here, we developed a quantitative chemical proteomics assay using immobilized HDAC inhibitors and mass spectrometry that we deployed to establish the target landscape of 53 drugs. The assay covers 9...
Article
Full-text available
The laboratory mouse ranks among the most important experimental systems for biomedical research and molecular reference maps of such models are essential informational tools. Here, we present a quantitative draft of the mouse proteome and phosphoproteome constructed from 41 healthy tissues and several lines of analyses exemplify which insights can...
Article
Isobaric labeling increases the throughput of proteomics by enabling the parallel identification and quantification of peptides and proteins. Over the past decades, a variety of isobaric tags have been developed allowing the multiplexed analysis of up to 18 samples. However, experiments utilizing such tags often exhibit reduced identification rates...
Article
The prediction of fragment ion intensities and retention time of peptides has gained significant attention over the past few years. However, the progress shown in the accurate prediction of such properties focused primarily on unlabeled peptides. Tandem mass tags (TMT) are chemical peptide labels that are coupled to free amine groups usually after...
Article
Full-text available
Machine learning has been an integral part of interpreting data from mass spectrometry (MS)-based proteomics for a long time. Relatively recently, a machine-learning structure appeared successful in other areas of bioinformatics, Transformers. Furthermore, the implementation of Transformers within bioinformatics has become relatively convenient due...
Article
Full-text available
Isobaric stable isotope labeling techniques such as tandem mass tags (TMT) have become popular in proteomics because they enable the relative quantification of proteins with high precision from up to 18 samples in a single experiment. While missing values in peptide quantification are rare in a single TMT experiment, they rapidly increase when comb...
Article
Primary human hepatocytes are widely used to evaluate liver toxicity of drugs, but they are scarce and demanding to culture. Stem cell-derived hepatocytes are increasingly discussed as alternatives. To obtain a better appreciation of the molecular processes during the differentiation of induced pluripotent stem cells into hepatocytes, we employ a q...
Article
Full-text available
Machine learning is increasingly applied in proteomics and metabolomics to predict molecular structure, function, and physicochemical properties, including behavior in chromatography, ion mobility, and tandem mass spectrometry. These must be described in sufficient detail to apply or evaluate the performance of trained models. Here we look at and i...
Article
Full-text available
Proteome-wide measurements of protein turnover have largely ignored the impact of post-translational modifications (PTMs). To address this gap, we employ stable isotope labeling and mass spectrometry to measure the turnover of >120,000 peptidoforms including >33,000 phosphorylated, acetylated, and ubiquitinated peptides for >9,000 native proteins....
Article
Full-text available
The amount of public proteomics data is rapidly increasing but there is no standardized format to describe the sample metadata and their relationship with the dataset files in a way that fully supports their understanding or reanalysis. Here we propose to develop the transcriptomics data format MAGE-TAB into a standard representation for proteomics...
Article
Full-text available
ProteomicsDB (https://www.ProteomicsDB.org) is a multi-omics and multi-organism resource for life science research. In this update, we present our efforts to continuously develop and expand ProteomicsDB. The major focus over the last two years was improving the findability, accessibility, interoperability and reusability (FAIR) of the data as well...
Preprint
Full-text available
HDAC drugs entered the pharmacopoeia in the 2000s. However, some enigmatic phenotypes suggest off-target engagement. Here, we developed a chemical proteomics assay using three promiscuous chemotypes and quantitative mass spectrometry that we deployed to establish the target landscape of 53 drugs. The results highlight 14 direct targets, includ...
Article
Full-text available
A current trend in proteomics is to acquire data in a “single-shot” by LC–MS/MS because it simplifies workflows and promises better throughput and quantitative accuracy than schemes that involve extensive sample fractionation. However, single-shot approaches can suffer from limited proteome coverage when performed by data dependent acquisition (ssD...
Article
Full-text available
Characterizing the human leukocyte antigen (HLA) bound ligandome by mass spectrometry (MS) holds great promise for developing vaccines and drugs for immune-oncology. Still, the identification of non-tryptic peptides presents substantial computational challenges. To address these, we synthesized and analyzed >300,000 peptides by multi-modal LC-MS/MS...
Preprint
Full-text available
The amount of public proteomics data is increasing at an extraordinary rate. Hundreds of datasets are submitted each month to ProteomeXchange repositories, representing many types of proteomics studies, focusing on different aspects such as quantitative experiments, post-translational modifications, protein-protein interactions, or subcellular loca...
Article
Here, we present the Universal Spectrum Explorer (USE), a web-based tool based on IPSA for cross-resource (peptide) spectrum visualization and comparison (https://www.proteomicsdb.org/use/). Mass spectra under investigation can be either provided manually by the user (table format) or automatically retrieved from online repositories supporting acce...
Article
Full-text available
Proteogenomics approaches often struggle with the distinction between true and false peptide-to-spectrum matches as the database size enlarges. However, features extracted from tandem mass spectrometry intensity predictors can enhance the peptide identification rate and can provide extra confidence for peptide-to-spectrum matching in a proteogenomi...
Article
Full-text available
Plant growth and development are regulated by a tightly controlled interplay between cell division, cell expansion and cell differentiation during the entire plant life cycle from seed germination to maturity and seed propagation. To explore some of the underlying molecular mechanisms in more detail, we selected different aerial tissue types of the...