Markus List

Technische Universität München | TUM · School of Life Sciences Weihenstephan

PhD

About

117
Publications
21,014
Reads
1,609
Citations
Additional affiliations
September 2018 - present
Technische Universität München
Position
  • Head of Department
Description
  • Markus List heads the group of Big Data in Biomedicine which is part of the Chair of Experimental Bioinformatics at the Technical University of Munich. We are located in the TUM School of Life Sciences in Freising-Weihenstephan.
March 2018 - August 2018
Technische Universität München
Position
  • PostDoc Position
October 2015 - February 2018
Max Planck Institute for Informatics
Position
  • PostDoc Position

Publications (117)
Article
Funding Acknowledgements Type of funding sources: Foundation. Main funding source(s): German Center for Cardiovascular Research (DZHK) Background/Introduction RPs are young, hyper-reactive and RNA-rich platelets. They have a pro-thrombotic potential, are predictors of an insufficient response to antiplatelet therapy after myocardial infarction and...
Article
Funding Acknowledgements Type of funding sources: Public grant(s) – National budget only. Main funding source(s): This work was supported by the German Center for Cardiovascular Research (DZHK grant number Deutsches Zentrum für Herz-Kreislaufforschung 81 × 3600606 to D.B.). Abstract: Background/Introduction Reticulated platelets (RPs) are prothro...
Article
Cancer is a heterogeneous disease characterized by unregulated cell growth and promoted by mutations in cancer driver genes some of which encode suitable drug targets. Since the distinct set of cancer driver genes can vary between and within cancer types, evidence-based selection of drugs is crucial for targeted therapy following the precision medi...
Preprint
Molecular signatures have been suggested as biomarkers to classify pancreatic ductal adenocarcinoma (PDAC) into two, three or four subtypes. Since the robustness of existing signatures is controversial, we performed a systematic evaluation of three established signatures for PDAC stratification across eight publicly available datasets. Clustering r...
Preprint
Motivation As complex tissues are typically composed of various cell types, deconvolution tools have been developed to computationally infer their cellular composition from bulk RNA sequencing (RNA-seq) data. To comprehensively assess deconvolution performance, gold-standard datasets are indispensable. Gold-standard, experimental techniques like fl...
Preprint
During disease progression or organism development, alternative splicing (AS) may lead to isoform switches (IS) that demonstrate similar temporal patterns and reflect the AS co-regulation of such genes. Tools for dynamic process analysis usually neglect AS. Here we propose Spycone ( https://github.com/yollct/spycone ), a splicing-aware framework fo...
Preprint
Motivation A key problem in systems biology is the discovery of regulatory mechanisms that drive phenotypic behavior of complex biological systems in the form of multi-level networks. Modern multiomics profiling techniques probe these fundamental regulatory networks but are often hampered by experimental restrictions leading to missing data or part...
Preprint
Full-text available
Motivation Cancer is one of the leading causes of death worldwide. Despite significant improvements in prevention and treatment, mortality remains high for many cancer types. Hence, innovative methods that use molecular data to stratify patients and identify biomarkers are needed. Promising biomarkers can also be inferred from competing endogenous...
Article
Full-text available
De novo pathway enrichment is a systems biology approach in which OMICS data are projected onto a molecular interaction network to identify subnetworks representing condition-specific functional modules and molecular pathways. Compared to classical pathway enrichment analysis methods, de novo pathway enrichment is not limited to predefined lists of...
Article
Full-text available
Meta-analysis has been established as an effective approach to combining summary statistics of several genome-wide association studies (GWAS). However, the accuracy of meta-analysis can be attenuated in the presence of cross-study heterogeneity. We present sPLINK, a hybrid federated and user-friendly tool, which performs privacy-aware GWAS on distr...
Article
Background Exclusive enteral nutrition (EEN) is a first-line induction therapy for paediatric Crohn’s disease (CD). Although the protective mechanisms remain unclear, previous studies showed substantial changes in microbiome composition in response to EEN. We aim to assess the protective function of EEN and its impact on gut microbiome signatures i...
Article
Background Artificial intelligence (AI) has been successfully applied in numerous scientific domains. In biomedicine, AI has already shown tremendous potential, e.g., in the interpretation of next-generation sequencing data and in the design of clinical decision support systems. Objectives However, training an AI model on sensitive data raises conc...
Preprint
Full-text available
Alternative splicing is a major contributor to transcriptome and proteome diversity in health and disease. A plethora of tools have been developed for studying alternative splicing in RNA-seq data. Previous benchmarks focused on isoform quantification and mapping. They neglected event detection tools, which arguably provide the most detailed insigh...
Article
Motivation Disease module mining methods (DMMMs) extract subgraphs that constitute candidate disease mechanisms from molecular interaction networks such as protein-protein interaction (PPI) networks. Irrespective of the employed models, DMMMs typically include non-robust steps in their workflows, i.e., the computed subnetworks vary when running th...
Article
Mass cytometry (CyTOF) is a new technology that allows the investigation of protein expression at single cell level with high resolution. While several protocols are available to investigate leukocyte expression, platelet staining and analysis with CyTOF have been described only from whole blood. Moreover, available protocols do not allow sample st...
Preprint
Full-text available
Background 16S rRNA gene profiling is currently the most widely used technique in microbiome research and allows for studying microbial diversity, taxonomic profiling, phylogenetics, functional and network analysis. While a plethora of tools have been developed for the analysis of 16S rRNA gene data, only few platforms offer a user-friendly interfa...
Article
Full-text available
Aggregating transcriptomics data across hospitals can increase sensitivity and robustness of differential expression analyses, yielding deeper clinical insights. As data exchange is often restricted by privacy legislation, meta-analyses are frequently employed to pool local results. However, the accuracy might drop if class labels are inhomogeneous...
Article
Full-text available
Alternative splicing (AS) is an important aspect of gene regulation. Nevertheless, its role in molecular processes and pathobiology is far from understood. A roadblock is that tools for the functional analysis of AS-set events are lacking. To mitigate this, we developed NEASE, a tool integrating pathways with structural annotations of protein-prote...
Article
Full-text available
Despite impressive efforts invested in epigenetic research in the last 50 years, clinical applications are still lacking. Only a few university hospital centers currently use epigenetic biomarkers at the bedside. Moreover, the overall concept of precision medicine is not widely recognized in routine medical practice and the reductionist approach re...
Article
Full-text available
For complex diseases, most drugs are highly ineffective, and the success rate of drug discovery is in constant decline. While low quality, reproducibility issues, and translational irrelevance of most basic and preclinical research have contributed to this, the current organ-centricity of medicine and the ‘one disease–one target–one drug’ dogma obs...
Article
Cytometry techniques are widely used to discover cellular characteristics at single-cell resolution. Many data analysis methods for cytometry data focus solely on identifying subpopulations via clustering and testing for differential cell abundance. For differential expression analysis of markers between conditions, only few tools exist. These tool...
Article
Background Reticulated platelets (RPs) are young, hyper-reactive thrombocytes that contain more RNA compared with mature platelets (MPs). The measurement of RPs level in peripheral blood with point-of-care systems is fast, reproducible, and inexpensive. Elevated RPs in peripheral blood predict adverse events in patients with acute and chronic coron...
Article
We present the AIMe registry, a community-driven reporting platform for AI in biomedicine. It aims to enhance the accessibility, reproducibility and usability of biomedical AI models, and allows future revisions by the community. View-only version: https://rdcu.be/cv5H7
Article
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection induces a coagulopathy characterized by platelet activation and a hypercoagulable state with an increased incidence of cardiovascular events. The viral spike protein S has been reported to enhance thrombosis formation, stimulate platelets to release procoagulant factors, and pro...
Preprint
Full-text available
Cytometry techniques are widely used to discover cellular characteristics at single-cell resolution. Many data analysis methods for cytometry data focus solely on identifying subpopulations via clustering and testing for differential cell abundance. For differential expression analysis of markers between conditions, only few tools exist. These tool...
Preprint
Full-text available
Alternative splicing (AS) is an important aspect of gene regulation. Nevertheless, its role in molecular processes and pathobiology is far from understood. A roadblock is that tools for the functional analysis of AS-set events are lacking. To mitigate this, we developed NEASE, a tool integrating pathways with protein-protein and domain-domain inter...
Article
Full-text available
Lack of reproducibility in gene expression studies is a serious issue being actively addressed by the biomedical research community. Besides established factors such as batch effects and incorrect sample annotations, we recently reported tissue heterogeneity, a consequence of unintended profiling of cells of other origins than the tissue of interes...
Preprint
Full-text available
SARS-CoV-2 infection induces a coagulopathy characterized by platelet activation and a hypercoagulable state with an increased incidence of cardiovascular events. The viral spike protein S has been reported to enhance thrombosis formation, stimulate platelets to release pro-coagulant factors and promote the formation of platelet-leukocyte aggregate...
Preprint
Full-text available
Machine Learning (ML) and Artificial Intelligence (AI) have shown promising results in many areas and are driven by the increasing amount of available data. However, this data is often distributed across different institutions and cannot be shared due to privacy concerns. Privacy-preserving methods, such as Federated Learning (FL), allow for traini...
Article
Full-text available
Microorganisms including bacteria, fungi, viruses, protists and archaea live as communities in complex and contiguous environments. They engage in numerous inter- and intra-kingdom interactions which can be inferred from microbiome profiling data. In particular, network-based approaches have proven helpful in deciphering complex microbial interact...
Article
In network and systems medicine, active module identification methods (AMIMs) are widely used for discovering candidate molecular disease mechanisms. To this end, AMIMs combine network analysis algorithms with molecular profiling data, most commonly, by projecting gene expression data onto generic protein–protein interaction (PPI) networks. Althoug...
Article
Epigenetics studies inheritable and reversible modifications of DNA that allow cells to control gene expression throughout their development and in response to environmental conditions. In computational epigenomics, machine learning is applied to study various epigenetic mechanisms genome wide. Its aim is to expand our understanding of cell differe...
Article
Full-text available
Background: Transcriptional regulation of gene expression is crucial for the adaptation and survival of bacteria. Regulatory interactions are commonly modeled as Gene Regulatory Networks (GRNs) derived from experiments such as RNA-seq, microarray and ChIP-seq. While the reconstruction of GRNs is fundamental to decipher cellular function, even GRNs...
Article
A plethora of tools exist for RNA-Seq data analysis with a focus on alternative splicing (AS). However, appropriate data for their comparative evaluation is missing. The R package ASimulatoR simulates gold standard RNA-Seq datasets with fine-grained control over the distribution of AS events, which allow for evaluating alternative splicing tools, e...
Article
Full-text available
Short-amplicon 16S rRNA gene sequencing is currently the method of choice for studies investigating microbiomes. However, comparative studies on differences in procedures are scarce. We sequenced human stool samples and mock communities with increasing complexity using a variety of commonly used protocols. Short amplicons targeting different variab...
Article
Full-text available
microRNAs (miRNAs) are post-transcriptional regulators involved in many biological processes and human diseases, including cancer. The majority of transcripts compete over a limited pool of miRNAs, giving rise to a complex network of competing endogenous RNA (ceRNA) interactions. Currently, gene-regulatory networks focus mostly on transcription fa...
Article
Full-text available
microRNAs (miRNAs) are post-transcriptional regulators involved in many biological processes and human diseases, including cancer. The majority of transcripts compete over a limited pool of miRNAs, giving rise to a complex network of competing endogenous RNA (ceRNA) interactions. Currently, gene-regulatory networks focus mostly on transcription fac...
Article
Full-text available
Novel coronavirus disease 2019 (COVID-19) is associated with a hypercoagulable state, characterized by abnormal coagulation parameters and by increased incidence of cardiovascular complications. With this study, we aimed to investigate the activation state and the expression of transmembrane proteins in platelets of hospitalized COVID-19 patients....
Article
Motivation Unsupervised learning approaches are frequently employed to stratify patients into clinically relevant subgroups and to identify biomarkers such as disease-associated genes. However, clustering and biclustering techniques are oblivious to the functional relationship of genes and are thus not ideally suited to pinpoint molecular mechanism...
Preprint
Full-text available
Background: Lack of reproducibility in gene expression studies has recently attracted much attention in and beyond the biomedical research community. Previous efforts have identified many underlying factors, such as batch effects and incorrect sample annotations. Recently, tissue heterogeneity, a consequence of unintended profiling of cells of othe...
Article
Motivation Recently, various tools for detecting single nucleotide polymorphisms (SNPs) involved in epistasis have been developed. However, no studies evaluate the employed statistical epistasis models such as the χ²-test or quadratic regression independently of the tools that use them. Such an independent evaluation is crucial for developing impro...
Preprint
Full-text available
Federated learning is a well-established approach to privacy-preserving training of a joint model on heavily distributed data. Federated averaging (FedAvg) is a well-known communication-efficient algorithm for federated learning, which performs well if the data distribution across the clients is independently and identically distributed (IID). Howe...
Article
Full-text available
SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is a novel virus of the family Coronaviridae. The virus causes the infectious disease COVID-19. The biology of coronaviruses has been studied for many years. However, bioinformatics tools designed explicitly for SARS-CoV-2 have only recently been developed as a rapid reaction to the need...
Preprint
Full-text available
Aggregating clinical transcriptomics data across hospitals can increase sensitivity and robustness of differential gene expression analyses yielding deeper clinical insights. As data exchange is often restricted by privacy legislation, meta-analyses are frequently employed to pool local results. However, if class labels or confounders are inhomogen...
Article
Full-text available
The field of breath analysis lacks a fully automated analysis platform that enforces machine learning good practice and enables clinicians and clinical researchers to rapidly and reproducibly discover metabolite patterns in diseases. We present BALSAM, a comprehensive web-platform to simplify and automate this process, offering features for preproce...
Article
Full-text available
Alternative splicing plays a major role in regulating the functional repertoire of the proteome. However, isoform-specific effects on protein-protein interactions (PPIs) are usually overlooked, making it impossible to judge the functional role of individual exons on a systems biology level. We overcome this barrier by integrating protein-protein i...
Preprint
Full-text available
Artificial intelligence (AI) has been successfully applied in numerous scientific domains including biomedicine and healthcare. Here, it has led to several breakthroughs ranging from clinical decision support systems and image analysis to whole genome sequencing. However, training an AI model on sensitive data also raises concerns about the privacy of...
Article
Full-text available
Coronavirus Disease-2019 (COVID-19) is an infectious disease caused by the SARS-CoV-2 virus. Various studies exist about the molecular mechanisms of viral infection. However, such information is spread across many publications and it is very time-consuming to integrate, and exploit. We develop CoVex, an interactive online platform for SARS-CoV-2 ho...
Article
Full-text available
Lifestyle, obesity, and the gut microbiome are important risk factors for metabolic disorders. We demonstrate in 1,976 subjects of a German population cohort (KORA) that specific microbiota members show 24-h oscillations in their relative abundance and identified 13 taxa with disrupted rhythmicity in type 2 diabetes (T2D). Cross-validated predictio...
Article
Full-text available
Manipulating molecules that impact T cell receptor (TCR) or cytokine signaling, such as the protein tyrosine phosphatase non-receptor type 2 (PTPN2), has significant potential for advancing T cell-based immunotherapies. Nonetheless, it remains unclear how PTPN2 impacts the activation, survival, and memory formation of T cells. We find that PTPN2 de...
Preprint
Full-text available
Genome-wide association studies (GWAS) have been widely used to unravel connections between genetic variants and diseases. Larger sample sizes in GWAS can lead to discovering more associations and more accurate genetic predictors. However, sharing and combining distributed genomic data to increase the sample size is often challenging or even imposs...
Article
Objective To facilitate shared decision-making for patients with knee osteoarthritis (OA), we aimed at building clinically applicable models to predict the individual change in pain intensity (VAS scale 0-100), knee-related quality of life (QoL) (KOOS QoL score 0-100) and walking speed (m/sec) immediately following two educational and 12 supervise...
Article
Full-text available
A current challenge in genomics is to interpret non-coding regions and their role in transcriptional regulation of possibly distant target genes. Genome-wide association studies show that a large part of genomic variants are found in those non-coding regions, but their mechanisms of gene regulation are often unknown. An additional challenge is to r...
Preprint
Full-text available
SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is a novel virus of the family Coronaviridae. The virus causes the infectious disease COVID-19. The biology of coronaviruses has been studied for many years. However, bioinformatics tools designed explicitly for SARS-CoV-2 have only recently been developed as a rapid reaction to the need...
Article
Full-text available
Drug research, therapy development, and other areas of pharmacology and medicine can benefit from simulations and optimization of mathematical models that contain a mathematical description of interactions between systems elements at the cellular, tissue, organ, body, and population level. This approach is the foundation of systems medicine and pre...
Preprint
Full-text available
Coronavirus Disease-2019 (COVID-19) is an infectious disease caused by the SARS-CoV-2 virus. It was first identified in Wuhan, China, and has since spread causing a global pandemic. Various studies have been performed to understand the molecular mechanisms of viral infection for predicting drug repurposing candidates. However, such information is s...
Article
Simulated data is crucial for evaluating epistasis detection tools in genome-wide association studies. Existing simulators are limited, as they do not account for linkage disequilibrium (LD), support limited interaction models of single nucleotide polymorphisms (SNPs) and only dichotomous phenotypes, or depend on proprietary software. In contrast,...
Article
Full-text available
Plants are essential for life and are extremely diverse organisms with unique molecular capabilities. Here we present a quantitative atlas of the transcriptomes, proteomes and phosphoproteomes of 30 tissues of the model plant Arabidopsis thaliana. Our analysis provides initial answers to how many genes exist as proteins (more than 18,000), where t...