Simon Rasmussen

  • PhD
  • Professor (Associate) at University of Copenhagen

About

213
Publications
282,819
Reads
28,888
Citations
Current institution
University of Copenhagen
Current position
  • Professor (Associate)
Additional affiliations
January 2009 - present
Technical University of Denmark

Publications (213)
Preprint
Full-text available
The genetic regulation of the plasma proteome has been extensively studied in adult populations, yet protein quantitative trait loci (pQTL) studies in children and adolescents remain largely unexplored. Here, we mapped pQTLs for 178 plasma proteins measured using affinity-based proteomics in 3,853 Danish children and adolescents (44.1% boys; median...
Preprint
Motivation: Time-to-event data in disease occurrence is often right-censored, requiring survival models for accurate predictions. While deep learning advancements have extended traditional Cox models, current approaches do not allow modeling on individual-level, large-scale genotype data. Scalable models integrating genetic and clinical data could...
Preprint
Full-text available
Plasmids are extrachromosomal DNA molecules that enable horizontal gene transfer in bacteria, often conferring advantages such as antibiotic resistance. Despite their significance, plasmids are underrepresented in genomic databases due to challenges in assembling them, caused by mosaicism and micro-diversity. Current plasmid assemblers rely on dete...
Article
Full-text available
Our current understanding of the determinants of plasma proteome variation during pediatric development remains incomplete. Here, we show that genetic variants, age, sex and body mass index significantly influence this variation. Using a streamlined and highly quantitative mass spectrometry-based proteomics workflow, we analyzed plasma from 2,147 c...
Article
Full-text available
Polygenic prediction has yet to make a major clinical breakthrough in precision medicine and psychiatry, where the application of polygenic risk scores is expected to improve clinical decision-making. Most widely used approaches for estimating polygenic risk scores are based on summary statistics from external large-scale genome-wide association st...
Article
Full-text available
Contaminants, such as heavy metals (HMs), accumulate in the Arctic environment and the food web. The diet of the Indigenous Peoples of North Greenland includes locally sourced foods that are central to their nutritional, cultural, and societal health but these foods also contain high concentrations of heavy metals. While bacteria play an essential...
Preprint
Full-text available
A common procedure for studying the microbiome is binning the sequenced contigs into metagenome-assembled genomes. Currently, unsupervised and self-supervised deep learning based methods using co-abundance and sequence based motifs such as tetranucleotide frequencies are state-of-the-art for metagenome binning. Taxonomic labels derived from alignme...
Article
Full-text available
For taxonomy based classification of metagenomics assembled contigs, current methods use sequence similarity to identify their most likely taxonomy. However, in the related field of metagenomic binning, contigs are routinely clustered using information from both the contig sequences and their abundance. We introduce Taxometer, a neural network base...
Preprint
Although thousands of genetic variants are linked to human traits and diseases, the underlying mechanisms influencing these traits remain largely unexplored. One important aspect is to understand how proteins are regulated by the genome by identifying protein quantitative trait loci (pQTLs). Beyond this, there is a need to understand the role of co...
Article
Full-text available
Imputation techniques provide means to replace missing measurements with a value and are used in almost all downstream analysis of mass spectrometry (MS) based proteomics data using label-free quantification (LFQ). Here we demonstrate how collaborative filtering, denoising autoencoders, and variational autoencoders can impute missing values in the...
Preprint
Full-text available
New methods for metagenomic binning are typically evaluated using benchmarking software, and become tuned to maximize whatever criterion is measured by the benchmark. Subtleties in benchmarking procedures can cause misleading evaluations, derailing method development. Differences between procedures used to evaluate binning tools make them hard to c...
Preprint
Full-text available
Polygenic prediction has yet to make a major clinical breakthrough in precision medicine and psychiatry, where the application of polygenic risk scores is expected to improve clinical decision-making. Most widely used approaches for estimating polygenic risk scores are based on summary statistics from external large-scale genome-wide association s...
Article
Full-text available
Germline pathogenic variants associated with increased childhood mortality must be subject to natural selection. Here, we analyze publicly available germline genetic metadata from 4,574 children with cancer [11 studies; 1,083 whole exome sequences (WES), 1,950 whole genome sequences (WGS), and 1,541 gene panel] and 141,456 adults [125,748 WES and 1...
Article
Full-text available
Here we provide a curated, large scale, label free mass spectrometry-based proteomics data set derived from HeLa cell lines for general purpose machine learning and analysis. Data access and filtering is a tedious task, which takes up considerable amounts of time for researchers. Therefore we provide machine based metadata for easy selection and ov...
Preprint
Full-text available
For taxonomy based classification of metagenomics assembled contigs, current methods use sequence similarity to identify their most likely taxonomy. However, in the related field of metagenomics binning contigs are routinely clustered using information from both the contig sequences and their abundance. We introduce Taxometer, a neural network base...
Article
Full-text available
The inflammatory activity in cirrhosis is often pronounced and related to episodes of decompensation. Systemic markers of inflammation may contain prognostic information, and we investigated their possible correlation with admissions and mortality among patients with newly diagnosed liver cirrhosis. We collected plasma samples from 149 patients wit...
Article
Full-text available
Assembly of reads from metagenomic samples is a hard problem, often resulting in highly fragmented genome assemblies. Metagenomic binning allows us to reconstruct genomes by re-grouping the sequences by their organism of origin, thus representing a crucial processing step when exploring the biological diversity of metagenomic samples. Here we prese...
Preprint
The microbial community, or microbiota, in our gut plays important roles in our health. In fact, one way to treat Clostridioides difficile infection (CDI) is by transferring the microbiota from a healthy donor to a patient. After fecal microbiota transplantation (FMT), bacterial strains from the donor may take up residence in the recipient. But if...
Article
Full-text available
Most investigations of geographical within-species differences are limited to focusing on a single species. Here, we investigate global differences for multiple bacterial species using a dataset of 757 metagenomics sewage samples from 101 countries worldwide. The within-species variations were determined by performing genome reconstructions, and th...
Preprint
Full-text available
The chemokine receptor variant CCR5delta32 is linked to HIV-1 infection resistance and other pathological conditions. In European populations, the allele frequency ranges from 10-16%, and its evolution has been extensively debated throughout the years. We provide a detailed perspective of the evolutionary history of the deletion through time and sp...
Preprint
Full-text available
Here we provide a curated, large scale, label free mass spectrometry-based proteomics data set derived from HeLa cell lines for general purpose machine learning and analysis. Data access and filtering is a tedious task, which takes up considerable amounts of time for researchers. Therefore we provide machine based metadata for easy selection and ov...
Article
Full-text available
Polygenic risk scores (PRSs) are expected to play a critical role in precision medicine. Currently, PRS predictors are generally based on linear models using summary statistics, and more recently individual-level data. However, these predictors mainly capture additive relationships and are limited in data modalities they can use. We developed a dee...
Article
Full-text available
Distinct gut microbiome ecology may be implicated in the prevention of aging-related diseases as it influences systemic immune function and resistance to infections. Yet, the viral component of the microbiome throughout different stages in life remains unexplored. Here we present a characterization of the centenarian gut virome using previously pub...
Preprint
Full-text available
The levels of specific proteins in human blood are the most commonly used indicators of potential health-related problems¹. Understanding the genetic and other determinants of the human plasma proteome can aid in biomarker research and drug development. Diverse factors including genetics, age, sex, body mass index (BMI), growth and development in...
Article
Full-text available
Background: Next-generation sequencing (NGS) based population screening holds great promise for disease prevention and earlier diagnosis, but the costs associated with screening millions of humans remain prohibitive. New methods for population genetic testing that lower the costs of NGS without compromising diagnostic power are needed. Methods: We d...
Preprint
Full-text available
Germline pathogenic variants associated with increased childhood mortality must be subject to natural selection. Here, we analyzed publicly available germline genetic metadata from 141,456 adults [gnomAD; 125,748 whole exome sequences (WES) and 15,708 whole genome sequences (WGS)] and 4,810 children with cancer [11 studies; 1,319 WES, 1,950 WGS,...
Preprint
Full-text available
Assembly of reads from metagenomic samples is a hard problem, often resulting in highly fragmented genome assemblies. Metagenomic binning allows us to reconstruct genomes by re-grouping the sequences by their organism of origin, thus representing a crucial processing step when exploring the biological diversity of metagenomic samples. Here we prese...
Preprint
Full-text available
Imputation techniques provide means to replace missing measurements with a value and are used in almost all downstream analysis of mass spectrometry (MS) based proteomics data using label-free quantification (LFQ). Some methods only impute assuming the limit of detection (LOD) was not passed and therefore impute missing values with too low or too h...
Article
Full-text available
The application of multiple omics technologies in biomedical cohorts has the potential to reveal patient-level disease characteristics and individualized response to treatment. However, the scale and heterogeneous nature of multi-modal data makes integration and inference a non-trivial task. We developed a deep-learning-based framework, multi-omics...
Article
Importance: Diagnoses and treatment of mental disorders are hampered by the current lack of objective markers needed to provide a more precise diagnosis and treatment strategy. Objective: To develop deep learning models to predict mental disorder diagnosis and severity spanning multiple diagnoses using nationwide register data, family and patient-sp...
Article
Full-text available
Background: Fecal microbiota transplantation (FMT) effectively prevents the recurrence of Clostridioides difficile infection (CDI). Long-term engraftment of donor-specific microbial consortia may occur in the recipient, but potential further transfer to other sites, including the vertical transmission of donor-specific strains to future generations,...
Preprint
Full-text available
Blood and urine biomarkers are an essential part of modern medicine, not only for diagnosis, but also for their direct influence on disease. Many biomarkers have a genetic component, and they have been studied extensively with genome-wide association studies (GWAS) and methods that compute polygenic scores (PGSs). However, these methods generally a...
Article
Autoimmune and autoinflammatory diseases (AIIDs) involve a deficit in an individual's immune system function, whereby the immune reaction is directed against self-antigens. Many AIIDs have a strong genetic component, but they can also be triggered by environmental factors. AIIDs often have a highly negative impact on the individual's physical and m...
Article
Full-text available
The many microbial communities around us form interactive and dynamic ecosystems called microbiomes. Though concealed from the naked eye, microbiomes govern and influence macroscopic systems including human health, plant resilience, and biogeochemical cycling. Such feats have attracted interest from the scientific community, which has recently turn...
Article
Full-text available
Human populations have been shaped by catastrophes that may have left long-lasting signatures in their genomes. One notable example is the second plague pandemic that entered Europe in ca. 1347 CE and repeatedly returned for over 300 years, with typical village and town mortality estimated at 10%–40%.¹ It is assumed that this high mortality affect...
Article
Background: The etiology of central nervous system (CNS) tumors in children is largely unknown and population-based studies of genetic predisposition are lacking. Methods: In this prospective, population-based study, we performed germline whole-genome sequencing in 128 children with CNS tumors, supplemented by a systematic pedigree analysis covering...
Article
Full-text available
Currently, psychiatric diagnoses are, in contrast to most other medical fields, based on subjective symptoms and observable signs and call for new and improved diagnostics to provide the most optimal care. On the basis of a deep learning approach, we performed unsupervised patient stratification of 19,636 patients with depression [major depressive...
Article
Full-text available
Alcohol-related liver disease (ALD) is a major cause of liver-related death worldwide, yet understanding of the three key pathological features of the disease—fibrosis, inflammation and steatosis—remains incomplete. Here, we present a paired liver–plasma proteomics approach to infer molecular pathophysiology and to explore the diagnostic and progno...
Article
Full-text available
Evaluating metagenomic software is key for optimizing metagenome interpretation and focus of the Initiative for the Critical Assessment of Metagenome Interpretation (CAMI). The CAMI II challenge engaged the community to assess methods on realistic and complex datasets with long- and short-read sequences, created computationally from around 1,700 ne...
Preprint
Background: The underlying cause of central nervous system (CNS) tumors in children is largely unknown. In this nationwide, prospective population-based study we investigate rare germline variants across known and putative CPS genes and genes exhibiting evolutionary intolerance of inactivating alterations in children with CNS tumors. Methods: One hu...
Preprint
Full-text available
Next-generation sequencing (NGS) based population screening holds great promise for disease prevention and earlier diagnosis, but associated sequencing costs remain prohibitive. We developed double batched sequencing (DoBSeq) and tested it on neonatal blood spot DNA in an explorative (n = 100) and a validation (n = 100) cohort selected from a natio...
Article
Full-text available
Despite the accelerating number of uncultivated virus sequences discovered in metagenomics and their apparent importance for health and disease, the human gut virome and its interactions with bacteria in the gastrointestinal tract are not well understood. This is partly due to a paucity of whole-virome datasets and limitations in current approaches...
Article
Full-text available
DNA interstrand crosslinks (ICLs) are cytotoxic lesions that threaten genome integrity. The Fanconi anemia (FA) pathway orchestrates ICL repair during DNA replication, with ubiquitylated FANCI-FANCD2 (ID2) marking the activation step that triggers incisions on DNA to unhook the ICL. Restoration of intact DNA requires the coordinated actions of poly...
Preprint
Full-text available
Most investigations of geographical differences within microbial species are limited to focusing on a single species. Here, we investigate the global differences for multiple bacterial species by using a dataset of 757 metagenomics sewage samples from 101 different countries worldwide. The within-species variations were identified by performing uns...
Article
Full-text available
Non-alcoholic steatohepatitis (NASH) is a chronic liver disease affecting up to 6.5% of the general population. There is no simple definition of NASH, and the molecular mechanism underlying disease pathogenesis remains elusive. Studies applying single omics technologies have enabled a better understanding of the molecular profiles associated with s...
Preprint
Full-text available
Evaluating metagenomic software is key for optimizing metagenome interpretation and focus of the community-driven initiative for the Critical Assessment of Metagenome Interpretation (CAMI). In its second challenge, CAMI engaged the community to assess their methods on realistic and complex metagenomic datasets with long and short reads, created fro...
Preprint
Full-text available
Despite the accelerating number of uncultivated virus sequences discovered in metagenomics and their apparent importance for health and disease, the human gut virome and its interactions with bacteria in the gastrointestinal tract are not well understood. In addition, a paucity of whole-virome datasets from subjects with gastrointestinal diseases is prev...
Article
The foliar microbiome can extend the host plant phenotype by expanding its genomic and metabolic capabilities. Despite increasing recognition of the importance of the foliar microbiome for plant fitness, stress physiology, and yield, the diversity, function, and contribution of foliar microbiomes to plant phenotypic traits remain largely elusive. T...
Preprint
Full-text available
Polygenic risk scores (PRSs) are expected to play a critical role in achieving precision medicine. PRS predictors are generally based on linear models using summary statistics, and more recently individual-level data. However, these predictors generally only capture additive relationships and are limited when it comes to what type of data they use...
Article
Full-text available
Background: Infections are a major disease burden worldwide. While they are caused by external pathogens, host genetics also plays a part in susceptibility to infections. Past studies have reported diverse associations between human leukocyte antigen (HLA) alleles and infections, but many were limited by small sample sizes and/or focused on only one...
Article
Full-text available
Despite recent advances in metagenomic binning, reconstruction of microbial species from metagenomics data remains challenging. Here we develop variational autoencoders for metagenomic binning (VAMB), a program that uses deep variational autoencoders to encode sequence coabundance and k-mer distribution information before clustering. We show that a...
Article
Full-text available
A Correction to this paper has been published: https://doi.org/10.1038/s41586-021-03328-2.
Article
Full-text available
A correct identification of Streptococcus pseudopneumoniae is a prerequisite for investigating the clinical impact of the bacterium. The identification has traditionally relied on phenotypic methods. However, these phenotypic traits have been shown to be unreliable, with some S. pseudopneumoniae giving conflicting results. Therefore, sequence based...
Article
Full-text available
Lesions on DNA uncouple DNA synthesis from the replisome, generating stretches of unreplicated single-stranded DNA (ssDNA) behind the replication fork. These ssDNA gaps need to be filled in to complete DNA duplication. Gap-filling synthesis involves either translesion DNA synthesis (TLS) or template switching (TS). Controlling these processes, ubiq...
Preprint
Existing tests for detecting liver fibrosis, inflammation and steatosis, three stages of liver disease that are still reversible, are severely hampered by limited accuracy or invasive nature. Here, we present a paired liver-plasma proteomics approach to infer molecular pathophysiology and to identify biomarkers in a cross-sectional alcohol-related l...
Article
Full-text available
The ‘red complex’ is an aggregate of three oral bacteria (Tannerella forsythia, Porphyromonas gingivalis and Treponema denticola) responsible for severe clinical manifestation of periodontal disease. Here, we report the first direct evidence of ancient T. forsythia DNA in dentin and dental calculus samples from archaeological skeletal remains th...
Article
Full-text available
The maritime expansion of Scandinavian populations during the Viking Age (about AD 750–1050) was a far-flung transformation in world history¹,². Here we sequenced the genomes of 442 humans from archaeological sites across Europe and Greenland (to a median depth of about 1×) to understand the global influence of this expansion. We find the Viking pe...
Article
Full-text available
Anatomically modern humans reached East Asia more than 40,000 years ago. However, key questions still remain unanswered with regard to the route(s) and the number of wave(s) in the dispersal into East Eurasia. Ancient genomes at the edge of the region may elucidate a more detailed picture of the peopling of East Eurasia. Here, we analyze the whole-...
Article
Full-text available
Re-theorising mobility and the formation of culture and language among the Corded Ware Culture in Europe—CORRIGENDUM - Volume 94 Issue 375 - Kristian Kristiansen, Morten E. Allentoft, Karin M. Frei, Rune Iversen, Niels N. Johannsen, Guus Kroonen, Łukasz Pospieszny, T. Douglas Price, Simon Rasmussen, Karl-Göran Sjögren, Martin Sikora, Eske Willersle...
Article
Background: Previous studies have indicated the bidirectionality between autoimmune and mental disorders. However, genetic studies underpinning the co-occurrence of the two disorders have been lacking. In this study, we examined the potential genetic contribution to the association between autoimmune and mental disorders and investigated the genetic...
Article
Full-text available
Streptococcus gordonii and Streptococcus sanguinis belong to the Mitis group streptococci, which mostly are commensals in the human oral cavity. Though they are oral commensals, they can escape their niche and cause infective endocarditis, a severe infection with high mortality. Several virulence factors important for the development of infective e...
Article
Full-text available
Background: The giant squid (Architeuthis dux; Steenstrup, 1857) is an enigmatic giant mollusc with a circumglobal distribution in the deep ocean, except in the high Arctic and Antarctic waters. The elusiveness of the species makes it difficult to study. Thus, having a genome assembled for this deep-sea–dwelling species will allow several pending ev...
Article
Background: Many diverse inflammatory pathophysiologic mechanisms have been linked to mental disorders, and through the past decade an increasing interest in the gut microbiota and its relation to mental health has been arising. We aimed to systematically review studies of alterations in gut microbiota of patients suffering from psychotic disorder...
Article
Full-text available
The rise of ancient genomics has revolutionised our understanding of human prehistory, but this work depends on the availability of suitable samples. Here we present a complete ancient human genome and oral microbiome sequenced from a 5700-year-old piece of chewed birch pitch from Denmark. We sequence the human genome to an average depth of 2.3× and...
Preprint
Full-text available
An important step in metagenomics studies is to identify which species are present in a sample as well as to compare samples from different environments. Here we introduce MicroWineBar, a graphical tool for analyzing and comparing metagenomics samples. MicroWineBar can visualize the abundances of metagenomics samples in line and bar graphs, as well...
Preprint
Full-text available
The Viking maritime expansion from Scandinavia (Denmark, Norway, and Sweden) marks one of the swiftest and most far-flung cultural transformations in global history. During this time (c. 750 to 1050 CE), the Vikings reached most of western Eurasia, Greenland, and North America, and left a cultural legacy that persists till today. To understand the...
Preprint
Full-text available
Background: Previous studies have indicated the bidirectionality between autoimmune and mental disorders. However, genetic studies underpinning the co-occurrence of the two disorders have been lacking. In this study, we examined the potential genetic contribution to the association between autoimmune and mental disorders. Methods: We used diagnosti...
Article
Full-text available
Northeastern Siberia has been inhabited by humans for more than 40,000 years but its deep population history remains poorly understood. Here we investigate the late Pleistocene population history of northeastern Siberia through analyses of 34 newly recovered ancient genomes that date to between 31,000 and 600 years ago. We document complex populati...
Article
Full-text available
The third millennium BCE was a period of major cultural and demographic changes in Europe that signaled the beginning of the Bronze Age. People from the Pontic steppe expanded westward, leading to the formation of the Corded Ware complex and transforming the genetic landscape of Europe. At the time, the Globular Amphora culture (3300–2700 BCE) exis...
Article
Full-text available
Human leukocyte antigen (HLA) genes encode proteins with important roles in the regulation of the immune system. Many studies have also implicated HLA genes in psychiatric and neurodevelopmental disorders. However, these studies usually focus on one disorder and/or on one HLA candidate gene, often with small samples. Here, we access a large dataset...
Article
Full-text available
Antimicrobial resistance (AMR) is a serious threat to global public health, but obtaining representative data on AMR for healthy human populations is difficult. Here, we use meta-genomic analysis of untreated sewage to characterize the bacterial resistome from 79 sites in 60 countries. We find systematic differences in abundance and diversity of AM...
Article
Full-text available
Antimicrobial resistance (AMR) is a serious threat to global public health, but obtaining representative data on AMR for healthy human populations is difficult. Here, we use metagenomic analysis of untreated sewage to characterize the bacterial resistome from 79 sites in 60 countries. We find systematic differences in abundance and diversity of AMR...
Article
Full-text available
After European colonization, the ancestral remains of Indigenous people were often collected for scientific research or display in museum collections. For many decades, Indigenous people, including Native Americans and Aboriginal Australians, have fought for their return. However, many of these remains have no recorded provenance, making their repa...
Preprint
Full-text available
We present a complete ancient human genome and oral microbiome sequenced from a piece of resinous "chewing gum" recovered from a Stone Age site on the island of Lolland, Denmark, and directly dated to 5,858-5,661 cal. BP (GrM-13305; 5,007+/-11). We sequenced the genome to an average depth-of-coverage of 2.3x and find that the individual who chewed...
Preprint
Full-text available
Identification and reconstruction of microbial species from metagenomics wide genome sequencing data is an important and challenging task. Current existing approaches rely on gene or contig co-abundance information across multiple samples and k-mer composition information in the sequences. Here we use recent advances in deep learning to develop an...
Article
Full-text available
Between 5,000 and 6,000 years ago, many Neolithic societies declined throughout western Eurasia due to a combination of factors that are still largely debated. Here, we report the discovery and genome reconstruction of Yersinia pestis, the etiological agent of plague, in Neolithic farmers in Sweden, pre-dating and basal to all modern and ancient kn...
Article
Full-text available
Complex processes in the settling of the Americas. The expansion into the Americas by the ancestors of present day Native Americans has been difficult to tease apart from analyses of present day populations. To understand how humans diverged and spread across North and South America, Moreno-Mayar et al. sequenced 15 ancient human genomes from Alaska...
Article
Full-text available
In this Article, Angela M. Taravella and Melissa A. Wilson Sayres have been added to the author list (associated with: School of Life Sciences, Center for Evolution and Medicine, The Biodesign Institute, Arizona State University, Tempe, AZ, USA). The author list and Author Information section have been corrected online.
Preprint
Full-text available
Far northeastern Siberia has been occupied by humans for more than 40 thousand years. Yet, owing to a scarcity of early archaeological sites and human remains, its population history and relationship to ancient and modern populations across Eurasia and the Americas are poorly understood. Here, we report 34 ancient genome sequences, including two fr...
