Manolis Kellis

Massachusetts Institute of Technology | MIT · Computer Science and Artificial Intelligence Laboratory

PhD, MEng, BSc

About

589
Publications
141,192
Reads
101,443
Citations
Introduction
Manolis Kellis is a Professor of Computer Science at MIT, a member of the Broad Institute of MIT and Harvard, and a member of the Computer Science and Artificial Intelligence Lab at MIT where he directs the MIT Computational Biology Group (compbio.mit.edu). His research is in disease genetics, epigenomics, gene circuitry, non-coding RNAs, comparative genomics, and phylogenetics. He has authored more than 230 journal publications that have been cited more than 115,000 times. He has helped direct
Additional affiliations
September 2004 - present
Massachusetts Institute of Technology
Position
  • Professor (Full)
June 2004 - present
Broad Institute of MIT and Harvard
Position
  • Institute Member


Publications (589)
Preprint
Functional genomics experiments are invaluable for understanding mechanisms of gene regulation. However, comprehensively performing all such experiments, even across a fixed set of sample and assay types, is often infeasible in practice. A promising alternative to performing experiments exhaustively is to, instead, perform a core set of experiments...
Article
Full-text available
BACE-1 is required for generating β-amyloid (Aβ) peptides in Alzheimer's disease (AD). Here, we report that microglial BACE-1 regulates the transition of homeostatic to stage 1 disease-associated microglia (DAM-1) signature. BACE-1 deficiency elevated levels of transcription factors including Jun, Jund, Btg2, Erg1, Junb, Fos, and Fosb in the transi...
Preprint
Despite significant advances in identifying genetic drivers of neurodegenerative disorders, the majority of affected individuals lack molecular genetic diagnosis, with somatic mutations proposed as one potential contributor to increased risk. Here, we report the first cell-type-specific map of somatic mosaicism in Alzheimer’s Dementia (AlzD), using...
Article
In this issue of Neuron, Meijer and Agirre et al. (2022) demonstrate that immune genes exhibit a primed chromatin state in healthy oligodendroglia and are transcriptionally activated in MS through a series of epigenetic activations including histone modification deposition, transcription factor binding, and chromatin reconfiguration.
Preprint
Chorea-acanthocytosis and McLeod syndrome are diseases with shared clinical manifestations caused by mutations in VPS13A and XK, respectively. Key features of these conditions are the degeneration of caudate neurons and the presence of abnormally shaped erythrocytes. XK belongs to a family of plasma membrane (PM) lipid scramblases whose action resu...
Article
Full-text available
Emergence of SARS-CoV-2 variants of concern (VOCs) suggests viral adaptation to enhance human-to-human transmission¹,². Although much effort has focused on characterisation of spike changes in VOCs, mutations outside spike likely contribute to adaptation. Here we used unbiased abundance proteomics, phosphoproteomics, RNAseq and viral replication as...
Article
Full-text available
Despite the importance of the cerebrovasculature in maintaining normal brain physiology and in understanding neurodegeneration and CNS drug delivery¹, human cerebrovascular cells remain poorly characterized due to their sparsity and dispersion. Here, we perform the first single-cell characterization of the human cerebrovasculature using both ex viv...
Preprint
Cerebrovascular breakdown occurs early in Alzheimer’s Disease (AD), but its cell-type-specific molecular basis remains uncharacterized. Here, we characterize single-cell transcriptomic differences in human cerebrovasculature across 220 AD and 208 control individuals and across 6 brain regions. We annotate 22,514 cerebrovascular cells in 11 subtypes...
Article
Full-text available
Tumor-associated epitopes presented on MHC-I that can activate the immune system against cancer cells are typically identified from annotated protein-coding regions of the genome, but whether peptides originating from novel or unannotated open reading frames (nuORFs) can contribute to antitumor immune responses remains unclear. Here we show that pe...
Preprint
DNA double strand breaks (DSBs) are linked to aging, neurodegeneration, and senescence¹,². However, the role played by neurons burdened with DSBs in disease-associated neuroinflammation is not well understood. Here, we isolate neurons harboring DSBs from the CK-p25 mouse model of neurodegeneration through fluorescence-activated nuclei sorting (FA...
Preprint
Full-text available
MicroRNAs (miRNAs) are small RNA molecules that act as regulators of gene expression through targeted mRNA degradation. They are involved in many biological and pathophysiological processes and are widely studied as potential biomarkers and therapeutic agents for human diseases, including cardiovascular disorders. Recently discovered isoforms of m...
Preprint
Full-text available
Regular physical exercise has long been recognized to reverse the effects of diet-induced obesity, but the molecular mechanisms mediating these multi-tissue beneficial effects remain uncharacterized. Here, we address this challenge by studying the opposing effects of exercise training and high-fat diet at single-cell, deconvolution and tissue-level...
Article
Recent increases in human longevity have been accompanied by a rise in the incidence of dementia, highlighting the need to preserve cognitive function in an aging population. A small percentage of individuals with pathological hallmarks of neurodegenerative disease are able to maintain normal cognition. Although the molecular mechanisms that govern...
Article
Full-text available
Finding a causal gene is a fundamental problem in genomic medicine. We present a causal inference framework, CoCoA-diff, that prioritizes disease genes by adjusting confounders without prior knowledge of control variables in single-cell RNA-seq data. We demonstrate that our method substantially improves statistical power in simulations and real-wor...
Article
Full-text available
The most prevalent post-transcriptional mRNA modification, N⁶-methyladenosine (m⁶A), plays diverse RNA-regulatory roles, but its genetic control in human tissues remains uncharted. Here we report 129 transcriptome-wide m⁶A profiles, covering 91 individuals and 4 tissues (brain, lung, muscle and heart) from GTEx/eGTEx. We integrate these with interi...
Preprint
Full-text available
Metabolism plays a central role in evolution, as resource conservation is a selective pressure for fitness and survival. Resource-driven adaptations offer a good model to study evolutionary innovation more broadly. It remains unknown how resource-driven optimization of genome function integrates chromatin architecture with transcriptional phase tra...
Preprint
Full-text available
Amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration (FTLD) are two devastating and fatal neurodegenerative conditions. While distinct, they share many clinical, genetic, and pathological characteristics, and both show selective vulnerability of layer 5b extratelencephalic-projecting cortical populations, including Betz cells i...
Article
Full-text available
Despite significant clinical progress in cell and gene therapies, maximizing protein expression in order to enhance potency remains a major technical challenge. Here, we develop a high-throughput strategy to design, screen, and optimize 5′ UTRs that enhance protein expression from a strong human cytomegalovirus (CMV) promoter. We first identify nat...
Preprint
Full-text available
The human hippocampal formation plays a central role in Alzheimer’s disease (AD) progression, cognitive traits, and the onset of dementia; yet its molecular states in AD remain uncharacterized. Here, we report a comprehensive single-cell transcriptomic dissection of the human hippocampus and entorhinal cortex across 489,558 cells from 65 individual...
Preprint
Full-text available
Ischemic heart disease is the single most common cause of death worldwide with an annual death rate of over 9 million people. Genome-wide association studies have uncovered over 200 genetic loci underlying the disease, providing a deeper understanding of the causal mechanisms leading to it. However, in order to understand ischemic heart disease at...
Article
Full-text available
At least six small alternative-frame open reading frames (ORFs) overlapping well-characterized SARS-CoV-2 genes have been hypothesized to encode accessory proteins. Researchers have used different names for the same ORF or the same name for different ORFs, resulting in erroneous homological and functional inferences. We propose standard names for t...
Article
Full-text available
Despite initial responses¹⁻³, most melanoma patients develop resistance⁴ to immune checkpoint blockade (ICB). To understand the evolution of resistance, we studied 37 tumor samples over 9 years from a patient with metastatic melanoma with complete clinical response to ICB followed by delayed recurrence and death. Phylogenetic analysis revealed co-e...
Article
Full-text available
The increasing availability of single-cell data revolutionizes the understanding of biological mechanisms at cellular resolution. For differential expression analysis in multi-subject single-cell data, negative binomial mixed models account for both subject-level and cell-level overdispersions, but are computationally demanding. Here, we propose an...
Preprint
Full-text available
Prioritizing disease-critical cell types by integrating genome-wide association studies (GWAS) with functional data is a fundamental goal. Single-cell chromatin accessibility (scATAC-seq) and gene expression (scRNA-seq) have characterized cell types at high resolution, and early work on integrating GWAS with scRNA-seq has shown promise, but work on...
Article
Full-text available
The incorrect Associate Editor was listed. The correct Associate Editor is Fuwen Wei. This error has been corrected online.
Article
Full-text available
Despite its clinical importance, the SARS-CoV-2 gene set remains unresolved, hindering dissection of COVID-19 biology. We use comparative genomics to provide a high-confidence protein-coding gene set, characterize evolutionary constraint, and prioritize functional mutations. We select 44 Sarbecovirus genomes at ideally-suited evolutionary distances...
Preprint
Despite the importance of the blood-brain barrier in maintaining normal brain physiology and in understanding neurodegeneration and CNS drug delivery, human cerebrovascular cells remain poorly characterized due to their sparsity and dispersion. Here, we perform the first single-cell characterization of the human cerebrovasculature using both ex viv...
Article
Long non-coding RNA (lncRNA) genes have well-established and important impacts on molecular and cellular functions. However, among the thousands of lncRNA genes, it is still a major challenge to identify the subset with disease or trait relevance. To systematically characterize these lncRNA genes, we used Genotype Tissue Expression (GTEx) project v...
Article
The E4 allele of the apolipoprotein E gene (APOE) has been established as a genetic risk factor for many diseases including cardiovascular diseases and Alzheimer’s disease (AD), yet its mechanism of action remains poorly understood. APOE is a lipid transport protein, and the dysregulation of lipids has recently emerged as a key feature of several...
Article
Full-text available
Despite recent discoveries in genome-wide association studies (GWAS) of genomic variants associated with Alzheimer’s disease (AD), its underlying biological mechanisms are still elusive. The discovery of novel AD-associated genetic variants, particularly in coding regions and from APOEε4 non-carriers, is critical for understanding the pathology of...
Article
Full-text available
Annotating the molecular basis of human disease remains an unsolved challenge, as 93% of disease loci are non-coding and gene-regulatory annotations are highly incomplete¹⁻³. Here we present EpiMap, a compendium comprising 10,000 epigenomic maps across 800 samples, which we used to define chromatin states, high-resolution enhancers, enhancer modu...
Preprint
Full-text available
Finding a causal gene from case-control studies is a classic and fundamental problem in genomics. To date, we still ask which genes are differentially regulated by a disease with single-cell sequencing data, but in a cell-type-specific way. Here, we present a causal inference framework that effectively adjusts confounding effects, not requiring pri...
Preprint
Full-text available
Thousands of genetic variants acting in multiple cell types underlie complex disorders, yet most gene expression studies profile only bulk tissues, making it hard to resolve where genetic and non-genetic contributors act. This is particularly important for psychiatric and neurodegenerative disorders that impact multiple brain cell types with highly...
Article
Full-text available
Background: POLG, located on nuclear chromosome 15, encodes the DNA polymerase γ (Pol γ). Pol γ is responsible for the replication and repair of mitochondrial DNA (mtDNA). Pol γ is the only DNA polymerase found in mitochondria for most animal cells. Mutations in POLG are the most common single-gene cause of diseases of mitochondria and have been ma...
Article
Full-text available
The epigenome and three-dimensional (3D) genomic architecture are emerging as key factors in the dynamic regulation of different transcriptional programs required for neuronal functions. In this study, we used an activity-dependent tagging system in mice to determine the epigenetic state, 3D genome architecture and transcriptional landscape of engr...
Preprint
At least six small alternate-frame open reading frames (ORFs) overlapping well-characterized SARS-CoV-2 genes have been hypothesized to encode accessory proteins. Researchers have used different names for the same ORF or the same name for different ORFs, resulting in erroneous homological and functional inferences. We propose standard names for the...
Article
Genome-wide association studies have uncovered over 200 genetic loci underlying coronary artery disease (CAD), providing great hope for a deeper understanding of the causal mechanisms leading to this disease. However, in order to understand CAD at the molecular level, it is necessary to uncover cell-type-specific circuits and to use these circuits...
Preprint
Full-text available
A randomized controlled trial of calcifediol (25-hydroxyvitamin D₃) as a treatment for hospitalized COVID-19 patients in Córdoba, Spain, found that the treatment was associated with reduced ICU admissions with very large effect size and high statistical significance, but the study has had limited impact because it had only 76 patients and imperfe...
Preprint
Schizophrenia is a devastating mental disorder with a high societal burden, complex pathophysiology, and diverse genetic and environmental risk factors. Its complexity, polygenicity, and small-effect-size and cell-type-specific contributors have hindered mechanistic elucidation and the search for new therapeutics. Here, we present the first single-...
Article
Full-text available
The influence of genetic background on driver mutations is well established; however, the mechanisms by which the background interacts with Mendelian loci remain unclear. We performed a systematic secondary-variant burden analysis of two independent cohorts of patients with Bardet–Biedl syndrome (BBS) with known recessive biallelic pathogenic mutat...
Article
We report that the SARS-CoV-2 nucleocapsid protein (N-protein) undergoes liquid-liquid phase separation (LLPS) with viral RNA. N-protein condenses with specific RNA genomic elements under physiological buffer conditions and condensation is enhanced at human body temperatures (33°C and 37°C) and reduced at room temperature (22°C). RNA sequence and s...
Article
Full-text available
Dissecting the cellular heterogeneity embedded in single-cell transcriptomic data is challenging. Although many methods and approaches exist, identifying cell states and their underlying topology is still a major challenge. Here, we introduce the concept of multiresolution cell-state decomposition as a practical approach to simultaneously capture b...
Article
Full-text available
Despite its overwhelming clinical importance, the SARS-CoV-2 gene set remains unresolved, hindering dissection of COVID-19 biology. Here, we use comparative genomics to provide a high-confidence protein-coding gene set, characterize protein-level and nucleotide-level evolutionary constraint, and prioritize functional mutations from the ongoing COVI...
Article
Full-text available
Bumblebees are a diverse group of globally important pollinators in natural ecosystems and for agricultural food production. With both eusocial and solitary life-cycle phases, and some social parasite species, they are especially interesting models to understand social evolution, behavior, and ecology. Reports of many species in decline point to pa...
Article
Full-text available
Haplotype reconstruction of distant genetic variants remains an unsolved problem due to the short-read length of common sequencing data. Here, we introduce HapTree-X, a probabilistic framework that utilizes latent long-range information to reconstruct unspecified haplotypes in diploid and polyploid organisms. It introduces the observation that diff...
Article
The Genotype-Tissue Expression (GTEx) project was established to characterize genetic effects on the transcriptome across human tissues and to link these regulatory mechanisms to trait and disease associations. Here, we present analyses of the version 8 data, examining 15,201 RNA-sequencing samples from 49 tissues of 838 postmortem donors. We compr...
Article
The mechanisms by which mutant huntingtin (mHTT) leads to neuronal cell death in Huntington’s disease (HD) are not fully understood. To gain new molecular insights, we used single nuclear RNA sequencing (snRNA-seq) and translating ribosome affinity purification (TRAP) to conduct transcriptomic analyses of caudate/putamen (striatal) cell type-specif...
Article
Full-text available
Genomic analyses in budding yeast have helped define the foundational principles of eukaryotic gene expression. However, in the absence of empirical methods for defining coding regions, these analyses have historically excluded specific classes of possible coding regions, such as those initiating at non-AUG start codons. Here, we applied an experim...
Preprint
Full-text available
Despite its overwhelming clinical importance, the SARS-CoV-2 gene set remains unresolved, hindering dissection of COVID-19 biology. Here, we use comparative genomics to provide a high-confidence protein-coding gene set, characterize protein-level and nucleotide-level evolutionary constraint, and prioritize functional mutations from the ongoing COVI...
Article
Full-text available
To reveal post-traumatic stress disorder (PTSD) genetic risk influences on tissue-specific gene expression, we use brain and non-brain transcriptomic imputation. We impute genetically regulated gene expression (GReX) in 29,539 PTSD cases and 166,145 controls from 70 ancestry-specific cohorts and identify 18 significant GReX-PTSD associations corres...
Article
Full-text available
In Alzheimer’s disease, amyloid deposits along the brain vasculature lead to a condition known as cerebral amyloid angiopathy (CAA), which impairs blood–brain barrier (BBB) function and accelerates cognitive degeneration. Apolipoprotein (APOE4) is the strongest risk factor for CAA, yet the mechanisms underlying this genetic susceptibility are unkno...