Telomere length in humans is associated with lifespan and severe diseases, yet the genetic determinants of telomere length remain incompletely defined. Here we performed genome-wide CRISPR–Cas9 functional telomere length screening and identified thymidine (dT) nucleotide metabolism as a limiting factor in human telomere maintenance. Targeted genetic disruption using CRISPR–Cas9 revealed multiple telomere length control points across the thymidine nucleotide metabolism pathway: decreasing dT nucleotide salvage via deletion of the gene encoding nuclear thymidine kinase (TK1) or de novo production by knockout of the thymidylate synthase gene (TYMS) decreased telomere length, whereas inactivation of the deoxynucleoside triphosphohydrolase-encoding gene SAMHD1 lengthened telomeres. Remarkably, supplementation with dT alone drove robust telomere elongation by telomerase in cells, and thymidine triphosphate stimulated telomerase activity in a substrate-independent manner in vitro. In induced pluripotent stem cells derived from patients with genetic telomere biology disorders, dT supplementation or inhibition of SAMHD1 promoted telomere restoration. Our results demonstrate a critical role of thymidine metabolism in controlling human telomerase and telomere length, which may be therapeutically actionable in patients with fatal degenerative diseases.
After severe heart injury, fibroblasts are activated and proliferate excessively to form scarring, leading to decreased cardiac function and eventually heart failure. It is unknown, however, whether cardiac fibroblasts are heterogeneous with respect to their degree of activation, proliferation and function during cardiac fibrosis. Here, using dual recombinase-mediated genetic lineage tracing, we find that endocardium-derived fibroblasts preferentially proliferate and expand in response to pressure overload. Fibroblast-specific proliferation tracing revealed highly regional expansion of activated fibroblasts after injury, whose pattern mirrors that of endocardium-derived fibroblast distribution in the heart. Specific ablation of endocardium-derived fibroblasts alleviates cardiac fibrosis and reduces the decline of heart function after pressure overload injury. Mechanistically, Wnt signaling promotes activation and expansion of endocardium-derived fibroblasts during cardiac remodeling. Our study identifies endocardium-derived fibroblasts as a key fibroblast subpopulation accounting for severe cardiac fibrosis after pressure overload injury and as a potential therapeutic target against cardiac fibrosis.
With the emergence of large-scale sequencing data, methods for improving power in rare variant association tests are needed. Here we show that adjusting for common variant polygenic scores improves yield in gene-based rare variant association tests across 65 quantitative traits in the UK Biobank (up to 20% increase at α = 2.6 × 10⁻⁶), without marked increases in false-positive rates or genomic inflation. Benefits were seen for various models, with the largest improvements seen for efficient sparse mixed-effects models. Our results illustrate how polygenic score adjustment can efficiently improve power in rare variant association discovery.
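The intuition behind polygenic score adjustment can be sketched on simulated data: regressing out a common-variant score removes phenotypic variance that the rare-variant test would otherwise treat as noise. The following is a minimal illustration only, not the models used in the study; the effect sizes, carrier frequency, sample size and the helper names `residualize` and `burden_z` are all assumptions made for the example.

```python
import numpy as np

def residualize(y, pgs):
    """Residuals of y after ordinary least squares on an intercept and the PGS."""
    X = np.column_stack([np.ones_like(pgs), pgs])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def burden_z(y, burden):
    """Z-score for the slope of y on a rare-variant burden indicator."""
    b = burden - burden.mean()
    yc = y - y.mean()
    slope = (b @ yc) / (b @ b)
    resid = yc - slope * b
    se = np.sqrt((resid @ resid) / (len(y) - 2) / (b @ b))
    return slope / se

# Simulated cohort (all parameters are hypothetical).
rng = np.random.default_rng(0)
n = 5000
pgs = rng.normal(size=n)                               # common-variant score
burden = rng.binomial(1, 0.01, size=n).astype(float)   # rare-variant carriers
y = 0.6 * pgs + 0.5 * burden + rng.normal(size=n)      # quantitative trait

z_raw = burden_z(y, burden)
z_adj = burden_z(residualize(y, pgs), burden)
print(f"unadjusted z = {z_raw:.2f}, PGS-adjusted z = {z_adj:.2f}")
```

Because the residual variance after adjustment cannot exceed the original variance, the standard error of the burden effect tends to shrink, which is the mechanism by which power can improve.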
Individuals of admixed ancestries (for example, African Americans) inherit a mosaic of ancestry segments (local ancestry) originating from multiple continental ancestral populations. This offers the unique opportunity of investigating the similarity of genetic effects on traits across ancestries within the same population. Here we introduce an approach to estimate correlation of causal genetic effects (r_admix) across local ancestries and analyze 38 complex traits in African-European admixed individuals (N = 53,001) to observe very high correlations (meta-analysis r_admix = 0.95, 95% credible interval 0.93–0.97), much higher than correlation of causal effects across continental ancestries. We replicate our results using regression-based methods from marginal genome-wide association study summary statistics. We also report realistic scenarios where regression-based methods yield inflated heterogeneity-by-ancestry due to ancestry-specific tagging of causal effects, and/or polygenicity. Our results motivate genetic analyses that assume minimal heterogeneity in causal effects by ancestry, with implications for the inclusion of ancestry-diverse individuals in studies.
Malignant pleural mesothelioma (MPM) is an aggressive cancer with rising incidence and challenging clinical management. Through a large series of whole-genome sequencing data, integrated with transcriptomic and epigenomic data using multiomics factor analysis, we demonstrate that the current World Health Organization classification only accounts for up to 10% of interpatient molecular differences. Instead, the MESOMICS project paves the way for a morphomolecular classification of MPM based on four dimensions: ploidy, tumor cell morphology, adaptive immune response and CpG island methylator profile. We show that these four dimensions are complementary, capture major interpatient molecular differences and are delimited by extreme phenotypes that (in the case of the interdependent tumor cell morphology and adaptive immune response) reflect tumor specialization. These findings unearth the interplay between MPM functional biology and its genomic history, and provide insights into the variations observed in the clinical behavior of patients with MPM.
Endometriosis is a common condition associated with debilitating pelvic pain and infertility. A genome-wide association study meta-analysis, including 60,674 cases and 701,926 controls of European and East Asian descent, identified 42 genome-wide significant loci comprising 49 distinct association signals. Effect sizes were largest for stage 3/4 disease, driven by ovarian endometriosis. Identified signals explained up to 5.01% of disease variance and regulated expression or methylation of genes in endometrium and blood, many of which were associated with pain perception/maintenance (SRP14/BMF, GDAP1, MLLT10, BSN and NGF). We observed significant genetic correlations between endometriosis and 11 pain conditions, including migraine, back and multisite chronic pain (MCP), as well as inflammatory conditions, including asthma and osteoarthritis. Multitrait genetic analyses identified substantial sharing of variants associated with endometriosis and MCP/migraine. Targeted investigations of genetically regulated mechanisms shared between endometriosis and other pain conditions are needed to aid the development of new treatments and facilitate early symptomatic intervention.
Following severe liver injury, when hepatocyte-mediated regeneration is impaired, biliary epithelial cells (BECs) can transdifferentiate into functional hepatocytes. However, the subset of BECs with such facultative tissue stem cell potential, as well as the mechanisms enabling transdifferentiation, remains elusive. Here we identify a transitional liver progenitor cell (TLPC), which originates from BECs and differentiates into hepatocytes during regeneration from severe liver injury. By applying a dual genetic lineage tracing approach, we specifically labeled TLPCs and found that they are bipotent, as they either differentiate into hepatocytes or re-adopt BEC fate. Mechanistically, Notch and Wnt/β-catenin signaling orchestrate BEC-to-TLPC and TLPC-to-hepatocyte conversions, respectively. Together, our study provides functional and mechanistic insights into transdifferentiation-assisted liver regeneration.
Gastric cancer is among the most common malignancies worldwide, characterized by geographical, epidemiological and histological heterogeneity. Here, we report an extensive, multiancestral landscape of driver events in gastric cancer, involving 1,335 cases. Seventy-seven significantly mutated genes (SMGs) were identified, including ARHGAP5 and TRIM49C. We also identified subtype-specific drivers, including PIGR and SOX9, which were enriched in the diffuse subtype of the disease. SMGs also varied according to Epstein–Barr virus infection status and ancestry. Non-protein-truncating CDH1 mutations, which are characterized by in-frame splicing alterations, targeted localized extracellular domains and uniquely occurred in sporadic diffuse-type cases. In patients with gastric cancer with East Asian ancestry, our data suggested a link between alcohol consumption or metabolism and the development of RHOA mutations. Moreover, mutations with potential roles in immune evasion were identified. Overall, these data provide comprehensive insights into the molecular landscape of gastric cancer across various subtypes and ancestries.
Women with germline BRCA1 mutations (BRCA1+/mut) have increased risk for hereditary breast cancer. Cancer initiation in BRCA1+/mut is associated with premalignant changes in breast epithelium; however, the role of the epithelium-associated stromal niche during BRCA1-driven tumor initiation remains unclear. Here we show that the premalignant stromal niche promotes epithelial proliferation and mutant BRCA1-driven tumorigenesis in trans. Using single-cell RNA sequencing analysis of human preneoplastic BRCA1+/mut and noncarrier breast tissues, we show distinct changes in epithelial homeostasis including increased proliferation and expansion of basal-luminal intermediate progenitor cells. Additionally, BRCA1+/mut stromal cells show increased expression of pro-proliferative paracrine signals. In particular, we identify pre-cancer-associated fibroblasts (pre-CAFs) that produce protumorigenic factors including matrix metalloproteinase 3 (MMP3), which promotes BRCA1-driven tumorigenesis in vivo. Together, our findings demonstrate that precancerous stroma in BRCA1+/mut may elevate breast cancer risk through the promotion of epithelial proliferation and an accumulation of luminal progenitor cells with altered differentiation.
Current risk assessment and treatment strategies for venous thromboembolism (VTE) consider genetic factors only in a limited way. New work shows a more pervasive role of common variants in VTE risk, inspiring genetic predictors that surpass and complement individual clinical risk factors and monogenic thrombophilia testing.
Identification of host determinants of coronavirus infection informs mechanisms of viral pathogenesis and can provide new drug targets. Here we demonstrate that mammalian SWItch/Sucrose Non-Fermentable (mSWI/SNF) chromatin remodeling complexes, specifically canonical BRG1/BRM-associated factor (cBAF) complexes, promote severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection and represent host-directed therapeutic targets. The catalytic activity of SMARCA4 is required for mSWI/SNF-driven chromatin accessibility at the ACE2 locus, ACE2 expression and virus susceptibility. The transcription factors HNF1A/B interact with and recruit mSWI/SNF complexes to ACE2 enhancers, which contain high HNF1A motif density. Notably, small-molecule mSWI/SNF ATPase inhibitors or degraders abrogate angiotensin-converting enzyme 2 (ACE2) expression and confer resistance to SARS-CoV-2 variants and a remdesivir-resistant virus in three cell lines and three primary human cell types, including airway epithelial cells, by up to 5 logs. These data highlight the role of mSWI/SNF complex activities in conferring SARS-CoV-2 susceptibility and identify a potential class of broad-acting antivirals to combat emerging coronaviruses and drug-resistant variants.
In cancer, evolutionary forces select for clones that evade the immune system. Here we analyzed >10,000 primary tumors and 356 immune-checkpoint-treated metastases using immune dN/dS, the ratio of nonsynonymous to synonymous mutations in the immunopeptidome, to measure immune selection in cohorts and individuals. We classified tumors as immune edited when antigenic mutations were removed by negative selection and immune escaped when antigenicity was covered up by aberrant immune modulation. Only in immune-edited tumors was immune predation linked to CD8 T cell infiltration. Immune-escaped metastases experienced the best response to immunotherapy, whereas immune-edited patients did not benefit, suggesting a preexisting resistance mechanism. Similarly, in a longitudinal cohort, nivolumab treatment removed neoantigens exclusively in the immunopeptidome of nonimmune-edited patients, the group with the best overall survival response. Our work uses dN/dS to differentiate between immune-edited and immune-escaped tumors, measuring potential antigenicity and ultimately helping predict response to treatment.
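As a rough illustration of the quantity involved, a dN/dS ratio is commonly computed by normalizing nonsynonymous and synonymous mutation counts by the number of sites at which each type of change could occur. The sketch below assumes that simple site-normalized form; the function name `dnds`, its inputs and the example counts are hypothetical, and the study's own estimator may differ (for example, by correcting for mutational signatures).

```python
def dnds(n_nonsyn, n_syn, sites_nonsyn, sites_syn):
    """Site-normalized dN/dS: (nonsynonymous rate) / (synonymous rate).

    n_nonsyn, n_syn         observed mutation counts in the region of interest
    sites_nonsyn, sites_syn number of possible nonsynonymous / synonymous sites
    """
    if n_syn == 0 or sites_nonsyn == 0 or sites_syn == 0:
        raise ValueError("dN/dS is undefined without synonymous mutations or sites")
    dn = n_nonsyn / sites_nonsyn
    ds = n_syn / sites_syn
    return dn / ds

# Hypothetical counts restricted to an immunopeptidome-like region.
ratio = dnds(n_nonsyn=30, n_syn=20, sites_nonsyn=300, sites_syn=100)
print(round(ratio, 3))
```

Under this convention, a ratio below 1 indicates a depletion of nonsynonymous mutations relative to the synonymous baseline, consistent with negative selection, while a ratio near 1 is consistent with neutrality.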
Zygotic genome activation (ZGA) is a critical postfertilization step that promotes totipotency and allows different cell fates to emerge in the developing embryo. MERVL (murine endogenous retrovirus-L) is transiently upregulated at the two-cell stage during ZGA. Although MERVL expression is widely used as a marker of totipotency, the role of this retrotransposon in mouse embryogenesis remains elusive. Here, we show that full-length MERVL transcripts, but not encoded retroviral proteins, are essential for accurate regulation of the host transcriptome and chromatin state during preimplantation development. Both knockdown and CRISPRi-based repression of MERVL result in embryonic lethality due to defects in differentiation and genomic stability. Furthermore, transcriptome and epigenome analysis revealed that loss of MERVL transcripts led to retention of an accessible chromatin state at, and aberrant expression of, a subset of two-cell-specific genes. Taken together, our results suggest a model in which an endogenous retrovirus plays a key role in regulating host cell fate potential.
Epigenetic reprogramming in the germline contributes to the erasure of epigenetic inheritance across generations in mammals but remains poorly characterized in plants. Here we profiled histone modifications throughout Arabidopsis male germline development. We find that the sperm cell has widespread apparent chromatin bivalency, which is established by the acquisition of H3K27me3 or H3K4me3 at pre-existing H3K4me3 or H3K27me3 regions, respectively. These bivalent domains are associated with a distinct transcriptional status. Somatic H3K27me3 is generally reduced in sperm, while dramatic loss of H3K27me3 is observed at only ~700 developmental genes. The incorporation of the histone variant H3.10 facilitates the establishment of sperm chromatin identity without a strong impact on resetting of somatic H3K27me3. Vegetative nuclei harbor thousands of specific H3K27me3 domains at repressed genes, while pollination-related genes are highly expressed and marked by gene body H3K4me3. Our work highlights putative chromatin bivalency and restricted resetting of H3K27me3 at developmental regulators as key features in plant pluripotent sperm.
Pearl millet is an important cereal crop worldwide and shows superior heat tolerance. Here, we developed a graph-based pan-genome by assembling ten chromosomal genomes with one existing assembly adapted to different climates worldwide and captured 424,085 genomic structural variations (SVs). Comparative genomics and transcriptomics analyses revealed the expansion of the RWP-RK transcription factor family and the involvement of endoplasmic reticulum (ER)-related genes in heat tolerance. The overexpression of one RWP-RK gene led to enhanced plant heat tolerance and transactivated ER-related genes quickly, supporting the important roles of RWP-RK transcription factors and ER system in heat tolerance. Furthermore, we found that some SVs affected the gene expression associated with heat tolerance and SVs surrounding ER-related genes shaped adaptation to heat tolerance during domestication in the population. Our study provides a comprehensive genomic resource revealing insights into heat tolerance and laying a foundation for generating more robust crops under the changing climate.
Lung-function impairment underlies chronic obstructive pulmonary disease (COPD) and predicts mortality. In the largest multi-ancestry genome-wide association meta-analysis of lung function to date, comprising 580,869 participants, we identified 1,020 independent association signals implicating 559 genes supported by ≥2 criteria from a systematic variant-to-gene mapping framework. These genes were enriched in 29 pathways. Individual variants showed heterogeneity across ancestries, age and smoking groups, and collectively as a genetic risk score showed strong association with COPD across ancestry groups. We undertook phenome-wide association studies for selected associated variants as well as trait and pathway-specific genetic risk scores to infer possible consequences of intervening in pathways underlying lung function. We highlight new putative causal variants, genes, proteins and pathways, including those targeted by existing drugs. These findings bring us closer to understanding the mechanisms underlying lung function and COPD, and should inform functional genomics experiments and potentially future COPD therapies. Multi-ancestry genome-wide association analyses and systematic variant-to-gene mapping strategies implicate new genes and pathways influencing lung function and chronic obstructive pulmonary disease risk.
Schizophrenia (SCZ) is a chronic mental illness and among the most debilitating conditions encountered in medical practice. A recent landmark SCZ study of the protein-coding regions of the genome identified a causal role for ten genes and a concentration of rare variant signals in evolutionarily constrained genes¹. This recent study (and most other large-scale human genetics studies) was mainly composed of individuals of European (EUR) ancestry, and the generalizability of the findings in non-EUR populations remains unclear. To address this gap, we designed a custom sequencing panel of 161 genes selected based on the current knowledge of SCZ genetics and sequenced a new cohort of 11,580 SCZ cases and 10,555 controls of diverse ancestries. Replicating earlier work, we found that cases carried a significantly higher burden of rare protein-truncating variants (PTVs) among evolutionarily constrained genes (odds ratio = 1.48; P = 5.4 × 10⁻⁶). In meta-analyses with existing datasets totaling up to 35,828 cases and 107,877 controls, this excess burden was largely consistent across five ancestral populations. Two genes (SRRM2 and AKAP11) were newly implicated as SCZ risk genes, and one gene (PCLO) was identified as shared by individuals with SCZ and those with autism. Overall, our results lend robust support to the rare allelic spectrum of the genetic architecture of SCZ being conserved across diverse human populations. Targeted sequencing finds a higher burden of rare protein-truncating variants in constrained genes among schizophrenia cases of diverse ancestries. Meta-analyses with existing datasets show that this excess burden is consistent across five ancestral populations.
Multi-omic profiling of lesions at autopsy reveals a plethora of resistance mechanisms present within individual patients with ovarian cancer. This highlights the extreme challenge faced in treating end-stage disease and underscores the need for new methods of early detection and intervention.
High-grade serous ovarian cancer (HGSC) is frequently characterized by homologous recombination (HR) DNA repair deficiency and, while most such tumors are sensitive to initial treatment, acquired resistance is common. We undertook a multiomics approach to interrogate molecular diversity in end-stage disease, using multiple autopsy samples collected from 15 women with HR-deficient HGSC. Patients had polyclonal disease, and several resistance mechanisms were identified within most patients, including reversion mutations and HR restoration by other means. We also observed frequent whole-genome duplication and global changes in immune composition with evidence of immune escape. This analysis highlights diverse evolutionary changes within HGSC that evade therapy and ultimately overwhelm individual patients.
Identification of therapeutic targets from genome-wide association studies (GWAS) requires insights into downstream functional consequences. We harmonized 8,613 RNA-sequencing samples from 14 brain datasets to create the MetaBrain resource and performed cis- and trans-expression quantitative trait locus (eQTL) meta-analyses in multiple brain region- and ancestry-specific datasets (n ≤ 2,759). Many of the 16,169 cortex cis-eQTLs were tissue-dependent when compared with blood cis-eQTLs. We inferred brain cell types for 3,549 cis-eQTLs by interaction analysis. We prioritized 186 cis-eQTLs for 31 brain-related traits using Mendelian randomization and co-localization including 40 cis-eQTLs with an inferred cell type, such as a neuron-specific cis-eQTL (CYP24A1) for multiple sclerosis. We further describe 737 trans-eQTLs for 526 unique variants and 108 unique genes. We used brain-specific gene-co-regulation networks to link GWAS loci and prioritize additional genes for five central nervous system diseases. This study represents a valuable resource for post-GWAS research on central nervous system diseases.
Interacting proteins tend to have similar functions, influencing the same organismal traits. Interaction networks can be used to expand the list of candidate trait-associated genes from genome-wide association studies. Here, we performed network-based expansion of trait-associated genes for 1,002 human traits showing that this recovers known disease genes or drug targets. The similarity of network expansion scores identifies groups of traits likely to share an underlying genetic and biological process. We identified 73 pleiotropic gene modules linked to multiple traits, enriched in genes involved in processes such as protein ubiquitination and RNA processing. In contrast to gene deletion studies, pleiotropy as defined here captures specifically multicellular-related processes. We show examples of modules linked to human diseases enriched in genes with known pathogenic variants that can be used to map targets of approved drugs for repurposing. Finally, we illustrate the use of network expansion scores to study genes at inflammatory bowel disease genome-wide association study loci, and implicate inflammatory bowel disease-relevant genes with strong functional and genetic support.
In the context of climate change, drought is one of the most limiting factors that influence crop production. Maize, as a major crop, is highly vulnerable to water deficit, which causes significant yield loss. Thus, identification and utilization of drought-resistant germplasm are crucial for the genetic improvement of the trait. Here we report on a high-quality genome assembly of a prominent drought-resistant genotype, CIMBL55. Genomic and genetic variation analyses revealed that 65 favorable alleles of 108 previously identified drought-resistant candidate genes were found in CIMBL55, which may constitute the genetic basis for its excellent drought resistance. Notably, ZmRtn16, encoding a reticulon-like protein, was found to contribute to drought resistance by facilitating the vacuole H⁺-ATPase activity, which highlights the role of vacuole proton pumps in maize drought resistance. The assembled CIMBL55 genome provided a basis for genetic dissection and improvement of plant drought resistance, in support of global food security.
Obesity-associated morbidity is exacerbated by abdominal obesity, which can be measured as the waist-to-hip ratio adjusted for the body mass index (WHRadjBMI). Here we identify genes associated with obesity and WHRadjBMI and characterize allele-sensitive enhancers that are predicted to regulate WHRadjBMI genes in women. We found that several waist-to-hip ratio-associated variants map within primate-specific Alu retrotransposons harboring a DNA motif associated with adipocyte differentiation. This suggests that a genetic component of adipose distribution in humans may involve co-option of retrotransposons as adipose enhancers. We evaluated the role of the strongest female WHRadjBMI-associated gene, SNX10, in adipose biology. We determined that it is required for human adipocyte differentiation and function and participates in diet-induced adipose expansion in female mice, but not males. Our data identify genes and regulatory mechanisms that underlie female-specific adipose distribution and mediate metabolic dysfunction in women.
Even for essential splice-site variants that are almost guaranteed to alter mRNA splicing, no current method can reliably predict whether exon-skipping, cryptic activation or multiple events will result, greatly complicating clinical interpretation of pathogenicity. Strikingly, ranking the four most common unannotated splicing events across 335,663 reference RNA-sequencing (RNA-seq) samples (300K-RNA Top-4) predicts the nature of variant-associated mis-splicing with 92% sensitivity. The 300K-RNA Top-4 events correctly identify 96% of exon-skipping events and 86% of cryptic splice sites for 140 clinical cases subject to RNA testing, showing higher sensitivity and positive predictive value than SpliceAI. Notably, RNA re-analyses showed we had missed 300K-RNA Top-4 events for several clinical cases tested before the development of this empirical predictive method. Simply, mis-splicing events that happen around a splice site in RNA-seq data are those most likely to be activated by a splice-site variant. The SpliceVault web portal allows users easy access to 300K-RNA for informed splice-site variant interpretation and classification.
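The ranking idea described above can be sketched in a few lines: count, across reference samples, how often each unannotated splicing event is seen at a given splice site, then keep the four most frequent. This is a toy illustration under stated assumptions, not the SpliceVault implementation; the event labels, the per-sample input format and the helper name `top_k_events` are hypothetical.

```python
from collections import Counter

def top_k_events(sample_events, k=4):
    """Rank unannotated splicing events by the number of samples they occur in."""
    counts = Counter()
    for events in sample_events:
        counts.update(set(events))  # count each event at most once per sample
    return [event for event, _ in counts.most_common(k)]

# Hypothetical events observed around one splice site in five reference samples.
reference_samples = [
    ["skip_exon5", "cryptic_donor_+12", "cryptic_donor_-40", "skip_exon5_6", "cryptic_acceptor_+7"],
    ["skip_exon5", "cryptic_donor_+12", "cryptic_donor_-40", "skip_exon5_6"],
    ["skip_exon5", "cryptic_donor_+12", "cryptic_donor_-40"],
    ["skip_exon5", "cryptic_donor_+12"],
    ["skip_exon5"],
]

top4 = top_k_events(reference_samples)
observed = "cryptic_donor_-40"  # event seen in a hypothetical patient RNA test
print(top4, observed in top4)
```

Counting samples rather than reads is a design choice made here for simplicity; a read-weighted ranking would need only a change to how `counts` is updated.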
How enhancers activate their distal target promoters remains incompletely understood. Here we dissect how CTCF-mediated loops facilitate and restrict such regulatory interactions. Using an allelic series of mouse mutants, we show that CTCF is neither required for the interaction of the Sox2 gene with distal enhancers, nor for its expression. Insertion of various combinations of CTCF motifs, between Sox2 and its distal enhancers, generated boundaries with varying degrees of insulation that directly correlated with reduced transcriptional output. However, in both epiblast and neural tissues, enhancer contacts and transcriptional induction could not be fully abolished, and insertions failed to disrupt implantation and neurogenesis. In contrast, Sox2 expression was undetectable in the anterior foregut of mutants carrying the strongest boundaries, and these animals fully phenocopied loss of SOX2 in this tissue. We propose that enhancer clusters with a high density of regulatory activity can better overcome physical barriers to maintain faithful gene expression and phenotypic robustness. Genetic manipulation of the Sox2 locus in mice shows that gene activation by distal enhancers does not require CTCF-mediated loops and can occur across ectopic CTCF-mediated boundaries. The ability to bypass CTCF boundaries varies with their insulation strength and the tissue-specific enhancers responsible for activation.
Attention-deficit hyperactivity disorder (ADHD) is a prevalent neurodevelopmental disorder with a major genetic component. Here, we present a genome-wide association study meta-analysis of ADHD comprising 38,691 individuals with ADHD and 186,843 controls. We identified 27 genome-wide significant loci, highlighting 76 potential risk genes enriched among genes expressed particularly in early brain development. Overall, ADHD genetic risk was associated with several brain-specific neuronal subtypes and midbrain dopaminergic neurons. In exome-sequencing data from 17,896 individuals, we identified an increased load of rare protein-truncating variants in ADHD for a set of risk genes enriched with probable causal common variants, potentially implicating SORCS3 in ADHD by both common and rare variants. Bivariate Gaussian mixture modeling estimated that 84–98% of ADHD-influencing variants are shared with other psychiatric disorders. In addition, common-variant ADHD risk was associated with impaired complex cognition such as verbal reasoning and a range of executive functions, including attention. Genome-wide analyses identify 27 loci associated with attention-deficit hyperactivity disorder and provide insights into its genetic architecture in relation to other psychiatric disorders and cognitive traits.
Most transcriptome-wide association studies (TWASs) so far focus on European ancestry and lack diversity. To overcome this limitation, we aggregated genome-wide association study (GWAS) summary statistics, whole-genome sequences and expression quantitative trait locus (eQTL) data from diverse ancestries. We developed a new approach, TESLA (multi-ancestry integrative study using an optimal linear combination of association statistics), to integrate an eQTL dataset with a multi-ancestry GWAS. By exploiting shared phenotypic effects between ancestries and accommodating potential effect heterogeneities, TESLA improves power over other TWAS methods. When applied to tobacco use phenotypes, TESLA identified 273 new genes, up to 55% more compared with alternative TWAS methods. These hits and subsequent fine mapping using TESLA point to target genes with biological relevance. In silico drug-repurposing analyses highlight several drugs with known efficacy, including dextromethorphan and galantamine, and new drugs such as muscle relaxants that may be repurposed for treating nicotine addiction. A multi-ancestry transcriptome-wide association study using an optimal linear combination of association statistics provides insights into tobacco use biology and suggests opportunities for drug repurposing.
APOBEC mutational signatures SBS2 and SBS13 are common in many human cancer types. However, the stimulus for the underlying APOBEC mutagenesis, when it occurs in the progression from normal to cancer cell and the APOBEC enzymes responsible remain incompletely understood. Here we whole-genome sequenced 342 microdissected normal epithelial crypts from the small intestines of 39 individuals and found that SBS2/SBS13 mutations were present in 17% of crypts, more frequently than in most other normal tissues. Crypts with SBS2/SBS13 often had immediate crypt neighbors without SBS2/SBS13, suggesting that the underlying cause of SBS2/SBS13 is cell-intrinsic. APOBEC mutagenesis occurred in an episodic manner throughout the human lifespan, including in young children. APOBEC1 mRNA levels were very high in the small intestine epithelium, but low in the large intestine epithelium and other tissues. The results suggest that the high levels of SBS2/SBS13 in the small intestine are collateral damage from APOBEC1 fulfilling its physiological function of editing APOB mRNA. Whole-genome sequencing of healthy human epithelial crypts from the small intestines of 39 individuals highlights APOBEC enzymes as a common contributor to the overall mutational burden in this tissue.
Precision medicine promises to transform healthcare for groups and individuals through early disease detection, refining diagnoses and tailoring treatments. Analysis of large-scale genomic–phenotypic databases is a critical enabler of precision medicine. Although Asia is home to 60% of the world’s population, many Asian ancestries are under-represented in existing databases, leading to missed opportunities for new discoveries, particularly for diseases most relevant for these populations. The Singapore National Precision Medicine initiative is a whole-of-government 10-year initiative aiming to generate precision medicine data of up to one million individuals, integrating genomic, lifestyle, health, social and environmental data. Beyond technologies, routine adoption of precision medicine in clinical practice requires social, ethical, legal and regulatory barriers to be addressed. Identifying driver use cases in which precision medicine results in standardized changes to clinical workflows or improvements in population health, coupled with health economic analysis to demonstrate value-based healthcare, is a vital prerequisite for responsible health system adoption. This Perspective article discusses Singapore’s efforts to implement a National Precision Medicine Strategy through the integration of genomic, clinical and lifestyle data of up to one million Singaporean individuals.
We report a genome-wide association study of venous thromboembolism (VTE) incorporating 81,190 cases and 1,419,671 controls sampled from six cohorts. We identify 93 risk loci, of which 62 are previously unreported. Many of the identified risk loci are at genes encoding proteins with functions converging on the coagulation cascade or platelet function. A VTE polygenic risk score (PRS) enabled effective identification of both high- and low-risk individuals. Individuals within the top 0.1% of PRS distribution had a VTE risk similar to homozygous or compound heterozygous carriers of the variants G20210A (c.*97G>A) in F2 and p.R534Q in F5. We also document that F2 and F5 mutation carriers in the bottom 10% of the PRS distribution had a risk similar to that of the general population. We further show that PRS improved individual risk prediction beyond that of genetic and clinical risk factors. We investigated the extent to which venous and arterial thrombosis share clinical risk factors using Mendelian randomization, finding that some risk factors for arterial thrombosis were directionally concordant with VTE risk (for example, body mass index and smoking) whereas others were discordant (for example, systolic blood pressure and triglyceride levels).
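For readers unfamiliar with polygenic risk scores, the usual construction is a weighted sum of risk-allele dosages, after which individuals can be stratified by their position in the score distribution. The sketch below assumes that standard form; the weights, dosages and helper names `polygenic_score` and `in_top_fraction` are hypothetical and do not reproduce the score built in the study.

```python
import numpy as np

def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (rows: individuals, columns: variants)."""
    return np.asarray(dosages, dtype=float) @ np.asarray(weights, dtype=float)

def in_top_fraction(scores, frac):
    """Boolean mask of individuals at or above the (1 - frac) quantile."""
    return scores >= np.quantile(scores, 1.0 - frac)

# Hypothetical per-variant effect sizes (for example, log odds ratios).
weights = [0.2, -0.1, 0.5]
# Hypothetical dosages (0, 1 or 2 copies of the effect allele).
dosages = [
    [0, 1, 2],
    [2, 0, 0],
    [1, 1, 1],
]

scores = polygenic_score(dosages, weights)
high_risk = in_top_fraction(scores, frac=0.5)
print(scores, high_risk)
```

In a real cohort the same mask with `frac=0.001` would pick out the top 0.1% of the distribution, the stratum highlighted in the abstract.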