Human Genetics

Published by Springer Nature
Online ISSN: 1432-1203
Recent publications
Building the Effector Index and enrichment for likely target genes by statistical fine-mapping. a Flow diagram depicting: (1) how data were generated using fine-mapping of GWAS summary statistics, followed by SNV annotation and pairing to genes at each GWAS locus; (2–3) how these data are used to generate gene- and locus-level features, followed by fitting their feature weights within the models using a leave-one-out analysis; and (4) assessing the performance of the models to predict target genes for loci containing positive control genes. b Ratio of enrichment for positive control genes within ± 25 Kbp of genome-wide significant SNVs (P < 5 × 10⁻⁸) compared to SNVs having log10(BF) > 2 after fine-mapping. Fold enrichment was calculated as the ratio of the proportion of positive control genes targeted to the proportion of all genes targeted. c Comparison of enrichments for genome-wide significant SNVs (x-axis) vs. SNVs with log10(BF) > 2 (y-axis) within trait-specific DHS sites (see “Methods”)
Enrichment of genomic landscape features with positive control genes. Genes with protein or transcript-altering SNVs were assessed separately. Non-coding SNVs were classified by overlap with trait-specific DHSs, distance to the TSS, and eQTL or pcHi-C evidence. a Summary of positive control genes at GWAS loci by relation to log10(BF) > 2 SNVs. Bar charts demonstrate the proportion of positive control genes identified by intersection of fine-mapped SNVs with different genomic landscape features. Results are separated by trait/disease. Genes were attributed to a single genomic landscape category in the order listed in figure legend above the plot. b Enrichments for each category of non-coding SNVs for positive control genes segregated by trait/disease. Enrichment for protein or transcript-altering variants was excluded for legibility. c Enrichment of positive control genes by distance to non-coding SNVs (x-axis) for all traits. Fold enrichment was calculated as the ratio between the proportion of positive control genes targeted to the proportion of all genes targeted. The SNVs with log10(BF) > 2 were further overlapped with a master list of DHSs in any cell or tissue type (Table S5)
Performance of the Effector index at loci containing positive control genes. The performance of the Effector index compared to logistic regression, eCAVIAR, and DEPICT for predicting positive control genes for: a all 12 traits, b type 2 diabetes only, and c type 2 diabetes with the addition of manually curated causal genes from large-scale exon array and exome-sequencing studies (Mahajan and McCarthy 2019). Areas under the curve are provided in parentheses and are segregated by trait/disease in Table 3. Performance using the nearest gene to the lead SNV by P value is also shown by open circles
Features selected by the Effector index and comparison of the Effector index to use of only locus-level features. a Top 20 features selected by the Effector index where each importance value provided is the absolute mean importance of that feature across the 12 traits. Locus-level features shown (in blue) are those that do not vary across genes at a locus. Features that incorporate distance to gene are displayed using triangles (Δg denotes SNV-gene distance; ‘genic’ denotes that the SNV overlaps the gene body). b, c The ROC (b) and PRC (c) curves for only locus-level features versus the use of all features in the Ei model. Areas under the curve are provided in parentheses. d Leading edge analysis shows that the peak enrichment score for positive control genes occurs at an Ei probability of 0.46 (red point), a threshold above which we observe 99 of 159 positive control genes (vertical grey lines). e Using the peak Ei threshold of 0.46 considerably reduces the number of genes per locus. For instance, 78% of loci contain 4 or fewer genes with Ei > 0.46 (red open circle), whereas only 28% of loci contain 4 or fewer genes when no threshold is applied (black line open circle)
Drug development and biological discovery require effective strategies to map existing genetic associations to causal genes. To approach this problem, we selected 12 common diseases and quantitative traits for which highly powered genome-wide association studies (GWAS) were available. For each disease or trait, we systematically curated positive control gene sets from Mendelian forms of the disease and from targets of medicines used for disease treatment. We found that these positive control genes were highly enriched in proximity of GWAS-associated single-nucleotide variants (SNVs). We then performed quantitative assessment of the contribution of commonly used genomic features, including open chromatin maps, expression quantitative trait loci (eQTL), and chromatin conformation data. Using these features, we trained and validated an Effector Index (Ei), to map target genes for these 12 common diseases and traits. Ei demonstrated high predictive performance, both with cross-validation on the training set, and an independently derived set for type 2 diabetes. Key predictive features included coding or transcript-altering SNVs, distance to gene, and open chromatin-based metrics. This work outlines a simple, understandable approach to prioritize genes at GWAS loci for functional follow-up and drug development, and provides a systematic strategy for prioritization of GWAS target genes.
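The figure captions for this article define fold enrichment as the ratio of the proportion of positive control genes targeted to the proportion of all genes targeted. A minimal Python sketch of that arithmetic follows. The function name, the argument names, and the counts in the usage lines are hypothetical and are not taken from the study.

```python
def fold_enrichment(pos_targeted, pos_total, all_targeted, all_total):
    """Proportion of positive control genes targeted divided by the
    proportion of all genes targeted (definition from the caption).

    All four arguments are gene counts; the names are illustrative.
    """
    if pos_total <= 0 or all_total <= 0:
        raise ValueError("gene totals must be positive")
    prop_pos = pos_targeted / pos_total
    prop_all = all_targeted / all_total
    if prop_all == 0:
        raise ValueError("no genes targeted; enrichment is undefined")
    return prop_pos / prop_all


# Hypothetical counts for illustration only (not data from the paper).
example = fold_enrichment(pos_targeted=30, pos_total=60,
                          all_targeted=500, all_total=20000)
print(f"fold enrichment = {example:.1f}")
```

A value above 1 means that positive control genes are targeted more often than genes in general; a value of 1 means no enrichment.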
Pedigree of the family with a neurodevelopmental disorder with hypotonia and contractures (A). Schematic representation of C18orf32 protein with variant identified in the present study and multiple sequence alignment showing conservation of amino acids (B). The clinical picture of the proband showing no facial dysmorphism (C). MRI of the brain showing delayed myelination and prominent fronto-temporal subdural spaces (D and E). Flow cytometry analysis of fibroblasts derived from asymptomatic parents compared to healthy controls. The histograms are representative of at least two separate experiments using three different controls (F). Total RNA from the asymptomatic parents and three unrelated controls was extracted from fibroblasts. C18orf32 relative mRNA levels were measured using qRT-PCR and compared to the control gene TBP. Error bars represent standard deviations
The C18orf32 knockout HEK293 cells were treated with (indicated by gray peak) or without (indicated by the solid line) PIPLC, and stained with CD59 followed by flow cytometry analysis. Dashed lines show background (A). The resistance of GPI-APs against PIPLC in C18orf32-KO cells was rescued by expression of wild-type C18orf32 but not mutant C18orf32 (dupC). CD59 levels remained after PIPLC treatment with the value in wild-type cells set as 1 and the relative values were calculated. The results are shown as means ± SD from three independent experiments (B). P values (two-tailed, Student’s t test) are shown. Immunoblot of EGFP-C18orf32 prepared from cell lysates (C). GAPDH was used as a loading control. Localization of EGFP-C18orf32 WT (top panels) and EGFP-C18orf32 (dupC) (middle and lower panels) was analyzed in C18orf32-KO cells (D). mRFP-KDEL was used as an ER marker. Three days after transfection, images were obtained using confocal microscopy. Over 50% of EGFP-C18orf32 (dupC) were localized to the nucleus as aggregates (middle panels), whereas some of them were still merged with mRFP-KDEL (lower panels). DAPI staining is shown as blue in merged images. Bars, 10 μm
Glycosylphosphatidylinositol (GPI) functions to anchor certain proteins to the cell surface. Although defects in GPI biosynthesis can result in a wide range of phenotypes, most affected patients present with neurological abnormalities and their diseases are grouped as inherited-GPI deficiency disorders. We present two siblings with global developmental delay, brain anomalies, hypotonia, and contractures. Exome sequencing revealed a homozygous variant, NM_001035005.4:c.90dupC (p.Phe31Leufs*3) in C18orf32, a gene not previously associated with any disease in humans. The encoded protein is known to be important for GPI-inositol deacylation. Knockout of C18orf32 in HEK293 cells followed by a transfection rescue assay revealed that the PIPLC (Phosphatidylinositol-Specific Phospholipase C) sensitivity of GPI-APs (GPI-anchored proteins) was restored only by the wild type and not the mutant C18orf32. Immunofluorescence revealed that the mutant C18orf32 was localized to the endoplasmic reticulum and was also found as aggregates in the nucleus. In conclusion, we identified a pathogenic variant in C18orf32 as the cause of a novel autosomal recessive neurodevelopmental disorder with hypotonia and contractures. Our results demonstrate the importance of C18orf32 in the biosynthesis of GPI-anchors, the molecular impact of the variant on the protein function, and add a novel candidate gene to the existing repertoire of genes implicated in neurodevelopmental disorders.
Illustration of the trypsinogen (PRSS1 and PRSS2) copy number gains and rs10273639C-tagged genotypes, their anticipated hypothetical trypsinogen expression levels, genetic effect sizes and global carrier frequencies. It should be noted that the PRSS1-PRSS2 expression levels in carriers of trypsinogen copy number gain variants are purely hypothetical whereas those in carriers of rs10273639C-tagged heterozygous and homozygous genotypes were supported by experimental data. See text for details of how the genetic effect sizes were determined. Carrier frequencies in control populations were derived from gnomAD (SVs v2.1 for the Tri and Dup genotypes and v2.1.1 for the rs10273639C-tagged genotypes). Gray bar, the PRSS1-PRSS2 loci on chromosome 7q34. Star, the rs10273639C-tagged allele. WT wild-type, Tri triplication, Dup duplication, Hom homozygote, Het heterozygote, HCP hereditary chronic pancreatitis, OR odds ratio
of findings from three studies Athwal et al. (2014), Huang et al. (2020) and Wan et al. (2020) that reported the transgenic expression of the wild-type human PRSS2 or PRSS1 gene in mice. Unless specifically stated, the zygosity of the transgene in the genetically modified mice was unknown. In the Wan et al. study, the number of upward pointing arrows indicates the different levels of the expressed PRSS2 gene in the two mouse lines. PRSS2WT, wild-type PRSS2 gene; PRSS1WT, wild-type PRSS1 gene
Trypsinogen (PRSS1, PRSS2) copy number gains and regulatory variants have both been proposed to elevate pancreatitis risk through a gene dosage effect (i.e., by increasing the expression of wild-type protein). However, to date, their impact on pancreatitis risk has not been thoroughly evaluated whilst the underlying pathogenic mechanisms remain to be explicitly investigated in mouse models. Genetic studies of the rare trypsinogen duplication and triplication copy number variants (CNVs), and the common rs10273639C variant, were collated from PubMed and/or ClinVar. Mouse studies that analyzed the influence of a transgenically expressed wild-type human PRSS1 or PRSS2 gene on the development of pancreatitis were identified from PubMed. The genetic effects of the different risk genotypes, in terms of odds ratios, were calculated wherever appropriate. The genetic effects of the rare trypsinogen duplication and triplication CNVs were also evaluated by reference to their associated disease subtypes. We demonstrate a positive correlation between increased trypsinogen gene dosage and pancreatitis risk in the context of the rare duplication and triplication CNVs, and between the level of trypsinogen expression and disease risk in the context of the heterozygous and homozygous rs10273639C-tagged genotypes. We retrospectively identify three mouse transgenic studies that are informative in relation to the pathogenic mechanism underlying the trypsinogen gene dosage effect in pancreatitis. Trypsinogen gene dosage correlates with pancreatitis risk across genetic and transgenic studies, highlighting the fundamental role of dysregulated expression of wild-type trypsinogen in the etiology of pancreatitis. Specifically downregulating trypsinogen expression in the pancreas may serve as a potential therapeutic and/or prevention strategy for pancreatitis.
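The abstract reports genetic effects as odds ratios. As a hedged illustration of how an odds ratio can be derived from carrier counts in cases and controls, the sketch below uses the standard 2 × 2 cross-product with a log-based (Woolf) 95% confidence interval. The choice of interval method, the function name, and the counts are assumptions for illustration; the source does not state how its intervals were computed.

```python
import math


def odds_ratio(case_carriers, case_noncarriers,
               control_carriers, control_noncarriers, z=1.96):
    """Odds ratio and approximate 95% CI from a 2 x 2 table.

    Uses the cross-product (a*d)/(b*c) and the Woolf log method for
    the interval. This is an assumed, generic approach, not the
    procedure documented in the source.
    """
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts for illustration only (not data from the paper).
est, lo, hi = odds_ratio(10, 90, 5, 95)
print(f"OR = {est:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

With very rare variants such as the duplication and triplication CNVs, one or more cells can be zero, in which case this simple method does not apply and an exact or corrected approach would be needed.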
Coloboma, a congenital disorder characterized by gaps in ocular tissues, is caused when the choroid fissure fails to close during embryonic development. Several loci have been associated with coloboma, but these represent less than 40% of those that are involved with this disease. Here, we describe a novel coloboma-causing locus, BMP3. Whole exome sequencing and Sanger sequencing of patients with coloboma identified three variants in BMP3, two of which are predicted to be disease causing. Consistent with this, bmp3 mutant zebrafish have aberrant fissure closure. bmp3 is expressed in the ventral head mesenchyme and regulates phosphorylated Smad3 in a population of cells adjacent to the choroid fissure. Furthermore, mutations in bmp3 sensitize embryos to Smad3 inhibitor treatment resulting in open choroid fissures. Micro CT scans and Alcian blue staining of zebrafish demonstrate that mutations in bmp3 cause midface hypoplasia, suggesting that bmp3 regulates cranial neural crest cells. Consistent with this, we see active Smad3 in a population of periocular neural crest cells, and bmp3 mutant zebrafish have reduced neural crest cells in the choroid fissure. Taken together, these data suggest that Bmp3 controls Smad3 phosphorylation in neural crest cells to regulate early craniofacial and ocular development.
RC and RI histone proteins. A Gene names, aliases and Ensembl 37 Gene IDs of the replication-coupled (RC) (H2A: green, H2B: blue, H3: yellow, H4: purple) and H1 linker (gray) histone protein-encoding genes. Depiction of ~ 147 bp of DNA wrapped around a histone octamer to form a nucleosome and the H1 linker protein that joins adjacent nucleosomes. B Comparison of distinguishing features between RC and replication-independent (RI) histones. C Gene names, aliases and Ensembl 37 Gene IDs of the RI histone protein-encoding genes (color figure online)
The tolerance of human histone protein-encoding genes to missense variation. A Distribution of histone protein-encoding genes based on gnomAD Z scores. 7.9% (n = 7) of genes had Z scores ≥ 2, indicating significant intolerance of missense mutations. 46.7% (n = 7) of RI histone protein-encoding genes had missense Z scores ≥ 2 [13.3% (n = 2) with Z score ≥ 3 and 33.3% (n = 5) with Z score ≥ 2]. B Comparison of gnomAD missense Z scores with GTEx expression data demonstrating that all 7 genes determined to be significantly constrained against missense variation also met criteria for ubiquitous expression. Yellow rectangle identifies the putative disease-candidate genes. RC histone protein-encoding genes indicated by circles and RI histone protein-encoding genes indicated by triangles. Genes with missense Z scores ≥ 2 highlighted in teal. C Identification of genes that met criteria for significance. Three genes have previously been reported to harbor missense mutations underlying a human disease phenotype (H3F3A, H3F3B, H2AFZ) (bolded) while four of the genes identified in this screen are novel putative disease-candidate genes (H2AFY, H2AFY2, H2AFV, H2AFX). D Comparison of gnomAD missense Z scores with GTEx expression data with previously reported genes underlying Mendelian disease with conserved phenotype of neurodevelopmental syndrome coupled with craniofacial anomaly highlighted in blue
The tolerance of human histone protein-encoding genes to loss-of-function variation. A Distribution of histone protein-encoding genes based on gnomAD pLI scores. 1.1% (n = 1) of histone protein-encoding genes had a statistically significant pLI score (pLI > 0.9) [H2AFY2, pLI = 0.98]. 76% (n = 68) of genes had a pLI score < 0.1, indicating tolerance to accumulation of loss-of-function variation. B Distribution of histone protein-encoding genes based on gnomAD LOEUF values, color-coded based on gnomAD-identified genes with statistically significant LOEUF values [red: H2AFY2 (LOEUF = 0.3); orange: H2AFZ (LOEUF = 0.43), H2AFY (LOEUF = 0.48), H3F3B (LOEUF = 0.55); yellow: H2AFV (LOEUF = 0.82), H1F0 (LOEUF = 0.90), H1FOO (LOEUF = 0.97), HIST2H2AB (LOEUF = 0.98)]. C Comparison of gnomAD LOEUF values with GTEx expression data demonstrating that the top 6 histone protein-encoding genes most constrained against loss-of-function variation also met criteria for ubiquitous expression (horizontal yellow box). Two genes highly constrained against loss-of-function variation did not meet criteria for ubiquitous expression (vertical orange box). RC histone protein-encoding genes indicated by circles and RI histone protein-encoding genes indicated by triangles. Genes with significant LOEUF values highlighted in teal. D Identification of genes that met criteria for significance (horizontal yellow box: H2AFY2, H2AFZ, H2AFY, H3F3B, H2AFV, H1F0) and constrained genes that were not ubiquitously expressed (vertical orange box: H1FOO, HIST2H2AB). E Of the 6 genes identified as putative disease-candidate genes (horizontal yellow box), none have been previously reported to harbor loss-of-function variants underlying a human disease phenotype. HIST1H1E (highlighted in teal, labeled) is the only histone protein-encoding gene previously reported to have a loss-of-function mutation underlying the conserved phenotype of neurodevelopmental disorder coupled with craniofacial anomaly (color figure online)
While germline variants in histone protein-encoding genes are emerging as the pathogenic mutations underlying rare, Mendelian disorders characterized by a conserved phenotype of neurodevelopmental syndrome coupled with craniofacial abnormalities, a systematic assessment of all human genes encoding histone proteins has not been performed to predict novel disease-candidate genes. We first defined a comprehensive list of 89 histone-encoding genes. We then analyzed which are most likely to underlie this conserved phenotype when mutated based on their intolerance to either missense or loss-of-function variation and based on their tissue expression profile. Strikingly few genes were found to be both ubiquitously expressed and significantly constrained against missense (7.9%, n = 7) or loss-of-function (6.7%, n = 6) variation. Notably, most of those significantly constrained genes encode replication-independent, variant histone proteins (7/7 in the missense analysis, 5/6 in the loss-of-function analysis). Of the seven genes predicted to be disease-causing when germline missense variation is present, three (H2AFV, H2AFY, H2AFY2) are novel disease-candidate genes. Five of the six genes predicted to be disease-causing with an underlying germline loss-of-function variant are novel disease-candidate genes (H2AFY2, H2AFZ, H2AFY, H2AFV, H1F0). These findings may serve as a focused reference for future sequencing of patients with the conserved phenotype.
Defective left–right (LR) patterning results in a spectrum of laterality disorders including situs inversus totalis (SIT) and heterotaxy syndrome (Htx). Approximately 50% of patients with primary ciliary dyskinesia (PCD) display SIT. Recessive variants in DNAH9 have recently been implicated in patients with situs inversus. Here, we describe six unrelated family trios and two sporadic patients with laterality defects and complex congenital heart disease (CHD). Through whole exome sequencing (WES), we identified compound heterozygous mutations in DNAH9 in the affected individuals of these family trios. Ex vivo cDNA amplification revealed that DNAH9 mRNA expression was significantly downregulated in these patients carrying biallelic DNAH9 mutations, which cause a premature stop codon or exon skipping. Transmission electron microscopy (TEM) analysis identified ultrastructural defects of the outer dynein arms in these affected individuals. dnah9 knockdown in zebrafish led to the disturbance of cardiac left–right patterning without affecting ciliogenesis in Kupffer’s vesicle (KV). By generating a Dnah9 knockout (KO) C57BL/6n mouse model, we found that Dnah9 loss leads to compromised cardiac function. In this study, we identified recessive DNAH9 mutations in Chinese patients with cardiac abnormalities and defective LR patterning.
Family pedigrees and photographs. A Pedigree illustration for family 1. The male proband (Individual 1; F1:IV-2) displaying intellectual disability (ID) and cardiomyopathy harbours a c.16G>C p.(A6P) variant which was maternally inherited. A maternal uncle (F1:III-5) had ID and cardiomyopathy and died at age 40, but was never genetically tested. B Pedigree illustration for family 2. Three males (Individuals 3–5; F2:II-5, F2:III-3 and F2:III-4) with ID were found to carry a maternally inherited c.235C>T p.(R79C) variant. Photographs show individuals 4 and 5. A fourth male (Individual 2; F2:II-6) was not genetically tested, but displayed a similar phenotype and was a suspected carrier of the variant. C Pedigree illustration for family 3. The female proband (Individual 6; F3:II-3) has a de novo c.384T>G p.(F128L) variant. She has ID, microcephaly and central vision impairment. D Pedigree illustration for family 4. Photograph shows the female proband (Individual 7; F4:II-1) found to be heterozygous for a de novo c.386A>C p.(Q129P) variant. She has facial dysmorphism, microcephaly and cardiac anomalies. E Pedigree illustration for family 5. Photograph shows the male proband (Individual 8; F5:III-1) with a maternally inherited c.469G>A p.(E157K) variant. He displays autistic features, facial dysmorphism and microcephaly. Ind. individual
NAA10 sequence conservation, NatA structure and substituted residues. A Multiple sequence alignment displaying NAA10 amino acid conservation (indicated by a blue gradient) across nine species. NAA10 variant sites are indicated in red text above the alignment. Secondary structures were derived from human NatA (PDB ID: 6C9M). B Human NatA crystal structure (PDB ID: 6C9M) with NAA15 (grey) and NAA10 (yellow). The positions of NAA10 substitutions are highlighted in red. Acetyl coenzyme A (Ac-CoA) (blue) and serine–alanine–serine–glutamate-starting peptide (SASE) (magenta) in the active site were embedded in the structure from S. pombe NatA (PDB ID: 4KVM)
Cellular stability analysis of NAA10 variants. HeLa cells transfected with NAA10 WT-V5, NAA10 A6P-V5 (A), NAA10 R79C-V5 (B), NAA10 Q129P-V5 (C), or NAA10 E157K-V5 (D) were treated with cycloheximide (CHX, 50 µg/ml) for 2–6 h and cell lysates were analysed by Western blot. Top panels in A–D: Western blot analysis of a CHX time course assay. Bottom panels in A–D: Stability curve showing the percentage level of NAA10-V5 at time points 2–6 h relative to the amount present at 0 h and β-tubulin as a loading control. Each stability curve shows the mean ± SD of three independent experiments performed per NAA10 variant. Significance was calculated by a two-tailed Student’s t test. ****P ≤ 0.0001; **P ≤ 0.01; *P ≤ 0.05; ns not significant P > 0.05
NatA complex formation and N-terminal acetylation by NAA10 variants. Top panels in A–D: Western blot analysis of V5-immunoprecipitation from HeLa cells overexpressing NAA10 WT-V5 or the variants NAA10 A6P-V5 (A), NAA10 R79C-V5 (B), NAA10 Q129P-V5 (C), or NAA10 E157K-V5 (D). Bottom panels in A–D: Immunoprecipitated NAA10 WT or NAA10 variants were comparatively tested in Nt-acetylation assays using the NatA substrate SESS and monomeric NAA10 in vitro substrate EEEI. β-gal-V5 pull-down was used as input in negative control reactions. The values for Nt-acetylated SESS and EEEI product formation were normalised to the band intensities of NAA15 and NAA10, respectively, and shown as relative to WT. The experiments were performed in at least three independent setups for each NAA10 variant (Fig. S1). One representative experiment with technical triplicates is shown. The mean of the triplicates is indicated by a black line
NAA10 is the catalytic subunit of the N-terminal acetyltransferase complex, NatA, which is responsible for N-terminal acetylation of nearly half the human proteome. Since 2011, at least 21 different NAA10 missense variants have been reported as pathogenic in humans. The clinical features associated with this X-linked condition vary, but commonly described features include developmental delay, intellectual disability, cardiac anomalies, brain abnormalities, facial dysmorphism and/or visual impairment. Here, we present eight individuals from five families with five different de novo or inherited NAA10 variants. In order to determine their pathogenicity, we have performed biochemical characterisation of the four novel variants c.16G>C p.(A6P), c.235C>T p.(R79C), c.386A>C p.(Q129P) and c.469G>A p.(E157K). Additionally, we clinically describe one new case with a previously identified pathogenic variant, c.384T>G p.(F128L). Our study provides important insight into how different NAA10 missense variants impact distinct biochemical functions of NAA10 involving the ability of NAA10 to perform N-terminal acetylation. These investigations may partially explain the phenotypic variability in affected individuals and emphasise the complexity of the cellular pathways downstream of NAA10.
Strategy and genetic analysis overview of two groups of CPT patients in this study. het: heterozygous. hom: homozygous
Up to 84% of patients with congenital pseudarthrosis of the tibia (CPT) present with neurofibromatosis type 1 (NF1) (NF1-CPT). However, the etiology of CPT not fulfilling the NIH diagnostic criteria for NF1 (non-NF1-CPT) is not well understood. Here, we collected the periosteum tissue from the pseudarthrosis (PA) site of 43 non-NF1-CPT patients and six patients with NF1-CPT, together with the blood or oral specimens of trios (probands and unaffected parents). Whole-exome plus copy number variation sequencing, multiplex ligation-dependent probe amplification (MLPA), ultra-high amplicon sequencing, and Sanger sequencing were employed to identify pathogenic variants. The results showed that tissues from nine of 43 non-NF1-CPT patients (21%) had somatic mono-allelic NF1 inactivation, and five of six NF1-CPT patients (83.3%) had bi-allelic NF1 inactivation in tissues. However, previous literature involving genetic testing has not so far revealed somatic mosaicism in non-NF1-CPT patients. In NF1-CPT patients, when the results from earlier reports and the present study were combined, 66.7% of them showed somatic NF1 inactivation in PA tissues other than germline inactivation. Furthermore, no diagnostic variants from other known genes (GNAS, AKT1, PDGFRB, and NOTCH3) related to skeletal dysplasia were identified in the nine NF1-positive non-NF1-CPT patients and six NF1-CPT patients. In conclusion, we detected evident somatic mono-allelic NF1 inactivation in non-NF1-CPT. Thus, for pediatric patients without an NF1 diagnosis, somatic mutations in NF1 are important.
The genetic background of familial, late-onset colorectal cancer (CRC) (i.e., onset > age 50 years) has not been studied as thoroughly as other subgroups of familial CRC, and the proportion of families with a germline genetic predisposition to CRC remains to be defined. To define the contribution of known or suggested CRC predisposition genes to familial late-onset CRC, we analyzed 32 well-established or candidate CRC predisposition genes in 75 families with late-onset CRC. We identified pathogenic or likely pathogenic variants in five patients in MSH6 (n = 1), MUTYH (monoallelic; n = 2) and NTHL1 (monoallelic; n = 2). In addition, we identified a number of variants of unknown significance in particular in the lower penetrant Lynch syndrome-associated mismatch repair (MMR) gene MSH6 (n = 6). In conclusion, screening using a comprehensive cancer gene panel in families with accumulation of late-onset CRC appears not to have a significant clinical value due to the low level of high-risk pathogenic variants detected. Our data suggest that only patients with abnormal MMR immunohistochemistry (IHC) or microsatellite instability (MSI) analyses, suggestive of Lynch syndrome, or a family history indicating another cancer predisposition syndrome should be prioritized for such genetic evaluations. Variants in MSH6 and MUTYH have previously been proposed to be involved in digenic or oligogenic hereditary predisposition to CRC. Accumulation of variants in MSH6 and monoallelic, pathogenic variants in MUTYH in our study indicates that digenic or oligogenic inheritance might be involved in late-onset CRC and warrants further studies of complex types of inheritance.
Properties of the mega-sample. A Distribution of the twins across the five studies (ADHD, OCS, Depression, Aging and Obesity, on the horizontal axis) combined in the current mega-sample, showing the zygosity of the pairs and their actual counts (stacked bars; gray represents the MZ twins and white the DZ twins). B Various types of twin pairs (represented by colors) across studies (for those pairs with only one member included, we show the single member’s sex). C Distribution of the intra-cranial volume (ICV) in cm³ across all studies as density plots per sex (colored areas and curves) and overall (black curve). D Distribution of age (in years) at the time of the MRI for each study (on the horizontal axis) separately as box plots. Generated automatically using R 4.1.3 (
The genetic covariance structure model (GCSM). Given a phenotypic measure PM and a twin pair, we denote as PM1 and PM2 the latent values of this measure for the two members of the twin pair, “Twin 1” and “Twin 2”. These are indexed each by the two raters, “Rater 1” and “Rater 2”, producing the four observed values, two per co-twin, denoted as PMij, where i ∈ {1,2} stands for the rater and j ∈ {1,2} for the twin; se² is the variance of the measurement error. The latent measurements PM1 and PM2 are each influenced by the effects of the additive genotype A, the non-shared environment E, and of the dominance genetic factor D or the shared environment C, as appropriate. The correlation between the additive factors A of the two twins differs between MZ (1.0) and DZ (1/2) twins, as do the correlations between dominance effects D (1.0 for MZ and 1/4 for DZ). The correlation of the shared environment C equals 1.0 by definition. Please note that the fixed effects of the covariates are included in the fitted model, but not represented in this figure to avoid cluttering. Drawn manually using LibreOffice Draw 7.2 (
Visual representation of the PMs with evidence for narrow-sense heritability in our data. For full size images, please see Figs. S4–S15. A–D Midsagittal view of several measures from various domains that belong, respectively, to class I (very strong evidence; A), class II (strong evidence; B), class III (moderate evidence; C), and class IV (circumstantial evidence; D). E–G Mandibular view of some mandibular measures in class I (E), class III (F), and class IV (G), respectively (there are no measures of class II in this view). H Hard palate view of a dentition measure in class IV (there are no measures of the other classes in this view). Colors help disambiguate the measures. Colored lines with dots represent distances, while solid colored lines with semi-circles represent angles. The decimal numbers after the measure codes are the point estimates of the narrow-sense heritabilities, h². We show only the measures in class IV and higher. Please note that ANSF (the angle between the line from nasion to sella and the Frankfort Horizontal Plane) is not shown (it belongs to class IV and should have appeared in D and G), as we did not find a satisfactory way of visually representing it. The PMs are described in Text S2 and Table S2; see also Table 2. Drawn manually based on Figs. S4–S15 using GIMP 2.10 (
While language is expressed in multiple modalities, including sign, writing, or whistles, speech is arguably the most common. The human vocal tract is capable of producing the bewildering diversity of the 7000 or so currently spoken languages, but relatively little is known about its genetic bases, especially in what concerns normal variation. Here, we capitalize on five cohorts totaling 632 Dutch twins with structural magnetic resonance imaging (MRI) data. Two raters placed clearly defined (semi)landmarks on each MRI scan, from which we derived 146 measures capturing the dimensions and shape of various vocal tract structures, but also aspects of the head and face. We used Genetic Covariance Structure Modeling to estimate the additive genetic, common environmental or non-additive genetic, and unique environmental components, while controlling for various confounds and for any systematic differences between the two raters. We found high heritability, h², for aspects of the skull and face, the mandible, the anteroposterior (horizontal) dimension of the vocal tract, and the position of the hyoid bone. These findings extend the existing literature, and open new perspectives for understanding the complex interplay between genetics, environment, and culture that shape our vocal tracts, and which may help explain cross-linguistic differences in phonetics and phonology.
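The GCSM caption for this article gives the co-twin correlations of the latent factors: 1.0 (MZ) and 1/2 (DZ) for A, 1.0 (MZ) and 1/4 (DZ) for D, and 1.0 for C. The sketch below shows, under those stated correlations only, the latent variance and co-twin covariances that such a model implies for given variance components. It is not the authors' fitting code: the function name, the argument names, and the example values are hypothetical, and rater effects, measurement error and covariates are deliberately left out.

```python
def expected_twin_covariances(a2, e2, c2=0.0, d2=0.0):
    """Model-implied latent variance and MZ/DZ co-twin covariances.

    a2, c2, d2, e2 are variance components for the additive genetic,
    shared environment, dominance and non-shared environment factors.
    The caption uses D or C "as appropriate"; pass only one of c2/d2
    as non-zero to mirror that. Illustrative sketch, not a model fit.
    """
    if min(a2, e2, c2, d2) < 0:
        raise ValueError("variance components must be non-negative")
    total = a2 + c2 + d2 + e2
    if total == 0:
        raise ValueError("total variance must be positive")
    cov_mz = 1.0 * a2 + 1.0 * d2 + 1.0 * c2    # MZ: A and D fully shared
    cov_dz = 0.5 * a2 + 0.25 * d2 + 1.0 * c2   # DZ: A at 1/2, D at 1/4
    return {
        "variance": total,
        "cov_mz": cov_mz,
        "cov_dz": cov_dz,
        "h2": a2 / total,  # narrow-sense heritability of the latent measure
    }


# Hypothetical variance components for illustration only.
result = expected_twin_covariances(a2=0.6, e2=0.3, c2=0.1)
print(result)
```

The gap between the MZ and DZ covariances is what lets a twin design separate additive genetic variance from shared environment; E contributes to the variance but not to either covariance.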
Epigenetic gene silencing by DNA methylation produced by transcription interference in examples of human imprinting disorders of development, inherited disorders of metabolism and cell exposure to radiation. A The imprinted domain of Prader–Willi syndrome (PWS) includes the SNRPN sense gene, the SNORD116 and SNORD115 repeated snoRNA clusters and antisense UBE3A gene. The DNA methylation on the maternal allele of the PWS imprinting control regions (ICR) silences the expression of SNRPN and the long non-coding transcript that encompasses SNORD116, SNORD115 and the antisense transcript to UBE3A (UBE3A-ATS). Conversely, the paternal UBE3A allele is silenced by the expression of UBE3A antisense transcript (-ATS) in neurons. As a consequence, deletion or mutation of the maternal copy of UBE3A causes Angelman syndrome. B Detailed architecture of the gene trio of epi-cblC. The variant in the splice acceptor site of intron 5 of PRDX1 produces the skipping of the last exon and the polyA transcription termination signal, with a subsequent aberrant antisense transcription extended through the MMACHC gene promoter. The antisense aberrant transcription produces a cis epimutation in the MMACHC promoter. The antisense genes CCDC163P and PRDX1 are in yellow and the sense MMACHC gene is in blue. C MMACHC and PRDX1 belong to a trio of reverse (R1)/forward (F2)/reverse (R3) genes where CCDC163P is R1, MMACHC is F2 and PRDX1 is R3. We suggest that epivariation in R1/F2/R3 trios of genes is a general mechanism where the epivariation in the bidirectional promoter R1/F2 could be produced by the R3 aberrant transcription triggered by gene variants and/or environmental factors. D Example of alpha-thalassemia. The 16p telomeric region from a normal chromosome (16p13.3) and from the ZF-deleted chromosome (α-ZF 16p13.3). Genes are shown as boxes, where HBA2, HBA1, HBQ1 genes above the line are transcribed in sense (towards the centromere) and LUC7L under the line in antisense. 
The α-ZF deletion removed the normal transcription termination site of LUC7L. As a consequence, the abnormal transcripts corresponded to correctly spliced mRNA driven from the LUC7L promoter. The extension of antisense transcription of LUC7L through the promoter of HBA2 generates an epimutation. E Upon exposure to radiation, PARTICL forms a DNA-lncRNA triplex upstream of the CpG island of MAT2A promoter, which represses MAT2A via methylation. The radiation-induced PARTICL interacts with the transcription-repressive complex proteins G9a and SUZ12 (subunit of PRC2). Open and closed green circles represent unmethylated and methylated CpG, respectively
Generation of the MMACHC epimutation of epi-cblC according to RNA-seq and ChIP-Seq studies of patient’s fibroblasts and data published in cellular and animal models. PRDX1 mutation produces the H3K36me3 mark, which is deposited by the histone-lysine N-methyltransferase SETD2 as the consequence of the high level of antisense transcription through the MMACHC promoter, where it binds the active form of RNA polymerase II. The histone mark triggers the de novo methylation of the CpG island of the promoter of MMACHC through the recruitment of HP1 and DNMT3B1. Whether the epimutation stability depends on the ratio between the sense transcription of MMACHC and high level of aberrant transcription is not known. DNMT3B1 and/or DNMT1 could be implicated in the maintenance of the epimutation even when PRDX1 transcription is removed, as is the case in the differentiation of spermatogonia into spermatids
Methylation profile of the CCDC163P/MMACHC bidirectional promoter of 4 dizygotic siblings from the Environmental Risk (E-Risk) Twin cohort (top) and 2 monozygotic twins from the MuTHER cohort (bottom) (Garg et al. 2020). A The ‘epi-Manhattan’ plot generated as described previously (Gueant et al. 2018) identified the epivariation in CCDC163P/MMACHC bidirectional promoter as the single major genome-wide methylome change. B Epigrams of the epivariation detected in 4 dizygotic siblings from the Environmental Risk (E-Risk) Twin cohort compared with controls. C Epigrams of the epivariation detected in 2 monozygotic twins from the MuTHER cohort. The beta-values of promoter methylation were higher in one monozygotic twin than in the other, suggesting the influence of environmental factors that differed after birth. Numbered CpG sites of the CpG island detected by the Illumina Infinium HumanMethylation450 BeadChip array are indicated on the x-axis
Top: transcription of PRDX1 in differentiated human cells classified according to categories of tissues and organs. Data were extracted from Papatheodorou et al. (2020). The lowest expression of PRDX1 was observed in germ cells and cardiomyocytes. The transcript expression values, denoted Normalized eXpression (NX), were calculated for each gene in every sample. Bottom: the transcription of PRDX1 in germ cells was the highest in spermatogonia, while it was undetectable in early spermatids and very low in late spermatids and spermatocytes. This suggests that the epimutation is maintained in male germ cells even when PRDX1 transcription decreases dramatically during their differentiation program
Transcription level (reported in Expression Atlas, (Papatheodorou et al. 2020)) of 5 trios of reverse (R1)–forward (F2)–R3 genes with epivariations in F2 genes among the 384 epivariations of OMIM genes related to rare diseases reported by Garg et al. (2020)
Epigenetic diseases can be produced by a stable alteration, called an epimutation, in DNA methylation, in which epigenome alterations are directly involved in the underlying molecular mechanisms of the disease. This review focuses on the epigenetics of two inherited metabolic diseases, epi-cblC, an inherited metabolic disorder of cobalamin (vitamin B12) metabolism, and alpha-thalassemia type α-ZF, an inherited disorder of α2-globin synthesis, with a particular interest in the role of aberrant antisense transcription of flanking genes in the generation of epimutations in CpG islands of gene promoters. In both disorders, the epimutation is triggered by an aberrant antisense transcription through the promoter, which produces an H3K36me3 histone mark involved in the recruitment of DNA methyltransferases. This aberrant transcription results from diverse genetic alterations. In alpha-thalassemia type α-ZF, a deletion removes HBA1 and HBQ1 genes and juxtaposes the antisense LUC7L gene to the HBA2 gene. In epi-cblC, the epimutation in the MMACHC promoter is produced by mutations in the antisense flanking gene PRDX1, which induce a prolonged antisense transcription through the MMACHC promoter. The presence of the epimutation in sperm, its transgenerational inheritance via the mutated PRDX1, and the high expression of PRDX1 in spermatogonia but its nearly undetectable transcription in spermatids and spermatocytes, suggest that the epimutation could be maintained during germline reprogramming and despite removal of aberrant transcription. The epivariation seen in the MMACHC promoter (0.95 × 10⁻³) is highly frequent compared to epivariations affecting other genes of the Online Catalog of Human Genes and Genetic Disorders in an epigenome-wide dataset of 23,116 individuals. This and the comparison of epigrams of two monozygotic twins suggest that the aberrant transcription could also be influenced by post-zygotic environmental exposures.
NKG2D is hypoglycosylated in cells lacking STT3B activity. a Diagram showing the glycosylation sites of NKG2D. Grey glycan indicates a suboptimal glycosylation sequon (N108NC). The transmembrane domain is depicted in black, cysteine residues by black lines. b WT and KO HEK293 cells were transfected with NKG2D-Myc-DDK followed by metabolic pulse-chase labelling. The figure was spliced between the WT and MAGT1−/−TUSC3−/− samples. c Metabolic pulse-chase labelling of different NKG2D constructs, containing just one glycosylation site. Quantified values below gel lanes represent the average number of glycans for the respective reporter (n = 3). EH indicates endoglycosidase H treatment
Magnesium supplementation does not rescue the hypoglycosylation of NKG2D. a The different HEK293 cell lines were cotransfected in regular DMEM with NKG2D-Myc-DDK and DAP10-Myc-DDK constructs and analysed for protein steady-state levels of NKG2D 48 h later by immunoblotting (b) in DMEM supplemented with 5 mM MgSO4. The asterisks depict two nonspecific bands co-migrating with NKG2D. β-Tubulin was used as a loading control. Values represent averaged values normalised to the WT control of each condition (n = 3); the standard deviation (STDEV) is shown in the lane below. c Relative abundance of the fully glycosylated NKG2D isoform in untreated and Mg²⁺-treated cells, normalised to the total NKG2D of the respective cell line. Error bars depict the standard deviation (Student’s t test: *p < 0.05; **p < 0.005; ns not significant)
Effects of in vivo Mg²⁺ supplementation for patient 1. a Median fluorescence index (MFI) of NKG2D cell surface expression on NK (n = 6) and CD8⁺ (n = 3) control (CTL) and patient (P1) cells. For P1, levels were measured before treatment and after 3 or 21 months of 3 g Mg gluconate per day supplementation. Gating strategy is described in Fig. S3. Three data points from the control NK cells and one from P1 (baseline, before Mg substitution), previously reported in Blommaert et al. 2019, were included in this figure. b EBV (Epstein–Barr virus) PCR levels (IU/mL) of P1 measured over time (diamond squares indicate magnesium supplementation)
Mutations in the X-linked gene MAGT1 cause a Congenital Disorder of Glycosylation (CDG), with two distinct clinical phenotypes: a primary immunodeficiency (XMEN disorder) versus intellectual and developmental disability. It was previously established that MAGT1 deficiency abolishes steady-state expression of the immune response protein NKG2D (encoded by KLRK1) in lymphocytes. Here, we show that the reduced steady-state levels of NKG2D are caused by hypoglycosylation of the protein and we pinpoint the exact site that is underglycosylated in MAGT1-deficient patients. Furthermore, we challenge the possibility that supplementation with magnesium restores NKG2D levels and show that the addition of this ion does not significantly improve NKG2D steady-state expression nor does it rescue the hypoglycosylation defect in CRISPR-engineered human cell lines. Moreover, magnesium supplementation of an XMEN patient did not result in restoration of NKG2D expression on the cell surface of lymphocytes. In summary, we demonstrate that in MAGT1-deficient patients, the lack of NKG2D is caused by hypoglycosylation, further elucidating the pathophysiology of XMEN/MAGT1-CDG.
SLC10A7 genomic organization and transcript variant representation. A Schematic representation of SLC10A7 genomic coding sequence. It is divided into 12 exons (white-numbered blue boxes). The length of each exon is proportional to its base content. The introns are not drawn to scale. The numbering under each exon is based on coding nucleotides (Zou et al. 2005). Patients' cDNA mutations are marked with red arrows and noted in red. B The most common SLC10A7 variants, v2 and v4. Exon 11’ is only present in v4, whose exon 12, represented with a dotted line, is non-coding. Each variant can be found by the Ensembl transcript number annotated under the sequence
Predicted topology and 3D structure of SLC10A7. A search on the PHYRE2 server ( using the human SLC10A7 isoform b protein chain (UniProtKB/Swissprot ID Q0GE19-2) indicates a 100% probability/confidence that amino acid residues 6–332 of SLC10A7 match with the apical sodium-dependent bile acid transporter (ASBT; also known as SLC10A2) homologue from Yersinia frederiksenii (ASBTYf) whose structure in a lipid environment was solved at 1.95 Å resolution (PDB entry 4N7W, Zhou et al. 2014). This maximal homology probability, together with a 21% sequence identity between SLC10A7 and ASBTYf, makes the topology model depicted in the scheme in panel A and the overall SLC10A7 fold shown in panels B and C reasonably accurate. A Putative schematic 2D topology of the human SLC10A7 isoform b protein chain showing the 10 transmembrane domains (TM1-TM10) predicted by both TMHMM v.2.0 ( and PHYRE2 servers. Both N- (N-t) and C-terminal (C-t) ends are located at the cytosolic side. The transmembrane segments belonging to the predicted functional domains of SLC10A7 described in the text and illustrated in B and C, namely, the core domain (TM3-5, TM8-10) and the panel domain (TM1-2, TM6-7), are indicated. The luminal side and cytoplasmic loops are indicated as L1–L5 and C1–C4, respectively. The one-letter-code amino acids in the black-, grey- and white-filled circles correspond to the identical, highly similar and non-conserved residues between SLC10A7 and ASBTYf, respectively, as determined by primary sequence alignments in Clustal Omega ( (Madeira et al. 2019). The red-circled amino acids are those found mutated in SLC10A7-CDG patients (see Sect. 3 of the manuscript), with red arrows indicating the amino acid changes. The yellow- and green-circled amino acids are located at positions corresponding to the residues identified in the Na⁺-binding sites 1 and 2 of ASBTYf, respectively (Zhou et al. 2014; Wang et al. 2021). 
(B and C) Model of SLC10A7 predicted on PHYRE2 using the structure of ASBTYf (PDB entry 4N7W) as a template. Rainbow-color fold representation was obtained using the UCSF ChimeraX software (Goddard et al. 2018). The numbering of TM domains and loops is the same as in panel A. The side view (B) and the top view (C) of the predicted SLC10A7 model are shown. The core and panel domains are indicated. The grayed areas roughly indicate the locations of the bile acid pocket (area A) and the two Na⁺-binding sites (area B) characterized in ASBTYf (Zhou et al. 2014; Wang et al. 2021)
Negative regulation by SLC10A7 of SOC-dependent cellular Ca²⁺ entry in response to ER Ca²⁺ depletion. This scheme, inspired from Lu and Fivaz (2016), shows the principal molecular components (ORAI1 and STIM1) and their interactions permitting the store operated Ca²⁺ entry (SOCE). When ER Ca²⁺ stores are full (left side), STIM1 is dispersed throughout the ER. In conditions of ER Ca²⁺ depletion, STIM1 oligomerizes (red arrows), recruits and interacts with ORAI1 at ER-PM contact sites, then allowing cellular Ca²⁺ entry (blue arrows) and replenishing of stores via the SERCA pumps. SLC10A7, mainly expressed in the secretory pathway, ER and/or Golgi, seems to act as a negative regulator of SOC-dependent Ca²⁺ ER replenishing by possibly interacting with one or more components of this pathway (ORAI1, STIM1 and/or SERCA) (black arrows). The solute transport activity of SLC10A7, still not characterized (green arrows), may also account for the negative regulation of Ca²⁺ entry and storage. Figure was built using Servier Medical Art graphics (
Schematic representation of the impact of SLC10A7 on the N-glycosylation process and heparan sulfate biosynthesis. SLC10A7 deficiency impacts glycosylation processes in the different cellular compartments and Golgi cisternae represented. N-glycosylation and heparan sulfate (HS) synthesis (red frames) are affected. Regarding N-glycan maturation, high-mannose structures such as Man9GlcNAc2 increase. A decrease in the sialylation degree of complex N-glycans is also found. Heparan sulfate biosynthesis is affected in SLC10A7 deficiency. A general decrease is observed but without affecting the quality of the HS structures. The substitution of the glycans is not fully represented; only the remaining structure at the entrance of each cisterna is shown. Phosphorylation is represented by the letter “P”. The symbol nomenclature for glycan structure is depicted in the bottom left-hand corner
SLC10A7, encoded by the SLC10A7 gene, is the seventh member of a human sodium/bile acid cotransporter family, known as the SLC10 family. Despite similarities with the other members of the SLC10 family, SLC10A7 does not exhibit any transport activity for the typical SLC10 substrates and is therefore still considered an orphan carrier. Recently, SLC10A7 mutations have been identified as responsible for a new Congenital Disorder of Glycosylation (CDG). CDG are a family of rare and inherited metabolic disorders, where glycosylation abnormalities lead to multisystemic defects. SLC10A7-CDG patients presented skeletal dysplasia with multiple large joint dislocations, short stature and amelogenesis imperfecta likely mediated by glycosaminoglycan (GAG) defects. Although it has been demonstrated that the transporter and substrate specificities of SLC10A7, if any, differ from those of the main members of the protein family, SLC10A7 seems to play a role in Ca²⁺ regulation and is involved in proper glycosaminoglycan biosynthesis, especially heparan sulfate, and N-glycosylation. This paper will review our current knowledge on the known and predicted structural and functional properties of this fascinating protein, and its link with the glycosylation process.
Distribution and quantification of pathogenic MMAB variants. (a) Count of different variant types in our cohort; each square corresponds to one identified allele. (b) Ranked list of variants, which occurred at least twice in the cohort, from highest to lowest frequency (inset: ranked list of splicing variants). (c) Lolliplot of all missense, truncating and insertion variants, distributed along the MMAB polypeptide chain (inset: zoom of the hotspot region at the end of exon 7). Tracks underneath the polypeptide chain indicate residues involved in cobalamin and ATP binding
Clinical and biochemical cohort characterization. (a) Scatter plot of PI activity with and without supplementation of OHCbl; dashed line indicates a PI ratio of 1.5. (b) In vivo vitamin B12 response (clinical responsiveness) compared to in vitro responsiveness (PI ratio, dashed line indicates ratio at 1.5). (c) In vitro OHCbl response (PI responsiveness) compared to ammonia levels as assessed at the time of presentation. (d) Scatter plot comparing PI ratio to age at onset; vertical dashed line indicates PI ratio at 1.5, horizontal dashed line indicates age at onset of 30 days. (e) PI activity with and without supplementation of OHCbl, grouped in early and late onset. (f) Linear regression plots comparing clinical and biochemical parameters. p values in (b), (c) and (e) are calculated by Wilcoxon signed-rank test. Linear regressions in (d) and (f) are calculated by Pearson correlation
Functional impact of variant types and specific variants. (a) Scatter plot of PI activity with and without supplementation of OHCbl, grouped according to variant types indicated by colors; triangles indicate in vitro responsiveness, dot or triangle size indicate the abundance of the specific allele category. (b) Proportional abundance of different allele combinations, grouped according to in vitro responsiveness. (c) same as (a), titles indicate which specific alleles are color coded. (d) same as (b) but with color coded abundance of specific alleles
Mapping of pathogenic variants onto the MMAB structure. Sites of frequent and novel missense variants have been depicted onto all three subunits of the MMAB homotrimer (PDB code: 6D5K). (a) Top view. (b) Side view of a trimeric interface of chains A and B containing AdoCbl. (c) Side view of a trimeric interface of chains B and C containing ATP. (d) Zoom in from (b) to show interaction of Ala127 and Val209. (e) Zoom in from (b) to show residues involved in AdoCbl binding. (f) Zoom in from (c) to show residues involved in ATP binding
Biochemical characterization of human MMAB. (a) Schematic representation of different chemical states of AdoCbl and the absorbing wavelengths; modified from (Padovani et al. 2008). (b) AdoCbl binding. Top. UV–visible absorbance spectrum was obtained by titrating a fixed concentration of AdoCbl with increasing concentrations of MMAB (blue = 0 µM, red = 90 µM). Inset. Change in absorbance at 525 nm. Each data point represents the mean of n = 3. Bottom. Scatchard plot analysis of the change in absorbance at 525 nm. Dashed lines represent linear regression fits. (c) ATP binding. Top. Fluorescence quenching in arbitrary units (a.u.) of MANT-ATP titrated with increasing concentrations of MMAB. Experiment was performed in technical triplicates. Bottom. ITC data for binding of ATP to MMAB. Upper panel depicts ATP-binding in power versus time. Lower panel shows integration of data in the upper panel. Since one binding site was indicated, the data in the lower panel were fit to a single-site binding model. (d) ATP-mediated AdoCbl release. Top. UV–visible absorbance spectrum changes following titration of ATP (blue = 0 µM, red = 1650 µM) to AdoCbl-bound (holo-)MMAB. Inset. Release of AdoCbl from holo-MMAB represented by change in absorbance at 525 nm. Each data point represents the mean of n = 3. Bottom. The same as above but performed following pre-incubation of holo-MMAB with apo-MMUT; AdoCbl binding to MMUT is indicated by a transition at 565 nm
Pathogenic variants in MMAB cause cblB-type methylmalonic aciduria, an autosomal-recessive disorder of propionate metabolism. MMAB encodes ATP:cobalamin adenosyltransferase, using ATP and cob(I)alamin to create 5’-deoxyadenosylcobalamin (AdoCbl), the cofactor of methylmalonyl-CoA mutase (MMUT). We identified bi-allelic disease-causing variants in MMAB in 97 individuals with cblB-type methylmalonic aciduria, including 33 different and 16 novel variants. Missense changes accounted for the most frequent pathogenic alleles (p.(Arg186Trp), N = 57; p.(Arg191Trp), N = 19), while c.700C > T (p.(Arg234*)) was the most frequently identified truncating variant (N = 14). In fibroblasts from 76 affected individuals, the ratio of propionate incorporation in the presence and absence of hydroxocobalamin (PI ratio) was associated with clinical cobalamin responsiveness and later disease onset. We found p.(Arg234*) to be associated with cobalamin responsiveness in vitro, and clinically with later onset; p.(Arg186Trp) and p.(Arg191Trp) showed no clear cobalamin responsiveness and early onset. Mapping these and novel variants onto the MMAB structure revealed their potential to affect ATP and AdoCbl binding. Follow-up biochemical characterization of recombinant MMAB identified its three active sites to be equivalent for ATP binding, determined by fluorescence spectroscopy (Kd = 21 µM) and isothermal titration calorimetry (Kd = 14 µM), but to function as two non-equivalent AdoCbl binding sites (Kd1 = 0.55 µM; Kd2 = 8.4 µM). Ejection of AdoCbl was activated by ATP (Ka = 24 µM), which was sensitized by the presence of MMUT (Ka = 13 µM). This study expands the landscape of pathogenic MMAB variants, provides association of in vitro and clinical responsiveness, and facilitates insight into MMAB function, enabling better disease understanding.
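To give a feel for what the reported dissociation constants imply, the sketch below computes fractional site occupancy under the simplest one-site binding isotherm. It assumes independent sites and that free ligand concentration is close to total ligand concentration; neither assumption comes from the abstract, and the function name `fraction_bound` is purely illustrative, not part of any analysis described by the authors.

```python
def fraction_bound(ligand_um: float, kd_um: float) -> float:
    """Fractional occupancy of a single site: [L] / (Kd + [L]).

    Assumes a simple one-site isotherm with free ligand ~ total ligand
    (an illustrative simplification, not the authors' fitting procedure).
    """
    if ligand_um < 0 or kd_um <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_um / (kd_um + ligand_um)


# Kd values as quoted in the abstract (micromolar).
KD_ATP_FLUORESCENCE = 21.0
KD_ADOCBL_SITE1 = 0.55
KD_ADOCBL_SITE2 = 8.4

# At a ligand concentration equal to Kd, a site is half occupied.
atp_at_kd = fraction_bound(21.0, KD_ATP_FLUORESCENCE)

# At 8.4 uM AdoCbl, the tighter site is mostly filled while the weaker
# site is only half filled, which is what "non-equivalent" means here.
site1 = fraction_bound(8.4, KD_ADOCBL_SITE1)
site2 = fraction_bound(8.4, KD_ADOCBL_SITE2)

print(f"ATP site at 21 uM ATP: {atp_at_kd:.2f}")
print(f"AdoCbl site 1 at 8.4 uM: {site1:.2f}")
print(f"AdoCbl site 2 at 8.4 uM: {site2:.2f}")
```

Under these assumptions, the roughly 15-fold gap between Kd1 and Kd2 means the two AdoCbl sites fill over quite different concentration ranges.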
Individual participant data (IPD) Flow Diagram
Ocular manifestations in 135/137 patients with CblC defect at early or late onset. ERG: electroretinogram; SD-OCT: spectral domain optical coherence tomography
Spectrum of the associations of ocular manifestations according to number of cases in MMA-HCU group. Gray boxes and empty boxes represent cases with and without manifestation, respectively
Spectrum of the associations of ocular manifestations according to number of cases in HCU patients. Gray boxes and empty boxes represent cases with and without manifestation, respectively.
Spectrum of the associations of ocular manifestations according to number of cases in MMA patients. Gray boxes and empty boxes represent cases with and without manifestation, respectively.
Inherited disorders of cobalamin (cbl) metabolism (cblA-J) result in accumulation of methylmalonic acid (MMA) and/or homocystinuria (HCU). Clinical presentation includes ophthalmological manifestations related to retina, optic nerve and posterior visual alterations, mainly reported in cblC and sporadically in other cbl inborn errors. We searched MEDLINE, EMBASE and Cochrane Library, and analyzed articles reporting ocular manifestations in cbl inborn errors. Out of 166 studies, a total of 52 studies reporting 163 cbl and 24 mut cases were included. Ocular manifestations were found in all cbl defects except for cblB and cblD-MMA; cblC was the most frequent disorder, affecting 137 (84.0%) patients. The c.271dupA was the most common pathogenic variant, accounting for 70/105 (66.7%) cases. One hundred and thirty-seven out of 154 (88.9%) patients presented with early-onset disease (0–12 months). Nystagmus and strabismus were observed in all groups with the exception of MMA patients, while maculopathy and peripheral retinal degeneration were almost exclusively found in MMA-HCU patients. Optic nerve damage ranging from mild temporal disc pallor to complete atrophy was prevalent in MMA-HCU and MMA groups. Nystagmus was frequent in early-onset patients. Retinal and macular degeneration worsened despite early treatment and stabilized systemic function in these patients. The functional prognosis remains poor with final visual acuity < 20/200 in 55.6% (25/45) of cases. In conclusion, the spectrum of eye disease in Cbl patients depends on metabolic severity and age of onset. The development of visual manifestations over time despite early metabolic treatment points to the need for specific innovative therapies.
Human CBS structure and mutations. A One-dimensional drawing showing homology domains of CBS. Green shows the region similar to other PLP-containing enzymes; blue shows CBS domains. Above the rectangle, the locations of key residues binding heme and PLP are shown. Below the rectangle, some patient-derived mutations are shown. Red mutations are “non-rescuable”, while blue mutations are rescuable. B Structure of human CBS (Δ516–525) as determined in the presence and absence of AdoMet. One subunit of the dimer is shaded as in A, while the other is shaded in teal. PLP is shown in yellow, heme in red, and AdoMet in magenta. C Patient-derived mutants mapped onto AdoMet activated structure (rotated 90° on vertical axis compared to image in A). Red mutations are non-responsive, while orange are pyridoxine-rescuable
Equilibrium model for hCBS folding in S. cerevisiae. Missense mutant proteins are acted on by two competing systems. If bound by small heat shock proteins, like Hsp26, missense mutant proteins are ubiquitinated and directed to the proteasome for degradation. If bound by Hsp70, mutant proteins are refolded into an active conformation. Proteasome inhibitors alter the equilibrium by both inhibiting the degradation arm, and stimulating the refolding arm by increasing Hsp70 levels
Inborn errors of metabolism (IEM) comprise a large class of recessive genetic diseases involving disorders of cellular metabolism that tend to be caused by missense mutations in which a single incorrect amino acid is substituted in the polypeptide chain. Cystathionine beta-synthase (CBS) deficiency is an example of an IEM that causes large elevations of blood total homocysteine levels, resulting in phenotypes in several tissues. Current treatment strategies involve dietary restriction and vitamin therapy, but these are only partially effective and do not work in all patients. Over 85% of the described mutations in CBS-deficient patients are missense mutations in which the mutant protein fails to fold into an active conformation. The ability of CBS to achieve an active conformation is affected by a variety of intracellular protein networks including the chaperone system and the ubiquitin/proteasome system, collectively referred to as the proteostasis network. Proteostasis modulators are drugs that perturb various aspects of these networks. In this article, we will review the evidence that modulation of the intracellular protein folding environment can be used as a potential therapeutic strategy to treat CBS deficiency and discuss the pros and cons of such a strategy.
The emergence of next-generation sequencing enabled a cost-effective and straightforward diagnostic approach to genetic disorders using clinical exome sequencing (CES) panels. We performed a retrospective observational study to assess the diagnostic yield of CES as a first-tier genetic test in 128 consecutive pediatric patients referred to a referral center in the North-East of France for a suspected genetic disorder, mainly an inborn error of metabolism, between January 2016 and August 2020. CES was performed using the TruSight One (4811 genes) or the TruSight One expanded (6699 genes) panel on an Illumina sequencing platform. The median age was 6.5 years (IQR 2.0–12.0), 43% of patients were male (55/128), and the median disease duration was 7 months (IQR 1–47). In the whole analysis, the CES diagnostic yield was 55% (70/128). The median test-to-report time was 5 months (IQR 4–7). According to CES indications, the CES diagnostic yields were 81% (21/26) for hyperlipidemia, 75% (6/8) for osteogenesis imperfecta, 64% (25/39) for metabolic disorders, 39% (10/26) for neurological disorders, and 28% (8/29) for the subgroup of patients with miscellaneous conditions. Our results demonstrate the usefulness of a CES-based diagnosis as a first-tier genetic test to establish a molecular diagnosis in pediatric patients with a suspected genetic disorder with a median test-to-report time of 5 months. It highlights the importance of a close interaction between the pediatrician with expertise in genetic disorders and the molecular medicine physician to optimize both CES indication and interpretation. Graphic abstract Diagnostic yield of clinical exome sequencing (CES) as a first-tier genetic test for diagnosing genetic disorders in 128 consecutive pediatric patients referred to a reference center in the North-East of France for a suspected genetic disorder, mainly an inborn error of metabolism, between January 2016 and August 2020. 
The CES diagnostic yields are reported in the whole population and patients’ subgroups (hyperlipidemia, osteogenesis imperfecta, metabolic diseases, neurological disorders, miscellaneous conditions) (Icons made by Flaticon; CC-BY-3.0).
Patients with Down syndrome (DS) are more affected by the Coronavirus Disease (COVID)-19 pandemic when compared with other populations. Therefore, the primary aim of our study was to report the case fatality rate from SARS-CoV-2 infection in Brazilian hospitalized patients with DS from 03 January 2020 to 04 April 2021. The secondary objectives were (i) to compare the features of patients with DS and positive for COVID-19 (G1) to those with DS and with a severe acute respiratory infection (SARI) from other etiological factors (G2) to tease apart the unique influence of COVID-19, and (ii) to compare the features of patients with DS and positive for COVID-19 to those without DS, but positive for COVID-19 (G3) to tease apart the unique influence of DS. We obtained the markers for demographic profile, clinical symptoms, comorbidities, and the clinical features for SARI evolution during hospitalization in the first year of the COVID-19 pandemic in Brazil from a Brazilian open-access database. The data were compared between (i) G1 [1619 (0.4%) patients] and G2 [1431 (0.4%) patients]; and between (ii) G1 and G3 [222,181 (64.8%) patients]. The case fatality rate was highest in patients with DS and COVID-19 (G1: 39.2%), followed by individuals from G2 (18.1%) and G3 (14.0%). Patients from G1, when compared to G2, were older (≥ 25 years of age), presented more clinical symptoms related to severe illness and comorbidities, needed intensive care unit (ICU) treatment and non-invasive mechanical ventilation (MV) more frequently, and had nearly three times the odds of death (OR = 2.92 [95% CI 2.44–3.50]). Patients from G1, when compared to G3, were younger (< 24 years of age), more prone to nosocomial infection, presented an increased chance for clinical symptoms related to a more severe illness, frequently needed ICU treatment and invasive and non-invasive MV, and had almost four times the odds of death (OR = 3.96 [95% CI 3.60–4.41]). 
The high case fatality rate in G1 was associated with older age (≥ 25 years of age), presence of clinical symptoms, and comorbidities, such as obesity, related to a more severe clinical condition. Unvaccinated patients with DS affected by COVID-19 had a high case fatality rate, and these patients had a different profile for comorbidities, clinical symptoms, and treatment (such as the need for ICU and MV) when compared with other study populations.
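As a rough check on how the reported odds ratios relate to the case fatality rates, the following minimal sketch computes a crude (unadjusted) odds ratio directly from two proportions. The function names are illustrative and not taken from the study, and the abstract does not state whether the published odds ratios were adjusted; the sketch only shows that the crude values implied by the rates 39.2%, 18.1% and 14.0% agree with the reported 2.92 and 3.96 after rounding. Confidence intervals are not reproduced because the underlying death counts are not given here.

```python
def odds(proportion: float) -> float:
    """Convert a proportion (e.g. a case fatality rate) into odds."""
    if not 0.0 < proportion < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return proportion / (1.0 - proportion)


def crude_odds_ratio(rate_group: float, rate_reference: float) -> float:
    """Crude odds ratio of death for one group versus a reference group."""
    return odds(rate_group) / odds(rate_reference)


# Case fatality rates quoted in the abstract.
CFR_G1 = 0.392  # DS with COVID-19
CFR_G2 = 0.181  # DS with SARI from other causes
CFR_G3 = 0.140  # no DS, with COVID-19

if __name__ == "__main__":
    print(f"G1 vs G2: {crude_odds_ratio(CFR_G1, CFR_G2):.2f}")  # 2.92
    print(f"G1 vs G3: {crude_odds_ratio(CFR_G1, CFR_G3):.2f}")  # 3.96
```

Because a fatality rate near 40% is far from rare, the odds ratio is noticeably larger than the corresponding ratio of rates (39.2/18.1 is about 2.2), which is worth keeping in mind when reading "times the odds" as "times the risk".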
From research towards clinical genomics. Timeline illustrating the discovery of patients with monogenic IBD, the development of diagnostic approaches and their translation into clinical practice. NGS next generation sequencing. *(NHS England-National Genomic Test Directory 2022)
A data driven taxonomy model of monogenic IBD genes. Based on penetrance estimates genes were clustered by associated phenotypes, response to haematopoietic stem cell transplant, single cell gene expression in the colon, clinical parameters, and biochemical pathways illustrating a model of essential cellular modules that maintain intestinal barrier function and immune homeostasis under physiological conditions and drive monogenic IBD in genetic defects. Right side: Illustration of example mechanisms of monogenic IBD affecting different cell compartments, classified into predominantly affected cell types. *Loeys-Dietz syndrome and IPO8 defects. CGD chronic granulomatous disease, IPEX immune dysregulation, polyendocrinopathy, enteropathy, X-linked syndrome, Treg regulatory T cells, WAS Wiskott-Aldrich syndrome. Modified after Bolton et al. (2021)
Over 100 genes are associated with monogenic forms of inflammatory bowel disease (IBD). These genes affect the epithelial barrier function, innate and adaptive immunity in the intestine, and immune tolerance. We provide an overview of newly discovered monogenic IBD genes and illustrate how a recently proposed taxonomy model can integrate phenotypes and shared pathways. We discuss how functional understanding of genetic disorders and clinical genomics supports personalised medicine for patients with monogenic IBD.
Genomic sequencing (GS) can reveal secondary findings (SFs), findings unrelated to the reason for testing, that can be overwhelming to both patients and providers. An effective approach for communicating all clinically significant primary and secondary GS results is needed to manage this large volume of results. The aim of this study was to develop a comprehensive approach to communicate all clinically significant primary and SF results. A genomic test report with accompanying patient and provider letters was developed in three phases: review of current clinical reporting practices, consultation with genetics and non-genetics experts, and iterative refinement through circulation to key stakeholders. The genomic test report and consultation letters present a myriad of clinically relevant GS results in distinct, tabulated sections, including primary (cancer) and secondary findings, with in-depth details of each finding generated from exome sequencing. They provide detailed variant and disease information, personal and familial risk assessments, clinical management details, and additional resources to help support providers and patients with implementing healthcare recommendations related to their GS results. The report and consultation letters represent a comprehensive approach to communicate all clinically significant SFs to patients and providers, facilitating clinical management of GS results.
Phenotypes and genetic diagnoses of 74 46,XY DSD patients. a typical/mild DSD patients in TD (22/0), cryptorchidism (16/7), micropenis (21/14), and hypospadias (29/10); b patients with variants in TD (8/22), cryptorchidism (11/23), micropenis (19/35), and hypospadias (18/39); c patients with different DSD pathogenic genes. DSD disorders of sex development, TD testicular dysgenesis
Features of the prevalent variants in 74 46,XY DSD patients. a genetic diagnoses of the patients; b ratio of novel and reported variants; c ratio of different variant types; d ratio of different categories of DSD genes; e ratio of different genotypes. P pathogenic, LP likely pathogenic, VUS variant of uncertain significance, DSD disorders of sex development
DSD families. a Family 1-LHCGR; b Family 2-SRD5A2
46,XY disorders of sex development (DSD) present with diverse phenotypes and complicated genetic causes. Precise genetic diagnosis contributes to accurate management, and targeted next-generation sequencing (NGS) and whole-exome sequencing are powerful tools for investigating DSD. However, the prevalent variants resulting in 46,XY DSD remain unclear, especially those associated with mild forms, such as isolated hypospadias, inguinal cryptorchidism, and micropenis. From 2019 to 2021, 74 patients with 46,XY DSD (48 typical and 26 mild) from the First Affiliated Hospital of Sun Yat-sen University were enrolled in our cohort study for targeted NGS or whole-exome sequencing. Our targeted 46,XY DSD panel included 108 genes involved in disorders of gonadal development and differentiation, steroid hormone synthesis and activation, persistent Müllerian duct syndrome, idiopathic hypogonadotropic hypogonadism, syndromic disorders, and others. Variants were classified as pathogenic, likely pathogenic, variant of uncertain significance, likely benign, or benign following the American College of Medical Genetics guidelines. As a result, 28 of 74 (37.8%) patients carried pathogenic and/or likely pathogenic variants and received a genetic diagnosis. The diagnostic rate among patients with mild DSD was 30.7%. We detected 44 variants in 28 DSD genes from 31 patients, including 33 novel and 11 reported variants. Heterozygous (65%) and missense (70.5%) variants were the most common. Variants associated with steroid hormone synthesis and activation were the main genetic causes of 46,XY DSD. In conclusion, 46,XY DSD manifests as a series of complicated polygenetic diseases. NGS reveals prevalent variants and improves the genetic diagnosis of 46,XY DSD, regardless of severity.
Osteoporosis is a serious public health problem that affects 200 million people worldwide. Genome-wide association studies have revealed associations between several single nucleotide polymorphisms (SNPs) near WNT/β-catenin signaling genes and bone mineral density (BMD). The activation of β-catenin by WNT ligands is required for osteoblast differentiation. SNP rs9921222 is an intronic variant of AXIN1 (a scaffold protein in the destruction complex that regulates β-catenin signaling) that correlates with BMD. However, the biological mechanism of SNP rs9921222 has never been reported. Here, we show that the genotype of SNP rs9921222 correlates with the expression of AXIN1 in human osteoblasts. RNA and genomic DNA were analyzed from primary osteoblasts from 111 different individuals. Homozygous TT at rs9921222 correlates with a higher expression of AXIN1 than homozygous CC. Regional association analysis showed that rs9921222 is in high linkage disequilibrium (LD) with SNP rs10794639. In silico transcription factor analysis predicted that rs9921222 is within a GATA4 motif and rs10794639 is adjacent to an estrogen receptor alpha (ERα) motif. Mechanistically, GATA4 and ERα bind at SNPs rs9921222 and rs10794639 as detected by ChIP-qPCR. Luciferase assays demonstrate that rs9921222 is the causal SNP that alters ERα and GATA4 binding. GATA4 promoted, whereas ERα suppressed, the expression of AXIN1 via the histone deacetylase complex member SIN3A. Functionally, the level of AXIN1 negatively correlates with the level of transcriptionally active β-catenin. In summary, we have discovered a molecular mechanism by which SNP rs9921222 regulates AXIN1 through GATA4 and ERα binding in human osteoblasts.
Global distribution of inferred CYP2D6 phenotypes. Frequencies of CYP2D6 poor metabolizer (A), intermediate metabolizer (B) and ultrarapid metabolizer (C) phenotypes were calculated based on the frequencies of loss-of-function alleles (*3, *4, *5 and *6), decreased function alleles (*9, *10, *17, *29 and *41) and increased function alleles (*1xN and *2xN) from 53 countries/populations (Tables 1 and 2; Supplementary Table 1). Countries are color-coded with the highest frequency in red, the average frequency across all populations ($\overline{f}$) in yellow, and the lowest frequency in green. In case of missing population frequencies, averaged continent frequency data from the literature (Gaedigk et al. 2017) were used to infer metabolizer phenotypes
Global distribution of inferred CYP2C19 phenotypes. Frequencies of CYP2C19 poor metabolizers (A), intermediate metabolizers (B) and ultrarapid metabolizers (C) were calculated based on frequencies of the loss-of-function alleles CYP2C19*2 and *3, as well as the increased function allele CYP2C19*17 for 52 countries/populations (Table 3; Supplementary Table 2). Countries are color-coded with the highest frequency in red, the average frequency across all populations ($\overline{f}$) in yellow, and the lowest frequency in green. In case of missing population frequencies, averaged continent frequency data from the literature (Ionova et al. 2020; Scott et al. 2013) were used to infer metabolizer phenotypes
Global distribution of clinically important human leukocyte antigen (HLA) alleles. Allele frequencies of HLA-B*57:01 (A), HLA-B*15:02 (B), HLA-A*31:01 (C), and HLA-B*58:01 (D) across up to 74 countries are shown. Countries are color-coded with the highest frequency in red, the average frequency across all populations ($\overline{f}$) in yellow, and the lowest frequency in blue. Countries for which no HLA frequency information was available are colored white. Figure modified with permission from (Zhou et al. 2021b)
Both safety and efficacy of medical treatment can vary depending on the ethnogeographic background of the patient. One of the reasons underlying this variability is differences in pharmacogenetic polymorphisms in genes involved in drug disposition, as well as in drug targets. Knowledge and appreciation of these differences is thus essential to optimize population-stratified care. Here, we provide an extensive updated analysis of population pharmacogenomics in ten genes: pharmacokinetic genes (CYP2D6, CYP2C19, DPYD, TPMT, NUDT15 and SLC22A1), a drug target (CFTR) and genes involved in drug hypersensitivity (HLA-A, HLA-B) or drug-induced acute hemolytic anemia (G6PD). Combined, polymorphisms in the analyzed genes affect the pharmacology, efficacy or safety of 141 different drugs and therapeutic regimens. The data reveal pronounced differences in the genetic landscape, complexity and variant frequencies between ethnogeographic groups. Reduced function alleles of CYP2D6, SLC22A1 and CFTR were most prevalent in individuals of European descent, whereas DPYD and TPMT deficiencies were most common in Sub-Saharan Africa. Oceanian populations showed the highest frequencies of CYP2C19 loss-of-function alleles while their inferred CYP2D6 activity was among the highest worldwide. Frequencies of HLA-B*15:02 and HLA-B*58:01 were highest across Asia, which has important implications for the risk of severe cutaneous adverse reactions upon treatment with carbamazepine and allopurinol. G6PD deficiencies were most frequent in Africa, the Middle East and Southeast Asia with pronounced differences in variant composition. These variability data provide an important resource to inform cost-effectiveness modeling and guide population-specific genotyping strategies with the goal of optimizing the implementation of precision public health.
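The figure legends for this article state that metabolizer phenotype frequencies were calculated from allele frequencies. One common way to do this, sketched below for CYP2C19, is to assume Hardy–Weinberg proportions. This is an assumption made for illustration, not a description of the authors' exact procedure. The genotype-to-phenotype mapping in the comments, the function name and the example allele frequencies are likewise illustrative and not taken from the article.

```python
def cyp2c19_phenotype_frequencies(lof: float, gain: float) -> dict[str, float]:
    """Expected phenotype frequencies under Hardy-Weinberg proportions.

    lof  : combined frequency of loss-of-function alleles (e.g. *2 + *3)
    gain : frequency of the increased function allele (*17)
    All remaining alleles are treated as normal function (assumption).
    """
    normal = 1.0 - lof - gain
    if lof < 0 or gain < 0 or normal < 0:
        raise ValueError("allele frequencies must be non-negative and sum to <= 1")
    return {
        # two loss-of-function alleles
        "poor": lof ** 2,
        # one loss-of-function allele plus any functional allele
        "intermediate": 2 * lof * (normal + gain),
        # two normal function alleles
        "normal": normal ** 2,
        # one normal and one increased function allele
        "rapid": 2 * normal * gain,
        # two increased function alleles
        "ultrarapid": gain ** 2,
    }


if __name__ == "__main__":
    # Hypothetical allele frequencies, chosen only to illustrate the arithmetic.
    for phenotype, freq in cyp2c19_phenotype_frequencies(lof=0.15, gain=0.20).items():
        print(f"{phenotype:>12}: {freq:.4f}")
```

The five categories cover every genotype, so the frequencies sum to one, which is a useful sanity check when adapting the mapping to another gene such as CYP2D6, where decreased function alleles and gene duplications add further categories.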
Over the last few years, the field of pharmacogenomics has gained considerable momentum. Advances in genomics and bioinformatics technologies have propelled pharmacogenomics towards implementation in the clinical setting. Since 2007, and especially in the last 5 years, many studies have focused on the clinical implementation of pharmacogenomics, identifying obstacles and proposing strategies and approaches for overcoming them in the real world of primary care as well as outpatient and inpatient clinics. Here, we outline the recent pharmacogenomics clinical implementation projects and provide details of the study designs, including the most predominant and innovative, as well as clinical studies worldwide that focus on outpatient and inpatient clinics, and primary care. According to these studies, pharmacogenomics holds promise for improving patients’ health in terms of efficacy and toxicity, as well as their overall quality of life, while also helping to minimize healthcare expenditure.
Population genetic statistics. Empirical cumulative distribution function of minor-allele frequency (A), nucleotide diversity (π) (B), derived allele frequency (C), population differentiation (FST) values (D) and Hardy–Weinberg P values (E) at modifier variations, pathogenic variations and neutral controls. Control 1: variations at fourfold degenerate sites. Control 2: variations within short introns. Kolmogorov–Smirnov tests were used to test the significance of differences. P values for all the pair-wise comparisons with modifier variations are less than 10⁻²². For the pair-wise comparisons between pathogenic variations and controls, except for the FST values, which show no statistically significant difference, P values for the rest of the statistics are all less than 10⁻⁶. The analyses were based on the 1000 Genomes Project dataset
Epigenetic features. Distribution of variations as a function of maximum ENCODE H3K27 acetylation level (A), ENCODE H3K4 methylation level (B) and maximum ENCODE H3K4 trimethylation level (C). Kolmogorov–Smirnov tests were used to test the significance of the differences. Pm–p stands for the P value of the comparison between modifier loci and pathogenic loci, Pm-c1/2 for the comparison between modifier loci and controls and Pp-c1/2 for the comparison between pathogenic loci and controls. D Percentage of genomic positions displaying the chromatin states in at least one cell type. Variations at fourfold degenerate sites (Control 1) and variations within short introns (Control 2) were used as neutral controls. The observed excess or deficit for the chromatin states was evaluated by P values based on Pearson’s chi-squared test. Significant differences are indicated as *P value < 0.05, **P value < 10⁻⁵, ***P value < 10⁻¹⁰. Of the symbols labeled vertically, the first indicates the significant differences in comparison with control 1 and the second indicates the significant differences in comparison with control 2. Significant differences between modifier loci and pathogenic loci are labeled horizontally in red
Evolutionary constraints. Empirical cumulative probability distributions of evolutionary conservation measured by PhastCons scores (A) and phyloP scores (B) across vertebrates, across mammals and across primates, for modifier variations and for pathogenic variations. Kolmogorov–Smirnov tests were used to test the significance of the differences between pathogenic variations and modifier variations; P values of all the pair-wise comparisons are less than 10⁻¹⁵
Damaging scores predicted by computational tools. Wilcoxon signed rank tests were used to test the significance of the pair-wise differences; significant differences are indicated as *P value < 0.05, **P value < 10⁻⁵, ***P value < 10⁻¹⁰, and ns stands for not significant. M stands for modifier variations, P for pathogenic variations, C1 for neutral control variations at fourfold degenerate sites and C2 for neutral control variations within short introns
Functional consequence. Effects that alleles of the variants may have on transcripts. Percentage of modifier variations and pathogenic variations in genic or non-genic regions (A) and in consequence types (B). The observed excess or deficit for the different classes of genomic location is evaluated by P values based on Pearson’s chi-squared test; significant differences are indicated as *P value < 0.05, **P value < 10⁻⁵, ***P value < 10⁻¹⁰. C–E Transcriptional regulatory property. Distribution of eQTL variations functioning as either modifier or not corresponding to the number of tissues (C), to the number of target genes (D) and to the number of affected diseases (E). Kolmogorov–Smirnov tests were used to test the differences between modifiers and non-modifiers; P values for all the comparisons are less than 10⁻²²
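The legends above rely on two-sample Kolmogorov–Smirnov tests to compare empirical cumulative distributions. As a minimal sketch of what that statistic measures, the code below computes the largest vertical gap between two empirical distribution functions. The function name and the example values are hypothetical, and the sketch stops at the statistic itself: it does not compute the P values reported in the legends.

```python
from bisect import bisect_right


def ks_statistic(sample_a: list[float], sample_b: list[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic D.

    D is the maximum absolute difference between the empirical cumulative
    distribution functions (ECDFs) of the two samples.
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    # The ECDFs only change at observed values, so it suffices to check those.
    points = sorted(set(a) | set(b))
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in points
    )


if __name__ == "__main__":
    # Hypothetical minor-allele frequencies for two classes of variants.
    class_one = [0.01, 0.02, 0.05, 0.10, 0.20]
    class_two = [0.15, 0.25, 0.30, 0.40, 0.45]
    print(f"D = {ks_statistic(class_one, class_two):.2f}")
```

A value of D near 0 means the two distributions are nearly indistinguishable, while a value near 1 means they barely overlap; the P value additionally depends on the two sample sizes.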
Epistatic interactions complicate the identification of variants involved in phenotypic effect. In-depth knowledge in modifiers and in pathogenic variants would benefit the mechanistic studies on the genetic basis of complex traits. We systematically compared the modifier variants which have evidence of modifier effect with the pathogenic variants from multiple angles. Our study found that genomic loci of modifier variations differ from pathogenic loci in many aspects, such as population genetics statistics, epigenetic features, evolutionary characteristics and functional properties of the variations. Genes containing modifier variation(s) exhibit higher probability of being haploinsufficient and higher probability of recessive disease causation, and they are relatively more important in network communication. Furthermore, we reinforced that co-expression analysis is an effective methodology to predict functional associations between modifier genes and their potential target genes. In many aspects, we detected statistically significant differences between modifier variants/genes and pathogenic variants/genes, and investigated relationships between modifiers and their potential targets. Our results offer some actionable insights that may provide appropriate guidelines to clinical genetics and researchers to elucidate the molecular mechanism underlying the human phenotypic variation.
The distinct predictions of the transcriptional scanning (TS) hypothesis and the transcription-associated mutagenesis (TAM) model. A Schematic of TCR and TCD effects during transcription. TCR machinery detects and repairs existing DNA damages on the transcribed strand, thus reducing gene mutation rates; TCD results in increased mutagenesis when transcription machinery unwinds the DNA strands, making them relatively more vulnerable to cellular mutagens. B The TS hypothesis predicts a compound effect of TCR and TCD on mutation rates, modulated by expression level, while the TAM model predicts a monotonic positive correlation between gene expression level and mutation rates. C Schematic of asymmetric mutation rates between strands, and predictions from the TS and TAM models. The TS hypothesis predicts a positive correlation between gene expression levels and coding strand mutation rates, together with a negative correlation on the template strand. The TAM model predicts a positive correlation between gene expression levels and mutation rates on both strands
Excluding zero-variant genes in sparse mutation datasets reduces biases in mutation rate inference. A The number of variants in the SNP and DNM datasets. B A representative scatter view of mutation rates inferred from the randomly down-sampled SNP dataset and the original SNP dataset. Spearman’s rank correlation coefficients are indicated with or without zero-variant genes. C Spearman’s rank correlation coefficients between mutation rates inferred from the down-sampled SNP dataset and the original dataset. The random sampling was repeated 100 times, and the Spearman’s correlation coefficients were calculated in pairs, including or excluding zero-variant genes. D, E Down-sampled SNP dataset-inferred mutation rates across gene expression level categories including (D) or excluding (E) the zero-variant genes. Significance in D, E is computed by the Mann–Whitney test with Bonferroni correction for multiple tests. *p < 0.01; **p < 10⁻¹⁰; n.s. not significant
Gene expression level modulates mutation rates by the compound effects of TCR and TCD. A, B Spearman’s correlation coefficients between gene expression level during spermatogenesis and mutation rates inferred from SNP dataset (A) or DNM dataset (B). C Distribution of gene expression levels in the human male germ cells. The plot is adapted from Xia et al. (2020). D Schematic of the compound effects of TCR and TCD. TCR dominates in the ~70% of low-to-moderately expressed genes, while TCD gradually overwhelms TCR in the top ~30% highly expressed genes. The schematic is adapted from Xia et al. (2020). E, F Spearman’s correlation coefficients between gene expression level and mutation rates inferred from SNP dataset (E) or DNM dataset (F). Genes in E, F are divided into a low-to-moderately expressed group and a highly expressed group, and their correlation coefficients between mutation rates and expression levels are plotted, respectively. *p < 0.01; **p < 10⁻⁵; ***p < 10⁻¹⁰. GC: GC content, RT: replication timing
Gene expression level modulates mutation rates of coding strand and template strand differently. A Spearman’s correlation coefficients between gene expression level and mutation rates inferred from coding strand (left) or template strand (right). B, C Spearman’s correlation coefficients between gene expression level and coding strand mutation rate (B) or template strand mutation rate (C). Genes in B and C are divided into a low-to-moderately expressed group and a highly expressed group, and their correlation coefficients between mutation rates and expression levels are plotted, respectively. D–F Same as in A–C, but using de novo mutations. *p < 0.01; **p < 10⁻⁵; ***p < 10⁻¹⁰. GC: GC content, RT: replication timing
Of all mammalian organs, the testis has long been observed to have the most diverse gene expression profile. To account for this widespread gene expression, we have proposed a mechanism termed ‘transcriptional scanning’, which reduces germline mutation rates through transcription-coupled repair (TCR). Our hypothesis contrasts with an earlier observation that mutation rates are overall positively correlated with gene expression levels in yeast, implying that transcription is mutagenic due to effects dominated by transcription-coupled damage (TCD). Here we report evidence that the compound effects of both TCR and TCD during spermatogenesis modulate human germline mutation rates, with TCR dominating in most genes, thus supporting the transcriptional scanning hypothesis. Our analyses address potentially confounding factors, distinguish the differential mutagenic effects acting on the highly expressed genes and the low-to-moderately expressed genes, and resolve concerns relating to the validation of the results using a de novo mutation dataset. We also discuss the theoretical possibility of the transcriptional scanning hypothesis from an evolutionary perspective. Together, these analyses support a model by which the coupling of transcription-coupled repair and damage establishes the pattern of germline mutation rates and provide an evolutionary explanation for widespread gene expression during spermatogenesis.
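The analysis described here correlates expression level with mutation rate separately in low-to-moderately expressed genes and in highly expressed genes. The sketch below illustrates that idea with a pure-Python Spearman correlation computed on each side of an expression cutoff. The 30% top fraction follows the approximate split quoted in the figure legend, while the function names and the ten data points are hypothetical and constructed only to show opposite signs in the two groups.

```python
from statistics import mean


def average_ranks(values: list[float]) -> list[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def correlation_by_group(expression, rates, top_fraction=0.30):
    """Spearman's rho below and above the expression cutoff."""
    order = sorted(range(len(expression)), key=lambda i: expression[i])
    cut = round(len(order) * (1 - top_fraction))
    low, high = order[:cut], order[cut:]
    return (
        spearman([expression[i] for i in low], [rates[i] for i in low]),
        spearman([expression[i] for i in high], [rates[i] for i in high]),
    )


if __name__ == "__main__":
    # Hypothetical genes: rates fall with expression, then rise in the top 30%.
    expression = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    rates = [9, 8, 7, 6, 5, 4, 3, 4, 5, 6]
    rho_low, rho_high = correlation_by_group(expression, rates)
    print(f"low-to-moderate: {rho_low:+.2f}, high: {rho_high:+.2f}")
```

In this toy input the correlation is negative in the lower group and positive in the upper group, which is the pattern a compound TCR and TCD effect would produce. Real analyses would also need to control for covariates such as GC content and replication timing, which this sketch omits.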
Variant interpretation remains a central challenge for precision medicine. Missense variants are particularly difficult to understand as they change only a single amino acid in a protein sequence yet can have large and varied effects on protein activity. Numerous tools have been developed to identify missense variants with putative disease consequences from protein sequence and structure. However, biological function arises through higher order interactions among proteins and molecules within cells. We therefore sought to capture information about the potential of missense mutations to perturb protein interaction networks by integrating protein structure and interaction data. We developed 16 network-based annotations for missense mutations that provide orthogonal information to features classically used to prioritize variants. We then evaluated them in the context of a proven machine-learning framework for variant effect prediction across multiple benchmark datasets to demonstrate their potential to improve variant classification. Interestingly, network features resulted in larger performance gains for classifying somatic mutations than for germline variants, possibly due to different constraints on what mutations are tolerated at the cellular versus organismal level. Our results suggest that modeling variant potential to perturb context-specific interactome networks is a fruitful strategy to advance in silico variant effect prediction.
Pharmaceutical companies have increasingly utilized genomic data for the selection of drug targets and the development of precision medicine approaches. Most major pharmaceutical companies routinely collect DNA from clinical trial participants and conduct pharmacogenomic (PGx) studies. However, the implementation of PGx studies during clinical development presents a number of challenges. These challenges include adapting to a constantly changing global regulatory environment, challenges in study design and clinical implementation, and the increasing concerns over patient privacy. Advances in the field of genomics are also providing new opportunities for pharmaceutical companies, including the availability of large genomic databases linked to patient health information, the growing use of polygenic risk scores, and the direct sequencing of clinical trial participants. The Industry Pharmacogenomics Working Group (I-PWG) is an association of pharmaceutical companies actively working in the field of pharmacogenomics. This I-PWG perspective will provide an overview of the steps pharmaceutical companies are taking to address each of these challenges, and the approaches being taken to capitalize on emerging scientific opportunities.
A timeline of important discoveries and events in the field of gene therapy. Key events in the history of gene therapy are presented between 1900 and 2020. ADA-SCID adenosine deaminase-severe combined immuno-deficiency, LCA Leber congenital amaurosis, LPLD lipoprotein lipase deficiency, SPV Shope papilloma virus
Overview of CRISPR/Cas9-mediated gene editing. The Streptococcus pyogenes Cas9 enzyme (in grey), along with a guide RNA (gRNA), pairs with complementary DNA in the genome to facilitate a site-directed DNA double-strand break (DSB). The DSB can be repaired by error-prone non-homologous end joining (NHEJ) to create insertions/deletions that disrupt gene function. Homologous recombination (HR) can also be used to repair the DSB when a donor DNA carries homologous DNA sequence flanking a gene to be inserted (gene replacement) or a wild-type gene sequence (gene editing). In this way, genes can be replaced or edited using CRISPR/Cas9 to correct genetic diseases
Criteria and considerations to guide the decision-making process of whether to pursue gene therapy. A flow diagram is presented encompassing the many ethical, legal, technical and medical considerations and criteria affecting the decision to pursue gene therapies, including gene editing for a given disease. Decisions promoting the use of gene therapy are in green and those that would not promote the use of gene therapy as a treatment are in magenta. Horizontal arrows between the ethical/legal issues and the technical/medical considerations highlight that they will likely inform one another and the process should be dynamic between the two sections
Overview of gene therapy approaches. Cells from the patient can either be modified by genome engineering ex vivo and reintroduced back in the same patient (left), or a patient can be treated by direct delivery of therapeutic agents in vivo (right) using liposomes, gold nanoparticles, or adeno-associated viruses (AAV). In vivo gene therapy carries greater risk to the patient from unintended on- and off-target effects as well as pathogenic immune responses. In contrast, ex vivo approaches allow gene editing side effects to be addressed at the cellular level prior to introduction into the patient. In addition, ex vivo gene therapy can prevent rejection or pathogenic immune responses to CRISPR/Cas9 or viruses used in gene delivery, and can employ agents such as modified RNAs that may be less effective in vivo (left)
Gene therapies for genetic diseases have been sought for decades, and the relatively recent development of the CRISPR/Cas9 gene-editing system has encouraged a new wave of interest in the field. There have nonetheless been significant setbacks to gene therapy, including unintended biological consequences, ethical scandals, and death. The major focus of research has been on technological problems such as delivery, potential immune responses, and both on and off-target effects in an effort to avoid negative clinical outcomes. While the field has concentrated on how we can better achieve gene therapies and gene editing techniques, there has been less focus on when and why we should use such technology. Here we combine discussion of both the technical and ethical barriers to the widespread clinical application of gene therapy and gene editing, providing a resource for gene therapy experts and novices alike. We discuss ethical problems and solutions, using cystic fibrosis and beta-thalassemia as case studies where gene therapy might be suitable, and provide examples of situations where human germline gene editing may be ethically permissible. Using such examples, we propose criteria to guide researchers and clinicians in deciding whether or not to pursue gene therapy as a treatment. Finally, we summarize how current progress in the field adheres to principles of biomedical ethics and highlight how this approach might fall short of ethical rigour using examples in the bioethics literature. Ultimately by addressing both the technical and ethical aspects of gene therapy and editing, new frameworks can be developed for the fair application of these potentially life-saving treatments.
Total percentage per million of PGx scholars (full color fill) and citations of PGx scholars (pattern fill) (a) and percentage per million of PGx scholars by country (b). Data were compiled for “pharmacogenetics” or “pharmacogenomics” researchers on Google Scholar (Sup. Table 1; accessed on July 6, 2020). Countries are classified based on the United Nations Development Program (UNDP) Human Development Index (HDI), and numbers are corrected for population size (in millions)
Percentage per million of research on the PGx of oral anticoagulants by country (a) and year of publication (b). Countries are classified based on the United Nations Development Program (UNDP) Human Development Index (HDI), and numbers are corrected for population size (in millions). Blue: Very high HDI; Black: High HDI; Gray: Medium HDI; Pink: Low HDI. Data were compiled by a PubMed search of original human subjects’ research from January 1, 2000 till June 30, 2020 on the PGx of oral anticoagulants (Sup. Table 2)
While significant advances have been made in pharmacogenetics (PGx), especially in countries with developed economies, this field remains in its infancy in developing countries and low-resource environments. Herein, we provide insights into the gap and challenges of PGx at the research and clinical fronts, and some perspectives to bridge the gap and move forward with PGx in the developing world. We show that developing countries fall behind in PGx research, as evidenced by a lower number of researchers, citations, and research output. In addition, the implementation of PGx in the clinic has been progressing at a much slower pace than research, and more so in developing countries. To bridge this gap, we recommend fostering regional and multinational collaborations to secure funds for high-throughput genotyping and local capacity building while preserving individual countries' identity, implementing next-generation sequencing, and organizing specialized training and exchange programs to move PGx research and clinical applications forward in developing countries.
A Overview of the total publication output of AS research over the years, from 1976 until August 2021. B Schematic representation of most important milestones in AS research, based on the most cited AS publications
A Distribution of AS publications in the most prominent journals (based on publication output) in the field. B Distribution of AS publications over all acknowledged research funders. C Distribution of publications over the most acknowledged funders in AS research, excluding NHS
A Network visualization of the most frequently used keywords in AS publications through time, colored based on clustering (colors randomly assigned by the visualization software VOSviewer). B Network visualization of the most frequently used keywords in AS publications through time, colored based on the time of publication. C Development of documents in the top five ISI subject categories per 5-year period
A Network of researchers active in the AS field, colored based on the average publication year of each researcher. B Network visualization of the countries involved in AS research and inter-country collaborations, colored based on the average publication year per country
Network visualization of the most frequently used keywords in AS publications related to treatment or therapy, colored based on clustering (colors randomly assigned by the visualization software VOSviewer)
Angelman syndrome is a rare neurodevelopmental disorder caused by mutations affecting the chromosomal 15q11-13 region, either by contiguous gene deletions, imprinting defects, uniparental disomy, or mutations in the UBE3A gene itself. Phenotypic abnormalities are driven primarily, but not exclusively (especially in 15q11-13 deletion cases), by loss of expression of the maternally inherited UBE3A gene. The disorder was first described in 1965 by the English pediatrician Harry Angelman. Since that first description of three children with Angelman syndrome, there has been extensive research into the genetic, molecular and phenotypic aspects of the disorder. In the last decade, this has resulted in over 100 publications per year. Collectively, this research has led the field to a pivotal point in which restoring UBE3A function by genetic therapies is currently explored in several clinical trials. In this study, we employed a bibliometric approach to review and visualize the development of Angelman syndrome research over the last 50 years. We look into different parameters shaping the progress of the Angelman syndrome research field, including source of funding, publishing journals and international collaborations between research groups. Using a network approach, we map the focus of the research field and how that shifted over time. This overview helps understand the shift of research focus in the field and can provide a comprehensive handbook of Angelman syndrome research development.
Cellular responses to acetaldehyde and formaldehyde. Acetaldehyde and formaldehyde, which are normally detoxified by ADH5 and ALDH2, can cause a variety of DNA adducts activating multiple DNA repair pathways. Acetaldehyde and formaldehyde can induce ICLs which are repaired by the FA pathway
ICL repair by the FA pathway. A CMG complex undergoes polyubiquitination when the two converging forks arrive at an ICL site. B, C Polyubiquitinated CMG is removed from the ICL site in a manner dependent on the p97 segregase. D ATR promotes ICL recognition by the Anchor complex. E The Anchor complex mediates the assembly of the Core complex at the ICL site. F The Core complex ubiquitinates the ID2 complex. G The ubiquitinated ID2 complex promotes ICL unhooking by the XPF (Q)-ERCC1 endonuclease supported by a scaffold protein SLX4 (P). H TLS polymerases perform bypass synthesis over the unhooked ICL. I DSB processing by MRN and CtIP to promote strand invasion. J BRCA1 (S), BRCA2 (D1), PALB2 (N) and FANCJ work together to promote Rad51-mediated strand invasion to repair DNA
Regulation of the FA Core complex. A FANCM is phosphorylated by ATR, which is required for Core complex assembly on chromatin. B FANCM undergoes Plk1-mediated hyperphosphorylation in G2/M and mitosis, leading to the removal of the Core complex from chromatin. C Multiple phosphorylation events on Core complex subunits are shown. Green arrows indicate activation of the FA pathway, while red arrows show negative regulation of the pathway
Regulation of FANCD2 and FANCI. FANCD2 undergoes CK2-mediated phosphorylation (882–898) that prevents FANCD2 from binding DNA in the absence of DNA damage. Dephosphorylation of the 882–898 cluster increases FANCD2’s affinity for DNA, leading the ID2 complex to become “facultatively active”. ATR phosphorylates FANCI and FANCD2 at S556 and T691/S717, respectively, resulting in monoubiquitination of FANCD2 and FANCI that are locked onto DNA. This allows the ID2 complex to become “active” to coordinate downstream DNA repair processes. ID2 monoubiquitination is reversed by the USP1–UAF1 deubiquitinating enzyme, and the ID2 complex is released from the DNA duplex
Fanconi anemia is a genetic disorder that is characterized by bone marrow failure, as well as a predisposition to malignancies including leukemia and squamous cell carcinoma (SCC). At least 22 genes are associated with Fanconi anemia, constituting the Fanconi anemia DNA repair pathway. This pathway coordinates multiple processes and proteins to facilitate the repair of DNA adducts including interstrand crosslinks (ICLs) that are generated by environmental carcinogens, chemotherapeutic crosslinkers, and metabolic products of alcohol. ICLs can interfere with DNA transactions, including replication and transcription. If not properly removed and repaired, ICLs cause DNA breaks and lead to genomic instability, a hallmark of cancer. In this review, we will discuss the genetic and phenotypic characteristics of Fanconi anemia, the epidemiology of the disease, and associated cancer risk. The sources of ICLs and the role of ICL-inducing chemotherapeutic agents will also be discussed. Finally, we will review the detailed mechanisms of ICL repair via the Fanconi anemia DNA repair pathway, highlighting critical regulatory processes. Together, the information in this review will underscore important contributions to Fanconi anemia research in the past two decades.
Non-obstructive azoospermia (NOA) and premature ovarian insufficiency (POI) represent the most serious forms of human infertility caused by gametogenic failure. Although whole-exome sequencing (WES) has uncovered multiple monogenic causes of human infertility, our knowledge of the genetic basis of human gametogenesis defects remains at a rudimentary stage. Coiled-coil-domain-containing protein 155 (CCDC155) encodes a core component of the linker of the nucleoskeleton and cytoskeleton complex that is essential for modulating telomere-led chromosome movements during the meiotic prophase of mice. Additionally, Ccdc155 deficiency in mice causes infertility in both sexes with meiotic arrest. In this study, we applied WES to identify the pathogenic genes for 15 NOA and POI patients whose parents were consanguineous and identified a novel homozygous missense mutation in CCDC155 [c.590T>C (p.Leu197Pro)] in a pair of familial NOA and POI patients whose parents were first cousins. The affected spermatocytes were unable to complete meiotic division coupled with unresolved repair of the DNA double-strand break. This rare missense mutation with lesions in the conserved CC domain of CCDC155 blocked nuclear envelope (NE) distribution and subsequently prevented NE-specific enrichment of Sad1- and UNC84-domain-containing 1 either ex vivo or in vitro, eventually leading to disruptive NE anchoring of chromosome-induced meiotic arrest in both sexes. This study presents the first evidence of the necessity of the SUN1–CCDC155 complex during human meiosis and provides insight into the CCDC155 CC domain, thereby expanding the genetic spectrum of human NOA and POI and promoting adequate genetic counselling and appropriate fertility guidance for these patients.
Pedigree of the family with autosomal recessive primary microcephaly in generation III and IV. The PLK4 deletion (star symbol) was transmitted to 14 of the 16 offspring of generation I, both through spermatogenesis (II.2) and oogenesis (II.3, III.6). The complete PLK4 constitution including the novel variant is depicted in Fig. 3
a Deleted region of chromosome 4 identified by array-CGH. b Novel missense variant (c.881T>G) in PLK4 near the Degron motif resulting in the replacement of isoleucine by serine (p.Ile294Ser). The known (P) and putative (℗) autophosphorylation sites, which regulate the degradation of the protein, are indicated. Note that the mutation creates a potential new autophosphorylation site. c qPCR analysis of exons 4, 5.1, 5.2 and 6 of PLK4 (brown–purple) and as a control exon 2 of the Cystic fibrosis transmembrane conductance regulator (CFTR) gene (blue)
Reconstruction of haplotypes based on manual analysis of SNPs and microsatellites (Suppl. Figure 1) flanking the PLK4 gene and the region of gene conversion. The color changes point to the sites of crossovers. green—grandmother (II.1) haplotype G1 with the mutated PLK4 allele c.881G, blue—grandmother haplotype G2 with the mutated PLK4 allele, red—grandfather (II.2) haplotype with the deletion, yellow—grandfather haplotype with the wild-type PLK4 allele c.881 T, purple—grandmother (II.3) haplotype with the wild-type PLK4 allele, light and dark gray—grandfather (II.4) haplotypes with the wild-type PLK4 alleles
The evolutionarily conserved Polo-like kinase 4 (PLK4) is essential for centriole duplication, spindle assembly, and de novo centriole formation. In humans, homozygous mutations in PLK4 lead to primary microcephaly, and altered PLK4 expression is associated with aneuploidy in human embryos. Here, we report on a consanguineous four-generation family with 8 affected individuals compound heterozygous for a novel missense variant, c.881T>G, and a deletion of the PLK4 gene. The clinical phenotype of the adult patients is mild compared to individuals with previously described PLK4 mutations. One individual was homozygous for the variant c.881G and phenotypically unaffected. The deletion was inherited by 14 of 16 offspring and thus exhibits transmission ratio distortion (TRD). Moreover, based on the already published families with PLK4 mutations, it could be shown that due to the preferential transmission of the mutant alleles, the number of affected offspring is significantly increased. It is assumed that reduced expression of PLK4 decreases the intrinsically high error rate of the first cell divisions after fertilization, increases the number of viable embryos and thus leads to preferential transmission of the deleted/mutated alleles.
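As a rough illustration of why 14 transmissions among 16 offspring points to transmission ratio distortion, the count can be compared with the Mendelian expectation of 50% transmission from a heterozygous carrier parent. The sketch below uses an exact two-sided binomial test; the choice of test, the function name `binomial_two_sided_p`, and the pooling of all 16 offspring into a single sample are assumptions made here for illustration and are not taken from the article.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of every outcome that is no more likely
    than the observed count k under the null transmission rate p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance guards against floating-point ties.
    return sum(q for q in pmf if q <= observed * (1 + 1e-9))


# Counts reported in the abstract: deletion inherited by 14 of 16 offspring.
transmitted, offspring = 14, 16
p_value = binomial_two_sided_p(transmitted, offspring)
print(f"{transmitted}/{offspring} transmissions, exact two-sided p = {p_value:.4f}")
```

Under these assumptions the p-value is 274/65536, about 0.0042, so a fair 1:1 segregation would rarely produce a split this uneven.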
Flowchart of the study. DNA was extracted from blood leukocytes and skin lesion biopsies of patients with pigmentary mosaicism and analyzed by next-generation sequencing. Rare variants only in the skin (somatic variants) or in both the skin and blood (germline or somatic variants) were investigated. Candidate germline variants were confirmed by Sanger sequencing of trio-samples (from patients and their parents). Exome sequencing of 11 individuals with pigmentary mosaicism revealed four somatic MTOR variants, a somatic RHOA variant; de novo variants in USP9X (n = 2), TFE3 (n = 1), and KCNQ5 (n = 1); biallelic germline variants in GTF3C5; and one inherited germline variant in PHF6
Skin lesion photographs and brain magnetic resonance imaging (MRI) studies of patients 1–4. a–f Patient 1 with a somatic MTOR variant (c.4448G>A, p.Cys1483Tyr). a, c Hypopigmentation of the back and buttocks. b Axial view of T1-weighted MRI images of the brain. The brain is almost symmetrical, but the right hemisphere is slightly larger than the left. d Axial image of brain positron emission tomography (PET). The signals are significantly decreased in the right hemisphere (arrowhead). e, f Fontana–Masson stained skin biopsy of a depigmented lesion. Arrow indicates clusters of melanin-laden keratinocytes (brown areas), which are sparse in the hypopigmented skin compared to a normally pigmented skin lesion, although the number of melanocytes is not reduced (f). g Skin lesion of Patient 2. Hypopigmentation of the skin is seen only in the right side of the body. h, i T2-weighted coronal and axial images of brain MRI of Patient 2 with a somatic MTOR variant (c.5930C>T, p.Thr1977Ile). The right hemisphere is slightly larger than the left. The arrowheads show focal cortical dysplasia in bilateral occipital lobes, but predominantly in the right hemisphere. j Skin lesion of Patient 3 with a somatic MTOR variant (c.6644C>T, p.Ser2215Phe). k T2-weighted MRI brain finding of Patient 3 at day 21 after birth prior to surgery. Right hemimegalencephaly is observed in many slices. l, m Skin lesions and T2-weighted brain MRI of Patient 4 with a somatic MTOR variant (c.7292T>C, p.Leu2431Pro) found in both skin lesion and blood leukocytes, respectively. Brain MRI of Patient 4 is normal and almost symmetrical
Skin lesion photographs and brain magnetic resonance imaging (MRI) studies of patients 5–11. a Hypopigmentation in the left leg of Patient 5 with a somatic RHOA variant. b, c Sagittal and axial views of T2-weighted brain MRI, respectively. No laterality is observed. Hyperintensity with mild dilatation of ventricles and small cysts are observed. d Skin lesion of Patient 6 with a germline TFE3 variant. e, f Skin lesions of Patients 7 and 8, respectively, both with a germline USP9X variant. g, h Skin lesion and sagittal view of T2-weighted brain MRI of Patient 9 with a germline PHF6 variant. The patient’s skin showed hyperpigmentation rather than hypopigmentation. Atrophy of the bridge capsules and superior cerebellar peduncles is observed. i Skin lesion of Patient 10 with a germline KCNQ5 variant. j, k Skin lesions and sagittal view of T2-weighted brain MRI of Patient 11. Brain MRI showed polymicrogyria, loss of white matter volume, and mildly enlarged lateral ventricles
Pigmentary mosaicism of the Ito type, also known as hypomelanosis of Ito, is a neurocutaneous syndrome considered to be predominantly caused by somatic chromosomal mosaicism. However, a few monogenic causes of pigmentary mosaicism have been recently reported. Eleven unrelated individuals with pigmentary mosaicism (mostly hypopigmented skin) were recruited for this study. Skin punch biopsies of the probands and trio-based blood samples (from probands and both biological parents) were collected, and genomic DNA was extracted and analyzed by exome sequencing. In all patients, plausible monogenic causes were detected, with somatic and germline variants identified in five and six patients, respectively. Among the somatic variants, four patients (36%) had an MTOR variant and another had an RHOA variant. De novo germline variants in USP9X, TFE3, and KCNQ5 were detected in two, one, and one patients, respectively. A maternally inherited PHF6 variant was detected in one patient with hyperpigmented skin. Compound heterozygous GTF3C5 variants were highlighted as strong candidates in the remaining patient. Exome sequencing using patients’ blood and skin samples is highly recommended as the first choice for detecting causative genetic variants of pigmentary mosaicism.
Advances in human genetics raise many social and ethical issues. The application of genomic technologies to healthcare has raised many questions at the level of the individual and the family, about conflicts of interest among professionals, and about the limitations of genomic testing. In this paper, we attend to broader questions of social justice, such as how the implementation of genomics within healthcare could exacerbate pre-existing inequities or the discrimination against social groups. By anticipating these potential problems, we hope to minimise their impact. We group the issues to address into six categories: (i) access to healthcare in general, not specific to genetics. This ranges from healthcare insurance to personal behaviours. (ii) data management and societal discrimination against groups on the basis of genetics. (iii) epigenetics research recognises how early life exposure to stress, including malnutrition and social deprivation, can lead to ill health in adult life and further social disadvantage. (iv) psychiatric genomics and the genetics of IQ may address important questions of therapeutics but could also be used to disadvantage specific social or ethnic groups. (v) complex diseases are influenced by many factors, including genetic polymorphisms of individually small effect. A focus on these polygenic influences distracts from environmental factors that are more open to effective interventions. (vi) population genomic screening aims to support couples making decisions about reproduction. However, this remains a highly contentious area. We need to maintain a careful balance of the competing social and ethical tensions as the technology continues to develop.
Rapid whole genome sequencing (WGS) and whole exome sequencing (WES), sometimes referred to as “next generation sequencing” (NGS), are now recommended by some experts as a first-line diagnostic test to diagnose infants with suspected monogenic conditions. Estimates of how often NGS leads to diagnoses or changes in management vary widely depending on the population being studied and the indications for testing. Finding a genetic variant that is classified as pathogenic may not necessarily equate with being able to predict the resultant phenotype or to give a reliable prognosis. Molecular diagnoses do not usually lead to changes in clinical management but they often end a family’s diagnostic odyssey and allow informed decisions about future reproductive choices. The likelihood that NGS will be beneficial for patients and families in the NICU remains uncertain. The goal of this paper is to highlight the implications of these ambiguities in interpreting the results of NGS. To do that, we will first review the types of cases that are admitted to NICUs and show why, at least in theory, NGS is unlikely to be useful for most NICU patients and families and may even be harmful for some, although it can help families in some cases. We then present a number of real cases in which NGS results were obtained and show that they often lead to unforeseen and unpredictable consequences. Finally, we will suggest ways to communicate with families about NGS testing and results in order to help them understand the meaning of NGS results and the uncertainty that surrounds them.
How an individual’s genetic information is governed by confidentiality, and how the interests of others, such as close relatives, in knowing such information might be respected, has been the topic of much debate ever since genetic testing became more prevalent. In this paper, two authors who often appear to have different views on familial disclosure discuss where they agree on this topic.
In recent years, it has become increasingly apparent that many neurological disorders are underpinned by a genetic aetiology. This has resulted in considerable efforts to develop therapeutic strategies which can treat the disease-causing mutation, either by supplying a functional copy of the mutated gene or editing the genomic sequence. In this review, we will discuss the main genetic strategies which are currently being explored for the treatment of monogenic neurological disorders, as well as some of the challenges they face. In addition, we will address some of the ethical difficulties which may arise.
The practice of recontacting patients has a long history in medicine but emerged as an issue in genetics as the rapid expansion of knowledge and of testing capacity raised questions about whether, when and how to recontact patients. Until recently, the debate on recontacting has focussed on theoretical concerns of experts. The publication of empirical research into the views of patients, clinicians, laboratories and services in a number of countries has changed this. These studies have filled out, and altered our view of, this issue. Whereas debates on the duty to recontact have explored all aspects of recontact practice, recent contributions have been developing a more nuanced view of recontacting. The result is a narrowing of the scope of the duty, so that a norm on recontacting focuses on the practice of reaching out to discharged patients. This brings into focus the importance of the consent conversation, the resource implications of this duty, and the role of the patient in recontacting.
Timeline of the evolution of diagnostic criteria for “autism” and its relationship to developmental delay and/or intellectual disability. Developmental delays (DDs): defined as delay (prior to age 3) in language (or total lack of language), and/or adaptive functioning. Global developmental delay (GDD): defined as significant delay in 2 or more developmental domains in children under the age of 5 (Srour and Shevell 2015). Intellectual disability (ID): formerly referred to as mental retardation, typically applied if the criteria for GDD are met after the age of 5. *(> 70%) met criteria for a co-morbid diagnosis of GDD/ID (Bryson et al. 1988; Yeargin-Allsopp et al. 2003)
Genetic testing to identify genetic syndromes and copy number variants (CNVs) via whole genome platforms such as chromosome microarray (CMA) or exome sequencing (ES) is routinely performed clinically, and is considered by a variety of organizations and societies to be a “first-tier” test for individuals with developmental delay (DD), intellectual disability (ID), or autism spectrum disorder (ASD). However, in the context of schizophrenia, though CNVs can have a large effect on risk, genetic testing is not typically a part of routine clinical care, and no clinical practice guidelines recommend testing. This raises the question of whether CNV testing should be similarly performed for individuals with schizophrenia. Here we consider this proposition in light of the history of genetic testing for ID/DD and ASD, and through the application of an ethical analysis designed to enable robust, accountable and justifiable decision-making. Using a systematic framework and application of relevant bioethical principles (beneficence, non-maleficence, autonomy, and justice), our examination highlights that while CNV testing for the indication of ID has considerable benefits, there is currently insufficient evidence to suggest that overall, the potential harms are outweighed by the potential benefits of CNV testing for the sole indications of schizophrenia or ASD. However, although the application of CNV tests for children with ASD or schizophrenia without ID/DD is, strictly speaking, off-label use, there may be clinical utility and benefits substantive enough to outweigh the harms. Research is needed to clarify the harms and benefits of testing in pediatric and adult contexts. Given that genetic counseling has demonstrated benefits for schizophrenia, and has the potential to mitigate many of the potential harms from genetic testing, any decisions to implement genetic testing for schizophrenia should involve high-quality evidence-based genetic counseling.
Genetic carrier screening for reproductive purposes has existed for half a century. It was originally offered to particular ethnic groups with a higher prevalence of certain severe recessive or X-linked genetic conditions, or (as carrier testing) to those with a family history of a particular genetic condition. Commercial providers are increasingly offering carrier screening on a user-pays basis. Some countries are also trialing or offering public reproductive genetic carrier screening with whole populations, rather than only to those known to have a higher chance of having a child with an inherited genetic condition. Such programs broaden the ethical and practical challenges that arise in clinical carrier testing. In this paper we consider three aspects of selecting genes for population reproductive genetic carrier screening panels that give rise to important ethical considerations: severity, variable penetrance and expressivity, and scalability; we also draw on three exemplar genes to illustrate the ethical issues raised: CFTR, GALT and SERPINA1. We argue that such issues are important to attend to at the point of gene selection for RGCS. These factors warrant a cautious approach to screening panel design, one that takes into account the likely value of the information generated by screening and the feasibility of implementation in large and diverse populations. Given the highly complex and uncertain nature of some genetic variants, careful consideration needs to be given to the balance between delivering potentially burdensome or harmful information, and providing valuable information to inform reproductive decisions.
Due to a number of recent achievements, the field of prenatal medicine is now on the verge of a profound transformation into prenatal genomic medicine. This transformation is expected not only to substantially expand the spectrum of prenatal diagnostic and screening possibilities, but ultimately also to advance fetal care and the prenatal management of certain fetal diseases and malformations. It will bring new and profound challenges for the normative framework and clinical care pathways in prenatal (and reproductive) medicine. To adequately address the potential ethically challenging aspects without discarding the obvious benefits, several agents are required to engage in different debates. The permissibility of the sequencing of the whole fetal exome or genome will have to be examined from a philosophical and legal point of view, in particular with regard to conflicts with potential rights of future children. A second requirement is a societal debate on the question of priority setting and justice in relation to prenatal genomic testing. Third, a professional-ethical debate and positioning on the goal of prenatal genomic testing and a consequential restructuring of clinical care pathways seems to be important. In all these efforts, it might be helpful to envisage the unborn not as a fetus, that is, not as a separate moral subject and a second "patient", but in its unique physical connection with the pregnant woman, and to accept the moral quandaries implicitly given in this situation.
The Zone of Parental Discretion.
Adapted from The Zone of Parental Discretion. From ‘When doctors and parents disagree’ (2016). Eds McDougall, Delany and Gillam
Genomic sequencing (GS) is now well embedded in clinical practice. However, guidelines issued by professional bodies disagree about whether unsolicited findings (UF)—i.e., disease-causing changes found in the DNA unrelated to the reason for testing—should be reported if they are identified inadvertently during data analysis. This extends to a lack of clarity regarding parents’ ability to decide about receiving UF for their children. To address this, I use an ethical framework, the Zone of Parental Discretion (ZPD), to consider which UF parents should be allowed to choose (not) to receive and examine how well this assessment aligns with existing professional recommendations. Assessment of guidelines shows recommendations ranging from leaving the decision to the discretion of laboratories through to mandatory reporting for UF for childhood onset, treatable/preventable conditions. The ZPD suggests that parents’ decisions should be respected, even where there is no expected benefit, provided that there is not sufficient evidence of serious harm. Using this lens, parents should be able to choose whether or not to know UF for adult-onset conditions in their children, but only insofar as there is insufficient evidence that this knowledge will cause harm or benefit. In contrast, parents should not be allowed to refuse receiving UF for childhood-onset medically actionable conditions. The ZPD is a helpful tool for assessing where it is appropriate to offer parents the choice of receiving UF for their children. This has implications for refinement of policy and laboratory reporting practices, development of consent forms, and genetic counselling practice.
Precision medicine aims to tailor medical treatment to match individual characteristics and to stratify individuals to concentrate benefits and avoid harm. It has recently been joined by precision public health—the application of precision medicine at population scale to decrease morbidity and optimise population health. Newborn preventive genomic sequencing (NPGS) provides a helpful case study to consider how we should approach ethical questions in precision public health. In this paper, I use NPGS as a case in point to argue that both precision medicine and precision public health need public health ethics. I make this argument in two parts. First, I claim that discussions of ethics in precision medicine and NPGS tend to focus on predominantly individualistic concepts from medical ethics such as autonomy and empowerment. This highlights some deficiencies, including overlooking that choice is subject to constraints and that an individual’s place in the world might impact their capacity to ‘be responsible’. Second, I make the case for using a public health ethics approach when considering ethics and NPGS, and thus precision public health more broadly. I discuss how precision public health needs to be construed as a collective enterprise and not just as an aggregation of individual interests. I also show how analysing collective values and interests through concepts such as solidarity can enrich ethical discussion of NPGS and highlight previously overlooked issues. With this approach, bioethics can contribute to more just and more appropriate applications of precision medicine and precision public health, including NPGS.
Here, we argue that polygenic risk scores (PRSs) are different epistemic objects as compared to other biomarkers such as blood pressure or sodium level. While the latter two may be subject to variation, measured inaccurately or interpreted in various ways, blood flow has pressure and sodium is available in a concentration that can be quantified and visualised. In stark contrast, PRSs are calculated, compiled or constructed through the statistical assemblage of genetic variants. How researchers frame and name PRSs has consequences for how we interpret and value their results. We distinguish between the tangible and inferential understanding of PRS and the corresponding languages of measurement and computation, respectively. The conflation of these frames obscures important questions we need to ask: what PRS seeks to represent, whether current ways of ‘doing PRS’ are optimal and responsible, and upon what we base the credibility of PRS-based knowledge claims.
Top-cited authors
David N Cooper
  • Cardiff University
Matthew Mort
  • Cardiff University
Edward V Ball
  • Cardiff University
Fowzan Alkuraya
  • King Faisal Specialist Hospital and Research Centre
Katy Howells
  • Cardiff University