Chao Wang’s research while affiliated with Uppsala University and other places

Publications (26)


Fig. 2. Identification of an anatomically specialized motor cortical region targeting laryngeal motoneurons in the Egyptian fruit bat. (A) Right: schematic of anatomical tracing approaches. Retrograde tracer cholera toxin B (CTB, purple) was injected bilaterally into the cricothyroid muscles to label brainstem motoneurons in nucleus ambiguus (NA). Simultaneously, an anterograde viral tracer (channelrhodopsin-2, ChR2, or Synapsin/synaptophysin dual-label, SYN; green) was injected bilaterally into the orofacial motor cortex (ofM1) to label corticobulbar projections into NA. Left: example coronal section showing cortical injection sites with anterograde tracer (ChR2, green) and DAPI labeling (cyan). (B to F) Laryngeal motoneurons in the NA identified using a retrograde tracer (CTB, purple), cortical fibers labeled with ChR2 (green), corticobulbar synapses labeled with VGLUT1 (red), and DAPI (blue). (B) and (C) are overlaid images showing colocalization of fibers with a synaptic bouton on the retrograde labeled cell (white arrow). (G) Percentage of laryngeal motoneurons labeled with CTB that are colocalized with cortical fibers (blue) or with both cortical fibers and synaptic boutons (red). Note that both tracing techniques qualitatively yielded similar results: ChR2, n = 51 cells from 3 bats; Synapsin/synaptophysin dual-label virus (SYN), n = 26 cells from 2 bats. (H) Illustration of the experimental setup during which wireless electrophysiological recordings were conducted from the identified cortical region in freely behaving and vocalizing bats. (I) Spiking activity of an example ofM1 neuron aligned to the onset of vocalizations produced (bat's own calls, orange) or heard (other bats' calls, blue) by the bat subject. Top row, time varying mean firing rate and corresponding raster plot below. Colored lines in the raster plot show the duration of each vocalization. Note the increased firing rate during vocal production as compared to hearing. 
(J) Information (see Methods) between the time varying firing rate and the amplitude of produced (x-axis) vs. heard (y-axis) vocalizations for 219 single units (marker shapes indicate bat ID, n = 4 bats). The cell shown in (I) is highlighted in red. Inset shows the distribution of D-prime between motor and auditory information for the same cells. Note that the distribution is heavily skewed toward higher motor information rather than auditory information coded in the activity of the recorded neurons. Error bars are mean ± SEM throughout the figure.
Fig. 3. Differential Open Chromatin in Bat Orofacial M1 relative to Wing M1. (A) Open chromatin was profiled from 7 dissected brain regions of Egyptian fruit bats. (B) Volcano plot of ATAC-seq OCRs with differential activity between the orofacial and wing subregions of primary motor cortex (ofM1 and wM1, respectively) of the Egyptian fruit bat. (C) Genome browser showing ofM1 and wM1 ATAC-seq traces at the 3′ end of the FOXP2 locus. Reproducible M1 open chromatin regions (OCRs) are indicated in blue, with a differentially active OCR in ofM1 relative to wM1 highlighted in red.
Fig. 4. Vocal learning-associated convergent evolution in motor cortex open chromatin regions implicates specific neuron subtypes. (A) Overview of applying the Tissue-Aware Conservation Inference Toolkit [TACIT (23)] approach to vocal learning. OCRs (left) identified in motor cortex (M1). Measured open chromatin from M1 (4 species) were used to train convolutional neural networks (CNNs) to predict M1 open chromatin from sequence alone. Red bars and corresponding arrows indicate the presence of a peak while the blue bars represent the absence. The same OCRs were then mapped across 222 mammalian genomes (left) and the identified sequences were used as input to the CNNs to predict open chromatin activity. TACIT identified OCRs whose predicted open chromatin across species was significantly associated with those species' vocal learning status. (B and C) The 4-way Venn diagrams represent the number of OCRs implicated by TACIT (both M1 and PV+) as displaying low (B) or high (C) activity in each of the vocal learning clades based on a t test. (D) The heatmap visualizes specific open chromatin regions along the rows (predicted higher in vocal learners in green; predicted lower in vocal learners in purple) across 222 mammals in the columns (vocal learner in red, vocal nonlearner in black, insufficient or conflicting evidence in gray). The color in each cell corresponds to the z-scored predicted open chromatin, with low open chromatin in blue, mean open chromatin in white, and high open chromatin in red. For open chromatin regions predicted to be significantly less (E) or more (F) open in vocal learning species (p<0.05), the red point shows the number of overlapping regions (y-axis) across mouse cortical cell types (x-axis). The bar-plot shows the distribution across 1,000 permutations of the peaks implicated by TACIT. The notches extend 1.58 * IQR / sqrt(n), which gives a roughly 95% confidence interval.
Vocal learning-associated convergent evolution in mammalian proteins and regulatory elements
  • Article
  • Full-text available

February 2024

·

518 Reads

·

15 Citations

Science

·

[...]

·

James R Xue

Vocal production learning is a convergently evolved trait in vertebrates. To identify brain genomic elements associated with mammalian vocal learning, we integrated genomic, anatomical and neurophysiological data from the Egyptian fruit bat with analyses of the genomes of 215 placental mammals. First, we identified a set of proteins evolving more slowly in vocal learners. Then, we discovered a vocal-motor cortical region in the Egyptian fruit bat, an emergent vocal learner, and leveraged that knowledge to identify active cis-regulatory elements in the motor cortex of vocal learners. Machine learning methods applied to motor cortex open chromatin revealed 50 enhancers robustly associated with vocal learning whose activity tended to be lower in vocal learners. Our research implicates convergent losses of motor cortex regulatory elements in mammalian vocal learning evolution.
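
The Fig. 4 caption for this article states that the plot notches extend 1.58 * IQR / sqrt(n) around the median. The following is a minimal sketch of that calculation only, not the authors' plotting code. The function name `notch_interval`, the choice of the "inclusive" quartile method, and the sample values are assumptions made for illustration.

```python
import math
import statistics


def notch_interval(values):
    """Return (lower, median, upper) for a notch of half-width 1.58 * IQR / sqrt(n).

    Quartiles use the 'inclusive' method of statistics.quantiles; other
    quartile conventions would give slightly different notches (assumption).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to compute quartiles")
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = 1.58 * (q3 - q1) / math.sqrt(n)
    return median - half_width, median, median + half_width


# Hypothetical permutation counts, used only to exercise the function.
example_counts = [1, 2, 3, 4, 5, 6, 7, 8, 9]
lower, median, upper = notch_interval(example_counts)
print(f"median = {median}, notch = [{lower:.3f}, {upper:.3f}]")
```

When the notches of two distributions do not overlap, their medians are often read as differing at roughly the 95% level, which is the sense of the caption's remark.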


Genome sequencing of 2000 canids by the Dog10K consortium advances the understanding of demography, genome function and architecture

August 2023

·

790 Reads

·

66 Citations

Genome Biology

Background: The international Dog10K project aims to sequence and analyze several thousand canine genomes. Incorporating 20× data from 1987 individuals, including 1611 dogs (321 breeds), 309 village dogs, 63 wolves, and four coyotes, we identify genomic variation across the canid family, setting the stage for detailed studies of domestication, behavior, morphology, disease susceptibility, and genome architecture and function. Results: We report the analysis of >48 M single-nucleotide, indel, and structural variants spanning the autosomes, X chromosome, and mitochondria. We discover more than 75% of variation for 239 sampled breeds. Allele sharing analysis indicates that 94.9% of breeds form monophyletic clusters and 25 major clades. German Shepherd Dogs and related breeds show the highest allele sharing with independent breeds from multiple clades. On average, each breed dog differs from the UU_Cfam_GSD_1.0 reference at 26,960 deletions and 14,034 insertions greater than 50 bp, with wolves having 14% more variants. Discovered variants include retrogene insertions from 926 parent genes. To aid functional prioritization, single-nucleotide variants were annotated with SnpEff and Zoonomia phyloP constraint scores. Constrained positions were negatively correlated with allele frequency. Finally, the utility of the Dog10K data as an imputation reference panel is assessed, generating high-confidence calls across varied genotyping platform densities including for breeds not included in the Dog10K collection. Conclusions: We have developed a dense dataset of 1987 sequenced canids that reveals patterns of allele sharing, identifies likely functional variants, informs breed structure, and enables accurate imputation. Dog10K data are publicly available.


Using evolutionary constraint to define novel candidate driver genes in medulloblastoma

August 2023

·

179 Reads

·

2 Citations

Proceedings of the National Academy of Sciences

Current knowledge of cancer genomics remains biased against noncoding mutations. To systematically search for regulatory noncoding mutations, we assessed mutations in conserved positions in the genome under the assumption that these are more likely to be functional than mutations in positions with low conservation. To this end, we used whole-genome sequencing data from the International Cancer Genome Consortium and combined it with evolutionary constraint inferred from 240 mammals to identify genes enriched in noncoding constraint mutations (NCCMs), mutations likely to be regulatory in nature. We compare medulloblastoma (MB), which is malignant, to pilocytic astrocytoma (PA), a primarily benign tumor, and find highly different NCCM frequencies between the two, in agreement with the fact that malignant cancers tend to have more mutations. In PA, a high NCCM frequency only affects the BRAF locus, which is the most commonly mutated gene in PA. In contrast, in MB, >500 genes have high levels of NCCMs. Intriguingly, several loci with NCCMs in MB are associated with different ages of onset, such as the HOXB cluster in young MB patients. In adult patients, NCCMs occurred in, e.g., the WASF-2/AHDC1/FGR locus. One of these NCCMs led to increased expression of the SRC kinase FGR and augmented responsiveness of MB cells to dasatinib, a SRC kinase inhibitor. Our analysis thus points to different molecular pathways in different patient groups. These newly identified putative candidate driver mutations may aid in patient stratification in MB and could be valuable for future selection of personalized treatment options.


Family pedigree of litter 1 and litter 2. Filled circles (female) or squares (male) indicate homozygous (TT) or dead subjects; circles and squares with dots indicate heterozygous subjects (GT); and empty circles and squares indicate unaffected (GG) (n = 2) or not genetically investigated (n = 3) subjects.
Radiographic examinations of the antebrachium and carpus (CrCd/DPa projections) before and after initiation of corrective treatment with 1,25‐dihydroxyvitamin D3 (Etalpha). The physeal widening, irregular marginations of the metaphyses, and osteopenia improve over time. Radiographic changes are most dramatic in the distal radius and ulna. There is unbalanced growth between the radius and ulna resulting in elbow joint incongruity with secondary subchondral sclerosis. (A) Pug 2 with VDDR type 1A before treatment. (B) Pug 2 with VDDR type 1A after 6 weeks on treatment with 1,25‐dihydroxyvitamin D3 (Etalpha). (C) Age‐matched control pug (the dog has an unrelated defect in the elbow joint).
(A) Computed tomography images of the entire spine of Pug 1.1♀ with VDDR type 1A showing generalized decrease in bone density (attenuation). Endplate widening is also noted. (B) Computed tomography images of the entire spine of an age‐matched control pug.
(A) Histopathological changes at sites of enchondral ossification in the bones of Pug 1.1♀ with VDDR type 1A. Costochondral joint, longitudinal section stained with hematoxylin and eosin (H&E), showing retention of hypertrophic chondrocytes and tongue‐like projections of cartilage extending into the metaphysis from the physeal cartilage. (B) Histopathological changes at sites of enchondral ossification in the bones of Pug 1.1♀ with VDDR type 1A. Vertebrae T13–L1, transverse section (H&E), showing disorganized columns of hypertrophic chondrocytes and tongue‐like projections of cartilage in the metaphysis. (C) Spinal cord at T13–L1, transverse section (H&E), showing focal malacia with moderate parenchymal destruction and gliosis of primarily the ventral funiculi at the level of spinal cord stenosis.
A stop-gain mutation (chr10:2182971G>T) in CYP27B1 identified in the 2 pugs (Pug 1.1 and Pug 2) with VDDR type 1A. This premature truncation occurs at the 87th codon in exon 2, leaving 83% of the protein sequence missing.
Mutations in the CYP27B1 gene cause vitamin D dependent rickets in pugs

June 2023

·

54 Reads

·

2 Citations

Rickets is a disorder of bone development and can be the result of either dietary or genetic causes. Here, related pugs from 2 litters were included. Three pugs had clinical signs, including lameness, bone deformities, and dyspnea. One other pug was found dead. Radiographs of 2 affected pugs, 5 and 6 months old, showed generalized widening and irregular margination of the physes of both the appendicular and the axial skeleton, with generalized decrease in bone opacity and bulbous swelling of the costochondral junctions. Two pugs had low serum calcium and 1,25(OH)2D3 concentrations. Test results further indicated secondary hyperparathyroidism with adequate concentrations of 25‐hydroxyvitamin D. Necropsy revealed tongue‐like projections of cartilage extending into the metaphysis consistent with rickets, loss of metaphyseal mineralization, and lung pathology. Vitamin D‐dependent rickets was diagnosed. A truncating mutation in the 1α‐hydroxylase gene (CYP27B1) was identified by genome sequence analysis of the pugs with VDDR type 1A. Vitamin D‐dependent rickets type 1A can occur in young pugs and, if left untreated, is a life‐threatening condition. Early medical intervention can reverse clinical signs and should be instituted as soon as possible.
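
The figure caption for this article places the stop-gain at the 87th codon and reports that 83% of the protein sequence is lost. The arithmetic can be sketched as below. The full protein length is not given on this page, so `ASSUMED_PROTEIN_LENGTH = 508` is an assumption (the length commonly reported for human CYP27B1; the canine protein may differ), and the helper name `fraction_missing` is hypothetical.

```python
def fraction_missing(stop_codon: int, protein_length: int) -> float:
    """Fraction of residues lost when codon `stop_codon` (1-based) becomes a stop.

    Residues 1 .. stop_codon - 1 are retained; everything from the stop
    codon onward is missing from the truncated product.
    """
    if not 1 <= stop_codon <= protein_length:
        raise ValueError("stop codon must lie within the protein")
    retained = stop_codon - 1
    return 1 - retained / protein_length


# Assumption: 508 residues, not a value taken from this page.
ASSUMED_PROTEIN_LENGTH = 508

missing = fraction_missing(87, ASSUMED_PROTEIN_LENGTH)
print(f"{missing:.0%} of the sequence missing")
```

Under that assumed length, 86 of 508 residues are retained, which is consistent with the roughly 83% loss stated in the caption.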


Evolutionary constraint and innovation across hundreds of placental mammals

April 2023

·

551 Reads

·

150 Citations

Science

Zoonomia is the largest comparative genomics resource for mammals produced to date. By aligning genomes for 240 species, we identify bases that, when mutated, are likely to affect fitness and alter disease risk. At least 332 million bases (~10.7%) in the human genome are unusually conserved across species (evolutionarily constrained) relative to neutrally evolving repeats, and 4552 ultraconserved elements are nearly perfectly conserved. Of 101 million significantly constrained single bases, 80% are outside protein-coding exons and half have no functional annotations in the Encyclopedia of DNA Elements (ENCODE) resource. Changes in genes and regulatory elements are associated with exceptional mammalian traits, such as hibernation, that could inform therapeutic development. Earth's vast and imperiled biodiversity offers distinctive power for identifying genetic variants that affect genome function and organismal phenotypes.
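
The percentages in this abstract can be turned into approximate base counts with simple arithmetic. This is a rough sketch using only the rounded figures quoted above; the variable and function names are illustrative, and the derived numbers inherit the rounding of the inputs.

```python
def implied_total(part: float, fraction: float) -> float:
    """Total size implied when `part` makes up `fraction` of the whole."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return part / fraction


# Figures quoted in the abstract (rounded by the authors).
constrained_bases = 332e6        # "at least 332 million bases"
constrained_fraction = 0.107     # "~10.7%" of the human genome
significant_single_bases = 101e6
share_outside_exons = 0.80

# Genome size implied by the two constraint figures, about 3.1 billion bases.
genome_size = implied_total(constrained_bases, constrained_fraction)

# Significantly constrained single bases outside protein-coding exons.
outside_exons = share_outside_exons * significant_single_bases

print(f"implied genome size: {genome_size / 1e9:.2f} Gb")
print(f"constrained bases outside exons: {outside_exons / 1e6:.1f} million")
```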


Relating enhancer genetic variation across mammals to complex phenotypes using machine learning

April 2023

·

235 Reads

·

52 Citations

Science

Protein-coding differences between species often fail to explain phenotypic diversity, suggesting the involvement of genomic elements that regulate gene expression such as enhancers. Identifying associations between enhancers and phenotypes is challenging because enhancer activity can be tissue-dependent and functionally conserved despite low sequence conservation. We developed the Tissue-Aware Conservation Inference Toolkit (TACIT) to associate candidate enhancers with species' phenotypes using predictions from machine learning models trained on specific tissues. Applying TACIT to associate motor cortex and parvalbumin-positive interneuron enhancers with neurological phenotypes revealed dozens of enhancer-phenotype associations, including brain size-associated enhancers that interact with genes implicated in microcephaly or macrocephaly. TACIT provides a foundation for identifying enhancers associated with the evolution of any convergently evolved phenotype in any large group of species with aligned genomes.


Comparative genomics of Balto, a famous historic dog, captures lost diversity of 1920s sled dogs

April 2023

·

354 Reads

·

9 Citations

Science

We reconstruct the phenotype of Balto, the heroic sled dog renowned for transporting diphtheria antitoxin to Nome, Alaska, in 1925, using evolutionary constraint estimates from the Zoonomia alignment of 240 mammals and 682 genomes from dogs and wolves of the 21st century. Balto shares just part of his diverse ancestry with the eponymous Siberian husky breed. Balto's genotype predicts a combination of coat features atypical for modern sled dog breeds, and a slightly smaller stature. He had enhanced starch digestion compared with Greenland sled dogs and a compendium of derived homozygous coding variants at constrained positions in genes connected to bone and skin development. We propose that Balto's population of origin, which was less inbred and genetically healthier than that of modern breeds, was adapted to the extreme environment of 1920s Alaska.


Integrating gene annotation with orthology inference at scale

April 2023

·

381 Reads

·

111 Citations

Science

Annotating coding genes and inferring orthologs are two classical challenges in genomics and evolutionary biology that have traditionally been approached separately, limiting scalability. We present TOGA (Tool to infer Orthologs from Genome Alignments), a method that integrates structural gene annotation and orthology inference. TOGA implements a different paradigm to infer orthologous loci, improves ortholog detection and annotation of conserved genes compared with state-of-the-art methods, and handles even highly fragmented assemblies. TOGA scales to hundreds of genomes, which we demonstrate by applying it to 488 placental mammal and 501 bird assemblies, creating the largest comparative gene resources so far. Additionally, TOGA detects gene losses, enables selection screens, and automatically provides a superior measure of mammalian genome quality. TOGA is a powerful and scalable method to annotate and compare genes in the genomic era.


Three-dimensional genome rewiring in loci with human accelerated regions

April 2023

·

219 Reads

·

64 Citations

Science

Human accelerated regions (HARs) are conserved genomic loci that evolved at an accelerated rate in the human lineage and may underlie human-specific traits. We generated HARs and chimpanzee accelerated regions with an automated pipeline and an alignment of 241 mammalian genomes. Combining deep learning with chromatin capture experiments in human and chimpanzee neural progenitor cells, we discovered a significant enrichment of HARs in topologically associating domains containing human-specific genomic variants that change three-dimensional (3D) genome organization. Differential gene expression between humans and chimpanzees at these loci suggests rewiring of regulatory interactions between HARs and neurodevelopmental genes. Thus, comparative genomics together with models of 3D genome folding revealed enhancer hijacking as an explanation for the rapid evolution of HARs.


Citations (23)


... Over the last decades, the vocal (production) learning hypothesis has emerged as the dominant framework to explain speech evolution from a comparative perspective (Janik and Slater 1997; Christiansen and Kirby 2003b; Bolhuis and Wynne 2009; Fitch and Jams 2013; Jarvis 2019; Vernes et al. 2021; Wirthlin et al. 2024). It proposes that vocal learning, the capacity to expand one's call repertoire through the imitation of new sounds or modification of pre-existing calls through acoustic feedback, was a precondition for speech evolution. ...

Reference:

Vocal Learning Versus Speech Evolution: Untangling a False Equivalence
Vocal learning-associated convergent evolution in mammalian proteins and regulatory elements

Science

... In parallel, these 79 protein-coding and splicing variants were compared with 1987 genomes from the Dog10K project (Meadows et al., 2023) to identify unique variants that are not present in the overall dog population. From this analysis, 26 variants were private to the family (Table S2). ...

Genome sequencing of 2000 canids by the Dog10K consortium advances the understanding of demography, genome function and architecture

Genome Biology

... This approach makes it possible to distinguish mutations that are likely to have a functional impact in cancer. Using this approach, it has been possible to identify genes enriched for non-coding constraint mutations in tumor/normal sequencing data from human medulloblastoma and glioblastoma [87,88]. This methodology is currently being applied to ongoing canine and human cancer sequencing studies. ...

Using evolutionary constraint to define novel candidate driver genes in medulloblastoma
  • Citing Article
  • August 2023

Proceedings of the National Academy of Sciences

... Beyond the nutritional causes of vitamin D deficiency, genetic familial factors may also contribute. For example, a recent publication by Rohdin et al. (2023) identifies abnormalities in bone formation in Pugs as linked to a mutation in CYP27B1, a gene involved in genetic disorders affecting vitamin D metabolism [5,30]. ...

Mutations in the CYP27B1 gene cause vitamin D dependent rickets in pugs

... Among the TEs, solo long terminal repeats (solo-LTRs) were exapted as cis-regulatory regions (e.g., promoters, transcription factors binding sites) of human genes [42]. For this reason, their activity is strictly regulated by epigenetic modifications to prevent aberrant gene expression [43]. ...

Mammalian evolution of human cis-regulatory elements and transcription factor binding sites

Science

... Mutation rates differ across species due to genetic and environmental factors, yet the evolutionary forces shaping context-specific mutation rates remain largely unexplored outside of humans, primates and a few vertebrates [6][17][18][19]. With the expansion of large-scale sequencing initiatives and conservation genomics efforts, population-level polymorphism data are now available for diverse eukaryotic taxa beyond mammals [20][21][22]. These resources provide an opportunity to investigate mutation spectra across a wider phylogenetic range. ...

Evolutionary constraint and innovation across hundreds of placental mammals
  • Citing Article
  • April 2023

Science

... And consequently, many research teams and consortia have been able to produce near-complete full-chromosome assemblies for an ever-growing number of organisms, including both model and non-model species; prominent examples include the Telomere-to-Telomere (T2T) consortia [7,11,12], the Vertebrate Genome Project (VGP) [13], and the California Conservation Genomics Project (CCGP) [14]. In addition to being of interest to evolutionary biologists and genome researchers, having complete assemblies for many species has the potential to be transformational for medical genetics, agriculture, conservation biology, and many other disciplines [15][16][17][18][19][20][21][22][23][24][25][26]. ...

The contribution of historical processes to contemporary extinction risk in placental mammals
  • Citing Article
  • April 2023

Science

Comparative genomics of Balto, a famous historic dog, captures lost diversity of 1920s sled dogs
  • Citing Article
  • April 2023

Science

The functional and evolutionary impacts of human-specific deletions in conserved elements
  • Citing Article
  • April 2023

Science

Insights into mammalian TE diversity through the curation of 248 genome assemblies
  • Citing Article
  • April 2023

Science