About
Publications: 277
Reads: 110,134
Citations: 155,274
Publications (277)
Woolly mammoths were among the most abundant cold-adapted species during the Pleistocene. Their once large populations went extinct in two waves: an end-Pleistocene extinction of continental populations followed by the mid-Holocene extinction of relict populations on St. Paul Island ∼5,600 years ago and Wrangel Island ∼4,000 years ago. Wrangel Isla...
Standardized identification of genotypes is necessary in animals that reproduce asexually and form large clonal populations such as coral. We developed a high-resolution hybridization-based genotype array coupled with an analysis workflow and database for the most speciose genus of coral, Acropora, and their symbionts. We designed the array to co-...
Genomic sequence data for non-model organisms are increasingly available, requiring the development of efficient and reproducible workflows. Here, we develop the first genomic resources and reproducible workflows for two threatened members of the reef-building coral genus Acropora. We generated genomic sequence data from multiple samples of the Carib...
Supplementary Figures 1-5, Supplementary Tables 1-2, Supplementary Notes 1-4 and Supplementary References
The origins of giraffe's imposing stature and associated cardiovascular adaptations are unknown. Okapi, which lacks these unique features, is giraffe's closest relative and provides a useful comparison to identify genetic variation underlying giraffe's long neck and cardiovascular system. The genomes of giraffe and okapi were sequenced, and throug...
Interspecific hybridization is recognized as a widespread phenomenon, but measuring its extent, directionality, and adaptive importance in the evolution of species remains challenging. Polar bears possess unique adaptations to life on the Arctic sea ice, whereas their closest relatives, brown bears, are boreal and subarctic generalists. Despite larg...
The California condor is a critically endangered avian species that, in 1982, became extinct in the wild. It has survived through a captive breeding program and reintroduction efforts within its former range. As of April 2015, 421 California condors, including 204 flying in the wild, constituted the extant population. Concern regarding...
Woolly mammoths and living elephants are characterized by major phenotypic differences that have allowed them to live in very different environments. To identify the genetic changes that underlie the suite of woolly mammoth adaptations to extreme cold, we sequenced the nuclear genome from three Asian elephants and two woolly mammoths, and we identi...
With the development of inexpensive, high-throughput sequencing technologies, it has become feasible to examine questions related to population genetics and molecular evolution of non-model species in their ecological contexts on a genome-wide scale. Here, we employed a newly developed suite of integrated, web-based programs to examine population d...
Woolly mammoths and the living elephants are characterized by major phenotypic differences that allowed them to live in very different environments. To identify the genetic changes that underlie the suite of adaptations in woolly mammoths to life in extreme cold, we sequenced the nuclear genome from three Asian elephants and two woolly mammoths, id...
The Genome 10K Project was established in 2009 by a consortium of biologists and genome scientists determined to facilitate the sequencing and analysis of the complete genomes of 10,000 vertebrate species. Since then the number of selected and initiated species has risen from ∼26 to 277 sequenced or ongoing with funding, an approximately tenfold in...
Background
The discovery and mapping of genomic variants is an essential step in most analyses done using sequencing reads. There are a number of mature software packages and associated pipelines that can identify single nucleotide polymorphisms (SNPs) with a high degree of concordance. However, the same cannot be said for tools that are used to id...
The Khoisan people from Southern Africa maintained ancient lifestyles as hunter-gatherers or pastoralists up to modern times, though little else is known about their early history. Here we infer early demographic histories of modern humans using whole-genome sequences of five Khoisan individuals and one Bantu speaker. Comparison with a 420 K SNP da...
Polar bears (Ursus maritimus) face extremely cold temperatures and periods of fasting, which might result in more severe energetic challenges than those experienced by their sister species, the brown bear (U. arctos). We have examined the mitochondrial and nuclear genomes of polar and brown bears to investigate if polar bears demonstrate lineage-sp...
W635: The critically endangered California condor (Gymnogyps californianus) has been the focus of intensive conservation efforts for several decades. Reduced to a population size of twenty-three birds in 1985, the entire surviving population was brought under captive management for recovery. Founded by fourteen individuals, the surviving California...
HbVar (http://globin.bx.psu.edu/hbvar) is one of the oldest and most appreciated locus-specific databases launched in 2001 by a multi-center academic effort to provide timely information on the genomic alterations leading to hemoglobin variants and all types of thalassemia and hemoglobinopathies. Database records include extensive phenotypic descri...
Intra-species genetic variation can be used to investigate population structure, selection, and gene flow in non-model vertebrates; and due to the plummeting costs for genome sequencing, it is now possible for small labs to obtain full-genome variation data from their species of interest. However, those labs may not have easy access to, and familia...
We performed a population genomics study of the aye-aye, a highly specialized nocturnal lemur from Madagascar. Aye-ayes have low population densities and extensive range requirements that could make this flagship species particularly susceptible to extinction. Therefore, knowledge of genetic diversity and differentiation among aye-aye populations i...
Next-generation sequencing platforms coupled with advanced bioinformatic tools enable re-sequencing of the human genome at high speed and with large cost savings. We compare sequencing platforms from Roche/454 (GS FLX), Illumina/HiSeq (HiSeq 2000), and Life Technologies/SOLiD (SOLiD 3 ECC) for their ability to identify single nucleotide substitutions in...
The putative allele, the flanking sequence in the genome, PCR and extension primers used in this study are listed in this spreadsheet. The sheet “Only 454” lists the details for locations that were only called using 454 generated sequences, “Only Illumina” refers to the details for locations that were only called by sequences generated using Illumi...
Hybridization is a widespread evolutionary phenomenon that can play a role in diversification, especially among closely related taxa. Whereas hybridization is well known in plants, natural animal hybrids are considered much less common, although recent genome-wide investigations of canids and hominids suggest that admixture may have shaped the evol...
The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign bioc...
This unit focuses on some of the tools available on the public Galaxy server that are useful for exploring possible associations between human genetic variants and phenotypes. We trace step-by-step through an example illustrating several methods for examining a single full-coverage genome to look for single-nucleotide polymorphisms (SNPs) that are...
Background
With over 1.3 billion people, India is estimated to contain three times more genetic diversity than does Europe. Next-generation sequencing technologies have facilitated the understanding of diversity by enabling whole genome sequencing at greater speed and lower cost. While genomes from people of European and Asian descent have been seq...
Table S1. SNPs and indels in (A) Gene, Regulatory and Enhancer regions, (B) Repeat class and family. Table S2. Non-synonymous SNPs in SAIF genome. Table S3. SNPs predicted to be damaging by SIFT. Table S4. (A) In-frame short indels, and (B) Short frameshift indels in SAIF genome. Table S5. Short indels predicted to lead to non-sense mediated decay (NMD...
Figure S1. Pathway analysis of synonymous SNPs. Pathway enrichment analysis was performed using the DAVID program. The enriched KEGG pathways (FDR <= 0.05) identified are reported. Figure S2. Protein domain position and non-sense SNP location in MMP28 protein. Figure S3. Phylogenetic relationship of the SAIF mt genome. The tree on the left shows phyloge...
Polar bears (PBs) are superbly adapted to the extreme Arctic environment and have become emblematic of the threat to biodiversity from global climate change. Their divergence from the lower-latitude brown bear provides a textbook example of rapid evolution of distinct phenotypes. However, limited mitochondrial and nuclear DNA evidence conflicts in...
Many software tools for comparative analysis of genomic sequence data have been released in recent decades. Despite this, it remains challenging to determine evolutionary relationships in gene clusters due to their complex histories involving duplications, deletions, inversions, and conversions. One concept describing these relationships is ortholo...
Recent advances in next-generation sequencing technology have opened doors to conservation and population genomic studies of wildlife species, for which only limited genetic resources exist. We are using a next-generation sequencing strategy to detect genome-wide single nucleotide polymorphisms (SNPs) in bear species. Based on distinct differences...
We present a high-coverage draft genome assembly of the aye-aye (Daubentonia madagascariensis), a highly unusual nocturnal primate from Madagascar. Our assembly totals ~3.0 billion bp (3.0 Gb), roughly the size of the human genome, comprising ~2.6 million scaffolds (N50 scaffold size = 13,597 bp) based on short paired-end sequencing reads. We com...
Supplement. This includes Table S1 - GenBank accession numbers of the new sequences; Table S2 - Summary of detected conversions in the five human gene clusters; Table S3 - Fraction of paralogous pairs by their number of conversion events, out of all paralogous sequence pairs; Table S4 - Fraction of bases by their number of conversion events (involv...
Gene clusters containing multiple similar genomic regions in close proximity are of great interest for biomedical studies because of their associations with inherited diseases. However, such regions are difficult to analyze due to their structural complexity and their complicated evolutionary histories, reflecting a variety of large-scale mutationa...
Interplays among lineage-specific nuclear proteins, chromatin modifying enzymes, and the basal transcription machinery govern cellular differentiation, but their dynamics of action and coordination with transcriptional control are not fully understood. Alterations in chromatin structure appear to establish a permissive state for gene activation at...
The Tasmanian devil (Sarcophilus harrisii) is threatened with extinction because of a contagious cancer known as Devil Facial Tumor Disease. The inability to mount an immune response and to reject these tumors might be caused by a lack of genetic diversity within a dwindling population. Here we report a whole-genome analysis of two animals originat...
Plasticity of gene regulatory encryption can permit DNA sequence divergence without loss of function. Functional information is preserved through conservation of the composition of transcription factor binding sites (TFBS) in a regulatory element. We have developed a method that can accurately identify pairs of functional noncoding orthologs at evo...
Author Summary
The Encyclopedia of DNA Elements (ENCODE) Project was created to enable the scientific and medical communities to interpret the human genome sequence and to use it to understand human biology and improve health. The ENCODE Consortium, a large group of scientists from around the world, uses a variety of experimental methods to identif...
We developed a series of interrelated locus-specific databases to store all published and unpublished genetic variation related to hemoglobinopathies and thalassemia and implemented microattribution to encourage submission of unpublished observations of genetic variation to these public repositories. A total of 1,941 unique genetic variants in 37 g...
Gene clusters are genetically important, but their analysis poses significant computational challenges. One of the major reasons for these difficulties is gene conversion among the duplicated regions of the cluster, which can obscure their true relationships. Many computational methods for detecting gene conversion events have been released, but th...
'Orang-utan' is derived from a Malay term meaning 'man of the forest' and aptly describes the southeast Asian great apes native to Sumatra and Borneo. The orang-utan species, Pongo abelii (Sumatran) and Pongo pygmaeus (Bornean), are the most phylogenetically distant great apes from humans, thereby providing an informative perspective on hominid evo...
Much important evolutionary activity occurs in gene clusters, where a copy of a gene may be free to acquire new functions. Current computational methods to extract evolutionary information from sequence data for such clusters are suboptimal, in part because accurate sequence data are often lacking in these genomic regions, making existing methods d...
Gene conversion events are often overlooked in analyses of genome evolution. In a conversion event, an interval of DNA sequence (not necessarily containing a gene) overwrites a highly similar sequence. The event creates relationships among genomic intervals that can confound attempts to identify orthologs and to transfer functional annotation betwe...
The MultiPipMaker World Wide Web server (http://www.bx.psu.edu) provides a tool for aligning multiple DNA sequences and visualizing regions of conservation among them. This unit describes its use and gives an explanation of the resulting output files and supporting tools. Features provided by the server include alignment of up to 20 very long genom...