David B Jaffe's research while affiliated with 10x Genomics and other places

Publications (153)

Article
Full-text available
Linked-Read sequencing technology has recently been employed successfully for de novo assembly of human genomes; however, the utility of this technology for complex plant genomes is unproven. We evaluated the technology for this purpose by sequencing the 3.5-gigabase (Gb) diploid pepper (Capsicum annuum) genome with a single Linked-Read library. Pl...
Article
X-linked Dystonia-Parkinsonism (XDP) is a Mendelian neurodegenerative disease that is endemic to the Philippines and is associated with a founder haplotype. We integrated multiple genome and transcriptome assembly technologies to narrow the causal mutation to the TAF1 locus, which included a SINE-VNTR-Alu (SVA) retrotransposition into intron 32 of...
Preprint
Full-text available
Background: Linked-Read sequencing technology has recently been employed successfully for de novo assembly of multiple human genomes; however, the utility of this technology for complex plant genomes is unproven. We evaluated the technology for this purpose by sequencing the 3.5 gigabase (Gb) diploid pepper (Capsicum annuum) genome with a single Link...
Preprint
X-linked Dystonia-Parkinsonism (XDP) is a Mendelian neurodegenerative disease endemic to the Philippines. We integrated genome and transcriptome assembly with induced pluripotent stem cell-based modeling to identify the XDP causal locus and potential pathogenic mechanism. Genome sequencing identified novel variation that was shared by all probands...
Article
Current short-read methods have come to dominate genome sequencing because they are cost-effective, rapid, and accurate. However, short reads are most applicable when data can be aligned to a known reference. Two new methods for de novo assembly are linked-reads and restriction-site labeled optical maps. We combined commercial applications of these...
Article
Full-text available
Determining the genome sequence of an organism is challenging, yet fundamental to understanding its biology. Over the past decade, thousands of human genomes have been sequenced, contributing deeply to biomedical research. In the vast majority of cases, these have been analyzed by aligning sequence reads to a single reference genome, biasing the re...
Preprint
Determining the genome sequence of an organism is challenging, yet fundamental to understanding its biology. Over the past decade, thousands of human genomes have been sequenced, contributing deeply to biomedical research. In the vast majority of cases, these have been analyzed by aligning sequence reads to a single reference genome, biasing the re...
Article
Full-text available
Background: De novo reference assemblies that are affordable, practical to produce, and of sufficient quality for most downstream applications, remain an unattained goal for many taxa. Insects, which may yield too little DNA from individual specimens for long-read sequencing library construction and often have highly heterozygous genomes, can be p...
Article
Complete knowledge of the genetic variation in individual human genomes is a crucial foundation for understanding the etiology of disease. Genetic variation is typically characterized by sequencing individual genomes and comparing reads to a reference. Existing methods do an excellent job of detecting variants in approximately 90% of the human geno...
Article
Full-text available
Cichlid fishes are famous for large, diverse and replicated adaptive radiations in the Great Lakes of East Africa. To understand the molecular mechanisms underlying cichlid phenotypic diversity, we sequenced the genomes and transcriptomes of five lineages of African cichlids: the Nile tilapia (Oreochromis niloticus), an ancestral lineage with low div...
Article
Cichlid fishes are famous for large, diverse and replicated adaptive radiations in the Great Lakes of East Africa. To understand the molecular mechanisms underlying cichlid phenotypic diversity, we sequenced the genomes and transcriptomes of five lineages of African cichlids: the Nile tilapia (Oreochromis niloticus), an ancestral lineage with low d...
Article
Full-text available
The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, accept...
Article
Full-text available
Background: DNA sequencing technologies deviate from the ideal uniform distribution of reads. These biases impair scientific and medical applications. Accordingly, we have developed computational methods for discovering, describing and measuring bias. Results: We applied these methods to the Illumina, Ion Torrent, Pacific Biosciences and Complete Ge...
Article
Full-text available
The discovery of a living coelacanth specimen in 1938 was remarkable, as this lineage of lobe-finned fish was thought to have become extinct 70 million years ago. The modern coelacanth looks remarkably similar to many of its ancient relatives, and its evolutionary proximity to our own fish ancestors provides a glimpse of the fish that first walked...
Article
Detection of somatic point substitutions is a key step in characterizing the cancer genome. However, existing methods typically miss low-allelic-fraction mutations that occur in only a subset of the sequenced cells owing to either tumor heterogeneity or contamination by normal cells. Here we present MuTect, a method that applies a Bayesian classifi...
Article
Full-text available
Although genetic lesions responsible for some mendelian disorders can be rapidly discovered through massively parallel sequencing of whole genomes or exomes, not all diseases readily yield to such efforts. We describe the illustrative case of the simple mendelian disorder medullary cystic kidney disease type 1 (MCKD1), mapped more than a decade ago...
Article
The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platfor...
Article
Full-text available
Exceptionally accurate genome reference sequences have proven to be of great value to microbial researchers. Thus, to date, about 1800 bacterial genome assemblies have been "finished" at great expense with the aid of manual laboratory and computational processes that typically iterate over a period of months or even years. By applying a new laborat...
Article
Full-text available
Eliminating the bacterial cloning step has been a major factor in the vastly improved efficiency of massively parallel sequencing approaches. However, this also has made it a technical challenge to produce the modern equivalent of the Fosmid- or BAC-end sequences that were crucial for assembling and analyzing complex genomes during the Sanger-based...
Article
Full-text available
The 1000 Genomes Project was launched as one of the largest distributed data collection and analysis projects ever undertaken in biology. In addition to the primary scientific goals of creating both a deep catalogue of human genetic variation and extensive methods to accurately discover and characterize variation using new sequencing technologies,...
Article
Full-text available
Marine stickleback fish have colonized and adapted to thousands of streams and lakes formed since the last ice age, providing an exceptional opportunity to characterize genomic mechanisms underlying repeated ecological adaptation in nature. Here we develop a high-quality reference genome assembly for threespine sticklebacks. By sequencing the genom...
Article
Full-text available
Advances in next-generation sequencing (NGS) promise to facilitate diagnosis of inherited disorders. Although in research settings NGS has pinpointed causal alleles using segregation in large families, the key challenge for clinical diagnosis is application to single individuals. To explore its diagnostic use, we performed targeted NGS in 42 unrela...
Article
Full-text available
The comparison of related genomes has emerged as a powerful lens for genome interpretation. Here we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and locate constrained elements covering ∼4.2% of the genome. We use evolutionary signatures...
Article
Full-text available
Low-cost short read sequencing technology has revolutionized genomics, though it is only just becoming practical for the high-quality de novo assembly of a novel large genome. We describe the Assemblathon 1 competition, which aimed to comprehensively assess the state of the art in de novo assembly methods when applied to current sequencing technolo...