J. C. Venter

J. Craig Venter Institute | JCVI

About

543
Publications
150,787
Reads
124,942
Citations
Citations since 2016
71 Research Items
32581 Citations
[Chart: citations per year, 2016–2022; vertical axis 0–5,000]

Publications (543)
Article
Full-text available
Synthetic genomics is the construction of viruses, bacteria, and eukaryotic cells with synthetic genomes. It involves two basic processes: synthesis of complete genomes or chromosomes and booting up of those synthetic nucleic acids to make viruses or living cells. The first synthetic genomics efforts resulted in the construction of viruses. This le...
Article
Full-text available
Significance: To understand the value and clinical impact of surveying genome-wide disease-causing genes and variants, we used a prospective cohort study design that enrolled volunteers who agreed to have their whole genome sequenced and to participate in deep phenotyping using clinical laboratory tests, metabolomics technologies, and advanced nonin...
Article
Full-text available
Background: Modern medicine is rapidly moving towards a data-driven paradigm based on comprehensive multimodal health assessments. Integrated analysis of data from different modalities has the potential of uncovering novel biomarkers and disease signatures. Methods: We collected 1385 data features from diverse modalities, including metabolome, m...
Article
Full-text available
The human gut is inhabited by a complex and metabolically active microbial ecosystem. While many studies focused on the effect of individual microbial taxa on human health, their overall metabolic potential has been under-explored. Using whole-metagenome shotgun sequencing data in 1,004 twins, we first observed that unrelated subjects share, on ave...
Article
(Cell Metabolism 25, 1054–1062.e1–e5; May 2, 2017) In the originally published version of this article, there was a typographical error in Figure 1 of the manuscript related to the two Dorea species. Dorea longicatena was mislabeled as Dorea sp. CAG:317 (and vice versa). The correction has now been made online. This error does not affect the conclu...
Article
Full-text available
Sequence variation data of the human proteome can be used to analyze 3D protein structures to derive functional insights. We used genetic variant data from nearly 140,000 individuals to analyze 3D positional conservation in 4,715 proteins and 3,951 homology models using 860,292 missense and 465,886 synonymous variants. Sixty percent of protein stru...
Preprint
Full-text available
The human gut is inhabited by a complex and metabolically active microbial ecosystem regulating host health. While many studies have focused on the effect of individual microbial taxa, the metabolic potential of the entire gut microbial ecosystem has been largely under-explored. We characterised the gut microbiome of 1,004 twins via whole shotgun m...
Article
We previously discovered that intact bacterial chromosomes can be directly transferred to a yeast host cell, where they can propagate as centromeric plasmids, by fusing bacterial cells with Saccharomyces cerevisiae spheroplasts. Inside the host any desired number of genetic changes can be introduced into the yeast centromeric plasmid to produce designe...
Preprint
Full-text available
We report the results of a three-year precision medicine study that enrolled 1190 presumed healthy participants at a single research clinic. To enable a better assessment of disease risk and improve diagnosis, a precision health platform that integrates non-invasive functional measurements and clinical tests combined with whole genome sequencing (W...
Preprint
Full-text available
Modern medicine is rapidly moving towards a data-driven paradigm based on comprehensive multimodal health assessments. We collected 1,385 data features from diverse modalities, including metabolome, microbiome, genetics and advanced imaging, from 1,253 individuals and from a longitudinal validation cohort of 1,083 individuals. We utilized an ensemb...
Article
Full-text available
Obesity is a heterogeneous phenotype that is crudely measured by body mass index (BMI). There is a need for a more precise yet portable method of phenotyping and categorizing risk in large numbers of people with obesity to advance clinical care and drug development. Here, we used non-targeted metabolomics and whole-genome sequencing to identify met...
Article
The Central Asian Kyrgyz highland population provides a unique opportunity to address genetic diversity and understand the genetic mechanisms underlying high-altitude pulmonary hypertension (HAPH). Although a significant fraction of the population is unaffected, there are susceptible individuals who display HAPH in the absence of any lung, cardiac...
Article
Full-text available
Functional genomics studies in minimal mycoplasma cells enable unobstructed access to some of the most fundamental processes in biology. Conventional transposon bombardment and gene knockout approaches often fail to reveal functions of genes that are essential for viability, where lethality precludes phenotypic characterization. Conditional inactiv...
Article
Full-text available
Inherited variation contributes to autism: About one-quarter of genetic variants that are associated with autism spectrum disorder (ASD) are due to de novo mutations in protein-coding genes. Brandler et al. wanted to determine whether changes in noncoding regions of the genome are associated with autism. They applied whole-genome sequencing to ∼2600...
Preprint
Full-text available
Obesity is a heterogeneous phenotype that is crudely measured by body mass index (BMI). More precise phenotyping and categorization of risk in large numbers of people with obesity is needed to advance clinical care and drug development. Here, we used non-targeted metabolome analysis and whole genome sequencing to identify metabolic and genetic sign...
Article
Full-text available
There is a significant interest in the standardized classification of human genetic variants. We used whole-genome sequence data from 10,495 unrelated individuals to contrast population frequency of pathogenic variants to the expected population prevalence of the disease. Analyses included the ACMG-recommended 59 gene-condition sets for incidental...
Article
Full-text available
Significance: Advances in technology are enabling evaluation for prevention and early detection of age-related chronic diseases associated with premature mortality, such as cancer and cardiovascular diseases. These diseases kill about one-third of men and one-quarter of women between the ages of 50 and 74 years in the United States. We used whol...
Article
Full-text available
Understanding the significance of genetic variants in the noncoding genome is emerging as the next challenge in human genomics. We used the power of 11,257 whole-genome sequences and 16,384 heptamers (7-nt motifs) to build a map of sequence constraint for the human species. This build differed substantially from traditional maps of interspecies con...
Article
Full-text available
Background: Acetaminophen (paracetamol) is one of the most common medications used for management of pain in the world. There is a lack of consensus about the mechanism of action, and concern about the possibility of adverse effects on reproductive health. Methods: We first established the metabolome profile that characterizes use of acetaminophen...
Article
Full-text available
Objectives: Inflammatory bowel diseases (IBD), comprised of Crohn’s disease (CD) and ulcerative colitis (UC), are characterized by a complex pathophysiology that is thought to result from an aberrant immune response to a dysbiotic luminal microbiota in genetically susceptible individuals. New technologies support the joint assessment of host-microb...
Data
Table S1. Comparison of STR Disease Prevalence in HLI Samples and the Known Prevalence Estimates Based on Literature Review
Data
Table S2. Number of Spanning, Partial, Repeat-Only, and Paired-End Reads Identified by TREDPARSE for Each of the 138 Individuals with Risk Alleles
Data
Table S5. Number of Spanning, Partial, Repeat-Only, and Paired-End Reads and STR Calls Identified by TREDPARSE for Each of the Trio Families Used in the Mendelian-Error Estimates Each worksheet contains a single STR locus. STR calls are in the form of “X | Y” for autosomal loci, and “X | null” for male individuals called at X-linked loci to indica...
Article
Full-text available
Short tandem repeats (STRs) are hyper-mutable sequences in the human genome. They are often used in forensics and population genetics and are also the underlying cause of many genetic diseases. There are challenges associated with accurately determining the length polymorphism of STR loci in the genome by next-generation sequencing (NGS). In partic...
Article
A gene can be defined as essential when loss of its function compromises viability of the individual (for example, embryonic lethality) or results in profound loss of fitness. At the population level, identification of essential genes is accomplished by observing intolerance to loss-of-function variants. Several computational methods are available...
Preprint
Full-text available
In a recently published PNAS article, we studied the identifiability of genomic samples using machine learning methods [Lippert et al., 2017]. In a response, Erlich [2017] argued that our work contained major flaws. The main technical critique of Erlich [2017] builds on a simulation experiment that shows that our proposed algorithm, which uses only...
Article
Full-text available
Fine population structure can be examined through the clustering of individuals into subpopulations. The clustering of individuals in large sequence datasets into subpopulations makes the calculation of subpopulation specific allele frequency possible, which may shed light on selection of candidate variants for rare diseases. However, as the magnit...
Article
Human high-altitude (HA) adaptation or mal-adaptation is explored to understand the physiology, pathophysiology and molecular mechanisms that underlie long-term exposure to hypoxia. Here we report the results of an analysis of the largest whole-genome-sequencing of Chronic Mountain Sickness (CMS) and non-CMS individuals, identified candidate genes...
Article
Full-text available
Significance: By associating deidentified genomic data with phenotypic measurements of the contributor, this work challenges current conceptions of genomic privacy. It has significant ethical and legal implications for personal privacy, the adequacy of informed consent, the viability and value of deidentification of data, the potential for police pro...
Preprint
Full-text available
Sequence variation data of the human proteome can be used to analyze 3-dimensional (3D) protein structures to derive functional insights. We used genetic variant data from nearly 150,000 individuals to analyze 3D positional conservation in 4,390 protein structures using 481,708 missense and 264,257 synonymous variants. Sixty percent of protein st...
Article
Full-text available
Significance: Regulation of the human immune system is largely controlled by the HLA gene complex on chromosome 6 and is important in infectious disease immunity, graft rejection, autoimmunity, and cancer. HLA typing is traditionally performed by serotyping and/or targeted sequencing. However, the advent of precision medicine and cheaper personal ge...
Article
Full-text available
Manufacturing processes for biological molecules in the research laboratory have failed to keep pace with the rapid advances in automation and parallelization. We report the development of a digital-to-biological converter for fully automated, versatile and demand-based production of functional biologics starting from DNA sequence information. Sp...
Preprint
Full-text available
Urine culture and microscopy techniques are used to profile the bacterial species present in urinary tract infections. To gain insight into the urinary flora in infection and health, we analyzed clinical laboratory features and the microbial metagenome of 121 clean-catch urine samples. 16S rDNA gene signatures were successfully obtained for 116 par...
Article
Full-text available
BACKGROUND: Progress in science and technology has created the capabilities and alternatives to symptom-driven medical care. Reducing premature mortality associated with age-related chronic diseases, such as cancer and cardiovascular disease, is an urgent priority we address using advanced screening detection. METHODS: We enrolled active adults for...
Article
The presence of advanced fibrosis in nonalcoholic fatty liver disease (NAFLD) is the most important predictor of liver mortality. There are limited data on the diagnostic accuracy of gut microbiota-derived signature for predicting the presence of advanced fibrosis. In this prospective study, we characterized the gut microbiome compositions using wh...
Preprint
Full-text available
The genetic architecture of autism spectrum disorder (ASD) is known to consist of contributions from gene-disrupting de novo mutations and common variants of modest effect. We hypothesize that the unexplained heritability of ASD also includes rare inherited variants with intermediate effects. We investigated the genome-wide distribution and functio...
Article
Full-text available
Background: Metagenomics is the study of the microbial genomes isolated from communities found on our bodies or in our environment. By correctly determining the relation between human health and the human-associated microbial communities, novel mechanisms of health and disease can be found, thus enabling the development of novel diagnostics and ther...
Article
Full-text available
The characterization of the blood virome is important for the safety of blood-derived transfusion products, and for the identification of emerging pathogens. We explored non-human sequence data from whole-genome sequencing of blood from 8,240 individuals, none of whom were ascertained for any infectious disease. Viral sequences were extracted from...
Data
Abundance of EBV in association with use of human reference genome NA12878. The distribution of the abundance of EBV is shown for the EBV B95-8 strain-immortalized cell line of NA12878, for samples sequenced sharing the same flow cell with human genome NA12878 and for samples sequenced in the absence of human genome NA12878 in the sequencing fl...
Data
Distribution of samples with viruses across the sequencing flow cells. The number of viral reads per sample is shown on the y-axis in relation to the number of samples per flow cell that are positive for the corresponding virus. The presence of multiple positive samples in flow cells that contain one high viral-titer sample is suggestive of conta...
Data
Sequence reads from RNA viruses. Panel A depicts the alignment of 4 reads from one individual to the influenza H1N1 reference sequence M1 and M2, segment seven. Closest match; serotype = H1N1, strain = A/Puerto Rico/8/1934. Panel B depicts the alignment of 18 reads from one individual to a HCV subtype 3 sequence. Closest match, HCV clone FG1-NS3-4a...
Data
Complete listing of viruses putatively identified or contaminating blood DNA of 8,240 individuals. (PDF)
Data
Statistically significant differences for demographic characteristics and viral prevalence or viral load. (PDF)
Data
Assembly of contigs of human viruses. The sensitivity of identification of human viruses differs when using contigs from de novo assembly of reads, versus using individual reads. The upper panel is based on raw counts of the virus reads and the lower panels show the normalized viral abundances. The identification of viruses is improved by several o...
Data
Association of viral presence with demographic characteristics. Panels A-C depict the individual association of viral presence with sex, age and genetic ancestry. Panel D plots the results of the analysis of deviance (variance) for the presence of any human virus in response to the individuals’ gender, ethnicity, and age. AFR, African; AMR, Admixed Amer...
Data
Read mapping statistics. Unmapped reads in deep sequencing of the human genome using Illumina HiseqX10 technology. The average percentage of unmapped reads per sample is around 5.23%, and the median is 4.91%. (TIF)