Lance R Parsons

Princeton University · Lewis-Sigler Institute for Integrative Genomics

M.S.

About

113
Publications
15,637
Reads
2,980
Citations
Additional affiliations
October 2008 - present
Princeton University
Position
  • Scientific Programmer
Description
  • Bioinformatics programmer in the Lewis-Sigler Institute for Integrative Genomics.
April 2006 - October 2008
Arizona State University
Position
  • Web Application Developer/Technical Support Analyst
Description
  • Develop customized web applications that integrate enterprise systems including PeopleSoft, DARS degree audit system, and the Google Search Appliance.


Publications (113)
Article
Pancreatic cancer cells with limited access to free amino acids can grow by scavenging extracellular protein. In a murine model of pancreatic cancer, we performed a genome-wide CRISPR screen for genes required for scavenging-dependent growth. The screen identified key mediators of macropinocytosis, peripheral lysosome positioning, endosome-lysosome...
Article
Animals face both external and internal dangers: pathogens threaten from the environment, and unstable genomic elements threaten from within. C. elegans protects itself from pathogens by "reading" bacterial small RNAs, using this information to both induce avoidance and transmit memories for four generations. Here, we found that memories can be tra...
Article
Stable isotope labeling techniques have been widely applied in the field of metabolomics and proteomics. Before the measured mass spectral data can be used for quantitative analysis, it must be accurately corrected for isotope natural abundance and tracer isotopic impurity. Despite the increasing popularity of dual-isotope tracing strategy such as...
Article
Several essential components of the electron transport chain, the major producer of ATP in mammalian cells, are encoded in the mitochondrial genome. These 13 proteins are translated within mitochondria by ‘mitoribosomes’. Defective mitochondrial translation underlies multiple inborn errors of metabolism and has been implicated in pathologies such a...
Preprint
Full-text available
Life’s most dramatic innovations, from the emergence of self-replicating molecules to highly-integrated societies, often involve increases in biological complexity. Some groups traverse different levels of complexity, providing a framework to identify key factors shaping these evolutionary transitions. Halictid bees span the transition from individ...
Preprint
Stable isotope labeling techniques have been widely applied in the field of metabolomics and proteomics. Before the measured mass spectrum data can be used for quantitative analysis, it must be accurately corrected for isotope natural abundance and tracer isotopic impurity. Despite the increasing popularity of dual-isotope tracing strategy such as...
Article
Full-text available
Herpes simplex virus 1 (HSV-1) strain McKrae was isolated in 1965 and has been utilized by many laboratories. Three HSV-1 strain McKrae stocks have been sequenced previously, revealing discrepancies in key genes. We sequenced the genome of HSV-1 strain McKrae from the laboratory of James M. Hill to better understand the genetic differences between...
Preprint
Animals face both external and internal dangers: pathogens threaten from the environment, and unstable genomic elements threaten from within. Previously, we discovered that C. elegans protects itself from pathogens by reading bacterial small RNAs and using this information to both induce avoidance and transmit memories for several generations. Here...
Article
Full-text available
Caenorhabditis elegans must distinguish pathogens from nutritious food sources among the many bacteria to which it is exposed in its environment¹. Here we show that a single exposure to purified small RNAs isolated from pathogenic Pseudomonas aeruginosa (PA14) is sufficient to induce pathogen avoidance in the treated worms and in four subsequent ge...
Preprint
C. elegans is exposed to many different bacteria in its environment, and must distinguish pathogenic from nutritious bacterial food sources. Here, we show that a single exposure to purified small RNAs isolated from pathogenic Pseudomonas aeruginosa (PA14) is sufficient to induce pathogen avoidance, both in the treated animals and in four subsequent...
Article
Full-text available
Differences in immune responses across species can contribute to the varying permissivity of species to the same viral pathogen. Understanding how our closest evolutionary relatives, nonhuman primates (NHPs), confront pathogens and how these responses have evolved over time could shed light on host range barriers, especially for zoonotic infections...
Article
Full-text available
Ascidian embryos highlight the importance of cell lineages in animal development. As simple proto-vertebrates, they also provide insights into the evolutionary origins of cell types such as cranial placodes and neural crest cells. Here we have determined single-cell transcriptomes for more than 90,000 cells that span the entirety of development—fro...
Preprint
We recently discovered that C. elegans can pass on a learned avoidance of pathogenic Pseudomonas aeruginosa (PA14) to four generations of its progeny. This transgenerational inheritance is bacterial species-specific, but how C. elegans recognizes and distinguishes different bacteria and transmits this information to future generations is not appare...
Article
Hepatitis B virus (HBV) remains a major global health problem with 257 million chronically infected individuals worldwide, of whom approximately 20 million are co‐infected with hepatitis delta virus (HDV). Progress towards a better understanding of the complex interplay between these two viruses and development of novel therapies have been hampered...
Article
Full-text available
Mice engrafted with components of a human immune system have become widely-used models for studying aspects of human immunity and disease. However, a defined methodology to objectively measure and compare the quality of the human immune response in different models is lacking. Here, by taking advantage of the highly immunogenic live-attenuated yell...
Article
Altered glycolysis is a hallmark of diseases including diabetes and cancer. Despite intensive study of the contributions of individual glycolytic enzymes, systems-level analyses of flux control through glycolysis remain limited. Here, we overexpress in two mammalian cell lines the individual enzymes catalyzing each of the 12 steps linking extracell...
Preprint
Full-text available
We present Bioconda (https://bioconda.github.io), a distribution of bioinformatics software for the lightweight, multi-platform and language-agnostic package manager Conda. Currently, Bioconda offers a collection of over 3000 software packages, which is continuously maintained, updated, and extended by a growing global community of more than 200 co...
Article
Full-text available
The yeast Saccharomyces cerevisiae has emerged as a superior model organism. Selection of distinct laboratory strains of S. cerevisiae with unique phenotypic properties, such as superior mating or sporulation efficiencies, has facilitated advancements in research. W303 is one such laboratory strain that is closely related to the first completely se...
Article
Full-text available
Extrachromosomal circular DNAs (eccDNAs) are common genetic elements in Saccharomyces cerevisiae and are reported in other eukaryotes as well. EccDNAs contribute to genetic variation among somatic cells in multicellular organisms and to evolution of unicellular eukaryotes. Sensitive methods for detecting eccDNA are needed to clarify how these eleme...
Code
Command-line utilities to assist in developing tools for the Galaxy Project. http://galaxyproject.org
Article
Full-text available
Extrachromosomal circular DNA (eccDNA) derived from chromosomal Ty retrotransposons in yeast can be generated in multiple ways. Ty eccDNA can arise from the circularization of extra-chromosomal linear DNA during the transpositional life cycle of retrotransposons or from circularization of genomic Ty DNA. Circularization may happen through non-homol...
Conference Paper
Copy number variation (CNV) is probably the most frequent single genetic change in eukaryotic DNA but even though amplifications and deletions are well-documented, their frequency is not much appreciated as CNVs are generally too rare to be detected if they are not selected for. We developed a highly sensitive method, Circle-Seq, for screening of C...
Article
Full-text available
Examples of extrachromosomal circular DNAs (eccDNAs) are found in many organisms, but their impact on genetic variation at the genome scale has not been investigated. We mapped 1,756 eccDNAs in the Saccharomyces cerevisiae genome using Circle-Seq, a highly sensitive eccDNA purification method. Yeast eccDNAs ranged from an arbitrary lower limit of 1...
Article
Full-text available
Herpes simplex virus (HSV) is a widespread pathogen that causes epithelial lesions with recurrent disease that manifests over a lifetime. The lifelong aspect of infection results from latent viral infection of neurons, a reservoir from which the virus reactivates periodically. Recent work has demonstrated the breadth of genetic variati...
Article
Full-text available
We used paired-end Illumina deep sequencing and de novo assembly to determine the genome sequence of herpes simplex virus 1 (HSV-1) strain MacIntyre (aka McIntyre). The MacIntyre strain originated from the brain of a patient with lethal HSV encephalitis and has a unique limitation in its neuronal spread, moving solely in the retrograde direction.
Article
Full-text available
DNA mismatch repair is a highly conserved DNA repair pathway. In humans, germline mutations in hMSH2 or hMLH1, key components of mismatch repair, have been associated with Lynch Syndrome, a leading cause of inherited cancer mortality. Current estimates of the mutation rate and the mutational spectra in mismatch repair defective cells are primarily lim...
Article
Full-text available
The Fluorescence in situ Hybridization (FISH) method allows one to detect nucleic acids in the native cellular environment. Here we provide a protocol for using FISH to quantify the number of mRNAs in single yeast cells. Cells can be grown in any condition of interest and then fixed and made permeable. Subsequently, multiple single-stranded deoxyol...
Article
Full-text available
The macronuclear genome of the ciliate Oxytricha trifallax displays an extreme and unique eukaryotic genome architecture with extensive genomic variation. During sexual genome development, the expressed, somatic macronuclear genome is whittled down to the genic portion of a small fraction (∼5%) of its precursor "silent" germline micronuclear genome...
Data
Genome meta-assembly method. The meta-assembly started by reassembling the contigs produced from three different assemblers (see Materials and Methods for parameters used) with the CAP3 assembler. Two cycles of contig extension and re-assembly were performed before splitting of potentially chimeric contigs and trimming back at the sites of potentia...
Data
Nanochromosomal variant frequencies in relation to sequence coverage. Variant frequency distribution over all positions with detected variants for low (≥20× to <40×; blue) and high sequence coverage (≥40×; green); variant frequencies ≥40 bp from either nanochromosome end were counted to avoid possible incorrect variant calling resulting from telome...
Data
Association between nanochromosome copy number and number of telomeric reads. (A) Hexagonal binning plot of relative nanochromosome copy number measured in reads/bp versus copy number measured in number of telomeric reads for nonalternatively fragmented nanochromosomes (nanochromosomes without strongly supported alternative fragmentation sites). (B...
Data
Genes per contig or nanochromosome. Nanochromosomes are defined as contigs with TASs no more than 100 bp away from both ends of the contig (14,388 in total). Alternatively fragmented nanochromosomes are those that are strongly supported by Illumina telomeric reads (≥10 reads per site), and nonalternatively fragmented nanochromosomes are all the rem...
Data
Heat map of extended contigs verified with 454-telomeric end reads. Axes indicate percent difference of the 100 bp end matches. The first number in each cell indicates the fraction of complete nanochromosomes with paired matches to within 50 bases of each end of the nanochromosome; the second number indicates the total number of extended nanochromo...
Data
Genome assembly redundancy analysis. Distribution of matching contigs for non-self BLAT matches (≥100 bp long) within the Oxytricha macronuclear genome assemblies. The number of matching contigs, not the number of contig matches, is counted. The graphs (A–D) represent the ≥90%, ≥99%, ≥90% to <99%, and ≥95% match identity thresholds, respectively. (...
Data
Missing pentose phosphate pathway (PPP) enzymes in ciliates. Enzymes that are confirmed to be absent/present are highlighted in color, with enzymes that are present in Paramecium, Tetrahymena and Ichthyophthirius, but not Oxytricha highlighted in pink, and a single enzyme missing in Paramecium but present in Oxytricha, Tetrahymena, and Ichthyophthi...
Data
Linear regressions of relative estimates of nanochromosome copy number. Squares (red) are total telomeric reads for each contig; triangles (green) are 5′ telomeric reads for each contig; and diamonds (blue) are total reads/nanochromosome length (bp). The x-axis units are values obtained from qPCR (see Table S3). Linear regressions were determined w...
Data
Validation of nanochromosomes from the final assembly. Nanochromosomes were validated both by 454 telomeric end reads and/or Sanger reads/mate pairs. (A) Length distribution of nanochromosomes validated by either 454 telomeric end reads (green) or Sanger read/mate pairs (cyan) or both (purple), or not validated by either method (pink) (see Material...
Data
Length distribution of nanochromosomes validated by Sanger mate pairs. Nanochromosomes were validated according to the method illustrated in Figure S11. (TIFF)
Data
Intra-CDS alternative nanochromosome fragmentation. Nanochromosomes are indicated by black bars in descending order of length, with gene annotations below them. Predicted genes are indicated by green arrows and predicted CDSs by yellow arrows. Red arrows indicate alternative fragmentation sites and point in the direction that the alternative nanoch...
Data
Nonalternatively fragmented tRNA nanochromosomes. Nanochromosomes are indicated by black bars in descending order of length, with gene annotations below them. Where multiple allelic versions of nanochromosomes are present, we have selected just a single representative nanochromosome. Predicted genes are indicated by green arrows, predicted CDSs by...
Data
TAS sequence logos. Sequence logos showing nucleotide frequencies (generated with WebLogo [130]) for method 1 are for contig-derived sequences; while the logos for method 2 are for read-derived sequences. Sequence logos show base frequencies. (TIFF)
Data
Examples of overamplification of ribosomal protein-encoding nanochromosome isoforms (red) relative to the isoforms only containing nonribosomal genes (blue). High peaks and deep troughs indicate subtelomeric sequence biases (see Materials and Methods). (TIFF)
Data
Base compositional biases surrounding TASs. Contig consensus sequences surrounding strongly supported site TASs (≥10 supporting Illumina telomeric reads) were extracted for (A–C) (see Text S1: Determination of sequences surrounding telomere addition sites). The telomere position is 0. We only illustrate base composition biases for one end of the na...
Data
Nanochromosome subtelomeric base composition of Stylonychia compared to that of Oxytricha. Oxytricha base compositions are indicated by dots behind the Stylonychia base composition lines. (TIFF)
Data
Intron length distribution. The green histogram is for all introns predicted by AUGUSTUS, including those with experimental support from RNA-seq data; the blue histogram is for all introns determined from RNA-seq data that were used as hints for AUGUSTUS during the gene prediction. The inset shows the size distribution over a longer length scale (w...
Data
Sequence logos of experimentally determined and predicted intron donor sites. Sequence logos generated by WebLogo show base frequencies. Experimentally determined introns were obtained from RNA-seq data (see Text S1: Determination of sequences surrounding telomere addition sites). Predicted introns are all the introns predicted by AUGUSTUS, includi...
Data
Subtelomeric DNA capture method for 454 subtelomeric sequencing. Adaptor ends with a 5′-phosphate are shown in bold; otherwise 5′-phosphate is absent. The biotinylated thymine residue in the internal adaptor is indicated in green. (TIFF)
Data
Southern blot analysis of Contig14329.0. Total macronuclear DNA was run on an electrophoretic gel. Two probes were created to investigate alternative fragmentation of this contig (“gene 1 probe” and “gene 2 probe”). For the gene 1 probe, the forward and reverse primers, 257_F and 1264_R, are CAGGCCCACAACATCTTCCTTCTTTG and CCATCTAGCACTACTCCATTAAGCAC...
Data
pN/pS values for matchless nanochromosomes. pN/pS values were calculated by PAML (see Text S1: Determination of pN/pS values). A cut-off of pN/pS = 0.6 is shown by the dashed red line. (TIFF)
Data
Positional variation of TASs. TASs are contig-derived (see Text S1: Determination of sequences surrounding telomere addition sites). TASs within a 200 bp window surrounding and centered on strongly supported, alternatively fragmented, and nonalternatively fragmented sites were counted. The frequency distributions of the TASs for alternatively fragmented sites...
Data
Length distributions of untranscribed (UTS) and untranslated (UTR) regions. Length distributions are for single-gene, nonalternatively fragmented nanochromosomes. (A) 5′ UTS length from the transcription start site to telomere (determined from 5′-RLM RACE Sanger reads). (B) 3′ UTS length from polyadenylation site to telomere (determined from RNA-se...
Data
Assessment of potential paralogy in model ciliate genomes. UCLUST from the USEARCH suite (version 5.1.221) [131] was used for clustering at increasing global sequence alignment identity clustering thresholds, with the query and target alignment fractions both set to 80% coverage (i.e., number of letters in the query that are aligned to letters in t...
Data
End-to-end validation by Sanger mate pair reads. Paired and SE reads are shown with gray arrows. First, “outer spans” between the ends of paired-end reads or consisting of the entire SE read are found. Next, we attempt to greedily find a path through the spans, so that there are ≥100 bp overlaps between the spans comprising the path. If we find suc...
Data
Nucleic-acid-associated protein domains found in both Paramecium and Tetrahymena but not Oxytricha. Domains marked with * are present in translated ORFs, but were not originally detected as AUGUSTUS failed to predict them. Protein IDs are given for Tetrahymena. (RTF)
Data
Genomic and RNA libraries Sanger sequenced on ABI3730 sequencers. (RTF)
Data
Meta-contig statistics after first CAP3 assembly before extension. “Single” refers to an SE being complete (≥1 5′ or 3′ telomeres). “Both” refers to one or more telomeres on both ends of the contig (≥1 5′ and ≥1 3′ ends). “Multiple” refers to greater than two ends on either end of the contig (≥2 5′ or ≥2 3′ ends). All lengths are given in bp. (RTF)
Data
Meta-contig statistics after CAP3 reassembly of extended contigs. “Single” refers to an SE being complete (≥1 5′ or 3′ telomeres). “Both” refers to one or more telomeres on both ends of the contig (≥1 5′ and ≥1 3′ ends). “Multiple” refers to greater than two ends on either end of the contig (≥2 5′ or ≥2 3′ ends). All lengths are given in bp. (RTF)
Data
Meta-contig statistics after second extension. “Single” refers to an SE being complete (≥1 5′ or 3′ telomeres). “Both” refers to one or more telomeres on both ends of the contig (≥1 5′ and ≥1 3′ ends). “Multiple” refers to greater than two ends on either end of the contig (≥2 5′ or ≥2 3′ ends). All lengths are given in bp. (RTF)
Data
Alternative nanochromosome fragmentation in a predicted intron-containing region. Gene predictions for Contig17419.0 are shown. Predicted genes are indicated by green arrows and predicted CDSs by yellow arrows; predicted introns are indicated by white arrows, and those introns that are supported by RNA-seq evidence have two white arrows; neon green...
Data
Sequence logo of Euplotes crassus subtelomeric regions. Sequence logos show base frequencies. Note that some of the motifs may be slightly misaligned (usually by 1 base), and hence the motif centered on position −20 would be even more prominent if they were correctly aligned. (TIFF)
Data
Intron length distribution for Tetrahymena thermophila gene predictions. Intron lengths determined from 2008 Tetrahymena gene predictions (downloaded from http://www.ciliate.org/system/downloads/oct2008_release.gff). (TIFF)
Data
Sequence logos of experimentally determined and predicted intron acceptor sites. Sequence logos show base frequencies. Experimentally determined introns were obtained from RNA-seq data (see Text S1: Gene prediction). Predicted introns are all the introns predicted by AUGUSTUS, including those that have supporting RNA-seq evidence. Sequence logos we...
Data
Location of alternative fragmentation sites relative to inter- and intracoding sequence regions for two-gene nanochromosomes. Alternative fragmentation sites with decreasing numbers of supporting telomeric reads are shown in three successive columns. To exclude conventional TASs, only alternative fragmentation sites at least 100 bp away from either...
Data
Oxytricha nucleic-acid-associated protein domains not found in Paramecium and Tetrahymena. (a) Judging from multiple sequence alignments, domain appears to exist in Paramecium (GSPATP00020413001) and Tetrahymena (TTHERM_00721450) but was not detected by hmmscan (HMMER3). (b) Independent E-value greater than the threshold (0.001), but domain exists (e.g.,...