Lenwood S. Heath

Virginia Polytechnic Institute and State University, Blacksburg, Virginia, United States

Publications (140) · 144.58 Total impact

  • ABSTRACT: Developing Arabidopsis seeds accumulate oils and seed storage proteins synthesized by the pathways of primary metabolism. Seed development and metabolism are positively regulated by transcription factors belonging to the LAFL (LEC1, ABI3, FUSCA3 and LEC2) regulatory network. The VAL gene family encodes repressors of the seed maturation program in germinating seeds, although they are also expressed during seed maturation. The possible regulatory role of VAL1 in seed development has not been studied to date. Reverse genetics revealed that val1 mutant seeds accumulated elevated levels of proteins compared to the wild type, suggesting that VAL1 functions as a repressor of seed metabolism. However, metabolomes and the levels of ABA, auxin, and jasmonate derivatives did not change significantly in developing embryos in the absence of VAL1. Two VAL1 splice variants were identified through RNA sequencing analysis: a full-length and a truncated form lacking the plant-homeodomain-like domain associated with epigenetic repression. None of the transcripts encoding the core LAFL network transcription factors were affected in val1 embryos. Instead, activation of VAL1 by FUSCA3 appears to result in repression of a subset of seed maturation genes downstream of core LAFL regulators, as 39% of transcripts in the FUSCA3 regulon were de-repressed in the val1 mutant. The LEC1 and LEC2 regulons also responded but to a lesser extent. An additional 832 transcripts that were not LAFL targets were de-repressed in val1 mutant embryos. These transcripts are candidate targets of VAL1, acting through epigenetic and/or transcriptional repression.
    Article · Dec 2015 · The Plant Journal
  • ABSTRACT: Background: Transcriptomics reveals the existence of transcripts of different coding potential and strand orientation. Alternative splicing (AS) can yield proteins with altered number and types of functional domains, suggesting the global occurrence of transcriptional and post-transcriptional events. Many biological processes, including seed maturation and desiccation, are regulated post-transcriptionally (e.g., by AS), leading to the production of more than one coding or noncoding sense transcript from a single locus. Results: We present an integrated computational framework to predict isoform-specific functions of plant transcripts. This framework includes a novel plant-specific weighted support vector machine classifier called CodeWise, which predicts the coding potential of transcripts with over 96% accuracy, and several other tools enabling global sequence similarity, functional domain, and co-expression network analyses. First, this framework was applied to all detected transcripts (103,106), of which 13% were predicted by CodeWise to be noncoding RNAs in developing soybean embryos. Second, to investigate the role of AS during soybean embryo development, a population of 2,938 alternatively spliced and differentially expressed splice variants was analyzed and mined with respect to timing of expression. Conserved domain analyses revealed that AS resulted in global changes in the number, types, and extent of truncation of functional domains in protein variants. Isoform-specific co-expression network analysis using ArrayMining and clustering analyses revealed specific sub-networks and potential interactions among the components of selected signaling pathways related to seed maturation and the acquisition of desiccation tolerance. These signaling pathways involved abscisic acid- and FUSCA3-related transcripts, several of which were classified as noncoding and/or antisense transcripts and were co-expressed with corresponding coding transcripts. Noncoding and antisense transcripts likely play important regulatory roles in seed maturation- and desiccation-related signaling in soybean. Conclusions: This work demonstrates how our integrated framework can be implemented to make experimentally testable predictions regarding the coding potential, co-expression, co-regulation, and function of transcripts and proteins related to a biological process of interest.
    Article · Dec 2015 · BMC Genomics
  • ABSTRACT: Reconstructing system dynamics from sequential data traces is an important algorithmic challenge with applications in computational neuroscience, systems biology, paleontology, and physical plant engineering. Here, we formalize a key computational task in network reconstruction, namely recovering complex order-theoretic constraints among the system variables underlying a given dataset. Specifically, we focus on the problem of reconstructing partial orders (posets) from their linear extensions. We discuss the theoretical complexity of this problem, a general framework to pose and study various inference tasks, and sketch algorithmic results for mining restricted classes of posets.
    Article · Sep 2015
  • Eman Badr · Lenwood S Heath
    ABSTRACT: Alternative splicing (AS) is a post-transcriptional regulatory mechanism for gene expression regulation. Splicing decisions are affected by the combinatorial behavior of different splicing factors that bind to multiple binding sites in exons and introns. These binding sites are called splicing regulatory elements (SREs). Here we develop CoSREM (Combinatorial SRE Miner), a graph mining algorithm to discover combinatorial SREs in human exons. Our model does not assume a fixed length of SREs and incorporates experimental evidence as well to increase accuracy. CoSREM is able to identify sets of SREs and is not limited to SRE pairs as are current approaches. We identified 37 SRE sets that include both enhancer and silencer elements. We show that our results intersect with previous results, including some that are experimental. We also show that the SRE set GGGAGG and GAGGAC identified by CoSREM may play a role in exon skipping events in several tumor samples. We applied CoSREM to RNA-Seq data for multiple tissues to identify combinatorial SREs which may be responsible for exon inclusion or exclusion across tissues. The new algorithm can identify different combinations of splicing enhancers and silencers without assuming a predefined size or limiting the algorithm to find only pairs of SREs. Our approach opens new directions to study SREs and the roles that AS may play in diseases and tissue specificity.
    Article · Sep 2015 · BMC Bioinformatics
  • Edward A. Fox · Lenwood S. Heath · Qi Fan Chen · Amjad M. Daoud
    Dataset · Aug 2015
  • Dataset: mphf
    Edward A. Fox · Lenwood S. Heath · Qi Fan Chen · Amjad M. Daoud
    Dataset · Aug 2015
  • Edward A. Fox · Lenwood S. Heath · Qi Fan Chen · Amjad M. Daoud
    Dataset · Aug 2015
  • Saima Sultana Tithi · Lenwood S Heath · Liqing Zhang
    ABSTRACT: In the past decade, next generation sequencing technology (NGS) has produced billions of short read sequences. Mapping these short reads to a reference genome is a computationally expensive task. As a result, many mapping tools have been proposed. However, most of the existing tools ignore genomic variations during the mapping task. To address this problem, recently we introduced SNPwise, a short read aligner that takes into account known SNP variations provided by the database or the user. In this work, we improve efficiency of the lookup of the SNP information and evaluate the performance of the improved version of SNPwise using several human genome data sets.
    Conference Paper · Jun 2015
  • ABSTRACT: Background. Developing a universal standardized microbial typing and nomenclature system that provides phylogenetic and epidemiological information in real time has never been as urgent in public health as it is today. We previously proposed to use genome similarity as the basis for immediate and precise typing and naming of individual organisms or viruses. In this study, we tested the validity of the proposed system and applied it to the epidemiology of infectious diseases using Ebola virus disease (EVD) outbreaks as the example. Methods. One hundred twenty-eight publicly available ebolavirus genomes were compared with each other, and average nucleotide identity (ANI) was calculated. The ANI was then used to assign unique codes, hereafter referred to as Life Identification Numbers (LINs), to every viral isolate, whereby each LIN consisted of a series of positions reflecting increasing genome similarity. Congruence of LINs with phylogenetic and epidemiological relationships was then determined. Results. Assigned LINs correlate with phylogeny at the species and infraspecies level and can even identify some individual transmission chains during the 2014–2015 EVD epidemic in West Africa. Conclusions. Life Identification Numbers can provide a fast, automated, standardized, and scalable approach to precisely identify and name viral isolates upon genome sequence submission, facilitating unambiguous communication during disease epidemics among clinicians, epidemiologists, and governments.
    Article · Mar 2015 · Open Forum Infectious Diseases
  • Saima Sultana Tithi · Lenwood S Heath · Liqing Zhang
    ABSTRACT: Current high-throughput sequencing technologies produce a large number of short reads from random locations in the genome. The next and time-consuming step is to map or align all the reads to a reference genome. Though a few dozen alignment tools have been introduced, few of them consider known genomic variants while aligning reads. We present SNPwise, a short read alignment tool that incorporates known SNPs (Single Nucleotide Polymorphisms) provided by a database such as dbSNP or the user while aligning reads. Results show that SNPwise significantly increases the number of reads that can be mapped to the reference genome and also improves the accuracy of the alignment, which is important as the alignment result is used for all downstream analyses. Although incorporating known variations into the mapping/alignment process requires more computing time, the benefit is the improved mapping results in terms of both the proportion of the mapped reads and the accuracy of the mapped reads.
    Conference Paper · Mar 2015
  • Eman Badr · Lenwood S Heath
    ABSTRACT: Splicing regulatory elements (SREs) are short, degenerate sequences on pre-mRNA molecules that enhance or inhibit the splicing process via the binding of splicing factors, proteins that regulate the functioning of the spliceosome. Existing methods for identifying SREs in a genome are either experimental or computational. Here, we propose a formalism based on de Bruijn graphs that combines genomic structure, word count enrichment analysis, and experimental evidence to identify SREs found in exons. In our approach, SREs are not restricted to a fixed length (i.e., k-mers, for a fixed k). As a result, we identify 2001 putative exonic enhancers and 3080 putative exonic silencers for human genes, with lengths varying from 6 to 15 nucleotides. Many of the predicted SREs overlap with experimentally verified binding sites. Our model provides a novel method to predict variable-length putative regulatory elements computationally for further experimental investigation.
    Article · Nov 2014 · Journal of computational biology: a journal of computational molecular cell biology
  • ABSTRACT: A broadly accepted and stable biological classification system is a prerequisite for biological sciences. It provides the means to describe and communicate about life without ambiguity. Current biological classification and nomenclature use the species as the basic unit and require lengthy and laborious species descriptions before newly discovered organisms can be assigned to a species and be named. The current system is thus inadequate to classify and name the immense genetic diversity within species that is now being revealed by genome sequencing on a daily basis. To address this lack of a general intra-species classification and naming system adequate for today's speed of discovery of new diversity, we propose a classification and naming system that is exclusively based on genome similarity and that is suitable for automatic assignment of codes to any genome-sequenced organism without requiring any phenotypic or phylogenetic analysis. We provide examples demonstrating that genome similarity-based codes largely align with current taxonomic groups at many different levels in bacteria, animals, humans, plants, and viruses. Importantly, the proposed approach is only slightly affected by the order of code assignment and can thus provide codes that reflect similarity between organisms and that do not need to be revised upon discovery of new diversity. We envision genome similarity-based codes to complement current biological nomenclature and to provide a universal means to communicate unambiguously about any genome-sequenced organism in fields as diverse as biodiversity research, infectious disease control, human and microbial forensics, animal breed and plant cultivar certification, and human ancestry research.
    Article · Feb 2014 · PLoS ONE
  • ABSTRACT: Developing soybean seeds accumulate oils, proteins, and carbohydrates that are used as oxidizable substrates providing metabolic precursors and energy during seed germination. The accumulation of these storage compounds in developing seeds is highly regulated at multiple levels, including the transcriptional and post-transcriptional levels. RNA sequencing was used to provide comprehensive information about transcriptional and post-transcriptional events that take place in developing soybean embryos. Bioinformatics analyses led to the identification of different classes of alternatively spliced isoforms and corresponding changes in their levels on a global scale during soybean embryo development. Alternative splicing was associated with transcripts involved in various metabolic and developmental processes, including central carbon and nitrogen metabolism, induction of maturation and dormancy, and splicing itself. Detailed examination of selected RNA isoforms revealed alterations in individual domains that could result in changes in subcellular localization of the resulting proteins, protein-protein and enzyme-substrate interactions, and regulation of protein activities. Different isoforms may play an important role in regulating developmental and metabolic processes occurring at different stages in developing oilseed embryos.
    Article · Dec 2013 · Biology
  • ABSTRACT: There has been much research on the combinatorial problem of generating the linear extensions of a given poset. This paper focuses on the reverse of that problem, where the input is a set of linear orders, and the goal is to construct a poset or set of posets that generates the input. Such a problem finds applications in computational neuroscience, systems biology, paleontology, and physical plant engineering. In this paper, several algorithms are presented for efficiently finding a single poset that generates the input set of linear orders. The variation of the problem in which a minimum set of posets covering the input is sought is also explored. It is found that the problem is polynomially solvable for one class of simple posets (kite(2) posets) but NP-complete for a related class (hammock(2,2,2) posets).
    Article · Oct 2013 · Discrete Mathematics Algorithms and Applications
  • ABSTRACT: Soybean (Glycine max) seeds are an important source of seed storage compounds, including protein, oil, and sugar used for food, feed, chemical, and biofuel production. We assessed detailed temporal transcriptional and metabolic changes in developing soybean embryos to gain a systems biology view of developmental and metabolic changes and to identify potential targets for metabolic engineering. Two major developmental and metabolic transitions were captured enabling identification of potential metabolic engineering targets specific to seed filling and to desiccation. The first transition involved a switch between different types of metabolism in dividing and elongating cells. The second transition involved the onset of maturation and desiccation tolerance during seed filling and a switch from photoheterotrophic to heterotrophic metabolism. Clustering analyses of metabolite and transcript data revealed clusters of functionally related metabolites and transcripts active in these different developmental and metabolic programs. The gene clusters provide a resource to generate predictions about the associations and interactions of unknown regulators with their targets based on "guilt-by-association" relationships. The inferred regulators also represent potential targets for future metabolic engineering of relevant pathways and steps in central carbon and nitrogen metabolism in soybean embryos and drought and desiccation tolerance in plants.
    Article · Jun 2013
  • ABSTRACT: Background: Cold acclimation in woody perennials is a metabolically intensive process, but coincides with environmental conditions that are not conducive to the generation of energy through photosynthesis. While the negative effects of low temperatures on the photosynthetic apparatus during winter have been well studied, less is known about how this is reflected at the level of gene and metabolite expression, nor how the plant generates primary metabolites needed for adaptive processes during autumn. Results: The MapMan tool revealed enrichment of the expression of genes related to mitochondrial function, antioxidant and associated regulatory activity, while changes in metabolite levels over the time course were consistent with the gene expression patterns observed. Genes related to thylakoid function were down-regulated as expected, with the exception of plastid targeted specific antioxidant gene products such as thylakoid-bound ascorbate peroxidase, components of the reactive oxygen species scavenging cycle, and the plastid terminal oxidase. In contrast, the conventional and alternative mitochondrial electron transport chains, the tricarboxylic acid cycle, and redox-associated proteins providing reactive oxygen species scavenging generated by electron transport chains functioning at low temperatures were all active. Conclusions: A regulatory mechanism linking thylakoid-bound ascorbate peroxidase action with “chloroplast dormancy” is proposed. Most importantly, the energy and substrates required for the substantial metabolic remodeling that is a hallmark of freezing acclimation could be provided by heterotrophic metabolism.
    Article · Apr 2013 · BMC Plant Biology
  • Lenwood S. Heath · Ajit Kumar Nema
    Article · Jan 2013 · Open Journal of Discrete Mathematics
  • Kuan Yang · Lenwood S Heath · João C Setubal
    ABSTRACT: Ancestral genome reconstruction can be understood as a phylogenetic study with more details than a traditional phylogenetic tree reconstruction. We present a new computational system called REGEN for ancestral bacterial genome reconstruction at both the gene and replicon levels. REGEN reconstructs gene content, contiguous gene runs, and replicon structure for each ancestral genome. Along each branch of the phylogenetic tree, REGEN infers evolutionary events, including gene creation and deletion and replicon fission and fusion. The reconstruction can be performed by either a maximum parsimony or a maximum likelihood method. Gene content reconstruction is based on the concept of neighboring gene pairs. REGEN was designed to be used with any set of genomes that are sufficiently related, which will usually be the case for bacteria within the same taxonomic order. We evaluated REGEN using simulated genomes and genomes in the Rhizobiales order.
    Article · Dec 2012 · Genes
  • ABSTRACT: Microarray gene expression profiling is a powerful technique to understand complex developmental processes, but making biologically meaningful inferences from such studies has always been challenging. We previously reported a microarray study of the freezing acclimation period in Sitka spruce (Picea sitchensis) in which a large number of candidate genes for climatic adaptation were identified. In the current paper, we apply additional systems biology tools to these data to further probe changes in the levels of genes and metabolites and activities of associated pathways that regulate this complex developmental transition. One aspect of this adaptive process that is not well understood is the role of the cell wall. Our data suggest coordinated metabolic and signaling responses leading to cell wall remodeling. Co-expression of genes encoding proteins associated with biosynthesis of structural and non-structural cell wall carbohydrates was observed, which may be regulated by ethylene signaling components. At the same time, numerous genes, whose products are putatively localized to the endomembrane system and involved in both the synthesis and trafficking of cell wall carbohydrates, were up-regulated. Taken together, these results suggest a link between ethylene signaling and biosynthesis, and targeting of cell wall related gene products during the period of winter hardening. Automated Layout Pipeline for Inferred NEtworks (ALPINE), an in-house plugin for the Cytoscape visualization environment that utilizes the existing GeneMANIA and Mosaic plugins, together with the use of visualization tools, provided images of proposed signaling processes that became active over the time course of winter hardening, particularly at later time points in the process. The resulting visualizations have the potential to reveal novel, hypothesis-generating, gene association patterns in the context of targeted subcellular location.
    Article · Oct 2012 · Frontiers in Plant Science
  • L. S. Heath · J. P. C. Vergara
    ABSTRACT: Sorting permutations by operations such as reversals and block-moves has received much interest because of its applications in the study of genome rearrangements and in the design of interconnection networks. A short block-move is an operation on a permutation that moves an element at most two positions away from its original position. This paper investigates the problem of finding a minimum-length sorting sequence of short block-moves for a given permutation. A 4/3-approximation algorithm for this problem is presented. Woven double-strip permutations are defined and a polynomial-time algorithm for this class of permutations is devised that employs graph matching techniques. A linear-time maximum matching algorithm for a special class of grid graphs improves the time complexity of the algorithm for woven double-strip permutations. Key words: Computational biology, Genome rearrangement, Approximation algorithms, Maximum matching, Permutations.
    Article · Apr 2012 · Algorithmica

Publication Stats

3k Citations
144.58 Total Impact Points

Institutions

  • 1987-2013
    • Virginia Polytechnic Institute and State University
      • Department of Plant Pathology, Physiology, and Weed Science
      • Department of Computer Science
      Blacksburg, Virginia, United States
    • Massachusetts Institute of Technology
      • Department of Mathematics
      Cambridge, MA, United States
  • 2004
    • North Carolina State University
      Raleigh, North Carolina, United States
  • 1992
    • University of North Carolina at Chapel Hill
      North Carolina, United States
  • 1984
    • University of North Carolina at Charlotte
      Charlotte, North Carolina, United States