LASAGNA: A novel algorithm for transcription factor binding site alignment

BMC Bioinformatics (Impact Factor: 2.67). 03/2013; 14(1):108. DOI: 10.1186/1471-2105-14-108
Source: PubMed

ABSTRACT Background Scientists routinely scan DNA sequences for transcription factor (TF) binding sites (TFBSs). Most of the available tools rely on position-specific scoring matrices (PSSMs) constructed from aligned binding sites. Because of the resolutions of assays used to obtain TFBSs, databases such as TRANSFAC, ORegAnno and PAZAR store unaligned variable-length DNA segments containing binding sites of a TF. These DNA segments need to be aligned to build a PSSM. While the TRANSFAC database provides scoring matrices for TFs, nearly 78% of the TFs in the public release do not have matrices available. As work on TFBS alignment algorithms has been limited, it is highly desirable to have an alignment algorithm tailored to TFBSs. Results We designed a novel algorithm named LASAGNA, which is aware of the lengths of input TFBSs and utilizes position dependence. Results on 189 TFs of 5 species in the TRANSFAC database showed that our method significantly outperformed ClustalW2 and MEME. We further compared a PSSM method dependent on LASAGNA to an alignment-free TFBS search method. Results on 89 TFs whose binding sites can be located in genomes showed that our method is significantly more precise at fixed recall rates. Finally, we described LASAGNA-ChIP, a more sophisticated version for ChIP (chromatin immunoprecipitation) experiments. Under the one-per-sequence model, it showed comparable performance with MEME in discovering motifs in ChIP-seq peak sequences. Conclusions We conclude that the LASAGNA algorithm is simple and effective in aligning variable-length binding sites. It has been integrated into a user-friendly webtool for TFBS search and visualization called LASAGNA-Search. The tool currently stores precomputed PSSM models for 189 TFs and 133 TFs built from TFBSs in the TRANSFAC Public database (release 7.0) and the ORegAnno database (08Nov10 dump), respectively. The webtool is available at:
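
To illustrate why aligned sites are needed, the following is a minimal sketch of the generic PSSM workflow the abstract refers to: count bases per column of already-aligned sites, convert the counts to log-odds scores, and slide the matrix along a sequence. It is not the LASAGNA algorithm (which performs the alignment step itself). The function names `build_pssm` and `scan`, the pseudocount of 1, the uniform background of 0.25, and the example sites are all assumptions made for illustration.

```python
import math

BASES = "ACGT"

def build_pssm(aligned_sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM (one dict per column) from equal-length sites.

    Hypothetical helper: pseudocount and uniform background are assumptions.
    """
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all aligned sites must have the same length")
    pssm = []
    for pos in range(length):
        column = [site[pos] for site in aligned_sites]
        total = len(column) + pseudocount * len(BASES)
        scores = {}
        for base in BASES:
            freq = (column.count(base) + pseudocount) / total
            scores[base] = math.log2(freq / background)
        pssm.append(scores)
    return pssm

def scan(sequence, pssm):
    """Return (best_score, best_offset) over all windows of the sequence."""
    width = len(pssm)
    best = None
    for offset in range(len(sequence) - width + 1):
        window = sequence[offset:offset + width]
        score = sum(pssm[i][base] for i, base in enumerate(window))
        if best is None or score > best[0]:
            best = (score, offset)
    return best

# Made-up, already-aligned example sites (not taken from any database).
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
pssm = build_pssm(sites)
score, offset = scan("GGCATGACTCATTT", pssm)
print(f"best window starts at offset {offset} with score {score:.2f}")
```

The sketch only works because every input site has the same length and register; with unaligned variable-length segments, the per-column counts would mix unrelated positions, which is the problem an alignment step has to solve first.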


Available from: Chun-Hsi Huang, May 16, 2015
  • Source
    ABSTRACT: Stress tolerance in plants is a coordinated action of multiple stress response genes that also cross talk with other components of the stress signal transduction pathways. The expression and regulation of stress-induced genes are largely regulated by specific transcription factors, families of which have been reported in several plant species, such as Arabidopsis, rice and Populus. In sorghum, the majority of such factors remain unexplored. We used 2DE refined with MALDI-TOF techniques to analyze drought stress-induced proteins in sorghum. A total of 176 transcription factors from the MYB, AUX_ARF, bZIP, AP2 and WRKY families of drought-induced proteins were identified. We developed a method based on semantic similarity of gene ontology terms (GO terms) to identify the transcription factors. A threshold value (≥ 90%) was applied to retrieve a total of 1,493 transcription factors with high semantic similarity from selected plant species. It could be concluded that the identified transcription factors regulate their target proteins in response to endogenous signals and environmental cues, such as light, temperature and drought stress. The regulatory network and cis-acting elements of the identified transcription factors in distinct families are involved in responsiveness to auxin, abscisic acid, defense, stress and light. These responses may be highly important in the modulation of plant growth and development.
    Cellular & Molecular Biology Letters 12/2014; DOI:10.2478/s11658-014-0223-3 · 1.78 Impact Factor
  • Source
    ABSTRACT: The earliest known vertebrate copulatory organs are claspers, paired penis-like structures that are associated with evolution of internal fertilization and viviparity in Devonian placoderms. Today, only male chondrichthyans possess claspers, which extend from posterior pelvic fins and function as intromittent organs. Here we report that clasper development from pelvic fins of male skates is controlled by hormonal regulation of the Sonic hedgehog (Shh) pathway. We show that Shh signalling is necessary for male clasper development and is sufficient to induce clasper cartilages in females. Androgen receptor (AR) controls the male-specific pattern of Shh in pelvic fins by regulation of Hand2. We identify an androgen response element (ARE) in the Hand2 locus and present biochemical evidence that AR can directly bind the Hand2 ARE. Together, our results suggest that the genetic circuit for appendage development evolved an androgen regulatory input, which prolonged signalling activity and drove clasper skeletogenesis in male fins.
    Nature Communications 04/2015; 6:6698. DOI:10.1038/ncomms7698 · 10.74 Impact Factor
  • Source
    ABSTRACT: Background In previous studies on an Iberian x Landrace cross, we have provided evidence that supported the porcine ELOVL6 gene as the major causative gene of the QTL on pig chromosome 8 for palmitic and palmitoleic acid contents in muscle and backfat. The single nucleotide polymorphism (SNP) ELOVL6:c.-533C > T located in the promoter region of ELOVL6 was found to be highly associated with ELOVL6 expression and, accordingly, with the percentages of palmitic and palmitoleic acids in longissimus dorsi and adipose tissue. The main goal of the current work was to further study the role of ELOVL6 on these traits by analyzing the regulation of the expression of ELOVL6 and the implication of ELOVL6 polymorphisms on meat quality traits in pigs. Results High-throughput sequencing of BAC clones that contain the porcine ELOVL6 gene coupled to RNAseq data re-analysis showed that two isoforms of this gene are expressed in liver and adipose tissue and that they differ in number of exons and 3’UTR length. Although several SNPs in the 3’UTR of ELOVL6 were associated with palmitic and palmitoleic acid contents, this association was lower than that previously observed with SNP ELOVL6:c.-533C > T. This SNP is in full linkage disequilibrium with SNP ELOVL6:c.-394G > A that was identified in the binding site for estrogen receptor alpha (ERα). Interestingly, the ELOVL6:c.-394G allele is associated with an increase in methylation levels of the ELOVL6 promoter and with a decrease of ELOVL6 expression. Therefore, ERα is clearly a good candidate to explain the regulation of ELOVL6 expression through dynamic epigenetic changes in the binding site of known regulators of ELOVL6 gene, such as SREBF1 and SP1. Conclusions Our results strongly suggest the ELOVL6:c.-394G > A polymorphism as the causal mutation for the QTL on pig chromosome 8 that affects fatty acid composition in pigs.
    Genetics Selection Evolution 03/2015; 47(20). DOI:10.1186/s12711-015-0111-y · 3.75 Impact Factor