Widespread RNA and DNA Sequence Differences in the Human Transcriptome

Department of Biostatistics, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA.
Science (Impact Factor: 33.61). 05/2011; 333(6038):53-8. DOI: 10.1126/science.1207018
Source: PubMed


The transmission of information from DNA to RNA is a critical process. We compared RNA sequences from human B cells of 27
individuals to the corresponding DNA sequences from the same individuals and uncovered more than 10,000 exonic sites where
the RNA sequences do not match those of the DNA. All 12 possible categories of discordances were observed. These differences
were nonrandom as many sites were found in multiple individuals and in different cell types, including primary skin cells
and brain tissues. Using mass spectrometry, we detected peptides that are translated from the discordant RNA sequences and
thus do not correspond exactly to the DNA sequences. These widespread RNA-DNA differences in the human transcriptome provide
an as yet unexplored aspect of genome variation.
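
The comparison described above, and the count of 12 discordance categories (4 DNA bases times 3 possible differing RNA bases), can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes two hypothetical, already-aligned, equal-length sequences with the RNA written in cDNA form (U shown as T), and it skips the read mapping, quality filtering, and coverage thresholds a real analysis would need. The function names are illustrative only.

```python
from collections import Counter
from itertools import permutations

BASES = "ACGT"
# All ordered pairs of distinct bases: 4 * 3 = 12 categories such as "A>G".
CATEGORIES = [f"{d}>{r}" for d, r in permutations(BASES, 2)]


def rna_dna_differences(dna, rna):
    """Return (position, category) for each site where the RNA differs from the DNA.

    Both inputs are hypothetical, pre-aligned, equal-length strings over A, C, G, T.
    Positions are 0-based; sites with any other symbol (e.g. N) are skipped.
    """
    if len(dna) != len(rna):
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    for pos, (d, r) in enumerate(zip(dna.upper(), rna.upper())):
        if d in BASES and r in BASES and d != r:
            sites.append((pos, f"{d}>{r}"))
    return sites


def count_categories(sites):
    """Tally how often each of the 12 categories occurs (zero if absent)."""
    counts = Counter(cat for _, cat in sites)
    return {cat: counts.get(cat, 0) for cat in CATEGORIES}


# Toy example (made-up sequences, not data from the study).
dna = "ACGTACGTAC"
rna = "ACGTGCGTCC"
sites = rna_dna_differences(dna, rna)
print(sites)                    # [(4, 'A>G'), (8, 'A>C')]
print(count_categories(sites))
```

In this toy input the A>G site is the pattern expected from A-to-I editing, since inosine is read as G during reverse transcription; the A>C site belongs to one of the other 11 categories.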



Available from: Isabel Xiaorong Wang, May 14, 2014
  • Source
    • "the genomic sequence that is partially or completely converted to G at the corresponding cDNA position indicates a candidate A-to-I RNA editing site. Using this method, vast numbers of such RNA and DNA differences (RDD) in the human transcriptome have been screened using next-generation sequencing technologies (Bahn et al., 2012; Li et al., 2011; Peng et al., 2012). However, it is reported that these sites include many false positives (Kleinman & Majewski, 2012; Lin, Piskol, Tan, & Li, 2012; Pickrell, Gilad, & Pritchard, 2012; Piskol, Peng, Wang, & Li, 2013). The high error rate using the RDD method is attributed to the difficulty in differentiating betw"
    ABSTRACT: Inosine (I) is a modified adenosine (A) in RNA. In Metazoa, I is generated by hydrolytic deamination of A, catalyzed by adenosine deaminase acting on RNA (ADAR), in a process called A-to-I RNA editing. A-to-I RNA editing affects various biological processes by modulating gene expression. In addition, dysregulation of A-to-I RNA editing results in pathological consequences. I on RNA strands is converted to guanosine (G) during cDNA synthesis by reverse transcription. Thus, the conventional method used to identify A-to-I RNA editing sites compares cDNA sequences with their corresponding genomic sequences. Combined with deep sequencing, this method has been applied to transcriptome-wide screening of A-to-I RNA editing sites. This approach, however, produces a large number of false positives, mainly owing to mapping errors. To address this issue, we developed a biochemical method called inosine chemical erasing (ICE) to reliably identify genuine A-to-I RNA editing sites. In addition, we applied the ICE method combined with RNA-seq, referred to as ICE-seq, to identify transcriptome-wide A-to-I RNA editing sites. In this chapter, we describe the detailed protocol for ICE-seq, which can be applied to various sources and taxa.
    Methods in enzymology 08/2015; 560:331-53. DOI:10.1016/bs.mie.2015.03.014 · 2.09 Impact Factor
  • Source
    • "Despite the expected difference in gene expression between in vitro and in vivo cells of different origins, both studies reported RNA editing, which according to our results often leads to modified protein activity. Also widely discussed was the role of technical artifacts due to sequencing or sequence mapping; nevertheless, they only partially explain the discovered RDDs (Pickrell et al., 2012). This would suggest that it is a general, widespread editing mechanism that can affect phenotype and should be further investigated."
    ABSTRACT: Background: The occurrence of widespread RNA and DNA sequence differences in the human transcriptome was reported in 2011. Similar findings were described in a second independent publication on personal omics profiling investigating the occurrence of dynamic molecular and related medical phenotypes. The suggestion that the RNA sequence variation was likely to affect disease susceptibility prompted us to investigate with a range of algorithms the amino acid variants reported to be present in the identified peptides to determine if they might be disease-causing. Results: The predictive qualities of the different algorithms were first evaluated by using nonsynonymous single-base nucleotide polymorphism (nsSNP) datasets, using independently established data on amino acid variants in several proteins as well as data obtained by mutational mapping and modelling of binding sites in the human serotonin transporter protein (hSERT). Validation of the used predictive algorithms was at a 75% level. Using the same algorithms, we found that widespread RNA and DNA sequence differences were predicted to impair the function of the peptides in over 57% of cases. Conclusions: Our findings suggest that a proportion of edited RNAs which serve as templates for protein synthesis is likely to modify protein function, possibly as an adaptive survival mechanism in response to environmental modifications.
    Acta biochimica Polonica 02/2015; 62(1). DOI:10.18388/abp.2014_771 · 1.15 Impact Factor
  • Source
    • "Our comparative approach is critically important for validating editing, a feature that cannot be inferred from genome sequence alone, and because RNA-seq error results in mismatches relative to the genome. As a result, the extent of editing has been controversial (Li et al. 2011; Bass et al. 2012). Species comparison provides higher confidence in the validation of these elements (Danecek et al. 2012). "
    ABSTRACT: Accurate gene model annotation of reference genomes is critical for making them useful. The modENCODE project has improved the D. melanogaster genome annotation by using deep and diverse high-throughput data. Since transcriptional activity that has been evolutionarily conserved is likely to have an advantageous function, we have performed large-scale interspecific comparisons to increase confidence in predicted annotations. To support comparative genomics, we filled in divergence gaps in the Drosophila phylogeny by generating draft genomes for eight new species. For comparative transcriptome analysis, we generated mRNA expression profiles on 81 samples from multiple tissues and developmental stages of 15 Drosophila species, and we performed cap analysis of gene expression in D. melanogaster and D. pseudoobscura. We also describe conservation of four distinct core promoter structures composed of combinations of elements at three positions. Overall, each type of genomic feature shows a characteristic divergence rate relative to neutral models, highlighting the value of multispecies alignment in annotating a target genome that should prove useful in the annotation of other high priority genomes, especially human and other mammalian genomes that are rich in noncoding sequences. We report that the vast majority of elements in the annotation are evolutionarily conserved, indicating that the annotation will be an important springboard for functional genetic testing by the Drosophila community.