Content uploaded by Corinne Rancurel
Author content
All content in this area was uploaded by Corinne Rancurel on Jul 29, 2015
Content may be subject to copyright.
ECCB 2014 Accepted Posters with Abstracts
F: Evolution and population genomics
F03: Corinne Rancurel, Martine Da Rocha and Etienne G J Danchin.
Alienness: Rapid detection of horizontal gene transfers in metazoan genomes
Abstract:
Horizontal gene transfer (HGT) is the transmission of genetic material between species by ways other than
direct (vertical) inheritance from parents to offspring. HGT is recognized as a major evolutionary force in
prokaryotes, where it is involved in the acquisition of antibiotic resistance or pathogenicity. HGT has long been
overlooked and considered insignificant in eukaryotes. However, HGT events have also played important roles in
the evolutionary history and biology of these species, including animals. For example, HGT events have contributed
to the colonization of land by plants, to the emergence of plant parasitism in nematodes and to the
development of capabilities such as resistance to extreme temperatures or desiccation. Progress in genome
sequencing technologies has allowed multiple animal genomes to be publicly released. Systematic searches
for HGT events in root-knot nematodes and in a bdelloid rotifer have shown that between 3 and 9% of
protein-coding genes in these species are of foreign origin (1). However, in the absence of a user-friendly,
rapid and publicly available tool to detect HGT events in metazoan genomes, we still lack a global view of the
prevalence of HGT in animals.
Here, we propose a tool that allows automatic detection of putative HGT events, based on the predicted
protein set from a genome or transcriptome. Our tool has been specifically designed to rapidly detect genes
of non-metazoan origin (e.g. bacterial, fungal) in metazoan genomes (e.g. insects, mammals, nematodes).
Based on a BLASTP search against the NCBI non-redundant protein library (nr), we retrieve putative homologs for
each protein query in a whole proteome. Canonical metazoan proteins should return better BLAST hits to
metazoans than to non-metazoans. Conversely, candidate HGT of non-metazoan origin are expected to
return better BLAST hits to non-metazoans than to metazoans.
Our method uses the Alien Index metric described in (2) to detect a significant gap between the best
metazoan and non-metazoan e-values as an indicator of putative HGT. An Alien Index (AI) > 0 indicates a
better hit to non-metazoans than to metazoans. Because the index is a difference of natural logarithms, an
AI > 30 corresponds to a best non-metazoan e-value more than e^30 (about 10^13) times smaller than the best
metazoan e-value, and is taken as a clear indication of HGT.
The main difficulty of this approach is to automatically retrieve taxonomic information and to assign each hit to
either the metazoan or the non-metazoan category. Our method not only classifies the best BLAST hits as
metazoan or non-metazoan, but also sorts the non-metazoan best hits into virus, bacteria, archaea,
fungi, plant and "other". It thus allows both the detection of putative HGT acquisitions and the identification of
the putative donors.
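The classification step described above can be sketched as follows. This is only an illustration, not the Alienness implementation: the function name `classify_lineage`, the mapping table, and the idea of representing a hit's lineage as a list of NCBI taxon names are assumptions made for the example. Only the output categories come from the abstract.

```python
# Hypothetical helper: the category labels follow the abstract; the lineage
# format (a list of NCBI taxon names, root first) is an assumption.
NON_METAZOAN_GROUPS = {
    "Viruses": "virus",
    "Bacteria": "bacteria",
    "Archaea": "archaea",
    "Fungi": "fungi",
    "Viridiplantae": "plant",
}


def classify_lineage(lineage):
    """Return 'metazoa', one of the donor categories, or 'other'."""
    names = set(lineage)
    if "Metazoa" in names:
        return "metazoa"
    for taxon, label in NON_METAZOAN_GROUPS.items():
        if taxon in names:
            return label
    # Non-metazoan hits outside the named groups (e.g. protists, red algae).
    return "other"


print(classify_lineage(["cellular organisms", "Eukaryota", "Metazoa", "Nematoda"]))
print(classify_lineage(["cellular organisms", "Bacteria", "Proteobacteria"]))
print(classify_lineage(["cellular organisms", "Eukaryota", "Rhodophyta"]))
```

Checking for Metazoa first keeps the two top-level categories mutually exclusive; everything that is neither metazoan nor in one of the five named groups falls into "other".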
References
1. J.-F. Flot et al., Genomic evidence for ameiotic evolution in the bdelloid rotifer Adineta vaga, Nature (2013).
2. E. A. Gladyshev et al., Massive horizontal gene transfer in bdelloid rotifers, Science (2008).
Example of detected HGT
Gene : cds.comp59722_c0_seq1
Best non-metazoan e-value : 7.0e-153 (bacterium)
Best metazoan e-value : 7.0e-42
AI = 255.6
Pfam : Glyco_hydro_32N
Putative function: invertase
Recording the best metazoan and non-metazoan e-values
Rapid detection of horizontal gene transfers in metazoan genomes
Corinne Rancurel 1, Martine Da Rocha 1, Etienne G J Danchin1
Abstract : Horizontal gene transfer (HGT) is the transmission of
genetic material between species by ways other than direct (vertical)
inheritance from parents to offspring. HGT is recognized as a major
evolutionary force in prokaryotes, where it is involved in the acquisition of
antibiotic resistance or pathogenicity. HGT has long been overlooked and
considered insignificant in eukaryotes. However, HGT events have also played
important roles in the evolutionary history and biology of these species,
including animals.
Our pipeline allows rapid detection of potential HGT of non-metazoan
origin (e.g. bacteria, fungi, plants, protists) in metazoan genomes (e.g.
insects, vertebrates, nematodes, cnidaria).
[Figure: schematic tree of life contrasting vertical transmission from parents to offspring with HGT from non-metazoa to metazoa. Non-metazoan labels: archaea, bacteria, fungi, plants, red algae. Metazoan labels: Porifera, Cnidaria, Bilateria, annelids, mollusks, nematodes, insects, echinoderms, vertebrates, reptiles, mammals. Other labels: Eukaryotes, Metazoa.]
Alien Index (AI) = log(best metazoan e-value + 1e-200) - log(best non-metazoan e-value + 1e-200), where log is the natural logarithm
In our example: log(2e-24 + 1e-200) - log(1e-37 + 1e-200) = +30.6
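The formula can be checked numerically with a few lines of Python. This is a minimal sketch, assuming the natural logarithm (which reproduces both AI values quoted on the poster); the function name `alien_index` is chosen for the example, and the default e-value of 1 follows the poster's note on queries lacking a metazoan or non-metazoan hit.

```python
import math


def alien_index(best_metazoan_evalue=1.0, best_non_metazoan_evalue=1.0):
    """Alien Index from the best metazoan and non-metazoan e-values.

    The 1e-200 offset keeps the logarithm defined when BLAST reports an
    e-value of 0. A missing hit is represented by an e-value of 1.
    """
    return (math.log(best_metazoan_evalue + 1e-200)
            - math.log(best_non_metazoan_evalue + 1e-200))


# Worked example from the poster (best metazoan 2e-24, best non-metazoan 1e-37).
print(round(alien_index(2e-24, 1e-37), 1))      # 30.6
# Invertase candidate from the poster (7.0e-42 versus 7.0e-153).
print(round(alien_index(7.0e-42, 7.0e-153), 1))  # 255.6
```

With this offset the index is bounded: a query with no metazoan hit and a non-metazoan e-value of 0 reaches the maximum of about 460.5.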
[Figure: Alienness workflow. Inputs selected by the user: the query proteome (fasta), the BLASTP hits of that proteome against the NCBI non-redundant library (nr) in tabular format, and the Exclude_group option. The hits are combined with lineage information from the NCBI taxonomy files, the best metazoan and best non-metazoan e-values are recorded for each query, and the Alien Index (AI) is computed for each query. Output: Alien Index by query (xls), used to identify potential horizontal gene transfers. Steps are numbered [1] to [3] in the figure.]
1INRA UMR1355, UNS, CNRS UMR7254, F-06903, Sophia Antipolis, France. spiboc.bioinfo@sophia.inra.fr
[ref. Massive Horizontal gene Transfer in Bdelloid Rotifers , Gladyshev et al. Science 2008 ]
# BLASTP 2.2.26+
# Query: ANUcomp28394_c0_seq1
# Database: nr
# Fields: query id, subject id, % identity, …. evalue, bit score, subject ids
# 250 hits found
ANUcomp28394_c0_seq1 gi|522006232 40.98 …. 1e-37 144
………
ANUcomp28394_c0_seq1 gi|110671508 33.52 …. 2e-24 105
Tabular blast output
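Recording the best metazoan and non-metazoan e-values from such a file could look like the sketch below. Several things here are assumptions rather than facts from the poster: the tab-separated layout with the e-value in the eleventh column (the standard BLAST+ tabular order), the function name `best_evalues`, the `is_metazoan` callback standing in for the taxonomy lookup, and the sample rows, which are hypothetical apart from the two e-values.

```python
def best_evalues(blast_lines, is_metazoan, evalue_column=10):
    """Map each query id to [best metazoan e-value, best non-metazoan e-value].

    Both values start at 1.0, the e-value the poster assigns when no
    significant hit exists in a category.
    """
    best = {}
    for line in blast_lines:
        # Skip blank lines and the '#' comment lines of the tabular format.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        evalue = float(fields[evalue_column])
        pair = best.setdefault(query, [1.0, 1.0])
        slot = 0 if is_metazoan(subject) else 1
        pair[slot] = min(pair[slot], evalue)
    return best


# Hypothetical rows: ids and alignment columns are invented for illustration.
rows = [
    "# comment lines are skipped",
    "q1\tsubj_bact\t41.0\t200\t100\t3\t1\t200\t5\t204\t1e-37\t144",
    "q1\tsubj_meta\t33.5\t180\t110\t4\t10\t190\t12\t191\t2e-24\t105",
]
result = best_evalues(rows, is_metazoan=lambda subject: subject == "subj_meta")
print(result)
```

In a real run the callback would resolve each subject identifier to its NCBI lineage rather than compare strings.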
Compute Alien Index
An AI is calculated for each query protein returning at least one hit to either a metazoan or a non-metazoan species,
using the Alien Index formula given above.
NB : When no metazoan or non-metazoan significant BLAST hit is found, an e-value of 1 is automatically assigned as
the best metazoan or non-metazoan hit, respectively.
Rule                        Meaning                                                        Decision
AI > 0                      best non-metazoan e-value < best metazoan e-value              possible HGT event
AI > 30                     non-metazoan e-value smaller by a factor > e^30 (see example)  indicative of an HGT event
AI > 0 and % identity > 70  strong presumption of contamination                            not considered an HGT event
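The decision rules above can be expressed as a small function. This is a sketch under assumptions: the name `hgt_decision`, the returned labels, and the "no HGT signal" fallback for AI <= 0 are illustrative, and the percent identity is taken to be that of the best non-metazoan hit.

```python
def hgt_decision(ai, percent_identity):
    """Apply the poster's rules in order of precedence."""
    # Near-identical non-metazoan hits are treated as likely contamination.
    if ai > 0 and percent_identity > 70:
        return "probable contamination"
    if ai > 30:
        return "indicative of HGT"
    if ai > 0:
        return "possible HGT"
    # Hypothetical label: the poster does not name this case.
    return "no HGT signal"


print(hgt_decision(255.6, 55.0))
print(hgt_decision(12.0, 40.0))
print(hgt_decision(80.0, 98.0))
print(hgt_decision(-5.0, 30.0))
```

The contamination test comes first so that a high AI never overrides it, matching the rule that such hits are not counted as HGT events.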
Examples of results obtained with Alienness:
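[Figure: putative donors of the candidate HGT proteins. Legend categories, in extracted order: Archaea, Bacteria, Viroids, Viruses, Viridiplantae, Fungi, Other. Counts, in extracted order: 1, 122, 0, 1, 12, 231, 203.]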
Validation of HGT via phylogeny
[Figure: phylogenetic tree with branch support values. Labeled groups: plant-parasitic nematodes (Tylenchida), N. aberrans, rhizobial bacteria, other bacteria, insect.]
[ref. The transcriptome of Nacobbus aberrans reveals insights into the evolution of sedentary
endoparasitism in plant-parasitic nematodes, Eves-van den Akker et al. GBE 2014 ]
BLAST hits are parsed to retrieve the associated taxonomic information from
the NCBI. Self-hits are ignored by giving the NCBI taxonomic identifier of
the query organism in the exclude_group option. The same option can be used
to ignore the whole parent lineage.
For example, one may be interested in HGT of non-metazoan origin in the human
genome specifically (taxid = 9606), or in the whole primate lineage (taxid = 9443),
in both cases using the human proteome as the query.
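The effect of the exclude_group option can be illustrated with a short filter. This is a sketch, not the tool's code: the function name `keep_hit` and the representation of each hit's lineage as a list of NCBI taxids are assumptions, and the lineages are abbreviated to a few real taxids (33208 Metazoa, 9443 Primates, 9606 human, 9598 chimpanzee, 2 Bacteria, 562 Escherichia coli).

```python
def keep_hit(hit_lineage_taxids, exclude_taxid):
    """Keep a hit unless the excluded taxid appears anywhere in its lineage."""
    return exclude_taxid not in hit_lineage_taxids


# Abbreviated lineages, root first (intermediate ranks omitted).
human_hit = [33208, 9443, 9606]
chimp_hit = [33208, 9443, 9598]
ecoli_hit = [2, 562]
hits = {"human": human_hit, "chimp": chimp_hit, "ecoli": ecoli_hit}

# Excluding humans only: chimpanzee hits still count as metazoan hits.
print([name for name, lineage in hits.items() if keep_hit(lineage, 9606)])
# Excluding all primates: only hits outside the primate lineage remain.
print([name for name, lineage in hits.items() if keep_hit(lineage, 9443)])
```

Excluding a broader clade removes the close relatives that would otherwise supply strong metazoan hits, so the analysis targets transfers acquired by the whole lineage rather than by the query species alone.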
[Figure: primate phylogeny. Labels: primates; prosimians (lemurs, lorises); anthropoids; platyrrhini (New World monkeys); catarrhini (Old World monkeys); hominoidea (gibbons, gorillas, chimpanzees, humans).]
Query proteome : the plant-parasitic nematode Nacobbus aberrans (clade Tylenchida)
Exclude group : Tylenchida (Taxid: 6300)
Results : 570 N. aberrans proteins returned an Alien Index (AI) >30
and less than 70 % identity to a non-metazoan protein
Conclusion and Perspectives
14 query proteomes have been analyzed so far
Perspective: automatically generate functional annotation and eliminate contamination
Perspective: automatically generate phylogenies
[Figure labels: Bacteria, Eukarya, Archaea, Virus.]
[ ref. D.Raoult. The post-Darwinist rhizome of life. The Lancet. 2010 ]