
Kamel Jabbari
- PhD
- Researcher at University of Cologne
About
96 Publications
31,598 Reads
15,113 Citations
Additional affiliations
June 2017 - present
Institute for Genetics
Position
- Researcher
Publications (96)
1. Background. BORIS (Brother Of Regulator Of Imprinted Sites; also called CTCF-like, CTCFL) emerged by duplication of the CCCTC-binding factor (CTCF) [1,2,3]. It has long been assumed that the BORIS gene arose during amniote (reptiles, birds, mammals) evolution. The BORIS gene is located within a synteny block which is highly conserved in...
In this study, by exploring chromatin conformation capture data, we show that DNA sequence composition contributes to the nuclear segregation of Topologically Associated Domains (TADs). GC-peaks and valleys of TADs strongly influence interchromosomal interactions and chromatin 3D structure. To gain insight into the compositional and functional con...
Recent findings established a link between DNA sequence composition and interphase chromatin architecture and explained the evolutionary conservation of TADs (Topologically Associated Domains) and LADs (Lamina Associated Domains) in mammals. This prompted us to analyse conformation capture and recombination rate data to study the relationship betwe...
Supplementary figures.
Figure A. Schematic representation of the “regioneR” approach. We first perform a permutation test by creating 1000 randomizations of set-2 to test whether the overlap with set-1 is greater than expected; the output is then stored in the object “pt”. We can then plot the “pt” object; plot(pt) will create a plot with the distribution of the...
Recent investigation established a link between DNA sequences and chromatin architecture and explained the evolutionary conservation of TADs (Topologically Associated Domains) and LADs (Lamina Associated Domains) in mammals. This prompted us to analyse the relationship between chromatin architecture and recombination landscapes of human and mouse....
Genetic Generalized Epilepsy (GGE) and benign epilepsy with centro-temporal spikes or Rolandic Epilepsy (RE) are common forms of genetic epilepsies. Rare copy number variants have been recognized as important risk factors in brain disorders. We performed a systematic survey of rare deletions affecting protein-coding genes derived from exome data of...
Deletions present in array data.
(DOCX)
Deletions in common with ExAC CNVs.
Data is sorted from low to high deletion score (del.score) and duplication (dup) frequencies. "+" indicates expression in the brain. Deletion score increases with increasing intolerance.
(DOCX)
BACKGROUND: Genetic generalised epilepsy is the most common type of inherited epilepsy. Despite a high concordance rate of 80% in monozygotic twins, the genetic background is still poorly understood. We aimed to investigate the burden of rare genetic variants in genetic generalised epilepsy. METHODS: For this exome-based case-control study, we used...
The CCCTC-binding factor (CTCF) is multi-functional, ubiquitously expressed, and highly conserved from Drosophila to human. It has important roles in transcriptional insulation and the formation of a high-dimensional chromatin structure. CTCF has a paralog called "Brother of Regulator of Imprinted Sites" (BORIS) or "CTCF-like" (CTCFL). It binds DNA...
Rolandic epilepsy (RE) is the most common focal epilepsy in childhood. To date no hypothesis-free exome-wide mutational screen has been conducted for RE and atypical RE (ARE). Here we report on whole-exome sequencing of 194 unrelated patients with RE/ARE and 567 ethnically matched population controls. We identified an exome-wide significantly enric...
A recent investigation showed the existence of correlations between the architectural features of mammalian interphase chromosomes and the compositional properties of isochores. This result prompted us to compare maps of the Topologically Associating Domains (TADs) and of the Lamina Associated Domains (LADs) with the corresponding isochore maps of...
Table A, Isochore families in the human genome, Table B, Structural and functional properties of the genome core vs. the genome desert. Table C, Isochores & interphase chromatin.
(PDF)
Epilepsy is a common complex disorder most frequently associated with psychiatric and neurological diseases. Massive parallel sequencing of individual or cohort genomes and exomes led to the identification of several disease-associated genes. We review here the candidate genes in epilepsy genetics with focus on exome and gene panel data. Together with...
Next generation sequencing (NGS) has been a great success and is now a standard method of research in the life sciences. With this technology, dozens of whole genomes or hundreds of exomes can be sequenced in a rather short time, producing huge amounts of data. Complex bioinformatics analyses are required to turn these data into scientific findings....
Oilseed rape (Brassica napus L.) was formed ~7500 years ago by hybridization between B. rapa and B. oleracea, followed by chromosome doubling, a process known as allopolyploidy. Together with more ancient polyploidizations, this conferred an aggregate 72× genome multiplication since the origin of angiosperms and high gene content. We examined the B...
Cultivated citrus are selections from, or hybrids of, wild progenitor species whose identities and contributions to citrus domestication remain controversial. Here we sequence and compare citrus genomes—a high-quality reference haploid clementine genome and mandarin, pummelo, sweet-orange and sour-orange genomes—and show that cultivated types deriv...
Recent studies reported DEPDC5 loss-of-function mutations in different focal epilepsy syndromes. Here we identified one predicted truncation and two missense mutations in three independent children with Rolandic epilepsy (3/207). In addition, we identified three families with unclassified focal childhood epilepsies carrying predicted truncating DEP...
Members of the family Trypanosomatidae infect many organisms, including animals, plants and humans. Plant-infecting trypanosomes are grouped under the single genus Phytomonas, failing to reflect the wide biological and pathological diversity of these protists. While some Phytomonas spp. multiply in the latex of plants, or in fruit or seeds without...
Red seaweeds are key components of coastal ecosystems and are economically important as food and as a source of gelling agents, but their genes and genomes have received little attention. Here we report the sequencing of the 105-Mbp genome of the florideophyte Chondrus crispus (Irish moss) and the annotation of the 9,606 genes. The genome features...
Brown algae are important organisms both because of their key ecological roles in coastal ecosystems and because of the remarkable biological features that they have acquired during their unusual evolutionary history. The recent sequencing of the complete genome of the filamentous brown alga Ectocarpus has provided unprecedented access to the molec...
Bananas (Musa spp.), including dessert and cooking types, are giant perennial monocotyledonous herbs of the order Zingiberales, a sister group to the well-studied Poales, which include cereals. Bananas are vital for food security in many tropical and subtropical countries and the most popular fruit in industrialized countries. The Musa domesticatio...
By comparative analyses we identify lineage-specific diversity in transcription factors (TFs) from stramenopile (or heterokont) genome sequences. We compared a pennate (Phaeodactylum tricornutum) and a centric diatom (Thalassiosira pseudonana) with those of other stramenopiles (oomycetes, Pelagophyceae, and Phaeophyceae (Ectocarpus siliculosus))...
Diatoms represent the predominant group of eukaryotic phytoplankton in the oceans and are responsible for around 20% of global photosynthesis. Two whole genome sequences are now available. Notwithstanding, our knowledge of diatom biology remains limited because only around half of their genes can be ascribed a function based on homology-based method...
Supplementary Table S2. Diatom-specific genes expressed in both high and low decadienal libraries (HD and LD).
Supplementary Table S3. R-values of the actual 9,145 clusters and that of the randomized data set.
Supplementary Figure S1. Expression patterns of diatom-specific genes. (A) Hierarchical clustering to show the expression pattern of transcripts belonging to the gene families conserved across different taxonomical groups (Core), diatom-specific (Diatom) and P. tricornutum-specific (Pt) [8]. (B) Plot showing the average frequency of the above set o...
Supplementary Figure S2. Percentage of differentially expressed transcripts in primary y-axis, normalized to number of non-redundant transcripts (TUs) across the EST libraries and the percentage of transcripts with defined InterPro domains (PDFs) in the differentially expressed transcripts in the secondary y-axis. The arrow in the secondary y-axis...
Supplementary Figure S3. Distribution of P. tricornutum PDFs in other organismal groups. Numbers in parentheses indicate the number of genes with defined protein domains (PDF) and the number outside the parentheses represent the total number of genes in each organismal group.
Supplementary Table S6. The top 20 IPR domains expressed across all the libraries and the number of ESTs for each domain.
Supplementary Table S1. A comprehensive description of culturing conditions of the libraries.
Supplementary Table S4. The 71 transcripts that were expressed at least once across all the libraries.
Supplementary Table S5. GO terms that are over-represented in each library (P < 0.001). In this table we also show over-represented GO terms shared between libraries.
Supplementary Figure S5. Hierarchical clustering showing the expression patterns of P. tricornutum orthologs of the novel genes identified by tiling array in T. pseudonana [42]. Expression levels are shown in an increasing scale from grey to dark blue, and are based on frequencies of ESTs in each library (see Materials and methods). For two-letter...
Supplementary Table S7. Bacterial genes and their expression across different libraries along with the domain and genomic location.
Supplementary Figure S4. Hierarchical clustering of transcripts defined as being differentially expressed under the nitrate starved condition (NS) in P. tricornutum along with the hierarchical clustering of corresponding orthologs expressed in the nitrate limited condition (NL) in T. pseudonana. Expression levels are shown in an increasing scale fr...
Supplementary Figure S6. Expression of bacterial orthologous genes in P. tricornutum. (A) Plot showing the number of transcripts of bacterial origin expressed across the 16 different growth conditions. The primary y-axis shows the number of transcripts and the secondary y-axis shows the average frequency of these expressed transcripts. (B) Expressi...
Brown algae (Phaeophyceae) are complex photosynthetic organisms with a very different evolutionary history to green plants, to which they are only distantly related. These seaweeds are the dominant species in rocky coastal ecosystems and they exhibit many interesting adaptations to these, often harsh, environments. Brown algae are also one of only...
Polymorphism generated by TE insertions across P. tricornutum accessions. Distribution of polymorphic bands obtained by SSAP experiments (with BKB, SCF, and PtC34) across 13 P. tricornutum accessions and positions of the corresponding sequences in the Pt1 genome when occurring only once (otherwise, we indicated the nature of the repeat sequenced).
List of putatively active LTR-RTs found in diatom genomes. Classification, structural features, and accession numbers of the putatively active LTR-RTs identified in the P. tricornutum and T. pseudonana genomes.
Pt2_50588 is a recombination product. Close-up of the sequence alignment of the Pt2_50588 orthologs at the level of the transition between higher similarities of Pt2_50588 with Pt2_46949/Pt2_46953 (highlighted in blue) and with Pt2_46950/Pt2_50589 (highlighted in red).
Haplotype specificity of Blackbeard insertion. (A) Close up on the dot-plot comparison (window size: 11) of two consensus sequences of the Blackbeard insertion locus retrieved with the help of the Stanford Human Genome Center. (B) Schematic view of the two haplotypes observed at the Blackbeard insertion locus in the P. tricornutum genome. Haplotype...
Transposable elements (TEs) are mobile DNA sequences present in the genomes of most organisms. They have been extensively studied in animals, fungi, and plants, and have been shown to have important functions in genome dynamics and species evolution. Recent genomic data now make it possible to extend the identification and study of TEs to other branches of the eu...
Summary: Ten axenic cultures, referred to as Fibrocapsa japonica, were studied for their morphology, pigment composition, toxicity and phylogeny. Morphologically, all 10 accessions were similar and displayed equivalent pigment contents. We identified chlorophylls a and c, beta-carotene and fucoxanthin as the dominant pigments, together with xantho...
Diatoms are photosynthetic secondary endosymbionts found throughout marine and freshwater environments, and are believed to be responsible for around one-fifth of the primary productivity on Earth. The genome sequence of the marine centric diatom Thalassiosira pseudonana was recently reported, revealing a wealth of information about diatom biology....
The material covered herein touches on the recent understanding of the fate of introns in human duplicated genes. A structural genomics framework has been provided to account for the functional asymmetry of sister copies after the duplication event(s). Structural shifts between duplicated copies are remarkable. Indeed, translocation/transpositi...
Chlamydomonas reinhardtii is a unicellular green alga whose lineage diverged from land plants over 1 billion years ago. It is a model system for studying chloroplast-based photosynthesis, as well as the structure, assembly, and function of eukaryotic flagella (cilia), which were inherited from the common ancestor of plants and animals, but lost in...
Diatoms are unicellular brown algae that likely arose from the endocytobiosis of a red alga into a single-celled heterotroph and that constitute an algal class of major importance in phytoplankton communities around the globe. The first whole-genome sequence from a diatom species, Thalassiosira pseudonana Hasle et Heimdal, was recently reported, an...
The smallest known eukaryotes, at ≈1-μm diameter, are Ostreococcus tauri and related species of marine phytoplankton. The genome of Ostreococcus lucimarinus has been completed and compared with that of O. tauri. This comparison reveals surprising differences across orthologous chromosomes in the two species from highly syntenic chromosomes in most...
Proper validation can accelerate sequence-based discovery of proteins and protein-coding genes. Databases currently contain a backlog of experimentally unverified gene models and tentative assignments of observed transcripts to coding or noncoding RNA. We present and apply a general principle, founded on base composition and the genetic code and va...
The green lineage is reportedly 1,500 million years old, evolving shortly after the endosymbiosis event that gave rise to early photosynthetic eukaryotes. In this study, we unveil the complete genome sequence of an ancient member of this lineage, the unicellular green alga Ostreococcus tauri (Prasinophyceae). This cosmopolitan marine primary produc...
In previous work [Jabbari, K., Rayko, E., Bernardi, G., 2003. The major shifts of human duplicated genes. Gene 317, 203-208], we investigated the fate of ancient duplicated genes after the compositional transitions that occurred between the genomes of cold- and warm-blooded vertebrates. We found that the majority of duplicated copies were transpose...
Diatoms are one of the most important constituents of phytoplankton communities in aquatic environments, but in spite of this, only recently have large-scale diatom-sequencing projects been undertaken. With the genome of the centric species Thalassiosira pseudonana available since mid-2004, accumulating sequence information for a pennate model spec...
Reports accompanying draft or finished sequences of rice chromosomes and full-length cDNA libraries indicate that between a third and half of the (largely predicted) protein-coding genes of rice might have no identifiable homologs in Arabidopsis and/or other species. The set of apparent ‘no-homolog’ sequences are predicted to exhibit striking compo...
An analysis of dinucleotide frequencies was carried out on DNAs from insects and mammals, as well as on large DNA sequences from the genomes of Drosophila melanogaster, Anopheles gambiae, puffer fish (Takifugu rubripes), zebra fish (Danio rerio) and human. These organisms were chosen because Drosophila and Anopheles DNAs have an extremely low level...
In this paper, we provide evidence for the body temperature effect on the formation of GC-rich isochores, by analysing genomic sequences from two puffer fishes living at different temperatures. The higher body temperature of Tetraodon nigroviridis compared to Takifugu rubripes (ΔT ≈ 15 °C) appears to be the cause of a higher...
A sequence analysis of the genomes of Anopheles gambiae and Drosophila melanogaster reveals that Anopheles DNA is more heterogeneous and GC-richer than Drosophila DNA. The gene concentration across the Anopheles genome is characterized by low levels in the GC-poor part of the genome and a 3-fold increase in the GC-richest part; this gene density gr...
Between one third and one half of the proposed rice genes appear to have no homologs in other species, including Arabidopsis. Compositional considerations, and a comparison of curated rice sequences with ex novo predictions, suggest that many or most of the putative genes without homologs may be false positive predictions, i.e., sequences that are...
The existence of a well conserved linear relationship between GC levels of genes' second and third codon positions (GC2, GC3) prompted us to focus on the landscape, or joint distribution, spanned by these two variables. In human, well curated coding sequences now cover at least 15%-30% of the estimated total gene set. Our analysis of the landscape...
The localization of HIV-1 proviruses in compositional DNA fractions from 27 AIDS patients during the chronic phase of the disease with depletion of CD4+ and different levels of viremia showed the following. (1) At low viremia, proviruses are predominantly localized in the GC-richest isochores, which are characterized by an open chromatin structure;...
A positive correlation holds between the GC level of third codon positions of human genes (GC(3)) and hydropathy of the encoded proteins. This correlation may appear counterintuitive, since it links a physical property of proteins to the base composition of 'synonymous' sites. We here establish the nontriviality of the correlation, which has recent...
A recent paper by Belle et al. (J. Mol. Evol. 55 (2002) 356) reported an analysis of mean GC(3) (the GC level of third codon positions) and standard deviations of GC(3) of vertebrate genomes as related to body temperature, and concluded that "the thermal stability hypothesis does not appear to explain the general patterns of composition", apparentl...
Since many gene duplications in the human genome are ancient duplications going back to the origin of vertebrates, the question may be asked about the fate of such duplicated genes at the compositional genome transitions that occurred between cold- and warm-blooded vertebrates. Indeed, at that transition, about half of the (GC-poor) genes of cold-b...
Gene prediction relies on the identification of characteristic features of coding sequences that distinguish them from non-coding DNA. The recent large-scale sequencing of entire genomes from higher eukaryotes, in conjunction with currently used gene prediction algorithms, has provided an abundance of putative genes that can now be analysed for the...
Alus and LINEs (LINE1) are widespread classes of repeats that are very unevenly distributed in the human genome. The majority of GC-poor LINEs reside in the GC-poor isochores whereas GC-rich Alus are mostly present in GC-rich isochores. The discovery that LINEs and Alus share similar target site duplication and a common AT-rich insertion site speci...
In the present work we show that in the Drosophila genome (which covers a 37-51% GC range at a DNA size of approx. 50 kb) a linear correlation holds between the GC (or GC(3)) levels of coding sequences and the GC levels of the 50-kb genomic sequences embedding them. This correlation allows us to position the two compositional distributions of (a) coding sequences, and (b) of long DNA segments relative to ea...
Mycobacterium tuberculosis and Mycobacterium leprae are the etiological agents of tuberculosis and leprosy, respectively. After performing extensive comparisons between genes from these two GC-rich bacterial species, we were able to construct a set of 275 homologous genes. Since these two bacterial species also have a very low growth rate, transla...
A compositional transition was previously detected by comparing orthologous coding sequences from cold- and warm-blooded vertebrates (see Bernardi, G., Hughes, S., Mouchiroud, D., 1997. The major compositional transitions in the vertebrate genome. J. Mol. Evol. 44, S44-S51 for a review). The transition is characterized by higher GC levels (GC is th...
In this work, we have investigated the relationships between synonymous and nonsynonymous rates and base composition in coding sequences from Gramineae to analyze the factors underlying the variation in substitutional rates. We have shown that in these genes the rates of nucleotide divergence, both synonymous and nonsynonymous, are, to some extent,...
We investigated the relationships between the nucleotide substitution rates and the predicted secondary structures in the three states representation (α-helix, β-sheet, and coil). The analysis was carried out on 34 alignments, each of which comprised sequences belonging to at least four different mammalian orders. The rates of synonymous substituti...
The “universal correlation” (D'Onofrio, G., Bernardi, G., 1992. A universal compositional correlation among codon positions. Gene 110, 81–88.) that holds between 〈GC3〉 and 〈GC1〉 or 〈GC2〉 (〈GC〉 values are the average values of the coding sequences of each genome analyzed) at both the inter- and intra-genomic level, was re-analyzed on a vastly larger...
The discovery that the genomes of warm-blooded vertebrates are mosaics of isochores, long DNA segments homogeneous in base composition, yet belonging to families covering a broad spectrum of GC levels, has led to two major observations. The first is that gene density is strikingly non-uniform in the genome of all vertebrates, gene concen...
We have analyzed the patterns of synonymous codon preferences of the nuclear genes of Plasmodium falciparum, a unicellular parasite characterized by an extremely GC-poor genome. When all genes are considered, codon usage is strongly biased toward A and T in third codon positions, as expected, but multivariate statistical analysis detects a major tr...
A computer analysis of 946 human DNA sequences larger than 50 kb and representing about 118 Mb of DNA has led to the following observations. (i) Positive correlations hold between CpG levels and the GC levels of isochores and coding sequences, as expected from previous results. (ii) The correlation between CpG levels and the GC levels of pseudogene...
This review briefly describes the compositional approach to the analysis of vertebrate genomes. This approach involves the study of distributions of, and correlations among, the base compositions (GC levels) of different parts of these genomes, such as exons, introns, third codon positions, flanking of genes, and long genomic sequences or fragments...
In this work, we investigated (1) the compositional distributions of all available nuclear coding sequences (and of their three codon positions) of six dicots and four Gramineae; this considerably expanded our knowledge about the differences previously seen between these two groups of plants; (2) the compositional correlations of homologous genes f...
We have analysed the levels of 5-methylcytosine (5mC) in DNAs from 42 vertebrates, and compiled, including data from the literature, a table of genomic 5mC and GC levels (as well as the available c-values, i.e., the haploid genome sizes) of 87 species from all vertebrate classes. An analysis of the data indicates that (i) two positive correlations hold...
5-Methylcytosine (5mC) levels were determined in compositional DNA fractions corresponding to different isochore families from the genomes of Xenopus, chicken, mouse and human, four vertebrates which show different isochore patterns. The results obtained indicate that: (i) positive correlations exist between the 5mC levels and the GC levels of isoc...
Previous investigations indicated that synonymous and nonsynonymous substitution rates are correlated in mammalian genes. In the present work, this correlation has been studied at the intragenic level using a dataset of 48 orthologous genes from species belonging to at least four different mammalian orders. The results obtained show that the intrag...