Kamel Jabbari
  • PhD
  • Researcher at University of Cologne

About

96
Publications
31,598
Reads
15,113
Citations
Current institution
University of Cologne
Current position
  • Researcher
Additional affiliations
June 2017 - present
Institute for Genetics
Position
  • Researcher

Publications (96)
Poster
Full-text available
1. Background BORIS = Brother Of Regulator Of Imprinted Sites BORIS (also called CTCF-like = CTCFL) emerged by duplication of the CCCTC-binding factor (CTCF) [1,2,3]. It has long been assumed that the BORIS gene arose during amniote (reptiles, birds, mammals) evolution. The BORIS gene is located within a synteny block which is highly conserved in...
Article
Full-text available
In this study, by exploring chromatin conformation capture data, we show that the nuclear segregation of Topologically Associated Domains (TADs) is contributed by DNA sequence composition. GC-peaks and valleys of TADs strongly influence interchromosomal interactions and chromatin 3D structure. To gain insight on the compositional and functional con...
Article
Full-text available
Recent findings established a link between DNA sequence composition and interphase chromatin architecture and explained the evolutionary conservation of TADs (Topologically Associated Domains) and LADs (Lamina Associated Domains) in mammals. This prompted us to analyse conformation capture and recombination rate data to study the relationship betwe...
Data
Supplementary figures. Figure A. Schematic representation of “regioneR” approach. We first perform permutation test by creating 1000 randomizations of set-2 to test if the overlap with set-1 is more than expected, the output is then stored in the object “pt”. We can then plot the “pt” object; plot(pt) will create a plot with the distribution of the...
Preprint
Recent investigation established a link between DNA sequences and chromatin architecture and explained the evolutionary conservation of TADs (Topologically Associated Domains) and LADs (Lamina Associated Domains) in mammals. This prompted us to analyse the relationship between chromatin architecture and recombination landscapes of human and mouse....
Article
Full-text available
Genetic Generalized Epilepsy (GGE) and benign epilepsy with centro-temporal spikes or Rolandic Epilepsy (RE) are common forms of genetic epilepsies. Rare copy number variants have been recognized as important risk factors in brain disorders. We performed a systematic survey of rare deletions affecting protein-coding genes derived from exome data of...
Data
Deletions present in array data. (DOCX)
Data
Deletions in common with ExAC CNVs. Data is sorted from low to high deletion score (del.score) and duplication (dup) frequencies. "+" indicates expression in the brain. Deletion score increases with increasing intolerance. (DOCX)
Article
Full-text available
Background Genetic generalised epilepsy is the most common type of inherited epilepsy. Despite a high concordance rate of 80% in monozygotic twins, the genetic background is still poorly understood. We aimed to investigate the burden of rare genetic variants in genetic generalised epilepsy.
Article
BACKGROUND: Genetic generalised epilepsy is the most common type of inherited epilepsy. Despite a high concordance rate of 80% in monozygotic twins, the genetic background is still poorly understood. We aimed to investigate the burden of rare genetic variants in genetic generalised epilepsy. METHODS: For this exome-based case-control study, we used...
Article
Full-text available
The CCCTC-binding factor (CTCF) is multi-functional, ubiquitously expressed, and highly conserved from Drosophila to human. It has important roles in transcriptional insulation and the formation of a high-dimensional chromatin structure. CTCF has a paralog called "Brother of Regulator of Imprinted Sites" (BORIS) or "CTCF-like" (CTCFL). It binds DNA...
Article
Rolandic epilepsy (RE) is the most common focal epilepsy in childhood. To date no hypothesis-free exome-wide mutational screen has been conducted for RE and atypical RE (ARE). Here we report on whole-exome sequencing of 194 unrelated patients with RE/ARE and 567 ethnically matched population controls. We identified an exome-wide significantly enric...
Article
Full-text available
A recent investigation showed the existence of correlations between the architectural features of mammalian interphase chromosomes and the compositional properties of isochores. This result prompted us to compare maps of the Topologically Associating Domains (TADs) and of the Lamina Associated Domains (LADs) with the corresponding isochore maps of...
Data
Table A, Isochore families in the human genome, Table B, Structural and functional properties of the genome core vs. the genome desert. Table C, Isochores & interphase chromatin. (PDF)
Article
Epilepsy is a common complex disorder most frequently associated with psychiatric and neurological diseases. Massive parallel sequencing of individual or cohort genomes and exomes led to the identification of several disease associated genes. We review here the candidate genes in epilepsy genetics with focus on exome and gene panel data. Together with...
Article
Full-text available
Next generation sequencing (NGS) has been a great success and is now a standard method of research in the life sciences. With this technology, dozens of whole genomes or hundreds of exomes can be sequenced in rather short time, producing huge amounts of data. Complex bioinformatics analyses are required to turn these data into scientific findings....
Article
Full-text available
Oilseed rape (Brassica napus L.) was formed ~7500 years ago by hybridization between B. rapa and B. oleracea, followed by chromosome doubling, a process known as allopolyploidy. Together with more ancient polyploidizations, this conferred an aggregate 72× genome multiplication since the origin of angiosperms and high gene content. We examined the B...
Article
Full-text available
Cultivated citrus are selections from, or hybrids of, wild progenitor species whose identities and contributions to citrus domestication remain controversial. Here we sequence and compare citrus genomes—a high-quality reference haploid clementine genome and mandarin, pummelo, sweet-orange and sour-orange genomes—and show that cultivated types deriv...
Article
Recent studies reported DEPDC5 loss-of-function mutations in different focal epilepsy syndromes. Here we identified one predicted truncation and two missense mutations in three independent children with Rolandic epilepsy (3/207). In addition, we identified three families with unclassified focal childhood epilepsies carrying predicted truncating DEP...
Article
Full-text available
Members of the family Trypanosomatidae infect many organisms, including animals, plants and humans. Plant-infecting trypanosomes are grouped under the single genus Phytomonas, failing to reflect the wide biological and pathological diversity of these protists. While some Phytomonas spp. multiply in the latex of plants, or in fruit or seeds without...
Article
Full-text available
Red seaweeds are key components of coastal ecosystems and are economically important as food and as a source of gelling agents, but their genes and genomes have received little attention. Here we report the sequencing of the 105-Mbp genome of the florideophyte Chondrus crispus (Irish moss) and the annotation of the 9,606 genes. The genome features...
Chapter
Brown algae are important organisms both because of their key ecological roles in coastal ecosystems and because of the remarkable biological features that they have acquired during their unusual evolutionary history. The recent sequencing of the complete genome of the filamentous brown alga Ectocarpus has provided unprecedented access to the molec...
Article
Full-text available
Bananas (Musa spp.), including dessert and cooking types, are giant perennial monocotyledonous herbs of the order Zingiberales, a sister group to the well-studied Poales, which include cereals. Bananas are vital for food security in many tropical and subtropical countries and the most popular fruit in industrialized countries. The Musa domesticatio...
Chapter
Full-text available
Brown algae are important organisms both because of their key ecological roles in coastal ecosystems and because of the remarkable biological features that they have acquired during their unusual evolutionary history. The recent sequencing of the complete genome of the filamentous brown alga Ectocarpus has provided unprecedented access to the molec...
Article
Brown algae are important organisms both because of their key ecological roles in coastal ecosystems and because of the remarkable biological features that they have acquired during their unusual evolutionary history. The recent sequencing of the complete genome of the filamentous brown alga Ectocarpus has provided unprecedented access to the molec...
Article
• By comparative analyses we identify lineage-specific diversity in transcription factors (TFs) from stramenopile (or heterokont) genome sequences. We compared a pennate (Phaeodactylum tricornutum) and a centric diatom (Thalassiosira pseudonana) with those of other stramenopiles (oomycetes, Pelagophyceae, and Phaeophyceae (Ectocarpus siliculosus))...
Article
Full-text available
Diatoms represent the predominant group of eukaryotic phytoplankton in the oceans and are responsible for around 20% of global photosynthesis. Two whole genome sequences are now available. Notwithstanding, our knowledge of diatom biology remains limited because only around half of their genes can be ascribed a function based on homology-based method...
Data
Supplementary Table S2. Diatom-specific genes expressed in both high and low decadienal libraries (HD and LD).
Data
Supplementary Table S3. R-values of the actual 9,145 clusters and that of the randomized data set.
Data
Full-text available
Supplementary Figure S1. Expression patterns of diatom-specific genes. (A) Hierarchical clustering to show the expression pattern of transcripts belonging to the gene families conserved across different taxonomical groups (Core), diatom-specific (Diatom) and P. tricornutum-specific (Pt) [8]. (B) Plot showing the average frequency of the above set o...
Data
Full-text available
Supplementary Figure S2. Percentage of differentially expressed transcripts in primary y-axis, normalized to number of non-redundant transcripts (TUs) across the EST libraries and the percentage of transcripts with defined InterPro domains (PDFs) in the differentially expressed transcripts in the secondary y-axis. The arrow in the secondary y-axis...
Data
Supplementary Figure S3. Distribution of P. tricornutum PDFs in other organismal groups. Numbers in parentheses indicate the number of genes with defined protein domains (PDF) and the number outside the parentheses represent the total number of genes in each organismal group.
Data
Supplementary Table S6. The top 20 IPR domains expressed across all the libraries and the number of ESTs for each domain.
Data
Supplementary Table S1. A comprehensive description of culturing conditions of the libraries.
Data
Supplementary Table S4. The 71 transcripts that were expressed at least once across all the libraries.
Data
Supplementary Table S5. GO terms that are over-represented in each library (P < 0.001). In this table we also show over-represented GO terms shared between libraries.
Data
Full-text available
Supplementary Figure S5. Hierarchical clustering showing the expression patterns of P. tricornutum orthologs of the novel genes identified by tiling array in T. pseudonana [42]. Expression levels are shown in an increasing scale from grey to dark blue, and are based on frequencies of ESTs in each library (see Materials and methods). For two-letter...
Data
Supplementary Table S7. Bacterial genes and their expression across different libraries along with the domain and genomic location.
Data
Full-text available
Supplementary Figure S4. Hierarchical clustering of transcripts defined as being differentially expressed under the nitrate starved condition (NS) in P. tricornutum along with the hierarchical clustering of corresponding orthologs expressed in the nitrate limited condition (NL) in T. pseudonana. Expression levels are shown in an increasing scale fr...
Data
Full-text available
Supplementary Figure S6. Expression of bacterial orthologous genes in P. tricornutum. (A) Plot showing the number of transcripts of bacterial origin expressed across the 16 different growth conditions. The primary y-axis shows the number of transcripts and the secondary y-axis shows the average frequency of these expressed transcripts. (B) Expressi...
Article
Full-text available
Brown algae (Phaeophyceae) are complex photosynthetic organisms with a very different evolutionary history to green plants, to which they are only distantly related. These seaweeds are the dominant species in rocky coastal ecosystems and they exhibit many interesting adaptations to these, often harsh, environments. Brown algae are also one of only...
Data
Polymorphism generated by TE insertions across P. tricornutum accessions. Distribution of polymorphic bands obtained by SSAP experiments (with BKB, SCF, and PtC34) across 13 P. tricornutum accessions and positions of the corresponding sequences in the Pt1 genome when occurring only once (otherwise, we indicated the nature of the repeat sequenced).
Data
List of putatively active LTR-RTs found in diatom genomes. Classification, structural features, and accession numbers of the putatively active LTR-RTs identified in the P. tricornutum and T. pseudonana genomes.
Data
Pt2_50588 consists in a recombination product. Close up on the sequence alignment of the Pt2_50588 orthologs at the level of the transition between higher similarities of Pt2_50588 with Pt2_46949/Pt2_46953 (highlighted in blue) and with Pt2_46950/Pt2_50589 (highlighted in red).
Data
Haplotype specificity of Blackbeard insertion. (A) Close up on the dot-plot comparison (window size: 11) of two consensus sequences of the Blackbeard insertion locus retrieved with the help of the Stanford Human Genome Center. (B) Schematic view of the two haplotypes observed at the Blackbeard insertion locus in the P. tricornutum genome. Haplotype...
Article
Full-text available
Transposable elements (TEs) are mobile DNA sequences present in the genomes of most organisms. They have been extensively studied in animals, fungi, and plants, and have been shown to have important functions in genome dynamics and species evolution. Recent genomic data can now enlarge the identification and study of TEs to other branches of the eu...
Article
Summary *Ten axenic cultures, referred to as Fibrocapsa japonica, were studied for their morphology, pigment composition, toxicity and phylogeny. *Morphologically, all 10 accessions were similar and displayed equivalent pigment contents. We identified chlorophylls a and c, beta-carotene and fucoxanthin as the dominant pigments, together with xantho...
Article
Full-text available
Diatoms are photosynthetic secondary endosymbionts found throughout marine and freshwater environments, and are believed to be responsible for around one-fifth of the primary productivity on Earth. The genome sequence of the marine centric diatom Thalassiosira pseudonana was recently reported, revealing a wealth of information about diatom biology....
Chapter
The material covered herein touches on the recent understanding of the fate of introns in human duplicated genes. A structural genomics framework has been provided to account for the functional asymmetry of sister copies after the duplication event(s). Structural shift between duplicated copies are very remarkable. Indeed, translocation/transpositi...
Article
Full-text available
Chlamydomonas reinhardtii is a unicellular green alga whose lineage diverged from land plants over 1 billion years ago. It is a model system for studying chloroplast-based photosynthesis, as well as the structure, assembly, and function of eukaryotic flagella (cilia), which were inherited from the common ancestor of plants and animals, but lost in...
Article
Full-text available
Diatoms are unicellular brown algae that likely arose from the endocytobiosis of a red alga into a single-celled heterotroph and that constitute an algal class of major importance in phytoplankton communities around the globe. The first whole-genome sequence from a diatom species, Thalassiosira pseudonana Hasle et Heimdal, was recently reported, an...
Article
Full-text available
The smallest known eukaryotes, at ≈1-μm diameter, are Ostreococcus tauri and related species of marine phytoplankton. The genome of Ostreococcus lucimarinus has been completed and compared with that of O. tauri. This comparison reveals surprising differences across orthologous chromosomes in the two species from highly syntenic chromosomes in most...
Article
Proper validation can accelerate sequence-based discovery of proteins and protein-coding genes. Databases currently contain a backlog of experimentally unverified gene models and tentative assignments of observed transcripts to coding or noncoding RNA. We present and apply a general principle, founded on base composition and the genetic code and va...
Article
Full-text available
The green lineage is reportedly 1,500 million years old, evolving shortly after the endosymbiosis event that gave rise to early photosynthetic eukaryotes. In this study, we unveil the complete genome sequence of an ancient member of this lineage, the unicellular green alga Ostreococcus tauri (Prasinophyceae). This cosmopolitan marine primary produc...
Article
In previous work [Jabbari, K., Rayko, E., Bernardi, G., 2003. The major shifts of human duplicated genes. Gene 317, 203-208], we investigated the fate of ancient duplicated genes after the compositional transitions that occurred between the genomes of cold- and warm-blooded vertebrates. We found that the majority of duplicated copies were transpose...
Article
Full-text available
Diatoms are one of the most important constituents of phytoplankton communities in aquatic environments, but in spite of this, only recently have large-scale diatom-sequencing projects been undertaken. With the genome of the centric species Thalassiosira pseudonana available since mid-2004, accumulating sequence information for a pennate model spec...
Article
Reports accompanying draft or finished sequences of rice chromosomes and full-length cDNA libraries indicate that between a third and half of the (largely predicted) protein-coding genes of rice might have no identifiable homologs in Arabidopsis and/or other species. The set of apparent ‘no-homolog’ sequences are predicted to exhibit striking compo...
Article
An analysis of dinucleotide frequencies was carried out on DNAs from insects and mammals, as well as on large DNA sequences from the genomes of Drosophila melanogaster, Anopheles gambiae, puffer fish (Takifugu rubripes), zebra fish (Danio rerio) and human. These organisms were chosen because Drosophila and Anopheles DNAs have an extremely low level...
Article
In this paper, we provide evidence for the body temperature effect on the formation of GC-rich isochores, by analysing genomic sequences from two puffer fishes living at different temperatures. The higher body temperature of Tetraodon nigroviridis compared to Takifugu rubripes (ΔT ≈ 15 °C) appears to be the cause of a higher...
Article
A sequence analysis of the genomes of Anopheles gambiae and Drosophila melanogaster reveals that Anopheles DNA is more heterogeneous and GC-richer than Drosophila DNA. The gene concentration across the Anopheles genome is characterized by low levels in the GC-poor part of the genome and a 3-fold increase in the GC-richest part; this gene density gr...
Article
Between one third and one half of the proposed rice genes appear to have no homologs in other species, including Arabidopsis. Compositional considerations, and a comparison of curated rice sequences with ex novo predictions, suggest that many or most of the putative genes without homologs may be false positive predictions, i.e., sequences that are...
Article
Full-text available
The existence of a well conserved linear relationship between GC levels of genes' second and third codon positions (GC2, GC3) prompted us to focus on the landscape, or joint distribution, spanned by these two variables. In human, well curated coding sequences now cover at least 15%-30% of the estimated total gene set. Our analysis of the landscape...
Article
The localization of HIV-1 proviruses in compositional DNA fractions from 27 AIDS patients during the chronic phase of the disease with depletion of CD4+ and different levels of viremia showed the following. (1) At low viremia, proviruses are predominantly localized in the GC-richest isochores, which are characterized by an open chromatin structure;...
Article
A positive correlation holds between the GC level of third codon positions of human genes (GC(3)) and hydropathy of the encoded proteins. This correlation may appear counterintuitive, since it links a physical property of proteins to the base composition of 'synonymous' sites. We here establish the nontriviality of the correlation, which has recent...
Article
A recent paper by Belle et al. (J. Mol. Evol. 55 (2002) 356) reported an analysis of mean GC(3) (the GC level of third codon positions) and standard deviations of GC(3) of vertebrate genomes as related to body temperature, and concluded that "the thermal stability hypothesis does not appear to explain the general patterns of composition", apparentl...
Article
Since many gene duplications in the human genome are ancient duplications going back to the origin of vertebrates, the question may be asked about the fate of such duplicated genes at the compositional genome transitions that occurred between cold- and warm-blooded vertebrates. Indeed, at that transition, about half of the (GC-poor) genes of cold-b...
Article
Full-text available
Gene prediction relies on the identification of characteristic features of coding sequences that distinguish them from non-coding DNA. The recent large-scale sequencing of entire genomes from higher eukaryotes, in conjunction with currently used gene prediction algorithms, has provided an abundance of putative genes that can now be analysed for the...
Article
Alus and LINEs (LINE1) are widespread classes of repeats that are very unevenly distributed in the human genome. The majority of GC-poor LINEs reside in the GC-poor isochores whereas GC-rich Alus are mostly present in GC-rich isochores. The discovery that LINEs and Alus share similar target site duplication and a common AT-rich insertion site speci...
Article
In the present work we show that in the Drosophila genome (which covers a 37-51% GC range at a DNA size of approx. 50 kb) a linear correlation holds between the GC (or GC(3)) levels of coding sequences and the GC levels of the 50-kb genomic sequences embedding them. This correlation allows us to position the two compositional distributions of (a) coding sequences, and (b) of long DNA segments relative to ea...
Article
Mycobacterium tuberculosis and Mycobacterium leprae are the etiological agents of tuberculosis and leprosy, respectively. After performing extensive comparisons between genes from these two GC-rich bacterial species, we were able to construct a set of 275 homologous genes. Since these two bacterial species also have a very low growth rate, transla...
Article
A compositional transition was previously detected by comparing orthologous coding sequences from cold- and warm-blooded vertebrates (see Bernardi, G., Hughes, S., Mouchiroud, D., 1997. The major compositional transitions in the vertebrate genome. J. Mol. Evol. 44, S44-S51 for a review). The transition is characterized by higher GC levels (GC is th...
Article
Full-text available
In this work, we have investigated the relationships between synonymous and nonsynonymous rates and base composition in coding sequences from Gramineae to analyze the factors underlying the variation in substitutional rates. We have shown that in these genes the rates of nucleotide divergence, both synonymous and nonsynonymous, are, to some extent,...
Article
We investigated the relationships between the nucleotide substitution rates and the predicted secondary structures in the three states representation (α-helix, β-sheet, and coil). The analysis was carried out on 34 alignments, each of which comprised sequences belonging to at least four different mammalian orders. The rates of synonymous substituti...
Article
The “universal correlation” (D'Onofrio, G., Bernardi, G., 1992. A universal compositional correlation among codon positions. Gene 110, 81–88.) that holds between 〈GC3〉 and 〈GC1〉 or 〈GC2〉 (〈GC〉 values are the average values of the coding sequences of each genome analyzed) at both the inter- and intra-genomic level, was re-analyzed on a vastly larger...
Article
The discovery that the vertebrate genomes of warm-blooded vertebrates are mosaics of isochores, long DNA segments homogeneous in base composition, yet belonging to families covering a broad spectrum of GC levels, has led to two major observations. The first is that gene density is strikingly non-uniform in the genome of all vertebrates, gene concen...
Article
In this work, we have investigated the relationships between synonymous and nonsynonymous rates and base composition in coding sequences from Gramineae to analyze the factors underlying the variation in substitutional rates. We have shown that in these genes the rates of nucleotide divergence, both synonymous and nonsynonymous, are, to some extent,...
Article
Full-text available
We have analyzed the patterns of synonymous codon preferences of the nuclear genes of Plasmodium falciparum, a unicellular parasite characterized by an extremely GC-poor genome. When all genes are considered, codon usage is strongly biased toward A and T in third codon positions, as expected, but multivariate statistical analysis detects a major tr...
Article
A computer analysis of 946 human DNA sequences larger than 50 kb and representing about 118 Mb of DNA has led to the following observations. (i) Positive correlations hold between CpG levels and the GC levels of isochores and coding sequences, as expected from previous results. (ii) The correlation between CpG levels and the GC levels of pseudogene...
Chapter
This review briefly describes the compositional approach to the analysis of vertebrate genomes. This approach involves the study of distributions of, and correlations among, the base compositions (GC levels) of different parts of these genomes, such as exons, introns, third codon positions, flanking of genes, and long genomic sequences or fragments...
Article
Full-text available
In this work, we investigated (1) the compositional distributions of all available nuclear coding sequences (and of their three codon positions) of six dicots and four Gramineae; this considerably expanded our knowledge about the differences previously seen between these two groups of plants; (2) the compositional correlations of homologous genes f...
Article
We have analysed the levels of 5-methylcytosine (5mC) in DNAs from 42 vertebrates, and compiled, including data from literature, a table of genomic 5mC and GC levels (as well as the available c-values, i.e., the haploid genome sizes) of 87 species from all vertebrate classes. An analysis of the data indicates that (i) two positive correlations hold...
Article
Full-text available
5-Methylcytosine (5mC) levels were determined in compositional DNA fractions corresponding to different isochore families from the genomes of Xenopus, chicken, mouse and human, four vertebrates which show different isochore patterns. The results obtained indicate that: (i) positive correlations exist between the 5mC levels and the GC levels of isoc...
Article
Previous investigations indicated that synonymous and nonsynonymous substitution rates are correlated in mammalian genes. In the present work, this correlation has been studied at the intragenic level using a dataset of 48 orthologous genes from species belonging to at least four different mammalian orders. The results obtained show that the intrag...
