About
64
Publications
9,605
Reads
1,299
Citations
Publications (64)
In the present note, the genomic compositional rule widely known as 'Chargaff's 2nd parity rule' (asserting equimolarity of adenine and thymine, and of guanine and cytosine, within each of the two DNA strands) is considered in connection with Noether's theorem, which links symmetries with conservation laws in physics. In the case of the genome, the strict physical a...
We revisit the topic of human genome guanine-cytosine (G+C) content and adenine-thymine (A+T) content under neutral evolution. For this study, the “gold standard” de novo mutation data within the human genome is used to estimate the mutation rates, instead of using base substitution data between related species. We define the rates (coefficients) o...
We revisit the topic of human genome guanine-cytosine content under neutral evolution. For this study, de novo mutation data within humans are used to estimate the mutation rate, instead of base substitution data between related species. We then define a new measure of mutation bias which separates the de novo mutation counts from the backgroun...
This work builds on the results of the publication by W. Li, D. Thanos and A. Provata concerning the search for Erdos motifs in human DNA (Quantifying local randomness in human DNA and RNA sequences using Erdos motifs, 2018), and proceeds to search for these motifs in the genomes and chromosomes of other organisms as well....
Analysis of DNA composition at several length scales constitutes the bulk of many early studies aimed at unravelling the complexity of the organization and functionality of genomes. Dinucleotide relative abundances are considered an idiosyncratic feature of genomes, regarded as a ‘genomic signature’. Motivated by this finding, we introduce the ‘Gen...
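As an illustration of the dinucleotide relative abundances mentioned in the abstract above, here is a minimal Python sketch. It assumes the commonly used odds-ratio form, ρ_xy = f_xy / (f_x f_y), which the abstract itself does not spell out; the function name `dinucleotide_relative_abundance` is hypothetical and is not taken from the publication.

```python
from collections import Counter


def dinucleotide_relative_abundance(seq):
    """Return {dinucleotide: rho} with rho = f_xy / (f_x * f_y).

    Assumption: the plain odds-ratio definition on a single strand;
    the publication may use a different (e.g. strand-symmetrized) form.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        return {}
    mono = Counter(seq)                                # mononucleotide counts
    di = Counter(seq[i:i + 2] for i in range(n - 1))   # overlapping dinucleotides
    rho = {}
    for pair, count in di.items():
        x, y = pair
        f_xy = count / (n - 1)   # observed dinucleotide frequency
        f_x = mono[x] / n        # frequency of the first base
        f_y = mono[y] / n        # frequency of the second base
        rho[pair] = f_xy / (f_x * f_y)
    return rho
```

Values well above or below 1 would indicate over- or under-represented dinucleotides under this assumed definition.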
The observed frequency of the longest proper prefix, the longest proper suffix, and the longest infix of a word w in a given sequence x can be used for classifying w as avoided or overabundant. The definitions used for the expectation and deviation of w in this statistical model were described and biologically justified by Brendel et al. (1986) [1]...
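The prefix/suffix/infix model described in the abstract above can be sketched in Python as follows. This is a sketch under stated assumptions: the expectation is taken as E(w) = f(prefix) · f(suffix) / f(infix), and the deviation as (f(w) − E(w)) / max(√E(w), 1). Both forms are assumptions here, since the abstract does not give the formulas, and all function names are hypothetical.

```python
import math


def count_occurrences(x, w):
    """Count possibly overlapping occurrences of w in x."""
    if not w:
        return 0
    return sum(1 for i in range(len(x) - len(w) + 1) if x.startswith(w, i))


def expected_frequency(x, w):
    """Assumed form: E(w) = f(prefix) * f(suffix) / f(infix)."""
    if len(w) < 3:
        raise ValueError("w needs length >= 3 so that its infix is non-empty")
    prefix, suffix, infix = w[:-1], w[1:], w[1:-1]
    f_infix = count_occurrences(x, infix)
    if f_infix == 0:
        return 0.0
    return count_occurrences(x, prefix) * count_occurrences(x, suffix) / f_infix


def deviation(x, w):
    """Assumed normalisation: (f(w) - E(w)) / max(sqrt(E(w)), 1)."""
    e = expected_frequency(x, w)
    return (count_occurrences(x, w) - e) / max(math.sqrt(e), 1.0)
```

Under this sketch, a word would be classified as avoided when its deviation falls below a chosen negative threshold, and as overabundant when it exceeds a positive one.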
We discuss questions related to the ‘Benveniste Affair’, its consequences and broader issues in an attempt to understand homeopathy. Specifically, we address the following points:
1. The relationship between the experiments conducted by Benveniste, Montagnier, their collaborators and groups that independently tested their results, to ‘traditional’...
The observed frequency of the longest proper prefix, the longest proper suffix, and the longest infix of a word $w$ in a given sequence $x$ can be used for classifying $w$ as avoided or overabundant. The definitions used for the expectation and deviation of $w$ in this statistical model were described and biologically justified by Brendel et al. (J...
Background: The deviation of the observed frequency of a word w from its expected frequency in a given sequence x is used to determine whether or not the word is avoided. This concept is particularly useful in DNA linguistic analysis. The value of the deviation of w, denoted by \(\textit{dev}(w)\), effectively characterises the extent of a word by i...
The deviation of the observed frequency of a word w from its expected frequency in a given sequence x is used to determine whether or not the word is avoided. This concept is particularly useful in DNA linguistic analysis. The value of the standard deviation of w, denoted by \(\textit{std}(w)\), effectively characterises the extent of a word by its...
Strand biases reflect deviations from a null expectation of DNA evolution that assumes strand-symmetric substitution rates. Here, we present strong evidence that nearest-neighbour preferences are a strand-biased feature of bacterial genomes, indicating neighbour-dependent substitution asymmetries. To detect such asymmetries we introduce an alignmen...
Conserved non-coding elements (CNEs) are defined using various degrees of sequence identity and thresholds of minimal length. Their conservation frequently exceeds that observed for protein-coding sequences. We explored the chromosomal distribution of different classes of CNEs in the human genome. We employed two methodologies: the scal...
The deviation of the observed frequency of a word $w$ from its expected frequency in a given sequence $x$ is used to determine whether or not the word is avoided. This concept is particularly useful in DNA linguistic analysis. The value of the standard deviation of $w$, denoted by $std(w)$, effectively characterises the extent of a word by its edge...
Conserved, ultraconserved and other classes of constrained non-coding elements (referred to as CNEs) represent one of the mysteries of current comparative genomics. These elements are defined using various degrees of sequence similarity between organisms and several thresholds of minimal length and are often marked by extreme conservation that frequen...
This article provides an overview of the first BIOASQ challenge, a competition on large-scale biomedical semantic indexing and question answering (QA), which took place between March and September 2013. BIOASQ assesses the ability of systems to semantically index very large numbers of biomedical scientific articles, and to return concise and user-u...
The late Professor J.S. Nicolis always emphasized, both in his writings and in presentations and discussions with students and friends, the relevance of a dynamical systems approach to biology. In particular, viewing the genome as a "biological text" captures the dynamical character of both the evolution and function of the organisms in the form of...
The most common methods for investigating genomic sequence composition are based on the bag-of-words approach and thus largely ignore the original sequence structure or the relative positioning of its constituent oligonucleotides. We here present a novel methodology that takes into account both word representation and relative positioning at various lengt...
Conserved, ultraconserved and other classes of constrained elements (collectively referred to here as CNEs), identified by comparative genomics in a wide variety of genomes, are non-randomly distributed across chromosomes. These elements are defined using various degrees of conservation between organisms and several thresholds of minimal length. We he...
Repeats or Transposable Elements (TEs) are highly repeated sequence stretches, present in virtually all eukaryotic genomes. We explore the distribution of representative TEs from all major classes in entire chromosomes across various organisms. We employ two complementary approaches, the scaling of block entropy and box-counting. Both converge to t...
The Hox gene collinearity enigma has often been approached using models based on biomolecular mechanisms. The biophysical model is an alternative approach based on the hypothesis that collinearity is caused by physical forces pulling the Hox genes from a territory where they are inactive to a distinct spatial domain where they are activated in a st...
The healing potential and description of homeopathic remedies, as determined in homeopathic pathogenic trials (HPTs) and verified by medical experience, are often found to be meaningfully connected with the symbolic content attributed to the original materials (tinctures, metals etc) through tradition or modern semantics. Such a connection is incom...
This article provides an overview of BIOASQ, a new competition on biomedical semantic indexing and question answering (QA). BIOASQ aims to push towards systems that will allow biomedical workers to express their information needs in natural language and that will return concise and user-understandable answers by combining information from multiple...
The coding parts of DNA sequences are regarded as clusters of connected sites of a random Cantor-like set, while the non-coding parts are regarded as the empty regions of the same set. Under this representation, we find that higher eucaryotes are mapped on random Cantor sets with fractal dimension around 0.85, while lower organisms are mapped on Ca...
Statistical methods, including block entropy based approaches, have already been used in the study of long-range features of genomic sequences seen as symbol series, either considering the full alphabet of the four nucleotides or the binary purine or pyrimidine character set. Here we explore the alternation of short protein-coding segments with lon...
Large-scale features of the spatial arrangement of protein-coding segments (PCS) are investigated by means of the inter-PCS spacers' size distributions, which have been found to follow power-laws. Linearity in double-logarithmic scale extends to several orders of magnitude in the genomes of organisms as disparate as mammals, insects and plants. Thi...
The spatial distribution and clustering of repetitive elements, as well as their colocalization with other genomic components, have been extensively studied in recent years. Here we investigate the large-scale features of Alu and LINE1 spatial arrangement in the human genome by studying the size distribution of interrepeat distances. In most cases, we h...
Chargaff's second parity rule (PR2) states that complementary nucleotides occur at almost equal frequencies in single-stranded DNA. This is indeed the case for all bacterial and eukaryotic genomes studied, although the genomic patterns may differ among genomes in terms of local deviations. The behaviour of organellar genomes regarding the seco...
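A quick way to check the parity rule stated above on a single strand is to compare the counts of complementary bases. The following Python sketch uses normalised skews, (A − T)/(A + T) and (G − C)/(G + C), as the measure of deviation. This choice of measure and the function name `pr2_skews` are assumptions made for illustration, not the method of the publication.

```python
from collections import Counter


def pr2_skews(seq):
    """Return (AT skew, GC skew) for one strand.

    Values near 0 are consistent with PR2; a pair with zero
    total count is reported as 0.0 by convention.
    """
    counts = Counter(seq.upper())
    a, t = counts["A"], counts["T"]
    g, c = counts["G"], counts["C"]
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return at_skew, gc_skew
```

Computing these skews in sliding windows along a genome would expose the local deviations that the abstract refers to.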
The statistical properties of the size distribution of DNA segments separating identical oligonucleotides are studied. For representative eukaryotes (Homo sapiens, Mus musculus, Saccharomyces cerevisiae, Oryza sativa, Arabidopsis thaliana) we have demonstrated the existence of long-range correlations for the distances separating oligonucleotides of...
This work applies two recently formulated quantities, strongly correlated with the coding character of a sequence, as an additional "module" on GeneMark, in a three-criterion method. The difference between the statistical approaches underlying the methods combined here is expected to contribute to an efficient assignment of functionality to unannota...
Extensive work on n-tuplet occurrence in genomic sequences has revealed the correlation of their usage with sequence origin. Parallel to that, there exist different restrictions in the nucleotide composition of coding and noncoding sequences that may result in distinct modes of usage of n-tuplets. The relatively simple approaches described herein f...
The historical evolution of the traditional correspondences of planets and metals and of knowledge of the planetary arrangements is reviewed. The traditional geocentric sequence of the planets generates not only the sequence of days of the week, whose names are taken from the deities traditionally associated with the planets, but also a ranking by...
Deviations from Chargaff's 2nd parity rule, according to which A∼T and G∼C in single stranded DNA, have been associated with replication as well as with transcription in prokaryotes. Based on observations regarding mainly the transcription-replication co-linearity in a large number of prokaryotic species, we formulate the hypothesis that the replic...
The distribution of n-tuplet frequencies is shown to strongly correlate with functionality when examining a genomic sequence in a reading-frame specific manner. The approach described herein applies a coarse-graining procedure, which is able to reveal aspects of triplet usage that are related to protein coding, while at the same time remaining spec...
The general property of asymmetry in word use in meaningful texts written in a variety of languages motivates a quantification of the differences in the use of mutually symmetric triplets in genomic sequences. When this is done in the three reading frames, high values found for one of them are used as an indication that the sequence is coding for a p...
The deviation from randomness in the distribution of nucleotides in genomic sequences is quantified and studied, using a modified standard deviation (MSD). This method implies a "per block" computation of the standard deviation of the nucleotide frequencies of occurrence, using local means (means taken in a neighborhood of each block). This quantit...
Clustering and long-range correlations in the nucleotide sequences of different categories of organisms are discussed. Clustering, mostly observed in higher eucaryotes, can be found at different length scales in DNA and Central Limit Theorems are used as links between these length scales. Several dynamical, statistical, mean-field models are propos...
We present a model for genome evolution, comprising biologically plausible events such as transpositions inside the genome and insertions of exogenous sequences. This model attempts to formulate a minimal proposition accounting for key statistical properties of genomes, avoiding, as far as possible, unsupportable hypotheses for the remote evolution...
Turing's original reaction network is systematically studied, particularly in what concerns: (a) Its ability to produce patterns in a predictable way. (b) The feasibility of its concentration-independent sink term. Despite the widely accepted view that Turing's original model presents some inherent inability to produce regular structures, the patte...
Diffusing morphogens in cooperation can control gene expression in developing limbs. Additive cooperation corresponds to the Boolean operator OR and implies the equivalent action of the (suitably scaled) concentrations of two morphogens, either by their alternative binding to the same receptor or by another way of convergence of their effects durin...
A method quantifying the randomness of nucleotide sequences is developed, based on the introduction of a standard deviation type of quantity involving locally computed means and a length scale at which the clustering of nucleotides is assessed. It is pointed out that the value taken by this modified standard deviation may distinguish between co...
We study the size distribution of coding and non-coding regions in DNA sequences. For most organisms we observe that the size distribution $P_c(S)$ of the coding regions of size $S$ shows short range distribution, whereas the size distribution of the non-coding regions follows a power-law decay $P_{nc}(S) \sim S^{-1-\mu}$, with power exponents indicating clear lon...
We study the size distribution of purine and pyrimidine clusters in coding and non-coding DNA sequences. We observe that the cluster-size distribution $P(s)$ follows an exponential decay in coding sequences whereas it follows a power-law decay in non-coding sequences: $P(s) \sim s^{-1-\mu}$, with a power exponent $\mu$ = 1.5–1.8. The mean-square displacement $\sigma^2(m)$...
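The cluster-size distribution described in the abstract above can be obtained by mapping each base to its purine or pyrimidine class and measuring run lengths. The Python sketch below shows one way to do this; the function name `cluster_size_counts` is hypothetical, and skipping non-ACGT symbols is an assumption made here, not a detail from the publication.

```python
from collections import Counter
from itertools import groupby

PURINES = set("AG")
PYRIMIDINES = set("CT")


def cluster_size_counts(seq):
    """Count runs of consecutive purines (R) or pyrimidines (Y) by length.

    Returns a Counter mapping cluster size s to the number of clusters
    of that size. Symbols other than A, C, G, T are skipped (assumption).
    """
    classes = [
        "R" if base in PURINES else "Y"
        for base in seq.upper()
        if base in PURINES or base in PYRIMIDINES
    ]
    # groupby yields maximal runs of identical class labels
    sizes = [sum(1 for _ in run) for _, run in groupby(classes)]
    return Counter(sizes)
```

Dividing each count by the total number of clusters gives an empirical estimate of the distribution, which could then be plotted on semi-logarithmic and double-logarithmic axes to compare exponential and power-law behaviour.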
A method for measuring the inhomogeneous distribution of purines/pyrimidines in nucleotide sequences is developed. We show that this measure relates to the coding or non-coding character of the considered sequence. Coding sequences present a nearly random Pu or Py distribution. This property is shared by both protein-coding DNA and funct...
Pattern formation of the developing vertebrate limb is mainly controlled by the zone of polarizing activity (ZPA) and the apical ectodermal ridge (AER), which may act as sources of diffusing morphogens. These sources are tightly interconnected and maintained by positive feedback and, together with the established role of Wnt7a on the dorsal side of...
We describe two experiments on the regenerating forelimbs of the urodele Triturus cristatus. In the first, a contralateral grafting is performed where the anteroposterior axis of the regenerating blastema coincides with the dorsoventral axis of the host stump. In the second, the regenerating blastema is ipsilaterally rotated on the stump at angles...
The interaction between a symmetry-breaking bifurcation and a weak external field represents a possible mechanism of pattern selection. Here, the coupling of a reaction-diffusion system operating under periodic boundary conditions, with a circularly polarized electromagnetic field is analyzed near a wave-producing bifurcation. The system is studied...
A mechanism for the generation of the morphological left-right asymmetry in higher organisms is proposed, based on the idea that chirality at the molecular level is the primordial source for macroscopic asymmetry. This mechanism accounts for a variety of experimental results on artificial production of situs inversus and fits well with mutations in...
In 1968 Saunders & Gasseling discovered that a group of cells at the posterior margin of the chick wing bud has a remarkable morphogenetic potency: if transplanted to the anterior site of a host limb bud, a mirror image duplication of the normal pattern of digits develops (cf. Figure 3.1). This group of cells was labeled the ‘zone of polarizing act...
The minimal conditions for pattern formation are examined in reaction-diffusion systems and in the presence of cross-diffusion. It is found that for such systems self-organization properties appear even if the homogeneous reactions are very simple. A cell-cell contact mechanism, able to create cross-diffusion transport in biological systems is prop...
We have collected experimental data on pattern duplications due to ZPA transplantation or application of retinoic acid on the developing chick limb bud. We have compared these data with the predictions of models based on diffusion or autocatalysis of retinoids. It turns out that these models cannot comprehensively explain the data. More...
We investigate numerically the effect of a spatiotemporal forcing on travelling waves produced in a model reaction-diffusion system operating under periodic boundary conditions. It is shown that, for suitable parameter values, this external forcing can stabilize the progressive wave regime and impose its directionality through wave inversion. When...
It is pointed out that in a reaction-diffusion system near a chiral symmetry-breaking instability, a weak drift term in the diffusion law can induce a systematic selection between macroscopic three-dimensional patterns of opposite chirality. When the cause of the drift term is an external field, the orientation of the field with respect to the syst...
Some key experiments of artificial production of situs inversus viscerum are briefly reviewed and a two-step mechanism for the explanation of the systematic asymmetric visceral arrangement in vertebrates is proposed. A two-variable reaction-diffusion system displaying a symmetry-breaking bifurcation is considered, and it is demonstrated that a sligh...
The total syntheses of 2-methyl-3-p-tolylcyclopentenone (VII), of dihydrojasmone (XVIII) and of cis-jasmone (XXVI), as well as of further disubstituted cyclopentenones, are described.