About
17
Publications
918
Reads
61
Citations
Publications (17)
We propose an order index, ϕ, which quantifies the notion of “life at the edge of chaos” when applied to genome sequences. It maps genomes to a number from 0 (random and of infinite length) to 1 (fully ordered) and applies regardless of sequence length and base composition. The 786 complete genomic sequences in GenBank were found to have ϕ values i...
Category Le for coding and non-coding parts. Averages of p (fractional A/T-content) and Le for k = 7 (results for other values of k are similar) for the coding parts (solid symbols; ex for eukaryotes and gn for prokaryotes) and non-coding parts (hollow symbols; in for eukaryotes and ig for prokaryotes) of chromosomes. Symbols for categories are: vertebra...
Equivalent lengths of complete sequences (100 pp).
(0.36 MB PDF)
Distributions of χ² versus L and p. Each symbol gives the χ² for one chromosomal Le. Top panels, for genic (gn) and exon (ex) concatenates. Bottom panels, for intergenic (ig) and intron (in) concatenates. Symbols, with color, number of data in group, and number of data whose χ² is less than 10⁻³ given in brackets, stand for: diamond, gn (blue; 7100...
Le(k), k = 2 to 10, averaged over categories of organisms.
(0.06 MB PDF)
Le of sequences with highly biased compositions.
(0.06 MB PDF)
Effect of replication and segmental duplication on Le.
(0.04 MB PDF)
Results from the minimal RSD model. Top-left: Equi-χ² contour as a function of r and d, with L0 = 64 (bases); the length (L) of the generated model sequence is 2 Mb and only Le(k) results for k = 7 are used. Top-right: Le(k), k = 2, 4, 6, 8, 10 from 200 model sequences generated using the “best” parameters L0 = 64, = 1000 (b) and r = 0.73 (cumulative point mutat...
List of complete sequences included in the study (20 pp).
(0.13 MB PDF)
Segmental duplication is widely held to be an important mode of genome growth and evolution. Yet how this would affect the global structure of genomes has been little discussed.
Here, we show that equivalent length, or Le, a quantity determined by the variance of the fluctuating part of the distribution of the k-mer frequencies in a genome, character...
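As a rough illustration of the raw ingredient mentioned in this abstract, the sketch below counts overlapping k-mers in a sequence and computes the plain variance of those counts. It does not compute Le: the abstract is truncated before the definition, and the step that isolates the "fluctuating part" of the distribution is not reproduced here. The function names `kmer_counts` and `count_variance` are hypothetical and chosen only for this example.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_counts(seq, k):
    """Count overlapping k-mers in seq, including k-mers that never occur.

    Hypothetical helper for illustration; windows containing characters
    outside ACGT are ignored.
    """
    observed = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    all_kmers = ("".join(p) for p in product(ALPHABET, repeat=k))
    return {kmer: observed.get(kmer, 0) for kmer in all_kmers}


def count_variance(seq, k):
    """Population variance of the k-mer occurrence counts.

    This is the plain variance over all 4**k k-mers, not the equivalent
    length Le described in the abstract.
    """
    values = list(kmer_counts(seq, k).values())
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


if __name__ == "__main__":
    print(count_variance("ACGTACGTTTGACA", 2))
```

A sequence in which every k-mer occurs equally often gives a variance of zero, while a strongly repetitive sequence concentrates the counts on a few k-mers and gives a large variance.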
The cause of symmetry is usually subtle, and its study often leads to a deeper understanding of the bearer of the symmetry. To gain insight into the dynamics driving the growth and evolution of genomes, we conducted a comprehensive study of textual symmetries in 786 complete chromosomes. We focused on symmetry based on our belief that, in spite of...
We propose an order index, phi, which gives a quantitative measure of randomness and order of complete genomic sequences. It maps genomes to a number from 0 (random and of infinite length) to 1 (fully ordered) and applies regardless of sequence length. The 786 complete genomic sequences in GenBank were found to have phi values in a very narrow rang...
Background: Segmental duplication is widely held to be a dominant force driving genome growth and evolution.
Motivation: Modern genomes and their statistical properties reflect the series of steps by which they evolved. A very simple model for genome growth has been implemented and compared with other models from the literature and with properties of bacterial genomes. The model has four types of nucleotide and adjustable rates of mutation, deletion and insertion of single...
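A toy simulation in the spirit of this description might look like the sketch below. It is not the authors' model: the abstract is truncated, so the event definitions are assumptions made for illustration only. Here each step applies at most one event (a substitution, a deletion, or an insertion) at one randomly chosen position, with probabilities `p_mut`, `p_del` and `p_ins`. The function name `grow` and all parameter names are hypothetical.

```python
import random

ALPHABET = "ACGT"  # four types of nucleotide, as in the abstract


def grow(seq, steps, p_mut, p_del, p_ins, rng):
    """Apply `steps` random events to seq and return the resulting string.

    Assumption for illustration: each event touches a single position.
    With probability 1 - (p_mut + p_del + p_ins) a step does nothing.
    """
    if min(p_mut, p_del, p_ins) < 0 or p_mut + p_del + p_ins > 1:
        raise ValueError("rates must be non-negative and sum to at most 1")
    bases = list(seq)
    for _ in range(steps):
        u = rng.random()
        if u < p_mut:
            # Substitution; the new base may equal the old one.
            if bases:
                bases[rng.randrange(len(bases))] = rng.choice(ALPHABET)
        elif u < p_mut + p_del:
            # Deletion of one position, skipped if the sequence is empty.
            if bases:
                del bases[rng.randrange(len(bases))]
        elif u < p_mut + p_del + p_ins:
            # Insertion of one random base at a random position.
            bases.insert(rng.randrange(len(bases) + 1), rng.choice(ALPHABET))
    return "".join(bases)


if __name__ == "__main__":
    rng = random.Random(0)
    genome = grow("ACGT", 10_000, p_mut=0.3, p_del=0.2, p_ins=0.4, rng=rng)
    print(len(genome))
```

Passing an explicit `random.Random` instance keeps runs reproducible. With an insertion rate above the deletion rate the sequence grows on average, which is the regime a growth model is meant to explore.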