Published by MDPI

Online ISSN: 1099-4300


Figure 1. Plot of the redundancy rate versus D_2.
Figure 2. Inclusion of additional sequences breaks down the segregation observed by Gatlin.  
Figure 3. The logo of a number of sequences at the beginning of a gene. The start codon ATG is immediately apparent. The logo was constructed using the software at http://weblogo.threeplusone.com/.
Figure 6. A block diagram depicting the basic steps involved with a grammar-based compression scheme.
Data Compression Concepts and Algorithms and Their Applications to Bioinformatics

January 2010


305 Reads



Data compression at its base is concerned with how information is organized in data. Understanding this organization can lead to efficient ways of representing the information and hence data compression. In this paper we review the ways in which ideas and approaches fundamental to the theory and practice of data compression have been used in the area of bioinformatics. We look at how basic theoretical ideas from data compression, such as the notions of entropy, mutual information, and complexity have been used for analyzing biological sequences in order to discover hidden patterns, infer phylogenetic relationships between organisms and study viral populations. Finally, we look at how inferred grammars for biological sequences have been used to uncover structure in biological sequences.
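As a rough illustration of the entropy and mutual-information measures mentioned in this abstract, the following minimal sketch computes plug-in (frequency-count) estimates for a symbol sequence such as a DNA string. The function names and the choice of a fixed lag are assumptions made for illustration; they are not taken from the reviewed paper, and plug-in estimates are known to be biased for short sequences.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Zeroth-order empirical entropy of `seq`, in bits per symbol."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(seq, lag):
    """Empirical mutual information (bits) between symbols `lag` positions apart."""
    pairs = list(zip(seq, seq[lag:]))
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)    # marginal of the first symbol
    right = Counter(b for _, b in pairs)   # marginal of the second symbol
    return sum(
        (c / n) * math.log2(c * n / (left[a] * right[b]))
        for (a, b), c in joint.items()
    )


if __name__ == "__main__":
    dna = "ACACACACGTGTGTGT"  # toy sequence, not real data
    print(shannon_entropy(dna))
    print(mutual_information(dna, 1))
```

A uniform four-letter sequence gives 2 bits per symbol, and a strictly alternating sequence has high mutual information at lag 1 because each symbol determines the next.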

Figure 6. Schematic representations of a large loop of N bonds deforming with fluctuating cross-linking hydrogen bonds. The number of cross-links can range from 0 to ≈ N/4. Note that because all cross-links are independent for these flexible structures, the DCM prediction for the entropy reduction only depends on the number of cross-links. The exact formula for entropy reduction completely accounts for the location of the cross-links and all accessible atomic geometries consistent with the fixed topology.
Conformational Entropy of an Ideal Cross-Linking Polymer Chain

September 2008


97 Reads

We present a novel analytical method to calculate the conformational entropy of ideal cross-linking polymers from the configuration integral by employing a Mayer series expansion. Mayer-functions describing chemical bonds within the chain and for cross-links are sharply peaked over the temperature range of interest and are well approximated as statistically weighted Dirac delta-functions that enforce distance constraints. All geometrical deformations consistent with a set of distance constraints are integrated over. Exact results for a contiguous series of connected loops are employed to substantiate the validity of a previous phenomenological distance constraint model that describes protein thermodynamics successfully based on network rigidity.

Thermodynamic and Differential Entropy under a Change of Variables

March 2010


274 Reads

The differential Shannon entropy of information theory can change under a change of variables (coordinates), but the thermodynamic entropy of a physical system must be invariant under such a change. This difference is puzzling, because the Shannon and Gibbs entropies have the same functional form. We show that a canonical change of variables can, indeed, alter the spatial component of the thermodynamic entropy just as it alters the differential Shannon entropy. However, there is also a momentum part of the entropy, which turns out to undergo an equal and opposite change when the coordinates are transformed, so that the total thermodynamic entropy remains invariant. We furthermore show how one may correctly write the change in total entropy for an isothermal physical process in any set of spatial coordinates.
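The coordinate dependence of the differential entropy described in this abstract can be made concrete with the standard one-dimensional change-of-variables identity. This is a textbook sketch, not the paper's derivation; the symbols $g$, $p_X$, $p_Y$ and $a$ are introduced here for illustration. For an invertible, differentiable map $y = g(x)$, the densities are related by $p_Y(y) = p_X(x)\,|g'(x)|^{-1}$, so

$$
h(Y) = -\int p_Y(y)\,\ln p_Y(y)\,dy
     = -\int p_X(x)\,\ln\frac{p_X(x)}{|g'(x)|}\,dx
     = h(X) + \mathbb{E}\big[\ln |g'(X)|\big].
$$

For a simple rescaling $y = a x$ this gives $h(Y) = h(X) + \ln|a|$: changing the unit of length shifts the spatial differential entropy by a constant. According to the abstract, it is this kind of shift that the momentum part of the entropy compensates under a canonical transformation.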

Figure 4. Example results for test case 3. This re-plots one of the results from Figure 3 that was shown as magenta for the 1048576 random samples. Here, we can see the accuracy better using a semi-log scale.  
Best Probability Density Function for Random Sampled Data

December 2009


153 Reads

The maximum entropy method is a theoretically sound approach to construct an analytical form for the probability density function (pdf) given a sample of random events. In practice, numerical methods employed to determine the appropriate Lagrange multipliers associated with a set of moments are generally unstable in the presence of noise due to limited sampling. A robust method is presented that always returns the best pdf, where the tradeoff in smoothing a highly varying function due to noise can be controlled. An unconventional adaptive simulated annealing technique, called funnel diffusion, determines expansion coefficients for Chebyshev polynomials in the exponential function.

Enhanced Sampling in Molecular Dynamics Using Metadynamics, Replica-Exchange, and Temperature-Acceleration

January 2014


272 Reads

We review a selection of methods for performing enhanced sampling in molecular dynamics simulations. We consider methods based on collective variable biasing and on tempering, and offer both historical and contemporary perspectives. In collective-variable biasing, we first discuss methods stemming from thermodynamic integration that use mean force biasing, including the adaptive biasing force algorithm and temperature acceleration. We then turn to methods that use bias potentials, including umbrella sampling and metadynamics. We next consider parallel tempering and replica-exchange methods. We conclude with a brief presentation of some combination methods.
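To make the replica-exchange idea concrete, here is a minimal sketch of the standard Metropolis criterion for swapping configurations between two replicas held at different temperatures. The function names, the Boltzmann-constant units (kcal/(mol K)) and the example energies are assumptions chosen for illustration; the review itself may present the criterion in a different form.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K); assumed unit system


def swap_probability(energy_i, temp_i, energy_j, temp_j, k_b=K_B):
    """Metropolis acceptance probability for exchanging replicas i and j.

    Uses min(1, exp[(beta_i - beta_j) * (E_i - E_j)]) with beta = 1 / (k_b * T).
    """
    beta_i = 1.0 / (k_b * temp_i)
    beta_j = 1.0 / (k_b * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(energy_i, temp_i, energy_j, temp_j, rng=random):
    """Return True if the proposed exchange is accepted."""
    return rng.random() < swap_probability(energy_i, temp_i, energy_j, temp_j)


if __name__ == "__main__":
    # Cold replica (300 K) currently has the higher energy: the swap is always accepted.
    print(swap_probability(-90.0, 300.0, -100.0, 350.0))
    # Cold replica already has the lower energy: the swap is accepted only sometimes.
    print(swap_probability(-100.0, 300.0, -90.0, 350.0))
```

The criterion always accepts exchanges that move the lower-energy configuration to the colder replica, and accepts the reverse with a Boltzmann-weighted probability.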

Heat and Gravitation. I. The Action Principle

December 2008


159 Reads

This first article of a series formulates the thermodynamics of ideal gases in a constant gravitational field in terms of an action principle that is closely integrated with thermodynamics. The theory, in its simplest form, does not deviate from standard practice, but it lays the foundations for a more systematic approach to the various extensions, such as the incorporation of radiation, the consideration of mixtures and the integration with General Relativity. We study the interaction between an ideal gas and the photon gas, and propose a new approach to this problem. We study the propagation of sound in a vertical, isothermal column and are led to suggest that the theory is incomplete, and to ask whether the true equilibrium state of an ideal gas may turn out to be adiabatic, in which case the role of solar radiation is merely to compensate for the loss of energy by radiation into the cosmos. An experiment with a centrifuge is proposed, to determine the influence of gravitation on the equilibrium distribution with a very high degree of precision.

Ricci Curvature, Isoperimetry and a Non-additive Entropy

February 2015


76 Reads

Searching for the dynamical foundations of the Havrda-Charvát/Daróczy/Cressie-Read/Tsallis non-additive entropy, we come across a covariant quantity called, alternatively, a generalized Ricci curvature, an $N$-Ricci curvature or a Bakry-Émery-Ricci curvature in the configuration/phase space of a system. We explore some of the implications of this tensor and its associated curvature and present a connection with the non-additive entropy under investigation. We present an isoperimetric interpretation of the non-extensive parameter and comment on further features of the system that can be probed through this tensor.

FIG. 1: Entropies parametrized in the (c, d)-plane, with their associated distribution functions. BG entropy corresponds to (1, 1), Tsallis entropy to (c, 0), and entropies for stretched exponentials to (1, d > 0). Entropies leading to distribution functions with compact support, belong to equivalence class (1, 0). Figure from [3]. 
FIG. 2: Example for an auto-correlated random walk that persistently walks in the same direction for ∝ n^{1−α} steps (α = 0.5).
Generalized (c,d)-Entropy and Aging Random Walks

October 2013


193 Reads

Complex systems are often inherently non-ergodic and non-Markovian, and for such systems Shannon entropy loses its applicability. In particular, accelerating, path-dependent, and aging random walks offer an intuitive picture of these non-ergodic and non-Markovian systems. It was shown that the entropy of non-ergodic systems can still be derived from three of the Shannon-Khinchin axioms by violating the fourth, the so-called composition axiom. The corresponding entropy is of the form $S_{c,d} \sim \sum_i \Gamma(1+d,1-c\ln p_i)$ and depends on two system-specific scaling exponents, $c$ and $d$. This entropy contains many recently proposed entropy functionals as special cases, including Shannon and Tsallis entropy. It was shown that this entropy is relevant for a special class of non-Markovian random walks. In this work we generalize these walks to a much wider class of stochastic systems that can be characterized as 'aging' systems. These are systems whose transition rates between states are path- and time-dependent. We show that for particular aging walks $S_{c,d}$ is again the correct extensive entropy. Before the central part of the paper we review the concept of $(c,d)$-entropy in a self-contained way.
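As a quick check that the form of $S_{c,d}$ quoted in this abstract contains the Shannon (Boltzmann-Gibbs) entropy, consider the case $(c,d) = (1,1)$. The sketch below works only with the proportionality stated in the abstract and omits any normalization constants; it uses the upper incomplete gamma function $\Gamma(a,x) = \int_x^\infty t^{a-1} e^{-t}\,dt$, for which $\Gamma(2,x) = (1+x)\,e^{-x}$. Setting $x_i = 1 - \ln p_i$ gives

$$
\Gamma(2,\,1-\ln p_i) = (2 - \ln p_i)\,e^{-(1-\ln p_i)} = \frac{p_i\,(2 - \ln p_i)}{e},
$$

$$
\sum_i \Gamma(2,\,1-\ln p_i) = \frac{1}{e}\Big(2 - \sum_i p_i \ln p_i\Big) = \frac{2 + S_{\mathrm{BG}}}{e},
\qquad S_{\mathrm{BG}} = -\sum_i p_i \ln p_i ,
$$

where $\sum_i p_i = 1$ was used. Up to an additive and a multiplicative constant, the $(1,1)$ case is therefore the Shannon entropy.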

Metriplectic Algebra for Dissipative Fluids in Lagrangian Formulation

March 2015


105 Reads

The dynamics of dissipative fluids in Eulerian variables may be derived from an algebra of Leibniz brackets of observables, the metriplectic algebra, that extends the Poisson algebra of the frictionless limit of the system via a symmetric semidefinite component, encoding dissipative forces. The metriplectic algebra includes the conserved total Hamiltonian H, generating the non-dissipative part of dynamics, and the entropy S of those microscopic degrees of freedom draining energy irreversibly, which generates dissipation. This S is a Casimir invariant of the Poisson algebra to which the metriplectic algebra reduces in the frictionless limit. The role of S is as paramount as that of H, but this fact may be underestimated in the Eulerian formulation because S is not the only Casimir of the symplectic non-canonical part of the algebra. Instead, when the dynamics of the non-ideal fluid is written through the parcel variables of the Lagrangian formulation, the fact that entropy is symplectically invariant clearly appears to be related to its dependence on the microscopic degrees of freedom of the fluid, that are themselves in involution with the position and momentum of the parcel.

Entropic Forms and Related Algebras

February 2013


126 Reads

Starting from a very general trace-form entropy, we introduce a pair of algebraic structures endowed with a generalized sum and a generalized product. These algebras form, respectively, two Abelian fields in the realm of the complex numbers that are isomorphic to each other. We specify our results to several entropic forms related to distributions recurrently observed in social, economical, biological and physical systems including the stretched exponential, the power-law and the interpolating Bosons-Fermions distributions. Some potential applications in the study of complex systems are advanced.

Information Anatomy of Stochastic Equilibria

March 2014


65 Reads

A stochastic nonlinear dynamical system generates information, as measured by its entropy rate. Some (the ephemeral information) is dissipated and some (the bound information) is actively stored and so affects future behavior. We derive analytic expressions for the ephemeral and bound informations in the limit of small-time discretization for two classical systems that exhibit dynamical equilibria: first-order Langevin equations (i) where the drift is the gradient of a potential function and the diffusion matrix is invertible and (ii) with a linear drift term (Ornstein-Uhlenbeck) but a noninvertible diffusion matrix. In both cases, the bound information is sensitive only to the drift, while the ephemeral information is sensitive only to the diffusion matrix and not to the drift. Notably, this information anatomy changes discontinuously as any of the diffusion coefficients vanishes, indicating that it is very sensitive to the noise structure. We then calculate the information anatomy of the stochastic cusp catastrophe and of particles diffusing in a heat bath in the overdamped limit, both examples of stochastic gradient descent on a potential landscape. Finally, we use our methods to calculate and compare approximations for the so-called time-local predictive information for adaptive agents.

Complexity in Animal Communication: Estimating the Size of N-Gram Structures

August 2013


290 Reads

In this paper, new techniques that allow conditional entropy to estimate the combinatorics of symbols are applied to animal communication studies relying on information theory. By using the conditional entropy estimates at multiple orders, the paper estimates the total repertoire sizes for animal communication across bottlenose dolphins, humpback whales, and several species of birds for N-grams of one, two, and three combined units. How this can influence our estimates and ideas about the complexity of animal communication is also discussed.
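The conditional entropy at several orders, which this abstract builds on, can be sketched with a naive plug-in estimator over N-gram counts. This is only an illustration under stated assumptions: the function names are hypothetical, the estimator is the simple frequency-count one rather than the paper's technique, and it is biased when the sequence is short relative to the number of distinct N-grams.

```python
import math
from collections import Counter


def _entropy(counts):
    """Entropy in bits of the empirical distribution given by a Counter."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(seq, order):
    """Plug-in estimate of H(next unit | previous `order` units), in bits.

    Order 0 is the ordinary entropy of single units.
    """
    # Each window holds `order` context units followed by the next unit.
    windows = [tuple(seq[i:i + order + 1]) for i in range(len(seq) - order)]
    joint = Counter(windows)
    context = Counter(w[:-1] for w in windows)
    return _entropy(joint) - _entropy(context)


if __name__ == "__main__":
    calls = list("ABABABABCABABABC")  # toy unit sequence, not real data
    for k in range(3):
        print(k, conditional_entropy(calls, k))
```

A drop in conditional entropy as the order increases indicates that earlier units constrain later ones, which is the kind of N-gram structure the abstract refers to.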

Emergence of Animals from Heat Engines. Part 1. Before the Snowball Earths

November 2008


426 Reads

Previous studies modelled the origin of life and the emergence of photosynthesis on the early Earth (i.e., the origin of plants) in terms of biological heat engines that worked on thermal cycling caused by suspension in convecting water. In this new series of studies, heat engines using a more complex mechanism for thermal cycling are invoked to explain the origin of animals as well. Biological exploitation of the thermal gradient above a submarine hydrothermal vent is hypothesized, where a relaxation oscillation in the length of a protein 'thermotether' would have yielded the thermal cycling required for thermosynthesis. Such a thermal-transition-driven movement is not impeded by the low Reynolds number of a small scale. In the model the thermotether together with the protein export apparatus evolved into a 'flagellar proton pump' that turned into today's bacterial flagellar motor after the acquisition of the proton-pumping respiratory chain. The flagellar pump resembles Feynman's ratchet, and the 'flagellar computer' that implements chemotaxis a Turing machine: the stator would have functioned as Turing's paper tape and the stator's proton-transferring subunits with their variable conformation as the symbols on the tape. The existence of a cellular control centre in the cilium of the eukaryotic cell is proposed that would have evolved from the prokaryotic flagellar computer. Comment: Correction of text on Turing machines; Appendix added with journal reviewer's criticism, and reaction to that criticism; minor typos

The kappa-Generalizations of Stirling Approximation and Multinominal Coefficients

November 2005


100 Reads

Stirling approximation of the factorials and multinominal coefficients are generalized based on the one-parameter ($\kappa$) deformed functions introduced by Kaniadakis [Phys. Rev. E 66 (2002) 056125]. We have obtained the relation between the $\kappa$-generalized multinominal coefficients and the $\kappa$-entropy by introducing a new $\kappa$-product operation.

FIG. 1: Quantum circuit for implementing remote state preparation (RSP) of arbitrary two-qubit entangled states. $|M_{ij}\rangle_{14}$ denotes a two-qubit projective measurement on Qubits 1 and 4 under a set of complete orthogonal basis vectors $\{|M_{ij}\rangle_{14}\}$; $\hat{U}^{ij}_{25}$ denotes Alice's appropriate collective unitary transformation on bipartite (2,5); $\hat{U}_{36A}$ denotes Bob's collective three-qubit unitary transformation on his Qubits 3, 6 and A, and $\hat{U}^{ijrs}_{36}$ denotes Bob's appropriate
FIG. 3: Quantum circuit for implementing RSP of arbitrary three-qubit entangled states. $|M_{ijk}\rangle_{147}$ denotes a three-qubit projective measurement on Qubits 1, 4 and 7 under a set of complete orthogonal basis vectors $\{|M_{ijk}\rangle_{147}\}$; $\hat{U}^{ijk}_{258}$ denotes Alice's appropriate triplet collective unitary transformation on triplet (2,5,8); $\hat{U}_{369A}$ denotes Bob's collective four-qubit unitary transformation on his Qubits 3, 6, 9 and A, and $\hat{U}^{ijkrst}_{369}$ denotes Bob's appropriate
Generalized Remote Preparation of Arbitrary $m$-qubit Entangled States via Genuine Entanglements

March 2015


146 Reads

Herein, we present a feasible, general protocol for quantum communication within a network via generalized remote preparation of an arbitrary $m$-qubit entangled state designed with genuine tripartite Greenberger--Horne--Zeilinger-type entangled resources. During the implementations, we construct novel collective unitary operations; these operations are tasked with performing the necessary phase transfers during remote state preparations. We have distilled our implementation methods into a five-step procedure, which can be used to faithfully recover the desired state during transfer. Compared to previous existing schemes, our methodology features a greatly increased success probability. After the consumption of auxiliary qubits and the performance of collective unitary operations, the probability of successful state transfer is increased four-fold and eight-fold for arbitrary two- and three-qubit entanglements when compared to other methods within the literature, respectively. We conclude this paper with a discussion of the presented scheme for state preparation, including: success probabilities, reducibility and generalizability.

Figure 1. Autonomous vs. nonautonomous dynamics. Top: Autonomous evolution of a gas from a non-equilibrium state to an equilibrium state (Minus-First Law). Bottom: Nonautonomous evolution of a thermally isolated gas between two equilibrium states. The piston moves according to a pre-determined protocol specifying its position λ_t in time. The entropy
Fluctuation, Dissipation and the Arrow of Time

November 2011


173 Reads

The recent development of the theory of fluctuation relations has led to new insights into the ever-lasting question of how irreversible behavior emerges from time-reversal symmetric microscopic dynamics. We provide an introduction to fluctuation relations, examine their relation to dissipation and discuss their impact on the arrow of time question.

Genuine Tripartite Entanglement and Nonlocality in Bose-Einstein Condensates by Collective Atomic Recoil

November 2013


60 Reads

We study a system represented by a Bose-Einstein condensate interacting with a cavity field in the presence of a strong off-resonant pumping laser. This system can be described by a three-mode Gaussian state, where two are the atomic modes corresponding to atoms populating upper and lower momentum sidebands and the third mode describes the scattered cavity field light. We show that, as a consequence of the collective atomic recoil instability, these modes possess a genuine tripartite entanglement that increases unboundedly with the evolution time and is larger than the bipartite entanglement in any reduced two-mode bipartition. We further show that the state of the system exhibits genuine tripartite nonlocality, which can be revealed by a robust violation of the Svetlichny inequality when performing displaced parity measurements. Our exact results are obtained by exploiting the powerful machinery of phase-space informational measures for Gaussian states, which we briefly review in the opening sections of the paper.

FIGURE 1. A (3 × 3)-grid.
Unextendible Mutually Unbiased Bases (after Mandayam, Bandyopadhyay, Grassl and Wootters)

July 2014


44 Reads

We consider questions posed in a recent paper of Mandayam, Bandyopadhyay, Grassl and Wootters [10] on the nature of "unextendible mutually unbiased bases." We describe a conceptual framework to study these questions, using a connection proved by the author in [19] between the set of nonidentity generalized Pauli operators on the Hilbert space of $N$ $d$-level quantum systems, $d$ a prime, and the geometry of non-degenerate alternating bilinear forms of rank $N$ over finite fields $\mathbb{F}_d$. We then supply alternative and short proofs of results obtained in [10], as well as new general bounds for the problems considered in loc. cit. In this setting, we also solve Conjecture 1 of [10], and speculate on variations of this conjecture.

Low-Temperature Behaviour of Social and Economic Networks

June 2006


87 Reads

Real-world social and economic networks typically display a number of particular topological properties, such as a giant connected component, a broad degree distribution, the small-world property and the presence of communities of densely interconnected nodes. Several models, including ensembles of networks also known in social science as Exponential Random Graphs, have been proposed with the aim of reproducing each of these properties in isolation. Here we define a generalized ensemble of graphs by introducing the concept of graph temperature, controlling the degree of topological optimization of a network. We consider the temperature-dependent version of both existing and novel models and show that all the aforementioned topological properties can be simultaneously understood as the natural outcomes of an optimized, low-temperature topology. We also show that seemingly different graph models, as well as techniques used to extract information from real networks, are all found to be particular low-temperature cases of the same generalized formalism. One such technique allows us to extend our approach to real weighted networks. Our results suggest that a low graph temperature might be a ubiquitous property of real socio-economic networks, placing conditions on the diffusion of information across these systems.

Corrections to Bekenstein-Hawking Entropy - Quantum or not-so Quantum?

December 2010


120 Reads

Hawking radiation and Bekenstein-Hawking entropy are the two robust predictions of an as yet unknown quantum theory of gravity. Any theory which fails to reproduce these predictions is certainly incorrect. While several approaches lead to Bekenstein-Hawking entropy, they all lead to different sub-leading corrections. In this article, we ask a question that is relevant for any approach: Using simple techniques, can we know whether an approach contains quantum or semi-classical degrees of freedom? Using naive dimensional analysis, we show that the semi-classical black-hole entropy has the same dimensional dependence as the gravity action. Among other things, this provides a plausible explanation for the connection between Einstein's equations and the thermodynamic equation of state, and indicates that the quantum corrections should have a different scaling behavior.

Fact-Checking Ziegler's Maximum Entropy Production Principle beyond the Linear Regime and towards Steady States

July 2013


158 Reads

We challenge claims that the principle of maximum entropy production produces physical phenomenological relations between conjugate currents and forces, even beyond the linear regime, and that currents in networks arrange themselves to maximize entropy production as the system approaches the steady state. In particular: (1) we show that Ziegler's principle of thermodynamic orthogonality leads to stringent reciprocal relations for higher order response coefficients, and in the framework of stochastic thermodynamics, we exhibit a simple explicit model that does not satisfy them; (2) on a network, enforcing Kirchhoff's current law, we show that maximization of the entropy production prescribes reciprocal relations between coarse-grained observables, but is not responsible for the onset of the steady state, which is rather due to the minimum entropy production principle.

Fig. 4. Plot of bounds in a "P_e vs. H(T|Y)" diagram.
Fig. 5. Plot of bounds in a "P_E vs. H(T|Y)" diagram.
A New Approach of Deriving Bounds between Entropy and Error from Joint Distribution: Case Study for Binary Classifications

March 2013


77 Reads

The existing upper and lower bounds between entropy and error are mostly derived by means of inequalities, without linking them to joint distributions. In fact, from either a theoretical or an application viewpoint, there is a need for a complete set of interpretations of the bounds in relation to joint distributions. For this reason, in this work we propose a new approach to deriving the bounds between entropy and error from a joint distribution. The specific case study is on binary classifications, which can justify the need for the proposed approach. Two basic types of classification errors are investigated, namely, the Bayesian and non-Bayesian errors. For both errors, we derive closed-form expressions for the upper bound and lower bound in relation to joint distributions. The solutions show that Fano's lower bound is an exact bound for any type of error in a relation diagram of "Error Probability vs. Conditional Entropy". A new upper bound for the Bayesian error is derived with respect to the minimum prior probability, which is generally tighter than Kovalevskij's upper bound.
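For a binary target T, Fano's inequality reduces to H(T|Y) ≤ H_b(P_e), where H_b is the binary entropy function and P_e is the error probability. The sketch below checks this numerically for a small joint distribution, using the Bayes (maximum a posteriori) error. The function names and the example joint table are assumptions made for illustration; this is not the closed-form derivation from the paper.

```python
import math


def binary_entropy(p):
    """Binary entropy H_b(p) in bits, with H_b(0) = H_b(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def conditional_entropy(joint):
    """H(T|Y) in bits, where joint[t][y] = P(T = t, Y = y)."""
    h = 0.0
    for y in range(len(joint[0])):
        p_y = sum(row[y] for row in joint)
        for row in joint:
            if row[y] > 0.0:
                h -= row[y] * math.log2(row[y] / p_y)
    return h


def bayes_error(joint):
    """Error probability of the maximum a posteriori decision rule."""
    n_y = len(joint[0])
    return 1.0 - sum(max(row[y] for row in joint) for y in range(n_y))


if __name__ == "__main__":
    # Hypothetical joint distribution over (T, Y), both binary.
    joint = [[0.45, 0.05],
             [0.25, 0.25]]
    p_e = bayes_error(joint)
    print(p_e, conditional_entropy(joint), binary_entropy(p_e))
```

For the symmetric table `[[0.4, 0.1], [0.1, 0.4]]` the two sides coincide, which is consistent with the abstract's remark that Fano's bound is attained.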

Minimum and Maximum Entropy Distributions for Binary Systems with Known Means and Pairwise Correlations

September 2017


205 Reads

Maximum entropy models are increasingly being used to describe the collective activity of neural populations with measured mean neural activities and pairwise correlations, but the full space of probability distributions consistent with these constraints has not been explored. We provide upper and lower bounds on the entropy for the minimum entropy distribution over arbitrarily large collections of binary units with any fixed set of mean values and pairwise correlations. We also construct specific low-entropy distributions for several relevant cases. Surprisingly, the minimum entropy solution has entropy scaling logarithmically with system size for any set of first- and second-order statistics consistent with arbitrarily large systems. We further demonstrate that some sets of these low-order statistics can only be realized by small systems. Our results show how only small amounts of randomness are needed to mimic low-order statistical properties of highly entropic distributions, and we discuss some applications for engineered and biological information transmission systems.

Inquiries into the Nature of Free Energy and Entropy in Respect to Biochemical Thermodynamics

April 2000


923 Reads

Free energy and entropy are examined in detail from the standpoint of classical thermodynamics. The approach is logically based on the fact that thermodynamic work is mediated by thermal energy through the tendency for nonthermal energy to convert spontaneously into thermal energy and for thermal energy to distribute spontaneously and uniformly within the accessible space. The fact that free energy is a Second-Law, expendable energy that makes it possible for thermodynamic work to be done at finite rates is emphasized. Entropy, as originally defined, is pointed out to be the capacity factor for thermal energy that is hidden with respect to temperature; it serves to evaluate the practical quality of thermal energy and to account for changes in the amounts of latent thermal energies in systems maintained at constant temperature. A major objective was to clarify the means by which free energy is transferred and conserved in sequences of biological reactions coupled by freely diffusible intermediates. In achieving this objective it was found necessary to distinguish between a 'characteristic free energy' possessed by all First-Law energies in amounts equivalent to the amounts of the energies themselves and a 'free energy of concentration' that is intrinsically mechanical and relatively elusive in that it can appear to be free of First-Law energy. The findings in this regard serve to clarify the fact that the transfer of chemical potential energy from one repository to another along sequences of biological reactions of the above sort occurs through transfer of the First-Law energy as thermal energy and transfer of the Second-Law energy as free energy of concentration. Comment: 18-page PDF; major correction in APPENDIX; minor corrections elsewhere

Figure 1: Geodesics of the Poincaré half-plane
Figure 2: One step of GIGO update
Figure 3: Median number of function calls to reach 10^{−8} fitness on 24 runs for: Sphere function, Cigar-tablet function and Rosenbrock function. Initial position θ_0 = N(x_0, I), with x_0 uniformly distributed on the circle of center 0 and radius 10. We recall that the "CMA-ES" algorithm here is using the so-called pure rank-µ CMA update.
Figure 6: Trajectories of GIGO, CMA and xNES optimizing x ↦ x^2 in dimension 1 with δt = 0.5, sample size 5000, weights w_i = 4·1_{i ≤ 1250}, and learning rates η_µ = 1, η_Σ = 1.8. One dot every 2 steps. Stronger differences. Notice that after one step, the lowest mean is still GIGO (∼ 8.5, whereas xNES is around 8.75), but from the second step, GIGO has the highest mean because of the lower variance.
Figure 7: Trajectories of GIGO, CMA and xNES optimizing x ↦ x^2 in dimension 1 with δt = 1, sample size 5000, weights w_i = 4·1_{i ≤ 1250}, and learning rates η_µ = 1, η_Σ = 1.8. One dot per step. The CMA-ES algorithm fails here, because at the fourth step, the covariance matrix is not positive definite anymore (it is easy to see that the CMA-ES update is always defined if δt η_Σ < 1, but this is not the case here). Also notice (see also Proposition 6.2) that at the first step, GIGO decreases the variance, whereas the σ-component of the IGO speed is positive.
Black-Box Optimization Using Geodesics in Statistical Manifolds

September 2013


606 Reads

Information geometric optimization (IGO) is a general framework for stochastic optimization problems aiming at limiting the influence of arbitrary parametrization choices. The initial problem is transformed into the optimization of a smooth function on a Riemannian manifold, defining a parametrization-invariant first order differential equation. However, in practice, it is necessary to discretize time, and then, parametrization invariance holds only at first order in the step size. We define the Geodesic IGO update (GIGO), which uses the Riemannian manifold structure to obtain an update entirely independent from the parametrization of the manifold. We test it with classical objective functions. Thanks to Noether's theorem from classical mechanics, we find a reasonable way to compute the geodesics of the statistical manifold of Gaussian distributions, and thus the corresponding GIGO update. We then compare GIGO, pure rank-$\mu$ CMA-ES and xNES (two previous algorithms that can be recovered by the IGO framework), and show that while the GIGO and xNES updates coincide when the mean is fixed, they are different in general, contrary to previous intuition. We then define a new algorithm (Blockwise GIGO) that recovers the xNES update from abstract principles.

FIG. 1. The characteristic data for a (vacuum) spherically symmetric isolated horizon corresponds to Reissner-Nordström data on ∆, and free radiation data on the transversal null surface with suitable fall-off conditions. For each mass, charge, and radiation data in the transverse null surface there is a unique solution of Einstein-Maxwell equations locally in a portion of the past domain of dependence of the null surfaces. This defines the phase space of Type I isolated horizons in Einstein-Maxwell theory. The picture shows two Cauchy surfaces M_1 and M_2 "meeting" at space-like infinity i^0. A portion of I^+ and I^− are shown; however, no reference to future time-like infinity i^+ is made as the isolated horizon need not coincide with the black hole event horizon.
FIG. 2. The value of the Immirzi parameter β_k as a function of k ∈ N for the first few integers. The value β_1 = 0.172217... is exact as well as the asymptotic value β_∞ = 0.343599.... The other points have been computed using (109) which is only valid in the large k limit.
Static Isolated Horizons: SU(2) Invariant Phase Space, Quantization, and Black Hole Entropy

November 2010


50 Reads

We study the classical field theoretical formulation of static generic isolated horizons in a manifestly SU(2) invariant formulation. We show that the usual classical description requires revision in the non-static case due to the breaking of diffeomorphism invariance at the horizon, which leads to the non-conservation of the usual pre-symplectic structure. We argue how this difficulty could be avoided by a simple enlargement of the field content at the horizon that restores diffeomorphism invariance. Restricting our attention to static isolated horizons, we study the effective theories describing the boundary degrees of freedom. A quantization of the horizon degrees of freedom is proposed. By defining a statistical mechanical ensemble where only the area A of the horizon is fixed macroscopically (states with fluctuations away from spherical symmetry are allowed), we show that it is possible to obtain agreement with Hawking's area law, S = A/4 in Planck units, without fixing the Immirzi parameter to any particular value: consistency with the area law only imposes a relationship between the Immirzi parameter and the level of the Chern-Simons theory involved in the effective description of the horizon degrees of freedom.

Black Hole Horizons and Thermodynamics: A Quantum Approach

July 2005


37 Reads

We focus on quantization of the metric of a black hole restricted to the Killing horizon with universal radius $r_0$. After imposing spherical symmetry and after restriction to the Killing horizon, the metric is quantized employing the chiral currents formalism. Two "components of the metric" are indeed quantized: the former behaves as an affine scalar field under changes of coordinates, the latter is instead a proper scalar field. The action of the symplectic group on both fields is realized in terms of certain horizon diffeomorphisms. Depending on the choice of the vacuum state, such a representation is unitary. If the reference state of the scalar field is a coherent state rather than a vacuum, spontaneous breaking of conformal symmetry arises and the state contains a Bose-Einstein condensate. In this case the order parameter fixes the actual size of the black hole with respect to $r_0$. Both the constructed state and the one associated with the affine scalar are thermal (KMS) states with respect to Schwarzschild Killing time when restricted to half of the horizon. The value of the order parameter fixes the temperature at the Hawking value as well. As a result, it is found that the quantum energy and entropy densities coincide with the black hole mass and entropy, provided the universal parameter $r_0$ is suitably chosen; in particular, it does not depend on the size of the actual black hole.

Geometric Thermodynamics: Black Holes and the Meaning of the Scalar Curvature

December 2014


456 Reads

In this paper we show that the vanishing of the scalar curvature of Ruppeiner-like metrics does not characterize the ideal gas. Furthermore, we claim through an example that flatness is not a sufficient condition to establish the absence of interactions in the underlying microscopic model of a thermodynamic system, which poses a limitation on the usefulness of Ruppeiner’s metric and conjecture. Finally, we address the problem of the choice of coordinates in black hole thermodynamics. We propose an alternative energy representation for Kerr-Newman black holes that mimics fully Weinhold’s approach. The corresponding Ruppeiner’s metrics become degenerate only at absolute zero and have non-vanishing scalar curvatures.

Deformed Density Matrix and Quantum Entropy of the Black Hole

April 2006


44 Reads

In the present work the approach - density matrix deformation - earlier developed by the author to study a quantum theory of the Early Universe (Planck's scales) is applied to study a quantum theory of black holes. On this basis the author investigates the information paradox problem, entropy of the black hole remainders after evaporation, and consistency with the holographic principle. The possibility for application of the proposed approach to the calculation of quantum entropy of a black hole is considered.

Effective Conformal Descriptions of Black Hole Entropy

July 2011


38 Reads

It is no longer considered surprising that black holes have temperatures and entropies. What remains surprising, though, is the universality of these thermodynamic properties: their exceptionally simple and general form, and the fact that they can be derived from many very different descriptions of the underlying microscopic degrees of freedom. I review the proposal that this universality arises from an approximate conformal symmetry, which permits an effective "conformal dual" description that is largely independent of the microscopic details.

Bootstrap Methods for the Empirical Study of Decision-Making and Information Flows in Social Systems

June 2013


120 Reads

We characterize the statistical bootstrap for the estimation of information-theoretic quantities from data, with particular reference to its use in the study of large-scale social phenomena. Our methods allow one to preserve, approximately, the underlying axiomatic relationships of information theory---in particular, consistency under arbitrary coarse-graining---that motivate use of these quantities in the first place, while providing reliability comparable to the state of the art for Bayesian estimators. We show how information-theoretic quantities allow for rigorous empirical study of the decision-making capacities of rational agents and the time-asymmetric flows of information in distributed systems. We provide illustrative examples by reference to ongoing collaborative work on the semantic structure of the British Criminal Court system and the conflict dynamics of the contemporary Afghanistan insurgency.

Analysis of Time Reversible Born-Oppenheimer Molecular Dynamics

June 2013


47 Reads

We analyze the time reversible Born-Oppenheimer molecular dynamics (TRBOMD) scheme, which preserves the time reversibility of the Born-Oppenheimer molecular dynamics even with non-convergent self-consistent field iteration. In the linear response regime, we derive the stability condition as well as the accuracy of TRBOMD for computing physical properties such as the phonon frequency obtained from the molecular dynamic simulation. We connect and compare TRBOMD with the Car-Parrinello molecular dynamics in terms of accuracy and stability. We further discuss the accuracy of TRBOMD beyond the linear response regime for non-equilibrium dynamics of nuclei. Our results are demonstrated through numerical experiments using a simplified one dimensional model for Kohn-Sham density functional theory.

Radiation Entropy Bound from the Second Law of Thermodynamics

July 2008


65 Reads

It has been suggested heuristically by Unruh and Wald, and independently by Page, that at given energy and volume, thermal radiation has the largest entropy. The suggestion leads to the corresponding universal bound on entropy of physical systems. Using a gedanken experiment we show that the bound follows from the Second Law of Thermodynamics if the CPT symmetry is assumed and a general condition on matter holds. The experiment suggests that a wide class of Lorentz invariant local quantum field theories obeys a bound on the density of states.

Entropy Bounds, Holographic Principle and Uncertainty Relation

September 1999


53 Reads

A simple derivation of the bound on entropy is given and the holographic principle is discussed. We estimate the number of quantum states inside a region of space on the basis of the uncertainty relation. The result is compared with the Bekenstein formula for the entropy bound, which was initially derived from the generalized second law of thermodynamics for black holes. The holographic principle states that the entropy inside a region is bounded by the area of the boundary of that region. This principle can be called the kinematical holographic principle. We argue that it can be derived from the dynamical holographic principle, which states that the dynamics of a system in a region should be described by a system which lives on the boundary of the region. This last principle can be valid in general relativity because the ADM Hamiltonian reduces to the surface term.

Figure 3: Probability density consisting of two components: the target density and outlier density. The gap between two components is the 1 − µ quantile of the length B_µ.
Figure 5: Plots of the maximum norms and the worst-case test errors. The top (bottom) panels show the results for a Gaussian (linear) kernel. Red points mean the top 50 percent of values, and the asterisk (*) is the point that violates the inequality ν − µ ≤ 2(r − 2µ).
Breakdown Point of Robust Support Vector Machine

September 2014


190 Reads

The support vector machine (SVM) is one of the most successful learning methods for solving classification problems. Despite its popularity, SVM has a serious drawback, namely its sensitivity to outliers in training samples. The penalty on misclassification is defined by a convex loss called the hinge loss, and the unboundedness of the convex loss causes the sensitivity to outliers. To deal with outliers, robust variants of SVM have been proposed, such as the robust outlier detection algorithm and an SVM with a bounded loss called the ramp loss. In this paper, we propose a robust variant of SVM and investigate its robustness in terms of the breakdown point. The breakdown point is a robustness measure that is the largest amount of contamination such that the estimated classifier still gives information about the non-contaminated data. The main contribution of this paper is to show an exact evaluation of the breakdown point for the robust SVM. For learning parameters such as the regularization parameter in our algorithm, we derive a simple formula that guarantees the robustness of the classifier. When the learning parameters are determined with a grid search using cross validation, our formula works to reduce the number of candidate search points. The robustness of the proposed method is confirmed in numerical experiments. We show that the statistical properties of the robust SVM are well explained by a theoretical analysis of the breakdown point.
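
The contrast between the unbounded hinge loss and a bounded ramp loss mentioned in this abstract can be illustrated with a minimal sketch. This is not the robust SVM proposed in the paper; the function names and the clipping level `cap=2.0` are assumptions chosen for illustration, and the margin is taken to be m = y * f(x).

```python
def hinge_loss(margin):
    """Convex, unbounded hinge loss max(0, 1 - m), with m = y * f(x)."""
    return max(0.0, 1.0 - margin)


def ramp_loss(margin, cap=2.0):
    """Hinge loss clipped at `cap` (an assumed value), so it is bounded."""
    return min(cap, hinge_loss(margin))


if __name__ == "__main__":
    # A badly misclassified outlier (very negative margin) dominates the
    # hinge loss but contributes at most `cap` to the ramp loss.
    for m in (2.0, 0.5, -1.0, -10.0):
        print(m, hinge_loss(m), ramp_loss(m))
```

Because the ramp loss saturates, a single grossly mislabeled sample cannot contribute arbitrarily much to the objective, which is the intuition behind bounded-loss variants.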

A Kernel-Based Calculation of Information on a Metric Space

May 2014


42 Reads

Kernel density estimation is a technique for approximating probability distributions. Here, it is applied to the calculation of mutual information on a metric space. This is motivated by the problem in neuroscience of calculating the mutual information between stimuli and spiking responses; the space of these responses is a metric space. It is shown that kernel density estimation on a metric space resembles the k-nearest-neighbor approach. This approach is applied to a toy dataset designed to mimic electrophysiological data.

Figure 1. Density of states of the pendulum in reduced units
Figure 2. Kinetic energy U kin as a function of energy U
Figure 3. Free energy of the pendulum
Figure 4. Configurational free energy of the pendulum
On the Thermodynamics of Classical Micro-Canonical Systems

September 2010


381 Reads

We study the configurational probability distribution of a mono-atomic gas with a finite number of particles N in the micro-canonical ensemble. We give two arguments why the thermodynamic entropy of the configurational subsystem involves Renyi's entropy function rather than that of Tsallis. The first argument is that the temperature of the configurational subsystem is equal to that of the kinetic subsystem. The second argument is that the instability of the pendulum, which occurs for energies close to the rotation threshold, is correctly reproduced.

Information-Geometric Markov Chain Monte Carlo Methods Using Diffusions

March 2014


100 Reads

Recent work incorporating geometric ideas in Markov chain Monte Carlo is reviewed in order to highlight these advances and their possible application in a range of domains beyond Statistics. A full exposition of Markov chains and their use in Monte Carlo simulation for Statistical inference and molecular dynamics is provided, with particular emphasis on methods based on Langevin diffusions. After this, geometric concepts in Markov chain Monte Carlo are introduced. A full derivation of the Langevin diffusion on a Riemannian manifold is given, together with a discussion of appropriate Riemannian metric choice for different problems. A survey of applications is provided, and some open questions are discussed.

Measures of Causality in Complex Datasets with Application to Financial Data

January 2014


311 Reads

This article investigates the causality structure of financial time series. We concentrate on three main approaches to measuring causality: linear Granger causality, kernel generalisations of Granger causality (based on ridge regression and the Hilbert-Schmidt norm of the cross-covariance operator) and transfer entropy, examining each method and comparing their theoretical properties, with special attention given to the ability to capture nonlinear causality. We also analyse the theoretical benefits of applying non-symmetrical measures rather than symmetrical measures of dependence. We applied the measures to a range of simulated and real data. The simulated data sets have been generated with linear and several types of nonlinear dependence, using bivariate as well as multivariate settings. Application to real-world financial data highlights the practical difficulties as well as the potential of the methods. We use two sets of real data: (1) US inflation and 1 month Libor, (2) S&P data and exchange rates for the following currencies: AUDJPY, CADJPY, NZDJPY, AUDCHF, CADCHF, NZDCHF. Overall, we reached the conclusion that no single method can be recognised as the best in all circumstances and each of the methods has its domain of best applicability. We also describe the areas for improvement and future research.

Causality is an effect

December 2000


18 Reads

Using symmetric boundary conditions at separated times, I show analytically that both the time ordering of (macroscopic) causality and the direction of entropy increase follow from these boundary conditions. In particular, when the endpoints have low entropy, these arrows of time point away from the ends and toward the middle. Causality in this context means that when perturbations are applied, the effect of the perturbation---the macroscopic change in the system's behavior---is confined to one temporal side of the perturbations. These results hold for both mixing and integrable systems, although relaxation for integrable systems is incomplete. Simulations are presented for purposes of illustration.

FIG. 3: Direct inference of couplings in the repressilator system. We consider the repressilator dynamics modeled by Equation (21) with the parameters n = 2, α0 = 0, α = 10 and β = 100. The system possesses a single stable equilibrium. Given L (number of samples), σ² (variation of perturbation) and ∆t (lag time), we apply stochastic perturbations, as described in Section III, and obtain time series of the perturbations {ξ} and responses {η}. The time series are then converted into {xt} according to Equation (23), and direct couplings are inferred via the aggregative discovery and progressive removal algorithms (see Section II D for details). (a,b) False positive ε+ and false negative ε− as a function of L for three different values of ∆t. Here, ε+ and ε− are defined in Equation (27). (c,d) ε+ and ε− as a function of 1/∆t for three different values of L. In all panels, we set σ = 10⁻², and each data point is an average over 100 independent runs.
Identifying Coupling Structure in Complex Systems through the Optimal Causation Entropy Principle

November 2014


141 Reads

Inferring the coupling structure of complex systems from time series data in general by means of statistical and information-theoretic techniques is a challenging problem in applied science. The reliability of statistical inferences requires the construction of suitable information-theoretic measures that take into account both direct and indirect influences, manifest in the form of information flows, between the components within the system. In this work, we present an application of the optimal causation entropy (oCSE) principle to identify the coupling structure of a synthetic biological system, the repressilator. Specifically, when the system reaches an equilibrium state, we use a stochastic perturbation approach to extract time series data that approximate a linear stochastic process. Then, we present and jointly apply the aggregative discovery and progressive removal algorithms based on the oCSE principle to infer the coupling structure of the system from the measured data. Finally, we show that the success rate of our coupling inferences not only improves with the amount of available data, but it also increases with a higher frequency of sampling and is especially immune to false positives.

FIG. 1: Plot of the magnetisation of the one-dimensional Blume-Emery-Griffiths model as a function of the external applied field at constant temperature 1/T = 20. The values of the constants of H1 (68) are K = 0, J = −1 and ∆ = 0; 0.5; 1 for the dotted, the solid, the dashed line respectively.
Maximum Entropy Estimation of Transition Probabilities of Reversible Markov Chains

October 2009


235 Reads

In this paper, we develop a general theory for the estimation of the transition probabilities of reversible Markov chains using the maximum entropy principle. A broad range of physical models can be studied within this approach. We use one-dimensional classical spin systems to illustrate the theoretical ideas. The examples studied in this paper are: the Ising model, the Potts model and the Blume-Emery-Griffiths model. © 2009 by the authors; licensee Molecular Diversity Preservation International, Basel, Switzerland.

Figure 1: A 3D wireframe rendering of two 3 × 3 × 10 worlds; the transparent brown cubes are earth blocks, the blue cubes agents. In the left scenario the agent is controlled by empowerment, and has produced a stair-like structure after ca. 100 time steps, which allows the agent to access the higher parts of the environment. The right scenario features an agent that chooses actions uniformly at random for comparison, and we see that the initial configuration of earth blocks in the lower 5 levels is nearly unmodified.
Changing the Environment Based on Empowerment as Intrinsic Motivation

June 2014


429 Reads

One aspect of intelligence is the ability to restructure your own environment so that the world you live in becomes more beneficial to you. In this paper we investigate how the information-theoretic measure of agent empowerment can provide a task-independent, intrinsic motivation to restructure the world. We show how changes in embodiment and in the environment change the resulting behaviour of the agent and the artefacts left in the world. For this purpose, we introduce an approximation of the established empowerment formalism based on sparse sampling, which is simpler and significantly faster to compute for deterministic dynamics. Sparse sampling also introduces a degree of randomness into the decision making process, which turns out to be beneficial in some cases. We then utilize the measure to generate agent behaviour for different agent embodiments in a Minecraft-inspired three dimensional block world. The paradigmatic results demonstrate that empowerment can be used as a suitable generic intrinsic motivation to not only generate actions in given static environments, as shown in the past, but also to modify existing environmental conditions. In doing so, the emerging strategies to modify an agent's environment turn out to be meaningful to the specific agent capabilities, i.e., de facto to its embodiment.

Diffusion Dynamics With Changing Network Composition

August 2013


106 Reads

We analyze information diffusion using empirical data that tracks online communication around two instances of mass political mobilization, including the year that lapsed in-between the protests. We compare the global properties of the topological and dynamic networks through which communication took place as well as local changes in network composition. We show that changes in network structure underlie aggregated differences on how information diffused: an increase in network hierarchy is accompanied by a reduction in the average size of cascades. The increasing hierarchy affects not only the underlying communication topology but also the more dynamic structure of information exchange; the increase is especially noticeable amongst certain categories of nodes (or users). This suggests that the relationship between the structure of networks and their function in diffusing information is not as straightforward as some theoretical models of diffusion in networks imply.

Local Softening of Information Geometric Indicators of Chaos in Statistical Modeling in the Presence of Quantum-Like Considerations

August 2013


50 Reads

In a previous paper (C. Cafaro et al., 2012), we compared an uncorrelated 3D Gaussian statistical model to an uncorrelated 2D Gaussian statistical model obtained from the former model by introducing a constraint that resembles the quantum mechanical canonical minimum uncertainty relation. Analysis was completed by way of the information geometry and the entropic dynamics of each system. This analysis revealed that the chaoticity of the 2D Gaussian statistical model, quantified by means of the Information Geometric Entropy (IGE), is softened or weakened with respect to the chaoticity of the 3D Gaussian statistical model due to the accessibility of more information. In this companion work, we further constrain the system in the context of a correlation constraint among the system's micro-variables and show that the chaoticity is further weakened, but only locally. Finally, the physicality of the constraints is briefly discussed, particularly in the context of quantum entanglement.

Synchronicity From Synchronized Chaos

January 2011


526 Reads

It is argued that the dynamical systems paradigm of synchronized chaos goes further toward realizing the philosophically motivated notion of "synchronicity" than commonly thought. Two effectively unpredictable systems exhibit a predictable relationship. That relationship can become highly intermittent, as with philosophical "synchronicities", in physically realistic configurations that include a time delay in the coupling channel. Further, the philosophical requirement that synchronicities be meaningful is fulfilled if meaningfulness is related to internal coherence. A relationship between internal and external synchronization is illustrated for systems that exhibit oscillons, primitive time-varying coherent structures whose existence in Hamiltonian systems appears necessary for a weak form of external synchronization. The philosophical notion of synchronicity between matter and mind is also realized naturally if mind is analogized to a computer model assimilating data from observations of a real system. Meaningfulness as internal synchronization within mind appears at the level of neuronal spike-trains whose synchronization is thought to be key to perceptual grouping. The utility of a representation of object groups based on synchronized chaos has indeed been illustrated by a cellular neural network generalization of the Hopfield neural network for the traveling salesman problem. At a higher level, consciousness may emerge as synchronization among alternative mental representations of the same reality. In the objective world, synchronicity is most apparent in the quantum realm, where nonlocal connections are implied by Bell's theorem. The quantum world seems to reside on a generalized synchronization "manifold" that one can either take as primitive or as evidence of long-range connections in a multiply-connected spacetime.

A Characterization of Entropy in Terms of Information Loss

June 2011


391 Reads

There are numerous characterizations of Shannon entropy and Tsallis entropy as measures of information obeying certain properties. Using work by Faddeev and Furuichi, we derive a very simple characterization. Instead of focusing on the entropy of a probability measure on a finite set, this characterization focuses on the "information loss", or change in entropy, associated with a measure-preserving function. Information loss is a special case of conditional entropy: namely, it is the entropy of a random variable conditioned on some function of that variable. We show that Shannon entropy gives the only concept of information loss that is functorial, convex-linear and continuous. This characterization naturally generalizes to Tsallis entropy as well.
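
The notion of information loss described in this abstract, the drop in Shannon entropy when a distribution is pushed forward along a function, can be sketched numerically. The helper names below (`shannon_entropy`, `pushforward`, `information_loss`) are hypothetical and not taken from the paper; natural logarithms are assumed.

```python
import math
from collections import defaultdict


def shannon_entropy(p):
    """Shannon entropy (in nats) of a distribution given as {outcome: prob}."""
    return -sum(px * math.log(px) for px in p.values() if px > 0)


def pushforward(p, f):
    """Distribution of f(X) when X is distributed according to p."""
    q = defaultdict(float)
    for x, px in p.items():
        q[f(x)] += px
    return dict(q)


def information_loss(p, f):
    """Entropy lost by applying f: H(X) - H(f(X)), which equals H(X | f(X))."""
    return shannon_entropy(p) - shannon_entropy(pushforward(p, f))


if __name__ == "__main__":
    uniform4 = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
    # Keeping only the parity merges outcomes in pairs, losing ln 2 nats.
    print(information_loss(uniform4, lambda x: x % 2), math.log(2))
```

An injective map such as the identity merges no outcomes, so its information loss is zero.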

Nonparametric Clustering of Mixed Data Using Modified Chi-Squared Tests

January 2013


196 Reads

We propose a non-parametric method to cluster mixed data containing both continuous and discrete random variables. The product space of continuous and categorical sample spaces is approximated locally by analyzing neighborhoods with cluster patterns. Detection of cluster patterns on the product space is determined by using a modified Chi-square test. The proposed method does not impose a global distance function which could be difficult to specify in practice. Results from simulation studies have shown that our proposed methods out-performed the benchmark method, AutoClass, for various settings.

The Phase Space Elementary Cell in Classical and Generalized Statistics

January 2015


3,153 Reads

In the past, the phase-space elementary cell of a non-quantized system was set equal to the third power of the Planck constant; in fact, this is not a necessary assumption. We discuss how the phase space volume, the number of states and the elementary-cell volume of a system of N non-interacting particles change when an interaction is switched on and the system becomes or evolves to a system of correlated non-Boltzmann particles, and we derive the appropriate expressions. Even if we assume that nowadays the volume of the elementary cell is equal to the cube of the Planck constant, h^3, at least for quantum systems, we show that there is a correspondence between different values of h in the past, with important and, in principle, measurable cosmological and astrophysical consequences, and systems with an effective smaller (or even larger) phase-space volume described by non-extensive generalized statistics.

On Classical Ideal Gases

August 2011


1,536 Reads

The ideal gas laws are derived from the Democritean concept of corpuscles moving in vacuum plus a principle of simplicity, namely that these laws are independent of the laws of motion aside from the law of energy conservation. A single corpuscle in contact with a heat bath and submitted to a $z$ and $t$-invariant force $-w$ is considered, in which case corpuscle distinguishability is irrelevant. The non-relativistic approximation is made only in examples. Some of the end results are known but the method appears to be novel. The mathematics being elementary, the present paper should facilitate the understanding of the ideal-gas law and more generally of classical thermodynamics. It supplements importantly a previously published paper: The stability of ideal gases is proven from the expressions obtained for the force exerted by the corpuscle on the two end pistons of a cylinder, and the internal energy. We evaluate the entropy increase that occurs when the wall separating two cylinders is removed and show that the entropy remains the same when the separation is restored. The entropy increment may be defined as the ratio of the heat entering the system to the temperature when the number of corpuscles (0 or 1) is fixed. In general the entropy is defined as the average value of $\ln(p)$ where $p$ denotes the probability of a given state. Generalization to $z$-dependent weights, or equivalently to arbitrary static potentials, is made.
