-
ABSTRACT: Recent work emphasizes that the maximum entropy principle provides a bridge
between statistical mechanics models for collective behavior in neural networks
and experiments on networks of real neurons. Most of this work has focused on
capturing the measured correlations among pairs of neurons. Here we suggest an
alternative, constructing models that are consistent with the distribution of
global network activity, i.e. the probability that K out of N cells in the
network generate action potentials in the same small time bin. The inverse
problem that we need to solve in constructing the model is analytically
tractable, and provides a natural "thermodynamics" for the network in the limit
of large N. We analyze the responses of neurons in a small patch of the retina
to naturalistic stimuli, and find that the implied thermodynamics is very close
to an unusual critical point, in which the entropy (in proper units) is exactly
equal to the energy.
07/2012;
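As a rough illustration of the "thermodynamics" described in this abstract, the sketch below assumes that the maximum entropy model consistent with P(K) assigns equal probability to all patterns with the same K, so that each pattern has probability P(K) / C(N, K). The function name `k_model_thermodynamics` and the toy input are hypothetical, and energies are in natural-log units; this is not the authors' code.

```python
import math


def k_model_thermodynamics(p_k):
    """Return (K, energy per neuron, entropy per neuron) for each K with P(K) > 0.

    p_k[K] is the probability that K out of N cells spike in a time bin,
    with N = len(p_k) - 1. Assumption: all patterns with the same K are
    equally likely, so P(pattern) = P(K) / C(N, K).
    """
    n = len(p_k) - 1
    rows = []
    for k, p in enumerate(p_k):
        if p <= 0.0:
            continue  # unobserved K: the energy would be infinite
        log_multiplicity = math.log(math.comb(n, k))  # entropy of the K-spike shell
        energy = log_multiplicity - math.log(p)       # -ln P(single pattern)
        rows.append((k, energy / n, log_multiplicity / n))
    return rows


# Toy input: N = 20 independent cells, each spiking with probability 1/2.
n_cells = 20
p_independent = [math.comb(n_cells, k) / 2**n_cells for k in range(n_cells + 1)]
rows = k_model_thermodynamics(p_independent)
```

For the independent toy input every pattern is equally likely, so the energy per neuron is the same for all K; plotting entropy against energy for measured P(K) is one way to look for the entropy-equals-energy behavior the abstract mentions.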
-
ABSTRACT: Cells in a developing embryo have no direct way of "measuring" their physical
position. Through a variety of processes, however, the expression levels of
multiple genes come to be correlated with position, and these expression levels
thus form a code for "positional information." We show how to measure this
information, in bits, using the gap genes in the Drosophila embryo as an
example. Individual genes carry nearly two bits of information, twice as much
as expected if the expression patterns consisted only of on/off domains
separated by sharp boundaries. Taken together, four gap genes carry enough
information to define a cell's location with an error bar of ~1% along the
anterior-posterior axis of the embryo. This precision is nearly enough for each
cell to have a unique identity, which is the maximum information the system can
use, and is nearly constant along the length of the embryo. We argue that this
constancy is a signature of optimality in the transmission of information from
primary morphogen inputs to the output of the gap gene network.
12/2011;
-
ABSTRACT: The visual system is challenged with extracting and representing behaviorally relevant information contained in natural inputs of great complexity and detail. This task begins in the sensory periphery: retinal receptive fields and circuits are matched to the first and second-order statistical structure of natural inputs. This matching enables the retina to remove stimulus components that are predictable (and therefore uninformative), and primarily transmit what is unpredictable (and therefore informative). Here we show that this design principle applies to more complex aspects of natural scenes, and to central visual processing. We do this by classifying high-order statistics of natural scenes according to whether they are uninformative vs. informative. We find that the uninformative ones are perceptually nonsalient, while the informative ones are highly salient, and correspond to previously identified perceptual mechanisms whose neural basis is likely central. Our results suggest that the principle of efficient coding not only accounts for filtering operations in the sensory periphery, but also shapes subsequent stages of sensory processing that are sensitive to high-order image statistics.
Proceedings of the National Academy of Sciences 10/2010; 107(42):18149-54. · 9.68 Impact Factor
-
ABSTRACT: In retina and in cortical slice the collective response of spiking neural populations is well described by "maximum-entropy" models in which only pairs of neurons interact. We asked, how should such interactions be organized to maximize the amount of information represented in population responses? To this end, we extended the linear-nonlinear-Poisson model of single neural response to include pairwise interactions, yielding a stimulus-dependent, pairwise maximum-entropy model. We found that as we varied the noise level in single neurons and the distribution of network inputs, the optimal pairwise interactions smoothly interpolated to achieve network functions that are usually regarded as discrete: stimulus decorrelation, error correction, and independent encoding. These functions reflected a trade-off between efficient consumption of finite neural bandwidth and the use of redundancy to mitigate noise. Spontaneous activity in the optimal network reflected stimulus-induced activity patterns, and single-neuron response variability overestimated network noise. Our analysis suggests that rather than having a single coding principle hardwired in their architecture, networks in the brain should adapt their function to changing noise and stimulus correlations.
Proceedings of the National Academy of Sciences 08/2010; 107(32):14419-24. · 9.68 Impact Factor
-
ABSTRACT: Central to the functioning of a living cell is its ability to control the readout or expression of information encoded in the genome. In many cases, a single transcription factor protein activates or represses the expression of many genes. As the concentration of the transcription factor varies, the target genes thus undergo correlated changes, and this redundancy limits the ability of the cell to transmit information about input signals. We explore how interactions among the target genes can reduce this redundancy and optimize information transmission. Our discussion builds on recent work [Tkacik, Phys. Rev. E 80, 031920 (2009)], and there are connections to much earlier work on the role of lateral inhibition in enhancing the efficiency of information transmission in neural circuits; for simplicity we consider here the case where the interactions have a feed forward structure, with no loops. Even with this limitation, the networks that optimize information transmission have a structure reminiscent of the networks found in real biological systems.
Physical Review E 04/2010; 81(4 Pt 1):041905. · 2.26 Impact Factor
-
BMC Neuroscience. 01/2010;
-
ABSTRACT: Ising models with pairwise interactions are the least structured, or maximum-entropy, probability distributions that exactly reproduce measured pairwise correlations between spins. Here we use this equivalence to construct Ising models that describe the correlated spiking activity of populations of 40 neurons in the salamander retina responding to natural movies. We show that pairwise interactions between neurons account for observed higher-order correlations, and that for groups of 10 or more neurons pairwise interactions can no longer be regarded as small perturbations in an independent system. We then construct network ensembles that generalize the network instances observed in the experiment, and study their thermodynamic behavior and coding capacity. Based on this construction, we can also create synthetic networks of 120 neurons, and find that with increasing size the networks operate closer to a critical point and start exhibiting collective behaviors reminiscent of spin glasses. We examine closely two such behaviors that could be relevant for neural code: tuning of the network to the critical point to maximize the ability to encode diverse stimuli, and using the metastable states of the Ising Hamiltonian as neural code words. Comment: This is an extended version of arXiv:q-bio.NC/0611072
12/2009;
-
ABSTRACT: Evolutionary theory predicts that a population in a new environment will accumulate adaptive substitutions, but precisely how they accumulate is poorly understood. The dynamics of adaptation depend on the underlying fitness landscape. Virtually nothing is known about fitness landscapes in nature, and few methods allow us to infer the landscape from empirical data. With a view toward this inference problem, we have developed a theory that, in the weak-mutation limit, predicts how a population's mean fitness and the number of accumulated substitutions are expected to increase over time, depending on the underlying fitness landscape. We find that fitness and substitution trajectories depend not on the full distribution of fitness effects of available mutations but rather on the expected fixation probability and the expected fitness increment of mutations. We introduce a scheme that classifies landscapes in terms of the qualitative evolutionary dynamics they produce. We show that linear substitution trajectories, long considered the hallmark of neutral evolution, can arise even when mutations are strongly selected. Our results provide a basis for understanding the dynamics of adaptation and for inferring properties of an organism's fitness landscape from temporal data. Applying these methods to data from a long-term experiment, we infer the sign and strength of epistasis among beneficial mutations in the Escherichia coli genome.
Proceedings of the National Academy of Sciences 11/2009; 106(44):18638-43. · 9.68 Impact Factor
-
ABSTRACT: In order to survive, reproduce, and (in multicellular organisms) differentiate, cells must control the concentrations of the myriad different proteins that are encoded in the genome. The precision of this control is limited by the inevitable randomness of individual molecular events. Here we explore how cells can maximize their control power in the presence of these physical limits; formally, we solve the theoretical problem of maximizing the information transferred from inputs to outputs when the number of available molecules is held fixed. We start with the simplest version of the problem, in which a single transcription factor protein controls the readout of one or more genes by binding to DNA. We further simplify by assuming that this regulatory network operates in steady state, that the noise is small relative to the available dynamic range, and that the target genes do not interact. Even in this simple limit, we find a surprisingly rich set of optimal solutions. Importantly, for each locally optimal regulatory network, all parameters are determined once the physical constraints on the number of available molecules are specified. Although we are solving an oversimplified version of the problem facing real cells, we see parallels between the structure of these optimal solutions and the behavior of actual genetic regulatory networks. Subsequent papers will discuss more complete versions of the problem.
Physical Review E 09/2009; 80(3 Pt 1):031920. · 2.26 Impact Factor
-
ABSTRACT: The precision of biochemical signaling is limited by randomness in the diffusive arrival of molecules at their targets. For proteins binding to specific sites on DNA and regulating transcription, the ability of the proteins to diffuse in one dimension by sliding along the length of the DNA, in addition to their diffusion in bulk solution, would seem to generate a larger target for DNA binding, consequently reducing the noise in the occupancy of the regulatory site. Here we show that this effect is largely canceled by the enhanced temporal correlations in one-dimensional diffusion. With realistic parameters, sliding along DNA has surprisingly little effect on the physical limits to the precision of transcriptional regulation.
Physical Review E 06/2009; 79(5 Pt 1):051901. · 2.26 Impact Factor
-
BMC Neuroscience. 01/2009;
-
ABSTRACT: In the simplest view of transcriptional regulation, the expression of a gene is turned on or off by changes in the concentration of a transcription factor (TF). We use recent data on noise levels in gene expression to show that it should be possible to transmit much more than just one regulatory bit. Realizing this optimal information capacity would require that the dynamic range of TF concentrations used by the cell, the input/output relation of the regulatory module, and the noise in gene expression satisfy certain matching relations, which we derive. These results provide parameter-free, quantitative predictions connecting independently measurable quantities. Although we have considered only the simplified problem of a single gene responding to a single TF, we find that these predictions are in surprisingly good agreement with recent experiments on the Bicoid/Hunchback system in the early Drosophila embryo and that this system achieves approximately 90% of its theoretical maximum information transmission.
Proceedings of the National Academy of Sciences 09/2008; 105(34):12265-70. · 9.68 Impact Factor
-
ABSTRACT: Changes in a cell's external or internal conditions are usually reflected in the concentrations of the relevant transcription factors. These proteins in turn modulate the expression levels of the genes under their control and sometimes need to perform nontrivial computations that integrate several inputs and affect multiple genes. At the same time, the activities of the regulated genes would fluctuate even if the inputs were held fixed, as a consequence of the intrinsic noise in the system, and such noise must fundamentally limit the reliability of any genetic computation. Here we use information theory to formalize the notion of information transmission in simple genetic regulatory elements in the presence of physically realistic noise sources. The dependence of this "channel capacity" on noise parameters, cooperativity and cost of making signaling molecules is explored systematically. We find that, in the range of parameters probed by recent in vivo measurements, capacities higher than one bit should be achievable. It is of course generally accepted that gene regulatory elements must, in order to function properly, have a capacity of at least one bit. The central point of our analysis is the demonstration that simple physical models of noisy gene transcription, with realistic parameters, can indeed achieve this capacity: it was not self-evident that this should be so. We also demonstrate that capacities significantly greater than one bit are possible, so that transcriptional regulation need not be limited to simple "on-off" components. The question whether real systems actually exploit this richer possibility is beyond the scope of this investigation.
Physical Review E 08/2008; 78(1 Pt 1):011910. · 2.26 Impact Factor
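To make the notion of "channel capacity" in this abstract concrete, here is a minimal sketch of the standard Blahut-Arimoto iteration for a discrete channel. It is a generic textbook technique applied to a toy channel, not necessarily the procedure used in the paper; the function name `blahut_arimoto` and the example transition matrices are assumptions for illustration only.

```python
import math


def blahut_arimoto(channel, iterations=500):
    """Estimate the capacity (in bits) of a discrete memoryless channel.

    channel[x][y] = P(output y | input x); each row must sum to 1.
    Returns (capacity_bits, input_distribution).
    """
    n_in, n_out = len(channel), len(channel[0])
    p_x = [1.0 / n_in] * n_in  # start from a uniform input distribution

    def output_dist(p_in):
        return [sum(p_in[x] * channel[x][y] for x in range(n_in))
                for y in range(n_out)]

    for _ in range(iterations):
        p_y = output_dist(p_x)
        weights = []
        for x in range(n_in):
            if p_x[x] == 0.0:
                weights.append(0.0)  # an input with zero weight stays unused
                continue
            # Kullback-Leibler divergence D(P(y|x) || P(y)) in nats
            d = sum(channel[x][y] * math.log(channel[x][y] / p_y[y])
                    for y in range(n_out) if channel[x][y] > 0.0)
            weights.append(p_x[x] * math.exp(d))
        total = sum(weights)
        p_x = [w / total for w in weights]

    # Mutual information at the final input distribution, in bits
    p_y = output_dist(p_x)
    capacity = sum(p_x[x] * channel[x][y] * math.log2(channel[x][y] / p_y[y])
                   for x in range(n_in) for y in range(n_out)
                   if p_x[x] > 0.0 and channel[x][y] > 0.0)
    return capacity, p_x


# Toy "on/off" element read out with a 10% error rate (hypothetical numbers).
noisy_switch = [[0.9, 0.1], [0.1, 0.9]]
capacity_switch, _ = blahut_arimoto(noisy_switch)

# Toy element with three perfectly distinguishable output levels.
three_levels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
capacity_levels, _ = blahut_arimoto(three_levels)
```

The second toy channel shows, in the simplest possible setting, how an element with more than two distinguishable output levels can carry more than one bit.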
-
ABSTRACT: The scale invariance of natural images suggests an analogy to the statistical mechanics of physical systems at a critical point. Here we examine the distribution of pixels in small image patches and show how to construct the corresponding thermodynamics. We find evidence for criticality in a diverging specific heat, which corresponds to large fluctuations in how "surprising" we find individual images, and in the quantitative form of the entropy vs. energy. The energy landscape derived from our thermodynamic framework identifies special image configurations that have intrinsic error correcting properties, and neurons which could detect these features have a strong resemblance to the cells found in primary visual cortex.
07/2008;
-
ABSTRACT: This review was written for the Encyclopedia of Complexity and System Science (Springer-Verlag, Berlin, 2008), and is intended as a guide to the growing literature which approaches the phenomena of cell biology from a more theoretical point of view. We begin with the building blocks of cellular networks, and proceed toward the different classes of models being explored, finally discussing the "design principles" which have been suggested for these systems. Although largely a dispassionate review, we do draw attention to areas where there seems to be general consensus on ideas that have not been tested very thoroughly and, more optimistically, to areas where we feel promising ideas deserve to be more fully explored.
01/2008;
-
ABSTRACT: Recent work has shown that probabilistic models based on pairwise interactions (in the simplest case, the Ising model) provide surprisingly accurate descriptions of experiments on real biological networks ranging from neurons to genes. Finding these models requires us to solve an inverse problem: given experimentally measured expectation values, what are the parameters of the underlying Hamiltonian? This problem sits at the intersection of statistical physics and machine learning, and we suggest that more efficient solutions are possible by merging ideas from the two fields. We use a combination of recent coordinate descent algorithms with an adaptation of the histogram Monte Carlo method, and implement these techniques to take advantage of the sparseness found in data on real neurons. The resulting algorithm learns the parameters of an Ising model describing a network of forty neurons within a few minutes. This opens the possibility of analyzing much larger data sets now emerging, and thus testing hypotheses about the collective behaviors of these networks.
01/2008;
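The inverse problem stated in this abstract can be sketched for a very small network, where every state can be enumerated exactly. The code below uses plain gradient ascent on the log-likelihood, which is a deliberately simpler stand-in for the coordinate descent and histogram Monte Carlo methods the abstract describes; the function names, learning rate, step count, and the three-spin example parameters are all hypothetical.

```python
import itertools
import math


def ising_distribution(h, J):
    """Boltzmann distribution over all +/-1 states for fields h and couplings J (i < j)."""
    n = len(h)
    states = list(itertools.product((-1, 1), repeat=n))
    weights = []
    for s in states:
        energy = -sum(h[i] * s[i] for i in range(n))
        energy -= sum(J[i][j] * s[i] * s[j]
                      for i in range(n) for j in range(i + 1, n))
        weights.append(math.exp(-energy))
    z = sum(weights)  # partition function
    return states, [w / z for w in weights]


def moments(states, probs):
    """Means <s_i> and pairwise expectation values <s_i s_j> under the distribution."""
    n = len(states[0])
    means = [sum(p * s[i] for s, p in zip(states, probs)) for i in range(n)]
    corr = [[sum(p * s[i] * s[j] for s, p in zip(states, probs))
             for j in range(n)] for i in range(n)]
    return means, corr


def fit_pairwise_maxent(target_means, target_corr, lr=0.1, steps=5000):
    """Adjust h and J until the model moments match the target moments."""
    n = len(target_means)
    h = [0.0] * n
    J = [[0.0] * n for _ in range(n)]
    for _ in range(steps):
        means, corr = moments(*ising_distribution(h, J))
        for i in range(n):
            # Log-likelihood gradient: target moment minus model moment
            h[i] += lr * (target_means[i] - means[i])
            for j in range(i + 1, n):
                J[i][j] += lr * (target_corr[i][j] - corr[i][j])
    return h, J


# Toy check: generate moments from known parameters, then try to recover them.
true_h = [0.1, -0.2, 0.3]
true_J = [[0.0, 0.5, -0.3], [0.0, 0.0, 0.2], [0.0, 0.0, 0.0]]
target_means, target_corr = moments(*ising_distribution(true_h, true_J))
fitted_h, fitted_J = fit_pairwise_maxent(target_means, target_corr)
```

Exact enumeration costs 2^N operations per step, so this approach is limited to a handful of spins; that scaling is the reason sampling-based methods are needed for networks of tens of neurons.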
-
ABSTRACT: Gene expression levels fluctuate even under constant external conditions. Much emphasis has usually been placed on the components of this noise that are due to randomness in transcription and translation. Here we focus on the role of noise associated with the inputs to transcriptional regulation; in particular, we analyze the effects of random arrival times and binding of transcription factors to their target sites along the genome. This contribution to the total noise sets a fundamental physical limit to the reliability of genetic control, and has clear signatures, but we show that these are easily obscured by experimental limitations and even by conventional methods for plotting the variance vs. mean expression level. We argue that simple, universal models of noise dominated by transcription and translation are inconsistent with the embedding of gene expression in a network of regulatory interactions. Analysis of recent experiments on transcriptional control in the early Drosophila embryo shows that these results are quantitatively consistent with the predicted signatures of input noise, and we discuss the experiments needed to test the importance of input noise more generally.
PLoS ONE 01/2008; 3(7):e2774. · 4.09 Impact Factor
-
ABSTRACT: A cell's ability to regulate gene transcription depends in large part on the energy with which transcription factors (TFs) bind their DNA regulatory sites. Obtaining accurate models of this binding energy is therefore an important goal for quantitative biology. In this article, we present a principled likelihood-based approach for inferring physical models of TF-DNA binding energy from the data produced by modern high-throughput binding assays. Central to our analysis is the ability to assess the relative likelihood of different model parameters given experimental observations. We take a unique approach to this problem and show how to compute likelihood without any explicit assumptions about the noise that inevitably corrupts such measurements. Sampling possible choices for model parameters according to this likelihood function, we can then make probabilistic predictions for the identities of binding sites and their physical binding energies. Applying this procedure to previously published data on the Saccharomyces cerevisiae TF Abf1p, we find models of TF binding whose parameters are determined with remarkable precision. Evidence for the accuracy of these models is provided by an astonishing level of phylogenetic conservation in the predicted energies of putative binding sites. Results from in vivo and in vitro experiments also provide highly consistent characterizations of Abf1p, a result that contrasts with a previous analysis of the same data.
Proceedings of the National Academy of Sciences 02/2007; 104(2):501-6. · 9.68 Impact Factor
-
ABSTRACT: Ising models with pairwise interactions are the least structured, or maximum-entropy, probability distributions that exactly reproduce measured pairwise correlations between spins. Here we use this equivalence to construct Ising models that describe the correlated spiking activity of populations of 40 neurons in the retina, and show that pairwise interactions account for observed higher-order correlations. By first finding a representative ensemble for observed networks we can create synthetic networks of 120 neurons, and find that with increasing size the networks operate closer to a critical point and start exhibiting collective behaviors reminiscent of spin glasses.
12/2006;
-
ABSTRACT: In an age of increasingly large data sets, investigators in many different disciplines have turned to clustering as a tool for data analysis and exploration. Existing clustering methods, however, typically depend on several nontrivial assumptions about the structure of data. Here, we reformulate the clustering problem from an information theoretic perspective that avoids many of these assumptions. In particular, our formulation obviates the need for defining a cluster "prototype," does not require an a priori similarity metric, is invariant to changes in the representation of the data, and naturally captures nonlinear relations. We apply this approach to different domains and find that it consistently produces clusters that are more coherent than those extracted by existing algorithms. Finally, our approach provides a way of clustering based on collective notions of similarity rather than the traditional pairwise measures.
Proceedings of the National Academy of Sciences 01/2006; 102(51):18297-302. · 9.68 Impact Factor