# Thomas M. Cover's research while affiliated with Palo Alto University and other places

**What is this page?**

This page lists the scientific contributions of an author who either does not have a ResearchGate profile or has not yet added these contributions to their profile.

It was automatically created by ResearchGate to provide a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.

If you're a ResearchGate member, you can follow this page to keep up with this author's work.

If you are this author, and you don't want us to display this page anymore, please let us know.


## Publications (32)

Half Title · Title · Copyright · Contents · Preface to the Second Edition · Preface to the First Edition · Acknowledgments for the Second Edition · Acknowledgments for the First Edition

Gaussian Channel: Definitions · Converse to the Coding Theorem for Gaussian Channels · Bandlimited Channels · Parallel Gaussian Channels · Channels with Colored Gaussian Noise · Gaussian Channels with Feedback · Summary · Problems · Historical Notes

Definitions · AEP for Continuous Random Variables · Relation of Differential Entropy to Discrete Entropy · Joint and Conditional Differential Entropy · Relative Entropy and Mutual Information · Properties of Differential Entropy, Relative Entropy, and Mutual Information · Summary · Problems · Historical Notes

Entropy · Joint Entropy and Conditional Entropy · Relative Entropy and Mutual Information · Relationship Between Entropy and Mutual Information · Chain Rules for Entropy, Relative Entropy, and Mutual Information · Jensen's Inequality and Its Consequences · Log Sum Inequality and Its Applications · Data-Processing Inequality · Sufficient Statistics · Fano's Inequalit...

Universal Codes and Channel Capacity · Universal Coding for Binary Sequences · Arithmetic Coding · Lempel–Ziv Coding · Optimality of Lempel–Ziv Algorithms · Summary · Problems · Historical Notes

Basic Inequalities of Information Theory · Differential Entropy · Bounds on Entropy and Relative Entropy · Inequalities for Types · Combinatorial Bounds on Entropy · Entropy Rates of Subsets · Entropy and Fisher Information · Entropy Power Inequality and Brunn–Minkowski Inequality · Inequalities for Determinants · Inequalities for Ratios of Determinants · Summary · Prob...

Models of Computation · Kolmogorov Complexity: Definitions and Examples · Kolmogorov Complexity and Entropy · Kolmogorov Complexity of Integers · Algorithmically Random and Incompressible Sequences · Universal Probability · Kolmogorov Complexity · Ω · Universal Gambling · Occam's Razor · Kolmogorov Complexity and Universal Probability · Kolmogorov Sufficient Statistic · M...

The Horse Race · Gambling and Side Information · Dependent Horse Races and Entropy Rate · The Entropy of English · Data Compression and Gambling · Gambling Estimate of the Entropy of English · Summary · Problems · Historical Notes

Examples of Codes · Kraft Inequality · Optimal Codes · Bounds on the Optimal Code Length · Kraft Inequality for Uniquely Decodable Codes · Huffman Codes · Some Comments on Huffman Codes · Optimality of Huffman Codes · Shannon–Fano–Elias Coding · Competitive Optimality of the Shannon Code · Generation of Discrete Distributions from Fair Coins · Summary · Problems · Historica...

Maximum Entropy Distributions · Examples · Anomalous Maximum Entropy Problem · Spectrum Estimation · Entropy Rates of a Gaussian Process · Burg's Maximum Entropy Theorem · Summary · Problems · Historical Notes

Quantization · Definitions · Calculation of the Rate Distortion Function · Converse to the Rate Distortion Theorem · Achievability of the Rate Distortion Function · Strongly Typical Sequences and Rate Distortion · Characterization of the Rate Distortion Function · Computation of Channel Capacity and the Rate Distortion Function · Summary · Problems · Historical Notes

Markov Chains · Entropy Rate · Example: Entropy Rate of a Random Walk on a Weighted Graph · Second Law of Thermodynamics · Functions of Markov Chains · Summary · Problems · Historical Notes

The Stock Market: Some Definitions · Kuhn–Tucker Characterization of the Log-Optimal Portfolio · Asymptotic Optimality of the Log-Optimal Portfolio · Side Information and the Growth Rate · Investment in Stationary Markets · Competitive Optimality of the Log-Optimal Portfolio · Universal Portfolios · Shannon–McMillan–Breiman Theorem (General AEP) · Summary · Problems...

Asymptotic Equipartition Property Theorem · Consequences of the AEP: Data Compression · High-Probability Sets and the Typical Set · Summary · Problems · Historical Notes

Gaussian Multiple-User Channels · Jointly Typical Sequences · Multiple-Access Channel · Encoding of Correlated Sources · Duality Between Slepian–Wolf Encoding and Multiple-Access Channels · Broadcast Channel · Relay Channel · Source Coding with Side Information · Rate Distortion with Side Information · General Multiterminal Networks · Summary · Problems · Historical Notes

Method of Types · Law of Large Numbers · Universal Source Coding · Large Deviation Theory · Examples of Sanov's Theorem · Conditional Limit Theorem · Hypothesis Testing · Chernoff–Stein Lemma · Chernoff Information · Fisher Information and the Cramér–Rao Inequality · Summary · Problems · Historical Notes

Examples of Channel Capacity · Symmetric Channels · Properties of Channel Capacity · Preview of the Channel Coding Theorem · Definitions · Jointly Typical Sequences · Channel Coding Theorem · Zero-Error Codes · Fano's Inequality and the Converse to the Coding Theorem · Equality in the Converse to the Channel Coding Theorem · Hamming Codes · Feedback Capacity · Source–Chan...

Half-title page · Series page · Title page · Copyright page · Dedication · Preface · Acknowledgements · Contents · List of figures · Half-title page · Index

In this chapter we put content in the definition of entropy by establishing the fundamental limit for the compression of information. Data compression can be achieved by assigning short descriptions to the most frequent outcomes of the data source and necessarily longer descriptions to the less frequent outcomes. For example, in Morse code, the mos...
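
The compression limit described here can be checked numerically. Below is a minimal sketch (not from the book): it builds an optimal prefix code with Huffman's algorithm and compares its expected length against the entropy. The helper `huffman_lengths` and the example distribution are illustrative choices, not the text's own.

```python
import heapq
from math import log2

def entropy(probs):
    """H(X) = -sum p log2 p, in bits per symbol."""
    return -sum(p * log2(p) for p in probs if p > 0)

def huffman_lengths(probs):
    """Codeword lengths of an optimal (Huffman) prefix code."""
    # Heap items: (probability, tiebreak id, symbol indices in this subtree)
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:          # each merge adds one bit to these symbols
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

probs = [0.5, 0.25, 0.125, 0.125]  # dyadic, so Huffman meets entropy exactly
H = entropy(probs)
L = sum(p * l for p, l in zip(probs, huffman_lengths(probs)))
print(H, L)   # both 1.75 bits: H <= L < H + 1 in general
```

For non-dyadic distributions the expected length exceeds the entropy, but by less than one bit, which is the fundamental limit this chapter establishes.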

At first sight, information theory and gambling seem to be unrelated. But there is a strong duality between the growth rate of investment in a horse race and the entropy rate of the horse race. Indeed, the sum of the growth rate and the entropy rate is a constant. In the process of proving this, we shall argue that the financial value of side informat...
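
The constant-sum duality can be seen in a few lines. This is a minimal sketch using the standard doubling-rate formula W(b, p) = Σᵢ pᵢ log₂(bᵢoᵢ); the race probabilities and odds are invented for illustration.

```python
from math import log2

def doubling_rate(b, p, odds):
    """W(b, p) = sum_i p_i log2(b_i * o_i): exponent of wealth growth."""
    return sum(pi * log2(bi * oi) for pi, bi, oi in zip(p, b, odds) if pi > 0)

def entropy(p):
    return -sum(pi * log2(pi) for pi in p if pi > 0)

p = [0.5, 0.25, 0.25]          # win probabilities (hypothetical race)
odds = [3, 3, 3]               # fair uniform 3-for-1 odds
W = doubling_rate(p, p, odds)  # proportional betting b = p is log-optimal
print(W + entropy(p))          # equals log2(3): growth rate + entropy is constant
```

With uniform fair odds over m horses, W* = log₂ m − H(p), so a more predictable race (lower entropy) means faster wealth growth.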

What do we mean when we say that A communicates with B? We mean that the physical acts of A have induced a desired physical state in B. This transfer of information is a physical process and therefore is subject to the uncontrollable ambient noise and imperfections of the physical signalling process itself. The communication is successful if the re...
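
A concrete instance of the capacity that quantifies "successful communication" over a noisy channel is the binary symmetric channel, whose capacity has the standard closed form C = 1 − H(p). A minimal sketch (the crossover probability 0.11 is an arbitrary example, not from the text):

```python
from math import log2

def h2(p):
    """Binary entropy H(p) in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - h2(p)

print(bsc_capacity(0.11))  # roughly 0.5: about half a bit per channel use
```

At p = 0.5 the output is independent of the input and the capacity drops to zero; at p = 0 the channel is noiseless and carries a full bit per use.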

The duality between the growth rate of wealth in the stock market and the entropy rate of the market is striking. We explore this duality in this chapter. In particular, we shall find the competitively optimal and growth rate optimal portfolio strategies. They are the same, just as the Shannon code is optimal both competitively and in expected valu...
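
The growth-rate side of this duality has a compact numerical illustration. A minimal sketch (the two-asset market below is invented, not taken from the chapter): a constantly rebalanced portfolio can grow exponentially even when neither asset grows on its own.

```python
from math import log2

def growth_rate(b, outcomes, probs):
    """W(b) = E[log2(b . X)] for portfolio b over price-relative vectors X."""
    return sum(p * log2(sum(bi * xi for bi, xi in zip(b, x)))
               for p, x in zip(probs, outcomes))

# Hypothetical i.i.d. market: cash, and a stock that halves or doubles.
outcomes = [(1.0, 0.5), (1.0, 2.0)]
probs = [0.5, 0.5]
print(growth_rate((1.0, 0.0), outcomes, probs))  # cash alone: 0
print(growth_rate((0.0, 1.0), outcomes, probs))  # stock alone: 0
print(growth_rate((0.5, 0.5), outcomes, probs))  # rebalancing: ~0.085 bits/period
```

Rebalancing to 50/50 each period systematically sells after the stock doubles and buys after it halves, which is where the positive growth rate comes from.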

The temperature of a gas corresponds to the average kinetic energy of the molecules in the gas. What can we say about the distribution of velocities in the gas at a given temperature? We know from physics that this distribution is the maximum entropy distribution under the temperature constraint, otherwise known as the Maxwell-Boltzmann distributio...
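
The maximum-entropy characterization invoked here can be stated compactly (a standard result, paraphrased rather than quoted from the chapter): among all densities with a bounded second moment, the Gaussian maximizes differential entropy,

$$
\max_{f \,:\, \mathbb{E}_f[X^2] \le \sigma^2} h(f) \;=\; \tfrac{1}{2}\log\!\left(2\pi e \sigma^2\right),
$$

attained by $f = \mathcal{N}(0, \sigma^2)$. Applied componentwise to molecular velocities under the mean kinetic-energy constraint set by the temperature, the maximizer is $f(v) \propto e^{-m\|v\|^2 / 2kT}$, which is exactly the Maxwell–Boltzmann distribution.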

This chapter summarizes and reorganizes the inequalities found throughout this book. A number of new inequalities on the entropy rates of subsets and the relationship of entropy and ℒp norms are also developed. The intimate relationship between Fisher information and entropy is explored, culminating in a common proof of the entropy power inequality...

The great mathematician Kolmogorov defined the algorithmic (descriptive) complexity of an object to be the length of the shortest binary computer program that describes the object. (Apparently a computer, the most general form of data decompressor, will use this description to exhibit the described object after a finite amount of computation.) Thus...
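
Kolmogorov complexity itself is uncomputable, but any real compressor yields an upper bound on description length, which is how the idea is used in practice. A small sketch using zlib, an LZ-family compressor (the test strings are invented for illustration):

```python
import random
import zlib

def description_bits(data: bytes) -> int:
    """Upper bound on K(data): length in bits of a zlib compression of it."""
    return len(zlib.compress(data, 9)) * 8

random.seed(0)
structured = b"01" * 5000                               # 10,000 bytes, very regular
noisy = bytes(random.randrange(256) for _ in range(10_000))  # incompressible-looking
print(description_bits(structured))  # a few hundred bits: short description exists
print(description_bits(noisy))       # close to 80,000 bits: no pattern to exploit
```

The gap between the two outputs is the algorithmic-complexity distinction in miniature: the regular string has a short program ("print '01' 5000 times"), the noisy one essentially does not.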

A system with many senders and receivers contains many new elements in the communication problem: interference, cooperation and feedback. These are the issues that are the domain of network information theory. The general problem is easy to state. Given many senders and receivers and a channel transition matrix which describes the effects of the in...

In information theory, the analog of the law of large numbers is the Asymptotic Equipartition Property (AEP). It is a direct consequence of the weak law of large numbers. The law of large numbers states that for independent, identically distributed (i.i.d.) random variables, the sample average (1/n)∑ᵢ Xᵢ is close to its expected value EX for large values of n. The AEP states th...
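
The concentration the AEP asserts, that −(1/n) log₂ p(X₁,…,Xₙ) approaches H(X), is easy to see empirically. A minimal sketch (the Bernoulli source and its parameter are arbitrary illustrative choices):

```python
import random
from math import log2

random.seed(0)
p = 0.2                                  # Bernoulli source parameter
H = -p * log2(p) - (1 - p) * log2(1 - p)  # entropy H(p) in bits
n = 100_000
xs = [1 if random.random() < p else 0 for _ in range(n)]

# Per-symbol log-likelihood of the observed sequence
per_symbol = -sum(log2(p) if x else log2(1 - p) for x in xs) / n
print(H, per_symbol)   # both close to 0.722 bits
```

This is why "typical" sequences all have probability near 2^(−nH): the per-symbol surprise of a long i.i.d. sample concentrates at the entropy.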

The description of an arbitrary real number requires an infinite number of bits, so a finite representation of a continuous random variable can never be perfect. How well can we do? To frame the question appropriately, it is necessary to define the “goodness” of a representation of a source. This is accomplished by defining a distortion measure whi...
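
For the simplest discrete case, a Bernoulli(p) source under Hamming distortion, the rate-distortion trade-off has the standard closed form R(D) = H(p) − H(D). A small sketch (the parameter values are illustrative, not the chapter's):

```python
from math import log2

def h2(q):
    """Binary entropy in bits."""
    return 0.0 if q in (0.0, 1.0) else -q * log2(q) - (1 - q) * log2(1 - q)

def rate_distortion_bernoulli(p, D):
    """R(D) = H(p) - H(D) for a Bernoulli(p) source, Hamming distortion."""
    if D >= min(p, 1 - p):
        return 0.0                # this distortion is achievable at zero rate
    return h2(p) - h2(D)

print(rate_distortion_bernoulli(0.5, 0.0))   # 1.0 bit: lossless description
print(rate_distortion_bernoulli(0.5, 0.11))  # ~0.5 bit suffices at 11% error
```

Tolerating even a modest reconstruction error halves the required rate here, which is the point of defining a distortion measure in the first place.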

The asymptotic equipartition property in Chapter 3 establishes that nH(X) bits suffice on the average to describe n independent and identically distributed random variables. But what if the random variables are dependent? In particular, what if the random variables form a stationary process? We will show, just as in the i.i.d. case, that the entrop...
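
For the simplest dependent process, a two-state Markov chain, the entropy rate has the standard closed form H = Σᵢ μᵢ H(row i), with μ the stationary distribution. A minimal sketch (the transition probabilities are illustrative):

```python
from math import log2

def entropy_rate_two_state(a, b):
    """Entropy rate of a two-state Markov chain, P(0->1)=a and P(1->0)=b."""
    mu0 = b / (a + b)              # stationary distribution (standard formula)
    mu1 = a / (a + b)
    def h2(q):
        return 0.0 if q in (0.0, 1.0) else -q * log2(q) - (1 - q) * log2(1 - q)
    return mu0 * h2(a) + mu1 * h2(b)

print(entropy_rate_two_state(0.5, 0.5))  # i.i.d. fair coin: 1 bit/symbol
print(entropy_rate_two_state(0.1, 0.1))  # sticky chain: ~0.469 bits/symbol
```

The sticky chain emits the same binary symbols but is far more predictable, so its entropy rate, and hence the rate needed to compress it, is well below one bit per symbol.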

## Citations

... Following the organization of gene clusters, hypergeometric tests were performed for each cluster to test for enrichment of DE genes. An information theory approach [39] was adopted to infer gene expression networks within select clusters using the minet software package for R [40]. Mutual information (MI) measures the information content that two variables share: a numerical value ranging from 0 to 1 depending, intuitively, on how much knowing one variable would predict the variability of the other. ...

... Usually, the architectures of these models are very complex and hard to interpret. In order to address the problems mentioned above, in this letter, we adopt a new and low complexity unsupervised change detection method based on a combination of similarity and dissimilarity measures (Mutual Information [11], [12], Disjoint Information [11], [12], the Local Dissimilarity Map [13]) and k-means clustering. Our method is inspired by Gupta's method [14] which used only Mutual Information. ...

... Finally, we estimated the temporal complexity of the brain for the whole spectrum of frequencies (1-45 Hz) by means of Lempel-Ziv Complexity (LZC) (Lempel and Ziv, 1976). This measure is based on Kolmogorov complexity and calculates the minimal "information" contained in a sequence (Cover and Thomas, 2006; Schartner et al., 2015). To apply LZC to continuous sequences, such as the EEG signal, the signal needs to be discretized; in this work we employed binarization by mean value (Zozor et al., 2005). ...

... Our general problem here is universal coding of integers, meaning that the encoder and decoder don't have any knowledge of the source statistics or of the probability distribution on the integers that form the source alphabet, so they cannot make any use of them (Davisson, 1973) and (Andreasen, 2001). It is supposed that the information source is discrete, stationary and memoryless. ...

... The theoretical fundamentals of human mobility and the background associated with the study of regularity and predictability of human mobility are introduced below. The notations, definitions, and formulas follow those presented in [54] and [2] where entropy rate has been used to quantify the extent to which an individual's travel patterns are regular and predictable. ...

... Rate-distortion theory [25,26,27] aids in a deeper understanding of the trade-off and balance between the regularization and reconstruction losses. The general rate distortion problem is formulated before making these connections. ...

... This article presents EnDED, which implements four approaches, and their combination, to indicate environmentally driven (indirect) associations in microbial networks. The four methods are sign pattern [23], overlap (developed here), interaction information [23,37], and data processing inequality [27,38]. SP requires an association score that represents co-occurrence when it is positive, and mutual exclusion when it is negative. ...

... The first is noise-free, so its value is constant within the inner/outer shell. The differential entropy H [34] measures the information for a set of values, when considered as a signal taking a random value Q, and described by the continuous probability density f(q) [34]: ...

... Features were removed if the mutual information value was below the mutual information threshold. Discrete features were calculated with Eq. (2), while continuous features were calculated with Eq. (3) [11]. ...

... EnDED [68] aims to differentiate direct and indirect associations based on environmental factors which may affect the dynamics of the ecosystem, such as temperature, turbidity, salinity and nutrients. It employs four approaches, namely Sign Pattern [69], Overlap [68], Interaction Information [69,70], and Data Processing Inequality [71,72], to identify indirect (environmentally-driven) edges. It classifies an edge as indirect due to an environmental factor only if all four methods classify it as indirect. ...