Content uploaded by Paul M. B. Vitányi on Feb 19, 2014. Content may be subject to copyright.

Suppose we want to describe a given object by a finite binary string. We do not care whether the object has many descriptions; however, each description should describe but one object. From among all descriptions of an object we can take the length of the shortest description as a measure of the object’s complexity. It is natural to call an object “simple” if it has at least one short description, and to call it “complex” if all of its descriptions are long.
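As a toy illustration of this idea (treating a Python expression as a "description", which is an assumption of this sketch, not a formal description method), a highly regular string admits a description far shorter than the string itself:

```python
# A short "description" -- here, a Python expression -- that generates
# a long but highly regular string.
description = "'01' * 500"
generated = eval(description)

print(len(description))  # 10 characters of description ...
print(len(generated))    # ... yield a 1000-character string.
```

A "typical" (incompressible) 1000-character string would admit no such shortcut: its shortest description is essentially the string itself written out literally.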

Kolmogorov complexity is a modern notion of randomness dealing with the quantity of information in individual objects; that is, pointwise randomness rather than average randomness as produced by a random source. It was proposed by A. N. Kolmogorov in 1965 to quantify the randomness of individual objects in an objective and absolute manner. This is impossible for classical probability theory. Kolmogorov complexity is known variously as 'algorithmic information', 'algorithmic entropy', 'Kolmogorov-Chaitin complexity', 'descriptional complexity', 'shortest program length', 'algorithmic randomness', and others. Using it, we developed a new mathematical proof technique, now known as the 'incompressibility method'. The incompressibility method is a basic general technique, like the 'pigeonhole' argument, the 'counting method', or the 'probabilistic method'. The new method has been quite successful, and we present recent examples. The first example concerns a "static" problem in combinatorial geometry. From among the (n choose 3) triangles with vertices chosen from among n points in the unit square U, let T be the one with the smallest area, and let A be the area of T. Heilbronn's triangle problem asks for the maximum value assumed by A over all choices of n points. We consider the average case: if the n points are chosen independently and at random (uniform distribution), then there exist positive c and C such that c/n^3 < μ(n) < C/n^3 for all large enough n, where μ(n) is the expectation of A. Moreover, c/n^3 < A < C/n^3 for almost all arrangements; that is, almost all values of A are close to the expectation, so that we determine the area of the smallest triangle for an arrangement in "general position". Our second example concerns a "dynamic" problem: the average-case running time of algorithms. The question of a nontrivial general lower bound (or upper bound) on the average-case complexity of Shellsort has been open for about forty years. We obtain the first such lower bound.
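The claimed order-1/n^3 behaviour of the expected smallest-triangle area can be probed numerically. The following Monte Carlo sketch (the function names, trial counts, and seed are choices of this example, not from the paper) estimates μ(n) for uniform random points in the unit square:

```python
import itertools
import random

def min_triangle_area(points):
    """Smallest area over all triangles with vertices among the given points."""
    best = float("inf")
    for (ax, ay), (bx, by), (cx, cy) in itertools.combinations(points, 3):
        # Twice the signed area comes from the cross product of two edge vectors.
        area = abs((bx - ax) * (cy - ay) - (cx - ax) * (by - ay)) / 2.0
        best = min(best, area)
    return best

def estimate_mu(n, trials=100, seed=0):
    """Monte Carlo estimate of mu(n), the expected smallest-triangle area."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        pts = [(rng.random(), rng.random()) for _ in range(n)]
        total += min_triangle_area(pts)
    return total / trials

if __name__ == "__main__":
    # If mu(n) is of order 1/n^3, then n**3 * estimate_mu(n) should stay
    # roughly constant as n grows.
    for n in (8, 12, 16):
        print(n, n ** 3 * estimate_mu(n))
```

This brute-force scan over all (n choose 3) triples is cubic in n, so it is only practical for small n, but that suffices to see the scaling trend.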

Various issues in information theory and theoretical physics can be fruitfully analyzed using Kolmogorov complexity. This holds both for the physical aspects of information processing and for applications of complexity to questions in physics. Physicists have used complexity arguments in a variety of settings such as information distance, thermodynamics, chaos, biology, and philosophy. We touch briefly upon several themes, but focus on three main issues.

... Are some patterns intrinsically more likely than others, and could these be used to forecast? A related question has been studied (albeit in an abstract way) in a branch of computer science known as algorithmic information theory [1][2][3][4] (AIT). The central quantity of AIT is Kolmogorov complexity, K(x), which measures the complexity of an individual object or pattern x via the amount of information required to describe or generate x. ...

... Algorithmic probability and AIT results are typically difficult to apply in real-world settings because K(x) is uncomputable, the theorems assume the presence of universal Turing machines (UTMs), and the results are asymptotic and stated only up to an unknown constant. Despite these theoretical difficulties, in practice many successful applications of AIT have been made, for example in bioinformatics [9,10], physics [11], and signal denoising [12], among many other applications [4]. Mostly these applications use standard compression algorithms to approximate K(x), sometimes combined with various forms of theorem approximation. ...
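In that spirit, a standard compressor yields a crude, compressor-dependent upper bound on K(x): whatever the compressor achieves is one valid description length. A minimal sketch using zlib (the function name and test strings are this example's choices):

```python
import random
import zlib

def compressed_bits(s: bytes) -> int:
    """Upper-bound proxy for K(s): bit length of the zlib-compressed string."""
    return 8 * len(zlib.compress(s, 9))

regular = b"ab" * 500  # highly patterned, 1000 bytes
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(1000))  # incompressible-looking

print(compressed_bits(regular))  # far fewer than the raw 8000 bits
print(compressed_bits(noisy))    # close to (or even above) 8000 bits
```

The patterned string compresses to a tiny fraction of its raw length, while the pseudo-random string does not: exactly the distinction K(x) formalizes, up to the limitations of a real compressor.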

... where p is a binary program for a prefix optimal universal Turing machine (UTM) U [25], and |p| indicates the length of the binary program p in bits. Due to the invariance theorem [4], for any two optimal UTMs U and V, K_U(x) = K_V(x) + O(1), so that the complexity of x is independent of the machine, up to an additive constant. Hence we conventionally drop the subscript U in K_U(x) and speak of 'the' Kolmogorov complexity K(x). ...

To what extent can we forecast a time series without fitting to historical data? Can universal patterns of probability help in this task? Deep relations between pattern Kolmogorov complexity and pattern probability have recently been used to make a priori probability predictions in a variety of systems in physics, biology and engineering. Here we study simplicity bias (SB) — an exponential upper bound decay in pattern probability with increasing complexity — in discretised time series extracted from the World Bank Open Data collection. We predict upper bounds on the probability of discretised series patterns, without fitting to trends in the data. Thus we perform a kind of ‘forecasting without training data’, predicting time series shape patterns a priori, but not the actual numerical value of the series. Additionally we make predictions about which of two discretised series is more likely with accuracy of ∼80%, much higher than a 50% baseline rate, just by using the complexity of each series. These results point to a promising perspective on practical time series forecasting and integration with machine learning methods.
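One computable stand-in for the complexity of a discretised (here, binary) series is a Lempel-Ziv-style phrase count; this sketch is an illustration of that general idea, not the paper's exact complexity measure:

```python
import random

def lz_complexity(s: str) -> int:
    """Number of phrases in a simple Lempel-Ziv (LZ76-style) parsing of s.

    Low counts indicate regular, compressible strings; high counts
    indicate irregular ones.
    """
    i, phrases, n = 0, 0, len(s)
    while i < n:
        length = 1
        # Extend the current phrase while it still occurs earlier in the string.
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        phrases += 1
        i += length
    return phrases

constant = "0" * 64
rng = random.Random(1)
irregular = "".join(str(rng.getrandbits(1)) for _ in range(64))

print(lz_complexity(constant), lz_complexity(irregular))
```

Simplicity bias then predicts an upper bound on a pattern's probability that decays exponentially as this complexity estimate grows, which is the mechanism exploited for the a priori forecasts above.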

...
• Zipf's and Herdan-Heaps' laws for word frequency distributions [Zipf, 1935, Mandelbrot, 1954, Guiraud, 1954, Herdan, 1964, Heaps, 1978],
• universal coding based on grammars [de Marcken, 1996, Kieffer and Yang, 2000, Charikar et al., 2005] and on normalized maximum likelihood [Shtarkov, 1987, Ryabko, 1988, 2008],
• consistent (hidden) Markov order estimators [Merhav et al., 1989, Ziv and Merhav, 1992],
• the concept of infinite excess entropy [Hilberg, 1990, Ebeling and Nicolis, 1991, Ebeling and Pöschel, 1994, Bialek et al., 2001, Crutchfield and Feldman, 2003],
• the ergodic theorem and the ergodic decomposition [Birkhoff, 1932, Rokhlin, 1962, Gray and Davisson, 1974], and
• Kolmogorov complexity and algorithmic randomness [Kolmogorov, 1965, Martin-Löf, 1966, Li and Vitányi, 2008]. ...

... Consider a string x_j^k := (x_j, x_{j+1}, ..., x_k) over a countable alphabet. Its prefix-free Kolmogorov complexity is denoted K(x_j^k) [Li and Vitányi, 2008]. The algorithmic mutual information between strings u and v is J(u, v) := K(u) + K(v) − K(u, v). ...
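The quantity J(u, v) can be roughly estimated by substituting compressed length for K, a common practical heuristic (zlib here is merely a convenient stand-in for an optimal description method, and the function names are this example's choices):

```python
import random
import zlib

def c(s: bytes) -> int:
    """Compressed length in bytes, a crude proxy for K(s)."""
    return len(zlib.compress(s, 9))

def mutual_info(u: bytes, v: bytes) -> int:
    """Estimate of J(u, v) = K(u) + K(v) - K(u, v), taking K(u, v) ~ C(u + v)."""
    return c(u) + c(v) - c(u + v)

u = b"the quick brown fox jumps over the lazy dog " * 20
rng = random.Random(2)
unrelated = bytes(rng.getrandbits(8) for _ in range(len(u)))

print(mutual_info(u, u))          # large: u fully describes itself
print(mutual_info(u, unrelated))  # near zero: nothing shared
```

Intuitively, compressing u concatenated with itself costs little more than compressing u once, so the estimate of J(u, u) is large; concatenating u with unrelated noise saves almost nothing.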

... Distribution (4) is a formal model of Zipf's law from quantitative linguistics [Zipf, 1935, Mandelbrot, 1954]. Moreover, let (z_k)_{k∈N} be an algorithmically random sequence, i.e., a sequence of particular (= fixed) bits (= coin flips) such that the Kolmogorov complexity of any string z_1^k is the highest possible, K(z_1^k) ≥ k − c for a certain constant c < ∞ and all lengths k ∈ N [Li and Vitányi, 2008]. Then the Santa Fe process (X_i)_{i∈N} is a sequence of pairs ...

We present an impossibility result, called a theorem about facts and words, which pertains to a general communication system. The theorem states that the number of distinct words used in a finite text is roughly greater than the number of independent elementary persistent facts described in the same text. In particular, this theorem can be related to Zipf's law, power-law scaling of mutual information, and power-law-tailed learning curves. The assumptions of the theorem are: a finite alphabet, linear sequence of symbols, complexity that does not decrease in time, entropy rate that can be estimated, and finiteness of the inverse complexity rate.

... If we assume that each program is a theory and that the simplest theory is the smallest program (in number of bits), we can rigorously quantify the notion of information (Kirchherr et al., 1997) even when its content is not definable, as is the case with the notion of entropy in signal processing. This is called the Kolmogorov complexity (Li & Vitányi, 1997). ...

In this article, we present an educational reform we implemented a few years ago in response to a marked drop in the success of students entering higher education in computer science. The main objective of our reform is to adapt teaching methods designed for Generations X-Y to Generations Y-Z and beyond. To do this, we propose two approaches: first, to make learning more active, stimulating, and empowering; second, to make learning more individualized in the context of a large group of students while optimizing the teacher's time. We first present an analysis of the probable reasons for the lower level of students and the specific issues encountered by both students and teachers. Then, we detail how we implemented these solutions in the form of an original e-learning platform based on two back-end tools able to manage large numbers of students: an efficient real-time auto-corrector of source code and a robust anti-plagiarism tool based on information-distance theory. Finally, we present the results obtained after an eight-year experiment. These results are put into perspective by an analysis of fifteen years of qualitative and quantitative indicators collected from hundreds of students each year. The analysis shows that students' technical skills and involvement improve even in groups of several hundred. Computer science teachers wishing to develop their students' involvement quickly and concretely can rely on the tried and tested levers and solutions presented here.
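The information-distance idea behind such anti-plagiarism tools is commonly approximated by the normalized compression distance (NCD). The sketch below, using zlib, illustrates the general technique only; it is not the platform's actual implementation, and the sample snippets are invented for this example:

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 for near-identical inputs,
    near 1 for unrelated ones."""
    cx = len(zlib.compress(x, 9))
    cy = len(zlib.compress(y, 9))
    cxy = len(zlib.compress(x + y, 9))
    return (cxy - min(cx, cy)) / max(cx, cy)

original = b"def mean(xs):\n    return sum(xs) / len(xs)\n" * 5
plagiarized = original.replace(b"mean", b"avg")  # renamed identifiers only
different = b"SELECT name, grade FROM students ORDER BY grade DESC;" * 5

print(ncd(original, plagiarized))  # small: the copy compresses well against the original
print(ncd(original, different))    # much larger: little shared structure
```

Because NCD compares shared compressible structure rather than literal tokens, superficial edits like renaming identifiers barely move the distance, which is what makes the approach robust for plagiarism detection.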

The process of constructing concepts underpins our capacity to encode information in an efficient and competent manner and also, ultimately, our ability to think in terms of abstract ideas such as justice, love and happiness. But what are the mechanisms which correspond to psychological categorization processes? This book unites many prominent approaches in modelling categorization. Each chapter focuses on a particular formal approach to categorization, presented by the proponent(s) or advocate(s) of that approach, and the authors consider the relation of this approach to other models and the ultimate objectives in their research programmes. The volume evaluates progress that has been made in the field and where it goes from here. This is an essential companion to any scientist interested in the formal description of categorization and, more generally, in formal approaches to cognition. It will be the definitive guide to formal approaches in categorization research for years to come.


This interdisciplinary study of infinity explores the concept through the prism of mathematics and then offers more expansive investigations in areas beyond mathematical boundaries to reflect the broader, deeper implications of infinity for human intellectual thought. More than a dozen world-renowned researchers in the fields of mathematics, physics, cosmology, philosophy and theology offer a rich intellectual exchange among various current viewpoints, rather than displaying a static picture of accepted views on infinity. The book starts with a historical examination of the transformation of infinity from a philosophical and theological study to one dominated by mathematics. It then offers technical discussions on the understanding of mathematical infinity. Following this, the book considers the perspectives of physics and cosmology: can infinity be found in the real universe? Finally, the book returns to questions of philosophical and theological aspects of infinity.
