
Claude Shannon and Information Theory

Lev I. Verkhovsky
A brief outline of what a General Information Theory might look
like (for the centenary of Claude Shannon).
An entry from my blog (not currently active), dated April 15, 2016.
This month marks the centenary of the birth of the American engineer and
mathematician Claude Elwood Shannon (1916–2001), whose article “A
Mathematical Theory of Communication” (1948) showed how to transmit a flow
of messages optimally over communication channels, based on their
probabilistic, statistical characteristics; its results soon came into active use in
the communications industry. From the late 1950s on, Shannon himself hardly
published anything on his theory and did not take part in conferences: he turned
to other things -- he designed various cybernetic devices and looked for ways to
build AI. When asked whether a machine could think, he answered: “Yes, of
course. I am a machine and you are a machine, and we both think, don’t we?”
The broad scientific and even cultural popularity of Shannon’s work is
explained by its interpretation as an “information theory.” Everyone is
constantly confronted with information of all kinds, and many people decided
that there was now a theory for all these information processes. Gradually,
however, it became clear that Shannon had considered only one aspect of the
multifaceted concept of information, albeit a very important one in the field of
communications. It therefore cannot be said that he created a theory of
information, and attempts began to build a more general theory that would
cover the meaning of messages (semantics), their usefulness (pragmatics), and
other properties. I discussed this briefly in the article “We Are Made of the
Same Substance...”, published in Chemistry and Life (Химия и Жизнь, 1995,
No. 3, in Russian). Many years have passed since then, and it is fair to say that a
general theory of information has not emerged. Here are a few thoughts on
what, in general terms, such a theory might, in my opinion, look like.
The idea has long been expressed that information theory should consider -- in
addition to communication channels -- the receivers and transmitters of
messages (this line was developed by the Soviet mathematician and philosopher
Julius A. Schreider). These are usually complex cybernetic systems with
memory, a body of knowledge (an internal model of the world, a thesaurus),
and operational goals of their own. For such a system, to receive information
means to change its state (its thesaurus), and the measure of the information
received should be precisely the magnitude of that change. The volume of the
thesaurus can grow (when new information arrives) or shrink, if a message
makes it possible to compress the existing data -- in both cases the message
carries information. A message may also be trivial or meaningless; then it does
not change the thesaurus, and there is no information. Of course, for highly
complex systems (such as human scientific knowledge) it is very difficult to
quantify such changes in the thesaurus, but for, say, computer software systems
this is possible in principle.
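
As a minimal sketch of this idea, one could proxy the volume of a system’s
thesaurus by the compressed length of its stored knowledge. The use of zlib
below is my own arbitrary stand-in for a real description-length measure, and
the facts are made-up illustrations:

    # A sketch only: zlib compression is an assumed proxy for the
    # "shortest description" of the system's knowledge.
    import zlib

    def thesaurus_size(knowledge: str) -> int:
        """Proxy for thesaurus volume: length of the compressed knowledge."""
        return len(zlib.compress(knowledge.encode("utf-8")))

    def information_received(knowledge: str, message: str) -> int:
        """Information as the magnitude of the change in thesaurus size."""
        before = thesaurus_size(knowledge)
        after = thesaurus_size(knowledge + "\n" + message)
        return abs(after - before)

    base = "water boils at 100 C\nwater freezes at 0 C"
    print(information_received(base, "mercury boils at 357 C"))  # new fact: noticeable change
    print(information_received(base, "water boils at 100 C"))    # repeated fact: little change

A message that merely repeats what is already stored compresses away almost
entirely, so it barely changes the thesaurus -- it carries almost no information.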
Further, a principle of economy applies: the system strives to represent its
available knowledge as briefly as possible. Frequently occurring combinations
of words, symbols, and so on are stored in memory under some name (new
concepts, macro commands, etc. are introduced). This consumes memory, but
overall there is a gain. The same effect occurs in communication (in messages):
tell your correspondent the definition of a new concept (a cost), and then use
this concept to shorten subsequent messages (a gain). In general, information
processes resemble economic ones: first funds are invested, and then they pay
off and yield a profit.
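
A toy cost/benefit calculation, under my own simplified assumption that a
phrase of p characters, used n times, is replaced by a name of m characters
plus one definition of the form “name = phrase”:

    # A deliberate simplification: lengths are in characters, and the
    # only cost of a new concept is stating its definition once.
    def saving(p: int, m: int, n: int) -> int:
        """Net characters saved by naming a repeated phrase."""
        cost = m + p            # stating the definition once: name plus phrase
        gain = n * (p - m)      # each of the n uses shrinks from p to m characters
        return gain - cost

    print(saving(p=30, m=3, n=1))    # -6: a single use does not repay the definition
    print(saving(p=30, m=3, n=10))   # 237: frequent use makes the concept pay off

As in economics, the investment (the definition) pays off only if the concept is
used often enough.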
Ideally, there should be a synthesis of the various approaches to information
theory (see my article), in particular the semantic one (change of the thesaurus)
and the algorithmic one, in which the length of the shortest description is
estimated. Note that combinatorics plays a large role in information processes,
where minimal descriptions of finite sets of objects are sought; perhaps
combinatorial geometry would be a suitable mathematical apparatus -- see my
article “Thoughts on Thinking” (1989):
https://www.researchgate.net/publication/366544413_Thoughts_on_thinking
Naturally, the general theory must also include the statistical (Shannon)
approach. As we recall, in Shannon’s communication model a received message
is the more informative the less expected it was. Let us assign to each possible
message a description whose length is proportional to its probability; the
description of the situation as a whole is then the sum of the descriptions of all
possible outcomes (after all, this is what we do in life: we consider the most
probable things in detail and the unlikely ones only briefly). When a certain
outcome occurs, the descriptions of all the others are eliminated, and the total
change in the length of the description is the greater, the less probable that
outcome is. That is, the qualitative result is the same as Shannon’s (a
quantitative, formulaic correspondence can also be achieved).
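
A numeric sketch of this scheme, with an arbitrary scale constant c of my own
choosing: each outcome with probability p_i gets a description of length c·p_i,
so when outcome k occurs the discarded length is c·(1 − p_k), which grows as
p_k falls:

    # Sketch of the scheme above; c = 100 is an arbitrary scale.
    def eliminated_length(probabilities, k, c=100.0):
        """Total description length discarded when outcome k occurs."""
        return sum(c * p for i, p in enumerate(probabilities) if i != k)

    p = [0.7, 0.2, 0.1]
    for k, pk in enumerate(p):
        print(f"outcome with p={pk}: change = {eliminated_length(p, k):.0f}")
    # outcome with p=0.7: change = 30   (expected outcome, small change)
    # outcome with p=0.1: change = 90   (surprising outcome, large change)

The change is monotone in 1 − p_k rather than in −log p_k, which is why the
correspondence with Shannon’s measure is qualitative here; making it exact
would require choosing the description lengths differently.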
Such are my plans. As the saying goes, there is only a little left -- to start and to finish.