

This is the blog post, slightly reformatted for printing, published in this form here by request.
On Information
1. Information in general
We live in the information age; nevertheless, when you ask what “information” is, the answers you get are so far apart that the term has a good chance of being the least defined one of the decade. So, if one embarks upon a study of what is different between the kind of information one encounters in contemporary documents and that which is used for historical research, one has to give a definition, or at least a description, of how one understands the term.
There are definitions of the term, provided by and used within different disciplines. I would like to start with one which has been used, under various names (Knowledge Pyramid, Ladder of Knowledge, DIKW model), in various disciplines: information science, philosophy of information, cybernetics and a few others, though rarely in information theory [Ackoff 1989, Ashenhurst 1996, Rowley 2007, Frické 2009, Saab 2011, Baskarada 2013, Jifa 2014, Duan 2017]. And usually not, or not much, in computer science and information technology. The latter two are frequently quite satisfied with being able to represent information in data structures and process it by algorithms, without having to define what it actually is. Two comments on the model I start with: (a) It is a compromise between the numerous varieties which have been proposed by different researchers in different disciplines, most closely following [Favre-Bull 2001]. (b) I start with a rather naïve version; a significant adaptation follows later in this paper.
In a nutshell, the model assumes that there is a hierarchical relationship between various phenomena related to “information”, as in graphic 1.
Graphic 1: Knowledge Pyramid
The definitions of these layers will follow soon; let me start with an intuitive example first. Let’s look at some medieval tally sticks. These were short wooden sticks which were given as acknowledgment of some economic relationship, described in writing on the stick. Notches at the top ensured, on the one hand, that the form of the stick was unique; on the other hand, in many cases the notches also represented the amount of what was owed or simply counted. The stick was then split lengthwise and the two parts were kept by the two parties to the transaction. Having an irregular shape, the individual parts could not be tampered with. Graphic 2 shows such a tally stick, holding information. The notches on top represent the amount of what is described in the text.
Graphic 2: Information recorded on tally stick
Graphic 3, on the other hand, holds only data. Whether the writing was never there, because the two partners to the transaction were illiterate, or whether it has eroded: we see notches which counted something or other. Which something? Well …
Graphic 3: Data recorded on tally stick
More formally: Data are marks in some representational system, which can be stored. Information results when these marks are put into some context. So “22°” are data. “The temperature of this room is 22°” is information. Knowledge arises when this information encounters the ability to derive advice for action from it: “Oh well, I do not feel overly warm just because I had to run to get here in time. I really should get out of my jacket.”
Seemingly a side track at the moment, but important later: Most researchers (a notable exception: Luciano Floridi [Floridi 2011, 182 ff.]) do not think that truth has anything to do with all this. Assume your knowledge of the world includes the “truth” that vampires exist. The data represented by a color change of the garlic in your garden become the information that it is ready for harvest. Within your view of the world, it is valuable knowledge to deduce from that information the plan to use some of it to surround your windows.
This is not the only reason that many people feel less than perfectly comfortable with the notion of wisdom in this context, which is why I ignore that concept in the remainder of this paper. How can we represent the relationship between the remaining conceptual layers more rigorously? The usual approach is represented in graphic 4.
Graphic 4: Information among agents
Cognitive agents (you, me, a component of a “smart” piece of software) can perform their activity because they have other information or knowledge as a context in the background: that a number you see on a thermometer represents temperature, and how to translate that into action, requires information or knowledge beyond the number indicated by the device. Both processes have in common, therefore, that they put a specific chunk of data or information into a context. There is a very big difference between the two levels, however.
Converting the data “22°” into the information “the temperature of this room is 22°” requires the contextual information that, in our sociocultural environment, this is the way to read a thermometer. This is a context common to all cognitive agents which operate successfully in that environment. The contextual knowledge that this temperature should trigger the action of getting rid of your jacket is restricted to one specific cognitive agent, as it is bound to the biological entity supporting that agent: you. So the context required to convert data into information is shared among a larger number of agents, while the context required to convert information into knowledge is private to a specific agent.
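The division of labor between the two contexts can be sketched in a few lines of purely illustrative code (the function names and the dictionary-based “contexts” are my own toy assumptions, not part of the model itself):

```python
# Toy model of the layers discussed above: data become information through
# a context shared by many agents; information becomes knowledge through
# a context private to one specific agent.

def to_information(data: str, shared_context: dict) -> str:
    """Shared, sociocultural context: e.g. how to read a thermometer."""
    return shared_context[data]

def to_knowledge(information: str, private_context: dict) -> str:
    """Private context: what this particular agent should do about it."""
    return private_context[information]

SHARED_CONTEXT = {"22°": "the temperature of this room is 22°"}
my_private_context = {"the temperature of this room is 22°": "take off your jacket"}

info = to_information("22°", SHARED_CONTEXT)     # data -> information
action = to_knowledge(info, my_private_context)  # information -> knowledge
print(action)  # take off your jacket
```

Any agent operating successfully in the same environment could reuse SHARED_CONTEXT; my_private_context, by contrast, belongs to one agent only.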
If we try to use this model to describe what happens in communication, we get the following graphic, inspired by [Favre-Bull 2001, p. 87, ill. 47]. The icons of graphs, which represent the contexts, hint at the widely held opinion that networks are a useful model for understanding the relationships in a semantic environment. With this in mind, we can describe the situation of two people discussing a possible purchase as illustrated by graphic 5.
Graphic 5: Discussing the price of a purchase
This graphic is to be read as follows: The person in the top left corner has been asked the price of an item which the person in the top right corner wants to purchase. Based on his knowledge of the price he fixed before (or, if none has been fixed, of the situation of his business and the going rates for such merchandise), he selects an amount. This is converted into information, combining an amount and a currency, and in the next step into data which can be transmitted. Whether the signal used for that purpose is a set of sound waves at a market stall or a string of bits transmitted from an internet shop via the WWW is irrelevant for our purpose. In both cases, the purchaser on the right-hand side will receive that signal, hopefully undistorted by the acoustic noise at the market or the static electricity around the connecting lines. The signal, reconstituted as data, gets transformed into information as amount and currency, and allows the purchaser to act upon the offer, contemplating it in terms of the desirability of the object and her overall budget.
That all of this works requires that the background knowledge of the two participants in the conversation, and the common context of sociocultural conventions between them, overlap sufficiently that they have the feeling of “understanding” the other party. This overlap does not need to be complete: Whether the seller in a developing country believes he is asking an outrageous price, while the purchasing tourist understands she has been made a gracious offer, does not really hinder the transaction (even if the communication does not get everything across). But incompatible contexts may destroy the communication: If the tourist believes she must haggle, as this is allegedly required in the sociocultural context of the country where the exchange takes place, while the seller offers a fixed price, the two parties will not get anywhere.
So there are two levels at which communication may break down: the transmission of the signal amidst noise, at the bottom of the diagram, and the compatibility of the knowledge contexts (semantic contexts, for short) on top. May break down: but most of us have frequently encountered two parties in a conversation misunderstanding what the other party says, so this is a more serious problem than the speculative “may” indicates.
One might even say, preliminarily, that this is the most crucial problem when applying this model to information as contained in historical sources. If in graphic 5 you replace the present-day icons of seller and purchaser with photographs of statues of Pericles and Thucydides, and the string “5 €” with the string “πόλεμος”, the diagram is still completely appropriate to describe what Thucydides understood when listening to Pericles explaining his policy, as both shared the same sociocultural context. If you replace the seller with Pericles and assume that the lady’s icon on the right-hand side represents a modern historian, you immediately see that this will not work: a modern historian simply does not share the sociocultural context with Pericles. Which is an incomparably more serious problem than the question of how close the text of Thucydides we have is to what Olorus’ son actually wrote, let alone which words the offspring of Xanthippus had really chosen.
Nevertheless, information technology in its broadest sense has until quite recently tended to focus on how to overcome the signal noise, not so much the semantic noise, of a process of communication; therefore much of the discussion of information theory proper concentrates on signal handling. In the next section we will try to point out that this is not helpful for handling information as it occurs in historical sources, and in section 3 we will propose some concrete technical consequences. Just to clarify things, let us point out why this somewhat distorted focus may have arisen.
Much of the information technology we have today, and more of what constitutes information theory as a contemporary discipline, rests upon the research paper by Claude Elwood Shannon published in two parts in 1948, entitled A Mathematical Theory of Communication [Shannon 1948]. Without the concepts presented there, electronic communication and signal technology would simply not be what they are today: no TV free of distortions, no phone connection without crackling background, most certainly no internet. Nevertheless, Shannon was very clear on what he thought could be done and what could not. The second paragraph of his paper starts:
“The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem.” ([Shannon 1948, 379]; “meaning” italicized by Shannon, the last sentence by me.)
This serves as an introduction to the basic diagram of the model of communication employed, reproduced in graphic 6, the traces of which you can easily recognize in graphic 5.
Graphic 6: Shannon's model of communication
Shannon’s paper was most fundamental to modern information technology; but it was not exactly easy to read. Being recognized as fundamental already at the time, it was republished as a book just one year later, in 1949, this time entitled The Mathematical Theory of Communication [Shannon 1949]. Not exactly easy to read, it was prefaced by an introduction by Warren Weaver, as a mathematician highly qualified to understand the original argument and personally highly qualified to write transparent prose; his text probably influenced the perception of Shannon’s concepts outside of signal engineering much more than the original text did. Now Weaver was fully aware of the semantic problems of communication. He lists three levels of communication problems:
“Level A. How accurately can the symbols of communication be transmitted? (The technical problem.)
Level B. How precisely do the transmitted symbols convey the desired meaning? (The semantic problem.)
Level C. How effectively does the received meaning affect conduct in the desired way? (The effectiveness problem.)” [Weaver 1949, 24]
It should be obvious that, with rather limited effort, our concepts of data, information and knowledge can be related to these three levels.
Two pages later, Weaver even writes: “So stated, one would be inclined to think that Level A is a relatively superficial one, involving only the engineering details of good design of a communication system; while B and C seem to contain most if not all of the philosophical content of the general problem of communication.” [Weaver 1949, 4] This basically rephrases Shannon’s statement.
But barely two pages later Weaver continues: “Part of the significance of the new theory comes from the fact that levels B and C, above, can make use only of those signal accuracies which turn out to be possible when analyzed at Level A. Thus any limitations discovered in the theory at Level A necessarily apply to levels B and C. But a larger part of the significance comes from the fact that the analysis at Level A discloses that this level overlaps the other levels more than one could possible [sic] naively suspect. Thus the theory of Level A is, at least to a significant degree, also a theory of levels B and C.” [Weaver 1949, 6] (Capitalization “Level A” vs. “levels B and C” is Weaver’s.)
And he starts his conclusion triumphantly: “It is the purpose of this concluding section to review the situation and see to what extend [sic] and in what terms the original section was justified in indicating that the progress made at Level A is capable of contributing to levels B and C, was to indicating [sic] that the interrelation of the three levels is so considerable that one’s final conclusion may be that the separation into the three levels is really artificial and undesirable.” [Weaver 1949, 25]
How this is disclosed has always escaped me, ever since I read it for the very first time. It seems to be at the root of the popular perception of Shannon’s model, however.
Let me start another comment on the relevance of Shannon for the fundamental understanding of communication with an anecdotal chronological observation. One of the best known undocumented quotations from the early world of computer technology is the statement which Thomas J. Watson of IBM allegedly made in 1943: “I think there is a world market for maybe five computers”. Is it not astonishing that only five years later a signal engineer created a theoretical model which is supposed to be at the foundation of digital technology, when signal technology at the time was still almost exclusively analog? Again, looking at what Shannon actually wrote helps.
He starts his reasoning with a model based on discrete signals. But in chapter III of his paper he extends it to continuous signals, starting with: “We now consider the case where the signals or the messages or both are continuously variable, in contrast with the discrete nature assumed heretofore. To a considerable extent the continuous case can be obtained through a limiting process from the discrete case by dividing the continuum of messages and signals into a large but finite number of small regions and calculating the various parameters involved on a discrete basis. As the size of the regions is decreased these parameters in general approach as limits the proper values for the continuous case.” [Shannon 1948, 623]
So discrete and continuous computations can relatively easily be mapped onto each other. As discrete and continuous are the appropriate technical terms for “digital” and “analog”, which in common parlance (not least in the so-called Digital Humanities) have taken on an almost eschatological importance, let me reflect on this a little longer, even if it may seem a detour from the main line of argument in this paper.
Analog mechanical calculation is best envisaged by a thought experiment. Assume you want to add the two numbers 1.00134 and 0.48723. What you have are three precisely polished glass containers with an extremely fine-grained scale on the side, and a supply of water. To perform the addition, you fill 1.00134 liters of water into container A and 0.48723 liters of water into container B. Next you pour the content of both containers into container C and read from the scale on its side that it now contains 1.48857 liters. This is an analog computation: to the best of my knowledge never implemented with water, but an important line of research in the early days of electronic computing, when amperage and voltage were used instead of water, and still used for specialized purposes. (The classically trained humanist is right to be reminded of Archimedes’ method of measuring the volume of an irregular object; the anecdote with the bath tub, for the not so trained reader. The fervent reader of alternative-history science fiction may, I vaguely remember, actually have encountered a hydro-analog computer somewhere.)
Now a digital mechanical calculation is best envisaged by thinking of a child’s building blocks. To add the numbers 3 and 4 you build two towers, one out of three, the other out of four building blocks, put one on top of the other and read off the correct answer, 7. This principle can of course be extended quite considerably; there is no reason why you should not build two towers out of 100134 and 048723 building blocks, putting one on top of the other, resulting in a tower of 148857 blocks to which a decimal point would have to be applied. No reason not to do so, except that you would want to use considerably smaller building blocks than in the 3-plus-4 case.
If we doubt that it is possible to handle hundreds of thousands of building blocks: do not forget that in our analog thought experiment we had accepted that it would be possible to etch a scale into glass fine enough to read one hundred-thousandth of a liter. If we continue our continuous (i.e., analog) thought experiment, we will eventually arrive at a scale which allows us to read the amount of water represented by one molecule; if we continue our discrete (i.e., digital) thought experiment, we will ultimately arrive at building blocks the size of one molecule of an appropriate medium, water, for example. In other words: both thought experiments ultimately converge. The difference between a discrete and a continuous model of computations by physical devices is simply derived from the precision of the instrument we use to measure the physical units. “The digital” is no category of eschatology but one of plain instrument making. If not a term for eschatology, one should even doubt whether it really should be one of epistemology.
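The convergence of the two thought experiments can also be illustrated numerically. The following toy sketch (my own illustration, not part of the original argument) performs the addition “digitally” with ever smaller building blocks and watches the result approach the “analog” sum of the water containers:

```python
# A "digital" (discrete) addition approaches the "analog" (continuous)
# one as the building blocks shrink -- Shannon's limiting process in miniature.

def digital_add(a: float, b: float, block: float) -> float:
    """Add two quantities using towers of whole blocks of size `block`."""
    return (round(a / block) + round(b / block)) * block

analog_sum = 1.00134 + 0.48723  # the water-container result: 1.48857 liters

for block in (0.1, 0.001, 0.00001):
    approx = digital_add(1.00134, 0.48723, block)
    print(f"block size {block}: digital sum {approx:.5f}")
```

With blocks of 0.1 the digital sum is a crude 1.5; with blocks of 0.00001 (the building blocks of the tower experiment) it agrees with the analog sum to all five decimal places.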
Bluntly speaking: a digital watch is a much better analogy of time than a mechanical one. Its timer cuts time up into slices which are orders of magnitude more finely grained than the ones a mechanical watch can produce. Many “analog” films use crystals for light recording which are coarser than the resolution of a digital camera. Their resolution is lower, therefore, than that of the digital camera, which in turn is more analog than the allegedly analog device. Digital technology is frequently more analog than analog technology.
I beg the patience of the reader for this seeming fixation on the low relevance of the “digital” for the handling of information. The reason will become apparent later in the argument, when we grapple with one misunderstanding which is ultimately derived from a misapprehension of this principle: the ubiquitous “Computers know only yes or no; they are unable to handle nuances at all. (And that is ultimately why they are unfit for historians and humanists.)”
2. Information in historical sources
One of the reasons why the discussions about the application of computational techniques to the Humanities frequently leave one rather confused, particularly if they are organized under the label of “Digital Humanities”, is that their applicability depends heavily on the methodological position of the speaker. If, as F. R. Ankersmit assumes, the point of history is an exercise in the literary presentation of reflections about high-level politics (where even anthropological reflections on village communities, highly focused on narrative as they tend to be, are a deviation from the true path), there is little computer technology can do beyond replacing the typewriter. If a philosophy or methodology of history is derived from the narrative to be produced, the sources, and how to extract information from them, are a secondary phenomenon. If a philosophy or methodology of history is derived from the problems posed by gaining knowledge from the objects and documents that have survived, they are the primary one.
My own position (clearly thinking that historical research is defined by the way we get knowledge about the past, not by how we present that knowledge to the public) will be the topic of another of these papers offered as blog posts; but the following considerations would be incomprehensible if that position were not stated here.
Let us start our consideration of the information contained in historical sources with graphic 7, modified from graphic 5 to apply that model to the situation a historian finds herself in when she tries to understand Pericles’ reasons for implementing his policy regarding the war with Sparta, as reflected by Thucydides.
Graphic 7: Reasoning about Pericles' policies
Our friend faces two problems. The first we already mentioned when discussing information as such: we do not share the silent assumptions of the sociocultural context (question mark in the middle), so we have no access to the context in which Pericles formulated his policies. But we have ignored another problem so far: the information we have (or some of it, in any case) did not originate from Pericles but from Thucydides, being the result of an earlier communication process. Estimating the distortions produced by such earlier communication processes is the bread and butter of historical research, as far as it is focused on the content rather than the literary qualities of historical writing.
The topic has a long tradition, therefore. Let us imagine two potential sources on Hannibal Barkas which might help us understand which of the three possible routes he took across the Alps with his elephants. One shall be a statue or bust of Hannibal. We use it to get insight into his character: whether he impresses us as restrained overall, having a character which implies that, despite his undisputed audacity, he was governed by a tendency to select, among the risky alternatives, the safest one over whatever advantages the others may possess. The other source shall be a remarkably deep layer of horse manure, found on one of the possible routes, indicating by various archaeometric methods that a very large number of horses and other producers of manure passed a specific Alpine pass at about the time of the Hannibalic crossing [Mahaney 2017].
Judging the relative value of these sources, historians of my stripe would argue for the horse manure. The artist who shaped Hannibal’s statue had almost certainly never seen him; and even if he had, he almost certainly tried to express those features of his character which were emphasized by the contemporary discussion of the man. The horses producing the manure had no intention whatsoever of expressing any opinion at all.
Since the 19th century this has been clearly fixed in historical methodology. Droysen [Droysen 1937, 38-50] clearly emphasized that traditions (Traditionen), like statues, which are the result of an intentional effort to leave a specific view of an event, are less valuable as sources than remainders (Überreste), which result from processes not controlled by an intention to leave a specific image for the coming generations. Admittedly, Droysen was thinking less of statues and horse manure than of earlier historiography vs. the results of administrative documentation.
For us this implies that a historian is not interested in the message the author of a source wanted to transmit, but rather in such insights about the situation as the source provides independently of the intentions of its author.
An example. Let’s look at the Römischer Käyserlicher Majestät Ordnung und Reformation guter Policey, im Heiligen Römischen Reich, zu Augspurg Anno 1530 auffgericht, usually quoted as the Reichspolizeiordnung of 1530 [Weber 2001], in English best described, rather than translated, as a constitutional law of the Holy Roman Empire, which after some major revisions remained in force until 1806.
Observation 1: At a rather prominent place, this constitutional law contains the following section: IX. Von unordentlicher und köstlicher Kleidung. (IX. On irregular and costly clothing.) It consists of extremely detailed regulations determining which members of which social group would be allowed to spend how much on their garments: a dress code allegedly preventing senseless luxury, while at the same time carefully preserving the visible differences between social ranks.
What will a historian learn from the presence of that section? “OK, the last attempt at regulating expenditure on clothing did not work, again.” (Such dress codes are a textbook example for history students of the rule that the permanent re-issuing of regulations is a sure sign they were being ignored.)
Observation 2: § 2 of the subsection of this dress code which speaks “Von Bürgern und Inwohnern in Städten” (of citizens and other people living in towns) prescribes: Deßgleichen sollen sie kein Tuch, die Elen über zween Gülden werth, ihnen anmachen lassen, oder einig Marder, Zobel, Hermlin und dergleichen Futter antragen. Wol mögen sie zum höchsten Marderkehln, und ihre Haußfrauen Fehine Futter gebrauchen. (Furthermore, they shall not have any cloth made up for them which costs more than two guilders a yard, nor wear any marten, sable, ermine or similar lining. At the most they may use scraps of marten, and their wives squirrel fur, for lining.)
What will a historian learn from that text? “Some plain citizens could occasionally afford rather expensive furs, if they set their mind to it.” “Martens were still sufficiently numerous in the woods to make them a borderline case when luxury was discussed.”
Observation 3: § 4 of the same section on townspeople reads: Wäre es aber Sach, daß ein solcher Handwerker in einer Stadt in Rath wird erwählt, alsdann soll derselb mit Kleidung sich nicht anderst, dann hernach von Kauffleuten gemeldt wird, zu halten Macht haben. (In case a craftsman is elected into the council of his town, he shall keep, for his garments, to the same rules which are defined further below for merchants.)
What will a historian learn from that text? “There were sufficiently many towns where craftsmen had achieved the right to enter the council of the city that special regulations for them were needed.”
Let’s look at these historical findings again:
1. OK, the last attempt at regulating expenditure on clothing did not work, again.
2. Some plain citizens could occasionally afford rather expensive furs, if they set their mind to it.
3. Martens were still sufficiently numerous in the woods to make them a borderline case when luxury was discussed.
4. There were sufficiently many towns where craftsmen had achieved the right to enter the council of the city that special regulations for them were needed.
What do these four historical results have in common? They are information which is implied by the source; they are in no way the message which the authors of the law wanted to send to their contemporary recipients. Shannon’s focus on the ability to transmit a message is therefore not a sensible model for understanding the information contained in a historical source. A historian is an interpreter of messages, not a recipient. He or she may interpret them correctly, but if the original context of the messages is lost, all such interpretations have to remain tentative. Graphic 8 tries to visualize this.
Graphic 8: Transmission model for historical sources
If we change the basic model of our understanding of how we process information, does that have implications for the technological solutions we should use? I think so. The following and final section will try to explore where we have to deviate from the current implementations of information systems. To prepare for that, let me close this section with a high-level view of the requirements of historical information systems, which I consider to be derived from our musings on the nature of information above.
An information technology appropriate for historical sources:
1. represents the artifacts as free from any interpretation as possible in the technical system,
2. embeds them, however, in a network of interpretations of what they imply,
3. provides tools which help to remove contradictions between such interpretations,
4. accepts, however, that such contradictions may prove to resist resolution,
5. as well as that all interpretations always represent tendencies, not certainties.
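As a first, purely hypothetical sketch of what such a system might look like (every name below, and the trivial conflict test, are my own assumptions, not a proposal from the literature), the five requirements could be mirrored in a data structure along these lines:

```python
from dataclasses import dataclass, field

@dataclass
class Interpretation:
    claim: str                 # what the interpretation asserts
    confidence: float          # a tendency, never a certainty (requirement 5)
    contradicts: list = field(default_factory=list)  # unresolved conflicts (requirement 4)

def conflicting(a: Interpretation, b: Interpretation) -> bool:
    # Toy conflict test: a claim and its explicit negation ("not ...") conflict.
    return a.claim == "not " + b.claim or b.claim == "not " + a.claim

@dataclass
class Artifact:
    raw: str                   # requirement 1: the uninterpreted representation
    interpretations: list = field(default_factory=list)  # requirement 2: attached, not embedded

    def add(self, interp: Interpretation) -> None:
        # Requirements 3 and 4: contradictions are recorded, not rejected.
        for other in self.interpretations:
            if conflicting(other, interp):
                other.contradicts.append(interp)
                interp.contradicts.append(other)
        self.interpretations.append(interp)

tally = Artifact(raw="stick with 23 notches, writing eroded")
tally.add(Interpretation("the notches count sacks of grain", confidence=0.6))
tally.add(Interpretation("not the notches count sacks of grain", confidence=0.3))
```

Note that the system keeps both readings side by side; resolving, or failing to resolve, the contradiction is left to the historian.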
3. Consequences for information systems
While, to the best of my knowledge, the above attempt to intertwine musings about historical sources with considerations of the nature of information as handled by technical systems is a rather rare exercise, musings about the nature of information in such systems are anything but rare. As mentioned at the very beginning, the knowledge pyramid has left traces in information science, cybernetics, the philosophy of information and a few other disciplines. But with all respect due to the excellent work done by all these disciplines within themselves, it is hard to see how information technology would have developed differently from the way it did if they had never existed (extremely rare exceptions, Norbert Wiener for example, proving the rule).
The main systematic exception is cognitive science, which contributes its own share of theoretical reflections on the nature of information, but frequently interconnects with the implementation of technical solutions and therefore actually influences technical development. If thinking about the nature of information in historical sources does not influence the development of information technology, at first glance that just hurts the vanity of the thinker. At second glance, however, one reaches the conclusion that information technology serves historical research less well than it should if it implements assumptions which violate the principles formulated at the end of the last section. If historical information can only be handled by systems which cannot handle contradictory information, such systems are ultimately inappropriate for the task.
In software engineering we are familiar with the term “technology stack” or “software stack”, describing the selection of technologies made at various levels to implement a system. A restrictive definition usually defines a specific stack as the choice of an operating system, a web server, a database system and a programming language. LAMP (Linux, Apache, MySQL, {Perl, PHP or Python}) is the best-known example. The term “stack” suggests that once the bottom level has been chosen, the choice of the upper levels is severely restricted. The existence of the WAMP (Windows, Apache, MySQL, {Perl, PHP or Python}) stack shows, however, that the metaphor fails here: the different levels are related, but not strictly hierarchical. In a wider sense, we also speak of a technology stack if we try to describe the selection of software components used to implement a solution: say, the selection of JAR packages employed to build a Java-based system.
I propose conceptual stack as a new term for the combination of general concepts which go, implicitly or explicitly, into the design of information systems. A conceptual stack in that sense is more abstract than the software stack, but sufficiently concrete to determine specific properties and capabilities of all information systems built upon that stack. As in the technical case, we assume these concepts to depend on each other, but not in a strictly hierarchical sense.
In contemporary information systems, I identify at least five conceptual decisions,
which restrict their usefulness for the handling of information contained in historical
sources as discussed above. These are:
(1) The interpretation of the signals used for communication as
granular units of information. What in my understanding is Weaver’s distortion of
Shannon’s concepts.
(2) The belief that, as bits can be used to conveniently implement Boolean logic,
computer systems necessarily have to be based on binary logic. What in my
understanding is the binary fallacy.
(3) The assumption that the language of historical documents can best be approached
by analyzing their syntax. What in my understanding is Chomsky’s dead end.
(4) The approach of embedding interpretations of an object into its representation. What
in my understanding is the markup fallacy.
(5) The principle that variables in a programming language are conceptually
independent of each other, as long as they are not explicitly connected into a structure
or object. What in my understanding is, for reasons not immediately apparent, the
Gorilla syndrome.
To each of these conceptual layers a paper will be dedicated on this blog. For now, I
restrict myself to describing them briefly and to giving hints at what implementations of
solutions could look like.
3.1 Weaver’s Distortion
My accusation against Weaver, that in the attempt to make Shannon’s model easier to
understand he mixed up different conceptual levels, can probably be made a bit more
transparent by an analogy. Norbert Wiener famously said “Information is
information, not matter or energy”. In the world of matter and energy we are
perfectly aware that there exist two closely connected but in many ways independent
sub-worlds: a Newtonian one and a world that is ruled by Quantum physics. They
are closely connected; nevertheless the confusing habits of quarks do not prevent the
Earth from circling the Sun in an encouragingly reliable way, even if gravitation,
responsible for that reliability, can probably be understood only on the sub-nuclear
level. My proposal is, that a similar separation can be used to understand the
relationship between the world of data, turning into information and knowledge in the
context of other data, and the signals constituting those data. On the possibility to use
more than one theory of information in parallel see [Sommaruga 2009].
Such differences in the interpretation of signals exist in quite a few fields of
computing. When, e.g., you look at image processing, the JFIF encoding of JPEG
compression mixes two views: most of the steps interpret bytes as numbers which are
conceptually points in the continuum, so that the fact that their representation is made up
of bit strings is an irrelevant accident. In the final step of the compression algorithm,
however, the blocks of numbers are handled as bit strings, the fact that they consist of
numbers being completely irrelevant when they are compressed by Huffman encoding.
The computational legacy of Weaver’s distortion is, therefore, that the programming
paradigms we use today inherit it: operations which are supposed to interpret information
are implemented on data types, which are exactly that: data.
One of the reasons for this is, that what you do with numbers can be understood and
validated by the formal apparatus provided by analysis in the mathematical sense and
particularly by numerical analysis among its branches. For strings similarly stringent
formalisms have been developed.
But anything related to meaning seems to be more slippery. Though there have been
attempts to change that: Keith Devlin specifically proposed a mathematical theory
(well, a pre-mathematical one, in his own words) which addresses that problem:
“… whereas in this essay I am taking information itself as the basic entity under
consideration. More precisely, I am seeking a specific conceptualization of
‘information’ as a theoretical ‘commodity’ that we can work with, analogous to (say)
the numbers that the number-theorist works with or the points, lines and planes the
geometer works with.” [Devlin 1991, 17] (All italics are Devlin’s.)
Staying at the lowest illustrative level, and knowingly ignoring, that the bulk of
Devlin’s theory is built upon units one level of complexity higher [cf. Devlin 2009],
we can describe his approach by the notion of an “infon” as the atomic unit of an
information system. An infon is defined as
<< P, a1, …, an, l, t, i >>
The parameters are defined as follows:
P: an n-ary relation
a1, …, an: objects between which P holds
l: a spatial location
t: a temporal location
i: a truth value
Two comments:
(a) Objects a1, …, an: These are best understood as other infons, as the whole system is
most conveniently understood as completely recursive. Keep in mind that the recursion
can always end with an infon in which everything consists of nil, though I am not sure
whether this understanding is Devlin’s or mine.
(b) Strictly speaking, Devlin defines an infon on p. 22 of Logic and Information
[Devlin 1999] as << P, a1, …, an, i >> and adds l and t only in the following
paragraphs. As it is obviously simple to replace both these parameters by nil when they
are not applicable, I recommend using the more complete definition from the start, for
easier understanding.
As an example Devlin [Devlin 1991, 24] gives:
<< marries, Bob, Carol, l, t, 1 >>
to describe the information that Bob marries Carol at a location l and a date t. As
this is true, the final parameter is 1. If it were an alternative fact, a.k.a. a lie, it
would be 0.
Considering this as a basic notion of information has many attractions. On the one
hand, it acknowledges, that information grows out of data in context; on the other, it
reflects that knowledge does not have to be true. Zeus and Hera are married on
Olympus, even if they do not exist and the spatial location of Olympus requires an
interesting extension of the concept of space.
Two short comments:
(a) I strongly propose (see section 3.2 below) to replace Devlin’s binary truth
values by continuous ones in the unit interval [0, 1].
(b) Infons remind one immediately and obviously of RDF triples. While an n-ary
relation can obviously be represented by a set of binary ones, I would recommend
avoiding this simplification. An n-ary relation holding at a given spatial and temporal
location is something rather different from a set of binary relations with slightly
different temporal and spatial coordinates between them.
Research proposal in software technology 1:
Implement infons for seamless usage in mainstream programming languages. (Or
possibly situations, the slightly more complex abstraction which actually is the
subject of the bulk of Devlin’s theory, omitted here to simplify the argument.) ⁋
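Research proposal 1 can be made a little more concrete. The following is a minimal sketch (in Python, for brevity) of an infon as a recursive data type; all names (`Infon`, `NIL`) are hypothetical illustrations, not an existing library, and the continuous truth value already anticipates the proposal of section 3.2:

```python
from dataclasses import dataclass

# Hypothetical sketch of Devlin's infon << P, a1, ..., an, l, t, i >>.
NIL = None  # the terminating "nil" value discussed in comment (a)

@dataclass(frozen=True)
class Infon:
    relation: str            # P, an n-ary relation
    objects: tuple = ()      # a1 ... an: other infons, or atomic values
    location: object = NIL   # l, spatial location (nil when not applicable)
    time: object = NIL       # t, temporal location (nil when not applicable)
    truth: float = 1.0       # i, here continuous in the unit interval [0, 1]

    def __post_init__(self):
        if not 0.0 <= self.truth <= 1.0:
            raise ValueError("truth value must lie in [0, 1]")

# Devlin's example: Bob marries Carol at location l and time t; truth 1.
marriage = Infon("marries", ("Bob", "Carol"), location="l", time="t", truth=1.0)

# Recursion: an infon may itself occur among the objects of another infon.
report = Infon("reports", (marriage, "some chronicle"), truth=0.8)
```

The recursive `objects` tuple reflects comment (a) above: the elements of one infon can themselves be infons, with `NIL` available to terminate the recursion.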
But our argument for basing information systems on other building blocks than
the current data types does not end here. We have derived this requirement from our
initial consideration that information should be understood in the sense of one or the
other version of the knowledge pyramid, not on the level of signals.
For simplicity’s sake, we have so far avoided the discussion of the shortcomings of
the knowledge pyramid itself, which have led to various criticisms against it and
occasional calls for its abolishment. Let’s look again at the starting example:
information results when marks like these are put into some context. So “22°” are data;
“The temperature of this room is 22°” is information.
Well, yes. However: what about the number “22”? Before it turns up on a
thermometer, it might measure the length of a room in feet, the weight of a truck in
tons, the distance between two towns … So “22” is data; by being contextualized as “22°”
it becomes information. And what about the bit string “0000001000000010”? It could
be the “device control character 2” from the ASCII table, the first half of the Unicode
code for the universal quantifier symbol, … So “0000001000000010” is data; by
being contextualized as “the value of an integer variable” it becomes information.
And so on.
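The point that one and the same bit string yields different information in different contexts can be illustrated directly; a small sketch, with the readings chosen purely for illustration:

```python
# One 16-bit signal, read under three different conventions.
bits = "0000001000000010"
value = int(bits, 2)
raw = value.to_bytes(2, "big")            # two bytes: 0x02 0x02

as_integer = int.from_bytes(raw, "big")   # "the value of an integer variable"
as_control = [b < 0x20 for b in raw]      # two C0 control characters
as_code_unit = chr(value)                 # a single Unicode code point

print(as_integer)  # 514
```

The signal itself stays constant; only the contextualization decides which of the three readings counts as the information carried.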
At first look, one can react in two ways to this discovery: argue that it contradicts the
knowledge pyramid model; or argue that it emphasizes its validity as a confirmation
of the overwhelming importance of context, even if the transitions between
interpretative levels are more complex than the simple three-level model (data,
information, knowledge) indicates.
I would recommend that we use it as an encouragement to look at another approach
for the understanding of the concept of information, which unfortunately has left only
very few traces [e.g. Kettinger 2010] in its discussion: Langefors’ “infological
equation” [Langefors 1973]. According to him, the information communicated by a
set of data, is understood to be a function i() of the available data D, the existing
knowledge structure S and the time interval t which is allowed for the
communication, given by the formula
I = i(D,S,t)
It should be understood that “formula” here just stands as a help for
conceptualization, not as an object in any calculus. Nevertheless, one can use it for a
stimulating exercise in reasoning, where we assume that S represents existing
knowledge, the context of the transmission of information. The interesting thing
about Langefors’ equation is that it introduces the time which a communication, or the
process of generating information out of data, takes. If information is derived from
data in a continuous process, we must assume that the longer that process may take,
the more information we may extract (“more information” being easily conceptualized as
“information in a more complex context”). Formulated differently: if this is a process
represented by a function, rather than a timeless transition, there is not a discrete, but
a continuous relationship between data and information. That is, what has been data
at one stage is information at another. Still using a formula as a notation to help
reasoning, we can therefore write
I2 = i (I1, S2, t)
to represent the notion that the information available at time “2” is a function of the
information available at time “1” and the knowledge available at time “2”, depending
on the length of time we have available for the execution of that function. As I have
discussed this in detail elsewhere [Thaller 2009, 345ff.], I will cut the argument
short and just mention that from this stage we can develop the argument further,
representing knowledge, S, also as a function s() rather than a static entity, and arrive at
Ix = i(Ix-α, s(Ix-β, t(x-β)), t(x-α))
To be read as: the information available at time x is the result of an interpretative
process i() which has interpreted the information available at the earlier point of time
x-α over the time span t(x-α) between x-α and x, in the context of a knowledge generating
process s(). This knowledge generating process in turn has been running over the
time span t(x-β), using the information available at the point of time
preceding x by β.
What I find fascinating about this model are its implications. Computer science
today usually assumes that we represent “information” statically in data
structures, on which dynamic functions (algorithms) operate. The implication of the
ideas above is that no such thing as static information exists; “representing” it just
captures a snapshot of a continuously running algorithm.
Research proposal in software technology 2:
Represent information as a set of conceptually permanently running algorithms, the
state of which can be frozen and stored. ⁋
In a sense, this is the most ambitious proposal contained in this paper. I assume that
it closely connects with neural networks. How to realize even the simple task of comparing
the similarity of two strings in a context where the strings are represented by nodes
of an active network is, however, not entirely trivial.
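What a “frozen” and resumable interpretation process in the sense of research proposal 2 might look like can be hinted at with a toy sketch; the class `InterpretationProcess` and its methods are hypothetical illustrations, not a proposed design:

```python
import copy

# Toy sketch of "information as a permanently running algorithm": an
# interpretation process that refines its result as long as time allows,
# and whose state can be frozen (stored) and later resumed.
class InterpretationProcess:
    def __init__(self, data):
        self.data = data          # D: the raw data
        self.context = []         # S: knowledge accumulated so far
        self.step_count = 0

    def step(self):
        """One unit of interpretation time: enrich the context."""
        self.context.append(f"reading {self.step_count} of {self.data!r}")
        self.step_count += 1

    def run(self, time_budget):
        # I = i(D, S, t): more time, more context, "more" information.
        for _ in range(time_budget):
            self.step()

    def freeze(self):
        """Snapshot the running interpretation: 'static information'."""
        return copy.deepcopy(self)

p = InterpretationProcess("22°")
p.run(2)
snapshot = p.freeze()   # stored "information" is just a frozen state
p.run(3)                # the live process keeps interpreting
```

The snapshot and the live process diverge after the freeze, which is exactly the claim above: what we store as “information” is a state of an interpretation, not the interpretation itself.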
3.2 The Binary Fallacy
It is a long and in many ways successful tradition, to start the practical training of
computer scientists and generally that of software technologists by the introduction of
the concepts of bit and byte. But this focus has a tendency to create a distorted look
upon what is actually happening in software systems.
If you look at the software fragment
int parameter;
if (parameter) doSomething();
most programmers would spontaneously translate the condition as “if parameter is
true or one, execute the function. If it is false or zero, don’t.” Only if pressed would they
give the correct reading: “if parameter has any other value but zero (that is:
-2^15+1 to -1, or 1 to 2^15-1), execute the function.”
That is, a Boolean interpretation of the code fragment actually underuses what it expresses.
Similarly, one should point out that the “natural” representation of a floating point
number (1 sign bit, 8 bits exponent and 23 bits fraction) is an arbitrary decision by the
IEEE; historically, different decisions have been made by individual hardware
manufacturers. And together with the handling of subnormal numbers and of
infinity / NaN, floating point arithmetic, which appears as a completely “natural”
operation, is actually a bundle of short algorithms on a lower level of the software /
hardware stack.
Therefore, there is no technical reason why computations should be restricted to
binary logic, nor is there any compelling reason, why a “number” must be a zero-
dimensional point on the continuum.
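The observation that a float is itself a small algorithm over a bit string can be demonstrated by taking the conventional 1/8/23 layout apart; a minimal sketch, where the helper `decompose` is a hypothetical name:

```python
import struct

# Take a 32-bit IEEE 754 float apart into its conventional
# 1 sign bit / 8 exponent bits / 23 fraction bits.
def decompose(x: float):
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # biased by 127
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction

# -1.5 = (-1)^1 * 1.5 * 2^0: sign 1, biased exponent 127, fraction 0x400000
print(decompose(-1.5))
```

Nothing in the bit string itself dictates this split; it is the interpreting convention, applied by hardware or software, that turns the 32 bits into a “number”.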
Both of these preliminary observations should be kept in mind when we think about
the inherent fuzziness of information derived from our metaphor of the historian as an
observer of signals once exchanged between participants communicating in a context
which has been lost.
There are a number of phenomena which this inherent fuzziness encompasses.
Without claiming completeness: (1) Fuzziness in a narrower sense, i.e., the
impossibility to give a crisp truth value for a statement. (2) An inherent imprecision
of a semantic concept, as in “old people”. (3) An item which conceptually is a scalar,
but goes beyond our current datatypes. E.g. a price for a commodity, for which we do
not have a precise value, but a minimum and a maximum, plus possibly hints at the
distribution of the data points between these.
I do not claim this list to be complete; but I am so far working on the suspicion, that
all other phenomena of the general fuzziness of sources can be handled by a
combination of the techniques needed to solve these three. Specifically the two most
obviously missing ones: (a) the problem of missing data or decisions with incomplete
information and (b) the problem of decisions based on contradictory information.
Now, for all three of the problems mentioned partial solutions exist. These solutions
exist usually so high up the technology stack however, that they are applicable only
under very special circumstances.
If you solve any of these problems at the level of an application, available to
the end users of that application (e.g. a database “Spurious people of the 13th
century”, <http://sp10c.someuniversity.terra>), every other application to be
developed, ever, has to reinvent the solution.
If you solve any of these problems at the level of an application system (e.g. a
specific database system like Neo4J), it is easily available for all applications
realized with the help of that system; but every application realized in another
application system, ever to be developed, still has to reinvent the solution.
If you solve any of these problems at the level of a programming language
(e.g. C++ or Java), it is easily available for all applications realized with the
help of any application system realized with the help of that programming
language; though every application realized in another programming language,
ever to be developed, still has to reinvent the solution.
The solutions for the problems described below, are therefore assumed to be provided
at the level of (higher) programming languages with implicit cross-relations to the
solutions for the problems described in section 3.5 below.
In order of increasing complexity, the required solutions are presented in the inverse
order of their introduction above.
(3) “An item which conceptually is a scalar, but goes beyond our current datatypes.
E.g. a price for a commodity, for which we do not have a precise value, but a
minimum and a maximum, plus possibly hints at the distribution of the data points
between these.”
For this problem, Julong Deng’s Grey System Theory, or rather the systematic
treatment of the building blocks for such systems as described for Western readers by
Sifeng Liu and Yi Lin [Liu 2006, 2011], seems to provide almost a blueprint for a solution.
Research proposal in software technology 3:
Implement grey numbers, or a derivation from them, and integrate them seamlessly
into mainstream programming languages. ⁋
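As a first approximation of what such a datatype could offer, here is a sketch of an interval-valued number with the arithmetic needed for the price example above; `GreyNumber` is a hypothetical name, and the sketch deliberately omits the whitenization and distribution machinery of full Grey System Theory:

```python
from dataclasses import dataclass

# A scalar known only to lie between a minimum and a maximum.
@dataclass(frozen=True)
class GreyNumber:
    low: float
    high: float

    def __post_init__(self):
        if self.low > self.high:
            raise ValueError("lower bound exceeds upper bound")

    def __add__(self, other):
        # interval addition: bounds add independently
        return GreyNumber(self.low + other.low, self.high + other.high)

    def __mul__(self, k: float):
        # scaling; a negative factor flips the bounds
        lo, hi = self.low * k, self.high * k
        return GreyNumber(min(lo, hi), max(lo, hi))

# A commodity price recorded only as "between 12 and 15 (denarii)":
price = GreyNumber(12, 15)
total = (price + GreyNumber(2, 3)) * 2   # two purchases of price plus a fee range
```

The arithmetic propagates the uncertainty instead of forcing a premature decision for a single “true” value.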
(2) “An inherent imprecision of a semantic concept, as in ‘old people’.”
It is obvious that this problem closely relates to Zadeh’s [1965, 1975, 1978, 1999]
seminal work on Fuzzy Sets and Systems and the later concept of “Computing with
Words” based on linguistic variables, which has found widespread applications in
many branches of computation. One is tempted to say: in almost all of them outside
of the Humanities. This is a bit baffling, as Zadeh himself has said that he originally
expected the Humanities to respond more eagerly to his proposals than any other disciplines
[Blair 1994, Termini 2012]. Nevertheless, a systematic approach to make these
concepts useful for the handling of historical sources has to be a bit broader.
The smaller generalization required is that since 1965 Zadeh’s Fuzzy Sets have been
joined by Rough Sets [Pawlak 1982, Pawlak 1985], Evidence or Belief Theory
[Shafer 1976], and an almost endless list of modifications of the basic approaches,
like Intuitionistic Fuzzy Sets [Atanassov 1986], Hesitant Fuzzy Sets [Torra 2010,
Herrera 2014], Rough Fuzzy Sets [Nanda 1992, Jiang 2009] etc. etc.
Zadeh himself in his later years tried to combine some of these approaches into a
Generalized Theory of Uncertainty [Zadeh 2005], but this is too restricted in scope
and leaves aside quite a few of the approaches to handling problems of fuzziness
besides Fuzzy Sets (divergent capitalization intentional). As Barr has noted, “One of
the problems with fuzzy sets is that the meaning of the term has been left vague (one
might say fuzzy).” [Barr 2010, 400] Whether category theory is the right approach for
such an attempt, and how far it eases implementation, seems to me a not yet
completely decided question, even if Barr and Wells have shown that at least Fuzzy
Sets can be covered by topos theory [Barr 2010, 400-403, within the context
of 383-411].
A more general problem is that most of the approaches I have encountered still
assume fuzziness to be the exception rather than the rule, as it has to be if we accept
the model of an observer of signals exchanged in a lost context. The current logic of
embedding an approximately reasoned decision into an information system is
illustrated by graphic 9, an attempt to generalize the similar graphics contained in the literature.
Graphic 9: General logic of "computing with words"
In other words: from an information system which is crisp, some information is
transferred into a fuzzy box, and the result of the decision is made crisp again for the
major parts of the larger embedding system.
Research proposal in software technology 4:
Implement linguistic variables and integrate them seamlessly into mainstream
programming languages, as a permanently accessible data type in all parts of the flow
of execution. Base the implementation on a generalized concept of uncertainty which
broadens the scope of Zadeh’s theory of that name. ⁋
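At its simplest, a linguistic variable like “old people” reduces to a membership function; a minimal sketch, where the breakpoints 50 and 70 are arbitrary illustrative choices, not Zadeh’s:

```python
# A linguistic variable "old" as a fuzzy membership function over ages.
def membership_old(age: float) -> float:
    """Degree, in [0, 1], to which a person of this age counts as 'old'."""
    if age <= 50:
        return 0.0
    if age >= 70:
        return 1.0
    return (age - 50) / 20.0   # linear ramp between the breakpoints

# "Old people" is then not a crisp set but a graded one:
degrees = {age: membership_old(age) for age in (40, 60, 80)}
```

Every individual belongs to the concept to some degree, rather than being classified in or out, which is exactly the property a historical source with “senex” or “old” demands.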
(1) “Fuzziness in a narrower sense, i.e., the impossibility to give a crisp truth value
for a statement.”
This is in some ways the most puzzling problem for software technology. At first
look it seems to be rather simple, as logics with multiple values of truth, preferably
continuous truth functions, are well understood and an ample literature exists. The
engineering problem appears, however, when the evaluation of an expression in
multivalued logic is the base of a control structure.
In the code fragment
if (condition_is_valid) doSomething();
else doSomethingElse();
what happens, if “condition_is_valid” has a truth value of 0.75?
To the best of my (admittedly incomplete) knowledge, a very early proposal for the
inclusion of fuzzy logic into a programming language (Adamo’s LPL [Adamo 1980])
is the only one where, for such cases, a combination of the execution of both
branches is contemplated in detail.
Research proposal in software technology 5:
Design genuinely fuzzy control structures and integrate them seamlessly into
mainstream programming languages. ⁋
Two short remarks:
(a) This is probably a generalization of the preceding problem (embed fuzziness as a
principle into the general program structure, rather than as an “island” of approximate
reasoning into a crisp program).
(b) There are probably cross-relations to the problem of a “frozen algorithm”
mentioned with the comment on the consequences of trying to implement the
infological model of Langefors in section 3.1 above (research proposal 2).
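What a control structure in the spirit of Adamo’s combined-branch execution might look like can be sketched as follows; `fuzzy_if` is a hypothetical construct, and blending the results by weighted addition is only one of several conceivable semantics:

```python
# A "genuinely fuzzy" conditional: with a partial truth value, *both*
# branches execute, and their results are blended by the truth value.
def fuzzy_if(truth: float, then_branch, else_branch):
    if truth >= 1.0:
        return then_branch()
    if truth <= 0.0:
        return else_branch()
    # partial truth: run both branches and weight the outcomes
    return truth * then_branch() + (1.0 - truth) * else_branch()

# condition_is_valid has a truth value of 0.75:
result = fuzzy_if(0.75, lambda: 100.0, lambda: 20.0)   # 0.75*100 + 0.25*20
```

For side-effecting branches or non-numeric results the blending rule would need a far more careful design, which is precisely the engineering problem research proposal 5 points at.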
3.3 Chomsky’s Dead End
Linguistics and programming languages have shared an intimate relationship for a
long time; Backus-Naur form, one of the major break-throughs on the way from
tinkering to designing languages, acknowledges a debt owed to Chomsky’s early
work. Software technology has in the meantime certainly repaid that debt, as the
numerous programs which analyze the syntax of linguistic expressions show.
At least for historians, this has led to a focus of linguistic work which is at the least
not very productive, if not outright counter-productive. Syntax is certainly a very
useful tool for formulating correct sentences, but it obscures the fact that many
linguistic remainders have at best a doubtful syntax, and it may encourage a belief that
we understand texts which we do not.
“Igitur Carolus Magnus a Leone III. Pontifice Romae anno 800 coronatus, …” (“So
Charlemagne was crowned in the year 800 in Rome by Pope Leo III, …”; Robert
Bellarmine (1542-1621), De Translatione Imperii Romani, liber secundus, caput primum)
Charlemagne, Leo III and cardinal Bellarmine all would have been able to
“understand” this sentence, as all of them were quite familiar with Latin grammar.
But what would they really have understood? The chief of a network of post-tribal
German loyalties and the son of a modest family in Southern Italy, where Byzantium
was still looming large, might have had at least similar notions of what the title
“Imperator” implied. The idea which the highly educated 16th/17th century cardinal
connected with the title would certainly have been incomprehensible for both and
vice versa.
The problem is that this title is meaningful only within some understanding of the
roles within a specific political system. (Which is the reason why any good history
teacher at university level will spend at least about twenty times as long explaining what
exactly the term implied in the year 800 as on the story of the actual coronation.)
Whether the fixation on syntax is good for linguistics is a question linguists will have
to answer. Why a semantic understanding should require an understanding of the syntax of
a message, at least, I have never understood. There are linguists who doubt it. For a
historian, the notion of Roy Harris that much of modern linguistics is based upon a
language myth, which holds to the erroneous belief that we communicate in precise
statements where we understand all implications of what we say, is eminently
attractive. “Do we always know what we mean?” [Harris 1998, 14] Indeed, do we?
Do we ponder before we speak? Are we aware of all the assumptions we make in
formulating a sentence, and of all the implications our linguistic choices have for the
listener? That being so, how can we ever hope to understand an utterance whose last native
speaker died a few hundred years ago?
For historical studies, technological support for keeping track of semantic changes
would in any case be much more important; and it should not sit on top of several conceptual
layers which emphasize independence of context and syntactic bias, with “semantic
technologies” plastered on as an afterthought at the very top of the non-semantic
conceptual building.
Of the five problems I mention, this is probably the one, where it is hardest to
plausibly describe in brief a direction for a technical algorithmic solution.
Conceptual alternatives exist: the notion of Lakoff and Johnson that understanding
is based upon metaphors [Lakoff 1980] is immediately intuitive for a historian who
has tried to see through to the meaning hidden behind historical texts. And the notion
that the possibility to construct associations between different concepts, to blend
concepts [Fauconnier 2003], is actually a feature distinguishing the human mind in a
much more fundamental way than the I-Language and the Universal Grammar [Isac
2008], is highly convincing for the same historian.
But, as I said, while for most of the other proposals to solve a concrete technical
problem in this paper, a sensible starting point and the first few steps of the solution
are clear, how to implement metaphors and conceptual blending is much harder to
see. What might be very useful as the starting point would be semantic graphs, which
allow the handling of seemingly contradictory relationships between nodes. Blending
two concepts means that an edge connects two nodes where, from the point of
view of either of the two concepts, no edge should exist. This would require a class of
graphs where both nodes and edges are labelled and where there exist bundles of edges,
such that the acknowledgment of one implies the rejection of one or more of the others.
Under the label of “co-edges” I have discussed in other contexts that these would also
be useful for the implementation of the type of graphs discussed in the next section.
Research proposal in software technology 6:
Provide tools for the easy handling of such graphs in mainstream programming
languages. ⁋
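A minimal sketch of such a graph, with “co-edge” bundles in which accepting one edge implies rejecting the others; all class and method names are hypothetical:

```python
# A labelled graph with "co-edges": bundles of mutually exclusive edges,
# so that contradictory relationships can coexist in the representation.
class CoEdgeGraph:
    def __init__(self):
        self.edges = {}     # edge id -> (source, label, target)
        self.bundles = []   # sets of mutually exclusive edge ids

    def add_edge(self, eid, source, label, target):
        self.edges[eid] = (source, label, target)

    def add_bundle(self, *eids):
        self.bundles.append(set(eids))

    def consistent(self, accepted):
        """Accepting one edge of a bundle implies rejecting the others."""
        accepted = set(accepted)
        return all(len(accepted & bundle) <= 1 for bundle in self.bundles)

g = CoEdgeGraph()
g.add_edge("e1", "Imperator(800)", "is blended with", "Imperator(1600)")
g.add_edge("e2", "Imperator(800)", "is distinct from", "Imperator(1600)")
g.add_bundle("e1", "e2")   # the two readings contradict each other
```

Both contradictory readings remain stored; only an act of interpretation, selecting edges for acceptance, has to respect the bundles.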
3.4 The Markup Fallacy
From a theoretical point of view, markup languages are not a very central subject of
computer science; for many humanists and historians, however, they seem to be the
quintessence of the so-called Digital Humanities. In my opinion, the current usage of
markup in the handling of historical documents has two methodological weaknesses,
however, which are not easily overcome unless software technology provides
support for a new class of concepts.
The first of these weaknesses is rather straightforward: Embedding markup into a text
goes directly against the principle mentioned earlier, that a software system handling
historical information “Represents the artifacts as free from any interpretation as
possible in the technical system …”. On the surface, that is violated by some
principles which the TEI propagated strongly in its early days, when the
principle that markup should signal meaning, not display layout features, resulted in
the idea that e.g. italics should be encoded as an <emph> </emph> tag. How an encoder
knows, without parapsychological powers, that this was the intention behind inserting
italics has always beaten me. Though I have to admit that many historians have opted
for normalized instead of diplomatic editions, it should be apparent from the earlier
parts of this paper that this violates my methodological understanding of historical
research. From the methodological position formulated, any mixture of representation
and interpretation is a sin.
But there is a much more fundamental problem with this sort of markup, when one
looks at it from the point of view of processing historical data in information systems.
“Markup” according to the current paradigm, applies to text; adding explanatory or
analytic comments to an image, a 3D reconstruction or any other non-textual material
is considered an “annotation”. (Though annotations have recently also appeared
related to text.) I can see no epistemological reason whatsoever, why texts and other
forms of source representations are handled differently.
In principle it is quite possible to define standoff annotations which provide a
homogeneous solution for one-dimensional (textual), two-dimensional (images) … n-
dimensional data. Indeed, in the context of long term preservation, we have proven
that this is technically viable in one of my earlier projects [Thaller 2009].
However, while Desmond Schmidt [Schmidt 2009] has proposed a solution for
preparing standoff markup for a text, in a way which allows editing of the text
independent of the markup, I am not aware of any solution, which would allow this
for a data object of higher dimensionality.
Research proposal in software technology 7:
Develop a representation of “information objects”, where a data object of arbitrary
dimensionality can be combined with interpretative layers in such a way, that the data
object can be changed without damaging these layers. ⁋
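A toy sketch of the direction research proposal 7 points in, for the one-dimensional case only: annotations reference the text purely by offsets, and an edit shifts the affected offsets instead of invalidating the layer. All names are hypothetical, and a real solution would of course have to generalize to n dimensions:

```python
from dataclasses import dataclass

# Standoff annotation surviving an edit, one-dimensional case.
@dataclass
class Annotation:
    start: int
    end: int
    note: str

class AnnotatedText:
    def __init__(self, text):
        self.text = text
        self.annotations = []

    def annotate(self, start, end, note):
        self.annotations.append(Annotation(start, end, note))

    def insert(self, pos, fragment):
        """Edit the text without damaging the interpretative layer."""
        self.text = self.text[:pos] + fragment + self.text[pos:]
        shift = len(fragment)
        for a in self.annotations:
            if a.start >= pos:    # annotation lies after the edit point
                a.start += shift
                a.end += shift
            elif a.end > pos:     # annotation spans the edit point
                a.end += shift

doc = AnnotatedText("Carolus coronatus")
doc.annotate(0, 7, "person: Charlemagne")
doc.annotate(8, 17, "event: coronation")
doc.insert(8, "Magnus ")   # the text changes, the layers survive
```

The interpretative layer never enters the representation of the source itself; it only points at it, and the pointing survives the edit.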
There is a small caveat to be added to the above. All of these considerations relate to
the situation, where a source is converted by a 1 : 1 operation into a technical
representation, be it a human transcription or the scanning operation of an image.
Despite the emphasis on leaving a source as undistorted as possible, there is of course
the need to handle data objects which represent attempts to create a common
representation of more than one such object, e.g. the reconstruction of the
commonalities between the witnesses of an abstract text surviving as different
manuscripts. In principle the “leave the source unchanged” principle should apply
here as well. For such nonlinear texts the equation “one source is represented as a
string, i.e., an array of characters” obviously does not hold. I have myself proposed a
model for representing texts not as arrays, but as graphs [Thaller 1993]. Similar
situations may become important in the future for data objects of other dimensionality,
when the explosive spread of scanning techniques emphasizes more strongly the need
to represent relationships between families of images or other objects of higher dimensionality.
Research proposal in software technology 8:
Generalize the solution of research proposal 7 to handle graphs of objects of
inhomogeneous dimensionality. ⁋
3.5 The Gorilla Syndrome
Of all the problems I mention here, this is probably the one most confusingly named.
Let me start by quoting a text which is completely focused on software technology,
historical sources not even remotely concerned. It comes from a book with interviews
of people important for the development of programming languages [Seibel 2009].
Interviewer: So you started out saying software reuse is “appallingly bad”, but
opening up every black box and fiddling with it all hardly seems like movement
towards reusing software.
[ Joe ] Armstrong [inventor of the programming language Erlang]: I think the lack of
reusability comes in object-oriented languages, not in functional languages. Because
the problem with object-oriented languages is they’ve got all this implicit
environment that they carry around with them. You wanted a banana but what you
got was a gorilla holding the banana and the entire jungle. (My emphasis.)
What is described here seems, at first glance, to be a purely technical problem of modern programming languages. One of the major breakthroughs of software technology has been the invention of object orientation, which assumes that programs should be defined not as operations on numbers, characters or bits, but as interactions between more abstract units, objects, which hide the fact that ultimately, on some lower logical level, numbers, characters or bits are manipulated. In the context of historical sources you could envisage an object of the class "currency", which allows a convenient handling of expressions like "1.4.2" (one pound, four shillings, two pence). Convenient handling implies that an expression like "1.4.2 * 25" lets you multiply that amount by 25, resulting in "30.4.2". Of course, in other historical sources "1.4.2" would stand for one gulden, four kreuzer, two heller, and "1.4.2 * 25" should therefore result in "26.46.2". Solutions for such problems have of course been implemented, including solutions implemented by me. And as this is exactly the kind of simple abstraction which can be hidden perfectly within the class of an object, you would expect that all you have to do to apply such a solution is to import an appropriate JAR file into your software stack (or the equivalent in another object-oriented programming language; my weapon of choice is C++, but it is not referred to directly here, as there are presumably more Java-aware readers). So you simply include JAR files implementing the classes "currencyPound" and "currencyGulden" in your program.
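A minimal sketch of what such a class might look like, here for the pound/shilling/pence case; the class name, method names and the internal representation in the smallest unit are assumptions for illustration, not the author's actual implementation:

```java
// Illustrative sketch of a "currency" class for pounds/shillings/pence.
// All names and the internal representation are assumptions.
final class Currency {
    private static final int PENCE_PER_SHILLING = 12;
    private static final int PENCE_PER_POUND = 240;   // 20 shillings of 12 pence

    private final long pence; // value held internally in the smallest unit

    Currency(int pounds, int shillings, int pence) {
        this.pence = (long) pounds * PENCE_PER_POUND
                   + (long) shillings * PENCE_PER_SHILLING
                   + pence;
    }

    private Currency(long pence) { this.pence = pence; }

    // "1.4.2 * 25" becomes: new Currency(1, 4, 2).times(25)
    Currency times(int factor) { return new Currency(pence * factor); }

    @Override public String toString() {
        long rest = pence;
        long pounds = rest / PENCE_PER_POUND;       rest %= PENCE_PER_POUND;
        long shillings = rest / PENCE_PER_SHILLING; rest %= PENCE_PER_SHILLING;
        return pounds + "." + shillings + "." + rest;
    }
}
```

With these constants, `new Currency(1, 4, 2).times(25)` prints as "30.4.2"; a gulden/kreuzer/heller variant would differ only in its conversion constants, which is precisely why the abstraction hides so well inside a class.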
Unfortunately, however, the implementation of such a class would almost certainly use some other class to handle, e.g., the label strings "pound", "shilling", "pence" and "gulden", "kreuzer", "heller" needed to format such amounts for printing. String classes are very important for handling anything resembling a text; therefore they are usually quite complex and refer to many other classes. Unless the two classes "currencyPound" and "currencyGulden" (the bananas) are implemented in exactly the same software stack, they will likely use different string classes (the gorillas), which in turn use other classes forming different technology stacks (the jungle).
As many historical sources contain meaningful units which can easily be handled algorithmically, but cannot be mapped directly onto the basic data types used in software technology, classes allowing their handling would be extremely useful. Some of them are considerably less trivial: a class for geographical locations, e.g., which accepts that some of them refer to specific physical locations while others are purely mythical. (Remember Mount Olympus from section 3.1.)
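One way such a class could accommodate both cases, sketched here with names and structure that are purely my own assumptions: a location carries coordinates only when it actually has them, and the absence of coordinates is a legitimate state rather than an error.

```java
import java.util.Optional;

// Sketch of a "geographical location" class in the sense described above.
// Some places carry coordinates; purely mythical ones deliberately do not.
// All names are illustrative assumptions.
final class Place {
    final String name;
    final Optional<double[]> coordinates; // empty for purely mythical places

    private Place(String name, Optional<double[]> coordinates) {
        this.name = name;
        this.coordinates = coordinates;
    }

    static Place located(String name, double lat, double lon) {
        return new Place(name, Optional.of(new double[] { lat, lon }));
    }

    static Place mythical(String name) {
        return new Place(name, Optional.empty());
    }

    // A map display would ask this before trying to plot the place.
    boolean isMappable() { return coordinates.isPresent(); }
}
```

The physical mountain and the mythical seat of the gods can then coexist as two instances of the same class, each handled correctly by algorithms that first ask whether coordinates exist at all.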
Research proposal in software technology 9:
Look into possibilities to extend the object-oriented paradigm of programming into a context-oriented one. Intuitively speaking, by two approaches: (1) augmenting the "private" and "public" sections of classes by a "context" section, which provides an interface between classes outside of their lines of inheritance; (2) providing a possibility for "virtual system calls", which provide interfaces to tools that can be shared between programs in different languages. Apologies to readers who do not find this paragraph intuitive. ⁋
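No current object-oriented language offers such a "context" section directly. As a rough, purely illustrative approximation of what it should buy, the closest present-day Java idiom is a deliberately minimal shared interface: callers receive the banana through the interface and never see the jungle of string and formatting classes behind it. All names below are invented for this sketch:

```java
// Illustrative only: a deliberately minimal interface playing the role of a
// "context" section. Implementations may rest on entirely different internal
// string/formatting stacks; callers depend on nothing but this interface.
interface CurrencyContext {
    long inSmallestUnit();   // value expressed in the system's smallest unit
    String[] unitLabels();   // e.g. {"pound", "shilling", "pence"}
}

final class PoundAmount implements CurrencyContext {
    private final long pence;
    PoundAmount(long pence) { this.pence = pence; }
    public long inSmallestUnit() { return pence; }
    public String[] unitLabels() {
        return new String[] { "pound", "shilling", "pence" };
    }
}

final class GuldenAmount implements CurrencyContext {
    private final long heller;
    GuldenAmount(long heller) { this.heller = heller; }
    public long inSmallestUnit() { return heller; }
    public String[] unitLabels() {
        return new String[] { "gulden", "kreuzer", "heller" };
    }
}
```

This is only an approximation: the interface still has to be agreed upon inside one language and one stack, which is exactly the limitation the proposal above aims to overcome.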
Russel L. Ackoff: “From Data To Wisdom”, Journal of Applied Systems Analysis 15
(1989) 3-9.
J.M. Adamo: “L.P.L. A fuzzy Programming Language: 1 Syntactic Aspects,” and
“L.P.L. A fuzzy Programming Language: 2 Semantic Aspects,” Fuzzy Sets and
Systems 3 (1980) 151-179, 261-289.
Robert L. Ashenhurst: “Ontological Aspects of Information Modeling”, Minds and
Machines 6 (1996) 287-394.
Krassimir T. Atanassov: “Intuitionistic Fuzzy Sets”, Fuzzy Sets and Systems 20
(1986) 87-96.
Michael Barr and Charles Wells: Category Theory for Computing Science, Montréal,
Sasa Baskarada and Andy Koronios: “Data, Information, Knowledge, Wisdom
(DIKW): a Semiotic Theoretical and Empirical Exploration of the Hierarchy and its
Quality Dimension”, Australasian Journal of Information Systems 18 (2013) 5-24.
Betty Blair: “Interview with Lotfi Zadeh”, Azerbaijan International 2 (Winter 1994)
46-47, 50.
Keith Devlin: Logic and Information, Cambridge, 1991.
Keith Devlin: “Modeling Real Reasoning”, in: Giovanni Sommaruga (ed.): Formal
Theories of Information, (= Lecture Notes in Computer Science 5363), Berlin-
Heidelberg, 2009, 234-252.
Johann Gustav Droysen: Historik. Vorlesungen über Enzyklopädie und Methodologie
der Geschichte, ed. by Rudolf Hübner, München, 1937.
Yucong Duan et al.: "Specifying Architecture of Knowledge Graph with Data Graph, Information Graph, Knowledge Graph and Wisdom Graph", presented at SERA 2017, accessed on April 23rd 2018 at:
Gilles Fauconnier and Mark Turner: The Way We Think. Conceptual Blending and
the Mind’s Hidden Complexities, New York, 2003.
Bernard Favre-Bull: Information und Zusammenhang. Informationsfluß in Prozessen
der Wahrnehmung, des Denkens und der Kommunikation, Springer, 2001
Luciano Floridi: The Philosophy of Information, Oxford, 2011.
Martin Frické: “The Knowledge Pyramid: A Critique of the DIKW Hierarchy”,
Journal of Information Science 35 (2009) 131-142.
Roy Harris: Introduction to Integrational Linguistics, Oxford, 1998.
Francisco Herrera et al. (eds.) “Special Issue on Hesitant Fuzzy Sets”, International
Journal of Intelligent Systems 29 (2014) 493-595.
Daniela Isac and Charles Reiss, I-Language, Oxford, 2008.
Yuncheng Jiang et al.: “Reasoning with Expressive Fuzzy Rough Description
Logics” Fuzzy Sets and Systems 160 (2009) 3403-3424.
Gu Jifa and Zhang Lingling: “Data, DIKW, Big Data and Data Science”, Procedia
Computer Science 31 (2014) 814-821.
William J. Kettinger and Yuan Li: “The infological equation extended: towards
conceptual clarity in the relationship between data, information and knowledge”, in:
European Journal of Information Systems 19 (2010), p. 409-421.
George Lakoff and Mark Johnson: Metaphors We Live By, Chicago 1980, with a
substantial afterword reprinted 2003.
Sifeng Liu and Yi Lin: Grey Information. Theory and Practical Applications,
London, 2006.
Sifeng Liu and Yi Lin: Grey Systems. Theory and Practical Applications, London,
Börje Langefors: Theoretical Analysis of Information Systems, Studentlitteratur,
W.C. Mahaney et al.: “Biostratigraphic Evidence Relating to the Age-Old Question
of Hannibal’s Invasion of Italy”, Archaeometry, 59 (2017), pp. 164-178 and 179-180.
S. Nanda and S. Majumdar: “Fuzzy Rough Sets”, Fuzzy Sets and Systems 45 (1992)
Zdzisław Pawlak: “Rough Sets”, International Journal of Parallel Programming, 11
(1982), 341-356.
Zdzisław Pawlak: “Rough Sets and Fuzzy Sets”, Fuzzy Sets and Systems 17 (1985)
Jennifer Rowley: "The Wisdom Hierarchy: Representations of the DIKW Hierarchy",
Journal of Information Science 33 (2007) 163-180.
David J. Saab and Uwe V. Riss: “Information as Ontologization”, Journal of the
American Society for Information Science and Technology 62 (2011) 2236-2246.
Desmond Schmidt and Robert Colomb: “A Data Structure for Representing Multi-
Version Texts Online”, International Journal of Human-Computer Studies 67 (2009)
Peter Seibel: Coders at Work, Apress, 2009, 213.
Glenn Shafer: A Mathematical Theory of Evidence, Princeton, 1976.
Claude E. Shannon: "A Mathematical Theory of Communication", Bell System
Technical Journal 27 (1948) 379-423, 623-656.
Giovanni Sommaruga: “One or Many Concepts of Information?”, in: Giovanni
Sommaruga (ed.): Formal Theories of Information, (= Lecture Notes in Computer
Science 5363), Berlin-Heidelberg, 2009, 253-267.
Settimo Termini: “On some ‘Family Resemblances’ of Fuzzy Set Theory and Human
Sciences”, in: Rudolf Seising and Veronica Sanz (eds.): Soft Computing in
Humanities and Social Sciences (= Studies in Fuzziness and Soft Computing 273),
Berlin-Heidelberg, 2012, 39-54.
Manfred Thaller: “Historical Information Science: Is there such a Thing? New
Comments on an Old Idea.”, In: Seminario discipline umanistiche e informatica. Il
problema dell' integrazione, ed. Tito Orlandi, 51-86 (= Contributi del Centro Linceo
interdisciplinare 'Beniamino Segre' 87), Rome, 1993. Reprinted under the same title
in: Historical Social Research Supplement 29 (2017), 260-286.
Manfred Thaller: “The Cologne Information Model: Representing Information
Persistently”, In: The eXtensible Characterisation Languages XCL, ed. Manfred
Thaller, Hamburg, 2009, 223-39. Reprinted under the same title in: Historical Social
Research Supplement 29 (2017), 344-356.
Vincenç Torra: “Hesitant Fuzzy Sets”, International Journal of Intelligent Systems 25
(2010) 529-539.
Warren Weaver: “Introductory Note on the General Setting of the Analytical
Communication Studies”, in: Claude E. Shannon and Warren Weaver: The
Mathematical Theory of Communication, 1949.
Matthias Weber (ed): Die Reichspolizeiordnungen von 1530, 1548 und 1577, (= Ius
Commune, Sonderheft 146), Frankfurt am Main, 2001.
Lotfi A. Zadeh: "Fuzzy Sets", Information and Control 8 (1965) 338-353.
Lotfi A. Zadeh: "The Concept of a Linguistic Variable and its Application to Approximate Reasoning", Parts I-III, Information Sciences 8 (1975) 199-249, 301-357, 9 (1975) 43-80.
Lotfi A. Zadeh: "Fuzzy Sets as a Basis for a Theory of Possibility", Fuzzy Sets and Systems 1 (1978) 3-28.
Lotfi A. Zadeh and Janusz Kacprzyk (eds.): Computing with Words in Information/Intelligent Systems I and II (= Studies in Fuzziness and Soft Computing 33 and 34), 1999.
Lotfi A. Zadeh: "Toward a Generalized Theory of Uncertainty (GTU) - an Outline", Information Sciences 172 (2005) 1-40.