MODELING CONFLICTING,
UNRELIABLE, AND VARYING
INFORMATION
LARS RÖNNBÄCK
FOURTH REVISION — 16 DECEMBER 2018
Most persistent memories in which bodies of information are stored can only provide a view of that information as it currently is, from a single point of view, and with no regard to its reliability. This is a poor reflection of reality, because information changes over time, may have many and possibly disagreeing origins, and is far from always certain. This paper therefore introduces a modeling technique that manages conflicting, unreliable, and varying information. In order to do so, the concept of a “single version of the truth” must be abandoned and replaced by an equivocal theory that respects the genuine nature of information. With such a theory, information can be seen from different and concurrent perspectives, where each statement has been given a reliability ranging from being certain of its truth to being certain of its opposite, and when that reliability or the information itself varies over time, changes are managed non-destructively, making it possible to retrieve everything as it was at any given point in time. As a result, other techniques, among them third normal form, anchor modeling, and data vault, are contained as special cases of the technique henceforth called transitional modeling¹.
¹ With gratitude to the consultant company Up to Change (www.uptochange.com) for sponsoring research on Anchor and Transitional Modeling.
KEYWORDS
transitional, modeling, information, temporality, concurrency,
reliability, variability, language, vagueness, fact
lars rönnbäck doi:10.13140/rg.2.2.34381.49121/1
All current information modeling techniques result in lossy implementations, in the sense that they cannot preserve combinations of who said what, when, and how sure they were of what they were saying. Modeling such requirements explicitly using traditional techniques is complex and error-prone, and therefore rarely done in practice, resulting in lost information that is impossible to recover (Golfarelli and Rizzi)². In fact, every statement has an origin and is made with some degree of confidence (B. Liu)³, but most modeling techniques make the simplifying assumption that statements are universal, univocal, and unvarying. Removing such limitations should be of interest to, for example, institutions subject to financial regulations that stipulate complete auditability, health care applications with uncertain and complementary clinical results, and military, police, or judicial applications in which conflicting information may be strengthened or discarded over time (Fisher and Kingma)⁴. Modeled information often ends up in databases, and while database vendors have started to add rudimentary temporal capabilities (Kulkarni and Michels)⁵, these are still insufficient (Johnston)⁶ and rely on the relational model.
² Matteo Golfarelli and Stefano Rizzi. “A survey on temporal data warehousing”. In: International Journal of Data Warehousing and Mining (IJDWM) 5.1 (2009), pp. 1–17.
³ B. Liu. Uncertainty Theory: An Introduction to its Axiomatic Foundations. Springer-Verlag, Berlin, 2004.
⁴ Craig W. Fisher and Bruce R. Kingma. “Criticality of data quality as exemplified in two disasters”. In: Information & Management 39.2 (2001), pp. 109–116.
⁵ Krishna Kulkarni and Jan-Eike Michels. “Temporal features in SQL:2011”. In: ACM SIGMOD Record 41.3 (2012), pp. 34–43.
⁶ Tom Johnston. Bitemporal data: theory and practice. Newnes, 2014.
Rather than attempting to extend the relational model, this paper introduces a new, generally applicable modeling technique that is lossless with respect to concurrency, reliability, and temporality. It is simple in nature, yet able to model complex scenarios. It can, for example, capture the following illustrative story in a structured way:
The accused, Archie, was seen fleeing the scene of the
crime by two witnesses, Donna and Charlie. Donna is
certain that the accused was clean-shaven, while Charlie
thinks he had a red beard. When the victim, Bella, later
regained consciousness, she corroborated Charlie’s story,
at which time Donna retracted her statement. In fact,
Archie had been wearing a fake red beard during the
attack, but it fell off when he fled, eventually leading to
his conviction through a DNA match.
In the terminology introduced, concurrency refers to having multiple, possibly conflicting, views of the same state of affairs, reliability refers to being able to express a degree of uncertainty about said state of affairs, and temporality refers to when it took place, when it changes, when an opinion is held towards it, and when such opinions were recorded. There are approaches that address these aspects of information in isolation or in combination (Benjelloun et al.; Dylla, Miliaraki, and Theobald)⁷, but to the best of the author's knowledge no existing approach combines all of them. The paper begins with a formalization of a technique that does combine all of them, followed by a conceptual model of the formalized constructs, after which the work is contrasted with related research and some special cases are presented. It ends with conclusions and further research.
⁷ Omar Benjelloun et al. “Databases with uncertainty and lineage”. In: The VLDB Journal 17.2 (2008), pp. 243–264; Maximilian Dylla, Iris Miliaraki, and Martin Theobald. “A temporal-probabilistic database model for information extraction”. In: Proceedings of the VLDB Endowment 6.14 (2013), pp. 1810–1821.
Formalization
This formalization introduces a few simple constructs that enable the capturing of information that evolves over time, may have conflicting sources, and can be less than certain. At its core, it deals with statements pertaining to the properties or relationships of things⁸, where a thing for our purposes simply is that which can have properties and participate in relationships. The exact nature of what it entails to be a thing (Cumming and Collier)⁹ will be left to the reader to judge, but broadly speaking it is something that needs to be sufficiently different from something else to be told apart. Some examples of things are perpetrators, beards, places, samples, notes, and investigations, but also that which may not come immediately to mind, like thoughts, events, molecules, transactions, and moments in time.
⁸ Both properties and relationships will be represented using the same construct, or in other words, a property is treated just as a special type of relationship.
⁹ Graeme S. Cumming and John Collier. “Change and identity in complex systems”. In: Ecology and Society 10.1 (2005).
Posits
First some purely syntactical constructions are needed. These will later, through the final construct defined, be given meaning. Their intention will, however, be exemplified along with the introduction of each construct. The distinction here between syntax and semantics is theoretically important but may be neglected in practice. Using syntax at random to create statements will produce a lot of nonsense with respect to the semantics of the universe of discourse, or in other words, be irrelevant to that which is being modeled¹⁰.
¹⁰ While “Arthur has 25 pounds of bad feelings against light bulbs” may be syntactically correct, it may be complete nonsense from a semantical point of view.
Def. of an appearance and a role
An appearance is a pair (i, r) where the first element is a unique identifier and the second a string called role.
The intention of an appearance is to capture the
identity of a thing along with a role that thing is playing
in a statement. If for the sake of simplicity we assume
unique identifiers to be capital letters, some examples of
appearances are:
(A,has gender)
(A,has beard)
(A,is accused)
(B,is victim)
(C,is witness)
These appearances convey that A is some thing that may have a specific gender, may have a beard of some kind, and may have been an accused at some point. Similarly, B and C may respectively have been a victim and a witness in some, but not necessarily the same, state of affairs¹¹. In order to bind appearances to a particular state of affairs, a dereferencing set is needed:
Def. of a dereferencing set
A dereferencing set, {(i₁, r₁), …, (iₙ, rₙ)}, is a set of appearances in which any role rᵢ may appear only once.
¹¹ Note that once a unique identifier is associated with a thing, that thing will retain that identifier indefinitely and only that identifier. In other words, A, B, and C are, were, or will be different things.
The intention of a dereferencing set is to bind a single
appearance or several appearances that have appeared
or will appear in a state of affairs. Given the example
above, some thinkable dereferencing sets are:
{(A,has gender)}
{(A,has beard)}
{(A,is accused),(B,is victim),(C,is witness)}
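As an illustration only, the two constructs defined so far can be sketched in Python. The names `Appearance` and `dereferencing_set` are assumptions made for this sketch and do not come from the formalization; the only rule enforced is the one stated in the definition, that a role may appear at most once.

```python
from typing import NamedTuple


class Appearance(NamedTuple):
    identifier: str  # unique identifier of a thing, e.g. "A"
    role: str        # role the thing plays in a statement, e.g. "has beard"


def dereferencing_set(*appearances: Appearance) -> frozenset:
    """Build a dereferencing set, rejecting any role that occurs twice."""
    roles = [a.role for a in appearances]
    if len(roles) != len(set(roles)):
        raise ValueError("a role may only appear once in a dereferencing set")
    return frozenset(appearances)


# A property (one member) and a relationship (three members).
beard = dereferencing_set(Appearance("A", "has beard"))
scene = dereferencing_set(
    Appearance("A", "is accused"),
    Appearance("B", "is victim"),
    Appearance("C", "is witness"),
)
```

A frozen set is used so that a dereferencing set is immutable and hashable, which makes it usable as a component of other values.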
The first two trivially bind single appearances to some state of affairs, and the third states that A, B, and C respectively appeared as the accused, the victim, and the witness in a particular state of affairs. The presented theory treats dereferencing sets having any number of members equally, simplifying the formalization. However, dereferencing sets with a single member may be thought of as properties and those with more than one member as relationships. Every dereferencing set is assumed to take on a value, where the value is associated with a moment in time, before which it was not valid and after which it is¹². In order to express such a value and its time dependence, the posit is introduced:
Def. of a posit
A posit is a triple, [D, v, t], where the first element is a dereferencing set, the second a data value of a simple or complex type, and the third a time point. The domain from which time points are taken is called appearance time.
¹² If a value can be said to have always existed, this can be represented by a special point in time representing the beginning of time.
While posits are still only syntactical constructs, they will be assigned some truth value by later constructions. In fact, the word “posit” is suitably defined as “a statement that is made on the assumption that it will prove to be true” according to the Oxford English Dictionary (Oxford University Press)¹³.
¹³ Oxford University Press. The Oxford English Dictionary. 2017. URL: https://en.oxforddictionaries.com/definition/posit (visited on 07/11/2018).
The intention of a posit is to state which value an appearance has taken since the specified point in time. Continuing the example, the following are some possible posits:
[{(A,has gender)}, male, 1972]
[{(A,has beard)}, fluffy red, 10:00]
[{(A,has beard)}, shaved clean, 10:02]
[{(A,is accused),(B,is victim),(C,is witness)},
at scene of crime, 09:58]
The intention of appearance time is to allow the representation of change, which occurs when two posits share the same dereferencing set, but with different values and time points¹⁴. In the example posits above, Archie’s beard (suspiciously quickly) changes from being fluffy red at 10:00 to shaved clean at 10:02. Archie, Bella, and Charlie are also related to each other and have different roles at the scene of the crime, where they all appeared at 09:58. The first posit is from the public records, showing that Archie is male and has remained so since his birth in 1972.
¹⁴ This type of change represents those that occur naturally in the universe of discourse, such as it being possible for Archie to both have and not have a beard during different periods of time, presuming that Archie periodically shaves it off and lets it grow back out.
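A minimal sketch of how change can be read off a collection of posits, assuming appearance times are written as zero-padded "HH:MM" strings so that they sort chronologically. The class `Posit` and the function `transitions` are hypothetical names introduced for this example.

```python
from typing import FrozenSet, NamedTuple, Tuple


class Posit(NamedTuple):
    dereferencing_set: FrozenSet[Tuple[str, str]]  # {(identifier, role), ...}
    value: str
    time: str  # appearance time, zero-padded so that strings sort correctly


BEARD = frozenset({("A", "has beard")})
posits = [
    Posit(BEARD, "shaved clean", "10:02"),
    Posit(BEARD, "fluffy red", "10:00"),
    Posit(frozenset({("A", "has gender")}), "male", "1972"),
]


def transitions(posits, dereferencing_set):
    """Return (time, value) pairs for one dereferencing set, oldest first."""
    history = [(p.time, p.value) for p in posits
               if p.dereferencing_set == dereferencing_set]
    return sorted(history)


print(transitions(posits, BEARD))
# [('10:00', 'fluffy red'), ('10:02', 'shaved clean')]
```

Posits with other dereferencing sets, such as the gender posit, are simply ignored when the history of the beard is requested.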
To further exemplify change, the following posit captures the fact that the group had dispersed shortly after the crime.
[{(A,is accused),(B,is victim),(C,is witness)},
dispersed, 10:02]
Posits capture transitions, in which whatever is referenced can be said to enter a different state along with when this happened. Some states are likely to remain forever, such as the birth dates of the parties involved.
[{(A,has birth date)},1972-08-20,1972-08-20]
[{(B,has birth date)},1980-02-13,1980-02-13]
It is easy to be misled into thinking that if a date of birth had been incorrectly stated, it could be changed by entering a new value with a later appearance time.
[{(A,has birth date)},1972-09-21,1972-09-21]
This, however, would lead to ambiguity, since there is no way to tell which of the two coinciding posits is the correct one¹⁵. If interpreted as a transition between states, as with Archie losing his beard, that would, inconceivably, mean that there was a time when Archie actually was born in August, succeeded by a time when he was actually born in September. Corrections, then, must by necessity be handled differently, which is why more constructs are needed.
¹⁵ Think of posits as a deck of cards, which could be shuffled and given to you, from which you need to pick out the true ones, using only the information on the cards.
Assertions
Rather than assuming posits to be meaningful and truthful, they are viewed as syntactical pieces of information towards which it is possible to hold different opinions, be uncertain, and revise one's view over time. To serve these purposes and to bring meaning to posits, a new construct is needed.
Def. of an assertion
An assertion is a predicate, !(P, p, α, T), taking four arguments, where the first argument is a unique identifier, the second a posit, the third a real number in the range [−1, 1], and the fourth a time point. The domain from which time points are taken is called assertion time.
Being a predicate, an assertion evaluates to true or false, depending on its arguments¹⁶.
¹⁶ An assertion is true when it models the universe of discourse, or in other words represents information about it, whereas false assertions represent disinformation.
The unique identifier P represents a positor, the one holding an opinion about a posit. The real number α represents the reliability with which a positor holds its opinion and the time point T represents when this opinion was asserted. A different assertion with the same positor and posit represents a change of opinion. Using assertions, concurrency can be expressed, as in the peculiarities of Archie’s beard:
p1= [{(A,has beard)}, fluffy red, 10:00]
p2= [{(A,has beard)}, shaved clean, 10:02]
!(C,p1, 0.75, Friday)
!(D,p2, 1, Friday)
The two posits p1 and p2 are in conflict, since they are stating two different values for the same dereferencing set, with two assertions putting them both in effect after 10:02. In plain text, the assertions say that “Donna stated on Friday that she is certain that Archie’s beard was shaved clean at and since 10:02” and “Charlie stated on Friday that it is probable that Archie had a fluffy red beard at and since 10:00”. If both of these represent actual facts in the universe of discourse, the assertions are true, which is the case in the running example.
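The assertion construct can be sketched as follows. This is an assumed representation, not part of the formalization: posits are plain tuples, assertion time is an ISO weekday number (5 for Friday), and `posits_conflict` is a hypothetical helper that applies the conflict criterion used above, namely different values for the same dereferencing set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    positor: str        # unique identifier of the one holding the opinion
    posit: tuple        # (dereferencing set, value, appearance time)
    reliability: float  # -1 = certain of the opposite, 1 = certain
    time: int           # assertion time, here an ISO weekday (5 = Friday)

    def __post_init__(self):
        # The definition restricts reliabilities to the range [-1, 1].
        if not -1.0 <= self.reliability <= 1.0:
            raise ValueError("reliability must lie in [-1, 1]")


def posits_conflict(p, q) -> bool:
    """True when two posits give the same dereferencing set different values."""
    return p[0] == q[0] and p[1] != q[1]


BEARD = frozenset({("A", "has beard")})
p1 = (BEARD, "fluffy red", "10:00")
p2 = (BEARD, "shaved clean", "10:02")

charlie = Assertion("C", p1, 0.75, 5)
donna = Assertion("D", p2, 1.0, 5)
```

Both assertions can be stored side by side; nothing has to be resolved at the time they are recorded.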
Looking at the reliability, it is natural that α = 1 corresponds to the absolute certainty that Donna is expressing. The interpretation of α = 0.75 is more elusive, however. Where machines running probabilistic models may yield exact reliabilities connected to their results, humans tend to express themselves in a less exact manner. A possible translation between natural language terms and their corresponding reliabilities can be found in Table 1. While the meaning of the values {−1, 0, 1} must remain fixed, the values in between may be assigned according to the requirements at hand.
    α      Linguistic correspondence
    1      It is certain that the value is v
    0.75   It is probable that the value is v
    0.5    It may be that the value is v
    0.25   It is possible that the value is v
    0      It could be any value whatsoever
   −0.25   It is possible the value is not v
   −0.5    It may be that the value is not v
   −0.75   It is probable that the value is not v
   −1      It is certain that the value is not v
Table 1: Suggestions of some reliabilities and their linguistic correspondences for a value v.
Not surprisingly, when asked, Charlie was certain that Archie was not clean-shaven when he saw him at 10:00.
p̄3 = [{(A,has beard)}, not shaved clean, 10:00]
!(C, p̄3, 1, Friday)
Since complementary values, such as “not shaved clean”, are harder to store¹⁷ than actual values, the “not” in the posit can be removed in favor of a negative reliability:
p3 = [{(A,has beard)}, shaved clean, 10:00]
!(C, p3, −1, Friday)
¹⁷ Information storages, such as databases, tend to allow for storing actual values rather than their mathematical complements.
This assertion is essentially making the same statement, where α = −1 corresponds to being certain of the opposite of a posit. Given what is known so far, it can be concluded that the two positors, Donna and Charlie, are in strong disagreement. Where existing modeling techniques require such conflicts to be resolved at the time information is recorded, these conflicting statements can now be captured and instead reviewed at the time the information is retrieved¹⁸. In other words, there are many versions of the truth, better reflecting the genuine nature of information.
¹⁸ In order for a modeling technique to be lossless, it must adopt the “resolve on read” paradigm rather than “resolve on write”.
Characteristics
When information is captured using assertions there are some characteristics that may be desirable¹⁹. For example, if Charlie is making assertions about both the posit p3 and its opposite p̄3, it is not desirable that these contradict each other. First a collection of assertions is needed:
Def. of a body of information
A body of information is a set of true assertions.
¹⁹ There are a number of ways that assertions may misbehave with respect to each other that need to be avoided. The restrictions introduced are not severe, as they represent how we normally think about information anyway.
Given a body of information, a set of true assertions that represent actual facts in some universe of discourse²⁰, some characteristics that pertain to their interrelationships can be defined.
Def. of symmetric
A body of information is said to be symmetric iff the assertions !(P, p, α, T) and !(P, p̄, −α, T) are equivalent for a posit p and its opposite p̄.
²⁰ Likewise, a body of disinformation would be a set of false assertions, which is beyond the scope of this paper and perhaps less desirable, but not ruled out as meaningless.
The first characteristic is symmetry. In a symmetrical body of information the assertion made by Charlie that “the reliability of Archie having a fluffy red beard is 0.75” is equivalent to asserting that “the reliability of Archie not having a fluffy red beard is −0.75”. This quality makes it possible to remove the negation of a value, such as “not fluffy red”, in a posit by reversing the sign of the reliability. As earlier stated, negated values may be troublesome from a storage point of view, so this characteristic makes it possible to avoid them completely in favor of actual values with negative reliabilities.
Def. of canonical
A body of information is said to be canonical iff all assertions are made against posits without negated values.
It follows that any symmetrical body of information can be transformed to become canonical and thereby easier to manage from a storage perspective.
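The transformation from a symmetrical to a canonical body of information can be sketched as below. The flat `Assertion` record with an explicit `negated` flag is an assumption made for this example; the formalization itself does not prescribe how a negated value is represented.

```python
from typing import NamedTuple


class Assertion(NamedTuple):
    positor: str
    dereferencing_set: frozenset
    value: str
    negated: bool        # True when the posit states "not value"
    appearance_time: str
    reliability: float
    assertion_time: str


def canonicalize(a: Assertion) -> Assertion:
    """Use symmetry to drop a negated value by flipping the reliability sign."""
    if not a.negated:
        return a
    return a._replace(negated=False, reliability=-a.reliability)


BEARD = frozenset({("A", "has beard")})
# Charlie is certain that the beard was "not shaved clean" at 10:00.
not_shaved = Assertion("C", BEARD, "shaved clean", True, "10:00", 1.0, "Friday")
canonical = canonicalize(not_shaved)
```

Applying `canonicalize` to every assertion yields a body of information in which only actual values are stored.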
As another characteristic, it is reasonable that any assertion that is not certain, for example that “the reliability of Archie having a fluffy red beard is 0.75”, also implies that there is a possibility that his beard is not red and fluffy. Intuitively²¹, the complementary assertion must be that “the reliability of Archie not having a fluffy red beard is 0.25”, since reliabilities should sum up to 1.
²¹ It would be counterintuitive to be able to state that “Archie very likely had a red beard and very likely did not have a red beard” and still be consistent in your reasoning.
Def. of bounded
A body of information is said to be bounded iff the reliabilities in the assertions !(P, p, α, T) and !(P, p̄, β, T) satisfy
$$\left|\alpha+\beta\right| = \left|\alpha+\beta\right|^{2}.$$
Figure 1: The boundary and symmetry of reliability for a value v and its negation not v.
Since negative reliabilities are allowed, the formula becomes a little more complicated than just summing up reliabilities. For example, in a symmetrical and bounded body of information, if α = −0.75 for p̄ then β = −0.25 for p using boundary, and using symmetry that is equivalent to 0.25 for p̄, which using boundary again yields the reliability of p as 0.75. The symmetry and boundary of values and their negation can be illustrated as a graph, seen in Figure 1.
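A small numeric check of the boundary condition may help. It assumes that the condition reads |α + β| = |α + β|², which holds exactly when the two reliabilities sum to −1, 0, or 1; the function name `is_bounded` and the tolerance are choices made for this sketch.

```python
import math


def is_bounded(alpha: float, beta: float, tol: float = 1e-9) -> bool:
    """Check |alpha + beta| == |alpha + beta|**2 up to a small tolerance.

    alpha is the reliability for a posit p and beta the reliability for its
    opposite. The equality holds only when the sum is -1, 0, or 1.
    """
    s = abs(alpha + beta)
    return math.isclose(s, s * s, abs_tol=tol)


print(is_bounded(0.75, 0.25))    # True: complementary, sums to 1
print(is_bounded(-0.75, -0.25))  # True: complementary, sums to -1
print(is_bounded(0.75, -0.75))   # True: the symmetric pair, sums to 0
print(is_bounded(0.75, 0.75))    # False: "very likely" both ways
```

The last call corresponds to stating that something is very likely the case and also very likely not the case, which the condition rules out.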
There is still no characteristic preventing a positor from having a multitude of reliabilities for the same posit at the same time. Such a quality will also ensure that even though the reliabilities −0.25 and 0.75 for a posit produce equivalent assertions, only one should exist in the body of information²².
Def. of exclusive
A body of information is said to be exclusive iff no assertions in which only the reliability differs exist.
²² This avoids lunatic statements such as “I think Archie had a red beard and I am also sure he had a red beard, but he might not have had a red beard”.
An exclusive body of information prevents a positor from being, for example, certain of a fact and its opposite at the same time. Different positors may be in disagreement though, provided that it can be ensured that positors hold opinions about the same things. If one positor is saying that “Archie has a fluffy red beard” and another that “Archie was clean-shaven”, can we be sure they are talking about the same Archie, and how do we know that “has beard” means the same thing to them?
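Exclusivity lends itself to a mechanical check. The sketch below assumes assertions are stored as plain tuples (positor, posit, reliability, assertion time) with hashable posits; `exclusivity_violations` is a hypothetical helper, and the sample data is invented for illustration.

```python
from collections import defaultdict


def exclusivity_violations(assertions):
    """Find groups of assertions in which only the reliability differs.

    Each assertion is a tuple (positor, posit, reliability, assertion_time).
    Returns a dict from (positor, posit, assertion_time) to the set of
    distinct reliabilities, for groups holding more than one reliability.
    """
    seen = defaultdict(set)
    for positor, posit, reliability, time in assertions:
        seen[(positor, posit, time)].add(reliability)
    return {key: rels for key, rels in seen.items() if len(rels) > 1}


p2 = ("A has beard", "shaved clean", "10:02")
body = [
    ("D", p2, 1.0, "Friday"),
    ("D", p2, 0.5, "Friday"),   # same positor, posit, and time: a violation
    ("C", p2, -1.0, "Friday"),  # a different positor may disagree freely
]
violations = exclusivity_violations(body)
```

An empty result means the body of information is exclusive.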
Def. of universal
A body of information is said to be universal iff positors agree on all appearances.
In a universal body of information positors are not allowed to interpret, for example, the appearance (A,has beard) differently²³. Whatever the components represent will be the same for every positor. As assertions are predicates, it is only through their representation that it can be determined if they are true or false. In order to have “meaningful” assertions, the bodies of information discussed henceforth will therefore be assumed to have all the previously defined characteristics.
²³ In order to disagree, there must still be a base upon which you agree, since how would you otherwise know what you are disagreeing about?
Def. of comprehensive
A body of information is said to be comprehensive iff it is symmetric, canonical, bounded, exclusive, and universal.
From here on it will be assumed that all bodies of information discussed are comprehensive. In a sense, a comprehensive body of information can be said to be a body of well-behaved information.
Corrections
It would be presumptuous to assume that any statement made is always correct. We all make mistakes, and when opinions change in such a way that a previous statement is erroneous, corrections have to be made, and made losslessly. To be able to make corrections, a particular, but intuitively understandable, reliability value is needed.
Def. of complete uncertainty
An assertion is called positive when the reliability is above zero, negative when below zero, and completely uncertain when zero.
Zero reliability is best understood through an example. Given the following posits:
p2 = [{(A,has beard)}, shaved clean, 10:02]
p̄2 = [{(A,has beard)}, not shaved clean, 10:02]
For the posit p2 and its opposite p̄2 the assertions !(P, p2, 0, T) and !(P, p̄2, 0, T) are equivalent given symmetry. Since the reliability for the beard being both shaved clean and not shaved clean is the same, the only meaningful interpretation is that zero reliability must be when the positor has no idea what the value is²⁴.
²⁴ It could be either of the posits, to any degree of certainty, but this positor does not know. Not knowing also conveys information, a fact that is often overlooked.
The positor may have had an idea at some earlier time though.
!(D,p2, 1, Friday)
!(D,p2, 0, Sunday)
In the assertions above, Donna changes her mind from thinking on Friday that Archie was shaved clean to no longer being able to tell on Sunday whether Archie had a beard or not. The later assertion is called a retraction when it changes a positive or negative reliability for an earlier posit to zero.
Def. of a retraction
Let A be a body of information. A retraction is an assertion !(P, p, 0, T′) ∈ A for which there exists an assertion !(P, p, α, T) ∈ A, with α ≠ 0 and T < T′.
If at the same time another posit is asserted with non-zero reliability, this and the retraction together form a correction.
Def. of a correction
Let A be a body of information. The three assertions a1 = !(P, p, α, T), a2 = !(P, p, 0, T′), and a3 = !(P, p′, α′, T′) together form a correction when {a1, a2, a3} ⊂ A, p ≠ p′, α ≠ 0, α′ ≠ 0, and T < T′.
Extending the example with the assertion below,
Donna says that it is possible that the perpetrator
“Archie” had a fluffy red beard, while at the same time
retracting her earlier assertion that his beard was shaved
clean.
p4= [{(A,has beard)}, fluffy red, 10:02]
!(D,p4, 0.25, Sunday)
Without the retraction Donna would simultaneously be stating that Archie’s beard is certainly shaved clean and also that it is possibly fluffy red, so she would be contradicting herself. Both assertions would remain in effect without the retraction, but exactly why this is the case has yet to be formalized.
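The definitions of a retraction and a correction translate fairly directly into code. The following is a sketch under assumptions: assertion times are ISO weekday numbers (5 for Friday, 7 for Sunday), posits are plain tuples, and the names `Assertion`, `is_retraction`, and `corrections` are invented for the example.

```python
from typing import NamedTuple


class Assertion(NamedTuple):
    positor: str
    posit: tuple        # (dereferencing set, value, appearance time)
    reliability: float
    time: int           # assertion time as an ISO weekday number


def is_retraction(candidate, body) -> bool:
    """A zero-reliability assertion preceded by a non-zero one for the same posit."""
    return candidate.reliability == 0 and any(
        a.positor == candidate.positor and a.posit == candidate.posit
        and a.reliability != 0 and a.time < candidate.time
        for a in body)


def corrections(body):
    """Yield every triple (a1, a2, a3) that forms a correction in the body."""
    for a2 in body:
        if not is_retraction(a2, body):
            continue
        for a1 in body:
            if ((a1.positor, a1.posit) != (a2.positor, a2.posit)
                    or a1.reliability == 0 or a1.time >= a2.time):
                continue
            for a3 in body:
                if (a3.positor == a2.positor and a3.posit != a2.posit
                        and a3.reliability != 0 and a3.time == a2.time):
                    yield (a1, a2, a3)


BEARD = frozenset({("A", "has beard")})
p2 = (BEARD, "shaved clean", "10:02")
p4 = (BEARD, "fluffy red", "10:02")
a1 = Assertion("D", p2, 1.0, 5)   # Friday: certainly shaved clean
a4 = Assertion("D", p2, 0.0, 7)   # Sunday: retraction
a5 = Assertion("D", p4, 0.25, 7)  # Sunday: possibly fluffy red
body = [a1, a4, a5]
```

Here `list(corrections(body))` contains the single triple `(a1, a4, a5)`, matching Donna's change of mind in the running example.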
The Information in Effect
With both assertions and posits being bound in time, this raises the question of what information actually is in effect at any given point in time. The intention is that posits should capture transitions of states in the domain being modeled, such as people finding themselves at the scene of a crime and later dispersing, or Archie removing his fake beard, and that assertions should capture the transitions in the opinions expressed towards such posits. Assertion and appearance time give rise to two axes of time over which it should be possible to “travel”, where for every point in the bitemporal plane spanned by assertion time and appearance time an unambiguous set of assertions should be in effect.
Def. of the information in effect
Let A be a body of information. The information in effect is a subset A(T@, t@) ⊆ A given a bitemporal point (T@, t@) in assertion and appearance time. Assuming P, α, T, D, v, and t are free variables, all assertions !(P, p, α, T) ∈ A(T@, t@) with p = [D, v, t] are found using the following selection criteria:
1 Let A1 ⊆ A be the assertions in A satisfying T ≤ T@ and t ≤ t@.
2 Let A2 ⊆ A1 be the assertions in A1 with α ≠ 0 and the latest appearance time t for each combination of P and D, excluding those assertions !(P, p, α, T) for which an assertion !(P, p, 0, T′) exists with T ≤ T′ ≤ T@.
3 Let A(T@, t@) ⊆ A2 be the assertions from A2 with the latest assertion time T for each combination of P, D, and v.
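The three selection criteria can be sketched as a small function and tried on assertions from the running example. Several things here are assumptions of the sketch rather than part of the definition: assertion times are ISO weekday numbers, appearance times are zero-padded "HH:MM" strings, the comparisons are read as "on or before", and retracted assertions are excluded before the latest appearance time is picked in step 2.

```python
from typing import NamedTuple


class Posit(NamedTuple):
    dset: frozenset
    value: str
    time: str           # appearance time, zero-padded "HH:MM"


class Assertion(NamedTuple):
    positor: str
    posit: Posit
    reliability: float
    time: int           # assertion time, ISO weekday number


def information_in_effect(body, T_at, t_at):
    # Step 1: rewind to assertions made, and posits appearing, on or before the point.
    step1 = [a for a in body if a.time <= T_at and a.posit.time <= t_at]

    def retracted(a):
        return any(r.positor == a.positor and r.posit == a.posit
                   and r.reliability == 0 and a.time <= r.time <= T_at
                   for r in body)

    # Step 2: drop zero-reliability and retracted assertions, then keep the
    # latest appearance time per positor and dereferencing set.
    live = [a for a in step1 if a.reliability != 0 and not retracted(a)]
    step2 = [a for a in live
             if a.posit.time == max(b.posit.time for b in live
                                    if (b.positor, b.posit.dset)
                                    == (a.positor, a.posit.dset))]
    # Step 3: keep the latest assertion time per positor, set, and value.
    return {a for a in step2
            if a.time == max(b.time for b in step2
                             if (b.positor, b.posit.dset, b.posit.value)
                             == (a.positor, a.posit.dset, a.posit.value))}


FRI, SAT, SUN = 5, 6, 7
BEARD = frozenset({("A", "has beard")})
p1 = Posit(BEARD, "fluffy red", "10:00")
p2 = Posit(BEARD, "shaved clean", "10:02")
p3 = Posit(BEARD, "shaved clean", "10:00")
p4 = Posit(BEARD, "fluffy red", "10:02")
a1 = Assertion("D", p2, 1.0, FRI)
a2 = Assertion("C", p1, 0.75, FRI)
a3 = Assertion("C", p3, -1.0, FRI)
a4 = Assertion("D", p2, 0.0, SUN)   # Donna's retraction
a5 = Assertion("D", p4, 0.25, SUN)
body = [a1, a2, a3, a4, a5]
```

With this data, `information_in_effect(body, SAT, "10:03")` returns `{a1, a2, a3}`, since the Sunday assertions are not yet visible, while `information_in_effect(body, SUN, "10:03")` returns `{a2, a3, a5}`.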
In less formal terms, the information in effect can be found by first rewinding all assertions to those whose assertion and appearance times are on or before the specified points in time. Among those, find the ones with the latest appearance times for each positor and dereferencing set, excluding retracted ones. Finally, keep those with the latest assertion times for each positor, dereferencing set, and value. This will guarantee that a single assertion for each existing combination of a positor and a posit is in effect at any bitemporal point in time. In a sense, such a “slice” of assertions is static, since all temporal aspects have been removed by the operations described. Had it not been for the remaining reliability, it would have been close to what is usually stored in a traditional database. Gathering some assertions from the running example:
p1= [{(A,has beard)}, fluffy red, 10:00]
p2= [{(A,has beard)}, shaved clean, 10:02]
p3= [{(A,has beard)}, shaved clean, 10:00]
a1=!(D,p2, 1, Friday)
a2=!(C,p1, 0.75, Friday)
a3 = !(C, p3, −1, Friday)
Given the body of information above, {a1, a2, a3}, the information in effect on (Saturday, 10:01) is a set with two members, {a2, a3}, as they are the only assertions with appearance times on or before 10:01 and assertion times on or before Saturday. The information in effect on (Saturday, 10:03) is the set {a1, a2, a3}, as all assertions are made on or before both times and no retractions have been made. Something happened on Sunday, however, after Bella woke up:
p4= [{(A,has beard)}, fluffy red, 10:02]
a4=!(D,p2, 0, Sunday)
a5=!(D,p4, 0.25, Sunday)
With this added information extending the body of information, the set {a2, a3, a5} is in effect on (Sunday, 10:03). The assertion a1 now has a corresponding retraction a4 and must be excluded, which brings a5 into effect. The two assertions a2 and a3 made by Charlie are not contradictory, saying at the same time that Archie’s beard is probably fluffy red and certainly not shaved clean. In fact, “certainly not” may be stated for an arbitrary number of values without a positor contradicting itself²⁵.
²⁵ Note that there is a difference between stating that “Archie’s beard is certainly not red” and “I have no idea if Archie’s beard is red”. The former excludes a particular value and remains in the information in effect, whereas the latter leaves all doors open and is removed from the information in effect. Both may be stated for an arbitrary number of values without contradicting yourself though.
But what if the following assertion is made:
a6 = !(D, p2, −0.75, Sunday)
Suddenly, Donna is stating that Archie’s beard is both possibly fluffy red and probably not shaved clean. Could it be the case that Donna is now contradicting herself? When the same positor has several assertions in effect with the same dereferencing set, but different values and reliabilities, another characteristic is needed to ensure that positors are non-contradictory.
Def. of non-contradictory
A body of information is said to be non-contradictory iff for every information in effect the reliabilities in all positive and negative assertions made by the same positor for the same dereferencing set satisfy the inequality:
$$\frac{1}{2}\sum_{i=1}^{n}\left[1-\frac{\alpha_i}{|\alpha_i|}\right]+\sum_{i=1}^{n}\alpha_i \le 1$$
Looking at {a5, a6}, these two assertions by Donna
concurrently assert different values for the beard of
Archie and they are both in effect on (Sunday, 10:03).
Calculating the left side of the inequality gives:
\[
\frac{1}{2} \left[ \left( 1 - \frac{0.25}{|0.25|} \right) + \left( 1 - \frac{-0.75}{|-0.75|} \right) \right] + 0.25 - 0.75 = 0.5
\]
From this it can be concluded that Donna is far from
contradicting herself, since the reliabilities only “sum
up” to 0.5. As another example, a positor may be 95
percent sure that Archie’s beard is not shaved clean, and
the same for up to 20 different beard styles and colors,
without being contradictory, since each such assertion
contributes 1 - 0.95 = 0.05 to the left-hand side. The
21st beard style that Archie has almost certainly not got,
however, will bring the left-hand side to 1.05. Taking this
example all the way, a positor may simultaneously assert
with 100 percent certainty that something is not the case
for an unlimited number of posits.
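The inequality can be checked mechanically. The following is a minimal sketch, not part of the original technique description: the function names are hypothetical, reliabilities are assumed to be given as plain numbers for one positor and one dereferencing set, and exact fractions are used so that the boundary case of 20 assertions at -0.95 is not disturbed by rounding.

```python
from fractions import Fraction


def contradiction_measure(reliabilities):
    """Left-hand side of the non-contradiction inequality.

    `reliabilities` holds the non-zero reliabilities of all assertions in
    effect by one positor for one dereferencing set (an assumed input shape).
    """
    # Exact rational arithmetic avoids floating point noise near the bound.
    alphas = [Fraction(str(a)) for a in reliabilities]
    if any(a == 0 for a in alphas):
        raise ValueError("retractions (reliability 0) are never in effect")
    # (1/2) * sum(1 - a/|a|) adds 1 per negative and 0 per positive assertion.
    negative_count = sum(1 for a in alphas if a < 0)
    return negative_count + sum(alphas)


def is_non_contradictory(reliabilities):
    return contradiction_measure(reliabilities) <= 1


print(contradiction_measure([0.25, -0.75]))  # Donna's a5 and a6
print(is_non_contradictory([-0.95] * 20))    # 20 near-certain negations
print(is_non_contradictory([-0.95] * 21))    # the 21st tips the balance
```

Counting negative assertions replaces the sign fraction in the first sum, which is why any number of assertions with reliability -1 leaves the measure at zero.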
def. of indecisive and decisive
When multiple assertions by the same positor are in effect for
different posits with the same dereferencing set, the positor
is said to be indecisive with respect to the set. When a single
assertion of a positor is in effect for a dereferencing set, the
positor is said to be decisive with respect to the set. The in-
formation in effect is said to be indecisive if there is at least
one indecisive positor and decisive if not. A body of informa-
tion is said to be decisive if for every information in effect no
indecisive positors exist and indecisive if there is at least one
indecisive positor.
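As an illustration, decisiveness of one information in effect can be tested by grouping assertions per positor and dereferencing set. This is only a sketch: the tuple layout (positor, posit name, dereferencing set, reliability) and the function names are assumptions made for the example, and the information in effect is taken as already computed.

```python
from collections import defaultdict


def indecisive_pairs(assertions_in_effect):
    """Return every (positor, dereferencing set) with several posits in effect."""
    posits = defaultdict(set)
    for positor, posit_id, dereferencing_set, _reliability in assertions_in_effect:
        posits[(positor, dereferencing_set)].add(posit_id)
    return {key for key, ids in posits.items() if len(ids) > 1}


def is_decisive(assertions_in_effect):
    return not indecisive_pairs(assertions_in_effect)


beard = frozenset({("A", "has beard")})
# Donna's assertions a5 and a6, both in effect on (Sunday, 10:03).
donna = [("D", "p4", beard, 0.25), ("D", "p2", beard, -0.75)]

print(indecisive_pairs(donna))  # Donna is indecisive about Archie's beard
print(is_decisive(donna[:1]))   # a single assertion is decisive
```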
Since contradictions by definition rely on indecisiveness,
it follows that a body of information which is
decisive is always non-contradictory.26
26 A traditional relational database is decisive as no
possible alternative values can be represented, without
modeling such behavior explicitly.
Reassertions and Restatements
In a traditional database, updating a value to the same
value is an indistinguishable operation, since the value
will remain the same after the operation is completed.
For assertions, it is possible to assert the exact same
thing over and over again with increasing assertion
times. Such assertions can be chosen to be kept or dis-
carded, depending on the requirements at hand.
def. of a reassertion and assertive
If two in assertion time successive assertions are otherwise
equal, the later is said to be a reassertion of the earlier. A body
of information is said to be assertive if reassertions exist.
It may be noted that in an assertive body of infor-
mation, the information in effect will contain the latest
reassertion, and if desired further searching is necessary
in order to find the first assertion in a chain of reasser-
tions. To further exemplify, Charlie will make three
additional assertions, {a7,a8,a9}:
p1= [{(A,has beard)}, fluffy red, 10:00]
p4= [{(A,has beard)}, fluffy red, 10:02]
a2=!(C,p1, 0.75, Friday)
a7=!(C,p4, 0.75, Friday)
a8=!(C,p1, 0.75, Sunday)
a9=!(C,p4, 0.25, Sunday)
Here the assertion a8 is a reassertion of a2, since only
the assertion time differs. The assertions a7 and a2 share
a different relationship though, where the assertion time
is the same, but the appearance time differs while the
dereferencing set and value are the same in the posit.
def. of a restatement and restatable
For two assertions, !(P, p, α, T) and !(P, p′, α, T), with p =
[D, v, t] and p′ = [D, v, t′], such that t < t′, the latter is called
a restatement of the former. If a body of information contains
restatements it is said to be restatable.
Given the definition above, the assertion a7 is a
restatement of a2, but a9 is not a restatement of a8 as the
reliability differs. Looking at a9 and a7, Charlie is also
downgrading his belief in the posit, from it being
probable that the beard was fluffy red to only possible that
it was. Perhaps he observed Archie for the full two
minutes between 10:00 and 10:02, but could not at first
believe his eyes that the beard was gone. With some
afterthought, he was less sure on Sunday. The beard was
probably fluffy red at 10:00 and only possibly fluffy red
at 10:02, according to Charlie.
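The two definitions can be compared side by side in code. The sketch below uses Charlie's assertions a2, a7, a8, and a9; the class and function names are hypothetical, assertion times are encoded as day numbers purely for ordering, and the check for reassertions is pairwise only, so it does not verify that no other assertion lies between the two in assertion time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Posit:
    dereferencing_set: frozenset
    value: str
    time: str  # appearance time as "HH:MM", which sorts correctly as text


@dataclass(frozen=True)
class Assertion:
    positor: str
    posit: Posit
    reliability: float
    time: int  # assertion time as a day number (an assumed encoding)


def is_reassertion(earlier, later):
    """Equal in everything except a later assertion time."""
    return (earlier.positor == later.positor
            and earlier.posit == later.posit
            and earlier.reliability == later.reliability
            and earlier.time < later.time)


def is_restatement(former, latter):
    """Same positor, reliability, assertion time, D and v; later appearance time."""
    p, q = former.posit, latter.posit
    return (former.positor == latter.positor
            and former.reliability == latter.reliability
            and former.time == latter.time
            and p.dereferencing_set == q.dereferencing_set
            and p.value == q.value
            and p.time < q.time)


FRIDAY, SUNDAY = 5, 7
beard = frozenset({("A", "has beard")})
p1 = Posit(beard, "fluffy red", "10:00")
p4 = Posit(beard, "fluffy red", "10:02")
a2 = Assertion("C", p1, 0.75, FRIDAY)
a7 = Assertion("C", p4, 0.75, FRIDAY)
a8 = Assertion("C", p1, 0.75, SUNDAY)
a9 = Assertion("C", p4, 0.25, SUNDAY)

print(is_reassertion(a2, a8), is_restatement(a2, a7), is_restatement(a8, a9))
```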
Modeling and Typing
While the posits and assertions seen so far have been
intuitively understandable, being constructed from a
story and simple examples, the exact nature of Archie is
still undecided. Is it a perpetrator, an accused, a person,
a human being, a fictional character, or something com-
pletely different? In order to determine what unique
identifiers actually represent, a model is needed. Models
provide auxiliary information, with the aim to collate
similar things under common names. The more well un-
derstood such names are the more intelligible the model
becomes27.
27 However, even if positors
agree upon such names, they
may disagree on how to assign
them, or in other words, the
same information may have
many models.
def. of a classifier and a class
Let is class be a role reserved for the purpose of
modeling. A posit pc = [{(C, is class)}, c, t] defines the name
of a class through the string c and associates the unique
identifier C with it. A classifier is a relationship that binds
a thing to a class, expressed through posits on the form
pM = [{(i, thing), (C, class)}, v, t].
The defined strings are intended to represent
classifications, such that it can be determined to which
class any unique identifier belongs. Thanks to the time
point t in the classifier, the class of i may change,
naturally, over time. First given some posits that define class
names assumed to have existed since the beginning of
time, here denoted by , the following posits may all
be valid classifiers for Arthur:
pc1= [{(C1,is class)}, Infant, ]
pc2= [{(C2,is class)}, Teenager, ]
pM1= [{(A,thing),(C1,class)}, active, 1972]
pM2= [{(A,thing),(C1,class)}, inactive, 1973]
pM3= [{(A,thing),(C2,class)}, active, 1985]
From these it is apparent that two classifications
exist and have always existed28, and that Arthur was an
infant between 1972 and 1973 and became a teenager in
1985. However, it would be equally valid to at all those
times state that Arthur is a person, as illustrated by the
following posits.
pc3= [{(C3,is class)}, Person, ]
pM4= [{(A,thing),(C3,class)}, active, 1972]
28 Posits that define class names from the beginning
of time may in some sense be viewed as being static, or
in other words, have no way to naturally change into a
different value.
This hints at a relationship between infant and per-
son, since Arthur is both an infant and a person dur-
ing the year of 1972, which of course can be expressed
through its own posit.
pM4= [{(C1,subclass),(C3,class)}, active, ]
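To make the temporal reading of the classifiers for Arthur concrete, the sketch below evaluates which classes are active for a thing in a given year. It is an illustration under simplifying assumptions: positors and reliabilities are ignored, as if a single modeler had asserted every classifier with certainty, the latest classifier on or before the requested year wins for each class, and the names `CLASSIFIERS` and `classes_at` are hypothetical.

```python
CLASS_NAMES = {"C1": "Infant", "C2": "Teenager", "C3": "Person"}

# (thing, class, value, appearance year), taken from the classifier posits above.
CLASSIFIERS = [
    ("A", "C1", "active", 1972),
    ("A", "C1", "inactive", 1973),
    ("A", "C2", "active", 1985),
    ("A", "C3", "active", 1972),
]


def classes_at(thing, year, classifiers=CLASSIFIERS):
    """Names of the classes that are active for `thing` in `year`."""
    latest = {}
    for who, cls, value, appeared in sorted(classifiers, key=lambda c: c[3]):
        if who == thing and appeared <= year:
            latest[cls] = value  # later appearance times overwrite earlier ones
    return {CLASS_NAMES[cls] for cls, value in latest.items() if value == "active"}


print(classes_at("A", 1972))
print(classes_at("A", 1980))
print(classes_at("A", 1990))
```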
Whatever needs to be said about classes and how
they relate to each other and other things can be
expressed using posits. These can be indefinitely extended
by introducing new reserved strings used in their
appearances. Each class could have a description, for
example, through the (C, has description) appearance. Just
like posits in general, classifiers are meaningless unless
asserted by a positor, acting as a modeler in this case.
Modelers may, like positors in general, disagree on the
model, or express uncertainty towards parts of it.
aM1 = !(M, pM1, 0.75, Today)
aM2 = !(M, pM4, 1, Today)
aM3 = !(L, pM4, -1, Today)
The modeler M is quite sure that Arthur was an
infant since 1972, while simultaneously sure that Arthur
is a person, whereas L is certain Arthur is not a person.
This does not necessarily mean that Arthur is an alien,
unless L models it so, but that L may have chosen a
model in which Arthur perhaps is a client instead29.
29 From the point of view of a lawyer in the case, this is not
unreasonable.
def. of a model
Let A be a body of information and let I be the set of all
unique identifiers found in A. The model M of A, denoted
M |= A, is a comprehensive body of information in which
each positor exhausts I through assertions of classifiers.
A model, then, contains complete information about
which class every thing belongs to. It may contain much
more information, such as detailed descriptions of
classes or how classes relate to each other, through
for example inheritance. It contains information about
information and can be seen as meta-information with
respect to the body of information it models. If the two
are intermixed it is important to keep track of which
strings have been reserved for use in those appearances
that pertain to modeling.
As can be understood from the examples, there may
be many ways to model the same body of information.
The question then arises when a model is a “good”
model. Unfortunately there is no simple answer. In
some cases, few and generic classes are preferable,
where in others many and specific ones are better. A
model is that which creates boundaries between similar
and dissimilar things. To model is to define these
boundaries by determining when things are similar
enough to be of the same class. Assuming that Arthur
belongs to the person class and that many other things
also are persons, it is possible to derive some additional
information about persons from the posits in which
such things appear. Backtracking from pc3 it is possible
to find all posits in which things of the person type
appear, many of which will have similar structures.
def. of a posit type
A posit type, τ(p) = [{(C1, r1), . . . , (Cn, rn)}, τ(v), τ(t)], for a
posit p = [{(i1, r1), . . . , (in, rn)}, v, t], is a structure constructed
by replacing unique identifiers, ii, with the unique identifiers
of their class, Ci, the value, v, with its data type, τ(v), and the
time point, t, with its data type, τ(t).
Posit types are descriptive by nature, and will be
used to further enrich the model. For example, given
the person class, an example posit and its corresponding
type are:
p4= [{(A,has beard)}, fluffy red, 10:02]
τ(p4) = [{(C3,has beard)},string,time]
We know from pc3 that C3 is the person class. It
appears then that persons may have beards, described by a
string value capturing the color, since some specified time.
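Deriving a posit type is a mechanical substitution, which the following sketch illustrates for p4. The tuple representation of a posit, the `class_of` lookup from unique identifier to class identifier, and the `TYPE_NAMES` table mapping Python types to the data type names used in the text are all assumptions made for the example.

```python
from datetime import time

# Assumed mapping from Python types to the data type names used in the text.
TYPE_NAMES = {str: "string", time: "time"}


def posit_type(posit, class_of):
    """Replace identifiers by their class and the value and time by their types."""
    dereferencing_set, value, appearance_time = posit
    typed_set = frozenset((class_of[i], role) for i, role in dereferencing_set)
    return (typed_set, TYPE_NAMES[type(value)], TYPE_NAMES[type(appearance_time)])


p4 = (frozenset({("A", "has beard")}), "fluffy red", time(10, 2))
class_of = {"A": "C3"}  # A is classified as a Person (C3)

print(posit_type(p4, class_of))
```

Because the lookup goes through `class_of`, the same posit can yield different posit types under different models, in line with the observation that one body of information may have many models.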
def. of a typing and a type
Let is type be a role reserved for the purpose of creating types.
A posit pτ = [{(T, is type)}, τ(p), t] defines a type through
the posit type τ(p) and associates the unique identifier T with
it. A typing is a relationship that binds a class to a type,
expressed through posits on the form
pT = [{(C, class), (T, type)}, v, t].
Defining a type is done by binding a unique identifier
to a posit type, for example τ(p4), using another posit30.
30 Remember that posits
are defined to bind unique
identifiers to values that may
be of any data type, primitive
as well as complex ones. A
posit type can be considered
to be a complex data type in
this case.
pτ4= [{(T1,is type)},τ(p4),]
pT1= [{(C3,class),(T1,type)}, commonly, 1942]
Using the value in a typing, it is possible to express
more than just whether the relationship is currently active
or inactive. It can be used to indicate how often a person
has a beard in general, which in the example above
is commonly. Other conceivable values that can be useful
are rarely, always, and never31. These may change over
appearance time if the demographics of the persons in
the universe of discourse change. It is important to
note that even if one bearded person exists it does not
imply that all persons can or must have beards. There
will be no posits of type τ(p4) for Bella or Donna, ever.
31 When values form an enumeration, it is important
that they together are mutually exclusive and exhaustive,
which is why never is needed, since it is thinkable that it
will become impossible for persons to have beards from
some point in time.
Typings are created from the information at hand
and they evolve along with the information, and should
therefore not be seen as templates into which future
information should fit. Types and typings are asserted
by positors, just like classes and classifiers, even if types
are somewhat harder to disagree upon, but it is possible
for two different positors to have a difference in opinion
about the data types involved. Extending the example,
we know Archie and Bella were observed by Charlie at
the scene of the crime, but there seems to have been
something preceding the crime, involving Xavier.
p5= [{(A,is accused),(B,is victim),(C,is witness)},
at scene of crime, 09:58]
p6= [{(A,is accused),(X,is victim),(C,is witness)},
at scene of crime, 09:42]
Given the similarities in the structure of the posits
p5 and p6, it would be reasonable to assume they have
the same posit type, but that is actually not the case.
What seems to have started the kerfuffle between Archie
and Bella was in fact that Archie was being mean to her
dog, Xavier, which Charlie witnessed somewhat earlier
that day. This results in two similar, but not equal, posit
types τ(p5) and τ(p6).
pc4= [{(C4,is class)}, Animal, ]
pM5= [{(X,thing),(C4,class)}, active, 2009]
τ(p5) = [{(C3,is accused),(C3,is victim),(C3,is witness)},
string,time]
τ(p6) = [{(C3,is accused),(C4,is victim),(C3,is witness)},
string,time]
p′τ5= [{(T2,is type)},τ(p5),]
p′τ6= [{(T3,is type)},τ(p6),]
pT2= [{(C3,class),(T2,type)}, rarely, 1942]
pT3= [{(C3,class),(T3,type)}, rarely, 1942]
This example illustrates the importance of typing.
The added information it gives makes it possible to get
the whole picture of what is going on, and while the
posits p5and p6may look alike, the things they relate
differ in their classes. To get a good understanding of
what a class represents it is useful to look at a particular
set, called an ensemble.
def. of an ensemble
Let A be a body of information. If C is the unique identifier
for a class c from a model of A, the ensemble of c is the
information in effect limited to all assertions of typings in
which C occurs.
An ensemble gives a good idea of what it entails to
be an instance of, for example, the “Person” class. You
may even say that a person is that which commonly
may have a beard, always has a gender and a birth date,
and that rarely is involved as an accused, victim, or
witness, where the victim could be an animal or another
person32.
32 Provided that some positor has asserted these typings.
While a set of classifiers is sufficient to provide a model,
the ensembles enrich the model by providing
a more descriptive view of each class. Ensembles will
naturally evolve over time, as new information is added.
Together they can be likened to a growing schema in
a database, but with a complete history of changes, and
possibly multiple conflicting models and typings of the
same information.
Identification
Bodies of information rarely come about in a Big Bang
kind of fashion. Rather, they are gradually built, and
many may continue to evolve indefinitely. Herein lies a
difficulty with the unique identifiers that have already
been assigned within such a body.
Say, if a new witness steps forward a few weeks later,
is the Archie that witness saw the same Archie that
already has been assigned the unique identifier A?
def. of an identification
Let c be a class in a body of information A and e its ensemble.
Given some circumstances, where circumstances consist of
information not yet or only partly represented in A, an
identification is a search in A in which e is gradually populated
with values and/or other unique identifiers, until they match
existing assertions and posits, from some information in effect,
such that a single unique identifier of class c can be
determined or that after exhausting the circumstances no match
was found.
In the case of the new witness, the circumstances
may have been that that witness also saw a man at the
same scene of the crime at the same time as the oth-
ers and that that man also wore a red beard and fled
shortly thereafter. This may have been deemed sufficient
to draw the conclusion that the new witness must be
talking about the same man, Archie. When ensembles
are large, that is, have many members in their sets, it
may be convenient to narrow down searches by providing
some subsets of posit types that normally suffice for
identification.
def. of an identifier
Let e be the ensemble of a class c. An identifier of c is a subset
eid ⊆ e, for which it is deemed sufficient to fill with values
and/or other unique identifiers in order to complete an
identification.
Identification is, by all means and the definitions
above, no exact science. There also may be many ways
to reach the same conclusion in an identification and
many different identifiers for the same class. A real
perpetrator could perhaps be identified through DNA
matching, fingerprints, witnesses, motives, means, op-
portunity, and so on. Since positors are allowed to make
retractions, this may make identification even harder.
def. of analytic and synthetic
Let p be a posit in which the unique identifier i appears. If
all positors retract p and this is debilitating to the
identification process, then p is said to be analytic to i and
synthetic otherwise.
To illustrate these concepts, assume that Eliot, the
investigator, swabbed Archie shortly after capturing
him, resulting in his DNA being sequenced.
p7= [{(A,has DNA)},ACGT...,1972-08-20]
a10 =!(E,p7, 1, Sunday)
If Eliot retracts a10 and this would seriously debilitate
the identification process, that means that Archie’s
DNA is analytic to Archie. Analytic posits need to be
handled with additional care as the information evolves,
since retracting them may have negative side effects33.
Assuming that it was the DNA of Archie that eventually
led to his conviction, the whole trial may be put in
question if it turns out that the DNA actually belonged
to someone else.
33 In some modeling techniques, identifiers consisting of
only analytic posits are called “candidate keys”.
Cardinality
Eliot’s investigation would also show that Archie and
Bella knew each other quite well. The posits below34
describe a series of events, from Bella marrying Archie in
2004, to divorcing him in 2009, and to both remarrying
others in 2012. Of these, only the first two constitute a
real change, since they have the same dereferencing set.
The last two instead constitute different events, as the
dereferencing sets involve different unique identifiers.
p1w = [{(A, is husband), (B, is wife), (C, is minister),
(8, is church)}, marriage, 2004]
p2w = [{(A, is husband), (B, is wife), (C, is minister),
(8, is church)}, divorce, 2009]
p3w = [{(A, is husband), (F, is wife), (C, is minister),
(8, is church)}, marriage, 2012]
p4w = [{(G, is husband), (B, is wife), (C, is minister),
(9, is church)}, marriage, 2012]
34 These posits describe a quaternary relationship. While
high arity relationships are not often seen in database
models, they actually appear quite frequently in real life. The
discrepancy could perhaps be explained by the simple
fact that they are harder to manage, but then, what is lost
in databases due to this fact is quite troublesome.
With combined temporality, concurrency, and reliability,
it is of interest to understand constraints under
the circumstances where values may change over time,
positors may be in conflict, and retractions occur.
Describing the specific case of cardinality in transitional
modeling, which is a common constraint in almost every
technique, will help shed light on the general case
of how any constraint is handled. Consider enforcing
monogamy35 in the “wedding” relationship, where a
married couple may come to divorce, different positors
may have conflicting opinions about when it happened,
and they may change their minds about whether it
happened at all.
35 The author is well aware that weddings and marriages can
take many other forms and is in this example expressing
no particular preference, but for the sake of simplicity the
examples look the way they do.
def. of a limiter
A limiter is a triple, (R, l, u), where R = {r1, . . . , rn} is a set
of strings and l and u are positive integers, with the special
value also allowed for u. If u=, then l may be any positive
integer, or otherwise l ≤ u must hold.
The intention of a limiter is to express a lower, l, and
upper, u, bound for how many times a unique identifier
or combination thereof may appear having the roles
from R in a relationship. The special value should be
interpreted as an “unlimited” number of times36.
36 Note that zero is not a valid value for the lower bound.
Because of the way posits are defined, they always contain
unique identifiers with every role in a dereferencing set.
In that respect, if the church is optional at a wedding, a
different posit type without the is church role would be used.
def. of a cardinality constraint
A cardinality constraint, {L1, . . . , Ln} with n > 1, over a posit
type, τ(p), that has more than one role, is a set of limiters,
Li = (Ri, li, ui), with Ri = {ri1, . . . , rim}, for which every rij is a
role in the posit type τ(p) and where the Ri are disjoint sets.
In order to enforce monogamy, first recognize that
all posits share the same posit type τ(pw). Assuming
that C5 is the unique identifier for the “Location” class,
the posit type for the wedding relationship becomes as
follows.
τ(pw) = [{(C3,is husband),(C3,is wife),(C3,is minister),
(C5,is church)},string,year]
Preventing a husband from having several wives
and wives from having several husbands needs the
following cardinality constraint on τ(pw).
{({is husband}, 1, 1),({is wife}, 1, 1)}
According to the limiters, the unique identifier for
a husband must appear at least once and at most once
at the same time as the unique identifier for a wife also
must appear exactly once. This enforces a one-to-one
relationship between husbands and wives. However,
already the first two posits, p1w and p2w, seem to break
this constraint, since both Archie and Bella appear in
both of them.
def. of decisive fulfillment
A cardinality constraint is said to be fulfilled in a decisive
body of information, if among posits that share the same type
τ(p1) = τ(p2) = . . . = τ(pn), unique identifiers appear at
least l times and at most u times in the roles specified in the
corresponding limiter, for each positor and every information
in effect.
This is where it is important to notice that constraint
checks are not made against all posits in a body of
information, but rather made for each positor and
information in effect in that body. Fulfillment, as can be
seen, is only defined for decisive bodies of information.
Fulfillment in indecisive bodies will not be detailed in
this paper, but to give an idea, if Eliot asserts with 0.50
reliability both that Archie is currently married to Bella
and that he is married to Fanny, he could be stating that
there is a chance Archie is married to both of them or that
in reality, one wife is a fact and the other is not, but he
is not sure which one it is. Cardinalities in indecisive bodies
of information may aid in resolving this ambiguity, but
therefore also need additional treatment, which is left as
further research. Now, let Eliot instead decisively assert
the posits.
a1w=!(E,p1w, 1, Sunday)
a2w=!(E,p2w, 1, Sunday)
a3w=!(E,p3w, 1, Sunday)
a4w=!(E,p4w, 1, Sunday)
The information in effect on (Sunday, 2004) is {a1w},
on (Sunday, 2009) it is {a2w}, and on (Sunday, 2012) it
is {a3w, a4w}, exhausting the entire bitemporal timeline.
None of these break the constraint, so the constraint
is therefore fulfilled. Now let both Archie and Bella admit
they were married in 2004 through the following two
assertions.
a5w=!(A,p1w, 1, Sunday)
a6w=!(B,p1w, 1, Sunday)
Each of the sets with the information in effect above
will now also have these two assertions included. This
will, however, still not break the constraint, since each
assertion that could have done so is made by a different
positor. This pattern repeats itself for any type of con-
straint, such that checks are made for each positor and
information in effect37.
37 A uniqueness constraint,
such that the client number
at a law firm may never
repeat itself and thereby
uniquely identify a client, is
also only checked against the
information in effect and for
each positor. Two different
positors may assign the same
client number to two different
clients and the same client
number may have been reused
for another client at a different
time in the past.
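The per-positor nature of the check can be sketched for the monogamy constraint. The code below is an illustration only: it takes one information in effect as given input rather than computing it, assumes all posits share the wedding posit type, represents the special unlimited upper bound as `None`, and uses hypothetical names throughout.

```python
from collections import Counter, defaultdict

UNLIMITED = None  # assumed stand-in for the special "unlimited" upper bound


def fulfilled(limiters, assertions_in_effect):
    """Check every limiter separately for each positor.

    `assertions_in_effect` is a list of (positor, roles) pairs, where `roles`
    maps each role in the dereferencing set to a unique identifier.
    """
    by_positor = defaultdict(list)
    for positor, roles in assertions_in_effect:
        by_positor[positor].append(roles)
    for posits in by_positor.values():
        for role_names, lower, upper in limiters:
            counts = Counter(tuple(p[r] for r in role_names) for p in posits)
            for n in counts.values():
                if n < lower or (upper is not UNLIMITED and n > upper):
                    return False
    return True


def wedding(husband, wife, minister, church):
    return {"is husband": husband, "is wife": wife,
            "is minister": minister, "is church": church}


p1w, p3w, p4w = wedding("A", "B", "C", 8), wedding("A", "F", "C", 8), wedding("G", "B", "C", 9)
monogamy = [(("is husband",), 1, 1), (("is wife",), 1, 1)]

# Information in effect on (Sunday, 2012), including a5w and a6w.
in_effect_2012 = [("E", p3w), ("E", p4w), ("A", p1w), ("B", p1w)]
print(fulfilled(monogamy, in_effect_2012))

# Ignoring who asserted what makes Archie appear as husband more than once.
merged = [("anyone", roles) for _, roles in in_effect_2012]
print(fulfilled(monogamy, merged))
```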
Given the quaternary “wedding” relationship, there
are a lot more cardinality constraints that can be de-
fined. For the sake of completeness, the following is
defined.
def. of complete cardinality38
The complete cardinality of a posit type, τ(p), is a set of all
cardinality constraints that can be created using combinations
of roles from τ(p).
38 The size of the complete set is calculated by:
\[
\frac{3^n - 2^{n+1} + 1}{2}
\]
giving for example 1, 6, 25, 90, and 301 elements for
2, 3, 4, 5, and 6-ary posits. The fraction is the same as
the number of matches n players would play, given
that all possible teams, combining them in sizes
from 1 to n - 1, meet once and that players cannot
meet themselves.
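The closed form for the size of the complete set can be checked by brute force. The sketch below assumes that each counted constraint pairs exactly two disjoint, non-empty role sets, which is the reading that reproduces the quoted sizes; the function names are hypothetical.

```python
from itertools import combinations


def complete_cardinality_size(n):
    """Closed form (3^n - 2^(n+1) + 1) / 2 for an n-ary posit type."""
    return (3**n - 2**(n + 1) + 1) // 2


def role_set_pairs(roles):
    """Enumerate unordered pairs of disjoint, non-empty role sets."""
    roles = list(roles)
    subsets = [frozenset(c)
               for size in range(1, len(roles))
               for c in combinations(roles, size)]
    return {frozenset({a, b}) for a, b in combinations(subsets, 2) if not (a & b)}


wedding_roles = ["is husband", "is wife", "is minister", "is church"]
print(len(role_set_pairs(wedding_roles)), complete_cardinality_size(4))
```

The enumeration grows exponentially with the number of roles, which illustrates why the complete set is rarely spelled out in full.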
Already for a quaternary relationship like the wed-
ding, the complete cardinality set has 25 members.
Along with three already exemplified, another five of
them are the following constraints.
{({is husband}, 1, ),({is minister}, 1, 1)},
{({is wife}, 1, ),({is minister}, 1, 1)},
{({is husband}, 1, ),({is church}, 1, 1)},
{({is wife}, 1, ),({is church}, 1, 1)},
{({is husband, is wife}, 1, ),({is minister, is church}, 1, 1)}
The first two constraints express that if a husband
or wife is present at a wedding exactly one minister
must be as well and that a minister may wed many
husbands and wives. The next two express a similar
constraint with respect to the church. The last expresses
that if a husband and a wife are present at a wedding in
which there is a combination of a minister and a church,
there must be only one such combination, and that a
combination of a minister and a church may have
partaken in the wedding of many husbands and wives.
Intuitively, there should be some relation between the
first four constraints and the last. It is counter-intuitive
that a husband and wife could be married more than
once if only they find a new combination of minister
and church. While not immediately obvious how, it
turns out that it is possible to calculate the smallest and
largest possible bounds for the last constraint, given
the first four. Being slightly less formal in how this is
calculated, first define a “cross product” according to
the following rules for the lower and upper bounds in a
cardinality constraint.
(li,ui)×(lj,uj) = (max(li,lj),min(ui,uj))
Since the upper bound can have the special value,
max and min are defined as follows when it is involved.
min(u,) = u
max(l,) = l
min(,) = max(,) =
To find the bounds for the last constraint, first cal-
culate from left to right and crosswise roles, (hus-
band side to minister) ×(husband side to church)
×(wife side to minister) ×(wife side to church) =
(1, )×(1, )×(1, )×(1, ) = (1, ). Then
calculate from right to left and crosswise roles, (min-
ister side to husband) ×(minister side to wife) ×
(church side to husband) ×(church side to wife) =
(1, 1)×(1, 1)×(1, 1)×(1, 1) = (1, 1). This gives the
bounds found in the last constraint, and it cannot pass
beyond these bounds without resulting in inconsistent
cardinalities. Further restricting the constraint is allowed
though, and warrants the need for actually specifying
higher order constraints. For example, if a minister is
only allowed to wed five couples in any given church
before finding a new church, the last constraint can be
restricted to:
{({is husband, is wife}, 1, 5),({is minister, is church}, 1, 1)}
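The bound calculation above can be sketched in a few lines. In this illustration the special unlimited value is represented by `None`, which is an assumption of the example and not the notation of the text, and the helper names `cross` and `within` are hypothetical.

```python
from functools import reduce

UNLIMITED = None  # assumed stand-in for the special "unlimited" value


def bound_min(a, b):
    """min that ignores the unlimited value, as in the rules above."""
    if a is UNLIMITED:
        return b
    if b is UNLIMITED:
        return a
    return min(a, b)


def bound_max(a, b):
    """max that ignores the unlimited value, as in the rules above."""
    if a is UNLIMITED:
        return b
    if b is UNLIMITED:
        return a
    return max(a, b)


def cross(x, y):
    """Cross product of two (lower, upper) bound pairs."""
    (l1, u1), (l2, u2) = x, y
    return (bound_max(l1, l2), bound_min(u1, u2))


def within(bounds, limits):
    """True if `bounds` only restricts, and never passes beyond, `limits`."""
    (lower, upper), (low_limit, high_limit) = bounds, limits
    upper_ok = high_limit is UNLIMITED or (upper is not UNLIMITED and upper <= high_limit)
    return lower >= low_limit and upper_ok


couple_side = reduce(cross, [(1, UNLIMITED)] * 4)  # left to right
venue_side = reduce(cross, [(1, 1)] * 4)           # right to left
print(couple_side, venue_side)
print(within((1, 5), couple_side))  # the five-couples restriction is allowed
```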
def. of a limiting and a limit
Let is limit be a role reserved for the purpose of creating
constraints. A posit pL = [{(L, is limit)}, L, t] defines a
limit through the limiter L and associates the unique
identifier L with it. A limiting is a relationship that binds a
posit type to a limit, expressed through posits on the form
pL = [{(T, type), (L, limit)}, v, t].
When limitings are asserted, constraints also become
part of the information itself, similar to the model and
the types. This makes it possible for constraints to be
valid only during certain periods of time, for example,
or that they may evolve and look different now from
what they did before. Different positors may also simul-
taneously use different constraints.
Declaring constraints in transitional modeling turns
out to be very similar to traditional techniques, where
the complexity instead lies within their enforcement,
over every positor and every information in effect.
Related Research
Some 20 years ago (Inmon)39 defined a Data Warehouse
as being “a subject-oriented, integrated, time-variant,
nonvolatile collection of data”. While companies are
belatedly coming to terms with the first two conditions,
in our experience, supported by (Golfarelli and Rizzi)40,
few can say that they fulfil the last two, primarily
because of lacking methods. If Data Warehouses are our
best attempt at integrating bodies of information under
evolution, they are currently not well equipped to do
so. To address this, the presented transitional modeling
technique, rooted in multiple research areas, makes it
easy to communicate, store, manage, and analyze
information that is conflicting, uncertain and varying.
39 William H Inmon. Building the data warehouse. John
Wiley & Sons, 2005
40 Matteo Golfarelli and Stefano Rizzi. “A survey on
temporal data warehousing”. In: International Journal of
Data Warehousing and Mining (IJDWM) 5.1 (2009), pp. 1–17
(Bradley)41 shares the notion that a thing is defined
by its appearance and change is “a bond of identity
in differences”. Through positors and appearances
problems such as “the Ship of Theseus”, discussed
by (Chisholm et al.)42, can be largely avoided. Even if
every plank in the ship is replaced, the unique identifier
once associated with the ship will prevail, and will do
so eternally (Sider; Black)43. If at some point another
identifier is associated with the ship, because a different
positor deems it a new ship, that positor is perfectly free
to do so. The two ships can then be discussed in
parallel, and relations between them defined. To paraphrase
the Greek philosopher Heraclitus [~500 BC], it is possible
for the same man to step into the same river twice,
as long as some positor can identify them to still be
the same. Being the same man or the same river rarely
comes down to following the trail of body cells or water
molecules.
41 Francis Herbert Bradley. Appearance and reality: a
metaphysical essay. Routledge, 2016
42 Roderick Chisholm et al. Person and object: A
metaphysical study. Routledge, 2014
43 Theodore Sider. “Quantifiers and temporal ontology”. In:
Mind 115.457 (2006), pp. 75–97; Max Black. “The identity
of indiscernibles”. In: Mind 61.242 (1952), pp. 153–164
(Lombard)44 presents the problem of distinguishing
relational from non-relational change, which is solved
by separating immutable parts, allowing change to
happen independently across a model, similar to how our
unique identifiers are the only immutables, while
everything else may change. (Moens and Steedman)45 define
“culmination” as being a point in time “accompanied
by a transition to a new state of the world”, similar to
how our appearance time is used to capture different
states. (Bradley)46 also states that shifting to statements
and away from reality relaxes the need for absolute
truth. Similarly our assertions make it possible to hold
different and uncertain opinions about posits. An RDF
triple (W3C RDF Working Group)47 roughly corresponds
to our posit, but has no concept beyond it with
respect to temporality, reliability, or concurrency. The
quality of a posit being “analytic” or “synthetic” is
borrowed directly from the philosophical works of (Ayer;
Ramsey)48.
44 Lawrence Brian Lombard. “Relational change and
relational changes”. In: Philosophical Studies 34.1
(1978), pp. 63–79
45 Marc Moens and Mark Steedman. “Temporal ontology
and temporal reference”. In: Computational linguistics 14.2
(1988), pp. 15–28
46 Francis Herbert Bradley. Appearance and reality: a
metaphysical essay. Routledge, 2016
47 W3C RDF Working Group. Resource Description
Framework. 2004. URL: http://www.w3.org/RDF/ (visited
on 07/11/2018)
48 Alfred Ayer. “Language, Truth, and Logic”. In: London:
V. Gollancz ltd (1936); Frank Ramsey. “Truth and Probability.
Reprinted in HE Kyburg and HE Smokler”. In: Studies in
Subjective Probability (1926), pp. 25–52
(Johnston)49 suggests an approach based on speech
act theory with actors resembling positors and speech
acts bound to two axes of time, but does not take reliability
into account. (Reichenbach)50 was among the
first to suggest bitemporality: a difference between the
“point of speech” and the “point of event”, similar to
our assertion time and appearance time. The specifics
of bitemporal models in databases were later elaborated
upon by (Snodgrass and Ahn)51. Neither Reichenbach
nor Snodgrass touches upon conflicting or uncertain
information.
49 Tom Johnston. Bitemporal data: theory and practice. Newnes, 2014
50 Hans Reichenbach. “The tenses of verbs”. In: Time: From Concept to Narrative Construct: a Reader (1947)
51 Richard Snodgrass and Ilsoo Ahn. “A taxonomy of time databases”. In: ACM Sigmod Record 14.4 (1985), pp. 236–246
Reliability has likenesses with the Uncertainty Theory
of (B. Liu)52 and its personal belief degree. While Liu
states that “when the personal knowledge changes, the
belief degree changes too”, and that “different people
may produce different belief degrees”, his theory does
not as of yet support these notions. His “normality” and
“duality” axioms correspond to our definitions of
non-contradiction and boundary, although with a symmetry
around 0.5 instead of 0 as used in transitional modeling.
52 B Liu. Uncertainty Theory: An Introduction to its Axiomatic Foundations. 2004. Springer-Verlag, Berlin, 2004
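The difference in symmetry can be illustrated as a change of scale. The sketch below assumes, purely for illustration and not as part of either theory, that a belief degree in [0, 1] and a reliability in [-1, 1] are related linearly. The function names are hypothetical.

```python
def belief_to_reliability(belief: float) -> float:
    """Map a belief degree in [0, 1] (symmetric around 0.5)
    to a reliability in [-1, 1] (symmetric around 0).
    The linear relation is an assumption made for this sketch."""
    if not 0.0 <= belief <= 1.0:
        raise ValueError("belief degree must lie in [0, 1]")
    return 2.0 * belief - 1.0


def reliability_to_belief(reliability: float) -> float:
    """Inverse of belief_to_reliability."""
    if not -1.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [-1, 1]")
    return (reliability + 1.0) / 2.0
```

Under this assumed mapping, a statement and its opposite whose belief degrees sum to 1 get reliabilities that sum to 0, which is what moves the point of symmetry from 0.5 to 0.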
(Massaro and Friedman; Lin and Mendelzon)53 highlight
the difficulty of integrating information from multiple
sources, especially when these may be contradictory.
The problem of integration lies in the fact that it is
treated at write time, such that integrated information
ends up being stored. This is lossy and relies on integration
rules already being in place, not to mention that
these rules are assumed to be perfect and never subject
to change. These are dangerous assumptions, and with
transitional modeling no such assumptions are made.
Resolving possible conflicts can be deferred to query
time, without losing any information due to write-time
integration.
53 Dominic W Massaro and Daniel Friedman. “Models of integration given multiple sources of information.” In: Psychological review 97.2 (1990), p. 225; Jinxin Lin and Alberto O Mendelzon. “Merging databases under constraints”. In: International Journal of Cooperative Information Systems 7.01 (1998), pp. 55–76
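Deferring integration to query time can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the technique: the `Assertion` record, the positor labels, the stored values, and both resolution rules are hypothetical, and reliabilities are assumed to lie in [-1, 1].

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Assertion:
    positor: str        # who holds the opinion
    statement: str      # what the opinion concerns (hypothetical key)
    value: str          # the value claimed
    reliability: float  # assumed to lie in [-1, 1]


# Everything is stored as received; nothing is merged at write time.
# The data below is made up for illustration.
STORE: List[Assertion] = [
    Assertion("positor A", "Archie's length", "175 cm", 0.5),
    Assertion("positor B", "Archie's length", "180 cm", 0.75),
    Assertion("positor A", "Archie's length", "170 cm", 0.25),
]

Rule = Callable[[List[Assertion]], Optional[str]]


def most_reliable(candidates: List[Assertion]) -> Optional[str]:
    """One possible rule: pick the value asserted with the highest reliability."""
    if not candidates:
        return None
    return max(candidates, key=lambda a: a.reliability).value


def trust_only(positor: str) -> Rule:
    """Another possible rule: consider only a single positor's opinions."""
    def rule(candidates: List[Assertion]) -> Optional[str]:
        return most_reliable([a for a in candidates if a.positor == positor])
    return rule


def resolve(statement: str, rule: Rule) -> Optional[str]:
    """Apply a resolution rule at query time; the store is left untouched."""
    return rule([a for a in STORE if a.statement == statement])
```

Because the rule is an argument to the query, it can be replaced later without any stored information having been lost: `resolve("Archie's length", most_reliable)` and `resolve("Archie's length", trust_only("positor A"))` read the same store and may give different answers.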
The field of information fusion is flourishing, due to
requirements to integrate extreme volumes of sensory
data, which current techniques try to integrate in
real time (Smets and Kennes; Z.-g. Liu et al.)54. This,
naturally, leads to loss of the actual circumstances, and
perhaps this field could benefit from storing actual data
and delaying integration until decisions need to be
made.
54 Philippe Smets and Robert Kennes. “The transferable belief model”. In: Artificial intelligence 66.2 (1994), pp. 191–234; Zhun-ga Liu et al. “Sequential adaptive combination of unreliable sources of evidence”. In: Advances and Applications of DSmT for Information Fusion 2 (2015), p. 23
(Benjelloun et al.)55 introduce databases with uncertainty
and lineage, with confidence values similar
to reliability and lineage similar to positors, but that
lack the concept of temporality. (Dylla, Miliaraki, and
Theobald)56 extend similar databases with “valid
time”, but do not make the connexion to transitions
in opinions, which are tracked over our assertion time.
(Papafragou)57 warrants such connexions in her linguistics
paper, stating that “subjective epistemic modality
is time-dependent”, or in other words, when people
express themselves vaguely they are more prone to revising
their statements. Transitional modeling should be
an excellent candidate in situations where vague statements
need to be recorded and it is of importance to
keep track of when minds are changed about things.
55 Omar Benjelloun et al. “Databases with uncertainty and lineage”. In: The VLDB Journal-The International Journal on Very Large Data Bases 17.2 (2008), pp. 243–264
56 Maximilian Dylla, Iris Miliaraki, and Martin Theobald. “A temporal-probabilistic database model for information extraction”. In: Proceedings of the VLDB Endowment 6.14 (2013), pp. 1810–1821
57 Anna Papafragou. “Epistemic modality and truth conditions”. In: Lingua 116.10 (2006), pp. 1688–1702
(Thalheim)58 holds that modeling is an art and that
models provide boundaries between similar and dissimilar
things. Furthermore, he holds that models need to evolve
along with the environment they are meant to capture,
even if Thalheim holds that there is one true model,
rather than a multitude depending on who is doing
the modeling. Our ability to track the evolution of a
model over time is similar to requirements in ontology
evolution (Noy and Klein)59 and database schema
evolution (McBrien and Poulovassilis)60. Our take on
classifiers, typings, and limitings as part of the information
itself, with the ability to evolve and to be disagreed
upon, is probably unique, and makes it possible
to tell if an answer could have been given at all
to a certain query if it had been asked in the past. The
concept of an ensemble is however directly borrowed
from (Hultgren)61, who coined the term “Ensemble
Modeling” as an umbrella over database modeling techniques
that separate mutable and immutable parts of
the information.
58 Bernhard Thalheim. “The art of conceptual modelling”. In: Information Modelling and Knowledge Bases XXII. Frontiers in Artificial Intelligence and Applications 237 (2012), pp. 149–168
59 Natalya F Noy and Michel Klein. “Ontology evolution: Not the same as schema evolution”. In: Knowledge and information systems 6.4 (2004), pp. 428–440
60 Peter McBrien and Alexandra Poulovassilis. “Schema evolution in heterogeneous database architectures, a schema transformation approach”. In: International Conference on Advanced Information Systems Engineering. Springer. 2002, pp. 484–499
61 Hans Hultgren. Modeling the agile data warehouse with data vault. New Hamilton, 2012
Concerning constraints, some work on higher-order
cardinality in particular has been done by (Hartmann)62.
A more generic discussion with respect to
cardinalities in conjunction with temporality and the
complexity that arises can be found in (Allen)63. While
positors need to be precise with respect to appearance
and assertion time, this may be limiting in some cases,
and research into imprecise time intervals has been
done by (Koubarakis)64. It is possible that the introduction
of complex data types for representing time may
satisfy such needs, but that is yet to be researched.
62 Sven Hartmann. “On the implication problem for cardinality constraints and functional dependencies”. In: Annals of Mathematics and Artificial Intelligence 33.2-4 (2001), pp. 253–307
63 James F Allen. “Maintaining knowledge about temporal intervals”. In: Communications of the ACM 26.11 (1983), pp. 832–843
64 Manolis Koubarakis. “Representation and querying in temporal databases: the power of temporal constraints”. In: Data Engineering, 1993. Proceedings. Ninth International Conference on. IEEE. 1993, pp. 327–334
A good overview of research concerning erroneous,
imprecise and uncertain information can be found
in (Smets; Motro)65. Erroneous data is to some extent
handled by our retractions. Imprecise information, such
as a value being somewhere within some limits, say for
example that Charlie estimated Archie’s length to be in
the 170–180 cm range, can be handled by introducing
complex data types for representing values. Uncertainty
is handled through our reliability.
65 Philippe Smets. “Imperfect information: Imprecision and uncertainty”. In: Uncertainty management in information systems. Springer, 1997, pp. 225–254; Amihai Motro. “Imprecision and uncertainty in database systems”. In: Fuzziness in Database Management Systems. Springer, 1995, pp. 3–22
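A complex data type for imprecise values could look like the sketch below. The `Range` type, its fields, and its methods are hypothetical and only illustrate the idea of a value lying somewhere within limits. They are not a data type defined by the technique.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """A value known only to lie between two limits (inclusive)."""
    low: float
    high: float
    unit: str

    def __post_init__(self) -> None:
        if self.low > self.high:
            raise ValueError("low must not exceed high")

    def contains(self, x: float) -> bool:
        """True if x lies within the limits."""
        return self.low <= x <= self.high

    def overlaps(self, other: "Range") -> bool:
        """True if the two ranges share at least one value."""
        if self.unit != other.unit:
            raise ValueError("cannot compare ranges with different units")
        return self.low <= other.high and other.low <= self.high


# The estimate from the example: somewhere in the 170-180 cm range.
estimate = Range(170, 180, "cm")
```

Two estimates given as ranges can then be compared with `overlaps` instead of being treated as unequal simply because they are not identical.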
Special Cases
The fundamental nature of transitional modeling com-
bined with its ability to capture rather generic informa-
tion results in many other modeling techniques being
expressible as special cases of this technique. Three
of them are the database modeling techniques Anchor
Modeling, Data Vault, and the third normal form.
Starting with Anchor Modeling (Ronnback et al.)66,
it is similarly constructed around immutable unique
identifiers, called “anchors”. Its “attributes” and “ties”
correspond to posits with respectively single or multiple
appearances in their dereferencing sets. These posits
are also limited to primitive data types for values and
times, except for “knotted” attributes and ties, for which
a complex value type in the form of an enumeration is
used. Its “changing time” is the time over which states change
in the domain being modeled, exactly like our appearance
time. As it, in the form presented in the referenced
paper, has no construct corresponding to assertions,
the assumption is that there is a single positor asserting
every posit with absolute certainty since the dawn
of time. Unpublished work is available on their homepage
(Ronnback)67, concerning extensions that allow for
functionality that brings it closer to transitional modeling.
66 Lars Ronnback et al. “Anchor modeling - Agile information modeling in evolving data environments”. In: Data & Knowledge Engineering 69.12 (2010), pp. 1229–1253
67 Lars Ronnback. Anchor Modeling. 2005. URL: http://www.anchormodeling.com/ (visited on 07/11/2018)
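The correspondence between posits and Anchor Modeling constructs can be sketched as a classification on the size of the dereferencing set. The representation below is an assumption made for illustration: an appearance is taken to be a pair of a unique identifier and a role, and the `Posit` record, its field names, and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, FrozenSet, Tuple

# Assumed representation of an appearance: (unique identifier, role).
Appearance = Tuple[int, str]


@dataclass(frozen=True)
class Posit:
    dereferencing_set: FrozenSet[Appearance]
    value: Any
    appearance_time: str


def anchor_modeling_construct(posit: Posit) -> str:
    """Classify a posit as the Anchor Modeling construct it corresponds to:
    a single appearance gives an attribute, multiple appearances give a tie."""
    count = len(posit.dereferencing_set)
    if count == 0:
        raise ValueError("a posit needs at least one appearance")
    return "attribute" if count == 1 else "tie"


# Made-up sample posits.
name = Posit(frozenset({(42, "name")}), "Archie", "2018-01-01")
marriage = Posit(frozenset({(42, "husband"), (43, "wife")}), "married", "2018-01-01")
```

The same test on the number of appearances separates satellites from links when synthetic posits are mapped to Data Vault.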
Continuing with Data Vault (Linstedt and Graziano)68,
it is similarly constructed around immutable unique
identifiers, called “hubs”, but where hubs also contain
at least one analytic posit with a complex data type for
its value, consisting of a set of primitive values. Synthetic
posits become either satellites or links, depending
on whether they respectively have single or multiple
appearances in their dereferencing sets. Both of these
also use complex data types for their values, consisting
of sets of primitive values. There is no definite rule for
when such a set of primitive values may be divided in
order to create another satellite or link, but the practice
seems to be related to their rate of change (Hultgren)69.
Its “loading time” is the time over which states change in the
database, rather than the domain being modeled, so
it is different from our appearance time. There are no
constructs corresponding to assertions, but loading time
is closer to assertion time, if it is assumed that the database
is the single positor asserting every posit with absolute
certainty at the time of loading.
68 Dan Linstedt and Kent Graziano. Super Charge Your Data Warehouse: Invaluable Data Modeling Rules to Implement Your Data Vault. CreateSpace, 2011
69 Hans Hultgren. Modeling the agile data warehouse with data vault. New Hamilton, 2012
Finishing with the third normal form (Kent; Codd)70,
it is not constructed around immutable unique identifiers,
but rather around analytic posits. A “table” in
third normal form consists of at least one analytic posit
along with optional synthetic posits, both having complex
data types for their values, consisting of sets of
primitive values. The cardinality of a posit type with
multiple roles determines to which tables its posits belong.
For one-to-one and one-to-many the posit is broken
apart and put into both tables. For many-to-many or
higher order than pairwise cardinalities, a separate table
is needed. In third normal form posits with multiple
appearances in their dereferencing sets carry no value,
but it is likely that if they do, a separate table is always
needed. It also lacks constructs to manage state changes,
so it is assumed that all posits are eternally in the same
state, until destructively updated to the next assumed
eternal state. There is no construct corresponding to
assertions, so the assumption is that there is a single
positor asserting every posit with absolute certainty
since the dawn of time.
70 William Kent. “A simple guide to five normal forms in relational database theory”. In: Communications of the ACM 26.2 (1983), pp. 120–125; Edgar F Codd. “A relational model of data for large shared data banks”. In: Communications of the ACM 13.6 (1970), pp. 377–387
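The assumption shared by these special cases, a single positor asserting everything with absolute certainty, can be made explicit by generating the assertion that the other techniques leave implicit. The sketch below is illustrative only: the `Assertion` record, the positor label, the use of 1.0 for absolute certainty, and `datetime.min` as a stand-in for the dawn of time are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

DAWN_OF_TIME = datetime.min      # stand-in for "since the dawn of time"
SINGLE_POSITOR = "the database"  # hypothetical label for the only positor


@dataclass(frozen=True)
class Assertion:
    positor: str
    posit: str                   # the posit being asserted, simplified to text
    reliability: float           # 1.0 is assumed to mean absolute certainty
    assertion_time: datetime


def implicit_assertion(posit: str,
                       loading_time: Optional[datetime] = None) -> Assertion:
    """Spell out the assertion that a special case leaves implicit.

    Without a loading time the posit is asserted since the dawn of time,
    as assumed for Anchor Modeling and third normal form. With a loading
    time it is asserted at the time of loading, as assumed for Data Vault.
    """
    when = DAWN_OF_TIME if loading_time is None else loading_time
    return Assertion(SINGLE_POSITOR, posit, 1.0, when)
```

Seen this way, each special case fixes the positor and the reliability to constants, which is why none of them can hold conflicting or uncertain opinions.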
Conclusions and Further Research
The presented technique, transitional modeling, should
have a large number of applications across different
fields of research, as well as in practice.
It can, for example, be used to bridge the gap
between what a Data Warehouse should be and what it
currently is in most implementations. It should also be
of interest for operational system databases and other
types of systems in which it is of importance never to
lose information. In research areas where information
is influenced by variation, uncertainty, or opinions, this
technique may bring a refreshing approach to integra-
tion, such that it can be moved from gathering time to
the time of the decision making. Classifiers, and their
derived ensembles, make it possible to have multiple
evolving schemas over the same information, which
should prove useful where not only the content, but
also the structure, of information changes rapidly. The
story in the running example is illustrative of informa-
tion coming from humans, sensors, predictions, ratings,
tracking, and other multifaceted sources. Such information can now
be fully modeled, stored, communicated, and analyzed,
through the introduced concepts of posits and assertions.
More research is needed in order to fully understand
constraints in indecisive bodies of information. An-
other area of interest is which complex data types for
values and times could enrich the theory even further.
Ranges would be an immediate candidate for dealing
with imprecise information, for example. Most of the
paper assumes bodies of information to be comprehen-
sive, but there may be many fruitful conclusions that
can be drawn from other types of bodies. It is also not
unthinkable that false assertions, disinformation, could
be interesting to represent, but it is unclear what the
repercussions to the theory would be.
An obvious area of research is how to, if possible,
represent posits and assertions using relational algebra.
The previously mentioned Anchor Modeling has come
a long way, but a relational representation may not
be the best fit for transitional modeling. Therefore, a
new massively parallel processing database engine,
built using posits and assertions at its core, is in early
stages of development. It is a natural venue for our
own further research. We believe strongly that it is
only when a theory is converted to practice that you
see its true benefits and limitations, which is why we
are also aiming for some actual implementations of the
technique.