Discovering Causal Relations in Textual Instructions
Kristina Yordanova
University of Rostock
kristina.yordanova@uni-rostock.de
Abstract
One aspect of ontology learning methods is the discovery of relations in textual data. One such kind of relation is the causal relation. Our aim is to discover causations described in texts such as recipes and manuals. There is a lot of research on causal relations discovery that is based on grammatical patterns. These patterns are, however, rarely discovered in textual instructions (such as recipes) with short and simple sentence structure. We therefore propose an approach that makes use of time series to discover causal relations. We distinguish causal relations from correlation by assuming that one word causes another only if it precedes the second word temporally. To test the approach, we compared the causal relations discovered by our approach to those obtained through grammatical patterns in 20 textual instructions. The results showed that our approach has an average recall of 41% compared to 13% obtained with the grammatical patterns. Furthermore, the causal relations discovered by the two approaches are usually disjoint. This indicates that the approach can be combined with grammatical patterns in order to increase the number of causal relations discovered in textual instructions.
1 Introduction and Motivation
There is an increasing number of approaches and systems for ontology learning based on textual data, partially because of the availability of web resources that are easily accessible on the internet (Wong et al., 2012). One problem these approaches face is the discovery of relations in the data (Wong et al., 2012). One type of relation is the causal relation between text elements, that is, whether one word or phrase causes another. Most of the research regarding causal relations is centred on the discovery of causal relations between topics (Radinsky et al., 2011; Kim et al., 2012; Li et al., 2010) or is based on a large amount of textual data (Silverstein et al., 2000; Mani and Cooper, 2000; Girju, 2003). Moreover, these works usually focus on the discovery of causal relations in rich textual data with complex sentence structure (Silverstein et al., 2000; Mani and Cooper, 2000; Girju, 2003). There is, however, little research on discovering causal relations in textual instructions that have short sentence length and simple structure (Zhang et al., 2012). This can be explained by the fact that short sentences often do not contain any grammatical causal patterns; rather, the relations are implicitly inferred by the reader. There is a large amount of web instructions available in the form of recipes, manuals, and tutorials that contain such simple structures (for example, BBC Food Recipes currently provides 12,385 recipes (BBC, 2015)). For example, in the sentence "Add the pork pieces, fry them for 2 minutes." there is no explicit causal relation between add and fry. However, we implicitly know that without adding the pork pieces, we cannot fry them. This means that when attempting to learn an ontology representing the domain knowledge of such a domain, it is difficult to discover causal relations between the ontology elements. For example, when attempting to learn the ontology structure of our experimental data with a state of the art tool (Cimiano and Völker, 2005), the tool is able to identify is-a relations, but no similarity or causal relations in the text. To address the problem of identifying causal relations in textual data, in this paper we discuss an approach that utilises time series in order to find temporally dependent elements in the text. We concentrate on the discovery of relations between events (by event we mean the verb describing the action that has to be executed in an instruction), and on the relation between events and the words that describe the changes these events cause.
The work is structured as follows. In Section 2 we discuss the related work on causal relations discovery. In Section 3 we present our approach to causality discovery. The experimental setup used to test our approach is described in Section 4. We then discuss the results in Section 5 and conclude the work with a discussion of the advantages and limitations of the approach (Section 6).
2 Related Work
There is a lot of research on the discovery of causal relations in textual data. Most of it is centred on applying grammatical patterns in order to identify the relations. Khoo et al. (1998) propose five ways of explicitly identifying cause-effect pairs, and based on them construct patterns for discovering them. The patterns employ causal links between two phrases or clauses (e.g. hence, therefore), causative verbs (e.g. cause, break), resultative constructions (verb-noun-adjective constructions), conditionals (e.g. if-then), and causative adverbs and adjectives (e.g. fatally). Khoo et al. also provide an extensive catalogue of causative words and phrases. Based on this concept, other works search for causal relations for different applications. For example, Li et al. attempt to generate attack plans based on newspaper data (Li et al., 2010); Girju et al. utilise grammatical patterns in order to analyse cause-effect questions in a question answering system (Girju, 2003); Cole et al. apply grammatical patterns to textual data in order to obtain Bayesian network fragments (Cole et al., 2006); and Radinsky et al. mine web articles to identify causal relations (Radinsky et al., 2011).

Other approaches combine grammatical patterns with machine learning in order to extract preconditions and effects from textual data. For example, Sil et al. train a support vector machine with a large annotated textual corpus in order to be able to identify preconditions and effects, and to build STRIPS representations of actions and events (Sil and Yates, 2011).

Alternative approaches rely on the Markov condition to identify causal relations between documents. They utilise the LCD algorithm, which tests variables for dependence, independence, and conditional independence to restrict the possible causal relations (Cooper, 1997). Based on this algorithm, Silverstein et al. were able to discover causal relations between words by representing each article as a sample with the n most frequent words (Silverstein et al., 2000). Similarly, Mani et al. apply the LCD algorithm to identify causal relations in medical data (Mani and Cooper, 2000).

All of the above methods are applied to large amounts of data, usually with rich textual descriptions. There is, however, not much research on finding the causal relations within a textual instructions document, where the sentences are short and simple. Zhang et al. attempt to extract procedural knowledge from textual instructions (manuals and recipes) in order to build a procedural model of the instruction (Zhang et al., 2012). By applying grammatical patterns they are able to build a procedural model of each sentence. They, however, do not discuss the relations between the identified procedures and thus do not identify any causal relations between the sentences.

In our work we identify implicit causal relations within and between sentences in a document. To do that we adapt the approach proposed by Kim et al. (2012; 2013), where they search for causally related topics by representing each topic as a time series, where each time stamp is represented by a document from the corresponding topic. In the following we explain how the approach can be adapted to identify causal relations within a textual document.
3 Discovering Causal Relations using Time Series
Textual instructions such as recipes and manuals have a simple sentence structure that does not contain many grammatical patterns indicating explicit causal relations. On the other hand, we as humans are able to detect implicit relations, e.g. that one instruction can be executed only after another was already executed. In that case, we can either assume that the causal relation between events follows the temporal relation (i.e. each event causes the next), or we can attempt to identify only those events that are causally related. Similarly, to identify the effects an event has on the object, or the state of the object that allows the occurrence of the event, one can search for grammatical patterns. That will, however, only identify relations within the sentence, but not between sentences (unless they are connected with a causal link). For example, in the sentences "Simmer (the sauce) until thickened. Add the pork, mix well for one minute." a grammatical pattern will discover that simmer causes thickened (through the causal link until). However, it will not discover that the sauce has to thicken in order to add the pork. A grammatical pattern will also not discover the relation between add and mix, as there is no causal link between them. To discover such implicit relations, we treat each word in textual instructions as a time series. We then apply a causality test on the pairs of words we are interested in to identify whether they are causally related or not.

We concentrate on three types of causal relations: (1) between two events; (2) between an event and its effect on the state of the object over which the event is executed; (3) between the state the object has to be in before an event can be executed over it and the event itself. By state of the object we mean the phrase that serves as an adjectival modifier or a nominal subject.
We consider a text to be a sequence of sentences divided by a sentence separator.

Definition 1 (Text) A text $I$ is a set of tuples $(S, C) = \{(s_1, c_1), (s_2, c_2), \ldots, (s_n, c_n)\}$ where $S$ represents the sentences and $C$ the sentence separators, with $n$ being the length of the text.

Each sentence in the text is then represented by a sequence of words, where each word has a tag describing its part of speech (POS) meaning.

Definition 2 (Sentence) A sentence $S$ is a set of tuples $(W, T) = \{(w_1, t_1), \ldots, (w_m, t_m)\}$ where $W$ represents the words in the sentence, and $T$ the corresponding POS tags assigned to the words. The sentence is $m$ words long.

In a text we have different types of words. We are most interested in verbs, as they describe the events that cause other events or changes. More precisely, a verb $v \in W$ is a word for which the tuple $(v, t)$ satisfies $t = verb$. We denote the set of verbs with $V$. The events are then verbs in their infinitive form or in present tense, as textual instructions are usually described in imperative form with a missing agent.

Definition 3 (Event) An event $e \in V$ is a verb for which the tuple $(e, t)$ satisfies $t = verb\_infinitive$ OR $verb\_present$. For short we say $t = event$.

We are also interested in those nouns that are the direct (accusative) objects of the verb. A noun $n \in W$ is a word for which the tuple $(n, t)$ satisfies $t = noun$. We denote the set of nouns with $N$. We then define the object in the following manner.

Definition 4 (Object) An object $o \in N$ of a verb $v$ is the accusative object of $v$. We denote the relation between $o$ and $v$ as $dobj(v, o)$, and any direct object-verb pair in a sentence $s_n$ as a tuple $(v, o)_n$.

We define the state of an object as the adjectival modifier or the nominal subject of the object.

Definition 5 (State) A state $c \in W$ of an object $o$ is a word that has one of the following relations with the object: $amod(c, o)$, denoting the adjectival modifier, or $nsubj(c, o)$, denoting the nominal subject. We denote such a tuple as $(c, o)_n$, where $n$ is the sentence number.

As in textual instructions the object is often omitted (e.g. "Simmer (the sauce) until thickened."), we also investigate the relation between an event and past tense verbs or adjectives that do not belong to an adjectival modifier or to a nominal subject, but that might still describe this relation.
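As an illustration of Definitions 1-5, the following minimal Python sketch (not part of the original approach; the simplified data structures and tag set are assumptions) extracts events, event-object pairs, and state-object pairs from a POS-tagged and dependency-parsed sentence:

```python
# A minimal sketch of Definitions 1-5, assuming the text has already been
# POS-tagged and dependency-parsed; the data structures and helper names
# here are illustrative, not the paper's code.
from typing import List, Tuple

Sentence = List[Tuple[str, str]]    # (word, POS tag) pairs
Dependency = Tuple[str, str, str]   # (relation, governor, dependent)

EVENT_TAGS = {"VB", "VBP"}          # infinitive / present-tense verbs

def events(sentence: Sentence) -> List[str]:
    """Events: verbs in infinitive or present tense (Definition 3)."""
    return [w for w, t in sentence if t in EVENT_TAGS]

def objects(deps: List[Dependency]) -> List[Tuple[str, str]]:
    """Event-object pairs: direct (accusative) objects of verbs (Definition 4)."""
    return [(gov, dep) for rel, gov, dep in deps if rel == "dobj"]

def states(deps: List[Dependency]) -> List[Tuple[str, str]]:
    """State-object pairs: adjectival modifiers or nominal subjects (Definition 5)."""
    return [(dep, gov) for rel, gov, dep in deps if rel in ("amod", "nsubj")]

# Example: "Simmer the sauce until thickened."
sent = [("Simmer", "VB"), ("the", "DT"), ("sauce", "NN"),
        ("until", "IN"), ("thickened", "VBN")]
deps = [("dobj", "Simmer", "sauce")]
print(events(sent), objects(deps), states(deps))
```

In practice the POS tags and dependency tuples would come from a parser such as the Stanford Parser used in the experiments below.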
3.1 Generating time series
Given the definitions above, we can now describe each unique word in a text as a time series. Each element in the series is a tuple consisting of the number of the sentence in the text and the number of occurrences of the word in the sentence.

Definition 6 (Time series) A time series of a word $w$ is a sequence of tuples $(D, F)_w = \{(1, f_1)_w, (2, f_2)_w, \ldots, (n, f_n)_w\}$ where $D = \{1, \ldots, n\}$ is the timestamp and $F$ is the number of occurrences of the word at the given timestamp. Here $n$ corresponds to the sentence number in the text.
Algorithm 1 Generate time series for a given object and the events applied on it.

Require: (V, O)   ▷ all event-object pairs in I
Require: m ∈ O    ▷ a unique object
 1: for S_n in I do                                ▷ for each sentence in the text
 2:     V_n ← [w | t == event, (w, t) ∈ S_n]       ▷ extract the events
 3: end for
 4: U ← unique(V)                                  ▷ all unique events in I
 5: N ← [unique(o) | (v, o) ∈ (V, O)]              ▷ collect the unique objects in I
 6: for u in U do                                  ▷ for each unique event in I
 7:     i ← 1
 8:     while i ≤ length(I) do
 9:         for (v, o) in (V, O)_i do              ▷ for each event-object pair in S_i
10:             (D, F)_{u,i} ← (i, count(v == u, o == m))   ▷ number of occurrences of (u, m) in sentence S_i
11:             i ← i + 1
12:         end for
13:     end while
14: end for
15: return (D, F)_m   ▷ the time series of all events w.r.t. the object m
Generally, we can generate a time series for each kind of word in the corpus, as well as for each tuple of words. Here we concentrate on those describing or causing a change in a state. That means we generate time series for all events and for all states that change an object. To generate time series for the events we distinguish two cases. The first is events that are applied to objects (e.g. "simmer the sauce"). In that case, for each unique object $o$ in the corpus we generate a time series that describes how often this object had a direct object relation with a verb $v$, namely we are looking for the number of occurrences of $(v, o)_n$ in each sentence $s_n$ (see Algorithm 1).
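A minimal Python sketch of this counting step (an illustration of Algorithm 1, not the original implementation) could look as follows:

```python
# A sketch of Algorithm 1 (illustrative): for a fixed object m, count per
# sentence how often each unique event occurs with m as its direct object.
from collections import Counter
from typing import Dict, List, Tuple

def object_event_series(event_object_pairs: List[List[Tuple[str, str]]],
                        m: str) -> Dict[str, List[Tuple[int, int]]]:
    """event_object_pairs[i] holds the (event, object) pairs of sentence i+1."""
    unique_events = {v for sent in event_object_pairs for v, _ in sent}
    series: Dict[str, List[Tuple[int, int]]] = {u: [] for u in unique_events}
    for i, sent in enumerate(event_object_pairs, start=1):
        counts = Counter(sent)
        for u in unique_events:
            series[u].append((i, counts[(u, m)]))   # (timestamp, frequency)
    return series

# Example: two sentences, "Add the pork ..." and "Fry the pork ..."
pairs = [[("add", "pork")], [("fry", "pork")]]
print(object_event_series(pairs, "pork"))
# e.g. {'add': [(1, 1), (2, 0)], 'fry': [(1, 0), (2, 1)]}
```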
Apart from the events that are applied to an object, there are events that do not have a direct object relation, or where the relation is not explicitly described (e.g. "Mix (the pork) well for one minute."). In that case, we also search for causal relations in events without considering their direct objects (see Algorithm 2).
Algorithm 2 Generate time series representing the events in a textual corpus.

Require: U     ▷ all unique events in I
Require: V_n   ▷ all unique events in each sentence S_n
 1: for u in U do                          ▷ for each unique event in I
 2:     i ← 1
 3:     while i ≤ length(I) do
 4:         for v in V_i do                ▷ for each event in S_i
 5:             (D, F)_{u,i} ← (i, count(v == u))   ▷ number of occurrences of u in S_i
 6:             i ← i + 1
 7:         end for
 8:     end while
 9: end for
10: return (D, F)   ▷ the time series for all events
To investigate the causal relation between a state of the object and an event, we also generate time series describing the state. This is done by following the procedure described in Algorithm 1, where the $(V, O)$ pairs are replaced with $(C, O)$ pairs and where we no longer extract events but rather states $c$. In order to include all states where the object is omitted, we also generate time series for each adjective or verb in past tense that could potentially describe a state. To do that we follow the procedure in Algorithm 2, where instead of events we search for adjectives or past tense verbs.
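The same counting scheme carries over to Algorithm 2 and to the state series; a short sketch, again only illustrative:

```python
# The counting scheme of Algorithm 2 and of the state series: one time series
# per unique word over per-sentence word lists (a sketch, not the paper's code).
from collections import Counter
from typing import Dict, List, Tuple

def word_series(words_per_sentence: List[List[str]]) -> Dict[str, List[Tuple[int, int]]]:
    """One time series per unique word: (sentence number, occurrences)."""
    vocabulary = {w for sent in words_per_sentence for w in sent}
    return {w: [(i, Counter(sent)[w]) for i, sent in enumerate(words_per_sentence, 1)]
            for w in vocabulary}

# Events without objects (Algorithm 2): pass the events of each sentence;
# for the state series: pass the adjectives / past-tense verbs instead.
print(word_series([["add", "fry"], ["mix"]]))
```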
3.2 Searching for causality
In order to discover causal relations based on the generated time series, we make use of the Granger causality test. It is a statistical test for determining whether one time series is useful for forecasting another. More precisely, Granger testing performs a statistical significance test for one time series "causing" the other time series with different time lags using auto-regression (Granger, 1969). The causality relationship is based on two principles. The first is that the cause happens prior to the effect, while the second states that the cause carries unique information about the future values of its effect (Granger, 2001). Based on these assumptions, given two time series $x_t$ and $y_t$, we can test whether $x_t$ Granger-causes $y_t$ with a maximum time lag of $p$. To do that, we estimate the regression $y_t = a_0 + a_1 y_{t-1} + \ldots + a_p y_{t-p} + b_1 x_{t-1} + \ldots + b_p x_{t-p}$. An F-test is then used to determine whether the lagged $x$ terms are significant.
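For illustration, the test can be run with an off-the-shelf implementation; the following sketch uses statsmodels' grangercausalitytests, which is an assumption, as the paper does not state which implementation was used:

```python
# A minimal sketch of the Granger test on two word-frequency series, using
# statsmodels' grangercausalitytests (library choice is an assumption).
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

x = np.array([1, 0, 2, 0, 1, 0, 0, 1, 0, 1, 0, 0], dtype=float)  # candidate cause
y = np.array([0, 1, 0, 1, 0, 2, 0, 0, 1, 0, 1, 0], dtype=float)  # candidate effect

# The test checks whether the series in the SECOND column Granger-causes the
# series in the FIRST column, for every lag from 1 up to maxlag.
results = grangercausalitytests(np.column_stack([y, x]), maxlag=2)
p_value = results[1][0]["ssr_ftest"][1]   # p-value of the F-test at lag 1
print(p_value)
```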
Algorithm 3 Identify causal relations between two words.

Require: (D, F)   ▷ all time series describing the words of interest in a corpus
Require: L        ▷ the lag in the Granger causality test
Require: Th       ▷ significance threshold
Require: u ∈ W    ▷ the word whose causal relation w.r.t. the remaining words is tested
1: for w in W, w ≠ u do                                      ▷ for each unique time series
2:     C_{u,w} ← granger.Causality((D, F)_u, (D, F)_w, L)    ▷ causality between u and w
3:     if p.value(C_{u,w}) ≤ Th then                         ▷ the relation is significant
4:         R_{u,w} ← C_{u,w}                                 ▷ u causes w
5:     end if
6: end for
7: return R_u   ▷ the list of words with which u is causally related
We use the Granger causality test to search for causal relations between the generated time series (see Algorithm 3). Generally, for each two time series of interest, we perform a Granger test, and if the p-value of the result is under the significance threshold, we conclude that the first time series causes the second, hence the first word causes the second. The Granger causality test can be applied only to stationary time series. Otherwise, they have to be converted into stationary time series before applying the test (e.g. by taking the difference of every two elements in the series).
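A sketch of this pairwise search (Algorithm 3) under the same assumptions might look as follows; degenerate (e.g. constant) series would have to be filtered out before testing:

```python
# A sketch of Algorithm 3 (illustrative, not the authors' code): test one word u
# against all other word series and keep the pairs whose Granger p-value falls
# under the significance threshold.
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

def causally_related(series: dict, u: str, lag: int = 1, threshold: float = 0.05):
    """series maps each word to its frequency-per-sentence vector (equal lengths)."""
    related = {}
    cause = np.asarray(series[u], dtype=float)
    for w, values in series.items():
        if w == u:
            continue
        effect = np.asarray(values, dtype=float)
        res = grangercausalitytests(np.column_stack([effect, cause]), maxlag=lag)
        p = res[lag][0]["ssr_ftest"][1]
        if p <= threshold:              # u Granger-causes w
            related[w] = p
    return related
```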
4 Experimental Setup
To test our approach, we selected 20 different instructions: 10 recipes from BBC Food Recipes (http://www.bbc.co.uk/food/recipes/), 3 washing machine instructions (http://www.miele.co.uk/Resources/OperatingInstructions/W%203923%20WPS.pdf), 3 coffee machine instructions (http://www.cn.jura.com/service_support/download_manual_jura_impressa_e10_e20_e25_english.pdf), 3 kitchen experiment instructions describing the experiments from the CMU Grand Challenge dataset (http://kitchen.cs.cmu.edu/), and one description of a cooking task experiment (source not shown due to blind reviewing). The shortest instruction is 5 lines (each line being a sentence with a full stop at the end), the longest is 111 lines, and the mean length is 31 lines. The average sentence length in an instruction text is 11.2 words, with the shortest text having an average of 5.7 words per sentence and the longest an average of 17.4 words per sentence. The average number of events per sentence is 1.6, with a minimum per-text average of 1 event per sentence and a maximum per-text average of 2.23 events per sentence.

Figure 1: Number of causal relations discovered by a human expert (circle), Granger causality (triangle), part of speech patterns (rhombus), and all discovered relations (solid square). The square without fill shows the causal relations that have been discovered by both Granger causality and grammatical patterns.
A human expert was asked to search for causal relations in the text, concentrating on relations between events or between states and events. This was later used as the ground truth against which the discovered relations were compared.
Later, each of the instructions was parsed by the Stanford NLP Parser (http://nlp.stanford.edu/software/lex-parser.shtml) in order to obtain the part of speech tags and the dependencies between the words. This was then used as the input for generating the time series. We considered both a full stop and a comma as a sentence separator, as in this type of instruction a comma divides sequentially executed events in one sentence. The time series were then tested for stationarity by using the Augmented Dickey-Fuller (ADF) t-statistic test. It showed that the series are already stationary.
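For illustration, the stationarity check and the differencing fallback mentioned in Section 3.2 could be sketched as follows (the significance level of 0.05 and the fixed maximum lag are assumptions):

```python
# A sketch of the stationarity check with the Augmented Dickey-Fuller test
# (statsmodels' adfuller), with differencing as a fallback for non-stationary
# series; the concrete threshold of 0.05 is an assumption.
import numpy as np
from statsmodels.tsa.stattools import adfuller

def make_stationary(series, alpha=0.05):
    values = np.asarray(series, dtype=float)
    p_value = adfuller(values, maxlag=2)[1]   # adfuller returns (statistic, p-value, ...)
    if p_value < alpha:                       # null hypothesis (unit root) rejected
        return values                         # already stationary
    return np.diff(values)                    # difference of every two elements

print(make_stationary([1, 0, 2, 0, 1, 0, 0, 1, 0, 1, 0, 2]))
```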
We search for causal relations between events without considering the object, between events given the object, and between events and states. For the case of events given the object we performed the Granger causality test with a lag from 1 to 5, as the shortest instruction text has 5 sentences. For identifying relations between events and states we used a lag of 1, as the event and the change of state are usually described in the same sentence or in the following sentences. For identifying relations between events without considering the object, we also took a lag of 1, because in texts with longer sentences the test tends to discover false positives when applied with a longer lag. Furthermore, to reduce the familywise error rate during the multiple comparisons, we decreased the significance threshold by applying the Bonferroni correction.
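The corrected threshold is simply the base significance level divided by the number of comparisons; a minimal sketch (the base level of 0.05 is an assumption):

```python
# Bonferroni correction of the per-test significance threshold: the base
# significance level (assumed 0.05 here) divided by the number of comparisons.
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    return alpha / n_tests

# e.g. testing one event against 24 other time series
print(bonferroni_threshold(0.05, 24))   # 0.00208...
```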
To compare the approach with that of using grammatical patterns, we implemented patterns with a causal link that contain words such as until, because, before, etc. We also added the conjunction and to the causal links, as it was often used in the recipes to describe a sequence of events. We also implemented a verb-noun-adjective pattern and a verb(present)-noun-verb(past) pattern to search for relations between events and states. Finally, we implemented a conditional pattern (e.g. the if-then construction). As an input for these patterns we used once again the text instructions with POS tags from the Stanford Parser.
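A simplified sketch of such a pattern matcher over POS-tagged tokens is shown below; it covers only the causal-link patterns and is illustrative rather than the implementation used in the experiments:

```python
# A sketch of the pattern baseline (illustrative only): detect causal links such
# as "until", "because", "before", the conjunction "and", and if-then words
# between an event and a following event or state in a POS-tagged sentence.
CAUSAL_LINKS = {"until", "because", "before", "and", "if", "then"}
EVENT_TAGS = {"VB", "VBP"}

def causal_pairs(tagged_sentence):
    """Return (event, event-or-state) pairs joined by a causal link word."""
    pairs = []
    for i, (word, tag) in enumerate(tagged_sentence):
        if word.lower() not in CAUSAL_LINKS:
            continue
        before_link = [w for w, t in tagged_sentence[:i] if t in EVENT_TAGS]
        after_link = [w for w, t in tagged_sentence[i + 1:] if t in EVENT_TAGS or t == "VBN"]
        if before_link and after_link:
            pairs.append((before_link[-1], after_link[0]))
    return pairs

# "Simmer until thickened."
print(causal_pairs([("Simmer", "VB"), ("until", "IN"), ("thickened", "VBN")]))
# [('Simmer', 'thickened')]
```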
5 Results
The human expert discovered an average of 25.25 causal relations per text document. Using the grammatical patterns, an average of 4.15 causal relations per text document were discovered. Using the time series approach, an average of 20.9 causal relations per document were discovered.

The number of causal relations discovered in each text document can be seen in Figure 1. It shows that the number of discovered relations is lower in texts with short sentences.
Furthermore, the recall for each textual instruction is shown in Figure 2. The recall increases as the sentence length decreases, while the false discovery rate (FDR) decreases.

On the other hand, the recall for the grammatical patterns is low for all instructions. However, in contrast to the time series approach, the grammatical patterns have a high precision.

Figure 2: Recall and precision of the discovered causal relations for each dataset. Square indicates Granger causality, circle grammatical patterns, triangle Granger causality when using only the event-object pairs.
The precision and recall of the time series when using only the event-object pairs (Algorithm 1) show that the precision for the event-object pairs is very high in comparison to the overall time series precision (Figure 2).
Finally, we tested whether there is a significant correlation between the performance of the approaches and the type of textual instruction. We applied a two-sided correlation test that uses Pearson's product-moment correlation coefficient. The results showed that for the approach using time series and the Granger causality test, the performance is inversely proportional to the sentence length and the number of events in the sentence. On the other hand, the performance of the approach using the grammatical patterns is proportional to the sentence length and the number of events.
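A sketch of such a correlation test with SciPy (the numbers are made up for illustration):

```python
# A sketch of the two-sided correlation test between per-text performance and
# sentence length, using scipy's pearsonr (Pearson product-moment correlation);
# the values below are illustrative, not the reported results.
from scipy.stats import pearsonr

recall = [0.60, 0.55, 0.45, 0.30, 0.20]             # per-text recall (illustrative)
avg_sentence_length = [5.7, 8.0, 11.2, 14.5, 17.4]  # words per sentence

r, p = pearsonr(recall, avg_sentence_length)
print(r, p)   # a negative r indicates the inverse relationship reported above
```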
6 Discussion
In this work we presented an approach that relies on time series to discover causal relations in textual descriptions such as manuals and recipes.

Among the advantages of the approach are the following. The approach allows the discovery of implicit causal relations in texts where explicit causal relations are not discoverable through grammatical patterns. It does not require a training phase (assuming the text has POS tags), or explicit modelling of grammatical patterns. This makes the approach more context independent. It discovers relations different from those discovered with grammatical patterns, and can detect causal relations between elements that are several sentences apart. This indicates that both approaches can be combined to provide better performance.
Apart from the advantages, there are several shortcomings to the approach. The approach is not suitable for texts with complex sentence structure and many events in one sentence, as this generates false positive relations. The reason is that when several words we want to test appear in the same sentence, they all receive the same time stamp. To solve this problem, one can introduce additional sentence separators.
Another characteristic of textual instructions is that they often omit the direct object. On the other hand, as the results showed, the usage of objects reduces the generation of false positives. To make use of this, we can introduce a preprocessing phase, where verbs that are in conjunction all receive the same direct object.
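A sketch of this preprocessing idea (hypothetical; not part of the evaluated approach):

```python
# A sketch of the suggested preprocessing step: verbs connected by a
# conjunction inherit the direct object of the verb they are conjoined with,
# e.g. "Add the pork and fry (the pork)". Illustrative only.
def propagate_objects(dobj, conj):
    """dobj: {verb: object}; conj: (verb, verb) pairs from the dependency parse."""
    completed = dict(dobj)
    for v1, v2 in conj:
        if v2 not in completed and v1 in completed:
            completed[v2] = completed[v1]
        elif v1 not in completed and v2 in completed:
            completed[v1] = completed[v2]
    return completed

print(propagate_objects({"add": "pork"}, [("add", "fry")]))
# {'add': 'pork', 'fry': 'pork'}
```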
Another problem is the lag size in the Granger causality test. The test is very sensitive to the lag size when it is applied to events that do not have direct objects. On the other hand, the approach is less sensitive to the lag when the sentence length is reduced, and it is robust when the direct object is used.
Another problem associated with the Granger causality test is whether it discovers causality or simply correlation. As the approach does not rely on contextual information, apart from the causes it also discovers any number of correlations in the time series. In that sense, Granger causality is probably not the best tool for searching for causal relations in textual instructions, but it produces results in situations where the grammatical patterns are not able to yield any results.
In conclusion, the usage of time series in textual instructions allows the discovery of implicit causal relations that are usually not discoverable when using grammatical patterns. This can potentially improve the learned semantic structure of ontologies representing the knowledge embedded in textual instructions.
References
BBC (2015). BBC Food Recipes. Retrieved 22.04.2015 from http://www.bbc.co.uk/food/recipes/.

Cimiano, P. and Völker, J. (2005). Text2Onto. In Montoyo, A., Muñoz, R., and Métais, E., editors, Natural Language Processing and Information Systems, volume 3513 of Lecture Notes in Computer Science, pages 227-238. Springer Berlin Heidelberg.

Cole, S., Royal, M., Valtorta, M., Huhns, M., and Bowles, J. (2006). A lightweight tool for automatically extracting causal relationships from text. In SoutheastCon, 2006. Proceedings of the IEEE, pages 125-129.

Cooper, G. (1997). A simple constraint-based algorithm for efficiently mining observational databases for causal relationships. Data Mining and Knowledge Discovery, 1(2):203-224.

Girju, R. (2003). Automatic detection of causal relations for question answering. In Proceedings of the ACL 2003 Workshop on Multilingual Summarization and Question Answering - Volume 12, MultiSumQA '03, pages 76-83, Stroudsburg, PA, USA. Association for Computational Linguistics.

Granger, C. W. J. (1969). Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37(3):424-438.

Granger, C. W. J. (2001). Testing for causality: A personal viewpoint. In Ghysels, E., Swanson, N. R., and Watson, M. W., editors, Essays in Econometrics, pages 48-70. Harvard University Press, Cambridge, MA, USA.

Khoo, C. S., Kornfilt, J., Myaeng, S. H., and Oddy, R. N. (1998). Automatic extraction of cause-effect information from newspaper text without knowledge-based inferencing. Literary and Linguistic Computing, 13(4):177-186.

Kim, H. D., Castellanos, M., Hsu, M., Zhai, C., Rietz, T., and Diermeier, D. (2013). Mining causal topics in text data: Iterative topic modeling with time series feedback. In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, CIKM '13, pages 885-890, New York, NY, USA. ACM.

Kim, H. D., Zhai, C., Rietz, T. A., Diermeier, D., Hsu, M., Castellanos, M., and Ceja Limon, C. A. (2012). InCaToMi: Integrative causal topic miner between textual and non-textual time series data. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM '12, pages 2689-2691, New York, NY, USA. ACM.

Li, X., Mao, W., Zeng, D., and Wang, F.-Y. (2010). Automatic construction of domain theory for attack planning. In IEEE International Conference on Intelligence and Security Informatics (ISI), 2010, pages 65-70.

Mani, S. and Cooper, G. F. (2000). Causal discovery from medical textual data. In Proceedings of the AMIA Annual Fall Symposium 2000, Hanley and Belfus Publishers, pages 542-546.

Radinsky, K., Davidovich, S., and Markovitch, S. (2011). Learning causality from textual data. In Proceedings of the IJCAI Workshop on Learning by Reading and its Applications in Intelligent Question-Answering, pages 363-367, Barcelona, Spain.

Sil, A. and Yates, E. (2011). Extracting STRIPS representations of actions and events. In Recent Advances in Natural Language Processing, pages 1-8.

Silverstein, C., Brin, S., Motwani, R., and Ullman, J. (2000). Scalable techniques for mining causal structures. Data Mining and Knowledge Discovery, 4(2-3):163-192.

Wong, W., Liu, W., and Bennamoun, M. (2012). Ontology learning from text: A look back and into the future. ACM Computing Surveys, 44(4):20:1-20:36.

Zhang, Z., Webster, P., Uren, V., Varga, A., and Ciravegna, F. (2012). Automatically extracting procedural knowledge from instructional texts using natural language processing. In Calzolari, N., Choukri, K., Declerck, T., Doğan, M. U., Maegaard, B., Mariani, J., Moreno, A., Odijk, J., and Piperidis, S., editors, Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), Istanbul, Turkey. European Language Resources Association (ELRA).