How Do Empirical Methods Interact with Theoretical Pragmatics? The Conceptual and Procedural Contents of the English Simple Past and Its Translation into French



One major theoretical issue that has dominated the field of theoretical pragmatics for the last twenty years is the conceptual vs. procedural distinction and its application to verb tenses. In this chapter, we address this distinction from both theoretical and empirical perspectives following a multifaceted methodology: work on parallel corpora, contrastive analysis methodology and offline experimentation with natural language processing applications. We argue that the conceptual/procedural distinction should be investigated under the aegis of empirical pragmatics. In the case study, we provide evidence from offline experimentation for the procedural and conceptual contents of the English Simple Past and we use this information to improve the results of a machine translation system.
1 Introduction
In the last few years, linguists have become aware of the numerous advantages of the collaboration between theoretical and empirical pragmatics, which have joined forces in order to provide more and more insight into the use of language. In our view, empirical pragmatics investigates language use from both descriptive-theoretical and empirical perspectives. The empirical means considered in this study are corpora and experimental methods. These methods are complementary and allow a better view of the linguistic phenomena of interest in this study, specifically the nature of the information encoded by verb tenses.
C. Grisot (*) • J. Moeschler
Department of Linguistics, University of Geneva, 2 rue de Candolle, 1211 Genève 4, Switzerland
The authors of this chapter are thankful to the reviewers for their helpful comments, which improved the quality of this chapter.
J. Romero-Trillo (ed.), Yearbook of Corpus Linguistics and Pragmatics 2014: New Empirical and Theoretical Paradigms, Yearbook of Corpus Linguistics and Pragmatics 2, DOI 10.1007/978-3-319-06007-1_2, © Springer International Publishing Switzerland 2014
Theoretical pragmatics can be defined in a broad sense as the study of language in use, and in a narrow sense as the study of how linguistic properties and contextual factors interact in utterance interpretation (Noveck and Sperber 2004). Two types of properties are involved in verbal communication: linguistic properties, which are linked to the content of sentences (phonological, syntactic and semantic properties assigned by the grammar of each language), and non-linguistic properties, which are linked to sentences being uttered in a given situation, at a given moment, by a given speaker. One question pragmatics seeks to answer concerns the exact role of each type of property and their interaction. On the one hand, Grice (1975/1989) and neo-Gricean scholars (Gazdar 1979; Horn 1973, 1984, 1989, 1992, 2004, 2007; Levinson 1983, 2000) proposed an explanation based on conversational maxims and principles that guide conversation participants. On the other hand, relevance theorists (Sperber and Wilson 1986/1995; Blakemore 1987, 2002; Carston 2002; Moeschler 1989; Reboul 1992; Moeschler and Reboul 1994; Reboul and Moeschler 1995, 1996, 1998) speak of a single expectation of relevance that hearers have while participating in an act of communication. According to relevance theorists, this expectation of relevance is sufficient for recovering the speaker’s meaning.
Theoretical pragmatics (neo-Griceans, relevance theorists and other pragmaticians alike) is thus concerned with phenomena related to the interpretation of utterances, including both explicit meaning (in close relation to semantics) and implicit meaning. The main assumption is that propositional structures are systematically underdetermined and must be contextually enriched. Of great interest for the present study is the theoretical distinction between conceptual and procedural meaning, proposed by Blakemore (1987) within the framework of Relevance Theory (RT) (Sperber and Wilson 1986/1995). As Escandell-Vidal et al. (2011) argue, the conceptual/procedural distinction was first meant as a solution for the semantics/pragmatics division of labour, and it has remained an important explanation for the contribution of linguistic meaning to utterance interpretation. A speaker is not expected to make the addressee’s task of obtaining a relevant interpretation more difficult than necessary. Therefore, procedural meanings are instructions encoded by linguistic expressions that specify paths to follow during the interpretation process (manipulation of conceptual representations) in order to access the most relevant context. Wilson and Sperber (1993) attach cognitive foundations to the conceptual/procedural distinction and propose a distinguishing criterion: conceptual representations can be brought to consciousness while procedures cannot. We are particularly interested in this distinction because of its highly debated application to verb tenses (Smith 1990; Wilson and Sperber 1993; Moeschler et al. 1998; Moeschler 2000, 2002; de Saussure 2003, 2011; Amenós-Pons 2011; Moeschler et al. 1998, 2012; Grisot et al. 2012).
The two aims of this chapter are (1) to show that an investigation of the conceptual and procedural meanings of verb tenses should be carried out under the aegis of empirical pragmatics and (2) to argue for the benefits of combining two empirical methods, corpus analysis and linguistic experimentation. In our study, we used data from parallel corpora as the source of stimuli for offline experiments (linguistic judgement tasks). Parallel corpora revealed variation in the translation possibilities of a verb tense from a source language to a target language. Based on semantic and pragmatic theories, we formulated hypotheses about the source of this variation and about possible disambiguation criteria. Offline experiments allowed us to validate one of these criteria, as well as to propose new theoretical descriptions of the meaning and usages of verb tenses. We place this study under the umbrella of empirical pragmatics.
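The kind of translation variation that a parallel corpus can reveal may be illustrated with a minimal sketch. It assumes that verb occurrences have already been aligned and labelled with a tense in each language; the tense labels, the toy pairs and the function name `translation_distribution` are hypothetical and are not taken from the corpus used in this study.

```python
from collections import Counter

# Hypothetical aligned pairs: (English tense label, French tense label).
# Labels and counts are invented for illustration only.
aligned_pairs = [
    ("simple_past", "passe_compose"),
    ("simple_past", "imparfait"),
    ("simple_past", "passe_simple"),
    ("simple_past", "passe_compose"),
    ("present_perfect", "passe_compose"),
]


def translation_distribution(pairs, source_tense):
    """Return the relative frequency of each target tense for one source tense."""
    counts = Counter(target for source, target in pairs if source == source_tense)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {target: n / total for target, n in counts.items()}


distribution = translation_distribution(aligned_pairs, "simple_past")
# Print the target tenses from most to least frequent.
for tense, share in sorted(distribution.items(), key=lambda kv: -kv[1]):
    print(f"{tense}: {share:.2f}")
```

A distribution of this kind only describes the variation; explaining it requires the hypotheses and disambiguation criteria discussed in the text.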
Empirical pragmatics draws on theoretical pragmatics and corpus linguistics while also adopting experimental methods. It aims at obtaining consistent data for supporting or challenging current pragmatic theories, as well as at proposing new models for the interpretation of linguistic phenomena. Of course, theoretical pragmatics makes use of data consisting of constructed examples that mainly represent the researchers’ own intuitions. This type of data is open to criticism mainly for its subjectivity and lack of replicability. For this reason, robust (objective, quantifiable, replicable) data must be adopted, such as data from corpora (as argued, for example, by Barlow and Kemmer (2000) and Boas (2003)) and from experiments (Tomasello 2000). Of the two types of experiments used in psycholinguistics, offline and online, offline experimentation is the easier one for empirical pragmatics to adopt because it requires little equipment (no laboratory with electroencephalography (EEG) equipment or eye-trackers is needed).
There is one branch of pragmatics that has integrated experimental methodologies for testing pragmatic theories: experimental pragmatics. While theoretical pragmatics is rooted in philosophy of language and in linguistics, experimental pragmatics, drawing on pragmatics, psycholinguistics and the psychology of reasoning, has taken over and reinterpreted the sophisticated experimental methods of psycholinguistics (Meibauer and Steinbach 2011). For instance, Katsos and Cummins (2010) emphasize the relation between pragmatic theory and psycholinguistic experimental design: linguists benefit from experimental data that confirm the psychological validity of their observations and provide critical evidence for cases that go beyond the reach of intuitive reflection, and psychologists benefit from a wide range of phenomena to study and from the multiple theories provided by semantics and pragmatics. Recent work in experimental pragmatics (such as the papers in the volume edited by Noveck and Sperber in 2004) has focused on phenomena such as indirect speech acts, metaphors, implicature, presupposition and, more generally, speaker meaning.
Finally, we would like to argue that empirical pragmatics has built a bridge to the Natural Language Processing (NLP) domain thanks to the robust type of data used. The NLP domain needs models of language interpretation inspired by theoretical pragmatics that can be adapted to machines. NLP also requires large amounts of data that allow quantitative analyses and statistical models, as well as data for training parsers and classifiers. Empirical pragmatics is able to provide NLP with both linguistic models and empirical data.
Note: EEG is a procedure that measures the electrical activity of the brain over time using electrodes placed on the scalp; it reflects thousands of simultaneously ongoing brain processes. Eye tracking is the process of measuring either the point of gaze or the motion of an eye relative to the head; it is used to investigate human thought processes.
This chapter is structured as follows: in Sect. 2, we introduce the role and types of data used in linguistics, first from a general point of view and then in semantics and pragmatics, as well as their advantages and limits; in Sect. 3, we describe our case study by pointing out theoretical matters about verb tenses, our hypotheses, our empirical study on parallel corpora and our offline experiments. We conclude the chapter in Sect. 4 by addressing the impact of the results of our experiments on theoretical matters about verb tenses and the importance of using multiple sources of data in empirical pragmatics studies.
2 Type and Role of Data in Empirical Pragmatics
Nowadays, one can observe linguists’ increasing aspiration to use robust and objective findings in addition to intuitive and subjective acceptability judgements or constructed examples. McEnery and Wilson (2001) highlight that, broadly speaking, linguists have tended to favour the use of either introspective data (that is, language data constructed by linguists) or naturally occurring data (that is, examples of actual language usage). Most linguists now see these two types of data as complementary rather than mutually exclusive. Gibbs and Matlock (1999) and Gries (2002) argue that, although intuition may be poor as a methodology for investigating mental representations, linguists’ intuitions are useful in the formulation of testable hypotheses about linguistic structure and behaviour.
Kepser and Reis (2005) point out that introspective and corpus data were the two main sources of data for theoretical linguistics until the mid-1990s. Since then, other sources have been considered, such as experimentation (investigating offline and online processes), language acquisition, language pathologies, neurolinguistics, etc. They argue that linguistic evidence coming from different domains of data sheds more light on the issues investigated than evidence from a single source. Multi-source evidence can either validate the theory or bring contradictory results, thereby opening new perspectives.
As far as naturally occurring data are concerned, Table 1 provides an overview of kinds of linguistic data (Gilquin and Gries 2009). They are presented in descending order of naturalness of production and collection (only corpora with written texts are produced for aims other than the specific purpose of linguistic research, and are thus the most natural kind).
In this chapter, we are interested in the first and the last type of data, namely corpora with written texts and data coming from experimentation where subjects are required to do something with language they do not usually do (on units they usually interact with, involving typical linguistic output). We argue that both types of data are complementary and necessary in pragmatic research, and may be used within various frameworks of linguistic description and analysis.
Before presenting the advantages and difficulties, as well as the complementarity, of the two empirical methods used in this study, we will briefly define and describe corpora and offline experiments.
2.1 Corpora
The well-known description of a corpus as “a body of naturally occurring language” (McEnery et al. 2006: 4) is largely accepted in the corpus linguistics community, as well as in other domains that work with corpora, such as empirical pragmatics or translation studies (Baker 1993, 1995). The same is true of the requirement that corpora be machine-readable, a feature that allows them to be compiled and analysed semi-automatically or automatically. As far as size is concerned, corpora are becoming larger and larger because they can be tagged, compiled and analysed automatically. The most important aspect to take into account when doing corpus work is an appropriate match between the research goal and the corpus type and size (Gries 2013).
Another feature of corpora is the number of languages and type of texts they contain, for example, monolingual or multilingual. Multilingual corpora can be of two main types: (a) parallel (or translation) corpora, containing source texts and their translation in one or several target languages, which can be unidirectional (from language A to language B) or bi/multidirectional, and (b) comparable corpora, containing non-translated or translated texts of the same genre. Each type can be used for specific research goals.
Table 1 Kinds of linguistic data (sorted according to naturalness of production/collection) (Gilquin and Gries 2009: 5)
Data source
1. Corpora with written texts (e.g. newspapers, weblogs)
2. Example collections
3. Corpora of recorded spoken language in societies/communities where note-taking/recording is not particularly spectacular/invasive
4. Corpora with recorded spoken language from fieldwork in societies/communities where note-taking/recording is spectacular/invasive
5. Data from interviews (e.g. sociolinguistic interviews)
6. Experimentation requiring subjects to do something with language they usually do anyway (e.g. sentence production as in answering questions in studies on priming or picture description in studies on information structure)
7. Elicited data from fieldwork (e.g. response to “how do you say X in your language?”)
8. Experimentation requiring subjects to do something with language they usually do,
* on units they usually interact with (e.g. sentence sorting, measurements of reaction times in lexical decision tasks, word associations)
9. Experimentation requiring subjects to do something with language they usually do not do,
* on units they usually interact with, involving typical linguistic output (e.g. measurements of event-related potentials evoked by viewing pictures, eye-movement during reading idioms, acceptability/grammaticality judgements)
* on units they usually do not interact with, involving production of linguistic output (e.g. phoneme monitoring, ultrasound tongue-position videos)
A first advantage of working on corpora is that they represent an empirical basis for researchers’ intuitions. Intuitions are the starting point of any study but can be misleading, and sometimes a few striking differences can lead to risky generalizations. Moreover, the results of analyses of quantifiable data allow not only generalizations (through statistical significance tests) but also predictions through statistical analyses, such as correlations or multiple regression models, which are often used for investigating such a complex phenomenon as language.
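As a minimal sketch of the simplest of these analyses, the Pearson correlation coefficient between one predictor and one dependent variable can be computed as follows. The function name `pearson_r` and the toy values are assumptions made for illustration; they are not data from this study, and a real analysis would normally rely on a statistics package.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical values of one predictor and one dependent variable.
predictor = [1, 2, 3, 4, 5]
outcome = [2, 4, 5, 4, 5]
print(round(pearson_r(predictor, outcome), 3))
```

The coefficient only measures linear association between the two variables; it does not by itself establish that one causes the other.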
Furthermore, multilingual corpora have quite naturally been used in contrastive studies. Contrastive Linguistics, also called Contrastive Analysis (CA), is “the systematic comparison of two or more languages, with the aim of describing their similarities and differences” (Johansson 2003: 31), and it is often carried out by focusing on one linguistic phenomenon. The methodology used in a contrastive study mainly consists of a first phase of monolingual description of the data (the phenomenon to be analysed), followed by the juxtaposition of two or more monolingual descriptions and the analysis of the elements according to a tertium comparationis (James 1980; Krzeszowski 1990). In our case study, we argue that the necessary tertium comparationis for verb tenses should be defined in terms of cross-linguistically valid features, such as conceptual and procedural information.
Corpus-based contrastive comparison of languages itself has numerous advantages, such as (a) new insights into the languages being compared (which would have remained unnoticed in studies of monolingual corpora), (b) the highlighting of language-specific features and (c) the possibility of establishing semantic and pragmatic equivalences for the linguistic phenomenon under consideration between the source language (SL) and the target language (TL). In some cases, corpus-based studies with a contrastive perspective have applied purposes, as in our case study, which aims at modelling verb tenses in order to improve the quality of the texts translated by machine translation systems.
Another advantage is that data from corpora can be annotated (enriched) with semantic and pragmatic information, which allows more complex analyses. Annotation is the practice of adding interpretative linguistic information to a corpus, as underlined by Leech (2005). Annotation is thus an enrichment of the original raw corpus. From this perspective, adding annotations to a corpus adds value and thus increases its utility (McEnery and Wilson 2003; Leech 2004). Firstly, annotated corpora are useful both for the researcher(s) who made the annotation and for other researchers, who can use them for their own purposes, modify them or enlarge them. Secondly, annotated corpora allow both manual and automatic analysis and processing of the corpus and ensure its multifunctional use, the annotations themselves often revealing a whole range of uses which would not have been practicable unless the corpus had been annotated. Thirdly, annotated corpora provide an explicit record of analysis that is open to future scrutiny, which makes decisions more objective and reproducible. Because they can be analysed automatically, annotated corpora are often used for training NLP tools, such as classifiers (see Sect. 3.4).
Note: Correlation is a monofactorial statistical method, which investigates the relation between one independent variable (the predictor) and one dependent variable (the phenomenon of interest). Correlation does not necessarily imply causality between the two variables (they may only be associated) and can be used only when the relationship is linear (cf. Gries 2009; Baayen 2008).
Note: Multiple regressions are multifactorial statistical methods, which investigate the relation between several independent variables (predictors) and one dependent variable, as well as their interactions. The relation between the independent variables and the dependent variable can be linear or non-linear (cf. Gries 2009; Baayen 2008).
Corpus work is thus of interest when the researcher is concerned with a descriptive approach to the linguistic phenomenon considered, as well as with the study of language in use, given that cotextual and contextual information is most of the time also available in the corpus. Corpora permit monolingual and cross-linguistic investigations. Furthermore, corpus work allows the researcher to uncover, on the one hand, what is probable and typical and, on the other hand, what is unusual about the phenomenon considered.
Corpus work also presents some difficulties, such as the scarcity of multilingual corpora for less widespread languages or the predilection for ‘form-based research’, where the interest lies in a specific grammatical form (Granger 2003). These difficulties oblige researchers to carry out their research manually, including building their corpus themselves and annotating it, if they are interested in phenomena other than a specific grammatical form, such as semantic or syntactic categories. Another difficulty arises when the researcher is interested in infrequent phenomena, which will have insufficient occurrences in the corpus. Difficulties are also encountered when investigating phenomena that are not lexically expressed, such as the world knowledge used in inferences or the cognitive basis of language.
This is one reason why corpus data are increasingly combined with other types of evidence, such as experimentation. In what follows, we briefly describe the use of experimentation in pragmatics and highlight the complementarity between corpus work and experimentation.
2.2 Experimentation
In pragmatics, experimentation has been extremely useful for studying issues at the semantics/pragmatics interface and for testing theories concerning the psychologically real competence that native speakers have regarding semantics and pragmatics (Katsos and Breheny 2008).
Note: On infrequent phenomena in corpora, see for example Grivaz (2012), who studied causality in certain pairs of verbs in a very large corpus and with human annotation experiments, and found that less frequent pairs had a good causal correlation while very frequent pairs had a small causal correlation.
One important distinction at the semantics/pragmatics interface was proposed by Grice (1975/1989) between what is ‘said’ and what is ‘implicated’ within the entire meaning of an utterance. The first experimental study of how ordinary speakers identify and label what is ‘said’ vs. what is ‘implicated’ was Gibbs and Moise (1997). Gibbs and Moise designed their experiments to determine whether people distinguish what speakers say from what they implicate and whether they view what is ‘said’ as being pragmatically enriched. They used five categories of sentences, and participants had to choose between a minimal and an enriched interpretation. Example (1) illustrates the temporal relation type of sentence, and (2) and (3) give its two possible interpretations (the minimal or literal meaning and the pragmatically enriched meaning):
(1) ‘The old king died of a heart attack and a republic was declared’.
(2) Minimal: order of events unspecified
(3) Enriched: the old king died and then a republic was declared
The experiments were designed to manipulate the type of sentence, the instructions and the context of the target sentence. In the first experiment, the instructions consisted in explaining the two categories of interpretation of the sentence, and no context was given. In the second experiment, the instructions were more detailed, including information about linguistic theories addressing the distinction between what is ‘said’ and what is ‘implicated’. In the last two experiments, linguistic contexts (a short story) were provided in order to favour the enriched interpretation (in the third experiment), as in example (4), and the minimal interpretation (in the fourth experiment), as in example (5), for temporal relation sentences.
(4) The professor was lecturing on the life of Jose Sebastian. He was a famous rebel in Spain who fought to overthrow the King. Many citizens wanted Sebastian to serve as their President. “Did Jose Sebastian ever become President?” one student asked. The professor replied, The old king died of a heart attack and a republic was declared.
(5) Mike liked to take long bike rides each day. He also liked to sing as he rode because he has a terrific voice. Mike’s roommate thought this was funny. He said to someone that Mike likes to ride his bike and sing at the top of his lungs.
Gibbs and Moise’s four experiments showed that speakers assume that pragmatic enrichment plays a significant role in what is said: the enriched interpretation was preferred in the first three experiments but not in the last one, where the context strongly biased towards the minimal interpretation. Manipulation of instructions and training did not have any effect on the participants’ judgements.
Note: The five categories of sentences used by Gibbs and Moise were cardinal (Jane has three children), possession (Robert broke a finger last night), scalar (Everyone went to Paris), time-distance (It will take us some time to get there) and temporal relations.
We can make three observations concerning the temporal relation sentences: (a) temporal sequencing is an inference drawn contextually, (b) it is independent of the specific instructions that speakers received and (c) it can be blocked in a context biasing towards the minimal interpretation, that is, the unspecified order. On the basis of their results, Gibbs and Moise argue that there might be two types of pragmatic processes, one that provides an interpretation for what speakers say and another that provides an interpretation for what speakers implicate. They argue that this position can be explained by the principle of optimal relevance (Sperber and Wilson 1986/1995) and they acknowledge the difficulty of testing it experimentally. In our case study, we will consider temporal sequencing, under the label [narrativity], as an inferential type of information that can function as a disambiguation criterion for usages of the English Simple Past (SP).
We now turn to experimentation as a methodology used in empirical and experimental pragmatics and point out two advantages of adopting it: (a) it makes systematic control of confounding variables possible, and (b) depending on the nature of the experiment, it permits the study of online processes (Gilquin and Gries 2009: 9). One difficulty with experimentation is that the artificial setting experiments require can influence the behaviour of the participants. While experimental pragmatics has fully adopted the methodology of psycholinguistics, including the study of online processes (through EEG and eye-tracking tools), empirical pragmatics has focused mainly on offline experimentation, preserving the very essence of experimental studies: systematic manipulation of independent variables in order to determine their effect on dependent variables.
Concerning the complementarity of the two empirical sources of data, Gilquin and Gries argue that a corpus can serve four purposes in relation to experimentation: (a) validator: the corpus serves as a validator of the experiment, (b) validatee: the corpus is validated by the experiment, (c) equal: corpus and experimental data are used on an equal footing and (d) stimulus composition: the corpus serves as a database for the items used in experiments. They also note that corpus work can investigate a larger range of phenomena than experimentation. Experiments, however, allow the study of phenomena that are infrequent in corpora. Corpora and experiments thus have complementary advantages and disadvantages, and linguists nowadays tend to use both of these empirical methods.
Finally, we would add that data from experiments are human-annotated data and can be used in NLP for training automatic classifiers, thus providing machines with the different sorts of information (linguistic, contextual and world knowledge) that humans have and use in the language interpretation process.
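One way in which human judgements could be turned into training labels for a classifier is sketched below. Majority voting is an assumption chosen for illustration, not a procedure described in this chapter, and the item identifiers, label names and the function `majority_label` are hypothetical.

```python
from collections import Counter


def majority_label(judgements):
    """Return the most frequent label, or None when there is no clear majority."""
    ranked = Counter(judgements).most_common()
    if not ranked:
        return None
    # A tie between the two most frequent labels gives no usable label.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical judgements from several annotators for three items.
annotations = {
    "item_1": ["narrative", "narrative", "non-narrative"],
    "item_2": ["non-narrative", "non-narrative", "non-narrative"],
    "item_3": ["narrative", "non-narrative"],
}

# Keep only the items on which the annotators reached a majority.
training_labels = {}
for item, judgements in annotations.items():
    label = majority_label(judgements)
    if label is not None:
        training_labels[item] = label

print(training_labels)
```

Items without a clear majority are left out here; whether to discard, re-annotate or adjudicate such items is a design choice that depends on the study.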
Note: In his Model of Directional Inferences (2000, 2002), Moeschler makes the same prediction about temporal relations between eventualities as the one drawn above from Gibbs and Moise’s results: they have an inferential nature and are drawn on the basis of contextual assumptions. They can be blocked (minimal interpretation) under certain specific linguistic and contextual conditions.
In this research we consider data from experimentation (the 9th type of data in Gilquin and Gries’ classification), focusing on linguistic judgements made by participants. Linguistic judgements used to be elicited mainly in acceptability and grammaticality tasks, but nowadays they concern all types of linguistic information. By presenting our case study, we aim to point out the complementarity of corpus work and experimentation for testing theoretical hypotheses, building descriptive models and applying them to NLP.
In what follows, we provide a case study presenting our investigation of verb tenses and show how the methodology presented above has been used, as well as how the results of our study support our thesis about the advantages of combining corpus work and experimentation when doing empirical pragmatics research.
3 Case Study
The case study presented in this chapter is part of a research project that aims at improving the results of statistical machine translation (SMT) systems by modelling intersentential relations, such as those that depend on verb tenses and connectives. We investigate the ‘meaning’ of verb tenses, where meaning is seen as consisting of both what is said and what is implicated. We thus deal with the semantics and pragmatics of verb tenses. Within the frame of empirical pragmatics, we study verb tenses within RT from a contrastive perspective based on parallel corpora and offline experimentation. Moreover, data from experimentation (human annotation) were used for automatic annotation and, furthermore, for training an SMT system.
As Amenós-Pons (2011) correctly underlines, any approach to tenses must deal with the fact that they combine a certain stability of some basic features with a high adaptability at discourse level that depends on contextual information (semantic and pragmatic) and world knowledge. A great challenge for linguists was, and remains, to know which of the features of verb tenses are stable and which are not.
Probably one of the few generally accepted ideas about the meaning of verb tenses is the linguistic underdeterminacy thesis, as developed in RT and applied specifically to verb tenses by Neil Smith (1990). According to it, verb tenses are defined as a referential category: they can be characterized as locating temporal reference for eventualities with respect to three coordinates, speech moment S, event moment E and reference point R (Reichenbach 1947), through contextual enrichment following the expectation of optimal relevance (Wilson and Sperber 1998). The consequence of this theory is that verb tenses do not have several meanings but several usages corresponding to different contextual interpretations.
Note: The research project of which this case study is part is the COMTIS Project (Improving the Coherence of Machine Translation Output by Modelling Intersentential Relations, 2010–2013), a Sinergia interdisciplinary program funded by the Swiss National Science Foundation (no. CRSI22-127510).
In the literature, two main trends are opposed regarding the nature of the
content encoded by verb tenses. On the one hand, verb tenses are seen as having only
rigid procedural meanings that help the hearer reconstruct the intended representation
of eventualities (Nicolle 1998; Amenós-Pons 2011; de Saussure 2003, 2011). De
Saussure (2003) proposes algorithms to follow, consisting of the instructions
encoded by verb tenses, in order to grasp the intended meaning of a verb tense at
the discourse level.
On the other hand, verb tenses are seen as having both procedural and conceptual
contents, as argued in Moeschler (2002) and Grisot et al. (2012). In Grisot
et al. (2012) we argue that the conceptual content is given by a specific configuration
of the Reichenbachian coordinates event moment E, reference point R and speech
moment S. The procedural content consists of instructions and constraints for
contextual usages, namely [± narrative] and [± subjective]. Conceptual and procedural
information represent bare-bone semantics that are contextually worked out
through inferences (high-level explicatures consisting of pragmatically determined
aspects of what is said). The hearer has to ascertain the contextual value for both
types of encoded information in order to access the right contextual hypotheses to
get the intended cognitive effects.
As regards conceptual information, the assumption is that the specific configuration
of the temporal coordinates S, R and E behaves like a pro-concept (Wilson 2011;
Sperber and Wilson 1998: 15). Pro-concepts are semantically incomplete: they are
conveyed in a given utterance and have to be contextually worked out. Once the
enrichment process is completed, the propositional form of the utterance is also
available. This temporal information is not defeasible, i.e. it cannot be cancelled.
The temporal coordinates S, R and E combine with the predicate’s lexical aspect in
order to allow the calculation of the aspectual class (state, process, event). This
conceptual information is the skeleton of the meaning for each verb tense, which is
enriched with contextual information and world knowledge in the inferential
interpretation process.
Concerning the status of the temporal coordinates, de Saussure and Morency
(2012) argue that tenses encode instructions on how the eventuality is to be
represented by the hearer through the positions of temporal coordinates. They
consider thus that temporal location with the help of S, R and E is of a procedural
nature. We will show later on in this chapter that experimental studies revealed the
contrary: the configuration of temporal coordinates is of a conceptual nature, that is,
they are variables that are saturated contextually.
The procedural content of verb tenses, on the other hand, consists of two types of
instructions: (a) the [± narrative] instruction: to verify whether R is part of a series
of points of reference available in the context and thus, eventualities are temporally
sequenced, and (b) the [± subjective] instruction: to verify whether there is a
perspective or a point of view on the eventuality presented. The experimental
work that we conducted (see Sect. 3.3.3) showed that the [± narrative] feature
includes temporal sequencing (inferential temporal relation as in Gibbs and
Moise’s experiments described in Sect. 2.2) and causal relations holding between
eventualities (cf. Moeschler 2003, 2011 for the relation between causality and
temporal sequencing).
Another important point in the model described in Grisot et al. (2012) is that the
specific combination of conceptual content and procedural content characterises
contextual usages of verb tenses and not the meaning of a verb tense. On this point,
Grisot et al.’s analysis agrees with Amenós-Pons (2011), who assumes that tenses do
not encode temporal relations: such relations are only the result of the tense meaning
in specific environments.
In this chapter we adopt the view proposed by Grisot et al. (2012) and we bring
new arguments, as well as evidence from experimental work, that support the
procedural and conceptual nature of the information encoded by verb tenses
expressing past time in French (FR) and English (EN).
3.1 Our Hypotheses
An investigation of parallel corpora consisting of several stylistic genres revealed
the four most frequent translation divergences: (a) from EN into FR: the SP, the
Simple Present and the Present Perfect (PresPerf), and (b) from FR into EN: the
Passé Composé (PC) and the Présent. In a first research phase, we chose to investigate
the translation of the EN SP into FR, where its semantic and pragmatic domain is
rendered through the Passé Simple (PS), the PC and the Imparfait (IMP). In order to
grasp the meaning of the EN SP, we assume that the distinction between conceptual
and procedural types of information is very important.
Our assumptions are: (1) a verb tense encodes conceptual and procedural
information and (2) conceptual and procedural contents explain cross-linguistic
variation. As regards the first hypothesis, we argue, and bring evidence from
offline experiments, that procedural information encoded by the English SP is
inaccessible to consciousness and hard to describe in conceptual terms, while
conceptual information is accessible to conscious thinking and can be conceptualized.
We also argue that the conceptual content of verb tenses (namely,
a specific configuration of temporal coordinates S, E and R) behaves like
a pro-concept in that it is conveyed in a given utterance and has to be
contextually worked out (high-level explicature).
Concerning our second hypothesis, we assume that conceptual and procedural
contents of verb tenses explain their cross-linguistic variation revealed by an
investigation of our parallel corpora. A verb tense can have several usages, where
each usage is triggered by a language-specific combination of conceptual and
procedural contents. Parallel corpus analysis reveals that each usage of a verb
tense in a SL is rendered by a different verb tense in a TL. Specifically, the
translation divergence of the English SP into FR can be resolved if contextual
usages of the SP are considered.
In the following sections, we bring evidence for our model for the semantics
and pragmatics of the English SP from parallel corpora (Sect. 3.2) and offline
experiments (Sect. 3.3). Section 3.4 is dedicated to the NLP application of the
model defended in this case study.
3.2 Data from Parallel Corpora with a Contrastive Perspective
In Grisot and Cartoni (2012) we studied the discrepancies between theoretical
descriptions of verb tenses and their use in parallel corpora. We investigated
corpora consisting of texts in EN and their translations into FR that belong to
four different genres (literature 18 %, journalistic 18 %, legislation 33 % and
EuroParl 31 %). A total of 1275 predicative verb tenses have been considered,
which represents 77 % of the verb tenses occurring in the corpus. The qualitative
and quantitative analysis of the corpus was done in two steps. In the first, monolingual
step, we identified tenses that occur in the corpus and calculated their
frequency in the SL. In the second, bilingual step, we identified the tenses used as
translation possibilities in the TL of a certain tense from SL and calculated their
frequency. Analysis of frequency of tenses in SL provided information about tenses
that are possible candidates for being problematic for machine translation systems,
with the assumption that frequent tenses, if wrongly translated, decrease the quality
of the translated text. Bilingual analysis with focus on identifying verb tenses used
as translation possibilities in TL for ambiguous tenses in SL revealed that the SP is
translated into FR using mainly three tenses (PS, PC and IMP representing 80 %
of translation possibilities) as in examples (6), (7) and (8) and that the PresPerf is
translated using two tenses (PC and Présent, 100 % of translation possibilities) as in
examples (9) and (10). These are two of the translation divergences shown by
analysis of parallel corpora.
(6) EN/SP: General Musharraf appeared on the national scene on October 12, 1999, when he
ousted an elected government and announced an ambitious “nation-building” project.
(Journalistic Corpus: “News Commentaries”)
FR/PC: Le Général Moucharraf est apparu sur la scène nationale le 12 octobre 1999,
lorsqu’il a forcé le gouvernement élu à démissionner et annoncé son projet ambitieux de
“construction d’une nation”.
(7) EN/SP: With significant assistance from the United States – warmly accepted by both
countries – disarmament was orderly, open and fast. Nuclear warheads were returned
to Russia. (Journalistic Corpus: “The New York Times”)
FR/PS: Avec l’assistance non négligeable des Etats-Unis – chaleureusement acceptée par
les deux pays: le désarmement a été méthodique, ouvert et rapide. Les ogives nucléaires
furent renvoyées en Russie.
(8) EN/SP: He seemed about seventeen years of age, and was of quite extraordinary personal beauty,
though somewhat effeminate. (Literature Corpus: O. Wilde, “The picture of Mr. W.H”)
FR/IMP: Il paraissait avoir seize ans, et il était d’une beauté absolument extraordinaire,
quoique manifestement un peu efféminée.
spent over 20 years talking about people’s willingness to spend more money on food; it
is just that the distribution process has totally changed. (“EuroParl” Corpus)
470 FR/Pre
´sent: Je soutiendrais vraiment de tout coeur les propositions de Mme Roth-Behrendt;
471 cela fait vingt ans que nous parlons de la possibilite
´de consacrer plus d’ argent a
472 alimentation mais, quand il s’ agit du processus de distribution, c’est tout autre chose.
(10) EN/PresPerf: Whether or not the government was involved, the fact remains that Pakistan
has lost a desperately needed leader. (Journalistic Corpus: “News Commentaries”)
FR/PC: Que le gouvernement soit ou non impliqué, le fait est que le Pakistan a perdu un
leader dont il a cruellement besoin.
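The two-step corpus analysis described in this section (monolingual tense frequencies in the SL, then the distribution of TL tenses for each SL tense) can be sketched as follows. This is only an illustration: the aligned pairs are invented toy data, and the function names are hypothetical rather than the tooling used in Grisot and Cartoni (2012).

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (tense in the EN source, tense in the FR translation).
# These pairs are illustrative only and do not reproduce the corpus counts.
aligned_pairs = [
    ("SP", "PC"), ("SP", "PS"), ("SP", "IMP"), ("SP", "IMP"),
    ("SP", "PC"), ("PresPerf", "PC"), ("PresPerf", "Présent"),
    ("SP", "Présent"),
]

def source_frequencies(pairs):
    """Step 1 (monolingual): relative frequency of each tense in the SL."""
    counts = Counter(src for src, _ in pairs)
    total = sum(counts.values())
    return {tense: n / total for tense, n in counts.items()}

def translation_distribution(pairs):
    """Step 2 (bilingual): for each SL tense, the share of each TL tense."""
    by_source = defaultdict(Counter)
    for src, tgt in pairs:
        by_source[src][tgt] += 1
    return {
        src: {tgt: n / sum(tgts.values()) for tgt, n in tgts.items()}
        for src, tgts in by_source.items()
    }

freq = source_frequencies(aligned_pairs)
dist = translation_distribution(aligned_pairs)
```

A source tense whose distribution is spread over several target tenses, as the SP is here, is a candidate translation divergence in the sense used above.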
The ambiguity of the EN SP, as well as the PresPerf, is illustrated by their
translation into FR. In order to improve their translation by SMT systems, these
tenses must be disambiguated. Following the CA methodology, the SP and the
PresPerf, as well as the FR tenses used for their translation, must be compared in
three steps. The first step consists of the monolingual description, followed by
bilingual juxtaposition of the two monolingual descriptions and finally, their analysis
according to the tertium comparationis defined in terms of conceptual and
procedural contents.
As for the SP, also known as the preterit, it describes an action or state as
having occurred or having existed at a past moment or during a past period of time
that is definitely separated from the actual present moment of speaking or writing.
Comrie (1985: 41) emphasized that the SP “only locates the event in the past,
without saying anything about whether the situation continues up to the present or
into the future”. Radden and Dirven (2007: 219) argue that the SP is used to
express bounded past situations, presented as a series of events, typically in
narratives, as in (11). The individual events from example (11) are temporally
ordered (signalled by the coordination and the conjunction ‘and’) and are thus
interpreted as being successive.
(11) I grabbed his arm and I twisted it up behind his back and when I let go his arm there was
a knife on the table and he just picked it up and let me have it and I started bleeding like a
pig. (Labov and Waletzky 1967, quoted by Radden and Dirven 2007: 219)
The verb tenses used in FR for translating the SP are, as we have already noted, the
PC, PS and IMP. The PC is classically described from a monolingual point of view as
a “tense with two faces” (Martin 1971) because it can express both past and present
time. The PS is described as a tense that expresses a past event completely accomplished
in the past with no connection to present time (Grevisse 1980; Wagner and
Pinchon 1962) and used in contexts where events are temporally ordered (Kamp and
Rohrer 1983). Finally, the IMP is a tense that expresses background information
(Weinrich 1973). The focus on the accomplishment of the event in the past is the
feature that distinguishes the PS from the PC, the second one expressing a link to
present time, while perfectivity is a feature that distinguishes the PS from the IMP,
the former being perfective and the latter imperfective.
When these monolingual descriptions are juxtaposed, we can observe the
multitude of facets for describing these four tenses: in terms of temporal location
(time preceding, simultaneous or even following speech moment), grammatical
aspect (perfective or imperfective), discursive grounding (foreground or background
information) and relation to other eventualities (temporally ordered or
not). Another point that can be observed is the lack of a one-to-one correspondence
between the several meanings of the SP and the three FR tenses used for its
translation. In Grisot et al. (2012), we argue that the meaning of these verb tenses
should be investigated cross-linguistically in terms of their conceptual and
procedural information, and more specifically that the procedural information
[± narrativity] is a disambiguation criterion for the usages of the SP. In this study
we bring evidence for our claim that the [± narrativity] feature is procedural
(through experimental work presented in Sect. 3.3.3). We show that items of SP
annotated by two human annotators as having a narrative usage correspond in the
parallel corpora investigated to translation through either PS or PC and items
annotated as having a non-narrative usage correspond to translation through an
IMP (detailed results provided in Sect. 3.3.3).
The EN PresPerf is characterized by a grammatical combination of present tense
and perfect aspect and it is used to express a past eventuality that has present
relevance. The same grammatical combination exists in other languages such as the
FR PC, with the specificity that the PC can also express eventualities accomplished
in the past. In EN, there is a competition between the SP and the PresPerf for
referring to past time eventualities, with the particularity that the PresPerf is
incompatible with adverbials expressing definite past time. The first annotation experiment
considered the competition between SP and PresPerf forms for expressing past time
eventualities, showing that each verb tense has conceptual meaning and that it can easily
be dealt with by human annotators (Sect. 3.3.2).
A benefit of parallel corpora is the availability of context and cotext, information
that facilitates establishing semantic and pragmatic equivalence for each verb tense.
This information is crucial as regards the meaning of verb tenses.
From the corpus described above, we used a subset of 30 randomly selected excerpts
(which we call items, all of which contain occurrences of the SP or PresPerf) for the
first experiment and 458 items (containing occurrences of the SP) for the second
experiment. In what follows, we describe the annotation experiments and provide their results.
3.3 Data from Offline Experiments
Experimental work we have conducted brought evidence for the hypothesis that verb
tenses encode both conceptual and procedural information. Conceptual information
concerns different combinations of Reichenbachian temporal coordinates, which
are contextually saturated variables. Procedural information concerns instructions
relating the reference point R of an eventuality to reference points of other
eventualities from the cotext, in order to check their temporal order. In this section, we
will provide the general design of our experiments (participants, procedure and
evaluation), followed by the presentation of the two experiments and their results.
3.3.1 Design of Experiments and Participants
The two annotators were native speakers of EN with basic knowledge of FR. They
were asked to follow the instructions (given below for each type of information
annotated) and went through a training phase in order to check whether the
instructions given were clear. For the effective annotation task, annotators received
a file with the total number of excerpts that were taken from the EN part of the
parallel corpora. For each item, sentences including the verb tense considered, as
well as one sentence before or after, were provided in order to have sufficient
content for a pragmatic judgement.
One way of evaluating human annotation is to calculate the inter-annotator
agreement with the help of the kappa coefficient (Carletta 1996). One issue that
influences corpus annotation by raters is the subjectivity of the judgements, which
can be quite substantial for semantic and pragmatic annotations (Artstein and
Poesio 2008). It can be tested whether different raters produced consistently similar
results, so that one can infer that the annotators have understood the guidelines and
that there was no agreement just by chance. The kappa statistic factors out agreement
by chance and measures the effective agreement by two or more raters. The
kappa coefficient is 1 if there is a total agreement among the annotators, 0 if there is
no agreement other than the one expected to occur by chance, and lies between 0
and 1 when agreement is greater than chance. We used this measure for quantifying
the inter-annotator agreement in our experiments.
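To make the measure concrete, here is a minimal sketch of a two-annotator kappa. It assumes Cohen's formulation (chance agreement computed from each annotator's own label proportions); the chapter does not specify which variant was computed, and the labels below are hypothetical, not data from the experiments.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: proportion of items given the same label.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: sum over labels of the product of the two
    # annotators' proportions for that label.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[lab] * freq_b[lab] for lab in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical labels for six items (illustration only).
ann1 = ["narr", "narr", "non", "non", "narr", "non"]
ann2 = ["narr", "non", "non", "non", "narr", "narr"]
kappa = cohen_kappa(ann1, ann2)
```

In this toy case the annotators agree on four of six items, but half of that agreement is expected by chance, so kappa is much lower than the raw agreement rate.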
3.3.2 Annotation of Conceptual Information
Through this annotation experiment, we wanted to determine the conceptual meaning
of two verb tenses in EN, SP and PresPerf. Our expectation was that human
annotators should be able to think of the meaning of SP and PresPerf consciously,
conceptualize it and make specific decisions in each context with ease. Annotators
received annotation guidelines (presented below) and went through a training
phase before the actual annotation phase.
As there are no quantitative measures proposed in the literature to evaluate the
conceptual and procedural type of information encoded by linguistic expressions, at
least none that we are aware of, we propose to use the kappa coefficient and put forward
the following intuitive hypotheses based on the description proposed by Wilson and
Sperber (1993) and Wilson (2011: 11): conceptual meanings are generally seen as
accessible to consciousness, capable of being reflected on, evaluated and used in
general inference, and procedures are “relatively inaccessible to consciousness,
resistant to conceptualisation, thus we can not discover through introspection the
rules of our language, the principles governing inferential comprehension, or the
processes involved in mental-state attribution”.
Note: De Saussure (2011) proposes a qualitative criterion to evaluate procedural expressions: an
expression is procedural if it triggers inferences that cannot be predicted on the basis of an
identifiable conceptual core to which general pragmatic inferential principles are applied.
We assumed thus that manipulating conceptual information described as easily
graspable concepts is related to the notions of sensitivity and accessibility to
consciousness. Specifically, native speakers’ sensitivity is a cue to direct access to
the encoded conceptual content. We expected thus high values of the inter-annotator
agreement coefficient based on the relative ease of the task, namely to
identify striking information.
As far as procedural information is concerned, we expected low agreement,
related to a more difficult task: procedural information is notoriously hard to pin
down in conceptual terms (Wilson and Sperber 1993: 16) and not accessible to
consciousness. The processing of the narrative feature is predicted to be less
accessible because it is the result of a non-guaranteed pragmatic inference
(non-demonstrative inference for Sperber and Wilson 1986/1995: 65) based on conceptual
information, cotextual information and contextual hypotheses. As inferential
processes are costly and depend on several factors, they are predicted to produce
lower values of the inter-annotator agreement coefficient.
Based on our claim (Grisot et al. 2012) that the configuration of Reichenbachian
coordinates should be split into three pairs of two coordinates (E/R, R/S and the inferred
E/S) instead of the classical view of three coordinates as Reichenbach proposed,
we defined the conceptual content of the Simple Past, as in example (12), to be the
pair E < S, which bears the focus (from the configuration E = R, R < S and E < S), in other
words ‘situation that happened in the past’, and the conceptual meaning of the Present
Perfect, as in example (13), to be the pair R = S (from the configuration E < R, R = S
and E < S), in other words the ‘current resulting state of a past situation’.
(12) EN/SP: After almost a decade in self-imposed exile, Bhutto’s return to Pakistan in October
gave her a fresh political start. Pakistan had changed, as military dictatorship and
religious extremism in the north played havoc with the fabric of society. (Journalistic
Corpus: “News Commentaries”)
(13) EN/PresPerf: Some of the proposals concerning greater focus on equality have also been
accepted, but the Council did not want to accept some very central proposals from
Parliament. (“EuroParl” Corpus)
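The pairwise view of the coordinates described above can be sketched as a small data structure. This is only an illustration under our own assumptions: the class name `TenseConfig`, its fields and the `inferred_e_s` method are hypothetical, and only the relations "<" (precedes) and "=" (coincides) are modelled, since these are the ones needed for the two tenses discussed here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TenseConfig:
    """Pairwise relations between Reichenbachian coordinates.

    Each relation is "<" (precedes) or "=" (coincides).
    """
    e_r: str    # relation between event moment E and reference point R
    r_s: str    # relation between reference point R and speech moment S
    focus: str  # the pair taken to bear the focus for this tense

    def inferred_e_s(self) -> str:
        """Infer the E/S relation from E/R and R/S by transitivity."""
        return "<" if "<" in (self.e_r, self.r_s) else "="

# Simple Past: E = R, R < S, hence E < S (focus on E < S).
SIMPLE_PAST = TenseConfig(e_r="=", r_s="<", focus="E<S")
# Present Perfect: E < R, R = S, hence E < S (focus on R = S).
PRESENT_PERFECT = TenseConfig(e_r="<", r_s="=", focus="R=S")
```

Both tenses yield the same inferred relation E < S, which is why the focus pair, rather than temporal location alone, is what distinguishes them in this sketch.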
Note: Sperber and Wilson (1986/1995: 65) argue that the process of inferential communication is
non-demonstrative: even under the best circumstances, it might fail (the addressee cannot deduce
the communicator’s communicative intention).
Note: In the parallel corpus both the SP and the PresPerf from these two examples are translated by a
PC in French, highlighting thus another translation divergence: the French PC into EN. A hint of
the disambiguation criterion is a focus either on the E < S relation for the SP or on the R = S
relation for the PresPerf (as we argued in Grisot et al. 2012).
The annotation guidelines included: (a) a description of the two types of
meaning, (b) four examples for each usage, as given in the examples below, and
(c) the instruction to read each excerpt, identify the meaning of the verb highlighted
and decide on the type of meaning. In the first example, the most salient information
is the result state in the present: the fact that the false declaration is now filed. In the
second example, the most salient information is the situation that happened in the
past: the lack of choice of Musharraf.
(14) And instead of full cooperation and transparency, Iraq has filed a false declaration to the
United Nations that amounts to a 12,200-page lie. (Journalistic Corpus:
(15) In a historic ruling that Musharraf had little choice but to accept, the Supreme Court itself
reinstated the Chief Justice in July. Subsequently, the energized judiciary continued
ruling against government decisions, embarrassing the government – especially its
intelligence agencies. (Journalistic Corpus: “News Commentaries”)
Regarding the annotation guidelines, three aspects should be mentioned:
(a) the ‘meaning’ of the SP and PresPerf, respectively, was easily identified and
conceptualized in order to explain the task to annotators, (b) they were asked to
identify ‘the most salient information’ in order to identify the focus and
(c) annotators understood the annotation task easily, as well as the examples used
for training.
In this experiment, annotators made decisions following the annotation instructions
on 30 excerpts from the corpus. They agreed on all the items annotated
(kappa = 1) and pointed out the ease of the task. This result provides evidence
for the conceptual nature of the information considered in this experiment.
We hypothesized that the total agreement is due to the highly accessible conceptual
information, that is, the ability for the raters to consciously represent the
temporal coordinates as part of the conceptual meaning of tenses. We also hypothesized
that the understanding of the coordinates and the relations holding among
them are not inferred through a non-demonstrative inference, but are immediately
accessible to consciousness.
3.3.3 Annotation of Procedural Information
One of the features tested with the help of the annotation experiment is
[± narrativity]. As mentioned, this feature is a piece of procedural information encoded by
tenses that instructs the hearer/reader to verify whether the reference point is part of
a series of R that increases incrementally, in other words whether the eventualities
presented are temporally ordered. Wilson (2011) emphasized that procedures are
not part of the meaning of a linguistic expression but are merely activated or
triggered by the occurrence of that expression in an utterance. If the feature is
activated ([+ narrative]), then we can talk about a narrative usage of the verb tense
considered. Conversely, if the feature is not activated, then the verb tense
considered has a non-narrative usage.
Numerous studies have already addressed narrativity either in the traditional
rhetoric (since the nineteenth century, such as Alexander Bain 1866 and John
Genung 1900), in DRT (Kamp and Reyle 1993) and SDRT (Lascarides and
Asher 1993) or within a semantics and pragmatics perspective (Hinrichs 1986;
Partee 1984; Reboul and Moeschler 1998; Smith 2001, 2003, 2010). Mainly, in
these studies, narrativity is a discourse relation or a discourse mode associated with
temporal sequencing of eventualities. In this chapter, we adopt this view of
narrativity and postulate that it is a binary variable ([± narrativity]) that represents
procedural information conveyed by verb tenses and which can be used as a
disambiguation criterion for various usages of tenses expressing past time in EN
and FR.
The verb tense considered in this annotation experiment is the EN SP. As in the
first experiment, annotators received annotation guidelines (presented below) and
went through a training phase. Narrativity was defined and explained to annotators
as follows:
(16) In narrative contexts a story is being told (you might not have the whole story available
in the sentence) and eventualities are temporally ordered, while non-narrative contexts
are associated with descriptive passages, where no story is being told.
Annotation guidelines included: (a) a definition of narrativity, (b) the explanation
of each usage (narrative and non-narrative) with two examples for each usage, as
given in the examples below, and (c) the instruction to read each excerpt, identify the
verb highlighted and decide if in context, the highlighted verb is part of the
underlying theme (the verb tense would have a narrative usage) or not (the verb
tense would have a non-narrative usage).
In the first example below, there are two events, i.e. ‘the marriage that happened’
and ‘the wealth which was added’. The second event is presented in relation to the
first (first he got married and then he added to his wealth), which is why the SP
verbs ‘happened’ and ‘added’ are in narrative usage. In the second example, there are
three states (‘was a single man’, ‘lived’ and ‘had a companion’) that describe the owner
of the estate. States are not temporally ordered, which is why this example illustrates
the non-narrative usage of the SP.
(17) By his own marriage, likewise, which happened soon afterwards, he added to his wealth.
(Literature Corpus: J. Austen, “Sense and Sensibility”)
(18) The late owner of this estate was a single man, who lived to a very advanced age, and who
for many years of his life, had a constant companion and housekeeper in his sister.
(Literature Corpus: J. Austen, “Sense and Sensibility”)
The value of the kappa coefficient for this annotation experiment was 0.42. This
value is above chance, but not high enough to point to entirely reliable linguistic
decisions (values of around 0.6–0.7 are generally accepted as reliable). What this first
result shows about the procedural feature [± narrativity] encoded by the EN SP is the
difficulty hearers/readers have in the interpretation process to conceptualize the language
rules they have and make decisions about their functioning.
The two annotators agreed on 325 items (71 %) and disagreed on 133 items
(29 %). Error analysis showed that the main source of errors was the length of the
temporal interval between two eventualities, which was perceived differently by the
two annotators. This led to ambiguity between temporal sequence and simultaneity,
which correspond to narrative and non-narrative usage respectively, as in
example (19), where the eventualities “qualify” and “enable” were perceived as
being simultaneous by one annotator and successive by the other.
(19) Elinor, this eldest daughter, whose advice was so effectual, possessed a strength of
understanding, and coolness of judgment, which qualified her, though only nineteen,
to be the counsellor of her mother, and enabled her frequently to counteract, to the
advantage of them all, that eagerness of mind in Mrs. Dashwood which must generally
have led to imprudence. (Literature Corpus: J. Austen, “Sense and Sensibility”)
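As a rough consistency check on the agreement figures for this first round, and assuming the reported coefficient is a two-rater kappa of the form κ = (p_o − p_e)/(1 − p_e), the observed agreement and the chance agreement it implies can be worked out as follows. The value of p_e is our own back-calculation, not a figure reported for the experiment.

```latex
\[
p_o = \frac{325}{458} \approx 0.71, \qquad
\kappa = \frac{p_o - p_e}{1 - p_e}
\;\Longrightarrow\;
p_e = \frac{p_o - \kappa}{1 - \kappa}
    \approx \frac{0.71 - 0.42}{1 - 0.42} \approx 0.50 .
\]
```

A chance agreement of about 0.50 is what one would expect for a binary label that both annotators use in roughly equal proportions, so the raw agreement of 71 % and the kappa of 0.42 are mutually consistent under this assumption.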
A possible explanation is the fact that personal world knowledge is used to infer
temporal information, such as the length of the temporal interval between two
eventualities, i.e. information that allows the annotator to decide whether the
eventualities are temporally ordered or not. Cases where the temporal
interval between two eventualities was very short were ambiguous for the
annotators, so each of them decided differently whether it was long enough for
temporal sequencing or too short, so that the simultaneity meaning was preferred.
701 Disagreements were resolved in a second round of the annotation experiment,
702 where the narrativity feature was identified with a new linguistic test that was
703 explained to two new annotators.
Annotators were asked to insert a connective such as ‘and’ and ‘and then’ where
possible, in order to make explicit the ‘meaning’ of the excerpt, namely the
temporal relation existent between the two eventualities considered. The
connective ‘because’ (for a causal relation) was also proposed by annotators under
the [+ narrative] label, showing that causal relations should also be considered.
We will not look further into causality in this chapter. The inter-annotator
agreement in this second experiment was kappa = 0.91, signalling very strong and
reliable agreement. This result supports the procedural nature of the feature,
given that one characteristic of procedural information is that the encoded
instructions can be made explicit with the help of discourse markers.
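To make the agreement measure concrete, the following is a minimal sketch of how a kappa value can be computed for two annotators. It assumes that the coefficient in question is Cohen's kappa, which the chapter does not state explicitly; the function name `cohen_kappa` and the toy labels are illustrative and are not the data of the experiment.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both annotators.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: sum over labels of the product of the two marginal proportions.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical toy annotations (N = narrative, X = non-narrative).
annotator_1 = ["N", "N", "X", "N", "X", "X", "N", "X", "N", "X"]
annotator_2 = ["N", "N", "X", "N", "X", "X", "N", "X", "N", "N"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 2))
```

In this toy case the annotators agree on 9 of 10 items, while agreement expected by chance is 0.5, so kappa corrects the raw agreement of 0.9 downwards.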
The cross-linguistic application of these findings consists of the observation of a
pattern in the parallel corpus. We investigated the data containing agreements from
both annotation rounds (435 items) and analyzed them in the parallel corpus. We
observed that the narrative usages of the SP identified by annotators correspond to
narrative usages in the FR part of the corpus (translation by a PC or PS) and the
non-narrative usages of the SP correspond to non-narrative usages in the FR text
(translation by an IMP) in 338 items (78 %). This leaves 22 % of the items where
annotators agreed on the narrativity label but where the label is not consistent
with the tense used in FR. Future work will focus on investigating the other factors
that explain this 22 % of the variation in the translation of the SP into French.

The new annotators were one of the authors and a research peer, who was not aware of the
purpose of the research.

In Grisot et al. (2012), we describe a similar annotation experiment made on the French tenses
used for translating the EN SP, namely PC, PS and IMP. In this experiment, the PC and PS were
identified as being narrative and the IMP as being non-narrative with a kappa value of 0.63
(reliable agreement).
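The consistency figure reported above (338 of 435 items) can be read as a simple proportion over label and tense pairs. The sketch below illustrates the computation under the correspondence described in the text (PC and PS as narrative, IMP as non-narrative); the names `EXPECTED_LABEL` and `consistency_rate` and the toy items are hypothetical.

```python
# Assumed correspondence between French tense and narrativity label,
# following the pattern described in the text.
EXPECTED_LABEL = {"PC": "narrative", "PS": "narrative", "IMP": "non-narrative"}

def consistency_rate(items):
    """items: list of (agreed narrativity label, French tense) pairs."""
    # A tense outside the mapping counts as inconsistent with any label.
    consistent = sum(1 for label, tense in items if EXPECTED_LABEL.get(tense) == label)
    return consistent / len(items)

# Hypothetical toy items, not taken from the corpus.
toy_items = [
    ("narrative", "PS"),
    ("narrative", "PC"),
    ("non-narrative", "IMP"),
    ("narrative", "IMP"),  # label and tense diverge
]
toy_rate = consistency_rate(toy_items)

# The proportion reported in the chapter: 338 consistent items out of 435.
reported_percent = round(100 * 338 / 435)
print(toy_rate, reported_percent)
```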
3.4 Natural Language Processing Application
Nowadays, linguistic research tends more and more to integrate automatic language
processing techniques. Human annotation and classification of texts is often used
in Natural Language Processing (NLP) and Machine Translation (MT). Most
current MT systems incorporate a language model that analyses texts at
the sentence level. But there are linguistic phenomena, such as verb tenses, whose
interpretation relies on information that goes beyond sentence boundaries.
The theoretical model of the pragmatics and semantics of the EN SP described in
this chapter has also been validated empirically through an NLP technique called
automatic annotation or classification. Human-annotated data provides the
machine translation system with pragmatic information that humans make use of in
the interpretation process, such as the reference point R, the relative sequence of
eventualities, the length of the interval and any causal relation existent between
eventualities.
The human-annotated texts described in this chapter served as training data for
machine-learning tools, specifically a maximum entropy classifier (Manning and
Klein 2003). A classifier is a machine-learning tool that takes data items and places
them into one of the available classes (in the present case, narrative and
non-narrative) according to a statistical algorithm. The underlying principle of
maximum entropy is that class probabilities should be distributed uniformly
unless there is some external knowledge that instructs
the system to do otherwise. Annotated data used for training these classifiers
provide this external knowledge and thus inform the automatic labelling technique where
to be minimally non-uniform. Iterative runs of the classifier result in texts that are
automatically labelled or annotated with the considered features.
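The principle described above can be illustrated with a minimal log-linear (maximum entropy) classifier. This is only a sketch under stated assumptions: it is not the classifier of Manning and Klein (2003), it uses no regularisation, and the feature names (`dynamic_verb`, `stative_verb`, `connective_then`, `while_clause`) and training items are hypothetical, since the features actually used are not listed here. Note that without any learned weights the model assigns a uniform distribution over the two classes, and training moves it away from uniformity only where the annotated data require it.

```python
import math

CLASSES = ("narrative", "non-narrative")

def predict_proba(weights, features):
    """Return P(class | features) under a log-linear (maximum entropy) model."""
    scores = {c: sum(weights.get((c, f), 0.0) for f in features) for c in CLASSES}
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}

def train(data, epochs=200, learning_rate=0.5):
    """Stochastic gradient ascent on the conditional log-likelihood."""
    weights = {}
    for _ in range(epochs):
        for features, gold in data:
            probs = predict_proba(weights, features)
            for c in CLASSES:
                # Gradient per active feature: observed count minus expected count.
                grad = (1.0 if c == gold else 0.0) - probs[c]
                for f in features:
                    weights[(c, f)] = weights.get((c, f), 0.0) + learning_rate * grad
    return weights

# Hypothetical annotated items: (set of active features, human label).
training_data = [
    ({"dynamic_verb", "connective_then"}, "narrative"),
    ({"dynamic_verb"}, "narrative"),
    ({"stative_verb"}, "non-narrative"),
    ({"stative_verb", "while_clause"}, "non-narrative"),
]

untrained = predict_proba({}, {"dynamic_verb"})   # uniform: no external knowledge yet
model = train(training_data)
trained = predict_proba(model, {"dynamic_verb"})  # shifted towards "narrative"
print(untrained, trained)
```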
The feature tested in our case study was [narrativity] and the human-annotated
data was used for training the classifier. The results of automatic annotation are
similar to human annotation; the classifier correctly annotated 76 % of the items.
The advantage of automatic annotation is that it can be applied to large
amounts of data. Human annotation has the disadvantages of being tedious and
costly, and it is often done on a small amount of data.
The final purpose was to improve the verb tense output of a statistical machine
translation (SMT) system. Current machine translation systems have
difficulties in choosing the correct verb tense translations, in some language pairs,
because these depend on a larger context than systems consider. A machine translation
system generally misses information from previously translated sentences, which is
detrimental to lexical cohesion and coherence of the translated text.

The NLP work was done by our colleagues Thomas Meyer and Andrei Popescu-Belis from the
Idiap Research Institute (Martigny, Switzerland) to whom we address our gratitude.
A first run of an SMT system, which uses the classifier trained on the annotated
data with the [narrativity] feature, had slightly better results than without this
pragmatic feature. When trained and tested on automatically annotated data,
the [narrativity] feature improves translation by about 0.2 BLEU points.
More importantly, manual evaluation shows that verb tense translation and verb
choice are improved by 9.7 % and 3.4 % (absolute) respectively, leading to an
overall improvement of verb translation of 17 % (relative) (for more detailed results
see Meyer et al. 2013).
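For readers unfamiliar with BLEU, the following sketch shows a simplified, single-reference, sentence-level version of the measure: the geometric mean of clipped n-gram precisions multiplied by a brevity penalty. It is an illustration only, not the evaluation setup of Meyer et al. (2013); evaluations are normally run at corpus level with established tools, and the sentences below are hypothetical.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU against a single reference (0 to 1)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref_counts[g]) for g, count in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalise candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)

# Hypothetical sentences: the candidates differ only in the tense of one verb.
reference = "elle chantait quand il entra dans la salle"
same_tense = "elle chantait quand il entra dans la salle"
other_tense = "elle chanta quand il entra dans la salle"
score_same = bleu(same_tense, reference)
score_other = bleu(other_tense, reference)
print(score_same, score_other)
```

A single divergent verb form lowers every n-gram precision that includes it, which is why tense errors are visible in BLEU even though the measure is not designed specifically for verb tenses.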
4 Conclusion
This chapter has given an account of the place of empirical pragmatics among
theoretical pragmatics and experimental pragmatics, for the study of language in
use. We have argued for the need to have robust data for pragmatic research, data
provided by both corpus work and experimentation.
We have shown that corpus work can be fruitfully done with a contrastive
perspective, following the specific three-step methodology of CA. As far as
experimentation is concerned, we have looked into offline experiments consisting
of linguistic judgement tasks that resulted in human-annotated data. We have
discussed the example of the first experiment for the pragmatic distinction between
what is ‘said’ and what is ‘implicated’ designed by Gibbs and Moise (1997).
Another important topic of this chapter was the discussion about the advantages
and difficulties of each of the two methods considered (corpus work and
experimentation), as well as their complementarity.
In our case study, we investigated the nature of the information encoded by verb
tenses. We assumed and validated empirically through annotation experiments that
verb tenses encode both procedural and conceptual information. We defined
conceptual information as being involved in the language of thought in a Fodorian
framework (Fodor 1975, 1998), having the characteristic of being accessible to
consciousness and capable of being reflected on, evaluated and used in general
inference. We thus proposed, based on these two features, that verb tenses encode
conceptual information consisting of a certain configuration of temporal coordinates.
The basic meaning of a tense is to locate an eventuality relative to the speech moment,
via a reference point. A verb tense encodes instructions to verify the
contextual value of several features that are important and relevant for utterance
comprehension. In this chapter, we investigated one feature: [narrativity].

BLEU (Bilingual Evaluation Understudy) is an evaluation measure for machine-translated texts.
It calculates the degree of resemblance to a human-translated text and it is a number between 0 and
1, where values closer to 1 represent more similar texts.
As far as procedural information is concerned, we followed Wilson and
Sperber’s idea (1993) that procedures are not part of the language of thought and
thus are neither accessible to consciousness nor easily conceptualized, as
representations are. The results of the annotation experiment showed that verb tenses encode
procedural information that instructs the reader/hearer to look for other eventualities
that are related to the eventuality considered, namely the [narrativity] procedural
feature.
Taken together, the empirical findings of this research provide an example of the
relation between theoretical framework(s) and empirical methodologies. Theoretical
hypotheses have an impact on the choice of empirical methodologies. For
example, a cross-linguistic perspective requires work on parallel corpora in order to
have access to both source and target texts. The disambiguation of the usages of the
targeted verb tense requires the formulation of possible disambiguation criteria that
need to be validated through experimentation involving linguistic judgement tasks.
Genuine data handled with empirical methods can challenge theoretical positions. For
verb tenses, for example, the results of our experiments challenged the theoretical
assumption that verb tenses do not encode conceptual information, but only procedural
information. Next to existent qualitative measures for conceptual and procedural
information, we proposed a quantitative measure: the kappa coefficient for
inter-annotator agreement. This measure makes use of the knowledge that native
speakers have about their language.
Finally, our work has illustrated how empirical pragmatics can work together
with the NLP domain. The pragmatic feature identified as procedural information
and validated through human annotation experiments has been used as a label for
discourse tagging with an automatic classifier. Moreover, an SMT system trained on
the annotated corpus had better results for translating verb tenses than it had
without the [narrativity] pragmatic feature.
An issue that was not addressed in this study is the cross-linguistic application
of the model to more than one pair of languages. This issue will be addressed in
further studies targeting the translation of the English SP into Italian and
Romanian. The application of the conceptual/procedural distinction to verb tenses could
also be investigated using online experimental methodology. This would probably reduce any
remaining doubts about the existence of a conceptual content of verb tenses.