Abstract

We describe the problem of anaphora resolution and discuss approaches to modeling this problem. Centering Theory (CT), which is an approach to modeling certain aspects of local coherence in discourse, includes within it the component that models anaphora resolution. However, CT itself is not a theory of anaphora resolution. It was developed as part of a theory of local coherence. Subsequently many researchers have attempted to use CT or some modified versions of CT for anaphora resolution. This has led to some very interesting work but also raised issues and questions as to what CT is about. We attempt to clarify some of these issues.
Anaphora Resolution: A Centering Approach
Aravind K. Joshi (joshi@linc.cis.upenn.edu)
Department of Computer and Information Science, and
Institute for Research in Cognitive Science,
University of Pennsylvania,
Philadelphia, PA 19104, U.S.A.
Rashmi Prasad (rjprasad@linc.cis.upenn.edu)
Department of Linguistics, and
Institute for Research in Cognitive Science,
University of Pennsylvania,
Philadelphia, PA 19104, U.S.A.
Eleni Miltsakaki (elenimi@linc.cis.upenn.edu)
Institute for Research in Cognitive Science,
University of Pennsylvania,
Philadelphia, PA 19104, U.S.A.
1. Anaphora Resolution with Centers of Attention
Anaphora resolution in discourse - a coherent sequence of utterances - is the task or
process of identifying the referents of expressions which we use to denote discourse
entities, i.e., objects, individuals, properties and relations that have been introduced and
talked about in the prior discourse. The importance of modeling this process cannot be
overstated. Computing the meaning of a discourse is commonly understood as partly the
process of connecting the information in the upcoming utterance with the information
contained in the prior discourse. Before we can do this, however, we need to assign an
interpretation to all the elements of the utterance and then to the utterance as a whole. In
many cases, the interpretation of some elements in the sentence can only be assigned
relative to the prior discourse context – anaphoric expressions comprise one such class of
elements.
In early approaches to anaphoric reference in AI and linguistics, the task of anaphora
resolution was relegated to syntax, which provided filters such as grammatical agreement
constraints, and open-ended semantic inference that drew on, among other things, world
knowledge and inference procedures to identify the appropriate referent. However, it was
soon recognized that syntactic constraints did little to narrow the search for anaphoric
referents, while the mechanism of open-ended semantic inference was too knowledge
intensive and complex - requiring reasoning over the entire space of discourse at once -
and therefore computationally infeasible.
In 1977, a different view of anaphora resolution arose out of the work of Barbara Grosz
(Grosz, 1977) which rests on a fundamental and singularly important assumption
regarding the attentional status of discourse entities: at any given point of the discourse,
the discourse participants’ attention is centered on a set of entities, a proper subset of all
the entities being talked about in the discourse. Furthermore, for a given utterance, the
discourse participants’ attention is centered on a singleton entity, and the rest of the
utterance makes a predication about this entity. The notion of the center of attention
specific to utterances is very similar to the notion of “topic” in linguistics, where it is
defined as what is “talked about” in the utterance. The approach for anaphora resolution
with this centering view is that the search for the referents of anaphoric expressions
should be restricted to the set of centered entities, the assumption being that in discourse,
it is these entities that we are most likely to continue to talk about and refer to with the
use of anaphoric expressions. Furthermore, a partial ordering is imposed on the elements
of the set, so that some entities are more centered than others. Such a preference ordering
on the possible candidate referents for anaphoric expressions significantly simplifies the
“nature” of inference that would be needed and at the same time minimizes the “amount”
of inference. Another significant proposal was that the set of centered entities can be
partially determined by the linguistic structure of the utterance itself. The consequences
of all these ideas were tremendous because it meant that it was possible to set aside, to a
significant extent, the role of open-ended inferencing for anaphora resolution and look
instead to more easily identifiable surface features of the utterance as the solution and
explanation for at least part of the problem.
While Grosz laid out the general framework for the Centering process, her work did not
suggest the exact mechanisms whereby the centered entities could be identified. In 1979,
Candy Sidner extended Grosz’s framework by precisely defining the notion of the
utterance-based center linguistically and also provided a mechanism for using centers to
identify referents of pronouns.
Sidner invoked several Centering structures - singleton sets called “discourse focus” and
“actor focus”, and a set called the “potential foci” which can contain one or more
elements. The “discourse focus” is equivalent to the center of the utterance, i.e., the entity
about which some predication is made by the utterance. The “actor focus” is the discourse
entity that is predicated as the agent of the event in the utterance. The “discourse focus” is
identified using a set of rules that refer to the linguistic structure of the utterance as well
as the state of the existing data structures when the utterance containing the pronoun is
processed. A referent for a pronoun is identified primarily with the actor focus or the
discourse focus, unless it is ruled out by some specified criteria, in which case an
alternate candidate referent is considered from the set of “potential foci”, which contains
entities other than the two primary foci. A significant aspect of Sidner’s work is that she
does not rule out the role of inference in pronoun interpretation, but instead only
constrains it in nature and amount. The nature of inference needed is different from
earlier open-ended inference systems because it only involves checking for contradictions
once a candidate referent is chosen using the structurally determined preference ordering:
this allows for a much simpler knowledge base and reasoning procedures. The amount of
inference needed is also reduced because of the preference ordering, so that as soon as an
entity is identified for which no contradictions arise, no other inferencing is needed.
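The control flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Sidner's actual algorithm: the function name resolve_pronoun, the contradicts predicate (a stand-in for the inference component), the relative order of the candidates, and the toy entities are all hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple


def resolve_pronoun(
    primary_foci: Sequence[str],
    potential_foci: Sequence[str],
    contradicts: Callable[[str], bool],
) -> Tuple[Optional[str], int]:
    """Return the first candidate that raises no contradiction,
    together with the number of contradiction checks performed."""
    checks = 0
    # Primary foci (actor focus, discourse focus) are tried before the
    # potential foci; this ordering is the structural preference.
    for candidate in list(primary_foci) + list(potential_foci):
        checks += 1
        if not contradicts(candidate):
            # Stop at the first consistent candidate: no further inference.
            return candidate, checks
    return None, checks


# Hypothetical toy discourse: "The mechanic picked up the wrench. It was rusty."
# The actor focus ("mechanic") is ruled out for "it"; the discourse focus is not.
referent, checks = resolve_pronoun(
    primary_foci=["mechanic", "wrench"],
    potential_foci=["garage"],
    contradicts=lambda entity: entity == "mechanic",
)
print(referent, checks)
```

The point of the sketch is that inference is invoked only as a filter on an already ordered candidate list, and that the search stops as soon as one candidate passes.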
2. Centering Theory: Modeling Local Coherence with Centers of Attention
Centering Theory arose from the work of Aravind Joshi and Steve Kuhn in 1979 (Joshi
and Kuhn, 1979), where the concepts of the “center” and “Centering” were first
introduced as a way to specify an almost monadic calculus approach to discourse
interpretation. Joshi and Kuhn showed that inferences of a certain class are more easily
computed by using a monadic representation for utterances. However, they were also
interested in computing the difficulty of deriving the necessary inferences.
While not explicitly stated by Joshi and Kuhn, the Centering process was assumed to be a
local phenomenon operating over successive utterances. In the meantime, Grosz’s work
on global and local discourse processing had also been formalized by Grosz and Sidner
(Grosz and Sidner, 1986) and it was possible to place CT in its proper place in a complete
theory of discourse processing. Grosz and Sidner provided a framework for discourse
structure as a composite of three interacting constituents: a linguistic structure, an
intentional structure, and an attentional state. The linguistic structure is determined by the
intentional structure and comprises the utterances of the discourse grouped together
hierarchically into discourse segments. The attentional state is an abstraction of the
discourse participants’ center of attention as the discourse unfolds. Each discourse
segment is associated with a fixed attentional state relevant to the overall discourse – the
global attentional state. A local attentional state is associated with each utterance within
the segment. The local attentional state is inherently dynamic and can remain constant or
change from utterance to utterance within the segment.
Centering Theory (Grosz, Joshi and Weinstein, 1983; 1986; 1995) was proposed as a
model of the local attentional state, i.e., of the dynamic attentional state within the
discourse segment. Following up on the concerns of Joshi and Kuhn, it explicates more
clearly and formally the particular linguistic and attentional state factors that contribute to
the ease or difficulty in interpreting a discourse segment. The notion of inferential
complexity or difficulty was recast in terms of “coherence”. The first factor that
contributes to coherence is given as a further explication of Joshi and Kuhn’s “change of
center” rule, and accounts for the difference in coherence between the following two
discourse segments:
1. (a) John went to his favorite music store to buy a piano.
(b) He had frequented the store for many years.
(c) He was excited that he could finally buy a piano.
(d) He arrived just as the store was closing for the day.
2. (a) John went to his favorite music store to buy a piano.
(b) It was a store John had frequented for many years.
(c) He was excited that he could finally buy a piano.
(d) It was closing just as John arrived.
Discourse (1) is intuitively more coherent than discourse (2). This difference may be seen
to arise from the number of changes in the center. Discourse (1) centers a single
individual ‘John’, describing various actions he took and his reactions to them. In
contrast, discourse (2) seems to flip back and forth between ‘John’ and ‘the store’. These
“changes in aboutness” or “changes of centers” make discourse (2) less coherent than
discourse (1).
The second observation that CT captures with discourses (1) and (2) establishes the
correlation of center changes and the degree of coherence with the linguistic form of the
utterances. Both discourses convey the same information, but in different ways. They
differ not in content or what is said, but in expression or how it is said. The variation in
“changes of attentional state” that they exhibit arises from different choices of the way in
which they express the same propositional content. The different linguistic choices
further engender different inference demands on the hearer or reader, and these
differences in inference load underlie certain differences in coherence between them.
In addition to the different linguistic choices pertaining to the realization of the
propositional content of the utterance as a whole, CT also identifies different linguistic
choices made for realizing particular elements within the propositional content of the
utterance. These are choices in referring expression form. Pronouns and definite
descriptions are not equivalent with respect to their effect on coherence. CT characterizes
the perceived coherence of the use of pronouns and definite descriptions by relating
different choices to the inferences they require the hearer or reader to make. The
following variations of a discourse illustrate this relationship:
3. (a) Terry really goofs sometimes.
(b) Yesterday was a beautiful day and he was excited about trying
out his new sailboat.
(c) He wanted Tony to join him on a sailing expedition.
(d) He called him at 6 A.M.
(e) He was sick and furious at being woken up so early.
(e’) Tony was sick and furious at being woken up so early.
(f) He told Terry to get lost and hung up.
(g) Of course, he hadn’t intended to upset Tony.
(g’) Of course, Terry hadn’t intended to upset Tony.
(g’’) Of course, Terry hadn’t intended to upset him.
In discourse (3), it is the use of the pronoun in utterance (3e) that is in question. While we
can tell that the pronoun He refers to ‘Tony’, the use of the pronoun here is potentially
confusing. CT claims that this is because, until utterance (3d), ‘Terry’ has been the
“center of attention”, and therefore the most likely referent of the pronoun. This claim
rests on the assumption that hearers expect speakers to continue talking about the entity
that is in the “center of attention”. The confusion therefore results because we tend to
assign the reference of the pronoun to the center of attention as soon as we encounter it
but have to backtrack (a phenomenon called “garden-path”) when we process the rest of
the sentence and find something that contradicts our assumption. In this particular
example, we backtrack when we get to the word sick and from the prior utterances in the
discourse, reason that it must be ‘Tony’ and not ‘Terry’ who is sick. As the careful reader
will have noticed, the assumed preferences for determining the referents of pronouns in
CT are reminiscent of Sidner’s model. We will return to this comparison at the end of this
section where we discuss the relation between anaphora resolution and Centering theory.
The confusion arising from (3e) is removed if the pronoun is replaced with the full noun
phrase ‘Tony’ as shown in (3e’). The conjecture in CT, therefore, is that when the center
of attention shifts to another entity, the form of referring expression used to denote the
new centered entity has consequences for the processing load required for interpreting the
utterance. A pronoun used to refer to the new centered entity increases the processing
load because it causes backtracking from the interpretation of the old centered entity and
thus from the interpretation of the utterance itself. A full noun phrase on the other hand
shifts the center of attention before the rest of the utterance is processed and therefore
entails less processing.
The three variants (3g), (3g’) and (3g’’) provide an illustration of yet another type of
difference in coherence due to the form of referring expression. This arises when multiple
entities are talked about from one utterance to the next. By the time (3f) is processed, the
center has shifted from ‘Terry’ to ‘Tony’, so that in (3g), we expect ‘Tony’ to be the
center of attention. This expectation is borne out in (3g) since ‘Tony’ is indeed mentioned
again. However, what makes this sentence very odd and hard to process is that ‘Terry’ is
also mentioned in (3g), but while the centered ‘Tony’ is referred to with a full noun
phrase, the non-centered ‘Terry’ is referred to with a pronoun. This increased processing
is reduced when a full noun phrase is used for ‘Terry’ instead of the pronoun, as in (3g’)
or (3g’’), so that we are able to shift the center before processing the rest of the utterance,
thus avoiding any backtracking. The type of coherence variation found in these utterances
is due to the fact that both the centered entity in (3f) as well as another entity are
mentioned again in (3g) and its variants, but in (3g), it is the non-centered entity from (3f)
that is referred to with a pronoun.
CT provides a set of definitions, constraints and rules to formalize the three-way
relationship discussed above, i.e., the relationship between attentional state, the degree of
coherence and linguistic form (for the realization of full propositional content as well as
for the realization of discourse entities). The CT definitions, constraints and rules are
given below.
Definitions:
(D1.) Each utterance U in a discourse segment is assigned a set of forward-looking
centers, Cf (U), where centers are discourse entities realized in the utterance.
(D2.) Each utterance other than the segment-initial utterance is assigned a single
backward-looking center, Cb (U).
(D3.) The backward-looking center of utterance Un+1 connects with one of the forward-
looking centers of Un.
(D4.) The elements of Cf (Un) are partially ordered to reflect relative prominence or
salience in Un. In English, the Cf is ordered according to grammatical role.
(D5.) The more highly ranked an element of Cf (Un), the more likely it is to be Cb(Un+1).
(D6.) The most highly ranked element of Cf (Un) is called the preferred center, Cp (Un).
(D7.) A transition relation holds between each utterance pair Un and Un+1 in a segment.
There are four types of transitions, which describe center continuation, center retention,
and two types of center shifting. The transitions are shown in Table 1.
Constraints:
(C1.) There is precisely one backward-looking center Cb (Un).
(C2.) Cb (Un+1) is the highest ranked element of Cf (Un) that is realized in Un+1.
Constraint C1 says that there is one central discourse entity that the utterance is about.
Constraint C2 states that the ranking or ordering of the forward-looking centers in Un
determines which of them realized in Un+1 will become the backward-looking center of
Un+1.
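Constraint C2 and Definition D6 can be illustrated with a minimal Python sketch. It assumes, for simplicity, that each Cf is given as a list already in rank order (D4 only requires a partial order) and that an entity counts as realized if it appears in the list; the function names and the Cf lists for utterances (1a) and (2b) are our own encoding, not part of CT.

```python
from typing import List, Optional


def preferred_center(cf: List[str]) -> Optional[str]:
    """Cp(Un): the most highly ranked element of Cf(Un) (Definition D6)."""
    return cf[0] if cf else None


def backward_center(cf_prev: List[str], cf_now: List[str]) -> Optional[str]:
    """Cb(Un+1): the highest ranked element of Cf(Un) realized in Un+1 (C2)."""
    for entity in cf_prev:  # cf_prev is assumed to be in rank order
        if entity in cf_now:
            return entity
    return None  # no element of Cf(Un) is realized in Un+1


# Assumed rankings by grammatical role (subject first).
cf_1a = ["John", "store", "piano"]  # "John went to his favorite music store ..."
cf_2b = ["store", "John"]           # "It was a store John had frequented ..."

print(backward_center(cf_1a, cf_2b))  # John
print(preferred_center(cf_2b))        # store
```

In (2b) the backward-looking center is still ‘John’, but the preferred center is ‘the store’, which is the configuration that the transitions are designed to distinguish.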
Rules:
(Rule 1.) If some element of Cf (Un) is realized as a pronoun in Un+1 then so is Cb(Un+1).
(Rule 2.) With respect to Table 1, sequences of the CONTINUE transition are preferred to
sequences of the RETAIN transition, which are preferred to sequences of the SMOOTH-
SHIFT transition, which are preferred to sequences of the ROUGH-SHIFT transition.
Rule 1 is often called the “Pronoun Rule”. It is important to note that the inference load
due to Rule 1 is not part of the inference load characterized by the transitions. Rule 1 is
thus independent of the transitions. This independence of Rule 1 is an important
consideration when thinking of the relation between CT and anaphora resolution. The
inference load due to Rule 1 can be regarded as a binary measure, simply stating whether
or not Rule 1 has been violated. With this rule, we can now explain the varying degrees
of coherence for utterances (3g-3g’’) in discourse (3). The centering analysis for this
discourse is shown in Table 4. After ‘Tony’ is established as the center (the Cb) in (3e),
this center continues in (3f), but with the re-introduction of ‘Terry’ as a potential center.
In (3g), both ‘Tony’ and ‘Terry’ are mentioned but since ‘Tony’ is higher ranked than
‘Terry’ in (3f), it is ‘Tony’ that is retained as the Cb in (3g). However, this utterance
creates a Rule 1 violation because the Cb, ‘Tony’, is not realized with a pronoun whereas
‘Terry’, which is not the Cb, is. The only difference between (3g) and (3g’-3g’’) is that the
latter do not violate Rule 1, the transitions remaining the same. The oddness of (3g) is
therefore explained by Rule 1.
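The binary character of Rule 1 can be made concrete with a short Python sketch. The encoding is an assumption made for illustration: Cf(3f) is given as a ranked list, and each variant of (3g) is a dictionary mapping the entities it realizes to the form used ("pronoun" or "np"); the function names are hypothetical.

```python
from typing import Dict, List, Optional


def backward_center(cf_prev: List[str], realized: Dict[str, str]) -> Optional[str]:
    """Highest ranked element of Cf(Un) that is realized in Un+1 (Constraint C2)."""
    for entity in cf_prev:
        if entity in realized:
            return entity
    return None


def violates_rule1(cf_prev: List[str], forms: Dict[str, str]) -> bool:
    """True if some element of Cf(Un) is pronominalized in Un+1 but Cb(Un+1) is not."""
    cb = backward_center(cf_prev, forms)
    if cb is None:
        return False
    some_pronoun = any(forms[e] == "pronoun" for e in cf_prev if e in forms)
    return some_pronoun and forms[cb] != "pronoun"


cf_3f = ["Tony", "Terry"]  # "He told Terry to get lost and hung up."

variant_g = {"Terry": "pronoun", "Tony": "np"}    # (3g)
variant_g1 = {"Terry": "np", "Tony": "np"}        # (3g')
variant_g2 = {"Terry": "np", "Tony": "pronoun"}   # (3g'')

for name, forms in [("3g", variant_g), ("3g'", variant_g1), ("3g''", variant_g2)]:
    print(name, violates_rule1(cf_3f, forms))
```

Only (3g) is flagged, while the Cb is ‘Tony’ in all three variants, which mirrors the observation that the variants differ in Rule 1 status and not in their transitions.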
Rule 2 provides a formal characterization of the perceived differences in coherence for
discourse segments in terms of an ordering on transition sequences. The less frequent the
shifts in a discourse, the more coherent it is. Discourse (1) above is characterized by
Continue transitions throughout the segment (Continue, Continue, Continue; see Table
2), describing a highly coherent discourse, whereas discourse (2) is characterized by
switches between Retain and Continue (Retain, Continue, Retain; see Table 3),
describing a less coherent discourse.
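The transition sequences for discourses (1) and (2) can be reproduced with a small Python sketch. It assumes the standard formulation of the four transitions (whether Cb(Un+1) equals Cb(Un), and whether Cb(Un+1) equals Cp(Un+1)), treats an undefined Cb(Un) as if it were unchanged, and uses hand-ranked Cf lists of our own; utterances that share no entity with their predecessor are not handled specially.

```python
from typing import List, Optional


def backward_center(cf_prev: List[str], cf_now: List[str]) -> Optional[str]:
    """Highest ranked element of Cf(Un) realized in Un+1 (Constraint C2)."""
    for entity in cf_prev:
        if entity in cf_now:
            return entity
    return None


def transition(cb_prev: Optional[str], cb_now: Optional[str], cp_now: str) -> str:
    """Classify one utterance pair (assumed standard definitions)."""
    same_cb = cb_prev is None or cb_prev == cb_now  # undefined Cb(Un) counts as same
    if same_cb:
        return "CONTINUE" if cb_now == cp_now else "RETAIN"
    return "SMOOTH-SHIFT" if cb_now == cp_now else "ROUGH-SHIFT"


def transitions(cf_lists: List[List[str]]) -> List[str]:
    """Label every adjacent utterance pair in a segment."""
    labels: List[str] = []
    cb_prev: Optional[str] = None  # the segment-initial utterance has no Cb
    for cf_prev, cf_now in zip(cf_lists, cf_lists[1:]):
        cb_now = backward_center(cf_prev, cf_now)
        labels.append(transition(cb_prev, cb_now, cf_now[0]))
        cb_prev = cb_now
    return labels


# Cf lists ranked by grammatical role (our own encoding of the examples).
discourse_1 = [["John", "store", "piano"], ["John", "store"],
               ["John", "piano"], ["John", "store"]]
discourse_2 = [["John", "store", "piano"], ["store", "John"],
               ["John", "piano"], ["store", "John"]]

print(transitions(discourse_1))
print(transitions(discourse_2))
```

Under these assumptions the sketch yields three Continue transitions for discourse (1) and Retain, Continue, Retain for discourse (2).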
3. Centering Theory and Anaphora Resolution
As stated right in the beginning, the main goal of CT is to characterize certain aspects of
local coherence. Differences in coherence result from changes in the center of attention,
captured by the Centering transitions and transition ordering, and from the different
expressions in which centers are realized. In particular, pronouns and definite
descriptions engender different inference demands on the hearer. CT, however, is not to
be seen as a theory of anaphora resolution. The incorporation of referring expressions in
the account of local coherence has led many researchers to use CT as part of anaphora
resolution algorithms. This has led to some very interesting research. At the same time, it
has led to some confusion in the literature associated with CT.
The first point to appreciate is that there is undoubtedly a very relevant connection
between CT and anaphora resolution. As the careful reader will have deduced, the
garden-path effects with the interpretation of the pronouns illustrated in discourse (3) are
reminiscent of the preference ordering utilized by Sidner for the reference resolution of
pronouns. In Sidner’s model, the “center of attention” is equivalent to the “discourse
focus” and like Sidner, CT utilizes this preference for the “center of attention” to continue
over successive utterances. The relative preference of the “actor focus” as the next center
of attention is also captured with the “preferred center” in CT. At a first look, it may seem
that Sidner’s use of the “center of attention” to determine the referents of pronouns and
CT’s use of the same to explain how incorrect referents are assigned to pronouns results
in a paradox. But a closer look shows that it isn’t really so, because like CT, Sidner also
allows for garden-paths on the referents of pronouns by further invoking inference
procedures (albeit unspecified) to check for contradictions. So Sidner’s goals and CT’s
goals are very much alike, in that they both assume very similar preference for the
“initial” resolution of pronouns which can be contradicted with further information. The
difference between the two is that CT goes further to formalize the nature and difficulty
of the contradictory inferences in terms of utterance pair transitions and uses the formal
system as a way to compute the degree of coherence of a discourse segment.
Anaphora resolution algorithms that want to obviate the need for inference procedures
and want to model the preferential rules for pronoun resolution should use the common
part underlying the two described models. Sidner’s inference rules for computing
contradictions should be left out (or at least relegated to another interacting component)
as should the part in CT that deals with the computation of coherence with the transitions
and transition orderings. More formally, the common aspects of Sidner’s model and CT
are captured in CT with (i) the list of forward-looking centers, (ii) the backward-looking
center, (iii) the preferred center, and (iv) Rule 1, the “Pronoun Rule”. These data
structures and rules are sufficient to set the initial preference for the referents of
pronouns. Furthermore, corpus studies and studies of naturally occurring data of the form
of referring expressions have shown that to a large extent speakers adhere to the
preference orderings and Rule 1, so that much mileage can be achieved by building in
these preferences into anaphora resolution algorithms, as Sidner had conjectured.
However, while some anaphora resolution algorithms have used these very data
structures and shown good results, others have used CT in totality, i.e., together with the
transitions and transition orderings, to compute the referents of pronouns (for example,
the centering algorithm – called the BFP algorithm - for pronoun resolution in Brennan et
al., 1987). In addition to being theoretically misguided, the latter approach also yields
contradictory results for the initial preferential resolution of pronouns (Kehler, 1997). An
Optimality Theory-based version of the BFP algorithm and a comprehensive overview of
Centering together with a historical development of Centering Theory and its applications
can also be found in Beaver (2004).
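A minimal sketch of what "initial preference" means when only the forward-looking center list is used (no transitions, no inference) is given below in Python. It is an illustration under our own assumptions, not any published algorithm: the function name, the gender table used as an agreement filter, and the Cf rankings are hypothetical.

```python
from typing import Dict, List, Optional


def initial_referent(
    cf_prev: List[str], pronoun_gender: str, gender: Dict[str, str]
) -> Optional[str]:
    """Pick the highest ranked element of Cf(Un) that agrees with the pronoun.

    No transitions are computed and no contradiction checking is done, so
    the result is only the initial preference, which may later be revised.
    """
    for entity in cf_prev:  # cf_prev is assumed to be in rank order
        if gender.get(entity) == pronoun_gender:
            return entity
    return None


gender = {"Terry": "masc", "Tony": "masc", "Mary": "fem"}

# (3d) "He called him at 6 A.M." followed by (3e) "He was sick ..."
cf_3d = ["Terry", "Tony"]
print(initial_referent(cf_3d, "masc", gender))  # Terry

# Hypothetical case where agreement filters out the top-ranked entity.
print(initial_referent(["Mary", "Tony"], "masc", gender))  # Tony
```

For (3e) the sketch selects ‘Terry’, which is exactly the initial assignment that has to be revised once the rest of the utterance is processed.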
4. Unspecified Aspects of Centering
Some parameters and constants in Centering, both from the perspective of anaphora
resolution and local coherence were left unspecified in the original models. Two of these
in particular have led to a great deal of research.
The first is determination of the preference ordering on the list of forward-looking centers
or determination of relative salience of discourse entities in an utterance. This is crucial
for the initial interpretation assignments for pronouns. Cross-linguistic investigation of
the mechanisms that languages use to realize discourse functions like “topic” shows that
different ranking criteria need to be used for different languages. In English, relative
salience is largely predicted by grammatical role, as was correctly assumed in CT. Other
languages use other mechanisms. In Japanese, which uses the morphemes wa and ga to
distinguish topics and subjects and special forms of the verb for marking empathy, topic
and empathy marked entities are ranked higher than subjects. German uses word order in
some syntactic contexts to indicate salience, positioning higher ranked entities before
lower ranked ones. Other languages on which such research has been conducted include
Finnish, Greek, Hindi, Italian, Russian and Turkish.
The second is the specification of what constitutes the utterance, which in CT is the
linguistic locus of the local attentional state. Discourse centers, both backward-looking
and forward-looking, are computed for each utterance. That is, each utterance serves as a
center update unit. In attempting to characterize the linguistic encoding of a center update
unit, complications arise from complex sentence structures. Up-to-date research on this
issue suggests that complex sentences may project different center update units
depending on their internal structure.
In early theoretical work on characterizing the center update unit in Centering, it was
suggested that complex sentences be broken into clauses each of which forms an
autonomous center update unit, with the possible exception of relative clauses and
complement clauses. Treating adverbial clauses as autonomous center update units
predicts that a pronoun in a fronted adverbial clause, as in (4c) below, is anaphorically
dependent on an entity already introduced in the immediately prior discourse and not on
the subject of the main clause it is attached to:
4. (a) (Jim) Kern_i began reading a lot about the history and philosophy of
Communism
(b) but never 0_i felt there was anything he as an individual could do about
(c) When he_i attended the Christian Anti-Communist Crusade school here about
six months ago
(d) Jim_i (Kern) became convinced that he as an individual could do something
constructive in the ideological battle
(e) and 0_i set out to do it
This view on backward anaphora was also professed in earlier work by Kuno, who
asserted that there was no genuine backward anaphora: the referent of an apparent
cataphoric pronoun must appear in the previous discourse. Empirical data later showed
that this view of backward anaphora cannot be maintained. Corpus studies show that
cataphoric pronouns can appear discourse initially.
Experimental work focusing on complex sentences of the type that includes adverbial
clauses suggests that adverbial clauses are processed as a single unit with the matrix
clause. Specifically, native speakers of English tend to interpret the ambiguous subject
pronoun in (5) as the groom, i.e., the subject of the preceding clause, even when the
adverbial in the second main clause is semantically varied (however, as a result,
moreover, then, etc.). This pattern contrasts with the interpretation of the subject pronoun
in (6), for which no consistent tendency is identified, indicating that in this case the
interpretation of the pronoun is most likely determined by the semantics of the predicates
of the main and adverbial clause and the relation between them.
5. The groom hit the best man. However, he…
6. The groom hit the best man although he…
Other experimental work on the interpretation of a subject pronoun following a complex
sentence indicates that referents in subject position in adverbial clauses are not favored
for the interpretation of a subsequent pronoun. In (7) and (8), for example, the subject
pronoun is interpreted as the conductor, i.e., the subject of the matrix clause, even when
the adverbial clause is postposed with respect to the main clause.
7. After the tenor opened his music score the conductor sneezed three
times. He...
8. The conductor sneezed three times after the tenor opened his music
score. He...
Data such as the above would be a challenge for a Centering-based anaphora resolution
algorithm which processes one clause at a time because there is no way of distinguishing
between (5) and (6). At the same time, these data are consistent with Centering and
Centering’s Pronoun rule under the assumption that adverbial clauses are not processed as
independent update units. Under this assumption, Centering would predict the pattern
observed in (5), (7) and (8). Centering’s pronoun rule would not make a prediction for (6)
with respect to the entities introduced in the main clause because they belong to the same
unit as the pronoun. Additional evidence for treating the entire sentence as a single update
unit comes from corpus work exploring various parameters that can be set for Centering
and the number of Centering rules that they would violate. This type of work suggests
that overall treating the whole complex sentence as a center update unit leads to fewer
violations of the Pronoun rule.
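The effect of the update-unit choice on (7) and (8) can be illustrated with a short Python sketch. The encoding is an assumption made for illustration only: each sentence is a list of clauses in surface order, each clause lists only its human referents, and when the whole sentence is one unit the matrix-clause entities are ranked above those of the adverbial clause.

```python
from typing import List, Tuple

# (is_matrix_clause, entities of the clause ranked by grammatical role)
Clause = Tuple[bool, List[str]]


def update_units(sentence: List[Clause], unit: str) -> List[List[str]]:
    """Return the Cf lists projected by a sentence for a given update unit."""
    if unit == "clause":
        # Every clause is its own center update unit, in surface order.
        return [entities for _, entities in sentence]
    if unit == "sentence":
        # One unit for the whole sentence; matrix entities outrank the rest.
        matrix = [e for is_matrix, ents in sentence if is_matrix for e in ents]
        subordinate = [e for is_matrix, ents in sentence if not is_matrix for e in ents]
        return [matrix + subordinate]
    raise ValueError("unit must be 'clause' or 'sentence'")


def preferred_antecedent(sentence: List[Clause], unit: str) -> str:
    """Top-ranked entity of the last update unit before the following 'He'."""
    return update_units(sentence, unit)[-1][0]


sentence_7 = [(False, ["tenor"]), (True, ["conductor"])]  # preposed adverbial clause
sentence_8 = [(True, ["conductor"]), (False, ["tenor"])]  # postposed adverbial clause

for unit in ("clause", "sentence"):
    print(unit, preferred_antecedent(sentence_7, unit), preferred_antecedent(sentence_8, unit))
```

With the clause as update unit, the sketch prefers the tenor in (8), contrary to the reported interpretation; with the sentence as update unit it prefers the conductor in both (7) and (8).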
Studies of Centering in relative clauses present conflicting results which need further
research to be reconciled. On the one hand are discourses like (9) that suggest that entities
mentioned in relative clauses (9b) are less salient than in the main clause (9a), as
indicated by the subsequent use of a full noun phrase in (9c). In fact, a pronoun
used instead of the full noun phrase would probably be interpreted as Mr. Taylor, i.e., the
entity in the main clause.
9. (a) Mr. Taylor_i, 45 years old, succeeds Robert D. Kilpatrick_j, 64,
(b) who_j is retiring, as reported earlier.
(c) Mr. Kilpatrick_j will remain a director.
(d) He_i … #He_j
On the other hand are discourses like (10) showing the opposite pattern from that in (9).
Such data come from work that looks at different types of relative clauses, specifically
non-restrictive and restrictive with a definite or indefinite head. Complementary patterns
in the use of pronouns and definite descriptions show that non-restrictive clauses and
restrictive clauses with an indefinite head pattern alike, and form an autonomous (but
embedded and accessible) center update unit. In example (10), the subject pronoun in
(10c) refers, without any garden-path effects, to the subject referent of the preceding
relative clause and not the subject referent of the main clause, indicating that in this case
the relative clause probably introduces a new update unit that is accessible to (10c) for
center establishment.
10. (a) This Moses_i was irresistible to a man like Simkin_j
(b) who_j loved to pity and to poke fun at the same time.
(c) He_j was a reality-instructor.
5. Applications of Centering Theory as a Model of Local Coherence
Some research illustrates the appropriate application of Centering Theory as a model
of local coherence. The four Centering transitions shown in Table 1 define four
degrees of coherence within a discourse segment. A textual segment characterized by a
sequence of Continue transitions demonstrates the highest degree of coherence and is
perceived as a segment focusing on a single entity. Retained topics and smooth shifts
to new topics are captured by the Retain and Smooth-Shift transitions. Indeed,
numerous corpus studies have identified Continue, Retain and Smooth-Shift transitions.
As expected, Rough-Shift transitions are rarely identified in corpora of written text,
which presumably maintain a high level of coherence. An exception to this pattern is
observed in texts whose coherence is under evaluation and therefore cannot be assumed.
Student essays are a typical example of such texts. Indeed, a study of essays written
by students has shown that an excessive number of Rough-Shift transitions per
paragraph correlates with low essay scores assigned by writing experts.
A closer analysis of the essays reveals that the incoherence detected by a Rough-Shift
measure is not due to violations of Centering's Pronominal Rule or other infelicitous uses
of pronominal forms. The distribution of nominal and pronominal forms over Rough-
Shift transitions reveals that in fact pronominal forms are avoided in Rough-Shift
transitions. This observation indicates that the incoherence found in student essays is not
due to the processing load imposed on the reader to resolve anaphoric references.
Instead, the incoherence in the essays is due to discontinuities caused by introducing a
rapid succession of new, undeveloped topics with no links to the prior discourse. In other
words, Rough-Shifts pick up textual incoherence due to topic discontinuities.
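A Rough-Shift measure of this kind can be sketched in a few lines. The sketch below is illustrative only: it assumes that each paragraph has already been annotated with one transition label per utterance, and it takes the proportion of Rough-Shifts among the defined transitions as the score. The function name rough_shift_rate and the choice of a simple ratio are assumptions made here, not the exact metric used in the study.

```python
from typing import Optional, Sequence


def rough_shift_rate(transitions: Sequence[Optional[str]]) -> float:
    """Return the share of Rough-Shifts among the defined transitions.

    `transitions` holds one label per utterance of a paragraph; None marks
    an utterance whose transition is undefined (e.g. the first utterance).
    The ratio is a hypothetical measure chosen for illustration.
    """
    defined = [t for t in transitions if t is not None]
    if not defined:
        return 0.0  # no transitions to evaluate
    rough = sum(1 for t in defined if t == "Rough-Shift")
    return rough / len(defined)


# A paragraph with four defined transitions, two of them Rough-Shifts.
paragraph = [None, "Continue", "Rough-Shift", "Rough-Shift", "Retain"]
rate = rough_shift_rate(paragraph)
```

A paragraph whose rate exceeds some threshold could then be flagged for the student; any such threshold would have to be set empirically.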
Studies such as the one just described are supportive of the formulation of Centering as a
model of local discourse coherence. They also show that the Centering model can be used
successfully for practical applications, e.g., to improve automated systems of writing
evaluation in testing and education. In fact, it has been shown that adding a
Centering-based metric of coherence to an existing electronic essay scoring system
(the e-rater system developed at the Educational Testing Service) improves the
performance of the system by better approximating human expert scores. In addition, a
Centering-based system of writing evaluation has exceptional pedagogical value,
because the model can direct students' attention to specific locations within an
essay where topic discontinuities occur. It can illuminate broken topic and focus
chains within the text of an essay by drawing the student's attention to the noun
phrases playing the roles of Cbs and Cps. Supplementary instructional comments could
guide the student in revising the relevant section with attention to topic
discontinuities.
Bibliography
Baldwin, B.F. (1995). COGNIAC: a discourse processing engine (Ph.D. thesis).
University of Pennsylvania.
Beaver, D. (2004). ‘The Optimization of Discourse Anaphora.’ Linguistics and
Philosophy 27(1), 3-56.
Brennan, S.E., Friedman, M.W. & Pollard, C.J. (1987). ‘A Centering approach to
pronouns.’ Proceedings of the 25th Annual Meeting of the Association for Computational
Linguistics, Stanford, Calif., 155-162.
Cooreman, A. & Sanford, A. (1996). Focus and Syntactic Subordination in Discourse
(Technical Report). Human Communication Research Center.
Di Eugenio, B. (1996). ‘Centering in Italian’. In Walker, M.A., Joshi, A.K. & Prince, E.F.
(eds.) Centering Theory in Discourse. New York: Oxford University Press. 115-138.
Givón, T. (1983). ‘Topic continuity in discourse: a quantitative cross-language study.’
Topic Continuity in Discourse: An Introduction. Amsterdam: John Benjamins Publishing.
1-42.
Gordon, P.C., Grosz, B.J. & Gilliom, L.A. (1993). ‘Pronouns, names and the Centering of
attention in discourse.’ Cognitive Science, 17(3), 311-347.
Grosz, B.J. (1977). The representation and use of focus in dialogue understanding
(Technical Report No. 151). Menlo Park, Calif.: SRI International.
Grosz, B.J. & Sidner, C.L. (1986). ‘Attention, intentions, and the structure of
discourse.’ Computational Linguistics 12, 175-204.
Grosz, B.J., Joshi, A.K. & Weinstein, S. (1983). ‘Providing a unified account of noun
phrases in discourse.’ Proceedings of the 21st Annual Meeting of the Association for
Computational Linguistics, Cambridge, Mass., 44-50.
Grosz, B.J., Joshi, A.K. & Weinstein, S. (1995). ‘Centering: a framework for modeling
the local coherence of discourse.’ Computational Linguistics 21(2), 203-225.
Hudson-D’Zmura, S.B. (1988). The structure of discourse and anaphor resolution: the
discourse center and the role of nouns and pronouns (Ph.D. thesis). University of
Rochester.
Joshi, A.K. & Kuhn, S. (1979). ‘Centered logic: the role of entity centered sentence
representation in natural language inferencing.’ Proceedings of the 6th International Joint
Conference in Artificial Intelligence, Tokyo, 435-439.
Kehler, A. (1997). ‘Current theories of Centering for pronoun interpretation: a critical
evaluation.’ Computational Linguistics 23(3), 467-475.
Miltsakaki, E. (2004). ‘Not all subjects are born equal: a look at complex sentence
structure.’ The Processing and Acquisition of Reference. Cambridge, MA: MIT Press.
Miltsakaki, E. & Kukich, K. (2004). ‘Evaluation of text coherence for electronic essay
scoring systems.’ Natural Language Engineering 10(1), 25-55.
Miltsakaki, E. (2002). ‘Toward an aposynthesis of topic continuity and intrasentential
anaphora.’ Computational Linguistics 28(3), 319-355.
Poesio, M., Stevenson, R., Di Eugenio, B. & Hitzeman, J. (2004). ‘Centering: a
parametric theory and its instantiations.’ Computational Linguistics 30(3), 309-363.
Prasad, R. & Strube, M. (2000). ‘Discourse salience and pronoun resolution in Hindi.’ In
Williams, A. & Kaiser, E. (eds.) Penn Working Papers in Linguistics: Current Work in
Linguistics 6(3), 189-208.
Prasad, R. (2003). Constraints on the generation of referring expressions: with special
reference to Hindi (Ph.D. thesis). University of Pennsylvania.
Prince, E.F. (1999). ‘Subject pro-drop in Yiddish.’ In Bosch, P. & van der Sandt, R. (eds.)
Focus: Linguistic, Cognitive, and Computational Perspectives. Cambridge:
Cambridge University Press. 82-101.
Rambow, O. (1993). ‘Pragmatic aspects of scrambling and topicalization in German.’
Institute for Research in Cognitive Science Workshop on Centering Theory in Naturally-
Occurring Discourse (Ms.). University of Pennsylvania, May 20-28.
Reinhart, T. (1981). ‘Pragmatics and linguistics: an analysis of sentence topics.’
Philosophica 27(1), 53-94.
Sidner, C.L. (1979). Toward a computational theory of definite anaphora comprehension
in English (Technical Report No. AI-TR-537). Cambridge, Mass.: MIT Press.
Strube, M. & Hahn, U. (1998). ‘Never look back: an alternative to Centering.’
Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics
and the 17th International Conference on Computational Linguistics, Montreal, Quebec,
Canada. 1251-1257.
Suri, L.Z., DeCristofaro, J.D. & McCoy, K.F. (1999). ‘A methodology for extending
focusing frameworks.’ Computational Linguistics 25(2), 173-194.
Turan, U.D. (1995). Null vs. overt subjects in Turkish discourse: a Centering analysis
(Ph.D. thesis). University of Pennsylvania.
Walker, M.A., Iida, M. & Cote, S. (1994). ‘Japanese discourse and the process of
Centering.’ Computational Linguistics 20(2), 193-232.
Walker, M.A., Joshi, A.K. & Prince, E.F. (1998). Centering theory in discourse. New
York: Oxford University Press.
                        Cb(Ui+1) = Cb(Ui) OR Cb(Ui) = [?]    Cb(Ui+1) ≠ Cb(Ui)
Cb(Ui+1) = Cp(Ui+1)     CONTINUE                             SMOOTH-SHIFT
Cb(Ui+1) ≠ Cp(Ui+1)     RETAIN                               ROUGH-SHIFT
Table 1: Centering Transitions
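The two tests that define the Centering transitions (whether the Cb is unchanged or the previous Cb was undefined, and whether the Cb coincides with the Cp) can be written as a small decision function. This is a minimal sketch for illustration; the function name classify_transition and the use of None for an undefined Cb are assumptions made here.

```python
from typing import Optional


def classify_transition(cb_prev: Optional[str],
                        cb_cur: Optional[str],
                        cp_cur: str) -> Optional[str]:
    """Classify the transition into the current utterance U(i+1).

    cb_prev: Cb(Ui), or None if it is undefined
    cb_cur:  Cb(Ui+1), or None if it is undefined
    cp_cur:  Cp(Ui+1)
    Returns None when Cb(Ui+1) is undefined, so no transition applies.
    """
    if cb_cur is None:
        return None
    # First test: Cb(Ui+1) = Cb(Ui), or Cb(Ui) undefined.
    same_cb = cb_prev is None or cb_cur == cb_prev
    # Second test: Cb(Ui+1) = Cp(Ui+1).
    if cb_cur == cp_cur:
        return "Continue" if same_cb else "Smooth-Shift"
    return "Retain" if same_cb else "Rough-Shift"
```

For instance, classify_transition("Terry", "Tony", "Tony") yields "Smooth-Shift", matching the analysis of utterance (3e) in Table 4.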
(1a) John went to his favorite music store to buy a piano.
Cf = {John, store, piano}, Cp = John, Cb = ?, Transition = undef.
(1b) He had frequented the store for many years.
Cf = {John, store}, Cp = John, Cb = John, Transition = Continue
(1c) He was excited that he could finally buy a piano.
Cf = {John, piano}, Cp = John, Cb = John, Transition = Continue
(1d) He arrived just as the store was closing for the day.
Cf = {John, store}, Cp = John, Cb = John, Transition = Continue
Table 2: Centering Analysis for Discourse (1)
(2a) John went to his favorite music store to buy a piano.
Cf = {John, store, piano}, Cp = John, Cb = ?, Transition = undef.
(2b) It was a store John had frequented for many years.
Cf = {store, John}, Cp = store, Cb = John, Transition = Retain
(2c) He was excited that he could finally buy a piano.
Cf = {John, piano}, Cp = John, Cb = John, Transition = Continue
(2d) It was closing just as John arrived.
Cf = {store, John}, Cp = store, Cb = John, Transition = Retain
Table 3: Centering analysis for Discourse (2)
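An analysis of this kind can be reproduced mechanically once the ranked Cf list of each utterance is given. The sketch below is an illustration under stated assumptions: the Cf lists are supplied by hand in ranked order, the Cp is the first element of each list, and the Cb of an utterance is the highest-ranked element of the previous Cf list that is realized in the current utterance. The function names backward_center and analyze are hypothetical.

```python
from typing import List, Optional, Tuple


def backward_center(cf_prev: List[str], cf_cur: List[str]) -> Optional[str]:
    """Highest-ranked entity of the previous Cf list realized in the current one."""
    for entity in cf_prev:
        if entity in cf_cur:
            return entity
    return None


def analyze(cf_lists: List[List[str]]) -> List[Tuple[str, Optional[str], Optional[str]]]:
    """Return (Cp, Cb, transition) for each utterance; None means undefined."""
    rows = []
    cf_prev: Optional[List[str]] = None
    cb_prev: Optional[str] = None
    for cf in cf_lists:
        cp = cf[0]
        cb = backward_center(cf_prev, cf) if cf_prev is not None else None
        if cb is None:
            transition = None
        else:
            same_cb = cb_prev is None or cb == cb_prev
            if cb == cp:
                transition = "Continue" if same_cb else "Smooth-Shift"
            else:
                transition = "Retain" if same_cb else "Rough-Shift"
        rows.append((cp, cb, transition))
        cf_prev, cb_prev = cf, cb
    return rows


# Ranked Cf lists for utterances (2a)-(2d) of Discourse (2).
discourse_2 = [
    ["John", "store", "piano"],
    ["store", "John"],
    ["John", "piano"],
    ["store", "John"],
]
rows = analyze(discourse_2)
```

With these inputs, the sequence of transitions is undefined, Retain, Continue, Retain, as in the analysis of Discourse (2).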
(3a) Terry really goofs sometimes.
Cf = {Terry}, Cp = Terry, Cb = ?, Transition = undef.
(3b) Yesterday was a beautiful day and he was excited about trying out his new sailboat.
Cf = {Terry, sailboat}, Cp = Terry, Cb = Terry, Transition = Continue
(3c) He wanted Tony to join him in a sailing expedition.
Cf = {Terry, Tony, expedition}, Cp = Terry, Cb = Terry, Transition = Continue
(3d) He called him at 6 A.M.
Cf = {Terry, Tony}, Cp = Terry, Cb = Terry, Transition = Continue
(3e) He was sick and furious at being woken up so early.
Cf = {Tony}, Cp = Tony, Cb = Tony, Transition = Smooth-shift
(3e’) Tony was sick and furious at being woken up so early.
Cf = {Tony}, Cp = Tony, Cb = Tony, Transition = Smooth-shift
(3f) He told Terry to get lost and hung up.
Cf = {Tony, Terry}, Cp = Tony, Cb = Tony, Transition = Continue
(3g) Of course, he hadn’t intended to upset Tony.
Cf = {Terry, Tony}, Cp = Terry, Cb = Tony, Transition = Retain
(3g’) Of course Terry hadn’t intended to upset Tony.
Cf = {Terry, Tony}, Cp = Terry, Cb = Tony, Transition = Retain
(3g’’) Of course, Terry hadn’t intended to upset him.
Cf = {Terry, Tony}, Cp = Terry, Cb = Tony, Transition = Retain
Table 4: Centering analysis for Discourse (3)
KEYWORDS:
Anaphora resolution
Pronoun resolution
Centering
Discourse
Discourse structure
Linguistics
Pragmatics
Processing complexity
Inference
Topic
Coherence
Referring expressions
Discourse salience
Utterance
Complex sentences
Attentional state
... There is a plethora of theories on anaphora and pronoun interpretation. One prominent theory is Centering Theory (CT) (Grosz et al., 1995;Johshi & Prasad, 2006), which states that particular antecedents are more central in discourse than others. The ranking of antecedents is defined by morphogrammatical properties, particularly grammatical role (Kehler & Rohde, 2013). ...
Article
Full-text available
Coreference processing of Control constructions and their pronoun-containing counterparts can be studied experimentally using priming or interference paradigms. We replicate findings in a priming study on non-finite Control constructions in Norwegian (Larsen & Johansson, 2020) and contrast them with their finite counterparts using interference effects in a grammatical maze (G-maze) design. We asked participants to read sentences word-byword and to select the grammatically correct continuation from two options. When the ungrammatical option was a potential antecedent from within the sentence, we predicted interference, i.e., longer reaction times compared to an unrelated baseline. We observed a trend towards significant interference effects when a participant was presented with either of the potential noun phrase (NP) antecedents of PRO in competition with the infinitive marker (test position zero) during the processing of a Control sentence. This indicates reactivation of potential antecedents at the infinitive marker, and a reactivation position (PRO) near or at the infinitive marker. We also observed significant differences between Control constructions and their pronoun counterparts. A significant interference effect was recorded for Subject Pronoun constructions when either potential NP antecedent of the pronoun was presented in competition with the pronoun itself. A similar trend was recorded for Object Pronoun sentences.
... Even if one encounters a case of syntactic ambiguity where more than one syntactic structure can be ascribed to the same sentence, all the possible syntactic analyses should be generated by that program (Pezik,2011:441). One famous example of syntactic ambiguity is called the garden path sentence, a phenomenon which occurs " when we process the rest of the sentence and find something that contradicts our assumption (Joshi et al, 2006:225): ...
Book
Full-text available
This book presents an investigation of the potentialities of employing computational linguistics and statistics in resolving cases of disputed authorship and suspected plagiarism. Much has been written about the ways of addressing doubtful texts with no clear clues of real authorship. it is no use to reduplicate what has already been mentioned. Therefore, some major questions are posed related to thee viability of certain computational programs (WordSmith Tools, Version 5.0), and statistical techniques (Principal Components Analysis, and Cluster Analysis) involved in the statistical package (SPSS Version 14.0). Corpora of various genres are tackled in English and Arabic so that some general premises might be attained. The researcher observes the statistical behavior of one particular feature to settle down authorship and plagiarism problems. This behavior is related to the way function words behave across different textual corpora. Based on the results of this study, the researcher figures out the overlapping aspects that bring plagiarism detection and authorship attribution together in one basket. Suspected cases of plagiarism can be interpreted as special cases of disputed or misattributed authorship.
...  Richard Evans and Constantin Orasan improved anaphora resolution by identifying animate entities in texts [4]. centering theory for pronoun resolution [8].  Dev Bahadur using Lappin Leass approach pronominal anaphora is resolved in Nepali Language [9]. ...
Article
Full-text available
Anaphora resolution is complex problem in linguistics and has attracted the attention of many researchers. It is the problem of identifying referents in the discourse. Anaphora Resolution plays an important role in Natural language processing task. This paper completely emphasis on pronominal anaphora resolution for English Language in which pronouns refers to the intended noun in discourse. In this paper two computational models are proposed for anaphora resolution. Resolution of anaphora is based on various factors among which these models use Recency factor and Animistic Knowledge. Recency factor is implemented by using Lappin Leass approach in first model and using Centering approach in second model. Information about animacy is obtained by Gazetteer method. The identification of animistic elements is employed to improve the accuracy of the system. This paper demonstrates experiment conducted by both the models on different data sets from different domains. A comparative result of both the model is summarized and conclusion is drawn for the best suitable model.
... Other studies (Hobbs, 1985; Fox, 1987; Kibrik, 1996; Kehler, 2002) underlined the contribution of the semantic structure of discourse, including the hierarchical structure. Several models of discourse-semantic relations have been proposed in the recent decades (see Hobbs, 1985; Polanyi, 1985; Wolf et al., 2003; Miltsakaki et al., 2004; Joshi et al., 2006, i.a.), one of the best known being Rhetorical Structure Theory (RST) (Mann and Thompson, 1987; Taboada and Mann, 2006). RST represents text as a hierarchical structure, in which each node corresponds to an elementary discourse unit (EDU), roughly equaling a clause. ...
Article
Full-text available
We report a study of referential choice in discourse production, understood as the choice between various types of referential devices, such as pronouns and full noun phrases. Our goal is to predict referential choice, and to explore to what extent such prediction is possible. Our approach to referential choice includes a cognitively informed theoretical component, corpus analysis, machine learning methods and experimentation with human participants. Machine learning algorithms make use of 25 factors, including referent?s properties (such as animacy and protagonism), the distance between a referential expression and its antecedent, the antecedent?s syntactic role, and so on. Having found the predictions of our algorithm to coincide with the original almost 90% of the time, we hypothesized that fully accurate prediction is not possible because, in many situations, more than one referential option is available. This hypothesis was supported by an experimental study, in which participants answered questions about either the original text in the corpus, or about a text modified in accordance with the algorithm?s prediction. Proportions of correct answers to these questions, as well as participants? rating of the questions? difficulty, suggested that divergences between the algorithm?s prediction and the original referential device in the corpus occur overwhelmingly in situations where the referential choice is not categorical.
... In relation to the first of these, Centering Theory (Grosz, Joshi, & Weinstein, 1995) holds that the set of referents in an utterance have a ranked order of attentional focus and that this order is based on structural properties of the utterance. The relevant properties vary cross-linguistically, but for English it appears that grammatical role is central (Joshi, Prasad, & Miltsakaki, 2006). The basic hierarchy for English is generally considered to be subject followed by direct object, indirect object, and adjunct (Beaver, 2004;Brennan, Friedman, & Pollard, 1987;Grosz et al., 1995;but cf. ...
Article
Full-text available
The tendency of intermediate and advanced second language speakers to underuse pronouns and zero anaphora has been characterized as a developmental stage of overexplicitness, yet little consideration has been given to whether learners create sufficient contexts for their use. This study analyzed references across eight degrees of accessibility, revealing that this did not account for infrequent pronoun use by Chinese learners of English. Further analysis revealed that participants were seldom overexplicit when referring to highly accessible individuals, particularly those that represented continued topics, but were significantly more likely than native speakers to use lexical noun phrases elsewhere, particularly for main characters. This is discussed in relation to a possible role of overexplicitness as a clarity-based communication strategy. http://goo.gl/yHvhT6
... Результаты более ранних исследований (Fox, 1987; Hobbs, 1985; Kehler, 2002; Kibrik, 1996) указывают на наличие зависимости между дискурсивной структурой и референциальным выбором. Наиболее логично среди различных моделей семантико-дискурсивной структуры текста (например, Hobbs, 1985; Joshi, Prasad & Miltsakaki, 2006; Miltsakaki, Prasad, Joshi & Webber, 2004; Polanyi, 1985; Wolf & Gibson, 2003) иерархическую организацию дискурса описывает ТРС. Выбор RST Discourse Treebank обоснован наличием в нем аннотации иерархической структуры, которая требует специальной подготовки аннотаторов и значительных затрат времени. ...
Conference Paper
Full-text available
В статье обсуждаются различные способы оптимизации системы, мо- делирующей референциальный выбор (РВ) на основе аннотирован- ного корпуса с использованием машинного обучения. Аннотационная схема, использовавшаяся в наших более ранних исследованиях, была улучшена и расширена. На следующем этапе был имплементирован более «дешевый» набор параметров с целью сокращения времени обработки и трудозатратности аннотации. Наши результаты свиде- тельствуют о том, что, несмотря на возможность исключения наибо- лее «дорогих» факторов при моделировании РВ, лучшая аккуратность предсказания достижима только при использовании максимального количества доступной информации. Жанровая принадлежность тек- стов была введена в систему в качестве одного из параметров и послу- жила повышению показателя аккуратности. И наконец, была запущена серия психолингвистических экспериментов по изучению категорич- ности выбора, совершаемого говорящими/пишущими. Первые полу- ченные нами результаты оказались многообещающими: они показали, что в случаях, в которых системе не удается дать однозначное пред- сказание, согласно человеческой оценке, возможно с равной вероят- ность использование более одного референциального средства.
... Gordon and his colleagues (Gordon et al., 1993;Gordon and Hendrick, 1998;Camblin et al., 2007) found longer reading times for proper nouns compared to pronouns in subject position of S2, which they explained in the context of Centering Theory (Grosz et al., 1995, Joshi et al., 2006. According to Centering Theory, local discourse coherence is maintained via the mechanisms of forward-looking and backward-looking centers. ...
Article
Full-text available
Two hundred participants, 50 in each of 4 age ranges (19-29 years, 30-49 years, 50-69 years, 70-90 years) were tested for short-term working memory, speed of processing, and online processing of 3 types of sentences in which an initially assigned syntactic structure and/or semantic interpretation had to be revised. Self-paced reading times were longer for the segments that signaled the need for revision; there also were interactions of age and sentence type and speed of processing and sentence type, but not of working memory and sentence type on reading times for these segments. The results provide evidence that working memory does not support the processes that revise the structure and interpretation of sentences and discourse. (PsycINFO Database Record (c) 2014 APA, all rights reserved).
Article
Full-text available
Centering theory (CT) has been adopted in analyzing 84 zero anaphors in 50 informative texts. It is found that most zero anaphors occur in Continuation state both in English texts (source texts (ST)) and Thai translation (target texts (TT)). Zero anaphors in the TT outnumber those in ST and are found in more environments. In terms of translation, most zero anaphors in source texts remain in the same form in the target texts although some items are translated into different anaphor forms. Results indicate that zero anaphor is used to keep discourse coherence and to refer to the backward-looking center (Cb) of current utterances in both languages. Therefore, most zero anaphors in source texts are translated into zero anaphors in target texts when the CT transition state of utterances in source texts and target texts is Continuation, and are translated into other anaphors when the CT transition state in source texts is changed to another transition state in the target texts. Constraints in translation of zero anaphors can be explained in terms of anaphor interpretation, salience of entities, syntactic constraint, and naturalness of translation. However, this paper focuses only on one type of anaphor, namely subject zero anaphor; investigation of other types of anaphor will reveal other discrepancies in using and translating anaphors from this language pair.
Article
The aim of this paper is to take a look at discourse structure from the standpoint of pronominal anaphora processing and socalled 'accessibility domains'. The core hypothesis of the paper is that attention-based anaphora interpretation models like Focus Theory or Centering Theory can be utilized in a more satisfying way if discourse is considered as a bundle of concurrent, interacting processes. Elaborating on this hypothesis, in the paper a central role is played by various notions borrowed from non-linear phonological frameworks.
Article
This book studies how people refer to entities in natural discourse. It contributes to the understanding of both linguistic diversity and the cognitive underpinnings of language and it provides a framework for further research in both fields. This book focuses on the way specific entities are mentioned in natural discourse, during which about every third word usually depends on referential choice. It considers reference as an overt representation of underlying cognitive processes and combines a theoretically-oriented cognitive approach with empirically-based cross-linguistic analysis. It begins by introducing the cognitive approach to discourse analysis and by examining the relationship between discourse studies and linguistic typology. The book discusses reference as a linguistic phenomenon, in connection with the traditional notions of deixis, anaphora, givenness, and topicality, and describes the way its theoretical approach is centred on notions of referent activation in working memory. The book argues that the speaker is responsible for the shape of discourse and that referential expressions should be understood as choices made by speakers rather than as puzzles to be solved by addressees. It examines the cross-linguistic aspects of reference and the typology of referential devices, including referring expressions per se, such as free and bound pronouns, and referential aids that help to tell apart the concurrently activated entities. This discussion is based on the data from about 200 languages from around the world. The book then proposes a comprehensive model of referential choice, in which it draws on concepts from cognitive linguistics, psycholinguistics, cognitive psychology, and cognitive neuroscience, and applies this to Russian and English. The book also draws together empirical analyses in order to examine what light the analysis of discourse can shed on the way information is processed in working memory. 
The final part of the book offers a wider perspective, including deixis, referential aspects of gesticulation and signed languages.
Article
Full-text available
This paper explores the correlation between centering and different forms of pronominal reference in Italian, in particular zeros and overt pronouns in subject position. Such correlations, that I had proposed in earlier work (COLING 90), are verified through the analysis of a corpus of naturally occurring texts. In the process, I extend my previous analysis in several ways, for example by taking possessives and subordinates into account. I also provide a more detailed analysis of the "continue" transition: more specifically, I show that pronouns are used in a markedly different way in a "continue" preceded by another "continue" or by a "shift", and in a "continue" preceded by a "retain".
Article
Full-text available
Article
Full-text available
Existing software systems for automated essay scoring can provide NLP researchers with opportunities to test certain theoretical hypotheses, including some derived from Centering Theory. In this study we employ the Educational Testing Service's e-rater essay scoring system to examine whether local discourse coherence, as defined by a measure of Centering Theory's Rough-Shift transitions, might be a significant contributor to the evaluation of essays. Rough-Shifts within students' paragraphs often occur when topics are short-lived and unconnected, and are therefore indicative of poor topic development. We show that adding the Rough-Shift based metric to the system improves its performance significantly, better approximating human scores and providing the capability of valuable instructional feedback to the student. These results indicate that Rough-Shifts do indeed capture a source of incoherence, one that has not been closely examined in the Centering literature. They not only justify Rough-Shifts as a valid transition type, but they also support the original formulation of Centering as a measure of discourse continuity even in pronominal-free text. Finally, our study design, which used a combination of automated and manual NLP techniques, highlights specific areas of NLP research and development needed for engineering practical applications.
Conference Paper
Full-text available
This paper focuses on particular phenomena of this sort-the use of various referring expressions such as def'mite noun phrases and pronouns-and examines their interaction with mechanisms used to maintain discourse coherence
Article
Full-text available
Centering theory is the best-known framework for theorizing about local coherence and salience; however, its claims are articulated in terms of notions which are only partially specified, such as "utterance," "realization," or "ranking." A great deal of research has attempted to arrive at more detailed specifications of these parameters of the theory; as a result, the claims of centering can be instantiated in many different ways. We investigated in a systematic fashion the effect on the theory's claims of these different ways of setting the parameters. Doing this required, first of all, clarifying what the theory's claims are (one of our conclusions being that what has become known as "Constraint 1" is actually a central claim of the theory). Secondly, we had to clearly identify these parametric aspects: For example, we argue that the notion of "pronoun" used in Rule 1 should be considered a parameter. Thirdly, we had to find appropriate methods for evaluating these claims. We found that while the theory's main claim about salience and pronominalization, Rule 1—a preference for pronominalizing the backward-looking center (CB)—is verified with most instantiations, Constraint 1–a claim about (entity) coherence and CB uniqueness—is much more instantiation-dependent: It is not verified if the parameters are instantiated according to very mainstream views ("vanilla instantiation"), it holds only if indirect realization is allowed, and is violated by between 20% and 25% of utterances in our corpus even with the most favorable instantiations. We also found a trade-off between Rule 1, on the one hand, and Constraint 1 and Rule 2, on the other: Setting the parameters to minimize the violations of local coherence leads to increased violations of salience, and vice versa. Our results suggest that "entity" coherence—continuous reference to the same entities—must be supplemented at least by an account of relational coherence.
Article
Full-text available
The problem of proposing referents for anaphoric expressions has been extensively researched in the literature and significant insights have been gained through the various approaches. However, no single model is capable of handling all the cases. We argue that this is due to a failure of the models to identify two distinct processes. Drawing on current insights and empirical data from various languages we propose an aposynthetic ¹ model of discourse in which topic continuity, computed across units, and focusing preferences internal to these units are subject to different mechanisms. The observed focusing preferences across the units (i.e., intersententially) are best modeled structurally, along the lines suggested in centering theory. The focusing mechanism within the unit is subject to preferences projected by the semantics of the verbs and the connectives in the unit as suggested in semantic/pragmatic focusing accounts. We show that this distinction not only overcomes important problems in anaphora resolution but also reconciles seemingly contradictory experimental results reported in the literature. We specify a model of anaphora resolution that interleaves the two mechanisms. We test the central hypotheses of the proposed model with an experimental study in English and a corpus-based study in Greek. 1 “Aposynthesis” is a Greek word that means “decomposition,” that is, pulling apart the components that constitute what appears to be a uniform entity.
Article
Full-text available
In this paper the Centering model of anaphora resolution and discourse coherence (Grosz, Joshi and Weinstein, 1983, 1995) is reformulated in terms of Optimality Theory (ot) (Prince and Smolensky 1993). One version of the reformulated model is proven to be descriptively equivalent to an earlier algorithmic statement of Centering due to Brennan, Friedman and Pollard (1987). However, the new model is stated declaratively, and makes clearer the status of the various constraints used in the theory. In the second part of the paper, the model is extended, demonstrating the advantages of the ot reformulation, and capturing formally ideas originally described by Grosz, Joshi and Weinstein. Three new applications of the extended ot Centering model are described: generation of linguistic forms from meanings, the evaluation and optimization of extended texts, and the interpretation of accented pronouns.
This paper has three aims: (1) to generalize a computational account of the discourse process called CENTERING, (2) to apply this account to discourse processing in Japanese so that it can be used in computational systems for machine translation or language understanding, and (3) to provide some insights on the effect of syntactic factors in Japanese on discourse interpretation. We argue that while discourse interpretation is an inferential process, syntactic cues constrain this process, and demonstrate this argument with respect to the interpretation of
We address the problem of how to develop and assess algorithms for tracking local focus and for proposing referents of pronouns. Previous focusing research has not adequately addressed the processing of complex sentences. We discuss issues involved in processing complex sentences and review a methodology used by other researchers to develop their focusing frameworks. We identify difficulties with that methodology and difficulties with using a corpus analysis to extend focusing frameworks to handle complex sentences. We introduce a new methodology for extending focusing frameworks, which involves two steps. In the first step, a set of systematically constructed texts is used to identify an extension of the focusing framework to handle a particular kind of complex sentence. In the second step, a corpus analysis is used to confirm the extension. We explain how our methodology overcomes the difficulties faced by other approaches.
Centering theory, developed within computational linguistics, provides an account of ways in which patterns of interutterance reference can promote the local coherence of discourse. It states that each utterance in a coherent discourse segment contains a single semantic entity—the backward-looking center—that provides a link to the previous utterance, and an ordered set of entities—the forward-looking centers—that offer potential links to the next utterance. We report five reading-time experiments that test predictions of this theory with respect to the conditions under which it is preferable to realize (refer to) an entity using a pronoun rather than a repeated definite description or name. The experiments show that there is a single backward-looking center that is preferentially realized as a pronoun, and that the backward-looking center is typically realized as the grammatical subject of the utterance. They also provide evidence that there is a set of forward-looking centers that is ranked in terms of prominence, and that a key factor in determining prominence—surface-initial position—does not affect determination of the backward-looking center. This provides evidence for the dissociation of the coherence processes of looking backward and looking forward.
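The relation between the two kinds of centers described above can be illustrated with a minimal Python sketch. It assumes the standard formulation in which the backward-looking center of an utterance is the highest-ranked forward-looking center of the previous utterance that is realized again in the current one. The names `Utterance` and `backward_looking_center` are hypothetical, entities are simplified to plain strings, and the ranking of the forward-looking centers is taken as given rather than computed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Utterance:
    # Forward-looking centers, ordered from most to least prominent.
    # The ranking itself (e.g. by grammatical role) is assumed as input.
    cf: List[str]


def backward_looking_center(prev: Utterance, curr: Utterance) -> Optional[str]:
    """Return the highest-ranked entity of prev.cf that also appears in curr.cf.

    Returns None when the two utterances share no entity, i.e. when there
    is no link back to the previous utterance.
    """
    realized = set(curr.cf)
    for entity in prev.cf:  # iterate in prominence order
        if entity in realized:
            return entity
    return None


# Illustrative two-utterance segment (hypothetical example data):
# U1: "Susan gave Betsy a hamster."  U2: "She reminded her to feed it."
u1 = Utterance(cf=["Susan", "Betsy", "hamster"])
u2 = Utterance(cf=["Susan", "Betsy", "hamster"])
u3 = Utterance(cf=["weather"])

cb_u2 = backward_looking_center(u1, u2)
cb_u3 = backward_looking_center(u2, u3)
```

Here `cb_u2` is the single entity linking U2 back to U1, while `cb_u3` is `None` because U3 realizes nothing from U2. Resolving the pronouns themselves is outside the scope of this sketch.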