Anaphora Resolution: A Centering Approach
Aravind K. Joshi (joshi@linc.cis.upenn.edu)
Department of Computer and Information Science, and
Institute for Research in Cognitive Science,
University of Pennsylvania,
Philadelphia, PA 19104, U.S.A.
Rashmi Prasad (rjprasad@linc.cis.upenn.edu)
Department of Linguistics, and
Institute for Research in Cognitive Science,
University of Pennsylvania,
Philadelphia, PA 19104, U.S.A.
Eleni Miltsakaki (elenimi@linc.cis.upenn.edu)
Institute for Research in Cognitive Science,
University of Pennsylvania,
Philadelphia, PA 19104, U.S.A.
Abstract
We start by describing the problem of anaphora resolution and discuss approaches to
modeling this problem. Centering Theory (CT), which is an approach to modeling certain
aspects of local coherence in discourse, includes within it the component that models
anaphora resolution. However, CT itself is not a theory of anaphora resolution. It was
developed as part of a theory of local coherence. Subsequently many researchers have
attempted to use CT or some modified versions of CT for anaphora resolution. This has
led to some very interesting work but also raised issues and questions as to what CT is
about. We attempt to clarify some of these issues.
1. Anaphora Resolution with Centers of Attention
Anaphora resolution in discourse - a coherent sequence of utterances - is the task or
process of identifying the referents of expressions which we use to denote discourse
entities, i.e., objects, individuals, properties and relations that have been introduced and
talked about in the prior discourse. The importance of modeling this process cannot be
overstated. Computing the meaning of a discourse is commonly understood as partly the
process of connecting the information in the upcoming utterance with the information
contained in the prior discourse. Before we can do this, however, we need to assign an
interpretation to all the elements of the utterance and then to the utterance as a whole. In
many cases, the interpretation of some elements in the sentence can only be assigned
relative to the prior discourse context – anaphoric expressions comprise one such class of
elements.
In early approaches to anaphoric reference in AI and linguistics, the task of anaphora
resolution was relegated to syntax, which provided filters such as grammatical agreement
constraints, and open-ended semantic inference that drew on, among other things, world
knowledge and inference procedures to identify the appropriate referent. However, it was
soon recognized that while syntactic constraints were very limited in constraining the
search for anaphoric referents on the one hand, the mechanism of open-ended semantic
inference, on the other hand, was too knowledge intensive and complex - requiring
reasoning over the entire space of discourse at once - and therefore computationally
unfeasible.
In 1977, a different view of anaphora resolution arose out of the work of Barbara Grosz
(Grosz, 1977) which rests on a fundamental and singularly important assumption
regarding the attentional status of discourse entities: at any given point of the discourse,
the discourse participants’ attention is centered on a set of entities, a proper subset of all
the entities being talked about in the discourse. Furthermore, for a given utterance, the
discourse participants’ attention is centered on a singleton entity, and the rest of the
utterance makes a predication about this entity. The notion of the center of attention
specific to utterances is very similar to the notion of “topic” in linguistics, where it is
defined as what is “talked about” in the utterance. The approach for anaphora resolution
with this centering view is that the search for the referents of anaphoric expressions
should be restricted to the set of centered entities, the assumption being that in discourse,
it is these entities that we are most likely to continue to talk about and refer to with the
use of anaphoric expressions. Furthermore, a partial ordering is imposed on the elements
of the set, so that some entities are more centered than others. Such a preference ordering
on the possible candidate referents for anaphoric expressions significantly simplifies the
“nature” of inference that would be needed and at the same time minimizes the “amount”
of inference. Another significant proposal was that the set of centered entities can be
partially determined by the linguistic structure of the utterance itself. The consequences
of these ideas were tremendous because they meant that it was possible to set aside, to a
significant extent, the role of open-ended inferencing for anaphora resolution and look
instead to more easily identifiable surface features of the utterance as the solution and
explanation for at least part of the problem.
While Grosz laid out the general framework for the Centering process, her work did not
suggest the exact mechanisms whereby the centered entities could be identified. In 1979,
Candy Sidner extended Grosz’s framework by precisely defining the notion of the
utterance-based center linguistically and also provided a mechanism for using centers to
identify referents of pronouns.
Sidner invoked several Centering structures - singleton sets called “discourse focus” and
“actor focus”, and a set called the “potential foci” which can contain one or more
elements. The “discourse focus” is equivalent to the center of the utterance, i.e., the entity
about which some predication is made by the utterance. The “actor focus” is the discourse
entity that is predicated as the agent of the event in the utterance. The “discourse focus” is
identified using a set of rules that refer to the linguistic structure of the utterance as well
as the state of the existing data structures when the utterance containing the pronoun is
processed. A referent for a pronoun is identified primarily with the actor focus or the
discourse focus, unless it is ruled out by some specified criteria, in which case an
alternate candidate referent is considered from the set of “potential foci”, which contains
entities other than the two primary foci. A significant aspect of Sidner’s work is that she
does not rule out the role of inference in pronoun interpretation, but instead only
constrains it in nature and amount. The nature of inference needed is different from
earlier open-ended inference systems because it only involves checking for contradictions
once a candidate referent is chosen using the structurally determined preference ordering:
this allows for a much simpler knowledge base and reasoning procedures. The amount of
inference needed is also reduced because of the preference ordering, so that as soon as an
entity is identified for which no contradictions arise, no other inferencing is needed.
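The preference-ordered search described above can be illustrated with a minimal Python sketch. It is not Sidner's actual algorithm: the function name `resolve_pronoun`, the `contradicts` callback standing in for the inference component, and the choice to try the actor focus first for agent pronouns are all assumptions made for illustration.

```python
from typing import Callable, List, Optional


def resolve_pronoun(
    actor_focus: Optional[str],
    discourse_focus: Optional[str],
    potential_foci: List[str],
    contradicts: Callable[[str], bool],
    pronoun_is_agent: bool = False,
) -> Optional[str]:
    """Return the first candidate referent for which no contradiction is found."""
    # Assumption: agent pronouns try the actor focus first, others the discourse focus.
    if pronoun_is_agent:
        primary = [actor_focus, discourse_focus]
    else:
        primary = [discourse_focus, actor_focus]
    candidates: List[str] = []
    for c in primary + list(potential_foci):
        if c is not None and c not in candidates:
            candidates.append(c)
    for c in candidates:
        # Inference is limited to a contradiction check on one candidate at a time.
        if not contradicts(c):
            return c  # stop as soon as a consistent candidate is found
    return None


# Hypothetical state before "He was sick and furious at being woken up so early."
checked: List[str] = []


def toy_contradicts(candidate: str) -> bool:
    checked.append(candidate)
    return candidate == "Terry"  # toy knowledge: Terry made the call, so he was not woken up


referent = resolve_pronoun("Terry", "Terry", ["Tony"], toy_contradicts, pronoun_is_agent=False)
print(referent, checked)
```

In this toy run only two contradiction checks are made, one per candidate in preference order, which is the sense in which the ordering limits the amount of inference.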
2. Centering Theory: Modeling Local Coherence with Centers of Attention
Centering Theory arose from the work of Aravind Joshi and Steve Kuhn in 1979 (Joshi
and Kuhn, 1979), where the concepts of the “center” and “Centering” were first
introduced as a way to specify an almost monadic calculus approach to discourse
interpretation. Joshi and Kuhn showed that inferences of a certain class are more easily
computed by using a monadic representation for utterances. However, they were also
interested in computing the difficulty of deriving the necessary inferences.
While not explicitly stated by Joshi and Kuhn, the Centering process was assumed to be a
local phenomenon operating over successive utterances. In the meantime, Grosz’s work
on global and local discourse processing had also been formalized by Grosz and Sidner
(Grosz and Sidner, 1986) and it was possible to place CT in its proper place in a complete
theory of discourse processing. Grosz and Sidner provided a framework for discourse
structure as a composite of three interacting constituents: a linguistic structure, an
intentional structure, and an attentional state. The linguistic structure is determined by the
intentional structure and comprises the utterances of the discourse grouped together
hierarchically into discourse segments. The attentional state is an abstraction of the
discourse participants’ center of attention as the discourse unfolds. Each discourse
segment is associated with a fixed attentional state relevant to the overall discourse – the
global attentional state. A local attentional state is associated with each utterance within
the segment. The local attentional state is inherently dynamic and can remain constant or
change from utterance to utterance within the segment.
Centering Theory (Grosz, Joshi and Weinstein, 1983; 1986; 1995) was proposed as a
model of the local attentional state, i.e., of the dynamic attentional state within the
discourse segment. Following up on the concerns of Joshi and Kuhn, it explicates more
clearly and formally the particular linguistic and attentional state factors that contribute to
the ease or difficulty in interpreting a discourse segment. The notion of inferential
complexity or difficulty was recast in terms of “coherence”. The first factor that
contributes to coherence is given as a further explication of Joshi and Kuhn’s “change of
center” rule, and accounts for the difference in coherence between the following two
discourse segments:
1. (a) John went to his favorite music store to buy a piano.
(b) He had frequented the store for many years.
(c) He was excited that he could finally buy a piano.
(d) He arrived just as the store was closing for the day.
2. (a) John went to his favorite music store to buy a piano.
(b) It was a store John had frequented for many years.
(c) He was excited that he could finally buy a piano.
(d) It was closing just as John arrived.
Discourse (1) is intuitively more coherent than discourse (2). This difference may be seen
to arise from the number of changes in the center. Discourse (1) centers a single
individual ‘John’, describing various actions he took and his reactions to them. In
contrast, discourse (2) seems to flip back and forth between ‘John’ and ‘the store’. These
“changes in aboutness” or “changes of centers” make discourse (2) less coherent than
discourse (1).
The second observation that CT captures with discourses (1) and (2) establishes the
correlation of center changes and the degree of coherence with the linguistic form of the
utterances. Both discourses convey the same information, but in different ways. They
differ not in content or what is said, but in expression or how it is said. The variation in
“changes of attentional state” that they exhibit arises from different choices of the way in
which they express the same propositional content. The different linguistic choices
further engender different inference demands on the hearer or reader, and these
differences in inference load underlie certain differences in coherence between them.
In addition to the different linguistic choices pertaining to the realization of the
propositional content of the utterance as a whole, CT also identifies different linguistic
choices made for realizing particular elements within the propositional content of the
utterance. These are choices in referring expression form. Pronouns and definite
descriptions are not equivalent with respect to their effect on coherence. CT characterizes
the perceived coherence of the use of pronouns and definite descriptions by relating
different choices to the inferences they require the hearer or reader to make. The
following variations of a discourse illustrate this relationship:
3. (a) Terry really goofs sometimes.
(b) Yesterday was a beautiful day and he was excited about trying
out his new sailboat.
(c) He wanted Tony to join him on a sailing expedition.
(d) He called him at 6 A.M.
(e) He was sick and furious at being woken up so early.
(e’) Tony was sick and furious at being woken up so early.
(f) He told Terry to get lost and hung up.
(g) Of course, he hadn’t intended to upset Tony.
(g’) Of course, Terry hadn’t intended to upset Tony.
(g’’) Of course, Terry hadn’t intended to upset him.
In discourse (3), it is the use of the pronoun in utterance (3e) that is in question. While we
can tell that the pronoun He refers to ‘Tony’, the use of the pronoun here is potentially
confusing. CT claims that this is because, until utterance (3d), ‘Terry’ has been the
“center of attention”, and therefore the most likely referent of the pronoun. This claim
rests on the assumption that hearers expect speakers to continue talking about the entity
that is in the “center of attention”. The confusion therefore results because we tend to
assign the reference of the pronoun to the center of attention as soon as we encounter it
but have to backtrack (a phenomenon called “garden-path”) when we process the rest of
the sentence and find something that contradicts our assumption. In this particular
example, we backtrack when we get to the word sick and from the prior utterances in the
discourse, reason that it must be ‘Tony’ and not ‘Terry’ who is sick. As the careful reader
will have noticed, the assumed preferences for determining the referents of pronouns in
CT are reminiscent of Sidner’s model. We will return to this comparison at the end of this
section where we discuss the relation between anaphora resolution and Centering theory.
The confusion arising from (3e) is removed if the pronoun is replaced with the full noun
phrase ‘Tony’ as shown in (3e’). The conjecture in CT, therefore, is that when the center
of attention shifts to another entity, the form of referring expression used to denote the
new centered entity has consequences for the processing load required for interpreting the
utterance. A pronoun used to refer to the new centered entity increases the processing
load because it causes backtracking from the interpretation of the old centered entity and
thus from the interpretation of the utterance itself. A full noun phrase on the other hand
shifts the center of attention before the rest of the utterance is processed and therefore
entails less processing.
The three variants (3g), (3g’) and (3g’’) provide an illustration of yet another type of
difference in coherence due to the form of referring expression. This arises when multiple
entities are talked about from one utterance to the next. By the time (3f) is processed, the
center has shifted from ‘Terry’ to ‘Tony’, so that in (3g), we expect ‘Tony’ to be the
center of attention. This expectation is borne out in (3g) since ‘Tony’ is indeed mentioned
again. However, what makes this sentence very odd and hard to process is that ‘Terry’ is
also mentioned in (3g), but while the centered ‘Tony’ is referred to with a full noun
phrase, the non-centered ‘Terry’ is referred to with a pronoun. This increased processing
load is reduced when a full noun phrase is used for ‘Terry’ instead of the pronoun, as in (3g’)
or (3g’’), so that we are able to shift the center before processing the rest of the utterance,
thus avoiding any backtracking. The type of coherence variation found in these utterances
is due to the fact that both the centered entity in (3f) as well as another entity are
mentioned again in (3g) and its variants, but in (3g), it is the non-centered entity from (3f)
that is referred to with a pronoun.
CT provides a set of definitions, constraints and rules to formalize the three-way
relationship discussed above, i.e., the relationship between attentional state, the degree of
coherence and linguistic form (for the realization of full propositional content as well as
for the realization of discourse entities). The CT definitions, constraints and rules are
given below.
Definitions:
(D1.) Each utterance U in a discourse segment is assigned a set of forward-looking
centers, Cf (U), where centers are discourse entities realized in the utterance.
(D2.) Each utterance other than the segment-initial utterance is assigned a single
backward-looking center, Cb (U).
(D3.) The backward-looking center of utterance Un+1 connects with one of the forward-
looking centers of Un.
(D4.) The elements of Cf (Un) are partially ordered to reflect relative prominence or
salience in Un. In English, the Cf is ordered according to grammatical role.
(D5.) The more highly ranked an element of Cf (Un), the more likely it is to be Cb(Un+1).
(D6.) The most highly ranked element of Cf (Un) is called the preferred center, Cp (Un).
(D7.) A transition relation holds between each utterance pair Un and Un+1 in a segment.
There are four types of transitions, which describe center continuation, center retention,
and two types of center shifting. The transitions are shown in Table 1.
Constraints:
(C1.) There is precisely one backward-looking center Cb (Un).
(C2.) Cb (Un+1) is the highest ranked element of Cf (Un) that is realized in Un+1.
Constraint C1 says that there is one central discourse entity that the utterance is about.
Constraint C2 states that the ranking or ordering of the forward-looking centers in Un
determines which of them realized in Un+1 will become the backward-looking center of
Un+1.
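Definitions D1, D4 and D6 together with Constraint C2 can be made concrete with a short Python sketch. The `Mention` type and the three-way role ranking (subject, then object, then other) are illustrative assumptions; the text above only states that the English Cf is ordered by grammatical role.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed English ranking by grammatical role; lower number means more salient.
ROLE_RANK = {"subject": 0, "object": 1, "other": 2}


@dataclass
class Mention:
    entity: str  # discourse entity realized in the utterance
    role: str    # "subject", "object" or "other"


def forward_centers(utterance: List[Mention]) -> List[str]:
    """Cf(U): entities realized in U, ordered by grammatical role (D1, D4)."""
    ranked = sorted(utterance, key=lambda m: ROLE_RANK[m.role])
    cf: List[str] = []
    for m in ranked:
        if m.entity not in cf:
            cf.append(m.entity)
    return cf


def preferred_center(utterance: List[Mention]) -> Optional[str]:
    """Cp(U): the most highly ranked element of Cf(U) (D6)."""
    cf = forward_centers(utterance)
    return cf[0] if cf else None


def backward_center(prev: List[Mention], cur: List[Mention]) -> Optional[str]:
    """Cb(Un+1): highest ranked element of Cf(Un) realized in Un+1 (C2)."""
    realized = {m.entity for m in cur}
    for entity in forward_centers(prev):
        if entity in realized:
            return entity
    return None


# (3f) "He told Terry to get lost and hung up."
u3f = [Mention("Tony", "subject"), Mention("Terry", "object")]
# (3g) "Of course, he hadn't intended to upset Tony."
u3g = [Mention("Terry", "subject"), Mention("Tony", "object")]

print(forward_centers(u3f), backward_center(u3f, u3g), preferred_center(u3g))
```

Under these assumptions the Cb of (3g) is ‘Tony’ while its Cp is ‘Terry’, so the backward-looking center and the preferred center of the same utterance need not coincide.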
Rules:
(Rule 1.) If some element of Cf (Un) is realized as a pronoun in Un+1 then so is Cb(Un+1).
(Rule 2.) With respect to Table 1, sequences of the CONTINUE transition are preferred to
sequences of the RETAIN transition, which are preferred to sequences of the SMOOTH-
SHIFT transition, which are preferred to sequences of the ROUGH-SHIFT transition.
Rule 1 is often called the “Pronoun Rule”. It is important to note that the inference load
due to Rule 1 is not part of the inference load characterized by the transitions. Rule 1 is
thus independent of the transitions. This independence of Rule 1 is an important
consideration when thinking of the relation between CT and anaphora resolution. The
inference load due to Rule 1 can be regarded as a binary measure, simply stating whether
or not Rule 1 has been violated. With this rule, we can now explain the varying degrees
of coherence for utterances (3g-3g’’) in discourse (3). The centering analysis for this
discourse is shown in Table 4. After ‘Tony’ is established as the center (the Cb) in (3e),
this center continues in (3f), but with the re-introduction of ‘Terry’ as a potential center.
In (3g), both ‘Tony’ and ’Terry’ are mentioned but since ‘Tony’ is higher ranked than
‘Terry’ in (3f), it is ‘Tony’ that is retained as the Cb in (3g). However, this utterance
creates a Rule 1 violation because the Cb, ‘Tony’, is not realized with a pronoun whereas
‘Terry’, which is not the Cb, is. The only difference between (3g) and (3g’-3g’’) is that the
latter do not violate Rule 1, the transitions remaining the same. The oddness of (3g) is
therefore explained by Rule 1.
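Because Rule 1 yields a binary measure, it can be checked with a few lines of Python. The sketch below assumes the Cb and the ranked Cf list have already been computed; the function name and the use of a set of pronominalized entities are illustrative choices, not part of the theory's formulation.

```python
from typing import List, Optional, Set


def violates_rule1(prev_cf: List[str], cb: Optional[str], pronominalized: Set[str]) -> bool:
    """True if some element of Cf(Un) is a pronoun in Un+1 but Cb(Un+1) is not."""
    some_cf_pronominalized = any(entity in pronominalized for entity in prev_cf)
    return some_cf_pronominalized and cb not in pronominalized


cf_3f = ["Tony", "Terry"]  # Cf of (3f), with 'Tony' ranked above 'Terry'
cb_3g = "Tony"             # Cb of (3g) and of its variants

g = violates_rule1(cf_3f, cb_3g, {"Terry"})       # (3g): "he" = Terry, Tony is a full NP
g_prime = violates_rule1(cf_3f, cb_3g, set())     # (3g'): no pronouns
g_double = violates_rule1(cf_3f, cb_3g, {"Tony"}) # (3g''): "him" = Tony
print(g, g_prime, g_double)
```

Only (3g) is flagged, and the check makes no reference to transitions, which reflects the independence of Rule 1 noted above.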
Rule 2 provides a formal characterization of the perceived differences in coherence for
discourse segments in terms of an ordering on transition sequences. The less frequent the
shifts in a discourse, the more coherent it is. Discourse (1) above is characterized by
Continue transitions throughout the segment (Continue, Continue, Continue – see Table
2) describing a highly coherent discourse, whereas discourse (2) is characterized by
switches between Retain and Continue (Retain, Continue, Retain – see Table 3),
describing a less coherent discourse.
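The transition sequences for discourses (1) and (2) can be reproduced with a small Python sketch. It assumes the usual criteria for the four transitions: whether Cb(Un+1) equals Cb(Un) (or Cb(Un) is undefined), and whether Cb(Un+1) equals Cp(Un+1). The ranked Cf lists are hand-coded assumptions, and the "NO-CB" label for utterances with no link to the previous one is an addition for illustration only.

```python
from typing import List, Optional


def backward_center(prev_cf: List[str], cur_cf: List[str]) -> Optional[str]:
    """Highest ranked element of the previous Cf that is realized in the current utterance."""
    for entity in prev_cf:
        if entity in cur_cf:
            return entity
    return None


def transition(prev_cb: Optional[str], cb: Optional[str], cp: str) -> str:
    if cb is None:
        return "NO-CB"  # assumed label: nothing links the two utterances
    same_cb = prev_cb is None or cb == prev_cb
    if cb == cp:
        return "CONTINUE" if same_cb else "SMOOTH-SHIFT"
    return "RETAIN" if same_cb else "ROUGH-SHIFT"


def transitions(segment: List[List[str]]) -> List[str]:
    """Label each adjacent utterance pair; each utterance is a ranked Cf list."""
    labels: List[str] = []
    prev_cb: Optional[str] = None
    for prev_cf, cur_cf in zip(segment, segment[1:]):
        cb = backward_center(prev_cf, cur_cf)
        labels.append(transition(prev_cb, cb, cur_cf[0]))
        prev_cb = cb
    return labels


discourse1 = [["John", "store", "piano"], ["John", "store"], ["John", "piano"], ["John", "store"]]
discourse2 = [["John", "store", "piano"], ["store", "John"], ["John", "piano"], ["store", "John"]]

print(transitions(discourse1))  # ['CONTINUE', 'CONTINUE', 'CONTINUE']
print(transitions(discourse2))  # ['RETAIN', 'CONTINUE', 'RETAIN']
```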
3. Centering Theory and Anaphora Resolution
As stated right in the beginning, the main goal of CT is to characterize certain aspects of
local coherence. Differences in coherence result from changes in the center of attention,
captured by the Centering transitions and transition ordering, and from the different
expressions in which centers are realized. In particular, pronouns and definite
descriptions engender different inference demands on the hearer. CT, however, is not to
be seen as a theory of anaphora resolution. The incorporation of referring expressions in
the account of local coherence has led many researchers to use CT as part of anaphora
resolution algorithms. This has led to some very interesting research. At the same time, it has
led to some confusion in the literature associated with CT.
The first point to appreciate is that there is undoubtedly a very relevant connection
between CT and anaphora resolution. As the careful reader will have deduced, the
garden-path effects with the interpretation of the pronouns illustrated in discourse (3) are
reminiscent of the preference ordering utilized by Sidner for the reference resolution of
pronouns. In Sidner’s model, the “center of attention” is equivalent to the “discourse
focus” and like Sidner, CT utilizes this preference for the “center of attention” to continue
over successive utterances. The relative preference of the “actor focus” as the next center
of attention is also captured with the “preferred center” in CT. At a first look, it may seem
that Sidner’s use of the “center of attention” to determine the referents of pronouns and
CT’s use of the same to explain how incorrect referents are assigned to pronouns results
in a paradox. But a closer look shows that it isn’t really so, because like CT, Sidner also
allows for garden-paths on the referents of pronouns by further invoking inference
procedures (albeit unspecified) to check for contradictions. So Sidner’s goals and CT’s
goals are very much alike, in that they both assume very similar preference for the
“initial” resolution of pronouns which can be contradicted with further information. The
difference between the two is that CT goes further to formalize the nature and difficulty
of the contradictory inferences in terms of utterance pair transitions and uses the formal
system as a way to compute the degree of coherence of a discourse segment.
Anaphora resolution algorithms that want to obviate the need for inference procedures
and want to model the preferential rules for pronoun resolution should use the common
part underlying the two described models. Sidner’s inference rules for computing
contradictions should be left out (or at least relegated to another interacting component)
as should the part in CT that deals with the computation of coherence with the transitions
and transition orderings. More formally, the common aspects of Sidner’s model and CT
are captured in CT with (i) the list of forward-looking centers, (ii) the backward-looking
center, (iii) the preferred center, and (iv) Rule 1, the “Pronoun Rule”. These data
structures and rules are sufficient to set the initial preference for the referents of
pronouns. Furthermore, corpus studies and studies of naturally occurring data of the form
of referring expressions have shown that to a large extent speakers adhere to the
preference orderings and Rule 1, so that much mileage can be achieved by building in
these preferences into anaphora resolution algorithms, as Sidner had conjectured.
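A minimal Python sketch of such an initial preference follows. It uses only the ranked list of forward-looking centers of the previous utterance, with no transitions and no inference. The agreement features and the function name are hypothetical, and the result is only a first guess that later information may overturn.

```python
from typing import Dict, List, Optional

Features = Dict[str, str]


def initial_referent(
    pronoun: Features,
    prev_cf: List[str],
    entity_features: Dict[str, Features],
) -> Optional[str]:
    """Pick the highest ranked element of Cf(Un) that agrees with the pronoun."""
    for entity in prev_cf:
        feats = entity_features.get(entity, {})
        if all(feats.get(key) == value for key, value in pronoun.items()):
            return entity
    return None


# Hypothetical agreement features for the entities in discourses (2) and (3).
features = {
    "Terry": {"gender": "masc", "number": "sg"},
    "Tony": {"gender": "masc", "number": "sg"},
    "John": {"gender": "masc", "number": "sg"},
    "store": {"gender": "neut", "number": "sg"},
}
he = {"gender": "masc", "number": "sg"}

# (3d) "He called him at 6 A.M." has Cf = [Terry, Tony]; "He" in (3e) initially goes to Terry.
guess_3e = initial_referent(he, ["Terry", "Tony"], features)
# (2b) has Cf = [store, John]; "He" in (2c) skips the non-agreeing 'store'.
guess_2c = initial_referent(he, ["store", "John"], features)
print(guess_3e, guess_2c)
```

The guess for (3e) is ‘Terry’, which is exactly the garden-path reading discussed for discourse (3); a separate component would be needed to revise it to ‘Tony’.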
However, while some anaphora resolution algorithms have used these very data
structures and shown good results, others have used CT in totality, i.e., together with the
transitions and transition orderings, to compute the referents of pronouns (for example,
the centering algorithm – called the BFP algorithm - for pronoun resolution in Brennan et
al., 1987). In addition to being theoretically misguided, the latter approach also yields
contradictory results for the initial preferential resolution of pronouns (Kehler, 1997). An
Optimality Theory-based version of the BFP algorithm and a comprehensive overview of
Centering together with a historical development of Centering Theory and its applications
can also be found in Beaver (2004).
4. Unspecified Aspects of Centering
Some parameters and constants in Centering, both from the perspective of anaphora
resolution and local coherence were left unspecified in the original models. Two of these
in particular have led to a great deal of research.
The first is determination of the preference ordering on the list of forward-looking centers
or determination of relative salience of discourse entities in an utterance. This is crucial
for the initial interpretation assignments for pronouns. Cross-linguistic investigation of
the mechanisms that languages use to realize discourse functions like “topic” shows that
different ranking criteria need to be used for different languages. In English, relative
salience is largely predicted by grammatical role, as was correctly assumed in CT. Other
languages use other mechanisms. In Japanese, which uses the morphemes wa and ga to
distinguish topics and subjects and special forms of the verb for marking empathy, topic
and empathy marked entities are ranked higher than subjects. German uses word order in
some syntactic contexts to indicate salience, positioning higher ranked entities before
lower ranked ones. Other languages on which such research has been conducted include
Finnish, Greek, Hindi, Italian, Russian and Turkish.
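One way to accommodate such cross-linguistic variation is to treat the Cf ranking as a language-specific parameter, as in the Python sketch below. The role inventories are illustrative assumptions: the text above states only that Japanese topic- and empathy-marked entities outrank subjects, so the relative order of topic and empathy, and everything below subject, is assumed here.

```python
from typing import Dict, List, Tuple

# Assumed role hierarchies, most salient first.
RANKINGS: Dict[str, List[str]] = {
    "english": ["subject", "object", "other"],
    "japanese": ["topic", "empathy", "subject", "object", "other"],
}


def rank_cf(mentions: List[Tuple[str, str]], language: str) -> List[str]:
    """Order (entity, role) pairs by the hierarchy assumed for the given language."""
    order = RANKINGS[language]

    def position(role: str) -> int:
        # Roles missing from the hierarchy are ranked last.
        return order.index(role) if role in order else len(order)

    ranked = sorted(mentions, key=lambda m: position(m[1]))  # stable sort keeps ties in input order
    cf: List[str] = []
    for entity, _ in ranked:
        if entity not in cf:
            cf.append(entity)
    return cf


# Hypothetical utterance with a wa-marked topic and a ga-marked subject.
japanese_cf = rank_cf([("Hanako", "subject"), ("Taroo", "topic")], "japanese")
english_cf = rank_cf([("store", "object"), ("John", "subject")], "english")
print(japanese_cf, english_cf)
```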
The second is the specification of what constitutes the utterance, which in CT is the
linguistic locus of the local attentional state. Discourse centers, both backward-looking
and forward-looking, are computed for each utterance. That is, each utterance serves as a
center update unit. In attempting to characterize the linguistic encoding of a center update
unit, complications arise from complex sentence structures. Up-to-date research on this
issue suggests that complex sentences may project different center update units
depending on their internal structure.
In early theoretical work on characterizing the center update unit in Centering, it was
suggested that complex sentences be broken into clauses each of which forms an
autonomous center update unit, with the possible exception of relative clauses and
complement clauses. Treating adverbial clauses as autonomous center update units
predicts that a pronoun in a fronted adverbial clause, as in (4c) below, is anaphorically
dependent on an entity already introduced in the immediately prior discourse and not on
the subject of the main clause it is attached to:
4. (a) (Jim) Kern_i began reading a lot about the history and philosophy of
Communism
(b) but never 0_i felt there was anything he as an individual could do about
(c) When he_i attended the Christian Anti-Communist Crusade school here about
six months ago
(d) Jim_i (Kern) became convinced that he as an individual could do something
constructive in the ideological battle
(e) and 0_i set out to do it
This view on backward anaphora was also professed in earlier work by Kuno, who
asserted that there was no genuine backward anaphora: the referent of an apparent
cataphoric pronoun must appear in the previous discourse. Empirical data later showed
that this view of backward anaphora cannot be maintained. Corpus studies show that
cataphoric pronouns can appear discourse initially.
Experimental work focusing on complex sentences of the type that includes adverbial
clauses suggests that adverbial clauses are processed as a single unit with the matrix
clause. Specifically, native speakers of English tend to interpret the ambiguous subject
pronoun in (5) as the groom, i.e., the subject of the preceding clause, even when the
adverbial in the second main clause is semantically varied (however, as a result,
moreover, then, etc.). This pattern contrasts with the interpretation of the subject pronoun
in (6), for which no consistent tendency is identified, indicating that in this case the
interpretation of the pronoun is most likely determined by the semantics of the predicates
of the main and adverbial clause and the relation between them.
5. The groom hit the best man. However, he…
6. The groom hit the best man although he…
Other experimental work on the interpretation of a subject pronoun following a complex
sentence indicates that referents in subject position in adverbial clauses are not favored
for the interpretation of a subsequent pronoun. In (7) and (8), for example, the subject
pronoun is interpreted as the conductor, i.e., the subject referent of the matrix clause, even when
the adverbial clause is postposed with respect to the main clause.
7. After the tenor opened his music score the conductor sneezed three
times. He...
8. The conductor sneezed three times after the tenor opened his music
score. He...
Data such as the above would be a challenge for a Centering-based anaphora resolution
algorithm which processes one clause at a time because there is no way of distinguishing
between (5) and (6). At the same time, these data are consistent with Centering and
Centering’s Pronoun rule under the assumption that adverbial clauses are not processed as
independent update units. Under this assumption, Centering would predict the pattern
observed in (5), (7) and (8). Centering’s pronoun rule would not make a prediction for (6)
with respect to the entities introduced in the main clause because they belong to the same
unit as the pronoun. Additional evidence for treating the entire sentence as a single update
unit comes from corpus work exploring various parameters that can be set for Centering
and the number of Centering rules that they would violate. This type of work suggests
that overall treating the whole complex sentence as a center update unit leads to fewer
violations of the Pronoun rule.
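The effect of the update-unit parameter on examples (7) and (8) can be illustrated with a Python sketch. It assumes that, when a whole sentence is one unit, entities of the main clause are ranked above those of the adverbial clause; this ranking and the clause encoding are assumptions for illustration, and only the human referents are listed.

```python
from typing import List, Tuple

Clause = Tuple[str, List[str]]  # (clause kind, entities ranked by grammatical role)


def update_units(sentence: List[Clause], merge_adverbials: bool) -> List[List[str]]:
    """Return the ranked Cf list of each center update unit projected by a sentence."""
    if not merge_adverbials:
        # One update unit per clause, in surface order.
        return [entities for _, entities in sentence]
    # One unit per sentence: main-clause entities first (assumed), then the rest.
    main = [e for kind, ents in sentence if kind == "main" for e in ents]
    rest = [e for kind, ents in sentence if kind != "main" for e in ents]
    merged: List[str] = []
    for entity in main + rest:
        if entity not in merged:
            merged.append(entity)
    return [merged]


def preferred_antecedent(sentence: List[Clause], merge_adverbials: bool) -> str:
    """Cp of the last update unit, i.e., the initial preference for a following 'He'."""
    return update_units(sentence, merge_adverbials)[-1][0]


ex7 = [("adverbial", ["tenor"]), ("main", ["conductor"])]  # adverbial clause preposed
ex8 = [("main", ["conductor"]), ("adverbial", ["tenor"])]  # adverbial clause postposed

for name, ex in (("7", ex7), ("8", ex8)):
    print(name, preferred_antecedent(ex, True), preferred_antecedent(ex, False))
```

With sentence-sized units both examples prefer the conductor, matching the reported judgments. With clause-sized units the preference for (8) flips to the tenor, which is the misprediction discussed above.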
Studies of Centering in relative clauses present conflicting results which need further
research to be reconciled. On the one hand are discourses like (9) that suggest that entities
mentioned in relative clauses (9b) are less salient than in the main clause (9a), as
indicated by the subsequent use of a full noun phrase in (9c). In fact, a pronoun
used instead of the full noun phrase would probably be interpreted as Mr. Taylor, i.e., the
entity in the main clause.
9. (a) Mr. Taylor_i, 45 years old, succeeds Robert D. Kilpatrick_j, 64,
(b) who_j is retiring, as reported earlier.
(c) Mr. Kilpatrick_j will remain a director.
(d) He_i … #He_j …
On the other hand are discourses like (10) showing the opposite pattern from that in (9).
Such data come from work that looks at different types of relative clauses, specifically
non-restrictive and restrictive with a definite or indefinite head. Complementary patterns
in the use of pronouns and definite descriptions show that non-restrictive clauses and
restrictive clauses with an indefinite head pattern alike, and form an autonomous (but
embedded and accessible) center update unit. In example (10), the subject pronoun in
(10c) refers, without any garden-path effects, to the subject referent of the preceding
relative clause and not the subject referent of the main clause, indicating that in this case
the relative clause probably introduces a new update unit that is accessible to (10c) for
center establishment.
10. (a) This Moses_i was irresistible to a man like Simkin_j
(b) who_j loved to pity and to poke fun at the same time.
(c) He_j was a reality-instructor.
5. Applications of Centering Theory as a Model of Local Coherence
Some research illustrates the appropriate and correct application of Centering Theory.
The four Centering transitions shown in Table 1 define four degrees of coherence within
a discourse segment. A textual segment characterized by a sequence of Continue
transitions demonstrates the highest degree of coherence and is perceived as a segment
focusing on a single entity. Topic retains and smooth shifts to new topics are captured in
the Retain and Smooth-Shift transitions. Indeed, numerous corpus studies have identified
Continue, Retain and Smooth-Shift transitions. As expected, Rough-Shift transitions are
rarely identified in corpora of written text which presumably maintain a high level of
coherence. An exception to this pattern is observed in texts whose coherence is under
evaluation and therefore cannot be assumed. A typical example of such text is the student
essay. Indeed, in a study of essays written by students, it has been shown that an excessive
number of Rough-Shift transitions per paragraph correlates with low
essay scores provided by writing experts.
A closer analysis of the essays reveals that the incoherence detected by a Rough-Shift
measure is not due to violations of Centering's Pronominal Rule or other infelicitous uses
of pronominal forms. The distribution of nominal and pronominal forms over Rough-
Shift transitions reveals that in fact pronominal forms are avoided in Rough-Shift
transitions. This observation indicates that the incoherence found in student essays is not
due to the processing load imposed on the reader to resolve anaphoric references.
Instead, the incoherence in the essays is due to discontinuities caused by introducing a
rapid succession of new, undeveloped topics with no links to the prior discourse. In other
words, Rough-Shifts pick up textual incoherence due to topic discontinuities.
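As a rough illustration of a Rough-Shift measure, the sketch below computes the percentage of Rough-Shift transitions in one paragraph. It assumes the transitions have already been labeled; the function name `rough_shift_percentage` and the sample paragraphs are hypothetical, and the exact metric used in the essay study may differ.

```python
from collections import Counter
from typing import Sequence


def rough_shift_percentage(transitions: Sequence[str]) -> float:
    """Return the percentage of Rough-Shift labels among a paragraph's transitions."""
    if not transitions:
        # Assumption: a paragraph with no labeled transitions scores 0.
        return 0.0
    counts = Counter(transitions)
    return 100.0 * counts["Rough-Shift"] / len(transitions)


# Hypothetical paragraphs whose transitions were labeled beforehand.
focused = ["Continue", "Continue", "Retain", "Smooth-Shift"]
scattered = ["Rough-Shift", "Continue", "Rough-Shift", "Rough-Shift"]

print(rough_shift_percentage(focused))    # 0.0
print(rough_shift_percentage(scattered))  # 75.0
```

Under this sketch, a paragraph with a high percentage would be a candidate for the kind of topic discontinuity described above.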
Studies such as the one just described are supportive of the formulation of Centering as a
model of local discourse coherence. They also show that the Centering model can be used
successfully for practical applications, e.g., to improve automated systems of writing
evaluation in testing and education. In fact, it has been shown that adding a Centering-
based metric of coherence to an existing electronic essay scoring system (the e-rater
system developed at the Educational Testing Service) improves the performance of the
system by better approximating human expert scores. In addition, a Centering-based
system of writing evaluation has exceptional pedagogical value. This is because the
model offers the capability of directing students' attention to specific locations within an
essay where topic discontinuities occur. It can illuminate broken topic and focus chains
within the text of an essay by drawing the student’s attention to the noun phrases playing
the roles of Cb's and Cp's. Supplementary instructional comments could guide the student
in revising the relevant section, paying attention to topic discontinuities.
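The kind of feedback described here might be produced along the following lines. This is only a sketch under stated assumptions: the `Utterance` record, the `flag_discontinuities` function, the message wording, and the three sample sentences are all hypothetical, and each utterance is assumed to arrive with its Cb, Cp and transition already determined.

```python
from typing import List, NamedTuple, Optional


class Utterance(NamedTuple):
    text: str
    cb: Optional[str]          # backward-looking center, None if undefined
    cp: str                    # preferred center
    transition: Optional[str]  # None for the first utterance of a segment


def flag_discontinuities(utterances: List[Utterance]) -> List[str]:
    """List the utterances reached by a Rough-Shift, naming their Cb and Cp."""
    notes = []
    for index, utt in enumerate(utterances, start=1):
        if utt.transition == "Rough-Shift":
            notes.append(
                f"Utterance {index}: possible topic discontinuity "
                f"(Cb = {utt.cb}, Cp = {utt.cp}): {utt.text}"
            )
    return notes


# Hypothetical essay fragment with hand-assigned centers.
essay = [
    Utterance("The cafeteria serves lunch at noon.", None, "cafeteria", None),
    Utterance("It offers three menus.", "cafeteria", "cafeteria", "Continue"),
    Utterance("Teachers rarely like the menus.", "menus", "teachers", "Rough-Shift"),
]

for note in flag_discontinuities(essay):
    print(note)
```

Reporting the Cb and Cp alongside the sentence is meant to show the student which noun phrases broke the topic chain, so that the instructional comment can point at them directly.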
Bibiography
Baldwin, B.F. (1995). COGNIAC: a discourse processing engine (Ph.D. thesis).
University of Pennsylvania.
Beaver, D. (2004). ‘The Optimization of Discourse Anaphora.’ Linguistics and
Philosophy 27(1), 3-56.
Brennan, S.E., Friedman, M.W. & Pollard, C.J. (1987). ‘A Centering approach to
pronouns.’ Proceedings of the 25th Annual Meeting of the Association for Computational
Linguistics, Stanford, Calif., 155-162.
Cooreman, A. & Sanford, A. (1996). Focus and Syntactic Subordination in Discourse
(Technical Report). Human Communication Research Center.
Di Eugenio, B. (1996). ‘Centering in Italian’. In Walker, M.A., Joshi, A.K. & Prince, E.F.
(eds.) Centering Theory in Discourse. New York: Oxford University Press. 115-138.
Givón, T. (1983). ‘Topic continuity in discourse: a quantitative cross-language study.’
Topic Continuity in Discourse: An Introduction. Amsterdam: John Benjamins Publishing.
1-42.
Gordon, P.C., Grosz, B.J. & Gilliom, L.A. (1993). ‘Pronouns, names and the Centering of
attention in discourse.’ Cognitive Science, 17(3), 311-347.
Grosz, B.J. (1977). The representation and use of focus in dialogue understanding
(Technical Report No. 151). Menlo Park, Calif.: SRI International.
Grosz, B.J. & and Sidner, C.L. (1986). ‘Attentions, intentions and the structure of
discourse.’ Computational Linguistics 12, 175-204.
Grosz, B.J., Joshi, A.K. and Weinstein, S. (1983). ‘Providing a unified account of noun
phrases in discourse.’ Proceedings of the 21st Annual Meeting of the Association for
Computational Linguistics, Cambridge, Mass., 44-50.
Grosz, B.J., Joshi, A.K. and Weinstein, S. (1995). ‘Centering: a framework for modeling
the local coherence of discourse.’ Computational Linguistics 21(2), 203-225.
Hudson-D’Zmura, S.B. (1988). The structure of discourse and anaphor resolution: the
discourse center and the role of nouns and pronouns (Ph.D. thesis). University of
Rochester.
Joshi, A.K. & Kuhn, S. (1979). ‘Centered logic: the role of entity centered sentence
representation in natural language inferencing.’ Proceedings of the 6th International Joint
Conference in Artificial Intelligence, Tokyo, 435-439.
Kehler, A. (1997). ‘Current theories of Centering for pronoun interpretation: a critical
evalutation.’ Computational Linguistics 23(3), 467-475.
Miltsakaki, L. (2004). ‘Not all subjects are born equal: a look at complex sentence
structure.’ The Processing and Acquisition of Reference. Cambridge, MA: MIT Press.
Miltsakaki, E. & Kukich, K. (2004). ‘Evaluation of text coherence for electronic essay
scoring systems.’ Natural Language Engineering 10(1), 25-55.
Miltsakaki, E. (2002). ‘Toward an aposynthesis of topic continuity and intras-sentential
anaphora. ’ Computational Linguistics 28(3), 319-255.
Poesio, M., Stevenson, R., Di Eugenio, B. & Hitzeman, J. (2004). ‘Centering: a
parametric theory and its instantiations.’Computational Linguistics 30(3), 309-363.
Prasad, R. & Strube, M. (2000). ‘Discourse salience and pronoun resolution in Hindi.’ In
Williams, A. & Kaiser, E. (eds.) Penn Working Papers in Linguistics: Current Work in
Linguistics 6(3), 189-208.
Prasad, R. (2003). Constraints on the generation of referring expressions: with special
reference to Hindi (Ph.D. thesis). University of Pennsylvania.
Prince, E.F. (1999). ‘Subject pro-drop in Yiddish.’ In Bosch, P & van der Sandt, R. (eds.)
Focus: Linguistic, Cognitive and Computational and Perspectives. Cambridge:
Cambridge University Press. 82-101.
Rambow, O. (1993). ‘Pragmatic aspects of scrambling and topicalization in German.’
Institute for Research in Cognitive Science Workshop on Centering Theory in Naturally-
Occurring Discourse (Ms.). University of Pennsylvania, May 20-28.
Reinhart, T. (1981). ‘Pragmatics and linguistics. an analysis of sentence topics.’
Philosphica 27(1), 53–94.
Sidner, C.L. (1979). Toward a computational theory of definite anaphora comprehension
in English (Technical Report No. AI-TR-537). Cambridge, Mass.: MIT Press.
Strube, M. & Hahn, U. (1998). ‘Never look back: an alternative to Centering.’
Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics
and the 17th International conference on computational linguistics, Montreal, Quebec,
Canada. 1251-1257.
Suri, L.Z., DeCristofaro, J.D. & McCoy, K.F. (1999). ‘A methodology for extending
focusing frameworks.’ Computational Linguistics 25(2), 173-194.
Turan, U.D. (1995). Null vs. overt subjects in Turkish discourse: a Centering analysis.
(Ph.D. thesis). University of Pennsylvania.
Walker, M.A., Iida, M. & Cote, S. (1994). ‘Japanese discourse and the process of
Centering.’ Computational Linguistics 20(2), 193-232.
Walker, M.A., Joshi, A.K. & Prince, E.F. (1998). Centering theory in discourse. New
York: Oxford University Press.
                         Cb(Ui+1) = Cb(Ui) OR      Cb(Ui+1) ≠ Cb(Ui)
                         Cb(Ui) = [?]
Cb(Ui+1) = Cp(Ui+1)      CONTINUE                  SMOOTH-SHIFT
Cb(Ui+1) ≠ Cp(Ui+1)      RETAIN                    ROUGH-SHIFT
Table 1: Centering Transitions
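The classification in Table 1 can be sketched as a small function. This is a minimal illustration, not a full Centering implementation: it assumes the Cb and Cp of each utterance have already been determined, the name `classify_transition` is hypothetical, and returning `None` when the current Cb is undefined is an assumption made to mirror the "Transition = undef." rows in the analyses that follow.

```python
from typing import Optional


def classify_transition(
    cb_prev: Optional[str], cb_cur: Optional[str], cp_cur: str
) -> Optional[str]:
    """Classify the transition from Ui to Ui+1 following Table 1.

    cb_prev: Cb(Ui), or None if undefined ("?")
    cb_cur:  Cb(Ui+1), or None if undefined
    cp_cur:  Cp(Ui+1)
    """
    if cb_cur is None:
        # Assumption: no transition is assigned when Cb(Ui+1) is undefined.
        return None
    # Left column of Table 1: Cb(Ui+1) = Cb(Ui), or Cb(Ui) is undefined.
    same_cb = cb_prev is None or cb_cur == cb_prev
    # Top row of Table 1: Cb(Ui+1) = Cp(Ui+1).
    cb_is_cp = cb_cur == cp_cur
    if same_cb:
        return "Continue" if cb_is_cp else "Retain"
    return "Smooth-Shift" if cb_is_cp else "Rough-Shift"


# (Cb, Cp) pairs for utterances (2a)-(2d) of Discourse (2).
discourse_2 = [(None, "John"), ("John", "store"), ("John", "John"), ("John", "store")]

labels = []
previous_cb = None
for cb, cp in discourse_2:
    labels.append(classify_transition(previous_cb, cb, cp))
    previous_cb = cb

print(labels)  # [None, 'Retain', 'Continue', 'Retain']
```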
(1a) John went to his favorite music store to buy a piano.
Cf = {John, store, piano}, Cp = John, Cb = ?, Transition = undef.
(1b) He had frequented the store for many years.
Cf = {John, store}, Cp = John, Cb = John, Transition = Continue
(1c) He was excited that he could finally buy a piano.
Cf = {John, piano}, Cp = John, Cb = John, Transition = Continue
(1d) He arrived just as the store was closing for the day.
Cf = {John, store}, Cp = John, Cb = John, Transition = Continue
Table 2: Centering Analysis for Discourse (1)
(2a) John went to his favorite music store to buy a piano.
Cf = {John, store, piano}, Cp = John, Cb = ?, Transition = undef.
(2b) It was a store John had frequented for many years.
Cf = {store, John}, Cp = store, Cb = John, Transition = Retain
(2c) He was excited that he could finally buy a piano.
Cf = {John, piano}, Cp = John, Cb = John, Transition = Continue
(2d) It was closing just as John arrived.
Cf = {store, John}, Cp = store, Cb = John, Transition = Retain
Table 3: Centering analysis for Discourse (2)
(3a) Terry really goofs sometimes.
Cf = {Terry}, Cp = Terry, Cb = ?, Transition = undef.
(3b) Yesterday was a beautiful day and he was excited about trying out his new sailboat.
Cf = {Terry, sailboat}, Cp = Terry, Cb = Terry, Transition = Continue
(3c) He wanted Tony to join him in a sailing expedition.
Cf = {Terry, Tony, expedition}, Cp = Terry, Cb = Terry, Transition = Continue
(3d) He called him at 6 A.M.
Cf = {Terry, Tony}, Cp = Terry, Cb = Terry, Transition = Continue
(3e) He was sick and furious at being woken up so early.
Cf = {Tony}, Cp = Tony, Cb = Tony, Transition = Smooth-Shift
(3e’) Tony was sick and furious at being woken up so early.
Cf = {Tony}, Cp = Tony, Cb = Tony, Transition = Smooth-Shift
(3f) He told Terry to get lost and hung up.
Cf = {Tony, Terry}, Cp = Tony, Cb = Tony, Transition = Continue
(3g) Of course, he hadn’t intended to upset Tony.
Cf = {Terry, Tony}, Cp = Terry, Cb = Tony, Transition = Retain
(3g’) Of course Terry hadn’t intended to upset Tony.
Cf = {Terry, Tony}, Cp = Terry, Cb = Tony, Transition = Retain
(3g’’) Of course, Terry hadn’t intended to upset him.
Cf = {Terry, Tony}, Cp = Terry, Cb = Tony, Transition = Retain
Table 4: Centering analysis for Discourse (3)
KEYWORDS:
Anaphora resolution
Pronoun resolution
Centering
Discourse
Discourse structure
Linguistics
Pragmatics
Processing complexity
Inference
Topic
Coherence
Referring expressions
Discourse salience
Utterance
Complex sentences
Attentional state