From the perspective of the definite article, Lyons (1999) surveys how the expression
of definiteness varies greatly cross-linguistically, not only in form but also in the
functions it serves, which leads him to formulate a new theory that regards
definiteness not as a lexical but as a grammatical category. Lyons takes a diachronic
perspective to outline the evolution of definiteness from a category of meaning
expressing identifiability to a grammatical category. At this point it is relevant to
recall Bybee’s (1998) functionalist approach to grammar evolution, based on the
belief that the only linguistic universals we can talk of are those of change. The
definite article fits the grammaticization process as characterized by Bybee:
the meaning of the article has become more abstract and the range of contexts in which
it appears has increased. The different position that each language occupies in this
grammaticization continuum accounts for the variation across languages at present.
While English uses the article only in simple definites, Spanish does so also in
generics, and Catalan represents a further stage where the article appears not only in
simple definites and generics, but also in possessives and personal names.
All in all, it indeed seems irrefutable that limiting the meaning of the definite article
to anaphoricity falls short of accounting for the overwhelming number of non-anaphoric
definites observed in naturally occurring data. Lyons’ (1999) distinction between lexical
definiteness and grammatical definiteness correlates with Löbner’s (1985) distinction
between pragmatic and semantic definites. On the one hand, Löbner emphasises the
role of the noun semantics; on the other hand, Lyons subtracts lexical meaning from
the article in favour of a more grammatical role. These two ideas meet in Fraurud’s
(1990) claim for a non-uniform treatment of definite NPs. Non-anaphoric uses of
definites should not be ignored by any theory that aims at providing a full account of
how NP interpretation takes place and, by extension, by any theory on natural
language understanding. The aim of this paper is to merge the views of Löbner
(1985), Fraurud (1990), Lyons (1999), and Bybee (1998) in order to cast light on the
non-anaphoric uses of definites in real data from Spanish and Catalan.
3 Empirical evidence
We carry out two quantitative corpus studies based on the AnCora corpus (Annotated
Corpora for Spanish and Catalan). Firstly, evidence for the grammaticization of the
article is found in a typological study that compares different languages (Spanish,
Catalan, Swedish and English) with respect to what we call the definiteness ratio,
that is, the number of definite NPs in relation to the total number of full NPs. The
definiteness ratio yields an insight into the extent to which languages differ in
[Footnote: AnCora consists of two 500,000-word corpora, mainly newspaper articles, annotated from the
morphological to the semantic level (PoS tags, constituents and functions, argument structures and
thematic roles, strong and weak named entities, and WordNet synsets). http://clic.ub.edu/ancora]
[Footnote: By full NPs we mean NPs with a nominal head, thus omitting pronouns, NPs with an elliptical head
as well as coordinated NPs.]
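
As an illustration only, the definiteness ratio described above (definite NPs over the total number of full NPs) might be computed along the following lines. This is a minimal sketch under stated assumptions: the `NP` record, the `head_type` labels and the toy Spanish phrases are hypothetical and are not taken from the AnCora annotation scheme or from the data of the study.

```python
from dataclasses import dataclass


@dataclass
class NP:
    """Hypothetical record for one noun phrase; not an AnCora data structure."""
    text: str
    head_type: str  # assumed labels: "nominal", "pronoun", "elliptical", "coordinated"
    definite: bool


def is_full_np(np: NP) -> bool:
    # Full NPs have a nominal head; pronouns, elliptical-head NPs and
    # coordinated NPs are left out, following the definition in the text.
    return np.head_type == "nominal"


def definiteness_ratio(nps: list[NP]) -> float:
    """Number of definite full NPs divided by the total number of full NPs."""
    full = [np for np in nps if is_full_np(np)]
    if not full:
        raise ValueError("no full NPs in the sample")
    return sum(np.definite for np in full) / len(full)


# Invented toy sample, for illustration only.
sample = [
    NP("la casa", "nominal", True),
    NP("una mesa", "nominal", False),
    NP("ella", "pronoun", True),         # excluded: pronoun
    NP("el libro", "nominal", True),
    NP("el rojo", "elliptical", True),   # excluded: elliptical head
    NP("libros", "nominal", False),
]

ratio = definiteness_ratio(sample)
print(f"definiteness ratio: {ratio:.2f}")
```

In this toy sample four NPs count as full NPs and two of them are definite, so the ratio is 0.50. Comparing such ratios across corpora of different languages is the kind of measurement the typological study relies on.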