Thoughts on grammaticalization

Second, revised edition
Christian Lehmann
July 2002
Prospect of volume II
Tables
Figures
Preface to the draft version
Preface to the published edition
1. The history of research in grammaticalization
2. Grammaticalization:
2.1. The term ‘grammaticalization’
2.2. The meaning of ‘grammaticalization’
2.3. Degrammaticalization
2.4. Renovation and innovation
2.5. Reinforcement
3. Grammatical domains
3.1. Verbal complexes
3.1.1. Existence and possession
3.1.2. The copula
3.1.3. Modals and moods
3.1.4. Tense and aspect
3.1.5. Passive and emphasis
3.1.6. Auxiliaries and alternative sources
3.2. Pronominal elements
3.2.1. Definite pronominal elements
    Definite determiners
    Personal pronouns
    Reflexive pronouns
3.2.2. Indefinite pronominal elements
    Interrogative pronouns
    Indefinite pronouns
    Negative indefinites
    Conclusion
3.3. Nominal complexes
3.3.1. Nominal categories
    Number
    Numeral classifiers
3.3.2. Nominalization
3.3.3. Attribution
3.4. Clause level relations
3.4.1. Adverbial relations
    Adverbial relators
    Relational nouns
    From adposition to case affix
    Adverbs
    Renovations and reinforcements
    Preverbs
    Coverbs
3.4.2. Main actant relations
    Terminology
    Grammatical cases
    From functional sentence perspective to syntax
3.5. Conclusion
4. Parameters of grammaticalization
4.1. Theoretical prerequisites
4.2. Paradigmatic parameters
4.2.1. Integrity
4.2.2. Paradigmaticity
4.2.3. Paradigmatic variability
4.3. Syntagmatic parameters
4.3.1. Structural scope
4.3.2. Bondedness
4.3.3. Syntagmatic variability
4.4. Interaction of parameters
4.4.1. Quantifiability of the parameters
4.4.2. Correlation among the parameters
4.4.3. Lack of correlation
4.4.4. Reduction to zero and fixation of word order
Indices
Abbreviations
Language abbreviations
Grammatical categories in interlinear morphemic translations
Bibliographical references
Prospect of volume II
5. Processes cognate to grammaticalization
5.1. Semantic processes
5.2. Lexicalization
5.3. Phonological processes
5.4. Analogy
6. Traditional problems in new perspective
6.1. Grammatical meaning
6.2. Grammatical levels
6.3. Markedness
6.4. Arbitrariness of the linguistic sign
6.5. Semantic representation
7. Comparison of languages
7.1. Contrastive linguistics
7.2. Language typology
7.3. Language universals
8. Language history and linguistic evolution
8.1. Development of grammatical categories
8.2. Linguistic evolution
8.3. Historical reconstruction
9. Language theory
9.1. Language activity
9.2. The causes of grammaticalization
Tables
T1. Greenlandic case paradigm
T2. Development of case suffixes in Hungarian
T3. Mangarayi case paradigm
T4. The parameters of grammaticalization
T5. Tense/aspect periphrases in Portuguese
T6. Correlation of grammaticalization parameters
Figures
F1. The phases of grammaticalization
F2. Some interrelated grammaticalization channels of verbal categories
F3. Some interrelated grammaticalization channels of pronominal elements
F4. Structure of complex adpositional phrase
F5. Structure of expanded adverbial phrase
F6. Grammaticalization of coverbs
F7. Evolution of adverbial relators
F8. Some interrelated grammaticalization channels of cases
F9. From discourse functions to syntactic functions
F10. Scale of grammatical boundaries
F11. Structure of numeral classifier phrase
F12. Reduction to zero and fixation of word order
Preface
Preface to the draft version
As we will be going a long way, through involved and ramified discussions, until we
arrive at something like a definition of grammaticalization, the reader who wants to
know beforehand what this book is all about is asked to accept this as a preliminary
characterization: Grammaticalization is a process leading from lexemes to grammatical
formatives. A number of semantic, syntactic and phonological processes interact in the
grammaticalization of morphemes and of whole constructions. A sign is grammaticalized
to the extent that it is devoid of concrete lexical meaning and takes part in obligatory
grammatical rules. A simple example is the development of the Latin preposition ad ‘at,
towards’ into the Spanish direct object marker a.
It must be made clear at the outset that this treatment is preliminary, incomplete and
imperfect. It presents little more than what has been found out in the two centuries in
which the subject has been studied, and probably it contains even less than that, because
I have been unable to take notice of all the relevant literature. I must also warn the reader
that I have great conceptual difficulties with the present subject, and I will leave many
questions open. The problem is not so much an empirical one: there are sufficient
analyzed data, and the empirical phenomena in themselves appear to be reasonably clear.
What is highly unclear is how the phenomena are to be interpreted, classified and related
to each other. Grammaticalization is such a pervasive process and therefore such a
comprehensive notion that it is often difficult to say what does not fall under it. The
present essay will therefore be concerned, first and foremost, with the question: what is
grammaticalization?
The discussion will not be couched in terms of a specific theory of grammar, one reason
being that existing grammatical models are inadequate for the representation of the
gradual nature which is essential to the phenomena comprised by grammaticalization. As
many of the problems involved are traditional ones, they can be discussed in traditional
terms.
The theory of language which is to account for the systematicity, goal-directedness and
dynamism inherent to grammaticalization must be structural, functional and operational
in nature. It is essentially the theory of Wilhelm von Humboldt (1836), which has been
elaborated in more recent times by Eugenio Coseriu (1974) and Hansjakob Seiler (1978).
This theory has never been made fully explicit; but it will become transparent through all
of the present treatment, and an attempt to make it more explicit will be presented in the
last chapter.
The work is organized as follows. We start, in ch. 1, with a brief historical review of the
relevant literature. Ch. 2 will supply some first clarifications to the concept of
grammaticalization and will delimit it against related concepts. Ch. 3 contains the bulk
of the empirical data which illustrate grammaticalization, ordered according to
semantically defined domains of grammar. From this evidence, the various basic
processes which integrate grammaticalization and which are called its parameters are
then extracted and ordered according to how they pertain to the paradigmatic or the
syntagmatic aspect, to the content or the expression of the grammaticalized sign. The
degree to which these parameters correlate will also be discussed in ch. 4. The next
chapter looks out for analogs to grammaticalization in different parts of the language
system and tries to distinguish these from grammaticalization proper. In ch. 6 we turn to
a couple of traditional linguistic problems, asking whether the concept of
grammaticalization can contribute anything towards their clarification. The various
modes of contrasting different languages, including language typology and universals
research, are discussed in the perspective of grammaticalization in ch. 7. Ch. 8
concentrates on the diachronic aspect of grammaticalization, its role in language change
and historical reconstruction. The final chapter tries to formulate the advances that may
be made in language theory if grammaticalization is given its proper place in it.
Due to idiosyncrasies in the timing of my research projects, I have had to interrupt the
writing of this book after ch. 4. It was decided that the finished chapters should appear
as volume I, while chapters 5 - 9 should be reserved for a second volume. I have
included them in the preceding sketch and also given a prospect on their contents in
order that the reader may get an idea of the plan of the complete work. It is my intention
to complete volume II for akup in 1983.
A cordial word of thanks goes to Bernd Heine and Mechthild Reh, who have been
working on grammaticalization and evolutive typology, especially in African languages,
simultaneously and partly in cooperation with me. They have been kind and disinterested
enough to put their notes and manuscripts at my disposal. References are to this
prepublication version; their work is now being published as akup 47. Finally, I should
like to thank Sonja Schlögel and Ingrid Hoyer, who have taken great care in typing and
editing the manuscript.
Cologne, 7.10.1982 Christian Lehmann
Preface to the published edition
A preliminary version of the present work was distributed in 1982 under the following
title: Thoughts on grammaticalization. A programmatic sketch. Vol. I. Köln: Institut für
Sprachwissenschaft der Universität (Arbeiten des Kölner Universalienprojekts, 48). It
got out of stock immediately, but has been in high demand since. A slightly revised
version was released in January 1985, but only in the form of a number of xerocopies. The
original plan was, of course, to get back to work on grammaticalization as soon as
possible, to write up volume II and then publish the whole work. Then the title, too,
would have been streamlined a bit. However, I never got around to doing that.
The semipublished 1982 paper has played an instrumental role in the development of
modern work on grammaticalization. Many people have asked me to at least make it
available in published form, even if I should never manage to round it off. This is what
I am doing here. Consequently, this publication is slightly anachronistic. I have removed
those errors of the preliminary version that I became aware of. I have modified many points
of detail. I have updated references to unpublished material. But I have not taken into
consideration the vast amount of literature on grammaticalization that has appeared since
(including my own more recent contributions) and that would lead me to reformulate
substantially some of the ideas expounded here. Readers should be aware that the state
of research reflected here is essentially that of 1982.
References to volume II, including even the ‘Prospect of contents of volume II’, have not
been deleted. A fair appreciation of what is being published here is only possible if one
considers that it was always intended to be only half of what would, at least, be
necessary. However, I doubt that volume II will ever be published. Below, I list the
articles on grammaticalization that I have published since 1982. Some of them may be
considered to fill the lacunae created by the prospect. In particular, the following
assignments may be allowed:
Ch. 5.2: 1989[G], 2002.
Ch. 6.3: 1989[M].
Ch. 7.2: 1985[r], 1986.
Ch. 8: 1985[G], 1987, 1992.
Ch. 9: 1993, 1995.
I have been unable to get my English grammar and style revised by a native speaker, and
I must apologize for the inconveniences resulting therefrom. Finally, cordial thanks go
to Cornelia Sünner for the effort she has made in editing the typescript. I also thank the
numerous colleagues who have reacted to the preliminary version and whose comments
would deserve fuller attention.
Bielefeld, 21.07.1995 Christian Lehmann
For the second edition, some changes and corrections have been made.
Erfurt, 08.07.2002 Christian Lehmann
Aguado, Miquel & Lehmann, Christian 1989, “Zur Grammatikalisierung der Klitika im
Katalanischen.” Raible, Wolfgang (ed.) 1989, Romanistik, Sprachtypologie und
Universalienforschung. Beiträge zum Freiburger Romanistentag 1987. Tübingen:
G. Narr (TBL, 332); 151-162.
Lehmann, Christian 1985, “Grammaticalization: Synchronic variation and diachronic
change.” Lingua e Stile 20:303-318.
Lehmann, Christian 1985, “The role of grammaticalization in linguistic typology.”
Seiler, Hansjakob & Brettschneider, Gunter (eds.), Language invariants and
mental operations. International interdisciplinary conference held at
Gummersbach/Cologne, Germany, Sept. 18-23, 1983. Tübingen: G. Narr (LUS, 5); 41-52.
Lehmann, Christian 1986, “Grammaticalization and linguistic typology.” General
Linguistics 26:3-23.
Lehmann, Christian 1987, “Sprachwandel und Typologie.” Boretzky, Norbert et al.
(eds.), Beiträge zum 3. Essener Kolloquium über Sprachwandel und seine
bestimmenden Faktoren, vom 30.9. - 2.10.1987 [sic; i.e. 1986] an der Universität Essen.
Bochum: N. Brockmeyer (Bochum-Essener Beiträge zur Sprachwandelforschung,
4); 201-225.
Lehmann, Christian 1989, “Grammatikalisierung und Lexikalisierung.” Zeitschrift für
Phonetik, Sprachwissenschaft und Kommunikationsforschung 42:11-19.
Lehmann, Christian 1989, “Markedness and grammaticalization.” Tomić, Olga M. (ed.),
Markedness in synchrony and diachrony. Berlin & New York: Mouton de Gruyter
(Trends in Linguistics, Studies and Monographs, 39); 175-190.
Lehmann, Christian 1991, “Grammaticalization and related changes in contemporary
German.” Traugott, Elizabeth C. & Heine, Bernd (eds.), Approaches to
grammaticalization. 2 vols. Amsterdam & Philadelphia: J. Benjamins (Typological
Studies in Language, 19); II:493-535.
Lehmann, Christian 1992, “Word order change by grammaticalization.” Gerritsen,
Marinel & Stein, Dieter (eds.) 1992, Internal and external factors in syntactic
change. Berlin & New York: Mouton de Gruyter (Trends in Linguistics, 61);
Lehmann, Christian 1993, “Theoretical implications of processes of
grammaticalization.” Foley, William A. (ed.) 1993, The role of theory in language description.
Berlin & New York: Mouton de Gruyter (Trends in Linguistics, 69); 315-340.
Lehmann, Christian 1995, “Synsemantika.” Jacobs, Joachim et al. (eds.), Syntax. Ein
internationales Handbuch zeitgenössischer Forschung. Berlin: W. de Gruyter
(Handbücher der Sprach- und Kommunikationswissenschaft, 9/2); II:1251-1266.
Lehmann, Christian 2002, “New reflections on grammaticalization and lexicalization.”
Wischer, Ilse & Diewald, Gabriele (eds.), New reflections on grammaticalization.
Amsterdam & Philadelphia: J. Benjamins (TSL, 49); 1-18.
1. The history of research in grammaticalization
As far as I can see, it was Antoine Meillet (1912) who coined the term
‘grammaticalization’ and first applied it to the concept for which it is still used today. We will
return to him in a moment. The concept itself, however, and the ideas behind it, are
considerably older. The idea that grammatical formatives evolve from lexemes, that
affixes come from free forms, was already expounded by the French philosopher Étienne
Bonnot de Condillac. In his work Essai sur l'origine des connaissances humaines (1746),
he explained the personal endings of the verb through agglutination of personal pronouns
and maintained that verbal tense came from the coalescence of a temporal adverb with
the stem. Again, John Horne Tooke, in his etymological work Ἔπεα πτερόεντα or the
diversions of Purley (vol. I: 1786, vol. II: 1805), claimed that prepositions derive from
nouns or verbs.¹ We shall see in ch. 3 that all such processes do, in fact, occur, though
not necessarily in the specific cases which these authors had in mind.
¹ Information on Condillac and Horne Tooke from Arens 1969:109f, 132-134, and
Stammerjohann (ed.) 1975:119, 452f.
Condillac and Horne Tooke were certainly only forerunners to the first evolutive
typologists, notably August Wilhelm von Schlegel and Wilhelm von Humboldt. In his
Observations sur la langue et la littérature provençales (1818), Schlegel deals
extensively with the renewal of Latin synthetic morphology by Romance analytic
morphology. About the formation of the latter, he writes:
C'est une invention en quelque façon négative, que celle qui a produit les
grammaires analytiques, et la méthode uniformément suivie à cet égard peut
se réduire à un seul principe. On dépouille certains mots de leur énergie
significative, on ne leur laisse qu'une valeur nominale, pour leur donner un
cours plus général et les faire entrer dans la partie élémentaire de la langue.
Ces mots deviennent une espèce de papier-monnaie destiné à faciliter la
circulation. (28)
This is followed by a series of Latin-Romance examples of different kinds, including the
development of articles, auxiliaries and indefinite pronouns, which have subsequently
become the stock examples of grammaticalization theory. Although Schlegel goes so far
as to speak of “la formation d'une nouvelle grammaire” (30), he views the development
essentially as due to linguistic decadence. It will be observed, however, that some of the
core aspects of grammaticalization, viz. semantic depletion and expansion of
distribution, are foreshadowed here.
Wilhelm von Humboldt arrived at more far-reaching conclusions. In his academic lecture
on the origins of grammatical forms, he proposed that “grammatische Bezeichnung” (the
signifying of grammatical categories, as opposed to objects) evolves through the
following four stages (1822:54f):
I. “grammatische Bezeichnung durch Redensarten, Phrasen, Sätze”:
grammatical categories are completely hidden in the lexemes and in the
semantosyntactic configurations.
II. “grammatische Bezeichnung durch feste Wortstellungen und zwischen
Sach- und Formbedeutung schwankende Wörter”.
III. “grammatische Bezeichnung durch Analoga von Formen”: here the
“vacillating words” have been agglutinated as affixes to the main
words. The resulting complexes are not “forms”, unitary wholes, but
only “aggregates”, and therefore mere “analogs to forms”.
IV. “grammatische Bezeichnung durch wahre Formen, durch Beugung und
rein grammatische Wörter”.
These four stages are connected with each other “durch verloren gehende Bedeutung der
Elemente und Abschleifung der Laute in langem Gebrauch.”
One may simply overlook the “evaluation” of the different stages to which this theory is
committed. One may also regard it as a terminological issue whether the term
‘grammatical form’ can be correctly applied only at stage IV, and not also at the other
stages. But one must recognize that this account of the evolution of grammatical forms
is essentially a theory of grammaticalization, if only a sketchy one. Three things are
worth noting here. First, the term ‘grammatical form’ must not mislead one into thinking
that this theory deals only with the expression of the language sign. The passages
quoted leave no doubt that the evolution in question affects both the meaning and the
expression of the grammatical sign. Secondly, the four stages are essentially the
morphological types of the linguistic typology of the time: stages I and/or II = isolating,
III = agglutinative, IV = flexional. Thirdly, linguistic typology, which in the twentieth
century was reduced to a synchronic discipline, is here conceived as evolutive typology.
Consequently, the theory of grammaticalization is tied, from the very start, to evolutive
typology.
This theory was subsequently widely received under the name of
“Agglutinationstheorie”. This term appears to refer only to the transition towards stage III, but
was later used to comprise all of the four stages.² The first to apply the theory, Franz
Bopp, who shared ideas with Humboldt through correspondence, actually concentrated
on stage III. In his Über das Conjugationssystem der Sanskritsprache in Vergleichung
mit jenem der griechischen, lateinischen, persischen und germanischen Sprache
(1816:147f; apud Arens 1969:177), and again in vol. I of his Vergleichende Grammatik des
Sanskrit, Zend, Griechischen, Lateinischen, Litauischen, Altslavischen, Gothischen und
Deutschen (Berlin, 1833), he derived the personal endings of the Indo-European verb
from agglutinated personal pronouns.³ Several of the neogrammarians, among them
Brugmann, were favorably inclined to hypotheses of this kind. Again, the typological
version of agglutination theory was most vigorously promoted by August Schleicher; he
followed Humboldt in making agglutination theory the center of his evolutive typology.
² This inadequacy of the term was also felt by Jespersen, who proposed to replace it with
“coalescence theory” (1922:376).
³ This application of agglutination theory is not to be confused with Bopp's typology of roots.
Another prominent representative of agglutination theory is Georg von der Gabelentz.
The essential passage from his Die Sprachwissenschaft, which remained unaltered in the
second edition (1891:251 = 1901:256), will be quoted in full here, because it summarizes
well what was known or thought about agglutination theory at that time.
Nun bewegt sich die Geschichte der Sprachen in der Diagonale zweier Kräfte:
des Bequemlichkeitstriebes, der zur Abnutzung der Laute führt, und des
Deutlichkeitstriebes, der jene Abnutzung nicht zur Zerstörung der Sprache
ausarten läßt. Die Affixe verschleifen sich, verschwinden am Ende spurlos;
ihre Funktionen aber oder ähnliche drängen wieder nach Ausdruck. Diesen
Ausdruck erhalten sie, nach der Methode der isolierenden Sprachen, durch
Wortstellung oder verdeutlichende Wörter. Letztere unterliegen wiederum
mit der Zeit dem Agglutinationsprozesse, dem Verschliffe und Schwunde,
und derweile bereitet sich für das Verderbende neuer Ersatz vor:
periphrastische Ausdrücke werden bevorzugt; mögen sie syntaktische Gefüge
oder wahre Komposita sein (englisch: I shall see, — lateinisch videbo =
vide-fuo); immer gilt das Gleiche: die Entwicklungslinie krümmt sich zurück
nach der Seite der Isolation, nicht in die alte Bahn, sondern in eine annähernd
parallele. Darum vergleiche ich sie der Spirale.
The extent to which Gabelentz is indebted to Humboldt emerges clearly from this
quotation. On the other hand, two things are new here: First, an explanation for
grammaticalization is offered, this being seen as the result of two competing forces, the
tendency towards ease of articulation and the tendency towards distinctness. We will
meet these again and again, in various disguises, in the subsequent literature. Secondly,
the evolution is not conceived as linear, as leading from a primitive to an advanced stage,
but as basically cyclic, though Gabelentz is cautious enough to use the more precise
metaphor of the spiral. With the necessary refinements, this still corresponds to the most
recent insights.
In 1912, Antoine Meillet published his article “L'évolution des formes grammaticales”.
Although the title is reminiscent of Humboldt's lecture, Meillet shows no sign of being
acquainted with it or with agglutination theory, though he certainly must have been. In
particular, his examples include Schlegel's examples. However, grammaticalization was
of interest to him not for its typological implications, but for its capacity to explain
certain facts in the history of Indo-European languages. He thus continues the Bopp -
neogrammarian tradition. Meillet assumes three main classes of words, “mots
principaux”, “mots accessoires” and “mots grammaticaux”, between which there is a gradual
transition:
L'affaiblissement du sens et l'affaiblissement de la forme des mots accessoires
vont de pair; quand l'un et l'autre sont assez avancés, le mot accessoire peut
finir par ne plus être qu'un élément privé de sens propre, joint à un mot
principal pour en marquer le rôle grammatical. Le changement d'un mot en
élément grammatical est accompli. (139)
This leads Meillet to what appears to be a reformulation of Gabelentz's agglutination
theory:
Les langues suivent ainsi une sorte de développement en spirale; elles
ajoutent des mots accessoires pour obtenir une expression intense; ces mots
s'affaiblissent, se dégradent et tombent au niveau de simples /141/ outils
grammaticaux; on ajoute de nouveaux mots ou des mots différents en vue de
l'expression; l'affaiblissement recommence, et ainsi sans fin.
The two driving factors he mentions, “expressivité” and “usage”, also have much in
common with Gabelentz's tendency towards distinctness and towards ease. Even when
he contends that analytic (= periphrastic) and synthetic constructions do not differ in
principle, because they are connected through grammaticalization, citing the example of
the Latin-Romance tenses, he only seems to strengthen a point that was already implicit
in agglutination theory. However, Meillet does go beyond this. First, he introduces the
term ‘grammaticalization’ (133), though he consistently puts it in quotation marks. He
does not define the term, but uses it in the sense of “attribution du caractère grammatical
à un mot jadis autonome” (131). Secondly, Meillet opposes grammaticalization to
analogy as the two principal processes of grammatical change (see below, ch. 5.4), thus
assigning grammaticalization a more narrowly defined place in linguistic theory. And
finally (147f), he offers what appears to be a useful extension of this notion: he considers
that the order of constituents may be grammaticalized, too, illustrating from Latin, in
which word order signifies expressive nuances, and French, where it expresses syntactic
relations.
Three years later, in his article “Le renouvellement des conjonctions”, Meillet extends
his theory to the historical analysis of conjunctions, especially in Latin-Romance. The
recruitment of new words which are then to follow the paths of grammaticalization
already well established in the language, is termed “renouvellement” and distinguished
from “création”, where grammatical and/or formal categories previously absent from the
language are introduced. The substitution of Latin nam by quare > French car is an
example of ‘renovation’ (renewal).⁴
⁴ ‘Renovation’ will here be used instead of the traditional ‘renewal’ because it offers a neat
counterpart to ‘innovation’.
Continuing in chronological order, we next come to Edward Sapir, who again represents
the other, Humboldtian tradition. Sapir's primary interest was neither in
grammaticalization as a force in historical change (he does not use the term) nor in agglutination theory
or evolutive typology; but in establishing a continuum of the different kinds of linguistic
concepts as a basis for his synchronic typology, he actually contributes to both of these
issues. In ch. V of his Language, Sapir (1921:102) defines the following four classes of
concepts:
Material content   I.   Basic Concepts
                   II.  Derivational Concepts
Relational         III. Concrete Relational Concepts
                   IV.  Pure Relational Concepts.
Semantically, there is a gradience through these four classes from the concrete to the
abstract; morphologically, there is a parallel gradience from “independent words or
radical elements” to expression “by affixing non-radical elements to radical elements ...
or by their inner modification, by independent words, or by position”. Sapir also
mentions the possibility of a word's diachronic passage through this continuum. His most
important, and most problematic, innovation is his attempt to give a more precise
semantic basis to the different grammaticalization stages. In this, he has had practically
no followers. One point which might at first seem to be of minor importance is
noteworthy: the expression of grammatical concepts by “position” shows up at the end
of Sapir's scale, while it appeared at the beginning of Humboldt's four stages. Take this
together with Meillet's contention that word order may be grammaticalized, too, and the
problem becomes obvious.
Henri Frei's work may be mentioned in passing. Nothing in his book La grammaire des
fautes (1929) is intended to be a contribution to grammaticalization theory; but he does
adduce a lot of relevant data for “un passage incessant du signe expressif au signe
arbitraire”, for which he finds two forces responsible, “le besoin d'expressivité” and “la
loi de l'usure” (233). Frei's association of grammaticalization with a change from the
expressive to the arbitrary will yet occupy us (p. 115).
In the period of American and even of European structuralism, topics such as
grammaticalization were not fashionable. With the decline of morphological and
evolutive typology, this vein of research in grammaticalization virtually broke off. The only
work of this time in which agglutination theory figures prominently is the Africanist Carl
Meinhof's book Die Entstehung flektierender Sprachen (1936), in which he treats the
evolution of flexional morphology in Semitic, Hamitic and Indo-European languages.
Following Jespersen (1922:375-388; see ch. V.4), Meinhof in ch. 4 posits two principal
ways in which inflection can evolve: 1) through grammaticalization, for instance of
nouns or verbs via postpositions to case suffixes; or 2) through the reinterpretation of
already existing phonological outgrowths of the word.
Apart from this sporadic recurrence, however, agglutination theory does not, as far as I
can see, regain its former popularity until Hodge 1970 and Givón 1971 (the latter
apparently being unaware of the venerable tradition which he continues). Two important
articles which throw new light on grammaticalization are Roman Jakobson's “Boas's
view of grammatical meaning” (1959) and V. M. Žirmunskij's “The word and its
boundaries” (1966; Russian original 1961). Jakobson attributes to Boas a distinction
between “those concepts which are grammaticalized and consequently obligatory in
some languages but lexicalized and merely optional in others” (1959:492), adducing “the
obligatoriness of grammatical categories as the specific feature which distinguishes them
from lexical meanings” (489). This is clearly an advance because it adds an essential
syntactic aspect to the until then almost exclusively morphological view of
grammaticalization. Here for the first time, too, an opposition between
grammaticalization and lexicalization is formulated.
In § 3 of his article, Žirmunskij deals with the “unification of the word combination into
a single (compound) word.” There are two possible directions that this process can take:
either towards grammaticalization, which yields “a specific new analytical form of the
word”, or towards lexicalization, which yields “a phraseological equivalent of the word
in the semantic sense.” (83). In the first case, the next stage is a synthetic inflectional
word form; in the second case, the next stage is a compound word. Several points should
be stressed here. First, there are processes of unification which do not involve the
development of one element of the combination into a grammatical formative and which
are therefore not regarded as grammaticalization. Secondly, such processes are called
lexicalization. Observe that this use of the term “lexicalization” is quite different from
Jakobson's use quoted above; this will constitute one of our problems (ch. 5.2). Thirdly,
the term ‘grammaticalization’ is used here not (only) for the transition from the analytic
to the synthetic construction, i.e. the agglutination process, but is explicitly applied to the
formation of an analytic construction. This is consistent with the meaning of the term
which covers an open-ended continuum comprising all of Humboldt's or Sapir's four
stages.
Outside structuralism, the Indo-Europeanist tradition of grammaticalization theory
remained uninterrupted. Its most important representatives are Jerzy Kuryłowicz and
Emile Benveniste. Kuryłowicz applied the concept of grammaticalization systematically
in his book The inflectional categories of Indo-European, many of which are explained
through grammaticalization. In his article “The evolution of grammatical categories”
(1965; notice again the tradition of article titles!), Kuryłowicz defines:
Grammaticalization consists in the increase of the range of a morpheme
advancing from a lexical to a grammatical or from a less grammatical to a
more grammatical status, e.g. from a derivative formant to an inflectional
one. (52)
By ‘increasing range’ Kuryłowicz means wider distribution, a defining factor of
grammaticalization which had hitherto only been hinted at by Schlegel. Notice that
word-formation is reintroduced into the picture, which we might think to have excluded
from grammaticalization with Žirmunskij. Kuryłowicz then gives a survey of various
Indo-European grammatical categories and their development through grammaticaliza-
tion. He also opposes grammaticalization to lexicalization in a third sense which will
occupy us in ch. 5.2.
Benveniste, who, curiously enough, consistently avoids the term ‘grammaticalization’,
has made various contributions to the subject. In his article “Mutations of linguistic
categories” (1968), he takes up Meillet's distinction between “création” and “renou-
vellement”, explaining that the former is innovative change, where grammatical
categories may disappear or emerge for the first time, while the latter is conservative
change, where categories are only formally ‘renovated’. The examples are again the
same as in Meillet 1912: the Latin-Romance perfect and future.
Switching back, for the last time, to the conception of evolutive typology, we find this
revived in two articles by Carleton T. Hodge and Talmy Givón. In his paper “The
linguistic cycle” (1970), Hodge somewhat simplifies the picture by distinguishing only
two stages, one with heavy syntax and little morphology (Sm), which roughly comprises
Humboldt's stages I and II; and another with little syntax and heavy morphology (sM),
which corresponds to Humboldt's stages III and IV. His point is essentially an empirical
one: he adduces the history of Egyptian as factual proof for the hypothesis that a single
language can pass through a full cycle ‘sM > Sm > sM’. His slogan “that one man's
morphology was an earlier man's syntax” (3) is echoed in Givón's formulation “Today's
morphology is yesterday's syntax.” (1971:413), which is the central thesis of his article
“Historical syntax and synchronic morphology: An archaeologist's field trip”. We will
deal in ch. 8.3 with the role of grammaticalization in historical reconstruction. Here it
suffices to mention that Givón has expanded his theory in various works, proposing, in
1979, the grammaticalization scale which we will discuss in ch. 2.2. The notion of
grammaticalization has by now become widely known and is receiving ever greater
interest. I will end my review here and discuss more recent work in thematically more
specific connections.
Summing up, we can say that the theory of grammaticalization has been developed by
two largely independent linguistic traditions, that of Indo-European historical linguistics
and that of language typology. The moment has come, I think, where the two threads
should be united. One tradition is conspicuously absent from this picture, namely that of
structural linguistics, from de Saussure to our day. This is by no means an accident:
whereas historical linguistics and typology have been concerned, from their beginning,
with processes and continuous phenomena and thus could easily accommodate
grammaticalization as a process which creates such phenomena, structural linguistics has
tended to favour a static view of language and clear-cut binary distinctions. In ch. 6 we
will try and see whether the perspective of grammaticalization cannot, in fact, shed some
light on problems traditional in structural linguistics.
5 A further abbreviation is represented in Werner's (1979:965f) German participle grammatisiert,
formed on an unattested grammatisieren “to grammatize” (i.e. grammaticalize).
2.1. The term ‘grammaticalization’
The derivational pattern which the word grammaticalization belongs to suggests that it
means a process in which something becomes or is made grammatical (cf. legalization).
In view of this, the term is unfortunate in several respects. Firstly, the term
‘grammatical’ has various meanings. In the above explication of grammaticalization,
grammatical signifies that which belongs to, is part of, the grammar, as opposed to, e.g.,
what belongs to the lexicon, to stylistics or discourse. Apart from this, grammatical has
come to mean something completely unrelated to the notion of grammaticalization: x is
grammatical is an abbreviation of x is grammatically correct and accordingly means that
x conforms to (as opposed to: is incompatible with, violates) the rules of grammar. What
is particularly distressing about this ambiguity is the fact that while grammatical may
have either meaning in attributive use, it can only have the second meaning in
predicative use; and yet the first meaning is needed in the predicative use which is made
of it in the above explication of grammaticalization.
Secondly, in addition to the above explication, grammaticalization must mean a process
in which something becomes or is made more grammatical (cf. the quotation from
Kuryłowicz on p. 6). We defer to ch. 6.2 the problem of what it means to say that
something belongs to the grammar to a greater or lesser degree, and observe here that
this latter notion should be designated by the noun grammaticality. That is, in a theory
of grammaticalization, the term ‘grammaticality’ would be needed to mean the degree of
grammaticalization which an element has reached. Again, however, this term (or its
variant ‘grammaticalness’) is currently based on the other meaning of grammatical and
therefore means the well-formedness of something according to the rules of grammar.
There would seem to exist a way out. Some authors (e.g. Givón 1975:49, Bolinger
1978:489) have used ‘grammaticization’ instead of ‘grammaticalization’.5 We might
adopt this use and substitute, accordingly, ‘grammaticity’ for ‘grammaticality’ in the
intended sense. Unfortunately, this terminological arrangement would soon come to an
inconsistent end, because we would not only have to call ‘grammatic’ what we always
have called ‘grammatical’; what is more, this terminological regularization would not be
implementable in French, the language in which the term ‘grammaticalization’ was
coined in the first place. Finally, it seems paradoxical to give up the well-established
‘grammaticalization’ instead of the rare ‘grammaticization’. We will therefore abide by
the terms ‘grammatical’, ‘grammaticality’ and ‘grammaticalization’ and use them
exclusively in the sense in which grammatical designates that which belongs to the
grammar. It seems more convenient to leave the resolution of the terminological conflict
to the other side; one might, for example, resort to the expression grammatically
well-formed if one wants to signify “grammatically well-formed”.
6 Brettschneider, Li and Thompson actually apply this term only to one specific
grammaticalization channel, namely the reduction of a (subordinate) clause to a word.
A more serious question is whether the term ‘grammaticalization’ is not unduly stretched
if we apply it to such a large range of phenomena. On the one hand, I intend to follow
Žirmunskij in subsuming the formation of analytic constructions under
grammaticalization. On the other hand, the process does not stop at the level of inflectional
morphology. The English pronoun him, after having been grammaticalized to a
verb-suffixal object marker -im in Tok Pisin, has further evolved into an invariable
transitive verb marker. Such linear extensions of grammaticalization processes into
derivational morphology are not at all rare. On the one hand, since such extensions
continue the same pattern, they should be called by the same name. On the other hand, it
does not seem correct to say that the suffix -im, in its change from an object marker to a
transitive verb marker, becomes more grammatical. A term slightly more comprehensive
than ‘grammaticalization’ would seem to be needed; but the alternatives that have
appeared in the literature are no more satisfactory. Li & Thompson (1974), Givón
(1979[u]:209) and Brettschneider (1980:94) have offered the term ‘condensation’
essentially for what is here called grammaticalization.6 A precursor of this term is Gabe-
lentz's (1901:433, 436) ‘Verdichtungsprozess’. In a loose sense, condensation may be
used to designate one aspect of grammaticalization, namely the narrowing down of the
level of grammatical structure (see ch. 4.3.1); and this is actually what the above authors
have in mind. However, if we take the word literally, it would have to mean that
something becomes denser, compacter in the course of grammaticalization. On the
contrary, the authors quoted in ch. 1 concur that the meaning of a grammaticalized sign
is weakened in the same measure as its expression is weakened; a more grammaticalized
sign does not say the same thing as a less grammaticalized one in a smaller space, as
seems to be implied by the term ‘condensation’.
The term ‘reduction’ (used, for instance, in Langacker 1977:103-107) does not have this
shortcoming, but displays a different one, which, incidentally, it shares with
‘condensation’. It is not specific enough, because it covers also the reduction of a phrase
to a compound word, which is not a grammaticalization process.
Authors depending on A. Martinet have sometimes used the term ‘morphématisation’
essentially with the meaning ‘grammaticalization’ (e.g. in Martinet (ed.) 1968:1064f).
This presupposes Martinet's terminology, in which ‘morphème’ equals other linguists'
‘grammatical morpheme’. Apart from its local character, ‘morphématisation’ has the
disadvantage of being too narrow. Although the formation of grammatical morphemes is
probably the focus of grammaticalization, it is by no means all of it.
2. Characterization and delimitation of the concept 10
We are thus led back to our term ‘grammaticalization’. I see no way to avoid its exten-
sion, in a generic sense, to processes such as the one illustrated above. If one wants to
make specific reference to just that type of process, one will, of course, not use the term
‘grammaticalization’; ch. 5.2 will deal with the question of whether a convenient term
can be found.
2.2. The meaning of ‘grammaticalization’
Having settled on the term, we may now characterize more fully the concept. We will
first justify one decision which has been presupposed in the above terminological
discussion, namely the interpretation of grammaticalization as a process which may not
only change a lexical into a grammatical item, but may also shift an item “from a less
grammatical to a more grammatical status”, in Kuryłowicz's words. Since adjectives
derived in -al are commonly non-relative (they have no polar antonyms and do not take
part in comparison; cf. maternal), one might take the position that the property of being
grammatical, of belonging to the grammar, is a binary property and not a matter of
degree. As I said, we will postpone discussion of this problem to ch. 5.2. Anyway, if this
were accepted, then grammaticalization could not be a gradual, relative process. From
this position it would be correct to say that something is either grammaticalized or not
grammaticalized. This is the position of Jakobson, Mel'čuk and Lyons. Lyons writes
Different languages make a different selection, as it were, from the set of
possible distinctions that could be made and grammaticalize them (i.e. make
them grammatically functional) in terms of such categories as tense, number,
gender, case, person, proximity, visibility, shape, animacy, etc.
Throughout his book, Lyons consistently uses the expression ‘x is grammaticalized in
language L’ only if x is a semantic category which is represented by a grammatical
category in L. At first sight, this appears to provide us with a simple and intuitively
satisfactory interpretation of the notion ‘grammaticalization’. But then we must also
provide binary criteria which answer the question: which conditions must something
fulfill in order to be a grammatical category of a language L? Jakobson (1959:489; see
the quotation above on p. 6) and Mel'čuk (1976:84) answer that the essential criterion is
obligatoriness: a meaning is grammatical in L if the speaker cannot choose to leave it
unspecified. The criterion of obligatoriness will in fact be used below (ch. 4.2.3); but it
does not appear to me to be an absolute one. Something is obligatory relative to the
context; i.e. it may be obligatory in one context, optional in another and impossible in a
third context. Take, for instance, the category of number. In Latin, every noun form
compulsorily belongs either to the singular or to the plural; the speaker cannot choose to
leave the number unspecified. Here the criterion correctly decides that number is a
grammatical category in Latin. In Turkish, most nouns may be specified for number by
adding a plural suffix. Some nouns may not, for instance terms of nationality or
profession if they form the predicate. No noun may be specified for number if preceded
by a cardinal numeral. In most other contexts, number is optional; i.e. the unmarked form
may signify the singular or the plural. Is number obligatory in Turkish or not? Certainly
not nearly as obligatory as in Latin. Should we therefore say that number is not a
grammatical category in Turkish? Would it not be more illuminating to say that number
is more grammaticalized in Latin than in Turkish?
7 The terminological confusion associated, especially in German, with the term ‘Flexion’ and its
cognates may be resolved in English, for our purposes, by the following convention: ‘inflection’
will be opposed to ‘word-formation’ (esp. ‘derivation’) as the syntax-bound part of morphology;
‘flexion’ will be opposed to ‘agglutination’ (and ‘isolation’) as one of the techniques of
morphological typology (namely the fusional or amalgamating one, which Sapir (1921:129ff) calls
‘inflective’). Cf. Comrie 1981[L]:41f on the terminological dilemma.
An analogous argument could be made with respect to any other criterion that one might
be inclined to propose. Ch. 3 will provide abundant evidence that even the mere
transition from a lexeme to a grammatical formative (if we were to restrict grammatic-
alization to this process) is not a leap, but a gradual shift to a new function. The category
of prepositions is a notable example. In many languages, there are some prepositions like
English beyond which need not be treated individually in the grammar because they obey
general rules of syntax like other ordinary lexemes; and there are other prepositions like
of which require special treatment in the grammar because they are obligatory in a
number of constructions. The space in between is filled by the bulk of prepositions,
which are at different stages on their way from lexeme to grammatical formative. I
therefore see no way to avoid the conclusion that grammaticalization is a process of
gradual change, and that its products may have different degrees of grammaticality.
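The number argument can be restated as a minimal sketch. It assumes, purely for illustration, that the three context types named above exhaust the relevant contexts and weigh equally; the context labels, the function names and the resulting scores are hypothetical and are not a measure proposed in the literature.

```python
from enum import Enum


class Status(Enum):
    OBLIGATORY = "obligatory"
    OPTIONAL = "optional"
    IMPOSSIBLE = "impossible"


# Context labels are hypothetical shorthand; the statuses restate the prose above.
# "other contexts" simplifies the text's "most other contexts".
NUMBER_MARKING = {
    "Latin": {
        "predicate noun of nationality or profession": Status.OBLIGATORY,
        "noun after a cardinal numeral": Status.OBLIGATORY,
        "other contexts": Status.OBLIGATORY,
    },
    "Turkish": {
        "predicate noun of nationality or profession": Status.IMPOSSIBLE,
        "noun after a cardinal numeral": Status.IMPOSSIBLE,
        "other contexts": Status.OPTIONAL,
    },
}


def obligatoriness(language):
    """Share of the listed contexts in which number must be specified."""
    contexts = NUMBER_MARKING[language]
    obligatory = sum(1 for s in contexts.values() if s is Status.OBLIGATORY)
    return obligatory / len(contexts)


def more_grammaticalized(a, b):
    """Compare two languages by degree, instead of asking a yes/no question."""
    return obligatoriness(a) > obligatoriness(b)
```

On these assumptions, more_grammaticalized("Latin", "Turkish") holds, which is the comparative statement the text prefers to a binary verdict on Turkish.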
If grammaticalization is not a binary, but a gradual change of state, then we must face the
problem that it may be an open-ended process. Some authors (e.g. Ronneberger-Sibold
1980:113-115) have restricted the notion of grammaticalization to the passage from an
analytic to a synthetic construction. We have already observed (p. 2) that this passage,
the agglutination process, stood godfather to the denomination of agglutination theory.
Possibly this transition into the unity of the word is the most salient phase of the
grammaticalization process. Nevertheless, the nature of the process is the same before
and after this phase. The formation of analytic constructions out of ‘word combinations’
(Žirmunskij), on the one hand, and the melting of an agglutinative to a flexional
formation,7 on the other, are phases of the grammaticalization process. The question
naturally arises: where does grammaticalization start, and where does it end? We will
provisionally answer this question by diagram F1, which incorporates the one presented
in Givón 1979[u]:209.
level       Discourse       Syntax         Morphology                   Morphophonemics
technique   isolating    >  analytic    >  synthetic-agglutinating   >  synthetic-flexional   >  zero
phase            syntacticization   morphologization      demorphemicization          loss
process     grammaticalization
F1. The phases of grammaticalization
This picture is incomplete and simplified, because it represents only two of the factors
involved in grammaticalization, namely those that will be called condensation and
coalescence in ch. 4.3, and because it pretends a perfect correlation between these two.
Nevertheless, it suffices to illustrate, for present purposes, the range of the grammatic-
alization process and the phases conventionally recognized in it. Thus we assume that
grammaticalization starts from a free collocation of potentially uninflected lexical words
in discourse. This is converted into a syntactic construction by syntacticization, whereby
some of the lexemes assume grammatical functions so that the construction may be
called analytic. Morphologization, which here means the same as agglutination, reduces
the analytic construction to a synthetic one, so that grammatical formatives become
agglutinating affixes. In the next phase, the unity of the word is tightened, as the
morphological technique changes from agglutinative to flexional. This transition from
morphology to morphophonemics will here be called demorphemicization. Givón calls
it lexicalization, and this is the fourth sense in which the term appears in the literature.
This need not worry us at the moment. We pass over to the final phase, where expression
and content of the grammatical category become zero.
I repeat that this account is simplified. It makes it appear as if the grammaticalization
process had a clear-cut end, which we will see it has not. On the other hand, the start of
the process is not readily identifiable either, and we will defer this problem, too. The sole
function of F1 is to give a first impression of what is covered by grammaticalization.
A single example to illustrate the whole process is not easy to come by, though such ex-
amples probably exist. At any rate, it may be remarked at this juncture that it is not
essential to grammaticalization theory that every element affected by grammaticalization
enter the process at the start and leave it at the end, where start and end are identified
with reference to F1. On the contrary, this is certainly the rarest case. I will therefore
illustrate a complete grammaticalization process with two examples which together
cover the entire range.
From the beginning of the literary tradition up to the postclassical period, the Latin
language had an elaborate system of demonstrative pronouns. There was a deictically
neutral pronoun is, which was also used as an anaphoric personal pronoun. Besides,
there were three deictic pronouns, of first (hic), second (iste) and third (ille) person
deixis. Apart from their function as NPs, which is our concern here, all four could
function as determiners. In Archaic Latin, the members of the deictic triad always had
some demonstrative force. Their use was subject to no syntactic rule; they occurred
where and how the speaker saw fit. However, ille was the unmarked member of the triad
and began to assume anaphoric function, thus intruding into the area of is, which it
finally ousted in Vulgar Latin. At this stage, ille was a neutral anaphoric pronoun,
witness E1, from the first half of the 6th cent. AD.
E1. duo rustici sic ad hora captum comederunt, et ita illis contigit, et unus illorum
sanguinem deiuso produxit nimium. (Anth. obs.cib. 25)
‘two peasants ate one [turtle dove] just caught, and it so happened to them,
and one of them voided too much blood’ (Pulgram 1978:288)
8 Cf. also sentences such as A quelle heure le train arrive-t-il?, La grammaire n'a-t-elle pas le
devoir de s'attacher aux fonctions?, Peut-être les hypothèses contraires veulent-elles seulement dire
que ...
Here we have already entered the path of syntacticization, because the function of ille is
but the grammatical representation of an NP of a previous clause. Still, ille is not
commonly used as a personal pronoun in subject position. This step has been made, to a
varying extent, in the Romance languages. In Standard Italian, for instance, a finite verb
does not need an overt subject; vende alone may be used to mean ‘he sells’. In French,
however, the personal pronoun il (<ille) is obligatory if there is no other subject; the
corresponding example would be il vend. With this, the phase of syntacticization is
completed: we have arrived at an analytical verb form.
The morphologization of this combination would presuppose that il remains present even
when there is a subject in the same clause. This step has not (yet) been taken; but
preparations are being made. In a construction such as Et lui, vend-il des fleurs?, the
left-dislocated NP is almost the syntactic subject of the clause, and yet the pronoun il
cannot be absent.8 It is enclitic to the verb and could, through agglutination, become a
suffixal personal ending. (It is improbable that it will do so in French; but this is not
essential to the demonstration.) Summarizing then, we have seen that Latin ille has
started at the beginning of the scale in F1 and has advanced, in the shape of French il, to
the beginning of the morphologization phase.
The second half of this demonstration takes us back to Proto-Indo-European. The
so-called secondary personal endings of the active verb were *-m, -s, -t for the three
singular persons. Though the details are not recoverable, scholars generally agree that
these suffixes derive from the agglutination of personal pronouns. (As will be recalled
[p. 2], this was already Bopp's position.) In particular, the third person singular suffix
*-t is most probably a reduced form of the neutral demonstrative stem *to- (details in
Szemerényi 1970:302-305 and Seebold 1971). We can therefore be fairly confident that
this example takes up at the very point where the former example leaves off.
Still in Proto-Indo-European times, these endings were extended by a suffix *-i, whose
nature need not concern us here (cf. p. 32). By the time of Archaic Latin, this -i was
again lost. The personal suffixes retained their pronominal function, i.e. their capability
of representing the subject, over most of the subsequent period. Classical Latin
vendit means ‘he sells’ and needs an overt (pronominal) subject even less than Italian
vende does. However, the pronominal function gradually got lost, and parallel to this the
morphological bond between the stem and the personal endings grew tighter. In Latin,
the personal endings cannot be neatly separated from the stem, which means that they are
not agglutinative but flexional and they are partly different according to conjugation
class; so in this sense, and to this extent, they are demorphemicized. This is the transition
from morphology to morphophonemics. The phonological substance of the endings is
then further reduced; to the Latin vendo, vendis, vendit corresponds the Italian vendo,
vendi, vende. As was mentioned above, the Italian personal endings can still represent,
by themselves, the person of the subject. This is no longer so in French. The personal
endings have been reduced to zero in the singular (and in the third person plural), which
means that apart from exceptions, person is no longer a morphological category of the
singular verb. This is the end of the grammaticalization process.
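The scale in F1 and the two examples can be summarized in a minimal sketch. It assumes the simplified picture of F1 with one technique per stage and one phase per transition. The function name is hypothetical, and the endpoints assigned to the two examples are approximations of the account given above.

```python
# Techniques and phases in the left-to-right order of F1.
TECHNIQUES = [
    "isolating",
    "analytic",
    "synthetic-agglutinating",
    "synthetic-flexional",
    "zero",
]
PHASES = ["syntacticization", "morphologization", "demorphemicization", "loss"]


def phases_between(start, end):
    """Return the phases traversed from one technique to a later one in F1."""
    i, j = TECHNIQUES.index(start), TECHNIQUES.index(end)
    if j < i:
        # The diagram is read from left to right only.
        raise ValueError("F1 is read from left to right")
    return PHASES[i:j]


# Latin ille to French il: syntacticization completed, morphologization not yet begun.
ILLE_TO_IL = phases_between("isolating", "analytic")

# Proto-Indo-European personal endings to French zero endings (approximate endpoints).
PIE_ENDINGS_TO_FRENCH = phases_between("analytic", "zero")
```

Concatenating the two lists yields all four phases of F1, which is the sense in which the two examples together cover the entire range.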
2.3. Degrammaticalization
Various authors (Givón 1975:96, Langacker 1977:103f, Vincent 1980[I]:56-60) have
claimed that grammaticalization is unidirectional; that is, it is an irreversible process, the
scale in F1 cannot be run through from right to left, there is no degrammaticalization.
Others have adduced examples in favor of degrammaticalization. The few that have
come to my knowledge will be briefly discussed.
Kuryłowicz (1965:52f) maintains that there is a reverse process to grammaticalization
which he calls lexicalization. His examples have, according to him, the following
structure: derivational category is grammaticalized to inflectional category and is again
lexicalized to derivational category. The examples are: Proto-Indo-European *-a was a
derivational nominal affix with collective meaning. In Latin, it was grammaticalized to
the plural marker of neuter nouns, e.g. ovum ‘egg’, pl. ova. In Italian, the Latin neuter
nouns have become masculine and form their plural in -i. However, -a is again used as
a derivational collective suffix, e.g. in muro ‘wall’ — mura, uovo ‘egg’ — uova.
The Pre-English meaning of the perfect was stative. In Modern English, all the verbs
which formerly formed their perfect with be use have now, and the meaning of the
perfect is no longer stative but completive. However, for some verbs the perfect with be
has been restored in the old stative meaning, e.g. is come/gone.
Again, the verbs can, may, shall, dare are original perfect forms (known as
preterite-presents in Germanic linguistics). While the perfect has changed its meaning in
English, today these forms again signify a present state. Furthermore, the hypothesis that
they have changed from inflectional forms to derivational stems is evidenced by the fact
that they have developed an inflection of their own: could, might, should, durst.9
9 As a last example, Kuryłowicz mentions in passing the development of sex to gender to sex
from Proto-Indo-European via Proto-Germanic to Modern English. This is a process whose details
are complicated; however, it is, in the last analysis, an instance of continuous grammaticalization;
see Lehmann 1982[U], § 7.2.
None of these examples stands up to closer scrutiny. All of them suffer from the defect
that the newly evolved derivational category does not possess a minimum of
productivity, whereas those Proto-Indo-European derivational categories which they
ultimately go back to (if we may assume, for the sake of argument, that perfect was a
derivational category in Proto-Indo-European) must clearly have been highly productive,
for otherwise they could not have yielded inflectional categories. Instead, the specific
examples which Kuryłowicz adduces are virtually the only ones of their kind; that is,
they are lexicalized in quite a different sense (the one we already encountered in
Žirmunskij): they are frozen, not amenable to any rule, idiomaticized.
Secondly (and this is a difficulty which most putative examples of degrammaticalization
are liable to meet with), these lexicalized forms have not really made their way back
from a more grammaticalized, inflectional stage, but instead directly continue the
original stage. Italian uova is not a modern alternative to uovi, nor has the construction
is gone developed on the basis of an earlier has gone, nor do can etc. go back to older
completive perfect forms. Instead, Italian uova continues Latin ova, and English is gone
and can etc. continue Proto-Germanic stative perfects or preterite-presents, respectively.
Finally, although it must be admitted that the -a of Ital. mura does not go back to Latin,
it is not the case that uova and mura are collectives; they are plural forms. So these
examples do not establish a degrammaticalization process.
Kahr (1976:122) offers a single example of degrammaticalization. Modern Turkish has
a postposition için ‘for’, which, like some others, takes nominal complements in their
unmarked form, but pronominal complements in the genitive (cf. p. 71-73 below). In
some rare instances, this morpheme is suffixed, e.g. in on-un-çün (D3-GEN-for)
‘therefore’. Since these suffixed forms are archaic relics, the modern productive
postpositional usage must be explained, according to Kahr, as a degrammaticalization of
the suffixal construction.
This is just like explaining the prepositional function of Portuguese com ‘with’ through
the degrammaticalization of the prefixal construction comigo, contigo etc. ‘with me, with
you’ etc. or, for that matter, of the Latin suffixal construction mecum, tecum etc. It seems
clear that the Turkish case must be just like the Romance one: What was originally an
adposition continued to be an adposition in modern times, except in combination with
pronouns, where it became affixal already in early times.
A class of possible examples comes from decliticization. One factor of the phonological
weakening of a grammaticalized element is its deaccentuation and subsequent cliticiza-
tion. If elements could be found which were exclusively clitic at a former stage, but at a
later stage allowed an autonomous use, these would be examples of degrammaticaliza-
tion. Jeffers & Zwicky (1980, § 3) first adduce the Indo-European relative pronouns *yo
and *kwo-, which may be accentuated in their respective Sanskrit and Latin forms yas
and qui. These are said to derive from clitic connective particles which formed a
sequence with clitic anaphoric pronouns. Such a sequence coalesced and was
reinterpreted as an inflected relative pronoun.
Two objections must be raised against this argument. First, even granting the
etymological correctness of this reconstruction, nobody can guarantee that these
connectives were actually clitic at the stage in question. Hittite, for instance, does have
such sequences as hypothesized by Jeffers and Zwicky; but the connectives are not clitic.
Second, the reconstruction proposed by Jeffers and Zwicky is probably false. The syntax
of the clitic connectives in the historical languages (e.g. Hittite -ya, Sanskrit -ca, Latin
-que) differs markedly from that assumed by them for Indo-European. The relative
pronouns are much more plausibly derived from an anaphoric/demonstrative and an
interrogative/indefinite pronoun, respectively (see Lehmann 1984, ch. VI.1), whose
relation to the connectives may well be left open. Given the notorious indeterminacy of
reconstruction, everything is, of course, possible. What we need, however, are not
hypothetical, but historical examples.
Jeffers' and Zwicky's second case is of a completely different nature. The verb of such
ancient Indo-European languages as Vedic could be unaccentuated, especially in main
clauses; this appears to be no longer possible at later stages. Though this is probably true,
it is not an instance of decliticization, since verbs have never been clitic. Clisis is a
lexically inherent property of an element which may manifest itself either independently
or in dependence on the semantosyntactic context (Jeffers' and Zwicky's “special” vs.
“simple” clitics). In the case at hand, however, we are dealing with a certain pattern of
sentence intonation which leaves the main verb unaccentuated and which ceases to be
usual, or even possible, at a later stage.
The last potential example of degrammaticalization is provided by English. In
Proto-Germanic, the genitive suffix -s was a flexional ending bound to the word. In
Modern English, however, we find such phrases as the King of England's daughter and
the man I met yesterday's son, where the -s is agglutinated to a complex NP. This looks
like a bona fide case. However, the historical details are complex (see Janda 1980). On
the one hand, the originally flexional -s became more agglutinative, in Middle English,
as a contingent result of the reduction and regularization of the Old English case
paradigm. On the other hand, dialects and lower sociolects of Middle English had the
alternative construction ‘NP his N’ (e.g. the king (of England) his daughter) available,
which itself became homophonous with the inherited genitive. As a result, the genitive
suffix was reanalyzed as a clitic possessive pronoun. Thus, it was not the genitive on its
own that expanded to higher syntactic levels. Rather, the (real or putative) clitic
possessive pronoun, which had been compatible with these levels from the start, got
generalized to non-masculine genders.
We may therefore conclude this discussion with the observation that no cogent examples
of degrammaticalization have been found. This result is important because it allows us
to recognize grammaticalization at the synchronic level. Given two variants which are
related by the parameters of grammaticalization to be made explicit in ch. 4, we can
always tell which way the grammaticalization goes, or must have gone. The significance
of this for the purposes of internal reconstruction is obvious; see ch. 8.3.
If grammaticalization is really a unidirectional process, one must ask why this should be
so. I will not anticipate here the theoretical considerations of the final chapter, but
mention only the explanation that Givón (1975:96) has given. He says that grammatic-
alization essentially involves a deletion of both semantic and phonological substance.
Degrammaticalization would have to be an enrichment in semantic and phonological
substance. Now while the result of a deletion process may be predictable, its source is
generally not predictable from the result; so the product of an enrichment process, or of
degrammaticalization, would also not be predictable. This appears to be a step in the
right direction. However, it remains to be seen, first to what extent the results of
grammaticalization processes are really predictable, and secondly, if rules for these
processes can be found, why natural languages cannot apply them, at least to non-zero
elements, in reverse direction.
2.4. Renovation and innovation
Grammaticalization changes analytic into synthetic constructions. There are, however,
numerous instances in the history where languages have changed from the synthetic to
the analytic type. This was in fact the observation on which August W. Schlegel (1818:
14-30) based his introduction of the terms ‘analytic’ and ‘synthetic’ in the first place. He
observed, for instance, that Latin case inflection has been substituted by prepositional
constructions in the Romance languages, that certain tenses are no longer formed by verb
inflection but by auxiliaries, and so forth. If such changes from the synthetic to the
analytic do occur, aren't they instances of degrammaticalization? This has been
maintained by Lightfoot (1979:223-225), but the argument has rightly been rejected by
Heine & Reh (1984:75f). Far from invalidating grammaticalization theory, the evolution
synthetic → analytic is predicted by it and has been so predicted since the early days of
agglutination theory. If the evolution along grammaticalization scales takes the form of
a spiral, this implies that forms which are given up near the end of the scale may be
substituted by new forms entering at its beginning. For degrammaticalization to obtain,
analytical forms would have to be historical continuants of synthetic forms; but this
actually never happens.
10 A wider use of this term has been made in Indo-European linguistics, where it may cover what
is here called innovation, renovation and analogical change.
This presupposes that we make a clear distinction between the two diachronic relations
‘y continues x’ and ‘y replaces x’. Within a grammaticalization scale, the relation ‘y
continues x’ is equivalent to the relation ‘x is grammaticalized to y’. However, the
relation ‘y replaces x’ is neither a relation of grammaticalization nor of degrammati-
calization. We shall call it, with Meillet's ‘renouvellement’ in mind, the relation of
renovation, also called renewal in the literature. Within a grammaticalization scale, ‘y
replaces x’ is equivalent to ‘x is renovated by y’. For brevity's sake, I will employ the
following symbolism:
x > y = ‘x is grammaticalized to y’;
x /r y = ‘x is renovated by y’.
Latin ille > French il
Latin clara mente > Italian chiaramente
Latin ille /r French ce(lui) là
Latin clare /r Italian chiaramente
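The two relations can also be kept apart in a small data model. The following Python sketch is only an illustration under stated assumptions: the class name `Change`, the constants and the helper `continuants` are hypothetical and not part of the account given here, and the ASCII strings `>` and `/r` simply stand in for the symbols just introduced.

```python
from dataclasses import dataclass

# Hypothetical labels for the two diachronic relations distinguished in the text.
GRAMMATICALIZATION = ">"   # 'y continues x'
RENOVATION = "/r"          # 'y replaces x'


@dataclass(frozen=True)
class Change:
    """One diachronic relation between an older form x and a newer form y."""
    source: str     # x
    relation: str   # GRAMMATICALIZATION or RENOVATION
    target: str     # y

    def notation(self) -> str:
        # Render the relation in the symbolism of the text, e.g. 'x > y'.
        return f"{self.source} {self.relation} {self.target}"

    def is_continuation(self) -> bool:
        # Only grammaticalization makes y a historical continuant of x.
        return self.relation == GRAMMATICALIZATION


# The four examples given above.
EXAMPLES = [
    Change("Latin ille", GRAMMATICALIZATION, "French il"),
    Change("Latin clara mente", GRAMMATICALIZATION, "Italian chiaramente"),
    Change("Latin ille", RENOVATION, "French ce(lui) là"),
    Change("Latin clare", RENOVATION, "Italian chiaramente"),
]


def continuants(form: str, changes=EXAMPLES) -> list:
    """Return the forms that historically continue `form` (not those replacing it)."""
    return [c.target for c in changes if c.source == form and c.is_continuation()]
```

On this model, `continuants("Latin clare")` is empty although Italian chiaramente fills the place of clare: replacement is recorded, but it does not count as continuation.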
Further examples are the renovation of the future, perfect, passive and adjective
comparison, which had been synthetic categories in the ancient Indo-European
languages, by the corresponding analytic categories in several of the modern languages,
including English and the Romance languages. A particularly rich field of constant
renovation is that of the subordinating conjunctions, as already observed by Meillet. All of
these examples will be discussed more fully in ch. 3. A wealth of further material for the
development of the synthetic towards the analytic may be found in Tauli 1966, ch. I.
Now consider the situation where an analytic construction y comes into being, but there
is no x such that x /r y. For example, the Latin ille, illa has also been grammaticalized
into the French definite articles le, la. But when we ask what the x is in ‘Latin x /r
French le, la’, we get no answer. Latin had no grammatical category which corresponded
to the French articles, so that nothing has been renovated by these. This is an instance of
what Meillet (1915f) and Traugott (1980) have called (novel) creation. This is an
imprecise term, because all linguistic activity, including renovation, is creative activity.
‘Innovation’, as used in Benveniste 1968, seems to be a better one, because it expresses
the desired meaning and provides a suitable contrast to ‘renovation’.10 Unfortunately, to
innovate is intransitive, so that we will resort to create in case we need a transitive verb.
Further examples of innovation are the introduction of numeral classifiers in Persian, the
distinction expressed by ser vs. estar in Spanish, the progressive form in English and the
imperfective vs. perfective aspects in Slavic.
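The criterion used above, that y is an innovation if there is no x such that x /r y, amounts to a simple lookup. The following Python sketch illustrates this under stated assumptions: the names `RENOVATES` and `classify` are hypothetical, and the table contains only the examples already mentioned. It deliberately treats the distinction as binary.

```python
# Maps a newer form y to the older form x that it renovates (x /r y).
# Hypothetical table; a form without an entry renovates nothing
# in the immediately preceding stage of the language.
RENOVATES = {
    "French ce(lui) là": "Latin ille",
    "Italian chiaramente": "Latin clare",
}


def classify(new_form: str, renovates=RENOVATES) -> str:
    """Return 'renovation' if some x /r new_form is recorded, else 'innovation'."""
    return "renovation" if new_form in renovates else "innovation"


# The French definite article has no Latin x with x /r 'le, la'.
print(classify("French le, la"))        # innovation
print(classify("French ce(lui) là"))    # renovation
```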
In theory, the distinction between innovation and renovation is entirely clear. Innovation
is revolutionary; it creates grammatical categories that had not been in the language
before. Renovation is conservative; it only introduces new forms for old categories. The
notion of a category which had not been in the language before should cause no
problems. Obviously no one would like to commit himself to the claim that no ancestral
stage of the Indo-European languages had numeral classifiers, an essence/accidence
distinction or a distinction between progressive and neutral or perfective and
imperfective aspects. What matters here is the stage immediately preceding the
In practice, however, there are numerous borderline cases between innovation and
renovation. First we must notice that renovation takes its time. There are admittedly
cases where the new construction entirely and almost instantly replaces the old one,
taking a function and shape maximally similar to the old ones; this has occurred in the
renovation of the Latin future in the Romance languages. More often, however, the new
and the old constructions coexist for some time. An example is provided by the new
analytic and the old synthetic perfect (‘passé composé vs. passé simple’) in the Romance
languages. As long as such a situation obtains, the two categories tend to be functionally
non-identical, so that we have two categories where we formerly only had one. So far
this is not really a conservative change. Conservatism asserts itself only when the old
construction falls out of use and the new one takes over its function (and possibly its
morphosyntactic form). So what is conservative about renovation is not the particular
situation brought about by the introduction of the renovative periphrastic construction,
but rather the reentering of a grammaticalization channel which, if run through, will lead
to a result maximally similar to the situation which had obtained formerly.
Secondly, two grammatical constructions can be functionally similar only to the extent
that they are formally similar. If the renovation of a construction enters upon a path that
cannot lead to anything formally similar to the former construction, a complete
replacement of the old function will never be obtained, and to this extent the change will
be partly renovative, partly innovative. Consider the change that is often called the
renovation of Latin case inflection by prepositional constructions. Prepositions will
never become case suffixes; even their development into case prefixes is relatively rare
(cf. ch. Here it suffices to observe that the Latin case suffixes have disappeared,
but the Romance prepositions are far from truly fulfilling their function. On the one
hand, they do less than that, since strict word order comes in where prepositions (or other
means) fail. On the other hand, they do more than that, since prepositions are much more
intimately connected with the verb than are case suffixes and may be used to derive
compound verbs. Moreover, prepositions can express finer distinctions than cases can
because there are more of them. Consequently, the loss of Latin case inflection and the
introduction of prepositional constructions is renovative to the extent that the functions
of the two constructions overlap, and it is innovative to the extent that they do not.
11 Cf. the telling remark by A. Schlegel, who was the first to observe some of the above cases;
according to him (1818:30), they “ne laissent pas de sentir un peu la barbarie.”
2.5. Reinforcement
If an element is weakened through grammaticalization, there are, in fact, two
possibilities open to linguistic conservatism. The first is to give it up and replace it by a
new, but similar one. This is renovation, as we have just seen. The second is to reinforce
it, thus compensating for and checking the decay. Here are some examples: Latin aliquis
‘someone’ is reinforced by unus ‘one’, yielding *aliqui-unu; this is then grammaticalized
to Italian alcuno, French aucun etc. Latin ille, which, as we have seen, was
grammaticalized to the Romance definite article, was reinforced in its demonstrative
function: *eccu illu ‘voilà that (one)’ resulted in Italian quello. Many Latin prepositions
have been reinforced on their way into Romance; e.g. Latin ante ‘in front, before’ was
strengthened by preposed ab ‘from’ before it developed into French avant. We will
introduce a symbol for the relation of reinforcement: ‘the reinforced form of x is y’ will
be written ‘x v y’. The three symbols >, /r, and v will also be used in the converse
relations ‘y < x’, ‘y r/ x’ and ‘y w x’.
Reinforcement can be reiterated ad libitum. For instance, IE *in ‘in’ v *en-tos > Latin
intus ‘within, inside’ v *de-intus ‘of/from within’ > French dans ‘in’ v dedans ‘inside’.
Pre-Latin *is ‘that (one)’ v Latin iste ‘that one on your side’ v Proto-Romance *eccu
istu ‘lo that one’ > It. questo ‘this’ v questo... qui and French ce...-ci ‘this (one) here’. At
the stage where the reinforcement is first made, it sounds to puristic ears like a redundant
accumulation,11 a hypercharacterization (on the latter, see Malkiel 1957f and Tauli 1966,
ch. IV). But the emphasis soon vanishes, and the reinforced expression becomes neutral
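Such chains of alternating reinforcement and grammaticalization can be written down step by step. The following Python sketch is an illustration only: the names `CONVERSE`, `IN_CHAIN`, `render`, `converse` and `count_reinforcements` are hypothetical, and the ASCII strings `>`, `/r`, `v` and their converses `<`, `r/`, `w` stand in for the symbols introduced above.

```python
# Converse symbols as introduced in the text: '>' / '<', '/r' / 'r/', 'v' / 'w'.
CONVERSE = {">": "<", "/r": "r/", "v": "w"}

# Each step is (older form, relation, newer form); the chain is the first
# example of reiterated reinforcement given in the text.
IN_CHAIN = [
    ("IE *in", "v", "*en-tos"),
    ("*en-tos", ">", "Latin intus"),
    ("Latin intus", "v", "*de-intus"),
    ("*de-intus", ">", "French dans"),
    ("French dans", "v", "dedans"),
]


def is_connected(steps) -> bool:
    """Check that every step starts from the form the previous step ended in."""
    return all(prev[2] == nxt[0] for prev, nxt in zip(steps, steps[1:]))


def render(steps) -> str:
    """Write a connected chain in the linear notation 'a v b > c ...'."""
    parts = [steps[0][0]]
    for _, relation, newer in steps:
        parts.append(f"{relation} {newer}")
    return " ".join(parts)


def converse(step):
    """Turn 'x v y' into the converse relation 'y w x' (likewise for > and /r)."""
    older, relation, newer = step
    return (newer, CONVERSE[relation], older)


def count_reinforcements(steps) -> int:
    """Count how often the chain has been reinforced."""
    return sum(1 for _, relation, _ in steps if relation == "v")
```

Here `render(IN_CHAIN)` reproduces the chain as written above, and `count_reinforcements(IN_CHAIN)` shows that the element has been reinforced three times on its way from Indo-European to French.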
The examples illustrate the reinforcement of an element by its morphological union with
another one. The situation becomes slightly more complicated when an expression is
reinforced not by adding an element next to the grammatical marker already present, but
at a different place in the construction. Latin non ‘not’ was reinforced by passum ‘step’
in a construction *non V passu, to yield French ne V pas. The particle ne can
subsequently be dropped, and the negation pas ends up at a different position from Latin
non. Another example, which I have already used in a simplified manner, but which is
really quite complex, are the Latin-Romance prepositions. In Proto-Indo-European, we
may assume there were agglutinative case suffixes with rather specific functions. When
these got more grammaticalized, they were first specified, and thus reinforced, by
adverbs; for example, the accusativus directionis was specified by *peri ‘around, along’
> Latin per ‘through’. These adverbs were in turn grammaticalized, yielding on the one
hand preverbs and on the other adpositions. In Latin we encounter expressions such as
percurrere urbem or currere per urbem ‘to run through the city’. We neglect here the
possible hypercharacterization percurrere per urbem and pay attention to the fact that in
none of these expressions is the suffix replaced by the preposition or preverb. There
is no alternative between case suffix and preposition, such as there is between passé
simple and passé composé. We see here that what later on will result in a (partial)
renovation, begins as a complex reinforcement (cf. Jakobson 1936:55). In those many
instances where the renovative construction starts as an extension of the renovated one,
we may speak of renovation by reinforcement; whereas in the other case, where the
renovative construction syntagmatically excludes the renovated one, we may speak of
pure renovation.
On the same basis, we are led to distinguish between two types of reinforcement: simple
reinforcement consists in the morphological union of the bleached element with the
specifying one. Complex reinforcement consists in the introduction of a specifying
element in a different position of the construction. We started this chapter with simple
reinforcement; this is necessarily conservative. In complex reinforcement, however, if
the reinforcing element ousts the reinforced one, we have a source of quite novel
constructions.12 We may even speculate, since no new construction starts ab nihilo, but
necessarily uses elements of inherited constructions, that there may be a gradual
transition between reinforcement and innovation.13
12 Developments of this type are also responsible for a considerable amount of headache caused to
the historical linguist by certain grammatical formatives. How would we be able to understand the
etymology of, e.g., French pas, rien, point, personne or of Italian cosa ‘what’, if we did not know
that they arose through reinforcement (cf. ch.
13 The distinction between renovation, innovation and reinforcement as made here is also
postulated in Kahr 1976:115, in the terms 'renewal', 'novel creation' and 'hypercharacterization',
3. Grammatical domains
This chapter deals with what Givón (1979) and Heine & Reh (1984) have called the
various channels of grammaticalization. The term ‘channel’ graphically expresses the
fact that the fate of a category in grammaticalization is largely predetermined once we
know two things: 1) its meaning, 2) its syntactic function. These conditions are equally
necessary. Givón (1979[L]:213f) and others have emphasized condition 1, whereas
Meillet (1915f:170) had already said: “c'est le rôle dans la phrase qui décide de tout.”
The terms ‘grammaticalization scale’ and ‘grammaticalization channel’ will often be
used interchangeably. A grammaticalization scale is a theoretical construct along which
functionally similar sign types are ordered according to their degree of grammaticality
as measured by certain parameters to be discussed in ch. 4. The relation among the
elements on such a scale is panchronic. A grammaticalization channel is a frequently
recurring route which signs with a given function may take when they are gramma-
ticalized in language change. The relation among the elements in such a channel is a
diachronic one.
The aim of ch. 3 is twofold. First, a certain number of examples of grammaticalization
will be accumulated in order to give an idea of the nature of this type of process and to
provide suitable empirical material to refer to from the more theoretical chapters to
follow. Second, although, naturally, not all parts of the grammar can be treated here, the
chapter is meant to demonstrate that grammaticalization is omnipresent and not specific
to any particular part of the grammar.
The subdivision of the material follows, in part, from the connections established by
grammaticalization channels. But as some channels cross, the presentation will
necessarily be somewhat repetitive. The amount of material presented is still greatly
reduced in comparison with the masses of evidence available for most of the channels.
It would be impossible to display it all here; the reader is referred to the cited literature.
3.1. Verbal complexes
3.1.1. Existence and possession
Verbs such as Engl. exist, possess, or Latin existere, possidere, are lexical verbs like any
other and have no particular grammatical function. But most languages have more gram-
maticalized verbs with similar meanings, verbs which roughly correspond to English be
and have.
The various forms of Engl. be, as well as of its cognates in other Indo-European
languages, go back to three different roots: PIE *bhew- ‘become’ yields forms such as
Engl. be, German bin, Span. fuí. PIE *Hes- ‘exist, be in a place’ yields forms such as
Engl. is, German ist, Span. es. And Proto-Germanic *wes- ‘live’ yields forms such as
Engl. was, German war. These are doubtless typical sources of the verb ‘to be’. ‘He
lives’ is, for instance, the etymological meaning of the verb úhki ‘he is’ in Tunica (Haas
1941:41ff). Another source of ‘be’-verbs is ‘to stand’. This can be seen in Span./Port.
estar, French être, which derive from Latin stare. Among the 15 auxiliaries which
Žirmunskij (1966:85f) cites from Usbek, there are also quj- ‘stand, place’ and tur-
‘stand’. ‘To remain’ is the original meaning of the Port. verb ficar, which is currently
taking over some functions of the verb ‘to be’. These verbs are usually highly irregular
or even suppletive, which points to their grammaticalized status.
Engl. have, German haben and cognates derive from Proto-Germanic *hafjan ‘seize’.
Span. tener ‘have’ meant ‘hold’ in Latin. Anticipating future developments of English,
we can say that ‘receive’ is another source: have (phonologically /v/ or /z/ or /d/ in the
various inflected forms) is currently reinforced by got and will soon be entirely
renovated by it. These are all, of course, common sources of the possessive verb; see
Seiler 1981:104-106.
Although there are diachronic derivational relations between ‘be’ and ‘have’ in many
languages, there is, interestingly, no unidirectional grammaticalization relation between
them. On the one hand, existence predications are often grammaticalized constructions
of the verb ‘have’. Thus dialectal German es hat, Span. ha(y), French il y a, all ‘there
is/are’. On the other hand, possessive predications very often contain a verb of existence:
Latin Paulo est liber ‘Paul has a book’, Mandarin wǒ yǒu yí-zhī gǒu (I EXIST one-CL
dog) ‘I have a dog’; cf. also Russian est' and Japanese arimasu. This is, by the way, an
argument against reducing possession to existence or vice versa.
3.1.2. The copula
A copula is a word which turns a nominal into a predicate. This function will not be
considered here because it will be treated in subsequent sections. Here we concentrate on
the question: through which grammaticalization channels do elements arise which
function as the copula in nominal clauses? There are, in principle, two such channels.
As is familiar from Indo-European languages, a copula may be a grammaticalized ‘be’-
verb, any one of those treated in the preceding section. In this case, the copula has
obviously verbal properties, i.e. it may inflect for person, number, tense etc.; though it
may be absent when all the categories are unmarked, as it is, e.g., in Russian.
A less familiar, but equally frequent origin of the copula is a demonstrative or anaphoric
pronoun. Consider the case of the Chinese copula, as analyzed by Li & Thompson
(1977). In Archaic Chinese, nominal clauses contained no copula. The subject of a
nominal predication, especially a relatively heavy one, could be topicalized by
left-dislocation. This necessitates a substitute in the subject position of the nominal
clause, a demonstrative or personal pronoun which anaphorically takes up the topicalized
NP. The resulting nominal clause is, of course, syntactically completely unmarked. The
complex sentence structure is as follows: S[ NP S[ DEM NP ] ]. The DEM in Archaic
Chinese is shì. By the 1st century AD, this construction was sufficiently grammaticalized to
be reanalyzed as S[NP DEM NP]. Here shì already functions as a copula, one criterion
being that it is indifferent as to the person of the subject. About the same time, it ceases
to be used as a demonstrative, while in its copula function it becomes increasingly
Copulas of this origin may also be found, according to Li and Thompson, in Hebrew,
Palestinian Arabic, Wappo and Zway. Such copulas do not, of course, express verbal
categories. Since the latter are, in fact, irrelevant to them, they are also not distributed
according to marked and unmarked verbal categories, but also appear in what would
correspond to a present indicative verbal clause.
The second grammaticalization channel also admits nominal clauses which already
contain a copula, which is then reinforced by the pronoun. This is currently happening in
French. ‘To live is to learn to die’ is not Vivre est apprendre à mourir, but rather Vivre
c'est apprendre à mourir, which is pronounced, as Frei (1929:72) insists, “sans pause”.
3.1.3. Modals and moods
Modal verbs or auxiliaries may, of course, derive from full verbs. In what follows, I list
some possible sources.
In the Germanic languages, many modal verbs derive from Proto-Indo-European
preterite-presents, i.e. original full verbs whose inherited perfect form was used with
stative present function. Among them are OE can(n) ‘know, be able’, sceal ‘owe’, mæg
‘be able’. These verbs developed a past tense inflection of their own, which made them
morphologically highly irregular. Their syntax was still that of common verbs in Old
English. During the Middle English period, however, they developed those syntactic
peculiarities which make them constitute the syntactic category of modal verbs; and as
such the verbs can, shall, may and others appear in the 16th century. This development is
analyzed in detail by Lightfoot (1979:98ff), though he tries to do without the concept of
grammaticalization. A synchronic example for the ambivalence, or transitional status, of
a verb between full verb and auxiliary is provided by Romanian poate ‘can’; see Mallin-
son & Blake 1981:198f.
Desiderative modals such as will evolve, of course, from verbs meaning ‘want’. As also
shown by English, they may subsequently form the basis of subjunctive auxiliaries such
as would. The German equivalent is würde, but this has a different source. The original
meaning of werden is ‘become’, and since würde is formally subjunctive, its original
(still alive) meaning is ‘would become’. In this meaning, the verb formed constructions
such as OHG würde lesende ‘would become reading’, with a clearly inchoative meaning.
The latter, however, disappeared in Middle High German, and in the course of
grammaticalization only the subjunctive meaning remained: ‘would read’. Once würde
had become a sign of the subjunctive, the marked participial form of the verb was no
longer necessary. In analogy to the other modal verb periphrases, it was simplified to the
infinitive form: würde lesen. For this account, see Ronneberger-Sibold 1980:60f. The
interesting thing about this development is the solution to the problem of reinforcing the
subjunctive mood. This was done by extracting this mood from the main verb and using
an auxiliary verb as its bearer whose lexical meaning was necessarily irrelevant since its
function was nothing more than to carry the subjunctive. This is why, in this
construction, it lost its meaning so soon. Contrast this with the formation of the
werden-future dealt with in ch. 3.1.4.
The omnipresent existence verb also forms modal constructions, chiefly obligative ones.
It combines with nominal verb forms to yield expressions of the type ‘my going exists’,
meaning ‘I have to go’. Compare Lat. mihi est eundum id., but also Yucatec Maya yàan
in bin (EXIST 1.SG go) id. Once more, the functional similarity of ‘have’ and ‘be’ in the
existence meaning asserts itself here. Thus we have Engl. I have to go, and also Vulgar
Latin cantari habet ‘has to be sung’, which, according to Benveniste (1968, § II), ultima-
tely yielded the Romance future (cf. below).
Continuing grammaticalization transforms modal verbs into affixes. Examples for the
development of desiderative and obligative modals into future markers have already been
mentioned and will yet be seen in the following section. The existence of verbal mood
affixes is known; besides the common Indo-European subjunctive suffixes, note in
particular the Sanskrit desiderative suffix -sa. What is lacking in my data is historical
evidence for their development out of modal verbs; but on the basis of the analogy to
related categories, such evidence must exist.
3.1.4. Tense and aspect
Tense and aspect are often expressed with the help of periphrastic verb constructions in
which an auxiliary is used to support a nominal main verb. The two auxiliaries which
predominate in Indo-European languages are presumably widespread everywhere: ‘have’
and ‘be’. Both are used in the analytic perfect of the Germanic and Romance languages.
For the origin of this construction, see Meillet 1912:141-143, Benveniste 1968, § I,
Seiler 1973, Rosén 1980, Ramat 1983. In Persian, the auxiliary ‘be’ has been aggluti-
nated to the main verb and now expresses the personal endings of the past tense verb.
Similarly, Haas (1977) demonstrates that the personal endings in the conjugation of some
Muskogean languages go back to an agglutinated auxiliary.
Heine & Reh (1984:130) show that in Africa, too, past tenses are frequently expressed
with the help of ‘be’. Following Givón (1973, § 5), they posit two other possible origins:
verbs of motion, especially ‘come’; and verbs meaning ‘to be/have finished’. Both can
be exemplified from Portuguese: vem de escrever (comes from writing) ‘has written’ (cf.
French vient d'écrire); acaba de escrever (finishes of writing) ‘has just written’. Both of
these examples illustrate that past tenses often start out as perfects or perfective aspects;
the past meaning actually results from a further grammaticalization. The same is to be
observed in the development from the Indo-European perfect to the Germanic past and
of the Latin perfect to the Romance simple past tense. And the same is again happening
with the ‘passé composé’ in French and the haben-perfect in Bavarian German.
Passing over to future tenses, we again meet ‘have’ here, viz. in Latin-Romance. The
periphrastic construction ‘infinitive of main verb + form of habere’ started in Vulgar
Latin, according to Benveniste (1968, § II) in passive clauses, and according to Ineichen
(1980) in subordinate clauses. In the course of its expansion, the construction became
agglutinative and led to the synthetic Romance future (cf. also Coseriu 1974:132-151).
Overall, ‘have’ is probably not so common a future tense auxiliary. Much more widespread
is ‘go’. It occurs in periphrastic futures in English and various Romance
languages, e.g. Port. vou escrever (go.1.SG write:INF) ‘I will write’. An isolated
precedent for this may be seen in the Latin passive infinitive of the future, scriptum iri
‘to be going to be written’ (cf. Ultan 1978:109-114). ‘Go’ also figures in the Usbek and
Tunica auxiliary lists given in Žirmunskij 1966:85f and Haas 1941:41-51, respectively.
For African languages, see Givón 1973, § 5 and Heine & Reh 1984:131f.
Since ‘be’ is the counterpart of ‘have’ in so many respects, obligative ‘be’ grammatic-
alizes to future just as obligative ‘have’ does. An example is provided by Yucatec Maya.
The construction yàan in bin mentioned in ch. 3.1.3 is also used colloquially to mean ‘I
will go’.
Equally often, the future may arise through the grammaticalization of a desiderative
modal. English will is a known example. In 13th-century Greek, an impersonal thélei ‘it
wishes’ governs a subordinate clause introduced by ‘that’. This is shortened to thé ná,
then contracted to thená and, by the 16th century, yields thá FUT. In Swahili, -taka ‘want’
> -ta- FUT, as illustrated in E2 (cf. Heine & Reh 1984:131).
E2.   a. n-a-taka ku-la
SWAH     SBJ.1.SG-PRS-want INF-eat
         ‘I want to eat’
      b. ni-ta-ku-la
         SBJ.1.SG-FUT-INF-eat
         ‘I will eat’ (Givón 1973:916)
At a more advanced stage of grammaticalization, we find the Ancient Greek future in
-se/so-, which derives from a PIE desiderative; see Rix 1976:224f, and cf. the Sanskrit
-sa-desiderative mentioned in the preceding section.
Finally, future auxiliaries may evolve from verbs with an inchoative meaning. Givón
(1973:917) adduces the example of SiLuyana (Bantu) -tamba ‘begin’ > -mba- FUT, as in
ni-mba-kela (SBJ.1.SG-FUT-work) ‘I will work’. On the other hand, we have the German
future with werden. This started at the same time and in the same construction as the
würde-subjunctive mentioned above. Here, again, the original participle of the OHG
construction wird lesende (‘becomes reading’) is simplified to an infinitive. However,
the inchoative meaning here is not discarded, but grammaticalized to a future meaning.
The main source of progressive aspect conjugations is a periphrastic construction
formed with the verb ‘be’ plus a nominalized verb form in some locative dependence. A
typical instance of this is the Engl. she is on working > she is a-working > she is
working. Compare also the Portuguese variants está a trabalhar (stand:3.SG at work:INF;
European) and está trabalhando (stand:3.SG work:ing, Brazilian). Colloquial German
has ist am arbeiten, corresponding to the European Portuguese version. One may also be
more precise on the nature of the ‘be’-verb involved: Since the construction originally
expresses a state (position or condition, ‘Befindlichkeit’) of the subject — as is
sufficiently proved by the prepositions used —, the verb employed as an auxiliary, if
there is a choice, will be the verb ‘be at a place’. It could therefore be predicted that
Spanish and Portuguese use estar rather than ser in their progressive constructions. The
same can be seen in African languages. Thus, the Ewe progressive construction éle vavá
m (he:is RDP:come PROG) ‘he is coming’ originally expresses a location: m derives from
*me ‘inside’, so that the original meaning is ‘he is in coming’ (Heine 1980:105f). In
Abkhaz (Hewitt 1979:128, 181f), the postposition -{c'` ‘in’ is converted into the
intransitive verb ‘be in’ by adding stative verb inflection. The full verb is put into the
masdar, an infinitive-like verbal noun, and is constructed as the oblique complement of
the auxiliary, as shown in E3.
E3. a-xYmàr-ra d-a+ec'`-w+p'
‘he is playing’ (Hewitt 1979:181)
In Usbek (Žirmunskij 1966:86), there are four auxiliaries which may be used in the
progressive frame ‘main verb-gerund auxiliary-gerund-inflection’, e.g. in ëz-ib AUX-ib-
man ‘I am writing’, namely tur- ‘stand’, ýt ‘sit’, ët ‘lie’ and jur- ‘walk about’. It is
palpable how all these verbs characterize the spatial situation of the subject.
Givón (1973, § 5) and Heine & Reh (1984:124-126) also point to a second source of
progressive aspect markers, namely verbs of the meaning ‘stay’, ‘remain’, ‘keep’. This
can also be exemplified from Portuguese, which uses ficar (beside estar) in progressive
For habitual aspect/aktionsart, two sources may be mentioned. The first is a
periphrasis with the copula, as for progressive aspect. In Imbabura Quechua, the same
suffix -j which also forms simultaneous relative clauses is used on the full verb. The
resulting form is constructed as the predicate complement of the copula. Sentences such
as the one in E4 can nevertheless not be analyzed as containing a syntactically regular
free relative clause (see Cole 1982:149).
E4. Utavalu-pi trabaja-j ka-rka-ni
‘I used to work in Otavalo.’ (Cole 1982:149)
Subordinate clauses cannot contain validators (a kind of modal particle). However, in
habitual sentences such as E4, validators are possible. This shows that there is only one
clause in this construction and that non-finite verb plus copula form a periphrastic verb
form in it. What started out as a simultaneous nominalizer of clauses ends up as a verb
marker of habitual aspect.
The second source of habitual aspect consists of periphrases which involve the verb ‘do’.
Sentences such as E5 occur in Irish English.
E5. He does plough the field for us. (John Harris p.c.)
In Mayan languages, the predicate focus construction is mainly used in order to express
habitual aspect, as in E6 from Yucatec.
E6. puroh káaltal k-in bèet-ik
YUC mere drink IMPF-ERG.1.SG do-INCOMPL
‘mere drinking was what I did’
Here the full verb becomes non-finite, and the whole predicate is put into focus position.
The extrafocal clause reduces to a finite form of bèet ‘do’, to which the nominalized
predicate is the direct object.
3.1.5. Passive and emphasis
The analytical passive with esse ‘be’, which was used, in Latin, only in the perfective
categories, replaced the synthetic forms in the Romance languages and yields such passives
as Italian è detto ‘is said’. This is currently being renovated with the auxiliaries
venire ‘come’ and andare ‘go’. Of these, the unmarked form is viene detto ‘is said’; but
the contrast with va detto evokes the deictic potential of these auxiliaries: the former
then implies ‘is said to the speaker’, the latter ‘is said by the speaker’.
The notion of ‘becoming’ is at the basis of the auxiliary which serves in German (werden)
and Persian (šodan) passive constructions; it also appears in the English get-passive.
Because of the basic meaning of the auxiliary, these passives were originally
inchoative; wird grammatikalisiert would have meant ‘becomes grammaticalized’, the
passive meaning being carried exclusively by the participial form of the main verb. With
increasing grammaticalization, however, the auxiliary loses its inchoative meaning and
becomes a mere carrier of finite verbal categories. This is another example of renovation
through complex reinforcement. For other sources of the passive, see Givón 1979[d]:85f.
As for emphatic constructions, we will mention here only the auxiliary ‘do’. There are
different types of emphatic constructions, and in at least three of them the verb ‘do’ may
appear. For the first type, cf. the predicate focus construction mentioned in § 3.1.4.
Second, the emphasis may not be on a particular sentence constituent, but rather the
assertion itself may be emphasized. This type is exemplified in English. According to
Traugott (1980:55), in Middle English the verb do was used as an auxiliary, apart from
causative constructions, only if a positive assertion was to be strongly emphasized. By
1700, it came to be used also when the assertion was to be questioned, that is, as an interrogative
auxiliary; and by 1900, it appeared also as an auxiliary in negation. The
desemanticization accompanying this expansion has led to the situation that do is
currently being used everywhere with little or no emphasis.
In the third type of emphasis, the main verb is used as a contrastive topic; and due to its
being foregrounded, it needs a substitute in the clause. This function is fulfilled by tun in
Standard German, in sentences such as Kochen tut sie nicht schlecht (lit. cooking does
she not badly). In Non-Standard German, the auxiliary tun has been generalized beyond
this context to expressions such as sie tut nicht schlecht kochen (cf. p. 102).
3.1.6. Auxiliaries and alternative sources
The discussion in ch. 3.1.2-3.1.5 has concentrated on auxiliaries and the like. We will
first sum it up and then turn briefly to alternative sources of the grammatical categories
The common denominator of the above developments can be characterized as follows:
main verb becomes auxiliary verb, possibly via modal verb; this then becomes a mood or
aspect marker, and the latter finally a tense marker. The most important and most
differentiated instance of this development is certainly represented by the verb ‘be’. It
starts out as a ‘verbum substantivum’, a verb of existence. Subsequently, it comes to be
used in location predications, with the meaning ‘to be in a place’. Then it appears as the
copula in nominal sentences. As such, it may be employed when the predicate is a
nominalized verb form, and in this way it ends up as an auxiliary. This development was
already posited by Meillet (1912:131), who exemplifies it as follows:
verbum substantivum: je suis celui qui suis
‘be in a place’: je suis chez moi
copula: je suis malade
auxiliary: je suis parti
As was already mentioned with reference to Persian and Muskogean, further
grammaticalization yields inflectional endings.
The grammaticalization of full verbs to auxiliaries shows us two things. First, a piece of
methodology: The dispute on whether auxiliaries are main verbs or not (J. Ross: yes; L.
Palmer: no; R. Huddleston: yes; etc.) is fruitless. Two grammatical categories connected
on a grammaticalization scale are neither the same nor distinct. The difference between
them is gradual, and there is no clear-cut dividing line. Secondly, an empirical insight:
Grammaticalization can turn syntactic relations around. In a word combination which
contains two verb forms, one of which will become the auxiliary in an analytic
construction, this latter one starts by being the syntactic (not lexical!) main verb (cf.
Givón 1979[d]:96f), while the other, governed verb carries the major part of the lexical
meaning.14 However, only a free form can exert government. As, in the course of
proceeding grammaticalization, the auxiliary loses its verbal properties, it can no longer
be said to govern the lexical verb. When it has become a tense/mood/aspect marker, it
depends on the lexical verb, which is now the main verb. Thus, the syntactic relations are
almost reversed; though not quite, because within a word there are no syntactic, but
morphological relations. We shall find (ch. 4.3.2) that this development of relations is
characteristic of grammaticalization processes. For the trouble that intermediate stages
of this development may cause to synchronic analysis, see Matthews 1981:155f.
14 The term 'main verb' is, unfortunately, ambiguous. In its syntactic sense, it means the governing
verb; and in this sense the auxiliary in an analytic verb phrase is the main verb, as is argued above.
In its semantic sense, it means, within an analytic verb form, that verb which carries the lexical
meaning, and consequently denotes the exact opposite of the first sense. The term denoting the
second sense should probably be 'full verb'.
15 An alternative development is that a pair of verbs in a series becomes a compound verb; but this
is not grammaticalization; see ch. 5.2.
We now turn to alternative sources of the verbal categories treated above. There appear
to be two principal ones: serial verbs and adverbs. Serial verbs will be treated in more
detail in ch., as a source of adpositions. They have, in fact, been studied mainly
in that connection, and comparatively little attention has been devoted to their aspectual or
aktionsart function.
I cannot clarify here the complex and much debated issue of the syntactic relations
among the verbs in a series. Let us assume the following definition: a serial verb
construction is the combination of two or more asyndetically juxtaposed verbs with at
least one shared argument in order to express a complex, but unitary situation. In the
course of grammaticalization of a serial verb construction, one verb in a pair undergoes
the usual symptoms of grammaticalization, becoming, in the last event, a grammatical
formative, while the other remains virtually unaffected.15 I shall refer to that member of
a series which is (destined to be) grammaticalized as ‘the serial verb’ in the
construction. This terminology is based on the assumption that wherever verb
serialization occurs, there is a relatively closed class of verbs with an active serialization
potential (the serial verbs), combining with verbs from an open class which are indifferent
to serialization. Such serial verbs which develop into adpositions are called
‘coverbs’ in the literature and will be dealt with in ch.
Examples of serial verbs with aspectual function may be adduced from Niger-Congo
languages (see also Sasse 1977[G]:113-117 on Mba). In Akan (Kwa), there is a verb
‘come’ which has developed a grammatical function as the first verb in a series (Welmers
1973:353f). In this position, it has become a future marker, which is subject to
phonologically conditioned allomorphy and has become prefixed, together with its
personal prefix, to the following full verb. This is the origin of such forms as g-bu-bá
(3.SG-FUT-come) ‘he's going to come’ or ò-bé-dìdí (3.SG-FUT-eat) ‘he's going to eat’. In
Efik (Benue-Congo), the verb ‘fulfill, accomplish’ takes the first position in a series.
Here it is grammaticalized to a neutral past marker and undergoes tonal assimilation, as
in the following examples (Welmers 1973:371): ì-mà í'dí (we-PAST we-come) ‘we
came’; m
D'má Û-dí (I-PAST I-come) ‘I came’.
More evidence for serial verbs in aspectual function comes from Creole languages. Tok
Pisin (New Guinea) provides us with the following examples (from Mosel 1980):
E7. ol manmeri bilong Papua Niu Gini i save kaikai kaukau
TOK PL man of Papua New Guinea SBJ.3 HAB eat sweet potato
‘Papua New Guineans eat sweet potatoes.’ (Mosel 1980:108)
Portuguese provided the verb saber ‘know, can’, which has become save ‘to do habitually’
in Tok Pisin. This enters verbal series as the first member and ends up as an aspectual
marker, as in E7.
English stop yields stap ‘live, be located’ in Tok Pisin. This enters a verbal series as the
last member and develops into a marker of continuous action, as in E8.
E8. em i wok i stap yet
TOK he SBJ.3 work SBJ.3 CONT self
‘he is/was still working’ (Mosel 1980:108)
A similar fate has befallen English finish; this has become a postverbal completive
aspect marker in Tok Pisin:
E9. em i go pinis
‘he has/had/will have gone’ (Mosel 1980:123)
I shall gloss over several problems in these examples. It is evident, for instance, that in
some of them the serial and the full verb each have their own personal prefixes, whereas
in others only one of them has. Furthermore, the question naturally arises as to whether
we need to treat grammaticalized serial verbs as distinct from auxiliaries or modal verbs.
All the examples seem to be interpretable in either of these two terms. This would mean
that we have only found a new source of auxiliary verbs, but not a new source of
mood/aspect/tense markers, since these would still derive from auxiliaries. Much seems
to speak in favour of this position. On the other hand, the morphological difference just
mentioned might correlate with a difference among serial, modal, and auxiliary verbs.
The latter distinction might also account for the positional differences in the last three
examples. Heine & Reh (1984:128) have an intriguing example from Ewe (Kwa). The
language has serial verb constructions in which the serial verb follows the full verb(s).
It also has auxiliaries which precede full verbs. There is a verb nɔ ‘remain, stay’ which
has been grammaticalized to a habitual aspect marker. In standard Ewe, this is
constructed as a serial verb, e.g. me-yí-na (I-go-HAB) ‘I am in the habit of going’. In the
Dahome dialect of Ewe, however, nɔ is constructed as an auxiliary, as in m-nɔ-sa (I-HAB-sell)
‘I am in the habit of selling’.
Faced with problems such as this, I prefer to take no stand on the issue of whether (some
of) the grammaticalized serial verbs in the above examples are to be analyzed as
auxiliary or modal verbs. It suffices to say that these categories are functionally and
structurally quite similar.
F2. Some interrelated grammaticalization channels of verbal categories (nodes: verb, auxiliary, mood, aspect, adverb, tense)
We finally turn to a definitely different source of tense markers. Givón (1979[L]:218f)
raises the question whether tense/mood/aspect distinctions can arise from adverbs, and
answers it in the negative. Available evidence, however, argues for a more differentiated
hypothesis: while modal and aspect markers appear, in fact, to derive exclusively from
periphrastic verbal constructions, tenses may come from adverbs (see also Heine & Reh
1984, ch. There are probably quite a number of languages which use a word
meaning ‘already’ in the function of a past or perfect tense marker; Indonesian sudah is
one example. Future markers deriving from adverbs can be exemplified from creole
languages (Labov 1971). English by and by has yielded the free future temporal adverb
baimbai of the pidgin stage of Tok Pisin (which lacks tense). This was subsequently simplified
and grammaticalized to a preverbal future marker, which may cooccur with future
adverbs, as in klostu bai i dai (soon FUT SBJ.3 die) ‘he'll die soon’. In the present creole
language, it has become increasingly obligatory and is further phonologically reduced to
be (cf. also Sankoff & Laberge 1974). Spanish luego ‘soon’ underwent a maximally
parallel fate in Papiamento: it was reduced to lo and became a preverbal future marker,
as in lo mi kanta (FUT I sing) ‘I will sing’. Adverbs which are grammaticalized to future
and past tense markers and adjust their position vis-à-vis the verb accordingly have also
been found in the Nilotic languages Luo, Lotuko and Bari (Heine & Reh 1984:130, 132).
Finally, according to an Indo-Europeanist hypothesis of long standing, the final -i
common to the so-called primary verbal desinences is an original deictic particle. While
a reconstruction, obviously, does not count as evidence, the other cases clearly show that
the development ‘adverb > tense marker’ must be posited as a grammaticalization
The developments discussed in the preceding sections may be summarized in F2.
3.2. Pronominal elements
I shall not deal here with all the different kinds of pronouns. A major distinction will be
made between definite and indefinite pronominal elements. Under the category of
definite pronominal elements I will treat demonstratives, definite articles and personal
pronouns, as well as their products in grammaticalization. The heading of indefinite
pronominal elements will comprise indefinites properly speaking, indefinite articles and
interrogative pronouns, and again their grammaticalization products.
3.2.1. Definite pronominal elements
There is one type of pronoun at the root of this family, and this is the free demonstrative
pronoun. In its full, ideal form, this contains three components, two semantic and one
syntactic. First, the demonstrative element in the narrow sense, which embodies
definiteness and a pointing gesture. Second, what we may call the deictic element, which
directs the attention to something located in regard to the speech situation (speaker vs.
hearer, visible vs. invisible, etc.). Third, a categorial element, either NP or Det, which
renders the pronoun either syntactically autonomous or dependent. Of these, the deictic
component will usually be segmentally expressed at the stage of the free demonstrative
pronoun (otherwise it fuses with the demonstrative one). Either the demonstrative or the
categorial component will almost always lack expression. The Yucatec Mayan
discontinuous (or circumfixal) demonstratives express the demonstrative and the deictic
components separately. We have the following paradigm:
le NP-a' ‘this NP’
le NP-o' ‘that NP’
le NP-e' ‘aforementioned NP’
The Japanese demonstrative (and other) pronouns express the deictic and the categorial
components separately, as shown in the following paradigm:
pronoun proadjective
ko-re ‘this one’ ko-no N ‘this N’
so-re ‘that one’ so-no N ‘that N’
a-re ‘yonder one’ a-no N ‘yonder N’
The first step in the grammaticalization of the demonstrative pronouns is the weakening
of the deictic component. Deictic distinctions tend to be neutralized, the paradigm is
reduced, and at the same time its unmarked member, namely that of third person deixis,
assumes a primarily anaphoric function. An example, Latin ille, has already been
mentioned in ch. 2.2. A case of extreme reduction is provided by Vulgar Latin *ecce hoc
illāc ‘lo this over there’ > French cela ‘that’ > ça ‘it’. We disregard for the moment the
fate of the more marked demonstrative pronouns (see ch. 6.3) and concentrate on the
further development of the unmarked one. There are two principal grammaticalization
channels, corresponding to whether the categorial component is NP or Det; and we will
subdivide the discussion accordingly.
Definite determiners
At the present stage of the development, we have an adnominal demonstrative pronoun
which is deictically neutral and therefore mainly used for anaphoric purposes. Examples,
besides Late Latin ille, are Gothic sa, sō, þata, OE sē, sēo, thæt and Homeric hó, hḗ, to,
all deriving from PIE *so, sā, tod. Persian ān and Japanese sono appear to be well on
their way towards this stage.
The following development has been described by Greenberg (1978) for African
languages (cf. also Givón 1978, § 3), but it occurs in languages all over the world. The
demonstrative component is gradually reduced to mere definiteness, and the result is a
definite article. We thus get French le, la, OHG ther, thiu, thaz, Engl. the and Attic ho,
hē, to. Further grammaticalization agglutinates the article to the noun. Suffixed articles
occur in Romanian, Swedish, Danish, Basque, Ijo (Kwa), Koyo (Kru) and Yuman
languages such as Mohave, Diegueño and Yavapai. Prefixed articles occur in Abkhaz
(Caucasian) and Arabic vernaculars. The Swedish case illustrates that while the definite
article is typically in opposition to a demonstrative, a definite affix starts cooccurring
with other definite elements.
At this stage, further semantic weakening leads to a reduction of definiteness to
specificity. This is largely true for the Abkhaz article and for the suffixed article of
Dagbani (Gur). If this last bit of referential meaning is lost, too, we are left with the
categorial component of the erstwhile demonstrative. That is, the element then signals
only that the word it is attached to is a noun, and can therefore still be used as a
nominalizer (which is an important function of the definite article, anyway). See
Greenberg 1978, § 3.5 on the nominalizing -s of Plateau Penutian.
If the demonstrative pronoun which is at the beginning of this process expresses any
noun class or gender distinctions — the primary locus of which is, in fact, the pronoun
—, then these will go all the way along, and when the specificity of the article is lost,
they will be left as noun class markers. This appears to be a plausible account of the
genesis of nominal gender or class markers as they occur, for instance, in Bantu
languages (details in Lehmann 1982[U], § 7.2).
Personal pronouns
We go back again to the stage of Early Latin is, Late Latin ille, Gothic sa, Homeric hó
and Bambara ò. One thing that often happens to such anaphoric pronouns with a slight
demonstrative force is that they come to be used as relative pronouns. This happened, for
instance, to OHG ther and to Homeric hó. The development is treated in detail in Lehmann
1984, ch. VI.1.1.2 and 1.2.2. Although this is a deviation from the main channel,
it certainly is a grammaticalization, since the pronoun loses its demonstrative force and
definiteness (cf. Lehmann 1984, ch. V.2.3, § 2) and becomes syntactically obligatory in
a certain construction.
16 In Rio de Janeiro, even dogs are addressed by você.
Returning to the main thread, we find the pronouns here losing their demonstrative force,
too. The result is a free personal pronoun as exemplified by Proto-Romance *illu, Engl.
he or German er. The latter two derive, in fact, from the PIE demonstrative *ei-s which
also yielded Latin is. Having thus arrived at a third person pronoun, let us now turn to
first and second person pronouns and discuss briefly their possible origin.
New pronouns, especially for the second person singular, are often obtained by shifting
pronouns around in the paradigm, especially by substituting marked forms for unmarked
ones. This explains, e.g., the use of German Sie, French vous and English you for the
second person singular (see Syromjatnikov 1980:112 for Japanese). Again, a new first
person plural pronoun is being formed in French and Portuguese by what has so far been
the non-specific indefinite pronoun ‘one’, namely on and a gente, respectively. Here
grammaticalization plays no part.
However, new forms may also come from outside the paradigm; nouns may be
grammaticalized to pronouns. In Spanish, vuestra merced ‘your grace’ has yielded the
honorific second person pronoun usted, whose plural ustedes has already ousted, in
South America, the original plain form vos(otros). The Portuguese product of vossa
mercê, você, is used in most parts of Brazil instead of the original tu.16 Japanese provides
the following examples: watakusi lit. ‘my private affair’ > watasi ‘I’ (hon.); boku
(Chinese loan) ‘slave’ > ‘I’; Old Jap. kimi ‘lord’ > ‘you’ (hon.) > ‘thou’; anata lit. ‘that
part’ > ‘you’ (hon.); omae (HON:front) > ‘thou’ (vulg.) (from Syromjatnikov 1980 and
Yoshiko Ono, p.c.). Vietnamese tôi ‘I’ comes from a word meaning ‘subject’ (Wilfried
Kuhn, p.c.). The Indonesian saya ‘I’ derives from a literary word sahaya ‘servant’
(which in turn comes from Sanskrit sahāya ‘assistant’); and tuan ‘you’ (hon.) is an
original Arabic loan meaning ‘master’ (Gabelentz 1901:152). In East Asia, the use of
relational nouns instead of personal pronouns whenever there is a personal relation between
the discourse participants is widespread and liable to yield rich material for the
grammaticalization origin of first and second person pronouns.
We see that personal pronouns derive from two entirely different sources: whereas those
of the third person come from demonstratives, those of the first and second persons come
from nouns of social relations. There is no a priori reason why the grammaticalization
processes which lead to these two kinds of personal pronouns should take a parallel
course. It is therefore no wonder that we find many languages where the third person
pronouns are not well integrated into the paradigm. Several of the ancient Indo-European
languages are examples of this, as their third person pronouns retain a slight
demonstrative force which is, of course, absent from the first and second person
pronouns. And there are quite a number of languages which are conventionally said to
lack third person pronouns altogether, a situation which we might rephrase by saying that
what would be the third person pronouns are either too little or too much
grammaticalized to be able to fulfill that function. Such languages are Walbiri, Dyirbal,
Mangarayi (North Australia; Merlan 1982:99), Japanese, Lakhota (Sioux) and Basque.
This situation repeats itself in the personal affixes of many languages: there are
paradigms in which the third person (singular) affix is zero (although this may also be
explained by its semantic unmarkedness). On the other hand, the genetic and functional
difference of the two kinds of pronouns does not necessarily prevent them from forming
an integrated paradigm and behaving in a maximally similar way, as they do, for instance, in
English, German, Russian, Arabic, Turkish and Chinese. Such paradigmatic differences
will be disregarded in what follows. For more details on the subsequent development,
see Lehmann 1982[U], §§ 6.2 and 7.1.
17 On p. 25 it was mentioned that a verb may acquire such categories through the agglutination of
an auxiliary which possesses them. Ultimately, however, this is probably not an alternative, since
the auxiliary, in turn, must have acquired these categories somehow.
When personal pronouns are deaccentuated, they become clitic, usually either in
Wackernagel's position or to the word which governs them. Examples are the oblique
pronouns le, la etc. in Italian, French and Spanish or the forms ne, se, s of Northern
Substandard German (e.g. Ich habe ne/se/s doch gestern gesehen! ‘But I saw him/her/it
yesterday!’). Such forms are frequently phonologically reduced in comparison with
eventually coexisting stressed forms. While full personal pronouns may have the same
distribution as lexically headed NPs, clitic pronouns are often confined to certain
positions. Many languages, such as Modern Greek and Romance languages, have a set
of primary prepositions which require a full NP or personal pronoun as their complement
and do not accept a clitic pronoun.
Clitic pronouns become fillers of syntactic positions which may not be left open. In
Italian, for instance, if the direct object is topicalized by left-dislocation, it must be
represented in the clause by a clitic pronoun, as in Giovanni, l'ho visto ieri. ‘John, I saw
(him) yesterday.’ (cf. Mallinson & Blake 1981:154). In Spanish, the clitic object pronoun
may even cooccur with a nominal object within a clause, as in Ayer lo vi a Juan.
‘Yesterday I saw John.’ At this stage, the pronoun potentially loses its anaphoric function
and becomes an agreement marker. At about the same time, it turns from a clitic into
an affix (cf. Humboldt 1836:496f on this phase of the development). In this way, the
carrier of the affix acquires the morphological categories of person, number and
gender/noun class.17 Simplifying somewhat, we call these personal affixes. They may
appear on verbs (for subject, direct and indirect object), nouns (for the possessor) and
adpositions (for the complement). There are a number of languages such as Navaho,
Abkhaz or Arosi, which have all three of these types. E10 contains examples from
Abkhaz (Hewitt 1979:105, 116, 103).
E10. a. (sarà) a-x'-kàa-šq'-kàØ-r-s-to-yt'.
3.PL- IO
1.SG- give
‘I give the books to the children’
b. à-'k'ny- yn
ART-boy OBL.3.SG.M-house
‘the boy's house’
c. a-yyas a-q'+n
ART-river OBL.3.SG.NHUM-at
‘at the river’
In the cases cited, the personal agreement affixes may still function (anaphorically) as
personal pronouns, when no NP is present in the same construction. Further semantic
weakening makes them lose this ability, and they become entirely conditioned by
agreement. The personal endings of the finite verb in French, Russian and German illustrate
this stage of the development. If grammaticalization proceeds further, the personal
agreement affixes become invariable markers. The subject affixes of the verb become
elements which identify the category ‘verb’ or the constituent ‘predicate’, and its object
affixes become transitivity markers. Both these developments have occurred to the
erstwhile pronouns he and him, respectively, in Tok Pisin. The resulting invariable morphemes,
the preverbal i- and the postverbal -im, are exemplified in E11.
E11. Man i-mek-im singsing
TOK man SBJ.3-make-TR spell
‘Men utter a spell’ (Sankoff 1977:67f)
This is the final stage in the grammaticalization of personal pronouns before their
disappearance.
Reflexive pronouns
The grammaticalization of reflexive pronouns has been studied recently by Faltz (1977,
esp. ch. IV), Edmondson (1978:640-647; largely based on Faltz) and Strunk (1980).
Several of my examples are drawn from these sources, and the following discussion, too,
is indebted to them. Just as it would be difficult to formulate a common grammatical
denominator for all the different phenomena arranged on a grammaticalization scale
together with personal pronouns and treated in the preceding section, so it is difficult to
find a single grammatical denominator for all the phenomena which are commonly called
reflexive and which we will again find to be arranged on a grammaticalization scale.
Their common denominator lies precisely in the fact that they are connected by a
grammaticalization channel, this in turn being determined by a function which might be
roughly characterized as marking identity with or back reference to an entity involved in
the same proposition (sentence or clause); cf. Plank 1979[E].
I will simplify the discussion a bit by assuming the following four categories,
enumerated here in the order of increasing grammaticalization:
(i) autophoric nouns, e.g. Sanskrit ātmán ‘soul’;
(ii) reflexive nouns, e.g. English self;
(iii) reflexive pronouns, e.g. German sich ‘oneself’;
(iv) verbal reflexives, e.g. Russian -sja.
It does not need to be emphasized that the boundaries between these categories are fluid.
There is a whole set of notions centering around the person, as a whole or in part, which
are generalized in many languages to comprise the self and which I call autophoric.
Typical examples are Sanskrit tanū́ ‘body, person’ and ātmán ‘breath, soul’, Buginese
elena ‘body’, Okinawan dna ‘body’, !Xu l'esi ‘body’, Basque burua ‘head’, Abkhaz
a-xg ‘the head’. In their respective languages, all these nouns are translation equivalents
of English self. As relational nouns, they are often accompanied by a (reflexive)
possessive pronoun. Typical examples from Vedic (Delbrück 1888:207f) are:
E12. utá sváyā tanvā̀ sáṃ vade tát
VED and POSS.REFL:INST.SG.F self:INST.SG.F together speak:I that:ACC.SG.N
‘and I converse thus with myself’ (RV 7,86,2)
E13. bálaṃ dádhāna ātmáni
VED strength:ACC.SG.M put:PART.PF.MID self:LOC.SG
‘putting strength in myself’ (RV 9,113,1)
At the other end of the spectrum, Old Indic makes use of a middle voice, which will be
discussed below.
The difference between an autophoric and a reflexive noun in the present conception is
mainly one of transparency or etymologizability. That is, autophoric nouns are ordinary
nouns with free non-reflexive uses; reflexive nouns are nouns meaning ‘self’ and nothing
else. Examples are German selbst, Latin ipse, Spanish mismo, Italian stesso, Finnish itse,
Hungarian magan, Turkish kendi, Japanese zibun and Yucatec báah. Some illustrative
sentences are:
E14. a. Ich komme selbst
GERM ‘I am coming myself’
b. Wollen Sie die Karten für sich selbst? (cf. E15)
E15. Halu-at-ko lipu-t itse-lle-si?
FINN want-2.SG-INT ticket-ACC.PL self-ALL-POSS.2.SG
‘Do you want the tickets for yourself?’
Reflexive nouns are a heterogeneous class. In some languages, for instance Finnish,
Hungarian, Turkish and Yucatec, they take possessive affixes, just like autophoric nouns
(cf. Engl. myself, yourself). In others, such as German or the Romance languages, they
are not normally combined with possessive pronouns. Again, in some languages such as
Japanese and Yucatec, a reflexive noun can by itself function as a reflexive pronoun; in
others such as German, Latin and the Romance languages, a reflexive noun can only
accompany appositively a reflexive pronoun or another noun in order to emphasize the
identity. Reflexive nouns of the latter subtype are formally similar or identical to the
(pro-)noun of identity, ‘same’; this is so with German selb-, Italian stesso, Spanish
mismo. They are somewhat marginal to the grammaticalization channel; but they may
enter it if used in reinforcement; see below.
Reflexive pronouns function syntactically like ordinary personal pronouns. Examples
are German sich, Russian sebja, Latin-Romance se, si, soi. Because of their primary
function to refer back to the subject, reflexive pronouns normally lack a nominative.
Instead, an appositive reflexive noun will normally appear, as in E14.a above. Just as
ordinary personal pronouns have reflexive counterparts, so ordinary possessive pronouns
may have reflexive counterparts. Examples are Latin suus (as opposed to eius),
Portuguese seu (as opposed to dele) and Russian svoj (as opposed to ego). As these
examples show, the proper possessive pronouns may be inherently reflexive, while the
non-reflexive forms are in fact genitives of the personal pronouns.
Verbal reflexives are verb affixes expressing that the action somehow affects the
subject. Examples are:
E16. Çocuk yıka-n-dı.
TURK child wash-REFL-PAST
‘The child washed himself.’ (Wendt 1972:156)
E17. jal arr Ø-bu-yi-ni rna-rlandi
MANG hard 3.SG-hit-REFL-PAST N.INST-stick
‘He hit himself hard with a stick.’ (Merlan 1982:135)
E18. khrēmatistḕs hoûtos
GREEK businessman:NOM.SG.M D1:NOM.SG.M
állōi anaphanḗsetai khrēmatizómenos
other:DAT.SG.M show:FUT:MID.3.SG trade:PART.PRS.MID:NOM.SG.M
‘this businessman will appear to acquire for somebody else’
(Pl. Gor. 7, 452)
The verb forms in E16-E18 are opposed to unmarked active verb forms: thus compare
yıka-dı ‘he washed’ with E16, bu-ni ‘he hit’ with E17 and anaphanḗsei ‘he will show’
and khrēmatízōn ‘trading’ with E18. Following traditional terminology, I have dubbed
the affixes in Turkish and Mangarayi ‘reflexive’, but the Greek affix ‘middle (voice)’.
There is, in fact, a structural difference in that the reflexive affixes here come near the
verbal stem and are almost derivational, whereas the morphological category of middle
in Greek is amalgamated with the personal desinences. On the other hand, the Turkish
and Greek categories have in common that both are largely ambiguous between reflexive
and passive, while the Mangarayi category is ambiguous between reflexive and
reciprocal. In all three languages, the reflexive fills the position of a voice or
valence-changing verbal derivation. Reflexive suffixes with similar function occur in
Swedish (-s) and Quechua (mostly -ku, but -ri in Imbabura, Cole 1982:90f).
This type is to be distinguished from a reflexive affix which fills the position of a
personal (agreement) affix on the verb, as it occurs, for instance, in Swahili (ji-), Abkhaz
({c-, Hewitt 1979:77), Italian and Portuguese (-se). Examples are:
E19. a. a-li-ji-ona
SWAH SBJ.CL1-PAST-OBJ.REFL-see ‘he saw himself’
b. a-li-mw-ona
SBJ.CL1-PAST-OBJ.CL1-see ‘he saw him’
E20. a. vende-se
PORT sells-REFL ‘sells itself’ (i.e. is for sale)
b. vende-me
sells-me ‘sells me’
However, these morphological differences need not coincide with semantic differences.
Thus, both in Greek and in Portuguese the reflexive and the passive are not clearly
distinguished; and furthermore there are many reflexive verbs whose meaning differs
minimally from that of the corresponding active verb. A Greek example can be seen in
E18, where khrēmatizómenos may be replaced by khrēmatízōn without much
consequence. An example from Portuguese is lembrar-se = lembrar ‘to recall’.
As the examples may have rendered plausible, these four categories of reflexive
elements are in fact on a scale of increasing grammaticality. We have yet to present
evidence for diachronic transitions between these stages. In doing this, I will also
comment on some of the semantic differences associated with the structural ones.
The transition from an autophoric to a reflexive noun may be illustrated by Arabic nafs.
In Classical Arabic this is an autophoric noun with the lexical meaning ‘soul’. In Cairene
Egyptian Colloquial Arabic it has become a reflexive noun with obligatory possessive
suffixes, which regularly functions as a reflexive pronoun (Gary & Gamal-Eldin 1982:
80f). Probably Hungarian magan is another example, as it appears to be etymologically
related to mag ‘kernel’.
I have no examples for an accomplished transition from a reflexive noun to a reflexive
pronoun, that is, no examples of a stage where a reflexive pronoun stemming from a
reflexive noun can no longer be apposed to a noun to emphasize the identity of reference.
However, the examples mentioned from Finnish, Hungarian and Arabic illustrate such a
change underway. So there is reason to doubt Faltz's assertion (1977:236-238) that the
change does not occur.
There is probably an alternative source for reflexive pronouns (according to Faltz 1977:
248-266 it would be the only one), namely same-subject markers. These are pronominal
elements representing the subject of a clause and expressing that it is the same as the one
of the preceding clause. Grammaticalization would reduce the structural scope of this
device to a single clause. In view of the development of the personal pronoun sketched
in the preceding section and of general considerations of grammaticalization (see
ch. 4.3.1), this would seem to be a plausible development, though it would more
probably result in verbal reflexives than in free reflexive pronouns. Due to empirical
uncertainty, I will leave the issue at that.
The development of verbal reflexives out of reflexive pronouns is well attested.
Deaccentuation is a common fate of reflexive — as of other personal — pronouns. Thus,
the Indo-European reflexive *swe became the enclitic -za in Hittite (the a is purely
orthographic) and the prefixal he- in Greek. The Latin reflexive pronoun se became
clitic in the Romance languages, and the Russian reflexive pronoun sebja (REFL:ACC)
was reduced to -sja. Sometimes, as in Russian or in French soi, the original form subsists
beside the reduced one. The latter then tends to become affixal, normally to the verb.
Hittite -za in postinitial position is definitely a minority here. In Italian, Spanish and
Portuguese, se may be either proclitic or enclitic (and subsequently suffixal) to the verb.
Russian -sja occurs exclusively as a verb suffix. Jespersen (1922:377) adduces the
following example: Old Norse finna sik ‘find themselves’ (or ‘each other’) > finnask >
finnast > finnaz > Swedish finnas ‘are found’.
All the above verbal reflexives have a pronominal source. I know nothing about the
genesis of the diathetic verbal reflexives exemplified above for Turkish, Mangarayi,
Greek and Quechua (see, however, Szemerényi 1970:305-309 on the Indo-European
As reflexive pronouns shift from representatives of NPs with a special semantosyntactic
feature to markers of a verbal category, they are commonly reduced to middle voice
markers, “that is, more or less general intransitivizers” (Faltz 1977:268f). The semantic
development to be posited here may be illustrated by the following series of examples
from Russian:
Myt'sja ‘wash oneself’: Here a transitive action affects an object which is identical to the
subject.
Kusat'sja ‘bite (intr.)’: Here the object is not identical to the subject. There is, in fact, no
object; the action abides in the sphere of the subject. The reflexive marker renders
the verb intransitive.
Brat'sja ‘take (for oneself)’, idtis' ‘go (away)’: Here the reflexive marker does not
change the transitivity of the underlying active verb, can even be attached to
intransitive verbs and expresses only an autistic nuance in the action of the subject.
Smejat'sja ‘laugh’, bojat'sja ‘be afraid’ (< fear oneself), ostat'sja ‘remain’: These are
‘reflexiva tantum’, where the reflexive marker is obligatory and therefore nearly
meaningless. At this stage, we also find morphologically conditioned alternation
between reflexive and non-reflexive verb forms, e.g. stat' (perf.) vs. stanovit'sja
(impf.) ‘place oneself, become’.
Most of these examples could be doubled by synonyms from other Indo-European
languages. They occur with the free reflexive pronoun of German, the clitic reflexive
pronouns in Romance and the flexional Greek middle voice; recall the comments on the
Greek example E18. This shows that the semantic continuum is not neatly matched by a
morphological continuum. To expect this would be expecting too much. We must be
content to find tendencies. What we can say is that the semantic transition from the
notion of an action affecting the subject along the above stages to zero takes place in the
morphological zone from a reflexive pronoun via a verbal reflexive to zero. The
approximativity of the correlation is also due to the fact that the semantic phenomena
themselves are partly dependent on particular verbal meanings. That is, the transition is
not one of pure grammaticalization, but involves some lexicalization.
One phenomenon exhibiting a correlation between the semantic and morphological
scales may, however, be mentioned. It concerns the difference between the first of the
above semantic stages (myt'sja) and the subsequent ones. Edmondson (1978:646f) posits
the following situation: a semantically bivalent verb in an ergative language has a
reflexive object. Then with several languages which leave a choice in the expression of
the reflexivity, and also cross-linguistically, the following can be observed: If the object
is represented by a reflexive noun or free reflexive pronoun, the subject is in the ergative,
which means that the verb is syntactically transitive. If there is a verbal reflexive, the
subject is in the absolutive, which means that the verb has been detransitivized.
The examples which I have adduced show reflexive elements unmarked for person, and
thus possibly referring to the third person. Some languages have reflexive pronouns for
the other persons as well. In Greek we have me ‘me’ and se ‘you (ACC)’, but meautón
‘me myself’ and seautón ‘you yourself’. However, the less differentiated system in
which the unmarked pronouns of first and second person are also used in the reflexive
function, seems to be more widespread. An alternative, but equally economical
development, which often accompanies the grammaticalization of a reflexive element to
a verbal reflexive, is the generalization of the form which is unmarked for person to the
first and second persons. A notable example is Russian; the paradigm runs as follows:
ja mojus' ‘I wash myself’
ty moješsja ‘you wash yourself’
on mojetsja ‘he washes himself’
with the allomorphs -s' ~ -sja being phonologically conditioned. The same phenomenon
occurs in the Russian reflexive possessive pronoun; svoj ‘his (own)’ may be substituted
for moj ‘my’ and tvoj ‘your’ if reflexivity is involved in the possessive relationship. The
same is true for Sanskrit svá. Tendencies to use the unmarked se instead of the first and
second person pronouns have also been observed in substandard French by Frei (1929:
147). His examples are in E21.
E21. a. On nous prie de s'adresser à vous.
FREN ‘One asks us to address ourselves to you.’
b. Nous se reverrons.
‘We shall meet again.’
c. Veuillez, Monsieur, nous faire le plaisir de s'en occuper.
‘Will you, sir, do us the favor to take care of it.’
d. Vous se privez.
‘You deprive yourself.’
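As a rough illustration of the Russian paradigm given above, the person-invariant marker and its phonologically conditioned allomorphy can be sketched in a few lines of Python. The sketch rests on an assumption that the text itself does not spell out, namely that -s' follows a vowel-final verb form and -sja a consonant-final one; it operates on the transliteration only, and the function name and the vowel set are purely illustrative.

```python
# Illustrative sketch only: transliterated forms, hypothetical helper name.
# Assumption (not stated in the text): -s' follows a vowel-final verb form,
# -sja a consonant-final one (a final soft sign ' counts as consonantal).
VOWELS = set("aeiouy")


def add_reflexive(verb_form: str) -> str:
    """Attach the person-invariant reflexive marker to a transliterated verb form."""
    suffix = "s'" if verb_form[-1] in VOWELS else "sja"
    return verb_form + suffix


# The same marker serves all three persons of the paradigm.
for pronoun, form in [("ja", "moju"), ("ty", "moješ"), ("on", "mojet")]:
    print(pronoun, add_reflexive(form))
```

The point of the sketch is that the choice of allomorph depends only on the final segment of the host, never on the person of the subject; it is not meant to cover participles or other forms outside the paradigm cited.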
The generalization of the unmarked reflexive pronoun is the first in a long series of
phenomena which raise the intricate question of the difference between grammaticali-
zation and analogical extension. On the one hand, it would be easy enough to argue that
what we have here is analogical extension. On the other hand, the semantic bleaching of
the reflexive element causes it to no longer signify features of a referential entity (or an
NP), but rather features of the action (or of the verb), and this involves the loss of the
category of person. I will content myself with having stated the problem and not try to
solve it here. There will be ample discussion of it in ch. 5.4.
A last feature in the development of reflexive elements which commands attention is
their frequent reinforcement. I have said above that reflexive nouns are often used in
apposition to reflexive pronouns, as in E14.b. This is essentially an emphatic,
intensifying use, and it is therefore no wonder that reflexive pronouns are commonly
reinforced by reflexive nouns. The Indo-European reflexive *swe- had yielded atonic he-
in Proto-Greek. This was reinforced by the reflexive noun autós to yield Greek heautós
‘he himself’. Latin se is itself a renovation (probably via complex reinforcement, see
ch. 2.5) of the Indo-European middle voice. Like other personal pronouns, it was
commonly intensified by the meaningless suffix -met or by ipse ‘self’ or by both, e.g.
semet ipsum. In Vulgar Latin, this was again strengthened by putting ipse in the
superlative: *semet ipsimum. This becomes *se medesimo > Port. se mesmo, Span. se
mismo. A series of reinforcements of the reflexive is also reconstructed for Southern
Paiute in Langacker 1977:107. Speakers feel the necessity of such renovations whenever
the reflexive element characterizes merely the action rather than the identity of some
actant; then the latter is underscored by apposing a reflexive noun. Cf. especially Faltz
1977:238-244 and Strunk 1980:329-334.
3.2.2. Indefinite pronominal elements
Overall, indefinite pronominal elements play a much weaker role in the grammar than
definite ones, mainly because they don't relate to the context. Indefinite pronominal
elements contain a semantic component which says that the entity meant is not identical
with anything established in the current universe of discourse. In addition, there is a
categorial component classifying the word as either a determiner or an NP. In
contradistinction to definite pronominal elements, the categorial component is often
represented by a morpheme of its own; cf. Engl. some vs. someone, which vs. who.
I shall treat here the following types of indefinite pronominal elements: interrogative
pronouns, indefinite pronouns, negative pronouns and indefinite articles.
Interrogative pronouns
In a normal pronominal question, the interrogative pronoun is in focus position. This can
be proved by the cleft-sentences which it requires or favors in many languages, e.g. in
French (see Sasse 1977[n], and cf. fn. 42, p. 103). In Japanese, the focus marker ga is
applied to interrogative subjects. This function of the interrogative pronoun has the
consequence that it is normally an accentuated free form. There is thus little room for
variation, and a more grammaticalized interrogative pronoun would cease to be an
interrogative pronoun. This would also seem to account for the amazing diachronic
persistency evinced by interrogative pronouns. Thus, the forms reconstructable for
Indo-European, *kʷi-s ‘who’ and *kʷi-d ‘what’, have survived into most of the modern
languages despite eventual sound changes. However, in some cases they have been
reinforced. The French cleft-structures qui est-ce qui/que and qu'est-ce qui/que may be
interpreted as reinforced interrogative pronouns. They are in fact well on their way to
becoming new interrogative pronouns /kiɛski, kiɛsk/ and /kɛski, kɛsk/, respectively. In
Italian, the neuter che ‘what’ has been reinforced by cosa ‘thing’. The resulting che cosa
is currently being reduced to cosa. This shows a possible source for interrogative
pronouns.
When they are not in focus position and deaccentuated, interrogative pronouns may lose
their interrogative force and become mere indefinite pronouns. Examples: Greek tís, tí
‘who, what’ as opposed to tis, ti ‘someone, something’. The Latin interrogatives quis,
quid, when atonic, may function as indefinites in certain clause types. Similarly, the
German interrogatives wer, was are employed, in the substandard language, as indefinites.
The same applies, finally, to man, mā of Classical Arabic.
Indefinite pronouns
Indefinite pronouns arise from a lot of different sources. The first has just been
mentioned in the preceding paragraph: interrogatives, when atonic, may be used as
indefinites. A second source is provided by the numeral ‘one’. Just like other nominal
determiners, it may be used either as a determiner or as an NP. We leave its
determinative function for p. 46 and observe here its role in the construction of indefinite
pronouns. German einer, Italian and Spanish uno and Abkhaz a-k'ə̀ are relevant
examples. ‘One’ in its turn may come from a noun meaning ‘single’ (IE *oinos). Instead
of taking the detour via the numeral ‘one’, such nouns may also directly be used in
indefinite pronouns. Examples are Nahuatl tlaa ‘something’ < itlaa ‘thing’ and the nouns
in Engl. somebody and something.
In the Indo-European area it is generally the case that a language has more than one
paradigm of indefinite pronouns. Complex, more or less emphatic indefinites may be
built up by combining single ones either with each other or with yet other pronominal
elements. The English words formed with a determinative indefinite pronoun (some or
any) and a nominal head have already been mentioned. The German forms jemand
(ever:man:0) ‘someone’ and jemals (ever:time:ADVR) ‘ever’ have an analogous
structure. These may in turn be reinforced by irgend ‘any’ to yield irgend jemand, irgend
etwas; but irgend may also be combined directly with the more basic atonic
interrogative-indefinites to yield the whole paradigm of irgendwer ‘anyone’, irgendwann
‘any time’ etc. Similarly, the Latin interrogative-indefinite quis and the other pronouns
of its paradigm may be reinforced by ali- ‘other’ to yield aliquis ‘someone’ etc.
Alternatively, the reinforcement may be done by suffixing quam ‘how’ → quisquam
‘anybody’18 or by reduplication → quisquis ‘whoever’; and there are yet other possibilities.
18 Analogs to this occur in Japanese (suffix -mo) and Imbabura Quechua (suffix -pash, Cole
Another widely favored way of forming complex indefinites is by using the numeral
‘one’ as a nominal head and expanding it by determinative indefinite elements. Typical
examples are English someone and anyone, corresponding to German irgendeiner. ‘One’
may also be combined with indefinites which are already complex. Thus Latin aliquis →
Vulgar Latin *aliqui-unu > Ital. alcuno ‘someone’ (cf. French aucun). Similarly, Latin
qualis ‘which’ + quis yielded Vulgar Latin *quali-qui > Ital. qualche, French quelque.
These function as adjectives and are combined with ‘one’ to yield the substantival
indefinites qualcuno, quelqu'un. Much could be added here about the formation and fate
of pronouns meaning ‘whoever’, ‘every(one)’ etc. It will appear from this exemplification that
indefinite pronouns are a particularly rich field of continuous reinforcements by ever
new combinations of old material.
As for the non-specific human indefinite pronoun ‘one’, two sources have been found.
The first is, once more, the numeral ‘one’, as in English. This occurs also in Cairene
Colloquial Arabic (Gary & Gamal-Eldin 1982:79). The other source consists of nouns with the
general meaning ‘person’. Compare French on < *hom ‘man’, German man id., Ital. la
gente ‘the people’ and Abkhaz a-wayå` ART-man/person (Hewitt 1979:157f).
While definite, namely personal, pronouns generally have a strong tendency to become
clitic and affixal to the term governing them, mostly the verb, such advanced
grammaticalizations have been little observed in the case of indefinite pronouns. I am
aware of two cases of (former) indefinite pronouns filling the position of a personal verb
affix. The Nahuatl indefinite pronoun tlaa ‘something’ may be incorporated into the verb
in direct object position, as in E22.
E22. ni-k-neki in ti-tla-kwa-s.
‘I want you to eat (something).’ (Misteli 1893:118)
In Abkhaz, there is an indefinite pronoun a-k'ə̀ ‘something’, which is identical to the
numeral ‘one’ and which may be expanded to a-k'ə̀-r ‘anything’ (Hewitt 1979:158). A
reduced form of this may appear in the absolutive prefix position of a few verbs, as in
E23. (a+)k'r-y-fò-yt'
‘he's eating’ (Hewitt 1979:220)
In both of these examples, the morphological grammaticalization is matched by a
semantic one, since there is no emphasis on an indefinite object, but rather the verb is
detransitivized by this device.
So far we have dealt with substantival indefinite pronouns only. I will not comment here
on the various morphological differences which often separate indefinite determiners
from substantival pronouns. However, just as the definite pronominal elements take a
different course, accordingly as they are NPs or determiners, developing into articles in
the latter case, the same happens with indefinite pronouns, which also develop into arti-
cles when adnominal. In the most widely known examples, it is the numeral ‘one’ which
becomes an indefinite article (for a recent treatment see Givón 1981). The English,
German and Romance cases are too well-known to require exemplification here. The
same phenomenon occurs in Persian (yek), Turkish (bir) and many other languages. The
phonological weakening which separates English a(n) from one is noteworthy, as it is an
outer sign of the grammaticalization performed. Similarly, the possibility to pluralize
Spanish un (unos) marks the grammatical distance from the numeral un(o).
I have been implying here that the development in question passes through the stages
numeral ‘one’ > (determinative) indefinite pronoun > indefinite article (cf. Heine & Reh
1984:273). One may ask what the evidence for the intermediate stage is. Why not simply
pass from the numeral to the article, as most linguists have assumed? The reasons are
both theoretical and empirical. Theoretically, we may posit, on the basis of the facts
ascertained about definite pronominal elements, the following proportion: Just as an
adnominal demonstrative does not directly change into a definite article, but passes
through the intermediate stage of a deictically unmarked determiner (e.g. Vulgar Latin
ille, German dér), so the numeral ‘one’ does not directly become an indefinite article, but
passes through the intermediate stage of a numerically neutral indefinite determiner.
‘Numerically neutral’ does not mean that more than one may be meant, but that the
opposition to the other cardinal numbers is lost. If this assumption is correct, we should
expect there to be indefinite articles coming from indefinite pronouns other than those
based on the numeral ‘one’. Such cases do exist. The English atonic some, often
linguistically rendered as sm, is a first example. A more convincing one comes from
Kobon (Davies 1981). There is an indefinite pronoun ap ‘some’, usable as a substantive
or a determiner, which is unrelated to the numeral ‘one’ and may even cooccur with it
(o.c. 150), but which possibly comes from a former interrogative ‘what’ (nöhön ‘what’
would then be a renovation, at the side of an ‘who’; o.c. 8). This is regularly used as an
obligatory postnominal indefinite article, as in ni ap ‘a boy’ (o.c. 60). It may also be
combined with a partitive morpheme ri–mn- to yield ri–mnap ‘some’, which is preferably
used with mass nouns, as in hali– ri–mnap ‘some greens’ (o.c. 151). So this is a piece of
empirical evidence to prove that the grammaticalization stage immediately preceding the
indefinite article is an adnominal indefinite pronoun, which may in turn come from the
numeral ‘one’.
We observed above that the main grammaticalization channel of the definite
pronominal elements allowed for a side-channel which led to relative pronouns. The
same repeats itself with the indefinite pronominal elements. Interrogative-indefinites are
often used as relative pronouns, especially in preposed relative clauses. Examples are
again IE *kʷis, which yielded the Hittite and Latin relative pronouns kwis and qui,
respectively, and Bambara (Mande) mìn. The grammaticalization of the indefinite to the
relative pronouns involves the loss of the indefiniteness feature; since relative pronouns
are mere place-holders, they are neither definite nor indefinite. Further details in
Lehmann 1984, ch. V.2.3, § 2.
Negative indefinites
Pronouns equivalent to Engl. nobody, nothing are mostly either formed by a negator plus
an indefinite pronoun, or the negator is directly combined with an element from the same
source that also feeds the indefinites. As for the first alternative, negation appears to be
the principal context in many languages which allows atonic interrogatives to be used as
indefinites, the negator and the interrogative-indefinite then frequently coalescing to a
negative pronoun. Thus, from the volitive negation nē plus quis we get Latin nēquis
‘nobody’; and in an exactly parallel fashion we get Mangarayi jag iñja (VOL.NEG
who) ‘nobody’ (Merlan 1982:36, 119).19 Non-interrogative indefinites are at the basis of
German niemand (NEG:someone) ‘no one’, nie(mals) (NEG:ever) ‘never’ etc., and
similarly of Latin numquam ‘never’ etc. Cf. also French aucun (see above). The numeral ‘one’
is also used; cf. English no one, Ital. nessuno, Span. ninguno etc.
19 I cannot dwell here on the role of volition in this context nor on the obvious similarity, not
noted by Merlan, between the volitive negation and the word for ‘who’.
Lexical nouns seem to be exploited to a greater degree in the formation of negative
indefinites than of plain indefinites, which would be explicable as a consequence of the
greater emphasis commonly associated with the former. Thus, while English nobody,
nothing do correspond to plain indefinites formed with the same nouns, Latin nemo, nihil
and German nichts do not have such counterparts. Ne + ho/emo ‘man’ yields nemo
‘nobody’, ne + hilum ‘fiber’ > nihilum > nihil ‘nothing’, OHG ni + wiht-s (NEG + thing-GEN)
> German nichts ‘nothing’.
While these forms, even if synchronically not fully analyzable, clearly contain a negative
(sub)morphemic unit, we also find negative indefinites which are analyzable, but contain
no trace of a negator. The better known cases20 are French personne, rien etc., the former
of quite recent origin, the latter going back to Vulgar Latin rem ‘thing’. In the literary
style, these are still combined with the negator ne; but they retain their negative meaning
even in isolation and will certainly outlive ne.
20 For less known cases in Germanic languages see Krahe 1967:73.
If negative pronouns are further grammaticalized, they commonly become negators.
Thus, Latin nihil ‘nothing’ > nil and Spanish nada id. are often used in the sense of ‘not
(in the least)’. The Latin negator