Complement Raising and
Cluster Formation in Dutch
A Treebank-supported Investigation
Published by
LOT
Trans 10
3512 JK Utrecht
The Netherlands
phone: +31 30 253 6111
e-mail: lot@uu.nl
http://www.lotschool.nl
Cover illustration: Leen Sevens
ISBN 978-94-6093-195-6
NUR 616
Copyright © 2015: Liesbeth Augustinus. All rights reserved.
KU Leuven
Faculteit Letteren
Onderzoeksgroep Computationele en
Formele Taalkunde
Centrum voor Computerlinguïstiek
Complement Raising and
Cluster Formation in Dutch
A Treebank-supported Investigation
Liesbeth Augustinus
Dissertation submitted to obtain
the degree of
Doctor in Linguistics
Leuven, 2015
Doctoral committee
Supervisor: Prof. dr. Frank Van Eynde
Co-supervisor: Prof. dr. Hans Smessaert
Members: Dr. Gosse Bouma
Prof. dr. Jan Odijk
Prof. dr. Jeroen van Craenenbroeck
to my family
Acknowledgements
There are many people whom I would like to thank for their support during the
preparation of this thesis.
First and foremost, I would like to thank my supervisor prof. dr. Frank Van Eynde,
for giving me the opportunity to write this dissertation, for the hours of inspiring
discussions, and for the valuable feedback on draft versions. Without his support, this
thesis would never have seen the light of day. I also want to express my gratitude to
my co-supervisor prof. dr. Hans Smessaert, for the fruitful discussions and extensive
comments on my writings.
Another big thank you goes out to my colleagues at the Centre for Computational
Linguistics (CCL). Vincent Vandeghinste, for his co-operation on GrETEL and
for helping me out with technical issues every time I got lost in the digital forest. Ineke
Schuurman and Peter Dirix, for their co-operation on GrETEL, the fruitful discussions
on the research presented here, and for weeding out typos in the pre-final version of
this thesis. My office mates Tom Vanallemeersch and Leen Sevens, for distracting me
with a sufficient number of discussions on linguistically (ir)relevant topics when I was
writing this thesis, and for designing the cover illustration.
I would like to express my gratitude to dr. Gosse Bouma and prof. dr. Jeroen van
Craenenbroeck for the discussions on parts of my linguistic research, and I would like
to thank prof. dr. Jan Odijk for the many suggestions and comments with respect to
the development of GrETEL and tools for treebank mining.
I would like to thank the audiences of workshops and conferences where I have
presented parts of the research described in this thesis, especially the audience of the
past CLIN, LREC, and HPSG conferences, as well as the participants of the workshop
on verb clusters. I would also like to thank the colleagues of the Department of
Linguistics, with whom I have discussed my research in a more informal context.
My gratitude goes out to the Fonds voor Wetenschappelijk Onderzoek (FWO),
who funded the research presented in this thesis (G.0559.11.N.10, 2011–2015).
Many thanks to my friends and family, especially my parents Lieve and Eddy and my
brother Klaas. Even though the subject of my research is somewhat mysterious to
them, I could not have done this without their encouragement and endless support.
And finally, thank you Jonas, for always being there.
Contents
Preface
I Literature Study
1 Descriptive study
1.1 Dutch sentence structure
1.2 What is a verb cluster?
1.3 Clustering verbs
1.4 Infinitivus Pro Participio
1.5 The Third Construction
1.6 Cluster creeping
1.6.1 A typology of cluster creepers
1.6.2 Position of the cluster creepers
1.7 Word order variation
1.8 Verbal complement types
1.9 Cross-serial dependencies
1.10 The boundaries of the second pole
1.10.1 In the second pole or in the Nachfeld?
1.10.2 In the second pole or in the Mittelfeld?
1.11 Cluster formation in German
1.11.1 German sentence structure
1.11.2 Coherent versus incoherent structures
1.11.3 Infinitivus Pro Participio
1.11.4 Word order variation
1.11.5 The Third Construction and cluster creeping
1.11.6 Cross-serial dependencies
1.12 Conclusion
2 Status quæstionis: Transformational grammar
2.1 Extraposition versus V-Raising
2.1.1 Head-final approaches
2.1.2 A head-initial proposal
2.2 Third Construction and VP-Raising
2.2.1 Head-final approaches
2.2.2 A head-initial proposal
2.3 Conclusion
3 Status quæstionis: Monostratal grammar
3.1 Cluster formation in CG and GPSG
3.1.1 Categorial Grammar (CG)
3.1.2 Generalized Phrase Structure Grammar (GPSG)
3.2 Introduction to HPSG
3.2.1 Signs and types
3.2.2 Headed phrases
3.2.3 Argument selection versus argument realization
3.2.4 Raising and control
3.2.5 Conclusion
3.3 Generalized raising
3.3.1 Binary branching verb clusters
3.3.2 Argument Inheritance
3.4 Head-Complement versus Head-Cluster structures
3.4.1 Hinrichs & Nakazawa (1994)
3.4.2 Rentier (1994)
3.4.3 Kathol (2000) and Müller (2002)
3.5 An alternative analysis: flat tree structures
3.6 Conclusion
II Corpus Study
4 Treebank mining
4.1 Corpora, treebanks and linguistics
4.2 Treebanks
4.2.1 CGN
4.2.2 LASSY
4.3 Querying the treebanks
4.3.1 XPath
4.3.2 XQuery
4.3.3 GrETEL: An online search engine for treebanks
4.3.4 Stand-alone search tools
4.4 Conclusion
5 A treebank-supported investigation of verb clusters
5.1 Constructions with a verbal complement
5.2 Verb clusters with bare infinitives and/or a past participle
5.2.1 Extracting the constructions
5.2.2 Cluster types and word order variation
5.2.3 Clustering verbs
5.2.4 Conclusion
5.3 Verb clusters with a te-infinitive
5.3.1 Extracting the constructions
5.3.2 Cluster types and word order variation
5.3.3 Clustering verbs
5.3.4 Conclusion
5.4 IPP constructions
5.4.1 Extracting IPP constructions
5.4.2 A typology of IPP verbs
5.4.3 Conclusion
5.5 Cluster creepers
5.5.1 Extracting constructions with cluster creepers
5.5.2 Results
5.5.3 Conclusion
5.6 Conclusion
III Analysis
6 A new treatment of verb clusters
6.1 Why differentiate complement raising from subject raising
6.1.1 Interaction with subject control verbs
6.1.2 Interaction with the binding principles
6.1.3 Interaction with the passive lexical rule
6.2 An alternative treatment of complement raising
6.2.1 Complement raising versus subject raising
6.2.2 Complement raising versus complement extraction
6.3 Constraints on complement raising
6.3.1 How to block complement raising
6.3.2 No complement raising beyond the first pole
6.4 Optional versus obligatory complement raising
6.5 Word order and branching structure
6.6 Conclusion
7 Beyond verb clusters
7.1 Complement raising out of non-verbal phrases
7.1.1 Complement raising out of non-verbal complements
7.1.2 Complement raising out of subjects and adjuncts
7.1.3 Conclusion
7.2 Adposition stranding
7.2.1 Adposition stranding in Dutch
7.2.2 No complement raising out of P-initial PPs
7.2.3 Complement raising vs complement extraction out of PPs
7.2.4 A comparison with the uniform extraction analysis
7.3 Conclusion
Conclusion
A Abbreviations
B Treebank annotations
B.1 CGN treebank
B.1.1 Syntactic annotations
B.1.2 Lexical annotations
B.1.3 Data format
B.2 LASSY Small
B.2.1 Syntactic annotations
B.2.2 Lexical annotations
Preface
Dutch is well known for its verb clusters, as illustrated by the following example:
Ik  denk   dat   ik  Cecilia  het  nijlpaard  heb   zien  voeren.
I   think  that  I   Cecilia  the  hippo      have  seen  feed
‘I think I saw Cecilia feed the hippo.’
The verbs form a cluster at the end of the clause, where they are separated from their
non-verbal dependants. In addition, Infinitivus Pro Participio (IPP) shows up: the
verb zien ‘see’ is selected by an auxiliary of the perfect, but appears as an infinitive
and not as a past participle. Verb clustering and related phenomena such as the IPP
effect have fascinated researchers for decades, as indicated by the abundant literature
that is available within descriptive, theoretical and corpus linguistics. Still, many
questions with respect to the description and the analysis of Dutch verb clusters remain
unanswered. The research presented in this thesis addresses a number of these
questions, and investigates how authentic language examples obtained from corpora
can add value to a theoretical analysis of Dutch verb clusters.
This dissertation is organized into three parts: a literature study (part I), a corpus
study (part II), and a theoretical analysis (part III).
Part I considers how verb clusters are described and analysed in the descriptive and
theoretical literature. Chapter 1 gives a definition of verb clusters based on the literature,
and points out the phenomena that are typically related to cluster formation.
The most important ones include the IPP effect, the interruption of the cluster by
non-verbal elements, and word order variation within the cluster. Chapter 2 sketches
the analysis of verb clusters in transformational grammar, as the first theoretical analyses
of verb clusters were described in that framework. Moreover, the terminology
used in descriptive and theoretical accounts in other frameworks is often based on
the transformational work on verb clusters. Chapter 3 provides an overview of the
most influential monostratal analyses of verb clusters. The focus is on the treatments
formulated within Head-driven Phrase Structure Grammar (HPSG), as this framework
is also used for the new analysis proposed in part III.
The literature study addresses the following questions:
– What is the set of Dutch clustering verbs?
– In which cases is clustering obligatory and in which cases is it optional?
– What is the link between cluster formation and the IPP effect?
– What types of word order variation can be observed in Dutch verb clusters?
– What are the conditions on cluster creeping, i.e. the interruption of the cluster
by non-verbal elements?
Part II presents a corpus-based investigation of verb clusters. By consulting treebanks,
i.e. text corpora enriched with syntactic annotations, it will be investigated
whether and how often the phenomena described in part I occur in non-elicited language
data. Chapter 4 presents the data and the methodology used for the corpus
study. Chapter 5 describes and discusses the results of the treebank investigation.
The main topics that will be addressed are the word order variation observed in the
data, the identification of the clustering verbs, and the occurrence of IPP and cluster
creeping. Special attention is paid to constructions with a te-infinitive, as they are often
neglected in studies on verb clusters.
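By way of illustration, the following minimal Python sketch shows what consulting a treebank for IPP constructions could look like. It is not the query procedure used in this thesis. The XML tree is a hand-made toy parse of the subordinate clause from the example above, and the attribute names (rel, cat, pt, wvorm, word, lemma) as well as the function name find_ipp_candidates are assumptions chosen for the illustration, loosely modelled on dependency-style treebank annotation.

```python
import xml.etree.ElementTree as ET

# Toy parse of "(dat) ik Cecilia het nijlpaard heb zien voeren".
# Element and attribute names are illustrative assumptions, not a real schema.
TOY_TREE = """
<node cat="ssub">
  <node rel="su" pt="vnw" word="ik" lemma="ik"/>
  <node rel="hd" pt="ww" wvorm="pv" word="heb" lemma="hebben"/>
  <node rel="vc" cat="inf">
    <node rel="obj1" pt="n" word="Cecilia" lemma="Cecilia"/>
    <node rel="hd" pt="ww" wvorm="inf" word="zien" lemma="zien"/>
    <node rel="vc" cat="inf">
      <node rel="obj1" cat="np">
        <node rel="det" pt="lid" word="het" lemma="het"/>
        <node rel="hd" pt="n" word="nijlpaard" lemma="nijlpaard"/>
      </node>
      <node rel="hd" pt="ww" wvorm="inf" word="voeren" lemma="voeren"/>
    </node>
  </node>
</node>
"""

# Assumption: the auxiliaries of the perfect are identified by lemma.
PERFECT_AUXILIARIES = {"hebben", "zijn"}


def find_ipp_candidates(root):
    """Return (auxiliary, infinitive) pairs in which a perfect auxiliary
    selects a verbal complement headed by an infinitive that itself
    selects a further verbal complement."""
    candidates = []
    for phrase in root.iter("node"):
        head = phrase.find("node[@rel='hd']")
        complement = phrase.find("node[@rel='vc']")
        if head is None or complement is None:
            continue
        if head.get("lemma") not in PERFECT_AUXILIARIES:
            continue
        # IPP: the selected verb is an infinitive where a participle is expected.
        lower_head = complement.find("node[@rel='hd'][@wvorm='inf']")
        has_lower_complement = complement.find("node[@rel='vc']") is not None
        if lower_head is not None and has_lower_complement:
            candidates.append((head.get("word"), lower_head.get("word")))
    return candidates


root = ET.fromstring(TOY_TREE)
print(find_ipp_candidates(root))  # [('heb', 'zien')]
```

The sketch only uses the restricted XPath subset supported by the Python standard library. A full XPath or XQuery engine would allow the same pattern to be stated as a single query.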
Part III presents a new analysis of Dutch verb clusters, formulated in HPSG. In chapter 6
it will be demonstrated that the current HPSG analyses do not adequately account for
Dutch verb clusters. An alternative analysis will be proposed that addresses their
shortcomings. It relies heavily on the empirical observations obtained from the treebanks.
Chapter 7 illustrates how the analysis proposed in chapter 6 extends to the analysis
of other phenomena, especially adposition stranding.