
Jan Strunk
University of Cologne | UOC · Department of Linguistics
Master of Arts
28 Publications
5,788 Reads
500 Citations
Additional affiliations
August 2004 - May 2013
Publications (28)
Words in utterance-final positions are often pronounced more slowly than utterance-medial words, as previous studies on individual languages have shown. This paper provides a systematic cross-linguistic comparison of relative durations of final and penultimate words in utterances in terms of the degree to which such words are lengthened. The study...
This paper explores the application of quantitative methods to study the effect of various factors on phonetic word duration in ten languages. Data on most of these languages were collected in fieldwork aiming at documenting spontaneous speech in mostly endangered languages, to be used for multiple purposes, including the preservation of cultural h...
Significance
When we speak, we unconsciously pronounce some words more slowly than others and sometimes pause. Such slowdown effects provide key evidence for human cognitive processes, reflecting increased planning load in speech production. Here, we study naturalistic speech from linguistically and culturally diverse populations from around the wo...
This study is concerned with the identifiability of intonational phrase boundaries across familiar and unfamiliar languages. Four annotators segmented a corpus of more than three hours of spontaneous speech into intonational phrases. The corpus included narratives in their native German, but also in three languages of Indonesia unknown to them. The...
This book examines the issue of competing motivations in grammar and language use. The term “competing motivations” refers to the conflicting factors that shape the content and form of grammatical rules and which speakers and addressees need to contend with when expressing themselves, or when trying to comprehend messages. For example, there are on...
I describe the construction of a corpus for research on relative clause extraposition in German based on the treebank TüBa-D/Z. I also define an annotation scheme for the relations between relative clauses and their antecedents which is added as a second annotation level to the syntactic trees. This additional annotation level allows for a direct...
The realization of singular count nouns without an accompanying determiner inside a PP (determinerless PP, bare PP, Preposition-Noun Combination) has recently attracted some interest in computational linguistics. Yet, the relevant factors for determiner omission remain unclear, and conditions for determiner omission vary from language to...
We present an annotation scheme for preposition senses in preposition-noun combinations (PNCs) and PPs in German.
Prepositions are highly polysemous. Yet, little effort has been spent to develop language-specific annotation schemata for preposition senses to systematically represent and analyze the polysemy of prepositions in large corpora. In this paper, we present an annotation schema for preposition senses in German. The annotation schema includes a hierarc...
Preposition-noun constructions (PNCs) are problematic in that they allow the realization of singular count nouns without an accompanying determiner. While the construction is empirically productive, it defies intuitive judgments. In this paper, we describe the extraction of PNCs from large annotated corpora as a preliminary step for identifying th...
Preposition-noun constructions (PNCs) are problematic in that they allow the realization of singular count nouns without an accompanying determiner. While the construction is empirically productive, it defies intuitive judgments. In this paper we describe the extraction of PNCs from large annotated corpora as a preliminary step for identifying thei...
Chomsky (1973, p. 235) and Akmajian (1975) take the set of cyclic categories to include S (IP) and NP (DP). Prediction: Extraposition "out of" a noun phrase embedded inside another noun phrase is impossible.
In this article, we present a language-independent, unsupervised approach to sentence boundary detection. It is based on the assumption that a large number of ambiguities in the determination of sentence boundaries can be eliminated once abbreviations have been identified. Instead of relying on orthographic clues, the proposed system is able to det...
In this paper, we describe a new unsupervised sentence boundary detection system and present a comparative study evaluating its performance against different systems found in the literature that have been used to perform the task of automatic text segmentation into sentences for English and Portuguese documents. The results achieved by this new...
I provide LFG analyses for three nominal possessive constructions of modern Low Saxon, a less-studied West Germanic language closely related to Dutch and German. I argue that elegant synchronic analyses of these constructions can be given if it is assumed that they involve a phenomenon which is largely parallel to verbal pro-drop and which I accor...
In this paper, I describe and compare different solutions to the problem of information retrieval for languages that lack a fixed orthography. For these languages, even simple Boolean queries are a problem, because the person writing the query might use a different orthographic system from the people whose documents have been indexed. Moreover, if...
We describe a language-independent, flexible, and accurate method for the detection of abbreviations in text corpora. It is based on the idea that an abbreviation can be viewed as a collocation, and can be identified by using methods for collocation detection such as the log likelihood ratio. Although the log likelihood ratio is known to show a good recall, its precisio...
The detection of abbreviations is an important step in the process of sentence boundary detection. We describe a flexible, language-independent and accurate method based on the idea that an abbreviation can be viewed as a collocation. As such, it can be identified by using methods for collocation detection such as the log likelihood ratio. Altho...
In this paper, I describe and evaluate a least-effort approach for automatically learning the gender of Low Saxon nouns. I propose two different methods: One is a corpus-based method which relies on counting clue words that occur in the immediate context of the nouns to be classified and uses very simple statistics to assign gender classes. A secon...