Figure 1 - uploaded by Manfred Sailer
CoDII web interface for the German BW Hehl (‘secret’) 


Source publication
Article
Abstract. This paper gives an overview of CoDII, the Collection of Distributionally Idiosyncratic Items. CoDII is an electronic collection of various subgroups of lexical items that are characterized by idiosyncratic distribution. This means that the distribution of these lexemes in text cannot be traced back solely to...

Contexts in source publication

Context 1
... the non-English items), expressions in which the item occurs, and, if appropriate, paraphrases. This is also where we report occurrences of an NPI outside its theoretically expected licensing environments.

Within the block ‘Syntactic Information’, each item is assigned a syntactic category. The syntactic structure of the expression in which the item occurs is added where appropriate. Possible syntactic variations are listed, including passivization, pronominalization, modification, topicalization, occurrence in raising or control constructions, and appearance within relative or interrogative clauses. For each syntactic variation, examples from corpora, the Internet or the linguistic literature are included. Three tagsets provide the theory and notation for the syntactic description of CoDII items. The Stuttgart-Tübingen Tagset (STTS) is used for the syntactic description of German items and of the expressions in which they occur. The English BWs are annotated with the syntactic annotation scheme from the Syntactically Annotated Idiom Database (SAID, cf. Kuiper et al. (2003)). For the syntactic description of Romanian NPIs we take the (modified) tagset from the Multilingual Text Tools and Corpora for Central and Eastern European Languages (MULTEXT-East).

The block ‘Licensing Contexts’ contains information on the licensing environment of each item. In the case of polarity items, the licensing contexts are chosen from general, descriptive categories rather than from classifications in a particular theoretical framework. We distinguish the following licensing environments: clausemate (sentential) negation, non-clausemate negation, n-words (such as nobody, never), the scope of negation expressed by the determiner kein-, the scope of without, interpretation in the restrictor of universal quantifiers, other contexts of interpretation which are logically downward-entailing (and are not subsumed by one of the more specific categories), the scope of only, the complement clause of negative verbs (such as doubt, fear and regret), questions, antecedents of conditionals, comparative constructions, superlative constructions, and imperatives. To allow the documentation of all available data, exceptional cases that do not fit any of these predetermined categories are listed as ‘Exceptions’. Some of the licensing environments will be discussed in more detail below.

The examples illustrating the use of CoDII items in their licensing contexts were collected from electronic and printed sources. The Romanian examples were gathered from Rada Mihalcea’s Romanian electronic corpus and from Internet searches with Google. Some examples were constructed by Gianina Iordăchioaia, a native speaker of Romanian, who worked on CoDII-NPI.ro. The sources of the German BWs, NPIs and PPIs were corpora of the Institute of German Language in Mannheim, the corpus of the Digitales Wörterbuch der Deutschen Sprache (DWDS), and Internet searches with Google. The examples in CoDII-BW.en mainly come from dictionaries, from the Internet and from the British National Corpus (via the SARA software package).

The last block, ‘Classificatory Information’, reports, for each item, classifications found in the literature. For English and German BWs, the classifications were taken from Dobrovol’skij (1988, 1989), Dobrovol’skij and Piirainen (1994), and Nunberg et al. (1994).
Polarity items are classified as positive or negative, and are subdivided into three semantic classes according to the theory of Zwarts (1997) (see the discussion below). For citations from literature that does not use these semantic distinctions, we use the classification tag ‘open’.

The five CoDII collections are encoded in XML with a uniform schema. Technical details for BWs are described in Sailer and Trawiński (2006) and Trawiński et al. (2008b), and for PIs in Trawiński and Soehn (2008). The design and data structure of CoDII are conceived in such a way that further types of distributionally idiosyncratic items, such as anaphora, can be modeled, and collections from various languages can easily be integrated using the existing schema.

CoDII not only compiles, documents and (alphabetically) lists distributionally idiosyncratic items. Due to its integration into the open-source XML database eXist, it also offers dynamic and flexible access. The design of the internal data structure and the annotation with syntactic and (partial) semantic information make it possible to query our resource with respect to particular lemmata, syntactic properties and linguistically interesting classifications. First statistical observations on the data in our collections, obtained using these database functionalities, are reported in Trawiński et al. (2008a, 2008b) and Trawiński and Soehn (2008).

The user interface of CoDII displays all the linguistic information, including syntactic structure and licensing contexts, together with links to the corresponding examples (see Figure 1 and Figure 2). Comments, information about the classification systems, licensing contexts, and examples of the usage of each item in context can be obtained by clicking on the links in the display. All bibliographic references in CoDII are linked to two electronic bibliographies, the ‘Bound Words Bibliography’ and the ‘Polarity Items Bibliography’.

2.3. Context Classification and Variation

Figure 1 and Figure 2 show the web interface of CoDII for two entries in different subcollections: the German BW Hehl (‘secret’) and the German multi-word NPI ein(en) Hehl aus etw. machen (‘to make a secret out of sth.’). In CoDII-BW.de, Hehl is recorded as a word without the usual free distribution of a noun; it may only occur as part of the multi-word expression ein(en) Hehl aus etw. machen. Figure 1 shows the information blocks that CoDII records for a bound word, including a window available through the ‘Output(s)’ link which illustrates one result of a given sample query. Note the links in this CoDII entry, which provide background information about the various categorizations offered on this page.

Before looking at Hehl as an item in CoDII-NPI.de, a more precise explanation of NPIs and their semantic subclasses is in order. The three classes of NPIs we distinguish in CoDII, weak NPIs, strong NPIs, and superstrong NPIs, were introduced by Zwarts (1997). In the formulation of the theory given by van der Wouden (1997), they are algebraically defined as follows:

(1) NPIs are superstrong if they are licensed only by antimorphic contexts (overt negation). An example of an antimorphic operator is sentential negation.

(2) NPIs are strong if they are licensed by antimorphic and anti-additive contexts. Examples of anti-additive operators are the expressions nobody and never. The word nobody is shown to be anti-additive by checking that the sentence Nobody complained or resisted is true in exactly those situations in which Nobody complained and nobody resisted is true.

(3) NPIs are weak if they are licensed by antimorphic, anti-additive, and downward-entailing contexts (and possibly some others). An example of a plain downward-entailing operator is the phrase few students. This phrase is shown to be downward entailing by checking that Few students complained or resisted implies Few students complained and few students resisted. Moreover, Few students complained or few students resisted implies Few students complained and resisted.

According to the definition of the three NPI classes, any NPI is licensed by sentential negation. Strong NPIs need to be in the scope of an operator that is at least strong, i.e. at least anti-additive. The German strong NPI einen blassen Schimmer haben (‘to have the faintest idea’) is thus licensed by sentential negation and niemals (‘never’) but not by wenige Studenten (‘few students’). Weak NPIs are already satisfied in the presence of a weak licenser, i.e. one that is merely downward entailing.

Hehl is recorded in CoDII-NPI.de because, apart from being a bound word, it is also a lexeme which occurs in a multi-word expression that behaves like an NPI: Figure 2 shows the corresponding CoDII entry and a window with a corpus example reachable through the link ‘Example(s)’ of the licensing context ‘Clausemate Negation (CMN)’. The reader will notice that the ‘[A5]’ classification categorizes the item as a weak negative polarity item. This means that it is an item which only needs a logically weak form of negation as licenser. A reflex of this fact is the existence of corpus evidence in the category ‘Downward-Entailing (DENT)’. These are licensing environments that are weaker than the antimorphic ‘Clausemate Negation (CMN)’ environment or the restrictor of the determiner kein-.

The NPI classification in CoDII into weak, strong and superstrong is preliminary in the sense that it strictly follows the corpus evidence that we found: It can (and does) happen that an item which is generally considered a weak NPI is classified as strong in CoDII, because we only found corpus evidence for its occurrence with sentential negation, kein-, and ohne (‘without’). It is important to realize that CoDII deliberately stays within the limited horizon of its database and leaves it to the user’s judgment and research to revise this preliminary categorization where appropriate or necessary.

3. Theory

3.1. Grammar Theories

Idioms are treated very differently in different areas of linguistics. Two opposite extremes within the overall spectrum are the constructional (holistic) approach and the collocational approach. The constructional perspective views idioms as syntactic and semantic units which are usually treated as fixed, stored chunks. They are basically conceived of as lexical items, differing from words primarily in that they may be syntactically complex. This perspective is common in the phraseological literature such as Fleischer (1997) and in formal linguistics, be it Generative Grammar (Chomsky, 1981), or Construction Grammar (Fillmore et ...
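The algebraic definitions of the three licenser strengths in Section 2.3, and the evidence-driven way an NPI is assigned to a class, can be illustrated with a small brute-force check over a finite model. This is only a sketch and not part of CoDII: the four-person domain, the operator functions, the modeling of few students as ‘at most one’, and the function names are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy domain of four students (an assumption, not CoDII data).
DOMAIN = ("ann", "ben", "cem", "dana")

# Every subset of the domain stands for one possible predicate denotation,
# e.g. the set of students who complained.
PREDICATES = [frozenset(c) for r in range(len(DOMAIN) + 1)
              for c in combinations(DOMAIN, r)]

# Operators map a predicate denotation to a truth value.
def nobody(p):        # 'Nobody P'
    return len(p) == 0

def few_students(p):  # 'Few students P', modeled here as 'at most one'
    return len(p) <= 1

def not_ann(p):       # sentential negation: 'Ann did not P'
    return "ann" not in p

def some_student(p):  # 'Some student P' (not a negative context)
    return len(p) > 0

def downward_entailing(f):
    # f(A or B) implies f(A) and f(B); on sets this amounts to f being
    # monotone decreasing, so the second implication in the text follows.
    return all((not f(a | b)) or (f(a) and f(b))
               for a in PREDICATES for b in PREDICATES)

def anti_additive(f):
    # f(A or B) is true in exactly the situations where f(A) and f(B) are.
    return all(f(a | b) == (f(a) and f(b))
               for a in PREDICATES for b in PREDICATES)

def antimorphic(f):
    # Anti-additive, and f(A and B) is true exactly when f(A) or f(B) is.
    return anti_additive(f) and all(f(a & b) == (f(a) or f(b))
                                    for a in PREDICATES for b in PREDICATES)

def strength(f):
    """Return the strongest of the three properties that f satisfies."""
    if antimorphic(f):
        return "antimorphic"
    if anti_additive(f):
        return "anti-additive"
    if downward_entailing(f):
        return "downward-entailing"
    return "none"

RANK = {"antimorphic": 3, "anti-additive": 2, "downward-entailing": 1}
CLASS = {3: "superstrong", 2: "strong", 1: "weak"}

def npi_class(attested_licensers):
    """Classify an NPI by the weakest licenser with attested evidence.

    All licensers passed in must be at least downward entailing.
    """
    return CLASS[min(RANK[strength(f)] for f in attested_licensers)]

print(strength(nobody), strength(few_students), strength(not_ann))
print(npi_class([not_ann, nobody]))
```

Because the class is derived only from the licensers that are actually attested, an item for which no downward-entailing evidence has been found comes out as strong in this sketch, which mirrors the preliminary character of the classification described above.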
Context 2
... This perspective is common in the phraseological literature such as Fleischer (1997) and in formal linguistics, be it Generative Grammar (Chomsky, 1981), or Construction Grammar (Fillmore et al., 1988). The collocational perspective originates from corpus linguistic research. Under this perspective, the co-occurrence patterns of individual words are studied. If a word co-occurs with a second word more often than expected on the basis of their syntactic category, the two words form a collocation. This perspective is common in computational corpus linguistic research on idioms, such as in ...
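The idea that two words form a collocation when they co-occur more often than expected can be made concrete with an association measure. The sketch below uses pointwise mutual information (PMI), a standard measure that is swapped in here for illustration: it takes the expected frequency from the individual word frequencies rather than from syntactic category, so it simplifies the criterion stated above. The function name and the four toy sentences are constructed for this example and are not corpus data.

```python
import math
from collections import Counter

def pmi_scores(sentences):
    """Pointwise mutual information for word pairs sharing a sentence.

    Each sentence is a list of tokens; a word or pair is counted at most
    once per sentence. Pairs are keyed in alphabetical order.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        types = sorted(set(tokens))
        word_counts.update(types)
        for i, first in enumerate(types):
            for second in types[i + 1:]:
                pair_counts[(first, second)] += 1
    n = len(sentences)
    scores = {}
    for (first, second), count in pair_counts.items():
        observed = count / n
        expected = (word_counts[first] / n) * (word_counts[second] / n)
        scores[(first, second)] = math.log2(observed / expected)
    return scores

# Constructed toy sentences (an assumption for illustration only).
toy_corpus = [
    ["er", "macht", "keinen", "hehl", "daraus"],
    ["sie", "macht", "keinen", "hehl", "daraus"],
    ["er", "macht", "keinen", "fehler"],
    ["sie", "macht", "das", "essen"],
]

scores = pmi_scores(toy_corpus)
print(scores[("daraus", "hehl")])  # positive: co-occur more than expected
print(scores[("hehl", "macht")])   # zero: 'macht' occurs in every sentence
```

A positive score means the pair co-occurs more often than independence would predict. On realistic corpora, raw PMI overrates rare pairs, so a frequency threshold or a different association measure is usually applied as well.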
Context 3
... This perspective is common in computational corpus linguistic research on idioms, such as in computational lexicography (Sinclair (1991), Moon (1998)), and in more general computational linguistic approaches such as Krenn (1999). Interestingly for us, there is a natural area of overlap between these two perspectives: The constituents of what would traditionally be called an idiom may show high co-occurrence ratios in corpora. However, the two perspectives do not cover the same ground. Many idioms ...

Citations

... This raises the question of how general the phenomenon of nonveridicality (as distinct from the better-known phenomenon of sensitivity to negation) is in languages like German. We believe that there might be a lot more to be found, once we look more closely into a great number of idioms, using modern corpus tools (see Lichte & Soehn 2007; Richter, Sailer & Trawiński 2010). We hope this paper will encourage researchers in the field of phraseology to look more closely at the distribution of the idioms they study. ...
Article
This paper discusses idioms of the form be/come from X-place, where X-place is a placename which contains an adjectival or verbal root. Using a corpus of examples from the Internet, the constructional properties of placename idioms and aspects of their distribution are considered. The examples are taken from German, Dutch and English. After a general introduction of placename idioms, we focus on the subtype represented by German aus Dummsdorf kommen (‘be from Stupid-village’). The following aspects of the expression will be discussed: the set of verbs involved, their syntactic flexibility, their interpretation, their status as individual-level (rather than stage-level) predicates, and the fact that they appear in nonveridical contexts, primarily negative sentences, questions, and modal/subjunctive contexts, making them a special type of negative polarity item.
Chapter
We present experimental findings that support the hypothesis that the licensing requirements of negative polarity items (NPIs) pattern with well-formedness conditions on frozen syntactic-semantic features of idiomatic expressions. When multiword NPIs that require a strong negation as their licenser are accompanied by a weaker type of negative licenser instead, they are perceived as degraded by native speakers the same way as violations of morphosyntactic co-occurrence requirements in the idiomatic multiword component of these NPIs. Such a violation occurs for example when a certain noun phrase in argument position is in plural form instead of singular, or when an obligatory lexical element is replaced by a synonym. Subsuming idiomatic phrases under the more general category of (not necessarily idiomatic) collocationally restricted complex expressions, we take our results as evidence for a theory of NPIs which interprets their licensing in syntactically delimited negative environments as an instance of satisfying the well-formedness constraints of a collocation that comprises a semantic restriction. The lexically variable negation component of NPIs is interpreted as an abstract semantic co-occurrence requirement of a complex collocation.