
# Using the text corpus to create a comprehensive list of phrasal verbs

01/2002

ABSTRACT

The paper describes the extraction of Estonian multi-word verbs (MWVs) from text corpora using SENVA, a language- and task-specific software tool based on the statistical, language-independent software tool SENTA (Dias et al., 2000). The outcome is a comprehensive list of 16,000 phrasal verbs. We describe the extraction tool and the principles of manual post-editing, and we evaluate the outcome in terms of precision and recall by comparing the results with man-made electronic dictionaries and with the results of a manual extraction experiment on a subset of the MWVs.

Footnote 1: We use the term phrasal verb here to denote what English grammars call a multi-word lexical verb; we use the latter term in the rest of the paper for clarity.
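The precision and recall evaluation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the extracted list and a reference dictionary are both available as sets of lemmatised verb strings; the function name `precision_recall` and the six sample verbs are hypothetical illustrations and are not taken from the paper or from SENVA.

```python
def precision_recall(extracted, reference):
    """Compare an extracted list against a reference list (both iterables of strings).

    Precision: share of extracted items that also appear in the reference.
    Recall: share of reference items that were extracted.
    """
    extracted = set(extracted)
    reference = set(reference)
    true_positives = extracted & reference
    precision = len(true_positives) / len(extracted) if extracted else 0.0
    recall = len(true_positives) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical sample data, for illustration only (not results from the paper).
extracted = {"aru saama", "ette võtma", "läbi viima", "kokku leppima"}
reference = {"aru saama", "ette võtma", "läbi viima", "pealt kuulama", "vastu võtma"}

p, r = precision_recall(extracted, reference)
print(f"precision={p:.2f} recall={r:.2f}")
```

Note that an item missing from the reference dictionary is counted as an error here even if it is a genuine multi-word verb, so a dictionary-based precision figure of this kind is a lower bound rather than an exact measure.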

• "In (Kaalep, Muischnek 2002) we reported about an experiment, involving the creation of a database of Estonian MWV-s, based on both human-oriented dictionaries and various text corpora. The current paper has a closer look at one subtask of the experiment – finding new MWV-s in a corpus. "
##### Article: Inconsistent Selectional Criteria in Semi-automatic Multi-word Unit Extraction

##### Article: Differentiating types of verb particle constructions
ABSTRACT: A verb particle construction (VPC) classification scheme gleaned from linguistic sources has been used to assess its usefulness for identifying issues in decomposability. Linguistic sources have also been used to inform the features suitable for use in building an automatic classifier for the scheme with a series of good performance results. The notions of how to define the task of computing phrasal verbs are discussed and new proposals are presented.
Article · Jan 2004
##### Article: Verbi ja noomeni püsiühendid eesti keeles [Fixed verb-noun expressions in Estonian]
