Conference Paper

A System for Recognition of Named Entities in Greek.

DOI: 10.1007/3-540-45154-4_39
Conference: Natural Language Processing - NLP 2000, Second International Conference, Patras, Greece, June 2-4, 2000, Proceedings
Source: DBLP


In this paper, we describe work in progress on a Greek named entity recognizer. The system is aimed at information
extraction applications that require large-scale text processing. Speed of analysis, system robustness, and accuracy of results
have been the basic guidelines for the system's design. Pattern matching techniques have been implemented on top of an existing
automated pipeline for Greek text processing, and the resulting system relies on non-recursive regular expressions
to capture different types of named entities. For development and testing purposes, we collected a corpus of financial texts
from several web sources and manually annotated part of it. Overall precision and recall are 86% and 81%, respectively.
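
The abstract does not reproduce the system's actual patterns, so the following is only a minimal sketch of what non-recursive, regular-expression-based entity matching might look like. The pattern set, the entity labels (ORG, MONEY, DATE), and the function name `find_entities` are all assumptions made for illustration; they are not taken from the paper.

```python
import re

# Hypothetical building block: a capitalized Greek word
# (one uppercase letter followed by lowercase letters).
CAP = r"[Α-ΩΆΈΉΊΌΎΏ][α-ωάέήίόύώϊϋΐΰς]+"

# Hypothetical flat (non-recursive) patterns, one per entity type.
# None of these is taken from the paper.
PATTERNS = [
    # One or more capitalized words followed by a company-form suffix.
    ("ORG", re.compile(rf"{CAP}(?: {CAP})* (?:Α\.Ε\.|ΑΕ|Ε\.Π\.Ε\.)")),
    # A number with optional digit groups, followed by a currency word.
    ("MONEY", re.compile(r"\d+(?:[.,]\d+)* (?:δρχ\.|ευρώ|δολάρια)")),
    # A numeric date such as 12/03/2000.
    ("DATE", re.compile(r"\d{1,2}/\d{1,2}/\d{4}")),
]


def find_entities(text):
    """Return (start, end, label, matched_text) tuples sorted by position."""
    spans = []
    for label, pattern in PATTERNS:
        for m in pattern.finditer(text):
            spans.append((m.start(), m.end(), label, m.group()))
    return sorted(spans)
```

Calling `find_entities` on a sentence from a financial text returns the matched spans in order of appearance. Because each pattern is a single flat expression, matching is one linear scan per entity type, which fits the emphasis on speed and robustness described above.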



Available from: Voula Giouli, Nov 26, 2015
  • Source
    • "Petasis et al. (2002) use the C4.5 machine learning algorithm to update NER grammars. Boutsis et al. (2000) use a collection of 110 hand-crafted grammars. Lucarelli (2005) uses Support Vector Machines to recognize person Named Entities and semi-automatically created patterns to recognize temporal expressions. "
    ABSTRACT: We describe our work on Greek Named Entity Recognition, comparing three different machine learning techniques: (i) Support Vector Machines (SVM), (ii) Maximum Entropy and (iii) Onetime, a shortcut method based on previous work of one of the authors. The majority of our system's features use linguistic knowledge provided by morphology, punctuation, the position of the lexical units within a sentence and within a text, electronic dictionaries, and the outputs of external tools (a tokenizer, a sentence splitter, and a Hellenic version of Brill's Part of Speech Tagger). After testing, we observed that applying a few simple Post Testing Classification Correction (PTCC) rules, created after observing output errors, improved the output of the SVM and the Maximum Entropy systems. We achieved very good results with the three methods. Our best configurations (Support Vector Machines with a second degree polynomial kernel and Maximum Entropy) both achieved an overall F-measure of 91.06 after the application of PTCC rules.
  • Source

    Proceedings of the 3rd International Conference on Language Resources and Evaluation, Las Palmas, Spain; 05/2002
  • Source
    ABSTRACT: We present a named-entity recognizer for Greek person names and temporal expressions. For temporal expressions, it relies on semi-automatically produced patterns. For person names, it employs two Support Vector Machines that scan the input text in two passes, and active learning, which reduces the human annotation effort during training.
    12/2006: pages 203-213;