Conference Paper

Learning Relations Using Collocations.

Conference: IJCAI'2001 Workshop on Ontology Learning, Proceedings of the Second Workshop on Ontology Learning OL'2001, Seattle, USA, August 4, 2001 (Held in conjunction with the 17th International Joint Conference on Artificial Intelligence IJCAI'2001)
Source: DBLP

ABSTRACT This paper describes the application of statistical analysis of large corpora to the problem of extracting semantic relations from unstructured text. We regard this approach as a viable method for generating input for the construction of ontologies as ontologies use well-defined semantic relations as building blocks (cf. van der Vet & Mars 1998). Starting from a short description of our corpora as well as our language analysis tools, we discuss in depth the automatic generation of collocation sets. We further give examples of different types of relations that may be found in collocation sets for arbitrary terms. The central question we deal with here is how to postprocess statistically generated collocation sets in order to extract named relations. We show that for different types of relations like cohyponyms or instance-of-relations, different extraction methods as well as additional sources of information can be applied to the basic collocation sets in order to verify the existence of a specific type of semantic relation for a given set of terms.
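The abstract does not say which significance measure the authors use to build collocation sets, so the following is only a minimal sketch of the general idea: count how often two words occur in the same sentence and rank each word's partners by a statistical association score. Pointwise mutual information (PMI), the function name `collocation_sets`, the whitespace tokenization, and the `min_pair_count` threshold are all assumptions made for illustration, not the paper's method.

```python
import math
from collections import Counter
from itertools import combinations


def collocation_sets(sentences, min_pair_count=2):
    """Map each word to its co-occurring partners, ranked by PMI.

    Assumption: PMI over sentence-level co-occurrence stands in for
    whatever significance measure the paper actually uses.
    """
    n = len(sentences)
    word_freq = Counter()  # number of sentences containing each word
    pair_freq = Counter()  # number of sentences containing both words

    for sentence in sentences:
        # Naive tokenization (assumption): lowercase, split on whitespace.
        words = sorted(set(sentence.lower().split()))
        word_freq.update(words)
        pair_freq.update(combinations(words, 2))

    result = {}
    for (a, b), n_ab in pair_freq.items():
        if n_ab < min_pair_count:
            continue  # skip pairs too rare to score reliably
        pmi = math.log2(n_ab * n / (word_freq[a] * word_freq[b]))
        result.setdefault(a, []).append((b, pmi))
        result.setdefault(b, []).append((a, pmi))

    # Highest score first; ties broken alphabetically for stable output.
    for partners in result.values():
        partners.sort(key=lambda p: (-p[1], p[0]))
    return result


corpus = [
    "paris is the capital of france",
    "berlin is the capital of germany",
    "paris and berlin are cities",
    "france and germany are countries",
]
sets = collocation_sets(corpus)
print(sets["capital"])
```

On a corpus this small only function words and "capital" pass the threshold, which illustrates why the abstract stresses large corpora and a postprocessing step: raw collocation sets mix many relation types, and further filtering is needed before a named relation such as cohyponymy can be read off them.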

  • ABSTRACT: Keyword and keyphrase extraction is an important problem in natural language processing, with applications ranging from summarization to semantic search to document clustering. Graph-based approaches to keyword and keyphrase extraction avoid the problem of acquiring a large in-domain training corpus by applying variants of the PageRank algorithm to a network of words. Although graph-based approaches are knowledge-lean and easily adoptable in online systems, it remains largely open whether they can benefit from centrality measures other than PageRank. In this paper, we experiment with an array of centrality measures on word and noun phrase collocation networks, and analyze their performance on four benchmark datasets. Not only are there centrality measures that perform as well as or better than PageRank, but they are much simpler (e.g., degree, strength, and neighborhood size). Furthermore, centrality-based methods give results that are competitive with and, in some cases, better than two strong unsupervised baselines.
  • ABSTRACT: We present a framework for supporting ontology engineering by exploiting key-concept extraction. The framework is implemented in an existing wiki-based collaborative platform, which has been extended with a component for terminology extraction from domain-specific textual corpora and with a further step aimed at matching the extracted concepts with pre-existing structured and semi-structured information. Several ontology engineering tasks can benefit from the availability of this system: ontology construction and extension, ontology terminological validation and ranking, and ontology concept ranking.
    Proceedings of the 18th international conference on Knowledge Engineering and Knowledge Management; 10/2012