Conference Paper

Learning Relations Using Collocations.

Conference: IJCAI'2001 Workshop on Ontology Learning, Proceedings of the Second Workshop on Ontology Learning OL'2001, Seattle, USA, August 4, 2001 (Held in conjunction with the 17th International Joint Conference on Artificial Intelligence IJCAI'2001)
Source: DBLP

ABSTRACT: This paper describes the application of statistical analysis of large corpora to the problem of extracting semantic relations from unstructured text. We regard this approach as a viable method for generating input for the construction of ontologies, since ontologies use well-defined semantic relations as building blocks (cf. van der Vet & Mars 1998). Starting from a short description of our corpora and our language analysis tools, we discuss in depth the automatic generation of collocation sets. We then give examples of the different types of relations that may be found in collocation sets for arbitrary terms. The central question we address is how to postprocess statistically generated collocation sets in order to extract named relations. We show that for different types of relations, such as cohyponyms or instance-of relations, different extraction methods and additional sources of information can be applied to the basic collocation sets in order to verify the existence of a specific type of semantic relation for a given set of terms.

Available from: Christian Wolff, Dec 03, 2012
  • Source
    • "Word collocation networks (Ferret, 2002; Ke, 2007), also known as collocation graphs (Heyer et al., 2001; Choudhury and Mukherjee, 2009), are networks of words found in a document or a document collection, where each node corresponds to a unique word type, and edges correspond to word collocations (Ke and Yao, 2008). In the simplest case, each edge corresponds to a unique bigram in the original document."
    ABSTRACT: In this paper, we explore complex network properties of word collocation networks (Ferret, 2002) from four different genres. Each document of a particular genre was converted into a network of words with word collocations as edges. We analyzed graphically and statistically how the global properties of these networks varied across different genres, and among different network types within the same genre. Our results indicate that the distributions of network properties are visually similar but statistically apart across different genres, and interesting variations emerge when we consider different network types within a single genre. We further investigate how the global properties change as we add more and more collocation edges to the graph of one particular genre, and observe that except for the number of vertices and the size of the largest connected component, network properties change in phases, via jumps and drops.
  • Source
    • "Ontology learning is a generic term pertaining all information extraction tasks and approaches aimed at (semi) automatically extracting relevant concepts and relations from a corpus to be included in an explicit, formal specification, i.e. the ontology. The corpus can be processed in different ways to retrieve such information, for example it can be first parsed [9], or collocations can be identified [10] or also semantic graphs can be retrieved [11]. Also the choice of the corpus from which concepts and relations should be retrieved represents a relevant issue."
    ABSTRACT: We present a framework for supporting ontology engineering by exploiting key-concept extraction. The framework is implemented in an existing wiki-based collaborative platform which has been extended with a component for terminology extraction from domain-specific textual corpora, and with a further step aimed at matching the extracted concepts with pre-existing structured and semi-structured information. Several ontology engineering related tasks can benefit from the availability of this system: ontology construction and extension, ontology terminological validation and ranking, and ontology concepts ranking.
    Proceedings of the 18th international conference on Knowledge Engineering and Knowledge Management; 10/2012
  • Source
    • "Ontology learning is a generic term pertaining all information extraction tasks and approaches aimed at (semi) automatically extracting relevant concepts and relations from a corpus to be included in an explicit, formal specification, i.e. the ontology. The corpus can be processed in different ways to retrieve such information, for example it can be first parsed [9], or collocations can be identified [10] or also semantic graphs can be retrieved [11]. Also the choice of the corpus from which concepts and relations should be retrieved represents a relevant issue."
    ABSTRACT: We present a wiki-based collaborative environment for the semi-automatic incremental building of ontologies. The system relies on an existing platform, which has been extended with a component for terminology extraction from domain-specific textual corpora and with a further step aimed at matching the extracted concepts with pre-existing structured and semi-structured information. The system stands on the shoulders of a well-established user-friendly wiki architecture and it enables knowledge engineers and domain experts to collaborate in the ontology building process. We have performed a task-oriented evaluation of the tool in a real use case for incrementally constructing the missing part of an environmental ontology. The tool effectively supported the users in the task, thus showing its usefulness for knowledge extraction and ontology engineering.
    Proceedings of the 5th IEEE International Conference on Semantic Computing (ICSC 2011), Palo Alto, CA, USA, September 18-21, 2011; 01/2011
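
The first citing excerpt above describes a word collocation network in its simplest form: one node per unique word type and one edge per unique bigram. The following is a minimal Python sketch of that construction only. The function name `build_collocation_network`, the whitespace tokenization, and the choice of an undirected adjacency-set representation are assumptions made for illustration; they are not taken from any of the cited papers, and the statistically generated collocation sets discussed in the abstract would require a significance measure that is not shown here.

```python
def build_collocation_network(tokens):
    """Build an undirected word network from a token list.

    Nodes are unique word types; an edge links two words whenever
    they occur as adjacent tokens (a bigram) in the input.
    Returns a dict mapping each word to the set of its neighbours.
    """
    # Every word type becomes a node, even if the text has a single token.
    graph = {token: set() for token in tokens}
    # Each adjacent pair (bigram) contributes one undirected edge;
    # repeated bigrams collapse because neighbours are stored in sets.
    for left, right in zip(tokens, tokens[1:]):
        graph[left].add(right)
        graph[right].add(left)
    return graph


def unique_edges(graph):
    """Return the set of undirected edges as frozensets of their endpoints."""
    return {frozenset((word, other))
            for word, neighbours in graph.items()
            for other in neighbours}


if __name__ == "__main__":
    # Hypothetical example sentence, tokenized naively on whitespace.
    tokens = "the cat sat on the mat".split()
    network = build_collocation_network(tokens)
    for word in sorted(network):
        print(word, sorted(network[word]))
    print("edges:", len(unique_edges(network)))
```

Storing neighbours in sets keeps each bigram as a single edge regardless of how often it occurs; a weighted variant would count occurrences instead.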