Conference Paper

Towards Machine Learning on the Semantic Web.

DOI: 10.1007/978-3-540-89765-1_17
Conference: Uncertainty Reasoning for the Semantic Web I, ISWC International Workshops, URSW 2005-2007, Revised Selected and Invited Papers
Source: DBLP

ABSTRACT: In this paper we explore some of the opportunities and challenges for machine learning on the Semantic Web. The Semantic Web provides standardized formats for the representation of both data and ontological background knowledge. Semantic Web standards are used to describe meta data but also have great potential as a general data format for data communication and data integration. Within a broad range of possible applications machine learning will play an increasingly important role: Machine learning solutions have been developed to support the management of ontologies, for the semi-automatic annotation of unstructured data, and to integrate semantic information into web mining. Machine learning will increasingly be employed to analyze distributed data sources described in Semantic Web formats and to support approximate Semantic Web reasoning and querying. In this paper we discuss existing and future applications of machine learning on the Semantic Web with a strong focus on learning algorithms that are suitable for the relational character of the Semantic Web's data structure. We discuss some of the particular aspects of learning that we expect will be of relevance for the Semantic Web such as scalability, missing and contradicting data, and the potential to integrate ontological background knowledge. In addition we review some of the work on the learning of ontologies and on the population of ontologies, mostly in the context of textual data.

  • ABSTRACT: Knowledge Discovery is traditionally used for analysis of large amounts of data and enables addressing a number of tasks that arise in the Semantic Web and require scalable solutions. Additionally, Knowledge Discovery techniques have been successfully applied not only to structured data, i.e., databases, but also to semi-structured and unstructured data including text, graphs, images and video. Semantic Web technologies often call for dealing with text and sometimes also graphs or social networks. This chapter describes research approaches that adopt knowledge discovery techniques to address the Semantic Web and presents several publicly available tools that implement some of the described approaches.
    01/2009; 21.
  • ABSTRACT: Collective classification algorithms have been used to improve classification performance when network training data with content, link and label information and test data with content and link information are available. Collective classification algorithms use a base classifier which is trained on training content and link data. The base classifier inputs usually consist of the content vector concatenated with an aggregation vector of neighborhood class information. In this paper, instead of using a single base classifier, we propose using different types of base classifiers for content and link. We then combine the content and link classifier outputs using different classifier combination methods. Our experiments show that using heterogeneous classifiers for link and content classification and combining their outputs gives accuracies as good as collective classification. Our method can also be extended to collective classification scenarios with multiple types of content and link.
    Machine Learning and Data Mining in Pattern Recognition - 7th International Conference, MLDM 2011, New York, NY, USA, August 30 - September 3, 2011. Proceedings; 01/2011
  • ABSTRACT: In many Semantic Web domains a tremendous number of statements (expressed as triples) can potentially be true but, in a given domain, only a small number of statements is known to be true or can be inferred to be true. It thus makes sense to attempt to estimate the truth values of statements by exploring regularities in the Semantic Web data via machine learning. Our goal is a "push-button" learning approach that requires a minimum of user intervention. The learned knowledge is materialized off-line (at loading time) such that querying is fast. We define an extension of SPARQL for the integration of the learned probabilistic statements into querying. The proposed approach deals well with typical properties of Semantic Web data, i.e., with the sparsity of the data and with missing data. Statements that can be inferred via logical reasoning can readily be integrated into learning and querying. We study learning algorithms that are suitable for the resulting high-dimensional sparse data matrix. We present experimental results using a friend-of-a-friend data set.
