Conference Paper

Link Proximity Analysis - Clustering Websites by Examining Link Proximity.

DOI: 10.1007/978-3-642-15464-5_54 Conference: Research and Advanced Technology for Digital Libraries, 14th European Conference, ECDL 2010, Glasgow, UK, September 6-10, 2010. Proceedings
Source: DBLP


This research-in-progress paper presents a new approach, called Link Proximity Analysis (LPA), for identifying related web pages
based on link analysis. In contrast to current techniques, which ignore intra-page link analysis, the approach put forth here examines
the position of links relative to each other within a web page. It builds on the observation that, on nearly every web
page, the proximity of links to each other correlates clearly with the subject-relatedness of the linked websites.
Statistically analyzing this relationship and measuring the number of sentences, paragraphs, etc. between two links
allows related websites to be identified automatically, as a first study has shown.
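
The idea of measuring how many sentences or paragraphs separate two links can be illustrated with a minimal sketch. It is not the authors' implementation: the class `LinkPositionParser`, the function `proximity_scores`, the sample HTML, and the weights (1.0 for the same sentence, 0.5 for the same paragraph, 0.25 for the same page) are all assumptions made for illustration, and sentence boundaries are detected with a deliberately crude punctuation rule.

```python
from html.parser import HTMLParser
from itertools import combinations
import re


class LinkPositionParser(HTMLParser):
    """Record, for every <a href>, the paragraph and sentence it occurs in."""

    def __init__(self):
        super().__init__()
        self.paragraph = 0  # incremented at every <p>
        self.sentence = 0   # incremented at sentence-ending punctuation
        self.links = []     # (href, paragraph index, sentence index)

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.paragraph += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append((href, self.paragraph, self.sentence))

    def handle_data(self, data):
        # Crude heuristic (assumption): '.', '!' or '?' followed by
        # whitespace or the end of the text chunk ends a sentence.
        self.sentence += len(re.findall(r"[.!?](?:\s|$)", data))


def proximity_scores(html):
    """Sum a hypothetical proximity weight for every pair of distinct links."""
    parser = LinkPositionParser()
    parser.feed(html)
    parser.close()
    scores = {}
    for (h1, p1, s1), (h2, p2, s2) in combinations(parser.links, 2):
        if h1 == h2:
            continue
        if p1 == p2 and s1 == s2:
            weight = 1.0   # same sentence (illustrative weight)
        elif p1 == p2:
            weight = 0.5   # same paragraph (illustrative weight)
        else:
            weight = 0.25  # same page only (illustrative weight)
        key = tuple(sorted((h1, h2)))
        scores[key] = scores.get(key, 0.0) + weight
    return scores


SAMPLE_HTML = (
    '<p>See <a href="https://a.example">A</a> and '
    '<a href="https://b.example">B</a>. '
    'Also <a href="https://c.example">C</a>.</p>'
    '<p>Unrelated: <a href="https://d.example">D</a>.</p>'
)

if __name__ == "__main__":
    for pair, score in sorted(proximity_scores(SAMPLE_HTML).items()):
        print(pair, score)
```

Summing such scores over many pages would give, for each pair of websites, an aggregate value that could then be analyzed statistically. How the levels should actually be weighted is left open here.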



Available from: Joeran Beel, Oct 06, 2015
  • ABSTRACT: This paper presents an approach for identifying similar documents that can be used to assist scientists in finding related work. The approach, called Citation Proximity Analysis (CPA), is a further development of co-citation analysis that additionally considers the proximity of citations to each other within an article's full text. The underlying idea is that the closer citations are to each other, the more likely it is that they are related. Compared with existing approaches, such as bibliographic coupling, co-citation analysis, or keyword-based approaches, the advantages of CPA are higher precision and the ability to identify related sections within documents. Moreover, CPA allows a more precise automatic document classification. CPA is used as the primary approach to analyse the similarity and to classify the 1.2 million publications contained in the research paper recommender system
  • ABSTRACT: This report describes the results of automatic processing of a large number of scientific papers according to a rigorously defined criterion of coupling. The population of papers under study was ordered into groups that satisfy the stated criterion of interrelation. An examination of the papers that constitute the groups shows a high degree of logical correlation.
    American Documentation 01/1963; 14(1):10 - 25. DOI:10.1002/asi.5090140103