ABSTRACT: In this paper we address the problem of analyzing Web log data collected at a typical online newspaper site. We propose a two-way clustering technique based on probability theory. On the one hand, the method clusters the readers of the online newspaper into user groups with similar browsing behaviour, where the clusters are determined solely from the collected click streams. On the other hand, the articles of the newspaper are clustered according to the reading behaviour of the users. The two-way clustering produces statistical user and page profiles that domain experts can analyze for content personalization. In addition, the resulting model can be used for on-line prediction: given the user cluster of a person entering the site and the page cluster of an article, one can infer whether or not the user will look at the page in question.
Proceedings of the 2003 Symposium on Applications and the Internet Workshops; 02/2003