ABSTRACT: In recent years, web usage mining has played a major role in monitoring and analyzing user behavior in web environments. This specialized field of data mining yields valuable insights into users' information needs and supports the gradual improvement of a web site.
Adaptive web-based systems adapt their content to user access and navigation patterns. To do so, a thorough analysis of web server logs is required.
XML, a technology designed specifically for sharing data in web environments, offers a great deal of flexibility in analyzing web server logs for session reconstruction. In this paper, the suitability of XML and XSLT transformations is tested and shown to be promising for performing (XML-based) session reconstruction on large amounts of web server log data. A time-based heuristic model is adopted to map user request activities into XML sessions. Further, a so-called Session IdentificatiOn Using XML (SIOUX) algorithm is presented, followed by a description of the XSLT request aggregation strategy used. Finally, a case study of web site usage and the experimental results collected illustrate the suitability of our approach. In the case study, a substantial amount of web server logs was collected and first cleaned of unnecessary data, then submitted to our SIOUX algorithm and the XSLT-based request aggregation engine, which maps each request to a unique session and hence reconstructs the sessions.
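To make the idea of time-based session reconstruction concrete, the following is a minimal sketch in Python. It is not the SIOUX algorithm or the XSLT aggregation engine described in the paper. The 30-minute inactivity threshold, the use of the host address as the user key, the XML element and attribute names, and the sample log entries are all assumptions made for illustration only.

```python
from datetime import datetime, timedelta
import xml.etree.ElementTree as ET

# Assumed inactivity threshold; the abstract does not state the value used.
SESSION_TIMEOUT = timedelta(minutes=30)


def reconstruct_sessions(requests, timeout=SESSION_TIMEOUT):
    """Group (host, timestamp, url) requests into time-based sessions.

    Returns an XML <sessions> element. The element and attribute names
    are hypothetical and not taken from the paper.
    """
    root = ET.Element("sessions")
    open_sessions = {}  # host -> (current <session> element, last request time)
    next_id = 1
    # Process requests in chronological order.
    for host, ts, url in sorted(requests, key=lambda r: r[1]):
        current = open_sessions.get(host)
        if current is None or ts - current[1] > timeout:
            # First request from this host, or the gap exceeds the timeout:
            # open a new session.
            session = ET.SubElement(root, "session", id=str(next_id), host=host)
            next_id += 1
        else:
            session = current[0]
        ET.SubElement(session, "request", time=ts.isoformat(), url=url)
        open_sessions[host] = (session, ts)
    return root


def demo_sessions():
    """Build sessions from a tiny, made-up cleaned log."""
    log = [
        ("10.0.0.1", datetime(2008, 3, 1, 10, 0), "/index.html"),
        ("10.0.0.1", datetime(2008, 3, 1, 10, 5), "/courses.html"),
        ("10.0.0.2", datetime(2008, 3, 1, 10, 6), "/index.html"),
        ("10.0.0.1", datetime(2008, 3, 1, 11, 0), "/index.html"),
    ]
    return reconstruct_sessions(log)


if __name__ == "__main__":
    print(ET.tostring(demo_sessions(), encoding="unicode"))
```

In this sketch each request is assigned to exactly one session: a new session starts whenever a host is seen for the first time or has been idle longer than the threshold. A real pipeline would likely also distinguish users by user agent or cookies, which is outside what the abstract specifies.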
Computer Engineering. 03/2008; 2:71-77.