Figure 1 - uploaded by Janet Wesson
Web Usage Mining Process [5]

Source publication
Conference Paper
Full-text available
Existing Web usage mining (WUM) tools do not indicate which data mining algorithms are used or provide effective graphical visualizations of the results obtained. WUM techniques can be used to determine typical navigation patterns in an organizational Web site. The process of combining WUM and information visualization techniques in order to discov...

Contexts in source publication

Context 1
... purpose of WUM is to reveal the knowledge hidden in the log files of a web server [8]. WUM can be broken down into three main phases, namely preprocessing, pattern discovery and pattern analysis (Figure 1). In the first phase, log files are preprocessed in order to retain only the appropriate information. ...
Context 2
... WebPatterns prototype was implemented using a three-tier architecture, comprising a Presentation Layer in which the visual pattern analysis is provided, an Application Layer in which the pattern discovery is implemented, and a Data Layer in which the preprocessed clickstream data is stored (Figure 1). The Presentation Layer was implemented using C# and FlowChart.NET for the radial tree component and runs in a Windows XP environment. ...
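The three WUM phases named in Context 1 (preprocessing, pattern discovery, pattern analysis) can be illustrated with a minimal Python sketch. It is not the WebPatterns implementation: the sample log lines, the Common Log Format assumption, the list of ignored file suffixes, and the function names are all hypothetical, and simple page counting stands in for whichever mining algorithm a real tool would use.

```python
import re
from collections import Counter

# Hypothetical log lines in Common Log Format; not data from the paper.
SAMPLE_LOG = [
    '10.0.0.1 - - [01/Mar/2006:10:00:00 +0200] "GET /index.html HTTP/1.1" 200 1043',
    '10.0.0.1 - - [01/Mar/2006:10:00:01 +0200] "GET /logo.gif HTTP/1.1" 200 512',
    '10.0.0.1 - - [01/Mar/2006:10:00:30 +0200] "GET /courses.html HTTP/1.1" 200 2210',
    '10.0.0.2 - - [01/Mar/2006:10:05:00 +0200] "GET /index.html HTTP/1.1" 200 1043',
    '10.0.0.2 - - [01/Mar/2006:10:05:20 +0200] "GET /missing.html HTTP/1.1" 404 0',
]

# host, identity, user, [timestamp], "METHOD URL PROTOCOL", status, size
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) [^"]*" (\d{3}) \S+$')

# Assumed set of embedded resources that do not count as page views.
IGNORED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")


def preprocess(lines):
    """Phase 1: keep only successful GET requests for pages, as (host, url)."""
    records = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip malformed lines
        host, method, url, status = match.groups()
        if method != "GET" or status != "200":
            continue
        if url.lower().endswith(IGNORED_SUFFIXES):
            continue
        records.append((host, url))
    return records


def discover_patterns(records):
    """Phase 2: count how often each page was requested."""
    return Counter(url for _, url in records)


def analyze_patterns(counts, min_count=2):
    """Phase 3: keep only pages requested at least min_count times."""
    return sorted(url for url, n in counts.items() if n >= min_count)


records = preprocess(SAMPLE_LOG)
frequent_pages = analyze_patterns(discover_patterns(records))
print(frequent_pages)
```

Keeping each phase in its own function mirrors the separation described in Context 2, where preprocessed data, pattern discovery, and visual analysis live in different layers.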

Similar publications

Article
Full-text available
The World Wide Web includes semantic relations of numerous types that exist among different entities. Extracting the relations that exist between two entities is an important step in various Web-related tasks such as information retrieval (IR), information extraction, and social network extraction. A supervised relation extraction system that is tr...
Article
Full-text available
In the present era, the World Wide Web has become a popular and unavoidable medium for extracting useful and relevant information from huge and scattered sources, particularly for education and e-learning. In every educational organization, e-learning is becoming an integral part of the teaching-learning process. Web mining is the use of data mining tech...

Citations

... There are many studies that evaluate Web materials and Web sites offering Web-based distance education services[47][48][49][50][51][52][53]. In addition, many studies presenting theoretical background on Web mining were reviewed[54][55][56][57][58][59][60][61][62][63][64][65][66][67][68][69][70][71], and, alongside the insights gained from them, important and useful information from these studies has been incorporated into the thesis. In light of the literature review presented above, the motivations for the thesis were determined as follows: 1. To bring new perspectives to the analysis of Web user behaviour, the performance of different statistical methods on Web usage data needs to be tested. 2. In parallel with the growth in Web usage, the number of Web pages is also increasing rapidly. ...
Thesis
Full-text available
With the rapid development and widespread use of the Internet, the Web has become the largest accessible data source in the world. As the amount of information on the Internet grows over time, Web mining has become an increasingly attractive subject for goals such as improving the structure of web sites, keeping it healthy and effective, and providing appropriate web services for client requests. In this thesis, a new process for cleaning text-based web user access logs is proposed. The proposed process was designed and implemented in a Java-based SAS software environment. When cleaning high-dimensional access logs, the improved data cleaning process is faster than the other methods. Three different implementations were realized to extract meaningful and interesting knowledge from the cleaned user access logs.
• Meaningful and interesting patterns were extracted from the web user access log files using the Path Analysis Method. Applying the Path Analysis Method, which is present in the literature and has been used in different fields, to web user access logs showed that it can be used to extract meaningful and interesting knowledge.
• Relationships between web sites were extracted by applying the Association Rules Method to the same dataset.
• Detailed statistics on three months of usage of the Web site were extracted using the Statistical Analysis Method.
Finally, based on the knowledge obtained, web site designers and managers are given suggestions for improving the web site, keeping it healthy and usable, and organizing its structure. In addition, suggestions for increasing the success of the web site and server were formed by analyzing HTTP status codes. These suggestions are intended to increase visitor satisfaction.
... With time new information is added to the website while outdated web pages are deleted. This updating of the web site causes many changes in the website structure, i.e. adding and deleting nodes or connections. ...
Chapter
The web is the largest repository of data. The user frequently navigates on the web to access the information. These navigational patterns are stored in weblogs which are growing exponentially with time. This increase in voluminous weblog data raises major challenges concerning handling big data, understanding navigation patterns and the structural complexity of the web, etc. Visualization is a process to view the complex large web data graphically to address these challenges. This chapter describes the various aspects of visualization with which the novel insights can be drawn in the area of web navigation mining. To analyze user navigations, visualization can be applied in two stages: post pre-processing and post pattern discovery. First stage analyses the website structure, website evolution, user navigation behaviour, frequent and rare patterns and detecting noise. Second stage analyses the interesting patterns obtained from prediction modelling of web data. The chapter also highlights popular visualization tools to analyze weblog data.
... In recent years, web usage mining techniques have been widely used for discovering interesting and frequent user navigation patterns from Web server logs. Most research activities in web mining have centered on web usage mining [3][4][5][6][7][8]. Complete work in this paper is organized in different sections. ...
Article
Full-text available
Web log data offers valuable insight into website usage. It represents the activity of many users over a potentially long period of time. In this paper we analyze Web logs to determine hourly, daily and monthly average access statistics, such as the number of Visits, Hits, total Pages, Files, URLs, Referrers, User Agents and total Kbytes, as well as the top search Strings, the most popular requested Pages and the most popular Images. One year of web log data was collected from the main web server of an Educational Institution, and an attempt has been made to enhance the performance of the Website through Web log analysis.
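As a rough illustration of the hourly access statistics mentioned in this abstract, the following Python sketch counts hits per hour of day. The timestamps are hypothetical and assume the Common Log Format time layout; the paper does not state which log format or tool was used.

```python
from collections import Counter
from datetime import datetime

# Hypothetical request timestamps in Common Log Format; not data from the paper.
timestamps = [
    "01/Mar/2006:10:00:00 +0200",
    "01/Mar/2006:10:45:12 +0200",
    "01/Mar/2006:14:03:59 +0200",
    "02/Mar/2006:10:15:00 +0200",
]


def hits_per_hour(timestamps):
    """Count requests by local hour of day (0-23), across all days."""
    counts = Counter()
    for text in timestamps:
        moment = datetime.strptime(text, "%d/%b/%Y:%H:%M:%S %z")
        counts[moment.hour] += 1
    return counts


hourly = hits_per_hour(timestamps)
for hour in sorted(hourly):
    print(f"{hour:02d}:00  {hourly[hour]} hits")
```

Dividing each count by the number of days covered would turn these totals into the hourly averages the abstract refers to.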
... Azhar explores the use of web usage mining techniques to analyze web log records collected from an e-learning portal using the Apriori algorithm (Azhar, 2005). Oosthuizen, Wesson, and Cilliers (2006) discuss and analyze web logs for visual web mining of organizational websites using data mining algorithms. Drott (1998) explains the various web server log mining methods that could be used to improve site design. ...
Article
Full-text available
Web usage mining analyzes web log files to discover users' access patterns for web pages. In order to effectively manage and report on a website, it is necessary to get feedback about activity on the web servers. The aim of this study is to help the web designer and web administrator improve the impressiveness of a website by determining which link connections occur on the website. To this end, web log files are pre-processed and then the path analysis technique is used to investigate the URL information concerning access to electronic sources. The proposed methodology is applied to the web log files of the web server of Firat University. The results and findings of this experimental study can be used by the web designer to plan upgrades and enhancements to the website.
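One simple reading of path analysis is to count how often visitors move directly from one page to another. The Python sketch below does this for a few hypothetical sessions; the session data, URLs, and function name are assumptions for illustration, and the study's own path analysis procedure may differ.

```python
from collections import Counter

# Hypothetical sessions: each is the ordered list of pages one visitor requested.
sessions = [
    ["/", "/library", "/library/e-journals"],
    ["/", "/library", "/library/catalog"],
    ["/", "/students", "/library", "/library/e-journals"],
]


def count_transitions(sessions):
    """Count direct page-to-page moves (source, target) over all sessions."""
    counts = Counter()
    for pages in sessions:
        # Pair each page with the page requested immediately after it.
        for source, target in zip(pages, pages[1:]):
            counts[(source, target)] += 1
    return counts


transitions = count_transitions(sessions)
for (source, target), n in transitions.most_common():
    print(f"{source} -> {target}: {n}")
```

Links with high counts indicate connections that visitors actually follow, which is the kind of feedback the abstract says a web designer can use.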
... While there are many systems [6,15,16,19,20] that focus on visualizing data mining results and the source data, Ankerst et al. [2,3] developed the PBC system to visualize decision tree construction and the data classification process. J. Han et al. proposed the RuleViz model [10] and developed the DTViz [9] and CVizT [8] systems to visualize the decision tree and classification rule construction process to use visual space in a more efficient way. ...
... Youssefi et al. [19] implement a visual web mining system using 3D representation, but severe occlusion problems make the approach impractical. A recent visual data mining system, WebPatterns [16], focuses on visualizing web usage associations, sequences, and network analysis. Most of these systems apply the idea of visualizing usage data using visual cues of the structure object, however, more interactive operations can be added to describe various possible visual data mining manipulations. ...
... One of the most popular methods to visualize web structure is Chi's radial disktree representation [6]. Since the radial tree layout uses screen space more efficiently than other layout methods, many systems [4,5,15,16] adapted the disktree method to visualize web structure. Other similar approaches include the ConeTree [17] and the Hyperbolic tree [12]. ...
Conference Paper
Full-text available
Discovering web navigational trends and understanding data mining results is undeniably advantageous to web designers and web-based application builders. It is also desirable to interactively investigate web access data and patterns, to allow ad-hoc discovery and examination of patterns that are not known a priori. Visualizing the usage data in the context of the web site structure is of major importance, as it puts web access requests and their connectivity in perspective. Various visualization tools have been developed for this task, but they often fail to provide visual data mining functionalities to generate new patterns. Here we present our visual data mining system, WebViz, which allows interactive investigation of web usage data within their structure context, as well as ad-hoc knowledge pattern discovery on web navigational behaviour.
... The radial tree layout uses screen space more efficiently than a vertical layout, and easily conveys a visual sense of distance between web pages. Several visualization systems have adapted the disk tree layout to represent various aspects of web structure [1,9,10]. Other similar approaches include ConeTree [11], which starts from a node and displays its children at the base of a cone that has the parent as its vertex, and Hyperbolic tree [6], which visualizes the web structure by means of a hyperbolic space. ...
Conference Paper
Full-text available
As the volume of digitally accessible information grows, there is increasing pressure on the development of data visualization methods to enable humans to interpret that data. We provide a description of our WebViz system, as a tool to visualize both the structure and usage of web sites. We illustrate the use of our visualization paradigm by introducing polygonal graphs layered on top of our adaptation of radial disk trees. In our system, the structure of a web segment is rendered as a radial tree, and usage data can be extracted and layered as polygonal graphs. By interactively creating and adjusting these layers, a user can develop real time insight into the data. We present the system, show the idea of interactive visual operators, and provide some examples that help show the value of the specific visualization techniques, as well as the interactive use of those techniques.
... In recent years, web usage mining techniques have been widely used for discovering interesting and frequent user navigation patterns from Web server logs. Most research activities in web mining have centered on web usage mining [2] [3] [4] [5] [6] [7]. A project aiming at automatic classification of web user navigation patterns, which proposes a novel approach to classifying user navigation patterns and predicting users' future requests, was introduced in [3]. ...
Article
Full-text available
Web usage mining analyzes Web log files to discover users' access patterns for Web pages. User access log files contain very significant information about a web server. This paper deals with finding information about a web site, such as top errors and link errors between pages, from the web server access log files. The aim of this study is to analyze the web server user access logs of Firat University to help the system administrator and Web designer improve their system by identifying system errors that have occurred, as well as corrupted and broken links, using web usage mining. We found useful information about activity statistics, such as top errors, client errors and server errors within the visited pages, in a web server. The results obtained in the study will be used in the further development of the web site in order to increase its effectiveness.
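The client and server error statistics described in this abstract can be sketched in Python by grouping requests by HTTP status class (4xx for client errors, 5xx for server errors). The request records and function names below are hypothetical; the study's actual tooling is not stated in the abstract.

```python
from collections import Counter

# Hypothetical (url, status) pairs already extracted from an access log.
requests = [
    ("/index.html", 200),
    ("/old/page.html", 404),
    ("/old/page.html", 404),
    ("/cgi-bin/form", 500),
    ("/private/", 403),
    ("/news.html", 200),
]


def error_class(status):
    """Map an HTTP status code to an error class, or None if not an error."""
    if 400 <= status <= 499:
        return "client error"
    if 500 <= status <= 599:
        return "server error"
    return None


def summarize_errors(requests):
    """Return error counts per class and per (url, status) pair."""
    by_class = Counter()
    by_url = Counter()
    for url, status in requests:
        cls = error_class(status)
        if cls is None:
            continue  # successful or redirected request
        by_class[cls] += 1
        by_url[(url, status)] += 1
    return by_class, by_url


by_class, by_url = summarize_errors(requests)
print(dict(by_class))
print(by_url.most_common(1))
```

A URL that repeatedly returns 404 is a likely candidate for a broken link, which is the kind of finding the abstract reports.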
Article
Currently, user requirements on the Web are diverse. Furthermore, each web user wishes to retrieve the data or goods they are looking for more conveniently and more quickly. Because web users have different search criteria and dispositions, they are led to perform unnecessary repeated operations in order to use what the web designer has implemented. In this paper, we suggest a system that analyzes user patterns on the Web using log file analysis and conveys the information of web sites to users more effectively. We analyze the log file for customer data in the system; the proposed method is implemented by means of EC-Miner, a data mining tool, and aims to offer an appropriate Layout for personalization by giving weight to each transport path.