Chapter

Web Usage Mining and Personalization

... Web Usage Mining (Mobasher, B., 2005) is defined as the application of data mining techniques to discover useful patterns in web log data in order to understand users' behaviour as a result of the activities they perform on the web. It is explained in detail in the subsequent section. ...
... Website Personalization (Mobasher, B., 2005) is the process of composing tailored experiences for the users of a website rather than providing a single, broad experience; website personalization allows companies to present visitors with unique experiences tailored to their needs and desires. Web personalization takes one-to-one attentiveness and carries it into the digital world. ...
... For website administrators, user satisfaction always remains the main focus. Service quality and performance (Mobasher, B., 2005), including network throughput, database load, and web data traffic over the website, are crucial to visitors' satisfaction. Web usage mining provides the technical information for improving the system (web server, network, storage, and database) on which the websites are hosted. ...
Chapter
Full-text available
Websites have become the major source of information, and analysis of web usage has become the most important way of investigating users' behaviour and obtaining information that website owners can use to make strategic decisions. This chapter sheds light on the concept of web usage mining, its techniques, and its applications in various domains.
... Web data mining: Another aspect of Web personalization emphasizes the Web usage mining component (Pierrakos et al. 2003, Mobasher 2004). These approaches extract information from Web logs recording users' behaviour on the Web. ...
... There are a number of possible classification schemas for Web personalization algorithms (Resnick and Varian 1997, Schafer et al. 1999, Terveen and Hill 2001, Mobasher 2004). Several kinds of Web personalization algorithms can be distinguished: collaborative filtering, content-based filtering, demographic-based personalization, and knowledge-based personalization. ...
... This approach is based on the assumption that a given user might like entities similar to the one she/he is interested in. Content-based filtering has several identified drawbacks such as biases of input, and static profiles (Mobasher 2004). User inputs are often subjective to the user and thus prone to biases. ...
Article
Full-text available
In the past few years, spatial information and services have proliferated on the Web, because most of our daily activities are related to the spatial dimension. The user communities involved in spatial web services are diverse and still expanding and transforming, with a constantly increasing number of users and applications. This opens many research challenges, such as the elicitation of users' interests and preferences and the customization of information services on the spatial Web. This PhD research proposes an integrated framework for user modeling, preference elicitation, and personalization services on the spatial Web. The framework identifies personalization services and a semantic user model for spatial web applications. These two components communicate information and knowledge about the user through inter-process communications. The personalization services are based on three mechanisms: the Bi-directional Neural Associative Memory, user-centric spatial proximity and similarity measures, and image schemata and affordance concepts. A web-based user interface is integrated with these components, and offers a spectrum of personalized search strategies and a hybrid personalization engine. The user model employs expressive description logics to describe assumptions about the user and to infer implicit user features from users' descriptions as required by an application system. An application scenario in the tourism domain and a Web-based Java prototype provide an experimental validation of the research framework and the identified personalization techniques.
... With the explosive growth of knowledge available on the World Wide Web, which lacks an integrated structure or schema, it becomes much more difficult for users to access relevant information efficiently. Meanwhile, the substantial increase in the number of websites presents a challenging task for webmasters, who must organize the contents of the websites to cater to the needs of users (Mobasher, 2004; Liu & Keselj, 2007; Albadvi & Shahbazi, 2009). These problems have made Web personalization an indispensable tool for both Web-based organizations and end users. ...
... These problems have made Web personalization an indispensable tool for both Web-based organizations and end users. Web personalization can be described as any action that customizes a user's Web experience to the user's taste or preferences (Mobasher, 2004). Recently, Web mining techniques have been widely applied for personalization (Guandong, 2008). ...
... In the offline recommendation phase, web usage mining techniques are applied to reveal the hidden navigation patterns of users that are stored in the web server logs. In the online phase, the current session of the active user is compared with these navigational patterns using a similarity measure, and the pages to recommend are determined accordingly (Mobasher, 2004). Most research activities in web mining have centered on content mining and usage mining (Liu & Keselj, 2007). ...
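The online matching step described in the snippet above can be sketched as follows. This is only an illustrative sketch, not the cited authors' method: it assumes cosine similarity as the similarity measure, and the page names, pattern weights, and the helper names `cosine` and `recommend` are hypothetical.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend(active, patterns, pages, top_n=2):
    """Recommend unvisited pages from the pattern closest to the active session."""
    # Online phase: pick the offline-mined pattern most similar to the session.
    best = max(patterns, key=lambda p: cosine(active, p))
    # Keep pages the user has not visited yet and that carry weight in the pattern.
    candidates = [(w, page) for w, page, seen in zip(best, pages, active)
                  if seen == 0 and w > 0]
    # Highest weight first; ties broken alphabetically for determinism.
    candidates.sort(key=lambda wp: (-wp[0], wp[1]))
    return [page for _, page in candidates[:top_n]]

# Hypothetical data: two navigation patterns mined offline (one weight per page).
pages = ["/home", "/courses", "/admissions", "/faculty"]
patterns = [
    [1.0, 0.8, 0.6, 0.0],
    [1.0, 0.0, 0.1, 0.9],
]
active_session = [1, 1, 0, 0]  # the active user has visited /home and /courses
recommendations = recommend(active_session, patterns, pages)
```

With these made-up weights the first pattern is the closer match, so the only unvisited page it supports, `/admissions`, is recommended.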
Article
Full-text available
With the rapid growth of the World Wide Web, finding useful information on the Internet has become a critical issue. Automatic classification of user navigation patterns provides a useful tool to solve this problem. In this paper, we propose an approach for classifying users' navigation patterns and predicting users' future requests. Users' profiles are constructed from Web server log files, and a clustering method is applied to the users' profiles to assign navigation patterns. Finally, using a neural network, the recommender engine produces a relevant recommendation list of web pages for the active user. The preliminary results indicate that the proposed approach has high accuracy and coverage in predicting users' future requests.
... In this paper, a Simulated Annealing based biclustering approach is proposed to identify optimal user profiles based on their browsing interests, so that web pages can be recommended effectively. The proposed approach is tested on the CTI dataset (Mobasher (2004) and Zhang et al. (2005)). ...
... A session file consists of a sequence of users' requests for pages P = {p_1, p_2, p_3, …, p_n} and a set of m sessions, S = {s_1, s_2, s_3, …, s_m}, where each s_i belongs to S (Mobasher, 2004 and Zhang et al., 2005). A session-pageview matrix A(U, P) is of size n x m, where n is the number of sessions and m is the number of pageviews. ...
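A session-pageview matrix of the kind mentioned in the snippet above can be built as in the following sketch. It assumes one row per session and one column per pageview, with each cell holding a visit count; the sample sessions and the function name `session_pageview_matrix` are hypothetical, and the cited work may weight cells differently (for example by binary presence or viewing time).

```python
def session_pageview_matrix(sessions):
    """Return (pages, matrix): one row per session, one column per pageview."""
    # Column order: all distinct pages, sorted for reproducibility.
    pages = sorted({p for s in sessions for p in s})
    index = {p: j for j, p in enumerate(pages)}
    matrix = [[0] * len(pages) for _ in sessions]
    for i, session in enumerate(sessions):
        for page in session:
            matrix[i][index[page]] += 1  # count each visit to the page
    return pages, matrix

# Hypothetical sessions, each an ordered list of requested pages.
sessions = [["p1", "p2", "p1"], ["p2", "p3"], ["p1"]]
pages, A = session_pageview_matrix(sessions)
# pages == ["p1", "p2", "p3"]
# A == [[2, 1, 0], [0, 1, 1], [1, 0, 0]]
```

Biclustering or clustering methods can then operate on the rows (sessions) and columns (pageviews) of `A`.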
... The CTI dataset is taken from a university web site log file, which was made available by the authors of Mobasher (2004) and Zhang et al. (2005). The data is based on a random collection of users visiting the university site for a 2-week period during the month of April 2002. ...
Article
Full-text available
In this paper, a Simulated Annealing (SA) based biclustering approach is proposed in which SA is used as an optimization tool for biclustering web usage data to identify the optimal user profile from the given data. The extracted biclusters consist of correlated users whose usage behaviors are similar across a subset of the pages of a web site, whereas these users are uncorrelated for the remaining pages. These results are very useful in web personalization, allowing a site to communicate better with its users, to make customized predictions, and to provide customized web services. The experiment was conducted on a real web usage dataset, the CTI dataset. Results show that the proposed SA based biclustering approach can extract highly correlated user groups from the preprocessed web usage data.
... You might be surprised to discover that it won't require most of next year's budget to achieve worthwhile results. Web personalization can be seen as an inter-disciplinary field that includes several research domains, from user modeling [14], social networks [19], web data mining [8,13,19], and human-machine interactions to Web usage mining [13]; Web usage mining is an example of an approach that mines log files containing information on user navigation in order to classify users. Other information retrieval techniques are based on the selection of document categories [13]. Contextual information extraction on the user and/or materials (for adaptation systems) is a fairly common technique; it can also include, in addition to user contextual information, contextual information on real-time interactions with the Web. ...
Article
Web mining is an important application of data mining techniques to extract knowledge from the Web. Web mining has been explored to a vast degree, and different techniques have been proposed for a variety of applications, including Web search, classification, and personalization. Most research on Web mining has been from a 'data' point of view. Web mining research is a converging research area drawing on several research communities, such as Databases, Information Retrieval and Artificial Intelligence. In this paper, we concentrate on the significance of studying the evolving nature of Web personalization. Web usage mining is used to discover interesting user navigation patterns and can be applied to many real-world problems, such as improving Web sites/pages, making additional topic or product recommendations, user/customer behavior studies, etc. A Web usage mining system performs five major tasks: i) data gathering, ii) data preparation, iii) navigation pattern discovery, iv) pattern analysis and visualization, and v) pattern applications. Each task is explained in detail and its related technologies are introduced. In this paper we show how Web mining techniques can be applied for customization, i.e., Web personalization.
... You might be surprised to discover that it won't require most of next year's budget to achieve worthwhile results. Web personalization can be seen as an inter-disciplinary field that includes several research domains, from user modeling [14], social networks [19], web data mining [8,13,19], and human-machine interactions to Web usage mining [13]; Web usage mining is an example of an approach that mines log files containing information on user navigation in order to classify users. Other information retrieval techniques are based on the selection of document categories [13]. Contextual information extraction on the user and/or materials (for adaptation systems) is a fairly common technique; it can also include, in addition to user contextual information, contextual information on real-time interactions with the Web. ...
Article
Full-text available
Web mining is the application of data mining techniques to extract knowledge from the Web. Web mining has been explored to a vast degree, and different techniques have been proposed for a variety of applications, including Web search, classification, and personalization. Most research on Web mining has been from a "data-centric" point of view. In this paper, we highlight the significance of studying the evolving nature of Web personalization. Web usage mining is used to discover interesting user navigation patterns and can be applied to many real-world problems, such as improving Web sites/pages, making additional topic or product recommendations, user/customer behavior studies, etc. A Web usage mining system performs five major tasks: i) data gathering, ii) data preparation, iii) navigation pattern discovery, iv) pattern analysis and visualization, and v) pattern applications. Each task is explained in detail and its related technologies are introduced. Web mining research is a converging research area drawing on several research communities, such as Databases, Information Retrieval and Artificial Intelligence. In this paper we show how Web mining techniques can be applied for customization, i.e., Web personalization.
... Recently, there has been increasing interest in developing web recommender systems using web usage mining techniques [4][5][6]. Traditional techniques mainly depend on users' ratings of different items or on other explicit feedback provided by the users [7,8]. Generally, the Web server access log records the user's browsing history, which contains an abundance of hidden information about users and their navigation behavior. ...
... (6) where s represents the page support for candidate page P_i, and w(vp) is the number of occurrences of the visited pattern VP_i that includes both the input pattern page ip and the candidate page CP_i. ...
Conference Paper
Full-text available
Nowadays, users rely on the web for information gathering. Accordingly, web usage mining has become an important subject of research. This research area covers prediction of users' near-future intentions, web-based personalized services, customer profiling, and adaptive web sites. Web page prediction is strongly limited by the nature of web logs, the intrinsic complexity of the problem and the tight efficiency requirements. This paper proposes a hybrid page ranking model based on web usage mining that exploits users' session data to enhance the recommendation of the next candidate web page to be accessed. The proposed approach combines two page ranking approaches. The first computes the frequency ratio, which indicates the number of occurrences of each page in the search result. The second computes the coverage ratio from similar behavior patterns. As a result, a set of candidate pages is ranked and the page with the highest rate is recommended. The proposed approach has been tested on real data collected and extracted from the web server log file of the CTI main web server.
... - The type of data sources used to build the profiles: different types of data can be exploited to build user profiles. Examples are usage data [Mob05], content data [MC07], data describing the structure and organization of documents [Lie95], and data provided by the user [SHY04]. ...
... Mobasher has explained this type of data source at length in his work [Mob05], and many other works in the literature have used this type of information in their mechanisms of user profile construction, such as [Cha00,MWT04]. For example, in [MWT04] the authors present an algorithm to build profiles, denoted Non Obvious Profiles, inferred from the user's behaviour during his visits. ...
Article
This thesis contains two parts. The first one is a study of the state of the art on data personalization and a proposition of a user profile model. The second one is a focus on a specific problem which is the query reformulation using profile knowledge. The goal of personalization is to facilitate the expression of the need for a particular user and to enable him to obtain relevant information when he accesses an information system. The relevance of the information is defined by a set of criteria and preferences specific to each user or community of users. These criteria describe the user's domain of interest, the quality level of the data he is looking for or the modalities of the presentation of this data. The data describing the users is often gathered in the form of profiles. In this thesis we propose a generic and extensible model of profile, which enables the classification of the profile's contents. Personalization may occur in each step of the query life cycle. The second contribution of this thesis is the study of two query reformulation approaches based on algorithms for query enrichment and query rewriting and the proposition of an advanced query reformulation approach. The three reformulation approaches are evaluated on a benchmark described in the thesis.
... Recently, there has been increasing interest in developing web recommender systems using web usage mining techniques [4][5][6]. Traditional techniques mainly depend on users' ratings of different items or on other explicit feedback provided by the users [7,8]. Generally, the Web server access log records the user's browsing history, which contains an abundance of hidden information about users and their navigation behavior. ...
... (6) where s represents the page support for candidate page P_i, and w(vp) is the number of occurrences of the visited pattern VP_i that includes both the input pattern page ip and the candidate page CP_i. ...
Conference Paper
Full-text available
Nowadays, users rely on the web for information. Accordingly, web usage mining has become an important subject of research. This research area covers Web-based personalized services, prediction of users' near-future intentions, adaptive Web sites, and customer profiling. Web page prediction is strongly limited by the nature of web logs, the intrinsic complexity of the problem and the tight efficiency requirements. This paper proposes a web usage mining approach that predicts the next candidate page to be accessed based on a hybrid page ranking model. The proposed approach combines two page ranking approaches. The first computes the frequency ratio, which indicates the number of occurrences of each page in the search result. The second computes the coverage ratio from similar behavior patterns. As a result, a set of candidate pages is ranked and the page with the highest rate is recommended. The proposed approach has been tested on real data extracted from the web server log file of a university website.
... Logs are thus a valuable source of information for understanding what users are doing and how a site is being used. Here we focus on web usage mining, which concentrates on developing techniques that model and study users' web navigation data obtained from server log files, with the aim to discover and evaluate "interesting" patterns of behaviour [17]. It is important to note that apart from studying log data recording a user's clickstream, other information, for example related to the content being browsed, or to the context in which the user is browsing, will be very useful. ...
... Applications of web usage mining include: creating adaptive web sites that automatically improve their organisation and presentation [18], prefetching web pages to improve latency [9], personalisation of users' web experience [17], and web access prediction [11,10,6,7]. ...
Article
Full-text available
Web server logs can be used to build a variable length Markov model representing user's navigation through a web site. With the aid of such a Markov model we can attempt to predict the user's next navigation step on a trail that is being followed, using a maximum likelihood method that predicts that the highest probability link will be chosen. We investigate three different scoring metrics for evaluating this prediction method: the hit and miss score, the mean absolute error and the ignorance score. We present an extensive experimental evaluation on three data sets that are split into training and test sets. The results confirm that the quality of prediction increases with the order of the Markov model, and further increases after removing unexpected, i.e. low probability, clicks from the test set.
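The maximum likelihood prediction described in the abstract above can be illustrated with a short sketch. Note the simplification: the paper uses a variable length Markov model, whereas this sketch assumes a fixed first-order model (only the current page is used as context). The trails and the names `train` and `predict_next` are hypothetical.

```python
from collections import Counter, defaultdict

def train(trails):
    """Count page-to-page transitions observed in the navigation trails."""
    counts = defaultdict(Counter)
    for trail in trails:
        for current, following in zip(trail, trail[1:]):
            counts[current][following] += 1
    return counts

def predict_next(counts, page):
    """Return (most likely next page, estimated probability), or None if unseen."""
    followers = counts.get(page)
    if not followers:
        return None
    total = sum(followers.values())
    # Maximum likelihood choice; equal counts fall back to the larger page name.
    best, n = max(followers.items(), key=lambda kv: (kv[1], kv[0]))
    return best, n / total

# Hypothetical trails reconstructed from server logs.
trails = [["A", "B", "C"], ["A", "B", "D"], ["A", "C"], ["B", "C"]]
model = train(trails)
prediction = predict_next(model, "A")  # "B" follows "A" in 2 of 3 transitions
```

A hit and miss score of the kind mentioned in the abstract could then be computed by comparing `predict_next` against the actual next click on held-out trails.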
... Before explaining the data preparation component, we use an example to explain the phase and our hypothesis. Web mining: 0, 18; c2 Web content mining: 3h, 3h; c3 Web structure mining: 1h, 10m; c4 Web usage mining: 0, 15h; c5 Data preparation: 3h30m, 3h30m; c6 Pattern analysis: 6h30m, 6h30m ...
... Step 1: assign the types. Step 2: delete the non-meaningful concepts. Web content mining: 3h, 3h; c3 Web structure mining: 1h, 10m; c5 Data preparation: 3h30m, 3h30m; c6 Pattern analysis: 6h30m, 6h30m ...
... Today, Web log mining [32,38] is being performed at its peak over the World Wide Web. Web mining is categorized into three basic classes: web content mining, web structure mining and web usage mining (WUM) [29,36]. Here, web usage mining focuses on determining the user's behavior [48] from web log data. ...
Article
Full-text available
In the era of the World Wide Web, more than one billion websites are active on the internet. Numerous full-featured tools exist for analysing the logs of this huge number of websites; however, selecting a suitable tool is difficult. This work investigates the open source and commercial toolsets available for such analysis; the study provides many choices to pick from when deciding on a toolset to manage and analyze log data. The paper will help readers review the set of tools currently available and pick the right tool to get started on analyzing logs in their organization.
... Once the gathered data are merged correctly, identifying unique users on the web becomes a complicated task. Understanding the user's life as a whole within a session on a website, by applying advanced data modeling and a set of assumptions that change every day, offers new ways to interact with the created online content [1]. ...
... According to one definition [13], personalization is any action that tailors the user's experience of a service to his or her individual preferences. This tailoring may include adaptation (selection) of content, as well as adaptation of presentation and navigation. ...
Article
This is a review article that discusses the concept of customer relationship management in e-commerce services, in particular B2C online commerce. The structure of CRM-class systems is discussed, indicating the modules responsible for handling relationships in e-commerce. The key elements of the e-CRM concept are personalization solutions and recommender systems, which allow effective use of the accumulated knowledge about customers.
... Today, Web log mining [1] is being performed at its peak over the World Wide Web. Web mining is categorized into three basic classes: Web content mining, Web structure mining, and Web usage mining (WUM) [2]. Here, Web usage mining [3] focuses on determining the user's behavior [4,5] from Web log data. ...
Conference Paper
Cyberspace is now flooded with a huge number of Web sites, and analysis of these sites is needed to extract useful information. Various log analysis tools exist for analysing Web sites' log files. However, selecting a suitable tool is difficult. This work examines the open source and commercial toolsets available for such analysis, together with their basic internal analysis process; the study provides many choices to pick from when deciding on a toolset to manage and analyze log data. The paper will help readers review the set of tools currently available and pick the right tool to get started on analyzing log files.
... The method of discovering association rules has been applied successfully in analysing the behavior of users of on-line portals, including commercial ones [Mobasher 2005; Fu, Budzik, Hammord 2000; Vijaylakshmi, Mohan, Suresh Raja 2010]. ...
... This process is valuable mainly in web services using static content, since dynamic content decreases the usability of web caching at both user and server level. Using a proxy to pre-fetch pages is another method for improving performance [21,22]. ...
... There are essentially two kinds of clustering methods used on web usage data: Web transaction clustering and Web page clustering [12]. One use of Web page clustering is the adaptive Web site. ...
... This customization is accomplished either by user selection or by tracking his or her behaviour on the same site, such as which pages are frequently accessed. Mobasher [4] presents a personalization method based on web usage mining. In this method, the log data is collected automatically, and application servers present the navigational behaviour of the customer. ...
Conference Paper
Full-text available
The World Wide Web is an enormous store of links and web pages. It provides a huge amount of information for Internet clients. The web is growing rapidly, with about one million pages added every day. Users' accesses are traced in web logs. Because of the heavy usage of the web, web log files are growing ever more rapidly and becoming enormous. Web Usage Mining applies mining techniques to log data to extract the behaviour of users, which is used in different applications such as design support, e-commerce, personalized services, pre-fetching, etc. Web usage mining has three phases: preprocessing, pattern detection and pattern learning. Web log data is generally noisy and ambiguous, so preprocessing is an essential step before mining. This paper presents the work done in web usage mining. Finally, a glance at various applications of web usage mining is presented. Web Usage Mining has developed into an active area of study in the field of data mining because of its crucial value. This paper provides a broad discussion of all the stages of Web Usage Mining and of its problems, together with related work in this research area.
... Consequently, it is necessary to apply a technique that allows us to identify collaborative behavior patterns, based on an analysis of the information stored in log files. We chose the approach based on WUM [20] and, in particular, we apply association rules. It is important to highlight that the WUM approach is not novel, but until now it had not been applied to discover contexts in collaborative learning environments. ...
Article
An effective collaboration in learning environments involves a set of skills that students must learn and cultivate. Detecting the contexts in which students apply these skills facilitates personalized assistance in learning environments during the learning process. This work introduces a method to detect collaborative behavior patterns automatically. It is based on Web Usage Mining techniques and allows us to identify contexts in which collaborative skills are applied. The patterns are discovered using association rules and then are used to update a Collaborative Profile in a Collaborative and Dynamic Student Model. The method was validated with simulation techniques and the results obtained suggest that Web Usage Mining is an effective method for detecting collaborative profiles in distance learning environments.
... CLF, ECLF), recording the parameterized URLs requested from a Web server to compose a pageview. The way data is enriched to associate domain semantics with URLs has a great impact on the subsequent phases of the process [9,21]. In the pattern discovery phase, statistical and data mining methods are applied to detect patterns or rules. ...
... The goal of the adaptive information retrieval is to adapt the process of information retrieval in order to return relevant results to the user. Techniques developed in adaptive information retrieval [18] focus mainly on the assistance in the reformulation of the request. ...
... WUM can also bring benefits to other areas, for example the dynamic addition of links to Web pages [99], product recommendation [81,75], the characterization of user groups, the improvement of policies such as anticipatory caching and prefetching [88], etc. ...
Article
Full-text available
Nowadays, more and more organizations are becoming reliant on the Internet. The Web has become one of the most widespread platforms for information change and retrieval. The growing number of traces left behind user transactions (e.g. : customer purchases, user sessions, etc.) automatically increases the importance of usage data analysis. Indeed, the way in which a web site is visited can change over time. These changes can be related to some temporal factors (day of the week, seasonality, periods of special offer, etc.). By consequence, the usage models must be continuously updated in order to reflect the current behaviour of the visitors. Such a task remains difficult when the temporal dimension is ignored or simply introduced into the data description as a numeric attribute. It is precisely on this challenge that the present thesis is focused. In order to deal with the problem of acquisition of real usage data, we propose a methodology for the automatic generation of artificial usage data over which one can control the occurrence of changes and thus, analyse the efficiency of a change detection system. Guided by tracks born of some exploratory analyzes, we propose a tilted window approach for detecting and following-up changes on evolving usage data. In order measure the level of changes, this approach applies two external evaluation indices based on the clustering extension. The proposed approach also characterizes the changes undergone by the usage groups (e.g. appearance, disappearance, fusion and split) at each timestamp. Moreover, the refereed approach is totally independent of the clustering method used and is able to manage different kinds of data other than usage data. The effectiveness of this approach is evaluated on artificial data sets of different degrees of complexity and also on real data sets from different domains (academic, tourism and marketing).
... So, a Markov model of order k predicts the probability of the next page based on the last k pages visited. Given a set of all trails R, the probability of reaching a state s_j from a state s_i via a trail r ∈ R is given by $\Pr(r) = \prod_{k=i}^{j-1} \Pr_{k,k+1}$, where k ranges between i and j-1; in other words, the probability is the product of the transition probabilities between all intermediate states (Mobasher 2005). As an example of how a Markov chain can model a set of web transactions, consider the set of transactions presented in Table 1. ...
Article
Full-text available
Nowadays, Web-based platforms are quite common in any university, supporting a very diversified set of applications and services. Ranging from personal management to student evaluation processes, Web-based platforms are doing a great job of providing a very flexible way of working, promoting student enrolment, and making access to academic information simple and universal. Students can do their regular tasks anywhere, anytime. Sooner or later, it was to be expected that organizations, and universities in particular, would begin to think and act towards better educational platforms, more user-friendly and effective, where students easily find what they are searching for on a specific topic or subject. Profiling is one of several techniques that we can use to discover what students usually do, by establishing their navigation patterns on Web-based platforms and learning how they explore and search the pages of the sites that they visit. With these profiles, administrators of Web-based platforms can personalize sites according to the preferences and behaviour of the students, promoting easy navigation and a better ability to respond to their needs. In this article we present the application of Markov chains to the establishment of such profiles for a target eLearning-oriented Web site, presenting the system we implemented and its functionalities, as well as describing the entire process of discovering student profiles on an eLearning Web-based platform.
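The first-order case of the Markov model quoted in the excerpt for this entry can be illustrated with a minimal sketch. It is not the authors' system: the page names, the sample sessions, and the function names `transition_probabilities` and `trail_probability` are assumptions made purely for illustration. Transition probabilities are estimated from observed sessions, and the probability of a trail is the product of its transition probabilities.

```python
from collections import defaultdict


def transition_probabilities(sessions):
    """Estimate first-order transition probabilities from page sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        # Count every observed step current_page -> next_page.
        for current_page, next_page in zip(session, session[1:]):
            counts[current_page][next_page] += 1
    probs = {}
    for page, successors in counts.items():
        total = sum(successors.values())
        probs[page] = {nxt: n / total for nxt, n in successors.items()}
    return probs


def trail_probability(probs, trail):
    """Multiply the transition probabilities along a trail of pages."""
    p = 1.0
    for current_page, next_page in zip(trail, trail[1:]):
        # Transitions never observed in the data get probability 0.
        p *= probs.get(current_page, {}).get(next_page, 0.0)
    return p


# Hypothetical eLearning sessions (illustrative data only).
sessions = [
    ["home", "courses", "grades"],
    ["home", "courses", "forum"],
    ["home", "forum"],
    ["home", "courses", "grades"],
]
probs = transition_probabilities(sessions)
p_trail = trail_probability(probs, ["home", "courses", "grades"])
```

A higher-order model would condition each step on the last k pages instead of only the current one; the sketch keeps k = 1 for brevity.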
... This report may include limited low-level error analysis such as detecting unauthorized entry points or finding the most common invalid URI. Despite lacking in the depth of its analysis, this type of knowledge can be potentially useful for improving the system performance, enhancing the security of the system, facilitating the site modification task, and providing support for marketing decisions [14]. ...
Article
Web usage mining is the application of data mining techniques to discover usage patterns from Web data, in order to understand and better serve the needs of Web-based applications. At the same time, the rapid growth of e-commerce has caused product overload, where customers on the Web are no longer able to choose effectively among the products they are exposed to. Applying Web usage mining techniques to e-commerce is one way to overcome these problems. This paper provides a survey of the rapidly growing area of Web usage mining. We propose a framework of Web usage mining for e-commerce, discuss and analyze each module of the framework in detail, and give some of its applications to e-commerce. Finally, we conclude the paper and propose future research directions.
... In rule-based personalization systems, site administrators define rules; users are classified according to these rules and their Web pages are personalized accordingly [20]. The main drawback of such a system is that the information is static. ...
Article
Full-text available
While users visit a Web server, access data is stored. This information can be used broadly. Using it, we can obtain users' preferences and use them to personalize Web pages. Web mining is a new topic that has been proposed for managing Web pages. In fact, Web mining is the application of data mining techniques to discover patterns of user interest in Web data. In this article we provide a structure for Web mining.
... The second data set is from a university website log files and was made available by the author of [17]. The data is based on a random collection of users visiting this site for a 2-week period during April of 2002. ...
Conference Paper
Due to the inherent correlation among Web objects and the lack of a uniform schema of web documents, Web community mining and analysis has become an important area for Web data management and analysis. The research of Web communities spans a number of research domains such as Web mining, Web search, clustering and text retrieval. In this talk we will present some recent studies on this topic, which cover finding relevant Web pages based on linkage information, discovering user access patterns through analyzing Web log files, co-clustering Web objects and investigating social networks from Web data. The algorithmic issues and related experimental studies will be addressed. Some research directions are also to be discussed.
... The Web usage mining technique is based on the extraction of user navigation information from the system log files. Its goal is to capture and model the behavioural patterns and profiles of users interacting with a Web site (Mobasher, 2005). Other information retrieval techniques are based on the extraction of contextual user and/or material information (Jones et al., 2004), or on real-time interactions with the Web context (Vallet et al., 2007). ...
Chapter
Nowadays Web users face the problems of information overload and drowning, due to the significant and rapid growth in the amount of information and the large number of users. As a result, how to provide Web users with exactly the information they need is becoming a critical issue in Web-based information retrieval and data management. To address these difficulties, Web mining was proposed as an efficient means of discovering the intrinsic relationships among Web data. In particular, Web usage mining aims to discover Web usage patterns and to use the discovered usage knowledge to construct interest-oriented user communities, which can in turn be used to present Web users with more personalized Web content, i.e. Web recommendation. Latent Semantic Analysis (LSA), for its part, is an approach used to reveal the inherent correlation residing in co-occurrence activities, such as Web usage data. Moreover, LSA is capable of capturing hidden knowledge at the semantic level that cannot be obtained by traditional methods. In this chapter, we address building communities of user interest by combining Web usage mining and latent semantic analysis. We also present the application of user communities to Web recommendation.
Chapter
The popularisation of mass customization and the need to integrate user needs into the design, production and marketing phases have called for more innovative methods in this area. At present, the continuous growth of the World Wide Web, its rapid integration into people's everyday lives, and the popularisation of new technologies such as ubiquitous computing, which makes the computing-everywhere paradigm possible, offer vendors a more desirable alternative for reaching their customers, using more innovative techniques in an attempt to provide each customer with a one-to-one design, manufacturing and marketing service. The integration of ubiquitous computing technologies with machine learning and data mining techniques, which have been popular in personalization, will serve to bring about innovative changes in this area. In future years this may revolutionise the way in which vendors reach their customers, offering every customer a tailored one-to-one service from design, to manufacturing, to delivery. This chapter presents a state-of-the-art survey of the techniques that enable the combination of machine learning, data mining and ubiquitous computing technologies, which will serve to provide innovative techniques, applications and user interfaces for mass customization systems. This is currently a field of intense research and development activity, and some technologies are already on the path to practical application.
Article
It is quite common these days for experts, casual analysts, executives and data enthusiasts, to analyze large datasets through user-friendly interfaces on top of Business Intelligence (BI) systems. However, current BI systems do not adequately detect and characterize user interests, which may lead to tedious and unproductive interactions. In this paper, we propose a collaborative recommender system for BI interactions, specifically designed to take advantage of identified user interests. Such user interests are discovered by characterizing the intent of the interaction with the BI system. Building on user modeling for proactive search systems, we identify a set of features for an adequate description of intents, and a similarity measure for grouping intents into coherent clusters. On top of these automatically identified interests, we build a collaborative recommender system based on a Markov model that represents the probability for a user to switch from one interest to another. We validate our approach experimentally with an in-depth user study, where we analyze traces of BI navigation. Our results are two-fold. First, we show that our similarity measure outperforms a state-of-the-art query similarity measure and yields a very good precision with respect to expressed user interests. Second, we compare our recommender system to two state-of-the-art systems to demonstrate the benefit of relying on user interests.
Conference Paper
This paper addresses the problem of role classification, that is, classifying and grouping email users into a collection of organizational roles. This classification can be used in designing modern email clients by adding an Inbox prioritizing feature that predicts the role of a sender relative to the recipient of an email. A comprehensive study has been carried out on the social network of the Enron dataset. For classifying organizational roles, a feature vector is created containing a set of social network metrics and interaction-based features that reflect users' engagingness and responsiveness in their community. After representing each role in this feature space, the Expectation Maximization (EM) algorithm is applied to evaluate the extracted feature set. In turn, a Neural Network classifier is built on the extracted features for classifying organizational roles, achieving an accuracy of 63.57%.
Article
Full-text available
The mission of the Instituto de Cibernética, Matemática y Física (ICIMAF) is to manage and carry out research, development and innovation projects with motivated staff of recognized competence, and to provide postgraduate training, consulting, and scientific and technological services that deliver high value-added solutions; as a result, it generates a large scientific output. To achieve these objectives, researchers must be able to map out and establish where they stand with respect to a technology, so that they know which other technologies exist in the same field, which gives the scientist an idea of how innovative their own work is. The work presented here is therefore a study of the main concepts and characteristics of Technology Watch (Vigilancia Tecnológica) and its relationship with data mining techniques, specifically text and Web mining. It will enable the subsequent development of a Technology Watch system that will facilitate the research work of academics, so that they can find the most fertile fields for discovering basic and applied scientific material.
Book
Full-text available
http://adrastea.ugr.es/search*spi/a?itmazi&extended=0&searchscope=1
Conference Paper
The Web is a popular and interactive medium for exchanging information. Web mining will gain in significance as the number of applications on the Internet increases. It uses various data mining techniques, but it is not simply an application of traditional data mining, owing to the heterogeneity and unstructured nature of the data available on the World Wide Web. In this regard, the workings of the World Wide Web and of Web applications are described in this paper.
Article
This paper analyzes the structure of the whole system, namely how it proactively provides different users with relevant information and content according to their own characteristics, and how it establishes an individual, personalized user model based on user behavior.
Article
Full-text available
Over the last decade, there has been a paradigm shift in business computing, with the emphasis moving from data collection to knowledge extraction. Central to this shift has been the explosive growth of the World Wide Web, which has enabled myriad technologies, such as Web services and enterprise server applications. These advances have improved data collection frameworks and resulted in new techniques for knowledge extraction from large databases. A popular and successful technique that has shown much promise is Web mining. Web mining is essentially data mining for Web data, enabling businesses to turn their vast repositories of transactional and Website usage data into actionable knowledge that is useful at every level of the enterprise, not just the front end of an online store. To this end, the chapter provides an introduction to the field of Web mining and examines existing as well as potential Web mining applications for different business functions, such as marketing, human resources, and fiscal administration. Suggestions for improving information technology infrastructure are made, which can help businesses interested in Web mining hit the ground running.
Article
Data mining has matured as a field of basic and applied research in computer science. The objective of this dissertation is to evaluate, propose and improve the use of some recent approaches, architectures and Web mining techniques. Web mining techniques (which collect personal information from customers) are the means of using data mining methods to induce and extract useful information from Web information and services; data mining has been applied in this way in the fields of e-commerce and e-business (that is, to users' behavior). In the context of Web mining, clustering can be used to group similar click-streams in order to determine learning behaviors in e-learning or general site access behaviors in e-commerce. Most of the algorithms presented in the literature for clustering Web sessions treat sessions as sets of pages visited within a time period and do not consider the order in which the click-stream visits them. This has a significant consequence when comparing similarities between Web sessions. Wang and Zaiane propose an algorithm based on sequence alignment to measure similarities between Web sessions, where sessions are chronologically ordered sequences of page accesses.
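The difference between treating sessions as sets and treating them as ordered sequences can be seen in a small sketch. It uses a generic global alignment score (Needleman-Wunsch dynamic programming) with arbitrary match, mismatch and gap values; these values and the function name `alignment_score` are assumptions for illustration and do not reproduce the scoring scheme of Wang and Zaiane.

```python
def alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Global alignment score between two ordered page sequences."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty sequence costs one gap per page.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + step,  # align the two pages
                score[i - 1][j] + gap,       # skip a page of a
                score[i][j - 1] + gap,       # skip a page of b
            )
    return score[-1][-1]


# Hypothetical sessions: same set of pages, different visiting order.
s1 = ["home", "catalog", "checkout"]
s2 = ["checkout", "catalog", "home"]
same_order = alignment_score(s1, s1)
reversed_order = alignment_score(s1, s2)
```

A set-based measure would regard `s1` and `s2` as identical, whereas the alignment score is lower for the reversed session because the order of page accesses is taken into account.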
Article
A material recommender system is a significant part of an e-learning system, used for personalization and for recommending appropriate materials to learners. However, existing recommendation algorithms do not fully and simultaneously consider the dynamic interests and multiple preferences of learners and the multidimensional attributes of materials. Moreover, these algorithms cannot make effective use of the learner's historical sequential patterns of material access. To address these problems and improve the accuracy and quality of recommendation, a new material recommender system framework based on sequential pattern mining and multidimensional attribute-based collaborative filtering (CF) is proposed. In the sequential pattern-based approach, modified Apriori and PrefixSpan algorithms are implemented to discover latent patterns in material access and use them for recommendation. In the multidimensional attribute-based CF approach, a Learner Preference Tree (LPT) is introduced to take into account the multidimensional attributes of materials and learners' ratings, and to model the dynamic and multiple preferences of learners. Finally, the recommendation results of the two approaches are combined using cascade, weighted and mixed methods. The proposed method outperforms previous algorithms on classification accuracy measures, and the learner's real learning preferences can be satisfied accurately according to contextual information updated in real time.
Conference Paper
Subscriber satisfaction with a broadband service is determined by several factors, such as content delivery strategy, service infrastructure and the channel packaging delivered. Currently, the single-operator, single-infrastructure broadband model in Malaysia, delivered by service providers such as TM UniFi, cannot deliver relevant and personalized content that matches the needs and goals of diverse subscribers, which affects overall subscriber satisfaction. The objective of this research is to examine a content personalization strategy that can improve the quality of service delivered and increase the level of subscriber satisfaction. The content personalization strategy gives subscribers the freedom to dynamically select any service on the network. Deploying the content personalization model to individual homes enables subscribers to personalize and customize content based on personal preference in real time. This delivery strategy is expected to improve the customer experience significantly. In the future, this study will serve as input for high-speed broadband service providers analyzing methods to increase their service delivery revenue by increasing the level of subscriber satisfaction.
Article
Full-text available
Collaborative recommender systems allow personalization for e-commerce by exploiting similarities and dissimilarities among customers' preferences. We investigate the use of association rule mining as an underlying technology for collaborative recommender systems. Association rules have been used with success in other domains. However, most currently existing association rule mining algorithms were designed with market basket analysis in mind. Such algorithms are inefficient for collaborative recommendation because they mine many rules that are not relevant to a given user. Also, it is necessary to specify the minimum support of the mined rules in advance, often leading to either too many or too few rules; this negatively impacts the performance of the overall system. We describe a collaborative recommendation technique based on a new algorithm specifically designed to mine association rules for this purpose. Our algorithm does not require the minimum support to be specified in advance. Rather, a target range is given for the number of rules, and the algorithm adjusts the minimum support for each user in order to obtain a ruleset whose size is in the desired range. Rules are mined for a specific target user, reducing the time required for the mining process. We employ associations between users as well as associations between items in making recommendations. Experimental evaluation of a system based on our algorithm reveals performance that is significantly better than that of traditional correlation-based approaches.
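The idea of giving a target range for the number of rules, rather than a fixed minimum support, can be sketched as follows. This is not the authors' algorithm: it mines only single-antecedent rules between items, uses a plain bisection on the support threshold, and the names `mine_rules` and `adapt_min_support`, the confidence threshold, and the sample transactions are all assumptions for illustration.

```python
def mine_rules(transactions, target, min_support, min_confidence=0.5):
    """Mine rules {item} -> {target} that meet support and confidence."""
    n = len(transactions)
    items = set().union(*transactions) - {target}
    rules = []
    for item in sorted(items):
        with_item = sum(1 for t in transactions if item in t)
        with_both = sum(1 for t in transactions if item in t and target in t)
        support = with_both / n
        confidence = with_both / with_item if with_item else 0.0
        if support >= min_support and confidence >= min_confidence:
            rules.append((item, target, support, confidence))
    return rules


def adapt_min_support(transactions, target, min_rules, max_rules, max_steps=20):
    """Bisect the support threshold until the rule count is in range."""
    low, high = 0.0, 1.0
    support, rules = 0.5, []
    for _ in range(max_steps):
        support = (low + high) / 2
        rules = mine_rules(transactions, target, support)
        if len(rules) > max_rules:
            low = support    # too many rules: raise the threshold
        elif len(rules) < min_rules:
            high = support   # too few rules: lower the threshold
        else:
            break
    return support, rules


# Hypothetical baskets; "T" plays the role of the item to be recommended.
transactions = [
    {"A", "B", "T"}, {"A", "T"}, {"A", "C", "T"},
    {"B", "T"}, {"C"}, {"A", "B"},
]
support, rules = adapt_min_support(transactions, "T", min_rules=2, max_rules=2)
```

Bisection works here because the number of rules can only stay the same or shrink as the minimum support rises; the loop may still end without reaching the range, in which case the last attempt is returned.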
Article
Full-text available
In recent years, it is becoming increasingly difficult to ignore the impact Web robots have on both commercial and research institutional Web sites. In particular, e-commerce retailers are concerned about the unauthorized deployment of robots for gathering business intelligence at their Web sites. Web robots also tend to consume considerable network bandwidth at the expense of other users. Sessions due to Web robots are making it more difficult to perform clickstream analysis effectively on the ...
Article
Full-text available
We describe an efficient framework for Web personalization based on sequential and non-sequential pattern discovery from usage data. Our experiments on real usage data indicate that more restrictive patterns, such as contiguous sequential patterns (e.g., frequent navigational paths), are more suitable for predictive tasks such as Web prefetching (which involve predicting which item a user accesses next), while less constrained patterns, such as frequent itemsets or general sequential patterns, are more effective alternatives in the context of Web personalization and recommender systems.
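What makes a sequential pattern "contiguous" can be shown with a short sketch that counts fixed-length runs of consecutive pageviews. The function name `contiguous_patterns`, the sample sessions and the thresholds are assumptions for illustration, not the framework evaluated in the paper.

```python
from collections import Counter


def contiguous_patterns(sessions, length, min_support):
    """Return contiguous page runs of a given length found in enough sessions."""
    support = Counter()
    for session in sessions:
        # Use a set so each pattern counts at most once per session.
        seen = {
            tuple(session[i:i + length])
            for i in range(len(session) - length + 1)
        }
        support.update(seen)
    return {p: c for p, c in support.items() if c >= min_support}


# Hypothetical navigation sessions (illustrative data only).
sessions = [
    ["home", "catalog", "item", "cart"],
    ["home", "catalog", "cart"],
    ["search", "catalog", "item"],
]
frequent_paths = contiguous_patterns(sessions, length=2, min_support=2)
```

In this data, "catalog" followed later by "cart" occurs in two sessions as a general sequential pattern, but only once as a contiguous one, which is why it is absent from `frequent_paths`.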
Article
Full-text available
Web usage mining, possibly used in conjunction with standard approaches to personalization such as collaborative filtering, can help address some of the shortcomings of these techniques, including reliance on subjective user ratings, lack of scalability, and poor performance in the face of high-dimensional and sparse data. However, the discovery of patterns from usage data by itself is not sufficient for performing the personalization tasks. The critical step is the effective derivation of good quality and useful (i.e., actionable) "aggregate usage profiles" from these patterns. In this paper we present and experimentally evaluate two techniques, based on clustering of user transactions and clustering of pageviews, in order to discover overlapping aggregate profiles that can be effectively used by recommender systems for real-time Web personalization. We evaluate these techniques both in terms of the quality of the individual profiles generated, as well as in the context of providing recommendations as an integrated part of a personalization engine. In particular, our results indicate that using the generated aggregate profiles, we can achieve effective personalization at early stages of users' visits to a site, based only on anonymous clickstream data and without the benefit of explicit input by these users or deeper knowledge about them.
Conference Paper
We are given a large database of customer transactions, where each transaction consists of customer-id, transaction time, and the items bought in the transaction. We introduce the problem of mining sequential patterns over such databases. We present three algorithms to solve this problem, and empirically evaluate their performance using synthetic data. Two of the proposed algorithms, AprioriSome and AprioriAll, have comparable performance, albeit AprioriSome performs a little better when the minimum number of customers that must support a sequential pattern is low. Scale-up experiments show that both AprioriSome and AprioriAll scale linearly with the number of customer transactions. They also have excellent scale-up properties with respect to the number of transactions per customer and the number of items in a transaction.