Conference Paper

Data Mining Meets HCI: Data and Visual Analytics of Frequent Patterns

Abstract

As a popular data mining task, frequent pattern mining discovers implicit, previously unknown, and potentially useful knowledge in the form of sets of frequently co-occurring items or events. Many existing data mining algorithms return long textual lists of frequent patterns, which may not be easily comprehensible. As a picture is worth a thousand words, having a visual means for humans to interact with computers would be beneficial. This is where human-computer interaction (HCI) research meets data mining research. In particular, the popular HCI task of data and result visualization could help data miners to visualize the original data and to analyze the mined results (in the form of frequent patterns). In this paper, we present a few systems for data and visual analytics of frequent patterns, which integrate (i) data analytics and mining with (ii) data and result visualization.
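To make the notion of "sets of frequently co-occurring items" concrete, here is a minimal brute-force sketch in Python. It is not the method of any system described in the paper; the function name `frequent_itemsets`, the absolute `min_support` threshold, and the toy transactions are assumptions made for illustration only.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in at least min_support transactions."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Support = number of transactions that contain every item of the candidate.
            count = sum(1 for t in transactions if cset <= t)
            if count >= min_support:
                result[cset] = count
                found_any = True
        if not found_any:
            # No frequent itemset of this size, so no larger one can be frequent.
            break
    return result

# Hypothetical market-basket data.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"butter", "milk"}),
    frozenset({"bread", "butter"}),
]
patterns = frequent_itemsets(transactions, min_support=2)

# This is the kind of textual list that the abstract describes as hard to read at scale.
for itemset, count in sorted(patterns.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Even on four transactions the output is a flat list of six patterns; on real data such lists grow quickly, which is the motivation for visual presentation.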

... Data mining algorithms return long textual lists of patterns that are hard to comprehend visually, rather than visually informative output that humans can interact with on a computer. In this context, HCI meets data mining: HCI tasks such as visualization help data miners to visualize data and analyze mined results (Leung et al., 2016). HCI is used extensively in these fields to turn data into meaningful information, including for general users. ...
Thesis
Full-text available
Human-Computer Interaction (HCI) concerns reliable interaction between a user and a system. HCI consists of three parts: the user (human), the system (computer), and the ways they interact, such as the interaction mediums (i.e., physical, graphical, and voice interaction). In HCI, numerous methods, models, and theories are used to develop the interaction mediums. However, existing methods and models behave inconsistently when applied in HCI design. Inconsistent methods lead to error-prone systems and affect time, cost, and the risk of system failure. This study considered three widely used HCI methods: Activity Theory (AT), User-Centered Design (UCD), and Value Sensitive Design (VSD). Although these methods are widely employed, limitations arise when they are applied. Ideal and novel methods are required to develop proper interaction mediums (e.g., physical or graphical interfaces) without producing any inconsistency. The inconsistencies of AT are that the catalyst that motivates objects is absent, the arrows in the model are unclear, and the many iterative loops cost time and effort. The inconsistencies of UCD are that the model is unstable and differs across applications, and that involving users directly in development costs time and money and adds complexity through multiple interpretations. The limitations of VSD are that the values are not determined, designers' and stakeholders' values conflict, and privacy issues are a factor. In this context, the study proposed the Theory of Usability Index (TOUI) to address the inconsistencies identified in existing HCI design methods. A usability law was obtained as TOUI's Law: U = log10(xy) - log10x/y. The logical process of TOUI is an algorithm that can be implemented in multidimensional areas. The existing HCI methods and TOUI were evaluated using an illustrative case study. The findings are arranged in a table and show a significant improvement of TOUI on usability issues in three respects (method structure, model consistency, and concept of values). The study follows a three-stage qualitative research design and finally sets out its limitations, contributions, and future work.
Conference Paper
Full-text available
Numerous algorithms have been proposed since the introduction of the research problem of frequent pattern mining. This research problem has played an essential role in many knowledge discovery and data mining (KDD) tasks. Most of the proposed frequent pattern mining algorithms return the mined results in the form of textual lists that contain frequent patterns showing those frequently occurring sets of items. As "a picture is worth a thousand words", the use of visual representation can enhance the user's understanding of the inherent relations in a collection of frequent patterns. Although a few visualizers have been developed to visualize the raw data or the results for some data mining tasks, most of these visualizers were not designed for visualizing frequent patterns. Those that were show all the frequent patterns that can be mined from datasets. It is not uncommon that, for many real-life applications, the user may end up being overwhelmed by such a huge number of patterns. In this paper, we propose a visualizer, called CloseViz, to show the user only the useful patterns. Specifically, CloseViz shows only closed frequent patterns. By doing so, CloseViz reduces the number of displayed patterns to a useful amount while retaining all the important frequency information. Moreover, CloseViz presents the closed frequent patterns to the user in a useful manner, which allows visual exploration of the patterns. Note that the closed patterns shown by CloseViz can be considered as surrogates for all the frequent patterns that can be mined from the datasets.
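The claim that closed patterns retain all frequency information can be illustrated with a small sketch. It uses the standard definition (a frequent pattern is closed if no proper superset has the same support) and is not CloseViz's implementation; the function names and the toy supports below are assumptions for illustration.

```python
def closed_patterns(patterns):
    """Keep only patterns that have no proper superset with the same support."""
    closed = {}
    for itemset, support in patterns.items():
        has_equal_superset = any(
            itemset < other and support == other_support
            for other, other_support in patterns.items()
        )
        if not has_equal_superset:
            closed[itemset] = support
    return closed

def support_from_closed(itemset, closed):
    """Recover a frequent pattern's support as the largest support among its closed supersets."""
    return max(s for c, s in closed.items() if itemset <= c)

# Hypothetical supports, consistent with the transactions abc, abc, ab, a, c.
patterns = {
    frozenset("a"): 4,
    frozenset("b"): 3,
    frozenset("c"): 3,
    frozenset("ab"): 3,
    frozenset("ac"): 2,
    frozenset("bc"): 2,
    frozenset("abc"): 2,
}
closed = closed_patterns(patterns)
print(len(patterns), "frequent patterns,", len(closed), "closed")
print(support_from_closed(frozenset("b"), closed))
```

In this toy example four closed patterns stand in for seven frequent ones, and the support of an omitted pattern such as {b} can still be read off its closed superset {a, b}.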
Conference Paper
Full-text available
Frequent itemset mining plays an essential role in the mining of many different patterns. Most existing frequent itemset mining algorithms return the mined results--namely, frequent itemsets--in the form of textual lists. However, the use of visual representation can enhance the user understanding of the inherent relations in a collection of frequent itemsets. In this paper, we propose an effective visualizer, called WiFIsViz, to display the mined frequent itemsets. WiFIsViz provides users with an overview and details about the itemsets. Moreover, this visualizer is also equipped with several interactive features for effective visualization of the frequent itemsets mined from various real-life applications.
Article
Full-text available
As frequent pattern mining plays an essential role in many knowledge discovery and data mining (KDD) tasks, numerous algorithms for finding frequent patterns have been proposed over the past 15 years. However, most of these algorithms return the mining results in the form of textual lists containing frequent patterns showing those frequently occurring sets of items. It is well known that "a picture is worth a thousand words". The use of visual representation can enhance the user's understanding of the inherent relations in a collection of frequent patterns. In this paper, we develop a simple yet useful visual analytic tool for supporting frequent pattern mining called FpVAT. Such a visual analytic tool consists of two modules: One module gives users an overview so that they can derive insight from a massive amount of raw data; another module enables users to perform analytical reasoning on the mining results via interactive visual interfaces so that users can detect the expected frequent patterns and discover the unexpected frequent patterns. As a visual analytic tool, our FpVAT is equipped with several interactive features for effective visual support in the data analysis and KDD process for various real-life applications.
Conference Paper
Astrophysical experiments produce Big Data which need efficient and effective data analytics. In this paper we present a general data analysis process which has been successfully applied to data from IceCube, a cubic kilometer neutrino detector located at the geographic South Pole. The goal of the analysis is to separate neutrinos from atmospheric muons within the data to determine the muon neutrino energy spectrum. The presented process covers straight cuts, variable selection, classification, and unfolding. A major challenge in the separation is the unbalanced dataset. The expected signal to background ratio in the initial data (trigger level) is roughly 1:10^6. The overall process was embedded in a multi-fold cross-validation to control its performance. A subsequent regularized unfolding yields the sought-after neutrino energy spectrum.
Conference Paper
People using mobile devices for making phone calls, accessing the internet, or posting georeferenced contents in social media create episodic digital traces of their presence in various places. Availability of personal traces over a long time period makes it possible to detect repeatedly visited places and identify them as home, work, place of social activities, etc. based on temporal patterns of the person’s presence. Such analysis, however, can compromise personal privacy. We propose a visual analytics approach to semantic analysis of mobility data in which traces of a large number of people are processed simultaneously without accessing individual-level data. After extracting personal places and identifying their meanings in this privacy-respectful manner, the original georeferenced data are transformed to trajectories in an abstract semantic space. The semantically abstracted data can be further analyzed without the risk of re-identifying people based on the specific places they attend.
Conference Paper
Frequent pattern mining algorithms aim to find sets of frequently co-occurring items. Visual representation of the mining results is more comprehensible to users than the traditional long textual list of frequent patterns. Existing visualizers mostly show frequent patterns as graphs in a two-dimensional space with (x,y)-coordinates. Nowadays, in a collaborative environment, it is not uncommon for users to have face-to-face meetings when they show the graphs visualizing frequent patterns. In these situations, the viewing orientation of the graphs plays an important role as different orientations positively or negatively impact the graph legibility. A legible right-side-up graph to one user may become an illegible upside-down graph towards another user. In this paper, we propose a visualizer that uses a radial layout—which is orientation free—to show frequent patterns. Having such a visualizer is beneficial in the collaborative environment.
Conference Paper
Since its introduction, frequent itemset mining has been the subject of numerous studies. However, most of them return frequent itemsets in the form of textual lists. The common cliché that “a picture is worth a thousand words” advocates that visual representation can enhance user understanding of the inherent relations in a collection of objects such as frequent itemsets. Many visualization systems have been developed to visualize raw data or mining results. However, most of these systems were not designed for visualizing frequent itemsets. In this paper, we propose a frequent itemset visualizer (FIsViz). FIsViz provides many useful features so that users can effectively see and obtain implicit, previously unknown, and potentially useful information that is embedded in data of various real-life applications.
Conference Paper
Frequent pattern mining searches for frequently occurring sets of items or events. While users are interested in finding these frequent patterns in most situations, they may want to compare and contrast the mined frequent patterns in some other situations. For example, store managers may want to find out how the collections of frequently purchased items changed from one season to another. Similarly, regional managers may want to compare the frequently purchased items between two different branches. These are some examples of looking for temporal and/or spatial changes between mined frequent patterns. A visual representation of these patterns would be more comprehensible to users than the long textual list returned by many existing frequent pattern mining algorithms. However, many existing visualizers were not designed to show frequent patterns, let alone show the differences between them. In this paper, we propose a visualization system called Contrast Viz that enables users to visualize the mined frequent patterns and their differences.
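The comparison described above can be sketched as a simple set difference over two mined result sets. This is an illustrative assumption about what "differences" might mean, not the system's actual method; `contrast_patterns`, the labels, and the seasonal data are hypothetical.

```python
def contrast_patterns(patterns_a, patterns_b):
    """Label each pattern as frequent in A only, in B only, or in both (with support change)."""
    report = {}
    for itemset in set(patterns_a) | set(patterns_b):
        in_a, in_b = itemset in patterns_a, itemset in patterns_b
        if in_a and in_b:
            # Positive change means the pattern became more frequent in B.
            report[itemset] = ("both", patterns_b[itemset] - patterns_a[itemset])
        elif in_a:
            # Absent from B means "below B's threshold", not necessarily support 0.
            report[itemset] = ("only_a", None)
        else:
            report[itemset] = ("only_b", None)
    return report

# Hypothetical frequent patterns (with support counts) mined from two seasons.
summer = {
    frozenset({"ice cream"}): 40,
    frozenset({"soda"}): 30,
    frozenset({"ice cream", "soda"}): 25,
}
winter = {
    frozenset({"soda"}): 20,
    frozenset({"soup"}): 35,
}
report = contrast_patterns(summer, winter)
for itemset, (status, change) in sorted(report.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), status, change)
```

A visualizer could then encode the three labels and the support change with color or position instead of listing them as text.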
Conference Paper
Since the introduction of frequent pattern mining, the majority of studies have focused on algorithmic efficiency. Many algorithms return the mining results in textual forms -- i.e., a long textual list of frequent patterns. However, it is well established that visual representation can be more compact and more comprehensible to users in visually revealing frequent patterns than textual representation. Existing visualizers mainly focus on showing frequency information of every frequent pattern, and thus ignore relationships (e.g., prefix/extension) among the mined patterns. Inspired by the tree map representation of hierarchical data, we propose in this paper a space-filling visualizer called FpMapViz for showing frequency information of every frequent pattern as well as prefix/extension relationships among the mined patterns.
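The prefix/extension relationship mentioned above forms a hierarchy, which is what a tree map needs as input. The sketch below builds that hierarchy only and draws nothing; it assumes patterns are stored as tuples in a fixed item order, and the names `prefix_children` and `print_tree` are hypothetical rather than taken from FpMapViz.

```python
from collections import defaultdict

def prefix_children(patterns):
    """Map each prefix to the patterns that extend it by exactly one item."""
    children = defaultdict(list)
    for pattern in sorted(patterns):
        # The parent of (a, b, c) is its prefix (a, b); single items hang off the root ().
        children[pattern[:-1]].append(pattern)
    return dict(children)

def print_tree(children, supports, node=(), depth=0):
    """Print the hierarchy with indentation, one extension per line."""
    for child in children.get(node, []):
        print("  " * depth + f"{child[-1]} (support {supports[child]})")
        print_tree(children, supports, child, depth + 1)

# Hypothetical frequent patterns with support counts, items kept in alphabetical order.
supports = {
    ("a",): 4,
    ("a", "b"): 3,
    ("a", "b", "c"): 2,
    ("a", "c"): 2,
    ("b",): 3,
    ("b", "c"): 2,
    ("c",): 3,
}
children = prefix_children(supports)
print_tree(children, supports)
```

In a space-filling layout, each extension would be drawn as a rectangle nested inside its prefix's rectangle, with area or color carrying the support value.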
Article
Visual data mining techniques have proven to be of high value in exploratory data analysis, and they also have a high potential for mining large databases. In this article, we describe and evaluate a new visualization-based approach to mining large databases. The basic idea of our visual data mining techniques is to represent as many data items as possible on the screen at the same time by mapping each data value to a pixel of the screen and arranging the pixels adequately. The major goal of this article is to evaluate our visual data mining techniques and to compare them to other well-known visualization techniques for multidimensional data: the parallel coordinate and stick-figure visualization techniques. For the evaluation of visual data mining techniques, the perception of data properties counts most, while the CPU time and the number of secondary storage accesses are only of secondary importance. In addition to testing the visualization techniques using real data, we developed a testing environment for database visualizations similar to the benchmark approach used for comparing the performance of database systems. The testing environment allows the generation of test data sets with predefined data characteristics which are important for comparing the perceptual abilities of visual data mining techniques.
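The basic idea of mapping each data value to one pixel can be sketched as follows. The abstract does not specify the color mapping or pixel arrangement, so the linear gray scale and the row-major layout here are assumptions, as is the function name `values_to_pixel_grid`.

```python
def values_to_pixel_grid(values, width):
    """Map each value to a gray level in 0..255 and lay the pixels out row by row."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero when all values are equal
    gray = [round(255 * (v - lo) / span) for v in values]
    # Row-major arrangement: the first `width` values form the first pixel row, and so on.
    return [gray[i:i + width] for i in range(0, len(gray), width)]

# Hypothetical data values for one attribute.
grid = values_to_pixel_grid([3, 7, 1, 9, 5, 5], width=3)
for row in grid:
    print(row)
```

The resulting grid could be handed to any image library for display; the arrangement step is where alternative layouts would be substituted.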