Byeong-Soo Jeong

Kyung Hee University · Biomedical Engineering


101 Publications
23,584 Reads
2,739 Citations


Publications (101)
Conference Paper
Full-text available
These days Frequent Induced Subgraph Mining (FISM) is an active research direction, in various application domains like biological networks, chemical, or social networks. A number of FISM approaches have been proposed over the years. However, existing methods take long execution time since they perform numerous subgraph isomorphism (SI) operations,...
Conference Paper
Full-text available
Opinion or sentiment analysis has emerged as a way to extract useful information from large amounts of unstructured text data, such as customer reviews of products and their features or online SNS data. Customer reviews are helpful not only for potential customers but also for the manufacturers of the products to raise the...
Article
The recent emergence of body sensor networks (BSNs) has made it easy to continuously collect and process various health-oriented data related to temporal, spatial and vital sign monitoring of a patient. As such, discovering or mining interesting knowledge from the BSN data stream is becoming an important issue to promote and assist important decisi...
Conference Paper
The recent emergence of body sensor networks (BSNs) has made it easy to continuously collect and process various health-oriented data related to temporal, spatial and vital sign monitoring of a patient. As such, discovering or mining interesting knowledge from the BSN data stream is becoming an important issue to promote and assist important decision...
Article
Due to their popularity and widespread use, blogs have become an important medium through which many people communicate and exchange information on the World Wide Web (WWW). The blogosphere has provided many opportunities for individuals and companies to establish new business models that investigate social relationships. In Korea, there are many b...
Conference Paper
Data clustering is considered one of the most important techniques for unsupervised learning in diverse applications. Gene clustering aims to find groups of genes that are similarly expressed in large microarray datasets. Meanwhile, recent developments in microarray technology generate a very large amount of microarray data at low cost and...
Article
Multilevel knowledge in transactional databases plays a significant role in our real-life market basket analysis. Many researchers have mined the hierarchical association rules and thus proposed various approaches. However, some of the existing approaches produce many multilevel and cross-level association rules that fail to convey quality informat...
Conference Paper
The emergence of large real life networks such as social networks, web page links, and traffic networks exhibits complex graph structures with millions of vertices and edges. Among many operations for exploiting these graphs, the shortest path discovery is a major and expensive one. Besides the in-memory approaches, many efficient shortest path com...
Article
Flash memory has its unique characteristics: The write operation is much more costly than the read operation, and in-place updating is not allowed. In flash memory environment, in order to reduce the cost of copying valid pages during an erase operation, hot data clustering methods have been proposed. They try to store data with high write frequenc...
Article
Full-text available
Microarray data analysis has been widely used for extracting relevant biological information from thousands of genes simultaneously expressed in a specific cell. Although many genes are expressed in a sample tissue, most of these are irrelevant or insignificant for clinical diagnosis or disease classification because of missing values and noises. T...
Conference Paper
An interesting function named Wake on WLAN (WOW) has recently captured researchers' attention as one of the remote computer administration functions that may turn on a remote computerized system through the network connection upon receiving a specially coded packet. The phenomenon comes from the physiognomies of the coded packet su...
Article
Full-text available
Splice site prediction in the pre-mRNA is a very important task for understanding gene structure and its function. To predict splice sites, SVM (support vector machine)-based classification technique is frequently used because of its classification accuracy. High performance of SVM largely depends on DNA encoding method. However, existing encodi...
Article
In this paper, we propose a parameter-insensitive data partitioning approach for Chameleon, a hierarchical clustering algorithm. We first show that the quality of clusters produced by Chameleon is significantly affected by the sizes of initial sub-clusters and also that it is mainly because Chameleon recursively splits a dataset into two equal-size...
Article
Full-text available
Effective representation of DNA sequences is one of the important tasks in the study of genome sequences. In this paper, we propose a graphical representation of DNA sequences based on nucleotide ring structure. In the proposed representation, we convert DNA sequences into 16 dinucleotides on the surface of the hexagon so that it can preserve nucle...
Article
The goal of analyzing a time series database is to find whether and how frequently a periodic pattern is repeated within the series. Periodic pattern mining is the problem that regards temporal regularity. However, most of the existing algorithms have a major limitation in mining interesting patterns of user interest, that is, they can mine patterns...
Conference Paper
Splice site prediction in the pre-mRNA is a very important task for understanding gene structure and its function. To predict splice sites, SVM (support vector machine) based classification technique is frequently used because of its classification accuracy. High classification accuracy of SVM largely depends on DNA encoding method for feature extr...
Article
Full-text available
Mining combined association rules with correlation and market basket analysis can discover customers' purchase rules along with frequently correlated, associated-correlated, and independent patterns synchronously, which are extraordinarily useful for making everyday business decisions. However, due to the main memory bottleneck in single comp...
Article
High utility pattern (HUP) mining over data streams has become a challenging research issue in data mining. When a data stream flows through, the old information may not be interesting in the current time period. Therefore, incremental HUP mining is necessary over data streams. Even though some methods have been proposed to discover recent HUPs by...
Article
Full-text available
Splice site prediction in DNA sequence is a basic search problem for finding exon/intron and intron/exon boundaries. Removing introns and then joining the exons together forms the mRNA sequence. These sequences are the input of the translation process. It is a necessary step in the central dogma of molecular biology. The main task of splice site pr...
Conference Paper
A blogosphere is a representative online social network established through blog users and their relationships. Understanding information diffusion is very important in developing successful business strategies for a blogosphere. In this paper, we discuss how to predict information diffusion in a blogosphere. Documents diffused over a blogosphere d...
Article
Weighted frequent pattern (WFP) mining is more practical than frequent pattern mining because it can consider different semantic significance (weight) of the items. For this reason, WFP mining becomes an important research issue in data mining and knowledge discovery. However, existing algorithms cannot be applied for incremental and interactive WF...
Article
Full-text available
Microarray gene expression techniques and tools have become substantially important and widely used in protein-protein interaction (PPI) and gene regulation network (GRN) research in recent years, since they can capture the expressions of thousands of genes in a single experiment. Such datasets pose a great challenge for finding associ...
Conference Paper
Full-text available
Market basket analysis techniques are substantially important to everyday business decisions because of their capability to extract customers' purchase rules by discovering which items they buy frequently and together. However, traditional single-processor, main-memory-based computing is not capable of handling ever increasing large tra...
Conference Paper
Full-text available
Contemporary web browsers do not provide customized recommendations for users; rather, they offer only some suggestions based on cookies or browsing history after content filtering. Usually, most users provide keywords to search the contents of their preferred websites, and web servers provide content based on these keywords. So, it w...
Article
Market basket analysis is very important to everyday business decisions because it seeks to find relationships between purchased items. These techniques can extract customers' purchase rules by discovering which items they buy frequently and together. Therefore, to raise the probability of purchasing, the corporate manager of a...
Article
Contemporary web browsers do not provide customized recommendations for users; rather, they offer only some suggestions based on cookies or browsing history after content filtering. Usually, most users provide keywords to search the contents of their preferred websites, and web servers provide content based on these keywords. So, it w...
Conference Paper
Full-text available
Finding interesting patterns plays an important role in several data mining applications, such as market basket analysis, medical data analysis, and others. The occurrence frequency of patterns has been regarded as an important criterion for measuring interestingness of a pattern in several applications. However, temporal regularity of patterns can...
Conference Paper
Full-text available
The problem of finding frequent patterns has long been studied because it is essential to data mining tasks such as association rule analysis, clustering, and classification analysis. Privacy-preserving data mining is another important issue in this domain, since most users do not want their private information to leak out. In this paper, we propo...
Article
Full-text available
Pattern discovery in biological sequences (e.g., DNA sequences) is one of the most challenging tasks in computational biology and bioinformatics. So far, in most approaches, the number of occurrences is a major measure of determining whether a pattern is interesting or not. In computational biology, however, a pattern that is not frequent may still...
Article
Full-text available
Mining interesting patterns from DNA sequences is one of the most challenging tasks in bioinformatics and computational biology. Maximal contiguous frequent patterns are preferable for expressing the function and structure of DNA sequences and hence can capture the common data characteristics among related sequences. Biologists are interested in fi...
Article
Full-text available
Current DNA sequence datasets have become extremely large, making it a great challenge for single-processor and main-memory-based computing systems to mine interesting patterns. Such limited hardware resources make the performance of most Apriori-like algorithms inefficient. However, recent implementation of a MapReduce framework has overcome these...
Article
Data mining is a relatively new and promising field of computer science. It is used for extracting valuable information or knowledge from large databases. Data mining requires searching for frequent patterns in large databases. Frequent substructure mining is also known as graph mining. Some of the graph mining algorithms were Apriori based and p...
Chapter
Market basket analysis techniques are useful for extracting customer’s purchase behaviors or rules by discovering what items they buy together using the association rules and correlation. Associated and correlated items are placed in the neighboring shelf to raise their purchasing probability in a super shop. Therefore, the mining combined associat...
Article
Market basket analysis techniques are useful for extracting customer's purchase behaviors or rules by discovering what items they buy together using the association rules and correlation. Associated and correlated items are placed in the neighboring shelf to raise their purchasing probability in a super shop. Therefore, the mining combined associat...
Article
In this paper, a position-based method is proposed for frequent contiguous DNA pattern mining by fast joining the same length patterns to generate next length ones as well as scanning the database only once. At first, we combine the position information of fixed-length patterns to generate a fixed-length spanning tree to mine frequent fixed-length...
Article
High utility pattern (HUP) mining is one of the most important research issues in data mining. Although HUP mining extracts important knowledge from databases, it requires long calculations and multiple database scans. Therefore, HUP mining is often unsuitable for real-time data processing schemes such as data streams. Furthermore, many HUPs may be...
Article
Traditional frequent pattern mining methods consider an equal profit/weight for all items and only binary occurrences (0/1) of the items in transactions. High utility pattern mining becomes a very important research issue in data mining by considering the non-binary frequency values of items in transactions and different profit values for each item...
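The contrast drawn in this abstract, binary occurrence versus per-transaction quantities and per-item profits, can be illustrated with a minimal sketch. It assumes the commonly used definition in which the utility of an itemset is the sum, over the transactions containing it, of quantity times unit profit for each of its items. The function name `itemset_utility` and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, List

# A transaction maps each purchased item to its quantity (non-binary occurrence).
Transaction = Dict[str, int]


def itemset_utility(
    itemset: Iterable[str],
    transactions: List[Transaction],
    profit: Dict[str, int],
) -> int:
    """Sum quantity * unit profit over every transaction containing the whole itemset.

    Assumed definition for illustration only; the paper may define utility differently.
    """
    items = list(itemset)
    total = 0
    for tx in transactions:
        # The itemset contributes only when all of its items occur in the transaction.
        if all(item in tx for item in items):
            total += sum(tx[item] * profit[item] for item in items)
    return total


# Hypothetical toy database and profit table.
transactions = [
    {"a": 2, "b": 1},
    {"a": 1, "c": 3},
    {"a": 4, "b": 2, "c": 1},
]
profit = {"a": 5, "b": 10, "c": 1}

utility_ab = itemset_utility({"a", "b"}, transactions, profit)
utility_c = itemset_utility({"c"}, transactions, profit)
```

In this toy data, {a, b} and {c} each occur in two transactions, so a purely frequency-based measure cannot tell them apart, while their utilities differ widely.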
Conference Paper
Due to their popularity and widespread use, blogs have become an important medium through which to communicate and exchange information on the World Wide Web. The advent of the blogosphere may provide opportunities for establishing a new business model that investigates social relationships. In Korea, there are many blogospheres that appear to main...
Article
Mining web access sequences (WASs) can discover very useful knowledge from web logs with broad applications. By considering non-binary occurrences of web pages as internal utilities in WASs, e.g., time spent by each user in a web page, more realistic information can be extracted. However, the existing utility-based approach has many limitations suc...
Article
Mining sequential patterns is an important research issue in data mining and knowledge discovery with broad applications. However, the existing sequential pattern mining approaches consider only binary frequency values of items in sequences and equal importance/significance values of distinct items. Therefore, they are not applicable to actually re...
Article
Traditional frequent pattern mining considers equal profit/weight value of every item. Weighted Frequent Pattern (WFP) mining becomes an important research issue in data mining and knowledge discovery by considering different weights for different items. Existing algorithms in this area are based on fixed weight. But in our real world scenarios the...
Conference Paper
Mining web access sequences can discover very useful knowledge from web logs with broad applications. By considering non-binary occurrences of web pages as internal utilities in web access sequences, e.g., time spent by each user in a web page, more realistic information can be extracted. However, the existing utility-based approach has many limita...
Chapter
Full-text available
High utility pattern (HUP) mining over data streams has become a challenging research issue in data mining. The existing sliding window-based HUP mining algorithms over stream data suffer from the level-wise candidate generation-and-test problem. Therefore, they need a large amount of execution time and memory. Moreover, their data structures are n...
Conference Paper
Discovering interesting patterns from high-speed data streams is a challenging problem in data mining. Recently, support metric-based frequent pattern mining from data streams has received great attention. However, the occurrence frequency of a pattern may not be an appropriate criterion for discovering meaningful patterns. Temporal regularity...
Conference Paper
Recently proposed regular pattern mining provides an effective technique to find patterns occurring at regular interval in a static database. However, the occurrence characteristic of patterns may change significantly with the update of database. Therefore, this paper proposes the Incremental Regular Pattern Tree (IncRT) and a pattern growth mining...
Conference Paper
The share measure of item sets has been proposed to discover useful knowledge about numerical values associated with items in a transaction database. Therefore, share-frequent pattern mining problem becomes a very important research issue in data mining. However, the existing algorithms of share-frequent pattern mining are based on static databases...
Article
Finding frequent patterns in a continuous stream of transactions is critical for many applications such as retail market data analysis, network monitoring, web usage mining, and stock market prediction. Even though numerous frequent pattern mining algorithms have been developed over the past decade, new solutions for handling stream data are still...
Article
Traditional frequent pattern mining algorithms do not consider different semantic significances (weights) of the items. By considering different weights of the items, weighted frequent pattern (WFP) mining becomes an important research issue in data mining and knowledge discovery area. However, the existing state-of-the-art WFP mining algorithms co...
Article
With advances in technology, the use of wireless sensor networks (WSNs) has increased widely in recent decades. In general, WSNs produce a large amount of data in the form of streams. Recently, data-mining techniques have received a great deal of attention for their utility in extracting knowledge from WSN data. Mining association rules on the sensor...
Conference Paper
Full-text available
Since mining frequent patterns from transactional databases involves an exponential mining space and generates a huge number of patterns, efficient discovery of user-interest-based frequent pattern set becomes the first priority for a mining algorithm. In many real-world scenarios it is often sufficient to mine a small interesting representative su...
Conference Paper
Mining useful Web path traversal patterns is a very important research issue in Web technologies. Knowledge about the frequent Web path traversal patterns enables us to discover the most interesting Websites traversed by the users. However, considering only the binary (presence/absence) occurrences of the Websites in the Web traversal paths, real w...
Conference Paper
Wireless sensor networks (WSNs) produce large-scale data in the form of streams. Recently, data mining techniques have received a great deal of attention in extracting knowledge from WSN data. Mining association rules on the sensor data provides useful information for different applications. Even though there have been some efforts to address t...
Article
The FP-growth algorithm using the FP-tree has been widely studied for frequent pattern mining because it can dramatically improve performance compared to the candidate generation-and-test paradigm of Apriori. However, it still requires two database scans, which are not consistent with efficient data stream processing. In this paper, we present a no...
Conference Paper
Frequent pattern mining techniques treat all items in the database equally by taking into consideration only the presence of an item within a transaction. However, the customer may purchase more than one of the same item, and the unit price may vary among items. High utility pattern mining approaches have been proposed to overcome this problem. As...
Conference Paper
Recently, a significant number of parallel and distributed algorithms have been proposed to mine frequent patterns (FP) from large and/or distributed databases. Among them parallelization of the FP-growth algorithms using the FP-tree has been proved to be highly efficient. However, the FP-tree-based techniques suffer from two major limitations such...
Conference Paper
Full-text available
By considering different weights of the items, weighted frequent pattern (WFP) mining can discover more important knowledge compared to traditional frequent pattern mining. Therefore, WFP mining becomes an important research issue in data mining and knowledge discovery area. However, the existing algorithms cannot be applied for stream data mining b...
Article
Mining frequent patterns (FP) from large-scale databases has emerged as an important problem in the data mining and knowledge discovery research community. A significant number of parallel and distributed FP mining algorithms have been proposed, when the database is large and/or distributed. Among them, parallelization of the FP-growth algorithm us...
Conference Paper
By considering different weights of the items, weighted frequent pattern (WFP) mining becomes an important research issue in data mining and knowledge discovery. However, existing algorithms cannot be applied for incremental and interactive WFP mining because they are based on a static database and require multiple database scans. In this paper, we...
Conference Paper
Mining weighted interesting patterns (WIP) [5] is an important research issue in data mining and knowledge discovery with broad applications. WIP can detect correlated patterns with a strong weight and/or support affinity. However, it still requires two database scans which are not applicable for efficient processing of the real-time data like data...
Conference Paper
Existing weighted frequent pattern (WFP) mining algorithms assume that each item has fixed weight. But in our real world scenarios the weight (price or significance) of an item can vary with time. Reflecting such change of weight of an item is very necessary in several mining applications such as retail market data analysis and web click stream ana...
Conference Paper
Temporal regularity of pattern appearance can be regarded as an important criterion for measuring the interestingness in several applications like market basket analysis, web administration, gene data analysis, network monitoring, and stock market. Even though there have been some efforts to discover periodic patterns in time-series and sequential...
Article
Even though weighted frequent pattern (WFP) mining is more effective than traditional frequent pattern mining because it can consider different semantic significances (weights) of items, existing WFP algorithms assume that each item has a fixed weight. But in real world scenarios, the weight (price or significance) of an item can vary with time. Re...
Article
The frequency of a pattern may not be a sufficient criterion for identifying meaningful patterns in a database. The temporal regularity of a pattern can be another key criterion for assessing the importance of a pattern in several applications. A pattern is said to be regular if it appears at a regular user-defined interval in the database. Even thou...
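The regularity criterion described in this abstract can be sketched as follows. This is a minimal illustration under one assumed reading: a pattern is regular when the largest gap between consecutive transactions containing it, including the gap before its first occurrence and after its last, does not exceed a user-defined maximum interval. The names `max_period` and `is_regular`, and the boundary handling, are assumptions rather than the paper's definitions.

```python
from typing import List


def max_period(tids: List[int], db_size: int) -> int:
    """Largest gap between consecutive occurrences of a pattern.

    `tids` are the 1-based ids of the transactions containing the pattern.
    The start (0) and end (db_size) of the database are treated as boundary
    points, which is an assumption made for this sketch.
    """
    points = [0] + sorted(tids) + [db_size]
    return max(later - earlier for earlier, later in zip(points, points[1:]))


def is_regular(tids: List[int], db_size: int, max_interval: int) -> bool:
    """True when the pattern occurs at least once and never stays absent
    for longer than the user-defined `max_interval`."""
    return bool(tids) and max_period(tids, db_size) <= max_interval


# Hypothetical example: both patterns occur in a 9-transaction database.
steady = [1, 3, 5, 8]   # gaps: 1, 2, 2, 3, 1
bursty = [1, 2, 9]      # gaps: 1, 1, 7, 0

steady_period = max_period(steady, db_size=9)
bursty_period = max_period(bursty, db_size=9)
```

With a maximum interval of 3, the first pattern qualifies as regular and the second does not, even though their occurrence counts are similar.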
Conference Paper
Full-text available
This paper proposes a prefix-tree structure, called the CPS-tree (Compact Pattern Stream tree), that efficiently discovers the exact set of recent frequent patterns from a high-speed data stream. The CPS-tree introduces a dynamic tree restructuring technique for handling stream data that allows it to achieve highly compact frequency-descending...
Conference Paper
Full-text available
Interpreting legacy XML documents is a great challenge for realizing the vision of the semantic Web (SW). This paper presents an algorithm to automatically transform XML data into RDF, the foundation language of the SW. Our approach maps element definitions stored in XML schema to RDF schema ontology, where the ontology is used to describe the meanin...
Conference Paper
The FP-growth algorithm using the FP-tree has been widely studied for frequent pattern mining because it can give a great performance improvement compared to the candidate generation-and-test paradigm of Apriori. However, it still requires two database scans, which makes it unsuitable for processing data streams. In this paper, we present a novel tree structu...
Conference Paper
Full-text available
Content adaptation is necessary for effective and efficient sharing of files. Nowadays, people use devices that vary in their configuration. Moreover, each user may have their own preferences whenever they want to share files. To address device heterogeneity as well as user preference, we need to adapt the file at the time of shar...
Conference Paper
Full-text available
Sharing files is very common in collaborative environments. Users may want to share each other's files for more effective and meaningful collaboration. Sometimes it would be preferable to adapt the file so that it can provide the required information to users with minimal overhead. Moreover, users may not want to share files in their original format....
Conference Paper
Radio Frequency Identification (RFID) technology is an excellent substitute for barcodes in industry. However, the management of a large amount of RFID data, together with complicated relationships between data, in the context of responding to different kinds of queries is not well supported by traditional databases. Therefore, 1) an event-based mod...
Conference Paper
Full-text available
Share-frequent pattern mining discovers more useful and realistic knowledge from databases than traditional frequent pattern mining by considering the non-binary frequency values of items in transactions. Therefore, the share-frequent pattern mining problem has recently become a very important research issue in data mining and knowledge disc...
Article
In this paper, we investigate the critical low coverage problem of position aware localized efficient broadcast in mobile ad hoc ubiquitous sensor networks and propose a generic framework for it. The framework is to determine a small subset of nodes and minimum transmission radiuses based on snapshots of network state (local views) along the broadc...
Article
Wireless sensor networks have been used more and more widely with the development of related techniques in telecommunication and computer science. However, sensor nodes in wireless sensor networks have very limited memory space and power. In this paper, we propose a new query aggregation method to preprocess the query predicates. The size of the re...
Conference Paper
Full-text available
Transforming the XML data on the current Web into sources that can be used by the Semantic Web has received great attention in recent years. In this paper, we propose a procedure for transforming valid XML documents into RDF by using vocabularies of RDF schema. The first objective here is to obtain classes and properties from labels in XML document e...
Article
Subsequence matching is an operation that finds subsequences whose changing patterns are similar to a given query sequence from time-series databases. This paper identifies a performance bottleneck in subsequence matching, and then proposes an effective method that substantially improves the performance of entire subsequence matching by resolving t...
Conference Paper
Wireless sensor networks have been widely used in many fields with the development of the related techniques. But there are many problems in traditional single-sink sensor networks. The energy of the sensors near the sink or on the critical paths is consumed too fast, causing unbalanced energy consumption. The routing algorithms mainly focus on the ne...
Conference Paper
With the development of related techniques in telecommunication and computer science, wireless sensor networks have been used more and more widely. However, sensor nodes in wireless sensor networks have very limited memory space and power. In this paper, we propose a new method to preprocess the query predicates. The size of the relational table ca...
Conference Paper
Full-text available
In ad hoc networks, disconnections occur frequently. In this paper, we allocate the data replicas according to time and space. In the temporal method, we store the original data, the median data in a specific time period and the replica with the second highest frequency among all the other data on the mobile hosts to improve data accessibility. In spat...
Conference Paper
Wireless sensor networks have been widely used in many fields with the development of the related techniques. But there are many problems in traditional single-sink sensor networks. The energy of the sensors near the sink or on the critical paths is consumed too fast, causing unbalanced energy consumption. The routing algorithms mainly focus on the ne...
Conference Paper
Full-text available
In recent years, computing has become more mobile and pervasive; these changes imply that applications and services must be aware of and adapt to their changing contexts in highly dynamic environments. To allow interoperability in a context-aware computing environment (e.g. smart meeting space), it is necessary that the context terminology will be com...