Shi-Jinn Horng

National Taiwan University of Science and Technology · Department of Computer Science and Information Engineering

275 Publications
69,181 Reads
8,063 Citations

Publications (275)
Article
With the advent of the big data era, an increasing amount of duplicate data is expressed in different forms. To reduce redundant data storage and improve data quality, data deduplication technology has become more important than ever. It is usually necessary to connect multiple data tables and identify different records poi...
Article
The inherent characteristics involved in data can be mined from multi-scale information systems by extracting information from different value levels of features. Information fusion in constructing methods in the multi-scale information system may facilitate the performance of a learning model. In real applications, noise data and irrelevant or red...
Article
Full-text available
High-dimensional multi-label data has become more prevalent in many application domains, presenting difficulties and challenges for multi-label learning. As a result, feature selection has been widely used as an effective dimensionality reduction technique in multi-label learning. However, traditional multi-label feature selection (MLFS) methods ma...
Article
The performance of multilabel learning depends heavily on the quality of the input features. A mass of irrelevant and redundant features may seriously affect the performance of multilabel learning, and feature selection is an effective technique to solve this problem. However, most multilabel feature selection methods mainly emphasize removing thes...
Article
The wrist vein is a robust and reliable biometric for research and applications in automatic human verification. However, existing wrist vein recognition models are heavy and ill-suited to deployment on smartphones. In addition, smartphones must integrate Near Infrared (NIR) sensors to capture wrist vein images. This paper proposes a no...
Article
Graph regularized nonnegative matrix factorization (GNMF) algorithms have received considerable attention in machine learning and data mining, and the square loss method is commonly used to measure the quality of reconstructed data. However, noise is introduced when data reconstruction is performed, and the square loss method is...
Article
Recently, demand for biometric access controls and online payments in smartphones increased, necessitating further investigation and development in this area. This paper proposes a new low-cost palm vein recognition system for smartphones using RGB images. First, we detect and enhance palm vein patterns, using the saturation channel instead of the...
Article
Clustering remains a challenging research hotspot in data mining. Non-negative matrix factorization (NMF) is an effective technique for clustering, which aims to find the product of two non-negative low-dimensional matrices that approximates the original matrix. Since the matrices must satisfy the non-negative constraints, the Karush–Kuhn–Tucker co...
Article
Full-text available
The impact of fine particulate matter on health has captured attention worldwide. Many studies have proven that fine particulate matter harms the respiratory system and the cardiovascular system. To prevent people from being harmed, many scientific research studies on PM2.5 prediction have been conducted in recent years. Accurate PM2.5 forecasting...
Article
Full-text available
For emergency or intensive-care units (ICUs), patients with unclear consciousness or unstable hemodynamics often require aggressive monitoring by multiple monitors. Complicated pipelines or lines increase the burden on patients and inconvenience for medical personnel. Currently, many commercial devices provide related functionalities. However, most...
Article
The objective of co-clustering is to simultaneously identify blocks of similarity between the sample set and feature set. Co-clustering has become a widely used technique in data mining, machine learning, and other research areas. The nonnegative matrix tri-factorization (NMTF) algorithm, which aims to decompose an objective matrix into three low-d...
Article
Ensuring data confidentiality in a vehicular ad hoc network (VANET) is an increasingly important issue. Message confidentiality, user privacy and access control are the most important problems that affect services provided by VANETs. However, access control that addresses data downloads while preserving users' privacy remains an open problem. Based...
Article
Face recognition can be installed in a surveillance system so that it can be used for monitoring, tracking and access control. An excellent, intelligent surveillance system should be sensitive to objects far away from the camera. Unfortunately, due to the long distance, objects like human faces captured by the camera are too small to identify....
Article
Time series forecasting is an important technique to study the behavior of temporal data and forecast future values, which is widely applied in many fields, e.g. air quality forecasting, power load forecasting, medical monitoring, and intrusion detection. In this paper, we firstly propose a novel temporal attention encoder–decoder model to deal wit...
Article
In many real-life applications, the data collected from different information sources are always located in diverse sites and characterized by different types of attributes. In this paper, we call this type of data multi-source hybrid data. Existing rough set models only work well with single source data or multi-source data with one type of attr...
Article
Air quality forecasting has been regarded as the key problem of air pollution early warning and control management. In this paper, we propose a novel deep learning model for air quality (mainly PM2.5) forecasting, which learns the spatial-temporal correlation features and interdependence of multivariate air quality related time series data by hybri...
Article
Full-text available
Recognition of very small face images is a significant problem in computer vision because it has many real-world applications. The main difficulty in recognizing small faces is that the area of the acquired face is so small that many facial features cannot be obtained. In this paper, we propose a novel...
Preprint
Air quality forecasting has been regarded as the key problem of air pollution early warning and control management. In this paper, we propose a novel deep learning model for air quality (mainly PM2.5) forecasting, which learns the spatial-temporal correlation features and interdependence of multivariate air quality related time series data by hybri...
Article
The high dimensionality and sparsity of data often increase the complexity of clustering; these factors occur simultaneously in unsupervised learning. Clustering and linear discriminant analysis (LDA) are methods to reduce the dimensionality and sparsity of data. In this study, the similarity of clustering and LDA is investigated based on their ob...
Article
Full-text available
Conventional haze-removal methods are designed to adjust the contrast and saturation, and in so doing enhance the quality of the reconstructed image. Unfortunately, the removal of haze in this manner can shift the luminance away from its ideal value. In other words, haze removal involves a tradeoff between luminance and contrast. We reformulated th...
Article
Information fusion is capable of fusing and transforming multiple data derived from different sources to provide a unified representation for centralized knowledge mining that facilitates effective decision-making, classification and prediction, etc. Multi-source interval-valued data, characterizing the uncertainty phenomena in the data in the fo...
Article
Full-text available
Tremor detection plays a crucial role in Parkinson’s disease (PD) treatment and symptom monitoring. The current gold standard for the clinical assessment of parkinsonian tremor is the evaluation using the standard clinical rating scales, which is performed by the well-trained neurologists. However, this assessment approach relies mainly on the subj...
Article
Full-text available
Traffic flow forecasting has been regarded as a key problem of intelligent transport systems. In this work, we propose a hybrid multimodal deep learning method for short-term traffic flow forecasting, which jointly learns the spatial-temporal correlation features and interdependence of multi-modality traffic data by multimodal deep learning archite...
Preprint
Traffic flow forecasting has been regarded as a key problem of intelligent transport systems. In this work, we propose a hybrid multimodal deep learning method for short-term traffic flow forecasting, which can jointly and adaptively learn the spatial-temporal correlation features and long temporal interdependence of multi-modality traffic data by...
Conference Paper
Full-text available
Bradykinesia is one of the primary characteristic symptoms of Parkinson's disease (PD). Ten-second whole-hand-grasps action was chosen to assess bradykinesia severity in this study. A quantification assessment system based on a self-developed wearable device was proposed to assess the severity of the parkinsonian bradykinesia. The proposed assessme...
Article
Set-valued information systems are an important type of data table in many real applications, where the attribute values are described by sets to characterize uncertain and incomplete information. However, in some real situations, set-values may be depicted by probability distributions, which means that the traditional tolerance relation based o...
Article
Microarray data often contain missing values which significantly affect subsequent analysis. Existing LLSimpute-based imputation methods for dealing with missing data have been shown to be generally efficient. However, all of the LLSimpute-based methods do not consider the different importance of different neighbors of the target gene in the missin...
Article
Attribute reduction based on rough set theory has attracted much attention recently. In real-life applications, many decision tables may vary dynamically with time, e.g., the variation of attributes, objects, and attribute values. The reduction of decision tables may change on the alteration of attribute values. The paper focuses on dynamic mainten...
Article
In a dynamic environment, the data collected from real applications vary not only with the number of objects but also with the number of features, which results in continuous change of knowledge over time. Static methods of updating knowledge need to recompute from scratch every time new data are added. This makes it potentially very...
Conference Paper
Set-valued information systems (SvIS), in which the attribute values are set-valued, are important types of data representation with uncertain and missing information. However, no previous investigation in the rough set community considers attribute values with probability distributions in SvIS, which may be impractical in many real applicat...
Article
In learning systems and environment research, intelligent tutoring and personalisation are considered the two most important factors. An Intelligent Tutoring System can serve as an effective tool to improve problem-solving skills by simulating a human tutor’s actions in implementing one-to-one adaptive and personalised teaching. Thus, in this resea...
Article
Rough set provides a theoretical framework for classification learning in data mining and knowledge discovery. As an important application of rough set, attribute reduction, also called feature selection, aims to reduce the redundant attributes in a given decision system while preserving a particular classification property, e.g., information entro...
Article
Nowadays, intelligent tutoring systems are considered an effective research tool for learning systems and problem-solving skill improvement. Nonetheless, such individualized systems may cause students to lose learning motivation when interaction and timely guidance are lacking. In order to address this problem, a solution-based intelligent tutoring...
Article
Full-text available
Uncertainty and fuzziness generally exist in real-life data. Approximations are employed to describe the uncertain information approximately in rough set theory. Certain and uncertain rules are induced directly from different regions partitioned by approximations. Approximations can further be applied to data mining related tasks, e.g., attribute red...
Chapter
In many fields including medical research, e-business and road transportation, data may vary over time, i.e., new objects and new attributes are added. In this paper, we present a method for dynamically updating approximations based on rough fuzzy sets under the variation of objects and attributes simultaneously in fuzzy decision systems. Firstly,...
Conference Paper
Rough set theory is an effective mathematical tool for processing uncertain and inexact data. In some real-life applications, data are stored distributively in information systems, which are called Distributed Information Systems (DIS). It is hard to centralize the large-scale data in DIS for data mining tasks. Furthermore, knowledge needs updati...
Conference Paper
Full-text available
Game-based learning is considered a highly motivational tool to accelerate active learning of students. As such learning environments usually follow a computer-assisted instruction concept that offers no adaptability to each student, some ideas from Intelligent Tutoring Systems (ITS) are borrowed and applied in educational games to teach introducto...
Conference Paper
This study presents a resistant digital image watermarking scheme based on a masking model. The scheme uses the Singular Value Decomposition (SVD) method with luminance masking in a non-blind watermarking model, in which the original image is needed in the extraction process. The resistance of digital images was checked against some of...
Article
Full-text available
This paper presents a low complexity configurable semi-fragile watermarking scheme for content-based H.264/AVC authentication, which allows content-preserving manipulations such as video transcoding techniques, while it is very sensitive to content-changing and frame manipulations. A low cost spatial analysis is exploited to maximize robustness and...
Article
Full-text available
Intelligent tutoring and personalization are considered as the two most important factors in the research of learning systems and environments. An effective tool that can be used to improve problem-solving ability is an Intelligent Tutoring System which is capable of mimicking a human tutor’s actions in implementing a one-to-one personalized and ad...
Article
Certificateless public key cryptography was introduced to solve the complicated certificate management problem in traditional public key cryptography and the key escrow problem in identity-based cryptography. The aggregate signature concept is useful in special areas where the signatures on many different messages generated by many different users...
Article
A vehicular ad hoc network (VANET) can significantly improve traffic safety and efficiency. The basic idea is to allow vehicles to send traffic information to roadside units (RSUs) or other vehicles. Vehicles have to be protected from attacks on their privacy and misuse of their private data. For this reason, the security and privacy preserv...
Article
Full-text available
In the early stages of learning computer programming, Computer Science (CS) minors share a misconception of what programming is. In order to address this problem, FMAS, a flowchart-based multi-agent system is developed to familiarize students who have no prior knowledge of programming, with the initial stages in learning programming. The aim is to...
Article
Full-text available
Novice programmers have a misconception of what programming is in the early stages of learning programming. A Flowchart-based Programming Environment (FPE) is developed in this research with the aim of introducing the early stages of learning programming to clarify matters. An attempt is made to introduce the basic programming algorithms prior to s...
Article
Rule induction based on rough set theory (RST) has received much attention recently since it may generate a minimal set of rules from the decision system for real-life applications by using attribute reduction and approximations. The decision system may vary with time, e.g., the variation of objects, attributes and attribute values. The r...
Article
Full-text available
From the time of early exploration in the area of programming languages, many tools have been employed to introduce novice programmers to programming. The most common tools entail flowchart-based notation as well as iconic based programming environments. More research in this field has revealed that the deficiency in problem-solving skills, which i...
Article
Recently, Yeh et al. proposed a portable privacy-preserving authentication and access control protocol, named PAACP, for non-safety applications in vehicular ad hoc networks. PAACP not only accomplishes authentication, key establishment and privacy preservation, but also considers the scalability and differentiated service access control issues in...
Article
This paper proposes an adaptive watermarking scheme for e-government document images. The adaptive scheme combines the discrete cosine transform (DCT) and the singular value decomposition (SVD) using luminance masking. As the core of the masking model in the human visual system (HVS), luminance masking is implemented to improve noise sensitivity. Genetic...
Conference Paper
Due to the rapid development of the Internet, vehicular value-added services have become very prevalent in vehicular ad hoc network (VANET); especially for road safety and traffic management. In this paper, we propose a secure value-added service scheme based on blind signature technique for VANET. On one hand, the feature of the portable credentia...
Article
In this paper, we develop an approach of embedded Markov chain to analyze the signaling cost of a movement-based location management (MBLM) scheme. This approach distinguishes itself from those developed in the literature in the following aspects. 1) It considers the location area (LA) architecture used by personal communication service (PCS) netwo...
Article
The security and privacy preservation issues are prerequisites for vehicular ad hoc networks. Recently, secure and privacy enhancing communication schemes (SPECS) was proposed and focused on intervehicle communications. SPECS provided a software-based solution to satisfy the privacy requirement and gave lower message overhead and higher successful...
Article
Full-text available
H.264/AVC-based products have grown tremendously in social networks; issues of content-based authentication become increasingly important. This paper presents a blind fragile watermarking scheme for content-based H.264/AVC authentication, which enjoys high sensitivity to typical video attacks. A spatiotemporal analysis is exploited to guarantee a m...