Sheng-Tun Li

National Cheng Kung University, Tainan City, Taiwan

Publications (92) · 90.11 Total Impact

  • Intelligent Data Analysis 09/2015; 19(5):1071-1089. DOI:10.3233/IDA-150759 · 0.61 Impact Factor
  • Hei-Fong Ho · Sheng-Tun Li
    ABSTRACT: Organizing a reliable case base, which serves as a repository of experience, is crucial for the success of a case-based reasoning (CBR) system. To ensure that such repositories contain high-quality cases, this paper proposes a framework employing the methodology of fuzzy linguistic group decision-making (GDM) in the context of multiple attributes (MAGDM). The overall process of MAGDM is analogous to the memory-related behaviors of the human brain, in which knowledge is elicited and validated, as in short-term memory, and then eventually integrated into long-term memory to serve as solutions that build up the stock of high-quality cases. Moreover, the proposed approach is flexible, as it enables experts to define the set of parameters of the membership functions associated with labels, thus improving the quality of the linguistic term sets and leading to better assessments. Furthermore, the proposed KC index, characterized by measures of both individual and group consistency, can assign suitable experts' weights more effectively than most existing GDM models. This is supported by the experimental results presented in this work, indicating that the KC index can indeed lead to a more satisfactory overall level of consensus. In addition, the mutual validation between the experts' parameter settings for the membership functions and the evaluation of the experts' weights can be expressed in terms of the KC index.
    Knowledge-Based Systems 06/2015; 86. DOI:10.1016/j.knosys.2015.05.022 · 2.95 Impact Factor
  • Shu-Ching Kuo · Chih-Chuan Chen · Sheng-Tun Li
    ABSTRACT: The use of fuzzy time series has attracted considerable attention in studies that aim to make forecasts using uncertain information. However, most of the related studies do not use a learning mechanism to extract valuable information from historical data. In this study, we propose an evolutionary fuzzy forecasting model, in which a learning technique for a fuzzy relation matrix is designed to fit the historical data. Taking into consideration the causal relationships among the linguistic terms that are missing in many existing fuzzy time series forecasting models, this method can naturally smooth the defuzzification process, thus obtaining better results than many other fuzzy time series forecasting models, which tend to produce stepwise outcomes. The experimental results with two real datasets and four indicators show that the proposed model achieves a significant improvement in forecasting accuracy compared to earlier models.
    International Journal of Fuzzy Systems 05/2015; DOI:10.1007/s40815-015-0043-2 · 1.10 Impact Factor
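As a rough illustration of the fuzzy relation matrix central to the abstract above, the sketch below builds a first-order relation matrix from a fuzzified series and defuzzifies a forecast by a weighted centroid. The triangular memberships, five-term partition, and toy data are illustrative assumptions only; the cited paper learns the matrix with an evolutionary technique rather than by counting transitions.

```python
import numpy as np

def triangular_memberships(series, n_terms=5):
    """Fuzzify each observation against n_terms evenly spaced triangular sets."""
    centers = np.linspace(series.min(), series.max(), n_terms)
    width = centers[1] - centers[0]
    mu = np.maximum(0.0, 1.0 - np.abs(series[:, None] - centers[None, :]) / width)
    return mu, centers, width

def fit_relation_matrix(mu):
    """Accumulate first-order transitions A_t -> A_{t+1}, row-normalized."""
    R = mu[:-1].T @ mu[1:]                      # co-activation of consecutive terms
    return R / np.maximum(R.sum(axis=1, keepdims=True), 1e-12)

def forecast_next(x_t, centers, width, R):
    """Fuzzify the latest value, propagate through R, defuzzify by centroid."""
    mu_t = np.maximum(0.0, 1.0 - np.abs(x_t - centers) / width)
    mu_next = mu_t @ R
    return float(mu_next @ centers / mu_next.sum())

series = np.array([10.0, 12.0, 13.0, 12.5, 14.0, 15.0, 14.5, 16.0])
mu, centers, width = triangular_memberships(series)
R = fit_relation_matrix(mu)
pred = forecast_next(series[-1], centers, width, R)   # stays within the data range
```

Because the forecast is a membership-weighted average of term centers rather than a single rule firing, the output varies smoothly with the input, which is the "naturally smooth defuzzification" the abstract contrasts with stepwise outcomes.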
  • Sheng-Tun Li · Wei-Chien Chou
    ABSTRACT: It is critical for information and communications technology (ICT) companies to carry out effective power planning, in order to support the growing number of services they provide, and this traditionally relies on the tacit knowledge and experience of senior staff. The loss of such domain knowledge resulting from the retirement of staff is an important issue for organizations such as Chunghwa Telecom (CHT), the largest ICT operator in Taiwan. This study thus develops a systematic power planning model using a multi-criteria operational performance evaluation. A group version of the fuzzy repertory grid and fuzzy TOPSIS approaches is applied to elicit a set of evaluation criteria that senior staff agree on, and the priorities of the telecom rooms are then evaluated against these criteria. In addition, a new factor, reflecting the attitudes of the decision makers with respect to the degree of strictness, is defined to determine the superiority and inferiority of each alternative compared to the others. Furthermore, a novel decision aggregation strategy regarding the degree of variation among decision makers is proposed, and a quantitative assessment is carried out to analyze its impact on the ranking results in an objective manner. The proposed model may help ICT organizations manage their power resources more effectively, and thus obtain competitive advantages.
    Omega 12/2014; 49. DOI:10.1016/ · 4.38 Impact Factor
  • Chih-Chuan Chen · Sheng-Tun Li
    ABSTRACT: Deciding whether borrowers can fulfill their obligations is a major issue for financial institutions, and while various credit rating models have been developed to help achieve this, they cannot reflect the domain knowledge of human experts. This paper proposes a new rating model based on a support vector machine with monotonicity constraints derived from the prior knowledge of financial experts. Experiments conducted on real-world data sets show that the proposed method, being not only data driven but also domain knowledge oriented, can help correct the loss of monotonicity that occurs in data during the collection process, and performs better than its conventional counterpart.
    Expert Systems with Applications 11/2014; 41(16):7235–7247. DOI:10.1016/j.eswa.2014.05.035 · 2.24 Impact Factor
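The monotonicity property that the abstract above refers to can be checked directly on data: if one applicant dominates another on every criterion, the dominated applicant should not receive the better rating. The pairwise counter below is an illustrative sketch with hypothetical data; the paper's actual contribution, folding such constraints into SVM training, is not reproduced here.

```python
def monotonicity_violations(X, y):
    """Count ordered pairs (i, j) where X[i] dominates X[j] componentwise
    (with at least one strict inequality) yet y[i] < y[j]."""
    violations = 0
    for i in range(len(X)):
        for j in range(len(X)):
            if i == j:
                continue
            dominates = all(a >= b for a, b in zip(X[i], X[j]))
            strictly = any(a > b for a, b in zip(X[i], X[j]))
            if dominates and strictly and y[i] < y[j]:
                violations += 1
    return violations

# hypothetical applicants: (income, collateral); grade, higher is better
X = [(50, 1), (60, 2), (80, 1), (40, 3)]
y = [2, 2, 1, 1]
n_bad = monotonicity_violations(X, y)   # applicant 2 dominates applicant 0
```

A nonzero count flags exactly the kind of inconsistency, introduced during data collection, that the constrained model is designed to correct.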
  • Sheng-Tun Li · Fu-Ching Tsai
    ABSTRACT: Automatic text classification in text mining is a critical technique for managing huge collections of documents. However, most existing document classification algorithms are easily affected by ambiguous terms, so the ability to disambiguate is as important for a classifier as the ability to classify accurately. In this paper, we propose a novel classification framework based on fuzzy formal concept analysis to conceptualize documents into a more abstract form of concepts, and use these as the training examples to alleviate the arbitrary outcomes caused by ambiguous terms. The proposed model is evaluated on a benchmark testbed and two opinion polarity datasets, and the experimental results indicate superior performance on all datasets. Applying concept analysis to opinion polarity classification is a leading endeavor in the disambiguation of Web 2.0 content, and the approach presented in this paper offers significant improvements over current methods. The results of the proposed model reveal its ability to decrease sensitivity to noise, as well as its adaptability in cross-domain applications.
    Knowledge-Based Systems 02/2013; 39:23–33. DOI:10.1016/j.knosys.2012.10.005 · 2.95 Impact Factor
  • Yi-Chung Cheng · Sheng-Tun Li
    ABSTRACT: The area of fuzzy time series has attracted increasing interest in the past decade since Song and Chissom's pioneering work and Chen's milestone study. Various enhancements and generalizations have subsequently been proposed, including high-order fuzzy time series. One of the key steps in Chen's framework is to derive the fuzzy relationships existing in a fuzzy time series and to encode them as IF-THEN production rules. A generic exact-match strategy is then applied in the forecasting process. However, the uncertainty and fuzziness inherent to the fuzzy relationships tend to be overlooked due to the nature of such matching strategies. This omission can lead to inferior forecasting outcomes, particularly in the case of high-order fuzzy time series. In this study, to overcome this shortcoming, we propose a best-match forecasting method based on a fuzzy similarity measure. Experiments concerning the Taiwan Weighted Stock Index and the Dow Jones Industrial Average are reported, and the effectiveness of the model is shown through comparative analyses against well-known models from the literature.
    Time Series Analysis, Modeling and Applications, 01/2013: pages 331-345;
  • Yi-Chung Cheng · Sheng-Tun Li
    ABSTRACT: Since its emergence, the study of fuzzy time series (FTS) has attracted more attention because of its ability to deal with the uncertainty and vagueness that are often inherent in real-world data resulting from inaccuracies in measurements, incomplete sets of observations, or difficulties in obtaining measurements under uncertain circumstances. The representation of fuzzy relations that are obtained from a fuzzy time series plays a key role in forecasting. Most of the works in the literature use the rule-based representation, which tends to encounter the problem of rule redundancy. A remedial forecasting model was recently proposed in which the relations were established based on the hidden Markov model (HMM). However, its forecasting performance generally deteriorates when encountering more zero probabilities owing to fewer fuzzy relationships that exist in the historical temporal data. This paper thus proposes an enhanced HMM-based forecasting model by developing a novel fuzzy smoothing method to overcome performance deterioration. To deal with uncertainty more appropriately, the roulette-wheel selection approach is applied to probabilistically determine the forecasting result. The effectiveness of the proposed model is validated through real-world forecasting experiments, and performance comparison with other benchmarks is conducted by a Monte Carlo method.
    IEEE Transactions on Fuzzy Systems 04/2012; 20(2):291-304. DOI:10.1109/TFUZZ.2011.2173583 · 8.75 Impact Factor
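Roulette-wheel selection, which the model above uses to determine the forecasting result probabilistically, can be sketched in a few lines. The states and probabilities below are made-up examples, not values from the paper.

```python
import random

def roulette_wheel(states, probs, rng=random):
    """Return one state with probability proportional to its weight."""
    spin = rng.random() * sum(probs)
    cumulative = 0.0
    for state, p in zip(states, probs):
        cumulative += p
        if spin <= cumulative:
            return state
    return states[-1]                  # guard against floating-point round-off

states = ["A1", "A2", "A3"]            # fuzzy states (hypothetical labels)
probs = [0.2, 0.5, 0.3]                # e.g. one row of smoothed HMM weights
rng = random.Random(42)                # fixed seed for reproducibility
draws = [roulette_wheel(states, probs, rng) for _ in range(10000)]
freq_A2 = draws.count("A2") / len(draws)   # should approach 0.5
```

Sampling in proportion to probability, instead of always taking the most likely state, is what makes the forecast stochastic, which is why the paper evaluates it with Monte Carlo comparisons.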
  • Sheng-Tun Li · Fu-Ching Tsai
    ABSTRACT: Document classification is critical due to the explosive increase of text in the modern world. However, most existing document classification algorithms are easily affected by noisy data, so in document classification tasks the ability to control noise is as important as the ability to classify accurately. In this paper, we propose a novel classification framework based on fuzzy formal concept analysis to moderate the impact of noise. In addition, the well-organized concepts also provide inherent relations, which support knowledge codification and distribution effectively. Experimental results using the Reuters 21578 dataset demonstrate a significant noise control benefit and superior classification accuracy.
    Fuzzy Systems (FUZZ), 2011 IEEE International Conference on; 07/2011
  • ABSTRACT: The emergence of fuzzy time series has recently received more attention because of their capability of dealing with the vagueness and incompleteness inherent in data. Deriving an effective and useful forecasting model has been a challenging task. In previous work, the authors addressed two crucial issues, namely controlling uncertainty and effectively partitioning intervals, and developed a deterministic forecasting model to manage them. However, that model neglected the distribution and uncertainty of data points and could only provide scalar forecasts, thus limiting its usefulness. This study expands the deterministic forecasting model to improve its forecasting capability. We propose a vector forecasting model that allows the prediction of a vector of future values in one step, by integrating the technologies of sliding window and fuzzy c-means clustering, to deal with vector forecasting and interval partitioning. Experimental results and analysis using Monte Carlo simulations for two experiments, both with three data sets, validate the effectiveness of the proposed forecasting model.
    Applied Soft Computing 04/2011; 11(3):3125-3134. DOI:10.1016/j.asoc.2010.12.015 · 2.81 Impact Factor
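The sliding-window step mentioned above, which turns a scalar series into (input window, future vector) pairs so that a whole vector can be forecast in one step, can be sketched as follows. The window lengths and data are illustrative choices, and the fuzzy c-means clustering of the windows is not shown.

```python
def sliding_windows(series, in_len, out_len):
    """Pair each input window with the vector that immediately follows it."""
    pairs = []
    for t in range(len(series) - in_len - out_len + 1):
        pairs.append((series[t : t + in_len],
                      series[t + in_len : t + in_len + out_len]))
    return pairs

series = [1, 2, 3, 4, 5, 6, 7]
pairs = sliding_windows(series, in_len=3, out_len=2)
# pairs[0] is ([1, 2, 3], [4, 5]): three inputs predict the next two values
```

Each pair is one training example for a vector-output model, which is what allows several future values to be produced in a single step instead of iterating one-step forecasts.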
  • ABSTRACT: Since learning Bayesian networks from data is difficult, a new approach is proposed, in which particle swarm optimization (PSO) and the minimum description length (MDL) principle are combined to obtain a suitable Bayesian network. MDL serves as the fitness function in this learning algorithm to evaluate the goodness of a network. By adopting MDL, the balance between simplicity and accuracy is assured, which enables the optimal solution for complex models to be found in reasonable time. Based on the MDL principle, PSO is used to enhance structure learning in Bayesian networks. Moreover, the conditional probabilities associated with the Bayesian networks are then statistically derived from the data. Finally, the Stroke data set is used to test the efficiency and effectiveness of the resulting network. Experimental results show that the proposed approach achieves better accuracy than the comparative methods.
    FUZZ-IEEE 2011, IEEE International Conference on Fuzzy Systems, Taipei, Taiwan, 27-30 June, 2011, Proceedings; 01/2011
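The MDL fitness function described above trades model cost against data fit. A minimal sketch for a single discrete node is given below (a full network sums such scores over all nodes, and PSO would search over structures); the encoding constants and toy data are illustrative assumptions, not the paper's exact formulation.

```python
import math
from collections import Counter

def mdl_node_score(data, child, parents, arity):
    """MDL bits for one node: 0.5*log2(N) per free parameter, plus the
    data cost -sum log2 P(child | parents) under ML estimates."""
    counts = Counter((tuple(r[p] for p in parents), r[child]) for r in data)
    parent_counts = Counter(tuple(r[p] for p in parents) for r in data)
    data_bits = -sum(c * math.log2(c / parent_counts[pa])
                     for (pa, _), c in counts.items())
    n_params = (arity[child] - 1) * math.prod(arity[p] for p in parents)
    return 0.5 * math.log2(len(data)) * n_params + data_bits

# toy binary data in which variable 1 simply copies variable 0
data = [(0, 0)] * 4 + [(1, 1)] * 4
arity = {0: 2, 1: 2}
with_edge = mdl_node_score(data, child=1, parents=[0], arity=arity)
no_edge = mdl_node_score(data, child=1, parents=[], arity=arity)
# the edge 0 -> 1 pays a small model cost but saves many data bits
```

Because extra parents always raise the parameter penalty while only sometimes lowering the data cost, minimizing this score keeps the network from overfitting, which is the simplicity-accuracy balance the abstract describes.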
  • Sheng-Tun Li · Chih-Chuan Chen · Fernando Huang
    ABSTRACT: With the non-stop increases in medical treatment fees, the economic survival of a hospital in Taiwan relies on the reimbursements received from the Bureau of National Health Insurance, which in turn depend on the accuracy and completeness of the content of the discharge summaries as well as the correctness of their International Classification of Diseases (ICD) codes. The purpose of this research is to reinforce the entire disease classification framework by supporting disease classification specialists in the coding process. This study developed an ICD code advisory system (ICD-AS) that performed knowledge discovery from discharge summaries and suggested ICD codes. Natural language processing and information retrieval techniques based on Zipf's law were applied to process the content of discharge summaries, and fuzzy formal concept analysis was used to analyze and represent the relationships between the medical terms identified by MeSH. In addition, a certainty factor used as reference during the coding process was calculated to account for uncertainty and strengthen the credibility of the outcome. Two sets of 360 and 2579 textual discharge summaries of patients suffering from cerebrovascular disease were processed to build up ICD-AS and to evaluate the prediction performance. A number of experiments were conducted to investigate the impact of system parameters on accuracy and to compare the proposed model to traditional classification techniques, including linear-kernel support vector machines. The comparison results showed that the proposed system achieves better overall performance in terms of several measures. In addition, some useful implication rules were obtained, which improve comprehension of the field of cerebrovascular disease and give insights into the relationships between relevant medical terms. Our system contributes valuable guidance to disease classification specialists in the process of coding discharge summaries, which consequently brings benefits for patients, hospitals, and the healthcare system.
    Artificial intelligence in medicine 01/2011; 51(1):27-41. DOI:10.1016/j.artmed.2010.10.003 · 2.02 Impact Factor
  • Sheng-Tun Li · Yi-Chung Cheng
    ABSTRACT: Recently, fuzzy time series have attracted more academic attention than traditional time series due to their capability of dealing with the uncertainty and vagueness inherent in the data collected. The formulation of fuzzy relations is one of the key issues affecting forecasting results. Most of the present works adopt IF-THEN rules for relationship representation, which leads to higher computational overhead and rule redundancy. Sullivan and Woodall proposed a Markov-based formulation and a forecasting model to reduce computational overhead; however, its applicability is limited to handling one-factor problems. In this paper, we propose a novel forecasting model based on the hidden Markov model by enhancing Sullivan and Woodall's work to allow handling of two-factor forecasting problems. Moreover, in order to make the nature of conjecture and randomness of forecasting more realistic, the Monte Carlo method is adopted to estimate the outcome. To test the effectiveness of the resulting stochastic model, we conduct two experiments and compare the results with those from other models. The first experiment consists of forecasting the daily average temperature and cloud density in Taipei, Taiwan, and the second experiment is based on the Taiwan Weighted Stock Index by forecasting the exchange rate of the New Taiwan dollar against the U.S. dollar. In addition to improving forecasting accuracy, the proposed model adheres to the central limit theorem, and thus, the result statistically approximates to the real mean of the target value being forecast.
    IEEE Transactions on Cybernetics 11/2010; 40(5):1255-1266. DOI:10.1109/TSMCB.2009.2036860 · 6.22 Impact Factor
  • Sheng-Tun Li · Shu-Ching Kuo · Fu-Ching Tsai
    ABSTRACT: With volume crime increasing, crime prevention has become one of the most important global issues, along with the great concern of strengthening public security. Government and community officials are making an all-out effort to improve the effectiveness of crime prevention. Numerous investigations addressing this problem have generally employed the disciplines of behavioral science and statistics. Recently, the data mining approach has been shown to be a proactive decision-support tool for predicting and preventing crime. However, its effectiveness is often limited by the nature of crime data, such as linguistic crime data evolving over time. In this paper, we propose an intelligent decision-support framework based on a fuzzy self-organizing map (FSOM) network to detect and analyze crime trend patterns from temporal crime activity data. In addition, a rule extraction algorithm is employed to uncover hidden cause-effect knowledge and reveal the shift-around effect. In contrast to most present crime-related studies, we target a non-Western real-world case, i.e. the National Police Agency (NPA) in Taiwan. The resultant model can support police managers in assessing more appropriate law enforcement strategies, as well as improving the use of police duty deployment for crime prevention.
    Expert Systems with Applications 10/2010; 37(10):7108-7119. DOI:10.1016/j.eswa.2010.03.004 · 2.24 Impact Factor
  • Sheng-Tun Li · Fu-Ching Tsai
    ABSTRACT: A knowledge structure identifies how people think and displays a macro view of human perception. By discovering the hidden structural relations of knowledge, significant reasoning patterns are retrieved to enhance further knowledge sharing and distribution. However, the utilization of such approaches tends to be limited due to the lack of hierarchical features and the problem of information overload, which make it difficult to enhance comprehension and provide effective navigation. To address these critical issues, we propose a new approach to construct a tree-based knowledge structure from a corpus, which can reveal the significant relations among knowledge objects and enhance user comprehension. The effectiveness of the proposed method is demonstrated with two representative public data sets. The evaluation results show that the method presented in this work achieves remarkable consistency with the domain-specific knowledge structure, and is capable of reflecting appropriate similarities among knowledge objects along with hierarchical implications in the document classification task.
    Applied Intelligence 08/2010; 33(1):67-78. DOI:10.1007/s10489-010-0243-2 · 1.85 Impact Factor
  • ABSTRACT: In the last decade, fuzzy time series have received more attention due to their ability to deal with the vagueness and incompleteness inherent in time series data. Although various improvements, such as high-order models, have been developed to enhance the forecasting performance of fuzzy time series, their forecasting capability is mostly limited to short time spans and the forecasting of a single future value in one step. This paper presents a new method to overcome this shortcoming, called deterministic vector long-term forecasting (DVL). The proposed method, built on the basis of our previous deterministic forecasting method that does not require the overhead of determining the order number, as in other high-order models, utilizes a vector quantization technique to support forecasting when there are no matching historical patterns, which is usually the case with long-term forecasting. The vector forecasting method is further realized by seamlessly integrating it with the sliding window scheme. Finally, the forecasting effectiveness and stability of DVL are validated and compared by performing Monte Carlo simulations on real-world data sets.
    Fuzzy Sets and Systems 07/2010; 161(13):1852-1870. DOI:10.1016/j.fss.2009.10.028 · 1.99 Impact Factor
  • Kuo-Chin Hsu · Sheng-Tun Li
    ABSTRACT: A data analysis method is proposed to cluster and explore spatio-temporal characteristics of the 22 years of precipitation data (1982–2003) for Taiwan. The wavelet transform self-organizing map (WTSOM) framework combines the wavelet transform (WT) and a self-organizing map (SOM) neural network. WT is used to extract dynamic and multiscale features of the non-stationary precipitation time-series, and SOM is applied to objectively identify spatially homogeneous clusters on the high-dimensional wavelet-transformed feature space. Haar and Morlet wavelets are applied in the data preprocessing stage to preserve the desired characteristics of the precipitation data. A two-level SOM neural network is applied to identify clusters in the wavelet space in the clustering stage. The performance of clustering is evaluated using silhouette coefficients. The results indicate that singularities or sharp transitions are more significant than changes in the periodicity or data structure in the spatial–temporal precipitation data. The WTSOM results show that six clusters are optimal for both Haar and Morlet wavelet functions, but their corresponding geographic locations are different. The geographic locations of clusters based on the Haar wavelet, which captures the occurrence of extreme hydrological events, appear in blocks while those classified by the Morlet wavelet, which indicates periodicity changes and describes fine structures, appear in strips that cross the island of Taiwan. Principal component analysis is applied to the precipitation data of each cluster. The first principal components explain 62–90% of the total variation of data. Characteristics of precipitation data for each cluster are explored using scalogram analysis. The results show that both extreme hydrological events and periodicity changes appear in the spatial and temporal precipitation data but with different characteristics for each cluster. Recognizing homogeneous hydrologic regions and identifying the associated precipitation characteristics improves the efficiency of water resources management in adapting to climate change, preventing the degradation of the water environment, and reducing the impact of climate-induced disasters. Measures for countering the stress of precipitation variation for water resources management are provided.
    Advances in Water Resources 02/2010; 33(2):190-200. DOI:10.1016/j.advwatres.2009.11.005 · 3.42 Impact Factor
  • Sheng-Tun Li · Ming-Hong Tsai · Chinho Lin
    ABSTRACT: Managing their knowledge assets is an imperative issue for most organizations in pursuit of competitive advantage in the knowledge-based economy. Previous researchers have proposed a number of valuable taxonomies for classifying an organization’s knowledge assets. However, once knowledge assets are classified by such taxonomies as a particular type, they do not change type over time. Arguably, however, business contexts are swiftly changing, and knowledge assets may have to be constantly adapted to play new roles, and so a taxonomy capable of reflecting the changing relations between knowledge assets and environmental conditions is needed. This article proposes such a taxonomy which utilizes durability and profitability as dimensions. This taxonomy allows knowledge assets to change type in the light of the new condition. Additionally, it has the characteristics of demonstrating the alignment of assets with organizational strategies, and of being widely applicable in the for-profit sector.
    Journal of Information Science 01/2010; 36:36-56. DOI:10.1177/0165551509347955 · 1.16 Impact Factor
  • Tzung-Pei Hong · Hsin-Yi Chen · Chun-Wei Lin · Sheng-Tun Li
    ABSTRACT: In the past, a pre-large fast-updated sequential pattern tree (pre-large FUSP tree) structure was proposed to effectively handle newly inserted customer sequences for data mining. Since data deletion also commonly occurs in real applications, in this paper, we thus propose a maintenance algorithm for pre-large FUSP trees when records are deleted from the mined database. Pre-large sequences act like buffers and are used to reduce the movement of sequences directly from large to small and vice-versa when records are deleted. Experimental results also show that the proposed pre-large FUSP-tree maintenance algorithm for record deletion has a good performance when compared to the batch maintenance algorithm.
    New Trends in Information and Service Science, 2009. NISS '09. International Conference on; 08/2009
  • ABSTRACT: Due to uncertain data quality, knowledge extracted by methods merely focusing on gaining high accuracy might contradict experts' knowledge or sometimes even common sense. In many application areas of data mining, taking into account the monotonic relations between the response variable and predictor variables can help extract rules with better comprehensibility. This study incorporates Particle Swarm Optimization (PSO), a competitive heuristic technique for solving optimization tasks, with monotonicity constraints for discovering accurate and comprehensible rules from databases. The results show that the proposed constraint-based PSO classifier can extract rules that are both comprehensible and justifiable.
    05/2009: pages 323-328;

Publication Stats

699 Citations
90.11 Total Impact Points


  • 2003–2014
    • National Cheng Kung University
      • Department of Industrial and Information Management
      • Institute of Information Management
      Tainan City, Taiwan
  • 1999–2008
    • National Kaohsiung First University of Science and Technology
      • Department of Finance
      Kaohsiung City, Taiwan