Conference Paper

Discovery of Exceptions: A Step towards Perfection.

DOI: 10.1109/NSS.2009.32
Conference: Third International Conference on Network and System Security, NSS 2009, Gold Coast, Queensland, Australia, October 19-21, 2009
Source: DBLP


Exceptions are interesting to discover because they contradict existing knowledge and carry elements of unexpectedness and surprise. Because exceptions cover only a very small portion of the data, discovering them remains a great challenge. A censored production rule (CPR) is a special kind of knowledge structure that attaches exceptions to their corresponding commonsense rules of high generality and support. This paper proposes the discovery of decision rules in the form of censored production rules using a genetic algorithm. Results confirm that the decision rules discovered in the form of CPRs are comprehensible and interesting. Using CPRs as the underlying knowledge structure for rule mining provides an excellent mechanism for exception handling and approximate reasoning. Moreover, discovering exceptions through CPRs enhances the predictive accuracy of the classifier.
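
To make the knowledge structure concrete, the following is a minimal Python sketch of a censored production rule, assuming the usual "IF premise THEN decision UNLESS censor" reading of a CPR. The class name `CensoredRule`, its fields, and the bird example are illustrative assumptions and are not taken from the paper; the sketch does not show how the paper's genetic algorithm encodes or evolves such rules.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]
Condition = Callable[[Record], bool]


@dataclass
class CensoredRule:
    """Hypothetical CPR: IF premise THEN decision UNLESS any censor holds."""
    premise: Condition
    decision: str
    exception_decision: str
    censors: List[Condition] = field(default_factory=list)

    def classify(self, record: Record) -> Optional[str]:
        # The rule does not apply at all when the premise fails.
        if not self.premise(record):
            return None
        # A firing censor blocks the commonsense decision (the exception case).
        if any(censor(record) for censor in self.censors):
            return self.exception_decision
        # Otherwise the general, high-support rule holds.
        return self.decision


# Illustrative rule: birds fly, unless the bird is a penguin.
rule = CensoredRule(
    premise=lambda r: bool(r.get("is_bird", False)),
    decision="flies",
    exception_decision="does_not_fly",
    censors=[lambda r: bool(r.get("is_penguin", False))],
)
```

In this sketch the censor list is only consulted after the premise matches, which mirrors the idea that exceptions refine a commonsense rule rather than replace it.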

  • ABSTRACT: In recent years, Genetic Algorithms (GAs) have shown promising results in the domain of data mining. However, unreasonably long running times, caused by the high computational cost of fitness evaluations, dissuade practitioners from using GAs for knowledge discovery. In this paper we propose an enhanced genetic algorithm for automated rule mining. The proposed approach supplements the GA with an entropy-based probabilistic initialization so that the initial population contains more relevant and informative attributes. Further, the GA is augmented with a memory to store fitness scores. These additions have a twofold advantage. First, they shrink the search space of candidate rules, so the search evolves fitter rules in fewer generations. Second, they reduce the total number of fitness evaluations required, which shortens the running time. The enhanced GA has been applied to datasets from the UCI Machine Learning Repository and has shown encouraging results.
    Communications in Computer and Information Science 01/2011; 131. DOI:10.1007/978-3-642-17857-3_60
  • ABSTRACT: The main criticisms of genetic algorithms in data mining applications are local convergence and long running times, particularly for large datasets with many attributes. One solution is to give the initial population a filtering bias, so that attributes more relevant to prediction are initialized with higher probability than less important ones. This paper proposes a genetic algorithm with an entropy-based filtering bias in the initial population. Each attribute in the initial population is initialized with a probability inversely proportional to its entropy. Relevant attributes occur more frequently in the initial population and so give the GA a good start in searching for fitter rules in earlier generations. The results demonstrate the efficacy and efficiency of the proposed system for automated rule mining.
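
The two ideas in the abstracts above, an entropy-biased initial population and a memory of fitness scores, can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: "entropy of an attribute" is read here as the conditional entropy of the class label given that attribute, the smoothing constant `eps` is added to avoid division by zero, an individual is represented as a tuple of attribute names, and all function and class names are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def conditional_entropy(rows, attr, label):
    """Entropy of the class label given the value of one attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attr]].append(row[label])
    total = len(rows)
    h = 0.0
    for labels in groups.values():
        n = len(labels)
        counts = Counter(labels)
        h_group = -sum((c / n) * math.log2(c / n) for c in counts.values())
        h += (n / total) * h_group
    return h


def inclusion_weights(rows, attrs, label, eps=0.1):
    """Weights inversely proportional to entropy (eps is an assumed smoother)."""
    inverse = {a: 1.0 / (conditional_entropy(rows, a, label) + eps) for a in attrs}
    z = sum(inverse.values())
    return {a: v / z for a, v in inverse.items()}


def init_population(rows, attrs, label, size, rule_len, rng):
    """Draw individuals whose attributes are sampled by entropy-based weight."""
    if rule_len > len(attrs):
        raise ValueError("rule_len cannot exceed the number of attributes")
    w = inclusion_weights(rows, attrs, label)
    weights = [w[a] for a in attrs]
    population = []
    for _ in range(size):
        chosen = set()
        while len(chosen) < rule_len:
            chosen.add(rng.choices(attrs, weights=weights)[0])
        population.append(tuple(sorted(chosen)))
    return population


class FitnessCache:
    """Memory of fitness scores so repeated individuals are scored only once."""

    def __init__(self, fitness_fn):
        self.fitness_fn = fitness_fn
        self.memory = {}
        self.evaluations = 0

    def __call__(self, individual):
        if individual not in self.memory:
            self.memory[individual] = self.fitness_fn(individual)
            self.evaluations += 1
        return self.memory[individual]


# Toy data: attribute "a" determines the label "y", attribute "b" is noise.
rows = [
    {"a": 0, "b": 0, "y": 0},
    {"a": 0, "b": 1, "y": 0},
    {"a": 1, "b": 0, "y": 1},
    {"a": 1, "b": 1, "y": 1},
]
attrs = ["a", "b"]
weights = inclusion_weights(rows, attrs, "y")
population = init_population(rows, attrs, "y", size=200, rule_len=1,
                             rng=random.Random(0))
```

On the toy data the informative attribute receives the larger weight and therefore appears in more initial individuals, while wrapping any fitness function in `FitnessCache` avoids rescoring individuals that recur across generations.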
