Discovery of Exceptions: A Step towards Perfection
DOI: 10.1109/NSS.2009.32 Conference: Third International Conference on Network and System Security, NSS 2009, Gold Coast, Queensland, Australia, October 19-21, 2009
It is interesting to discover exceptions, as they dispute existing knowledge and carry elements of unexpectedness and surprise. Because exceptions cover only a very small portion of the data, discovering them remains a great challenge. A censored production rule (CPR) is a special kind of knowledge structure that attaches exceptions to their corresponding commonsense rules of high generality and support. This paper proposes the discovery of decision rules in the form of censored production rules by employing a genetic algorithm approach. Results confirm that decision rules discovered in the form of CPRs are comprehensible and interesting. Using CPRs as the underlying knowledge structure for rule mining provides an excellent mechanism for exception handling and approximate reasoning. Moreover, discovering exceptions through CPRs enhances the predictive accuracy of the classifier.
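To make the CPR structure concrete, the following is a minimal sketch rather than the paper's implementation. It assumes the usual reading of a censored production rule, "IF premise THEN decision UNLESS censor", and all names here (`CensoredProductionRule`, `holds`, the bird example) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Record = Dict[str, object]


@dataclass
class CensoredProductionRule:
    """IF premise THEN decision UNLESS censor (illustrative structure only)."""
    premise: Callable[[Record], bool]   # the commonsense condition
    decision: str                       # what the rule concludes
    censor: Callable[[Record], bool]    # the rare exception that blocks it

    def holds(self, record: Record) -> Optional[bool]:
        """True if the decision holds, False if censored, None if not applicable."""
        if not self.premise(record):
            return None
        if self.censor(record):
            return False
        return True


# Hypothetical example: birds fly unless they are penguins.
rule = CensoredProductionRule(
    premise=lambda r: r.get("is_bird") is True,
    decision="flies",
    censor=lambda r: r.get("species") == "penguin",
)

sparrow = rule.holds({"is_bird": True, "species": "sparrow"})
penguin = rule.holds({"is_bird": True, "species": "penguin"})
dog = rule.holds({"is_bird": False, "species": "dog"})
```

Keeping the censor separate from the premise means the general rule stays short and readable, while the exception is checked only when the premise already matches.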
Available from: research.ijais.org
- "There have been many applications of GAs in the field of data mining and knowledge discovery. Most of them are addressed to the problem of classification. The GAs are important when discovering association rules as the rules that GA found are usually more general because of its global search nature to discover the set of items frequency and they are less complex than other induction algorithms often used in data mining, where these algorithms usually perform a kind of local search."
ABSTRACT: The main criticism of employing genetic algorithms in data mining applications is local convergence and their long running time, particularly for large datasets with a large number of attributes. One solution to this problem is to give a filtering bias to the initial population so that attributes more relevant to prediction are initialized with higher probability than less important ones. This paper proposes a genetic algorithm with an entropy-based filtering bias to the initial population. Each attribute in the initial population is initialized with a probability inversely proportional to its entropy. Relevant attributes occurring more frequently in the initial population provide a good start for the GA to search for better-fit rules at earlier generations. The results demonstrate the efficacy and efficiency of the proposed system for automated rule mining.
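The entropy-based bias described in this abstract could look roughly like the sketch below. The abstract does not say which entropy is used or how the probabilities are scaled, so this sketch assumes the conditional entropy of the class given the attribute, a smoothing constant `eps`, and scaling so that the most relevant attribute gets probability 1. All function names and the toy data are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def conditional_entropy(values, labels):
    """H(class | attribute) in bits; lower means the attribute is more informative."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    total = len(labels)
    entropy = 0.0
    for group in groups.values():
        counts = Counter(group)
        group_entropy = -sum(
            (c / len(group)) * math.log2(c / len(group)) for c in counts.values()
        )
        entropy += (len(group) / total) * group_entropy
    return entropy


def inclusion_probabilities(columns, labels, eps=0.1):
    """Probability of each attribute appearing in an initial rule.

    Assumption: weight = 1 / (entropy + eps), scaled so the best attribute gets 1.0.
    """
    weights = {
        name: 1.0 / (conditional_entropy(values, labels) + eps)
        for name, values in columns.items()
    }
    top = max(weights.values())
    return {name: w / top for name, w in weights.items()}


def random_individual(probs, rng):
    """One initial individual: a flag per attribute saying whether it is used."""
    return {name: rng.random() < p for name, p in probs.items()}


# Toy data (hypothetical): "outlook" determines the class, "windy" does not.
columns = {
    "outlook": ["sunny", "sunny", "rain", "rain"],
    "windy": ["yes", "no", "yes", "no"],
}
labels = ["no", "no", "yes", "yes"]

probs = inclusion_probabilities(columns, labels)
individual = random_individual(probs, random.Random(0))
```

On this toy data, `outlook` has zero conditional entropy and is therefore always included, while `windy` is included only rarely.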
ABSTRACT: In recent years, Genetic Algorithms (GAs) have shown promising results in the domain of data mining. However, unreasonably long running times due to the high computational cost associated with fitness evaluations dissuade the use of GAs for knowledge discovery. In this paper we propose an enhanced genetic algorithm for automated rule mining. The proposed approach supplements the GA with an entropy-based probabilistic initialization such that the initial population has more relevant and informative attributes. Further, the GA is augmented with a memory to store fitness scores. The suggested additions have a twofold advantage. Firstly, they shrink the candidate rules’ search space, making the search more effective at evolving better-fit rules in fewer generations. Secondly, they reduce the total number of fitness evaluations required, giving rise to a gain in running time. The enhanced GA has been applied to datasets from the UCI machine learning repository and has shown encouraging results.
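The idea of a memory for fitness scores can be illustrated with a small sketch. This is not the authors' code: it assumes rules are encoded as hashable tuples and uses a plain dictionary as the memory, and the names `FitnessMemory` and `toy_fitness` are hypothetical.

```python
from typing import Callable, Dict, Hashable


class FitnessMemory:
    """Wraps a fitness function and remembers scores of rules already evaluated."""

    def __init__(self, fitness_fn: Callable[[Hashable], float]):
        self._fitness_fn = fitness_fn
        self._scores: Dict[Hashable, float] = {}
        self.evaluations = 0  # number of real (uncached) fitness evaluations

    def __call__(self, chromosome: Hashable) -> float:
        if chromosome not in self._scores:
            # Only pay the evaluation cost the first time a rule is seen.
            self._scores[chromosome] = self._fitness_fn(chromosome)
            self.evaluations += 1
        return self._scores[chromosome]


def toy_fitness(chromosome):
    """Stand-in for an expensive pass over the dataset (assumption)."""
    return float(sum(chromosome))


memory = FitnessMemory(toy_fitness)
population = [(1, 0, 1), (1, 0, 1), (0, 0, 1)]
scores = [memory(c) for c in population]
```

Because GA populations tend to contain duplicate individuals, especially in later generations, repeated rules are scored from memory instead of triggering another evaluation.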