Using Ant Programming Guided by Grammar for Building Rule-Based Classifiers

Dept. of Comput. Sci. & Numerical Anal., Univ. of Cordoba, Cordoba, Spain
IEEE TRANSACTIONS ON CYBERNETICS (Impact Factor: 6.22). 01/2012; 41(6):1585 - 1599. DOI: 10.1109/TSMCB.2011.2157681
Source: IEEE Xplore


The extraction of comprehensible knowledge is one of the major challenges in many domains. This paper presents an ant programming (AP) framework that mines classification rules easily understood by humans and can therefore support domain experts' decisions. The proposed algorithm, called grammar-based ant programming (GBAP), is the first AP algorithm developed for the extraction of classification rules, and it is guided by a context-free grammar that ensures that every new individual created is valid. To compute the transition probability of each available movement, the model uses two complementary heuristic functions rather than the single one used by typical ant-based algorithms. A niching approach is used both to select the consequent of each mined rule and to select the rules that make up the classifier. The performance of GBAP is compared against other classification techniques on 18 varied data sets. Experimental results show that the approach produces comprehensible rules and accuracy values that are competitive with or better than those of the other classification algorithms.
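
The abstract does not say how the two heuristic functions enter the transition probability. As a rough illustration only, the sketch below assumes the common ant-colony form in which each move is weighted by pheromone raised to `alpha` times heuristic raised to `beta`, and assumes that one heuristic applies to intermediate moves and the other to final moves. All names (`transition_probabilities`, `pick_heuristic`, `choose_move`, `alpha`, `beta`) are hypothetical and are not taken from the paper.

```python
import random


def pick_heuristic(move_kind, h_intermediate, h_final):
    """Assumption: one heuristic per kind of move ("intermediate" or "final")."""
    return h_final if move_kind == "final" else h_intermediate


def transition_probabilities(moves, alpha=1.0, beta=1.0):
    """Normalize pheromone^alpha * heuristic^beta over the available moves.

    moves: list of (name, pheromone, heuristic) tuples with unique names.
    Returns a dict mapping each move name to its selection probability.
    """
    weights = [(name, (tau ** alpha) * (eta ** beta)) for name, tau, eta in moves]
    total = sum(w for _, w in weights)
    if total == 0:
        # No information at all: fall back to a uniform choice.
        return {name: 1.0 / len(weights) for name, _ in weights}
    return {name: w / total for name, w in weights}


def choose_move(probs, rng=random):
    """Sample one move name according to the computed probabilities."""
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


if __name__ == "__main__":
    # Two candidate moves with equal pheromone but different heuristic values.
    eta_a = pick_heuristic("intermediate", h_intermediate=1.0, h_final=0.0)
    eta_b = pick_heuristic("final", h_intermediate=0.0, h_final=3.0)
    probs = transition_probabilities([("a", 1.0, eta_a), ("b", 1.0, eta_b)])
    print(probs, choose_move(probs))
```

In this sketch, moves with equal pheromone are chosen in proportion to their heuristic values, so the heuristic steers the ant until pheromone accumulates.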



Available from: Sebastian Ventura, Aug 21, 2014
  • Source
    • "Consequently, previously learned rules directly influence the data of the other rules. Separate-and-conquer algorithms use hill-climbing [8] [13], beam search [7] [22], best first search [16], genetic algorithms [24], ant colony optimization [15] [18], fuzzy rough set [4] [20] [25], neural networks [12] to extract rules from data. Divide-and-conquer algorithms greedily find the split that best separates data in terms of some predefined impurity measure such as information gain, entropy, Gini index, etc. "
    ABSTRACT: Separate-and-conquer rule induction algorithms, such as Ripper, solve a K > 2 class problem by converting it into a sequence of K − 1 two-class problems. As a usual heuristic, the classes are fed into the algorithm in order of increasing prior probability. Although the heuristic works well in practice, there is much room for improvement. In this paper, we propose a novel approach to improve this heuristic. The approach transforms the ordering search problem into a quadratic optimization problem and uses the solution of the optimization problem to extract the optimal ordering. We compared the new Ripper (guided by the ordering found with our approach) with the original Ripper (guided by the heuristic ordering) on 27 datasets. Simulation results show that our approach produces rulesets that are significantly better than those produced by the original Ripper.
    Full-text · Article · Mar 2015 · Pattern Recognition Letters
  • Source
    • "The aim of data mining (DM) is to extract non-trivial information and knowledge hidden in data. DM is broken down into two main categories, unsupervised [1] and supervised [2] tasks. Unsupervised tasks include approaches that explore the data to find some intrinsic structures in them, so these tasks have a descriptive nature [3]. "
    ABSTRACT: This paper proposes a novel grammar-guided genetic programming algorithm for subgroup discovery. This algorithm, called comprehensible grammar-based algorithm for subgroup discovery (CGBA-SD), combines the requirements of discovering comprehensible rules with the ability to mine expressive and flexible solutions owing to the use of a context-free grammar. Each rule is represented as a derivation tree that shows a solution described using the language denoted by the grammar. The algorithm includes mechanisms to adapt the diversity of the population by self-adapting the probabilities of recombination and mutation. We compare the approach with existing evolutionary and classic subgroup discovery algorithms. CGBA-SD appears to be a very promising algorithm that discovers comprehensible subgroups and behaves better than other algorithms, as measures of complexity, interest, and precision indicate. The results obtained were validated by means of a series of nonparametric tests.
    Full-text · Article · Dec 2014 · Cybernetics, IEEE Transactions on
  • Source
    • "Concerning the classification task of DM, the GBAP algorithm [37] was the first AP algorithm for mining classification rules. GBAP is founded on the use of a CFG that restricts the search space, which adopts the shape of a derivation tree. "

    Full-text · Article · Sep 2014 · International journal of hybrid intelligent systems