Figure 3 - uploaded by Mikel Niño


# 1: The left figure shows the true diagonal decision boundary and a staircase approximation to it, of the kind created by decision tree algorithms. The right figure shows the voted decision boundary, which is a much better approximation to the diagonal boundary.

Source publication

Mikel Niño. Abstract: In this work we apply three different methods, one of them original, for the combination of classifiers to two real databases in Oncology. The methodologies of ensemble classifiers are applied to the problems of predicting survival of people after one, three and five years of being diagnosed as having malignant skin melanoma an...

## Contexts in source publication

**Context 1**

... could see this easily in the case of decision tree algorithms like ID3 or C4.5: a decision boundary is a surface such that examples that lie on one side of the surface are assigned to a different class than examples that lie on the other side. If the true boundary between two classes is a diagonal line, then decision tree algorithms can only approximate that diagonal by a "staircase" of axis-parallel segments (Figure 3.1). Also, Naive Bayes is found to be no better than random guessing for the "cross" uniformly distributed data (Figure 3.2). ...
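The staircase effect can be reproduced with a short sketch (an illustrative setup, not the paper's experiment): a depth-limited decision tree fit to data split by a diagonal boundary can only cut parallel to the axes, so its fit is imperfect near the diagonal.

```python
# Sketch: an axis-parallel decision tree can only approximate a diagonal
# boundary with a "staircase" of rectangular regions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(2000, 2))
y = (X[:, 1] > X[:, 0]).astype(int)  # true boundary is the diagonal x2 = x1

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
# A shallow tree has at most 8 axis-parallel leaf regions, so points close
# to the diagonal are misclassified and training accuracy stays below 1.0.
print(round(tree.score(X, y), 2))
```

Increasing `max_depth` refines the staircase but never makes it a true diagonal, which is the motivation for the voted boundary in the figure.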

**Context 2**

... the true boundary between two classes is a diagonal line, then decision tree algorithms can only approximate that diagonal by a "staircase" of axis-parallel segments (Figure 3.1). Also, Naive Bayes is found to be no better than random guessing for the "cross" uniformly distributed data (Figure 3.2). This is because the probabilities of class y1 and class y2 are about the same for any value of attribute x1 or x2. ...
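A minimal sketch of this failure mode, assuming an XOR-style pattern as a stand-in for the "cross" data (both share the key property that each attribute's marginal distribution carries no class information, so the independence assumption gives Naive Bayes nothing to work with):

```python
# Sketch: Naive Bayes near chance level when neither attribute alone
# distinguishes the classes (class depends only on the x1-x2 interaction).
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(5000, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)  # class set by the sign of x1 * x2

# Per class, x1 and x2 are each uniform on [-1, 1]: identical marginals,
# so the per-attribute likelihoods give no hint about the class.
nb = GaussianNB().fit(X, y)
print(round(nb.score(X, y), 2))  # close to 0.5: no better than random guessing
```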

**Context 3**

... a result, the attribute independence assumption for x1 and x2 provides no hint about the relative probability of each class. Lu [17] cites 3 configurations for combining classifiers: cascaded, parallel and hierarchical (Figure 3.3). Some of their features are: in a cascaded system, the classification results generated by one classifier are often used to direct the classification processes of successive classifiers. ...
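The cascaded configuration can be sketched as follows (a hypothetical two-stage cascade for illustration, not Lu's system): the first classifier's confidence directs whether an example is passed on to the second, typically more expensive, classifier.

```python
# Sketch of a cascade: stage 1 decides confident cases itself and routes
# the rest to stage 2. Models and threshold are illustrative choices.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, y_tr, X_te, y_te = X[:700], y[:700], X[700:], y[700:]

stage1 = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
stage2 = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# Stage 1's output directs the process: only low-confidence examples
# reach stage 2, mirroring the cascaded configuration described above.
conf = stage1.predict_proba(X_te).max(axis=1)
confident = conf >= 0.9
pred = np.where(confident, stage1.predict(X_te), stage2.predict(X_te))
print(round((pred == y_te).mean(), 2))
```

In the parallel configuration, by contrast, all classifiers see every example and their outputs are merged, e.g. by voting.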

## Similar publications

The Performance Index (PI) was developed as a comprehensive criterion measure of unit performance for which the unit leader could be held responsible. The basic PI structural model has been developed to explain how the various latent unit performance dimensions comprising organisational unit performance are structurally inter-related. Preliminary r...

We present a model for rating international football teams. Using data from 1944 to 2016, we ask 'which was the greatest team?'. To answer the question requires some sophisticated methodology. Specifically, we have used k-fold cross-validation, which allows us to optimally down-weight the results of friendly matches in explaining World Cup results....

## Citations

... ers. The other reason is that the ensemble can only be more accurate than its base classifiers when the base classifiers disagree with each other [33], and in this case the ensemble's performance exceeds the best accuracies of the individual base classifiers. For example, the classification accuracy of SRE for BRCA2 mutation is better than the best accuracies of the other models because those models gave different class results when classifying the same sample. ...
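The role of disagreement can be checked with a back-of-the-envelope calculation: three base classifiers that are each 70% accurate gain from majority voting only if their errors are not fully correlated.

```python
# Sketch: majority vote of three classifiers, each with accuracy p = 0.7.
p = 0.7

# Independent errors: the vote is correct when at least 2 of 3 are correct.
independent_vote = p**3 + 3 * p**2 * (1 - p)

# Fully correlated errors (classifiers never disagree): no gain at all.
fully_correlated_vote = p

print(independent_vote, fully_correlated_vote)  # 0.784 vs 0.7
```

This is the standard argument for why ensemble members must disagree (have partly independent errors) for combination to help.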

Design of a machine learning algorithm as a robust class predictor for various DNA microarray datasets is a challenging task, as the number of samples is very small compared to the thousands of genes (features). For such datasets, a class prediction model can be very successful in classifying one type of dataset but may fail to perform similarly on other datasets. This paper presents a stacked regression ensemble (SRE) model for cancer class prediction. Results indicate that SRE provides performance stability across various microarray datasets. The performance of SRE has been cross-validated using the k-fold (leave-one-out) cross-validation technique for BRCA1, BRCA2 and sporadic classes on ovarian and breast cancer microarray datasets. The paper also presents comparative results of SRE against the commonly used SVM and GRNN. Empirical results confirmed that SRE demonstrates better performance stability than SVM and GRNN for the classification of assorted cancer data.
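Leave-one-out cross-validation as used above can be sketched like this (assumed toy data, not the paper's microarray sets): with k equal to the number of samples, each sample is held out exactly once, which suits the small-sample setting the abstract describes.

```python
# Sketch: leave-one-out cross-validation on a deliberately small sample,
# mimicking the few-samples / many-features microarray situation.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X, y = X[:60], y[:60]  # keep only 60 samples to mimic a small dataset

# One fit per sample: each fold trains on 59 samples and tests on the
# single held-out sample, so len(scores) equals the number of samples.
scores = cross_val_score(SVC(), X, y, cv=LeaveOneOut())
print(len(scores), round(scores.mean(), 2))
```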


Combining the predictions of a set of classifiers has been shown to be an effective way to create composite classifiers that are more accurate than any of the component classifiers. There are many methods for combining the predictions given by component classifiers. We introduce a new method that combines a number of component classifiers using a Bayesian network as a classifier system over the component classifiers' predictions. Component classifiers are standard machine learning classification algorithms, and the Bayesian network structure is learned using a genetic algorithm that searches for the structure that maximises the classification accuracy given the predictions of the component classifiers. Experimental results have been obtained on a datafile of cases containing information about ICU patients at the Canary Islands University Hospital. The accuracy obtained using the presented new approach statistically improves on that obtained using standard machine learning methods.

This paper presents decision-based fusion models to classify BRCA1, BRCA2 and sporadic genetic mutations for breast and ovarian cancer. Different ensembles of base classifiers using the stacked generalization technique have been proposed, including support vector machines (SVM) with linear, polynomial and radial basis function kernels. A generalized regression neural network (GRNN) is then applied to predict the mutation type based on the outputs of the base classifiers, and experimental results show that the proposed fusion methodology for selecting the best and removing weak classifiers outperforms single classification models.
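The stacked generalization setup described above can be sketched roughly as follows (a simplified stand-in: a logistic-regression meta-learner replaces the paper's GRNN, and a toy dataset replaces the mutation data). Base classifiers are SVMs with the three kernels the abstract names; the meta-learner is trained on their out-of-fold predictions.

```python
# Sketch of stacked generalization: kernel SVMs as base classifiers, a
# simple meta-learner combining their predictions (GRNN stand-in).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Base level: one scaled SVM per kernel mentioned in the abstract.
base = [(k, make_pipeline(StandardScaler(), SVC(kernel=k)))
        for k in ("linear", "poly", "rbf")]

# Meta level: trained on cross-validated base predictions (stacking).
stack = StackingClassifier(estimators=base,
                           final_estimator=LogisticRegression(max_iter=1000))
stack.fit(X_tr, y_tr)
print(round(stack.score(X_te, y_te), 2))
```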