
# Classifier Ensemble - Science topic

Explore the latest questions and answers in Classifier Ensemble, and find Classifier Ensemble experts.

## Questions related to Classifier Ensemble

Can anyone suggest ensembling methods for the outputs of pre-trained models? Suppose there is a dataset containing cats and dogs, and three pre-trained models are applied: VGG16, VGG19, and ResNet50. How would you apply ensembling techniques such as bagging, boosting, or voting?
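Since the base networks are already trained, bagging and boosting (which retrain the learners) do not apply directly; the usual option is voting over the models' outputs. A minimal sketch, assuming hypothetical class-probability arrays already produced by the three networks:

```python
import numpy as np

# Hypothetical predicted probabilities (cat, dog) from three pre-trained
# models for the same two test images -- the actual values would come from
# each model's softmax output.
probs_vgg16    = np.array([[0.90, 0.10], [0.40, 0.60]])
probs_vgg19    = np.array([[0.80, 0.20], [0.30, 0.70]])
probs_resnet50 = np.array([[0.60, 0.40], [0.55, 0.45]])

# Soft voting: average the probabilities, then take the argmax.
avg = (probs_vgg16 + probs_vgg19 + probs_resnet50) / 3.0
soft_vote = avg.argmax(axis=1)

# Hard (majority) voting: each model casts one label vote per image.
votes = np.stack([p.argmax(axis=1)
                  for p in (probs_vgg16, probs_vgg19, probs_resnet50)])
hard_vote = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

print(soft_vote, hard_vote)   # -> [0 1] [0 1]
```

Soft voting usually works better when the models are well calibrated, because it preserves each model's confidence; hard voting only counts labels.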

I have implemented a stacking classifier using Decision Tree, kNN, and Naive Bayes as base learners and Logistic Regression as the meta-classifier (final predictor). Stacking has increased accuracy compared with the individual classifiers. The problem is multiclass (6 classes) with a categorical target (target = activity performed by the user, e.g. walking, running, standing, ... on the UCI-HAR dataset). Now I am unable to understand:

**1. How does Logistic Regression work on the outputs/predictions of the base-level classifiers?**

**2. What will the final output of Logistic Regression be if model 1 predicts class 1, model 2 predicts class 2, and model 3 predicts class 3 (e.g. DT: running; kNN: walking; NB: standing)? How will logistic regression decide the final output?** If possible, kindly explain why and how.
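For question 1, scikit-learn's `StackingClassifier` illustrates the mechanism: the meta-level logistic regression is trained not on hard labels but on the base learners' class-probability vectors, so when the three models disagree it weighs how confident each one is rather than just counting votes. A small sketch on synthetic 6-class data (a stand-in for UCI-HAR, which is not bundled with sklearn):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic 6-class stand-in for the HAR activity-recognition problem.
X, y = make_classification(n_samples=300, n_features=10, n_informative=6,
                           n_classes=6, random_state=0)

stack = StackingClassifier(
    estimators=[('dt', DecisionTreeClassifier(random_state=0)),
                ('knn', KNeighborsClassifier()),
                ('nb', GaussianNB())],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5)   # out-of-fold base predictions are used to train the meta-model
stack.fit(X, y)

# The logistic regression never sees hard labels: by default it receives
# the concatenated class-probability vectors of the base learners,
# i.e. 3 models x 6 classes = 18 meta-features per sample.
print(stack.transform(X[:1]).shape)   # (1, 18)
```

So in the DT=running / kNN=walking / NB=standing tie scenario, the meta-model's learned coefficients over those 18 probability columns decide the winner; a base learner that is historically more reliable (or more confident) for a class effectively gets more weight.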

I was building a 3-class classification model for early detection of cracks in ball bearings. The dataset is limited: 120 rows and 14 features. The classifiers and their parameters are listed below. Can you please suggest which model would be best (considering not only accuracy but also model complexity)?

What are the best techniques for geospatial datasets? Also, are some techniques better suited for stacking of models than for use as a single model?

How do I perform cross-validation to prepare the input for the next-level classifier in a multi-layer stacking classifier in mlxtend?

Thanks in advance.
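In mlxtend, `StackingCVClassifier(cv=...)` performs this step internally, but the underlying mechanism - out-of-fold predictions as next-level features - can be sketched with scikit-learn's `cross_val_predict` (model choices here are illustrative):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
base_models = [DecisionTreeClassifier(random_state=0), KNeighborsClassifier()]

# Out-of-fold probabilities: every sample is predicted by a model that never
# saw it during training, which prevents label leakage into the next level.
meta_features = np.hstack([
    cross_val_predict(m, X, y, cv=5, method='predict_proba')
    for m in base_models])

# These out-of-fold columns are the input of the next-level classifier.
meta_clf = LogisticRegression(max_iter=1000).fit(meta_features, y)
print(meta_features.shape)   # (150, 6): 2 models x 3 classes
```

For a third level, the same recipe is applied again with the level-2 model's out-of-fold predictions as features.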

Assume that, after running some hyperparameter optimization technique based on the training data (possibly using cross-validation to guide the search), there are *M* **best models** available to create an ensemble. How can the performance of the method be assessed?

Some thoughts:

- The simplest way would be to select just one set of M models (resulting from a single optimization run), create N replications (i.e., random train/test splits), and compute the statistics. I believe this is a highly biased approach, as it only considers the randomness of the ensemble creation, not the randomness of the optimization technique.
- A second alternative would be to create N replications and, for each of them, run the entire method (from hyperparameter optimization to ensemble creation). Then we could look at the statistics for the N measures.
- The last alternative I can think of is to create N replications and, for each of them, (1) run the optimization algorithm and (2) create K ensembles. In the end, we could extract statistics from all K*N measures.
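The second alternative can be sketched as follows, with synthetic data, a grid search as the stand-in optimizer, and majority voting as the stand-in ensemble (all names and sizes are illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=10, random_state=0)

M = 2          # ensemble members kept per run
scores = []
for rep in range(5):                    # N replications (use more in practice)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=rep)

    # (1) hyperparameter optimization redone inside every replication
    search = GridSearchCV(DecisionTreeClassifier(random_state=rep),
                          {'max_depth': [2, 4, None]}, cv=3).fit(X_tr, y_tr)

    # (2) ensemble built from the M best configurations of this run,
    #     scored on the replication's held-out split
    best = np.argsort(search.cv_results_['rank_test_score'])[:M]
    members = [(f'm{i}',
                DecisionTreeClassifier(random_state=rep,
                                       **search.cv_results_['params'][i]))
               for i in best]
    scores.append(VotingClassifier(members).fit(X_tr, y_tr).score(X_te, y_te))

print(f'accuracy: {np.mean(scores):.3f} +/- {np.std(scores):.3f}')
```

The N scores then capture both sources of randomness; the third alternative would simply add an inner loop building K ensembles per replication.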

Since I wasn't able to find good references on this specific issue, I hope you can help me :)

Please, cite published work that supports your answer.

Thank you.

When applying random forest classifiers for 3D image segmentation, what is the best practice for dealing with the large size of the images in the training and test steps? Is it common practice to resize (downsample) the images?
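Downsampling before per-voxel feature extraction is indeed common. A minimal sketch with `scipy.ndimage.zoom` on a hypothetical volume (sizes and factors are illustrative); the key detail is using nearest-neighbour interpolation for the label map so no new class values are invented:

```python
import numpy as np
from scipy.ndimage import zoom

volume = np.random.rand(128, 128, 64)                # hypothetical 3D scan
labels = (np.random.rand(128, 128, 64) > 0.5).astype(np.uint8)

# Downsample by 2 along every axis: linear interpolation (order=1) for
# intensities, nearest-neighbour (order=0) for the label map.
small_vol = zoom(volume, 0.5, order=1)
small_lab = zoom(labels, 0.5, order=0)
print(small_vol.shape, small_lab.shape)   # (64, 64, 32) each
```

Alternatives to whole-volume downsampling are training on random patches or on a subsample of voxels, which keep full resolution while bounding memory.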

I understand how a Random Forest works when there are two choices: if the apple is red, go left; if the apple is green, go right; and so on.

But in my case the data are text features. I trained the classifier on training data, and I would like to understand in depth how the algorithm splits a node: based on what? The tf-idf weight, or the word itself? In addition, how does it predict the class for each example?

I would really appreciate a detailed explanation with a text example.

The task involves predicting a binary outcome in a small data set (sample sizes of 20-70) using many (>100) variables as potential predictors. The main problem is that the number of predictors is much larger than the sample size, and there is limited or no knowledge of which predictors may be more important than others. It is therefore very easy to "overfit" the data - i.e., to produce models which seemingly describe the data at hand very well but in fact include spurious predictor variables.

I tried an ensemble classification method called randomGLM (see http://labs.genetics.ucla.edu/horvath/htdocs/RGLM/#tutorials), which seeks to improve on AICc-based GLM selection using the "bagging" approach taken from random forests. I checked the results by K-fold cross-validation and ROC curves, and they seemingly look good - e.g., a GLM containing only those variables which were used in >=30 out of 100 "bags" produced a ROC curve AUC of 87%.

However, I challenged these results with the following test: several "noise" variables (formulas using random numbers from the Gaussian and other distributions) were added to the data, and the randomGLM procedure was run again. This was repeated several times with different random values for the noise variables. The noise variables actually attained non-negligible importance - i.e., they "competed" fairly strongly with the real experimental variables and were sometimes selected in as many as 30-50% of the random "bags".

To "filter out" these nonsense variables, I tried discarding all variables whose correlation coefficient was not statistically significantly different from zero (with Bonferroni correction for multiple variables) and running randomGLM on the retained variables only. This works (I checked it with simulated data), but it is of course very conservative on real data - almost all variables are discarded, and the resulting classification is poor.

What would be a better way to eliminate noise variables when using ensemble prediction methods like randomGLM in R? Thank you in advance for your interest and comments!
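One common alternative to significance filtering is the shadow-feature idea behind the Boruta method: compare each real variable's importance against permuted copies of the columns, which are noise by construction, and keep only variables that beat the best shadow. randomGLM itself is an R package, so this is only a hedged Python sketch of the idea, with a random forest standing in for the ensemble:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=60, n_features=20, n_informative=5,
                           random_state=0)

# Shadow features: each column shuffled independently, destroying any
# relationship with y while preserving the marginal distribution.
shadows = np.column_stack([rng.permutation(X[:, j])
                           for j in range(X.shape[1])])

rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(np.hstack([X, shadows]), y)

imp = rf.feature_importances_
threshold = imp[20:].max()        # best importance any pure-noise column got
kept = np.where(imp[:20] > threshold)[0]
print("retained variables:", kept)
```

This is essentially a data-driven version of the noise-variable challenge described above, turned into a selection rule; it is typically less conservative than Bonferroni-corrected univariate filtering because the threshold adapts to the data.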

Hello,

Please, I need to know about the following questions regarding the feature selection problem in intrusion detection:

- This problem consists of selecting the most relevant feature subset for each class (here we have 5 classes, so the result will be 5 subsets). Then these subsets (needless to say, one at a time) are used to train a classifier.

Is it necessary to use binary classifiers for this purpose (combining 5 binary classifiers for the 5 classes)? And if we use a multiclass classifier trained on data containing only the features relevant for a particular class, will this affect the classifier's performance?

- What is the influence of the distribution of instances in the training data sets (the number of instances of each class) on the feature selection's performance or on the classification performance?

Thanks.
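On the first question: a common arrangement is one-vs-rest, i.e. one binary classifier per class, each trained on its own feature subset selected against the "this class vs. rest" labels, with the final label taken from the most confident binary model. A sketch on synthetic data (all sizes and model choices are illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in for a 5-class intrusion-detection problem.
X, y = make_classification(n_samples=500, n_features=30, n_informative=10,
                           n_classes=5, random_state=0)

# One binary classifier per class, each with its OWN feature subset
# selected against the "this class vs. rest" labels.
classifiers = []
for c in range(5):
    y_bin = (y == c).astype(int)
    clf = make_pipeline(SelectKBest(f_classif, k=8),
                        LogisticRegression(max_iter=1000))
    classifiers.append(clf.fit(X, y_bin))

# Final decision: the class whose binary model is most confident.
scores = np.column_stack([c.predict_proba(X)[:, 1] for c in classifiers])
pred = scores.argmax(axis=1)
print("training accuracy:", (pred == y).mean())
```

A single multiclass model trained on only one class's relevant features would indeed be handicapped on the other classes; the one-vs-rest combination avoids that because each model only has to separate its own class.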

Please give your replies with valid references.

An ensemble of classifiers (EOC) combines a set of diverse classifiers to solve a classification problem. If we combine many classifiers, what kinds of limitations might I encounter? Concept drift would be one issue in EOC formulation; the computational cost of training all the diverse individual classifiers would be another.

Can anyone please suggest some resources regarding these limitations, and some ways we might resolve them?