Machine Learning

  • Khadidja Belattar added an answer:
    How do I compute the quality score for dermatological images?
    What are the criteria that I should include in a dermatological image's quality score? How do you classify them as good or bad quality?
    Khadidja Belattar · Université Constantine 2

    Hi Mr. Dominik Lenz, here are some examples of different types of artifacts in dermatological images: too much hair, underexposure, line markers, crusts, and black frames.

  • Ahmed Hamed added an answer:
    What is a random forest?

    Please provide MATLAB code and links to related papers. I need to understand them for the paper.

    Ahmed Hamed · Suez Canal University

    For MATLAB code, you may refer to this video

  • Murugappan M added an answer:
    Would you like to join a collaborative network for neuromarketing research?
    We are presently working on some of the projects related to neuromarketing. If any researcher would like to join with us, please let me know.
    Murugappan M · Universiti Malaysia Perlis

    Dear All

    Thank you for your interest in my invitation, and sorry for the late reply; I have been travelling continuously for the past few weeks. Let me come to the point.

    I have some experience in analyzing EEG signals for neuromarketing work. The main aim is to understand human emotional responses, i.e., like (happy, surprise) and dislike (disgust, fear, anger, and sadness), while people watch marketing or product advertisements presented as audio-visual stimuli.

    My experiments on product advertisements for automobiles, communication, food, and computers gave noticeable results in understanding people's like or dislike behavior through EEG signals.

    I made some mistakes in framing the protocol and now have some solid ideas for developing an efficient one.

    Limitations on continuing this work: no funding sources are available, and data from multi-ethnic participants is highly important for this work, i.e., we need to develop a database.

    Expectations: I need support through continuous discussion on revising the protocol, EEG data analysis, research manuscript writing, and other tasks.

    Benefits: we can share our knowledge, enhance our skills in EEG-based neuromarketing, and enjoy the benefits of publications (authorship).

    This is the first project that I would like to do with international researchers. I do not have any rules or framework for starting it up. If you have any, please let me know.

    The things I would like to know from you:

    I have explained most things here. What do you think? Please let me know. If we are aligned on one objective, let us start together.

    This is basically multi-disciplinary work; each one of us is a main player in this project and will equally enjoy the benefits.

    I do not know much about your background, the way you can contribute to this work, or the support that you can give to the project.

  • Bojan Ploj asked a question:
    What advantages does backpropagation have over the border pairs method?

    In the linked article, a border pairs method is described that has numerous advantages over the backpropagation algorithm.

    Has anyone observed any deficiencies?

  • Antonio Marco added an answer:
    How reliable are the prediction scores provided by miRNA target prediction tools?

    There are many computational miRNA target prediction tools, like Diana-MicroT, TargetScan, Targetminer, etc. They all provide some kind of prediction score(s) depending on target site features (e.g. seed complementarity, sequence conservation, free energy of binding, AU content, accessibility of binding sites, etc.) for each predicted miRNA-mRNA interaction. However, I have read some papers in which the authors, when integrating the prediction results of these tools for their miRNA of interest, tend to ignore the prediction scores and just check whether a particular miRNA-mRNA interaction is predicted by at least 2 or 3 of the tools they selected, and on that basis assume that the interaction is reliable. Why? I don't understand.

    My questions are 

    1. Is it OK to ignore prediction scores and consider only overlapping results to conclude that a miRNA-mRNA interaction is reliable, if we are integrating the prediction results of, let's say, 2 or more prediction tools? Or

    2. Is there any tool that integrates the prediction results of several prediction tools and their scores in some way and provides a single score to rank miRNA-mRNA interactions?

    Thank you all for your answers.

    Antonio Marco · University of Essex

    The Bioinformatics paper that Andreas mentions is, from my point of view, quite important. A main conclusion is that you should NOT use combined predictions at all. My advice is to run your analyses with different prediction algorithms separately.

  • Chandrashekhar Azad added an answer:
    Can anyone help with data mining using Keel?

    Data mining using Keel

    If someone has experience with the KEEL data mining tool, I need help with the following queries.

    1) I run a data mining method using the C4.5 decision tree. It runs correctly and generates a JAR file.

    2) I then execute RunKeel.jar, but it doesn't display any output or result.

    3) How can I tell whether execution has completed when RunKeel.jar is executed?

    Please help.

    Chandrashekhar Azad · Birla Institute of Technology, Mesra

    Thanks All.... :D

  • Lionel Prevost added an answer:
    How to measure AUC for binary clustering when both clusters share some items?

    I have two clusters A and B which share some items. When I apply test data on these clusters, I don't know whether common items must be considered for both clusters or only for one cluster?

    Lionel Prevost · Université des Antilles et de la Guyane

    When classifying test data, hard clustering (e.g., K-means) outputs a distance (to each cluster), while soft clustering outputs a posterior probability.

    Whatever the output, you just have to threshold it, using a range of different thresholds.

    Then, for each threshold, you compute performance measures (detection and false alarm rates, for example), build the ROC curve, and evaluate the AUC.
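
    As a rough illustration of the thresholding procedure above, here is a minimal Python sketch. The scores and labels are made-up values, and the function names `roc_points` and `auc` are chosen only for this example. It assumes that a higher score means "more likely to belong to the cluster", so a distance from K-means would have to be negated first.

```python
def roc_points(scores, labels):
    """Return (false_alarm_rate, detection_rate) pairs, one per threshold.

    scores: higher means "more likely positive" (e.g. a posterior probability,
            or a negated distance to the cluster centre).
    labels: 1 for positive items, 0 for negative items.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is detected
    # Use each distinct score as a threshold, from strictest to loosest.
    for threshold in sorted(set(scores), reverse=True):
        detected = [s >= threshold for s in scores]
        tp = sum(1 for d, y in zip(detected, labels) if d and y == 1)
        fp = sum(1 for d, y in zip(detected, labels) if d and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve, computed with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical posterior probabilities for one cluster and the true memberships.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
print(curve)
print(auc(curve))  # 0.8125 for these made-up values
```

    For items shared by both clusters, one option under this scheme is to score them against each cluster separately and let the threshold decide, but that is a design choice rather than a rule.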

  • Emmanuel Delaleau added an answer:
    Does anyone have intuitive explanations for matrix ranks and eigenvectors as well as their relationships?
    The mathematical definitions of matrix rank and eigenvectors are everywhere, but they do not seem intuitive. Can anyone provide intuitive and simple explanations for them?
    Emmanuel Delaleau · Institut Supérieur de l'Electronique et du Numérique

    I do not really understand the question. Algebra is in essence "intuitive": it allows one to replace (sometimes cumbersome) calculations with concepts!

    In the case of matrix rank: you can introduce it as the result of an algorithm applied to a matrix, but then you may struggle to understand what it really is. The definition, however, does not depend on any calculation, and that is what lets you understand the concept.

    The same holds for all the other concepts: eigenvalues, dimension...

  • Guiping Liu added an answer:
    Which is the quickest to learn and best: R or MATLAB for machine learning?
    I am a second-year PhD student involved in machine learning (ML) based program analysis. I am currently using Python for my coding and scripting tasks. Of late, applying (programming and interpreting) ML algorithms has become frequent and important in my research. Hence, I have decided to learn and stick to one language for my ML requirements. I am unsure whether to choose R or MATLAB. I have the following constraints: (1) Quick to learn: I have already lost some time in my research so far, so I need to pick up the language really quickly. (2) Large community: I don't work in a team and I do all my coding alone, so I rely on communities (like RG and Stack Overflow) for most of my programming queries. I'd therefore like to choose a language that has large community support. (3) Soft constraint - good-looking plots for papers: I will often need to use plots and results from my programs in my papers, so a language that offers decent-looking plots would be good for me. Please suggest whether to choose R or MATLAB based on these requirements. Also, if you have additional justifications for your choice, that would be helpful.
    Guiping Liu · University of British Columbia - Vancouver

    I recommend R, too. R is not that hard to learn, and it offers decent graphs, cutting-edge statistics, and growing communities.

  • Parminder Singh Reel added an answer:
    Can anyone help me get an OCT & SD-OCT image database in order to proceed with my research?
    Fundus image databases are available, but I need OCT retinal images and I am unable to find such a database.
    Parminder Singh Reel · The Open University (UK)

    The best step forward would be to find papers related to this field and see whether they used publicly available datasets or databases. Otherwise, you can always email the authors to request the images they used.

  • Nils Goerke added an answer:
    What are your suggestions for Clustering Binary Categorical data?

    I have some data with certain classes (c1, c2, c3, c4, ...), and the data consists of binary vectors where 1 and 0 denote whether or not an entry belongs to a class. The number of classes will be > 200.

    Would this data come under "Categorical" type?

    I tried PCA for dimension reduction on this dataset and even got good clusters with DBSCAN, but I have read that PCA is not recommended for sparse categorical data and that Euclidean distance is not a good distance measure for it.

    I am planning to use MCA (Multiple Correspondence Analysis), but I cannot figure out how I am supposed to represent the data for that.

    Please find attached the link to a snapshot of the clusters that I got after PCA and DBSCAN.

    Nils Goerke · University of Bonn

    Dear Animesh,

    Is the data vector a one-out-of-N coding,
    or can an item belong to more than one class?

    As mentioned before (in the answer by C. Bauckhage), the possible configurations of an N-dimensional binary vector are the corners of an N-dimensional hypercube.
    There are 2^N corners available.

    In case you have a strict one-out-of-N coding, only the origin and the N points on the axes with exactly one of the N bits set are available as possible solutions.
    If all N axes (all N components of the N-dimensional binary vector) are equivalent, then the term "clustering" is somewhat useless, because all data points have exactly the same distance to each other.

    If an item can belong to more than one class, you will have (in principle) up to the full 2^N possible corners.
    You mentioned that N > 200.
    Do you have enough data to do statistics (like clustering) in such a large space?

    If not, try to identify the components of your N-dimensional vector that tend to be set together (are highly correlated), use these as the working space, and treat the other bits as random noise.
    (I know this is not a really elegant way to reduce dimensionality, but it sometimes yields surprisingly good results.)
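
    The point about strict one-out-of-N coding can be checked with a small Python sketch. It uses N = 5 purely for illustration, and the helper names `one_hot` and `euclidean` are invented for this example. Every pair of distinct one-hot vectors turns out to be the same Euclidean distance apart, which is why clustering has nothing to work with in that case.

```python
from itertools import combinations
from math import sqrt


def one_hot(index, n):
    """N-dimensional binary vector with only bit `index` set."""
    return [1 if i == index else 0 for i in range(n)]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


n = 5  # small N for illustration; the question has N > 200
vectors = [one_hot(i, n) for i in range(n)]

# Collect the distinct pairwise distances between different one-hot vectors.
distances = {euclidean(u, v) for u, v in combinations(vectors, 2)}
print(distances)  # a single value: every pair is sqrt(2) apart
```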


    Nils Goerke

  • Justin Ku added an answer:
    Can you help me with a natural language processing text tagging problem?

    I have a project in which I need to tag news items with the names of the companies to which they are relevant (company names are mentioned in the news items and, in many cases, in the headline). I have about 2000 news items tagged with company names and a relevance level (High/Low) [this is done manually]. I have the following fields:

    story_ID, Headline, story_Text, company_name, relevance_level

    I need to automate the procedure, so I need to tag incoming news items with company names and their relevance.

    Note: 1. Some of the news items are not relevant to any company, so these are not tagged.

    2. Some of the news items are relevant to multiple companies, so these are tagged with multiple company names and their corresponding relevance levels.


    I am wondering what machine learning algorithms we can use. I am very new to Natural Language Processing. So I am not able to get a handle on how to go about solving the problem. (So far, I have used various techniques in machine learning, but there each row (observation) of the data matrix has only 1 label). 

    Any help would be greatly appreciated. 

    Thank you.

    Justin Ku · Lawrence Technological University

    Try GATE, UIMA, TextMarker, OpenNLP, Mathematica, etc. Each one has its strengths and weaknesses.

  • Gabriele Scheler added an answer:
    What is the difference between cause/effect and feature vector/label?

    I participated in a contest a while ago in which the goal was to determine the cause of an effect: find the regularity between causes and effects in a training set, then use it to classify the test data.

    I wonder whether this is the same as a usual classification task or whether it differs somehow?

    Thanks in advance.

  • Naveen Ramachandran added an answer:
    How can we use a support vector machine for multispectral data classification?

    I am trying to use SVM for land cover classification from multispectral data. How can I proceed?

    Naveen Ramachandran · Indian Institute of Technology Kanpur

    Thank you for your suggestion, Paul. I am planning to perform feature extraction before using SVM.

  • Florent Masseglia added an answer:
    What kind of patterns can be obtained through temporal data mining? Are there available databases for testing new methods?
    I've done some research on the topic and, so far, I have identified that the following patterns have been thoroughly explored:

    1) Sequence mining. Example: "A-B-C-D happened in 10% of the database". This can also include a time constraint between events.

    2) Temporal association rules: "A,B,C->D (30%, 20%) between 7am and 10am". In this case, the temporal information is used to "slice" the database into n parts, which are then used to extract traditional association rules.

    My first impression is that many recent researchers are focused on efficiently performing either 1 or 2, or on porting them to data stream environments. I am wondering if there are any other patterns that can be obtained from timestamped data, for instance, "A,B->C (20%, 30%)", such that A and B happen in any order in a time window of at least 10 hours and at most 20 hours before C. If nothing similar exists, is it worth the effort to develop a new data mining method to extract patterns like this?

    Also, if you know of any open datasets with the following characteristics, please let me know. I've tried the UCI repository, but with no success so far. Events with their respective timestamps for many individuals or sensors. Example:

    1  A (2012-02-20)  B (2012-03-23)  C (2013-01-20)
    2  B (2003-04-30)  D (2004-03-20)
    3  B (2010-09-10)  A (2010-10-01)  C (2010-10-02)
    Florent Masseglia · National Institute for Research in Computer Science and Control

    The data set you describe really corresponds to the problem of sequential pattern mining (either from static or streaming data). The case you describe (patterns where events occur within a time window or are separated by a gap) is known as "time constraints" in the world of sequential patterns. The first paper on this topic is GSP ( ), where the authors introduce the notions of sliding window, min-gap and max-gap.
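
    To make the min-gap/max-gap idea concrete, here is a small brute-force Python sketch. It is not the GSP algorithm itself, only a direct check of one time-constrained pattern against the three example sequences from the question. The function names (`matches`, `support`, `day`) and the 1-to-30-day constraint are assumptions made up for the illustration.

```python
from datetime import datetime, timedelta


def matches(events, antecedent, consequent, min_gap, max_gap):
    """True if some occurrence of `consequent` is preceded by every item of
    `antecedent`, in any order, each between min_gap and max_gap earlier."""
    for label, t_c in events:
        if label != consequent:
            continue
        seen = {
            a for a, t_a in events
            if a in antecedent and min_gap <= t_c - t_a <= max_gap
        }
        if seen == set(antecedent):
            return True
    return False


def support(database, antecedent, consequent, min_gap, max_gap):
    """Fraction of sequences in which the time-constrained pattern occurs."""
    hits = sum(
        matches(seq, antecedent, consequent, min_gap, max_gap)
        for seq in database.values()
    )
    return hits / len(database)


def day(text):
    """Parse a YYYY-MM-DD date string."""
    return datetime.strptime(text, "%Y-%m-%d")


# The three example sequences from the question.
database = {
    1: [("A", day("2012-02-20")), ("B", day("2012-03-23")), ("C", day("2013-01-20"))],
    2: [("B", day("2003-04-30")), ("D", day("2004-03-20"))],
    3: [("B", day("2010-09-10")), ("A", day("2010-10-01")), ("C", day("2010-10-02"))],
}

# Hypothetical constraint: A and B both occur 1 to 30 days before C.
min_gap, max_gap = timedelta(days=1), timedelta(days=30)
print(support(database, {"A", "B"}, "C", min_gap, max_gap))  # only sequence 3 matches
```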

    You might also be interested in hybrid patterns, where data can be stamped with either a period or a specific time.

    Here is a survey you might find interesting on sequential pattern mining:

  • Larry M. Manevitz added an answer:
    What is the relationship between deep learning methods and reservoir computing (if any)?

    I have the intuition that these two types of methodologies are related but I could not find any references nor any clear explanation of this relationship besides the fact that they are 2 types of modern, novel and evolved artificial neural networks.

    Larry M. Manevitz · University of Haifa

    Dear Aureli,

    1) Reservoir computing essentially uses the path of iterative updating on recurrent networks as an indication of the input. Since the input can occur at different times, it is in principle a method for spatio-temporal pattern recognition. See the articles by Hananel Hazan for recent work on this. The original articles are by Maass (Liquid State Machines) and Jaeger (Echo State Networks). (The iterations, on the one hand, cause distinct patterns to diverge, making potential classification in principle easier.)

    On the other hand, deep learning methodologies are (at least in the basic formulation) a way to allow feed-forward networks with many levels to make use of their potential power.

    Thus, they are two very different ideas: deep learning is essentially about static pattern recognition, while reservoir computing is about dynamic patterns.

    However, having said that,  they can potentially be connected in many ways.  For example, in reservoir computing, typically a "detector" that looks at the patterns can be any good classifier, and in particular, it might be very useful to use the power of deep learning classifiers for this part.

    In addition, one might consider investigating the deep learning paradigm for training the interconnections in the reservoir level; however this is still a research stretch.

    I hope this helps you.


  • S.M.M. Kahaki added an answer:
    How do you detect corner models in a gray image?
    I want to extract some information from a gray image, namely T, V and X corner models. Currently I am interested in the location, angle, and vertex of each detected corner. Best regards
    S.M.M. Kahaki · National University of Malaysia

    Please have a look at:

    The source code is available at:

  • Can the pixel values work as a training and test set for svm classification?

    For example, using imtool() or impixels() in MATLAB to retrieve pixel values for building training and test sets for an SVM.

    Hiba Basim Alwan Al-Dulaimi · Al-Mansour University College

    Yes, they can. Your dataset, which here consists of your pixel values, is divided into training and testing subsets, and these swap roles in each iteration: the training subset becomes the testing subset and vice versa.
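
    The role swap described above is what a two-fold cross-validation does. A minimal Python sketch, assuming made-up pixel intensities and a helper name (`k_fold_splits`) invented for this example:

```python
def k_fold_splits(n_samples, k=2):
    """Yield (train_indices, test_indices) so each fold is the test set once."""
    # Deal the sample indices into k interleaved folds.
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical data: six pixel intensities.
pixels = [12, 200, 35, 180, 20, 220]
for train, test in k_fold_splits(len(pixels), k=2):
    print("train:", [pixels[i] for i in train], "test:", [pixels[i] for i in test])
```

    With k=2 the two subsets simply exchange roles in the second iteration; a larger k gives more, smaller test folds.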

  • Iskandar Keskes added an answer:
    What is the procedure of nested cross validation in SVM?

    Nested cross validation for classifier performance? What is intuition behind nested cross validation?

    Iskandar Keskes · University of Sfax

    You can look at this paper, in which the authors used cross-validation to evaluate the classifier.

  • Javier F. Botía added an answer:
    Does anyone have any advice on implementing DFA [Deterministic Finite Automata] in neural networks?

    Could you please let me know what kinds of neural networks, other than recurrent neural networks, are capable of implementing a DFA? How can a neuron be assigned to a state in a DFA?

    Javier F. Botía · University of Antioquia

    I recommend reading the following two papers on this topic:

    Omlin, C.W., Thornber, K.K. and Giles, C.L. Fuzzy Finite-State Automata Can be Deterministically Encoded into Recurrent Neural Networks. IEEE Trans. on Fuzzy Systems, volume (6), issue (1), pp. 76–89, 1998.

    Chandra, R. and Omlin, C.W. A Hybrid Recurrent Neural Networks Architecture Inspired by Hidden Markov Models: Training and Extraction of Deterministic Finite Automaton. In: Artificial Intelligence and Pattern Recognition, 2007, pp. 278–285.

  • Emmanuel Benazera added an answer:
    Why is back-propagation still used extensively to train ANN while it is beaten by GA?
    I have just read the article "Training Feedforward Neural Networks Using Genetic Algorithms" by David J. Montana. Experiment 5 compares BP and GA for training ANNs; the results show that GA is faster and gives a smaller error margin than BP. If those results can be generalized, why does BP dominate ANN training? The link to the paper:
    Emmanuel Benazera · Université Paris-Sud 11

    I've been experimenting with NN and CMA-ES, up to 4e5 dimensions. My first main observation is that convergence is very slow, though it eventually yields the expected results on a classical dataset I experimented with.

    At this stage, and if this can help the conversation, my view is that if there is room for GA outside the fitting of the network architecture itself, it might lie in optimizing the network once gradient-based methods that rely on BP have converged.

    Code is available if some of you believe it can be inspiring.

  • Chaurasia Sandeep added an answer:
    How to draw ROC curve and RI curve for prediction generation using SVM?
    Will the RapidMiner tool help me draw a ROC curve? Moreover, I tried installing it, but could not.
    Chaurasia Sandeep · Sir Padampat Singhania University

    Use MATLAB; it will make clear all aspects of how the ROC is refined over the training cycle.

  • Javier F. Botía added an answer:
    What are some good function approximation methods using fuzzy sets and logic such as fuzzy expert systems, fuzzy SVR, etc.?
    I have a project on function approximation by fuzzy decision trees and I want to compare my results with some other methods improved by fuzzy logic.
    Javier F. Botía · University of Antioquia

    I have another idea: use two fuzzy approximations, called disjunctive normal form (DNF) and conjunctive normal form (CNF), to build up a fuzzy decision and compare it with your fuzzy decision tree. I've also read some papers about fuzzy approximation using the Compositional Rule of Inference (CRI) and a special transformation called the fuzzy transform (F-transform), which would give you several options to compare against.

  • Supratip Ghose added an answer:
    Is there any tool for StringToWordVector conversion for arff files?
    I need StringToWordVector converter for Machine Learning.

    There are lots of tools nowadays. The predominant ones are WEKA, RapidMiner, and LightSide, and the StringToWordVector class is very common.

  • Mohammad Amiri added an answer:
    Is f-measure synonymous with accuracy?
    I understand that f-measure (based on precision and recall) is an estimate of how accurate a classifier is. Also, f-measure is favored over accuracy [1] when we have an unbalanced dataset. I have a simple question (which is more about using correct terminology than about technology). I have an unbalanced dataset and I use f-measure in my experiments. I am about to write a paper which is **NOT** for a machine learning/data mining conference. Hence, can I use f-measure synonymously with accuracy in this context? For example, if I have an f-measure of 0.82, can I say my classifier achieves 82% accurate predictions? [1]:
    Mohammad Amiri · Goethe-Universität Frankfurt am Main

    F-measure is different from accuracy. The best approach is to explain the F-measure briefly in your paper, cite a reference, explain why you used it instead of accuracy, and also report the accuracy.
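
    A small worked example shows how far apart the two numbers can be on an unbalanced dataset. The confusion counts below are invented for illustration (50 positives, 950 negatives), chosen so that the F-measure comes out at about 0.82 as in the question.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (F1); ignores true negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical confusion counts for an unbalanced set.
tp, fn = 41, 9     # the 50 positive items
fp, tn = 9, 941    # the 950 negative items

print(f_measure(tp, fp, fn))      # about 0.82
print(accuracy(tp, fp, fn, tn))   # about 0.982
```

    With these counts, saying "82% accurate predictions" would understate the accuracy, which is about 98%, so the two terms should not be interchanged.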

  • Meysam Shamsi added an answer:
    How can we find novel machine learning algorithms?
    Is there any machine learning algorithm from 2010 until now for classification that is useful and popular?
    Meysam Shamsi · Iran University of Science and Technology

    This question is a terrible mistake in university of Iran!
    Do you want a general classifier or the best classifier for a specific case?

    I think AdaBoost is a strong general classifier, but it was formulated by Yoav Freund and Robert Schapire, who won the prestigious Gödel Prize in 2003.

    I think this paper would be good for you:
    "A novel classifier ensemble method with sparsity and diversity", at this address:

  • Tamer Hashem Farag added an answer:
    Does anyone have information on the K-means algorithm?

    I am using the K-means clustering algorithm, and I need to get clusters under one condition: each cluster must always include one of the initially chosen centers. Alternatively, are there any other clustering methods that provide this feature?

    Tamer Hashem Farag · Cairo University

    All I need is that the initial set of centroids stay in their clusters.

  • Thomas Villmann added an answer:
    How can we prove the significance of features in classification?
    I have a binary classification problem. I have extracted 500 features from a set of 5000 samples using my domain knowledge. In other words, I have hand-crafted features. I wish to prove that these features are actually enough for performing classification and that they make the 2 classes of samples separable, i.e., when the samples are represented with these features, there exists a (reasonable) decision boundary. Please advise how I can prove this. Is there any statistically appropriate way of measuring the significance of the set of features as a whole (NOT the significance of individual features)?
    Thomas Villmann · Hochschule Mittweida

    Simple PCA does not necessarily work, because the direction that separates the classes may differ from the directions of largest variance in the data. We prefer generalized relevance learning vector quantization (GRLVQ), which automatically weights the data attributes according to their significance for classification. If the correlations between attributes are also of interest, take the matrix variant, GMLVQ.
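
    The caveat about PCA can be illustrated with a toy 2-D example in Python. The data are made up, and `first_principal_axis` is a helper written for this sketch using the closed form for a 2x2 covariance matrix. It does not implement GRLVQ or GMLVQ; it only shows that the direction of largest variance can ignore the class structure entirely.

```python
from math import atan2, cos, sin


def first_principal_axis(points):
    """Unit vector of largest variance for 2-D points (closed form for 2x2)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    angle = 0.5 * atan2(2 * sxy, sxx - syy)
    return cos(angle), sin(angle)


def project(point, axis):
    """Coordinate of a 2-D point along the given axis."""
    return point[0] * axis[0] + point[1] * axis[1]


# Made-up data: large spread along x, but the classes differ only in y.
class_0 = [(x, -0.5) for x in (-10, -5, 0, 5, 10)]
class_1 = [(x, 0.5) for x in (-10, -5, 0, 5, 10)]

axis = first_principal_axis(class_0 + class_1)
proj_0 = sorted(project(p, axis) for p in class_0)
proj_1 = sorted(project(p, axis) for p in class_1)

print(axis)              # points along x, the high-variance direction
print(proj_0 == proj_1)  # True: after projection the classes coincide
```

    Here the second coordinate alone separates the classes perfectly, yet keeping only the first principal component would discard it.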
