Machine Learning

  • Aasim Khurshid added an answer:
    What is the running time complexity of SVM and ANN?
    What is the best, worst, and average running time complexity of SVM and ANN?
    Why most of machine learning papers report only the classification accuracy, and ignore the running time?
    Aasim Khurshid · Universidade Federal do Rio Grande do Sul

    Olivier Chapelle discusses the complexity of SVMs extensively in his paper and proposes an optimization. He observes that treatments of "Support Vector Machines (SVMs) first state the primal optimization problem, and then go directly to the dual formulation", and argues that one should instead solve either the primal or the dual optimization problem depending on whether n is larger or smaller than d. This results in a complexity of O(max(n, d) min(n, d)^2), where the dataset is a matrix X ∈ R^(n×d) holding the coordinates of n points in d dimensions.

    For details, please refer to the paper:

    About ANN: I am sorry, I don't have enough knowledge.
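
    To make the bound quoted above concrete, here is a minimal Python sketch that evaluates max(n, d) · min(n, d)^2 for a given dataset shape. The function names are hypothetical, the result is only the growth term up to a constant factor (not a measured running time), and the primal/dual choice assumes the linear case, where the primal has d variables and the dual has n.

```python
def svm_cost_estimate(n: int, d: int) -> int:
    """Return max(n, d) * min(n, d)**2, the growth term in the quoted bound."""
    if n <= 0 or d <= 0:
        raise ValueError("n and d must be positive")
    return max(n, d) * min(n, d) ** 2


def cheaper_formulation(n: int, d: int) -> str:
    """Assumed rule of thumb: primal when there are more points than dimensions."""
    return "primal" if n > d else "dual"


if __name__ == "__main__":
    # Hypothetical dataset shapes, chosen only for illustration.
    for n, d in [(1000, 10), (10, 1000), (500, 500)]:
        print(n, d, cheaper_formulation(n, d), svm_cost_estimate(n, d))
```

    The point of the sketch is that the cost is cubic only when n and d are of similar size; when one of them is small, the bound is linear in the larger one.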

  • Said Jadid Abdulkadir added an answer:
    Which are the new trends in the development of ANNs?
    What kind of goals are sought with such developments?
    Said Jadid Abdulkadir · Universiti Teknologi PETRONAS

    Reinforcement learning and, for convolutional networks, their application to video data.

  • Jaya Shukla added an answer:
    Is there any multi-class SVM classifier available in MATLAB?
    I applied an SVM classifier to my character recognition problem. It works nicely for two classes, but it cannot be applied to a multi-class problem directly, as a neural network can. I have to classify Devnagari characters into 43 classes. Is there any approach or MATLAB code available?
    Jaya Shukla · Shiv Nadar University

    Please explain the procedure for multiclass classification using the LIBSVM library.

  • Girish G N added an answer:
    Any tool for Speech to Text conversion especially for Indian English?

    I have tried Kaldi and CMU Sphinx. They work well for US accents, but the accuracy for Indian English is not good. Training a model for Indian English requires a phonetic dictionary. So I have the following questions:

    - Is there any open source/commercial tool for speech to text for Indian English?

    - Is there any training dataset available for Indian speech recognition along with its phonetics?

    - Can we do this with existing tools (Kaldi/Sphinx), or is it advisable to build it from scratch?

    Girish G N · National Institute of Technology Karnataka

    No, currently there are no TTS tools available for Indian languages. Research is still going on in the labs. You can implement a TTS for any Indian language.

  • Sachin Patil added an answer:
    When and why do we need data normalization?
    Data normalization means transforming all variables in the data to a specific range. My question is when and why do we need data normalization?
    Sachin Patil · Indian Institute of Technology Bombay

    @Juan Luis Herrera Cortijo

    Dear Juan

    Can you elaborate more? Do we need to apply normalization before computing the correlation coefficient?

  • Antonio Miguel added an answer:
    What is the difference between mutual information and information gain?

    Dear all,

    Conceptually, what is the difference between mutual information and information gain?

    I tried an exercise involving mutual information. However, after I finished the MI calculation and also computed the information gain from the same figures in a sample table, I got the same value for MI and IG. Can anyone briefly give me the key points that differentiate the two?

    Antonio Miguel · University of Zaragoza

    I agree that, by that definition, they are the same. But since they are symmetric, the notation IG(X|Y) could be confusing because it resembles conditional probability notation. I would prefer IG(X,Y) or IG(X;Y), as is done for mutual information.
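
    The equality and the symmetry mentioned above can be checked numerically. The following is a minimal sketch, assuming information gain is defined as H(X) - H(X|Y) and mutual information is computed from the joint and marginal frequencies; the toy table (`outlook`, `play`) is made up for illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())


def conditional_entropy(xs, ys):
    """H(X | Y): entropy of xs within each group of equal ys, weighted by group size."""
    total = len(xs)
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return sum(len(g) / total * entropy(g) for g in groups.values())


def information_gain(xs, ys):
    """Information gain about X from knowing Y, defined as H(X) - H(X | Y)."""
    return entropy(xs) - conditional_entropy(xs, ys)


def mutual_information(xs, ys):
    """I(X; Y) computed directly from joint and marginal counts."""
    total = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / total) * log2(c * total / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


# Hypothetical toy table, not taken from the original exercise.
outlook = ["sunny", "sunny", "rain", "rain", "overcast", "overcast"]
play = ["no", "no", "yes", "no", "yes", "yes"]

if __name__ == "__main__":
    print(information_gain(play, outlook))
    print(information_gain(outlook, play))
    print(mutual_information(outlook, play))
```

    All three printed values agree up to floating-point error, which is why a symmetric notation such as IG(X;Y) is less misleading than IG(X|Y).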

  • Ransalu Senanayake added an answer:
    How can I cluster users based on their posts/queries?

    I have scraped the online posts from various users in a particular domain, with the following fields in CSV format:

    userid, post-question

    xyz, how to install joomla

    abc, problem in adding wizard


    Also, I have another list with user attributes:

    Now, I want to categorize these users based on the topics of the questions they posted. I was expecting outputs such as whether the user is a novice, intermediate, or expert.

    Ransalu Senanayake · University of Sydney

    Read about similarity matching in the book "Mining of Massive Datasets". It is freely available online. The Jaccard index and cosine similarity can be easily implemented in R.
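
    As a rough illustration of the two measures just named, here is a minimal sketch in Python rather than R. It assumes posts are compared as bags of whitespace-separated lowercase words; the function names are hypothetical and no stemming or stop-word removal is done.

```python
from collections import Counter
from math import sqrt


def tokenize(text):
    """Lowercase a post and split it on whitespace."""
    return text.lower().split()


def jaccard(a, b):
    """Jaccard index: shared vocabulary size over combined vocabulary size."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def cosine(a, b):
    """Cosine similarity between the word-count vectors of two posts."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    p1 = "how to install joomla"
    p2 = "how to install drupal"  # hypothetical second post
    print(jaccard(p1, p2), cosine(p1, p2))
```

    A pairwise similarity matrix built this way can then be fed to any clustering method that accepts precomputed similarities or distances.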

    And this seminal paper:

  • Miguel Hernandez-Silveira added an answer:
    How can I evaluate the performance (weights and bias) given from a nonlinear SVM (Support Vector Machine) Kernel Model?

    Given the weights of each attribute and the bias, I don't know how to read these results when they are applied on a non linear Support Vector Machine. I would like to know in which way I should read them and see an example from the dataset I am using.

    Miguel Hernandez-Silveira · Sensium Healthcare Ltd

    If by performance you mean evaluating the behaviour of your non-linear SVM, I would focus on controlling overfitting and ensuring correct classification. I would do this by applying cross-validation to different models in order to determine the right one in terms of the number of support vectors and the kernel function (model selection), trying various regularization coefficients and types (shrinkage). I hope this helps.
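
    The model selection loop described above can be sketched in plain Python as follows. This is only an outline under stated assumptions: `fit_and_score` is a hypothetical caller-supplied callback that trains a model with the given settings on the training indices and returns its accuracy on the held-out indices, and the folds are contiguous, which assumes the samples are already in random order.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def select_model(candidates, n_samples, k, fit_and_score):
    """Return (best_candidate, mean_score) by mean validation score over k folds.

    fit_and_score(candidate, train_idx, test_idx) is a hypothetical callback
    supplied by the caller; it is not part of any particular SVM library.
    """
    best, best_score = None, float("-inf")
    for cand in candidates:
        scores = [fit_and_score(cand, tr, te) for tr, te in k_fold_indices(n_samples, k)]
        mean = sum(scores) / len(scores)
        if mean > best_score:
            best, best_score = cand, mean
    return best, best_score
```

    Each candidate would typically be a dictionary of settings such as the kernel type and the regularization coefficient, so the same loop covers both kernel choice and regularization strength.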

  • José Francisco Moreira Pessanha added an answer:
    What are your suggestions for Clustering Binary Categorical data?

    I have some data with certain classes (c1, c2, c3, c4, ...). The data consists of binary vectors, where 1 and 0 denote whether or not an entry belongs to a class. The number of classes will be > 200.

    Would this data come under the "categorical" type?

    I tried PCA for dimension reduction on this dataset and even got good clusters with DBSCAN, but I read that PCA is not recommended for sparse categorical data and that Euclidean distance is not a good distance measure for it.

    I am planning to use MCA (Multiple Correspondence Analysis), but I cannot figure out how I am supposed to represent the data for it.

    Please find attached the link to the snapshot of the clusters that I got after PCA and DBSCAN.

    José Francisco Moreira Pessanha · Rio de Janeiro State University

    Maybe MCA can help you.

    The Hamming distance is another option.
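
    For reference, the Hamming distance between two binary vectors is just the number of positions where they differ. A minimal Python sketch follows; the function names are hypothetical, and the pairwise matrix is shown because many clustering implementations can accept a precomputed distance matrix in place of Euclidean distance.

```python
def hamming(u, v):
    """Number of positions at which two equal-length binary vectors differ."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a != b for a, b in zip(u, v))


def pairwise_hamming(rows):
    """Symmetric matrix of Hamming distances between all pairs of rows."""
    return [[hamming(r, s) for s in rows] for r in rows]


if __name__ == "__main__":
    # Hypothetical class-membership vectors.
    data = [[1, 0, 1, 0], [1, 1, 0, 0], [1, 0, 1, 1]]
    for row in pairwise_hamming(data):
        print(row)
```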

  • Haroldo Fraga de Campos Velho added an answer:
    In choosing an activation function for SVR or a neural network, what are the advantages/disadvantages of a radial basis function compared with a sigmoid function?

    This relates to machine learning applied to traffic systems, specifically traffic flow monitoring and control with a predictive model, using either a support vector machine trained with the kernel trick or an artificial neural network trained with back-propagation.

    Haroldo Fraga de Campos Velho · National Institute for Space Research, Brazil

    @Taiwo: Interesting question.

    1. The conclusion of Oussama Ahmia's paper was that, for a particular application (electrical load prediction), SMR gave better results than ANN and multi-variable regression.

    2. Andrey Yu. Shelestov and Grzegorz Dudek stated that SVM is a global classifier and ANN is a local classifier. However, both also stated that which one gives the better result depends on the modeler and/or on trial-and-error experiments.

    3. Mahboobeh Parsapoor suggested using SVM when the number of dimensions is greater than the number of samples, and pointed to a paper supporting the statement (I didn't read the paper).

    4. My experience indicates that RBF networks and multi-layer perceptrons can be robust when dealing with noisy data. See:




    5. You can automatically select the activation function and other parameters/functions of the SVM and/or ANN by formulating the configuration problem (for SVM, SVR, or ANN) as an optimization problem. See some papers:

  • Jorge Lopez-Cifre added an answer:
    Anyone familiar with ICLR papers? (They are all available at

    This is not a question, just an announcement. Since I don't know how to post an announcement or news item, I am posting it as a question.

    All papers submitted to ICLR are available online. I think that all researchers interested in machine learning and/or pattern recognition should take a look.

    Jorge Lopez-Cifre · Universidad Camilo José Cela

    Thanks for sharing, Konstantinos!

  • İzzet Pembeci added an answer:
    How should I proceed to implement NLP (almost) from scratch (collobert 2011)?

    There is a famous JMLR paper by Ronan Collobert et al. called "Natural Language Processing (Almost) from Scratch". I want to implement it. Can anyone tell me how I should proceed and which language would be appropriate? I can go for Torch or Theano.

    İzzet Pembeci · Mugla Üniversitesi

    That paper follows a very different approach to NLP, so libraries like NLTK will not help you. What I can suggest is to search for deep learning frameworks for Python and then try to use one of them for your implementation. That way you won't need to implement the NN parts (a huge undertaking) but only configure them for your project.

  • Katharina Morik added an answer:
    What is the best algorithm for supervised text categorization of papers?

    I have a training set of ca. 2500 manually categorized abstracts and want to automatically categorize 4M papers. What is the best tool to download? Till now, I have only 90% with Kappa at 0.8 in cross validation. I used NaiveBayesMultinomial on 2500 features -- statistics of most frequent words. As preprocessing, I removed stop words and stemmed the text. 

    Katharina Morik · Technische Universität Dortmund

    RapidMiner (open source, free to download) has a text processing extension with all the preprocessing (stemming, bag of words with TF-IDF or binary weights, etc.), after which you can run an SVM and get the results. It is just drag and drop, and the system takes care of the rest. Try it!

  • Mohammad Mahdi Momenzadeh added an answer:
    Is there a high-resolution training image database for pattern recognition purpose?

    Hello everyone!

    I'm developing a pattern recognition algorithm for images. So far I have been using the MNIST database, but for some reasons I need to switch to another database with higher resolution. It would be highly appreciated if someone could help me find one, and if anyone knows a trick for that, that would be great!

    Thanks in advance.

    Mohammad Mahdi Momenzadeh · Universität Paderborn

    I don't need images with that much detail; I just need a database that provides grayscale images with few details.

  • Tarik A. Rashid added an answer:
    Does anyone know of a machine learning tool for an Urdu treebank?

    Does anyone know of a machine learning tool for an Urdu treebank? I need a structure/program for building a treebank for Urdu, e.g. input: grammar and corpus as text; output: treebank.

    Tarik A. Rashid · Salahaddin University - Erbil

    Dear :-

    Try this link

  • Mani A. added an answer:
    Is there any data set available containing FAQs in different domains?

    My thesis is about the analysis and automatic generation of FAQ lists in different domains. For conducting experiments, I need a high volume of FAQs. That is why I am looking for a publicly available dataset containing FAQs in various domains (or even one specific domain).

    Mani A. · University of Calcutta, Kolkata, India; etc

    You can also have a look at sites like:

    Ubuntu Forums

    Ubuntu community documentation

    That way you will be able to learn more about the ontology of FAQs.

  • Maged Hamada Ibrahim added an answer:
    Can we fully simulate the whole brain on a digital computer?

    Miguel Nicolelis and Ronald Cicurel claim that the brain is relativistic and cannot be simulated by a Turing machine, which is contrary to well-marketed ideas of simulating/mapping the whole brain. If it cannot be simulated on digital computers, what is the solution for understanding the language of the brain?

    Maged Hamada Ibrahim · Helwan University

    Only if we know exactly how it works. How can we implement something when we are not fully aware of its capabilities and how it interacts?

  • Omer Faruk Ertugrul added an answer:
    Does anyone have a good experience with machine learning techniques on very small data sets (n is less than 20)?
    We used standard techniques, but the results are not good. We also tried SVM and GA, with little improvement. There are some papers suggesting similar methods that we haven't tried yet. What would be your suggestions?
    Omer Faruk Ertugrul · Batman Üniversitesi

    It is hard, but it depends on the characteristics of the dataset. If it is not very complex, I think you may achieve high accuracy.

  • Le Quang Nam added an answer:
    Is there an R implementation of the HITON algorithm?
    HITON is an algorithm for Markov blanket discovery of a target variable. The algorithm first appeared in this paper:

    I am looking for an R implementation of an algorithm that finds the Markov blanket of a target variable.

    Any reply would be greatly appreciated.
    Thank you.
    Le Quang Nam · National Institute of Animal Sciences Vietnam

    In the bnlearn package, one can use Semi-Interleaved HITON-PC, a constraint-based algorithm for learning Bayesian networks.

  • Cesar García-Osorio added an answer:
    Does anyone have experience with rotation forest algorithm?
    The rotation forest algorithm requires randomly eliminating a subset of classes from the data. Afterwards, a bootstrap sample (I guess without replacement) of 75% of the remaining data has to be generated to perform PCA. How should classes be eliminated, and how many? Does a new random subset have to be selected in every iteration? What if it is a two-class data set? In order to perform PCA, the data has to be zero-mean (for covariance PCA) or normalized (for correlation PCA). I might not have understood it correctly, but does it make sense to select a bootstrap sample, center the data to do PCA, and then generate scores using the rearranged rotation matrix on the whole data? The paper by Rodriguez and Kuncheva, "Rotation Forest: A New Classifier Ensemble Method" (IEEE, 2006), explains that overlapping features (random selection with repetition) can be used, but it does not show how the principal components are merged. Can someone clarify these issues?
    Cesar García-Osorio · Universidad de Burgos

    There is an implementation of Rotation Forest for Weka by the author of the algorithm, maybe you could find the details of the method in the source code.

  • Sandeep Saini added an answer:
    What are some recent cognitive algorithms used in Machine Translation?

    Is machine translation purely statistical, or are cognitive algorithms being explored in these systems? If so, what are those algorithms and approaches?

    Sandeep Saini · The LNM Institute of Information Technology

    Thank you, everyone, for the valuable suggestions.

  • Yaakov J Stein added an answer:
    Is there any software based on a machine learning algorithm to map QoS to QoE for network management?

    I am searching for open source/publicly available software for mapping QoS to QoE for network management, i.e. a program based on a machine learning algorithm that can calculate QoE, e.g. the Mean Opinion Score (MOS), from QoS parameters. Any help regarding this will be highly appreciated :)

    Yaakov J Stein · Tel Aviv University

    There has been a lot of work on predicting QoE values based on QoS parameters by people who really understand networking. The standard approach of assuming a closed form (e.g., QoE is a linear/logarithmic/exponential/power-law function of packet loss/delay/...) and finding best-fit coefficients works very well in practice and has a rich literature.

    I have seen a number of papers where people use machine learning techniques (e.g., backprop NNs, decision trees, boosting), but these seem to be mostly exercises in the use of a technique rather than true contributions to the field. (You can easily find such papers using Google Scholar.)

    In addition, other than in a few well-known cases (e.g., MOS for voice), the empirical QoE data has large variability, and matching the strongly averaged trends (e.g., of ApDex values) is not really that useful.
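
    As an illustration of the closed-form approach described above, here is a minimal Python sketch that fits one of the named shapes, the logarithmic one, by ordinary least squares. The model QoE = a + b · ln(x), the function names, and the sample numbers are all assumptions made for the example; they do not come from any particular standard or paper.

```python
from math import log


def fit_log_model(xs, ys):
    """Least-squares fit of y = a + b * ln(x); returns (a, b)."""
    ts = [log(x) for x in xs]
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    b = sxy / sxx
    a = mean_y - b * mean_t
    return a, b


def predict_qoe(a, b, x):
    """Evaluate the fitted curve at QoS value x (for example, delay in ms)."""
    return a + b * log(x)


if __name__ == "__main__":
    # Hypothetical (delay in ms, MOS) pairs, for illustration only.
    delays = [10, 20, 50, 100, 200]
    mos = [4.4, 4.1, 3.6, 3.2, 2.9]
    a, b = fit_log_model(delays, mos)
    print(a, b, predict_qoe(a, b, 150))
```

    The same two-coefficient fit covers the exponential and power-law shapes as well if the logarithm is applied to y, or to both x and y, before fitting.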


  • Marco A. Wiering added an answer:
    Does anyone know of a function approximator with a variable number of outputs?

    Does anyone know of a function approximator which can produce a variable number of output values (i.e. for some regions in input space it might output a vector of 3 values, whereas in other regions it might produce 5 outputs)?

    Update: Thanks everyone for your suggestions. I realise now that I missed a critical aspect when phrasing my original question - we don't know in advance how many outputs will be required in each region of the input space (or even what the regions of the input space are). So maybe, I should rephrase my question in light of Simone and Meysar's answers - is there a function approximator which can learn to produce a single output for some parts of input space, and no output for other parts? My thinking so far is to use something like an RBF network as suggested by Vassilis, with a threshold applied so no output is produced if the input doesn't closely match any of the basis functions.

    Marco A. Wiering · University of Groningen

    I think it would be possible to try to use one more function approximator that learns how many outputs there should be for a specific input. Then it uses the first N function approximators to return those outputs. Although this first FA will output real numbers, you can round it off to integers.
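
    The idea above can be sketched in a few lines of Python. This is a toy outline, not a trained model: `count_model` and `output_models` stand for already-fitted approximators of any kind, and the names are hypothetical.

```python
def make_variable_output_model(count_model, output_models):
    """Combine a count predictor with a list of per-output approximators.

    count_model(x) returns a real number, which is rounded and clipped to
    the range [0, len(output_models)] to obtain N. The first N entries of
    output_models are then evaluated at x.
    """
    def predict(x):
        n = int(round(count_model(x)))
        n = max(0, min(n, len(output_models)))
        return [f(x) for f in output_models[:n]]
    return predict


if __name__ == "__main__":
    # Stand-in approximators: 3 outputs for negative inputs, 5 otherwise.
    count_model = lambda x: 3.2 if x < 0 else 4.8
    output_models = [lambda x, k=k: k * x for k in range(1, 6)]
    predict = make_variable_output_model(count_model, output_models)
    print(predict(-1.0))
    print(predict(2.0))
```

    A predicted count that rounds to zero yields an empty list, which also covers the "no output for some parts of input space" case raised in the update to the question.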

  • Mohamad Ivan Fanany added an answer:
    The relationship between sparse coding and deep learning?
    According to my review of related literature, I think the sparse coding may be the basis of deep learning, i.e., the features in the lowest level of deep learning structures come from the dictionary of sparse coding. Is this statement right?
    Mohamad Ivan Fanany · University of Indonesia

    Dear Yu Zhao, 

    Normally, sparse coding is compared with autoencoders, not with deep learning. The autoencoder is one of the building blocks of deep learning (for example, when we use stacked denoising autoencoders). An elaborate comparison between sparse coding and autoencoders can be found at the following link:

    In essence, sparse coding can be viewed as an autoencoder with an added sparsity constraint. If you add a regularization constraint to the original autoencoder formulation, the results tend to be the same as with sparse coding.
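
    One common way to write the two objectives side by side is sketched below. The symbols (dictionary D, codes a_i, encoder weights W, decoder weights W', nonlinearity f, penalty weight λ) are notation assumed here for illustration and are not taken from the linked comparison.

```latex
% Sparse coding: the codes a_i are free variables found by optimization
\min_{D,\,\{a_i\}} \; \sum_i \lVert x_i - D a_i \rVert_2^2 + \lambda \lVert a_i \rVert_1

% Sparse autoencoder: the codes are produced by a feed-forward encoder f(W x_i)
\min_{W,\,W'} \; \sum_i \lVert x_i - W' f(W x_i) \rVert_2^2 + \lambda \lVert f(W x_i) \rVert_1
```

    Written this way, the reconstruction term and the sparsity penalty have the same form; the structural difference is that sparse coding solves for each code directly, while the autoencoder computes it with a learned encoder.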

    Hope this will help.

  • David F. Nettleton added an answer:
    What is the difference between machine learning and data mining?
    Is ML related to the algorithms and DM to the data?
    David F. Nettleton · University Pompeu Fabra

    Data mining is the more general concept. Different families of techniques can be applied to data to "mine" it. Machine learning (supervised, unsupervised, ...) is one family of data analysis techniques. Traditional statistics (regression analysis, principal components, factor analysis, etc.) is another.
