Machine Learning

  • Yujiao li added an answer:
    Could synthetic control method be optimized in another way?

    I am learning the synthetic control method (SCM) for a comparative case study, and I have questions about the reliability of its optimization.

    After ranking the control units by similarity to the treated unit, I use the top N of them as the donor pool for the synthetic control. When I vary N from 2 to 280 (using the Synth package in R), the objective function (MSE) fluctuates strongly, and this instability makes SCM look unreliable. After a placebo test I deleted the outliers; the MSE became smaller and more stable, but the optimized weights of the control units and of the predictors still fluctuate and do not converge as N increases.

    My question is whether this non-convergence is caused by the initial values used in the optimization (a genetic algorithm), or by potential collinearity between predictors.

    I have tried two approaches and would welcome your comments:

    (1) I applied another similarity measure, the Mahalanobis distance instead of the Euclidean distance, which takes the correlation between predictors into account.

    (2) Considering a potential nonlinear relationship, I used a neural network, a support vector machine, and gradient boosted regression trees (GBRT) to optimize the weights. The first two methods overfit, and only GBRT works well. Could I use this result and move forward? I am not sure this is the right track.

    Yujiao li

    Thanks a lot for your suggestions. I would like to try AdaBoost later. I have solved the overfitting with cross-validation, which gives a lower MSPE. As you said, I might be overworking this. My solution for the unstable predictors is to find the most frequently selected pool over the iterations. What do you think of that?

    Again, appreciate your time and nice suggestions!!

  • Marion G Ceruti added an answer:
    Given a new user in a network, how can we find his/her actual publications in DBs?

    Imagine we have a research social network and a new user who only fills out his/her name, affiliation, title and location. We face the problem of associating this user with his/her publications spread across a bunch of DBs (arxiv, ISIWebOfKnowledge, Inspires, ...). The name in the publications can appear with initials, with or without accents, and two authors may even have the same or similar names. How can we find his/her actual publications in real time?

    Marion G Ceruti

    Hello Alvaro,

    Using ResearchGate to find various publications has proven to be helpful.

    Perhaps as an approach to the academic aspect of the problem, an approach could be taken that involves an advanced artificial-intelligence tool that does some of the same things a human would do but also adds software to deal with the social-network aspect of the problem.

    For excellence in social-network analysis, I recommend the work of Prof. Carter Butts of the UC at Irvine, CA. He used to be or perhaps still is a contractor on projects sponsored by the Office of Naval Research. Perhaps his work can inspire a research direction for the present question.



  • Dennis Weyland added an answer:
    How does the Harmony search work for feature selection?
    I am a bit confused about Harmony search for feature selection. Hopefully the experts here can kindly help answer my queries.
    Where does the data set fit into the Harmony search? Can I say that in Harmony Memory (in step 2, initialise Harmony Memory) the features are randomly selected from a dataset? How exactly is Harmony Memory the complete dataset?
    My understanding is that each row in Harmony Memory is a feature subset, hence each decision variable in Harmony Memory represents a feature. So for multi-dimensional datasets that have few samples with many features (for example, gene expression data), how can Harmony Memory be formed?
    Dennis Weyland

    If you work with harmony search, you should maybe know about the fact that harmony search is in fact a special case of evolution strategies and that some results reported by the "inventor" of harmony search, Z.W. Geem, seem extremely unlikely:

  • Adel Sabry Eesa added an answer:
    Could anyone tell me whether ordinal attributes can be given as input to SVM and MLP?

    Can MLP and SVM handle ordinal features directly, or is any preprocessing needed in order to perform classification?

    Is the term 'monotonic transformation' related here? If so, how is it related?

    I read somewhere that both classifiers are sensitive to monotonic transformations. Any paper references are also appreciated.

    Adel Sabry Eesa

    Dear Vishnu

    Yes, the ID3 algorithm will work for you. You can also use any feature selection method with ID3 to improve the prediction process.

  • Preeti Balaji added an answer:
    Does anyone have experience with SAR image classification?

    Dear all, I am trying to classify a SAR image using machine learning algorithms. I am finding difficulty in choosing an appropriate approach. Also, once a technique is chosen (SVM or Random forests or any), how do I transform the SAR image training samples into the required format suitable for the classification approach? I am trying this in Python. Any help is highly appreciated. Thanks very much!

    Preeti Balaji

    Thanks Gopika! I will have a look at your paper. Thanks for sharing!

  • Shaveta Chutani added an answer:
    Can anyone help with features of an image?
    I have to collect almost all of the possible features of an image (the image can be in RGB or in Grayscale) . I have collected about 50 features such as area, center of gravity, etc. Now I'm looking for more advanced features. It would be wonderful if anyone could recommend me the features which come to mind.
    Detail : images are cells with different shapes close to ellipse , oval tear and circle
    Shaveta Chutani

    Features can be categorized according to shape, color, intensity, edges, texture, etc. Which type of features fulfils your requirement depends on the project you are undertaking.

  • Priyanka Shah added an answer:
    Can molecules from Pubchem or ChEMBL be used as decoys in classification studies?

    Can molecules extracted from ChEMBL database or Pubchem database using similarity searching methods be used as decoy in machine learning classification studies?

    Priyanka Shah

    Dear Mr. Sayak,

    Thanks for your quick reply. These structures will be further used for the molecular descriptor based binary classification.



  • Tannaz Akbarpour added an answer:
    Is there any multi-class SVM classifier available in MATLAB?
    I applied an SVM classifier to my character recognition problem. It works nicely for two classes, but it cannot be applied to a multi-class problem directly, as a neural network can. I have to classify Devnagari characters into 43 classes. Is there any approach or MATLAB code available?
    Tannaz Akbarpour

    this link contains another tool for multiclass SVM

  • Patricia Ryser-Welch added an answer:
    Does anybody know what the main approaches of reinforcement learning in continuous state and action spaces are?

    As far as I know, there are two main approaches for reinforcement learning in continuous state and action spaces: model-based and model-free. Does anybody know whether this classification (of reinforcement learning approaches into model-based and model-free) is also right for reinforcement learning in continuous state and action spaces? If not, what are the main approaches for the continuous case?

    Patricia Ryser-Welch

    This may be outside the question, but I perceive some supervised machine learning or hyper-heuristics ideas here. Perhaps it would be helpful to see how this type of reinforcement learning has been applied in these disciplines.

  • Angel Martinez-Tenor added an answer:
    What is the difference between value iteration and policy iteration methods in reinforcement learning?

    I'm new to reinforcement learning and I don't know the difference between value iteration and policy iteration methods.

    I am also very confused about the categories of methods in reinforcement learning. Some studies classify reinforcement learning methods into two groups: model-based and model-free. Other studies classify them as value iteration and policy iteration.

    I was wondering if anybody could help me understand the relation between these classifications as well.

    Angel Martinez-Tenor

    In my Master's thesis there is a simple approach from model-based to model-free decision-making processes; I think it will help you. (slide 4) (section 2.1, pages 11-14)
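    To make the distinction in the question concrete, here is a minimal sketch of both methods on a tiny, made-up deterministic MDP (four states on a line, with a reward for reaching the last one). The MDP, the constants, and all function names are assumptions for illustration and are not taken from the thesis or the book. Both methods are model-based in the sense that they call the known transition function `step`; they differ only in how they use it.

```python
# Hypothetical MDP: states 0..3 on a line, state 3 is terminal.
# Actions: 0 = left, 1 = right. Reward 1.0 for entering state 3, else 0.
N_STATES, ACTIONS, GAMMA, TERMINAL = 4, (0, 1), 0.9, 3


def step(s, a):
    """Known model: return (next_state, reward) for taking action a in state s."""
    if s == TERMINAL:
        return s, 0.0
    s2 = max(0, s - 1) if a == 0 else s + 1
    return s2, (1.0 if s2 == TERMINAL else 0.0)


def q_value(V, s, a):
    """One-step lookahead: immediate reward plus discounted value of the successor."""
    s2, r = step(s, a)
    return r + GAMMA * V[s2]


def greedy_policy(V):
    """Pick, in every state, the action with the best one-step lookahead."""
    return [max(ACTIONS, key=lambda a: q_value(V, s, a)) for s in range(N_STATES)]


def value_iteration(tol=1e-10):
    """Repeatedly apply the Bellman optimality backup, then read off the policy."""
    V = [0.0] * N_STATES
    while True:
        new_V = [max(q_value(V, s, a) for a in ACTIONS) for s in range(N_STATES)]
        delta = max(abs(x - y) for x, y in zip(V, new_V))
        V = new_V
        if delta < tol:
            return V, greedy_policy(V)


def policy_iteration(tol=1e-10):
    """Alternate full policy evaluation with greedy policy improvement."""
    policy = [0] * N_STATES  # arbitrary start: always go left
    while True:
        # Policy evaluation: iterate the Bellman expectation backup for this policy.
        V = [0.0] * N_STATES
        while True:
            new_V = [q_value(V, s, policy[s]) for s in range(N_STATES)]
            delta = max(abs(x - y) for x, y in zip(V, new_V))
            V = new_V
            if delta < tol:
                break
        # Policy improvement: act greedily with respect to the evaluated values.
        new_policy = greedy_policy(V)
        if new_policy == policy:
            return V, policy
        policy = new_policy


vi_values, vi_policy = value_iteration()
pi_values, pi_policy = policy_iteration()
```

    On this toy problem both routines should end with the same values and the same policy; value iteration updates values with a max over actions at every sweep, while policy iteration keeps an explicit policy and only changes it after evaluating it.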

    For further (and formal) information I also recommend the Sutton/Barto book:

  • Xin Ye added an answer:
    Does anyone know of a machine learning tool for an Urdu treebank?

    Does anyone know of a machine learning tool for an Urdu treebank? I need a structure/program for building a treebank for Urdu, e.g. input: grammar and corpus as text; output: treebank.

    Xin Ye

    This may help.

  • Mostafa Ali Shahin added an answer:
    Any tool for Speech to Text conversion especially for Indian English?

    I have tried working with Kaldi and CMU's Sphinx. These work well for the US accent, but the accuracy for Indian English is not good. To train a model for Indian English, a phonetic dictionary is needed. So here I am with the questions below:

    - Is there any open source/commercial tool for speech to text for Indian English?

    - Is there any training data-set available for Indian speech recognition along with its phonetics ?

    - Can we do this with existing tools (Kaldi/Sphinx) or is it advisable to build it from scratch ?

    Mostafa Ali Shahin

    Kaldi, HTK, Sphinx, and Julius are all implementations of speech recognition algorithms, and they are independent of the language you use. If you have enough labeled speech data, all you need to do is prepare your data in the format suitable for one of these tools, and you should get a speech recognizer working for your language/accent.

    I would recommend Shahab's suggestion to adapt the existing American English acoustic model to the Indian accent, especially if you don't have enough data to build an Indian model.

    You can also try DNN-HMM, which is reported to outperform GMM-HMM, specifically with small amounts of speech data.

    DNN-HMM is already implemented in the Kaldi and Julius toolkits.

    Good Luck

  • Avinash Dudi added an answer:
    I have two sets of data, healthy and damaged; what can I do with this?

    I trained a 2-layer ANN to classify a random signal generated using the damaged data. What else can I do with these data sets? I am considering testing different classifiers on the same data, but other than that I have no idea what else to do. Thanks for suggestions and resources.

    Avinash Dudi

    The accuracy results for testing a random signal are pretty convincing: the true positive rate is 100% and the false positive rate averaged 13%. I tried two different training methods: feeding random data and feeding the first 80% of the data. Accuracy improved with random training.

  • Amir Jalilifard added an answer:
    How can I study the past spending behaviour of a customer in a banking perspective and predict the next purchase category and amount of buy?

    At the individual customer level, I want to study the spending pattern: the commodities the customer spent on (clothes, groceries, petrol, etc.), the purchase amounts, and so on, on a monthly or daily basis. After studying this, I want to predict the customer's likely expenditure on each commodity in the coming month. How can this be achieved? What type of database do I have to create? Which statistical technique is suitable? Please share your thoughts, suggestions, and any reference material. Thank you.

    Amir Jalilifard

    I think what you are looking for is "Association Rules" and "Market Basket Analysis" algorithms. Take a look at the paper below:

    I recommend taking a close look at these algorithms.

  • Vladimir Batagelj added an answer:
    How do I normalize values of structural distance-based measure?

    Hi guys,

    I've used a structural distance-based measure to compute the similarity between each pair of nodes in an undirected graph. I calculated a distance matrix "D" such that the distance value "Dij" is simply the length of the shortest path between node i and node j. However, the resulting distance values are absolute (e.g. 5, 19, 3, etc.) and I'd like to normalize them such that 0 <= Dij <= 1.

    The normalized distance value must finally be converted to a similarity value S such that Sij = 1 - Dij.

    Can anyone guide me to an appropriate function for normalizing absolute distances?

    Vladimir Batagelj

    There are many transformations between resemblance measures

    For example you can use

    s(i,j) = 1/(1+D(i,j))                     

    d(i,j) = 1-s(i,j) = D(i,j)/(1+D(i,j))
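    The transformation above can be sketched in a few lines of Python. The example matrix and the function names are made up for illustration. Computing the normalized distance as 1 - s (rather than D/(1+D) directly) is a choice made here so that an infinite distance between disconnected nodes maps cleanly to d = 1 and s = 0.

```python
def similarity(D):
    """s = 1 / (1 + D): 1 for identical nodes, approaching 0 as the distance grows."""
    return 1.0 / (1.0 + D)


def normalized_distance(D):
    """d = 1 - s = D / (1 + D), always in [0, 1]."""
    return 1.0 - similarity(D)


# Hypothetical shortest-path distance matrix for three nodes.
dist = [
    [0, 5, 19],
    [5, 0, 3],
    [19, 3, 0],
]

sim = [[similarity(x) for x in row] for row in dist]
norm = [[normalized_distance(x) for x in row] for row in dist]
```

    Note that this mapping is nonlinear: it compresses large distances. If a linear rescaling is preferred, dividing by the largest finite distance in the graph is another common option.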

  • Aasim Khurshid added an answer:
    What is the running time complexity of SVM and ANN?
    What is the best, worst, and average running time complexity of SVM and ANN?
    Why do most machine learning papers report only the classification accuracy and ignore the running time?
    Aasim Khurshid

    Olivier Chapelle discusses the complexity of SVMs extensively in this paper and proposes an optimization. He argues that although "Support Vector Machines (SVMs) first state the primal optimization problem, and then go directly to the dual formulation", one should solve either the primal or the dual optimization problem depending on whether n is larger or smaller than d, resulting in O(max(n, d) min(n, d)^2) complexity, where X ∈ R^(n×d) is the data matrix holding the coordinates of n points in d dimensions.

    For details, please refer to the paper:

    About ANN: I am sorry, I don't have enough knowledge.

  • Said Jadid Abdulkadir added an answer:
    Which are the new trends in the development of ANNs?
    What kind of goals are sought with such developments?
    Said Jadid Abdulkadir

    Reinforcement learning and, for convolutional networks, their application to video data.

  • Sachin Patil added an answer:
    When and why do we need data normalization?
    Data normalization means transforming all variables in the data to a specific range. My question is when and why do we need data normalization?
    Sachin Patil

    @Juan Luis Herrera Cortijo

    Dear Juan

    Can you elaborate more? Do we need to apply normalization before computing the correlation coefficient?

  • Antonio Miguel added an answer:
    What is the difference between mutual information and information gain?

    Dear all,

    Conceptually, what is the difference between mutual information and information gain?

    I tried an exercise involving mutual information. After finishing the MI calculation, I also computed the information gain from the same figures in a sample table, and I got the same value for MI and IG. Can anyone briefly explain how to differentiate the two concepts?

    Antonio Miguel

    I agree that under that definition they are the same. But since the quantity is symmetric, the notation IG(X|Y) could be confusing because it resembles conditional-probability notation. I would prefer using IG(X,Y) or IG(X;Y), as is done for mutual information.
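    The equality observed in the question can be checked numerically. The sketch below uses a made-up 2x2 table of joint counts and assumes the usual definitions: MI as the sum of p(x,y) log[p(x,y) / (p(x) p(y))], and information gain as H(X) - H(X|Y). The counts and function names are illustrative only.

```python
from math import log2


def entropy(p):
    """Shannon entropy in bits of a probability vector (zero entries are skipped)."""
    return -sum(q * log2(q) for q in p if q > 0)


def marginals(p_xy):
    """Row marginal p(x) and column marginal p(y) of a joint probability table."""
    p_x = [sum(row) for row in p_xy]
    p_y = [sum(col) for col in zip(*p_xy)]
    return p_x, p_y


def mutual_information(p_xy):
    """I(X;Y) = sum over x, y of p(x,y) * log2(p(x,y) / (p(x) * p(y)))."""
    p_x, p_y = marginals(p_xy)
    return sum(
        p_xy[i][j] * log2(p_xy[i][j] / (p_x[i] * p_y[j]))
        for i in range(len(p_x))
        for j in range(len(p_y))
        if p_xy[i][j] > 0
    )


def information_gain(p_xy):
    """IG = H(X) - H(X|Y), with H(X|Y) = sum over y of p(y) * H(X | Y = y)."""
    p_x, p_y = marginals(p_xy)
    h_x_given_y = sum(
        p_y[j] * entropy([p_xy[i][j] / p_y[j] for i in range(len(p_x))])
        for j in range(len(p_y))
        if p_y[j] > 0
    )
    return entropy(p_x) - h_x_given_y


# Hypothetical joint counts: rows are values of X, columns are values of Y.
counts = [[30, 10], [10, 50]]
total = sum(sum(row) for row in counts)
p_xy = [[c / total for c in row] for row in counts]
p_yx = [list(col) for col in zip(*p_xy)]  # same table with the roles of X and Y swapped

mi = mutual_information(p_xy)
ig_x = information_gain(p_xy)   # H(X) - H(X|Y)
ig_y = information_gain(p_yx)   # H(Y) - H(Y|X)
```

    Up to floating-point error, `mi`, `ig_x` and `ig_y` should coincide, which also illustrates the symmetry mentioned above.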

  • Ransalu Senanayake added an answer:
    How can I cluster users based on their posts/queries?

    I have scraped the online posts from various users in a particular domain with following fields in csv format:

    userid, post-question

    xyz, how to install joomla

    abc, problem in adding wizard


    I also have another list with user attributes:




    Now I want to categorize these users based on the topics of the questions they posted. I was expecting outputs such as: the user is a novice, an intermediate, or an expert.


    Ransalu Senanayake

    Read about similarity matching in the book "Mining Massive Data Sets"; it is freely available online. The Jaccard index and cosine similarity can be easily implemented in R.
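    As a rough illustration of the two measures named above, here is a minimal sketch in Python rather than R. It assumes naive whitespace tokenization of the post text; the function names and the example posts are illustrative only.

```python
from collections import Counter
from math import sqrt


def tokens(text):
    """Naive tokenizer (an assumption for this sketch): lowercase and split on whitespace."""
    return text.lower().split()


def jaccard(a, b):
    """Jaccard index of the two posts' word sets: |A ∩ B| / |A ∪ B|."""
    sa, sb = set(tokens(a)), set(tokens(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def cosine(a, b):
    """Cosine similarity of the two posts' word-count vectors."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


post_a = "how to install joomla"
post_b = "how to install drupal"
jac = jaccard(post_a, post_b)
cos = cosine(post_a, post_b)
```

    A pairwise similarity matrix built this way can then be passed to a clustering method. Whether the resulting clusters correspond to novice, intermediate, and expert users is a separate question that these measures alone do not answer.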

    And this seminal paper:

  • Miguel Hernandez-Silveira added an answer:
    How can I evaluate the performance (weights and bias) given from a nonlinear SVM (Support Vector Machine) Kernel Model?

    Given the weights of each attribute and the bias, I don't know how to read these results when they are applied on a non linear Support Vector Machine. I would like to know in which way I should read them and see an example from the dataset I am using.

    Miguel Hernandez-Silveira

    If by performance you mean evaluating the behaviour of your non-linear SVM, I would focus on controlling overfitting and ensuring correct classification. I would do this by applying cross-validation to different models in order to determine the right one in terms of the number of support vectors and kernel functions (model selection), with various options for regularization coefficients and types (shrinkage). I hope this helps.

  • José Francisco Moreira Pessanha added an answer:
    What are your suggestions for Clustering Binary Categorical data?

    I have some data with certain classes (c1, c2, c3, c4, ...). The data consists of binary vectors where 1 and 0 denote whether or not an entry belongs to a class. The number of classes will be > 200.

    Would this data come under the "categorical" type?

    I tried PCA for dimension reduction on this dataset and even got good clusters with DBSCAN, but I read that PCA is not recommended for sparse categorical data and that Euclidean distance is not a good distance measure for it.

    I am planning to use MCA (Multiple Correspondence Analysis), but I cannot figure out how I am supposed to represent the data for that.

    Please find attached the link to a snapshot of the clusters I got after PCA and DBSCAN.

    José Francisco Moreira Pessanha

    Maybe MCA can help you.

    The Hamming distance is another option.
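    For binary membership vectors like those in the question, the Hamming distance simply counts the positions where two vectors differ. A minimal sketch follows; the example vectors and function names are made up for illustration.

```python
def hamming(u, v):
    """Number of positions at which two equal-length binary vectors differ."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a != b for a, b in zip(u, v))


def pairwise_hamming(rows):
    """Symmetric matrix of Hamming distances between all pairs of rows."""
    return [[hamming(r, s) for s in rows] for r in rows]


# Hypothetical entries, each a binary vector over four classes c1..c4.
data = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
dist_matrix = pairwise_hamming(data)
```

    A precomputed distance matrix like `dist_matrix` can be handed to clustering methods that accept arbitrary dissimilarities instead of raw coordinates. Dividing each entry by the vector length gives a normalized version in [0, 1].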

  • Haroldo Fraga de Campos Velho added an answer:
    In choosing an activation function for SVR or a neural network, what is the advantage/disadvantage of a radial basis function compared to a sigmoid function?

    This is with relevance to Machine Learning and application to traffic systems and traffic flow monitoring and control in predictive model using Support vector machine with training performed by Kernel trick or Artificial Neural Network trained with back-propagation.

    Haroldo Fraga de Campos Velho

    @Taiwo: Interesting question.

    1. The conclusion of Oussama Ahmia's paper was that, for a particular application (electrical load prediction), SMR gave better results than ANN and multi-variable regression.

    2. Andrey Yu. Shelestov and Grzegorz Dudek stated that SVM is a global classifier and ANN is a local classifier. However, both also stated that the better result depends on the modeler and/or on trial-and-error experiments.

    3. Mahboobeh Parsapoor suggested using SVM when the number of dimensions is greater than the number of samples, and pointed to a paper supporting that statement (I didn't read the paper).

    4. My experience indicates that RBF and multi-layer perceptron networks can be robust when dealing with noisy data. See:




    5. You can automatically select the activation function, and other parameters/functions in the SVM and/or ANN, formulating the problem (configuration of SVM, SVR, ANN) as an optimization one. See some papers:

    • Source
      ABSTRACT: The multi-particle collision algorithm (MPCA) is applied to design an optimum architecture for a supervised ANN. The MPCA optimization algorithm emulates a collision process of multiple particles, inspired by the processes of a neutron traveling in a nuclear reactor. The procedure for automatically configuring a multi-layer perceptron (MLP) neural network is applied to identify vertical temperature profiles from measured satellite radiance data. The MLP-NN is trained with data provided by the direct model characterized by the Radiative Transfer Equation (RTE). The MLP-NN results are compared to the ones computed using regularized inverse solutions. In addition to synthetic data (corrupted by noise), real radiation data from the HIRS/2 (High Resolution Infrared Radiation Sounder) is used as input for the MLP-NN to generate temperature profiles that are compared with the temperature profiles measured by a radiosonde. The comparison between the results obtained with the automatic process and a previous configuration chosen by an expert is evaluated.
      EngOpt 2012 - International Conference on Engineering Optimization; 07/2012
  • Jorge Lopez-Cifre added an answer:
    Anyone familiar with ICLR papers? (They are all available at

    This is not a question, it is just an announcement. Since I don't know how to post an announcement or some news I post a question. 

    All paper submitted to ICLR are available at I think that all researchers interested in machine learning and/or pattern recognition must take a look.

    Jorge Lopez-Cifre

    Thanks for sharing Konstantinos !

  • İzzet Pembeci added an answer:
    How should I proceed to implement NLP (almost) from scratch (collobert 2011)?

    There is a famous JMLR paper by Ronan Collobert called "Natural Language Processing (almost) from scratch". I want to implement it. Can anyone tell me how I should proceed and which language would be appropriate? I could go for Torch or Theano.

    İzzet Pembeci

    That paper follows a very different approach to NLP. So libraries like NLTK will not help you. What I can suggest is search for Python and DeepLearning and then try to use one of those frameworks for your implementation. That way you won't need to implement NN parts (a huge undertaking) but just configure them for your project.

  • Katharina Morik added an answer:
    What is the best algorithm for supervised text categorization of papers?

    I have a training set of ca. 2500 manually categorized abstracts and want to automatically categorize 4M papers. What is the best tool to download? Till now, I have only 90% with Kappa at 0.8 in cross validation. I used NaiveBayesMultinomial on 2500 features -- statistics of most frequent words. As preprocessing, I removed stop words and stemmed the text. 
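    For readers unfamiliar with the baseline described above, here is a minimal sketch of a multinomial Naive Bayes text classifier with add-one (Laplace) smoothing. It is a from-scratch illustration on made-up toy documents, not the NaiveBayesMultinomial implementation used in the question; stop-word removal and stemming are assumed to have happened already, and all names are hypothetical.

```python
from collections import Counter, defaultdict
from math import log


def train(docs):
    """docs: list of (tokens, label) pairs. Returns per-class document and word counts."""
    class_docs = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for toks, label in docs:
        word_counts[label].update(toks)
        vocab.update(toks)
    return class_docs, word_counts, vocab


def predict(model, toks):
    """Return the class with the highest log posterior under add-one smoothing."""
    class_docs, word_counts, vocab = model
    n_docs = sum(class_docs.values())
    best_label, best_score = None, float("-inf")
    for label, n in class_docs.items():
        total = sum(word_counts[label].values())
        score = log(n / n_docs)  # log prior of the class
        for t in toks:
            if t in vocab:  # words never seen in training are ignored
                score += log((word_counts[label][t] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, already-preprocessed training abstracts (made up for this sketch).
training = [
    ("gene expression cancer".split(), "bio"),
    ("protein gene sequence".split(), "bio"),
    ("neural network training".split(), "cs"),
    ("network kernel training".split(), "cs"),
]
model = train(training)
label_a = predict(model, "gene sequence analysis".split())
label_b = predict(model, "kernel network".split())
```

    With only word counts to store, this kind of model scales easily to millions of documents, which is one reason it is a common baseline before trying heavier classifiers.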

    Katharina Morik

    RapidMiner (open source, free download) has a text processing extension with all the preprocessing (stemming, bag of words with TF-IDF or binary weights, etc.), and then you can run an SVM and get the results. It is just drag and drop, and the system takes care of the rest. Try it!

  • Mohammad Mahdi Momenzadeh added an answer:
    Is there a high-resolution training image database for pattern recognition purpose?

    Hello everyone!

    I'm developing a pattern recognition algorithm for images. So far I have been using the MNIST database, but for some reason I need to switch to another database with higher resolution. I would highly appreciate it if someone could help me find one, and if anyone knows a trick for that, that would be great!

    Thanks in advance.

    Mohammad Mahdi Momenzadeh

    I don't need images with so much detail; I just need a database that provides gray images with few details.
