Machine Learning

  • Patricia Ryser-Welch added an answer:
    Which feature selection algorithm is better to use for non-numerical dataset?

    The input is a movie dataset, e.g. a CSV file of user preferences on features of movies, and the output should be the most important features. I found a Java ML library and it works well with its sample dataset; however, it does not seem to work for my dataset, which is not numerical.

    Patricia Ryser-Welch

    Have you tried neural networks? They are used by Microsoft and other companies that apply machine learning to classify data.
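    Before reaching for a neural network, a simple, library-free way to score non-numerical features is the mutual information between each categorical feature and the class label. A minimal Python sketch; the movie columns below are made-up illustrations, not from the asker's dataset:

    ```python
    from collections import Counter
    from math import log2

    def mutual_information(feature, labels):
        """Mutual information I(F;L) between a categorical feature
        and a class label, in bits. Higher = more informative."""
        n = len(labels)
        pf = Counter(feature)
        pl = Counter(labels)
        pfl = Counter(zip(feature, labels))
        mi = 0.0
        for (f, l), c in pfl.items():
            p_fl = c / n
            mi += p_fl * log2(p_fl / ((pf[f] / n) * (pl[l] / n)))
        return mi

    # rank features of a toy movie table by MI with the "liked" label
    genre  = ["action", "drama", "action", "drama", "action", "drama"]
    length = ["long", "long", "short", "short", "long", "short"]
    liked  = ["yes", "no", "yes", "no", "yes", "no"]

    scores = {"genre": mutual_information(genre, liked),
              "length": mutual_information(length, liked)}
    ```

    Here "genre" perfectly predicts "liked" and scores 1 bit, while "length" scores much lower, so you can rank features without ever converting them to numbers.
    
    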

  • Ebenezer R.H.P. Isaac asked a question:
    What is the minimum sample size required to train a Deep Learning model - CNN?

    It is true that the sample size depends on the nature of the problem and the architecture implemented. But, on average, what is the typical sample size utilized for training a deep learning framework?

    For instance, in a convolutional neural network (CNN) used for a frame-by-frame video processing, is there a rough estimate for the minimum no. of samples required to train the model?

  • Prem Sankar Chakkingal added an answer:
    What is ecorithm?
    I'm reading a book called "Probably Approximately Correct" by Leslie Valiant. I'm on page 48 now and haven't got a tangible sense of what exactly the author means by the term "ecorithm". Does anybody have an idea of what "ecorithm" is all about?
    Prem Sankar Chakkingal

    You think of an algorithm as something running on your computer, but it could just as easily run on a biological organism. But in either case an ecorithm lives in an external world and interacts with that world. An ecorithm is an algorithm, but its performance is evaluated against input it gets from a rather uncontrolled and unpredictable world. And its goal is to perform well in that same complicated world.

  • Ravi Sah asked a question:
    Where is the best place for a machine learning internship in India?

    Please mention the name of the institute and the professors.

  • James Tseng added an answer:
    What method should be used for classification of big data with an unknown number of clusters?
    I'm looking for a method for unsupervised classification of big data with an unknown number of clusters. Can you suggest a robust method? Is there any Matlab toolbox dedicated to this purpose?
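    One common recipe when the number of clusters is unknown is to sweep k and pick the value that maximizes the mean silhouette score. A self-contained NumPy sketch on toy data (not big-data scale; for genuinely big data you would use mini-batch variants of the same idea):

    ```python
    import numpy as np

    def init_centers(X, k):
        # deterministic farthest-point initialization
        centers = [X[0]]
        for _ in range(k - 1):
            d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
            centers.append(X[int(d.argmax())])
        return np.array(centers)

    def kmeans(X, k, iters=50):
        centers = init_centers(X, k)
        lab = np.zeros(len(X), dtype=int)
        for _ in range(iters):
            d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
            lab = d.argmin(axis=1)
            for j in range(k):
                if np.any(lab == j):
                    centers[j] = X[lab == j].mean(axis=0)
        return lab

    def mean_silhouette(X, lab):
        # silhouette s(i) = (b - a) / max(a, b)
        D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
        s = []
        for i in range(len(X)):
            same = lab == lab[i]
            same[i] = False
            if not same.any():
                continue
            a = D[i, same].mean()
            b = min(D[i, lab == c].mean() for c in np.unique(lab) if c != lab[i])
            s.append((b - a) / max(a, b))
        return float(np.mean(s))

    # two well-separated blobs: the silhouette sweep should pick k = 2
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(10, 0.5, (50, 2))])
    best_k = max(range(2, 6), key=lambda k: mean_silhouette(X, kmeans(X, k)))
    ```

    The same sweep works with any clustering backend; density-based methods such as DBSCAN avoid choosing k altogether, at the cost of tuning a neighbourhood radius instead.
    
    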
    James Tseng

  • Jakob Nikolas Kather added an answer:
    Effect of imbalanced data on machine learning


    I am working with a database of facial expressions that has imbalanced data. For example, there are four times more examples of the expression "happiness" than of the expression "disgust".

    I am using the libsvm library to learn the model. When I train an SVM on the imbalanced dataset I get an accuracy of 45%. But when I artificially balance the data by copy-pasting the undersampled expressions, I get an accuracy of 80%.

    Now my questions are:

    1. Is this way of balancing the data acceptable in scientific community? 

    2. Should I report both the accuracies or just the best one? 

    3. How to explain this phenomenon?

    Thank you in advance.  

    Jakob Nikolas Kather

    Dear Rizwan,
    thank you for the additional information! As I see it, this way of balancing the data is not acceptable. Here's why: To balance classes, you duplicate items in your dataset. Then, you perform 10-fold cross validation, i.e. you perform 10 rounds of training and testing. It is almost certain that some of the duplicate items will end up in the testing set. This means that the accuracy is assessed on the same items the classifier was trained on, in other words: the testing set is not "unknown" to the classifier. Consequently, the accuracy is biased. In your case, this explains why the accuracy rises from 45% to 80%.

    I suggest three alternatives:
    a) To balance the classes, you acquire more raw data (This may be difficult, but is the most straightforward solution).
    b) To balance the classes, you discard items from the larger classes (This is painful, but is a good alternative).
    c) You accept class imbalances and try to use a different classification approach. For example, you can use RUSBoost, which has been shown to be insensitive to class imbalances [1]. Plus, this method is available in Matlab [2].
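    If you do oversample, the key point from the answer above is ordering: split first, then duplicate only inside the training portion, so duplicates can never leak into the test set. A minimal NumPy sketch of that protocol (illustrative, not tied to libsvm; the 80/20 class sizes are made up):

    ```python
    import numpy as np

    def split_then_balance(X, y, test_frac=0.2, seed=0):
        """Split first, then oversample minority classes in the
        training set only, so duplicates never reach the test set."""
        rng = np.random.default_rng(seed)
        idx = rng.permutation(len(y))
        n_test = int(len(y) * test_frac)
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        # top up each class to the majority-class count, training set only
        classes, counts = np.unique(y[train_idx], return_counts=True)
        target = counts.max()
        balanced = []
        for c, cnt in zip(classes, counts):
            members = train_idx[y[train_idx] == c]
            extra = rng.choice(members, size=target - cnt, replace=True)
            balanced.append(np.concatenate([members, extra]))
        return np.concatenate(balanced), test_idx

    X = np.arange(100).reshape(-1, 1)      # stand-in features
    y = np.array([0] * 80 + [1] * 20)      # 4:1 imbalance
    train_bal, test_idx = split_then_balance(X, y)
    ```

    By construction every duplicated index comes from the training split, so the test set stays "unknown" to the classifier and the reported accuracy is not inflated.
    
    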

    Best regards,

    [1] C. Seiffert, T. M. Khoshgoftaar, J. Van Hulse, and a. Napolitano, “RUSBoost: Improving classification performance when training data is skewed,” 2008 19th Int. Conf. Pattern Recognit., pp. 8–11, 2008.

  • J. B. K Asiedu added an answer:
    Which are the best open source tools for image processing and computer vision?
    Like OpenCV, which are the best alternative open source tools for the development of image processing and computer vision algorithms?
    J. B. K Asiedu

    I am trying to do feature extraction on some raster images. Can anyone recommend some good software?

  • Arturo Geigel added an answer:
    Big Data analysis with machine learning techniques with evolutionary computation approach?


    I'm searching for methods to analyze Big Data using Machine Learning techniques with Evolutionary Computation approach.

    Can Evolutionary Computation methods, along with Machine Learning methods, help to deal with Big Data?

    Which methods from each of the mentioned areas could work and be combined together to achieve this goal?

    Thanks for your help and time in answering my question and helping me find good resources to study.

    Saeed Jannesar 

    Arturo Geigel

    Can Evolutionary Computation methods, along with Machine Learning methods, help to deal with Big Data?

    In answering this question you must ask yourself first:

    What is big data? What are the inherent problems of big data?

    In answering these questions you must entertain that:

    1) Most standard machine learning techniques do not scale well with increasing amounts of data.

    2) You might focus on inferential techniques to help manage the amount of data so that inefficient learning algorithms can handle it. Alternatively, you can focus on how to efficiently implement online algorithms with evolutionary computation that can handle streaming data and avoid catastrophic forgetting (which I find an interesting problem to solve).

    3) Even though we are using concepts such as MapReduce to handle parallel, distributed processing, at one point as data processing becomes more complex, it will become costly to the point of being prohibitive (in terms of cost per kWh). More refined strategies and algorithms are needed in the long run.
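    As a concrete, if toy-sized, illustration of evolutionary search, here is a minimal genetic algorithm maximizing the classic OneMax fitness; in a real big-data pairing, the fitness function would instead be a model's validation score on a data batch (all parameter values below are arbitrary illustrations):

    ```python
    import random

    def onemax(bits):
        # toy fitness: number of 1-bits in the genome
        return sum(bits)

    def evolve(n_bits=20, pop_size=30, gens=60, seed=42):
        rng = random.Random(seed)
        pop = [[rng.randint(0, 1) for _ in range(n_bits)]
               for _ in range(pop_size)]
        for _ in range(gens):
            pop.sort(key=onemax, reverse=True)
            parents = pop[:pop_size // 2]          # truncation selection
            children = []
            while len(children) < pop_size - len(parents):
                a, b = rng.sample(parents, 2)
                cut = rng.randrange(1, n_bits)     # one-point crossover
                child = a[:cut] + b[cut:]
                i = rng.randrange(n_bits)          # single-bit mutation
                child[i] ^= 1
                children.append(child)
            pop = parents + children               # elitist replacement
        return max(pop, key=onemax)

    best = evolve()
    ```

    Swapping the fitness for a cross-validated model score turns this into evolutionary hyperparameter or feature-subset search, which is one practical way the two fields combine.
    
    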

  • Ahmed Aljaaf added an answer:
    What are the possible approaches for solving imbalanced class problems?
    Ahmed Aljaaf

    Hi Yasir, 

    You can do it manually.

  • Shah Limon added an answer:
    What are your recommendations for good research topics for a master's thesis in machine learning or Big Data?

    Hi everyone, I am a master's student and I want to do my thesis on a topic related to computer science applied to business or industry, or on a topic related to machine learning. The problem is that I have not found novel topics to research. Every idea I have had has already been developed.

    Shah Limon

    You can combine big data computing with sustainable manufacturing / transportation/ agricultural / renewable energy system development.

  • Abbas Chokor added an answer:
    How to validate the clusters formed for the Banking Customers Data?

    After forming clusters of bank customers using various algorithms such as k-means and decision trees, what are the ways we can validate the clusters formed by our method?

    Abbas Chokor

    The purity of the clusters is a main metric. Check this link:

    Good luck!
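    Cluster purity, mentioned above, is simple to compute by hand when you have ground-truth labels: assign each cluster its majority class and count the agreements. A small Python sketch:

    ```python
    from collections import Counter

    def purity(clusters, labels):
        """Purity = (1/N) * sum over clusters of the size of the
        largest single-class group inside that cluster."""
        n = len(labels)
        total = 0
        for c in set(clusters):
            members = [labels[i] for i in range(n) if clusters[i] == c]
            total += Counter(members).most_common(1)[0][1]
        return total / n

    # toy example: two clusters, each with one mislabeled member
    p = purity([0, 0, 0, 1, 1, 1], ["a", "a", "b", "b", "b", "a"])
    ```

    A purity near 1.0 means the clusters are class-homogeneous; note that purity trivially increases with the number of clusters, so it is best reported alongside other indices (e.g. silhouette or Rand index).
    
    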

  • Gayathri Varu added an answer:
    How to update weights using the error back-propagation algorithm in a neural network?

    Good evening,

    I am solving a problem using the error back-propagation algorithm with classification of imbalanced data. In my work I use the Ann-thyroid dataset, which consists of three classes (a, b, c). I split the dataset as Ann-thyroid 13(23); classes 1 and 2 are the minority classes and class 3 is the majority class. The weight-update rule weakens the weight update for the majority class and intensifies it for the minority class. My questions are: how do I classify the dataset, and how do I update the weights? I have attached my code and the base paper. Can anybody please help?
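    A minimal NumPy sketch of back-propagation with per-class weighting follows; the dataset here is synthetic (not Ann-thyroid), and the 4.0 minority weight is an arbitrary illustration of "intensifying" the minority update:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # toy imbalanced data: 40 majority (label 0), 10 minority (label 1)
    X = np.vstack([rng.normal(-1, 0.5, (40, 2)), rng.normal(1, 0.5, (10, 2))])
    y = np.array([0.0] * 40 + [1.0] * 10)
    # per-sample weights: intensify minority, weaken majority (illustrative)
    w_cls = np.where(y == 1, 4.0, 1.0)

    # one hidden layer, sigmoid activations
    W1 = rng.normal(0, 0.5, (2, 8)); b1 = np.zeros(8)
    W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))

    def loss():
        h = sig(X @ W1 + b1)
        p = sig(h @ W2 + b2).ravel()
        return float(np.mean(w_cls * (p - y) ** 2))

    initial = loss()
    lr = 0.5
    for _ in range(300):
        h = sig(X @ W1 + b1)                       # forward pass
        p = sig(h @ W2 + b2).ravel()
        # backward pass: delta rule, each sample's error scaled
        # by its class weight before propagating
        d_out = (w_cls * (p - y) * p * (1 - p))[:, None]   # (N, 1)
        d_hid = (d_out @ W2.T) * h * (1 - h)               # (N, 8)
        W2 -= lr * h.T @ d_out / len(X); b2 -= lr * d_out.mean(0)
        W1 -= lr * X.T @ d_hid / len(X); b1 -= lr * d_hid.mean(0)
    final = loss()
    ```

    The only change relative to plain back-propagation is the `w_cls` factor in `d_out`; setting all weights to 1.0 recovers the standard update.
    
    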


    Gayathri Varu

    Thank you, sir. I will refer to the book.

  • Rahul Sharma added an answer:
    Is there a solution for a holistic performance monitoring of cloud?

    I want to know what problems there would be if I applied a score-calculation method in the platform layer of a cloud environment to calculate scores for the potential interference that arises when an application runs on the cloud. Suppose my base case is one VM running and hosting the application. Say I take this base case as a standard and develop a machine learning algorithm that sits in the platform layer or right in the hypervisor (I don't know yet if I can do that in the hypervisor) to calculate scores when other applications are running on the cloud (e.g. scores for network usage, CPU, uptime, VM idle state and numerous other parameters), in order to obtain a holistic view of the performance of my application or of the overall cloud environment. What would be the trade-offs or problems with such a system? Is it even possible? Just a wild concept!

    Rahul Sharma

    Yeah that kind of answers my question, but is there any paper that I can go through to actually understand the underlying solution? 

  • Samuel Foster added an answer:
    Neural network outputting average values of all outputs - Am I doing anything wrong?

    I'm attempting to design a neural network to approximate SED fitting.

    I have a set of 20,000 runs through the SED fitting program, MAGPHYS.

    Each run contains:

    •  A set of 43 input features, which contains 21 values representing a histogram of observed radio fluxes over wavelength and 21 associated values representing signal to noise ratio for each flux value, as well as 1 redshift value
    • 32 useful output features (there are other outputs from the system, but I don't need them).

    I've built a neural network in Keras to attempt to learn this function. Currently I'm using 40 hidden nodes per layer and 4 hidden layers, in addition to one input and one output layer. The input and hidden layers are all using TanH activation functions and the output layer is using a Linear activation function.

    I am normalising my input and output data to the range 0 to 1 using min-max normalisation.

    I've tried a lot of different combinations of neural network parameters, such as:

    • Different optimisers (Stochastic gradient descent, Adagrad, Adam etc.) 
    • Different learning parameters (momentum, learning rate, weight decay)
    • Different number of hidden nodes and different number of hidden layers
    • Using a TanH output layer
    • Normalising my input data between -1 and 1 instead of 0 and 1.
    • Using only the 21 flux readings without the signal to noise ratios (reducing the inputs to the system to 22 rather than 43)

    Regardless of any of these parameters, the network always seems to output values that are very close to the averages for each of the 32 outputs. Sometimes the network will output exactly the same number for every test, or sometimes these values will vary slightly, but will still be a value very close to the average.

    What would cause my neural network to always output values like this?

    Am I doing something wrong with my network design, or is there something else that I'm missing?

    Is there anything I can try in an effort to get my network to actually learn this function correctly?
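    One way to see why a network that has not learned the input-output mapping drifts towards per-output averages: under mean-squared-error loss, the best constant prediction is exactly the target mean. A quick NumPy check (synthetic targets, purely illustrative):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    y = rng.normal(3.0, 1.5, 1000)       # training targets for one output

    def mse(c):
        # loss of a model that always predicts the constant c
        return float(np.mean((y - c) ** 2))

    # setting d/dc mean((y - c)^2) = -2 * mean(y - c) = 0 gives c = mean(y)
    best = float(y.mean())
    m_best, m_lo, m_hi = mse(best), mse(best - 0.1), mse(best + 0.1)
    ```

    So when the gradients from the inputs carry no usable signal (scaling issues, insufficient capacity, or a genuinely hard mapping), the network minimizes what it can and collapses to each output's mean, which matches the behaviour described above.
    
    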

    Samuel Foster

    Thank you everyone, your recommendations have been absolutely invaluable, especially for my understanding of neural networks. I haven't been able to crack this problem yet, but here are a few things I've attempted:

    • Attempted different network configurations with different numbers of hidden nodes and layers. The one with the lowest loss out of all my tests was a network with 300 hidden nodes in each layer and 2 hidden layers.
    • Using an autoencoder to reduce the input's dimension size. This somewhat worked, but even attempting to reduce the input's dimension by 1 (by having an autoencoder whose hidden layer has one fewer node than the input and output layers) would result in a fairly large decrease in accuracy for some of the reproduced values. Not sure if I should proceed with this approach for dimension reduction.

    • Binning ranges of values into percentile bins for that specific input or output (i.e. from percentile 0 to percentile 1 of all the values for that input/output is one bin, from percentile 1 to 2 is another bin, etc.). All values are categorized by which bin range they fall in and are replaced with a value between 0 and 100. I tried this on both the inputs and outputs of the network, but its predictions still remained mostly incorrect.

    • I'm now always using dropout in all of my network tests and after reading a few other sources explaining it, I can see the obvious benefits of using it. I'm also using ReLU activations now.

    • I've tried running the network for each output individually (creating 32 different networks). The prediction values were similar to those given by the full network.

    I have attached a link to a pastebin showing some of the outputs given by the network to better illustrate what the issue is:

    In these examples I am not using the signal to noise values, cutting the input size down from 43 to 22.

    Values on the left are the network's predictions, and values on the right are the correct values.

    In this case, each of the network's input values are being standardised (centred on zero with a standard deviation of 1).

    Sometimes the predicted values can be quite close, but in most cases the values are quite off and just seem to be randomly distributed around the mean for that output.

  • Obinna Igbe added an answer:
    Where can I get a labelled version of the ADFA-LD dataset for HIDS evaluation?

    I have been able to download the ADFA-LD dataset for IDS evaluation, but after opening the files, I noticed that each .txt file only had a bunch of integers spanning multiple lines. Unlike the KDD dataset (which has received much negative criticism), there was no label to enable me to understand what each entry represented. How can I get a labelled one, or another good dataset for HIDS evaluation that has labels?

    Obinna Igbe

    @ Manmeet, follow the attached link to an answer on Quora with links to recent data sets.

  • George Bekas added an answer:
    Can WEKA be used to make two predictions on a given set of data?

    Generally, a machine learning system would only make a single prediction among available classes (classification) or predict a single real valued number (regression) based on the given set of features for an instance.

    Would it be possible to train a system in WEKA to make two predictions for a given sample?

    Although this question could be useful from many angles, a particular use case where I would like this functionality is predicting an (x, y) coordinate given a set of features of an image. Here, x and y are the two variables to be predicted.

    Does WEKA tool have the ability to do this?

    George Bekas

    The term is multivariate multiple regression.

    You can find a tutorial about how to form it in R, here:
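    Outside R or WEKA, the core idea of multivariate multiple regression is easy to see with a single least-squares solve that fits both target columns, here the (x, y) coordinate, at once. A NumPy sketch on synthetic data:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # features and a two-column target: the (x, y) coordinate to predict
    A = rng.normal(size=(200, 5))                     # 200 samples, 5 features
    B_true = rng.normal(size=(5, 2))                  # true linear map
    XY = A @ B_true + rng.normal(0, 0.01, (200, 2))   # targets, small noise

    # multivariate multiple regression: one solve, both outputs at once
    B_hat, *_ = np.linalg.lstsq(A, XY, rcond=None)
    pred = A @ B_hat                                  # predicted (x, y) pairs
    ```

    In single-target tools, the usual workaround is equivalent for linear models: train one regressor per output variable and pair up their predictions.
    
    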

  • Peer-Olaf Siebers added an answer:
    What is the best agent-based traffic simulation tool ?
    I am a Ph.D student working on itinerary optimization in urban traffic, using graph algorithms and Multi-agent systems. What's the best simulation tool that is at the same time agent-based and has broad traffic utilities ?
    Thank you.
    Peer-Olaf Siebers

    You might find the following (free) booklet quite informative. "A Primer for Agent-Based Simulation and Modeling in Transportation Applications"

  • Andrey Davydenko added an answer:
    When fitting a non-linear trend, how to judge whether the used function is over fitting or under fitting? Is there any hypothesis testing?


    Andrey Davydenko


    @ Meghan L. Rogers: The question relates to modelling a non-linear trend, and there are many publications on that topic based on assessing the significance of coefficients.

    For example, there's a simple, efficient & well-known algorithm based on testing the significance of regression coefficients to see how many variables we need for non-linear approximation.

    The method is based on a backward elimination procedure (Smith, 1982). The procedure starts with a large number of equidistant knots (so that there are about four or five data points per knot). Then the number of knots is reduced by one at a time until all the regression coefficients for the remaining knots become statistically significant.
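    The backward elimination idea generalizes beyond splines. Here is a hedged NumPy sketch that uses polynomial columns instead of spline knots and a rough |t| >= 2 cutoff (roughly the 5% level); the data, threshold, and column choices are illustrative, not from Smith (1982):

    ```python
    import numpy as np

    def backward_eliminate(X, y, t_crit=2.0):
        """Drop the least significant column of X (never the intercept,
        original column 0) until every remaining |t| exceeds t_crit."""
        keep = list(range(X.shape[1]))
        while len(keep) > 1:
            Xk = X[:, keep]
            beta, *_ = np.linalg.lstsq(Xk, y, rcond=None)
            resid = y - Xk @ beta
            dof = len(y) - len(keep)
            sigma2 = resid @ resid / dof
            cov = sigma2 * np.linalg.inv(Xk.T @ Xk)   # OLS covariance
            t = np.abs(beta) / np.sqrt(np.diag(cov))  # t-statistics
            cand = [i for i in range(len(keep)) if keep[i] != 0]
            worst = min(cand, key=lambda i: t[i])
            if t[worst] >= t_crit:
                break                                  # all significant
            keep.pop(worst)
        return keep

    rng = np.random.default_rng(0)
    n = 200
    x = rng.uniform(-1, 1, n)
    # design: intercept, x, x^2 (real terms) plus two pure-noise columns
    X = np.column_stack([np.ones(n), x, x ** 2,
                         rng.normal(size=n), rng.normal(size=n)])
    y = 1.0 + 2.0 * x + 3.0 * x ** 2 + rng.normal(0, 0.1, n)
    kept = backward_eliminate(X, y)
    ```

    With spline knots the mechanics are identical: each knot contributes a basis column, and the procedure keeps deleting the least significant knot until the remaining coefficients pass the test.
    
    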


    Smith, P. L. (1982). Curve fitting and modeling with splines using statistical variable selection techniques (Report NASA 166034). Langley Research Center, Hampton, VA.

    Stone, C. J. (1986). The dimensionality reduction principle for generalized additive models. Annals of Statistics, 14, 590-606.

  • Aleksandar Sokolovski added an answer:
    I'm looking for C++ implementation of Burg's algorithm?

    Hi, I'm trying to optimize C++ code that implements Burg's algorithm (the arburg() function). Any/all tips highly appreciated. Thank you.

    Aleksandar Sokolovski


    I am more into R than Matlab; I will check my materials and get back to you.

    I may have a similar algorithm written in R from back in my postgrad days.

    Feel free to message me in case I forget to reply.

    Regards, Aleksandar
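    Until an R or C++ version turns up, a compact NumPy reference implementation of Burg's recursion can serve as a porting baseline. This is a sketch, not MATLAB's arburg(), though it follows the usual conventions: it returns the coefficients [1, a1, ..., ap] and the final prediction-error power:

    ```python
    import numpy as np

    def arburg(x, order):
        """Burg's method: estimate AR coefficients via reflection
        coefficients, minimizing forward + backward prediction error."""
        x = np.asarray(x, dtype=float)
        N = len(x)
        a = np.array([1.0])
        E = x @ x / N
        f = x.copy()                 # forward prediction errors
        b = x.copy()                 # backward prediction errors
        for m in range(order):
            ff = f[m + 1:].copy()
            bb = b[m:N - 1].copy()
            k = -2.0 * (ff @ bb) / (ff @ ff + bb @ bb)  # reflection coeff.
            f[m + 1:] = ff + k * bb                     # error updates
            b[m + 1:] = bb + k * ff
            a = np.concatenate([a, [0.0]])
            a = a + k * a[::-1]                         # Levinson step
            E *= 1.0 - k * k
        return a, E

    # sanity check on a synthetic AR(1) process x[n] = 0.9 x[n-1] + e[n]
    rng = np.random.default_rng(0)
    e = rng.normal(0, 1, 5000)
    x = np.zeros(5000)
    for n in range(1, 5000):
        x[n] = 0.9 * x[n - 1] + e[n]
    a, E = arburg(x, 1)
    ```

    The inner loop is all vector dot products and scaled additions, so a C++ port is mostly a matter of replacing the NumPy slices with pointer ranges; the slice copies (`ff`, `bb`) are the part worth optimizing away.
    
    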

  • Anand Y. Kenchakkanavar added an answer:
    How to draw ROC curves for multi-class classification problems?

    How to draw ROC curves?
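    For the multi-class case, the standard recipe is one-vs-rest: draw one binary ROC curve per class (optionally micro- or macro-averaging them). A library-free NumPy sketch with a toy score matrix (the scores below are made up for illustration):

    ```python
    import numpy as np

    def roc_curve(scores, labels):
        """FPR/TPR pairs from sweeping a threshold over the scores
        (labels are 1 for the positive class, 0 otherwise)."""
        order = np.argsort(-scores)
        labels = np.asarray(labels)[order]
        tps = np.cumsum(labels)            # true positives at each cut
        fps = np.cumsum(1 - labels)        # false positives at each cut
        tpr = tps / max(tps[-1], 1)
        fpr = fps / max(fps[-1], 1)
        return np.concatenate([[0.0], fpr]), np.concatenate([[0.0], tpr])

    def auc(fpr, tpr):
        # trapezoidal area under the curve
        return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))

    # multi-class: one-vs-rest, one curve (and AUC) per class
    y = np.array([0, 0, 1, 1, 2, 2])
    S = np.array([[0.9, 0.05, 0.05],       # rows = samples,
                  [0.8, 0.1, 0.1],         # columns = per-class scores
                  [0.2, 0.7, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.1, 0.2, 0.7],
                  [0.2, 0.2, 0.6]])
    aucs = {c: auc(*roc_curve(S[:, c], (y == c).astype(int)))
            for c in range(3)}
    ```

    Each `roc_curve` call yields the (FPR, TPR) points to plot for that class; on this perfectly separable toy data every per-class AUC is 1.0.
    
    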

    Anand Y. Kenchakkanavar

  • Raphaël Feraud added an answer:
    Which are the new trends in the development of ANNs?
    What kind of goals are sought with such developments?
    Raphaël Feraud


    Maybe there is something definitively new in these two papers:

    I think that the use of this kind of explicit memory in neural networks or more generally in machine learning can open new doors.

  • Nisha Abhijeet Auti added an answer:
    Where can i get the food database for machine learning?

    Specifically, I am interested in an Indian food database.

    Nisha Abhijeet Auti

    Thank you all for such quick replies to my question. Vinod, this database is really helpful, but it contains US food; I am interested in Indian food.

  • Arian Razmi Farooji added an answer:
    What is the relationship between the Data Science, Machine Learning and Big Data?

    As per my understanding, all three fields, i.e. Data Science, Machine Learning and Big Data, are interrelated. I just want to clarify the boundaries of these three fields. Please share your views on the topic. Any example or discussion would be useful.

    Arian Razmi Farooji


    Machine learning is one of the sources that can create Big Data, and Data Science helps scientists analyze Big Data and get the desired results, which can be used later for decision making.

  • Uwe Reichel added an answer:
    Is it possible to select important input features from a set of inputs using WEKA software?

    Can anyone please tell me whether there is any option to select significant input features from a set of input variables for ANN model development in the WEKA software? Since I have never worked with this open source tool, I do not know this aspect.

    Any suggestion/guidance would be highly appreciated. Thanks in advance for your time and attention.

    Uwe Reichel

    To get started with the Explorer GUI (WEKA version 3.6.13), you can choose an attribute filter for your purpose (supervised learning, I assume) this way:

    Open an ARFF file, then:

     Choose > Filters > supervised > attribute > AttributeSelection > Apply

    The column number in the 'Current relation' window should now be reduced.

    For more detailed information about WEKA feature selection see the links below.


  • Patricia Ryser-Welch added an answer:
    Which algorithmic concept is suitable for building self-learning intelligence in a machine?

    I am using a Genetic Programming concept in my embedded project. So far it works fine, but I am still confused at some stages, because there are a lot of logistics operations in my project.

    Patricia Ryser-Welch

    I would look into using a form of Genetic Programming with hill-climbers. Perhaps you may consider evolving some neural networks.
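    To make "hill-climber" concrete, here is a minimal sketch; in a GP setting the perturbation would mutate a program tree and the objective would be your task fitness (the quadratic objective below is just an illustration):

    ```python
    import random

    def hill_climb(f, x0, step=0.1, iters=500, seed=0):
        """Greedy hill climbing: accept a random perturbation
        only when it improves the objective f."""
        rng = random.Random(seed)
        x, fx = x0, f(x0)
        for _ in range(iters):
            cand = x + rng.uniform(-step, step)
            fc = f(cand)
            if fc > fx:
                x, fx = cand, fc          # keep only improving moves
        return x, fx

    # toy objective with a single peak at x = 3
    best_x, best_f = hill_climb(lambda x: -(x - 3.0) ** 2, x0=0.0)
    ```

    A hill-climber only ever accepts improvements, so it can stall on local optima; that is exactly why it is often paired with GP's population-based search, which supplies the diversity a single climber lacks.
    
    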

  • Kamran Kowsari added an answer:
    How do I perform depth-based object (table) segmentation using Kinect2?

    I want to segment a table in the depth image based on depth information obtained from a Kinect2. The problem with the table is that it is in front of the camera and covers a large depth range. Depth thresholding also eliminates other objects in the scene at the same depth level as the table. Any idea would be highly appreciated!

    Kamran Kowsari

    Please read my two recent papers about object detection with an RGB-D camera.
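    Since a single depth threshold cannot isolate a table that spans many depths, a common alternative in point-cloud processing (e.g. in PCL) is to fit the dominant plane with RANSAC and take its inliers as the table surface. A self-contained NumPy sketch on synthetic points, not actual Kinect2 data:

    ```python
    import numpy as np

    def ransac_plane(P, iters=200, tol=0.02, seed=0):
        """Fit the dominant plane in a point cloud by RANSAC:
        repeatedly pick 3 points, form a plane, count inliers."""
        rng = np.random.default_rng(seed)
        best_mask, best_model = None, None
        for _ in range(iters):
            i, j, k = rng.choice(len(P), 3, replace=False)
            n = np.cross(P[j] - P[i], P[k] - P[i])
            norm = np.linalg.norm(n)
            if norm < 1e-9:
                continue                       # degenerate (collinear) sample
            n = n / norm
            d = -n @ P[i]
            dist = np.abs(P @ n + d)           # point-to-plane distances
            mask = dist < tol
            if best_mask is None or mask.sum() > best_mask.sum():
                best_mask, best_model = mask, (n, d)
        return best_mask, best_model

    # synthetic scene: a flat "table top" at z = 0.8 plus scattered clutter
    rng = np.random.default_rng(1)
    table = np.column_stack([rng.uniform(0, 1, 300),
                             rng.uniform(0, 1, 300),
                             np.full(300, 0.8)])
    clutter = rng.uniform(0, 1.5, (100, 3))
    P = np.vstack([table, clutter])
    mask, model = ransac_plane(P)
    ```

    On real Kinect2 data you would first back-project the depth image to 3D points using the camera intrinsics; the plane's inliers then give the table mask regardless of how large a depth range it spans, and objects at the same depth survive because they lie off the plane.
    
    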

  • Medhini Narasimhan added an answer:
    What are anomaly detection benchmark datasets?
    I would like to experiment with one of the anomaly detection methods. What dataset could be a good benchmark?
    Medhini Narasimhan

    For anomaly detection in crowded scene videos you can use -

    • The UCSD annotated dataset available at this link :
    • University of Minnesota unusual crowd activity dataset :
    • Signal Analysis for Machine Intelligence :

    For anomaly detection in surveillance videos -

    • Virat video dataset  -
    • McGill University :

    Hope this helps!


  • Muhammad Yousefnezhad added an answer:
    What is the co-relation between Machine Learning and Data Mining?

    Is there any relation between Machine Learning and Data Mining?

    Muhammad Yousefnezhad


    As a paradigm for exploring knowledge, Data Mining is the computational process of discovering information and knowledge in large data sets or big data. It can draw on statistics, machine learning, database systems, etc. As one of the critical tools for applying Data Mining, Machine Learning evolved from the study of pattern recognition and computational learning theory in artificial intelligence. As a result, Machine Learning mostly refers to a wide range of tools for pattern analysis, whereas Data Mining mostly refers to the whole process of information/knowledge discovery.

  • Abdollah (Iman) Dehzangi added an answer:
    New approaches for machine learning for small data?

    I'm working on predicting users based on their manner of using smartphones. I have a dataset (the number of samples is one thousand, which can be a small amount of data). Are there machine learning approaches valid with this number of samples?

    Abdollah (Iman) Dehzangi

    Check Weka and you will find heaps of awesome classifiers.

    I would suggest using SVM (with linear, polynomial (degree 3), and RBF kernels). Also Random Forest (with 100 or fewer base learners), AdaBoost.M1 (using C4.5 and 100 or fewer base learners), LogitBoost, and Naive Bayes. They might help.

  • Chawki Djeddi added an answer:
    Can someone help with normalizing the output of One-Class SVM Classifier?

    Dear collegues,

    I'm not a machine learning expert, but I'm working on a pattern recognition problem using a One-Class SVM classifier. I initially train the classifier on positive samples and then test new data with the trained classifier. I get raw positive and negative values from the classifier that are hard to interpret on their own.

    How can I get meaningful normalized output (confidence) from this classifier ?

    Best Regards.
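    One common way to turn raw decision values into a normalized confidence is a logistic squash in the spirit of Platt scaling. A hedged NumPy sketch; the scale heuristic below is an illustration only, and for genuinely calibrated probabilities you would fit the sigmoid's parameters on held-out validation data:

    ```python
    import numpy as np

    def sigmoid_confidence(scores, scale=None):
        """Map raw SVM decision values to (0, 1) confidences with a
        logistic squash. `scale` controls steepness; by default it is
        taken from the score spread (a rough, Platt-like heuristic)."""
        scores = np.asarray(scores, dtype=float)
        if scale is None:
            scale = float(np.std(scores)) or 1.0
        return 1.0 / (1.0 + np.exp(-scores / scale))

    raw = np.array([-2.4, -0.3, 0.1, 1.7])   # example decision values
    conf = sigmoid_confidence(raw)
    ```

    The mapping is monotonic, so sample rankings are preserved: strongly negative scores land near 0, strongly positive ones near 1, and the decision boundary (score 0) maps to 0.5.
    
    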

    Chawki Djeddi

    Dear Indika Kahanda,

    Thanks a lot for your response.

