Machine Learning

  • Stefano Rovetta added an answer:
    Which activation function should be used in a prediction model?
    I want to develop a prediction model (like time-series forecasting) with a BPNN. The sigmoid function is most often used as the activation function in a BPNN, but it gives an output between 0 and 1. If my expected output is, say, 231.54, how do I calculate the error and train the network? In short, I want my network to produce values like 231.54. What should the activation functions for the hidden and output layers be then?
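
    The usual fix is to keep sigmoidal hidden units but give the output layer a linear (identity) activation and train on squared error; alternatively, scale the targets into [0, 1] and invert the scaling afterwards. A minimal sketch of the first option with scikit-learn's MLPRegressor, whose output unit is linear by design (toy data, all numbers illustrative):

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    X = rng.random((200, 3))                       # toy inputs, e.g. lagged series values
    y = 200.0 + 100.0 * X[:, 0] + rng.normal(size=200)   # unbounded targets like 231.54

    net = MLPRegressor(hidden_layer_sizes=(10,),
                       activation='logistic',      # sigmoid hidden layer
                       max_iter=5000, random_state=0)
    net.fit(X, y)                                  # output unit is linear (identity)
    print(net.predict(X[:5]))                      # predictions on the scale of y
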
  • Silvia Pisano added an answer:
    Does anyone have advice on EEG-based emotion recognition by Deep Learning?

    I have trained and classified EEG data with some machine learning methods in the past, and now that I am going to work with a new EEG headset I would like to explore deep learning (I have some demo experience with it in computer vision). I have read a couple of papers and am curious about implementations; in particular, I am interested in running tests with Theano or Google's TensorFlow.

    Silvia Pisano

    Hi Marco, here is a paper on the subject you are researching ... maybe you have already read it, but if not, it may be helpful.
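
    As a starting point for experiments, the kind of small 1D-CNN baseline often tried on windowed EEG epochs looks roughly like this in tf.keras (the channel count, window length, and number of emotion classes are illustrative assumptions, not values from any paper):

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Conv1D(16, 64, activation='relu',
                               input_shape=(512, 14)),   # 512 samples, 14 channels
        tf.keras.layers.MaxPooling1D(4),
        tf.keras.layers.Conv1D(32, 16, activation='relu'),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(3, activation='softmax'),  # e.g. 3 emotion classes
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    # model.fit(X_train, y_train, epochs=20, validation_split=0.2)
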

  • Arturo Geigel added an answer:
    What is the best way to detect and prevent selfish-node attacks using dynamic/machine learning?

    I want to develop detection and prevention of selfish attacks using dynamic/machine learning. Can I do that? I don't understand the proper workflow for doing this.

    Arturo Geigel

    According to "A Novel Defence Scheme Against Selfish Node Attack in Manet" by  Soni and Kamlesh a selfish attack is where the node Does advertising as the shortest path in order to do interception. Based on this definition what you can do in machine learning is to do a baseline analysis of the network on how the dynamic routing protocol changes through time. If you have a baseline behavior then you can target abrupt outliers in the traffic flow. To do  this the algorithm must be able to learn the topology as one of the parameters and also account for normal changes in the network(such as sudden network changes during peak hours, etc.).

    What you would aim to do is try to reduce the number of false positives that you can have using this type of IDS
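
    To make the baseline-plus-outliers idea concrete, here is a minimal sketch with scikit-learn's IsolationForest; the three per-node traffic features (packets/s, route changes/min, drop ratio) and all numbers are hypothetical:

    import numpy as np
    from sklearn.ensemble import IsolationForest

    rng = np.random.default_rng(0)
    # baseline window of normal traffic (hypothetical features)
    baseline = rng.normal(loc=[50.0, 5.0, 0.02], scale=[5.0, 1.0, 0.005],
                          size=(1000, 3))

    detector = IsolationForest(contamination=0.01, random_state=0).fit(baseline)
    suspect = np.array([[52.0, 5.1, 0.02],    # looks like the baseline
                        [95.0, 0.5, 0.40]])   # abrupt outlier in traffic flow
    print(detector.predict(suspect))          # +1 = normal, -1 = flagged
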

  • Christos Tsatsoulis added an answer:
    How should I proceed to implement NLP (almost) from scratch (Collobert 2011)?

    There is a famous JMLR paper by Ronan Collobert called "Natural Language Processing (almost) from scratch". I want to implement it. Can anyone tell me how I should proceed, and which framework would be appropriate? I could go for Torch or Theano.

    Christos Tsatsoulis

    There are at least two Python libraries inspired by (and partially based upon) SENNA:

    nlpnet -

    practNLPTools -
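
    If you try nlpnet, basic usage looks roughly like this (a sketch: the data directory is a placeholder, and you should check which languages and tasks have downloadable pre-trained models):

    import nlpnet

    # point the library at the downloaded model files (placeholder path)
    nlpnet.set_data_dir('/path/to/nlpnet-data/')
    tagger = nlpnet.POSTagger()
    print(tagger.tag('The book is on the table.'))
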

  • Dayana Surendran added an answer:
    Request for assistance and possible collaboration: Database of the effect of small molecule inhibitors on amyloid aggregation
    I am in the process of creating a database of experimental results for small-molecule inhibitors of amyloidogenic proteins.
    If you are interested in helping, there are two ways to get involved:
    1. Contribute papers:
       • inhibitor studies
       • amyloid aggregation studies without inhibitors
    2. Help in creating the database. If you want to collaborate on a deeper level, please message me. I am particularly interested in people with experience in text mining and machine learning techniques.
    Dayana Surendran

    Dr. Jeffrey, I like this idea and I want to join.

  • Mohamad Ivan Fanany added an answer:
    Does anybody know an ISI journal with a quick review process?
    This is my first article, and I am looking for a journal with a low impact factor and a quick review process. If possible, it should cover mobile robot navigation, neural networks, and fuzzy systems.

    Thanks in advance.
    Mohamad Ivan Fanany

    Dear Friends,

    I have just finished my little research on IEEE Transactions and Journals impact factors, review speed, and open-access fees. I hope this helps anyone who is preparing to submit a journal article to IEEE.


  • Mohamad Ivan Fanany added an answer:
    Which journals in computer science are free to publish in?
    Is there a list of free journals in computer science?
    Mohamad Ivan Fanany

    Dear Friends,

    I have just finished my little research on IEEE Transactions and Journals impact factors, review speed, and open-access fees. I hope this helps anyone who is preparing to submit a journal article to IEEE.


  • Partha Majumder added an answer:
    Any recommendations for an optimization technique that usually gives good results?
    I think PSO, Bees, or ACS might help, but which one would be the best choice?
    Partha Majumder

    Try PSO. It is the easiest optimization algorithm to understand. You may also try cat swarm optimization.
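
    For a feel of how simple PSO is, here is a minimal NumPy sketch on a toy objective (the inertia and acceleration coefficients are common defaults, not tuned values):

    import numpy as np

    def sphere(x):                              # toy objective to minimise
        return np.sum(x ** 2, axis=1)

    rng = np.random.default_rng(0)
    n, dim = 30, 2
    pos = rng.uniform(-5, 5, (n, dim))          # particle positions
    vel = np.zeros((n, dim))                    # particle velocities
    pbest, pbest_val = pos.copy(), sphere(pos)  # personal bests
    gbest = pbest[np.argmin(pbest_val)]         # global best

    for _ in range(100):
        r1, r2 = rng.random((n, dim)), rng.random((n, dim))
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = pos + vel
        val = sphere(pos)
        improved = val < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], val[improved]
        gbest = pbest[np.argmin(pbest_val)]

    print(gbest)                                # should approach [0, 0]
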

  • Ranga Dabarera added an answer:
    In an agent network (or graph) how can I find the most influential node to drive the whole network to a certain direction/opinion or idea?

    This can be seen in many disciplines under different terminology. I need to gather as many ideas as possible. Feel free to let me know how it is handled in realms closer to your research. For instance:

    • Sociophysics: Opinion formation on social networks 
    • Marketing: viral marketing 
    • Microfinance: diffusion models to study influential individuals, etc.

    In brief the idea is: given a network, if we need to drive an idea, which node (or agent) should we select? All your comments are highly appreciated. Thank you in advance.

    Ranga Dabarera

    Professor Bin Jiang: thank you very much for the publication, Professor. I really appreciate your insight on this.
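
    On the computational side, a quick way to shortlist candidate influencers is to compare centrality rankings; a sketch with networkx, where the classic karate-club graph stands in for a real social network:

    import networkx as nx

    G = nx.karate_club_graph()                  # stand-in for a real social network
    scores = {'degree': nx.degree_centrality(G),
              'pagerank': nx.pagerank(G),
              'betweenness': nx.betweenness_centrality(G)}
    for name, s in scores.items():
        best = max(s, key=s.get)                # top-ranked node per measure
        print(f'{name:12s} -> node {best} ({s[best]:.3f})')
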

  • Hristo Nikolov added an answer:
    How can I get the support vectors in LibSVM and Weka?

    I am working with the LibSVM classifier and Weka in Java, and I want to get the support vectors that result from training.

    So, can anybody tell me how to get the support vectors using Java code?
    This is a snapshot of my code:
    import java.io.FileReader;
    import weka.core.Instances;
    import weka.classifiers.functions.LibSVM;   // from the Weka LibSVM package

    LibSVM classifier = new LibSVM();
    Instances instsTrain = new Instances(new FileReader("Train.arff"));
    instsTrain.setClassIndex(instsTrain.numAttributes() - 1);

    The output of that code was:

    optimization finished, #iter = 34
    nu = 0.9733207501211705
    obj = -47.33334391005496, rho = 0.33325162948067544
    nSV = 50, nBSV = 46
    Total nSV = 137

    nSV is the number of support vectors, but I can't get the support vectors' data.

    Thanks for your help.

    Hristo Nikolov

    One more open-source solution can be found in the ML library in OpenCV, but it will need more effort from you.

    Hope this is of use.

  • Mahamad Nabab Alam added an answer:
    Can anyone recommend algorithms to deal with unbalanced clusters for classification?
    Algorithms to deal with unbalanced clusters for classification?
    Mahamad Nabab Alam

    MATLAB can be used for the purpose.
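
    Whatever the tool, a common first remedy for imbalanced classes is to reweight them rather than resample; a minimal scikit-learn sketch on synthetic data (the 90/10 imbalance is illustrative):

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # synthetic 90/10 imbalanced binary problem
    X, y = make_classification(n_samples=600, weights=[0.9, 0.1], random_state=0)

    for cw in (None, 'balanced'):
        clf = LogisticRegression(class_weight=cw, max_iter=1000)
        print(cw, cross_val_score(clf, X, y, scoring='f1').mean().round(3))
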

  • Sultan Tarazi added an answer:
    Is there a rule of thumb that explains the splitting of a limited dataset into two-three subsets?

    I have 600 examples in my dataset for a classification task. The number of examples labelled in each class differs: class A has 300 examples, class B has 150 examples, and class C has 150 examples.

    I have read many papers and resources about splitting data into two or three parts: train, validation, and test. Some say that if you have limited data there is no need to waste examples on three parts; two parts (train/test) are enough, giving 70% for training and 30% for testing, and 5-fold cross-validation is also ideal for limited data.

    Others say to use 70% for training (with the validation data taken as 30% of the training data itself), and to test on the remaining 30% of the original data.

    From your experience, could you tell me your thoughts and suggestions about this mystery?

    Thank you 

    Sultan Tarazi

    First, allow me to express my esteem and appreciation to you all for participating and trying to help with this question. 

    @Marion G Ceruti

    • I collected data using Tweepy in Python and the Twitter API, based on terms related to the illegal-immigration topic that my research is on.
    • I made you a very short and readable .txt file: please visit this link and others to get a full understanding of what I am doing.
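
    For the mechanics of the split itself, a stratified 70/30 partition (plus an inner validation split) keeps the 300/150/150 class proportions in every subset; a scikit-learn sketch with toy stand-ins for the data:

    import numpy as np
    from sklearn.model_selection import train_test_split

    # toy stand-ins for the 600 labelled examples (A=300, B=150, C=150)
    X = np.random.rand(600, 20)
    y = np.array(['A'] * 300 + ['B'] * 150 + ['C'] * 150)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.30, stratify=y, random_state=0)
    X_tr, X_val, y_tr, y_val = train_test_split(
        X_train, y_train, test_size=0.30, stratify=y_train, random_state=0)
    print(len(X_tr), len(X_val), len(X_test))   # 294 126 180
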


  • Negin Malekian added an answer:
    How can I solve a multi-agent learning problem, where agents interact only through the environment, using a single-agent Q-learning algorithm?

    There are a lot of agents in my model, and they interact only through the environment. I am using a Q-learning algorithm to solve this model, so all the agents share a static Q-table in Java (because the agents here are homogeneous). The environment is dynamic, and the time step of environment changes is much smaller than the time step of agent state changes. So, the state of an agent won't change until the environment has been updated through plenty of steps. Furthermore, the agents and the environment interact and can affect each other. On the one hand, I need to know the new state of the agents at the next time step (i.e., to find max Q(s(t+1), a) in the Q-learning algorithm). On the other hand, I can't postpone updating the Q-table until the next step, because it is shared between the agents. So, do you have any suggestion for handling my problem?

    Negin Malekian

    Thanks, dear Sébastien.
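
    For reference, the tabular update a shared Q-table applies per agent is Q(s,a) <- Q(s,a) + alpha * [r + gamma * max_a' Q(s',a') - Q(s,a)]; a minimal sketch (alpha, gamma, and the table sizes are illustrative):

    import numpy as np

    n_states, n_actions = 10, 4
    Q = np.zeros((n_states, n_actions))      # shared by all homogeneous agents
    alpha, gamma = 0.1, 0.95                 # learning rate, discount (assumed)

    def q_update(s, a, r, s_next):
        # off-policy target: best action value in the *next* state
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

    q_update(0, 1, 1.0, 3)                   # toy transition (s, a, r, s')
    print(Q[0])
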

  • Frank Veroustraete added an answer:
    What is the difference between the K-means and density-based (DBSCAN) clustering algorithms?
    Density-based clustering plays a vital role in finding non-linear cluster structures based on density. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is the most widely used density-based algorithm. It uses the concepts of density reachability and density connectivity. On the other hand, K-means (MacQueen, 1967) is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. The procedure follows a simple and easy way to classify a given data set through a certain number of clusters (assume k clusters) fixed a priori. The main idea is to define k centroids, one for each cluster. These centroids should be placed carefully, because different locations cause different results.
    Frank Veroustraete

    Thanks Christoph,

    I am really interested in reading your paper on this topic. I am curious about the results you obtained!

    Many thanks again and have a nice day!
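
    The contrast is easy to see on a non-convex dataset: K-means splits the two moons incorrectly, while DBSCAN recovers them via density. A scikit-learn sketch (eps and min_samples are illustrative, not tuned):

    from sklearn.cluster import DBSCAN, KMeans
    from sklearn.datasets import make_moons

    X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    db = DBSCAN(eps=0.3, min_samples=5).fit(X)
    print(km.labels_[:10])    # cuts the moons by distance to centroids
    print(db.labels_[:10])    # follows each moon's density instead
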


  • Csaba Kertész added an answer:
    What could be the true reasonable results in Classification?

    I am working with ML algorithms: linear SVC, ensembles, LR, and MultinomialNB, for classification; specifically multi-class classification of tweets.

    I have results for my validation part for these algorithms, as follows:

    1. SVC --> 45% false predictions, 55% true predictions

    2. RandomForest --> 35% false predictions, 65% true

    and the others have worse false-prediction rates.

    I am not sure whether those results are reasonable in ML or whether I am doing something wrong. Could the cause lie in the text preprocessing? I don't actually know.

    FYI: the data were collected by me over a specific period; my data is limited to around 668 observations only.


    Csaba Kertész

    I would say that if the classifiers provide performance very close to each other (e.g. between 78% and 82%), you can't say that X is better than Y; they have similar performance. I am pretty sure that if you had (a bit) more data, the classifier performances would change, especially for such a small sample database. It also matters how many features you have. You can try removing outliers or using feature standardization; if you already standardize, try without it, since it does not always improve the results and in my experience sometimes degrades them.
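
    A quick way to test the standardization point is to cross-validate the same classifier with and without it; a scikit-learn sketch (the synthetic data merely mimics the ~668-observation, multi-class setting):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import LinearSVC

    # synthetic stand-in for ~668 multi-class observations
    X, y = make_classification(n_samples=668, n_classes=3, n_informative=10,
                               random_state=0)

    raw = LinearSVC(max_iter=5000)
    scaled = make_pipeline(StandardScaler(), LinearSVC(max_iter=5000))
    for name, clf in [('raw', raw), ('standardized', scaled)]:
        print(name, cross_val_score(clf, X, y, cv=5).mean().round(3))
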

  • Santhakumaran A. added an answer:
    How do I eliminate noise variables when using ensemble prediction methods like randomGLM in R?

    The task involves predicting a binary outcome in a small data set (sample sizes of 20-70) using many (>100) variables as potential predictors. The main problem is that the number of predictors is much larger than the sample size, and there is limited or no knowledge of which predictors may be more important than others. It is therefore very easy to "overfit" the data, i.e. to produce models which seemingly describe the data at hand very well but in fact include spurious predictor variables.

    I tried using an ensemble classification method called randomGLM, which seeks to improve on AICc-based GLM selection using the "bagging" approach taken from random forests. I checked results by K-fold cross-validation and ROC curves. The results seemingly look good - e.g. a GLM which contains only those variables which were used in >=30 out of 100 "bags" produced a ROC curve AUC of 87%.

    However, I challenged these results with the following test: several "noise" variables (formulas using random numbers from the Gaussian and other distributions) were added to the data, and the randomGLM procedure was run again. This was repeated several times with different random values for the noise variables. The noise variables actually attained non-negligible importance - i.e. they "competed" fairly strongly with the real experimental variables and were sometimes selected in as many as 30-50% of the random "bags".

    To "filter out" these nonsense variables, I tried discarding all variables whose correlation coefficient was not statistically significantly different from zero (with Bonferroni correction for multiple variables) and running randomGLM on the retained variables only. This works (I checked it with simulated data), but it is of course very conservative on real data - almost all variables are discarded, and the resulting classification is poor. What would be a better way to eliminate noise variables when using ensemble prediction methods like randomGLM in R? Thank you in advance for your interest and comments!

    Santhakumaran A.

    Read chapter I, "Governing Principles of Mathematical Modeling", by A. Santhakumaran, in:

    Food Engineering: Emerging Issues, Modeling, and Applications
    Editors: Murlidhar Meghwal, PhD, and Megh R. Goyal, PhD
    Copyright © 2015 Apple Academic Press Inc.

  • Kamakshaiah Musunuru added an answer:
    How can I use R for datasets larger than the machine's RAM?

    I am implementing statistical models for my project and have very large data. I have used R for this, and now I want to apply a machine learning model. But R runs into problems while loading the data into RAM, or sometimes processes part of the data, then throws an error and stops working. I need a solution so that the operation can run through automatically. I have tried the bigmemory and ff packages, but they are not working. Is there any solution to this problem?

    Kamakshaiah Musunuru

    Doing operations on an entire data set is an unproductive idea, though big data analysts might refute this claim. If big data analytics were wholly right, the entire edifice of sampling theory would collapse. We have two alternatives: one, use samples and infer about the population; two, scale up resources. The first seems the cleverer. In R there are several approaches, but there is no firm ground for whole-data operations. For instance, the claim that R can scale up to maximum capacity is not true, because the host OS (e.g. Windows) doesn't allow applications to use the full capacity. The only solution is to work through the HDD, and that effectively means mixing Hadoop with R. Other packages, like foreach, distributedR, snow, and parallel, only take care of multi-core processing but do not work through the HDD.
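
    The same out-of-core idea, expressed in Python terms for concreteness (an analogue of R's ff/bigmemory approach): stream the file in chunks and aggregate incrementally. The file name and column are hypothetical placeholders:

    import pandas as pd

    # 'big_data.csv' and the 'value' column are hypothetical placeholders
    total, count = 0.0, 0
    for chunk in pd.read_csv('big_data.csv', chunksize=100_000):
        total += chunk['value'].sum()      # only one chunk is in RAM at a time
        count += len(chunk)
    print('mean =', total / count)
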

  • Hojjat Emami added an answer:
    How do you use Stanford coreference resolution in a Java application?
    I am looking for code to run Stanford coreference resolution in Java (NetBeans).
    Hojjat Emami

    Hi,

    You can run Stanford coreference by following the links below:

    Good luck

    H. Emami

  • Koshy George added an answer:
    Can inverse Radon Transform be embedded in the Deep learning Auto Encoder?

    I am working on removing blur from images and needed help in understanding whether the inverse Radon transform can be embedded at the core of the processing in an autoencoder (AED) based on deep learning techniques.

    The work revolves around removing motion blur from an image. The approach uses deep learning, and a useful processing technique seen in deep learning is the autoencoder (AED). It uses a Gaussian for its core operation; I want to replace that with the inverse Radon transform.

    Is this approach possible?

    It would be great if some materials or links could be suggested as a pointer in this direction.

    Koshy George

    Thank you very much. The materials really helped :)
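
    For prototyping before wiring an inverse-Radon step into an autoencoder, scikit-image's transform pair is handy (a sketch; the phantom image and angle grid are illustrative):

    import numpy as np
    from skimage.data import shepp_logan_phantom
    from skimage.transform import radon, iradon

    img = shepp_logan_phantom()                        # standard test image
    theta = np.linspace(0.0, 180.0, max(img.shape), endpoint=False)
    sino = radon(img, theta=theta)                     # forward projection
    recon = iradon(sino, theta=theta)                  # filtered back-projection
    print(np.abs(recon - img).mean())                  # reconstruction error
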

  • Viktor Dmitriyev added an answer:
    How can I overcome Rhdfs and Rmpi package installation problems?

    Hi all,
    I have installed and set up RHadoop in the VMware environment.
    Recently, I tried to install the following packages:

    1) the rhdfs_1.0.8 package, which prompted this error message:
    (as ‘lib’ is unspecified)
    Error in getOctD(x, offset, len) : invalid octal digit.

    2) the Rmpi package, which failed to configure with:

    configure: error: "Cannot find mpi.h header file"
    ERROR: configuration failed for package ‘Rmpi’
    * removing ‘/home/aisyah/R/x86_64-redhat-linux-gnu-library/3.2/Rmpi’

    Has anyone had the same experience, and do you have any suggestions?

    Viktor Dmitriyev

    Usually, in situations like this, you need to install the package for the Linux distribution you are using that contains the required headers; installing the "dev" variant of the software you are building against can also help. For the Rmpi error, that means the MPI development headers that provide mpi.h.

  • Sebastian Furth added an answer:
    Is there a standard corpus for technical documentation?

    Standard corpora exist in various domains; however, I cannot find a corpus containing large amounts of technical documentation.

    The only corpus I have heard of is the "Scania Corpus" from the PLUG project (1998). However, I cannot find any resources.

    Does anybody know of another corpus or have access to the Scania documents?

    Thank you in advance

    Best regards


    Sebastian Furth

    Thank you for your answer! This is indeed really interesting. I could imagine that at least the OpenOffice or GUI (GNOME, KDE) manuals might have some similarities with documentation for machines.

  • Shafagat Mahmudova added an answer:
    Defect patterns in the semiconductor wafer manufacturing process

    What are the most common "mixed" defect patterns appearing in the semiconductor wafer manufacturing process?

    Shafagat Mahmudova

    Dear Ghalia Tello,

    Have a look at the attached file. It may be useful.

    Regards, Shafagat

  • Moritz S. Schmid added an answer:
    Do you know a machine learning data repository with DOI assignment for publishing data?

    We intend to publicly publish a data set in the machine learning (ML) area, which consists of sensor data gathered in a technical process.

    The first and most common possibility is the UCI ML repository. One drawback of this repository is that no persistent identifier, such as a digital object identifier (DOI), is offered.

    Do you know an alternative in the ML area that offers the possibility of publishing a data set in a publicly and persistently accessible way? My research did not turn up any alternative.

    Thanks a lot!

    Best regards,

    Moritz S. Schmid

    Sorry, I wanted to answer earlier, but it slipped through. Definitely Zenodo!

    I am using it and it's great. They also added an attribution share-alike license to the general pull-down menu for your upload (after I asked for it). The support is good and responsive.

    You also get integration with GitHub - another plus. You make a release on GitHub and can then add it in Zenodo!



  • Edgar Benitez added an answer:
    Should I use PCA with categorical data?
    It is not recommended to use PCA when dealing with categorical data. In my case I have reviews of certain books and the users who commented. So, the data is represented as a matrix whose rows are binary vectors, where 1 means the user commented on this book type and 0 means he has not. I am not concerned with the number of comments.

    I applied PCA to this data in order to reduce the dimensions for projecting it onto a 2D plane. I noticed that it already forms 5 clusters that are disjoint and far from each other. One of the clusters looks like the attached image.
    Edgar Benitez

    A common error in the handling of statistical models is ignorance of the available methodologies; this causes researchers to apply (traditional) techniques that are not appropriate for the data at hand. This was especially true a few years ago, but in this age of easy access to information, applying inappropriate methodologies to the data we have should be judged from positions beyond the justification that anything goes (in statistical terms)...
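
    For 0/1 indicator matrices like this one, multiple correspondence analysis (MCA) is the principled analogue of PCA; a quick, commonly used substitute is a truncated SVD on the binary matrix (the LSA trick). A scikit-learn sketch on synthetic data:

    import numpy as np
    from sklearn.decomposition import TruncatedSVD

    rng = np.random.default_rng(0)
    X = (rng.random((500, 40)) < 0.2).astype(float)   # 0/1 user-by-booktype matrix
    Z = TruncatedSVD(n_components=2, random_state=0).fit_transform(X)
    print(Z[:3])                                      # 2-D coordinates for plotting
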

  • Mirzakhmet Syzdykov added an answer:
    Can we fully simulate the whole brain on a digital computer?

    Miguel Nicolelis and Ronald Cicurel claim that the brain is relativistic and cannot be simulated by a Turing machine, which is contrary to well-marketed ideas of simulating/mapping the whole brain. If it cannot be simulated on digital computers, what is the solution to understanding the brain's language?

    Mirzakhmet Syzdykov

    Thank you, Colin.

    Please read my answer below:

    The TCP/IP example is an example of measuring the signal of two communicating servers, as if there were a problem of measuring the signal of this example system. However, we can state that the brain signal is invariant and can be varied relative to the system, which is an upper subset of the product coordinate system (for example, the Cartesian coordinate system). The second paper was a solution to the random problem using random and elementary relations; however, after applying the corrections, it produced feasible results, thus optimizing the target function.

    This can be applied in measuring the signal functionality of the brains of biological creatures, which actually interact randomly; however, from an objective point of view the researcher has the instruments to measure this system relative to unary product systems like objectivity and subjectivity, which in turn give the objective plot from one and only one point of view (objectivity); for instance in math, zero is a starting number, or "nothing" in the common sense. Factors like operation cost and instrumentation complexity (from zero to object, for instance) are to be taken into account.

    For this purpose, there are many known methods for the experimental study of random modeling data, like Fisher's regression method or a method of covariance in a fully factored experiment. However, further on one can meet problems related to the state complexity of the system to be measured adequately. In this case, this typically leads to simplification/systematization and localization of tasks on a partially or fully produced subset of the model (for this, I recommend reading good articles related to computational linguistics/machine learning and the theory of languages and automata: Brzozowski's derivatives, Rabin's subset construction and non-deterministic automata, Antimirov's partial derivatives, Noam Chomsky's classification of languages and the pumping lemma, etc.; and the Berry-Sethi algorithm to produce a deterministic finite automaton from a regular expression - these are actually good examples of the evolution of solving the problem of language classification/systematization and representation).

    About your suggestion to graph the decision paths, I can say that one could actually propose models of the decision strategy according to the "state" (in other words, the state of the system under consideration, its sub-system, or any other entity). The article describing the algorithm for ant colony optimisation actually proposes a hybrid approach to decision-making according to the stable equalities in the model to be converged/optimized. As a result, we see the adequate product of the algorithm, which produces feasible layouts in the example problem. If you've read this article, I recommend you read the further article authored by me, which is to be published soon, and, of course, the original paper of M. Dorigo, in which the stable equalities are proposed according to the abstract model of the swarm.

    About the awareness of soldiers and their operability in potentially dangerous situations, I can say that it can be related to cognitivity and the application of physiology in reflex manipulation with respect to the main factors like time (which can be measured as quickness) and others. This could lead to the dramatic result of building a biological weapon out of a swarm of species (if there were stable conditions, as in Dorigo's swarm model). The method of cellular simulation proposed, for example, by Conway in the game "Life" can be used as a frame for building the entire model with respect to the environment in which these units are to operate. From my own experience, I think there could be formal rules to be applied to minimize the dangerous risks with respect to critical situations. I think I will write something about this further.

    Good Luck!

    Looking forward to collaborating with you, Colin!



  • Santosh Kumar Sahu added an answer:
    How to load a dataset in Hortonworks sandbox VMware hadoop 2.3.2?

    Hi all,

    I'm currently working with Hadoop using the Hadoop 2.3.2 Hortonworks sandbox that runs on VMware. I wish to load a dataset by following the "Hello World" tutorial provided on the Hortonworks webpage. I followed the steps in that tutorial exactly. As it says, to load a dataset I need to create a temporary data directory by clicking on the new-directory button. However, the new-directory button is disabled in the admin Ambari dashboard. Does anyone here have a suggestion or recommendation for another Hadoop installer that is easier to use?

    Santosh Kumar Sahu

    There are many ways to input data into Hadoop. It depends on your requirements and the type of data, i.e. structured, semi-structured, or unstructured. But your question is about the easiest way to input data.

    Visit this site:

    However, to input streaming data or any other type of data, this approach will not work.

  • Morton E. O'Kelly added an answer:
    How can I use machine learning in this case?


    I have done several experiments. For each one, I have a pair (xi, yi), where xi is the label and yi is a measurement. However, for the same input value xi I can get different values for the output yi in different tests. I need to use a machine learning method to model the problem and then predict the measurement yj for a given xj.

    I have multiple questions:

    - Is the problem well defined? 

    - What is the method that can I use? 

    Thank you,

    Morton E. O'Kelly

    I looked at the spreadsheet of data you provided.

    While I don't think this is what you want to hear, simply plotting all the Y series against the X values shows that many of these Y series have identifiable trends/patterns, but they seem to be all quite different. Is there any reason to expect the Y values to vary in a systematic way? How about taking "differences" - i.e., plot X vs (Y1 - Yn). This seems to be a problem where basic data exploration, as opposed to machine learning, is going to be useful to you.
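
    Following that advice, the first-pass exploration can literally be a group-by: average the replicated y values per x and look at the trend and spread before choosing a learner. A pandas sketch on toy data:

    import numpy as np
    import pandas as pd

    x = np.repeat(np.arange(10), 5)                  # each x measured 5 times
    y = 2.0 * x + np.random.randn(50)                # trend + test-to-test noise
    df = pd.DataFrame({'x': x, 'y': y})

    summary = df.groupby('x')['y'].agg(['mean', 'std'])
    print(summary)   # mean tracks the trend; std characterises the noise
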

  • Abass Olaode added an answer:
    Can someone help with building sub-labels for unlabeled and labeled documents?

    I am trying to build a document classification model on news articles. The problem I face is subdividing the class labels. For example, the training set I have is about 3000 documents with 10 class labels (for example Sports, Technology, Business, etc.). The classification model works relatively well in classifying a new document given the training set. Now here's the real deal: I want to further narrow down to sub-fields. For example, if an article talks about "Machine Learning" and "NLP", the document is classified under Technology, but what I want is for it to be filed under a topic such as "Artificial Intelligence", "Machine Learning", or "NLP".

    What I am looking for is the generation of new topics. I have thought of a few bootstrapping steps, like classifying documents, scoring the entities for each class, manually classifying the documents, and redoing the process, but I am completely lost. I would really appreciate some guidance in attacking this problem.

    Abass Olaode

    You may need to start with a bag-of-words model of the documents, allowing the codebook to be large enough to cover all possible words in the document collection. Then follow this stage with another modelling step built on the important words in the collection (topic-based modelling such as PLSA and LDA will give you ideas, but cannot achieve this on its own). Then you can use supervised classification (such as SVM) if you have enough labelled training samples, or unsupervised classification such as hierarchical clustering, which should be followed by examination of the generated clusters so as to identify those that meet your requirements.
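
    The topic-modelling step can be prototyped in a few lines; a scikit-learn sketch that surfaces candidate sub-topics from a toy corpus (real use would run per coarse class, with far more documents and components):

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    docs = ["machine learning and neural networks",
            "nlp parsing tagging and corpora",
            "deep learning for computer vision",
            "syntax semantics and nlp pipelines"]

    vec = CountVectorizer(stop_words='english')
    X = vec.fit_transform(docs)                      # bag-of-words counts
    lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

    words = vec.get_feature_names_out()
    for k, topic in enumerate(lda.components_):
        top = topic.argsort()[-3:][::-1]             # 3 highest-weight words
        print('topic', k, [words[i] for i in top])
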

  • Aisyah mat jasin added an answer:
    Is there an R package for maximum likelihood estimation of multivariate distributions with missing data?

    I'm currently analysing the missing-data problem using maximum likelihood estimation.

    I would like to test the estimation on missing data for a multivariate dataset using R. I found several packages in R for MLE, but some of them only discuss and apply the general functions/terms/concepts of MLE and do not specifically apply it to the missing-data problem for multivariate datasets.

    My current dataset has a combination of categorical, continuous, and discrete data types.

    I would appreciate it if someone could recommend notes or a tutorial on handling missing data using MLE in R.

    Thank you in advance 

    Aisyah mat jasin

    Thank you, Watheq J. Al-Mudhafar.
