
Question

Asked 24th Jan, 2013

I have 18 input features for a prediction network. How many hidden layers should I use, and how many nodes should there be in those hidden layers? Is there a formula for deciding this, or is it trial and error?


23rd Jul, 2022

Sidi Mohamed Ben Abdellah University, Faculty of Science and Technology, Fez, Morocco

ρ = N / (H(I + O + 1) + O), with 1 < ρ < 3

where N is the number of samples (e.g. molecules) in the training set, H the number of hidden neurons, I the number of input neurons, and O the number of output neurons.
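As a quick sanity check of the ρ rule above, here is a minimal sketch in Python (the sample counts and the function name are my own, chosen for illustration):

```python
def rho(n_train, n_hidden, n_inputs, n_outputs):
    """Ratio of training samples to adjustable network parameters:
    rho = N / (H*(I + O + 1) + O)."""
    return n_train / (n_hidden * (n_inputs + n_outputs + 1) + n_outputs)

# e.g. 18 inputs, 1 output, 6 hidden neurons, 250 training samples:
r = rho(250, 6, 18, 1)   # 250 / (6*20 + 1), roughly 2.07
print(1 < r < 3)         # True: within the recommended range
```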

You need to use cross-validation to test the accuracy on the test set. The optimal number of hidden units could easily be smaller than the number of inputs; there is no rule like "multiply the number of inputs by N"... If you have a lot of training examples, you can use many hidden units, but sometimes just 2 hidden units work best with little data. Usually people use one hidden layer for simple tasks, but nowadays research on deep neural network architectures shows that many hidden layers can be fruitful for difficult object, handwritten character, and face recognition problems.

61 Recommendations

The introduction of hidden layer(s) makes it possible for the network to exhibit non-linear behavior. I do not know the nature of your problem. You also have to decide whether you expect your network to learn your training set to perfection, or whether you are content with, e.g., 95% performance. In order to secure the network's ability to generalize, the number of nodes has to be kept as low as possible. If you have a large excess of nodes, your network becomes a memory bank that can recall the training set to perfection but does not perform well on samples that were not part of the training set.

4 Recommendations

Generally 2 hidden layers will enable the network to model any arbitrary function. Check out this URL:

But you may want to optimise the number of layers and nodes etc. Network growth and pruning algorithms have been around for a long time. You can also try using a genetic algorithm to define the network structure.

1 Recommendation

@David: Can you please explain in detail how to define a network using a genetic algorithm (or give any book references)? I have a little knowledge of genetic algorithms.


I agree with Wiering: there is no rule of thumb to find out how many hidden layers you need. In many cases one hidden layer works well, but in order to justify this for a specific problem you have to apply a heuristic method such as cross-validation. Using cross-validation you divide your data into two parts, namely a training set and a validation set (also called a test set).

You use the training set for training your network, and the validation set to identify how well your neural network performed. To do this you need to predict the labels of your validation set.

In order to minimize the effect of sampling, you do this more than once. For example, using five-fold cross-validation you do it five times, then look at the results and take their average. By results I mean one or more performance measurements, such as specificity, sensitivity, MCC, misclassification rate, ...

This is a commonly used method to answer questions such as:

How many hidden layers do I need?

What is the best learning rate?

..........

I know that there is a very good implementation of cross-validation and neural networks in R, in the package called CMA. But if you implemented your own brand-new neural network, it is a good idea to also implement some kind of cross-validation program.
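If you do write your own network, the five-fold scheme described above is mostly index bookkeeping. A minimal sketch in plain Python (the function name is mine; real code would shuffle the indices first):

```python
def k_fold_indices(n_samples, k=5):
    """Split sample indices into k (train, validation) pairs."""
    indices = list(range(n_samples))
    fold_size = n_samples // k
    folds = []
    for i in range(k):
        start = i * fold_size
        # the last fold absorbs any remainder
        end = (i + 1) * fold_size if i < k - 1 else n_samples
        val = indices[start:end]
        train = indices[:start] + indices[end:]
        folds.append((train, val))
    return folds

# Train on each `train` split, evaluate on each `val` split,
# then average the k performance scores (accuracy, MCC, ...).
folds = k_fold_indices(20, k=5)
for train, val in folds:
    assert len(train) + len(val) == 20
```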

3 Recommendations

How many hidden layers should I use? : http://www.faqs.org/faqs/ai-faq/neural-nets/part3/section-9.html (mirror: http://francky.me/aifaq/FAQ-comp.ai.neural-net.pdf)

How many hidden units should I use? : http://www.faqs.org/faqs/ai-faq/neural-nets/part3/section-10.html (mirror: http://francky.me/aifaq/FAQ-comp.ai.neural-net.pdf)

What is genetic algorithm? : https://www.researchgate.net/post/What_is_genetic_algorithm1

Genetic algorithm + neural networks: http://francky.me/doc/mrf2011-HEC-ISIR-ENS_en.pdf (chapter 2.2)

3 Recommendations

Before you start implementing genetic algorithms to optimize the topology of your neural net, you should first find out whether a neural network is appropriate for solving your problem. You mention that you have a prediction problem with 18 inputs. I recommend starting with a simulation tool such as RapidMiner and designing an experiment comparing the generalization performance (averaged test error) of several algorithms. Start with weak learners (linear regression, or logistic regression in case you have a classification problem), then proceed to experiment with neural nets of increasing capacity.

4 Recommendations

This problem was previously solved by some experts using ANNs, so I think an ANN will be able to solve it.

Hi all,

I suggest you check out my (quite extensive) answer to a similar question, which can be found here:

HTH

Cheers,

SCM

(no. of inputs + no. of outputs)^0.5 + (1 to 10). To fix the constant value (the last part, 1 to 10), use trial and error and find the optimal number of hidden-layer neurons for the minimum MSE.

30 Recommendations

- I have 12 inputs and 3 outputs. My question is: with 1 hidden layer of 10 neurons, what will the MSE be?

I think it depends on the number of features (neurons in the input layer). A higher number of hidden layers increases the order of the weights, which helps form a higher-order decision boundary.

A NN with N hidden layers can form a decision boundary of order (N+1).

Example: a perceptron without a hidden layer (N=0) can only draw a first-order (0+1=1) decision boundary.

A multilayer perceptron with one hidden layer (N=1) can draw a second-order (1+1=2) or lower-order decision boundary.

So I believe an MLP with N hidden layers can surely solve your (N-1)-order feature problem.

You can also use the geometric pyramid rule (the Masters rule):

a) for one hidden layer the number of neurons in the hidden layer is equal to:

nbrHID = sqrt(nbrINP * nbrOUT)

nbrHID – the number of neurons in the hidden layer,

nbrINP – the number of neurons in the input layer,

nbrOUT – the number of neurons in the output layer.

b) for two hidden layers:

r = (nbrINP/nbrOUT)^(1/3)

nbrHID1 = nbrOUT*(r^2) – the number of neurons in the first hidden layer

nbrHID2 = nbrOUT*r – the number of neurons in the second hidden layer

c) for three hidden layers:

r = (nbrINP/nbrOUT)^(1/4)

nbrHID1 = nbrOUT*(r^3) – the number of neurons in the first hidden layer

nbrHID2 = nbrOUT*(r^2) – the number of neurons in the second hidden layer

nbrHID3 = nbrOUT*r – the number of neurons in the third hidden layer

and so on
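The pattern above can be rolled into one small helper: for L hidden layers, take r = (nbrINP/nbrOUT)^(1/(L+1)) and size hidden layer j as nbrOUT·r^(L+1−j). A sketch (the function name is mine):

```python
def pyramid_layers(n_inp, n_out, n_hidden_layers):
    """Geometric pyramid rule: hidden layer sizes taper
    geometrically from the input size down to the output size."""
    r = (n_inp / n_out) ** (1.0 / (n_hidden_layers + 1))
    return [round(n_out * r ** (n_hidden_layers - i))
            for i in range(n_hidden_layers)]

# One hidden layer, 18 inputs, 1 output: sqrt(18*1) -> about 4 neurons
print(pyramid_layers(18, 1, 1))   # [4]
# Two hidden layers: r = 18^(1/3), giving [7, 3]
print(pyramid_layers(18, 1, 2))   # [7, 3]
```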

4 Recommendations

The number of hidden layers and nodes depends on the problem you want to model.

Take a look at the link below; it will help you better understand this problem dependency.

If you change the dataset, you will see that more complex problems need more nodes/hidden layers.

2 Recommendations

The upper bound on the number of hidden neurons that won't result in over-fitting is:

Nh = Ns / (α * (Ni + No))

Ni = number of input neurons.

No = number of output neurons.

Ns = number of samples in training data set.

α = an arbitrary scaling factor usually 2-10.

Others recommend setting α to a value between 5 and 10, but I find a value of 2 will often work without overfitting. As explained in the excellent NN Design text, you want to limit the number of free parameters in your model (its degree, or number of nonzero weights) to a small portion of the degrees of freedom in your data. The degrees of freedom in your data is the number of samples times the degrees of freedom (dimensions) in each sample, or Ns * (Ni + No) (assuming they are all independent). So α is a way to indicate how general you want your model to be, or how much you want to prevent overfitting.

There are many rule-of-thumb methods for determining the correct number of neurons to use in the hidden layers, such as the following:

1. The number of hidden neurons should be between the size of the input layer and the size of the output layer.

2. The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output layer.

3. The number of hidden neurons should be less than twice the size of the input layer.

These three rules provide a starting point for you to consider. Ultimately, the selection of an architecture for your neural network will come down to trial and error. But what exactly is meant by trial and error? You do not want to start throwing random numbers of layers and neurons at your network. To do so would be very time consuming. Chapter 8, “Pruning a Neural Network” will explore various ways to determine an optimal structure for a neural network.
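For the question's 18 inputs (assuming, say, a single output and 500 training samples, which are my own illustrative numbers), the upper bound and the three rules above work out as follows:

```python
n_in, n_out, n_samples, alpha = 18, 1, 500, 2   # assumed values

# Upper bound: Nh = Ns / (alpha * (Ni + No))
upper_bound = n_samples / (alpha * (n_in + n_out))
print(round(upper_bound, 1))                   # 13.2

# Rule 1: between input and output layer sizes
rule1 = (min(n_in, n_out), max(n_in, n_out))   # (1, 18)
# Rule 2: 2/3 of input size plus output size
rule2 = round(2 * n_in / 3 + n_out)            # 13
# Rule 3: less than twice the input size
rule3_max = 2 * n_in - 1                       # 35
print(rule1, rule2, rule3_max)
```

Here the rules agree reasonably well: something in the low teens is a sensible starting point before trial and error.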

4 Recommendations

Excuse me, but I have a question.

I think normalization changes the input vector into another one, so what is the guarantee that the resulting ANN will map all inputs to their desired outputs?

I think we should make another change, because normalization is not a feature of the ANN but a feature of the input vector.

I think the resulting ANN will map a normalized input to the desired output only if it is an element of the training set.

1 Recommendation

The size of the hidden layer is normally between the size of the input and output layers. It should be about 2/3 the size of the input layer plus the size of the output layer. The number of **hidden neurons** should be less than twice the size of the input layer.

3 Recommendations

This paper provided different techniques that have been employed on deciding the number of hidden neurones. It's quite a good read.

5 Recommendations

Just trial and error, because the performance of deep neural networks depends on the data structure.

Or you can find the number of layers and neurons by using a global optimization algorithm such as particle swarm, simulated annealing, pattern search, Bayesian optimization, etc., to minimize the validation error.

A third method is a rule of thumb called the geometric pyramid rule, which can determine the number of neurons in each hidden layer.

The rule of thumb I have successfully followed over a number of years is that, for a single-hidden-layer ANN, (2 × number of inputs) + 1 neurons yields good results.

1 Recommendation

This article could be of use

Flores, Juan J., Mario Graff, and Hector Rodriguez. "Evolutive design of ARMA and ANN models for time series forecasting." *Renewable Energy* 44 (2012): 225-230.

1 Recommendation

Mohammad,

A few years back it was believed that one layer offered modeling capabilities equivalent to any number of layers. That is true, but with one layer the number of neurons grows rapidly. The trend nowadays is to design ANNs with more hidden layers (hence deep ANNs).

Two books talk about these issues:

http://hagan.okstate.edu/NNDesign.pdf (read chapter 22)

I hope this helps.

Best,

Juan

1 Recommendation

Exactly, there is no rule to decide the number of hidden layers and the number of neurons in each hidden layer. We just have to try different combinations.

The number of neurons in the hidden layer is selected based on the following formula: (no. of inputs + no. of outputs)^0.5 + (1 to 10). To fix the constant value (the last part, 1 to 10), use trial and error and find the optimal number of hidden-layer neurons for the minimum mean square error.

There is not a fixed answer for this, contrary to what Vijila Rani K, Ines Abdeljaoued, or M. Khishe wrote.

Nobody knows; it depends on the particularities of your data.

You could try to determine the number of hidden layers, and also other parameters, with genetic algorithms.

It is the best of both worlds.

I think this link will help you

8 Recommendations

@Ines Abdeljaoued, do you have a reference for that? Did you mean this paper: Sheela, K. G., & Deepa, S. N. (2013). Review on methods to fix number of hidden neurons in neural networks. *Mathematical Problems in Engineering*, 2013.

1 Recommendation

There is no direct answer.

Yet, you can implement a nested for-loop: one loop over the number of hidden layers and the other over the number of nodes in each layer. Then record the error for each topology and select the one with the minimum error.

I hope that makes sense.
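That nested-loop search might look like the following sketch, with a stand-in error function (in real use, `evaluate_error` would train the network on the training set and return the validation error; the toy error surface here is mine):

```python
def evaluate_error(n_layers, n_nodes):
    """Stand-in for: build the net, train it, return validation error.
    This toy surface is minimized at 2 layers of 12 nodes."""
    return abs(n_layers - 2) + abs(n_nodes - 12) / 10.0

best = None
for n_layers in range(1, 4):            # try 1..3 hidden layers
    for n_nodes in range(4, 25, 4):     # try 4, 8, ..., 24 nodes per layer
        err = evaluate_error(n_layers, n_nodes)
        if best is None or err < best[0]:
            best = (err, n_layers, n_nodes)

print("best error %.2f with %d layer(s) of %d nodes" % best)
```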

1 Recommendation

There is no specific formula to use. It is generally problem-dependent, and trial and error might work, but there is no guarantee that it will lead to the optimal solution. Grid search might be the solution in this case.

1 Recommendation

With more hidden layers there is a chance you will approach the goal more quickly, but try to solve your problem using the minimum number of neurons, as that matters when you design the practical circuit.

There is a paper "Review on Methods to Fix Number of Hidden Neurons in Neural Networks"

1 Recommendation

These days, our best bet is to include a large enough number of neurons and train with early stopping, regularization and dropout.

Then again, what's large enough? That is easier to determine than optimal. No rule of thumb :(

Trial and error. Apply a GA and take the output prediction accuracy as the fitness function.

Usually a single hidden layer is preferred. With cross-validation, a 30% to 50% training split will reduce the data size significantly.

Also, the input data can be clustered, so that a few patterns can represent the large number of patterns in each cluster.

Applying fuzzy variables (to represent fuzzy inputs) can also reduce the data.

2 Recommendations

Hello,

I agree with the opinion of Sudeera H. Gunathilaka. In my opinion there is no specific rule for selecting the number of neurons in the hidden layer. Sometimes more neurons give a better result, and sometimes fewer do. Therefore, my suggestion is to optimize the structure of the network model by trying different physical structures.

Thanks

Deleted profile

Dear Chavan

there is no method to calculate the **number of hidden layers** of a neural network, while the **number of neurons in a layer (the size of the hidden layer)** is a different term from the number of layers and is calculable.

The hidden layer, or black box as the name suggests, has somewhat vague characteristics, and like many other features of a neural network that can be tested and optimized by experiment (e.g. the number of epochs, or combinations of different types of neural networks for specific purposes), the number of hidden layers should be decided **experimentally in that context**.

On the other hand, **some rules are prescriptive** and worth bearing in mind:

1) A **linearly separable dataset** requires no hidden layer at all; such linear data does **not really need a NN** to be processed, though one still works.

2) For the **majority of problems a second or third hidden layer is rarely required**, and one hidden layer alone works.

Finally, I should remind my friends of the formulas given above; as far as I am concerned, there is no formula for calculating the number of hidden layers, but the number of neurons (the size of the hidden layer) is calculable to some extent, as below:

The attached file is worth a look (I am sure it will help you).

1 Recommendation

The trial-and-error method of changing hidden nodes until reaching the global minimum is the ultimate criterion, but using too many hidden nodes may lead to a model that overfits on test-set prediction. So it is better to cross-check the MAPE or R² values of both the training and test sets before finalizing the model.

- The number of hidden neurons should be between the size of the input layer and the size of the output layer.
- The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output layer.
- The number of hidden neurons should be less than twice the size of the input layer.
- see this https://www.heatonresearch.com/2017/06/01/hidden-layers.html

There are some basic rules for choosing the number of hidden neurons. One rule says it should be 2/3 of the total number of inputs, so if you have 18 features, try 12 hidden neurons. For hidden layers, two layers are considered better; however, you can check the accuracy of your model with different numbers of layers and select the one that gives the best result.

Cybenko's technical report (1988) states that two hidden layers are sufficient. If the learned function is Hölder continuous, then deeper networks approximate it better.

As some researchers mentioned, there is no general way to determine the number of hidden layers in an ANN, but experimentally you can start with the number of independent variables (inputs) plus the number of dependent variables (outputs), divided by 2. Then run the network and tune it (add extra nodes) to reach the most optimal results (a recommended trick: in your programming environment, use a loop to plot results for different numbers of layers and different numbers of nodes per layer).

17th Mar, 2021

Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.

This is commonly called neural architecture search (NAS).

Hyperparameters are mostly chosen by trial and error. You could employ a grid search technique to find them.

Even better, you could use an EfficientNet, which has optimal hyperparameters for efficiency and accuracy:

Cheers, Raoul

2 Recommendations

Raoul G. C. Schönhof Thank you so much

Raoul G. C. Schönhof is there a rule of thumb for choosing the search space for hidden neurons in each layer?

I would say, use a large enough number of neurons in each layer. Use dropout, regularization, and early stopping, and only search for the number of hidden layers.

The most common method is to sweep the number of hidden layers over a range, but using a large number of layers leads to time delays, which is not preferred in high-speed systems. Therefore, choosing the activation function correctly has a clearer impact on the network response than increasing the number of neurons or the number of layers, which are usually chosen by trial and error.

1 Recommendation

Ahmed Abougarair Thanks

If you use Python's scikit-learn, you can use GridSearchCV to optimize the hyperparameters of neural networks.
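For example, a minimal sketch with scikit-learn's MLPClassifier on synthetic data (the grid values and dataset are arbitrary choices of mine, not a recommendation):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for a dataset with 18 input features
X, y = make_classification(n_samples=200, n_features=18, random_state=0)

# Cross-validated grid search over candidate hidden-layer topologies
grid = GridSearchCV(
    MLPClassifier(max_iter=500, random_state=0),
    param_grid={"hidden_layer_sizes": [(6,), (12,), (18,), (12, 6)]},
    cv=3,                      # 3-fold cross-validation
)
grid.fit(X, y)
print(grid.best_params_["hidden_layer_sizes"])
```

The same `param_grid` can also carry other hyperparameters (learning rate, activation, alpha) in one search.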

This website and book provide a good solution for this crucial aspect of DL models.

The book is called *Better Deep Learning*.

Hope those are helpful

Thanks Mobina Golmohammadi

No, there is no such formula for finding the hidden layers... just go by trial and error.

1 Recommendation

There is no rule to determine the number of hidden layers and nodes in a hidden layer. As far as I have read, the only two options available are either trial and error or an optimisation process using algorithms such as GA, PSO, GridSearchCV, etc.

Yes, you are right! Unfortunately it's trial and error. To begin with, take the number of neurons in the first layer to be one third more than the number of input signals, that is, 24. At the first stage, make 2 layers, where the number of neurons in the second layer corresponds to the number of output classes. Then code it in Python, test the quality of the neural network, and add either layers or neurons within the layers.

There is no particular formula. As far as I know, you can do some trial and error, and based on the learning curve (which you can interpret visually), you can tweak the neurons and layers.

1 Recommendation

In Python, you can try GridSearchCV or KerasTuner to optimize the hyperparameters in a systematic way.
