Question
Asked 24th Jan, 2013

How to decide the number of hidden layers and nodes in a hidden layer?

I have 18 input features for a prediction network, so how many hidden layers should I take and what number of nodes are there in those hidden layers? Is there any formula for deciding this, or it is trial and error?

Most recent answer

27th Sep, 2022
Shima Shafiee
Razi University
The number of hidden layers is n_layers + 1 because we need an additional layer with just one node at the end. This is because we are trying to achieve binary classification, and only one node is required at the end to predict whether a given observation's feature set would lead to diabetes or not.
GOOD LUCK

Popular Answers (1)

26th Jan, 2013
Marco A. Wiering
University of Groningen
You need to use cross-validation to test the accuracy on the test set. The optimal number of hidden units could easily be smaller than the number of inputs; there is no rule like "multiply the number of inputs by N". If you have a lot of training examples, you can use many hidden units, but sometimes just 2 hidden units works best with little data. Usually people use one hidden layer for simple tasks, but nowadays research in deep neural network architectures shows that many hidden layers can be fruitful for difficult object, handwritten character, and face recognition problems.
61 Recommendations

All Answers (85)

24th Jan, 2013
Steffen B Petersen
Aalborg University Hospital
The introduction of hidden layer(s) makes it possible for the network to exhibit non-linear behavior. I do not know the nature of the problem. You also have to decide whether you expect your network to learn your training set to perfection, or whether you are content with, e.g., 95% performance. In order to secure the ability of the network to generalize, the number of nodes has to be kept as low as possible. If you have a large excess of nodes, your network becomes a memory bank that can recall the training set to perfection, but does not perform well on samples that were not part of the training set.
4 Recommendations
24th Jan, 2013
David Cornforth
Independent Scholar
Generally, 2 hidden layers will enable the network to model any arbitrary function. Check out this URL:
But you may want to optimise the number of layers, nodes, etc. Network growth and pruning algorithms have been around for a long time. You can also try using a genetic algorithm to define the network structure.
1 Recommendation
25th Jan, 2013
Prashant Chavan
Vishwakarma Institute of Technology
@David: Can you please explain in detail how to define a network using a genetic algorithm (or give any book references)? I have a little bit of knowledge about genetic algorithms.
1st Feb, 2013
Nazanin Kermani
Imperial College London
I agree with Wiering: there is no rule of thumb to find out how many hidden layers you need. In many cases one hidden layer works well, but in order to justify this for a specific problem, you have to apply a heuristic method such as cross-validation. Using cross-validation, you divide your data into two parts, namely a training set and a validation set (also called a test set).
You use the training set for training your network, and the validation set to identify how well your neural network performed. To do this you need to predict the labels of your validation set.
In order to minimize the effect of sampling, you do this more than once. For example, using five-fold cross-validation you do it five times, then you look at the results and take an average. By results I mean one or more performance measurements, such as specificity, sensitivity, MCC, misclassification rate, and so on.
This is a commonly used method to answer questions like:
How many hidden layers do I need?
What is the best learning rate?
..........
I know there is a very good implementation of cross-validation and neural networks in R, in the package called CMA. But if you implemented your own brand-new neural network, it is a good idea to also implement some kind of cross-validation routine.
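A minimal sketch of that cross-validation loop in Python with scikit-learn (my assumption; the R package CMA mentioned above would serve the same purpose). The toy dataset, the candidate layer sizes, and the hyperparameters are illustrative:

```python
# Sketch: five-fold cross-validation to compare hidden-layer sizes.
# Dataset, candidate sizes, and hyperparameters are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# A toy problem with 18 input features, mirroring the question.
X, y = make_classification(n_samples=500, n_features=18, random_state=0)

for n_hidden in (2, 5, 10, 18):
    clf = MLPClassifier(hidden_layer_sizes=(n_hidden,), max_iter=2000,
                        random_state=0)
    # Average accuracy over the five held-out folds.
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{n_hidden:2d} hidden units: mean accuracy {scores.mean():.3f}")
```

The size with the best mean held-out score is the one to keep; the same loop works for any other hyperparameter, e.g. the learning rate.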
3 Recommendations
6th Feb, 2013
Pavel Kordík
Czech Technical University in Prague
Before you start implementing genetic algorithms to optimize the topology of your neural net, you should first find out whether a neural network is appropriate for solving your problem. You mention that you have a prediction problem with 18 inputs. I recommend starting with a simulation tool such as RapidMiner and designing an experiment comparing the generalization performance (averaged test error) of several algorithms. Start with weak learners (linear regression, or logistic regression if you have a classification problem), then proceed to experiment with neural nets of increasing capacity.
4 Recommendations
6th Feb, 2013
Prashant Chavan
Vishwakarma Institute of Technology
This problem has previously been solved by some experts using an ANN, so I think an ANN will be able to solve it.
9th Feb, 2013
Serban C. Musca
Univ Rennes
Hi all,
I suggest you check out my (quite extensive) answer to a similar question, which can be found here:
HTH
Cheers,
SCM
11th Jul, 2015
Salah Eddine Ghamri
University of Batna 2
Maybe testing and luck help!
20th Dec, 2015
Koushik Pandit
CSIR - Central Building Research Institute
(no. of inputs + no. of outputs)^0.5 + (1 to 10). To fix the constant value (the last part, 1 to 10), use trial and error and find the optimal number of hidden-layer neurons for the minimum MSE.
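One reading of this rule sketched in Python (the exact interpretation, rounding the square root and then sweeping the additive constant, is my assumption):

```python
# Hidden-neuron candidates from (inputs + outputs)^0.5 + (1..10).
from math import sqrt

n_inputs, n_outputs = 18, 1        # the asker's 18 features; one output assumed
base = sqrt(n_inputs + n_outputs)  # sqrt(19) ~ 4.36
candidates = [round(base) + c for c in range(1, 11)]
print(candidates)                  # sizes to try while tracking validation MSE
```

For the asker's case this gives candidate sizes 5 through 14, each to be scored by validation MSE.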
30 Recommendations
11th Apr, 2016
Gana Nath
@Koushik, is the equation you gave for finding the number of layers or of nodes? Can you give a simple example?
10th Oct, 2016
Ravish D K
Dr. Ambedkar Institute of Technology
I have 12 inputs and 3 outputs. My question is: with 1 hidden layer of 10 hidden neurons, what will the MSE be?
4th Nov, 2016
Majid Saberi
University of Toronto
I think it depends on the number of features (neurons in the input layer). A higher number of hidden layers increases the order of the weights and helps make a higher-order decision boundary.
An NN with N hidden layers can make an (N+1)-order decision boundary.
Example: a perceptron without a hidden layer (N=0) can only draw a first-order (0+1=1) decision boundary.
A multi-layer perceptron with one hidden layer (N=1) is capable of drawing a second-order or lower (1+1=2) decision boundary.
So I believe an MLP with N hidden layers can surely solve your (N-1)-order problem.
4th Nov, 2016
Andrzej Kasperski
University of Zielona Góra
You can also use the geometric pyramid rule (the Masters rule):
a) for one hidden layer the number of neurons in the hidden layer is equal to:
nbrHID = sqrt(nbrINP * nbrOUT)
nbrHID – the number of neurons in the hidden layer,
nbrINP – the number of neurons in the input layer,
nbrOUT – the number of neurons in the output layer.
b) for two hidden layers:
r = (nbrINP/nbrOUT)^(1/3)
nbrHID1 = nbrOUT*(r^2) – the number of neurons in the first hidden layer
nbrHID2 = nbrOUT*r       – the number of neurons in the second hidden layer
c) for three hidden layers:
r = (nbrINP/nbrOUT)^(1/4)
nbrHID1 = nbrOUT*(r^3) – the number of neurons in the first hidden layer
nbrHID2 = nbrOUT*(r^2) – the number of neurons in the second hidden layer
nbrHID3 = nbrOUT*r       – the number of neurons in the third hidden layer
and so on
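The rule above can be sketched in a few lines of Python (rounding to whole neurons is my assumption):

```python
# Geometric pyramid (Masters) rule: hidden-layer sizes form a geometric
# series between nbrINP and nbrOUT with ratio r = (nbrINP/nbrOUT)^(1/(L+1)).
def pyramid_sizes(nbr_inp, nbr_out, n_hidden_layers):
    r = (nbr_inp / nbr_out) ** (1.0 / (n_hidden_layers + 1))
    # First hidden layer gets nbrOUT * r^L neurons, the last gets nbrOUT * r.
    return [round(nbr_out * r ** k) for k in range(n_hidden_layers, 0, -1)]

# For the asker's 18 inputs and a single output:
print(pyramid_sizes(18, 1, 1))  # one hidden layer:  [4]  (sqrt(18*1) ~ 4.24)
print(pyramid_sizes(18, 1, 2))  # two hidden layers: [7, 3]
```

Each deeper configuration tapers the layer widths geometrically from the input size down toward the output size.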
4 Recommendations
16th Mar, 2017
Mohammed Hasan Ali
Imam Ja'afar Al-sadiq University
17th Mar, 2017
Lucas Borges Ferreira
Universidade Federal de Viçosa (UFV)
The number of hidden layers and nodes depends on the problem you want to model.
Take a look at the link below and you will better understand this problem dependency.
If you change the dataset, you will see that more complex problems need more nodes/hidden layers.
2 Recommendations
11th Jul, 2017
Somayeh Kazemi
China University of Geosciences
An upper bound on the number of hidden neurons that won't result in over-fitting is:
Nh = Ns / (α ∗ (Ni + No))
Ni = number of input neurons.
No = number of output neurons.
Ns = number of samples in the training data set.
α = an arbitrary scaling factor, usually 2-10.
Others recommend setting α to a value between 5 and 10, but I find a value of 2 will often work without overfitting. As explained in the excellent NN Design text, you want to limit the number of free parameters in your model (its degree, or number of nonzero weights) to a small portion of the degrees of freedom in your data. The degrees of freedom in your data is the number of samples times the degrees of freedom (dimensions) in each sample, or Ns ∗ (Ni + No) (assuming they're all independent). So α is a way to indicate how general you want your model to be, or how much you want to prevent overfitting.
There are many rule-of-thumb methods for determining the correct number of neurons to use in the hidden layers, such as the following:
1. The number of hidden neurons should be between the size of the input layer and the size of the output layer.
2. The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output layer.
3. The number of hidden neurons should be less than twice the size of the input layer.
These three rules provide a starting point for you to consider. Ultimately, the selection of an architecture for your neural network will come down to trial and error. But what exactly is meant by trial and error? You do not want to start throwing random numbers of layers and neurons at your network. To do so would be very time consuming. Chapter 8, “Pruning a Neural Network” will explore various ways to determine an optimal structure for a neural network.
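The upper bound and the three rules of thumb above, evaluated for the asker's 18 inputs (one output, 500 training samples, and α = 2 are my illustrative assumptions):

```python
# Evaluate the rules of thumb for a concrete case (illustrative assumptions:
# Ni = 18 inputs, No = 1 output, Ns = 500 samples, alpha = 2).
Ni, No, Ns, alpha = 18, 1, 500, 2

upper_bound = Ns / (alpha * (Ni + No))  # Nh = Ns / (alpha * (Ni + No))
rule1 = (No, Ni)                        # rule 1: between output and input size
rule2 = round(2 * Ni / 3 + No)          # rule 2: 2/3 of input size + output size
rule3_max = 2 * Ni                      # rule 3: fewer than twice the inputs

print(f"over-fitting upper bound: about {upper_bound:.1f} hidden neurons")
print(f"rule 1: between {rule1[0]} and {rule1[1]}, "
      f"rule 2: {rule2}, rule 3: fewer than {rule3_max}")
```

Here the rules roughly agree on the low teens, which then only needs to be confirmed by trial and error.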
4 Recommendations
20th Aug, 2017
Ammar Sayed
Al-Azhar University
Excuse me, but I have a question.
I think normalization changes the input vector into another one, so what guarantees that the resulting ANN will map all inputs to their desired outputs?
I think we should make another change, because normalization is not a feature of the ANN but of the input vector.
I think the resulting ANN will map a normalized input to the desired output only if it is an element of the training set.
1 Recommendation
16th Jan, 2018
Hina Rehman
University of Malakand
The size of the hidden layer is normally between the size of the input and output layers. It should be 2/3 the size of the input layer plus the size of the output layer. The number of hidden neurons should be less than twice the size of the input layer.
3 Recommendations
8th May, 2018
Anwar P P Abdul Majeed
Xi'an Jiaotong-Liverpool University
This paper provided different techniques that have been employed on deciding the number of hidden neurones. It's quite a good read.
5 Recommendations
8th May, 2018
Inhyeok Yim
Gwangju Institute of Science and Technology
Just trial and error, because the performance of deep neural networks depends on the data structure.
Or you can find the number of layers and neurons by using a global optimization algorithm such as particle swarm, simulated annealing, pattern search, Bayesian optimization, etc., to minimize the validation error.
A third method is a rule of thumb called the geometric pyramid rule, which can determine the number of neurons in each hidden layer.
30th Jul, 2018
Irfan Majid
Institute of Space Technology
The rule of thumb I have successfully followed over a number of years is that for a single-hidden-layer ANN, (2 × no. of inputs) + 1 yields good results.
1 Recommendation
30th Jul, 2018
Irfan Majid
Institute of Space Technology
An amendment, please: I meant the number of neurons in the hidden layer equals (2 × no. of inputs) + 1.
14th Sep, 2018
Juan Flores
Universidad Michoacana de San Nicolás de Hidalgo
This article could be of use
Flores, Juan J., Mario Graff, and Hector Rodriguez. "Evolutive design of ARMA and ANN models for time series forecasting." Renewable Energy 44 (2012): 225-230.
1 Recommendation
25th Nov, 2018
M. Khishe
The number of neurons in the hidden layer is selected based on the following formula: 2 × N + 1, where N is the number of dataset features.
1 Recommendation
26th Nov, 2018
Juan Flores
Universidad Michoacana de San Nicolás de Hidalgo
Mohammad,
A few years back it was believed that one layer was capable of the same modeling capabilities as any number of layers. That is true, but the number of neurons grows rapidly with one layer. The trend nowadays is to design ANNs with more hidden layers (hence deep ANNs).
Two books talk about these issues:
I hope this helps.
Best,
Juan
1 Recommendation
12th Feb, 2019
Kanhaiya Sharma
Symbiosis International University
There is simply no rule for deciding the number of hidden layers or the number of neurons in each hidden layer. We just have to try different combinations.
5th Jul, 2019
Vijila Rani K
Anna University, Chennai
The number of neurons in the hidden layer is selected based on the following formula: (no. of inputs + no. of outputs)^0.5 + (1 to 10). To fix the constant value (the last part, 1 to 10), use trial and error and find the optimal number of hidden-layer neurons for the minimum mean squared error.
16th Jul, 2019
Alexandru Daia
Credit Sky
There is no fixed answer for this, contrary to what Vijila Rani K, Ines Abdeljaoued, or M. Khishe wrote.
Nobody knows; it depends on your data's particularities.
You could try to determine a neural net's number of hidden layers, and also other parameters, with genetic algorithms.
It is the best of both worlds.
I think this link will help you.
8 Recommendations
6th Aug, 2019
Sachin Takale
MIT School of Engineering
Can I skip Fully connected Layer?
1 Recommendation
12th Aug, 2019
Farshid Keivanian
The University of Newcastle, Australia
@Ines Abdeljaoued do you have a reference for that, did you mean this paper Sheela, K. G., & Deepa, S. N. (2013). Review on methods to fix number of hidden neurons in neural networks. Mathematical Problems in Engineering, 2013.
1 Recommendation
12th Aug, 2019
Farshid Keivanian
The University of Newcastle, Australia
@Vijila Rani K, any reference for that statement?
31st Aug, 2019
Mohamed Nedal
Bulgarian Academy of Sciences
There is no direct answer.
Yet, you can implement a nested for-loop, one for the number of hidden layers and the other for the number of nodes in each layer. Then you record the error for each topology and select the one that has the minimum error.
I hope that makes sense.
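A minimal version of that nested loop, sketched with scikit-learn (the toy regression data and the candidate grids are illustrative assumptions):

```python
# Nested loop: outer over hidden-layer counts, inner over nodes per layer;
# record the validation error of each topology and keep the minimum.
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=400, n_features=18, noise=5.0, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

best_err, best_topology = float("inf"), None
for n_layers in (1, 2):              # number of hidden layers
    for width in (4, 8, 16):         # nodes in each hidden layer
        topology = (width,) * n_layers
        model = MLPRegressor(hidden_layer_sizes=topology, max_iter=3000,
                             random_state=0).fit(X_tr, y_tr)
        err = mean_squared_error(y_val, model.predict(X_val))
        if err < best_err:
            best_err, best_topology = err, topology

print("lowest validation MSE with topology:", best_topology)
```

With cross-validation inside the inner loop instead of a single split, this becomes the standard grid search several later answers mention.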
1 Recommendation
24th Nov, 2019
Muneer Al-Hammadi
Norwegian University of Science and Technology
There is no specific formula to use; it is generally problem-dependent. Trial and error might work, but there is no guarantee that it will lead to the optimal solution. Grid search might be the solution in this case.
1 Recommendation
22nd Jan, 2020
Sheikh Suhail Mohammad
National Institute of Technology Srinagar
With more hidden layers, there is a chance you will approach the goal quickly. But try to solve your problem using the minimum number of neurons, as that makes sense when you design the practical circuit.
1st Apr, 2020
Guanyi Ma
Chinese Academy of Sciences
There is a paper "Review on Methods to Fix Number of Hidden Neurons in Neural Networks"
1 Recommendation
19th Apr, 2020
Juan Flores
Universidad Michoacana de San Nicolás de Hidalgo
These days, our best bet is to include a large enough number of neurons and train with early stopping, regularization and dropout.
Then again, what's large enough? That is easier to determine than optimal. No rule of thumb :(
28th Apr, 2020
Amit Saxena
Guru Ghasidas University
Trial and error. Apply a GA and take the output prediction accuracy as the fitness function.
Usually a single layer is preferred. With cross-validation, a 30% to 50% training set will reduce the data size significantly.
Also, the input data can be clustered, and very few patterns can then represent/address the large number of patterns in each cluster.
Applying fuzzy variables (to represent fuzzy inputs) can also reduce the data.
2 Recommendations
17th Aug, 2020
Dr Radha Mohan Pattanayak
VIT University
Hello,
I agree with the opinion of Sudeera H. Gunathilaka. In my opinion there is no specific rule for selecting the number of neurons in the hidden layer. Sometimes more neurons give a better result, and sometimes fewer do. Therefore, my suggestion is to optimize the structure of the network model by trying different physical structures.
Thanks
Deleted profile
Dear Chavan
There is no method to calculate the number of hidden layers in a neural network, while the number of neurons in a layer (the size of the hidden layer) is a different matter and is calculable to some extent.
The hidden layer, or "black box" as the name suggests, has somewhat vague characteristics. Like many other features of a neural network that are tested and optimized by experiment (e.g. the number of epochs, or combinations of different types of neural networks for specific purposes), the number of hidden layers should be decided experimentally.
On the other hand, there are some prescriptive rules you should bear in mind:
1) A linearly separable dataset requires no hidden layer at all; such linear data does not really require an NN to be processed, but an NN still works.
2) For the majority of problems, a second or third hidden layer is rarely required, and one hidden layer alone works.
Finally, I should remind my friends about the formulas given above: as far as I am concerned, there is no formula for calculating the number of hidden layers, but the number of neurons (the size of the hidden layer) is calculable to some extent, e.g.:
The number of neurons in the layer = the mean of the neurons in the input and output layers.
The attached file is worth a look (I am sure it will help you).
1 Recommendation
5th Sep, 2020
Mohammadzen Hasan Darsa
National Taiwan University of Science and Technology
Use trial and error until you get the optimal number of hidden-layer neurons for the minimum MAPE.
7th Sep, 2020
Naveena K .
Centre for Water Resources Development and Management
The trial-and-error method of changing hidden nodes until achieving the global minimum is the ultimate criterion, but going for more hidden nodes may lead to overfitting the model on the test-set prediction. So it is better to cross-check the MAPE or R² values of both the training and testing sets before finalizing the model.
16th Sep, 2020
Erna Nababan
University of Sumatera Utara
  • The number of hidden neurons should be between the size of the input layer and the size of the output layer.
  • The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output layer.
  • The number of hidden neurons should be less than twice the size of the input layer.
  • see this https://www.heatonresearch.com/2017/06/01/hidden-layers.html
23rd Dec, 2020
Nuno César de Sá
Leiden University
by using an optimization algorithm
27th Dec, 2020
André Wlodarczyk
31st Dec, 2020
Nadeem Qazi
University of East London
There are some basic rules for choosing the number of hidden neurons. One rule says it should be 2/3 of the total number of inputs, so if you have 18 features, try 12 hidden neurons. As for hidden layers, two layers are considered better; however, you can check the accuracy of your model by changing the number of layers and select the one that gives the best result.
6th Feb, 2021
Snehanshu Saha
BITS Pilani, K K Birla Goa
Cybenko's technical report (1988) states that two hidden layers are sufficient. If the learned function is Hölder continuous, then deeper networks approximate better.
11th Feb, 2021
Ali Shakarami
Islamic Azad University, Qom
As some researchers mentioned, there is no general way to determine the number of hidden layers in an ANN, but experimentally you can start with the number of independent variables (input values) plus the number of dependent variables (output values), divided by 2. Next, run the network and tune it (add extra nodes) to reach the most optimal results. (A recommended trick: in your programming environment, use a loop to plot results for different numbers of layers and numbers of nodes per layer.)
17th Mar, 2021
Raoul G. C. Schönhof
Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
This is commonly called neural architecture search (NAS).
Hyperparameters are mostly chosen by trial and error. You could employ a grid-search technique to find them.
Even better, you could use an EfficientNet, which has hyperparameters optimized for efficiency and accuracy:
Cheers, Raoul
2 Recommendations
17th Mar, 2021
Farshid Keivanian
The University of Newcastle, Australia
Raoul G. C. Schönhof Thank you so much
18th Mar, 2021
Houda Benhar
University of Murcia
Raoul G. C. Schönhof is there a rule of thumb for choosing the search space for hidden neurons in each layer?
18th Mar, 2021
Juan Flores
Universidad Michoacana de San Nicolás de Hidalgo
I would say, use a large enough number of neurons in each layer. Use dropout, regularization, and early stopping, and only search for the number of hidden layers.
6th May, 2021
Abhishek Kumar
ABV-Indian Institute of Information Technology and Management Gwalior
One can use a genetic algorithm to optimize the parameters of a neural network.
3rd Jun, 2021
Quy Thue Nguyen
Bursa Uludag University
As far as I have read, there is no rule for that.
A trial-and-error process has been used.
20th Jun, 2021
Ahmed J. Abougarair
University of Tripoli
The most common method is to sweep the number of hidden layers over a range, because using a large number of layers will lead to time delays, which is not preferred in high-speed systems. Choosing the activation function correctly therefore has a clearer impact on the network response than increasing the number of neurons or layers, which are usually chosen by trial and error.
1 Recommendation
20th Jun, 2021
Farshid Keivanian
The University of Newcastle, Australia
18th Jul, 2021
Josephine Olamatanmi Mebawondu
The Federal Polytechnic, Nassarawa
As far as I read, there is no rule for that. A trial and error process has been used.
28th Jul, 2021
Dengsheng Zhang
Federation University Australia
8th Aug, 2021
Mohsen Rezaei
Shiraz University
If you use Python's scikit-learn, you can use GridSearchCV to optimize the hyperparameters of neural networks.
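For example (the toy dataset, the candidate topologies, and the grid values are illustrative assumptions):

```python
# GridSearchCV over MLP topologies and L2 strength, as suggested above.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=300, n_features=18, random_state=0)

param_grid = {
    "hidden_layer_sizes": [(5,), (10,), (10, 5), (20, 10)],
    "alpha": [1e-4, 1e-2],  # L2 regularization strength
}
search = GridSearchCV(MLPClassifier(max_iter=2000, random_state=0),
                      param_grid, cv=3)
search.fit(X, y)
print("best topology:", search.best_params_["hidden_layer_sizes"])
print("cross-validated accuracy:", round(search.best_score_, 3))
```

GridSearchCV handles the cross-validation and the nested loop over candidates in one call, at the cost of training every combination in the grid.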
4th Dec, 2021
Mobina Golmohammadi
Kansas State University
This website and book provide a good solution for this crucial subject in DL models.
The book's name is Better Deep Learning.
Hope those are helpful.
4th Dec, 2021
Farshid Keivanian
The University of Newcastle, Australia
23rd Dec, 2021
Bilal Mantoo
Chandigarh University
No, there is no such formula for finding the hidden layers... just go by trial and error.
1 Recommendation
9th Feb, 2022
Mutiu A. Adegboye
Federal University Oye-Ekiti
There is no rule to determine the number of hidden layers and nodes in a hidden layer. As far as I have read, the only two options available are either trial and error, or an optimisation process using an algorithm such as GA, PSO, GridSearchCV, etc.
15th Feb, 2022
Oleksandr Prokopenko
The National Defense university of Ukraine
Yes, you are right! Unfortunately it's trial and error. To begin with, make the number of neurons in the first layer one third larger than the number of input signals, that is, 24. At the first stage, make 2 layers, where the number of neurons in the second layer corresponds to the number of output classes. Then code it in Python, test the quality of the neural network, and add either layers or neurons within the layers.
22nd Feb, 2022
Steve Richards
Indian Institute of Technology Madras
There is no particular formula. As far as I know, you can do some trial and error, and based on the learning curve (which you can interpret visually), you can tweak the neurons and layers.
1 Recommendation
24th Feb, 2022
Mohamed Nedal
Bulgarian Academy of Sciences
In Python, you can try GridSearchCV or KerasTuner to optimize the hyperparameters in a systematic way.
25th Feb, 2022
Zeki Akyildiz
Gazi University
Well Done !!! Interesting...
25th Apr, 2022
Joshua Oseheromomen Ighalo
Simon Fraser University
There are no rules on the choice of hidden layers. I'd suggest using a hyperparameter optimization algorithm to optimize your network, but this suggestion does not take runtime into consideration.
11th Jul, 2022
Arun Bali
Shri Mata Vaishno Devi University
How to determine the number of neurons in the Multi-dimensional Taylor network?
23rd Jul, 2022
El Rhabori Said
Sidi Mohamed Ben Abdellah University, Faculty of Science and Technology, Fez, Morocco
ρ = N / (H(I + O + 1) + O), with 1 < ρ < 3?
Where N is the number of molecules in the training set, H the number of hidden neurons, I the number of inputs, and O the number of outputs.
19th Sep, 2022
Ahmed J. Abougarair
University of Tripoli
In neural networks, the maximum number of layers used is usually three to four, because increasing the number of layers often will not help reduce the error; on the contrary, the computer will need more time to perform the training.
25th Sep, 2022
Leonardo Duarte
Federal University of Santa Catarina
According to Kaastra and Boyd (1996), theoretically, the use of only one layer is enough to perform function approximations.
