Total training time in seconds for the gradient boosted neural network (GBNN), deep neural network, and neural networks trained with the Adam solver
Source publication
This paper presents a novel technique based on gradient boosting to train the final layers of a neural network (NN). Gradient boosting is an additive expansion algorithm in which a series of models are trained sequentially to approximate a given function. A neural network can also be seen as an additive expansion where the scalar product of the res...
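As a rough illustration of this additive expansion, the sketch below fits a sequence of small one-hidden-layer networks to the pseudo-residuals of a squared-error loss and accumulates their predictions. The weak learner (scikit-learn's MLPRegressor), the toy data, and the shrinkage rate are illustrative choices only, and the sketch deliberately omits the paper's additional step of merging the trained weights into a single final network.

```python
# Minimal sketch of gradient boosting as an additive expansion, where each
# weak learner is a small one-hidden-layer network fit to the pseudo-residuals
# (squared-error loss, so the pseudo-residuals are plain residuals).
# Illustrative only; the paper's GBNN additionally merges the trained weights
# into a single network, which is not reproduced here.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(0, 0.1, size=500)

n_stages, nu = 10, 0.5          # number of boosting stages and shrinkage rate
F = np.full_like(y, y.mean())   # initial constant model F_0
learners = []

for t in range(n_stages):
    residuals = y - F           # pseudo-residuals of the squared-error loss
    h = MLPRegressor(hidden_layer_sizes=(8,), solver="lbfgs",
                     max_iter=2000, random_state=t)
    h.fit(X, residuals)         # each stage fits a shallow NN to the residuals
    F += nu * h.predict(X)      # additive update F_t = F_{t-1} + nu * h_t
    learners.append(h)

print("training MSE after boosting:", np.mean((y - F) ** 2))
```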
Contexts in source publication
Context 1
... order to compare the computational performance of the tested methods, the training times for GBNN, Deep-NN, NN-Adam, NN-L-BFGS and NN-SGD are shown in Table 3. For this experiment, we applied 10-fold cross-validation and computed the average fit time of the final model, which used the hyper-parameter settings most frequently selected in the cross-validation. ...
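For reference, the snippet below shows one way per-fold fit times can be averaged under 10-fold cross-validation with scikit-learn's cross_validate; the estimator, dataset, and hyper-parameters are placeholders rather than the models and settings compared in Table 3.

```python
# Hedged sketch: averaging per-fold fit times from 10-fold cross-validation.
# The classifier and dataset here are placeholders, not those of the paper.
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_validate
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
model = MLPClassifier(hidden_layer_sizes=(50,), solver="adam", max_iter=300)

cv_results = cross_validate(model, X, y, cv=10)   # returns per-fold "fit_time"
print("mean fit time (s):", cv_results["fit_time"].mean())
```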
Citations
... The final model combines all generated models in an additive manner. These ideas have also been applied to sequentially train neural networks (NNs) [17]-[20]. In [19], a Gradient Boosting (GB)-based approach that uses a weight estimation model to classify image labels is proposed. ...
... In addition, it involves formulating linear classifiers and feature extraction, where the feature extraction produces input for the linear classifiers and the resulting approximated values are stacked in a ResNet layer. In [17], a novel technique for training shallow NNs sequentially via GB is proposed. The method, called Gradient Boosted Neural Network (GBNN), constructs a single NN from the trained weights of multiple individual networks, each trained sequentially on the residual loss. ...
... In fact, the method is tested experimentally only on datasets with two features. In another line of work, a shallow NN is sequentially trained as an additive expansion using GB [17], [21]. The weights of the trained models are stored to form a final neural network. ...
Deep learning has revolutionized computer vision and image classification. In this context, Convolutional Neural Network (CNN)-based architectures and Deep Neural Networks (DNNs) are the most widely applied models. In this article, we introduce two procedures for training CNNs and DNNs based on Gradient Boosting (GB), namely GB-CNN and GB-DNN. These models are trained to fit the gradient of the loss function, i.e. the pseudo-residuals, of the previous models. At each iteration, the proposed method adds one dense layer to an exact copy of the previous deep NN model. The weights of the dense layers trained in previous iterations are frozen to prevent over-fitting, allowing the model to fit the new dense layer, and to fine-tune the convolutional layers (for GB-CNN), while still exploiting the information already learned. Through extensive experiments on different 2D-image classification and tabular datasets, the presented models show superior classification accuracy with respect to standard CNNs and DNNs with the same architectures.
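A minimal sketch of this dense-layer-adding scheme is given below, assuming a plain feed-forward setting (GB-DNN without convolutional layers): at each boosting iteration a new dense block is stacked on the previously trained, now frozen, blocks and trained on the pseudo-residuals of a squared-error loss. The layer sizes, the per-stage output layer, and the training loop are illustrative assumptions, not the authors' exact procedure.

```python
# Illustrative PyTorch sketch of iteratively adding a dense block while
# freezing the previously trained blocks, training each new block on the
# pseudo-residuals of a squared-error loss. Names and sizes are assumptions.
import torch
import torch.nn as nn

def make_block(in_dim, hidden_dim):
    return nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())

def train_stage(blocks, out_layer, X, target, epochs=200, lr=1e-2):
    # Only the newest block and the (fresh, per-stage) output layer are updated.
    params = list(blocks[-1].parameters()) + list(out_layer.parameters())
    opt = torch.optim.Adam(params, lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        h = X
        for b in blocks:
            h = b(h)
        loss = loss_fn(out_layer(h), target)
        opt.zero_grad()
        loss.backward()
        opt.step()

torch.manual_seed(0)
X = torch.randn(256, 20)
y = (X[:, :5].sum(dim=1, keepdim=True) > 0).float()   # toy binary target

blocks, hidden = [], 32
pred = torch.zeros_like(y)                             # current ensemble output
for t in range(3):                                     # three boosting iterations
    residual = y - pred                                # pseudo-residuals (squared loss)
    in_dim = X.shape[1] if t == 0 else hidden
    blocks.append(make_block(in_dim, hidden))
    out_layer = nn.Linear(hidden, 1)
    # Freeze all previously trained dense blocks before fitting the new one.
    for b in blocks[:-1]:
        for p in b.parameters():
            p.requires_grad_(False)
    train_stage(blocks, out_layer, X, residual)
    with torch.no_grad():
        h = X
        for b in blocks:
            h = b(h)
        pred = pred + out_layer(h)                     # additive update
print("final training MSE:", nn.functional.mse_loss(pred, y).item())
```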