Article

A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting

Authors: Yoav Freund, Robert E. Schapire

Abstract

In the first part of the paper we consider the problem of dynamically apportioning resources among a set of options in a worst-case on-line framework. The model we study can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting. We show that the multiplicative weight-update Littlestone-Warmuth rule can be adapted to this model, yielding bounds that are slightly weaker in some cases, but applicable to a considerably more general class of learning problems. We show how the resulting learning algorithm can be applied to a variety of problems, including gambling, multiple-outcome prediction, repeated games, and prediction of points in ℝⁿ. In the second part of the paper we apply the multiplicative weight-update technique to derive a new boosting algorithm. This boosting algorithm does not require any prior knowledge about the performance of the weak learning algorithm. We also study generalizations of the new boosting algorithm to the problem of learning functions whose range, rather than being binary, is an arbitrary finite set or a bounded segment of the real line.
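
For readers who want to see the weight-update mechanics concretely, here is a minimal Python sketch of a Hedge-style multiplicative weight update over N options; the loss matrix, the choice of beta, and the helper name hedge are illustrative assumptions rather than the paper's notation.

```python
import numpy as np

def hedge(losses, beta=0.9):
    """Hedge-style multiplicative weight update over a (T, N) loss matrix.

    losses[t, i] in [0, 1] is the loss of option i at round t;
    returns the probability vector used at each round.
    """
    T, N = losses.shape
    w = np.ones(N)                 # start with uniform weights
    history = []
    for t in range(T):
        p = w / w.sum()            # distribution over the N options this round
        history.append(p)
        w = w * beta ** losses[t]  # shrink the weight of options that incurred loss
    return np.array(history)

# toy run: 3 options, 5 rounds of random losses
rng = np.random.default_rng(0)
print(hedge(rng.uniform(size=(5, 3))).round(3))
```
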


... Specifically, the updated modality learner can focus on the errors made by others, thereby highlighting their complementarity. Unlike traditional boosting techniques (Freund, 1995; Freund & Schapire, 1997), which use weak learners like decision trees, our method employs DNN-based learners, which are over-parameterized models. To avoid overfitting, we discard historical learners and only preserve the last learner for each modality, creating an alternating-boosting strategy. ...
... At first glance, the dynamical variation of the loss function makes the optimization property of ReconBoost unclear. To further explore its theoretical foundation, we investigate the connection with the well-known Gradient-Boosting (GB) method (Freund, 1995; Friedman, 2001; Freund et al., 1996; Freund & Schapire, 1997), which is a powerful boosting method for additive expansion of models. The theoretical result is shown as follows: Theorem 3.1. ...
... Boosting is a commonly used learning approach in machine learning (Friedman, 2001; Freund, 1995; Freund et al., 1996; Freund & Schapire, 1997). It enhances the performance of a basic learner by combining multiple weaker learners. ...
Preprint
This paper explores a novel multi-modal alternating learning paradigm pursuing a reconciliation between the exploitation of uni-modal features and the exploration of cross-modal interactions. This is motivated by the fact that current paradigms of multi-modal learning tend to explore multi-modal features simultaneously. The resulting gradient prohibits further exploitation of the features in the weak modality, leading to modality competition, where the dominant modality overpowers the learning process. To address this issue, we study the modality-alternating learning paradigm to achieve reconcilement. Specifically, we propose a new method called ReconBoost to update a fixed modality each time. Herein, the learning objective is dynamically adjusted with a reconcilement regularization against competition with the historical models. By choosing a KL-based reconcilement, we show that the proposed method resembles Friedman's Gradient-Boosting (GB) algorithm, where the updated learner can correct errors made by others and help enhance the overall performance. The major difference with the classic GB is that we only preserve the newest model for each modality to avoid overfitting caused by ensembling strong learners. Furthermore, we propose a memory consolidation scheme and a global rectification scheme to make this strategy more effective. Experiments over six multi-modal benchmarks speak to the efficacy of the method. We release the code at https://github.com/huacong/ReconBoost.
... For example, in Fig. 4B, our binary classifier (model 1.0) correctly classified 88 peptides as either MDPs(54) or MPPs(34), leading to a misclassification rate of 0.318 (41 out of 129) on the complete dataset. In the same figure, the misclassification rate among α-helical peptides (subset I) was lower, with a value of 0.265, whereas the fraction of misclassified coiled sequences (subset III) reached 0.422. ...
Article
Full-text available
Machine learning models are revolutionizing our approaches to discovering and designing bioactive peptides. These models often need protein structure awareness, as they heavily rely on sequential data. The models excel at identifying sequences of a particular biological nature or activity, but they frequently fail to comprehend their intricate mechanism(s) of action. To solve two problems at once, we studied the mechanisms of action and structural landscape of antimicrobial peptides as (i) membrane-disrupting peptides, (ii) membrane-penetrating peptides, and (iii) protein-binding peptides. By analyzing critical features such as dipeptides and physicochemical descriptors, we developed models with high accuracy (86–88%) in predicting these categories. However, our initial models (1.0 and 2.0) exhibited a bias towards α-helical and coiled structures, influencing predictions. To address this structural bias, we implemented subset selection and data reduction strategies. The former gave three structure-specific models for peptides likely to fold into α-helices (models 1.1 and 2.1), coils (1.3 and 2.3), or mixed structures (1.4 and 2.4). The latter depleted over-represented structures, leading to structure-agnostic predictors 1.5 and 2.5. Additionally, our research highlights the sensitivity of important features to different structure classes across models.
... The algorithm initially assigns equal weights to all data points and then iteratively trains models, giving higher weights to incorrectly classified points. The subsequent models focus on those misclassified points, aiming to reduce the overall error [68]. ...
Article
As the number of IoT devices increases daily due to the rapid growth in technology, every device and network is vulnerable to attacks because it is exposed to the internet. Denial of Service (DoS) is a prevalent type of intrusion on the Internet of Things (IoT) network in which the server goes down due to a flood of requests. Distributed Denial of Service (DDoS) is a special type of DoS attack where a network of malicious computers, called a botnet, consumes the target’s system resources by flooding it with requests. Edge computing is closely related to the Industrial Internet of Things (IIoT) and Industry 4.0; both are relatively new technologies, so security is a crucial concern for them. By incorporating our contributions to the current and innovative dataset Edge-IIoT, the proposed study presents a novel approach to detect DDoS attacks in an IIoT network in the domain of edge computing, whether the traffic is normal or malicious (DDoS traffic). This study explores various Ensemble Learning (EL) techniques to predict normal and malicious DDoS traffic along with the type of DDoS attack. The study applies various preprocessing techniques like Synthetic Minority Over Sampling Technique (SMOTE), label encoding, etc. to enhance the model’s performance and reveals how EL techniques perform better in terms of accuracy than the individual classifiers. Further, the performance of all EL techniques has been investigated in terms of all evaluation measures, including the elapsed time. This important addition not only broadens the focus of study in this area but also offers insightful comparisons of the efficiency and precision of various ensemble approaches as well as individual classifiers. The study achieved a maximum of 99.99% in all evaluation measures.
... via the application of the standard analysis of the exponentially weighted forecaster of Vovk [1990], Littlestone and Warmuth [1994], Freund and Schapire [1997] (see, e.g., Theorem 2.2 in Cesa-Bianchi and Lugosi, 2006), and noting that ‖q_t‖_∞ ≤ R D_θ for all t. ...
Preprint
We study offline Reinforcement Learning in large infinite-horizon discounted Markov Decision Processes (MDPs) when the reward and transition models are linearly realizable under a known feature map. Starting from the classic linear-program formulation of the optimal control problem in MDPs, we develop a new algorithm that performs a form of gradient ascent in the space of feature occupancies, defined as the expected feature vectors that can potentially be generated by executing policies in the environment. We show that the resulting simple algorithm satisfies strong computational and sample complexity guarantees, achieved under the least restrictive data coverage assumptions known in the literature. In particular, we show that the sample complexity of our method scales optimally with the desired accuracy level and depends on a weak notion of coverage that only requires the empirical feature covariance matrix to cover a single direction in the feature space (as opposed to covering a full subspace). Additionally, our method is easy to implement and requires no prior knowledge of the coverage ratio (or even an upper bound on it), which altogether make it the strongest known algorithm for this setting to date.
... AdaBoost Contrary to bagging methods, AdaBoost (ADA) relies on boosting: a sequence of base learners is fitted to the entire dataset. Then additional copies of the classifier are fitted to the same data but where the weights of incorrectly classified examples are iteratively updated [9]. ...
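
As a hedged, concrete illustration of this sequential reweighting, scikit-learn's AdaBoostClassifier can be queried stage by stage via staged_predict; the synthetic dataset and parameter values below are arbitrary and not taken from the cited study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Each boosting stage fits another copy of the (shallow-tree) base learner
# to the training data reweighted toward previously misclassified examples.
clf = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)

# staged_predict reports the ensemble's prediction after 1, 2, ... weak learners.
for i, y_hat in enumerate(clf.staged_predict(X_te), start=1):
    if i == 1 or i % 10 == 0:
        print(f"{i:3d} learners: test accuracy = {accuracy_score(y_te, y_hat):.3f}")
```
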
Preprint
Full-text available
Computer aided diagnosis systems can provide non-invasive, low-cost tools to support clinicians. These systems have the potential to assist the diagnosis and monitoring of neurodegenerative disorders, in particular Parkinson's disease (PD). Handwriting plays a special role in the context of PD assessment. In this paper, the discriminating power of "dynamically enhanced" static images of handwriting is investigated. The enhanced images are synthetically generated by exploiting simultaneously the static and dynamic properties of handwriting. Specifically, we propose a static representation that embeds dynamic information based on: (i) drawing the points of the samples, instead of linking them, so as to retain temporal/velocity information; and (ii) adding pen-ups for the same purpose. To evaluate the effectiveness of the new handwriting representation, a fair comparison between this approach and state-of-the-art methods based on static and dynamic handwriting is conducted on the same dataset, i.e. PaHaW. The classification workflow employs transfer learning to extract meaningful features from multiple representations of the input data. An ensemble of different classifiers is used to achieve the final predictions. Dynamically enhanced static handwriting is able to outperform the results obtained by using static and dynamic handwriting separately.
... We show that the error floor can be significantly reduced by the following five training methods. 1) Boosting learning using uncorrected vectors: We first leverage the boosting learning technique [10] from the machine learning domain. This technique employs a sequential training approach where each classifier concentrates on samples incorrectly classified by its preceding classifiers. ...
Preprint
Full-text available
Ensuring extremely high reliability is essential for channel coding in 6G networks. The next-generation of ultra-reliable and low-latency communications (xURLLC) scenario within 6G networks requires a frame error rate (FER) below 10-9. However, low-density parity-check (LDPC) codes, the standard in 5G new radio (NR), encounter a challenge known as the error floor phenomenon, which hinders to achieve such low rates. To tackle this problem, we introduce an innovative solution: boosted neural min-sum (NMS) decoder. This decoder operates identically to conventional NMS decoders, but is trained by novel training methods including: i) boosting learning with uncorrected vectors, ii) block-wise training schedule to address the vanishing gradient issue, iii) dynamic weight sharing to minimize the number of trainable weights, iv) transfer learning to reduce the required sample count, and v) data augmentation to expedite the sampling process. Leveraging these training strategies, the boosted NMS decoder achieves the state-of-the art performance in reducing the error floor as well as superior waterfall performance. Remarkably, we fulfill the 6G xURLLC requirement for 5G LDPC codes without the severe error floor. Additionally, the boosted NMS decoder, once its weights are trained, can perform decoding without additional modules, making it highly practical for immediate application.
... In addition to these three models, we adopted the following six commonly used classifiers as base learners: extremely randomized trees (ERT) (Geurts, Ernst, and Wehenkel 2006), support vector machines (SVM) (Cortes and Vapnik 1995), K nearest neighbors (KNN) (Fix and Hodges 1989), radius-based nearest neighbors (R-NN) (Bentley 1975), adaptive boosting based on decisions trees (ADA) (Freund and Schapire 1997), and gradient boosting (GB) (Friedman 2001). ERT, also known as extra trees, is an ensemble classifier similar to RF with more randomness included. ...
... Specifically, regarding each x ∈ X as an expert, we use the standard multiplicative weight update algorithm to select x_1, ..., x_T [26,12]. Since the number of experts is |X| = 2^n, this attains an expected regret bound of O(√(T log |X|)) ≲ poly(n)·√T, while taking prohibitively long poly(n)·|X| ≳ 2^n time per round. ...
Preprint
M${}^{\natural}$-concave functions, a.k.a. gross substitute valuation functions, play a fundamental role in many fields, including discrete mathematics and economics. In practice, perfect knowledge of M${}^{\natural}$-concave functions is often unavailable a priori, and we can optimize them only interactively based on some feedback. Motivated by such situations, we study online M${}^{\natural}$-concave function maximization problems, which are interactive versions of the problem studied by Murota and Shioura (1999). For the stochastic bandit setting, we present $O(T^{-1/2})$-simple regret and $O(T^{2/3})$-regret algorithms under $T$ times access to unbiased noisy value oracles of M${}^{\natural}$-concave functions. A key to proving these results is the robustness of the greedy algorithm to local errors in M${}^{\natural}$-concave function maximization, which is one of our main technical results. While we obtain those positive results for the stochastic setting, another main result of our work is an impossibility in the adversarial setting. We prove that, even with full-information feedback, no algorithms that run in polynomial time per round can achieve $O(T^{1-c})$ regret for any constant $c > 0$ unless $\mathsf{P} = \mathsf{NP}$. Our proof is based on a reduction from the matroid intersection problem for three matroids, which would be a novel idea in the context of online learning.
... Notably, the weighted classification error rate of the combined classifier improves as the number of iterations (T) in the AdaBoost method grows. As T approaches infinity, the process drives the weighted error rate down, provided the error rates of the weak classifiers stay below 50% [41,42]. This strong convergence ensures the algorithm's dependability. ...
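
The convergence property referred to here is usually stated through the training-error bound of Freund and Schapire (1997): with m training examples and weak-learner weighted errors ε_t = 1/2 − γ_t, the combined classifier H satisfies the bound below, so the training error drops exponentially in T as long as every weak learner stays strictly better than random guessing.

```latex
% Training-error bound for AdaBoost on m examples (Freund & Schapire, 1997),
% writing \epsilon_t = 1/2 - \gamma_t for the weighted error of the t-th weak learner:
\frac{1}{m}\sum_{i=1}^{m}\mathbf{1}\{H(x_i)\neq y_i\}
  \;\le\; \prod_{t=1}^{T} 2\sqrt{\epsilon_t\,(1-\epsilon_t)}
  \;=\; \prod_{t=1}^{T} \sqrt{1-4\gamma_t^{2}}
  \;\le\; \exp\!\Big(-2\sum_{t=1}^{T}\gamma_t^{2}\Big)
```
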
Article
Cancer, a pervasive global health issue, accounts for approximately 9 million deaths annually. The survival rate of cancer patients significantly improves with early detection and accurate staging. In this context, ribonucleic acid sequencing (RNA-Seq) has become a powerful technique for measuring gene expression, thereby playing a crucial role in human disease research. On the other hand, there is a need for more efficient computational resources and tools for analysing RNA-Seq data. The RNA-Seq datasets from the Cancer Genome Atlas (TCGA) were used in this research; the following five types of cancer are included: Colon Adenocarcinoma, Prostate Adenocarcinoma, Renal Clear Cell Carcinoma, Lung Adenocarcinoma, and Breast Invasive Carcinoma. This research proposes a machine-learning technique based on the AdaBoost classifier for detecting, classifying, and predicting breast cancer. The findings of our proposed method exhibit remarkable performance, achieving a cross-validation accuracy of 99.77%, while the test and prediction accuracy were 100%. Critical parameters such as precision, recall, support, F1-score, and accuracy support this performance.
... 2. Boosting algorithms, like AdaBoost and Gradient Boosting Machines, train a sequence of weak models. Each subsequent model focuses on the instances that the previous models misclassified or predicted incorrectly [37]. The ultimate forecast is a calculated blend of the separate model results. ...
Article
Full-text available
Artificial Intelligence (AI) has become a revolutionary force in the field of future prediction, surpassing the limitations of conventional forecasting methods. This article explores the significant impact of AI on predictive modeling, delving into advanced techniques like deep learning, predictive analytics, and ensemble learning. In fields like stock market predictions [1], climate forecasting [2], and socioeconomic projections, deep learning with the support of neural networks has enabled machines to identify intricate patterns and model complex relationships. Predictive analytics, utilizing statistical algorithms such as regression models and decision trees, enables AI systems to derive valuable insights from extensive datasets [3]. Ensemble learning enhances prediction accuracy by combining outputs from multiple models, reducing individual biases [4]. Through the utilization of these AI methodologies, we are able to acquire unparalleled foresight into the future, leading to a revolutionary era of accuracy and understanding that has extensive implications across various industries and domains.
... Dietterich et al. [6] indicates that ensemble learning techniques can be divided into two categories: homogeneous ensembles and heterogeneous ensembles. Homogeneous ensembles use base learners of the same type but with different features or hyperparameters [7][8][9][10][11]. Ensemble learning techniques offer a powerful approach to bank fraud detection. ...
Article
Full-text available
Traditional methods of fraud detection rely on rule-based systems or supervised machine learning models that require labelled data and domain knowledge. However, these methods have limitations such as high false positive rates, low scalability, and vulnerability to adversarial attacks. In this paper, a novel approach for bank fraud detection using hyper ensemble machine learning (HEML) is studied; it combines multiple unsupervised and semi-supervised models with different features and hyperparameters to achieve high accuracy and robustness, including logistic regression (LR), decision tree (DT), support vector machine (SVM), neural network (NN), one-class SVM (OCSVM), and isolation forest (IF). The approach is evaluated on a real-world dataset of bank transactions from a large European bank and compared with several baseline methods. The accuracies of the base learners and ensemble learners LR, DT, SVM, NN, OCSVM, and IF on the test data are, in order, 0.95, 0.91, 0.96, 0.97, 0.93, and 0.92. The results show that HEML outperforms the baselines in terms of precision, recall, F1-score, and AUC-ROC, while reducing the computational cost and human intervention. Additionally, the effectiveness of HEML in detecting new types of fraud that were not seen in the training data is demonstrated. Thus, HEML is a promising technique for bank fraud detection that can adapt to dynamic and complex fraud scenarios. By utilizing multiple models and features, HEML can provide accurate and robust fraud detection while reducing false positives and minimizing human intervention. By employing multiple models and features, HEML has the potential to improve financial security and stability for both banks and their customers.
... When h_p = 1, the weak learner casts a vote for the class; otherwise, it votes against the class. Following the p-th iteration, the output discrimination function is defined as in [26]. ...
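
For reference, in the standard binary AdaBoost formulation the output discrimination function after P rounds is a weighted vote of the weak learners, with ε_p the weighted error of the p-th learner, so more accurate learners cast heavier votes; the exact multi-class form used in [26] may differ from this textbook version.

```latex
H(x) \;=\; \operatorname{sign}\!\Big(\sum_{p=1}^{P} \alpha_p\, h_p(x)\Big),
\qquad
\alpha_p \;=\; \tfrac{1}{2}\,\ln\!\frac{1-\epsilon_p}{\epsilon_p}
```
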
Preprint
Full-text available
Artificial intelligence (AI) has made significant advances in recent years and opened up new possibilities in exploring applications in various fields such as biomedical, robotics, education, industry, etc. Among these fields, human hand gesture recognition is a subject of study that has recently emerged as a research interest in robotic hand control using electromyography (EMG). Surface electromyography (sEMG) is a primary technique used in EMG, which is popular due to its non-invasive nature and is used to capture gesture movements using signal acquisition devices placed on the surface of the forearm. Moreover, these signals are pre-processed to extract significant handcrafted features through time and frequency domain analysis. These are helpful and act as input to machine learning (ML) models to identify hand gestures. However, handling multiple classes and biases are major limitations that can affect the performance of an ML model. Therefore, to address this issue, a new mixture of experts extra tree (MEET) model is proposed to identify more accurate and effective hand gesture movements. This model combines individual ML models referred to as experts, each focusing on a minimal class of two. Moreover, a fully trained model known as the gate is employed to weigh the output of individual expert models. This amalgamation of the expert models with the gate model is known as a mixture of experts extra tree (MEET) model. In this study, four subjects with six hand gesture movements have been considered and their identification is evaluated among eleven models, including the MEET classifier. Results elucidate that the MEET classifier performed best among other algorithms and identified hand gesture movement accurately.
... Adaptive Boosting (AdaBoost) is a classification tool that provides efficient results by combining several weak learners; it was introduced in [49] and then modified in [50]. The algorithm begins with a weak learner trained with equal weight given to each sample. ...
Article
Full-text available
Electric vehicles (EVs) are commonly recognized as environmentally friendly modes of transportation. They function by converting electrical energy into mechanical energy using different types of motors, which aligns with the sustainable principles embraced by smart cities. The motors of EVs store and consume electrical power from renewable energy (RE) sources through interfacing connections using power electronics technology to provide mechanical power through rotation. The reliable operation of an EV mainly relies on the condition of interfacing connections in the EV, particularly the connection between the 3-ϕ inverter output and the brushless DC (BLDC) motor. In this paper, machine learning (ML) tools are deployed for detecting and classifying the faults in the connecting lines from 3-ϕ inverter output to the BLDC motor during operational mode in the EV platform, considering double-line and three-phase faults. Several machine learning-based fault identification and classification tools, namely the Decision Tree, Logistic Regression, Stochastic Gradient Descent, AdaBoost, XGBoost, K-Nearest Neighbour, and Voting Classifier, were tuned for identifying and categorizing faults to ensure robustness and reliability. The ML classifications were developed based on the datasets of healthy and faulty conditions considering the combination of six critical parameters that have significance in reliable EV operation, namely the current supplied to the BLDC motor from the inverter, the modulated DC voltage, output speed, and measured speed, as well as the output of the Hall-effect sensor. In addition, the superiority of the proposed fault detection and classification approaches using ML tools was assessed by comparing the detection and classification efficiency through some statistical performance parameter comparisons among the classifiers.
... By iteratively adjusting the weights and extracting new training data, AdaBoost emphasizes previously mispredicted data, resulting in a model that predicts them more accurately. Originally introduced for binary classification by Freund and Schapire [50], AdaBoost has been adapted for regression problems. Its success in delivering accurate ensembles and its resistance to overfitting led Breiman to call AdaBoost the "best off-the-shelf classifier in the world" (NIPS Workshop 1996) [51]. ...
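
One widely used regression adaptation is AdaBoost.R2, exposed in scikit-learn as AdaBoostRegressor; the sketch below is illustrative only, with synthetic data and arbitrary hyperparameters rather than the setup of the cited work.

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor

# synthetic noisy 1-D regression problem, for illustration only
rng = np.random.default_rng(42)
X = np.sort(rng.uniform(0.0, 6.0, size=(200, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

# shallow trees as weak learners; later trees are reweighted toward
# the examples with the largest prediction errors (AdaBoost.R2 scheme)
reg = AdaBoostRegressor(DecisionTreeRegressor(max_depth=3),
                        n_estimators=100, learning_rate=0.5, random_state=0)
reg.fit(X, y)
print("training R^2:", round(reg.score(X, y), 3))
```
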
Article
Full-text available
(1) Background: This challenge is exacerbated by the aging of the rural population, leading to a scarcity of available manpower. To address this issue, the automation and mechanization of outdoor vegetable cultivation are imperative. Therefore, developing an automated cultivation platform that reduces labor requirements and improves yield by efficiently performing all the cultivation activities related to field vegetables, particularly onions and garlic, is essential. In this study, we propose methods to identify onion and garlic plants with the best growth status and accurately predict their live bulb weight by regularly photographing their growth status using a multispectral camera mounted on a drone. (2) Methods: This study was conducted in four stages. First, two pilot blocks with a total of 16 experimental units, four horizontals, and four verticals were installed for both onions and garlic. Overall, a total of 32 experimental units were prepared for both onion and garlic. Second, multispectral image data were collected using a multispectral camera repeating a total of seven times for each area in 32 experimental units prepared for both onions and garlic. Simultaneously, growth data and live bulb weight at the corresponding points were recorded manually. Third, correlation analysis was conducted to determine the relationship between various vegetation indexes extracted from multispectral images and the manually measured growth data and live bulb weights. Fourth, based on the vegetation indexes extracted from multispectral images and previously collected growth data, a method to predict the live bulb weight of onions and garlic in real time during the cultivation period, using functional regression models and machine learning methods, was examined. (3) Results: The experimental results revealed that the Functional Concurrence Regression (FCR) model exhibited the most robust prediction performance both when using growth factors and when using vegetation indexes. Following closely, with a slight distinction, Gaussian Process Functional Data Analysis (GPFDA), Random Forest Regression (RFR), and AdaBoost demonstrated the next-best predictive power. However, a Support Vector Machine (SVM) and Deep Neural Network (DNN) displayed comparatively poorer predictive power. Notably, when employing growth factors as explanatory variables, all prediction models exhibited a slightly improved performance compared to that when using vegetation indexes. (4) Discussion: This study explores predicting onion and garlic bulb weights in real-time using multispectral imaging and machine learning, filling a gap in research where previous studies primarily focused on utilizing artificial intelligence and machine learning for productivity enhancement, disease management, and crop monitoring. (5) Conclusions: In this study, we developed an automated method to predict the growth trajectory of onion and garlic bulb weights throughout the growing season by utilizing multispectral images, growth factors, and live bulb weight data, revealing that the FCR model demonstrated the most robust predictive performance among six artificial intelligence models tested.
... It utilizes an iterative approach to learn from the errors of weak classifiers and convert them into strong ones [57]. The principle behind this technique is that several weak learners are added stage-wise to build a strong learner, as depicted in Figure 4. It helps to increase accuracy by turning weak learners into strong ones, but the drawbacks include sensitivity to outliers and noisy data. ...
Article
Full-text available
Objective Chitin a natural polymer is abundant in several sources such as shells of crustaceans, mollusks, insects, and fungi. Several possible attempts have been made to recover chitin because of its importance in biomedical applications in various forms such as hydrogel, nanoparticles, nanosheets, nanowires, etc. Among them, deep eutectic solvents have gained much consideration because of their eco-friendly and recyclable nature. However, several factors need to be addressed to obtain a pure form of chitin with a high yield. The development of an innovative system for the production of quality chitin is of prime importance and is still challenging. Methods The present study intended to develop a novel and robust approach to investigate chitin purity from various crustacean shell wastes using deep eutectic solvents. This investigation will assist in envisaging the important influencing parameters to obtain a pure form of chitin via a machine learning approach. Different machine learning algorithms have been proposed to model chitin purity by considering the enormous experimental dataset retrieved from previously conducted experiments. Several input variables have been selected to assess chitin purity as the output variable. Results The statistical criteria of the proposed model have been critically investigated and it was observed that the results indicate XGBoost has the maximum predictive accuracy of 0.95 compared with other selected models. The RMSE and MAE values were also minimal in the XGBoost model. In addition, it revealed better input variables to obtain pure chitin with minimal processing time. Conclusion This study validates that machine learning paves the way for complex problems with substantial datasets and can be an inexpensive and time-saving model for analyzing chitin purity from crustacean shells.
... The BDT model is trained using the TMVA [23] package, which implements the AdaBoost [24] method with 100 trees and a max depth of 4. During the simplification step performed by fwX, the number of trees was reduced to 10. ...
Article
Full-text available
The Global Event Processor (GEP) FPGA is an area-constrained, performance-critical element of the Large Hadron Collider's (LHC) ATLAS experiment. It needs to very quickly determine which small fraction of detected events should be retained for further processing, and which other events will be discarded. This system involves a large number of individual processing tasks, brought together within the overall Algorithm Processing Platform (APP), to make filtering decisions at an overall latency of no more than 8ms. Currently, such filtering tasks are hand-coded implementations of standard deterministic signal processing tasks. In this paper we present methods to automatically create machine learning based algorithms for use within the APP framework, and demonstrate several successful such deployments. We leverage existing machine learning to FPGA flows such as hls4ml and fwX to significantly reduce the complexity of algorithm design. These have resulted in implementations of various machine learning algorithms with latencies of 1.2 μs and less than 5% resource utilization on an Xilinx XCVU9P FPGA. Finally, we implement these algorithms into the GEP system and present their actual performance. Our work shows the potential of using machine learning in the GEP for high-energy physics applications. This can significantly improve the performance of the trigger system and enable the ATLAS experiment to collect more data and make more discoveries. The architecture and approach presented in this paper can also be applied to other applications that require real-time processing of large volumes of data.
... AdaBoost (adaptive boosting), proposed by [41], is one of the boosting ensemble learning techniques. The operating principle of the AdaBoost model is to update the sample weights according to the results of the previous classifier. ...
... Specifically, the default shallow MLP architecture was adopted with one hidden layer composed of 100 neuron units [54]. • Adaptive boosting (AdaBoost) [63]: it is an ensemble learning method that combines weak classifiers to create a strong classifier. It focuses on improving the classification of difficult-to-classify examples. ...
... Adaptive Boosting (AdaBoost) [18] is adaptive because subsequent weak learners are tweaked in favor of those instances misclassified by previous classifiers. Even though the individual learners are weak, the performance of each one is slightly better than random guessing, which causes the final model to be a stronger learner. ...
... Boosting is a popular ensemble learning method combining multiple prediction models. After the proposal of the boosting algorithm using weak learners in the 1990s [53,54], Friedman reported gradient boosting in 1999, which performs sequential learning based on the gradient of any loss function [55]. Around 2014, Chen et al. developed and released eXtreme Gradient Boosting (XGBoost), an optimized and efficient version of gradient boosting [56]. ...
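
As a hedged sketch of how this family of methods is typically invoked, scikit-learn's GradientBoostingClassifier implements Friedman's stage-wise gradient fitting, and the XGBoost interface is broadly similar; the dataset and hyperparameters below are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=25, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

# each stage fits a small regression tree to the gradient of the loss
# at the current ensemble predictions (Friedman's gradient boosting)
gb = GradientBoostingClassifier(n_estimators=200, learning_rate=0.05,
                                max_depth=3, random_state=1).fit(X_tr, y_tr)
print("test accuracy:", round(gb.score(X_te, y_te), 3))

# the XGBoost interface is broadly similar (if the xgboost package is installed):
# from xgboost import XGBClassifier
# xgb = XGBClassifier(n_estimators=200, learning_rate=0.05, max_depth=3).fit(X_tr, y_tr)
```
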
Article
Full-text available
Cell-type-specific regulatory elements, cataloged through extensive experiments and bioinformatics in large-scale consortiums, have enabled enrichment analyses of genetic associations that primarily utilize positional information of the regulatory elements. These analyses have identified cell types and pathways genetically associated with human complex traits. However, our understanding of detailed allelic effects on these elements’ activities and on-off states remains incomplete, hampering the interpretation of human genetic study results. This review introduces machine learning methods to learn sequence-dependent transcriptional regulation mechanisms from DNA sequences for predicting such allelic effects (not associations). We provide a concise history of machine-learning-based approaches, the requirements, and the key computational processes, focusing on primers in machine learning. Convolution and self-attention, pivotal in modern deep-learning models, are explained through geometrical interpretations using dot products. This facilitates understanding of the concept and why these have been used for machine learning for DNA sequences. These will inspire further research in this genetics and genomics field.
... Boosting is an ensemble method that combines the performance of a set of weak classifiers to produce a single strong classifier. Boosting refers to a general and provably effective method of producing a very accurate prediction rule by combining rough and moderately inaccurate rules of thumb in a manner similar to that suggested above (Freund & Schapire, 1996; 1997). Boosting works by repeatedly running a given weak learning algorithm on various distributions over the training data, and then combining the classifiers produced by the weak learner into a single composite classifier. ...
Article
Full-text available
To improve the efficiency and accuracy of risk management in Customs, this paper explores the data mining process for risk detection with decision tree and boosting algorithms. The data are characterised by high dimensionality, imbalance and cost sensitivity. In particular, misjudging a false declaration as truthful can be more harmful than misjudging a truthful declaration as false. Therefore, considering the different costs of misclassification, we suggest taking a cost-sensitive approach with cost matrix in data mining. The inspection results are set as the prediction target variable to train the classifiers and make predictions. A data mining model of binary classification is formulated after feature selection and rebalancing. We evaluate its performance with classic measures of classification and customs risk assessment. The results show that the performance has been significantly improved with boosting while the output is less sensitive to cost-ratio under boosting.
... The main difference between AdaBoost and Xgboost is that AdaBoost learns by assigning weights to incorrectly predicted samples, and XGBoost learns by adding more trees to minimize the overall loss. More mathematical details about how AdaBoost assigns the weights can be found in (Freund and Schapire 1997). ...
Article
Full-text available
Phone use while driving (PUWD) is the most common distracted driving behavior. Considering that distracted driving is a preventable cause of crashes, researchers and practitioners want to understand this behavior to offer effective interventions. This research utilizes a dataset containing massive phone use events from Android phone users in Texas from 2018 to 2020 to explore the relationships between PUWD behavior and drivers’ socio-demographic factors at the census tract level. EXtreme Gradient Boosting (XGBoost) algorithm is adopted to perform the classification task: high PUWD rate or low PUWD rate. SHapley Additive exPlanations (SHAP) algorithm is established on the classification model to investigate the possible associations between socio-demographic factors and PUWD behavior. Our analysis indicates poverty, education attainment, sex, age groups, and income levels are dominantly associated with PUWD behavior. The census tracts with a higher- or lower-income level are more likely to be classified to have a high PUWD rate, while drivers from census tracts with middle-income levels (between 50 and 120 k) drive more defensively. The impact of income on the PUWD rate is also reflected through the education attainment factor. The results also demonstrate the strong association between a younger male population and a high PUWD rate. The census tracts are less likely to be classified to have a high PUWD rate while the median age of the male population increases. These findings demonstrate the potential to help transportation agencies target regions in greater need of anti-distracted driving interventions.
... Ensemble learning in machine learning was proposed in [56]. Ensemble methods use multiple learning algorithms to improve prediction. ...
Preprint
Full-text available
As an unsupervised learning method, clustering is done to find natural groupings of patterns, points, or objects. In clustering algorithms, an important problem is the lack of a definitive approach based on which users can decide which clustering method is more compatible with the input data set. This problem is due to the use of special criteria for optimization. Cluster consensus, as the reuse of knowledge, provides a solution to the inherent challenges of clustering. Ensemble clustering methods have come to the fore with the slogan that combining several weak models is better than a strong model. This paper proposes the optimal K-Means Clustering Algorithm (KMCE) method as an ensemble clustering method, using the weak K-Means clustering method as the base clustering. Also, by adopting certain measures, the diversity of the consensus is increased. The proposed ensemble clustering method has the advantage of K-Means, which is its speed, but it does not have its major weakness, the inability to detect non-spherical and non-uniform clusters. In the experimental results, we meticulously evaluated and compared the proposed hybrid clustering algorithm with other up-to-date and powerful clustering algorithms on different data sets, ensuring the robustness and reliability of our findings. The experimental results indicate the superiority of the proposed hybrid clustering method over other clustering algorithms in terms of F1-score, Adjusted Rand Index, and Normalized Mutual Information.
... The selection of features is based on the plain green-channel pixel and the Gaussian derivatives up to second order at 5 different scales. AdaBoost classifiers (Freund and Schapire, 1997) were used in Lupascu et al. (2010) with a 41-element feature vector based on local intensity structure, spatial properties, and geometry at multiple scales. ...
Preprint
Full-text available
Machine learning offers the potential to enhance real-time image analysis in surgical operations. This paper presents results from the implementation of machine learning algorithms targeted for an intelligent image processing system comprising a custom CMOS image sensor and field programmable gate array (FPGA). A novel method is presented for efficient image segmentation and minimises energy usage and requires low memory resources, which makes it suitable for implementation. Using two eigenvalues of the enhanced Hessian image, simplified traditional machine learning and deep learning methods are employed to learn the prediction of blood vessels. Quantitative comparisons are provided between different machine learning models based on accuracy, resource utilisation, throughput, and power usage. It is shown how a gradient boosting decision tree with 1000 times fewer parameters can achieve comparable state-of-the-art performance whilst only using a much smaller proportion of the resources and producing a 200 MHz design that operates at 1,779 frames per second at 3.85 W, making it highly suitable for the proposed system. A methodology for implementing the AI algorithms onto FPGA is presented and then used to provide additional results by extending the original work to a 512 × 512 image size along with more detailed analysis.
... The final prediction is given by the mode of the predicted categories over the H individual decision trees. Second, AdaBoost is a sequential learning algorithm (Freund & Schapire, 1997) that helps minimize overfitting and continuously improves the accuracy of the models to be combined. Third, gradient boosting achieves data classification by adopting an additive model, i.e. a linear combination of basis functions, and continuously reducing the prediction error generated during training (Nie et al., 2021). ...
Article
Full-text available
This study compares the predictive accuracy of a set of machine learning models coupled with three resampling techniques (Random Undersampling, Random Oversampling, and Synthetic Minority Oversampling Technique) in predicting bank inactivity. Our sample includes listed banks in EU-28 member states between 2011 and 2019. We employed 23 financial ratios comprising capital adequacy, asset quality, management capability, earnings, liquidity, and sensitivity indicators. The empirical findings established that XGBoost performs exceptionally well as a classifier in predicting bank inactivity, particularly when considering a one-year time frame before the event. Furthermore, our findings indicate that random forest with Synthetic Minority Oversampling Technique demonstrates the highest predictive accuracy two years prior to inactivity, while XGBoost with Random Oversampling outperforms other methods three years in advance. Furthermore, the empirical results emphasize the significance of management capability and loan quality ratios as key factors in predicting bank inactivity. Our findings present important policy implications.
... At any iteration, the training examples mispredicted in the previous iteration will receive higher weights in the subsequent learning iteration, allowing the weak learners to pay more attention to those mispredicted instances. For a fuller presentation of AdaBoost, we refer the reader to [57,58]. ...
Article
Full-text available
We investigate if the vehicle travel time after 6 h on a given street can be predicted, provided the hourly vehicle travel time on the street in the last 19 h. Likewise, we examine if the traffic status (i.e., low, mild, or high) after 6 h on a given street can be predicted, provided the hourly traffic status of the street in the last 19 h. To pursue our objectives, we exploited historical hourly traffic data from Google Maps for a main street in the capital city of Jordan, Amman. We employ several machine learning algorithms to construct our predictive models: neural networks, gradient boosting, support vector machines, AdaBoost, and nearest neighbors. Our experimental results confirm our investigations positively, such that our models have an accuracy of around 98–99% in predicting vehicle travel time and traffic status on our study’s street for the target hour (i.e., after 6 h from a specific point in time). Moreover, given our time series traffic data and our constructed predictive models, we inspect the most critical indicators of street traffic status and vehicle travel time after 6 h on our study’s street. However, as we elaborate in the article, our predictive models do not agree on the degree of importance of our data features.
... The choice of loss function depends on the task and the type of data. • Adaptive Boosting: Adaptive Boosting (AdaBoost) [40] is a boosting algorithm that adapts the weights of the training samples based on their previous performance. The basic idea behind AdaBoost is to train a series of weak models on the weighted samples and then combine their predictions to form a strong model. ...
Article
Full-text available
In Mixed-Criticality (MC) systems, due to encountering multiple Worst-Case Execution Times (WCETs) for each task corresponding to the system operation modes, estimating appropriate WCETs for tasks in lower-criticality (LO) modes is essential to improve the system’s timing behavior. While numerous studies focus on determining WCET in the high-criticality mode, determining the appropriate WCET in the LO mode poses significant challenges and has been addressed in a few research works due to its inherent complexity. This article introduces ESOMICS, a novel scheme, to obtain appropriate WCET for LO modes, in which we propose an ML-based approach for WCET estimation based on the application’s source code analysis and the model training using a comprehensive data set. The experimental results show a significant improvement in utilization by up to 23.3% compared to state-of-the-art works, while mode switching probability is bounded by 7.19%, in the worst-case scenario.
... Boosting algorithms are computational modeling techniques first presented by Freund and Schapire [13]. These algorithms improve predictive power by converting several weak learners into strong learners. ...
Conference Paper
Full-text available
The current study discusses the application of intelligent algorithms and machine learning techniques to predict the bond strength between steel and concrete. The paper focuses on three boosting algorithms employed for this prediction task. The research exploited a database derived from pull-out tests conducted on thin steel bars to assess the bond between steel and concrete. The experimental program involved the use of three different classes of concrete and two types of steel bars. The goal was to analyze the steel-concrete bond strength, which is influenced by various factors. For the computational simulations, the input variables considered in this study were the bar surface, bar diameter (ϕ), concrete compressive strength (fc), and anchorage length (Ld). The output was the pull-out strength at the steel-concrete interface. It is important to highlight that most previous studies in this field have mainly focused on bars with diameters greater than 10.0 mm, while there is limited research available to evaluate the performance of bars with diameters smaller than 10.0 mm. The paper describes the computational experiments conducted using different boosting algorithms: Adaptive Boosting (AdaBoost), Gradient Boosting (GB), and Extreme Gradient Boosting (XGB). These machine learning-based models achieved highly accurate predictions, applying specific hyperparameters. The following metrics were used to compare the performance of the different methods: Root Mean Squared Error (RMSE), the coefficient of variation (CV), and the error. These metrics were used to evaluate the reliability of each algorithm in predicting the bond strength in the samples. The results indicate the accuracy and goodness of fit of the model's predictions. Based on them, it can be concluded that the presented model can satisfactorily predict the bond strength of samples between thin steel bars and concrete.
... It operates sequentially, constructing a series of base DTRs to rectify the errors made by preceding models. In this algorithm, each data point is assigned a weight [64]. Initially, all data points carry equal weights. ...
Article
In the quest to reduce the environmental impact of the construction sector, the adoption of sustainable and eco-friendly materials is imperative. Geopolymer recycled aggregate concrete (GRAC) emerges as a promising solution by substituting supplementary cementitious materials, including fly ash and slag cement, for ordinary Portland cement and utilizing recycled aggregates from construction and demolition waste, thus significantly lowering carbon emissions and resource consumption. Despite its potential, the widespread implementation of GRAC has been hindered by the lack of an effective mix design methodology. This study seeks to bridge this gap through a novel machine learning (ML)-based approach to accurately model the compressive strength (CS) of GRAC, a critical parameter for ensuring structural integrity and safety. By compiling a comprehensive database from existing literature and enhancing it with synthetic data generated through a tabular generative adversarial network, this research employs eight ensemble ML techniques, comprising three bagging and five boosting methods, to predict the CS of GRAC with high precision. The boosting models, notably extreme gradient boosting, light gradient boosting, gradient boosting, and categorical gradient boosting regressors, demonstrated superior performance, achieving a mean absolute percentage error of less than 6%. This precision in prediction underscores the viability of ML in optimizing GRAC formulations for enhanced structural applications. The identification of testing age, natural fine aggregate content, and recycled aggregate ratio as pivotal factors offers valuable insights into the mix design process, facilitating more informed decisions in material selection and proportioning. Moreover, the development of a user-friendly graphical interface for CS prediction exemplifies the practical application of this research, potentially accelerating the adoption of GRAC in mainstream construction practices. By enabling the practical use of GRAC, this research contributes to the global effort to promote sustainable development within the construction industry.
... 1) AdaBoost: AdaBoost is a well-known ensemble-learning-based categorization model [34]. A set of weak classifiers, or hypotheses, is constructed sequentially, directing each subsequent hypothesis toward the more challenging classification cases. ...
Article
Full-text available
Despite promising results reported in the literature for mental workload assessment using electroencephalography (EEG), most of the proposed methods rely on employing multiple EEG channels, limiting their practicality. However, the advent of wearable EEG technology provides the possibility of mental workload assessment for real-life applications. Yet, a few studies that considered consumer-oriented EEG headsets for mental workload assessment only used a single database for validating the proposed methods, overlooking the potential for portability. In this research, we studied 60 recordings of participants playing a three-level n-back game, utilizing data from two EEG devices, Enobio and Muse, with distinctive characteristics such as sampling rate and channel configuration. Following the denoising of the EEG signals, we segmented the signals and applied the discrete wavelet transform to decompose them into sub-bands. Then, we extracted Shannon entropy and wavelet log energy features from all sub-bands. Subsequently, we fed the extracted features into five classifiers: support vector machine, k-nearest neighbors, multi-layer perceptron, AdaBoost, and the transformer network. In comparing the results across all classifiers, the transformer network demonstrated superiority by achieving highest mean accuracy for Database M (88%) and Database E (85%). Given the consistent outcomes achieved with the transformer network classifier across both databases and utilizing a three-level n-back game, our findings indicate that the proposed method holds promise for real-life applications.
... (2) Extreme gradient boosting trees (XGBoost): compared with RF, gradient boosted decision trees (GBDT) [40] generate only a single decision tree at each step; first, a single decision tree is trained on the initial database ...
... sequential enhancement of the predictor's accuracy [60]. The entire dataset was initially subjected to learning using the first decision tree. Subsequent trees then iteratively focus on the dataset, adjusting the learning focus based on the error distribution characterized by the performance of the antecedent tree. ...
Article
The concrete compressive strength is essential for the design and durability of concrete infrastructure. Silica fume (SF), as a cementitious material, has been shown to improve the durability and mechanical properties of concrete. This study aims to predict the compressive strength of concrete containing SF by dual-objective optimization to determine the best balance between accurate prediction and model simplicity. A comprehensive dataset of 2995 concrete samples containing SF was collected from 36 peer-reviewed studies ranging from 5% to 30% by cement weight. Input variables included curing time, SF content, water-to-cement ratio, aggregates, superplasticizer levels, and slump characteristics in the modeling process. The gray wolf optimization (GWO) algorithm was applied to create a model that balances parsimony with an acceptable error threshold. A determination coefficient (R2) of 0.973 demonstrated that the CatBoost algorithm emerged as a superior predictive tool within the boosting ensemble context. A sensitivity analysis confirmed the robustness of the model, identifying curing time as the predominant influence on the compressive strength of SF-containing concrete. To further enhance the applicability of this research, the authors proposed a web application that facilitates users to estimate the compressive strength using the optimized CatBoost algorithm by following the link: https://sf-concrete-cs-prediction-by-javid-toufigh.streamlit.app/.
... [3] F1 Score: the F1 score measures correctness as the harmonic mean of precision and recall. [3,4,5] Cross-Validation Score: a score obtained by evaluating the model with cross-validation. These metrics assess the general performance of each model in question. ...
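For reference, both metrics can be computed with scikit-learn as in the generic sketch below; the model and data are placeholders, not the cited pipeline.

```python
# Computing the F1 score and a cross-validation score (generic scikit-learn sketch).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# F1: harmonic mean of precision and recall on the held-out split.
print("F1:", f1_score(y_test, clf.predict(X_test)))

# Cross-validation score: mean accuracy over 5 folds of the full dataset.
print("CV accuracy:", cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean())
```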
Article
The aim of this research is to develop a system that can automatically detect and classify seven different types of dry bean seeds using data captured by a high-resolution camera. This system can help farmers determine the quality of their crop and optimize production. It can also be used for other agricultural applications, such as identifying defects or pests. The system will use a combination of image processing techniques, such as color segmentation and feature extraction, and machine learning algorithms, such as support vector machines and decision trees, to accurately classify bean seeds into their corresponding categories. The system will be evaluated using a dataset of images of bean seeds and the results will be compared to those obtained by human experts. The performance of the system will be measured in terms of accuracy, sensitivity, and specificity. The developed system will provide a more accurate and efficient way to classify bean seeds, which will lead to improved decision making in agriculture. In addition, the techniques used in this system can be applied to other agricultural applications, such as fruit and vegetable recognition.
... AdaBoost trains each weak learner so that it focuses on the data misclassified by its predecessor; learners further down the sequence thus iteratively adapt their parameters and achieve better results [53,54]. Multiple variants of the AdaBoost algorithm exist, starting from the original one [55], which was designed for binary classification, with later variants addressing regression and multi-class classification. The pseudocode for AdaBoost can be described as follows: ...
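Because the quoted excerpt truncates the pseudocode, the following from-scratch sketch of the binary AdaBoost loop (decision stumps as weak learners) is offered as a hedged illustration; it follows the standard formulation rather than reproducing the pseudocode of [55] verbatim.

```python
# From-scratch sketch of binary AdaBoost with decision stumps (illustrative only).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y01 = make_classification(n_samples=400, n_features=10, random_state=0)
y = 2 * y01 - 1                               # labels in {-1, +1}

n = len(y)
w = np.full(n, 1.0 / n)                       # uniform initial sample weights
stumps, alphas = [], []

for _ in range(50):
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
    pred = stump.predict(X)
    err = np.sum(w * (pred != y)) / np.sum(w)  # weighted training error
    err = np.clip(err, 1e-10, 1 - 1e-10)
    alpha = 0.5 * np.log((1 - err) / err)      # weak learner's vote weight
    # Increase the weights of misclassified samples, decrease the others.
    w *= np.exp(-alpha * y * pred)
    w /= w.sum()
    stumps.append(stump)
    alphas.append(alpha)

# Final strong classifier: sign of the weighted vote of all stumps.
F = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
print("training accuracy:", np.mean(np.sign(F) == y))
```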
... AdaBoost [55,56] is an ensemble-based classification approach that builds a single, stronger composite learner through the iterative addition of weak learners. At each training phase a fresh weak learner is added to the ensemble, and the weighting vector is adjusted so that the new learner concentrates on the instances that were misclassified in earlier training phases. ...
Article
Full-text available
Acute Lymphoblastic Leukemia (ALL) is a malignancy of White Blood Cells (WBC) originating from lymphoid cells. Hematologists detect ALL through manual inspection of Microscopic Blood Smear (MBS) images and employ standard diagnostic devices such as flow cytometry. However, manual evaluation by hematologists is prone to diagnostic error, costly, and labor-intensive. In this paper, a computer-aided ALL detection scheme using a Whale Optimization Algorithm-based Support Vector Machine (WOA-SVM) is proposed. The major challenges are WBC segmentation, discriminant feature extraction, and WBC classification (normal and ALL). Here, CIE L*a*b* color-based K-means clustering with a marker-controlled watershed is utilized for WBC segmentation. A 15-dimensional feature set is formed by combining the proposed features with existing features. The proposed features are the 2D Discrete Orthonormal S-Transform with weighted Principal Component Analysis, the sum of the rotation-invariant Local Binary Pattern with a uniform pattern, and the mean intensity of the cyan channel of the CMYK color model. Further, an ANOVA test is performed to identify the significant features. The features are ranked in descending order of F-values, and their covariance structure is removed using Zero-Phase Component Analysis whitening. Moreover, the performance of WOA-SVM is evaluated on this feature set, establishing a promising result with 98.42% accuracy on the ALL-IDB1 dataset. The experimental outcomes demonstrate the superiority of the proposed methodology over other competing methodologies.
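As a side note on the whitening step mentioned in this abstract, Zero-Phase Component Analysis (ZCA) whitening removes the covariance structure of a feature matrix via its eigendecomposition; the sketch below uses synthetic data and is not the authors' implementation.

```python
# Hedged sketch of ZCA whitening on a feature matrix (synthetic data).
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 15))            # e.g. 200 samples x 15 features
Xc = X - X.mean(axis=0)                       # center the features

cov = np.cov(Xc, rowvar=False)                # 15 x 15 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)

# ZCA whitening matrix: E * D^{-1/2} * E^T (a small epsilon guards tiny eigenvalues).
W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + 1e-8)) @ eigvecs.T
X_white = Xc @ W

# After whitening, the feature covariance is (approximately) the identity matrix.
print(np.allclose(np.cov(X_white, rowvar=False), np.eye(15), atol=1e-6))
```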
... Xu et al. [20] utilized Gabor filters to extract the characteristics of intensity and depth images. These characteristics were selected and reduced via a hierarchical scheme that integrates LDA with the AdaBoost learning algorithm [33]. Based on the selected depth and intensity characteristics, a classifier for face recognition was then built through the AdaBoost learning procedure. ...
Article
Full-text available
This article presents a novel algorithm for automatic 3D face recognition that is robust to facial expression alterations, missing data, and outliers. The algorithm is divided into three main components: first, the 3D face scan is decomposed into structure–texture images; second, feature vectors are extracted from each component; third, a postprocessing step is applied to deal with outliers embedded in the features. The proposed technique was tested on two public datasets (i.e., Gavab and Bosphorus). Experimental testing shows that our proposed methods can increase face recognition performance compared to relevant state-of-the-art methods.
Thesis
Full-text available
Action recognition is a field within computer vision that involves identifying and understanding human actions and activities based on sequences of images or depth maps. It is a complex and challenging task due to practical difficulties such as background noise, changes in scale, occlusions, and variations in viewpoint, lighting, and the appearance of individuals. This thesis focuses on action recognition with small sets of depth map sequences. In this context, a new ensemble learning method based on neural networks, named NECSCF, has been proposed. Three versions of the NECSCF algorithm were examined: one for tabular data and two for action recognition. In the research conducted on tabular data, it was experimentally demonstrated that NECSCF achieves good results on datasets with a limited number of examples. However, the NECSCF algorithm requires the development of appropriate common features. To apply the new algorithm to the problem of action recognition, a new set of features based on Dynamic Time Warping (DTW) distances for time series is proposed. Two versions of the NECSCF algorithm were investigated for action recognition. The first version uses DTW-based features calculated from handcrafted features as the common features, while class-specific features are determined in two stages: in the first stage, features are extracted by convolutional networks from depth maps, and in the second stage, features are determined from the time series using one of three methods: a 1D convolutional network, a Siamese 1D network, or statistical features. The second version of the NECSCF algorithm has been simplified and does not use handcrafted features. Class-specific features are determined by TD-LSTM networks trained in a single stage (end-to-end). In this version, two types of common features are used: features based on unsupervised learning (a convolutional autoencoder) and features based on DTW. Here, the DTW features are not calculated from handcrafted features but from features determined by the Siamese network. The proposed NECSCF method has achieved competitive results both on tabular data and in action recognition. This family of algorithms enables the combination of advantages from various deep learning architectures and facilitates the effective use of the DTW algorithm in conjunction with neural networks.
Article
With the growing complexity and frequency of cyber threats, there is a pressing need for more effective defense mechanisms. Machine learning offers the potential to analyze vast amounts of data and identify patterns indicative of malicious activity, enabling faster and more accurate threat detection. Ensemble methods, by incorporating diverse models with varying vulnerabilities, can increase resilience against adversarial attacks. This study evaluates an innovative ensemble classification approach for identifying intrusion threats on the large CICIDS2017 dataset. The approach is based on a distributivity equation that appropriately aggregates the underlying classifiers. It combines various standard supervised classification algorithms, including the Multilayer Perceptron Network, k-Nearest Neighbors, and Naive Bayes, to create an ensemble. Experiments were conducted to evaluate the effectiveness of the proposed hybrid ensemble method. The performance of the ensemble approach was compared with that of the individual classifiers using measures such as accuracy, precision, recall, F-score, and area under the ROC curve. Additionally, comparisons were made with widely used state-of-the-art ensemble models, including the soft voting method (Weighted Average Probabilities), Adaptive Boosting (AdaBoost), and the Histogram-based Gradient Boosting Classification Tree (HGBC), and with existing methods in the literature using the same dataset, such as Deep Belief Networks (DBN) and Deep Feature Learning via Graph (Deep GFL). Based on these experiments, it was found that some ensemble methods, such as AdaBoost and HGBC, do not perform reliably for the specific task of identifying network attacks. This highlights the importance of understanding the context and requirements of the data and problem domain. The results indicate that the proposed hybrid ensemble method outperforms traditional algorithms in terms of classification precision and accuracy, and offers insights for improving the effectiveness of intrusion detection systems.
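For orientation, the soft-voting baseline mentioned above (an average of predicted class probabilities) can be reproduced generically with scikit-learn as sketched below, combining the three listed base classifiers; this illustrates the baseline only, not the paper's distributivity-equation aggregation.

```python
# Soft-voting baseline sketch: average the predicted class probabilities
# of an MLP, k-NN, and Naive Bayes (illustrative; not the paper's aggregation rule).
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("mlp", MLPClassifier(max_iter=500, random_state=0)),
        ("knn", KNeighborsClassifier()),
        ("nb", GaussianNB()),
    ],
    voting="soft",                            # average class probabilities
)
ensemble.fit(X_train, y_train)
print("test accuracy:", ensemble.score(X_test, y_test))
```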
Article
Myeloid cell leukemia 1 (Mcl1), a critical protein that regulates apoptosis, has been considered as a promising target for antitumor drugs. The conventional pharmacophore screening approach has limitations in conformation...
Preprint
Full-text available
This report presents two recent measurements of semileptonic $b$-hadron decays at the LHCb experiment: a test of Lepton Flavour Universality (LFU) using $\bar{B}^0 \to D^{(*)+} l^- \bar{\nu}_l$ decays with $l\in\{\mu,\tau\}$, and a study of the $D^{*+}$ longitudinal polarisation in the $\bar{B}^0 \to D^{*+} \tau^- \bar{\nu}_\tau$ decay. With the inclusion of the new LFU ratio results, the world average of $R(D)$ and $R(D^*)$ still shows a tension of over three standard deviations with the SM prediction, while the measured $D^{*+}$ longitudinal polarisation is compatible with the SM value.
Article
In machine learning, ensemble learning methods (ELM) combine several machine learning algorithms to obtain better-quality predictions than a single model. The basic idea is to learn a set of classifiers and allow them to vote. In this paper, to apply ELM effectively for enhancing the performance of an artificial neural network (ANN), a strategy was devised that uses a main ANN to divide the data to be classified into two categories, an ‘easy-to-classify’ category and a ‘difficult-to-classify’ category. Hence, a reliable ANN and an unreliable ANN are created and applied to the classification of ‘easy-to-classify’ data and ‘difficult-to-classify’ data, respectively. The AdaBoost algorithm and the Bagging algorithm are implemented separately on the unreliable ANN. To increase performance, the AdaBoost results and Bagging results are merged. The developed scheme is applied to remote sensing images from Meteosat Second Generation (MSG). The final results show very promising performance when the results from AdaBoost-ANN and Bagging-ANN are fused (Ada/Bag-ANN). Indeed, the POD, FAR, CSI, and Bias improve from 87.2%, 17.4%, 80.8%, and 1.3 (ANN) to 96.8%, 6.8%, 92.7%, and 1.1 (Ada/Bag-ANN), respectively. The same trend was observed for the precipitation estimates. The estimates obtained from the developed model (Ada/Bag-ANN) largely surpass those obtained from the ANN without ELM. Compared to ECST (Enhanced Convective Stratiform Technique), EPSAT-SG (Second Generation Satellite Precipitation Estimation), TAMSAT (Tropical Applications of Meteorology using SATellite), and RFE-2.0 (Rain Fall Estimate), which showed correlation coefficients of 87%, 81%, 76%, and 71%, respectively, the Ada/Bag-ANN method shows significantly better results with a correlation coefficient of 94%.
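For readers unfamiliar with these verification scores, they follow from the standard 2x2 contingency table of hits, misses, and false alarms; the counts in the sketch below are invented for illustration and are not the study's data.

```python
# Standard detection verification scores from a 2x2 contingency table
# (illustrative counts; not the study's data).
hits, misses, false_alarms = 940, 60, 70

pod = hits / (hits + misses)                       # probability of detection
far = false_alarms / (hits + false_alarms)         # false alarm ratio
csi = hits / (hits + misses + false_alarms)        # critical success index
bias = (hits + false_alarms) / (hits + misses)     # frequency bias

print(f"POD={pod:.2f}  FAR={far:.2f}  CSI={csi:.2f}  Bias={bias:.2f}")
```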
Article
Full-text available
Many lumbar spine diseases are caused by defects or degeneration of lumbar intervertebral discs (IVD) and are usually diagnosed through inspection of the patient’s lumbar spine MRI. Efficient and accurate assessments of the lumbar spine are essential but challenging because the clinical radiologist workforce is not keeping pace with the demand for radiology services. In this paper, we present a methodology to automatically annotate lumbar spine IVDs with their height and degenerative state, the latter quantified using the Pfirrmann grading system. The method starts with semantic segmentation of a mid-sagittal MRI image into six distinct non-overlapping regions, including the IVD and vertebrae regions. Each IVD region is then located and assigned its label. Using geometry, a line segment bisecting the IVD is determined and its Euclidean length is used as the IVD height. We then extract an image feature, called the self-similar color correlogram, from the nucleus of the IVD region as a representation of the region’s spatial pixel intensity distribution. We then use the IVD height data and a machine learning classification process to predict the Pfirrmann grade of the IVD. We considered five different deep learning networks and six different machine learning algorithms in our experiments and found the ResNet-50 model and an Ensemble of Decision Trees classifier to be the combination that gives the best results. When tested on a dataset containing 515 MRI studies, we achieved a mean accuracy of 88.1%.
Preprint
Full-text available
Falls can result in severe injuries and even mortality among individuals of all age groups. Hence, numerous wearable sensor-based fall monitoring systems are being developed to provide assistance. Fall detection and activity tracking have been partially successful using smartwatches, smartphones, and specialized devices. However, a comprehensive solution that combines sensor data from different brands in a single model and performs fall detection with high accuracy has not yet been reported. This study aims to bridge this research gap by combining data from two different brands of IMUs (inertial measurement units) that incorporate accelerometers, magnetometers, and gyroscopes, in order to create a hybrid dataset. To achieve accurate predictions on data from both brands, machine learning (ML) models were trained on the combined data. The first dataset was obtained from 14 volunteers using a commercially available activity tracking system called Motion Trackers Wireless (MTw). The second dataset was collected from 30 volunteers using a custom-designed Activity Tracking Device (ATD) specifically developed for detecting falls and daily-life activities. In both cases, the sensors from the respective brands were positioned on the waist to capture data related to falls and daily-life activities. The data were organized as time series to capture the sequential structure of the falling data. During modelling, ten different classifiers were trained, and classification was performed on unseen data using a train/test split. The Extra Tree algorithm emerged as the most successful model, achieving an accuracy of 99.54%, precision of 99.18%, recall of 99.79%, and an F-score of 99.49% on the hybrid dataset constructed from the MTw and ATD datasets. This study demonstrates that a hybrid dataset can be used to create a successful system with high accuracy and low false-alarm rates using inertial sensor data from various brands.
Article
Full-text available
This study compares the performance of the ensemble machine learning methods stacking, blending, and soft voting for landslide susceptibility mapping (LSM) in Lombardy, a highly affected region of Northern Italy. We first created a spatial database based on open data, ensuring accessibility to relevant information on landslide-influencing factors, historical landslide records, and areas with a very low probability of landslide occurrence called the ‘No Landslide Zone’, an innovative concept introduced in this study. Then, open-source software was employed to develop five machine learning classifiers (Bagging, Random Forests, AdaBoost, Gradient Tree Boosting, and Neural Networks), which were tested at a basin scale by implementing different combinations of training and testing schemes using three use cases. The three classifiers with the highest generalization performance (Random Forests, AdaBoost, and Neural Networks) were selected and combined by ensemble methods. Soft voting showed the highest performance among them. The best model for generating the LSM for the Lombardy region was a Neural Network model trained using data from three basins, achieving an accuracy of 0.93 in Lombardy. The LSM indicates that 37% of Lombardy falls in the highest landslide susceptibility categories. Our findings highlight the importance of openness in advancing LSM, not only by enhancing the reproducibility and transparency of our methodology but also by promoting knowledge-sharing within the scientific community.
Article
Full-text available
Global atmospheric models rely on parameterizations to capture the effects of gravity waves (GWs) on middle atmosphere circulation. As they propagate upwards from the troposphere, the momentum fluxes associated with these waves represent a crucial yet insufficiently constrained component. The present study employs three tree‐based ensemble machine learning (ML) techniques to probe the relationship between large‐scale flow and small‐scale GWs within the tropical lower stratosphere. The measurements collected by eight superpressure balloons from the Strateole 2 campaign, comprising a cumulative observation period of 680 days, provide valuable estimates of the gravity wave momentum fluxes (GWMFs). Multiple explanatory variables, including total precipitation, wind, and temperature, were interpolated from the ERA5 reanalysis at each balloon's location. The ML methods are trained on data from seven balloons and subsequently utilized to estimate reference GWMFs of the remaining balloon. We observed that parts of the GW signal are successfully reconstructed, with correlations typically around 0.54 and exceeding 0.70 for certain balloons. The models show significantly different performances from one balloon to another, whereas they show rather comparable performances for any given balloon. In other words, limitations from training data are a stronger constraint than the choice of the ML method. The most informative inputs generally include precipitation and winds near the balloons' level. However, different models highlight different informative variables, making physical interpretation uncertain. This study also discusses potential limitations, including the intermittent nature of GWMFs and data scarcity, providing insights into the challenges and opportunities for advancing our understanding of these atmospheric phenomena.