Figure 5 - uploaded by Brendan Marsh
Training of an AdaBoost classifier. The first classifier trains on unweighted data, then reweights the data for the next and so on to produce the final classifier.  
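The training loop in the caption can be sketched as follows. This is a minimal illustration, assuming the common discrete AdaBoost formulation with one-dimensional threshold "stumps" as the weak classifiers. The stump learner, the function names, and the toy data are assumptions made for illustration and are not taken from the source publication.

```python
import math

def stump_predict(x, threshold, polarity):
    """Weak classifier: +polarity at or above the threshold, -polarity below."""
    return polarity if x >= threshold else -polarity

def fit_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(
                w for w, x, y in zip(weights, xs, ys)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost_train(xs, ys, n_rounds=3):
    """Train n_rounds stumps in series; labels must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n  # the first classifier sees uniform weights
    ensemble = []
    for _ in range(n_rounds):
        err, threshold, polarity = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # this classifier's vote
        ensemble.append((alpha, threshold, polarity))
        # Reweight: misclassified points gain weight, correct ones lose it.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    """Final classifier: sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy data (hypothetical): no single threshold separates the two classes.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost_train(xs, ys, n_rounds=3)
print([adaboost_predict(ensemble, x) for x in xs])
```

Each stump on its own misclassifies some of the toy points. The reweighting step is what steers the next stump toward the points the previous ones got wrong.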

Source publication
Article
A multivariate analysis is presented for the study of the vector boson fusion (VBF) Higgs boson decaying to a pair of tau leptons. While the VBF production mechanism of the Higgs is roughly an order of magnitude lower in cross section than the dominant gluon-gluon fusion mechanism, it is shown that VBF produces a distinctive signature that is well...

Similar publications

Article
The quest to observe gravitational waves challenges our ability to discriminate signals from detector noise. This issue is especially relevant for transient gravitational-wave searches that take a robust "eyes wide open" approach, the so-called all-sky burst searches. Here we show how signal classification methods inspired by broad astrophysical charact...

Citations

... A description of how the AdaBoost algorithm works. The learner is incrementally boosted at each iteration, where the wrongly classified points from the last iteration are prioritized and the weights assigned to them are adjusted[41]. ...
Article
The performance of a machine learning algorithm is dependent on the quality of the available data for model development. However, in practical situations, the availability of the data is variable and can be limited. This limitation creates a budget problem for data-driven techniques and the objective in such situations is to develop the best model given the available data. In this article, we examine the budgeted learning problem for spatial data within the urban context. We demonstrate the effectiveness of a novel approach for inferring the attributes of spatial data when the data for the model is budgeted. This is achieved using urban functions - which describe the designated use of a geographical space - to infer the types of streets in a city. We evaluated the approach by comparing the performance of the model using the data in each urban function (the budget) against the results from the aggregate of all the functions (all data). The results indicate that with our model, individual urban functions are sufficient to infer the type attributes of streets.
... AdaBoost Algorithms.[7] Multiple learners are formed in series. ...
Conference Paper
Greater Bongkot North is a gas field in the Gulf of Thailand that has been in production since 1993. Most of the old wellhead platforms (30%) lack remote well test facilities, so any well test measurement requires a personnel visit. Well testing on these platforms often gets lower priority than other operations in a mature field. This project implemented an artificial intelligence (AI) technique to estimate gas rate from other available engineering and geological parameters. A new machine learning approach was applied to estimate the gas production rate where actual measurements are not available. Actual production well test data was used to train the model. The input parameters were surface facility information, fluid properties, production conditions, and geological setup. A blind test on a subset of historical data showed a level of confidence (R2) value of 0.93, which provided the confidence to proceed with a full field pilot. The pilot was conducted from January to May 2018 and was spread across various geological, operating, and surface condition setups to reduce sampling bias. It demonstrated the following use cases: (1) Improved prediction accuracy in wells with no recent test, which was the primary objective of the model. (2) Detection of changes in well behavior: the model could detect changes without human intervention, well before the trends became obvious enough for engineers to detect. (3) Improved potential estimation in wells with leaks in wellhead chokes, where the conventional analysis followed in Bongkot is not possible because wellhead shut-in pressure cannot be measured properly. (4) Improved efficiency in production allocation: the conventional method requires significant time (40-80 person-hours per month) to make the data available for production allocation, and this method can shorten that time significantly. In essence, this project demonstrated the potential of artificial intelligence to improve efficiency in a mature gas field operating under marginal conditions.
... Ada Boost classifier steps[49]. ...
Thesis
Coronary heart disease (CHD) has attracted attention around the world because it can be fatal. Data mining is now used in many fields, including commerce and medicine. Medicine continuously produces large volumes of data and needs ways to extract information from them, so data mining may be important in predicting the spread of this disease. We designed a system that supports the diagnosis of CHD and reduces the cost and time the process requires, using a programming language together with data mining classification techniques. These algorithms produced good results and high accuracy. We applied our study to various CHD datasets. The best accuracy, 99%, was obtained with the Random Forest (RF) algorithm on the two-class Hungarian dataset. On the Cleveland dataset, the same algorithm reached 94% accuracy, whereas the best accuracy on that dataset in the previous study was 58%, using the SVM algorithm. On the five-class Hungarian dataset, the Random Forest classifier also reached 99%, compared with close to 67% achieved on this dataset in previous work using the SVM algorithm. In addition, the AdaBoost classifier reached 88% on the Hungarian dataset, and the Logistic Regression classifier reached 87% on the heart.csv dataset. On the Switzerland dataset, Random Forest gave the best accuracy at 95%, and the same classifier gave the best accuracy on the Long-Beach dataset at 91%. Finally, on the Switzerland dataset, the AdaBoost and Logistic Regression classifiers reached 78%. On Long-Beach, AdaBoost reached 80% and Logistic Regression 76%. On the heart.csv dataset, Logistic Regression gave the best accuracy at 87%, and AdaBoost reached 86%. We used a train/test split and preprocessing for the CHD datasets in this study, and handled the missing attribute values with a less complicated procedure. This process, along with the results and accuracy it yields, differs significantly from that of the previous study on the same CHD dataset.