M.A. Hall’s research while affiliated with University of Waikato and other places


Publications (5)


Data Mining: Practical Machine Learning Tools and Techniques
  • Book
  • November 2016 · 4,904 Reads · 6,734 Citations
  • E. Frank · M.A. Hall · C.J. Pal
Data Mining: Practical Machine Learning Tools and Techniques, Fourth Edition, offers a thorough grounding in machine learning concepts, along with practical advice on applying these tools and techniques in real-world data mining situations. This highly anticipated fourth edition of the most acclaimed work on data mining and machine learning teaches readers everything they need to know to get started, from preparing inputs, interpreting outputs, and evaluating results to the algorithmic methods at the heart of successful data mining approaches. Extensive updates reflect the technical changes and modernizations that have taken place in the field since the last edition, including substantial new chapters on probabilistic methods and on deep learning. Accompanying the book is a new version of the popular WEKA machine learning software from the University of Waikato. Authors Witten, Frank, Hall, and Pal include today's techniques coupled with the methods at the leading edge of contemporary research.

The book companion website at http://www.cs.waikato.ac.nz/ml/weka/book.html contains:
  • PowerPoint slides for Chapters 1-12, a comprehensive teaching resource covering each chapter of the book
  • An online appendix on the WEKA workbench, a comprehensive learning aid for the open-source software that accompanies the book
  • The table of contents, highlighting the many new sections in the fourth edition, along with reviews of the first edition, errata, and more

The fourth edition:
  • Provides a thorough grounding in machine learning concepts, as well as practical advice on applying the tools and techniques to data mining projects
  • Presents concrete tips and techniques for performance improvement that work by transforming the input or output in machine learning methods
  • Includes the downloadable WEKA software toolkit, a comprehensive collection of machine learning algorithms for data mining tasks, in an easy-to-use interactive interface
  • Includes open-access online courses that introduce practical applications of the material in the book


Data Mining

  • January 2011 · 170 Reads · 247 Citations

Like the popular second edition, Data Mining: Practical Machine Learning Tools and Techniques offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. Inside, you'll learn all you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining, including association rules such as [onions, potatoes] → [beef], found in the sales data of a supermarket, which would indicate that a customer who buys onions and potatoes together is also likely to buy beef. The authors include both the tried-and-true techniques of today and methods at the leading edge of contemporary research. Complementing the book is the fully functional, platform-independent, open-source Weka machine learning software, available for free download.

The book is a major revision of the second edition that appeared in 2005. While the basic core remains the same, it has been updated to reflect the changes that have taken place over the last four or five years. Highlights of the updated edition include completely revised technique sections; new chapters on data transformations, ensemble learning, and massive data sets; a new "book release" version of the popular Weka open-source machine learning software (developed by the authors and specific to the third edition); new material on multi-instance learning; new information on ranking classifications; plus comprehensive updates and modernization throughout. All in all, approximately 100 pages of new material.
  • Thorough grounding in machine learning concepts, as well as practical advice on applying the tools and techniques
  • Algorithmic methods at the heart of successful data mining, including tried-and-true methods as well as leading-edge methods
  • Performance-improvement techniques that work by transforming the input or output
  • Downloadable Weka, a collection of machine learning algorithms for data mining tasks, including tools for data pre-processing, classification, regression, clustering, association rules, and visualization, in an updated, interactive interface
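As a concrete illustration of the association-rule notation used in the abstract, here is a minimal Python sketch that computes support and confidence for the rule [onions, potatoes] → [beef]. The transactions and numbers are hypothetical, invented for illustration only; they are not from the book.

```python
# Hypothetical toy transactions illustrating the rule [onions, potatoes] -> [beef].
transactions = [
    {"onions", "potatoes", "beef"},
    {"onions", "potatoes", "beef"},
    {"onions", "potatoes"},
    {"milk", "bread"},
    {"beef", "bread"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent): support of the union over support of the antecedent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

ante, cons = {"onions", "potatoes"}, {"beef"}
print(support(ante | cons, transactions))   # 0.4  (2 of 5 transactions contain all three items)
print(confidence(ante, cons, transactions)) # 2/3  (2 of the 3 onion-and-potato baskets also have beef)
```

A rule is typically reported only when both its support and its confidence exceed user-chosen thresholds.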




Citations (5)


... From our processed data set, continuous predictor variables were binarized in Weka, using an unsupervised filter to create equal frequency bins for the data (Witten et al. 2005). Since this analysis relies on a binary outcome, rates of temperature change were recoded into "0" and "1," with a "0" indicating a negative rate of decreasing temperature from 0 to 30 min after exercise and a "1" indicating a positive rate of increasing temperature after exercise. ...

Reference:

Body Temperature Regulation in Domestic Dogs After Agility Trials: The Effects of Season, Training, Body Characteristics, Age, and Genetics
Data Mining: Practical Machine Learning Tools and Techniques
  • Citing Book
  • November 2016
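The citation snippet above mentions binarizing continuous predictors in Weka with an unsupervised equal-frequency filter. Here is a minimal Python sketch of that idea using a simple rank-based assignment; the temperature values and bin count are illustrative, not taken from the cited study, and Weka's actual `Discretize` filter works on cut points rather than ranks.

```python
# Sketch of unsupervised equal-frequency binning: sort the values, then give
# each value a bin index so every bin holds (roughly) the same number of points.
def equal_frequency_bins(values, n_bins):
    """Return a bin index (0..n_bins-1) for each value, bins of near-equal size."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    per_bin = len(values) / n_bins
    for rank, idx in enumerate(order):
        bins[idx] = min(int(rank / per_bin), n_bins - 1)
    return bins

# Illustrative post-exercise temperatures; two bins gives the binary recoding
# described in the snippet.
temps = [38.2, 39.1, 37.8, 40.0, 38.9, 39.5]
print(equal_frequency_bins(temps, 2))  # [0, 1, 0, 1, 0, 1]
```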

... In addition, the 145 data points used in this work are randomly divided into two subdatasets using a uniform distribution, of which 70% of the data is used for training the ANN model and the remaining 30% for validating the model. All data is scaled to the range [0-1] to reduce numerical error during ANN processing, as recommended by [39]. This process ensures that the training phase of the AI models can be performed with functional generalization capabilities. Such proportions are represented by ...

Chapter 7 - Data transformations
  • Citing Article
  • January 2011
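The preprocessing described in the snippet above, min-max scaling to [0, 1] plus a uniform random 70/30 train/validation split, can be sketched in a few lines of Python. This is a generic illustration, not the cited study's code; the seed and the use of Python's `random` module are illustrative choices.

```python
import random

def min_max_scale(values):
    """Scale a 1-D list linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def train_test_split(data, train_frac=0.7, seed=42):
    """Uniform random split: shuffle, then cut at the training fraction."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

data = list(range(145))                 # the citing study uses 145 samples
train, test = train_test_split(data)
print(len(train), len(test))            # 101 44
print(min_max_scale([2.0, 4.0, 6.0]))   # [0.0, 0.5, 1.0]
```

Note that a real pipeline would fit the scaling bounds on the training split only, to avoid leaking validation-set information into training.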

... The implementation of artificial intelligence (AI) in the financial and banking sectors has evolved since the 1980s, when rule-based expert systems such as PROLOG and LISP were used for fraud detection and credit assessment (Russell & Norvig, 2003). In the 1990s, more advanced techniques such as decision trees and support vector machines (SVM) emerged, improving the accuracy of default detection and the discernment of patterns in large volumes of financial data (Witten et al., 2011; Altman et al., 1977). With the rise of Big Data in the 2000s, machine learning made it possible to manage high-dimensional data using multi-layer neural networks and models such as random forest, optimizing customer segmentation and anomaly detection in real time (Bishop, 2006). ...

Data Mining
  • Citing Article
  • January 2011

... Thus, by assuming the number of floating-point operations to be (n × d), Table 6 reports the operations count for 1-nearest neighbor search. Second, Witten et al. (2011) estimate that a decision tree's depth is O(log(n)) based on the assumption that the induced tree is 'bushy' and binary, where n is the number of training examples. Accordingly, we assume that a prediction using CART's induced decision tree takes exactly log(n) floating-point operations, and have obtained and reported the prediction cost of CART for datasets of Table 6. ...

Implementation: Real machine learning schemes
  • Citing Article
  • January 2011
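The snippet above quotes two back-of-the-envelope prediction-cost assumptions: roughly n × d floating-point operations for brute-force 1-nearest-neighbor search over n examples with d features, and about log2(n) comparisons for a 'bushy' balanced binary decision tree. A minimal Python sketch of those estimates, with illustrative dataset sizes:

```python
import math

def knn_prediction_cost(n, d):
    """Brute-force 1-NN: compare the query against all n examples, d features each."""
    return n * d

def tree_prediction_cost(n):
    """Balanced binary tree over n examples: one comparison per level, ~log2(n) levels."""
    return math.log2(n)

n, d = 1024, 10                    # illustrative sizes, not from the cited paper
print(knn_prediction_cost(n, d))   # 10240
print(tree_prediction_cost(n))     # 10.0
```

The gap between the two (linear in n versus logarithmic) is what makes tree prediction so much cheaper at large n under this assumption.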

... The predictive accuracy of the learned Random Forest classifiers was measured by the popular Area Under the ROC curve (AUC) measure, using a standard 10-fold cross-validation procedure [32]. The AUC measure takes values in the range [0..1], where 0.5 is the expected score for randomly guessing the class labels and 1 would be the score of a perfect classifier. ...

Chapter 5 - Credibility: Evaluating what's been learned
  • Citing Article
  • January 2011
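The AUC measure described in the snippet above can be read as the probability that a randomly chosen positive example is scored above a randomly chosen negative one (counting ties as half). A minimal Python sketch of that rank-based definition; the labels and scores are illustrative, and a real evaluation would average AUC over the 10 cross-validation folds.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly, ties count 0.5."""
    pairs, wins = 0, 0.0
    for li, si in zip(labels, scores):
        if li != 1:
            continue
        for lj, sj in zip(labels, scores):
            if lj != 0:
                continue
            pairs += 1
            if si > sj:
                wins += 1.0
            elif si == sj:
                wins += 0.5
    return wins / pairs

labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(auc(labels, scores))  # 8/9: one positive (0.6) is out-ranked by one negative (0.7)
```

Consistent with the snippet, a random scorer gets 0.5 in expectation and a perfect ranker gets 1.0; this O(n²) pairwise form is fine for illustration, though production code usually uses an O(n log n) ranking formulation.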