Kimberly Phua’s research while affiliated with Nanyang Technological University and other places


Publications (1)


Figure 1. Validation Methods Broadly Fall into Two Types: Internal and External
Figure 2. Two Extensions of the EV: Convergent and Divergent. Convergent validation uses multiple feature sets to train multiple models; each model is, in turn, benchmarked on a gold-standard validation dataset. Divergent validation uses a single feature set to train one model, which is then repeatedly challenged with multiple datasets.
Figure 3. A Schematic on How to Interpret Observed Validation Accuracies against a Backdrop of Null Accuracies Using p Values
Figure 4. Divergent Validation Comparing Published Signatures (Yellow), Expected Theoretical Distribution Based on the Binomial Distribution (Red), and Randomized Signatures (Blue) across Seven Datasets
Extensions of the External Validation for Checking Learned Model Interpretability and Generalizability
  • Literature Review
  • Full-text available

November 2020 · Patterns · 3,156 Reads · 171 Citations

Sung Yang Ho · Kimberly Phua · …

We discuss the validation of machine learning models, which is standard practice for determining model efficacy and generalizability. We argue that internal validation approaches, such as cross-validation and the bootstrap, cannot guarantee the quality of a machine learning model, owing to potentially biased training data and the complexity of the validation procedure itself. To better evaluate the generalization ability of a learned model, we suggest using independently derived data sources as validation datasets, i.e., external validation. Given the lack of attention to external validation, and in particular the absence of a well-structured and comprehensive study of it, we discuss the necessity for external validation and propose two extensions of the external validation approach that may help reveal the true domain-relevant model from a candidate set. Moreover, we suggest a procedure for checking whether a set of validation datasets is itself valid and introduce statistical reference points for detecting external data problems.
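The statistical reference points mentioned above (see Figures 3 and 4) compare an observed validation accuracy against a backdrop of null accuracies; under random guessing on a binary task, the number of correct predictions follows a binomial distribution. A minimal sketch of such a check, using only the standard library (the function name `null_accuracy_p_value` and the example numbers are ours, not from the paper):

```python
from math import comb

def null_accuracy_p_value(n_samples: int, n_correct: int, p_null: float = 0.5) -> float:
    """One-sided p-value P(X >= n_correct) for X ~ Binomial(n_samples, p_null):
    the probability that a model guessing at rate p_null matches or beats
    the observed number of correct predictions by chance alone."""
    return sum(
        comb(n_samples, k) * p_null**k * (1 - p_null) ** (n_samples - k)
        for k in range(n_correct, n_samples + 1)
    )

# Hypothetical example: a model gets 70 of 100 external-validation samples
# right, and the null model is a coin flip on a balanced binary task.
p = null_accuracy_p_value(100, 70, p_null=0.5)
```

A small p-value here indicates the observed external-validation accuracy is unlikely under the null, matching the logic of comparing published signatures against randomized ones across datasets.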


Citations (1)


... Training is followed by internal and external validation. External validation is "the use of independently derived datasets (hence, external), to validate the performance of a model that was trained on initial input data" (Ho et al. 2020). But we would extend this concept and suggest for this paper that external validation also includes any trials of ML-based models in real life situations with any forms of feedback indicating their accuracy, efficiency, and usability. ...

Reference:

A review of machine learning for analysing accident reports in the construction industry
Extensions of the External Validation for Checking Learned Model Interpretability and Generalizability

Patterns