Einar Holsbø’s research while affiliated with UiT The Arctic University of Norway and other places


Publications (26)


Fig. 1. The steps of conducting the Modularity Encoding approach. 1) A network is generated where the nodes are the HCS codes, and the edges are the cooccurrences of these codes in the patients' population. 2) Modules of strongly connected codes in the network were identified. 3) Each code was assigned the module id it belongs to in the network. 4) The HCS is binary encoded according to their module number, reducing the number of generated dimensions to correspond to the number of the detected modules in the network. 5) These new dimensions are used in the ML prediction models.
Fig. 2. Results of the first three experiments. LR, SVM, and GBM were used in all experiments. For all models, accuracy, precision, recall, F1-score, and AUC with 95% confidence intervals were used to evaluate performance. Green markers indicate the highest value of the evaluation metric in the respective comparison. Experiment 1 (binary vs. modularity encoding), to the left of each subfigure, shows generally better results for modularity grouping than for dummy encoding of the raw ICD codes. In experiment 2 (encoding at different resolution thresholds), the performance at different resolutions is close: the LR and GBM models suggest that R1 is the best resolution threshold, while SVM suggests that R08 is the best. In experiment 3 (comparison of modularity, highest-hierarchy, and CCS encoding), grouping ICD codes to the highest level of the system hierarchy generally yielded the best results, followed by modularity grouping and CCS. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
The training time of the models on the different ICD datasets.
“Using network analysis modularity to group health code systems and decrease dimensionality in machine learning models”
  • Article
  • Full-text available

June 2024


19 Reads

Exploratory Research in Clinical and Social Pharmacy


Einar Holsbø


[...]


Background: Machine learning (ML) prediction models in healthcare and pharmacy-related research face challenges with encoding high-dimensional Healthcare Coding Systems (HCSs) such as ICD, ATC, and DRG codes, given the trade-off between reducing model dimensionality and minimizing information loss.
Objectives: To investigate using Network Analysis modularity as a method to group HCSs to improve encoding in ML models.
Methods: The MIMIC-III dataset was utilized to create a multimorbidity network in which ICD-9 codes are the nodes and the edges are the number of patients sharing the same ICD-9 code pairs. A modularity detection algorithm was applied using different resolution thresholds to generate 6 sets of modules. The impact of four grouping strategies on the performance of predicting 90-day Intensive Care Unit readmissions was assessed. The grouping strategies compared were: 1) binary encoding of codes, 2) encoding codes grouped by network modules, 3) grouping codes to the highest level of the ICD-9 hierarchy, and 4) grouping using the single-level Clinical Classification Software (CCS). The same methodology was also applied to encode DRG codes, but the comparison was limited to a single modularity threshold versus binary encoding. Performance was assessed using Logistic Regression, Support Vector Machine with a non-linear kernel, and Gradient Boosting Machines algorithms. Accuracy, Precision, Recall, AUC, and F1-score with 95% confidence intervals were reported.
Results: Models utilizing modularity encoding outperformed models using binary encoding of ungrouped codes. Accuracy improved across all algorithms, ranging from 0.736 to 0.78 for modularity encoding compared with 0.727 to 0.779 for binary encoding. AUC, recall, and precision also improved across almost all algorithms. In comparison with other grouping approaches, modularity encoding generally showed slightly higher performance in AUC, ranging from 0.813 to 0.837, and precision, ranging from 0.752 to 0.782.
Conclusions: Modularity encoding enhances the performance of ML models in pharmacy research by effectively reducing dimensionality while retaining necessary information. Across the three algorithms used, models utilizing modularity encoding showed superior or comparable performance to other encoding approaches. Modularity encoding offers further advantages: it can be used for both hierarchical and non-hierarchical HCSs, it is clinically relevant, and it can enhance the clinical interpretation of ML models. A Python package has been developed to facilitate the use of the approach in future research.
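
The encoding steps described in the abstract (co-occurrence network, module detection, one binary column per module) can be illustrated with a small, self-contained sketch. This is not the authors' Python package and not necessarily their algorithm: the abstract does not name the modularity detection method, so a naive greedy merge is used here as a stand-in. The function names, the toy patient data, and the interpretation of "resolution" as a multiplier on the null-model term of modularity are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(patient_codes):
    """Count, for each unordered code pair, how many patients have both codes."""
    edges = Counter()
    for codes in patient_codes.values():
        for a, b in combinations(sorted(set(codes)), 2):
            edges[(a, b)] += 1
    return edges


def modularity(edges, community_of, resolution=1.0):
    """Weighted modularity Q of a partition given as node -> community id."""
    total = sum(edges.values())  # total edge weight m
    if total == 0:
        return 0.0
    internal = Counter()  # edge weight inside each community
    degree = Counter()    # summed weighted degree of each community
    for (a, b), w in edges.items():
        degree[community_of[a]] += w
        degree[community_of[b]] += w
        if community_of[a] == community_of[b]:
            internal[community_of[a]] += w
    return sum(
        internal[c] / total - resolution * (degree[c] / (2 * total)) ** 2
        for c in degree
    )


def greedy_modules(edges, resolution=1.0):
    """Naive stand-in: repeatedly merge the community pair that raises Q most."""
    nodes = sorted({n for pair in edges for n in pair})
    community_of = {n: i for i, n in enumerate(nodes)}
    best_q = modularity(edges, community_of, resolution)
    while True:
        best_merge = None
        labels = sorted(set(community_of.values()))
        for c1, c2 in combinations(labels, 2):
            trial = {n: (c1 if c == c2 else c) for n, c in community_of.items()}
            q = modularity(edges, trial, resolution)
            if q > best_q + 1e-12:
                best_q, best_merge = q, trial
        if best_merge is None:  # no merge improves Q any further
            return community_of
        community_of = best_merge


def encode(patient_codes, community_of):
    """One binary column per module: 1 if the patient has any code in it."""
    modules = sorted(set(community_of.values()))
    return {
        pid: [int(any(community_of.get(c) == m for c in codes)) for m in modules]
        for pid, codes in patient_codes.items()
    }


# Hypothetical toy data: code names are placeholders, not real ICD-9 codes.
patients = {
    "p1": ["A", "B", "C"],
    "p2": ["A", "B", "C"],
    "p3": ["X", "Y", "Z"],
    "p4": ["X", "Y", "Z"],
    "p5": ["C", "X"],
}
edges = cooccurrence_edges(patients)
module_of = greedy_modules(edges)
encoded = encode(patients, module_of)
print(module_of)
print(encoded)
```

In this toy example six codes collapse into two module columns, which is the dimensionality reduction the abstract refers to. Codes that never co-occur with any other code have no module in this sketch and are simply ignored by `encode`.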


Using Network Analysis Modularity to Group Health Code Systems and Decrease Dimensionality in Machine Learning Models

May 2024


7 Reads

Background: Machine learning (ML) prediction models in healthcare and pharmacy-related research face challenges with encoding high-dimensional Healthcare Coding Systems (HCSs) such as ICD, ATC, and DRG codes, given the trade-off between reducing model dimensionality and minimizing information loss.
Objectives: To investigate using Network Analysis modularity as a method to group HCSs to improve encoding in ML models.
Methods: The MIMIC-III dataset was utilized to create a multimorbidity network in which ICD-9 codes are the nodes and the edges are the number of patients sharing the same ICD-9 code pairs. A modularity detection algorithm was applied using different resolution thresholds to generate 6 sets of modules. The impact of four grouping strategies on the performance of predicting 90-day Intensive Care Unit readmissions was assessed. The grouping strategies compared were: 1) binary encoding of codes, 2) encoding codes grouped by network modules, 3) grouping codes to the highest level of the ICD-9 hierarchy, and 4) grouping using the single-level Clinical Classification Software (CCS). The same methodology was also applied to encode DRG codes, but the comparison was limited to a single modularity threshold versus binary encoding. Performance was assessed using Logistic Regression, Support Vector Machine with a non-linear kernel, and Gradient Boosting Machines algorithms. Accuracy, Precision, Recall, AUC, and F1-score with 95% confidence intervals were reported.
Results: Models utilizing modularity encoding, especially at higher resolutions, consistently outperformed binary encoding of ungrouped codes and generally performed better than or similar to models using other grouping approaches.
Conclusions: Modularity encoding enhances the implementation of ML models in pharmacy research. The approach demonstrated performance comparable to or better than other methods. It provides additional advantages, including suitability for hierarchical and non-hierarchical HCSs, clinical relevance, and improved model interpretability through the clinical insights provided by the modules. A Python package has been developed to facilitate using this approach in ML models.
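
The abstract reports each metric with a 95% confidence interval but does not say how the intervals were computed. One common option is a percentile bootstrap over the test set, sketched below under that assumption. The function names, the toy labels, and the number of resamples are illustrative choices, not details taken from the paper.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def bootstrap_ci(y_true, y_pred, metric, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a metric (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        # Resample test cases with replacement and recompute the metric.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical toy predictions: 100 cases, 80 of them classified correctly.
y_true = [1] * 50 + [0] * 50
y_pred = [1] * 40 + [0] * 10 + [0] * 40 + [1] * 10

point = accuracy(y_true, y_pred)
lower, upper = bootstrap_ci(y_true, y_pred, accuracy)
print(f"accuracy = {point:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

The same `bootstrap_ci` call works for any metric written as a function of `(y_true, y_pred)`, such as precision, recall, or F1-score.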


Figure 2 A screenshot of our custom annotation tool used for the manual review
More Efficient Manual Review of Automatically Transcribed Tabular Data

April 2024


24 Reads

Historical Life Course Studies

Any machine learning method for transcribing historical text requires manual verification and correction, which is often time-consuming and expensive. Our aim is to make it more efficient. Previously, we developed a machine learning model to transcribe 2.3 million handwritten occupation codes from the Norwegian 1950 census. Here, we manually review the 90,000 codes (3%) for which our model had the lowest confidence scores. We allocated these codes to human reviewers, who used our custom annotation tool to review them. The reviewers agreed with the model's labels 31.9% of the time. They corrected 62.8% of the labels, and 5.1% of the images were uncertain or assigned invalid labels. 9,000 images were reviewed by multiple reviewers, resulting in an agreement of 86.4% and a disagreement of 9%. The results suggest that one reviewer per image is sufficient. We recommend that reviewers indicate any uncertainty about the label they assign to an image by adding a flag to their label. Our interviews show that the reviewers performed internal quality control and found our custom tool to be useful and easy to operate. We provide guidelines for efficient and accurate transcription of historical text by combining machine learning and manual review. We have open-sourced our custom annotation tool and made the reviewed images open access.
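
The routing step described above, where the codes with the lowest model confidence go to human reviewers, can be sketched as follows. The abstract only states that the 3% of codes with the lowest confidence scores were reviewed; the record layout, the function name, and the synthetic confidence values are assumptions for illustration and do not come from the project's open-sourced tool.

```python
def select_for_review(predictions, fraction=0.03):
    """Split predictions into (lowest-confidence share, remainder)."""
    k = round(len(predictions) * fraction)
    ranked = sorted(predictions, key=lambda p: p["confidence"])
    return ranked[:k], ranked[k:]


# Hypothetical predictions with made-up, distinct confidence scores.
predictions = [
    {"image": f"img{i}", "confidence": ((i * 37) % 200) / 200}
    for i in range(200)
]

to_review, accepted = select_for_review(predictions, fraction=0.03)
print(len(to_review), "images sent to manual review")
print(len(accepted), "images accepted automatically")
```

Applied to 2.3 million codes, the same 3% cut would correspond roughly to the 90,000 reviewed codes quoted in the abstract, with the exact count depending on rounding and on how ties in confidence are handled.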


Figure 1: A screenshot of our custom annotation tool used for the manual review. The text on the upper left translates to "number of pages: 2 of 836". The buttons on the right translate to "back" and "next".
An overview of the categories we use in the project, and examples for how an occupation code image may be categorized based on the labels from the human reviewers.
More efficient manual review of automatically transcribed tabular data

June 2023


43 Reads

Machine learning methods have proven useful in transcribing historical data. However, results from even highly accurate methods require manual verification and correction. Such manual review can be time-consuming and expensive, so the objective of this paper was to make it more efficient. Previously, we used machine learning to transcribe 2.3 million handwritten occupation codes from the Norwegian 1950 census with high accuracy (97%). We manually reviewed the 90,000 (3%) codes with the lowest model confidence. We allocated those 90,000 codes to human reviewers, who used our annotation tool to review the codes. To assess reviewer agreement, some codes were assigned to multiple reviewers. We then analyzed the review results to understand the relationship between accuracy improvements and effort. Additionally, we interviewed the reviewers to improve the workflow. The reviewers corrected 62.8% of the labels and agreed with the model label in 31.9% of cases. About 0.2% of the images could not be assigned a label, while for 5.1% the reviewers were uncertain or assigned an invalid label. 9,000 images were independently reviewed by multiple reviewers, resulting in an agreement of 86.43% and a disagreement of 8.96%. We learned that our automatic transcription is biased towards the most frequent codes, with a higher degree of misclassification for the lowest-frequency codes. Our interview findings show that the reviewers did internal quality control and found our custom tool well suited to the task. We conclude that only one reviewer per image is needed, but that reviewers should report uncertainty.
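
A rough sketch of how agreement among multiple reviewers of the same image could be tallied is given below. The abstract reports agreement and disagreement rates that do not sum to 100%, which suggests a third category; treating images flagged as uncertain as that category is an assumption here, as are the data layout and the example labels.

```python
from collections import Counter


def agreement_summary(reviews):
    """reviews maps image id -> list of labels; None marks a flagged/uncertain label."""
    counts = Counter()
    for labels in reviews.values():
        if any(label is None for label in labels):
            counts["uncertain"] += 1
        elif len(set(labels)) == 1:
            counts["agree"] += 1
        else:
            counts["disagree"] += 1
    total = len(reviews)
    return {key: counts[key] / total for key in ("agree", "disagree", "uncertain")}


# Hypothetical reviews: the occupation codes below are invented placeholders.
reviews = {
    "img1": ["111", "111"],
    "img2": ["222", "222"],
    "img3": ["111", "333"],
    "img4": ["222", None],
}

summary = agreement_summary(reviews)
print(summary)
```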


What is the state of the art? Accounting for multiplicity in machine learning benchmark performance

March 2023


63 Reads

Machine learning methods are commonly evaluated and compared by their performance on data sets from public repositories. This allows for multiple methods, oftentimes several thousands, to be evaluated under identical conditions and across time. The highest-ranked performance on a problem is referred to as state-of-the-art (SOTA) performance, and is used, among other things, as a reference point for publication of new methods. Using the highest-ranked performance as an estimate for SOTA is a biased estimator, giving overly optimistic results. The mechanisms at play are those of multiplicity, a topic that is well studied in the context of multiple comparisons and multiple testing, but has, as far as the authors are aware, been nearly absent from the discussion regarding SOTA estimates. The optimistic state-of-the-art estimate is used as a standard for evaluating new methods, and methods with substantially inferior results are easily overlooked. In this article, we provide a probability distribution for the case of multiple classifiers so that known analysis methods can be engaged and a better SOTA estimate can be provided. We demonstrate the impact of multiplicity through a simulated example with independent classifiers. We show how classifier dependency impacts the variance, but also that the impact is limited when the accuracy is high. Finally, we discuss a real-world example: a Kaggle competition from 2020.
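
The optimism of the top-ranked score can be illustrated with a small calculation. Assume, purely for illustration, that several independent classifiers all have the same true accuracy and are scored on the same number of test cases, so each observed number of correct predictions is binomial. The expected value of the best observed accuracy then follows from the distribution of the maximum. This is a sketch under those simplifying assumptions, not the article's derivation, and the numbers (1,000 test cases, true accuracy 0.9, 100 classifiers) are invented.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials, computed in log space."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
    )
    return math.exp(log_pmf)


def expected_best_accuracy(n_test, true_acc, n_classifiers):
    """Expected maximum observed accuracy among independent, equally good classifiers."""
    expected, cdf_prev = 0.0, 0.0
    for k in range(n_test + 1):
        cdf = min(1.0, cdf_prev + binom_pmf(k, n_test, true_acc))
        # P(max = k) = F(k)^m - F(k-1)^m for m independent classifiers.
        p_max_is_k = cdf ** n_classifiers - cdf_prev ** n_classifiers
        expected += (k / n_test) * p_max_is_k
        cdf_prev = cdf
    return expected


single = expected_best_accuracy(n_test=1000, true_acc=0.9, n_classifiers=1)
best_of_100 = expected_best_accuracy(n_test=1000, true_acc=0.9, n_classifiers=100)
print(f"expected accuracy of one classifier: {single:.4f}")
print(f"expected best of 100 classifiers:    {best_of_100:.4f}")
```

Even though no classifier is truly better than 0.9 in this setup, the expected top score is noticeably higher, which is the bias the abstract attributes to multiplicity.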


Figure 1: Variables by predictor importance
Conference abstract 308: Using machine learning methods to predict all-cause somatic hospital admissions and readmissions in adults: A systematic review

Introduction. Home Medicines Reviews (HMRs) are an Australian government-funded medication review service conducted in the home by consultant pharmacists (CPs; specially trained pharmacists who have received post-registration certification in medication review). Limited data are available to understand how pharmacists conduct HMR services during the various stages of service provision. Aims. To explore the information gathering and report writing processes of CPs conducting HMR services in Australia. Methods. A national cross-sectional online survey was used to explore and describe the information gathering activities of CPs during the various stages of an HMR (pre-interview, interview, post-interview, report writing). The survey was developed by the research team and included 5-point Likert-type scales and multiple-choice questions. After face validation and piloting by pharmacists with varied academic and professional expertise, the online survey was advertised through professional organisations to Australian registered CPs who had completed at least one HMR service within the past 12 months. Results. A total of 269 consented to participate in the survey, which represented 11% of the approximate total 2400 CPs registered in Australia. Most participants were female (n=133, 76.0%) and received their specialised certification through the Australian Association of Consultant Pharmacy (n=169, 97.1%). Participants reported that medication lists (97.4%) and past medical history (88.1%) of HMR patients are commonly provided in referral letters, while medication lists (100%) and social history (57.8%) are often reported back to referrers in their written reports. The most common evidence-based tools used by participants during report writing included medication adherence scales (22%) and anticholinergic medication burden scales (18.2%). Discussion. This study explored the extent of information collected by CPs during the different stages of HMR service provision and identified that CPs provide evidence-based and patient-centred written reports to referrers.
Introduction. Pharmacy campaigns about medicine use have been run in Indonesia by several organisations at the national level. Finding the remaining gaps in the proper storage and disposal of household medicines will help to improve the existing campaigns. Aims. This research aimed to capture knowledge, attitude and practice related to the storage and disposal of household medicines among people in Jember, Yogyakarta and Padang. Methods. A mini cross-sectional survey with quota sampling was done in three Indonesian cities during June-July 2021. Face-to-face data collection was done by three surveyors per city and data were stored online using mWater. Results. 89 of 90 participants (98.8%) agreed to participate. Most participants (62, 69.7%) failed to correctly interpret expiry date labels, and 38 (42.7%) participants did not understand that the primary packaging of medicines should be damaged prior to disposal. Most participants (81, 91.0%) agreed not to share their medicines with others who have similar complaints. Responses varied regarding the practice of storing syrups in a refrigerator, the disposal of liquid dosage forms through the drainage, and the habit of dating the first opening of the packaging of a liquid dosage form. Discussion. Our study revealed several topics that could strengthen the ongoing campaigns, including the correct interpretation of expiry dates and the need to damage medicine packaging before disposal. Surprisingly, our participants commonly disagreed with sharing medicines within close social circles. However, this attitude may not translate into practice [1]. The disposal of household medicines in Indonesia is problematic because medication return programs, such as those found in other countries [2], remain limited. Such a program was only recently piloted in 15 major cities of Indonesia in 2019. Further work is needed to improve the practice.
1. Bayene K, Aspden T, Sheridan J. Prescription medicine sharing: exploring patients' beliefs and experiences.


Lessons Learned Developing and Using a Machine Learning Model to Automatically Transcribe 2.3 Million Handwritten Occupation Codes

January 2022


39 Reads


8 Citations

Historical Life Course Studies

Machine learning approaches achieve high accuracy for text recognition and are therefore increasingly used for the transcription of handwritten historical sources. However, using machine learning in production requires a streamlined end-to-end pipeline that scales to the dataset size and a model that achieves high accuracy with few manual transcriptions. The correctness of the model results must also be verified. This paper describes our lessons learned developing, tuning and using the Occode end-to-end machine learning pipeline for transcribing 2.3 million handwritten occupation codes from the Norwegian 1950 population census. We achieve an accuracy of 97% for the automatically transcribed codes, and we send 3% of the codes for manual verification. We verify that the occupation code distribution found in our results matches the distribution found in our training data, which should be representative for the census as a whole. We believe our approach and lessons learned may be useful for other transcription projects that plan to use machine learning in production. The source code is available at https://github.com/uit-hdl/rhd-codes.
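
The distribution check mentioned in the abstract, comparing the occupation code distribution of the transcribed results with that of the training data, could be done in several ways, and the abstract does not say which one was used. The sketch below uses total variation distance between relative frequencies as one simple option. The function names and the example codes are hypothetical and are not taken from the rhd-codes repository.

```python
from collections import Counter


def relative_frequencies(labels):
    """Map each label to its share of all labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def total_variation(p, q):
    """Half the summed absolute difference between two distributions (0 = identical)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Hypothetical label lists; the three-digit codes are invented placeholders.
training_labels = ["111"] * 50 + ["222"] * 30 + ["333"] * 20
predicted_labels = ["111"] * 52 + ["222"] * 29 + ["333"] * 19

distance = total_variation(
    relative_frequencies(training_labels),
    relative_frequencies(predicted_labels),
)
print(f"total variation distance: {distance:.3f}")
```

A distance close to zero indicates that the two distributions are similar. What counts as close enough is a judgment call that this sketch does not settle.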


Occode: an end-to-end machine learning pipeline for transcription of historical population censuses

June 2021


132 Reads

Machine learning approaches achieve high accuracy for text recognition and are therefore increasingly used for the transcription of handwritten historical sources. However, using machine learning in production requires a streamlined end-to-end machine learning pipeline that scales to the dataset size, and a model that achieves high accuracy with few manual transcriptions. In addition, the correctness of the model results must be verified. This paper describes our lessons learned developing, tuning, and using the Occode end-to-end machine learning pipeline for transcribing 7.3 million rows with handwritten occupation codes in the Norwegian 1950 population census. We achieve an accuracy of 97% for the automatically transcribed codes, and we send 3% of the codes for manual verification. We verify that the occupation code distribution found in our results matches the distribution found in our training data, which should be representative for the census as a whole. We believe our approach and lessons learned are useful for other transcription projects that plan to use machine learning in production. The source code is available at: https://github.com/uit-hdl/rhd-codes




Citations (7)


... Coding occupational information generally consists of three steps. First, entry errors, abbreviations, and spelling variations are removed to standardize occupational titles [4,10,20,26,32]. Second, these occupational titles are grouped into occupational groups using intermediate coding schemes. ...

Reference:

Common Language for Accessibility, Interoperability, and Reusability in Historical Demography
Lessons Learned Developing and Using a Machine Learning Model to Automatically Transcribe 2.3 Million Handwritten Occupation Codes

Historical Life Course Studies

... Over the past several decades, a variety of omics-based techniques have made major progress in the quest for non-invasive biomarkers for all-stage, and particularly early-stage, BC diagnosis in cancer liquid biopsies, therefore avoiding invasive tumour tissue biopsies or operations. Blood/plasma-based genomics, which typically entail circulating tumour DNA (ctDNA) or cell-free DNA (cfDNA) analyses, are helpful for a variety of purposes, including prediagnosis [67], dormancy [68], sub-clonal variation in advanced BC [69], the prediction of disease-free survival (DFS) [70], and assessments of TNBC progression and personalised management and diagnoses in patients. ...

Metastatic Breast Cancer and Pre-Diagnostic Blood Gene Expression Profiles—The Norwegian Women and Cancer (NOWAC) Post-Genome Cohort

... For each study sample set separately, potential outliers were evaluated based on plots such as principal component analysis (PCA) plots and boxplots of probe signals displaying variation along with the laboratory quality measures 44 . We performed background correction, removed bad quality probes, and filtered probes detected in less than 20% of samples. ...

A standard operating procedure for outlier removal in large-sample epidemiological transcriptomics datasets

... Blood acts as a dynamic solvent of immune activity and may thus reflect critical immuno-oncologic activity, pathways, and molecular programs [7]. Numerous biomarker studies have attempted to detect the presence of cancer by profiling gene expression in either whole blood [8][9][10] or peripheral blood mononuclear cells (PBMCs) [11][12][13]. PBMCs largely consist of lymphocytes and macrophages, and while they play a significant role in the immune system, they remain a subset of all circulating immune cells and do not include other cell types such as eosinophils and neutrophils. ...

Predicting breast cancer metastasis from whole-blood transcriptomic measurements

BMC Research Notes

... Acute severe ulcerative colitis represents a life-threatening medical emergency that carries a 10-15% risk of colectomy, and a 1% risk of mortality. 53 Patients with acute severe ulcerative colitis are at risk of undertreatment as they need rapid effective treatment (figure 1). 54 Timely introduction of salvage therapies is key, and multiple scores have been developed to predict the response to intravenous corticosteroids. ...

Systematic review with meta‐analysis: mortality in acute severe ulcerative colitis

Alimentary Pharmacology & Therapeutics

... 43 For heart sounds, concurrent recording of an ECG to relate physiologic events of the cardiac cycle is easier than the recording of respiratory airflow or chest movements to relate lung sounds to the respiratory cycle and depth of breathing. However, extracting breath phase information from the respiratory sound itself appears promising 44,45 to avoid additional instrumentation (eg, thoracic impedance measurements to identify breath phases that were used in a bimodal repository of lung sounds). 46 Such storage and sharing of recorded lung sounds in clouds of electronic data has greatly accelerated the development of machine hearing. ...

Convolutional Neural Network for Breathing Phase Detection in Lung Sounds

Sensors

... Other works on the classification of lung audio recordings, as well as other initiatives for creating databases, were also analysed. Notable among them are [Sovijärvi et al. 2000, Riella et al. 2009, Grønnesby 2016, Grønnesby et al. 2017, Pramono et al. 2017, Kim et al. 2019]. These works seek to classify lung audio recordings or to understand how research in this field has been developed. ...

Machine Learning Based Crackle Detection in Lung Sounds