Lars Ailo Bongo’s research while affiliated with UiT The Arctic University of Norway and other places

What is this page?


This page lists works of an author who doesn't have a ResearchGate profile or hasn't added the works to their profile yet. It is automatically generated from public (personal) data to further our legitimate goal of comprehensive and accurate scientific recordkeeping.

Publications (135)


Deep learning-based classification of breast cancer molecular subtypes from H&E whole-slide images
  • Article

November 2024 · 7 Reads · Journal of Pathology Informatics

Figures: example overview of the hierarchical structure of the ICD-10 system; overview of how the LLMs encoded the historical causes of death; summation of the results from LLMs and alternative methods; results for LLM classification of archaic and current causes of death terms; correct assignment of ICD-10 codes by word category.

Coding Historical Causes of Death Data with Large Language Models
  • Chapter
  • Full-text available

October 2024 · 25 Reads · 2 Citations

This paper investigates the feasibility of using pre-trained generative Large Language Models (LLMs) to automate the assignment of ICD-10 codes to historical causes of death. Because of the complex narratives often found in historical causes of death, this task has traditionally been performed manually by coding experts. We evaluate the ability of the GPT-3.5, GPT-4, and Llama 2 LLMs to accurately assign ICD-10 codes on the HiCaD dataset, which contains causes of death recorded in the civil death register entries of 19,361 individuals from Ipswich, Kilmarnock, and the Isle of Skye in the UK between 1861 and 1901. Our findings show that GPT-3.5, GPT-4, and Llama 2 assign the correct code for 69%, 83%, and 40% of causes, respectively, whereas standard machine learning techniques achieve a maximum accuracy of 89%. All LLMs performed better for causes of death containing terms still in use today than for archaic terms, and better for short causes (1–2 words) than for longer ones. LLMs therefore do not currently perform well enough for historical ICD-10 code assignment tasks. We suggest further fine-tuning or alternative frameworks to achieve adequate performance.
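The evaluation described above boils down to prompting a model per cause-of-death string and comparing the returned code to a gold standard. A minimal sketch of that loop, where the prompt template and the dictionary stand-in for the LLM call are hypothetical (the paper used GPT-3.5, GPT-4, and Llama 2 on the HiCaD dataset):

```python
# Illustrative sketch: scoring LLM-assigned ICD-10 codes against a gold standard.

PROMPT_TEMPLATE = (
    "Assign the single most appropriate ICD-10 code to this "
    "historical cause of death: '{cause}'. Answer with the code only."
)

def assign_icd10(cause: str) -> str:
    """Stand-in for an LLM call; a real system would send PROMPT_TEMPLATE
    to a model API and parse the returned code."""
    lookup = {
        "phthisis": "A15.9",       # archaic term for pulmonary tuberculosis
        "heart failure": "I50.9",
        "dropsy": "R60.9",         # archaic term for oedema
    }
    return lookup.get(cause.lower(), "R99")  # R99: ill-defined cause of death

def accuracy(pairs):
    """Fraction of (cause, gold_code) pairs coded correctly."""
    correct = sum(assign_icd10(cause) == code for cause, code in pairs)
    return correct / len(pairs)

gold = [("Phthisis", "A15.9"), ("Heart failure", "I50.9"), ("Apoplexy", "I64")]
print(f"accuracy = {accuracy(gold):.2f}")
```

Splitting `gold` into archaic versus current terms, or by cause length, reproduces the paper's subgroup comparisons on top of the same loop.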



Prompt Engineering a Schizophrenia Chatbot: Utilizing a Multi-Agent Approach for Enhanced Compliance with Prompt Instructions

October 2024 · 7 Reads

Patients with schizophrenia often present with cognitive impairments that may hinder their ability to learn about their condition. These individuals could benefit greatly from education platforms that leverage the adaptability of Large Language Models (LLMs) such as GPT-4. While LLMs have the potential to make topical mental health information more accessible and engaging, their black-box nature raises concerns about ethics and safety. Prompting offers a way to produce semi-scripted chatbots with responses anchored in instructions and validated information, but prompt-engineered chatbots may drift from their intended identity as the conversation progresses. We propose a Critical Analysis Filter for achieving better control over chatbot behavior. In this system, a team of LLM agents is prompt-engineered to critically analyze and refine the chatbot's responses and deliver real-time feedback to it. To test this approach, we develop an informational schizophrenia chatbot and converse with it (with the filter deactivated) until it oversteps its scope. Once drift has been observed, AI agents are used to automatically generate sample conversations in which the chatbot is enticed to talk about out-of-bounds topics. We manually assign each response a compliance score that quantifies the chatbot's compliance with its instructions, specifically the rules about accurately conveying sources and being transparent about limitations. Activating the Critical Analysis Filter resulted in an acceptable compliance score (≥2) in 67.0% of responses, compared to only 8.7% when the filter was deactivated. These results suggest that a self-reflection layer could enable LLMs to be used effectively and safely in mental health platforms, maintaining adaptability while reliably limiting their scope to appropriate use cases.
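The filter described above is a score-then-refine loop around each draft response. A minimal sketch of that control flow, where the agent names, the keyword-based scoring rule, and the deflection text are all hypothetical (the paper prompts LLM agents rather than using keyword rules):

```python
# Illustrative sketch of a critical-analysis-filter loop around a chatbot draft.

BANNED_TOPICS = ("medication dosage", "diagnosis")  # out of scope for the chatbot

def critic_score(response: str) -> int:
    """Stand-in for an LLM critic agent: returns a 0-3 compliance score,
    where >=2 counts as acceptable."""
    if any(topic in response.lower() for topic in BANNED_TOPICS):
        return 0
    return 3

def refine(response: str) -> str:
    """Stand-in for the refinement agent: replace an out-of-scope answer
    with a transparent deflection."""
    return ("I can share general, validated information about schizophrenia, "
            "but I cannot advise on that topic. Please consult a clinician.")

def critical_analysis_filter(response: str, max_rounds: int = 2) -> str:
    """Score the draft; if non-compliant, refine and re-score, up to a limit."""
    for _ in range(max_rounds):
        if critic_score(response) >= 2:
            return response
        response = refine(response)
    return response

draft = "Your medication dosage should be doubled."
print(critical_analysis_filter(draft))
```

The real system replaces both stand-ins with prompted LLM agents, but the loop structure (analyze, refine, re-check before the user sees anything) is the same.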


Deep learning-based classification of breast cancer molecular subtypes from H&E whole-slide images

August 2024 · 46 Reads

Classifying breast cancer molecular subtypes is crucial for tailoring treatment strategies. While immunohistochemistry (IHC) and gene expression profiling are standard methods for molecular subtyping, IHC can be subjective, and gene profiling is costly and not widely accessible in many regions. Previous approaches have highlighted the potential application of deep learning models on H&E-stained whole slide images (WSI) for molecular subtyping, but these efforts vary in their methods, datasets, and reported performance. In this work, we investigated whether H&E-stained WSIs alone could be leveraged to predict breast cancer molecular subtypes (luminal A, luminal B, HER2-enriched, and basal). We used 1,433 WSIs of breast cancer in a two-step pipeline: first, classifying tumor and non-tumor tiles to use only the tumor regions for molecular subtyping; and second, employing a One-vs-Rest (OvR) strategy to train four binary OvR classifiers and aggregating their results using an eXtreme Gradient Boosting (XGBoost) model. The pipeline was tested on 221 hold-out WSIs, achieving an overall macro F1 score of 0.95 for tumor detection and 0.73 for molecular subtyping. Our findings suggest that, with further validation, supervised deep learning models could serve as supportive tools for molecular subtyping in breast cancer. Our code is made available to facilitate ongoing research and development.
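The second step of the pipeline combines four binary one-vs-rest scores into a single subtype call. A minimal sketch of that aggregation, where the per-subtype scoring functions and feature values are toy stand-ins for trained deep networks, and argmax replaces the XGBoost aggregator the paper actually uses:

```python
# Illustrative sketch: one-vs-rest subtype scoring plus a simplified aggregator.

SUBTYPES = ("luminal_A", "luminal_B", "HER2_enriched", "basal")

def ovr_probabilities(features, classifiers):
    """Run each binary one-vs-rest classifier on the tumor-tile features."""
    return {name: clf(features) for name, clf in classifiers.items()}

def aggregate(probs):
    """Simplified aggregator: pick the subtype with the highest OvR score.
    The paper instead trains an XGBoost model on the four scores."""
    return max(probs, key=probs.get)

# Toy linear scorers standing in for the four trained deep networks.
classifiers = {
    "luminal_A": lambda f: 0.8 * f[0],
    "luminal_B": lambda f: 0.5 * f[1],
    "HER2_enriched": lambda f: 0.6 * f[2],
    "basal": lambda f: 0.4 * f[0] + 0.3 * f[2],
}

tile_features = [0.9, 0.3, 0.5]  # hypothetical features from tumor tiles
probs = ovr_probabilities(tile_features, classifiers)
print(aggregate(probs))
```

A learned aggregator can outperform plain argmax because it can weigh how the four OvR scores co-vary, which is presumably why the pipeline uses XGBoost for this step.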


Using machine learning methods to predict all-cause somatic hospitalizations in adults: A systematic review

Aim: In this review, we investigated how machine learning (ML) has been used to predict all-cause somatic hospital admissions and readmissions in adults. Methods: We searched eight databases (PubMed, Embase, Web of Science, CINAHL, ProQuest, OpenGrey, WorldCat, and MedNar) from their inception dates to October 2023, and included records that predicted all-cause somatic hospital admissions and readmissions of adults using ML methodology. We used the CHARMS checklist for data extraction, PROBAST for bias and applicability assessment, and TRIPOD for reporting quality. Results: We screened 7,543 studies, read 163 full-text records, and included 116 that met the review criteria. Among these, 45 predicted admission, 70 predicted readmission, and one predicted both. There was substantial variety in the types of datasets, algorithms, features, data preprocessing steps, evaluation, and validation methods. The most used feature types were demographics, diagnoses, vital signs, and laboratory tests. Area under the ROC curve (AUC) was the most used evaluation metric. Models trained using boosting tree-based algorithms often performed better than others, and ML algorithms commonly outperformed traditional regression techniques. Sixteen studies used natural language processing (NLP) of clinical notes for prediction, all of which yielded good results. Overall adherence to reporting quality was poor in the reviewed studies, and only five percent of models were implemented in clinical practice. The most frequently inadequately addressed methodological aspects were: providing model interpretations at the individual patient level, full code availability, performing external validation, calibrating models, and handling class imbalance. Conclusion: This review has identified considerable concerns regarding methodological issues and reporting quality in studies investigating ML to predict hospitalizations. To ensure the acceptability of these models in clinical settings, it is crucial to improve the quality of future studies.
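Since AUC is the metric the reviewed studies report most often, it is worth recalling what it measures: the probability that a randomly chosen positive case is scored above a randomly chosen negative one. A minimal rank-based computation (equivalent to the Mann-Whitney U statistic; the toy labels and scores are illustrative):

```python
# Minimal AUC computation: pairwise comparison of positive vs. negative scores.

def auc(labels, scores):
    """Probability that a random positive outranks a random negative;
    ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 0, 0, 1]          # 1 = admitted/readmitted, 0 = not
scores = [0.9, 0.7, 0.4, 0.7, 0.2]  # hypothetical model risk scores
print(f"AUC = {auc(labels, scores):.3f}")
```

Note that AUC is insensitive to class imbalance in a way that accuracy is not, which is one reason the review flags unhandled class imbalance as a recurring methodological gap.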


Fig. 1. The steps of the Modularity Encoding approach. 1) A network is generated in which the nodes are the HCS codes and the edges are the co-occurrences of these codes in the patient population. 2) Modules of strongly connected codes in the network are identified. 3) Each code is assigned the id of the module it belongs to in the network. 4) The HCS codes are binary encoded according to their module number, reducing the number of generated dimensions to the number of detected modules in the network. 5) These new dimensions are used in the ML prediction models.
Fig. 2. Results from the first three experiments. LR, SVM, and GBM were used in all experiments. For all models, accuracy, precision, recall, F1-score, and AUC with 95% confidence intervals were used to evaluate performance. Green markers indicate the highest value of the evaluation metric in the respective comparison. Experiment 1 (binary vs. modularity encoding), to the left of each subfigure, shows generally better results for modularity grouping than for dummy encoding of the raw ICD codes. In experiment 2 (different resolution-threshold encodings), the performance results of the different resolutions are close: the LR and GBM models suggest that R1 is the best resolution threshold, while SVM suggests R08. In experiment 3 (comparison of modularity, highest-hierarchy, and CCS encoding), grouping ICD codes to the highest level of the system hierarchy yielded generally the best results, followed by modularity grouping and CCS. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
The training time of the models on the different ICD datasets.
Using Network Analysis Modularity to Group Health Code Systems and Decrease Dimensionality in Machine Learning Models

June 2024 · 9 Reads · Exploratory Research in Clinical and Social Pharmacy

Background: Machine learning (ML) prediction models in healthcare and pharmacy-related research face challenges with encoding high-dimensional Healthcare Coding Systems (HCSs) such as ICD, ATC, and DRG codes, given the trade-off between reducing model dimensionality and minimizing information loss. Objectives: To investigate using network analysis modularity as a method to group HCSs and improve encoding in ML models. Methods: The MIMIC-III dataset was used to create a multimorbidity network in which ICD-9 codes are the nodes and the edges are the number of patients sharing the same ICD-9 code pairs. A modularity detection algorithm was applied at different resolution thresholds to generate 6 sets of modules. The impact of four grouping strategies on the performance of predicting 90-day Intensive Care Unit readmissions was assessed. The strategies compared were: 1) binary encoding of the raw codes, 2) encoding codes grouped by network modules, 3) grouping codes to the highest level of the ICD-9 hierarchy, and 4) grouping using the single-level Clinical Classification Software (CCS). The same methodology was also applied to encode DRG codes, limiting the comparison to a single modularity threshold versus binary encoding. Performance was assessed using Logistic Regression, Support Vector Machine with a non-linear kernel, and Gradient Boosting Machines; accuracy, precision, recall, AUC, and F1-score with 95% confidence intervals are reported. Results: Models using modularity encoding outperformed models using binary encoding of ungrouped codes. Accuracy improved across all algorithms, from 0.727–0.779 with binary encoding to 0.736–0.78 with modularity encoding; AUC, recall, and precision also improved across almost all algorithms. Compared with the other grouping approaches, modularity encoding generally showed slightly higher AUC, ranging from 0.813 to 0.837, and precision, ranging from 0.752 to 0.782.
Conclusions: Modularity encoding enhances the performance of ML models in pharmacy research by effectively reducing dimensionality while retaining the necessary information. Across the three algorithms, models using modularity encoding showed superior or comparable performance to the other encoding approaches. Modularity encoding offers further advantages: it can be used for both hierarchical and non-hierarchical HCSs, the approach is clinically relevant, and it can enhance the clinical interpretation of ML models. A Python package has been developed to facilitate the use of the approach in future research.
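The core of the encoding idea is: build a co-occurrence graph over codes, partition it into modules, then emit one binary feature per module instead of one per raw code. A minimal sketch, where the ICD-9 codes are examples and connected components stand in for the resolution-tunable modularity detection the paper uses:

```python
# Illustrative sketch of module-based encoding of healthcare codes.

from collections import defaultdict

def find_modules(edges):
    """Group codes into modules. Here, connected components of the
    co-occurrence graph stand in for modularity (community) detection."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, modules = set(), []
    for node in adj:
        if node in seen:
            continue
        stack, module = [node], set()
        while stack:
            n = stack.pop()
            if n in seen:
                continue
            seen.add(n)
            module.add(n)
            stack.extend(adj[n] - seen)
        modules.append(module)
    return modules

def encode(patient_codes, modules):
    """One binary feature per module instead of one per raw code."""
    return [int(bool(m & set(patient_codes))) for m in modules]

# Co-occurring ICD-9 code pairs across a toy patient population.
edges = [("4280", "40391"), ("40391", "5849"), ("25000", "3572")]
modules = find_modules(edges)
print(encode(["4280", "25000"], modules), f"({len(modules)} modules)")
```

Five codes collapse into two module features here; on real data this is what shrinks thousands of sparse code columns down to the number of detected modules.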


Social robots in research on social and cognitive development in infants and toddlers: A scoping review

There is currently no systematic review of the growing body of literature on using social robots in early developmental research. Designing appropriate methods for early childhood research is crucial for broadening our understanding of young children’s social and cognitive development. This scoping review systematically examines the existing literature on using social robots to study social and cognitive development in infants and toddlers aged between 2 and 35 months. Moreover, it aims to identify the research focus, findings, and reported gaps and challenges when using robots in research. We included empirical studies published between 1990 and May 29, 2023. We searched for literature in PsychINFO, ERIC, Web of Science, and PsyArXiv. Twenty-nine studies met the inclusion criteria and were mapped using the scoping review method. Our findings reveal that most studies were quantitative, with experimental designs conducted in a laboratory setting where children were exposed to physically present or virtual robots in a one-to-one situation. We found that robots were used to investigate four main concepts: animacy, action understanding, imitation, and early conversational skills. Many studies focused on whether young children regard robots as agents or social partners. The studies demonstrated that young children could learn from and understand social robots in some situations, but not always. For instance, children’s understanding of social robots was often facilitated by robots that behaved interactively and contingently. This scoping review highlights the need to design social robots that can engage in interactive and contingent social behaviors for early developmental research.


Using Network Analysis Modularity to Group Health Code Systems and Decrease Dimensionality in Machine Learning Models

May 2024 · 6 Reads

Background: Machine learning (ML) prediction models in healthcare and pharmacy-related research face challenges with encoding high-dimensional Healthcare Coding Systems (HCSs) such as ICD, ATC, and DRG codes, given the trade-off between reducing model dimensionality and minimizing information loss. Objectives: To investigate using network analysis modularity as a method to group HCSs and improve encoding in ML models. Methods: The MIMIC-III dataset was used to create a multimorbidity network in which ICD-9 codes are the nodes and the edges are the number of patients sharing the same ICD-9 code pairs. A modularity detection algorithm was applied at different resolution thresholds to generate 6 sets of modules. The impact of four grouping strategies on the performance of predicting 90-day Intensive Care Unit readmissions was assessed. The strategies compared were: 1) binary encoding of the raw codes, 2) encoding codes grouped by network modules, 3) grouping codes to the highest level of the ICD-9 hierarchy, and 4) grouping using the single-level Clinical Classification Software (CCS). The same methodology was also applied to encode DRG codes, limiting the comparison to a single modularity threshold versus binary encoding. Performance was assessed using Logistic Regression, Support Vector Machine with a non-linear kernel, and Gradient Boosting Machines; accuracy, precision, recall, AUC, and F1-score with 95% confidence intervals are reported. Results: Models using modularity encoding, especially at higher resolutions, consistently outperformed binary encoding of ungrouped codes and generally performed better than or comparably to models using the other grouping approaches. Conclusions: Modularity encoding enhances the implementation of ML models in pharmacy research. The approach demonstrated comparable or better performance than other methods, and provides additional advantages, including suitability for both hierarchical and non-hierarchical HCSs, clinical relevance, and enhanced model interpretability through the clinical insights provided by the modules. A Python package has been developed to facilitate using this approach in ML models.


Figure 2 A screenshot of our custom annotation tool used for the manual review
More Efficient Manual Review of Automatically Transcribed Tabular Data

April 2024 · 21 Reads · Historical Life Course Studies

Any machine learning method for transcribing historical text requires manual verification and correction, which is often time-consuming and expensive. Our aim is to make it more efficient. Previously, we developed a machine learning model to transcribe 2.3 million handwritten occupation codes from the Norwegian 1950 census. Here, we manually review the 90,000 codes (3%) for which our model had the lowest confidence scores. We allocated these codes to human reviewers, who used our custom annotation tool to review them. The reviewers agreed with the model's labels 31.9% of the time. They corrected 62.8% of the labels, and 5.1% of the images were uncertain or assigned invalid labels. A total of 9,000 images were reviewed by multiple reviewers, resulting in 86.4% agreement and 9% disagreement. The results suggest that one reviewer per image is sufficient. We recommend that reviewers indicate any uncertainty about the label they assign to an image by adding a flag to their label. Our interviews show that the reviewers performed internal quality control and found our custom tool to be useful and easy to operate. We provide guidelines for efficient and accurate transcription of historical text by combining machine learning and manual review. We have open-sourced our custom annotation tool and made the reviewed images open access.
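The double-review check behind the "one reviewer per image is sufficient" conclusion is a straightforward agreement computation over multiply-reviewed images. A minimal sketch, with hypothetical image ids and labels:

```python
# Illustrative sketch: agreement rate over images reviewed by multiple reviewers.

def agreement_rate(reviews):
    """reviews: {image_id: [label, label, ...]} for multiply-reviewed images.
    Returns the fraction of images on which all reviewers assigned the
    same label."""
    agreed = sum(len(set(labels)) == 1 for labels in reviews.values())
    return agreed / len(reviews)

reviews = {
    "img_001": ["123", "123"],
    "img_002": ["456", "789"],  # disagreement -> candidate for a third review
    "img_003": ["123", "123"],
}
print(f"agreement = {agreement_rate(reviews):.1%}")
```

In practice one would also track the uncertainty flags the paper recommends, since a flagged-but-matching pair is weaker evidence than a confident match.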


Citations (32)


... In this Track we discuss the challenges faced by both computing and historical sciences to outline a roadmap to address some of the most pressing issues of data access, preservation, conservation, harmonisation across national datasets, and governance on one side, and the opportunities and threats brought by AI and machine learning to the advancement of rigorous data analytics. We welcomed contributions that address the following and other related topics: -Advances brought by modern software development, AI, ML and data analytics to the transcription of documents and sources (Pedersen et al. [12], Mourits and Riswick [10] and O'Shea et al. [11]). -Tools and platforms that address the digital divide between physical, analog or digital sources and the level of curation of datasets needed for modern analytics (Le Roux and Gasperini [8], Zafeiridi et al. [16] and Breathnach et al. [3]). ...

Reference:

Digital Humanities and Cultural Heritage in AI and IT-Enabled Environments
Coding Historical Causes of Death Data with Large Language Models

... The performance of deep learning models heavily depends on the availability of such diverse and accurately labeled datasets 20 . The FAIR principles (Findability, Accessibility, Interoperability, and Reusability) were developed to address this gap and foster collaboration within the scientific community 21,22 . However, many existing datasets remain limited in size, diversity, and annotations, hindering the development of robust and generalizable models 16-20, 23, 24 . ...

Publicly available datasets of breast histopathology H&E whole-slide images: A scoping review
  • Citing Article
  • February 2024

Journal of Pathology Informatics

... They developed an LSTM-based RNN model using MFCC-processed heart sound signals as input. This model had excellent discrimination of AS murmurs (area under the curve [AUC] = 0.979), but its performance was mediocre for aortic regurgitation (AUC = 0.634) and MR (AUC = 0.549) [58]. ...

Algorithm for predicting valvular heart disease from heart sounds in an unselected cohort

... 61 AI can also analyze tissue samples to predict responses to ICIs more accurately than traditional methods. 62 AI analysis of gene expression data reveals that nonsynonymous mutational burden is linked to the effectiveness of anti-PD-1 therapy, which is crucial for predicting tumor responses to ICIs. 63 AI also combines multi-omics data to identify biomarkers related to immunotherapy responses, such as tumor mutational burden, neoantigen load, and cytotoxic marker expression in the immune microenvironment. ...

Artificial intelligence algorithm developed to predict immune checkpoint inhibitors efficacy in non–small-cell lung cancer.
  • Citing Conference Paper
  • June 2023

Journal of Clinical Oncology

... An important trend nowadays is to use deep learning algorithms for the study. Part of the deep learning algorithms directly uses deep neural networks to extract features from the waveforms of the PCGs (one-dimensional signals) and classify them, e.g., [21,22], while the other part pre-extracts the time-frequency feature maps (usually twodimensional feature maps) and then uses the deep learning models to further extract the high-dimensional features and categorize them, e.g., [23][24][25]. Furthermore, Ref. [26] used a combination of deep learning models and traditional machine learning algorithms. ...

Phonocardiogram Classification Using 1-Dimensional Inception Time Convolutional Neural Networks
  • Citing Conference Paper
  • December 2022

... In the last few decades, the study of complex networks has gradually become a hot issue in the field of complexity disciplines. Scholars have made significant contributions to the study of areas such as transportation, social [7][8][9], financial [10][11][12] and biological [13,14] networks. As research into complex networks has continued, the spreading of computer viruses in computer networks, contagious ...

Social network analysis of Staphylococcus aureus carriage in a general youth population

International Journal of Infectious Diseases

... There are datasets for TIL assessment without annotation that feature only associated clinical data. One such example is that proposed by Shvetsov and coworkers, called UiT-TILs, that can be used to clinically validate TIL classifications [94]. The UiT-TILs dataset contains 1189 image patches from 87 non-small cell lung cancer (NSCLC) patients with matched clinical data, and it is a subset of another dataset, reported by Rakaee et al. [95]. ...

A Pragmatic Machine Learning Approach to Quantify Tumor-Infiltrating Lymphocytes in Whole Slide Images

... Coding occupational information generally consists of three steps. First, entry errors, abbreviations, and spelling variations are removed to standardize occupational titles [4,10,20,26,32]. Second, these occupational titles are grouped into occupational groups using intermediate coding schemes. ...

Lessons Learned Developing and Using a Machine Learning Model to Automatically Transcribe 2.3 Million Handwritten Occupation Codes

Historical Life Course Studies

... These data will be combined with information from the clinical database, and end point registries (Fig 3). The multiple dimensions that can be combined for each possible study design has been described elsewhere [28]. Briefly, the combination of the following dimensions will create a multitude of possible study designs: time, exposures, measurements, diagnosis, participant selection, sample types, as well as stratification and de-confounding. ...

2. The Beauty of Complex Designs: Exploring Trajectories of Gene Expression