Trisevgeni Papakonstantinou’s research while affiliated with University College London and other places


Publications (9)


The Use of Large Language Models for Qualitative Research: DECOTA
  • Preprint

July 2024 · 78 Reads

Ryan Hughes · [...] · Mark Wilson

Machine-assisted approaches for free-text analysis are rising in popularity, owing to a growing need to rapidly analyse large volumes of qualitative data. In both research and policy settings, these approaches have promise in providing timely insights into public perceptions and enabling policymakers to understand their community’s needs. However, current approaches still require expert human interpretation – posing a financial and practical barrier for those outside of academia. For the first time, we propose and validate the Deep Computational Text Analyser (DECOTA) - a novel Machine Learning methodology that automatically analyses large free-text datasets and outputs concise themes. Building on Structural Topic Modelling (STM) approaches, we used two fine-tuned Large Language Models (LLMs) and sentence transformers to automatically derive ‘codes’ and their corresponding ‘themes’, as in Inductive Thematic Analysis. To automate the process, we designed and validated a novel algorithm to choose the optimal number of ‘topics’ following STM. This approach automatically derives key codes and themes from free-text data, the prevalence of each code, and how prevalence varies with covariates such as age and gender. Each code is accompanied by three representative quotes. Four datasets previously analysed using Thematic Analysis were triangulated with DECOTA’s codes and themes. We found that DECOTA is approximately 378 times faster and 1920 times cheaper than human coding, and consistently yields codes in agreement with or complementary to human coding (averaging 91.6% for codes, and 90% for themes). The implications for evidence-based policy development, public engagement with policymaking, and the development of psychometric measures are discussed.
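The abstract above reports the prevalence of each code and how that prevalence varies with covariates such as age and gender. As a minimal sketch of that kind of summary (not DECOTA's implementation, which the abstract does not describe), the following tallies code shares overall and within each covariate level. The rows, code labels, and function names are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical coded responses: (assigned code, respondent gender).
# These rows are invented for illustration and are not DECOTA output.
responses = [
    ("cost of living", "female"),
    ("cost of living", "male"),
    ("public transport", "female"),
    ("cost of living", "female"),
    ("public transport", "male"),
    ("green spaces", "male"),
]


def code_prevalence(rows):
    """Return the share of responses assigned to each code."""
    counts = Counter(code for code, _ in rows)
    total = len(rows)
    return {code: n / total for code, n in counts.items()}


def prevalence_by_group(rows):
    """Return, for each covariate level, the share of responses per code."""
    grouped = defaultdict(list)
    for code, group in rows:
        grouped[group].append((code, group))
    return {group: code_prevalence(members) for group, members in grouped.items()}


overall = code_prevalence(responses)
by_gender = prevalence_by_group(responses)
print(overall)
print(by_gender)
```

A fitted topic model would estimate these relationships jointly rather than by simple counting, so the tally is only a descriptive check.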


Figure 1. Image of workstation in the experimental laboratory kitchen.
Figure 2. Poster placed by the sink for the Timer group. The top-right shows the tap-mounted 'SaniTimer' device, which begins a 30 s countdown whenever the tap is turned on. The poster encourages participants to use the timer to make sure they wash their hands for 20 s and use proper handwashing technique.
Summary of handwashing primary outcome variables
GLM Poisson model with handwash frequency as the outcome variable
GLM Poisson model with back-hands count as the outcome variable
The effect of timers and precommitments on handwashing: a randomised controlled trial in a kitchen laboratory
  • Article
  • Full-text available

January 2024 · 58 Reads

Behavioural Public Policy

Many foodborne illness outbreaks originate in food service establishments. We tested two behavioural interventions designed to improve the duration and quality of handwashing. We ran a three-armed parallel trial in a laboratory kitchen, from 7 March to 27 May 2022. Participants were n = 195 workers who handle food. We randomly allocated participants to three groups: Timer – tap-mounted timer that counted seconds while participants washed their hands; Precommitment – agreed to five statements on good hand hygiene before attending the kitchen; and Control. Participants completed a food preparation task under time pressure. Cameras focused on the sink captured handwashing. Outcome measures were number of times participants washed their hands; number of times they washed their hands using soap; number of times they washed using soap and washed the backs of their hands; and mean duration of handwashing attempts using soap. Participants in Timer washed their hands for 1.9 s longer on average than Control (β = 2.20, 95% CI = 0.34-4.06, p = 0.021). Participants in Precommitment washed their hands for 2.5 s longer on average than Control (β = 2.30, 95% CI = 0.33-4.27, p = 0.022). We found no statistically significant differences on any other outcome measure.
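The reported coefficients, confidence intervals, and p-values can be cross-checked with a short calculation. This sketch assumes a symmetric Wald interval and a normal sampling distribution, which the abstract does not state, so the results are approximate. The function name `p_from_ci` is hypothetical.

```python
import math


def p_from_ci(beta, lower, upper, z_crit=1.959964):
    """Approximate SE, z and two-sided p from a coefficient and its 95% CI.

    Assumes a symmetric Wald interval on a normal sampling distribution.
    """
    se = (upper - lower) / (2 * z_crit)   # half-width divided by the critical value
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return se, z, p


# Values taken from the abstract (duration of handwashing with soap).
se_timer, z_timer, p_timer = p_from_ci(2.20, 0.34, 4.06)
se_pre, z_pre, p_pre = p_from_ci(2.30, 0.33, 4.27)

print(f"Timer:         SE={se_timer:.3f}, z={z_timer:.2f}, p={p_timer:.3f}")
print(f"Precommitment: SE={se_pre:.3f}, z={z_pre:.2f}, p={p_pre:.3f}")
```

Under these assumptions both p-values come out near 0.02, in line with the reported 0.021 and 0.022. Small differences would be expected if the original analysis used a t distribution.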


What was helpful about the information on the Germ Defence website? Summary of the topics (generated by the model, described by human) and the major themes (generated by human).
What did you not find helpful about the information on the Germ Defence website? Summary of the topics (generated by the model, described by human) and the major themes (generated by human).
Demographic characteristics of the sample (N = 1,472).
Results of the triangulation between the human-only analysis and the MATA. Columns: human-only themes; human-only codes; triangulation with MATA codes (agreement, complementary).
Applying machine-learning to rapidly analyze large qualitative text datasets to inform the COVID-19 pandemic response: comparing human and machine-assisted topic analysis techniques

October 2023 · 62 Reads · 15 Citations

Introduction: Machine-assisted topic analysis (MATA) uses artificial intelligence methods to help qualitative researchers analyze large datasets. This helps researchers rapidly update healthcare interventions when healthcare contexts change, such as during a pandemic. We examined the potential to support healthcare interventions by comparing MATA with “human-only” thematic analysis techniques on the same dataset (1,472 user responses from a COVID-19 behavioral intervention). Methods: In MATA, an unsupervised topic-modeling approach identified latent topics in the text, from which researchers identified broad themes. In human-only codebook analysis, researchers developed an initial codebook based on previous research that was applied to the dataset by the team, who met regularly to discuss and refine the codes. Formal triangulation using a “convergence coding matrix” compared findings between methods, categorizing them as “agreement”, “complementary”, “dissonant”, or “silent”. Results: Human analysis took much longer than MATA (147.5 vs. 40 h). Both methods identified key themes about what users found helpful and unhelpful. Formal triangulation showed that both sets of findings were highly similar. All MATA codes were classified as in agreement or complementary to the human themes. When findings differed slightly, this was due to human researcher interpretations or nuance from human-only analysis. Discussion: Results produced by MATA were similar to human-only thematic analysis, with substantial time savings. For simple analyses that do not require an in-depth or subtle understanding of the data, MATA is a useful tool that can support qualitative researchers to interpret and analyze large datasets quickly. This approach can support intervention development and implementation, such as enabling rapid optimization during public health emergencies.
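The triangulation step sorts each finding into one of four categories and then summarises how the findings are distributed. A minimal sketch of that tally follows. The matrix rows and code labels are hypothetical, since the study's actual coding matrix is not reproduced on this page. Only the category names and the analysis times (147.5 h and 40 h) come from the abstract.

```python
from collections import Counter

# Category names as listed in the abstract.
CATEGORIES = ("agreement", "complementary", "dissonant", "silent")

# Hypothetical matrix rows: (MATA code label, category assigned by reviewers).
matrix = [
    ("clear instructions", "agreement"),
    ("reassurance", "agreement"),
    ("already knew the advice", "complementary"),
    ("not relevant to my household", "agreement"),
]


def summarise(rows):
    """Return the share of rows in each convergence category."""
    counts = Counter(category for _, category in rows)
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unexpected categories: {sorted(unknown)}")
    total = len(rows)
    return {category: counts.get(category, 0) / total for category in CATEGORIES}


shares = summarise(matrix)
time_ratio = 147.5 / 40  # human-only hours divided by MATA hours

print(shares)
print(f"Human-only analysis took about {time_ratio:.1f} times as long as MATA")
```

In this invented example every code falls under "agreement" or "complementary", which mirrors the pattern the abstract reports.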


Fig. 1. Example of the Structural Topic Model output.
Sample description.
User feedback on the NHS test & Trace Service during COVID-19: The use of machine learning to analyse free-text data from 37,914 UK adults

June 2023 · 22 Reads

Public Health in Practice

Objectives: The UK government's approach to the pandemic relies on a test, trace and isolate strategy, mainly implemented via the digital NHS Test & Trace Service. Feedback on user experience is central to the successful development of public-facing services. As the situation dynamically changes and data accumulate, interpretation of feedback by humans becomes time-consuming and unreliable. The specific objectives were to 1) evaluate a human-in-the-loop machine learning technique based on structural topic modelling in terms of its serviceability in the analysis of vast volumes of free-text data, and 2) generate actionable themes that can be used to increase user satisfaction with the Service. Methods: We evaluated an unsupervised Topic Modelling approach, testing models with 5–40 topics and differing covariates. Two human coders conducted thematic analysis to interpret the topics. We identified a Structural Topic Model with 25 topics and metadata as covariates as the most appropriate for acquiring insights. Results: Analysis of feedback by 37,914 users from May 2020 to March 2021 highlighted issues with the Service falling within three major themes: multiple contacts and incompatible contact method; confusion around isolation dates and tracing delays; and a complex and rigid system. Conclusions: Structural Topic Modelling coupled with thematic analysis was found to be an effective technique to rapidly acquire user insights. Topic modelling can be a quick and cost-effective method to provide high quality, actionable insights from free-text feedback to optimize public health services.


User feedback on the NHS Test & Trace Service during COVID-19: the use of machine learning to analyse free-text data from 37,914 UK adults

November 2022 · 5 Reads

Objectives: The UK government’s approach to the pandemic relies on a test, trace and isolate strategy, mainly implemented via the digital NHS Test & Trace Service. Feedback on user experience is central to the successful development of public-facing services. As the situation dynamically changes and data accumulate, interpretation of feedback by humans becomes time-consuming and unreliable. The specific objectives were to 1) evaluate a human-in-the-loop machine learning technique based on structural topic modelling in terms of its serviceability in the analysis of vast volumes of free-text data, and 2) generate actionable themes that can be used to increase user satisfaction with the Service. Methods: We evaluated an unsupervised Topic Modelling approach, testing models with 5-40 topics and differing covariates. Two human coders conducted thematic analysis to interpret the topics. We identified a Structural Topic Model with 25 topics and metadata as covariates as the most appropriate for acquiring insights. Results: Analysis of feedback by 37,914 users from May 2020 to March 2021 highlighted issues with the Service falling within three major themes: multiple contacts and incompatible contact method; confusion around isolation dates and tracing delays; and a complex and rigid system. Conclusions: Structural Topic Modelling coupled with thematic analysis was found to be an effective technique to rapidly acquire user insights. Topic modelling can be a quick and cost-effective method to provide high quality, actionable insights from free-text feedback to optimize public health services.


Applying machine-learning to rapidly analyse large qualitative text datasets to inform the COVID-19 pandemic response

June 2022 · 54 Reads · 2 Citations

Background: machine-assisted topic analysis (MATA) uses artificial intelligence methods to assist qualitative researchers to analyse large amounts of textual data. This could allow qualitative researchers to inform and update public health interventions ‘in real-time’, to ensure they remain acceptable and effective during rapidly changing contexts (such as a pandemic). In this novel study we aimed to understand the potential for such approaches to support intervention implementation, by directly comparing MATA and ‘human-only’ thematic analysis techniques when applied to the same dataset (1472 free-text responses from users of the COVID-19 infection control intervention ‘Germ Defence’). Methods: in MATA, the analysis process included an unsupervised topic modelling approach to identify latent topics in the text. The human research team then described the topics and identified broad themes. In human-only codebook analysis, an initial codebook was developed by an experienced qualitative researcher and applied to the dataset by a well-trained research team, who met regularly to critique and refine the codes. To understand similarities and differences, formal triangulation using a ‘convergence coding matrix’ compared the findings from both methods, categorising them as ‘agreement’, ‘complementary’, ‘dissonant’, or ‘silent’. Results: human analysis took much longer (147.5 hours) than MATA (40 hours). Both human-only and MATA identified key themes about what users found helpful and unhelpful (e.g. Boosting confidence in how to perform the behaviours vs Lack of personally relevant content). Formal triangulation of the codes created showed high similarity between the findings. All codes developed from the MATA were classified as in agreement or complementary to the human themes. Where the findings were classified as complementary, this was typically due to slightly differing interpretations or nuance present in the human-only analysis.
Conclusions: overall, the quality of MATA was as high as the human-only thematic analysis, with substantial time savings. For simple analyses that do not require an in-depth or subtle understanding of the data, MATA is a useful tool that can support qualitative researchers to interpret and analyse large datasets quickly. These findings have practical implications for intervention development and implementation, such as enabling rapid optimisation during public health emergencies.
Contributions to the literature: Natural language processing (NLP) techniques have been applied within health research due to the need to rapidly analyse large samples of qualitative data. However, the extent to which these techniques lead to results comparable to human coding requires further assessment. We demonstrate that combining NLP with human analysis to analyse free-text data can be a trustworthy and efficient method to use on large quantities of qualitative data. This method has the potential to play an important role in contexts where rapid descriptive or exploratory analysis of very large datasets is required, such as during a public health emergency.
Acknowledgements: We would like to thank our voluntary research assistants, Benjamin Gruneberg, Lillian Brady, Georgia Farrance, Lucy Sellors, Kinga Olexa, and Zeena Abdelrazig, for their valuable contribution to the coding of the data for the human-only analysis. We would also like to acknowledge Katherine Morton’s contribution to the administration of the survey, and James Denison-Day for the construction and maintenance of the Germ Defence website.


Applying machine-learning to rapidly analyse large qualitative text datasets to inform the COVID-19 pandemic response: Comparing human and machine-assisted topic analysis techniques

May 2022 · 63 Reads · 3 Citations

Background: Machine-assisted topic analysis (MATA) uses artificial intelligence methods to assist qualitative researchers to analyse large amounts of textual data. This could allow qualitative researchers to inform and update public health interventions ‘in real-time’, to ensure they remain acceptable and effective during rapidly changing contexts (such as a pandemic). Objective: We aimed to understand the potential for such approaches to support intervention implementation, by directly comparing MATA and ‘human-only’ thematic analysis techniques when applied to the same dataset (1472 free-text responses from users of the COVID-19 infection control intervention ‘Germ Defence’). Methods: In MATA, the analysis process included an unsupervised topic modelling approach to identify latent topics in the text. The human research team then described the topics and identified broad themes. In human-only codebook analysis, an initial codebook was developed by an experienced qualitative researcher and applied to the dataset by a well-trained research team, who met regularly to critique and refine the codes. To understand similarities and differences, formal triangulation using a ‘convergence coding matrix’ compared the findings from both methods, categorising them as ‘agreement’, ‘complementary’, ‘dissonant’, or ‘silent’. Results: Human analysis took much longer (147.5 hours) than MATA (40 hours). Both human-only and MATA identified key themes about what users found helpful and unhelpful (e.g. Helpful: Boosting confidence in how to perform the behaviours. Unhelpful: Lack of personally relevant content). Formal triangulation of the codes created showed high similarity between the findings. All codes developed from the MATA were classified as in agreement or complementary to the human themes. Where the findings were classified as complementary, this was typically due to slightly differing interpretations or nuance present in the human-only analysis.
Conclusions: Overall, the quality of MATA was as high as the human-only thematic analysis, with substantial time savings. For simple analyses that do not require an in-depth or subtle understanding of the data, MATA is a useful tool that can support qualitative researchers to interpret and analyse large datasets quickly. These findings have practical implications for intervention development and implementation, such as enabling rapid optimisation during public health emergencies.


S35 Creating behavioural personas to drive better design in health technology for asthma self-management

November 2021 · 29 Reads

Thorax

Introduction and Objectives: The health burden from asthma can be reduced through better provision of basic care and better self-management. Most health technology products tend to target a narrow range of behaviours with limited behaviour change techniques (BCTs), take a homogeneous approach towards the diverse population of people with asthma, and have poor uptake. The objective was to identify and characterise distinct behavioural self-management archetypes among UK adults with asthma, with the aim of creating behavioural personas that can be used by product developers to better address the needs of people with asthma. Methods: We conducted a scoping review of grey and academic literature, followed by workshops with subject matter experts to identify key behaviours and influences relevant to asthma self-management. We then conducted a rapid review and behavioural analysis on these key behaviours, which were then synthesised into a behavioural systems map. A survey was constructed to explore a subset of key behaviours and influences in more detail, including asthma management, asthma control, inhaler use, support seeking, monitoring, and technology use. The survey was administered to 2,324 people reflective of the UK adult asthma population. The results were analysed and synthesised using mixed methods. Data were segmented using Multiple Correspondence Analysis and k-means cluster analysis, and further statistical analysis was performed to identify factors independently associated with adherence behaviour. The results were synthesised into behavioural personas that characterise people with optimal vs. suboptimal preventer inhaler adherence in behavioural terms, alongside relevant design prompts and suggested BCTs. Results: Segmenting by inhaler use revealed behaviours distributed as shown in figure 1.
Segmenting by adherence to preventer-type inhalers alone revealed pronounced differences between optimal and sub-optimal behaviour clusters in terms of age and behavioural factors (including: skills, decision making, behavioural regulation, environmental opportunities, attitudes, motives, intentions, beliefs, identity, and emotions).
Abstract S35 Figure 1
Conclusions: We have developed unique insight into behaviours of people with asthma and the influences on these behaviours. We believe this work can contribute to a paradigm shift in the design of asthma health technology products, towards targeting new behaviours and their influences for change and ultimately driving better self-management and fewer asthma deaths.
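The segmentation step in this abstract pairs Multiple Correspondence Analysis with k-means clustering. The sketch below illustrates only the k-means half on categorical survey answers, and it swaps in plain one-hot (indicator) encoding for MCA, so it is not the authors' pipeline. The survey questions, answers, starting centroids, and function names are all hypothetical.

```python
def one_hot(rows, levels):
    """Encode categorical answers as 0/1 indicator vectors."""
    return [
        [1.0 if answer == level else 0.0
         for answer, question_levels in zip(row, levels)
         for level in question_levels]
        for row in rows
    ]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, centroids, max_iter=100):
    """Plain Lloyd's algorithm starting from the given centroids."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, so stop
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster empties
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical answers: (daily preventer use?, tracks symptoms?, has action plan?)
levels = [("yes", "no"), ("yes", "no"), ("yes", "no")]
answers = [
    ("yes", "yes", "yes"), ("yes", "yes", "no"), ("yes", "no", "yes"),
    ("no", "no", "no"), ("no", "no", "yes"), ("no", "yes", "no"),
]
points = one_hot(answers, levels)
labels, centroids = k_means(points, centroids=[points[0], points[3]])
print(labels)
```

The starting centroids are fixed here so the example is deterministic. Practical analyses usually try several random initialisations and compare the resulting solutions.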


Using machine learning to analyse large volume of public health data to drive service improvement

October 2021 · 7 Reads

The European Journal of Public Health

Background: The UK government's approach to the pandemic relies on a test, trace and isolate strategy, mainly implemented via the digital Contact Tracing and Advice Service (CTAS). Feedback on user experience is central to the successful development of public-facing services. As the situation dynamically changes and data accumulate, interpretation of feedback by humans becomes time-consuming and unreliable. The aim was to evaluate the use of Machine Learning (ML) techniques as tools to understand the issues with the Service as expressed in the free-text responses of the users. The specific objectives were to 1) conduct an analysis of supervised and unsupervised techniques to develop the optimal model, and 2) generate actionable themes that can be used to increase user satisfaction with the CTAS. Methods: We evaluated and compared 5 supervised classification algorithms in terms of serviceability and accuracy. We proceeded by evaluating an unsupervised Topic Modelling approach, testing models with 5-40 topics and differing covariates in terms of coherence, residuals and interpretability by human coders. Two human coders conducted thematic analysis to interpret the topics. Results: Owing to the low accuracy, the degree of human involvement, and the broadness of themes, we found that a supervised ML approach was not well suited to our objective. We identified a Structural Topic Model with 25 topics and metadata as covariates as the most appropriate for acquiring insights. Preliminary results from analysis of the feedback by 16,262 users from May 2020 to March 2021 highlighted issues with the Service falling within three major themes: lack of data coordination, ineffective communication, and technical issues. Conclusions: Structural Topic Modelling was found to be the most effective technique to rapidly acquire user insights. The 25 topics provided highly specific insights into issues that can be utilized towards improving the CTAS.
Key messages: An ML approach can be a quick and cost-effective method to provide high quality, actionable insights from free-text feedback in order to optimize public health services. Topic models can rapidly provide highly specific user insights with minimal human involvement and low maintenance requirements, making them ideal evaluation tools for pandemic response services.

Citations (3)


... Only a few publications discussed the strengths of software that were used to support analysis, and when they did they often reported that the software had enabled researchers to manage large volumes of data to provide an overview of the nature of the data to complement researcher-based interpretations (Abbott et al., 2017). Comparing machine learning analysis with human analysis showed a high level of agreement between the two, and the authors found that this demonstrated an element of trust in machine learning approaches (Towler et al., 2022). NVivo was identified as being beneficial as it allowed researchers to visualize and assign meaning to the data during the coding stage, it was also identified as enabling a rigorous and systematic approach. ...

Reference:

Making the most of big qualitative datasets: a living systematic review of analysis methods
Applying machine-learning to rapidly analyze large qualitative text datasets to inform the COVID-19 pandemic response: comparing human and machine-assisted topic analysis techniques

... NLP and more advanced forms of AI can also be used to collect and analyze data inductively or deductively, including the collection of unstructured data that typically requires manual analysis of qualitative data that is traditionally time-consuming and slow [43]. The use of AI to conduct qualitative analyses is becoming more common, either as a standalone method or in a "human-assisted" method where researchers iteratively review the AI outputs and provide redirection as needed [44][45][46][47]. Newer, rapid approaches to qualitative analysis in IS have already sped up this step [48], but these newer analysis methods could also be augmented with AI to expedite or supplant person time by an order of magnitude. ...

Applying machine-learning to rapidly analyse large qualitative text datasets to inform the COVID-19 pandemic response

... We propose the use of artificial intelligence techniques, especially those based on natural language processing, as a useful resource for analyzing these large amounts of information in open text. These tools not only allow us to identify response patterns from the co-occurrence of words in large volumes of information, but also to identify latent themes in texts (Towler et al., 2022). ...

Applying machine-learning to rapidly analyse large qualitative text datasets to inform the COVID-19 pandemic response: Comparing human and machine-assisted topic analysis techniques