Sophia Ananiadou

The University of Manchester · School of Computer Science

Doctor of Philosophy

511
Publications
184,403
Reads
15,576
Citations

Publications (511)
Preprint
Full-text available
In this paper, we present the SimDoc system, a simplification model considering simplicity, readability, and discourse aspects, such as coherence. In the past decade, the progress of the Text Simplification (TS) field has been mostly shown at a sentence level, rather than considering paragraphs or documents, a setting from which most TS audiences w...
Preprint
Full-text available
Understanding the mechanisms behind Large Language Models (LLMs) is crucial for designing improved models and strategies. While recent studies have yielded valuable insights into the mechanisms of textual LLMs, the mechanisms of Multi-modal Large Language Models (MLLMs) remain underexplored. In this paper, we apply mechanistic interpretability meth...
Chapter
The internet has brought both benefits and harms to society. A prime example of the latter is misinformation, including conspiracy theories, which flood the web. Recent advances in natural language processing, particularly the emergence of large language models (LLMs), have improved the prospects of accurate misinformation detection. However, most...
Preprint
Full-text available
The emergence of social media has made the spread of misinformation easier. In the financial domain, the accuracy of information is crucial for various aspects of the financial market, which has made financial misinformation detection (FMD) an urgent problem that needs to be addressed. Large language models (LLMs) have demonstrated outstanding performa...
Preprint
We find arithmetic ability resides within a limited number of attention heads, with each head specializing in distinct operations. To delve into the reason, we introduce the Comparative Neuron Analysis (CNA) method, which identifies an internal logic chain consisting of four distinct stages from input to prediction: feature enhancing with shallow F...
Preprint
Full-text available
Recent advancements in large language model alignment leverage token-level supervisions to perform fine-grained preference optimization. However, existing token-level alignment methods either optimize on all available tokens, which can be noisy and inefficient, or perform selective training with complex and expensive key token selection strategies....
Preprint
Full-text available
Background: Named entity recognition (NER) aims to detect entity mentions from text and classify them into predefined types. It is a fundamental task in information extraction and many other downstream tasks. However, the necessity of extensive human efforts to annotate a large amount of training data imposes restrictions on the state-of-the-art su...
Preprint
Full-text available
Large language models (LLMs) have advanced financial applications, yet they often lack sufficient financial knowledge and struggle with tasks involving multi-modal inputs like tables and time series data. To address these limitations, we introduce Open-FinLLMs, a series of Financial LLMs. We begin with FinLLaMA, pre-trained on a 52 billion...
Article
Full-text available
An individual’s likelihood of developing non-communicable diseases is often influenced by the types, intensities and duration of exposures at work. Job exposure matrices provide exposure estimates associated with different occupations. However, due to their time-consuming expert curation process, job exposure matrices currently cover only a subset...
Preprint
Data serves as the fundamental foundation for advancing deep learning, particularly tabular data presented in a structured format, which is highly conducive to modeling. However, even in the era of LLM, obtaining tabular data from sensitive domains remains a challenge due to privacy or copyright concerns. Hence, exploring how to effectively use mod...
Preprint
Background: The integration of large language models (LLMs) in mental health care is an emerging field. There is a need to systematically review the application outcomes and delineate the advantages and limitations in clinical settings. Objective: This review aims to provide a comprehensive overview of the use of LLMs in mental health care, assessin...
Preprint
Full-text available
Recent advancements in Large Language Models (LLMs) have demonstrated their potential in delivering accurate answers to questions about world knowledge. Despite this, existing benchmarks for evaluating LLMs in healthcare predominantly focus on medical doctors, leaving other critical healthcare professions underrepresented. To fill this research gap...
Preprint
Full-text available
Misinformation is prevalent in various fields such as education, politics, and health, causing significant harm to society. However, current methods for cross-domain misinformation detection rely on time- and resource-consuming fine-tuning and complex model structures. With the outstanding performance of LLMs, many studies have employed them for...
Conference Paper
Full-text available
We present a coherence-aware evaluation of document-level Text Simplification (TS), an approach that has not been considered in TS so far. We improve current TS sentence-based models to support a multi-sentence setting and the implementation of a state-of-the-art neural coherence model for simplification quality assessment. We enhanced English sent...
Article
Given the overwhelming and rapidly increasing volumes of the published biomedical literature, automatic biomedical text summarization has long been a highly important task. Recently, great advances in the performance of biomedical text summarization have been facilitated by pre-trained language models (PLMs) based on fine-tuning. However, existing...
Preprint
Full-text available
The context-aware emotional reasoning ability of AI systems, especially in conversations, is of vital importance in applications such as online opinion mining from social media and empathetic dialogue systems. Due to the implicit nature of conveying emotions in many scenarios, commonsense knowledge is widely utilized to enrich utterance semantics a...
Conference Paper
Full-text available
Early identification of depression is beneficial to public health surveillance and disease treatment. There are many models that mainly treat the detection as a binary classification task, such as detecting whether a user is depressed. However, identifying users' depression severity levels from posts on social media is more clinically useful for fu...
Preprint
Full-text available
Existing NTMs with contrastive learning suffer from the sample bias problem owing to the word frequency-based sampling strategy, which may result in false negative samples with similar semantics to the prototypes. In this paper, we aim to explore the efficient sampling strategy and contrastive learning in NTMs to address the aforementioned issue. W...
Article
Full-text available
Depressive symptoms identification on social media aims to identify posts from social media expressing symptoms of depression. This can be beneficial for developing mental health support systems and for understanding the symptoms of depression. The Patient Health Questionnaire-9 (PHQ-9) is an instrument that healthcare professionals widely use to a...
Preprint
Full-text available
In Emotion Recognition in Conversations (ERC), the emotions of target utterances are closely dependent on their context. Therefore, existing works train the model to generate the response of the target utterance, which aims to recognise emotions leveraging contextual information. However, adjacent response generation ignores long-range dependencies...
Preprint
Full-text available
Pretrained language models have been used in various natural language processing applications. In the mental health domain, domain-specific language models are pretrained and released, which facilitates the early detection of mental health conditions. Social posts, e.g., on Reddit, are usually long documents. However, there are no domain-specific p...
Preprint
Full-text available
Mental illnesses are one of the most prevalent public health problems worldwide, which negatively influence people's lives and society's health. With the increasing popularity of social media, there has been a growing research interest in the early detection of mental illness by analysing user-generated posts on social media. According to the corre...
Preprint
Full-text available
The exponential growth of biomedical texts such as biomedical literature and electronic health records (EHRs) poses a big challenge for clinicians and researchers to access clinical information efficiently. To address the problem, biomedical text summarization has been proposed to support clinical information retrieval and management, aiming at...
Preprint
Full-text available
The goal of temporal relation extraction is to infer the temporal relation between two events in the document. Supervised models are dominant in this task. In this work, we investigate ChatGPT's ability on zero-shot temporal relation extraction. We designed three different prompt techniques to break down the task and evaluate ChatGPT. Our experimen...
Preprint
Full-text available
Automated mental health analysis shows great potential for enhancing the efficiency and accessibility of mental health care, whereas the recent dominant methods utilized pre-trained language models (PLMs) as the backbone and incorporated emotional information. The latest large language models (LLMs), such as ChatGPT, exhibit dramatic capabilities o...
Article
Automatic extraction of patient medication histories from free-text clinical notes can increase the amount of relevant information to clinicians for developing treatment plans. In addition to detecting medication events, clinical text mining systems must also be able to predict event context, such as negation, uncertainty, and time of occurrence, i...
Preprint
Full-text available
The performance of abstractive text summarization has been greatly boosted by pre-trained language models recently. The main concern of existing abstractive summarization methods is the factual inconsistency problem of their generated summary. To alleviate the problem, many efforts have focused on developing effective factuality evaluation metrics...
Preprint
Full-text available
The information bottleneck (IB) principle has been proven effective in various NLP applications. The existing work, however, only used either generative or information compression models to improve the performance of the target task. In this paper, we propose to combine the two types of IB models into one system to enhance Named Entity Recognition...
Article
Full-text available
A key challenge for Emotion Recognition in Conversations (ERC) is to distinguish semantically similar emotions. Some works utilise Supervised Contrastive Learning (SCL) which uses categorical emotion labels as supervision signals and contrasts in high-dimensional semantic space. However, categorical labels fail to provide quantitative information b...
Preprint
Full-text available
A key challenge for Emotion Recognition in Conversations (ERC) is to distinguish semantically similar emotions. Some works utilise Supervised Contrastive Learning (SCL) which uses categorical emotion labels as supervision signals and contrasts in high-dimensional semantic space. However, categorical labels fail to provide quantitative information b...
Preprint
Full-text available
The citation graph is essential for generating high-quality summaries of scientific papers, in which references of a scientific paper and their correlations provide extra knowledge for understanding its background and main contributions. Despite the promising role of the citation graph, effectively incorporating it still remains a big challenge, gi...
Article
In Emotion Recognition in Conversations (ERC), the emotions of target utterances are closely dependent on their context. Therefore, existing works train the model to generate the response of the target utterance, which aims to recognise emotions leveraging contextual information. However, adjacent response generation ignores long-range dependencies...
Article
Full-text available
Mental illnesses are one of the most prevalent public health problems worldwide, which negatively influence people’s lives and society’s health. With the increasing popularity of social media, there has been a growing research interest in the early detection of mental illness by analysing user-generated posts on social media. According to the corre...
Article
Automatic extraction of relations between gene mutations and cancer entities occurring in the cancer literature using text mining can rapidly provide vital information to support precision cancer medicine. However, mutation-cancer relation extraction is more challenging than general relation extraction from free text, since it is often not possible...
Preprint
Different from general documents, it is recognised that the ease with which people can understand a biomedical text is eminently varied, owing to the highly technical nature of biomedical documents and the variance of readers' domain knowledge. However, existing biomedical document summarization systems have paid little attention to readability con...
Article
Full-text available
Background: In recent years, the COVID-19 pandemic has brought great changes to public health, society, and the economy. Social media provide a platform for people to discuss health concerns, living conditions, and policies during the epidemic, allowing policymakers to use this content to analyze the public emotions and attitudes for decision-makin...
Chapter
Minimising accident risk for new construction projects requires a thorough analysis of previous accidents, including an examination of circumstances and reasons for their occurrence, their consequences, and measures used for future mitigation. Such information is often recorded only within the huge amounts of unstructured textual documentation that...
Article
Labelled data for training sequence labelling models can be collected from multiple annotators or workers in crowdsourcing. However, these labels could be noisy because of the varying expertise and reliability of annotators. In order to ensure high quality of data, it is crucial to infer the correct labels by aggregating noisy labels. Although labe...
Preprint
Full-text available
Recently, neural topic models (NTMs) have been incorporated into pre-trained language models (PLMs), to capture the global semantic information for text summarization. However, in these methods, there remain limitations in the way they capture and integrate the global semantic information. In this paper, we propose a novel model, the graph contrast...
Article
Full-text available
The evolution of the Exposome concept revolutionised research in exposure assessment and epidemiology by introducing the need for a more holistic approach to the exploration of the relationship between the environment and disease. At the same time, further and more dramatic changes have also occurred in the working environment, adding to the al...
Article
Biomedical text summarization is a critical task for comprehension of an ever-growing amount of biomedical literature. Pre-trained language models (PLMs) with transformer-based architectures have been shown to greatly improve performance in biomedical text mining tasks. However, existing methods for text summarization generally fine-tune PLMs on th...
Article
Full-text available
Background: In recent years, the COVID-19 pandemic has brought great changes to public health, society and the economy. Social media provides a platform for people to discuss health concerns, living conditions and policies during the epidemic, which allows policy makers to use its contents to analyse the public emotions and attitudes for decision...
Article
Full-text available
Background: Nested and overlapping events are particularly frequent and informative structures in biomedical event extraction. However, state-of-the-art neural models either neglect those structures during learning or use syntactic features and external tools to detect them. To overcome these limitations, this paper presents and compares two neural...
Article
Full-text available
Introduction: Suicide is a global health concern. Sociocultural factors have an impact on self-harm and suicide rates. In Pakistan, both self-harm and suicide are considered criminal offences and are condemned on both religious and social grounds. The proposed intervention 'Youth Culturally Adapted Manual Assisted Problem Solving Training (YCM...
Article
Full-text available
Stress and depression detection on social media aim at the analysis of stress and identification of depression tendency from social media posts, which provide assistance for the early detection of mental health conditions. Existing methods mainly model the mental states of the post speaker implicitly. They also lack the ability to mentalise for com...
Article
Full-text available
Mental illness is highly prevalent nowadays, constituting a major cause of distress in people's life with impact on society's health and well-being. Mental illness is a complex multi-factorial disease associated with individual risk factors and a variety of socioeconomic, clinical associations. In order to capture these complex associations express...
Preprint
Full-text available
Negation and uncertainty modeling are long-standing tasks in natural language processing. Linguistic theory postulates that expressions of negation and uncertainty are semantically independent from each other and the content they modify. However, previous works on representation learning do not explicitly model this independence. We therefore attem...
Article
Gender bias is an important problem that affects models of natural language, and the propagation of such biases could be harmful. Much research focuses on gender biases in word embeddings, and there are also some works on gender biases in subsequent tasks. However, very limited prior work has been done on gender issues in emotion detection tasks. I...
Preprint
Recently, the Transformer model, which has achieved great success in many artificial intelligence fields, has demonstrated great potential in modeling graph-structured data. To date, a great variety of Transformers have been proposed to adapt to graph-structured data. However, a comprehensive literature review and systematic evaluation of the...
Preprint
Click-Through Rate (CTR) prediction is an essential component of online advertising. The mainstream techniques mostly focus on feature interaction or user interest modeling, which rely on users' directly interacted items. The performance of these methods is usually impeded by inactive behaviours and system's exposure, incurring that the features e...
Article
Recent work in Natural Language Processing has increasingly focused on detecting suicidal intent in textual data, where the main aim is to detect expressions in a binary setting. However, previous research has shown that search results and other mentions of suicide online are not only limited to expressions of suicidal intent. Therefore, previously...
Article
Full-text available
Machine reading (MR) is essential for unlocking valuable knowledge contained in millions of existing biomedical documents. Over the last two decades [1,2], the most dramatic advances in MR have followed in the wake of critical corpus development [3]. Large, well-annotated corpora have been associated with punctuated advances in MR methodology and aut...
Article
Full-text available
Large scale pre-trained language models (PLMs) have advanced state-of-the-art (SOTA) performance on various biomedical text mining tasks. The power of such PLMs can be combined with the advantages of deep generative models. These are examples of these combinations. However, they are trained only on general domain text, and biomedical models are sti...
Article
Full-text available
The COVID-19 pandemic resulted in an unprecedented production of scientific literature spanning several fields. To facilitate navigation of the scientific literature related to various aspects of the pandemic, we developed an exploratory search system. The system is based on automatically identified technical terms, document citations, and their vi...
Conference Paper
Within the Exposome Project for Health and Occupational Research (EPHOR), we aim to develop a protocol to enable efficient updating of job exposure matrices so that they can include the latest available information of the highest possible quality. The protocol will include methods for searching and collecting new data from literature (assisted by...
Conference Paper
Full-text available
Research in Text Simplification (TS) has relied mostly on the Wikipedia-based datasets and the SARI evaluation metric, as the preferred means for creating and evaluating new simplification methods. Previous studies have pointed out the flaws of data evaluation resources, including incorrect alignment of simple/complex sentence pairs, sentences with...