Farhad Nooralahzadeh

About

37 Publications
4,505 Reads
366 Citations
Publications (37)
Preprint
Full-text available
Clinical oncology generates vast, unstructured data that often contain inconsistencies, missing information, and ambiguities, making it difficult to extract reliable insights for data-driven decision-making. General-purpose large language models (LLMs) struggle with these challenges due to their lack of domain-specific reasoning, including speciali...
Preprint
Full-text available
Capturing subtle speech disruptions across the psychosis spectrum is challenging because of the inherent variability in speech patterns. This variability reflects individual differences and the fluctuating nature of symptoms in both clinical and non-clinical populations. Accounting for uncertainty in speech data is essential for predicting symptom...
Preprint
Full-text available
The widespread use of chest X-rays (CXRs), coupled with a shortage of radiologists, has driven growing interest in automated CXR analysis and AI-assisted reporting. While existing vision-language models (VLMs) show promise in specific tasks such as report generation or abnormality detection, they often lack support for interactive diagnostic capabi...
Article
Full-text available
Many different methods for prompting large language models have been developed since the emergence of OpenAI's ChatGPT in November 2022. In this work, we evaluate six different few-shot prompting methods. The first set of experiments evaluates three frameworks that focus on the quantity or type of shots in a prompt: a baseline method with a simple...
Preprint
Full-text available
International enterprises, organizations, or hospitals collect large amounts of multi-modal data stored in databases, text documents, images, and videos. While there has been recent progress in the separate fields of multi-modal data exploration as well as in database systems that automatically translate natural language questions to database query...
Preprint
Full-text available
The potential for improvements brought by Large Language Models (LLMs) in Text-to-SQL systems is mostly assessed on monolingual English datasets. However, LLMs' performance for other languages remains vastly unexplored. In this work, we release the StatBot.Swiss dataset, the first bilingual benchmark for evaluating Text-to-SQL systems based on real...
Article
Full-text available
Specialised pre-trained language models are becoming more frequent in Natural Language Processing (NLP) since they can potentially outperform models trained on generic texts. BioBERT (Sanh et al., Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108, 2019) and BioClinicalBERT (Alsentzer et...
Conference Paper
Full-text available
In this manuscript, we describe our submission to the RR-TNM subtask of NTCIR-17 MedNLP-SC shared task. We took an approach to create extensive question-and-answer (Q&A) pairs related to TNM classification as a method of domain-specific augmentation. Compared to the result without data augmentation, improvement in the accuracy especially for the M...
Conference Paper
Full-text available
We describe our submission to the RR-TNM subtask of the NTCIR-17 MedNLP-SC shared task. In the RR-TNM subtask, we developed our system for automatic extraction and classification of the TNM staging from Japanese radiology reports of lung cancers. In our system, zero-shot classification and prompt engineering were performed using ChatGPT and LangCha...
Conference Paper
Recent transformer-based models have made significant strides in generating radiology reports from chest X-ray images. However, a prominent challenge remains: these models often lack prior knowledge, resulting in the generation of synthetic reports that mistakenly reference non-existent prior exams. This discrepancy can be attributed to a knowledge...
Article
Full-text available
While several benefits were realized for multilingual vision-language pretrained models, recent benchmarks across various tasks and languages showed poor cross-lingual generalisation when multilingually pre-trained vision-language models are applied to non-English data, with a large gap between (supervised) English performance and (zero-shot) cross...
Preprint
Full-text available
Current transformer-based models achieved great success in generating radiology reports from chest X-ray images. Nonetheless, one of the major issues is the model's lack of prior knowledge, which frequently leads to false references to non-existent prior exams in synthetic reports. This is mainly due to the knowledge gap between radiologists and th...
Preprint
Full-text available
Background: Federated learning methods offer the possibility of training machine learning models on privacy-sensitive data sets, which cannot be easily shared. Multiple regulations pose strict requirements on the storage and usage of healthcare data, leading to data being in silos (i.e. locked-in at healthcare facilities). The application of federa...
Conference Paper
Full-text available
While several benefits were realized for multilingual vision-language pretrained models, recent benchmarks across various tasks and languages showed poor cross-lingual generalisation when multilingually pre-trained vision-language models are applied to non-English data, with a large gap between (supervised) English performance and (zero-shot) cross...
Article
Full-text available
Objectives: The first objective of this study was to implement and assess the performance and reliability of a vision transformer (ViT)-based deep-learning model, an 'off-the-shelf' artificial intelligence solution, for identifying distinct signs of microangiopathy in nailfold capillaroscopy (NFC) images of patients with SSc. The second objective...
Preprint
Full-text available
While several benefits were realized for multilingual vision-language pretrained models, recent benchmarks across various tasks and languages showed poor cross-lingual generalisation when multilingually pre-trained vision-language models are applied to non-English data, with a large gap between (supervised) English performance and (zero-shot) cross...
Conference Paper
In this paper, we discuss our contribution to the NII Testbeds and Community for Information Access Research (NTCIR)-16 Real-MedNLP shared task. Our team (ZuKyo) participated in the English subtask: Few-resource Named Entity Recognition. The main challenge in this low-resource task was a low number of training documents annotated with a high number...
Conference Paper
In this NTCIR-16 Real-MedNLP shared task paper, we present the methods of the ZuKyo-JA subteam for solving the Japanese part of Subtask1 and Subtask3 (Subtask1-CR-JA, Subtask1-RR-JA, Subtask3-RR-JA). Our solution is based on a sliding-window approach using a Japanese BERT pre-trained masked-language model, which was used as a common architecture f...
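The sliding-window idea mentioned above can be sketched as follows: a document longer than the model's input limit is split into overlapping windows so every token is encoded with some context on both sides. This is a minimal illustration only; the window size, stride, and helper name below are hypothetical, not the paper's actual configuration.

```python
def sliding_windows(tokens, window_size=128, stride=64):
    """Split a long token sequence into overlapping windows.

    Each window covers up to `window_size` tokens; consecutive windows
    overlap by `window_size - stride` tokens. Window size and stride
    here are illustrative defaults, not the values used in the paper.
    Returns a list of (start_offset, window_tokens) pairs so that
    per-token predictions can later be mapped back to the document.
    """
    windows = []
    start = 0
    while start < len(tokens):
        windows.append((start, tokens[start:start + window_size]))
        if start + window_size >= len(tokens):
            break  # last window already reaches the end of the document
        start += stride
    return windows

# Example: a 300-token document split with window 128 and stride 64
doc = [f"tok{i}" for i in range(300)]
spans = sliding_windows(doc)
```

Each window would then be encoded independently by the BERT model, with overlapping predictions reconciled (e.g. by preferring the window where a token sits furthest from the edge).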
Article
Full-text available
Background An accurate assessment of nailfold capillaroscopy (NFC) images has great importance in the diagnosis and prognosis of systemic sclerosis (SSc). To overcome some of the inherent problems with NFC image analysis (operator/observer bias, time requirements), there is an interest to automate and standardize NFC image assessment using computer...
Preprint
Full-text available
Inspired by Curriculum Learning, we propose a consecutive (i.e. image-to-text-to-text) generation framework where we divide the problem of radiology report generation into two steps. Contrary to generating the full radiology report from the image at once, the model generates global concepts from the image in the first step and then reforms them int...
Conference Paper
Full-text available
Inspired by Curriculum Learning, we propose a consecutive (i.e., image-to-text-to-text) generation framework where we divide the problem of radiology report generation into two steps. Contrary to generating the full radiology report from the image at once, the model generates global concepts from the image in the first step and then reforms them in...
Preprint
Full-text available
Real-world applications of natural language processing (NLP) are challenging. NLP models rely heavily on supervised machine learning and require large amounts of annotated data. These resources are often based on language data available in large quantities, such as English newswire. However, in real-world applications of NLP, the textual resources...
Preprint
Full-text available
Learning what to share between tasks has been a topic of high importance recently, as strategic sharing of knowledge has been shown to improve the performance of downstream tasks. The same applies to sharing between languages, and is especially important when considering the fact that most languages in the world suffer from being under-resourced. I...
Conference Paper
Full-text available
Existing named entity recognition (NER) systems rely on large amounts of human-labeled data for supervision. However, obtaining large-scale annotated data is challenging particularly in specific domains like health-care, e-commerce and so on. Given the availability of domain specific knowledge resources, (e.g., ontologies, dictionaries), distant su...
Poster
Full-text available
In short: first we extract the shortest dependency path (sdp) between two entities. We introduce a convolutional neural network (CNN) which takes the shortest dependency path embeddings as input. This approach achieved overall F1 scores of 76.7 and 83.2 for relation classification on clean and noisy data, respectively. For combined relation extracti...
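The core operation described above, convolving filters along the shortest dependency path and pooling into a fixed-size feature vector, can be sketched in plain Python. This is a toy, dependency-free illustration: the filter widths, embedding dimension, and pooling choice are assumptions for demonstration, not the paper's exact architecture.

```python
import random

def conv1d_max(path_embeddings, filters):
    """Apply 1-D convolution filters along a dependency path and
    max-pool over positions.

    `path_embeddings` is a list of per-token embedding vectors for the
    shortest dependency path; each filter is a list of weight vectors
    spanning `k` consecutive path positions. Max-pooling yields one
    feature per filter, so paths of any length map to a fixed-size
    vector (illustrative sketch, not the published model).
    """
    feats = []
    for w in filters:
        k = len(w)  # filter width in path positions
        scores = []
        for i in range(len(path_embeddings) - k + 1):
            s = 0.0
            for j in range(k):
                # dot product of filter row j with the embedding at i+j
                s += sum(a * b for a, b in zip(w[j], path_embeddings[i + j]))
            scores.append(s)
        feats.append(max(scores))  # max-pooling over path positions
    return feats

# Toy shortest dependency path: 5 tokens, embedding dimension 4
random.seed(0)
path = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
filters = [[[0.1] * 4] * 2, [[0.2] * 4] * 3]  # two filters, widths 2 and 3
features = conv1d_max(path, filters)
```

The pooled feature vector would then feed a classifier over relation labels; in practice this is done with a deep-learning framework rather than hand-rolled loops.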
Conference Paper
Full-text available
We investigate the use of different syntactic dependency representations in a neural relation classification task and compare the CoNLL, Stanford Basic and Universal Dependencies schemes. We further compare with a syntax-agnostic approach and perform an error analysis in order to gain a better understanding of the results.
Preprint
Full-text available
We investigate the use of different syntactic dependency representations in a neural relation classification task and compare the CoNLL, Stanford Basic and Universal Dependencies schemes. We further compare with a syntax-agnostic approach and perform an error analysis in order to gain a better understanding of the results.
Article
Full-text available
This article presents the SIRIUS-LTG-UiO system for the SemEval 2018 Task 7 on Semantic Relation Extraction and Classification in Scientific Papers. First we extract the shortest dependency path (sdp) between two entities, then we introduce a convolutional neural network (CNN) which takes the shortest dependency path embeddings as input and perform...
Conference Paper
Full-text available
The extraction and the disambiguation of knowledge guided by textual resources on the web is a crucial process to advance the Web of Linked Data. The goal of our work is to semantically enrich raw data by linking the mentions of named entities in the text to the corresponding known entities in knowledge bases. In our approach multiple aspects are c...
Conference Paper
Full-text available
In the context of Social Media Analytics, Natural Language Processing tools face new challenges on online conversational text, such as microblogs, chat, or text messages, because of the specificity of the language used in these channels. This work addresses the problem of Part-Of-Speech tagging (initially for French but also for English) on noisy...
Conference Paper
Full-text available
The main objective of this paper is to compare the sentiments that prevailed before and after the presidential elections held in both the US and France in the year 2012. To achieve this objective we extracted the content information from a social medium such as Twitter and used the tweets from electoral candidates and the public users (voters), collec...