Table 5
Source publication
Question Answering (QA) is a specialized area within the field of Information Retrieval (IR). QA systems are concerned with providing relevant answers in response to questions posed in natural language. QA is therefore composed of three distinct modules, each of which has a core component alongside other supplementary components. These three core c...
Context in source publication
Context 1
... Tables (4) and (5) show a comparative summary of the aforementioned studies with respect to the QA components and the QA approaches, respectively. (Table 4) illustrates the different QA system components covered by each of the aforementioned studies, while (Table 5) shows the approaches utilized by each study within every component. ...
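The three-module design summarized in these tables (question classification, information retrieval, answer extraction) can be made concrete with a short sketch. The heuristics below are illustrative assumptions, not the survey's implementation:

```python
# A minimal sketch of the three-module QA pipeline: question
# classification -> information retrieval -> answer extraction.
# All names and heuristics here are illustrative, not the authors' code.
import re

def classify_question(question: str) -> str:
    """Map a wh-word to a coarse expected-answer type."""
    q = question.lower()
    if q.startswith(("who", "whom")):
        return "PERSON"
    if q.startswith("when"):
        return "DATE"
    if q.startswith("where"):
        return "LOCATION"
    return "OTHER"

def retrieve(question: str, documents: list[str], k: int = 3) -> list[str]:
    """Rank documents by naive keyword overlap with the question."""
    q_terms = set(re.findall(r"\w+", question.lower()))
    scored = sorted(documents,
                    key=lambda d: len(q_terms & set(re.findall(r"\w+", d.lower()))),
                    reverse=True)
    return scored[:k]

def extract_answer(question: str, passages: list[str]) -> str:
    """Return the sentence sharing the most terms with the question."""
    q_terms = set(re.findall(r"\w+", question.lower()))
    sentences = [s for p in passages for s in re.split(r"(?<=[.!?])\s+", p)]
    return max(sentences,
               key=lambda s: len(q_terms & set(re.findall(r"\w+", s.lower()))))

docs = ["Question answering systems return answers, not documents.",
        "Information retrieval ranks documents by relevance."]
q = "What do question answering systems return?"
print(classify_question(q), "->", extract_answer(q, retrieve(q, docs)))
```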
Similar publications
This work presents R2-D2 (Rank twice, reaD twice), a novel four-stage open-domain QA pipeline. The pipeline is composed of a retriever, a passage reranker, an extractive reader, a generative reader, and a mechanism that aggregates the final prediction from all of the system's components. We demonstrate its strength across three open-domain QA datasets: NaturalQues...
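The aggregation step R2-D2 describes, combining predictions from extractive and generative readers, might look like the following sketch; the component names, weights, and log-score convention are assumptions, not the published code:

```python
# Illustrative sketch (not the R2-D2 implementation) of aggregating
# answer scores from multiple components, e.g. an extractive and a
# generative reader. Each component returns (answer, log_score) pairs;
# the per-component weights are assumed to be tuned on validation data.
from collections import defaultdict

def aggregate(predictions: dict[str, list[tuple[str, float]]],
              weights: dict[str, float]) -> str:
    """Pick the answer string with the highest weighted total score."""
    totals: dict[str, float] = defaultdict(float)
    for component, pairs in predictions.items():
        w = weights.get(component, 1.0)
        for answer, log_score in pairs:
            totals[answer.strip().lower()] += w * log_score
    return max(totals, key=totals.get)

preds = {
    "extractive_reader": [("Paris", -0.2), ("Lyon", -1.5)],
    "generative_reader": [("Paris", -0.4)],
}
print(aggregate(preds, {"extractive_reader": 1.0, "generative_reader": 0.8}))
```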
Citations
... Traditional MCQ generation requires meticulous effort to balance question difficulty, ensure the relevance and quality of distractors, and provide feedback to learners; this process is notably enhanced by AI technologies [8]-[10]. ...
... By matching the results of our participants with their responses from the pre-experiment survey as seen in Fig. 2, it becomes evident that three users, identified by IDs 7, 8, and 10, self-reported their familiarity with AI tools as novice (10) and beginner (7,8). Moreover, two of these individuals (7,10) stated they do not incorporate AI into their course development processes. ...
In the evolving landscape of educational technology, integrating Artificial Intelligence (AI) into educational assessment creation has emerged as a key area of innovation. This paper compares three methods for generating Multiple Choice Questions (MCQs): (1) without the use of generative AI, (2) with unguided usage of ChatGPT, and (3) with a specialized AI-powered micro-app designed specifically for MCQ generation. The micro-app, named the “MCQ Generator”, allows users to set preferences such as difficulty level, number of distractors, and inclusion of hints/feedback, tailoring the final prompt based on these selections. Our study involves instructional designers creating MCQs for hypothetical courses, with educators then evaluating the quality of these questions using a rubric-based approach. The results reveal that AI-assisted methods significantly enhance the efficiency and quality of MCQ generation compared to non-AI methods. Notably, the micro-app demonstrates potential advantages over ChatGPT, offering a more user-friendly interface and a lower barrier to entry for educators. These findings suggest that while ChatGPT can enhance the MCQ creation process, AI micro-apps may provide more tailored functionalities that further streamline educators’ workflows. This paper presents empirical evidence on the utility of AI in educational content development. By doing so, it contributes to the broader discourse on the transformative potential of AI in educational assessment, with a particular focus on online education. Additionally, it explores methods to increase AI adoption within the field.
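As a rough illustration of how such a micro-app could tailor its final prompt from user preferences, here is a minimal sketch; the field names and prompt wording are hypothetical, not the actual "MCQ Generator":

```python
# Hypothetical sketch of preference-driven prompt tailoring, in the
# spirit of the "MCQ Generator" micro-app; all fields are assumptions.
from dataclasses import dataclass

@dataclass
class MCQPreferences:
    topic: str
    difficulty: str = "medium"      # e.g. easy / medium / hard
    num_distractors: int = 3
    include_feedback: bool = True

def build_prompt(p: MCQPreferences) -> str:
    """Assemble the final generation prompt from the user's selections."""
    parts = [
        f"Write one {p.difficulty}-difficulty multiple-choice question "
        f"about {p.topic}.",
        f"Provide exactly {p.num_distractors} plausible distractors and "
        "mark the correct answer.",
    ]
    if p.include_feedback:
        parts.append("Add one sentence of feedback explaining the answer.")
    return " ".join(parts)

print(build_prompt(MCQPreferences(topic="photosynthesis", difficulty="hard")))
```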
... It aims to achieve cross-modal information integration and understanding by associating entities across different modalities (e.g., text, images), and to improve the accuracy of linking to entities in the Knowledge Graph (KG). It is an important foundation for downstream NLP tasks, e.g., Information Retrieval (Chang et al. 2006; Martinez-Rodriguez, Hogan, and Lopez-Arevalo 2020) and Question-Answering systems (Allam and Haggag 2012; Mollá, Van Zaanen, and Smith 2006). Despite recent research progress, existing MEL methods face challenges rooted in their training data and thus fall far short of meeting the requirements of real-world applications. ...
Multi-modal Entity Linking (MEL) is a fundamental component for various downstream tasks. However, existing MEL datasets suffer from small scale, scarcity of topic types, and limited coverage of tasks, making them incapable of effectively enhancing the entity-linking capabilities of multi-modal models. To address these obstacles, we propose a dataset construction pipeline and publish M3EL, a large-scale dataset for MEL. M3EL includes 79,625 instances, covering 9 diverse multi-modal tasks and 5 different topics. In addition, to further improve the model's adaptability to multi-modal tasks, we propose a modality-augmented training strategy. Utilizing M3EL as a corpus, we train the model with this strategy and conduct a comparative analysis against existing multi-modal baselines. Experimental results show that existing models perform far below expectations (accuracy of 49.4%-75.8%). Analysis indicates that small dataset sizes, insufficient coverage of modality tasks, and limited topic diversity result in poor generalisation of multi-modal models. Our dataset effectively addresses these issues, and the model fine-tuned with M3EL shows a significant improvement in accuracy, with an average improvement of 9.3% to 25% across various tasks. Our dataset is available at https://anonymous.4open.science/r/M3EL.
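A generic way to frame the MEL linking step described above is nearest-neighbour search over fused text and image embeddings; the sketch below assumes pre-computed embeddings from any encoder (e.g., a CLIP-style model) and is not the M3EL authors' training pipeline:

```python
# Generic multi-modal entity linking by embedding similarity; the
# embeddings are stand-ins for any text/image encoder's output.
import numpy as np

def link_entity(mention_text_emb: np.ndarray,
                mention_image_emb: np.ndarray,
                candidate_embs: dict[str, np.ndarray],
                alpha: float = 0.5) -> str:
    """Fuse text and image embeddings, then pick the nearest KG entity."""
    query = alpha * mention_text_emb + (1 - alpha) * mention_image_emb
    query = query / np.linalg.norm(query)
    best, best_sim = None, -1.0
    for entity, emb in candidate_embs.items():
        sim = float(query @ (emb / np.linalg.norm(emb)))
        if sim > best_sim:
            best, best_sim = entity, sim
    return best

rng = np.random.default_rng(0)
cands = {"Q90_Paris": rng.normal(size=8), "Q79_Egypt": rng.normal(size=8)}
print(link_entity(rng.normal(size=8), rng.normal(size=8), cands))
```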
... Recently, novel XAI techniques have been proposed in the literature based on Natural Language Processing (NLP). There are NLP models capable of providing a specific answer in natural language given a context and a question [3]. For example, BERT [9], which stands for Bidirectional Encoder Representations from Transformers, uses a bidirectional network for language understanding. ...
Face Recognition (FR) has advanced significantly with the development of deep learning, achieving high accuracy in several applications. However, the lack of interpretability of these systems raises concerns about their accountability, fairness, and reliability. In the present study, we propose an interactive framework to enhance the explainability of FR models by combining model-agnostic Explainable Artificial Intelligence (XAI) and Natural Language Processing (NLP) techniques. The proposed framework is able to accurately answer a variety of user questions through an interactive chatbot. In particular, the explanations generated by our method take the form of natural language text and visual representations, which can, for example, describe how different facial regions contribute to the similarity measure between two faces. This is achieved through the automatic analysis of the saliency heatmaps produced for the face images and a BERT question-answering model, providing users with an interface that facilitates a comprehensive understanding of the FR decisions. The proposed approach is interactive, allowing users to ask questions and obtain more precise information based on their background knowledge. More importantly, in contrast to previous studies, our solution does not decrease face recognition performance. We demonstrate the effectiveness of the method through different experiments, highlighting its potential to make FR systems more interpretable and user-friendly, especially in sensitive applications where decision-making transparency is crucial.
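The kind of BERT question-answering model mentioned above can be exercised in a few lines with the Hugging Face transformers library; the checkpoint below is the QA pipeline's public SQuAD-tuned default, used here purely for illustration:

```python
# Minimal extractive QA with a BERT-style model via Hugging Face
# transformers (pip install transformers). The model answers a question
# by selecting a span from the supplied context.
from transformers import pipeline

qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")
result = qa(question="Which facial region contributed most?",
            context="The saliency heatmap shows the eye region contributed "
                    "most to the similarity score between the two faces.")
print(result["answer"], round(result["score"], 3))
```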
... On the other hand, the goal of QA methodologies is to effectively respond to user queries by suggesting contextually relevant answers. This intricate task is typically divided into three integral modules: question classification, information retrieval, and answer extraction [7]. Question classification involves anticipating the expected type of answer based on the nature of the posed question, while information retrieval generates search results aligned with the identified question type. ...
... Harnessing the power of automated QA systems can significantly enhance the effectiveness and efficiency of managing and accessing large volumes of data. Allam and Haggag (2012) defined QA as a multidisciplinary research area that intersects Information Retrieval (IR), Information Extraction (IE), and NLP. Its primary goal is to provide precise and comprehensive answers to user queries by extracting relevant information from textual documents or databases. ...
In the rapidly evolving landscape of Natural Language Processing (NLP), Large Language Models (LLMs) have demonstrated remarkable capabilities in tasks such as question answering (QA). However, the accessibility and practicality of utilizing these models for industrial applications pose significant challenges, particularly concerning cost-effectiveness, inference speed, and resource efficiency. This paper presents a comprehensive benchmarking study comparing open-source LLMs with their non-open-source counterparts on the task of question answering. Our objective is to identify open-source alternatives capable of delivering comparable performance to proprietary models while being lightweight in terms of resource requirements and suitable for Central Processing Unit (CPU)-based inference. Through rigorous evaluation across various metrics including accuracy, inference speed, and resource consumption, we aim to provide insights into selecting efficient LLMs for real-world applications. Our findings shed light on viable open-source alternatives that offer acceptable performance and efficiency, addressing the pressing need for accessible and efficient NLP solutions in industry settings.
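The inference-speed side of such a benchmark reduces to careful wall-clock timing around the model call; a minimal sketch, assuming `generate_fn` wraps any model's CPU inference:

```python
# Wall-clock latency benchmark for a generation callable. `generate_fn`
# is an assumption: any function that takes a prompt and runs inference.
import statistics
import time

def benchmark(generate_fn, prompts: list[str], warmup: int = 1) -> dict:
    for p in prompts[:warmup]:          # warm caches before timing
        generate_fn(p)
    latencies = []
    for p in prompts:
        start = time.perf_counter()
        generate_fn(p)
        latencies.append(time.perf_counter() - start)
    return {"mean_s": statistics.mean(latencies),
            "p95_s": sorted(latencies)[int(0.95 * (len(latencies) - 1))]}

# Example with a dummy "model" standing in for real CPU inference:
print(benchmark(lambda p: p.upper(), ["What is QA?"] * 20))
```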
... Question analysis is a foundational phase in AQA, vital for enhancing its overall quality. Its main objective is to create a structured representation of the required information to address the user query [17][18][19][20]. Typically, this analysis involves two main tasks. ...
In the domain of question subjectivity classification, there exists a need for detailed datasets that can foster advancements in Automatic Subjective Question Answering (ASQA) systems. Addressing the prevailing research gaps, this paper introduces the Fine-Grained Question Subjectivity Dataset (FQSD), which comprises 10,000 questions. The dataset distinguishes between subjective and objective questions and offers additional categorizations such as Subjective-types (Target, Attitude, Reason, Yes/No, None) and Comparison-form (Single, Comparative). Annotation reliability was confirmed via robust evaluation techniques, yielding a Fleiss's Kappa score of 0.76 and Pearson correlation values up to 0.80 among three annotators. We benchmarked FQSD against existing datasets such as those of Yu, Zha, and Chua (2012), SubjQA (Bjerva 2020), and ConvEx-DS (Hernandez-Bocanegra 2021). Our dataset excelled in scale, linguistic diversity, and syntactic complexity, establishing a new standard for future research. We employed visual methodologies to provide a nuanced understanding of the dataset and its classes. Utilizing transformer-based models such as BERT, XLNET, and RoBERTa for validation, RoBERTa achieved an outstanding F1-score of 97%, confirming the dataset's efficacy for the advanced subjectivity classification task. Furthermore, we utilized Local Interpretable Model-agnostic Explanations (LIME) to elucidate model decision-making, ensuring transparent and reliable model predictions in subjectivity classification tasks.
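The validation setup described above, a RoBERTa classifier for subjective-versus-objective questions, can be sketched as follows; `roberta-base` is the public base checkpoint, and the classification head here is untrained, so its outputs are meaningless until the model is fine-tuned on FQSD:

```python
# Skeleton of a RoBERTa subjectivity classifier. The label mapping is an
# assumption; the head is randomly initialized until fine-tuned on FQSD.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2)   # 0 = objective, 1 = subjective (assumed)

batch = tok(["Is this phone better than its predecessor?"],
            return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    logits = model(**batch).logits
print(logits.softmax(dim=-1))
```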
... The researchers noted that their developed virtual assistant can provide answers to the majority of COVID-19-related queries raised by users. Allam and Haggag (2012) state that the three main components of the methodology are question categorization, information retrieval, and answer extraction. All these components work together using natural language processing. ...
Question answering systems are capable of responding to user inquiries using natural language. These systems analyze questions utilizing natural language processing methods and retrieve responses from appropriate data sources using information retrieval techniques. Additionally, text mining and deep network techniques can enhance the effectiveness of question answering systems by providing more accurate and relevant information. In this study, we developed question answering models employing text mining and deep networks. We trained a pre-existing English BERT-base model with the Stanford Question Answering Dataset (SQuADv1.1) utilizing various hyperparameters and fine-tuning values. Our training yielded impressive results with an F1 score of 88.13 and an Exact Match (EM) rate of 80.74, outperforming previous studies in the field. An improvement study was conducted on the Turkish History Question Answering Dataset (THQuADv1.0), which led to the update of the dataset to THQuADv2.0 by adding questions regarding the units of Düzce University. The pre-trained Turkish BERTurk-base model received training with the THQuADv2.0 dataset utilizing the successful hyperparameters and fine-tuning values obtained in the English model. As a consequence of the training, we developed the BERTDuQuA (BERT Düzce University Question Answering) model for answering Turkish questions. The BERTDuQuA model demonstrated exceptional performance, achieving an F1 score of 87.10 and an EM of 76.90.
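The F1 and EM figures quoted above follow the standard SQuAD-style evaluation; here is a condensed sketch of those two metrics (with normalization simplified to lowercasing and punctuation stripping, omitting SQuAD's article removal):

```python
# Simplified SQuAD-style Exact Match and token-level F1 between a
# predicted answer and a gold answer.
import re
from collections import Counter

def normalize(s: str) -> list[str]:
    """Lowercase, strip punctuation, and split into tokens."""
    s = re.sub(r"[^\w\s]", "", s.lower())
    return s.split()

def exact_match(pred: str, gold: str) -> float:
    return float(normalize(pred) == normalize(gold))

def f1(pred: str, gold: str) -> float:
    p, g = normalize(pred), normalize(gold)
    common = sum((Counter(p) & Counter(g)).values())
    if common == 0:
        return 0.0
    precision, recall = common / len(p), common / len(g)
    return 2 * precision * recall / (precision + recall)

print(exact_match("Düzce University", "the Düzce University"),
      round(f1("Düzce University", "the Düzce University"), 2))
```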
... Most work in this field focuses on a specific MCQ element, i.e., the stem (question), the key (correct answer), or the distractors (incorrect options). The most widely researched task is question answering (QA) [2,49]; this is equivalent to key generation, although most work in question answering does not place itself within this context. The automatic generation of stems is related to the generation of free-form questions (QG). ...
... Question Answering (QA), drawing from multiple research fields such as Information Retrieval and Information Extraction, involves the application of various methods to solve a selected question and present relevant answers. The task can be broadly divided into three key modules: question classification, information retrieval, and answer extraction (Allam, 2012). In the question classification phase, the goal is to determine the expected type of answer for the given question. ...
The demand for language models has increased drastically due to their popularization in recent years. In natural language processing, this popularization creates demand for models for low-resource languages, so that populations in emerging countries also have access to this type of technology. Nevertheless, most existing models are developed predominantly with English resources and struggle to adapt their knowledge to the complexities of under-represented languages. This work evaluates the current landscape of multilingual and language-specific models, Aya and Sabiá-7B, focusing on their application and performance in Brazilian Portuguese across Aspect-Based Sentiment Analysis (ABSA), Hate Speech Detection (HS), Irony Detection (ID), and Question Answering (QA) tasks. In our experiments, the Portuguese-focused Sabiá-7B model showed promising results on datasets built from native Portuguese examples, while the multilingual Aya model performed best on texts translated from English in the QA task.
... At present, question classification mainly consists of four IQA_QC frameworks from different perspectives: content-based [3,9], template-based [10][11][12], calculation-based [13,14], and method-based [15,16] classification. From the perspective of content, questions can be divided into those querying different facts, such as when, where, and who. ...
... According to their content, questions can be further divided into different types, including factoid, confirmation, and definition, among others [3,9,23]. The specific details are given in Table 2. ...
In the era of GeoAI, Geospatial Intelligent Question Answering (GeoIQA) represents the ultimate pursuit for everyone. Even generative AI systems like ChatGPT-4 struggle to handle complex GeoIQA. GeoIQA is a complex, domain-specific form of IQA, which aims at understanding and answering questions accurately. The core of IQA is Question Classification (QC), which mainly comprises four types: content-based, template-based, calculation-based, and method-based classification. These IQA_QC frameworks, however, struggle to be compatible and integrate with each other, which may be the bottleneck restricting substantial improvement of IQA performance. To address this problem, this paper reviews recent advances in IQA with a focus on question classification and proposes a comprehensive IQA_QC framework for understanding user query intention more accurately. By introducing the basic idea of the IQA mechanism, a three-level question classification framework consisting of essence, form, and implementation is put forward, which can cover the complexity and diversity of geographical questions. In addition, the proposed IQA_QC framework reveals that there are still significant deficiencies in IQA evaluation metrics across broader dimensions, which leads to low answer, functional, and systematic performance. Through comparisons, we find that the proposed IQA_QC framework can fully integrate and surpass the existing classifications. Although our proposed classification can be further expanded and improved, we firmly believe that this comprehensive IQA_QC framework can effectively help researchers in both semantic parsing and question-querying processes. Furthermore, the IQA_QC framework can also provide a systematic question-and-answer pair/library categorization system for AIGC systems such as GPT-4. In conclusion, whether for explicit or implicit GeoAI, IQA_QC can play a pioneering role in providing question-and-answer types in the future.
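A content-based classifier of the kind this framework builds on (factoid / confirmation / definition) can be approximated with simple surface patterns; the rules below are illustrative assumptions, not the paper's classifier:

```python
# Toy content-based question typing via surface patterns; real systems
# would use learned classifiers rather than these hand-written rules.
def question_type(q: str) -> str:
    ql = q.strip().lower()
    if ql.startswith(("is ", "are ", "do ", "does ", "did ", "can ", "was ")):
        return "confirmation"          # expects a yes/no answer
    if ql.startswith(("what is", "what are", "define ")):
        return "definition"
    if ql.split(" ", 1)[0] in {"who", "when", "where", "which", "how"}:
        return "factoid"
    return "other"

for q in ["Where is the Nile?", "Is GPT-4 generative?", "What is GeoAI?"]:
    print(q, "->", question_type(q))
```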