Conference Paper

Temporal information needs in ResPubliQA: an attempt to improve accuracy. The UC3M participation at CLEF 2010

Conference: CLEF 2010 LABs and Workshops, Notebook Papers, 22-23 September 2010, Padua, Italy
Source: DBLP

ABSTRACT

The UC3M team participated in the second ResPubliQA evaluation campaign in 2010, taking part in the monolingual Spanish task. On this occasion we completely redesigned our Question Answering system, the product of earlier work carried out as part of the MIRACLE team, building an entirely new architecture. The aim was to gain the modularity, flexibility and evaluation capabilities that previous versions lacked. Although the system was conceived as open-domain, it was tested on the legal domain using the JRC-Acquis and EUROPARL collections. We submitted two runs to the paragraph selection task. Our main effort in this campaign focused on the study of information needs concerning time. Starting from a base system built on passage retrieval, we added temporal question analysis, temporal indexing of the collection, and some temporal filtering and reasoning features, obtaining a global accuracy of 0.51. For the second run we also implemented an answer analysis module based on n-gram analysis, which yielded slightly better results, reaching an accuracy of 0.52. We discuss the results of each configuration when applied to the different types of temporal questions.
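
The abstract only names the system components (passage retrieval, temporal filtering and reasoning, n-gram answer analysis) without describing how they work. As a purely illustrative aid, the following Python sketch shows one simple way a question-paragraph n-gram overlap score combined with a year-based temporal filter could be put together; the Jaccard overlap, the year regex, and all function and variable names are assumptions made for illustration, not the implementation described in the paper.

# Illustrative sketch (not the authors' implementation): rank candidate
# paragraphs by word n-gram overlap with the question, optionally discarding
# paragraphs whose mentioned years fall outside a temporal restriction.
import re
from typing import List, Optional, Set, Tuple


def ngrams(text: str, n: int) -> Set[Tuple[str, ...]]:
    # Word n-grams of the lowercased text.
    tokens = re.findall(r"\w+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_score(question: str, paragraph: str, max_n: int = 3) -> float:
    # Average Jaccard overlap of 1..max_n grams between question and paragraph.
    score = 0.0
    for n in range(1, max_n + 1):
        q, p = ngrams(question, n), ngrams(paragraph, n)
        if q and p:
            score += len(q & p) / len(q | p)
    return score / max_n


def passes_temporal_filter(paragraph: str,
                           restriction: Optional[Tuple[int, int]]) -> bool:
    # Keep the paragraph if any year it mentions falls inside the question's
    # (start, end) restriction; keep everything when no restriction is given.
    if restriction is None:
        return True
    years = [int(y) for y in re.findall(r"\b(?:19|20)\d{2}\b", paragraph)]
    start, end = restriction
    return any(start <= y <= end for y in years)


def rank_paragraphs(question: str,
                    paragraphs: List[str],
                    restriction: Optional[Tuple[int, int]] = None
                    ) -> List[Tuple[float, str]]:
    # Filter by the temporal restriction, then sort by overlap score, best first.
    candidates = [p for p in paragraphs if passes_temporal_filter(p, restriction)]
    return sorted(((overlap_score(question, p), p) for p in candidates),
                  reverse=True)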
