Emilio Sanchis Arnal

Universidad de Valladolid, Valladolid, Castile and León, Spain

Publications (27); 0 Total impact

  • Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop, SSPR&SPR 2010, Cesme, Izmir, Turkey, August 18-20, 2010. Proceedings; 01/2010
  • ABSTRACT: In this paper we present a system that allows users to obtain the answer to a spoken question expressed in natural language. A large-vocabulary continuous speech recognizer transcribes the spoken question into text, and a question answering engine then obtains the answer. Two improvements over the baseline system were proposed to adapt the output of the speech recognizer to the question answering engine: capitalized output from the speech recognizer and a language model for questions. System performance was evaluated using a standard question answering test suite from CLEF. Results showed that the proposed approach outperforms the baseline system both in WER and in overall system accuracy.
    Spoken Language Technology Workshop, 2008. SLT 2008. IEEE; 01/2009
  • CLEF, Edited by Peters, Carol, Gey, Fredric C., Gonzalo, Julio, Müller, Henning, Jones, Gareth J. F., Kluck, Michael, Magnini, Bernardo, Rijke, Maarten, 01/2006: pages 420-428; Springer Berlin Heidelberg., ISBN: 978-3-540-45697-1
  • ABSTRACT: The disambiguation of verbs is usually considered more difficult than that of other part-of-speech categories. This is due both to the high polysemy of verbs compared with the other categories and to the lack of lexical resources providing relations between verbs and nouns. One such resource is WordNet, which provides plenty of information and relationships for nouns but is less comprehensive for verbs. In this paper we focus on the disambiguation of verbs by means of Support Vector Machines and WordNet-extracted features based on the hyperonyms of context nouns.
    Computational Linguistics and Intelligent Text Processing, 7th International Conference, CICLing 2006, Mexico City, Mexico, February 19-25, 2006, Proceedings; 01/2006
  • Davide Buscaldi, Paolo Rosso, Emilio Sanchis Arnal
    ABSTRACT: Geographical entities often appear in very different forms in text collections, such as when a foreign name is used instead of the English one, or when the citation of some region or place omits the name of a larger geographical entity containing it. This is a known problem in the field of Information Retrieval. The use of an ontology like WordNet can help in addressing this issue. In this paper we propose an automatic method to expand the geographical terms in queries by using the WordNet ontology, and another method that expands the terms during the indexing phase. The proposed methods exploit the synonymy, meronymy and holonymy relationships provided by WordNet, together with some information extracted from the gloss.
  • Davide Buscaldi, Paolo Rosso, Emilio Sanchis Arnal
    ABSTRACT: This report describes a query expansion method based on the expansion of geographical terms by means of WordNet synonyms and meronyms. We used this method for our participation in the GeoCLEF 2005 English monolingual task, using the well-known Lucene search engine for indexing and retrieval. The results show that the proposed method was not suitable for the GeoCLEF track, while WordNet can be used more effectively during the indexing phase, by adding synonyms and holonyms to the index terms.
  • ABSTRACT: This report describes the work done by the RFIA group at the Departamento de Sistemas Informáticos y Computación of the Universidad Politécnica of Valencia for the 2005 edition of the CLEF Question Answering task. We participated in three monolingual tasks (Spanish, Italian and French) and in two cross-language tasks (Spanish to English and English to Spanish). Since this was our first participation, we focused our work on the passage-based search engine while using simple pattern matching rules for the Answer Extraction phase. For the cross-language tasks, we had to resort to the most common web translation tools.
  • ABSTRACT: In this paper we present a new method to improve the coverage of Passage Retrieval (PR) systems when they are employed for Question Answering (QA) tasks. The ranking of passages obtained by the PR system is rearranged to emphasize those passages more likely to contain the answer. The new ranking is based on finding the n-gram structures of the question that are present in the passage, and the weight of a passage increases when it contains longer n-gram structures of the question. The results we present show that this method notably improves the coverage of the classical PR system based on the Vector Space Model.
    Text, Speech and Dialogue, 8th International Conference, TSD 2005, Karlovy Vary, Czech Republic, September 12-15, 2005, Proceedings; 01/2005
  • ABSTRACT: Passage Retrieval (PR) is typically used as the first step in current Question Answering (QA) systems. Most methods are based on the vector space model, which allows relevant passages to be found for general user needs but fails to select pertinent passages for specific user questions. This paper describes a simple PR method especially suited to the QA task. This method considers the structure of the question, favoring the passages that contain the longer n-gram structures from the question. Experimental results of this method on Spanish, French and Italian show that this approach can be useful for multilingual question answering systems.
    MICAI 2005: Advances in Artificial Intelligence, 4th Mexican International Conference on Artificial Intelligence, Monterrey, Mexico, November 14-18, 2005, Proceedings; 01/2005
  • Proceedings of the 2nd Indian International Conference on Artificial Intelligence, Pune, India, December 20-22, 2005; 01/2005
  • Mikhail Alexandrov, Emilio Sanchis Arnal, Paolo Rosso
    ABSTRACT: Cluster analysis of dialogs with a transport directory service reveals the typical dialog scenarios, which is useful for designing automatic dialog systems. We show how to parameterize dialogs and how to control the clustering process. The parameters include both data of the transport service and features of passengers' behavior. Control of clustering consists in manipulating the parameters' weights and checking the stability of the results. This technique resembles Makagonov's approach to the analysis of dwellers' complaints to city administration. We briefly describe B. Stein's new MajorClust method and demonstrate its work on real person-to-person dialogs provided by the Spanish railway service.
    Text, Speech and Dialogue, 8th International Conference, TSD 2005, Karlovy Vary, Czech Republic, September 12-15, 2005, Proceedings; 01/2005
  • Davide Buscaldi, Paolo Rosso, Emilio Sanchis Arnal
    ABSTRACT: This paper describes how we used the WordNet ontology for the GeoCLEF 2005 English monolingual task. Two methods are covered: a query expansion method, based on the expansion of geographical terms by means of WordNet synonyms and meronyms, and an index-term expansion method, which exploits WordNet synonyms and holonyms. The results show that the query expansion method was not suitable for the GeoCLEF track, while WordNet could be used more effectively during the indexing phase.
    Accessing Multilingual Information Repositories, 6th Workshop of the Cross-Language Evaluation Forum, CLEF 2005, Vienna, Austria, 21-23 September, 2005, Revised Selected Papers; 01/2005
  • ABSTRACT: This work is a revised version of the paper “INAOE-UPV Joint Participation at CLEF 2005: Experiments in Monolingual Question Answering”, previously published in the CLEF 2005 working notes (www.clef-campaign.org/2005/working_notes/). This paper describes a fully data-driven system for question answering. The system uses pattern matching and statistical techniques to identify the relevant passages as well as the candidate answers for factoid and definition questions. Since it does not rely on any sophisticated linguistic analysis of questions and answers, it can be applied to different languages without major adaptation. Experimental results on Spanish, Italian and French demonstrate that the proposed approach can be a convenient strategy for monolingual and multilingual question answering. Supported by CONACYT (Project Grant 43990), R2D2 (CICYT TIC2003-07158-C04-03), and ICT EU-India (ALA/95/23/2003/077-054).
    Accessing Multilingual Information Repositories, 6th Workshop of the Cross-Language Evaluation Forum, CLEF 2005, Vienna, Austria, 21-23 September, 2005, Revised Selected Papers; 01/2005
  • ABSTRACT: This paper describes the QUASAR Question Answering Information System developed by the RFIA group at the Departamento de Sistemas Informáticos y Computación of the Universidad Politécnica of Valencia for the 2005 edition of the CLEF Question Answering exercise. We participated in three monolingual tasks: Spanish, Italian and French, and in two cross-language tasks: Spanish to English and English to Spanish. Since this was our first participation, we focused our work on the passage-based search engine while using simple pattern matching rules for the Answer Extraction phase. As regards the cross-language tasks, we had to resort to the most common web translation tools.
    Accessing Multilingual Information Repositories, 6th Workshop of the Cross-Language Evaluation Forum, CLEF 2005, Vienna, Austria, 21-23 September, 2005, Revised Selected Papers; 01/2005
  • MICAI, Edited by Gelbukh, Alexander, Albornoz, Álvaro, Terashima-Marín, Hugo, 01/2005: pages 816-823; Springer Berlin Heidelberg., ISBN: 978-3-540-29896-0
  • ABSTRACT: In this work we describe a dialog system developed for the DIHANA project. The system consists of seven modules: an automatic speech recognizer, a speech understanding module, a dialog manager, a module that manages the queries to the database, a natural language answer generator, a text-to-speech synthesizer and, finally, a central communication manager. For the implementation of the system, we chose an architecture based on the client-server paradigm, where the central communication manager works as the client and manages the communications, and the other modules work as servers.
  • ABSTRACT: We present an approach to estimating the understanding component of a dialogue system in which this component depends on the dialogue itself. In particular, we propose a specific model for each state of the dialogue process to improve the behavior of the understanding process. This work is developed in the framework of the BASURDE dialogue system, which answers telephone queries in Spanish about timetables and prices of long-distance trains. Some experimental results are also presented.
  • ABSTRACT: This demo illustrates the behaviour of the acquisition platform developed in the DIHANA project to acquire a dialogue corpus of telephone queries about timetables and prices of long-distance trains. This work was funded by CICYT project TIC2002/04103-C03.
  • 01/2003; Universidad Politécnica de Valencia.

Publication Stats

123 Citations

Institutions

  • 2009
    • Universidad de Valladolid
      • Department of Informatics
      Valladolid, Castile and León, Spain
  • 2005–2006
    • Polytechnical University of Valencia
      • Department of Computer Systems and Computation
      Valencia, Valencia, Spain
    • University of Valencia
      Valencia, Valencia, Spain