Miguel Ángel Alonso Pardo

Universidade da Coruña | UDC · Department of Computer Science

PhD in Computer Science

About

164
Publications
35,103
Reads
1,404
Citations
Introduction
I am with LYS, the research group on Natural Language Processing (NLP) at the University of A Coruña, Spain. Currently, my fields of interest are: Multilingual text processing; Opinion Mining and Sentiment Analysis; Information Retrieval applying NLP techniques; Parsing.
Additional affiliations
July 2003 - present
Universidade da Coruña
Position
  • Associate Professor, tenured
Description
  • Opinion Mining, Information Retrieval, Parsing
October 1997 - December 2000
Universidade da Coruña
Position
  • Adjunct (Profesor Asociado a Tiempo Completo)
Description
  • Automata models for Mildly Context-Sensitive Languages (MCSL), mainly Tree Adjoining Grammars (TAG) and Linear Indexed Grammars (LIG)
January 2001 - December 2003
Universidade da Coruña
Position
  • Associate Professor, tenure track
Description
  • Information Retrieval using Natural Language Processing techniques
Education
October 1993 - September 2000

Publications (164)
Article
We present a novel unsupervised approach for multilingual sentiment analysis driven by compositional syntax-based rules. On the one hand, we exploit some of the main advantages of unsupervised algorithms: (1) the interpretability of their output, in contrast with most supervised models, which behave as a black box and (2) their robustness across di...
Article
This article tackles the problem of performing multilingual polarity classification on Twitter, comparing three techniques: (1) a multilingual model trained on a multilingual dataset, obtained by fusing existing monolingual resources, that does not need any language recognition step, (2) a dual monolingual model with perfect language detection on m...
Article
Full-text available
Parsing is a core natural language processing technique that can be used to obtain the structure underlying sentences in human languages. Named entity recognition (NER) is the task of identifying the entities that appear in a text. NER is a challenging natural language processing task that is essential to extract knowledge from texts in multiple do...
Article
Full-text available
In recent years, we have witnessed a rise in fake news, i.e., provably false pieces of information created with the intention of deception. The dissemination of this type of news poses a serious threat to cohesion and social well-being, since it fosters political polarization and the distrust of people with respect to their leaders. The huge amount...
Article
Full-text available
The COVID-19 pandemic has affected many aspects of human life. The pandemic not only caused millions of fatalities and problems but also changed public sentiment and behavior. Owing to the magnitude of this pandemic, governments worldwide adopted full lockdown measures that attracted much discussion on social media platforms. To investigate the eff...
Chapter
To our knowledge, most low-resource languages lack well-established linguistic resources for developing sentiment analysis applications. Such tools and resources are therefore badly needed to overcome NLP barriers, so that low-resource languages can deliver more benefits...
Conference Paper
Full-text available
Making natural language processing technologies available for low-resource languages is an important goal to improve the access to technology in their communities of speakers. To our knowledge, there are no well-established linguistic resources for the development of sentiment analysis applications for the Uzbek language. In this paper, we fill tha...
Conference Paper
Full-text available
We describe four systems to generate automatically bilingual dictionaries based on existing ones: three transitive systems differing only in the pivot language used, and a system based on a different approach which only needs monolingual corpora in both the source and target languages. All four methods make use of cross-lingual word embeddings trai...
Article
This paper addresses the feasibility of cross-lingual parsing with Universal Dependencies (UD) between Romance languages, analyzing its performance when compared to the use of manually annotated resources of the target languages. Several experiments take into account factors such as the lexical distance between the source and target varieties, the...
Article
Full-text available
Lexicon-based methods using syntactic rules for polarity classification rely on parsers that are dependent on the language and on treebank guidelines. Thus, rules are also dependent and require adaptation, especially in multilingual scenarios. We tackle this challenge in the context of the Iberian Peninsula, releasing the first symbolic syntax-base...
Article
Full-text available
In this work we present a new strategy for creating treebanks for languages with few resources for syntactic parsing. The method consists of adapting and combining different treebanks, annotated with Universal Dependencies, from closely related linguistic varieties, with the aim of training a parser for the language...
Article
Full-text available
This paper presents a novel strategy for creating a Universal Dependencies (UD) treebank of a low-resource language. The method consists of adapting and combining different UD treebanks from related varieties in order to train a parser for the target language. More precisely, the paper explores the influence of three different levels for the select...
Conference Paper
Full-text available
We introduce an approach to train lexicalized parsers using bilingual corpora obtained by merging harmonized treebanks of different languages, producing parsers that can analyze sentences in either of the learned languages, or even sentences that mix both. We test the approach on the Universal Dependency Treebanks, training with MaltParser and Malt...
Article
In contrast with their monolingual counterparts, little attention has been paid to the effects that misspelled queries have on the performance of Cross-Language Information Retrieval (CLIR) systems. The present work makes a first attempt to fill this gap by extending our previous work on monolingual retrieval in order to study the impact that the p...
Conference Paper
Full-text available
In this paper we describe our deep learning approach for solving two-, three- and five-class tweet polarity classification, and two- and five-class quantification. We first trained a convolutional neural network using pretrained Twitter word embeddings, so that we could extract the hidden activation values from the hidden layers once some input h...
Conference Paper
Full-text available
Code-switching texts are those that contain terms in two or more different languages, and they appear increasingly often in social media. The aim of this paper is to provide a resource to the research community to evaluate the performance of sentiment classification techniques on this complex multilingual environment, proposing an English-Spanish c...
Article
Full-text available
The field of Cross-Language Information Retrieval relates techniques close to both the Machine Translation and Information Retrieval fields, although in a context involving characteristics of its own. The present study looks to widen our knowledge about the effectiveness and applicability to that field of non-classical translation mechanisms that w...
Article
Full-text available
In democratic countries, forecasting the voting intentions of citizens and knowing their opinions on major political parties and leaders is of great interest to the parties themselves, to the media, and to the general public. Traditionally, expensive polls based on personal interviews have been used for this purpose. The rise of social networks, pa...
Working Paper
Full-text available
We present a novel unsupervised approach for multilingual sentiment analysis driven by compositional syntax-based rules. On the one hand, we exploit some of the main advantages of unsupervised algorithms: (1) the interpretability of their output, in contrast with most supervised models, which behave as a black box and (2) their robustness across di...
Article
Twitter is an important platform for sharing opinions about politicians, parties and political decisions. These opinions can be exploited as a source of information to monitor the impact of politics on society. This article analyses the sentiment of 2,704,523 tweets referring to Spanish politicians and parties from a month in 2014-15. The article m...
Conference Paper
Full-text available
Opinion Mining is the discipline that deals with the automatic processing of the opinions contained in a text. It makes it possible, for example, to determine whether or not a text expresses an opinion, or whether the polarity or sentiment expressed in it is positive, negative or mixed. It also enables the automatic extraction of features, which...
Conference Paper
Full-text available
This paper describes the participation of the LyS group at TASS 2015. In this year's edition, we used a long short-term memory neural network to address the two proposed challenges: (1) sentiment analysis at a global level and (2) aspect-based sentiment analysis on football and political tweets. The performance of this deep learning approach is com...
Article
Full-text available
We introduce an approach to train parsers using bilingual corpora obtained by merging harmonized treebanks of different languages, producing parsers that effectively analyze sentences in any of the learned languages, or even sentences that mix both languages. We test the approach on the Universal Dependency Treebanks, training with MaltParser and M...
Article
Millions of micro texts are published every day on Twitter. Identifying the sentiment present in them can be helpful for measuring the frame of mind of the public, their satisfaction with respect to a product or their support of a social event. In this context, polarity classification is a subfield of sentiment analysis focussed on determining whet...
Article
The vast amount of opinions and reviews provided in Twitter is helpful in order to make interesting findings about a given industry, but given the huge number of messages published every day it is important to detect the relevant ones. In this respect, the Twitter search functionality is not a practical tool when we want to poll messages dealing wi...
Conference Paper
Full-text available
We address the problem of performing polarity classification on Twitter over different languages, focusing on English and Spanish, comparing three techniques: (1) a monolingual model which knows the language in which the opinion is written, (2) a monolingual model that acts based on the decision provided by a language identification tool and (3) a...
Conference Paper
Full-text available
This paper describes our participation at the third edition of the workshop on Sentiment Analysis focused on Spanish tweets, TASS 2014. This year's evaluation campaign includes four challenges: (1) global sentiment analysis, (2) topic classification, (3) aspect-extraction and (4) aspect-based sentiment analysis. Tasks 1 and 2 are addressed from a...
Conference Paper
Full-text available
Companies and organizations are starting to take an interest in monitoring what users say about them on Twitter, since tweets are a good source of information on how society perceives their business area. To do so, it is first necessary to filter out unrelated opinions, given the...
Article
Full-text available
We describe an opinion mining system which classifies the polarity of Spanish texts. We propose an NLP approach that undertakes pre-processing, tokenisation and POS tagging of texts to then obtain the syntactic structure of sentences by means of a dependency parser. This structure is then used to address three of the most significant linguistic con...
Conference Paper
Full-text available
This paper describes our participation at RepLab 2014, a competitive evaluation for reputation monitoring on Twitter. The following tasks were addressed: (1) categorisation of tweets with respect to standard reputation dimensions and (2) characterisation of Twitter profiles, which includes: (2.1) identifying the type of those profiles, such as jou...
Conference Paper
Full-text available
This paper proposes an approach to solve message- and phrase-level polarity classification in Twitter, derived from an existing system designed for Spanish. As a first step, an ad-hoc preprocessing is performed. We then identify lexical, psychological and semantic features in order to capture different dimensions of the human language which are hel...
Conference Paper
Full-text available
This article describes the approach developed by our group to address the global-level sentiment analysis, topic identification and political tendency classification tasks on Spanish tweets proposed at the Workshop on Sentiment Analysis at SEPLN (TASS 2013). As a preliminary step, we carry out an ad-hoc preprocessing in order to norm...
Conference Paper
Full-text available
This work describes the system for the normalization of tweets in Spanish designed by the Language in the Information Society (LYS) Group of the University of A Coruña for Tweet-Norm 2013. It is a conceptually simple and flexible system, which uses few resources and that faces the problem from a lexical point of view.
Conference Paper
Full-text available
We describe a system that classifies the polarity of Spanish tweets. We adopt a hybrid approach, which combines machine learning and linguistic knowledge acquired by means of NLP. We use part-of-speech tags, syntactic dependencies and semantic knowledge as features for a supervised classifier. Lexical particularities of the language used in Twitter...
Article
This article describes a system that classifies the polarity of Spanish tweets. We adopt a hybrid approach, which combines linguistic knowledge acquired by means of NLP with machine learning techniques. We carry out a preprocessing of the tweets as an initial step to address some characteristics of the language used in Twitter. Then, we apply part-...
Conference Paper
Full-text available
This work presents the conclusions drawn from evaluating the academic results obtained by students in the Programación II (Programming II) course, in the first year of the Degree in Computer Engineering at the University of A Coruña. The data, from the second year of the course's implementation under the guidelines of the EHEA (European Higher Education Area),...
Article
This article describes an opinion mining system that classifies the polarity of Spanish texts. We propose an NLP-based approach which performs segmentation, tokenization and POS tagging of texts to then obtain the syntactic structure of sentences by means of a dependency parser. The syntactic structure is then used to address three of the most signi...
Article
Full-text available
This article describes an opinion mining system that classifies the polarity of Spanish texts. We propose an NLP-based approach that performs segmentation, tokenization and tagging of the texts, and then obtains the syntactic structure of the sentences by means of dependency parsing algorithms...
Article
Full-text available
This article describes a system for classifying the polarity of tweets written in Spanish. We adopt a hybrid approach that combines linguistic knowledge obtained through NLP with machine learning techniques. As a preliminary step, a first preprocessing stage is carried out to handle certain characteristics...
Article
This work describes the system for the normalization of tweets in Spanish designed by the Language in the Information Society (LYS) Group of the University of A Coruña for Tweet-Norm 2013. It is a conceptually simple and flexible system, which uses few resources and that faces the problem from a lexical point of view.
Chapter
Full-text available
Introduction. Natural Language Processing (NLP), sometimes also referred to as Computational Linguistics [Jurafsky and Martin, 2009; Mitkov, 2005], is the discipline concerned with the design and implementation of the software components needed for the computational treatment of natural language, understood as any human language,...
Conference Paper
Full-text available
We present the experience of a group of Computer Science lecturers in teaching two markedly interdisciplinary courses that bring together two fields as different as computer science and the humanities: one taught in the second cycle of the Computer Engineering degree, and another taught in a university master's...
Conference Paper
Full-text available
In this article we present the work that the LYS Group (Lengua y Sociedad de la Información, Language in the Information Society) has recently been carrying out in the areas of error-tolerant information retrieval and multilingual information retrieval. The common link between both lines of research is the use of character n-grams as unit...
Article
Robustness, the ability to analyze any input regardless of its grammaticality, is a desirable property for any system dealing with unrestricted natural language text. Error-repair parsing approaches achieve robustness by considering ungrammatical sentences as corrupted versions of valid sentences. In this article we present a deductive formalism, b...
Chapter
Full-text available
Most teachers agree that exams do not allow an adequate assessment of the knowledge, competences and skills acquired by students, although this seems to be accepted without great concern. Here we review a decade of work in an optional second-cycle Computer Engineering course where...
Conference Paper
Full-text available
The process of adapting Spanish university degrees to the European Higher Education Area has fostered the emergence of degrees with a markedly interdisciplinary character. The case at hand is that of an official master's programme that seeks to provide philologists with computing skills for their entry into the emerging labour market...
Conference Paper
Full-text available
In order to produce efficient Natural Language Processing (NLP) tools, reliable linguistic resources are a preliminary requirement. When available for a given language, the resources are generally far below the expectations in terms of quality, coverage or usability. This paper presents a project whose ambition is to enhance the production capaciti...
Conference Paper
Full-text available
The efficiency of Natural Language Processing (NLP) tools depends directly on the quality and coverage of the linguistic resources on which they are based. We present a project whose goal is to improve the capacity to produce linguistic resources.
Article
We present a compiler which can be used to automatically obtain efficient Java implementations of parsing algorithms from formal specifications expressed as parsing schemata. The system performs an analysis of the inference rules in the input schemata in order to determine the best data structures and indexes to use, and ensure that the generated i...
Conference Paper
Full-text available
A desirable property for any system dealing with unrestricted natural language text is robustness, the ability to analyze any input regardless of its grammaticality. In this paper we present a novel, general transformation technique to automatically obtain robust, error-repair parsers from standard non-robust parsers. The resulting error-repair par...
Article
The performance of information retrieval systems is limited by the linguistic variation present in natural language texts. Word-level natural language processing techniques have been shown to be useful in reducing this variation. In this article, we summarize our work on the extension of these techniques for dealing with phrase-level variation in E...
Conference Paper
Full-text available
Logic programs share with context-free grammars a strong reliance on well-formedness conditions. Their proof procedures can be viewed as a generalization of context-free parsing. In particular, definite clause grammars can be interpreted as an extension of the classic context-free formalism where the notion of finite set of non-terminal symbols is...
Article
Full-text available
We study how the use of syntactic information can improve the performance of Information Retrieval systems based on single-word terms. We consider two different approaches. The first one identifies the syntactic structure of the text by means of a shallow parser in order to extract the head-modifier pairs of the most relevant syntactic dependencies...
Conference Paper
Full-text available
The parsing schemata formalism allows us to describe parsing algorithms in a simple, declarative way by capturing their fundamental semantics while abstracting low-level detail. In this work, we present a compilation technique allowing the automatic transformation of parsing schemata to efficient executable implementations of their corresponding a...
Article
Full-text available
We present error-repair parsing schemata, which allow error-repair parsing algorithms to be defined in an abstract and declarative way. This formalism can be used to describe such algorithms simply and uniformly, and provides a formal basis for proving their correctness...
Conference Paper
Full-text available
We present a technique for the construction of efficient prototypes for natural language parsing based on the compilation of parsing schemata to executable implementations of their corresponding algorithms. Taking a simple description of a schema as input, Java code for the corresponding parsing algorithm is generated, including schema-specific ind...
Chapter
Full-text available
We present a system allowing the automatic transformation of parsing schemata to efficient executable implementations of their corresponding algorithms. This system can be used to easily prototype, test and compare different parsing algorithms. In this work, it has been used to generate several different parsers for Context Free Grammars and Tree...
Article
Full-text available
This work studies the behaviour of the parsing algorithms most widely used for Tree Adjoining Grammars (TAG). To this end, we apply a compilation technique that allows the automatic transformation of parsing schemata into efficient implementations of the...
Chapter
Full-text available
Tree Adjoining Grammars (TAG) and Linear Indexed Grammars (LIG) are extensions of Context Free Grammars that generate the class of Tree Adjoining Languages. Taking advantage of this property, and providing a method for translating a TAG into a LIG, we define several parsing algorithms for TAG on the basis of their equivalent LIG parsers. We also ex...
Conference Paper
Full-text available
In this paper, a generic system that generates parsers from parsing schemata is applied to the particular case of the XTAG English grammar. In order to be able to generate XTAG parsers, some transformations are made to the grammar, and TAG parsing schemata are extended with feature structure unification support and a simple tree filtering mechanism...
Conference Paper
Full-text available
STAIRS 2006 is the third European Starting AI Researcher Symposium, an international meeting aimed at AI researchers, from all countries, at the beginning of their career: PhD students or people holding a PhD for less than one year. A total of 59 papers ...
Article
Full-text available
To date, attempts to apply syntactic information in the dominant document-based retrieval model have led to little practical improvement, mainly due to the problems associated with the integration of this kind of information into the model. In this article we propose the use of a locality-based retrieval model for reranking, which deals with sy...
Article
Full-text available
We present a compiler capable of generating parsers from parsing schemata. These schemata are representations of parsers in the form of deductive systems, which abstract away implementation details and make it easy to define and compare different algorithms. Our compiler is able...
Conference Paper
In Information Retrieval (IR) systems, the correct representation of a document through an accurate set of index terms is the basis for obtaining a good performance. If we are not able to both extract and weight appropriately the terms which capture the semantics of the text, this shortcoming will have an effect on all the subsequent processing.
Conference Paper
Full-text available
Information Retrieval systems are limited by the linguistic variation of language. The use of Natural Language Processing techniques to manage this problem has been studied for a long time, but mainly focusing on English. In this paper we deal with European languages, taking Spanish as a case in point. Two different sources of syntactic information...
Article
Full-text available
The parsing schemata formalism allows us to describe parsing algorithms in a simple way by capturing their fundamental semantics while abstracting low-level detail. In this work, we present a compilation technique allowing automatic transformation of parsing schemata to executable implementations of their corresponding algorithms. Taking a simple d...
Conference Paper
Full-text available
To date, attempts to apply syntactic information in the dominant document-based retrieval model have led to little practical improvement, mainly due to the problems associated with the integration of this kind of information into the model. In this article we propose the use of a locality-based retrieval model for reranking, which deals with sy...
Conference Paper
Full-text available
This article describes the application of lemmatization and shallow parsing as a linguistically-based alternative to stemming in Text Retrieval, with the aim of managing linguistic variation at both word level and phrase level. Several alternatives for selecting the index terms among the syntactic dependencies detected by the parser are evaluated....
Article
Full-text available
Tree Adjoining Grammar (TAG) is a useful formalism for describing the syntactic structure of natural languages. In practice, a large part of wide coverage TAGs is formed by trees that satisfy the restrictions imposed by Tree Insertion Grammar (TIG), a simpler formalism. This characteristic can be used to reduce the practical complexity of TAG parsi...
Article
Full-text available
Our goal is to study a practical approach to deal with nontermination in definite clause grammars. We focus on two problems, loop and cyclic structure detection and representation, maintaining a tight balance between practical efficiency and operational completeness.
Article
Full-text available
The employment of Natural Language Processing techniques for Information Retrieval has been studied many times, but such works have mainly focused on English. In this article we describe the evolution of the research developed by our group for the case of Spanish: from our initial experiments, characterized by the lack of standard resources for eva...
Article
The employment of Natural Language Processing techniques for Information Retrieval has been studied many times, but such works have mainly focused on English. In this article we describe the evolution of the research developed by our group for the case of Spanish: from our initial experiments, characterized by the lack of standard resources for eva...
Article
In this our second participation in the CLEF Spanish monolingual track, we have continued applying Natural Language Processing techniques for single word and multi-word term conflation. Two different conflation approaches have been tested. The first approach is based on the lemmatization of the text in order to avoid inflectional variation. Our second appr...
Article
Full-text available
The extraction of the keywords that characterize each document in a given collection is one of the most important components of an Information Retrieval system. In this article, we propose to apply shallow parsing, implemented by means of cascades of finite-state transducers, to extract complex index terms based on an approximate grammar of Spanish....
Conference Paper
Full-text available
The extraction of the keywords that characterize each document in a given collection is one of the most important components of an Information Retrieval system. In this article, we propose to apply shallow parsing, implemented by means of cascades of finite-state transducers, to extract complex index terms based on an approximate grammar of Spa...
Conference Paper
Full-text available
The performance of Information Retrieval systems is limited by the linguistic variation phenomena present in texts. Word-level Natural Language Processing techniques have proven useful in reducing this variation. In this article we propose extending this approach to variation at the level of...
Conference Paper
Full-text available
In this our second participation in the CLEF Spanish monolingual track, we have continued applying Natural Language Processing techniques for single word and multi-word term conflation. Two different conflation approaches have been tested. The first approach is based on the lemmatization of the text in order to avoid inflectional variation. Our s...
Article
Full-text available
We work in the domain of a regional least-cost strategy with dynamic validation in order to avoid cascaded errors [3], extending the theoretical model to illustrate its asymptotic equivalence with global repair algorithms. This is an objective criterion to measure the quality of an error repair algorithm, since the point of reference is a technique...
Conference Paper
Full-text available
We define a new model of automata for the description of bidirectional parsing strategies for context-free grammars and a tabulation mechanism that allows them to be executed in polynomial time. This new model of automata provides a modular way of defining bidirectional parsers, separating the description of a strategy from its execution.
Article
Full-text available
A large part of wide coverage Tree Adjoining Grammars (TAG) is formed by trees that satisfy the restrictions imposed by Tree Insertion Grammars (TIG). This characteristic can be used to reduce the practical complexity of TAG parsing, applying the standard adjunction operation only in those cases in which the simpler cubic-time TIG adjunction cannot...
Conference Paper
Full-text available
An incremental development environment for unrestricted context-free languages is described and tested. Our proposal includes a parse generator, an incremental facility to make the overall parsing efficient in the context of program development; and a graphical interface that provides a complete set of customization and trace facilities. The tool,...
Article
Full-text available
Tree Adjoining Grammar (TAG) is a formalism that has become very popular for the description of natural languages. However, the parsers for TAG that have been defined on the basis of Earley's algorithm entail important computational costs. In this article, we propose to extend the left corner relation from Context Free Grammar (CFG) to TAG in o...
Article
Full-text available
We define a tabular parser for Tree Adjoining Grammars (TAG) with a bottom-up parsing strategy and bidirectional traversal of the input string. This parser results from merging the bidirectional bottom-up parser already defined for TAG with the new parser for Tree Insertion Grammars (TIG) that...
Article
Full-text available
The extraction of the keywords that characterize a document in a given collection is one of the most important components of an Information Retrieval system. In this article, we propose to apply shallow parsing, implemented by means of cascades of finite-state transducers, to extract complex index terms based on an approximated grammar of Spanish....