Un sistema para resumen automático de textos en castellano [A system for automatic summarization of Spanish texts]

Procesamiento del lenguaje natural, ISSN 1135-5948, No. 31, 2003, pp. 29-36
Source: OAI

ABSTRACT This paper presents a text summarization system for Spanish that combines classic automatic summarization techniques with less common ones, such as anaphora resolution and the detection of cohesive (discourse) markers, in order to mitigate the lack of coherence inherent in automatically extracted summaries.

  • Source
    ABSTRACT: Purpose: This study looks into the latest advances in ontology-based text summarization systems, with emphasis on the methodologies of the socio-cognitive approach, structural discourse models and ontology-based text summarization systems. Design/methodology/approach: The paper analyzes the main literature in this field and presents the structure and features of Texminer, a software tool that facilitates the summarization of texts on Port and Coastal Engineering. Texminer combines several techniques, including socio-cognitive user models, Natural Language Processing (NLP), disambiguation and ontologies. After processing a corpus, the system was evaluated using as a reference various clustering evaluation experiments conducted by Arco (Arco, 2008) and Hennig (Hennig et al., 2008). The results were checked with a Support Vector Machine, ROUGE metrics, the F-measure and the calculation of precision and recall. Findings: The experiment illustrates the superiority of abstracts obtained with the assistance of ontology-based techniques. Originality/value: We were able to corroborate that the summaries obtained using Texminer are more efficient than those derived from other systems whose summarization models do not use ontologies. Thanks to ontologies, main sentences can be selected with a broad rhetorical structure, especially for a specific knowledge domain.
    Library Hi Tech 04/2014; 32(2). · 0.53 Impact Factor
  • Source
    ABSTRACT: The aim of this research is to determine whether sentence compression is a suitable means of optimizing automatic document summarization systems. To that end, we first built a corpus of summaries of specialized documents (medical articles) produced by several automatic summarization systems. We then compressed these summaries in two ways. On the one hand, we carried out manual compression following two strategies: compression by intuitively removing some elements of the sentence, and compression by removing certain discourse elements within the framework of Rhetorical Structure Theory (RST). On the other hand, we carried out automatic compression using several strategies based on removing words of certain grammatical categories (adjectives and adverbs), along with a baseline of random word removal. Finally, we compared the original summaries with the compressed ones using the ROUGE evaluation system. The results show that, under certain conditions, sentence compression can improve automatic document summarization.
  • Source
    ABSTRACT: This paper studies a tuned version of an induction tree used for the automatic detection of lexical word categories. The database used to train the tree has several fields that describe Spanish words morpho-syntactically. All processing is performed using only the information of the word and the sentence in which it appears. It is shown that this kind of induction is good enough to perform the linguistic categorization.
    Advanced Techniques in Computing Sciences and Software Engineering, Volume II of the proceedings of the 2008 International Conference on Systems, Computing Sciences and Software Engineering (SCSS), part of the International Joint Conferences on Computer, Information, and Systems Sciences, and Engineering, CISSE 2008, Bridgeport, Connecticut, USA; 01/2008
