
Walter Koza
Consejo Nacional de Investigaciones Científicas
PhD in Humanities and Arts with a specialization in Linguistics
Adjunct Researcher (CONICET)
About
24 Publications
2,830 Reads
17 Citations
Publications (24)
The objective of this work is to formalize nominal ellipsis in Spanish – a grammatical mechanism in which one element is silenced with its syntactic structure, under certain syntactic restrictions – through the creation of an algorithm that automatically recognizes and replaces elided elements in natural language texts. Based on the proposal by Saa...
Representing the predicate argument structure of the medical domain (ASMD) is important for automatic text analyses. This work aims to describe the ASMD through verbs and transformational possibilities. Computer resources were constructed from 100 selected biomedical verbs (corpus CCM2009). Firstly, these were analyzed to determine the quantity and...
This text proposes a method for automatic analysis of predicates for discovery (PD) in Spanish. A PD is a predicative unit that projects an argument structure (AS) whose meaning alludes to ‘something that is found by someone -or something- somewhere’ (e.g., ‘encontrar’, ‘hallar’). This type of task is useful in fields such as medicine, since it off...
Abstract: The problem of eventivity in nominal lexical units was addressed with great interest by generative grammar in the 1970s; however, that early research focused on nouns derived from verbs. In contrast, studies on non-deverbal eventive nouns (NEND)...
We present a methodology for the automatic recognition of negated findings in radiological reports considering morphological, syntactic, and semantic information. In order to achieve this goal, a series of rules for processing lexical and syntactic information was elaborated. This required development of an electronic dictionary of medical terminol...
Prepositions represent an evident problem for children diagnosed with Specific Language Impairment (SLI), as they tend to elide prepositions or use them incorrectly in their oral production. Although this subject has been described for different languages, it has not been studied in depth for Spanish. In this context, the present work seeks to describe and an...
Syntactic complexity and narrative construction. Analysis of a retelling task by preschool children. The goal of this paper is to establish how children give complexity to their syntax in the different moments of a narration, according to a discursive functionalist approach. For this purpose, 30 preschool children were selected and they participate...
In this paper, the enumeration structure is examined from a grammatical approach and a computational linguistics perspective. To this end, some theoretical aspects based on the nature of the elements that compose the enumeration, the relation among the enumeration, the matrix that contains it and the syntactic element that allows the enumeration in the...
Abstract: A method is described for automatically generating definitions by making the meaning of morphemes explicit, applied to medical neologisms. First, the morphemes that make up the neologisms are automatically recognized and each one is assigned its corresponding meaning. Next...
This paper describes a method of generating definitions automatically by making the meaning of morphemes explicit, and applies this method to medical neologisms. First, the morphemes that make up the neologisms are automatically recognized and each morpheme is assigned its corresponding meaning. These meanings are then combined through rewrite rule...
In the present work, the enumerative series structure is examined from a grammatical approach and a computational linguistics perspective. This phenomenon has been studied by different authors, such as Luc (2001) and Cortés (2008), among others, and can be defined as a textual construction composed of a matrix (an element that is expanded by the enum...
An analysis and formalization of the enumerative series structure in Spanish is presented for subsequent computational implementation. The enumerative series is a textual construct composed of a matrix, an enumerator and an enumeration. Each element of an enumeration is called an ‘enumerating’; all elements are related to an ‘enumeratheme’. Enumerating...
A method for the automatic extraction of term candidates from the medical field by applying linguistic information is described. Lexicographic, morphological and syntactic rules were used. First, the detection was performed by applying a standard dictionary that assigned the tag ‘MED’ (‘MEDICAL’) to the words that could be considered...
This paper describes the tasks carried out to develop an automatic extractor for verbal chilenismos using natural language rules. To this end, lexical, morphological and syntactic features were formalized for subsequent computational implementation. First, verbal chilenismos were classified into four kinds, according to t...
Abstract: This article describes the tasks carried out to develop an automatic extractor of differential verbs of Chilean Spanish by applying natural language rules. With this objective, the lexical, morphological and syntactic features of these expressions were modeled, the...
This paper studies the effect of automatic sentence boundary detection and comma prediction on entity and relation extraction in speech. We show that punctuating the machine generated transcript according to maximum F-measure of period and comma annotation results in suboptimal information extraction. Precisely, period and comma decision thresholds...
An automatic method to extract term candidates from the medical field by applying linguistic techniques is presented. Semantic, morphological and syntactic rules were used to develop this term extractor. In the first phase, detection was performed by applying a standard dictionary. This dictionary was uploaded to the analyzer software that assi...
In this paper we describe and compare three approaches for the automatic extraction of medical terms using noun phrases (NPs) previously recognized in a Spanish medical text corpus. In the first approach, used as a baseline, we extracted all NPs, while for the second and third the extraction process is directed to "specific NPs" that are determined...
We present a recognition algorithm for Noun Phrase enumerations (NPEs) whose objective is to detect and extract multiword expressions (MWEs). The algorithm uses syntactic rules elaborated from linguistic information to recognize NPEs. This information corresponds to morphological categories (noun, adjective, feminine, masculine, etc.). The eval...
Projects (3)
The aim is to develop a methodology that explores the possibilities of automatic translation of medical knowledge (TACM, from the Spanish "traducción automática del conocimiento médico") in relation to explicitation, understood as the translation technique that consists of making explicit in a target text information that is implicit in a source text (Herrezuelo, 2008; Alcántara, 2013; Soto, 2013), and to transformational rewrite rules (M. Gross, 1975; Messina & Langella, 2015; Silberztein, 2016). All of this is based on the automatic analysis of the argument structure developed by the lexical units of the medical domain, according to their possible meanings.
A project to develop tasks aimed at the automatic extraction of term candidates from the medical domain for Spanish, based on the processing of linguistic information. To this end, electronic dictionaries and morphological and syntactic grammars are built.
The project develops tasks for the automatic extraction of term candidates for Spanish through the processing of linguistic information. This involves building electronic dictionaries and creating morphological and syntactic grammars.