Maria Das's research while affiliated with Universidade Federal de Viçosa (UFV) and other places

Publications (51)

Article
Full-text available
Received on 07-11-2007 and accepted for publication on 23.06.2009. Plants of a Eucalyptus grandis clone, in a Cerrado region, at an initial spacing of 3 x 3 m, were subjected to artificial pruning and thinning. The pruning treatments began at three different ages: 16, 20 or 28 months. Three to four interventions of...
Article
Full-text available
The objective of this work was to evaluate the use of coppicing of young eucalypt plants to produce trees of smaller diameter, with harvesting made easier for small producers; or to recover severely damaged young stands and produce biomass for energy in short cycles. The experiment was conducted in an agroforestry system...
Article
Full-text available
Abstract – The objective of this work was to evaluate the use of coppicing of young eucalypt plants to produce trees of smaller diameter, with harvesting made easier for small producers; or to recover severely damaged young stands and produce biomass for energy in short cycles. The experiment was conducted in an agr...
Article
Full-text available
We describe an approach to the automatic creation of a sense tagged corpus intended to train a word sense disambiguation (WSD) system for English-Portuguese machine translation. The approach uses parallel corpora, translation dictionaries and a set of straightforward heuristics. In an evaluation with nine corpora containing 10 ambiguous verbs,...
Article
Full-text available
This paper presents the challenge of Natural Language Processing, in particular, the case of the Portuguese language in the scope of Computer Science and its disciplines. Questions related to natural language processing are associated with the challenges of knowledge access, information management in data-intensive repositories, and the complex and inter...
Article
Full-text available
In spite of its potential for bidirectionality, Extensible Dependency Grammar (XDG) has so far been used almost exclusively for parsing. This paper represents one of the first steps towards an XDG-based integrated generation architecture by tackling what is arguably the most basic among generation tasks: lexicalization. Herein we present a constra...
Article
Full-text available
This paper presents a statistical generative model for unsupervised learning of verb argument structures. The model is based on the noisy-channel model and is trained with the Expectation-Maximization algorithm. The model was used to induce the argument structures for the 1,500 most frequent verbs in English. The evaluation of a sample of this verb...
Article
Full-text available
While it is generally agreed that Word Sense Disambiguation (WSD) is an application-dependent task, the great majority of systems pursue application-independent approaches. We propose a strategy to support WSD for Machine Translation which is designed specifically for this application. It relies on the analysis of co-occurrences in the context t...
Article
Full-text available
This paper describes HERMETO, a computational environment for fully-automatic, both syntactic and semantic, natural language analysis and understanding. HERMETO converts lists into networks and has been used to enconvert Brazilian Portuguese and English sentences into Universal Networking Language (UNL) hypergraphs.
Article
Full-text available
This paper proposes a model for word sense disambiguation (WSD) with application in machine translation (MT) from English to Brazilian Portuguese. This model follows a hybrid natural language processing method, that is, a knowledge and corpus-based method. The main innovative feature of this model is the formalism used to represent the instances an...
Article
Full-text available
Parallel texts, i.e., texts in one language and their translations into other languages, are very useful nowadays for many applications such as machine translation and multilingual information retrieval. If these texts are aligned at the sentence level, for instance, their relevance increases considerably. In this paper we describe some experiments that...
Article
Full-text available
This paper describes the automatic generation and the evaluation of sets of rules for word sense disambiguation (WSD) in machine translation. The ultimate aim is to identify high-quality rules that can be used as knowledge sources in a relational WSD model. The evaluation was carried out both automatically, by means of four objective measures (erro...
Article
Full-text available
We present a statistical generative model for unsupervised learning of verb argument structures. We use the model in order to automatically induce verb argument structures for a representative set of verbs. Approximately 80% of the induced argument structures are judged correct by human subjects. The structures overlap significantly with those in P...
Article
Full-text available
This paper presents a proposal for automatic discourse analysis of texts written in Brazilian Portuguese. The corresponding system, named DiZer, takes as input a full text and yields its rhetorical, semantic, and intentional structures. Based on corpus analysis, the underlying research is aimed at verifying the contribution of morphology, syntax, s...
Article
Full-text available
Recently, many projects have been proposed aiming at automatically transforming the multilingual information available on parallel texts into linguistic knowledge useful for machine translation. This paper describes an ongoing PhD project in which the main goal is to automatically induce transfer rules and bilingual dictionaries from part-of-spee...
Article
Full-text available
We propose a novel approach for word sense disambiguation which makes use of corpus-based evidence combined with background knowledge. Using an inductive logic programming technique, it generates expressive models which exploit several knowledge sources and also the relations between them. The approach is evaluated in two tasks: identification of t...
Article
The development process of the ReGra grammar checker. Abstract: The USP-Itautec/Philco partnership made it possible to bring to market a grammar checker for Brazilian Portuguese, rated as the best performing in its category, with quality similar to that of grammar checkers for English. It is therefore a rare example in which a...
Article
Blind relevance feedback is a widely used technique for improving the performance of Information Retrieval systems. This work presents a local analysis technique for automatic query expansion using noun phrases extracted from the pseudo-relevant set. In conducting our expe...
Article
Full-text available
Regression equations were fitted to estimate monthly and annual minimum, average and maximum temperatures for the region between 16° and 24° S latitude and 48° and 60° W longitude, based on latitude, longitude and altitude. Data from 57 meteorological stations located in this region were used. After the adjustments of the mod...
Article
Full-text available
Abstract. The objective of this doctoral work was to propose and develop a new approach to word sense disambiguation, aimed specifically at machine translation, which follows a hybrid methodology (knowledge-based and corpus-based) and uses a relational formalism to represent various types of knowledge...
Article
Full-text available
Ontologies are used for representing information units that contain related semantic understanding of varied real world situations. For systematizing the set of terminological data from a domain, the use of computational tools for term extraction is essential. This work presents the evaluation of statistical, linguistic and hybrid approaches to aut...
Article
Full-text available
Solar radiation at three sites in a Mata Atlântica forest fragment in Viçosa-MG, Brazil, was studied to support natural regeneration management. On average, the forest cover reduced net radiation by 82.4%, global solar radiation by 81.2%, and photosynthetically active radiation by 86.2%. However, a spatial and temporal fluctuatio...
Article
Full-text available
Alignment of words and multiword units plays an important role in many natural language processing applications, such as example-based machine translation, transfer rule learning for machine translation, bilingual lexicography, word sense disambiguation, etc. In this paper we describe LIHLA, a lexical aligner which uses bilingual probabilistic le...
Article
Parallel texts – texts in one language and their translations into another – and aligned parallel texts – with identification of translation correspondences – are becoming more and more important for many NLP applications, mainly machine translation. In this paper we describe some experiments carried out on sentence and lexical alignment of Portuguese-...
Article
Full-text available
This paper addresses the current status, the structure and role of the UNL Knowledge Base (UNLKB) in the UNL System. It is claimed that the UNLKB, understood as the repository where Universal Words (UWs) are named and defined, demands a thorough revision, in order to accomplish the self-consistency requirement of the Universal Networking Language (...
Article
Full-text available
Natural Language Generation (NLG) concerns assigning linguistic form to data in non-linguistic representation (Reiter & Dale, 2000); Linguistic Realization (LR), in turn, comprises all strictly target language-dependent NLG tasks. This work looks into LR systems from the perspective of three fundamental requirements, namely generality, instantiab...
Article
Full-text available
This paper aims at providing a general description for DIADORIM, a lexical database for Brazilian Portuguese. DIADORIM is said to successively merge two very different previous application-oriented dictionaries, increasing their user-friendliness, the reusability of their entries and their capability of incorporating new features. Besides improving...
Article
Full-text available
In this paper we describe a simple approach to the automatic creation of a sense tagged corpus intended for multilingual word sense disambiguation (WSD). The approach is based on English-Portuguese parallel corpora and a set of straightforward heuristics. In experiments with two corpora containing some verbs, a preliminary evaluation showed that, r...
Article
Full-text available
Parallel texts – texts in one language and their translations into another – are becoming plentiful and available nowadays on the WWW. Aligning these texts means finding the correspondences between them at the sentence or word level. In this paper we describe some experiments done with two sentence alignment methods – Gale and Church's method [Gale and Chur...
Article
Full-text available
This paper discusses the problem of word sense ambiguity in computational systems aiming at multilingual communication, especially at machine translation from English into Brazilian Portuguese. In order to do this, examples of sentences showing the implications of this linguistic phenomenon and the way it is addressed by...
Article
Full-text available
This paper presents a Portuguese sentence fusion model. Sentence fusion is a text-to-text generation task which takes a set of similar sentences as input and combines these into a single output sentence. This process is of extreme relevance in many NLP applications, for instance, to treat redundancies in Multidocument Summarization by fusing inform...
Article
The eXtensible Dependency Grammar (XDG) is a very promising CP-natural framework with which to tackle varied NLP problems and their combinatorial complexity. XDG draws heavily on its non-transformational character for efficiency, which opens the issue of N:M mapping e.g. between syntactic and semantic structures. We resume discussion on this issue...

Citations

... In this dictionary, besides the classification of the verbs, their subcategorization and selectional restrictions are defined. In defining the selectional restrictions, Borba indirectly also defines a set of semantic feature options for nouns, which helps identify the semantic features of the nouns in the corpus through a study of that corpus; e) This model has already been applied in another NILC work, the TraSem project (Rino et al., 2001), whose goal was to define the semantic features of the Lex-Port items in order to improve the performance of the ReGra spelling and grammar checker. We could therefore take advantage of the experience gained in that project. ...
... Inside the vegetation, the mean annual levels of incident PAR were lower; approximately 7% of PAR↓_Ext reached the forest surface, with little sunlight (direct and/or diffuse) inside the canopy. According to Pezzopane et al. (2000), the largest fractions of diffuse radiation occur because the interaction between solar radiation and leaves is selective, with high absorption in the photosynthetically active spectral band and low absorption in the near-infrared spectral band. Another explanation is that the canopy protects the forest interior from strong, intense solar radiation, as well as from other meteorological factors such as storms, strong winds and large temperature variations. ...
... Figure 8 shows those relations and their frequency among sentences. In general, ELABORATION is very common in diverse corpora in different languages; for example, in the RST Spanish Treebank, in Discourse Treebank (Carlson et al., 2003) or CorpusTCC (Pardo and Nunes, 2004). That is because Elaboration is a common rhetorical strategy which the writer may use to expand on the previous context; thus, it becomes a de facto default whenever a more semantically marked relation does not fit the context (Carlson et al., 2003). ...
... This corpus covers several text genres (news, telephone conversations, weblogs and interviews, among others) written in English, Chinese and Arabic. For Portuguese, there are some corpora specific to certain domains and applications, such as those presented by Specia (2007) and Machado et al. (2011), and other more general ones, such as those presented by Nóbrega and Pardo (2014) and Travanca (2013). Specia (2007) proposed a WSD method based on Inductive Logic Programming, characterized by the use of machine learning and rules in propositional logic. ...
... We assume that the segments are the smaller units of significance, and these smaller units are what establish the microstructural relations. The propositions are the conceptual forms and the segments are the superficial realization forms of those propositions. However, it is acknowledged that the delimitation of these minimal units of significance represents a serious difficulty for work that involves textual segmentation; see Pardo and Nunes [4]. ...
... Books, on the other hand, are usually not evaluated on their technical details (such as the type of paper and weight), but on more prototypical aspects in this domain (such as characters and story). For evaluating the methods, we have computed the traditional clustering evaluation measures of Precision, Recall, F-measure and Global F-measure (as defined in [20]) over the reference clusters. Precision indicates the proportion of aspects of each automatic cluster that is correctly clustered (according to
Consider a_k as each aspect in g_j, ignoring a_i, which was already processed
if a_k in g_j has related coreference terms in the corresponding reviews, as indicated by CORP then
Add such terms to b_coref
end if
if a_k in g_j contains foreignism related words in iLteC lexicon then
Add such words to b_fore
end if
if a_k in g_j contains diminutive or augmentative related words in our compiled list then
Add such words to b_dim-augm
end if
Remove duplicate items from B = {b_syn, b_part, b_caus, b_devb, b_fore, b_dim-augm, b_coref, b_subs};
Add to g_j the aspects of intersection(A,B)
Remove from A the aspects of intersection(A,B)
Empty B
until every a_k in g_j is tested
until A is empty
repeat ...
... This very same reasoning is used in DiZer for analyzing texts. More details about the corpus and its annotation may be found in [17] and [18]. ...
... We recently investigated this problem considering MT from English to Portuguese. Our study (Specia and Nunes, 2004) has shown that the current MT systems do not appropriately handle the sense ambiguity problem and that this is one of the main causes of the very low quality of the resulting translations. The various approaches that have been proposed for WSD are generally aimed at monolingual contexts, considering mainly the English language. ...
... source, expressing the same senses of that word, or because the Portuguese language uses less fine-grained sense distinctions. Specia & Nunes (2004b). A description of the approaches published up to 1998 can also be found in (Ide & Véronis, 1998). ...