"A Semantic vector is built by projecting one or many terms on a close space vector of 873 concepts. Concepts are taken out of an ontology defined in the French Larousse Thesaurus [Larousse, 1992], a Roget-based dictionary indexing all language entries with one or several items taken from the 873 concepts ontology. For instance, the French verb " brandir " (to brandish) is associated with the concept of " agitation " and the noun " drapeau " (flag) is indexed by the concepts of " paix (peace), armée (army), funérailles (funerals), signe (sign) " , and " cirque (circus) " . "
ABSTRACT: In this paper, different approaches to building and expanding conceptual classes are presented. Classes are built using syntactic and semantic information provided by a corpus. Then, expansion is addressed by two different methods. The first method deals with the objects of syntactic relations found in the corpus. Relations between classes are thus designed. They are called induced relations. Then we use objects of induced syntactic relations (called complementary objects) to expand conceptual classes. We propose an automatic experimental protocol to measure the relevance of the provided concepts. The protocol helps alleviate the judgment effort of a human expert. The second method expands concepts with more global terms by using Web knowledge associated with the existing concepts. Both methods are evaluated and mixed in order to provide the most reliable technique for expanding conceptual classes.
"For English, the lexical vector space dimension was originally 1043: Its most up-to-date version has shrank this number down to 1000. For French, the language with which we chiefly work, lexicologists have defined a family of 873 concepts, hierarchised in four levels (Larousse, 1992). This leads to a space which dimension is 873. "
ABSTRACT: This paper presents a study in text classification through semantic and syntactic natural language processing. The authors have used a parser for French, SYGFRAN, and applied it to a real project of press article classification. The results of this research on a corpus of 4,843 texts containing more than 76,000 sentences are described. Classification into 37 categories has been obtained through meaning discrimination by semantic filtering techniques, explained in the document.
ABSTRACT: Within research on meaning representation in NLP, we focus our attention on thematic aspects and conceptual vectors. This vectorial base is built upon a morphosyntactic analysis of several lexical resources to reduce isolated problems. Also, a meaning is a cluster of definitions that are pointed to by an ID number. To check the results of an automatic clustering or a word sense disambiguation, we must continuously refer to the source dictionary. In this article, we describe a method for naming a word sense by a term of the vocabulary. This kind of annotation is a light and efficient method that uses meaning associations that someone or something can extract from any lexical knowledge base. Finally, the annotations should become a new lexical learning resource to improve the vectorial base.