Maciej Piasecki

Wroclaw University of Science and Technology | WUT · Department of Computational Intelligence

176
Publications
20,994
Reads
1,326
Citations

Publications (176)
Article
Full-text available
Emotion lexicons are useful in research across various disciplines, but the availability of such resources remains limited for most languages. While existing emotion lexicons typically comprise words, it is a particular meaning of a word (rather than the word itself) that conveys emotion. To mitigate this issue, we present the Emotion Meanings data...
Conference Paper
Full-text available
The paper reports on the methodology and final results of a large-scale synset mapping between plWordNet and Princeton WordNet. Dedicated manual and semi-automatic mapping procedures as well as interlingual relation types for nouns, verbs, adjectives and adverbs are described. The statistics of all types of interlingual relations are also provided.
Chapter
Wordnets for different languages are linked through synsets - sets of synonymous word senses. We present a method for automatically transforming synset mappings into sense mappings to build a network of translational equivalents. Two heuristics based on a cross-lingual distributional similarity model are compared with several variants of machine learning ba...
Data
Presentation for the article: Propagation of emotions, arousal and polarity in WordNet using Heterogeneous Structured Synset Embeddings
Conference Paper
Full-text available
Relation Extraction is a fundamental NLP task. In this paper we investigate the impact of underlying text representation on the performance of neural classification models in the task of Brand-Product relation extraction. We also present the methodology of preparing annotated textual corpora for this task and we provide valuable insight into the pr...
Conference Paper
Full-text available
The paper presents the latest release of the Polish WordNet, namely plWordNet 4.1. The most significant developments since version 3.0 include new relations for nouns and verbs, mapping semantic role-relations from the valency lexicon Walenty onto the plWordNet structure and sense-level interlingual mapping. Several statistics are presented in...
Conference Paper
In this paper we present a morpho-syntactic tagger dedicated to Computer-mediated Communication texts in Polish. Its construction is based on an expanded RNN-based neural network adapted to the work on noisy texts. Among several techniques, the tagger utilises fastText embedding vectors, sequential character embedding vectors, and Brown clustering...
Article
Full-text available
Though the interest in use of wordnets for lexicography is (gradually) growing, no research has been conducted so far on equivalence between lexical units (or senses) in inter-linked wordnets. In this paper, we present and validate a procedure of sense-linking between plWordNet and Princeton WordNet. The proposed procedure employs a continuum of th...
Conference Paper
Full-text available
In this paper we present a novel method for emotive propagation in a wordnet based on a large emotive seed. We introduce a sense-level emotive lexicon annotated with polarity, arousal and emotions. The data were annotated as a part of a large study involving over 20,000 participants. A total of 30,000 lexical units in Polish WordNet were described...
Article
Automatic word sense disambiguation (WSD) has proven to be an important technique in many natural language processing tasks. For many years the problem of sense disambiguation has been approached with a wide range of methods; however, it is still a challenging problem, especially in the unsupervised setting. One of the well-known and successful app...
Conference Paper
Full-text available
In this article, we present a novel multidomain dataset of Polish text reviews. The data were annotated as part of a large study involving over 20,000 participants. A total of 7,000 texts were described with metadata, each text received about 25 annotations concerning polarity, arousal and eight basic emotions, marked on a multilevel scale. We pres...
Article
Full-text available
The paper presents patterns of co-occurrences of wordnet relations involving verb lexical units in plWordNet - a large wordnet of Polish. The discovered patterns reveal tendencies of selected synset and lexical relations to form regular circular structures of clear semantic meanings. They involve several types of relations, e.g., presupposition, ca...
Article
Full-text available
Lexical platform – the first step towards user-centred integration of lexical resources. The paper describes the Lexical Platform - a means for lightweight integration of independent lexical resources. Lexical resources (LRs) are represented as web components th...
Chapter
We present a method for computing the semantic similarity of Polish texts, with the main focus on short texts. We have taken into account the limited set of language tools for Polish, and especially that syntactic and semantic parsers are not accurate and robust enough to become a stable basis for similarity computation. A very lar...
Conference Paper
Full-text available
In this paper we present a novel approach to the construction of an extensive, sense-level sentiment lexicon built on the basis of a wordnet. The main aim of this work is to create a high-quality sentiment lexicon in a partially automated way. We propose a method called Classifier-based Polarity Propagation, which utilises a very rich set of wordne...
Article
Full-text available
This paper presents a supervised approach to the recognition of Cross-document Structure Theory (CST) relations in Polish texts. Its core is a graph-based representation constructed for sentences. Graphs are built on the basis of lexicalised syntactic-semantic relations extracted from text. Similarity between sentences is calculated as similarity b...
Chapter
The paper presents a new functionality of CLARIN-PL Language Technology Centre (LTC). LTC Platform is developed as a research place for processing, visualizing and depositing language data. It can connect and support the research workflow, enabling scientists to increase the efficiency and effectiveness of their research in connection to CLARIN ser...
Conference Paper
Full-text available
The paper presents a feature-based model of equivalence targeted at (manual) sense linking between Princeton WordNet and plWordNet. The model incorporates insights from lexicographic and translation theories on bilingual equivalence and draws on the results of earlier synset level mapping of nouns between Princeton WordNet and plWordNet. It takes i...
Conference Paper
Full-text available
In this paper we present a comprehensive overview of recent methods of sentiment propagation in a wordnet. Next, we propose a fully automated method called Classifier-based Polarity Propagation, which utilises a very rich set of features, most of which are based on wordnet relation types, multi-level bag-of-synsets and bag-of-polarities....
Article
Full-text available
An open stylometric system based on multilevel text analysis. Stylometric techniques are usually applied to a limited number of typical tasks, such as authorship attribution, genre analysis, or gender studies. However, they could be applied to several tasks beyond this canonical set, if only stylometric tools were more accessible to users from differ...
Article
Full-text available
This paper explores inter-lingual equivalence from the perspective of linking two large lexico-semantic databases, namely the Princeton WordNet of English and the plWordNet (pl. Słowosieć) of Polish. Wordnets are built as networks of lexico-semantic relations between words and their meanings, and constitute a type of monolingual dictionary cum thesa...
Conference Paper
Full-text available
In this paper we present our attempts in the PolEval 2017 Sentiment Analysis Task. The task is not only one of the first challenges in sentiment analysis focused on the Polish language, but also represents a novel approach to sentiment analysis, namely, predicting the sentiment not of a sentence, or a document, but of a word or a phrase within the cont...
Conference Paper
Full-text available
We present a large emotive lexicon of Polish which has been constructed by manual expansion of the emotive annotation defined for plWordNet 3.0 emo (a very large wordnet of Polish). The annotation encompasses: sentiment polarity, basic emotions and fundamental human values. Annotation scheme and revised guidelines for the annotation process are dis...
Conference Paper
We present a new morpho-syntactic tagger for Polish called MorphoDiTa-pl, which is based on the adaptation of the MorphoDiTa tagger developed originally for the Czech language. Following its basis, MorphoDiTa-pl utilises a rich feature averaged perceptron neural network for morphological analysis and morpho-syntactic disambiguation of the Polish lan...
Conference Paper
Full-text available
In this article we present the results of research on the recognition of genuine Polish suicide notes (SNs). We provide a useful method to distinguish between SNs and other types of discourse, including counterfeited SNs. The method uses a wide range of word-based and semantic features and it was evaluated using the Polish Corpus of Suicide Notes, whi...
Article
The paper focuses on the issue of creating equivalence links in the domain of bilingual computational lexicography. The existing interlingual links between plWordNet and Princeton WordNet synsets (sets of synonymous lexical units – lemma and sense pairs) are re-analysed from the perspective of equivalence types as defined in traditional lexicograph...
Conference Paper
Full-text available
We have released plWordNet 3.0, a very large wordnet for Polish. In addition to what is expected in wordnets – richly interrelated synsets – it contains sentiment and emotion annotations, a large set of multi-word expressions, and a mapping onto WordNet 3.1. Part of the release is enWordNet 1.0, a substantially enlarged copy of WordNet 3.1, with ma...
Conference Paper
Full-text available
One of the most significant events in Poland after the fall of communism was the emergence of internal conflict after the Polish president's plane crash on 10.04.2010. We investigate relations between political parties in Poland using opinion polls, member transitions between political parties and official speeches in the Polish Parliament 4 years before and af...
Conference Paper
With the growing size of a wordnet, it is becoming more and more difficult to avoid, identify and eliminate errors in it, especially when a group of editors work in parallel. That is the case of plWordNet. Thus we need elaborate tools both for error prevention during editing and for error detection after the work is completed. I...
Article
Full-text available
The complex nature of big data resources demands new structuring methods, especially for textual content. WordNet is a good knowledge source for comprehensive abstraction of natural language, as good implementations of it exist for many languages. Since WordNet embeds natural language in the form of a complex network, a transformation mechanism WordN...
Article
Full-text available
Lexical Means in Communicating Emotion in Suicide Notes - on the Basis of the Polish Corpus of Suicide Notes. Polish Corpus of Suicide Notes (PCSN) is a relatively large set of authentic suicide notes that are linguistically annotated on several levels. In order to identify features characteristic for this genre we compared PCSN with the coll...
Article
Full-text available
The System of Register Labels in plWordNet. Stylistic registers influence word usage. Both traditional dictionaries and wordnets assign lexical units to registers, and there is a wide range of solutions. A system of register labels can be flat or hierarchical, with few labels or many, homogeneous or decomposed into sets of elementary features...
Article
Full-text available
Word Sense Disambiguation Based on Large Scale Polish CLARIN Heterogeneous Lexical Resources. Lexical resources can be applied in many different Natural Language Engineering tasks, but the most fundamental task is the recognition of word senses used in text contexts. The problem is difficult, not yet fully solved and different lexical resourc...
Article
Full-text available
The paper describes a system of lexico-semantic relations proposed for the nominal part of plWordNet 2.0, the largest Polish wordnet. We briefly introduce a wordnet as a large electronic thesaurus. We discuss sixteen nominal relations together with many sub-types proposed for plWordNet 2.0. Each relation is based on linguistic intuition and suppo...
Article
Full-text available
Semantic relations among adjectives in Polish WordNet 2.0: a new relation set, discussion and evaluation. Adjectives in wordnets are often neglected: there are many fewer of them than nouns, and relations among them are sometimes not as varied as those among nouns or verbs. Polish WordNet 1.0 was no exception. Version 2.0 aims to correct that...
Article
Full-text available
Semantic relations between verbs in Polish WordNet 2.0. The noun dominates wordnets. The lexical semantics of verbs is usually under-represented, even if it is essential in any semantic analysis which goes beyond statistical methods. We present our attempt to remedy the imbalance; it begins by designing a sufficiently rich set of wordnet relations...
Article
Automated extraction of lexical meanings from Polish corpora: potentialities and limitations. Large corpora are often consulted by linguists as a knowledge source with respect to lexicon, morphology or syntax. However, there are also several methods of automated extraction of semantic properties of language units from corpora. In the paper we focus...
Article
Full-text available
The paper investigates the accuracy of a Named Entity Recognition (NER) algorithm based on the Hidden Markov Model in the domain of Polish stock exchange reports. The task of NER was limited to the recognition and classification of Named Entities representing persons and companies. The algorithm was tested on a small Polish domain corpus of stock...
Article
Full-text available
Sentiment analysis is a very active and nowadays widely addressed research area. One of the problems in sentiment analysis is the classification of text in terms of its attitude, especially in reviews or comments from social media. In general, this problem can be solved by two different approaches: machine learning methods and lexicon-based methods. Methods ba...
Article
Full-text available
The paper offers a critical evaluation of the power and usefulness of an automatic prompt system based on the extended Relaxation Labelling algorithm in the process of (manual) mapping plWordNet on Princeton WordNet. To this end the results of manual mapping – that is inter-lingual relations between plWN and PWN synsets – are juxtaposed with the au...
Article
Self-organising Logic of Structures as a Basis for a Dependency-based Dynamic Semantics Model. We present Self-organising Logic of Structures (SLS), a semantic representation language of high expressive power, which was designed for a fully compositional representation of discourse anaphora following the Dynamic Semantics paradigm. The application...
Article
Full-text available
In the paper we present an extended version of the graph-based unsupervised Word Sense Disambiguation algorithm. The algorithm is based on the spreading activation scheme applied to the graphs dynamically built on the basis of the text words and a large wordnet. The algorithm, originally proposed for English and Princeton WordNet, was adapted to Po...
Conference Paper
Full-text available
Polish named entities are mostly out-of-vocabulary words, i.e. they are not described in morphological lexicons, and their proper analysis by Polish morphological analysers is difficult. The existing approaches to guessing unknown word lemmas and descriptions do not provide results on a satisfactory level. Moreover, lemmatisation of multiword named...
Conference Paper
A corpus-based Measure of Semantic Relatedness can be calculated for every pair of words occurring in the corpus, but it can produce erroneous results for many word pairs due to accidental associations derived on the basis of several context features. We propose a novel idea of a partial measure that assigns relatedness values only to word pairs we...
Article
A wordnet is many things to many people: a graph of inter-related lexicalised concepts, a taxonomy, a thesaurus, and so on. A wordnet makes good sense as the mainstay of any deep automated semantic analysis of text. We have begun the construction of a multi-component, multi-use toolkit of natural language processing tools with plWordNet, a very lar...
Conference Paper
Full-text available
Lexicalised concepts are represented in wordnets by word-sense pairs. The strength of markedness is one of the factors which influence word use. Stylistically unmarked words are largely context-neutral. Technical terms, obsolete words, "officialese", slangs, obscenities and so on are all marked, often strongly, and that limits their use considerabl...
Article
A wordnet can be built on the basis of an existing foreign-language wordnet, or corpus data can be taken as the foundation for its construction. The first approach is simpler and more straightforward, which is why developers use it most often. However, this approach has a major drawback, chiefly that a resource built this way does not necessarily reflect the language for which it was...
Conference Paper
A method for mapping Wikipedia articles (treated as a large informal resource describing concepts) to wordnet synsets (relation-based word meaning descriptions) is proposed. In contrast to previous approaches, the method focuses mainly on wordnet relation structure as the main means for meaning description. The description is supplemented by knowle...
Conference Paper
A method for the recognition of the compositionality of Multi Word Expressions (MWEs) is proposed. First, we study associations between MWEs and the structure of wordnet lexico-semantic relations. A simple method of splitting plWordNet’s MWEs into compositional and non-compositional on the basis of the hypernymy structure is discussed. However, our...
Article
Sentiment analysis is a relatively new engineering problem in the domain of Natural Language Processing. Its crucial tools are sentiment polarities assigned to synsets (synonym sets) corresponding to abstract meanings existing in natural language. Synsets, together with their lexico-semantic relations, are the essential components of every WordNet....
Article
Full-text available
Wordnets are built of synsets, not of words. A synset consists of words. Synonymy is a relation between words. Words go into a synset because they are synonyms. Later, a wordnet treats words as synonymous because they belong in the same synset … Such circularity, a well-known problem, poses a practical difficulty in wordnet construction...
Book
The ever-growing popularity of Google over the recent decade has required a specific method of man-machine communication: human query should be short, whereas the machine answer may take a form of a wide range of documents. This type of communication has triggered a rapid development in the domain of Information Extraction, aimed at providing the a...
Article
Full-text available
The paper presents WordNetLoom – an application for WordNet development used in the construction of a Polish WordNet called plWordNet. WordNetLoom provides two means of interaction: a form-based, implemented initially, and a graph-based introduced recently. The graphical, active presentation of WordNet structure enables direct work on the structure...
Conference Paper
Full-text available
We report on our efforts aimed at building an Open Domain Question Answering system for Polish. Our contribution is twofold: we gathered a set of question-answer pairs from various Polish sources and we performed an empirical evaluation of two re-ranking methods. The gathered collection contains factoid, list, non-factoid and yes-no questions, whic...
Article
The paper presents a wordnet expansion algorithm, which is based on lexico-semantic relations extracted from large text corpora. We do not assume that the extracted relation instances (i.e. word pairs) are described by probabilities. Thus, results produced by any method, including pattern-based and Distributional Semantics approaches can be used. T...