
Gunta Nešpore-Bērzkalne- University of Latvia
Gunta Nešpore-Bērzkalne
- University of Latvia
About
22
Publications
4,163
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
154
Citations
Introduction
Current institution
Publications
Publications (22)
Rakstā latviešu valodas salīdzinājuma konstrukcijas apskatītas atbilstoši tipoloģiskajā valodniecībā izmantotajai konstrukciju klasifikācijai gradācijas salīdzinājuma, pielīdzinājuma un vienlīdzības konstrukcijās. Katrā konstrukciju veidā parādīti latviešu valodā izmantotie valodas līdzekļi, noteiktas konstrukcijas sastāvdaļas un tām raksturīgie ie...
The electronic dictionary Tezaurs.lv contains more than 400,000 entries from which 73,000 entries are multi-word expressions (MWEs). Over the past two years, there has been an ongoing division of these MWEs into subgroups (proper names, multi-word terms, taxa, phraseological units, collocations). The article describes the classification of MWEs, fo...
Latvijas Universitātes Matemātikas un informātikas institūtā tiek veidots „Latviešu valodas sintaktiski marķētais korpuss” (LVTB), kurā tiek marķētas gan latviešu valodas sintakses teorijā jau aprakstītās latviešu valodas sintaktiskās parādības, gan arī retākas, līdz šim gramatikās sīkāk neanalizētas konstrukcijas. Šajā rakstā aplūkota vārdkopas an...
We present a new ontology language Pini and the PiniTree ontology editor supporting it. Despite Pini language bearing lot of similarities with RDF, UML class diagrams, Property Graphs and their frontends like Google Knowledge Graph and Protégé, it is a more expressive language enabling FrameNet-style natural language annotation for Atomised journal...
We propose an approach for generating an accurate and consistent PropBank-annotated corpus, given a FrameNet-annotated corpus which has an underlying dependency annotation layer, namely, a parallel Universal Dependencies (UD) treebank. The PropBank annotation layer of such a multi-layer corpus can be semi-automatically derived from the existing Fra...
This paper presents a work in progress to create a multilayered syntactically and semantically annotated text corpus for Latvian. The broad application area we address is natural language understanding (NLU), while more specific applications are abstractive
text summarization and knowledge base population, which are required by the project industri...
This paper presents a work in progress, creating a FrameNet-annotated text corpus for Latvian. This is a part of a larger project which aims at the creation of a multilayered corpus, anchored in cross-lingual state-of-the-art syntactic and semantic representations: Universal Dependencies (UD), FrameNet and PropBank, as well as Abstract Meaning Repr...
The development of a verb valency lexicon for Latvian has been recently started. The chosen approach combines and supplements the experience of similar lexical resources developed for other languages. The paper describes our approach to the verb valency annotation—the valency layers (syntactic and semantic valency, selectional restrictions) and the...
Anotacija
CONTENT WORDS IN THE FORMAL ANALYSIS OF LATVIAN
Summary
To characterize morphological features of each word in a text, a set of morphological features for Latvian has been defined. It describes grammatical categories and their possible values characteristic of a particular part of speech or a smaller group of words.
To define this set i...
In this paper we demonstrate a hybrid treebank encoding format, derived from the dependency-based format used in Prague Dependency Treebank (PDT). We have specified a Prague Markup Lan-guage (PML) profile for the SemTi-Kamols hybrid grammar model that has been developed for languages with rela-tively free word order (e.g. Latvian). This has allowed...
In this paper we describe preparatory work for constructing a Treebank for Latvian as no such resource currently exists. Previously elaborated SemTi-Kamols hybrid dependency based grammar model has been extended to make it appropriate for broad coverage text annotation. We also have integrated extended SemTi-Kamols model with graphical tree editor...
Controlled natural languages (mostly English-based) recently have emerged as seemingly informal supplementary means for OWL ontology authoring, if compared to the formal notations that are used by professional knowledge engineers. In this paper we present by examples controlled Latvian language that has been designed to be compliant with the state...
The dependency approach, originally developed by Lucien Tesnière, has become a popular model of syntactic representation. However, the state-of-the-art dependency parsers and annotation schemes typically discard some relevant features of the original Tesnière's model, retaining only the concept of dependency relations between individual words. The...
Representation of FrameNet as a 4D multidimensional ontology is proposed in the paper. This novel representation allows both to re-create FrameNet ontology from semantically annotated texts, as well as to use this representation for semantic annotation of new texts. Further extensions of this approach with 5th dimension for anaphora annotation is d...
Word sense disambiguation (WSD) along with methods for discourse representation of the parsed text, are among the most difficult tasks in computational linguistics today. Without providing a satisfactory solution to these problems, the true automated semantic processing of texts, as envisioned by semantic web, machine translation, or information re...
Although phrase structure grammars have turned out to be a more popular approach for analysis and representation of the natural language syntactic structures, dependency grammars are often considered as being more appropriate for free word order languages. While building a parser for Latvian, a language with a rather free word order, we found (simi...