Guntis Barzdins

University of Latvia | LU · Institute of Mathematics and Computer Science

PhD

About

64
Publications
11,630
Reads
563
Citations
Citations since 2017
17 Research Items
273 Citations
Additional affiliations
January 1989 - present
University of Latvia
Position
  • Senior Researcher

Publications (64)
Conference Paper
Full-text available
LNCC is a diverse collection of Latvian language corpora representing both written and spoken language and is useful for both linguistic research and language modelling. The collection is intended to cover diverse Latvian language use cases and all the important text types and genres (e.g. news, social media, blogs, books, scientific texts, debates...
Chapter
Full-text available
In the medical domain various approaches are used to produce examination reports and other medical records. Depending on the language-specific technology support, the type of examination, the size of the hospital or clinic, and other aspects, the reporting workflow can range from completely manual to (semi-)automated. A manual workflow may complete...
Chapter
Full-text available
We present a new ontology language Pini and the PiniTree ontology editor supporting it. Although the Pini language bears many similarities to RDF, UML class diagrams, Property Graphs and their frontends like Google Knowledge Graph and Protégé, it is a more expressive language enabling FrameNet-style natural language annotation for Atomised journal...
Chapter
Full-text available
This paper presents LVBERT – the first publicly available monolingual language model pre-trained for Latvian. We show that LVBERT improves the state-of-the-art for three Latvian NLP tasks including Part-of-Speech tagging, Named Entity Recognition and Universal Dependency parsing. We release LVBERT to facilitate future research and downstream applic...
Chapter
While a lot of work exists on text or keyword extraction from videos, not a lot can be found on the exact problem of extracting continuous text from scrolling tickers. In this work a novel Tesseract OCR based pipeline is proposed for location and continuous text extraction from scrolling tickers in videos. The solution worked faster than real time,...
Chapter
This paper describes a prototype system for partial automation of customer service operations of a mobile telecommunications operator with a human-in-the-loop conversational agent. The agent consists of an intent detection system for identifying the types of customer requests that it can handle appropriately, a slot filling information extraction s...
Preprint
It has long been speculated that deep neural networks function by discovering a hierarchical set of domain-specific core concepts or patterns, which are further combined to recognize even more elaborate concepts for the classification or other machine learning tasks. Meanwhile disentangling the actual core concepts engrained in the word embeddings...
Preprint
Full-text available
Clustering news across languages enables efficient media monitoring by aggregating articles from multilingual sources into coherent stories. Doing so in an online setting allows scalable processing of massive news streams. To this end, we describe a novel method for clustering an incoming stream of multilingual documents into monolingual and crossl...
Conference Paper
Full-text available
We present the first prototype of the SUMMA Platform: an integrated platform for multilingual media monitoring. The platform contains a rich suite of low-level and high-level natural language processing technologies: automatic speech recognition of broadcast media, machine translation, automated tagging and classification of named entities, semanti...
Article
Full-text available
In the era of Big Data and Deep Learning, there is a common view that machine learning approaches are the only way to cope with the robust and scalable information extraction and summarization. It has been recently proposed that the CNL approach could be scaled up, building on the concept of embedded CNL and, thus, allowing for CNL-based informatio...
Article
Full-text available
The paper steps outside the comfort zone of the traditional NLP tasks like automatic speech recognition (ASR) and machine translation (MT) to address two novel problems arising in the automated multilingual news monitoring: segmentation of the TV and radio program ASR transcripts into individual stories, and clustering of the individual stories c...
Conference Paper
Two extensions to the AMR smatch scoring script are presented. The first extension combines the smatch scoring script with the C6.0 rule-based classifier to produce a human-readable report on the frequency of error patterns observed in the scored AMR graphs. This first extension results in a 4% gain over the state-of-the-art CAMR baseline parser by adding...
Patent
The invention relates to data processing methods as well as methods of systematic multilingual lexicalization of object type properties in web ontology language (OWL) ontologies. The proposed computer-implemented method of defining the lexical form and the syntactic valence of OWL object type properties comprises instructions to be carried out in a...
Conference Paper
The OWLGrEd ontology editor allows graphical visualization and authoring of OWL 2.0 ontologies using a compact yet intuitive presentation that combines UML class diagram notation with textual Manchester syntax for expressions. We present an extension mechanism for OWLGrEd that allows adding custom information areas, rules and visual effects to the...
Conference Paper
Full-text available
Although human language technologies have a long history in Latvia, the Latvian language still belongs to under-resourced languages, as there are many gaps in basic language technologies and tools. However, despite difficulties, some of these gaps, for both resources and tools, have been filled in the last five years. The main goal of this paper is...
Conference Paper
Full-text available
The paper presents a FrameNet-based information extraction and knowledge representation framework, called FrameNet-CNL. The framework is used on natural language documents and represents the extracted knowledge in a tailor-made Frame-ontology from which unambiguous FrameNet-CNL paraphrase text can be generated automatically in multiple languages. T...
Conference Paper
Full-text available
We describe a novel way of creating information systems based on ontologies. The described solution is aimed at domain experts who would benefit from being able to quickly prototype a fully functional, web-based information system for data input, editing and analysis. The system's backbone is a SPARQL 1.1 endpoint that enables organization users to vie...
Conference Paper
Full-text available
Frame-semantic parsing is a kind of automatic semantic role labeling performed according to the FrameNet paradigm. The paper reports a novel approach for boosting frame-semantic parsing accuracy through the use of the C5.0 decision tree classifier, a commercial version of the popular C4.5 decision tree classifier, and manual rule enhancement. Addit...
Article
An Ethernet over IPv4 tunneling protocol is proposed, which categorizes all Ethernet frames to be tunneled into NICE and UGLY frames. The UGLY frames are tunneled by traditional methods, such as UDP or GRE encapsulation, resulting in substantial overhead due to additional headers and fragmentation usually required to transport long Ethernet frames...
Conference Paper
Full-text available
In this paper we present ongoing research investigating the possibility and potential of integrating frame semantics, particularly FrameNet, in the Grammatical Framework (GF) application grammar development. An important component of GF is its Resource Grammar Library (RGL) that encapsulates the low-level linguistic knowledge about morphology an...
Article
Full-text available
The developers of StarDog OWL/RDF DBMS have pioneered a new use of OWL as a schema language for RDF databases. This is achieved by adding integrity constraints (IC), also expressed in OWL syntax, to the traditional "open-world" OWL axioms. The new database paradigm requires a suitable visual schema editor. We propose here a two-level approach for i...
Conference Paper
Full-text available
The paper presents ongoing research that aims at OWL ontology authoring and verbalization using a deterministic controlled natural language (CNL) that would be as natural and intuitive as possible. Moreover, we focus on a multilingual CNL interface to OWL by considering both highly analytical and highly synthetic languages (namely, English and L...
Data
Full-text available
The paper presents ongoing research that aims at OWL ontology authoring and verbalization using a deterministic controlled natural language (CNL) that would be as natural and intuitive as possible. Moreover, we focus on a multilingual CNL interface to OWL by considering both highly analytical and highly synthetic languages (namely, English and L...
Conference Paper
Full-text available
The presented tool uses a novel approach to explore and query a SPARQL endpoint. The tool is simple to use as a user needs only to enter an address of a SPARQL endpoint of one’s interest. The tool will extract and visualize graphically the data schema of the endpoint. The user will be able to overview the data schema and use it to construct a SPAR...
Conference Paper
Full-text available
The dependency approach, originally developed by Lucien Tesnière, has become a popular model of syntactic representation. However, the state-of-the-art dependency parsers and annotation schemes typically discard some relevant features of Tesnière's original model, retaining only the concept of dependency relations between individual words. The...
Conference Paper
Full-text available
There have been several attempts to visualize OWL ontologies with UML-style diagrams. Unlike the ODM approach of defining a UML profile for OWL, we propose an extension to UML class diagrams (hard extension) that allows a more compact OWL visualization. The compactness is achieved through the native power of UML class diagrams extended with optional Ma...
Conference Paper
Full-text available
This collaborative report highlights the properties and prospects of Controlled Natural Languages (CNLs). The report poses a range of questions concerning the goals of the CNL, the design, the linguistic aspects, the relationships and evaluation of CNLs, and the application tools. In posing the questions, the report attempts to structure the field...
Conference Paper
Full-text available
Computational semantics and logic-based controlled natural languages (CNL) do not address systematically the word sense disambiguation problem of content words, i.e., they tend to interpret only some functional words that are crucial for construction of discourse representation structures. We show that micro-ontologies and multi-word units allow in...
Article
Full-text available
Ontological Re-engineering of Medical Databases. This paper describes data export from multiple medical databases (relational databases) into a single shared Medical Data Warehouse (RDF database structured according to an integrated OWL ontology). The exported data is conveniently accessible via SPARQL or via graphical query language ViziQuer based...
Chapter
This chapter introduces the UML profile for OWL as an essential instrument for bridging the gap between the legacy relational databases and OWL ontologies. We address one of the long-standing relational database design problems where the initial conceptual model (a semantically clear domain conceptualization ontology) gets “lost” during conversion into...
Chapter
This chapter introduces the UML profile for OWL as an essential instrument for bridging the gap between the legacy relational databases and OWL ontologies. We address one of the long-standing relational database design problems where the initial conceptual model (a semantically clear domain conceptualization ontology) gets “lost” during conversion into...
Conference Paper
Full-text available
A representation of FrameNet as a 4D multidimensional ontology is proposed in the paper. This novel representation makes it possible both to re-create the FrameNet ontology from semantically annotated texts and to use this representation for semantic annotation of new texts. A further extension of this approach with a 5th dimension for anaphora annotation is d...
Conference Paper
Full-text available
In this paper we show how semantic web technologies are used in a real application in the domain of national medical databases where an important technological gap between the legacy relational databases and OWL ontologies is bridged by the recently standardized UML profile for OWL. After data has been exported from multiple relational databases into...
Conference Paper
Full-text available
Word sense disambiguation (WSD) and methods for discourse representation of the parsed text are among the most difficult tasks in computational linguistics today. Without providing a satisfactory solution to these problems, the true automated semantic processing of texts, as envisioned by semantic web, machine translation, or information re...
Conference Paper
Full-text available
Although phrase structure grammars have turned out to be a more popular approach for analysis and representation of the natural language syntactic structures, dependency grammars are often considered as being more appropriate for free word order languages. While building a parser for Latvian, a language with a rather free word order, we found (simi...
Conference Paper
Full-text available
We present a new Protégé plugin for constructing a minimal satisfiability model of an OWL ontology and visualizing it in the original music score notation.
Conference Paper
Full-text available
We present an original Protégé plugin developed for the deep consistency checking of OWL ontologies. The plugin constructs and visualizes a minimal satisfiability model of the ontology, which is likely to uncover potential ontological errors: if the constructed model contradicts the author's intentions, then the ontology itself is either wrong...
Conference Paper
Full-text available
Re-engineering of successful pre-OWL ontologies or other formal ER or UML system models towards OWL DL compliance opens new possibilities in ontology debugging, enabled by the formal semantics and automated reasoners developed for OWL DL. Meanwhile the transformation of legacy ontologies to OWL DL is a challenging and interesting task, which we ill...
Chapter
The method announced in this paper develops the inductive approach for implementing algebraic ADTs. We describe a framework for this approach and demonstrate it and its problems on simple examples. The concept of the partial model seems to us to be a very interesting object for inductive synthesis as there exist good inductive s...
Article
Full-text available
Tim Berners-Lee and co-authors in their seminal paper "The Semantic Web", published in 2001, outlined their vision of the future Semantic Web. But today we are still far from the implementation of this vision. Despite fundamental achievements, like the definition of OWL (Web Ontology Language) and rapid progress of RDF/OWL content creation, storage...
Article
Address Resolution Protocol (ARP) is one of the key TCP/IP stack protocols, used on LANs to map 32-bit IP addresses into 48-bit hardware addresses. Regular ARP uses MAC layer broadcasts to perform the mapping. In this paper a new server-based ARP extension (smartARP) is proposed, which allows the extension of ARP functionality beyond a single MAC l...
Conference Paper
Our goal over several years has been the development of an efficient search algorithm for inductive inference of expressions using only input/output examples. The idea is to avoid exhaustive search by taking full advantage of the semantic equality of many considered expressions. This might be the way that people avoid too large a search when findi...
Conference Paper
Given several input/output examples of some function we can state the problem: what is the simplest function which complies with these examples. This problem is well studied and is known to be very hard in the general case. In this paper we address a special case of the problem, when the target function can be expressed as a simple composition of k...
Conference Paper
A fast algorithm for inductive synthesis of term rewriting systems is described and proved to be correct. It is implemented and successfully applied to the inductive synthesis of different algorithms, including binary multiplication. The proposed algorithm supports an automatic learning process and can be used for designing and implementing ADTs.
Conference Paper
There exists a fast algorithm [2] for inductive synthesis of terminating and ground confluent term rewriting systems from samples. The principles of this algorithm and the methodology of its use for implementation and completion of abstract data types are described.
Article
Full-text available
It is well known that end-users have problems writing even simple SQL queries. The new SPARQL query language for RDF databases is a step in the right direction, but is still not suitable for end-users. This led us to create a more convenient approach in which end-users could retrieve structured data from the database through a graphical query l...
Article
Full-text available
There are two approaches to the natural language processing – one is going in width to cover at shallow level (parsing, syntax) the rich linguistic variety found in the natural language, while another is going in depth (semantics, discourse structure) for a monosemous subset of natural language referred to as a controlled natural language (CNL)...
Article
Full-text available
Two years ago we presented a unified conception of "Semantic Latvia" which would make it possible for a small country like Latvia to take advantage of the emerging Semantic Web technologies. In this paper we show how this approach is starting to materialize into a real application in the domain of national medical databases thanks to the important...
Article
Full-text available
There have been many attempts to visualize OWL ontologies but none of them is considered completely satisfactory and this is still an open problem. We propose a UML style graphical editor for OWL which not only visualizes ontologies using extended UML class diagram notation but also provides ontology editing facilities unavailable in most of the ot...
