Stella Markantonatou
Institute for Language and Speech Processing | ISLP · Department of Linguistic Resources and Lexicography
PhD in Linguistics
Publications: 63
Reads: 11,226
Citations: 464
Publications (63)
Small cultural institutions play a vital role in preserving and promoting cultural heritage. To ensure their long-term viability and growth, these institutions must undergo a digital transformation. However, they often face significant challenges due to limited resources and a lack of technical expertise. This paper delves into the crucial role of...
Pomak is a non-standardised, endangered language variety of the East South Slavic dialect continuum. This article presents an online resource of 165 Pomak verbal multiword expressions collected via fieldwork. The resource has been developed with IDION, which is a web-based environment for the documentation of a wide range of multiword properties. T...
ERIS, a lexical resource of Modern Greek for offensive language detection, is the result of cleansing, enriching and assigning graded offensiveness values to the EL branch of HurtLex. ERIS contains 1148 entries and is openly available. Graded values were obtained with the Best-Worst Scaling method that was applied with the Litescale tool. Nouns and...
Pomak is an endangered oral Slavic language of Thrace/Greece. We present a short description of its interesting morphological and syntactic features in the UD framework. Because the morphological annotation of the treebank takes advantage of existing resources, it requires a different methodological approach from the one adopted for syntactic annot...
The Philotis Project develops a platform for the multimodal documentation of living languages; the documentation materials are processed with state-of-the-art NLP technology that renders them suitable for downstream applications. Philotis supports language documentation practitioners by automating the required technical processes so that they can f...
We present a cleansed version of the Modern Greek branch of the multilingual lexicon HURTLEX. The new version contains 737 offensive words. We worked bottom-up in two annotation rounds and developed detailed diagnostics of "offensiveness" by cross-classifying words on three dimensions: context, reference, and thematic domain. Our work reveals a wid...
The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage instantiated normalized morphological inflection tables for hundreds of diverse world languages. The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data i...
The PHILOTIS project is developing a platform to enable researchers of living languages to easily create and make available state-of-the-art spoken and textual annotated resources. As a case study we use Greek and Pomak, the latter being an endangered oral Slavic language of the Balkans (including Thrace/Greece). The linguistic documentation of Pom...
In recent decades, there has been significant investment in the incorporation of games into educational practice. This has taken the form of either game-based learning or serious gaming. A literature review on gaming and education yields numerous works tackling different aspects of the approach. Even a simple search on the Web on gaming...
The syntactic and semantic analyses of 2,500 dish names retrieved from 112 restaurant, tavern, and patisserie menus in Eastern Macedonia and in Thrace in Northern Greece show that only a small number of concepts are denoted by the heads of these noun phrases (NPs): Main Ingredient (MI) of a dish, Way of preparation, Part or Cuts (for MIs with an an...
We present AΜAΛΘΕΙA (AMALTHIA), an application ontology that models the domain of dishes as they are presented in 112 menus collected from restaurants/taverns/patisseries in East Macedonia and Thrace in Northern Greece. AΜAΛΘΕΙA supports a tourist mobile application offering multilingual translation of menus, dietary and cultural information about...
Automatic image-based food recognition is a particularly challenging task. Traditional image analysis approaches have achieved low classification accuracy in the past, whereas deep learning approaches enabled the identification of food types and their ingredients. The contents of food dishes are typically deformable objects, usually including compl...
It may be the case that the world is gradually becoming global (and somehow unified), but tourists are more and more looking for experiences based on ‘divergence’, on destination identity and culture. One such strong ‘diversity’ feature is the regional gastronomy. According to Y. Perdomo of UNWTO, each dish conveys a story and each ingredient relat...
This is a study on language and action that tries to shed light on their conceptual correspondence in terms of embodiment. The linguistic phenomenon of lexical aspect/Aktionsart is studied in connection to joint angles and time. For the purposes of this research, data concerning the usage of a set of Modern Greek verbs were collected and annotated;...
In a previous article, the authors came up with a list of what they considered 10 challenges that would define the area of digital humanities at large and their evolution in the next years. In the almost two years since the publication of that paper, however, they have come to see the need for relating the challenges for digital...
This article describes ongoing work on the design of an online application to support standardized classification of collections of folk exhibits and contribute to the collections' management and promotion. The rationale behind this is the normalization of comparable and multilingual controlled terminologies and their parallelization with recognize...
We describe ongoing work on the design of an online application to support the standardized classification of a large volume of folk exhibits collections and contribute to their management and promotion. The rationale behind our work is the normalization of comparable and multilingual controlled terminologies and their parallelization with internat...
We shed light on aspects of the relation between the semantics and the syntactic flexibility of multiword expressions (MWE) by investigating fixed adjective similes (FS), a predicative MWE class not studied in this respect before. We work on Modern Greek data and find that only a subset of the observed syntactic structures is related with idiomaticity...
This paper is about fixed similes of the type ADJECTIVE + CONNECTOR + NOUN.
Regardless of whether one supports Digital Humanities as a discipline in its own right, ‘traditional’ Humanities are transforming with the incorporation of computational approaches. In this short position paper, we outline ten challenges that we consider important and propose to kick off an in-depth dialog for the future shaping of Digital Humanities, wi...
Coupling culture and education has attracted significant attention and pushed towards the replacement of the typical STEM model with STEAM. An effective integration of culture into everyday educational practice, empowered by game-based storytelling, has already shown great potential in transforming the way people are exposed to and grasp knowledge...
In the course of developing facilities for integrating cultural heritage in the everyday education practice, highly structured information was retrieved from both the structured and the unstructured Europeana documentation contributed by the Greek cultural institutions (~480K entries); Modern Greek is the working language. Satisfactory results were...
Open linked data technologies pave the way towards the semantic Web of the future by a) exploiting the abundance in data availability, b) enhancing the continuing application developments in the Web and computer technologies, c) increasing the availability of game engines towards an expansion of techniques and d) bridging culture and education with...
Continuous developments in the web and computer technologies along with an increasing availability of game engines contribute to an expansion of techniques that bridge culture and education with gaming. In addition, open linked data technologies pave the way towards the semantic web of the future by exploiting the abundance in data availab...
The continuous development of web services and computer infrastructures, complemented by the increasing availability of game development software engines, contributes to an ongoing expansion in the release of serious games (SG) in diverse areas, ranging from entertainment, cultural heritage (CH), education, artificial intelligence (AI),...
Technological innovations have rapidly increased over recent years, as has e-learning usage, and thus museums have increased e-learning investment in order to adapt their services in a better and more efficient way for their visitors. While museums offer a diverse range of personal digital collections systems on their websites it...
For nearly two decades there has been an increased interest regarding the exploitation of digital games in both formal and informal learning contexts with educators mainly focusing on ways of integrating them into their everyday teaching practices and researchers investigating their potential use as effective learning environments. According to exi...
The proposed paper reports on work in progress aimed at the development of a conceptual lexicon of Modern Greek (MG) and the encoding of MWEs in it. Morphosyntactic and semantic properties of these expressions were specified formally and encoded in the lexicon. The resulting resource will be applicable for a number of NLP applications.
METIS-II was an EU-FET MT project running from October 2004 to September 2007, which aimed at translating free text input without resorting to parallel corpora. The idea was to use “basic” linguistic tools and representations and to link them with patterns and statistics from the monolingual target-language corpus. The METIS-II project has four par...
In this paper we describe the METIS-II system and its evaluation on each of the language pairs: Dutch, German, Greek, and Spanish to English. The METIS-II system envisaged developing a data-driven approach in which no parallel corpus is required and in which no full parser or extensive rule sets are needed. We describe the evaluation on a developme...
In this paper, we explain why we have adopted pattern matching for MT purposes and why we have embedded it into a hybrid approach. "Patterns" here are understood as independent meaningful sub-sentential segments received in a systematic way. We describe the nature and size of the patterns used as well as the comparison algorithm developed. We d...
The innovative feature of the system presented in this paper is the use of pattern-matching techniques to retrieve translations resulting in a flexible, language-independent approach, which employs a limited amount of explicit a priori linguistic knowledge. Furthermore, while all state-of-the-art corpus-based approaches to Machine Translation (MT)...
In this paper we describe a machine translation prototype in which we use only minimal resources for both the source and the target language. A shallow source language analysis, combined with a translation dictionary and a mapping system of source language phenomena into the target language and a target language corpus for generation are all the r...
METIS-II, the MT system presented in this paper, does not view translation as a transfer process between a source language (SL) and a target one (TL), but rather as a matching procedure of patterns within a language pair. More specifically, translation is considered to be an assignment problem, i.e. a problem of discovering each time the best ma...
In this paper an innovative approach is presented for MT, which is based on pattern matching techniques, relies on extensive target language monolingual corpora and employs a series of similarity weights between the source and the target language. Our system is based on the notion of 'patterns', which are viewed as 'models' of target language s...
This article describes a method for discriminating among registers of Modern Greek and among authors within a given register. Two issues have been investigated: (a) whether register discrimination can successfully exploit linguistic information reflecting the evolution of a language (such as diglossia features of the Modern Greek language) and (b)...
This article describes a method for discriminating among authors within a given register of Modern Greek. The focus here is to determine to what extent the stylistic differences among authors can be detected with a high degree of accuracy for a set of texts belonging to a well‐defined register. To that end, the chosen register is characteriz...
We report on the application of the Self-Organizing Map (SOM) classification method to the task of categorizing texts according to their register and the style of their author. The SOM has been selected as its performance in various data-mining applications has been found to be highly successful. Here, the method is evaluated against the task of cl...
Horizontal redundancy is inherent to lexica consisting of descriptions of fully formed objects. This causes an unwelcome expansion of the lexical database and increases parsing time. To eliminate it, direct relations between descriptions of fully formed objects are often defined.
This article investigates (a) whether register discrimination can successfully exploit linguistic information reflecting the evolution of a language (such as the diglossia phenomenon of the Modern Greek language) and (b) what kind of linguistic information and which statistical techniques may be employed to distinguish among individual styles within...
In the present paper, the Self-Organising Map (SOM) is applied to the problem of categorising a corpus of Modern Greek texts according to the style of their authors. A number of variants of the SOM model are used in a series of experiments, in order to compare and contrast their behaviour in the specific task. The experimental results indicate that...
This paper focusses on the English adjectival resultative construction, as exemplified in examples (1)-(4). (1) John hammered the metal flat. (2) The river froze solid. (3) The dog barked itself hoarse. (4) The dog barked the neighbours awake. Result predication occurs with both transitive and intransitive verbs. In the transitive case, the result...
Substantial formal grammatical and lexical resources exist in various NLP systems and in the form of textbook specifications. In the present paper we report on experimental results obtained in manual, semi-automatic and automatic migration of entire computational or textbook descriptions (as opposed to a more informal reuse of ideas or the design o...
We report on the first, to the best of our knowledge, attempt to define the core linguistic and formatting style specifications for controlled Modern Greek and develop an authoring tool (controlled language checker) in the context of the project "SCHEMATOPOIESIS". The tool is both parametric, in order to accommodate various thematic domains, and ex...
In this article the principles of the METIS Machine Translation system are presented. METIS employs an extensive tagged and lemmatised corpus of texts in the target language, coupled with bilingual lexica covering the desired pairs of source-target languages. To generate a high-quality translation, the METIS system is provided with statistica...
The documentation and analysis of Byzantine Art is an important component of the overall effort to maintain cultural heritage and contributes to learning and comprehending one's history traversal path. Efficient publishing of the multi-dimensional and multifaceted information that is necessary for the complete documentation of artworks should draw o...
In this paper we report on the set of controlled language specifications defined for Modern Greek and the development of the respective style checker. We will focus on the effectiveness and suitability of these specifications by assessing the performance of a commercial machine translation system over controlled texts and will comment on the ev...
In the present article, a hybrid approach is proposed for implementing a machine translation system using a large monolingual corpus coupled with a bilingual lexicon and basic NLP tools. In the first phase of the METIS system, a source language (SL) sentence, after being tagged, lemmatised and translated by a flat lemma-to-lemma lexicon, was ma...
In this article, a system is proposed for the automatic style categorisation of text corpora in the Greek language. This categorisation is based to a large extent on the type of language used in the text, for example whether the language used is representative of formal Greek or not. To arrive at this categorisation, the highly inflectional nature...