Katerina Pastra

Athena-Research and Innovation Center in Information, Communication and Knowledge Technologies · Institute for Language and Speech Processing

PhD
Semantic technologies and cognition

About

44
Publications
6,372
Reads
378
Citations
Introduction
Working on the computational integration of Language, Perception and Action, from an Embodied and Enactive Cognition perspective.
Research lines:
  • Semantic Memory (knowledge representation and reasoning, knowledge graphs for going beyond one-shot learning in robotics and intelligent systems)
  • Action grammar (supramodal syntax bridging language and sensorimotor experience)
  • Object affordances through language
  • Cross-modal semantic relations
Applications: Robotics, Education, Health
Additional affiliations
April 2018 - December 2020
American College of Greece
Position
  • Professor (Associate)
Description
  • Teaching in the Data Science MSc: Natural Language Processing; Search Engines and Web Mining
December 2014 - December 2018
University of Plymouth
Position
  • Professor (Associate)
March 2011 - January 2020
CSRI
Position
  • Managing Director
Education
December 2000 - December 2004
The University of Sheffield
Field of study
  • Artificial Intelligence
September 1999 - September 2000
The University of Manchester
Field of study
  • Language Engineering
October 1995 - September 1999

Publications (44)
Article
Full-text available
Language and action have been found to share a common neural basis and in particular a common 'syntax', an analogous hierarchical and compositional organization. While language structure analysis has led to the formulation of different grammatical formalisms and associated discriminative or generative computational models, the structure of action i...
Article
Full-text available
Though everyday interaction is predominantly multimodal, a purpose-developed framework for describing the semantic interplay between verbal and non-verbal communication is still lacking. This lack not only indicates one's poor understanding of multimodal human behaviour, but also weakens any attempt to model such behaviour computationally. In this articl...
Conference Paper
Full-text available
At a computational level, language processing tasks are traditionally processed in a language-only space/context, isolated from perception and action. However, at a cognitive level, language processing has been shown experimentally to be embodied, i.e. to inform and be informed by perception and action. In this paper, we argue that embodied cogniti...
Article
Full-text available
In the longstanding effort of defining object affordances, a number of resources have been developed on objects and associated knowledge. These resources, however, have limited potential for modeling and generalization mainly due to the restricted, stimulus-bound data collection methodologies adopted. To-date, therefore, there exists no resource th...
Article
Full-text available
While vision-language integration is important for a wide range of Artificial Intelligence (AI) prototypes and applications, the notion of integration has not been established within a theoretical framework that would allow for more thorough research on the issue. In this paper, we attempt to explore the reasons that dictate this content integratio...
Chapter
We present the results of a comparative study among four well-known web mapping services. The study explored how digital natives (young people under 30 years old who were born and raised with technology and smart devices) interact with web mapping services. The sample consisted of 167 University students in the field of Engineering and was conducte...
Chapter
Full-text available
What knowledge needs to be learned to acquire a novel task? What background knowledge does an agent need to use newly acquired knowledge effectively? This chapter considers the functional roles of knowledge in task learning. These roles of knowledge span interaction with other entities and the environment and core functional capabilities of the rea...
Article
Full-text available
Multimodal information fusion at both the signal and semantics level is a core part of most multimedia applications, including indexing, retrieval, and summarization. Prototype systems have implemented early or late fusion of modality-specific processing results through various methodologies including rule-based approaches, information-theoretic mod...
Article
Full-text available
Research related to computational modeling for machine-based understanding requires ground truth data for training, content analysis, and evaluation. In this paper, we present a multimodal video database, namely COGNIMUSE, annotated with sensory and semantic saliency, events, cross-media semantics, and emotion. The purpose of this database is manif...
Conference Paper
Full-text available
Multimodal information fusion both at the signal and the semantics levels is a core part in most multimedia applications, including multimedia indexing, retrieval, summarization and others. Early or late fusion of modality-specific processing results has been addressed in multimedia prototypes since their very early days, through various methodolog...
Chapter
Full-text available
Spatial thinking has lately been acknowledged as an important ability both for sciences and for everyday life. There is a clear need for enhancing spatial thinking in education and engaging both educators and learners in more critical, inquiry-based teaching and learning methods. In this context, GEOTHNK project is a European effort to propose a sc...
Conference Paper
Full-text available
In this paper we present a movie summarization system and we investigate what composes high quality movie summaries in terms of user experience evaluation. We propose state-of-the-art audio, visual and text techniques for the detection of perceptually salient events from movies. The evaluation of such computational models is usually based on the co...
Technical Report
Full-text available
This technical report provides guidance on annotating semantic relations mainly between language, images and sounds as they occur in naturalistic contexts, such as audiovisual material and follow the COSMOROE cross-media semantics framework. COSMOROE (CMR) is a descriptive framework for modeling the semantic interplay between different means of ex...
Technical Report
Full-text available
This issue of the AMD Newsletter is very special: we are celebrating its 10 years of biannual publication! It has progressively become a place of lively and inspiring scientific dialogues spanning an incredible network of topics, with the contributions of many key actors of computational developmental sciences. In this volume, a new dialog has been...
Conference Paper
Full-text available
Geospatial thinking is a newly acknowledged ability with profound and rewarding effects on numerous aspects of everyday life and science - from giving and following directions and interpreting maps and diagrams, to achieving innovation in STEM disciplines. The GEOTHNK approach aims at enhancing geospatial thinking skills and engaging users in meani...
Article
Full-text available
We often use tactile-input in order to recognize familiar objects and to acquire information about unfamiliar ones. We also use our hands to manipulate objects and utilize them as tools. However, research on object affordances has mainly been focused on visual-input and, thus, limiting the level of detail one can get about object features and uses....
Conference Paper
Full-text available
A good data corpus lies at the heart of progress in both perceptual/cognitive science and in computer vision. While there are a few datasets that deal with simple actions, creating a realistic corpus for complex, long action sequences that contains also human-human interactions has so far not been attempted to our knowledge. Here, we introduce such...
Conference Paper
Full-text available
Natural language use, acquisition, and understanding takes place usually in multisensory and multimedia communication environments. Therefore, for one to model language in its interaction and integration with sensorimotor experiences, one needs a representative corpus of such interplay. In this paper, we will present the first corpus of language us...
Chapter
Full-text available
Everyday interaction between people abounds in combinations and associations of language, movement and image. Even the simplest verbal communication requires interlocutors with the ability to perceive and understand movement, language and the image of the surrounding world. For example, a simple command such as "bring me the glasses" requires knowledge of the...
Conference Paper
Full-text available
The growing popularity of multimedia documents requires language technologies to approach automatic language analysis and generation from yet another perspective: that of its use in multimodal communication. In this paper, we present a support tool for COSMOROE, a theoretical framework for modelling multimedia dialectics. The tool is a text...
Book
Full-text available
How does meaning emerge in caricatures? What is the role of sketch and natural language in the “language” of caricatures? The current study investigates, for the first time internationally, the basic mechanisms of sketch and language integration through which the “magic” of the caricature meaning emerges. The main objective of the study is the inv...
Article
Full-text available
The explosion of multimedia digital content and the development of technologies that go beyond traditional broadcast and TV have rendered access to such content important for all end-users of these technologies. While originally developed for providing access to multimedia digital libraries, video search technologies assume now a more demanding rol...
Conference Paper
Full-text available
State of the art artificial agents rely heavily on human intervention for performing vision-language integration; apart from being cost and effort effective, this intervention deprives artificial agents from the ability to react intelligently and to show intentionality when engaged in situated multimodal communication. In this paper, we suggest a...
Conference Paper
Full-text available
The explosion of multimedia digital content and the development of technologies that go beyond traditional broadcast and TV have rendered access to such content important for all end-users of these technologies. REVEAL THIS develops content processing technology able to semantically index, categorise and cross-link multiplatform, multimedia and m...
Article
Full-text available
The ever growing popularity and availability of multimedia information has rendered automatic image-language association essential in a number of multimedia integration applications. Bridging the gap between the two media requires an appropriate feature-set for describing their common reference; one that will be both distinctive of the entities ref...
Article
Full-text available
In this paper, we look into the notion of cross-media decision mechanisms, focussing on ones that work within multimedia documents for a variety of applications, such as the generation of intelligent multimedia presentations and multimedia indexing. In order for these mechanisms to go beyond the identification of semantic equivalence relations bet...
Article
Full-text available
The explosion of multimedia digital content and the development of technologies that go beyond traditional broadcast and TV have rendered access to such content important for all end-users of these technologies. While originally developed for providing access to multimedia digital libraries, video search technologies assume now a more demanding rol...
Thesis
Full-text available
This thesis explores the issue of vision-language integration from the Artificial Intelligence perspective of building intentional artificial agents able to combine their visual and linguistic abilities automatically. While such a computational vision-language integration is a sine qua non requirement for developing a wide range of intelligent mult...
Article
Full-text available
Multimodal human to human interaction requires integration of the contents/meaning of the modalities involved. Artificial Intelligence (AI) multimodal prototypes attempt to go beyond technical integration of modalities to this kind of meaning integration that allows for coherent, natural, "intelligent" communication with humans. Though bringing man...
Article
Full-text available
The growing demand for intelligent multimedia systems has led to the development of various multimodal resources and corresponding annotation schemes and processing tools. In this paper, we argue that there is a striking lack of multimodal corpora capturing the association and interaction of visual and linguistic data. We relate this research lacun...
Article
Full-text available
This report reviews the work done by the University of Sheffield on SOCIS, Scene of the Crime Information System, under EPSRC grant GR/M89676/01. The work was done in close collaboration with a grant (GR/M89041) to Professor K. Ahmad of the University of Surrey. The overall aims of SOCIS were: to explore the link between language and vision computa...
Article
Full-text available
This paper presents work on text-based photograph indexing and retrieval for crime investigation, an application domain where efficient querying of large crime-scene photograph databases is of crucial importance. Automating this task will change current police practices considerably, by bringing ‘intelligence’ to crime support information systems....
Article
Full-text available
We present a text-based approach for the automatic indexing and retrieval of digital photographs taken at crime scenes. Our research prototype, SOCIS, goes beyond keyword-based approaches and methods that extract syntactic relations from captions; it relies on advanced Natural Language Processing techniques in order to extract relational facts. The...
Article
Full-text available
In this paper we attempt to apply the IBM algorithm, BLEU, to the output of four different summarizers in order to perform an intrinsic evaluation of their output. The objective of this experiment is to explore whether a metric, originally developed for the evaluation of machine translation output, could be used for assessing another type of output...
Conference Paper
Full-text available
In this paper we present a new approach to the automatic semantic indexing of digital photographs based on the extraction of logic relations from their textual descriptions. The method is based on shallow parsing and propositional analysis of the descriptions using an ontology for the domain of application. We describe the semantic representation f...
Article
Full-text available
The Scene of Crime Information System's automatic image-indexing prototype goes beyond extracting keywords and syntactic relations from captions. The semantic information it gathers gives investigators an intuitive, accurate way to search a database of cases for specific photographic evidence. Intelligent, automatic indexing and retrieval of crime...
Article
Full-text available
In this paper, we investigate whether reusing existing grammars for NE recognition instead of creating them from scratch is a viable solution to time constraints in developing grammars. We discuss three possible factors that hinder grammar reuse and we present our corresponding empirical results, that encourage more widespread use of valuable exist...
Thesis
Full-text available
Nowadays, knowledge is considered the greatest asset of all. Managing this asset requires, among other things, efficient information systems that will support accessing, analysis, classification, storing of data and its transformation into useful information, leading therefore to knowledgeable decision making. Information Extraction is a core Lang...
Article
Full-text available
In this paper, we trace the different manifestations of grounding resources in a variety of Artificial Intelligence applications and present the POETICON project's development plans for building such resource. In doing so, we introduce the PRAXICON, a resource that links natural language and sensorimotor representations of concepts, with the aim o...

Projects (7)
Project
To provide new theoretical and design principles for cartography and maps that (a) encompass the different types and uses of maps and geovisualizations on the Web, (b) serve the needs of the digitally skilled citizens of our time, and (c) enhance the spatial dimension of almost every piece of web-generated data. The research project is supported by the Hellenic Foundation for Research and Innovation (H.F.R.I.) under the “First Call for H.F.R.I. Research Projects to support Faculty members and Researchers and the procurement of high-cost research equipment grant” (Project Number: HFRI-FM17-2661).