Fotini Simistira

  • PhD
  • PostDoc Position at University of Fribourg

About

  • Publications: 59
  • Reads: 8,541
  • Citations: 629
Current institution
University of Fribourg
Current position
  • PostDoc Position
Additional affiliations
August 2015 - present
University of Fribourg
Position
  • PostDoc Position
November 2014 - May 2015
Rheinland-Pfälzische Technische Universität Kaiserslautern-Landau
Position
  • Visiting researcher
January 1997 - June 1997
Mayo Clinic - Rochester
Position
  • Visiting Researcher
Description
  • Implementation of a utility for interactive fly-through navigation of the colon, using 3D CT data sets and virtual reality equipment.
Education
December 2007 - June 2015
National Technical University of Athens
Field of study
  • Electrical and Computer Engineer
October 1989 - June 1996
National Technical University of Athens
Field of study
  • Electrical and Computer Engineer

Publications (59)
Conference Paper
Full-text available
Artificial intelligence (AI) is transforming education by introducing innovative tools and approaches to enhance teaching and learning. Chatbots like Dawebot and TA-bot serve as intelligent student assistants, providing interactive quizzes, immediate feedback, and personalized support. AI-based systems like Rexy and Kwame offer personalized learnin...
Article
Full-text available
We conduct relatively extensive investigations of automatic hate speech (HS) detection using different State-of-The-Art (SoTA) baselines across 11 subtasks spanning six different datasets. Our motivation is to determine which of the recent SoTA models is best for automatic hate speech detection and what advantage methods, such as data augmentation...
Chapter
Full-text available
Online monitoring of mental well-being and factors contributing to it is vital, especially in pandemics (like COVID-19), when physical contact is discouraged. As emotions are closely connected to mental health, the monitoring and classification of emotions are also vital. In this paper, we present an emotion recognition framework recognizing emotio...
Preprint
Full-text available
We evaluate five English NLP benchmark datasets (available on the superGLUE leaderboard) for bias, along multiple axes. The datasets are the following: Boolean Question (Boolq), CommitmentBank (CB), Winograd Schema Challenge (WSC), Winogender diagnostic (AXg), and Recognising Textual Entailment (RTE). Bias can be harmful and it is known to be commo...
Preprint
Full-text available
We conduct relatively extensive investigations of automatic hate speech (HS) detection using different state-of-the-art (SoTA) baselines over 11 subtasks of 6 different datasets. Our motivation is to determine which of the recent SoTA models is best for automatic hate speech detection and what advantage methods like data augmentation and ensemble m...
Article
Full-text available
In this study, we demonstrate that an open-domain conversational system trained on idioms or figurative language generates more fitting responses to prompts containing idioms. Idioms are a part of everyday speech in many languages and across many cultures, but they pose a great challenge for many natural language processing (NLP) systems that invol...
Conference Paper
Full-text available
In this paper, we first describe various methods for enhancing student engagement in big online courses. We showcase the implementation of these methods in the "Introduction to Artificial Intelligence (AI)" course at Luleå University of Technology, which has attracted around 500 students in each of its iterations (twice yearly, since 2019). We also...
Article
Full-text available
We survey SoTA open-domain conversational AI models with the objective of presenting the prevailing challenges that still exist to spur future research. In addition, we provide statistics on the gender of conversational AI in order to guide the ethics discussion surrounding the issue. Open-domain conversational AI models are known to have several c...
Preprint
Full-text available
We demonstrate, in this study, that an open-domain conversational system trained on idioms or figurative language generates more fitting responses to prompts containing idioms. Idioms are part of everyday speech in many languages, across many cultures, but they pose a great challenge for many Natural Language Processing (NLP) systems that involve t...
Preprint
Full-text available
We survey SoTA open-domain conversational AI models with the purpose of presenting the prevailing challenges that still exist to spur future research. In addition, we provide statistics on the gender of conversational AI in order to guide the ethics discussion surrounding the issue. Open-domain conversational AI are known to have several challenges...
Preprint
Full-text available
We investigate the possibility of cross-lingual transfer from a state-of-the-art (SoTA) deep monolingual model (DialoGPT) to 6 African languages and compare with 2 baselines (BlenderBot 90M, another SoTA, and a simple Seq2Seq). The languages are Swahili, Wolof, Hausa, Nigerian Pidgin English, Kinyarwanda & Yorùbá. Generation of dialogues is kno...
Article
Full-text available
Building open-domain conversational systems (or chatbots) that produce convincing responses is a recognized challenge. Recent state-of-the-art (SoTA) transformer-based models for the generation of natural language dialogue have demonstrated impressive performance in simulating human-like, single-turn conversations in English. This work investigates...
Article
Full-text available
Word2Vec is a prominent model for natural language processing tasks. Similar inspiration is found in distributed embeddings (word-vectors) in recent state-of-the-art deep neural networks. However, wrong combination of hyperparameters can produce embeddings with poor quality. The objective of this work is to empirically show that Word2Vec optimal co...
Article
Full-text available
We investigate the performance of a state-of-the-art (SoTA) architecture T5 (available on the SuperGLUE) and compare it with 3 other previous SoTA architectures across 5 different tasks from 2 relatively diverse datasets. The datasets are diverse in terms of the number and types of tasks they have. To improve performance, we augment the training da...
Preprint
Full-text available
We investigate the performance of a state-of-the-art (SoTA) architecture T5 (available on the SuperGLUE) and compare it with 3 other previous SoTA architectures across 5 different tasks from 2 relatively diverse datasets. The datasets are diverse in terms of the number and types of tasks they have. To improve performance, we augment the training da...
Chapter
Researchers have been using Electroencephalography (EEG) to build Brain-Computer Interface (BCI) systems. They have had a lot of success modeling brain signals for applications, including emotion detection, user identification, authentication, and control. The goal of this study is to employ EEG-based neurological brain signals to recognize imagi...
Preprint
Full-text available
Building open-domain conversational systems (or chatbots) that produce convincing responses is a recognized challenge. Recent state-of-the-art (SoTA) transformer-based models for the generation of natural language dialogue have demonstrated impressive performance in simulating human-like, single-turn conversations in English. This work investigates...
Preprint
Full-text available
We present a fairly large, Potential Idiomatic Expression (PIE) dataset for Natural Language Processing (NLP) in English. The challenges with NLP systems with regards to tasks such as Machine Translation (MT), word sense disambiguation (WSD) and information retrieval make it imperative to have a labelled idioms dataset with classes such as it is in...
Preprint
Full-text available
The major contributions of this work include the empirical establishment of a better performance for Yoruba embeddings from undiacritized (normalized) dataset and provision of new analogy sets for evaluation. The Yoruba language, being a tonal language, utilizes diacritics (tonal marks) in written form. We show that this affects embedding performan...
Preprint
Full-text available
In this work, we show that the difference in performance of embeddings from differently sourced data for a given language can be due to other factors besides data size. Natural language processing (NLP) tasks usually perform better with embeddings from bigger corpora. However, broadness of covered domain and noise can play important roles. We evalu...
Preprint
Full-text available
In this paper, our main contributions are that embeddings from relatively smaller corpora can outperform ones from far larger corpora and we present the new Swedish analogy test set. To achieve a good network performance in natural language processing (NLP) downstream tasks, several factors play important roles: dataset size, the right hyper-parame...
Preprint
Full-text available
Word2Vec is a prominent tool for Natural Language Processing (NLP) tasks. Similar inspiration is found in distributed embeddings for state-of-the-art (sota) deep neural networks. However, wrong combination of hyper-parameters can produce poor quality vectors. The objective of this work is to show optimal combination of hyper-parameters exists and e...
Article
Full-text available
This essay discusses current research efforts in conversational systems from the philosophy of science point of view and evaluates some conversational systems research activities from the standpoint of naturalism philosophical theory. Conversational systems or chatbots have advanced over the decades and now have become mainstream applications. They...
Preprint
Full-text available
We propose a Historical Document Reading Challenge on Large Chinese Structured Family Records, in short ICDAR2019 HDRC CHINESE. The objective of the proposed competition is to recognize and analyze the layout, and finally detect and recognize the textlines and characters of the large historical document collection containing more than 20 000 pages...
Preprint
Full-text available
In this paper, we introduce the use of Semantic Hashing as embedding for the task of Intent Classification and outperform previous state-of-the-art methods on three frequently used benchmarks. Intent Classification on a small dataset is a challenging task for data-hungry state-of-the-art Deep Learning based systems. Semantic Hashing is an attempt t...
Chapter
iMuSciCA supports mastery of core academic content on STEM subjects for secondary school students alongside the development of their creativity and deeper learning skills, through engagement in music activities. To reach this goal, iMuSciCA introduces new methodologies and innovative technologies supporting active, discovery-based, collaborati...
Conference Paper
This paper reports on high-performance Optical Character Recognition (OCR) experiments using Long Short-Term Memory (LSTM) Networks for Greek polytonic script. Even though there are many Greek polytonic manuscripts, the digitization of such documents has not been widely applied, and very limited work has been done on the recognition of such scripts....
Article
Although recognition of online handwritten text has reached a point of maturity, recognition of online handwritten mathematical expressions still remains a challenging problem. In this work we train a probabilistic SVM classifier to recognize spatial relations between two mathematical symbols or sub-expressions and then employ a CYK-based algorithm...
Conference Paper
A critical issue in recognition of mathematical expressions is the identification of the spatial relations of the symbols or/and sub-expressions that comprise the entire mathematical formula. This paper addresses the problem of structural analysis of mathematical expressions by constructing appropriate feature vectors to represent the spatial affin...
Conference Paper
Full-text available
Mathematical expression recognition is still a very challenging task for the research community mainly because of the two-dimensional (2d) structure of mathematical expressions (MEs). In this paper, we present a novel approach for the structural analysis between two on-line handwritten mathematical symbols of a ME, based on spatial features of the...
Conference Paper
Full-text available
We present a system for recognizing online mathematical expressions (ME). Symbol recognition is based on a template elastic matching distance between pen direction features. The structural analysis of the ME is based on extracting the baseline of the ME and then classifying symbols into levels above and below the baseline. The symbols are then sequ...
Conference Paper
Document image binarization is an initial though critical stage towards the recognition of the text components of a document. This paper describes an efficient method based on mathematical morphology for extracting text regions from degraded handwritten document images. The basic stages of our approach are: (a) top-hat-by-reconstruction to produce...
Conference Paper
Full-text available
This paper proposes an enhancement of our previously presented word segmentation method (ILSP-LWseg) [1] by exploiting local spatial features. ILSP-LWseg is based on a gap metric that exploits the objective function of a soft-margin linear SVM that separates successive connected components (CCs). Then a global threshold for the gap metrics is estima...
Article
Full-text available
The present article introduces a phrase alignment approach that involves the processing of a small bilingual corpus in order to extract suitable structural information. This is used in the PRESEMT project, whose aim is the quick development of phrase-based Machine Translation (MT) systems for new language pairs. A main bottleneck of such systems is...
Conference Paper
Full-text available
In this paper we describe a system that applies emerging technologies for speech recognition, language processing, multimedia indexing and retrieval, all integrated into a large video and audio library that covers broadcast news and current affairs in Greece. It assists the Greek National Council for Radio and Television (NCRTV) in compiling inform...
Article
Full-text available
An on-line handwritten character recognition technique based on a template matching distance is proposed. In this method, the pen-direction features are quantized using the 8-level Freeman chain coding scheme and the dominant points of the stroke are identified using the first difference of the chain code. The distance between two symbols results f...
Chapter
This paper presents the on-going work concerning the definition of a standardised use of Virtual Reality in a telemedical information society. It presents the collaboration of the EUROMED project and the usage of the VRASP over the WWW to perform Virtual Assisted Surgery. The aim of the EUROMED project is to standardise the use of the WWW for advan...
Article
Colon/rectal cancer is the second most common cause of death, yet among the most preventable when detected in its early stages. The traditional diagnostic procedures cause tremendous discomfort and are deeply invasive. The motivation for this work is firstly to develop an alternative technique to visualise the inner mucosal surface of the colonic w...
