Jan Romportl

O2 Czech Republic · Centre for Artificial Intelligence

Ing. Mgr. Ph.D.

About

49 Publications
11,231 Reads
488 Citations
Citations since 2017
6 Research Items
204 Citations
Additional affiliations
October 2019 - March 2020
O2 Czech Republic
Position
  • Managing Director
July 2015 - December 2018
University of West Bohemia
Position
  • Senior Researcher
July 2015 - present
O2 Czech Republic
Position
  • Researcher
Education
September 2009 - September 2012
University of West Bohemia
Field of study
  • Philosophy
June 2004 - January 2009
University of West Bohemia
Field of study
  • Artificial Intelligence
September 1999 - June 2004
University of West Bohemia
Field of study
  • Cybernetics

Publications (49)
Conference Paper
Full-text available
The paper discusses a hypothesis relating high quality text-to-speech (TTS) synthesis in spoken dialogue systems with the concept of "uncanny valley". It introduces a "Wizard-of-Oz" experiment with 30 volunteers engaged in conversations with two synthetic voices of different naturalness. The results of the experiment are summarized and interpreted,...
Chapter
Full-text available
This position paper offers an answer to the question about the difference between artificial and natural. By building up a dichotomy between physis and logos, it argues that this difference is given by language and by what can be grasped with words. It concludes with an assertion that Good Old-Fashioned AI (GOFAI) cannot create anything natural, wh...
Article
Full-text available
The article discusses differences between a priori and a posteriori phrasing and their importance in the task of automatic prosodic phrasing in text-to-speech systems. On several examples it illustrates shortcomings of common evaluation of a priori phrasing performance using a posteriori phrasing of referential corpus data. The paper also propose...
Chapter
Full-text available
This paper describes the initial experiments on voice conservation of patients with laryngeal cancer in an advanced stage. The final aim is to create a speech-aid device which is able to “speak” with their former voices. Our initial work is focused on applicability of speech data from patients with an impaired vocal tract for the purposes of speech...
Article
Full-text available
The research project COMPANIONS aims at developing an advanced embodied conversational agent (ECA). This ECA is used in two scenarios and two languages (English and Czech), and it requires a TTS system able to generate very natural expressive and emotional speech output. This paper describes application issues of two such systems within the...
Article
Full-text available
Discussions about naturalness, artificiality and unnaturalness in this article are motivated by the field of Human Cognitive Enhancement (HCE) because of its potential for altering human personality and identity. The article first proposes a concept of human naturalness as interaction between physis and logos. Then it presents an intuitive unde...
Chapter
Phoniatricians usually have the key role when dealing with a voice disorder, supported by logopedists/speech-language pathologists. Cooperation with other medical or non-medical disciplines may contribute when needed. Fundamentals, general goals and the structure of voice therapy are followed by a survey of specific methods and techniques. No matte...
Article
The paper introduces the motivation for creating dedicated speech corpora of air traffic control communication, describes in detail the process of preparation of corpora for both automatic speech recognition and text-to-speech synthesis, presents an illustrative example of speech recognition system developed using the automatic speech recognition c...
Preprint
Full-text available
The discussion between the automotive industry, governments, ethicists, policy makers and general public about autonomous cars' moral agency is widening, and therefore we see the need to bring more insight into what meta-factors might actually influence the outcomes of such discussions, surveys and plebiscites. In our study, we focus on the psychol...
Data
This includes Supplementary Tables 1-3. Supplementary Table 1 represents a complete list of all significant left/right hippocampal FCs identified in the baseline TEST session and is supplementary to Table 1 (see Results: Resting State fMRI/Baseline Hippocampal FC). Supplementary Table 2 demonstrates the simple intervention effects in functional con...
Article
Full-text available
Aims: This study aimed to assess possible changes in functional connectivity (FC) of hippocampus and other brain regions involved in spatial navigation. Methods: Thirty-three healthy participants completed two resting state functional magnetic resonance (rsfMRI) measurements (3T Siemens Prisma scanner) at the baseline and after 3 months. For this...
Article
The principle of emergence is, today, a key paradigm in the area of the development of systems of artificial intelligence. This article sketches the prospects for tackling emergence in artificial intelligence using Vopěnka’s Alternative Set theory applied to a model of interaction between two causal domains. Starting from the relations between semi...
Conference Paper
This paper focuses on voice banking and creating personalised speech synthesis of laryngectomised patients who lose their voice after this radical surgery. Specific aspects of voice banking are discussed in the paper, including a description of the adjustments of the generic methods. The main attention is paid to the speech corpus building since th...
Book
This book is an edited collection of chapters based on the papers presented at the conference “Beyond AI: Artificial Dreams” held in Pilsen in November 2012. The aim of the conference was to question deep-rooted ideas of artificial intelligence and cast critical reflection on methods standing at its foundations. Artificial Dreams epitomize our cont...
Article
Full-text available
The aim of the article is to analyse, in historical context, the foundations on which today's scientific discipline of cybernetics stands, and to offer a definition of cybernetics that corresponds both to its original roots and to current institutionalised research and development practice. The article emphasises the deeply rooted engineering motivation of cybernetics, cyberneti...
Article
The aim of this article is to analyse in historical context the foundations of contemporary cybernetics and to offer such a definition of cybernetics that corresponds both with cybernetics' original roots as well as its actual institutionalised research and development form. The article stresses deeply rooted engineering motivation of cybernetics,...
Book
Full-text available
Proceedings of the International Conference Beyond AI 2013, Pilsen, Czech Republic, November 12–14, 2013
Book
Full-text available
Textbook of history of cybernetics. In Czech only.
Book
Products of modern artificial intelligence (AI) have mostly been formed by the views, opinions and goals of the “insiders”, i.e. people usually with engineering background who are driven by the force that can be metaphorically described as the pursuit of the craft of Hephaestus. However, since the present-day technology allows for tighter and tight...
Article
Full-text available
ARTIC (Artificial Talker in Czech) is a corpus-based text-to-speech (TTS) system that can synthesise arbitrary text, mainly for the Czech language. Two versions of ARTIC are available: a single unit instance system (also known as fixed-inventory synthesis) with the quality of resulting speech limited by the fixed inventory, and...
Chapter
Full-text available
The paper discusses Clark’s conception of extended mind and critically analyses his four criteria of externalised cognitive functions. Language as one of the most important means of externalisation is presented on the basis of Engelbart’s conception of human enhancement. Clark’s view of human-technology coupling is also strongly related to langua...
Conference Paper
Full-text available
The work presented in this paper deals with text normalization for highly inflectional languages. The paper focuses on abbreviation expansion and on the normalization of numerals. Our text normalization system does not use any explicit parser or part-of-speech tagger and thus it can be called lightly supervised. The standard rule-based te...
Chapter
Full-text available
The paper discusses Clark’s conception of extended mind and critically analyses his four criteria of externalised cognitive functions. On the basis of the interdisciplinary research in the fields of AI and cognitive science it proposes a redefinition of these criteria. The theoretical claims are supported by Gazzaniga’s experiments with split-bra...
Conference Paper
Full-text available
Our paper introduces implementation details of an application that serves as an audiovisual interface to an automatic dialogue system. It comprises a state-of-the-art large vocabulary continuous speech recognition engine and a TTS system coupled with an embodied avatar that is able, to some extent, to convey a range of emotions to the user. The interf...
Conference Paper
Full-text available
The main idea of a priori machine learning is to apply a machine learning method on a machine learning problem itself. We call it “a priori” because the processed data set does not originate from any measurement or other observation. Machine learning which deals with any observation is called “posterior”. The paper describes how posterior machine l...
Conference Paper
This paper presents a real-time implementation of an automatic dialogue system called ‘Senior Companion’, which is not strictly task-oriented, but instead it is designed to ‘chat’ with elderly users about their family photographs. To a large extent, this task has lost the usual restriction of dialogue systems to a particular (narrow) domain, and th...
Article
Full-text available
In order to improve speech naturalness of a unit selection TTS system it is necessary to annotate prosodic phrase boundaries in the whole source corpus, which is extremely difficult to achieve manually. It is thus useful to employ a machine classifier. This paper discusses suitable feature selection for such classification of a Czech TTS corpus, p...
Article
Full-text available
Objective annotation of prosodic phrases in a corpus for a text-to-speech system is an important issue due to its influence on the naturalness of synthesised speech. The paper discusses drawbacks of common ways of prosodic phrase annotation and proposes a concept of prosodic phrases defined by a maximum likelihood estimation over results of many p...
Conference Paper
Full-text available
The present paper focuses on the current handling of target features in the unit selection approach, which basically requires huge corpora. The paper outlines possible solutions based on measuring (dis)similarity among prosodic patterns. As the start of research, the feasibility of (dis)similarity estimation is examined on several intu...
Article
Full-text available
The present paper deals with the evaluation of large-scale listening tests and with the detection of unaccountable or unreliable answers for each listener. The iterative maximum likelihood estimation scheme is proposed and its abilities are demonstrated and discussed on data collected from a large-scale listening test which was carried out with the...
Conference Paper
Full-text available
We describe a statistical method for assignment of prosodic phrases and semantic accents in read speech data. The method is based on statistical evaluation of listening test data by a maximum-likelihood approach with parameters estimated by an EM algorithm. We also present linguistically relevant quantitative results about the prosodic phrase and...
Conference Paper
Full-text available
This paper deals with an HMM-based automatic phonetic segmentation (APS) system and proposes to increase its performance by employing a pitch-synchronous (PS) coding scheme. Such a coding scheme uses different frames of speech throughout voiced and unvoiced speech regions and thus enables better modelling of each individual phone. The PS codi...
Article
Full-text available
The present paper understands prosodic phrases as units which take part in constituting the rhythmical structure of speech. Because the criteria for prosodic phrase perception are very subjective and inconsistent, an objectively grounded method for prosodic phrase assignment is needed. This has been achieved by extensive listening tests (103 partici...
Conference Paper
Full-text available
The paper gives a brief summarisation of preparation and recording of a phonetically and prosodically rich speech corpus for Czech unit selection text-to-speech synthesis. Special attention is paid to the process of two-phase orthographic annotations of recorded sentences with regard to their coherence.
Article
Full-text available
This paper describes data-driven modelling of all three basic prosodic features – fundamental frequency, intensity and segmental duration – in the Czech text-to-speech system ARTIC. The fundamental frequency is generated by a model based on concatenation of automatically acquired intonational patterns. Intensity of synthesised speech is modelled by...
Conference Paper
Full-text available
This paper gives a survey of the current state of ARTIC - the modern Czech concatenative corpus-based text-to-speech system. All stages of the system design are described in the paper, including the acoustic unit inventory building process, text processing and speech production issues. Two versions of the system are presented: the single unit ins...
Conference Paper
Full-text available
A formal prosody description framework is introduced together with its relation to language semantics and NLP. The framework incorporates deep prosodic structures based on a generative grammar of abstract prosodic functionally involved units. This grammar creates for each sentence a structure of immediate prosodic constituents in the form of a tr...
Conference Paper
Full-text available
A formal prosody model is proposed together with its application in a text-to-speech system. The model is based on a generative grammar of abstract prosodic functionally involved units. This grammar creates for each sentence a structure of immediate prosodic constituents in the form of a tree. Each prosodic word of a sentence is assigned with a des...
Conference Paper
Full-text available
This paper presents recent improvements on ARTIC - the modern Czech corpus-based text-to-speech system. As a statistical approach (using hidden Markov models) was applied to create an acoustic unit inventory, several improvements concerning acoustic unit modelling, clustering and segmentation have been accomplished to increase the intelligibi...
Conference Paper
Full-text available
This paper presents results of applying a neural network to the problem of deciding whether a certain punctuation mark in Czech text is or is not the end of a sentence. It also discusses possibilities of using methods for relevant parameters extraction and compares a neural network based method with a Bayes classifier and a heuristic class...
Conference Paper
Full-text available
This paper describes the preparation of the first large Czech prosodic database which should be useful both in automatic speech recognition (ASR) and text-to-speech (TTS) synthesis. In the area of ASR we intend to use it for an automatic punctuation annotation, in the area of TTS for building a prosodic module for the Czech high-quality synthesis...
Conference Paper
Full-text available
This paper presents part of the data collection efforts undertaken within the project COMPANIONS, whose aim is to develop a set of dialogue systems that will be able to act as artificial “companions” for human users. One of these systems, being developed in the Czech language, is designed to be a partner for elderly people that will be able to talk wi...
Article
Full-text available
The synthesis of emotional speech is still an open question. The principal issue is how to introduce expressivity without compromising the naturalness of the synthetic speech provided by the state-of-the-art technology. In this paper two concatenative synthesis systems are described and some approaches to address this topic are proposed. For exampl...
Article
Full-text available
This paper introduces a new data-driven prosody model for the text-to-speech system ARTIC. The model is intended to be almost language-independent and to generate naturally sounding intonation with a link to semantics. It is based on text parametrisation using a new prosodic grammar and on automatic speech corpora analysis methods. Its performanc...
Article
Full-text available
This paper understands Golem as an artificial conscious being and the problem of Golem construction is seen as establishing appropriate causal relations between different causal domains, namely the causal domain of neural systems (either human brains or artificial neural networks) and the causal domain of mental processes (mental or psychological s...
