G Skantze

Publications (8) · 0 Total impact

  • Conference Proceeding: Multimodal Multiparty Social Interaction with the Furhat Head
    ABSTRACT: In this demonstrator we show an advanced multimodal and multiparty spoken conversational system using Furhat, a robot head based on projected facial animation. Furhat is a human-like interface that utilizes facial animation for physical robot heads using back-projection. In the system, multimodality is enabled using speech and rich visual input signals such as multi-person real-time face tracking and microphone tracking. The demonstrator will showcase a system that is able to carry out social dialogue with multiple interlocutors simultaneously with rich output signals such as eye and head coordination, lip-synchronized speech synthesis, and non-verbal facial gestures used to regulate fluent and expressive multiparty conversations.
    in Proc. of the 14th ACM International Conference on Multimodal Interaction ICMI; 10/2012
  • Conference Proceeding: Towards human-like behaviour in spoken dialog systems
    Proceedings of Swedish Language Technology Conference (SLTC 2006)
  • Conference Proceeding: User responses to prosodic variation in fragmentary grounding utterances in dialogue
    G Skantze, D House, J Edlund
    ABSTRACT: In this paper, actual user responses to fragmentary grounding utterances in Swedish human-computer dialog are investigated. Building on a previous study which demonstrated that listeners could use prosodic features (primarily peak height and alignment) to make different interpretations of such utterances, we now report on an experiment in which subjects participate in a color-naming task in a Wizard-of-Oz controlled human-computer dialog setting. The results show that two annotators were able to categorize the subjects' responses based on pragmatic meaning. Moreover, the subjects' response times differed significantly, depending on the prosodic features of the grounding fragment spoken by the system.
    Proceedings of Interspeech 2006 - ICSLP
  • Conference Proceeding: Prosodic Features in the Perception of Clarification Ellipses
    J Edlund, D House, G Skantze
    ABSTRACT: We present an experiment where subjects were asked to listen to Swedish human-computer dialogue fragments where a synthetic voice makes an elliptical clarification after a user turn. The prosodic features of the synthetic voice were systematically varied, and subjects were asked to judge the computer's actual intention. The results show that an early low F0 peak signals acceptance, that a late high peak is perceived as a request for clarification of what was said, and that a mid high peak is perceived as a request for clarification of the meaning of what was said. The study can be seen as the beginnings of a tentative model for intonation of clarification ellipses in Swedish, which can be implemented and tested in spoken dialogue systems.
    Proceedings of Fonetik 2005
  • Conference Proceeding: The effects of prosodic features on the interpretation of synthesised backchannels
    Å Wallers, J Edlund, G Skantze
    ABSTRACT: A study of the interpretation of prosodic features in backchannels (Swedish /a/ and /m/) produced by speech synthesis is presented. The study is part of work-in-progress towards endowing conversational spoken dialogue systems with the ability to produce and use backchannels and other feedback.
    Proceedings of Perception and Interactive Technologies
  • Conference Proceeding: Grounding and prosody in dialog
    G Skantze, D House, J Edlund
    ABSTRACT: In a previous study we demonstrated that subjects could use prosodic features (primarily peak height and alignment) to make different interpretations of synthesized fragmentary grounding utterances. In the present study we test the hypothesis that subjects also change their behavior accordingly in a human-computer dialog setting. We report on an experiment in which subjects participate in a color-naming task in a Wizard-of-Oz controlled human-computer dialog in Swedish. The results show that two annotators were able to categorize the subjects' responses based on pragmatic meaning. Moreover, the subjects' response times differed significantly, depending on the prosodic features of the grounding fragment spoken by the system.
    Working Papers 52: Proceedings of Fonetik 2006
  • Article: Technology for Derived Services: Innovative Interfaces
    ABSTRACT: Traditional human-machine interfaces are unintuitive to those unfamiliar with technology and assume a degree of “computer literacy”. Social/conversational interfaces represent a radically different approach to human-machine interaction, where the interaction metaphor is modelled after human-human face-to-face communication, resulting in an ECA – an embodied conversational agent, which communicates via speech, facial expression, gaze and gesture. This document describes the application chosen for the prototype ECA developed within MonAMI: the reminder application. The application helps users to plan activities and remember what to do. The prototype is derived from Google Calendar, with added functionality mixing ECA technology and a digital pen and paper. The ECA provides the user with notifications on what he has written in the calendar, and the user can ask questions such as “When was I supposed to meet Sara?” or “What’s on my schedule today?” The solution allows the users to continue using a paper calendar in the same way they are used to, whilst adding, amongst other things, reminder functionality.
  • Conference Proceeding: Speech technology in the European project MonAMI
    ABSTRACT: This paper describes the role of speech and speech technology in the European project MonAMI, which aims at “mainstreaming accessibility in consumer goods and services, using advanced technologies to ensure equal access, independent living and participation for all”. It presents the Reminder, a prototype embodied conversational agent (ECA) which helps users to plan activities and to remember what to do. The prototype merges speech technology with other, existing technologies: Google Calendar and a digital pen and paper. The solution allows users to continue using a paper calendar in the manner they are used to, whilst the ECA provides notifications on what has been written in the calendar. Users may also ask questions such as “When was I supposed to meet Sara?” or “What’s on my schedule today?”
    Proceedings of FONETIK 2008