ABSTRACT: Increased availability of mobile computing, such as personal digital assistants (PDAs), creates the potential for constant and intelligent access to up-to-date, integrated and detailed information from the Web, regardless of one's actual geographical position. Intelligent question-answering requires the representation of knowledge from various domains, such as the navigational and discourse context of the user, potential user questions, the information provided by Web services and so on, for example in the form of ontologies. Within the context of the SmartWeb project, we have developed a number of domain-specific ontologies that are relevant for mobile and intelligent user interfaces to open-domain question-answering and information services on the Web. To integrate the various domain-specific ontologies, we have developed a foundational ontology, the SmartSUMO ontology, on the basis of the DOLCE and SUMO ontologies. This allows us to combine all the developed ontologies into a single SmartWeb Integrated Ontology (SWIntO) having a common modeling basis with conceptual clarity and the provision of ontology design patterns for modeling consistency. In this paper, we present SWIntO, describe the design choices we made in its construction, illustrate the use of the ontology through a number of applications, and discuss some of the lessons learned from our experiences.
Web Semantics: Science, Services and Agents on the World Wide Web. 07/2007;
ABSTRACT: SMARTWEB aims to provide intuitive multimodal access to a rich selection of Web-based information services. We report on the current prototype with a smartphone client interface to the Semantic Web. An advanced ontology-based representation of facts and media structures serves as central description for rich media content. Underlying content is accessed through conventional web service middleware to connect the ontological knowledge base and an intelligent web service composition module for external web services, which is able to translate between ordinary XML-based data structures and explicit semantic representations for user queries and system responses. The presentation module renders the media content and the results generated from the services and provides a detailed description of the content and its layout to the fusion module. The user is then able to employ multiple modalities, like speech and gestures, to interact with the presented multimedia material in a multimodal way.
Artificial Intelligence for Human Computing, ICMI 2006 and IJCAI 2007 International Workshops, Banff, Canada, November 3, 2006, Hyderabad, India, January 6, 2007, Revised Selected and Invited Papers; 01/2007
ABSTRACT: This paper presents the text generation module of SmartWeb, a multimodal dialogue system. The generation module is based on NipsGen, which combines SPIN, originally a parser developed for spoken language, and a tree-adjoining grammar framework for German. NipsGen allows mixing full generation with canned
KI 2007: Advances in Artificial Intelligence, 30th Annual German Conference on AI, KI 2007, Osnabrück, Germany, September 10-13, 2007, Proceedings; 01/2007
ABSTRACT: SmartWeb aims to provide intuitive multimodal access to a rich selection of Web-based information services. The current SmartWeb prototypes are a smartphone client interface to the Semantic Web, an onboard car dialog system that gets updates from the web, and a motorbike
Annual German Conference on Artificial Intelligence (KI 2006), Bremen, Germany; 06/2006
ABSTRACT: In this chapter we give a general overview of the modality fusion component of SmartKom. Based on a selection of prominent multimodal interaction patterns, we present our solution for synchronizing the different modes. Finally, we give, on an abstract level, a summary of our approach to modality fusion.
ABSTRACT: Experience shows that decisions in the early phases of the development of a multimodal system prevail throughout the life-cycle of a project. The distributed architecture and the requirement for robust multimodal interaction in our project SmartWeb resulted in an approach that uses and extends W3C standards like EMMA and RDFS. These standards for the interface structure and content allowed us to integrate available tools and techniques. However, the requirements in our system called for various extensions, e.g., to introduce result feedback tags for an extended version of EMMA. The interconnection framework depends on a commercial telephone voice dialog system platform for the dialog-centric components while the information access processes are linked using web service technology. Also in the area of this underlying infrastructure, enhancements and extensions were necessary. The first demonstration system is operable now and will be presented at the Football World Cup 2006 in Germany.
Proceedings of the 7th International Conference on Multimodal Interfaces, ICMI 2005, Trento, Italy, October 4-6, 2005; 01/2005
ABSTRACT: This paper presents a semantic parser for spoken dialogue systems. The parser is designed especially for the analysis of free word order languages by providing a feature called order-independent matching. We describe how this feature allows writing of rules for free word order languages in an elegant way (using German as example language) and how it increases the robustness against speech recognition errors. As order-independent matching makes efficient parsing more difficult, we present a new parsing approach which provides efficient processing for rule bases that are, according to our experience, typical for spoken dialogue systems. The key feature of the parsing approach is a fixed application order of the rules to prune irrelevant results. A preliminary evaluation of the parser shows that this approach works very well in real-world dialogue systems.
INTERSPEECH 2005 - Eurospeech, 9th European Conference on Speech Communication and Technology, Lisbon, Portugal, September 4-8, 2005; 01/2005
ABSTRACT: In this paper we report on ongoing experiments with an advanced multimodal system for applications in architectural design. The system supports uninformed users in entering the relevant data about a bathroom that must be refurnished, and is tested with 28 subjects. First, we describe the IST project COMIC, which is the context of the research. We explain how the work in COMIC goes beyond previous research in multimodal interaction for eWork and eCommerce applications that combine speech and pen input with speech and graphics output: in design applications one cannot assume that uninformed users know what they must do to satisfy the system's expectations. Consequently, substantial system guidance is necessary, which in its turn creates the need to design a system architecture and an interaction strategy that allow the system to control and guide the interaction. The results of the user tests show that the appreciation of the system is mainly determined by the accuracy of the pen and speech input recognisers. In addition, the turn taking protocol needs to be improved.
User-Centered Interaction Paradigms for Universal Access in the Information Society, 8th ERCIM Workshop on User Interfaces for All, Vienna, Austria, June 28-29, 2004, Revised Selected Papers; 01/2004
ABSTRACT: The development of an intelligent user interface that supports multimodal access to multiple applications is a challenging task. In this paper we present a generic multimodal interface system where the user interacts with an anthropomorphic personalized interface agent using speech and natural gestures. The knowledge-based and uniform approach of SmartKom enables us to realize a comprehensive system that understands imprecise, ambiguous, or incomplete multimodal input and generates coordinated, cohesive, and coherent multimodal presentations for three scenarios, currently addressing more than 50 different functionalities of 14 applications. We demonstrate the main ideas in a walk through the main processing steps from modality fusion to modality fission.
Proceedings of the 5th International Conference on Multimodal Interfaces, ICMI 2003, Vancouver, British Columbia, Canada, November 5-7, 2003; 01/2003
ABSTRACT: We present applications enabled via the employment of a single knowledge representation in the SMARTKOM multi-modal multi-domain dialogue system. We focus on how a rigorously constructed ontology, whose ontological and representational choices are shared by multiple components of the system, can be re-used in different projects and applied to various tasks.
ABSTRACT: Marché Bonsecours. There is nothing to indicate it today, but the Canadian Parliament once stood in the parking lot that sprawls across much of the west end of Place d'Youville immediately south of the Hotel St. Paul. Montreal became the capital of the United Province of Canada in 1844 and the government moved into St. Ann's Market, an imposing two-storey limestone building that was built here in the early 1830s. The marketplace, 350 feet long and 50 feet wide, was converted into an imposing House of Assembly. The Legislative Council was in the east wing; the House of Assembly occupied the west wing. John A. Macdonald, later Canada's first prime minister, made his maiden speech as a parliamentarian in the building on April 27, 1846.
ABSTRACT: This paper describes a novel functionality of the VerbMobil system, a large scale translation system designed for spontaneously spoken multilingual negotiation dialogues. The task is the on-demand generation of dialogue scripts and result summaries of dialogues. We focus on summary generation and show how the relevant data are selected from the dialogue memory and how they are packed into an appropriate abstract representation. Finally, we demonstrate how the existing generation module of VerbMobil was extended to produce multilingual and result summaries from these representations.
ABSTRACT: For various purposes in the Verbmobil system it is necessary to build a full model of an unfolding dialog, on a suitably abstract level of representation. This model is based on representations of the individual utterances, and we capture their content by a combination of dialog act and propositional content. Our hierarchy of dialog acts was used to annotate 21 CD-ROMs from the Verbmobil corpus, and the experience gained with the framework influenced standardization efforts in the international scientific community. On the side of propositional content, particular attention was given to the representation of temporal expressions, due to the application domains of Verbmobil.
Verbmobil: Foundations of Speech-to-Speech Translation, Edited by W. Wahlster, 06/2000: pages 441-451; Springer.
ABSTRACT: This chapter explains the major functionality of the dialog module in Verbmobil. Dialog knowledge is needed for context sensitive speech translation as well as for the automatic generation of dialog result summaries. Our component produces necessary structures for both purposes and stores them in a centrally accessible data repository, the dialog memory. The structures are based on robustly extracted shallow data which are corrected, extended and structured by our dialog processor. We use time and object completion algorithms to collect context data and compute inter-object relations to infer relevance for summarization. The resulting structures are used by the document generator for dialog minutes and summaries, and by the context evaluation module for translation disambiguation.
Verbmobil: Foundations of Speech-to-Speech Translation, Edited by W. Wahlster, 06/2000: pages 452-465; Springer.
ABSTRACT: We present the multilingual summarization functionality for VERBMOBIL, a speech translation system. We reuse resources of the system to create a summary. After content extraction, we interpret the results in the dialog context. A summary generator provides the input to generation. A first evaluation indicates the feasibility of the approach.
38th Annual Meeting of the Association for Computational Linguistics, Hong Kong, China, October 1-8, 2000; 01/2000
ABSTRACT: Presents the application of statistical language modeling methods for the prediction of the next dialogue act. This prediction is used by different modules of the speech-to-speech translation system VERBMOBIL. The statistical approach uses deleted interpolation of n-gram frequencies as its basis and determines the interpolation weights by a modified version of the standard optimization algorithm. Additionally, we present and evaluate different approaches to improve the prediction process, e.g. including knowledge from a dialogue grammar. Evaluation shows that including the speaker information and mirroring the data delivers the best results.
ABSTRACT: We describe a reusable and scalable dialogue toolbox and its application in multiple systems. Our main claim is that ends-based representation and processing throughout the complete dialogue backbone is essential to our approach.
ABSTRACT: This paper presents SPIN, a semantic parser developed for spoken dialog systems. The parser provides a powerful rule language for an easy and efficient creation of the rule set. Important features of the rule language include order-independent matching, built-in support for referring expressions, rule ordering, constraints and action functions. On the basis of an example utterance the advantages of the introduced features are shown. The increased processing complexity caused by the powerful rule language is handled by a new parsing approach that delivers sufficient performance for rule sets that are typical for dialog systems. We also show how the parser can be used for text generation. The paper closes with an evaluation of the parser performance showing that the approach is well suited for dialog systems.