ABSTRACT: Increased availability of mobile computing, such as personal digital assistants (PDAs), creates the potential for constant and intelligent access to up-to-date, integrated and detailed information from the Web, regardless of one's actual geographical position. Intelligent question-answering requires the representation of knowledge from various domains, such as the navigational and discourse context of the user, potential user questions, the information provided by Web services and so on, for example in the form of ontologies. Within the context of the SmartWeb project, we have developed a number of domain-specific ontologies that are relevant for mobile and intelligent user interfaces to open-domain question-answering and information services on the Web. To integrate the various domain-specific ontologies, we have developed a foundational ontology, the SmartSUMO ontology, on the basis of the DOLCE and SUMO ontologies. This allows us to combine all the developed ontologies into a single SmartWeb Integrated Ontology (SWIntO) having a common modeling basis with conceptual clarity and the provision of ontology design patterns for modeling consistency. In this paper, we present SWIntO, describe the design choices we made in its construction, illustrate the use of the ontology through a number of applications, and discuss some of the lessons learned from our experiences.
Full-text · Article · Jul 2007 · Journal of Web Semantics
ABSTRACT: This paper presents the text generation module of SmartWeb, a multimodal dialogue system. The generation module is based on NipsGen, which combines SPIN, originally a parser developed for spoken language, with a tree-adjoining grammar framework for German. NipsGen allows mixing full generation with canned
ABSTRACT: SMARTWEB aims to provide intuitive multimodal access to a rich selection of Web-based information services. We report on the current prototype with a smartphone client interface to the Semantic Web. An advanced ontology-based representation of facts and media structures serves as central description for rich media content. Underlying content is accessed through conventional web service middleware to connect the ontological knowledge base and an intelligent web service composition module for external web services, which is able to translate between ordinary XML-based data structures and explicit semantic representations for user queries and system responses. The presentation module renders the media content and the results generated from the services and provides a detailed description of the content and its layout to the fusion module. The user is then able to employ multiple modalities, like speech and gestures, to interact with the presented multimedia material in a multimodal way.
ABSTRACT: SmartWeb aims to provide intuitive multimodal access to a rich selection of Web-based information services. The current SmartWeb prototypes are a smartphone client interface to the Semantic Web, an onboard car dialog system that gets updates from the Web, and a motorbike
ABSTRACT: In this chapter we give a general overview of the modality fusion component of SmartKom. Based on a selection of prominent multimodal interaction patterns, we present our solution for synchronizing the different modes. Finally, we give, on an abstract level, a summary of our approach to modality fusion.
ABSTRACT: This chapter presents SPIN, a newly developed template-based semantic parser used for the task of natural language understanding in SmartKom. The most outstanding feature of the approach is a powerful template language to provide easy creation and maintenance of the templates and flexible processing. Nevertheless, to achieve fast processing, the templates are applied in a sequential order that is determined offline.
ABSTRACT: This paper presents a semantic parser for spoken dialogue systems. The parser is designed especially for the analysis of free word order languages by providing a feature called order-independent matching. We describe how this feature allows writing of rules for free word order languages in an elegant way (using German as example language) and how it increases the robustness against speech recognition errors. As order-independent matching makes efficient parsing more difficult, we present a new parsing approach which provides efficient processing for rule bases that are, according to our experience, typical for spoken dialogue systems. The key feature of the parsing approach is a fixed application order of the rules to prune irrelevant results. A preliminary evaluation of the parser shows that this approach works very well in real-world dialogue systems.
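The two ideas named in this abstract, order-independent matching and a fixed application order of the rules, can be illustrated with a small sketch. It is not the parser's actual rule language or algorithm: the `Rule` class, the `apply_rule` and `parse` helpers, and the two toy rules are all hypothetical, chosen only to show that a rule which ignores word positions yields the same result for different German word orders.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: if all `required` items occur, replace them by `output`."""
    name: str
    required: tuple  # items that must all be present, in any order
    output: str      # semantic symbol produced by the rule


def apply_rule(rule, items):
    """Match `rule` against `items` regardless of position.

    Returns the new item list (matched items removed, output appended),
    or None if the rule does not match.
    """
    need = Counter(rule.required)
    have = Counter(items)
    if any(have[w] < n for w, n in need.items()):
        return None
    remaining = []
    for w in items:
        if need[w] > 0:
            need[w] -= 1  # consume one required occurrence
        else:
            remaining.append(w)
    return remaining + [rule.output]


def parse(words, rules):
    """Apply every rule once, in the fixed order given by `rules`."""
    items = list(words)
    for rule in rules:
        result = apply_rule(rule, items)
        if result is not None:
            items = result
    return items


# Toy rule base (assumption); the second rule builds on the first rule's output,
# which is why the application order matters.
RULES = [
    Rule("destination", ("nach", "berlin"), "DEST(berlin)"),
    Rule("travel", ("fahren", "DEST(berlin)"), "TRAVEL(berlin)"),
]

order_a = parse("ich will morgen nach berlin fahren".split(), RULES)
order_b = parse("morgen will ich nach berlin fahren".split(), RULES)
print(order_a)
print(order_b)
```

Both word orders end up containing the same `TRAVEL(berlin)` symbol. Because each rule is tried exactly once in a predetermined sequence, the sketch never searches over alternative rule orderings, which is one simple way to read the "fixed application order" mentioned above.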
ABSTRACT: Experience shows that decisions in the early phases of the development of a multimodal system prevail throughout the life-cycle of a project. The distributed architecture and the requirement for robust multimodal interaction in our project SmartWeb resulted in an approach that uses and extends W3C standards like EMMA and RDFS. These standards for the interface structure and content allowed us to integrate available tools and techniques. However, the requirements in our system called for various extensions, e.g., to introduce result feedback tags for an extended version of EMMA. The interconnection framework depends on a commercial telephone voice dialog system platform for the dialog-centric components while the information access processes are linked using web service technology. Also in the area of this underlying infrastructure, enhancements and extensions were necessary. The first demonstration system is operable now and will be presented at the Football World Cup 2006 in Germany.
ABSTRACT: In this paper we report on ongoing experiments with an advanced multimodal system for applications in architectural design. The system supports uninformed users in entering the relevant data about a bathroom that must be refurnished, and is tested with 28 subjects. First, we describe the IST project COMIC, which is the context of the research. We explain how the work in COMIC goes beyond previous research in multimodal interaction for eWork and eCommerce applications that combine speech and pen input with speech and graphics output: in design applications one cannot assume that uninformed users know what they must do to satisfy the system's expectations. Consequently, substantial system guidance is necessary, which in turn creates the need to design a system architecture and an interaction strategy that allow the system to control and guide the interaction. The results of the user tests show that the appreciation of the system is mainly determined by the accuracy of the pen and speech input recognisers. In addition, the turn taking protocol needs to be improved.
ABSTRACT: We present applications enabled via the employment of a single knowledge representation in the SMARTKOM multi-modal multi-domain dialogue system. We focus on how a rigorously constructed ontology, whose ontological and representational choices are shared by multiple components of the system, can be reused in different projects and applied to various tasks.
ABSTRACT: The development of an intelligent user interface that supports multimodal access to multiple applications is a challenging task. In this paper we present a generic multimodal interface system where the user interacts with an anthropomorphic personalized interface agent using speech and natural gestures. The knowledge-based and uniform approach of SmartKom enables us to realize a comprehensive system that understands imprecise, ambiguous, or incomplete multimodal input and generates coordinated, cohesive, and coherent multimodal presentations for three scenarios, currently addressing more than 50 different functionalities of 14 applications. We demonstrate the main ideas in a walk through the main processing steps from modality fusion to modality fission.
ABSTRACT: Marché Bonsecours. There is nothing to indicate it today, but the Canadian Parliament once stood in the parking lot that sprawls across much of the west end of Place d'Youville immediately south of the Hotel St. Paul. Montreal became the capital of the United Province of Canada in 1844 and the government moved into St. Ann's Market, an imposing two-storey limestone building that was built here in the early 1830s. The marketplace, 350 feet long and 50 feet wide, was converted into an imposing House of Assembly. The Legislative Council was in the east wing, the House of Assembly occupied the west wing. John A. Macdonald, later Canada's first prime minister, made his maiden speech as a parliamentarian in the building on April 27, 1846.
ABSTRACT: This paper describes a novel functionality of the VerbMobil system, a large scale translation system designed for spontaneously spoken multilingual negotiation dialogues. The task is the on-demand generation of dialogue scripts and result summaries of dialogues. We focus on summary generation and show how the relevant data are selected from the dialogue memory and how they are packed into an appropriate abstract representation. Finally, we demonstrate how the existing generation module of VerbMobil was extended to produce multilingual and result summaries from these representations.
ABSTRACT: This chapter explains the major functionality of the dialog module in Verbmobil. Dialog knowledge is needed for context sensitive speech translation as well as for the automatic generation of dialog result summaries. Our component produces necessary structures for both purposes and stores them in a centrally accessible data repository, the dialog memory. The structures are based on robustly extracted shallow data which are corrected, extended and structured by our dialog processor. We use time and object completion algorithms to collect context data and compute inter-object relations to infer relevance for summarization. The resulting structures are used by the document generator for dialog minutes and summaries, and by the context evaluation module for translation disambiguation.
ABSTRACT: For various purposes in the Verbmobil system it is necessary to build a full model of an unfolding dialog, on a suitably abstract level of representation. The model is based on representations of the individual utterances, and we capture their content by a combination of dialog act and propositional content. Our hierarchy of dialog acts was used to annotate 21 CD-ROMs from the Verbmobil corpus, and the experience gained with the framework influenced standardization efforts in the international scientific community. On the side of propositional content, particular attention was given to the representation of temporal expressions, due to the application domains of Verbmobil.
ABSTRACT: The design rationale guiding the development of the reductionist dialog act based translation module in Verbmobil was robustness. Even when the speech recognition or the prosodic processing does not perform perfectly, this module extracts and translates the main intentions and facts related to the domain. In a three step approach, first the dialog act describing the intention is computed using a statistical approach. The second step is the construction of the propositional content with robust hierarchical finite state transducers. For the definition of the transducers, knowledge sources available in Verbmobil are exploited. The resulting representation of these two steps is used in a template based finite state generator to realize the target language expressions. The internal representation is also communicated to the dialog module where it plays an important part in maintaining the dialog state.
ABSTRACT: We present the multilingual summarization functionality for VERBMOBIL, a speech translation system. We reuse resources of the system to create a summary. After content extraction, we interpret the results in the dialog context. A summary generator provides the input to generation. A first evaluation indicates the feasibility of the approach.
ABSTRACT: Presents the application of statistical language modeling methods for the prediction of the next dialogue act. This prediction is used by different modules of the speech-to-speech translation system VERBMOBIL. The statistical approach uses deleted interpolation of n-gram frequencies as its basis and determines the interpolation weights by a modified version of the standard optimization algorithm. Additionally, we present and evaluate different approaches to improve the prediction process, e.g. including knowledge from a dialogue grammar. Evaluation shows that including the speaker information and mirroring the data delivers the best results.
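As a rough illustration of predicting the next dialogue act from interpolated n-gram frequencies, the sketch below mixes unigram, bigram and trigram relative frequencies linearly. Several things are assumptions rather than details from the abstract: the toy dialogues and act labels, the function names, and above all the hand-set weights. In deleted interpolation the weights would be estimated from held-out data, and the modified optimization algorithm mentioned in the abstract is not reproduced here.

```python
from collections import Counter

START = "<s>"  # padding symbol for the start of a dialogue (assumption)


def train(dialogues):
    """Count dialogue-act unigrams, bigrams, trigrams and their contexts."""
    counts = {name: Counter() for name in ("uni", "bi", "tri", "ctx_bi", "ctx_tri")}
    for acts in dialogues:
        padded = [START, START] + list(acts)
        for i in range(2, len(padded)):
            a, b, c = padded[i - 2], padded[i - 1], padded[i]
            counts["uni"][c] += 1
            counts["bi"][(b, c)] += 1
            counts["tri"][(a, b, c)] += 1
            counts["ctx_bi"][b] += 1
            counts["ctx_tri"][(a, b)] += 1
    return counts


def ratio(num, den):
    """Relative frequency, defined as 0.0 for an unseen context."""
    return num / den if den else 0.0


def prob(counts, history, act, weights):
    """Linearly interpolated estimate of P(act | last two acts)."""
    a, b = history
    w1, w2, w3 = weights
    p1 = ratio(counts["uni"][act], sum(counts["uni"].values()))
    p2 = ratio(counts["bi"][(b, act)], counts["ctx_bi"][b])
    p3 = ratio(counts["tri"][(a, b, act)], counts["ctx_tri"][(a, b)])
    return w1 * p1 + w2 * p2 + w3 * p3


def predict_next(counts, history, weights):
    """Return the dialogue act with the highest interpolated probability."""
    return max(counts["uni"], key=lambda act: prob(counts, history, act, weights))


# Toy training data with invented act labels (assumption).
DIALOGUES = [
    ["greet", "suggest", "accept", "bye"],
    ["greet", "suggest", "reject", "suggest", "accept", "bye"],
]
WEIGHTS = (0.1, 0.3, 0.6)  # hand-set; a real system would estimate these

model = train(DIALOGUES)
prediction = predict_next(model, ("greet", "suggest"), WEIGHTS)
print(prediction)
```

The weights sum to one, so for a history seen in training the interpolated values over all acts still form a probability distribution. The lower-order terms keep the estimate usable when a trigram context was never observed.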