Article

Intentions in Communication

Authors:
  • Voicebox Technologies, Inc.
To read the full-text of this research, you can request a copy directly from the authors.


... Despite the appearance of an answer to (1), the intent of the speaker in (5) is more likely to be participation in the joke, as the question has already been answered by the same speaker. ... the functionalist philosophy [20], defining intentions as operational plans that are either in our mind or can be entailed by current actions. For example, when a speaker provides an "action-directive" utterance, the speaker's entailed plan is to have certain actions performed. ...
... The ISO 24617-2 standard proposed a semantically-based standard for dialogue annotation, and includes both dialogue acts and the relations between discourse units [14,15,16]. Researchers have long noted that multi-functionality (pragmatic overloading) is hard to capture with a single utterance purpose, especially in multiparty multi-threaded dialogues [4,20,26,22]. In our work, we follow SWBD-DAMSL's approach by augmenting its flattened DA tag set. ...
Preprint
Full-text available
In this paper, we introduce Dependency Dialogue Acts (DDA), a novel framework for capturing the structure of speaker-intentions in multi-party dialogues. DDA combines and adapts features from existing dialogue annotation frameworks, and emphasizes the multi-relational response structure of dialogues in addition to the dialogue acts and rhetorical relations. It represents the functional, discourse, and response structure in multi-party multi-threaded conversations. A few key features distinguish DDA from existing dialogue annotation frameworks such as SWBD-DAMSL and the ISO 24617-2 standard. First, DDA prioritizes the relational structure of the dialogue units and the dialog context, annotating both dialog acts and rhetorical relations as response relations to particular utterances. Second, DDA embraces overloading in dialogues, encouraging annotators to specify multiple response relations and dialog acts for each dialog unit. Lastly, DDA places an emphasis on adequately capturing how a speaker is using the full dialog context to plan and organize their speech. With these features, DDA is highly expressive and recall-oriented with regard to conversation dynamics between multiple speakers. In what follows, we present the DDA annotation framework and case studies annotating DDA structures in multi-party, multi-threaded conversations.
... In (3), for example, the anaphoric elements no (as negating the proposition contained in A's question) and he can only be resolved under the assumption that they realise an answer (to A's question) and an explanation (for the answer), respectively. (See, inter alia, Kamp and Reyle (1993).) Finally, there is a large body of work elucidating the role of the agent model (representing their beliefs, desires, and intentions) in interpreting discourse (Cohen et al., 1990). To again give just one illustrating example, in (4), A must know something about B's likely desires and intentions (to stay awake, or not stay awake) to make sense of their reply. ...
Preprint
Even in our increasingly text-intensive times, the primary site of language use is situated, co-present interaction. It is primary ontogenetically and phylogenetically, and it is arguably also still primary in negotiating everyday social situations. Situated interaction is also the final frontier of Natural Language Processing, where, compared to the area of text processing, very little progress has been made in the past decade, and where a myriad of practical applications is waiting to be unlocked. While the usual approach in the field is to reach, bottom-up, for the ever next "adjacent possible", in this paper I attempt a top-down analysis of what the demands are that unrestricted situated interaction makes on the participating agent, and suggest ways in which this analysis can structure computational models and research on them. Specifically, I discuss representational demands (the building up and application of world model, language model, situation model, discourse model, and agent model) and what I call anchoring processes (incremental processing, incremental learning, conversational grounding, multimodal grounding) that bind the agent to the here, now, and us.
... 2 For a more complete and detailed discussion of actions and plans (on their preconditions and results; on how the contexts may affect their effects; on their explicit or implicit conflicts, etc.), please refer to [18,19]. 3 The context c defines the boundary conditions that can influence the other parameters of the indicated relationship. ...
Article
Full-text available
In this paper, we investigate the primitives of collaboration, useful also for conflicting and neutral interactions, in a world populated by both artificial and human agents. We analyze in particular the dependence network of a set of agents. And we enrich the connections of this network with the beliefs that agents have regarding the trustworthiness of their interlocutors. Thanks to a structural theory of what kind of beliefs are involved, it is possible not only to answer important questions about the power of agents in a network, but also to understand the dynamical aspects of relational capital. In practice, we are able to define the basic elements of an extended sociality (including human and artificial agents). In future research, we will address autonomy.
... The roots of contemporary studies of human communication trace back to mentalist readings of work done in the philosophy of language by Austin (1962), Grice (1989) and Searle (1969, 1979), as well as to later attempts to formalize their theories in a computational perspective (e.g., Allen & Perrault, 1980; Cohen & Perrault, 1979; Cohen, Morgan, & Pollack, 1990). The overall framework so developed was then adopted by researchers more interested in understanding the actual functioning and activities of the human mind (e.g., Airenti, Bara, & Colombetti, 1993; Clark, 1992, 1996; Sperber & Wilson, 1986; Tirassa, 1997, 1999a). ...
... In the context of Natural Language Processing, intent is defined by Dumoulin as "an interpretation of a statement or question that allows one to formulate the 'best' response to the statement" (Dumoulin 2014). There are multiple approaches to intent recognition, such as (Cohen, Morgan, and Pollack 1990; Holtgraves 2008; Montero and Araki 2005), but we assume the pre-existence of chatbots which perform such tasks. ...
Conference Paper
Full-text available
When reviewing a chatbot's performance, it is desirable to prioritize conversations involving misunderstood human inputs. A system for measuring the posthoc risk of missed intent associated with a single human input is presented. Using defined indicators of risk, the system's performance in identifying misunderstood human inputs is given. These indicators are given weights and optimized on real world data. By application of our system, language model development is improved.
... Grice's introduction of a principled distinction between what people literally say in conversation and what they imply by what they say provided impetus for intention based formal approaches to modeling conversation (Grice, 1975;Sperber & Wilson, 1986). These developments in formal semantics and pragmatics in turn prompted new lines of investigation in experimental psychology and in computational modeling of interaction (see, e.g., Cohen, Morgan, & Pollack, 1990). ...
Article
Full-text available
Miscommunication is a neglected issue in the cognitive sciences, where it has often been discounted as noise in the system. This special issue argues for the opposite view: Miscommunication is a highly structured and ubiquitous feature of human interaction that systematically underpins people’s ability to create and maintain shared languages. Contributions from conversation analysis, computational linguistics, experimental psychology, and formal semantics provide evidence for these claims. They highlight the multi-modal, multi-person character of miscommunication. They demonstrate the incremental, contingent, and locally adaptive nature of the processes people use to detect and deal with miscommunication. They show how these processes can drive language change. In doing so, these contributions introduce an alternative perspective on what successful communication is, new methods for studying it, and application areas where these ideas have a particular impact. We conclude that miscommunication is not noise but essential to the productive flexibility of human communication, especially our ability to respond constructively to new people and new situations.
... People usually communicate with each other verbally, but nonverbal information, such as facial expression, gaze direction, and gestures, also plays an important role [1]. People can more easily understand the internal states of others from such information [2]. Even subtle changes in nonverbal information affect human communications. ...
Conference Paper
Full-text available
The impressions made by a blinking light used to create artificial subtle expressions (ASEs) and by a robot's appearance on users were investigated. The blinking light, which shows the user that the robot is performing speech recognition and thereby prevents utterance collisions, was separated from the robot by embedding it in a pedestal unit. In an evaluation experiment, participants performed five tasks with a spoken dialogue system coupled to a robot placed on the pedestal. The participants' impressions of the dialogue interactions and of the robot were obtained under four conditions (w/ light blinking or w/o blinking; humanoid or cuboid robot). The cuboid robot created a stronger impression of comfort and excitement for the interactions while the blinking light did not create a strong impression of anything. The robot's appearance and the blinking did not create a strong impression of anything for the robot. This suggests that the blinking light in the pedestal unit is a factor that is independent of robot appearance, meaning that the pedestal unit can be applied to robots with various appearances.
... Goal inferences are likely a fundamental tendency of human nature, and their occurrence is sensitive to the sociocognitive production and processing of messages (Berger, 2000; Palomares, 2008). In fact, much scholarship assumes that people detect or infer others' goals (e.g., Carberry, 1990; Cohen, Morgan, & Pollack, 1990; Schank & Abelson, 1977; Schmidt, 1976). Moreover, goal detection is consequential: People use their understanding of others' goals to explain their behavior (Dillard, 1990; Poynor & Morris, 2003); and accurate detection at times can promote effective communication (Nauta & Sanders, 2001; Russell & Schober, 1999), comprehension, and recall (Lynch & van den Broek, 2007). ...
Article
Full-text available
An experiment examined a theory explaining how people detect others’ goals. The framework maintains that because components or factors (e.g., context, tactic) of interaction increase the accessibility of inferable goals, goal detection is a product of the goals these cognitive linkages activate. In dyadic initial interactions, one participant was randomly assigned as the pursuer and the other as the detector; detectors sought a goal varying in congruency (i.e., identical, concord, and discord) with pursuers’ goal. Detectors’ cognitive busyness was also manipulated. The level of efficiency at which pursuers sought their goal and the accuracy and certainty of detectors’ inference of pursuers’ goal were measured. Results generally confirmed hypotheses. Efficiency and accuracy were positively correlated only when (a) not-busy detectors’ goal was concordant with pursuers’ goal and (b) busy detectors’ goal was discordant with pursuers’ goal, whereas efficiency and certainty were positively correlated only for not-busy detectors. Other results dealt with how detectors’ perspective taking promotes accuracy for inefficient goal pursuit and how accuracy yields favorable ratings of pursuers’ communication competence when goal inferences are certain. Results are discussed theoretically and methodologically.
Thesis
Full-text available
Dialogue is an interactive endeavour in which participants jointly pursue the goal of reaching understanding. Since participants enter the interaction with their individual conceptualisation of the world and their idiosyncratic way of using language, understanding cannot, in general, be reached by exchanging messages that are encoded when speaking and decoded when listening. Instead, speakers need to design their communicative acts in such a way that listeners are likely able to infer what is meant. Listeners, in turn, need to provide evidence of their understanding in such a way that speakers can infer whether their communicative acts were successful. This is often an interactive and iterative process in which speakers and listeners work towards understanding by jointly coordinating their communicative acts through feedback and adaptation. Taking part in this interactive process requires dialogue participants to have ‘interactional intelligence’. This conceptualisation of dialogue is rather uncommon in formal or technical approaches to dialogue modelling. This thesis argues that it may, nevertheless, be a promising research direction for these fields, because it de-emphasises raw language processing performance and focusses on fundamental interaction skills. Interactionally intelligent artificial conversational agents may thus be able to reach understanding with their interlocutors by drawing upon such competences. This will likely make them more robust, more understandable, more helpful, more effective, and more human-like. This thesis develops conceptual and computational models of interactional intelligence for artificial conversational agents that are limited to (1) the speaking role, and (2) evidence of understanding in form of communicative listener feedback (short but expressive verbal/vocal signals, such as ‘okay’, ‘mhm’ and ‘huh’, head gestures, and gaze). 
This thesis argues that such ‘attentive speaker agents’ need to be able (1) to probabilistically reason about, infer, and represent their interlocutors’ listening related mental states (e.g., their degree of understanding), based on their interlocutors’ feedback behaviour; (2) to interactively adapt their language and behaviour such that their interlocutors’ needs, derived from the attributed mental states, are taken into account; and (3) to decide when they need feedback from their interlocutors and how they can elicit it using behavioural cues. This thesis describes computational models for these three processes, their integration in an incremental behaviour generation architecture for embodied conversational agents, and a semi-autonomous interaction study in which the resulting attentive speaker agent is evaluated. The evaluation finds that the computational models of attentive speaking developed in this thesis enable conversational agents to interactively reach understanding with their human interlocutors (through feedback and adaptation) and that these interlocutors are willing to provide natural communicative listener feedback to such an attentive speaker agent. The thesis shows that computationally modelling interactional intelligence is generally feasible, and thereby raises many new research questions and engineering problems in the interdisciplinary fields of dialogue and artificial conversational agents.
Article
Full-text available
Combining discourse analysis with quantitative methods, this article compares how the legislatures of Turkey, the US, and the EU discursively constructed Turkey's Kurdish question. An examination of the legislative-political discourse from 1990 to 1999 suggests that a country suffering from a domestic secessionist conflict perceives and verbalizes the problem differently than outside observers and external stakeholders do. Host countries of conflicts perceive their problems through a more security-oriented lens, and those who observe these conflicts at a distance focus more on the humanitarian aspects. As regards Turkey, this study tests politicians' perceptions of conflicts and the influence of these perceptions on their preexisting political agendas for the Kurdish question, and offers a new model for studying political discourse on intra-state conflicts. The article suggests that a political agenda emerges as the prevalent dynamic in conservative politicians' approaches to the Kurdish question, whereas ideology plays a greater role for liberal/pro-emancipation politicians. Data shows that politically conservative politicians have greater variance in their definitions, based on material factors such as financial, electoral, or alliance-building constraints, whereas liberal and/or left-wing politicians choose ideologically confined discursive frameworks such as human rights and democracy.
Chapter
Full-text available
Being social creatures in a complex world, we do things together. We act jointly. While cooperation, in its broadest sense, can involve merely getting out of each other’s way, or refusing to deceive other people, it is also essential to human nature that it involves more active forms of collaboration and coordination (Tomasello 2009; Sterelny 2012). We collaborate with others in many ordinary activities which, though at times similar to those of other animals, take unique and diverse cultural and psychological forms in human beings. But we also work closely and interactively with each other in more peculiar and flexible practices which are in distinctive ways both species-specific and culturally and historically contingent: from team sports to shared labor, from committee work to mass demonstrations, from dancing to reminiscing together about old times. © Palgrave Macmillan, a division of Macmillan Publishers Limited 2014.
Chapter
I do not intend to draw up an inventory of the past and present publications on intonation. Partial or exhaustive references can be found in (1970), (1975), (1976), (1979), (1981), (1984), (1986), (1996), (1997), (1998), (1999). It would require a huge amount of effort to go back and look for past studies on prosody or intonation, which have been increasing exponentially in number over the years. Such a historical overview is beyond the scope of this paper. But it would be worth reporting the milestones which contain the seeds of the concepts underlying current theories on intonation and explain their development.
Chapter
Full-text available
Recent research in the formal modelling of dialogue has led to the conclusion that bifurcations like language use versus language structure, competence versus performance, grammatical versus psycholinguistic/pragmatic modes of explanation are all based on an arbitrary and ultimately mistaken dichotomy, one that obscures the unitary nature of the phenomena because it insists on a view of grammar that ignores essential features of natural language (NL) processing. The subsequent radical shift towards a conception of NL grammars as procedures for enabling interaction in context (Kempson et al. 2009a, b) now raises a host of psychological and philosophical issues: The ability of dialogue participants to take on or hand over utterances mid-sentence raises doubts as to the constitutive status of Gricean intention-recognition as a fundamental mechanism in communication. Instead, the view that emerges, rather than relying on mind-reading and cognitive state metarepresentational capacities, entails a reconsideration of the notion of communication and a non-individualistic view on meaning. Coordination/alignment/intersubjectivity among dialogue participants is now seen as relying on low-level mechanisms like the grammar (appropriately conceived).
Thesis
Full-text available
The main purpose of this study is to analyse the denial or refutation discourse relation and the linguistic constructions that express it in European contemporary Portuguese. Operating at the pragmatic level, denial can be defined as the discourse relation that holds between an utterance, produced by a speaker B, whose function is to reject another utterance, the target utterance produced by a speaker A. Prototypically, this relation occurs in dialogues and functions as a reactive act, a face-threatening act that challenges the faces of both speakers. The text span that carries out this function is typically followed by a discourse continuation that corrects or rectifies the target utterance. In his/her corrective discourse move, the speaker B presents the information that, in his/her opinion, should replace what was previously rebutted. In its most prototypical occurrences, denial involves what is said or implied in the rebutted utterance. In its less prototypical occurrences, it can equally involve formal aspects of the rebutted utterance, associated, in a broad sense, with its linguistic accuracy. Typically, in both cases, it is not the entire utterance produced by the speaker A that is rejected, but only the constituents that lead to its unacceptability from the point of view of the speaker B. Such constituents are typically focused by a wide range of linguistic processes, such as constituent negation, cleft sentences and contrastive intonation. A comprehensive analysis of these syntactic and prosodic focusing processes needs to consider the discourse function of the sequences in which they occur. In contemporary European Portuguese, the denial and correction text sequences under analysis can be shaped into two types of paratactic constructions: coordination and juxtaposition constructions. 
In coordination constructions, the linguistic forms não p, mas q or não p, mas sim/mas antes/e sim q, seem to be specialized in the linguistic marking of the discourse relations at stake (denial and correction). These constructions can be used to reject not only what has been said in or implied by the target utterance, but also some formal aspects of its linguistic formulation. In coordination constructions, denial and rectification may also be expressed by the construction não só p, mas também q, which is specialized in the rebuttal of the propositional content or the Q-implicatures associated with the target utterance. Constructions such as não é p, mas/mas sim/mas antes/e sim q, não se diz p, mas/mas sim/mas antes/e sim q, seem to be specialized in the rejection of formal aspects of the target utterance. In juxtaposition constructions such as não p || q, the link between the text spans that refute and rectify the target utterance is not marked by any connective. There are no restrictions concerning the kind of element (propositional content, implicatures or formal aspects) that triggers the rejection of the target utterance. In the juxtaposition constructions, the denial and corrective text spans can also assume specialized forms when only formal aspects of the target utterance are involved. It is the case of não é p || é q or similar ones, such as não se diz p || diz-se q or p, não || q. It is worth stressing that in juxtaposition constructions, the denial text span does not necessarily take the form of a (metalinguistic) negative utterance. Finally, juxtaposition constructions also display structures such as não p || antes/sim/pelo contrário q. In this context, these expressions behave as discourse markers or discourse connectives. They give instructions on how to compute the discourse relation that holds between p and q. 
The constructions não p || sim/antes q, where the units sim and antes signal, respectively, the polarity contrast and the preferential value of the text span in which they occur, are only acceptable when what is being rejected is the propositional content or the implicatures associated with the target utterance. The construction não p || pelo contrário q, given the antithetical value of the connective pelo contrário, is only acceptable when what is being rejected is the propositional content of the target utterance. More specifically, this construction is only acceptable when, in the target utterance and in the utterance introduced by pelo contrário, there are two distinct predicates related by an antonymic semantic relation. In contemporary European Portuguese, there are other kinds of expressions (non-connective ones) whose function, in specific contexts, is also related with the conventional marking of the discourse relation of denial: lá, cá, agora (Martins, 2010) and nada, in constructions such as [V_nada] (Pinto, 2011), seem to have a clear denial marking function. Furthermore, the expression mas é may also signal the rectification/correction discourse relation. These expressions occur predominantly in oral interaction, particularly in informal registers, and their usage seems to be restricted to cases where what is being rejected is the propositional content or the implicatures associated with the target utterance.
Article
Full-text available
There have been few groundbreaking works on the meaning of intonation. Bartels’s (1999) book or Pierrehumbert and Hirschberg’s (1990) famous paper and the response to it by Hobbs (1990) are the most cited examples of such studies. Truckenbrodt’s 2012 article on the semantics of intonation is too recent to be widely cited, but may well become another classical article on the meaning of prosody. Although not all ten papers of Prosody and meaning can compare to these, a few very useful and innovative studies are grouped together, and this renders the book important. Prosody and tonal structure are modules of linguistics that are lagging behind most others; this is because they need phonetic analysis, and the technology necessary for their study has become accessible to a large number of researchers only recently. Moreover, as a part of linguistics, intonation is an interface discipline, in need of syntactic and semantic components. Only few linguists master all three domains sufficiently to explore meaning in prosody. In their introduction to the book, the editors mention that papers with similar main themes or approaches could be grouped together, and the editors do an excellent job of summarizing the content of the papers, but a summary of the whole book or of the larger perspective behind the book is missing, as are connections among the contributions or to intonation research in general. Perhaps this is due to the way the book came into being in the first place: it grew out of a workshop held at the Institut d’Estudis Catalans in Barcelona, September 17–19, 2009, concerning the relation between prosody and meaning. Some of the papers presented there are included in the book plus a few additional ones. The book is not organized in sections, and the groups of papers mentioned in the introduction do not serve as a guideline for the order of the chapters. In my review, I follow the order of the book, and address each paper in turn. 
As the editors put it in the introduction, ‘Most of the papers devoted to the study of the production and perception of intonational contrasts related to information structure adopt a laboratory phonology methodology’ (1). In this category, we find Mariapaola D’Imperio, James German, and Amandine Michelas’s paper, ‘A multi-level approach to focus, phrasing and intonation in French’, in their own wording, a compilation of older studies by the same authors. It contains a very nice and informative review of the previous literature on French intonation. The authors assume that, besides the prosodic domains of intonation phrase and accentual phrase used by most researchers, French needs an additional prosodic domain, called the intermediate phrase. This is because all prosodic levels characterized by tonal structure have an initial tonal rise; this initial rise is larger and more likely to occur at the left edge of a contrastive focus domain and in longer phrases, although these two factors do not interact. It is the presence of this larger initial rise that motivates an additional prosodic domain. A number of alternatives that could account for the larger initial rise without losing the insight that long or focused phrases are often phrased independently come to mind, such as recursive prosodic domains or larger phonetic cues due to information structure, but none of these alternatives are discussed. The second paper, ‘Syntax-prosody mapping, topic-comment structure and stress-focus correspondence in Hungarian’ by Balázs Surányi, Shinichiro Ishihara, and Fabian Schubö, reports on original research. The authors investigated the prosodic realization of Hungarian sentences containing two quantified phrases, QP1 and QP2, in preverbal position in three focus conditions: broad focus, narrow focus on QP1, and narrow focus on QP2. 
They ask the interesting question of how the need for nuclear stress to be preverbal is reconciled with the presence of two QPs in this position. The carefully analyzed phonetic results show that the speakers chose different prosodic patterns to realize a narrow focus that is not in the canonical preverbal position, thus on QP1. They also show what the speakers did with QP1 when QP2 was narrowly focused...
Chapter
Spoken language interactive systems range from speech-enabled command interfaces to dialogue systems which conduct spoken conversations with the user. In the first case, spoken language is used as an alternative input and output modality, so that the commands, which the user could type or select from the menu, may also be uttered. The system responses can also be given as spoken utterances, instead of written language or drawings on the screen, so the whole interaction can be conducted in speech. Spoken dialogue systems, however, are built on models concerning spoken conversations between participants so as to allow flexible interaction capabilities. Although interactions are limited concerning topics, turn-taking principles and conversational strategies, the systems aim at human–computer interaction that would support natural interaction which enables the user to interact with the system in an intuitive manner. Moreover, trying to combine insights of the processes that underlie typical human interactions, spoken dialogue modelling also seeks to advance our knowledge and understanding of the principles that govern communicative situations in general.