Arash Eshghi
Heriot-Watt University · Interaction Lab, Department of Computer Science

PhD

About

85 Publications
12,309 Reads
913 Citations
Additional affiliations
April 2015 - February 2017 · Heriot-Watt University · Research Associate
January 2013 - present · Heriot-Watt University · Research Associate
October 2009 - December 2012 · Queen Mary, University of London · Research Assistant (Psychology of Dialogue, Semantics, Context, Linguistics, Grammar Induction)

Publications (85)
Article
Anecdotal evidence suggests that participants in conversation can sometimes act as a coalition. This implies a level of conversational organization in which groups of individuals form a coherent unit. This paper investigates the implications of this phenomenon for psycholinguistic and semantic models of shared context in dialog. We present a corpus...
Conference Paper
Full-text available
In conversation, interlocutors routinely indicate whether something said or done has been processed and integrated. Such feedback includes backchannels such as 'okay' or 'mhm', the production of a next relevant turn, and repair initiation via clarification requests. Importantly, such feedback can be produced not only at sentence/turn boundaries, bu...
Conference Paper
Full-text available
We describe a method for learning an incremental semantic grammar from data in which utterances are paired with logical forms representing their meaning. Working in an inherently incremental framework, Dynamic Syntax, we show how words can be associated with probabilistic procedures for the incremental projection of meaning, providing a grammar...
Article
Full-text available
We present a method for inducing new dialogue systems from very small amounts of unannotated dialogue data, showing how word-level exploration using Reinforcement Learning (RL), combined with an incremental and semantic grammar - Dynamic Syntax (DS) - allows systems to discover, generate, and understand many new dialogue variants. The method avoids...
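As a rough illustration of the word-level exploration idea summarised in the abstract above, the toy sketch below is a deliberately simplified assumption: it does not use Dynamic Syntax or the paper's method, only a hand-written lexicon standing in for grammar-licensed continuations, with tabular Q-learning rewarded when the completed utterance matches a made-up task goal.

```python
"""Toy sketch of word-level exploration with RL (illustration only).

A hypothetical 'lexicon' restricts which words may follow a partial utterance;
a tabular Q-learner explores word sequences and is rewarded only when the
finished utterance hits a made-up task goal. All names and the reward scheme
are assumptions for illustration, not the published system.
"""
import random
from collections import defaultdict

# Hypothetical lexicon: words licensed at each position of the utterance.
LEXICON = [["i", "you"], ["want", "like"], ["tea", "coffee"]]
GOAL = ("i", "want", "coffee")          # made-up task goal

Q = defaultdict(float)                  # Q[(state, word)] -> value
EPS, ALPHA, GAMMA = 0.2, 0.5, 0.9

def choose(state, options):
    """Epsilon-greedy choice over the licensed words."""
    if random.random() < EPS:
        return random.choice(options)
    return max(options, key=lambda w: Q[(state, w)])

def episode():
    utterance = []
    for step, options in enumerate(LEXICON):
        state = tuple(utterance)
        word = choose(state, options)
        utterance.append(word)
        # Reward only when the utterance is complete and matches the goal.
        reward = 1.0 if tuple(utterance) == GOAL else 0.0
        next_state = tuple(utterance)
        next_best = 0.0
        if step + 1 < len(LEXICON):
            next_best = max(Q[(next_state, w)] for w in LEXICON[step + 1])
        Q[(state, word)] += ALPHA * (reward + GAMMA * next_best - Q[(state, word)])
    return utterance

if __name__ == "__main__":
    random.seed(0)
    for _ in range(500):
        episode()
    EPS = 0.0  # greedy rollout after training should reproduce the goal
    print(" ".join(episode()))
```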
Preprint
Full-text available
Evaluating Large Language Models (LLMs) on reasoning benchmarks demonstrates their ability to solve compositional questions. However, little is known about whether these models engage in genuine logical reasoning or simply rely on implicit cues to generate answers. In this paper, we investigate the transitive reasoning capabilities of two distinct LLM...
Preprint
Full-text available
In dialogue, the addressee may initially misunderstand the speaker and respond erroneously, often prompting the speaker to correct the misunderstanding in the next turn with a Third Position Repair (TPR). The ability to process and respond appropriately to such repair sequences is thus crucial in conversational AI systems. In this paper, we first c...
Preprint
Full-text available
This study explores replacing Transformers in Visual Language Models (VLMs) with Mamba, a recent structured state space model (SSM) that demonstrates promising performance in sequence modeling. We test models up to 3B parameters under controlled conditions, showing that Mamba-based VLMs outperform Transformer-based VLMs in captioning, question an...
Preprint
AI personal assistants deployed via robots or wearables require embodied understanding to collaborate with humans effectively. However, current Vision-Language Models (VLMs) primarily focus on third-person view videos, neglecting the richness of egocentric perceptual experience. To address this gap, we propose three key contributions. First, we int...
Article
Full-text available
In spontaneous conversation, speakers seldom have a full plan of what they are going to say in advance: they need to conceptualise and plan incrementally as they articulate each word in turn. This often leads to long pauses mid-utterance. Listeners either wait out the pause, offer a possible completion, or respond with an incremental clarification...
Preprint
Full-text available
In conversation, speakers produce language incrementally, word by word, while continuously monitoring the appropriateness of their own contribution in the dynamically unfolding context of the conversation; and this often leads them to repair their own utterance on the fly. This real-time language processing capacity is furthermore crucial to the de...
Preprint
Full-text available
The ability to handle miscommunication is crucial to robust and faithful conversational AI. People usually deal with miscommunication immediately as they detect it, using highly systematic interactional mechanisms called repair. One important type of repair is Third Position Repair (TPR) whereby a speaker is initially misunderstood but then correct...
Preprint
Referential ambiguities arise in dialogue when a referring expression does not uniquely identify the intended referent for the addressee. Addressees usually detect such ambiguities immediately and work with the speaker to repair them using meta-communicative Clarificational Exchanges (CEs): a Clarification Request (CR) and a response. Here, we argue...
Article
Social robots have limited social competences. This leads us to view them as depictions of social agents rather than actual social agents. However, people also have limited social competences. We argue that all social interaction involves the depiction of social roles and that they originate in, and are defined by, their function in accounting for...
Preprint
Anaphoric expressions, such as pronouns and referential descriptions, are situated with respect to the linguistic context of prior turns, as well as the immediate visual environment. However, a speaker's referential descriptions do not always uniquely identify the referent, leading to ambiguities in need of resolution through subsequent clarificat...
Article
Full-text available
Feedback such as backchannels and clarification requests often occurs subsententially, demonstrating the incremental nature of grounding in dialogue. However, although such feedback can occur at any point within an utterance, it typically does not do so, tending to occur at Feedback Relevance Spaces (FRSs). We present a corpus study of acknowledgem...
Poster
Full-text available
As transparency becomes key for robotics and AI, it will be necessary to evaluate the methods through which transparency is provided, including automatically generated natural language (NL) explanations. Here, we explore parallels between the generation of such explanations and the much-studied field of evaluation of Natural Language Generation (NL...
Preprint
Full-text available
As transparency becomes key for robotics and AI, it will be necessary to evaluate the methods through which transparency is provided, including automatically generated natural language (NL) explanations. Here, we explore parallels between the generation of such explanations and the much-studied field of evaluation of Natural Language Generation (NL...
Chapter
We have recently seen the emergence of several publicly available Natural Language Understanding (NLU) toolkits, which map user utterances to structured, but more abstract, Dialogue Act (DA) or Intent specifications, while making this process accessible to the lay developer. In this paper, we present the first wide coverage evaluation and compariso...
Article
In everyday conversation, no notion of “complete sentence” is required for syntactic licensing. However, so-called “fragmentary”, “incomplete”, and abandoned utterances are problematic for standard formalisms. When contextualised, such data show that (a) non-sentential utterances are adequate to underpin agent coordination, while (b) all linguistic...
Conference Paper
Full-text available
In this paper, we explore the idea that independently developed Dynamic Syntax accounts of dialogue and interaction fit well within the general approach of radical embodied and enactive accounts of cognition (REEC). This approach enables a rethinking of the grounding of linguistic universal constraints, specifically tree structure restrictions, as...
Preprint
Goal-oriented dialogue systems are now being widely adopted in industry where it is of key importance to maintain a rapid prototyping cycle for new products and domains. Data-driven dialogue system development has to be adapted to meet this requirement --- therefore, reducing the amount of data and annotations necessary for training such systems is...
Preprint
Dialogue technologies such as Amazon's Alexa have the potential to transform the healthcare industry. However, current systems are not yet naturally interactive: they are often turn-based, have naive end-of-turn detection and completely ignore many types of verbal and visual feedback - such as backchannels, hesitation markers, filled pauses, gaze,...
Article
Full-text available
Dialogue technologies such as Amazon's Alexa have the potential to transform the healthcare industry. However, current systems are not yet naturally interactive: they are often turn-based, have naive end-of-turn detection and completely ignore many types of verbal and visual feedback - such as backchannels, hesitation markers, filled pauses, gaze,...
Poster
Full-text available
This poster illustrates our paper with the same title here: https://www.researchgate.net/publication/335790583_Current_Challenges_in_Spoken_Dialogue_Systems_and_Why_They_Are_Critical_for_Those_Living_with_Dementia
Preprint
Full-text available
Learning with minimal data is one of the key challenges in the development of practical, production-ready goal-oriented dialogue systems. In a real-world enterprise setting where dialogue systems are developed rapidly and are expected to work robustly for an ever-growing variety of domains, products, and scenarios, efficient learning from a limited...
Preprint
We have recently seen the emergence of several publicly available Natural Language Understanding (NLU) toolkits, which map user utterances to structured, but more abstract, Dialogue Act (DA) or Intent specifications, while making this process accessible to the lay developer. In this paper, we present the first wide coverage evaluation and compariso...
Chapter
This handbook is the first volume to provide a comprehensive, in-depth, and balanced discussion of ellipsis phenomena, whereby a perceived interpretation is fuller than would be expected based solely on the presence of linguistic forms. Natural language abounds in these apparently incomplete expressions, such as I laughed but Ed didn’t, in which th...
Preprint
Full-text available
Spontaneous spoken dialogue is often disfluent, containing pauses, hesitations, self-corrections and false starts. Processing such phenomena is essential in understanding a speaker's intended meaning and controlling the flow of the conversation. Furthermore, this processing needs to be word-by-word incremental to allow further downstream processing...
Article
Full-text available
People give feedback in conversation: both positive signals of understanding, such as nods, and negative signals of misunderstanding, such as frowns. How do signals of understanding and misunderstanding affect the coordination of language use in conversation? Using a chat tool and a maze-based reference task, we test two experimental manipulations...
Article
Full-text available
Natural, spontaneous dialogue proceeds incrementally on a word-by-word basis; and it contains many sorts of disfluency such as mid-utterance/sentence hesitations, interruptions, and self-corrections. But training data for machine learning approaches to dialogue processing is often either cleaned-up or wholly synthetic in order to avoid such phenome...
Article
Full-text available
We investigate an end-to-end method for automatically inducing task-based dialogue systems from small amounts of unannotated dialogue data. It combines an incremental semantic grammar - Dynamic Syntax and Type Theory with Records (DS-TTR) - with Reinforcement Learning (RL), where language generation and dialogue management are a joint decision prob...
Conference Paper
Full-text available
We present an optimised multi-modal dialogue agent for interactive learning of visually grounded word meanings from a human tutor, trained on real human-human tutoring data. Within a life-long interactive learning period, the agent, trained using Reinforcement Learning (RL), must be able to handle natural conversations with human users, and achie...
Conference Paper
Full-text available
We motivate and describe a new freely available human-human dialogue data set for interactive learning of visually grounded word meanings through ostensive definition by a tutor to a learner. The data has been collected using a novel, character-by-character variant of the DiET chat tool (Healey et al., 2003; Mills and Healey, submitted) with a nov...
Conference Paper
Full-text available
Feedback such as backchannels and clarification requests can occur subsententially, demonstrating the incremental nature of grounding in dialogue. However, although such feedback can occur at any point within an utterance, it typically does not do so, tending to occur at feedback relevance spaces (FRSs). We provide a low-level, semantic processing...
Conference Paper
Full-text available
We present a multi-modal dialogue system for interactive learning of perceptually grounded word meanings from a human tutor. The system integrates an incremental, semantic parsing/generation framework - Dynamic Syntax and Type Theory with Records (DS-TTR) - with a set of visual classifiers that are learned throughout the interaction and which groun...
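To make the idea of visual classifiers "learned throughout the interaction" concrete, here is a minimal, hypothetical sketch rather than the system's actual DS-TTR integration: one online perceptron per perceptual word, updated from tutor feedback over assumed colour features.

```python
"""Toy sketch (assumptions only, not the published system) of interactively
learned visual word classifiers: each perceptual word (e.g. 'red', 'blue')
gets a binary classifier over feature vectors, updated online from tutor
feedback during the dialogue."""
import numpy as np

class WordClassifier:
    """Simple online perceptron grounding one word in visual features."""
    def __init__(self, n_features):
        self.w = np.zeros(n_features)
        self.b = 0.0

    def predict(self, x):
        return float(np.dot(self.w, x) + self.b) > 0

    def update(self, x, label):
        # Perceptron rule: adjust weights only on mistakes.
        y = 1 if label else -1
        if (np.dot(self.w, x) + self.b) * y <= 0:
            self.w += y * x
            self.b += y

# Hypothetical 3-d colour features (normalised RGB).
classifiers = {"red": WordClassifier(3), "blue": WordClassifier(3)}

# Tutor feedback collected over the interaction: (word, features, correct?)
tutor_feedback = [
    ("red",  np.array([0.9, 0.1, 0.1]), True),
    ("red",  np.array([0.1, 0.2, 0.9]), False),
    ("blue", np.array([0.1, 0.2, 0.9]), True),
    ("blue", np.array([0.8, 0.1, 0.2]), False),
]

for word, features, label in tutor_feedback:
    classifiers[word].update(features, label)

print(classifiers["red"].predict(np.array([0.85, 0.15, 0.1])))   # expect True
print(classifiers["blue"].predict(np.array([0.85, 0.15, 0.1])))  # expect False
```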
Conference Paper
We address the problem of interactively learning perceptually grounded word meanings in a multimodal dialogue system. Human tutors can correct, question, and confirm the statements of a dialogue agent which is trying to interactively learn the meanings of perceptual words, e.g. colours and shapes. We show that different learner and tutor dialogue s...
Presentation
Full-text available
Presentation at the 37th Annual Meeting of the Linguistics Department of Aristotle University
Conference Paper
Full-text available
We address the problem of interactively learning perceptually grounded word meanings in a multimodal dialogue system. We design a semantic and visual processing system to support this and illustrate how they can be integrated. We then focus on comparing the performance (Precision, Recall, F1, AUC) of three state-of-the-art attribute classifiers for...
Conference Paper
Full-text available
Dialogue is domain-specific, in that the communicative import of utterances is severely underdetermined in the absence of a specific domain of language use. This has led dialogue system developers to use various techniques to map dialogue utterances onto hand-crafted, highly domain-specific Dialogue Act (DA) representations, leading to systems whi...
Conference Paper
Full-text available
We describe a method for learning an incremental semantic grammar from a corpus in which sentences are paired with logical forms as predicate-argument structure trees. Working in the framework of Dynamic Syntax, and assuming a set of generally available compositional mechanisms, we show how lexical entries can be learned as probabilistic procedu...
Article
We present empirical evidence from dialogue that challenges some of the key assumptions in the Pickering & Garrod (P&G) model of speaker-hearer coordination in dialogue. The P&G model also invokes an unnecessarily complex set of mechanisms. We show that a computational implementation, currently in development and based on a simpler model, can accou...
Chapter
Full-text available
The Pickering and Garrod model (Pickering & Garrod, 2013) represents a significant advance within the language-as-action paradigm in providing a mechanistic non-inferential account of dialogue. However, we suggest that, in maintaining several aspects of the language-as-product tradition, it does not go far enough in addressing the dynamic nature of...
Chapter
Ellipsis is a phenomenon in which what is conveyed, in some sense to be explained, doesn't need to be fully verbally articulated, as in the second clause. This chapter explains the kind of notion of context that is needed to model the process of ellipsis resolution. It discusses what ellipsis reveals about linguistic content and the nature of natur...
Conference Paper
Full-text available
Clarification Requests (CR) provide a useful window on how contributions to dialogue are processed. We present chat-tool experiments that introduce CRs mid-turn into ongoing dialogue. The pattern of responses shows people are sensitive to both constituent structure at the interruption point and apparent origin of the CR: the conversational partner...
Article
Full-text available
This paper describes recent work on the DynDial project towards incremental semantic interpretation in dialogue. We outline our domain-general grammar-based approach, using a variant of Dynamic Syntax integrated with Type Theory with Records and a Davidsonian event-based semantics. We describe a Java-based implementation of the parser, used wit...
Conference Paper
Full-text available
Conversations are a basic unit of analysis in studies of human interaction. These units are conventionally distinguished by reference to the set of ratified participants who take part, often by appeal to their physical proximity/orientation. We show that within such conversational units there are distinct dialogue contexts which are more fine-gra...
Article
Full-text available
This paper presents a coding protocol that allows naïve users to annotate dialogue transcripts for anaphora and ellipsis. Cohen's kappa statistic demonstrates that the protocol is sufficiently robust in terms of reliability. It is proposed that quantitative ellipsis data may be used as an index of mutual-engagement. Current and potential uses of...
Article
Full-text available
Concepts of space are fundamental to our understanding of human action and interaction. The common sense concept of uniform, metric, physical space is inadequate for design. It fails to capture features of social norms and practices that can be critical to the success of a technology. The concept of ‘place’ addresses these limitations by taking acc...
Conference Paper
Full-text available
This paper presents a coding protocol that allows naïve users to annotate dialogue transcripts for anaphora and ellipsis. Cohen's kappa statistic demonstrates that the protocol is sufficiently robust in terms of reliability. It is proposed that quantitative ellipsis data may be used as an index of mutual-engagement. Current and potential uses of el...
Article
Full-text available
This document surveys the problems posed by ellipsis data, some of them very well-known, but as a set still posing very considerable challenges and, as a proof of concept of the insights expressible by Dynamic Syntax analyses, uses the intrinsic incrementality of the DS framework to capture structural and semantic properties of elliptical fragments...
