Christof Monz's research while affiliated with University of Amsterdam and other places

Publications (44)

Article
In this article, we address the problem of answering complex information needs by conducting conversations with search engines, in the sense that users can express their queries in natural language and directly receive the information they need from a short system response in a conversational manner. Recently, there have been some attempts towards...
Conference Paper
Full-text available
Dialogue response generation (DRG) is a critical component of task-oriented dialogue systems (TDSs). Its purpose is to generate proper natural language responses given some context, e.g., historical utterances, system states, etc. State-of-the-art work focuses on how to better tackle DRG in an end-to-end way. Typically, such studies assume that eac...
Article
We define hybrid intelligence (HI) as the combination of human and machine intelligence, augmenting human intellect and capabilities instead of replacing them and achieving goals that were unreachable by either humans or machines. HI is an important new research focus for artificial intelligence, and we set a research agenda for HI by formulating f...
Preprint
Full-text available
Recent works have shown that Neural Machine Translation (NMT) models achieve impressive performance; however, questions about understanding the behavior of these models remain unanswered. We investigate the unexpected volatility of NMT models where the input is semantically and syntactically correct. We discover that with trivial modifications of s...
Preprint
Full-text available
In this paper, we address the problem of answering complex information needs by conducting conversations with search engines, in the sense that users can express their queries in natural language, and directly receive the information they need from a short system response in a conversational manner. Recently, there have been some attempts towards a...
Article
Full-text available
Existing conversational systems tend to generate generic responses. Recently, Background Based Conversations (BBCs) have been introduced to address this issue. Here, the generated responses are grounded in some background information. The proposed methods for BBCs are able to generate more informative responses; however, they either cannot generate...
Article
Full-text available
Background Based Conversations (BBCs) have been introduced to help conversational systems avoid generating overly generic responses. In a BBC, the conversation is grounded in a knowledge source. A key challenge in BBCs is Knowledge Selection (KS): given a conversational context, try to find the appropriate background knowledge (a text fragment conta...
Chapter
Full-text available
We investigate BERT in an evidence retrieval and claim verification pipeline for the task of evidence-based claim verification. To this end, we propose to use two BERT models, one for retrieving evidence sentences supporting or rejecting claims, and another for verifying claims based on the retrieved evidence sentences. To train the BERT retrieval...
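The two-stage retrieve-then-verify design described above can be illustrated with a minimal sketch. Here a toy token-overlap scorer stands in for the two BERT models; all function names, the corpus, and the threshold are illustrative assumptions, not the authors' implementation.

```python
# Toy sketch of a two-stage retrieve-then-verify pipeline.
# A simple token-overlap score stands in for the two BERT models;
# all names here are illustrative, not the authors' code.

def overlap_score(claim: str, sentence: str) -> float:
    """Fraction of claim tokens that also appear in the sentence."""
    c, s = set(claim.lower().split()), set(sentence.lower().split())
    return len(c & s) / max(len(c), 1)

def retrieve_evidence(claim: str, corpus: list[str], k: int = 2) -> list[str]:
    """Stage 1: rank candidate sentences and keep the top-k as evidence."""
    return sorted(corpus, key=lambda s: overlap_score(claim, s), reverse=True)[:k]

def verify(claim: str, evidence: list[str], threshold: float = 0.5) -> str:
    """Stage 2: label the claim from the retrieved evidence."""
    if not evidence:
        return "NOT ENOUGH INFO"
    best = max(overlap_score(claim, e) for e in evidence)
    return "SUPPORTED" if best >= threshold else "NOT ENOUGH INFO"

corpus = [
    "Amsterdam is the capital of the Netherlands.",
    "The tulip season peaks in April.",
]
claim = "Amsterdam is the capital of the Netherlands"
evidence = retrieve_evidence(claim, corpus, k=1)
label = verify(claim, evidence)
```

In the real pipeline, both stages would be fine-tuned BERT classifiers; the point of the sketch is only the separation of retrieval and verification.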
Preprint
Natural Language Generation (NLG) models are prone to generating repetitive utterances. In this work, we study the repetition problem for encoder-decoder models, using both recurrent neural network (RNN) and transformer architectures. To this end, we consider the chit-chat task, where the problem is more prominent than in other tasks that need enco...
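One common way to quantify the repetition problem studied above is the fraction of duplicate n-grams in a generated utterance. The sketch below is a generic illustration of such a metric, not the paper's exact evaluation.

```python
# Sketch of a simple repetition metric for generated text:
# 1 minus the ratio of distinct n-grams to total n-grams.
# The metric definition here is a generic illustration.

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def repetition_rate(utterance: str, n: int = 2) -> float:
    """0.0 means no repeated n-grams; higher means more repetition."""
    grams = ngrams(utterance.split(), n)
    if not grams:
        return 0.0
    return 1.0 - len(set(grams)) / len(grams)

print(repetition_rate("i am fine i am fine i am fine"))  # high repetition
print(repetition_rate("hello world"))                    # no repetition
```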
Preprint
Full-text available
Dialogue response generation (DRG) is a critical component of task-oriented dialogue systems (TDSs). Its purpose is to generate proper natural language responses given some context, e.g., historical utterances, system states, etc. State-of-the-art work focuses on how to better tackle DRG in an end-to-end way. Typically, such studies assume that eac...
Preprint
Full-text available
Motivated by the promising performance of pre-trained language models, we investigate BERT in an evidence retrieval and claim verification pipeline for the FEVER fact extraction and verification challenge. To this end, we propose to use two BERT models, one for retrieving potential evidence sentences supporting or rejecting claims, and another for...
Preprint
Full-text available
Background Based Conversations (BBCs) have been introduced to help conversational systems avoid generating overly generic responses. In a BBC, the conversation is grounded in a knowledge source. A key challenge in BBCs is Knowledge Selection (KS): given a conversation context, try to find the appropriate background knowledge (a text fragment contain...
Preprint
Full-text available
Existing conversational systems tend to generate generic responses. Recently, Background Based Conversations (BBCs) have been introduced to address this issue. Here, the generated responses are grounded in some background information. The proposed methods for BBCs are able to generate more informative responses; however, they either cannot generate natural...
Preprint
Full-text available
Earlier approaches indirectly studied the information captured by the hidden states of recurrent and non-recurrent neural machine translation models by feeding them into different classifiers. In this paper, we look at the encoder hidden states of both transformer and recurrent machine translation models from the nearest neighbors perspective. We i...
Preprint
Full-text available
Sequence-to-Sequence (Seq2Seq) models have achieved encouraging performance on the dialogue response generation task. However, existing Seq2Seq-based response generation methods suffer from a low-diversity problem: they frequently generate generic responses, which make the conversation less interesting. In this paper, we address the low-diversity p...
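One generic decoding-time lever on the diversity/likelihood trade-off described above is temperature scaling of the output distribution. The sketch below illustrates that idea only; the paper's own method for addressing low diversity may differ.

```python
import math

# Sketch of temperature scaling at decoding time (illustrative only;
# not necessarily the method this paper proposes). Higher temperature
# flattens the output distribution, so less likely, more diverse
# responses are sampled more often.

def softmax_with_temperature(logits, tau=1.0):
    """Softmax over logits divided by temperature tau."""
    scaled = [x / tau for x in logits]
    m = max(scaled)                       # subtract max for stability
    exps = [math.exp(x - m) for x in scaled]
    z = sum(exps)
    return [e / z for e in exps]

logits = [2.0, 1.0, 0.1]                  # toy next-token scores
probs_sharp = softmax_with_temperature(logits, tau=0.5)
probs_flat = softmax_with_temperature(logits, tau=2.0)
# The top token dominates less at high temperature.
```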
Preprint
Full-text available
Neural Machine Translation has achieved state-of-the-art performance for several language pairs using a combination of parallel and synthetic data. Synthetic data is often generated by back-translating sentences randomly sampled from monolingual data using a reverse translation model. While back-translation has been shown to be very effective in ma...
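The back-translation procedure described above can be sketched in a few lines: authentic target-side monolingual sentences are paired with synthetic sources produced by a reverse translation model. Here `reverse_translate` is a hypothetical stub (a toy word lexicon) standing in for a trained target-to-source MT system.

```python
# Minimal sketch of back-translation for synthetic parallel data.
# `reverse_translate` is a hypothetical stand-in for a trained
# target-to-source MT model; the lexicon is a toy assumption.

def reverse_translate(target_sentence: str) -> str:
    """Stub reverse model: a real system would run target-to-source MT."""
    lexicon = {"hallo": "hello", "welt": "world"}
    return " ".join(lexicon.get(w, w) for w in target_sentence.split())

def back_translate(monolingual_target: list[str]) -> list[tuple[str, str]]:
    """Pair each authentic target sentence with a synthetic source."""
    return [(reverse_translate(t), t) for t in monolingual_target]

pairs = back_translate(["hallo welt"])
# pairs can now be mixed with authentic parallel data for training
```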
Article
Recent work has shown that recurrent neural networks (RNNs) can implicitly capture and exploit hierarchical information when trained to solve common natural language processing tasks such as language modeling (Linzen et al., 2016) and neural machine translation (Shi et al., 2016). In contrast, the ability to model structured data with non-recurrent...
Article
Full-text available
Neural Machine Translation (NMT) has been widely used in recent years with significant improvements for many language pairs. Although state-of-the-art NMT systems are generating progressively better translations, idiom translation remains one of the open challenges in this field. Idioms, a category of multiword expressions, are an interesting langu...
Article
Attention in neural machine translation provides the possibility to encode relevant parts of the source sentence at each translation step. As a result, attention is considered to be an alignment model as well. However, there is no work that specifically studies attention and provides analysis of what is being learned by attention models. Thus, the...
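The "attention as alignment" view mentioned above amounts to reading each target step's attention distribution and taking the argmax source position. The sketch below computes scaled dot-product attention over toy vectors; the vectors and function names are illustrative, while the article itself analyses trained models.

```python
import math

# Sketch: treating an attention matrix as a word alignment by taking,
# for each target step, the source position with the highest weight.
# Vectors and names here are toy assumptions for illustration.

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def attention_weights(query, keys):
    """Scaled dot-product attention distribution for one target step."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    return softmax(scores)

def hard_alignment(queries, keys):
    """Align each target step to its argmax source position."""
    return [max(range(len(keys)),
                key=lambda j: attention_weights(q, keys)[j])
            for q in queries]

keys = [[1.0, 0.0], [0.0, 1.0]]        # source-side representations
queries = [[0.9, 0.1], [0.2, 0.8]]     # target-side queries
print(hard_alignment(queries, keys))   # -> [0, 1]
```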
Article
Intelligent selection of training data has proven a successful technique to simultaneously increase training efficiency and translation performance for phrase-based machine translation (PBMT). With the recent increase in popularity of neural machine translation (NMT), we explore in this paper to what extent and how NMT can also benefit from data se...
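The data-selection idea above can be sketched as ranking general-domain sentences by similarity to an in-domain seed corpus and keeping the top fraction. The vocabulary-overlap score below is a toy stand-in for the cross-entropy-difference scoring common in this literature, not the paper's exact method.

```python
# Toy sketch of intelligent data selection: rank general-domain
# sentences by similarity to an in-domain seed and keep the best.
# The overlap score is a stand-in for cross-entropy-difference
# scoring; corpus contents here are illustrative.

def domain_score(sentence: str, in_domain_vocab: set[str]) -> float:
    """Fraction of a sentence's tokens covered by the in-domain vocabulary."""
    tokens = sentence.lower().split()
    if not tokens:
        return 0.0
    return sum(t in in_domain_vocab for t in tokens) / len(tokens)

def select(pool: list[str], seed: list[str], keep: int) -> list[str]:
    """Keep the `keep` pool sentences most similar to the seed corpus."""
    vocab = {t for s in seed for t in s.lower().split()}
    return sorted(pool, key=lambda s: domain_score(s, vocab),
                  reverse=True)[:keep]

seed = ["the patient received a dose of medicine"]
pool = [
    "the patient was given medicine",
    "stock prices rose sharply today",
]
print(select(pool, seed, keep=1))
```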
Article
Full-text available
Neural machine translation is a recently proposed approach which has shown results competitive with traditional MT approaches. Standard neural MT is an end-to-end neural network where the source sentence is encoded by a recurrent neural network (RNN) called the encoder and the target words are predicted using another RNN known as the decoder. Recently, vario...
Article
Full-text available
The quality of a Neural Machine Translation system depends substantially on the availability of sizable parallel corpora. For low-resource language pairs this is not the case, resulting in poor translation quality. Inspired by work in computer vision, we propose a novel data augmentation approach that targets low-frequency words by generating new s...
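The targeted augmentation idea above can be sketched as follows: identify low-frequency words and place them into new sentence contexts by substituting them for a frequent word. This toy version omits the language model the paper uses to pick plausible substitution positions; all data and names are illustrative.

```python
from collections import Counter

# Toy sketch of targeted augmentation: create new sentence contexts
# for low-frequency words by substituting them for a frequent word.
# A real system would use a language model to choose plausible
# positions; this sketch uses a fixed slot word instead.

def rare_words(corpus: list[str], max_count: int = 1) -> set[str]:
    """Words occurring at most `max_count` times in the corpus."""
    counts = Counter(t for s in corpus for t in s.split())
    return {w for w, c in counts.items() if c <= max_count}

def augment(corpus: list[str], slot_word: str, rares: set[str]) -> list[str]:
    """Swap each rare word into sentences containing `slot_word`."""
    out = []
    for s in corpus:
        if slot_word in s.split():
            for r in rares:
                out.append(s.replace(slot_word, r))
    return out

corpus = ["the cat sat", "the cat ran", "a lynx slept"]
rares = rare_words(corpus)                    # words occurring once
new_sentences = augment(corpus, "cat", rares & {"lynx"})
print(new_sentences)                          # -> ['the lynx sat', 'the lynx ran']
```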
Article
Distributed word representations are widely used for modeling words in NLP tasks. Most of the existing models generate one representation per word and do not consider different meanings of a word. We present two approaches to learn multiple topic-sensitive representations per word by using the Hierarchical Dirichlet Process. We observe that by modeling...
Article
This paper explores new evaluation perspectives for image captioning and introduces a noun translation task that achieves comparative image caption generation performance by translating from a set of nouns to captions. This implies that in image captioning, all word categories other than nouns can be evoked by a powerful language model without sacr...
Article
Recurrent Neural Networks (RNNs) have obtained excellent results in many natural language processing (NLP) tasks. However, understanding and interpreting the source of this success remains a challenge. In this paper, we propose the Recurrent Memory Network (RMN), a novel RNN architecture that not only amplifies the power of RNNs but also facilitates our...
Article
Recent years have seen rapid growth in the deployment of statistical methods for computational language and speech processing. The current popularity of such methods can be traced to the convergence of several factors, including the increasing amount of data now accessible, sustained advances in computing power and storage capabilities, and ongoing...
Conference Paper
Research in domain adaptation for statistical machine translation (SMT) has resulted in various approaches that adapt system components to specific translation tasks. The concept of a domain, however, is not precisely defined, and most approaches rely on provenance information or manual subcorpus labels, while genre differences have not been addres...
Conference Paper
It is widely accepted that translating user-generated (UG) text is a difficult task for modern statistical machine translation (SMT) systems. The translation quality metrics typically used in the SMT literature reflect the overall quality of the system output but provide little insight into what exactly makes UG text translation difficult. This pape...
Article
Full-text available
Given a pair of source and target language sentences which are translations of each other with known word alignments between them, we extract bilingual phrase-level segmentations of such a pair. This is done by identifying two appropriate measures that assess the quality of phrase segments, one on the monolingual level for both language sides, and...
Conference Paper
The ACL-2005 Workshop on Parallel Texts hosted a shared task on building statistical machine translation systems for four European language pairs: French-English, German-English, Spanish-English, and Finnish-English. Eleven groups participated in the event. This paper describes the goals, the task definition and resources, as well as results and so...

Citations

... Some directions remain open, e.g., the feedback mechanism can be further applied for updating decoder states, and RFM could be combined with another novel mechanism to generate more appropriate and natural responses. BBC has been used in the tasks of conversational search [28] and blockchain [44], which is evidence that BBC is applicable in real life. ...
... We tackle an MDS as a context-to-text generation problem (Hosseini-Asl et al. 2020; Pei et al. 2020) and deploy a unified framework called SeqMDS. Formally, given a sequence of dialogue context X, an MDS aims to generate a system response Y which maximizes the generation probability P(Y|X). ...
... The terms "augmented intelligence" or "augmented analytics" often refer to the facilitating role AI plays in enhancing the capabilities of "humans-in-the-loop", including learning, decision-making, and new experiences (Longoni and Cian, 2020). This is sometimes also called hybrid intelligence (Akata et al., 2020). ...
... Several previous works [12,22,23] focus on calculating the probability distributions (mainly, prior and posterior distributions) of knowledge, which occurs in the corresponding context and response. Other studies [21,27,50] focus on calculating a weight vector for the representation of selected knowledge. Our work proceeds along the latter direction. ...
... In general, existing methods for BBCs are classified into two types: • extraction-based methods (such as R-Net [40] and Question Answering Architecture (QANet) [45]); • generation-based methods (such as Get To The Point (GTTP) [29], Context-aware Knowledge Pre-selection (CaKe) [50], Global-to-Local Knowledge Selection (GLKS) [27]). ...
... Generally, bidirectional encoder-based models have been developed to detect fake news and verify facts. For example, the Bidirectional Encoder Representations from Transformers (BERT) [11] model has been successfully applied to fake news detection [12,13] and fact verification [14,15]. Another model derived from BERT, called RoBERTa [16], has also been applied successfully [17,18]. ...
... Back-translation (Sennrich et al., 2016;Fadaee and Monz, 2018;Poncelas et al., 2019) corresponds to the scenario where target-side monolingual data is translated using an MT system to give corresponding synthetic source sentences, the idea being that it is particularly beneficial for the MT decoder to see well-formed sentences (Haddow et al., 2022). Back-translation has become a popular strategy among MT researchers, especially in low-resource scenarios (Haque et al., 2021). ...
... Transformer-based [1] neural machine translation has achieved state-of-the-art performance in neural machine translation, and it outperforms recurrent neural network (RNN)-based models [2][3][4]. However, recent work [5][6][7] has shown that the Transformer may not learn the linguistic information to the greatest extent possible due to the characteristics of the model, especially in low-resource scenarios. On the other hand, prior knowledge has been proved to be an effective way to improve the quality of statistical machine translation. ...
... • FACE (Jiang et al., 2019) uses the frequency-aware cross-entropy loss to tackle the low-diversity problem. ...
... However, several studies pointed out that the overall view on the input sequence may distract the attention of SANs and lead to overlooking some important information [5,13,14,15]. They found that these problems can be addressed by guiding the attention distribution with an inductive bias, which offers SANs the ability to learn a specific view of the input sentence, e.g., short-term view [16,17], forward and backward views [5], as well as phrasal patterns [9,13]. ...