Gerasimos Lampouras

Huawei Technologies · Noah’s Ark Lab

Doctor of Philosophy

About

23
Publications
2,386
Reads
239
Citations
Citations since 2016
15 Research Items
168 Citations
[Chart: citations per year, 2016–2022]
Additional affiliations
April 2015 - present
University College London
Position
  • Research Associate
Description
  • DILiGENt: "Domain-Independent Language Generation" (EPSRC project).
May 2014 - September 2014
Athens University of Economics and Business
Position
  • Research Assistant
Description
  • BioASQ: "A challenge on large-scale biomedical semantic indexing and question answering" (European FP7 ICT project).
March 2011 - August 2012
Athens University of Economics and Business
Position
  • Research Assistant
Description
  • “A Linear Programming approach to multi-document text summarization and natural language generation from ontologies”, Athens University of Economics and Business (AUEB) Basic Research Funding Program (BRFP) project.
Education
March 2009 - January 2015
Athens University of Economics and Business
Field of study
  • Artificial Intelligence, Natural Language Processing
October 2006 - June 2008
Athens University of Economics and Business
Field of study
  • Computer Science
October 2002 - July 2006

Publications

Publications (23)
Preprint
Curriculum Learning (CL) is a technique of training models via ranking examples in a typically increasing difficulty trend with the aim of accelerating convergence and improving generalisability. Current approaches for Natural Language Understanding (NLU) tasks use CL to improve in-distribution data performance often via heuristic-oriented or task-...
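The core idea of Curriculum Learning can be sketched in a few lines. This is a minimal illustration only, assuming a simple length-based difficulty score and a cumulative "easy-first" schedule; the paper's own difficulty ranking is not specified in this summary.

```python
# Minimal curriculum learning sketch. Assumption: difficulty is scored
# by sentence length; real CL approaches use model- or task-based scores.

def curriculum_order(examples, difficulty):
    """Rank training examples from easiest to hardest."""
    return sorted(examples, key=difficulty)

def curriculum_pools(examples, difficulty, n_stages):
    """Yield cumulative training pools of increasing difficulty,
    so each stage trains on everything seen so far plus harder items."""
    ordered = curriculum_order(examples, difficulty)
    stage_size = max(1, len(ordered) // n_stages)
    for stage in range(1, n_stages + 1):
        yield ordered[: stage * stage_size]

data = ["a b", "a", "a b c d", "a b c"]
pools = list(curriculum_pools(data, difficulty=lambda s: len(s.split()),
                              n_stages=2))
# pools[0] holds the easiest half; pools[1] is the full ordered set
```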
Preprint
Full-text available
We present PanGu-Coder, a pretrained decoder-only language model adopting the PanGu-Alpha architecture for text-to-code generation, i.e. the synthesis of programming language solutions given a natural language problem description. We train PanGu-Coder using a two-stage strategy: the first stage employs Causal Language Modelling (CLM) to pre-train o...
Preprint
Full-text available
Concept-to-text Natural Language Generation is the task of expressing an input meaning representation in natural language. Previous approaches in this task have been able to generalise to rare or unseen instances by relying on a delexicalisation of the input. However, this often requires that the input appears verbatim in the output text. This pose...
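The delexicalisation step mentioned above, and its verbatim-match limitation, can be illustrated with a toy sketch (slot names and values here are invented for illustration):

```python
# Delexicalisation sketch: replace slot values with placeholders so a
# generator generalises to rare or unseen values, then restore them
# after generation. This only works when the value appears verbatim
# in the text, which is exactly the limitation the abstract notes.

def delexicalise(text, slots):
    for name, value in slots.items():
        text = text.replace(value, f"__{name}__")
    return text

def relexicalise(text, slots):
    for name, value in slots.items():
        text = text.replace(f"__{name}__", value)
    return text

slots = {"name": "Blue Spice", "food": "Italian"}
delex = delexicalise("Blue Spice serves Italian food.", slots)
# delex == "__name__ serves __food__ food."
restored = relexicalise(delex, slots)
```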
Article
Task-oriented dialogue systems typically rely on large amounts of high-quality training data or require complex handcrafted rules. However, existing datasets are often limited in size considering the complexity of the dialogues. Additionally, conventional training signal inference is not suitable for non-deterministic agent behavior, namely, co...
Preprint
Task-oriented dialogue systems typically rely on large amounts of high-quality training data or require complex handcrafted rules. However, existing datasets are often limited in size considering the complexity of the dialogues. Additionally, conventional training signal inference is not suitable for non-deterministic agent behaviour, i.e. consider...
Preprint
Full-text available
Deep-learning models for language generation tasks tend to produce repetitive output. Various methods have been proposed to encourage lexical diversity during decoding, but this often comes at a cost to the perceived fluency and adequacy of the output. In this work, we propose to ameliorate this cost by using an Imitation Learning approach to explo...
Preprint
Full-text available
We present our submission to the End-to-End Multi-Domain Dialog Challenge Track of the Eighth Dialog System Technology Challenge. Our proposed dialog system adopts a pipeline architecture, with distinct components for Natural Language Understanding, Dialog State Tracking, Dialog Management and Natural Language Generation. At the core of our system...
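The pipeline architecture described above (NLU, Dialog State Tracking, Dialog Management, Natural Language Generation) can be sketched as a chain of components. All rules, slot names, and templates below are invented placeholders, not the system's actual components:

```python
# Toy dialog pipeline sketch: NLU -> DST -> DM -> NLG.
# Every rule and template here is a stand-in for illustration only.

def nlu(utterance):
    """Natural Language Understanding: map an utterance to an intent."""
    return "request_food" if "restaurant" in utterance else "greet"

def dst(state, intent):
    """Dialog State Tracking: fold the new intent into the state."""
    return {**state, "last_intent": intent}

def dm(state):
    """Dialog Management: pick a system act from the tracked state."""
    return "ask_cuisine" if state["last_intent"] == "request_food" else "welcome"

def nlg(act):
    """Natural Language Generation: realise the act as text."""
    templates = {"ask_cuisine": "What kind of food would you like?",
                 "welcome": "Hello! How can I help?"}
    return templates[act]

def dialog_turn(utterance, state):
    state = dst(state, nlu(utterance))
    return nlg(dm(state)), state

reply, state = dialog_turn("find me a restaurant", {})
# reply == "What kind of food would you like?"
```

Each component can be developed and evaluated in isolation, which is the usual motivation for a pipeline over an end-to-end model.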
Technical Report
Full-text available
(i) We adapt Deep Q-Learning from Demonstrations (DQfD; Hester et al. 2017), an RL algorithm that has achieved high scores in Atari game environments, to the dialog domain. We employ DQfD to overcome the problem of having to randomly explore a large conversational search space, using its ability to learn to imitate an “expert” demonstrator. (ii) We...
Preprint
Many concept-to-text generation systems require domain-specific linguistic resources to produce high quality texts, but manually constructing these resources can be tedious and costly. Focusing on NaturalOWL, a publicly available state of the art natural language generator for OWL ontologies, we propose methods to extract from the Web sentence plan...
Preprint
Concept-to-text generation typically employs a pipeline architecture, which often leads to suboptimal texts. Content selection, for example, may greedily select the most important facts, which may require, however, too many words to express, and this may be undesirable when space is limited or expensive. Selecting other facts, possibly only slightl...
Conference Paper
Natural language generation (NLG) is the task of generating natural language from a meaning representation. Rule-based approaches require domain-specific and manually constructed linguistic resources, while most corpus based approaches rely on aligned training data and/or phrase templates. The latter are needed to restrict the search space for the...
Article
Full-text available
We present NaturalOWL, a natural language generation system that produces texts describing individuals or classes of OWL ontologies. Unlike simpler OWL verbalizers, which typically express a single axiom at a time in controlled, often not entirely fluent natural language primarily for the benefit of domain experts, we aim to generate fluent and coh...
Conference Paper
Full-text available
We present an ILP model of concept-to-text generation. Unlike pipeline architectures, our model jointly considers the choices in content selection, lexicalization, and aggregation to avoid greedy decisions and produce more compact texts.
Conference Paper
Full-text available
We present a new method to generate extractive multi-document summaries. The method uses Integer Linear Programming to jointly maximize the importance of the sentences it includes in the summary and their diversity, without exceeding a maximum allowed summary length. To obtain an importance score for each sentence, it uses a Support Vector Regressi...
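The objective described above, selecting sentences that jointly maximise importance and diversity under a length budget, can be sketched as a 0/1 program. For this tiny instance we solve it exhaustively; the paper uses Integer Linear Programming, and its diversity term and SVR importance scores differ from the word-coverage proxy assumed here:

```python
# Toy version of the summarization objective: pick a subset of sentences
# maximising importance + diversity subject to a word-count budget.
# Solved by brute force on a tiny instance; an ILP solver scales this up.
from itertools import combinations

def summarize(sentences, importance, budget):
    """Return the best subset of sentences within the word budget."""
    def diversity(subset):
        # Assumed proxy for diversity: number of distinct words covered.
        return len({w for i in subset for w in sentences[i].split()})

    def length(subset):
        return sum(len(sentences[i].split()) for i in subset)

    best, best_score = (), float("-inf")
    for r in range(len(sentences) + 1):
        for subset in combinations(range(len(sentences)), r):
            if length(subset) > budget:
                continue
            score = sum(importance[i] for i in subset) + diversity(subset)
            if score > best_score:
                best, best_score = subset, score
    return [sentences[i] for i in best]

sents = ["the cat sat", "the cat", "dogs bark loudly"]
scores = [3.0, 2.0, 2.5]
summary = summarize(sents, scores, budget=6)
# summary == ["the cat sat", "dogs bark loudly"]
```

The diversity term is what stops the objective from greedily picking several near-duplicate high-importance sentences.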
Conference Paper
Full-text available
We present a system that finds short definitions of terms on Web pages. It employs a Maximum Entropy classifier, but it is trained on automatically generated examples; hence, it is in effect unsupervised. We use rouge-w to generate training examples from encyclopedias and Web snippets, a method that outperforms an alternative centroid-based one. Af...
Conference Paper
Full-text available
The subject of this demonstration is natural language interaction, focusing on adaptivity and profiling of the dialogue management and the generated output (text and speech). These are demonstrated in a museum guide use-case, operating in a simulated environment. The main technical innovations presented are the profiling model, the dialogue...
Conference Paper
Full-text available
We demonstrate an open-source natural language generation engine that produces descriptions of entities and classes in English and Greek from OWL ontologies that have been annotated with linguistic and user modeling information expressed in RDF. We also demonstrate an accompanying plug-in for the Protégé ontology editor, which can be used to...
Article
Full-text available
NaturalOWL is an open-source natural language generation engine written in Java. It produces descriptions of individuals (e.g., items for sale, museum exhibits) and classes (e.g., types of exhibits) in English and Greek from OWL DL ontologies. The ontologies must have been annotated in RDF with linguistic and user modeling resources. We demonst...

Projects

Projects (2)
Project
Goal: To develop new and better pretrained language models for general code understanding and generation.
Project
Goal: To develop reliable dialogue systems and their components (NLU, DM, NLG) in the three different flavors: task-oriented, open domain (chatbots) and question-answering.