Neural Machine Translation for Query Construction and Composition
Tommaso Soru 1, Edgard Marx 1, André Valdestilhas 1, Diego Esteves 2, Diego Moussallem 1, Gustavo Publio 1
Abstract

Research on question answering with knowledge bases has recently seen an increasing use of deep architectures. In this extended abstract, we study the application of the neural machine translation paradigm for question parsing. We employ a sequence-to-sequence model to learn graph patterns in the SPARQL graph query language and their compositions. Instead of inducing the programs through question-answer pairs, we opt for a semi-supervised approach, where alignments between questions and queries are built through templates. We argue that the coverage of language utterances can be expanded using recent notable works in natural language generation.
1. Introduction
Question Answering with Knowledge Base (KBQA) parses a natural-language question and returns an appropriate answer that can be found in a knowledge base. Today, one of the most exciting scenarios for question answering is the Web of Data, a fast-growing distributed graph of interlinked knowledge bases which comprises more than 100 billion edges (McCrae et al., 2018). Question Answering over Linked Data (QALD) is a subfield of KBQA aimed at transforming utterances into SPARQL queries (Lopez et al., 2013). Being a W3C standard, SPARQL features high expressivity (Prud'hommeaux et al., 2006) and is by far the most used query language for Linked Data.
Among traditional approaches to KBQA, Bao et al. (2014) proposed question decomposition and Statistical Machine Translation to translate sub-questions into triple patterns. The method, however, relies on entity detection and struggles to recognize predicates by their contexts (e.g., play in a film or in a football team). In recent years, several methods based on neural networks have been devised to tackle the KBQA problem (Liang et al., 2016; Hao et al., 2017; Lukovnikov et al., 2017; Sorokin & Gurevych, 2017).

1 AKSW, University of Leipzig, Leipzig, Germany. 2 SDA, Bonn University, Bonn, Germany. Correspondence to: Tommaso Soru.

Published at the ICML workshop Neural Abstract Machines & Program Induction v2 (NAMPI) — Extended Abstract, Stockholm, Sweden, 2018. Copyright 2018 by the author(s).

Figure 1. Utterances are translated into SPARQL queries encoded as sequences of tokens. Using complex surface forms leads to more graph patterns. We aim at learning these compositions.
We study the application of the Neural Machine Translation paradigm for question parsing using a sequence-to-sequence model within an architecture dubbed Neural SPARQL Machine, previously introduced in Soru et al. (2017). Similarly to Liang et al. (2016), we employ a sequence-to-sequence model to learn query expressions and their compositions. Instead of inducing the programs through question-answer pairs, we opt for a semi-supervised approach, where alignments between questions and queries are built through templates. Although query induction can save a considerable amount of supervision effort (Liang et al., 2016; Zhong et al., 2017), a pseudo-gold program is not guaranteed to be correct when the same answer can be found with more than one query (e.g., as the capital is often the largest city of a country, predicates might be confused). On the contrary, our proposed solution relies on manual annotation and a weakly-supervised expansion of question-query templates.
2. Neural SPARQL Machines
Inspired by the Neural Programmer-Interpreter pattern of Reed & De Freitas (2015), a Neural SPARQL Machine is composed of three modules: a generator, a learner, and an interpreter (Soru et al., 2017). We define a query template as an alignment between a natural language question and its respective SPARQL query, with entities replaced by placeholders (e.g., "where is … located in?"). The generator takes query templates as input and creates the training dataset, which is forwarded to the learner. The learner takes natural language as input and generates a sequence which encodes a SPARQL query. Here, a recurrent neural network based on (Bidirectional) Long Short-Term Memories (Hochreiter & Schmidhuber, 1997) is employed as a sequence-to-sequence translator (see example in Figure 1). The final structure is then reconstructed by the interpreter through rule-based heuristics. Note that a sequence can be represented by any LISP S-expression; therefore, alternatively, sentence dependency trees can be used to encode questions and ARQ algebra (Seaborne, 2010) can be used to encode SPARQL queries.

arXiv:1806.10478v2 [cs.CL] 9 Jul 2018

Table 1. Experiments on a DBpedia subset about movies with different SPARQL encodings and settings.

Encoding | Description                          | Test BLEU | Accuracy | Runtime | Convergence
v1       | 1:1 SPARQL encoding                  | 80.89%    | 22.33%   | 1h02:01 | 13,000
v1.1     | Improved consistency                 | 80.61%    | 22.33%   | 1h21:21 | 17,000
v2       | Added templates with >1 placeholders | 89.69%    | 91.04%   | 1h59:10 | 22,000
v2.1     | Encoding fix (double spaces removed) | 98.40%    | 91.05%   | 1h47:11 | 20,000
v3       | Shortened SPARQL sequences           | 99.28%    | 94.82%   | 1h12:07 | 25,000
v4       | Added direct entity translations     | 99.29%    | 93.69%   | 1h23:00 | 20,000
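As an illustration of the generator and the sequence encoding, consider the minimal Python sketch below. The placeholder syntax `<A>`, the token vocabulary (e.g., `brack_open`, `var_x`), and the example entities are our own assumptions for illustration, not the authors' exact encoding.

```python
TEMPLATE = {
    "question": "where is <A> located in?",
    "query": "SELECT ?x WHERE { <A> dbo:location ?x }",
}

# Hypothetical entity label-to-URI alignments used to instantiate the template.
ENTITIES = {"Berlin": "dbr:Berlin", "Leipzig": "dbr:Leipzig"}

def encode_sparql(query: str) -> str:
    """Flatten a SPARQL string into a space-separated sequence of tokens
    that a sequence-to-sequence learner can emit."""
    for old, new in [("SELECT", "select"), ("WHERE", "where"),
                     ("{", "brack_open"), ("}", "brack_close"),
                     ("?x", "var_x"), (":", "_")]:
        query = query.replace(old, new)
    return " ".join(query.split())

def generate_pairs(template, entities):
    """The generator step: instantiate the placeholder with each known
    entity, producing aligned (question, encoded query) training pairs."""
    pairs = []
    for label, uri in entities.items():
        question = template["question"].replace("<A>", label.lower())
        sequence = encode_sparql(template["query"].replace("<A>", uri))
        pairs.append((question, sequence))
    return pairs
```

The interpreter would invert `encode_sparql` with rule-based heuristics, mapping tokens such as `brack_open` back to SPARQL syntax.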
Neural SPARQL Machines do not rely on entity linking methods, since entities and relations are detected within the query construction phase. External pre-trained word embeddings help deal with vocabulary mismatch. A knowledge graph jointly embedded with SPARQL operators (Wang et al., 2014) can be utilized in the target space. A curriculum learning (Bengio et al., 2009) paradigm can learn graph pattern and SPARQL operator composition, in a similar fashion to Liang et al. (2016). We argue that the coverage of language utterances can be expanded using techniques such as Question Generation (Abujabal et al., 2017; Elsahar et al., 2018; Abujabal et al., 2018) and Query Generation (Zafar et al., 2018), as well as Universal Sentence Encoders (Cer et al., 2018). Another problem is the disambiguation between entities having the same surface forms. Building on top of the DBtrends approach (Marx et al., 2016), we force the number of occurrences of a given entity in the training set to be inversely proportional to the entity's ranking. Following this strategy, we expect the RNN to associate the word Berlin with the German capital and not with Berlin, New Hampshire.
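The rank-based balancing can be sketched as follows; the cap of 100 occurrences and the example ranks are illustrative assumptions, not values from the DBtrends rankings.

```python
def occurrences(rank: int, max_occurrences: int = 100) -> int:
    """Number of training occurrences for an entity, inversely proportional
    to its rank (rank 1 = highest-ranked entity); cap of 100 is illustrative."""
    return max(1, round(max_occurrences / rank))

# Hypothetical ranks: the German capital far outranks the New Hampshire town,
# so it dominates the training examples for the surface form "Berlin".
ranks = {"dbr:Berlin": 1, "dbr:Berlin,_New_Hampshire": 50}
counts = {entity: occurrences(rank) for entity, rank in ranks.items()}
```

With these ranks, `dbr:Berlin` would appear 100 times in the training set against 2 occurrences of its namesake, biasing the learner toward the popular reading.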
3. Experiments and current progress
We selected the DBpedia Knowledge Base (Lehmann et al., 2015) as the dataset for our experiments, due to its central importance for the Web of Data. We built a dataset of 3,108 entities about movies from DBpedia and annotated 20 and 4 question-query templates with one and two placeholders, respectively. Our preliminary results are given in Table 1.
Figure 2. BLEU accuracy against training epochs.
We experimented with 6 different SPARQL encodings, i.e., ways to encode a SPARQL query into a sequence of tokens. In each row of the table, we provide a description of the corresponding changes, each of which persists in the next encodings. The experiments were carried out on a 64-CPU Ubuntu machine with 512 GB RAM. We adopted the implementation of seq2seq in TensorFlow with internal embeddings of 128 dimensions, 2 hidden layers, and a dropout value of 0.2. All settings were tested on the same set of unseen questions after applying an 80-10-10% split.
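The evaluation protocol can be sketched as below; only the 80-10-10% proportions come from the text, while the seed and function name are placeholders of our own.

```python
import random

def split_80_10_10(pairs, seed=0):
    """Shuffle question-query pairs and split them into train/dev/test
    partitions of 80%, 10%, and 10%, so test questions are unseen."""
    rng = random.Random(seed)
    shuffled = pairs[:]          # copy to leave the input untouched
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train, n_dev = int(0.8 * n), int(0.1 * n)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_dev],
            shuffled[n_train + n_dev:])
```

Fixing the seed keeps the partitions identical across the six encoding settings, so all encodings are tested on the same unseen questions.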
The results confirmed that the SPARQL encoding strongly influences the learning. Adding more complex templates (i.e., with more than one placeholder) to the generator input yielded a richer training set, and more questions were parsed correctly. Merging tokens (see queries and their respective sequences in Figure 1) helped the machine translation, as the SPARQL sequences became shorter. Adding alignments of entities and their labels to the training set turned out to be beneficial for a faster convergence, as Figure 2 shows. The most frequent errors were due to entity name collisions and out-of-vocabulary words; both issues can be tackled with the strategies introduced in this work.
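For instance, merging a namespace prefix and a local name into a single token, as in the shortened encoding (v3) of Table 1, reduces the length of the target sequence; the token names below are assumptions we use for illustration.

```python
def merge_prefix_tokens(sequence: str) -> str:
    """Merge split 'prefix_ name' token pairs into single tokens
    (illustrative prefixes only)."""
    return sequence.replace("dbr_ ", "dbr_").replace("dbo_ ", "dbo_")

# A hypothetical encoded query before and after token merging:
long_form = "select var_x where brack_open dbr_ Berlin dbo_ location var_x brack_close"
short_form = merge_prefix_tokens(long_form)
```

Shorter target sequences give the recurrent translator fewer steps at which to err, which is consistent with the BLEU gains observed between encodings v2.1 and v3.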
We plan to perform an evaluation on the WebQuestionsSP (Yih et al., 2016) and QALD (Unger et al., 2014) benchmarks to compare with the state-of-the-art approaches for KBQA and QALD, respectively.
1Code available at
References
Abujabal, A., Yahya, M., Riedewald, M., and Weikum, G.
Automated template generation for question answering
over knowledge graphs. In Proc. of the 26th Int. Conf. on
World Wide Web, pp. 1191–1200, 2017.
Abujabal, A., Saha Roy, R., Yahya, M., and Weikum, G.
Never-ending learning for open-domain question answer-
ing over knowledge bases. In Proc. of the 2018 The Web
Conference, pp. 1053–1062, 2018.
Bao, J., Duan, N., Zhou, M., and Zhao, T. Knowledge-based
question answering as machine translation. In Proc. of
the 52nd Annual Meeting of the ACL, 2014.
Bengio, Y., Louradour, J., Collobert, R., and Weston, J.
Curriculum learning. In Proc. of the 26th annual Int.
Conf. on Machine Learning, pp. 41–48. ACM, 2009.
Cer, D., Yang, Y., Kong, S.-y., Hua, N., Limtiaco, N., John,
R. S., Constant, N., Guajardo-Cespedes, M., Yuan, S.,
Tar, C., et al. Universal sentence encoder. arXiv preprint
arXiv:1803.11175, 2018.
Elsahar, H., Gravier, C., and Laforest, F. Zero-shot question
generation from knowledge graphs for unseen predicates
and entity types. arXiv preprint arXiv:1802.06842, 2018.
Hao, Y., Zhang, Y., Liu, K., He, S., Liu, Z., Wu, H., and
Zhao, J. An end-to-end model for question answer-
ing over knowledge base with cross-attention combining
global knowledge. In Proc. of the 55th Annual Meeting
of the ACL (Vol. 1: Long Papers), pp. 221–231, 2017.
Hochreiter, S. and Schmidhuber, J. Long short-term memory.
Neural computation, 9(8):1735–1780, 1997.
Lehmann, J., Isele, R., Jakob, M., Jentzsch, A., Kontokostas, D., Mendes, P. N., Hellmann, S., Morsey, M., Van Kleef, P., Auer, S., et al. DBpedia – a large-scale, multilingual knowledge base extracted from Wikipedia. Semantic Web, 6(2):167–195, 2015.
Liang, C., Berant, J., Le, Q., Forbus, K. D., and Lao, N.
Neural symbolic machines: Learning semantic parsers
on freebase with weak supervision. arXiv preprint
arXiv:1611.00020, 2016.
Lopez, V., Unger, C., Cimiano, P., and Motta, E. Evaluating
question answering over linked data. Journal of Web
Semantics, 21:3–13, 2013.
Lukovnikov, D., Fischer, A., Auer, S., and Lehmann, J. Neu-
ral network-based question answering over knowledge
graphs on word and character level. In Proc. of the 26th
Int. Conf. on World Wide Web, 2017.
Marx, E., Zaveri, A., Moussallem, D., and Rautenberg, S.
Dbtrends: Exploring query logs for ranking rdf data. In
Proc. of the 12th International Conference on Semantic
Systems, pp. 9–16. ACM, 2016.
McCrae, J. P., Abele, A., Buitelaar, P., Cyganiak, R.,
Jentzsch, A., and Andryushechkin, V. The Linked Open
Data Cloud, 2018. URL
Prud’hommeaux, E., Seaborne, A., et al. SPARQL query
language for RDF. Technical report, World Wide Web
Consortium, 2006. URL
Reed, S. and De Freitas, N. Neural programmer-interpreters.
arXiv preprint arXiv:1511.06279, 2015.
Seaborne, A. ARQ – a SPARQL processor for Jena. Obtained from the Internet: http://jena.sourceforge.net/ARQ, 2010.
Sorokin, D. and Gurevych, I. End-to-end representation learning for question answering with weak supervision. In Semantic Web Evaluation Challenge, pp. 70–83. Springer, 2017.
Soru, T., Marx, E., Moussallem, D., Publio, G., Valdes-
tilhas, A., Esteves, D., and Neto, C. B. SPARQL as a
foreign language. In 13th Int. Conf. on Semantic Systems
(SEMANTiCS 2017) - Posters and Demos, 2017.
Unger, C., Forascu, C., Lopez, V., Ngomo, A.-C. N., Cabrio,
E., Cimiano, P., and Walter, S. Question answering over
linked data (qald-4). In Working Notes for CLEF 2014
Conf., 2014.
Wang, Z., Zhang, J., Feng, J., and Chen, Z. Knowledge
graph and text jointly embedding. In Proc. of the 2014
conf. on empirical methods in natural language process-
ing (EMNLP), pp. 1591–1601, 2014.
Yih, W.-t., Richardson, M., Meek, C., Chang, M.-W., and Suh, J. The value of semantic parse labeling for knowledge base question answering. In Proc. of the 54th Annual Meeting of the ACL (Vol. 2: Short Papers), pp. 201–206, 2016.
Zafar, H., Napolitano, G., and Lehmann, J. Formal query
generation for question answering over knowledge bases.
In 15th Extended Semantic Web Conference, 2018.
Zhong, V., Xiong, C., and Socher, R. Seq2sql: Generating
structured queries from natural language using reinforce-
ment learning. arXiv preprint arXiv:1709.00103, 2017.