The world’s biggest quiz - challenges
and consequences for NLU / NLP
Hamminger, Leopold / Stahlberger, Rolf. Europa Virtuelle Volkshochschule, Austria.
Corresponding author: firstname.lastname@example.org
The challenges of Natural Language Understanding (NLU) and
Natural Language Processing (NLP) are discussed, especially as
they pertain to the German language. The concept of “machine
understanding” is presented, together with its application in the
area of education: generating quizzes.
Natural Language Understanding (NLU) technologies have
received a huge boost in recent years, also in practical use, from
Apple's Siri, Google Assistant, Amazon Alexa, etc. Since the
beginning of NLU with Eliza in 1966, methods have
become increasingly refined. Sentiment analysis, vector-space
models, relation extraction, contextual word representations and
Natural Language Inference 1 are now the focus of current
Most research is done in English, and when it is extended to
other languages, a cross-lingual approach is typically used. Put
simply, cross-lingual means that a specific method is applied
in English and the output is translated into another
language. However, studies suggest that fine-tuned monolingual
models are superior.2 This is our preferred approach, in which a
specific method works in the output language only.
The German language in particular presents the NLP community
with significantly greater challenges than English
does. Because of its greater grammatical complexity, language-economy
phenomena such as omissions and coordination ellipses are
widespread. These can cause significant problems, for
example, for search engines.
"I want to buy a sweater and my wife a car" - what happens when
we type this into a search engine of our choice? We receive
advertising for sweaters.
Yet it would be much more interesting for the search engine
provider - because it would be more lucrative - to also display ads
for cars. The search engine simply doesn't understand us properly.
2 e.g. Prettenhofer & Stein (2011) https://arxiv.org/abs/1008.0716 or Wan et al. (2011)
To resolve such problems it is first necessary to contextually
elucidate semantic ambiguities in words, phrases, and sentences.
To illustrate this let us look at a multiple-choice (M/C) quiz.
Let's consider a statement like "New York is the largest city in the
USA". If we create a question from this that reads "What is the
largest city in the USA?", we will find at least one solution, but not
necessarily only one. "Big Apple" would also be a quite
acceptable answer. Clearly wrong, however, would be
To put it abstractly, we can define two sets, as presented in Figure 1.
Figure 1: Representation of the possible options in the solution space
Let A be a non-empty set containing all possible correct answers
and let B be the set containing at least one incorrect answer. B
must necessarily be non-empty because natural language is
It is quite possible that an answer option is semantically wrong in
one context, although it could be correct in another. Accordingly,
the intersection A ∩ B does not necessarily have to be empty.
Here is an example:
"Two boards are held together by three screws". If we replace
"screws" with "nails", the statement need not be regarded as false
in general, although in our example the boards are in fact
connected by screws. Thus, in this context, "nails" is not an element
of A. In another context, however, it may well be a correct
Now let's look again at our New York quiz, specifically at the
elements of B \ A in Figure 1. For this quiz we are certainly not
interested in elements that are unrealistic
answers, like Palm Springs. Vienna makes no sense either, and it
would be complete nonsense to suggest "screw" as an answer.
We can also present the above as an optimization problem.
If we represent each solution option as a vector in the sense of
word2vec / sent2vec, the optimization problem for a
correct word solution (or phrase) a is, in general, approximately:
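\[
\max \, \operatorname{sim}(a, b) = \cos\bigl(\operatorname{vec}(a), \operatorname{vec}(b)\bigr) = \frac{\operatorname{vec}(a) \cdot \operatorname{vec}(b)}{\lvert \operatorname{vec}(a) \rvert \, \lvert \operatorname{vec}(b) \rvert}
\]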
subject to b ∈ B \ A, that is, b is a wrong solution.
Depending on how we train our model of word vectors, we may
generate Los Angeles, Houston, Philadelphia, Miami, Chicago,
etc. as incorrect solutions. We would then have created a
satisfactory and fascinating quiz with the question about the
largest city in the USA.
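The selection step can be sketched in a few lines of Python. This is only a minimal illustration of the cosine-similarity ranking described above, not the system itself: the three-dimensional vectors are invented toy values (real word2vec / sent2vec vectors would come from a trained model), and the names `cosine`, `rank_distractors` and `VECTORS` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def rank_distractors(answer, correct, candidates, vectors, k=3):
    """Return the k wrong options (B minus A) most similar to the answer."""
    # Keep only candidates that are not in the set A of correct answers.
    wrong_only = [c for c in candidates if c not in correct]
    target = vectors[answer]
    ranked = sorted(wrong_only,
                    key=lambda c: cosine(target, vectors[c]),
                    reverse=True)
    return ranked[:k]

# Toy vectors, invented purely for illustration (assumption).
VECTORS = {
    "New York":     [1.0, 1.0, 0.0],
    "Big Apple":    [1.0, 0.9, 0.0],
    "Los Angeles":  [0.9, 1.0, 0.1],
    "Chicago":      [0.8, 1.0, 0.2],
    "Palm Springs": [0.2, 1.0, 0.6],
    "Vienna":       [0.1, 0.3, 1.0],
    "screw":        [0.0, 0.0, 1.0],
}

correct_answers = {"New York", "Big Apple"}  # the set A
top = rank_distractors("New York", correct_answers, list(VECTORS), VECTORS, k=2)
print(top)
```

With these toy values the two highest-ranked wrong options are other large US cities, while "Vienna" and "screw" fall to the bottom of the ranking, which mirrors the intuition that unrealistic options should not be offered.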
Let us now consider the case where we do not want to simply
represent a single entity, but a whole phrase within a sentence, as
an M/C quiz.
Let "Earth is the densest, fifth largest, and third closest planet to
the sun in the solar system" be our source text.
Using the familiar tools of part-of-speech tagging and dependency
parsing, we extract the noun chunk "the densest, fifth largest
Now, in order to find a wrong solution it would not be helpful to
simply replace "planet" with any other word. Rather, our aim
is to modify the whole phrase to create an interesting
yet clearly incorrect answer option. One solution would be to
negate one of the attributive adjectives:
"the not-densest, fifth largest planet".
Not really pretty, we admit. It would be just as unattractive if we
replaced the numerical expression "fifth largest" with "second
largest". This would certainly give us a wrong statement, but
apart from astronomers, probably only a few of us would actually
recognize it as an error.
A better solution would be to compare Earth with Venus. As a
starting point we use the statement:
"Venus is the second innermost planet in the solar system, with an
average solar distance of 108 million kilometers, and the third
smallest, with a diameter of about 12,100 kilometers."
The system uses this correct statement about Venus to formulate a
wrong answer option for the Earth question: "the densest, with a
diameter of about 12,100 kilometers the third smallest planet".
Sounds more interesting, doesn't it?
However, we had to take a detour to do this. We had to recognize
that Earth is a planet, just like Venus. In addition, we had to find
properties of Venus that are undoubtedly not valid for Earth.
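The detour can be illustrated with a small Python sketch. It is a simplified stand-in under stated assumptions, not the trained system: the dictionary `KB` is a hand-written toy knowledge base, the property keys and the helper names `siblings`, `conflicting_properties` and `make_distractor` are hypothetical, and Earth's diameter (about 12,700 kilometers) is a rounded figure added here for illustration. A property of the sibling counts as "not valid" for the target only if both entities record the same property with different values.

```python
# Toy knowledge base (assumption): entity -> type plus property phrases.
KB = {
    "Earth": {
        "type": "planet",
        "density_rank": "the densest",
        "size_rank": "the fifth largest",
        "diameter": "with a diameter of about 12,700 kilometers",
    },
    "Venus": {
        "type": "planet",
        "diameter": "with a diameter of about 12,100 kilometers",
        "size_rank": "the third smallest",
    },
    "New York": {"type": "city", "size_rank": "the largest"},
}

def siblings(target, kb):
    """Other entities of the same type as the target (Earth -> Venus)."""
    wanted = kb[target]["type"]
    return [name for name, entry in kb.items()
            if name != target and entry["type"] == wanted]

def conflicting_properties(target, sibling, kb):
    """Sibling properties the target also records, but with another value."""
    t, s = kb[target], kb[sibling]
    return {key: value for key, value in s.items()
            if key != "type" and key in t and t[key] != value}

def make_distractor(target, keep, kb):
    """Keep one true phrase of the target and append the sibling's phrases."""
    entity = kb[target]
    for sibling in siblings(target, kb):
        wrong = conflicting_properties(target, sibling, kb)
        if wrong:
            return f"{entity[keep]}, {' '.join(wrong.values())} {entity['type']}"
    return None  # no sibling of the same type, so no distractor

print(make_distractor("Earth", "density_rank", KB))
```

For the toy data this reproduces the answer option quoted above. For "New York" the function returns `None`, because the toy knowledge base contains no second city to borrow properties from.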
Apparently, creating a quiz task can be quite a challenging
endeavor. With our system we created over 750,000 questions
based on German Wikipedia texts. We were able to achieve a
statistically significant correctness of 98% (F1: 98.6) with an
acceptable complexity of O(n log n). We are thus in the range of
human performance in terms of correctness.
Our quiz is ready to be used in learning environments; in particular,
lecturers and educational institutes can greatly benefit by saving
The concept our system is based on is also important in the
context of NLU. There, we are often confronted with omissions
and ambiguous statements, which only become semantically
unambiguous in their respective context. We have trained further
models that can cope with omissions, ambiguous statements
and other complex statements.
Let us consider, for example, a German legal text. There we can
find the sentence: „Der Handelsmakler hat, sofern nicht die
Parteien ihm dies erlassen oder der Ortsgebrauch mit Rücksicht
auf die Gattung der Ware davon entbindet, von jeder durch seine
Vermittlung nach Probe verkauften Ware die Probe, falls sie ihm
übergeben ist, so lange aufzubewahren, bis die Ware ohne
Einwendung gegen ihre Beschaffenheit angenommen oder das
Geschäft in anderer Weise erledigt wird“. ("The commercial
broker shall, unless the parties release him from this or local usage with regard
to the type of goods exempts him from this, keep the sample, if it
is handed over to him, of any goods sold through his mediation
according to sample, until the goods are accepted without
objection to their condition or the transaction is settled in another
Did you not understand this? Don't worry, we are native speakers
and feel the same way.
Let's make the text a little more vivid by using a trained system.
The result is several sentences that are easier to understand:
The commercial broker shall keep the sample of each commodity
sold by sample through his mediation.
The commercial broker must keep the sample until the goods are
accepted without objection, with the objection referring to its
nature; or, the commercial broker shall keep the sample until the
legal transaction is settled in another way.
This shall apply unless the parties waive this to the commercial
This shall apply if the local use with consideration of the kind of
the commodity releases from it.
This shall apply if the goods are handed over to the commercial
Our system has generated simpler statements that it can process
further. This is an important step in processing many kinds of
problem statements, for example in jurisprudence. It also plays an
important role in other areas, typically in chatbot applications. Our
system is also trained to find many idioms and phrases and is
capable of relating them semantically to the base text.
After this preliminary work, however, it is still crucial to bring
about a “machine understanding” of the statement. This is very
helpful in reorganizing a text in such a way that the semantic
content is preserved, while the syntactic and structural relationships
change. The result is an intersection of base text and
modification, which is the basis for a “machine understanding” of
the actual message content.
These results also have an impact on further NLP developments:
Text Classification with pre-trained embeddings and universal
sentence (cross-)encoders and transformers, as well as Multi-task
NLP with transformer pipelines (sentiment analysis, NER or text
A key element of research is useful real-life applications. In the
area of education, for example, quiz-type output must be
embedded in a sound didactic framework based on proven
educational theories. In our research and practice we take a
holistic approach to ensure that NLP / NLU is not pursued for its
own sake but provides genuine added value.