Conference PaperPDF Available

Development of an Open Source Natural Language Generation Tool for Finnish

Authors:

Abstract

We present an open source Python library to automatically produce syntactically correct Finnish sentences when only lemmas and their relations are provided. The tool resolves automatically morphosyntax in the sentence such as agreement and government rules and uses Omorfi to produce the correct morphological forms. In this paper, we discuss how case government can be learned automatically from a corpus and incorporated as a part of the natural language generation tool. We also present how agreement rules are modeled in the system and discuss the use cases of the tool such as its initial use as part of a computational creativity system, called Poem Machine.
Proceedings of the 4th International Workshop for Computational Linguistics for Uralic Languages (IWCLUL 2018), pages 51–58,
Helsinki, Finland, January 8–9, 2018. c
2018 Association for Computational Linguistics
51
Development of an Open Source Natural
Language Generation Tool for Finnish
Mika Hämäläinen
University of Helsinki
Department of Modern Languages
mika.hamalainen@helsinki.fi
Jack Rueter
University of Helsinki
Department of Modern Languages
jack.rueter@helsinki.fi
Abstract
We present an open source Python library to automatically produce syntac-
tically correct Finnish sentences when only lemmas and their relations are pro-
vided. e tool resolves automatically morphosyntax in the sentence such as
agreement and government rules and uses Omor to produce the correct mor-
phological forms. In this paper, we discuss how case government can be learned
automatically from a corpus and incorporated as a part of the natural language
generation tool. We also present how agreement rules are modelled in the system
and discuss the use cases of the tool such as its initial use as part of a computa-
tional creativity system, called Poem Machine.
Tiivistelmä
Tässä artikkelissa esielemme avoimen lähdekoodin Python-kirjaston kielio-
pillisten lauseiden automaaista tuoamista varten suomen kielelle. Kieliopilli-
set rakenteet pystytään tuoamaan pelkkien lemmojen ja niiden välisten suh-
teiden avulla. Työkalu ratkoo vaadiavan morfosyntaktiset vaatimukset kuten
kongruenssin ja rektion automaaisesti ja tuoaa morfologisesti oikean muodon
Omorn avulla. Esielemme tavan, jolla verbien rektiot voidaan poimia auto-
maaisesti korpuksesta ja yhdistää osaksi NLG-järjestelmää. Esielemme, miten
kongruenssi on mallinneu osana järjestelmää ja kuvaamme työkalun alkuperäi-
sen käyötarkoituksen osana laskennallisesti luovaa Runokone-järjestelmää.
1 Introduction
Natural language generation is a task that requires knowledge about the syntax and
morphology of the language to be generated. Such knowledge can partially be coded
is work is licensed under a Creative Commons Aribution 4.0 International Licence. Licence details:
http://creativecommons.org/licenses/by/4.0/
e source code is released in GitHub https://github.com/mikahama/syntaxmaker
52
by hand into a computational system, but part of the knowledge is beer obtained
automatically such as case government for verbs.
Having a computer create poetry automatically is a challenging task. Even more
so in the context of a morphologically rich language such as Finnish which makes gen-
erating grammatical sentences, even when they are not creative, a challenge. ere-
fore having a syntactically solid system as a part of the poem generation process is
extremely important.
In this paper, we present an open-source tool for producing syntactically correct
Finnish sentences. is tool is used as a part of an NLG pipeline in producing Finnish
poetry automatically. e poem generation part of the pipeline is out of the scope of
this paper.
2 Related work
Previously in the context of poetry generation in Finnish (Toivanen et al., 2012), the
problem of syntax has been solved by taking a ready-made poem, analyzing it mor-
phologically and replacing some of the words in it, inecting them with the mor-
phology of the original words. is, however, does not make it possible to generate
entirely new sentences, and it fails to take agreement or government rules into ac-
count, instead it expects agreement and government to be followed automatically if
words with sucient similarity are used in substitutes.
Another take on generating Finnish poetry in a human-computer co-creativity
seing (Kantosalo et al., 2015) was to use sentences extracted from the Project Guten-
berg’s children’s literature in Finnish. ese sentences were treated as ”poetry frag-
ments” and they were used to generate poems by combining them together in a ran-
domized fashion. is method indeed gives syntactically beer results than the one
described in (Toivanen et al., 2012), as it puts human-wrien sentences together, but
it doesn’t allow any variation in the poem apart from the order of the sentences in the
poem.
Reiter (1994) identies four dierent steps in an NLG pipeline. ose are content
determination, sentence planning, surface generation, and morphology and format-
ting. In the content determination step, an input is given to the NLG system, e.g. in
the form of a query to obtain desired information from the system. Based on this
query, a semantic representation is produced addressing the results to the query. In
other words, this step decides what information is to be conveyed to the user in the
nal output sentence, but also how it will be communicated in the rhetorical planning
of the sentence.
e sentence planner will then get the semantic representation as input and pro-
duce an abstract linguistic form which contains the words to be used in the output
and their syntactic relations. is step bears no knowledge of how the syntax will
actually be realized, i.e., agreement or government rules, instead it applies the chosen
words and how they are related to one another.
e last two steps of the pipeline deal with the actual realization of the syntax in
the sentence. It is the task of the surface generator to handle the linguistic expression
of the abstract linguistic structure. It means resolving agreement, forming questions
in a syntactically correct manner, negation and so on. e actual word forms required
are produced in the morphology step.
53
3 e Finnish NLG tool
e tool, Syntax Maker, described in this paper focuses on the surface generation step
of the NLG pipeline described by Reiter (1994). It is used as a part of a complete NLG
pipeline for producing Finnish poetry and is currently in place in the Poem Machine1
system. is tool was made as a part of the poem generation system in order to solve
the problem of creating novel, grammatical sentences not tackled by the previous
Finnish poem generators. Taking an NLG point of view hasn’t been studied before in
the case of Finnish poetry, which is a shame since Finnish, unlike English, has a rich
morphosyntax. is rich morphosyntax must be given proper aention if the com-
putational creativity system is to be given more freedom to produce sentences of its
own, using its own choice of words in a sentence that might cause other words around
them to undergo morphological change as dictated by agreement and government.
Syntax Maker only knows the morphology needed in the level of tags. For exam-
ple, it knows what case to use for a noun and what person to use for a verb. Actual
morphological forms are generated using Omor (Pirinen et al., 2017).
3.1 Syntactic representation
Syntax Maker is designed to take the abstract linguistic structure of a sentence as its
input. is structure consists of part-of-speech specic phrases each of which have
their head word in lemmatized form. e phrases are nested under each other so that
the highest possible root of the tree is a verb phrase.
When the phrases are nested together, they need to be added in proper slots to
fulll the requirements of agreement and government. For example, a noun phrase
that is to act as a direct object of a verb phrase has to be nested in the verb phrase
slot dir_object. In dealing with verb phrases, Syntax Maker automatically deduces
the possible slots based on the verbs used as heads. In other words, Syntax Maker,
determines the valency of a verb automatically and assigns values such as transitive,
ditransitive or intransitive. On an abstract level, the phrases and their structures are
dened manually.2
Using phrase structures gives us an easier way to implement the needed func-
tionalities. Since the structure of phrases is similar for dierent parts-of-speech, we
can reuse the same code across dierent parts-of-speech. e division into part-of-
speech specic phrases gives us more freedom in expressing their peculiarities such as
agreement and government rules and what kind of phrases can be nested under them.
ese structures come with a predened word order, but it’s not enforced by Syntax
Maker. In other words, the word order in a phrase can be shued at will without
losing the government or agreement information. Even with an altered word order,
Syntax Maker can resolve the proper morphology correctly. e phrase structures are
dened in JSON outside of the source code of Syntax Maker wrien in Python.
3.2 Handling government
Case government rules for adpositions have been hand coded. is can be aributed
to the fact that there is only a very limited number of adpositions in Finnish, and
it takes lile time for a native speaker to write down the case required of a noun
1http://runokone.cs.helsinki.fi/
2ese structures are available on https://github.com/mikahama/syntaxmaker/blob/master/
grammar.json
54
phrase when it serves as complement to a given adpositional phrase. e analogous
treatment of verbs, however, would be overly time consuming and laborious, and
hence this has been automated.
As Finnish is an accusative language, the object is marked with a specic case. e
case used depends on the verb in question and thus has to be specied for each verb
separately. We obtain the case government information together with verb transitiv-
ity automatically from e Finnish Internet Parsebank (Kanerva et al., 2014) syntactic
bi-grams.
Each line of the automatically parsed Parsebank bi-gram data consists of two word
forms connected by a syntactic relation in the order in which they appeared in the
sentence. ese word forms are accompanied by their lemma, part-of-speech, mor-
phology and syntactic annotation.
To extract the cases in which nouns have been linked to verbs, we look for lines in
which the rst word form has Vas its part-of-speech tag and the second word form
has Npart-of-speech and NUM_Sg in its morphology. e reason why we limit the
search to singular nouns only is that, in Finnish, a verb that takes its object in geni-
tive in singular, takes it in nominative in plural, e.g. syön kakun and syön kakut but
not *syön kakkujen. erefore taking plural objects into account as well would intro-
duce more undesired complexity. Furthermore, we ignore all nouns where the lemma
and word from are the same. is is done because the noun in question would then
either be in the nominative, which is not an object case, or it will have been given
an improper analysis in which case no lemmatization has been performed. Exam-
ples of this kind of wrong analyses in the corpus are kaella/kaella/N/NUM_Sg and
kasteleen/kasteleen/N/NUM_Sg. For each bi-gram lling these criteria, we store the
lemma of the verb and the case the related noun was in. is gives us a dictionary3
of verbs and frequencies for noun cases associated with each verb.
e resulting dictionary is then used to determine the transitivity of a verb and the
most frequent case for its object(s). is dictionary consists also of a plethora of non-
verbs such as Ljubuški and Dodonpa as a result of erroneous parsing in the Parsebank
data. is, however, causes no problems in the system because the dictionary also
contains a extensive number of real, lemmatized verbs. Given that Syntax Maker
operates on the level of surface generation, it is not actively involved in choosing the
words in the NLG task. is means that, unless Syntax Maker is specically instructed
to use a non-verb it happens to know as a verb, it won’t. is noise in the verb noun
case dictionary, however, has no real eect on the grammaticality of the generated
sentences.
e transitivity and most frequent case of the object is determined for a given verb
by the verb noun case dictionary. e system is coded to accept the genitive, partitive,
elative and illative as possible direct object cases and the essive, translative, ablative,
allative and illative cases as indirect object cases. When the system denes whether
a verb can take a direct object, it requires the relative frequency of one of the direct
object cases to be above 23% of all the possible cases the verb has been seen with. For
ditransitivity, the threshold is 18% for an indirect object case. Ditransitivity will not
be considered if the verb is determined not to have a direct object. ese threshold
values have been adjusted by hand aer looking at the performance of the system
with a handful of verbs used in testing.
e genitive serves another use in Finnish syntax in addition to marking the direct
3e verb-noun case dictionrary is released on https://github.com/mikahama/syntaxmaker/
blob/master/verb_valences_new.json
55
object. If the most frequent direct object case is genitive, we preform an additional
check to see that it really is being used as an object function. e verb has to also
has enough partitive case, over 23%, so that we can safely say that genitive indeed
can be used as an object. is is because in Finnish, verbs that take their direct object
in the genitive, also accept partitive in certain contexts such as in the expression of
dierences in aspect or negation.
3.3 Modelling agreement
Agreement, unlike government, is something that does not need to be extracted from
a corpus. It is a rather straightforward thing and can be modelled with hand-wrien
rules. In Finnish the predicate verb agrees in person and number with the subject,
and adjective aributes agree in case and number with the head noun.
Since all the phrase types in our system are modelled in a similar way, it is easy to
introduce agreement rules in the phrase structures. In a phrase structure, we dene
a key that is either parent referring to the parent phrase of the current phrase or a
key to the list of component. Component lists all the possible syntactic positions for
nested phrases such as subject or dir_object. Even though there aren’t many agree-
ment relations in Finnish, by modelling them in the external grammar le, we hope
to make it easier to add more languages to the system in the future.
When Syntax Maker produces a sentence, it starts to process the syntactic tree
phrase by phrase. For each phrase, it looks at the dened agreement relationship and
copies the morphological information from the phrase dened to be agreed with. e
agreement relation in the grammar le states the morphological tags which should be
copied, for example in the case of an adjective phrase, the tags are CASE and NUM.
3.4 Modifying the verb phrase
Apart from just providing basic grammaticality by resolving agreement and govern-
ment, Syntax Maker also provides means to modify verb phrases to produce more
complex, yet grammatical sentences.
Syntax Maker can be used to negate sentences. When a sentence is negated, a
new phrase with the head ei is added to the components of the verb phrase as aux.
e new phrase has an agreement relation parent->subject: PERS, NUM and the verb
phrase containing the predicate verb is tagged as NEG so that it will be conjugated as
such when the full sentence is produced as text. e case of the direct object is also
changed to partitive if the most frequent direct object case of the verb is genitive, in
compliance with Finnish grammar.
Mood and tense are also handled by Syntax Maker. In the case of the prefect,
the auxiliary verb olla is set as the new head of the verb phrase and the old head is
moved to a new subordinate phrase with the part-of-speech value PastParticiple and
agreement parent->subject: NUM. is makes sense from the point of view of Syntax
Maker since olla is the verb that is conjugated normally while the participle form only
agrees with the number of the subject. Other auxiliary verbs can be added in a similar
fashion, where the auxiliary verb substitutes the original head and the verb is moved
to a nested phrase with the morphology required by the auxiliary verb.
Passive voice is handled by creating a dummy phrase as a subject with the mor-
phological tags PERS = 4 and NUM = PE. is will automatically make the verb agree
with the dummy phrase’s morphology and produce the correct form as output. Also,
56
if the verb takes its direct object in genitive, the government rule is changed so that
the direct object will be in the nominative.
A sentence can also be turned into an interrogative one. is adds an additional
morphological tag CLIT = KO to the head of the verb phrase and moves it to the
beginning of the whole sentence. Syntax Maker does not produce punctuation, so a
question mark has to be appended to the end of the sentence at a dierent level in the
NLG pipeline.
4 Evaluation
In this part, we evaluate how accurately Syntax Maker can produce verb phrases. We
limit this evaluation to the automatically extracted information used by Syntax Maker
because it is more prone to errors than the hand wrien rules. is means that we are
evaluating two things in the generated output: the predicted valency i.e. how many
objects the verb can take and the predicted case for the object.
In order to do the evaluation, we take a hundred Finnish verbs at random from the
Finnish Wiktionary4. ese verbs are then given as input to Syntax Maker to produce
verb phrases out of them. e valency and object cases are then checked by hand to
conduct the evaluation phase.
too low too high correct
valency prediction 28% 5% 67%
Table 1: Accuracy in predicting valency
Syntax maker predicts the number of objects correctly 67% of the time and 28%
of the time too low. is is acceptable in the task of poem generation where we
are interested in generating syntactically correct poems. Having too few objects in
the generated output only creates an ellipsis that doesn’t result in incorrect syntax.
However, in other NLG tasks outside of the scope of poem generation, the objects
might be important and thus having a higher accuracy is something to work towards.
case correct case incorrect no object
object case prediction 50% 4% 46%
Table 2: Accuracy in predicting object case
In the test set, Syntax Maker produced a wrong case only 4% of the time. 46% of
the verbs were either truly intransitive or didn’t take an object according to Syntax
Maker. In other words, by just taking into account the transitive verbs recognized by
Syntax Maker, the accuracy reached to 93%. is means that Syntax Maker is very
good at coming up with the correct case but not as good at determining the valency
accurately.
5 Future work
At the current state, Syntax Maker doesn’t handle all parts of the Finnish grammar.
For instance, it doesn’t have the functionality to express aspectual dierence by alter-
4From a Wiktionary dump on https://dumps.wikimedia.org/fiwiktionary/
57
ing between genitive and partitive objects. In addition, it has only a limited knowl-
edge of the transitivity of verbs. Novel automated ways should be studied to solve
this shortcoming.
In the future, Syntax Maker should be tested as a part of the NLG pipeline in uses
other than poetry generation as well. is might reveal new requirements for the
system that do not appear in the task of poetry generation. is might also reveal
missing functionalities both in the generation of syntax and the API provided by the
library that are needed in other NLG tasks.
Including small Uralic languages in this tool is also in our interest for the future.
is is because having an NLG system would be especially useful in the case of minor-
ity languages, for example in generation of news automatically in these languages.
6 Conclusions
In this paper we have presented an open source Python library called Syntax Maker.
e library was made to be used as a low-level syntax producer in a new NLG pipeline
for producing Finnish poetry and is currently in place in a computational creativity
system known as Poem Machine5. By embracing the notion of separation of concerns
in the soware architecture of the system, Syntax Maker can be used in a multitude
of contexts outside of computational creativity applications as an all-purpose tool
for producing grammatical Finnish. To achieve this goal, a method for extracting
the information needed to resolve verbal agreement automatically was presented and
evaluated.
Acknowledgments
is work has been supported by the Academy of Finland under grant 276897 (CLiC)
References
Jenna Kanerva, Juhani Luotolahti, Veronika Laippala, and Filip Ginter. 2014. Syntactic
n-gram collection from a large-scale corpus of internet nnish. In Proceedings of
the Sixth International Conference Baltic HLT .
Anna Kantosalo, Jukka Toivanen, and Hannu Toivonen. 2015. Interaction evaluation
for human-computer co-creativity: A case study. In Proceedings of the Sixth Inter-
national Conference on Computational Creativity. pages 276–283.
Tommi A Pirinen, Inari Listenmaa, Ryan Johnson, Francis M. Tyers, and Juha
Kuokkala. 2017. Open morphology of nnish. LINDAT/CLARIN digital library at
the Institute of Formal and Applied Linguistics, Charles University. hp://hdl.han-
dle.net/11372/LRT-1992.
Ehud Reiter. 1994. Has a consensus nl generation architecture appeared, and is it
psycholinguistically plausible? In Proceedings of the Seventh International Workshop
on Natural Language Generation. INLG ’94.
5Poem Machine can be used on http://runokone.cs.helsinki.fi/
58
Jukka Toivanen, Hannu Toivonen, Alessandro Valitui, and Oskar Gross. 2012.
Corpus-based generation of content and form in poetry. In Proceedings of the ird
International Conference on Computational Creativity.
... For Finnish in particular, the literature for neural headline generation and summarization is scarce. Currently however, most work regarding Finnish headline and text generation seems to be using more conventional NLP, rule-based and statistical methods (Leppänen et al., 2017;Hämäläinen and Alnajjar, 2019;Hämäläinen and Rueter, 2018). ...
Preprint
Full-text available
We present a novel approach to generating news headlines in Finnish for a given news story. We model this as a summarization task where a model is given a news article, and its task is to produce a concise headline describing the main topic of the article. Because there are no openly available GPT-2 models for Finnish, we will first build such a model using several corpora. The model is then fine-tuned for the headline generation task using a massive news corpus. The system is evaluated by 3 expert journalists working in a Finnish media house. The results showcase the usability of the presented approach as a headline suggestion tool to facilitate the news production process.
... For Finnish in particular, the literature for neural headline generation and summarization is scarce. Currently however, most work regarding Finnish headline and text generation seems to be using more conventional NLP, rule-based and statistical methods (Leppänen et al., 2017;Hämäläinen and Alnajjar, 2019;Hämäläinen and Rueter, 2018). ...
Conference Paper
Full-text available
We present a novel approach to generating news headlines in Finnish for a given news story. We model this as a summarization task where a model is given a news article, and its task is to produce a concise headline describing the main topic of the article. Because there are no openly available GPT-2 models for Finnish, we will first build such a model using several corpora. The model is then fine-tuned for the headline generation task using a massive news corpus. The system is evaluated by 3 expert journalists working in a Finnish media house. The results showcase the usability of the presented approach as a headline suggestion tool to facilitate the news production process.
... Our current approach focuses on English, in the future, we are interested in using our method for other languages as well such as Finnish. This would require a more robust surface realization method to deal with morphology more complex than that of English (Hämäläinen and Rueter 2018). There is already a similar semantic database available for Finnish (Hämäläinen 2018) as the one we used for English, which greatly facilitates a multilingual port of our method. ...
Preprint
Full-text available
Automated news generation has become a major interest for new agencies in the past. Oftentimes headlines for such automatically generated news articles are unimaginative as they have been generated with ready-made templates. We present a computationally creative approach for headline generation that can generate humorous versions of existing headlines. We evaluate our system with human judges and compare the results to human authored humorous titles. The headlines produced by the system are considered funny 36\% of the time by human evaluators.
... Our current approach focuses on English, in the future, we are interested in using our method for other languages as well such as Finnish. This would require a more robust surface realization method to deal with morphology more complex than that of English (Hämäläinen and Rueter 2018). There is already a similar semantic database available for Finnish (Hämäläinen 2018) as the one we used for English, which greatly facilitates a multilingual port of our method. ...
Conference Paper
Full-text available
Automated news generation has become a major interest for new agencies in the past. Oftentimes headlines for such automatically generated news articles are unimaginative as they have been generated with ready-made templates. We present a computationally creative approach for headline generation that can generate humorous versions of existing headlines. We evaluate our system with human judges and compare the results to human authored humorous titles. The headlines produced by the system are considered funny 36% of the time by human evaluators.
... If a verb gets changed in the sentence, its case government rule might be different from the original verb. In this case, we apply Syntax Maker (Hämäläinen and Rueter 2018) to resolve the case of the complements of the new verb. Taking these two different linguistic rules into account while generating results in more syntactic output. ...
Thesis
Full-text available
Computational creativity has received a good amount of research interest in generating creative artefacts programmatically. At the same time, research has been conducted in computational aesthetics, which essentially tries to analyse creativity exhibited in art. This thesis aims to unite these two distinct lines of research in the context of natural language generation by building, from models for interpretation and generation, a cohesive whole that can assess its own generations. I present a novel method for interpreting one of the most difficult rhetoric devices in the figurative use of language: metaphors. The method does not rely on hand-annotated data and it is purely data-driven. It obtains the state of the art results and is comparable to the interpretations given by humans. We show how a metaphor interpretation model can be used in generating metaphors and metaphorical expressions. Furthermore, as a creative natural language generation task, we demonstrate assigning creative names to colours using an algorithmic approach that leverages a knowledge base of stereotypical associations for colours. Colour names produced by the approach were favoured by human judges to names given by humans 70% of the time. A genetic algorithm-based method is elaborated for slogan generation. The use of a genetic algorithm makes it possible to model the generation of text while optimising multiple fitness functions, as part of the evolutionary process, to assess the aesthetic quality of the output. Our evaluation indicates that having multiple balanced aesthetics outperforms a single maximised aesthetic. From an interplay of neural networks and the traditional AI approach of genetic algorithms, we present a symbiotic framework. This is called the master-apprentice framework. This makes it possible for the system to produce more diverse output as the neural network can learn from both the genetic algorithm and real people. The master-apprentice framework emphasises a strong theoretical foundation for the creative problem one seeks to solve. From this theoretical foundation, a reasoned evaluation method can be derived. This thesis presents two different evaluation practices based on two different theories on computational creativity. This research is conducted in two distinct practical tasks: pun generation in English and poetry generation in Finnish.
... We are interested in seeing what the effect of the automatically adapted dialect is on computer generated text. We use an existing Finnish poem generator (Hämäläinen 2018) that produces standard Finnish (SF) text as it relies heavily on hand defined syntactic structures that are filled with lemmatized words that are inflected with a normative Finnish morphological generator by using a tool called Syntax Maker (Hämäläinen and Rueter 2018). We use this generator to generate 10 different poems. ...
Preprint
Full-text available
We present a novel approach for adapting text written in standard Finnish to different dialects. We experiment with character level NMT models both by using a multi-dialectal and transfer learning approaches. The models are tested with over 20 different dialects. The results seem to favor transfer learning, although not strongly over the multi-dialectal approach. We study the influence dialectal adaptation has on perceived creativity of computer generated poetry. Our results suggest that the more the dialect deviates from the standard Finnish, the lower scores people tend to give on an existing evaluation metric. However, on a word association test, people associate creativity and originality more with dialect and fluency more with standard Finnish.
... We are interested in seeing what the effect of the automatically adapted dialect is on computer generated text. We use an existing Finnish poem generator (Hämäläinen 2018) that produces standard Finnish (SF) text as it relies heavily on hand defined syntactic structures that are filled with lemmatized words that are inflected with a normative Finnish morphological generator by using a tool called Syntax Maker (Hämäläinen and Rueter 2018). We use this generator to generate 10 different poems. ...
Conference Paper
Full-text available
We present a novel approach for adapting text written in standard Finnish to different dialects. We experiment with character level NMT models both by using a multi-dialectal and transfer learning approaches. The models are tested with over 20 different dialects. The results seem to favor transfer learning, although not strongly over the multi-dialectal approach. We study the influence dialectal adaptation has on perceived creativity of computer generated poetry. Our results suggest that the more the dialect deviates from the standard Finnish, the lower scores people tend to give on an existing evaluation metric. However, on a word association test, people associate creativity and originality more with dialect and fluency more with standard Finnish.
... For each parallel structure, we fill it with all the possibilities of different combinations of words to replace the placeholders. We use Syntax Maker [11] together with Omorfi [22] to produce the matching morphological form for the Finnish placeholders. For English, we inflect the words with the pattern library [24]. ...
Preprint
Full-text available
This paper presents the current lexical, morphological, syntactic and rule-based machine translation work for Erzya and Moksha that can and should be used in the development of a roadmap for Mordvin linguistic research. We seek to illustrate and outline initial problem types to be encountered in the construction of an Apertium-based shallow-transfer machine translation system for the Mordvin language forms. We indicate reference points within Mordvin Studies and other parts of Uralic studies, as a point of departure for outlining a linguistic studies with a means for measuring its own progress and developing a roadmap for further studies.
Conference Paper
Full-text available
This paper presents the current lexical, morphological, syntactic and rule-based machine translation work for Erzya and Moksha that can and should be used in the development of a roadmap for Mordvin linguistic research. We seek to illustrate and outline initial problem types to be encountered in the construction of an Apertium-based shallow-transfer machine translation system for the Mordvin language forms. We indicate reference points within Mordvin Studies and other parts of Uralic studies, as a point of departure for outlining a linguistic studies with a means for measuring its own progress and developing a roadmap for further studies.
Conference Paper
Full-text available
Interaction design has been suggested as a framework for evaluating computational creativity by Bown (2014). Yet few practical accounts on using an Interaction Design based evaluation strategy in Computational Creativity Contexts have been reported in the literature. This study paper describes the evaluation process and results of a human-computer co-creative poetry writing tool intended for children in a school context. We specifically focus on one formative evaluation case utilizing Interaction Design evaluation methods, offering a suggestion on how to conduct Interaction Design based evaluation in a computational creativity context, as well as, report the results of the evaluation itself. The evaluation process is considered from the perspective of a computational creativity researcher and we focus on challenges and benefits of the interaction design evaluation approach within a computational creativity project context.
Conference Paper
Full-text available
We employ a corpus-based approach to generate content and form in poetry. The main idea is to use two different corpora, on one hand, to provide semantic content for new poems, and on the other hand, to generate a specific grammatical and poetic structure. The approach uses text mining methods, morphological analysis, and morphological synthesis to produce poetry in Finnish. We present some promising results obtained via the combination of these methods and preliminary evaluation results of poetry generated by the system.
Syntactic n-gram collection from a large-scale corpus of internet finnish
  • Jenna Kanerva
  • Juhani Luotolahti
  • Veronika Laippala
  • Filip Ginter
Jenna Kanerva, Juhani Luotolahti, Veronika Laippala, and Filip Ginter. 2014. Syntactic n-gram collection from a large-scale corpus of internet finnish. In Proceedings of the Sixth International Conference Baltic HLT.
Open morphology of finnish. LINDAT/CLARIN digital library at the Institute of Formal and Applied Linguistics
  • Inari Tommi A Pirinen
  • Ryan Listenmaa
  • Francis M Johnson
  • Juha Tyers
  • Kuokkala
Tommi A Pirinen, Inari Listenmaa, Ryan Johnson, Francis M. Tyers, and Juha Kuokkala. 2017. Open morphology of finnish. LINDAT/CLARIN digital library at the Institute of Formal and Applied Linguistics, Charles University. http://hdl.handle.net/11372/LRT-1992.
Has a consensus nl generation architecture appeared, and is it psycholinguistically plausible?
  • Ehud Reiter
Ehud Reiter. 1994. Has a consensus nl generation architecture appeared, and is it psycholinguistically plausible? In Proceedings of the Seventh International Workshop on Natural Language Generation. INLG '94.