Generating Modern Poetry Automatically in Finnish

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, pages 6001–6006, Hong Kong, China, November 3–7, 2019. © 2019 Association for Computational Linguistics
Mika Hämäläinen
Department of Digital Humanities
University of Helsinki
mika.hamalainen@helsinki.fi
Khalid Alnajjar
Department of Computer Science
University of Helsinki
khalid.alnajjar@helsinki.fi
Abstract
We present a novel approach for generating poetry automatically for the morphologically rich Finnish language by using a genetic algorithm. The approach improves the state of the art of the previous Finnish poem generators by introducing a higher degree of freedom in terms of structural creativity. Our approach is evaluated and described within the paradigm of computational creativity, where the fitness functions of the genetic algorithm are assimilated with the notion of aesthetics. The output is considered to be a poem 81.5% of the time by human evaluators.
1 Introduction
Poem generation is a challenging task for creative natural language generation (NLG), requiring structural integrity in the form of rhyming and meter, grammatical correctness and figurative expression. Poems are meant to be interpreted, and the meaning they convey therefore cannot be fully explained by semantics; it rather requires an exploration into the notion of pragmatics.
In this paper, we present a novel approach based on a genetic algorithm for creating poetry in Finnish from the standpoint of computational creativity. In addition to solving problems related to poems in general, the morphosyntactically complex Finnish language sets additional requirements for producing grammatical output.
Computational creativity can be seen as a search for creative artefacts in a conceptual space (cf. Wiggins, 2006). The use of a genetic algorithm for a creative task is therefore reasonable, as it conducts a search and picks out the most suitable candidates based on its fitness function. An important aspect of creativity is that the system should be able to assess its own creations, a notion called appreciation (Colton, 2008) or aesthetic function (Colton et al., 2011) in the literature. The fitness function of the genetic algorithm serves this exact purpose, as it can score the output in terms of different aesthetic dimensions.
2 Related work
In the past, poetry generation has been studied both from the point of view of computational creativity and natural language generation. Poem generation has been tackled with a variety of different methods such as case-based reasoning (Gervás, 2001), templates (Colton et al., 2012), translation with WFSTs (weighted finite-state transducers) (Greene et al., 2010), text transformation via word embeddings (Bay et al., 2017) and conditional variational autoencoders (Li et al., 2018). As the field of poem generation has been broadly discussed by Oliveira (2017), we dedicate the rest of this section to describing the existing poetry generation work conducted for Finnish within the computational creativity paradigm. We also discuss some previous approaches using genetic algorithms.
One of the first takes on Finnish poem generation is the P. O. Eticus system (Toivanen et al., 2012). P. O. Eticus uses a corpus of human-authored poems as templates for generating new poetry. In practice, the system takes a random poem from the corpus, conducts a morphological analysis on it and replaces some of the words in the existing poem. The replacement words are inflected to match the morphology of the original words.
Another take on poetry generation in Finnish is that of Kantosalo et al. (2015). This approach is presented as a part of a poem authoring system. The generator takes sentences from children's books stored in its corpus based on a shared keyword. These sentences serve as verses, or poem fragments, and they are output one after another, forming a generated poem. As the system does not alter text at all, it does not have to deal with the complexities of Finnish morphology.
The most recent work on Finnish poem generation is presented by Hämäläinen (2018a). This approach consists of individual rule-based verse generators, each of which produces structurally different verses with different types of figurative expression, such as metaphors, tautology and comparison. The verse generators are applied in the order defined by hand-written poem structures. Semantic cohesion is achieved by having each verse generator output a noun to the following verse generator in the poem structure. This guarantees that verses are always coherent to some extent with the verse that immediately precedes them. This generator is in use in the creative internet application Poem Machine, tailored for co-creativity (Hämäläinen, 2018b).
The previous approaches to Finnish poem generation covered in this section are limited in terms of structural creativity. They are constrained either by the structure imposed by the existing poems or sentences, or by the hand-written verse structures. The approach we present in this paper showcases more creative freedom on the structural level. This, however, is challenging due to the complicated morphosyntax of Finnish; structural changes can easily render the results nonsensical, as wrong morphology in an incorrect syntactic position will make the entire sentence ungrammatical. We take this into account in our proposed method.
Genetic algorithms have been used in the generation of poetic language before. Although not full poem generation, Hervás et al. (2007) have used genetic algorithms for generating alliterations in Spanish. In terms of full poetry generation, Manurung et al. (2012) aim for grammaticality, meaningfulness and poeticness with their genetic algorithm approach. Their approach tries to maximize the similarity of the poem meter to the target meter, and of the poem semantics to the target semantics, while still retaining grammaticality.
A recent approach to poem generation with genetic algorithms, TwitSong 3.0 (Lamb and Brown, 2019), is based on a mined corpus of sentences that are used as verses in poems based on their inter-compatibility in terms of rhyming. They base their fitness functions on the following metrics: meter, emotion, topicality and imagery. The fitness functions operate on the verse level. They solve emotion and imagery with existing lexicons, topicality is assessed based on trigram and keyword similarity with the desired topic, and meter is scored based on how close it is to an iambic meter.
3 Poem Generator
This section is dedicated to describing the data used for poem generation, the genetic algorithm and how Finnish morphosyntax is handled by the system. Special attention is paid to describing the fitness functions, according to which the system can rank its creations.
3.1 Data
We crawl Wikisource¹ for Finnish poetry. This way we obtain 6,189 poems. We parse the poems by using the Finnish dependency parser (Haverinen et al., 2014) to obtain syntactic relations, morphological features, part of speech and lemma for each word. This constitutes our poem corpus, denoted as P, with verse-level syntactic parsing. These poems are used by the genetic algorithm for the initial population, where a stanza of a human-authored poem is treated as one poem.
For semantics, we use the pretrained word2vec word embeddings² trained on the Finnish Internet Parsebank (Kanerva et al., 2014). This word2vec model has been trained on lemmatized data, which is important as we are interested in obtaining replacement words in an uninflected form.
3.2 Genetic Algorithm
Genetic algorithms are inspired by evolution taking place in the real world. They have an initial set of individuals forming a population. These individuals are then exposed to evolutionary processes such as mutation and crossover. After a generation, the fittest individuals survive to the next generation and the evolutionary process is repeated. For modeling this process, we use the Python library DEAP (Fortin et al., 2012) as the genetic algorithm framework.
We employ a standard (µ+λ) genetic algorithm, which has previously been used in computational creativity applications (see Alnajjar et al., 2018). The method begins by constructing an initial population and then evolving it, while optimizing certain parameters, throughout G generations. At each generation step, the fittest µ individuals among the current population and the λ offspring are selected to form the next population. We empirically set µ and λ to 100 and G to 25. Additionally, the algorithm takes two user-defined inputs, a poem p and a theme t. In our case, we consider a theme t to be a single word representing an abstract concept such as nature; alternatively, a set of words could be used to represent a more focused theme (e.g. tree, forest, flower, etc.).
¹ https://fi.wikisource.org
² http://bionlp-www.utu.fi/fin-vector-space-models/fin-word2vec-lemma.bin
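The (µ+λ) scheme described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' DEAP-based implementation: it uses a single scalar fitness instead of the multi-objective selection described in Section 3.2.3, and the names mu_plus_lambda, fitness and vary, as well as the toy integer problem, are hypothetical.

```python
import random


def mu_plus_lambda(population, fitness, vary, mu, lam, generations, rng):
    """Evolve a population with (mu + lambda) selection; higher fitness is better."""
    for _ in range(generations):
        # Produce lambda offspring by varying randomly chosen parents.
        offspring = [vary(rng.choice(population), rng) for _ in range(lam)]
        # Parents and offspring compete; the mu fittest individuals survive.
        population = sorted(population + offspring, key=fitness, reverse=True)[:mu]
    return population


def fitness(x):
    """Toy fitness (assumption): integers closer to 42 are fitter."""
    return -abs(x - 42)


def vary(x, rng):
    """Toy variation (assumption): move one step up or down."""
    return x + rng.choice([-1, 1])


rng = random.Random(0)
start = [rng.randint(0, 100) for _ in range(100)]
# mu = lambda = 100 and G = 25 mirror the parameter values reported in the text.
final = mu_plus_lambda(start, fitness, vary, mu=100, lam=100, generations=25, rng=rng)
```

Because parents take part in the selection, the best individual found so far is never lost between generations.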
3.2.1 Initial Population
To build an initial population containing poems with various syntactic structures, the method makes µ copies of the input poem p. For each copy, the method then replaces one verse with a random verse from a different poem in the poem corpus P.
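The construction above can be illustrated as follows. This is a sketch under assumptions: poems are represented as plain lists of verse strings, and the function name initial_population and the toy corpus are hypothetical. The actual system works on parsed verses rather than raw strings.

```python
import random


def initial_population(poem, corpus, mu, rng):
    """Return mu copies of `poem`, each with one verse swapped for a random
    verse taken from a different poem in `corpus`."""
    others = [p for p in corpus if p != poem]
    population = []
    for _ in range(mu):
        individual = list(poem)                  # copy of the input poem
        position = rng.randrange(len(individual))
        donor = rng.choice(others)               # a different poem in the corpus
        individual[position] = rng.choice(donor)
        population.append(individual)
    return population


# Toy data (assumption): placeholder verses instead of real parsed poems.
poem = ["verse a1", "verse a2", "verse a3"]
corpus = [poem, ["verse b1", "verse b2"], ["verse c1", "verse c2", "verse c3"]]
population = initial_population(poem, corpus, mu=100, rng=random.Random(1))
```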
3.2.2 Mutation and Crossover
In our method, we implement one type of mutation, which selects a random content word in the poem. The term content word in this case refers to a word that belongs to an open-class part-of-speech category. The selected word is substituted with another semantically similar word, which is determined as follows. Let w be the random content word selected to be replaced. The method retrieves the top 300 words semantically most similar to w from the word2vec model as candidate replacements. Thereafter, the method uses UralicNLP (Hämäläinen, 2019) to perform morphological analysis on all candidate words. The candidate words that have a different part-of-speech tag than the original word w are omitted. Out of the remaining candidate words, a random word is picked to substitute w.
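The mutation step can be sketched as below. To keep the example self-contained, two hypothetical dictionaries stand in for the external resources: similar_words plays the role of the word2vec neighbour lookup and pos_of the role of the morphological analysis. The set of open-class tags is likewise an assumption, as the text does not list them.

```python
import random

# Assumed open-class part-of-speech tags; the paper does not enumerate them.
OPEN_CLASSES = {"NOUN", "VERB", "ADJ", "ADV"}


def mutate(lemmas, pos_of, similar_words, rng, top_n=300):
    """Replace one random content word with a similar word of the same POS."""
    content_positions = [i for i, w in enumerate(lemmas)
                         if pos_of.get(w) in OPEN_CLASSES]
    if not content_positions:
        return list(lemmas)
    i = rng.choice(content_positions)
    w = lemmas[i]
    # Keep only candidates whose part of speech matches the original word.
    candidates = [c for c in similar_words.get(w, [])[:top_n]
                  if pos_of.get(c) == pos_of[w]]
    if not candidates:
        return list(lemmas)
    mutated = list(lemmas)
    mutated[i] = rng.choice(candidates)
    return mutated


# Toy stand-ins (assumptions) for the embedding model and the analyser.
pos_of = {"punainen": "ADJ", "sininen": "ADJ", "talo": "NOUN",
          "mökki": "NOUN", "rakentaa": "VERB", "ja": "CCONJ"}
similar_words = {"punainen": ["sininen", "talo"], "talo": ["mökki", "rakentaa"]}
result = mutate(["punainen", "talo", "ja"], pos_of, similar_words, random.Random(0))
```

The substitution operates on lemmas only; producing the correctly inflected surface form is a separate step.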
We use a single-point crossover at the verse level. In practice, this means that during the evolutionary process two poem individuals are selected and a single crossover point, located at the beginning of a verse, is chosen at random. Verses after that point are swapped between the two poems.
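A verse-level single-point crossover can be written as follows. This is a minimal sketch that assumes poems are lists of verses; the function name verse_crossover is hypothetical, and how the original system handles poems of different lengths is not stated in the text, so the choice made here (cutting within the shorter poem) is an assumption.

```python
import random


def verse_crossover(poem_a, poem_b, rng):
    """Swap the verses of two poems after a random verse boundary."""
    limit = min(len(poem_a), len(poem_b))
    if limit < 2:
        # Nothing to swap when one poem has fewer than two verses.
        return list(poem_a), list(poem_b)
    point = rng.randrange(1, limit)  # boundary between two verses
    child_a = poem_a[:point] + poem_b[point:]
    child_b = poem_b[:point] + poem_a[point:]
    return child_a, child_b


a = ["a1", "a2", "a3", "a4"]
b = ["b1", "b2", "b3"]
child_a, child_b = verse_crossover(a, b, random.Random(0))
```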
3.2.3 Fitness Functions
The genetic algorithm assesses the individuals based on six metrics that evaluate the poem's structure and one metric that evaluates semantics. The first two structural metrics compare each generated poem to the original poem: the difference in the number of syllables in the verses, and the difference in the poetic foot, as measured by the distribution of long and short syllables. The genetic algorithm is set to minimize these values to keep the differences minimal. As not changing the poem at all would result in the minimum difference in these values, we penalize identical verses by giving them a distance of 20. This way the genetic algorithm is pushed to make changes, while results following the original meter are preferred.
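The syllable-count objective with its penalty for unchanged verses might look like the sketch below. Several parts are assumptions rather than details from the paper: syllables are approximated crudely by counting runs of vowels (real Finnish syllabification is more involved), per-verse distances are summed over the poem, and the names count_syllables and syllable_distance are hypothetical.

```python
import re

IDENTICAL_PENALTY = 20  # distance assigned to an unchanged verse, as in the text


def count_syllables(verse):
    """Crude approximation (assumption): one syllable per run of vowels."""
    return len(re.findall(r"[aeiouyäöå]+", verse.lower()))


def syllable_distance(original, candidate):
    """Sum of per-verse syllable-count differences; lower is better."""
    total = 0
    for old, new in zip(original, candidate):
        if old == new:
            total += IDENTICAL_PENALTY  # discourage copying the verse as is
        else:
            total += abs(count_syllables(old) - count_syllables(new))
    return total


original = ["punainen talo", "uneksin hatusta"]
candidate = ["sininen talo", "uneksin hatusta"]
distance = syllable_distance(original, candidate)
```

In this toy case the first verse is changed but keeps its syllable count, so it contributes nothing, while the untouched second verse contributes the penalty.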
The number of full rhymes, assonance rhymes and consonance rhymes between the verses of each generated poem are used as metrics to assess the overall poetic quality of the individuals. The number of alliterating words is counted within verses, as this type of rhyme traditionally occurs within verses in Finnish poetry. The values given by these four metrics are maximized by the genetic algorithm to get the maximum number of rhyming words in the final outputs.
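As an illustration of the within-verse metric, alliterating words could be counted as in the sketch below. The paper does not define alliteration formally, so the rule used here is an assumption: a word alliterates if it shares its initial letter with at least one other word in the same verse. The function name alliterating_words is hypothetical.

```python
from collections import Counter


def alliterating_words(verse):
    """Count words sharing their initial letter with another word in the verse."""
    initials = Counter(word[0].lower() for word in verse.split()
                       if word[0].isalpha())
    return sum(count for count in initials.values() if count > 1)


# Verse taken from the example poem in Section 4.
score = alliterating_words("Ja laulut ne kiertävät maata ja merta")
```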
The last metric measures the average semantic similarity of the words in the poem to the input theme t with the word2vec model. Maximizing this function pushes the evolutionary process towards creating poems that are close in semantics to the desired input theme.
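The semantic objective can be sketched as an average cosine similarity. In this sketch a small hypothetical dictionary of two-dimensional vectors replaces the word2vec model, and skipping out-of-vocabulary words is an assumption, as the text does not say how they are treated.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def theme_similarity(lemmas, theme, vectors):
    """Average similarity of the poem's words to the theme; higher is better."""
    if theme not in vectors:
        return 0.0
    scores = [cosine(vectors[w], vectors[theme]) for w in lemmas if w in vectors]
    return sum(scores) / len(scores) if scores else 0.0


# Toy vectors (assumption), chosen so that the result is easy to verify.
vectors = {"luonto": (1.0, 0.0), "metsä": (2.0, 0.0), "auto": (0.0, 1.0)}
score = theme_similarity(["metsä", "auto", "tuntematon"], "luonto", vectors)
```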
As we are employing multiple objective functions in our genetic algorithm, we resort to using the non-dominated sorting genetic algorithm NSGA-II (Deb et al., 2002) for optimizing these functions. In short, the algorithm selects individuals that are not dominated by any other individual. An individual x is considered to dominate another individual y if x's scores on all objective functions are at least as good as y's and x is strictly better than y on at least one objective.
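The dominance relation can be made concrete with a short sketch. It covers only the dominance test and the extraction of the non-dominated set, not the full NSGA-II procedure with its front ranking and crowding distance. The function names and the three-objective layout of the toy scores (one minimized distance, two maximized scores) are assumptions made for illustration.

```python
def dominates(x, y, maximize):
    """True if score tuple x dominates y.

    maximize[i] states whether objective i is maximized (True) or minimized.
    """
    strictly_better = False
    for a, b, is_max in zip(x, y, maximize):
        if not is_max:
            a, b = -a, -b  # turn minimization into maximization
        if a < b:
            return False   # worse on one objective: no dominance
        if a > b:
            strictly_better = True
    return strictly_better


def non_dominated(scores, maximize):
    """Keep the score tuples that no other tuple dominates."""
    return [s for s in scores
            if not any(dominates(other, s, maximize) for other in scores)]


# Toy scores (assumption): (meter distance, rhyme count, theme similarity).
maximize = (False, True, True)
scores = [(2, 3, 0.5), (4, 3, 0.5), (1, 1, 0.2)]
front = non_dominated(scores, maximize)
```

The second tuple is dropped because the first matches it on rhyme and similarity and has a smaller meter distance, whereas the third survives thanks to its lower distance.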
3.3 Surface Generation
As the genetic algorithm does substitutions on the level of lemmas, it is important to be able to turn the verses with new lemmas into grammatical sentences. This is needed not only for presenting the final output produced by the genetic algorithm to people, but also for the fitness functions to work.
In Finnish, the surface form of a word (its morphological realization) is affected by two mechanisms: agreement and government. The former means that certain words have to share morphological features in a sentence. For example, adjectives have to follow the case and number of the noun they modify, like so: punainen talo (a red house) and punaisessa talossa (in a red house). This can be accounted for just by inflecting the replacement word with the morphology of the original word. For this purpose we use Omorfi (Pirinen et al., 2017), which implements Finnish morphology as an FST (finite-state transducer).
Government, on the other hand, requires some additional work. In government, words affect each other morphologically in a way that depends on the governor. This means that if a governor word is replaced by another one in the sentence, the morphology of the governed word needs to adapt to the change. Concretely, given an original verse uneksin hatusta (I dream of a hat) and a change of the verb that yields näen hatun (I see a hat), the case of the object hattu has to change from elative to genitive. We resolve government with Syntax Maker (Hämäläinen and Rueter, 2018), which resolves the required case based on corpus statistics.
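The government example can be mimicked with a toy lookup. The three dictionaries below are hypothetical stand-ins that cover only the words of the example: in the described system the required case comes from Syntax Maker and the inflected forms from Omorfi, neither of which is called here.

```python
# Hypothetical lookup tables covering only the example from the text.
GOVERNED_CASE = {"uneksia": "elative", "nähdä": "genitive"}
FIRST_PERSON_SINGULAR = {"uneksia": "uneksin", "nähdä": "näen"}
NOUN_FORMS = {("hattu", "elative"): "hatusta", ("hattu", "genitive"): "hatun"}


def realize(verb_lemma, object_lemma):
    """Inflect the object in the case that the governing verb requires."""
    case = GOVERNED_CASE[verb_lemma]
    return f"{FIRST_PERSON_SINGULAR[verb_lemma]} {NOUN_FORMS[(object_lemma, case)]}"


original = realize("uneksia", "hattu")
mutated = realize("nähdä", "hattu")
```

Replacing only the verb lemma is enough to change the case of the object, which is the behaviour the surface generation step has to guarantee.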
4 Results and Evaluation
As evaluation of creative systems is one of the most difficult problems in the field of computational creativity, instead of trying to come up with an evaluation metric of our own, we opt for the evaluation method used to evaluate a previous Finnish poem generator. In practice, this means conducting a quantitative evaluation with human judges with the evaluation questions defined by Toivanen et al. (2012).
An additional reason for using human evaluators instead of automated evaluation metrics is the poor correlation, observed in a previous study (Hämäläinen and Alnajjar, 2019), of automatic evaluation metrics such as BLEU (Papineni et al., 2002) and PINC (Chen and Dolan, 2011) with human judgments when evaluating the creativity of a system.
We run the genetic algorithm to produce a final population for 20 different initial poems for four different themes: luonto (nature), perhe (family), lemmikki (pet) and ihminen (human). From each of the 20 final populations, we pick one poem at random. We shuffle the order of the poems to reduce the priming effect of poems always appearing in a given order. We divide the 20 poems into two batches of 10 poems to reduce the effort of an individual evaluator. Each batch of 10 is then evaluated by 10 different human evaluators recruited from the university campus. The total number of evaluators is 20 and all of them are native speakers of Finnish.
We use the following evaluation questions from Toivanen et al. (2012): (1) How typical is the text as a poem? (2) How understandable is it? (3) How good is the language? (4) Does the text evoke mental images? (5) Does the text evoke emotions? (6) How much do you like the text? These questions are evaluated on a 5-point Likert scale, where 1 represents the worst and 5 the best grade. In addition to these questions, one simple binary question is asked: Is the text a poem?
Figure 1: Evaluation results
Figure 1 presents the average values of the results of the human evaluation for each question. The plot also shows the evaluation results of P. O. Eticus as obtained in their study. As we can see, our method shows higher ratings on all the evaluation questions except for question 3. As for the binary question, the judges rated the output as a poem 81.5% of the time, which is exactly the same result as P. O. Eticus got.
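For orientation, the scale behind the 81.5% figure can be worked out from the evaluation setup. The calculation below assumes that every evaluator answered the binary question for each of the 10 poems in their batch; the text does not state this explicitly, so the resulting counts are an inference rather than reported numbers.

```python
poems = 20
evaluators_per_poem = 10              # each batch of 10 poems is read by 10 evaluators
judgments = poems * evaluators_per_poem

poem_share = 0.815                    # share of "yes" answers reported in the text
poem_votes = round(poem_share * judgments)
```

Under this assumption there are 200 binary judgments in total, of which 163 classify the text as a poem.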
However, it should be remembered that, as a high level of subjectivity is involved in this evaluation setting, our results should not be directly compared to those of P. O. Eticus. The results from their study should be taken more as a reference than as definite proof that our system always outperforms P. O. Eticus.
          Q1    Q2    Q3    Q4    Q5    Q6
Average   3.10  2.94  3.11  3.60  3.23  2.77
Median    3     3     3     4     3     3
Mode      2     2     3     4     4     2
Table 1: The average, median and mode of the evaluation results
Table 1 shows the median and mode of the results in addition to the average values. The median values correspond to the rounded average values. However, the mode deviates in the case of the first, second, fifth and last questions, as the answer most often chosen by the judges was different from the average.
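The relationship between the three statistics can be checked directly against the values in Table 1. The snippet only re-reads the published table; the variable names are chosen for illustration.

```python
# Values copied from Table 1.
averages = {"Q1": 3.10, "Q2": 2.94, "Q3": 3.11, "Q4": 3.60, "Q5": 3.23, "Q6": 2.77}
medians = {"Q1": 3, "Q2": 3, "Q3": 3, "Q4": 4, "Q5": 3, "Q6": 3}
modes = {"Q1": 2, "Q2": 2, "Q3": 3, "Q4": 4, "Q5": 4, "Q6": 2}

# Questions whose median equals the rounded average.
median_matches = [q for q in averages if medians[q] == round(averages[q])]
# Questions whose mode differs from the rounded average.
mode_deviates = [q for q in averages if modes[q] != round(averages[q])]
```

The medians agree with the rounded averages for all six questions, while the mode differs for Q1, Q2, Q5 and Q6.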
Ja kultaa, kuninkaankin saan.
Ja laulut ne kiertävät maata ja merta
Jos virkkaan kun orja
Aina, todella Herra pahankurisuutta antaa.
And gold, of a king I shall have.
And songs, they shall roam on the land and the sea
If I knit like a slave
Always, indeed the Lord shall wrack his mischief.
Above is an example of a poem generated by the system and its translation into English. The poems generated by the system are typically of this length, as the genetic algorithm uses a stanza of an existing poem as its starting point.
5 Discussion and Conclusion
The method presented in this paper shows improvement on a previously used evaluation metric. However, based on the discussions we had with some of the human evaluators after they had given their judgments, it became evident that people have very different criteria for poetry. Some of the judges had guessed that they were reading computer-generated poems, even though this detail was not revealed to them explicitly. Their judgments were the most critical towards the generated poetry. On the other hand, the evaluators who were surprised to learn that the poems were generated by a computer were in general more generous in their judgments. One of the evaluators almost refused to believe the poems were generated by a computer instead of a person.
The high level of subjectivity that we could observe just by talking with people calls for a more robust qualitative study on the poem evaluation problem itself in the future. This would allow us to uncover additional factors that affect the judgments given by people. Furthermore, conducting a study on the evaluation itself would make it possible for us to assess the adequacy of the evaluation metric used for computer-generated poetry.
Nevertheless, the scores achieved by our system, in relation to a previous method evaluated with the same metric, are promising as they are indicative of potentially higher quality in the output. We have presented a solution for Finnish morphosyntax in conjunction with employing a genetic algorithm to cater for computational creativity in poem generation.
Acknowledgements
Special thanks to Jack Rueter for helping out with
the evaluation.
References
Khalid Alnajjar, Hadaytullah Hadaytullah, and Hannu Toivonen. 2018. “Talent, Skill and Support.” A method for automatic creation of slogans. In Proceedings of the 9th International Conference on Computational Creativity (ICCC 2018), pages 88–95, Salamanca, Spain. Association for Computational Creativity.
Benjamin Bay, Paul Bodily, and Dan Ventura. 2017. Text transformation via constraints and word embedding. In Proceedings of the Eighth International Conference on Computational Creativity, pages 49–56.
David L Chen and William B Dolan. 2011. Collecting highly parallel data for paraphrase evaluation. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1, pages 190–200.
Simon Colton. 2008. Creativity Versus the Perception of Creativity in Computational Systems. In AAAI Spring Symposium: Creative Intelligent Systems, Technical Report SS-08-03, pages 14–20, Stanford, California, USA.
Simon Colton, John William Charnley, and Alison Pease. 2011. Computational creativity theory: The FACE and IDEA descriptive models. In ICCC, pages 90–95.
Simon Colton, Jacob Goodwin, and Tony Veale. 2012. Full-FACE poetry generation. In Proceedings of the Third International Conference on Computational Creativity, pages 95–102.
K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan. 2002. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2):182–197.
Félix-Antoine Fortin, François-Michel De Rainville, Marc-André Gardner, Marc Parizeau, and Christian Gagné. 2012. DEAP: Evolutionary algorithms made easy. Journal of Machine Learning Research, 13:2171–2175.
Pablo Gervás. 2001. An expert system for the composition of formal Spanish poetry. Knowledge-Based Systems, 14(3):181–188.
Erica Greene, Tugba Bodrumlu, and Kevin Knight. 2010. Automatic analysis of rhythmic poetry with applications to generation and translation. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, EMNLP ’10, pages 524–533, Stroudsburg, PA, USA. Association for Computational Linguistics.
Mika Hämäläinen. 2018a. Harnessing NLG to Create Finnish Poetry Automatically. In Proceedings of the Ninth International Conference on Computational Creativity, pages 9–15.
Mika Hämäläinen. 2018b. Poem Machine - a Co-creative NLG Web Application for Poem Writing. In The 11th International Conference on Natural Language Generation: Proceedings of the Conference, pages 195–196.
Mika Hämäläinen. 2019. UralicNLP: An NLP library for Uralic languages. Journal of Open Source Software, 4(37):1345.
Mika Hämäläinen and Khalid Alnajjar. 2019. Modelling the Socialization of Creative Agents in a Master-Apprentice Setting: The Case of Movie Title Puns. In Proceedings of the Tenth International Conference on Computational Creativity, pages 266–273.
Mika Hämäläinen and Jack Rueter. 2018. Development of an Open Source Natural Language Generation Tool for Finnish. In Proceedings of the Fourth International Workshop on Computational Linguistics for Uralic Languages, pages 51–58.
Katri Haverinen, Jenna Nyblom, Timo Viljanen, Veronika Laippala, Samuel Kohonen, Anna Missilä, Stina Ojala, Tapio Salakoski, and Filip Ginter. 2014. Building the essential resources for Finnish: the Turku Dependency Treebank. Language Resources and Evaluation, 48(3):493–531.
Raquel Hervás, Jason Robinson, and Pablo Gervás. 2007. Evolutionary assistance in alliteration and allelic drivel. In Workshops on Applications of Evolutionary Computation, pages 537–546. Springer.
Jenna Kanerva, Juhani Luotolahti, Veronika Laippala, and Filip Ginter. 2014. Syntactic n-gram collection from a large-scale corpus of internet Finnish. In Human Language Technologies - The Baltic Perspective: Proceedings of the Sixth International Conference Baltic HLT, volume 268, pages 184–191.
Anna Kantosalo, Jukka Toivanen, and Hannu Toivonen. 2015. Interaction Evaluation for Human-Computer Co-creativity: A Case Study. In Proceedings of the Sixth International Conference on Computational Creativity, pages 276–283.
Carolyn Lamb and Daniel G. Brown. 2019. TwitSong 3.0: towards semantic revisions in computational poetry. In Proceedings of the Tenth International Conference on Computational Creativity, pages 212–219.
Juntao Li, Yan Song, Haisong Zhang, Dongmin Chen, Shuming Shi, Dongyan Zhao, and Rui Yan. 2018. Generating classical Chinese poems via conditional variational autoencoder and adversarial training. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3890–3900, Brussels, Belgium. Association for Computational Linguistics.
Ruli Manurung, Graeme Ritchie, and Henry Thompson. 2012. Using genetic algorithms to create meaningful poetic text. J. Exp. Theor. Artif. Intell., 24:43–64.
Hugo Gonçalo Oliveira. 2017. A survey on intelligent poetry generation: Languages, features, techniques, reutilisation and evaluation. In Proceedings of the 10th International Conference on Natural Language Generation, pages 11–20, Santiago de Compostela, Spain. Association for Computational Linguistics.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318.
Tommi A Pirinen, Inari Listenmaa, Ryan Johnson, Francis M. Tyers, and Juha Kuokkala. 2017. Open morphology of Finnish. LINDAT/CLARIN digital library at the Institute of Formal and Applied Linguistics, Charles University.
Jukka Toivanen, Hannu Toivonen, Alessandro Valitutti, and Oskar Gross. 2012. Corpus-Based Generation of Content and Form in Poetry. In Proceedings of the Third International Conference on Computational Creativity.
Geraint A Wiggins. 2006. A preliminary framework for description, analysis and comparison of creative systems. Knowledge-Based Systems, 19(7):449–458.
... Meaning IV. Poeticness) and the six-dimension evaluation method [17,50,51] (I. How typical is the text as a poem? ...
Article
Full-text available
Literature has a strong cultural imprint and regional color, including poetry. Natural language itself is part of the poetry style. It is interesting to attempt to use one language to present poetry in another language style. Therefore, in this study, we propose a method to fine-tune a pre-trained model in a targeted manner to automatically generate French-style modern Chinese poetry and conduct a multi-faceted evaluation of the generated results. In a five-point scale based on human evaluation, judges assigned scores between 3.29 and 3.93 in seven dimensions, which reached 80.8–93.6% of the scores of the Chinese versions of real French poetry in these dimensions. In terms of the high-frequency poetic imagery, the consistency of the top 30–50 high-frequency poetic images between the poetry generated by the fine-tuned model and the French poetry reached 50–60%. In terms of the syntactic features, compared with the poems generated by the baseline model, the distribution frequencies of three special types of words that appear relatively frequently in French poetry increased by 12.95%, 15.81%, and 284.44% per 1000 Chinese characters in the poetry generated by the fine-tuned model. The human evaluation, poetic image distribution, and syntactic feature statistics show that the targeted fine-tuned model is helpful for the spread of language style. This fine-tuned model can successfully generate modern Chinese poetry in a French style.
... Generation of a variety of different kinds of creative text has received quite a lot of attention in the past years ranging from story generation (Concepción et al., 2016;Fan et al., 2018) to poem generation (Loller-Andersen and Gambäck, 2018;Hämäläinen and Alnajjar, 2019a) and humor generation (Weller et al., 2020;Alnajjar and Hämäläinen, 2021). However, one task of creative text generation that has eluded an extensive research is advertisement generation. ...
Conference Paper
Full-text available
Automated generation of textual advertisements for specific products is a natural language generation problem that has not received too wide a research interest in the past. In this paper, we present a genetic algorithm based approach that models the key components of advertising: creativity , ability to draw attention, memo-rability, clarity, informativeness and dis-tinctiveness. Our results suggest that our method outperforms the current state of the art in readability and informativeness but not in attractiveness.
... For Finnish in particular, the literature for neural headline generation and summarization is scarce. Currently however, most work regarding Finnish headline and text generation seems to be using more conventional NLP, rule-based and statistical methods (Leppänen et al., 2017;Hämäläinen and Alnajjar, 2019;Hämäläinen and Rueter, 2018). ...
Preprint
Full-text available
We present a novel approach to generating news headlines in Finnish for a given news story. We model this as a summarization task where a model is given a news article, and its task is to produce a concise headline describing the main topic of the article. Because there are no openly available GPT-2 models for Finnish, we will first build such a model using several corpora. The model is then fine-tuned for the headline generation task using a massive news corpus. The system is evaluated by 3 expert journalists working in a Finnish media house. The results showcase the usability of the presented approach as a headline suggestion tool to facilitate the news production process.
... For Finnish in particular, the literature for neural headline generation and summarization is scarce. Currently however, most work regarding Finnish headline and text generation seems to be using more conventional NLP, rule-based and statistical methods (Leppänen et al., 2017;Hämäläinen and Alnajjar, 2019;Hämäläinen and Rueter, 2018). ...
Conference Paper
Full-text available
We present a novel approach to generating news headlines in Finnish for a given news story. We model this as a summarization task where a model is given a news article, and its task is to produce a concise headline describing the main topic of the article. Because there are no openly available GPT-2 models for Finnish, we will first build such a model using several corpora. The model is then fine-tuned for the headline generation task using a massive news corpus. The system is evaluated by 3 expert journalists working in a Finnish media house. The results showcase the usability of the presented approach as a headline suggestion tool to facilitate the news production process.
... Along the prison walls No tomb could be seen. Ghazvininejad et al., 2016;Hämäläinen and Alnajjar, 2019;Deng et al., 2019;Yang et al., 2018), research on poetry translation is in its infancy (Ghazvininejad et al., 2018;Genzel et al., 2010). For example, Ghazvininejad et al. (2018) employs a constrained decoding technique to maintain rhyme in French to English poetry translation. ...
... (Ghazvininejad et al., 2018) system, and Google Translate. 2 et al., 2016;Hämäläinen and Alnajjar, 2019;Deng et al., 2019;Yang et al., 2018), research on poetry translation is in its infancy (Ghazvininejad et al., 2018;Genzel et al., 2010). ...
Preprint
Full-text available
Despite constant improvements in machine translation quality, automatic poetry translation remains a challenging problem due to the lack of open-sourced parallel poetic corpora, and to the intrinsic complexities involved in preserving the semantics, style, and figurative nature of poetry. We present an empirical investigation for poetry translation along several dimensions: 1) size and style of training data (poetic vs. non-poetic), including a zero-shot setup; 2) bilingual vs. multilingual learning; and 3) language-family-specific models vs. mixed-multilingual models. To accomplish this, we contribute a parallel dataset of poetry translations for several language pairs. Our results show that multilingual fine-tuning on poetic text significantly outperforms multilingual fine-tuning on non-poetic text that is 35X larger in size, both in terms of automatic metrics (BLEU, BERTScore) and human evaluation metrics such as faithfulness (meaning and poetic style). Moreover, multilingual fine-tuning on poetic data outperforms \emph{bilingual} fine-tuning on poetic data.
... In our own experiments (Hämäläinen and Alnajjar, 2019a) with human evaluation, we have found that questions that do not measure what has been modeled make it very difficult to say what should be improved in the system and how, although such an evaluation makes the end results look impressive. As Gervás (2017) puts it, any feature not modeled in a generative system that happens to be in the output can hardly be a merit of the system, but is in the result due to mere serendipity. ...
Conference Paper
We outline the Great Misalignment Problem in natural language processing research: the problem definition is not in line with the method proposed, and the human evaluation is in line with neither the definition nor the method. We study this misalignment problem by surveying 10 randomly sampled papers published in ACL 2020 that report results with human evaluation. Our results show that only one paper was fully in line in terms of problem definition, method and evaluation. Only two papers presented a human evaluation that was in line with what was modeled in the method. These results highlight that the Great Misalignment Problem is a major one and it affects the validity and reproducibility of results obtained by a human evaluation.
Chapter
We present Erato, a framework designed to facilitate the automated evaluation of poetry, including that generated by poetry generation systems. Our framework employs a diverse set of features, and we offer a brief overview of Erato’s capabilities and its potential for expansion. Using Erato, we compare and contrast human-authored poetry with automatically-generated poetry, demonstrating its effectiveness in identifying key differences. Our implementation code and software are freely available under the GNU GPLv3 license.
Conference Paper
This paper presents work on modelling the social psychological aspect of socialization in the case of a computationally creative master-apprentice system. In each master-apprentice pair, the master, a genetic algorithm, is seen as a parent for its apprentice, which is an NMT-based sequence-to-sequence model. The effect of different parenting styles on the creative output of each pair is the focus of this study. This approach brings a novel viewpoint to computational social creativity, which has mainly focused in the past on computationally creative agents being on a socially equal level, whereas our approach studies the phenomenon in the context of a social hierarchy.
Conference Paper
This paper presents a new, NLG-based approach to poetry generation in Finnish for use as a part of a bigger Poem Machine system, the objective of which is to provide a platform for human-computer co-creativity. The approach divides generation into a linguistically solid system for producing grammatical Finnish and higher-level systems for producing a poem structure and choosing the lexical items used in the poems. An automatically extracted open-access semantic repository tailored for poem generation is developed for the system. Finally, the resulting poems are evaluated and compared with the state of the art in Finnish poem generation.
Conference Paper
We present Poem Machine, an interactive online tool for co-authoring Finnish poetry with a computationally creative agent. Poem Machine can produce poetry of its own and assist the user in authoring poems. The main target group for the system is primary school children, and its use as a part of teaching is currently under study.
Article
In the past years, the natural language processing (NLP) tools and resources for small Uralic languages have received a major uplift. The open-source Giellatekno infrastructure has served a key role in gathering these tools and resources in an open environment for researchers to use. However, many of the crucially important NLP tools, such as FSTs and CGs, require specialized tools with a learning curve. This paper presents UralicNLP, a Python library, the goal of which is to mask the actual implementation behind a Python interface. This not only lowers the threshold to use the tools provided in the Giellatekno infrastructure but also makes it easier to incorporate them as a part of research code written in Python.
Conference Paper
We present an open-source Python library to automatically produce syntactically correct Finnish sentences when only lemmas and their relations are provided. The tool automatically resolves morphosyntax in the sentence, such as agreement and government rules, and uses Omorfi to produce the correct morphological forms. In this paper, we discuss how case government can be learned automatically from a corpus and incorporated as a part of the natural language generation tool. We also present how agreement rules are modeled in the system and discuss the use cases of the tool, such as its initial use as part of a computational creativity system called Poem Machine.
Conference Paper
In order to provide resources for artistic communities and further the linguistic capabilities of computationally creative systems, we present a computational process for creative text transformation and evaluation. Its purpose is to help solve the fundamental problem posed by the field of natural language generation, which is to computationally generate human-readable language. Our process entails the use of 1) vector word embedding to approximate meaning and 2) constraints to guide word replacement. We introduce intentions as objects that drive the generation of creative artefacts; a target theme, emotion, meter, or rhyme scheme may be represented via intention. Our implementation of this process, Lyrist, is oriented around poetry and song lyrics and successfully produces syntactically correct, human-voiced text. A preliminary evaluation suggests that our process successfully evokes human-recognizable sentiments and that even familiar texts are difficult to recognize after undergoing transformation.
Conference Paper
Interaction design has been suggested as a framework for evaluating computational creativity by Bown (2014). Yet few practical accounts of using an interaction-design-based evaluation strategy in computational creativity contexts have been reported in the literature. This paper describes the evaluation process and results of a human-computer co-creative poetry writing tool intended for children in a school context. We specifically focus on one formative evaluation case utilizing interaction design evaluation methods, offering a suggestion on how to conduct interaction-design-based evaluation in a computational creativity context, and we report the results of the evaluation itself. The evaluation process is considered from the perspective of a computational creativity researcher, and we focus on the challenges and benefits of the interaction design evaluation approach within a computational creativity project context.
Conference Paper
Slogans are an effective way to convey a marketing message. In this paper, we present a method for automatically creating slogans, aimed to facilitate a human slogan designer in her creative process. By taking a target concept (e.g. a computer) and an adjectival property (e.g. creative) as input, the proposed method produces a list of diverse expressions optimizing multiple objectives such as semantic relatedness, language correctness, and usage of rhetorical devices. A key component in the process is a novel method for generating nominal metaphors based on a metaphor interpretation model. Using the generated metaphors, the method builds semantic spaces related to the objectives. It extracts skeletons from existing slogans, and finally fills them in, traversing the semantic spaces, using the genetic algorithm to reach interesting solutions (e.g. “Talent, Skill and Support.”). We evaluate both the metaphor generation method and the overall slogan creation method by running two crowdsourced questionnaires.