Hannu Toivonen

University of Helsinki | HY · Department of Computer Science

About

202
Publications
40,696
Reads
16,217
Citations
Introduction
Hannu Toivonen works at the Department of Computer Science, University of Helsinki. Hannu does research in Artificial Intelligence and Data Science, more specifically in Computational Creativity and Data Mining. I am a passive ResearchGate user and don't update information here. Please see http://www.helsinki.fi/discovery for up-to-date information on my research, publications and other activities.
Additional affiliations
January 2002 - present
University of Helsinki


Publications (202)
Article
Full-text available
In advertising, slogans are used to enhance the recall of the advertised product by consumers and to distinguish it from others in the market. Creating effective slogans is a resource-consuming task for humans. In this paper, we describe a novel method for automatically generating slogans, given a target concept (e.g., car) and an adjectival proper...
Chapter
We present a novel environment for exploratory search in large collections of historical newspapers developed as a part of the NewsEye project. In this paper we focus on the intelligent Personal Research Assistant (PRA) component in the environment and the web interface. The PRA is an interactive exploratory engine that combines results of various...
Article
Biomine Explorer is a web application that enables interactive exploration of large heterogeneous biological networks constructed from selected publicly available biological knowledge sources. It is built on top of Biomine, a system which integrates cross-references from several biological databases into a large heterogeneous probabilistic network....
Article
Full-text available
Computational creativity seeks to understand computational mechanisms that can be characterized as creative. The creation of new concepts is a central challenge for any creative system. In this article, we outline different approaches to computational concept creation and then review conceptual representations relevant to concept creation, and ther...
Article
Computational Creativity [CC] is a multidisciplinary research field, studying how to engineer software that exhibits behavior which would reasonably be deemed creative. This article shows how composition of software solutions in this field can effectively be supported through a CC infrastructure that supports user-friendly development of CC softwar...
Conference Paper
A feature news story is often accompanied by illustrations and visuals. These visualizations can be, e.g., timelines, line charts, pie charts, or images. In this article, we present a largely data-driven and domain-independent approach for generating visualizations to accompany automatically generated news articles. We demonstrate the feasibility o...
Article
In an age of struggling news media, automated generation of news via Natural Language Generation (NLG) methods could be of great help, especially in areas where the amount of raw input data is big, and the structure of the data is known in advance. One such news automation system is the Valtteri NLG system, which generates news articles about the F...
Article
In an age of struggling news media, automated generation of news via natural language generation (NLG) methods could be of great help, especially in areas where the amount of raw input data is big, and the structure of the data is known in advance. One such news automation system is the Valtteri NLG system, which generates news articles about the F...
Article
Full-text available
Data musicalization is the process of automatically composing music based on given data as an approach to perceptualizing information artistically. The aim of data musicalization is to evoke subjective experiences in relation to the information rather than merely to convey unemotional information objectively. This article is written as a tutorial f...
Article
Full-text available
We study transformational computational creativity in the context of writing songs and describe an implemented system that is able to modify its own goals and operation. With this, we contribute to three aspects of computational creativity and song generation: (1) Application-wise, songs are an interesting and challenging target for creativity, as...
Conference Paper
In the age of increasing floods of information, finding the news signals from the noise has become increasingly resource and time intensive for journalists. Generally, news media companies have the important role of filtering and explaining this flood of information to the public. However, with the increase in availability of data sources, human jo...
Conference Paper
Full-text available
Many linguistic creativity applications rely heavily on knowledge of nouns and their properties. However, such knowledge sources are scarce and limited. We present a graph-based approach for expanding and weighting properties of nouns with given initial, non-weighted properties. In this paper, we focus on famous characters, either real or fictional...
Article
Networks often contain implicit structure. We introduce novel problems and methods that look for structure in networks, by grouping nodes into supernodes and edges to superedges, and then make this structure visible to the user in a smaller generalised network. This task of finding generalisations of nodes and edges is formulated as ‘network Summar...
Conference Paper
Writing rap lyrics requires both creativity to construct a meaningful, interesting story and lyrical skills to produce complex rhyme patterns, which form the cornerstone of good flow. We present a rap lyrics generation method that captures both of these aspects. First, we develop a prediction model to identify the next line of existing lyrics from...
Conference Paper
Full-text available
We propose a novel metaphor interpretation method, Meta4meaning. It provides interpretations for nominal metaphors by generating a list of properties that the metaphor expresses. Meta4meaning uses word associations extracted from a corpus to retrieve an approximation to properties of concepts. Interpretations are then obtained as an aggregation or...
Conference Paper
The goal of automatic text summarization is to generate an abstract of a document or a set of documents. In this paper we propose a word association based method for generating summaries in a variety of languages. We show that a robust statistical method for finding associations which are specific to the given document(s) is applicable to many lang...
Conference Paper
Full-text available
A frequent challenge in creative tasks such as advertising is finding novel and concrete representations of abstract concepts. We cast this problem as finding, in word association networks, the relevant indirect associations of a given node. We propose a novel approach, LayerFolding, which selects nodes at increasing distances from the given node,...
Article
Creative machines are an old idea, but only recently computational creativity has established itself as a research field with its own identity and research agenda. The goal of computational creativity research is to model, simulate, or enhance creativity using computational methods. Data mining and machine learning can be used in a number of ways t...
Conference Paper
Full-text available
Interaction design has been suggested as a framework for evaluating computational creativity by Bown (2014). Yet few practical accounts on using an Interaction Design based evaluation strategy in Computational Creativity Contexts have been reported in the literature. This study paper describes the evaluation process and results of a human-computer...
Article
Full-text available
Writing rap lyrics requires both creativity, to construct a meaningful and an interesting story, and lyrical skills, to produce complex rhyme patterns, which are the cornerstone of a good flow. We present a method for capturing both of these aspects. Our approach is based on two machine-learning techniques: the RankSVM algorithm, and a deep neural...
Article
We consider automated generation of humorous texts by substitution of a single word in a given short text. In this setting, several factors that potentially contribute to the funniness of texts can be integrated into a unified framework as constraints on the lexical substitution. We discuss three types of such constraints: formal constraints concer...
Conference Paper
Full-text available
Query auto-completion (QAC) is one of the most recognizable and widely used services of modern search engines. Its goal is to assist a user in the process of query formulation. Current QAC systems are mainly reactive. They respond to the present request using past knowledge. Specifically, they mostly rely on query logs analysis or corpus terms co-o...
Conference Paper
Full-text available
Query auto-completion (QAC) is one of the most recognizable and widely used services of modern search engines. Its goal is to assist a user in the process of query formulation. Current QAC systems are mainly reactive. They respond to the present request using past knowledge. Specifically, they mostly rely on query logs analysis [11, 10, 12] or cor...
Article
The objective of subgroup discovery is to find groups of individuals who are statistically different from others in a large data set. Most existing measures of the quality of subgroups are intuitive and do not precisely capture statistical differences of a group with the other, and their discovered results contain many redundant subgroups. Odds rat...
Article
Full-text available
In the age of big data, automatic methods for creating summaries of documents become increasingly important. In this paper we propose a novel, unsupervised method for (multi-)document summarization. In an unsupervised and language-independent fashion, this approach relies on the strength of word associations in the set of documents to be summarized...
Conference Paper
Full-text available
We propose a method for automatic poetry composition with a given document as inspiration. The poems generated are not limited to the topic of the document. They expand the topic or even put it in a new light. This capability is enabled by first detecting significant word associations that are unique to the document and then using them as the ke...
Conference Paper
Full-text available
This paper investigates how to transform machine creativity systems into interactive tools that support human-computer co-creation. We use three case studies to identify common issues in this transformation, under the perspective of User-Centered Design. We also analyse the interactivity and creative behavior of the three platforms in terms of Wig...
Article
We present a method for measuring beat-to-beat heart rate from ballistocardiograms acquired with force sensors. First, a model for the heartbeat shape is adaptively inferred from the signal using hierarchical clustering. Then, beat-to-beat intervals are detected by finding positions where the heartbeat shape best fits the signal. The method was val...
Article
Full-text available
There is an acute need for better tools to extract knowledge from the growing flood of sequence data. For example, thousands of complete genomes have been sequenced, and their metabolic networks inferred. Such data should enable a better understanding of evolution. However, most existing network analysis methods are based on pair-wise comparisons,...
Article
Full-text available
We propose a method for automated generation of adult humor by lexical replacement and present empirical evaluation results of the obtained humor. We propose three types of lexical constraints as building blocks of humorous word substitution: constraints concerning the similarity of sounds or spellings of the original word and the substitute, a con...
Conference Paper
Full-text available
We propose a method for automated generation of adult humor by lexical replacement and present empirical evaluation results of the obtained humor. We propose three types of lexical constraints as building blocks of humorous word substitution: constraints concerning the similarity of sounds or spellings of the original word and the substitute, a con...
Conference Paper
Full-text available
We address the challenging task of automatically composing lyrical songs with matching musical and lyrical features, and we present the first prototype, M.U. Sicus-Apparatus, to accomplish the task. The focus of this paper is especially on generation of art songs (lieds). The proposed approach writes lyrics first and then composes music to match the l...
Article
Full-text available
In this paper, we introduce a novel technique for named entity filtering, focused on the analysis of word association networks. We present an approach for modelling concepts which are distinctively related to a specific named entity. We evaluated our approach in the context of the TREC Knowledge Base Acceleration track, and we obtained significantly...
Article
Conditional functional dependencies (CFDs) have been proposed as a new type of semantic rules extended from traditional functional dependencies. They have shown great potential for detecting and repairing inconsistent data. Constant CFDs are 100% confidence association rules. The theoretical search space for the minimal set of CFDs is the set of mi...
Article
Full-text available
Subgroup discovery methods find interesting subsets of objects of a given class. Motivated by an application in bioinformatics, we first define a generalized subgroup discovery problem. In this setting, a subgroup is interesting if its members are characteristic for their class, even if the classes are not identical. Then we further refine this set...
Article
Full-text available
The ability to associate concepts is an important factor of creativity. We investigate the power of simple word co-occurrence analysis in tasks requiring verbal creativity. We first consider the Remote Associates Test, a psychometric measure of creativity. It turns out to be very easy for computers with access to statistics from a large corpus. Nex...
Conference Paper
Full-text available
A fluent ability to associate tasks, concepts, ideas, knowledge and experiences in a relevant way is often considered an important factor of creativity, especially in problem solving. We are interested in providing computational support for such creative associations. In this paper we design minimally supervised methods that can perform well in the...
Conference Paper
Full-text available
We aim to identify and control unintentional humor occurring in human-computer interaction, and recreate it intentionally. In this research we focus on text prediction systems, a type of interactive programs employed in mobile phones, search engines, and word processors. More specifically, we identified two design principles, inspired by humor and...
Conference Paper
We introduce data musicalization as a novel approach to aid analysis and understanding of sleep measurement data. Data musicalization is the process of automatically composing novel music, with given data used to guide the process. We present Sleep Musicalization, a methodology that reads a signal from state-of-the-art mattress sensor, uses highly...
Article
Full-text available
We describe an online sleep monitoring service, based on unobtrusive ballistocardiography (BCG) measurement in an ordinary bed. The novelty of the system is that the sleep tracking web application is based on measurements from a fully unobtrusive sensor. The BCG signal is measured with a piezoelectric film sensor under the mattress topper, and sent...
Conference Paper
Full-text available
Privacy preserving analysis of a social network aims at a better understanding of the network and its behavior, while at the same time protecting the privacy of its individuals. We propose an anonymization method for weighted graphs, i.e., for social networks where the strengths of links are important. This is in contrast with many previous studies...
Article
Full-text available
Background: Biological databases contain large amounts of data concerning the functions and associations of genes and proteins. Integration of data from several such databases into a single repository can aid the discovery of previously unknown connections spanning multiple types of relationships and databases. Results: Biomine is a system that inte...
Data
Gene pairs used in link prediction experiments
Data
Gene sets used in disease gene prediction experiments
Conference Paper
Full-text available
We employ a corpus-based approach to generate content and form in poetry. The main idea is to use two different corpora, on one hand, to provide semantic content for new poems, and on the other hand, to generate a specific grammatical and poetic structure. The approach uses text mining methods, morphological analysis, and morphological synthesis...
Article
We propose a relatively simple yet powerful model for choosing relevant and non-redundant pieces of information. The model addresses data mining or information retrieval settings where relevance is measured with respect to a set of key or query objects, either specified by the user or obtained by a data mining step. The problem addressed is not onl...
Conference Paper
Full-text available
We give methods to compress weighted graphs (i.e., networks or BisoNets) into smaller ones. The motivation is that large networks of social, biological, or other relations can be complex to handle and visualize. Using the given methods, nodes and edges of a given graph are grouped to supernodes and superedges, respectively. The interpretation (i.e....
Conference Paper
We propose a method to mine novel, document-specific associations between terms in a collection of unstructured documents. We believe that documents are often best described by the relationships they establish. This is also evidenced by the popularity of conceptual maps, mind maps, and other similar methodologies to organize and summarize informati...
Conference Paper
Biomine is a biological graph database constructed from public databases. Its entities (vertices) include biological concepts (such as genes, proteins, tissues, processes and phenotypes, as well as scientific articles) and relations (edges) between these entities correspond to real-world phenomena such as "a gene codes for a protein" or "an article...
Conference Paper
Full-text available
The article presents an approach to computational knowledge discovery through the mechanism of bisociation. Bisociative reasoning is at the heart of creative, accidental discovery (e.g., serendipity), and is focused on finding unexpected links by crossing contexts. Contextualization and linking between highly diverse and distributed data and knowle...
Conference Paper
Full-text available
We propose a novel problem to simplify weighted graphs by pruning least important edges from them. Simplified graphs can be used to improve visualization of a network, to extract its main structure, or as a pre-processing step for other data mining algorithms. We define a graph connectivity function based on the best paths between all pairs of node...
Conference Paper
Full-text available
BisoNets represent relations of information items as networks. The goal of BisoNet abstraction is to transform a large BisoNet into a smaller one which is simpler and easier to use, although some information may be lost in the abstraction process. An abstracted BisoNet can help users to see the structure of a large BisoNet, or understand connection...
Conference Paper
Biomine and ProbLog are two frameworks to implement bisociative information networks (BisoNets). They combine structured data representations with probabilities expressing uncertainty. While Biomine is based on graphs, ProbLog's core language is that of the logic programming language Prolog. This chapter provides an overview of important concepts,...
Chapter
Heterogeneous information networks or BisoNets, as they are called in the context of bisociative knowledge discovery, are a flexible and popular form of representing data in numerous fields. Additionally, such networks can be created or derived from other types of information using, e.g., the methods given in Part II of this volume. This part of th...
Chapter
Biomine is a biological graph database constructed from public databases. Its entities (vertices) include biological concepts (such as genes, proteins, tissues, processes and phenotypes, as well as scientific articles) and relations (edges) between these entities correspond to real-world phenomena such as “a gene codes for a protein” or “an article...
Article
We introduce the problem of identifying representative nodes in probabilistic graphs, motivated by the need to produce different simple views to large BisoNets. We define a probabilistic similarity measure for nodes, and then apply clustering methods to find groups of nodes. Finally, a representative is output from each cluster. We report on experi...
Article
Full-text available
In experimental data analysis, bioinformatics researchers increasingly rely on tools that enable the composition and reuse of scientific workflows. The utility of current bioinformatics workflow environments can be significantly increased by offering advanced data mining services as workflow components. Such services can support, for instance, know...
Conference Paper
Full-text available
We propose to compress weighted graphs (networks), motivated by the observation that large networks of social, biological, or other relations can be complex to handle and visualize. In the process also known as graph simplification, nodes and (unweighted) edges are grouped to supernodes and superedges, respectively, to obtain a smaller graph. We pr...
Article
Full-text available
The paper presents an approach to computational knowledge discovery through the mechanism of bisociation. Bisociative reasoning is at the heart of creative, accidental discovery (e.g., serendipity), and is focused on finding unexpected links by crossing contexts. Contextualization and linking between highly diverse and distributed data and knowle...
Article
We propose a novel problem to simplify weighted graphs by pruning least important edges from them. Simplified graphs can be used to improve visualization of a network, to extract its main structure, or as a pre-processing step for other data mining algorithms. We define a graph connectivity function based on the best paths between all pairs of node...
Article
Full-text available
We study how probabilistic reasoning and inductive querying can be combined within ProbLog, a recent probabilistic extension of Prolog. ProbLog can be regarded as a database system that supports both probabilistic and inductive reasoning through a variety of querying mechanisms. After a short introduction to ProbLog, we provide a survey of the diff...
Article
We present a novel and efficient algorithm, Path Covering, for solving the most reliable subgraph problem. A reliable subgraph gives a concise summary of the connectivity between two given individuals in a social network. Formally, the given network is seen as a Bernoulli random graph Q, and the objective is to find a subgraph H ⊂ Q with at most B...