Timo Honkela

  • PhD
  • Professor (Full) at University of Helsinki

About

154
Publications
34,591
Reads
3,109
Citations
Current institution
University of Helsinki
Current position
  • Professor (Full)

Publications

Publications (154)
Conference Paper
Full-text available
This paper outlines a pilot study on multi-dimensional and multilingual sentiment analysis of social media content. We use parallel corpora of movie subtitles as a proxy for colloquial language in social media channels and a multilingual emotion lexicon for fine-grained sentiment analyses. Parallel data sets make it possible to study the preservati...
Poster
Full-text available
The poster for CoLing 2016: Challenges in Multidimensional Sentiment Analysis Across Languages
Article
Full-text available
In this article, automatically generated and manually crafted semantic representations are compared. The comparison takes place under the assumption that neither of these has a primary status over the other. While linguistic resources can be used to evaluate the results of automated processes, data-driven methods are useful in assessing the quality...
Conference Paper
We present a novel Bayesian reinforcement learning algorithm that addresses model bias and exploration overhead issues. The algorithm combines different aspects of several state-of-the-art reinforcement learning methods that use Gaussian Processes model-based approaches to increase the use of the online data samples. The algorithm uses a smooth rew...
Conference Paper
Surveys are widely conducted as a means to obtain information on thoughts, opinions and feelings of people. The representativeness of a sample is a major concern in using surveys. In this article, we consider meaning variation which is another potentially remarkable but less studied source of problems. We use Grounded Intersubjective Concept Analys...
Conference Paper
Full-text available
In this paper, we study how to analyze and improve the quality of a large historical newspaper collection. The National Library of Finland has digitized millions of newspaper pages. The quality of the outcome of the OCR process is limited especially with regard to the oldest parts of the collection. Approaches such as crowdsourcing have been used in...
Conference Paper
Full-text available
Sentiment analysis has become a widely used approach to assess the emotional content of written documents such as customer feedback. In positive psychology research, the typical one-dimensional analysis framework has been extended to include five dimensions. This five-dimensional model, PERMA, enables a fine-grained analysis of written texts. We pr...
Data
Wikipedia Animal Dataset is a dataset created during December 2010 and January 2011 with data retrieved from Wikipedia. It is available for research purposes. Statistics: This dataset is made up of 498 unique URLs corresponding to articles about animals. For each animal the article was collected in English, Finnish and Spanish, fulfilli...
Conference Paper
Full-text available
Emotional semantic image retrieval systems aim at incorporating the user’s affective states for responding adequately to the user’s interests. One challenge is to select features specific to image affect detection. Another challenge is to build effective learning models or classifiers to bridge the so-called “affective gap”. In this work, we study...
Conference Paper
Full-text available
In this article, we consider how semantics of action verbs can be grounded on motion tracking data. We present the basic principles and requirements for grounding of verbs through case studies related to human movement. The data includes high-dimensional movement patterns and linguistic expressions that people have used to name these movements. We...
Conference Paper
Full-text available
Statistical machine learning methods can provide help when developing preventative services and tools that support the empowerment of individuals. We explore how the self-organizing map could be utilized as a tool for analyzing, visualizing and browsing heterogeneous survey data on wellbeing that contains both quantitative (numeric) and qualitative...
Conference Paper
Full-text available
An ideal verbally controlled virtual actor would allow the same interaction as instructing a real actor with a few words. Our goal is to create virtual actors that can be controlled with natural language instead of a predefined set of commands. In this paper, we present results related to a questionnaire where people described videos of human locom...
Conference Paper
We present an approach for comparing human-made and automatically generated semantic representations with an assumption that neither of these has a primary status over the other. In the experimental part, we compare the results gained by using independent component analysis and the self-organizing map algorithm on word context analysis with a seman...
Article
On the web, a huge variety of text collections contain knowledge in different expertise domains, such as technology or medicine. The texts are written for different uses and thus for people having different levels of expertise on the domain. Texts intended for professionals may not be understandable at all by a lay person, and texts for lay people...
Conference Paper
Mobile proximity information provides a rich and detailed view into the social interactions of mobile phone users, allowing novel empirical studies of human behavior and context-aware applications. In this study, we apply a statistical anomaly detection method based on multivariate binomial mixture models to mobile proximity data from 106 users. Th...
Conference Paper
In this article, we explore an application in an area of research called wellbeing informatics. More specifically, we consider how to build a system that could be used for searching stories that relate to the interest of the user (content relevance), and help the user in his or her developmental process by providing encouragement, useful experience...
Conference Paper
Full-text available
We propose a probabilistic model class for the analysis of three-way count data, motivated by studying the subjectivity of language. Our models are applicable for instance to a data tensor of how many times each subject used each term in each context, thus revealing individual variation in natural language use. As our main goal is exploratory ana...
Conference Paper
It is generally accepted that there are cross-linguistic universal tendencies in the naming of colours. This is due in large part to the findings of Berlin and Kay. Recently, however, these universalist findings have been challenged, on both methodological and substantive grounds. Nisbett’s research on cultural cognition offers another interesting...
Conference Paper
A substantial amount of subjectivity is involved in how people use language and conceptualize the world. Computational methods and formal representations of knowledge usually neglect this kind of individual variation. We have developed a novel method, Grounded Intersubjective Concept Analysis (GICA), for the analysis and visualization of individual...
Article
Full-text available
Speech-to-speech machine translation is in some ways the peak of natural language processing, in that it deals directly with our original, oral mode of communication (as opposed to derived written language). As such, it presents challenges that are not to be taken lightly. Although existing technology covers each of the steps in the process, from...
Article
We present a methodology for learning a taxonomy from a set of text documents that each describes one concept. The taxonomy is obtained by clustering the concept definition documents with a hierarchical approach to the Self-Organizing Map. In this study, we compare three different feature extraction approaches with varying degree of language indepe...
Conference Paper
Full-text available
In this paper, we study fundamental properties of the Self-Organizing Map (SOM) and the Generative Topographic Mapping (GTM), ramifications of the initialization of the algorithms and properties of the algorithms in presence of missing data. We show that the commonly used principal component analysis (PCA) initialization of the GTM does not guarante...
Conference Paper
Full-text available
In this article, we introduce the concept of pathways of wellbeing and examine how such paths can be discovered from large data sets using the self-organizing map. Data sets used in the illustrative experiments include measurements of physical fitness and subjective assessments related to diagnosing work stress.
Conference Paper
Full-text available
In document clustering, semantically similar documents are grouped together. The dimensionality of document collections is often very large, thousands or tens of thousands of terms. Thus, it is common to reduce the original dimensionality before clustering for computational reasons. Cosine distance is widely seen as the best choice for measuring th...
Conference Paper
Full-text available
In this work, we study people’s emotions evoked by viewing abstract art images based on traditional low-level image features within a binary classification framework. Abstract art is used here instead of artistic or photographic images because those contain contextual information that influences the emotional assessment in a highly individual manne...
Conference Paper
In this article, we present an analysis of the impact of nutrition and lifestyle on health at a global level. We have used the Self-Organizing Map (SOM) algorithm as the analysis technique. The SOM enables us to visualize the relative position of each country against a set of the variables related to nutrition, lifestyle and health. The positioning of the...
Conference Paper
Full-text available
We present a selection of results produced in a project called Media Map. The project aims at developing an intuitive user interface to a library information system containing data on projects and publications. The user interface is a two-dimensional visual display created with the Self-Organizing Map algorithm. The map has been computed using the...
Article
In this article, we introduce a method to make visible the differences among people regarding how they conceptualize the world. The Grounded Intersubjective Concept Analysis (GICA) method first employs a conceptual survey designed to elicit particular ways in which concepts are used among participants, aiming to exclude the level of opinions and va...
Article
Full-text available
In this review and tutorial article, new developments towards extended use of information and communications technologies in science are discussed. The focus is on the human and social sciences, specifically linguistics and economics. Some challenging epistemological issues are handled in detail including the subjective and intersubjective nature of...
Article
Full-text available
We study the combination of symbol frequency analysis and negative selection for anomaly detection of discrete sequences where conventional negative selection algorithms are not practical due to data sparsity. Theoretical analysis on ergodic Markov chains is used to outline the properties of the presented anomaly detection algorithm and to predict...
Conference Paper
Full-text available
In this paper, we explore the possibility of applying a text mining method on a large qualitative source material concerning the history of information technology in one nation. This data was collected in the Swedish documentation project “From Computing Machines to IT.” We apply text mining on the interview transcripts of this Swedish documentatio...
Conference Paper
Full-text available
This paper presents a methodology for learning taxonomic relations from a set of documents that each explain one of the concepts. Three different feature extraction approaches with varying degree of language independence are compared in this study. The first feature extraction scheme is a language-independent approach based on statistical keyphrase...
Conference Paper
In this paper, we consider how to represent world knowledge using the self-organizing map (SOM), how to use a simple recurrent network (SRN) to devise sentence comprehension, and how to use the SOM output space to represent situations and facilitate grounded logical reasoning.
Conference Paper
Full-text available
In this article, we study the scale-dependent dimensionality properties and overall structure of text data with a method that measures correlation dimension in different scales. As experimental results, we present the analysis of text data sets with the Reuters and Europarl corpora, which are also compared to artificially generated point sets. A co...
Conference Paper
Full-text available
In this article, we model adjectives using a vector space model. We further employ three different dimension reduction methods, the Principal Component Analysis (PCA), the Self-Organizing Map (SOM), and the Neighbor Retrieval Visualizer (NeRV) in the projection and visualization task, using an antonym test for evaluation. The results show tha...
Article
Full-text available
Our aim is to find syntactic and semantic relationships and roles of words based on the analysis of corpora. We study three methods for analyzing words in contexts as potential methods for solving this task. The methods are latent semantic analysis, self-organizing map and independent component analysis. Latent semantic analysis is a simple method...
Article
In this paper, we propose a tensor-based Maximum Margin Criterion algorithm (TMMC) for supervised dimensionality reduction. In TMMC, an image object is encoded as an nth-order tensor, and its 2-D representation is directly treated as a matrix. Meanwhile, ...
Article
Full-text available
The article provides an introduction to and a demonstration of the self-organizing map (SOM) method for organizational researchers interested in the use of qualitative data. The SOM is a versatile quantitative method very commonly used across many disciplines to analyze large data sets. The outcome of the SOM analysis is a map in which entities are...
Conference Paper
The self-organizing map (SOM) is related to the classical vector quantization (VQ). Like in the VQ, the SOM represents a distribution of input data vectors using a finite set of models. In both methods, the quantization error (QE) of an input vector can be expressed, e.g., as the Euclidean norm of the difference of the input vector and the best-mat...
Conference Paper
In this paper, we discuss problems related to the basic Semantic Web methodologies that are based on predicate logic and related formalisms. We discuss complementary and alternative approaches. In particular, we suggest how the Self-Organizing Map can be a basis for making the Semantic Web more semantic.
Conference Paper
Full-text available
The complex phenomena of political science are typically studied using a qualitative approach, potentially supported by hypothesis-driven statistical analysis of some numerical data. In this article, we present a complementary method based on data mining and specifically on the use of the self-organizing map. The idea in data mining is to explore th...
Article
Full-text available
In this article, we consider contemporary theories of concepts, and Bayesian and self-organizing models of concept formation. After introducing the different models, we present our own experiment. It utilizes a multi-agent simulation framework, in which the emergence of a common vocabulary can be studied. In the experiment, we use jointly the self...
Conference Paper
Full-text available
In time series prediction, one often does not know the properties of the underlying system generating the time series. For example, is it a closed system that is generating the time series or are there any external factors influencing the system? As a result, one often does not know beforehand whether a time series is stationary or nonstation...
Article
Full-text available
The purpose of the present article is to examine the implications of the pragmatic web for the research and development of educational technology. It is argued that, beyond knowledge acquisition and social participation, technology-mediated learning environments based on a semantic and pragmatic web have the potential for facilitating creation and...
Article
Full-text available
Finding ways in which communities of experts can benefit from each other is a question shared by the machine learning community and social sciences alike. Considerable research in machine learning methods has shown that communities of experts can provide consistently better classifications and decisions than single experts in various tasks and doma...
Article
Full-text available
Latent semantic analysis (LSA) can be used to create an implicit semantic vectorial representation for words. Independent component analysis (ICA) can be derived as an extension to LSA that rotates the latent semantic space so that it becomes explicit, that is, the features correspond more with those resulting from human cognitive activity. Thi...
Conference Paper
Full-text available
This paper presents a method for creating interlingual word-to-word or phrase-to-phrase mappings between any two languages using the self-organizing map algorithm. The method can be used as a component in a statistical machine translation system. The conceptual space created by the self-organizing map serves as a kind of interlingual representation...
Article
We propose a theoretical framework for modeling communication between agents that have different conceptual models of their current context. We describe how the emergence of subjective models of the world can be simulated and what the role of language and communication in that process is. We consider, in particular, the role of unsupervised learnin...
Article
We propose a method for inferring semantic information from textual data in content-based multimedia retrieval. Training examples of images and videos belonging to a specific semantic class are associated with their low-level visual and aural descriptors augmented with textual features such as frequencies of significant words. A fuzzy mapping of a...
Chapter
Biological systems have been an inspiration in the development of prototype-based clustering and vector quantization algorithms. The two dominant paradigms in biologically motivated clustering schemes are neural networks and, more recently, biological immune systems. These two biological paradigms are discussed regarding their benefits and shortcom...
Article
Full-text available
In this article, we study the differences between the European Union languages using statistical and unsupervised methods. The analysis is conducted at different levels of language: the lexical, morphological and syntactic. Our premise is that the difficulty of the translation could be perceived as differences or similarities in differen...
Conference Paper
Full-text available
We present Likey, a language-independent keyphrase extraction method based on statistical analysis and the use of a reference corpus. Likey has a very light-weight preprocessing phase and no parameters to be tuned. Thus, it is not restricted to any single language or language family. We test Likey having exactly the same configuration with...
Conference Paper
Full-text available
In this article we approach neural networks as computational templates that travel across various sciences. Traditionally, it has been thought that models are primarily models of some target systems: they are assumed to represent partially or completely their target systems. We argue, instead, that many computational models cannot easily be conceiv...
Conference Paper
Full-text available
Serious efforts to develop computerized systems for natural language understanding and machine translation have taken place for more than half a century. Some successful systems that translate texts in limited domains such as weather forecasts have been implemented. However, the more general the domain or complex the style of the text the more diff...
Conference Paper
We present a probabilistic approach for detecting and analyzing changes in natural language motivated by biological immune systems. Contrary to traditional methods based on message-digest algorithms and line-by-line comparisons of two files, the proposed algorithm employs an implicit negative representation of text segments in the form of detector...
Conference Paper
Full-text available
We show that independent component analysis (ICA) can be used to find distributed representations for words that can be further processed by thresholding to produce sparse representations. The applicability of the thresholded ICA representation is compared to singular value decomposition (SVD) in a multiple choice vocabulary task with three data se...
Article
Full-text available
We present the results of an analysis of a text corpus of 129,000 abstracts of NSF-sponsored basic research projects between years 1990 and 2003. The methods used in the analysis include term extraction based on a reference corpus and an entropy measure, and the Self-Organizing Map algorithm for the formation of a term map and a document map. Metho...
Conference Paper
Full-text available
A symbol as such is disassociated from the world. In addition, as a discrete entity a symbol does not mirror all the details of the portion of the world that it is meant to refer to. Humans establish the association between the symbols and the referenced domain – the words and the world – through a long learning process in a community. This paper s...
Conference Paper
We propose a method of content-based multimedia retrieval of objects with visual, aural and textual properties. In our method, training examples of objects belonging to a specific semantic class are associated with their low-level visual descriptors (such as MPEG-7) and textual features such as frequencies of significant keywords. A fuzzy mappi...
Conference Paper
An art installation was on display in the Centre Pompidou National Museum of Modern Art in Paris, where visitors could contribute with their own personal objects, adding keyword descriptions and quantified semantic features such as age or hardness. The data was projected in real-time onto a Self-Organizing Map (SOM) which was shown in the gallery....
Article
A vital mechanism of high‐level natural cognitive systems is the anticipatory capability of making decisions based on predicted events in the future. While in some cases the performance of computational cognitive systems can be improved by modeling anticipatory behavior, it has been shown that for many cognitive tasks anticipation is mandatory. In...
Article
Full-text available
Quality of Internet health information is essential because it has the potential to benefit or harm a large number of people and it is therefore essential to provide consumers with some tools to aid them in assessing the nature of the information they are accessing and how they should use it without jeopardizing their relationship with their doctor...
Conference Paper
Full-text available
In this article, we study the emergence of associations between words and concepts using the self-organizing map. In particular, we explore the meaning negotiations among communicating agents. The self-organizing map is used as a model of an agent's conceptual memory. The concepts are not explicitly given but they are learned by the agent in an uns...
Conference Paper
Full-text available
In this article, we study the differences between the European languages using statistical and unsupervised methods. The analysis is conducted at different levels of language: lexical, morphological and syntactic. Our premise is that the difficulty of the translation could be perceived as differences or similarities in different levels of...
Article
Full-text available
In this position paper, we discuss some problems related to those semantic web methodologies that are straightforwardly based on predicate logic and related formalisms. We also discuss complementary and alternative approaches and provide some examples of such.
Article
Full-text available
We study how independent component analysis can be used to automatically create syntactic and semantic features based on analyzing words in contexts.
Article
Full-text available
Purpose: Studies aspects of Heinz von Foerster's work that are of particular importance for cognitive science and artificial intelligence. Design/methodology/approach: Kohonen's self‐organizing map is presented as one method that may be useful in implementing some of von Foerster's ideas. The main foci are the distinction between trivial and non‐tri...
Conference Paper
According to a connectionist view, mental states consist of the activations of neural units in a connectionist network. We consider the similarity of representations that emerge in an unsupervised self-organization process of neural lattices when exposed to color spectrum stimuli. Self-organizing maps (SOM) are trained with color spectrum input, usin...
Conference Paper
Full-text available
Our aim is to find syntactic and semantic relationships of words based on the analysis of corpora. We propose the application of independent component analysis, which seems to have clear advantages over two classic methods: latent semantic analysis and self-organizing maps. Latent semantic analysis is a simple method for automatic generation of con...
Article
Full-text available
The WEBSOM is a method for analyzing and visualizing large document collections. In the WEBSOM method, the self-organizing map algorithm is used to automatically organize collections of documents onto a two-dimensional map to enable easy exploration and search of the collection. Map regions that are close to each other contain similar items. GS Te...
Article
Full-text available
We study written language as if it were a multidimensional signal rather than a stream of symbols. We show that it is possible to find emergent features by independent component analysis from word contexts. The closeness of match between the learned features and traditional linguistic word categories is examined. It is shown that independent compon...
Article
Our aim is to find syntactic and semantic relationships of words based on the analysis of corpora. We propose the application of independent component analysis, which seems to have clear advantages over two classic methods: latent semantic analysis and self-organizing maps. Latent semantic analysis is a simple method for automatic generation of con...
Conference Paper
Full-text available
In this paper, we assume that word co-occurrence statistics can be used to extract meaningful features, exhibiting syntactic and semantic behavior, from text data. Independent component analysis (ICA), an unsupervised statistical method, is applied to word usage statistics, calculated from a natural language corpus, to extract a number of feat...
Article
This article presents empirical evidence for the hypothesis that persons consider counterintuitive representations more likely to be religious than other kinds of beliefs. In three studies the subjects were asked to rate the probable religiousness of various kinds of imaginary beliefs. The results show that counterintuitive representations in gener...
Article
Full-text available
We develop a framework for discussing the degree of conceptual autonomy of natural and artificial agents. We claim that aspects related to learning and communication necessitate adaptive agents that are partially autonomous. We demonstrate how partial conceptual autonomy can be obtained through a self-organization process. The input for the agents...
Article
Full-text available
In this article, we present a model of a cognitive system, or an agent, with the following properties: it can perceive its environment, it can move in its environment, it can perform some simple actions, and it can send and receive messages. The main components of its internal structure include a working memory, a semantic memory, and a decision...
Article
Fuzzy logic, artificial neural network models and evolutionary computing are the main methodological tools of the soft computing area. This article provides an overview of two of them, namely neural networks and evolutionary models. The largest number of applications that combine these two is based on the idea that a genetic algorithm is used to op...
Chapter
Kohonen’s Self-Organizing Map (SOM) is a means for automatically arranging high-dimensional statistical data. The map attempts to represent all the input with optimal accuracy using a restricted set of models or prototypes. The prototypes also become ordered on the map grid so that similar prototypes are close to each other and dissimilar prototype...
Article
Full-text available
WEBSOM is a novel method for organizing document collections onto map displays to enhance the interactive browsing and retrieval of the documents. The map is organized automatically according to the contents of the full-text documents by the Self-Organizing Map algorithm. The map display provides a visual overview of the whole document collection....
Article
The current availability of large collections of full-text documents in electronic form emphasizes the need for intelligent information retrieval techniques. Especially in the rapidly growing World Wide Web it is important to have methods for exploring miscellaneous document collections automatically. In the report, we introduce the WEBSOM method f...
Article
Full-text available
In this article, the use of the self-organizing map (SOM) is approached on the basis of current theories of learning. Possibilities of computer and networked platforms that aim at helping human learning are also inspected. It is shown how the SOM can be considered a model of constructive learning. The area of constructive learning is outlined and t...

Questions

Question (1)
Question
Our collaborator has developed an agent-based simulation of the emergence of money (see below). Are you aware of related research? The current publication is a technical report, but it seems that the result would deserve wider dissemination. Which journals have computational economics in their agenda?
Risto Linturi: Social simulation of networked barter economy with emergent money.
