Predicting Reaction Times in Word Recognition
by Unsupervised Learning of Morphology
Sami Virpioja1, Minna Lehtonen2,3,4, Annika Hultén3,4,
Riitta Salmelin3, and Krista Lagus1
1Department of Information and Computer Science,
Aalto University School of Science
2Cognitive Brain Research Unit, Cognitive Science,
Institute of Behavioural Sciences, University of Helsinki
3Brain Research Unit, Low Temperature Laboratory,
Aalto University School of Science
4Department of Psychology and Logopedics, Åbo Akademi University
Abstract. A central question in the study of the mental lexicon is how
morphologically complex words are processed. We consider this question
from the viewpoint of statistical models of morphology. As an indicator
of the mental processing cost in the brain, we use reaction times to words
in a visual lexical decision task on Finnish nouns. Statistical correlation
between a model and reaction times is employed as a goodness measure
of the model. In particular, we study Morfessor, an unsupervised method
for learning concatenative morphology. The results for a set of inflected
and monomorphemic Finnish nouns reveal that the probabilities given
by Morfessor, especially the Categories-MAP version, show considerably
higher correlations to the reaction times than simple word statistics such
as frequency, morphological family size, or length. These correlations are
also higher than when any individual test subject is viewed as a model.
1 Introduction
The processing of morphologically complex words is a central question in the
study of the mental lexicon. Theoretical models have been put forward that sug-
gest that morphologically complex words are recognized either through full-form
representations [3], full decomposition (e.g. [17]) or a combination of the two
(e.g. [11]). For example, Finnish words can be composed of several morphemes,
and a single noun can, in principle, attain up to 2000 different forms [7]. Having
separate neural representations for each of these forms would seem unnecessarily
demanding compared to a process where words would be analyzed based on
their component morphemes. In behavioral word recognition tasks, a processing
cost (i.e., long reaction times and high error rates) has been robustly associ-
ated with inflected Finnish nouns in comparison to matched monomorphemic
nouns [11,10]. This has been taken as evidence for the existence of morphologi-
cal decomposition for most Finnish inflected words, with the possible exception
of very high frequency inflected nouns [15].
T. Honkela et al. (Eds.): ICANN 2011, Part I, LNCS 6791, pp. 275–282, 2011.
© Springer-Verlag Berlin Heidelberg 2011
Statistical models of language learning would be attractive both conceptu-
ally and because they yield quantitative predictions that may be tested against
measured values of performance and, eventually, of brain activation. In this first
feasibility test, we use reaction times as a proxy, providing an indirect measure
of the underlying mental processing. In previous studies, several factors, in-
cluding the cumulative base frequency (i.e., the summative frequency of all the
inflectional variants of a single stem, [16]), surface frequency (i.e., whole form
frequency, [1]), and morphological family size (i.e., the number of derivations
and compounds where the noun occurs as a constituent, [2]), have been found to
affect the recognition times of morphologically complex words. However, we do
not know of any previous work that would use statistical models of morphology
as models of the reaction times. In the proposed evaluation setting, we exam-
ine how well they predict the average reaction times for individual inflected and
monomorphemic words in a word recognition task. As a particular morphological
model we examine an unsupervised method for word segmentation, Morfessor,
that induces a compact lexicon of morphs from unannotated text data.
2 Experimental Setup
Our experimental setup can be summarized as follows: (1) Data recording: Measurement
data from humans is obtained, namely reaction times recorded on test
subjects in a lexical decision task with inflected and monomorphemic words. (2)
Model estimation: Using training data of varying size and type, we estimate sta-
tistical models of morphology that can be used to predict the recognition times
of words. In addition, we collect such statistics of the words that are known to
affect the reaction times. (3) Model evaluation: We calculate linear correlation
between model predictions and the average reaction times of the test subjects.
A good model is one which produces costs that have high correlation to the
reaction times. Also any of the human test subjects can be viewed as a model,
and their reaction times thus correlated with those of the rest of the subjects.
2.1 Reaction Time Data and Model Evaluation
We use the reaction time data reported in [9]. Sixteen Finnish-speaking univer-
sity students participated in the experiment. The task was to decide as quickly
and accurately as possible whether the letter string appearing on the screen was
a real Finnish word or not, and to press a corresponding button. The stimuli
consisted of 320 real Finnish nouns and 320 pseudowords. The words were taken
from an unpublished Turun Sanomat newspaper corpus of 22.7 million word to-
kens and divided into four groups of 80 words according to their frequency in
the corpus (high or low) and morphological structure (monomorphemic or in-
flected). There were four kinds of pseudowords (monomorphemic, real stem with
pseudosuffix, pseudostem with real suffix, and incorrect combination of real stem
and suffix) and their lengths and bigram frequencies (i.e., the average frequency
of letter bigrams in the word) were similar to the real words.
As preprocessing, we exclude all incorrect responses and reaction times more
than three standard deviations from the individual's mean. For the
remaining data, we take the logarithm of the reaction times, normalize them to
zero mean for each subject, and calculate the average across subjects for each
word. To evaluate the predicted costs, we calculate the Pearson product-moment
correlation coefficient ρ between the costs and the average reaction times, with
ρ ∈ [−1, +1] and ρ = 0 for uncorrelated variables. This is equivalent to calculating
linear regression, as ρ² corresponds to the coefficient of determination, i.e.,
the fraction of variance of the predicted variable explained by the predictor.
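The preprocessing and evaluation pipeline described above can be sketched in a few lines of Python. This is a minimal illustration with hypothetical input data, not the authors' code; the outlier cutoff, log-transform, and Pearson formula follow the description in the text.

```python
import math

def preprocess_rts(rts):
    """Per-subject preprocessing sketch: drop responses more than 3 standard
    deviations from the subject's mean, take the logarithm, and normalize to
    zero mean, as described in Section 2.1."""
    mean = sum(rts) / len(rts)
    sd = math.sqrt(sum((r - mean) ** 2 for r in rts) / len(rts))
    kept = [r for r in rts if abs(r - mean) <= 3 * sd]
    logs = [math.log(r) for r in kept]
    log_mean = sum(logs) / len(logs)
    return [x - log_mean for x in logs]

def pearson(xs, ys):
    """Pearson product-moment correlation coefficient rho in [-1, +1]."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)
```

A model's predicted costs would then be passed to `pearson` together with the per-word averages of the normalized log reaction times.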
2.2 Statistics and Computational Models
Several statistics are calculated for each stimulus word: length, surface frequency,
base frequency, morphological family size, and bigram frequency. As logarithmic
frequencies often correlate with reaction times better than direct frequencies,
we also test those. The computational models examined here give a probability
distribution p(W) over the words. Thus, we can use the cost, or self-information,
−log p(W) to explain the reaction times in a similar manner as with the word
frequencies: a high probability is assumed to correlate with a low reaction time.
N-gram Models. We use n-gram models to get a good estimate of how common
the form of the word (sequence of letters l_i) is among all the words in
the language. An n-gram model of order n is an (n−1):th order Markov model,
thus approximating p(W = l_1 l_2 … l_N) as ∏_{i=1}^{N} p(l_i | l_{i−n+1} … l_{i−1}). For
estimating the n-gram probabilities p(l_i | l_{i−n+1} … l_{i−1}), the standard techniques
include smoothing of the maximum likelihood distributions and interpolation
between different lengths of n-grams. We apply one of the state-of-the-art methods,
Kneser-Ney interpolation [4], implemented in the VariKN toolkit [14].
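To make the letter n-gram idea concrete, here is a toy model with simple additive smoothing. The paper itself uses Kneser-Ney interpolation via the VariKN toolkit; this sketch substitutes a much cruder smoothing scheme purely to show the Markov factorization over letters, with padding symbols `#` and `$` as an assumption.

```python
import math
from collections import defaultdict

class LetterNgram:
    """Toy letter n-gram model with additive smoothing -- a simplified
    stand-in for the Kneser-Ney interpolated models used in the paper."""
    def __init__(self, n, alpha=0.1):
        self.n, self.alpha = n, alpha
        self.counts = defaultdict(lambda: defaultdict(int))
        self.vocab = set()

    def train(self, words):
        for w in words:
            # pad with start symbols and an end symbol
            padded = '#' * (self.n - 1) + w + '$'
            self.vocab.update(padded)
            for i in range(self.n - 1, len(padded)):
                ctx = padded[i - self.n + 1:i]
                self.counts[ctx][padded[i]] += 1

    def logprob(self, word):
        """Sum of per-letter conditional log-probabilities."""
        padded = '#' * (self.n - 1) + word + '$'
        V = len(self.vocab) + 1  # +1 for unseen symbols
        lp = 0.0
        for i in range(self.n - 1, len(padded)):
            c = self.counts[padded[i - self.n + 1:i]]
            lp += math.log((c[padded[i]] + self.alpha)
                           / (sum(c.values()) + self.alpha * V))
        return lp
```

Trained on a word list, a frequent letter sequence receives a higher log-probability (lower cost) than an unattested one.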
Morfessor Baseline. Morfessor [6] is a method for unsupervised learning of
concatenative morphology. It does not limit the number of morphemes per word,
and is thus suitable for modeling complex morphology such as that in Finnish.
The basic idea can be explained using the Minimum Description Length (MDL)
principle [13], where modeling is viewed as a problem of encoding a data set
efficiently in order to transmit it. In two-part MDL coding, one first transmits
the model M, and then the data set by referring to the model. Thus the task
is to find the model that minimizes the sum of the coding lengths L(M) and
L(corpus|M). In the case of segmenting words into morphs, the model simply
consists of a lexicon of unique morphs, and a pointer assigned for each. The
corpus is then transmitted by sending the pointer of each morph as they occur
in the text. Using L(X) = −log p(X), the task is equivalent to probabilistic
maximum a posteriori (MAP) estimation, where p(M|corpus) is maximized.
In Morfessor Baseline, the lexicon consists of the strings and frequencies of
the morphs. The cost of the lexicon increases by the number and length of the
morphs. Each pointer in the corpus corresponds to a maximum likelihood prob-
ability set according to the morph frequency. Thus, for a known segmentation,
the likelihood for corpus is simply the product of the morph probabilities. Dur-
ing training, Morfessor applies a greedy algorithm for finding simultaneously
the morph lexicon and a segmentation for the training corpus. After training, a
Viterbi-like algorithm can be applied to find the segmentation with the highest
probability—the product of the respective morph probabilities—for any single
word. For details, see, e.g., [6] and [5].
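The Viterbi-like search over a fixed morph lexicon can be sketched as a standard dynamic program. This is an illustrative reimplementation, not Morfessor's own code; `morph_logprobs` is a hypothetical dictionary mapping already-learned morphs to their log-probabilities.

```python
import math

def viterbi_segment(word, morph_logprobs):
    """Find the segmentation of `word` into known morphs that maximizes the
    product of morph probabilities (i.e., the sum of log-probabilities),
    mirroring the Viterbi-like search described for Morfessor Baseline."""
    n = len(word)
    best = [0.0] + [-math.inf] * n   # best log-prob for word[:i]
    back = [0] * (n + 1)             # backpointer: previous split position
    for i in range(1, n + 1):
        for j in range(i):
            morph = word[j:i]
            if morph in morph_logprobs:
                score = best[j] + morph_logprobs[morph]
                if score > best[i]:
                    best[i], back[i] = score, j
    # trace back the best split points
    morphs, i = [], n
    while i > 0:
        morphs.append(word[back[i]:i])
        i = back[i]
    return list(reversed(morphs)), best[n]
```

For example, with hypothetical morph probabilities where p(talo)·p(ssa) exceeds p(talossa), the search returns the two-morph analysis, since the summed log-probability of the parts is higher than that of the whole form.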
Morfessor Categories-MAP. The assumption of the independence between
the morphs in a word is an obvious problem in Morfessor Baseline. For example,
the model gives an equal probability to “s + walk” and “walk + s”. The later
versions of Morfessor extend the model by adding another layer of representation,
namely a Hidden Markov Model (HMM) of the segments [6]. In
Morfessor Categories-MAP, the HMM has four categories (states): prefix, stem,
suffix, and non-morpheme. While the model allows hierarchical segmentation
to non-morphemes, the final analysis of a word is restricted by the regular expression
(prefix* stem+ suffix*)+. The context-sensitivity of the model has led
to improved segmentation results when compared to a linguistic gold standard
segmentation of words into morphemes [6].
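The category-sequence constraint quoted above is literally a regular expression, so it can be checked directly. In this sketch the category labels are encoded as single letters (an assumption for brevity: p = prefix, s = stem, f = suffix); non-morphemes are excluded because they may not appear in a final analysis.

```python
import re

# The final Categories-MAP analysis must match (prefix* stem+ suffix*)+.
# With categories encoded as single letters, the constraint is this pattern.
PATTERN = re.compile(r'(p*s+f*)+')

def valid_analysis(categories):
    """True if a sequence of morph categories is allowed by Categories-MAP."""
    return PATTERN.fullmatch(''.join(categories)) is not None
```

Thus "stem + suffix" is a valid analysis, while an analysis beginning with a bare suffix is not, and compounds (stem + stem, or repeated prefix–stem–suffix groups) are permitted.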
2.3 Data for Learning Computational Models
The main corpus in our experiments is the one used in the Morpho Challenge
2007 competition [8]. It is part of the Wortschatz collection [12] and contains
three million sentences collected from the World Wide Web. To observe the effect of
the training corpus, we also use 30 000, 100 000, 300 000 and one million sentence
random subsets of the corpus. In addition, we use three smaller corpora: “Book”
(4.4 million words) and “Periodical” (2.1 million words) parts of Finnish Parole
corpus [18], subtitles of movies from OpenSubs corpus [19] (3.0 million words),
and their combination.
It is often unclear whether intra-word models should be trained on a cor-
pus (word tokens), a word lexicon (types), or something in between. For ex-
ample, Morfessor Baseline gives segments that correspond better to linguistic
morphemes when trained on types rather than tokens [6,5]: with token counts,
many inflected high-frequency words are not segmented. Morfessor Categories-
MAP, however, is by default trained on tokens [6]: the context-sensitivity of the
Markov model reduces the effect of direct corpus frequencies. We compare models
trained on types, tokens, and an intermediate approach, where the corpus
frequencies c are reduced using a logarithmic function f(c) = log(1 + c).
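The three training-count schemes compared here differ only in how a word's corpus frequency c enters the training data. A minimal sketch (the function name and the choice to leave the log-dampened counts as real values are assumptions):

```python
import math

def dampen(corpus_counts, mode):
    """Sketch of the three count schemes compared in the paper:
    'tokens' keeps raw corpus frequencies, 'types' sets every count to 1,
    and 'log' applies f(c) = log(1 + c)."""
    if mode == 'tokens':
        return dict(corpus_counts)
    if mode == 'types':
        return {w: 1 for w in corpus_counts}
    if mode == 'log':
        return {w: math.log(1 + c) for w, c in corpus_counts.items()}
    raise ValueError(mode)
```

The log scheme keeps high-frequency words more influential than a pure type lexicon would, while preventing them from dominating the training signal as raw token counts do.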
3 Results
Table 1 shows the correlations of the different statistics and logarithmic proba-
bilities of the models to the average reaction times for the stimulus words. All
values, except for the bigram frequency, showed statistically significant correlation
(p(ρ = 0) < 0.01). Among the statistics, logarithmic frequencies gave higher
Table 1. Correlation coefficients ρ of different word statistics and models to average
human reaction times. Surface frequency I and other statistics are from the Turun
Sanomat newspaper corpus. Surface frequency II is from the Morpho Challenge corpus
used for training the models. The last row shows correlations for reaction times of
individual subjects. The highest correlations are marked with an asterisk.

Word statistics                      Logarithmic   Linear
Surface frequency I                  −0.5108       −0.2806
Surface frequency II                 −0.5353*      −0.2376
Base frequency                       −0.4453       −0.1901
Morphological family size            −0.4233       −0.2916
Bigram frequency                     −0.0211       +0.0221
Length (letters)                     +0.2180       +0.2158
Length (morphemes)                   +0.5417*      +0.5417*

Models                               Types         Log-frequencies   Tokens
Letter 1-gram model                  +0.1818       +0.1816           +0.1799
Letter 5-gram model                  +0.5394       +0.5380           +0.5160
Letter 9-gram model                  +0.6952*      +0.6920           +0.6358
Morfessor Baseline                   +0.6605       +0.6765*          +0.5817
Morfessor Categories-MAP             +0.6620       +0.6950*          +0.5474

Other                                Minimum       Median            Maximum
Reaction times of a single subject   +0.2030       +0.4774           +0.5681*
correlations than linear frequencies, and the highest ones were obtained for the
number of morphemes in the word and the surface frequency. Among the models,
the n-grams were best trained with word types, while training with the logarit-
mic frequencies gave the highest correlations for Morfessor. The highest corre-
lation was obtained for the letter 9-gram model trained with word types—any
longer n-grams did not improve the results. Categories-MAP correlated almost
as well as the 9-gram model, while Baseline did somewhat worse. All of them
had markedly higher correlations than the maximum correlation obtained for a
single test subject to the average reaction times of the others.
With logarithmic counts, the Categories-MAP model segmented 135 of the
160 inflected nouns, but also 33 of the 160 monomorphemic nouns. The Baseline
model segmented fewer: 39 of the inflected and 5 of the monomorphemic nouns.
Figure 1 shows how the reaction times and probabilities given by Categories-
MAP model match for individual stimulus words. Observing the words that have
poor match between the predicted difficulty and reaction time led us to suspect
that some of the unexplained variance is due to a training corpus that does
not match the material that humans are exposed to. Thus we next studied the
effect of the training corpus for the morphological models (Fig. 2). Increasing
the amount of word types in the corpus clearly improved the correlation between
model predictions and measured reaction times. However, the data from books,
periodicals, and subtitles usually gave higher correlations than the same amount
of the Morpho Challenge data.
Fig. 1. Scatter plot of reaction times and log-probabilities from Morfessor Categories-
MAP. The words are divided into four groups: low-frequency monomorphemic
(LM), low-frequency inflected (LI), high-frequency monomorphemic (HM), and high-
frequency inflected (HI). Words that have faster reaction times than predicted are
often very concrete and related to family, nature, or stories: tyttö (girl), äiti (mother),
haamu (ghost), etanaa (snail + partitive case), norsulla (elephant + adessive case).
Words that have slower reaction times than predicted are often more abstract or
professional: ohjelma (program), tieto (knowledge), hankkeen (project + genitive case),
käytön (usage + genitive case), hiippa (miter), kapselin (capsule + genitive case).
Fig. 2. The effect of training corpus on correlations of Morfessor Baseline (blue circles),
Categories-MAP (red squares), and logarithmic surface frequencies (black crosses). The
dotted lines show the results on subsets of the same corpus. Unconnected points show
the results using different types of corpora.
4 Discussion
We studied how language models trained on unannotated textual data can pre-
dict human reaction times for inflected and monomorphemic Finnish words in
a lexical decision task. Three models, the letter-based 9-gram model and the
Morfessor Baseline and Categories-MAP models, provided not only higher cor-
relations than the simple statistics of words previously identified as impor-
tant factors affecting the recognition times in morphologically complex words
(cf. [16,1,2]), but also higher than the correlations of reaction times of individual
subjects to the average times of the others. The level of correlation was sur-
prisingly high especially because the training corpus is likely to differ from the
material humans encounter during their course of life. Based on the results us-
ing several training corpora, we assume that even higher correlations would be
obtained with more realistic training data.
The highest correlations were obtained for the letter 9-gram model. However,
its number of parameters—almost 6 million n-gram probabilities—was very
large. As the estimates of the word probabilities are very precise, we assume
that they are good predictors especially for early visual processing stages.
The Categories-MAP model had almost as high correlation as the 9-gram
model with much fewer parameters (178 000 transition and emission probabili-
ties). It has three important aspects: First, it applies morpheme-like units instead
of words or letters. Second, it finds units that provide a compact representation
for the data. Third, the model is context-sensitive: the cost of the next unit depends
on the previous unit. It is still unclear which contributes more to the high corre-
lations: the morpheme lexicon learned by minimizing the description length, or
the underlying probabilistic model. One way to study this question further is to
apply a similar model to a linguistic morphological analysis of a corpus.
While behavioral reaction times necessarily incorporate multiple processing
stages, brain activation measures could provide markedly more precise markers
of the different stages of visual word processing. At the level of the brain, effects
of morphology have been previously detected in neural responses that have been
associated with later stages of word recognition such as lexical-semantic, phono-
logical and syntactic processing [9,20]. Future work includes finding out whether
the predictive power of the models stems from some of these stages, or from an
earlier one related to the processing of visual word forms.
Acknowledgments. This work was funded by Academy of Finland, Graduate
School of Language Technology in Finland, Sigrid Jusélius Foundation, Finnish
Cultural Foundation, and Stiftelsen för Åbo Akademi.
References
1. Alegre, M., Gordon, P.: Frequency effects and the representational status of regular
inflections. Journal of Memory and Language 40, 41–61 (1999)
2. Bertram, R., Baayen, R., Schreuder, R.: Effects of family size for complex words.
Journal of Memory and Language 42, 390–405 (2000)
3. Butterworth, B.: Lexical representation. In: Butterworth, B. (ed.) Language Pro-
duction, pp. 257–294. Academic Press, London (1983)
4. Chen, S.F., Goodman, J.: An empirical study of smoothing techniques for language
modeling. Computer Speech & Language 13(4), 359–393 (1999)
5. Creutz, M., Lagus, K.: Unsupervised morpheme segmentation and morphology
induction from text corpora using Morfessor 1.0. Tech. Rep. A81. Publications in
Computer and Information Science. Helsinki University of Technology (2005)
6. Creutz, M., Lagus, K.: Unsupervised models for morpheme segmentation and mor-
phology learning. ACM Transactions on Speech and Language Processing 4(1)
(January 2007)
7. Karlsson, F.: Suomen kielen äänne- ja muotorakenne (The Phonological and Morphological
Structure of Finnish). Werner Söderström, Juva (1983)
8. Kurimo, M., Creutz, M., Varjokallio, M.: Morpho challenge evaluation using a
linguistic gold standard. In: Peters, C., Jijkoun, V., Mandl, T., Müller, H., Oard,
D.W., Peñas, A., Petras, V., Santos, D. (eds.) CLEF 2007. LNCS, vol. 5152, pp.
864–872. Springer, Heidelberg (2008)
9. Lehtonen, M., Cunillera, T., Rodríguez-Fornells, A., Hultén, A., Tuomainen, J.,
Laine, M.: Recognition of morphologically complex words in Finnish: evidence
from event-related potentials. Brain Research 1148, 123–137 (2007)
10. Lehtonen, M., Laine, M.: How word frequency affects morphological processing
in monolinguals and bilinguals. Bilingualism: Language and Cognition 6, 213–225
(2003)
11. Niemi, J., Laine, M., Tuominen, J.: Cognitive morphology in Finnish: foundations
of a new model. Language and Cognitive Processes 9, 423–446 (1994)
12. Quasthoff, U., Richter, M., Biemann, C.: Corpus portal for search in monolin-
gual corpora. In: Proceedings of the Fifth International Conference on Language
Resources and Evaluation, LREC 2006, Genoa, Italy, pp. 1799–1802 (2006)
13. Rissanen, J.: Modeling by shortest data description. Automatica 14, 465–471 (1978)
14. Siivola, V., Hirsimäki, T., Virpioja, S.: On growing and pruning Kneser-Ney
smoothed n-gram models. IEEE Transactions on Audio, Speech & Language Pro-
cessing 15(5), 1617–1624 (2007)
15. Soveri, A., Lehtonen, M., Laine, M.: Word frequency and morphological processing
revisited. The Mental Lexicon 2, 359–385 (2007)
16. Taft, M.: Recognition of affixed words and the word frequency effect. Memory and
Cognition 7, 263–272 (1979)
17. Taft, M.: Morphological decomposition and the reverse base frequency effect. The
Quarterly Journal of Experimental Psychology A 57, 745–765 (2004)
18. The Department of General Linguistics, University of Helsinki and Research Insti-
tute for the Languages of Finland (gatherers): Finnish Parole Corpus (1996–1998),
available through CSC, http://www.csc.fi/
19. Tiedemann, J.: News from OPUS — A collection of multilingual parallel corpora
with tools and interfaces. In: Recent Advances in Natural Language Processing,
vol. 5, pp. 237–248. John Benjamins, Amsterdam (2009)
20. Vartiainen, J., Aggujaro, S., Lehtonen, M., Hult´en, A., Laine, M., Salmelin, R.:
Neural dynamics of reading morphologically complex words. NeuroImage 47, 2064–
2072 (2007)
... Morfessor was initially studied in psycholinguistic context by Virpioja, Lehtonen, Hultén, Salmelin, and Lagus (2011b) who demonstrated that the self-information values predicted by Morfessor correlated highly with word recognition times for morphologically complex Finnish words in a visual lexical decision task. These correlations were higher for Morfessor than for typically used psycholinguistic variables, such as lemma frequency, length, or morphological family size. ...
... As the Virpioja et al. (2011bVirpioja et al. ( , 2018 studies were based on lexical decision RTs, it is unclear whether the good performance of Morfessor and the whole-word model stem from particular, possibly different stages of the word recognition process. Word recognition times in a lexical decision task necessarily include several stages, including form-level (e.g., letter and bigram) processing and access to more abstract lexical representations (e.g., whole words or morphemes) but also decision-making processes and button-press-related motor preparation. ...
... Using these measures, we aim to better understand whether the predictive power of unsupervised Morfessor in lexical decision (Virpioja et al., 2011b(Virpioja et al., , 2018) stems primarily from early or late word recognition processes. We investigate how the MDL-based optimization principle of Morfessor and its different model variants (e.g., those that decompose words exhaustively vs. those that keep many words unsegmented) perform in predicting the different eyetracking measures during word recognition. ...
Article
Full-text available
We studied how statistical models of morphology that are built on different kinds of representational units, i.e., models emphasizing either holistic units or decomposition, perform in predicting human word recognition. More specifically, we studied the predictive power of such models at early vs. late stages of word recognition by using eye-tracking during two tasks. The tasks included a standard lexical decision task and a word recognition task that assumedly places less emphasis on postlexical reanalysis and decision processes. The lexical decision results showed good performance of Morfessor models based on the Minimum Description Length optimization principle. Models which segment words at some morpheme boundaries and keep other boundaries unsegmented performed well both at early and late stages of word recognition, supporting dual- or multiple-route cognitive models of morphological processing. Statistical models based on full forms fared better in late than early measures. The results of the second, multi-word recognition task showed that early and late stages of processing often involve accessing morphological constituents, with the exception of short complex words. Late stages of word recognition additionally involve predicting upcoming morphemes on the basis of previous ones in multimorphemic words. The statistical models based fully on whole words did not fare well in this task. Thus, we assume that the good performance of such models in global measures such as gaze durations or reaction times in lexical decision largely stems from postlexical reanalysis or decision processes. This finding highlights the importance of considering task demands in the study of morphological processing.
... Morfessor has previously been studied in a psycholinguistic setting by evaluating how well predictions of unsupervised Morfessor models correlated with the RTs for a set of monomorphemic and bimorphemic inflected Finnish nouns (Virpioja, Lehtonen, Hult en, Salmelin, & Lagus, 2011). The results were compared with predictions of letter-based n-gram models and a number of variables known to affect RTs. ...
Article
Determining optimal units of representing morphologically complex words in the mental lexicon is a central question in psycholinguistics. Here, we utilize advances in computational sciences to study human morphological processing using statistical models of morphology, particularly the unsupervised Morfessor model that works on the principle of optimization. The aim was to see what kind of model structure corresponds best to human word recognition costs for multimorphemic Finnish nouns: a model incorporating units resembling linguistically defined morphemes, a whole-word model, or a model that seeks for an optimal balance between these two extremes. Our results showed that human word recognition was predicted best by a combination of two models: a model that decomposes words at some morpheme boundaries while keeping others unsegmented and a whole-word model. The results support dual-route models that assume that both decomposed and full-form representations are utilized to optimally process complex words within the mental lexicon.
... For example, Lim et al. (2005) study a trie structure for storing Korean words, and find that the search times correlate to three properties of words and non-words (frequency, length, and nonwords similarity to a correct word) in a similar manner as human reaction times. In a recent work, Virpioja et al. (2011) study how an unsupervised probabilistic model can predict reaction times for Finnish nouns in a lexical decision task. These can be considered as direct evaluations, although the external "reference" is not an analysis by linguists but something measured from human test subjects. ...
Article
Full-text available
Unsupervised and semi-supervised learning of morphology provide practical solutions for processing morphologically rich languages with less human labor than the traditional rule-based analyzers. Direct evaluation of the learning methods using linguistic reference analyses is important for their development, as evaluation through the final applications is often time consuming. However, even linguistic evaluation is not straightforward for full morphological analysis, because the morpheme labels generated by the learning method can be arbitrary. We review the previous evaluation methods for the learning tasks and propose new variations. In order to compare the methods, we perform an extensive meta-evaluation using the large collection of results from the Morpho Challenge competitions.
Article
Full-text available
Neuroimaging studies of the reading process point to functionally distinct stages in word recognition. Yet, current understanding of the operations linked to those various stages is mainly descriptive in nature. Approaches developed in the field of computational linguistics may offer a more quantitative approach for understanding brain dynamics. Our aim was to evaluate whether a statistical model of morphology, with well-defined computational principles, can capture the neural dynamics of reading, using the concept of surprisal from information theory as the common measure. The Morfessor model, created for unsupervised discovery of morphemes, is based on the minimum description length principle and attempts to find optimal units of representation for complex words. In a word recognition task, we correlated brain responses to word surprisal values derived from Morfessor and from other psycholinguistic variables that have been linked with various levels of linguistic abstraction. The magnetoencephalography data analysis focused on spatially, temporally and functionally distinct components of cortical activation observed in reading tasks. The early occipital and occipito-temporal responses were correlated with parameters relating to visual complexity and orthographic properties, whereas the later bilateral superior temporal activation was correlated with whole-word based and morphological models. The results show that the word processing costs estimated by the statistical Morfessor model are relevant for brain dynamics of reading during late processing stages.
Article
Full-text available
We present a model family called Morfessor for the unsupervised induction of a simple morphology from raw text data. The model is formulated in a probabilistic maximum a posteriori framework. Morfessor can handle highly inflecting and compounding languages where words can consist of lengthy sequences of morphemes. A lexicon of word segments, called morphs , is induced from the data. The lexicon stores information about both the usage and form of the morphs. Several instances of the model are evaluated quantitatively in a morpheme segmentation task on different sized sets of Finnish as well as English data. Morfessor is shown to perform very well compared to a widely known benchmark algorithm, in particular on Finnish data.
Article
Full-text available
There have been many recent proposals concerning the nature of representations for inflectional morphology. One set of proposals addresses the question of whether there is decomposition of morphological structure in lexical access, whether complex forms are accessed as whole words, or if there is a competition between these two access modes. Another set of proposals addresses the question of whether inflected forms are generated by rule-based systems by connectionist type associative networks or if there is a dual system dissociating rule-based regular inflections from association-based irregular inflections. A central question is whether there are whole-word representations for regularly inflected forms. A series of five lexical decision experiments addressed this question by looking at whole-word frequency effects across a range of frequency values with constant stem-cluster frequencies. Frequency effects were only found for inflected forms above a threshold of about 6 per million, whereas such effects were found for morphologically simple controls in all frequency ranges. We discuss these data in the context of two kinds of dual models and in relation to competition models proposed within the connectionist literature.
Article
A simple and flexible schema for storing and presenting monolingual language resources is proposed. In this format, data for 18 different languages is already available in various sizes. The data is provided free of charge for online use and download. The main target is to ease the application of algorithms for monolingual and interlingual studies.
Article
In this work, we describe the first public version of the Morfessor software, which is a program that takes as input a corpus of unannotated text and produces a segmentation of the word forms observed in the text. The segmentation obtained often resembles a linguistic morpheme segmentation. Morfessor is not language-dependent. The number of segments per word is not restricted to two or three as in some other existing morphology learning models. The current version of the software essentially implements two morpheme segmentation models presented earlier by us (Creutz and Lagus, 2002; Creutz, 2003). The document contains the user's instructions, as well as the mathematical formulation of the model and a description of the search algorithm used. Additionally, a few experiments on Finnish and English text corpora are reported in order to give the user some ideas of how to apply the program to his or her own data sets and how to evaluate the results.
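The segmentation step itself can be sketched independently of the software: given a morph lexicon with log-probabilities, the most probable split of a word is found by dynamic programming, much like the Viterbi-style segmentation such a model applies to word forms. The Finnish morphs and all probabilities below are illustrative assumptions, not values from the actual program.

```python
import math

def viterbi_segment(word, logp):
    """Most probable segmentation of `word` into morphs drawn from
    `logp` (a dict morph -> log-probability), by dynamic programming."""
    # best[i] holds (score, morphs) for the best split of word[:i]
    best = [(0.0, [])] + [None] * len(word)
    for i in range(1, len(word) + 1):
        cands = [(best[j][0] + logp[word[j:i]], best[j][1] + [word[j:i]])
                 for j in range(i)
                 if best[j] is not None and word[j:i] in logp]
        if cands:
            best[i] = max(cands)
    return best[-1]  # None if the word cannot be covered by the lexicon

# Illustrative morph lexicon (probabilities invented for the example)
lex = {"talo": math.log(0.4),   # 'house'
       "i": math.log(0.1),      # plural marker
       "ssa": math.log(0.2),    # inessive case ('in')
       "issa": math.log(0.05)}  # fused plural + inessive

score, morphs = viterbi_segment("taloissa", lex)  # 'in (the) houses'
```

Under these invented probabilities the fused suffix wins, so "taloissa" comes out as talo + issa rather than talo + i + ssa; in the real model the competing analyses are scored with probabilities learned from the corpus.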
Article
We summarise the main results from a series of Finnish studies dealing with single-word experiments with aphasics as well as lexical decision and eye-movement registration tests performed on normals. On the basis of our experimental results, we propose a processing model of Finnish nouns. For the input and central lexicons, this Stem Allomorph/Inflectional Decomposition (SAID) model assumes morphological decomposition of inflected (with the exception of the most frequently encountered inflected noun forms) but not derived noun forms. For the output lexicon, it predicts that both inflected and productive derived forms have decomposed representations. In the case of marked stem variation (resulting from stem formation and/or morphophonological alternation), the model assumes that the stems are represented by their allomorphs, and not by a single morph. In this respect, our model postulates more suppletion in the input/output lexicons than would be predicted on the basis of formal morphological analyses. However, among the allomorphs, the nominative singular of nouns appears to have a special status.
Article
The aims of the present study were to investigate the effects of word frequency on morphological processing of inflected words in Finnish, and to re-test previous results obtained for high frequency inflected words in Finnish which suggest that inflected words of high frequency might have full-form representations in the mental lexicon. Our results from three visual lexical decision experiments with monolingual Finnish speakers suggest that only very high frequency inflected Finnish words have full-form representations. This finding differs from results obtained from related studies in morphologically more limited Indo-European languages, in which full-form representations for inflected words seem to exist at a much lower level of frequency than in the morphologically rich Finnish language.
Article
R. Schreuder and R. H. Baayen (1997) reported that in visual lexical decision, response latencies to a simplex noun are shorter when this noun has a large morphological family, i.e., when it appears as a constituent in a large number of derived words and compounds. This article addresses the question of whether the family size of the base word of a complex word likewise affects lexical processing. College students participated in 6 experiments that show that family size plays a role for both inflected and derived words. Posthoc analyses show that the effect of family size is driven by the semantically transparent family members and that this effect is further constrained by semantic selection restrictions of the affix in the target word. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
The present study investigated processing of morphologically complex words in three different frequency ranges in monolingual Finnish speakers and Finnish-Swedish bilinguals. By employing a visual lexical decision task, we found a differential pattern of results in monolinguals vs. bilinguals. Monolingual Finns seemed to process low frequency and medium frequency inflected Finnish nouns mostly by morpheme-based recognition but high frequency inflected nouns through full-form representations. In contrast, bilinguals demonstrated a processing delay for all inflections throughout the whole frequency range, suggesting decomposition for all inflected targets. This may reflect different amounts of exposure to the word forms in the two groups. Inflected word forms that are encountered very frequently will acquire full-form representations, which saves processing time. However, with the lower rates of exposure, which characterize bilingual individuals, full-form representations do not start to develop.