A data set linking large-scale, individual semantic networks and
Dirk U. Wulff1,2, Samuel Aeschbach1, Simon De Deyne3, and Rui Mata1,2
1University of Basel
2Max Planck Institute for Human Development
3University of Melbourne
We report data from a proof-of-concept study involving the concurrent assessment of large-scale
individual semantic networks and cognitive performance. The data include 10,800 free
associations, collected using a dedicated web-based platform over the course of 2-4 weeks, and
responses to several cognitive tasks, including verbal fluency, episodic memory, and associative
recall tasks, from four younger and four older native German speakers. The data are unique
in scope and composition and shed light on individual and age-related differences in mental
representations and their role in cognitive performance across the lifespan.
Keywords: semantic networks, cognitive aging, individual differences
The data were collected from August to October 2018.
Over the lifespan, people accumulate a large and idiosyn-
cratic set of experiences that shape their mental knowledge
representations. These changes in mental representations
driven by experience could potentially be a major factor
underlying typical age-related patterns, such as decreased
memory performance with increased age (Buchler & Reder,
2007; Ramscar et al., 2014; Wulff et al., 2019). In line
with this view, recent research (Kenett et al., 2020; Siew et
al., 2019) has documented consistent differences in the size
and structure of younger and older adults’ mental
representations (Dubossarsky et al., 2017; Wulff et al., 2018,
October 29). To evaluate whether and how strongly these
differences in representations contribute to differences in
cognitive performance across age, we designed the My Small
World of Words (MySWOW) project. Building on ongoing efforts
to obtain word association norms for several languages in a
large online citizen-science project, the Small World of Words
(SWOW; e.g., De Deyne et al., 2019) study
(https://smallworldofwords.org), MySWOW aims to elicit
large-scale free association networks from single individuals
and concurrently assess their cognitive performance across a
variety of tasks that are known to be linked to semantic
representations. MySWOW addresses shortcomings of previous
research, which either focused on group-level representations
(Dubossarsky et al., 2017) or did not concurrently assess
cognitive performance on a broad scale (Wulff et al., 2018,
October 29). We present data from a proof-of-concept study of
MySWOW involving four younger and four older individuals. For
additional details of the study rationale, see Wulff et al.
(2021, February 15).
Dirk U. Wulff https://orcid.org/0000-0002-4008-8022
Samuel Aeschbach https://orcid.org/0000-0002-6167-4901
Simon De Deyne https://orcid.org/0000-0002-7899-6210
Rui Mata
Correspondence concerning this article should be addressed
to Dirk Wulff, Department of Psychology, University of Basel,
Missionsstrasse 60-62, 4055 Basel, Switzerland. E-mail:
The MySWOW proof-of-concept study relied on a corre-
lational design encompassing the concurrent assessment of a
large number of free word associations and a broad battery of
cognitive tasks for four younger and four older individuals.
The free association task and cognitive battery were designed
to match each other in order to facilitate a comparison of se-
mantic networks and cognitive performance.
Four older adults aged 68 to 70 years old and four younger
adults aged 24 to 28 years old participated and completed the
study. Three more participants began the study, but dropped
out after 0.5%, 18.7%, and 41.7% of the free association task.
We only report data for the eight participants with complete
data. Participants were recruited from the participant pool
of the Center for Cognitive and Decision Sciences (CDS) of
2WULFF, AESCHBACH, DE DEYNE, AND MATA
the University of Basel. They were contacted via phone and
completed an initial screening to confirm the following
inclusion criteria: mother tongue being German or Swiss
German, daily access to a computer with a stable Internet
connection, and absence of neurological or psychiatric diagnoses.
Participants were compensated with a flat fee of CHF 220
(USD 245) for their full participation, consisting of CHF 180
for 3,600 answered cues (CHF 0.05 per cue) and CHF 40 for two
to three hours of laboratory assessment and instructions
(approx. CHF 15/h).
All data were recorded in reference to a random six-letter
identifier assigned to participants at the beginning of the
study. Identifying information such as names or addresses
was not recorded. Potentially identifying information such
as participants’ age, birthday, and profession was not
included in the publicly available files. Participants provided
informed consent that included permission for public sharing
of the data. The study was approved by the internal review
board of the Department of Psychology at the University of
Basel (# 014-17-1).
Free association task
Free associations were collected via a password-protected
web-based platform that participants could access from
home. In the association task, participants were sequentially
presented with a total of 3,600 cues for which they provided
three associations each, following the same procedure used
in SWOW. Participants were instructed to enter, using the
keyboard, the first three words that came to mind when
thinking about the cue. If fewer than three words came to mind
or if the cue was not recognized, the participant could pro-
ceed to the next cue by clicking on a "no further responses"
or "unknown word" button, respectively. Figure 1 shows a
screenshot of the free association interface.
The 3,600 cues consisted of 3,000 unique and 600 repeated
cues. The 3,000 unique cues, in turn, consisted of three
subsets of 1,000 cues each. To ensure high coverage of central
words in people’s semantic networks, the first subset
consisted of the 1,000 highest-frequency words among the
4,500 cue words that, at the time, were included in the German
SWOW, with frequency determined using the German SUBTLEX
frequency norms (Brysbaert et al., 2011). To ensure high
coverage of the connections within people’s networks, the
second subset consisted of the 1,000 cues among the remaining
3,500 cues in the German SWOW that were most likely to produce
one of the cues in the first subset as a response. Finally, to
ensure a high network depth, the third subset consisted of the
1,000 most frequent associates in the German SWOW given to the
cues of the first subset. The cues were presented to the
participants in the same fixed, randomly determined order.
Figure 1. Screenshot of the free association task. The
screenshot shows one trial in the training mini-study
requiring associations to the cue "Büroklammer" (paper clip).
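The three-step cue selection described above can be illustrated with a minimal Python sketch. It rests on hypothetical inputs that are not part of the published files: `frequency` maps each SWOW cue to a corpus frequency, and `responses` maps each cue to the list of associates it received in SWOW. The score used for the second subset (the share of a cue's responses that are first-subset cues) is only one plausible reading of the selection rule, and skipping already selected words in the third subset is likewise an assumption.

```python
from collections import Counter


def select_cues(frequency, responses, n=1000):
    """Pick three cue subsets of size n, following the verbal description."""
    # Subset 1: the n cues with the highest corpus frequency
    # (ties broken alphabetically to keep the result deterministic).
    ranked = sorted(frequency, key=lambda w: (-frequency[w], w))
    first = ranked[:n]
    first_set = set(first)

    # Subset 2 (assumed scoring): remaining cues ranked by the share of
    # their responses that are themselves cues in subset 1.
    def hit_rate(cue):
        given = responses.get(cue, [])
        if not given:
            return 0.0
        return sum(r in first_set for r in given) / len(given)

    remaining = [w for w in frequency if w not in first_set]
    second = sorted(remaining, key=lambda w: (-hit_rate(w), w))[:n]

    # Subset 3: most frequent associates given to subset-1 cues.
    # Skipping words that were already selected is an assumption.
    taken = first_set | set(second)
    counts = Counter(
        r for cue in first for r in responses.get(cue, []) if r not in taken
    )
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    third = [word for word, _ in ordered[:n]]
    return first, second, third
```

With `n=1000` and the 4,500 German SWOW cues as input, the function would return three lists of 1,000 words each, provided enough distinct associates are available for the third subset.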
Responses were cleaned in the following way. First, all
responses matching either individual words or composites
of words included in the German aspell dictionary were ac-
cepted as valid. The remaining words were subjected to man-
ual correction. Overall, 4.2% of responses were corrected
manually with a median string edit distance (i.e., the number
of letters that were changed) of 2 (mean = 2.42).
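The automatic part of this cleaning procedure and the edit-distance summary can be sketched as follows. The sketch assumes that `dictionary` is a set of lowercase words standing in for the German aspell word list, treats a "composite" as a plain concatenation of dictionary entries (ignoring linking elements), and uses the Levenshtein distance as the string edit distance. None of these choices is confirmed as the exact procedure used for the data.

```python
def is_valid(word, dictionary):
    """True if word is a dictionary entry or a concatenation of entries."""
    w = word.lower()
    # ok[i] is True when the first i characters can be split into entries.
    ok = [True] + [False] * len(w)
    for end in range(1, len(w) + 1):
        ok[end] = any(
            ok[start] and w[start:end] in dictionary for start in range(end)
        )
    return bool(w) and ok[len(w)]


def edit_distance(a, b):
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(
                min(
                    prev[j] + 1,               # delete ca
                    curr[j - 1] + 1,           # insert cb
                    prev[j - 1] + (ca != cb),  # substitute ca by cb
                )
            )
        prev = curr
    return prev[-1]
```

Responses for which `is_valid` returns `False` would be passed on to manual correction, and `edit_distance(raw, corrected)` could then be summarized across corrected responses, for example with `statistics.median`.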
The cognitive battery consisted of two sets of tasks
fulfilling different purposes. The purpose of the first set
was the assessment of people’s general cognitive abilities and
functioning. This set included a 20-minute timed version of
the Advanced Progressive Matrices (APM; Hamel & Schmittmann,
2006) as a measure of general intelligence; a digit-symbol
substitution test, as found in the "coding" subtest of the
Wechsler Adult Intelligence Scale IV (WAIS-IV; Wechsler,
2008), as a measure of processing speed; the
Mehrfachwahl-Wortschatz-Intelligenztest: Form I (MWT-A; Lehrl
et al., 1995) as a measure of vocabulary size; and, finally,
the DemTect (Kalbe et al., 2004) as a screen for dementia. The
purpose of the second set was to establish word-level links
between the free association network and cognitive
performance. This set included 10-minute category (animals)
and phonemic (letter S) fluency tasks (e.g., Wulff et al.,
2018, October 29), an episodic list memory task modeled after
the Penn Electrophysiology of Encoding and Retrieval Study
(e.g., Healey & Kahana, 2016), and an associative recall task
modeled after Naveh-Benjamin et al. (2003). Behavior in the
two fluency tasks can be related to the free association
network based on the fact that both cues and responses naturally
SEMANTIC NETWORKS AND COGNITIVE AGING 3
included animals and words starting with the letter S.
Participants retrieved between 62 and 113 animals and between
45 and 138 words starting with the letter S. The retrieved
animals overlapped with 1.5% of cues and 0.8% of responses,
whereas the retrieved words starting with the letter S
overlapped with 11.1% of cues and 11.9% of responses. The
episodic memory task and the associative recall task were
populated with nouns from the cue set to establish
comparability with the associative network. In the episodic
memory task, a total of 20 lists of 16 words each were studied
and subsequently recalled. Participants correctly recalled
between 28.7% and 60.9% of words, with an additional 1.3% to
25% intrusions. In the associative recall task, 4 lists
consisting of 16 word pairs were presented and tested.
Participants correctly recalled between 32.8% and 96.8% of
pairs. See also Table 1 for an overview of tasks included in
the cognitive assessment in the MySWOW proof-of-concept study.
Table 1. Tasks in the cognitive battery
Task | Description | Motivation | Reference
Category fluency | Name all the animals you can in 10 minutes. | | Wulff et al. (2018, October 29)
Phonemic fluency | Name all words starting with letter S you can in 10 minutes. | | Griffiths et al. (2007)
Episodic memory task | Study a word list and then recall the words in any order (20 lists, 16 words each). | | Healey and Kahana (2016)
Associative recall task | Study a list of word pairs, then recall for each pair one word while being cued with the other (4 lists of 16 pairs). | | Naveh-Benjamin et al. (2003)
Advanced Progressive Matrices | Solve abstract reasoning problems. | General cognitive abilities | Hamel and Schmittmann (2006)
Digit-symbol substitution | Assign digits to symbols according to ... | General cognitive abilities | Wechsler (2008)
MWT-A | Recognize words in list of words and ... | General cognitive abilities | Lehrl et al. (1995)
DemTect | Various cognitive tasks. | Screen for age-related ... | Kalbe et al. (2004)
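Summary figures of the kind reported above (overlap between fluency productions and the association data, and recall accuracy) could be computed along the following lines. This is a sketch under assumptions: overlap is read here as the share of distinct cue or response words that also appear among the retrieved words, matching ignores case only, and intrusions are returned as a count because the denominator behind the reported percentages is not specified.

```python
def overlap_share(retrieved, vocabulary):
    """Fraction of distinct vocabulary words that also occur in retrieved."""
    vocab = {w.lower() for w in vocabulary}
    if not vocab:
        return 0.0
    found = {w.lower() for w in retrieved}
    return len(vocab & found) / len(vocab)


def score_recall(studied, recalled):
    """Return (proportion of studied words recalled, number of intrusions)."""
    studied_set = {w.lower() for w in studied}
    recalled_set = {w.lower() for w in recalled}
    correct = len(studied_set & recalled_set)
    # Intrusions: recalled words that were not on the studied list.
    intrusions = len(recalled_set - studied_set)
    return correct / len(studied_set), intrusions
```

For example, `overlap_share(animals, cues)` would give the share of cues that a participant also named in the category fluency task, and `score_recall` could be applied list by list in the episodic memory task.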
Entry and debriefing questionnaires
At study entry, participants provided demographic
information concerning their primary language (German or Swiss
German), their current profession, their highest academic
degree, and the income level of their household. Participants
further answered questions on their usual reading behavior,
e.g., the number of books read in a year. At debriefing,
participants were asked to provide information on their
observations during the study, for example, whether they were
able to sustain concentration while working on the free
associations. The specific questions are reported in the
codebook (see Table 2).
Participants passing the initial screening over the phone
were invited to our laboratory at the University of Basel
for an introductory session lasting approximately 30 minutes.
During this session participants provided informed consent,
completed the entry questionnaire, and were introduced to
the web-based platform using a training mini-study involv-
ing 15 cues. Over the course of the next weeks, participants
were instructed to log in and work on the free association task
twice a day for 30 minutes each. On average, participants
completed the free association task in 26.1 hours spread over
39.4 days. After completing the free association task, par-
ticipants were invited back to the laboratory for a three-hour
session that included the cognitive assessments and study debriefing.
The cognitive assessment and study debriefing session
consisted of the following elements: First, participants
filled out the debriefing questionnaire. Next, the verbal
fluency tasks were conducted orally and recorded for later
transcription by two student assistants responsible for data
collection. Following the verbal fluency tasks, the
participants were administered a 90-second timed Digit Symbol
Substitution Test in paper and pencil format. To conclude the
first part of the lab session, the Associative Recall task was
completed as a computerized task implemented in E-Prime
(Psychology Software Tools, Inc., 2016) on a lab computer.
After a 10-minute break, the second part of the lab session
began with the List Memory task, which was also implemented as
a computerized task using E-Prime (Psychology Software Tools,
Inc., 2016). The Mehrfachwahl-Wortschatz-Intelligenztest
(MWT-A) was then conducted in paper and pencil format,
followed by a 20-minute timed version of the Advanced
Progressive Matrices (APM) in paper and pencil format. The lab
session concluded with the interactive verbal administration
of the DemTect, carried out by one of the student assistants.
Subsequently, participants received their monetary
compensation for participation.
Table 2. Description of data files
File | Description
participants.csv | Contains data on demographic information, reading behavior, debriefing survey, and all but four cognitive assessments.
associations.csv | Contains the corrected and uncorrected free association responses.
episodic_memory.csv | Contains the episodic memory training and test data.
associative_recall.csv | Contains the associative recall training and test data.
animal_fluency.csv | Contains animal fluency responses.
letter_fluency.csv | Contains letter fluency responses.
codebook.pdf | Contains descriptions of all variable names in the data files.
Table 2 provides an overview of the different files
containing the data. All data are available as comma-separated
files. A codebook.pdf file provides descriptions of all
variable names across the data files. All variable names and
data labels have been translated to English. The association
and fluency data, however, were not translated.
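As a starting point for working with these files, the following minimal Python sketch reads whichever of the comma-separated files are present in a local folder. The folder location, the UTF-8 encoding, and the ASCII spelling of the two fluency file names are assumptions; the actual variable names should be taken from codebook.pdf rather than from this sketch.

```python
import csv
from pathlib import Path

# File names as listed in Table 2 (ASCII spelling assumed).
FILES = [
    "participants.csv",
    "associations.csv",
    "episodic_memory.csv",
    "associative_recall.csv",
    "animal_fluency.csv",
    "letter_fluency.csv",
]


def load_tables(data_dir):
    """Read each listed CSV found in data_dir into a list of row dicts."""
    tables = {}
    for name in FILES:
        path = Path(data_dir) / name
        if not path.exists():
            continue  # skip files that have not been downloaded
        with path.open(newline="", encoding="utf-8") as handle:
            tables[path.stem] = list(csv.DictReader(handle))
    return tables
```

Calling `load_tables("data")` on a hypothetical local folder named `data` returns a dictionary keyed by file stem, such as `"associations"`, with one dictionary per row.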
The data were published on the Open Science Framework
(10.17605/OSF.IO/VKWPS) on February 15, 2021.
The data are licensed under Creative Commons Attribution-
ShareAlike 4.0 International (CC BY-SA 4.0).
The reported data present the only publicly available re-
source containing large-scale free-association data on the in-
dividual level (cf. Morais et al., 2013). These data are
amenable to network analytic (Siew et al., 2019) and tra-
ditional approaches to free association data (Nelson et al.,
2001) that can shed light on individual and age-related dif-
ferences in semantic representations and retrieval. Of par-
ticular value is the fact that the large-scale free association
data are accompanied by a diverse cognitive battery, includ-
ing four tasks that can be linked to the free association data.
Further assessment of these links, for instance, using better
inference of the underlying network representation or more
elaborate models of cognitive performance, promises to
improve the understanding of experience-driven differences in
mental representations that may contribute to differences in
cognitive performance.
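One simple entry point to the network analytic approach is to turn one participant's cue-response pairs into a directed, weighted graph and to compare graphs across participants. The sketch below assumes that the association data have already been reduced to `(cue, response)` tuples for a single participant. The node and edge counts and the Jaccard overlap of edge sets are illustrative measures chosen for this example, not analyses reported with the data.

```python
from collections import defaultdict


def build_network(pairs):
    """Directed, weighted graph with edges[cue][response] = count."""
    edges = defaultdict(lambda: defaultdict(int))
    for cue, response in pairs:
        if response:  # skip missing responses
            edges[cue][response] += 1
    return edges


def summarize(edges):
    """Count distinct nodes and distinct directed edges."""
    nodes = set(edges)
    for targets in edges.values():
        nodes.update(targets)
    n_edges = sum(len(targets) for targets in edges.values())
    return {"nodes": len(nodes), "edges": n_edges}


def edge_jaccard(a, b):
    """Jaccard overlap of two edge sets (0.0 if both are empty)."""
    ea = {(c, r) for c, targets in a.items() for r in targets}
    eb = {(c, r) for c, targets in b.items() for r in targets}
    union = ea | eb
    return len(ea & eb) / len(union) if union else 0.0
```

Applied to two participants, `edge_jaccard` gives a rough index of how idiosyncratic their association networks are, which could then be related to differences in the cognitive measures.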
We thank Alina Gerlach for helping to collect the data.
We thank Laura Wiles for editing the manuscript. This work
was supported by a grant from the Swiss Science Foundation
(100015_197315) to Dirk U. Wulff.
Brysbaert, M., Buchmeier, M., Conrad, M., Jacobs, A. M., Bölte, J., & Böhl, A. (2011). The word frequency effect. Experimental Psychology, 58, 412–424.
Buchler, N. E. G., & Reder, L. M. (2007). Modeling age-related memory deficits: A two-parameter solution. Psychology and Aging, 22(1), 104–121.
De Deyne, S., Navarro, D. J., Perfors, A., Brysbaert, M., & Storms, G. (2019). The “Small World of Words” English word association norms for over 12,000 cue words. Behavior Research Methods, 51(3), 987–
Dubossarsky, H., De Deyne, S., & Hills, T. T. (2017). Quantifying the structure of free association networks across the life span. Developmental Psychology,
Griffiths, T. L., Steyvers, M., & Firl, A. (2007). Google and the mind: Predicting fluency with PageRank. Psychological Science, 18(12), 1069–1076.
Hamel, R., & Schmittmann, V. D. (2006). The 20-minute version as a predictor of the Raven Advanced Progressive Matrices test. Educational and Psychological
Healey, M. K., & Kahana, M. J. (2016). A four-component model of age-related memory change. Psychological Review, 123(1), 23–69.
Kalbe, E., Kessler, J., Calabrese, P., Smith, R., Passmore, A., Brand, M. a., & Bullock, R. (2004). DemTect: A new, sensitive cognitive screening test to support the diagnosis of mild cognitive impairment and early dementia. International Journal of Geriatric Psychi-
Kenett, Y. N., Beckage, N. M., Siew, C. S., & Wulff, D. U. (2020). Cognitive network science: A new frontier.
Lehrl, S., Triebig, G., & Fischer, B. (1995). Multiple choice vocabulary test MWT as a valid and short test to estimate premorbid intelligence. Acta Neurologica
Morais, A. S., Olsson, H., & Schooler, L. J. (2013). Mapping the structure of semantic memory. Cognitive
Naveh-Benjamin, M., Hussain, Z., Guez, J., & Bar-On, M. (2003). Adult age differences in episodic memory: Further support for an associative-deficit hypothesis. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29(5), 826–837.
Nelson, D. L., Zhang, N., & McKinney, V. M. (2001). The ties that bind what is known to the recognition of what is new. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27(5), 1147–1159.
Psychology Software Tools, Inc. (2016). E-Prime 3.0. https:
Ramscar, M., Hendrix, P., Shaoul, C., Milin, P., & Baayen, H. (2014). The myth of cognitive decline: Non-linear dynamics of lifelong learning. Topics in Cognitive
Siew, C. S., Wulff, D. U., Beckage, N. M., & Kenett, Y. N. (2019). Cognitive network science: A review of research on cognition through the lens of network representations, processes, and dynamics. Complexity,
Wechsler, D. (2008). Wechsler adult intelligence
Wulff, D. U., De Deyne, S., Aeschbach, S., & Mata, R. (2021, February 15). Understanding the aging lexicon by linking individuals’ experience, semantic networks, and cognitive performance. https://doi.org/10.
Wulff, D. U., De Deyne, S., Jones, M. N., Mata, R., & Aging Lexicon Consortium. (2019). New perspectives on the aging lexicon. Trends in Cognitive Sciences,
Wulff, D. U., Hills, T., & Mata, R. (2018, October 29). Structural differences in the semantic networks of younger and older adults. https://doi.org/10.31234/