A data set linking large-scale, individual semantic networks and
cognitive performance
Dirk U. Wulff1,2, Samuel Aeschbach1, Simon De Deyne3, and Rui Mata1,2
1University of Basel
2Max Planck Institute for Human Development
3University of Melbourne
We report data from a proof-of-concept study involving the concurrent assessment of
large-scale individual semantic networks and cognitive performance. The data include 10,800
free associations, collected using a dedicated web-based platform over the course of 2-4 weeks,
and responses to several cognitive tasks, including verbal fluency, episodic memory, and
associative recall tasks, from four younger and four older native German speakers. The data
are unique in scope and composition and shed light on individual and age-related differences
in mental representations and their role in cognitive performance across the lifespan.
Keywords: semantic networks, cognitive aging, individual differences
Collection Date
The data were collected from August to October 2018.
Background
Over the lifespan, people accumulate a large and idiosyncratic set of
experiences that shape their mental knowledge representations. These
experience-driven changes in mental representations could be a major
factor underlying typical age-related patterns, such as decreased
memory performance with increased age (Buchler & Reder, 2007;
Ramscar et al., 2014; Wulff et al., 2019). In line with this view,
recent research (Kenett et al., 2020; Siew et al., 2019) has documented
consistent differences in the size and structure of younger and older
adults’ mental representations (Dubossarsky et al., 2017; Wulff et al.,
2018, October 29). To evaluate whether and how strongly these
differences in representations contribute to differences in cognitive
performance across age, we designed the My Small World of Words
(MySWOW) project. Building on ongoing efforts to obtain word
association norms for several languages in a large online
citizen-science project, the Small World of Words (SWOW; e.g.,
De Deyne et al., 2019) study (https://smallworldofwords.org), MySWOW
aims to elicit large-scale free association networks from single
individuals and concurrently assess their cognitive performance across
a variety of tasks that are known to be linked to semantic
representations. MySWOW addresses shortcomings of previous research,
which either focused on group-level representations (Dubossarsky et
al., 2017) or did not concurrently assess cognitive performance on a
broad scale (Wulff et al., 2018, October 29). We present data from a
proof-of-concept study of MySWOW involving four younger and four older
individuals. For additional details of the study rationale, see
Wulff et al. (2021, February 15).

Dirk U. Wulff https://orcid.org/0000-0002-4008-8022
Samuel Aeschbach https://orcid.org/0000-0002-6167-4901
Simon De Deyne https://orcid.org/0000-0002-7899-6210
Rui Mata https://orcid.org/0000-0002-1679-906X
Correspondence concerning this article should be addressed to Dirk
Wulff, Department of Psychology, University of Basel, Missionsstrasse
60-62, 4055 Basel, Switzerland. E-mail: dirk.wulff@gmail.com
Methods
The MySWOW proof-of-concept study relied on a corre-
lational design encompassing the concurrent assessment of a
large number of free word associations and a broad battery of
cognitive tasks for four younger and four older individuals.
The free association task and cognitive battery were designed
to match each other in order to facilitate a comparison of se-
mantic networks and cognitive performance.
Participants
Four older adults aged 68 to 70 years and four younger
adults aged 24 to 28 years participated in and completed the
study. Three more participants began the study but dropped
out after completing 0.5%, 18.7%, and 41.7% of the free association task.
We only report data for the eight participants with complete
data. Participants were recruited from the participant pool
of the Center for Cognitive and Decision Sciences (CDS) of
2WULFF, AESCHBACH, DE DEYNE, AND MATA
the University of Basel. They were contacted via phone and
completed an initial screening to confirm the following
inclusion criteria: German or Swiss German as mother tongue,
daily access to a computer with a stable Internet connection,
and absence of neurological or psychiatric diagnoses.
Participants were compensated with a flat fee of CHF 220
(USD 245) for their full participation, consisting of CHF 180
for 3,600 answered cues (CHF 0.05 per cue) and CHF 40 for two
to three hours of laboratory assessment and instructions
(approx. CHF 15/h).
All data were recorded in reference to a random six-letter
identifier assigned to participants at the beginning of the
study. Identifying information such as names or addresses
was not recorded. Potentially identifying information such
as participants’ age, birthday, and profession was not
included in the publicly available files. Participants provided
informed consent that included permission for public sharing
of the data. The study was approved by the internal review
board of the Department of Psychology at the University of
Basel (# 014-17-1).
Materials
Free association task
Free associations were collected via a password-protected
web-based platform that participants could access from
home. In the association task, participants were sequentially
presented with a total of 3,600 cues for which they provided
three associations each, following the same procedure used
in SWOW. Participants were instructed to enter, using the
keyboard, the first three words that came to mind when think-
ing about the cue. If fewer than three words came to mind
or if the cue was not recognized, the participant could pro-
ceed to the next cue by clicking on a "no further responses"
or "unknown word" button, respectively. Figure 1 shows a
screenshot of the free association interface.
Figure 1
Screenshot of the free association task. The screenshot shows
one trial in the training mini-study requiring associations to the
cue "Büroklammer" (paper clip).

The 3,600 cues consisted of 3,000 unique and 600 repeated
cues. The 3,000 unique cues, in turn, consisted of three
subsets of 1,000 cues each. To ensure high coverage of central
words in people’s semantic networks, the first subset consisted
of the 1,000 highest-frequency words among the 4,500 cue words
that, at the time, were included in the German SWOW, with
frequency determined using the German SUBTLEX frequency norms
(Brysbaert et al., 2011). To ensure high coverage of the
connections within people’s networks, the second subset
consisted of those 1,000 of the remaining 3,500 cues in the
German SWOW that were most likely to produce one of the cues in
the first subset as a response. Finally, to ensure a high
network depth, the third subset consisted of the 1,000 most
frequent associates in the German SWOW given to the cues of the
first subset. The cues were presented to the participants in
the same fixed, randomly determined order.
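
The three-step construction of the unique cue set can be illustrated with a minimal Python sketch. It is not the authors' selection code: the function name, the toy frequency table, the toy norms, and the tie-breaking by alphabetical order are all hypothetical, and excluding already selected words from the third subset is an assumption made here only so that the subsets stay disjoint.

```python
from collections import Counter

def select_cues(frequency, norms, n):
    """Pick three cue subsets of size n (illustrative sketch only).

    frequency: dict mapping candidate cue -> corpus frequency
    norms:     dict mapping cue -> list of group-level associates
    """
    # Subset 1: the n highest-frequency candidate cues.
    subset1 = sorted(frequency, key=lambda w: (-frequency[w], w))[:n]
    core = set(subset1)

    # Subset 2: remaining cues ranked by the share of their associates
    # that are members of subset 1.
    def hit_rate(cue):
        responses = norms.get(cue, [])
        if not responses:
            return 0.0
        return sum(r in core for r in responses) / len(responses)

    remaining = [c for c in frequency if c not in core]
    subset2 = sorted(remaining, key=lambda c: (-hit_rate(c), c))[:n]

    # Subset 3: most frequent associates given to subset-1 cues.
    # Assumption: words already selected are skipped.
    counts = Counter(r for c in subset1 for r in norms.get(c, []))
    taken = core | set(subset2)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    subset3 = [w for w, _ in ranked if w not in taken][:n]
    return subset1, subset2, subset3

# Hypothetical toy data, not taken from the German SWOW or SUBTLEX.
frequency = {"haus": 90, "hund": 80, "katze": 30, "maus": 20, "dach": 10}
norms = {
    "haus": ["dach", "garten", "tuer"],
    "hund": ["katze", "leine", "garten"],
    "katze": ["hund", "maus", "hund"],
    "maus": ["kaese", "katze", "klein"],
    "dach": ["haus", "ziegel", "haus"],
}
subsets = select_cues(frequency, norms, n=2)
```

In the study itself, n would be 1,000 and the candidate pool would be the 4,500 German SWOW cues.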
Responses were cleaned in the following way. First, all
responses matching either individual words or composites
of words included in the German aspell dictionary were
accepted as valid. The remaining responses were then subjected
to manual correction. Overall, 4.2% of responses were corrected
manually, with a median string edit distance (i.e., the number
of letters that were changed) of 2 (mean = 2.42).
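
To make the reported edit-distance summary concrete, the following sketch computes a string edit distance between raw and corrected responses and summarizes it with the median and mean. The text does not state which edit distance was used; Levenshtein distance (insertions, deletions, and substitutions) is assumed here, and the raw/corrected pairs are invented examples rather than entries from the data.

```python
from statistics import mean, median

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (assumed metric)."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]

# Hypothetical (raw, corrected) pairs for illustration only.
corrections = [
    ("Büroklamer", "Büroklammer"),
    ("Hnud", "Hund"),
    ("Kazte", "Katze"),
    ("Schmeterlin", "Schmetterling"),
]
distances = [edit_distance(raw, fixed) for raw, fixed in corrections]
summary = {"median": median(distances), "mean": mean(distances)}
```

Note that under this metric a swap of two adjacent letters counts as two changes, which is one reason the choice of distance matters when comparing against the reported values.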
Cognitive assessment
The cognitive battery consisted of two sets of tasks fulfilling
different purposes. The purpose of the first set was to assess
people’s general cognitive abilities and functioning. This set
included a 20-minute timed version of the Advanced Progressive
Matrices (APM; Hamel & Schmittmann, 2006) as a measure of general
intelligence; a digit-symbol substitution test, as found in the
"coding" subtest of the Wechsler Adult Intelligence Scale IV
(WAIS-IV; Wechsler, 2008), as a measure of processing speed; the
Mehrfachwahl-Wortschatz-Intelligenztest: Form I (MWT-A; Lehrl et
al., 1995) as a measure of vocabulary size; and, finally, the
DemTect (Kalbe et al., 2004) as a screen for dementia. The
purpose of the second set was to establish word-level links
between the free association network and cognitive performance.
This set included 10-minute category (animals) and phonemic
(letter S) fluency tasks (e.g., Wulff et al., 2018, October 29),
an episodic list memory task modeled after the Penn
Electrophysiology of Encoding and Retrieval Study (e.g., Healey &
Kahana, 2016), and an associative recall task modeled after
Naveh-Benjamin et al. (2003). Behavior in the two fluency tasks
can be related to the free association network because both cues
and responses naturally
SEMANTIC NETWORKS AND COGNITIVE AGING 3
Table 1
Tasks in the cognitive battery
Task | Description | Motivation | Reference
Category fluency | Name all the animals you can in 10 minutes. | Predict performance from network | Wulff et al. (2018, October 29)
Phonemic fluency | Name all words starting with letter S you can in 10 minutes. | Predict performance from network | Griffiths et al. (2007)
Episodic memory task | Study a word list and then recall the words in any order (20 lists, 16 words per list). | Predict performance from network | Healey and Kahana (2016)
Associative recall task | Study a list of word pairs, then recall for each one word of a pair while being cued with the other (4 lists of 16 word pairs). | Predict performance from network | Naveh-Benjamin et al. (2003)
Advanced Progressive Matrices | Solve abstract reasoning problems. | General cognitive abilities | Hamel and Schmittmann (2006)
Digit-symbol substitution | Assign digits to symbols according to rule. | General cognitive abilities | Wechsler (2008)
Mehrfachwahl-Wortschatz-Intelligenztest | Recognize words in list of words and non-words. | General cognitive abilities | Lehrl et al. (1995)
DemTect | Various cognitive tasks. | Screen for age-related pathologies | Kalbe et al. (2004)
included animals and words starting with the letter S.
Participants retrieved between 62 and 113 animals and between
45 and 138 words starting with the letter S. The retrieved
animals overlapped with 1.5% of cues and 0.8% of responses,
whereas the retrieved words starting with the letter S
overlapped with 11.1% of cues and 11.9% of responses. The
episodic memory task and the associative recall task were
populated with nouns from the cue set to establish comparability
with the associative network. In the episodic memory task, a
total of 20 lists of 16 words each were studied and subsequently
recalled. Participants correctly recalled between 28.7% and
60.9% of words, with an additional 1.3% to 25% intrusions.
In the associative recall task, 4 lists consisting of 16 word
pairs each were presented and tested. Participants correctly
recalled between 32.8% and 96.8% of pairs. See also Table 1
for an overview of the tasks included in the cognitive assessment
in the MySWOW proof-of-concept study.
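
One way such overlap percentages could be computed is sketched below. The exact definition behind the reported figures is not given in the text, so this sketch assumes one possible reading: the share of distinct cue (or response) words that also occur among a participant's fluency responses, compared case-insensitively. The function name and word lists are hypothetical.

```python
def overlap_share(retrieved, reference):
    """Share of distinct reference words that also occur among the
    retrieved words (assumed definition, case-insensitive)."""
    reference_set = {word.lower() for word in reference}
    retrieved_set = {word.lower() for word in retrieved}
    if not reference_set:
        return 0.0
    return len(reference_set & retrieved_set) / len(reference_set)

# Hypothetical toy data for a single participant.
animal_fluency = ["Hund", "Katze", "Maus", "Hund"]
cues = ["Hund", "Haus", "Katze", "Baum"]
responses = ["Katze", "Leine", "Knochen", "Maus", "Käse"]

cue_overlap = overlap_share(animal_fluency, cues)
response_overlap = overlap_share(animal_fluency, responses)
```

Other readings, such as counting response tokens rather than distinct words, would yield different percentages.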
Entry and debriefing questionnaires
At study entry, participants provided demographic infor-
mation concerning their primary language (German or Swiss
German), their current profession, their highest academic de-
gree, and the income level of their household. Participants
further answered questions on their usual reading behavior,
e.g., the number of books read in a year. At debriefing, par-
ticipants were asked to provide information on their observa-
tions during the study, for example, whether they were able
to sustain concentration while working on the free associa-
tions. The specific questions are reported in the code book
(see Table 2).
Procedure
Participants passing the initial screening over the phone
were invited to our laboratory at the University of Basel
for an introductory session lasting approximately 30 minutes.
During this session participants provided informed consent,
completed the entry questionnaire, and were introduced to
the web-based platform using a training mini-study involv-
ing 15 cues. Over the course of the next weeks, participants
were instructed to log in and work on the free association task
twice a day for 30 minutes each. On average, participants
completed the free association task in 26.1 hours spread over
39.4 days. After completing the free association task, par-
ticipants were invited back to the laboratory for a three-hour
session that included the cognitive assessments and study de-
briefing.
The cognitive assessment and study debriefing session
consisted of the following elements. First, participants filled
out the debriefing questionnaire. Next, the verbal fluency
tasks were conducted orally and recorded for later transcription
by two student assistants responsible for data collection.
Following the verbal fluency tasks, the participants were
administered a 90-second timed Digit Symbol Substitution Test
in paper-and-pencil format. To conclude the first part of the
lab session, the Associative Recall task was completed as
a computerized task implemented in E-Prime (Psychology
Software Tools, Inc., 2016) on a lab computer. After a
10-minute break, the second part of the lab session began with
the List Memory task, which was also implemented as a
computerized task using E-Prime (Psychology Software Tools,
Inc., 2016). The Mehrfachwahl-Wortschatz-Intelligenztest
(MWT-A) was then conducted in paper-and-pencil format,
followed by a 20-minute timed version of the Advanced
Progressive Matrices (APM) in paper-and-pencil format. The lab
session concluded with the interactive verbal administration
of the DemTect, carried out by one of the student assistants.
Subsequently, participants received their monetary
compensation for participation.

Table 2
Description of Data Files
File | Description
participants.csv | Contains data on demographic information, reading behavior, debriefing survey, and all but four cognitive assessments.
associations.csv | Contains the corrected and uncorrected free association data.
episodic_memory.csv | Contains the episodic memory training and test data.
associative_recall.csv | Contains the associative recall training and test data.
animal_fluency.csv | Contains animal fluency response sequences.
letter_fluency.csv | Contains letter fluency response sequences.
codebook.pdf | Contains descriptions of all variable names in the data files.
Dataset description
Table 2 provides an overview of the different files con-
taining the data. All data are available as comma-separated
files. A codebook.pdf file provides descriptions of all vari-
able names across the data files. All variable names and data
labels have been translated to English. The association and
fluency data, however, were not translated.
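
As a starting point for working with the files, the sketch below reads whichever of the comma-separated files are present in a local folder using only the Python standard library. The file names are those listed in Table 2; the folder location, the UTF-8 encoding, and the helper name `load_tables` are assumptions. No column names are assumed, since these are documented in codebook.pdf.

```python
import csv
from pathlib import Path

# File names as listed in Table 2 (codebook.pdf is not a data table).
DATA_FILES = [
    "participants.csv",
    "associations.csv",
    "episodic_memory.csv",
    "associative_recall.csv",
    "animal_fluency.csv",
    "letter_fluency.csv",
]

def load_tables(data_dir):
    """Return {file name: list of row dicts} for the files found.

    Assumes UTF-8 encoded, comma-separated files with a header row.
    Files missing from data_dir are skipped silently.
    """
    tables = {}
    for name in DATA_FILES:
        path = Path(data_dir) / name
        if not path.exists():
            continue
        with path.open(newline="", encoding="utf-8") as handle:
            tables[name] = list(csv.DictReader(handle))
    return tables
```

A call such as `load_tables("data")` would then give access to the header of each file via the keys of its first row, which can be checked against the codebook before any analysis.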
The data were published on the Open Science Framework
(10.17605/OSF.IO/VKWPS) on February 15, 2021.
The data are licensed under Creative Commons
Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).
Reuse potential
The reported data present the only publicly available re-
source containing large-scale free-association data on the in-
dividual level (cf. Morais et al., 2013). These data are
amenable to network analytic (Siew et al., 2019) and tra-
ditional approaches to free association data (Nelson et al.,
2001) that can shed light on individual and age-related dif-
ferences in semantic representations and retrieval. Of par-
ticular value is the fact that the large-scale free association
data are accompanied by a diverse cognitive battery, includ-
ing four tasks that can be linked to the free association data.
Further assessment of these links, for instance, using better
inference of the underlying network representation or more
elaborate models of cognitive performance, promises to im-
prove the understanding of experience-driven differences in
mental representations that may contribute to differences in
cognitive performance.
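
As a simple illustration of the network analytic route, the sketch below turns one participant's cue-response pairs into a weighted, directed network in which each edge weight is the relative frequency of a response given its cue. This is only one of several possible network inference schemes and is not presented as the authors' method; the function name and the toy records are hypothetical, and how missing responses are coded in the actual files should be checked in the codebook.

```python
from collections import Counter, defaultdict

def association_strengths(records):
    """Map each cue to {response: relative frequency}.

    records: iterable of (cue, response) pairs for one participant.
    Empty or missing responses are skipped.
    """
    counts = defaultdict(Counter)
    for cue, response in records:
        if response:
            counts[cue][response] += 1
    strengths = {}
    for cue, responses in counts.items():
        total = sum(responses.values())
        strengths[cue] = {r: c / total for r, c in responses.items()}
    return strengths

# Hypothetical records; "Hund" appears twice as a repeated cue would.
records = [
    ("Hund", "Katze"),
    ("Hund", "Leine"),
    ("Hund", "Katze"),
    ("Katze", "Maus"),
    ("Katze", None),
]
network = association_strengths(records)

# In-degree: number of distinct cues that elicited each response.
in_degree = Counter(r for edges in network.values() for r in edges)
```

Measures derived from such a network, for example in-degree, could then be related to word-level behavior in the fluency and memory tasks.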
Acknowledgements
We thank Alina Gerlach for helping to collect the data.
We thank Laura Wiles for editing the manuscript. This work
was supported by a grant from the Swiss National Science Foundation
(100015_197315) to Dirk U. Wulff.
References
Brysbaert, M., Buchmeier, M., Conrad, M., Jacobs, A. M., Bölte, J., & Böhl, A. (2011). The word frequency effect. Experimental Psychology, 58, 412–424.
Buchler, N. E. G., & Reder, L. M. (2007). Modeling age-related memory deficits: A two-parameter solution. Psychology and Aging, 22(1), 104–121.
De Deyne, S., Navarro, D. J., Perfors, A., Brysbaert, M., & Storms, G. (2019). The “Small World of Words” English word association norms for over 12,000 cue words. Behavior Research Methods, 51(3), 987–1006.
Dubossarsky, H., De Deyne, S., & Hills, T. T. (2017). Quantifying the structure of free association networks across the life span. Developmental Psychology, 53(8), 1560–1570.
Griffiths, T. L., Steyvers, M., & Firl, A. (2007). Google and the mind: Predicting fluency with PageRank. Psychological Science, 18(12), 1069–1076.
Hamel, R., & Schmittmann, V. D. (2006). The 20-minute version as a predictor of the Raven Advanced Progressive Matrices Test. Educational and Psychological Measurement, 66(6), 1039–1046.
Healey, M. K., & Kahana, M. J. (2016). A four-component model of age-related memory change. Psychological Review, 123(1), 23–69.
Kalbe, E., Kessler, J., Calabrese, P., Smith, R., Passmore, A., Brand, M. a., & Bullock, R. (2004). DemTect: A new, sensitive cognitive screening test to support the diagnosis of mild cognitive impairment and early dementia. International Journal of Geriatric Psychiatry, 19(2), 136–143.
Kenett, Y. N., Beckage, N. M., Siew, C. S., & Wulff, D. U. (2020). Cognitive network science: A new frontier. Complexity, 6870278.
Lehrl, S., Triebig, G., & Fischer, B. (1995). Multiple choice vocabulary test MWT as a valid and short test to estimate premorbid intelligence. Acta Neurologica Scandinavica, 91(5), 335–345.
Morais, A. S., Olsson, H., & Schooler, L. J. (2013). Mapping the structure of semantic memory. Cognitive Science, 37(1), 125–145.
Naveh-Benjamin, M., Hussain, Z., Guez, J., & Bar-On, M. (2003). Adult age differences in episodic memory: Further support for an associative-deficit hypothesis. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29(5), 826–837.
Nelson, D. L., Zhang, N., & McKinney, V. M. (2001). The ties that bind what is known to the recognition of what is new. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27(5), 1147–1159.
Psychology Software Tools, Inc. (2016). E-Prime 3.0. https://support.pstnet.com/
Ramscar, M., Hendrix, P., Shaoul, C., Milin, P., & Baayen, H. (2014). The myth of cognitive decline: Non-linear dynamics of lifelong learning. Topics in Cognitive Science, 6(1), 5–42.
Siew, C. S., Wulff, D. U., Beckage, N. M., & Kenett, Y. N. (2019). Cognitive network science: A review of research on cognition through the lens of network representations, processes, and dynamics. Complexity, 2108423.
Wechsler, D. (2008). Wechsler adult intelligence
scale—fourth edition.
Wulff, D. U., De Deyne, S., Aeschbach, S., & Mata, R. (2021, February 15). Understanding the aging lexicon by linking individuals’ experience, semantic networks, and cognitive performance. https://doi.org/10.31234/osf.io/z3ebt
Wulff, D. U., De Deyne, S., Jones, M. N., Mata, R., & Aging Lexicon Consortium. (2019). New perspectives on the aging lexicon. Trends in Cognitive Sciences, 23(8), 686–698.
Wulff, D. U., Hills, T., & Mata, R. (2018, October 29). Structural differences in the semantic networks of younger and older adults. https://doi.org/10.31234/osf.io/s73dp