Reconceptualizing imitation in social tagging: a reflective
search model of human web interaction
Paul Seitlinger
Knowledge Technologies Institute
Graz University of Technology
Graz, Austria
paul.seitlinger@tugraz.at
Tobias Ley
School of Digital Technologies
Tallinn University
Tallinn, Estonia
tley@tlu.ee
ABSTRACT
We analyze psychological dynamics of human-Web interac-
tion exemplified by social tagging. Whereas previous mod-
els assumed tagging was driven by individual knowledge and
social imitation, we introduce a reflective search framework
that assumes user behavior (e.g., exploration and tagging of
web resources) to arise from an iterative search of human
memory shaped continuously by past and present learning
episodes. We formalize this framework by means of a mathe-
matical model of search of human memory which interrelates
episodic and semantic memory processes. This allows us to
simulate both temporal macro dynamics (stabilization of tag
distribution) and underlying temporal micro dynamics (re-
flecting and tagging a resource). While the former are well
covered by previous models, these models are not able to explain
the latter. We claim that shifting away from imitation
to reflective search holds great potential for understanding
and designing human-Web interaction more generally, and
for validating models of human memory in large-scale Web
environments.
CCS Concepts
• Information systems → Web searching and information
discovery; • Human-centered computing → Web-based
interaction; • Applied computing → Psychology;
Keywords
Social Tagging, Semantic Stabilization, Organism-Environment
Dynamics, Search of human memory
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for components of this work owned by others than
ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
to post on servers or to redistribute to lists, requires prior specific permission
and/or a fee. Request permissions from permissions@acm.org.
WebSci ’16, May 22-25, 2016, Hannover, Germany
© 2016 ACM. ISBN 978-1-4503-4208-7/16/05...$15.00
DOI: http://dx.doi.org/10.1145/2908131.2908157
1. INTRODUCTION
Social tagging on the Web allows studying social and cognitive
processes on a large scale, such as memory, categorization
and language use [7]. From a psychological research
perspective, this is interesting as previous experimentation
has been mainly lab-based, involving smaller samples of participants
and creating controlled conditions to study these
phenomena in depth. All these have to some extent limited
the ecological validity and generalizability of results [11]. For
example, when looking at memory processes in artificial in-
dividual learning trials in the lab, it may be overlooked that
human memory is predominantly a tool used to interact with
our social and material environment [2].
Making use of social tagging data for identifying regular-
ities in how humans interact with the Web has been on the
Web Science agenda for a number of years. Our reading
of this literature is that mainly two directions have been
pursued. The first has looked at collective stabilization and
dynamics, and in particular at how semantic stabilization
happens despite limited central control (e.g., [26],[9]). In this
strand of research, different forms of imitation have played a
major role, that is, models of how users of the tagging system
imitate other users’ tags, and how agreement about the use
of particular tags results from this (e.g., [8],[3],[9],[4],[28]).
The second line of research has been trying to establish
regularities of how individuals learn over time (e.g., [5],[24]),
and which tags they use in a particular moment to describe
a resource [14]. This has been important to suggest tag
recommender services that support users in the process of
tagging (e.g., [14]). Imitation has also played a role in this
second strand of research, e.g., in the semantic imitation
model of Fu and colleagues (e.g., [6],[5]).
If we want to leverage the massive data available in social
tagging systems for bringing forward psychological research,
then it is important to critically review the theoretical ba-
sis on which modelling of social tagging data rests. When
doing so, it seems to us that the focus on imitation in both
strands of social tagging research is to some extent surpris-
ing. The focus on imitation seems to suggest that the main
motivation of users of a social tagging system is that of im-
itating others. It seems as if people were constantly mak-
ing decisions of whether they should reuse another person’s
tag or not. And in fact, current generative models of so-
cial tagging separate imitation from the use of prior (also
called “background”) knowledge that the user brings to the
situation into two independent processes: I either draw on
my existing knowledge, or I copy a tag from someone else.
As we will show in section 2.2, this assumption has already
been empirically challenged in several studies conducted in
the tradition of the co-evolution model (e.g., [13]), and the
semantic imitation model (e.g., [5]).
From a psychological point of view, this artificial separa-
tion has obstructed the view on social tagging as a process
of searching and making sense of Web resources in which se-
mantic and episodic memory together produce a particular
set of tags in a concrete situation. While semantic memory
represents more general sedimentations of cultural meaning
making (e.g., that Paris is the capital of France),
episodic memory represents personal experiences (e.g., my
visit to Paris last year and memories from this; e.g., [27]).
In this paper, we therefore suggest a model that we call
“reflective search”. We start from the assumption that in-
stead of imitating others, the main motivation for some-
one to use a social tagging system is to support searching,
collecting and organizing Web resources. By doing so, the
model moves the focus away from imitation, and conceptu-
alizes social tagging as a search process and its reflections in
memory.
We will show that despite having no explicit imitation
component, the model nevertheless produces social stabi-
lization as a side effect because users are part of a shared
meaning making system. At the same time, and because
of its roots in memory search, the model also allows more
fine-grained predictions about the individual use of tags
that other models have not been able to make. The model
therefore unifies the distinct research traditions (on individ-
ual and collective processes) and integrates semantic and
episodic memory processes into one single framework.
In the next section, we first review research in social tagging.
We then propose the reflective search model and its
mathematical formalization. This formalization allows us
not only to validate the model with empirical data from a
large-scale social tagging system, but also to construct a
generative model of search and tagging. We subsequently
present a multi-agent simulation study that is able to repro-
duce several well-known individual and collective phenom-
ena and their relationship.
2. COLLECTIVE AND INDIVIDUAL PHE-
NOMENA IN SOCIAL TAGGING
2.1 The role of imitation and knowledge in so-
cial tagging
The seminal work of Golder & Huberman [8] introduces a
2-factor explanatory scheme for stable patterns in tag proportions
and has given the subsequent discourse on self-organizing
mechanisms in social tagging a strong and lasting impetus.
The first factor they put forward is shared background knowl-
edge that is assumed to cause different users to make similar
tag choices for a given Web resource with or without being
aware of others’ behavior. Thus, from a cognitive view-
point, [8] focus on a user’s semantic-lexical memory, which
is shaped by statistical structures of natural language and
mirrors commonsense word associations [21]. As a second
factor, their scheme invokes the notion of imitation as a
kind of social motive that drives users to seek “social
proof” and to regard popular tag recommendations as
the “right” way to categorize the given resource.
The stochastic model that Golder & Huberman [8] suggest
to formalize the quick development of stable tag proportions
is restricted to the social factor of imitation. Implemented
as a variant of Pólya’s urn model, it realizes a preferential
attachment mechanism and successfully reproduces
stable tag proportions typically observed in social tagging
systems (e.g., Delicious).
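The preferential attachment idea can be illustrated with a minimal sketch of a basic Pólya urn. This is not the specific variant used in [8]. The function name `polya_urn_proportions`, the initial urn contents and the seed are assumptions chosen only for illustration.

```python
import random
from collections import Counter

def polya_urn_proportions(initial_tags, n_draws, seed=0):
    """Basic Polya urn (illustrative assumption, not the variant of [8]):
    draw one tag at random from the urn, then put it back together
    with one additional copy of the same tag."""
    rng = random.Random(seed)
    urn = list(initial_tags)            # every list entry is one "ball"
    counts = Counter(urn)
    history = []                        # tag proportions after each draw
    for _ in range(n_draws):
        tag = rng.choice(urn)           # frequent tags are drawn more often
        urn.append(tag)                 # reinforcement: add one more copy
        counts[tag] += 1
        total = len(urn)
        history.append({t: c / total for t, c in counts.items()})
    return history

# Hypothetical usage: proportions fluctuate early and settle later on.
demo = polya_urn_proportions(["history", "language", "science"], 2000)
for step in (10, 100, 1000, 2000):
    print(step, {t: round(p, 3) for t, p in sorted(demo[step - 1].items())})
```

Because each draw changes any proportion by at most one over the current urn size, the proportions move less and less as the urn grows, which is the stabilization pattern the text refers to.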
One ambiguity of the 2-factor scheme is that [8] leave out
a specification of how the two factors are interrelated both
theoretically and formally. This may well have given rise
to the aforementioned artificial separation of knowledge and
imitation in how the field further developed. All subsequent
works either continued putting a formal focus on imitation
and neglecting the impact of knowledge, or they formally
implemented the explanatory scheme without a theoretical
reflection on the relationship between the two factors.
The works of [3] and [9] mainly attend to the factor of
imitation and suggest refinements of the preferential attach-
ment mechanism to better account for the characteristics of
stabilized rank-frequency distributions of tags [3] and for the
temporal dynamics in the stabilization of these distributions
[9]. [3] adds a power-law of forgetting to simulate a user’s
preference for tags used frequently and recently within the
system; [9] extends the mechanism by a sensitivity towards a
tag’s information value to simulate preferences for frequent
tags referring to concepts, which are neither too specific nor
too general in describing the resource.
Due to the inability of all these models to account for
a particular pattern, namely, the non-linear growth rate of
unique tags (e.g., [4], [18]), [4] were the first to propose
combining imitation with the stabilizing factor of shared background
knowledge, i.e., to implement the 2-factor scheme of
[8]. In their proposed “epistemic dynamic model” including
five free parameters, [4] use an imitation parameter to let
a simulated user either draw on her/his background knowledge
(by sampling a word from a topic-related Web corpus)
with probability I or choose a word from the currently
most popular tags with probability 1-I. In the sense of [16],
this proposed ‘either/or’ logic assuming stochastic indepen-
dence of knowledge and imitation exhibits high outcome but
– as we believe – low process validity. Even in most recent
works, e.g., [28], the ‘Background plus imitation’ model is
still widely used to explain semantic stabilization. In sum-
mary, although this model accounts for the emergence of sta-
ble patterns in social tagging, its underlying logic does not
square well with a series of empirical studies on individual
learning and human-Web interactions in social information
systems, as we will show next.
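The ‘either/or’ logic criticized above can be made concrete with a small sketch. It follows the wording of this section (background knowledge with probability I, popular tags otherwise) and is not the five-parameter model of [4]. The names `choose_tag`, `simulate_stream`, `i_param` and `top_n`, the uniform sampling from a vocabulary list (instead of a Web corpus) and the count-weighted choice among popular tags are all assumptions.

```python
import random

def choose_tag(background_vocab, tag_counts, i_param, top_n, rng):
    """Either/or choice (illustrative assumption): with probability i_param
    sample from background knowledge, otherwise copy one of the top_n
    currently most popular tags, weighted by its count."""
    if not tag_counts or rng.random() < i_param:
        return rng.choice(background_vocab)
    popular = sorted(tag_counts, key=tag_counts.get, reverse=True)[:top_n]
    weights = [tag_counts[t] for t in popular]
    return rng.choices(popular, weights=weights, k=1)[0]

def simulate_stream(background_vocab, n_assignments, i_param=0.3, top_n=7, seed=0):
    """Generate a tag stream in which the two sources never interact."""
    rng = random.Random(seed)
    tag_counts = {}
    stream = []
    for _ in range(n_assignments):
        tag = choose_tag(background_vocab, tag_counts, i_param, top_n, rng)
        tag_counts[tag] = tag_counts.get(tag, 0) + 1
        stream.append(tag)
    return stream, tag_counts

# Hypothetical usage with a toy vocabulary.
stream, counts = simulate_stream(["history", "war", "europe", "science"], 200)
print(sorted(counts.items(), key=lambda kv: -kv[1]))
```

In this sketch the branch taken at each step does not depend on what the simulated user knows about the resource, which is exactly the stochastic independence the text questions.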
2.2 Coupling of individual search and collec-
tive memory in tagging
First, research of [13] and [15] reveals a coupling of individ-
ual knowledge and joint artifacts (e.g., wiki articles, tag dis-
tributions): Whether a user’s interaction with the joint ar-
tifact results in a qualitative extension (e.g., by adding a se-
mantically new tag) or merely quantitative change (e.g., by
imitating existing tags) depends on the user’s prior knowl-
edge, in particular, on the congruity between the individual
understanding (e.g., resource interpretation) and the seman-
tics conveyed by the artifact (e.g., recommended tags). Sim-
ilarly, the ‘Semantic Imitation’ model of Fu and colleagues
that is based on empirical studies (e.g., [6]) and model-based
analyses (e.g., [5]) implies strong dependence of imitation on
the way users interpret Web resources: if people categorize
an object differently, they are less likely to choose similar
tags. In other words, the more people converge in their
background knowledge, the stronger is the impact of imita-
tion on the emergence of tag distributions.
Taken together, empirical evidence does not permit treating
imitation and knowledge as two stochastically independent
factors, and if the goal of Web Science is to achieve not only
high outcome validity but also high process validity, such evidence
should motivate the adaptation of established models
of social tagging.
From our perspective, the semantic imitation model [5]
provides an appropriate starting point for the endeavor of re-
fining dynamic models of social tagging. As a sophisticated
model of sensemaking built upon a scientifically sound the-
ory of human categorization [1], it models the formation and
refinement of mental categories during a user’s exploratory
search of the information ecology within a social tagging
system. By accounting for individual, episodic learning pro-
cesses as well as for the categorization of a present Web
resource, it also makes it possible to anticipate whether the corresponding
user complies with existing tags or whether she/he introduces
a new tag to express an idiosyncratic resource categorization.
That way, it establishes a link between individual-episodic
learning and the emergence of macro structures,
e.g., stabilized tag distributions. However, users simulated
by the semantic imitation model have “no [prior] common-
sense knowledge” ([5], p. 44) and thus, the model lacks
the integration of episodic learning into background knowledge,
i.e., into a pre-existing semantic-lexical memory that gets
shaped during an exploratory information search and supports
inter-individual sensemaking. For a model that should
anticipate dynamics on a macro level, we assume that it is
crucial to consider such prior semantic knowledge and how
it interacts with episodic learning during exploratory search.
For example, depending on the environment to be observed and understood
(e.g., Delicious vs. Pinterest), users can be more or
less heterogeneous in terms of pre-existing semantic mem-
ory, and the extent of heterogeneity in semantic memory can
be assumed to affect future dynamics and stabilization.
Based on this brief overview of dynamic models of social
tagging, we identify two major gaps that we will describe
next and address in this paper by introducing a unifying
model of ‘Reflective Search’ to be validated in an empirical
study and model-based analysis.
2.3 Contributions of a reflective search model
The first gap identified is that existing approaches put em-
phasis either on commonsense semantic memory structures
([8],[3],[9],[4]) or individual and episodic learning processes
(e.g., [5]). Therefore, the first goal is to introduce a unify-
ing ‘reflective search’ framework that helps to understand
how prior semantic memory and episodic learning (during
a user’s previous Web navigation) interact when users re-
flect upon a Web resource and choose tags to index their
reflections. That way, tagging becomes a verbalized epiphe-
nomenon embedded naturally into a cognitive user-resource
loop, and imitation becomes a correlate of inter-subjectivity
that refers to an inter-individual state of reflective agreement
with respect to a resource.
The second gap covered by our framework concerns the
temporal micro dynamics that unfold as a user reflects upon
and tags a resource. Previous models have targeted tem-
poral macro dynamics distributed across consecutive book-
marks of different users. The resulting tag resource stream
[4] has then been analyzed in terms of a stabilizing tag frequency
distribution and quantified by applying measures such
as Kullback-Leibler Divergence [9], Rank-Biased Overlap
[28], the rate at which the number of unique tags increases
[18] and tag proportions stabilize [8], or simply, calculat-
ing the probability of a bookmark including a new tag [25].
From all these studies, however, little has been learned about
the temporal micro dynamics underlying a single tag assign-
ment (TAS) of a given user. While the seminal work of [8]
has already included a cognitive and empirical consideration
of these dynamics, it has not been taken up in subsequent
models. In particular, [8] reveal early tags in a TAS to be
basic-level tags shared by many users, and tags at later TAS
positions to be idiosyncratic. Although it is immediately obvious
that micro and macro dynamics should be interlinked
– as basic level tags should drive semantic stabilization –
until now, no model has been developed that accounts for
temporal micro dynamics.
Therefore, the goal of this paper is i) to introduce the re-
flective search framework and a corresponding, mathemat-
ical model of human memory search that is called Context
Maintenance and Retrieval (CMR; e.g., [22],[19]) as a first
concretization, and ii) to demonstrate its ability to close
these two gaps by means of model-based simulations of empirical
patterns extracted from a Delicious dataset. In the
following, we first describe and then test the model.
3. REFLECTIVE SEARCH MODEL
3.1 Model Overview
Figure 1 shows a scheme of our ‘reflective search’ frame-
work that should help to clarify how it is applied to realize
two main model requirements: i) modeling the way a user’s
personal history, i.e., Web navigation, shapes pre-existing,
commonsense associations (semantic memory component)
through learning experiences (episodic memory component),
and ii) modeling how the user’s evolved memory (including
episodic and semantic components) supports the micro dy-
namics during reflecting and tagging a new Web resource.
Episodic learning during Web navigation. First, the fig-
ure shows two layers, which mutually influence each other.
There is a manifest layer, which can be observed (e.g., in log
files) and represents a user’s sequence of collected resources
(black dots). Please note that, within the notational system
of CMR, internal stimuli (e.g., thoughts) and environmen-
tal stimuli (e.g., Web resources or tags) are called items
and denoted as $f$. Thus, to refer to a Web resource (e.g.,
article), we use the symbol $f_i$, where the subscript $i$ indicates
the resource’s position in the user’s resource sequence.
Additionally, the model includes a latent layer, which is un-
observable and represents a sequence of context states (grey
dots) within a user’s memory. In the sense of [22], we use
the notion of context to refer to a blend of memories, which
is evoked by a Web resource (e.g., memories of similar articles)
at position $i$ and represented by the symbol $c_i$. This
retrieved context determines two kinds of search processes:
i) an ‘internal’ memory search for further memories (i.e., in-
ternal items) that – in the present study – is regarded as
a reflection upon a resource, and ii) an environmental Web
search for a further resource (i.e., environmental item $f_{i+1}$).
In the figure, the internal search process is illustrated by the
schematic helix spanned between the last black and grey dot.
The specific mechanism underlying this reflective and itera-
tive process that we also use to simulate tag choices will be
explained in detail below. The environmental search process
is indicated by the dashed arrows (e.g., from $c_i$ to $f_{i+1}$).
Figure 1: Reflective search scheme
[Figure labels: Web navigation; sequence of collected Web resources $f$ (resource $f_{i-1}$, $f_i$, $f_{i+1}$); context evolution $c$ (context $c_{i-1}$, $c_i$, $c_{i+1}$); context retrieval; episodic learning; reflection; tag choice; pre-existing semantic associations; integration of episodic f-c associations.]
Human memory is a very plastic, neural structure, keeping
traces of each episodic experience (e.g., [10]). Therefore,
searching memory or the Web is a cumulative learning process,
which always depends on past and current events, such
as consecutively collected and reflected Web resources. In
order to capture the learning processes during Web naviga-
tion, the model includes malleable associations storing item-
to-context associations (to retrieve resource-relevant con-
text) and context-to-item associations (to search for further
context-relevant items). In this study, each episode (reflecting
and tagging a Web resource) results in new episodic associations
between the current resource ($f_i$) and the current
context ($c_i$), which are integrated into former episodic associations
and added to pre-existing, semantic associations.
In Figure 1, the continuous integration of episodic learn-
ing experiences into pre-existing, semantic associations (first
modeling requirement) is illustrated by the grey area, which
increases in size along a user’s Web navigation. Thus, inter-
subjectivity, i.e., the extent to which different users agree
on their reflection upon a resource and are inclined to imi-
tate each other, depends on the impact (relative strength) of
pre-existing semantic knowledge and on users’ convergence
in previous Web navigation (personal learning episodes).
An essential construct in CMR is context evolution. The
current context $c_i$ is not only a blend of memories probed
by the current resource $f_i$ but it also blends with temporally
weighted context states retrieved by previous resources
(or internally retrieved items). Hence, it refers to an inter-
nally maintained context representation (activation pattern)
in a user’s memory that changes continuously in time (e.g.,
during Web navigation). For example, when I’m reading
an article found on the Web, my thoughts and reflections
upon it are not only affected by the memories evoked by
that present article but also by memories and associations
probed by previously encountered resources (e.g., other ar-
ticles, videos, etc.). In terms of CMR, the evolved context
state plays the role of an internal ‘spotlight’ that guides the
search of memory by ‘illuminating’ context-relevant items,
such as words that a user might use as search terms to nav-
igate to a new Web resource or as tags to index her or his
resource reflections. The latter micro-dynamics of reflecting
upon a resource and using tags to externalize these reflec-
tions (second modeling requirement) is described next.
Micro dynamics in reflecting and tagging a Web resource.
The schematic helix in Figure 1 (circulating between $f_{i+1}$
and $c_{i+1}$) illustrates the micro-dynamics that we assume to
underlie a resource reflection accompanied by corresponding
tag choices. In a first iteration t, a new resource (environ-
mental item) addressing particular semantic categories (e.g.,
‘history’ and ‘humanities’) probes item-to-context associa-
tions and retrieves new context (e.g., memories of articles
dealing with similar categories) to be integrated into the al-
ready evolved spotlight, which then probes context-to-item
associations to retrieve an internal item, such as a word se-
mantically related to the spotlight. We assume the user
to apply this retrieved word as a first tag. The retrieved
word continues the resource reflection by probing new item-
to-context associations in a subsequent search iteration t+1
and retrieving new context, which causes a slight spotlight
shift, in turn guiding the search for a new word, i.e., tag.
This iterative search proceeds until the spotlight stops shift-
ing sufficiently to let the user become aware of new semantic
aspects of the resource, i.e., until reflection comes to an end.
3.2 Model-based predictions and hypotheses
Based on this theoretical analysis of temporal micro dy-
namics within a single tag assignment (TAS), we derive a
first prediction that does not follow from any of the existing
models and helps to relate aspects of semantic stabilization
(e.g., [28]) to the basic-level effects in TAS observed by [8].
In particular, we predict that for any pair of users, the
probability of choosing similar tags for a given resource de-
creases along the consecutive positions within a TAS. In
other words, users are more likely to agree on earlier than
on later tags when describing the content of a resource. We
make this prediction, because according to CMR, the very
first search iteration of each user’s resource reflection is cued
by the same environmental item (i.e., Web resource), which
retrieves comparatively similar context to be integrated into
the internal spotlight, in turn guiding a relatively uniform
search for an internal item to be translated into the first
tag. In a later iteration t, however, the spotlight is already
a blend of contextual states, which have been updated by
the first environmental and the subsequent (t-1) internally
retrieved items. As the probability of two users’ iteration
sequences including different items gets larger with an in-
creasing sequence length, we can expect that the spotlights
and hence, tag choices of later iterations (i.e., positions in
TAS) diverge more strongly between users than tag choices
of earlier iterations. Hence, we attribute the basic-level ob-
servation of [8] to the assumption that users’ internal spot-
lights drift when reflecting upon a resource.
We will test this ‘drifting spotlight assumption’ by ana-
lyzing the stabilization of a tag resource stream separately
for early and later positions of consecutive TAS. As already
mentioned in section 2.3, stabilization of tag resource streams
has been characterized in different ways. Among others, it
is reflected by a decreasing probability of consecutive bookmarks’
TAS including a new tag (e.g., [25],[5]). More specifically,
if $p_{new}(r)$ denotes the probability of the $r$th bookmark’s
TAS including a new tag, then [25] and [5] have
shown that $p_{new}(r)$ decreases along consecutive bookmarks
$r$. If we further let $t$ indicate the position of a tag within
a TAS, the first, so-called ‘drifting spotlight’ hypothesis
$H_{DS}$ is that, independent of $r$, $p_{new}(r, t)$ increases monotonically
with increasing $t$. In other words, a tag added
by a user is more likely to be new if it appears later in the
user’s set of chosen tags. If $H_{DS}$ holds, we can derive the further
expectation that temporal dynamics on the micro level
(iterative search of memory) are coupled with those on the
macro level (emerging tag distribution) and thus, that stabilization
is more strongly pronounced for tags in early than
for tags in later TAS positions. In particular, the second,
so-called ‘micro macro coupling’ hypothesis $H_{MMC}$ is that
the exponential decline (slope) of $p_{new}(r, t)$ along consecutive
bookmarks $r$, which has already been observed by [25],
decreases with increasing $t$.
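To make the quantity $p_{new}(r, t)$ concrete, the following sketch estimates it as a relative frequency from tag resource streams. The function name `estimate_p_new`, the nested-list input format and the toy data are assumptions. A tag is counted as new here if it has not occurred earlier in the same resource's stream, including earlier positions of the same TAS; the paper's exact counting rule may differ.

```python
from collections import defaultdict

def estimate_p_new(streams):
    """streams: one tag resource stream per resource; each stream is a list
    of bookmarks (TAS) in temporal order, each TAS an ordered list of tags.
    Returns {(r, t): share of TAS whose tag at position t of bookmark r is
    new}, with r and t counted from 1 (illustrative assumption)."""
    new_counts = defaultdict(int)
    totals = defaultdict(int)
    for stream in streams:
        seen = set()                      # tags already used for this resource
        for r, tas in enumerate(stream, start=1):
            for t, tag in enumerate(tas, start=1):
                totals[(r, t)] += 1
                if tag not in seen:
                    new_counts[(r, t)] += 1
                seen.add(tag)
    return {key: new_counts[key] / totals[key] for key in totals}

# Hypothetical toy data: two resources with three bookmarks each.
toy_streams = [
    [["history", "war"], ["history", "europe"], ["history", "war", "napoleon"]],
    [["science", "physics"], ["science", "einstein"], ["physics", "science"]],
]
for key, value in sorted(estimate_p_new(toy_streams).items()):
    print(key, value)
```

With such estimates, the two hypotheses can be checked by comparing values across $t$ for fixed $r$ and by comparing the decline along $r$ for different $t$.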
We will test and validate the model by observing i) whether
the hypotheses comply with empirical data and ii) whether
a multi-agent simulation, in which each agent behaves according
to CMR assumptions, yields estimates of $p_{new}(r, t)$
for different $r$ and $t$ that fit empirical estimates well. In contrast
to all prior work on tagging models (except for [5]), our
model validation goes beyond a purely qualitative compar-
ison of predicted and observed data and instead, evaluates
the model’s goodness of fit quantitatively.
Though other and more refined measures of stabilization
have been applied (e.g., [9] or [28]), in this study we draw on
the measure of $p_{new}(r, t)$ as it yields a pattern of frequencies
that is most suitable for our parameter fitting technique and
goodness of fit test (see section 4.1).
3.3 Generative mechanism
In the multi-agent simulation reported below, every simulation
run involves a number of $m$ agents, each completing
first an episodic learning phase and then a social tagging phase. In
the first phase, a real user history (with a sequence of at least
20 Web resources) is sampled from a Delicious dataset [29]
and used to train the given agent according to CMR model
equations (see next section). Then, in the second phase, this
trained agent is exposed to a set of new resources (and as-
sociated tags assigned by previously active agents) in order
to reflect upon and tag each of them. That way, theoreti-
cal tag distributions emerge in the course of the simulated,
social tagging phase, whose characteristics, e.g., inter-agent
agreement in tag choices, or $p_{new}(r, t)$, can be compared
with characteristics of empirical distributions.
The dataset, on which we draw, includes bookmarks of
Wikipedia articles that we regard as environmental items,
each characterized by one or several predefined Wikipedia
top-level categories that have been assigned by the Wikipedia
community. In total, there are 25 categories, and we apply
them to represent a given article as a feature vector $f$, which,
according to CMR notation, is a standard basis vector “of
unit length with a single non-zero element” ([19] p. 339).
To comply with this form of notation, we let each element $j$
in $f$ correspond to one of a total of $J = 891$ unique category
combinations, where the vector’s only non-zero element indicates
the combination of categories assigned to the article.
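The encoding of an article as a standard basis vector over category combinations can be sketched as follows. The helper names `build_combination_index` and `feature_vector`, the article identifiers and the ordering of combinations are assumptions. In the study the index would cover all 891 combinations rather than the two shown here.

```python
def build_combination_index(article_categories):
    """Map every distinct category combination to a vector position j.
    article_categories: dict from article id to a set of category names.
    The alphabetical ordering is an arbitrary assumption."""
    combos = {frozenset(cats) for cats in article_categories.values()}
    ordered = sorted(combos, key=lambda combo: sorted(combo))
    return {combo: j for j, combo in enumerate(ordered)}

def feature_vector(categories, index):
    """Return a unit-length vector with a single non-zero element that
    marks the article's category combination."""
    f = [0.0] * len(index)
    f[index[frozenset(categories)]] = 1.0
    return f

# Hypothetical articles; A and C share the same category combination.
articles = {
    "A": {"history", "humanities"},
    "B": {"history", "language"},
    "C": {"humanities", "history"},
}
index = build_combination_index(articles)
for name, cats in articles.items():
    print(name, feature_vector(cats, index))
```

Note that two articles with the same set of categories receive identical vectors, so the model distinguishes articles only at the level of category combinations.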
Formal structures and dynamics.
The two layers illustrated in Figure 1 are represented
by the feature layer $F$ and the context layer $C$, including
$J = 891$ elements each. In the first episodic learning phase,
an agent passes through the sequence of articles (environmental
items) collected by the user assigned to the agent.
The states of the two layers $F$ and $C$ change along this sequence
and are represented at a particular position $i$ by the
two vectors $f_i$ and $c_i$, respectively. In the learning phase,
$f_i$ is given by the present article (i.e., its category combination
$j$). The two vectors $f_i$ and $c_i$ mutually influence each
other, and this item-context communication is mediated by
the two matrices $M^{FC}$ and $M^{CF}$, storing strengths of associations
from item to context ($M^{FC}$) and from context
to item elements ($M^{CF}$). Both matrices consist of a pre-existing,
semantic component ($M^{FC}_{pre}$, $M^{CF}_{pre}$) and an episodic
component ($M^{FC}_{epi}$, $M^{CF}_{epi}$). To simplify matters, $M^{FC}_{pre}$ is initialized
to an identity matrix, and the semantic associations
are built into $M^{CF}_{pre}$ (e.g., [19],[22]). Every element $(a, b)$ in
$M^{CF}_{pre}$, i.e., the strength of association between the category
combinations $a$ and $b$, is determined by calculating the Jaccard
Index $JI(a, b)$ scaled by the free parameter $s$. E.g., if $a$
includes ‘history’ and ‘humanities’ and $b$ includes ‘history’
and ‘language’, $JI(a, b)$ is 1/3. The episodic components are
initialized to 0, and the aim of the episodic learning phase
is to shape them by user-specific associations as well as to
give rise to a user-specific context vector (spotlight) $c$. This
learning process is formalized by CMR equations (1)-(5b)
validated through several psychological experiments.
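The Jaccard-based semantic associations can be sketched as follows. The function names are assumptions, and the matrix is left unscaled because the scale factor $s$ is applied in Equation 5b. Setting the diagonal to 0 is also an assumption: it is one way to read the remark that the identity matrix in Equation 5b keeps the on-diagonal associations unaffected by $s$.

```python
def jaccard_index(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def semantic_matrix(combinations):
    """Unscaled pre-existing context-to-item associations between category
    combinations. Off-diagonal entries are Jaccard indices; the diagonal is
    set to 0 (assumption, see lead-in)."""
    n = len(combinations)
    return [
        [0.0 if i == j else jaccard_index(combinations[i], combinations[j])
         for j in range(n)]
        for i in range(n)
    ]

# The worked example from the text: one shared category out of three.
a = {"history", "humanities"}
b = {"history", "language"}
print(jaccard_index(a, b))
print(semantic_matrix([a, b, {"science"}]))
```

The matrix is symmetric because the Jaccard Index does not depend on the order of its arguments.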
To simplify matters, we have implemented CMR’s core as-
sumptions and left out processes that are of minor relevance
for our research questions. In particular, we have not consid-
ered the primacy effect (memory advantage for early learn-
ing episodes), resulting in a simplification of Equation 5b.
As our research questions do not address inter-response times
of consecutive tags in a TAS, we have also abstracted from
detailed assumptions on selecting tags for output (Equation 8).
Beyond that, Equations 9 and 10 are not part
of CMR and are applied in this work to model a tagging-specific
process (i.e., the choice probability of a tag as an
interplay of reflection and previous users’ tag choices).
Episodic learning phase.
Context evolution. Consider an agent encountering an article at position $i$ (in the corresponding user’s bookmark sequence), which evokes a patterned activation in the feature layer $F$, i.e., $f_i$, and provides contextual input $c^{IN}_i$ to layer $C$ given by
$$c^{IN}_i = \frac{M^{FC} f_i}{\lVert M^{FC} f_i \rVert} \tag{1}$$
that is integrated into the context state $c_i$ through the context evolution equation:
$$c_i = \rho_i c_{i-1} + \beta_E c^{IN}_i \tag{2}$$
The drift parameter $\beta_E$ determines the rate (during the episodic learning phase) at which newly retrieved context (i.e., $c^{IN}_i$) is integrated into the context state $c_i$. $\rho_i$ is calculated at each position $i$ to weight the previously updated context state and to ensure that $\lVert c_i \rVert = 1$. It is given by
$$\rho_i = \sqrt{1 + \beta_E^2 \left[ (c_{i-1} \cdot c^{IN}_i)^2 - 1 \right]} - \beta_E (c_{i-1} \cdot c^{IN}_i) \tag{3}$$
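The context update of equations (1)-(3) can be illustrated with a small, dependency-free Python sketch. It is not the authors' code: the helper names (`dot`, `matvec`, `contextual_input`, `update_context`) and the three-dimensional toy vectors are assumptions made for illustration, and the drift value is the $\beta_E$ estimate reported in Table 1.

```python
import math


def dot(u, v):
    """Inner product of two equally long vectors."""
    return sum(x * y for x, y in zip(u, v))


def matvec(matrix, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, v) for row in matrix]


def contextual_input(m_fc, f):
    """Equation (1): project the item onto layer C and normalize to unit length."""
    raw = matvec(m_fc, f)
    norm = math.sqrt(dot(raw, raw))
    return [x / norm for x in raw]


def update_context(c_prev, c_in, beta):
    """Equations (2) and (3): blend old context and new input at unit length."""
    overlap = dot(c_prev, c_in)
    rho = math.sqrt(1.0 + beta ** 2 * (overlap ** 2 - 1.0)) - beta * overlap
    return [rho * old + beta * new for old, new in zip(c_prev, c_in)]


# Toy example (made-up vectors): identity M^FC and three features.
m_fc = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
c_prev = [1.0, 0.0, 0.0]  # unit-length starting context
c_in = contextual_input(m_fc, [0.0, 2.0, 0.0])
c_new = update_context(c_prev, c_in, beta=0.913)
print(math.sqrt(dot(c_new, c_new)))  # length of the updated context
```

Because $\rho_i$ is derived from the overlap between the old context and the new input, the updated vector keeps unit length for any drift rate between 0 and 1, provided both input vectors have unit length.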
Forming episodic associations. Hebbian outer-product learning is applied to form new episodic associations between $f_i$ and $c_i$ [22] given by
$$\Delta M^{FC}_{epi} = c_i f_i^T \tag{4a}$$
$$\Delta M^{CF}_{epi} = f_i c_i^T \tag{4b}$$
where $T$ indicates the transpose of $f_i$ and $c_i$, respectively. To maintain this learning effect, the episodic associations from item to context (equation 4a) and from context to item elements (equation 4b) are integrated into $M^{FC}$ and $M^{CF}$, respectively, according to the following weighted sums
$$M^{FC} = (1 - \gamma_{FC}) M^{FC}_{pre} + \gamma_{FC} M^{FC}_{epi} \tag{5a}$$
$$M^{CF} = (1 - \gamma_{CF})(D + s M^{CF}_{pre}) + \gamma_{CF} M^{CF}_{epi} \tag{5b}$$
where $\gamma_{FC}$ and $\gamma_{CF}$ are free parameters controlling the relative strengths of newly learned associations. $D$ is an identity matrix introduced to ensure that the on-diagonal associations are not affected by the semantic scale factor $s$.
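A rough Python sketch of the Hebbian step (equations 4a and 4b) and the weighted sums (equations 5a and 5b) is given below. It assumes, for illustration only, that the feature and context layers have the same dimensionality (so that the identity matrix $D$ fits) and that the off-diagonal entries of the pre-existing matrix are already Jaccard values; all function names and toy numbers are hypothetical.

```python
def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[x * y for y in v] for x in u]


def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(a, k):
    """Multiply every matrix element by the scalar k."""
    return [[k * x for x in row] for row in a]


def identity(n):
    """n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def hebbian_step(m_fc_epi, m_cf_epi, f, c):
    """Equations (4a) and (4b): add the outer products to the episodic matrices."""
    return add(m_fc_epi, outer(c, f)), add(m_cf_epi, outer(f, c))


def combine_fc(m_fc_pre, m_fc_epi, gamma_fc):
    """Equation (5a): weighted sum of pre-existing and episodic associations."""
    return add(scale(m_fc_pre, 1.0 - gamma_fc), scale(m_fc_epi, gamma_fc))


def combine_cf(m_cf_pre, m_cf_epi, gamma_cf, s):
    """Equation (5b): the semantic part is D + s * M_pre before weighting."""
    semantic = add(identity(len(m_cf_pre)), scale(m_cf_pre, s))
    return add(scale(semantic, 1.0 - gamma_cf), scale(m_cf_epi, gamma_cf))


# Toy example with two features and two context units (made-up numbers).
zeros = [[0.0, 0.0], [0.0, 0.0]]
f, c = [1.0, 0.0], [0.6, 0.8]
m_fc_epi, m_cf_epi = hebbian_step(zeros, zeros, f, c)
m_fc = combine_fc(identity(2), m_fc_epi, gamma_fc=0.5)
m_cf = combine_cf([[0.0, 0.5], [0.5, 0.0]], m_cf_epi, gamma_cf=0.5, s=2.0)
print(m_fc, m_cf)
```

Note how `combine_cf` adds the identity matrix before weighting, so that only the off-diagonal (semantic) entries are scaled by `s`.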
Then, the next article (i.e., $f_{i+1}$) is passed on to the agent to repeat these processes, i.e., equations (1)-(5b). Once it has processed all of the user’s articles, the agent has integrated user-specific episodic associations into pre-existing associations and has formed a unique context state. Next, we show how to make use of the evolved structures and states to simulate realistic tag assignments.
Social tagging phase.
Iterative search of memory: Reflecting upon an article. In this phase, each of the $m$ trained agents assigns tags to the same set of $n_a$ new articles. The consecutive assignment of tags to a given article is realized as an iterative search of memory, which is triggered by a newly presented article and involves several iterations $t$, each yielding a tag the agent assigns to the article. In the first iteration ($t = 1$), the article evokes a pattern $f_t$ in $F$ and provides input to $C$ via $M^{FC}$ given by equation (1), where in this phase we apply the running index $t$ instead of $i$.
Again, the newly retrieved context updates the context state according to the context evolution equation (2), where this time (i.e., during the tagging phase), the drift parameter depends on the iteration number $t$:
$$\beta_t = \begin{cases} \beta_E & \text{if } t = 1 \\ \beta_I & \text{if } t > 1 \end{cases} \tag{6}$$
If $t = 1$, $f_t$ represents an environmental item (article) and thus, the rate of context integration is controlled by the parameter $\beta_E$ already applied in the episodic phase. However, in subsequent iterations ($t > 1$), $f_t$ represents an internal item (see below) and thus, a different drift parameter, i.e., $\beta_I$, is applied to control the rate of context integration.
Once the context state has been updated, $c_t$ plays the role of a spotlight and guides the search for a new item stored in an agent’s memory. In particular, $c_t$ evokes an activation pattern $a$ on the feature layer $F$ given by
$$a = M^{CF} c_t \tag{7}$$
where $a$ includes an activation value for each of the $J = 891$ elements $j$ in $F$, i.e., category combinations. Following [20], the probability $P(j)$ of retrieving $j$ is defined as
$$P(j) = \frac{a_j^{\tau}}{\sum_{k}^{J} a_k^{\tau}} \tag{8}$$
where $\tau$ is a free parameter controlling the sensitivity of $P(j)$ to activation differences. The category combination retrieved is denoted as $j_t$ and becomes the internal item that cues the next iteration $t + 1$, which proceeds along equations (1), (2) and (6)-(8).
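To illustrate the retrieval rule of equation (8), the following Python sketch turns a vector of activations into retrieval probabilities and samples one item. The function names and the toy activations are assumptions for illustration; the sketch also assumes non-negative activations, since a negative value raised to a non-integer $\tau$ is not a real number.

```python
import random


def retrieval_probabilities(activations, tau):
    """Equation (8): activations raised to the power tau, normalized to sum to 1."""
    powered = [a ** tau for a in activations]
    total = sum(powered)
    return [p / total for p in powered]


def sample_item(probabilities, rng):
    """Draw the index j_t of the retrieved item according to P(j)."""
    return rng.choices(range(len(probabilities)), weights=probabilities, k=1)[0]


activations = [1.0, 2.0]  # made-up activation values
flat = retrieval_probabilities(activations, tau=1.0)
sharp = retrieval_probabilities(activations, tau=2.0)
print(flat, sharp)
j_t = sample_item(sharp, random.Random(0))
```

A larger `tau` sharpens the distribution: the more strongly activated item takes a larger share of the probability mass, which is the sense in which $\tau$ controls sensitivity to activation differences.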
Indexing each search iteration $t$: Tagging an article. We assume that in each iteration, the retrieved (internal) item becomes manifest (e.g., visible to others) if the agent indexes the item’s category combination $j_t$ in the form of a tag $w$. To model this indexing process, we extend CMR by a simple mechanism and let every agent have access to a lexicon $L$, which is the set of all tags $w$ that have been generated by the $m$ users in the Delicious dataset. We assume that the probability $P(w)$ of choosing the tag $w$ depends on its semantic utility $u_w$, which is affected by two interacting variables: the agent’s reflection on the article $f_1$ and the behavior of former agents that have already assigned tags to $f_1$. Note that in the tagging phase, $f_1$ refers to the present article, as the subscript does not indicate the article’s position but the current iteration of memory search. Referring to the first variable, in each iteration $t$, every tag takes on a particular semantic strength $p(w|j_t)$ that is approximated by the number of times $w$ co-occurs with $j_t$ divided by the total number of occurrences of $w$ in the entire Delicious dataset.
The second variable (behavior of previous agents) has an amplifying effect on the semantic strength (the first variable). We assume that the utility of a semantically appropriate tag $w$ further increases if it is presented to the agent via a tag recommendation mechanism. In Delicious, the probability of a tag $w$ being recommended depends on its popularity, and in our model, we approximate it by $p(w|f_1)$, which represents the relative frequency with which $w$ has already been assigned to $f_1$ by previous agents. Therefore, the semantic utility $u_w$ is given by
$$u_w = p(w|j_t)\,[1 + p(w|f_1)]^{\phi} \tag{9}$$
where $\phi$ is a free parameter controlling the extent to which a tag’s popularity amplifies its semantic utility. Finally, the probability $P(w)$ of choosing tag $w$ results from
$$P(w) = \frac{u_w}{\sum_{k}^{N} u_k} \tag{10}$$
Taken together, equations (9) and (10) represent the assumption that a tag’s semantic utility, and consequently its choice probability, is greater than 0 only if it complies with the agent’s interpretation of the article (i.e., if $p(w|j_t) > 0$). Furthermore, if the tag is semantically appropriate and, due to inter-subjectivity, has also been assigned to the article by former agents (i.e., $p(w|f_1) > 0$), the tag’s utility and choice probability get larger.
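The interplay expressed by equations (9) and (10) can be made concrete with a short Python sketch. The dictionaries of semantic strengths and popularities below are made-up toy values, and the function names are hypothetical; the handling of the all-zero case is an assumption made here.

```python
def semantic_utility(p_w_given_j, p_w_given_f1, phi):
    """Equation (9): semantic strength amplified by the tag's popularity."""
    return p_w_given_j * (1.0 + p_w_given_f1) ** phi


def choice_probabilities(strengths, popularities, phi):
    """Equation (10): utilities normalized over all tags in the lexicon."""
    utilities = {
        w: semantic_utility(p, popularities.get(w, 0.0), phi)
        for w, p in strengths.items()
    }
    total = sum(utilities.values())
    if total == 0.0:
        return {w: 0.0 for w in utilities}  # assumed: no tag fits the item
    return {w: u / total for w, u in utilities.items()}


# Toy values: p(w | j_t) per tag and p(w | f_1) from previous agents.
strengths = {"history": 0.5, "war": 0.5, "cooking": 0.0}
popularities = {"history": 1.0, "cooking": 1.0}
probs = choice_probabilities(strengths, popularities, phi=1.0)
print(probs)
```

In this toy case the popular tag 'cooking' still has probability 0 because it does not match the retrieved item, whereas 'history' is both semantically appropriate and popular and is therefore preferred over 'war'.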
4. MULTI-AGENT SIMULATION STUDY
A multi-agent simulation is a way to validate complex socio-cognitive mechanisms that are difficult to observe due to the underlying dynamical inter-relations of several variables. Typically, an empirical pattern is first identified (in our case, the micro-level dynamics on the one hand and the emerging tag distribution on the other) and then the generative mechanism is used to try to simulate the pattern. Hence, a multi-agent simulation not only allows for validating the outcome, but also allows for process validation by comparing the empirical with the simulated pattern. In the multi-agent simulation study described next, each of the $m$ agents behaves according to the generative mechanism, i.e., passes through an episodic learning phase (equations 1-5b) and subsequently participates in a social tagging phase. That way, we can test our hypotheses more stringently and in two steps: first, we test them empirically by analyzing Delicious data, and second, we compare the empirical data with the results of the simulation by performing a goodness-of-fit test.
Figure 2: Decline of $p_{new}(r, t)$, the probability of a tag being new at a particular bookmark $r$ and a particular position $t$ within a tag assignment (TAS), along consecutive bookmarks and for varying TAS positions. Panels: (a) $t = 1$, (b) $t = 2$, (c) $t = 3$, (d) $t = 4$. Each panel plots the probability of a new tag $p_{new}(r, t)$ (y-axis, 0.4 to 1.0) against consecutive bookmarks $r$ (x-axis, 1 to 10) for Data and Model.
In the tagging phase, a tag assignment (TAS) emerges as an agent reflects upon a given article along an iterative search of memory (equations 1, 2, and 6-8), where each iteration $t$ results in a tag choice (equations 9 and 10). In line with the notation introduced in section 3.2, $t$ indicates a tag’s position within a TAS. In the multi-agent simulation, each article is tagged consecutively by all $m$ agents, where $r$ indicates the agent’s position within this agent sequence. Thus, $r$ corresponds to the $r$th bookmark of a resource within the Delicious dataset, and $p_{new}(r, t)$ represents the probability that a tag, which is chosen by the $r$th agent in iteration $t$, is new.
In section 3.2, we have derived the ‘drifting spotlight’ hypothesis $H_{DS}$, which states that estimates of $p_{new}(r, t)$ should increase monotonically along the TAS positions, i.e., with increasing $t$. The rationale behind $H_{DS}$ is that as $t$ increases, the difference in context states between any pair of agents (or users) gets larger because, during reflection, their spotlights drift and are determined less and less by a shared environmental item and more and more by internal items that may differ between individuals. As we assume that these temporal micro dynamics give rise to temporal macro dynamics (i.e., stabilization), the second, ‘micro macro coupling’ hypothesis $H_{MMC}$ predicts that the slope of the exponential decline of $p_{new}(r, t)$ along consecutive $r$ is larger for early than for later TAS positions $t$.
The questions that arise now are: i) can we actually observe an empirical pattern within the Delicious dataset that harmonizes with the hypotheses, and ii) can we simulate a distribution of $p_{new}(r, t)$ for different $r$ and $t$ that does not diverge significantly from the empirical distribution?
4.1 Parameter fitting technique
Before describing the genetic algorithm by which we search the model’s parameter space to find an optimal set of parameters, we first explain how we obtain the simulated and empirical frequencies underlying estimates of $p_{new}(r, t)$.
Simulated frequencies. In each simulation run, a computationally feasible number of $m = 10$ agents are involved and thus, for every new article to be tagged in the social tagging phase, we get a sequence of $r = 1, ..., m$ bookmarks (and associated TAS), where we set the number of new articles to $n_a = 30$. Furthermore, we let each agent assign $n_t = 4$ tags to every article, and thus get a sequence of $t = 1, ..., n_t$ positions within a TAS. For the first agent tagging a given article, $p_{new}(1, t) = 1$, so only the bookmarks $r = 2, ..., 10$ are informative. Therefore, the result of each simulation run is a 4 (positions) × 9 (bookmarks) contingency table, where each cell includes a count for the number of resources to which new tags have been assigned (at a given $r$ and $t$). Then, $p_{new}(TAS_{r,t})$ is estimated by dividing the count of cell $(t, r)$ by $n_a = 30$. After obtaining the best-fitting parameter estimates (see Genetic algorithm below), we conduct 500 simulation runs and obtain the final estimates of $p_{new}(TAS_{r,t})$ by determining the average 4 × 9 contingency table.
Empirical frequencies. We have extracted $n_a = 1,046$ articles for which there are at least $m = 10$ bookmarks, each described by at least $n_t = 4$ tags. That way, we get a corresponding empirical 4 × 9 contingency table, in which each cell $(t, r)$ includes the count for the number of resources to which new tags have been assigned at the corresponding TAS position $t$ and bookmark number $r$. Again, we estimate the empirical $p_{new}(TAS_{r,t})$ by dividing the count of cell $(t, r)$ by $n_a = 1,046$.
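The construction of the contingency table can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' procedure: a tag counts as ‘new’ here if it does not appear among the first $n_t$ tags of any earlier bookmark of the same resource, and the function names and toy bookmarks are made up.

```python
def count_new_tags(resources, m, n_t):
    """Build an n_t x (m - 1) table of counts of new tags per (t, r) cell.

    `resources` is a list of resources; each resource is a list of at least
    m bookmarks in chronological order; each bookmark is a list of tags.
    """
    counts = [[0] * (m - 1) for _ in range(n_t)]
    for bookmarks in resources:
        seen = set(bookmarks[0][:n_t])  # the first bookmark is new by definition
        for r in range(1, m):
            tas = bookmarks[r][:n_t]
            for t, tag in enumerate(tas):
                if tag not in seen:
                    counts[t][r - 1] += 1
            seen.update(tas)
    return counts


def estimate_p_new(counts, n_resources):
    """Divide each cell count by the number of resources."""
    return [[c / n_resources for c in row] for row in counts]


# Toy data: two resources, m = 3 bookmarks, n_t = 2 tags per bookmark.
resources = [
    [["a", "b"], ["a", "c"], ["c", "d"]],
    [["x", "y"], ["z", "x"], ["x", "y"]],
]
counts = count_new_tags(resources, m=3, n_t=2)
p_new = estimate_p_new(counts, n_resources=len(resources))
print(counts, p_new)
```

With $m = 10$ and $n_t = 4$, the same function would yield the 4 × 9 table described in the text, with rows indexing TAS positions and columns indexing bookmarks 2 to 10.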
Genetic algorithm. Following the CMR parameter fitting technique, we apply a genetic algorithm to find the best-fitting parameters, drawing on the R package GA [23]. This algorithm aims to minimize fitness values for populations of parameter sets, where we define fitness as the sum of squared errors between model (simulated estimates of $p_{new}(r, t)$) and data (empirical estimates of $p_{new}(r, t)$), weighted by the standard error (SE) of the data (see also [22],[19]). GA starts by generating an initial population of strings, which are randomly generated parameter sets. Here, we set the population size to 500. Then, fitness evaluation takes place and the model-data divergence of each string is determined to select the fittest strings (in this study, the top 5%). In a subsequent phase of exploration, processes of mutation (randomly altering the values of selected strings) and crossover (combining values of two selected ‘parent’ strings) allow generating a further ‘generation’ of strings. These steps (population generation, evaluation, exploration) are repeated until a particular criterion of convergence is reached, where, in this study, “GA stops if a number of [200] generations has passed without improvement” [23].
Table 1: Summary of free parameters
| Parameter | Description | Range | Estimates |
|---|---|---|---|
| $\beta_E$ | Context integration rate, environmental item | 0-1 | 0.913 |
| $\beta_I$ | Context integration rate, internal item | 0-1 | 0.213 |
| $s$ | Semantic scale factor | 0-5 | 3.422 |
| $\gamma_{FC}$ | Relative strength of $M^{FC}_{epi}$ | 0-1 | 0.356 |
| $\gamma_{CF}$ | Relative strength of $M^{CF}_{epi}$ | 0-1 | 0.244 |
| $\phi$ | Amplification factor | 0-10 | 8.773 |
| $\tau$ | Sensitivity parameter | 0-5 | 4.075 |
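The search loop described in the Genetic algorithm paragraph can be sketched in a few lines of Python. This is a generic toy genetic algorithm and neither the R package GA nor its API: the operators (uniform crossover, Gaussian mutation), the default settings, the form of the SE weighting, and the toy fitness function (a deterministic exponential curve standing in for the stochastic simulation) are all assumptions made for illustration.

```python
import math
import random


def weighted_sse(model, data, se):
    """Squared model-data errors, each divided by the data's SE (assumed weighting)."""
    return sum(((m - d) / s) ** 2 for m, d, s in zip(model, data, se))


def genetic_minimize(fitness, bounds, pop_size=50, elite_frac=0.05,
                     mutation_sd=0.1, patience=20, max_generations=500, seed=0):
    """Minimize `fitness` over box-constrained parameter sets ('strings')."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    n_elite = max(2, int(elite_frac * pop_size))
    best, best_fit, stale = None, float("inf"), 0
    for _ in range(max_generations):
        ranked = sorted(population, key=fitness)      # evaluation
        top_fit = fitness(ranked[0])
        if top_fit < best_fit:
            best, best_fit, stale = ranked[0], top_fit, 0
        else:
            stale += 1
            if stale >= patience:                     # convergence criterion
                break
        elite = ranked[:n_elite]                      # selection of the fittest
        children = []
        while len(children) < pop_size - n_elite:     # exploration
            p1, p2 = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(p1, p2)]   # crossover
            child = [min(hi, max(lo, x + rng.gauss(0.0, mutation_sd * (hi - lo))))
                     for x, (lo, hi) in zip(child, bounds)]      # mutation
            children.append(child)
        population = elite + children
    return best, best_fit


# Toy problem: recover the decay rate of a made-up exponential curve.
R = range(1, 11)
DATA = [math.exp(-0.09 * r) for r in R]
SE = [0.05] * len(DATA)  # made-up standard errors


def toy_fitness(params):
    model = [math.exp(-params[0] * r) for r in R]
    return weighted_sse(model, DATA, SE)


best, best_fit = genetic_minimize(toy_fitness, bounds=[(0.0, 1.0)])
print(best, best_fit)
```

Because the fittest strings are carried over unchanged into the next generation, the best fitness value never gets worse from one generation to the next, which makes the ‘no improvement for a number of generations’ stopping rule well defined.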
The best-fitting parameter set yielded by GA is shown in Table 1 and discussed in section 4.2. For validating the model, we have used this set of model parameters to conduct 500 simulation runs and to obtain the simulated contingency table (see above) underlying the estimates of $p_{new}(TAS_{r,t})$. To evaluate the model quantitatively, we have first multiplied each simulated $p_{new}(r, t)$ by the total number of empirical observations (i.e., 1,046) and then compared these simulated frequencies with the corresponding empirical frequencies using a $\chi^2$ goodness-of-fit test. Given 36 (= 4 × 9) data points and 7 free parameters, there are 29 degrees of freedom, and the critical goodness-of-fit statistic (at $\alpha = .05$) is $\chi^2_{crit}(29) = 42.56$, which should not be exceeded by our model if it fits the data well.
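A hedged Python sketch of this comparison is shown below. It treats the empirical counts as observed and the simulated frequencies as expected values, which is one plausible reading of the procedure; the function names and the toy counts are made up and do not reproduce the paper's data.

```python
def expected_frequencies(p_simulated, n_observations):
    """Turn simulated probabilities into frequencies on the empirical scale."""
    return [p * n_observations for p in p_simulated]


def chi_square_statistic(observed, expected):
    """Pearson's chi-square: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def degrees_of_freedom(n_cells, n_free_parameters):
    """Cells minus free parameters, as used in the text (36 - 7 = 29)."""
    return n_cells - n_free_parameters


# Toy example with two cells (made-up numbers).
observed = [10.0, 20.0]
expected = expected_frequencies([0.1, 0.25], n_observations=100)
chi2 = chi_square_statistic(observed, expected)
df = degrees_of_freedom(36, 7)
print(chi2, df, chi2 <= 42.56)
```

The model is considered to fit if the statistic computed over all 36 cells stays below the critical value for 29 degrees of freedom.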
4.2 Results and Discussion
Testing model predictions with behavioral data.
The ‘drifting spotlight’ hypothesis $H_{DS}$ assumes $p_{new}(r, t)$ to increase monotonically with increasing $t$ (independent of $r$). Remember that $p_{new}(r, t)$ is the estimated probability that a tag, which occurs in the $r$th bookmark at iteration (or TAS position) $t$, is new. Figure 2 presents the results: the four diagrams show the decline of $p_{new}(r, t)$ along the first $r = 1, ..., 10$ bookmarks separately for the $t = 1, ..., 4$ TAS positions. We first describe the solid curves, which represent the empirical estimates. Independent of $r$, the average probability of introducing a new tag, $\bar{p}_{new}$, increases from left to right, i.e., from the first to the fourth TAS position $t$. The four corresponding means can be found in Table 2 (see columns ‘Data’ and ‘$\bar{p}_{new}$’) and reveal a monotonic increase of $\bar{p}_{new}$ along the four positions (from .580 at $t = 1$ to .708 at $t = 4$). In line with $H_{DS}$, we therefore conclude that users are more likely to agree on early than on later tags (of a TAS) when reflecting on and tagging the content of a resource.
Table 2: Descriptive characteristics ($\bar{p}_{new}$ and $\lambda$) of empirical and simulated data points
| TAS position | Data $\bar{p}_{new}$ | Data $\lambda$ | Model $\bar{p}_{new}$ | Model $\lambda$ |
|---|---|---|---|---|
| $t = 1$ | .580 | .093 | .584 | .089 |
| $t = 2$ | .639 | .077 | .633 | .078 |
| $t = 3$ | .669 | .069 | .665 | .069 |
| $t = 4$ | .708 | .060 | .700 | .064 |
The second hypothesis $H_{MMC}$ assumes that this relationship between inter-user agreement and TAS position is coupled with temporal dynamics on the macro level (tag resource stream) and gives rise to a more strongly pronounced stabilization at early than at later TAS positions. The solid curves plotted in the diagrams of Figure 2 square well with $H_{MMC}$, as their slopes $\lambda$ seem to decrease from $t = 1$ (Figure 2, diagram a) to $t = 4$ (Figure 2, diagram d). Drawing on [18], who have found that the rate at which the number of unique tags increases follows an exponential function, we have obtained estimates of $\lambda$ by fitting the theoretical function $p(r, t) = e^{-\lambda r}$ to each of the four positions $t$. The amount of variance explained by this theoretical exponential is 95%, 96%, 97%, and 97% for $t = 1$, $t = 2$, $t = 3$, and $t = 4$, respectively. Table 2 provides these estimates of $\lambda$ (see columns ‘Data’ and ‘$\lambda$’) and discloses that the slope indeed gets smaller as $t$ gets larger (from .093 at $t = 1$ to .060 at $t = 4$). We therefore conclude, in accordance with $H_{MMC}$, that it is the early tags in a TAS which drive temporal macro dynamics, i.e., stabilization.
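The text does not state how the exponential was fitted. As one simple possibility, the Python sketch below estimates $\lambda$ by least squares on the log scale (a line through the origin for $\ln p$ against $r$) and then computes the share of variance explained on the original scale. The function names and the synthetic data are assumptions for illustration only.

```python
import math


def fit_decay_rate(r_values, p_values):
    """Least-squares estimate of lambda in ln p = -lambda * r (no intercept)."""
    numerator = sum(r * math.log(p) for r, p in zip(r_values, p_values))
    denominator = sum(r * r for r in r_values)
    return -numerator / denominator


def variance_explained(r_values, p_values, lam):
    """1 - SS_res / SS_tot for the fitted curve p = exp(-lambda * r)."""
    mean_p = sum(p_values) / len(p_values)
    ss_tot = sum((p - mean_p) ** 2 for p in p_values)
    ss_res = sum((p - math.exp(-lam * r)) ** 2 for r, p in zip(r_values, p_values))
    return 1.0 - ss_res / ss_tot


# Synthetic, noise-free data generated with a known decay rate.
r_values = list(range(1, 11))
p_values = [math.exp(-0.09 * r) for r in r_values]
lam = fit_decay_rate(r_values, p_values)
r_squared = variance_explained(r_values, p_values, lam)
print(lam, r_squared)
```

On noise-free data the procedure recovers the generating decay rate; on real estimates of $p_{new}(r, t)$, a nonlinear least-squares fit on the original scale may give slightly different values.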
These analyses of behavioral data extracted from Delicious lend strong support to our two hypotheses $H_{DS}$ and $H_{MMC}$. As the hypotheses’ predictions follow naturally from the proposed reflective search model and seem to harmonize well with empirical patterns, we deem these results a first step towards process validation. We further validated the model with the generative mechanism embedded into the simulation study, whose results are reported next.
Simulating behavioral data for process validation.
First, further qualitative evidence for the model’s ability to account for users’ tagging behavior is provided by Figure 2. The dashed curves in each of the four diagrams represent the result of the simulation, i.e., the estimates of $p_{new}(r, t)$ that we obtain when averaging across the 500 simulation runs, each based on the best-fitting parameter set shown in Table 1. Despite some deviations from the data points, the model reproduces the two predicted (and empirically observed) characteristics of the data well: the simulated curves show that i) the average probability of introducing a new tag $\bar{p}_{new}$ increases along the four TAS positions $t$ (hypothesis $H_{DS}$), and ii) the slope decreases from $t = 1$ to $t = 4$ (hypothesis $H_{MMC}$). The summarizing descriptive values ($\bar{p}_{new}$ and slope $\lambda$ for each of the four TAS positions) are shown in Table 2 (see column ‘Model’: $\bar{p}_{new}$ and $\lambda$, respectively) and further suggest only slight deviations between data and model. As for the empirical curves, we have derived estimates of $\lambda$ by fitting the theoretical exponential to the simulated curve at each of the four TAS positions (explaining 94%, 96%, 97%, and 97% of variance for $t = 1$, $t = 2$, $t = 3$, and $t = 4$, respectively).
Empirical test of model fit.
Second, an evaluation of the goodness-of-fit using a $\chi^2$ statistic provides evidence for the model’s process validity: comparing the empirical and simulated estimates of $p_{new}(r, t)$ yields a value of $\chi^2(29) = 13.74$, which is far below the critical value of $\chi^2_{crit} = 42.56$ (see section 4.1) and allows us to retain the null hypothesis that the simulated and empirical curves depicted in Figure 2 do not differ significantly.
Third, having gained qualitative and quantitative evidence for the model’s process and outcome validity, we can now describe and discuss the parameters and their estimates shown in Table 1, which take on theoretically plausible values. The first two parameters $\beta_E$ and $\beta_I$ show that the rate at which newly retrieved context is integrated into the internal spotlight (used to probe new context-to-item associations) is larger for context elements activated by an article ($\beta_E$) than for context elements activated by an internally stored item (e.g., tag; $\beta_I$). This difference seems plausible, as a user should be more likely to shift her or his spotlight after having read a new article than after having reflected upon the very same article. The comparatively large estimate of the third parameter, the semantic scale factor $s$, reveals that pre-existing semantic (commonsense) associations have a relatively strong influence on the learning process and the subsequent reflection and indexing of new articles. This result harmonizes well with previous studies on models of social tagging (e.g., [4]) suggesting that we have to consider users’ background knowledge in order to model a non-linear increase of the number of unique tags, or, as in our case, the decline of $p_{new}(r, t)$. The moderate estimates of $\gamma_{FC}$ and $\gamma_{CF}$ indicate that, in line with [5], knowledge stored in $M^{FC}$ and $M^{CF}$, respectively, gets shaped by episodic learning experiences (captured by $M^{FC}_{epi}$ and $M^{CF}_{epi}$), where the relative strength of this episodic influence seems to be smaller than that of pre-existing associations (which equals $1 - \gamma_{FC}$ and $1 - \gamma_{CF}$) between elements of the item and context layer (and vice versa). This could explain states of inter-subjectivity (inter-user agreement in reflecting on Web resources), which, according to our model, become manifest in verbal imitation behavior despite episodic learning processes in the course of individual Web navigation. The high estimate of the amplification factor $\phi$ reveals a strong impact of others’ tagging behavior (e.g., displayed by tag recommendation interfaces) on individual tag choices, but only if this behavior complies with individual reflections upon the present article (see equation 9). This appears to be a natural process during verbalization: if a user is exposed to a recommended tag (e.g., via a recommendation interface) that matches a thought the user would like to express, then she or he should be very inclined to adopt (or imitate) that semantically ‘matching’ tag. Finally, the estimate of the sensitivity parameter $\tau$ indicates that we have to increase the signal-to-noise ratio in the activation of memory items, which is evoked by the spotlight (context state; equation 7), in order to derive a realistic activation-based probability of item retrieval (equation 8).
5. CONCLUSION
We have gained evidence speaking in favor of the process and outcome validity of the proposed reflective search model. First, based on the drifting spotlight assumption, we have successfully predicted a pattern of results that cannot be derived from existing models of social tagging ($H_{DS}$) and that contributes substantially to our understanding of stabilization ($H_{MMC}$). Second, the reflective search model allows simulating frequency distributions that closely fit empirical data, and third, it yields plausible estimates of parameters that help to interpret the interplay of internal processes (e.g., context update and episodic learning) and environmental cues (e.g., Web resources and recommended tags).
One aim of this work is to shed fresh light on a dominant conception of the notion of imitation within the Web-science discourse around socio-cognitive phenomena, such as social tagging. According to our proposed reflective search model, imitation is not a separate process that is detached from individual, associative memory structures and can be modeled as a stochastically independent factor (as proposed, e.g., by the ‘Background-Plus-Imitation’ model [4],[28]). Instead, we deem it an integral process of an inter-subjective reflection upon content, in which those tags are adopted and become popular that help index and verbalize inter-individually similar reflections. Evidence for the basic assumption that the choice (and imitation) of tags is simply an epiphenomenon of an iterative search of memory (which we assume to underlie reflection) comes from Delicious data, which comply with our first model prediction (the ‘drifting spotlight’ hypothesis $H_{DS}$), and from the model’s ability to simulate patterns that fit empirical data well.
The exclusion of imitation as an independent factor is in line with works such as [13] and [5] that suggest dependence (i.e., co-evolution and coupling) of (social) imitation and (individual) knowledge, where we believe that our approach refines these works in important respects. In contrast to [5], our model includes structures and mechanisms by which episodic learning is integrated into prior background knowledge and thus helps to understand how states of inter-subjectivity can come into being despite diverging learning experiences. In contrast to [13], we endeavor to concretize and formalize a coupling of internal and environmental phenomena (e.g., memory and Web resources) by means of a contemporary model of search of memory (i.e., CMR [22],[19]) in order to reduce the gap between theoretical considerations and empirical observations.
We also see some potential to challenge conceptions of how the individual is coupled with the so-called social system and of how temporal micro and macro dynamics interact. A little in the manner of [17], we assume that the tempting but somewhat artificial differentiation between ‘the individual’ and ‘the social’ becomes obsolete as soon as we stop conceiving and modeling the individual as a simple element within a complex whole. And indeed, our empirical and simulated results that provide evidence for the ‘micro macro coupling’ hypothesis $H_{MMC}$ suggest that stabilization (as an emerging artifact of the ‘whole’) falls into place just by modeling the individual as a reflective being and by letting these beings interact.
We would like to stress that the reflective search scheme
of Figure 1 is not limited to social tagging but frames our
general way of thinking about human-Web interactions. Our
main argument is that interacting on the Web shapes mem-
ory and the internal spotlight (context representation), and
these cumulative learning processes drive future behavior,
which can become manifest in many different ways, e.g., in
choosing tags, in generating new search terms to navigate
to the next page, in conducting an exploratory search for
the purpose of information-based ideation [12], in browsing
through and selecting recommendations, etc.
In future work we will therefore apply the reflective search scheme to model observable and latent correlates of Web navigation (e.g., search paths and the accompanying internal context evolution), but also to design intelligent and creatively stimulating recommendation mechanisms. For instance, the model would allow detecting lengthy monotonous search and consumption periods (without substantial spotlight shifts), to which the mechanism could react by providing novel and context-disrupting stimuli that could help the user to escape a state of mental fixation [12].
6. ACKNOWLEDGMENTS
This work is supported by the Austrian Science Fund
(FWF): P 25593-G22, P 27709-G22, and the EU funded
project Learning Layers (Grant Agreement 318209).
7. REFERENCES
[1] J. R. Anderson. The adaptive nature of human
categorization. Psychological Review, 98(3):409, 1991.
[2] L. Barsalou. Situated simulation in the human
conceptual system. Language and cognitive processes,
18(5-6):513–562, 2003.
[3] C. Cattuto, V. Loreto, and L. Pietronero. Semiotic
dynamics and collaborative tagging. Proceedings of the
National Academy of Sciences, 104(5):1461–1464,
2007.
[4] K. Dellschaft and S. Staab. An epistemic dynamic
model for tagging systems. In Proceedings of the
nineteenth ACM conference on Hypertext and
hypermedia, pages 71–80. ACM, 2008.
[5] W.-T. Fu and W. Dong. Collaborative indexing and
knowledge exploration: A social learning model. IEEE
Intelligent Systems, 27(1):39–46, 2012.
[6] W.-T. Fu, T. Kannampallil, R. Kang, and J. He.
Semantic imitation in social tagging. ACM
Transactions on Computer-Human Interaction
(TOCHI), 17(3):12, 2010.
[7] R. J. Glushko, P. P. Maglio, T. Matlock, and L. W.
Barsalou. Categorization in the wild. Trends in
cognitive sciences, 12(4):129–135, 2008.
[8] S. A. Golder and B. A. Huberman. Usage patterns of
collaborative tagging systems. Journal of information
science, 32(2):198–208, 2006.
[9] H. Halpin, V. Robu, and H. Shepherd. The complex
dynamics of collaborative tagging. In Proceedings of
the 16th international conference on World Wide Web,
pages 211–220. ACM, 2007.
[10] D. L. Hintzman. Minerva 2: A simulation model of
human memory. Behavior Research Methods,
Instruments, & Computers, 16(2):96–101, 1984.
[11] D. L. Hintzman. Research strategy in the study of
memory: Fads, fallacies, and the search for the
“coordinates of truth”. Perspectives on
Psychological Science, 6(3):253–271, 2011.
[12] A. Kerne, A. M. Webb, S. M. Smith, R. Linder,
N. Lupfer, Y. Qu, J. Moeller, and S. Damaraju. Using
metrics of curation to evaluate information-based
ideation. ACM Transactions on Computer-Human
Interaction (ToCHI), 21(3):14, 2014.
[13] J. Kimmerle, J. Moskaliuk, and U. Cress. Individual
learning and collaborative knowledge building with
shared digital artifacts. International Journal of
Behavioral, Cognitive and Psychological Sciences,
1(1):25–32, 2009.
[14] D. Kowald, P. Seitlinger, C. Trattner, and T. Ley.
Long time no see: The probability of reusing tags as a
function of frequency and recency. In Proc. of
WWW’14, New York, NY, USA, 2014. ACM.
[15] B. Kump, J. Moskaliuk, S. Dennerlein, and T. Ley.
Tracing knowledge co-evolution in a realistic course
setting: A wiki-based field experiment. Computers &
Education, 69:60–70, 2013.
[16] D. Larsen-Freeman and L. Cameron. Research
methodology on language development from a
complex systems perspective. The Modern Language
Journal, pages 200–213, 2008.
[17] B. Latour. Network theory|networks, societies,
spheres: Reflections of an actor-network theorist.
International Journal of Communication, 5:15, 2011.
[18] T. Ley and P. Seitlinger. Dynamics of human
categorization in a collaborative tagging system: How
social processes of semantic stabilization shape
individual sensemaking. Computers in human
behavior, 51:140–151, 2015.
[19] L. J. Lohnas, S. M. Polyn, and M. J. Kahana.
Expanding the scope of memory search: Modeling
intralist and interlist effects in free recall.
Psychological review, 122(2):337, 2015.
[20] N. W. Morton and S. M. Polyn. A predictive
framework for evaluating models of semantic
organization in free recall. Journal of Memory and
Language, 86:119–140, 2016.
[21] D. L. Nelson, C. L. McEvoy, and T. A. Schreiber. The
university of south florida free association, rhyme, and
word fragment norms. Behavior Research Methods,
Instruments, & Computers, 36(3):402–407, 2004.
[22] S. M. Polyn, K. A. Norman, and M. J. Kahana. A
context maintenance and retrieval model of
organizational processes in free recall. Psychological
review, 116(1):129, 2009.
[23] L. Scrucca. GA: A package for genetic algorithms in
R. Journal of Statistical Software, 53(4):1–37, 2013.
[24] P. Seitlinger, T. Ley, and D. Albert. Verbatim and
semantic imitation in indexing resources on the web:
A fuzzy-trace account of social tagging. Applied
Cognitive Psychology, 29(1):32–48, 2015.
[25] S. Sen, S. K. Lam, A. M. Rashid, D. Cosley,
D. Frankowski, J. Osterhouse, F. M. Harper, and
J. Riedl. Tagging, communities, vocabulary, evolution.
In Proceedings of the 2006 20th anniversary conference
on Computer supported cooperative work, pages
181–190. ACM, 2006.
[26] L. Steels. Collaborative tagging as distributed
cognition. Pragmatics & Cognition, 14(2):287–292,
2006.
[27] E. Tulving. Episodic memory: From mind to brain.
Annual review of psychology, 53(1):1–25, 2002.
[28] C. Wagner, P. Singer, M. Strohmaier, and B. A.
Huberman. Semantic stability in social tagging
streams. In Proceedings of the 23rd international
conference on World Wide Web, pages 735–746.
International World Wide Web Conferences Steering
Committee, 2014.
[29] A. Zubiaga. Enhancing navigation on wikipedia with
social tags. In Wikimania 2009. Wikimedia
Foundation, 2009.