The simple advantage in perceptual and categorical
generalization
Khanh-Phuong Thai¹ · Ji Y. Son² · Robert L. Goldstone³
Published online: 14 September 2015
© Psychonomic Society, Inc. 2015
Abstract Recent research in relational learning has suggested
that simple training instances may lead to better generalization
than complex training instances. We examined the perceptual-encoding
mechanisms that might undergird this "simple advantage" by testing
category and perceptual learning in
adults with simplified and traditional (more complex) Chinese
scripts. In Experiment 1, participants learned Chinese charac-
ters and their English translations, performed a memorization
test, and generalized their learning to the corresponding char-
acters written in the other script. In Experiment 2, we removed
the training phase and modified the tests to examine transfer
based purely on the perceptual similarities between simplified
and traditional characters. We found the simple advantage in
both experiments. Training with simplified characters produced
better generalization than training with traditional characters
when generalization relied on either recognition memory or
pure perceptual similarities. On the basis of the results of these
two experiments, we propose a simple process model to explain
the perceptual mechanism that might drive this simple advantage,
and in Experiment 3 we tested novel predictions of this
model by examining the effect of exposure duration on the
simple advantage. We found support for our model that the
simple advantage is driven primarily by differences in the per-
ceptual encoding of the information available from simple and
complex instances. These findings advance our understanding
of how the perceptual features of a learning opportunity interact
with domain-general mechanisms to prepare learners for
transfer.
Keywords Perception · Categorization · Generalization · Memory · Similarity
We can remember all kinds of details about our experiences in
the world, but our visual systems have the capacity to ignore
all kinds of details as well. Generalization relies on dual pro-
cesses: attending to similarities while simultaneously ignoring
differences. Efficient learning minimizes the necessary expe-
rience with learning instances (e.g., the number of learning
instances needed or the time spent learning) and maximizes
appropriate generalization.
Simple instances, those that are idealized or contain just
the information that is relevant for generalization, have been
shown to engender rapid learning with selective attention to
the right information for the task. In the classic study by
Biederman and Shiffrar (1987), novices briefly trained with
simple line drawings of diagnostic features were able to clas-
sify chicks with the accuracy of expert chicken sexers, a skill
that used to take thousands of hours classifying actual chicks
to perfect. Those simple line drawings provided idealized ver-
sions of the defining features, making explicit what were sub-
tle but critical cues for encoding and classification. More re-
cently, young children who were taught category labels with
simple objects, defined as those with fewer features and de-
tails, were more successful at generalizing to novel category
members than they were when shown more complex learning
objects (Son, Smith, & Goldstone, 2008). We refer to this
asymmetry of transfer from simple versus complex training
instances as the simple advantage.
* Khanh-Phuong Thai
kpthai@gmail.com
1 Department of Psychology, University of California, Los Angeles, 405 Hilgard Avenue, Los Angeles, CA 90095, USA
2 California State University, Los Angeles, CA, USA
3 Indiana University, Bloomington, IN, USA
Mem Cogn (2016) 44:292–306
DOI 10.3758/s13421-015-0553-z
Most of the research demonstrating the simple advantage
has focused on learning and transfer of relational concepts in
mathematics (Kaminski, Sloutsky, & Heckler, 2008; McNeil,
Uttal, Jarvin, & Sternberg, 2009; Sloutsky, Kaminski, &
Heckler, 2005) and science (Goldstone & Sakamoto, 2003;
Goldstone & Son, 2005). In these relational domains, in order
to generalize learning to a new situation, one must pay more
attention to structural information than to superficial details
that may differ across instances. For example, in McNeil et al.
(2009), students who were given highly realistic bills and
coins performed worse on word problems involving money
than did students who were given either bland bills and coins
(plain rectangular pieces of paper and circular tokens) or
no bills and coins at all. McNeil et al. proposed that the
extraneous surface features of the realistic bills and coins
that were irrelevant to the task distracted students' attention
from the underlying mathematical structures of the
problems (see also Goldstone, 2006; Sloutsky, Kaminski, &
Heckler, 2005). Similarly, Goldstone and Sakamoto (2003)
trained undergraduates on complex adaptive systems princi-
ples with simulations containing either idealized, abstract
graphic elements (e.g., dots and blobs) or perceptually rich,
concrete elements (e.g., ants and little fruits), and found that
those trained with the perceptually rich display had more dif-
ficulty transferring their knowledge to a new domain that
looked different but shared the same underlying concept as
the one they had learned. This idea is not new; Clark Hull
(1920) used slightly deformed Chinese characters to demon-
strate that concepts were learned more quickly when they
were learned in the order from simple to complex. Simple
learning instances can facilitate such structural extraction by
limiting the extraneous details that must be ignored and guid-
ing attention to the right features at the time of encoding.
Little is known, however, about the perceptual-encoding
mechanisms that may drive this simple advantage in these
highly conceptual instances of transfer. To explore the percep-
tual mechanisms of the simple advantage, we trained and test-
ed English-speaking adults in both category and perceptual
generalization in a domain that contains a large number
of complex and simple corresponding forms: Chinese
character scripts. On the basis of the results of these stud-
ies, we propose a process model to explore the hypothesis
that the simple advantage is driven primarily by differences in
encoding perceptual elements while learning from simple and
complex instances.
For a number of political and historical reasons, the tradi-
tional Chinese writing system was simplified in 1949. The
simplified characters have approximately 22.5% fewer strokes
than the more complex traditional script (Gao & Kao, 2002).
Several different simplification processes were employed:
some were based on Chinese history and meaning, but others
were more straightforward perceptual simplifications. As a
result, many characters and their components (recurring
groups of strokes that make up the characters) have taken on
quite different appearances (Harbaugh, 2003). Whether these
differences between scripts affect the learnability of characters
is the subject of ongoing debate among researchers who study
Chinese language acquisition (see Chen & Yuen, 1991;
McBride-Chang, Chow, Zhong, Burgess, & Hayward, 2005;
Seybolt & Chiang, 1979). For example, Seybolt and Chiang
argued that the traditional script, because it contains more
visual features, may be easier to discriminate initially. It also
contains more regular meaning- and phonetic-based compo-
nents, which may better promote semantic and sound-based
strategies earlier than among simplified-script learners.
However, it is equally plausible that traditional characters
are more difficult to learn because of the large numbers of
strokes across characters. The current preponderance of evi-
dence suggests that simplified characters are easier to learn
than traditional ones (Hodge & Louie, 1998). This conclusion,
however, is somewhat controversial, because it is hard to
equate teaching curriculum, instruction, and cultural differences
(see also Chung & Leung, 2008). Most importantly,
and more relevant for our purpose, the bulk of this previous
research has been on the acquisition of the two scripts, not on
transfer from one to the other. It has been maintained that
switching or transferring from one script to the other is
straightforward; for example, Wang (2009, para. 4) stated that
"the structural continuity makes the switch between them easy
and smooth, a skill any educated person can quickly acquire."
This assumption, however, has not been empirically tested.
Partially due to complicated issues of aesthetics, history,
politics, and tradition, little research has examined differences
in reading the distinct scripts. Such an endeavor, primarily motivated
by issues in cognition and learning, would provide a
quantitative way of exploring the different contentions.
The rich sets of naturally occurring simple and complex
corresponding characters provide a domain ideally suited
to the purpose of examining the simple advantage in per-
ceptual categorization.
Purpose of present work
The primary motivation of this set of experiments and the
accompanying process model was to examine the hypothesis
that simple learning instances have an advantage in encoding
that drives later advantages in generalization. Our first exper-
iment replicated previous findings that learning with simpli-
fied instances leads to greater category generalization than
does training with complex forms. In a second experiment,
we examined the question of perceptual generalization: Does
the simple advantage occur even when participants have
superficial and minimal exposure to the simplified forms?
The third experiment tested novel predictions of the pro-
cess model, by examining the effect of longer exposure
times on the simple advantage.
Experiment 1
Participants were asked to study flashcards with a Chinese
character on one side and an English definition on the other
side. After each set, the participants' memorization was measured
with a matching-to-sample task in which students were
briefly shown the English definition and had to pick out the
matching character from among four answer choices. After the
memory test, generalization was measured in the same matching
task, except that participants had to match the definitions with
characters from the unlearned script. In the traditional-first con-
dition, participants studied traditional characters and their
English definitions. During the traditional-first memory test,
they were shown traditional characters paired with English
words and first were tested on the traditional characters that they
had learned (TT trials, indicating traditional learning instances
and a traditional test); then they were given a generalization test
in which the choices were replaced with the corresponding
simplified characters (TS trials, indicating traditional learning
but a simple test). During the simple-first condition, partici-
pants studied and had a memory test with simplified charac-
ters (SS), but their generalization test had traditional versions
of the learned characters (ST). If simplified learning instances
promote category generalization, participants should show
better generalization in the simple-first (ST) than in the cor-
responding traditional-first (TS) condition.
Method
Participants and design
A group of 14 undergraduates (seven females and seven
males) participated for course credit. All reported having
no prior experience with Chinese characters. In this within-subjects
experiment, half of the participants experienced the
traditional-first condition (learning, memory test, generaliza-
tion test) before the simple-first condition, each on a different
set of characters, and the other half experienced the two con-
ditions in the reverse order.
Materials and procedures
Although historical or semantic reasons lie behind some types
of Chinese character simplifications, the subset of characters
chosen for this study were perceptually simplified forms of
their traditional counterparts. In each pair of characters, up to
two components (stroke groups called radicals) of the tradi-
tional characters were omitted in order to produce their sim-
plified versions. Thus, the simplified characters had fewer
strokes as well as fewer components. The simplified characters
used had 3–13 strokes per character (average: 7.23
strokes), and their traditional versions had 8–22 strokes per
character (average: 14.06 strokes). We created four sets of 12
unique character pairs, but each participant only studied two
of these sets in either the simplified or the traditional script.
Thus, each participant studied 12 traditional and 12 simplified
Chinese–English pairs. The number of omitted strokes, the
number of omitted components, the location of the omitted
components within each character, and the usage frequency
were balanced across the character sets. The four sets were
used equally often across participants.
In the training phase, each participant received a randomly
assigned set of 12 flashcards of either traditional or simplified
characters, according to their assigned condition. Each
character was printed in black, 36-point SimSun
font, and the English words were printed in black, 24-point
Calibri font. Participants were told to study the
Chinese–English pairs and that they would be tested on
them later. Participants were not given a time limit for
studying, and everyone finished within 20 min.
Once participants had handed in the flashcards, they were
administered the memory and generalization tests, in that or-
der, on a computer using E-Prime 2.0 (Psychology Software
Tools, Inc., Sharpsburg, PA, USA). For both tests, there were
12 trials, one for each of the 12 characters in the training set. A
trial began with a fixation cross lasting for 0.5 s, followed by
an English word for 2 s, then four Chinese characters. The
distractor characters were randomly chosen from the set of
trained characters. The location of the correct answer in the
set of four alternatives was controlled so that each location
occurred equally in each condition. The intertrial interval
was 1 s, and the order of the trials was random across partic-
ipants. No feedback was provided after each trial, but the
average accuracy and response time were given at the end of
each test. Figure 1 shows a sample trial and the procedure.
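The trial structure described above can be sketched in code. This is only an illustration under stated assumptions, not the authors' E-Prime script: the function name `build_trials`, the dictionary layout, and the character labels are hypothetical, and distractors are simply drawn at random from the other trained characters while the correct-answer position is balanced across the four locations.

```python
import random

def build_trials(trained, n_choices=4, seed=None):
    """Return one forced-choice trial per trained character.

    `trained` maps a character label to its English word. All names are
    hypothetical; the original study was run in E-Prime 2.0.
    """
    rng = random.Random(seed)
    targets = list(trained)
    if len(targets) % n_choices != 0:
        raise ValueError("number of characters must be a multiple of n_choices")
    # Balance the correct-answer location: each position occurs equally often.
    positions = [i % n_choices for i in range(len(targets))]
    rng.shuffle(positions)
    trials = []
    for target, pos in zip(targets, positions):
        # Distractors are drawn from the other characters in the trained set.
        others = [c for c in targets if c != target]
        choices = rng.sample(others, n_choices - 1)
        choices.insert(pos, target)
        trials.append({"word": trained[target], "choices": choices, "correct": pos})
    rng.shuffle(trials)  # random trial order for this participant
    return trials

# Example with 12 placeholder characters, as in one training set.
trained = {f"char{i}": f"word{i}" for i in range(12)}
trials = build_trials(trained, seed=1)
```

With 12 characters and four answer locations, each location holds the correct answer on exactly three trials.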
In the memory test, participants chose from Chinese char-
acters identical to those in their training set. The generalization
test was set up identically to the memorization test, except that
the answer choices in this test were characters written in the
unlearned script. Before the generalization trials, these instructions
appeared: "There are two types of scripts in the Chinese
written language, Traditional and Simplified. You have just
studied characters written in one of these two scripts, and
now we would like to see how well you can recognize the
same characters written in the other script."
Participants were given a 5-min break before they were
given a different, randomly selected set of 12 flashcards with
characters written in the other script. The entire procedure was
repeated for the second set of characters.
Results and discussion
Proportion correct and average response time data for correct
responses are presented in Fig. 2 and Fig. 3, respectively
(please see the left panels).
Preliminary analysis
We observed no statistically significant difference in study
times for the traditional and simplified characters (p > .05).
We also found no significant differences among the four
sets of characters (ps > .10) and no effect of condition order
(ps > .10), so the accuracies and response times for each
condition were collapsed across those variables. One participant
(out of 14) performed at a chance accuracy level on
all tests and was dropped from the following analyses.
Preliminary analyses that included this participant's data
did not change our results.
Memorization and transfer results
Because accuracy performance on the memory test was near
ceiling with little variance, our data violated the normality
assumption [Shapiro–Wilk, W(14) < 0.8, ps < .05]. Thus, we
opted to confirm our findings with Wilcoxon signed-rank
tests, a nonparametric version of dependent t tests, on accuracy.
Accuracy
Performance was better on the memory test
(M = .99, SD = .03) than on the generalization test (M = .86,
SD = .10), Z = 2.99, p = .003. Although the two conditions
exhibited similar memory performance, the simple-first condition
generalized more accurately than the traditional-first condition.
Participants in both the traditional-first (M = .99,
SD = .03) and simple-first (M = .98, SD = .04) conditions successfully
learned the word pairs and recognized them equally
well, Z = 0.45, p = .66. Generalization accuracy was significantly
higher in the simple-first condition (M = .91, SD = .06) than
in the traditional-first condition (M = .80, SD = .14), Z = 2.52,
p = .01.¹ As predicted, participants who initially learned simplified
characters generalized their learning to the transfer script
better than those who learned traditional characters.
Response times for correct trials (RTs, given in seconds per trial)
Participants were faster on the memorization trials
(M = 2.71, SD = 0.92) than on generalization (M = 5.54,
SD = 2.15), paired t(13) = 6.63, p < .001, d = 1.77. Those in
the simple-first condition (M = 2.34, SD = 0.74) were faster
than those in the traditional-first condition (M = 4.38,
SD = 1.67) on the memory test, paired t(13) = 3.63, p = .003,
d = 0.97, but not on the generalization test, paired t(13) = 0.11,
p = .92. Thus, when trained with simplified characters, participants
tended to make more correct matches on the generalization
test and were faster on the memory test than those who
had trained with traditional characters.
Importantly, even though simplified and traditional charac-
ters were remembered equally well, the simplified training
exemplars led to better generalization than the traditional
ones. However, the simple advantage may have been depen-
dent on the amount of exposure to the learning instance. In
Experiment 2, we asked whether training with simplified char-
acters is more efficient than training with traditional charac-
ters, even without extended training experience.
Fig. 1 (a) Training phase, (b) memorization test procedure, and (c) generalization test procedure in the simple-first condition of Experiment 1
Fig. 2 Accuracy data from the memorization and generalization tests in
Experiment 1 (left panel) and from the exact-match and generalization
tests in Experiment 2 (right panel). Error bars indicate ±1 SE
1 The same patterns of results were confirmed with a 2 Condition × 2 Test
Type repeated measures analysis of variance (ANOVA) on accuracy. We
found a main effect of test, F(1, 12) = 26.89, p < .001, η² = .69; a main
effect of condition, F(1, 12) = 8.98, p < .05, η² = .43; and a significant
interaction, F(1, 12) = 9.04, p < .05, η² = .43. Post-hoc t tests confirmed
similar memory performance between the two conditions, t(12) = 1.00,
p = .34, and the simple-first condition generalized more accurately than
the traditional-first condition, t(12) = 3.045, p < .025.
Experiment 2
To extend the findings of Experiment 1, we removed the train-
ing phase and modified the memorization and generalization
tests to examine matches based purely on perceptual similarity.
If simplicity promotes transfer by providing only the relevant
perceptual features, then the simple advantage should per-
sist even when generalization relies only on the perceptual
similarities between simplified and traditional characters.
Method
Participants and design
A group of 23 undergraduates (10 males, 13 females) who
reported having no knowledge of Chinese characters participated
for course credit. Experiment 2 was also based on a
within-subjects design, so the order of conditions was
counterbalanced across participants: Twelve were randomly
assigned to participate in the traditional-first before the
simple-first condition, and the other 11 participated in the
simple-first before the traditional-first condition.
Materials and procedures
The stimuli and procedures were nearly identical to those of
Experiment 1. The key difference in Experiment 2 was the
lack of a training phase. Thus, participants never connected
any of the characters to their English meanings. Each trial
began with a fixation cross, followed by a Chinese char-
acter for 2 s and four answer choices. In exact-match trials
(SS and TT), participants matched characters to identical
characters. On the generalization trials (ST and TS), par-
ticipants were shown a character in one script and had to
choose the match from among characters written in the
other script. A sample trial and the procedure are shown
in Fig. 4.
Results and discussion
Preliminary analysis
As in Experiment 1, we observed no effect of character set or
of condition order (ps > .10) on accuracy or RTs, so the data
for each condition were collapsed across those variables.
Exact-match and generalization results
Average proportions correct and average RT results are presented
in Figs. 2 and 3, respectively (see the right panels). As
in Experiment 1, accuracy on the exact-matching task was
uniformly high; thus, we used Wilcoxon signed rank tests to
confirm differences in accuracy performance.
Accuracy
The results were consistent with the findings from
Experiment 1. Participants made significantly more correct
responses on the exact-matching test (M = .97, SD = .04) than
on the generalization test (M = .68, SD = .12), Z = 4.20,
p < .001. We also found a differential effect of the sample
script on generalization. There was no significant
difference between the simplified and traditional
exact-match-to-sample tests, Z = 1.26, p = .21. However, the
simple-first condition produced significantly better generalization
performance (M = .79, SD = .14) than did the
traditional-first condition (M = .57, SD = .18), Z = 3.56,
p < .001.² Again, as in Experiment 1, training with simplified
characters promoted greater generalization to traditional
characters than training with traditional characters promoted
transfer to simplified characters.
RTs for correct trials (given in seconds per trial)
Participants were generally faster in the exact-matching test (M = 1.44,
SD = .07) than in the generalization test (M = 3.02, SD = .23),
t(22) = 7.71, p < .001, d = 1.61. Although the simple-first condition
was faster than the traditional-first condition in the exact-matching
test, t(22) = 3.91, p = .001, d = .81, RTs in the
generalization test were similar, t(22) = 1.05, p = .31.
Whereas there had been no difference in accuracy on the
exact-matching trials, traditional characters required more
time per correct response than did the simplified characters
(1.55 vs. 1.32 s). This result is interesting in light of classic
experiments and theories of similarity.
2 The same patterns of results were confirmed with a 2 Condition × 2 Test
Type repeated measures ANOVA on accuracy. A main effect of test
emerged, F(1, 22) = 129.72, p < .001, η² = .86, as well as a main effect
of condition, F(1, 22) = 33.42, p < .001, η² = .60, and a significant
interaction, F(1, 22) = 12.33, p < .01, η² = .36. Follow-up pairwise t tests
showed no significant difference between the simplified and traditional
exact-match-to-sample tests, t(22) = 1.32, p = .20. However, the simple-first
condition produced significantly better generalization performance
(M = .79, SD = .14) than the traditional-first condition (M = .57, SD = .18),
t(22) = 4.83, p < .001.
Fig. 3 Response time data of accurate responses from the memorization
and generalization tests in Experiment 1 (left panel) and from the
exact-match and generalization tests in Experiment 2 (right panel). Error bars
indicate ±1 SE
As in Podgorny and Garner's (1979) classic work, which
demonstrated that participants judge the similarity of two Ss
on a screen faster than that of two Ws, we also found that some
Chinese characters were self-identified faster than others. Our
results run contrary to the prediction derived from Tversky's
(1977) feature-based contrast model of similarity: Complex
objects that share a greater number of overlapping features
are more self-similar than simple objects, and therefore should
be easier to self-identify. Traditional characters contain
more strokes, so one might assume that they should be
more self-similar and should result in shorter RTs in our
exact-match test. However, it is important to keep in mind
that the distractors in the field were also complex. These
complex characters may also be more similar to each other,
thus forcing participants to spend more time to distinguish the
target among them.
MemSam: A computational model of the simple advantage
The simple advantage is thus far empirically limited to situa-
tions in which learners must generalize to new instances from
only one learning instance (either a simple or complex one).
Aside from the two experiments covered in this article thus
far, most of the existing research on literacy with traditional
and simplified Chinese scripts has not made any attempt to
connect this effect to general theories of categorization. To
understand the basic cognitive mechanisms that might under-
lie the simple advantage, we propose a simplified version of
an exemplar-based process model of categorization (see
Medin & Schaffer, 1978; Nosofsky, 1986) in which there is
only one exemplar. In this memory-sampling model
(MemSam), we assume that a probe stimulus functions as a
retrieval cue to access already-stored information that is sim-
ilar to the probe. We also assume that features from all items
are sampled rather than encoded veridically in memory.
Furthermore, we assume that traditional characters have more
features than simplified ones. Although none of these assumptions
and processes are particularly new³ or innovative,
MemSam puts these assumptions together to provide a coherent,
process-driven account that can explain the simple advantage
and generate novel predictions about the conditions in
which we should observe it.
Key to this process model is the encoding of information
during learning. In all cases, we assume that learners sample
features of the presented example that are available during
learning and that they do not generally have a complete and
veridical representation of the exemplar. In a given task or
type of stimulus, learners have a capacity limit of memory,
K_m, on the number of features they can sample and store. In
the case of a simple learning instance, there are fewer features
to sample, and accordingly, learners can encode a greater pro-
portion of the features than in the case of a complex,
traditional-learning instance. We will call the features success-
fully sampled and stored from the learning example the mem-
ory trace.
To prompt generalization, learners are presented with a
probe and must decide whether the probe is sufficiently sim-
ilar to the memory trace to give the response associated with
the memory trace. We assume that only K_p of the probe's
features are sampled, because it is unlikely that all features
of a complex novel figure will be mentally available for consideration.
However, the number of features sampled for the
probe trace is assumed to be greater than the number of features
sampled and retained for the memory trace, because the probe
has been presented more recently (K_p > K_m). In the specific
case of our present experimental paradigm, the probe remains
visually present while the participant chooses, a limiting case
of recency.
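The capacity-limited sampling just described can be illustrated with a short sketch. It is a minimal illustration under assumptions that are not in the text: the helper names `sample_trace` and `proportion_encoded` are hypothetical, sampling is taken to be uniform without replacement, and the capacity value of 7 is an arbitrary choice for the example, not a fitted parameter.

```python
import random

def sample_trace(strokes, capacity, rng=random):
    """Sample at most `capacity` stroke features from a character.

    Assumes uniform sampling without replacement (an illustrative choice).
    """
    strokes = sorted(strokes)
    k = min(capacity, len(strokes))
    return set(rng.sample(strokes, k))

def proportion_encoded(n_strokes, capacity):
    """Fraction of a character's strokes that fits within the capacity limit."""
    return min(capacity, n_strokes) / n_strokes

# Illustrative capacity (hypothetical value) applied to a 7-stroke simplified
# character and a 14-stroke traditional character.
K_m = 7
simple_share = proportion_encoded(7, K_m)        # every stroke can be stored
traditional_share = proportion_encoded(14, K_m)  # only half can be stored
memory_trace = sample_trace(range(1, 15), K_m, random.Random(0))
```

Under these assumptions the simplified character is stored completely, whereas only half of the traditional character's strokes enter the memory trace.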
Fig. 4 (a) Exact-match test procedure and (b) generalization test procedure in the traditional-first condition of Experiment 2
3 The idea that cue items are categorized or identified by comparing them
to items stored in memory is shared by SAM (Raaijmakers & Shiffrin,
1980), REM (Shiffrin & Steyvers, 1997), and MINERVA2 (Hintzman,
1984), to name a few models. The assumption that features from the cue item
are sampled, rather than using the entire cue item, is also present in such
models as EGCM (Lamberts, 2000).
To describe MemSam's encodings of the objects based on
their visual appearances, we make the simplifying assumption
that every stroke counts as a single feature. So, an originally
trained traditional character with eight strokes would be represented
as the memory trace m = {1,2,3,4,5,6,7,8} if all of its
strokes were stored (i.e., if K_m ≥ 8), with each number
representing a unique stroke in the character. The subsequently
probed, simplified character would then be represented as the
probe trace p = {1,2,3,4,5,6}, indicating that it possesses a subset
of six of the traditional character's eight strokes (i.e., if K_p ≥ 6).
In this case, the intersection m ∩ p is {1,2,3,4,5,6}, which
has a set size of 6, and the union m ∪ p is {1,2,3,4,5,6,7,8},
which has a set size of 8. The number of distinctive strokes in
this pair would be 2. K_m and K_p are the parameters that limit
the sizes of the samples, m and p.
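The likelihood of generalization is determined by the probability
of choosing the memory trace (m) response for the
probe (p), p(C_{m,p}):

\[
p(C_{m,p}) = \frac{\left( F \, \dfrac{|m \cap p|}{|m \cup p|} \right)^{\gamma}}
{\sum_{T_p} \left( F \, \dfrac{|m \cap T_p|}{|m \cup T_p|} \right)^{\gamma} + b^{\gamma}}
\]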
This choice probability is defined as the evidence for a
match between m and p, divided by the evidence for a match
between m and all four choices (including p). The evidence for
a match between m and p is the intersection of sampled features
from the memory and probe traces divided by the union
of the features in these traces. This proportion is multiplied by
a feature match parameter, F, to represent whether a feature
matches perfectly between the memory and probe traces (as in
the memory conditions, SS and TT) or is similar but slightly
distorted (in the generalization conditions, ST and TS). To
make this more concrete, take the case of the TT condition.
When a stroke is present in the traditional memory trace and
that identical stroke (placement, angle, size) is present in the
traditional probe trace, the feature match parameter is perfect
(e.g., F = 1). However, in conditions such as ST, a stroke
feature might be slightly larger and in a different position
in the simple memory trace than in the traditional probe
trace. Thus, the feature match parameter is less than perfect
(e.g., F¼:8 ). When there are mismatching strokes, both
matching and mismatching features are similarly distorted
relative to the matching features. This evidence for a match
is then used in a choice rule to account for the forced
choice in this particular experimental paradigm. The choice
probability also includes a parameter, γ(set to 4 in our
simulations), interpretable as the determinism of responding
(0= chance responding, = always choose the character
that has the greatest evidence). Also, the baseline attraction
of choices other than the correct one is represented by b
(set to .2 in our simulations). Neither the choice probability
nor the baseline attraction parameters significantly change
the qualitative patterns exhibited in the model, but they
have effects on the relative sizes of the effects.
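The choice rule described above can be illustrated with a minimal Python sketch. It is not the authors' simulation code: the function names, the three foil stroke sets, the use of a single F for every choice, and the placement of the baseline term once outside the sum are all assumptions made for illustration. Only the eight-stroke and six-stroke example traces and the values $\gamma$ = 4, b = .2, and F = .8 come from the text.

```python
def match_evidence(memory, choice, feature_match=1.0):
    """Evidence for a match: F * |m ∩ c| / |m ∪ c| for two stroke sets."""
    union = memory | choice
    if not union:
        return 0.0
    return feature_match * len(memory & choice) / len(union)


def choice_probability(memory, probe, choices,
                       feature_match=1.0, gamma=4.0, baseline=0.2):
    """Probability of picking `probe` among `choices` (which include it).

    Assumption: the baseline term b**gamma is added once to the denominator.
    """
    target = match_evidence(memory, probe, feature_match) ** gamma
    total = sum(match_evidence(memory, c, feature_match) ** gamma
                for c in choices)
    return target / (total + baseline ** gamma)


# Traces from the worked example in the text: 8 stored strokes, 6 probed.
memory = set(range(1, 9))
probe = set(range(1, 7))

# Hypothetical foils (not from the paper) that share few or no strokes.
foils = [{1, 2, 9, 10, 11, 12}, {3, 13, 14, 15, 16}, {17, 18, 19, 20}]
choices = [probe] + foils

p_gamma4 = choice_probability(memory, probe, choices, feature_match=0.8)
p_gamma1 = choice_probability(memory, probe, choices,
                              feature_match=0.8, gamma=1.0)
print(p_gamma4, p_gamma1)
```

With these made-up foils, raising $\gamma$ from 1 to 4 pushes the choice toward the best-matching alternative, which is the sense in which $\gamma$ acts as a determinism parameter.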
This simple model captures some basic patterns in the experimental data and also generates some novel predictions. The first important behavioral characteristic is the advantage of learning with simple forms. MemSam demonstrates that simple learning instances would lead to enhanced memory performance; that is, SS would be greater than TT performance. For TT trials, the strokes in the complex, traditional form are likely to exceed the memory capacity $K_m$, with the result that only a subset of the character's strokes are stored, leading to imperfect match to the same character when it is later presented as the probe. By contrast, when a simplified form is presented, it is likely to be perfectly, or nearly perfectly, encoded into memory and matched to the simplified probe.
A similar account explains why generalization is also better, that is, why ST exceeds TS performance. In the case of
TS, when the traditional form is in memory, a relatively
small proportion of its features are likely to be stored,
meaning that its trace will match the probe trace relatively
poorly. Thus, the generalization likelihood is less than in
the ST condition, in which the simple form is in memory
and all or most of its features are likely to be stored,
meaning that it will match the probe trace well.
Another, more obvious and empirically observed pattern is that memory conditions (SS and TT) would produce greater performance than generalization conditions (ST and TS). These patterns that demonstrate the simple advantage are robust to the model parameters described above, as long as $K_p$ is sufficiently larger than $K_m$. The baseline attraction parameter (b) mostly changes the overall performance such that if baseline attraction were 1, generalization in all conditions would be low. The choice probability parameter ($\gamma$) has an effect on the relative size of these patterns. For instance, if we let $\gamma$ be 1 (the parameterless version of Luce's, 1959, choice axiom), the simple advantage, although present, is less pronounced between non-exact-match trials (ST and TS) and exact-match trials (SS and TT). However, the overall pattern of results is similar, which shows the robustness of the model's predictions to $\gamma$ variation.
We want to highlight two novel predictions of MemSam. Usually when researchers test the "simple advantage" they make a straightforward prediction: Simple instances are better for learning than complex ones. However, what is good for learning is not necessarily good for transfer. The public debate on Chinese scripts is mostly centered on learning simplified versus traditional scripts, whereas we are empirically looking at the question of how transferable the reading skill learned from one script is to the other. Most studies do not go further to examine the conditions under which that is true or when that advantage might be most prominent. MemSam makes
"novel predictions" in the sense that these are not predictions that have been made by other researchers examining the simple advantage. The predictions borne out by MemSam are both interactions that would be difficult to predict without the rationale provided by a model simulation (see Footnote 4).
First, when fewer features are sampled for the memory trace, there should be a greater advantage for learning with simple characters. Conversely, as the memory trace becomes more accurate (better sampling), the simple advantage should be diminished. For example, as learners are given more time studying the training exemplar, their encoded memory trace becomes more accurate because more features are accurately encoded. This suggests that a longer study time should result in a less pronounced simple advantage. That is, a longer time for learning a traditional character should result in greater memory and generalization performance than a shorter time. However, learning from simple figures should not benefit as much from longer learning times. The plots in Fig. 5 show MemSam's predictions of generalization under each condition for relatively few (left panel) versus many (right) features sampled for the memory trace. The advantages of ST over TS and SS over TT are larger when fewer memory samples are taken. In creating these plots, MemSam's inputs were the actual numbers of shared and distinctive strokes for each of the traditional and simple characters used in the experiments. The Appendix contains a table of the stimuli and their respective feature counts.
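The first prediction can be checked numerically with a small Python sketch. It rests on assumptions that go beyond the text: strokes are sampled uniformly at random up to the capacity limit, the comparison uses match evidence rather than the full four-choice probability (because the foils are not specified here), and the 6-stroke and 14-stroke characters are hypothetical. The capacity values $K_m$ = 6 or 11 and $K_p$ = 20 follow the Fig. 5 caption, and F = 1 and F = .8 follow the examples given for exact-match and generalization conditions.

```python
from itertools import combinations


def jaccard(memory, probe):
    """|m ∩ p| / |m ∪ p| for two non-empty stroke sets."""
    return len(memory & probe) / len(memory | probe)


def expected_evidence(trained, probed, k_m, k_p, feature_match):
    """Mean of F * |m ∩ p| / |m ∪ p| over all equally likely samples.

    At most k_m strokes are kept from the trained character and at most
    k_p strokes from the probed character (uniform sampling is assumed).
    """
    memory_samples = list(combinations(sorted(trained),
                                       min(k_m, len(trained))))
    probe_samples = list(combinations(sorted(probed),
                                      min(k_p, len(probed))))
    total = 0.0
    for m in memory_samples:
        for p in probe_samples:
            total += feature_match * jaccard(set(m), set(p))
    return total / (len(memory_samples) * len(probe_samples))


# Hypothetical character pair: 6 shared strokes plus 8 distinctive ones.
simple = set(range(1, 7))
traditional = set(range(1, 15))


def simple_advantage(k_m, k_p=20):
    """Return the (SS - TT, ST - TS) gaps in expected match evidence."""
    ss = expected_evidence(simple, simple, k_m, k_p, 1.0)
    tt = expected_evidence(traditional, traditional, k_m, k_p, 1.0)
    st = expected_evidence(simple, traditional, k_m, k_p, 0.8)
    ts = expected_evidence(traditional, simple, k_m, k_p, 0.8)
    return ss - tt, st - ts


few = simple_advantage(k_m=6)    # limited encoding
many = simple_advantage(k_m=11)  # fuller encoding
print(few, many)
```

Under these assumptions both gaps are positive and both shrink as $K_m$ grows, which mirrors the predicted interaction between study time and the simple advantage.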
The second novel prediction of the model is that a greater
number of distinctive features in the traditional form should
lead to a greater simple advantage. In Fig. 5, as the number of
distinctive strokes in the traditional form increases, the gap
between SS and TT becomes larger for both memory sample
sizes, and that between ST and TS becomes larger for the
larger sample size (but remains relatively constant for the
smaller sample size). This can be empirically investigated by regressing memory and generalization performance against the stimulus characteristics of the specific characters used in the experiment, namely, the number of strokes comprising the simple form of a character, the number of strokes comprising the traditional form, the number of shared strokes, and the number of distinctive strokes in the traditional form.
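A short worked derivation may help show why the SS versus TT gap grows with the number of distinctive strokes. It is only a sketch under simplifying assumptions that are ours rather than the authors': F = 1, the simplified form has $n_S \leq K_m$ strokes, the traditional form has $n_S + d > K_m$ strokes (with d distinctive strokes), and $K_p$ is large enough to sample every probe stroke. The symbols $E_{SS}$ and $E_{TT}$ denote match evidence, not the full choice probability.

```latex
\begin{align*}
E_{SS} &= \frac{|m \cap p|}{|m \cup p|} = \frac{n_S}{n_S} = 1, \\
E_{TT} &= \frac{|m \cap p|}{|m \cup p|} = \frac{K_m}{n_S + d}, \\
E_{SS} - E_{TT} &= 1 - \frac{K_m}{n_S + d}.
\end{align*}
```

For instance, with $K_m = 6$ and $n_S = 6$, the gap is $1 - 6/8 = .25$ when $d = 2$ and $1 - 6/14 \approx .57$ when $d = 8$, so the gap widens as d increases.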
To examine these two predictions, we conducted a third
experiment in which participants were given an opportunity
to study the exemplar for either a relatively long or short
period of time. Additionally, we conducted an analysis of
the data by the particular stroke counts of the learning exem-
plar and generalization probes in each trial.
Experiment 3
Method
Participants and design
A total of 68 undergraduates (see Footnote 5) who reported having no knowledge of Chinese characters participated for course credit. As in Experiments 1 and 2, we used a within-subjects design, so the order of conditions (SS, TT, ST, TS) was counterbalanced across participants. This time, however, exposure time during the training phase was also a within-subjects variable.
Materials and procedure
To examine the simple advantage to generalization from the particular stroke counts of the learning exemplar and the generalization probes, we expanded the stimulus set to contain 120 traditional–simplified character pairs, in which the simplified form contained a subset of the strokes contained in the traditional form. The full set was randomly divided into four word lists with 30 character pairs in each list, while maintaining the stroke count distribution across lists. The stroke count of simplified characters ranged from 2 to 15 strokes (mean = 7.92), traditional characters ranged from 8 to 24 strokes (mean = 14.47), and the number of distinctive features (traditional strokes minus simplified strokes) ranged from 1 to 14 (mean = 6.55).
The procedures were similar to those of Experiment 2,
except that participants had either 0.5 or 6 s to study each
exemplar before the generalization phase. Each trial began
with a fixation cross, followed by a Chinese character
displayed for either 0.5 or 6 s, and four answer choices.
In SS and TT trials, participants matched simplified and
traditional characters to the respective identical characters.
In ST and TS trials, participants were shown a character
in one script (S or T, respectively) and were asked to
choose the best match among characters written in the
other script (T or S, respectively). Participants were asked
to respond with a numeric keypad.
Trials were blocked by conditions (SS, ST, TT, and TS),
and the order of conditions was counterbalanced across
participants. Each condition used only one of the four
word lists, counterbalanced across conditions. Thus, each
condition contained a total of 60 trials. Words were picked randomly from each word list, and each word was shown twice for either 0.5 or 6 s. The presentation times for each word were counterbalanced across participants, so that a character presented for 0.5 s for half of the participants was presented for 6 s to the other half of the participants. The presentation times appeared in random order within each condition. Participants could take a short break after every 60 trials.
Footnote 4. To be clear, other models could probably explain our results. However, our attempt was a minimal framework that explicitly articulates that the stimuli are the main source of the asymmetry demonstrated by the simple advantage. Other models have their own ways of handling asymmetries that are sometimes built into the processes (e.g., the search mechanism itself, or encoded asymmetries in association strength).
Footnote 5. We increased the number of participants in Experiment 3 for two reasons: (1) to increase statistical power, in the attempt to test the MemSam model, and (2) to counterbalance the assignments of presentation time (0.5 or 6 s) to each word across participants.
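The trial structure of one condition block in Experiment 3 can be sketched in Python as follows. This is one possible reading of the procedure, not the authors' experiment script: it assumes that each word receives a single presentation time per participant, that both of its presentations use that time, and that two hypothetical counterbalancing groups receive complementary assignments. The function name, the `group` argument, and the placeholder word labels are illustrative.

```python
import random


def build_condition_trials(words, group, seed=0):
    """Return a shuffled list of (word, duration) trials for one block.

    words: the 30 character pairs of one word list.
    group: 0 or 1, a hypothetical counterbalancing group.
    """
    durations = (0.5, 6.0)
    trials = []
    for index, word in enumerate(words):
        # Alternate durations across words; the other group gets the
        # complementary assignment.
        duration = durations[(index + group) % 2]
        # Each word is shown twice at its assigned duration.
        trials.extend([(word, duration)] * 2)
    # Presentation times appear in random order within the block.
    random.Random(seed).shuffle(trials)
    return trials


word_list = [f"pair_{i:02d}" for i in range(30)]  # placeholder labels
block_a = build_condition_trials(word_list, group=0)
block_b = build_condition_trials(word_list, group=1)
print(len(block_a), block_a[:3])
```

Each block then holds 60 trials, 30 at each presentation time, and any word shown for 0.5 s in one group is shown for 6 s in the other.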
Results and discussion
Preliminary analysis
We found no effect of character set or condition order (ps > .10) in accuracy and RTs, so the data for each condition were collapsed across those variables. Three participants' data (out of 68) were dropped because of chance-level accuracy performance throughout the experiment. Inclusion of their data did not change the results reported below.
Model Prediction 1: The simple advantage is stronger when fewer features are sampled for the memory trace. Given that a longer viewing time is expected to provide a more accurate memory trace, our model predicts a greater simple advantage for generalization with a 0.5-s than with a 6-s presentation time.
Figure 6 plots the accuracy by condition and by presentation time. As in Experiment 2, the exact-match performances from both conditions were near ceiling with little variance, which violated the assumption of normality [Shapiro–Wilk, W(65) < 0.70, ps < .05]. Thus, we confirmed the condition difference on the exact-matching task with the Wilcoxon signed rank test, and also conducted ANOVAs and paired t tests for the remaining analyses when their assumptions were met.
As in the first two experiments, participants were generally more accurate on the exact-matching task (M = .90, SD = .15) than on the generalization task (M = .67, SD = .14), Z = 6.79, p < .001. We also replicated the simple advantage. The ST condition produced significantly better generalization performance than the TS condition, as was confirmed by a significant Condition × Test Type interaction, Z = 5.70, p < .001. Interestingly, the SS condition (M = .92, SD = .14) also had higher overall accuracy than the TT condition (M = .89, SD = .16), Z = 3.09, p = .002 (see Footnote 6).
We conducted a 2 Study Time (0.5 s, 6 s) × 2 Condition Order (simplified first, traditional first) ANOVA on accuracy, and confirmed a main effect of condition, with higher accuracy on the simple-first conditions (SS and ST, M = .83, SD = .13) than on the traditional-first conditions (TT and TS, M = .75, SD = .13), F(1, 64) = 61.732, p < .001, $\eta_p^2$ = .49. No main effect of study time was apparent, F(1, 64) = 1.20, p = .28. Importantly, however, we did find a significant interaction, F(1, 64) = 10.84, p < .01, $\eta_p^2$ = .15: The simplified-first condition produced higher accuracy than the traditional-first condition with both short, paired t(64) = 8.57, p < .001, and long, paired t(64) = 4.80, p < .001, study times, suggesting that in either study time condition, there was a simple advantage. No difference across study times emerged for the simplified-first condition, but we did observe a significant difference in the traditional-first condition, paired t(64) = 2.35, p < .05, such that words studied for the longer time of 6 s were identified more accurately (M = .76, SD = .15) than those studied for 0.5 s (M = .73, SD = .14). Note that this suggests that the disadvantage of learning from a traditional instance was diminished when participants had a longer study time. Thus, as was predicted by our model, the traditional disadvantage (the complement to the simple advantage) was more apparent with the 0.5-s presentation time than with the 6-s presentation time.
Footnote 6. A 2 (Presentation Time: 0.5 s, 6 s) × 2 (Condition Order: simplified first, traditional first) × 2 (Test Type: exact match, generalization) within-subjects ANOVA on accuracy confirmed the same patterns. We observed a main effect of condition, with participants being more accurate in the simple-first than in the traditional-first conditions, F(1, 64) = 61.732, p < .001, $\eta^2$ = .49. They also did better on the exact-matching task, F(1, 64) = 235.07, p < .001, $\eta^2$ = .31. The ST condition produced significantly better generalization performance than the TS condition, as was confirmed by a significant Condition × Test Type interaction, F(1, 64) = 29.15, p < .001, $\eta^2$ = .79. We also found a significant difference between the SS and TT conditions (p < .05), but no three-way interaction (p > .05).
Fig. 5 Predicted generalization from the MemSam model under each condition for few (i.e., $K_m = 6$, left panel) versus many (i.e., $K_m = 11$, right panel) sampled features. Each dot represents the mean proportion accuracy from a particular condition. Solid lines represent the best-fitting linear regression lines for generalization tests, and dashed lines represent the best-fitting linear regression lines for the memorization tests. To produce these figures, $K_p$, the capacity limit for the probe trace, was set to 20. Having fewer features sampled for the memory trace may correspond to situations in which there is limited time or resources for initial learning
Can RTs explain away this effect? In other words, in the TS condition, were participants more accurate on trials with a 6-s viewing time simply because they also took longer to answer on those trials than on those with 0.5-s viewing time? Analyses of RTs showed that this was not the case. The correlation between proportion accuracy and the RTs of accurate responses was negative, r = −.30, p < .05 (the correlation of accuracy and the RTs of all responses was −.34, p < .01), indicating that slower participants were also less accurate. Thus, we cannot attribute differences in the simple advantage between presentation times to a speed–accuracy trade-off. Figure 7 displays the mean RT data of accurate responses by condition for each presentation time.
Interestingly, participants were slower to respond correctly on all trials with a 6-s viewing time (M = 2.33, SD = 1.35) than with a 0.5-s viewing time (M = 1.66, SD = 0.45); slower RTs were not specific to TS trials. This was confirmed by a 2 Condition Order × 2 Test Type × 2 Presentation Time repeated measures ANOVA on the RTs of accurate responses. This produced a main effect of presentation time, F(1, 64) = 18.49, p < .001, $\eta_p^2$ = .22, and a significant Presentation Time × Test Type interaction, F(1, 64) = 5.75, p < .05, $\eta_p^2$ = .08: Participants took more time to answer memorization trials correctly with a 6-s presentation time (M = 1.79, SD = 0.85) than with a 0.5-s presentation time (M = 1.32, SD = 0.38), t(64) = 5.18, p < .001. They also took more time to correctly respond to generalization trials with a 6-s presentation time (M = 2.86, SD = 1.95) than with a 0.5-s presentation time (M = 2.00, SD = 0.66), t(64) = 3.74, p < .001. Although the explanation is purely speculative, perhaps this reflects a priming effect in which fast presentation times prime participants to respond more quickly in this self-paced task.
As in the first two experiments, participants were generally faster to answer correctly on exact-match trials (M = 1.55, SD = 0.54) than on generalization trials (M = 2.43, SD = 1.11), as was confirmed by a significant main effect of test type, F(1, 64) = 83.55, p < .001, $\eta_p^2$ = .57. We also observed a significant Condition Order × Test Type interaction, F(1, 64) = 25.06, p < .001, $\eta_p^2$ = .28. Participants were generally faster when they were correct on TS trials (M = 2.15, SD = 0.73) than they were on ST trials (M = 2.7, SD = 1.84), paired t(64) = 2.63, p < .05. They were also faster on SS trials (M = 1.48, SD = 0.76) than on TT trials (M = 1.65, SD = 0.48), paired t(64) = 2.33, p < .05.
We found no main effect of condition order, no Presentation
Time × Condition Order interaction, and no three-way interac-
tion (all ps> .05).
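The partial eta squared values reported for the RT analysis can be recovered from the F ratios and their degrees of freedom using the standard identity $\eta_p^2 = (F \cdot df_{effect}) / (F \cdot df_{effect} + df_{error})$. The Python sketch below applies it to three of the reported effects; the function name is ours, and the inputs are the F values and degrees of freedom stated in the text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F ratio and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios from the RT ANOVA, each with (1, 64) degrees of freedom.
presentation_time = partial_eta_squared(18.49, 1, 64)
test_type = partial_eta_squared(83.55, 1, 64)
order_by_test_type = partial_eta_squared(25.06, 1, 64)

print(round(presentation_time, 2),
      round(test_type, 2),
      round(order_by_test_type, 2))
```

The three results round to the reported effect sizes of .22, .57, and .28, respectively.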
Model Prediction 2: The greater the number of distinctive features in the traditional form, the greater the simple advantage for generalization.
To examine the contribution of the number of distinctive features in the traditional form on the simple advantage for generalization, we conducted a multiple regression analysis using forward difference dummy coding to compare proportion accuracies according to the number of distinctive features between conditions SS and TT and between ST and TS. Interaction variables were created to estimate the slopes (accuracy by number of distinctive features) of the best-fitting regression lines for each condition. All variables were entered simultaneously. The accuracies in each condition by the number of distinctive features are shown in Fig. 8.
Fig. 6 Accuracy by presentation time (0.5 s vs. 6 s) from the exact-match and generalization tests of Experiment 3, shown as mean proportion correct for the simple-first (SS, ST) and traditional-first (TT, TS) conditions. Error bars indicate ±1 SE
Fig. 7 Response times of accurate answers (in seconds) by presentation time (0.5 s vs. 6 s) from the exact-match and generalization tests in Experiment 3, shown for the simple-first (SS, ST) and traditional-first (TT, TS) conditions. Error bars indicate ±1 SE
The MemSam model fit the data well, $R^2$ = .68, F(5, 62) = 26.60, p < .001; the model explained 68.2% of the variance in accuracy. The resulting regression equation was Accuracy = .93 − .018(Distinctive Features) − .035(SS–TT) + .067(ST–TS) + .02(SS–TT × Distinctive Features) + .022(ST–TS × Distinctive Features). Conditions started with the same initial mean on accuracy, but the effects of the number of distinctive features were different for different conditions. The difference in slopes between the SS (.002) and TT (.005) conditions was statistically significant, b = .02, t(62) = 2.45, p = .017, suggesting that we can reject the null hypothesis that the regression lines were parallel for SS and TT. The difference in slopes between the ST and TS conditions was also statistically significant, b = .022, t(62) = 2.701, p = .009, suggesting that the slopes of ST (.03) and TS (.04) were not equal. Thus, as the number of distinctive features in the traditional form increases, the generalization accuracy drops faster in the TS than in the ST condition. This finding is congruent with our model hypothesis that a greater simple advantage should appear with larger numbers of distinctive features in the traditional form.
To examine the simple effects, we used the recentering strategy to test whether there were differences in accuracies between the conditions at the mean number of distinctive features (M = 9, SD = 4.94) and at ±1 SD of the mean (approximately at 4 and 14, respectively). Consistent with Model Prediction 2, we expected that the gaps between SS and TT and between ST and TS would increase with increasing numbers of distinctive features in the traditional form. When there were approximately four distinctive features, the overall effect of condition was significant, F(2, 62) = 6.83, p = .002: Conditions SS and TT did not differ from each other, t(62) = 1.37, p > .05, but ST showed significantly higher accuracy than the TS condition, t(62) = 2.78, p < .01. At nine distinctive features, we also found a significant main effect of condition, F(2, 26) = 21.27, p < .001. SS had a higher accuracy than TT, t(62) = 2.63, p = .011, and ST had a higher accuracy than TS, t(62) = 4.75, p < .001. At 13 distinctive features, the overall effect of conditions was also statistically significant, F(2, 62) = 47.21, p < .001. SS was statistically higher in accuracy than TT, t(62) = 4.229, p < .001, and ST was largely more accurate than TS, t(62) = 6.84, p < .001. As the number of distinctive strokes in the traditional form increased, there was an increasing gap between SS and TT as well as between ST and TS for the larger sample sizes.
General discussion and conclusion
We examined the simple advantage for generalization
between simple and complex Chinese scripts in order
to explore the hypothesis that differences in encoding
opportunities drive this effect. In Experiment 1, participants
studied the characters and their English translations before
attempting to generalize their learning to the same characters
of the unlearned script. In Experiment 2, participants had only
brief controlled exposure to the characters before undergoing
the generalization test. In both experiments, we found a generalization advantage when the initially shown exemplar was simple. Experiment 2 showed that the asymmetry can be localized to generalization itself, rather than being unique to associating characters with English words.
Contrasting the results of Experiments 1 and 2, generalization performance was more accurate yet slower in Experiment 1 than in Experiment 2. This pattern is reasonable, given the differences in the tasks across experiments: Those in Experiment 1 had to recall the characters from memory when given their English definitions, whereas those in Experiment 2 saw exemplar characters immediately before making their choice. Taking more time to recall the trained characters may have helped the participants in Experiment 1 generalize more accurately. A longer RT is probably less effective, though, when generalization was more purely perceptual (as in Exp. 2).
Fig. 8 Accuracy (proportion accurate) as a function of the number of distinctive features in the traditional form from Experiment 3's stimulus set, plotted for the simple-to-simple, traditional-to-traditional, simple-to-traditional, and traditional-to-simple conditions. Each dot represents the mean proportion accuracy from a particular condition. Solid lines represent the best-fitting linear regression lines for generalization tests, and dashed lines represent the best-fitting linear regression lines for the memorization tests
To explain the simple advantage, we proposed MemSam, a simple process model, and tested its predictions in Experiment 3. Our model posits that the simple advantage is driven primarily by differences in perceptual encoding of the available information between learning from simple and complex instances. Simple learning instances contain fewer features to be sampled, allowing learners to encode and store more of those features. Thus, when learners are given more time to study the exemplar, their memory trace becomes more accurate, because more features are accurately encoded. Consistent with this hypothesis, Experiment 3 showed that the disadvantage of learning from the traditional characters diminished if participants had a longer learning time. Furthermore, as the number of distinctive features between the simple and traditional forms increases, the model predicts that the asymmetry between the TS and ST conditions should increase. Experimental confirmation for this prediction was found, in that the magnitude of the simple advantage increased as the number of distinctive features in the traditional form increased. The model thus unifies the results by making quantitative predictions for all conditions and showing how they interact with stimulus complexity and presentation time.
In the following subsections, we will discuss the theoretical
and educational implications of these findings.
Theoretical implications
These findings are consistent with the results of past research on generalization by shape with young children (e.g., Son et al., 2008): In short, simple instances promote better category generalization. Why are these instances advantageous for transfer? Simple training instances may allow for efficient encoding of the right initial features and/or for retrieval of useful representations. Learning from complex characters may be detrimental just because of the presence of additional nondiagnostic features that are not present in novel transfer cases. Furthermore, complex instances may generally require greater attentional resources to learn and use.
Novices of all stripes seem to exhibit similar difficulties in both categorization and perceptual learning. A perceptual explanation that may be illuminating is that potentially useful and distracting features may not be psychologically separable at the time of learning (Schyns & Rodet, 1997). Being exposed to a simplified perceptual instance first may have enabled our learners to recognize the complex character as containing the simple character along with other, new features. Initial learning with a complex stimulus does not provide a decomposed perceptual vocabulary, and thus the learner might miss the shared components between the complex and simple stimuli. An analog of this perceptual mechanism may underlie the simple advantage found in studies of conceptual transfer, given the parallels between perceptual and conceptual learning (Goldstone, Landy, & Son, 2010; Kaminski et al., 2008).
Additionally, this work raises more issues regarding the
relationship between similarity, recognition memory, and cat-
egory generalization. If recognition memory or category gen-
eralization is taken as a measure of similarity, this set of results
provides further evidence for the asymmetry of similarity.
Accuracy and RTs are asymmetrical between the initially
viewed exemplar and the potential matches, such that gener-
alization performance is aided by an initially simple exemplar.
Furthermore, this work raises the possibility that similarity
judgments based on perceptually available features may oper-
ate differently than when such judgments are based on fea-
tures retrieved from exemplars in memory.
Another theoretical issue that arises from the results and the model is the question of the role of encoding in the simple advantage. Both in the model and in the three experiments, the learned stimulus was not present at the time of identification, so encoding the learning exemplar into memory was part of the process. Would our model's predictions continue to be empirically supported even when participants simultaneously viewed the learning materials with the transfer choices, thus eliminating memory requirements? Although we lack empirical data, the model could account for a demonstration of the simple advantage in this situation by more broadly defining what the memory trace stands for. Instead of interpreting the memory trace as something registered in permanent memory, we could construe it as creating a representation of a base case that we know about in order to make predictions about unknown objects. Even if the learning exemplar was present at the time of generalization, limited attentional resources would probably preclude a viewer from attending to every feature accurately (assuming that the object is complex/novel enough). Our model predicts the simple advantage as long as the number of features sampled from the probes ($K_p$) is sufficiently larger than the number of features sampled from the learning object ($K_m$).
Practical implications
If one of the most important goals of education is appropriate generalization, the simple advantage appears to have broad implications. Even though generalization would likely occur if enough time and resources were devoted to training with many complex, detailed instances (see, e.g., Kellman, Massey, & Son, 2010), the present research provides further support for the idea that simple training instances may be able to foster generalization more efficiently. MemSam, a stripped-down process model, provides a step toward a true account of previous research that had examined the simple advantage within academic domains.
More directly, these results bear on the cognitive role of scripts in Chinese reading. Broadly speaking, there are no measurable differences in reading or spelling between the two scripts (Chan & Wang, 2003). A few studies have suggested that learning to read with simplified characters is more related to visual skills than is learning to read traditional characters (Chen & Yuen, 1991; McBride-Chang et al., 2005). Young children learning to read in mainland China (using simplified script) were more likely to base similarity judgments of
characters on visual characteristics than were children from Hong Kong (primarily taught with traditional script; Chen & Yuen, 1991). Although further research will be necessary to determine whether learning a few characters in a lab setting is similar to learning hundreds of characters in order to gain literacy, our findings suggest that there might be a benefit of starting with simplified characters. This empirical exploration of the supposedly "easy and smooth" switching from one script to the other clearly demonstrates an asymmetry: The two directions of switching are not equally easy and smooth. Particularly if the goal is to read both scripts, learning the simplified script may be more helpful for learning the traditional script than learning the traditional script is when transferring to the simplified characters.
Simplified characters contain fewer but more diagnostic components (radicals), so it may be advantageous to treat these recurring radicals as basic orthographic units. Perhaps an emphasis on explicitly learning these units early on may foster better generalization to full-blown characters. Research on Chinese literacy (e.g., Tsai & Nunes, 2003) has shown that expert readers are generally quite sensitive to these components. Whether such pedagogical practice supports future learning of new Chinese characters is a question for future research.
The relevance of these findings for Chinese literacy is lim-
ited in two significant ways. First, the characters used in these
studies were only simplified via the component omission pro-
cess. Future research should incorporate character sets created
through other simplification methods, such as replacing a
complex component (e.g., four dashes) with a simpler one
(e.g., a single line), to draw broader conclusions about the
simple advantage for Chinese reading. Second, reading is
more than merely identifying or recognizing characters.
Traditional characters include cues to pronunciation and
meaning that have been removed in simplified characters.
These cues may be equally, or even more, important to full-
fledged reading than is ease of recognition.
Conclusions
The present results show that the simple advantage extends to a naturally occurring generalization problem: transferring from one Chinese script to another. This adds to the growing evidence that this advantage is stable across a variety of tasks and domains, from categorization and object recognition to more complex forms of formal learning. The MemSam model
illustrated how this effect could be driven by a domain-general
encoding mechanism that bridges or incorporates both percep-
tual and conceptual learning. In some sense, all learning situ-
ations are ill-constrained, because a novice does not know
what information is relevant or irrelevant. Simplicity supports
learning by getting at the heart of this problem: The few fea-
tures that are presented are all relevant.
Author note This research was supported in part by the Institute of
Education Sciences, U.S. Department of Education (Grant No.
R305B080016) through the University of California, Los Angeles
(UCLA). The opinions expressed are those of the authors and do
not represent the views of the Institute or the U.S. Department of
Education. We gratefully acknowledge the support of Philip Kellman,
Linda Smith, James Stigler, Everett Mettler, Isabel Bay, Trinh Tran,
Xiaoya Qiu, the members of the UCLA Human Perception Laboratory,
and the UCLA Teaching and Learning Laboratory.
Appendix
Table 1 Full stimulus list used in Experiment 3

Count  Simplified  Traditional  Simplified     Traditional    Number of
       Character   Character    Feature Count  Feature Count  Distinctive Features
1                               4              18             14
2*                              7              10             3
3*     气          氣           7              10             3
4                               8              15             7
5*                              9              16             7
6*                              9              17             8
7      恼          惱           9              12             3
8                               6              16             10
9                               8              16             8
10                              12             21             9
11                              9              19             10
12                              7              9              2
13*    际          際           7              13             6
14*                             6              13             7
15     脑          腦           10             13             3
16     筑          築           12             16             4
17*    务          務           5              10             5
18*    籴          糴           8              22             14
19*                             10             17             7
20     虏          虜           8              13             5
21*                             9              19             10
22*                             4              12             8
23                              9              19             10
24*                             8              11             3
25                              10             15             5
26*                             6              14             8
27                              13             17             4
28*                             3              11             8
29                              7              18             11
30                              11             16             5
31                              3              17             14
32                              14             21             7
33                              9              15             6
34*    厌          厭           6              14             8
35     靥          靨           15             23             8
36                              8              16             8
37                              13             21             8
38*                             5              13             8
39                              2              15             13
40                              8              15             7
41                              9              20             11
42     飞          飛           4              9              5
43*                             13             18             5
44     随          隨           13             16             3
45*                             4              11             7
46                              8              11             3
47     孙          孫           6              10             4
48                              6              19             13
49*    挂          掛           9              11             2
50                              7              15             8
51                              7              13             6
52                              9              14             5
53                              10             12             2
54                              11             12             1
55*                             6              8              2
56     赶          趕           10             14             4
57                              7              14             7
58                              9              14             5
59*                             7              10             3
60                              6              10             4
61                              11             19             8
62     踊          踴           14             16             2
63*                             10             18             8
64*                             5              13             8
65                              8              14             6
66                              8              15             7
67     阳          陽           7              12             5
68*    粪          糞           12             17             5
69*                             9              15             6
70*                             8              11             3
71                              9              13             4
72                              9              16             7
73     魇          魘           15             24             9
74*                             5              14             9
75*                             5              13             8
76*                             4              11             7
77*                             9              17             8
78*                             3              10             7
79*    宁          寧           5              14             9
80                              9              16             7
81*                             9              12             3
82                              6              11             5
83*                             9              16             7
84                              6              9              3
85                              8              11             3
86                              10             13             3
87                              9              20             11
88                              7              10             3
89                              6              12             6
90*    夺          奪           6              14             8
91                              6              14             8
92*                             10             17             7
93*    堕          墮           11             14             3
94*                             5              11             6
95*    隶          隸           8              17             9
96                              7              12             5
97                              10             13             3
98     儿          兒           2              8              6
99*                             6              11             5
100                             9              15             6
101*   奋          奮           7              11             4
102                             7              9              2
103                             7              15             8
104*                            8              16             8
105                             9              18             9
106    齿          齒           8              15             7
107                             7              14             7
108*                            7              16             9
109*   耸          聳           10             17             7
110                             10             14             4
111*   盘          盤           10             14             4
112*   宝          寶           8              20             12
113    干          幹           3              13             10
114*                            7              9              2
115                             5              14             9
116                             7              17             10
117                             10             14             4
118                             8              16             8
119                             7              19             12
120                             5              18             13
**                              6              17             11

The asterisks (*) indicate characters used in Experiments 1 and 2 and as
input for the MemSam model. ** This character was used in Experiments
1 and 2 but was not included in Experiment 3.
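As a reading aid for Table 1, the following minimal Python sketch checks a pattern visible in the listed rows: the number of distinctive features equals the traditional feature count minus the simplified feature count, as one would expect if simplification only omits components. The row values are copied from the table; the names ROWS, distinctive_features, and check_rows are illustrative assumptions and are not part of the original materials.

```python
# Rows copied from Table 1:
# (count, simplified, traditional,
#  simplified feature count, traditional feature count, distinctive features)
ROWS = [
    (15, "脑", "腦", 10, 13, 3),
    (16, "筑", "築", 12, 16, 4),
    (35, "靥", "靨", 15, 23, 8),
    (44, "随", "隨", 13, 16, 3),
    (56, "赶", "趕", 10, 14, 4),
    (62, "踊", "踴", 14, 16, 2),
    (73, "魇", "魘", 15, 24, 9),
]


def distinctive_features(simplified_count: int, traditional_count: int) -> int:
    """Assumed relation: features present only in the traditional form."""
    return traditional_count - simplified_count


def check_rows(rows):
    """Return the row counts whose listed value differs from the assumed relation."""
    return [
        row[0]
        for row in rows
        if distinctive_features(row[3], row[4]) != row[5]
    ]


if __name__ == "__main__":
    # An empty list means every sampled row follows the assumed relation.
    print(check_rows(ROWS))
```

Under this reading, a larger distinctive-feature count simply marks a pair in which the traditional form carries more features that are absent from the simplified form.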
References
Biederman, I., & Shiffrar, M. (1987). Sexing day-old chicks: A case study
and expert systems analysis of a difficult perceptual learning task.
Journal of Experimental Psychology: Learning, Memory, and
Cognition, 13, 640–645. doi:10.1037/0278-7393.13.4.640
Chan, L., & Wang, L. (2003). Linguistic awareness in learning to read
Chinese: A comparative study of Beijing and Hong Kong children.
In C. McBride-Chang & H.-C. Chen (Eds.), Reading development in
Chinese children (pp. 91–106). Westport, CT: Praeger Press.
Chen, M. J., & Yuen, J. C.-K. (1991). Effects of pinyin and script type on
verbal processing: Comparisons of China, Taiwan, and Hong Kong
experience. International Journal of Behavioral Development, 14,
429–448.
Chung, F. H. K., & Leung, M. T. (2008). Data analysis of Chinese
characters in primary school corpora of Hong Kong and mainland China:
Preliminary theoretical interpretations. Clinical Linguistics and
Phonetics, 22, 379–389.
Gao, D.-G., & Kao, H. S. R. (2002). Psycho-geometric analysis of
commonly used Chinese characters. In H. S. R. Kao, C.-K.
Leong, & D.-G. Gao (Eds.), Cognitive neuroscience studies of
the Chinese language (pp. 195–206). Hong Kong, China: Hong
Kong University Press.
Goldstone, R. L. (2006). The complex systems see-change in education.
Journal of the Learning Sciences, 15, 35–43.
Goldstone, R. L., Landy, D., & Son, J. Y. (2010). The education of
perception. Topics in Cognitive Science, 2, 265–284.
doi:10.1111/j.1756-8765.2009.01055.x
Goldstone, R. L., & Sakamoto, Y. (2003). The transfer of abstract prin-
ciples governing complex adaptive systems. Cognitive Psychology,
46, 414–466. doi:10.1016/S0010-0285(02)00519-4
Goldstone, R. L., & Son, J. Y. (2005). The transfer of scientific principles
using concrete and idealized simulations. Journal of the Learning
Sciences, 14, 69–114.
Harbaugh, R. (2003). Chinese characters and culture. Retrieved from
www.zhongwen.com
Hintzman, D. L. (1984). MINERVA 2: A simulation model of human
memory. Behavior Research Methods, Instruments, & Computers,
16, 96–101. doi:10.3758/BF03202365
Hodge, R., & Louie, K. (1998). The politics of Chinese language and
culture: The art of reading dragons (pp. 62–64). London, UK:
Psychology Press.
Hull, C. L. (1920). Quantitative aspects of evolution of concepts: An
experimental study. Psychological Monographs: General and
Applied, 28(1), 1–86. doi:10.1037/h0093130
Kaminski, J. A., Sloutsky, V. M., & Heckler, A. F. (2008). Learning
theory: The advantage of abstract examples in learning math.
Science, 320, 454–455. doi:10.1126/science.1154659
Kellman, P. J., Massey, C. M., & Son, J. Y. (2010). Perceptual learning
modules in mathematics: Enhancing students' pattern recognition,
structure extraction, and fluency. Topics in Cognitive Science, 2,
285–305. doi:10.1111/j.1756-8765.2009.01053.x
Lamberts, K. (2000). Information-accumulation theory of speeded
categorization. Psychological Review, 107, 227–260.
doi:10.1037/0033-295X.107.2.227
Luce, R. D. (1959). Individual choice behavior: A theoretical analysis.
New York, NY: Wiley.
McBride-Chang, C., Chow, B. W. Y., Zhong, Y., Burgess, S., & Hayward,
W. G. (2005). Chinese character acquisition and visual skills in two
Chinese scripts. Reading and Writing, 28, 99–128.
McNeil, N. M., Uttal, D. H., Jarvin, L., & Sternberg, R. J. (2009). Should
you show me the money? Concrete objects both hurt and help per-
formance on mathematics problems. Learning and Instruction, 19,
171–184. doi:10.1016/j.learninstruc.2008.03.005
Medin, D. L., & Schaffer, M. M. (1978). Context theory of classification
learning. Psychological Review, 85, 207–238.
doi:10.1037/0033-295X.85.3.207
Nosofsky, R. M. (1986). Attention, similarity, and the identification-
categorization relationship. Journal of Experimental Psychology:
General, 115, 39–57. doi:10.1037/0096-3445.115.1.39
Podgorny, P., & Garner, W. R. (1979). Reaction time as a mea-
sure of inter- and intraobject visual similarity: Letters of the
alphabet. Perception & Psychophysics, 26, 37–52.
doi:10.3758/BF03199860
Raaijmakers, J. G. W., & Shiffrin, R. M. (1980). SAM: A theory of
probabilistic search of associative memory. In G. H. Bower (Ed.),
The psychology of learning and motivation: Advances in research
and theory (Vol. 14, pp. 207–262). New York, NY: Academic Press.
doi:10.1016/S0079-7421(08)60162-0
Schyns, P. G., & Rodet, L. (1997). Categorization creates functional fea-
tures. Journal of Experimental Psychology: Learning, Memory, and
Cognition, 23, 681–696. doi:10.1037/0278-7393.23.3.681
Seybolt, P. J., & Chiang, G. K.-K. (1979). Introduction. In P. J. Seybolt &
G. K.-K. Chiang (Eds.), Language reform in China: Documents and
commentary (pp. 1–10). White Plains, NY: M. E. Sharpe.
Shiffrin, R. M., & Steyvers, M. (1997). A model for recognition memory:
REM–retrieving effectively from memory. Psychonomic Bulletin
& Review, 4, 145–166. doi:10.3758/BF03209391
Sloutsky, V. M., Kaminski, J. A., & Heckler, A. F. (2005). The advantage
of simple symbols for learning and transfer. Psychonomic Bulletin &
Review, 12, 508–513. doi:10.3758/BF03193796
Son, J. Y., Smith, L. B., & Goldstone, R. L. (2008). Simplicity and
generalization: Short-cutting abstraction in children's object
categorizations. Cognition, 108, 626–638.
doi:10.1016/j.cognition.2008.05.002
Tsai, K.-C., & Nunes, T. (2003). The role of character schema in learning
novel Chinese characters. In C. McBride-Chang & H.-C. Chen
(Eds.), Reading development in Chinese children (pp. 109–125).
Westport, CT: Praeger Press.
Tversky, A. (1977). Features of similarity. Psychological Review, 84,
327–352. doi:10.1037/0033-295X.84.4.327
Wang, E. (2009, May 2). Elitism vs. populism. In The Chinese language,
ever evolving (Blog post). The New York Times. Retrieved from
http://roomfordebate.blogs.nytimes.com/2009/05/02/chinese-
language-ever-evolving/?_r=0
Mem Cogn (2016) 44:292–306