Accuracy and Precision of Visual Stimulus Timing in PsychoPy: No Timing Errors in Standard Usage

Pablo Garaizar1*, Miguel A. Vadillo2,3

1 Deusto Institute of Technology, DeustoTech, Universidad de Deusto, Bilbao, Spain, 2 Department of Experimental Psychology, University College London, London, United Kingdom, 3 Primary Care and Public Health Sciences, King's College London, London, United Kingdom
Abstract
In a recent report published in PLoS ONE, we found that the performance of PsychoPy degraded with very short timing
intervals, suggesting that it might not be perfectly suitable for experiments requiring the presentation of very brief stimuli.
The present study aims to provide an updated performance assessment for the most recent version of PsychoPy (v1.80)
under different hardware/software conditions. Overall, the results show that PsychoPy can achieve high levels of precision
and accuracy in the presentation of brief visual stimuli. Although occasional timing errors were found in very demanding
benchmarking tests, there is no reason to think that they can pose any problem for standard experiments developed by
researchers.
Citation: Garaizar P, Vadillo MA (2014) Accuracy and Precision of Visual Stimulus Timing in PsychoPy: No Timing Errors in Standard Usage. PLoS ONE 9(11):
e112033. doi:10.1371/journal.pone.0112033
Editor: Trevor Bruce Penney, National University of Singapore, Singapore
Received July 28, 2014; Accepted September 28, 2014; Published November 3, 2014
Copyright: © 2014 Garaizar, Vadillo. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits
unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The authors confirm that all data underlying the findings are fully available without restriction. Detailed data for all the tests reported are
available at the Open Science Framework public repository (https://osf.io/9dkgz/). The repository is available at: https://osf.io/6j3iz/.
Funding: The author(s) received no specific funding for this work.
Competing Interests: The authors have declared that no competing interests exist.
* Email: garaizar@deusto.es
Introduction

Over the last few decades, computers have become an essential tool in psychological and neuroscientific research. Thanks to them, it is possible to present participants with stimuli in different audiovisual formats and to register different aspects of their reactions to those materials, including verbal judgments or response latencies.
However, not all combinations of software and hardware are able
to comply with the strict requirements of some experimental
paradigms. For instance, researchers often need to present stimuli
for very brief periods of time. Experiments on subliminal priming
typically involve the presentation of words or images for intervals
no longer than 16–50 ms [1,2]. In these experiments, even small
deviations from programmed durations (e.g., from 16 to 33 ms)
can make a substantial difference in participants’ ability to
perceive stimuli. Similarly, some experimental paradigms require
very accurate measurement of reaction times. Many of these
experiments explore effects in the range of just 30–100 ms [3,4].
Problems in the presentation of stimuli or in the logging of
responses can affect the results of these kinds of experiments.
Although measurement errors usually have a minimal impact on
data when researchers average reaction times collected across
many trials [5,6], they can compromise more sophisticated analyses, such as fitting models to the distribution of reaction times [7–9].
Fortunately for researchers, there is a wide variety of software
packages available that have been carefully designed to comply
with these strict requirements. In addition to proprietary software
(e.g., E-Prime, Presentation), some outstanding open and freely available alternatives also exist [10–13]. Among them,
PsychoPy is quickly becoming a popular choice [14]. PsychoPy is a
multiplatform software package for designing and conducting
cognitive experiments that can run natively in Microsoft Windows,
GNU/Linux and Apple Mac OS X. It is coded in Python, like
many other alternatives available (e.g., Experiment Builder,
PyEPL, OpenSesame, Vision Egg), and provides a graphical
authoring tool (PsychoPy experiment builder) and a set of Python
libraries for building experiments.
Unfortunately, in a recent report published in PLoS ONE, we
found that the performance of PsychoPy degraded with very short
timing intervals, suggesting that it might not be perfectly suitable
for experiments requiring the presentation of very brief stimuli
[15]. Although the performance of PsychoPy improved noticeably when running under a real-time operating system, important timing errors still remained for stimulus durations of 100 ms or less.
However, there are reasons to suspect that the results of our
previous tests might underestimate the potential accuracy of
PsychoPy. Firstly, as noted by the author of PsychoPy himself [16],
our study on the accuracy of PsychoPy was conducted with an
early version of the software package that was almost 3 years old at
the time the study was finally published. Our report ignored any
improvements introduced in PsychoPy during that time. Secondly,
the scripts used in our tests were generated using the experiment
builder interface, which was not fully operative in that version.
Furthermore, the experiment builder of the tested version did not
allow defining stimulus durations in terms of ticks (i.e., display
refreshes). Therefore, in our benchmark tests stimulus durations
were defined in time units. This might have given rise to problems
in the translation from the millisecond definition of stimuli to the
corresponding number of ticks. Finally, given that the original study used a single computer for all tests, it is impossible to rule out the possibility that the poor performance of PsychoPy reflected limitations of the hardware rather than genuine problems with the software.
The present study aims to provide an update on the performance of the most recent version of PsychoPy under ideal conditions.
Methods
The main differences with respect to the previous study [15] are
the version of PsychoPy being tested (1.80 instead of 1.64) and the
specific scripts used to assess its accuracy. In the present study, the
scripts were not created with PsychoPy’s experiment builder.
Instead, we adapted a benchmarking program developed by
Jeremy Gray [17]. In addition, we have conducted our tests on
updated operating systems.
Methodology and Apparatus
Tests were conducted on two different computers: 1) an Apple MacBook Pro 11,1 "Core i5" 2.4 13″ Late 2013 with 8 GB of RAM, a 13.3″ Retina display (2560×1600 px), and an integrated Intel Iris 5100 graphics processor that shares memory with the
Table 1. PsychoPy 1.80 timing tests on MacBook Pro Late 2013 under MacOS X "Mavericks".
Duration (ms) Trials per loop Test Missed frames
1000 60 1 0
60 2 0
60 3 0
60 4 0
60 5 0
500 120 1 0
120 2 0
120 3 0
120 4 0
120 5 0
200 300 1 0
300 2 0
300 3 0
300 4 0
300 5 0
100 600 1 0
600 2 0
600 3 0
600 4 0
600 5 0
50 1200 1 0
1200 2 0
1200 3 0
1200 4 0
1200 5 0
16.667 2400 1 1199
2400 2 1198
2400 3 1199
2400 4 1200
2400 5 1198
2000 1 61
2000 2 1173
2000 3 1164
2000 4 50
2000 5 1173
1600 1 0
1600 2 1
1600 3 0
1600 4 0
1600 5 0
doi:10.1371/journal.pone.0112033.t001
system; and 2) an Apple MacBook Pro 5,5 "Core 2 Duo" 2.26 13″ (SD/FW) Mid 2009 with 2 GB of RAM, a 13.3″ display (1280×800 px), and an NVIDIA GeForce 9400M graphics processor with 256 MB of DDR3 SDRAM shared with main memory. Three operating systems were installed on these machines: 1) MacOS X "Mavericks"; 2) Windows 7 64-bit Ultimate edition; and 3) Ubuntu Linux 13.10 "Saucy Salamander".
All tests were conducted in full screen mode, with the Bluetooth
and the network connection (WiFi/Ethernet) disabled. The
accuracy and precision of stimulus presentation were assessed
using the Black Box Toolkit (BBTK), a set of photodetectors
specifically designed to conduct benchmarking studies like the one
reported here [18]. The BBTK detects changes in luminance from
the photodetector and sends this information to the parallel port of
an auxiliary computer, different from the one whose performance
is being tested. This avoids any interference between the timing
mechanisms used to generate the black to white and white to black
transitions and the real-time application used to gather the data
provided by the BBTK photodetector.
Design and Procedure
For each combination of hardware and operating system, we
developed several full-screen animations with non-gradual,
repeated white-black transitions. The duration of each keyframe
was manipulated with values 1000, 500, 200, 100, 50, and
16.667 ms (60, 30, 12, 6, 3, and 1 display refreshes at 60 Hz,
respectively), although, as explained below, not all durations were
Table 2. PsychoPy 1.80 timing tests on MacBook Pro Late 2013 under Windows 7 64-bit Ultimate.
Duration (ms) Trials per loop Test Missed frames
1000 60 1 130
60 2 123
60 3 123
60 4 125
60 5 124
doi:10.1371/journal.pone.0112033.t002
Table 3. PsychoPy 1.80 timing tests on MacBook Pro Late 2013 under Ubuntu Linux 13.10 "Saucy Salamander".
Duration (ms) Trials per loop Test Missed frames
200 300 1 0
300 2 0
300 3 0
300 4 0
300 5 0
100 600 1 0
600 2 0
600 3 0
600 4 0
600 5 0
50 300 1 0
300 2 1
300 3 0
300 4 0
300 5 0
16.667 300 1 15
300 2 15
300 3 2
300 4 40
300 5 26
150 1 0
150 2 15
150 3 0
150 4 0
150 5 0
doi:10.1371/journal.pone.0112033.t003
included in all tests. For each of these conditions, we collected data
from 5 independent series of 60 seconds each. We limited our
study to repeated white-black transitions because many studies on accuracy and precision in visual stimulus presentation use similar procedures [19–22] and because it is safe to assume that the preparation times of such simple stimuli will not affect their presentation times. Trying to measure the presentation times of
complex or real-time generated stimuli from luminance changes
usually gives rise to spurious errors that can be avoided by
resorting to simple black-and-white transitions.
As mentioned above, to avoid any potential error in our
PsychoPy code, we adapted a script previously published by
Jeremy Gray, one of PsychoPy’s developers, in a comment to our
previous study [17]. We only modified the number of iterations in the trial loop (depending on the duration of each keyframe, we needed more or fewer trials to complete the 60 seconds of measurement) and the number of durations tested in each experiment (we tested only one duration per test, instead of six). Apart from these two changes, the scripts were a verbatim copy of Gray's original.
Results
Detailed data for all the tests reported below are available at the Open Science Framework public repository (https://osf.io/9dkgz/). The main goal of our tests was to find the threshold
where PsychoPy started to show timing errors. For this reason, we
did not test all stimulus durations (1000, 500, 200, 100, 50, and
16.667 ms) for all combinations of hardware and software. We
started our analyses by testing the 1000, 500, 200, 100, and 50 ms
conditions on MacOS X running on the MacBook Pro Late 2013.
The results of these tests are shown in Table 1. As can be seen, the
performance of PsychoPy was perfect for this range of stimulus
durations. However, we did find timing errors when we proceeded
to test the 16.667 ms interval. Upon further exploration of the
benchmarking scripts, we found that the number of trials per loop
was a key determinant of these timing errors. Specifically, we
observed that errors were somewhat decreased when the number
of trials per loop was reduced from 960 to 800. Following this
observation, we further adjusted the number of trials per loop to
640 and observed that timing errors virtually disappeared under
these conditions. These results have important consequences.
Firstly, they confirm that PsychoPy is perfectly able to reach
maximal precision even with the briefest stimulus presentation
(16.667 ms). Secondly, they show that the number of trials per
loop somehow affects the performance of PsychoPy. As a result,
this parameter was also manipulated in the following tests.
We then explored the performance of PsychoPy under Windows
7 using the same computer. Table 2 shows the results of these
tests. As can be seen, they yielded very poor results even for the least demanding condition (1000 ms). This made us suspect
that the timing errors observed in this condition could not be
attributed to a problem in PsychoPy. Instead, these results are
likely to be due to the deficient performance of the driver for the
Table 4. PsychoPy 1.80 timing tests on MacBook Pro Mid 2009 under MacOS X "Mavericks".
Duration (ms) Trials per loop Test Missed frames
100 600 1 0
600 2 0
600 3 0
600 4 0
600 5 0
50 1200 1 573
1200 2 572
1200 3 573
1200 4 573
1200 5 576
600 1 0
600 2 0
600 3 0
600 4 0
600 5 0
16.667 1200 1 600
1200 2 600
1200 3 600
1200 4 600
1200 5 600
600 1 0
600 2 0
600 3 0
600 4 0
600 5 0
doi:10.1371/journal.pone.0112033.t004
integrated Intel Iris 5100 graphics processor running on Windows 7, to the lack of precision and accuracy in Microsoft's latest operating systems [9], or to a combination of both. As shown below, PsychoPy shows good performance under Windows 7 when a different graphics card and different drivers are used.
We also tested PsychoPy on Ubuntu Linux running on the same
computer. The results of these tests are summarized in Table 3.
After checking that no timing errors were observed with 200 ms,
we proceeded to test the 100 and the 50 ms conditions.
Preliminary examination of the 50 ms interval did yield some
timing errors. Therefore, we decided to adjust the number of trials
per loop to 300. With this change, timing errors were no longer
observed for the 50 ms interval. However, we did still observe
timing errors in the 16.667 ms condition and we decided to
further reduce the number of trials per loop to 150, which
eliminated all timing errors in 4 out of the 5 tests conducted.
We took a similar approach to explore the performance of
PsychoPy in the second computer, a MacBook Pro Mid 2009.
Table 4 shows the results of the tests conducted with MacOS X
running on this machine. In this case, we started by testing the
200 ms condition, which yielded no errors. We proceeded to
conduct the tests in the 100 ms condition, where we did observe
numerous timing errors. As in our previous tests, we followed up
these tests changing the number of trials per loop from 1200 to
600. After this modification, timing errors were no longer observed
in the 100 ms condition. Bearing this in mind, we tested the 50 ms
condition with 600 trials per loop and we also found no timing
errors. The same happened when testing the 16.667 ms interval.
However, when we increased the number of trials per loop again in the 16.667 ms condition, timing errors reappeared. This confirms that the timing errors we found in PsychoPy so far should not be attributed to any inability to present very brief stimuli, but to the large number of trials per loop included in each test.
Note that this large number of trials, although common for
benchmarking studies, is rather unusual in the typical experiments
designed by researchers.
The results obtained with Windows 7 running on the MacBook
Pro Mid 2009 are shown in Table 5. In contrast with the results
obtained with the MacBook Pro Late 2013, no timing errors were
observed in the 100 ms condition. Isolated errors took place in the 50 ms condition, all of them duly reported in the PsychoPy log file. Surprisingly, only 1 out of the 5 tests conducted
in the 16.667 ms condition yielded timing errors, even when the
number of trials per loop was set to 1200. The outstanding performance of PsychoPy on Windows 7, even under adverse conditions, is in stark contrast with its poor performance on the same operating system running on the MacBook Pro Late 2013. As
explained above, everything suggests that these timing errors
should not be attributed to a poor performance of PsychoPy. We
found, however, that timing errors could still be observed if the
number of trials per loop was set to 2400.
Finally, Table 6 shows the results of the tests conducted with
Ubuntu Linux running on the MacBook Pro Mid 2009. Before
gathering these data, we found a problem in the execution of our
tests: Preliminary tests showed that the stimulus durations registered by the BBTK photosensors were double the expected values (e.g., white and black frames lasted 200 ms in the 100 ms condition). Surprisingly, this error was not reported in the PsychoPy log file. After discussing these results with the developers of PsychoPy, we were informed that in some Linux configurations the graphics card is told twice to wait for a vertical blank before proceeding, so every frame actually takes two refresh cycles. Because the frame time remains consistent, PsychoPy assumes that the frame rate of the monitor is 30 Hz (rather than 60 Hz). Therefore, it does not report any missed frames (all frames match the expected period by this measure). Fortunately, there
Table 5. PsychoPy 1.80 timing tests on MacBook Pro Mid 2009 under Windows 7 64-bit Ultimate.
Duration (ms) Trials per loop Test Missed frames
100 600 1 0
600 2 0
600 3 0
600 4 0
600 5 0
50 1200 1 0
1200 2 15
1200 3 0
1200 4 0
1200 5 2
16.667 2400 1 600
2400 2 600
2400 3 600
2400 4 600
2400 5 599
1200 1 0
1200 2 0
1200 3 1
1200 4 0
1200 5 0
doi:10.1371/journal.pone.0112033.t005
was a simple solution. PsychoPy includes a window setting to disable the wait for the next vertical blank (win.waitBlanking = False). After implementing this change, we tested the 200 ms condition and found no timing errors. We also found no errors for the 100, 50, and 16.667 ms conditions when 600 trials per loop were requested. However, errors were found when the number of trials per loop was set to 1200.
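As an illustration, the following is a minimal sketch of that workaround, assuming a driver configuration that already synchronizes buffer swaps to the vertical blank; the 200 ms duration and the window parameters are illustrative, not the exact settings used in our tests.

# Sketch of the Linux double-sync workaround: disable PsychoPy's own wait for the
# vertical blank and time the stimulus in milliseconds instead of ticks.
from psychopy import core, visual

win = visual.Window(fullscr=True, color='black', units='norm', waitBlanking=False)
white = visual.Rect(win, width=2, height=2, fillColor='white', lineColor='white')

white.draw()
win.flip()        # stimulus becomes visible at the next retrace, handled by the driver
core.wait(0.200)  # hold the white keyframe for ~200 ms, defined in time units
win.flip()        # return to the black background

win.close()
core.quit()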
Discussion
When our previous study on the accuracy and precision of
PsychoPy and other experimental software was originally pub-
lished [15], the developers of PsychoPy [16] suggested that the
timing errors that we detected could be due either to the fact that
those tests were based on an earlier version of PsychoPy (1.64) or
to the definition of stimulus durations in terms of time units instead
of ticks (display refreshes). Actually, the latter problem was related
to the former, given that the experiment builder of PsychoPy 1.64
did not allow defining durations in terms of ticks. It is very likely
that the timing errors found in the previous study can be attributed
to this feature of the testing procedure. Timing visual events on the basis of time intervals is known to be prone to artifacts, because those intervals rarely synchronize precisely with the hardware screen refresh cycle, leading to uncertainty about the actually achieved display times.
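To make the distinction concrete, the following sketch contrasts the two ways of defining a nominally 50 ms stimulus in PsychoPy; the 50 ms value and the 60 Hz refresh rate are illustrative assumptions, not settings taken from the tests reported here.

# Interval-based versus frame-based timing of a nominally 50 ms stimulus (60 Hz assumed).
from psychopy import core, visual

win = visual.Window(fullscr=True, color='black', units='norm')
stim = visual.Rect(win, width=2, height=2, fillColor='white', lineColor='white')

# Interval-based: the 50 ms wait is not locked to the refresh cycle, so the offset
# happens at whichever retrace follows the wait (roughly 50.0 or 66.7 ms after onset).
stim.draw()
win.flip()
core.wait(0.050)
win.flip()

core.wait(0.5)  # brief pause between the two demonstrations

# Frame-based: the duration is a whole number of refreshes (3 frames = 50 ms at 60 Hz),
# so the offset lands on a known retrace.
for frame in range(3):
    stim.draw()
    win.flip()
win.flip()

win.close()
core.quit()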
In light of the present results, it appears that an additional factor played a decisive role: the number of trials per loop implemented in each test. Although testing large numbers of trials per loop is common practice in software benchmarking, the parameters used in this kind of study are rather unusual in
Table 6. PsychoPy 1.80 timing tests on MacBook Pro Mid 2009 under Ubuntu Linux 13.10 "Saucy Salamander".
Duration (ms) Trials per loop Test Missed frames
200 600 1 0
600 2 0
600 3 0
600 4 0
600 5 0
100 1200 1 94
1200 2 97
1200 3 86
1200 4 99
1200 5 97
600 1 0
600 2 0
600 3 0
600 4 0
600 5 0
50 1200 1 184
1200 2 157
1200 3 166
1200 4 163
1200 5 142
600 1 0
600 2 0
600 3 0
600 4 0
600 5 0
16.667 1200 1 35
1200 2 28
1200 3 27
1200 4 29
1200 5 37
600 1 0
600 2 0
600 3 0
600 4 0
600 5 0
doi:10.1371/journal.pone.0112033.t006
psychological experiments. The divergence between the proce-
dures used in cognitive research and the methods used in
benchmarking has already been highlighted by Plant. As he
mentioned in his comment to our original study, "flashing a bitmap over and over on idealised equipment is not representative of what real researchers do in the field! Their equipment is never ideal, their coding never as good as yours, their experiment is more complicated, they link to different equipment to yours… Or they are using a different version of the software to you" [23]. We
might add that the divergence sometimes runs in the opposite
direction: As the present study illustrates, sometimes the require-
ments of the software used to benchmark timing errors can be
much more demanding than those of standard programs designed
by experimenters. Given our results, a practical recommendation for cognitive researchers is to avoid large numbers of trials per loop whenever possible.
The negative impact of this factor might be due to the large
amount of information that PsychoPy has to log in relatively little
time. Even though we disabled XLSX and CSV outputs, we still
found errors with large numbers of trials per loop. Fortunately, this
is more of a technical than a practical problem, because it only
poses timing problems in highly unusual conditions. However, in
light of the present results, it seems advisable to avoid complex
data output formats, such as XLSX, when timing errors can be an
issue, particularly for experimental programs requiring multiple
loops.
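As a hedged illustration of this recommendation, the snippet below shows one way of keeping logging and data output lightweight using the psychopy.logging and psychopy.data modules; the file names, logging levels, and trial list are our own illustrative choices, not the settings used in the benchmarks reported above.

# Sketch of lightweight logging and plain-text output for time-critical runs.
from psychopy import data, logging

# Keep only warnings and errors on the console while the experiment is running...
logging.console.setLevel(logging.WARNING)
# ...and write the detailed log to a plain-text file instead of richer formats.
logging.LogFile('timing_run.log', level=logging.EXP, filemode='w')

# Prefer plain-text (CSV/TXT) trial output over XLSX.
trials = data.TrialHandler(trialList=[{'duration_frames': 3}] * 100, nReps=1,
                           method='sequential')
for trial in trials:
    trials.addData('missed_frames', 0)  # placeholder; real values come from the timing loop
trials.saveAsText('timing_run.csv', delim=',')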
It is also important to note that the performance of PsychoPy was affected by details of the hardware and software used to run the experiment. Severe timing errors were observed under Windows 7 on one of the computers, possibly due to problems with the graphics card driver. Similarly, the configuration of the graphics card in Ubuntu Linux gave rise to unexpected timing errors that, fortunately, could be fixed using the appropriate window settings in PsychoPy. These two examples illustrate that researchers can
never take for granted that their software will reach the highest
precision and accuracy levels under all circumstances. If a series of
experiments demands compliance with strict timing requirements,
the precision and accuracy of the experimental software should
always be tested first.
Based on the results of our studies, we can offer some guidelines for researchers who are planning to use PsychoPy to conduct experiments with strict timing requirements. First, it is important to use suitable hardware (i.e., a computer provided with a fast CPU, enough RAM, a dedicated graphics processor, and a display with a low refresh rate) with the appropriate configuration (i.e., Bluetooth, Ethernet, Wi-Fi, mobile, and other kinds of connections disabled; desktop visual effects disabled; antivirus, software updates, background programs, and other kinds of asynchronous event sources disabled). Second, any configuration problem of the graphics processor should be detected and fixed (e.g., by updating the display and graphics drivers and using the vendor's test utilities to benchmark them, if available). Third, it is advisable to use the latest version of PsychoPy. It is free, and every update brings interesting new features. Fourth, visual stimulus durations should be defined in ticks (screen refreshes) rather than in milliseconds. Fifth, it is preferable to avoid defining too many trials per loop. For experimental paradigms with large numbers of trials (e.g., paradigms with several hundred trials, such as priming or contextual cueing [1,3]), splitting the whole set of trials into several blocks is an easy way to avoid potential problems. Sixth, it is advisable to analyse and reduce the impact of logging processes during the experiment (e.g., the XLSX log format is more demanding than the TXT format). In addition to these general recommendations, the precision and accuracy of the experimental setup should be tested prior to conducting the experiment. In most cases, PsychoPy's logging information should be enough to detect timing inaccuracies. In our study, all the timing errors except the one caused by the NVIDIA graphics configuration in Linux were correctly reported by PsychoPy. To make sure that such a faulty configuration is not being used unknowingly, researchers can define a human-measurable stimulus duration (e.g., 120 ticks = 2000 ms at 60 Hz) and check that the observed duration is not doubled (i.e., 4000 ms). If it is, there is a simple workaround in PsychoPy: disabling the waitBlanking feature and defining stimulus durations in milliseconds rather than ticks (contrary to the previous recommendation).
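The sketch below illustrates that sanity check; the 120-tick duration, the nominal 60 Hz rate, and the use of getActualFrameRate() are our own illustrative choices rather than part of the procedure reported above.

# Sketch of a sanity check for the doubled-duration problem: 120 ticks should take
# about 2000 ms on a nominally 60 Hz display; roughly 4000 ms indicates double syncing.
from psychopy import core, visual

win = visual.Window(fullscr=True, color='black', units='norm')
white = visual.Rect(win, width=2, height=2, fillColor='white', lineColor='white')

# What frame rate does PsychoPy itself measure?
measured_hz = win.getActualFrameRate(nIdentical=10, nMaxFrames=120)
print('Measured frame rate:', measured_hz)

# Time 120 ticks with a clock as an independent, human-checkable measure.
clock = core.Clock()
for frame in range(120):
    white.draw()
    win.flip()
elapsed_ms = clock.getTime() * 1000
print('120 ticks took %.0f ms (expected ~2000 ms at 60 Hz)' % elapsed_ms)

# If the elapsed time is roughly double the expected value, apply the waitBlanking
# workaround described above and define durations in milliseconds instead of ticks.
win.close()
core.quit()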
To sum up, the present study shows that the most recent
versions of PsychoPy can achieve the highest levels of precision
and accuracy in the presentation of brief visual stimuli. There is no
reason to think that occasional timing errors found in bench-
marking tests with many trials per loop can pose any problem for
standard experiments developed by researchers. Properly used,
PsychoPy is an excellent tool for psychological research even
under the most demanding conditions.
Acknowledgments
The authors would like to thank Gorka Urkiola and Gorka Gorrotxategi
for granting us access to their MacBook Pro for the present study, Tom
Hardwicke for his corrections of a previous draft, and Jonathan Peirce,
Jeremy Gray, and Michael MacAskill for their help and support.
Author Contributions
Conceived and designed the experiments: PG MAV. Performed the
experiments: PG. Analyzed the data: PG. Contributed reagents/materials/
analysis tools: PG MAV. Wrote the paper: PG MAV.
References
1. Dehaene S, Naccache L, Le Clec’H G, Koechlin E, Mueller M, et al. (1998)
Imaging unconscious semantic priming. Nature 395: 597–600.
2. Hassin RR, Ferguson MJ, Shidlovski D, Gross T (2007) Subliminal exposure to
national flags affects political thought and behavior. PNAS 104: 19757–19761.
3. Chun MM, Jiang Y (1998) Contextual cueing: Implicit learning and memory of
visual context guides spatial attention. Cognitive Psychol 36: 28–71.
4. Ratcliff R, McKoon G (1978) Priming in item recognition: Evidence for the propositional structure of sentences. J Verb Learn Verb Behav 17: 403–417.
5. Brand A, Bradley MT (2012) Assessing the effect of technical variance on the
statistical outcomes of web experiments measuring response times. Soc Sci
Comput Rev 30: 350–357.
6. Damian MF (2010) Does variability in human performance outweigh
imprecision in response devices such as computer keyboards? Behav Res
Methods 42: 205–211.
7. Donkin C, Averell L, Brown S, Heathcote A (2009) Getting more from accuracy
and response time data: Methods for fitting the linear ballistic accumulator.
Behav Res Methods 41: 1095–1110.
8. Wagenmakers EJ (2009) Methodological and empirical developments for the
Ratcliff diffusion model of response times and accuracy. Eur J Cogn Psychol 21:
641–671.
9. Plant RR, Quinlan PT (2013) Could millisecond timing errors in commonly used equipment be a cause of replication failure in some neuroscience studies? Cogn Affect Behav Neurosci 13: 598–614.
10. Brainard DH (1997) The Psychophysics Toolbox. Spatial Vision 10: 433–436.
11. Forster KI, Forster JC (2003) DMDX: A Windows display program with millisecond accuracy. Behav Res Methods 35: 116–124.
12. Mathôt S, Schreij D, Theeuwes J (2011) OpenSesame: An open-source, graphical experiment builder for the social sciences. Behav Res Methods 44: 314–324.
13. Pelli DG (1997) The VideoToolbox software for visual psychophysics:
Transforming numbers into movies. Spatial Vision 10: 437–442.
14. Peirce JW (2007) PsychoPy—psychophysics software in Python. J Neurosci
Methods 162: 8–13.
15. Garaizar P, Vadillo MA, López-de-Ipiña D, Matute H (2014) Measuring software timing errors in the presentation of visual stimuli in cognitive neuroscience experiments. PLOS ONE 9: e85108.
16. Peirce JW (2014) PsychoPy can provide precise stimulus timing. Available: http://www.plosone.org/annotation/listThread.action?root=78527. Accessed 2014 Oct 14.
17. Gray J (2014) PsychoPy timing test script. Available: https://gist.github.com/
jeremygray/9062586. Accessed 2014 Oct 14.
18. Plant RR, Turner G (2004) Self-validating presentation and response timing in cognitive paradigms: How and why? Behav Res Methods Instrum Comput 36: 291–303.
19. Schmidt W (2001) Presentation accuracy of Web animation methods. Behav Res
Methods 33: 187–200.
20. Stewart N (2006) Millisecond accuracy video display using OpenGL under
Linux. Behav Res Methods 38: 142–145.
21. Reingold EM (2014) Eye tracking research and technology: Towards objective measurement of data quality. Vis Cogn 22: 635–652.
22. Reimers S, Stewart N (in press) Presentation and response timing accuracy in
Adobe Flash and HTML5/JavaScript Web experiments. Behav Res Methods.
23. Plant RR (2014) Do these results generalise to what researchers really do?
Available: http://www.plosone.org/annotation/listThread.action?root=78527.
Accessed 2014 Oct 14.