Best practices: Two Web-browser-based methods for stimulus presentation in behavioral experiments with high-resolution timing requirements

DOI: 10.3758/s13428-018-1126-4
Pablo Garaizar¹ & Ulf-Dietrich Reips²

© Psychonomic Society, Inc. 2018
Abstract
The Web is a prominent platform for behavioral experiments, for many reasons (relative simplicity, ubiquity, and accessibility,
among others). Over the last few years, many behavioral and social scientists have conducted Internet-based experiments using
standard web technologies, both in native JavaScript and using research-oriented frameworks. At the same time, vendors of
widely used web browsers have been working hard to improve the performance of their software. However, the goals of browser
vendors do not always coincide with behavioral researchers' needs. Whereas vendors want high-performance browsers to
respond almost instantly and to trade off accuracy for speed, researchers have the opposite trade-off goal, wanting their
browser-based experiments to exactly match the experimental design and procedure. In this article, we review and test some
of the best practices suggested by web-browser vendors, based on the features provided by new web standards, in order to
optimize animations for browser-based behavioral experiments with high-resolution timing requirements. Using specialized
hardware, we conducted four studies to determine the accuracy and precision of two different methods. The results using CSS
animations in web browsers (Method 1) with GPU acceleration turned off showed biases that depend on the combination of
browser and operating system. The results of tests on the latest versions of GPU-accelerated web browsers showed no frame loss
in CSS animations. The same happened in many, but not all, of the tests conducted using requestAnimationFrame (Method
2) instead of CSS animations. Unbeknownst to many researchers, vendors of web browsers implement complex technologies that
result in reduced quality of timing. Therefore, behavioral researchers interested in timing-dependent procedures should be
cautious when developing browser-based experiments and should test the accuracy and precision of the whole experimental
setup (web application, web browser, operating system, and hardware).
Keywords Web animations · Experimental software · High-resolution timing · iScience · Browser
Shortly after its inception, the Web was demonstrated to be an
excellent environment to conduct behavioral experiments.
The first Internet-based experiments were conducted in the
mid-1990s, shortly after the World Wide Web had been
invented at CERN in Geneva (Musch & Reips, 2000; Reips,
2012). Conducting studies via the Internet is considered a
second revolution in behavioral and social research, after the
computer revolution in the late 1960s, and subsequently that
method has brought about many advantages over widely used
paper-and-pencil procedures (e.g., automated processes,
heightened precision). The Internet added interactivity via a
worldwide network and brought many benefits to research,
adding a third category to what had traditionally been seen
as a dichotomy between lab and field experiments (Honing
& Reips, 2008; Reips, 2002). Although Internet-based experiments have some inherent limitations, due to a lack of control
and the limits of technology, they also have a number of advantages over lab and field experiments (Birnbaum, 2004;
Reips, 2002; Schmidt, 1997). Some of the main advantages
are that (1) it is possible to easily collect large behavioral data
sets (see, however, Wolfe, 2017, noting that this is actually not
happening as frequently as one would expect); (2) it is also
possible to recruit large heterogeneous samples and people
with rare characteristics (e.g., people suffering from
sexsomnia and their peers; Mangan & Reips, 2007) from locations far away; and (3) after an initial investment, the
method is more cost-effective, in terms of time, space, and labor,
* Pablo Garaizar
garaizar@deusto.es

¹ University of Deusto, Bilbao, Spain
² University of Konstanz, Konstanz, Germany
Behavior Research Methods
https://doi.org/10.3758/s13428-018-1126-4
than either lab or field research. As compared to paper-and-pencil research, most of the advantages of computer-mediated
research apply, for example, the benefit that process variables ("paradata") can be recorded (Stieger & Reips, 2010).
Despite the numerous studies comparing web-based re-
search with laboratory research that have concluded that both
approaches work, there are still doubts about the capabilities
of web browsers for presenting and recording data accurately
(e.g., Schmidt, 2007). Early discussions (Reips, 2000, 2007;
Schmidt, 1997) saw reaction time measurement in Internet-
based experimenting as possible, but clearly pointed out its
limitations. In fact, there is an open debate as to the lack of
temporal precision of experimentation based on computers as
a possible cause to explain the ongoing replication crisis
across the field of psychology (Plant, 2016).
On the other hand, several studies have provided web tech-
nology benchmarks (see van Steenbergen & Bocanegra, 2016,
for a comprehensive list) that help researchers figure out when
the timing of web-based experimentation is acceptable for the
chosen experimental paradigm. Moreover, notable efforts
have been made in recent years to simplify the development
and improve the accuracy of timing in web experiments using
standard web technologies based on research-oriented frameworks including jsPsych (de Leeuw, 2015) or Lab.js
(Henninger, Mertens, Shevchenko, & Hilbig, 2017).
At the same time, vendors of widely used web browsers
(Google Chrome, Mozilla Firefox, Apple Safari, and
Microsoft Edge, among others) have been working hard to
improve the performance of their software. However, there
are some important discrepancies between the goals of brows-
er vendors and behavioral researchers regarding the desired
features of an ideal web browser. Whereas browser vendors
try their best to provide a faster browser than their competitors
and have as their main goal to increase the responsiveness of
the web applications presented to the user, behavioral re-
searchers foremost need precision and accuracy when present-
ing stimuli and recording user input, and not necessarily
speed. Thus, browser vendors and researchers tend to be at
opposite ends of the desired speed–accuracy trade-off.
Fortunately, some of the technological advances that have
recently been developed in response to browser vendors' needs
have turned out to be aligned with behavioral researchers' needs
as well. Modern web browsers are now provided with frame-
oriented animation timers (i.e., requestAnimationFrame), a com-
prehensive and accurate application programming interface for
audio (Web Audio API), and submillisecond-accurate input
event timestamps (DOMHighResTimeStamp). They are also
provided with submillisecond-accurate timing functions (i.e.,
window.performance.now) in several versions, but a new class
of timing attacks in modern CPUs (e.g., Spectre and Meltdown)
have forced web-browser vendors to reduce the precision of
these timing functions, either by rounding (Scholz, 2018) or by
slightly randomizing the value returned (Kyöstilä, 2018). In
the case of Mozilla Firefox, this limitation can be disabled by
modifying the privacy.reduceTimerPrecision configuration
property, which has been enabled by default since version 59.
In the case of Google Chrome, developers decided to reduce the
resolution of performance.now() from 5 to 100 μs and to add
pseudorandom jitter on top.
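The effect of such precision clamping can be sketched in a few lines of JavaScript. The function below is a simplification for illustration only (the name coarsenTimestamp is ours, and real browsers may also add jitter on top of the rounding):

```javascript
// Simplified sketch: clamp a high-resolution timestamp (in ms) to a
// coarser resolution, e.g., 0.1 ms (100 microseconds), as in Chrome.
// Real browsers may additionally add pseudorandom jitter.
function coarsenTimestamp(ms, resolutionMs) {
  return Math.floor(ms / resolutionMs) * resolutionMs;
}

// A 12.34567-ms timestamp is reported as 12.3 ms at 0.1-ms resolution.
var coarse = coarsenTimestamp(12.34567, 0.1);
```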
To explain these new features to application developers,
web-browser vendors have written several best-practice
guidelines emphasizing the underlying concepts related to
web animations in terms of performance (Bamberg, 2018a,
2018b; Lewis, 2018). In the next section, we will review those
best practices from a behavioral researcher's perspective.
Best practices for animations in Web-based
experiments
It is important to understand that a browser-based experiment
can be conducted either offline (not via the Internet) or online
(on the Internet); see, for instance, Honing and Reips (2008)
or Reips (2012). Even in web-technology-based experiments
conducted offline, it is necessary for accurate timing to load
the experiment's assets (images, styles, audio, video, etc.) in a
participant's browser before the experiment starts. Once loaded,
the assets will be ready to be rendered by the browser. In
this section, we will analyze these two tasks from the perspec-
tive of a behavioral researcher.
Best practices for loading assets
For controlled timing, web browsers need to download all the
assets, including any media, referenced in the HTML docu-
ment that describes a web page before running it. In most
cases, preloading delays the time until the user can interact
with the web page, so reducing download time becomes a
priority. Consequently, browser vendors are defining new
standards to eliminate unnecessary asset downloads, optimize
file formats, or cache assets, among others (see HTTP/2 spec-
ification for details; Belshe, Peon, & Thomson, 2015).
However, from a behavioral researcher's perspective, there is
no such need for speedy downloading or blocking of web
assets. In most experiments, researchers have to explain to
participants how to proceed, get their informed consent, and
maybe gather some socio-demographic information. This
preexperimental time can be used to download large assets
in the background. Even in the unlikely case that participants
have read the instructions and filled in all required information
before all the assets are downloaded, asking them to wait until
the experiment is ready to be conducted is not a serious problem.
However, not predownloading all the assets needed to compose
an experiment's stimuli before they are presented to the
participant can cause serious methodological issues.
There are several techniques to preload web assets. In the
past, web developers used CSS tricks like fetching images as
background images of web components placed outside the
boundaries of the web page or set as hidden. Currently, the
rel="preload" property of the link element in the header of the
HTML document should be the preferred way to preload web
assets (Grigorik & Weiss, 2018). This method should not be
confused with <link rel="prefetch">. The "prefetch" directive
asks the browser to fetch a resource that will probably be
needed for the next navigation. Therefore, the resource will
be fetched with extremely low priority. Conversely, the
"preload" directive tells the web browser to fetch the web
asset as soon as possible because it will be needed in the
current navigation.
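In markup, the two directives can be written as follows (the file names are illustrative):

```html
<!-- Fetch now: the asset is needed in the current navigation -->
<link rel="preload" href="img/stimulus-1.png" as="image">
<!-- Fetch with extremely low priority: probably needed next -->
<link rel="prefetch" href="img/stimulus-next.png">
```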
Alternatively, web developers can preload images (or other
web assets) by creating them from scratch in JavaScript. In
Listing 1, we provide an example script of how to create a
set of images and wait until they have been completely
downloaded in a JavaScript web application, relying on the
"onload" event of the images. This method works in most
cases, but there are some issues related to "onload": Events
not properly being fired have been reported in previous versions
of widely used web browsers (e.g., Google Chrome v50),
which would affect cached images. For this reason, in Listing 2,
we provide a second script that actively waits until a set of
images has been completely downloaded, without relying on the
"onload" event to determine whether each image has been
completely downloaded and is ready to be displayed. These
examples can easily be adapted for other kinds of assets
(audio, video) if needed, by querying web elements' properties
(e.g., in the case of video assets, the readyState property).
Best practices for rendering web pages
Once the assets needed to conduct the experiment have been
downloaded, the browser is ready to show the experiment.
Showing, or more technically speaking rendering, a web
application implies a sequence of tasks that web browsers
have to accomplish in the following order: (1) JavaScript/
CSS (cascading style sheets), (2) style, (3) layout, (4) paint,
and (5) composite. Understanding all of them is crucial to
develop accurate and precise animations for web-based be-
havioral experiments.
However, rendering is only one of the steps web browsers
take when executing a web application, a simple form of which
is a web page. Web applications run in an execution environ-
ment that comprises several important components: (1) a
var images = [],
    total = 24,
    loaded = 0;

for (var i = 0; i < total; i++) {
  images.push(new Image());
  // Count each image as it finishes downloading; start the
  // experiment only when all of them are ready.
  images[i].addEventListener('load', function () {
    loaded++;
    if (loaded == total) {
      startExperiment();
    }
  }, false);
  images[i].src = 'img/numbers/' + (i + 1) + '.png';
}
Listing 1 JavaScript code to preload a set of images and use the onload event to check that all of them have been downloaded before the experiment begins
function isImageLoaded (img) {
  // An image is ready when the browser has finished fetching it
  // ('complete') and it has nonzero natural dimensions.
  if (!img.complete) { return false; }
  if (typeof img.naturalWidth != "undefined" && img.naturalWidth == 0) {
    return false;
  }
  return true;
}

function checkLoad () {
  // Poll every 50 ms until every image in the set is loaded.
  for (var i = 0; i < images.length; i++) {
    if (!isImageLoaded(images[i])) {
      setTimeout(checkLoad, 50);
      return false;
    }
  }
  startExperiment();
  return true;
}
Listing 2 JavaScript code to test whether a set of images has been downloaded before the experiment begins by not relying on the onload event
JavaScript execution context shared with all the scripts referenced
in the web application, (2) a browsing context (useful to manage
the browsing history), (3) an event loop (described later),
and (4) an HTML document, among other components. The
event loop orchestrates what JavaScript code will be executed
and when to run it, manages user interaction and networking,
renders the document, and performs other minor tasks (Mozilla,
2018; WHATWG, 2018). There must be at most one event loop
per set of related similar-origin browsing contexts (i.e., different web
applications running on the same web browser do not share
event loops; each one has its own event loop).
The event loop uses different task queues (i.e., ordered lists
of tasks) to manage its duties: (1) events queue: for managing
user-interface events; (2) parser queue: for parsing HTML; (3)
callbacks queue: for managing asynchronous callbacks (e.g.,
via setTimeout or requestIdleTask timers); (4) resources
queue: for fetching web resources (e.g., images) asynchro-
nously; and (5) document manipulation queue: for reacting
when an element is modified in the web document. During
the whole execution of the web application, the event loop
waits until there is a task in its queues to be processed.
Then, it selects the oldest task on one of the event loop's task
queues and runs it. After that, the event loop updates the ren-
dering of the web application.
Browsers begin the rendering process by interpreting
the JavaScript/CSS code that web developers have coded
to make visual changes in the web page. In some cases,
these visual changes are controlled by a JavaScript code
snippet, whereas in others CSS animations are used to
change the properties of web elements dynamically (JS/
CSS phase). This phase involves (in this order): (1)
dispatching pending user-interface events, (2) running
the resize and scroll steps for the web page, (3) running
CSS animations and sending corresponding events (e.g.,
"animationend"), (4) running full-screen rendering steps,
and (5) running the animation frame callbacks for the web
page. Once the browser knows what must be done, it
figures out which CSS rules it needs to apply to which
web element and the compounded styles are applied to
each element (style phase). Then, the browser is able to
calculate how much space each element will take on the
screen to create the web page layout (layout phase). This
enables the browser to paint the actual pixels of every
visual part (text, colors, images, borders, shadows) of
the elements (paint phase). Modern browsers are able to
paint several overlapping layers independently to increase
performance. These overlapping layers have to
be drawn in the correct order to render the web page
properly (composite phase).
Considering this rendering sequence as a pipeline, any
change made in one of the phases implies recalculating
the following phases. Therefore, developing web anima-
tions that only require composite changes prevents the
execution of previous phases. In addition to this general
recommendation, some other details should be taken into
account in each phase.
Taking into account the underlying technologies men-
tioned before, Web experiments should rely on CSS ani-
mations whenever suitable in the JavaScript/CSS phase,
for several reasons. First, they do not need JavaScript to
be executed and therefore do not add a new task to the
queues to be executed by the event loop. This not only
reduces the number of tasks that have to be executed, but
also increases the likelihood that input events (i.e., user
responses in the case of a web experiment) are dispatched
as fast as they occur. Second, if the web browser is able to
use GPU-accelerated rendering, some CSS animations can
be managed asynchronously by the browser's GPU process,
resulting in a performance boost.
However, not all web experiment animations can be de-
fined declaratively using CSS. For the cases in which the
animations needed to present stimuli rely on JavaScript,
avoiding standard timers (i.e., setTimeout, setInterval) in favor
of the requestAnimationFrame timer is a must: Standard
timers are not synchronized with the frame painting process
and can lead to accumulative timing errors in web animations
(e.g., it is impossible to be in sync with a display at 60 Hz,
i.e., 16.667 ms per frame, using standard timers, because setting a
16-ms interval is too short and a 17-ms interval is too long),
whereas requestAnimationFrame was designed to be in per-
fect sync with the frame rate. Moreover, using
requestAnimationFrame in web experiments enables re-
searchers to implement frame counting in order to achieve
single-frame accuracy in most cases (Barnhoorn, Haasnoot,
Bocanegra, & van Steenbergen, 2015). Nevertheless, being
aware of the time needed by the browser to calculate every
frame of the animation is crucial. At this point, we should
consider that JavaScript's call stack is single-threaded, synchronous,
and nonblocking. This means that only one piece
of JavaScript code can be executed at a time in a browsing
context; there is no task switching (tasks are carried out to
completion); and web browsers still accept events even
though they might not be dispatched immediately. In such
an execution environment, the JavaScript code of
requestAnimationFrame-based animations must compete with the rest of the
JavaScript tasks waiting for the single execution thread.
Fortunately, newer versions of common browsers allow web
programmers to trace these times in detail using their web
developer toolkits, reducing the problem substantially.
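The accumulative timing error of standard timers mentioned above can be illustrated with simple arithmetic. The sketch below is an idealization (the function name is ours, and we assume a timer that fires exactly on schedule, which real timers do not):

```javascript
// Sketch: cumulative drift (in ms) of a fixed-interval timer relative
// to a 60-Hz display, assuming the timer fires exactly on schedule.
function timerDriftMs(intervalMs, callbacks, refreshHz) {
  var frameMs = 1000 / refreshHz; // 16.667 ms per frame at 60 Hz
  return callbacks * intervalMs - callbacks * frameMs;
}

// After 60 callbacks (about 1 s), a 16-ms timer has drifted about
// 40 ms ahead of the frame schedule, and a 17-ms timer about
// 20 ms behind it.
var early = timerDriftMs(16, 60, 60); // about -40
var late  = timerDriftMs(17, 60, 60); // about +20
```

requestAnimationFrame avoids this drift by construction, because its callbacks are scheduled by the browser's own frame-painting process rather than by a wall-clock interval.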
The style phase can be optimized by reducing the complexity
of the style sheets (i.e., complexity of selectors, number of
elements implied, or hierarchy of the elements affected by a
style change). Some tools (e.g., detectors of unused CSS rules) can significantly
reduce the complexity of style sheets.
Avoiding layout changes within a loop is the best recom-
mendation regarding the layout phase, because it implies the
calculation of lots of layouts that will be discarded immediately
(also known as "layout thrashing"). Another important
recommendation for this phase is to apply animations to elements
whose position is fixed or absolute, because it is much
easier for the browser to calculate layout changes in those
cases.
Painting is often the most expensive phase of the pipeline.
Therefore, the recommendation here is to avoid or reduce
painting areas as much as possible. This can be done by dif-
ferent means: using layers, transforming opacity of Web ele-
ments, or modifying hidden elements. Finally, the recommen-
dation for the composite phase is to stick to transformations
(position, scale, rotation, skew, matrix) and opacity changes
for the experiment's animations to maximize the likelihood of
being managed asynchronously by the GPU process of the
browser.
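Following these recommendations, a compositor-friendly stimulus animation can be written entirely in CSS. The sketch below (the class and animation names are illustrative) only changes opacity, so the style, layout, and paint phases can usually be skipped:

```css
.stimulus {
  position: fixed;        /* avoids reflows of the web document */
  opacity: 0;             /* hidden before and after the animation */
  will-change: opacity;   /* hint: only opacity will change */
  animation: show 500ms steps(1, end) paused;
}

/* steps(1, end) makes the opacity change immediate, not gradual;
   the stimulus is fully visible for the animation's duration. */
@keyframes show {
  from { opacity: 1; }
  to   { opacity: 1; }
}
```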
To validate these best practices, we prepared a set of exper-
iments in which we (1) preloaded all assets before an experi-
ment begins, (2) used CSS animations to control the experiment's
animations, (3) tried to minimize layout changes, (4)
tried to reduce painting areas, and (5) tried to stick to opacity
changes in animations. In the study presented in the next sec-
tion, we tested the accuracy and precision of the animations
used in these experiments.
Study 1
The goal of the present study was to test the accuracy and
precision of the animations used in a set of experiments that
would try to follow the web-browser vendors' best practices
explained above.
Method
Apparatus and materials Considering the potential inaccura-
cies that can take place when the same device is used to pres-
ent visual content and assess its timing, we decided to use an
external measurement system: the Black Box Toolkit
(BBTK), which is able to register the precise moment at which
the content is shown, with submillisecond accuracy (Plant,
Hammond, & Turner, 2004).
We installed Google Chrome 58 and Mozilla Firefox
54 web browsers on both Microsoft Windows 10 and
Ubuntu Linux 16.04.3 systems, on a laptop with an Intel
i5-6200U chip with 20 GB of RAM and a 120-GB SSD
disk, not connected to the Internet and isolated from
external sources of asynchronous events. In this setting, we
ran a web experiment application that showed an anima-
tion of visual items typical for many of the web experi-
ments that will be described below. This web application
uses CSS animations to control the presentation of the
stimuli. Each stimulus is placed in a different layer, and
the CSS animation controls which one is shown through
opacity changes of the layers. The stimuli consisted of 24
different images (i.e., the natural numbers from 1 to 24) in
which odd numbers were placed on a white background
and even numbers on a black background, to facilitate the
detection of changes by the photo-sensors of the BBTK.
The stimuli were preloaded by the experimental software
before the animation started. This set of stimuli and the
web application used in this study are publicly available
via the Open Science Framework: https://osf.io/h7erv/.
Listing 3 shows the setSlideshow function of this web
experiment. In this function, a set of 24 images are
appended to the parent element in a for loop. Before this,
each image is properly configured: (1) opacity is set to
zero (invisible), and the "willChange" property is set to
"opacity," to inform the web browser that this property
will change during the animation; (2) a fixed position is
set, in order to prevent reflows of the web document; (3) a
CSS animation is configured: an "interval" argument defines
the duration of the animation, the "steps(1, end)" timing
function defines a nonprogressive (= immediate) change
of opacity, and the status is set to paused; (4) CSS animation
events ("animationstart," "animationend") are defined to
log the onset and offset times of the stimuli. Then, all the
animations of the images are changed to the "running"
state.
Procedure For each combination of web browser (Google
Chrome, Mozilla Firefox) and operating system (MS
Windows, GNU/Linux), we tested the same web experiment,
which presented a slideshow of the first 24 natural numbers.
Each number was presented during a short interval before the
next one was presented. We tested this series of stimuli with
different presentation intervals for each stimulus: 500, 200,
100, and 50 ms, which correspond to the duration of 30, 12,
six, and three frames in a 60-Hz display. Considering that all
the tests were conducted using this refresh rate, the subframe
deviations of the intervals measured by the photo-sensors of
the BBTK (e.g., 51.344 ms instead of 50 ms) were caused by
difficulties of the LCD displays with handling abrupt changes
of luminosity, and not by the series of stimuli tested.
Therefore, we converted all durations of the stimulus presen-
tations from milliseconds to frames because the main purpose
of this study was to assess the accuracy of the web presenta-
tion software, not the hardware. To reduce the effect of un-
foreseen sources of delays, we tested each configuration three
times.
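The millisecond-to-frame conversion used here is straightforward (a sketch with a function name of our own; a 60-Hz refresh rate is assumed, as in these tests):

```javascript
// Convert a nominal stimulus duration in ms to a whole number of
// frames at the display's refresh rate (60 Hz in these tests).
function msToFrames(durationMs, refreshHz) {
  return Math.round(durationMs * refreshHz / 1000);
}

// 500, 200, 100, and 50 ms correspond to 30, 12, 6, and 3 frames.
```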
Results
The results of the tests conducted on Google Chrome are
shown in Table 1. Each cell in the table represents the number
of "short" or "long" frames during each test (a presentation of
the 24 stimuli). Surprisingly, there is a noticeable difference
between the GNU/Linux and MS Windows setups. The web
application tested works flawlessly on Google Chrome under
GNU/Linux at all intervals, whereas the same web application
presents an unacceptable number of lost frames under MS
Windows. We call those frames "short" that were presented
before they were expected, and those "long" that were presented
after they were expected (note that in many cases, the
sum of short and long frames is near zero, because CSS
animations tend to interpolate all missing frames in an animation,
making the animation last as long as expected). As happens
with experimental software such as E-Prime, with its event
mode timing and cumulative mode timing (Schneider,
Eschman, & Zuccolotto, 2012), researchers can decide how
their experiment will behave when an unexpected delay oc-
curs. In the first case (event mode timing), the delay will
cause a stimulus to be displayed longer than expected.
This will not affect the duration of the next stimulus,
but will affect the total duration of the animation. In the
second case (cumulative mode timing), the delay will
function setSlideshow (element, start, interval) {
  element.style.backgroundImage = 'none';
  for (var i = 0; i < total; i++) {
    // Hidden until its animation runs; 'will-change' hints the
    // browser that only opacity will be animated.
    images[i].style.opacity = 0;
    images[i].style.willChange = 'opacity';
    // Fixed positioning prevents reflows of the web document.
    images[i].style.position = 'fixed';
    images[i].style.top = '100px';
    images[i].style.left = '100px';
    // One paused CSS animation per image, delayed by i * interval;
    // steps(1, end) makes the opacity change immediate, not gradual.
    images[i].style['animation'] = 'show ' + interval + 'ms steps(1,end) ' +
      (start + i * interval) + 'ms 1 normal none paused';
    images[i].addEventListener('animationstart', function (event) {
      console.log('Start at: ' + event.elapsedTime + ' ' + event.timeStamp);
    }, false);
    images[i].addEventListener('animationend', function (event) {
      console.log('End at: ' + event.elapsedTime + ' ' + event.timeStamp);
      element.style.backgroundImage = 'none';
    }, false);
    element.appendChild(images[i]);
  }
  // Start all the animations at once.
  for (var i = 0; i < total; i++) {
    images[i].style['animation-play-state'] = 'running';
  }
}
Listing 3 JavaScript code to configure a slideshow using opacity changes through CSS animations on a set of images
Table 1 Study 1: Short/long frames using CSS animations and opacity changes between layers on Google Chrome 58 and Mozilla Firefox 54

                                          30 Frames    12 Frames    6 Frames     3 Frames
                          Test   N        Short Long   Short Long   Short Long   Short Long
Google Chrome 58  Windows   1    24       10    10     13    12     13    12     12    11
                            2    24       5     5      12    11     6     6      10    10
                            3    24       11    10     12    11     12    11     11    10
                  Linux     1    24       0     0      0     0      0     0      0     0
                            2    24       0     0      0     0      0     0      0     0
                            3    24       0     0      0     0      0     0      0     0
Mozilla Firefox 54 Windows  1    24       0     1      0     0      0     0      0     0
                            2    24       0     0      0     0      0     0      0     0
                            3    24       0     1      0     0      0     0      0     0
                  Linux     1    24       6     6      5     5      5     5      5     5
                            2    24       5     5      7     5      3     1      4     4
                            3    24       4     4      6     7      4     2      5     4
cause one stimulus to be displayed longer, whereas the
next will be displayed a shorter time than expected, to
make the whole animation meet its duration requirements.
In the tests presented in this study, CSS animations work
like E-Prime's cumulative mode timing. However, it is
possible to use CSS animations to develop experiments
that work in event mode timing, by replacing the 24-keyframe
animation used here with 24 animations of one keyframe
each, to be launched successively once the
"animationend" event of the previous animation is
triggered.
The results of the tests conducted on Mozilla Firefox
are also shown in Table 1. There is also a noticeable
difference between GNU/Linux and MS Windows,
but, surprisingly, in the opposite direction from the difference
we found running these tests with Google
Chrome. Therefore, the tested technique (i.e., layer
opacity changes through CSS animations) cannot be
used as a reliable way to generate web experiments with
accurate and precise stimulus presentations in any
multiplatform environment. Consequently, we developed
a new web application for test purposes, based on a
slightly different approach.
Study 2
The goal of Study 2 was to find a good combination of
best practices in the development of web animations
that would be suitable to present stimuli in an accurate
and precise way on both Google Chrome 58 and
Mozilla Firefox 54 under the MS Windows and GNU/
Linux operating systems.
In this new web application we also used CSS animations
to control the sequence of stimuli, but instead of creating the
slideshow by placing each stimulus in a separate layer and
using opacity changes to show each of them, we placed all
stimuli in one large single image and used "background
position" changes to show each of them. This big image
containing all of the stimuli, and the corresponding offsets to
show each of them, can easily be generated using tools such
as Glue (https://github.com/jorgebastida/glue) or through
HTML/JavaScript features such as the canvas API. Needless to
say, the image with all stimuli has to be preloaded before the
experiment begins.
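For a sprite of equally sized stimuli laid out in a single horizontal strip (an assumed layout for illustration; sprites generated by Glue may be arranged differently), the background-position offset for each stimulus is simple to compute:

```javascript
// Sketch: CSS background-position value for the i-th stimulus in a
// horizontal strip of equally wide images (assumed layout; the
// function name is ours).
function spriteOffset(index, stimulusWidthPx) {
  return (-index * stimulusWidthPx) + 'px 0';
}

// The fourth stimulus (index 3) in a strip of 200-px-wide images:
// spriteOffset(3, 200) yields '-600px 0'.
```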
Method
Apparatus, materials, and procedure As in Study 1, we ran
the same web application with different presentation intervals
for each stimulus (30, 12, six, and three frames),
three times for each stimulus × browser × OS combination,
on Google Chrome 58 and Mozilla Firefox 54 under
both GNU/Linux and MS Windows. The BBTK's
photo-sensor was attached to the display of the laptop
used in Study 1. The procedure was identical to that in
Study 1.
Listing 4 shows how this web application defines the
slideshow of stimuli. First, the corresponding background po-
sition for each stimulus in the big picture is defined. Then the
keyframes of the animation are added to a string that will
contain the whole definition of the CSS animation. After that,
the CSS animation keyframes are included in the web docu-
ment's "slideshow" style sheet. Finally, the animation of the
parent's element (i.e., the div box that will show all stimuli) is
configured to use the keyframes previously defined. For log-
ging purposes, the "animationstart" and "animationend"
event listeners log the starting and ending time stamps of the
slideshow.
Results
Table 2 summarizes the results obtained for Google Chrome
58 using our test web application. As we can see, despite the
fact that some frames were presented too early in the three-
frame interval under MS Windows, this new approach
outperformed the previous one and was able to present stimuli
in an accurate and precise way in most cases. The same hap-
pened when running our tests in Mozilla Firefox 54. The web
application tested also showed some short frames under MS
Windows in the three-frame interval, and under GNU/Linux
in the 30-frame interval, but it was able to present stimuli
accurately and precisely in most cases.
Therefore, contrary to the best practices suggested by the
web-browser vendors for the development of web animations,
changing background position in an image with all stimuli
(which implies new paint and composite phases)
outperformed changing the opacity of layers (which implies
just redoing the composite phase) in this setup. Slideshows
based on background position changes work properly in both
Google Chrome 58 and Mozilla Firefox 54 under GNU/Linux
and MS Windows. However, to understand the unexpected
results from Study 1, we decided to conduct another study
as a replication with newer browser versions and forced
GPU acceleration.
Study 3
The goal of Study 3 was to find out whether the browser
versions used in Study 1 could have been the cause of the
unexpected results found. With this goal in mind, we repeated
all the tests using the same technique (layer opacity changes
through CSS animations) 10 months later, using the latest
versions of Google Chrome (v.66) and Mozilla Firefox
(v.59) available, under GNU/Linux and MS Windows.
Behav Res
Method
Apparatus, materials, and procedure We ran the same set of
stimuli with different presentation intervals for each stimu-
lus (30, 12, six, and three frames), three times for each
stimulus–browser–OS combination, on Google Chrome 66
and Mozilla Firefox 59, under both GNU/Linux and MS
Windows. The BBTK's photo-sensor was attached to the
display of the laptop used in Study 1. The procedure was
identical to that of Study 1.
function setSlideshow (element, start, interval) {
  var rules = '',
      percs = '',
      images = [],
      order = ['blank', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10',
               '11', '12', '13', '14', '15', '16', '17', '18', '19', '20',
               '21', '22', '23', '24', 'blank'],
      animationName = 'slideshow' + (new Date().getTime()),
      stylesheet = document.getElementById('slideshow');
  // Background-position offset of each stimulus inside the sprite image
  images['24'] = '{background-position:0px 0px;}';
  images['23'] = '{background-position:0px -1440px;}';
  images['22'] = '{background-position:-640px -1440px;}';
  images['21'] = '{background-position:-1280px -1440px;}';
  images['20'] = '{background-position:-1920px 0px;}';
  images['19'] = '{background-position:-1920px -960px;}';
  images['18'] = '{background-position:-1920px -1440px;}';
  images['17'] = '{background-position:0px -1920px;}';
  images['16'] = '{background-position:-640px -1920px;}';
  images['15'] = '{background-position:-1280px -1920px;}';
  images['14'] = '{background-position:-1920px -1920px;}';
  images['13'] = '{background-position:-2560px 0px;}';
  images['12'] = '{background-position:-2560px -480px;}';
  images['11'] = '{background-position:-2560px -960px;}';
  images['10'] = '{background-position:-2560px -1440px;}';
  images['9'] = '{background-position:-640px 0px;}';
  images['8'] = '{background-position:0px -480px;}';
  images['7'] = '{background-position:-640px -480px;}';
  images['6'] = '{background-position:-1280px 0px;}';
  images['5'] = '{background-position:-1280px -480px;}';
  images['4'] = '{background-position:0px -960px;}';
  images['3'] = '{background-position:-640px -960px;}';
  images['2'] = '{background-position:-1920px -480px;}';
  images['1'] = '{background-position:-2560px -1920px;}';
  images['blank'] = '{background-position:-1280px -960px;}';
  // One keyframe per stimulus, at evenly spaced percentages
  for (var i = 0, len = order.length; i < len; i++) {
    percs += (i * 100 / len) + '% ' + images[order[i]] + '\n';
  }
  rules += '@keyframes ' + animationName + ' {\n' + percs + '}\n';
  stylesheet.innerHTML = rules;
  // Paused animation: the duration covers the whole sequence, and steps(1)
  // holds each keyframe until the next one
  element.style['animation'] = animationName + ' ' + (order.length *
    interval) + 'ms steps(1) ' + start + 'ms 1 normal none paused';
  element.addEventListener('animationstart', function (event) {
    console.log('Start at: ' + event.elapsedTime + ' ' + event.timeStamp);
  }, false);
  element.addEventListener('animationend', function (event) {
    console.log('End at: ' + event.elapsedTime + ' ' + event.timeStamp);
    element.style.backgroundImage = 'none';
  }, false);
  element.style['animation-play-state'] = 'running';
}
Listing 4 JavaScript code to configure a slideshow using background position changes through CSS animations on a set of images
In addition to updating the versions of the web browsers,
we also configured them to force the use of GPU accelera-
tion. In the case of Google Chrome, we accessed the
chrome://flags URL in the address bar and enabled the
"Override software rendering list" option. Then we
relaunched Google Chrome and verified that GPU accelera-
tion was enabled by accessing the chrome://gpu URL. In the
case of Mozilla Firefox, we accessed the about:config URL
and changed the "layers.acceleration.force-enabled" property
from "false" to "true."
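For unattended setups, the same Firefox preference can be pinned without visiting about:config by adding one line to the profile's user.js file (a configuration sketch; the preference name is the one used in the versions tested here):

```
user_pref("layers.acceleration.force-enabled", true);
```

Firefox reads user.js at startup and copies its entries into the active profile's preferences.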
Results
All tests conducted (24-stimulus animations with three-,
six-, 12-, and 30-frame durations, repeated three times)
resulted in no frame loss on Google Chrome 66 and
Mozilla Firefox 59 under MS Windows and GNU/
Linux. This was a significant improvement over the
results obtained in Study 1.
On the basis of the results of Study 3, we could assume
that the poor results of Study 1 were due to the fact that the
configuration used did not ensure the use of GPU acceleration
in animations based on the change in opacity of the
layers. However, these results cannot distinguish whether
the web browser version update or the GPU acceleration
configuration caused the better performance of the tests. To
disentangle these possible causes, we repeated the tests that
had uncovered the timing problems in Study 1 (Mozilla
Firefox 54 under GNU/Linux and Google Chrome 58 under
MS Windows), but forced the use of GPU acceleration in
those configurations.
GPU-accelerated Mozilla Firefox 54 under GNU/
Linux performed accurately in the new tests (no frame
loss). However, GPU-accelerated Google Chrome 58 un-
der MS Windows still missed an unacceptable number
of frames in all tests (see Table 3 for details).
Specifically, every stimulus presented on a white back-
ground lasted one frame longer than expected (i.e., one
long frame), and every stimulus presented on a black
background lasted one frame less than expected (i.e.,
one short frame) in the three-, six-, and 12-frame dura-
tion tests. At first sight, this inaccurate behavior might
look like a BBTK photo-sensor calibration problem.
However, every time we found a significant number of
short or long frames in our tests, we repeated a previously
conducted test that had yielded no short or long
frames (e.g., a six-frame duration stimulus using CSS
animations and background-position changes on Google
Chrome 58 under MS Windows) to be sure that our
experimental setup was still properly calibrated.
In the case of the 30-frame duration tests on GPU-
accelerated Google Chrome 58 under MS Windows, only
one test presented this behavior, whereas the other
two lost no frames while presenting the last 16 stimuli.
Therefore, our recommendation for studies in which de-
viations of one frame are not acceptable is not only to
restrict data collection to browsers with GPU acceleration
enabled and to have participants update their
browsers whenever possible to a tested version that
loses no frames, but also to assess the accuracy of the
web technique used to present stimuli on the
exact setup that will be used by participants. However,
accuracies within a one-frame deviation are likely ac-
ceptable in many experiments. Therefore, researchers
should weigh the cost of following these recommenda-
tions in those cases.
Table 2  Study 2: Short/long frames using CSS animations and background-position changes on Google Chrome 58 and Mozilla Firefox 54

                               Test   N    30 Frames      12 Frames      6 Frames       3 Frames
                                           Short  Long    Short  Long    Short  Long    Short  Long
Google Chrome 58    Windows    1      24   0      0       0      0       0      0       3      0
                               2      24   0      0       0      0       0      0       2      0
                               3      24   0      1       0      0       0      0       2      0
                    Linux      1      24   0      0       0      0       0      0       0      0
                               2      24   0      1       0      0       0      0       0      0
                               3      24   0      0       0      0       0      0       0      0
Mozilla Firefox 54  Windows    1      24   0      0       0      0       0      0       1      0
                               2      24   0      0       0      0       0      0       1      0
                               3      24   0      0       0      0       0      0       1      0
                    Linux      1      24   0      0       0      0       0      0       0      0
                               2      24   1      2       0      0       0      0       0      0
                               3      24   1      2       0      0       0      0       0      0
Study 4
The goal of Study 4 was to assess the accuracy of both tech-
niques developed in Studies 1 and 2 (layer opacity changes
and background-position changes), but using
requestAnimationFrame instead of CSS animations.
Method
Apparatus, materials, and procedure In this study we tested
the two techniques presented in Studies 1 (layer opacity
changes) and 2 (background-position changes) using
requestAnimationFrame to animate the slideshow. Listing
5 shows the animation function, which is scheduled to be
executed at every v-sync (i.e., repaint of the whole screen,
60 times every second at 60 Hz). This function gets a
time stamp from the web browser, to be aware of the
precise moment when requestAnimationFrame started to
execute callbacks (i.e., all the functions requested to be
executed by requestAnimationFrame). By subtracting from
this time stamp the moment the web animation had shown
the previous stimulus, it was possible to estimate the num-
ber of frames the stimulus had been presented and decide
when to present the next one. Note that 5 ms are added to
this numeric expression in order to prevent rounding errors
when calculating the moment when the next stimulus
should be rendered. This "rule of thumb" is a common
recommendation in experiment software user manuals
(e.g., E-Prime; see Schneider et al., 2012), and it allowed
our tests to work properly even with timing sources
rounded to 2 ms, such as in Mozilla Firefox's latest ver-
sions. We have made the web applications used in this
study publicly available at the Open Science Framework:
https://osf.io/h7erv/.
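The 5-ms tolerance described above can be isolated in a small predicate (the function name is ours): it deems the next stimulus due slightly before the nominal interval elapses, so a time stamp rounded down by a few milliseconds does not postpone the swap by a whole frame.

```javascript
// Decide whether the next stimulus should be shown on this v-sync.
// nowMs: timestamp passed to the requestAnimationFrame callback;
// shownAtMs: timestamp at which the current stimulus appeared;
// intervalMs: intended presentation duration of each stimulus.
function nextStimulusDue(nowMs, shownAtMs, intervalMs) {
  // The +5 ms margin absorbs rounding of the browser's timing source
  // (e.g., performance.now() rounded to 2 ms in some Firefox versions).
  return (nowMs - shownAtMs) + 5 >= intervalMs;
}
```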
Results
The results of all the tests conducted are shown in Table 4. As
can be seen, there was no frame loss in the tests conducted on
Google Chrome 66 and Mozilla Firefox 59 under MS
Windows. The same happened in the tests conducted on
Mozilla Firefox 59 under GNU/Linux. In the case of Google
Chrome 66 under GNU/Linux, all tests that used background-
image position changes worked flawlessly, but we found
frame loss in tests using layer opacity changes to show the
stimuli. In most cases these tests missed only one frame, but
this combination of web technologies was especially unreli-
able during the 30-frame interval tests.
Conclusions and outlook
Studying the accuracy and precision of browser animations is
of fundamental methodological importance in web-based
Table 3  Study 3: Short/long frames using CSS animations and opacity changes between layers on Google Chrome 58 and Mozilla Firefox 54 with GPU acceleration

                               Test   N    30 Frames      12 Frames      6 Frames       3 Frames
                                           Short  Long    Short  Long    Short  Long    Short  Long
Google Chrome 58    Windows    1      24   4      4       12     12      12     12      12     12
                               2      24   4      4       12     12      12     12      12     12
                               3      24   12     12      12     12      12     12      12     12
Mozilla Firefox 54  Linux      1      24   0      0       0      0       0      0       0      0
                               2      24   0      0       0      0       0      0       0      0
                               3      24   0      0       0      0       0      0       0      0
function animate (timestamp) {
  // Keep scheduling until all stimuli have been shown
  if (i < total) window.requestAnimationFrame(animate);
  var progress = timestamp - start;
  // The 5 ms tolerance prevents rounding errors in the timing source
  if (progress + 5 >= interval) {
    images[i].style.opacity = 0;  // hide the current stimulus
    i++;
    images[i].style.opacity = 1;  // show the next stimulus
    start = timestamp;
  }
}
Listing 5 JavaScript code to animate a slideshow using layer opacity changes through requestAnimationFrame on a set of images
research. All static visual stimuli used by researchers in their
experiments can easily be converted to images (sometimes in
scenes made of several distinct image files) using the canvas
element before the experiment begins, in order to preload or
pregenerate the assets needed in the web experiment.
Crucially, stimulus presentation can then be controlled by
CSS animations, freeing the JavaScript event queue to
dispatch user-generated input events promptly and thus obtain accurate
time stamps.
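As a sketch of that pregeneration step (the function name is ours; the logic mirrors the keyframe loop of Listing 4), a @keyframes rule can be assembled once from a list of precomputed background-position offsets and then handed to the browser's compositor, leaving the event queue free:

```javascript
// Build a CSS @keyframes rule that steps through a list of
// background-position offsets at evenly spaced percentages.
function buildKeyframes(name, offsets) {
  var body = '';
  for (var i = 0; i < offsets.length; i++) {
    body += (i * 100 / offsets.length) + '% {background-position:' +
      offsets[i] + ';}\n';
  }
  return '@keyframes ' + name + ' {\n' + body + '}\n';
}
```

The returned string can be injected into a style sheet (as Listing 4 does via stylesheet.innerHTML) and referenced from the element's animation shorthand.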
The results of Studies 1 and 2 allow us to realize that
even when a combination of web techniques has proved
to be accurate enough for our experimental paradigm in
the past, it should be tested thoroughly again (using the BBTK or
similar procedures) for the combinations of
browsers and operating systems that may be used by par-
ticipants. This can be done ex post for the OS–browser
combinations identified from server log files; see, for in-
stance, Reips and Stieger (2004). Otherwise, researchers
might obtain results that are biased by the choice of tech-
nology, as in the interaction between browsers and oper-
ating systems that we found in Study 1.
Best-practice recommendations by web-browser vendors
encourage researchers to use techniques such as manipulating
layers' opacity to keep all the changes in the composite phase,
but Study 2 shows that background-position changes worked
better in most cases, even if this involved more phases in the
rendering pipeline.
The results of Study 3 showed that enabling GPU acceler-
ation in web browsers can result in a significant improvement
in the accuracy of the presentation of visual stimuli in web-
based experiments. Thus, we recommend checking the status
of this feature before running web-based behavioral experi-
ments with high-resolution timing requirements.
Study 4 showed some limitations of the layer opacity
changes technique using requestAnimationFrame in Google
Chrome under GNU/Linux, but it worked flawlessly in the
rest of the tests under both GNU/Linux and MS Windows.
In light of the results from the studies we have presented
here, we believe that behavioral researchers should be cautious
in following browser vendor recommendations when
developing web-based experiments, and rather should adopt
the best practices derived from our empirical tests of the
Table 4  Study 4: Short/long frames using requestAnimationFrame to make layer opacity and background-position changes on Google Chrome 66 and Mozilla Firefox 59

                                                    Test   N    30 Frames      12 Frames      6 Frames       3 Frames
                                                                Short  Long    Short  Long    Short  Long    Short  Long
Background position  Google Chrome 66   Windows     1      24   0      0       0      0       0      0       0      0
                                                    2      24   0      0       0      0       0      0       0      0
                                                    3      24   0      0       0      0       0      0       0      0
                                        Linux       1      24   0      0       0      0       0      0       0      0
                                                    2      24   0      0       0      0       0      0       0      0
                                                    3      24   0      0       0      0       0      0       0      0
                     Mozilla Firefox 59 Windows     1      24   0      0       0      0       0      0       0      0
                                                    2      24   0      0       0      0       0      0       0      0
                                                    3      24   0      0       0      0       0      0       0      0
                                        Linux       1      24   0      0       0      0       0      0       0      0
                                                    2      24   0      0       0      0       0      0       0      0
                                                    3      24   0      0       0      0       0      0       0      0
Layer opacity        Google Chrome 66   Windows     1      24   0      0       0      0       0      0       0      0
                                                    2      24   0      0       0      0       0      0       0      0
                                                    3      24   0      0       0      0       0      0       0      0
                                        Linux       1      24   3      2       1      0       1      0       1      0
                                                    2      24   2      2       1      0       1      0       1      0
                                                    3      24   4      3       1      1       1      0       1      0
                     Mozilla Firefox 59 Windows     1      24   0      0       0      0       0      0       0      0
                                                    2      24   0      0       0      0       0      0       0      0
                                                    3      24   0      0       0      0       0      0       0      0
                                        Linux       1      24   0      0       0      0       0      0       0      0
                                                    2      24   0      0       0      0       0      0       0      0
                                                    3      24   0      0       0      0       0      0       0      0
accuracy and precision of the whole experimental setup (web
application, web browser, operating system, and hardware).
These best practices should also immediately be included in
curricula in psychology and other behavioral and social sci-
ences, as students are often conducting web-based experi-
ments and will be future researchers (Krantz & Reips, 2017).
Because the proposed web techniques had not been assessed
in previous studies on the accuracy of web applications under
high-resolution timing requirements (de Leeuw & Motz, 2016;
Garaizar, Vadillo, & López-de-Ipiña, 2014; Reimers & Stewart,
2015), the studies and detailed guidelines presented in this arti-
cle can help behavioral researchers who take them into account
when developing their web-based experiments.
In the old days of Internet-based experimenting, technology
was simpler. The effects of new technologies were easier to spot
for researchers who began using the Internet. In fact, one of us
(Reips) has long advocated a "low-tech principle" in creating
Internet-based research studies, because, early on, technology
was shown to interfere with participants' behavior in Internet-
based experiments. For example, Schwarz and Reips (2001)
created the very same web experiment both with server-side
(i.e., CGI) and client-side (i.e., JavaScript) technologies and
observed significantly larger and increasing dropout rates in
the latter version. Buchanan and Reips (2001) further
established that technology preferences depend on a partici-
pant's personality and may thus indirectly bias sample compo-
sition, and consequently behavior, in Internet-based research
studies (even though this seems to be less the case for different
operating systems on smartphones; see Götz, Stieger, & Reips,
2017). Modern web browsers have evolved to handle a much
wider range of technologies that, on the one hand, are capable of
achieving much more accuracy and precision in the control of
loading and rendering content than were earlier browsers, but on
the other hand, are increasingly likely to fall victim to insuffi-
cient optimization of complexity. Unbeknownst to many re-
searchers, vendors of web browsers implement a multitude of
technologies that are geared toward the optimization of goals
(e.g., speed) that are not in line with those of science (e.g.,
quality, timing). In the present article we have empirically shown
that this conflict has an effect on display and timing in Internet-
based studies, and we have provided recommendations and scripts that
researchers can and should use to optimize their studies.
Alternatively (and this may be the only general rule of thumb
we are able to offer as an outcome of the empirical investigation
presented here), they might follow the "low-tech principle" as
much as possible, to minimize interference.
Author note Support for this research was provided by the
Departamento de Educación, Universidades e Investigación
of the Basque Government (Grant No. IT1078-16) and by
the Committee on Research at the University of Konstanz.
The authors declare that there was no conflict of interest in
the publication of this study.
References

Bamberg, W. (2018a). Intensive JavaScript. MDN web docs. Retrieved from https://developer.mozilla.org/en-US/docs/Tools/Performance/Scenarios/Intensive_JavaScript

Bamberg, W. (2018b). Animating CSS properties. MDN web docs. Retrieved from https://developer.mozilla.org/en-US/docs/Tools/Performance/Scenarios/Animating_CSS_properties

Barnhoorn, J. S., Haasnoot, E., Bocanegra, B. R., & van Steenbergen, H. (2015). QRTEngine: An easy solution for running online reaction time experiments using Qualtrics. Behavior Research Methods, 47, 918–929. https://doi.org/10.3758/s13428-014-0530-7

Belshe, M., Peon, R., & Thomson, M. (2015). Hypertext Transfer Protocol Version 2 (HTTP/2). Retrieved from https://http2.github.io/http2-spec/

Birnbaum, M. H. (2004). Human research and data collection via the Internet. Annual Review of Psychology, 55, 803–832. https://doi.org/10.1146/annurev.psych.55.090902.141601

Buchanan, T., & Reips, U.-D. (2001). Platform-dependent biases in online research: Do Mac users really think different? In K. J. Jonas, P. Breuer, B. Schauenburg, & M. Boos (Eds.), Perspectives on Internet research: Concepts and methods. Available at http://www.uni-konstanz.de/iscience/reips/pubs/papers/Buchanan_Reips2001.pdf. Accessed 26 Sept 2018.

Garaizar, P., Vadillo, M. A., & López-de-Ipiña, D. (2014). Presentation accuracy of the web revisited: Animation methods in the HTML5 era. PLoS ONE, 9, e109812. https://doi.org/10.1371/journal.pone.0109812

Götz, F. M., Stieger, S., & Reips, U.-D. (2017). Users of the main smartphone operating systems (iOS, Android) differ only little in personality. PLoS ONE, 12, e0176921. https://doi.org/10.1371/journal.pone.0176921

Grigorik, I., & Weiss, Y. (2018). W3C Preload API. Retrieved from https://w3c.github.io/preload/#x2.link-type-preload

Henninger, F., Mertens, U. K., Shevchenko, Y., & Hilbig, B. E. (2017). lab.js: Browser-based behavioral research. https://doi.org/10.5281/zenodo.597045

Honing, H., & Reips, U.-D. (2008). Web-based versus lab-based studies: A response to Kendall (2008). Empirical Musicology Review, 3, 73–77. https://doi.org/10.5167/uzh-4560

Krantz, J., & Reips, U.-D. (2017). The state of web-based research: A survey and call for inclusion in curricula. Behavior Research Methods, 49, 1621–1629. https://doi.org/10.3758/s13428-017-0882-x

Kyöstilä, S. (2018). Clamp performance.now() to 100us. Retrieved from https://chromium-review.googlesource.com/c/chromium/src/+/853505

de Leeuw, J. R. (2015). jsPsych: A JavaScript library for creating behavioral experiments in a Web browser. Behavior Research Methods, 47, 1–12. https://doi.org/10.3758/s13428-014-0458-y

de Leeuw, J. R., & Motz, B. A. (2016). Psychophysics in a Web browser? Comparing response times collected with JavaScript and Psychophysics Toolbox in a visual search task. Behavior Research Methods, 48, 1–12. https://doi.org/10.3758/s13428-015-0567-2

Lewis, P. (2018). Rendering performance. Retrieved from https://developers.google.com/web/fundamentals/performance/rendering/

Mangan, M., & Reips, U.-D. (2007). Sleep, sex, and the Web: Surveying the difficult-to-reach clinical population suffering from sexsomnia. Behavior Research Methods, 39, 233–236. https://doi.org/10.3758/BF03193152

Mozilla. (2018). Concurrency model and Event Loop. MDN web docs. Retrieved from https://developer.mozilla.org/en-US/docs/Web/JavaScript/EventLoop

Musch, J., & Reips, U.-D. (2000). A brief history of Web experimenting. In M. H. Birnbaum (Ed.), Psychological experiments on the Internet (pp. 61–88). San Diego: Academic Press. https://doi.org/10.1016/B978-012099980-4/50004-6

Plant, R. R. (2016). A reminder on millisecond timing accuracy and potential replication failure in computer-based psychology experiments: An open letter. Behavior Research Methods, 48, 408–411. https://doi.org/10.3758/s13428-015-0577-0

Plant, R. R., Hammond, N., & Turner, G. (2004). Self-validating presentation and response timing in cognitive paradigms: How and why? Behavior Research Methods, Instruments, & Computers, 36, 291–303. https://doi.org/10.3758/BF03195575

Reimers, S., & Stewart, N. (2015). Presentation and response timing accuracy in Adobe Flash and HTML5/JavaScript Web experiments. Behavior Research Methods, 47, 309–327. https://doi.org/10.3758/s13428-014-0471-1

Reips, U.-D. (2000). The Web experiment method: Advantages, disadvantages, and solutions. In M. H. Birnbaum (Ed.), Psychological experiments on the Internet (pp. 89–117). San Diego: Academic Press. https://doi.org/10.5167/uzh-19760

Reips, U.-D. (2002). Standards for Internet-based experimenting. Experimental Psychology, 49, 243–256. https://doi.org/10.1027/1618-3169.49.4.243

Reips, U.-D. (2007). Reaction times in Internet-based research. Invited symposium talk at the 37th Meeting of the Society for Computers in Psychology (SCiP) Conference, St. Louis.

Reips, U.-D. (2012). Using the Internet to collect data. In H. Cooper, P. M. Camic, R. Gonzalez, D. L. Long, A. Panter, D. Rindskopf, & K. J. Sher (Eds.), APA handbook of research methods in psychology, Vol. 2: Research designs: Quantitative, qualitative, neuropsychological, and biological (pp. 291–310). Washington, DC: American Psychological Association. https://doi.org/10.1037/13620-017

Reips, U.-D., & Stieger, S. (2004). Scientific LogAnalyzer: A Web-based tool for analyses of server log files in psychological research. Behavior Research Methods, Instruments, & Computers, 36, 304–311. https://doi.org/10.3758/BF03195576

Schmidt, W. C. (1997). World-Wide Web survey research: Benefits, potential problems, and solutions. Behavior Research Methods, Instruments, & Computers, 29, 274–279. https://doi.org/10.3758/BF03204826

Schmidt, W. C. (2007). Technical considerations when implementing online research. In A. Joinson, K. McKenna, T. Postmes, & U.-D. Reips (Eds.), The Oxford handbook of Internet psychology (pp. 461–472). Oxford: Oxford University Press.

Schneider, W., Eschman, A., & Zuccolotto, A. (2012). E-Prime user's guide. Pittsburgh: Psychology Software Tools, Inc.

Scholz, F. (2018). performance.now(). MDN web docs. Retrieved from https://developer.mozilla.org/en-US/docs/Web/API/Performance/now

Schwarz, S., & Reips, U.-D. (2001). CGI versus JavaScript: A Web experiment on the reversed hindsight bias. In U.-D. Reips & M. Bosnjak (Eds.), Dimensions of Internet science (pp. 75–90). Lengerich: Pabst.

van Steenbergen, H., & Bocanegra, B. R. (2016). Promises and pitfalls of Web-based experimentation in the advance of replicable psychological science: A reply to Plant (2015). Behavior Research Methods, 48, 1713–1717. https://doi.org/10.3758/s13428-015-0677-x

Stieger, S., & Reips, U.-D. (2010). What are participants doing while filling in an online questionnaire: A paradata collection tool and an empirical study. Computers in Human Behavior, 26, 1488–1495. https://doi.org/10.1016/j.chb.2010.05.013

WHATWG (Apple, Google, Mozilla, Microsoft). (2018). HTML living standard: Event loops. Retrieved from https://html.spec.whatwg.org/multipage/webappapis.html#event-loops

Wolfe, C. R. (2017). Twenty years of Internet-based research at SCiP: A discussion of surviving concepts and new methodologies. Behavior Research Methods, 49, 1615–1620. https://doi.org/10.3758/s13428-017-0858-x
  • ... With regard to stimulus presentation, web applications may occasionally realize shorter or longer durations than were requested (Barnhoorn, Haasnoot, Bocanegra, & van Steenbergen, 2015;Garaizar & Reips, 2018;Garaizar, Vadillo, & López-de-Ipiña, 2014a;Reimers & Stewart, 2015;Schmidt, 2001). Computer screens refresh with a constant frequency and presentation durations are typically counted in frames. ...
    ... Recently, particular methods for optimizing timing accuracy have been introduced and examined. These include three methods with which a stimulus can be presented, b a s e d o n m a n i p u l a t i n g t h e ( 1 ) o p a c i t y o r ( 2 ) background-color Cascading Style Sheet (CSS) properties of a Hypertext Markup Language (HTML) element (Garaizar & Reips, 2018), and (3) drawing to a canvas element (Garaizar, Vadillo, & López-de-Ipiña, 2014a). The onset and offset of stimuli can be timed via requestAnimationFrame (rAF; Barnhoorn et al., 2015;Garaizar & Reips, 2018;Garaizar, Vadillo, & López-de-Ipiña, 2014a) for all three presentation methods listed above. ...
    ... These include three methods with which a stimulus can be presented, b a s e d o n m a n i p u l a t i n g t h e ( 1 ) o p a c i t y o r ( 2 ) background-color Cascading Style Sheet (CSS) properties of a Hypertext Markup Language (HTML) element (Garaizar & Reips, 2018), and (3) drawing to a canvas element (Garaizar, Vadillo, & López-de-Ipiña, 2014a). The onset and offset of stimuli can be timed via requestAnimationFrame (rAF; Barnhoorn et al., 2015;Garaizar & Reips, 2018;Garaizar, Vadillo, & López-de-Ipiña, 2014a) for all three presentation methods listed above. Additionally, opacity and background-position presentation methods can be timed via CSS animations (Garaizar & Reips, 2018). ...
    Article
    Full-text available
    Web applications can implement procedures for studying the speed of mental processes (mental chronometry) and can be administered via web browsers on most commodity desktops, laptops, smartphones, and tablets. This approach to conducting mental chronometry offers various opportunities, such as increased scale, ease of data collection, and access to specific samples. However, validity and reliability may be threatened by less accurate timing than specialized software and hardware can offer. We examined how accurately web applications time stimuli and register response times (RTs) on commodity touchscreen and keyboard devices running a range of popular web browsers. Additionally, we explored the accuracy of a range of technical innovations for timing stimuli, presenting stimuli, and estimating stimulus duration. The results offer some guidelines as to what methods may be most accurate and what mental chronometry paradigms may suitably be administered via web applications. In controlled circumstances, as can be realized in a lab setting, very accurate stimulus timing and moderately accurate RT measurements could be achieved on both touchscreen and keyboard devices, though RTs were consistently overestimated. In uncontrolled circumstances, such as researchers may encounter online, stimulus presentation may be less accurate, especially when brief durations are requested (of up to 100 ms). Differences in RT overestimation between devices might not substantially affect the reliability with which group differences can be found, but they may affect reliability for individual differences. In the latter case, measurement via absolute RTs can be more affected than measurement via relative RTs (i.e., differences in a participant’s RTs between conditions).
  • ... Additionally, modern screen refresh rates are almost exclusively set to 60 Hz (de facto standard), making certain specifications of online studies a bit more predictable. Among others [57][58][59], two recent large studies [60,61] investigated timing precision (unintended variability in stimulus presentation) of several online and offline solutions. The online-based comparison found good overall precision for Gorilla (13 ms), jsPsych (26 ms), PsychoJS (−6 ms) and lab.js (10 ms). ...
    ... There are some aspects researchers should consider when starting out with running online studies or transferring lab-based experiments to online systems [59,75,76] (see Figure 2). To a certain extent, creating successful online experiments is similar to app development: one needs to think of a coherent framework and constantly worry about what the users are doing with the 'product' and whether they are using it as intended-without many opportunities for direct feedback. ...
    Article
    Full-text available
    Researchers have ample reasons to take their experimental studies out of the lab and into the online wilderness. For some, it is out of necessity, due to an unforeseen laboratory closure or difficulties in recruiting on-site participants. Others want to benefit from the large and diverse online population. However, the transition from in-lab to online data acquisition is not trivial and might seem overwhelming at first. To facilitate this transition, we present an overview of actively maintained solutions for the critical components of successful online data acquisition: creating, hosting and recruiting. Our aim is to provide a brief introductory resource and discuss important considerations for researchers who are taking their first steps towards online experimentation.
  • ... Therefore, rAF can solve the problem of low timing accuracy in web browsers while also addressing the issues raised by Plant [5]. Garaizar and Peips (2018) suggested that rAF shows high timing accuracy in operating systems except those based in Linux [11]. In the present study, we obtained similar results using the typical stimuli of dynamic sinusoidal gratings and flashes. ...
    Article
    Full-text available
    Online experiments are growing in popularity. This study aimed to determine the timing accuracy of web technologies and investigate whether they can be used to support high temporal precision psychology experiments. A dynamic sinusoidal grating and flashes were produced by setInterval, CSS3, and requestAnimationFrame (hereafter, rAF) technologies. They were run at normal or real-time processing priority in Chrome, Firefox, Edge, and Internet Explorer on Windows, macOS, and Linux. Timing accuracies were compared with that of Psychtoolbox, which was chosen as the gold standard. It was found that rAF with real-time priority had the best timing accuracy of the web technologies and, in most cases, a timing accuracy similar to that of Psychtoolbox in traditional experiments. However, rAF exhibited poor timing accuracy on Linux. Therefore, rAF can be used as a technical basis for millisecond-accurate timing sequences in online experiments, thereby benefiting the psychology field.
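The rAF-based measurement approach discussed above can be sketched in a few lines. This is a minimal illustration, not the cited study's exact procedure: the helper function, the 60-frame sample size, and the 1.5-refresh-period threshold for flagging a dropped frame are all assumptions made here for clarity. In a browser, each requestAnimationFrame callback receives a high-resolution timestamp, so inter-frame intervals can be recorded and inspected after the fact.

```javascript
// Pure helper: given a list of rAF timestamps (ms), return the
// inter-frame intervals and a count of frames whose interval exceeded
// ~1.5 refresh periods (a heuristic for a dropped frame at 60 Hz).
function analyzeFrames(timestamps, refreshMs = 1000 / 60) {
  const intervals = [];
  for (let i = 1; i < timestamps.length; i++) {
    intervals.push(timestamps[i] - timestamps[i - 1]);
  }
  const dropped = intervals.filter((d) => d > 1.5 * refreshMs).length;
  return { intervals, dropped };
}

// Browser-only usage (no-op elsewhere): collect 60 frame timestamps,
// then report the interval statistics.
if (typeof requestAnimationFrame !== "undefined") {
  const stamps = [];
  function tick(now) {
    stamps.push(now);
    if (stamps.length < 60) {
      requestAnimationFrame(tick);
    } else {
      console.log(analyzeFrames(stamps));
    }
  }
  requestAnimationFrame(tick);
}
```

Note that the callback's `now` argument is a high-resolution timestamp supplied by the browser's frame scheduler, which is what makes post-hoc frame-loss analysis possible at all.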
  • ... There are some aspects researchers should consider when starting out with running online studies or transferring lab-based experiments to online systems [57,58]. To a certain extent, creating successful online experiments is similar to app development: one needs to think of a coherent framework and constantly worry about what the users are doing with the 'product' and whether they are using it as intended -without many opportunities for direct feedback. ...
    Preprint
    Researchers have ample reasons to take their experimental studies out of the lab and into the online wilderness. For some, it is out of necessity, due to an unforeseen laboratory closure or difficulties in recruiting on-site participants. Others want to benefit from the large and diverse online population. However, the transition from in-lab to online data acquisition is not trivial and might seem overwhelming at first. To facilitate this transition, we present an overview of actively maintained solutions for the critical components of successful online data acquisition: creating, hosting and recruiting. Our aim is to provide a brief introductory resource and discuss important considerations for researchers who are taking their first steps towards online experimentation.
  • ... There have been various claims made on the scientific record regarding the display and response timing ability of experimental setups using web browsers, for instance that timing can be good depending on device and setup (Pronk, Wiers, Molenkamp, & Murre, 2019), and that different techniques of rendering animations lead to reduced timing precision (Garaizar & Reips, 2019). Ultimately, though, the variance in timing reflects the number of different ways to create an online experiment, and the state of the software and hardware landscape at the time of assessment; all of these are changing at a fast rate. ...
    Preprint
    Full-text available
    Due to its increasing ease-of-use and ability to quickly collect large samples, online behavioral research is currently booming. With this increasing popularity, it is important that researchers are aware of who online participants are, and what devices and software they use to access experiments. While it is somewhat obvious that these factors can impact data quality, it remains unclear how big this problem is. To understand how these characteristics impact experiment presentation and data quality, we performed a battery of automated tests on a number of representative setups. We investigated how different web-building platforms (Gorilla, jsPsych, Lab.js, and psychoJS/PsychoPy3), browsers (Chrome, Edge, Firefox, and Safari), and operating systems (mac OS and Windows 10) impact display time across 30 different frame durations for each software combination. In addition, we employed a robot actuator in representative setups to measure response recording across aforementioned platforms, and between different keyboard types (desktop and integrated laptop). We then surveyed over 200 000 participants on their demographics, technology, and software to provide context to our findings. We found that modern web-platforms provide a reasonable accuracy and precision for display duration and manual response time, but also identify specific combinations that produce unexpected variance and delays. While no single platform stands out as the best in all features and conditions, our findings can help researchers make informed decisions about which experiment building platform is most appropriate in their situation, and what equipment their participants are likely to have.
  • ... Programming knowledge: not required. Complexity: simple; it is enough to install the application using the installer and configure the system, and the interface is well designed and attractive. Tools for data analyses: ... K. lab.js: lab.js is a tool that allows researchers to create browser-based studies in the field of social and cognitive science [17]. Studies can be created with a visual constructor or with programming code [18]. Fig. 11 shows the interface of the test constructor in lab.js. ...
    Article
    Full-text available
    Longitudinal studies allow causal hypotheses to be examined directly: they make it possible to relate the order of impacts (i.e., life events, educational effects, etc.) to the consequences that then occur. Long-term data storage imposes specific requirements on software and on methods of data storage and conversion. The paper introduces criteria for evaluating software tools in the context of their application in longitudinal studies in psychology. The study analyzes popular tools for psychological research on the basis of the introduced criteria.
  • Presentation
    Invited Spotlight talk at conference "Online experiments – Quick? Cheap? Valid? A Conference on Online Experimental Research", https://www.suz.uzh.ch/de/forschung/projekte/eor/Invited-Speakers.html Ulf-Dietrich Reips has a background in psychology. In 1995, he founded the first online lab for experiments. His work focuses on Internet-based research methodologies, particularly Internet-based psychological experimenting, Internet-based tests, the psychology of the Internet, measurement, the cognition of causality, Social Media, and Big Data. He will talk about “Pitfalls of doing experiments online”.
  • Article
    Full-text available
    Psychological effects connected with fluent processing are called fluency effects. In a sample of 403 participants we test whether conceptual fluency effects can be found in the context of inductive reasoning, a context that has not been investigated before. As a conceptual manipulation we vary the use of symbols (persons and crosses) in reasoning tasks. These symbols were chosen to provide hints for the solution of the implemented tasks and thus manipulate fluency. We found evidence that these hints influence ease of processing. The proportion of solved tasks increased by 11% on average in the condition with conceptual hints, F(1,399) = 13.47, partial η² = .033, p < .001. However, we did not find an effect of the conceptual manipulation on the temporal perception of the task. In a second study (n = 62) we strengthened our findings by investigating solution strategies for the tasks in more detail; 79% of the participants described the tasks in the way that was intended. Our results illustrate the advantages of separating ease of processing, fluency experience, and judgments.
  • Presentation
    Full-text available
    Internet-based research methods: Challenges and solutions. Computational Linguistics, Linguistic Modeling and its Interfaces lecture series, University of Tübingen.
  • Article
    Full-text available
    The increasingly widespread use of mobile phone applications (apps) as research tools and cost-effective means of vast data collection raises new methodological challenges. In recent years, it has become a common practice for scientists to design apps that run only on a single operating system, thereby excluding large numbers of users who use a different operating system. However, empirical evidence investigating any selection biases that might result thereof is scarce. Hence, we conducted two studies drawing from a large multi-national (Study 1; N = 1,081) and a German-speaking sample (Study 2; N = 2,438). Study 1 compared iOS and Android users across an array of key personality traits (i.e., well-being, self-esteem, willingness to take risks, optimism, pessimism, Dark Triad, and the Big Five). Focusing on Big Five personality traits in a broader scope, in addition to smartphone users, Study 2 also examined users of the main computer operating systems (i.e., Mac OS, Windows). In both studies, very few significant differences were found, all of which were of small or even tiny effect size, mostly disappearing after sociodemographics had been controlled for. Taken together, minor differences in personality seem to exist, but they are of small to negligible effect size (ranging from OR = 0.919 to 1.344 (Study 1), np² = .005 to .036 (Study 2), respectively) and may reflect differences in sociodemographic composition, rather than operating system of smartphone users. © 2017 Goetz et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
  • Article
    Full-text available
    The first papers that reported on conducting psychological research on the web were presented at the Society for Computers in Psychology conference 20 years ago, in 1996. Since that time, there has been an explosive increase in the number of studies that use the web for data collection. As such, it seems a good time, 20 years on, to examine the health and adoption of sound practices of research on the web. The number of studies conducted online has increased dramatically. Overall, it seems that the web can be a method for conducting valid psychological studies. However, it is less clear that students and researchers are aware of the nature of web research. While many studies are well conducted, there is also a certain laxness appearing regarding the design and conduct of online studies. This laxness appears both anecdotally to the authors as managers of large sites for posting links to online studies, and in a survey of current researchers. One of the deficiencies discovered is that there is no coherent approach to educating researchers as to the unique features of web research.
  • Article
    This discussion of the symposium 20 Years of Internet-Based Research at SCiP: Surviving Concepts, New Methodologies compares the issues faced by the pioneering Internet-based psychology researchers who presented at the first symposia on the topic, at the 1996 annual meeting of the Society for Computers in Psychology, to the issues facing researchers today. New methodologies unavailable in the early days of Web-based psychological research are discussed, with an emphasis on mobile computing with smartphones that is capitalizing on capabilities such as touch screens and gyro sensors. A persistent issue spanning the decades has been the challenge of conducting scientific research with consumer-grade electronics. In the 1996 symposia on Internet-based research, four advantages were identified: easy access to a geographically unlimited subject population, including subjects from very specific and previously inaccessible target populations; bringing the experiment to the subject; high statistical power through large sample size; and reduced cost. In retrospect, it appears that Internet-based research has largely lived up to this early promise—with the possible exception of sample size, since the public demand for controlled psychology experiments has not always been greater than the supply offered by researchers. There are many reasons for optimism about the future of Internet-based research. However, unless courses and textbooks on psychological research methods begin to give Web-based research the attention it deserves, the future of Internet-based psychological research will remain in doubt.
  • Article
    This article introduces some of the rudimentary underlying concepts of how the Internet works and points out a number of caveats that can influence the quality of collected data. Topics covered include Internet basics, technical problems, programming for the lowest common technology, client configuration issues, server side and data security issues, and the limits of precision. It is hoped that after becoming familiar with the information herein, researchers will be capable of determining whether the research application they are interested in pursuing is fit for the Internet medium, or whether technical issues will pose problems which threaten the validity of the work.
  • Article
    Full-text available
    In a recent letter, Plant (2015) reminded us that proper calibration of our laboratory experiments is important for the progress of psychological science. Therefore, carefully controlled laboratory studies are argued to be preferred over Web-based experimentation, in which timing is usually more imprecise. Here we argue that there are many situations in which the timing of Web-based experimentation is acceptable and that online experimentation provides a very useful and promising complementary toolbox to available lab-based approaches. We discuss examples in which stimulus calibration or calibration against response criteria is necessary and situations in which this is not critical. We also discuss how online labor markets, such as Amazon's Mechanical Turk, allow researchers to acquire data in more diverse populations and to test theories along more psychological dimensions. Recent methodological advances that have produced more accurate browser-based stimulus presentation are also discussed. In our view, online experimentation is one of the most promising avenues to advance replicable psychological science in the near future.
  • Article
    Full-text available
    There is an ongoing 'replication crisis' across the field of psychology in which researchers, funders, and members of the public are questioning the results of some scientific studies and the validity of the data they are based upon. However, few have considered that a growing proportion of research in modern psychology is conducted using a computer. Could it simply be that the hardware and software, or experiment generator, being used to run the experiment itself be a cause of millisecond timing error and subsequent replication failure? This article serves as a reminder that millisecond timing accuracy in psychology studies remains an important issue and that care needs to be taken to ensure that studies can be replicated on current computer hardware and software.
  • Article
    Full-text available
    Behavioral researchers are increasingly using Web-based software such as JavaScript to conduct response time experiments. Although there has been some research on the accuracy and reliability of response time measurements collected using JavaScript, it remains unclear how well this method performs relative to standard laboratory software in psychologically relevant experimental manipulations. Here we present results from a visual search experiment in which we measured response time distributions with both Psychophysics Toolbox (PTB) and JavaScript. We developed a methodology that allowed us to simultaneously run the visual search experiment with both systems, interleaving trials between two independent computers, thus minimizing the effects of factors other than the experimental software. The response times measured by JavaScript were approximately 25 ms longer than those measured by PTB. However, we found no reliable difference in the variability of the distributions related to the software, and both software packages were equally sensitive to changes in the response times as a result of the experimental manipulations. We concluded that JavaScript is a suitable tool for measuring response times in behavioral research.
  • Article
    Full-text available
    Using the Web to run behavioural and social experiments quickly and efficiently has become increasingly popular in recent years, but there is some controversy about the suitability of using the Web for these objectives. Several studies have analysed the accuracy and precision of different web technologies in order to determine their limitations. This paper updates the extant evidence about presentation accuracy and precision on the Web and extends the study of accuracy and precision in the presentation of multimedia stimuli to HTML5-based solutions, which were previously untested. The accuracy and precision in the presentation of visual content in classic web technologies is acceptable for use in online experiments, although some results suggest that these technologies should be used with caution in certain circumstances. Declarative animations based on CSS are the best alternative when animation intervals are above 50 milliseconds. The performance of procedural web technologies based on the HTML5 standard is similar to that of previous web technologies. These technologies are being progressively adopted by the scientific community and have promising futures, which makes their use advisable over more obsolete technologies.
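A declarative CSS animation of the kind discussed above can be driven from JavaScript by injecting an @keyframes rule. The sketch below is illustrative only: the element id "stimulus", the flash pattern, and the helper that converts frame counts at a given refresh rate into milliseconds are assumptions made here, not the cited paper's exact stimuli.

```javascript
// Pure helper: convert a duration expressed in frames at a given
// refresh rate into milliseconds (e.g. 3 frames at 60 Hz -> 50 ms).
function framesToMs(frames, hz = 60) {
  return (frames * 1000) / hz;
}

// Browser-only: inject an @keyframes rule that flashes the stimulus,
// letting the browser's compositor schedule each frame declaratively.
if (typeof document !== "undefined") {
  const style = document.createElement("style");
  style.textContent =
    "@keyframes flash { 0%, 100% { opacity: 1; } 50% { opacity: 0; } }\n" +
    "#stimulus { animation: flash " +
    framesToMs(6) +
    "ms steps(2, end) infinite; }";
  document.head.appendChild(style);
}
```

Expressing durations as whole multiples of the refresh period (here via framesToMs) is one way to keep intended presentation times aligned with the 60 Hz frame grid that the abstract above treats as the de facto standard.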
  • Article
    Full-text available
    Performing online behavioral research is gaining increased popularity among researchers in psychological and cognitive science. However, the currently available methods for conducting online reaction time experiments are often complicated and typically require advanced technical skills. In this article, we introduce the Qualtrics Reaction Time Engine (QRTEngine), an open-source JavaScript engine that can be embedded in the online survey development environment Qualtrics. The QRTEngine can be used to easily develop browser-based online reaction time experiments with accurate timing within current browser capabilities, and it requires only minimal programming skills. After introducing the QRTEngine, we briefly discuss how to create and distribute a Stroop task. Next, we describe a study in which we investigated the timing accuracy of the engine under different processor loads using external chronometry. Finally, we show that the QRTEngine can be used to reproduce classic behavioral effects in three reaction time paradigms: a Stroop task, an attentional blink task, and a masked-priming task. These findings demonstrate that QRTEngine can be used as a tool for conducting online behavioral research even when this requires accurate stimulus presentation times. Electronic supplementary material The online version of this article (doi:10.3758/s13428-014-0530-7) contains supplementary material, which is available to authorized users.