Angel A. Juan
IN3 - Open University of Catalonia, Spain
Thanasis Daradoumis
University of the Aegean, Greece, & Open University of Catalonia, Spain
Meritxell Roca
IN3 - Open University of Catalonia, Spain
Scott E. Grasman
Rochester Institute of Technology, USA
Javier Faulin
Public University of Navarre, Spain
Collaborative and
Distributed E-Research:
Innovations in Technologies,
Strategies and Applications
Collaborative and distributed e-research: innovations in technologies, strategies, and applications / Angel A. Juan ... [et al.],
editors.
p. cm.
Includes bibliographical references and index.
Summary: “This book offers insight into practical and methodological issues related to collaborative e-research and
furthers readers understanding of current and future trends in online research and the types of technologies involved”--
Provided by publisher.
ISBN 978-1-4666-0125-3 (hardcover) -- ISBN 978-1-4666-0127-7 (print & perpetual access) 1. Internet research. 2.
Group work in research. I. Juan, Angel A., 1972-
ZA4228.C65 2012
001.4’202854678--dc23
2011039614
British Cataloguing in Publication Data
A Cataloguing in Publication record for this book is available from the British Library.
All work contributed to this book is new, previously-unpublished material. The views expressed in this book are those of the
authors, but not necessarily of the publisher.
Managing Director: Lindsay Johnston
Senior Editorial Director: Heather Probst
Book Production Manager: Sean Woznicki
Development Manager: Joel Gamon
Development Editor: Hannah Abelbeck
Acquisitions Editor: Erika Gallagher
Typesetter: Russell A. Spangler
Cover Design: Nick Newcomer, Lisandro Gonzalez
Published in the United States of America by
Information Science Reference (an imprint of IGI Global)
701 E. Chocolate Avenue
Hershey PA 17033
Tel: 717-533-8845
Fax: 717-533-8661
E-mail: cust@igi-global.com
Web site: http://www.igi-global.com
Copyright © 2012 by IGI Global. All rights reserved. No part of this publication may be reproduced, stored or distributed in
any form or by any means, electronic or mechanical, including photocopying, without written permission from the publisher.
Product or company names used in this set are for identification purposes only. Inclusion of the names of the products or
companies does not indicate a claim of ownership by IGI Global of the trademark or registered trademark.
Library of Congress Cataloging-in-Publication Data
Copyright © 2012, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
Chapter 2
Pablo Garaizar
Universidad de Deusto, Spain
Miguel A. Vadillo
Universidad de Deusto, Spain
Diego López-de-Ipiña
Universidad de Deusto, Spain
Helena Matute
Universidad de Deusto, Spain
The Web as a Platform for
E-Research in the Social
and Behavioral Sciences
ABSTRACT
As a consequence of the joint and rapid evolution of the Internet and the social and behavioral sciences
during the last two decades, the Internet is becoming one of the best possible psychological laboratories
and is being used by scientists from all over the world in more and more productive and interesting ways
each day. This chapter uses examples from psychology, while reviewing the most recent Web paradigms,
like the Social Web, Semantic Web, and Cloud Computing, and their implications for e-research in the
social and behavioral sciences, and tries to anticipate the possibilities offered to social science research-
ers by future Internet proposals. The most recent advancements in the architecture of the Web, both
from the server and the client-side, are also discussed in relation to behavioral e-research. Given the
increasing social nature of the Web, both social scientists and engineers should benefit from knowledge
on how the most recent and future Web developments can provide new and creative ways to advance the
understanding of human nature.
DOI: 10.4018/978-1-4666-0125-3.ch002
WEB-BASED RESEARCH IN
PSYCHOLOGY: TWO DECADES OF
JOINT EVOLUTION
Cognitive psychology has traditionally kept a
special relationship with computer science. In
fact, the “cognitive revolution” usually refers
to the joint developments that took place in the
mid-1950s and 1960s in computer science and
psychology, together with those of other cognitive
sciences (linguistics, philosophy, anthropology,
and neuroscience). Initially, psychologists were
interested in computers mainly because they
provided a novel and interesting model of how
the brain might work (Gardner, 1985). Behav-
iorists had complained that cognitive concepts
were difficult to apprehend in mechanistic and
reductionist terms and that they should, therefore,
always be avoided in scientific psychology. How-
ever, computer science showed for the first time
that simple, mechanical devices were also able to
process information and perform many cognitive
tasks that had sometimes been assumed to remain
beyond the realm of science. Soon, psychologists
proposed that the human brain was just a peculiar
type of computer and that the mind was a kind of
software running on this “hardware.” Moreover,
cognitive scientists started to describe cognitive
processes in a program-like manner and even tried
to simulate these processes in standard computers
(Newell & Simon, 1972).
For a long time, this was the main role played
by computers in cognitive psychology. However,
during the 1980s, when computers became
cheaper and the recently developed high-level
programming languages made their use more
accessible, psychologists started to use comput-
ers with a new purpose in mind. Instead of just
using them as an abstract model of how the mind
works, psychologists began to use computers as
an additional tool in their experiments. Com-
puters simplified the presentation of stimuli to
participants and the registration of many types of
responses to those stimuli. In fact, any researcher
with rudimentary programming skills could eas-
ily conduct classical laboratory experiments with
only the help of a desktop computer. The old
laboratory in which a different apparatus had to be
used for each different experiment was soon
replaced by laboratories in which computers were
used to run all types of experiments, including
tasks as different as: a) spatial navigation through
different types of mazes, b) memory tasks with
different lists of words, or with images or sounds,
c) reading comprehension studies using different
types of stories and distracting stimulation, d)
reaction-time studies, e) subliminal perception
involving words or images or sounds presented
so rapidly that they could not be consciously
processed, f) Pavlovian and operant conditioning
including visual or auditory stimuli as well as
different types of responses (from keyboard to
mouse to vocalizations), g) divided attention, h)
social dilemmas, or any other interesting research
question a psychologist could think of. Thus, by
the time the World Wide Web was created, in the
early 1990s, most experimental psychologists
were already used to having computers in their
labs and using them extensively in their experi-
ments. It was only a matter of time before some
researchers made the first steps towards taking
advantage of the new opportunities offered by
the Internet as a multipurpose and world-wide
psychology laboratory. There is no doubt that this
world-wide laboratory is also of great value to
other social and behavioural sciences interested
in human behaviour, beliefs, attitudes, and social
relations. These include education, economics,
marketing, anthropology, sociology, and politics,
but this list is certainly not complete. Indeed,
any scientific discipline interested in how people
reason, learn, relate to each other, process informa-
tion, and respond to it, should benefit from using
the Web as a platform for e-research. Although
focusing on examples from our own field of
expertise, experimental psychology, the present
chapter should become a useful guide for other
social sciences as well.
From its very beginning at the Conseil Eu-
ropéen pour la Recherche Nucléaire (CERN) in
the early 1990s, the World Wide Web has been
closely linked to the academic world. During
the past two decades, we have witnessed its
technological evolution and how social science
researchers have been taking advantage of
its capability to engage participants around the
world to perform larger, more insightful,
and valid experiments. The advantages of con-
ducting psychological research over the Internet
soon became clear. Probably the most remarkable
one is that the Internet made it relatively easy
to access extraordinarily large samples, some-
thing that is usually beyond the possibilities of
the traditional psychological laboratory. In fact,
this is still the main reason many researchers
decide to conduct experiments over the Internet
(e.g., Nosek, 2005). This feature of Internet-
based experiments becomes especially relevant
when researchers are interested in determining
whether a nonsignificant statistical result reflects
a genuine absence of effects or a simple lack
of statistical power (Bar-Anan, De Houwer, &
Nosek, 2010; Ratliff & Nosek, 2010). Moreover,
this methodology not only allows the recruitment
of many participants but can also be used
to target very specific populations that would
otherwise remain inaccessible (Mangan & Reips,
2007; Vernberg, Snyder, & Schuh, 2005).
However, in spite of these and other advan-
tages, Internet-based research methods also pose
important methodological problems. In general,
researchers have little control over the condi-
tions in which online participants conduct the
experimental task. They cannot even be sure
whether the participants have been paying
attention to the task or whether they have
correctly understood the instructions. Moreover,
a participant can enter the Web site of
the experiment several times and submit data
repeatedly. Although there are some technical
measures that can be used to avoid or reduce
the negative impact of these and related meth-
odological problems (Birnbaum, 2004; Reips,
2002), many of these solutions pose their own
problems as well. It is not surprising that the
pioneers of Internet-based research usually had
to face a high level of scepticism in the evalu-
ation of their studies.
In this situation, the first step that had to be
made before Internet-based methods could be
trusted was to carefully assess the impact of
conducting a study over the Internet, in relation
to the traditional laboratory. Therefore, many
researchers started to replicate well-known effects
over the Internet or to conduct studies simultane-
ously in the laboratory and over the Internet, so
that the results of both methodologies could be
contrasted. This work was done both to assess
the validity of online questionnaires and surveys
(Buchanan & Smith, 1999; Schmidt, 1997a), on
the one hand, and the validity of experimental
procedures, on the other (Birnbaum, 1999; Birn-
baum & Wakcher, 2002; Dandurand, Shultz, &
Onishi, 2008; Matute, Vadillo, & Bárcena, 2007;
Matute, Vegas & Pineño, 2002; Steyvers, Tenen-
baum, Wagenmakers, & Blum, 2003; Vadillo,
Bárcena, & Matute, 2006; Vadillo & Matute,
2009, 2011). It was particularly important to show
that effects could be replicated over the Internet
even when delicate dependent measures, such
as reaction times, were used (McGraw, Tew, &
Williams, 2000). In general, the main result of
this research was that, with very minor excep-
tions, online methods could be trusted and that
the drawbacks of this methodology were clearly
outweighed by its many advantages.
A second step towards the generalization of
e-research in psychology was the development
of experimental software that could be easily
used by researchers to develop their own experimental
applications. Although most cognitive
psychologists were used to having to learn some
programming languages in order to design their
experiments, few of them had during the 1990s,
or have even today, the necessary skills to adapt
these experiments to the Internet environment.
Therefore, many researchers started to develop
simple design tools that could be used to generate
survey forms (Birnbaum, 2000; Schmidt, 1997b)
or more complicated experimental tasks (Reips &
Neuhaus, 2002) without much technical knowl-
edge of Internet-based programming.
After two decades of joint evolution, the
Internet has become a highly valuable research
tool for many experimental psychologists. They
not only use the Internet regularly to conduct
experiments and large correlational studies but
also use it to gather other types of information
such as, for instance, psycholinguistic data
about the frequency of words in several languages
(Lahl, Göritz, Pietrowsky, & Rosenberg, 2009), or
the susceptibility of people to cognitive illusions
and biases such as the illusion of control (Matute,
Vadillo, Vegas, & Blanco, 2007). Moreover, the
increasing access of the general population to the
Internet is also providing psychologists with new
research topics that need to be explored (e.g.,
Internet abuse, cyberbullying) and with novel
ways of delivering psychological interventions to
the population (e.g., Botella et al., 2008a, 2008b,
2009). As we will show below, the recent devel-
opments in the Web 2.0 and the future Semantic
Web bring new and yet unexplored possibilities
for e-research in social sciences.
THE NEW WEB PARADIGMS
AND THEIR IMPLICATIONS
FOR E-RESEARCH IN THE
SOCIAL SCIENCES
The future of the Web is still unclear, but some
approaches like Social Web, Semantic Web, and
Cloud Computing have been used widely and
can still be substantially improved. In this sec-
tion we will describe the new scenarios enabled
by them in terms of e-research in the social and
behavioral sciences, and we will also glimpse
at the possibilities offered to social scientists by
future Internet proposals.
The Social Web
Although, from its first stages of development
in the early 1990s, the World Wide Web (WWW)
had a flexible and collaborative design that
allowed users to create new links and content,
it was not until the beginning of the 21st
century that this possibility became a reality by
virtue of the technological and methodological
evolution popularly known as the Web 2.0
(DiNucci, 1999; O’Reilly, 2005) or the Social Web
(Hoschka, 1998).
For the last two decades, Web users have moved
forward from simply carrying out hypertextual
data transfers to the socialization of many aspects
of their lives. The Web has evolved towards the
“Read/Write Web,” achieving one of the goals
initially proposed by its designers (Berners-Lee &
Cailliau, 1990). Describing this new stage of the
Web as the “Social Web” seems more appropriate,
since the adoption of the new technologies and
methodologies involved has not been as abrupt or
revolutionary as the term “Web 2.0” might suggest,
but has instead taken place through a progressive
process of socialization. The two terms may differ
in how they explain the origin of the change, but
they agree on its effects.
O’Reilly (2005) defines Web 2.0 applications as
services that get better the more people use
them. Over the last 15 years, the Web has grown
from an information-centered network (i.e.,
“information superhighways”) into people-centered
social media in which user-generated content is
crucial. This trend is likely to continue,
considering that Future Internet aims at favoring
user empowerment (e.g., two users making the
same search with the same keywords, but at
different locations and with different Web profiles,
will obtain different search results).
The Social Web rests on two foundations:
users and data. Users generate content with different
levels of involvement (from merely being
part of a social media site to publishing and
editing multimedia content, rating or tagging it,
or providing recommendations and reviews), attract
more users, and reshape platforms and Web
services in terms of content (e.g., fixing errors)
and purpose (e.g., creating new ways of using
them, not defined by their designers). Data is
the fuel that drives social media, regardless of
whether it is generated by users or by other online
services. Instead of forming vast repositories
of unrelated information, the Social Web’s data
is available in several standard formats, ready to
be remixed, completed or updated by third-party
services. As stated by Engeström (2009), people do
not just connect to each other using social media.
They connect through “shared objects,” and good
online services (e.g., YouTube, Flickr, Delicious)
allow people to create social objects that add value
for other users, and subsequently for social
media as a whole.
Both these aspects of the Social Web can en-
hance e-research in social and behavioral sciences
in two different ways. On the one hand, the Social
Web makes it possible to use methodological approaches that
would be unfeasible otherwise, providing cheap
and effective ways to engage people in participating
in experiments, or taking advantage of the
sharing features of social media to distribute and
process experimental data. On the other hand,
the success of the Social Web has significant
consequences from a psychological point of view,
in terms of new or implicit behaviors in social
media. Thus, the Social Web can be a good means
for improving research methodology and, at the
same time, an object of e-research in psychology.
Improving Research Methodology
through the Social Web
Many authors agree that the popularization of
computational technology provides a new way to
do science. Wolfram (2002) remarks on the power of
using computers when facing complex problems,
even through simple programs, delegating the
tedious calculations to machines. According to
Shneiderman (2008), new kinds of science are
needed to study the integrated interdisciplinary
problems at the heart of socio-technical systems.
This Science 2.0 combines the methods of tradi-
tional science (i.e., hypothesis testing, predictive
models, and the need for validity, replicability, and
generalizability) with the opportunities offered by
the Social Web to collect and process real-time
empirical data. E-research in the social sciences
is also involved in this evolution. In addition
to academic social networks (e.g., Academia.edu,
Mendeley, ResearchID, SciLink), general
purpose social networks (e.g., Facebook, Twit-
ter, MySpace) can be used to spread scientific
findings or even to discuss them. The idea of
achieving insightful conclusions through open
debates in social media is still very controversial
for many reasons (e.g., irrelevant or inaccurate
contributions, reputation, non-disclosure agreements).
Nevertheless, there is no such debate
about using social media as a way to recruit
participants for experiments. Many social science
experiments can take advantage of the wide
range of ways to push information provided by
the Social Web. It is easy to publish information
regarding the experiment or to recruit potential
participants using tags, categories, recommenda-
tions, groups, or fan-pages, without annoying other
people with unsolicited notifications that could be
considered as spam. Due to all the hints provided
by user-generated content, the Social Web is able
to hit very specific targets and avoid general and
ineffective ways of promotion (Anderson, 2006).
Thus, e-researchers in social sciences can reach
very specific participants in studies that would be
extremely difficult or even unfeasible without the
social features of the Web.
Apart from the increase in the number and
prevalence of participants in social sciences
experiments, the Social Web provides several
techniques to analyze and exploit user-generated
content. Virtually all social media services allow
interacting with their data through Application
Programming Interfaces (APIs). APIs allow a Web
service to be consumed without using a browser
to navigate the contents of the service’s Web site
(e.g., using a mobile phone application). Features
derived from APIs are often underestimated by
non-programmers. Using a real-life comparison,
APIs can be likened to the “delivery service” of a
restaurant. If a family does not want to prepare
dinner, they can either go to a restaurant or call a
delivery service. There are also mixed alternatives,
like going to a restaurant and ordering a take-away
meal, or even paying for a catering service at home.
Coming back to Web services, the meal is the
content that users need, and the delivery service
is the API. There are some restaurants without
delivery service that force customers to go to their
place for a meal, as API-less Web applications
do; and there are some other restaurants which
provide both alternatives (local or delivery), and
customers can decide whether it is worth going to
the restaurant or it is better to have dinner at home.
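To make the restaurant comparison more concrete, the following minimal Python sketch shows what "ordering by delivery" might look like for a program: it builds a query for a Web service and reads the structured reply, with no browser involved. The endpoint `https://api.example.org/photos/search`, its parameters, and the layout of the JSON reply are hypothetical and are not taken from any real service. The reply is a canned string so that the sketch runs without network access.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, used for illustration only.
BASE_URL = "https://api.example.org/photos/search"


def build_query_url(tag, per_page=2):
    """Return the URL a client would request to search photos by tag."""
    return BASE_URL + "?" + urlencode({"tag": tag, "per_page": per_page})


def extract_titles(raw_json):
    """Pull the photo titles out of a JSON reply (assumed layout)."""
    reply = json.loads(raw_json)
    return [photo["title"] for photo in reply["photos"]]


# Canned reply standing in for what the hypothetical service would send back.
SAMPLE_REPLY = '{"photos": [{"title": "Harbour at dusk"}, {"title": "Old town"}]}'

url = build_query_url("bilbao")
titles = extract_titles(SAMPLE_REPLY)
print(url)
print(titles)
```

In a real study, the canned reply would be replaced by the body of an HTTP response, and the field names would follow the documentation of the service being queried.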
Moreover, by using APIs, third-party develop-
ers can aggregate social services to create new
ones, called “mash-ups” (e.g., an application
that mixes cartographic information provided by
Google Maps with beautiful pictures of nearby
places gathered from Flickr, with no need to have
an explicit agreement between the providers of
those APIs [Google and Yahoo!]). Another interesting
feature of social media APIs is that they
make it possible to access, collect, and analyze huge amounts
of useful information. Older methodologies, like
traditional offline experimentation or even online
experimentation with no use of social media, can
handle tens, hundreds, or thousands of participants.
But using the APIs provided by social media, mil-
lions of interactions can be handled in real time.
In some cases, the range provided by the API can
cover the whole target population of the study,
but in wide-ranging studies it can be more limited
for technical and economic reasons.
Nevertheless, going from thousands to millions
of interactions is a significant leap in e-research.
For instance, Twitter’s public APIs give third-party
applications access to 1% of all real-time
“tweets” (messages sent via Twitter). The
third-party applications using Twitter APIs can
apply to be whitelisted, which upgrades
their quota and gives them access to 10% of all Twitter
content. This amounts to roughly 2 and 20
million tweets per day for regular and whitelisted
applications, respectively.
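These figures follow from simple proportions. As a rough check, and taking the 2 million tweets per day quoted for the 1% sample as given, the sketch below back-calculates the total daily volume those figures imply and the size of a 10% sample. The function names are illustrative, and integer percentages are used to keep the arithmetic exact.

```python
def implied_total(sample_size, percent):
    """Total daily volume implied by a sample covering `percent` of it."""
    return sample_size * 100 // percent


def sample_volume(total, percent):
    """Number of messages in a sample covering `percent` of `total`."""
    return total * percent // 100


regular_sample = 2_000_000  # tweets per day quoted for the 1% stream
total = implied_total(regular_sample, 1)
whitelisted_sample = sample_volume(total, 10)
print(total, whitelisted_sample)
```

Under these assumptions the quoted samples correspond to about 200 million tweets per day overall, of which a whitelisted application would receive about 20 million.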
Providing ways to build an ecosystem of third-party
applications is at the core of all successful
social media platforms. The case of Facebook is
particularly remarkable because it enables the
creation of successful business models within the
social network (e.g., Zynga, the social videogame
company behind FarmVille, got over $1 million in
revenue a day during 2010 thanks to Facebook),
boosts the use of the Facebook fan-pages by com-
panies, and provides several interfaces to publish
outside-generated content in Facebook, or vice
versa, Facebook content in third-party platforms.
If Facebook’s benefits from interconnectivity are
considerable, Twitter’s are outstanding. Although
it is still unclear whether Twitter can be considered
a social network like Facebook or not (Kwak, Lee,
Park, & Moon, 2010), Twitter describes itself
as an “information network” where users find,
curate, and deliver content, rather than socialize.
Twitter users focus less on their social graph and
more on information broadcasting. Paradoxically,
Twitter’s limitations are its biggest strengths:
its home page is extremely simple compared
with other social media; text-based content and the
140-character limit encourage focusing on
crucial information; and the subscription-based
social graph allows asymmetric relationships
between users. In contrast to Facebook’s symmetric
relationships, where “friendship” is always bidirectional,
a Twitter user can follow another user
(i.e., subscribe to the other user’s tweets) without
being followed by her. The simplicity of Twitter’s
Web interface contrasts with the large set of rich
clients that extend the capabilities of the platform
through an intensive API usage. Metaphorically,
we can see Twitter as a government that builds
highways (i.e., Twitter servers and network bandwidth),
sets the regulations to use them (i.e., API
descriptions), and provides standard and simple
vehicles (i.e., Twitter Web page). Users can choose
to drive these standard vehicles or get others more
adapted to their needs (e.g., motorbikes, trucks,
etc.; or their equivalents in mobile Twitter clients,
blogging Twitter publishing buttons, etc.), as long
as they fulfil the regulations. This is a standard
feature of most social media platforms, but it is
more evident in Twitter because of the extreme
simplicity of its Web interface and the myriad of
third-party clients. As Cheng and Evans (2009)
found, in 2009 TweetDeck was the most popular
non-Twitter.com publishing tool with a 19.7%
market share, and more than half of Twitter users
(55%) used something other than Twitter.com. In
2010, as stated by Twitter’s CEO (Williams, 2010),
just 25 percent of the content was generated from
Twitter.com. That is, 75 percent of traffic came
from outside Twitter.com through the ecosystem
provided by public APIs.
Twitter is a good example of the success that
public APIs can reach, but other social media
(e.g., Facebook, YouTube, Flickr) are also seizing
the opportunity to offer content outside their platforms
and to enable the mixing of their contents: Facebook
users can share YouTube videos on their fan-pages
or walls, blog editors can embed slideshows from
Flickr and add social media links at the end of each
post to share it across the Social Web, and LinkedIn (a
business-oriented social network) users can include
Slideshare presentations in their resumes. Using
APIs is not exclusive to social media, though.
Wherever a dynamic map is needed, Google
provides it through the Google Maps API.
Social sciences e-researchers must take into
account the fact that most of the Social Web’s
data is being heavily used and reused. Accounting
for reused information should be done carefully. On
the one hand, researchers should avoid populating
their local databases with multiple copies of the
same information. On the other hand, information
reuse may imply relevance, and should be
analyzed in its own right. Suh, Hong, Pirolli, and Chi
(2010) studied the variables that predict Twitter
content reuse (“retweetability,” the likelihood
that posted content will be retweeted, i.e.,
forwarded). Agichtein, Castillo, Donato, Gionis,
and Mishne (2008) question the simple metrics
used when analyzing social media, and introduce
a general classification framework for combining
the evidence coming from different sources of
information, that can be tuned automatically for
a given social media type and quality definition.
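The two concerns about reused content, avoiding duplicate copies while still treating reuse as a signal, can be handled in a single pass over the collected data. The following Python sketch is one possible approach under stated assumptions, not a method taken from the studies cited above. The message layout (an `id`, a `text`, and an optional `retweet_of` key pointing to the original message) is hypothetical and does not reproduce the schema of any real API.

```python
from collections import Counter


def consolidate(messages):
    """Keep one copy of each original message and count how often it was reused.

    Reposts whose original is not in the collection only contribute to the
    counter; they do not appear in the returned list.
    """
    originals = {}
    reuse_counts = Counter()
    for msg in messages:
        source_id = msg.get("retweet_of")
        if source_id is None:
            originals[msg["id"]] = msg["text"]  # first-hand content: store once
        else:
            reuse_counts[source_id] += 1  # repost: count it, do not store it
    return [
        {"id": mid, "text": text, "reuse_count": reuse_counts[mid]}
        for mid, text in originals.items()
    ]


# Made-up sample data for illustration.
sample = [
    {"id": 1, "text": "New study on attention"},
    {"id": 2, "text": "RT New study on attention", "retweet_of": 1},
    {"id": 3, "text": "Call for participants"},
    {"id": 4, "text": "RT New study on attention", "retweet_of": 1},
]
records = consolidate(sample)
print(records)
```

Storing the reuse count next to each original keeps the local database free of duplicates while preserving the information needed to analyze reuse separately.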
The Social Web’s public APIs are not a panacea
for all problems related to content. Sometimes
they are inadequate, insufficient or too complex to
use. In those cases, e-researchers should look for
third-party APIs with extra functionality or a wider
range of provided information formats. GNIP.com
is probably the best example of a third-party API
provider, as it supplies its own APIs for more than
30 different social media (e.g., Facebook, Twitter,
YouTube, Flickr, WordPress). For instance, GNIP.com
provides several Twitter-related APIs with
interesting extra features (“GNIP Premium Twitter
feeds”), like access to 50 percent of all Twitter
content, delivered in real time (GNIP Twitter Halfhose),
a stream of all Twitter statuses containing
URLs, also delivered in real time (GNIP Twitter
Link Stream), or statuses that mention any given
user (GNIP Twitter User Mention Stream). The
main drawback of this kind of service is its price,
which is not suitable for low-budget research initiatives.
Even so, there are academic institutions that offer
third-party social media APIs for free, but under
some restrictions (e.g., limited date ranges, sample
sizes, storage quotas) due to the costs involved in
maintaining the service (Gaffney, Pearce, Darham,
& Nanis, 2010).
Finally, when no API is provided to access
relevant information from service providers,
e-researchers still have some alternatives. The
first one is to take advantage of “Web feeds”
(also known as “Web syndication”), if they are
available. Web applications use web feeds to
publish updated content in a standard way (i.e.,
using XML-based formats like RSS or Atom).
Newspapers are probably the best example to illustrate
the Web feed concept. In a newspaper’s
Web site, information is arranged by relevance and
topicality. If someone wants to be informed, she
should periodically access that Web site and look
for new content. As some news items can be hard to
find this way, newspapers offer a publicly available
time-ordered news list as a Web feed. Gathering
the new content of a Web site is extremely easy
when Web feeds are available. The main difference
between API-based queries and Web feeds is that
the former allow advanced queries, whereas
the latter are limited to the latest updates. Coming
back to the previous comparison between APIs
and restaurant delivery services, a Web feed would be like a
pizza delivery where customers cannot order the pizza
they want, only the last pizza that came out of
the oven. The second alternative when no API is
provided is a technique called “Web scraping.”
Using Web scraping techniques, a researcher can
convert human-readable data provided by a web
site and defined in HyperText Markup Language
(HTML) into raw data defined in an XML dialect
or other data formats, and subsequently store and
process it by means of a content analyzer. There
are many problems related to Web scraping: (a)
the HTML code of some Web sites is not easy to
parse to extract valuable data, (b) small changes
in a Web site’s HTML code can have a large impact on the
gathering process, and (c) extracting data from
Web sites and using it in third-party services can
sometimes violate the Terms of Use and content license
of the service provider.
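The difference between the two alternatives can be seen in a minimal Python sketch that uses only the standard library. The feed and the page below are made-up samples, and the `headline` class name is an assumption about how a hypothetical site marks up its stories. Reading the feed relies on the standard RSS structure, whereas the scraper depends entirely on the page's current HTML.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Made-up RSS 2.0 feed: a time-ordered list of the latest items.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example News</title>
  <item><title>Second story</title><link>http://news.example.org/2</link></item>
  <item><title>First story</title><link>http://news.example.org/1</link></item>
</channel></rss>"""

# Made-up HTML page carrying the same stories in human-readable form.
SAMPLE_PAGE = """<html><body>
  <div class="story"><h2 class="headline">Second story</h2><p>Body text</p></div>
  <div class="story"><h2 class="headline">First story</h2><p>Body text</p></div>
</body></html>"""


def feed_items(feed_xml):
    """Return (title, link) pairs for every item in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


class HeadlineScraper(HTMLParser):
    """Collect the text of every <h2 class="headline"> element."""

    def __init__(self):
        super().__init__()
        self.headlines = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        if tag == "h2" and ("class", "headline") in attrs:
            self._inside = True
            self.headlines.append("")

    def handle_data(self, data):
        if self._inside:
            self.headlines[-1] += data

    def handle_endtag(self, tag):
        if tag == "h2":
            self._inside = False


def scrape_headlines(page_html):
    """Extract headline texts from a page that follows the assumed markup."""
    scraper = HeadlineScraper()
    scraper.feed(page_html)
    return scraper.headlines


items = feed_items(SAMPLE_FEED)
headlines = scrape_headlines(SAMPLE_PAGE)
print(items)
print(headlines)
```

If the hypothetical site renamed the `headline` class, `scrape_headlines` would return an empty list without raising any error, which illustrates problem (b) above.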
In conclusion, there are many interesting fea-
tures of the Social Web that can be very useful for
e-research. Firstly, the Social Web is an excellent
platform to promote ongoing experiments and to
disclose findings, encouraging discussions about
them and reaching a large-scale pool of people
interested in participating in experiments. Sec-
ondly, the Social Web’s stress on data reuse allows
researchers to collect and process huge amounts
of original information, using social media public
APIs, third-party APIs, or even more complex
techniques like Web feeds or web scraping.
The Social Web as the Object
of E-Research in Psychology
Over the last decade, the Social Web has gone
through various stages (i.e., blogs, wikis, social
networks, location-based social networks). The
social component of the Web is a key factor to
understand why so many technologies are now
considered obsolete, and why the findings of
research studies conducted a few years ago no
longer apply to the current situation. Linden and
Fenn (2003) created a graphic representation of the
maturity, adoption and social application of specific
technologies to characterize the over-enthusiasm
or “hype” and subsequent disappointment that
typically happens with the introduction of new
technologies. As shown in Figure 1, a typical
“hype cycle” consists of five phases:
1. “Technology Trigger”: The new technology
is presented to the public with a proof of
concept, trying to achieve media coverage.
2. “Peak of Inflated Expectations”: Everyone
wants to use the new technology. Media or-
ganisations show it as the solution to every
problem. There may be some successful
applications of a technology, but there are
typically more failures.
3. “Trough of Disillusionment”: Over-enthusiasm
vanishes and the technology fails
to be successful in all proposed scenarios.
Media usually abandons the topic and the
technology.
4. “Slope of Enlightenment”: The technology
is improved and adapted to those specific
situations where it was successful (fewer than
expected during the “Peak of Inflated
Expectations,” but more than were discarded during
the “Trough of Disillusionment”).
5. “Plateau of Productivity”: The benefits of us-
ing the technology are widely demonstrated
and accepted. The technology becomes in-
creasingly stable and evolves in second and
third generations.
Blogging, like all technologies related to the
Social Web, passed through the different phases
of adoption, very similar to this “hype cycle.”
The number of blogs doubled consistently
every 6 months from 2003 until 2006 (see Figure
2), but needed 320 days to double in 2007 (Sifry,
2007). In 2008, there were 600,000 blog posts per
day, less than in 2007 (Winn, 2009). Did blog-
ging reach the “Peak of Inflated Expectations”
at the end of 2006? It is very likely that the success
of Facebook and Twitter contributed to this
decrease in blogging, but it can also be
related to the fact that blogs were presented as
the solution to all Web needs during the period
from 2003 to 2007 (e.g., commercial promotion,
e-learning, research, multimedia portfolios). The
over-enthusiasm was answered with the creation
of companies focused on blog analysis (e.g.,
Technorati, BlogPulse, Google Blog Search), but
some of them changed their target when blogging
became less popular. Nowadays, blogs are close
to the “Slope of Enlightenment,” as they are being
used for the specific purpose they were created
for. This crazy growth of expectations around the
Social Web was described by Engeström (2009) as
“butterfly flights,” as they fly higher and higher,
and suddenly descend to the floor. The same happens
to social media services when they lose
their users’ interest (e.g., Six Degrees, Friendster,
MySpace). Thus, exponential growth rates (Fisch,
2006) should be considered cautiously in the Social
Web, because they can be a sign of being at the
“Peak of Inflated Expectations.”
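To make the slowdown concrete, the doubling times quoted above can be converted into yearly growth factors. The sketch below assumes steady exponential growth between measurements and takes "6 months" as 182.5 days; both are simplifying assumptions, and the function names are illustrative.

```python
import math


def daily_growth_rate(doubling_days):
    """Continuous daily growth rate implied by a doubling time in days."""
    return math.log(2) / doubling_days


def growth_factor(doubling_days, period_days=365):
    """How many times larger the quantity becomes over period_days."""
    return 2 ** (period_days / doubling_days)


if __name__ == "__main__":
    # Doubling every 6 months (2003-2006) versus every 320 days (2007).
    for label, days in [("6 months", 182.5), ("320 days", 320)]:
        print(label, round(daily_growth_rate(days), 5), round(growth_factor(days), 2))
```

Under these assumptions, doubling every six months means a fourfold increase per year, while doubling every 320 days means only a little more than a twofold increase, so the yearly growth factor fell by almost half.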
E-research in psychology has dealt with blog-
ging by studying the causes and consequences of
being a blogger. Even though there are diverse
motivations for blogging (Nardi, Schiano, Gum-
brecht, & Swartz, 2004), some authors suggest
that it can be predicted from the big five person-
ality traits (Digman, 1990): People who score
high in both openness to new experience and
neuroticism are more likely to be bloggers
(Guadagno, Okdie, & Eno, 2007). Similarly, Hsu
and Lin (2008) proposed a model based on the
Theory of Reasoned Action (TRA) where ease of
use, enjoyment, and knowledge sharing (i.e., al-
truism and reputation) explain 78 percent of the
variance of being a blogger; and social factors
(e.g. community identification) and attitude to-
ward blogging explain 83 percent of the variance
Figure 1. Gartner’s hype cycle phases (adapted from Linden & Fenn, 2003)
of continuing to blog. In relation to the conse-
quences of being a blogger, Baker and Moore
(2008) found that intending bloggers were more
psychologically distressed and more likely to use
venting and self-blame to cope with their stress
than non-bloggers. Intending bloggers also scored
lower on measures of social provisions and were
less satisfied with their number of online and of-
fline friends when compared to non-bloggers.
Consistent with the similarities between blogging
and writing a diary, many of the benefits related
to writing a diary (Smyth, 1998) have been found
to also be present in blogging.
Despite the fact that wikis did not attract
the attention of many e-researchers in psychol-
ogy, Amichai–Hamburger, Lamdan, Madiel, and
Hayat (2008) studied personality characteristics
of Wikipedia members (i.e., “wikipedians”),
and found that Wikipedia members locate their
real me on the Internet more frequently than
non-Wikipedia members. They also found sig-
nificant differences in agreeableness, openness,
and conscientiousness, which were lower for the
Wikipedia members, and an interaction between
Wikipedia membership and gender: Introverted
women were more likely to be Wikipedia members
as compared with extroverted women.
There is no doubt that online social networks
have been hot topics for e-research in psychology.
MySpace was the first online social network to
reach an audience of more than 100 million people
(Adest, 2006), creating huge expectations for this
Figure 2. Weblogs cumulative (March 2003 - April 2006). Weblogs reached the “Peak of Inflated Ex-
pectations” at the end of 2006 (© 2007, Technorati.com. Used with permission).
new way to socialize. Thelwall (2008) studied
the motivation of MySpace members, finding
significant differences between male and female members.
Female members tended to be more interested
in friendship and males more interested in dat-
ing. As expected, female and younger members
had more friends than others, and both genders
seemed to prefer female friends, with this tendency
more marked in females for their closest friends.
Regarding privacy, females were more likely to
maintain private profiles. Due to the extensive
use of slang (i.e., not only related to the age of
the members, but also for being online and using
a social network with its own rules) there are
difficulties in parsing the messages shared on
MySpace (Thelwall, 2009). However, Thelwall,
Wilkinson, and Uppal (2010) found that two thirds
of the comments expressed positive emotion, and
that just a minority (20%) contained negative emo-
tion. Perhaps unsurprisingly, females were more
likely to give and receive positive comments than
males, but there was no difference for negative
comments. Torkjazi, Rejaie, and Willinger (2009)
documented the “hype cycle” of MySpace by analyzing
the number of members. The growth of allocated
user IDs in MySpace was exponential until 2007
(i.e., “Peak of Inflated Expectations”) followed
by a sudden and significant slow-down in 2008
(i.e., “Trough of Disillusionment”) motivated
by an increase in the popularity of Facebook.
Hargittai (2007) verified the struggle for new
members between Facebook and MySpace and
found significant differences according to their
socio-economic situation. Students whose parents
had lower levels of schooling were more likely to
be MySpace members, whereas students whose
parents had higher levels of education were more
likely to be Facebook members.
Facebook offers a reasonable trade-off between
standardization and customization for members’
profiles. Both characteristics are interesting from
a researcher perspective: Having standard features
makes profiles easily comparable, and customiza-
tions can be correlated with many other variables
(e.g., personality traits, mental disorders). Zhao,
Grasmuck, and Martin (2008) found differences
between anonymous online environments (e.g.,
Massively Multiplayer Online Role-Playing
Games, chat rooms, forums) and social networks
like Facebook, where most members use their real
identity. Facebook members claim their identities
implicitly (i.e., showing group and consumer
identities) rather than explicitly (i.e. talking about
themselves). Other studies agree that Facebook
profiles reflect actual personality rather than self-idealization,
and can be used to predict their owners’
personality, especially for extraversion, but not so
accurately for emotional stability (Gosling, Gad-
dis, & Vazire, 2007; Back, et al., 2010; Correa,
Hinsley, & Gil de Zúñiga, 2010). Rosen (2007)
studied the tendency to publicly trumpet one’s
online friendships, and characterized it as a narcis-
sistic quest for social status. Several authors came
to a similar conclusion. For example, Buffardi and
Campbell (2008) found that higher levels of social
activity and self-promoting content in Facebook
can be predicted through narcissistic personality
self-reports. Ellison, Steinfield, and Lampe (2007)
found a strong connection between Facebook use
and perceived social capital. Social networks help
to maintain relationships as people move from
one offline community to another (e.g., when
students graduate from high school or college).
Such connections could have strong payoffs in
terms of jobs, internships, and other opportuni-
ties, even in online environments. Lerman and
Galstyan (2008) found evidence of this by analyzing
the impact of the “social graph” in social news
Websites (e.g., Digg, Reddit, StumbleUpon): Us-
ers tend to like stories submitted by friends and
stories their friends read and liked.
Another important issue related to Facebook is
privacy. There is a tendency for social media users
to value privacy, security, and trust, but there are
still inconsistent concerns about them. For instance,
Acquisti and Gross (2006) found that social network
users are mildly concerned about who can
access their personal information and how it can
be used, but not concerned about the information
itself, mostly because they are the publishers of the
content shared on the social network, and because
they believe they have some control over access to it.
Moreover, there are many social motivators against
privacy when using social networks, like having fun
or allowing the social network to be a useful tool
by sharing enough information. Fogel and Nehmad
(2009) concluded that general privacy and identity
information disclosure concerns are more salient
to women than to men (e.g., greater percentages of
men than women display their phone numbers and
home addresses on social media). Social media
around “social objects” (Engeström, 2009) offers
a wider range of alternatives to deal with privacy.
Lange (2007) analyzed social relationships among
youth on YouTube, identifying various degrees of
“publicness” in video sharing. Considering the
anonymity and the access restriction as factors,
four combinations could happen (i.e., public
account with unrestricted content, anonymous
account with unrestricted content, public ac-
count with restricted content, and anonymous
account with restricted content). Lange noted
the use of two strategies to leverage anonymity
while sharing content: (a) “publicly private,” in
which video makers’ identities were revealed,
but content was relatively private because it was
not widely accessed, and (b) “privately public,”
where content was widely accessible, but detailed
information about video makers’ identities was
limited. However, anonymity and privacy are not
the same thing, and social media users should
realize that both are important.
As mentioned before, Twitter is not really
a social network, but an information network.
Perhaps the most interesting issue regarding Twit-
ter is related to the fact that users reshaped the
network, creating new ways of using it: (a) When
users needed a short way to answer a message,
they added an “@” to the username to mean it;
(b) “RT” or “retweet” was unofficially created
to express that the content is not original, but
forwarded from another user; and (c) as there was
no tagging system on Twitter, users started to
prepend a “#” to words to be considered as tags
by other users. Months later Twitter administra-
tors realized that these new codes were “de facto”
standards among the users and implemented them
as official features. Given Twitter’s fast
evolution in less than 5 years, studies published
in recent years should be considered within their
context, as their conclusions are not likely to apply
to current Twitter activity. For instance, Java and
Song (2007) concluded that the most common use
of Twitter was talking about daily routines. This may
have been true in 2007, but nowadays people use Twitter
for other purposes. Moreover, other particular
characteristics of Twitter, like its text-only
content, asymmetric relationships (e.g., @aplusk,
Ashton Kutcher’s account on Twitter, follows fewer than
a thousand users, but is followed by more than 6
million users), or similar functionality for mobile
users, make Twitter a great platform for e-research
in social sciences. Cha, Haddadi, Benevenuto, and
Gummadi (2010) analyzed influence in Twitter
and found that popular users who have a large
number of followers are not necessarily influential
in terms of spawning retweets or mentions (e.g.,
a tweet from Ashton Kutcher is not more likely
to be forwarded or mentioned just because of his
extraordinary number of followers). Furthermore,
a concerted effort (e.g., limiting tweets to a single
topic) seems to be the best way to gain influence
in Twitter. Conversely, Suh, Hong, Pirolli, and
Chi (2010) found that the number of followers
and “following” (i.e., followees), as well as the
age of the account, seem to affect influence in
Twitter in terms of “retweetability,” while, interest-
ingly, the number of past tweets does not predict
retweetability of a user’s tweet. It is also interesting
that URLs and hashtags have strong
relationships with retweetability, confirming the
shift of typical Twitter activity from daily routine
to information sharing. Perhaps some of the controversial
issues can be explained by examining the
topology of the social graphs. Mislove, Marcon,
Gummadi, Druschel, and Bhattacharjee (2007)
stated that online social networks have structures
that differ from other networks, in particular the
Web. Social networks have a much higher fraction
of symmetric links and also exhibit much higher
levels of local clustering.
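The two structural properties mentioned above, the fraction of symmetric links and the level of local clustering, can be computed from a list of "follows" relationships. The sketch below is a plain-Python illustration on a tiny hypothetical graph; the user names and function names are assumptions, and the clustering measure ignores link direction for simplicity.

```python
def reciprocity(edges):
    """Fraction of directed links whose reverse link also exists."""
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    mutual = sum(1 for (a, b) in edge_set if (b, a) in edge_set)
    return mutual / len(edge_set)


def local_clustering(edges, node):
    """Share of a node's neighbour pairs that are connected to each other."""
    neighbours = {}
    for a, b in edges:
        if a != b:
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    friends = sorted(neighbours.get(node, set()))
    k = len(friends)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pairs to compare
    linked = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if friends[j] in neighbours[friends[i]]
    )
    return linked / (k * (k - 1) / 2)


# Hypothetical "who follows whom" data: (follower, followed).
FOLLOWS = [
    ("ana", "ben"), ("ben", "ana"),
    ("ana", "cho"), ("cho", "ana"),
    ("ben", "cho"),
    ("dee", "ana"),
]

if __name__ == "__main__":
    print(reciprocity(FOLLOWS))
    print(local_clustering(FOLLOWS, "ana"))
```

In this toy graph four of the six links are reciprocated, and one of the three possible pairs among ana's neighbours is connected.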
Location-Based Social Networks (LBSN) go
further in real-time social media mobility
(e.g., Foursquare, Gowalla, Whrrl). As LBSN
users tend to be Twitter members, all of them
are tightly connected with it, providing multiple
ways to automatically publish their content on
Twitter. This can also be problematic. Humphreys,
Krishnamurthy, and Gill (2010) found that about a
quarter of tweets included information regarding
when people are engaging in activities and where
they are. Educating users about the ways in which
personal information can be used for alternative
purposes (i.e., related to users’ privacy, security or
even safety) is an important step in media literacy.
Given the aforementioned privacy and secu-
rity issues in social networking, several radical
approaches for social media are being tested. For
example, Diaspora is a distributed social network
that provides a decentralized alternative to services
like Facebook. The project is currently under
development by Grippi, Salzberg, Sofaer, and
Zhitomirskiy (2010), and works by letting users
set up their own server of the social network, or by
using a server of a trusted organization. Diaspora
servers interact to share status updates and other
social data. Being open source software, it can be
audited by security experts and checked for back-
doors or other privacy leaks. With a decentralized
schema, the members of an institution concerned
about privacy, security, and trust (e.g., the Department
of Defense of the United States of America)
can use the Diaspora server set up by their own
IT department, and still socialize with the rest of
the social network with no risk. Path.com offers
another new concept: The personal network. Each
Path member creates her own personal network
limited to her 50 closest friends (Morin, 2011).
This limit is based on “Dunbar’s number,”
a theoretical cognitive limit to the number of
people with whom one can maintain stable social
relationships (Dunbar, 1992). Although Dunbar’s
research suggested 150 as the maximum number
of social relationships, the network expands by
factors of roughly 3 (i.e., ~5 closest friends, ~20
people with regular contact, ~50 people considered
within the personal network, and ~150 stable social
relationships). It is still too soon to assess such
revolutionary approaches, but all of them suggest
that there is much work to do in the Social Web
in terms of privacy, security, and trust.
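As a quick check of the "factors of roughly 3" mentioned above, the ratios between the successive layer sizes quoted in the text can be computed directly. The variable names are illustrative, and the layer sizes are simply the approximate values given in the passage.

```python
# Approximate layer sizes as quoted in the text.
LAYERS = [5, 20, 50, 150]

# Ratio between each layer and the previous one.
RATIOS = [larger / smaller for smaller, larger in zip(LAYERS, LAYERS[1:])]

# Average multiplicative step from the innermost to the outermost layer.
MEAN_FACTOR = (LAYERS[-1] / LAYERS[0]) ** (1 / (len(LAYERS) - 1))

if __name__ == "__main__":
    print(RATIOS)
    print(round(MEAN_FACTOR, 2))
```

The individual steps vary between 2.5 and 4, and their geometric mean is slightly above 3, which is consistent with the approximate scaling described above.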
The Semantic Web
When Berners-Lee, designer of the World Wide
Web, described the Semantic Web with Hendler
and Lassila (Berners-Lee, Hendler, & Lassila,
2001), the evolution of the Web appeared to be
targeted towards a machine-readable World Wide
Web (i.e., using metadata to describe meaning-
fully the content of the Web), and not through the
socialization of the technologies involved (i.e.,
the Social Web). Initially, Berners-Lee underes-
timated the Web 2.0 phenomenon (Lanningham,
2006), considering that most of the alleged new
features were already present in his original World
Wide Web design (i.e., the “Read/Write Web”).
He dreamed about a new Web where machines
would be able to understand and work with the
data transferred on interactions between people or
other machines (i.e., textual or multimedia content,
web links, user interactions), and spare people
the tedious and repetitive procedures that could
be accomplished through machines talking to
other machines (Berners-Lee & Fischetti, 1999).
If so much data is already published on the Web,
why do we still need to compare or aggregate
it manually? For instance, a Semantic Web ap-
proach of “buying the cheapest flight from one
place to another” would provide the automatic
mechanisms to gather all the information from
diverse sources, understand and integrate it into
a semantic reasoner, and get the best offer among
all processed ones.
Although the Semantic Web is a much bigger
step compared with the Social Web, its current
development status is promising but limited. The
main reason is technological. Adapting a Web
designed for humans (i.e., full of ambiguous
and incomplete information, multimedia content
without textual transcription, or broken links) to
a machine-readable one is not a trivial task. The
Social Web is the Web of people. Social media
users generate, consume and share content. The
Semantic Web is the Web of data. Semantic data
provide enough meta-data to allow for automatic
processing. Old hyper-text formats are too limited
for this purpose, so new formats are needed. Se-
mantic web formats should be able to describe data,
define data properties, relationships among data,
data classes or models, and logic rules to process
data without human help. The World Wide Web
Consortium (W3C) launched the Semantic Web
Activity, where a large number of Working Groups
are developing and adapting the Semantic Web
standards (e.g., the RDF Working Group, which defines
the Resource Description Framework (RDF) format,
the SPARQL Working Group, which defines the SPARQL
Protocol and RDF Query Language (SPARQL)
format, and so forth).
Designing the standard formats to build the
Semantic Web is very important, but more actions
have to be taken towards achieving a machine-
readable World Wide Web. The next step should
be to start describing Web data using semantic
formats, creating ontologies that explain and
describe relationships between concepts (Chan-
drasekaran, Josephson, & Benjamins, 1999).
In the field of the Semantic Web, an ontology
is a formal representation of knowledge about
a specific domain. Within an ontology, concept
properties and relationships between them are
explicitly described using a restricted vocabulary
that allows automatic reasoning about them. Thus,
defining and using ontologies are key factors for
the Semantic Web’s success, but it is not an easy
task and experts are needed to supervise and correct
the ontology-defining process. Besides, in most
cases experts cannot formally describe their
domain of expertise and the formalization process
may lead to a loss of accuracy.
Ontologies are valuable tools used to process
large amounts of data. They are often confused
with taxonomies (hierarchical, expert-made)
or even folksonomies (non-hierarchical,
amateur-made; Vander Wal, 2004). Both taxonomies
and ontologies need experts to be precisely
defined, but while taxonomies are focused on
classifying, ontologies provide enough semantic
metadata to enable automatic reasoning about the
data. Perhaps resorting to a widely used example
is the easiest way to understand ontologies better.
There are different versions of the beer ontology,
but regardless of the semantic format used to describe
it, all of them are very similar. Figure 3 shows a
beer ontology (http://www.purl.org/net/ontology/beer.owl)
that describes almost everything related
to beer: types of beer, ingredients, regions where
beer is produced, awards and associations related
to brewing, beer festivals, and so on. Expert
knowledge about brewing is needed to create
the beer ontology, because it has to be able to
accurately describe every concept related to beer.
Combining the beer ontology with a semantic
reasoner and other Web services, a Semantic Web
service would be able to fulfil complex queries like
“show me the closest bars serving a pale ale beer
with caramel, ordered by distance and beer price.”
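The kind of query quoted above can be sketched with a handful of subject-predicate-object triples and a very small subclass reasoner. Everything below is hypothetical: the class names, beers, bars, coordinates, and prices are invented for illustration and are not taken from beer.owl, and a real system would more likely use RDF/OWL tooling and SPARQL than hand-written Python.

```python
import math

# Hypothetical triples (subject, predicate, object); not taken from beer.owl.
TRIPLES = {
    ("PaleAle", "subClassOf", "Ale"),
    ("Ale", "subClassOf", "Beer"),
    ("Stout", "subClassOf", "Ale"),
    ("AmberGlow", "type", "PaleAle"),
    ("AmberGlow", "hasIngredient", "CaramelMalt"),
    ("HopStorm", "type", "PaleAle"),
    ("HopStorm", "hasIngredient", "CascadeHops"),
    ("NightCap", "type", "Stout"),
    ("NightCap", "hasIngredient", "CaramelMalt"),
}

# Hypothetical bars: name -> (x in km, y in km, {beer: price}).
BARS = {
    "The Anchor": (1.0, 0.0, {"AmberGlow": 4.5, "NightCap": 5.0}),
    "Old Mill": (3.0, 4.0, {"AmberGlow": 4.0}),
    "Corner Tap": (0.5, 0.5, {"HopStorm": 3.5}),
}


def superclasses(cls):
    """Return cls plus every class reachable through subClassOf links."""
    found, frontier = {cls}, [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in TRIPLES:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def instances_of(cls):
    """Beers whose asserted type is cls or any subclass of cls."""
    return {s for s, p, o in TRIPLES if p == "type" and cls in superclasses(o)}


def bars_serving(cls, ingredient, here=(0.0, 0.0)):
    """Bars serving a matching beer, ordered by distance and then price."""
    wanted = {
        beer for beer in instances_of(cls)
        if (beer, "hasIngredient", ingredient) in TRIPLES
    }
    hits = []
    for bar, (x, y, menu) in BARS.items():
        for beer in wanted.intersection(menu):
            distance = math.hypot(x - here[0], y - here[1])
            hits.append((distance, menu[beer], bar, beer))
    return sorted(hits)


if __name__ == "__main__":
    for hit in bars_serving("PaleAle", "CaramelMalt"):
        print(hit)
```

The subclass step is what distinguishes this from a flat lookup: asking for "Ale" also returns pale ales and stouts, although no triple states that directly.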
Considering that data is the raw material of
scientific research, the Semantic Web (i.e., the
Web of data) is tightly related to e-research. Indeed,
Engelbrecht and Dror (2009) suggest that cogni-
tive psychology can contribute to the development
of ontologies for semantic technologies and the
Semantic Web in two different ways: (a) The ef-
ficiency with which activities that involve domain
experts (e.g. knowledge elicitation and ontology
authoring) are carried out and the utility of the
resulting ontologies can be improved by consid-
ering human information processing and its
limitations, and (b) the human cognitive system,
in general, and human knowledge representation,
in particular, can act as a model for the structure
of ontologies.
While it is essential to take into account the
cognitive abilities of experts to create ontologies,
it is also important to consider the opportunities
that the Semantic Web could provide. Bairoch
(2009) pictured the current situation: “It is quite
depressive to think that we are spending millions
in grants for people to perform experiments, pro-
duce new knowledge, hide this knowledge in an
often badly written text and then spend some more
millions trying to second guess what the authors
really did and found.” If all scientific knowledge
published in thousands of peer-reviewed journals
were stored using semantic formats, automatic
reasoning could be used to infer vast amounts
of new implicit knowledge, refuting established
models, completing preliminary studies, or fore-
seeing new fields of research. W3C’s Scientific
Publishing Task Force was created with this goal.
Actually, Aleman-Meza et al. (2006) went further
in reanalyzing published data when they proposed
a Semantic Web application that detects Conflict
of Interest (COI) relationships among potential
reviewers and authors of scientific papers.
W3C decided to encourage the use of Se-
mantic Web technologies for Health Care and
Life Sciences (with focus on biological science
and translational medicine), for many reasons:
(a) these domains have to process huge amounts
of complex (and not simplifiable) data, (b) there
is a high level of interaction in managed data
(e.g., interactions between molecules through
well-known processes generate new molecules
with different effects), (c) data sources are very
heterogeneous and diverse, (d) the benefits are
crucial for humanity. However, there are still some
bad practices related to “in silico” (i.e., computer-
based) experiments. Good and Wilkinson (2006)
criticized researchers who prefer to develop their
own, lower-quality technological solutions in
order to increase the number of publications and
citations, which is precisely the opposite of the
main goal of the Semantic Web. Having more and
better applications using Life Science Identifier
systems (LSID), Resource Description Framework
(RDF), Web Ontology Language (OWL), and
Semantic Web Services should discourage the
use of non-standard technologies.
What would be the psychological equivalent
of large genomic databases? There is no direct
equivalent, but as explained before, the Social
Web generates millions of single interactions
among social media users that could be semanti-
Figure 3. Partial view of a beer ontology (http://www.purl.org/net/ontology/beer.owl)
cally analyzed to extract opinions, emotions and
feelings and to infer new knowledge from them.
Opinion Mining (computer science) or Sentiment
Analysis (computational linguistics) are two
promising fields of research specializing in content
analysis (Pang & Lee, 2008). Some Web-focused
companies are currently developing semantic
parsers for social media content (e.g., Semiocast
provides semantic APIs for Facebook and Twit-
ter content), and many researchers are applying
data mining techniques to information shared
in social media. In two similar studies, Mislove
and colleagues (Mislove, et al., 2010; Mislove,
et al., 2010) created cartograms (i.e., maps where
geometry or space is distorted in order to convey
the information of a variable) based on the evolu-
tion of political topics on Twitter through time.
Sakaki, Okazaki, and Matsuo (2010) proposed
an algorithm to detect earthquakes in real time
by social sensors (i.e., social media activity). In
a similar way, Asur and Huberman (2010) found
significant correlations between box-office rev-
enues of movies and social media activity prior to
their public release. Finally, many other authors
(Specia & Motta, 2007; Van Damme, Hepp, &
Siorpaes, 2007) have worked on the integration
of the Social Web and the Semantic Web, trying
to take advantage of the best features of both
approaches.
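A very rough idea of how opinions can be extracted from social media messages is given by the lexicon-based sketch below. It is a deliberately naive illustration, not the method used in any of the studies cited above: the word lists are tiny and invented, and real sentiment analysis has to deal with negation, slang, and sarcasm.

```python
import re

# Tiny illustrative lexicons; real studies use much larger, validated lists.
POSITIVE = {"love", "great", "happy", "thanks", "awesome"}
NEGATIVE = {"hate", "bad", "sad", "awful", "angry"}


def sentiment(text):
    """Label one message as positive, negative, mixed, or neutral."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    if pos and neg:
        return "mixed"
    if pos:
        return "positive"
    if neg:
        return "negative"
    return "neutral"


def share_by_label(messages):
    """Proportion of messages that received each label."""
    if not messages:
        return {}
    counts = {}
    for message in messages:
        label = sentiment(message)
        counts[label] = counts.get(label, 0) + 1
    return {label: n / len(messages) for label, n in counts.items()}


if __name__ == "__main__":
    sample = [
        "I love this, thanks!",
        "so sad and angry",
        "great song but bad video",
        "see you at noon",
    ]
    print(share_by_label(sample))
```

Aggregated proportions of this kind could then be compared across groups or tracked over time, for example alongside the geographic metadata of each message.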
Despite being so promising, the Semantic
Web will not be widely available within the next
few years, due to the technological and human
resources involved. However, “Linked Data”
(Berners-Lee, 2006) is an attempt to progress
towards a more realistic application of the Semantic
Web, where a cut-down data model empowered
by the rich expressivity of new semantic standards
(especially a combination of RDF and OWL,
termed RDFS++) is used to define vocabularies
and instance data which are interlinked. Thus, a
global knowledge graph (see Figure 4) is being
enabled under the auspices of the Linking Open Data
initiative (Bizer, Heath & Berners-Lee, 2009),
linking and bringing together concepts and re-
lationships about different knowledge domains.
An interdisciplinary and global science could
arise from Open Data (Uhlir & Schroeder, 2007).
In the meantime, less ambitious approaches,
like microformats or “lowercase semantic web”—
also known as “decaffeinated Semantic-Web” or
“lower-s semantic web”—(Khare, 2006), can
provide semantic features to the Web through
simple semantic annotations (e.g., a non-semantic
Web service can offer semantic features using
microformats to express the language of each Web
page with a simple “lang” property: <html
lang="es">). These annotations are often invisible
to users but enable valuable third-party web ap-
plications and e-research initiatives. For example,
each message or “tweet” transferred via Twitter
contains not only the 140 characters sent by the
user to the social network, but also tens of meta-
data fields regarding dates, location, user prefer-
ences, scope of the message and so forth (indeed,
the text of the message represents only 5-10%
of the whole tweet, depending on the personal
settings of the sender). Reips and Garaizar (2011)
used geo-location related metadata of millions of
Twitter messages to create iScience Maps (http://
maps.iscience.deusto.es), a service that allows
researchers to assess via Twitter the effect of
specific events in different places as they are
happening and to make comparisons between
cities, regions, or countries and their evolution in
the course of an event.
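Both kinds of annotation mentioned above can be read programmatically with very little code. The sketch below extracts the "lang" property from a page and estimates what share of a tweet-like record is the message text itself. The record and its field names are hypothetical and do not reproduce Twitter's actual format.

```python
import json
from html.parser import HTMLParser


class LangReader(HTMLParser):
    """Remember the lang attribute of the first <html> tag seen."""

    def __init__(self):
        super().__init__()
        self.lang = None

    def handle_starttag(self, tag, attrs):
        if tag == "html" and self.lang is None:
            self.lang = dict(attrs).get("lang")


def page_language(html_text):
    """Return the declared language of a page, or None if not declared."""
    reader = LangReader()
    reader.feed(html_text)
    return reader.lang


# Hypothetical tweet-like record; the field names are assumptions.
TWEET = {
    "id": 123456789,
    "created_at": "2011-03-11 05:50:00",
    "text": "Did anyone else just feel that?",
    "lang": "en",
    "geo": {"lat": 43.27, "lon": -2.94},
    "source": "mobile",
    "user": {"name": "example_user", "followers": 120, "time_zone": "Madrid"},
}


def text_share(record, text_field="text"):
    """Fraction of the serialized record taken up by the message text."""
    whole = json.dumps(record, ensure_ascii=False)
    return len(record[text_field]) / len(whole)


if __name__ == "__main__":
    print(page_language('<html lang="es"><body>Hola</body></html>'))
    print(round(text_share(TWEET), 2))
```

Even in this small invented record most of the serialized content is metadata rather than text, and fields such as the coordinates are the kind of information that location-based analyses rely on.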
Web-based researchers should be aware of these
kinds of solutions and apply them when available.
New Paradigm Addressed
by Future Internet
The current Internet, with billions of users worldwide,
is a great success in terms of connecting
people and communities, but it was designed
in the 1970s for purposes quite different from
today’s heterogeneous needs and expectations.
The current Internet has grown beyond its origi-
nal expectations and beyond its original design
objectives. Many partial solutions have been
progressively developed and deployed to allow
the Internet to cope with the increasing demands
in terms of user connectivity and capacity. There
is a growing consensus among the scientific and
technical community that the methodology of
continuously “patching” the Internet technology
will not be able to sustain its continuous growth,
and to cope with it at an acceptable cost and speed.
The current Internet architecture is progressively
reaching a saturation point in meeting users’
increasing expectations and behaviors, and is
progressively showing an inability to respond efficiently
to new technological challenges (i.e., in
terms of security, scalability, mobility, availability,
and manageability) as well as to socio-economic
challenges.
Future Internet is a new term which sum-
marizes the efforts made by international asso-
ciations (e.g., GENI, AKARI, Future Internet) to
progress towards a better Internet, either through
(a) small, incremental evolutionary steps, or (b)
complete redesigns (clean slate) and architecture
principles. It should offer all users a secure, efficient,
trusted, and reliable environment that
allows open, dynamic, and decentralized access
to the network and adapts its performance to the
users’ needs and context.
Figure 5 illustrates how the four pillars of the Future
Internet, namely (a) the Internet by and for people, (b) the
Internet of contents and knowledge, (c) the Internet of
services, and (d) the Internet of things, rely on the Future
Internet networking infrastructure foundation (Gershenfeld,
Krikorian, & Cohen, 2004; Papadimitriou, 2009). All the
elements of the Future Internet (foundation and pillars)
are mutually dependent. New services and applications are
a prerequisite for investments in new infrastructure, since
infrastructure without the necessary capabilities cannot
support new services and applications (i.e., technology
pull). Conversely, new infrastructure technologies open
opportunities for new services and applications (i.e.,
technology push). Therefore, cooperation among all
stakeholders is required for a successful Future Internet.
Figure 4. Linking Open Data cloud diagram (© 2011, Richard Cyganiak and Anja Jentzsch: http://lod-cloud.net/. Used with permission)
Considering how all of these new approaches
redefine the relationship between users and technology,
their implications for e-research in the social
sciences are clear, not only because of the methodological
changes that will come, but also because
of the wide range of new scenarios and their
psychological and social implications.
THE WEB AS A PLATFORM
The most recent advances in the architecture of the
Web make it an excellent platform for deploying
social science experiments over the Internet. In this
section we discuss the specific implications of those
improvements for e-research in the social sciences,
on both the server and the client side.
Server-Side Technology: WOA,
REST, Cloud Computing
The Web as a platform defines its services using
Web Oriented Architecture (WOA), a design and
modelling methodology that extends Service
Oriented Architecture (SOA) to Web applications.
WOA represents information as resources
that are handled by user-agents (browsers)
and Web servers in a simple Representational
State Transfer (REST) style, that is, through
simple interfaces built on XML and the
Hypertext Transfer Protocol (HTTP). Regarding
the infrastructure, Cloud Computing provides a
platform to develop and deploy Web applications
offered as services, which can be consumed without
knowing how the underlying resources are used
or implemented.
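To make the REST idea more concrete, the following minimal sketch models the trial records of an online experiment as resources that are addressed by a path and manipulated through HTTP-style methods. Everything in it is an illustrative assumption rather than part of WOA or of any particular framework: the `/trials` path, the `Trial` fields, and the `handle` function are hypothetical, and the resource collection is a plain in-memory array instead of a real Web server.

```typescript
// Hypothetical resource representation: one trial of an online experiment.
interface Trial {
  id: number;
  participant: string;
  responseMs: number;
}

// Simplified HTTP-style request and response shapes (assumptions for this sketch).
interface RestRequest {
  method: "GET" | "POST" | "DELETE";
  path: string; // e.g. "/trials" or "/trials/1"
  body?: { participant: string; responseMs: number };
}

interface RestResponse {
  status: number; // standard HTTP status code
  body?: Trial | Trial[];
}

// In-memory stand-in for the server-side resource collection.
let trials: Trial[] = [];
let nextId = 1;

function findTrial(id: number): Trial | undefined {
  return trials.filter((t) => t.id === id)[0];
}

// Map (method, path) pairs onto operations over the "trials" resource.
function handle(req: RestRequest): RestResponse {
  const parts = req.path.split("/").filter((p) => p.length > 0);
  if (parts.length === 0 || parts.length > 2 || parts[0] !== "trials") {
    return { status: 404 }; // unknown resource
  }

  // Collection resource: /trials
  if (parts.length === 1) {
    if (req.method === "GET") return { status: 200, body: trials.slice() };
    if (req.method === "POST") {
      if (!req.body) return { status: 400 }; // nothing to create
      const created: Trial = {
        id: nextId,
        participant: req.body.participant,
        responseMs: req.body.responseMs,
      };
      nextId = nextId + 1;
      trials.push(created);
      return { status: 201, body: created }; // 201 Created
    }
    return { status: 405 }; // method not allowed on the collection
  }

  // Item resource: /trials/{id}
  const id = Number(parts[1]);
  const found = findTrial(id);
  if (!found) return { status: 404 };
  if (req.method === "GET") return { status: 200, body: found };
  if (req.method === "DELETE") {
    trials = trials.filter((t) => t.id !== id);
    return { status: 204 }; // 204 No Content
  }
  return { status: 405 };
}
```

In this sketch the client never needs to know how the trials are stored: it only sees resources, methods, and status codes, which is the uniform interface the REST style relies on.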
This set of technologies and design methodologies
fosters the vision of the Web as a platform to
develop and deploy all kinds of applications, including
social science e-research ones, empowering
them with services that are distributed, scalable
(i.e., able to avoid service degradation as demand
increases), and agnostic to the underlying technology.
Figure 5. Future Internet foundation and pillars (adapted from Gershenfeld, Krikorian, & Cohen, 2004)
Although Web 2.0 is more commonly
known as the Social Web, some researchers have
highlighted another important facet: it
transforms the Web into an application platform.
Numerous services and functions are now published
on the Web, as shown by the site programmableweb.com,
which lists thousands of services, most of them offering
easy-to-consume Representational State Transfer (REST)
interfaces (a well-known approach to exposing
functionality over HTTP), together with
smart combinations and aggregations of them
in the form of Web mash-ups.
Learning to use this publicly available (and most often
free) Web-accessible functionality is easy to begin with:
developers only need to understand the methods,
parameters, and results returned by each service’s
Application Programming Interface (API). As a result,
these services have given rise to remarkable innovation in
mash-up creation by the active Web developer
community.
Furthermore, the emergence of a new computing
paradigm, Cloud Computing, in which
data and services reside in highly scalable data centers
that can be accessed from any
Internet-connected device, addresses two key
requirements for reliably migrating application
functionality to the Internet: scalability and
ease of deployment. Cloud Computing can be
defined as a pool of abstracted, highly scalable,
and managed (supervised and controlled) compute
infrastructure that is capable of hosting end-customer
applications and is billed by consumption.
It offers a service hosting infrastructure for the
server-side deployment of Web applications through
easily consumed REST interfaces, and it is designed
to make additional computing and storage resources
available dynamically as demand for a given
Web application grows over time. Large companies
such as Google, Microsoft, and Amazon offer
Cloud Computing solutions on top
of which e-research applications could be
hosted easily and scalably.
Client-Side Technology: From
Browsers to Web Application Players
Thanks to the technological and social
developments of the last decade, the Web is progressively
becoming something that can be not only browsed, but also
used for specific purposes. It is becoming
less and less common to visit a Website just to
look at the news or to “jump” from one site to
another with no particular purpose in mind. Instead,
many traditional desktop applications are being
replaced by equivalent programs offered as Web
services (e.g., Gmail instead of the traditional email
client, YouTube instead of the multimedia player,
Google Docs instead of the classical word processor)
that are accessed through Web user-agents (i.e.,
Web browsers), which now resemble “application
players” more than simple Internet browsers.
As we have already discussed, part of this
evolution from static Websites to current Web applications
can be seen as the result of the natural
development of the technologies that support the
Web (e.g., Cloud Computing, WOA). However,
the evolution of Web user-agents has played an
important role too, providing an execution environment
increasingly similar to that of traditional
desktop applications. Many of these improvements
have been made possible by the Web Hypertext Application
Technology Working Group (WHATWG), an
initiative of the developers of the main Web user-agents
(i.e., Apple, Mozilla, and Opera) aimed at
extending and updating the hypertext markup
language used to build Websites (i.e., HTML) so
that it can support the main functionalities of most
applications, such as local storage, message exchange
between documents, drag-and-drop,
browsing history management, and immediate-mode 2D
drawing, among others. The efforts made by the
WHATWG, with the support of the W3C, gave
rise to the HTML5 specification, which includes
many of these improvements (Hickson, 2011).
Let us examine the most important implications
of HTML5 for e-research in the social sciences.
First, the “canvas” element for 2D drawing,
the drag-and-drop events, and timed multimedia
playback allow social science researchers to design
experiments that, until now, could
only be conducted with desktop applications. Given
that the Web is increasingly accessed from
mobile devices, being able to drag and drop elements
directly with the fingers or to synchronize the interaction
with multimedia elements serves several goals: (a) the
development of more attractive experiments, (b) the
possibility of interacting with the experiment almost anywhere
and in almost any circumstances, thanks to the use of Web
standards, and, most importantly, (c) the study of
psychological processes that can hardly be explored
with the limited interactivity of traditional PCs.
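As an illustration of how such interactions could be turned into data, the sketch below derives two measures from a single drag-and-drop trial: the latency between stimulus onset and the start of the drag, and the duration of the drag itself. It is a minimal sketch under stated assumptions: the `LoggedEvent` shape and the `summarizeDragTrial` function are hypothetical, and the timestamps are passed in as plain numbers. In a browser they would typically be taken from the `timeStamp` of the `dragstart` and `drop` events, a step that is not shown here.

```typescript
type LoggedKind = "stimulus" | "dragstart" | "drop";

// Hypothetical log entry for one interaction within a trial.
interface LoggedEvent {
  kind: LoggedKind;
  timeMs: number;  // milliseconds on a common clock for the whole trial
  target?: string; // identifier of the drop zone (only meaningful for "drop")
}

interface DragTrialSummary {
  startLatencyMs: number; // stimulus onset -> drag start
  dragDurationMs: number; // drag start -> drop
  dropTarget: string | null;
}

// Return the first logged event of the requested kind, if any.
function firstOfKind(log: LoggedEvent[], kind: LoggedKind): LoggedEvent | undefined {
  return log.filter((e) => e.kind === kind)[0];
}

// Summarize one trial; returns null when the log is incomplete or out of order.
function summarizeDragTrial(log: LoggedEvent[]): DragTrialSummary | null {
  const stimulus = firstOfKind(log, "stimulus");
  const dragStart = firstOfKind(log, "dragstart");
  const drop = firstOfKind(log, "drop");
  if (!stimulus || !dragStart || !drop) return null;
  if (dragStart.timeMs < stimulus.timeMs || drop.timeMs < dragStart.timeMs) return null;
  return {
    startLatencyMs: dragStart.timeMs - stimulus.timeMs,
    dragDurationMs: drop.timeMs - dragStart.timeMs,
    dropTarget: drop.target !== undefined ? drop.target : null,
  };
}
```

Keeping the measurement logic separate from the event handlers in this way means the same function can process logs collected on a desktop browser or on a touch device.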
HTML5 also offers interesting possibilities for
Internet-based surveys, as it provides a new API
for Web forms that makes it possible to edit documents, send
data from one form to another, or even manage the
browsing history. Both types of studies,
interactive and form-based, can benefit from the
local-storage functions offered by HTML5, so
that experiments do not depend on the users’ connectivity.
This can facilitate longitudinal studies
as well as naturalistic experiments conducted outside
the laboratory. An additional contribution of these
standards to the new Semantic Web is their support
for microdata, brief semantic labels
that can help to make inferences about the type of
experiment being conducted, its published results,
or the relationship between that experiment and
similar ones conducted by the same research team
or related people.
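The role of local storage in connectivity-independent studies can be sketched as a small response queue: each answer is first written to a key-value store and is transmitted only when a connection is available. This is a sketch under stated assumptions, not a prescribed design. The `KeyValueStore` interface mirrors the `getItem` and `setItem` methods of the browser’s `localStorage`, so that object could be passed in directly in a browser; the `MemoryStore` fallback, the `OfflineQueue` class, the storage key, and the synchronous `send` callback (a real transmission would be asynchronous) are hypothetical names and simplifications introduced for this example.

```typescript
// Subset of the Web Storage interface used here; the browser's localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in so that the sketch also runs outside a browser.
class MemoryStore implements KeyValueStore {
  private data: { [key: string]: string } = {};
  getItem(key: string): string | null {
    const value = this.data[key];
    return value === undefined ? null : value;
  }
  setItem(key: string, value: string): void {
    this.data[key] = value;
  }
}

// Hypothetical shape of one survey answer.
interface SurveyResponse {
  item: string;
  answer: string;
  answeredAt: number; // timestamp supplied by the caller
}

class OfflineQueue {
  private store: KeyValueStore;
  private key: string;

  constructor(store: KeyValueStore, key: string = "pendingResponses") {
    this.store = store;
    this.key = key;
  }

  // Read the responses that have not been transmitted yet.
  pending(): SurveyResponse[] {
    const raw = this.store.getItem(this.key);
    return raw === null ? [] : (JSON.parse(raw) as SurveyResponse[]);
  }

  // Persist the response locally before any attempt to transmit it.
  add(response: SurveyResponse): void {
    const all = this.pending();
    all.push(response);
    this.store.setItem(this.key, JSON.stringify(all));
  }

  // Try to send every pending response; keep the ones that fail for a later retry.
  flush(send: (response: SurveyResponse) => boolean): number {
    const failed: SurveyResponse[] = [];
    let sent = 0;
    this.pending().forEach((response) => {
      if (send(response)) {
        sent = sent + 1;
      } else {
        failed.push(response);
      }
    });
    this.store.setItem(this.key, JSON.stringify(failed));
    return sent;
  }
}
```

Because responses survive in the store until `flush` succeeds, a participant in a longitudinal or field study could keep answering while offline and the data would be transmitted at the next opportunity.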
Beyond these new possibilities, both the
WHATWG and the W3C continue working on new
specifications that allow Web user-agents to better
compete with desktop applications. For instance,
Web workers have been designed to run
background tasks (e.g., complex calculations).
Similarly, Web storage, Web sockets, and server-sent
events can be used for a better integration of
Web applications. The Geolocation API makes it possible to
place any Web interaction on a map
automatically. Finally, other data formats can
be embedded in HTML by means of SVG (vector
graphics) or MathML (mathematical equations). The
implications of all these tools for e-research are
promising, not only because of the specific contributions
made by each of them, but also because of
the more general transition from a hypertext- and
form-based Web to a different Web in which almost
everything that can be done on a desktop is possible and
the Web user-agents provide all the functions needed to
simplify this task.
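The idea of placing a Web interaction on a map can be sketched as follows. The `PositionLike` interface reproduces only the fields of the position object that the Geolocation API passes to its success callback (`coords.latitude`, `coords.longitude`, `coords.accuracy`, and `timestamp`). The rest is an assumption made for illustration: the map is taken to be an equirectangular world image, and `projectToMap`, `locateInteraction`, and the record fields are hypothetical names.

```typescript
// Fields of the Geolocation API position object that this sketch relies on.
interface PositionLike {
  coords: { latitude: number; longitude: number; accuracy: number };
  timestamp: number;
}

interface MapPoint {
  x: number; // pixels from the left edge of the map image
  y: number; // pixels from the top edge of the map image
}

// Hypothetical record linking one interaction to a point on the map.
interface LocatedInteraction {
  label: string;
  recordedAt: number;
  accuracyMeters: number;
  point: MapPoint;
}

// Equirectangular projection (an assumption about the map image):
// longitude maps linearly to x, latitude maps linearly to y.
function projectToMap(latitude: number, longitude: number, widthPx: number, heightPx: number): MapPoint {
  return {
    x: ((longitude + 180) / 360) * widthPx,
    y: ((90 - latitude) / 180) * heightPx,
  };
}

// Attach a map position to an interaction, given a position reported by the user-agent.
function locateInteraction(label: string, position: PositionLike, widthPx: number, heightPx: number): LocatedInteraction {
  return {
    label: label,
    recordedAt: position.timestamp,
    accuracyMeters: position.coords.accuracy,
    point: projectToMap(position.coords.latitude, position.coords.longitude, widthPx, heightPx),
  };
}
```

In a browser, the position would be obtained with `navigator.geolocation.getCurrentPosition`, whose success callback receives an object of this shape once the participant has granted permission.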
CONCLUSION
Throughout this chapter, we have reviewed the different
approaches to the Web as a platform for the development of
social science experiments,
from simple standalone Web experiments to Cloud
Computing based services accessed by Web application
players. Researchers can not only run their
traditional experiments on the Internet to obtain larger
and more representative samples, as has become usual
during the last decade, but also benefit from
the new Web technologies and the increasingly social
character of the Web.
The Social Web can be used as a research platform
to collect data for testing psychological
or sociological theories on how people interact,
reason, and feel in different settings. Likewise,
recent developments in Cloud Computing and in the
design of Web user-agents will allow researchers to conduct
more interactive and realistic experiments on the
Internet. Moreover, the way people interact and behave
on the Internet may not be a mere reproduction
of offline social behaviour. The peculiarities of
this medium and the impact it might have on behaviour,
as well as the personal factors that influence
its use, are becoming important issues in current
social science research. Psychological research can
also make an important contribution by endowing
people with the cognitive skills necessary to grasp
knowledge in this vast universe of information, while
protecting their privacy.
Although our discussion has focused mainly on
our own area of expertise, experimental psychology,
other social sciences, such as marketing, sociology,
political science, economics, education, and anthropology,
should benefit to a similar extent from these new
technologies. In the coming years, the combined
efforts of researchers in these disciplines and computer
scientists will surely provide new and intriguing insights
into human behaviour.
ACKNOWLEDGMENT
Support for this research was provided by Grants
IT363-10 and IT458-10 from the Education,
Universities, and Research Department of the
Basque Government.
REFERENCES
Agichtein, E., Castillo, C., Donato, D., Gionis,
A., & Mishne, G. (2008). Finding high-quality
content in social media. Paper presented at the
ACM Web Search and Data Mining Conference.
Palo Alto, CA.
Aleman-Meza, B., Nagarajan, M., Ramakrish-
nan, C., Ding, L., Kolari, P., Sheth, A. P., et al.
(2006). Semantic analytics on social networks:
Experiences in addressing the problem of conflict
of interest detection. In Proceedings of the 15th
International Conference on World Wide Web, (pp.
407-416). New York, NY: ACM Press.
Anderson, C. (2006). The long tail: How endless
choice is creating unlimited demand. London, UK:
Random House Business Books.
Asur, S., & Huberman, B. A. (2010). Predicting
the future with social media. Retrieved from
http://www.hpl.hp.com/research/scl/papers/socialmedia/socialmedia.pdf.
Back, M. D., Stopfer, J. M., Vazire, S., Gaddis,
S., Schmukle, S. C., Egloff, B., & Gosling, S. D.
(2010). Facebook profiles reflect actual personal-
ity, not self-idealization. Psychological Science,
21, 372–374. doi:10.1177/0956797609360756
Bairoch, A. (2009). The future of annotation/
biocuration. Paper presented at the 3rd Biocura-
tion Conference. Berlin, Germany.
Baker, J. R., & Moore, S. M. (2008). Distress,
coping, and blogging: Comparing new MySpace
users by their intention to blog. Cyberpsy-
chology & Behavior, 11, 81–85. doi:10.1089/
cpb.2007.9930
Bar-Anan, Y., De Houwer, J., & Nosek, B. A.
(2010). Evaluative conditioning and conscious
knowledge of contingencies: A correlational in-
vestigation with large samples. Quarterly Journal
of Experimental Psychology, 63, 2313–2335.
doi:10.1080/17470211003802442
Berners-Lee, T. (2006). Linked data. International
Journal on Semantic Web and Information Sys-
tems, 4(1), W3C.
Berners-Lee, T., & Cailliau, R. (1990). WorldWideWeb:
Proposal for a hypertext project. Retrieved
from http://www.w3.org/Proposal.html.
Berners-Lee, T., & Fischetti, M. (1999). Weaving
the Web. San Francisco, CA: Harper.
Berners-Lee, T., Hendler, J., & Lassila, O. (2001).
The semantic Web. Scientific American, 284,
35–43. doi:10.1038/scientificamerican0501-34
Birnbaum, M. H. (1999). Testing critical proper-
ties of decision making on the Internet. Psycho-
logical Science, 10, 399–407. doi:10.1111/1467-
9280.00176
Birnbaum, M. H. (2000). SurveyWiz and Fac-
torWiz: JavaScript web pages that make HTML
forms for researchers on the Internet. Behavior
Research Methods, Instruments, & Computers,
32, 339–346. doi:10.3758/BF03207804
Birnbaum, M. H. (2004). Human research and
data collection via the Internet. Annual Review of
Psychology, 55, 803–832. doi:10.1146/annurev.
psych.55.090902.141601
Birnbaum, M. H., & Wakcher, S. V. (2002). Web-
based experiments controlled by JavaScript: An
example from probability learning. Behavior
Research Methods, Instruments, & Computers,
34, 189–199. doi:10.3758/BF03195442
Bizer, C., Heath, T., & Berners-Lee, T. (2009).
Linked data - The story so far. Journal on Se-
mantic Web and Information Systems, 5(3), 1–22.
doi:10.4018/jswis.2009081901
Botella, C., Gallego, M. J., Garcia-Palacios, A.,
Baños, R. M., Quero, S., & Alcañiz, M. (2009).
The acceptability of an Internet-based self-help
treatment for fear of public speaking. British Jour-
nal of Guidance & Counselling, 37(3), 297–311.
doi:10.1080/03069880902957023
Botella, C., Gallego, M. J., García-Palacios, A.,
Baños, R. M., Quero, S., & Guillen, V. (2008a). An
Internet-based self-help program for the treatment
of fear of public speaking: A case study. Journal of
Technology in Human Services, 26(2-4), 182–202.
doi:10.1080/15228830802094775
Botella, C., Quero, S., Baños, R. M., García-Palacios,
A., Bretón-López, J., Alcañiz, M., & Fabregat, S.
(2008b). Telepsychology and self-help: The treatment
of phobias using the Internet. Cyberpsychology & Be-
havior, 11(6), 659–664. doi:10.1089/cpb.2008.0012
Buchanan, T., & Smith, J. L. (1999). Using the Inter-
net for psychological research: Personality testing on
the World Wide Web. The British Journal of Psychol-
ogy, 90, 125–144. doi:10.1348/000712699161189
Buffardi, L. E., & Campbell, W. K. (2008). Narcis-
sism and social networking web sites. Personality
and Social Psychology Bulletin, 34, 1303–1314.
doi:10.1177/0146167208320061
Cha, M., Haddadi, H., Benevenuto, F., & Gummadi,
K. P. (2010). Measuring user influence in Twitter:
The million follower fallacy. Paper presented at the
4th International AAAI Conference on Weblogs and
Social Media. Washington, DC.
Chandrasekaran, B., Josephson, J. R., & Benjamins,
V. R. (1999). What are ontologies, and why do we
need them? IEEE Intelligent Systems, 14, 20–26.
doi:10.1109/5254.747902
Dandurand, F., Shultz, T. R., & Onishi, K. H. (2008).
Comparing online and lab methods in a problem-
solving experiment. Behavior Research Methods,
40, 428–434. doi:10.3758/BRM.40.2.428
DiNucci, D. (1999). Fragmented future. Print,
53(4), 32.
Ellison, N. B., Steinfield, C., & Lampe, C. (2007).
The benefits of Facebook “friends”: Social capital and
college students’ use of online social network sites.
Journal of Computer-Mediated Communication, 12,
1143–1168. doi:10.1111/j.1083-6101.2007.00367.x
Engelbrecht, P. C., & Dror, E. I. (2009). How psy-
chology and cognition can inform the creation of
ontologies in semantic technologies. In Y. Kiyoki,
T. Tokuda, H. Jaakkola, X. Chen, & N. Yoshida
(Eds.), Proceeding of the 2009 Conference on
Information Modeling and Knowledge Bases
XX, (pp. 340-347). Amsterdam, The Netherlands:
IOS Press.
Engeström, J. (2009). Building sites around social
objects. Paper presented at the Web 2.0 Expo. San
Francisco, CA.
Fisch, K. (2006). Did you know? Retrieved from
http://thefischbowl.blogspot.com/2006/08/did-you-know.html.
Fogel, J., & Nehmad, E. (2009). Internet social
network communities: Risk taking, trust, and
privacy concerns. Computers in Human Behavior,
25, 153–160. doi:10.1016/j.chb.2008.08.006
Gaffney, D., Pearce, I., Darham, M., & Nanis, M.
(2010). Presenting 140Kit: An open, extensible
research platform for Twitter. Retrieved from
http://www.webecologyproject.org/2010/07/presenting-140kit/.
Gardner, H. (1985). The mind’s new science: A
history of the cognitive revolution. New York, NY:
Basic Books.
Gershenfeld, N., Krikorian, R., & Cohen, D.
(2004). The Internet of things. Scientific Ameri-
can, 291, 76–81. doi:10.1038/scientificameri-
can1004-76
Good, B. M., & Wilkinson, M. D. (2006). The life
sciences semantic web is full of creeps! Briefings
in Bioinformatics, 7, 275–286. doi:10.1093/bib/
bbl025
Gosling, S. D., Gaddis, S., & Vazire, S. (2007).
Personality impressions based on Facebook
profiles. Paper presented at the International
Conference on Weblogs and Social Media.
Boulder, CO.
Hargittai, E. (2007). Whose space? Differences
among users and non-users of social network
sites. Journal of Computer-Mediated Com-
munication, 13, 276–297. doi:10.1111/j.1083-
6101.2007.00396.x
Hickson, I. (2011). HTML5: A vocabulary and
associated APIs for HTML and XHTML. W3C
Editor’s Draft 16 February 2011. Retrieved from
http://dev.w3.org/html5/spec/Overview.html.
Hoschka, P. (1998). CSCW research at GMD-FIT:
From basic groupware to the Social Web. ACM
SIGGROUP Bulletin, 19, 5–9.
Hsu, C. L., & Lin, J. C. (2008). Acceptance of
blog usage: The roles of technology acceptance,
social influence, and knowledge sharing moti-
vation. Information & Management, 45, 65–74.
doi:10.1016/j.im.2007.11.001
Humphreys, L. M., Krishnamurthy, B., & Gill, P.
(2010). How much is too much? Privacy issues
on Twitter. Paper presented at the Conference of
the International Communication Association.
Singapore.
Java, A., & Song, X. (2007). Why we Twitter:
Understanding microblogging usage and com-
munities. In Proceedings of the Joint 9th WEB-
KDD and 1st SNA-KDD Workshop, (pp. 56-65).
Baltimore, MD: WEBKDD Press.
Khare, R. (2006). Microformats: The next (small)
thing on the semantic Web? IEEE Internet Comput-
ing, 10, 68–75. doi:10.1109/MIC.2006.13
Kwak, H., Lee, C., Park, H., & Moon, S. (2010).
What is twitter, a social network or a news media?
In Proceedings of the 19th International confer-
ence on World Wide Web, (pp. 591–600). ACM.
Lahl, O., Göritz, A. S., Pietrowsky, R., & Rosen-
berg, J. (2009). Using the world-wide-web to
obtain large-scale word norms: 190,212 ratings on
a set of 2,654 German nouns. Behavior Research
Methods, 41, 13–19. doi:10.3758/BRM.41.1.13
Lange, P. G. (2007). Publicly private and privately
public: Social networking on YouTube. Journal
of Computer-Mediated Communication, 13,
361–380. doi:10.1111/j.1083-6101.2007.00400.x
Lanningham, S. (2006). DeveloperWorks interviews:
Tim Berners-Lee. Retrieved from
http://www.ibm.com/developerworks/podcast/dwi/cm-int082206txt.html.
Lerman, K. (2006). Social networks and social
information filtering on Digg. Retrieved from
http://arxiv.org/PS_cache/cs/pdf/0612/0612046v1.pdf.
Lerman, K., & Galstyan, A. (2008). Analysis of
social voting patterns on Digg. Paper presented
at WOSN 2008. Seattle, WA.
Linden, A., & Fenn, J. (2003). Understanding
Gartner’s hype cycles. Strategic Analysis Report
R-20-1971. New York, NY: Gartner Research.
Mangan, M. A., & Reips, U.-D. (2007). Sleep, sex,
and the Web: Surveying the difficult-to-reach clini-
cal population suffering from sexsomnia. Behavior
Research Methods, 39, 233–236. doi:10.3758/
BF03193152
Matute, H., Vadillo, M. A., & Bárcena, R. (2007).
Web-based experiment control software for re-
search and teaching on human learning. Behavior
Research Methods, 39, 689–693. doi:10.3758/
BF03193041
Matute, H., Vadillo, M. A., Vegas, S., & Blanco,
F. (2007). Illusion of control in internet users and
college students. Cyberpsychology & Behavior,
10(2), 176–181. doi:10.1089/cpb.2006.9971
Matute, H., Vegas, S., & Pineño, O. (2002). Uti-
lización de un videojuego para estudiar cómo
interfiere lo nuevo que aprendemos sobre lo que
ya sabíamos. Paper presented at 1er Congreso
Online del Observatorio para la CiberSociedad.
Retrieved from
http://www.cibersociedad.net/congreso/comms/g10matute-el-al2.htm.
McGraw, K. O., Tew, M. D., & Williams, J. E.
(2000). The integrity of web-delivered experi-
ments: Can you trust the data? Psychological Sci-
ence, 11, 502–506. doi:10.1111/1467-9280.00296
Mislove, A., Lehmann, S., Ahn, Y., Lazer, D.,
Lin, Y., Onnela, J., & Rosenquist, J. N. (2010).
Mapping the conversation: Political topics and
geography on Twitter. Retrieved from
http://election.ccs.neu.edu/.
Mislove, A., Lehmann, S., Ahn, Y., Onnela, J., &
Rosenquist, J. N. (2010). Pulse of the nation: U.S.
mood throughout the day inferred from Twitter.
Retrieved from http://www.ccs.neu.edu/home/amislove/twittermood/.
Mislove, A., Marcon, M., Gummadi, K. P., Drus-
chel, P., & Bhattacharjee, B. (2007). Measurement
and analysis of online social networks. Paper
presented at IMC 2007. San Diego, CA.
Newell, A., & Simon, H. (1972). Human problem
solving. Englewood Cliffs, NJ: Prentice Hall.
Nosek, B. A. (2005). Moderators of the relationship
between implicit and explicit evaluation. Journal
of Experimental Psychology, 134, 565–584.
O’Reilly, T. (2005). What is Web 2.0: Design
patterns and business models for the next genera-
tion of software. International Journal of Digital
Economics, 65, 17–37.
Pang, B., & Lee, L. (2008). Opinion min-
ing and sentiment analysis. Foundations and
Trends in Information Retrieval, 2, 1–135.
doi:10.1561/1500000011
Papadimitriou, D. (2009, August 1st). Future In-
ternet: The cross-ETP vision document. European
Technology Platform.
Ratliff, K. A., & Nosek, B. A. (2010). Creating dis-
tinct implicit and explicit attitudes with an illusory
correlation paradigm. Journal of Experimental
Social Psychology, 46, 721–728. doi:10.1016/j.
jesp.2010.04.011
Reips, U. D. (2002). Standards for Internet-based
experimenting. Experimental Psychology, 49,
243–256. doi:10.1026//1618-3169.49.4.243
Reips, U.-D., & Garaizar, P. (2011). Mining Twit-
ter: Microblogging as a source for psychological
wisdom of the crowds. Behavior Research Meth-
ods, 43, 635–642. doi:10.3758/s13428-011-0116-6
Reips, U. D., & Neuhaus, C. (2002). WEXTOR:
A Web-based tool for generating and visualizing
experimental designs and procedures. Behavior
Research Methods, Instruments, & Computers,
34, 234–240. doi:10.3758/BF03195449
Rosen, C. (2007). Virtual friendship and the new
narcissism. New Atlantis (Washington, D.C.), 17,
15–31.
Sakaki, T., Okazaki, M., & Matsuo, Y. (2010).
Earthquake shakes Twitter users: Real-time event
detection by social sensors. In Proceedings of the
18th International World Wide Web Conference.
New York, NY: ACM.
Schmidt, W. C. (1997a). World-wide web sur-
vey research: Benefits, potential problems, and
solutions. Behavior Research Methods, Instru-
ments, & Computers, 29, 274–279. doi:10.3758/
BF03204826
Schmidt, W. C. (1997b). World-wide web survey
research made easy with www survey assistant.
Behavior Research Methods, Instruments, & Com-
puters, 29, 303–304. doi:10.3758/BF03204832
Shneiderman, B. (2008). Science 2.0. Science,
319, 1349–1350. doi:10.1126/science.1153539
Specia, L., & Motta, E. (2007). Integrating folk-
sonomies with the semantic web. In Proceedings
of the European Semantic Web Conference (ESWC
2007). Innsbruck, Austria: Springer.
Steyvers, M., Tenenbaum, J. B., Wagenmakers, E.-J.,
& Blum, B. (2003). Inferring causal networks from
observations and interventions. Cognitive Science,
27, 453–489. doi:10.1207/s15516709cog2703_6
Suh, B., Hong, L., Pirolli, P., & Chi, E. H.
(2010). Want to be retweeted? Large scale ana-
lytics on factors impacting retweet in Twitter
network. Paper presented at the Second IEEE
International Conference on Social Computing
(SocialCom 2010). Minneapolis, MN.
Thelwall, M. (2008). Social networks, gender
and friending: An analysis of MySpace mem-
ber profiles. Journal of the American Society
for Information Science and Technology, 59,
1321–1330. doi:10.1002/asi.20835
Thelwall, M. (2009). MySpace comments.
Online Information Review, 33, 58–76.
doi:10.1108/14684520910944391
Thelwall, M., Wilkinson, D., & Uppal, S.
(2010). Data mining emotion in social net-
work communication: Gender differences in
MySpace. Journal of the American Society
for Information Science and Technology, 61,
190–199.
Torkjazi, M., Rejaie, R., & Willinger, W. (2009).
Hot today, gone tomorrow: On the migration of
MySpace users. Paper presented at the WOSN
2009. Barcelona, Spain.
Uhlir, P. F., & Schroeder, P. (2007). Open data
for global science. Data Science Journal, 6,
36–53. doi:10.2481/dsj.6.OD36
Vadillo, M. A., Bárcena, R., & Matute, H.
(2006). The internet as a research tool in the
study of associative learning: An example
from overshadowing. Behavioural Processes,
73, 36–40. doi:10.1016/j.beproc.2006.01.014
Vadillo, M. A., & Matute, H. (2009). Learning
in virtual environments: Some discrepancies
between laboratory- and Internet-based re-
search on associative learning. Computers in
Human Behavior, 25, 402–406. doi:10.1016/j.
chb.2008.08.009
Vadillo, M. A., & Matute, H. (2011). Further
evidence on the validity of web-based research
on associative learning: Augmentation in a
predictive learning task. Computers in Human
Behavior, 27, 750–754. doi:10.1016/j.chb.2010.10.020
Van Damme, C., Hepp, M., & Siorpaes, K.
(2007). Folksontology: An integrated approach
for turning folksonomies into ontologies. Paper
presented at the 4th European Semantic Web
Conference. Innsbruck, Austria.
Vander Wal, T. (2004). Folksonomy. Retrieved
from http://vanderwal.net/folksonomy.html.
Vernberg, D., Snyder, C. R., & Schuh, M. (2005).
Preliminary validation of a hope scale for a
rare health condition using web-based meth-
odology. Cognition and Emotion, 19, 601–610.
doi:10.1080/02699930441000256
Williams, E. (2010). Keynote. Paper presented
at Chirp, the official Twitter Developer Confer-
ence. San Francisco, CA.
Wolfram, S. (2002). A new kind of science.
Champaign, IL: Wolfram Media.
ADDITIONAL READING
Almeida, A., Orduña, P., Castillejo, E., Lopez-
de-Ipiña, D., & Sacristan, M. (2011). Imhotep:
An approach to user and device conscious
mobile applications. Personal and Ubiquitous
Computing, 15, 419–429. doi:10.1007/s00779-
010-0359-8
Anderson, C. A., & Bushman, B. J. (2001).
Effects of violent video games on aggressive
behavior, aggressive cognition, aggressive
affect, physiological arousal, and prosocial be-
haviour: A meta-analytic review of the scientific
literature. Psychological Science, 12, 353–359.
doi:10.1111/1467-9280.00366
Back, M. D., Stopfer, J. M., Vazire, S., Gaddis,
S., Schmukle, S. C., Egloff, B., & Gosling, S. D.
(2010). Facebook profiles reflect actual personality,
not self-idealization. Psychological Science, 21,
372–374. doi:10.1177/0956797609360756
Balicer, R. D. (2007). Modeling infectious diseases
dissemination through online role-playing games.
Epidemiology (Cambridge, Mass.), 18, 260–261.
doi:10.1097/01.ede.0000254692.80550.60
Bargh, J. A., & McKenna, K. Y. A. (2004). The
Internet and social life. Annual Review of Psychology,
55, 573–590. doi:10.1146/annurev.psych.55.090902.141922
Birnbaum, M. H. (Ed.). (2000). Psychological
experiments on the Internet. San Diego, CA: Aca-
demic Press.
Birnbaum, M. H. (2004). Human research and
data collection via the Internet. Annual Review of
Psychology, 55, 803–832. doi:10.1146/annurev.
psych.55.090902.141601
boyd, D. M., & Ellison, N. B. (2008). Social network
sites: Definition, history, and scholarship. Journal of
Computer-Mediated Communication, 13, 210-230.
Garcia-Zubia, J., Orduña, P., Lopez-de-Ipiña, D.,
& Alves, G. R. (2009). Addressing software impact
in the design of remote laboratories. IEEE Trans-
actions on Industrial Electronics, 56, 4757–4767.
doi:10.1109/TIE.2009.2026368
Giles, J. (2011). Social-bots infiltrate Twitter and
trick human users. New Scientist, 209(2804), 28.
doi:10.1016/S0262-4079(11)60614-3
Gómez-Goiri, A., Emaldi-Manrique, M., & López-de-Ipiña,
D. (2011). A semantic resource-oriented
middleware for pervasive environments. CEPIS
UPGRADE, 12, 6–16.
Gómez-Goiri, A., & López-de-Ipiña, D. (2011). On the
complementarity of triple spaces and the web of things.
In Proceedings of the 2nd International Workshop on
the Web of Things (WoT 2011), (pp. 12:1-12:6). ACM.
Killingsworth, M. A., & Gilbert, D. T. (2010). A
wandering mind is an unhappy mind. Science,
330, 932. doi:10.1126/science.1192439
Kraut, R., Kiesler, S., Boneva, B., Cummings,
J., Helgeson, V., & Crawford, A. (2002).
Internet paradox revisited. The Journal of
Social Issues, 58, 49–74. doi:10.1111/1540-
4560.00248
Kraut, R., Patterson, M., Lundmark, V.,
Kiesler, S., Mukopadhyay, T., & Sherlis, W.
(1998). Internet paradox: A social technol-
ogy that reduces social involvement and
psychological well-being? The American Psy-
chologist, 53, 1017–1031. doi:10.1037/0003-
066X.53.9.1017
López-de-Ipiña, D., Díaz-de-Sarralde, I., &
García-Zubia, J. (2010). An ambient assisted
living platform integrating RFID data-on-tag
care annotations and Twitter. Journal of Uni-
versal Computer Science, 16, 1521–1538.
Matute, H., & Vadillo, M. A. (2007). Assess-
ing e-learning in web labs. In Gomes, L., &
García-Zubía, J. (Eds.), Advances on Remote
Laboratories and e-Learning Experiences (pp.
97–107). Bilbao, Spain: University of Deusto.
Matute, H., Vadillo, M. A., & Bárcena, R.
(2007). Web-based experiment control software
for research and teaching on human learning.
Behavior Research Methods, 39, 689–693.
doi:10.3758/BF03193041
Matute, H., Vadillo, M. A., & Garaizar, P.
(2011). Web based experiment control soft-
ware for research on human learning. In Seel,
N. M. (Ed.), Encyclopedia of the Sciences of
Learning. Berlin, Germany: Springer Verlag.
doi:10.3758/BF03193041
Matute, H., Vadillo, M. A., Vegas, S., & Blanco,
F. (2007). Illusion of control in Internet users and
college students. Cyberpsychology & Behavior,
10, 176–181. doi:10.1089/cpb.2006.9971
Orduña, P., García-Zubia, J., Irurzun, J., &
López-de-Ipiña, D. (2011). Accessing remote
laboratories from mobile devices. In C. Li (Ed.),
Open Source Mobile Learning: Mobile Linux
Applications. Hershey, PA: IGI Global.
Vadillo, M. A., Bárcena, R., & Matute, H. (2006).
The internet as a research tool in the study of asso-
ciative learning: An example from overshadowing.
Behavioural Processes, 73, 36–40. doi:10.1016/j.
beproc.2006.01.014
Vadillo, M. A., & Matute, H. (2009). Learning
in virtual environments: Some discrepancies be-
tween laboratory- and Internet-based research on
associative learning. Computers in Human Behav-
ior, 25, 402–406. doi:10.1016/j.chb.2008.08.009
Vadillo, M. A., & Matute, H. (2011). Further evi-
dence on the validity of web-based research on
associative learning: Augmentation in a predictive
learning task. Computers in Human Behavior, 27,
750–754. doi:10.1016/j.chb.2010.10.020
Valkenburg, P. M., & Peter, J. (2009). Social
consequences of the Internet for adolescents: A
decade of research. Current Directions in Psy-
chological Science, 18, 1–5. doi:10.1111/j.1467-
8721.2009.01595.x
Vazire, S., & Gosling, S. D. (2004). e-Perceptions:
Personality impressions based on personal web-
sites. Journal of Personality and Social Psycholo-
gy, 87, 123–132. doi:10.1037/0022-3514.87.1.123
KEY TERMS AND DEFINITIONS
API: Application Programming Interface, a layer or
interface between different software programs that
lets them interact while remaining decoupled.
Cloud Computing: A platform for developing and
deploying web applications offered as services,
which can be consumed without knowledge of how
its resources are used or implemented.
Future Internet: The efforts made by international
associations to progress towards a better Internet,
one that offers all users a secure, efficient, trusted,
and reliable environment, enables open, dynamic,
and decentralized access to the network, and adapts
its performance to the users' needs and context.
HTML5: The fifth revision of the HTML
standard, a language for structuring and presenting
content for the World Wide Web.
Linked Data: An attempt to progress towards
a more realistic application of the Semantic Web,
in which a cut-down data model, empowered by the
rich expressivity of new semantic standards
(especially a combination of RDF and OWL, termed
RDFS++), is used to define vocabularies and
instance data that are interlinked.
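As a rough illustration of the idea of a vocabulary and instance data that are interlinked, the following minimal sketch represents statements as subject-predicate-object triples. It is only a sketch under stated assumptions: plain Python tuples stand in for RDF triples, no RDF library is used, and the `http://example.org/` URIs and the names `Experiment`, `conductedBy`, `study42`, and `labA` are hypothetical and do not come from the chapter.

```python
# Hypothetical namespace; plain tuples stand in for RDF triples (no RDF library).
EX = "http://example.org/"
RDF_TYPE = "rdf:type"

# Vocabulary: a class and a property, themselves described as triples.
vocabulary = [
    (EX + "Experiment", RDF_TYPE, "rdfs:Class"),
    (EX + "conductedBy", RDF_TYPE, "rdf:Property"),
]

# Instance data that uses the vocabulary and links to another resource.
instances = [
    (EX + "study42", RDF_TYPE, EX + "Experiment"),
    (EX + "study42", EX + "conductedBy", EX + "labA"),
    (EX + "labA", "rdfs:label", "Lab A"),
]


def describe(subject, triples):
    """Return every (predicate, object) pair stated about a subject."""
    return [(p, o) for s, p, o in triples if s == subject]


def follow(subject, predicate, triples):
    """Follow one link: objects reached from subject via predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


graph = vocabulary + instances

# Start at one resource, follow a link, then describe the linked resource.
linked = follow(EX + "study42", EX + "conductedBy", graph)
for resource in linked:
    print(resource, describe(resource, graph))
```

Because every resource is named by a URI, the object of one statement can be the subject of another, which is what allows separately published data sets to be linked together.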
SaaS: Software as a Service, a software delivery
model in which software and its associated data are
provided as services hosted on Cloud Computing-based
servers.
Semantic Web: A machine-readable World
Wide Web that uses metadata to describe the
content of the web meaningfully.
Social Web: A new approach to the Web, also
known as Web 2.0, that emphasizes user-generated
content and user interactions in web applications.