The Spillover Effects of Monitoring: A Field Experiment
Michèle Belot* and Marina Schröder†
September 26, 2014
Abstract
We provide field experimental evidence of the effects of monitoring in
a context where productivity is multi-dimensional and only one dimension
is monitored and incentivized. We hire students to do a job for us. The
job consists of identifying euro coins. We study the direct effects of
monitoring and penalizing mistakes on work quality and evaluate spillovers on
unmonitored dimensions of productivity (punctuality and theft). We find
that monitoring improves work quality only if incentives are strong, but
substantially reduces punctuality irrespective of the associated
incentives. Monitoring does not affect theft, with ten percent of participants
stealing overall. Our findings are supportive of a reciprocity mechanism,
whereby workers retaliate for being distrusted.
Keywords: counterproductive behavior, monitoring, field experiment
JEL: C93, J24, J30, M42, M52
* University of Edinburgh, School of Management, 30 Buccleuch Place, Edinburgh, EH8
9JT, UK, michele.belot@ed.ac.uk.
† University of Cologne, Faculty of Management, Economics and Social Sciences,
Albertus-Magnus-Platz, 50923 Cologne, Germany, marina.schroeder@uni-koeln.de.
1 Introduction
Experts estimate that, globally, occupational fraud causes annual losses of more
than $3.5 trillion (Association of Certified Fraud Examiners 2012). The question
is what an organization can do to prevent such behavior. One straightforward
instrument regularly applied in practice is to monitor workers and punish them
if they do not comply (or reward them if they do). But are such measures
effective? There is experimental evidence that monitoring and incentivizing may
actually backfire (see Frey 1993; Falk and Kosfeld 2006; Frey and Jegen 2014 for
reviews of this literature). However, the evidence is so far limited to situations
where productivity is unidimensional, such as the number of units produced or
sold, performance on a test, or monetary transfers in an experimental game (see,
for example, Gneezy and Rustichini 2000a; Nagin et al. 2002; Falk and Kosfeld
2006; Fisman and Miguel 2007; Dickinson and Villeval 2008; Boly 2011). These
studies assess the direct effects of monitoring on work behavior in the monitored
productivity dimension. In typical work surroundings, however, productivity is
multi-dimensional and there are multiple ways in which workers can behave
counterproductively: from showing up late to doing sloppy work, stealing,
bullying, or sabotaging other people’s work, counterproductive behavior has many
possible facets. Negative crowding out effects of monitoring may spill over to
other productivity dimensions. These spillover effects should be incorporated
when evaluating and designing monitoring and incentive schemes.
We study an experimental setup with multiple observable dimensions of
productivity, in which only one dimension is monitored and incentivized. We vary
(1) whether workers are monitored or not and (2) how "harsh" the incentives
are. We then evaluate the effects of monitoring on the monitored dimension
and on the other non-monitored dimensions. The experimental setup we use is
related to the euro currency. It is a field version of the laboratory task proposed
in Belot and Schröder (2013). We recruited students to identify the provenance
of euro coins. Every worker receives four boxes of coins and is asked to identify
and return the coins by an appointed date. The task has the advantage of
offering a menu of observable forms of counterproductive behavior that are very
common in the workplace, i.e., sloppy work, tardiness, and theft. These forms
of counterproductive behavior vary in their nature and, perhaps importantly, in
the non-monetary (or moral) costs associated with them (Robinson and Bennett
1995).
While it is obvious that sloppy work and theft affect the principal
negatively, tardiness is also generally considered undesirable behavior (Robinson
and Bennett 1995; Gneezy and Rustichini 2000a; Gubler, Larkin and Pierce
2013). However, tardiness is not perceived in the same way across countries
(Basu and Weibull 2003; Krupka and Weber 2013). The experiment was
conducted in Germany, where there is a strong social norm of punctuality. Proper
business etiquette is to be exactly on time. For example, a website targeting
English-speaking businesspeople living in Germany (www.thelocal.de) ranks
punctuality as the most important aspect of etiquette for doing business in Germany.
Quoting: "1. Be on time. Being late in Germany is a cardinal sin. Seriously.
Turning up even five or ten minutes after the arranged time - especially for a
first meeting - is considered personally insulting and can create a disastrous first
impression. Minimise reputation damage by calling ahead with a watertight
excuse if you’re going to be held up." This advice is echoed on many international
business websites and guides to German etiquette.1
We compare three treatments with different degrees of monitoring and
incentives for work quality. The first treatment (no monitoring) entails no monitoring
at all. We contrast this treatment to treatments with monitoring and incentives.
We consider two alternative monitoring and incentive schemes. The first scheme
is a "low pain, low gain" incentive scheme (monitoring & mild incentives), which
introduces a productivity target that is relatively easy to pass and a low penalty
for failing to meet it. The second is a "high pain, high gain" incentive scheme
(monitoring & harsh incentives), which introduces a difficult productivity
target and a high penalty for failing to meet it. These two schemes are interesting
because it is not clear a priori which of the two triggers greater effort. Harsh
incentives may discipline workers and increase productivity, but incentives may
also discourage the workers if the target is perceived as not worthwhile
achieving. Thus, the effects of these incentive schemes on productivity are unclear ex
ante.
1 See for example www2.uni-frankfurt.de/46329991/Guide-to-German-culture_and-etiquette.pdf
and www.kwintessential.co.uk/etiquette/doing-business-germany.html
We find evidence for negative spillover effects that appear as soon as
monitoring is introduced. Specifically, we find that tardiness increases substantially:
The fraction of participants who show up late increases by 35% as soon as
monitoring is implemented, and the magnitude of the increase is similar independent
of the incentives. Theft, on the other hand, remains constant across treatments:
On average, 10% of the participants steal coins. In our experiment, the direct
effect on work quality seems to be driven by incentives. We find a positive
effect on work quality only when incentives are harsh. Mild incentives lead to no
improvement in work quality at all, while harsh incentives reduce the number
of mistakes by 40%. In a companion laboratory experiment, we replicate this
result and find that the combination of the productivity target and the penalty
is crucial to determine the effectiveness of incentives.2
Overall, our experimental results reveal negative spillover effects of
monitoring on unmonitored productivity dimensions. The positive direct effects of
monitoring seem to be contingent on harsh incentives and cannot be achieved
by monitoring per se. Our results are most supportive of an interpretation
related to negative reciprocity, whereby workers wish to punish the principal (for
monitoring them) and do so in the least costly manner for themselves (both in
monetary and non-monetary terms).
Our results suggest that monitoring can only be efficient in combination with
harsh incentives. Whether or not monitoring with harsh incentives is efficient
depends on the ratio of the gains in the monitored productivity dimension to
the losses in other unmonitored productivity dimensions.
The rest of the paper is structured as follows: We present the
experimental design in Section 2 and present the results in Section 3. We discuss the
interpretation of the results in Section 4 and conclude in Section 5.
2 In this laboratory experiment we vary the threshold and the penalty independently. We
briefly describe the design and findings in the Results section. For a detailed description,
please see the Appendix.
2 Experimental design and procedure
We recruited students to support a research project. The task is adapted from
Belot and Schröder (2013) and consists of identifying the value and country of
origin of euro coins that were collected in various countries in the euro zone.
Participants in our experiment had one day to complete the task from home and
were requested to return the work materials at a specific deadline. Our design
has several methodological advantages. It involves a job that could realistically
be advertised by an economics department and that can be executed in a natural
work environment, i.e., workers can take the coins home rather than working in
an experimental laboratory. Additionally, we can observe multiple dimensions
of productivity that arise naturally: Participants can do a poor job, be late in
completing the job or steal some of the coins. Still, it is straightforward for us to
design a monitoring scheme targeting only one of these dimensions. Also, in this
job, participants who failed to comply in any of these three dimensions can be
categorized as behaving counterproductively, since it is possible for participants
to do a perfect job, provided they are willing to do it.
We recruited student workers via a notice posted at various places on the
campus of the University of Magdeburg. Interested students were asked to
contact the research team by email. Those who had not participated in any
previous related studies received a response mail briefly explaining the task.
In the email, we suggested two collection dates with the corresponding return
dates and asked students to choose one of them.3 At collection, each participant
received standardized verbal instructions on how to perform the job and on the
monitoring procedure.4 After answering all open questions in a standardized
way, we asked participants to indicate the exact time at which they would return
the coins the next day.5
3 Collection was always on either Monday or Wednesday morning between 10:00 a.m.
and 12:30 p.m., and return was the next day between 3:30 p.m. and 6:00 p.m.
4 For a detailed overview of the written and verbal communication as well as the work
material, please refer to the online appendix.
We contrast one treatment with neither monitoring nor incentives to two
treatments with monitoring and incentives. In the no monitoring treatment, there
is no monitoring at all. In the two monitoring treatments, 1 out of the 4 boxes
is checked. Before starting to work, participants in both monitoring treatments
were informed that 1 out of the 4 boxes would be checked after returning the
coins. While we kept monitoring fixed in these two treatments, we varied the
incentives associated with monitoring. In the monitoring & mild incentives
treatment, participants were allowed to make 10 mistakes. If we found more than 10
mistakes in the box randomly chosen for checking, the participant would only
receive €19 instead of €20. In the monitoring & harsh incentives treatment,
the threshold number of mistakes was only 2. If we found more than 2 mistakes
in the checked box, the participant’s payment was only €5 instead of €20. The
first incentive scheme is mild: It is an easy threshold to pass and the penalty is
small. The second incentive scheme is harsh: It leaves little room for mistakes and the
penalty is large.6 Note that we varied two variables at the same time to change
the incentives (threshold and penalty) and chose combinations of the two that
are probably most common in the workplace. However, to get more insight into
how the incentive schemes work (and affect performance in the monitored task
in particular), we conducted additional treatments in a laboratory experiment
that vary the penalty and the threshold independently (in a 2x2 design). We
will comment more extensively on the results in the next section.
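The payment rules of the three treatments reduce to a threshold rule on the number of mistakes found in the checked box. The following is a minimal sketch in Python, using only the thresholds and payments stated in the text; the names `FULL_PAYMENT`, `SCHEMES`, and `payment` are illustrative and not part of the study materials.

```python
# Payment rule as described in the text; all names are hypothetical.
FULL_PAYMENT = 20  # euros, paid when the quality target is met

# treatment -> (maximum tolerated mistakes in the checked box, reduced payment in euros)
SCHEMES = {
    "mild": (10, 19),   # monitoring & mild incentives
    "harsh": (2, 5),    # monitoring & harsh incentives
}

def payment(treatment, mistakes_in_checked_box=0):
    """Return the payment in euros for one participant."""
    if treatment == "none":
        # No box is checked, so no penalty can apply.
        return FULL_PAYMENT
    threshold, reduced = SCHEMES[treatment]
    # The penalty applies only if the mistakes exceed the threshold.
    return reduced if mistakes_in_checked_box > threshold else FULL_PAYMENT
```

For instance, three mistakes in the checked box leave the payment at €20 under the mild scheme (`payment("mild", 3)`) but reduce it to €5 under the harsh scheme (`payment("harsh", 3)`).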
Ninety-one students participated in this study, 30 each in the no
monitoring and monitoring & mild incentives treatments and 31 in the monitoring &
harsh incentives treatment. All participants were allowed to take the materials
home. They received a catalog illustrating the most common euro coins and four
identification tables. Each participant received a set of 4 boxes of euro coins
collected in 4 different countries of the euro zone. The lid of each box indicated
the country the coins were collected in. Within one set, the composition of boxes
varied with respect to the value and the number of coins. Across sets, however,
the composition of boxes was similar. Each participant received a total of 780
coins with a value of €114.70.
5 We gave participants enough time to check their schedule for the most suitable time in the
time window between 3:30 p.m. and 6:00 p.m. Once a participant had decided on the exact
return time, we noted the time in our calendar and wrote the time on a sheet of paper that
was handed to the participant.
6 The incentive scheme was framed in neutral language for participants. We did not use
the words reward or punishment.
When participants returned the work materials, we wrote down the exact
time the materials were returned. We also asked the participants for an estimate
of the time they had worked on the task, for their field of study, and we recorded
the gender. Participants in the no monitoring treatment immediately received
the full payment of €20 in cash. Participants in the two monitoring treatments
directly received the sure part of the payment and could collect the remaining
part later (usually a day later) if they met the work quality requirements of
the corresponding treatment. Participants were informed about the payment
procedure before working on the task.
Compared to the no monitoring treatment, the two monitoring treatments
are associated with a different payment procedure that generates some
inconvenience for participants. We see this as a necessary and inherent part of
introducing the monitoring technology. If we had asked participants in
the no monitoring treatment to come back a day later to collect their payment,
they might have felt monitored as well. Given the nature of the task, it was
impossible to run the monitoring treatments without having participants
come back. Nevertheless, we believe such inconveniences are not atypical and are
often an inherent part of a monitoring scheme. In many real-world examples,
monitoring is indeed associated with inconveniences for the worker, e.g.,
monitored workers have to write extra reports, make detours in order to reach central
time measurement stations, cope with delays due to quality control, or bear the
discomfort of camera surveillance. Thus, we are convinced that inconveniences
are a natural element of monitoring mechanisms.
When the experiment was over, we checked all returned materials with
respect to coin composition and mistakes in the identification task. Whenever we
observed deviations in the composition of coins, we replaced coins with identical
coins or coins with similar collector’s value before handing the materials to the
next participant.
3 Results
3.1 Summary statistics
Table 1 shows summary statistics for the behaviors of interest across the three
treatments. Regarding the productivity in the monitored dimension first, we
find that the quality of work is on average higher in the monitoring & harsh
incentives treatment than in the no monitoring and monitoring & mild incentives
treatments. In fact, quality in the no monitoring and the monitoring & mild
incentives treatments is very similar. In these two treatments, workers make 10
mistakes on average (2.5 per box), while they make on average 7 mistakes (1.7
per box) in the monitoring & harsh incentives treatment.
Looking in more detail at the distribution of mistakes, we find that most
boxes have no more than 2 mistakes, but this share is larger in the treatment
with harsh incentives (76.1% in the no monitoring treatment, 71.7% in
the monitoring & mild incentives treatment, and 83.1% in the monitoring &
harsh incentives treatment). Most boxes have no more than 10 mistakes, suggesting that
this threshold was indeed an easy threshold to reach (97% in the no monitoring
treatment, 95% in the monitoring & mild incentives treatment, and 98% in the monitoring
& harsh incentives treatment).7
7 In the monitoring & mild incentives treatment, all checked boxes were below the tolerated
number of mistakes. Half of the participants in the monitoring & mild incentives treatment
came back to collect the remaining payment. Comparing those participants who collected the
remaining payment to those who did not, we do not find significant differences in the number of
mistakes made (U-test, p>0.10, two-tailed), stealing (Fisher Exact Test, p>0.10, two-tailed),
or punctuality (Fisher Exact Test, p>0.10, two-tailed). In the monitoring & harsh incentives
treatment, 5 participants did not meet the quality requirements. Of the 26 participants who
met the requirements, 24 came back to collect the remaining payment.
Table 1: Summary statistics (standard deviations in brackets)

                                              no monitoring    monitoring &     monitoring &
                                                               mild incentives  harsh incentives
                                              (1)              (2)              (3)
Work quality
  avg. total no. of mistakes in all 4 boxes   10.23 (16.23)    9.97 (13.45)     6.90 (10.93)
  % boxes with 0-2 mistakes                   76.1%            71.7%            83.1%
  % boxes with 3-10 mistakes                  20.6%            23.3%            14.4%
  % boxes with more than 10 mistakes          3.3%             5.0%             2.5%
Tardiness
  % participants on time (within 5 min)       56.7%            33.3%            35.5%
  % participants too early (≥ 1 min)          46.6%            33.3%            35.5%
  median advance in min. (if early)           11 (584.90)      20 (17.04)       10 (130.31)
  % participants too late (≥ 1 min)           13.3%            43.3%            45.2%
  median delay in min. (if late)              4 (6.29)         5 (15.48)        8 (38.93)
Theft
  no. of participants who stole coins         3                3                3
Working time
  avg. reported working time (in min)         111.83 (42.6)    112.5 (45.0)     124.5 (47.7)
Penalty
  % participants eligible for full payment    100%             100%             83.9%
  % collected full payment if eligible        100%             50%              92.6%
Turning to the other dimensions of productivity, we find that punctuality
varies substantially across treatments. The percentage of participants showing
up on time is much higher in the absence of monitoring. Figure 1 shows
a histogram of the deviation from the appointed return time for the separate
treatments.8 While only four participants in the no monitoring treatment came
back late (compared to sharp punctuality), more than 40 percent showed up late
in the two monitoring treatments. In all treatments, a substantial fraction of
the participants came back too early.9
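The three punctuality measures used in Table 1 and in the regressions can be written as indicator variables on the signed deviation from the appointed return time. The sketch below is an illustration only: it assumes that "on time" means a deviation of at most 5 minutes in either direction and that "early" and "late" are measured against sharp punctuality with a one-minute resolution; the function name `punctuality_flags` is hypothetical.

```python
def punctuality_flags(deviation_min):
    """Classify a return by its deviation from the appointed time.

    deviation_min: minutes relative to the appointed return time
    (negative = early, positive = late, 0 = exactly on time).
    """
    return {
        # Assumed reading of "within 5 min": at most 5 minutes either way.
        "on_time": abs(deviation_min) <= 5,
        # Assumed reading of "early"/"late": at least 1 minute off sharp punctuality.
        "early": deviation_min <= -1,
        "late": deviation_min >= 1,
    }
```

Note that the categories overlap by construction: a participant who returns three minutes after the appointed time counts as both "on time" and "late", which is why the shares in Table 1 do not sum to 100%.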
Turning to theft, we find that 10% of the participants (9 out of 91
participants) steal coins. The prevalence of theft is identical across treatments.
8 In the graph, we exclude outliers with a deviation above 50 minutes.
9 It is unclear what causes participants to come back early. It could be that they try really
hard not to be late and take any potentially delaying eventualities (that do not occur) into
account. However, it could also be plain unpunctuality. Also, the consequences of coming
back early are different from those of coming back late. By waiting, early participants can still
be on time. This is clearly not the case for late participants.
Most delayed participants returned the coins within the time frame. Only one participant
(in the monitoring & harsh incentives treatment) returned the coins after 6:00 p.m. For early
participants, we find that 15 participants (3 in the no monitoring and 6 in each monitoring
treatment) returned the work material before 3:30 p.m.
[Figure 1: Deviation from the appointed return time. Histograms in four panels
(no monitoring; monitoring & mild incentives; monitoring & harsh incentives;
overall); x-axis: deviation in minutes; y-axis: density.]
Overall, it seems that theft in our experiment is motivated by the collectors’
value of coins, rather than the nominal value of circulating coins. Participants
especially steal coins that at the time of the experiment were rarely found in
Germany, such as coins from the Vatican, Slovenia, or Slovakia. These are coins
that have a higher collectors’ value than their actual nominal value. For
example, in three cases a 50 cent coin from the Vatican was stolen. On the German
eBay platform this coin was sold for €3 (plus shipping) at the time of the
experiment. In two cases (that occurred in different treatments) participants replaced
coins from the Vatican with other coins that had the same nominal value. We
categorize these acts as theft as the participants did not inform us that they
replaced the coins.
Our results allow us to observe multiple dimensions of counterproductive
behavior. We find that counterproductive behavior in the different dimensions is
not correlated, i.e., participants who behave counterproductively in one
dimension are neither more nor less likely to behave counterproductively in another
dimension than other participants. Comparing individuals who steal to those
who do not steal, we do not find a significant difference in tardiness (U-test,
p>0.10, two-tailed) or the number of mistakes (U-test, p>0.10, two-tailed).
Further, the number of mistakes is not correlated with the delay in minutes
(Spearman Correlation, p>0.10, two-tailed).
3.2 Regression analysis
We now present a regression analysis of the number of mistakes and tardiness
(we do not analyze theft since there is no variation across treatments), which
allows us to control for some observable characteristics of the workers. Starting
with work quality, Col. (1) shows the results of a Poisson regression.10 We find
that there are 40% fewer mistakes under the monitoring & harsh incentives
treatment than under no monitoring. On the other hand, we observe no significant
differences between monitoring & mild incentives and no monitoring. It seems
that monitoring alone does not have an effect on work quality. Work quality is
only improved if monitoring is associated with harsh incentives.
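In a Poisson model with a log link, a coefficient b on a treatment dummy implies that the expected count is multiplied by exp(b), which is close to a change of 100·b percent when b is small. The sketch below applies this standard conversion to the Col. (1) estimates in Table 2. It assumes that Col. (1) reports raw Poisson coefficients (the table notes mention marginal effects only for the Probit columns); the function name `poisson_pct_change` is illustrative.

```python
import math

def poisson_pct_change(coef):
    """Percent change in the expected count implied by a Poisson coefficient."""
    return (math.exp(coef) - 1.0) * 100.0

# Table 2, Col. (1): coefficients on the two treatment dummies.
harsh = poisson_pct_change(-0.407)  # monitoring & harsh incentives
mild = poisson_pct_change(0.003)    # monitoring & mild incentives
```

Under this assumption, reading the coefficient of -.407 directly as a log-point change gives the roughly 40% quoted in the text, while the exact conversion gives a reduction of about 33%. The coefficient for mild incentives implies a change of well under one percent either way.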
Turning to punctuality, we first run a regression (Col. (2)) on whether the
participant showed up on time (within 5 minutes of the appointed time). We
find that participants are significantly less likely to show up on time as soon
as monitoring is introduced. Participants are 22 and 20 percent less likely to
show up on time in the monitoring & mild incentives and monitoring & harsh
incentives treatments, respectively. One question is whether participants show up late
because they put more effort into the identification task. We asked participants
how much time they spent on the task and the average reported working time
was 112 minutes for the no monitoring treatment, 113 minutes for the
monitoring & mild incentives treatment, and 124 minutes for the monitoring & harsh
incentives treatment, with none of these differences being statistically significant (U-test,
p>0.10, two-tailed). Since the average time reported is far below 24 hours, it is
unlikely that participants were under time pressure. In Col. (3) we nevertheless
control for the reported working time and the quality of work to check whether they explain the
differences in punctuality. In Col. (4) we additionally control for the day of
the week on which participants had to return the work material, for the time
coins were collected, and for the appointed return time. The results remain
unchanged when controlling for these additional variables.
10 The distribution of the number of mistakes is not normal. There is a substantial fraction
of zeros and small positive values. In those cases, count data models are more appropriate.
This is why we use a Poisson regression.
The question is whether this decrease in punctuality is driven by the fact
that more participants come early or whether it is driven by more participants
coming late. Col. (5-10) look at the probability of returning the work material
early or late (compared to sharp punctuality). We only find significant
differences in the probability of being late. Participants are 35% and 36% more likely
to be late under monitoring & mild incentives and monitoring & harsh
incentives, respectively (Col. (8)). The effects of monitoring remain if we control for
the total number of mistakes and the reported work time (Col. (6) and (9)),
which indicates that there is no relationship between effort in the identification
task and tardiness. We also control for the day of the week, the actual
collection time, and the appointed return time (Col. (7) and (10)). Again, we find
that participants are significantly more likely to be late in the two monitoring
treatments compared to the no monitoring treatment.11 It seems that
introducing monitoring per se results in a negative spillover effect on punctuality and
that these spillovers are unaffected by the level of incentives associated with
monitoring.
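As a rough cross-check on the raw data, the shares of late participants in Table 1 can be compared across treatments with a Fisher exact test. This is an illustration and not an analysis reported in the paper. The counts are reconstructed from the percentages in Table 1 and the group sizes (4 of 30 late under no monitoring, 13 of 30 under mild incentives, 14 of 31 under harsh incentives), and the function `fisher_exact_two_sided` is a hypothetical helper written with the Python standard library only.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables that are at most as likely as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))

# Rows: treatment; columns: late, not late (counts reconstructed from Table 1).
p_mild = fisher_exact_two_sided(4, 26, 13, 17)   # no monitoring vs. mild incentives
p_harsh = fisher_exact_two_sided(4, 26, 14, 17)  # no monitoring vs. harsh incentives
```

With these reconstructed counts, both p-values fall below 0.05, which points in the same direction as the Probit estimates for being late. The regressions remain the analysis to rely on, since they also control for participant characteristics.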
11 Interestingly, we also find significant effects of the day of the week and the appointed return
time on the probability of being late. Participants who return the work material on a Tuesday
are 23 percent more likely to be late compared to participants who return the material on
a Thursday. Further, the probability of being late decreases the later the appointed return
time.
Table 2: Regression analysis

                                Number of
                                mistakes   On time                          Early                            Late
                                (Poisson)  (Probit)                         (Probit)                         (Probit)
                                (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)        (9)        (10)
monitoring & mild incentives    .003       -.221      -.229      -.243      -.137      -.136      -.120      .348       .356       .372
                                (.082)     (.117)*    (.117)*    (.118)*    (.119)     (.119)     (.123)     (.132)**   (.132)***  (.143)**
monitoring & harsh incentives   -.407      -.205      -.240      -.250      -.105      -.109      -.135      .363       .389       .491
                                (.089)***  (.117)*    (.117)*    (.118)**   (.120)     (.122)     (.122)     (.129)***  (.131)***  (.136)***
female                          -.298      -.034      -.047      -.031      .070       .066       .064       -.000      .015       .027
                                (.073)***  (.107)     (.107)     (.110)     (.105)     (.106)     (.108)     (.103)     (.104)     (.109)
total mistakes                  -          -          -.007      -.007      -          -.001      -.001      -          .005       .005
                                                      (.004)     (.005)                (.004)     (.004)                (.004)     (.004)
reported work time              -          -          .001       .001       -          .000       .000       -          .000       .001
                                                      (.001)     (.001)                (.001)     (.001)                (.001)     (.001)
Tuesday                         -          -          -          -.001      -          -          -.166      -          -          .231
                                                                 (.112)                           (.108)                           (.111)**
collection time                 -          -          -          -.074      -          -          -.022      -          -          .084
                                                                 (.072)                           (.066)                           (.066)
app. return time                -          -          -          -.024      -          -          .0589      -          -          -.080
                                                                 (.034)                           (.039)                           (0.037)**
constant                        2.435      -          -          -          -          -          -          -          -          -
                                (.062)***
(pseudo) R2                     .027       .034       .056       .070       .014       .016       .050       .081       .098       .182
Obs.                            91         91         91         91         91         91         91         91         91         91

*significance at p<0.10, **significance at p<0.05, ***significance at p<0.001. Marginal effects are reported for Probit estimates in Col. (2-10).
Dependent variables: Col. (1): number of mistakes in the identification task; Col. (2-4): dummy indicating whether the participant showed up on time
(within 5 minutes of the appointed time); Col. (5-7): dummy indicating whether the participant showed up early (compared to sharp punctuality);
Col. (8-10): dummy indicating whether the participant showed up late (compared to sharp punctuality).
Our experimental design varies the incentives by changing two variables at
the same time: the threshold and the size of the reward for meeting the
threshold. Since we see a substantial increase in the productivity in the identification
task with harsher incentives, the question is whether this increase is driven by
the higher penalty, the more difficult threshold, or both. To see how these two
variables affect work quality independently of each other and in combination,
we conducted additional treatments in a laboratory setting where we varied the
threshold and the penalty in a 2x2 design. We find that both matter: a higher
penalty increases productivity and a more difficult threshold further reinforces
the productivity increase when the penalty is high. Harsh incentives (difficult
threshold, large penalty) appear to be the most effective way of triggering effort,
while a difficult threshold with a small penalty seems to be least effective. In
the latter case (difficult threshold, small penalty), incentives have an adverse
effect as the number of mistakes is substantially higher than in the absence of
incentives (no threshold, no penalty). We present these results in the Appendix.
3.3 Discussion
We find that monitoring has a negative effect on punctuality. Independent
of the level of incentives associated with monitoring, punctuality significantly
decreases as soon as monitoring is introduced. What drives this crowding out
effect? In the following we will summarize some existing theories on crowding
out effects and will discuss whether they can explain the observed behavior in
our experiment.
One mechanism that has been proposed to explain crowding out effects is
through information. Bénabou and Tirole (2003) argue that monitoring could
negatively affect workers’ perception of a task. Workers who are monitored infer
that the task is difficult or unpleasant and as a consequence put less effort into
the monitored task (Bénabou and Tirole 2003).
Sliwka (2007) proposes that monitoring could reveal information about peers’
behavior. In his model, monitoring work quality signals that the principal
expects a large fraction of workers to work sloppily. Workers who aim to
conform to their peers respond to this signal and choose to behave sloppily as
well. It is important to note that in our task the signal is only informative about
peers’ behavior in the monitored productivity dimension. We showed in the
results section that individuals who work sloppily are neither more nor less likely
to steal or to be late. Thus, a signal on peers’ work quality is not informative
about their behavior in other productivity dimensions of our experiment. Both the
model by Bénabou and Tirole (2003) and the model by Sliwka (2007) only
predict crowding out effects on the monitored productivity dimension and cannot
explain our observation that crowding out effects spill over to other productivity
dimensions.
Another mechanism driving crowding out effects could be reciprocity
(Rabin 1993; Frey 1993). There are multiple ways by which monitoring negatively
affects workers. For a given level of effort, monitoring effectively reduces the
expected payment for a worker because it is associated with a fine.
Additionally, workers incur inconveniences due to the process of monitoring. Monitoring
may further reduce workers’ utility due to a reduction in autonomy. Reciprocal
workers may want to reduce the principal’s payoff as a consequence of the
reduction in their own utility (Rabin 1993; Dufwenberg and Kirchsteiger 2004).
It could also be that workers reciprocate distrust. Monitoring and incentives
(independent of the level) may be perceived as a signal of distrust, and workers
may reciprocate distrust by being less trustworthy, i.e., by caring less about
the payoff of the principal (Frey 1993).
In a multi-dimensional context, workers should always choose the cheapest
way of reciprocating. In our design, there are three ways in which workers can
negatively reciprocate: (1) They can put in less effort, (2) they can steal coins, and
(3) they can be late in returning the work material.12 The first way is costly
to the workers because it reduces their expected payment. The other two do
not entail monetary costs for the worker (theft is even associated with monetary
gains) but are associated with costs of breaking social norms. The social and
legal norms against theft are stronger than those for punctuality (e.g., Robinson and
Bennett 1995). It seems reasonable to assume that tardiness is the cheapest
way of reciprocating. Thus, our finding that punctuality decreases as soon as
monitoring is implemented is in line with a reciprocity interpretation. It seems
that workers want to retaliate for being monitored by being unpunctual.13
12 All experiments were run by the researchers involved in this project. Since monitoring is
not an essential part of a usual work relation, it is clear that the monitoring choice was made
by the experimenter and that tardiness would affect the experimenter.
With respect to the direct effect of monitoring, we find that monitoring
improves work behavior only if it is associated with harsh incentives. If the
incentives associated with monitoring are mild, monitoring workers does not
have any effect on the monitored productivity dimension. If the incentives are
harsh, the number of mistakes falls significantly. Thus, the improvement in
work quality in the field experiment is due to incentives rather than
monitoring. In a laboratory experiment, we disentangle the effects of our two
incentive components (threshold and penalty). We find that a large penalty
always results in a lower number of mistakes compared to a small penalty. With
respect to the threshold, we find that a difficult threshold only improves work
behavior when it is associated with a large penalty. The combination of a
difficult threshold and a small penalty has an adverse effect on work behavior,
as the number of mistakes made increases substantially compared to a situation
without monitoring and incentives. Our findings are in line with the existing
literature on the (adverse) effects of incentives on performance (Gneezy and
Rustichini 2000b; Gneezy, Meier, and Rey-Biel 2011) and contribute to this
literature by showing that the combination of threshold and monetary
incentives matters.
4 Conclusion
This paper provides field evidence on the effect of monitoring and incentives
in a context where productivity is multi-dimensional and only one of the
dimensions (work quality) is monitored. We observe negative spillovers of
monitoring on unmonitored productivity dimensions. These spillover effects
arise independent of the level of incentives. Thus, they appear to be driven by
the mere presence of monitoring. These observed crowding out effects are in
line with a model of reciprocal behavior. Workers choose to punish the
principal for monitoring them, but they choose to do this through dimensions
that have low costs for them.
13 The negative effect of monitoring on workers in our experiment involves
multiple dimensions, e.g., reduced expected payment, inconveniences associated
with the procedure, reduced autonomy, and distrust. More research is needed to
be able to disentangle the effects of the separate dimensions of monitoring on
work behavior.
We find that monitoring improves productivity in the monitored dimension
only if it is associated with harsh incentives. Introducing monitoring and mild
incentives has no effect at all on work quality. Thus, monitoring associated
with mild incentives is inefficient: there is no significant improvement in
work quality, and tardiness increases significantly. Monitoring with harsh
incentives is more effective. The number of mistakes falls substantially, but
at the same time the negative spillover effects are as large as in the
monitoring treatment with mild incentives.
Based on these results, we conclude that introducing a monitoring technology
only pays off if (1) the incentives associated with monitoring are sufficiently
harsh, (2) either the dimensions that cannot be monitored entail high moral
costs or the relative gains in productivity in the monitored dimension more
than compensate for the losses in other dimensions, and (3) monitoring costs
for the employer are sufficiently low.
These findings relate more broadly to the literature on adverse effects of
incentives (see Gneezy, Meier, and Rey-Biel 2011 for a recent review) and the
adverse effects of control (Falk and Kosfeld 2006) and monitoring (Frey 1993).
In line with this literature, we find that monitoring and mild incentives are
less effective than no monitoring at all.
Appendix A Laboratory Experiment: Threshold versus Penalty
We conducted five additional treatments in the laboratory to find out how the
threshold and the penalty affect effort in the identification task. In the
laboratory experiments, we computerized the identification task and asked
students to identify coins on a screen. They had to identify 204 coins that
corresponded to the coins from one of the boxes in the field experiment. Since
the duration of the task was shorter (50 minutes on average), we adjusted
incentives to make them comparable to the field experiment and to be in
accordance with expected earnings in a typical laboratory experiment.
We introduced a treatment without incentives, where participants were paid
a €10 flat fee. Additionally, we ran four treatments with incentives, varying
the threshold and the penalty in a 2x2 design. We offered a €10 payment
to those who met the performance requirements (fewer than 2 or 10 mistakes),
while those who failed would receive either €9.50 (small penalty) or €2.50
(large penalty). The five treatments are summarized in Table A1. Note that T1
corresponds to the "no monitoring" treatment, T2 corresponds to the "monitoring
& mild incentives" treatment, and T5 corresponds to the "monitoring & harsh
incentives" treatment in the field experiment.
Table A1: Experimental Design and Number of Participants
(laboratory experiment)

                         no threshold   easy threshold   difficult threshold
                                        (10 mistakes)    (2 mistakes)
no penalty               T1, N = 30
small penalty (€0.50)                   T2, N = 32       T3, N = 32
large penalty (€7.50)                   T4, N = 31       T5, N = 32
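
The payment rule of the five laboratory treatments can be written down compactly. The following is a minimal sketch, not the software used in the experiment. It assumes that "fewer than 2 or 10 mistakes" is a strict inequality and that the large penalty is a deduction of €7.50, so that a participant who misses the requirement receives €2.50 as stated in the text. The names Treatment, TREATMENTS, and payment are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

BASE_PAY = 10.00  # euros offered in every laboratory treatment


@dataclass(frozen=True)
class Treatment:
    name: str
    threshold: Optional[int]  # requirement: strictly fewer mistakes than this; None = no requirement
    penalty: float            # euros deducted if the requirement is missed (assumed deduction)


# Design cells as described in the text and Table A1 (illustrative encoding).
TREATMENTS = {
    "T1": Treatment("T1", None, 0.00),  # no threshold, no penalty (flat fee)
    "T2": Treatment("T2", 10, 0.50),    # easy threshold, small penalty
    "T3": Treatment("T3", 2, 0.50),     # difficult threshold, small penalty
    "T4": Treatment("T4", 10, 7.50),    # easy threshold, large penalty
    "T5": Treatment("T5", 2, 7.50),     # difficult threshold, large penalty
}


def payment(treatment: Treatment, mistakes: int) -> float:
    """Return the payment in euros for a given number of mistakes."""
    if treatment.threshold is None or mistakes < treatment.threshold:
        return BASE_PAY
    return BASE_PAY - treatment.penalty


if __name__ == "__main__":
    # Payment for a participant with exactly 2 mistakes in each treatment.
    for name, treatment in TREATMENTS.items():
        print(name, payment(treatment, mistakes=2))
```

Under these assumptions, a participant with exactly 2 mistakes keeps the full €10 in T1, T2, and T4 but is penalized in T3 and T5, which illustrates how the threshold and the penalty vary independently in the 2x2 design.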
We ran sessions for each treatment with a between-subjects design. We had
between 30 and 32 participants per treatment. Sessions were run in the Cologne
Laboratory for Economic Research, and subjects were recruited via ORSEE
(Greiner 2004).
Table A2 summarizes our results from this laboratory study. We replicate
what we find in the field experiment: Mild incentives (T2) do not significantly
increase effort relative to no incentives (T1) (U-test, p=0.17, two-tailed).
However, harsh incentives (T5) lead to significantly fewer mistakes than mild
incentives (U-test, p<0.05, two-tailed) and than no incentives at all (U-test,
p<0.01, two-tailed).
Do these effects come from the change in the threshold or the change in
the penalty? We see that increasing the penalty always decreases the number
of mistakes, irrespective of the threshold (U-test, p<0.10, two-tailed). Making
the threshold more difficult, on the other hand, leads to a substantial
increase in the number of mistakes made when the penalty is small (U-test,
p<0.05, two-tailed). When the penalty is large (€7.50), a difficult threshold
increases the level of effort compared to an easy threshold, but only slightly
(U-test, p<0.10, two-tailed).
These results show that harsh incentives increase productivity through both
channels: a higher penalty increases productivity, and a more difficult
threshold further reinforces the productivity increase when the penalty is
high. Harsh incentives (difficult threshold, large penalty) appear to be the
most effective way of triggering effort, while a difficult threshold with a
small penalty (T3) seems to be least effective. In the latter case, it seems
that many participants do not put much effort at all into the task: 41% made
more than 10 mistakes, compared to 0% in T5 (harsh incentives), 6% in T2 (mild
incentives), and 3% in T1 (no incentives) and T4 (large penalty and easy
threshold).
Table A2: Average number of mistakes
(standard deviations in brackets)

                         no threshold   easy threshold   difficult threshold
                                        (10 mistakes)    (2 mistakes)
no penalty               3.7 (3.2)
small penalty (€0.50)                   4.6 (9.8)        54.5 (77.8)
large penalty (€7.50)                   1.9 (2.8)        0.9 (1.4)
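
The interaction between threshold and penalty can be read directly off the treatment means. The sketch below uses only the means reported in Table A2 and computes simple differences between cells. It is a descriptive illustration: the significance statements in the text rest on U-tests of individual-level data, which are not reproduced here. The names MEAN_MISTAKES, penalty_effect, and threshold_effect are hypothetical.

```python
# Mean number of mistakes per treatment cell, copied from Table A2.
# Keys are (penalty, threshold).
MEAN_MISTAKES = {
    ("small", "easy"): 4.6,        # T2
    ("small", "difficult"): 54.5,  # T3
    ("large", "easy"): 1.9,        # T4
    ("large", "difficult"): 0.9,   # T5
}


def penalty_effect(threshold: str) -> float:
    """Change in mean mistakes when moving from the small to the large penalty."""
    diff = MEAN_MISTAKES[("large", threshold)] - MEAN_MISTAKES[("small", threshold)]
    return round(diff, 1)


def threshold_effect(penalty: str) -> float:
    """Change in mean mistakes when moving from the easy to the difficult threshold."""
    diff = MEAN_MISTAKES[(penalty, "difficult")] - MEAN_MISTAKES[(penalty, "easy")]
    return round(diff, 1)


if __name__ == "__main__":
    for threshold in ("easy", "difficult"):
        print("penalty effect at", threshold, "threshold:", penalty_effect(threshold))
    for penalty in ("small", "large"):
        print("threshold effect at", penalty, "penalty:", threshold_effect(penalty))
```

The penalty effect is negative for both thresholds, whereas the threshold effect changes sign with the penalty: a difficult threshold raises the mean number of mistakes when the penalty is small and lowers it slightly when the penalty is large.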
Acknowledgments
The authors thank Uri Gneezy, Bernd Irlenbusch, Karim Sadrieh, and three
anonymous referees for valuable suggestions and comments that led to
substantial improvements. We also benefited from comments from participants at
the European Workshop on Experimental and Behavioral Economics in Frankfurt
2013, the Royal Economic Society 2013 Conference, the 2013 Florence Workshop
on Behavioural and Experimental Economics, and seminars in Cologne and Trier.
We thank Claudia Gorylla, Markus Hartmann, and Linh Nguyen for help in
conducting the experiments. Financial support by the Institute for Fraud
Prevention and the Deutsche Forschungsgemeinschaft (DFG FOR 1371) is gratefully
acknowledged.
References
Association of Certified Fraud Examiners. 2012. 2012 Report
to the Nations on Occupational Fraud and Abuse. Available at
http://www.acfe.com/uploadedFiles/ACFE_Website/Content/rttn/2012-report-to-nations.pdf,
last access 25.02.2014.
Basu, K., J. W. Weibull. 2003. Punctuality: A Cultural Trait as Equilibrium.
In Economics for an Imperfect World: Essays in Honor of Joseph E. Stiglitz,
ed. R. Arnott, B. Greenwald, R. Kanbur, B. Nalebuff, 163–182. London: The
MIT Press.
Belot, M., M. Schröder. 2013. Sloppy Work, Lies and Theft: A Novel Experimen-
tal Design to Study Counterproductive behavior. Journal of Economic Behavior
and Organization 93 233-238.
Bénabou, R., J. Tirole. 2003. Intrinsic and Extrinsic Motivation. Review of
Economic Studies 70 489–520.
Boly, A. 2011. On the Incentive Effects of Monitoring: Evidence from the Lab
and the Field. Experimental Economics 14(2) 241–253.
Dickinson, D., M.-C. Villeval. 2008. Does Monitoring Decrease Work Effort?
The Complementarity Between Agency and Crowding-Out Theories. Games and
Economic Behavior 63(1) 56–76.
Dufwenberg, M., G. Kirchsteiger. 2004. A Theory of Sequential Reciprocity.
Games and Economic Behavior 47 268–298.
Falk, A., M. Kosfeld. 2006. The Hidden Costs of Control. American Economic
Review 96(5) 1611–1630.
Fisman, R., E. Miguel. 2007. Corruption, Norms, and Legal Enforcement: Ev-
idence from Diplomatic Parking Tickets. Journal of Political Economy 115(6)
1020–1048.
Frey, B. S. 1993. Does Monitoring Increase Work Effort? The Rivalry with Trust
and Loyalty. Economic Inquiry 31(4) 663–670.
Frey, B. S., R. Jegen. 2001. Motivational Interactions: Effects on Behavior.
Annales of Economics and Statistics 63/64 131–153.
Gneezy, U., S. Meier, P. Rey-Biel. 2011. When and Why Incentives (Don’t)
Work to Modify Behavior. Journal of Economic Perspectives 25(4) 191-210.
Gneezy, U., A. Rustichini. 2000a. A Fine is a Price. Journal of Legal Studies
29(1) 1-18.
Gneezy, U., A. Rustichini. 2000b. Pay Enough or Don’t Pay at All. Quarterly
Journal of Economics 115(3) 791–810.
Greiner, B. 2004. An Online Recruitment System for Economic Experiments. In
Forschung und wissenschaftliches Rechnen 2003, ed. K. Kremer, V. Macho, 73–93.
GWDG Bericht 63, Göttingen.
Gubler, T., I. Larkin, L. Pierce. 2013. The Dirty Laundry of Employee Award
Programs: Evidence from the Field. Harvard Business School Working Paper
13-069.
Krupka, E. L., R. A. Weber. 2013. Identifying Social Norms Using Coordination
Games: Why does Dictator Game Sharing Vary? Journal of the European
Economic Association 11(3) 495–524.
Kwintessential. Doing Business in Germany. Available at
http://www.kwintessential.co.uk/etiquette/doing-business-germany.html,
last access 25.02.2014.
Nagin, D. S., J. B. Rebitzer, S. Sanders, L. J. Taylor. 2002. Monitoring, Mo-
tivation, and Management: The Determinants of Opportunistic Behavior in a
Field Experiment. American Economic Review 92(2) 850-873.
Rabin, M. 1993. Incorporating Fairness into Game Theory and Economics.
American Economic Review 83(5) 1281–302.
Robinson, S. L., R. J. Bennett. 1995. A Typology of Deviant Workplace Be-
haviors: A Multidimensional Scaling Study. Academy of Management Journal
38(2) 555–572.
Sliwka, D. 2007. Trust as a Signal of a Social Norm and the Hidden Costs of
Incentive Schemes. American Economic Review 97(3) 999–1012.
The Local: Germany’s news in English. Ten tips for German business etiquette.
Available at http://www.thelocal.de/galleries/news/1773, last access
25.02.2014.
University of Frankfurt (International Office). 2013. Guide to German culture,
customs and etiquette. Available at
http://www2.uni-frankfurt.de/49378893/Guide-to-German-culture_-costums-and-etiquette-02_12_13.pdf,
last access 25.02.2014.
Article
In this paper we introduce the Online Recruitment System for Economic Experiments (ORSEE). With this software experimenters have a free, convenient and very powerful tool to organize their experiments and sessions in a standardized way. Additionally, ORSEE provides subject pool statistics, a laboratory calendar, and tools for scientific exchange. A test system has been installed in order to visually support the reader while reading the paper.
Article
We study cultural norms and legal enforcement in controlling corruption by analyzing the parking behavior of United Nations officials in Manhattan. Until 2002, diplomatic immunity protected UN diplomats from parking enforcement actions, so diplomats' actions were constrained by cultural norms alone. We find a strong effect of corruption norms: diplomats from high-corruption countries (on the basis of existing survey-based indices) accumulated significantly more unpaid parking violations. In 2002, enforcement authorities acquired the right to confiscate diplomatic license plates of violators. Unpaid violations dropped sharply in response. Cultural norms and (particularly in this context) legal enforcement are both important determinants of corruption. (c) 2007 by The University of Chicago. All rights reserved.