Since the concept of crowd sourcing is relatively new, many potential participants have questions about the AMT marketplace. For example, a common set of questions that pop up in an 'introduction to crowd sourcing and AMT' session are the following: What type of tasks can be completed in the marketplace? How much does it cost? How fast can I get results back? How big is the AMT marketplace? The answers for these questions remain largely anecdotal and based on personal observations and experiences. To understand better what types of tasks are being completed today using crowd sourcing techniques, we started collecting data about the AMT marketplace. We present a preliminary analysis of the dataset and provide directions for interesting future research.
AnalyzingtheAmazonMechanicalTurkMarketplace
PanagiotisG.Ipeirotis1
NewYorkUniversity
Introduction
Amazon Mechanical Turk (AMT) is a popular crowdsourcing marketplace, introduced by Amazon in 2005. The marketplace is named after the “Mechanical Turk,” an 18th-century “automatic” chess-playing machine that was handily beating humans in chess games. Of course, the robot was not using any artificial intelligence algorithms back then. The secret of the “Mechanical Turk” machine was a human operator, hidden inside the machine, who was the real intelligence behind the intelligent behavior exhibited by the machine.
Amazon Mechanical Turk is also a marketplace for small tasks that cannot be easily automated today. For example, humans can easily tell if two different descriptions correspond to the same product, can easily tag an image with descriptions of its content, or can easily transcribe an audio snippet with high quality. However, such simple tasks for humans are often very hard for computers. Using AMT, it is possible for computers to use a programmable API to post tasks on the marketplace, which are then fulfilled by human users. This API-based interaction gives the impression that the task can be automatically fulfilled, hence the name “Mechanical Turk.”
In the marketplace, employers, known as “requesters,” post tasks, which are called “HITs,” an acronym for “Human Intelligence Tasks.” The HITs are then picked up by online users, referred to as “workers,” who complete them in exchange for a small payment, typically a few cents per HIT.
Since the concept of crowdsourcing is relatively new, many potential participants have questions about the AMT marketplace. For example, a common set of questions that pop up in an “introduction to crowdsourcing and AMT” session are the following:
Who are the workers that complete these tasks?
What type of tasks can be completed in the marketplace?
How much does it cost?
How fast can I get results back?
How big is the AMT marketplace?
For the first question, about the demographics of the workers, past research (Ipeirotis, 2010; Ross et al., 2010) indicated that the workers who participate in the marketplace come mainly from the United States, with an increasing proportion coming from India. In general, the workers are representative of the general Internet user population but are generally younger and, correspondingly, have lower incomes and smaller families.

¹ Panagiotis G. Ipeirotis is an Associate Professor at the Department of Information, Operations, and Management Sciences at the Leonard N. Stern School of Business of New York University. His recent research interests focus on crowdsourcing. He received his Ph.D. degree in Computer Science from Columbia University in 2004, with distinction. He has received two Microsoft Live Labs Awards, two "Best Paper" awards (IEEE ICDE 2005, ACM SIGMOD 2006), two "Best Paper Runner-Up" awards (JCDL 2002, ACM KDD 2008), and is also a recipient of a CAREER award from the National Science Foundation. This work was supported by the National Science Foundation under Grant No. IIS-0643846.
At the same time, the answers to the other questions remain largely anecdotal and based on personal observations and experiences. To understand better what types of tasks are being completed today using crowdsourcing techniques, we started collecting data about the marketplace. Here, we present a preliminary analysis of the findings and provide directions for interesting future research.
The rest of the paper is structured as follows. First, we briefly describe the data collection process and the characteristics of the collected dataset. Then we describe the characteristics of the requesters in terms of activity and posted tasks, and we also provide a short analysis of the most common tasks that are being completed on Mechanical Turk today. Next, we analyze the price distributions of the posted HITs and the HIT posting and completion dynamics of the marketplace. We conclude by presenting an analysis of the completion time distribution of the HITs on Mechanical Turk, together with some directions for future research and some design improvements that can improve the efficiency and effectiveness of the marketplace.
Data Collection
We started gathering data about the AMT marketplace in January 2009, and we continue to collect data today. The process of collecting data is the following: every hour we crawl the list of “HITs Available” on AMT and keep the status of each available HIT group (group id, requester, title, description, keywords, rewards, number of HITs available within the HIT group, qualifications required, time of expiration). We also store the HTML content of each HIT. Following this approach, we can find the new HITs being posted over time, the completion rate of each HIT, and the time at which they disappear from the market, either because they have been completed, because they expired, or because a requester canceled and removed the remaining HITs from the market.² A shortcoming of this approach is that it cannot measure the redundancy of the posted HITs: if a single HIT needs to be completed by multiple workers, we can only observe it as a single HIT.
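As an illustration only, the comparison of two consecutive hourly crawls could be sketched in Python as follows. The record layout (`GroupStatus`), the function name, and the representation of time as hours since the start of the crawl log are all hypothetical and are not taken from the actual crawler. The sketch also treats every drop in the number of available HITs as completions, which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupStatus:
    """One HIT group as seen in a single crawl (hypothetical record layout)."""
    group_id: str
    reward_usd: float
    hits_available: int
    expires_at: float  # expiration time, in hours since the start of the crawl log


def diff_snapshots(previous, current, crawl_time):
    """Compare two consecutive crawls, each a dict of group_id -> GroupStatus.

    Returns the newly posted group ids, the number of HITs that left each
    surviving group, and a coarse reason for every group that disappeared.
    """
    new_groups = sorted(set(current) - set(previous))
    completed_hits = {}
    disappeared = {}
    for group_id, old in previous.items():
        if group_id in current:
            # Assumption: a drop in available HITs is counted as completed work.
            drop = old.hits_available - current[group_id].hits_available
            if drop > 0:
                completed_hits[group_id] = drop
        elif old.expires_at <= crawl_time:
            disappeared[group_id] = "expired"
        else:
            # Telling these two cases apart needs the completion-rate history.
            disappeared[group_id] = "completed_or_cancelled"
    return new_groups, completed_hits, disappeared
```

Because each HIT group is observed only through its count of available HITs, a sketch like this shares the shortcoming noted above: a HIT that must be completed by several workers still appears as a single HIT.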
The data are also publicly available through the website http://www.mturktracker.com.
From January 2009 through April 2010, we collected 165,368 HIT groups, with a total of 6,701,406 HITs, from 9,436 requesters. The total value of the posted HITs was $529,259. These numbers, of course, do not account for the redundancy of the posted HITs, or for HITs that were posted and disappeared between our crawls. Nevertheless, they should be good approximations (within an order of magnitude) of the activity of the marketplace.

² Identifying expired HITs is easy, as we know the expiration time of a HIT. Identifying “cancelled” HITs is a little trickier: we need to monitor the usual completion rate of a HIT over time and see whether it is likely, at the time of disappearance, for the remaining HITs to have been completed within the time since the last crawl.
Top Requesters and Frequently Posted Tasks
One way to understand what types of tasks are being completed in the marketplace is to find the “top” requesters and analyze the HITs that they post. Table 1 shows the top requesters, based on the total rewards of the HITs posted, filtering out requesters that were active only for a short period of time.
We can see that very few active requesters post a significant number of tasks in the marketplace and account for a large fraction of the posted rewards. According to our measurements, the top requesters listed in Table 1 (which are 0.1% of the total requesters in our dataset) account for more than 30% of the overall activity of the market.
Requester ID    Requester Name          # HIT groups  Total HITs  Rewards   Type of tasks
A3MI6MIUNWCR7F  CastingWords            48,934        73,621      $59,099   Transcription
A2IR7ETVOIULZU  Dolores Labs            1,676         320,543     $26,919   Mediator for other requesters
A2XL3J4NH6JI12  ContentGalore           1,150         23,728      $19,375   Content generation
A1197OGL0WOQ3G  Smartsheet.com Clients  1,407         181,620     $17,086   Mediator for other requesters
AGW2H4I480ZX1   Paul Pullen             6,842         161,535     $11,186   Content rewriting
A1CTI3ZAWTR5AZ  Classify This           228           484,369     $9,685    Object classification
A1AQ7EJ5P7ME65  Dave                    2,249         7,059       $6,448    Transcription
AD7C0BZNKYGYV   QuestionSwami           798           10,980      $2,867    Content generation and evaluation
AD14NALRDOSN9   retaildata              113           158,206     $2,118    Object classification
A2RFHBFTZHX7UN  ContentSpooling.net     555           622         $987      Content generation and evaluation
A1DEBE1WPE6JFO  Joel Harvey             707           707         $899      Transcription
A29XDCTJMAE5RU  Raphael Mudge           748           2,358       $548      Website feedback
Table 1: Top requesters based on the total posted rewards available to a single worker (Jan 2009 - April 2010).
Given the high concentration of the market, the types of tasks posted by the top requesters show the types of tasks that are being completed in the marketplace: CastingWords is the major requester, posting transcription tasks frequently; there are also two other semi-anonymous requesters posting transcription tasks. Among the top requesters we also see two mediator services, Dolores Labs (aka Crowdflower) and Smartsheet.com, who post tasks on Mechanical Turk on behalf of their clients. Such services are essentially aggregators of tasks and provide quality assurance services on top of Mechanical Turk. The fact that they account for approximately 10% of the market indicates that many users who are interested in crowdsourcing prefer to use an intermediary that addresses the concerns about worker quality and also allows posting of complex tasks without the need for programming.
We also see that four of the top requesters use Mechanical Turk in order to create a variety of original content, including product reviews, feature stories, blog posts, and so on.³ Finally, we see that two requesters use Mechanical Turk in order to classify a variety of objects into categories. This was the original task for which Mechanical Turk was used by Amazon.

³ One requester, “Paul Pullen”, uses Mechanical Turk in order to paraphrase existing content, instead of asking the workers to create content from scratch.
The high concentration of the market is not unusual for an online community. There is always a long tail of participants that have significantly lower activity than the top contributors. Figure 1 shows how this activity is distributed, according to the value of the HITs posted by each requester. The x-axis shows the log2 of the value of the posted HITs and the y-axis shows what percentage of requesters has this level of activity. As we can see, the distribution is approximately log-normal. Interestingly enough, this is approximately the same level of activity demonstrated by workers (Ipeirotis, 2010).
Figure 1: Number of requesters vs. total rewards posted.
For our analysis, we also wanted to examine the marketplace as a whole, to see if the HITs submitted by other requesters were significantly different from the ones posted by the top requesters. For this, we measured the popularity of the keywords in the different HIT groups, measuring the number of HIT groups with a given keyword, the number of HITs, and the total amount of rewards associated with this keyword. Table 2 shows the results.
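A minimal sketch of such a keyword aggregation, in Python, is given below. The field names (`keywords`, `hits`, `reward`) and both function names are hypothetical, and the sketch assumes that the rewards attributed to a keyword are the per-HIT reward multiplied by the number of HITs in the group.

```python
from collections import defaultdict


def keyword_stats(hit_groups):
    """Aggregate per-keyword totals over HIT groups.

    Each HIT group is a dict with hypothetical fields:
    'keywords' (list of str), 'hits' (int), and 'reward' (USD per HIT).
    """
    stats = defaultdict(lambda: {"groups": 0, "hits": 0, "rewards": 0.0})
    for group in hit_groups:
        for keyword in set(group["keywords"]):  # count each keyword once per group
            entry = stats[keyword]
            entry["groups"] += 1
            entry["hits"] += group["hits"]
            entry["rewards"] += group["hits"] * group["reward"]
    return dict(stats)


def top_keywords(stats, metric, n=50):
    """Return the n keywords with the largest value of the chosen metric."""
    return sorted(stats, key=lambda k: stats[k][metric], reverse=True)[:n]
```

Ranking the same statistics by `"rewards"`, `"groups"`, and `"hits"` gives three different orderings, which is why a keyword can rank high on one measure and low on another.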
Our keyword analysis of all HITs in our dataset indicates that transcription is indeed a very common task on the AMT marketplace. Notice that it is one of the most “rewarding” keywords and appears in many HIT groups, but not in many HITs. This means that most of the transcription HITs are posted as single HITs and not as groups of many similar HITs. By comparing the prices for the transcription HITs, we also noticed that it is a task for which the payment per HIT is comparatively high. It is unclear at this point whether this is due to the high expectation for quality or whether the higher price simply reflects the higher effort required to complete these transcription HITs.
Beyond transcription, Table 2 indicates that classification and categorization are indeed tasks that appear in many (inexpensive) HITs. Table 2 also indicates that many tasks are about data collection, image tagging and classification, and also ask workers for feedback and advice for a variety of tasks (e.g., usability testing of websites).
[Figure 1 panels: “QQ Plot: % of requesters vs % of rewards” (axes: percent of requesters, 0.01% to 100%; percentage of rewards, 0% to 100%) and “# Requesters vs Total Rewards Posted” (axes: log2 of total rewards posted, 0 to 20; number of requesters, 0% to 18%).]
Keyword         Rewards   Keyword               # HIT groups  Keyword         # HITs
data            $192,513  castingwords          48,982        product         4,665,449
collection      $154,680  cw                    48,981        data            3,559,495
easy            $93,293   podcast               47,251        categorization  3,203,470
writing         $91,930   transcribe            40,697        shopping        3,086,966
transcribe      $81,416   english               34,532        merchandise     2,825,926
english         $78,344   mp                    33,649        collection      2,599,915
quick           $75,755   writing               29,229        easy            2,255,757
product         $66,726   question              21,274        categorize      2,047,071
cw              $66,486   answer                20,315        quick           1,852,027
castingwords    $66,111   opinion               15,407        website         1,762,722
podcast         $64,418   short                 15,283        category        1,683,644
mp              $64,162   advice                14,198        image           1,588,586
website         $60,527   easy                  11,420        search          1,456,029
search          $57,578   article               10,909        fast            1,372,469
image           $55,013   edit                  9,451         shopzilla       1,281,459
builder         $53,443   research              9,225         tagging         1,028,802
mobmerge        $53,431   quick                 8,282         cloudsort       1,018,455
write           $52,188   survey                8,265         classify        1,007,173
listings        $48,853   editing               7,854         listings        962,009
article         $48,377   data                  7,548         tag             956,622
research        $48,301   rewriting             7,200         photo           872,983
shopping        $48,086   write                 7,145         pageview        862,567
categorization  $44,439   paul                  6,845         this            845,485
simple          $43,460   pullen                6,843         simple          800,573
fast            $40,330   snippet               6,831         builder         796,305
categorize      $38,705   confirm               6,543         mobmerge        796,262
email           $32,989   grade                 6,515         picture         743,214
merchandise     $32,237   sentence              6,275         url             739,049
url             $31,819   fast                  5,620         am              613,744
tagging         $30,110   collection            5,136         retail          601,714
web             $29,309   review                4,883         web             584,152
photo           $28,771   nanonano              4,358         writing         548,111
review          $28,707   dinkle                4,358         research        511,194
content         $28,319   multiconfirmsnippet   4,218         email           487,560
articles        $27,841   website               4,140         v               427,138
category        $26,656   money                 4,085         different       425,333
flower          $26,131   transcription         3,852         entry           410,703
labs            $26,117   articles              3,540         relevance       400,347
crowd           $26,117   search                3,488         flower          339,216
doloreslabs     $26,117   blog                  3,406         labs            339,185
crowdflower     $26,117   and                   3,360         crowd           339,184
delores         $26,117   simple                3,164         crowdflower     339,184
dolores         $26,117   answers               2,637         doloreslabs     339,184
deloreslabs     $26,117   improve               2,632         delores         339,184
entry           $25,644   retranscribe          2,620         dolores         339,184
tag             $25,228   writer                2,355         deloreslabs     339,184
video           $25,100   image                 2,322         find            338,728
editing         $24,791   confirmsnippet        2,291         contact         324,510
classify        $24,054   confirmtranscription  2,288         address         323,918
answer          $23,856   voicemail             2,202         editing         321,059
Table 2: The top 50 most frequent HIT keywords in the dataset, ranked by total reward amount, # of HIT groups, and # of HITs.
Price Distributions
To understand better the typical prices paid for crowdsourcing tasks on AMT, we examined the distribution of the HIT prices and the size of the posted HITs. Figure 2 illustrates the results. When examining HIT groups, we can see that only 10% of the HIT groups have a price tag of 2 cents or less, 50% of the HIT groups have a price above 10 cents, and 15% of the HIT groups come with a price tag of $1 or more.
However, this analysis can be misleading. In general, HIT groups with a high price contain only a single HIT, while the HIT groups with a large number of HITs have a low price. Therefore, if we compute the distribution of HITs (not HIT groups) according to the price, we can see that 25% of the HITs created on Mechanical Turk have a price tag of just 1 cent, 70% of the HITs have a reward of 5 cents or less, and 90% of the HITs come with a reward of less than 10 cents. This analysis confirms the common feeling that most of the tasks on Mechanical Turk have tiny rewards.
Of course, this analysis simply scratches the surface of the bigger problem: how can we automatically price tasks, taking into consideration the nature of the task, the existing competition, the expected activity level of the workers, the desired completion time, the tenure and prior activity of the requester, and many other factors? For example, how much should we pay for an image tagging task, for 100,000 images, in order to get it done within 24 hours? Building such models will make the execution of crowdsourcing tasks easier for people who simply want to “get things done” and do not want to tune and micro-optimize their crowdsourcing process.
Figure 2: Distribution of HIT groups and HITs according to HIT price.
[Figure 2 panels: “% of HIT groups vs HIT price” and “% of HITs vs HIT price”; x-axis: HIT price, $0.01 to $10.00; y-axis: 0% to 100%.]
Activity Dynamics on the AMT Marketplace: Posting and Serving Processes
What is the typical activity in the AMT marketplace? What is the volume of the transactions? These are very common questions from people who are interested in understanding the size of the market and its demonstrated capacity⁴ for handling big tasks.
One way to approach such questions is to examine the task posting and task completion activity on AMT. By studying the posting activity we can understand the demand for crowdsourcing, and the completion rate shows how fast the market can handle the demand. To study these processes, we computed, for each day, the value of the tasks posted by AMT requesters and the value of the tasks that were completed.
We first present an analysis of the two processes (posting and completion), ignoring any dependencies on task-specific and time-specific factors. Figure 3 illustrates the distributions of the posting and completion processes. The two distributions are similar, but we see that, in general, the rate of completion is slightly higher than the rate of arrival. This is not surprising, and is a required stability condition: if the completion rate were lower than the arrival rate, then the number of incomplete tasks in the marketplace would go to infinity. We observed that the median arrival rate is $1,040 per day and the median completion rate is $1,155 per day. If we assume that the AMT marketplace behaves like an M/M/1 queuing system, then basic queuing theory indicates that a task worth $1 has an average completion time of 12.5 minutes, resulting in an effective hourly wage of $4.80.
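One way to reproduce these figures is sketched below in Python. It assumes the standard M/M/1 result that the mean time a job spends in the system is 1/(μ - λ), with the arrival rate λ and the service rate μ both measured in dollars of work per day, so that one "job" is $1 of work. The function and variable names are illustrative.

```python
def mm1_time_in_system(arrival_rate, service_rate):
    """Mean time a job spends in an M/M/1 system, in the time unit of the rates."""
    if service_rate <= arrival_rate:
        raise ValueError("unstable queue: service rate must exceed arrival rate")
    return 1.0 / (service_rate - arrival_rate)


arrival = 1040.0   # median posting rate, USD of work per day (from the text)
service = 1155.0   # median completion rate, USD of work per day (from the text)

days = mm1_time_in_system(arrival, service)   # time for one $1 unit of work
minutes = days * 24 * 60
hourly_wage = 60.0 / minutes                  # USD per hour at $1 per completed task
print(f"{minutes:.1f} minutes, ${hourly_wage:.2f}/hour")
```

Under these assumptions the sketch gives about 12.5 minutes per $1 task and roughly $4.79 per hour, in line with the values quoted above.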
Figure 3: The distribution of the arrival and completion rate on the AMT marketplace, as a function of the USD ($) value of the posted/completed HITs.
[Figure 3 panel: “Posting and Completion Activity CDF”; x-axis: value of completed HITs in USD ($), $0 to $4,000; y-axis: % of days with completion activity < X, 0% to 100%.]
Of course, this analysis is an oversimplification of the actual process. The tasks are not completed in a first-in-first-out manner, and the completion rate is not independent of the arrival rate. In reality, workers pick tasks following personal preferences or by being restricted by the web user interface of AMT. For example, Chilton et al. (2010) indicate that most workers use two of the main task sorting mechanisms provided by AMT to find and complete tasks (the “recently posted” and “largest number of HITs” orders). Furthermore, the completion rate depends on the arrival rate: when there are many tasks available, more workers come to complete tasks, as there are more opportunities to find and work on bigger tasks, as opposed to working on one-time HITs. As a simple example, consider the dependency of posting and completion rates on the day of the week. (Figure 4 illustrates the results.) The posting activity from the requesters is significantly lower over the weekends and is typically maximized on Tuesdays. This can be rather easily explained: since most requesters are corporations and organizations, most of the tasks are posted during normal working days. However, the same does not hold for workers. The completion activity is rather unaffected by the weekends. The only day on which the completion rate drops is Monday, and this is most probably a side effect of the lower posting rate over the weekends. (There are fewer tasks available for completion on Monday, due to the lower posting rate over the weekend.)

⁴ Detecting the true capacity of the market is a more involved task than simply measuring its current serving rate. Many workers may show up only when there is a significant amount of work for them, and be dormant under normal loads. Fully examining this question is beyond the scope of this paper.
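A day-of-week breakdown of this kind could be computed along the following lines. This is only a sketch: the input layout (a mapping from calendar date to the dollar value posted or completed on that day) and the choice to report per-weekday averages are assumptions, and the example values are made up.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def mean_value_by_weekday(daily_values):
    """Average a {date: USD value} mapping by day of the week."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for day, value in daily_values.items():
        name = WEEKDAYS[day.weekday()]  # date.weekday(): Monday is 0
        totals[name] += value
        counts[name] += 1
    return {name: totals[name] / counts[name] for name in totals}


# Made-up values: two Mondays and one Tuesday in April 2010.
example = {date(2010, 4, 5): 100.0, date(2010, 4, 12): 300.0, date(2010, 4, 6): 500.0}
by_day = mean_value_by_weekday(example)
```

Running the same function on the daily posted values and on the daily completed values gives the two series that can then be compared by weekday.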
Figure 4: The posting and completion rate on AMT as a function of the day of the week.
An interesting open question is to understand better how to model the marketplace. Work on queuing theory for modeling call centers is related, and can help us understand better the dynamics of the market and the way that workers handle the posted tasks. Next, we present some evidence that modeling can help us understand better the shortcomings of the market and point to potential design improvements.
[Figure 4 panels: “Total Value of Posted HITs” (0 to 5000) and “Total Value of Completed HITs” (0 to 3500), each plotted against the day of the week, Sun through Sat.]
Activity Dynamics on the AMT Marketplace: Completion Time Distribution
Given that the system does not satisfy the usual queuing assumptions of M/M/1 for the analysis of completion times, we analyzed empirically the completion time for the posted tasks. The goal of this analysis was to understand what approaches may be appropriate for modeling the behavior of the AMT marketplace.
Our analysis indicated that the completion time follows (approximately) a power law, as illustrated in Figure 5. We observe some irregularities, with some outliers at approximately the 12-hour and the 7-day completion times. These are common “expiration times” set for many HITs, hence the sudden disappearance of many HITs at those points. Similarly, we see a different behavior for HITs that are available for longer than one week: these HITs are typically “renewed” by their requesters through the continuous posting of new HITs within the same HIT group.⁵ Although it is still unclear what dynamics cause this behavior, the analysis by Barabási (2005) indicates that priority-based completion of tasks can lead to such power-law distributions.
To better characterize this power-law distribution of completion times, we used the maximum likelihood estimator for power laws. To avoid biases, we also marked as “censored” the HITs that we detected to be “aborted before completion” and the HITs that were still running at the last crawling date of our dataset. (For brevity, we omit the details.) The maximum likelihood estimator indicated that the most likely exponent for the power-law distribution of the completion times on Mechanical Turk is α = 1.48. This exponent is very close to the value predicted theoretically for the queuing model of Cobham (1954), in which each task upon arrival is assigned to a queue with a different priority. Barabási (2005) indicates that the Cobham model can be a good explanation of the power-law distribution of completion times only when the arrival rate is equal to the completion rate of tasks. Our earlier results indicate that for the AMT marketplace this is not far from reality. Hence the Cobham model of priority-based execution of tasks can explain the power-law distribution of completion times.
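For readers unfamiliar with the estimator, the following Python sketch shows the textbook maximum likelihood estimate of the exponent of a continuous power law with a known lower cutoff. It is a simplification and not the procedure used for the results above: it ignores censoring entirely, and the cutoff `x_min`, the function names, and the synthetic data are assumptions made for illustration.

```python
import math
import random


def power_law_mle(samples, x_min):
    """MLE of alpha for a continuous power law p(x) ~ x^(-alpha), x >= x_min."""
    tail = [x for x in samples if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum <= 0.0:
        raise ValueError("need at least one sample strictly above x_min")
    return 1.0 + len(tail) / log_sum


def sample_power_law(alpha, x_min, n, rng):
    """Draw n values by inverse-transform sampling (used only to check the estimator)."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


# Synthetic check: data generated with a known exponent should give it back.
rng = random.Random(0)
synthetic = sample_power_law(alpha=1.48, x_min=1.0, n=20000, rng=rng)
estimate = power_law_mle(synthetic, x_min=1.0)
```

On synthetic data the estimate should land close to the exponent used for sampling. Handling censored observations, as described above, requires a modified likelihood that this sketch does not attempt.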
Figure 5: The distribution of completion times for HIT groups posted on AMT. The distribution does not change significantly if we use the completion time per HIT (and not per HIT group), as 80% of the HIT groups contain just one HIT.

⁵ A common reason for this behavior is for the HIT to appear on the first page of the “Most recently posted” list of HIT groups, as many workers pick the tasks to work on from this list (Chilton et al., 2010).
[Figure 5 panels: “Distribution of completion time for HIT Groups” (x-axis: completion time for HIT group, in hours, 1 to 16,384; y-axis: number of HIT groups, 1 to 65,536) and “CDF of completion times for HIT Groups” (x-axis: completion time for HIT group, in hours, 1 to 16,384; y-axis: % of HIT groups with completion time < x, 0% to 100%).]
Unfortunately, a system with a power-law distribution of completion times is rather undesirable. Given the infinite variance of a power-law distribution with such an exponent, it is inherently difficult to predict the time required to complete a task. Although we can predict that for many tasks the completion time will be short, there is a high probability that a posted task will need a significant amount of time to finish. This can happen when a small task is not executed quickly, and therefore is no longer visible in either of the two preferred queues from which workers pick tasks to work on. The probability of a “forgotten” task increases if the task is not discoverable through any of the other sorting methods either.
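The difficulty can be made concrete with the standard moment calculation for a continuous power law. The lower cutoff x_min is an assumption of this sketch, since only the exponent is reported above.

```latex
p(x) = \frac{\alpha - 1}{x_{\min}} \left( \frac{x}{x_{\min}} \right)^{-\alpha},
\qquad x \ge x_{\min},\ \alpha > 1,
\qquad
\mathbb{E}\!\left[x^{k}\right] = \int_{x_{\min}}^{\infty} x^{k}\, p(x)\, dx =
\begin{cases}
\dfrac{\alpha - 1}{\alpha - 1 - k}\, x_{\min}^{k}, & k < \alpha - 1, \\[1ex]
\infty, & k \ge \alpha - 1 .
\end{cases}
```

The variance (k = 2) is therefore finite only for α > 3 and the mean (k = 1) only for α > 2. With the estimated exponent α = 1.48, both diverge under this idealized, untruncated model, so neither an average nor a spread of completion times can be meaningfully predicted from it.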
This result indicates that it is necessary for the AMT marketplace to be equipped with better ways for workers to pick tasks. If workers can pick tasks to work on in a slightly more “randomized” fashion, it will be possible to change the behavior of the system and eliminate the “heavy-tailed” distribution of completion times. This can lead to a higher predictability of completion times, which is a desirable characteristic for requesters. New requesters in particular, who lack the experience needed to make their tasks visible, would find such a characteristic desirable, as it would lower the barrier to successfully completing tasks on the AMT market.
We should note, of course, that these results do not take into consideration the effect of various factors. For example, an established requester is expected to have its tasks completed faster than a new requester that has not established connections with the worker community. A task with a higher price will be picked up faster than an identical task with a lower price. An image recognition task is typically easier than a content generation task, hence more workers will be available to work on it and finish it faster. These are interesting directions for future research, as they can show the effect of various factors when designing and posting tasks. This can lead to a better understanding of the crowdsourcing process and a better prediction of completion times when crowdsourcing various tasks.
Higher predictability means lower risk for new participants. Lower risk means higher participation and higher satisfaction both for requesters and for workers.
Conclusions
Our analysis indicates that AMT is a heavy-tailed market in terms of requester activity, with the activity of the requesters following a log-normal distribution: the top 0.1% of the requesters account for 30% of the dollar activity, and 1% of the requesters post more than 50% of the dollar-weighted tasks. A similar activity pattern also appears on the side of workers (Ipeirotis, 2010). This can be interpreted both positively and negatively. The negative aspect is that the adoption of crowdsourcing solutions is still minimal, as only a small number of participants actively use crowdsourcing for large-scale tasks. On the other hand, the long tail of requesters indicates a significant interest in such solutions. By observing the practices of the successful requesters, we can learn more about what makes crowdsourcing successful, and increase the demand from the smaller requesters.
We also observe that the activity is still concentrated around small tasks, with 90% of the posted HITs giving a reward of 10 cents or less. A next step in this analysis is to separate the price distributions by type of task and identify the “usual” pricing points for different types of tasks. This can provide guidance to new requesters who do not know whether they are pricing their tasks correctly.
Finally, we presented a first analysis of the dynamics of the AMT marketplace. By analyzing the speed of posting and completion of the posted HITs, we can see that Mechanical Turk is a price-effective task completion marketplace, as the estimated hourly wage is approximately $5. Further analysis will allow us to get better insight into “how things get done” on the AMT market, identifying elements that can be improved and lead to a better design for the marketplace. For example, by analyzing the waiting time for the posted tasks, we get significant evidence that workers are limited by the current user interface and complete tasks by picking the HITs available through one of the existing sorting criteria. This limitation leads to a high degree of unpredictability in completion times, a significant shortcoming for requesters who want a high degree of reliability. A better search and discovery interface (or perhaps a better task recommendation service, a specialty of Amazon.com) can lead to improvements in the efficiency and predictability of the marketplace.
Further research is also necessary to better predict how changes in the design and parameters of a task can affect quality and completion speed. Ideally, we should have a framework that automatically optimizes all the aspects of task design. Database systems hide the underlying complexity of data management, using query optimizers to pick the appropriate execution plans. Google Predict hides the complexity of predictive modeling by offering an auto-optimizing framework for classification. Crowdsourcing can benefit significantly from the development of similar frameworks that provide similar abstractions and automatic task optimizations.
References
Mechanical Turk Monitor, http://www.mturktracker.com.
Barabási, A.-L. 2005. The origin of bursts and heavy tails in human dynamics. Nature, 435:207-211.
Cobham, A. 1954. Priority assignment in waiting line problems. J. Oper. Res. Soc. Am. 2, 70-76.
Chilton, L. B., Horton, J. J., Miller, R. C., and Azenkot, S. 2010. Task search in a human computation market. In Proceedings of the ACM SIGKDD Workshop on Human Computation (Washington DC, July 25, 2010). HCOMP '10. ACM, New York, NY, 1-9.
Ipeirotis, P. 2010. Demographics of Mechanical Turk. Working Paper CeDER-10-01, New York University, Stern School of Business. Available at http://hdl.handle.net/2451/29585
Ross, J., Irani, L., Silberman, M. S., Zaldivar, A., and Tomlinson, B. 2010. Who are the crowdworkers?: shifting demographics in mechanical turk. In Proceedings of the 28th of the International Conference Extended Abstracts on Human Factors in Computing Systems (Atlanta, Georgia, USA, April 10-15, 2010). CHI EA '10. ACM, New York, NY, 2863-2872.
... Crowd work has several related terminologies, including crowdsourcing, human computing, citizen science, open innovation, collective intelligence, participatory sensing, and so on. The terminology is used to describe the phenomenon to utilize crowds to complete work in various forms (Kittur, et al., 2013;Ipeirotis, 2010;Howe, 2009;Erickson et al., 2012;Anya, 2015;Cefkin et al., 2014). Crowdwork is a job that is managed by the organization and is done by paid and distributed crowd workers in various locations. ...
... In a global context experiencing different work requirements, the need to crowdsource tasks is associated with human computation, which have the purpose of organizing the tasks executed by humans to perform computation processes (Law and Ahn, 2011). Through this lens, crowdsourcing can be seen as the optimal usage of human computation, which is particularly useful and helpful for companies (Ipeirotis 2010;Nguyen Hoang, Pedro, and David, 2017) and scientific institutions (Cooper et al. 2010;Raddick et al. 2019), while even contributing to advance artificial intelligence research (Chang et al. 2017;Correia et al. 2018;Muller et al. 2015;F. A. Schmidt 2019). ...
Article
Full-text available
Online microtask labor has increased its role in the last few years and has provided the possibility of people who were usually excluded from the labor market to work anytime and without geographical barriers. While this brings new opportunities for people to work remotely, it can also pose challenges regarding the difficulty of assigning tasks to workers according to their abilities. To this end, cognitive personalization can be used to assess the cognitive profile of each worker and subsequently match those workers to the most appropriate type of work that is available on the digital labor market. In this regard, we believe that the time is ripe for a review of the current state of research on cognitive personalization for digital labor. The present study was conducted by following the recommended guidelines for the software engineering domain through a systematic literature review that led to the analysis of 20 primary studies published from 2010 to 2020. The results report the application of several cognition theories derived from the field of psychology, which in turn revealed an apparent presence of studies indicating accurate levels of cognitive personalization in digital labor in addition to a potential increase in the worker’s performance, most frequently investigated in crowdsourcing settings. In view of this, the present essay seeks to contribute to the identification of several gaps and opportunities for future research in order to enhance the personalization of online labor, which has the potential of increasing both worker motivation and the quality of digital work.
... (Hirth et al., 2011) or Amazon Mechanical Turk (MTurkwww.mturk.com) (Ipeirotis, 2010), responsible for the recruitment and payment of the workers. ...
Preprint
Full-text available
Accurate tree detection is of growing importance in applications such as urban planning, forest inventory, and environmental monitoring. In this article, we present an approach to creating tree maps by annotating them in 3D point clouds. Point cloud representations allow the precise identification of tree positions, particularly stem locations, and their heights. Our method leverages human computational power through paid crowdsourcing, employing a web tool designed to enable even non-experts to effectively tackle the task. The primary focus of this paper is to discuss the web tool's development and strategies to ensure high-quality tree annotations despite encountering noise in the crowdsourced data. Following our methodology, we achieve quality measures surpassing 90% for various challenging test sets of diverse complexities. We emphasize that our tree map creation process, including initial point cloud collection, can be completed within 1-2 days.
Article
Research suggests that the temporal flexibility advertised to crowdworkers by crowdsourcing platforms is limited by both client-imposed constraints (e.g., strict completion times) and crowdworkers’ tooling practices (e.g., multitasking). In this paper, we explore an additional contributor to workers’ limited temporal flexibility: the design of crowdsourcing platforms, namely requiring crowdworkers to be ‘on call’ for work. We conducted two studies to investigate the impact of having to be ‘on call’ on workers’ schedule control and job control. We find that being ‘on call’ impacted: (1) participants’ ability to schedule their time and stick to planned work hours, and (2) the pace at which participants worked and took breaks. The results of the two studies suggest that the ‘on-demand’ nature of crowdsourcing platforms can limit workers’ temporal flexibility by reducing schedule control and job control. We conclude the paper by discussing the implications of the results for: (a) crowdworkers, (b) crowdsourcing platforms, and (c) the wider platform economy.
Article
Due to the presence of noise in crowdsourced labels, label aggregation (LA) has become a standard procedure for post-processing these labels. LA methods estimate true labels from crowdsourced labels by modeling worker quality. However, most existing LA methods are iterative in nature. They require multiple passes through all crowdsourced labels, jointly and iteratively updating true labels and worker qualities until a termination condition is met. As a result, these methods are burdened with high space and time complexities, which restrict their applicability in scenarios where scalability and online aggregation are essential. Furthermore, defining a suitable termination condition for iterative algorithms can be challenging. In this paper, we view LA as a dynamic system and represent it as a Dynamic Bayesian Network. From this dynamic model, we derive two lightweight and scalable algorithms: LA onepass and LA twopass. These algorithms can efficiently and effectively estimate worker qualities and true labels by traversing all labels at most twice, thereby eliminating the need for explicit termination conditions and multiple traversals over the crowdsourced labels. Due to their dynamic nature, the proposed algorithms are also capable of performing label aggregation online. We provide theoretical proof of the convergence property of the proposed algorithms and bound the error of the estimated worker qualities. Furthermore, we analyze the space and time complexities of our proposed algorithms, demonstrating their equivalence to those of majority voting. Through experiments conducted on 20 real-world datasets, we demonstrate that our proposed algorithms can effectively and efficiently aggregate labels in both offline and online settings, even though they traverse all labels at most twice. The code is available at https://github.com/yyang318/LA_onepass.
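For orientation, the majority-voting baseline that the abstract uses as its complexity yardstick can be sketched in a few lines. This is not the LA onepass or LA twopass algorithm; the `(worker, task, label)` tuple layout and the function name `majority_vote` are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def majority_vote(annotations):
    """Aggregate crowdsourced labels by simple majority voting.

    `annotations` is an iterable of (worker, task, label) tuples
    (a hypothetical layout chosen for this sketch). Worker quality is
    ignored: every vote counts equally, and the labels are traversed once.
    """
    votes = defaultdict(Counter)
    for _worker, task, label in annotations:
        votes[task][label] += 1
    # Pick the most frequent label for each task.
    return {task: counts.most_common(1)[0][0] for task, counts in votes.items()}


example = [
    ("w1", "t1", "cat"), ("w2", "t1", "cat"), ("w3", "t1", "dog"),
    ("w1", "t2", "dog"), ("w2", "t2", "dog"), ("w3", "t2", "dog"),
]
print(majority_vote(example))
```

Majority voting needs a single pass and one counter per task, which is the cost profile the abstract says the proposed algorithms match while additionally modeling worker quality.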
Chapter
The task of semantic segmentation involves labeling each pixel in an image with its corresponding object class, which is achieved by clustering regions belonging to the same category using artificial intelligence. This is an important step from image processing to image analysis and has numerous applications in areas such as automatic driving, image enhancement, and 3D map reconstruction. With the emergence of deep learning, several sophisticated and efficient algorithms have been developed for this task. This chapter aims to review these methods, starting with a discussion of state-of-the-art semantic segmentation methods for both single modality and data fusion, emphasizing their contributions and significance in the field. Additionally, an overview of commonly used datasets is provided to assist researchers in selecting the appropriate dataset for their needs and goals. A comprehensive summary of evaluation metrics used to assess semantic segmentation results, along with corresponding benchmarks for a number of classic datasets, is also presented. Finally, practical applications of semantic segmentation in autonomous driving are explored, and conclusions are drawn on the current state of the art.
Chapter
The big five tech companies FAAMA (Facebook, Apple, Amazon, Microsoft, Alphabet) control our lives, made powerful by network effects. The combined market value of FAAMA exceeds seven trillion dollars and accounts for one quarter of the S&P 500 index. Digital advertising is controlled by two landlords: Google and Facebook. Apple is the most valuable company on this planet. The big five tech companies have one proponent on their side, the user, who enjoys immense consumer surplus. FAAMA is accused of devouring competition and being the cause of many social evils. Lawmakers worldwide are trying to rein them in, but without much success.
Article
In order to understand how a labor market for human computation functions, it is important to know how workers search for tasks. This paper uses two complementary methods to gain insight into how workers search for tasks on Mechanical Turk. First, we perform a high frequency scrape of 36 pages of search results and analyze it by looking at the rate of disappearance of tasks across key ways Mechanical Turk allows workers to sort tasks. Second, we present the results of a survey in which we paid workers for self-reported information about how they search for tasks. Our main findings are that on a large scale, workers sort by which tasks are most recently posted and which have the largest number of tasks available. Furthermore, we find that workers look mostly at the first page of the most recently posted tasks and the first two pages of the tasks with the most available instances, but in both categories the position on the result page is unimportant to workers. We observe that at least some employers try to manipulate the position of their task in the search results to exploit the tendency to search for recently posted tasks. On an individual level, we observed workers searching by almost all the possible categories and looking more than 10 pages deep. For a task we posted to Mechanical Turk, we confirmed that a favorable position in the search results does matter: our task with favorable positioning was completed 30 times faster and for less money than when its position was unfavorable.
Article
We present the results of a survey that collected information about the demographics of participants on Amazon Mechanical Turk, together with information about their level of activity and motivation for working on Amazon Mechanical Turk. We find that approximately 50% of the workers come from the United States and 40% come from India. Country of origin tends to change the motivating reasons for workers to participate in the marketplace. Significantly more workers from India participate on Mechanical Turk because the online marketplace is a primary source of income, while in the US most workers consider Mechanical Turk a secondary source of income. While money is a primary motivating reason for workers to participate in the marketplace, workers also cite a variety of other motivating reasons, including entertainment and education.
Article
The dynamics of many social, technological and economic phenomena are driven by individual human actions, turning the quantitative understanding of human behaviour into a central question of modern science. Current models of human dynamics, used from risk assessment to communications, assume that human actions are randomly distributed in time and thus well approximated by Poisson processes. In contrast, there is increasing evidence that the timing of many human activities, ranging from communication to entertainment and work patterns, follows non-Poisson statistics, characterized by bursts of rapidly occurring events separated by long periods of inactivity. Here I show that the bursty nature of human behaviour is a consequence of a decision-based queuing process: when individuals execute tasks based on some perceived priority, the timing of the tasks will be heavy tailed, with most tasks being rapidly executed, whereas a few experience very long waiting times. In contrast, random or priority-blind execution is well approximated by uniform inter-event statistics. These findings have important implications, ranging from resource management to service allocation, in both communications and retail.
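The contrast between priority-based and priority-blind execution can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the fixed queue length, the uniform random priorities, the parameter `p` (probability of picking the highest-priority task rather than a random one), and the function name `simulate_queue` are choices made here and are not given in the abstract.

```python
import random


def simulate_queue(p, steps, queue_len=2, seed=0):
    """Simulate a fixed-length task list and return the waiting times.

    Assumed rules (illustrative, not taken from the abstract):
      * each task gets a priority drawn uniformly from [0, 1);
      * at every step, with probability p the highest-priority task is
        executed, otherwise a task is chosen uniformly at random;
      * the executed task is replaced by a fresh task.
    A task's waiting time is the number of steps between its arrival
    and its execution.
    """
    rng = random.Random(seed)
    # Each queue entry is (priority, arrival_step).
    queue = [(rng.random(), 0) for _ in range(queue_len)]
    waits = []
    for t in range(1, steps + 1):
        if rng.random() < p:
            idx = max(range(queue_len), key=lambda i: queue[i][0])
        else:
            idx = rng.randrange(queue_len)
        waits.append(t - queue[idx][1])
        queue[idx] = (rng.random(), t)  # new task arrives in the freed slot
    return waits


if __name__ == "__main__":
    for p in (0.999, 0.0):
        waits = simulate_queue(p=p, steps=20000, seed=1)
        immediate = sum(w == 1 for w in waits) / len(waits)
        print(f"p={p}: share executed after one step={immediate:.3f}, "
              f"longest wait={max(waits)}")
```

With `p` close to 1, most tasks run right after they arrive while a low-priority task can sit in the list for a long time; with `p = 0` the waiting times stay short and concentrated, which is the qualitative difference the abstract describes.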
Article
There are several commonly occurring situations in which the position of a unit or member of a waiting line is determined by a priority assigned to the unit rather than by its time of arrival in the line. An example is the line formed by messages awaiting transmission over a crowded communication channel in which urgent messages may take precedence over routine ones. With the passage of time a given unit may move forward in the line owing to the servicing of units at the front of the line or may move back owing to the arrival of units holding higher priorities. Though it does not provide a complete description of this process, the average elapsed time between the arrival in the line of a unit of a given priority and its admission to the facility for servicing is useful in evaluating the procedure by which priority assignments are made. Expressions for this quantity are derived for two cases—the single-channel system in which the unit servicing times are arbitrarily distributed (Eq. 3) and the multiple-channel system in which the servicing times are exponentially distributed (Eq. 6). In both cases it is assumed that arrivals occur at random. Operations Research, ISSN 0030-364X, was published as Journal of the Operations Research Society of America from 1952 to 1955 under ISSN 0096-3984.
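The single-channel result is usually quoted in textbooks in the following form: with Poisson arrivals, no preemption, and classes indexed from highest priority, the mean wait of class p is W_p = W_0 / ((1 - σ_{p-1})(1 - σ_p)), where W_0 = ½ Σ_k λ_k E[S_k²] and σ_p = Σ_{k≤p} λ_k E[S_k]. The sketch below evaluates that textbook form; the symbol names and the function `cobham_mean_waits` are assumptions of this example, and the correspondence to the paper's own Eq. 3 is not verified here.

```python
def cobham_mean_waits(arrival_rates, mean_services, second_moments):
    """Mean time in queue per priority class (index 0 = highest priority).

    Textbook form for a single-server, non-preemptive priority queue
    with Poisson arrivals and generally distributed service times:
        W_p = W_0 / ((1 - sigma_{p-1}) * (1 - sigma_p))
    arrival_rates[k]  : arrival rate lambda_k of class k
    mean_services[k]  : mean service time E[S_k]
    second_moments[k] : second moment of service time E[S_k^2]
    """
    total_load = sum(lam * s for lam, s in zip(arrival_rates, mean_services))
    if total_load >= 1.0:
        raise ValueError("total utilization must be below 1 for a stable queue")

    # Mean residual work found in service by an arriving unit.
    w0 = 0.5 * sum(lam * m2 for lam, m2 in zip(arrival_rates, second_moments))

    waits = []
    sigma_prev = 0.0
    for lam, s in zip(arrival_rates, mean_services):
        sigma = sigma_prev + lam * s  # cumulative load of classes up to this one
        waits.append(w0 / ((1.0 - sigma_prev) * (1.0 - sigma)))
        sigma_prev = sigma
    return waits


# Two classes with exponential service (mean 1, second moment 2).
print(cobham_mean_waits([0.25, 0.25], [1.0, 1.0], [2.0, 2.0]))
```

With a single class and exponential service the expression reduces to the familiar M/M/1 queueing delay, which is a convenient sanity check on the inputs.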
Conference Paper
Amazon Mechanical Turk (MTurk) is a crowdsourcing system in which tasks are distributed to a population of thousands of anonymous workers for completion. This system is increasingly popular with researchers and developers. Here we extend previous studies of the demographics and usage behaviors of MTurk workers. We describe how the worker population has changed over time, shifting from a primarily moderate-income, U.S.-based workforce towards an increasingly international group with a significant population of young, well-educated Indian workers. This change in population points to how workers may treat Turking as a full-time job, which they rely on to make ends meet.
Cobham, A. 1954. Priority assignment in waiting line problems. J. Oper. Res. Soc. Am. 2, 70-76.
Chilton, L. B., Horton, J. J., Miller, R. C., and Azenkot, S. 2010. Task search in a human computation market. In Proceedings of the ACM SIGKDD Workshop on Human Computation (Washington DC, July 25, 2010). HCOMP '10. ACM, New York, NY, 1-9.
Ipeirotis, P. Working paper CeDER-10-01, New York University Stern School of Business.