Probabilistic Rotation Scheduling
S. Tongsima, C. Chantrapornchai, E. Sha
Dept. of Computer Sci. & Engr., University of Notre Dame
Nelson Passos
Dept. of Computer Science, Midwestern State University
Abstract
One of the biggest problems in parallel processing is to obtain a good schedule without having a knowledge of exact computation time of the tasks. These tasks normally occur when conditional instructions are employed and/or inputs of the tasks influence the computation time. The relationship of these tasks can be represented by a data-flow graph where each node models the task associated with a probabilistic computation time. In order to address the problem, the synchronous parallelism computing style is assumed in this paper, i.e., the synchronization is enforced at the end of each iteration. An algorithm called probabilistic rotation scheduling, which takes advantage of loop pipelining, is developed to schedule these tasks to a parallel processing system. We show that based on our loop scheduling algorithm the length of the resulting schedule can be guaranteed to be satisfied for a given probability. The experiments show that the resulting schedule length for a given probability of confidence can be significantly better than the schedules obtained by worst-case or average-case scenario.
1 Introduction
In many practical applications, such as interface systems, fuzzy systems, and artificial intelligence systems, many tasks normally have uncertain computation time. Such tasks normally contain conditional instructions and/or operations that may take different computation time when calculating different inputs. A dynamic scheduling scheme may be considered to address the problem; however, the decision of the run-time scheduler, which depends on the local on-line knowledge, may not give a good overall schedule. Although many static scheduling techniques can thoroughly check for the best assignment for dependent tasks, the existing methods are not able to deal with such an uncertainty. Therefore, either worst-case or average-case computation for the task, which may not reflect the real operating situation, is usually assumed.
For iterative applications, the statistics of the computation time of those uncertain tasks are not difficult to collect. In order to take advantage of these statistical data and loop pipelining, a novel loop scheduling algorithm, called probabilistic rotation scheduling (PRS), is introduced in this paper. This algorithm attempts to expose the parallelism of those certain and uncertain tasks, collectively called probabilistic tasks, within each iteration. The synchronization is then applied at the end of each iteration. Such a parallel computing style is also known as synchronous parallelism [?, ?]. The proposed algorithm takes an input application which is modeled as a hierarchical data-flow graph (DFG), where a node corresponds to a task, e.g., a collection of statements, and a set of edges represents dependencies between these tasks. The dependency distances, also called delays, between tasks in different iterations are represented by short bar lines on those edges. The computation time of these nodes can be either fixed or varied. A probability model is employed to represent the timing of the probabilistic tasks. The goal of this algorithm is to effectively schedule both certain and uncertain tasks to multiple functional units. In other words, given a confidence probability $\theta$, the algorithm attempts to reduce the length $c$ by optimizing the schedule such that the probability of the schedule length being less than $c$ is greater than $\theta$. By taking into account the varying timing characteristics, the proposed technique can be applied to a wider variety of applications in high-level synthesis and compiler optimization.

Footnote: This work was partially supported by the Royal Thai Government Scholarship and NSF grant MIP ?????????.
Considerable research has been conducted in the area of scheduling directed-acyclic graphs (DAGs) to the multiple processing system. Such graphs are obtained by ignoring edges containing one or more delays. Many heuristics have been proposed, e.g., list scheduling and graph decomposition [?, ?], to schedule the DAG. These methods consider neither exploring the parallelism across iterations nor addressing the problem with the probabilistic tasks. In [?, ?], Ku and De Micheli proposed the relative scheduling method which handles tasks with unbounded nodes. Their approach, however, considers a DAG as an input graph and does not explore the parallelism across iterations. Furthermore, if the statistics of the computation time of uncertain nodes can be collected, their method will not use this statistical information.

Loop transformations are another common technique used to restructure loops from the repetitive code segment in order to reduce the total execution time of the schedule [?, ?]. These techniques, however, assume that the target systems have neither a limited number of processors nor uncertain tasks. For the class of global scheduling, software pipelining [?] is used to overlap instructions, i.e., exposing the parallelism across iterations. This technique, however, expands the graph by unfolding it. Furthermore, such an approach is limited to solving the problem without considering the uncertainty of the computation time [?, ?].

A rotation scheduling technique was developed by Chao, LaPaugh and Sha [?, ?] and extended to handle multi-dimensional applications by Passos, Sha and Bass [?]. This technique assigns nodes from a DFG to the system with a limited number of functional units. It implicitly explores traditional retiming [?] in order to reduce the total computation time of the nodes along the longest paths (also called the critical paths) in the DFG. In other words, the graph is transformed in such a way that the parallelism is exposed but the behavior of the graph is preserved. In this paper, the rotation scheduling technique is extended so that it can deal with those uncertain tasks. Unlike the relative scheduling, the information about a task is obtained from the statistical process. An application with probabilistic execution time is transformed to a graph model, called the probabilistic data-flow graph (PG), which is a generalization of the DFG model. After the initial execution order and functional unit assignment are given, the probabilistic rotation scheduling is applied so that the total computation time of the final schedule above a given probability can be reduced by exploring parallelism across iterations.
Figure 1(b) shows an example of the PG consisting of 4 nodes. Note that such a graph models the code segment presented in Figure 1(a). Two bar lines on the edge between nodes $D$ and $A$ represent the dependency delays between these two nodes. The computation times of nodes $A$ and $C$ are known to be fixed (? time units). In this code, the uncertainty occurs in the computation of nodes $B$ and $D$. Assume that each operation takes 1 time unit and so does the assignment statement. Also, the time to compute the comparison operation is negligible. Hence, it may take either ? time units or 1 time unit to execute node $B$. Put another way, about ??% of the time (?? out of ???), statement $B_?$ will be active and node $B$ will then take ? time units; otherwise node $B$ takes only 1 time unit, i.e., statement $B_?$ has only one operation. Like node $B$, approximately ??% of the time (or ?? out of ???), node $D$ takes ? time units and, with about ??% chance, it will take 1 time unit. Each entry in Figure 1(c) shows a probability associated with each node's possible computation time (the probability distribution).

[Figure 1: PG and its computation time. (a) Code segment: a `for` loop over $i$ whose body is partitioned into statements labelled A, B, C and D, where B and D contain if-then-else branches on `num`. (b) PG with nodes A, B, C and D. (c) Its computation time: the probability of each possible computation time for nodes A, B, C and D.]
Since the computation time of these nodes is now a random variable, the total computation time of the PG is also a random variable. The concept of a control step, i.e., the synchronization time of the tasks within each iteration, is no longer applicable. A schedule conveys only the execution order of the tasks being executed in a functional unit and/or between different units. Our technique will give a good initial schedule whose length is guaranteed for a given probability. Therefore, the resulting schedule most likely satisfies the system constraints, and the number of redesign cycles can be reduced. In order to compute the total computation time of this order, a probabilistic task-assignment graph (PTG) is constructed. Such a graph is obtained from the original PG in which non-zero delay edges are ignored and each node is assigned to a specific functional unit in the system. The PTG also contains some extra edges, called flow-control edges, each of which is established between two independent tasks $u$ and $v$ where $u$ is executed right before $v$ within the same functional unit.
Example
For simplicity of explanation, the functional units are assumed to be homogeneous processing elements (PEs). Consider the PG in Figure 1(b). The goal here is to assign these nodes to two functional units, e.g., $PE_0$ and $PE_1$. A possible PTG is presented in Figure 2(a). Since the input graph is cyclic, an execution pattern of this PTG may be repeated. The solid arcs in this PTG represent those zero delay edges, called dependency edges, from the input graph (see Figure 1(b)). In this figure, nodes $A$, $B$ and $D$ are assigned to $PE_0$ and node $C$ is bound to $PE_1$. Note that $D$ is implicitly executed after $A$, so the direct edge from $A$ to $D$ can be neglected. This graph may be viewed as a repeated pattern presented in Figure 2(b). A static execution order shows only one iteration of a repeated pattern (see Figure 2(c)). For this execution pattern, the total computation time of the PTG is less than ? with higher than ??% confidence.
[Figure 2: Repeated pattern and static execution order. (a) PTG: A, B and D on PE0, C on PE1. (b) Initial execution pattern over two consecutive iterations: PE0 executes ... A B D A B D ...; PE1 executes ... C C .... (c) Static schedule: PE0: A B D; PE1: C.]
From the previous example, if probabilistic rotation scheduling is run on the resulting PTG, the algorithm will select the root node $A$ to be rescheduled. After that, one delay from the incoming edges of node $A$ will be moved to all its outgoing edges. This procedure transforms the PG (see Figure 3(a)), which will be used as a reference to update the PTG. The new execution pattern is equivalent to reshaping the iteration window as presented in Figure 3(b).
[Figure 3: New PG and repeated pattern after changing iteration window. (a) The PG after rotating A. (b) Reshaping the iteration window over the pattern PE0: ... A B D A B D ...; PE1: ... C C ....]
Intuitively, node $A$ from the next iteration is introduced to the static execution order. Since node $A$ now has no data dependencies associated with other nodes, $A$ can be rescheduled to any functional unit. One possible scheduling position is to assign node $A$ right after node $C$ in $PE_1$. The algorithm has now completed one iteration. The resulting PTG and the new execution order are shown in Figures 4(a) and 4(b) respectively. The dotted arc from $C$ to $A$ in this new PTG represents the flow-control edge. This new assignment conveys that the resulting schedule length will be less than ? with higher than ??% confidence.
[Figure 4: New PTG and execution order after rescheduling A. (a) PTG: B and D on PE0, C and A on PE1. (b) Static execution order: PE0: B D; PE1: C A.]
In order to explain how to schedule the tasks according to a given confidence level of the execution time, this paper is organized as follows. Section 2 presents the graph model used in this work. Required terminology and fundamental concepts are also presented in this section. The probabilistic rotation scheduling algorithm and some supporting routines are discussed in Section 3. Experimental results and an example are discussed in Section 4. Finally, Section 5 draws a conclusion of this research.
2 Fundamental concepts
We now introduce the graph model which is used to represent tasks that are characterized by the uncertainty of the computation time.

Definition 2.1 A probabilistic data-flow graph (PG) is a vertex-weighted, edge-weighted, directed graph $G = \langle V, E, d, T \rangle$, where $V$ is the set of vertices representing tasks, $E$ is the set of edges representing the data dependencies between vertices, $d$ is a function from $E$ to the set of non-negative integers, representing the number of delays on an edge, and $T_v$ is a random variable representing the computation time of a node $v \in V$.
Note that an ordinary DFG is a special case of the PG. The probability distribution of $T$ is assumed to be discrete in this paper. The granularity of the resulting probability distribution, if necessary, depends on the need of accuracy. The notation $P(T = x)$ is read "the probability that the random variable $T$ assumes the value $x$". The probability function is a function that maps the possible value $x$ to its probability, i.e., $p(x) = P(T = x)$. Each vertex $v \in V$ from the PG is weighted with the probability distribution of the computation time, $T_v$, where $T_v$ is a discrete random variable associated with the set of possible computation times of the vertex $v$ such that $\sum_{x} P(T_v = x) = 1$. For those nodes with only one computation time $t$ in the PG, $P(T = t) = 1$. An edge $e \in E$ from node $u$ to $v$ is denoted by $u \xrightarrow{e} v$, and a path $p$ starting from node $u$ and ending at node $v$ is indicated by the notation $u \stackrel{p}{\leadsto} v$.
An iteration is the execution pattern of each node in $V$ exactly once. Iterations are identified by an index $i$ starting from ?. Inter-iteration dependencies are represented by weighted edges. An iteration is associated with a static schedule. A static schedule must obey the precedence relations defined by the data-flow graph. For any iteration $j$, an edge $e$ from $u$ to $v$ with delay $d(e)$ conveys that the computation of node $v$ at iteration $j$ depends on the execution of node $u$ at iteration $j - d(e)$. An edge with no delays represents a data dependency within the same iteration. A legal data-flow graph must have strictly positive delay cycles, i.e., the summation of the delay functions along any cycle cannot be less than or equal to zero. Also, the number of delays of a path $p$, $d(p)$, where $p = v_0 \xrightarrow{e_0} v_1 \xrightarrow{e_1} \cdots \xrightarrow{e_{k-1}} v_k$, is computed by $d(p) = \sum_{i=0}^{k-1} d(e_i)$. As an example, the graph in Figure 1(b) has the set of edges $E = \{A \xrightarrow{e_1} B, A \xrightarrow{e_2} C, A \xrightarrow{e_3} D, B \xrightarrow{e_4} D, C \xrightarrow{e_5} D, D \xrightarrow{e_6} A\}$, where $A$, $B$, $C$, and $D$ are nodes in the graph. The number of delays on each edge is $d(e_i) = 0$, $1 \le i \le 5$, and $d(e_6) = 2$. The probability distribution of $T_v$, where $v$ is a node in this graph, is presented in Figure 1(c).
The execution order of the PG can be defined as follows.

Definition 2.2 A probabilistic task-assignment graph (PTG) $G = \langle V, E, w, T, b \rangle$ is a vertex-weighted, edge-weighted, directed acyclic graph, where $V$ is the set of vertices representing tasks, $E$ is the set of edges representing the data dependencies between vertices, $w$ is an edge-type function from $e \in E$ to $\{0, 1\}$, where 0 represents the type of dependency edge and 1 represents the type of flow-control edge, $T_v$ is a random variable representing the computation time of a node $v \in V$, and $b$ is a processor binding function from $v \in V$ to $\{PE_i, ? \le i \le n\}$, where $PE_i$ is processing element $i$ and $n$ is the total number of processing elements.
Figure 4(a) shows an example of the PTG while two processing elements are available. Nodes $B$ and $D$ are assigned to $PE_0$. That is, $b(B) = b(D) = PE_0$. Meanwhile $b(C) = b(A) = PE_1$. The edges consist of $C \xrightarrow{e_1} A$, $C \xrightarrow{e_2} D$ and $B \xrightarrow{e_3} D$, where $w(e_1) = 1$ and $w(e_2) = w(e_3) = 0$.
Since the propagation delay of a vertex is a random variable, an operation between two vertices involves a function of a two-dimensional random variable, i.e., two or more numerical characteristics must be observed simultaneously [?, ?]. A two-dimensional random variable is denoted by $(X, Y)$, where $X = X(s)$ and $Y = Y(s)$ are two one-dimensional random variables. These two random variables are independent if the outcome of $X$ does not influence the outcome of $Y$. The independence notion will be applied when discussing the operation of two-dimensional random variables.

Two basic operations of random variables are the summation and maximum functions. If $T_u$ and $T_v$ are random variables representing the computation time associated with the vertices $u$ and $v$, the summation function is defined as
$A = T_u + T_v$.
Furthermore, the maximum operation is defined as
$M = \max(A_1, A_2)$,
where $M$ is the set of possible greatest values produced from the random variables $A_1$ and $A_2$. Note that the summation and maximum functions can be extended to handle $n$ random variables, or $n$-dimensional random variables, by applying the function successively to each pair of the random variables, i.e., the summation and maximum operations are associative. The probability associated with the summation and maximum functions can be computed directly from the definition of random variables.
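The two operations above can be sketched in a few lines of Python. This is only an illustration, assuming that a discrete distribution is stored as a dictionary from value to probability and that the two operands are independent; the helper names `combine`, `rv_add` and `rv_max` and the sample distributions are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from itertools import product


def combine(x, y, op):
    """Distribution of op(X, Y) for independent discrete X and Y.

    x and y map each possible value to its probability.
    """
    out = defaultdict(float)
    for (a, pa), (b, pb) in product(x.items(), y.items()):
        out[op(a, b)] += pa * pb   # joint probability under independence
    return dict(out)


def rv_add(x, y):
    """Summation A = X + Y of two independent random variables."""
    return combine(x, y, lambda a, b: a + b)


def rv_max(x, y):
    """Maximum M = max(X, Y) of two independent random variables."""
    return combine(x, y, max)


# Hypothetical computation-time distributions (not the values of Figure 1(c)).
t_u = {1: 0.5, 3: 0.5}   # takes 1 or 3 time units with equal probability
t_v = {2: 1.0}           # fixed computation time
total = rv_add(t_u, t_v)
longest = rv_max(t_u, t_v)
```

Because both helpers return an ordinary distribution, they can be applied pairwise to any number of operands, which is the associativity noted above.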
2.1 Retiming
The retiming technique was originally used for synchronous circuit optimization with fixed timing values [?]. This operation rearranges registers in a circuit so that the behavior of the circuit is preserved while achieving a faster circuit. The optimization goal is to reduce the clock period or cycle period $\Phi(G)$. The clock period represents the execution time of the critical path, in terms of propagation delay, that has all zero register edges. It is defined by the equation
$\Phi(G) = \max\{t(p) : d(p) = 0\}$,
where $p = v_0 \xrightarrow{e_0} v_1 \xrightarrow{e_1} \cdots \xrightarrow{e_{k-1}} v_k$, $t(p) = \sum_{i=0}^{k} t(v_i)$, and $d(p) = \sum_{i=0}^{k-1} d(e_i)$.

The retiming of a graph $G$ is a transformation function from nodes to the set of integers, $r : V \to \mathbb{Z}$. The retiming function describes the movement of delays in a graph (or registers in a circuit) with respect to the vertices so as to transform $G$ into a new graph $G_r$, where $d_r$ represents the number of delays on the edges of $G_r$. The positive (or negative) value of the retiming function determines the movement of the delays. The absolute value of the retiming function conveys the number of delays or registers to be rearranged. During retiming, the same number of delays are pushed from all incoming (outgoing) edges of a node to all outgoing (incoming) edges. $r(u)$ denotes the retiming value of node $u$.
The following summarizes some essential properties of the retiming transformation.

Property 2.1
1. $r$ is a legal retiming if $d_r(e) \ge 0$, $\forall e \in E$.
2. For an edge $u \xrightarrow{e} v$, where $u, v \in V$, $d_r(e) = d(e) + r(u) - r(v)$.
3. For a path $u \stackrel{p}{\leadsto} v$, where $u, v \in V$, $d_r(p) = d(p) + r(u) - r(v)$.
4. In any directed cycle $l$ of $G$ and $G_r$, $d_r(l) = d(l)$.
Property 2.1(1) guarantees that the retimed graph will not have any edge containing a negative number of delays. Properties 2.1(2) and 2.1(3) explain the movement of delays. If $r(v)$, $v \in V$, has a positive value, delays will be deleted from the incoming edge(s) of $v$ and inserted onto the outgoing edge(s), and vice versa if $r(v)$ has a negative value. Finally, Property 2.1(4) ensures that the number of delays in any cycle of the graph remains constant.
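As a small illustration of Property 2.1, the sketch below applies a retiming function to the PG of Figure 1(b), taking every edge to have zero delays except $D \to A$, which carries two (the two bar lines). The function names `retime`, `is_legal` and `cycle_delays` are hypothetical helpers introduced only for this example.

```python
def retime(delays, r):
    """Return d_r(e) = d(e) + r(u) - r(v) for every edge e = (u, v).

    delays maps (u, v) to the number of delays; r maps a node to its
    retiming value (nodes missing from r are treated as r = 0).
    """
    return {(u, v): d + r.get(u, 0) - r.get(v, 0)
            for (u, v), d in delays.items()}


def is_legal(retimed):
    """A retiming is legal when no edge ends up with negative delays."""
    return all(d >= 0 for d in retimed.values())


def cycle_delays(delays, cycle):
    """Total delays along a cycle given as a node list (closing edge implied)."""
    return sum(delays[(u, v)] for u, v in zip(cycle, cycle[1:] + cycle[:1]))


# Delays of the PG in Figure 1(b): only D -> A carries delays.
delays = {("A", "B"): 0, ("A", "C"): 0, ("A", "D"): 0,
          ("B", "D"): 0, ("C", "D"): 0, ("D", "A"): 2}

# r(A) = 1 draws one delay from A's incoming edge and pushes it to
# every outgoing edge of A.
retimed = retime(delays, {"A": 1})
```

With `r = {"A": 1}` every edge keeps a non-negative delay count and each cycle through $A$ still carries two delays, whereas `r = {"B": 1}` would leave edge $A \to B$ with a negative count and is therefore not legal.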
After a graph has been retimed, a prologue is the set of instructions that must be executed to provide the necessary data for the iterative process. In the example in the previous section, node $A$ in the first iteration becomes the prologue. An epilogue is the other extreme, where a complementary set of instructions will need to be executed to complete the process. By considering that the entire problem consists of a large number of iterations, we may assume that the time required to run the prologue and epilogue is negligible compared to the repetitive execution time of the loop body.
2.2 Rotation scheduling
In [?], Chao, LaPaugh and Sha proposed an algorithm, called rotation scheduling, which integrates the retiming technique in order to deal with scheduling a cyclic DFG under resource constraints. The input of rotation scheduling is a DFG and its corresponding static schedule, e.g., a synchronized order of the nodes in the DFG. This algorithm reduces the schedule length (or the number of control steps needed to execute one iteration of the schedule) by extracting the parallelism across iterations, i.e., shifting the iteration window (or scope of a static schedule in one iteration) down by one control step. Looking at one static iteration, rotation scheduling analogously rotates tasks from the top of the schedule down to the bottom. This process is equivalent to retiming those tasks (nodes in the DFG), in which one delay will be deleted from all their incoming edges and added to all their outgoing edges, resulting in an intermediate retimed graph. Once the parallelism is extracted, the algorithm reassigns these nodes to new positions so that the schedule length is shorter.

As an example, the cyclic DFG in Figure 5(a) needs to be scheduled to two processing elements. Figure 5(b) presents one possible static schedule of such a graph. By using rotation scheduling, this schedule can be optimized. Firstly, the algorithm explores node $A$ from the next iteration. The original graph is retimed where $r(A) = 1$, i.e., one delay from $E \xrightarrow{e} A$ is moved to all outgoing edges of $A$ (see Figure 5(c)). By doing so, node $A$ now can be executed at any control step in this new iteration window. Assume that rotation scheduling uses the re-mapping heuristic that places node $A$ right after node $C$ under $PE_1$. The resulting static schedule length is then reduced by one control step as shown in Figure 5(d). In the next section, the concept of the schedule length and the re-mapping strategy will be extended to handle the probabilistic input.
[Figure 5: Rotation scheduling overview. (a) Cyclic DFG with nodes A, B, C, D and E. (b) Static schedule over iterations $i$ and $i+1$: PE0: ... A B D E A B D E ...; PE1: ... C C .... (c) Retimed graph. (d) Resulting schedule: PE0: ... A B D E B D E ...; PE1: ... C A C A ....]
3 Probabilistic Rotation Scheduling (PRS)
In order to schedule nodes from the PG to a multiple functional unit system, traditional rotation scheduling should be modified. First, since the computation time of each node is a random variable, probability theory must be used to calculate a schedule length, called maximum reaching time, which is now a random variable. Second, as in the traditional rotation approach, the task re-mapping strategy for PRS should take the probabilistic nature of the problem into account. The following subsections discuss these concepts in more detail.
3.1 Schedule length subject to the confidence level
Recall the definition of a probabilistic data-flow graph (PG) found in Section 2. Since the computation time of each vertex of a PG is a random variable, the traditional notion of a fixed global cycle period, $\Phi(G)$, for PG $G$ is no longer valid. Therefore, the random variable $mrt(G)$, called the maximum reaching time of graph $G$, which represents the probabilistic cycle period for graph $G$, i.e., the probabilistic schedule length of $G$, is introduced. Likewise, $mrt(u, v)$ represents the probabilistic critical path length for the portion of the graph between nodes $u$ and $v$.

Definition 3.1 The maximum reaching time (mrt) of PTG $G = \langle V, E, w, T, b \rangle$ is a random variable representing the distribution of computation time for the graph.

Note that the $mrt$ of a PTG represents the probabilistic cycle period, i.e., the probabilistic schedule length of the PTG. If the computation time of each node is a fixed value, computing the total schedule length involves the addition and maximum operations. A similar idea can be applied to compute the $mrt$, where these two operations are functions of random variables.

Algorithm 3.1 calculates the $mrt$ for a PTG. In order to simplify the calculation, two dummy vertices with zero computation time, $v_s$ and $v_d$, are added to the graph. A set of zero delay edges is used to connect vertex $v_s$ to all root nodes, and to connect all leaf nodes to vertex $v_d$. Therefore, $mrt(v_s, v_d)$ gives the overall maximum reaching time of the graph and will be used to compute the schedule length of the given PTG. This schedule length implies the possible computation time of the graph.
Algorithm 3.1 (Maximum reaching time)
Input: PTG $G = \langle V, E, w, T, b \rangle$
Output: $mrt(G) = mrt'(v_s, v_d)$
 1  $G' = \langle V', E', d, T \rangle$ such that $V' = V \cup \{v_s, v_d\}$,
 2  $E' = E \cup \{v_s \xrightarrow{e} v, \forall v \in V_r\} \cup \{u \xrightarrow{e} v_d, \forall u \in V_l\}$
 3  $\forall u \in V'$: $mrt'(v_s, u) \leftarrow 0$; $T_{v_s} = T_{v_d} = 0$
 4  Queue $\leftarrow v_s$
 5  while Queue $\ne \emptyset$ do
 6      get(u, Queue)
 7      $mrt'(v_s, u) \leftarrow mrt'(v_s, u) + T_u$      /* add two random variables */
 8      foreach $u \xrightarrow{e} v$ do
 9          indegree(v) $\leftarrow$ indegree(v) $- 1$
10          $mrt'(v_s, v) \leftarrow \max(mrt'(v_s, u), mrt'(v_s, v))$      /* max of two random variables */
11          if indegree(v) $= 0$ then put(v, Queue) fi
12      od
13  od
Lines 1 and 2 add two dummy nodes $v_s$ and $v_d$ and zero register edges connecting $v_s$ to every root node $v \in V_r$ of $G$ and connecting every leaf node $u \in V_l$ of $G$ to $v_d$. Line 3 initializes the $mrt'(v_s, u)$ value for each vertex $u$ in the new graph and sets the computation times $T_{v_s}$ and $T_{v_d}$ to zero. Lines 4-13 traverse the graph in topological order and compute the $mrt$ of each node $v$ with respect to $v_s$. The $mrt'$ for node $v$ is originally set to zero. When the first parent of $v$ is dequeued, $v$ has its indegree reduced by one (Line 9) and also has its $mrt'$ updated (Line 10). Vertex $v$'s other parents are in turn dequeued, and the process is repeated. Eventually, the last parent of node $v$ will be dequeued and maximized. At this point, node $v$ will be inserted into the queue since all parents have been considered, i.e., the indegree of $v$ equals zero (Line 11). Node $v$ will eventually be dequeued by Line 6. Line 7 will then add $T_v$ to the $mrt'$ of node $v$, producing the final $mrt$ with respect to all paths reaching node $v$.
Using the $mrt$, the concept of a probabilistic schedule length can be derived. In this paper, such a schedule length is expressed in terms of confidence level, i.e., percent of certainty. The following is the definition of the probabilistic schedule length.

Definition 3.2 A probabilistic schedule length of PTG $G = \langle V, E, w, T, b \rangle$ with respect to a confidence level $\theta$, $psl(G, \theta)$, is the smallest computation time $c$ such that $P(mrt(G) > c) < 1 - \theta$.

Consider the probability distribution of $mrt(G)$:

[Table: each possible computation time of $mrt(G)$ with its probability; the numeric entries are illegible in this copy.]

With $\theta = ?.?$, $psl(G, ?.?)$ is ??, because the smallest possible computation time is ??, where $P(mrt(G) > ??) = 1 - (? + ? + ? + ?) = ?$. Note that $c$ could be ?? and ??, but ?? is the smallest one. Therefore, with above ??% confidence, the computation time of $G$ is less than ??.
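The probabilistic schedule length can be read off the distribution with a single pass over the possible computation times. The sketch below assumes the condition $P(mrt(G) > c) < 1 - \theta$, rewritten as the equivalent test $P(mrt(G) \le c) > \theta$; the function name `psl` follows the text, while the sample distribution is hypothetical.

```python
def psl(mrt_dist, theta):
    """Smallest possible time c with P(mrt > c) < 1 - theta.

    mrt_dist maps each possible computation time to its probability.
    """
    cumulative = 0.0
    for c in sorted(mrt_dist):
        cumulative += mrt_dist[c]        # cumulative is now P(mrt <= c)
        if cumulative > theta:           # same as P(mrt > c) < 1 - theta
            return c
    return max(mrt_dist)                 # theta too close to 1: worst case


# Hypothetical distribution of mrt(G) (not the table from the paper).
mrt_dist = {4: 0.625, 5: 0.375}
short = psl(mrt_dist, 0.5)
safe = psl(mrt_dist, 0.9)
```

Raising the confidence level can only keep the schedule length the same or lengthen it, which is the trade-off between the worst-case and average-case assumptions mentioned in the introduction.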
3.2 Task re-mapping heuristic: template scheduling
In this section, we propose the heuristic, called template scheduling (TS), to find a place to re-schedule a task. This re-mapping phase plays an important role, which is to reduce the probabilistic schedule length in the PRS algorithm. In the experiments, the effectiveness of such a heuristic is compared to other heuristics, such as exhaustive trial (ET) and as-late-as-possible scheduling (AS). For the template scheduling approach, a weight, called degree of flexibility, is assigned to each node in the PTG. In order to compute such a weight, the expected computation time of each node is computed to build up a template. This template implies not only the execution order but also the control step assignment for each task. By observing this template, one can expect how long (in number of control steps) each processing element would be idle. Therefore, the template scheduling scheme should be able to decide where to re-schedule a node.

In order to determine an expected control step, each node in a PTG is visited in topological order and the following is computed.
Definition 3.3 The expected control step of node $v$ of PTG $G = \langle V, E, w, T, b \rangle$, $Ecs(v)$, is computed by
$Ecs(v) = \max_i \left( Ecs(u_i) + ET_{u_i} \right)$,
where $u_i \xrightarrow{e} v \in E$, $ET_u$ represents the expected computation time of node $u$, and $Ecs(v_i) = ?$ for all root nodes $v_i \in V$.
The above definition assumes that node $v$ can start execution right after all of its parents finish their executions. The following gives a definition of the degree of flexibility of a node with respect to $PE_i$.

Definition 3.4 Given a PTG $G = \langle V, E, w, T, b \rangle$, the degree of flexibility of node $u$ with respect to the processing element $PE_i$, $dflex(u, i)$, is computed by
$dflex(u, i) = Ecs(v) - Ecs(u) - ET_u$,
where $u \xrightarrow{e} v \in E$ and $u$ and $v$ are assigned to $PE_i$.
The degree of flexibility conveys the expected size of the available time slot within $PE_i$. Figure 6 shows a typical case where node $v$ has more than one parent. $u_1$, $u_2$ and $u_3$ are parents of node $v$, and these parents have the expected computation times ?, ? and ? respectively. In the same order, the expected control steps of these nodes are ?, ? and ?. Therefore, the expected control step $Ecs(v) = 8.7$. According to Definition 3.4, the degree of flexibility of $u_?$ with respect to $PE_?$ is $8.7 - ? - ? = ?$. This value conveys how long $PE_?$ has to wait before $v$ can be executed. Note that the degree of flexibility of a node which is executed last in any PE is undefined. The template scheduling heuristic can be described as follows.

[Figure 6: Expected control step. Node $v$ (Ecs = 8.7) and its parents $u_1$, $u_2$, $u_3$ placed on PE0, PE1 and PE2; the parent labels in the figure read Ecs = 3.7 with ET = 3, Ecs = 3 with ET = 1, and Ecs = 4.7 with ET = 4.]
Algorithm 3.2 (Template scheduling)
Input: PTG $G = \langle V, E, w, T, b \rangle$ with pre-computed $ET_u$, $\forall u \in V$, the re-mapped node $v$, and $\theta$
Output: New $G$ with the assignment of $v$
 1  compute $Ecs(u)$, $\forall u \in V$
 2  compute $dflex(u)$, $\forall u \in V$
 3  $G_{temp} \leftarrow G$; $G_{best} \leftarrow$ NULL
 4  for $PE_i \leftarrow PE_?$ to $PE_n$ do
 5      $x \leftarrow$ node $u$ on $PE_i$ with maximum $dflex(u, i)$      /* select a node with the highest degree of flexibility in $PE_i$ */
 6      $G_{temp} \leftarrow$ temporarily assign $v$ after $x$
 7      if $psl(G_{temp}, \theta) < psl(G_{best}, \theta)$      /* $G_{temp}$ is better than $G_{best}$ */
 8      then $G_{best} \leftarrow G_{temp}$ fi      /* keep the good schedule */
 9      remove the assignment and retry on the next PE
10  od
11  return($G_{best}$)
Algorithm 3.2 attempts to assign a re-mapped node to every processing element. Assuming that the expected computation time of each node is already computed, it assigns the expected control step and degree of flexibility to every node in the PTG in Lines 1-2. The current task assignment graph is saved in $G_{temp}$, and $G_{best}$ keeps the best PTG, which yields the shortest schedule length. For each processing element $PE_i$, a reference node $x$ with the highest degree of flexibility is chosen (Line 5). (If all nodes in such a PE have zero degree of flexibility, the reference node is chosen to be the last node that is executed in that PE.) After the re-mapped node is temporarily assigned after the reference node at that processing element (Line 6), the computed probabilistic schedule length is compared to that of the best one, $G_{best}$. (In the first iteration, $G_{temp}$ will immediately become $G_{best}$ since $G_{best}$ is first set to NULL.) The algorithm selects the best processor assignment, which yields the shortest schedule length, in Lines 7-8. Eventually the best PTG will be returned as a final result.
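The template quantities of Definitions 3.3 and 3.4 and the choice of the reference node can be sketched as below. Several things here are assumptions rather than statements from the paper: the helper names, the `root_step` default for root nodes, the pairing of each ET label with the adjacent Ecs label of Figure 6, and the second, purely hypothetical, processing-element example.

```python
import math


def expected_control_step(parents, ecs, et, root_step=0.0):
    """Ecs(v) = max over parents u of Ecs(u) + ET_u (root_step if no parent)."""
    return max((ecs[u] + et[u] for u in parents), default=root_step)


def dflex(u, v, ecs, et):
    """Expected idle gap after u when v follows u on the same PE."""
    return ecs[v] - ecs[u] - et[u]


def reference_node(pe_order, ecs, et):
    """Node of a non-empty PE after which a re-mapped node would be tried.

    pe_order lists the nodes of one PE in execution order; the last node
    has no successor, so its degree of flexibility is undefined.
    """
    gaps = {u: dflex(u, v, ecs, et) for u, v in zip(pe_order, pe_order[1:])}
    if not gaps or max(gaps.values()) <= 0:
        return pe_order[-1]              # no slack anywhere: use the last node
    return max(gaps, key=gaps.get)


# Labels read from Figure 6 (pairing of ET and Ecs values is assumed).
et = {"u1": 3.0, "u2": 1.0, "u3": 4.0}
ecs = {"u1": 3.7, "u2": 3.0, "u3": 4.7}
ecs["v"] = expected_control_step(["u1", "u2", "u3"], ecs, et)

# Hypothetical PE executing a, b, c in this order.
et_pe = {"a": 2.5, "b": 1.5, "c": 1.0}
ecs_pe = {"a": 0.0, "b": 2.5, "c": 6.0}
ref = reference_node(["a", "b", "c"], ecs_pe, et_pe)
```

In the hypothetical PE, node `a` is immediately followed by `b`, while `b` finishes two control steps before `c` can start, so the slot after `b` is the one a template-scheduling step would try first.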
3.3 Rotation phase
We now present the overall probabilistic rotation scheduling (PRS).

Algorithm 3.3 (Probabilistic Rotation Scheduling)
Input: PG $G = \langle V, E, d, T \rangle$, and $\theta$
Output: a shortest possible PTG $G_s = \langle V, E, w, T, b \rangle$
 1  compute $ET_u$, $\forall u \in V$
 2  $G_s \leftarrow$ InitSchedule($G$)      /* construct the initial schedule (DAG) */
 3  $G_{best} \leftarrow G_s$
 4  for $i = ?$ to $?|V|$ do
 5      $R \leftarrow$ ExtractRoots($G_s$)
 6      $(G_s, G, u) \leftarrow$ SelectRotate($R$, $G_s$)      /* select a node and rotate it */
 7      $G_s \leftarrow$ Re-map($G_s$, $u$, $\theta$)      /* re-mapping heuristic */
 8      if $psl(G_s, \theta) < psl(G_{best}, \theta)$ then $G_{best} \leftarrow G_s$ fi
 9  od
10  return($G_{best}$)
Initially, the expected computation time of each task is computed. After that, the initial schedule is constructed in Line 2. The algorithm to construct an initial schedule can be any DAG scheduling, e.g., list scheduling, which needs to be modified in order to return a PTG. The rotation phase begins in Lines 4-9. This phase loops $?|V|$ times in the hope that all nodes in the graph will have a chance to be rescheduled at least once. ExtractRoots returns a set of roots which can be legally retimed. Then SelectRotate selects a node $u$ to be rotated using a priority function, such as selecting a node that has the smallest retiming value $r(u)$; if all tie, it arbitrarily picks one. In this routine, one delay is drawn from all incoming edges of node $u$ and pushed to all outgoing edges of node $u$. Meanwhile, the PTG $G_s$ is also updated, i.e., the flow-control and dependency edges are modified. Then the node is re-mapped using the TS heuristic proposed in the previous subsection. If the obtained probabilistic schedule is better than the current one, the algorithm saves the better PTG, and the rotation iteration continues.
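The delay movement performed inside SelectRotate can be illustrated on the PG of Figure 1(b). The sketch below is a simplification: it works only on the delay function of the PG, whereas the algorithm extracts roots from the PTG $G_s$ (which also contains flow-control edges) and updates that graph as well. The names `rotatable` and `rotate` are hypothetical.

```python
def rotatable(delays, node):
    """True when every incoming edge of node carries at least one delay,
    so that drawing one delay from each of them stays legal."""
    return all(d >= 1 for (u, v), d in delays.items() if v == node)


def rotate(delays, node):
    """Draw one delay from each incoming edge of node and push one delay
    onto each of its outgoing edges; returns a new delay mapping."""
    out = dict(delays)
    for (u, v) in delays:
        if v == node:
            out[(u, v)] -= 1
        if u == node:
            out[(u, v)] += 1
    return out


# Delays of the PG in Figure 1(b): two delays on D -> A, none elsewhere.
delays = {("A", "B"): 0, ("A", "C"): 0, ("A", "D"): 0,
          ("B", "D"): 0, ("C", "D"): 0, ("D", "A"): 2}

can_rotate_a = rotatable(delays, "A")
rotated = rotate(delays, "A")
```

After this step the edges leaving $A$ each carry one delay, so $A$ no longer constrains $B$, $C$ or $D$ within the same iteration and can be re-mapped freely, as in the example of Section 1.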
4 Experiments
In this section, we first show an overview of how PRS works and then present experimental results obtained from running PRS on some selected well-known benchmarks.

4.1 Example
Consider the PG in Figure 7(a) and the corresponding computation time in Figure 7(b). The confidence probability is given as $\theta = ?.?$. After list scheduling is applied, the initial execution order is constructed as shown in Figure 8(a). The corresponding PTG is presented in Figure 8(b). Nodes $A$, $B$, $H$ and $I$ are assigned to $PE_0$, nodes $E$ and $F$ are scheduled to $PE_1$, and nodes $C$, $G$ and $D$ are assigned to $PE_2$. Edges $B \xrightarrow{e} H$, $C \xrightarrow{e} G$ and $G \xrightarrow{e} D$ are flow-control edges.
[Figure 7: An example of the computation time of the graph in Figure ?(b). (a) PG with nodes A, B, C, D, E, F, G, H and I. (b) Computation time: the probability of each possible computation time for nodes A to I; the numeric entries are illegible in this copy.]
For this assignment, the $mrt$ of such a PTG is computed as follows:

[Table: each possible computation time of the $mrt$ with its probability; the numeric entries are illegible in this copy.]

Therefore, with higher than ??% confidence probability, $psl(G, ?.?) = ??$.
[Figure 8: Initial assignment and execution order. (a) Static execution order: PE0: A B H I; PE1: E F; PE2: C G D. (b) PTG over PE0, PE1 and PE2.]
Based on this PTG, either A or E can be rescheduled. In the first iteration, PRS selects A to be rescheduled. One delay is moved from all incoming edges of A and pushed to all outgoing edges of A. The resulting retimed graph PG is shown in Figure ?(a). In this graph, node A requires no direct data dependency from any node. In other words, A can be placed at any position in the schedule. To apply template scheduling, we first calculate the expected control step and the degree of flexibility of each node (see Figure ?(b)).
[Figure: (a) new PG over nodes A to I; (b) table of ET_v, Ecs and dflex for nodes A, B, C, D, E, F, G, H, I; numeric entries not recoverable]
Figure ?: PG after the rotation of A and template values
The template scheduling heuristic attempts to search for a position to map a node. Intuitively, it looks for a position right after a node whose degree of flexibility is highest in each processing element. Among these processing elements, the scheduling position will be selected if the mrt of the PTG with its temporary assignment is minimum. According to the information in Figures ?(a) and ?(b), node A is temporarily assigned after node B in PE?, after node F in PE? and after node D in PE?. From the calculation, the position of node A in PE? yields the best schedule length with the confidence probability being higher than ?.?. The template scheduling eventually decides to place A at a position between B and H in PE?. The resulting PTG, execution order, and the mrt of this PTG are shown in Figure ??.
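The selection just described can be sketched as a small search over one candidate slot per processing element. The sketch makes several assumptions: the execution order is a dictionary from PE name to node list, the PE names and the dflex numbers are hypothetical values chosen so that B, F and D are the most flexible nodes of their PEs as in the example, legality checks are omitted, and `toy_cost` is only a stand-in for the schedule-length evaluation of the temporarily updated PTG.

```python
from typing import Callable, Dict, List, Tuple

Order = Dict[str, List[str]]  # PE name -> execution order on that PE


def template_candidates(order: Order, dflex: Dict[str, float]) -> List[Tuple[str, int]]:
    """One insertion slot per PE: right after its most flexible node."""
    cands = []
    for pe, seq in order.items():
        if not seq:
            cands.append((pe, 0))  # empty PE: only one slot exists
            continue
        best = max(seq, key=lambda n: dflex[n])
        cands.append((pe, seq.index(best) + 1))
    return cands


def place(node: str, order: Order, dflex: Dict[str, float],
          cost: Callable[[Order], float]) -> Order:
    """Try each candidate slot on a copy of the order; keep the cheapest."""
    best_cost, best_trial = None, None
    for pe, pos in template_candidates(order, dflex):
        trial = {p: list(s) for p, s in order.items()}
        trial[pe].insert(pos, node)
        c = cost(trial)
        if best_cost is None or c < best_cost:
            best_cost, best_trial = c, trial
    return best_trial


# Execution order from the example with A removed; dflex values are hypothetical.
order = {"PE0": ["B", "H", "I"], "PE1": ["E", "F"], "PE2": ["C", "G", "D"]}
dflex = {"B": 3, "H": 1, "I": 0, "E": 1, "F": 2, "C": 0, "G": 1, "D": 2}


def toy_cost(trial: Order) -> float:
    """Hypothetical schedule lengths, keyed by the PE that receives A."""
    hypothetical = {"PE0": 10, "PE1": 12, "PE2": 11}
    pe = next(p for p, seq in trial.items() if "A" in seq)
    return hypothetical[pe]


result = place("A", order, dflex, toy_cost)
```

Under these made-up costs the sketch places A between B and H, which mirrors the decision reported above; with a real evaluation the cost function would compute the schedule length of each trial assignment instead.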
After running PRS for ?? iterations, the shortest possible schedule length was found in the ??th iteration. Figure ??(a) shows the final PG. According to this configuration, the schedule length is less than ?? with probability greater than ?.?. Finally, the final PTG, execution order, and its mrt are presented in Figures ??(b), ??(c) and ??(d), respectively.
(a) PTG [graph over nodes A to I, partitioned into PE0, PE1 and PE2]
(b) Execution order
PE?: B A H I
PE?: E F
PE?: C G D
(c) mrt [table of possible computation times and their probabilities; numeric entries not recoverable]
Figure ??: PTG, execution order and mrt after the rotation of A
(a) PG [graph over nodes A to I]
(b) PTG [graph over nodes A to I, partitioned into PE0, PE1 and PE2]
(c) Final execution order
PE?: A B I D
PE?: H C
PE?: E G F
(d) mrt [table of possible computation times and their probabilities; numeric entries not recoverable]
Figure ??: Final PG, PTG, execution order, and mrt
Columns: Spec.; Benchmarks; #nodes; for ? = ?.?: PL and PRS (AS, ET, TS); for ? = ?.?: PL and PRS (AS, ET, TS)
Spec. ? Adds, ? Mul.: Diff. Equation; ?-stage IIR; All-pole Lattice; Volterra; ?th Elliptic
Spec. ? Adds, ? Muls.: Diff. Equation; ?-stage IIR; All-pole Lattice; Volterra; ?th Elliptic
[numeric entries not recoverable]
Table ?: Comparison of the results obtained from applying list scheduling and probabilistic rotation scheduling to selected benchmarks
?.? Benchmarks
In order to be more realistic and easy to compare with traditional rotation scheduling, we tested the PRS algorithm on some well-known benchmarks: the ?th elliptic filter, the ?-stage IIR filter, the Volterra filter, and the lattice filter. The computation time of each node from those benchmark graphs is obtained from [?]. Such timing information consists of the minimum, typical and maximum values. For the purpose of the experiment, these values are broken down and generalized to fit a normal probability distribution. In practice, PRS can be applied to any one-dimensional loop body. The information about the computation time of a task can simply be obtained either by direct examination of the code or by the use of profile information collected by earlier runs of the program [??].
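One way to turn a (minimum, typical, maximum) triple into a discrete, normal-shaped distribution is sketched below. The paper does not state how the values are broken down, so every choice here is an assumption: integer time steps, the typical value as the mean, a standard deviation of one sixth of the range, and renormalization over the interval. The function name `discretized_normal` is hypothetical.

```python
import math
from typing import Dict


def discretized_normal(t_min: int, t_typ: int, t_max: int) -> Dict[int, float]:
    """Spread probability over the integer times t_min..t_max with a
    normal-shaped weight centred on t_typ, then renormalize to sum to 1."""
    if not t_min <= t_typ <= t_max:
        raise ValueError("expected t_min <= t_typ <= t_max")
    # Assumption: the range covers about six standard deviations.
    sigma = max((t_max - t_min) / 6.0, 1e-9)
    weights = {
        t: math.exp(-0.5 * ((t - t_typ) / sigma) ** 2)
        for t in range(t_min, t_max + 1)
    }
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


# Hypothetical timing triple for a single node.
dist = discretized_normal(1, 2, 3)
```

The result can be used directly as the probabilistic computation time of a node; a fixed-time node (minimum equal to maximum) collapses to a single value with probability one.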
Table ? demonstrates the effectiveness of our approach on both ?-adder, ?-multiplier and ?-adder, ?-multiplier systems. The performance of PRS is evaluated when the algorithm applies three different re-mapping heuristics: template scheduling (TS), exhaustive trial (ET) and as-late-as-possible scheduling (AS). The ET approach strives to re-map a node to all possible legal locations and returns the assignment which yields the minimum psl(G, ?). This method is simple and gives a good schedule; however, it is time consuming and not practical to try all possible scheduling places in every iteration. Furthermore, a PTG needs to be temporarily updated in every trial in order to compute the possible schedule length. On the contrary, the AS method reduces the number of trials by only attempting to schedule a task once at the legal farthest position in each functional unit (adder or multiplier), while the TS heuristic legally places a task after the node with the highest degree of flexibility in each functional unit.
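The difference in the number of trials between ET and AS can be illustrated with a short sketch. It assumes that the legal insertion slots of a task on each functional unit are already known and given as a contiguous range of indices; how legality is determined is not modelled, and the names `et_slots`, `as_slots` and the sample unit names are hypothetical.

```python
from typing import Dict, List, Tuple

Slot = Tuple[str, int]  # (functional unit, insertion index)


def et_slots(legal: Dict[str, range]) -> List[Slot]:
    """Exhaustive trial: every legal slot on every functional unit."""
    return [(fu, i) for fu, r in legal.items() for i in r]


def as_slots(legal: Dict[str, range]) -> List[Slot]:
    """As-late-as-possible: only the farthest legal slot on each unit."""
    return [(fu, r[-1]) for fu, r in legal.items() if len(r) > 0]


# Hypothetical legal ranges for one task on two adders.
legal = {"adder0": range(1, 4), "adder1": range(0, 3)}
et = et_slots(legal)
late = as_slots(legal)
```

Here ET evaluates six temporary assignments while AS evaluates two, one per unit. TS also evaluates one slot per unit, but chooses it by the degree of flexibility rather than by lateness.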
Columns ? = ?.? and ? = ?.? show the results when considering the probabilistic situation with the confidence probabilities ?.? and ?.?. Column "PL" presents the psl after list scheduling is applied to the benchmarks. After running PRS using the re-mapping heuristics ET, AS and TS, Columns ET, AS and TS show the resulting psl. Among these three heuristics, the TS scheme produces better results than AS, which uses the simplest criteria. Further, it yields results as good as, or sometimes even better than, those given by the ET approach, while TS takes less time to select a re-scheduled position for a node. This is because in each iteration the ET method finds the locally optimal place; however, scheduling nodes to these positions does not always result in the globally optimal schedule length.
In Table ?, based on the system that has ? adders and ? multiplier, we present the comparison results