Patterns and Statistical Analysis for Understanding Reduced Resource Computing

Martin Rinard
Massachusetts Institute of Technology
rinard@mit.edu

Henry Hoffmann
Massachusetts Institute of Technology
hank@csail.mit.edu

Sasa Misailovic
Massachusetts Institute of Technology
misailo@csail.mit.edu

Stelios Sidiroglou
Massachusetts Institute of Technology
stelios@csail.mit.edu

Abstract

We present several general, broadly applicable mechanisms that enable computations to execute with reduced resources, typically at the cost of some loss in the accuracy of the result they produce. We identify several general computational patterns that interact well with these resource reduction mechanisms, present a concrete manifestation of these patterns in the form of simple model programs, perform simulation-based explorations of the quantitative consequences of applying these mechanisms to our model programs, and relate the model computations (and their interaction with the resource reduction mechanisms) to more complex benchmark applications drawn from a variety of fields.

Categories and Subject Descriptors D.2.11 [Software Architectures]: Patterns; D.2.4 [Software/Program Verification]: Statistical Methods; D.3.1 [Formal Definitions and Theory]: Semantics

General Terms Design, Measurement, Performance, Reliability, Verification

Keywords Reduced Resource Computing, Discarding Tasks, Loop Perforation, Cyclic Memory Allocation, Statistical Analysis

Onward! 2010, October 17–21, 2010, Reno/Tahoe, Nevada, USA.
Copyright © 2010 ACM 978-1-4503-0236-4/10/10...$10.00

1. Introduction

The amount of available resources is a central factor in the existence of virtually all living organisms. Mechanisms that adapt the operation of the organism to variations in resource availability occur widely throughout nature. For example, during prolonged starvation, the human body preserves muscle mass by shifting its fuel source from proteins to ketone bodies [3]. Peripheral vasoconstriction, which minimizes heat loss by limiting the flow of blood to the extremities, is a standard response to hypothermia. The nasal turbinates in dehydrated camels extract moisture from exhaled respiratory air, thereby limiting water loss and enhancing the ability of the camel to survive in desiccated environments [31]. All of these mechanisms take the organism away from its preferred operating mode but enable the organism to degrade its operation gracefully to enhance its survival prospects in resource-poor environments.

The vast majority of computer programs, in contrast, execute with essentially no flexibility in the resources they consume. Standard programming language semantics entails the execution of every computation the program attempts to perform. If the memory allocator fails to return a valid reference to an allocated block of memory, the program typically fails immediately with a thrown exception, failed error check, or memory addressing error. This inability to adapt to changes in the underlying operating environment impairs the flexibility, robustness, and resilience of almost all currently deployed software systems.

Reduced resource computing encompasses a set of mechanisms that execute programs with only a subset of the resources (time and/or space) that the standard programming language semantics and execution environment provides. Specific reduced resource computing mechanisms include:
• Discarding Tasks: Parallel computations are often structured as a collection of tasks. Discarding tasks produces new computations that execute only a subset of the tasks in the original computation [23, 24].

• Loop Perforation: Loop perforation transforms loops to execute only a subset of the iterations in the original computation [16, 20]. Different loop perforation strategies include modulo perforation (which discards or executes every nth iteration for some fixed n), truncation perforation (which discards either an initial or final block of iterations), and random perforation (which discards randomly selected iterations).

• Cyclic Memory Allocation: Cyclic memory allocation allocates a fixed-size buffer for a given dynamic allocation site [21]. At each allocation, it returns the next element in the buffer, wrapping back around to the first element when it reaches the end of the buffer. If the number of live objects allocated at the site is larger than the number of elements in the buffer, cyclic memory allocation produces new computations that execute with only a subset of the memory required to execute the original computation.
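The three perforation strategies above can be sketched on a simple summation loop. This is an illustrative sketch, not code from the paper; the function names are ours, and rand is used here as a stand-in pseudorandom source for random perforation.

```c
#include <stdlib.h>

/* Modulo perforation: execute every nth iteration. */
double sum_modulo(const double *a, int len, int n) {
    double sum = 0.0;
    for (int i = 0; i < len; i += n)
        sum += a[i];
    return sum;
}

/* Truncation perforation: discard a final block of iterations. */
double sum_truncate(const double *a, int len, int discard) {
    double sum = 0.0;
    for (int i = 0; i < len - discard; i++)
        sum += a[i];
    return sum;
}

/* Random perforation: execute each iteration with probability p. */
double sum_random(const double *a, int len, double p) {
    double sum = 0.0;
    for (int i = 0; i < len; i++)
        if ((double)rand() / RAND_MAX < p)
            sum += a[i];
    return sum;
}
```

Each variant trades away a controllable fraction of the iterations, which is what makes perforation suitable for navigating accuracy versus performance trade-off spaces.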

1.1 Resource Reduction in Practice
Unsurprisingly, these mechanisms almost always change the output that the program produces. So they are appropriate only for computations that have some flexibility in the output they produce. Examples of such computations include many numerical and scientific computations, sensory applications (typically video and/or audio) that involve lossy encoding and decoding, many machine learning, statistical inference, and finance computations, and information retrieval computations. The relevant question is whether these kinds of computations are still able to deliver acceptable output after resource reduction.

Interestingly enough, our empirical results show that many of these computations contain components that can successfully tolerate the above resource reduction mechanisms — the computation still produces acceptably accurate outputs after the application of these mechanisms to these components. And these resource reduction mechanisms can often endow computations with a range of capabilities that are typically otherwise available only through the manual development of new algorithms. Specifically, discarding tasks has been shown to enable computations to tolerate task failures without retry [23], to produce accuracy and performance models that make it possible to purposefully and productively navigate induced accuracy versus performance trade-off spaces (for example, maximizing accuracy subject to performance constraints or maximizing performance subject to accuracy constraints) [23], and to eliminate barrier idling at the end of parallel loops [24]. Cyclic memory allocation has been shown to eliminate otherwise potentially fatal memory leaks [21]. Loop perforation has been shown to reduce the overall execution time of the computation and to enable techniques that dynamically control the computation to meet real-time deadlines in the face of clock rate changes and processor failures [16].
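The cyclic memory allocation mechanism can be sketched for a single allocation site as follows; the buffer size, element size, and names here are illustrative choices of ours, not values from the paper.

```c
#include <stddef.h>

#define SLOTS 4        /* fixed number of buffer elements (illustrative) */
#define ELEM_SIZE 32   /* size of each allocated object (illustrative)   */

static char buffer[SLOTS][ELEM_SIZE];
static int next_slot = 0;

/* Replaces the allocator at one dynamic allocation site: returns the
   next element in the fixed-size buffer, wrapping back around to the
   first element at the end.  If more than SLOTS objects allocated at
   this site are live at once, older live objects are silently
   overwritten: the source of the potential accuracy loss, and the
   reason this mechanism bounds memory and eliminates leaks. */
void *cyclic_alloc(void) {
    void *p = buffer[next_slot];
    next_slot = (next_slot + 1) % SLOTS;
    return p;
}
```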

A key to the successful application of these mechanisms in practice is identifying the components that can successfully tolerate resource reduction, then applying resource reduction only to those components. This empirical fact leads to usage scenarios in which the resource reduction mechanisms generate a search space of programs close to the original programs. An automated (or semiautomated) search of this space finds the components that can tolerate resource reduction, with resource reduction confined to those components when the computation executes. The remaining components execute with the full set of resources with which they were originally designed to operate. The resulting effect is conceptually similar to the mechanisms that biological organisms use to deal with reduced resources, which direct the delivery of scarce resources to those critical functions most necessary for survival.

1.2 Inherent Redundancy

The success of reduced resource computing shows that many computations, like biological organisms, have inherent sources of redundancy that enable them to operate successfully in the face of reduced resources. Note, however, that these sources of redundancy were not explicitly engineered into the computation — they emerge as an unintended consequence of the way the computation was formulated. In this paper we analyze various sources of redundancy that enable these computations to tolerate resource reduction.

The result of this analysis is several general computational patterns that interact in very reasonable ways with the different resource reduction mechanisms. Viewing our computations through the prism of these patterns helped us understand the behavior we were observing; we anticipate that recognizing these patterns in other computations will facilitate the prediction of how these other computations will react to resource reduction.

In the future, trends such as the increasing importance of energy consumption, the need to dynamically adapt to computing platforms with fluctuating performance, load, and power characteristics, and the move to more distributed, less reliable computing platforms will increase the need for computations that can execute successfully across platforms with a range of (potentially fluctuating) available resources. Initially, we expect developers to let automated techniques find and exploit patterns in existing applications that interact well with resource reduction. They may then move on to deploying such patterns into existing applications to enhance their ability to function effectively in a range of environments. Ultimately, we expect developers to engineer software systems from the start around patterns that interact well with resource reduction in much the same way that developers now work with more traditional design patterns [12] in all phases of the engineering process.
1.3 Contributions

This paper makes the following contributions:
• Computational Patterns: It identifies computational patterns that interact well with resource reduction mechanisms such as discarding tasks, perforating loops, and dynamically allocating memory out of fixed-size buffers. Understanding these patterns can help developers develop a conceptual framework that they can use to reason about the interaction of their applications with various resource reduction mechanisms.

• Model Computations: It presents concrete manifestations of the general patterns in the form of simple model computations. These model computations are designed to capture the essential properties of more complex real-world applications that enable these applications to operate successfully in the presence of resource reduction mechanisms.

In this way, the model computations can give developers simple, concrete examples that can help them think productively about the structure of their applications (both existing and envisioned) and how that structure can affect the way their applications will respond to the application of different resource reduction mechanisms.

The model computations can also serve as the foundation for static analyses that recognize computations that interact well with resource reduction mechanisms. Such analyses could produce statistical models that precisely characterize the effect of resource reduction mechanisms on the application at hand, thereby making it possible to automatically apply resource reduction mechanisms to obtain applications with known statistical accuracy properties in the presence of resource reduction.

• Simulations: It presents the results of simulations that use the model computations to quantitatively explore the impact of resource reduction on the accuracy of the results that the computations produce. The simulation results can help developers to better estimate and/or analyze the likely quantitative accuracy consequences of applying resource reduction mechanisms to their own applications.

• Relationship to Applications: It relates the structure of the model computations and the simulation accuracy results back to characteristics of specific benchmark applications. Understanding these relationships can help developers better understand the relationships between the model computations and their own applications.

• A New Model of Computation: Standard models of computation are based on formal logic [11, 14]. In these models, the computation is rigidly fixed by the application source code, with formulas in discrete formal logic characterizing the relationship between the input and output. This paper, in contrast, promotes a new and fundamentally different model in which the computation is flexible and dynamic, able to adapt to varying amounts of resources, and characterized by (conceptually) continuous statistical relationships between the input, output, and amount of resources that the computation consumes.

Of course, almost every program has some hard logical correctness requirements — even a video encoder, for example, must produce a correctly formatted video file (even though it has wide latitude in the accuracy of the encoded video in the file). We therefore anticipate the development of new hybrid analysis approaches which verify appropriate hard logical correctness properties using standard program analysis techniques but use new statistical techniques to analyze those parts of the computation whose results can (potentially nondeterministically) vary as long as they stay within acceptable statistical accuracy bounds.

2. The Mean Pattern

Consider the following computations:
• Search: The Search computation [7] from the Jade benchmark suite [28] simulates the interaction of electron beams with solids. It uses a Monte-Carlo simulation to track the paths of electrons, with some electrons emerging back out of the solid and some remaining trapped inside. The program simulates the interaction for a variety of solids. It produces as output the proportion of electrons that emerge out of each solid. Each parallel task simulates some of the electron/solid interaction pairs.

• String: The String computation [13] from the Jade benchmark suite [25] uses seismic travel-time inversion to compute a discrete velocity model of the geology between two oil wells. It computes the travel time of rays traced through the geology model, then backprojects the difference between the ray tracing times and the experimentally observed propagation times back through the model to update the individual elements in the velocity model through which the ray passed. Each parallel task traces some of the rays.

The core computations in both Search and String generate sets of numbers, then compute the mean of each set. In Search, each number is either one (if the electron emerges from the solid) or zero (if the electron is trapped within the solid). There is a single set of ones and zeros for each solid; the output is the mean of the set. In String there is one set of numbers for each element of the discrete velocity model. Each number is a backprojected difference from one ray that traversed the element during its path through the geology model. String combines the numbers in each set by computing their mean. It then uses these numbers to update the corresponding elements of the velocity model.

The resource reduction mechanism for both computations, discarding tasks, has the effect of eliminating some of the numbers from the sets. It is possible to derive empirical linear regression models that characterize the effect of this resource reduction mechanism on the output that these two computations produce [23]. These models show that discarding tasks has a very small impact on the output that the computation produces. Specifically, the models indicate that discarding one quarter of the tasks changes the Search output by less than 3% and the String output by less than 1%; discarding half of the tasks (which essentially halves the running time) changes the Search output by less than 6% and the String output by less than 2%.
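One simple way such an empirical linear model can be derived is an ordinary least-squares fit of observed output distortion against the fraction of discarded tasks. The helper below is a generic sketch of that fit, not the paper's actual modeling code; the names and the example pairing of x and y are ours.

```c
/* Least-squares fit of y = a + b*x over n observed points, e.g.
   x = fraction of tasks discarded, y = observed output distortion. */
void fit_line(const double *x, const double *y, int n,
              double *a, double *b) {
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (int i = 0; i < n; i++) {
        sx  += x[i];
        sy  += y[i];
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }
    *b = (n * sxy - sx * sy) / (n * sxx - sx * sx);  /* slope     */
    *a = (sy - *b * sx) / n;                         /* intercept */
}
```

Once fitted, the slope b directly predicts, for instance, the distortion at one quarter or one half of the tasks discarded.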

2.1 The Model Computation

Our model computation for these two computations simply computes the mean of a set of numbers:
for (i = 0; i < n; i++) {
    sum += numbers[i];
    num++;
}
mean = sum / num;

The resource reduction mechanism for this model computation executes only a subset of the loop iterations to compute the mean of a subset of the original set of numbers. The specific mechanism we evaluate (loop perforation) simply discards every other number when it computes the sum by executing only every other iteration in the model computation:

for (i = 0; i < n; i += 2) {
    sum += numbers[i];
    num++;
}
mean = sum / num;

We evaluate the effect of this resource reduction mechanism via simulation. Our first simulation works with floating point numbers selected from a continuous probability distribution. Each trial in the simulation starts by filling the array numbers with the values of n independent pseudorandom variables selected from the uniform distribution over the interval [0,1]. It then computes the difference between the computed mean values with and without resource reduction — i.e., the difference between the mean of all n values in the numbers array and the mean of every other value in the numbers array. For each even value of n between 10 and 100, we perform 1,000,000 such trials.

Our second simulation works with integers selected from a discrete probability distribution. Each trial fills the array numbers with the values of n pseudorandom variables selected from the uniform discrete distribution on the set {0,1}. We then perform the same simulation as detailed above for the continuous distribution.
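The continuous-distribution simulation can be compressed into the following sketch. The trial counts below are small stand-ins for the 1,000,000 trials used in the actual experiments, and rand serves as an illustrative pseudorandom source.

```c
#include <stdlib.h>
#include <math.h>

/* One trial: fill n values from the uniform distribution on [0,1],
   then return the absolute difference between the mean of all n
   values and the mean of every other value (the perforated mean). */
double trial(int n) {
    double sum = 0.0, psum = 0.0;
    int num = 0, pnum = 0;
    for (int i = 0; i < n; i++) {
        double v = (double)rand() / RAND_MAX;
        sum += v; num++;
        if (i % 2 == 0) { psum += v; pnum++; }
    }
    return fabs(sum / num - psum / pnum);
}

/* Mean difference over a number of trials. */
double mean_difference(int n, int trials) {
    double total = 0.0;
    for (int t = 0; t < trials; t++)
        total += trial(n);
    return total / trials;
}
```

Running mean_difference for each set size n reproduces the qualitative shape of the curves described next: the differences shrink as n grows.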

Figure 1 presents the results of these trials. This figure presents four graphs. The x axes for all graphs range over the sizes (values of n) of the sets in the trials. The upper left graph plots the mean of the differences between the results that the standard and resource reduced computations produce (each point corresponds to the mean of the observed differences for the 1,000,000 trials for sets of the corresponding size) for the continuous distribution. The upper right graph plots the variances of the differences for the continuous distribution. The lower left graph plots the mean of the differences for the discrete distribution; the lower right graph plots the corresponding variances.

These numbers show that our model computation exhibits good accuracy properties in the presence of resource reduction. Specifically, for all but the smallest sets of numbers, resource reductions of a factor of two cause (in most cases substantially) less than a 10% change in the result that the computation produces. We attribute this robustness in the face of resource reduction to a diffuse form of partial redundancy that arises from the interaction of the computation with the data on which it operates. Because the numbers in the reduced resource computation are a subset of the complete set of numbers and because the numbers are all drawn from the same probability distribution, the two mean values are highly correlated (with the correlation increasing as the size of the sets increases).

We note that the graphs show that the accuracy of the resource reduced computation depends on the size of the set of numbers, with larger sets producing smaller differences and variances than smaller sets of numbers. This phenomenon is consistent with our redundancy-based perspective, since smaller sets of numbers provide our model computation with less redundancy than larger sets of numbers.

The discrete distribution has higher mean differences and variances than the continuous distribution, which we attribute to the concentration of the weight of the discrete probability distribution at the extremes. We also note that all of our simulation numbers are for resource reductions of a factor of two, which is much larger than necessary for many anticipated scenarios (for example, scenarios directed towards tolerating failures).

2.2 Relationship to Search and String

[Figure 1: Means and Variances of Differences Between Standard and Resource Reduced Mean Computations. Four graphs plot, against the size of the set of numbers (10 to 100), the mean difference and the variance of the differences for the continuous distribution (top row) and for the discrete distribution (bottom row).]

The model computation provides insight into why the applications tolerate the task discarding resource reduction mechanism with little loss in accuracy. While discarding tasks may, in principle, cause arbitrary deviations from the standard behavior of the application, the underlying computational patterns in these applications (although obscured by the specific details of the realization of the patterns in each application) interact well with the resource reduction mechanism (discarding tasks). The final effect is that the resource reduction mechanism introduces some noise into the computed values, but has no other systematic effect on the computation. And in fact, the results from our model computation show that it is possible to discard half the tasks in the computation with (depending on the size of the set of numbers) single digit percentage accuracy losses. This result from our model computation is consistent with the results from both the Search and String applications. The larger mean differences and variances for the discrete distribution show that the Search application (in which each number is either 0 or 1) needs to work with larger sets than the String application (in which each number is a floating point number) to obtain similar accuracies under similar resource reduction factors.

3. The Sum Pattern

Consider the following computations:
• Water: Water evaluates forces and potentials in a system of molecules in the liquid state [25]. Although the structure is somewhat obscured by the way the application models the interactions between the water molecules, the core computations in Water boil down to computing, then taking the sum of, sets of numbers. For example, a key intermolecular force calculation computes, for each water molecule, the sum of the forces acting on that water molecule from all of the other water molecules. Water is coded as a Jade program [25], with each task computing, then taking the sum of, a subset of the corresponding set of numbers.

• Swaptions: Swaptions uses a Monte-Carlo simulation to solve a partial differential equation to price a portfolio of swaptions [5]. The core computation takes the sum of the results from the individual simulations. The application computes the final result by dividing the sum by the number of simulations.

The resource reduction mechanism for Water is discarding tasks [23, 24]; the resource reduction mechanism for Swaptions is loop perforation [16, 20]. In both cases the effect is a reduction in the result proportional to the number of discarded tasks or loop iterations. Unlike the Search and String computations discussed in Section 2, for Water and Swaptions discarding many tasks or loop iterations can therefore induce a large change in the overall result that the computation produces.

3.1 The Model Computation

The model computation for Water and Swaptions computes the sum of a set of pseudorandom numbers:

for (i = 0; i < n; i++) sum += numbers[i];

As for the mean model computation, the resource reduction mechanism is to discard every other iteration of the loop:

for (i = 0; i < n; i += 2) sum += numbers[i];

The effect is to divide the result (the value of the sum variable) by approximately a factor of two. It is possible, however, to use extrapolation to restore the accuracy of the computation [16, 23]: simply multiply the final result by two or, more generally, by the number of tasks or loop iterations in the original computation divided by the number of tasks or loop iterations in the resource reduced computation. Note that the former number (the number of tasks or loop iterations in the original computation) is typically available in the resource reduced computation as a loop bound (for our model computation, n) or some other number used to control the generation of the computation. After extrapolation, the accuracy