Faster cyclic loess: normalizing RNA arrays via linear models
Karla V. Ballman∗, Diane E. Grill, Ann L. Oberg and
Terry M. Therneau
Division of Biostatistics, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
Received on January 9, 2004; revised and accepted on February 20, 2004
Advance Access publication May 27, 2004
Motivation: Our goal was to develop a normalization technique that yields results similar to cyclic loess normalization with speed comparable to quantile normalization.
Results: The new method, fastlo, produces normalized values similar to those of cyclic loess and quantile normalization and is fast; it is at least an order of magnitude faster than cyclic loess and approaches the speed of quantile normalization. Furthermore, fastlo is more versatile than both cyclic loess and quantile normalization because it is based on a linear model that can be extended.
Availability: The Splus/R function for fastlo normalization is available from the authors.
High-density gene expression array technology allows
investigators to obtain quantitative measurements of the
expression levels of thousands of genes in a biological
specimen. There are two major types of microarray technologies: spotted cDNA and oligonucleotide arrays. Expression
data obtained from either type of microarray technology has
measurement error or variation. One type of variability is due
to biological differences between specimen samples. This
is what is of interest to the investigator; the investigator
would like to know which genes are differentially expressed
among different biological samples (e.g. between cancerous and normal kidney tissues, or among B-cells challenged with different agents in culture). Systematic variation also
affects the measured gene expression level. Many sources
contribute to this type of variation and are found in every
microarray experiment. Sources include, but are not lim-
ited to, the array manufacturing process, the preparation of
the biological sample and the hybridization of the sample to the array. A more complete discussion of the process can be found elsewhere (et al., 2001).
∗To whom correspondence should be addressed.
The purpose of normalization is to minimize the
systematic variations in the measured gene expression levels
among different array hybridizations to allow the compar-
ison of expression levels across arrays and so that biological
differences can be more easily identified.
There are numerous methods for normalizing gene expres-
sion data. Generally, normalization methods use a scaling
factor or function that is applied to the raw intensities of the spots (gene
sequences on a spotted cDNA array or probe sets on a
GeneChip array) on the microarray to produce normalized or
scaled intensities. Types of normalization techniques include
mean correction (Richmond and Somerville, 2000), non-
linear models (Yang et al., 2002), linear combination of
factors (Alter et al., 2000), and Bayesian methods (Newton et al., 2001). Methods that normalize the complete set of arrays together, without the choice of a baseline array, have been found to perform the best (Bolstad et al., 2003). We
tend to favor two commonly used non-linear methods: cyclic
loess normalization and quantile normalization. Both tech-
niques are non-linear and perform normalization on the set of
arrays as a whole without specifying a reference array. Over-
all, we prefer cyclic loess because it is not as aggressive in
its normalization as is quantile normalization; however, cyc-
lic loess is relatively slow for even a moderate sized set of
arrays. Quantile normalization, on the other hand, is very
fast for even large sets of arrays. The goal of our investigation was to develop a method that produces normalized values similar to those of cyclic loess but is much faster, on the order of the speed of quantile normalization.
The result is a new normalization method called fastlo. Since
normalization is performed on the raw intensity values of the
spots on the arrays, the methods discussed here are applic-
able to both major types of array technology: spotted cDNA
and oligonucleotide. However, our focus is on data arising from GeneChip arrays, and it should be noted that there are additional normalization issues specific to spotted cDNA arrays (Yang et al., 2002).
In the next section we discuss cyclic loess normalization
and useful insights that can be gained from viewing it as a
Bioinformatics vol. 20 issue 16 © Oxford University Press 2004; all rights reserved.
Fig. 1. MA plot for 10000 randomly selected probes; the x-axis is the average of the log (base 2) values and the y-axis is the difference of the log (base 2) values.
parallel algorithm. Section 3 describes a new type of normal-
ization, fast linear loess (fastlo), arising from the observation
that cyclic loess is essentially a smoothing function coupled
with a very simple linear model. In Section 4, we review
quantile normalization. The performances of the three dif-
ferent normalization techniques are compared in Section 5.
Comparisons are made using simulated data and real data,
one of the benchmark sets for Affymetrix GeneChip expres-
sion measures (Cope et al., 2003). The simple linear model
underlying the fastlo method is extended in Section 6. We
close with a discussion of our findings in Section 7.
Most methods of normalization, including the methods discussed here, assume that the majority of genes do not change expression levels under the conditions being studied. In other words, the ratio of expression values between two conditions is one, or the difference in log expression is zero, for a typical gene. This is biologically plausible
for many studies. However, if there is good reason to believe
this assumption is not true for a particular study, then the
normalization methods described here are not appropriate.
A fundamental graphical tool for the analysis of gene expres-
sion array data is the M versus A plot (MA plot); here M is
the difference in log expression values and A is the average
of log expression values (Dudoit et al., 2002). Figure 1 contains an MA plot for a random sample of 10000 probes from two (unnormalized) GeneChip arrays, with a loess smoother fitted through the points. Ideally, normalized arrays would show a point cloud scattered about the M = 0 axis. In other
words, the loess smoother would be a horizontal line at 0 for
ideally normalized data.
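In symbols, for two arrays with log2 intensities y1 and y2, the MA coordinates are A = (y1 + y2)/2 and M = y1 − y2. A minimal sketch (the values below are illustrative, not from the paper):

```python
import numpy as np

def ma_coordinates(y1, y2):
    """Return (A, M) for two arrays of log2 intensities.

    A is the average of the log2 values (x-axis) and
    M is their difference (y-axis), one point per spot.
    """
    y1 = np.asarray(y1, dtype=float)
    y2 = np.asarray(y2, dtype=float)
    A = (y1 + y2) / 2.0
    M = y1 - y2
    return A, M

# Two hypothetical arrays of log2 intensities
y1 = np.array([8.0, 10.0, 12.0])
y2 = np.array([7.0, 10.5, 11.0])
A, M = ma_coordinates(y1, y2)
```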
Cyclic loess normalizes two arrays at a time by applying a
correction factor obtained from a loess curve fit through the
MA plot of the two arrays; call the curve f(x). For example, consider the circled point in Figure 1. This point corresponds to a particular spot on the two arrays; the expression level at this spot would be reduced by 1/2 the distance of
f(x) from the y = 0 line; in other words, f(x)/2 would be
subtracted from the expression level of this spot on one of the
arrays. The expression value for this spot on the other array
would be increased by f(x)/2. After correction, the MA plot
for this particular pair of arrays would be horizontal. One
pass of the cyclic loess algorithm consists of performing this
pairwise normalization on all distinct pairs of arrays. Passes
of the algorithm continue until the computed corrections of a
completed pass are essentially zero.
In summary, to perform the cyclic loess algorithm, begin with the log2 of the spot expression intensities arranged as a matrix with one column per array and one row per spot, and proceed through the steps below.
(1) Construct an MA plot for a pair of arrays: the x-axis is the mean probe expression value of the two arrays and the y-axis is the difference (one point for each spot).
(2) Fit a smooth loess curve f(x) through the data.
(3) Subtract f(x)/2 from the first array and add f(x)/2 to the second array.
(4) Repeat until all distinct pairs have been compared.
(5) Repeat until the algorithm converges.
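The steps above can be sketched as follows; a crude running-mean smoother stands in for a true loess fit, and the function and parameter names are ours, not the authors':

```python
import numpy as np

def smooth(x, y, frac=0.3):
    """Crude stand-in for a loess smooth: for each point, average y over
    a window of the nearest `frac` fraction of points in x-order.  A real
    implementation would use a proper loess/lowess fit."""
    n = len(x)
    k = max(2, int(frac * n))
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    fitted = np.empty(n)
    for idx, xi in enumerate(x):
        j = np.searchsorted(xs, xi)
        lo = max(0, min(j - k // 2, n - k))
        fitted[idx] = ys[lo:lo + k].mean()
    return fitted

def cyclic_loess_pass(Y):
    """One pass of cyclic loess over all distinct pairs of columns of Y
    (log2 intensities: one column per array, one row per spot)."""
    Y = Y.copy()
    n_arrays = Y.shape[1]
    for i in range(n_arrays):
        for j in range(i + 1, n_arrays):
            A = (Y[:, i] + Y[:, j]) / 2.0   # x-axis of the MA plot
            M = Y[:, i] - Y[:, j]           # y-axis of the MA plot
            f = smooth(A, M)                # loess curve f(x)
            Y[:, i] -= f / 2.0              # subtract f(x)/2 from one array
            Y[:, j] += f / 2.0              # add f(x)/2 to the other
    return Y
```

Because each correction subtracts f(x)/2 from one column and adds it to another, the row means of Y are preserved exactly, as noted later in the text.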
In practice, the pairs of arrays are chosen by a method that systematically cycles through all pairs. A drawback of cyclic loess is the amount of time required to normalize a set of data; the time grows quadratically as the number of arrays increases. Typically, two or three passes through the complete cycle are required for convergence. Cyclic loess would likely converge faster if the pairings were made in a more balanced order; however, a loess smooth would still be required for a relatively large number of array pairs.
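The quadratic growth in per-pass work is easy to tabulate, since one loess smooth is fitted per distinct pair of arrays:

```python
from math import comb

# Distinct array pairs (and hence loess smooths) per pass of cyclic loess
pair_counts = {n: comb(n, 2) for n in (4, 8, 16, 32, 64)}
```

For example, 8 arrays require 28 smooths per pass, while 64 arrays require 2016.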
Further examination of the cyclic loess algorithm reveals
some interesting facts. First, the algorithm preserves the row
means of the data matrix, Y. At any given step, one number in
a row is increased by f(x)/2 and another is decreased by the
same amount. Second, if all the values in one of the columns of Y are shifted by a constant, the results of the algorithm (the scaled intensities on each array) are changed only by the addition of a constant. This is because any one of the plots on which the smooths are based is identical but for the labeling of its axes, and thus any given smooth is changed only by a constant. One pass through the algorithm requires C(n, 2) = n(n − 1)/2 loess smooths, each on all the spots on the array.
2.2 Parallel loess
Cyclic loess is inherently parallel in nature. Viewing it in this
manner may provide insights that allow computational time
savings. Imagine that we had a parallel machine, so that all the pairwise normalizations could be done simultaneously. Once
each of the pairwise corrections is obtained, the correction for spot i on array j would be the 'average' correction for this spot across all array pairs containing array j. Essentially, the correction to the spots on array j would be the average of the corrections from the pairwise plots involving array j. To simplify the logic, we consider the pairing of each array with itself, as well as both orderings of the pairs. This means that for n arrays, there are n² pairings. Here, and throughout the remaining text, Y is used to denote the matrix of the log2 intensity values, where column j corresponds to an array and row i corresponds to a spot.
As the simplest example, consider the noiseless case where
each column of Y, corresponding to an array, differs from any
other by a constant and array ‘0’ is an imaginary reference
representing the true expression level. In other words, the
intensity of spot (i, j), yij, can be expressed as yij = yi0 + cj. The ideal correction of all spots for a given chip j is cj − c̄ (a horizontal loess curve), and the average correction for array 1 from the parallel algorithm is

(1/n) Σj (c1 − cj)/2 = (c1 − c̄)/2.

That is, the average correction is 1/2 of what it should be. As a result, we define the correction step for the ith chip in parallel loess to be (2/n) Σj fij, where fij is the smooth from the plot of chips i and j.
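In this noiseless setting the factor-of-two relationship can be checked directly; the offsets below are hypothetical values of our own choosing:

```python
import numpy as np

# Hypothetical per-array offsets c_j for a noiseless example
c = np.array([1.0, 2.0, 3.0, 4.0])
cbar = c.mean()

# Average over the n pairings (including array 1 with itself) of the
# pairwise MA-plot corrections (c_1 - c_j)/2 for array 1
avg_correction = np.mean((c[0] - c) / 2.0)

# The ideal correction for array 1 is c_1 - cbar; the average is half that
ideal = c[0] - cbar
```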
Now consider a simple simulation where each column of the data matrix (each array) differs from any other by a constant plus a symmetrically distributed noise term. Array '0' is the set of 5000 true intensities, which were randomly selected from a uniform distribution with range from 0 to 10. The log2 intensity levels for the four arrays in the experiment, yij, are derived from array '0' as follows: yij = yi0 + j + eij, where eij ∼ t8 (i = 1, 2, ..., 5000 and j = 1, 2, 3, 4). A t-distribution was selected for the error term (noise distribution) because it is symmetric and produces more outliers than a normal distribution. In this case, the true correction for array j is j − j̄ = j − 2.5. The four pairwise corrections for a particular spot involving array 1 have expectations of 0, −1.0, −2.0 and −3.0. The average of the four corrections for a particular spot is −1.5. Figure 2 shows the four computed corrections for each spot from parallel loess involving array 1, as well as the average of the corrections across arrays. This suggests averaging the corrections to get an overall update, rather than applying each of the separate corrections to the data in turn.
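The simulation just described can be reproduced in outline; here the raw pairwise differences (twice the loess corrections) stand in for actual smooths, and the seed and tolerances are our own:

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 5000, 4
y0 = rng.uniform(0.0, 10.0, size=p)          # array '0': true intensities

# y_ij = y_i0 + j + e_ij with t_8 noise, for arrays j = 1,...,4
Y = y0[:, None] + np.arange(1, n + 1) + rng.standard_t(df=8, size=(p, n))

# Pairwise differences of array 1 with each array: expectations 0, -1, -2, -3
diffs = Y[:, [0]] - Y
avg_correction = diffs.mean(axis=1)          # per-spot average, expectation -1.5
```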
For this case, performing the smooths cyclically or in parallel produces equivalent results. However, we did not intend to obtain computational speed savings by developing a parallel version of cyclic loess.
Fig. 2. The computed corrections for array 1 from the 4 pairwise smooths for parallel loess (1v1, 1v2, 1v3, 1v4), for the average of the pairwise corrections (avg), and for cyclic loess (cyclic).
Rather, we used this construct to gain insight into how cyclic loess works and how it might be made faster on non-parallel hardware.
Parallel loess can be shown to be unbiased for this simple case. Let yij = αi + βj + εij; the αi are the true probe values (yi0 in the simulation), the βj the simple array effects, and the εij the errors. Assume that a and b are two vectors of constants, and a smooth of Yb on Ya is to be used to estimate the correction. Since the correction should be constant, we want Ya and Yb to be uncorrelated, i.e. that there be no linear bias in the smooth. The covariance matrix of Y has elements of σ²α + σ²ε on the diagonal and σ²α off the diagonal, where these are the variances of α and ε above. Simple algebra shows the covariance of Ya and Yb to be σ²α (Σj aj)(Σj bj) + σ²ε Σj aj bj. Since σ²α and σ²ε are unknown for a given problem, linear bias in the plot is avoided by choosing a and b so that both terms are zero. For parallel loess, the plot for arrays 1 and 2 has a = (1/2, 1/2, 0, ..., 0) and b = (1, −1, 0, ..., 0); clearly Σj bj = 0 and Σj aj bj = 0. Other choices of a and b will be explored below.

2.3 Parallel loess variant one
An obvious variant of the parallel cyclic loess algorithm is to replace the n loess smooths, each on p points, with a single loess smooth on np points. In other words, the corrections to be applied to the spots on array j would be obtained by placing all the points in the MA plots containing array j (n of them) into a single plot and performing a single loess smooth on this plot. Recall that the corrections for array j are obtained in parallel loess by averaging the n pairwise corrections obtained from the n MA plots, and that the corrections are just twice the smooth value. So if the smoother is a linear operator, which loess is apart from its outlier-rejection passes, then the average of the smooths will equal a single smooth of the points from the n MA plots involving array j, all placed on a single plot.
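The linearity property invoked here can be illustrated directly: for a fixed set of x values, a smoother without outlier rejection satisfies Sm(ay1 + by2) = a·Sm(y1) + b·Sm(y2). A sketch with a running-mean stand-in for loess (all names are ours); note that the combined-plot argument additionally pools points with different x values, where the equality is only approximate:

```python
import numpy as np

def running_mean_smooth(x, y, k=25):
    """A simple linear smoother: average of the k nearest y values in
    x-order.  Stand-in for loess without its outlier-rejection passes."""
    order = np.argsort(x)
    ys = y[order]
    n = len(x)
    fitted = np.empty(n)
    for rank, idx in enumerate(order):
        lo = max(0, min(rank - k // 2, n - k))
        fitted[idx] = ys[lo:lo + k].mean()
    return fitted

rng = np.random.default_rng(2)
x = rng.uniform(0.0, 10.0, 500)
y1 = np.sin(x) + rng.normal(0.0, 0.2, 500)
y2 = np.cos(x) + rng.normal(0.0, 0.2, 500)

# Smooth of the average equals the average of the smooths (linearity in y)
lhs = running_mean_smooth(x, (y1 + y2) / 2.0)
rhs = (running_mean_smooth(x, y1) + running_mean_smooth(x, y2)) / 2.0
```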
Fig. 3. The computed corrections for arrays 2 and 3 from the example of 4 arrays. The corrections are from a single smooth on all the derived 20000 data points (parallel variant 1), from parallel loess (parallel), and from cyclic loess (cyclic).
Because of the very large amount of data in the combined plot, the loss of outlier rejection is not normally an important issue: when there is a large quantity of data over a given range, no point, even if extreme, can exert much influence.
Figure 3 compares the computed corrections for the spots on two arrays from our four-array example introduced above: true expression values for each array shifted by a constant plus symmetrically distributed error. The computed
corrections displayed are those obtained from cyclic loess,
from the pure parallel version of cyclic loess, and from a
single loess smooth on the plot containing all the points from
the four MA plots involving the array to be normalized (par-
allel variant 1). Recall that for the pure parallel version of
cyclic loess, the correction for a particular point is found by
averaging the corrections obtained from the n MA plots. As
expected, all three methods produced equivalent results.
This parallel variant 1 reduces the number of loess smooths
that are computed. One pass through the data with parallel
variant 1 requires only n loess smooths, compared with the C(n, 2) = n(n − 1)/2 smooths required by cyclic loess. However, each smooth for this parallel variant 1 is performed on np rather than p points.
Whether this version of parallel loess is faster than cyclic loess likely depends on the implementation details of each smoother.

2.4 Parallel loess variant two
The next idea is to reduce the number of points in the plot containing all the points from the MA plots of parallel loess involving array j from np to p. Since there are p spots on an array, the intent is to replace each collection of n points per spot i (one per pairwise MA plot: array j with array 1, array j with array 2, ..., array j with array n) with a single point that summarizes the collection of n points for spot i. The new plot would contain p 'summary' points, and we would perform a smooth on this plot to obtain the spot corrections for array j.
A way to produce a point for spot i that summarizes the n points for spot i that involve array 1 is to set the x-coordinate equal to the average of the n x-coordinates:

(1/n)[(yi1 + yi1)/2 + (yi1 + yi2)/2 + ··· + (yi1 + yin)/2] = (yi1 + ȳi·)/2.

Here ȳi· is the row mean for the ith row of the data, which is equal to the mean expression value of spot i across the arrays. The vertical position for spot i is the average of the n y-coordinates:

(1/n)[(yi1 − yi1) + (yi1 − yi2) + ··· + (yi1 − yin)] = yi1 − ȳi·.
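These two averaging identities can be checked numerically (random values; variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(3)
p, n = 6, 4
Y = rng.uniform(4.0, 12.0, size=(p, n))           # log2 intensities
ybar = Y.mean(axis=1)                             # row means  (ȳ_i·)

# The n MA-plot coordinates for spot i involving array 1 (j = 1..n)
x_coords = (Y[:, [0]] + Y) / 2.0                  # (y_i1 + y_ij)/2
y_coords = Y[:, [0]] - Y                          # y_i1 - y_ij

x_summary = x_coords.mean(axis=1)                 # should equal (y_i1 + ȳ_i·)/2
y_summary = y_coords.mean(axis=1)                 # should equal y_i1 - ȳ_i·
```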
The next step is to determine corrections for the spots on chip 1 by fitting the loess smooth on this set of p points. This would be repeated for each of the n − 1 other chips; a total of n smooths need to be computed for one pass through all the data, similar to parallel loess variant 1 above. However, each smooth here is performed on only p points, compared with np for variant 1.
Repeating the bias computation found in Section 2.2, the plot for array 1 versus array 2 has vertical and horizontal axes aY and bY, respectively, with a = (1 − 1/n, −1/n, ..., −1/n) and b = (1 + 1/n, 1/n, ..., 1/n)/2. This yields Σj aj = 0, Σj bj = 1 and Σj aj bj = (n − 1)/n, leading to positive correlation between aY and bY and a potentially biased correction. Figure 4 compares the computed corrections from this variant of parallel loess (parallel variant two), parallel loess, and cyclic loess for two of the four arrays in our simple data example. Clearly, the bias is severe. In the next section we use a linear models argument to motivate a similar plot, but with an x-coordinate of ȳi·, leading to b = (1/n, ..., 1/n), Σj aj bj = 0 and an unbiased estimate.
Fig. 4. The computed corrections for arrays 2 and 4 of the simulated example of 4 arrays. The corrections are from the smooth of the 5000 derived points (parallel var 2), from parallel loess (parallel), and from cyclic loess (cyclic).
FAST LINEAR LOESS
Cyclic loess can be conceptualized as a smooth (loess),
coupled with a very simple linear model. Consider the case of
two arrays, 1 and 2, with gene expression levels represented
by yi1 and yi2 for i = 1, ..., p, and the simplest possible linear model for the data, yij = αi + εij, an intercept for each spot. The solution is of course α̂i = ȳi·. In the MA plot for the two arrays, the x-coordinate is (yi1 + yi2)/2 = ȳi· = α̂i, the y-coordinate is yi1 − yi2 = 2(yi1 − α̂i), and the adjustment to spot i on the first chip is 1/2 the height of the loess smooth on this plot at the x-coordinate corresponding to spot i.
Extending this idea to the n-array case suggests creating a modified MA plot for each array j, where the plot consists of array j and a constructed 'average' array; the intensity level of spot i on the 'average' array is equal to the average intensity of spot i across all arrays, for i = 1, ..., p. The corrections for array j are then obtained from a loess smooth on the points of this plot. This would be done for each of the j = 1, ..., n arrays. This variant of cyclic loess, called fastlo, requires n loess smooths, each on p points, for one pass through the data. Obviously, this is considerably faster than cyclic loess, which requires C(n, 2) = n(n − 1)/2 smooths per pass.
The steps of fastlo, given below, are performed on the log2 of the spot intensities.
(1) Create the vector ˆ yi·= the row mean of Y. Note that
this is the same as creating an ‘average’ array.
(2) Plot ŷ versus (yj − ŷ) for array j. This plot has one point for each spot (a modified MA plot).
(3) Fit a loess curve f(x) through the data.
(4) Subtract f(x) from array j.
(5) Repeat for all remaining arrays.
(6) Repeat until the algorithm converges.
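The steps above can be sketched as follows; a running-mean smoother stands in for the loess fit, and the function names are our own, not the authors':

```python
import numpy as np

def running_mean_smooth(x, y, k=25):
    """Linear stand-in for a loess smooth: mean of the k nearest y values
    in x-order."""
    order = np.argsort(x)
    ys = y[order]
    n = len(x)
    fitted = np.empty(n)
    for rank, idx in enumerate(order):
        lo = max(0, min(rank - k // 2, n - k))
        fitted[idx] = ys[lo:lo + k].mean()
    return fitted

def fastlo_pass(Y, k=25):
    """One pass of fastlo on a matrix of log2 intensities
    (one column per array, one row per spot)."""
    Y = Y.copy()
    yhat = Y.mean(axis=1)                      # step 1: the 'average' array
    for j in range(Y.shape[1]):
        # steps 2-3: modified MA plot of array j against yhat, then smooth
        f = running_mean_smooth(yhat, Y[:, j] - yhat, k)
        Y[:, j] -= f                           # step 4: subtract f(x)
    return Y
```

Note that yhat is computed once and held fixed during the pass; with a linear smoother this preserves the row means exactly, which is the heart of the convergence argument below.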
A further interesting aspect of fastlo is that it requires only 1 or at most 2 iterations to converge. If there is no outlier rejection, loess is a linear operator, and the average of the n smooths is then the smooth of the average:

(1/n) Σj Sm(yj − ȳ) ≈ Sm((1/n) Σj (yj − ȳ)) = Sm(ȳ − ȳ) = Sm(0) = 0,

where Sm is the smoother. If this holds, then for any given row of Y, some elements increase and some decrease, but the mean stays the same. If the row means do not change, the algorithm has converged.
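Because every fastlo smooth uses the same x-coordinates (the row means), a linear smoother makes the displayed identity exact: the per-spot corrections summed over arrays vanish. A sketch, again with a running-mean stand-in for loess (names ours):

```python
import numpy as np

def running_mean_smooth(x, y, k=25):
    """Linear stand-in for loess: mean of the k nearest y values in x-order."""
    order = np.argsort(x)
    ys = y[order]
    n = len(x)
    fitted = np.empty(n)
    for rank, idx in enumerate(order):
        lo = max(0, min(rank - k // 2, n - k))
        fitted[idx] = ys[lo:lo + k].mean()
    return fitted

rng = np.random.default_rng(6)
Y = rng.uniform(0.0, 10.0, size=(400, 5))
ybar = Y.mean(axis=1)

# Smooth each array's deviations against the common x-axis ybar
corrections = np.column_stack(
    [running_mean_smooth(ybar, Y[:, j] - ybar) for j in range(Y.shape[1])]
)
total = corrections.sum(axis=1)   # = Sm(sum of deviations) = Sm(0) = 0
```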
Quantile normalization makes the overall distribution of values for each array identical, while preserving the rank ordering of the values within each array. It consists of two steps.
(1) Create a mapping between ranks and values. For rank
1 find the n values, one per array, that are the smallest
value on the array, and save their average. Similarly for
rank 2 and the second smallest values, and on up to the
n largest values, one per array.
(2) For each array, replace each actual value with the saved average corresponding to its rank.
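The two steps can be sketched directly (the function is our own illustration; ties are broken arbitrarily):

```python
import numpy as np

def quantile_normalize(Y):
    """Quantile normalization of a matrix with one column per array:
    replace each value by the mean, across arrays, of the values
    sharing its rank."""
    ranks = np.argsort(np.argsort(Y, axis=0), axis=0)   # rank of each value
    rank_means = np.sort(Y, axis=0).mean(axis=1)        # step 1: rank -> average
    return rank_means[ranks]                            # step 2: substitute

# Small illustration: after normalization the columns have identical values
Y = np.array([[5.0, 2.0],
              [1.0, 4.0],
              [3.0, 6.0]])
Q = quantile_normalize(Y)
```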
As mentioned, this produces identical distributions of val-
ues on each array; quite an aggressive normalization process.
On the other hand, quantile normalization is extremely fast—
it only requires a sort of the arrays and a computation of