BioMed Central

BMC Bioinformatics

Open Access

Methodology article

A permutation-based multiple testing method for time-course

microarray experiments

Insuk Sohn1, Kouros Owzar1,2, Stephen L George1,2, Sujong Kim3,4 and Sin-Ho Jung*1,2

Address: 1Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, North Carolina 27710, USA, 2CALGB

Statistical Center, Durham, North Carolina 27705, USA, 3Skin Research Institute, AmorePacific R&D Center, Yongin 449-729, Republic of Korea

and 4R&D Center, Komipharm International Co, LTD, Kyounggi-do 429-450, Republic of Korea

Email: Insuk Sohn - insuk.sohn@duke.edu; Kouros Owzar - kouros.owzar@duke.edu; Stephen L George - stephen.george@duke.edu;

Sujong Kim - sjkim007@hotmail.com; Sin-Ho Jung* - sinho.jung@duke.edu

* Corresponding author

Abstract

Background: Time-course microarray experiments are widely used to study the temporal

profiles of gene expression. Storey et al. (2005) developed a method for analyzing time-course

microarray studies that can be applied to discovering genes whose expression trajectories change

over time within a single biological group, or those that follow different time trajectories among

multiple groups. They estimated the expression trajectories of each gene using natural cubic splines

under the null (no time-course) and alternative (time-course) hypotheses, and used a goodness of

fit test statistic to quantify the discrepancy. The null distribution of the statistic was approximated

through a bootstrap method. Gene expression levels in microarray data often have a complicated

correlation structure. Accurate type I error control adjusting for multiple testing requires the joint null

distribution of test statistics for a large number of genes. For this purpose, permutation methods

have been widely used because of computational ease and their intuitive interpretation.

Results: In this paper, we propose a permutation-based multiple testing procedure based on the

test statistic used by Storey et al. (2005). We also propose an efficient computation algorithm.

Extensive simulations are conducted to investigate the performance of the permutation-based

multiple testing procedure. The application of the proposed method is illustrated using the

Caenorhabditis elegans dauer developmental data.

Conclusion: Our method is computationally efficient and applicable for identifying genes whose

expression levels are time-dependent in a single biological group and for identifying the genes for

which the time-profile depends on the group in a multi-group setting.

Background

Time-course microarray experiments are widely used to

study the temporal profiles of gene expression. In these

experiments, the gene expressions are measured across

several time-points, enabling the investigator to study the

dynamic behavior of gene expressions over time.

A number of statistical methods have been developed in

recent years for identifying differentially expressed genes

Published: 15 October 2009

BMC Bioinformatics 2009, 10:336doi:10.1186/1471-2105-10-336

Received: 18 March 2009

Accepted: 15 October 2009

This article is available from: http://www.biomedcentral.com/1471-2105/10/336

© 2009 Sohn et al; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0),

which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

from time-course microarray experiments. Park et al. [1]

proposed a permutation-based two-way ANOVA to com-

pare temporal profiles from different experimental

groups. Luna and Li [2] proposed a statistical framework

based on a shape-invariant model together with a false

discovery rate (FDR) procedure for identifying periodi-

cally expressed genes based on microarray time course

gene expression data and a set of known periodically

expressed guide genes. Storey et al. [3] represented gene

expression trajectories using natural cubic splines and

then compared the goodness of fit of the model under the

null hypothesis to that under alternative hypothesis. The

null distribution of these statistics was approximated

through a bootstrap method. Di Camillo et al. [4] pro-

posed test statistics using the maximum distance between

two time trajectories or comparing the areas under two

time course curves. Approximating the null distribution of

the test statistics using a bootstrap method, they showed that

their test statistics are more powerful than that of Storey et al. [3]

when the number of measurement time points is small. Hong

and Li [5] introduced a functional hierarchical model for

detecting temporally differentially expressed genes

between two experimental conditions for cross sectional

designs, where the gene expression profiles are treated as

functional data and modelled by basis function expansions.

Angelini et al. [6] modelled time-course data within

the framework of a Bayesian hierarchical model and used

Bayes factors for testing purposes.

Permutation resampling methods have been widely

used to derive the null distribution of high-dimensional

test statistics while preserving the complicated dependence

structure among genes in microarray data analysis. In

this paper, we present a permutation-based multiple-test-

ing method for time-course microarray experiments when

independent subjects contribute gene expression data at

different time points. While the method can be generalized

to a broad class of goodness-of-fit test statistics for

regression curves, for illustration we use the F-test type statistic

based on natural splines used by Storey et al. [3]. We

propose computationally efficient algorithms for identify-

ing the genes whose expression levels are time-dependent

in a single biological group and for identifying the genes

whose time-profile differs among different groups. For the

multiple group setting, we will consider two sets of

hypotheses. In the first set, any difference among the

curves, including vertically shifted parallel curves, is con-

sidered to constitute a discrepancy among the groups. For

the second set, only differences in the actual time-trends

are considered to be of interest after removing the vertical

shift. We shall refer to these as "time-course" and "time-

trend" hypotheses, respectively. Note that if two separated

curves can be overlapped by a vertical shift, then they have

different time-courses, but the same time-trend. The test

on a time-trend hypothesis will remove potential batch

effects in microarray experiments.

The rest of the article is organized as follows. We first

present a non-parametric test method to identify differen-

tial gene expression in a time-course microarray. We then

present simulation results to evaluate the statistical prop-

erties of the proposed method. Next, we apply the pro-

posed method to the Caenorhabditis elegans dauer

developmental data [7]. Lastly, we give a brief discussion

of the methods.

Methods

First, we briefly review a smoothing method to estimate

a gene expression profile over time. Using the smoothing

method, we discuss a non-parametric test method for

identifying genes whose expression levels are time-

dependent in a single biological group and for identifying

the genes for which the time-profile depends on the group

among multiple groups. We approximate the null joint

distribution of the test statistics using a permutation

method.

Estimation of the Time-Course Profile

Suppose that subject i (= 1,..., n) contributes gene expression levels on m genes $(y_{i1}, \ldots, y_{im})$ at time $t_i$. For gene j (= 1,..., m), we consider a time trajectory model $E(y_{ij} \mid t) = \mu_j(t)$, where $\mu_j(\cdot)$ is the unknown function that is parameterized by an intercept plus a p-dimensional linear basis:

$$\mu_j(t) = \beta_{0,j} + \sum_{s=1}^{p} \beta_{s,j} W_s(t).$$

Here $[W_1(t), \ldots, W_p(t)]$ is a pre-specified p-dimensional basis that is common to all m genes, and $\beta_j = [\beta_{0,j}, \beta_{1,j}, \ldots, \beta_{p,j}]^T$ is a (p + 1)-dimensional vector of unknown parameters for gene j. Similar to Storey et al. [3], we employ a B-spline basis (see chapter IX in de Boor [8]) and place the knots at the 0, 1/(p - 1), 2/(p - 1),..., (p - 2)/(p - 1), 1 quantiles of the observed time points.

Let

$$W = \begin{pmatrix} 1 & W_1(t_1) & \cdots & W_p(t_1) \\ \vdots & \vdots & & \vdots \\ 1 & W_1(t_n) & \cdots & W_p(t_n) \end{pmatrix}$$

denote the design matrix based on the spline model. Then, the least squares estimator of $\beta_j$ is obtained by $\hat{\beta}_j = (W^T W)^{-1} W^T y_j$, where $y_j = (y_{1j}, \ldots, y_{nj})^T$.
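The estimation step can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: a truncated-power cubic spline basis with two interior knots stands in for the B-spline basis of the text, the knot placement is simplified, and the function names `spline_design_matrix` and `fit_all_genes` as well as the toy data are hypothetical.

```python
import numpy as np

def spline_design_matrix(t, knots):
    """Intercept plus a truncated-power cubic spline basis evaluated at t.

    Assumption: this basis is a stand-in for the B-spline basis of the text.
    """
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t), t, t**2, t**3]
    cols += [np.clip(t - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)            # shape (n, p + 1)

def fit_all_genes(W, Y):
    """Least-squares coefficients for all genes at once.

    W: (n, p + 1) design matrix, Y: (n, m) expression matrix.
    Returns an array of shape (p + 1, m); column j estimates beta_j.
    """
    coef, *_ = np.linalg.lstsq(W, Y, rcond=None)
    return coef

rng = np.random.default_rng(0)
t = np.repeat(np.arange(11.0), 4)           # 11 time points, 4 subjects each
knots = np.quantile(t, [1 / 3, 2 / 3])      # interior knots at time quantiles
W = spline_design_matrix(t, knots)
Y = rng.normal(size=(t.size, 5))            # 5 hypothetical genes
Y[:, 0] = 1.0 + 2.0 * t - 0.5 * t**2 + 0.1 * t**3   # noiseless cubic trend
coef = fit_all_genes(W, Y)
fitted = W @ coef                           # estimated trajectories at t_1..t_n
```

Because the same design matrix W is shared by all genes, a single least-squares call fits every gene; the noiseless cubic in the first column lies in the span of the basis and is therefore reproduced by the fit.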

One Group Case

In the case of a single biological group (K = 1), we often want to discover genes whose expression levels are time-dependent. For gene j (= 1,..., m), we want to test the hypothesis

$$H_j : \mu_j(t) = \mu_j, \text{ a constant}$$

against

$$\bar{H}_j : \mu_j(t) \neq \mu_j(t') \text{ for some } t \neq t'.$$

Under $H_j$, the constant is estimated as $\hat{\mu}_j(t) = \bar{y}_j = n^{-1} \sum_{i=1}^{n} y_{ij}$. Under $\bar{H}_j$, we obtain the estimate $\hat{\mu}_j(t) = \hat{\beta}_{0,j} + \sum_{s=1}^{p} \hat{\beta}_{s,j} W_s(t)$, where $(\hat{\beta}_{0,j}, \hat{\beta}_{1,j}, \ldots, \hat{\beta}_{p,j})$ is estimated as described in the previous section.

For gene j, the sum of squares of errors (SSE) is expressed as $SSE_j = \sum_{i=1}^{n} \{y_{ij} - \hat{\mu}_j(t_i)\}^2$. Let $SSE_j^0$ and $SSE_j^1$ denote the SSE under $H_j$ and $\bar{H}_j$, respectively. Storey et al. [3] employ the F-statistic

$$F_j = \frac{(SSE_j^0 - SSE_j^1)/p}{SSE_j^1/(n - p - 1)}$$

for testing $H_j$ against $\bar{H}_j$. Note that for the permutation-based multiple testing described below, the (n - p - 1)/p factor in the $F_j$ test statistic has no impact on the results and as such can be omitted from the computations.

In order to generate the null distribution of the vector of test statistics $(F_1, \ldots, F_m)$ for the m genes, we randomly match the microarrays of the n subjects $\{(y_{i1}, \ldots, y_{im}), i = 1, \ldots, n\}$ with their measurement times $\{t_1, \ldots, t_n\}$ at each permutation. Let $(\tilde{1}, \ldots, \tilde{n})$ be a permutation of (1,..., n). Then $\{(t_{\tilde{i}}, y_{i1}, \ldots, y_{im}), i = 1, \ldots, n\}$ is a permutation sample of the original data $\{(t_i, y_{i1}, \ldots, y_{im}), i = 1, \ldots, n\}$.

The family-wise error rate (FWER) is defined as the probability of rejecting any null hypothesis $H_j$ when all m null hypotheses are true. A single-step multiple testing procedure controlling the FWER at level $\alpha$ can be described as follows; refer to, e.g., Westfall and Young [9] and Jung et al. [10].

Multiple Testing for Time Trend of One Group

1. Compute the F-test statistics $(f_1, \ldots, f_m)$ from the original data.

2. From the b-th permutation data (b = 1,..., B), compute the F-test statistics $(F_1^{(b)}, \ldots, F_m^{(b)})$.

3. Single-step procedure to control the FWER

(a) From the b-th permutation data, calculate $u_b = \max_{1 \le j \le m} F_j^{(b)}$.

(b) For gene j, calculate the adjusted p-value by $\tilde{p}_j = B^{-1} \sum_{b=1}^{B} I(u_b \ge f_j)$, where I(·) is an indicator function.

(c) For a specified FWER level $\alpha$, discover gene j if $\tilde{p}_j < \alpha$.

The false discovery rate (FDR) is another popular type I error measure for multiple testing adjustment. It is defined as the expected proportion of erroneously rejected null hypotheses among all rejected null hypotheses; refer to Benjamini and Hochberg [11]. A multiple testing procedure to control the FDR at a specified level can be obtained by replacing Step 3 in the above algorithm with Step 3' as described below; refer to Tusher et al. [12] and Storey [13].

3'. Multiple testing controlling the FDR:

(a) For gene j, estimate the marginal p-value by $p_j = B^{-1} \sum_{b=1}^{B} I(F_j^{(b)} \ge f_j)$.

(b) For a chosen constant $\lambda \in (0, 1)$, such as 0.95 [13], estimate the q-value of gene j by

$$q_j = \frac{p_j \sum_{l=1}^{m} I(p_l > \lambda)}{(1 - \lambda) \sum_{l=1}^{m} I(p_l \le p_j)}.$$

(c) For a specified FDR level, discover gene j (or reject $H_j$) if $q_j$ is smaller than that level.

The testing algorithm can be considerably simplified during permutations. First, $SSE_j^0 = \sum_{i=1}^{n} (y_{ij} - \bar{y}_j)^2$ is invariant under permutations, and as such one does not have to re-calculate $SSE_j^0$ for the permutation samples. Second, suppose that we fix the gene expression data $\{(y_{i1}, \ldots, y_{im}), i = 1, \ldots, n\}$ and shuffle the measurement times $t_1, \ldots, t_n$ in each permutation. Let I denote the n × n identity matrix. Then, noting that $SSE_j^1$ is the squared norm of the residual vector $\{I - W(W^T W)^{-1} W^T\} y_j$, permutation replicates of $SSE_j^1$ can be obtained by simply permuting the columns of $I - W(W^T W)^{-1} W^T$. Thus, $I - W(W^T W)^{-1} W^T$ does not have to be re-computed for the permutation samples. Furthermore, given that m is considerably larger than n, permuting the columns of $I - W(W^T W)^{-1} W^T$, a matrix of dimension n × n, is more efficient than permuting the rows of $[y_1, \ldots, y_m]$, a matrix of dimension n × m.
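The one-group procedure, including the column-permutation shortcut, can be sketched as follows. This is a hedged illustration rather than the authors' code: the function name `one_group_permutation_test`, the cubic polynomial basis used in place of the spline basis, and the simulated data are all assumptions made for the example.

```python
import numpy as np

def one_group_permutation_test(W, Y, B=200, lam=0.95, seed=0):
    """Permutation F-type test for a time effect in a single group.

    W: (n, p + 1) design matrix whose first column is the intercept.
    Y: (n, m) expression matrix (rows = subjects, columns = genes).
    Returns observed statistics, FWER-adjusted p-values, marginal
    p-values and q-value estimates.
    """
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    # Residual-maker matrix I - W (W^T W)^{-1} W^T, computed only once.
    M = np.eye(n) - W @ np.linalg.pinv(W)
    # SSE under the null is invariant to permutations, so compute it once.
    sse0 = ((Y - Y.mean(axis=0)) ** 2).sum(axis=0)

    def stat(Mb):
        sse1 = ((Mb @ Y) ** 2).sum(axis=0)
        return (sse0 - sse1) / sse1          # factor (n - p - 1)/p omitted

    f_obs = stat(M)
    u = np.empty(B)                          # u_b: max statistic per permutation
    exceed = np.zeros(Y.shape[1])            # counts of F_j^(b) >= f_j
    for b in range(B):
        Fb = stat(M[:, rng.permutation(n)])  # permute the columns of M only
        u[b] = Fb.max()
        exceed += Fb >= f_obs
    p_adj = (u[:, None] >= f_obs[None, :]).mean(axis=0)   # step 3(b)
    p = exceed / B                                        # step 3'(a)
    num = p * (p > lam).sum()                             # step 3'(b)
    den = (1.0 - lam) * (p[None, :] <= p[:, None]).sum(axis=1)
    return f_obs, p_adj, p, num / den

# Hypothetical data: 11 time points, 4 replicates, 50 genes, 5 with a trend.
rng = np.random.default_rng(1)
t = np.repeat(np.arange(11.0), 4)
s = t / 10.0
W = np.column_stack([np.ones_like(s), s, s**2, s**3])     # stand-in basis
Y = rng.standard_normal((t.size, 50))
Y[:, :5] += 4.0 * np.exp(-t)[:, None]
f_obs, p_adj, p, q = one_group_permutation_test(W, Y)
```

Each permutation costs one n × n column shuffle and one matrix product, and the adjusted p-values can never fall below the marginal ones because the maximum statistic dominates every gene-wise statistic.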

K Group Case

In order to compare the time-course profiles of gene expression measurements among different experimental groups, we assume that a fixed number of measurement times are pre-specified commonly among the K groups and at least one subject is assigned to each time point from each group. Let $t_1 < \cdots < t_L$ denote the L time points chosen, and $n_{kl}$ denote the number of patients from group k (= 1,..., K) observed at time $t_l$ (l = 1,..., L). We use the notations $n_{k\cdot} = \sum_{l=1}^{L} n_{kl}$ to denote the number of patients from group k and $n_{\cdot l} = \sum_{k=1}^{K} n_{kl}$ to denote the number of patients at time point l. So, $n = \sum_{k=1}^{K} n_{k\cdot} = \sum_{l=1}^{L} n_{\cdot l} = \sum_{k=1}^{K} \sum_{l=1}^{L} n_{kl}$ denotes the total number of subjects in the study. The design and sample size under each condition is summarized in Table 1.

Table 1: Design and sample sizes for a K group case.

                 Time
Group    t1     ⋯     tL     Total
1        n11    ⋯     n1L    n1·
⋮        ⋮            ⋮      ⋮
K        nK1    ⋯     nKL    nK·
Total    n·1    ⋯     n·L    n

Let $(y_{kli1}, \ldots, y_{klim})$ denote the expression measurements for m genes at time $t_l$ (l = 1,..., L) from subject i (= 1,..., $n_{kl}$) belonging to group k (= 1,..., K). The expression values are modelled as

$$E(y_{klij}) = \mu_{kj}(t_l),$$

where $\mu_{kj}(t) = \beta_{0,kj} + \sum_{s=1}^{p} \beta_{s,kj} W_s(t)$.

In the K-group setting, we want to identify the genes with different time profiles in different groups. The hypotheses for gene j are specified as

$$H_j : \mu_{1j}(t) = \cdots = \mu_{Kj}(t)$$

against

$$\bar{H}_j : \mu_{kj}(t) \neq \mu_{k'j}(t) \text{ for some } t \ge 0 \text{ and } k \neq k'.$$

Under $\bar{H}_j$, the estimator $\hat{\beta}_{kj} = (\hat{\beta}_{0,kj}, \hat{\beta}_{1,kj}, \ldots, \hat{\beta}_{p,kj})^T$ is estimated from the group k data, $\{(t_l, y_{klij}), 1 \le i \le n_{kl}, 1 \le l \le L\}$. Let $\hat{\mu}_{kj}(t) = \hat{\beta}_{0,kj} + \sum_{s=1}^{p} \hat{\beta}_{s,kj} W_s(t)$. Under $H_j : \mu_{1j}(t) = \cdots = \mu_{Kj}(t) (= \mu_j(t))$, the group-free estimator $\hat{\beta}_j = (\hat{\beta}_{0,j}, \hat{\beta}_{1,j}, \ldots, \hat{\beta}_{p,j})^T$ is estimated using the pooled data, $\{(t_l, y_{klij}), 1 \le i \le n_{kl}, 1 \le k \le K, 1 \le l \le L\}$. Let $\hat{\mu}_j(t) = \hat{\beta}_{0,j} + \sum_{s=1}^{p} \hat{\beta}_{s,j} W_s(t)$ denote the estimator of the common time trajectory under $H_j$.

For gene j, the SSE under $H_j$ is calculated as

$$SSE_j^0 = \sum_{k=1}^{K} \sum_{l=1}^{L} \sum_{i=1}^{n_{kl}} \{y_{klij} - \hat{\mu}_j(t_l)\}^2,$$

where $\hat{\mu}_j(t)$ is the estimate of $\mu_{1j}(t) = \cdots = \mu_{Kj}(t)$ from the pooled data. The SSE under $\bar{H}_j$ is calculated as

$$SSE_j^1 = \sum_{k=1}^{K} \sum_{l=1}^{L} \sum_{i=1}^{n_{kl}} \{y_{klij} - \hat{\mu}_{kj}(t_l)\}^2,$$

where $\hat{\mu}_{kj}(t)$ is the estimate of $\mu_{kj}(t)$ from the group k data.

We reject $H_j$ in favor of $\bar{H}_j$ for a large value of the F-statistic

$$F_j = \frac{(SSE_j^0 - SSE_j^1)/\{(K - 1)(p + 1)\}}{SSE_j^1/\{n - K(p + 1)\}}.$$

The null distribution of the test statistics $(F_1, \ldots, F_m)$ is approximated using a permutation method. A permutation sample is generated by permuting the gene expression data within each time point: the gene expression data of the $n_{\cdot l}$ subjects at time $t_l$, $\{(y_{kli1}, \ldots, y_{klim}), 1 \le i \le n_{kl}, 1 \le k \le K\}$, are randomly partitioned into K groups of size $n_{1l}, \ldots, n_{Kl}$. The subjects at different time points are not permuted. For each subject, the random vector $(y_{kli1}, \ldots, y_{klim})$ is counted as a data point, so that the m genes are not permuted. One permutation sample is obtained by conducting this permutation process for all L time points. Note that there are

$$\frac{n_{\cdot 1}!}{n_{11}! \cdots n_{K1}!} \times \cdots \times \frac{n_{\cdot L}!}{n_{1L}! \cdots n_{KL}!}$$

different permutations. Table 2 demonstrates a permutation when K = 2. The proposed restricted permutation maintains the time trend in the whole population and allows heteroscedastic error models. Multiple testing to control FDR or FWER is conducted as in one group case, but by using the K group F statistics and permuting the observed expressions within each time-point. We can save computing time by utilizing the fact that the design matrices of the regression models are invariant to permutations.
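A minimal sketch of the restricted permutation and the K-group statistic is given below. The helper names (`sse`, `k_group_stat`, `permute_within_time`, `k_group_maxT`), the cubic polynomial basis standing in for the spline basis, and the simulated two-group data are assumptions for illustration only. Shuffling group labels among the subjects observed at the same time point is used here as an equivalent way of randomly partitioning those subjects into groups of the original sizes.

```python
import numpy as np

def sse(W, Y):
    """Residual sum of squares per gene from a least-squares fit of Y on W."""
    resid = Y - W @ (np.linalg.pinv(W) @ Y)
    return (resid ** 2).sum(axis=0)

def k_group_stat(W, Y, group, sse0):
    """(SSE0 - SSE1) / SSE1 per gene; the degrees-of-freedom factor is omitted."""
    sse1 = sum(sse(W[group == k], Y[group == k]) for k in np.unique(group))
    return (sse0 - sse1) / sse1

def permute_within_time(time, group, rng):
    """Shuffle group labels only among subjects sharing a time point."""
    new_group = group.copy()
    for tl in np.unique(time):
        idx = np.flatnonzero(time == tl)
        new_group[idx] = rng.permutation(group[idx])
    return new_group

def k_group_maxT(W, Y, time, group, B=200, seed=0):
    """Single-step FWER-adjusted p-values under the restricted permutation."""
    rng = np.random.default_rng(seed)
    sse0 = sse(W, Y)      # pooled fit: unchanged by relabelling, computed once
    f_obs = k_group_stat(W, Y, group, sse0)
    u = np.array([
        k_group_stat(W, Y, permute_within_time(time, group, rng), sse0).max()
        for _ in range(B)
    ])
    return f_obs, (u[:, None] >= f_obs[None, :]).mean(axis=0)

# Hypothetical two-group data: 11 time points, 4 subjects per group and time.
rng = np.random.default_rng(2)
time = np.repeat(np.arange(11.0), 8)
group = np.tile(np.repeat([0, 1], 4), 11)
s = time / 10.0
W = np.column_stack([np.ones_like(s), s, s**2, s**3])     # stand-in basis
Y = rng.standard_normal((time.size, 30)) + np.exp(-time)[:, None]
# First 3 genes: amplitude 2.5 * exp(-t) in the second group.
Y[group == 1, :3] += 1.5 * np.exp(-time[group == 1])[:, None]
f_obs, p_adj = k_group_maxT(W, Y, time, group)
```

The sketch refits each group in every permutation for clarity; since the per-group design matrices do not change under this relabelling, their projection matrices could be cached in the same spirit as in the one-group case.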

Results

Simulations

In this section, we investigate the performance of our

method for control of the FWER and power using exten-

sive simulations. We also apply the proposed methods to

a real data set.

Simulation Study

The three scenarios considered are based on amplitude

variation, phase variation and a homoscedastic versus a

heteroscedastic error model. We restrict ourselves to the

single- and two-group (i.e., K = 2) cases.

Simulation Settings

We set m = 1,000. Given a trend $\mu_j(t)$ for gene j (= 1,..., m), expression data $(y_1, \ldots, y_m)$ measured at time t are generated by

$$y_j(t) = \mu_j(t) + \epsilon_j.$$

Let $a_1, \ldots, a_{1000}$ and $b_1, \ldots, b_{100}$ be IID N(0, 1) random variables. Then, heteroscedastic error terms are generated as follows. For l = 1,..., 100 and j = 1,..., 10, we generate

$$\epsilon_{10(l-1)+j} = a_{10(l-1)+j} \sqrt{1 - \rho} + b_l \sqrt{\rho}.$$

Note that the error terms $(\epsilon_1, \ldots, \epsilon_m)$ consist of 100 independent blocks of size 10, and the error terms in block l (= 1,..., 100), $(\epsilon_{10(l-1)+1}, \ldots, \epsilon_{10l})$, have a compound symmetry correlation structure with correlation coefficient $\rho$, which is set at 0, 0.3 or 0.6. We choose L = 11 measurement times $t_l$ = 0, 1,..., 10, and simulate 4 replications at each time point for each group.
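The error-generating mechanism can be sketched as follows, assuming NumPy; the function name `block_cs_errors` and its arguments are hypothetical. Each array (subject) receives an independent error vector built from one shared term per block and one gene-specific term, which yields unit variances and within-block correlation equal to rho.

```python
import numpy as np

def block_cs_errors(n_arrays, rho, n_blocks=100, block_size=10, seed=0):
    """Errors with independent blocks and compound-symmetry correlation rho.

    Returns an array of shape (n_arrays, n_blocks * block_size); column
    block_size * l + j (0-based) holds the error of gene j in block l.
    """
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n_arrays, n_blocks, block_size))  # gene-specific
    b = rng.standard_normal((n_arrays, n_blocks, 1))           # shared in block
    eps = a * np.sqrt(1.0 - rho) + b * np.sqrt(rho)
    return eps.reshape(n_arrays, n_blocks * block_size)

# One-group data as in Simulation 1: 11 time points, 4 replicates,
# the first 10 of 1,000 genes follow the trend 4 * exp(-t).
t = np.repeat(np.arange(11.0), 4)
mu = np.zeros((t.size, 1000))
mu[:, :10] = 4.0 * np.exp(-t)[:, None]
Y = mu + block_cs_errors(n_arrays=t.size, rho=0.3)
```

The resulting matrix Y has one row per array and one column per gene, so it can be passed directly to a gene-wise fitting or permutation routine.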

In a single-group case, non-prognostic genes (genes under $H_j$) have model $\mu_j(t) = 0$, and prognostic genes (genes under $\bar{H}_j$) have $\mu_j(t) = 4 \exp(-t)$ in Simulation 1 and $\mu_j(t) = \sin(2\pi t)$ in Simulation 2.

In a two-group case (K = 2), we consider three different simulation models. In Simulation 1 (amplitude variation model), non-prognostic genes have equal time trends for both groups, $\mu_{1j}(t) = \mu_{2j}(t) = \exp(-t)$, and prognostic genes have $\mu_{1j}(t) = \exp(-t)$ for group 1 and $\mu_{2j}(t) = 2.5 \exp(-t)$ for group 2; see the left panel of Figure 1. In Simulation 2 (phase variation model), non-prognostic genes have equal time trends for both groups, $\mu_{1j}(t) = \mu_{2j}(t) = \sin(2\pi t)$, and prognostic genes have $\mu_{1j}(t) = \sin(2\pi t)$ for group 1 and $\mu_{2j}(t) = \sin(2\pi(t - 1/4))$ for group 2; see the right panel of Figure 1.

In Simulations 1 and 2, all m = 1,000 genes are non-prognostic under the global null hypothesis $H_0 = \bigcap_{j=1}^{m} H_j$. Under a specific alternative hypothesis $H_a$, the first $m_1$ = 10 genes are prognostic, and the remaining $m_0$ = 990 genes are non-prognostic.

In Simulation 3 of a two-sample case, we consider heteroscedastic error models. Non-prognostic genes have $\mu_1(t) = \mu_2(t) = t$, and prognostic genes have $\mu_1(t) = t$ and $\mu_2(t) = 2.5 + t$. For both groups (k = 1, 2), the first 100 genes (1

Table 2: Illustration of a permutation for a K = 2 group case

(a) Original data

Time       t1               ⋯    tl               ⋯    tL
Group 1    y11, y12, y13    ⋯    yl1, yl2         ⋯    yL1, yL2
Group 2    y14, y15         ⋯    yl3, yl4, yl5    ⋯    yL3, yL4

(b) A permuted sample

Time       t1               ⋯    tl               ⋯    tL
Group 1    y14, y12, y15    ⋯    yl2, yl3         ⋯    yL3, yL1
Group 2    y13, y11         ⋯    yl1, yl4, yl5    ⋯    yL4, yL2