
A Spectral Graphical Model Approach

for Learning Brain Connectivity Network

of Children’s Narrative Comprehension

Xiaodong Lin,1,* Xiangxiang Meng,2,3,* Prasanna Karunanayaka,4 and Scott K. Holland3

Abstract

Narrative comprehension is a fundamental cognitive skill that involves the coordination of different functional

brain regions. We develop a spectral graphical model with model averaging to study the connectivity networks

underlying these brain regions using fMRI data collected from a story comprehension task. Based on the spectral

density matrices in the frequency domain, this model captures the temporal dependency of the entire fMRI time

series between brain regions. A Bayesian model averaging procedure is then applied to select the best directional

links that constitute the brain network. Using this model, brain networks of three distinct age groups are con-

structed to assess the dynamic change of network connectivity with respect to age.

Key words: Bayesian model averaging; brain development; functional magnetic resonance imaging; graphical

models; narrative comprehension; spectral density matrix

Introduction

Narrative comprehension, a skill that develops in the early school-age years (Lorch et al., 1998), consists of a variety of skills and strategies that decode and interact with the text in a story. The logical coherence (cause and effect), the goals and internal states of the characters, and the integration of different parts of the story are the three most important aspects related to the development of children's narrative comprehension ability. Moreover, to connect different components and summarize the story, it is necessary for a reader (or listener) to build the causal relationships throughout the whole narrative (Barthes and Duisit, 1975). Therefore, the ability to comprehend a story is much more complicated than understanding the text sentence by sentence and, thus, involves sophisticated cognitive processes and interactions among different functional regions of the brain.

Various network models have emerged recently to study the neural interactions among different brain regions during a cognitive or sensory task (Friston, 2009; Friston et al., 2003, 2011; McIntosh and Gonzalez-Lima, 1994; Roebroeck et al., 2005; Zheng and Rajapakse, 2006). Structural equation modeling (SEM) was first introduced by McIntosh et al. (1994) in the network analysis of vision tasks using PET data, and has since been widely applied for modeling neural connectivity

based on different brain imaging techniques such as

functional MRI and electro-encephalography/magneto-

encephalography (EEG/MEG) (Buchel and Friston, 1997;

Bullmore et al., 2000; Karunanayaka et al., 2007; McIntosh

et al., 1994; McIntosh and Gonzalez-Lima, 1994). In an SEM

model of fMRI data obtained during a story-listening experi-

ment, Karunanayaka et al. (2007) describe how the age-

related connectivity of the neural network changes through

children’s development of narrative comprehension. How-

ever, the network analysis in SEM is confirmatory as it de-

pends on a presumed neural network structure that is often

obtained from existing neuro-anatomical results. The choice

of such a prior structure is not straightforward for a compli-

cated cognitive process, such as narrative comprehension.

Graphical models are a class of statistical models that encode the causal relationships between random variables using conditional probability. In the literature, directed graphical models are also known as Bayesian Networks (BN). Unlike

SEM, BN can not only estimate the path strength in the net-

work, but also identify the network structure based on the

functional imaging data. In a BN, the nodes in the graph rep-

resent random variables of interest, and the edges denote the

conditional dependency structure among the variables. In

1Department of Management Science and Information Systems, Rutgers University, Piscataway, New Jersey.

2Department of Mathematical Sciences, University of Cincinnati, Cincinnati, Ohio.

3Pediatric NeuroImaging Research Consortium, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio.

4Center for NMR Research, The Pennsylvania State University, Hershey, Pennsylvania.

*Joint first authors.

BRAIN CONNECTIVITY

Volume 1, Number 5, 2011

© Mary Ann Liebert, Inc.

DOI: 10.1089/brain.2011.0045


Zheng and Rajapakse (2006), BN were first applied to build

brain connectivity networks in a silent reading task and an in-

terference counting task. However, these models assume that

fMRI signals are independent and identically distributed,

and do not take into account any temporal correlation within

the fMRI signals. The Dynamic Bayesian Network (DBN) is

an extension of BN and provides a framework for building networks on time-series data using fixed-length time-delayed

edges in the graph. In Burge et al. (2009), Li et al. (2008), and

Rajapakse and Zhou (2007), DBN was applied to analyzing

fMRI data for various cognitive tasks.

To investigate the neural network structure for narrative

comprehension, we propose a spectral network model with

model averaging based on the graphical model framework

for multivariate time series (Bach and Jordan, 2004). In this

approach, the neural interactions and temporal dependence

among different brain regions are measured by spectral den-

sity matrices after a Fourier transform of the fMRI signals into

the frequency domain. Unlike DBN, this approach measures

the temporal characteristics of the time series in the frequency

domain instead of measuring connections between brain re-

gions with a prespecified lag in the time domain.

As a starting point to build the connectivity network of the

narrative comprehension task, the activation regions are

detected by a group spatial Independent Component Analy-

sis (ICA) and confirmed by a random effect General Linear

Model (GLM) following the procedure in Karunanayaka

et al. (2007). For each region a representative time series is

extracted by averaging the voxel of peak activation and its

neighboring voxels. The spectral network model is then ap-

plied on the set of representative time series to learn the net-

work structure among these active regions. We compare our

results in three different age groups with the a priori model

(Karunanayaka et al., 2007) based on known neuro-anatomi-

cal results for story comprehension and language processing.

Synthetic multivariate time series are simulated from vector

autoregressive (VAR) models to show the advantage of our

approach over static BN and DBN models for constructing

connectivity networks using time-series data.

Materials and Methods

Experiment and image preprocessing

The fMRI data were collected from 313 children (279 Cau-

casian, 22 African-American, 2 Asian, 2 Hispanic, 1 Native

American, and 17 Multi-ethnic), including 152 boys and 161

girls (Schmithorst et al., 2006). The study was approved by

the Institutional Review Board of Cincinnati Children’s Hos-

pital Medical Center. Informed consent was obtained from

the child’s parents or guardian before participation. Assent

was also obtained from subjects 8 years and older.

The fMRI paradigm consisted of a 30-sec on-off block de-

sign (Fig. 1) (Holland et al., 2007). Children listened to differ-

ent stories read by an adult female speaker during active periods. Each story was followed by a 30-sec control period of pure tones of 1-sec duration presented at intervals of 1–3 sec. Each

story contains 9, 10, or 11 sentences of contrasting syntactic

constructions in order to increase the relative processing

load for this aspect of language comprehension. The pure

tones were designed to control for sublexical auditory pro-

cessing. Moreover, children were instructed to answer 10

multiple-choice questions at the end of the scanning session

to assess their performance during the task.

One hundred ten fMRI scans were obtained per subject dur-

ing the narrative comprehension paradigm using a Bruker 3T

Medspec (Bruker Medizintechnik, Karlsruhe, Germany) imag-

ing system. The total scan time was 5 min and 30 sec, and the

first 10 scans were discarded in order to allow the spins to

reach relaxation equilibrium. Details of the EPI-fMRI parame-

ters were TR/TE = 3000/38 ms, BW = 125 kHz, FOV = 25.6 cm × 25.6 cm, matrix = 64 × 64, and slice thickness = 5 mm. T1-

weighted inversion recovery MDEFT scans were obtained

from each subject for anatomical co-registration.

The fMRI data were preprocessed using in-house software

written in Interactive Data Language (IDL; ITT Visual Information Solutions, Boulder, CO). A multi-echo reference scan (Schmithorst et al., 2006) was used for the correction of Nyquist

ghosts and geometric distortion from B0 field inhomogeneity

in image reconstruction (Schmithorst et al., 2001), and a pyra-

mid iterative co-registration algorithm was used for motion

correction (Thevenaz et al., 1998). The data were subsequently

transformed into the stereotaxic space using a linear affine transformation (Talairach and Tournoux, 1988).

The spectral graphical model with model averaging

The network construction consists of two main steps. In the

first step, the group ICA is conducted to identify spatial-inde-

pendent components of active brain regions. Once we iden-

tify these regions, each of them is regarded as a node in the

network and then the spectral graphical model is applied to

construct a connectivity network among these regions.

Group ICA. The preprocessed fMRI data were concatenated subject-wise and then a group spatial ICA was applied

to identify activated brain regions involved in story compre-

hension (Calhoun et al., 2001; McKeown et al., 1998; Schmi-

thorst and Holland, 2004). First, Principal Component

Analysis (PCA) was applied to reduce the data dimension

in the time domain for each child. Then, the data were concat-

enated across subjects and the PCA was further applied on

this grouped data set to reduce the temporal data dimension

to 40. The FastICA algorithm (Hyvärinen, 1999) was repeated 25 times, and a hierarchical agglomerative clustering algorithm (Himberg et al., 2004) was used to group the IC compo-

nents. Details of the group ICA method can be found in

Schmithorst et al. (2006). In those IC clusters identified to be

task related, active cortical regions are determined by a

voxelwise one-sample t-test performed on the individual IC

maps, with Bonferroni correction for multiple voxel compar-

isons. The task-related regions are also identified by a stan-

dard random effect GLM analysis (Karunanayaka et al.,

2007). For each active brain region, the average of the fMRI

signals from the maximum activation voxel and its six neigh-

bors is chosen as the representative time series.
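The representative-signal extraction described above can be sketched as follows. This is a minimal illustration; the array layout and function name are our own, not the paper's IDL pipeline, and we take the six neighbors to be the face-adjacent voxels:

```python
import numpy as np

def representative_time_series(bold, peak):
    """Average the BOLD time course of the peak-activation voxel and
    its six face-adjacent neighbors to form the region's representative
    signal. `bold` is a 4-D array (X, Y, Z, T); `peak` is an (x, y, z)
    voxel index. Neighbors falling outside the volume are skipped."""
    x, y, z = peak
    offsets = [(0, 0, 0), (1, 0, 0), (-1, 0, 0),
               (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    series = []
    for dx, dy, dz in offsets:
        i, j, k = x + dx, y + dy, z + dz
        if (0 <= i < bold.shape[0] and 0 <= j < bold.shape[1]
                and 0 <= k < bold.shape[2]):
            series.append(bold[i, j, k, :])
    return np.mean(series, axis=0)
```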

FIG. 1. The block design of the fMRI experiment for story comprehension.


Neural network modeling with BN. BN have been extensively studied in the Machine Learning and Statistics literature

(Friedman and Goldszmidt, 1998; Geiger and Heckerman,

1994; Heckerman et al., 1995; Jordan, 1998) as a flexible struc-

ture learning technique. A BN is a directed acyclic graph

(DAG) that encodes the causal relationships among a set of

random variables. When applied to brain connectivity model-

ing, each node in the graph stands for a task-related brain

region. Consider a BN of brain regions indexed by

f1,2, ...,Mg, and let Xk=(Xk,1,Xk,2, ...,Xk,T) be the input

data for the (k)th region. Under the model assumptions of

DAG, the joint likelihood of the data given a DAG can be

decomposed into aproduct of a series ofconditional probabil-

ities (Jordan, 1998) (Fig. 2):


P(X_1, X_2, \ldots, X_M \mid S, \theta_S) = \prod_{k=1}^{M} P(X_k \mid \pi_k, S, \theta_S) \qquad (1)

where π_k denotes the multivariate time series for the parent regions of the kth region, S denotes the network of interest, and θ_S denotes the parameters in S. In recent BN applications

to neural network construction (Zheng and Rajapakse, 2006), the inputs (X_{k,t}, π_{k,t}) are assumed to be independent and discretized; thus, each conditional probability in Equation (1) can be written as P(X_k | π_k, S, θ_S) = ∏_{t=1}^{T} P(X_{k,t} | π_{k,t}, S, θ_S). To select the most likely network from the list of possible ones, the Akaike information criterion (AIC) of the following form is commonly used (Akaike, 1974):

\mathrm{AIC}(S) = -\log P(X_1, X_2, \ldots, X_M \mid S, \hat{\theta}_S) + q_S. \qquad (2)

Here θ̂_S is the maximum likelihood estimate of θ_S and q_S denotes the effective number of parameters in the model.

Spectral density matrix of the network. In this section, we describe the graphical model approach (Bach and Jordan, 2004) for learning brain connectivity structure. In the remainder of the article, we will use the term ''Spectral Bayesian Network'' (SBN) for this graphical approach, since it learns the network structure based on the spectral density matrix of the multivariate time-series input in the frequency domain. Let X = {X_1, X_2, …, X_M}^T be the M × T multivariate time series for M task-related active brain regions, where each row X_k = (X_{k,1}, X_{k,2}, …, X_{k,T}) is the univariate time course representing the kth region. Denote X̃_t = (X_{1,t}, X_{2,t}, …, X_{M,t}) as the tth column of X. Assuming that X is centered and stationary, the autocovariance function of X is an M × M matrix defined as

\Gamma(h) = E[\tilde{X}_{t+h} \tilde{X}'_t] \qquad (3)

for any lag h ∈ {0, ±1, ±2, …}. The off-diagonal elements in the sub block Γ_{k,π_k}(h) denote the cross-covariances of brain region k and its parent nodes π_k. For a given h, this block describes the pairwise linear dependency between the brain regions in {k, π_k}. The spectral density matrix of X is defined as

f(\omega) = \frac{1}{2\pi} \sum_{h=-\infty}^{\infty} \Gamma(h) \exp\{-i\omega h\}, \qquad (4)

for ω ∈ [−π, π]. f(ω) is an M × M symmetric matrix at a fixed frequency ω, and the (i, j) entries from all the spectral matrices aggregate to form a spectral decomposition of the temporal dependence between the activations in brain regions i and j (Salvador et al., 2005, 2007) (Fig. 3). After applying Bayes' rule to the conditional probabilities in Equation (1), the AIC score of a given network S can be rewritten as

\mathrm{AIC}(S) = -\log \prod_{k=1}^{M} P(X_k \mid \pi_k, S, \hat{\theta}_S) + q_S = -\sum_{k=1}^{M} \log \frac{p(X_k, \pi_k \mid S, \hat{\theta}_S)}{p(\pi_k \mid S, \hat{\theta}_S)} + q_S. \qquad (5)

This is a sum of likelihoods p(X_A | S, θ̂_S), A ⊆ {1, 2, …, M}, with respect to the network structure S, plus the penalty q_S on network complexity. Assuming that the multivariate

FIG. 2. The decomposition of the joint likelihood of data in Bayesian Networks (BN). Each likelihood is simplified into a product of local conditional probabilities based on the network structure (Jordan, 1998). Color images available online at www.liebertonline.com/brain

FIG. 3. The estimated cross spectral density between two time series with different levels of connectivity strength, with 90% sample quartile intervals (dashed line) in 1000 simulations. The bivariate time series are generated from VAR models as described in the section Simulation studies, with connection parameter p = 0.1 (top), 0.6 (middle), and 0.9 (bottom) for the link between the two time series. The sample mean and standard deviation of the differences of SBN AIC scores between the true network (connected) and the empty network in the simulations are shown above the plots. AIC, Akaike information criterion; SBN, Spectral Bayesian Network; VAR, vector autoregressive. Color images available online at www.liebertonline.com/brain


time series X = {X_1, X_2, …, X_M}^T are generated from a stationary Gaussian process, each local likelihood in Equation (5) can be approximated using the corresponding sub block in the spectral density matrix of X, plus a constant term C that does not depend on the network structure (Bach and Jordan, 2004; Jordan, 1998):

\log p(X_A \mid S, \hat{\theta}_S) \approx \frac{T}{4\pi} \int_{-\pi}^{\pi} \log |f_A(\omega)| \, d\omega + C, \qquad (6)

where A ⊆ {1, 2, …, M} is an arbitrary set of the nodes in the graph, f_A(ω) is the square block of f(ω) with respect to the nodes in A, and θ̂ corresponds to the estimated spectral density of the input time series X. The Appendix gives the complete derivation of the log-likelihood estimate. The AIC score for the complete network can be written as

\mathrm{AIC}(S) = -\frac{T}{4\pi} \sum_{k=1}^{M} \int_{-\pi}^{\pi} \log \frac{|\hat{f}_{\{X_k, \pi_k\}}(\omega)|}{|\hat{f}_{\{\pi_k\}}(\omega)|} \, d\omega + q_S \qquad (7)

\approx -\frac{1}{2} \sum_{k=1}^{M} \sum_{j=1}^{T} \log \frac{|\hat{f}_{\{X_k, \pi_k\}}(\omega_j)|}{|\hat{f}_{\{\pi_k\}}(\omega_j)|} + q_S. \qquad (8)

Here f̂(ω) is the maximum likelihood estimate of the spectral densities, and each f̂_{·}(ω) in Equation (7) is a sub block of f̂(ω). In practice f(ω) is estimated by smoothing the periodograms of the multivariate time series X at discrete frequencies ω_j, resulting in the numerical integration in Equation (8). A Gaussian spectral window is applied for periodogram smoothing

(Appendix). Since the estimated spectral densities represent

the cross-covariance of the brain activity among different re-

gions in the frequency domain (Fig. 3), Equation (8) provides

a concise selection criterion based on the dependency infor-

mation across the whole spectrum, rather than depending

on specific time lags.
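The score in Equation (8) can be sketched numerically as follows. This is an illustrative NumPy-only implementation under stated assumptions: the Gaussian smoothing bandwidth and all function names are our own choices (the paper's exact window parameters are given in its Appendix), and parent sets are passed as a plain dictionary:

```python
import numpy as np

def smoothed_periodogram(x, bandwidth=5):
    """Estimate the spectral density matrix f(w_j) of a centered M x T
    multivariate time series by smoothing the cross-periodogram with a
    Gaussian window across neighboring Fourier frequencies."""
    M, T = x.shape
    fft = np.fft.fft(x, axis=1)                        # M x T
    # raw cross-periodogram at each Fourier frequency: M x M x T
    per = np.einsum('it,jt->ijt', fft, np.conj(fft)) / (2 * np.pi * T)
    half = 3 * bandwidth
    w = np.exp(-0.5 * (np.arange(-half, half + 1) / bandwidth) ** 2)
    w /= w.sum()
    smooth = np.zeros_like(per)
    for k, wk in zip(range(-half, half + 1), w):
        smooth += wk * np.roll(per, k, axis=2)         # frequencies wrap
    return smooth                                      # M x M x T

def sbn_aic(x, parents, n_params):
    """AIC score of Equation (8): for each node k with parent set
    parents[k], sum log(|f_{k,pk}| / |f_{pk}|) over the discrete
    frequencies, scale by -1/2, and add the complexity penalty."""
    f = smoothed_periodogram(x)
    score = 0.0
    for k, pk in parents.items():
        joint = [k] + list(pk)
        fj = f[np.ix_(joint, joint)]                   # sub block {k, pk}
        num = np.log(np.abs(np.linalg.det(fj.transpose(2, 0, 1))))
        if pk:
            fp = f[np.ix_(list(pk), list(pk))]
            den = np.log(np.abs(np.linalg.det(fp.transpose(2, 0, 1))))
        else:
            den = 0.0
        score += np.sum(num - den)
    return -0.5 * score + n_params
```

Networks are then compared by evaluating `sbn_aic` once per candidate parent structure on the same data.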

Structure learning by Bayesian model averaging. The selection of the best network structure is a crucial step in the

model. Many factors influence this decision process, and spe-

cial care should be exercised to address this issue. The first

concern is due to the high noise level of fMRI data. With

such a noisy dataset, the best model chosen by the model se-

lection criterion (for instance, using the AIC scores) is the best

model for the contaminated data; thus, simply choosing the

model with the smallest AIC score may fail to identify the true model. The second disadvantage of the tra-

ditional model selection procedures, such as AIC and

Bayesian Information Criterion, lies in the fact that no prior

network structure information can be incorporated into the

selection. In brain imaging, researchers have garnered well-

established neuro-anatomical knowledge over the years. It

is unreasonable from a modeling perspective to ignore vali-

dated information about the brain during our network search

procedures. The last issue is the notion of model equivalence

in the learning of BN structure. In BN, two graphs are said to

be equivalent if the factorization in Equation (1) based on one

graph is identical to the factorization based on the other.

When the estimation of network structure is the main focus,

we need to distinguish the networks that are different yet

have the same AIC score.

Observing these three issues, we propose the following

Bayesian model averaging (BMA) approach to identify the


optimal network structure. Instead of a direct comparison

of all the competing networks, our method computes and

ranks the posterior probability of the existence of a particular

link given the observed data, and constructs the connectivity

network byadding the most plausible link one at a time. Since

these posterior probabilities are unique, the network identi-

fied using this procedure is also unique.

In BMA (Hoeting et al., 1999), the posterior distribution of an arbitrary quantity of interest, Δ, given data D, can be written in the following form:

p(\Delta \mid D) = \sum_{k=1}^{K} p(\Delta \mid M_k, D) \, p(M_k \mid D), \qquad (9)

where the posterior distribution p(Δ | D) is constructed by averaging over K candidate models M_k. In learning network structures, each model M_k represents a candidate network S_k. The quantity of interest Δ is the existence of an edge E_{a→b} between two brain regions a and b. We further assume p(E_{a→b} | S_k, D) = 1 if the kth network structure S_k contains the edge a→b, and p(E_{a→b} | S_k, D) = 0 if this edge is not in S_k.

Our BMA approach for network construction is described by the following algorithm:

1. Choose S as the pool of candidate network structures.
2. Compute the network score for each S_k in S using the AIC metric in Equation (8).
3. Compute p(E_{a→b} | D) for each edge E_{a→b} by averaging over all networks in S:

p(E_{a\to b} \mid D) = \sum_{S_k \in S} p(E_{a\to b} \mid S_k, D) \, p(S_k \mid D). \qquad (10)

An estimate of p(S_k | D) can be obtained when we compute the AIC (Bozdogan, 1987).

4. Build the network.
   a) Start with an empty graph S.
   b) Consider all edges not in S; add the one with the highest p(E_{a→b} | D) into S.
   c) If a cycle is formed, delete the current edge.
   d) Repeat b) and c) until no edge outside S satisfies

\frac{p(E_{a\to b} \mid D)}{\max_{\text{all edges}} \{p(E_{a\to b} \mid D)\}} > C. \qquad (11)

5. Output S.

The algorithm starts with an empty network and adds edges one by one into the graph according to their posterior probabilities averaged over the pool, unless the current addition results in a cyclic graph, which is not allowed by the definition of BN.

We build a network only with the connections supported by the data, since all the edges with relatively low posterior probability are excluded in the construction, which is controlled by the

threshold C in Equation (11) (Jeffreys, 1998; Madigan and

Raftery, 1994). Jeffreys (1998) and Madigan and Raftery

(1994) suggest the use of a value between 0.1 and 0.001, analogous to p-values, to exclude edges with significantly small posterior probability. In this article, we use C = 0.05, which is a

common choice of significance level in hypothesis testing.

We have examined the impact of threshold C in learning lan-

guage network structure in the section Study on children’s


narrative comprehension, and the resulting networks are the

same for C ranging from 0.01 to 0.1.
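Step 4 of the algorithm, the greedy cycle-free edge addition with the stopping rule of Equation (11), can be sketched as follows. The dictionary-based graph representation and function names are illustrative; edge scores are assumed to be the posteriors p(E_{a→b} | D) from Equation (10):

```python
def has_path(adj, src, dst):
    """Depth-first search: is dst reachable from src in the directed
    graph adj (dict: node -> set of successor nodes)?"""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(adj.get(node, ()))
    return False

def build_network(edge_scores, threshold=0.05):
    """Add edges in decreasing order of posterior probability, skipping
    any edge a->b that would close a cycle (i.e., b already reaches a),
    and stop once scores fall below `threshold` relative to the maximum
    (the stopping rule of Equation (11) with C = threshold)."""
    best = max(edge_scores.values())
    adj = {}
    for (a, b), p in sorted(edge_scores.items(), key=lambda kv: -kv[1]):
        if p / best <= threshold:
            break
        if has_path(adj, b, a):
            continue
        adj.setdefault(a, set()).add(b)
    return adj
```

Since the edge posteriors are fixed, adding edges in sorted order is equivalent to repeatedly picking the highest-scoring remaining edge.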

For the selection of S, we discard networks with relatively

small posterior probabilities such that p(SjD)/max p(SjD)<C.

This is because networks with very low posterior probabilities

are generally very different from the ground truth. Discarding

these poor candidate models will enable the model selection

procedure to avoid substantial contribution to the sum in

Equation (10) by a large number of unreasonable networks

each contributing a small posterior probability. This is a com-

mon practice in BMA (Hoeting et al., 1999; Raftery et al., 1997).

Again, we choose C = 0.05. Furthermore, the computation in Equation (10) requires the estimation of the posterior probability for each candidate model. When the number of nodes in the

network is moderate (which is the case for our narrative com-

prehension study), exhaustive computation over all the com-

peting models is feasible. Since the number of the candidate

models grows exponentially as the number of nodes increases,

stochastic approximation may be needed for larger networks.
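To see why exhaustive enumeration is feasible only for moderate networks, the number of labeled DAGs on n nodes can be counted with Robinson's recurrence; a short sketch (our own illustration, not part of the paper's method):

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def num_dags(n):
    """Number of labeled directed acyclic graphs on n nodes,
    via Robinson's recurrence over the k nodes with no incoming edge."""
    if n == 0:
        return 1
    return sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k))
               * num_dags(n - k) for k in range(1, n + 1))

# The five-node left-hemisphere pool is still enumerable;
# a few more nodes and it is not.
print([num_dags(n) for n in range(1, 7)])
# → [1, 3, 25, 543, 29281, 3781503]
```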

Results

Simulation studies

In this section we perform several simulations to demon-

strate the robustness of the SBN model in detecting the net-

work structure underlying multivariate time series. The

results of the spectral model are compared with those of static

BN and DBN. The multivariate time-series X=(X1,X2,

...,XM)Tare generated from a first-order VAR model. We

consider a network with four nodes (M=4):

\tilde{X}_t = F \tilde{X}_{t-1} + \epsilon_t, \quad \epsilon_t \sim \mathrm{Normal}(\mathbf{0}, \sigma^2 I_4), \quad \tilde{X}_0 \sim \mathrm{Normal}(\mathbf{0}, \sigma_0^2 I_4), \qquad (12)

where X̃_t is a four-dimensional observation at time t, and F

is the 4 × 4 one-step connectivity matrix. Any nonzero off-diagonal entry F_{ij} represents a unidirectional interaction from node X_i to node X_j. There is no connection in the graph for nodes X_i and X_j when F_{i,j} = F_{j,i} = 0. The ε_t, t = 1, 2, …, T are

assumed to be independent and identically distributed Gaussian noise with zero mean and diagonal covariance matrix σ²I_4. To ensure that the time series are Gaussian and stationary, we assume X̃_0 to be Gaussian with zero mean and diagonal covariance matrix σ₀²I_4. In this study, we examine three different connectivity matrices F with different levels of connections:

[Equation (13): the three 4 × 4 connectivity matrices F_1, F_2, and F_3, with entries given by the connection parameters p_0 and p_1; the full matrix layout is not legible in this extraction.]

The corresponding network structures are demonstrated in

Figure 4.

We consider a variety of parameter settings to test the ro-

bustness of SBN modeling. First, we study the effect of the

length of observed time-series T when other parameters are

fixed (p_1 = 0.8, p_0 = 0.1, σ² = 0.5, σ₀² = 1). Synthetic time series are generated with length ranging from 50 to 500, and the sim-

ulations are repeated 100 times. In each run we compute the

graphs learned from each method and record the percentage

of successful identification of the true network structure from

which the time series are generated.
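The synthetic data generation of Equation (12) can be sketched as follows. The specific connectivity matrix below is a hypothetical example in the spirit of Figure 4 (the matrices of Equation (13) are not reproduced here), with p_1 on the links taken to be present and p_0 elsewhere off the diagonal:

```python
import numpy as np

def simulate_var(F, T, sigma2=0.5, sigma2_0=1.0, seed=None):
    """Simulate a first-order VAR process as in Equation (12):
    X_t = F X_{t-1} + e_t with Gaussian noise. Returns an M x T array."""
    rng = np.random.default_rng(seed)
    M = F.shape[0]
    x = np.empty((M, T))
    state = rng.normal(0.0, np.sqrt(sigma2_0), size=M)   # X_0
    for t in range(T):
        state = F @ state + rng.normal(0.0, np.sqrt(sigma2), size=M)
        x[:, t] = state
    return x

# Hypothetical 4-node connectivity matrix (not the paper's F_1-F_3).
p0, p1 = 0.1, 0.8
F = np.array([[0.0, p1, p0, p0],
              [p0, 0.0, p1, p0],
              [p0, p0, 0.0, p1],
              [p0, p0, p0, 0.0]])
X = simulate_var(F, T=500, seed=42)
```

The spectral radius of this F is below one, so the simulated series is stationary, matching the assumption behind the Whittle approximation in Equation (6).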

Figure 5 demonstrates the difference in terms of structural

learning between ordinary BN (Zheng and Rajapakse, 2006),

DBN (Rajapakse and Zhou, 2007), and our approach. The

MATLAB software package, Bayes Net Toolbox (Murphy,

2001), is used for BN and DBN models. Clearly, SBN


FIG. 4. The three graphical structures for simulating four-dimensional multivariate time series from VAR models.

FIG. 5. Percentage of identifying the true structure under different signal lengths (T). Green: SBN. Blue: Dynamic Bayesian Networks (DBN). Red: Gaussian Bayesian Network without assuming temporal dependency. Time series are simulated from the three different graphs (F1, F2, F3) as shown in Figure 4.


outperforms BN and DBN consistently in model identifica-

tion, especially when the length of the time-series T is limited.

When the true data-generating network structure is complex (F1), a moderate T is necessary to guarantee the performance of SBN. For simpler structures, as in F2 and F3, SBN can identify close to 100% of the true network when T ≥ 100, while

both BN and DBN give poor results. Compared with DBN

and SBN, the performance of BN is much worse. This is be-

cause the synthetic data are simulated with strong temporal

correlation (autoregressive) especially in F1and F2.

Furthermore, we examine the impact of the connectivity

strength on model performance when the observed time se-

ries have a moderate length (T=500) and a short length

(T=100). Other parameters are fixed (p0=0:1,r2=0:5,

r2

structure under moderate connectivity strength (0.4£p1£0.9)

when the observed time courses are of moderate length

(T=500). Figure 7 shows that when the length of observation

is limited (T=100), SBN can still learn the true network struc-

ture if the network has moderate numbers of connections (F2,

F3) with relatively strong connectivity strengths. In both sim-

ulations, SBN models perform much better than BN and DBN

unless the connectivity strength is weak (p1£0.3). It is worth

mentioning that the SBN approach is for learning the network

structure instead of estimating the strength of any specific

link in the network. Thus, although we validate the perfor-

mance of SBN mainly using data from first-order VAR mod-

0=1): Figure 6 shows that SBN can detect the right network

els, SBN can be applied to VAR models with longer lags or

time series with other type of temporal structure.

Study on children’s narrative comprehension

Six task-related spatial-independent components are iden-

tified using the procedure outlined in the section The spectral

graphical model with model averaging in the narrative com-

prehension task (Schmithorst et al., 2006). The ICA maps dis-

played in Figure 8 show the activations in each task-related

component. The components are ordered according to the

phase of the averaged Fourier component relative to the ref-

erence on–off time course. The activated brain regions and ac-

tivation foci are shown in Table 1. These task-related regions

are also identified by a random-effect GLM analysis (Karuna-

nayaka et al., 2007).

Two knowledge-based anatomical networks of the language comprehension circuit have been proposed (Fig. 9) (Karuna-

nayaka et al., 2007). The extended language network includes

six regions of interest (ROI) that are identified by ICA: Brod-

mann area (BA) 22, BA 22 posterior, BA 39, BA 41, BA 44, and

hippocampus. A simplified structure was also suggested in

Karunanayaka et al. (2007), which does not include hippo-

campus (Fig. 8d). The simplified structure is more consistent

with the Wernicke-Geschwind model and previous neuro-

anatomical results, and is the focus in the study using linear

SEM (Karunanayaka et al., 2007). Thus, in this article, we

apply the SBN approach with model averaging to learn

the network connectivity among the five brain regions

FIG. 6. Percentage of identifying the true structure under different strengths of connectivity (p_1) when the length of the time series T = 500. Blue: SBN. Red: DBN. Green: Gaussian Bayesian Network without assuming temporal dependency. Time series are simulated from the three different graphs (F1, F2, F3) as shown in Figure 4.

FIG. 7. Percentage of identifying the true structure under different strengths of connectivity (p_1) when the length of the time series T = 100. Blue: SBN. Red: DBN. Green: Gaussian Bayesian Network without assuming temporal dependency. Time series are simulated from the three different graphs (F1, F2, F3) as shown in Figure 4.


FIG. 8. The six task-related networks (a–f) found from spatial group independent component analysis of 313 children performing a narrative story comprehension task. The brain regions included in the regions of interest (in the left and right hemispheres) of each map are shown in Table 1. Slice range: Z = −25 to +50 mm (Talairach coordinates). All images are in radiological orientation. Extracted from Schmithorst et al. (2006).

Table 1. Activation Foci (Talairach Coordinates) for the Task-Related Independent Component Analysis Components

a (Primary auditory cortex):
  L. superior temporal gyrus, BA 41: -38, -21, 15
  R. superior temporal gyrus, BA 41: 46, -21, 20
b (Processing of auditory spectral and temporal information):
  L. superior temporal gyrus, BA 22: -54, -13, 5
  R. superior temporal gyrus, BA 22: 50, -17, 5
  L. medial temporal gyrus, BA 21: -54, -33, 0
c (Broca's Area and left lateralized phonological working memory network):
  L. inferior frontal gyrus, BA 46: -38, 43, 5
  L. inferior frontal gyrus, BA 44/45: -42, 7, 30
  L. inferior parietal lobule, BA 40: -50, -53, 30
  L. middle frontal gyrus, BA 8: -6, 23, 45
d (Memory encoding and storage of narrative elements):
  L. hippocampus: -26, -25, -5
  R. hippocampus: 22, -21, -5
e (Wernicke's Area; acoustic word recognition):
  L. superior temporal gyrus, BA 22: -50, -49, 15
  R. superior temporal gyrus, BA 22: 46, -53, 10
f (Higher order semantic processing):
  L. angular gyrus, BA 39: -46, -53, 25
  R. angular gyrus, BA 39: 46, -49, 25
  L. precuneus/posterior cingulate, BA 7/31: -10, -45, 30

Talairach coordinates were extracted from Schmithorst et al. (2006).
ICA, independent component analysis; BA, Brodmann area; L., left; R., right.


represented by the independent components a, b, c, e, and f,

which encompass Brodmann Areas BA 22, BA 22 posterior,

BA 39, BA 41, and BA 44. We compare our results with the

simplified network (Figure 9) proposed based on neuro-

anatomical knowledge.

For each of the five ROIs, a representative fMRI time series

is selected by averaging the voxel of maximum activation and all its neighboring voxels, per hemisphere and per child. We divide the subjects into three age groups: 5 to 8, 9

to 13, and 14 to 18 years old. For each age group, the struc-

tures of the neural functional networks in both left and

right hemisphere are estimated using SBN with model aver-

aging. The input to SBN for each age group is the median

time series from all subjects within the age range. For the

left hemisphere, we consider a five-node network among

BA 22, BA 22 posterior, BA 39, BA 41, and BA 44, and for

the right hemisphere, we only look at the network of BA 22,

BA 22 posterior, BA 39, and BA 41. We omitted BA 44 (Broca’s

area) from the right hemisphere model because this region

did not reach significance in the first-level analysis with

ICA or GLM and is not considered to be functionally related

to narrative comprehension.
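The extraction of a representative ROI time series and a group-level input series can be sketched in a few lines. This is an illustrative toy, not the authors' code: a one-dimensional voxel index stands in for true three-dimensional adjacency in the image volume, and all names and data are invented.

```python
import statistics

def roi_time_series(voxels, activation):
    """Average the time series of the maximally activated voxel and its
    neighbors.  `voxels` maps a voxel index to its fMRI time series and
    `activation` maps the same index to an activation statistic; adjacent
    integer indices stand in for spatial neighborhood in the volume."""
    peak = max(activation, key=activation.get)
    picked = [ts for idx, ts in voxels.items() if abs(idx - peak) <= 1]
    length = len(picked[0])
    return [sum(ts[t] for ts in picked) / len(picked) for t in range(length)]

def group_median_series(subject_series):
    """Median time series across all subjects within an age group."""
    length = len(subject_series[0])
    return [statistics.median(ts[t] for ts in subject_series) for t in range(length)]

# Toy data: three voxels, two time points; voxel 1 is maximally activated.
voxels = {0: [1.0, 1.0], 1: [3.0, 5.0], 2: [5.0, 7.0]}
activation = {0: 0.1, 1: 0.9, 2: 0.5}
roi = roi_time_series(voxels, activation)              # averages voxels 0, 1, 2
group_input = group_median_series([[1, 2], [3, 4], [5, 6]])
```

The median across subjects, rather than the mean, gives the group-level input some robustness to outlying subjects.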

Our model selection procedure searches among all the
models in the candidate pool S and decides the most probable set
of links. Thus, the choice of the pool S is very important. First, we
restrict S to contain only networks starting from brain region BA
41; this is a well-accepted anatomical constraint, since BA 41
corresponds to the primary auditory cortex responsible for the
initial processing of acoustic information. Furthermore, instead of
averaging over all possible neural networks with no edge pointing
to BA 41, we discard networks with relatively small posterior
probability, such that p(S | D)/max p(S | D) < 0.05. Using the edge
score in Equation (10), we compute the posterior probability of
the existence of all possible connections among the task-related
regions, and the network structure is then constructed using the
averaging algorithm outlined in the section Structure learning
by Bayesian model averaging.
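The averaging step can be sketched as follows. The candidate pool, region labels, and log-scores below are invented for illustration; the log-scores stand in for the unnormalized log posterior probabilities of the networks.

```python
import math

def bma_edge_scores(candidates, occam_ratio=0.05):
    """Bayesian model averaging over a candidate pool of networks.

    `candidates` is a list of (edges, log_score) pairs, where `edges` is a
    frozenset of directed links and `log_score` approximates the
    unnormalized log posterior of the network.  Networks with
    p(S | D) / max p(S | D) < occam_ratio are discarded, the survivors are
    rescaled so their posteriors sum to one, and the score of an edge is
    the total posterior mass of the surviving networks containing it.
    """
    max_log = max(score for _, score in candidates)
    kept = [(edges, math.exp(score - max_log)) for edges, score in candidates
            if math.exp(score - max_log) >= occam_ratio]
    total = sum(weight for _, weight in kept)
    edge_score = {}
    for edges, weight in kept:
        for edge in edges:
            edge_score[edge] = edge_score.get(edge, 0.0) + weight / total
    return edge_score

# Toy pool of three networks rooted at BA 41; the third falls outside
# Occam's window and is discarded before averaging.
pool = [
    (frozenset({("BA41", "BA22"), ("BA22", "BA44")}), -10.0),
    (frozenset({("BA41", "BA22"), ("BA41", "BA44")}), -11.0),
    (frozenset({("BA41", "BA44")}), -20.0),
]
scores = bma_edge_scores(pool)
```

An edge contained in every surviving network receives a score of one, while an edge supported by only part of the pool receives the posterior mass of those networks, which is how the edge scores in Table 2 should be read.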

Figure 10 shows the connectivity network learned for the
left hemisphere. The estimated networks represent the average
neural connectivity predicted during the narrative comprehension
task in children aged 5 to 8, 9 to 13, and 14 to 18 years.
Most of the estimated pathways are highly consistent among
the three age groups, except the links from BA 41 to BA 39 in
the 5–8 group and from BA 41 to BA 44 in the 14–18 group.
The edge scores computed based on model averaging are shown
in Table 2. All the edge scores have been rescaled for each age
group separately such that Σ_k p(S_k | D) = 1. The connections
from BA 41 to BA 22 and from BA 22 to BA 44 show the highest
edge scores across the three groups, suggesting a pathway with
relatively high connectivity through these regions at all of these
developmental stages. Of particular importance, we observe that
the youngest age group has a strong connection (edge score 0.440)
between primary auditory cortex (BA 41) and the auditory-language
association area (BA 39) that is not detected in the older age
groups. Instead, the oldest children exhibit a long-range connection
(edge score 0.841) between primary auditory cortex (BA 41) and
Broca's area (BA 44) along the arcuate fasciculus.

The weak connection between BA22 and BA39 in the left

FIG. 9. Knowledge-based neural/brain network for narrative story comprehension. The simplified network does not include the hippocampus identified in IC d. The Talairach coordinates of the activation foci are shown in Table 1.

FIG. 10. The estimated neural network for narrative story comprehension in the left hemisphere for three different age groups—5 to 8 (top), 9 to 13 (middle), and 14 to 18 (bottom)—based on the Bayesian averaged network that utilized a network scoring criterion as part of the SBN approach. Green edges are the connections not identified in all the age groups.

396 LIN ET AL.

Page 9

hemisphere (connection strength < 0.2 for all age groups in
Table 2) is consistent with our understanding of language
network connections within the brain: the anterior part of
BA 22 is directly involved in phonological processing, while
the posterior part of BA 22 (BA 22 posterior), traditionally
referred to as Wernicke's Area for language processing, feeds
forward into language association areas in the angular gyrus
(BA 39) (Catani and Jones, 2004; Karunanayaka et al., 2007).

With no activation identified in BA 44 in the right hemisphere,
Figure 11 shows the estimated networks among BA 22, BA 22
posterior, BA 39, and BA 41 for children in the three age groups.
Similar to the networks learned for the left hemisphere, the
network structures are highly consistent among the groups. The
oldest age group shows an additional path from BA 41 to BA 22
posterior, which is not present in the two younger age groups,
but its score is relatively low compared with all other edges in
the network. A single pathway, BA 41 → BA 22 → BA 22
posterior → BA 39, is learned by the model averaging algorithm,
a simpler connectivity structure than its counterpart in the left
hemisphere.

Discussion & Future Work

In this article, we applied the spectral graphical model with

model averaging to construct brain connectivity networks

using fMRI data. The advantage of our approach is three-

fold. First, it assumes only a standard a priori constraint on

the network structure and the algorithm can select the most

probable network based on data. Second, this approach uti-

lizes the fMRI signals transformed into the frequency domain

to build a spectral graphical model; thus, the temporal
characteristics of the entire time series are accounted for. This is

fundamentally different from the BN and DBN approaches

(Rajapakse and Zhou, 2007; Zheng and Rajapakse, 2006),

where the data are assumed to follow an autoregressive pro-

cess with fixed lags (DBN) or no lag (BN). Third, we adopt the

BMA technique for model selection, which is less sensitive to

noise and more robust against outliers.
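The point about capturing the whole temporal dependency at once can be made concrete with a stationary AR(1) process, whose autocovariances at every lag are summarized exactly by its spectral density. The parameters below are arbitrary illustrations, not values from the study.

```python
import math

phi, s2 = 0.6, 1.0  # AR(1) coefficient and innovation variance (illustrative)

def autocov(h):
    """Autocovariance Gamma(h) of x_t = phi * x_{t-1} + eps_t."""
    return s2 * phi ** abs(h) / (1 - phi ** 2)

def spectral_density(w):
    """Closed-form AR(1) spectral density s2 / |1 - phi * e^{-iw}|^2."""
    return s2 / (1 - 2 * phi * math.cos(w) + phi ** 2)

# f(w) = sum over ALL lags h of Gamma(h) * e^{-ihw}: the spectrum encodes
# the entire autocovariance function, not just Gamma(0) or Gamma(1).
w = 1.0
from_lags = sum(autocov(h) * math.cos(h * w) for h in range(-200, 201))
```

Truncating the lag sum at a large order already matches the closed form; a model restricted to lag 0 or lag 1, as in the BN and DBN settings above, would discard the higher-lag terms of this sum.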

Previous network models (Karunanayaka et al., 2007) pro-

posed for this narrative comprehension task have used a

Table 2. Edge Scores in the Functional Networks Learned for Both Hemispheres in Each Age Group

Start region         End region           Age 5–8   Age 9–13   Age 14–18
Left hemisphere
BA 22 (IC b)         BA 22 Post (IC e)    0.535     0.818      0.555
BA 22 (IC b)         BA 39 (IC f)         0.180     0.195      0.084
BA 22 (IC b)         BA 44 (IC c)         0.937     0.864      0.955
BA 22 Post (IC e)    BA 39 (IC f)         0.584     0.766      0.955
BA 22 Post (IC e)    BA 44 (IC c)         0.937     0.864      0.538
BA 41 (IC a)         BA 22 (IC b)         0.937     0.864      0.955
BA 41 (IC a)         BA 22 Post (IC e)    0.567     0.092      0.800
BA 41 (IC a)         BA 39 (IC f)         0.440     –          –
BA 41 (IC a)         BA 44 (IC c)         –         –          0.841
Right hemisphere
BA 22 (IC b)         BA 22 Post (IC e)    0.853     0.976      0.760
BA 22 Post (IC e)    BA 39 (IC f)         0.802     0.976      0.958
BA 41 (IC a)         BA 22 (IC b)         0.924     0.976      0.958
BA 41 (IC a)         BA 22 Post (IC e)    –         –          0.396

The edge score of an effective connection between two brain regions is computed using the Bayesian averaging approach defined in Equation (10). The edge scores have been rescaled in each group so that the sum of the posterior probabilities of the candidate networks is one.

FIG. 11. The estimated neural network for narrative story comprehension in the right hemisphere for three different age groups—5 to 8 (top), 9 to 13 (middle), and 14 to 18 (bottom)—based on the Bayesian averaged network that utilized a network scoring criterion as part of the SBN approach. Green edges are the connections not identified in all the age groups.


Page 10

linear deterministic modeling approach (structural equation

model) that requires a rigidly defined set of a priori network

elements and links between them. The current treatment

using SBN with model averaging is more flexible than the

SEM approach because it does not require a priori definition

of all the network elements and connections. Instead, the op-

timization of the network in the frequency domain can gener-

ate new graphs that include connections that were not

predicted a priori. Moreover, compared with the other Bayesian
network variants (BN, DBN) that have been applied to the structure
learning of neural networks using fMRI data, the SBN approach
considers the whole spectrum (and thus the whole autocovariance
function, Γ(h), h ≥ 0) of the fMRI time series in the approximation
of the likelihood function, while the applications using the other
variants considered only fixed lags such as Γ(1) and Γ(0) (Li et al.,
2008; Rajapakse and Zhou, 2007; Zheng and Rajapakse, 2006).

Dynamic Causal Models (DCM) are another popular
class of network models that do not rely on autoregressive

assumptions of the underlying time series (Friston et al., 2003,

2011; Penny et al., 2004a, 2004b). Instead, DCM captures the

network structure by tuning model parameters that minimize

the difference between the predicted output of a dynamic system
and the observed time series. Our approach is fundamentally
different from DCM in two respects. First, DCM is built upon
time-domain data, whereas our model is based on the spectral
domain and has the flexibility of examining the network structure
in different frequency bands. Second, DCM utilizes the Bayesian
model selection criterion to choose the optimal network, whereas
our method adopts the BMA concept for stepwise selection of
the best connections. In

this article, we focus on demonstrating the applicability of

the SBN for learning brain connectivity network. A detailed

comparative study between DCM and SBN is currently

under investigation.

In our treatment of the connectivity in the narrative com-

prehension network, the left and right hemispheres are mod-

eled separately. This approach was used for two reasons.

First, the activation patterns corresponding to this task have

been found for the right and left hemispheres in both children

and adults using conventional GLM analysis (Holland et al.,

2007; Schmithorst et al., 2006) as well as ICA (Karunanayaka

et al., 2007). Further, the literature on hemispheric dominance

has demonstrated that left- and right-hemisphere networks for
language processing are asymmetric, as shown by multiple imaging
and behavioral methods, including fMRI, PET, transcranial
Doppler ultrasound, dichotic listening tests, and lesion studies
(Lohmann et al., 2005; Petersen et al., 2000; Springer et al.,

1999). Between 92% and 96% of right-handed adults are left

dominant for language processing in the brain (Springer

et al., 1999) and diffusion MRI tractography has gone so far

as to demonstrate that the arcuate fasciculus, the major

white matter pathway connecting key language regions in

the left hemisphere, is incomplete in left-dominant individu-

als (Catani et al., 2007). Clearly, cooperation between hemi-

spheres is important for this task and the presented data

demonstrate a high degree of symmetry in the pattern of ac-

tivation. Therefore, it is appropriate to consider the interac-

tion between the hemispheres for this task. However,

in a bi-hemisphere network the number of nodes
becomes 10 to 14, requiring a much longer observed time series
to guarantee model performance. The simulations included
in Figures 5–7 demonstrate that our algorithm needs a
longer time series to accurately estimate connection strengths
in larger networks. Thus, we decided to investigate only the
intra-hemisphere networks, given the limitations of the
fMRI data we have from this group of children.

The left-hemisphere SBN model differs somewhat from the
knowledge-based model in that the expected feed-forward
connection from IC f (BA 39, located in the angular
gyrus) to IC c, which includes BA 44 (Broca's Area) in the
inferior frontal gyrus, was not predicted. Instead, the Bayesian

model predicts connections to Broca’s Area arising directly

from Wernicke’s area IC e (BA 22 posterior) and IC b (BA

22). The SBN approach also detected two age-dependent con-

nections in the left-hemisphere language network for the nar-

rative comprehension task. These can be seen in Figure 10 and

in Table 2. An additional connection leading directly from IC

a in the superior temporal gyrus (BA 41 auditory cortex) to IC

c (BA 44), shown in Figure 10c as a dashed line, was predicted

in the 14–18-year-old age group but not in the two younger

groups of children. Such a connection is plausible given the

strong connections between auditory and language areas

along the path of the arcuate fasciculus in the brain. In the

5–8-year-old group, SBN predicts a connection directly from

IC a (BA 41) to IC f (BA 39), which vanishes in the older

age groups. These data-driven predictions can provide a

basis for revision and refinement of language network mod-

els and the influence of brain development on them.

The weak edge scores predicted for all age groups between

the anterior portion of BA 22 in the superior temporal gyrus

and BA 39 in the Angular gyrus in the left hemisphere are

expected based on earlier findings (Karunanayaka et al.,

2007). Recent findings from diffusion tractography imaging

elegantly demonstrate the physical pathways associated

with the left-dominant language network along the arcuate
fasciculus as including the connection from BA 22 posterior
to BA 39 (Catani and Jones, 2004; Catani et al., 2007). So, the

SBN predictions of strong connectivity along this pathway

are satisfying in that they are in line with anatomical findings

as well as connectivity estimates based on other techniques.

The SBN models for the right hemisphere focus on the

4-node network among the auditory and posterior language

areas (ICs a, b, e, and f) for the narrative comprehension

task as diagramed in Figure 11. This prediction is consistent

with recent neuro-anatomical findings based on diffusion
tensor imaging that demonstrate strong asymmetry between left-

and right-hemisphere white matter connections along the ar-

cuate fasciculus pathway (Catani and Jones, 2004; Catani

et al., 2007). Right-hemisphere networks selected by SBN

were identical for the two younger age groups of children

but included an additional connection between auditory
cortex (BA 41) and Wernicke's area (BA 22 posterior) in the oldest

group of 14–18-year-olds (Table 2). This may correspond with

continued myelination in late adolescence (Toga et al., 2006).

The application of the network learning approach
proposed here, using SBN to model the language connectivity
network in the developing brain, provides a clear demonstration
of the flexibility of this method for predicting changing
connectivity in networks as a function of physiological or
behavioral variables. In this case the age grouping of the subjects

is used to demonstrate that different connections are important

during development. A similar approach could be used in the

future to examine data from patients with neurological diseases

such as epilepsy, stroke, or aphasia. In that case, connectivity


Page 11

strength predicted by the graphs could pinpoint deficits in net-

work connections that underlie symptoms of the disorder. The

simulations show us that the SBN approach can correctly predict
the network structure, provided a sufficient sample of the
temporal network behavior is available. The results in children of
different ages shown in Figures 10 and 11 go one step further
and demonstrate the power of this graphical modeling approach
for predicting the dynamic evolution of network connectivity—
in this case as a function of subject age.

As a nonparametric approach to approximate the model

selection criterion (AIC) using frequency-domain data
instead of building a specific parametric model in the time
domain, the SBN model prefers a longer time series for better

performance in network structure learning as shown in the

simulations. Typical fMRI experiments contain 100–250

time-series samples. For conventional fMRI statistical analy-

sis, this time series is available for each voxel in each slice

of the brain image volume. Future work using the SBN

approach to estimate task-related brain connectivity could

benefit from data sets with higher sampling rate and longer

time series. fMRI has limitations that restrict the sampling
interval to a second or more, depending on the number of
slices acquired at each time point. However, other brain
imaging techniques such as EEG or MEG can provide
time-series data with much higher sampling rates of 5 kHz or greater.

Group ICA of fMRI data to generate the network model

combined with SBN of MEG data from the same task might

permit better estimates of connection strengths and variabil-

ity with age, handedness, or gender in a whole-brain bi-

hemisphere functional network in the future.

Acknowledgment

This work was funded in part by a grant from the NIH:

R01-HD38578.

Author Disclosure Statement

No competing financial interests exist.

References

Akaike H. 1974. A new look at the statistical model identification.

IEEE Trans Autom Control 19:716–723.

Bach F, Jordan M. 2004. Learning graphical models for stationary

time series. IEEE Trans Signal Process 52:2189–2199.

Barthes R, Duisit L. 1975. An introduction to the structural anal-

ysis of narrative. New Literary Hist 6:237–272.

Bozdogan H. 1987. Model selection and Akaike’s information cri-

terion (AIC): the general theory and its analytical extensions.

Psychometrika 52:345–370.

Buchel C, Friston K. 1997. Modulation of connectivity in visual

pathways by attention: cortical interactions evaluated with

structural equation modelling and fMRI. Cereb Cortex 7:768.

Bullmore E, Horwitz B, Honey G, Brammer M, Williams S,

Sharma T. 2000. How good is good enough in path analysis

of fMRI data? Neuroimage 11:289–301.

Burge J, Lane T, Link H, Qiu S, Clark VP. 2009. Discrete dynamic

Bayesian network analysis of fMRI data. Hum Brain Mapp

30:122–137.

Calhoun V, Adali T, Pearlson G, Pekar J. 2001. A method for mak-

ing group inferences from functional MRI data using indepen-

dent component analysis. Hum Brain Mapp 14:140–151.

Catani M, Allin M, Husain M, Pugliese L, Mesulam M, Murray R,

Jones D. 2007. Symmetries in human brain language pathways

correlate with verbal recall. Proc Natl Acad Sci USA 104:17163.

Catani M, Jones D. 2004. Perisylvian language networks of the

human brain. Ann Neurol 57:8–16.

Cover TM, Thomas JA. 1991. Elements of Information Theory.
New York, NY: Wiley.

Friedman N, Goldszmidt M. Learning Bayesian networks with

local structure. In Proceedings of the Twelfth Conference on

Uncertainty in Artificial Intelligence, San Francisco, Califor-

nia, USA, 1996, pp. 252–262.

Friston K. 2009. Causal modelling and brain connectivity in func-

tional magnetic resonance imaging. PLoS Biol 7:e1000033.

Friston K, Harrison L, Penny W. 2003. Dynamic causal model-

ling. Neuroimage 19:1273–1302.

Friston KJ, Li B, Daunizeau J, Stephan KE. 2011. Network discov-

ery with DCM. Neuroimage 56:1202–1221.

Geiger D, Heckerman D. Learning Gaussian Networks. In Pro-

ceedings of the Tenth Conference on Uncertainty in Artificial

Intelligence, Seattle, Washington, USA, 1994, pp. 235–243.

Heckerman D, Geiger D, Chickering D. 1995. Learning Bayesian

networks: the combination of knowledge and statistical data.

Mach Learn 20:197–243.

Himberg J, Hyvarinen A, Esposito F. 2004. Validating the inde-

pendent components of neuroimaging time series via cluster-

ing and visualization. Neuroimage 22:1214–1222.

Hoeting J, Madigan D, Raftery A, Volinsky C. 1999. Bayesian

model averaging: a tutorial. Stat Sci 14:382–401.

Holland SK, Vannest J, Mecoli M, Jacola LM, Tillema JM, Karuna-

nayaka PR, Schmithorst VJ, Yuan W, Plante E, Byars AW.

2007. Functional MRI of language lateralization during devel-

opment in children. Int J Audiol 46:533–551.

Hyvarinen A. 1999. Fast and robust fixed-point algorithms for in-

dependent component analysis. IEEE Trans Neural Netw

10:626–634.

Jeffreys S. 1998. Theory of Probability. New York, NY: Oxford Uni-

versity Press.

Jordan M. 1998. Learning in Graphical Models. Cambridge, MA:

MIT Press.

Karunanayaka P, Holland S, Schmithorst V, Solodkin A, Chen E,

Szaflarski J, Plante E. 2007. Age-related connectivity changes

in fMRI data from children listening to stories. Neuroimage

34:349–360.

Li J, Wang Z, Palmer S, McKeown M. 2008. Dynamic Bayesian

network modeling of fMRI: a comparison of group-analysis

methods. Neuroimage 41:398–407.

Lohmann H, Drager B, Muller-Ehrenberg S, Deppe M, Knecht S.

2005. Language lateralization in young children assessed by

functional transcranial Doppler sonography. Neuroimage

24:780–790.

Lorch EP, Milich R, Sanchez RP. 1998. Story comprehension in

children with ADHD. Clin Child Fam Psychol Rev 1:163–178.

Madigan D, Raftery A. 1994. Model selection and accounting for

model uncertainty in graphical models using Occam’s win-

dow. J Am Stat Assoc 89:1535–1546.

McIntosh A, Gonzalez-Lima F. 1994. Structural equation modeling
and its application to network analysis in functional brain
imaging. Hum Brain Mapp 2:2–22.

McIntosh A, Grady C, Ungerleider L, Haxby J, Rapoport S, Hor-

witz B. 1994. Network analysis of cortical visual pathways

mapped with PET. J Neurosci 14:655.

McKeown M, Makeig S, Brown G, Jung T, Kindermann S, Bell A,
Sejnowski T. 1998. Analysis of fMRI data by blind separation
into independent spatial components. Hum Brain Mapp 6:160–188.


Page 12

Murphy K. 2001. The Bayes Net Toolbox for MATLAB. Comput Sci
Stat 33:1024–1034.

Penny W, Stephan K, Mechelli A, Friston K. 2004a. Modelling

functional integration: a comparison of structural equation

and dynamic causal models. Neuroimage 23:S264–S274.

Penny W, Stephan K, Mechelli A, Friston K. 2004b. Comparing

dynamic causal models. Neuroimage 22:1157–1172.

Petersen S, Fox P, Posner M, Mintun M, Raichle M. 1988. Positron

emission tomographic studies of the cortical anatomy of

single-word processing. Nature 331:585–589.

Raftery A, Madigan D, Hoeting J. 1997. Model selection and ac-

counting for model uncertainty in linear regression models.

J Am Stat Assoc 92:179–191.

Rajapakse J, Zhou J. 2007. Learning effective brain connectivity

with dynamic Bayesian networks. Neuroimage 37:749–760.

Roebroeck A, Formisano E, Goebel R. 2005. Mapping directed in-

fluence over the brain using Granger causality and fMRI.

Neuroimage 25:230–242.

Salvador R, Achard S, Bullmore E. 2007. Frequency-dependent

functional connectivity analysis of fMRI data in Fourier and

wavelet domains. In: Jirsa VK, McIntosh AR (eds.) Handbook of

BrainConnectivity.NewYork,NY:Springer-Verlag,pp.379–401.

Salvador R, Suckling J, Schwarzbauer C, Bullmore E. 2005. Undir-

ected graphs of frequency-dependent functional connectivity

in whole brain networks. Philos Trans R Soc B: Biol Sci 360:937.

Schmithorst V, Dardzinski B, Holland S. 2001. Simultaneous correc-

tion of ghost and geometric distortion artifacts in EPI using a

multi-echo reference scan. IEEE Trans Med Imaging 20:535.

Schmithorst V, Holland S. 2004. A comparison of three methods for
generating group statistical inferences from independent
component analysis of fMRI data. J Magn Reson Imaging 19:365.

Schmithorst V, Holland S, Plante E. 2006. Cognitive modules uti-

lized for narrative comprehension in children: a functional

magnetic resonance imaging study. Neuroimage 29:254–266.

Springer JA, Binder JR, Hammeke TA, Swanson SJ, Frost JA, Bell-

gowan PSF, Brewer CC, Perry HM, Morris GL, Mueller WM.

1999. Language dominance in neurologically normal and ep-

ilepsy subjects. Brain 122:2033.

Talairach J, Tournoux P. 1988. Co-Planar Stereotaxic Atlas of the
Human Brain. New York, NY: Thieme.

Thevenaz P, Ruttimann UE, Unser M. 1998. A pyramid approach

to subpixel registration based on intensity. IEEE Trans Image

Process 7:27–41.

Toga AW, Thompson PM, Sowell ER. 2006. Mapping brain mat-

uration. Trends Neurosci 29:148–159.

Zheng X, Rajapakse JC. 2006. Learning functional structure from

fMR images. Neuroimage 31:1601–1613.

Address correspondence to:

Xiaodong Lin

Department of Management Science and Information Systems

Rutgers University

Piscataway, NJ 08854

E-mail: xiaodonglin@gmail.com

Appendix

In a Spectral Bayesian Network (SBN) model of an M × T
multivariate time series X = {X_1, X_2, ..., X_M}, where M is the
number of nodes (brain regions) in the network and T is the
length of the time series, the likelihood is first decomposed
based on the network structure S, as in static Bayesian
Networks (BN):

\[
\log p(X \mid S, \theta_S) = \sum_{k=1}^{M} \log \frac{p(X_k, \pi_k \mid S, \theta_S)}{p(\pi_k \mid S, \theta_S)},
\]

where X_k and \pi_k denote the (multivariate) time series of
the kth node and of the parent nodes of the kth node,
respectively, and \theta_S denotes the collection of parameters in
all the local probability distributions above. The distributions
p(Y \mid S, \theta_S), with Y = \{X_k, \pi_k\} or \{\pi_k\}, k = 1, 2, ..., M,
cannot be further decomposed as in Equation (2), since Y is a
subset of the multivariate time series X. Instead, when T is
long enough, each local log-likelihood \log p(Y \mid S, \theta_S) can
be estimated by

\[
\log p(Y \mid S, \theta_S) \approx -T \cdot h(Y),
\]

where h(Y) is the entropy rate of the time series, given
that it is a finite sample of a stationary ergodic process.
This approximation is based on the asymptotic equipartition
property: \lim_{T \to \infty} \log p(Y \mid \theta)/T = -h(Y), also
known as the Shannon-McMillan-Breiman theorem in
information theory (Cover and Thomas, 1991). Assuming
stationarity and Gaussianity of the multivariate time series Y, the

entropy rate can be computed from the spectral densities
directly:

\[
h(Y) = \frac{M_Y}{2} \log 2\pi e + \frac{1}{4\pi} \int_{-\pi}^{\pi} \log |f_Y(\omega)| \, d\omega,
\]

where e is the base of the natural logarithm, M_Y is the
dimension of Y, and f_Y(\omega) is the M_Y × M_Y spectral
density matrix of Y. Thus, the

Akaike information criterion (AIC) score for a Spectral BN

can be aggregated as

\[
\mathrm{AIC}(S) = -\log p(X \mid S, \hat{\theta}_S) + q_S
= \frac{T}{4\pi} \sum_{k=1}^{M} \int_{-\pi}^{\pi} \log \frac{|\hat{f}_{\{X_k,\pi_k\}}(\omega)|}{|\hat{f}_{\{\pi_k\}}(\omega)|} \, d\omega + q_S + C,
\]

where \hat{f}_Y(\omega) denotes the spectral density estimate
corresponding to the set of nodes Y, and C = TM \log(2\pi e)/2
is a constant independent of the network structure. Given a
set of discrete estimates of f(\omega) at the Fourier frequencies,
the integral above can be further approximated by interpolation,
and the AIC score in SBN modeling is

\[
\mathrm{AIC}(S) \approx \frac{1}{2} \sum_{k=1}^{M} \sum_{j=1}^{T} \log \frac{|\hat{f}_{\{X_k,\pi_k\}}(\omega_j)|}{|\hat{f}_{\{\pi_k\}}(\omega_j)|} + q_S,
\]

where the constant term is ignored. The Gaussian spectral
window W_\sigma(k) = \exp\{-k^2/(2\sigma^2)\}/(\sigma\sqrt{2\pi})
is applied to smooth the periodograms and obtain the spectral
density estimates. The optimal smoothing parameter \sigma is
chosen by applying the AIC to the Whittle approximation of
the likelihood of the multivariate time series. We follow Bach
and Jordan (2004) in using q_S = T^* \sum_{k=1}^{M} (2|\pi_k| + 1)
as the penalty for network complexity, where T^* is the effective
length of the spectral density sequence after smoothing.
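The derivation above can be exercised end to end on simulated data. The sketch below is illustrative rather than a reimplementation of the authors' code: it estimates cross-periodograms smoothed with the Gaussian spectral window, evaluates the discrete score for two candidate two-node structures, and assumes a fixed σ and effective length T* instead of the AIC/Whittle selection described in the text.

```python
import cmath
import math
import random

def dft(y):
    """Discrete Fourier transform at the Fourier frequencies w_j = 2*pi*j/T."""
    T = len(y)
    return [sum(y[t] * cmath.exp(-2j * math.pi * j * t / T) for t in range(T))
            for j in range(T)]

def smoothed_cross_spectra(series, sigma=6.0, half=18):
    """Cross-periodogram matrices smoothed with a Gaussian spectral window
    exp(-k^2 / (2 sigma^2)), renormalized over its truncated support and
    wrapped circularly in frequency."""
    T = len(series[0])
    d = [dft(y) for y in series]
    w = [math.exp(-k * k / (2.0 * sigma ** 2)) for k in range(-half, half + 1)]
    norm = sum(w)
    w = [x / norm for x in w]
    M = len(series)
    f = []
    for j in range(T):
        f.append([[sum(w[k + half] * d[a][(j + k) % T] *
                       d[b][(j + k) % T].conjugate() / T
                       for k in range(-half, half + 1))
                   for b in range(M)] for a in range(M)])
    return f

def log_det(mat, idx):
    """log-determinant of the 1x1 or 2x2 Hermitian sub-matrix on `idx`."""
    if len(idx) == 1:
        return math.log(mat[idx[0]][idx[0]].real)
    a, b = idx
    return math.log(mat[a][a].real * mat[b][b].real - abs(mat[a][b]) ** 2)

def aic(structure, f, t_eff):
    """Discrete SBN score: (1/2) sum_k sum_j log(|f_{X_k,pi_k}| / |f_{pi_k}|)
    plus the penalty q_S = T* sum_k (2 |pi_k| + 1); constants are dropped."""
    score = 0.0
    for node, parents in structure.items():
        for mat in f:
            score += 0.5 * log_det(mat, sorted({node} | set(parents)))
            if parents:
                score -= 0.5 * log_det(mat, sorted(parents))
        score += t_eff * (2 * len(parents) + 1)
    return score

# Two simulated regions with a genuine contemporaneous influence x1 -> x2.
random.seed(2)
T = 256
x1 = [random.gauss(0.0, 1.0) for _ in range(T)]
x2 = [0.8 * a + random.gauss(0.0, 1.0) for a in x1]
f = smoothed_cross_spectra([x1, x2])
aic_link = aic({0: [], 1: [0]}, f, t_eff=7)   # structure containing x1 -> x2
aic_none = aic({0: [], 1: []}, f, t_eff=7)    # empty structure
```

In this toy run the structure containing the true link attains the smaller score, so minimizing the criterion recovers the connection; the coherence between the two series lowers the log-determinant ratio for the linked node by more than the extra penalty term costs.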
