Article
Selection Policy-Induced Reduction Mappings for Boolean Networks
Dept. of Veterinary Physiol. & Pharmacology, Texas A&M Univ., College Station, TX, USA
IEEE Transactions on Signal Processing (Impact Factor: 2.79). 10/2010; 58(9):4871-4882. DOI: 10.1109/TSP.2010.2050314. Source: IEEE Xplore
Selection Policy-Induced Reduction Mappings for
Boolean Networks
Ivan Ivanov, Plamen Simeonov, Noushin Ghaffari, Xiaoning Qian, and Edward R. Dougherty
Abstract—Developing computational models paves the way to understanding, predicting, and influencing the long-term behavior of genomic regulatory systems. However, several major challenges have to be addressed before such models are successfully applied in practice. Their inherent high complexity requires strategies for complexity reduction. Reducing the complexity of the model by removing genes and interpreting them as latent variables leads to the problem of selecting which states and their corresponding transitions best account for the presence of such latent variables. We use the Boolean network (BN) model to develop a general framework for selection and reduction of the model's complexity via designating some of the model's variables as latent ones. We also study the effects of the selection policies on the steady-state distribution and the controllability of the model.
Index Terms—Compression, control, gene regulatory networks,
selection policy.
I. INTRODUCTION
A fundamental goal of gene regulatory modeling is to derive and study intervention strategies for the purpose of beneficially altering the dynamic behavior of the underlying biological system [1]. Owing to the inherent computational burden of optimal control methods [2], the extreme complexity, which grows exponentially with network size, creates an impediment to the design of optimal control policies for large gene regulatory networks. The computational burden can be mitigated in different ways, for instance, by using an approximation to the optimal policy [3] or designing greedy algorithms that do not involve optimization relative to a cost function [4], [5], but even approximation and greedy-control methods can only work with networks that are still relatively small. Another approach, the one considered here, is to reduce the network.
Manuscript received September 15, 2009; accepted April 23, 2010. Date of publication May 17, 2010; date of current version August 11, 2010. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Yufei Huang. The work presented in the paper was partially supported by the NSF Grant CCF-0514644.
I. Ivanov is with the Department of Veterinary Physiology and Pharmacology, Texas A&M University, College Station, TX 77843 USA (e-mail: iivanov@cvm.tamu.edu).
P. Simeonov is with the Department of Mathematics, University of Houston-Downtown, Houston, TX 77002 USA (e-mail: SimeonovP@uhd.edu).
N. Ghaffari is with the Department of Electrical and Computer Engineering, and the Department of Statistics, Texas A&M University, College Station, TX 77843 USA (e-mail: nghaffari@tamu.edu).
X. Qian is with the Department of Computer Science and Engineering, University of South Florida, Tampa, FL 33620 USA (e-mail: xqian@cse.usf.edu).
E. R. Dougherty is with the Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843 USA. He is also with the Translational Genomics Research Institute (TGEN), Phoenix, AZ 85004 USA (e-mail: edward@ece.tamu.edu).
Digital Object Identifier 10.1109/TSP.2010.2050314
The Boolean network (BN) model [6] and the probabilistic Boolean network (PBN) model [7], which in its binary form is essentially composed of a collection of constituent Boolean networks connected via a probability structure, have played key roles in the study of gene regulatory systems, in particular with regard to regulatory intervention. These models are especially useful when there is evidence of switch-like behavior. In addition, the dynamical behavior of the PBN model is described by the well-developed theory of Markov chains and their associated transition probability matrices, which allows for a rigorous mathematical treatment of optimal regulatory intervention. To address the issue of changing the long-run behavior, stochastic control has been employed to find stationary control policies that affect the steady-state distribution of a PBN [8]; however, the algorithms used to find these solutions have complexity which increases exponentially with the number of genes in the network. Hence, there is a need for size-reducing mappings that produce more tractable models whose dynamical behavior and stationary control policies are "close" to those of the original model, the key issue here being closeness, in whatever way that is characterized.
One way to reduce a network is to delete genes. The first mapping of this kind was introduced in [9]. It preserves the probability structure of a given PBN; however, the number of possible constituent Boolean networks in the reduced model can increase exponentially compared to the original PBN, thereby increasing complexity of a different kind and making the biological interpretation of the reduced PBN problematic.
The reduction mapping proposed in [10] addresses this issue by mapping a given PBN onto a set of candidate PBNs without increasing the number of constituent networks in the original model. The removed gene is considered as a latent variable which induces a specific "collapsing" of pairs of states from the state space of the original PBN and induces a selection of their successor states based on the steady-state distribution of the network. The "collapsing" procedure represents a situation in which a gene is not observable, which implies that states of the regulatory system that differ only in the expression of that gene become identical and thus "collapse" into each other. The notion of collapsing is both natural and general. If $x$ and $\tilde{x}$ are two states in the network that differ only in the value of gene $g$, and gene $g$ is to be deleted, then these states will be identified as a single state in the reduced network, and the question becomes how to treat them within the functional structure of the network.
This paper introduces a general framework for the construction of reduction mappings based on the "collapsing" heuristic
1053-587X/$26.00 © 2010 IEEE
Authorized licensed use limited to: University of South Florida. Downloaded on August 10,2010 at 18:36:18 UTC from IEEE Xplore. Restrictions apply.
by utilizing the concept of a selection policy, of which the reduction mapping of [10] is a special case. It studies the effects of selection policies on the steady-state distribution and on stationary control, in particular, the mean-first-passage-time (MFPT) control policy [4]. Since a binary PBN is a collection of Boolean networks with perturbation (BNps), we treat selection policies within the framework of Boolean networks with perturbation. By reducing each constituent BN for the same gene, in effect, we reduce the PBN.
II. BACKGROUND
A. Boolean and Probabilistic Boolean Networks
A BN with perturbation $p$ (BNp) on $n$ genes is defined by a set of nodes $V=\{x_1,\dots,x_n\}$ and a vector of Boolean functions $\mathbf{f}=(f_1,\dots,f_n)$. The variable $x_i$ represents the expression level of gene $g_i$, with 1 representing high and 0 representing low expression. The vector $\mathbf{f}$ represents the regulatory rules between genes. At every time step, the value of a gene $x_i$ is predicted by the values of a set, $W_i$, of genes at the previous time step, based on the regulatory function $f_i$. The set of genes $W_i$ is called the predictor set and the function $f_i$ is called the predictor function of $x_i$. A state of the BNp is a vector $x=(x_1,\dots,x_n)\in\{0,1\}^n$, and the state space $S$ of the BNp is the collection of all $2^n$ states of the network. The perturbation parameter $p$ models random gene mutations, i.e., at each time point there is a probability $p$ of any gene changing its value uniformly randomly. Since there are $n$ genes, the probability of a random perturbation occurring at any time point is $1-(1-p)^n$. At any time point $t$, the state of the network is given by $x(t)=(x_1(t),\dots,x_n(t))$, and $x(t)$ is called a gene activity profile (GAP). The dynamics of a BNp are completely described by its transition probability matrix $P=[p_{xy}]$, where $p_{xy}$ is the probability of the underlying Markov chain undergoing the transition from the state $x$ to the state $y$. The perturbation probability makes the chain ergodic and therefore it possesses a steady-state probability distribution $\pi$.
Computing the elements of $P$ is straightforward. We elect to present it here because of its importance in the subsequent considerations. When computing the transition probabilities for a BNp one has to realize that at every time step one of two mutually exclusive events happens: either the chain transitions according to the regulatory rules $\mathbf{f}$ or a perturbation occurs. This interpretation implies that when no perturbation occurs the network regulatory rules are applied. There are two important cases in computing $p_{xy}$ for every given state $x$. The first case is when $x$ is a singleton attractor, i.e., $\mathbf{f}(x)=x$. In this case, $p_{xx}=(1-p)^n$ and $p_{xy}=p^{h(x,y)}(1-p)^{n-h(x,y)}$ for $y\neq x$, where $h(x,y)$ is the number of the positions where the binary representations of $x$ and $y$ differ from each other. The second case is when $\mathbf{f}(x)=y$, where $y\neq x$. In this case, $p_{xx}=0$ and $p_{xz}=p^{h(x,z)}(1-p)^{n-h(x,z)}$, for any $z\neq x,y$. The transition from $x$ to $y$ can happen by either applying the regulatory rules $\mathbf{f}$ with a probability of $(1-p)^n$ or by perturbation with a probability of $p^{h(x,y)}(1-p)^{n-h(x,y)}$. In summary,

$$p_{xy}=(1-p)^n\,\mathbf{1}_{\{\mathbf{f}(x)=y\}}+p^{h(x,y)}(1-p)^{n-h(x,y)}\,\mathbf{1}_{\{x\neq y\}}\qquad(1)$$

where $\mathbf{1}_{\{\mathbf{f}(x)=y\}}$ is the indicator function that takes value 1 if $\mathbf{f}(x)=y$ according to the truth table and is equal to 0 otherwise.
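As a concrete illustration of (1), the transition matrix of a small BNp can be assembled directly from a truth table. The following Python/NumPy sketch does this for a 3-gene network; the particular regulatory rule and the value of $p$ are arbitrary choices for illustration, not taken from the paper:

```python
import itertools

import numpy as np

def bnp_transition_matrix(f, n, p):
    """Transition matrix of a BN with perturbation, per Eq. (1):
    p_xy = (1-p)^n * 1{f(x)=y} + p^h(x,y) * (1-p)^(n-h(x,y)) * 1{x != y}."""
    states = list(itertools.product([0, 1], repeat=n))
    idx = {s: i for i, s in enumerate(states)}
    P = np.zeros((2 ** n, 2 ** n))
    for x in states:
        for y in states:
            h = sum(a != b for a, b in zip(x, y))  # Hamming distance h(x, y)
            if f[x] == y:                # transition by the regulatory rules
                P[idx[x], idx[y]] += (1 - p) ** n
            if x != y:                   # transition by random perturbation
                P[idx[x], idx[y]] += p ** h * (1 - p) ** (n - h)
    return P

# Hypothetical truth table on 3 genes: each state maps to a successor state.
n, p = 3, 0.01
states = list(itertools.product([0, 1], repeat=n))
f = {x: (x[1], x[2], x[0] and x[1]) for x in states}  # an arbitrary rule
P = bnp_transition_matrix(f, n, p)
```

Every row of $P$ sums to 1, and a singleton attractor such as $(0,0,0)$ under this rule has $p_{xx}=(1-p)^n$, matching the first case above.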
A probabilistic Boolean network (PBN) consists of a set of nodes/genes $V=\{x_1,\dots,x_n\}$, $x_i\in\{0,\dots,d-1\}$, and a set of vector-valued network functions, $\mathbf{f}_1,\dots,\mathbf{f}_r$, governing the state transitions of the genes, each network function being of the form $\mathbf{f}_k=(f_{k1},\dots,f_{kn})$, where $f_{ki}:\{0,\dots,d-1\}^n\to\{0,\dots,d-1\}$, $i=1,\dots,n$ [7]. In most applications, the discretization is either binary or ternary. Here we use binary, $d=2$. At each time point a random decision is made as to whether to switch the network function for the next transition, with the probability $q$ of a switch being a system parameter. If a decision is made to switch the network function, then a new function is chosen from among $\mathbf{f}_1,\dots,\mathbf{f}_r$, with the probability of choosing $\mathbf{f}_k$ being the selection probability $c_k$ (note that network selection is not conditioned by the current network and the current network can be selected). Each network function $\mathbf{f}_k$ determines a BN, the individual BNs being called the contexts of the PBN. The PBN behaves as a fixed BN until a random decision (with probability $q$) is made to switch contexts according to the probabilities $c_1,\dots,c_r$ from among $\mathbf{f}_1,\dots,\mathbf{f}_r$. If $q=1$, then the PBN is said to be instantaneously random; if $q<1$ [11], then the PBN is said to be context-sensitive. Our interest is with context-sensitive PBNs with perturbation, meaning that at each time point there is a probability $p$ of any gene flipping its value uniformly randomly. Excluding the selection-probability structure, a context-sensitive PBN is a collection of BNps, the contexts of the PBN. It is in this sense that by considering reduction mappings for BNps we are ipso facto considering reduction mappings for PBNs.
B. Effect of Rank-One Perturbations on the Steady-State Distribution
When studying the network reduction, we will need to characterize the changes in the steady-state distribution resulting from certain kinds of changes in the Markov chain, the so-called rank-one perturbations (not to be confused with the notion of perturbation in BNp transitions). This problem has been studied in the framework of structural intervention in gene regulatory networks [12] and more generally in the framework of Markov chain perturbation theory [13].

Letting $\tilde{P}$ and $\tilde{\pi}$ denote the transition probability matrix and steady-state distribution for the perturbed BNp, we have $\pi^TP=\pi^T$ and $\tilde{\pi}^T\tilde{P}=\tilde{\pi}^T$, where $T$ denotes transpose. Analytical expressions for the steady-state distribution change can be obtained using the fundamental matrix, which exists for any ergodic Markov chain and is given by $Z=(I-P+\mathbf{e}\pi^T)^{-1}$, where $\mathbf{e}$ is a column vector whose components are all unity [14]. Letting $E=\tilde{P}-P$, the steady-state distribution change is $\tilde{\pi}^T-\pi^T=\tilde{\pi}^TEZ$.

For a rank-one perturbation, the perturbed Markov chain has the transition probability matrix $\tilde{P}=P+\mathbf{a}\mathbf{b}^T$, where $\mathbf{a},\mathbf{b}$ are two arbitrary vectors satisfying $\mathbf{b}^T\mathbf{e}=0$, and $\mathbf{a}\mathbf{b}^T$ represents a
rank-one perturbation of the original transition probability matrix $P$. In this case, it can be shown that [12]

$$\tilde{\pi}^T=\pi^T+\frac{(\pi^T\mathbf{a})\,\mathbf{b}^TZ}{1-\mathbf{b}^TZ\mathbf{a}}\qquad(2)$$

An important special case occurs when the transition mechanisms before and after perturbation differ only in one state, say the $k$th state. Then $E=\mathbf{e}_k\mathbf{b}^T$ has nonzero values only in its $k$th row, where $\mathbf{e}_k$ is the coordinate vector with a 1 in the $k$th position and 0s elsewhere. Substituting this into (2) yields

$$\tilde{\pi}_j=\pi_j+\frac{\pi_k(\mathbf{b}^TZ)_j}{1-(\mathbf{b}^TZ)_k}\qquad(3)$$

where $\pi_k$, $(\mathbf{b}^TZ)_j$, and $(\mathbf{b}^TZ)_k$ are the $k$th and $j$th coordinates of the respective vectors. For the $k$th state,

$$\tilde{\pi}_k=\frac{\pi_k}{1-(\mathbf{b}^TZ)_k}\qquad(4)$$

The results for these special cases can be extended to arbitrary types of perturbations so that it is possible to compute the steady-state distributions of arbitrarily perturbed Markov chains in an iterative fashion [12].
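The rank-one update formulas above can be checked numerically. The Python/NumPy sketch below uses a random stochastic matrix as a stand-in chain, perturbs a single row, and compares the prediction of (3) against a direct recomputation of the steady state; the matrix, row index, and perturbation vector are all arbitrary choices:

```python
import numpy as np

def steady_state(P):
    """Stationary distribution: solve pi^T P = pi^T with sum(pi) = 1."""
    m = P.shape[0]
    A = np.vstack([P.T - np.eye(m), np.ones(m)])
    b = np.zeros(m + 1)
    b[-1] = 1.0
    return np.linalg.lstsq(A, b, rcond=None)[0]

rng = np.random.default_rng(0)
m = 8
P = rng.random((m, m)) + 0.5               # entries bounded away from 0
P /= P.sum(axis=1, keepdims=True)
pi = steady_state(P)
e = np.ones(m)
Z = np.linalg.inv(np.eye(m) - P + np.outer(e, pi))  # fundamental matrix

# Rank-one perturbation confined to row k: P~ = P + e_k b^T with b^T e = 0.
k = 3
b = rng.random(m)
b = 0.05 * (b - b.mean())                  # row sum preserved, P~ stays valid
P_tilde = P.copy()
P_tilde[k] += b

bZ = b @ Z
pi_pred = pi + pi[k] * bZ / (1.0 - bZ[k])  # Eq. (3); at j = k this is Eq. (4)
pi_direct = steady_state(P_tilde)
```

Here `pi_pred` matches `pi_direct` to numerical precision, and its $k$th coordinate equals $\pi_k/(1-(\mathbf{b}^TZ)_k)$ as in (4).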
C. MFPT Control Policy
The problem of optimal intervention is usually formulated as an optimal stochastic control problem [2]. We focus on treatments that specify the intervention on a single control gene $g_c$. A control policy $\mu$ based on $g_c$ is a sequence of decision rules $\mu_t:S\to\{0,1\}$ for each time step $t$. The values 0/1 are interpreted as off/on for the application of the control.

The MFPT algorithm is based on the comparison between the MFPTs of a state $x$ and its flipped (with respect to $g_c$) state $\tilde{x}$ [4]. When considering therapeutic intervention, the state space can be partitioned into desirable states $D$ and undesirable states $U$ according to the expression values of a given set of genes. For simplicity we will assume that this set is a singleton, i.e., only one gene determines the partition. The intuition behind the MFPT algorithm is that, given the control gene $g_c$, when a desirable state is reached on average faster from $\tilde{x}$ than from $x$, it is reasonable to apply control and start the next network transition from $\tilde{x}$. The roles of $x$ and $\tilde{x}$ are reversed when the opposite holds. Without loss of generality one can assume that the partition-determining gene is the leftmost gene in the state's binary representation, i.e., $D=\{x:x_1=0\}$ and $U=\{x:x_1=1\}$, so that the desirable states correspond to the value $x_1=0$. With this assumption, the probability transition matrix $P$ of the Markov chain can be written as

$$P=\begin{pmatrix}P_{DD}&P_{DU}\\P_{UD}&P_{UU}\end{pmatrix}\qquad(5)$$

Using this representation one can compute the mean first-passage times $K_U$ and $K_D$ by solving the following systems of linear equations [15]:

$$(I-P_{DD})K_U=\mathbf{e}\qquad(6)$$

$$(I-P_{UU})K_D=\mathbf{e}\qquad(7)$$

where $\mathbf{e}$ denotes a unit vector of the appropriate length. The vectors $K_U$ and $K_D$ contain the MFPTs from each state in $D$ to the set $U$, and from each state in $U$ to the set $D$, respectively. The MFPT algorithm designs stationary (time-independent) control policies $\mu_g$ for each gene $g$ in the network by comparing the coordinate differences $K_U(\tilde{x})-K_U(x)$ and $K_D(\tilde{x})-K_D(x)$ to $\gamma$, a tuning parameter that can be set to a higher value when the ratio of the cost of control to the cost of the undesirable states is higher, the intent being to apply the control less frequently.
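The mean-first-passage-time computation amounts to two small linear solves, one per diagonal block of the partitioned matrix. The Python/NumPy sketch below is a simplified rendering of the idea, not the exact algorithm of [4]: it assumes the partition gene is the most significant bit, takes the control gene to be the least significant bit (so the flipped state of `x` is `x ^ 1`), and uses an illustrative comparison rule with a random stochastic matrix standing in for a BNp:

```python
import numpy as np

def mfpt_policy(P, n, gamma):
    """MFPT-based stationary control policy (illustrative sketch).

    Assumptions (not the paper's exact algorithm): desirable states D are
    those with most significant bit 0 (indices 0 .. 2**(n-1)-1), the control
    gene is the least significant bit, and control is applied when the
    flipped state has a better first-passage profile by more than gamma.
    """
    m = 2 ** (n - 1)
    PDD, PUU = P[:m, :m], P[m:, m:]
    e = np.ones(m)
    K_U = np.linalg.solve(np.eye(m) - PDD, e)  # Eq. (6): MFPTs from D to U
    K_D = np.linalg.solve(np.eye(m) - PUU, e)  # Eq. (7): MFPTs from U to D
    mu = np.zeros(2 ** n, dtype=int)
    for x in range(2 ** n):
        if x < m:   # desirable: flip if the flipped state leaves D more slowly
            mu[x] = int(K_U[x ^ 1] - K_U[x] > gamma)
        else:       # undesirable: flip if the flipped state reaches D faster
            u = x - m
            mu[x] = int(K_D[u] - K_D[u ^ 1] > gamma)
    return K_U, K_D, mu

# Demo on a random 3-gene chain (stand-in for a BNp transition matrix).
rng = np.random.default_rng(1)
n = 3
P = rng.random((2 ** n, 2 ** n))
P /= P.sum(axis=1, keepdims=True)
K_U, K_D, mu = mfpt_policy(P, n, gamma=0.0)
```

Because each MFPT vector satisfies $K=\mathbf{e}+P_{\cdot\cdot}K$, every entry is at least 1, and with $\gamma=0$ at most one state of each flipped pair receives the control action.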
III. SELECTION POLICIES AND REDUCTION MAPPINGS
To motivate the definition of a selection policy, we consider a specific reduction mapping designed using information about the steady-state distribution of the network [10]. That mapping is constructed assuming that the gene $g_i$ is to be 'deleted' from the network. For every state $x\in S$, let $\tilde{x}$ denote the state that differs from $x$ only in the value of gene $g_i$. Next, for every such pair $x$ and $\tilde{x}$, we consider their successor states $\mathbf{f}(x)$ and $\mathbf{f}(\tilde{x})$. Following deletion, $g_i$ becomes a latent variable and, under the reduction mapping, the states $x$ and $\tilde{x}$ 'collapse' to a state $z$ in the reduced network. The state $z$ is obtained from either $x$ or $\tilde{x}$ by removing their $i$th coordinate. The reduction mapping, denoted by $\Delta_\pi$, constructs the truth table of the reduced network by selecting the successor of $z$ to be $\mathbf{f}(x)$ (with its $i$th coordinate removed) if $\pi(x)\geq\pi(\tilde{x})$, or $\mathbf{f}(\tilde{x})$ otherwise. This particular type of reduction is a special case of a reduction mapping $\Delta_s$ induced by a selection policy $s$, which we define next.

Definition 1: A selection policy $s$ corresponding to the deleted gene $g_i$ is a $2^n$-dimensional vector, $s\in\{0,1\}^{2^n}$, indexed by the states of the BNp and having components equal to 1 at exactly one of the positions corresponding to each pair $x$, $\tilde{x}$.

For each gene $g_i$ there are $2^{2^{n-1}}$ different selection policies. Using this definition, one can consider the reduction mapping $\Delta_s$ corresponding to the selection policy $s$. $\Delta_s$ constructs the truth table of the reduced network by selecting the successor of $z$ to be $\mathbf{f}(x)$ if $s(x)=1$, or $\mathbf{f}(\tilde{x})$ otherwise. Greater reduction of the network can be achieved by sequentially deleting genes in an iterative manner following the approach outlined in this section.

Each selection policy is obtained by the action of a corresponding selection matrix $M_s$ of dimension $2^n\times2^n$ on the column unit vector $\mathbf{e}$,

$$s=M_s\mathbf{e}\qquad(8)$$

where $M_s$ is obtained from the identity matrix $I_{2^n}$ by replacing the rows corresponding to the 0 entries in the selection policy $s$ with zero rows. The selection matrix serves as a right identity, $R_sM_s=R_s$, to the matrix $R_s$ obtained
from the selection matrix by deleting its zero rows. In what follows, we will focus on the selection policies that correspond to the case $i=n$. This could be achieved by permuting the gene indexes and does not restrict the generality of our considerations. For example, in the case of BNs on 3 genes, one of the possible $M_s$ matrices has the following format:

$$M_s=\mathrm{diag}(1,0,1,0,1,0,1,0)\qquad(9)$$

corresponding to the selection policy that selects the first state of each flipped pair. The $2^n\times2^{n-1}$ collapsing matrix $A$ is obtained from $M_s$ by first transposing it, then deleting the zero columns, and finally setting to 1, in each remaining column, the entry of the flipped partner of the selected state, so that each column of $A$ has exactly two unit entries, at the positions of a pair $x$, $\tilde{x}$. The above example of the matrix $M_s$ yields

$$A^T=\begin{pmatrix}1&1&0&0&0&0&0&0\\0&0&1&1&0&0&0&0\\0&0&0&0&1&1&0&0\\0&0&0&0&0&0&1&1\end{pmatrix}\qquad(10)$$
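These matrix constructions are mechanical. The Python/NumPy sketch below builds the selection matrix (an identity with unselected rows zeroed), the matrix obtained by deleting its zero rows, and a pair-collapsing matrix, all from a 0/1 selection-policy vector; it assumes the deleted gene is the last bit, so states $2k$ and $2k+1$ form the flipped pairs, and the example policy is arbitrary:

```python
import numpy as np

def selection_matrices(s):
    """From a 0/1 selection policy s (one 1 per flipped pair, pairs being
    states 2k and 2k+1), build:
      M -- selection matrix: identity with unselected rows zeroed,
      R -- M with its zero rows deleted (2^(n-1) x 2^n),
      A -- collapsing matrix: column k has 1s at rows 2k and 2k+1."""
    s = np.asarray(s)
    N = s.size
    M = np.diag(s).astype(float)
    R = np.eye(N)[s == 1]
    A = np.zeros((N, N // 2))
    for k in range(N // 2):
        A[2 * k, k] = A[2 * k + 1, k] = 1.0
    return M, R, A

# Arbitrary example policy on 3 genes: select the even member of each pair.
s = np.array([1, 0, 1, 0, 1, 0, 1, 0])
M, R, A = selection_matrices(s)
```

Two properties from the text can be confirmed directly: $RM=R$ (the right-identity property), and $M$ applied to the all-ones vector recovers $s$ as in (8).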
We define the companion matrix $C_s$ for the reduction mapping $\Delta_s$ by

$$C_s=R_sPA\qquad(11)$$

The companion matrix $C_s$ defines a Markov chain that is closely related to the dynamics of the reduced network $\Delta_s(\mathrm{BNp})$. Its importance can be seen from the following theorem, which states that $C_s$ is asymptotically equal to the transition probability matrix $P_s$ of $\Delta_s(\mathrm{BNp})$.

Theorem 1: For every selection policy $s$,

$$\|C_s-P_s\|\leq p(1-p)^{n-1}$$

where $\|\cdot\|$ denotes the maximum element of the respective matrix.
Proof: Without loss of generality we set $i=n$. By the definition, each row of the companion matrix $C_s$ is obtained as follows. First, the appropriate rows from $P$ are selected according to the selection policy $s$. The other rows are discarded and the columns in the resulting $2^{n-1}\times2^n$ matrix are added in pairs indexed by the pairs of flipped states $x$ and $\tilde{x}$, thus forming the $2^{n-1}\times2^{n-1}$ companion matrix. Because the elements of the matrices $C_s$ and $P_s$ are indexed with the states of the reduced network, we compare them elementwise. Note that only rows in the original probability transition matrix $P$ indexed by states $x$ for which $s(x)=1$ contribute to the formation of both $C_s$ and $P_s$. Thus, for each state $x$ with $s(x)=1$ and collapsed state $z$, there are several cases to consider:

1. $\mathbf{f}(x)=x$.
In this case considerations similar to the ones in Section II show that $C_s(z,z)=p_{xx}+p_{x\tilde{x}}=(1-p)^n+p(1-p)^{n-1}$, which equals $(1-p)^{n-1}$. When $y\neq x$ and $y\neq\tilde{x}$ we have $\{h(x,y),h(x,\tilde{y})\}=\{h_r,h_r+1\}$ and

$$C_s(z,w)=p_{xy}+p_{x\tilde{y}}=p^{h_r}(1-p)^{n-h_r}+p^{h_r+1}(1-p)^{n-h_r-1}=p^{h_r}(1-p)^{n-1-h_r}$$

where $h$ is the Hamming distance between the states $x$ and $y$, cf. the definition of $h$ in Section II, and $w$ is the collapsed state of the pair $y$, $\tilde{y}$. Now, it is enough to notice that $P_s(z,z)=(1-p)^{n-1}$ and $P_s(z,w)=p^{h_r}(1-p)^{n-1-h_r}$, where $h_r$ is defined similarly to $h$ but for the reduced network $\Delta_s(\mathrm{BNp})$. This observation shows that $C_s(z,\cdot)=P_s(z,\cdot)$.

2. $\mathbf{f}(x)=\tilde{x}$.
In this case $p_{xx}=0$ and $p_{x\tilde{x}}=(1-p)^n+p(1-p)^{n-1}$, which gives

$$C_s(z,z)=(1-p)^{n-1}=P_s(z,z).$$

For the rest of the states the verification that $C_s(z,w)=P_s(z,w)$ can be done as in case 1.

3. $\mathbf{f}(x)=y$, $y\notin\{x,\tilde{x}\}$, and the pair $y$, $\tilde{y}$ collapses to $w\neq z$.
In this case $P_s(z,z)=0$ and $C_s(z,z)=p_{xx}+p_{x\tilde{x}}=p(1-p)^{n-1}$, while $P_s(z,w)=(1-p)^{n-1}+p^{h_r}(1-p)^{n-1-h_r}$ and $C_s(z,w)=(1-p)^n+p^{h_r}(1-p)^{n-1-h_r}$. Computations, similar to the previous two cases, show that

$$|C_s(z,z)-P_s(z,z)|=p(1-p)^{n-1},\qquad|P_s(z,w)-C_s(z,w)|=(1-p)^{n-1}-(1-p)^n.$$

Finally, for states $v\neq z,w$, the computations are the same as in the previous case and show that $C_s(z,v)=P_s(z,v)$.

The above considerations show that each row of the companion matrix $C_s$ could possibly differ from the corresponding row in the transition probability matrix $P_s$ in only two entries, one of them perturbed by a term that equals $p(1-p)^{n-1}$ and the other one by a term that equals $(1-p)^{n-1}-(1-p)^n$. Both quantities tend to 0 as $n\to\infty$. The norm of the matrix difference from the statement of the theorem is proportional to the larger of these two numbers, and thus we obtain the conclusion of the theorem.
An immediate corollary of the theorem is that, if all of the states in a BNp are either singleton attractors or part of attractor cycles of the type $x\to\tilde{x}\to x$, then the transition probability matrix $P_s$ for the reduced network is identical to the companion matrix for every reduction mapping induced by a selection policy. The salient observation following from the theorem is that for large networks ($n\to\infty$) or for networks with a high probability of perturbation $p$, $C_s$ is, up to a very small perturbation, identical to $P_s$.
In the next two sections, we use the companion matrix $C_s$ to study the effects of selection-policy-induced reduction mappings on both the steady-state distribution of the original network and a specific type of stationary control policy, the MFPT control policy. The issue of the effects of a reduction
on the steady-state distribution is different from the issue of its effects on a stationary control policy for the network; however, the reduction effects on the steady-state distribution and the control policy can both be used as measures of the goodness of reduction mappings.
IV. EFFECT OF REDUCTION ON THE STEADY-STATE DISTRIBUTION
In this section we use the companion matrix $C_s$ for the reduction mapping $\Delta_s$ and results from Markov chain perturbation theory to study the effects of the reduction on the steady-state distribution of the BNp. Similarly to the previous section, one can assume without loss of generality that the gene $g_1$ determines the partition of the state space into desirable and undesirable states, and that $g_n$ is the gene to be 'deleted' from the network.

Since $0<p<1$ in (1), all the entries in $P$ have two terms, $(1-p)^n\mathbf{1}_{\{\mathbf{f}(x)=y\}}$ and $p^{h(x,y)}(1-p)^{n-h(x,y)}\mathbf{1}_{\{x\neq y\}}$. We note that the first term is the multiplication of the probability $(1-p)^n$ of no random flipping with the indicator $\mathbf{1}_{\{\mathbf{f}(x)=y\}}$ determined by the Boolean functions of the BNp. We denote $B_{xy}=(1-p)^n\mathbf{1}_{\{\mathbf{f}(x)=y\}}$. The second term is determined by the perturbation parameter $p$ and the Hamming distance between $x$ and $y$. We denote it by $Q_{xy}=p^{h(x,y)}(1-p)^{n-h(x,y)}\mathbf{1}_{\{x\neq y\}}$. Hence, $p_{xy}=B_{xy}+Q_{xy}$ for any entry in the transition matrix $P$. Thus, we can write $P=B+Q$. Each entry in $B$ corresponds to the regulatory rules and each entry in $Q$ corresponds to random perturbation. Obviously, $Q$ is the same for all BNps with the same number of genes and the same value for the perturbation parameter $p$, and $B$ is determined by the regulatory rules, or Boolean functions, given in the BNp.

For a given selection policy $s$, define a new Markov chain by replacing the rows in $P$ that correspond to states $x$ with $s(x)=0$ by the rows of their flipped partners $\tilde{x}$, for which $s(\tilde{x})=1$, and by leaving the rows unchanged if $s(x)=1$. Similarly to $P$, the probability transition matrix $P'$ for this new Markov chain can be decomposed as $P'=B'+Q'$. For example, for a BNp on 3 genes under the assumption $i=n$, the rows of $P'$ (and hence of $B'$) indexed by a pair of flipped states are identical by construction.

The so-introduced new Markov chain is obtained from the original Markov chain representing the BNp by performing at most $2^{n-1}$ rank-one perturbations. Thus, following [12], one can compute the total change in the steady-state distribution caused by those perturbations. By the definition of the steady-state distribution, $\pi'^TP'=\pi'^T$. For each state $x$, we have

$$\pi'(x)=\pi'^TP'_{\cdot x}$$

where $P'_{\cdot x}$ denotes the $x$th column of $P'$. For the state $\tilde{x}$, we have the similar form

$$\pi'(\tilde{x})=\pi'^TP'_{\cdot\tilde{x}}$$

Note that $x$ and $\tilde{x}$ are flipped states and $P'_{\cdot x}$ and $P'_{\cdot\tilde{x}}$ are neighboring odd and even columns. Now, for every such pair of flipped states we compute

$$\pi'(x)+\pi'(\tilde{x})=\pi'^T(P'_{\cdot x}+P'_{\cdot\tilde{x}})=\pi'^T(B'_{\cdot x}+B'_{\cdot\tilde{x}})+\pi'^T(Q'_{\cdot x}+Q'_{\cdot\tilde{x}}).$$

Recall that $P'=B'+Q'$ and that, when $s(z)=1$, the neighboring odd and even rows in $B'$ are identical, as noted above. We have

$$\pi'^T(B'_{\cdot x}+B'_{\cdot\tilde{x}})=\sum_{z:\,s(z)=1}(\pi'(z)+\pi'(\tilde{z}))(B'_{zx}+B'_{z\tilde{x}}).$$

Based on the definition of $Q$, and using Hamming distances, it is straightforward to see that

$$\pi'^T(Q'_{\cdot x}+Q'_{\cdot\tilde{x}})=\sum_{z:\,s(z)=1}(\pi'(z)+\pi'(\tilde{z}))(Q'_{zx}+Q'_{z\tilde{x}}).$$

Hence, the fact that the neighboring odd and even rows in $B'$ are identical and the fact that, for an arbitrary pair of neighboring odd and even states $z$ and $\tilde{z}$, the Hamming distances between $z,x$ and $\tilde{z},\tilde{x}$ are equal to each other give us

$$\pi'(x)+\pi'(\tilde{x})=\sum_{z:\,s(z)=1}(\pi'(z)+\pi'(\tilde{z}))(p'_{zx}+p'_{z\tilde{x}}).$$

We denote $\sigma(w)=\pi'(x)+\pi'(\tilde{x})$ for every pair $x$, $\tilde{x}$ collapsing to the reduced state $w$. The procedure followed here leads to the steady-state distribution for the companion matrix $C_s$. Thus

$$\sigma^TC_s=\sigma^T.$$

This observation, when combined with the Markov chain perturbation theory and Theorem 1, shows that the effects of a selection policy $s$ on the steady-state distribution of a given BNp
can be estimated approximately by comparing the $2^{n-1}$-dimensional vector with coordinates equal to the appropriate sums $\pi(x)+\pi(\tilde{x})$ to the steady-state distribution $\sigma$ for the companion matrix $C_s$. Indeed, according to Theorem 1 the steady-state distribution $\sigma$ of the Markov chain described by the companion matrix rapidly converges with $n\to\infty$ to the steady-state distribution $\pi_s$ of the reduced network. At the same time, this probability distribution can be obtained via the relation $\sigma(w)=\pi'(x)+\pi'(\tilde{x})$ from the steady-state distribution of the Markov chain described by the probability transition matrix $P'$. The latter is obtained from the probability transition matrix of the original BNp by means of at most $2^{n-1}$ rank-one perturbations. The discussion in Section II-B, and (4) in particular, shows how one can compute the effect on the steady-state distribution after each one of these perturbations. The total effect on the steady-state distribution produced by all of the perturbations that lead to the matrix $P'$ can be computed by iteratively applying (4). Thus, for every state $x$ in the original BNp one can compute the perturbation effect on the sums $\pi(x)+\pi(\tilde{x})$, which represent the proper collapsing of the steady state of the original network if one considers the deleted gene as a latent variable. This discussion shows that the differences $|\sigma(w)-(\pi(x)+\pi(\tilde{x}))|$ can serve as a good approximation to the shift in the steady-state distribution of the original BNp.
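The identity derived above, namely that the pair-sums of the steady state of the row-replaced chain form the steady state of the companion chain, is in effect a lumpability statement, and it can be verified numerically. In the Python/NumPy sketch below, a random stochastic matrix stands in for the BNp, each unselected row is overwritten by its flipped partner's row, and the collapsed distribution is compared with that of the lumped chain:

```python
import numpy as np

def steady_state(P):
    """Stationary distribution: solve pi^T P = pi^T with sum(pi) = 1."""
    m = P.shape[0]
    A = np.vstack([P.T - np.eye(m), np.ones(m)])
    b = np.zeros(m + 1)
    b[-1] = 1.0
    return np.linalg.lstsq(A, b, rcond=None)[0]

rng = np.random.default_rng(3)
N = 8
P = rng.random((N, N))
P /= P.sum(axis=1, keepdims=True)        # stand-in for a BNp transition matrix

# Selection policy: keep the even member of each flipped pair (2k, 2k+1);
# overwrite each unselected (odd) row with its partner's row.
P_prime = P.copy()
for k in range(N // 2):
    P_prime[2 * k + 1] = P_prime[2 * k]

# Companion chain: selected rows with each pair of flipped columns summed.
C = P_prime[::2].reshape(N // 2, N // 2, 2).sum(axis=2)

pi_prime = steady_state(P_prime)
sigma = pi_prime.reshape(-1, 2).sum(axis=1)   # collapsed pair sums
sigma_C = steady_state(C)
```

Because the two rows of each flipped pair in the modified chain are identical, the chain is lumpable with respect to the pair partition, and `sigma` agrees with `sigma_C` to numerical precision.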
V. EFFECT OF REDUCTION ON THE CONTROL POLICY
In this section we study the effects of the selection-policy-induced reduction mappings on control policies designed for a BNp. Since our goal is to arrive at a measure of performance for reduction mappings, we need to consider a control policy whose mathematical formulation can be related to selection policies. As we will see, this requirement is met by the MFPT control policy.

For a selection policy $s$ and its corresponding reduction mapping $\Delta_s$ we can consider the MFPT stationary control policy on the reduced network. In this regard we assume without loss of generality that $g_1$ is the control gene and $g_n$ is the gene to be deleted from the network. Interpreting the deletion of gene $g_n$ as creation of a latent, or nonobservable, variable, it is desirable that the MFPT control policy $\mu_s$ for the reduced network is as close as possible to the one designed for the original network when both control policies have the same parameter $\gamma$. In this way, one can achieve similar control actions for every state $x$ and its corresponding reduced state $z$. Obviously, this is achieved only if $\mu(x)=\mu(\tilde{x})$. Thus, we arrive at the following definition which is formulated for the general case of stationary control policies.
Definition 2: Given a stationary control policy $\mu$ and a gene $g_n$ to be deleted from the BNp, the policy is called inconsistent at the state $x$ if and only if $\mu(x)\neq\mu(\tilde{x})$. The state $x$ is called inconsistent. The ratio of the number of inconsistent states to the total number of states in $S$ is called the relative inconsistency of the control policy $\mu$.
The pairs of inconsistent states in the original network present us with two possible options for defining the control action for the states in the reduced network obtained by collapsing those pairs into their respective reduced states. Thus, one should measure the effects of a selection-policy-induced reduction mapping by comparing the control actions for the subset of states that are not inconsistent to the control actions for their corresponding reduced states in the reduced network $\Delta_s(\mathrm{BNp})$. The next definition provides a relative measure of those effects.
Definition 3: Given a stationary control policy $\mu$ and a gene $g_n$ to be deleted from the BNp, the policy is called affected at the state $x$ if and only if the control action $\mu_s(z)$ for the reduced state $z$ is different from $\mu(x)$. The ratio of the number of states where the control policy $\mu$ is affected to the total number of states in $S$ is called the relative effect of the selection policy $s$ on the control policy $\mu$.
Because there is only a finite number of selection policies, there exists a selection policy that minimizes the relative effect among all possible selection policies on any given stationary control policy $\mu$ designed for a BNp.
Recall that the MFPT control policy $\mu$ is designed based on the comparisons of the differences $K_U(\tilde{x})-K_U(x)$ and $K_D(\tilde{x})-K_D(x)$ to $\gamma$. Thus, one can achieve the desired similarity between the MFPT control policies by minimizing the differences between the respective vectors of mean first-passage times for the states in the original and the reduced networks. However, the dimension of $K_U$ and $K_D$ is $2^{n-1}$ while the dimension of their counterparts for the reduced network is $2^{n-2}$, and one cannot simply subtract the vectors of mean first-passage times for the reduced network from their counterparts for the original network.

This problem can be circumvented by using the companion matrix. Because the sets of desirable and undesirable states are the same for the Markov chain corresponding to the reduced network and the one corresponding to $C_s$, one can consider equations similar to (6) and (7), namely

$$(I-C_{DD})K_U^C=\mathbf{e}\qquad(12)$$

$$(I-C_{UU})K_D^C=\mathbf{e}\qquad(13)$$

Consider a state $x\in D$ such that $s(x)=1$. Then, from (6) and (12), one gets (14), shown at the bottom of the page, where the index $j$ describes the coordinates of the vectors and the states $x$, $\tilde{x}$ collapse to the state $z$. The right-hand
(14)
side suggests that the proper vector to subtract from $K_U$ is the vector defined by the action of the collapsing matrix $A$ on the vector $K_U^C$, where, referring to (6), $P_{DD}$ is the upper left $2^{n-1}\times2^{n-1}$ submatrix of $P$, and $AK_U^C$ has dimension $2^{n-1}$ with identical elements for each pair of positions indexed by a pair of flipped states $x$, $\tilde{x}$. Thus, if one pairs every equation of type (14) with an artificially introduced identity corresponding to the state $\tilde{x}$ that was not selected by the policy,

$$\left(K_U-AK_U^C\right)(\tilde{x})=\left(K_U-AK_U^C\right)(\tilde{x})\qquad(15)$$

then one gets a linear system of $2^{n-1}$ equations. That system can be represented in a matrix format as

$$K_U-AK_U^C=G_U\left(K_U-AK_U^C\right)\qquad(16)$$

where the matrix $G_U$ is obtained from the matrix $P_{DD}$, cf. (6) and (7), by replacing the rows corresponding to 0 entries in the selection policy $s$ with the appropriate unit basis vectors $\mathbf{e}_{\tilde{x}}$. Similar considerations lead to

$$K_D-AK_D^C=G_D\left(K_D-AK_D^C\right)\qquad(17)$$

where the matrix $G_D$ is obtained from the matrix $P_{UU}$, cf. (6) and (7), by replacing the rows corresponding to 0 entries in the selection policy $s$ with the appropriate unit basis vectors $\mathbf{e}_{\tilde{x}}$, and with $P_{UU}$ being the lower right $2^{n-1}\times2^{n-1}$ submatrix of $P$. Equations (16) and (17) show that the differences $K_U-AK_U^C$ and $K_D-AK_D^C$ are eigenvectors corresponding to the eigenvalue 1 for the matrices $G_U$ and $G_D$, respectively.
Recall that there exists a selection policy that is optimal with regard to minimizing the relative effect among all possible selection policies on the stationary MFPT control policy $\mu$. Whereas for every candidate-for-deletion gene there is an optimal selection policy to reduce the network, finding that optimal selection policy requires finding the MFPT control policy for each one of the $2^{2^{n-1}}$ possible reduced networks. This procedure is highly complex because it requires finding the respective transition probability matrices for all of the reduced networks, then solving the corresponding systems of linear equations, e.g., (6) and (7), to find the vectors of mean first-passage times, and finally, executing the algorithm that determines the control policy for the reduced network.
Although the optimal selection policy does not necessarily minimize the differences $K_U-AK_U^C$ and $K_D-AK_D^C$, every selection policy $s$ that ensures close-to-0 coordinates for these vectors at the positions where its coordinates equal 1 is asymptotically (as $n\to\infty$) optimal, and Theorem 1 shows that the convergence is fast. The relations given by (16) and (17) show that the differences $K_U-AK_U^C$ and $K_D-AK_D^C$ belong to the linear subspaces spanned by the eigenvectors corresponding to the eigenvalue 1 of their respective matrices $G_U$ and $G_D$. Our interest is in finding the elements of those subspaces that have small coordinates at the positions that correspond to the places where the selection policy $s$ equals 1.
While we do not consider the question of characterizing those subspaces based on the structure of a given selection policy, we provide two theorems for estimating the probability of finding vectors that belong to the subspaces and satisfy a certain growth condition on their coordinates. These probabilities are proportional to either the surface area or the volume of certain portions of the unit ball under one of two metrics on the underlying space. Thus, we are interested in sets of vectors whose coordinates obey prescribed bounds; by adjusting the bounding parameters, one can estimate the probability of having the vectors of mean first-passage times of the original and the reduced networks close to each other at the coordinates where the selection policy equals 1. Here we only state the theorems, leaving the proofs to the Appendix.
Theorem 2: Consider the set . Then

Theorem 3: Consider the sets and . Then

where the functions appearing in the statements are the characteristic functions of the respective sets, and the projection is onto the subspace spanned by the corresponding coordinate vectors.

The first two quantities describe the area of the projection of the intersection between the positive coordinate cone and the surface of the unit ball in the two respective metrics. The projections are onto the subspace spanned by the coordinate vectors of interest.
Similarly, the remaining quantity describes the volume of the portion of the ball that is bounded by the surface and the positive coordinate cone. We focus on the positive coordinate cone because of the task at hand: estimating the closeness, in either norm, of the vectors of mean first-passage times for the original and the reduced networks. Taking into account all of the possible permutations of the coordinate vectors, one can see that the probability of finding vectors with the structure described in the statements of the theorems is proportional to the quantities computed in Theorems 2 and 3, respectively. These quantities rapidly tend to zero if some of the bounding parameters tend to zero. It is important to notice that the minimum, in either norm, of the coordinates of interest of the difference vectors depends only on the given network and the selection policy. At the same time, the relative effect of the selection policy on the control policy depends on the choice of the parameter related to the cost of control [4]. Smaller values of that parameter imply that control is applied more often, a larger controlled set, and a potential increase of the relative effect. Smaller values also imply that a brute-force search for vectors close enough to the mean-first-passage-time vectors is a hard problem.

Fig. 1. The 4-gene network has 256 possible selection policies. All of these selection policies and their SSD shifts toward the desirable states are shown in the figure. Sixteen of the 256 selection policies produce the maximum shift toward the desirable states and are identified by circles. Our heuristic selection policy is one of these 16 optimal selection policies.
Thus, there is a need for heuristic approaches that identify both a gene for deletion and a selection policy with a small relative effect on the control policy. It is also desirable that the reduced network share similar controllability properties with the original one, where controllability is understood in terms of the shift of the steady-state distribution toward the desirable states after the application of the MFPT control policy. The control policy developed in [16] employs such a heuristically chosen selection policy in the following scenario: (i) a gene is selected for deletion by a procedure involving the coefficient of determination (CoD) [17]; (ii) a reduction mapping using the heuristically chosen selection policy is applied; (iii) steps (i) and (ii) are repeated iteratively until a network of a predetermined (computationally feasible) size is reached; (iv) the MFPT control policy is derived for the reduced network; and (v) a control policy is induced on the original network from the policy derived for the reduced network.
VI. SIMULATION STUDY
We performed simulation studies to illustrate the new concepts developed in this paper. Specifically, we study the relationship between the relative effect of the selection policy on the MFPT control policy and the shift of the steady-state distribution of the network toward the desirable states.
The key concept in this paper is that of the selection policy, which determines the reduction mapping and, subsequently, the structure of the reduced network. Given the large number of possible selection policies, it is not feasible to search exhaustively for the optimal selection policy. The optimal selection policy minimizes the relative effect on a given stationary control policy; therefore, if that stationary control policy is changed, then the optimal selection policy could change as well. Thus, it is important to develop heuristics for determining suboptimal selection policies. The algorithm CoDReduce outlined in [16] designs such a policy. Fig. 1 shows the performance of that algorithm using a 4-gene network as an example: the designed suboptimal selection policy is among the several optimal selection policies. In our simulation study, we use CoDReduce to design the selection policy.

Fig. 2. The average SSD shift toward the desirable states and the relative effects on the control policies of successive reductions of two sets of 100 networks each. Each set has randomly generated attractors constrained to be evenly distributed between the desirable and undesirable states. At each step the MFPT control policy is designed and applied to the network. As the figure shows, the SSD shift obtained by applying the networks' own control policies is similar in the original and reduced networks. Also, the SSD-shift and relative-effect curves follow inverse patterns. (a) 9 genes. (b) 10 genes.
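For the 4-gene example of Fig. 1, the 256 selection policies can be enumerated directly. The following sketch assumes, purely for illustration, that the deleted gene is the least significant bit of the state index, so reduced state r collapses parent states 2r and 2r+1, and each policy bit picks which parent's (collapsed) transition probabilities survive:

```python
import itertools
import numpy as np

n_red = 8  # 2^(4-1) reduced states after deleting one of 4 genes
# One selection bit per reduced state -> 2^8 = 256 selection policies.
policies = list(itertools.product([0, 1], repeat=n_red))
assert len(policies) == 256  # matches the count in Fig. 1

def reduce_matrix(P, policy):
    """Policy-induced reduction of a 16x16 transition matrix: row r of
    the reduced matrix comes from the selected parent of r, with the
    probabilities of collapsed successor pairs summed."""
    P_red = np.zeros((n_red, n_red))
    for r, bit in enumerate(policy):
        parent = 2 * r + bit
        for s in range(n_red):
            P_red[r, s] = P[parent, 2 * s] + P[parent, 2 * s + 1]
    return P_red

rng = np.random.default_rng(1)
P = rng.random((16, 16))
P /= P.sum(axis=1, keepdims=True)  # stand-in row-stochastic matrix
P_red = reduce_matrix(P, policies[0])
```

An exhaustive search would evaluate the SSD shift for each of the 256 reduced matrices; the heuristic is designed to avoid exactly this loop.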
We have generated several sets, of 100 networks each, using the algorithm developed in [18] for different numbers of genes. The networks are randomly generated subject to only one constraint on their attractors: half of them are among the desirable states of the respective network. For each randomly generated network, also referred to as the original network in the simulation study, we design and apply the respective MFPT control policy. Then we compute the difference between the total mass of the desirable states in the SSD of the network before and after applying the MFPT control policy. We refer to this difference as the SSD shift (toward the desirable states). It is clear that more efficient stationary control policies will produce larger SSD shifts. For the next step in the simulations, we use CoDReduce to delete one gene from the network and find the MFPT control policy for the reduced network. We then calculate the relative effect of the selection policy on the MFPT control policy for the original network by comparing that control policy to the MFPT control policy for the reduced network. The reduction step and the calculation of the relative effect on the MFPT control policy are repeated iteratively, using the reduced network from the previous step as the "original" network for the new reduction step.
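The SSD shift used above can be computed from the stationary distributions of the uncontrolled and controlled chains; a minimal sketch with randomly generated stand-in matrices (the desirable set is illustrative):

```python
import numpy as np

def stationary(P):
    """Stationary distribution: solve pi P = pi with sum(pi) = 1 as an
    overdetermined least-squares system."""
    n = len(P)
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

rng = np.random.default_rng(2)
P = rng.random((8, 8)); P /= P.sum(axis=1, keepdims=True)                 # original chain
P_ctrl = rng.random((8, 8)); P_ctrl /= P_ctrl.sum(axis=1, keepdims=True)  # controlled chain

desirable = [0, 1, 2, 3]  # illustrative split of the state space
ssd_shift = stationary(P_ctrl)[desirable].sum() - stationary(P)[desirable].sum()
```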
Fig. 2 illustrates our findings: CoDReduce does not have a significant effect (on average) on the amount of SSD shift when a single gene is removed from the model. Similarly, there is a small change (on average) in the relative effect of the selection policy on the MFPT control policy when moving from a network to its reduced version. Thus, on average, our heuristic for performing selection-policy-based reduction does not have much of an impact on the controllability of the network when a single gene is removed from it. However, the effects of reduction tend to accumulate with the removal of more genes, and ultimately the reduction mappings produce poor results when very few genes remain in the network.

In general, the SSD shift and the relative effect tend to follow inverse trends. This is to be expected: a large relative effect on the control policy implies a significant difference between the control actions for the states of the larger network and those for their respective reduced states in the smaller network, which can ultimately lead to a significant difference in the shifts induced by those control policies in the steady-state distributions of the models.
A. Gastrointestinal Cancer Network Application
We have considered the effects of the reduction on a real-world, experiment-driven network generated using the gastrointestinal cancer data set from [19]. Before applying reduction, we must infer the gene regulatory network from the gene expression data and obtain the network model. In the preprocessing step, the gene expression data are normalized, filtered, and binarized using the methods from [20]. For the inference, we use the procedure from [16], which is a modified version of the seed algorithm proposed by Hashimoto et al. [21]. The algorithm is initialized with the C9orf65 gene as the seed gene, and the gene CXCL12 is chosen as the control gene. The seed gene is selected based on the fact that it is one of the two genes composing the best classifier in [19], and the control gene selection is based on its strong CoD [17] connection to the seed gene. The coefficient of determination (CoD) measures how a set of random variables improves the prediction of a target variable, relative to the best prediction in the absence of any conditioning observation [17]. Let $\mathbf{X} = (X_1, \ldots, X_k)$ be a vector of predictor binary random variables, $Y$ a binary target variable, and $f$ a Boolean function such that $f(\mathbf{X})$ predicts $Y$. The mean-square error of $f(\mathbf{X})$ as a predictor of $Y$ is the expected squared difference, $\varepsilon = E[(Y - f(\mathbf{X}))^2]$. Let $\varepsilon_{\mathrm{opt}}$ be the minimum MSE among all predictor functions $f$ for $Y$, and let $\varepsilon_0$ be the error of the best estimate of $Y$ without any predictors. The CoD is defined as

$$\mathrm{CoD} = \frac{\varepsilon_0 - \varepsilon_{\mathrm{opt}}}{\varepsilon_0}. \tag{18}$$
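For binary variables, the MSE of a {0,1}-valued predictor equals its misclassification probability, so the CoD in (18) can be estimated directly from sample counts. A sketch with illustrative data (the function and the data set are ours, not the paper's):

```python
from collections import Counter

def cod(samples):
    """CoD from samples of ((x1, ..., xk), y): compares the error of the
    best Boolean predictor of y from x with the best constant guess."""
    joint = Counter(samples)  # counts of (x-tuple, y); missing keys count 0
    total = sum(joint.values())
    p_y1 = sum(c for (x, y), c in joint.items() if y == 1) / total
    eps_const = min(p_y1, 1 - p_y1)  # best predictor ignoring x
    xs = {x for (x, y) in joint}
    eps_opt = sum(min(joint[(x, 0)], joint[(x, 1)]) for x in xs) / total
    return (eps_const - eps_opt) / eps_const if eps_const > 0 else 1.0

# y = x1 XOR x2, plus one noisy sample: the CoD should be close to 1.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)] * 10 + [((0, 0), 1)]
print(round(cod(data), 3))  # prints 0.95
```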
The network is grown by adding one gene at a time; at each iterative step the gene having the strongest connectivity, measured by the CoD, to one of the genes from the current network is added to the network. Then, the network is rewired taking
TABLE I
SSD SHIFT TOWARD THE DESIRABLE STATES IN GASTROINTESTINAL CANCER NETWORK
into account that a new gene in the network can change the way genes influence each other. The total number of genes in the final network is 17: C9orf65, CXCL12, TK1, SOCS2, THC2168366, SEC61B, ENST00000361295, KCNH2, ACTB, RPS18, RPS13, THC2199344, SNX26, RPL26, SLC20A1, RPS11, THC2210612, THC2161967, IER2 and LAMP1. This 17-gene network has a very large state space of size $2^{17}$. Exact computation of its SSD is infeasible; thus, the SSD is estimated by first running the network for a long time and then using the Kolmogorov-Smirnov test to decide whether the network has reached its steady state. Initially, the best gene for deletion is selected using CoDReduce [16] and the network is reduced by deleting one gene; CoDReduce is then applied consecutively to reduce the network down to 10 genes: C9orf65, CXCL12, RPS18, RPS13, THC2199344, SNX26, RPL26, SLC20A1, RPS11 and THC2210612. After reducing the original network down to 10 genes, the MFPT control policy for the reduced network is designed and then induced back on the original 17-gene network. The induction of the control policy is a necessary step, since the control policy designed on the reduced network is of smaller dimension than the original one. We use the method outlined in [16] for inducing the control policy designed on the 10-gene network onto the original 17-gene network. After deleting one gene, each state in the reduced network corresponds to two states in the original network, referred to here as "parent" states, that collapsed together. After designing the control policy and assigning the control actions to the states of the reduced network, the induction procedure duplicates the control action of each state in the reduced network as the control action for each of its parent states.
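The induction step described above can be sketched in a few lines; we assume, for illustration only, that the deleted gene is the least significant bit of the state index, so reduced state r has parent states 2r and 2r+1 in the original network:

```python
def induce_policy(reduced_policy):
    """Each reduced state's control action is duplicated onto both of
    its parent states in the original network."""
    full_policy = {}
    for r, action in reduced_policy.items():
        full_policy[2 * r] = action      # parent with deleted bit = 0
        full_policy[2 * r + 1] = action  # parent with deleted bit = 1
    return full_policy

reduced_policy = {0: 1, 1: 0, 2: 0, 3: 1}  # illustrative 2-gene policy
full = induce_policy(reduced_policy)
# full == {0: 1, 1: 1, 2: 0, 3: 0, 4: 0, 5: 0, 6: 1, 7: 1}
```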
Table I shows the total probability mass of the desirable and undesirable states before and after applying the induced control policy. As the table illustrates, there is about a 13% shift in the steady-state distribution of the network toward the desirable states.
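The steady-state detection used for the 17-gene network, namely running the chain for a long time and applying a Kolmogorov-Smirnov test, can be approximated by comparing the empirical distributions of two halves of the trajectory; the window split and the tolerance below are our assumptions, not the paper's:

```python
import numpy as np

def looks_stationary(trajectory, n_states, tol=0.02):
    """KS-style check: maximum gap between the empirical CDFs of the
    two halves of the trajectory; a small gap suggests steady state."""
    half = len(trajectory) // 2

    def ecdf(segment):
        freq = np.bincount(segment, minlength=n_states) / len(segment)
        return np.cumsum(freq)

    gap = np.max(np.abs(ecdf(trajectory[:half]) - ecdf(trajectory[half:])))
    return bool(gap < tol)

# A trajectory cycling through the same states looks stationary in this
# empirical sense, while an abrupt regime switch does not.
steady = looks_stationary(np.array([0, 1, 2, 3] * 1000), 4)
switch = looks_stationary(np.array([0] * 1000 + [1] * 1000), 4)
```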
VII. CONCLUSION
This paper presents a general compression strategy for Boolean network models of genomic regulation. The strategy is based on the concept of a selection policy, and the complexity of the model is reduced by consecutively removing genes from the network. The removed genes are treated as latent, or nonobservable, variables, and the truth tables for the reduced models are constructed based on the particular selection policy. The effects of the compression on both the dynamical behavior of the model and the MFPT control policy are discussed using the companion matrix of the selection policy. It is important to emphasize that while there is always an optimal selection policy that minimizes the effects of compression on the stationary control policy, the problem of finding it is a hard one. Thus, there is a need to develop suboptimal compression algorithms based on heuristic approaches.
APPENDIX
Proof (Theorem 2): The integral evaluation is easily verified by induction. In the base case, direct integral evaluation shows the claimed formula. Next, assuming that the statement of the theorem holds in the preceding case, one can compute as follows:

This completes the proof of Theorem 2.
Proof (Theorem 3): We use induction. First we establish the base case. We obtain

From the definition it is clear that

(19)

For every admissible index we have (with the empty sum being zero)

Thus,

(20)

Then, we should have

(21)
which is true by (19) and (20). Now we handle the general case. Define

(22)

Then

(23)

Definition (22) clearly implies

(24)

A relevant induction hypothesis is

(25)

This hypothesis is verified using (24). We set

(26)

We obtain

(27)

The last expression gives the right-hand side of (25) after a simple evaluation of the integral. The first formula of the theorem follows from (23) and (25).
It remains to evaluate the remaining quantity. From its definition and (21) we have

(28)

where

(29)

By the above definition, we obtain the recurrence relation

(30)

Furthermore, evaluation of the first few cases provides the following hypothesis:

(31)

To verify (31), we again use the substitution introduced in (26); then (30) yields (32). Now (31) follows from (26) and (32). From (28) and (31) we get the second formula in the statement of the theorem. This completes the proof of Theorem 3.
ACKNOWLEDGMENT
The authors want to thank B. Shekhtman for the inspiring
discussions.
REFERENCES
[1] A. Datta, R. Pal, A. Choudhary, and E. R. Dougherty, "Control approaches for probabilistic gene regulatory networks," IEEE Signal Process. Mag., vol. 24, no. 1, pp. 54-63, 2007.
[2] D. P. Bertsekas, Dynamic Programming and Optimal Control, 3rd ed. Belmont, MA: Athena Scientific, 2005.
[3] B. Faryabi, A. Datta, and E. R. Dougherty, "On approximate stochastic control in genetic regulatory networks," IET Syst. Biol., vol. 1, no. 6, pp. 361-368, 2007.
[4] G. Vahedi, B. Faryabi, J.-F. Chamberland, A. Datta, and E. R. Dougherty, "Intervention in gene regulatory networks via a stationary mean-first-passage-time control policy," IEEE Trans. Biomed. Eng., pp. 2319-2331, Oct. 2008.
[5] X. Qian, I. Ivanov, N. Ghaffari, and E. R. Dougherty, "Intervention in gene regulatory networks via greedy control policies based on long-run behavior," BMC Syst. Biol., 2009.
[6] S. A. Kauffman, "Metabolic stability and epigenesis in randomly constructed genetic nets," J. Theoret. Biol., vol. 22, pp. 437-467, 1969.
[7] I. Shmulevich, E. R. Dougherty, S. Kim, and W. Zhang, "Probabilistic Boolean networks: A rule-based uncertainty model for gene regulatory networks," Bioinformatics, vol. 18, no. 2, pp. 261-274, 2002.
[8] R. Pal, A. Datta, and E. R. Dougherty, "Robust intervention in probabilistic Boolean networks," IEEE Trans. Signal Process., vol. 56, no. 3, pp. 1280-1294, 2008.
[9] E. R. Dougherty and I. Shmulevich, "Mappings between probabilistic Boolean networks," Signal Process., vol. 83, no. 4, pp. 799-809, 2003.
[10] I. Ivanov and E. R. Dougherty, "Reduction mappings between probabilistic Boolean networks," EURASIP J. Appl. Signal Process., vol. 1, no. 1, pp. 125-131, 2004.
[11] M. Brun, E. R. Dougherty, and I. Shmulevich, "Steady-state probabilities for attractors in probabilistic Boolean networks," Signal Process., vol. 85, no. 10, pp. 1993-2013, 2005.
[12] X. Qian and E. R. Dougherty, "On the long-run sensitivity of probabilistic Boolean networks," J. Theoret. Biol., pp. 560-577, 2009.
[13] J. J. Hunter, "Stationary distributions and mean first passage times of perturbed Markov chains," Linear Algebra Appl., vol. 410, pp. 217-243, 2005.
[14] P. J. Schweitzer, "Perturbation theory and finite Markov chains," J. Appl. Probab., vol. 5, pp. 401-413, 1968.
[15] J. Norris, Markov Chains. Cambridge, U.K.: Cambridge Univ. Press, 1998.
[16] N. Ghaffari, I. Ivanov, X. Qian, and E. R. Dougherty, "A CoD-based reduction algorithm for designing stationary control policies on Boolean networks," Bioinformatics, vol. 26, pp. 1556-1563, 2010.
[17] E. R. Dougherty, S. Kim, and Y. Chen, "Coefficient of determination in nonlinear signal processing," Signal Process., vol. 80, pp. 2219-2235, 2000.
[18] R. Pal, I. Ivanov, A. Datta, and E. R. Dougherty, "Generating Boolean networks with a prescribed attractor structure," Bioinformatics, vol. 21, no. 21, pp. 4021-4025, Nov. 2005.
[19] N. D. Price, J. Trent, A. K. El-Naggar, D. Cogdell, E. Taylor, K. K. Hunt, R. E. Pollock, L. Hood, I. Shmulevich, and W. Zhang, "Highly accurate two-gene classifier for differentiating gastrointestinal stromal tumors and leiomyosarcomas," Proc. Nat. Acad. Sci., vol. 104, no. 9, pp. 3414-3419, 2007.
[20] I. Shmulevich and W. Zhang, "Binary analysis and optimization-based normalization of gene expression data," Bioinformatics, vol. 18, no. 4, pp. 555-565, 2002.
[21] R. Hashimoto, S. Kim, I. Shmulevich, W. Zhang, M. L. Bittner, and E. R. Dougherty, "A directed-graph algorithm to grow genetic regulatory subnetworks from seed genes based on strength of connection," Bioinformatics, vol. 20, no. 8, pp. 1241-1247, 2004.
Ivan Ivanov received the Ph.D. degree in mathematics from the University of South Florida, Tampa.

He is an Assistant Professor with the Department of Veterinary Physiology and Pharmacology, Texas A&M University, College Station, TX. His current research is focused on genomic signal processing and, in particular, on modeling the genomic regulatory mechanisms and on mappings reducing the complexity of the models of genomic regulatory networks.
Plamen Simeonov received the Ph.D. degree in mathematics from the University of South Florida, Tampa.

He is an Associate Professor of mathematics with the Department of Computer and Mathematical Sciences, University of Houston-Downtown, Houston, TX. His current research interests are in the areas of analysis, approximation theory, potential theory, orthogonal polynomials and special functions, computer-aided geometric design, and biostatistics.
Noushin Ghaffari received the M.Sc. degree in computer information systems from the University of Houston-Clear Lake, TX, in 2006.

Currently, she is pursuing the Ph.D. degree with the Department of Electrical and Computer Engineering, Texas A&M University, College Station. Her research interests include genomic signal processing, systems biology, and computational biology, especially complexity reduction and control of genetic regulatory networks.
Xiaoning Qian received the Ph.D. degree in electrical engineering from Yale University, New Haven, CT, in 2005.

Currently, he is an Assistant Professor with the Department of Computer Science and Engineering, University of South Florida, Tampa. He was with the Bioinformatics Training Program, Texas A&M University, College Station. His current research interests include computational biology, genomic signal processing, and biomedical image analysis.
Edward R. Dougherty received the M.S. degree in computer science from Stevens Institute of Technology, Hoboken, NJ, the Ph.D. degree in mathematics from Rutgers University, New Brunswick, NJ, and the Doctor Honoris Causa from the Tampere University of Technology, Finland.

He is a Professor with the Department of Electrical and Computer Engineering, Texas A&M University, College Station, where he holds the Robert M. Kennedy '26 Chair in Electrical Engineering and is Director of the Genomic Signal Processing Laboratory. He is also co-Director of the Computational Biology Division of the Translational Genomics Research Institute, Phoenix, AZ, and is an Adjunct Professor with the Department of Bioinformatics and Computational Biology, University of Texas M. D. Anderson Cancer Center, Houston. He is the author of 15 books, editor of five others, and author of 250 journal papers. He has contributed extensively to the statistical design of nonlinear operators for image processing and the consequent application of pattern recognition theory to nonlinear image processing. His current research in genomic signal processing is aimed at diagnosis and prognosis based on genetic signatures and at using gene regulatory networks to develop therapies based on the disruption or mitigation of aberrant gene function contributing to the pathology of a disease.

Dr. Dougherty is a fellow of SPIE, has received the SPIE President's Award, and has served as the editor of the SPIE/IS&T Journal of Electronic Imaging. At Texas A&M University, he has received the Association of Former Students Distinguished Achievement Award in Research, been named Fellow of the Texas Engineering Experiment Station, and been named Halliburton Professor of the Dwight Look College of Engineering.