Article

Selection Policy-Induced Reduction Mappings for Boolean Networks

Dept. of Veterinary Physiol. & Pharmacology, Texas A&M Univ., College Station, TX, USA
IEEE Transactions on Signal Processing (Impact Factor: 2.79). 10/2010; 58(9):4871 - 4882. DOI: 10.1109/TSP.2010.2050314
Source: IEEE Xplore

Selection Policy-Induced Reduction Mappings for
Boolean Networks
Ivan Ivanov, Plamen Simeonov, Noushin Ghaffari, Xiaoning Qian, and Edward R. Dougherty
Abstract—Developing computational models paves the way to understanding, predicting, and influencing the long-term behavior of genomic regulatory systems. However, several major challenges have to be addressed before such models are successfully applied in practice. Their inherent high complexity requires strategies for complexity reduction. Reducing the complexity of the model by removing genes and interpreting them as latent variables leads to the problem of selecting which states and their corresponding transitions best account for the presence of such latent variables. We use the Boolean network (BN) model to develop the general framework for selection and reduction of the model's complexity via designating some of the model's variables as latent ones. We also study the effects of the selection policies on the steady-state distribution and the controllability of the model.
Index Terms—Compression, control, gene regulatory networks,
selection policy.
I. INTRODUCTION
A fundamental goal of gene regulatory modeling is to derive
and study intervention strategies for the purpose of ben-
eficially altering the dynamic behavior of the underlying bio-
logical system [1]. Owing to the inherent computational burden
of optimal control methods [2], the extreme complexity, which
grows exponentially with network size, creates an impediment
for the design of optimal control policies for large gene regu-
latory networks. The computational burden can be mitigated in
different ways, for instance, by using an approximation to the
optimal policy [3] or designing greedy algorithms that do not
involve optimization relative to a cost function [4], [5] , but even
approximation and greedy-control methods can only work with
networks that are still relatively small. Another approach, the
one considered here, is to reduce the network.
Manuscript received September 15, 2009; accepted April 23, 2010. Date of
publication May 17, 2010; date of current version August 11, 2010. The as-
sociate editor coordinating the review of this manuscript and approving it for
publication was Dr. Yufei Huang. The work presented in the paper was partially
supported by the NSF Grant CCF-0514644.
I. Ivanov is with the Department of Veterinary Physiology and Pharma-
cology, Texas A&M University, College Station, TX 77843 USA (e-mail:
iivanov@cvm.tamu.edu.).
P. Simeonov is with the Department of Mathematics, University of Houston-
Downtown, Houston, TX 77002 USA (e-mail: SimeonovP@uhd.edu.).
N. Ghaffari is with the Department of Electrical and Computer Engineering,
and the Department of Statistics, Texas A&M University, College Station, TX
77843 USA (e-mail: nghaffari@tamu.edu.).
X. Qian is with the Department of Computer Science and Engineering, Uni-
versity of South Florida, Tampa, FL 33620 USA (e-mail: xqian@cse.usf.edu.).
E. R. Dougherty is with the Department of Electrical and Computer Engi-
neering, Texas A&M University, College Station, TX 77843 USA. He is also
with the Translational Genomics Research Institute (TGEN), Phoenix, AZ
85004 USA (e-mail: edward@ece.tamu.edu.).
Digital Object Identifier 10.1109/TSP.2010.2050314
The Boolean network (BN) model [6] and the probabilistic
Boolean network (PBN) model [7], which in its binary form is
essentially composed of a collection of constituent Boolean net-
works connected via a probability structure, have played key
roles in the study of gene regulatory systems, in particular, with
regard to regulatory intervention. These models are especially
useful when there is evidence of switch-like behavior. In ad-
dition, the dynamical behavior of the PBN model is described
by the well developed theory of Markov chains and their asso-
ciated transition probability matrices, which allows for a rig-
orous mathematical treatment of optimal regulatory interven-
tion. To address the issue of changing the long-run behavior,
stochastic control has been employed to find stationary con-
trol policies that affect the steady-state distribution of a PBN
[8]; however, the algorithms used to find these solutions have
complexity which increases exponentially with the number of
genes in the network. Hence, there is a need for size reducing
mappings that produce more tractable models whose dynamical
behavior and stationary control policies are “close” to those of
the original model, the key issue here being closeness, whatever
way in which that is characterized.
One way to reduce a network is to delete genes. The first map-
ping of this kind was introduced in [9]. It preserves the proba-
bility structure of a given PBN; however, the number of possible
constituent Boolean networks in the reduced model can increase
exponentially compared to the original PBN, thereby increasing
complexity of a different kind and making the biological inter-
pretation of the reduced PBN problematic.
The reduction mapping proposed in [10] addresses this
issue by mapping a given PBN onto a set of candidate PBNs
without increasing the number of constituent networks in the
original model. The removed gene is considered as a latent
variable which induces a specific “collapsing” of pairs of
states from the state space of the original PBN and induces a
selection of their successive states based on the steady-state
distribution of the network. The “collapsing” procedure
represents a situation in which a gene is not observable,
which implies that states of the regulatory system that differ
only in the expression of that gene become identical and
thus “collapse” into each other. The notion of collapsing is
both natural and general. If $x$ and $\tilde{x}$ are two states in the network that differ only in the value of gene $x_i$, and gene $x_i$ is to be deleted, then these states will be identified with each other in the reduced network, and the question becomes how to treat them within the functional structure of the network.
This paper introduces a general framework for the construc-
tion of reduction mappings based on the “collapsing” heuristic
by utilizing the concept of a selection policy, of which the reduc-
tion mapping of [10] is a special case. It studies the effects of se-
lection policies on the steady-state distribution and on stationary
control, in particular, the mean-first-passage-time (MFPT) control policy [4]. Since a binary PBN is a collection of Boolean networks with perturbation, we treat selection policies within the framework of Boolean networks with perturbation (BNps). By reducing each constituent BNp for the same gene, in effect, we reduce the PBN.
II. BACKGROUND
A. Boolean and Probabilistic Boolean Networks
A BN with perturbation (BNp), $p > 0$, on $n$ genes is defined by a set of nodes $V = \{x_1, \dots, x_n\}$ and a vector of Boolean functions $F = (f_1, \dots, f_n)$. The variable $x_i \in \{0, 1\}$ represents the expression level of gene $i$, with 1 representing high and 0 representing low expression. The vector $F$ represents the regulatory rules between genes. At every time step, the value of a gene $x_i$ is predicted by the values of a set, $W_i$, of genes at the previous time step, based on the regulatory function $f_i$. The set of genes $W_i$ is called the predictor set and the function $f_i$ is called the predictor function of $x_i$. A state of the BNp is a vector $x = (x_1, \dots, x_n) \in \{0, 1\}^n$, and the state space of the BNp is the collection of all $2^n$ states of the network. The perturbation parameter $p$ models random gene mutations, i.e., at each time point there is a probability $p$ of any gene changing its value uniformly randomly. Since there are $n$ genes, the probability of a random perturbation occurring at any time point is $1 - (1 - p)^n$. At any time point $k$, the state of the network is given by $x(k) = (x_1(k), \dots, x_n(k))$, and $x(k)$ is called a gene activity profile (GAP). The dynamics of a BNp are completely described by its transition probability matrix $P = (p_{xy})$, where $p_{xy}$ is the probability of the underlying Markov chain undergoing the transition from the state $x$ to the state $y$. The perturbation probability makes the chain ergodic and therefore it possesses a steady-state probability distribution $\pi$.
Computing the elements of $P$ is straightforward. We elect to present it here because of its importance in the subsequent considerations. When computing the transition probabilities for a BNp one has to realize that at every time step one of two mutually exclusive events happens: either the chain transitions according to the regulatory rules $F$ or a perturbation occurs. This interpretation implies that when no perturbation occurs the network regulatory rules are applied. There are two important cases in computing $p_{xy}$ for every given state $x$. The first case is when $x$ is a singleton attractor, i.e., $F(x) = x$. In this case, $p_{xx} = (1 - p)^n$ and $p_{xy} = p^{\eta(x, y)} (1 - p)^{n - \eta(x, y)}$ for $y \neq x$, where $\eta(x, y)$ is the number of the positions where the binary representations of $x$ and $y$ differ from each other. The second case is when $F(x) = z$, where $z \neq x$. In this case, $p_{xz} = (1 - p)^n + p^{\eta(x, z)} (1 - p)^{n - \eta(x, z)}$ and $p_{xy} = p^{\eta(x, y)} (1 - p)^{n - \eta(x, y)}$, for any $y \neq x, z$. The transition from $x$ to $z$ can happen by either applying the regulatory rules with a probability of $(1 - p)^n$ or by perturbation with a probability of $p^{\eta(x, z)} (1 - p)^{n - \eta(x, z)}$. In summary

$$p_{xy} = (1 - p)^n \mathbf{1}_{[F(x) = y]} + p^{\eta(x, y)} (1 - p)^{n - \eta(x, y)} \mathbf{1}_{[y \neq x]} \qquad (1)$$

where $\mathbf{1}_{[F(x) = y]}$ is the indicator function that takes value 1 if $F(x) = y$ according to the truth table and is equal to 0 otherwise.
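To make (1) concrete, the following Python sketch (our own illustration; the truth-table representation, state ordering, and function names are assumptions rather than anything specified in the paper) assembles the $2^n \times 2^n$ transition probability matrix of a BNp from its regulatory rules and perturbation probability.

    import itertools
    import numpy as np

    def hamming(x, y):
        """Number of positions in which two binary state tuples differ (eta(x, y) above)."""
        return sum(a != b for a, b in zip(x, y))

    def bnp_transition_matrix(truth_table, p):
        """Transition probability matrix of a Boolean network with perturbation.

        truth_table: dict mapping each state (tuple of 0/1 of length n) to its successor F(x).
        p: per-gene perturbation probability.
        """
        n = len(next(iter(truth_table)))
        states = list(itertools.product((0, 1), repeat=n))
        index = {s: i for i, s in enumerate(states)}
        P = np.zeros((2 ** n, 2 ** n))
        for x in states:
            fx = tuple(truth_table[x])
            for y in states:
                prob = 0.0
                if y == fx:                     # regulatory transition, no perturbation
                    prob += (1 - p) ** n
                if y != x:                      # perturbation flips exactly the differing genes
                    d = hamming(x, y)
                    prob += (p ** d) * ((1 - p) ** (n - d))
                P[index[x], index[y]] = prob
        return P, states

    # Example: a 2-gene network with f1(x) = x2, f2(x) = x1 AND x2, and p = 0.01.
    tt = {(0, 0): (0, 0), (0, 1): (1, 0), (1, 0): (0, 0), (1, 1): (1, 1)}
    P, states = bnp_transition_matrix(tt, p=0.01)
    assert np.allclose(P.sum(axis=1), 1.0)      # rows of a stochastic matrix sum to one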
A probabilistic Boolean network (PBN) consists of a set of nodes/genes $V = \{x_1, \dots, x_n\}$, $x_i \in \{0, \dots, d - 1\}$, $i = 1, \dots, n$, and a set of vector-valued network functions, $F_1, \dots, F_m$, governing the state transitions of the genes, each network function being of the form $F_j = (f_{j1}, \dots, f_{jn})$, where $f_{ji}: \{0, \dots, d - 1\}^n \to \{0, \dots, d - 1\}$, $i = 1, \dots, n$ [7]. In most applications, the discretization is either binary or ternary. Here we use binary, $d = 2$. At each time point a random decision is made as to whether to switch the network function for the next transition, with the probability $q$ of a switch being a system parameter. If a decision is made to switch the network function, then a new function is chosen from among $F_1, \dots, F_m$, with the probability of choosing $F_j$ being the selection probability $c_j$ (note that network selection is not conditioned by the current network and the current network can be selected). Each network function $F_j$ determines a BN, the individual BNs being called the contexts of the PBN. The PBN behaves as a fixed BN until a random decision (with probability $q$) is made to switch contexts according to the probabilities $c_1, \dots, c_m$ from among $F_1, \dots, F_m$. If $q = 1$, then the PBN is said to be instantaneously random; if $q < 1$ [11], then the PBN is said to be context-sensitive. Our interest is with context-sensitive PBNs with perturbation, meaning that at each time point there is a probability $p$ of any gene flipping its value uniformly randomly. Excluding the selection-probability structure, a context-sensitive PBN is a collection of BNps, the contexts of the PBN. It is in this sense that by considering reduction mappings for BNps we are ipso facto considering reduction mappings for PBNs.
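As an illustration of the context-sensitive PBN dynamics just described, here is a minimal simulation sketch in Python; the ordering of the switch and perturbation decisions follows one common convention, and all names (pbn_step, contexts, c, q, p) are ours rather than the paper's.

    import numpy as np

    rng = np.random.default_rng(0)

    def pbn_step(state, context, contexts, c, q, p):
        """One transition of a binary context-sensitive PBN with perturbation.

        state:    current gene activity profile, a tuple of 0/1 values.
        context:  index of the currently active constituent BN.
        contexts: list of truth tables, one dict per constituent BN (state -> successor).
        c:        selection probabilities for the contexts (sum to one).
        q:        probability of re-selecting the context at this step.
        p:        per-gene perturbation probability.
        """
        # With probability q a new context is drawn (the current one may be re-drawn).
        if rng.random() < q:
            context = rng.choice(len(contexts), p=c)
        # If any gene perturbs, the perturbed state is taken; otherwise the rule is applied.
        flips = rng.random(len(state)) < p
        if flips.any():
            next_state = tuple(int(v) ^ int(f) for v, f in zip(state, flips))
        else:
            next_state = tuple(contexts[context][tuple(state)])
        return next_state, context

    # Example with two hypothetical 2-gene contexts and equal selection probabilities.
    ctx0 = {(0, 0): (0, 0), (0, 1): (1, 0), (1, 0): (0, 0), (1, 1): (1, 1)}
    ctx1 = {(0, 0): (0, 1), (0, 1): (1, 1), (1, 0): (0, 1), (1, 1): (1, 0)}
    state, context = (0, 1), 0
    for _ in range(5):
        state, context = pbn_step(state, context, [ctx0, ctx1], c=[0.5, 0.5], q=0.1, p=0.01)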
B. Effect of Rank-One Perturbations on the Steady-State
Distribution
When studying the network reduction, we will need to char-
acterize the changes in the steady-state distribution resulting
from certain kinds of changes in the Markov chain, the so-called
rank-one perturbations (not to be confused with the notion of
perturbation in BNp transitions). This problem has been studied
in the framework of structural intervention in gene regulatory
networks [12] and more generally in the framework of Markov
chain perturbation theory [13].
Letting $\tilde{P}$ and $\tilde{\pi}$ denote the transition probability matrix and steady-state distribution for the perturbed BNp, we have $\pi^T P = \pi^T$ and $\tilde{\pi}^T \tilde{P} = \tilde{\pi}^T$, where $T$ denotes transpose. Analytical expressions for the steady-state distribution change can be obtained using the fundamental matrix, which exists for any ergodic Markov chain and is given by $Z = (I - P + e \pi^T)^{-1}$, where $e$ is a column vector whose components are all unity [14]. Letting $E = \tilde{P} - P$, the steady-state distribution change is $\tilde{\pi}^T - \pi^T = \tilde{\pi}^T E Z$.
For a rank-one perturbation, the perturbed Markov chain has the transition probability matrix $\tilde{P} = P + a b^T$, where $a, b$ are two arbitrary vectors satisfying $b^T e = 0$, and $a b^T$ represents a
rank-one perturbation of the original transition probability matrix $P$. In this case, it can be shown that [12]

$$\tilde{\pi}^T = \pi^T + \frac{\pi^T a}{1 - b^T Z a}\, b^T Z. \qquad (2)$$

An important special case occurs when the transition mechanisms before and after perturbation differ only in one state, say the $u$th state. Then $E = e_u b^T$ has nonzero values only in its $u$th row, where $e_u$ is the coordinate vector with a 1 in the $u$th position and 0s elsewhere. Substituting this into (2) yields

$$\tilde{\pi}_j = \pi_j + \frac{\pi_u\, (b^T Z)_j}{1 - (b^T Z)_u} \qquad (3)$$

where $(b^T Z)_u$ and $(b^T Z)_j$ are the $u$th and $j$th coordinates of the respective vector. For the $u$th state

$$\tilde{\pi}_u = \frac{\pi_u}{1 - (b^T Z)_u}. \qquad (4)$$

The results for these special cases can be extended to arbitrary types of perturbations so that it is possible to compute the steady-state distributions of arbitrarily perturbed Markov chains in an iterative fashion [12].
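The update in (2) can be checked numerically; the sketch below (all helper names are ours) computes the fundamental matrix and applies the rank-one correction, comparing the result against a direct recomputation of the stationary distribution.

    import numpy as np

    def steady_state(P):
        """Stationary distribution of an ergodic chain: left eigenvector of P for eigenvalue 1."""
        w, v = np.linalg.eig(P.T)
        pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
        return pi / pi.sum()

    def fundamental_matrix(P, pi):
        """Z = (I - P + e pi^T)^{-1} for an ergodic chain."""
        n = P.shape[0]
        return np.linalg.inv(np.eye(n) - P + np.outer(np.ones(n), pi))

    def rank_one_update(P, a, b):
        """Steady-state distribution of P + a b^T (with b^T e = 0), cf. (2)."""
        pi = steady_state(P)
        Z = fundamental_matrix(P, pi)
        shift = (pi @ a) / (1.0 - b @ Z @ a) * (b @ Z)
        return pi + shift

    # Sanity check on a small ergodic chain: perturb one row and compare with a direct solve.
    P = np.array([[0.6, 0.3, 0.1],
                  [0.2, 0.5, 0.3],
                  [0.3, 0.3, 0.4]])
    a = np.array([1.0, 0.0, 0.0])        # only the first row changes
    b = np.array([-0.1, 0.05, 0.05])     # b^T e = 0 keeps the rows stochastic
    assert np.allclose(rank_one_update(P, a, b), steady_state(P + np.outer(a, b)))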
C. MFPT Control Policy
The problem of optimal intervention is usually formulated as an optimal stochastic control problem [2]. We focus on treatments that specify the intervention on a single control gene $g$. A control policy based on $g$ is a sequence of decision rules $\mu_k(x) \in \{0, 1\}$ for each time step $k$. The values 0/1 are interpreted as off/on for the application of the control.
The MFPT algorithm is based on the comparison between the MFPTs of a state $x$ and its flipped (with respect to $g$) state $\tilde{x}$ [4]. When considering therapeutic intervention, the state space can be partitioned into desirable ($D$) and undesirable ($U$) states according to the expression values of a given set of genes. For simplicity we will assume that the set consists of a single gene, i.e., only one gene determines the partition. The intuition behind the MFPT algorithm is that, given the control gene $g$, when the flipped state $\tilde{x}$ of an undesirable state $x$ reaches the desirable states on average faster than $x$, it is reasonable to apply control and start the next network transition from $\tilde{x}$. The roles of $D$ and $U$ are reversed when $x \in D$. Without loss of generality one can assume that the gene determining the partition is the leftmost gene in the state's binary representation, i.e., $x = (x_1, x_2, \dots, x_n)$, and the desirable states correspond to a fixed value of $x_1$. With this assumption, the probability transition matrix $P$ of the Markov chain can be written as

$$P = \begin{pmatrix} P_{DD} & P_{DU} \\ P_{UD} & P_{UU} \end{pmatrix}. \qquad (5)$$

Using this representation one can compute the mean first-passage times $K_D$ and $K_U$ by solving the following system of linear equations [15]:

$$(I - P_{UU})\, K_D = e \qquad (6)$$

$$(I - P_{DD})\, K_U = e \qquad (7)$$

where $e$ denotes a unit vector of the appropriate length. The vectors $K_D$ and $K_U$ contain the MFPTs from each state in $U$ to the set $D$, and from each state in $D$ to the set $U$, respectively. The MFPT algorithm designs stationary (time independent) control policies $\mu_g$ for each gene $g$ in the network by comparing the coordinate differences $K_D(x) - K_D(\tilde{x})$ and $K_U(\tilde{x}) - K_U(x)$ to $\gamma$, a tuning parameter that can be set to a higher value when the ratio of the cost of control to the cost of the undesirable states is higher, the intent being to apply the control less frequently.
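A compact Python sketch of the MFPT computation in (6)-(7) and of the resulting stationary policy follows. The state encoding, the handling of the flipped state, and the exact form of the threshold rule are our reading of the description above, not the authors' code.

    import numpy as np

    def mfpt_to_set(P, target):
        """Mean first passage times K(x) from every state to the set `target` (K = 0 on the set).

        Solves (I - P_CC) K_C = e on the complement C of the target set, cf. (6)-(7).
        """
        m = P.shape[0]
        comp = np.array([i for i in range(m) if i not in target])
        K = np.zeros(m)
        if comp.size:
            A = np.eye(comp.size) - P[np.ix_(comp, comp)]
            K[comp] = np.linalg.solve(A, np.ones(comp.size))
        return K

    def mfpt_control_policy(P, desirable, control_gene, n, gamma):
        """Stationary 0/1 control policy over all 2**n states (1 = flip the control gene).

        State i is encoded in binary with gene 1 as the most significant bit; `desirable`
        holds the desirable state indices and `gamma` is the tuning parameter.
        """
        undesirable = set(range(2 ** n)) - set(desirable)
        K_D = mfpt_to_set(P, set(desirable))   # times to reach the desirable states
        K_U = mfpt_to_set(P, undesirable)      # times to reach the undesirable states
        bit = 1 << (n - control_gene)          # bit mask of the control gene
        mu = np.zeros(2 ** n, dtype=int)
        for x in range(2 ** n):
            x_f = x ^ bit                      # state with the control gene flipped
            if x in undesirable:
                mu[x] = int(K_D[x] - K_D[x_f] > gamma)
            else:
                mu[x] = int(K_U[x_f] - K_U[x] > gamma)
        return mu

In this sketch, P could be the matrix produced by the earlier bnp_transition_matrix helper and `desirable` the indices of states carrying the chosen value of the partition gene; both are our assumed conventions.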
III. SELECTION POLICIES AND REDUCTION MAPPINGS
To motivate the definition of a selection policy, we consider a specific reduction mapping designed using information about the steady-state distribution of the network [10]. That mapping is constructed assuming that the gene $x_n$ is to be 'deleted' from the network. For every state $x$, let $\tilde{x}$ denote the state that differs from $x$ only in the value of gene $x_n$. Next, for every such pair $x$ and $\tilde{x}$, we consider their successor states $y$ and $\tilde{y}$, $y = F(x)$ and $\tilde{y} = F(\tilde{x})$. Following deletion, $x_n$ becomes a latent variable and, under the reduction mapping, the states $x$ and $\tilde{x}$ 'collapse' to a state $x^*$ in the reduced network. The state $x^*$ is obtained from either $x$ or $\tilde{x}$ by removing their $n$th coordinate. The reduction mapping, denoted by $R_\pi$, constructs the truth table of the reduced network by selecting the transition $x^* \to y^*$ if $\pi(x) \geq \pi(\tilde{x})$ or $x^* \to \tilde{y}^*$, otherwise. This particular type of reduction is a special case of a reduction mapping $R_s$ induced by a selection policy $s$, which we define next.
Definition 1: A selection policy $s$ corresponding to the deleted gene $x_n$ is a $2^n$-dimensional vector, $s \in \{0, 1\}^{2^n}$, indexed by the states of the BNp and having components equal to 1 at exactly one of the positions corresponding to each pair $(x, \tilde{x})$, $x \in \{0, 1\}^n$.
For each gene there are $2^{2^{n-1}}$ different selection policies. Using this definition, one can consider the reduction mapping $R_s$ corresponding to the selection policy $s$. $R_s$ constructs the truth table of the reduced network by selecting the transition $x^* \to y^*$ if $s(x) = 1$ or $x^* \to \tilde{y}^*$, otherwise. Greater reduction of the network can be achieved by sequentially deleting genes in an iterative manner following the approach outlined in this section.
Each selection policy is obtained by the action of a corresponding selection matrix $S_s$ of dimension $2^n \times 2^n$ on the column unit vector $e$,

$$s = S_s\, e \qquad (8)$$

where $S_s$ is obtained from the $2^n \times 2^n$ identity matrix by replacing the rows corresponding to the 0 entries in the selection policy $s$ with zero rows. The selection matrix serves as a right identity, $\bar{S}_s S_s = \bar{S}_s$, to the matrix $\bar{S}_s$ obtained
from the selection matrix by deleting its zero rows. In what follows, we will focus on the selection policies that correspond to the case where the deleted gene is $x_n$. This could be achieved by permuting the gene indexes and does not restrict the generality of our considerations. For example, in the case of BNs on 3 genes, one of the possible $S_s$ matrixes has the following format:

$$S_s = \mathrm{diag}(1, 0, 1, 0, 1, 0, 1, 0). \qquad (9)$$
The column-collapsing matrix $\Lambda$ is obtained from $\bar{S}_s$ by first transposing and then by setting, for each column indexed by a reduced state $x^*$, the remaining entry of the pair $(x, \tilde{x})$ to 1, i.e., the entry in the row of $x$ if $s(x) = 0$ or in the row of $\tilde{x}$ if $s(\tilde{x}) = 0$, so that each column has exactly two unit entries. The above example of the matrix $S_s$ yields

$$\Lambda = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \qquad (10)$$
We define the companion matrix $C_s$ for the reduction mapping $R_s$ by

$$C_s = \bar{S}_s\, P\, \Lambda. \qquad (11)$$

The companion matrix $C_s$ defines a Markov chain that is closely related to the dynamics of the reduced network $R_s(\mathrm{BNp})$. Its importance can be seen from the following theorem, which states that $C_s$ is asymptotically equal to the transition probability matrix $P_s$ of $R_s(\mathrm{BNp})$.
Theorem 1: For every selection policy $s$,

$$\max \left| C_s - P_s \right| \leq p (1 - p)^{n - 1}$$

where $\max |\cdot|$ denotes the maximum element of the respective matrix.
Proof: Without loss of generality we set the deleted gene to be $x_n$. By the definition, each row of the companion matrix $C_s$ is obtained as follows. First, the appropriate rows from $P$ are selected according to the selection policy $s$. The other rows are discarded and the columns in the resulting $2^{n-1} \times 2^n$ matrix are added in pairs indexed by the pairs of flipped states $y$ and $\tilde{y}$, thus forming the $2^{n-1} \times 2^{n-1}$ companion matrix. Because the elements of the matrices $C_s$ and $P_s$ are indexed with the states $x^*$ of the reduced network, we compare them element-wise. Note that only rows in the original probability transition matrix $P$ indexed by states $x$ for which $s(x) = 1$ contribute in the formation of both $C_s$ and $P_s$. Thus, for each state $x$, $s(x) = 1$, there are several cases to consider:
1. $F(x) = x$.
In this case considerations similar to the one in Section II show that $C_s(x^*, x^*) = (1 - p)^n + p(1 - p)^{n-1} = (1 - p)^{n-1}$, which equals $P_s(x^*, x^*)$. When $y \neq x$ and $\tilde{y} \neq x$ we have $C_s(x^*, y^*) = p^{\eta(x, y)}(1 - p)^{n - \eta(x, y)} + p^{\eta(x, \tilde{y})}(1 - p)^{n - \eta(x, \tilde{y})}$. Thus $C_s(x^*, y^*) = p^{\eta^*}(1 - p)^{n - 1 - \eta^*}$, where $\eta$ is the Hamming distance between the states $x$ and $y$, e.g., the definition of $\eta$ in Section II. Now, it is enough to notice that $P_s(x^*, x^*) = (1 - p)^{n-1}$ and $P_s(x^*, y^*) = p^{\eta^*}(1 - p)^{n - 1 - \eta^*}$, where $\eta^*$ is defined similarly to $\eta$ but for the reduced network $R_s(\mathrm{BNp})$. This observation shows that the rows of $C_s$ and $P_s$ indexed by $x^*$ coincide.
2. $F(x) = \tilde{x}$.
In this case $(F(x))^* = x^*$ and $C_s(x^*, x^*) = (1 - p)^n + p(1 - p)^{n-1} = (1 - p)^{n-1}$, which gives $C_s(x^*, x^*) = P_s(x^*, x^*)$. For the rest of the states the verification that $C_s(x^*, y^*) = P_s(x^*, y^*)$ can be done as in case 1.
3. $F(x) = y$, and $y \neq x, \tilde{x}$.
In this case $C_s(x^*, y^*) = (1 - p)^n + p^{\eta^*}(1 - p)^{n - 1 - \eta^*}$ and $P_s(x^*, y^*) = (1 - p)^{n-1} + p^{\eta^*}(1 - p)^{n - 1 - \eta^*}$, while $C_s(x^*, x^*) = p(1 - p)^{n-1}$ and $P_s(x^*, x^*) = 0$. Computations, similar to the previous two cases, show that these are the only entries of the row indexed by $x^*$ in which $C_s$ and $P_s$ differ. Finally, for states $z^* \neq x^*, y^*$, the computations are the same as in the previous case and show that $C_s(x^*, z^*) = P_s(x^*, z^*)$.
The above considerations show that each row of the companion matrix $C_s$ could possibly differ from the corresponding row in the transition probability matrix $P_s$ in only two entries, one of them perturbed by a term that equals $(1 - p)^{n-1} - (1 - p)^n = p(1 - p)^{n-1}$ and the other one by a term that equals $p(1 - p)^{n-1}$. Both quantities tend to 0 as $n \to \infty$. The norm of the matrix difference from the statement of the theorem is proportional to the larger of these two numbers, and thus we obtain the conclusion of the theorem.
An immediate corollary of the theorem is that, if all of the states in a BNp are either singleton attractors or part of attractor cycles of the type $x \to \tilde{x} \to x$, then the transition probability matrix $P_s$ for the reduced network is identical to the companion matrix for every reduction mapping induced by a selection policy. The salient observation following from the theorem is that for large networks (large $n$) or for networks with a high probability of perturbation $p$, the companion matrix $C_s$ is, up to a very small perturbation, identical to $P_s$.
In the next two sections, we use the companion matrix $C_s$
to study the effects of selection-policy-induced reduction
mappings on both the steady-state distribution of the original
network and a specific type of stationary control policy, the
MFPT control policy. The issue of the effects of a reduction
on the steady-state distribution is different from the issue of its
effects on a stationary control policy for the network; however,
the reduction effects on the steady-state distribution and the
control policy can both be used as measures for the goodness
of reduction mappings.
IV. EFFECT OF REDUCTION ON THE STEADY-STATE DISTRIBUTION
In this section we use the companion matrix $C_s$ for the reduction mapping $R_s$ and results from Markov chain perturbation theory to study the effects of the reduction on the steady-state distribution of the BNp. Similarly to the previous section, one can assume without loss of generality that the gene $x_1$ determines the partition of the state space into desirable and undesirable states, and $x_n$ is the gene to be 'deleted' from the network.
Since $p > 0$ in (1), all the entries in $P$ have two terms, $(1 - p)^n \mathbf{1}_{[F(x) = y]}$ and $p^{\eta(x, y)}(1 - p)^{n - \eta(x, y)} \mathbf{1}_{[y \neq x]}$. We note that the first term is the multiplication of the probability of no random flipping, $(1 - p)^n$, with the indicator $\mathbf{1}_{[F(x) = y]}$ determined by the Boolean functions of the BNp. We denote it by $a_{xy}$. The second term is determined by the perturbation parameter $p$ and the Hamming distance between $x$ and $y$. We denote it by $b_{xy}$. Hence, $p_{xy} = a_{xy} + b_{xy}$ for any entry in the transition matrix $P$. Thus, we can write $P = A + B$. Each entry in $A$ corresponds to $a_{xy}$ and each entry in $B$ corresponds to $b_{xy}$. Obviously, $B$ is the same for all BNps with the same number of genes and the same value for the perturbation parameter $p$, and $A$ is determined by the regulatory rules or Boolean functions $F$ given in the BNp.
For a given selection policy $s$, define a new Markov chain by replacing the rows in $A$ that correspond to the states $x$ with $s(x) = 0$ with the rows of their flipped states $\tilde{x}$, for which $s(\tilde{x}) = 1$, or by leaving them unchanged if $s(x) = 1$. Similarly to $P$, the probability transition matrix $\hat{P}$ for this new Markov chain can be decomposed as

$$\hat{P} = \hat{A} + B.$$

For example, for a BNp on 3 genes under the assumption that the deleted gene is $x_3$, the resulting matrix $\hat{A}$ has identical neighboring odd and even rows, each pair of rows being the selected row of $A$.
The so introduced new Markov chain is obtained from the original Markov chain representing the BNp by performing at most $2^{n-1}$ rank-one perturbations. Thus, following [12], one can compute the total change in the steady-state distribution caused by those perturbations. By the definition of the steady-state distribution, $\hat{\pi}^T \hat{P} = \hat{\pi}^T$. For each state $y$, we have $\hat{\pi}(y) = \hat{\pi}^T \hat{P}_{(y)}$, where $\hat{P}_{(y)}$ denotes the $y$th column of $\hat{P}$. For the state $\tilde{y}$, we have the similar form $\hat{\pi}(\tilde{y}) = \hat{\pi}^T \hat{P}_{(\tilde{y})}$. Note that $y$ and $\tilde{y}$ are flipped states and $\hat{P}_{(y)}$ and $\hat{P}_{(\tilde{y})}$ are neighboring odd and even columns. Now, for every such pair of flipped states we compute

$$\hat{\pi}(y) + \hat{\pi}(\tilde{y}) = \hat{\pi}^T \left( \hat{P}_{(y)} + \hat{P}_{(\tilde{y})} \right).$$

Recall that $\hat{P} = \hat{A} + B$ and, when the deleted gene is $x_n$, the neighboring odd and even rows in $\hat{A}$ are identical, as in the above example of $\hat{A}$. We have

$$\hat{\pi}(y) + \hat{\pi}(\tilde{y}) = \hat{\pi}^T \left( \hat{A}_{(y)} + \hat{A}_{(\tilde{y})} \right) + \hat{\pi}^T \left( B_{(y)} + B_{(\tilde{y})} \right).$$

Based on the definition of $B$, and using Hamming distances, it is straightforward to see that the vector $B_{(y)} + B_{(\tilde{y})}$ takes equal values at the two members of every pair of flipped states $x$ and $\tilde{x}$. Hence the same is true for $\hat{P}_{(y)} + \hat{P}_{(\tilde{y})}$. The neighboring odd and even rows in $\hat{A}$ are identical and the fact that for an arbitrary pair of neighboring odd and even states $x$ and $\tilde{x}$ the Hamming distances between $x$, $y$ and $\tilde{x}$, $\tilde{y}$ are equal to each other gives us

$$\hat{\pi}(y) + \hat{\pi}(\tilde{y}) = \sum_{x^*} \left[ \hat{\pi}(x) + \hat{\pi}(\tilde{x}) \right] C_s(x^*, y^*).$$

We denote $\sigma(x^*) = \hat{\pi}(x) + \hat{\pi}(\tilde{x})$ for all $x^*$. The procedure followed here leads to the steady-state distribution for the companion matrix $C_s$. Thus

$$\sigma^T C_s = \sigma^T.$$
This observation, when combined with the Markov chain perturbation theory and Theorem 1, shows that the effects of a selection policy $s$ on the steady-state distribution of a given BNp
can be estimated approximately by comparing the $2^{n-1}$-dimensional vector with coordinates equal to the appropriate sums $\pi(x) + \pi(\tilde{x})$ to the steady-state distribution $\sigma$ for the companion matrix $C_s$. Indeed, according to Theorem 1 the steady-state distribution $\sigma$ of the Markov chain described by the companion matrix rapidly converges with $n$ to the steady-state distribution $\pi_s$ of the reduced network. At the same time, this probability distribution can be obtained via the relation $\sigma(x^*) = \hat{\pi}(x) + \hat{\pi}(\tilde{x})$ from the steady-state distribution $\hat{\pi}$ of the Markov chain described by the probability transition matrix $\hat{P}$. The latter is obtained from the probability transition matrix of the original BNp by means of at most $2^{n-1}$ rank-one perturbations. The discussion in Section II-B, and (4) in particular, show how one can compute the effect on the steady-state distribution after each one of these perturbations. The total effect on the steady-state distribution produced by all of the perturbations that lead to the matrix $\hat{P}$ can be computed by iteratively applying (4). Thus, for every state $x$ in the original BNp one can compute the perturbation effect on the sums $\pi(x) + \pi(\tilde{x})$, which represent the proper collapsing of the steady-state of the original network if one considers the deleted gene $x_n$ as a latent variable. This discussion shows that the differences $[\pi(x) + \pi(\tilde{x})] - \sigma(x^*)$, $x \in \{0, 1\}^n$, can serve as a good approximation to the shift in the steady-state distribution of the original BNp.
V. EFFECT OF REDUCTION ON THE CONTROL POLICY
In this section we study the effects of the selection-policy-
induced reduction mappings on control policies designed for a BNp. Since our goal is to arrive at a measure of performance for
reduction mappings, we need to consider a control policy whose
mathematical formulation can be related to selection policies.
As we will see, this requirement is met by the MFPT control
policy.
For a selection policy $s$ and its corresponding reduction mapping $R_s$ we can consider the MFPT stationary control policy on the reduced network. In this regard we assume without loss of generality that $x_k$, $k \neq n$, is the control gene and $x_n$ is the gene to be deleted from the network. Interpreting the deletion of gene $x_n$ as creation of a latent, or non-observable, variable, it is desirable that the MFPT control policy for the reduced network is as close as possible to the one designed for the original network when both control policies have the same parameter $\gamma$. In this way, one can achieve similar control actions for every state $x$ and its corresponding reduced state $x^*$. Obviously, this is achieved only if $\mu(x) = \mu(\tilde{x})$. Thus, we arrive at the following definition which is formulated for the general case of stationary control policies.
Definition 2: Given a stationary control policy $\mu$ and a gene $x_n$ to be deleted from the BNp, the policy is called $x_n$-inconsistent at the state $x$ if and only if $\mu(x) \neq \mu(\tilde{x})$. The state $x$ is called $x_n$-inconsistent. The ratio of the number of $x_n$-inconsistent states to the total number of states in the BNp is called the $x_n$-relative inconsistency of the control policy $\mu$.
The pairs of $x_n$-inconsistent states in the original network present us with two possible options for defining the control action for the states in the reduced network obtained by collapsing those pairs into their respective reduced states. Thus, one should measure the effects of a selection policy-induced reduction mapping by comparing the control actions for the subset of states that are not $x_n$-inconsistent to the control actions for their corresponding reduced states in the reduced network $R_s(\mathrm{BNp})$. The next definition provides a relative measure of those effects.
Definition 3: Given a stationary control policy $\mu$ and a gene $x_n$ to be deleted from the BNp, the policy is called $s$-affected at the state $x$ if and only if the control action $\mu_s(x^*)$ for the reduced state $x^*$ is different from $\mu(x)$. The ratio of the number of states where the control policy $\mu$ is $s$-affected to the total number of states in the BNp is called the relative effect $E(s, \mu)$ of the selection policy $s$ on the control policy $\mu$.
Because there is only a finite number of selection policies $s$, there exists a selection policy that minimizes the relative effect $E(s, \mu)$ among all possible selection policies on any given stationary control policy $\mu$ designed for a BNp.
Recall that the MFPT control policy $\mu$ is designed based on the comparisons of $K_D(x) - K_D(\tilde{x})$ and $K_U(\tilde{x}) - K_U(x)$ to $\gamma$. Thus, one can achieve the desired similarity between the MFPT control policies by minimizing the differences $K_D - K_D^s$ and $K_U - K_U^s$ between the respective vectors of mean first-passage times for the states in the original and the reduced networks. However, the dimension of $K_D$ and $K_U$ is $2^{n-1}$ while the dimension of $K_D^s$ and $K_U^s$ is $2^{n-2}$, and one cannot simply subtract the vectors of mean first-passage times for the reduced network from their counterparts for the original network.
This problem can be circumvented by using the companion matrix. Because the sets of desirable and undesirable states are the same for the Markov chain corresponding to the reduced network and the one corresponding to $C_s$, one can consider equations similar to (6) and (7), namely

$$(I - (C_s)_{UU})\, K_D^C = e \qquad (12)$$

$$(I - (C_s)_{DD})\, K_U^C = e \qquad (13)$$

Consider a state $x \in U$ such that $s(x) = 1$. Then, from (6) and (12), one gets

$$K_D(x) - K_D^C(x^*) = \sum_{y \in U} p_{xy} K_D(y) - \sum_{y^*} C_s(x^*, y^*) K_D^C(y^*) \qquad (14)$$

where the index $x^*$ describes the coordinates of the vector $K_D^C$ and the states $x$, $\tilde{x}$ collapse to the state $x^*$.
The right-hand side of (14) suggests that the proper vector to subtract from $K_D$ is the vector defined by the action of the matrix $\Lambda_U$ on the vector $K_D^C$, where, referring to (6), $\Lambda_U$ is the upper left $2^{n-1} \times 2^{n-2}$ submatrix of the column-collapsing matrix $\Lambda$; the resulting vector $\Lambda_U K_D^C$ has identical elements for each pair of positions indexed by a pair of flipped states $y$, $\tilde{y}$. Thus, if one pairs every equation of type (14) with an artificially introduced identity corresponding to the state $\tilde{x}$ that was not selected by the policy,

$$K_D(\tilde{x}) - K_D^C(x^*) = K_D(\tilde{x}) - K_D^C(x^*) \qquad (15)$$

then one gets a linear system of $2^{n-1}$ equations. That system can be represented in a matrix format as

$$K_D - \Lambda_U K_D^C = M_s^U \left( K_D - \Lambda_U K_D^C \right) \qquad (16)$$

where the matrix $M_s^U$ is obtained from the matrix $P_{UU}$, e.g., (6) and (7), by replacing the rows corresponding to 0 entries in the selection policy $s$ with the appropriate unit basis vectors $e_x$. Similar considerations lead to

$$K_U - \Lambda_D K_U^C = M_s^D \left( K_U - \Lambda_D K_U^C \right) \qquad (17)$$

where the matrix $M_s^D$ is obtained from the matrix $P_{DD}$, e.g., (6) and (7), by replacing the rows corresponding to 0 entries in the selection policy $s$ with the appropriate unit basis vectors $e_x$, and with $\Lambda_D$ being the lower left $2^{n-1} \times 2^{n-2}$ sub-matrix of $\Lambda$. Equations (16) and (17) show that the differences $K_D - \Lambda_U K_D^C$ and $K_U - \Lambda_D K_U^C$ are eigenvectors corresponding to the eigenvalue of 1 for the matrices $M_s^U$ and $M_s^D$, respectively.
Recall that there exists a selection policy that is optimal with regard to minimizing the relative effect $E(s, \mu)$ among all possible selection policies $s$ on the stationary MFPT control policy $\mu$. Whereas for every candidate-for-deletion gene there is an optimal selection policy to reduce the network, finding that optimal selection policy requires finding the MFPT control policy for each one of the $2^{2^{n-1}}$ possible reduced networks. This procedure is highly complex because it requires finding the respective transition probability matrixes for all of the reduced networks, then solving the corresponding systems of linear equations, e.g., (6) and (7), to find the vectors of mean first-passage times, and finally, executing the algorithm that determines the control policy for the reduced network.
Although the optimal selection policy does not necessarily minimize the differences $K_D - \Lambda_U K_D^C$ and $K_U - \Lambda_D K_U^C$, every selection policy $s$ that ensures close-to-0 coordinates for these vectors at the positions that correspond to the places where its coordinates equal 1 is asymptotically (as $n \to \infty$) optimal, and Theorem 1 shows that the convergence is fast. The relations given by (16) and (17) show that the differences $K_D - \Lambda_U K_D^C$ and $K_U - \Lambda_D K_U^C$ belong to the linear subspaces spanned by the eigenvectors corresponding to the eigenvalue of 1 for their respective matrixes $M_s^U$ and $M_s^D$. Our interest is in finding the elements of those subspaces that have small coordinates at the positions that correspond to the places where the selection policy $s$ equals 1.
While we do not consider the question of characterizing those
subspaces based on the structure of a given selection policy, we
provide two theorems concerning estimating the probability of
finding vectors that belong to the subspaces and have a certain
growth condition imposed on their coordinates. These probabil-
ities are proportional to either the surface area or the volume of
certain portions of the
-ball in which is endowed with either
or metric. Thus, we are interested in the sets
,
and
, where , .By
adjusting the parameters
, one can estimate the probability of
having the vectors of MFPT
and ( and ) close
to each other at the coordinates where the selection policy
equals to 1. Here we only state the theorems, leaving the proofs
in the Appendix.
Theorem 2: Consider the set
. Then
Theorem 3: Consider the sets and . Then
where , and are the characteristic
functions of the respective sets,
is the projection of
onto the subspace spanned by the coordinate vectors
and
The quantities and
describe the area of the projec-
tion of the intersection between the positive coordinate cone in
and the surface of the -ball in endowed with the and
topology respectively. The projections are onto the subspace
of
spanned by the coordinate vectors .
Similarly, the quantity
describes the
-dimensional volume of the set , the volume of the
portion of the
-ball in the space that is bounded by the
Fig. 1. A 4-gene BNp has 256 possible selection policies. All of these selection policies and their SSD shift toward the Desirable states are shown in this figure. 16 out of 256 selection policies produce the maximum shift toward the Desirable states and are identified by a circle around them. Our heuristic selection policy is one of these 16 optimal selection policies.
surface and the positive coordinate cone in . We focus
on the positive coordinate cone in
because of the task at
hand: estimating the closeness of the vectors
and or
and in either or norm. Taking into account all
of the possible permutations of the coordinate vectors, one can
see that the probability of finding vectors with the structure
described in the statements of the theorems is proportional
to
in the case, and to in the
case. These two quantities rapidly tend to zero if some of
the numbers
tend to zero. It is important to notice that the
minimum in either
or norm of the coordinates of interest
of the vectors
and depends only on the
given
and the selection policy . At the same time, the
relative effect
of the selection policy on the control
policy
depends on the choice of the parameter which
is related to the cost of control [4]. Smaller values of
imply
that control is applied more often, larger size of the set
, and
a potential increase of
. Smaller values for also imply
that a brute force search for vectors
and that are close
enough to the MFPT vectors
and is a hard problem.
Thus, there is a need for heuristic approaches that identify
both a gene for deletion and a selection policy which has small
relative effect on the control policy. It is also desirable that the
reduced network shares controllability properties similar to those of the
original one, where the controllability is understood in terms of
the shift of the steady-state distribution towards the desirable
states after the application of MFPT control policy. The control
policy developed in [16] employs such a heuristically chosen se-
lection policy in the following scenario: (i) a gene is selected for
deletion by a procedure involving the Coefficient of Determina-
tion (CoD), [17]; (ii) a reduction mapping using the heuristically
chosen selection policy is applied; (iii) steps (i) and (ii) are used
iteratively until a network of a predetermined (computationally
feasible) size is achieved, (iv) the MFPT control policy is de-
rived for the reduced network; and (v) a control policy is in-
duced on the original network from the policy derived for the
reduced network.
VI. SIMULATION STUDY
We performed simulation studies to illustrate the new con-
cepts developed in this paper. Specifically, we study the rela-
tionship between the relative effect of the selection policy on
the MFPT control policy and the shift of the steady-state distri-
bution of the network towards the desirable states.
The key concept in this paper is that of the selection policy
which determines the reduction mapping, and subsequently
the structure of the reduced network. Given the large number
of possible selection policies, it is not feasible to exhaustively
search for the optimal selection policy. The optimal selection
policy minimizes the relative effect on a given stationary control
policy; therefore, if that stationary control policy is changed,
then the optimal selection policy could change as well. Thus, it
is important to develop heuristics for determining suboptimal
selection policies. The algorithm CoD-Reduce outlined in [16]
designs such a policy. Fig. 1 shows the performance of that
algorithm using a 4-gene network as an example: the designed
Fig. 2. The average SSD shift toward the Desirable states and the relative effects on the control policies of successive reductions of 2 sets of 100 BNps. Each set has randomly generated attractors which are constrained to be evenly distributed between the Desirable and Undesirable states. At each step the MFPT control policy is designed and applied on the network. As the figure shows, the SSD shift is similar in the original and reduced networks when applying their own control policies. Also, the SSD shift and relative effect curves follow inverse patterns. (a) 9 genes. (b) 10 genes.
suboptimal selection policy is among the several optimal selec-
tion policies. In our simulation study, we use CoD-Reduce to
design the selection policy.
We have generated several sets, 100 networks each, of BNps using the algorithm developed in [18] for different numbers of
genes. The networks are randomly generated subject to only one
constraint on their attractors: half of them are among the desir-
able states of the respective network. For each randomly gen-
erated network, also referred to as the original network in the
simulation study, we design and apply the respective MFPT con-
trol policy. Then we compute the difference between the total
mass of the desirable states in the SSD of the network before
and after applying the MFPT control policy. We refer to this
difference as the SSD shift (toward the desirable states). It is
clear that more efficient stationary control policies will produce
larger SSD shifts. For the next step in the simulations, we use
CoD-Reduce and delete one gene from the network and find the
MFPT control policy for the reduced network. We then calculate
the relative effect of the selection policy on the MFPT control
policy for the original network by comparing that control policy
to the MFPT control policy for the reduced network. The reduc-
tion step and the calculation of the relative effect on the MFPT
control policy are repeated iteratively using the reduced net-
work from the previous step as the “original” network for the
new reduction step.
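The SSD shift used throughout these simulations is simply the change in the total steady-state mass of the desirable states; a small helper (names ours) makes the quantity explicit.

    import numpy as np

    def ssd_shift(pi_before, pi_after, desirable):
        """Shift of steady-state mass toward the desirable states after applying control."""
        idx = list(desirable)
        return float(np.sum(pi_after[idx]) - np.sum(pi_before[idx]))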
Fig. 2 illustrates our findings: CoD-Reduce does not have a
significant effect (on average) on the amount of SSD shift if
a single gene is removed from the model. Similarly, there is a
small change (on average) in the relative effect of the selection
policy on the MFPT control policy when moving from a network
to its reduced version. Thus on average, our heuristic for per-
forming selection policy-based reduction does not have much
of an impact on the controllability of the network when a single
gene is removed from it. However, the effects of reduction tend to accumulate with the removal of more genes, and ultimately the reduction mappings produce poor results when very few genes remain in the network.
In general, the SSD shift and the relative effect tend to follow inverse trends. This is to be expected because a large relative effect on the control policy implies a significant difference between the control actions for the states of the larger network and those of their respective reduced states in the smaller network, which could ultimately lead to a significant difference in the shifts induced by those control policies in the steady-state distributions of the models.
A. Gastrointestinal Cancer Network Application
We have considered the effects of the reduction on a real-
world, experiment-driven network generated using the gastroin-
testinal cancer data set from [19]. Before applying reduction,
we must infer the gene regulatory network from the gene ex-
pression data and obtain the BNp. In the preprocessing step,
the gene expression data are normalized, filtered and binarized
using the methods from [20]. For the inference, we use the pro-
cedure used by [16] which is a modified version of the seed al-
gorithm proposed by Hashimoto et al. [21]. The algorithm is
initialized by the C9orf65 gene as the seed gene and the gene
CXCL12 is chosen as the control gene. The seed gene is se-
lected based on the fact that it is one of two genes composing
the best classifier in [19] and the control gene selection is based
on the strong CoD [17] connection to the seed gene. The coef-
ficient of determination (CoD) measures how a set of random
variables improves the prediction of a target variable, relative to
the best prediction in the absence of any conditioning observa-
tion [17]. Let $X = (X_1, \dots, X_k)$ be a vector of predictor binary random variables, $Y$ a binary target variable, and $f$ a Boolean function such that $f(X)$ predicts $Y$. The mean-square error of $f(X)$ as a predictor of $Y$ is the expected squared difference, $E[(Y - f(X))^2]$. Let $\varepsilon_{\mathrm{opt}}$ be the minimum MSE among all predictor functions $f$ for $Y$ and $\varepsilon_0$ be the error of the best estimate of $Y$ without any predictors. The CoD is defined as

$$\mathrm{CoD} = \frac{\varepsilon_0 - \varepsilon_{\mathrm{opt}}}{\varepsilon_0}. \qquad (18)$$
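A small empirical sketch of (18) for binary data follows; it takes the conditional and unconditional majority votes as the optimal predictors, which is one common convention for binary mean-square error, and all names are ours.

    from collections import Counter, defaultdict

    def cod(samples):
        """Empirical coefficient of determination of binary predictors X for a binary target Y.

        samples: iterable of (x, y) pairs, x a tuple of 0/1 predictor values, y in {0, 1}.
        The optimal predictor of Y given X is the conditional majority vote; the optimal
        constant predictor is the overall majority vote (cf. (18)).
        """
        samples = list(samples)
        m = len(samples)
        # Error of the best estimate of Y without any predictors.
        y_counts = Counter(y for _, y in samples)
        eps0 = min(y_counts.values()) / m if len(y_counts) > 1 else 0.0
        # Error of the best Boolean predictor f(X), estimated cell by cell.
        cells = defaultdict(Counter)
        for x, y in samples:
            cells[x][y] += 1
        eps_opt = sum(min(c.values()) if len(c) > 1 else 0 for c in cells.values()) / m
        return (eps0 - eps_opt) / eps0 if eps0 > 0 else 0.0

    # Hypothetical example: Y equals X1 with a little noise, so the CoD is close to one.
    data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 1), ((1, 1), 1)] * 10 + [((1, 1), 0)]
    print(cod(data))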
The network is grown by adding one gene at a time; at each
iterative step the gene having the strongest connectivity, mea-
sured by the CoD, to one of the genes from the current network
is added to the network. Then, the network is rewired taking
TABLE I
SSD SHIFT TOWARD THE DESIRABLE STATES IN GASTROINTESTINAL CANCER NETWORK
into account that a new gene in the network can change the
way genes influence each other. The total number of the genes
in the final network is 17: C9orf65, CXCL12, TK1, SOCS2,
THC2168366, SEC61B, ENST00000361295, KCNH2, ACTB,
RPS18, RPS13, THC2199344, SNX26, RPL26, SLC20A1,
RPS11, THC2210612, THC2161967, IER2 and LAMP1. This
17-gene BNp has a very large state space of size $2^{17}$. Exact
computation of its SSD is infeasible. Thus the SSD is estimated
by first running the network for a long time and then using the
Kolmogorov-Smirnov test to decide if the network has reached
its steady state. Initially the best gene for deletion is selected
using CoD-Reduce [16] and the network is reduced by deleting
one gene, then CoD-Reduce is applied consecutively to reduce
the network down to 10 genes: C9orf65, CXCL12, RPS18,
RPS13, THC2199344, SNX26, RPL26, SLC20A1, RPS11 and
THC2210612. After reducing the original network down to
10 genes, the MFPT control policy for the reduced network
is designed and then induced back on the original 17-gene
network. The induction of the control policy is a necessary
step since the control policy designed on the reduced network
is of smaller dimension compared to the original one. We use
the method outlined in [16] for inducing the control policy de-
signed on the 10-gene network to the original 17-gene network.
After deleting one gene, each state in the reduced network
corresponds to two states, referred to here as “parent” states,
in the original network that collapsed together. After designing
the control policy and assigning the control actions to the states
of the reduced network, in the induction procedure, the control
action of each state in the reduced network will be duplicated
as the control actions for its parent states.
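With the deleted gene as the last bit of the state encoding, this induction step amounts to duplicating each reduced-state action onto its two parent states; a short sketch (the encoding is our assumption) is:

    import numpy as np

    def induce_policy(mu_reduced):
        """Lift a control policy on the reduced network back to the original network by
        duplicating each reduced state's action for its two parent states."""
        return np.repeat(np.asarray(mu_reduced), 2)

    # A policy on a 3-gene reduced network induces one on the 4-gene original network.
    mu_reduced = np.array([0, 1, 0, 0, 1, 1, 0, 1])     # 2**3 reduced states
    mu_original = induce_policy(mu_reduced)              # 2**4 original states
    assert mu_original.shape[0] == 16

For several consecutive deletions, as in the 17-to-10-gene reduction described here, the same duplication would be applied once per deleted gene.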
Table I shows the total probability mass of the desirable and
undesirable states before and after applying the induced control
policy. As the table illustrates, there is about 13% shift in the
steady-state distribution of the network toward more desirable
states.
VII. CONCLUSION
This paper presents a general compression strategy for
Boolean network models of genomic regulation. The strategy is
based on the concept of a selection policy, and the complexity
of the model is reduced by consecutively removing genes from
the network. The removed genes are treated as latent or non-ob-
servable variables, and the truth tables for the reduced models
are constructed based on the particular selection policy. The
effects of the compression on both the dynamical behavior of
the model and on the MFPT control policy are discussed using the companion matrix associated with the selection policy. It is important to emphasize that, while there is always an optimal selection policy which minimizes the effects of compression on the stationary control policy, the problem of finding it is a hard one. Thus, there is a need to develop suboptimal compression algorithms based on heuristic approaches.
APPENDIX
Proof (Theorem 2): The integral evaluation is easily veri-
fied by induction with respect to
.
In the case
direct integral evaluation shows that
. Next, assuming that the statement of the the-
orem is true for
, one can compute
as follows:
This completes the proof of Theorem 2.
Proof (Theorem 3): We use induction with respect to
.
First we establish the case
. In this case,
and , thus . We obtain
From the definition of it is clear that
(19)
For every
we have (with the empty sum being
zero)
where . Thus, (with
(20)
Then, we should have
(21)
which is true since
by (19) and (20). Now we handle the general case. Let
and define
(22)
Then
(23)
Definition (22) clearly implies (with
)
(24)
Since
a relevant induction hypothesis is
(25)
This hypothesis is verified using (24). We set
,
and
(26)
We obtain
(27)
The last expression gives the right-hand side of (25) after a
simple evaluation of the integral. The first formula of the the-
orem follows from (23) and (25).
It remains to evaluate
. From the def-
inition of
and (21) we have
(28)
where
(29)
By the above definition, we obtain the recurrence relation
(30)
Furthermore, the evaluation of
and
provides the following hypothesis
(31)
To verify (31), again we set
, . Then,
(30) yields (32) at the bottom of the page.
Now (31) follows from (26) and (32). From (28) and (31) we
get the second formula in the statement of the theorem. This
completes the proof of Theorem 3.
ACKNOWLEDGMENT
The authors want to thank B. Shekhtman for the inspiring
discussions.
(32)
REFERENCES
[1] A. Datta, R. Pal, A. Choudhary, and E. R. Dougherty, "Control approaches for probabilistic gene regulatory networks," IEEE Signal Process. Mag., vol. 24, no. 1, pp. 54–63, 2007.
[2] D. P. Bertsekas, Dynamic Programming and Optimal Control, 3rd ed. Belmont, MA: Athena Scientific, 2005.
[3] B. Faryabi, A. Datta, and E. R. Dougherty, "On approximate stochastic control in genetic regulatory networks," IET Syst. Biol., vol. 1, no. 6, pp. 361–368, 2007.
[4] G. Vahedi, B. Faryabi, J.-F. Chamberland, A. Datta, and E. R. Dougherty, "Intervention in gene regulatory networks via a stationary mean-first-passage-time control policy," IEEE Trans. Biomed. Eng., pp. 2319–2331, Oct. 2008.
[5] X. Qian, I. Ivanov, N. Ghaffari, and E. R. Dougherty, "Intervention in gene regulatory networks via greedy control policies based on long-run behavior," BMC Syst. Biol., 2009.
[6] S. A. Kauffman, "Metabolic stability and epigenesis in randomly constructed genetic nets," J. Theoret. Biol., vol. 22, pp. 437–467, 1969.
[7] I. Shmulevich, E. R. Dougherty, S. Kim, and W. Zhang, "Probabilistic Boolean networks: A rule-based uncertainty model for gene regulatory networks," Bioinform., vol. 18, no. 2, pp. 261–274, 2002.
[8] R. Pal, A. Datta, and E. R. Dougherty, "Robust intervention in probabilistic Boolean networks," IEEE Trans. Signal Process., vol. 56, no. 3, pp. 1280–1294, 2008.
[9] E. R. Dougherty and I. Shmulevich, "Mappings between probabilistic Boolean networks," Signal Process., vol. 83, no. 4, pp. 799–809, 2003.
[10] I. Ivanov and E. R. Dougherty, "Reduction mappings between probabilistic Boolean networks," EURASIP J. Appl. Signal Process., vol. 1, no. 1, pp. 125–131, 2004.
[11] M. Brun, E. R. Dougherty, and I. Shmulevich, "Steady-state probabilities for attractors in probabilistic Boolean networks," Signal Process., vol. 85, no. 10, pp. 1993–2013, 2005.
[12] X. Qian and E. R. Dougherty, "On the long-run sensitivity of probabilistic Boolean networks," J. Theoret. Biol., pp. 560–577, 2009.
[13] J. J. Hunter, "Stationary distributions and mean first passage times of perturbed Markov chains," Linear Algebra Appl., vol. 410, pp. 217–243, 2005.
[14] P. J. Schweitzer, "Perturbation theory and finite Markov chains," J. Appl. Probab., vol. 5, pp. 401–413, 1968.
[15] J. Norris, Markov Chains. Cambridge, U.K.: Cambridge Univ. Press, 1998.
[16] N. Ghaffari, I. Ivanov, X. Qian, and E. R. Dougherty, "A CoD-based reduction algorithm for designing stationary control policies on Boolean networks," Bioinformatics, vol. 26, pp. 1556–1563, 2010.
[17] E. R. Dougherty, S. Kim, and Y. Chen, "Coefficient of determination in nonlinear signal processing," Signal Process., vol. 80, pp. 2219–2235, 2000.
[18] R. Pal, I. Ivanov, A. Datta, and E. R. Dougherty, "Generating Boolean networks with a prescribed attractor structure," Bioinformatics, vol. 21, no. 21, pp. 4021–4025, Nov. 2005.
[19] N. D. Price, J. Trent, A. K. El-Naggar, D. Cogdell, E. Taylor, K. K. Hunt, R. E. Pollock, L. Hood, I. Shmulevich, and W. Zhang, "Highly accurate two-gene classifier for differentiating gastrointestinal stromal tumors and leiomyosarcomas," Proc. Nat. Acad. Sci., vol. 104, no. 9, pp. 3414–3419, 2007.
[20] I. Shmulevich and W. Zhang, "Binary analysis and optimization-based normalization of gene expression data," Bioinform., vol. 18, no. 4, pp. 555–565, 2002.
[21] R. Hashimoto, S. Kim, I. Shmulevich, W. Zhang, M. L. Bittner, and E. R. Dougherty, "A directed-graph algorithm to grow genetic regulatory subnetworks from seed genes based on strength of connection," Bioinform., vol. 20, no. 8, pp. 1241–1247, 2004.
Ivan Ivanov received the Ph.D. degree in mathe-
matics from the University of South Florida, Tampa.
He is an Assistant Professor with the Department
of Veterinary Physiology and Pharmacology, Texas
A&M University, College Station, TX. His current
research is focused on genomic signal processing,
and, in particular, on modeling the genomic regu-
latory mechanisms and on mappings reducing the
complexity of the models of genomic regulatory
networks.
Plamen Simeonov received the Ph.D. degree in
mathematics from the University of South Florida,
Tampa.
He is an Associate Professor of mathematics with
the Department of Computer and Mathematical Sci-
ences, University of Houston-Downtown, Houston,
TX. His current research interests are in the areas
of analysis, approximation theory, potential theory,
orthogonal polynomials and special functions, com-
puter-aided geometric design, and biostatistics.
Noushin Ghaffari received the M.Sc. degree in
computer information systems from the University
of Houston—Clear Lake, TX, in 2006.
Currently, she is pursuing the Ph.D. degree with
the Department of Electrical and Computer Engi-
neering, Texas A&M University, College Station.
Her research interests include: genomic signal
processing, systems biology, and computational
biology; especially complexity reduction and control
of Genetic Regulatory Networks.
Xiaoning Qian received the Ph.D. degree in elec-
trical engineering from Yale University, New Haven,
CT, in 2005.
Currently, he is an Assistant Professor with the
Department of Computer Science and Engineering,
University of South Florida, Tampa. He was with
the Bioinformatics Training Program, Texas A&M
University, College Station. His current research
interests include computational biology, genomic
signal processing, and biomedical image analysis.
Edward R. Dougherty received the M.S. degree
in computer science from Stevens Institute of
Technology, Hoboken, NJ, the Ph.D. degree in math-
ematics from Rutgers University, New Brunswick,
NJ, and the Doctor Honoris Causa by the Tampere
University of Technology, Finland.
He is a Professor with the Department of Electrical
and Computer Engineering, Texas A&M Univer-
sity, College Station, where he holds the Robert
M. Kennedy ’26 Chair in Electrical Engineering
and is Director of the Genomic Signal Processing
Laboratory. He is also co-Director of the Computational Biology Division of
the Translational Genomics Research Institute, Phoenix, AZ, and is an Adjunct
Professor with the Department of Bioinformatics and Computational Biology,
University of Texas M. D. Anderson Cancer Center, Houston. He is author
of 15 books, editor of five others, and author of 250 journal papers. He has
contributed extensively to the statistical design of nonlinear operators for image
processing and the consequent application of pattern recognition theory to
nonlinear image processing. His current research in genomic signal processing
is aimed at diagnosis and prognosis based on genetic signatures and using gene
regulatory networks to develop therapies based on the disruption or mitigation
of aberrant gene function contributing to the pathology of a disease.
Dr. Dougherty is a fellow of SPIE, has received the SPIE President’s Award,
and served as the editor of the SPIE/IS&T Journal of Electronic Imaging. At Texas A&M University he has received the Association of Former Students Distinguished Achievement Award in Research, been named Fellow of the Texas Engineering Experiment Station, and been named Halliburton Professor of the Dwight Look College of Engineering.