Virtual parallel computing and a search algorithm using matrix product states.
ABSTRACT We propose a form of parallel computing on classical computers that is based on matrix product states. The virtual parallelization is accomplished by representing bits with matrices and by evolving these matrices from an initial product state that encodes multiple inputs. Matrix evolution follows from the sequential application of gates, as in a logical circuit. The action of classical probabilistic one-bit and deterministic two-bit gates such as NAND is implemented in terms of matrix operations and, as opposed to quantum computing, it is possible to copy bits. We present a way to explore this method of computation to solve search problems and count the number of solutions. We argue that if the classical computational cost of testing solutions (witnesses) requires less than O(n^2) local two-bit gates acting on n bits, the search problem can be fully solved in subexponential time. Therefore, for this restricted type of search problem, the virtual parallelization scheme is faster than Grover's quantum algorithm.
arXiv:1202.1809v1 [quant-ph] 8 Feb 2012
Virtual parallel computing and a search algorithm using matrix product states
Claudio Chamon (1) and Eduardo R. Mucciolo (2)
(1) Department of Physics, Boston University, Boston, Massachusetts 02215, USA
(2) Department of Physics, University of Central Florida, Orlando, Florida 32816, USA
(Dated: February 9, 2012)
We propose a form of parallel computing on classical computers that is based on matrix product
states. The virtual parallelization is accomplished by evolving all possible results for multiple inputs,
with bits represented by matrices. The action of classical probabilistic 1-bit and deterministic
2-bit gates such as NAND is implemented in terms of matrix operations and, as opposed to
quantum computing, it is possible to copy bits. We present a way to explore this method of
computation to solve search problems and count the number of solutions. We argue that if the
classical computational cost of testing solutions (witnesses) requires less than O(n^2) local two-bit
gates acting on n bits, the search problem can be fully solved in subexponential time. Therefore,
for this restricted type of search problem, the virtual parallelization scheme is faster than Grover's
quantum algorithm.
Interference and the ability to follow many history
paths simultaneously make quantum systems attractive
for implementing computations [1]. Efficient algorithms
exploring these properties have been proposed to solve
practical problems such as number factoring [2] and un-
sorted database search [3]. However, we still do not have
a sufficiently large and resilient quantum computer to
take advantage of these algorithms. It is thus very de-
sirable to try to find better and more efficient ways to
compute with classical systems. In this regard, recent
advances in our understanding of quantum many-body
systems provide some guidance. It is well understood
now that the time evolution of a large class of one-
dimensional interacting systems can be efficiently sim-
ulated by expressing their wave functions in a matrix
product state form and by using a time-evolving block
decimation (TEBD) [4]. A key aspect of this success
is data compression: Even though many-body interac-
tions tend to increase the rank of the matrices over time,
it is possible to use truncation along the evolution to
keep the matrices relatively small, such that the result-
ing wave function approximates quite accurately the ex-
act one without an exponential computation cost [5]. In
quantum systems, it is well understood that local interactions
do not quickly entangle one-dimensional many-body
states, justifying the matrix truncation [6, 7].
In this Letter, we describe a method of classical com-
putation that utilizes matrix product states (MPS) to
implement search and other similar tasks. Compression,
when possible, provides additional speedup. Formally,
instead of working with wave functions and quantum am-
plitudes, we describe the state of the computer in terms
of a stochastic probability distribution written as traces
of matrix product states associated to bit configurations.
The idea of expressing classical probability distributions
in the form of MPS is not new [8], but the focus so far has
been on using it to study non-equilibrium phenomena of
physical systems (see for instance Ref. 9). As we show
below, an MPS formulation of classical probability distri-
butions can also be employed to create a virtual parallel
machine where all possible outcomes of an algorithm are
obtained for all $2^n$ inputs of an n-bit register. Information
about these outcomes is encoded and compressed
in the matrices forming the MPS. By itself this “paral-
lelism” is not obviously useful; it is, however, if a certain
problem can use the probability of a single outcome at
a time. This is the case of a search problem that seeks,
for a given y, the value of x such that y = f(x) for an
algorithmically computable function f. Then, the focus
is not on all values of the output, but on only one given y.
We shall show below that in this case matrix computing
can be useful. In particular, from the probability of y,
the method directly provides the number of input values
x satisfying the functional constraint y = f(x).
In our matrix computing, insertion and removal of bits
are allowed and 1-bit and 2-bit gates can be implemented
much like in a conventional computer. Our 1-bit gates
are probabilistic while our 2-bit gates are deterministic.
2-bit gates rely on a singular value decomposition (SVD)
to maintain the MPS form of the probability distribution.
All these operations preserve the positivity and the over-
all normalization of the probability even though we work
with non-positive matrices. (We find that even when ma-
trices are truncated, no significant negative probabilities
are produced for practical calculations.)
Matrix computing formulation – Consider a set of binary variables $\{x_j = 0,1\}_{j=1,\dots,n}$ describing a set of $n$ bits, with $|x_1 x_2 \dots x_n\rangle \equiv |x\rangle$ denoting a particular configuration of this system. In analogy to quantum mechanics, we define the vector

\[ |P\rangle = \sum_{x_1,\dots,x_n=0,1} P(x_1,\dots,x_n)\, |x_1 \dots x_n\rangle, \tag{1} \]

where

\[ P(x_1,\dots,x_n) = \mathrm{tr}\left( M_1^{x_1} \cdots M_n^{x_n} \right). \tag{2} \]

Here, each $M_j^{x_j}$ is a real matrix of dimensions $D_{j-1} \times D_j$. The trace can be dropped if we consider the first and last matrices to be row and column vectors, i.e., $D_0 = D_n = 1$. The state vector is normalized in the following sense: Define $|\Sigma\rangle = \sum_{x_1,\dots,x_n=0,1} |x_1 \dots x_n\rangle$, then $Z = \langle\Sigma|P\rangle = 1$ since $\sum_x P(x) = 1$. Starting from an initial probability distribution $P_0(x_1,\dots,x_n)$, the vector $|P\rangle$ evolves by the application of 1-bit and 2-bit gates, with the latter always acting on adjacent bits.

• 1-bit gates: We will use probabilistic one-bit gates, which take states 0,1 to states 0,1 with probabilities $p, 1-p$ and $q, 1-q$:

\[ 0 \xrightarrow{\;p\;} 0 \ \ \text{or} \ \ 0 \xrightarrow{\;1-p\;} 1, \qquad 1 \xrightarrow{\;1-q\;} 0 \ \ \text{or} \ \ 1 \xrightarrow{\;q\;} 1. \]

The probabilities can be encoded into a transfer function $t_{\tilde a,a}$ that takes a logic input $a = 0,1$ into a logic output $\tilde a = 0,1$. Explicitly: $t_{0,0} = p$, $t_{1,0} = 1-p$, $t_{0,1} = 1-q$, $t_{1,1} = q$. A 1-bit gate acting on bit $j$ yields a new matrix

\[ \tilde M_j^{x_j} = \sum_{x'_j=0,1} t_{x_j,x'_j}\, M_j^{x'_j}. \tag{3} \]

The transfer function satisfies the sum rule $\sum_{\tilde a=0,1} t_{\tilde a,a} = 1$, which ensures that the normalization $Z = 1$ is maintained as the system evolves. Examples of 1-bit gates are: (a) Deterministic NOT, with $p = 0$ and $q = 0$, (b) RAND, with $p = 1/2$ and $q = 1/2$, which randomizes the bit, (c) RST, with $p = 1$ and $q = 0$, which resets the bit to 0.

• 2-bit gates: We will consider only deterministic two-bit gates. Given two logical functions $A(a,b)$ and $B(a,b)$, we construct the transfer function $T_{\tilde a\tilde b,ab}$, taking bits with states $a$ and $b$ to bits with states $\tilde a$ and $\tilde b$, respectively:

\[ T_{\tilde a\tilde b,ab} = \begin{cases} 1, & \tilde a = A(a,b) \ \text{and} \ \tilde b = B(a,b), \\ 0, & \text{otherwise}. \end{cases} \tag{4} \]

Similarly to 1-bit gates, the normalization after 2-bit gates is preserved by the sum rule $\sum_{\tilde a,\tilde b=0,1} T_{\tilde a\tilde b,ab} = 1$. The evolved matrices must satisfy

\[ \tilde M_{j-1}^{x_{j-1}} \tilde M_j^{x_j} = \sum_{x'_{j-1},x'_j=0,1} T_{x_{j-1}x_j,\,x'_{j-1}x'_j}\, M_{j-1}^{x'_{j-1}} M_j^{x'_j}, \tag{5} \]

and we use the SVD to decompose the result of the gate operation on the right-hand side of Eq. (5) as a product of two matrices, as in the left-hand side of the equation, for all the four cases $x_{j-1}, x_j = 0,1$.
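To make Eqs. (1)-(3) concrete, here is a minimal NumPy sketch of a product-state MPS and of a probabilistic 1-bit gate. It is an illustration under stated assumptions, not the authors' implementation: the helper names `product_state`, `probability`, and `apply_one_bit_gate`, and the choice to store each bit as a pair of matrices `[M^0, M^1]`, are ours.

```python
import itertools
import numpy as np

def product_state(p_zero):
    """Product-state MPS: bit j is 0 with probability p_zero[j] (all D_j = 1)."""
    return [[np.array([[p]]), np.array([[1.0 - p]])] for p in p_zero]

def probability(mps, bits):
    """Eq. (2) with D_0 = D_n = 1: P(x) is the ordered matrix product."""
    out = np.eye(1)
    for site, x in zip(mps, bits):
        out = out @ site[x]
    return float(out[0, 0])

def apply_one_bit_gate(mps, j, p, q):
    """Eq. (3) on site j (0-based): 0 -> 0 with prob. p, 1 -> 1 with prob. q."""
    t = np.array([[p, 1.0 - q],       # t[out, in]; every column sums to 1
                  [1.0 - p, q]])
    m0, m1 = mps[j]
    mps[j] = [t[x, 0] * m0 + t[x, 1] * m1 for x in (0, 1)]

# Three bits, all initially set to 0.
mps = product_state([1.0, 1.0, 1.0])
apply_one_bit_gate(mps, 0, p=0.0, q=0.0)   # NOT on the first bit
apply_one_bit_gate(mps, 1, p=0.5, q=0.5)   # RAND on the second bit

table = {bits: probability(mps, bits)
         for bits in itertools.product((0, 1), repeat=3)}
Z = sum(table.values())                    # normalization of Eq. (1)
```

After these two gates the weight is split equally between the configurations 100 and 110, and `Z` remains 1 because each column of `t` sums to one, which is the sum rule quoted above.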
Let us demonstrate this construction with a concrete example. Consider the following logical operation on bits $j-1$ and $j$: $A_{\rm NAND}(a,b) = a$ and $B_{\rm NAND}(a,b) = \overline{a \wedge b}$. The first bit is unaffected, while the second one evolves into the NAND operation between the two bits. In this case, $T_{01,00} = T_{01,01} = T_{11,10} = T_{10,11} = 1$, with all other elements set to zero. We use the transfer function to determine the four blocks (for $x_{j-1}, x_j = 0,1$) of a matrix $M^{\rm NAND}_{j-1,j}$ of dimension $2D_{j-2} \times 2D_j$:

\[ M^{\rm NAND}_{j-1,j} = \begin{pmatrix} 0 & M_{j-1}^{0} M_j^{0} + M_{j-1}^{0} M_j^{1} \\ M_{j-1}^{1} M_j^{1} & M_{j-1}^{1} M_j^{0} \end{pmatrix}. \tag{6} \]
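To factor the matrix $M^{\rm NAND}_{j-1,j}$ back into a product, we employ an SVD,

\[ M^{\rm NAND}_{j-1,j} \overset{\rm SVD}{=} \begin{pmatrix} \tilde M_{j-1}^{0} \\ \tilde M_{j-1}^{1} \end{pmatrix} \begin{pmatrix} \tilde M_j^{0} & \tilde M_j^{1} \end{pmatrix}. \tag{7} \]

In this process, the common dimension $D_{j-1}$ may change and likely increase. This is an issue of fundamental importance, to which we shall return when we discuss a search algorithm.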
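The NAND construction above can be sketched numerically as follows. This is an illustrative reading of the block-matrix-plus-SVD step, not the authors' code: the function name `apply_two_bit_gate`, the tolerance `tol` used to discard numerically zero singular values, and the convention of absorbing the singular values into the left factor are all assumptions made for the example.

```python
import itertools
import numpy as np

def probability(mps, bits):
    """P(x) as the ordered matrix product, assuming D_0 = D_n = 1."""
    out = np.eye(1)
    for site, x in zip(mps, bits):
        out = out @ site[x]
    return float(out[0, 0])

def apply_two_bit_gate(mps, j, A, B, tol=1e-12):
    """Apply (a, b) -> (A(a, b), B(a, b)) to neighbouring sites j, j+1 (0-based)."""
    left, right = mps[j], mps[j + 1]
    dl, dr = left[0].shape[0], right[0].shape[1]
    # Blocks of the (2*dl) x (2*dr) matrix, indexed by the OUTPUT bit values.
    blocks = [[np.zeros((dl, dr)) for _ in range(2)] for _ in range(2)]
    for a, b in itertools.product((0, 1), repeat=2):
        blocks[A(a, b)][B(a, b)] += left[a] @ right[b]
    u, s, vt = np.linalg.svd(np.block(blocks), full_matrices=False)
    r = max(1, int(np.sum(s > tol)))          # keep only nonzero singular values
    us = u[:, :r] * s[:r]                     # absorb singular values on the left
    mps[j] = [us[:dl], us[dl:]]               # rows split by the new left bit
    mps[j + 1] = [vt[:r, :dr], vt[:r, dr:]]   # columns split by the new right bit

# Two bits, each 0 or 1 with probability 1/2 (1x1 matrices).
mps = [[np.array([[0.5]]), np.array([[0.5]])] for _ in range(2)]
apply_two_bit_gate(mps, 0, A=lambda a, b: a, B=lambda a, b: 1 - (a & b))

table = {bits: probability(mps, bits)
         for bits in itertools.product((0, 1), repeat=2)}
bond_dimension = mps[0][0].shape[1]
```

For uniformly random inputs the outputs 01, 10, and 11 appear with probabilities 1/2, 1/4, and 1/4, and the shared bond dimension grows from 1 to 2. The factors returned by the SVD generally contain negative entries even though every probability they encode is non-negative.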
• Bit insertions and removals: For computational tasks such as addition and multiplication, it is important to be able to insert and remove bits. These operations are straightforward for MPS. Insertion of a new bit (say, initially set to 0) in between bits $j-1$ and $j$ amounts to replacing $M_{j-1}^{x_{j-1}} M_j^{x_j}$ with $M_{j-1}^{x_{j-1}} M_\alpha^{x_\alpha} M_j^{x_j}$, where $M_\alpha^{1}$ and $M_\alpha^{0}$ are $D_{j-1} \times D_{j-1}$ null and identity matrices, respectively, and the total sum over bit configurations in the vector $|P\rangle$ [see Eq. (1)] has now to include the binary variable $x_\alpha = 0,1$. Removal of a bit is done by absorbing its matrix into the one of an adjacent bit, namely, by tracing it out; for instance, we use $\sum_{x_j=0,1} M_j^{x_j} M_{j+1}^{x_{j+1}} = \tilde M_{j+1}^{x_{j+1}}$ to remove bit $j$.

How can matrix computing be used to solve certain computational problems? – Here we shall present computational algorithms that explore the virtual parallelism encoded in matrix product states. To be concrete, consider the following search problem as an example:

Given a function $y = f(x)$ that can be computed algorithmically with $O(n^d)$ gates and a certain value for $y$, we would like to search for an input $x$ that yields as output $y = f(x)$.
The reason why matrix computation is useful for this search problem can be argued as follows. Matrix product states can express the probability values of all possible $m$-bit outputs $y \equiv y_1 y_2 \dots y_m$ if one starts with a product state encoding all possible $n$-bit inputs $x \equiv x_1 x_2 \dots x_n$, namely, $P(x) = 2^{-n}$ for all $x$. Of course, if we were interested in all the probabilities, we would have to compute an exponentially large ($2^m$) number of traces of products of matrices. But this is not what is needed to perform the search above: We are interested in just one output $y$ for this problem. We thus proceed in the following steps.
1. Starting with all bits $x_i$, $i = 1,\dots,n$, randomized with equal probabilities 1/2 for being 0 or 1, we compute the final output matrices $M_j^{y_j}$, $j = 1,\dots,m$.

2. We compute the probability $P(y)$ for the given $y$ we are interested in. If $P(y) \ge 2^{-n}$, then there is at least one value of $x$ such that $y = f(x)$.

3. We then fix one of the input bits, say $x_1$, to be 0, instead of randomizing it. We recompute the output matrices $M_j^{y_j}$, $j = 1,\dots,m$, and the new probability $P(y)$. Again we test if $P(y) \ge 2^{-n}$. If the probability fell below the threshold, we must reset $x_1$ to 1. (Notice that, since there may be more than one $x$ for a given $y$, the fact that $P(y)$ stays above threshold does not mean that switching to $x_1 = 1$ is necessarily forbidden, but we shall stick instead to $x_1 = 0$ in this case to avoid unnecessary iterations.)

4. We repeat step 3 fixing now input bit $x_2$, then repeat it again fixing input bit $x_3$, and so on until we finally fix input bit $x_n$. At the end of $n$ steps, having fixed all the $n$ bits of the input, we have arrived at one value for $x$ such that $y = f(x)$.
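The four steps can be sketched end to end on a toy problem. Everything specific in this example is an assumption made for illustration: the circuit (a chain of in-place gates $(a,b) \to (a, \mathrm{NAND}(a,b))$ on neighbouring bits, so that $m = n$), the helper names, and the use of half the threshold $2^{-n}$ as a floating-point safety margin.

```python
import numpy as np

def product_state(p_zero):
    """Bit j is 0 with probability p_zero[j]; 1.0 or 0.0 fixes the bit."""
    return [[np.array([[p]]), np.array([[1.0 - p]])] for p in p_zero]

def probability(mps, bits):
    out = np.eye(1)
    for site, x in zip(mps, bits):
        out = out @ site[x]
    return float(out[0, 0])

def apply_two_bit_gate(mps, j, A, B, tol=1e-12):
    left, right = mps[j], mps[j + 1]
    dl, dr = left[0].shape[0], right[0].shape[1]
    blocks = [[np.zeros((dl, dr)) for _ in range(2)] for _ in range(2)]
    for a in (0, 1):
        for b in (0, 1):
            blocks[A(a, b)][B(a, b)] += left[a] @ right[b]
    u, s, vt = np.linalg.svd(np.block(blocks), full_matrices=False)
    r = max(1, int(np.sum(s > tol)))
    us = u[:, :r] * s[:r]
    mps[j] = [us[:dl], us[dl:]]
    mps[j + 1] = [vt[:r, :dr], vt[:r, dr:]]

def run_circuit(p_zero):
    """Toy f(x): apply (a, b) -> (a, NAND(a, b)) on bits (0,1), (1,2), ..."""
    mps = product_state(p_zero)
    for j in range(len(p_zero) - 1):
        apply_two_bit_gate(mps, j, lambda a, b: a, lambda a, b: 1 - (a & b))
    return mps

def search(y):
    """Return (x, number_of_solutions), or (None, 0) if no x gives f(x) = y."""
    n = len(y)
    threshold = 0.5 * 2.0 ** (-n)           # half of 2^-n, to absorb round-off
    p_zero = [0.5] * n                      # step 1: randomize every input bit
    p_y = probability(run_circuit(p_zero), y)
    if p_y < threshold:                     # step 2: is there any solution?
        return None, 0
    count = round(p_y * 2 ** n)             # P(y) * 2^n counts the solutions
    for i in range(n):                      # steps 3 and 4: fix bits one by one
        p_zero[i] = 1.0                     # try x_i = 0
        if probability(run_circuit(p_zero), y) < threshold:
            p_zero[i] = 0.0                 # no solution with x_i = 0: x_i = 1
    return tuple(0 if p == 1.0 else 1 for p in p_zero), count

solution, n_solutions = search((0, 1, 1, 0))
```

For this toy circuit the target `(0, 1, 1, 0)` has two preimages, and the loop settles on `(0, 0, 0, 1)` because it prefers 0 whenever both values remain feasible. The circuit is rerun once per fixed bit, which is the origin of the extra factor of $n$ in the cost estimate.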
Let us discuss the computational cost of such an algorithm. To simplify the discussion, let us present it in terms of the largest matrix dimension $D$ in the computations, which we shall relate to the number $n_g$ of gates involved in the computation of the function $f(x)$. All SVD steps involve matrices with rank smaller than or equal to $D$; therefore, the cost associated with gate operations is no more than $O(n_g \times D^3)$. One also has to compute the trace of the matrix products for a fixed $y$ to yield the probability $P(y)$, and this takes time $O(n \times D)$, which we discard in comparison with the SVD steps. We then have to repeat the procedure fixing bit-by-bit the $x_i$, $i = 1,\dots,n$. Therefore, in the worst case it takes a time $O(n \times n_g \times D^3)$ to find $x$.
The largest computational cost comes from the SVD steps, which depends on the rank $D$ of the matrices. The crucial issue is how $D$ scales with either the number of bits $n$ or the number of gates $n_g$ for a given algorithm to compute $f(x)$. We shall break the discussion below into two cases. The first one focuses on the case where all the singular values are kept and no approximations are made. The purpose of this discussion is to show that, even without approximations, matrix computing can yield search algorithms that perform faster than Grover's quantum algorithm depending on the complexity involved in computing the function $f(x)$. The second case is more applied, and makes use of the Eckart-Young theorem [10] to best approximate the matrices by others of rank $D_{\rm cut} < D$ by keeping only the largest $D_{\rm cut}$ singular values in the SVD steps. Of course the usefulness of computing with the truncated singular values depends on how compressible the partial answers are in each step of the computation.
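As a small illustration of the Eckart-Young truncation mentioned above, the sketch below keeps the largest `d_cut` singular values of a matrix and compares the resulting Frobenius-norm error with the norm of the discarded singular values. The function name `truncate` and the random test matrix are assumptions for the example; the paper does not specify an implementation.

```python
import numpy as np

def truncate(matrix, d_cut):
    """Best rank-d_cut approximation in Frobenius norm via a truncated SVD."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    approx = (u[:, :d_cut] * s[:d_cut]) @ vt[:d_cut]
    discarded = float(np.sqrt(np.sum(s[d_cut:] ** 2)))
    return approx, discarded

rng = np.random.default_rng(0)
m = rng.random((8, 8))                      # stand-in for a two-site block matrix
approx, discarded = truncate(m, d_cut=3)
error = float(np.linalg.norm(m - approx))   # Frobenius norm for 2-D input
```

The error equals the root of the summed squares of the discarded singular values, so the singular-value spectrum directly indicates how compressible a given step is.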
Computational costs without discarding singular values – We shall prove below the following result: The maximum dimension of any matrix in a computation using $n_g$ gates in a system with $n$ bits is bounded by $D \le D_{\max}(n,n_g) = \min\left( 2^{\lfloor\sqrt{2n_g}\rfloor},\, 2^{\lfloor n/2\rfloor} \right)$. The consequence of this result on the computational time is as follows. As we argued above, the search algorithm takes a time $O(n \times n_g \times D^3)$. For a function $y = f(x)$ that can be computed with $n_g \sim n^d$ gates, the time to search for an $x$ that gives a fixed $y$ has two different behaviors depending on whether $d < 2$ or $d \ge 2$. If $d < 2$, $D_{\max} \sim 2^{\sqrt{2}\, n^{d/2}}$, and thus the search takes, in the worst possible case, a time $O(n^{d+1} \times 2^{3\sqrt{2}\, n^{d/2}})$ using matrix computing algorithms. If instead $d \ge 2$, $D_{\max}$ saturates to $D_{\max} \sim 2^{n/2}$ and in the worst possible case the computation (without discarding singular values) takes exponential time. In other words, there is a transition between subexponential and exponential behavior at $d_c = 2$. It thus follows that for any function $f(x)$ that can be computed with $n_g < O(n^2)$ gates, the full search problem can be solved faster using matrix computing than using Grover's quantum algorithm, which scales as $O(2^{n/2})$.

Proof of the bound on the largest bond dimension – Upon application of a 2-bit gate on bits $j-1$ and $j$, the dimension $D_{j-1}$ will increase as follows. Starting with $D_{j-2} \times D_{j-1}$ matrices $M_{j-1}^{x_{j-1}}$ and $D_{j-1} \times D_j$ matrices $M_j^{x_j}$, one assembles a $2D_{j-2} \times 2D_j$ matrix $M^{\rm gate}_{j-1,j}$ [see the example of the NAND gate in Eq. (6)]. The SVD step will lead to $D_{j-2} \times \tilde D_{j-1}$ matrices $\tilde M_{j-1}^{x_{j-1}}$ and $\tilde D_{j-1} \times D_j$ matrices $\tilde M_j^{x_j}$, where the new bond dimension $\tilde D_{j-1} = \min(2D_{j-2}, 2D_j)$. It is useful to work on a logarithmic scale and define $h_j = \log_2 D_j$. Thus we can write $\tilde h_{j-1} = \min(h_{j-2}, h_j) + 1$.

Let us next prove that at any step in the algorithmic evolution the "entanglement heights" $h_j$ satisfy the condition $|h_j - h_{j-1}| \le 1, \forall j$, which we shall refer to as the height difference constraint (hdc). The proof is done by induction. At the initial state of the calculation, one starts with the product state of all possible equally weighted inputs $x$, which corresponds to $1 \times 1$ matrices or, equivalently, all $h_j = 0$, so that $|h_j - h_{j-1}| = 0 \le 1$, thus satisfying the condition. Now suppose that the condition is satisfied at step $\tau$; we can show that it is then also satisfied at step $\tau + 1$, when a 2-bit gate is applied between two adjacent bits $j-1$ and $j$. None of the heights other than $h_{j-1} \to \tilde h_{j-1}$ are changed, therefore the hdc condition $|h_i - h_{i-1}| \le 1$ remains satisfied for all $i < j-1$ and $i > j$, and it just remains to be shown that it is satisfied for $i = j-1$ and $i = j$. Consider the case where $h_{j-2} \le h_j$ (the other case $h_j \le h_{j-2}$ is analogous). In this case $\tilde h_{j-1} = h_{j-2} + 1$, satisfying the condition $|\tilde h_{j-1} - h_{j-2}| \le 1$. Now $h_j - \tilde h_{j-1} = h_j - h_{j-2} - 1 = (h_j - h_{j-1}) + (h_{j-1} - h_{j-2}) - 1$, and using that $h_j - h_{j-1} \le 1$ and $h_{j-1} - h_{j-2} \le 1$, as well as that $h_{j-2} \le h_j$, we have that $|h_j - \tilde h_{j-1}| \le 1$. It thus follows that the hdc condition $|h_j - h_{j-1}| \le 1, \forall j$ is satisfied at all steps in the calculation. An example of a configuration of entanglement heights satisfying the hdc is shown in Fig. 1.

[Figure 1: entanglement height $h_j$ (vertical axis, 0 to 6) plotted against the bond index $j$ (horizontal axis, 0 to 12).]
FIG. 1. Example of a configuration of entanglement heights ($h_j = \log_2 D_j$) satisfying the height difference constraint $|h_j - h_{j-1}| \le 1, \forall j$ when $n = 12$. The dashed line shows the configuration with maximum heights.
If all we do to evolve the state is to apply 2-bit gates, we have shown that $|h_j - h_{j-1}| \le 1, \forall j$. It is easy to see that after a bit insertion the condition is still satisfied, because the change in height is zero on the two sides of the inserted bit (corresponding to a square matrix), with all other relative height differences unchanged. The removal (tracing out) of bits is slightly more subtle. Right after the removal, there are large jumps across the region where the bits were removed, but these can be brought up to satisfy the hdc by applying a series of 2-bit identity gates [$A(a,b) = a$ and $B(a,b) = b$] sweeping from left to right followed by another from right to left. These sweeps remove the height "faults" (and actually tend to decrease the overall height). Therefore we arrive at the result that the hdc condition is satisfied after all operations: 2-bit gates, bit insertions, and bit deletions (after the identity sweeps).
Let us now show that the maximum height resulting from the application of $n_g$ 2-bit gates is bounded by $h_{\max} \le \lfloor\sqrt{2n_g}\rfloor$. The application of a single 2-bit gate on bits $j-1$ and $j$ changes the height $h_{j-1} \to \tilde h_{j-1} = \min(h_{j-2}, h_j) + 1$. Because the relative heights of neighboring bonds cannot differ by more than 1 unit due to the hdc, the maximum amount that the height $\tilde h_{j-1}$ can increase with respect to $h_{j-1}$ is by 2 (which occurs when $h_{j-2} = h_j = h_{j-1} + 1$). Therefore one can write that $S = \sum_i h_i \le 2n_g$. Now, suppose that the maximum height is $h_{\max}$ at some bond labelled by $i_{\max}$ (located to the right of bit $i_{\max}$); because the heights $h_0$ to the left of the 1st bit and $h_n$ to the right of the $n$th bit are both equal to 0 at all times, and because of the hdc condition, there are constraints on how quickly the heights can grow from 0 to $h_{\max}$ at $i_{\max}$ and then decrease down to 0 again. The climb and descent that minimizes the area $S$ can be trivially seen to be a triangle where $h_j$ increases linearly from $j = i_{\max} - h_{\max}$ to $j = i_{\max}$, and then decreases linearly until $j = i_{\max} + h_{\max}$. The area of this triangle is $S_{\min} = h_{\max}^2$, and any other height profile that reaches the same maximum height $h_{\max}$ has larger or equal area. Therefore, $h_{\max}^2 \le S \le 2n_g$, and thus we arrive at the conclusion that $h_{\max} \le \lfloor\sqrt{2n_g}\rfloor$, i.e., the bound on the maximum entanglement height for a given number of gates. Furthermore, because of the hdc and the fact that $h_0 = h_n = 0$, the entanglement height for a fixed $j$ is bounded by $h_j \le \min(j, n-j)$, and the overall maximum $h_{\max} = \lfloor n/2 \rfloor$ is reached at the center of the chain, $j = \lfloor n/2 \rfloor$ and $j = \lceil n/2 \rceil$ (which coincide when $n$ is even).
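The height argument can be checked numerically with a short simulation. The sketch below applies the worst-case update $\tilde h_{j-1} = \min(h_{j-2}, h_j) + 1$ for randomly placed gates, asserts the height difference constraint after every gate, and records the largest height reached. The function names and the random gate placement are assumptions for illustration; real gates may produce smaller ranks than this worst case.

```python
import math
import random

def simulate_heights(n, n_gates, seed=0):
    """Worst-case entanglement heights h[0..n] after n_gates random 2-bit gates."""
    rng = random.Random(seed)
    h = [0] * (n + 1)                         # h[0] = h[n] = 0 never change
    h_max_seen = 0
    for _ in range(n_gates):
        j = rng.randrange(2, n + 1)           # gate on bits j-1 and j (1-based)
        h[j - 1] = min(h[j - 2], h[j]) + 1    # worst-case update of bond j-1
        # Height difference constraint must hold after every gate.
        assert all(abs(h[i] - h[i - 1]) <= 1 for i in range(1, n + 1))
        h_max_seen = max(h_max_seen, max(h))
    return h, h_max_seen

def d_max(n, n_gates):
    """Bound on the bond dimension: min(2^floor(sqrt(2 n_g)), 2^floor(n/2))."""
    return 2 ** min(math.isqrt(2 * n_gates), n // 2)

few_heights, few_top = simulate_heights(n=12, n_gates=8)
many_heights, many_top = simulate_heights(n=12, n_gates=500)
```

With only 8 gates the limit is set by the gate count, $\lfloor\sqrt{16}\rfloor = 4$, while with 500 gates it is set by the chain length, $\lfloor 12/2 \rfloor = 6$; in both runs the observed maximum stays within the corresponding bound.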
Putting all the conditions together, we arrive at $h_{\max} \le \min\left( \lfloor\sqrt{2n_g}\rfloor, \lfloor n/2 \rfloor \right)$, or equivalently, the bound

\[ D \le D_{\max}(n,n_g) = \min\left( 2^{\lfloor\sqrt{2n_g}\rfloor},\, 2^{\lfloor n/2\rfloor} \right), \]

which we used to obtain the absolute maximum running time of the search algorithm.

When the ranks of the matrices do not grow as fast as in the worst case scenario discussed above, the calculations should run even faster. Another possibility for speedup is to keep only a subset of the singular values in the decompositions.

Faster computations by keeping most relevant singular values? – Truncating the rank of the matrices by selecting only a subset of singular values coming from the SVD steps is standard procedure in quantum methods such as the TEBD, and the classical version for stochastic evolution, the cTEBD. Carrying out logic computations in the way we present above can be regarded as a form of stochastic evolution, and therefore the analysis of stochastic evolution with cTEBD carried out in Ref. [9] applies.

What criteria must a calculation fulfill so as to be efficiently performed (in polynomial time in $n$) using matrix computation? To address this question, let us start by presenting a necessary condition: The final state of the calculation $P(y)$ must be compressible and thus writable as a matrix product state with matrices of dimension $D_{\rm cut}$ scaling as $n^{\alpha}$, for some exponent $\alpha > 0$. The condition is necessary but not sufficient because it is possible that the actual computation has intermediate steps with "entropic" barriers and those depend on the specific algorithm (including how cleverly it can be written) to compute $f(x)$. However, focusing on the final state $P(y)$ is useful because it allows us to investigate the applicability of the method, say by analyzing the behavior and scaling for small $n$ first, even before designing the sequence of gates that implements the algorithmic calculation of $y = f(x)$. For instance, one can compute a measure such as the entropy cost associated to partitioning $P(y)$, which plays a similar role to the entanglement entropy in a quantum matrix product state and is lower bounded by the mutual information [11].

Conclusions – We have shown that it is possible to achieve virtual parallelization in single-processor classical computers using 1-bit and 2-bit local gates acting on matrix product states. We propose a search algorithm based on this method that is faster than Grover's quantum search algorithm when the cost to check a witness requires less than $O(n^2)$ 2-bit gates. Additional speedup
is possible in particular cases when either the rank of the
matrices involved in the product state grows slowly as
the computation progresses or the rank can be reduced
by truncation during gate operations. The method is not
limited to one-dimensional bit arrays and could in princi-
ple be extended to higher dimension tensor products. Fi-
nally, we point out that this method also naturally counts
the number of satisfying assignments of a given Boolean
formula, which is a problem of much importance in Com-
puter Science.
This work was supported in part by the NSF grants
CCF-1116590 and CCF-1117241. E.R.M. acknowledges
partial financial support from the ONR. The authors
thank P. Wocjan for useful discussions.
[1] D. Deutsch, Proc. R. Soc. London A 400, 97 (1985).
[2] P. W. Shor, SIAM J. Comput. 26, 1484 (1997).
[3] L. K. Grover, Phys. Rev. Lett. 79, 325 (1997).
[4] G. Vidal, Phys. Rev. Lett. 91, 147902 (2003); ibid 93,
040502 (2004).
[5] F. Verstraete, V. Murg, and J. I. Cirac, Adv. Phys. 57,
143 (2008).
[6] J. I. Cirac and F. Verstraete, J. Phys. A: Math. Theor.
42, 504004 (2009).
[7] A. Hamma, S. Santra, and P. Zanardi, arXiv:1109.4391.
[8] B. Derrida, M. R. Evans, H. Hakim, and V. Pasquier, J.
Phys. A 26, 1493 (1993).
[9] T. H. Johnson, S. R. Clark, and D. Jaksch, Phys. Rev. E
82, 036702 (2010).
[10] C. Eckart and G. Young, Psychometrika 1, 211 (1936).
[11] K. Temme and F. Verstraete, Phys. Rev. Lett. 104,
210502 (2010).