Adiabatic Quantum Computing for Kernel k = 2 Means Clustering

Christian Bauckhage, Cesar Ojeda, Rafet Sifa, and Stefan Wrobel

Fraunhofer Center for Machine Learning, Sankt Augustin, Germany
Fraunhofer IAIS, Sankt Augustin, Germany
B-IT, University of Bonn, Bonn, Germany

Abstract. Adiabatic quantum computers are tailored towards finding minimum energy states of Ising models. The quest for implementations of machine learning algorithms on such devices is thus the quest for Ising model (re-)formulations of their underlying objective functions. In this paper, we discuss how to accomplish this for the problem of kernel binary clustering. We then discuss how our models can be solved on an adiabatic quantum computing device. Finally, in simulation experiments, we numerically solve the respective Schrödinger equations and observe our approaches to yield convincing results.
1 Introduction
Quantum computing exploits quantum mechanical phenomena for information processing and is now becoming practical. Working quantum computers are on the market [1], industry invests increasing efforts [2,3,4,5,6,7], and further rapid progress is expected [8]. This will likely impact artificial intelligence and machine learning because quantum computing promises efficient solutions to many of the search and optimization problems encountered in these fields [9,10,11,12,13,14].

Here, we extend our earlier work on quantum computing for unsupervised learning [15] towards the problem of kernel k = 2 means clustering and discuss how to solve it via adiabatic quantum computing.
Note that adiabatic quantum computers solve a kind of optimization problem not unfamiliar in machine learning. Devices such as those produced by D-Wave Systems [1,16] determine low energy states of Ising models. While they were originally conceived to describe spin glass systems [17], Ising models occur in other settings, too. Examples include Boolean satisfiability and graph cutting problems [18,19,20] as well as the neurocomputing models known as Hopfield networks [21].
If a problem can be formulated as an Ising energy minimization problem,
there are standard procedures for preparing systems of quantum bits (qubits)
and energy operators (Hamiltonians) for processing [18,19]. The solution process
itself relies on the adiabatic theorem [22] which states that if a quantum system
starts in a low energy configuration (ground state) of a Hamiltonian which then
gradually changes, the system will end up in the ground state of the resulting
Hamiltonian. To harness this for problem solving, one prepares a qubit system
in the ground state of a problem independent Hamiltonian and adiabatically
evolves it to a Hamiltonian whose ground state represents a solution to the
problem at hand.
In this paper, we discuss these ideas in detail. First, we elaborate on the
notion of Ising models and their role in adiabatic quantum computing. We then
propose two Ising models for kernel k = 2 means clustering. Given these models,
we discuss how to set them up for computing and review the required quantum
mechanical concepts. Finally, we present several simulation experiments in which
we numerically solve the Schrödinger equations which govern the corresponding
quantum mechanical processes; these experiments demonstrate the feasibility of
our approach and illustrate how appropriately prepared systems of qubits evolve
towards a clustering solution.
2 Ising Models
Existing adiabatic quantum computers are designed to find low energy states of Ising models. In other words, they solve

$$s^* = \operatorname*{argmin}_{s \in \{-1,+1\}^n} \; s^\top Q\, s + s^\top q \tag{1}$$

where the $2^n$ vectors $s$ are possible global states of a system of $n$ entities each of which can be in one of two local states ($+1$ or $-1$). The coupling matrix $Q \in \mathbb{R}^{n \times n}$ models interactions within the system and the vector $q \in \mathbb{R}^n$ models external influences.
Since Ising models are concerned with bipolar state vectors $s \in \{-1,+1\}^n$, they appear suited to formalize bi-partitioning problems such as binary clustering of $n$ data points. This is because, for suitable, problem dependent choices of $Q$ and $q$, the entries $s^*_i = \pm 1$ of the solution to (1) can be thought of as membership indicators for two distinct clusters. In Section 3, we therefore devise Ising models particularly for this purpose.
Note, however, that the problem in (1) is a quadratic unconstrained binary optimization problem (QUBO) and therefore generally NP-hard. For instance, a naïve approach to k = 2 means clustering would be to exhaustively evaluate (1) for each of the $2^n$ possible assignments of $n$ data points to 2 clusters. For large $n$, this of course becomes impractical on a digital computer. On an adiabatic quantum computer, on the other hand, we could prepare a system of $n$ qubits that is in a quantum mechanical superposition of all the $2^n$ possible solutions. Here, the challenge is thus to manipulate the system to evolve towards a state that corresponds to a desired partition. In Section 4, we discuss how to accomplish this.
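To make the exponential cost concrete, the following minimal Python sketch evaluates (1) by brute force; the helper names are ours, chosen for illustration only, and the loop visibly runs over all $2^n$ states:

```python
import numpy as np
from itertools import product

def ising_energy(s, Q, q):
    """Ising energy s^T Q s + s^T q of a bipolar state vector s, cf. (1)."""
    return float(s @ Q @ s + s @ q)

def brute_force_ising(Q, q):
    """Exhaustive minimization of (1); enumerates all 2^n states."""
    n = Q.shape[0]
    best_state, best_energy = None, np.inf
    for bits in product((-1.0, 1.0), repeat=n):   # 2^n iterations
        s = np.asarray(bits)
        e = ising_energy(s, Q, q)
        if e < best_energy:
            best_state, best_energy = s, e
    return best_state, best_energy
```

Already for n = 16 data points this amounts to 65536 energy evaluations, and the count doubles with every additional point.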
3 Ising Models for Kernel k = 2 Means Clustering

In this section, we devise Ising models for the problem of clustering a sample $X$ of $n$ data points $x \in \mathbb{R}^m$ into two disjoint clusters $X_1$ and $X_2$ where $|X_1| = n_1$, $|X_2| = n_2$, and $n_1 + n_2 = n$. Without loss of generality, we assume that the data in $X$ are normalized to zero mean so that we have

$$\mu = \frac{1}{n} \sum_{x \in X} x = \frac{1}{n} \sum_{i=1}^{2} \sum_{x \in X_i} x = \frac{1}{n} \bigl( n_1 \mu_1 + n_2 \mu_2 \bigr) = 0. \tag{2}$$

This implies $n_1 \mu_1 = -n_2 \mu_2$ which is to say that the two cluster means $\mu_1$ and $\mu_2$ will be of opposite sign.
Regarding the idea of k = 2 means clustering, our problem would typically be formalized as having to determine the two minimizers $\mu_1$ and $\mu_2$ of the within cluster scatter

$$S_W = \sum_{i=1}^{2} \sum_{x \in X_i} \bigl\| x - \mu_i \bigr\|^2. \tag{3}$$

Indeed, most of the well known k-means algorithms such as those of Lloyd [23], Hartigan [24], or MacQueen [25] consider this objective.
However, in this paper, we follow a different route and observe that the problem of minimizing the within cluster scatter is equivalent to the problem of maximizing the between cluster scatter

$$S_B = \sum_{i,j=1}^{2} n_i n_j \bigl\| \mu_i - \mu_j \bigr\|^2. \tag{4}$$

This actually holds true for any $k \geq 1$ and follows from Fisher's analysis of variance [26,27]. It establishes that the total scatter can be written as

$$S_T = \sum_{x \in X} \bigl\| x - \mu \bigr\|^2 = S_W + \frac{1}{2n} S_B \tag{5}$$

which, since $S_T$ and $n$ are positive constants, implies that any decrease of $S_W$ entails an increase of $S_B$.
Looking at the two equivalent objective functions in (3) and (4), we remark that their optimization proves to be NP-hard in general [28] because they both constitute integer programming problems in disguise [29]. Algorithms such as those in [23,24,25] are therefore mere heuristics for which there is no guarantee that they will find the optimal solution. In this sense it appears acceptable that the Ising models we next derive from (4) involve a heuristic assumption, too.
3.1 An Ambiguous Ising Model

For our problem of k = 2 means clustering, the maximization objective in (4) is expressed in an overly complicated manner and it is easy to see that it can be simplified to

$$S_B = 2\, n_1 n_2 \bigl\| \mu_1 - \mu_2 \bigr\|^2. \tag{6}$$

Interestingly, this simplification now provides an intuition as to why k-means clustering is agnostic of cluster shapes and distances and often produces clusters of about equal size [30]. In order for $S_B$ in (6) to be large, both the distance $\|\mu_1 - \mu_2\|$ between the two cluster centers and the product $n_1 n_2$ of the two cluster sizes have to be large. However, since the sum $n_1 + n_2 = n$ is fixed, the product of the sizes will be maximal if $n_1 = n_2 = n/2$.
This observation provides us with a heuristic argument for how to rewrite the objective in (6) which in turn will allow us to set up an Ising model. If we assume that, at a solution, we will likely have $n_1 \approx n_2 \approx n/2$, we may consider the approximation

$$2\, n_1 n_2 \bigl\| \mu_1 - \mu_2 \bigr\|^2 \approx 2\, \frac{n^2}{4} \bigl\| \mu_1 - \mu_2 \bigr\|^2 = 2\, \Bigl\| \frac{n}{2}\mu_1 - \frac{n}{2}\mu_2 \Bigr\|^2 \approx 2\, \bigl\| n_1 \mu_1 - n_2 \mu_2 \bigr\|^2 \tag{7}$$

which turns the k = 2 means clustering problem into the problem of having to solve

$$\mu^*_1, \mu^*_2 = \operatorname*{argmax}_{\mu_1, \mu_2} \bigl\| n_1 \mu_1 - n_2 \mu_2 \bigr\|^2. \tag{8}$$
Next, we observe that the norm in (8) can be expressed in a form that does not explicitly depend on the unknown cluster means $\mu_i$. To this end, we gather the given data in a data matrix $X = [x_1, \ldots, x_n] \in \mathbb{R}^{m \times n}$ and introduce two binary indicator vectors $z_1, z_2 \in \{0,1\}^n$ which indicate cluster memberships in the sense that entry $l$ of $z_i$ equals 1 if $x_l \in X_i$ and 0 otherwise. This way, we can write $n_1 \mu_1 = X z_1$ as well as $n_2 \mu_2 = X z_2$ and therefore

$$\bigl\| n_1 \mu_1 - n_2 \mu_2 \bigr\|^2 = \bigl\| X (z_1 - z_2) \bigr\|^2 = \bigl\| X s \bigr\|^2. \tag{9}$$

Note that the vector $s$ introduced in (9) is guaranteed to be bipolar because, in k-means clustering, every given data point is assigned to one and only one cluster so that

$$z_1 - z_2 = z_1 - (\mathbf{1} - z_1) = 2 z_1 - \mathbf{1} = s \in \{-1,+1\}^n. \tag{10}$$

This, however, establishes that there is an Ising model for the problem of k = 2 means clustering of zero mean data.
On the one hand, since $\|Xs\|^2 = s^\top X^\top X s$ is convex in $s$, the maximization problem in (8) is equivalent to the following minimization problem

$$s^* = \operatorname*{argmin}_{s \in \{-1,+1\}^n} - s^\top X^\top X\, s \tag{11}$$
$$\phantom{s^*} = \operatorname*{argmin}_{s \in \{-1,+1\}^n} s^\top Q\, s. \tag{12}$$

On the other hand, because of (2), this will necessarily yield a solution vector $s^*$ whose entries are not all equal and thus induce a clustering of the data in $X$.

Looking at (12), we next observe that the coupling matrix $Q = -X^\top X$ is a negated Gram matrix. The criterion we just derived therefore allows for invoking the kernel trick and thus leads to kernel k = 2 means clustering. In particular, if we choose the coupling matrix for the Ising model in (12) to be a negated, centered kernel matrix

$$Q_{ij} = -\biggl[ k(x_i, x_j) - \frac{1}{n} \sum_{l} k(x_i, x_l) - \frac{1}{n} \sum_{k} k(x_k, x_j) + \frac{1}{n^2} \sum_{k,l} k(x_k, x_l) \biggr] \tag{13}$$

where $k(x_i, x_j)$ is an appropriate Mercer kernel, our assumption of zero mean data remains valid in the feature space as well.
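In NumPy, the coupling matrix (13) can be computed from a precomputed kernel matrix in a few vectorized lines. The following sketch is our own illustration (the function name is ours); the leading minus mirrors the sign convention of (11) and (12):

```python
import numpy as np

def coupling_matrix(K):
    """Negated, doubly centered kernel matrix, cf. (13).

    K is an n x n Mercer kernel matrix with K[i, j] = k(x_i, x_j).
    """
    row = K.mean(axis=1, keepdims=True)   # (1/n) sum_l k(x_i, x_l)
    col = K.mean(axis=0, keepdims=True)   # (1/n) sum_k k(x_k, x_j)
    tot = K.mean()                        # (1/n^2) sum_{k,l} k(x_k, x_l)
    return -(K - row - col + tot)
```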
3.2 An Unambiguous Ising Model

Looking at (12), we furthermore observe that there is a form of symmetry: if $s^*$ solves this optimization problem, then so does $-s^*$, since we did not specify whether an entry of, say, $+1$ is supposed to indicate membership to cluster one or two.

To remove this ambiguity, we may remove a degree of freedom from our model. W.l.o.g. we can, for instance, fix $s_n = +1$ and solve (12) for the remaining $n-1$ entries of $s$. This way, the problem becomes to solve

$$s^* = \operatorname*{argmin}_{s \in \{-1,+1\}^{n-1}} \sum_{i,j=1}^{n-1} Q_{ij}\, s_i s_j + 2 \sum_{j=1}^{n-1} Q_{nj}\, s_j + Q_{nn} \tag{14}$$
$$\phantom{s^*} = \operatorname*{argmin}_{s \in \{-1,+1\}^{n-1}} s^\top Q'\, s + s^\top q' \tag{15}$$

where $Q'$ is the upper left $(n-1) \times (n-1)$ block of $Q$, the entries of $q'$ are $q'_j = 2 Q_{nj}$, and the constant $Q_{nn}$ does not affect the minimizer. This we recognize as yet another Ising energy minimization problem.
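Computationally, this reduction is a simple block decomposition of the coupling matrix. A minimal sketch (our own helper, assuming a symmetric $Q$ such as the kernel matrices produced by (13)) returns the pair $(Q', q')$ of (15):

```python
import numpy as np

def fix_last_spin(Q):
    """Clamp s_n = +1 and reduce (12) to the (n-1)-dimensional model (15).

    Assumes a symmetric coupling matrix Q; the constant Q_nn is dropped
    since it does not change the minimizer.
    """
    Q_prime = Q[:-1, :-1]          # couplings among the n-1 free spins
    q_prime = 2.0 * Q[-1, :-1]     # linear field induced by the clamped spin
    return Q_prime, q_prime
```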
4 Adiabatic Quantum Kernel k = 2 Means Clustering

To perform adiabatic quantum kernel k = 2 means clustering of $n$ data points, we consider a time-dependent system of $n$ entangled qubits that is in a superposition of $2^n$ basis states. Using the Dirac notation, this is written as

$$\bigl| \psi(t) \bigr\rangle = \sum_{i=0}^{2^n - 1} a_i(t)\, \bigl| \psi_i \bigr\rangle \tag{16}$$

where the time dependent amplitudes $a_i \in \mathbb{C}$ obey $\sum_i |a_i|^2 = 1$. We understand each of the different basis states

$$|\psi_0\rangle = |000 \ldots 000\rangle \tag{17}$$
$$|\psi_1\rangle = |000 \ldots 001\rangle \tag{18}$$
$$|\psi_2\rangle = |000 \ldots 010\rangle \tag{19}$$
$$|\psi_3\rangle = |000 \ldots 011\rangle \tag{20}$$
$$\vdots$$

as an indicator vector which represents one of the $2^n$ possible assignments of $n$ data points to 2 distinct clusters and use the common shorthand to express tensor products, for instance

$$|\psi_1\rangle = |000 \ldots 001\rangle = |0\rangle \otimes |0\rangle \otimes \ldots \otimes |1\rangle. \tag{21}$$
If a quantum system such as the one in (16) evolves under the influence of a time-dependent Hamiltonian $H(t)$, its behavior is governed by the Schrödinger equation

$$\frac{\partial}{\partial t} \bigl| \psi(t) \bigr\rangle = -i\, H(t)\, \bigl| \psi(t) \bigr\rangle \tag{22}$$

where we have set $\hbar = 1$. In adiabatic quantum computing, we consider periods ranging from $t = 0$ to $t = T$ and assume the Hamiltonian at time $t$ to be given as a convex combination of two static Hamiltonians, namely

$$H(t) = \Bigl( 1 - \frac{t}{T} \Bigr) H_B + \frac{t}{T}\, H_P. \tag{23}$$

$H_B$ is called the beginning Hamiltonian whose ground state is easy to construct and $H_P$ is the problem Hamiltonian whose ground state encodes the solution to the problem at hand.
For Ising models such as the ones in (12) and (15), there are by now standard suggestions for how to set up a suitable problem Hamiltonian [18,19]. In particular, we may define

$$H_P = \sum_{i,j=1}^{n} Q_{ij}\, \sigma_z^i \sigma_z^j + \sum_{i=1}^{n} q_i\, \sigma_z^i \tag{24}$$

where $\sigma_z^i$ denotes the Pauli spin matrix $\sigma_z$ acting on the $i$th qubit, that is

$$\sigma_z^i = \underbrace{I \otimes I \otimes \ldots \otimes I}_{i-1 \text{ terms}} \otimes\; \sigma_z \otimes \underbrace{I \otimes I \otimes \ldots \otimes I}_{n-i \text{ terms}}. \tag{25}$$

The beginning Hamiltonian is then typically chosen to be orthogonal to the problem Hamiltonian, for instance

$$H_B = -\sum_{i=1}^{n} \sigma_x^i \tag{26}$$

where $\sigma_x^i$ is defined as above, this time with respect to the Pauli spin matrix $\sigma_x$.
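Operators like (24)-(26) are easily assembled from tensor products of 2 x 2 matrices, for instance with the QuTiP toolbox used in Section 5. The sketch below is our own illustration (helper names are ours); for the ambiguous model (12), the linear coefficients q are simply a zero vector:

```python
import qutip as qt

def pauli_on(n, i, pauli):
    """Pauli matrix acting on qubit i of an n-qubit register, cf. (25)."""
    ops = [qt.qeye(2)] * n
    ops[i] = pauli
    return qt.tensor(ops)

def hamiltonians(Q, q):
    """Problem Hamiltonian (24) and beginning Hamiltonian (26)."""
    n = Q.shape[0]
    sz = [pauli_on(n, i, qt.sigmaz()) for i in range(n)]
    H_P = sum(Q[i, j] * sz[i] * sz[j] for i in range(n) for j in range(n))
    H_P = H_P + sum(q[i] * sz[i] for i in range(n))
    H_B = -sum(pauli_on(n, i, qt.sigmax()) for i in range(n))
    return H_B, H_P
```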
To compute a clustering, we then let $|\psi(t)\rangle$ evolve from $|\psi(0)\rangle$ to $|\psi(T)\rangle$ where $|\psi(0)\rangle$ is chosen to be the ground state of $H_B$. That is, if $\lambda$ denotes the smallest eigenvalue of $H_B$, the initial state $|\psi(0)\rangle$ of the system corresponds to the solution of

$$H_B\, \bigl| \psi(0) \bigr\rangle = \lambda\, \bigl| \psi(0) \bigr\rangle. \tag{27}$$

Finally, at time $t = T$, a measurement is performed on the $n$ qubit system. This will cause the wave function $|\psi(T)\rangle$ to collapse to a particular basis state, and the probability for this state to be $|\psi_i\rangle$ is given by the squared amplitude $|a_i(T)|^2$. Yet, since the adiabatic evolution was steered towards the problem Hamiltonian $H_P$, basis states that correspond to ground states of $H_P$ are more likely to be found.
On an adiabatic quantum computer, this algorithm is carried out physically. On a digital computer, we may simulate it by numerically solving

$$\bigl| \psi(T) \bigr\rangle = \bigl| \psi(0) \bigr\rangle - i \int_{0}^{T} H(t)\, \bigl| \psi(t) \bigr\rangle \, dt \tag{28}$$

which is the approach we adhere to in the next section.
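With QuTiP, this numerical integration takes only a few lines. The following sketch is our own (the choice T = 75 anticipates the setting of Section 5); it prepares $|\psi(0)\rangle$ as the ground state of $H_B$, integrates the Schrödinger equation under the interpolated Hamiltonian (23), and returns the final measurement probabilities $|a_i(T)|^2$:

```python
import numpy as np
import qutip as qt

def simulate_annealing(H_B, H_P, T=75.0, steps=200):
    """Integrate the Schrödinger equation (22) under H(t) from (23)."""
    _, psi0 = H_B.groundstate()                      # |psi(0)>, cf. (27)
    schedule = [[H_B, lambda t, args: 1.0 - t / T],  # (1 - t/T) H_B
                [H_P, lambda t, args: t / T]]        # (t/T) H_P
    times = np.linspace(0.0, T, steps)
    result = qt.sesolve(schedule, psi0, times)
    psi_T = result.states[-1]                        # |psi(T)>
    return np.abs(psi_T.full().ravel()) ** 2         # |a_i(T)|^2
```

Note that for a system of 16 qubits, $H_B$ and $H_P$ are $2^{16} \times 2^{16}$ operators, which is why such classical simulations are necessarily restricted to small $n$.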
5 Practical Examples

In this section, we present several examples which demonstrate the feasibility of quantum computing for kernel k = 2 means clustering. Our examples are of didactic nature and first and foremost intended to illustrate the adiabatic evolution of $n$ qubit systems. In each experiment, we therefore restrict ourselves to $n = 16$ data points $x_i \in \mathbb{R}^2$ which form two clusters. In other words, we consider data matrices $X = [X_1, X_2] \in \mathbb{R}^{2 \times 16}$ whose 16 column vectors form two clusters of size $n_1$ and $n_2$, respectively. These simple settings allow us to comprehensibly visualize the data and the evolution of the amplitudes of the $2^{16}$ or $2^{15}$ basis states of the qubit systems which implement the ambiguous and unambiguous models derived above.

In each experiment, we simulate quantum adiabatic evolutions on a digital computer. To this end, we set up the corresponding problem Hamiltonian $H_P$, the beginning Hamiltonian $H_B$, and its ground state $|\psi(0)\rangle$ as discussed above and use the Python quantum computing toolbox QuTiP [31] to numerically solve (28) for $t \in [0, T = 75]$, where $T = 75 \ll 2^n$.
Experiment 1: In our first experiment, we consider the data in Fig. 1(a) and minimize the energy of the ambiguous Ising model in (12) where the coupling matrix $Q$ results from computing (13) with a Gaussian kernel

$$k(x_i, x_j) = \exp\Bigl( -\frac{1}{2\sigma^2} \bigl\| x_i - x_j \bigr\|^2 \Bigr). \tag{29}$$
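A vectorized evaluation of (29) for all pairs of data points might look as follows (our own sketch; $X$ again holds one data point per column):

```python
import numpy as np

def gaussian_kernel(X, sigma=1.0):
    """Kernel matrix K[i, j] = k(x_i, x_j) for the Gaussian kernel (29)."""
    diffs = X[:, :, None] - X[:, None, :]      # pairwise differences
    sq_dists = (diffs ** 2).sum(axis=0)        # ||x_i - x_j||^2
    return np.exp(-sq_dists / (2.0 * sigma ** 2))
```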
Figure 1(c) illustrates the temporal evolution of the amplitudes $|a_i(t)|^2$ of the $2^{16} = 65536$ basis states $|\psi_i\rangle$ the corresponding 16 qubit quantum system $|\psi(t)\rangle$ can be in. At $t = 0$, all states are equally likely but over time their amplitudes begin to increase or decrease. At $t = T$, two of the basis states have an amplitude considerably higher than the others so that a measurement will likely cause the system to collapse to either of these two equally likely, more probable states. These two basis states are $|0000000011111111\rangle$ and $|1111111100000000\rangle$ which, when understood as cluster indicator vectors, both induce the result in Fig. 1(b). Looking at this result, we can conclude that our approach can cluster the data in a manner a human observer would expect and deem appropriate.
Fig. 1: Example of adiabatic quantum kernel k = 2 means clustering using the ambiguous Ising model in (12). (a) Sample of 16 data points forming two half-moons; (b) corresponding clustering result; (c) adiabatic evolution of a system of 16 qubits. During its evolution over time $t$, the system is in a superposition of $2^{16} = 65536$ basis states $|\psi_i\rangle$, each representing a possible binary clustering. Initially, it is equally likely to find the system in any of these states. At the end, two basis states have noticeably higher amplitudes $|a_i|^2$ than the others and are therefore more likely to be measured; these are $|0000000011111111\rangle$ and $|1111111100000000\rangle$, and they both induce the result in (b).
Experiment 2: In our second experiment, we consider the same data as above where, this time, one of the data points has been manually preassigned to a cluster (see Fig. 2(a)). This allows for the use of the unambiguous Ising model in (15). We compute the coupling matrix as above; however, since only 15 data points still need to be assigned to a cluster, we consider a 15 qubit system $|\psi(t)\rangle$ which is in a superposition of $2^{15} = 32768$ basis states.

Figure 2(c) visualizes the adiabatic quantum evolution of this system. Again, at time $t = 0$, each basis state $|\psi_i\rangle$ is equally likely to be measured but the corresponding amplitudes $|a_i|^2$ soon begin to increase or to decrease. Since we are considering the unambiguous Ising model, the process reaches a configuration at $t = T$ where only one basis state has a much higher amplitude than the others. This state is $|000000001111111\rangle$ and induces the result in Fig. 2(b). Just as in our first experiment, the result obtained from the unambiguous model considered here is reasonable and convincing.
Experiment 3: In our third experiment, we investigate whether or not practical success of our approach critically hinges on the heuristic assumption in (7), namely, that clusters are of about equal size.
Fig. 2: Example of adiabatic quantum kernel k = 2 means clustering using the unambiguous Ising model in (15). (a) Sample of 16 data points where one data point has been manually preassigned to a cluster; (b) corresponding clustering result; (c) adiabatic evolution of a system of 15 qubits. Throughout, the system is in a superposition of $2^{15} = 32768$ basis states. Upon termination of its adiabatic evolution, the single most likely basis state for the system to be found in is $|000000001111111\rangle$ which induces the assignment of points to clusters in (b).
The $n = 16$ data points in Fig. 3(a) were sampled from two bivariate Gaussians where $n_1 = 11$ and $n_2 = 5$ and thus form two clusters one of which is more than twice as big as the other. Since one of the data points in Fig. 3(a) has been preassigned a cluster label, we again consider the unambiguous Ising model in (15) using a kernelized coupling matrix as described above. The evolution of the corresponding 15 qubit system is shown in Fig. 3(c) and it leads to a configuration where the single most likely basis state $|000000000001111\rangle$ induces the clustering shown in Fig. 3(b).

As the result in Fig. 3(b) certainly appears reasonable, this experiment shows that the minimum energy configurations of our Ising model(s) for quantum clustering do not necessarily have to correspond to equally sized partitions of a given set of data. This is of course desirable and shows resilience against the simple heuristic we applied in (7). Of course, the result may have looked less convincing to the human eye if the two clusters were closer together; but this caveat would apply to conventional (kernel) k-means clustering, too [30].
Fig. 3: Example of adiabatic quantum kernel k = 2 means clustering applied to a set of data that consists of two clusters of unequal sizes. Nevertheless, using the unambiguous Ising model in (15), the corresponding 15 qubit system evolves to a configuration where the single most likely basis state $|000000000001111\rangle$ partitions the data into two groups a human observer would deem reasonable.

6 Summary

After decades of mainly theoretical research, quantum computing is now about to become practical. Companies such as Google, IBM, Intel, or Microsoft invest
increasing resources into corresponding research and development, and further rapid technological progress is expected. These developments will likely impact supervised and unsupervised machine learning because working quantum computers promise fast solutions to the kinds of search and optimization procedures that are at the heart of many algorithms in these areas.
In this paper, we were thus concerned with the general feasibility of quantum computing for machine learning and considered adiabatic quantum computing for the problem of kernel k = 2 means clustering. We discussed that, from an abstract point of view, the problem of setting up machine learning algorithms for adiabatic quantum computing can be seen as the problem of expressing their objective functions in terms of Ising energy minimization problems, because adiabatic quantum computers are tailored towards minimizing Ising energies.

We therefore devised Ising models for (kernel) k = 2 means clustering of $n$ data points. The first model was straightforward to derive from an alternative, less well known objective for k-means clustering but suffers from an ambiguity: if state $s^*$ minimizes the Ising energy, then so does state $-s^*$. We addressed this issue and devised a second, slightly more involved Ising model of $n-1$ rather than of $n$ degrees of freedom.
In order for this paper to be as self-contained as possible, we then discussed how to prepare systems of $n$ or $n-1$ qubits whose adiabatic evolution according to an appropriate time-dependent Hamiltonian would lead to a solution of our Ising energy minimization problems and thus to an assignment of data points to clusters.

Finally, we presented several simulation experiments in which we numerically solved the Schrödinger equations governing the dynamics of the corresponding qubit systems. Our examples demonstrated that adiabatic quantum computing can indeed perform kernel k = 2 means clustering.
References
1. D-Wave press release: D-Wave announces D-Wave 2000Q quantum computer and
first system order (Jan 2017)
2. Conover, E.: Google Moves toward Quantum Supremacy with 72-qubit Computer. Science News 193(6) (2018) 13
3. Daimler press release: Daimler joins forces with Google to research the application
of quantum computers (Mar 2018)
4. IBM press release: IBM announces advances to IBM quantum systems & ecosystem
(Nov 2017)
5. Intel press release: Intel advances quantum and neuromorphic computing research
(Jan 2018)
6. Microsoft press release: With new Microsoft breakthroughs, general purpose quantum computing moves closer to reality (2017)
7. VW press release: Volkswagen Group and Google work together on quantum
computers (Nov 2017)
8. Knight, W.: Serious quantum computers are finally here. What are we going to do
with them? MIT Technology Review (February 2018)
9. Aïmeur, E., Brassard, G., Gambs, S.: Quantum Speed-up for Unsupervised Learning. Machine Learning 90(2) (2013)
10. Lloyd, S., Mohseni, M., Rebentrost, P.: Quantum Algorithms for Supervised and
Unsupervised Machine Learning. arXiv:1307.0411 [quant-ph] (2013)
11. Wittek, P.: Quantum Machine Learning. Academic Press (2014)
12. Schuld, M., Sinayskiy, I., Petruccione, F.: An Introduction to Quantum Machine
Learning. Contemporary Physics 56(2) (2014)
13. Wiebe, N., Kapoor, A., Svore, K.: Quantum Algorithms for Nearest-Neighbor
Methods for Supervised and Unsupervised Learning. Quantum Information &
Computation 15(3–4) (2015)
14. Dunjko, V., Taylor, J., Briegel, H.: Quantum-Enhanced Machine Learning. Physical Review Letters 117(13) (2016)
15. Bauckhage, C., Brito, E., Cvejoski, K., Ojeda, C., Sifa, R., Wrobel, S.: Ising Models for Binary Clustering via Adiabatic Quantum Computing. In: Proc. EMMCVPR. Volume 10746 of LNCS, Springer (2017)
16. Johnson, M., et al.: Quantum Annealing with Manufactured Spins. Nature
473(7346) (2011)
17. Ising, E.: Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik 31(1) (1925) 253–258
18. Farhi, E., Goldstone, J., Gutmann, S., Sipser, M.: Quantum Computation by
Adiabatic Evolution. arXiv:quant-ph/0001106 (2000)
19. Lucas, A.: Ising Formulations of Many NP Problems. Frontiers in Physics 2(5)
(2014)
20. Ushijima-Mwesigwa, H., Negre, C., Mniszewski, S.: Graph Partitioning Using Quantum Annealing on the D-Wave System. In: Proc. Int. Workshop on Post Moore's Era Supercomputing, ACM (2017)
21. Hopfield, J.: Neural Networks and Physical Systems with Collective Computational
Abilities. PNAS 79(8) (1982)
22. Born, M., Fock, V.: Beweis des Adiabatensatzes. Zeitschrift für Physik 51(3–4) (1928)
23. Lloyd, S.: Least Squares Quantization in PCM. IEEE Trans. Information Theory
28(2) (1982)
24. Hartigan, J., Wong, M.: Algorithm AS 136: A k-Means Clustering Algorithm. J.
of the Royal Statistical Society C 28(1) (1979)
25. MacQueen, J.: Some Methods for Classification and Analysis of Multivariate Observations. In: Proc. Berkeley Symp. on Mathematical Statistics and Probability (1967)
26. Fisher, R.: On the "Probable Error" of a Coefficient of Correlation Deduced from a Small Sample. Metron 1 (1921)
27. Bauckhage, C.: k-Means and Fisher’s Analysis of Variance. Technical report,
researchgate (May 2018)
28. Aloise, D., Deshpande, A., Hansen, P., Popat, P.: NP-Hardness of Euclidean Sum-of-Squares Clustering. Machine Learning 75(2) (2009)
29. Bauckhage, C.: k-Means Clustering via the Frank-Wolfe Algorithm. In: Proceedings KDML-LWA (2016)
30. MacKay, D.: Information Theory, Inference, and Learning Algorithms. Cambridge
University Press (2003)
31. Johansson, J., Nation, P., Nori, F.: QuTiP 2: A Python Framework for the Dynamics of Open Quantum Systems. Computer Physics Communications 184(4) (2013)
... This procedure is equivalent to solving the quadratic unconstrained binary optimization (Qubo) problem, which has been investigated since the 1960s (see, e.g., Kochenberger et al. 2014). Its value lies in its applicability to a wide range of combinatorial optimization problems, from economics (Laughhunn 1970;Hammer and Shlifer 1971) over satisfiability (Kochenberger et al. 2005), resource allocation and routing problems (Neukart et al. 2017;Stollenwerk et al. 2019) to machine learning (Bauckhage et al. 2018;Date et al. 2020;Mücke et al. 2023)-just to name a few. Solving Qubo is, in general, NP-hard (Pardalos and Jha 1992). ...
... SubsetSum consists of finding a subset from a list of values that sum up to a given target value. Both have well-established Qubo embeddings (Bauckhage et al. 2018;Biesner et al. 2022). We chose these problems as they are (i) real-world problems of both scientific and economic interest, (ii) easy to generate for arbitrary n, and (iii) their Qubo instances' DR is a direct result of the input data. ...
... From this data, we derive a Qubo instance Q using the method from Bauckhage et al. (2018). We use a linear kernel, which leads to a vanilla 2-means clustering based on Euclidean distance. ...
Article
Full-text available
Quadratic unconstrained binary optimization (QUBO) problems are well-studied, not least because they can be approached using contemporary quantum annealing or classical hardware acceleration. However, due to limited precision and hardware noise, the effective set of feasible parameter values is severely restricted. As a result, otherwise solvable problems become harder or even intractable. In this work, we study the implications of solving QUBO problems under limited precision. Specifically, it is shown that the problem’s dynamic range has a crucial impact on the problem’s robustness against distortions. We show this by formalizing the notion of preserving optima between QUBO instances and explore to which extend parameters can be modified without changing the set of minimizing solutions. Based on these insights, we introduce techniques to reduce the dynamic range of a given QUBO instance based on the theoretical bounds of the minimal energy value. An experimental evaluation on random QUBO instances as well as QUBO-encoded BinClustering and SubsetSum problems show that our theoretical findings manifest in practice. Results on quantum annealing hardware show that the performance can be improved drastically when following our methodology.
... This procedure is equivalent to solving the quadratic unconstrained binary optimization (Qubo) problem, which has been investigated since the 1960s (see e. g. (Kochenberger et al, 2014)). Its value lies in its applicability to a wide range of combinatorial optimization problems, from economics (Laughhunn, 1970;Hammer and Shlifer, 1971) over satisfiability (Kochenberger et al, 2005), resource allocation and routing problems (Neukart et al, 2017;Stollenwerk et al, 2019) to machine learning (Bauckhage et al, 2018;Date et al, 2020;Mücke et al, 2023)-just to name a few. Solving Qubo is, in general, NP-hard (Pardalos and Jha, 1992). ...
... SubsetSum consists of finding a subset from a list of values that sum up to a given target value. Both have well-established Qubo embeddings (Bauckhage et al, 2018;Biesner et al, 2022). ...
... From this data we derive a Qubo instance Q using the method from (Bauckhage et al, 2018). We use a linear kernel, which leads to a vanilla 2means clustering based on Euclidean distance. ...
Preprint
Quadratic unconstrained binary optimization (QUBO) problems are well-studied, not least because they can be approached using contemporary quantum annealing or classical hardware acceleration. However, due to limited precision and hardware noise, the effective set of feasible parameter values is severely restricted. As a result, otherwise solvable problems become harder or even intractable. In this work, we study the implications of solving QUBO problems under limited precision. Specifically, it is shown that the problem's dynamic range has a crucial impact on the problem's robustness against distortions. We show this by formalizing the notion of preserving optima between QUBO instances and explore to which extend parameters can be modified without changing the set of minimizing solutions. Based on these insights, we introduce techniques to reduce the dynamic range of a given QUBO instance based on theoretical bounds of the minimal energy value. An experimental evaluation on random QUBO instances as well as QUBO-encoded Binary Clustering and Subset Sum problems show that our theoretical findings manifest in practice. Results on quantum annealing hardware show that the performance can be improved drastically when following our methodology.
... where Q is real and z is binary (proper definitions follow in Section 2). Despite problem (1)'s simple structure, it is NPhard, and hence covers a plethora of real-world optimization challenges, from problems like the traveling salesperson and graph coloring (Lucas 2014) to machine learning (ML) (Bauckhage et al. 2018;) and various other applications, e.g., (Biesner et al. 2022;Chai et al. 2023). One common issue of QUBO hardware solvers is limited physical precision of the matrix entries, as real-world hardware devices use finite numerical representations. ...
... Three exemplary problems are considered: BINCLUSTERING represents 2-means clustering, SUBSETSUM consists of finding a subset from a list of values that sum up to a given target value and MRF consists of maximum a posteriori state inference in an undirected graphical model. All three problems have known QUBO embeddings (Bauckhage et al. 2018;Biesner et al. 2022;Mücke, Piatkowski, and Morik 2019). The specific setups are described in the Appendix. ...
Preprint
Full-text available
The demand for high-performance computing in machine learning and artificial intelligence has led to the development of specialized hardware accelerators like Tensor Processing Units (TPUs), Graphics Processing Units (GPUs), and Field-Programmable Gate Arrays (FPGAs). A key strategy to enhance these accelerators is the reduction of precision in arithmetic operations, which increases processing speed and lowers latency - crucial for real-time AI applications. Precision reduction minimizes memory bandwidth requirements and energy consumption, essential for large-scale and mobile deployments, and increases throughput by enabling more parallel operations per cycle, maximizing hardware resource utilization. This strategy is equally vital for solving NP-hard quadratic unconstrained binary optimization (QUBO) problems common in machine learning, which often require high precision for accurate representation. Special hardware solvers, such as quantum annealers, benefit significantly from precision reduction. This paper introduces a fully principled Branch-and-Bound algorithm for reducing precision needs in QUBO problems by utilizing dynamic range as a measure of complexity. Experiments validate our algorithm's effectiveness on an actual quantum annealer.
... In addition to Ising-based kernel clustering proposed in this paper, another way to perform clustering of irregular data is Ising-based binary clustering using the kernel method [40]. This method involves utilizing only one single decision boundary to linearly separate the data mapped onto a high-dimensional feature space. ...
Article
Full-text available
Combinatorial clustering based on the Ising model is drawing attention as a high-quality clustering method. However, conventional Ising-based clustering methods using the Euclidean distance cannot handle irregular data. To overcome this problem, this paper proposes an Ising-based kernel clustering method. The kernel clustering method is designed based on two critical ideas. One is to perform clustering of irregular data by mapping the data onto a high-dimensional feature space by using a kernel trick. The other is the utilization of matrix–matrix calculations in the numerical libraries to accelerate preprocess for annealing. While the conventional Ising-based clustering is not designed to accept the transformed data by the kernel trick, this paper extends the availability of Ising-based clustering to process a distance matrix defined in high-dimensional data space. The proposed method can handle the Gram matrix determined by the kernel method as a high-dimensional distance matrix to handle irregular data. By comparing the proposed Ising-based kernel clustering method with the conventional Euclidean distance-based combinatorial clustering, it is clarified that the quality of the clustering results of the proposed method for irregular data is significantly better than that of the conventional method. Furthermore, the preprocess for annealing by the proposed method using numerical libraries is by a factor of up to 12.4 million × from the conventional naive python’s implementation. Comparisons between Ising-based kernel clustering and kernel K-means reveal that the proposed method has the potential to obtain higher-quality clustering results than the kernel K-means as a representative of the state-of-the-art kernel clustering methods.
... Kumar et al. 2018;Bauckhage et al. 2018), vector quantization(Bauckhage et al. 2020), support vector machines(Mücke et al. 2019;Date et al. 2020), and probabilistic graphical models(Mücke et al. 2019), to name a few. ...
Article
Full-text available
In machine learning, fewer features reduce model complexity. Carefully assessing the influence of each input feature on the model quality is therefore a crucial preprocessing step. We propose a novel feature selection algorithm based on a quadratic unconstrained binary optimization (QUBO) problem, which allows to select a specified number of features based on their importance and redundancy. In contrast to iterative or greedy methods, our direct approach yields higher-quality solutions. QUBO problems are particularly interesting because they can be solved on quantum hardware. To evaluate our proposed algorithm, we conduct a series of numerical experiments using a classical computer, a quantum gate computer, and a quantum annealer. Our evaluation compares our method to a range of standard methods on various benchmark data sets. We observe competitive performance.
... (13) are a popular class of optimization problems, known to be NP-hard (Pardalos and Jha, 1992). Numerous practical optimization problems have been embedded into QUBO form, ranging from finance and economics (Laughhunn, 1970;Hammer and Shlifer, 1971) over satisfiability (Kochenberger et al, 2005) to ML tasks such as clustering (Kumar et al, 2018;Bauckhage et al, 2018), vector quantization (Bauckhage et al, 2020), support vector machines (Mücke et al, 2019;Date et al, 2020) and probabilistic graphical models (Mücke et al, 2019), to name a few. ...
Preprint
In machine learning, fewer features reduce model complexity. Carefully assessing the influence of each input feature on the model quality is therefore a crucial preprocessing step. We propose a novel feature selection algorithm based on a quadratic unconstrained binary optimization (QUBO) problem, which allows to select a specified number of features based on their importance and redundancy. In contrast to iterative or greedy methods, our direct approach yields higherquality solutions. QUBO problems are particularly interesting because they can be solved on quantum hardware. To evaluate our proposed algorithm, we conduct a series of numerical experiments using a classical computer, a quantum gate computer and a quantum annealer. Our evaluation compares our method to a range of standard methods on various benchmark datasets. We observe competitive performance.
Chapter
Combinatorial optimization problems, integral to various scientific and industrial applications, often vary significantly in their complexity and computational difficulty. Transforming such problems into Quadratic Unconstrained Binary Optimization (Qubo) has regained considerable research attention in recent decades due to the central role of Qubo in Quantum Annealing. This work aims to shed some light on the relationship between the problems’ properties. In particular, we examine how the spectral gap of the Qubo formulation correlates with the original problem, since it has an impact on how efficiently it can be solved on quantum computers. We analyze two well-known problems from Machine Learning, namely Clustering and Support Vector Machine (SVM) training, regarding the spectral gaps of their respective Qubo counterparts. An empirical evaluation provides interesting insights, showing that the spectral gap of Clustering Qubo instances positively correlates with data separability, while for SVM Qubo the opposite is true.
Preprint
Full-text available
In the context of optical signal processing, quantum and quantum-inspired machine learning algorithms have massive potential for deployment. One of the applications is in error correction protocols for the received noisy signals. In some scenarios, non-linear and unknown errors can lead to noise that bypasses linear error correction protocols that optical receivers generally implement. In those cases, machine learning techniques are used to recover the transmitted signal from the received signal through various estimation procedures. Since quantum machine learning algorithms promise advantage over classical algorithms, we expect that optical signal processing can benefit from these advantages. In this review, we survey several proposed quantum and quantum-inspired machine learning algorithms and their applicability with current technology to optical signal processing.
Article
Full-text available
Quantum computing for machine learning attracts increasing attention and recent technological developments suggest that especially adiabatic quantum computing may soon be of practical interest. In this paper, we therefore consider this paradigm and discuss how to adopt it to the problem of binary clustering. Numerical simulations demonstrate the feasibility of our approach and illustrate how systems of qubits adiabatically evolve towards a solution.
Article
Full-text available
The emerging field of quantum machine learning has the potential to substantially aid in the problems and scope of artificial intelligence. This is only enhanced by recent successes in the field of classical machine learning. In this work we propose an approach for the systematic treatment of machine learning, from the perspective of quantum information. Our approach is general and covers all three main branches of machine learning: supervised, unsupervised, and reinforcement learning. While quantum improvements in supervised and unsupervised learning have been reported, reinforcement learning has received much less attention. Within our approach, we tackle the problem of quantum enhancements in reinforcement learning as well, and propose a systematic scheme for providing improvements. As an example, we show that quadratic improvements in learning efficiency, and exponential improvements in performance over limited time periods, can be obtained for a broad class of learning problems.
Conference Paper
Full-text available
We show that k-means clustering is a matrix factorization problem. Seen from this point of view, k-means clustering can be computed using alternating least squares techniques and we show how the constrained optimization steps involved in this procedure can be solved efficiently using the Frank-Wolfe algorithm.
Article
Full-text available
Machine learning algorithms learn a desired input-output relation from examples in order to interpret new inputs. This is important for tasks such as image and speech recognition or strategy optimisation, with growing applications in the IT industry. In the last couple of years, researchers investigated if quantum computing can help to improve classical machine learning algorithms. Ideas range from running computationally costly algorithms or their subroutines efficiently on a quantum computer to the translation of stochastic methods into the language of quantum theory. This contribution gives a systematic overview of the emerging field of quantum machine learning. It presents the approaches as well as technical details in an accessable way, and discusses the potential of a future theory of quantum learning.
Article
Full-text available
We provide Ising formulations for many NP-complete and NP-hard problems, including all of Karp's 21 NP-complete problems. This collects and extends classic results relating partitioning problems to Ising spin glasses, as well as work describing exact covering algorithms and satisfiability. In each case, the state space is at most polynomial in the size of the problem, as is the number of terms in the Hamiltonian. This work may be useful in designing adiabatic quantum optimization algorithms.
Article
We present quantum algorithms for performing nearest-neighbor learning and k–means clustering. At the core of our algorithms are fast and coherent quantum methods for computing the Euclidean distance both directly and via the inner product which we cou- ple with methods for performing amplitude estimation that do not require measurement. We prove upper bounds on the number of queries to the input data required to compute such distances and find the nearest vector to a given test example. In the worst case, our quantum algorithms lead to polynomial reductions in query complexity relative to Monte Carlo algorithms. We also study the performance of our quantum nearest-neighbor al- gorithms on several real-world binary classification tasks and find that the classification accuracy is competitive with classical methods.