Picking Efficient Portfolios from 3,171 US
Common Stocks with New Quantum and
Classical Solvers
Chicago Quantum∗
November 4, 2020
Abstract
We analyze 3,171 U.S. common stocks to create an efficient portfo-
lio based on the Chicago Quantum Net Score (CQNS). We begin with
classical solvers, then incorporate quantum annealing. We incorporate a
simulated bifurcator into our ‘team’ of solvers, along with the new D-Wave
Advantage™ quantum annealing computer with 5,760 available qubits in
a Pegasus graph size P16 architecture.
Contents
1 Introduction
2 Motivation
3 The Simulated Bifurcation Machine
3.1 Our Modifications
3.2 Tuning the SBM
3.3 Our Picks
4 Scaling the D-Wave Advantage Solver
4.1 Changes from the Previous Generation
4.2 Experimental Results
∗Jeffrey Cohen, Clark Alexander
arXiv:2011.01308v1 [quant-ph] 30 Oct 2020
1 Introduction
In this work we continue our progress on the stock picking problem using the
Sharpe ratio and the Chicago Quantum Net Score. Previously we had worked
through the problems of picking the optimal portfolio from a universe of 40
then 60 stocks. In this work we have scaled our problem to a universe of 3171
U.S. common stocks, which in some measure is the entirety of all daily traded
stocks which have “good data” for the year period beginning September 2019
and ending September 2020. In order to scale our solutions for this problem
we have employed a number of new techniques. We continue to use a genetic
algorithm, several simulated annealers, and the D-Wave Systems (hereafter D-
Wave) 2000Q quantum annealing computer. In this work we employ D-Wave’s
next-generation quantum annealing computer, the Advantage™ [Adv], as well
as our own simulated bifurcation machine. We have significantly sped up our
genetic algorithm to allow for many more generations and to produce an initial
population from multiple sources. While we have worked through a universe of
3171 stocks, we have no claim to a universal best solution, but rather, we claim
to have gotten “good” solutions, which we can measure empirically by their
performances as stock portfolios. Since we are incorporating a buy-and-hold
strategy, we do not consider puts and calls, or the Black-Scholes model with
the American or European option, nor do we employ the plethora of time series
based analyses of stocks. Instead, we present our results from the work we have
done. Since each stock is either included in or excluded from a portfolio, this
leaves us with a solution space of $2^{3171}$ possible portfolios. A quick
calculation reveals:
$$2^{3171} \approx 3.68 \times 10^{954}$$
It being late 2020, we can say precisely what computationally intractable means.
In this case, let's assume we have exascale digital computers, that is, $10^{18}$
FLOPS. Additionally, if every atom in the observable universe were its own
exascale computer we'd arrive at approximately $10^{90}$ exascale computers, and
we have had roughly $4.4 \times 10^{17}$ seconds (about the age of the universe),
which we round up to $10^{18}$. Putting it together,
$10^{18} \times 10^{90} \times 10^{18} = 10^{126}$, so there is no way to brute
force a solution space of size $10^{126}$. We see that we
are far in excess of this number with 3171 stocks. Thus while we know that our
solutions are better than quintillions of other possible portfolios, we cannot in
good faith guarantee a universal optimal solution. However, one faces a second
issue in measuring the “goodness” of a solution when dealing with stocks. Is the
investor doing better than the market? If so, how much better? In this case,
our solutions will, with high probability, pass muster.
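The order-of-magnitude estimates above can be checked with a few lines of Python. This is only a sanity check of the arithmetic. The variable names are ours, and the hardware figures are the same rough assumptions used in the text.

```python
import math

N_STOCKS = 3171  # size of the stock universe in the text

# Each stock is either in or out of a portfolio, so there are 2**N_STOCKS
# candidate portfolios. Work in log10 to keep the numbers readable.
log10_portfolios = N_STOCKS * math.log10(2)
exponent = math.floor(log10_portfolios)
mantissa = 10 ** (log10_portfolios - exponent)
print(f"2^{N_STOCKS} ~ {mantissa:.2f} x 10^{exponent}")

# Rough brute-force budget from the text: 10^18 FLOPS per machine,
# 10^90 machines, and 10^18 seconds of run time.
log10_budget = 18 + 90 + 18
print(f"brute-force budget ~ 10^{log10_budget} operations")

# Even at one portfolio per operation, this many orders of magnitude remain.
print(f"shortfall ~ 10^{exponent - log10_budget}")
```

Working in logarithms avoids formatting a 955-digit integer and makes the gap between the search space and the budget explicit.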
2 Motivation
Our work proceeds in two steps. In Step 1, we validate all U.S. equities
(common stocks) and run them through our classical solvers to find the most
attractive portfolio of N stocks that can be run on a quantum annealing
computer. We also allow these solvers to identify attractive portfolios. In
Step 2, we take the best N-stock portfolio found and run those stocks through
additional solvers, including the quantum annealing computer, to identify the
overall most attractive portfolios.
This two-step approach enables us to find deeper local minima more efficiently,
i.e. lower (more negative) CQNS values, which equate with more attractive
investment portfolios.
Our motivation is that our classical solvers scale to analyze U.S. common
stocks listed on NYSE, NASDAQ-Q and NYSE American (previously the Amer-
ican Stock Exchange), which includes 3,171 validated stocks as of October 10,
2020. At this scale, we create a single 3171 × 3171 matrix for each portfolio size
desired to create our QUBO.
The D-Wave quantum annealing computer scales to analyze 138 out of a
maximum possible 180 [CDHR, EFD] stocks at one time (20% embedding suc-
cess rate) and embeds 134 stocks consistently (∼80% embedding success during
this experiment). We have additional classical solvers at this scale, including
the D-Wave TABU and simulated annealer, along with our in-house simulated
bifurcation machine. These methods utilize the QUBO which we create at this
scale.
For 3,171 stocks, we use the following workflow.
1. Calculate the ‘all-in’ values, and calibrate CQNS value to zero.
2. Run Monte Carlo in a discrete probability distribution around N/2 stock
portfolio size.
3. Run Monte Carlo for target portfolio size of N.
4. Run Genetic Algorithm, with a randomly selected initial population for
target portfolio size of N.
5. Run the simulated annealer for target portfolio size of N.
6. Build N-asset, M × M (here M = 3171) matrix for conversion to QUBO.
7. Run D-Wave simulated annealer for target portfolio size of N.
8. Run the simulated bifurcation machine for target portfolio size of N.
9. Run D-Wave TABU sampler for target portfolio size of N.
10. Run the Genetic Algorithm with prior solution as its initial population,
for target portfolio size of N.
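As an illustration of the Monte Carlo steps in this workflow, the following is a minimal sketch of random search over fixed-size portfolios. It is not our production code. The scoring function `toy_score` is a stand-in that assumes equal weights and a score of the form variance minus expected return raised to a power, and all names and the synthetic data are hypothetical.

```python
import numpy as np

def toy_score(selection, cov, mu, power):
    """Toy stand-in for CQNS (an assumption, not the exact definition):
    equal-weight portfolio variance minus expected return ** power."""
    weights = selection / selection.sum()   # equal weights over chosen stocks
    variance = weights @ cov @ weights
    expected = weights @ mu
    return variance - expected ** power

def monte_carlo_fixed_size(cov, mu, n_pick, n_samples=2000, power=3.0, seed=0):
    """Sample random portfolios of exactly n_pick stocks; keep the lowest score."""
    rng = np.random.default_rng(seed)
    m = len(mu)
    best_sel, best_score = None, np.inf
    for _ in range(n_samples):
        idx = rng.choice(m, size=n_pick, replace=False)
        sel = np.zeros(m)
        sel[idx] = 1.0
        score = toy_score(sel, cov, mu, power)
        if score < best_score:
            best_sel, best_score = sel, score
    return best_sel, best_score

# Synthetic universe of 50 stocks (made-up data, for illustration only).
rng = np.random.default_rng(42)
returns = rng.normal(0.0, 0.02, size=(250, 50))
cov = np.cov(returns, rowvar=False)
mu = rng.uniform(0.02, 0.15, size=50)

best_sel, best_score = monte_carlo_fixed_size(cov, mu, n_pick=5)
print("selected stocks:", np.nonzero(best_sel)[0])
print("score:", best_score)
```

Portfolios found this way can serve as part of the initial population for the seeded genetic algorithm in the final step.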
We adjust the parameters for each solver to run each of them in five minutes
or less. In our experiment we did not find a valid solution (correct number of
assets) with the D-Wave TABU or Simulated Annealer samplers. They return
larger solutions (1,500 and 350 vs. 134 stocks).
Our matrix build creates the QUBO data for each portfolio size in
sequence, and adds the values to the N × N × N array. Our matrix build failed
between 1,500 and 2,000 matrices, likely due to a memory issue with our 2013
iMac. This occurred whether we broke up the build process into subsets or
eliminated the matrix scaling process.
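A rough memory estimate shows why a build of this size strains a desktop machine. The sketch below assumes dense storage with 8-byte floats, which the text does not specify. Actual usage depends on the data type and on whether sparse or symmetric-packed storage is used.

```python
M = 3171                 # matrix dimension (number of stocks)
BYTES_PER_ENTRY = 8      # assumption: dense float64 storage

bytes_per_matrix = M * M * BYTES_PER_ENTRY

def to_gib(n_bytes):
    """Convert a byte count to gibibytes."""
    return n_bytes / 2**30

# Cumulative footprint if every matrix is kept in memory at once.
for n_matrices in (1, 1500, 2000, M):
    total = n_matrices * bytes_per_matrix
    print(f"{n_matrices:>5} matrices: {to_gib(total):8.1f} GiB")
```

Under these assumptions the full stack of matrices is far larger than the memory of a typical desktop, so streaming one portfolio size at a time or using a smaller data type would be natural mitigations.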
In the end, our best N-stock portfolio (N = 134) had a CQNS score of
$-3.14 \times 10^{-3}$. This was found by our seeded genetic algorithm. The ‘all
3,171 asset’ portfolio has a CQNS score of zero, and the other solvers found
solutions between these values if they found valid solutions.
Remark. Our collective runs took 232 seconds. This gives a bit of scale as
to how fast the SBM runs on home scale architecture. We can speed this up
significantly on a more powerful computing platform.
3 The Simulated Bifurcation Machine
Our work with the simulated bifurcation machine, hereafter SBM, is based fully
on [SB]. We shall not go into the explanation of why the mechanism of SBM
works, but rather we shall say what modifications we made, why, and what
results we obtained using SBM. One should also note that we did not use a
version of SBM available through a cloud service, but rather one that was coded
in-house. The reason for this is simple: we want to understand how to tune
the parameters to force the bifurcations to happen as quickly or as slowly as
desired.
Figure 1: Time evolution of the x vector in Simulated Bifurcation
In Figure 1 we see real data, but we acknowledge that 3171 items is difficult to
visualize, so we have added some smaller portfolios in Appendix B (Figure 9).
3.1 Our Modifications
The most important modification we should mention is that the official white
paper contains a small error for our set up. The set up of SBM in the white
paper assumes a purely Ising model matrix for a QUBO model. In our case,
we are looking at a {0,1} solution set rather than a {−1,1} solution set. This
means we must make a coordinate change.
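Remark. Let $\vec{x} \in \{0,1\}^N$ and $\vec{z} \in \{-1,1\}^N$. One converts $\vec{x}$ to $\vec{z}$ by
$$z_i = 2x_i - 1.$$
Thus we define $\vec{1} = \sum_i \hat{e}_i$ and write
$$\vec{z} = 2\vec{x} - \vec{1}.$$
A generalized quadratic form is given by
$$\vec{x}^t A \vec{x} + B \cdot \vec{x} + K_1.$$
We wish to convert this to an Ising model
$$\vec{z}^t J \vec{z} + C \cdot \vec{z} + K_2.$$
To make a long story short, substituting $\vec{x} = (\vec{z} + \vec{1})/2$ with symmetric $A$ gives the following formulae:
$$J = A/4 \tag{1}$$
$$C = \frac{1}{2}B + 2(J\vec{1}) \tag{2}$$
$$K_2 = K_1 + (C \cdot \vec{1}) - \vec{1}^t J \vec{1} \tag{3}$$
However, we throw away both $K_1$ and $K_2$ since they are unimportant to the
values of $\vec{x}, \vec{z}$.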
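The coordinate change can be checked numerically. The sketch below substitutes $\vec{x} = (\vec{z} + \vec{1})/2$ directly, which for symmetric $A$ gives $J = A/4$, $C = \frac{1}{2}B + 2J\vec{1}$, and $K_2 = K_1 + \frac{1}{2}B \cdot \vec{1} + \vec{1}^t J \vec{1}$. It then confirms by brute force on a small random instance that both forms agree for every binary vector. The function name and test data are ours, for illustration only.

```python
import itertools
import numpy as np

def qubo_to_ising(A, B, K1=0.0):
    """Map x^T A x + B.x + K1 (x in {0,1}^n) to z^T J z + C.z + K2 (z in {-1,1}^n),
    using z = 2x - 1."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    ones = np.ones(A.shape[0])
    J = A / 4.0
    # (J + J^T) @ ones equals 2 * J @ ones when A is symmetric.
    C = 0.5 * B + (J + J.T) @ ones
    K2 = K1 + 0.5 * (B @ ones) + ones @ J @ ones
    return J, C, K2

# Small random symmetric instance (made-up data).
rng = np.random.default_rng(0)
n = 6
M = rng.normal(size=(n, n))
A = (M + M.T) / 2
B = rng.normal(size=n)
K1 = 0.3
J, C, K2 = qubo_to_ising(A, B, K1)

# Compare both energy functions on all 2^n binary vectors.
max_err = 0.0
for bits in itertools.product([0, 1], repeat=n):
    x = np.array(bits, dtype=float)
    z = 2 * x - 1
    qubo_value = x @ A @ x + B @ x + K1
    ising_value = z @ J @ z + C @ z + K2
    max_err = max(max_err, abs(qubo_value - ising_value))
print("largest discrepancy:", max_err)
```

Since the constants do not affect which vector is optimal, only $J$ and $C$ need to be passed on to a solver.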
The main modification we make is that in converting from a pure {0,1}
quadratic form, i.e. $\vec{x}^t A \vec{x}$, to an Ising model, we pick up a nonzero offset vector
$C$. In [SB] this offset vector is missing in equations 12, 13, and 17, and thus in our
coding we have reintroduced these offset vectors.
That is to say we have made the change
$$\xi_0 J \cdot \vec{z} \mapsto \xi_0 (J \cdot \vec{z} + C) \tag{4}$$
We also experimented with changing the pressure scaling from a linear
progression from zero to one to a logistic curve; however, we found no empirical
difference in the quality of solutions.
The other modification we have made is that when one trajectory has clearly
bifurcated, but others are still oscillating around zero, we have set a threshold to
avoid divergent trajectories. We were unable to replicate the quality of solutions
in the original paper for large scale problems; however, in tiny problems (i.e.
dozens of trajectories) we were able to find the universally best solution. We
did, however, manage to tune our SBM to give good solutions very quickly.
Our empirical findings show that, as a single solution technique, our
in-house simulated annealer tuned with the statistics of random matrices
consistently produced better results, at the cost of additional time. For
example, in time tests, we were able to produce very good results on a 10,000 ×
10,000 matrix in 47 minutes on a laptop. For this paper
we executed 10,016 iterations on a 3,171 × 3,171 matrix to select 134 stocks
in 232 seconds. It is, of course, the reader’s choice as to which combination
of techniques to use, but we find that seeding a genetic
algorithm with an initial population of Monte Carlo, simulated annealer, genetic
algorithm, quantum annealer, and SBM solutions works best.
3.2 Tuning the SBM
The simulated bifurcation machine takes in a number of parameters for tuning,
specifically ∆, K, ξ, ε, p, and number of iterations. In our experiments we found
that the most effective parameters for tuning were ε, ξ, and number of iterations.
The white paper [SB] sets ∆, K at approximately 1, and p increases linearly.
We found these parameter choices to be sufficient, but the coupling constant ξ,
which drives the adiabatic shift from nonlinear oscillator to Ising model, greatly
affected our results. Additionally, ε affected how quickly a bifurcation occurred,
and the number of iterations is roughly inversely proportional to ε.
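To make the roles of these parameters concrete, here is a minimal sketch of a simulated bifurcation loop in the spirit of [SB], with a linear offset term added. It is not our in-house implementation. The sign and scale conventions, the default for `xi0`, the clipping bound `x_bound`, and the use of `dt` as the integration step are all assumptions made for illustration, and the routine is a heuristic with no optimality guarantee.

```python
import itertools
import numpy as np

def ising_energy(J, C, z):
    """Energy z^T J z + C.z for a spin vector z in {-1, +1}^n."""
    return float(z @ J @ z + C @ z)

def simulated_bifurcation(J, C, n_steps=2000, dt=0.5, delta=1.0, K=1.0,
                          xi0=None, x_bound=2.0, seed=0):
    """Heuristically minimise z^T J z + C.z over z in {-1, +1}^n."""
    J = np.asarray(J, dtype=float)
    C = np.asarray(C, dtype=float)
    n = J.shape[0]
    grad_J = J + J.T                       # gradient of x^T J x is (J + J^T) x
    if xi0 is None:                        # assumed heuristic default
        spread = np.sqrt(np.mean(grad_J ** 2) * n)
        xi0 = 0.7 * delta / max(spread, 1e-12)
    rng = np.random.default_rng(seed)
    x = np.zeros(n)                        # oscillator positions
    y = rng.uniform(-0.1, 0.1, size=n)     # oscillator momenta
    for step in range(n_steps):
        p = delta * (step + 1) / n_steps   # pressure ramps linearly up to delta
        x = x + delta * y * dt
        x = np.clip(x, -x_bound, x_bound)  # assumed guard against divergence
        force = -(K * x ** 2 - p + delta) * x - xi0 * (grad_J @ x + C)
        y = y + force * dt
    return np.where(x >= 0, 1, -1)         # read out spins from the signs

# Small random instance (made-up data), compared against brute force.
rng = np.random.default_rng(1)
n = 8
M = rng.normal(size=(n, n))
J_demo = (M + M.T) / 2
np.fill_diagonal(J_demo, 0.0)
C_demo = rng.normal(size=n)

z_sb = simulated_bifurcation(J_demo, C_demo)
e_sb = ising_energy(J_demo, C_demo, z_sb)
e_opt = min(ising_energy(J_demo, C_demo, np.array(s))
            for s in itertools.product([-1, 1], repeat=n))
print("SB energy:", e_sb, "brute-force optimum:", e_opt)
```

In this sketch a smaller `dt` slows the dynamics and calls for more iterations, and `xi0` sets how strongly the problem couplings compete with the single-oscillator potential.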
3.3 Our Picks
The stocks we selected out of 3,171 candidates were AX and MCRB, with data
effective from market close October 9, 2020. This analysis is not based on
company fundamentals, but relies on the patterns of the adjusted closing price data.
Axos Financial, Inc. (AX) is a regional bank with mortgage exposure in
Southern California. Seres Therapeutics, Inc. (MCRB) is a microbiome ther-
apeutics biological drug platform company. These are picks from significantly
different industries.
They started up 5% and 0.5% in the first 2.25 trading days,
compared to the market indices, which were up ∼1%. Our model also appears
to select some stocks with short-term volatility to trade on, because it prefers
stocks with higher β values.
Figure 2: Movement of the pair over 2.25 trading days
We check the performance of our two stocks against the S&P 500 index ETF
(SPY) and the index we use to calculate β. We start October 12 market open
and we read out October 30 at time of writing. (MCRB) and (AX) appear as
blue lines, while the index is yellow.
Figure 3: Movement of the pair over 14.5 trading days, orange is QQQ, yellow
is SPY, dark blue is AX, lighter blue is MCRB
While the S&P 500 fell 5.5%, (AX) increased 6.8%, and (MCRB) fell 9.5%.
On average, our equally weighted portfolio fell about 1.4%
((6.8% − 9.5%)/2 = −1.35%), which is less than half the decline of the
(SPY).
These stocks consistently move in opposite directions over the period and
straddle the index. (MCRB) has significant volatility during the period which
gives investors a chance to trade that stock.
In this chart, the Nasdaq Composite index ETF (QQQ) fell 4.5% during the
period. The (QQQ) also moves between the two stocks selected.
4 Scaling the D-Wave Advantage Solver
The new D-Wave computer has 5,760 qubits set in a “Pegasus” graph size P16
topology with increased connectivity over the “Chimera” topology in the previous
solver generation. It runs at 15.8 mK, and is used via D-Wave’s Ocean™
software development toolkit, available through the D-Wave Leap™ cloud-based
service.
From an experimental perspective the D-Wave Advantage 1.1 runs like the
previous generation of quantum annealing solver. We connect via a Python
Jupyter Notebook, establish the connection via the cloud, and submit jobs for
execution. We use the D-Wave Problem Inspector™ to gain additional information
about the problem submitted, its embedding on the qubits, and the
system itself.
We successfully embedded 138 stocks on the system and downsized the problem
to 134 stocks for our experiment to increase the embedding success rate.
Figure 4: D-Wave Dashboard
4.1 Changes from the Previous Generation
A few things changed during this experiment compared with our prior work on the
D-Wave Chimera system (where we embedded 64 stocks).
1. The wait time between runs has increased significantly to ∼10 minutes
on average depending on the time of day. This requires us to significantly
limit our runs.
2. D-Wave embedding was successful in 11 of 15 attempts for 134 stocks. We
embedded and ran a maximum of 138 stocks one time. When we fail to
embed, an error occurs and we manually restart the analysis.
3. We see biases upwards of (-7,3) on the qubits, whereas in the Chimera our
bias range was typically (-1,1).
4. Qubit bias ranges of (-1,1) create qubit connections above and below the
matrix. Qubit biases that significantly exceed abs(1.0) due to division by
chain strength values create no connectivity below the matrix (see 2 charts
below).
5. The chain strengths that find valid solutions are 0.15 and 1.5.
6. Our chain lengths vary from 27 to 45 qubits. Our experimental results
show almost 100% chain breaks which is significantly higher than on the
Chimera.
7. The number of target variables, and max chain length, varied due to the
embedding quality. Best and worst combination:
   target variables | max chain length | chain strength
   2,430            | 27               | 0.2
   3,730            | 45               | 1.5
8. We scale our QUBO to (-4,4) instead of (-0.99,0.99) before using it with
solvers.
9. We vary the number of runs for each size portfolio from 200 to 1000.
10. We vary the annealing times from 20 to 150 µs with the portfolio size
desired (larger portfolios require more annealing time).
11. As for all methods, we save all “better” CQNS portfolios from each method
for use as input to the seeded genetic algorithm.
12. Our change in formulation, for a portfolio P, from $\mathrm{Var}[P] - \sum_i (E[P_i]^3)$ to
$\mathrm{Var}[P] - (\sum_i E[P_i])^3$ for the final portfolio scoring does change the resulting
“ideal” portfolio. We are using the QUBO-based solvers to select portfolios
based on the previous formulation. We changed this to enhance the quality
of our portfolio selection.
13. The QPU run time on the D-Wave systems varies from 85k to 199k µs
with programming times consistently at 26k µs and annealing times set
by the operator. Post processing times vary from 8k to 36k µs.
14. The number of stocks we seek in the QUBO matters. We sampled for 5
portfolio sizes and were successful with 67, 47, and 3 stocks out of 134, but
not with 37 or 17 stocks. We envision running through more portfolio sizes once
the system wait time is reduced and the embedding success rate increases.
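The QUBO scaling and submission settings listed above can be sketched as follows. This is a hedged illustration rather than our notebook code: the text does not specify how the scaling to (-4,4) is done, so a single linear rescaling of the folded upper-triangular matrix is assumed, and the helper names are hypothetical. The submission function uses the Ocean SDK's `DWaveSampler` and `EmbeddingComposite`; it needs a configured Leap API token and is defined but not called here.

```python
import numpy as np

def fold_upper(Q):
    """Fold a square QUBO matrix into upper-triangular form (same x^T Q x)."""
    Q = np.asarray(Q, dtype=float)
    return np.triu(Q + Q.T, k=1) + np.diag(np.diag(Q))

def scale_to_range(U, limit=4.0):
    """Linearly rescale so the largest |entry| equals limit (assumed method)."""
    peak = np.max(np.abs(U))
    factor = 1.0 if peak == 0 else limit / peak
    return U * factor, factor

def to_qubo_dict(U):
    """Convert nonzero entries to the {(i, j): value} form used by sample_qubo."""
    rows, cols = np.nonzero(U)
    return {(int(i), int(j)): float(U[i, j]) for i, j in zip(rows, cols)}

def submit_to_advantage(qubo_dict, num_reads=200, annealing_time=20,
                        chain_strength=1.5):
    """Submit to a Pegasus-topology QPU. Requires Ocean SDK and Leap access."""
    from dwave.system import DWaveSampler, EmbeddingComposite
    qpu = DWaveSampler(solver={"topology__type": "pegasus"})
    sampler = EmbeddingComposite(qpu)
    return sampler.sample_qubo(qubo_dict, num_reads=num_reads,
                               annealing_time=annealing_time,
                               chain_strength=chain_strength)

# Local check on made-up data: rescaling multiplies every energy by one factor,
# so the ranking of portfolios is unchanged.
rng = np.random.default_rng(3)
Q = rng.normal(size=(6, 6))
U_scaled, factor = scale_to_range(fold_upper(Q))
x = rng.integers(0, 2, size=6).astype(float)
print(np.isclose(x @ U_scaled @ x, factor * (x @ Q @ x)))
qubo_dict = to_qubo_dict(U_scaled)
```

The defaults for `num_reads`, `annealing_time` (in µs), and `chain_strength` are taken from the ranges reported in the list above and would be varied per portfolio size.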
4.2 Experimental Results
We controlled for sample size between 3,171 stocks and 134 stocks by setting
the CQNS power = ln(Var)/ln(E). We then used the same CQNS power with
134 stocks. Our first portfolio chosen, with 134 stocks, had a CQNS score of
$-3.14 \times 10^{-3}$, versus the second portfolio chosen, with 2 stocks, which had a
value of $-3.9 \times 10^{-3}$. The best answer was found by four of our solvers
(seeded GA, MC, bespoke SA, and D-Wave SA).
The best D-Wave quantum annealing run returned 3 stocks, which was less
attractive than our ideal solution. It had a CQNS score of $-1.69 \times 10^{-3}$ and was
found in 234k µs plus wait and cloud access time. This is a good solution, better
than the all-in 134-stock portfolio with a value of $-6 \times 10^{-5}$ and the simulated
bifurcator score of $4.1 \times 10^{-4}$.
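The calibration of the CQNS power can be illustrated with a short sketch. It assumes a score of the form Var − E^k, where Var and E are the variance and expected return of a portfolio, so that choosing k = ln(Var)/ln(E) on the all-in portfolio sets its score to zero. The numbers below are made up, and the function names are hypothetical.

```python
import math

def cqns_power(var_all_in, exp_all_in):
    """Power k such that var_all_in - exp_all_in ** k == 0.
    Assumes var_all_in > 0 and 0 < exp_all_in < 1."""
    return math.log(var_all_in) / math.log(exp_all_in)

def cqns(variance, expected, power):
    """Assumed score form: variance minus expected return raised to a power."""
    return variance - expected ** power

# Made-up all-in portfolio statistics, for illustration only.
var_all, exp_all = 0.04, 0.10
k = cqns_power(var_all, exp_all)
print("calibrated power:", k)
print("all-in score:", cqns(var_all, exp_all, k))

# A candidate with lower variance and higher expected return scores below zero.
candidate_score = cqns(0.03, 0.12, k)
print("candidate score:", candidate_score)
```

With the all-in portfolio pinned at zero, any portfolio with a negative score is more attractive than holding the full universe under this measure.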
(in xls format) from the NYSE website. We remove funds, trusts, warrants, test,
and preferred stocks. Our stock sample includes US inter-exchange listed foreign
stocks that trade in U.S. exchanges.
We do a test download of each stock for one year via the yfinance Python module
and remove stocks with negative prices, missing prices, failed downloads, or
less than one full year of continuous trading data.
We check for positive eigenvalues (within the precision tolerance of Python) in
the covariance matrix created with all stocks to ensure we have a positive
definite covariance matrix.
We set a β range and remove stocks with higher and lower β values. In this
experiment, we only remove negative β stocks.
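The eigenvalue check on the covariance matrix can be sketched with NumPy as follows. The tolerance value and the function name are our assumptions for illustration, and the return data is synthetic.

```python
import numpy as np

def passes_eigenvalue_check(cov, tol=1e-10):
    """Return True if cov is symmetric and its smallest eigenvalue is
    positive within a numerical tolerance (assumed value)."""
    cov = np.asarray(cov, dtype=float)
    if not np.allclose(cov, cov.T):
        return False
    eigenvalues = np.linalg.eigvalsh(cov)   # eigenvalues of a symmetric matrix
    return bool(eigenvalues.min() > -tol)

# Synthetic daily returns: 250 days, 5 stocks (made-up data).
rng = np.random.default_rng(7)
returns = rng.normal(0.0, 0.02, size=(250, 5))
cov_ok = np.cov(returns, rowvar=False)
print(passes_eigenvalue_check(cov_ok))

# A symmetric matrix with a negative eigenvalue fails the check.
cov_bad = np.array([[1.0, 2.0], [2.0, 1.0]])
print(passes_eigenvalue_check(cov_bad))
```

Using `eigvalsh` rather than a general eigenvalue routine exploits the symmetry of the covariance matrix and returns real eigenvalues.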
Appendix B: Charts
References
[Adv] McGeoch C, Farre P., Sept 25, 2020. The D-Wave Advantage System:
An Overview
[CDHR] Chapuis G, Djidjev H, Hahn G, Rizk G, April 2018, Finding Maximum
Cliques on the D-Wave Quantum Annealer
[CQ] Cohen Jeffrey, October 2020, Trying out the new D-Wave Advantage on
the stock picking problem
[EFD] Pelofske E., Hahn G., Djidjev H., 10 Sept 2020, Decomposition algo-
rithms for solving NP-hard problems on a quantum annealer
[NASDAQ] NASDAQ FTP
[NYSE] NYSE Market Trading Support Desk, Oct 9 2020, Intercontinental Ex-
change
[Px] Boothby K., Bunyk P., Raymond J., Roy A., Next-Generation Topology of
D-Wave Quantum Processors. 03 Mar 2020
[QA] D-Wave Quantum Annealer Documentation
[QUBO1] Glover F., Kochenberger G., Du Y., 2018, Quantum Bridge Analytics
I: A Tutorial on Formulating and Using QUBO Models.
[SA] D-Wave Simulated Annealer Documentation
[SB] Goto H., Tatsumura K., Dixon A., Combinatorial optimization by simulat-
ing adiabatic bifurcations in nonlinear Hamiltonian systems, Science Ad-
vances 19 Apr 2019: Vol. 5, no. 4, eaav2372, DOI: 10.1126/sciadv.aav2372
(a) CQNS (b) CQNS vs. Portfolio Sequence
(c) CQR vs. Portfolio Sequence (d) Sharpe Ratio vs. Portfolio Sequence
Figure 6: Charts showing resulting scores with different methods
Figure 7: A time of rising markets
(a) Qubit Embedding 1 (b) Qubit Embedding 2
(c) Qubit Embedding 3
Figure 8: Charts showing different qubit embeddings
(a) A 25 stock run of the SBM (b) A 15 stock run of the SBM
Figure 9: Additional runs of Simulated Bifurcator Machines