arXiv:2408.06585v1 [cs.CE] 13 Aug 2024
SSAAM: Sentiment Signal-based Asset Allocation
Method with Causality Information
Rei Taguchi
School of Engineering
The University of Tokyo
Tokyo, Japan
s5abadiee@g.ecc.u-tokyo.ac.jp
Hiroki Sakaji
School of Engineering
The University of Tokyo
Tokyo, Japan
sakaji@sys.t.u-tokyo.ac.jp
Kiyoshi Izumi
School of Engineering
The University of Tokyo
Tokyo, Japan
izumi@sys.t.u-tokyo.ac.jp
Abstract—This study examines whether financial text is useful for tactical asset allocation with stocks, using natural language processing to create polarity indexes from financial news. We segmented the created polarity index with a change-point detection algorithm. In addition, we constructed a stock portfolio and rebalanced it at each change point using an optimization algorithm. The proposed asset allocation method consequently outperforms the comparative approaches, suggesting that the polarity index helps construct an equity asset allocation method.
Index Terms—Financial news, MLM scoring, causal inference,
change-point detection, portfolio optimization
I. INTRODUCTION
This study proposes that financial text can be useful for
tactical asset allocation methods using equities. This study
focuses on the point at which stock and portfolio prices change
rapidly due to external factors, that is, the point of regime
change. Regimes in finance theory refer to unobservable market states, such as expansions, recessions, and bull and bear markets. In this
study, we specifically drew on the two studies presented below.
Wood et al. [1] used a change-point detection module to
capture regime changes and created a simple and expressive
model. Ito et al. [2] developed a method for switching invest-
ment strategies in response to market conditions. In this study,
we go one step further and focus on how to measure future
regime changes. If the information on future regime changes
(i.e., future changes in the market environment) is known,
active management with a higher degree of freedom becomes
possible. However, there are certain limitations in calculating
future regimes using only traditional financial time-series data.
Therefore, this study constructs an investment strategy based on a combination of financial time-series data and alternative data, which has attracted attention in recent years.
In this study, we hypothesized the following:
Portfolio performance can be improved by switching be-
tween risk-minimizing and return-maximizing optimiza-
tion strategies according to the change points created by
the polarity index.
The contributions of this study are as follows:
We demonstrate that estimating regime change points from financial text benefits active management, and we propose a highly expressive asset allocation framework.
The framework of this study consists of the following four
steps.
Step 1 (Creating polarity index): Score financial news
titles using MLM scoring. In addition, quartiles are calcu-
lated from the same data, and a three-value classification
of positive, negative, and neutral is performed according
to the quartile range. The calculated values are aggregated
daily.
Step 2 (Demonstration of leading effects): We use statistical causal inference to demonstrate whether financial news has leading effects on a stock portfolio, using the polarity index created in Step 1. We also create a portfolio combining 10 stocks. The algorithm used is VAR-LiNGAM.
Step 3 (Change point detection): Having verified in Step 2 that the polarity index has leading effects, calculate the regime change points of the polarity index using a change-point detection algorithm. The algorithm used is the binary segmentation search method.
Step 4 (Portfolio optimization): Portfolio optimization
is performed based on the change points created in Step
3. The algorithm used is EVaR optimization.
II. METHOD
A. Creating polarity index
This study used pseudo-log-likelihood scores (PLLs) to
create polarity indices. PLLs are scores based on probabilistic
language models proposed by Salazar et al. [3]. Because
masked language models (MLMs) are pre-trained by predicting words in both directions, they cannot be handled by conventional probabilistic language models. However, PLLs can determine the naturalness of sentences at a high level because they are represented by the sum of the log-likelihoods of the conditional probabilities when each word is masked and predicted. Token ψ_t is replaced by [MASK] and predicted from the remaining tokens Ψ_\t, i.e., the sentence with ψ_t masked out; t indexes the token position. Θ is the model parameter, and P_MLM(·) denotes the probability of each sentence token. BERT (Devlin et al. [4]) is selected as the MLM.
PLL(Ψ) := Σ_{t=1}^{|Ψ|} log_2 P_MLM(ψ_t | Ψ_\t; Θ)    (1)
After pre-processing, the financial news text is scored with PLLs one sentence at a time, and quartile ranges1 are calculated over the sentence-level scores. Table I illustrates the polarity classification method.
TABLE I
POLARITY CLASSIFICATION METHOD

Classification Method                 Sentiment Score
PLLs > 3rd quartile                   1 (positive)
1st quartile ≤ PLLs ≤ 3rd quartile    0 (neutral)
PLLs < 1st quartile                   -1 (negative)
Aggregate the scores chronologically according to the title
column of financial news.
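As a concrete sketch of the classification in Table I, the quartile thresholds and three-value labels can be computed as follows. The scores and the helper name label_scores are illustrative, not from the paper:

```python
import statistics

def label_scores(plls):
    """Three-value polarity labels from sentence-level PLL scores.
    Thresholds follow Table I: above the 3rd quartile is positive,
    below the 1st quartile is negative, in between is neutral."""
    q1, _, q3 = statistics.quantiles(plls, n=4)  # 1st, 2nd, 3rd quartiles
    labels = []
    for s in plls:
        if s > q3:
            labels.append(1)    # positive
        elif s < q1:
            labels.append(-1)   # negative
        else:
            labels.append(0)    # neutral
    return labels

scores = [-3.2, -1.1, -2.0, -0.5, -4.8, -1.9, -2.7, -0.9]  # made-up PLLs
print(label_scores(scores))
```

In the study the daily polarity index is then obtained by aggregating these per-sentence labels by release date.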
B. Demonstration of leading effects
In this study, we used VAR-LiNGAM to demonstrate the leading effects. VAR-LiNGAM is a statistical causal inference model proposed by Hyvärinen et al. [5]. The causal model inferred by VAR-LiNGAM is as follows:
x(t) = Σ_{τ=1}^{T} B_τ x(t−τ) + e(t)    (2)

where x(t) is the vector of the variables at time t, τ is the time lag, and T is the maximum lag. B_τ is a coefficient matrix representing the causal relationships between the variables x(t−τ), and e(t) denotes the disturbance term. VAR-LiNGAM was implemented using the
following procedure: First, a VAR (Vector Auto-Regressive)
model is applied to the causal relationships among variables
from the lag time to the current time. Second, for the causal
relationships among variables at the current time, LiNGAM
inference is performed using the residuals of the VAR model.
This study confirms whether financial news leads the stock portfolio.
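The two-stage procedure can be sketched in outline: fit a VAR by least squares, then pass its residuals to LiNGAM. The toy data and coefficients below are ours for illustration; the study itself uses the lingam Python package, and the LiNGAM stage is only indicated in a comment:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: variable 0 (a polarity-index stand-in) leads variable 1
# (a portfolio stand-in) by one day with coefficient 0.5.
T_obs, lag = 500, 1
x0 = rng.standard_normal(T_obs)
x1 = np.zeros(T_obs)
for t in range(1, T_obs):
    x1[t] = 0.5 * x0[t - 1] + 0.1 * rng.standard_normal()
X = np.column_stack([x0, x1])

# Stage 1: fit a VAR(1) model by ordinary least squares.
Y = X[lag:]    # x(t)
Z = X[:-lag]   # x(t-1)
B1, *_ = np.linalg.lstsq(Z, Y, rcond=None)  # lagged coefficients (transposed B_1)
residuals = Y - Z @ B1

# Stage 2 (not shown): LiNGAM is run on `residuals` to recover the
# instantaneous causal structure from their non-Gaussianity.
print(np.round(B1.T, 2))
```

With this setup, the recovered lagged coefficient from variable 0 to variable 1 is close to the true 0.5, which is the kind of entry reported in Table II.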
C. Change point detection
Binary segmentation search (Bai [6]; Fryzlewicz [7]) is a greedy sequential algorithm; the notation follows Truong et al. [8]. The operation is greedy in the sense that it seeks the change point with the lowest sum of costs. The signal is then split in two at the position of the first change point, and the same operation is repeated on the resulting sub-signals until the stopping criterion is reached. The binary segmentation search is expressed in Algorithm 1.
We define a signal y = {y_s}_{s=1}^S that follows a multivariate non-stationary stochastic process. This process involves S
1 Arranging the data in increasing order, the value one quarter of the way through is called the 1st quartile, the value at the halfway point the 2nd quartile, and the value three quarters of the way through the 3rd quartile. (3rd quartile − 1st quartile) is called the quartile range.
samples. L refers to the list of change points, and s denotes the value of a change point. G refers to an ordered list of gains to be computed. Given the signal y, the (b − a)-sample-long sub-signal {y_s}, a + 1 ≤ s ≤ b (1 ≤ a < b ≤ S), is simply denoted y_{a,b}. Hats represent calculated values. Other notation is explained in the algorithm's comments.
Algorithm 1 Binary Segmentation Search
Input: signal y = {y_s}_{s=1}^S, cost function c(·), stopping criterion.
Initialize L ← {}.    ▷ Estimated breakpoints
Repeat
  k ← |L|.    ▷ Number of breakpoints
  s_0 ← 0 and s_{k+1} ← S.    ▷ Dummy variables
  if k > 0 then
    Denote by s_i (i = 1, ..., k) the elements (in ascending order) of L, i.e. L = {s_1, ..., s_k}.
  end if
  Initialize G, a (k + 1)-long array.    ▷ List of gains
  for i = 0, ..., k do
    G[i] ← c(y_{s_i,s_{i+1}}) − min_{s_i < s < s_{i+1}} [c(y_{s_i,s}) + c(y_{s,s_{i+1}})]
  end for
  î ← arg max_i G[i]
  ŝ ← arg min_{s_î < s < s_{î+1}} [c(y_{s_î,s}) + c(y_{s,s_{î+1}})]    ▷ Estimated change point
  L ← L ∪ {ŝ}
Until stopping criterion is met.
Output: set L of estimated breakpoint indexes.
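Algorithm 1 can be sketched directly in Python. The version below uses a sum-of-squared-errors cost and stops after a fixed number of breakpoints rather than a generic stopping criterion; it is a minimal illustration, not the ruptures implementation used in the study:

```python
import numpy as np

def sse_cost(seg):
    """Cost of a segment: sum of squared deviations from the segment mean."""
    return float(np.sum((seg - seg.mean()) ** 2))

def binary_segmentation(y, n_bkps):
    """Greedy binary segmentation in the spirit of Algorithm 1: at each step,
    take the split with the largest gain over the current segmentation."""
    bkps = []
    for _ in range(n_bkps):
        bounds = sorted([0] + bkps + [len(y)])
        best_gain, best_s = -np.inf, None
        for a, b in zip(bounds[:-1], bounds[1:]):
            if b - a < 2:
                continue
            base = sse_cost(y[a:b])
            for s in range(a + 1, b):
                gain = base - (sse_cost(y[a:s]) + sse_cost(y[s:b]))
                if gain > best_gain:
                    best_gain, best_s = gain, s
        bkps.append(best_s)
    return sorted(bkps)

# Piecewise-constant toy signal with level shifts at s = 50 and s = 120.
y = np.concatenate([np.zeros(50), 5 * np.ones(70), 2 * np.ones(80)])
print(binary_segmentation(y, n_bkps=2))
```

On this noiseless toy signal the two recovered breakpoints coincide with the true level shifts.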
D. Portfolio optimization
The entropy value at risk (EVaR) is a coherent risk measure
that is the upper bound between the value at risk (VaR)
and conditional value at risk (CVaR) derived from Chernoff’s
inequality (Ahmadi-Javid [9]; Ahmadi-Javid [10]). EVaR has
the advantage of being computationally tractable compared to
other risk measures, such as CVaR, when incorporated into
stochastic optimization problems (Ahmadi-Javid [10]). EVaR
is defined as follows.
EVaR_α(X) := min_{z>0} { z ln( (1/α) M_X(1/z) ) }    (3)
X is a random variable, M_X is its moment-generating function, α denotes the significance level, and z is a variable.
A general convex programming framework for the EVaR is
proposed by Cajas [11]. In this study, we switch between the
following two optimization strategies depending on the regime
classified in Section II-C.
Minimize risk optimization: A convex optimization problem with constraints imposed to minimize EVaR given a level of expected return μ̄.

  minimize    q + z log_e(1/(Tα))
  subject to  μw ≥ μ̄
              Σ_{i=1}^N w_i = 1
              z ≥ Σ_{j=1}^T u_j
              (r_j w − q, z, u_j) ∈ K_exp    (j = 1, ..., T)
              w_i ≥ 0    (i = 1, ..., N)    (4)
Maximize return optimization: A convex optimization problem with constraints imposed to maximize the expected return given a level of expected EVaR (EVaR̄).

  maximize    μw
  subject to  q + z log_e(1/(Tα)) ≤ EVaR̄
              Σ_{i=1}^N w_i = 1
              z ≥ Σ_{j=1}^T u_j
              (r_j w − q, z, u_j) ∈ K_exp    (j = 1, ..., T)
              w_i ≥ 0    (i = 1, ..., N)    (5)
where q, z, and u are the variables, K_exp is the exponential cone, and T is the number of observations. w is a vector of weights for the N assets, r is a matrix of returns, and μ is the mean vector of the assets.
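Definition (3) can also be evaluated numerically from a sample without the cone program. The sketch below estimates the moment-generating function by a sample mean and minimizes over a grid of z values; the function name, grid, and data are ours for illustration (the study solves the exponential-cone formulation with Riskfolio-Lib instead):

```python
import numpy as np

def evar(losses, alpha=0.05, z_grid=None):
    """Sample-based EVaR via definition (3): min over z > 0 of
    z * ln(M_X(1/z) / alpha), with M_X(1/z) estimated by the sample
    mean of exp(X / z). Grid search stands in for convex optimization."""
    if z_grid is None:
        z_grid = np.geomspace(0.05, 20.0, 1000)
    vals = [z * (np.log(np.mean(np.exp(losses / z))) - np.log(alpha))
            for z in z_grid]
    return float(min(vals))

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)      # standard-normal "losses"
print(round(evar(x, alpha=0.05), 3))  # theory for N(0,1): sqrt(-2 ln 0.05) ≈ 2.448
```

For a standard normal distribution the closed-form EVaR at α = 0.05 is √(−2 ln 0.05) ≈ 2.448, so the sample estimate lands close to that value.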
III. EXPERIMENTS & RESULTS
A. Dataset description
This study calculates the signal for portfolio rebalancing and tactical asset allocation to actively pursue alpha, based on the assumption that financial news precedes the equity portfolio.
Two types of data were used.
Stock Data: We used the daily stock data provided by Yahoo! Finance2. The stocks used are the components of the NYSE FANG+ Index: Facebook, Apple, Amazon, Netflix, Google, Microsoft, Alibaba, Baidu, NVIDIA, and Tesla. Adjusted closing prices are used. The time period for this data is January 2015 through December 2019.
Financial News Data: We used the daily historical finan-
cial news archive provided by Kaggle3, a data analysis
platform. This data represents the historical news archive
2https://finance.yahoo.com/
3https://www.kaggle.com/
of U.S. stocks listed on the NYSE/NASDAQ for the past 12 years and was confirmed to contain information on the ten stocks above. The data consist of 9 columns and 221,513 rows; the title and release date columns were used in this study. The time period for this data is January 2015 through December 2019.
B. Preparation for backtesting
The polarity index is created as presented in Section II-A. The financial news data were pre-processed before creating the polarity index. Both financial news and stock data are in daily units; to match the periods, rows containing blanks in either dataset are dropped. Once the polarity index is created, the next step is to create a stock portfolio by adding the adjusted closing prices of the 10 stocks. The investment ratio of the portfolio is set uniformly across all stocks. Next, we use VAR-LiNGAM (Section II-B) to perform causal inference. For change-point detection (Section II-C), the Python library ruptures (Truong et al. [8]) was used. The causal inference results are as follows:
TABLE II
CAUSAL INFERENCE IN VAR-LINGAM

Causal Graph                       Value
Index(t−1) ⇢ Index(t)              0.39
Index(t−1) ⇢ Portfolio(t)          0.11
Portfolio(t−1) ⇢ Portfolio(t)      1.00
The values in Table II are elements of the adjacency matrices; the lower threshold was set to 0.05. The results show that the polarity index leads the equity portfolio. The Python library LiNGAM (Hyvärinen et al. [5]) was used.
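Reading off the leading effects from a VAR-LiNGAM lagged adjacency matrix reduces to thresholding its entries. The matrix below simply restates the Table II values for illustration; the variable names and helper code are ours:

```python
import numpy as np

# Lagged adjacency matrix B_1: entry [i, j] is the effect of variable j at
# time t-1 on variable i at time t. Values mirror Table II; the 0.05
# threshold matches the lower limit used in the study.
names = ["Index", "Portfolio"]
B1 = np.array([[0.39, 0.00],
               [0.11, 1.00]])

edges = [(f"{names[j]}(t-1)", f"{names[i]}(t)", B1[i, j])
         for i in range(2) for j in range(2) if abs(B1[i, j]) >= 0.05]
for src, dst, v in edges:
    print(f"{src} -> {dst}: {v}")
```

The surviving edge Index(t−1) → Portfolio(t) is what justifies using the polarity index as a leading signal.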
C. Backtesting scenarios
In this study, the following rebalancing timings were combined and backtested. The Python libraries vectorbt (Polakow [12]) and Riskfolio-Lib (Cajas [13]) were used for backtesting. In addition to EVaR optimization, CVaR optimization and the mean-variance model were used as comparative optimization algorithms. The number of regimes was set to 5 and 10, and the rebalancing intervals were 30, 90, and 180 days. The backtesting methodology was as follows, with CPD-EVaR++ positioned as the proposed strategy and CPD-EVaR+ as the runner-up strategy.
CPD-EVaR++ (proposed): Change-point rebalancing using risk-minimization and return-maximization EVaR optimization + regular-interval rebalancing strategy
CPD-EVaR+: Change-point rebalancing using risk-minimization and no-restrictions EVaR optimization + regular-interval rebalancing strategy
EVaR: EVaR optimization with regular-interval rebalancing strategy
CVaR: CVaR optimization with regular-interval rebalancing strategy
MV: Mean-variance optimization with regular-interval rebalancing strategy
Whether the polarity index within each regime shows an upward or downward trend is determined by examining the divided regimes. MinRiskOpt (Section II-D, (4)) is assigned to an upward trend, and MaxReturnOpt (Section II-D, (5)) is assigned to a downward trend.
D. Evaluation by backtesting
The following metrics were employed to assess the portfolio
performance.
Total Return (TR): TR refers to the total return earned from investing in an investment product within a given period. The TR formula is: TR = Valuation Amount + Cumulative Distribution Amount Received + Cumulative Amount Sold − Cumulative Amount Bought. This study does not incorporate taxes or trading commissions.
Maximum Drawdown (MDD): MDD refers to the rate of decline from the maximum asset value. The MDD formula is: MDD = (Trough Value − Peak Value) / Peak Value.
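The MDD formula above can be computed over a value series with a running maximum; the portfolio values here are made up for illustration:

```python
import numpy as np

def max_drawdown(values):
    """Maximum drawdown: the most negative (trough - peak) / peak, where the
    peak is the running maximum of the value series up to each point."""
    values = np.asarray(values, dtype=float)
    running_peak = np.maximum.accumulate(values)
    drawdowns = (values - running_peak) / running_peak
    return float(drawdowns.min())

values = [100, 120, 60, 90, 150, 110]  # illustrative portfolio values
print(f"MDD: {max_drawdown(values):.1%}")
```

Here the worst decline is from the peak of 120 to the trough of 60, i.e. an MDD of −50%.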
TABLE III
BACKTESTING (SSAAM)

Rebalance  Regime  Algorithm    TR [%]    MDD [%]
30-days    5       CPD-EVaR++   810.9915  26.8629
                   CPD-EVaR+    594.7410  26.8629
           10      CPD-EVaR++   485.5201  45.0235
                   CPD-EVaR+    392.1392  42.4803
90-days    5       CPD-EVaR++   535.7349  27.6386
                   CPD-EVaR+    410.8530  27.6386
           10      CPD-EVaR++   417.8354  27.7646
                   CPD-EVaR+    373.5849  27.7646
180-days   5       CPD-EVaR++   152.0988  27.3924
                   CPD-EVaR+    131.2210  27.3924
           10      CPD-EVaR++   169.2992  25.3050
                   CPD-EVaR+    232.4513  25.3050
TABLE IV
BACKTESTING (COMPARISON)

Rebalance  Algorithm  TR [%]    MDD [%]
30-days    EVaR       587.9630  46.6651
           CVaR       558.7446  44.4532
           MV         527.2827  42.9851
90-days    EVaR       500.1421  44.9860
           CVaR       496.7423  44.0592
           MV         459.1195  42.7358
180-days   EVaR       353.2412  44.7714
           CVaR       382.9451  44.2525
           MV         360.4298  42.8165
IV. DISCUSSION & CONCLUSION
Table III shows that the more frequent the regular rebalancing, the higher the total return. In addition, the maximum drawdowns hovered between 25% and 45%, which is considered acceptable to the average systematic trader. The experiment was conducted separately with five regimes and with ten regimes. The total return was higher with five regimes, whereas the maximum drawdown was almost the same for both. Moreover, as hypothesized, CPD-EVaR++, which combines risk-minimization and return-maximization operations, performed better than the others. Under this method, the best practice for managing equity portfolios is therefore to use CPD-EVaR++ with five regimes, rebalancing irregularly at change points in addition to regular rebalancing every 30 days.
Table IV reports backtesting with the same parameters as Table III. The results show that EVaR optimization performed better than the other algorithms, similar to the results of Cajas [11]. This may be because the computational efficiency of EVaR in stochastic optimization problems is higher than that of other risk measures, such as CVaR.
This study demonstrates the utility of financial text in asset allocation with equity portfolios. In the future, we would like to develop a tactical asset allocation strategy that mixes stocks with other asset classes, such as bonds, and to apply this research to monetary policy and other macroeconomic analyses.
ACKNOWLEDGMENT
This work was supported by the JST-Mirai Program Grant
Number JPMJMI20B1, Japan. The authors declare that the
research was conducted without any commercial or financial
relationships that could be construed as potential conflicts of
interest.
REFERENCES
[1] Kieran Wood, Stephen Roberts, and Stefan Zohren. Slow momentum with fast reversion: A trading strategy using deep learning and change-point detection. The Journal of Financial Data Science, 4(1):111–129, Dec. 2021.
[2] Masatake Ito, Kabun Jo, and Norio Hibiki. Application of asset
allocation models in practice and mutual fund design [in japanese].
Operations research as a management science, 66(10):683–689, 2021.
[3] Julian Salazar, Davis Liang, Toan Q. Nguyen, and Katrin Kirchhoff.
Masked language model scoring. In Proceedings of the 58th Annual
Meeting of the Association for Computational Linguistics, pages 2699–
2712, Online, July 2020. Association for Computational Linguistics.
[4] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova.
Bert: Pre-training of deep bidirectional transformers for language un-
derstanding, 2019.
[5] Aapo Hyvärinen, Kun Zhang, Shohei Shimizu, and Patrik O. Hoyer. Estimation of a structural vector autoregression model using non-Gaussianity. Journal of Machine Learning Research, 11(5), 2010.
[6] Jushan Bai. Estimating multiple breaks one at a time. Econometric
theory, 13(3):315–352, 1997.
[7] Piotr Fryzlewicz. Wild binary segmentation for multiple change-point
detection. The Annals of Statistics, 42(6):2243–2281, 2014.
[8] Charles Truong, Laurent Oudre, and Nicolas Vayatis. Selective review of
offline change point detection methods. Signal Processing, 167:107299,
2020.
[9] A. Ahmadi-Javid. An information-theoretic approach to constructing
coherent risk measures. In 2011 IEEE International Symposium on
Information Theory Proceedings, pages 2125–2127, 2011.
[10] Amir Ahmadi-Javid. Entropic value-at-risk: A new coherent risk mea-
sure. Journal of Optimization Theory and Applications, 155(3):1105–
1123, 2012.
[11] Dany Cajas. Entropic portfolio optimization: a disciplined convex
programming framework. Available at SSRN 3792520, 2021.
[12] Oleg Polakow. vectorbt (1.4.2), 2022.
[13] Dany Cajas. Riskfolio-lib (3.0.0), 2022.