New Approaches in Ordinal Pattern Representations for Multivariate Time Series

Marisa Mohr,1,2,* Florian Wilhelm,1 Mattis Hartwig,2 Ralf Möller,2 Karsten Keller3
1 inovex GmbH, 76131 Karlsruhe, Germany
2 Institute of Information Systems, University of Lübeck, 23562 Lübeck, Germany
3 Institute of Mathematics, University of Lübeck, 23562 Lübeck, Germany
* Correspondence: mohr@ifis.uni-luebeck.de
Abstract
Many practical applications involve classification tasks on time series data, e.g., the diagnosis of cardiac insufficiency by evaluating the recordings of an electrocardiogram. Since most machine learning algorithms for classification are not capable of dealing with time series directly, mappings of time series to scalar values, also called representations, are applied before using these algorithms. Finding efficient mappings that capture the characteristics of a time series is the subject of the field of representation learning and is especially valuable when only few data samples are available. Time series representations based on information-theoretic entropies are a proven and well-established approach. Since this approach assumes a total ordering, it is only directly applicable to univariate time series, which makes it difficult to use in many real-world applications that deal with multiple measurements at the same time. Some extensions have been established that also cope with multivariate time series data, but none of the existing approaches takes into account potential correlations between the movements of the variables. In this paper, we propose two new approaches that consider the correlation between multiple variables and outperform state-of-the-art algorithms on real-world data sets.
1 Introduction
Time series data is part of many real-world applications, e.g., in weather prediction, stock markets, energy production, medical recordings, sales and website activity, and political or sociological factors. Classification of time series is a challenging and important subject, e.g., to diagnose cardiac insufficiency by evaluating the recordings of an electrocardiogram (ECG). Classical machine learning models for classification, such as k-nearest neighbor, support vector machines, or random forests, cannot process time series directly. It is necessary to extract scalar-valued representations (or features) from time series before using these algorithms.
The choice of suitable representations strongly influences the quality of a model and its predictions. Ideally, representations are chosen without the domain knowledge of specialists and yet contain as much information as possible. One approach to the automatic extraction of representations is the application of deep neural networks, which learn appropriate representations of time series within their hidden layers (Franceschi, Dieuleveut, and Jaggi 2019). Deep neural networks, however, require many data samples for model training, which are rarely available in real-world tasks.
Copyright © 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
In general, the goal is to find efficient mappings from a time series to a scalar representation that capture as many characteristics of the time series as possible. Information-theoretic entropies are promising because they rely on an encoding that preserves information content (Amigó 2010). The concept of Permutation Entropy (PE) has already been used successfully in time series analysis and classification on many univariate real-world data sets (Antonelli, Meschino, and Ballarin 2019; J. Weck et al. 2014; Xue et al. 2019). PE uses the total ordering of data points in a univariate time series to encode the ups and downs of neighboring points.
The concept of PE in its original form is not able to handle multivariate time series, because a total order between vector-valued time points cannot be determined. Nevertheless, in real-world applications we often have to deal with high-dimensional multivariate time series. For example, medical measurements stored as ECG data are usually not determined from a single electrode, but from multiple electrodes. There are extensions of PE that also cope with multivariate time series data, but none of the existing approaches takes into account potential correlations between the simultaneous movements of the variables over time.
This paper introduces two new approaches that consider simultaneous movement and correlations of variables. Section 2 introduces the ordinal pattern formalism and a definition of PE. Section 3 presents related work. Section 4 details our new approaches. To demonstrate their contribution, we show in Section 5 that our approaches outperform existing PE variants in classification tasks on many data sets from the UEA Multivariate Time Series Classification (MTSC) archive (Bagnall et al. 2018). In the last section, we discuss limitations as well as further research.
2 Preliminaries
For using entropies as representations, time series observations are encoded as sequences of symbolic abstractions. As far as current research is concerned, there are two general approaches to symbolization. Classical symbolization approaches use threshold values and data range partitioning for symbol assignment, such as the well-known SAX representation introduced by Chiu, Keogh, and Lonardi (2003). The ordinal pattern symbolization approach, which describes the ups and downs of a time series, is based on Bandt and Pompe (2002). The formalism and the advantages of this symbolization scheme are introduced in the following.
Figure 1: All possible ordinal patterns of order d = 3: (0,1,2), (2,1,0), (1,0,2), (2,0,1), (0,2,1), (1,2,0).
Figure 2: Ordinal pattern determination of order d = 3 and time delay τ = 20 in a univariate time series.
2.1 Ordinal Pattern Symbolization
Ordinal patterns describe the total order between two or more neighbors, coded by permutations.
Definition 1 (Univariate Ordinal Pattern). A vector $(x_1, \ldots, x_d) \in \mathbb{R}^d$ has ordinal pattern $(r_1, \ldots, r_d) \in \mathbb{N}^d$ of order $d \in \mathbb{N}$ if $x_{r_1} \geq \ldots \geq x_{r_d}$ and $r_{l-1} > r_l$ in the case $x_{r_{l-1}} = x_{r_l}$.
Note that the equality of two values within a pattern is not allowed. In this case, for example, the newer value is regarded as a smaller value. Figure 1 shows all possible ordinal patterns of order d = 3 of a vector $(x_1, x_2, x_3)$. To symbolize a time series $(x_1, x_2, \ldots, x_T) \in \mathbb{R}^T$, each time point $t \in \{d, \ldots, T\}$ is assigned its ordinal pattern of order d. The order d is chosen much smaller than the length T of the time series in order to look at small windows of the time series and their distributions of up and down movements. To capture the overarching trend, delayed behavior is of interest. The time delay $\tau \in \mathbb{N}$ is the delay between successive points in the symbol sequences. Different delays reveal different details of the structure of a time series. Figure 2 visualizes the ordinal pattern determination of order d = 3 and time delay τ = 20 at four different time points in a univariate time series.
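The symbolization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names `ordinal_pattern` and `ordinal_patterns` are hypothetical, indices are 0-based as in Figure 1, and the tie-breaking rule (the later index comes first) is one possible reading of Definition 1.

```python
from typing import List, Sequence, Tuple


def ordinal_pattern(window: Sequence[float]) -> Tuple[int, ...]:
    """Return the 0-based indices of `window` ordered by descending value.

    Assumption of this sketch: for equal values, the later index is
    listed first.
    """
    return tuple(sorted(range(len(window)), key=lambda i: (-window[i], -i)))


def ordinal_patterns(
    series: Sequence[float], d: int, tau: int = 1
) -> List[Tuple[int, ...]]:
    """Assign a pattern of order d and delay tau to every admissible time point.

    The window at time t is (x[t-(d-1)*tau], ..., x[t-tau], x[t]), so the
    result has len(series) - (d-1)*tau entries.
    """
    span = (d - 1) * tau
    return [
        ordinal_pattern([series[t - span + k * tau] for k in range(d)])
        for t in range(span, len(series))
    ]
```

For the series (1, 3, 2, 2, 5) with d = 3 and τ = 1, the sketch returns the three patterns (1,2,0), (0,2,1), and (2,1,0); with τ = 2, only the single window (1, 2, 5) remains.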
The ordinal approach has notable advantages in application. First, the method is conceptually simple. Second, it is not necessary to have previous knowledge about the data range or type of time series. Third, the ordinal approach supports robust and fast implementations (Keller et al. 2017; Piek, Stolz, and Keller 2019). Fourth, it allows an easier estimation of a good symbolization scheme compared to the classical symbolization approaches (Keller, Maksymenko, and Stolz 2015; Stolz and Keller 2017).
2.2 Ordinal Pattern Distributions
Not the ordinal patterns themselves, but their distribution in different parts of a univariate time series $(x_t)_{t=1}^T$ is of interest. Thus, each pattern is identified with exactly one of the ordinal pattern symbols j = 1, 2, ..., d!. Using the distribution of symbols, the entropy of ordinal pattern symbols is calculated as in the following definition.
Definition 2 (Permutation Entropy (PE)). The permutation entropy of order $d \in \mathbb{N}$ and delay $\tau \in \mathbb{N}$ of a univariate time series $(x_t)_{t=1}^T$, $T \in \mathbb{N}$, is defined by
\[ \mathrm{PE}(d, \tau) = -\sum_{j=1}^{d!} p_j^{\tau,d} \ln p_j^{\tau,d}, \tag{1} \]
where
\[ p_j^{\tau,d} = \frac{\#\{t \mid (x_{t-(d-1)\tau}, \ldots, x_{t-\tau}, x_t) \text{ has pattern } j\}}{T - (d-1)\tau} \tag{2} \]
is the relative frequency of ordinal pattern j in the time series.
Depending on the area of research, entropy is a measure for quantifying inhomogeneity, impurity, complexity, and uncertainty or unpredictability. In this paper, we use PE as a representation (also known as a feature) in a learning model. It measures the complexity of a time series in terms of the (ir)regularity of the ordinal patterns that occur. For a time series with maximally random ordinal pattern symbols, resulting in a uniform pattern distribution, PE is ln(d!). For a time series with a regular pattern, e.g., in the case of monotonicity, PE is equal to zero (Amigó 2010).
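A minimal sketch of Definition 2 in Python may help to make the two boundary cases concrete. The function name `permutation_entropy` and the tie-breaking rule in the helper `_pattern` are assumptions of this sketch; the code simply counts pattern frequencies and applies the entropy formula.

```python
import math
from collections import Counter
from typing import Sequence, Tuple


def _pattern(window: Sequence[float]) -> Tuple[int, ...]:
    # Indices ordered by descending value; ties: later index first (assumption).
    return tuple(sorted(range(len(window)), key=lambda k: (-window[k], -k)))


def permutation_entropy(series: Sequence[float], d: int, tau: int = 1) -> float:
    """Entropy (natural logarithm) of the relative ordinal pattern frequencies."""
    span = (d - 1) * tau
    n_windows = len(series) - span  # denominator T - (d-1)*tau
    if n_windows <= 0:
        raise ValueError("series is too short for the chosen d and tau")
    counts = Counter(
        _pattern([series[t - span + k * tau] for k in range(d)])
        for t in range(span, len(series))
    )
    return sum(
        -(c / n_windows) * math.log(c / n_windows) for c in counts.values()
    )
```

A monotonically increasing series produces a single pattern and therefore entropy zero, while the alternating series (0, 1, 0, 1, 0) with d = 2 produces both patterns equally often and reaches the upper bound ln(2!) = ln 2.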
A multivariate time series $((x_t^i)_{i=1}^m)_{t=1}^T$ has more than one time-dependent variable. Each variable $x^i$ for $i \in \{1, \ldots, m\}$ depends not only on its past values but also has some dependency on other variables. Considering two time points $(x_t^i)_{i=1}^m$ and $(x_{t+1}^i)_{i=1}^m$ with m variables, it is in general not possible to establish a total order between them. The two points can only be ordered if $x_t^i > x_{t+1}^i$ for all $i \in \{1, \ldots, m\}$ or $x_t^i < x_{t+1}^i$ for all $i \in \{1, \ldots, m\}$. Therefore, there is no trivial generalization of the PE algorithm to the multivariate case.
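The missing total order can be illustrated with a short Python sketch. The helper `compare_componentwise` is hypothetical and only encodes the condition stated above: two vector-valued time points are comparable if one is strictly smaller than the other in every variable.

```python
from typing import Optional, Sequence


def compare_componentwise(a: Sequence[float], b: Sequence[float]) -> Optional[int]:
    """Return -1 if a < b in every variable, 1 if a > b in every variable,
    and None if the two time points cannot be ordered."""
    if all(x < y for x, y in zip(a, b)):
        return -1
    if all(x > y for x, y in zip(a, b)):
        return 1
    return None
```

For example, (1, 2) and (3, 4) are comparable, whereas (1, 4) and (3, 2) are not, because the first variable rises while the second one falls.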
3 Related Work
In order to determine ordinal patterns on multivariate time series, two cases have been discussed in the literature so far:
1. The multivariate time series is projected into ordinal space directly by either
a. determining univariate ordinal patterns between values in time of a single variable i (row marked in blue in Figure 3), and then averaging over all m variables, or
b. determining univariate ordinal patterns between values in time of a single variable i (box marked in green in Figure 3), and storing all m patterns at a fixed time point t in one matrix, or
c. determining univariate ordinal patterns between values of all variables at a single time point t (column marked in red in Figure 3), and then averaging over all T time points.
2. The multivariate time series is first reduced to a single-dimensional projection and then transformed into ordinal space.
Figure 3: Two possibilities of univariate ordinal pattern determination in a multivariate time series. Case 1 shows ignoring variable (blue) and time (red) dependencies, respectively. Our first proposed approach considers both (green). Case 2 shows a previous dimensionality reduction (yellow).
Consequently, Definition 2 can be used for PE calculation in all cases. Approach 1(a) is implemented by Keller and Lauffer (1999) as Pooled Permutation Entropy, hereafter referred to as PEpool. Morabito et al. (2012) expand the concept of PEpool by taking into account variations over multiple scales, named Multivariate Multi-Scale Permutation Entropy (MMSPE). Some theoretical basis for 1(b) is set by Keller (2012) and Antoniouk, Keller, and Maksymenko (2013). A variant of approach 1(c) is implemented by He, Sun, and Wang (2016) as Multivariate Permutation Entropy (MVPE). The second case is implemented by Rayan, Mohammad, and Ali (2019). They propose to first reduce the number of variables m by applying different distance measures between all points of the time series and a reference point, before calculating PE on the reduced univariate time series. Depending on the distance measure and reference point, we denote the calculated PE variants as PEeucl (Euclidean distance with reference point $(x_0^i)_{i=1}^m$), PEmanh (Manhattan distance with reference point $(x_0^i)_{i=1}^m$), and PEnorm (Euclidean distance with reference point 0). Note that the length T of the time series must be the same for all variables in order to perform a dimensionality reduction. The well-known Matrix Profile by Yeh et al. (2016) pursues a different goal and is therefore not considered in this paper.
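The distance-based reductions behind PEeucl, PEmanh, and PEnorm can be sketched as follows. This is an illustrative reading of the description above, not the implementation of Rayan, Mohammad, and Ali (2019): the function name `reduce_by_distance` and its parameters are hypothetical, and taking the first observation of the series as the reference point $(x_0^i)_{i=1}^m$ is an assumption.

```python
import math
from typing import List, Optional, Sequence

Point = Sequence[float]


def reduce_by_distance(
    points: Sequence[Point],
    metric: str = "euclidean",
    reference: Optional[Point] = None,
) -> List[float]:
    """Map each m-dimensional time point to its distance from a reference point.

    If no reference is given, the first observation is used (assumption).
    """
    if reference is None:
        reference = points[0]
    if metric == "euclidean":
        return [
            math.sqrt(sum((a - b) ** 2 for a, b in zip(p, reference)))
            for p in points
        ]
    if metric == "manhattan":
        return [float(sum(abs(a - b) for a, b in zip(p, reference))) for p in points]
    raise ValueError("metric must be 'euclidean' or 'manhattan'")
```

Calling the function with `metric="euclidean"` and `reference=(0.0, ..., 0.0)` gives the norm-based variant. The resulting univariate series can then be passed to any PE routine that follows Definition 2.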
4 Ordinal Pattern Representations Considering Correlations of Variables
The simultaneous movement pattern of several variables over time is important information that should be encoded in a mathematical representation. For example, ECG and Arterial Blood Pressure (ABP) signals each contain information about cardiac status, which can be used for the diagnosis of diseases. The depolarization of the ventricles and the contraction of the large ventricular muscles of the human heart (the so-called QRS complex) can be observed in an ECG signal as the largest deflection, the central and most visually obvious part of the tracing, as shown in the top signal in Figure 4. Shortly after the electrical activity, the blood pressure rises, as shown in the second signal in Figure 4. Both movements of variables depend on each other and should be considered together (red boxes in Figure 4) rather than separately.
Figure 4: Medical time series (ECG signal II and ABP) from a patient of the MIMIC III waveform database with identification id 3900006 0029, published by Johnson et al. (2016).
In the following, we propose two new approaches that take into account the correlation of two or more variables. In Section 4.1, we introduce our first approach, which adapts the first case in Section 3 by a natural extension of dimensions to include both variable and time. In Section 4.2, we present a second approach, which follows the second case in Section 3 and improves the dimensionality reduction by means of Principal Component Analysis.
4.1 Permutation Entropy Based on Multivariate Ordinal Pattern
The intuitive idea is to store the univariate ordinal patterns of all variables at a time point t together into one multivariate pattern.
Definition 3 (Multivariate Ordinal Pattern). A matrix $(x_1, \ldots, x_d) \in \mathbb{R}^{m \times d}$ has multivariate ordinal pattern
\[ \begin{pmatrix} r_{11} & \cdots & r_{1d} \\ \vdots & \ddots & \vdots \\ r_{m1} & \cdots & r_{md} \end{pmatrix} \in \mathbb{N}^{m \times d} \tag{3} \]
of order $d \in \mathbb{N}$ if $x_{r_{i1}} \geq \ldots \geq x_{r_{id}}$ for all $i = 1, \ldots, m$ and $r_{i,l-1} > r_{il}$ in the case $x_{r_{i,l-1}} = x_{r_{il}}$.
Figure 5 shows all possible multivariate ordinal patterns of order d = 2 and number of variables m = 2.
With the natural extension of Definition 1, which leads to Definition 3, it is possible to apply the PE algorithm in Definition 2 in its original form to multivariate time series. We denote by PEmulti the PE calculated on multivariate ordinal patterns. Obviously, this approach has the disadvantage that the number of possible patterns increases exponentially with the number of variables m, from d! to (d!)^m in total. In this setting, we quickly end up with undersampling and a uniform distribution of patterns, resulting in maximum complexity. Nevertheless, with a small order d and a sufficiently large length T of the time series, more information can be obtained than with its averaged variant PEpool. The following evaluation confirms this.
Figure 5: All possible multivariate ordinal patterns of order d = 2 and variable-dimension m = 2: $\begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$, $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}$.
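The idea of counting joint patterns can be sketched in Python as follows. The names `multivariate_permutation_entropy` and `max_multivariate_patterns` are hypothetical, the input layout (one sequence per variable) is an assumption, and the tie-breaking rule in `_pattern` is one possible reading of Definition 3; the sketch is not the authors' implementation.

```python
import math
from collections import Counter
from typing import Sequence, Tuple


def _pattern(window: Sequence[float]) -> Tuple[int, ...]:
    # Indices ordered by descending value; ties: later index first (assumption).
    return tuple(sorted(range(len(window)), key=lambda k: (-window[k], -k)))


def multivariate_permutation_entropy(
    variables: Sequence[Sequence[float]], d: int, tau: int = 1
) -> float:
    """Entropy of the joint patterns; variables[i][t] is variable i at time t."""
    length = len(variables[0])
    if any(len(v) != length for v in variables):
        raise ValueError("all variables must have the same length T")
    span = (d - 1) * tau
    n_windows = length - span
    if n_windows <= 0:
        raise ValueError("series is too short for the chosen d and tau")
    # One symbol per time point: the tuple of the m univariate patterns.
    counts = Counter(
        tuple(_pattern([v[t - span + k * tau] for k in range(d)]) for v in variables)
        for t in range(span, length)
    )
    return sum(
        -(c / n_windows) * math.log(c / n_windows) for c in counts.values()
    )


def max_multivariate_patterns(d: int, m: int) -> int:
    """Number of possible multivariate patterns, (d!)^m."""
    return math.factorial(d) ** m
```

The pattern count illustrates the undersampling problem: for d = 3 and m = 6 there are already (3!)^6 = 46656 possible patterns, far more than the number of windows available in a series of length T = 100.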
4.2 Permutation Entropy Based on Principal Component Analysis
The disadvantages of the approach in the previous section limit its applicability to real-world data sets. In order to minimize the combinatorial possibilities of ordinal patterns, we consider the univariate ordinal pattern case. The idea is to transform a multivariate time series $((x_t^i)_{i=1}^m)_{t=1}^T$ into a univariate time series $(x'_t)_{t=1}^T$ and then calculate the PE from Definition 2 on the reduced time series as usual. To order time points based on single-dimensional values, Rayan, Mohammad, and Ali (2019) propose to use the Euclidean or Manhattan distance between all points of the time series and a reference point for reduction. The calculated distances can be interpreted as the strength of a signal or time series, on the basis of which PE is then calculated. This approach does not take into account correlations between variables.
The aim, however, should be to retain as much information about the correlations between variables as possible in order to understand their simultaneous movement. Principal Component Analysis (PCA) is a well-known method that converts a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables by an orthogonal transformation. The total variance is a piece of information that describes one kind of characteristic of the time series (Hastie, Tibshirani, and Friedman 2009). For m-dimensional data, there are m orthogonal basis vectors. The variances of the data points along the individual basis vectors sum to the total variance of the data. If the first r < m basis vectors cover a sufficiently large percentage of the total variance, then the principal components represented by these r basis vectors are sufficient to capture the information content of the data. Keeping only the first r principal components gives the truncated transformation $X'_r = X W_r$, where $W_r \in \mathbb{R}^{m \times r}$ is a matrix of weights whose columns are the eigenvectors of $X^T X$ that belong to the r largest eigenvalues, sorted in descending order of the eigenvalues.
For applying PCA, we assume that the time series $((x_t^i)_{i=1}^m)_{t=1}^T$ is stationary for all i = 1, ..., m. An appropriate algorithm for reducing the data by PCA can be found in popular statistical textbooks, e.g., in Hastie, Tibshirani, and Friedman (2009). We denote by PEPCA the PE calculated on the single-dimensional projection by PCA. We have r = 1 for reducing a multivariate to a univariate time series. The reduction from a high number of variables m to only one dimension can lead to a high loss of information about the variance in the data if the first principal component explains very little of the variance. In order to show that adding further principal components does not lead to better results, we denote by PEPCA2 the calculation of PEmulti on the first two principal components. A possible explanation for this could be the orthogonality of the principal components.
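The reduction step with r = 1 can be sketched in pure Python. This is a hedged illustration, not the authors' implementation: the function name `first_principal_component` is hypothetical, the data are mean-centered before the covariance matrix is formed, and the leading eigenvector is approximated by power iteration instead of a full eigendecomposition.

```python
import math
from typing import List, Sequence, Tuple


def first_principal_component(
    rows: Sequence[Sequence[float]], n_iter: int = 200
) -> Tuple[List[float], List[float], float]:
    """Project T observations of m variables onto the first principal component.

    rows[t] is the m-dimensional observation at time t.
    Returns (scores, weights, explained variance ratio).
    """
    n, m = len(rows), len(rows[0])
    means = [sum(r[i] for r in rows) / n for i in range(m)]
    centered = [[r[i] - means[i] for i in range(m)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(m)]
        for i in range(m)
    ]
    # Power iteration; a non-uniform start vector avoids some degenerate
    # cases, but convergence is not guaranteed for every input.
    w = [1.0 / (i + 1) for i in range(m)]
    start_norm = math.sqrt(sum(x * x for x in w))
    w = [x / start_norm for x in w]
    for _ in range(n_iter):
        v = [sum(cov[i][j] * w[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            break
        w = [x / norm for x in v]
    eigenvalue = sum(
        w[i] * sum(cov[i][j] * w[j] for j in range(m)) for i in range(m)
    )
    total_variance = sum(cov[i][i] for i in range(m))
    ratio = eigenvalue / total_variance if total_variance > 0 else 0.0
    scores = [sum(c[i] * w[i] for i in range(m)) for c in centered]
    return scores, w, ratio
```

The returned scores form the univariate series on which PE from Definition 2 would then be calculated, and the explained variance ratio indicates how much information the one-dimensional projection retains. In practice, a tested library routine for PCA would normally be preferred over this hand-written version.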
5 Experimental Results
We have conducted several experiments to investigate the relevance of the different ordinal pattern representations for multivariate time series. We consider a representation to be good if it is flexibly and reliably applicable to different real-world data sets. To show the flexibility, we compare the representations on 25 different real-world use cases. To show the reliability, we compare the accuracy of seven different PE variants. Higher accuracy means that the representation is able to identify the underlying explanatory factors better than other representations and that discriminatory properties can be identified as useful inputs for supervised predictors (Bengio, Courville, and Vincent 2012). We challenge our ordinal pattern representations in classification tasks on the UEA MTSC archive, a collection of different time series data sets from many real-world cases, released in October 2018 by Bagnall et al. (2018). The archive consists of 30 data sets with a wide range of series lengths, dimensions, and cases, from human activity recognition, motion classification, ECG classification, electroencephalography (EEG) and magnetoencephalography (MEG) classification to audio spectra classification and many others. In the following setting, 5 of the 30 data sets of the archive could not be used without compromising comparability due to differing lengths T of the time series in their m variables.
5.1 Deep Dive: AtrialFibrillation Classification
Before we start with a general evaluation of different data sets, we give a detailed insight into one specific data set out of the 30 for a better understanding. The AtrialFibrillation data set contains two-channel ECG recordings for predicting spontaneous termination of atrial fibrillation (AF). The class labels are t, s, and n. Class t contains data where the AF terminates within one second after the end of the recording. Class s is described as an AF that self-terminates at least one minute after the end of the recording. In class n, the AF does not terminate for at least one hour after the recording of the data. An example of the recordings for each class is shown in Figure 6. More details are given in Moody (2004). Below, we examine the separability of the three classes based on the different PE variants PEpool, PEmulti, and PEPCA. The variance ratio for the first principal component is greater than 74.17% for every l = 1, ..., ntrain, where ntrain is the number of train samples. Thus, a dimensionality reduction via PCA can be applied without losing too much information. Figure 7 shows boxplots of the calculated values of the three PE variants for each class. While PEpool does not allow any separability of the classes, PEmulti allows classes n and t to be separated relatively well. An even better separation of classes s and t is achieved by PEPCA.
Figure 6: Example of 2-channel recordings for all three classes.
Figure 7: Boxplots for the values of three different PE-variants of order d = 2 and delay τ = 1 for classes n (blue), s (orange) and t (green).
5.2 A Comparison of PE-variants on the UEA Data Sets
We perform a classification on the UEA MTSC data sets to compare the separation ability of seven ordinal pattern representations, PEpool, PEmulti, PEeucl, PEmanh, PEnorm, PEPCA, and PEPCA2, on arbitrary data sets. The initial benchmarking on the UEA MTSC data sets by Bagnall et al. (2018) uses the standard 1-nearest neighbor (1-NN) classifier with three different distance functions. To ensure a certain degree of comparability, we also use the 1-NN classifier. As model input, all seven PE variants were used individually for the evaluation. Finding an optimal order d and time delay τ is a challenging research problem (Riedl, Müller, and Wessel 2013; Myers and Khasawneh 2019). For simplicity, we performed an extensive hyperparameter optimization.
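The evaluation protocol with a single scalar PE feature per time series can be sketched as a 1-NN classifier in Python. The function names `one_nn_predict` and `accuracy` are hypothetical, the absolute difference between two PE values is assumed as the distance, and the numbers in the usage note are toy values, not results from the experiments.

```python
from typing import List, Sequence


def one_nn_predict(
    train_x: Sequence[float], train_y: Sequence[str], test_x: Sequence[float]
) -> List[str]:
    """Assign each test value the label of the nearest training value."""
    predictions = []
    for x in test_x:
        nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
        predictions.append(train_y[nearest])
    return predictions


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of correctly predicted labels."""
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)
```

With the toy training features (0.1, 0.2, 0.9) labeled (n, n, t), the test features (0.15, 0.8, 0.5) are predicted as (n, t, n). A search over d and τ would repeat this procedure for each candidate pair and keep the best-scoring one; the details of the search used in the experiments are not specified here.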
Table 1 lists the accuracy of the performed 1-NN classification on all data sets, together with the train size ntrain, the number of variables m, the length T of the time series, and the number of classes C. The highest accuracy scores per data set are shown in bold. In addition, the names of data sets whose benchmark accuracy we outperform are set in italics.
PEmulti outperforms the other variants on two data sets with a low number m of variables and a high length T of the time series, which confirms that the representation is successful in this special case, and in particular more successful than its averaged variant PEpool. PEPCA outperforms the other state-of-the-art PE variants in most cases. The results show that a dimensionality reduction by decorrelation of variables is more successful than one by different distance measures. The accuracy scores of PEPCA2 confirm that adding more principal components does not, in general, produce better results than PEPCA.
6 Conclusion and Future Work
Throughout this paper, we have discussed ordinal pattern representations for multivariate time series both from the viewpoint of their theoretical foundation and their application. We have shown that our approaches, especially PEPCA, outperform state-of-the-art algorithms on many real-world data sets. PEmulti is a valuable representation in the case of a small number m of variables and a high length T of the time series. Due to an improvement in prediction results and easy handling, the integration into toolboxes for representation learning of multivariate time series is indispensable.
Furthermore, we will study whether an optimized method for dealing with PCA in the case of high-dimension, low-sample-size (HDLSS) data, suggested by Yata and Aoshima (2012), can improve our results for PEPCA. In addition, it may be worth taking a closer look at the individual PE of all m principal components to understand how movements on the decorrelated variables of the multivariate time series change.
The advantages of ordinal patterns for the analysis and prediction of time series are well known in research. The approach presented here has the disadvantage that the representation and the prediction model are chosen independently. We are currently working on adapting the ordinal pattern approach to automatic representation learning. Instead of first calculating PE and applying an ML model in a second step, this will make it possible to both learn the representation and use it to solve a specific problem in one task.
References
Amigó, J. 2010. Permutation Complexity in Dynamical Systems: Ordinal Patterns, Permutation Entropy and All That. Springer Series in Synergetics. Berlin Heidelberg: Springer-Verlag.
Antonelli, A.; Meschino, G.; and Ballarin, V. 2019. Mammographic Density Estimation Through Permutation Entropy. In Lhotska, L.; Sukupova, L.; Lacković, I.; and Ibbott, G. S., eds., World Congress on Medical Physics and Biomedical Engineering 2018, IFMBE Proceedings, 135–141. Singapore: Springer.
Antoniouk, A.; Keller, K.; and Maksymenko, S. 2013. Kolmogorov-Sinai entropy via separation properties of order-generated sigma-algebras. Discrete and Continuous Dynamical Systems 34(5):1793–1809.
Bagnall, A.; Dau, H. A.; Lines, J.; Flynn, M.; Large, J.; Bostrom, A.; Southam, P.; and Keogh, E. 2018. The UEA multivariate time series classification archive, 2018. arXiv:1811.00075 [cs, stat].
Bandt, C., and Pompe, B. 2002. Permutation Entropy: A Natural Complexity Measure for Time Series. Physical Review Letters 88(17).
Bengio, Y.; Courville, A.; and Vincent, P. 2012. Representation Learning: A Review and New Perspectives. arXiv:1206.5538 [cs].
Chiu, B.; Keogh, E.; and Lonardi, S. 2003. Probabilistic Discovery of Time Series Motifs. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’03, 493–498. New York, NY, USA: ACM.
Franceschi, J.-Y.; Dieuleveut, A.; and Jaggi, M. 2019. Unsupervised Scalable Representation Learning for Multivariate Time Series. arXiv:1901.10738 [cs, stat].
Hastie, T.; Tibshirani, R.; and Friedman, J. 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition. Springer Series in Statistics. New York: Springer-Verlag, 2 edition.
                                     1-NN based on
Dataset          ntrain    m     T  C  PEpool PEmulti  PEeucl  PEmanh  PEnorm   PEPCA  PEPCA2
ArticularyWordR.    275    9   144 25   0.137   0.073   0.116    0.11    0.11   0.127   0.064
AtrialFibr.          15    2   640  3   0.667     0.6     0.6     0.6   0.666     0.8     0.4
BasicMotions         40    6   100  4   0.625   0.525   0.525    0.45   0.475   0.675    0.45
DuckDuckGeese        50 1345   270  5     0.3     0.2    0.28    0.24    0.26    0.36    0.28
EigenWorms          128    6 17984  5   0.557   0.611   0.578   0.584   0.578   0.595   0.443
Epilepsy            137    3   206  4     0.5   0.478   0.507    0.48   0.485   0.514   0.384
ERing                30    4    65  6   0.392   0.351   0.385   0.374   0.351   0.381   0.274
EthanolCon.         261    3  1751  4   0.323   0.312   0.304   0.323   0.330   0.331   0.300
FaceDetection      5890  144    62  2   0.504   0.500   0.508   0.506   0.504   0.512   0.510
FingerMovem.        360   28    50  2    0.57    0.54     0.6    0.62    0.55    0.58    0.56
HandMovem.          160   10   400  4   0.419   0.419   0.392   0.392   0.432   0.364   0.405
Handwriting         150    3   152 26   0.072    0.06   0.075   0.069   0.074   0.071   0.051
Heartbeat           204   61   405  2   0.692   0.722   0.682   0.712   0.698   0.731   0.727
Libras              180    2    45 15   0.322   0.266   0.311   0.295    0.35   0.277   0.117
LSST               2459    6    36 14   0.238   0.212   0.201   0.236   0.198   0.260   0.309
MotorImagery        278   64  3000  2    0.52    0.56    0.54    0.49    0.55    0.57    0.56
NATOPS              180   24    51  6   0.239     0.2    0.25   0.272   0.272   0.266     0.2
PEMS-SF             267  963   144  7   0.671   0.173   0.563   0.647   0.524   0.676   0.266
PenDigits          7494    2     8 10   0.203   0.190   0.201   0.198   0.176   0.229   0.107
PhonemeSpectra     3315   11   217 39   0.060   0.052   0.049   0.045   0.060   0.061   0.034
RacketSports        151    6    30  4   0.348   0.296   0.355   0.349   0.368   0.381   0.362
SelfR.SCP1          268    6   896  2   0.580   0.556   0.659   0.662   0.607   0.611   0.539
SelfR.SCP2          200    7  1152  2   0.588   0.577   0.561   0.572   0.584     0.6   0.577
StandWalkJ.          12    4  2500  3   0.666   0.733   0.666   0.533   0.733   0.666   0.533
UWaveGest.          120    3   315  8   0.243   0.234   0.225   0.216   0.206   0.244   0.184
Table 1: Accuracy scores of multivariate PE variants on 25 UEA MTSC data sets.
He, S.; Sun, K.; and Wang, H. 2016. Multivariate permuta-
tion entropy and its application for complexity analysis of chaotic
systems. Physica A: Statistical Mechanics and its Applications
461:812–823.
J. Weck, P.; Schaffner, D.; Brown, M.; and Wicks, R. 2014. Permu-
tation Entropy and Statistical Complexity Analysis of Turbulence
in Laboratory Plasmas and the Solar Wind. arXiv.
Johnson, A. E. W.; Pollard, T. J.; Shen, L.; Lehman, L.-w. H.; Feng,
M.; Ghassemi, M.; Moody, B.; Szolovits, P.; Anthony Celi, L.; and
Mark, R. G. 2016. MIMIC-III, a freely accessible critical care
database. Scientific Data 3:160035.
Keller, K., and Lauffer, H. 1999. Symbolic analysis of high-
dimensional time series. In Int. J. Bifurcation Chaos, 2657–2668.
Keller, K.; Mangold, T.; Stolz, I.; and Werner, J. 2017. Permutation
Entropy: New Ideas and Challenges. Entropy 19(3):134.
Keller, K.; Maksymenko, S.; and Stolz, I. 2015. Entropy determi-
nation based on the ordinal structure of a dynamical system. Dis-
crete and Continuous Dynamical Systems - Series B 20(10):3507–
3524.
Keller, K. 2012. Permutations and the Kolmogorov-Sinai entropy.
Discrete & Continuous Dynamical Systems - A 32(3):891.
Moody, G. 2004. Spontaneous termination of atrial fibrillation: A
challenge from physionet and computers in cardiology 2004. In
Computers in Cardiology, 2004, 101–104.
Morabito, F. C.; Labate, D.; La Foresta, F.; Bramanti, A.; Mora-
bito, G.; and Palamara, I. 2012. Multivariate Multi-Scale Per-
mutation Entropy for Complexity Analysis of Alzheimer’s Disease
EEG. Entropy 14(7):1186–1202.
Myers, A., and Khasawneh, F. 2019. On the Automatic Param-
eter Selection for Permutation Entropy. arXiv:1905.06443 [nlin,
physics:physics].
Piek, A. B.; Stolz, I.; and Keller, K. 2019. Algorithmics, Pos-
sibilities and Limits of Ordinal Pattern Based Entropies. Entropy
21(6):547.
Rayan, Y.; Mohammad, Y.; and Ali, S. A. 2019. Multidimen-
sional Permutation Entropy for Constrained Motif Discovery. In
Nguyen, N. T.; Gaol, F. L.; Hong, T.-P.; and Trawi´
nski, B., eds.,
Intelligent Information and Database Systems, Lecture Notes in
Computer Science, 231–243. Cham: Springer International Pub-
lishing.
Riedl, M.; M¨
uller, A.; and Wessel, N. 2013. Practical consider-
ations of permutation entropy: A tutorial review. The European
Physical Journal Special Topics 222.
Stolz, I., and Keller, K. 2017. A General Symbolic Approach to
Kolmogorov-Sinai Entropy. Entropy 19(12):675.
Xue, X.; Li, C.; Cao, S.; Sun, J.; and Liu, L. 2019. Fault Diagnosis
of Rolling Element Bearings with a Two-Step Scheme Based on
Permutation Entropy and Random Forests. Entropy 21(1):96.
Yata, K., and Aoshima, M. 2012. Effective PCA for high-dimension,
low-sample-size data with noise reduction via geometric
representations. Journal of Multivariate Analysis 105(1):193–215.
Yeh, C.-C. M.; Zhu, Y.; Ulanova, L.; Begum, N.; Ding, Y.; Dau, H. A.;
Silva, D. F.; Mueen, A.; and Keogh, E. 2016. Matrix Profile I: All
Pairs Similarity Joins for Time Series: A Unifying View That Includes
Motifs, Discords and Shapelets. In 2016 IEEE 16th International
Conference on Data Mining (ICDM), 1317–1322.
... In general, entropies measure inhomogeneity, impurity, complexity, and uncertainty or unpredictability, see Mohr et al. (2020). For a uniform OP-distribution, that is, if all possible m OPs occur with the same probability 1/m, the PE attains its upper bound 1. ...
... There have been various attempts in order to generalize ordinal patterns to the multivariate setting. The probably most natural version has been proposed by Mohr et al. (2020), who defined MOPs as the vector consisting of the univariate OPs with respect to each dimension. ...
... Regarding the PE of MOPs as suggested by Mohr et al. (2020), it is sufficient to only consider the probabilities of the respective $(d!)^2$ MOPs, which correspond to the joint probabilities $(p_{k,l})_{k,l}$ as well as their respective estimators. Then, the CLT (2) in Section 3.1 immediately yields ...
... The aforementioned approach is an extension of a previous study [29] and can be evaluated for IPTV multivariate time series clustering. In [30], the authors focused their research on the correlation between simultaneous movement patterns of variables over time, with the multivariate pattern being a union of all univariate ordinal patterns. The recent work in [31] focused even more on the multivariate time series, as they emphasized human activity as a typical case with multiple observed dimensions. ...
... Although it is mentioned in [30] that the features in the multivariate time series might be simultaneously dimension-dependent, it should be noted that in the proposed analysis method the features in p are collectively independent (with the exception of $f_1$, as the environment needs to be turned on before any other actions can be performed). ...
Article
Internet Protocol Television (IPTV) has had a significant impact on live TV content consumption in the past decade, as improvements in the broadband speed have allowed more data volume to be delivered. In addition to existing infrastructure, which is mostly based on the set top boxes, new content providers have emerged, utilizing newly developed proprietary streaming platforms. As the number of IPTV users grew, more volume and variety of data became available for analysis. By analyzing stored user actions, it is possible to create a multivariate time series that represents user behavior over time. The approach presented in the paper is based on multivariate time series generation from user data and determining the similarity between them. Time series are created for each user based on the proposed quantified action sets, grouped in the feature groups and summarized over time. The action sets and feature groups can be adjusted to a certain IPTV platform. The end result of the analysis is the similarity score matrix, generated by calculating the similarities of all users’ time series, where the similarity measure calculation can be chosen arbitrarily.
... Having presented our method to deal with multivariate ordinal patterns, let us shortly recall what can be found in the existing literature: Mohr has dealt with different concepts in her doctoral thesis (Mohr 2022). Compare in this context also Mohr et al. (2020b). ...
Article
The classification of movement in space is one of the key tasks in environmental science. Various geospatial data such as rainfall or other weather data, data on animal movement or landslide data require a quantitative analysis of the probable movement in space to obtain information on potential risks, ecological developments or changes in future. Usually, machine-learning tools are applied for this task, as these approaches are able to classify large amounts of data. Yet, machine-learning approaches also have some drawbacks, e.g. the often required large training sets and the fact that the algorithms are often hard to interpret. We propose a classification approach for spatial data based on ordinal patterns. Ordinal patterns have the advantage that they are easily applicable, even to small data sets, are robust in the presence of certain changes in the time series and deliver interpretative results. They therefore do not only offer an alternative to machine-learning in the case of small data sets but might also be used in pre-processing for a meaningful feature selection. In this work, we introduce the basic concept of multivariate ordinal patterns and the corresponding limit theorem. A simulation study based on bootstrap demonstrates the validity of the results. The approach is then applied to two real-life data sets, namely rainfall radar data and the movement of a leopard. Both applications emphasize the meaningfulness of the approach. Clearly, certain patterns related to the atmosphere and environment occur significantly often, indicating a strong dependence of the movement on the environment.
... Having presented our method to deal with multivariate ordinal patterns, let us shortly recall what can be found in the existing literature: Mohr has dealt with different concepts in her doctoral thesis (Mohr, 2022). Compare in this context also Mohr et al (2020b). ...
Preprint
The classification of movement in space is one of the key tasks in environmental science. Various geospatial data such as rainfall or other weather data, data on animal movement or landslide data require a quantitative analysis of the probable movement in space to obtain information on potential risks, ecological developments or changes in future. Usually, machine-learning tools are applied for this task, as these approaches are able to classify large amounts of data. Yet, machine-learning approaches also have some drawbacks, e.g. the often required large training sets and the fact that the algorithms are often seen as black boxes. We propose a classification approach for spatial data based on ordinal patterns. Ordinal patterns have the advantage that they are easily applicable, even to small data sets, are robust in the presence of certain changes in the time series and deliver interpretative results. They therefore do not only offer an alternative to machine-learning in the case of small data sets but might also be used in pre-processing for a meaningful feature selection. In this work, we introduce the basic concept of multivariate ordinal patterns and the corresponding limit theorem. A simulation study based on bootstrap demonstrates the validity of the results. The approach is then applied to two real-life data sets, namely rainfall radar data and the movement of a leopard. Both applications emphasize the meaningfulness of the approach. Clearly, certain patterns related to the atmosphere and environment occur significantly often, indicating a strong dependence of the movement on the environment. MSC Classification: 62M10 , 62H20 , 62F12 , 60F05 , 05A05 , 62G30
... Furthermore, a representation is considered good if it can be applied flexibly and reliably to different problems and real-world data sets. For example, ordinal pattern representations combined with a simple k-nearest-neighbors method outperform various state-of-the-art classification methods (see [14]), and do so on numerous real-world data sets from the UEA MTSC archive [4]. ...
... The empirical evaluation shows that by means of our approach unnecessary groundings are reduced, i.e., improving runtime performance, while also keeping the model accuracy through leveraging inferred evidence and therefore representing the reality more realistically. In future work, we use multivariate ordinal patterns introduced by Mohr et al. [12] to incorporate entity clusters based on their partitions in the DPRM, i.e., on their parfactors. We also investigate various forms of interconnectivity between entity symmetry clusters that can help to further increase the accuracy of the model by representing the reality even better. ...
Chapter
Lifted inference approaches reduce computational work as inference is performed using representatives for sets of indistinguishable random variables, which allows for tractable inference w.r.t. domain sizes in dynamic probabilistic relational models. Unfortunately, maintaining a lifted representation is challenging in practically relevant application domains, as evidence often breaks symmetries making lifted techniques fall back on their ground counterparts. In existing approaches asymmetric evidence is counteracted by merging similar but distinguishable objects when moving forward in time. While undoing splits a posteriori is reasonable, we propose learning approximate model symmetries a priori to prevent unnecessary splits due to inaccuracy or one-time events. In particular, we propose a multivariate ordinal pattern symbolization approach followed by spectral clustering to determine sets of domain entities behaving approximately the same over time. By using object clusters, we avoid unnecessary splits by keeping entities together that tend to behave the same over time. Understanding symmetrical and asymmetrical entity behavior a priori allows for increasing accuracy in inference by means of inferred evidence for unobserved entities to better represent reality. Empirical results show that our approach reduces unnecessary splits, i.e., improves runtimes, while keeping accuracy in inference high.
Article
Detecting transportation modes’ usability in spatiotemporal urban trajectories can provide valuable insights into the mobility preferences of urban populations, helping epidemic prevention and urban quality-of-life improvement. With this goal, we introduce POPAyI, a strategy that bases its design on the Ordinal Pattern (OP) transformation applied to mobility-related time series. POPAyI can quantify time-series dynamics with a low-complex cost, muscling time series’ characteristics without the need for high computational and methodological complexities as the current Machine Learning (ML) and Deep Learning (DL) literature. POPAyI uses polar representation and captures amplitude information in time series, bringing the multivariate capability to the standard 1D OP transformation. Our experiments show that POPAyI: (i) perfectly adapts to multi-dimensional mobility time series and natural non-linear mobility behavior; and (ii) presents consistent detection results in any considered number of transportation mode’s classes with efficiency in terms of storage and computation complexity, using fewer features than ML approaches and fewer computational resources than DL methods, e.g., reaching 10000 fewer parameters than a lightweight DL approach while increasing the F1-score by 3%.
Chapter
We propose a methodology for calibrating a physical system simulator whose computational model represents its events as time series. The methodology reduces the search space of the fit parameters by exploring a database that contains stored historical events and their corresponding simulator fit parameters. We carry out the symbolic representation of the time series using ordinal patterns to classify the series, which allows us to search and compare by similarity on the stored data of the series represented. This classification strategy allows us to speed up the parameter search process, reduce the computational cost of the adjustment process and consequently improve energy cost savings. The experiments showed a reduction in the computational cost of 29% compared with the tuning methodology proposed in our previous research.
Keywords: Parametric simulation; Tuning methodology; Ordinal pattern; Time series classification; Data driven
Chapter
Exploiting symmetries is an important topic to obtain sparse (lifted) representations, reduce complexity and achieve good performance in dynamic probabilistic relational models (DPRMs). DPRMs factorise a full joint probability distribution by encoding multivariate time series through a set of conditionally dependent random variables. As obtaining exact symmetries throughout multivariate time series is often not realistic in real-world contexts and counteracts lifted representations, we propose to approximate the multivariate time series with a symbolisation scheme that encodes the overarching trend in up and down movements. In this work, we introduce MOP4SA, an approach for the approximation of symmetries based on multivariate ordinal pattern encodings and spectral clustering. Understanding symmetrical behaviour has several benefits that we evaluate in two use cases. We use MOP4SA (a) to detect structures in model symmetries over time, and (b) to avoid model splits and groundings in DPRMs.
Keywords: Symmetry; Multivariate time series; Multivariate ordinal pattern; Relational models; Lifting
Article
The study of nonlinear and possibly chaotic time-dependent systems involves long-term data acquisition or high sample rates. The resulting big data is valuable in order to provide useful insights into long-term dynamics. However, efficient and robust algorithms are required that can analyze long time series without decomposing the data into smaller series. Here symbolic-based analysis techniques that regard the dependence of data points are of some special interest. Such techniques are often prone to capacity or, on the contrary, to undersampling problems if the chosen parameters are too large. In this paper we present and apply algorithms of the relatively new ordinal symbolic approach. These algorithms use overlapping information and binary number representation, whilst being fast in the sense of algorithmic complexity, and allow, to the best of our knowledge, larger parameters than comparable methods currently used. We exploit the achieved large parameter range to investigate the limits of entropy measures based on ordinal symbolics. Moreover, we discuss data simulations from this viewpoint.
Article
This study presents a two-step fault diagnosis scheme combined with statistical classification and random forests-based classification for rolling element bearings. Considering the inequality of features sensitivity in different diagnosis steps, the proposed method utilizes permutation entropy and variational mode decomposition to depict vibration signals under single scale and multiscale. In the first step, the permutation entropy features on the single scale of original signals are extracted and the statistical classification model based on Chebyshev’s inequality is constructed to detect the faults with a preliminary acquaintance of the bearing condition. In the second step, vibration signals with fault conditions are firstly decomposed into a collection of intrinsic mode functions by using variational mode decomposition and then multiscale permutation entropy features derived from each mono-component are extracted to identify the specific fault types. In order to improve the classification ability of the characteristic data, the out-of-bag estimation of random forests is firstly employed to reelect and refine the original multiscale permutation entropy features. Then the refined features are considered as the input data to train the random forests-based classification model. Finally, the condition data of bearings with different fault conditions are employed to evaluate the performance of the proposed method. The results indicate that the proposed method can effectively identify the working conditions and fault types of rolling element bearings.
Article
MIMIC-III (‘Medical Information Mart for Intensive Care’) is a large, single-center database comprising information relating to patients admitted to critical care units at a large tertiary care hospital. Data includes vital signs, medications, laboratory measurements, observations and notes charted by care providers, fluid balance, procedure codes, diagnostic codes, imaging reports, hospital length of stay, survival data, and more. The database supports applications including academic and industrial research, quality improvement initiatives, and higher education coursework.
Article
The ordinal approach to evaluate time series due to innovative works of Bandt and Pompe has increasingly established itself among other techniques of nonlinear time series analysis. In this paper, we summarize and generalize the theory of determining the Kolmogorov-Sinai entropy of a measure-preserving dynamical system via increasing sequences of order generated partitions of the state space. Our main focus are measuring processes without information loss. Particularly, we consider the question of the minimal necessary number of measurements related to the properties of a given dynamical system.
Article
An original multivariate multi-scale methodology for assessing the complexity of physiological signals is proposed. The technique is able to incorporate the simultaneous analysis of multi-channel data as a unique block within a multi-scale framework. The basic complexity measure is obtained by using Permutation Entropy, a methodology for time series processing based on ordinal analysis. Permutation Entropy is conceptually simple, structurally robust to noise and artifacts, and computationally very fast, which is relevant for designing portable diagnostics. Since time series derived from biological systems show structures on multiple spatial-temporal scales, the proposed technique can be useful for other types of biomedical signal analysis. In this work, the possibility of distinguishing the brain states of Alzheimer’s disease patients and Mild Cognitive Impaired subjects from those of normal healthy elderly subjects is checked on a real, although quite limited, experimental database.
Article
More than ten years ago Bandt and Pompe introduced a new measure to quantify complexity in measured time series. During these ten years, this measure has been modified and extended. In this review we will give a brief introduction to permutation entropy, explore the different fields of utilization where permutation entropy has been applied and provide a guide on how to choose appropriate parameters for different applications of permutation entropy.
Article
Permutation Entropy (PE) is a cost effective tool for summarizing the complexity of a time series. It has been used in many applications including damage detection, disease forecasting, detection of dynamical changes, and financial volatility analysis. However, to successfully use PE, an accurate selection of two parameters is needed: the permutation dimension n and embedding delay τ. These parameters are often suggested by experts based on a heuristic or by a trial and error approach. Both of these methods can be time-consuming and lead to inaccurate results. In this work, we investigate multiple schemes for automatically selecting these parameters with only the corresponding time series as the input. Specifically, we develop a frequency-domain approach based on the least median of squares and the Fourier spectrum, as well as extend two existing methods: Permutation Auto-Mutual Information Function and Multi-scale Permutation Entropy (MPE) for determining τ. We then compare our methods as well as current methods in the literature for obtaining both τ and n against expert-suggested values in published works. We show that the success of any method in automatically generating the correct PE parameters depends on the category of the studied system. Specifically, for the delay parameter τ, we show that our frequency approach provides accurate suggestions for periodic systems, nonlinear difference equations, and electrocardiogram/electroencephalogram data, while the mutual information function computed using adaptive partitions provides the most accurate results for chaotic differential equations. For the permutation dimension n, both False Nearest Neighbors and MPE provide accurate values for n for most of the systems with a value of n=5 being suitable in most cases.
Article
To measure the complexity of multivariate systems, the multivariate permutation entropy (MvPE) algorithm is proposed. It is employed to measure complexity of multivariate system in the phase space. As an application, MvPE is applied to analyze the complexity of chaotic systems, including hyperchaotic Hénon map, fractional-order simplified Lorenz system and financial chaotic system. Results show that MvPE algorithm is effective for analyzing the complexity of the multivariate systems. It also shows that fractional-order system does not become more complex with derivative order varying. Compared with PE, MvPE has better robustness for noise and sampling interval, and the results are not affected by different normalization methods.
Article
This paper provides a way for determining the Kolmogorov-Sinai entropy of time-discrete dynamical systems on the base of quantifying ordinal patterns obtained from a finite set of observables. As a consequence, it is shown that the Kolmogorov-Sinai entropy is bounded from above by a quantity which generalizes the concept of permutation entropy. In this framework, the determination of the Kolmogorov-Sinai entropy of a multidimensional system by use of only a single one-dimensional observable and Takens' embedding theorem is discussed.