Quaternion Denoising Encoder-Decoder
for Theme Identification of Telephone Conversations
Titouan Parcollet, Mohamed Morchid, Georges Linarès
LIA, University of Avignon (France)
{firstname.lastname}@univ-avignon.fr
Abstract
In the last decades, encoder-decoders or autoencoders (AE) have received great interest from researchers due to their capability to construct robust representations of documents in a low-dimensional subspace. Nonetheless, autoencoders reveal little in the way of spoken document internal structure, by considering only the words or topics contained in the document as isolated basic elements, and they tend to overfit on small corpora of documents. Therefore, Quaternion Multi-layer Perceptrons (QMLP) have been introduced to capture such internal latent dependencies, whereas denoising autoencoders (DAE) employ different stochastic noises to better process small sets of documents. This paper presents a novel autoencoder based on both the hitherto-proposed DAE (to manage small corpora) and the QMLP (to consider internal latent structures), called "Quaternion denoising encoder-decoder" (QDAE). Moreover, the paper defines an original angular Gaussian noise adapted to the specificity of hyper-complex algebra. The experiments, conducted on a theme identification task of spoken dialogues from the DECODA framework, show that the QDAE obtains promising gains of 3% and 1.5% compared to the standard real-valued denoising autoencoder and the QMLP respectively.
Index Terms: Spoken language understanding, Neural networks, Quaternion algebra, Denoising encoder-decoder neural networks
1. Introduction
A basic encoder-decoder neural network [1] (AE) consists of two neural networks (NN): an encoder that maps an input vector into a low-dimensional and fixed context vector, and a decoder that generates a target vector by reconstructing this context vector. Multidimensional data such as the latent structures of spoken dialogues are difficult to capture with traditional autoencoders, due to the unidimensionality of the real numbers they employ. [2, 3] introduced a quaternion-based multilayer perceptron (QMLP), together with a specific spoken dialogue segmentation, to better capture internal structures thanks to the Hamilton product [4], and thus achieved better accuracies than real-valued multilayer perceptrons (MLP) on a theme identification task of spoken dialogues. A quaternion encoder-decoder has then been proposed by [5] to take advantage of the multidimensionality of hyper-complex numbers to encode existing latent relations between pixel colors. However, both quaternion- and real-valued autoencoders suffer from overfitting and degraded generalization capabilities when dealing with small corpora of documents [6]. Indeed, autoencoders try to map the initial vector into a low-dimensional subspace, and their quality is thus highly correlated with the number of patterns to learn. To overcome this drawback, a stochastic encoder-decoder called the denoising autoencoder (DAE) has been proposed by [6] and investigated in [7, 8, 9]. Intuitively, a denoising autoencoder encodes artificially corrupted inputs and tries to reconstruct the initial vector. By learning from this noisy representation, the DAE tends to better abstract patterns in a reduced, robust subspace.
This paper proposes a novel quaternion denoising encoder-decoder (QDAE) that takes into account the internal document structure (as the QMLP does) and is able to manage small corpora (as the DAE does). Nonetheless, traditional noises, such as the additive isotropic Gaussian noise [10], were designed for real-valued autoencoders. Therefore, we also propose a Gaussian angular noise (GAN) adapted to the quaternion algebra. The experiments on the DECODA telephone conversation framework show the impact of the different noises, and underline the performance of the proposed QDAE over the DAE, AE, MLP and QMLP.
The rest of the paper is organized as follows: Section 2 presents the quaternion encoder-decoder and Section 3 details the experimental protocol. The results are discussed in Section 4 before concluding in Section 5.
2. Quaternion Denoising Encoder-Decoder
The proposed QDAE is a denoising autoencoder based on quaternion numbers. Section 2.1 details the quaternion properties required for the QAE and QDAE algorithms, which are presented in Sections 2.2 and 2.3.
2.1. Quaternion algebra
Quaternion algebra $\mathbb{Q}$ is an extension of complex numbers defined in a four-dimensional space as a linear combination of four basis elements denoted $1, i, j, k$ to represent a rotation. A quaternion $Q$ is written as $Q = r1 + xi + yj + zk$. In a quaternion, $r$ is the real part while $xi + yj + zk$ is the imaginary part ($I$), or the vector part. A set of basic quaternion properties needed for the QDAE definition are defined as follows:

- all products of $i, j, k$: $i^2 = j^2 = k^2 = ijk = -1$
- the quaternion conjugate $\bar{Q}$ of $Q$ is: $\bar{Q} = r1 - xi - yj - zk$
- the inner product between two quaternions $Q_1$ and $Q_2$ is: $\langle Q_1, Q_2 \rangle = r_1 r_2 + x_1 x_2 + y_1 y_2 + z_1 z_2$
- the normalization of a quaternion: $Q^{\triangleleft} = \dfrac{Q}{\sqrt{r^2 + x^2 + y^2 + z^2}}$
- the rotation through the angle of a unit quaternion $R^{\triangleleft}$: $Q' = R^{\triangleleft} Q \bar{R}^{\triangleleft}$
- the Hamilton product between $Q_1$ and $Q_2$ encodes latent dependencies and is defined as follows:

$$Q_1 \otimes Q_2 = (r_1 r_2 - x_1 x_2 - y_1 y_2 - z_1 z_2) + (r_1 x_2 + x_1 r_2 + y_1 z_2 - z_1 y_2)\,i + (r_1 y_2 - x_1 z_2 + y_1 r_2 + z_1 x_2)\,j + (r_1 z_2 + x_1 y_2 - y_1 x_2 + z_1 r_2)\,k$$

$Q_1 \otimes Q_2$ performs an interpolation between two rotations following a geodesic over a sphere in the $\mathbb{R}^3$ space. More about hyper-complex numbers can be found in [4, 11, 12], and about quaternion algebra in [13].
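To make these operations concrete, here is a minimal Python/NumPy sketch (our illustration; the paper itself provides no code) of the conjugate, norm, normalization, Hamilton product and rotation, with a quaternion stored as a [r, x, y, z] array:

```python
# Minimal sketch of the quaternion operations of Section 2.1.
# A quaternion Q = r1 + xi + yj + zk is stored as np.array([r, x, y, z]).
import numpy as np

def conjugate(q):
    """Conjugate: r1 - xi - yj - zk."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def norm(q):
    """|Q| = sqrt(r^2 + x^2 + y^2 + z^2)."""
    return np.sqrt(np.sum(q ** 2))

def normalize(q):
    """Normalized quaternion Q / |Q|."""
    return q / norm(q)

def hamilton(q1, q2):
    """Hamilton product Q1 (x) Q2, which encodes latent dependencies."""
    r1, x1, y1, z1 = q1
    r2, x2, y2, z2 = q2
    return np.array([
        r1 * r2 - x1 * x2 - y1 * y2 - z1 * z2,
        r1 * x2 + x1 * r2 + y1 * z2 - z1 * y2,
        r1 * y2 - x1 * z2 + y1 * r2 + z1 * x2,
        r1 * z2 + x1 * y2 - y1 * x2 + z1 * r2,
    ])

def rotate(q, r):
    """Rotation by the unit quaternion r: r (x) q (x) conjugate(r)."""
    return hamilton(hamilton(r, q), conjugate(r))
```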
2.2. Quaternion Autoencoder (QAE)
The QAE is a three-layered neural network made of an encoder and a decoder (see Figure 1-(a)). The well-known autoencoder (AE) is obtained with the same algorithm but with real numbers.
[Figure 1: Illustration of the quaternion autoencoders: (a) the quaternion autoencoder; (b) the quaternion denoising autoencoder, in which the input $Q_p$ is corrupted by a noise function $f(\cdot)$ before encoding.]
Given a set of $P$ normalized inputs $Q^{\triangleleft}_p$ (written $Q_p$ for convenience, $1 \le p \le P$) of size $M$, the encoder computes a hidden representation $h_n$ of $Q_p = \{Q_m\}_{m=1}^{M}$, where $N$ is the number of hidden units:

$$h_n = \alpha\left(\sum_{m=1}^{M} w^{(1)}_{nm} \otimes Q_m + \theta^{(1)}_n\right)$$

where $w^{(1)}$ is an $N \times M$ weight matrix and $\theta^{(1)}$ is an $N$-dimensional bias vector; $\alpha(Q)$ is the sigmoid activation function of the quaternion $Q$ [14]: $\alpha(Q) = \mathrm{sig}(r)1 + \mathrm{sig}(x)i + \mathrm{sig}(y)j + \mathrm{sig}(z)k$, with $\mathrm{sig}(\lambda) = 1/(1+e^{-\lambda})$. The decoder attempts to reconstruct the input vector $Q_p$ from the hidden vector $h_n$ to obtain the output vector $\tilde{Q}_p = \{\tilde{Q}_m\}_{m=1}^{M}$:

$$\tilde{Q}_m = \alpha\left(\sum_{n=1}^{N} w^{(2)}_{mn} \otimes h_n + \theta^{(2)}_m\right)$$

where the reconstructed quaternion vector $\tilde{Q}_p$ is $M$-dimensional, $w^{(2)}$ is an $M \times N$ weight matrix and $\theta^{(2)}$ is an $M$-dimensional bias vector. During learning, the QAE attempts to reduce the reconstruction error $e$ between $\tilde{Q}_p$ and $Q_p$ using the traditional Mean Square Error (MSE) [15], $e_{\mathrm{MSE}}(\tilde{Q}_m, Q_m) = \|\tilde{Q}_m - Q_m\|^2$, by minimizing the total reconstruction error

$$L_{\mathrm{MSE}} = \frac{1}{P}\sum_{p \in P}\sum_{m \in M} e_{\mathrm{MSE}}(\tilde{Q}_m, Q_m)$$

with respect to the parameter (quaternion) set $\Theta = \{w^{(1)}, \theta^{(1)}, w^{(2)}, \theta^{(2)}\}$.
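As an illustration of the equations above, the following sketch (assumed shapes and plain loops; not the authors' implementation) computes the QAE forward pass and MSE loss, reusing hamilton() from the previous sketch:

```python
# Sketch of the QAE forward pass (Section 2.2). Quaternions are stored as
# [r, x, y, z] arrays; hamilton() comes from the previous sketch.
# Hypothetical shapes: inputs (M, 4), w1 (N, M, 4), t1 (N, 4),
# w2 (M, N, 4), t2 (M, 4).
import numpy as np

def sig(v):
    return 1.0 / (1.0 + np.exp(-v))

def quat_sigmoid(q):
    """Split activation: alpha(Q) = sig(r)1 + sig(x)i + sig(y)j + sig(z)k."""
    return sig(q)  # applied component-wise

def quat_layer(x, w, theta):
    """out[n] = alpha(sum_m w[n, m] (x) x[m] + theta[n])."""
    out = np.zeros_like(theta)
    for n in range(w.shape[0]):
        acc = theta[n].copy()
        for m in range(x.shape[0]):
            acc += hamilton(w[n, m], x[m])
        out[n] = quat_sigmoid(acc)
    return out

def qae_forward(q_p, w1, t1, w2, t2):
    h = quat_layer(q_p, w1, t1)        # encoder: M inputs -> N hidden units
    q_tilde = quat_layer(h, w2, t2)    # decoder: N hidden -> M reconstructions
    return q_tilde

def mse_loss(q_tilde, q_p):
    """Sum over m of ||Q~_m - Q_m||^2 for one pattern."""
    return np.sum((q_tilde - q_p) ** 2)
```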
2.3. Quaternion Denoising Autoencoder (QDAE)
Traditional autoencoders fail to: 1) separate robust features and relevant information from residual noise [9] on small corpora; 2) take into account the temporal and internal structures of spoken documents. Therefore, denoising autoencoders (DAE) [9] corrupt the inputs with specific noises during encoding, and decode this representation to reconstruct the non-corrupted inputs. DAE models learn a robust generative model to better represent small-sized corpora of documents; [2] proposed to learn an internal and temporal structure representation with a quaternion multilayer perceptron (QMLP). This paper proposes to address the issues related to small-sized corpora (as the DAE does) and to temporal structure (as the QMLP does) by introducing a quaternion denoising autoencoder called QDAE. Figure 1-(b) shows an input vector $Q_p$ artificially corrupted by a noise function $f(\cdot)$ applied to each index $Q_m$ of $Q_p$ as $f(Q_p) = \{f(Q_1), \dots, f(Q_m), \dots, f(Q_M)\}$.
Standard noises adapted to real numbers:
- Additive isotropic Gaussian (G): adds a different Gaussian noise to each input value $(Q_1, \dots, Q_m, \dots, Q_M)$ of a fixed proportion of the patterns $Q_p$, with the means and variances of the Gaussian distribution bounded by the corresponding averages over all the patterns of the same prediction theme as $Q_p$.
- Salt-and-pepper (SP): a fixed proportion of all the patterns $Q_p$ is randomly set to 1 or 0.
- Dropout (D): a fixed proportion of all the patterns $Q_p$ is randomly set to 0.
Given a noise function $f(\cdot)$, the corrupted quaternion corresponding to $Q_m = r1 + xi + yj + zk$ is $Q^{\text{corrupted}}_m = f(Q_m) = f(r)1 + f(x)i + f(y)j + f(z)k$. Nonetheless, such a representation does not take into account the specificity of the quaternion algebra, since these noises were designed for real numbers. Indeed, a quaternion represents a rotation over the $\mathbb{R}^3$ space; basic additive, non-angular noises such as a Gaussian noise only represent a one-dimensional translation and do not take advantage of the rotation defined by a quaternion.
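For illustration, here is a sketch of these real-valued corruptions, applied at the component level of an (M, 4) quaternion pattern; the proportions and noise scales are illustrative assumptions, and the paper corrupts a fixed proportion of patterns rather than of components:

```python
# Sketch of the real-valued noises of Section 2.3 applied component-wise
# to a pattern stored as an (M, 4) array; proportions/scales are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def gaussian_noise(q_p, std=0.1):
    """Additive Gaussian: each component receives its own noise sample."""
    return q_p + rng.normal(0.0, std, size=q_p.shape)

def salt_and_pepper(q_p, proportion=0.2):
    """A fixed proportion of components is randomly forced to 0 or 1."""
    out = q_p.copy()
    mask = rng.random(q_p.shape) < proportion
    out[mask] = rng.integers(0, 2, size=int(mask.sum())).astype(float)
    return out

def dropout_noise(q_p, proportion=0.2):
    """A fixed proportion of components is forced to 0."""
    out = q_p.copy()
    out[rng.random(q_p.shape) < proportion] = 0.0
    return out
```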
Quaternion Gaussian Angular Noise (GAN):
The GAN takes advantage of the quaternion algebra (rotation) and is proposed to address the drawback of noise functions that are weakly adapted to the rotation definition of quaternions (they merely add a noise to each quaternion). The GAN noise function is based on the rotation of a quaternion vector $Q_p$ around an axis defined in a cone centered on $m_t$ and delimited by $v_t$, where $m_t$ is the mean and $v_t$ is the variance of the patterns $Q_p$ belonging to theme $t$. Let $R^t_p$ be a Gaussian noised quaternion for the theme $t$, defined as:

$$R^t_p = m_t + \mathcal{N}(0, I) \times v_t \quad (1)$$

The Gaussian angular noise function $f(\cdot)$ rotates a $Q_p$ belonging to the theme $t$ around $R^t_p$ to obtain the corrupted quaternion $Q^{\text{corrupted}}_p$:

$$f(Q_p) = \frac{R^t_p \otimes Q_p \otimes \bar{R}^t_p}{|R^t_p \otimes Q_p|} \quad (2)$$

$$f(Q_p) = \begin{cases} Q_p & \text{if } R^t_p = Q_p \\ Q^{\text{corrupted}}_p & \text{otherwise} \end{cases} \quad (3)$$

It is worth noticing in eq. (3) that $f$ reduces to the identity when $R^t_p = Q_p$, so that the dialogue pattern is kept unaltered.
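A sketch of eqs. (1)-(3) follows, reusing hamilton(), conjugate() and norm() from the first sketch; it assumes the theme-wise mean m_t and variance v_t are precomputed as quaternions and that the noise is applied one quaternion at a time:

```python
# Sketch of the Gaussian angular noise (GAN) of eqs. (1)-(3).
# m_t, v_t: assumed precomputed theme-wise mean and variance quaternions.
import numpy as np

rng = np.random.default_rng(0)

def gan_corrupt(q, m_t, v_t):
    """Rotate q around a noisy, theme-dependent axis R^t_p."""
    r_t = m_t + rng.normal(0.0, 1.0, size=4) * v_t      # eq. (1)
    if np.allclose(r_t, q):
        return q                                        # eq. (3): unaltered
    rq = hamilton(r_t, q)
    return hamilton(rq, conjugate(r_t)) / norm(rq)      # eq. (2)
```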
3. Experimental protocol
The effectiveness of the proposed QDAE-GAN is evaluated on a theme identification task of telephone conversations from the DECODA corpus, detailed in Section 3.1. Section 3.2 describes the dialogue features employed as inputs of the autoencoders, as well as the configuration of each neural network.
3.1. Spoken Dialogue dataset
The DECODA corpus [16] contains real-life human-human telephone conversations collected in the CSS of the Paris transportation system (RATP). It is composed of 1,242 telephone conversations, corresponding to about 74 hours of signal, split into a train set (740 dialogues), a development set (dev, 175 dialogues) and a test set (327 dialogues). Each conversation is annotated with one of 8 themes. Themes correspond to customer problems or inquiries about itinerary, lost and found, time schedules, transportation cards, state of the traffic, fares, fines and special offers. The LIA-Speeral Automatic Speech Recognition (ASR) system [17] is used to automatically transcribe each conversation. Acoustic model parameters are estimated from 150 hours of telephone speech. The vocabulary contains 5,782 words. A 3-gram language model (LM) is obtained by adapting a basic LM with the training set transcriptions. Automatic transcriptions are obtained with word error rates (WERs) of 33.8%, 45.2% and 49.5% on the train, dev and test sets respectively. These high rates are mainly due to speech disfluencies from casual users and to the adverse acoustic environments of metro stations and streets.
3.2. Input features and neural network settings
The experiments compare our proposed QDAE with a DAE based on real numbers [7] and with the QMLP [2].
Input features: [2] showed that an LDA [18] space with 25 topics and a specific user-agent document segmentation, in which the quaternion $Q = r1 + xi + yj + zk$ is built with the user part of the dialogue in the first imaginary component $x$, the agent part in $y$, and the topic prior of the whole dialogue in $z$, achieve the best results over 10 folds with the QMLP. Therefore, we keep this segmentation and concatenate the 10 representations of size 25 into a single input vector of size $M = 250$. Indeed, the compression of the 10 folds into a single input vector gives the DAEs more features for generalizing patterns. For a fair comparison, a QMLP with the same input vector is also tested.
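A sketch of this input construction follows; the real part is set to zero here as an assumption (the paper only assigns the x, y and z components), and the function names are hypothetical:

```python
# Sketch of the quaternion input construction of Section 3.2: one
# quaternion per LDA topic, with x = user part, y = agent part,
# z = whole-dialogue topic prior, and r = 0 (assumption).
import numpy as np

def build_quaternion_segment(user_topics, agent_topics, full_topics):
    """Stack three 25-dim topic vectors into a (25, 4) quaternion array."""
    zeros = np.zeros_like(user_topics)
    return np.stack([zeros, user_topics, agent_topics, full_topics], axis=1)

# Hypothetical usage: concatenating 10 such representations of size 25
# yields the input vector of M = 250 quaternions, i.e. a (250, 4) array.
segments = [build_quaternion_segment(np.random.rand(25),
                                     np.random.rand(25),
                                     np.random.rand(25))
            for _ in range(10)]
q_input = np.concatenate(segments, axis=0)   # shape (250, 4)
```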
QDAE and QMLP configurations: The appropriate size of the hidden layer $h$ of the QDAE has to be chosen by varying the number of neurons in the hidden layer, which changes the amount and the shape of the features given to the classifier. Different autoencoders have thus been learned by varying the hidden layer size from 10 to 120. Finally, a QMLP classifier is trained with 8 hidden neurons, taking the hidden layer of the QAE or QDAE as its input vector, and with 8 output neurons (the 8 themes $t$ of the DECODA corpus).
4. Experiments and Results
The proposed quaternion denoising autoencoder (QDAE) is compared to the quaternion autoencoder (QAE) in Section 4.1, on the theme identification task of telephone conversations described in Section 3.1. For a fair comparison, the QDAE is then compared to the real-valued AE and MLP in Section 4.2.
4.1. QDAE with additive and angular noises
Figure 2 shows the accuracies obtained with the denoising quaternion encoder-decoder on the development and test sets for the theme identification task of telephone conversations of the DECODA project. The first remark is that the results obtained on the development dataset, reported in Fig. 2, are similar whatever the model employed.
[Figure 2: Accuracies in % obtained on the development (left, "SEG 1 on Dev") and test (right, "SEG 1 on Test") sets by varying the number of neurons in the hidden layer (10 to 120) of the QAE and the QDAE variants (QDAE-GAN, QDAE-G, QDAE-D, QDAE-SP).]
Nonetheless, the proposed QDAE-GAN gives better results on the test dataset than any other method, and is more robust to variations of the hidden layer size. Table 1 validates the results observed for the QDAE-GAN, with a gain of more than 3.5% and 2.5% over QDAE-G and QDAE-D respectively. As expected, traditional noises give worse results than the adapted noise, due to the specificities of the quaternion algebra. Indeed, an additive real-valued Gaussian noise applied to a quaternion does not take advantage of the rotations defining quaternions. It is worth underlining the poor performances reported with the QDAE-SP and QDAE-D, which are not based on real or quaternion algebra specificities: these poor performances are explained by the high impact of the zero values propagated through the Hamilton product (see Section 2.1), which increases the number of dead neurons in the neural network. Finally, the non-corrupted QAE gives a "best test" value on the test dataset (83%) as good as the QDAEs with real-valued noises, underlining the non-relevance of real-valued noises for quaternion-based autoencoders.
Models      Dev.   Best Test   Real Test
QAE         89.1   83.0        80.9
QDAE-SP     88.5   82.5        81.2
QDAE-G      88.5   83.1        81.5
QDAE-D      89.1   83.0        82.5
QDAE-GAN    90.2   85.2        85.2

Table 1: Accuracies in % obtained by the proposed quaternion encoder-decoders on the DECODA dataset.
4.2. QDAE vs. real-valued neural networks
For a fair comparison, this original QDAE-GAN approach is compared to real-valued autoencoders and traditional neural networks; the results are reported in Table 2. Table 2 shows that non-adapted noises and the standard QAE give worse performances than a QMLP, because of the lack of unseen compressed information they give to the classifier. It is worth emphasizing that the best accuracies observed are obtained by the QDAE-GAN, representing a gain of 11% over the DAE [7]. The results reported in Table 2 demonstrate the global improvement in performance of the quaternion-valued neural networks compared to the real-valued ones.
Models      Type   Dev.   Best Test   Real Test   Impr.
MLP [2]     R      85.2   79.6        79.6        -
QMLP        Q      89.7   83.7        83.7        +4.1
AE [7]      R      -      -           81          -
QAE         Q      89.1   83.0        80.9        -0.1
DAE [7]     R      -      -           74.3        -
DSAE [7]    R      88.0   83.0        82.0        +7.7
QDAE-GAN    Q      90.2   85.2        85.2        +10.9

Table 2: Summary of accuracies in % obtained by different neural networks on the DECODA framework.
Indeed, the QMLP gives an important gain of more than 4% over the MLP, and the QDAE-GAN obtains a gain of 3.2% compared to the DSAE.
5. Conclusion
Summary. This paper proposes a promising denoising encoder-decoder based on the quaternion algebra, coupled with an original and well-adapted quaternion Gaussian angular noise. The initial intuition that the QDAE better captures latent relations between input features and can generalize from small corpora has been demonstrated. It has been shown that the noises applied during learning must be adapted to the quaternion algebra to give better results and truly expose the full potential of quaternion neural networks. Moreover, this paper shows that the quaternion-valued neural networks performed better than the real-valued ones in all our experiments, achieving impressive accuracies on the small DECODA corpus with fewer input features and fewer neural parameters.
Limitations and Future Work. Document segmentation is a crucial issue for better capturing latent, temporal and spatial information, and thus needs more investigation to expose the full potential of quaternion-based models. Moreover, the lack of GPU tools for managing quaternions implies a massive implementation time when dealing with bigger spoken document corpora. Future work will investigate other quaternion-adapted noises, as well as other quaternion-based neural networks that better take the document internal structure into consideration, such as recurrent neural networks and Long Short-Term Memory neural networks.
6. References
[1] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature,
vol. 521, no. 7553, pp. 436–444, 2015.
[2] T. Parcollet, M. Morchid, P.-M. Bousquet, R. Dufour, G. Linarès, and R. De Mori, “Quaternion neural networks for spoken language understanding,” in Spoken Language Technology Workshop (SLT), 2016 IEEE. IEEE, 2016, pp. 362–368.
[3] M. Morchid, G. Linarès, M. El-Beze, and R. De Mori, “Theme identification in telephone service conversations using quaternions of speech features,” in Interspeech. ISCA, 2013.
[4] I. Kantor, A. Solodovnikov, and A. Shenitzer, Hypercomplex num-
bers: an elementary introduction to algebras. Springer-Verlag,
1989.
[5] T. Isokawa, N. Matsui, and H. Nishimura, “Quaternionic neural
networks: Fundamental properties and applications,” Complex-
Valued Neural Networks: Utilizing High-Dimensional Parame-
ters, pp. 411–439, 2009.
[6] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, “Ex-
tracting and composing robust features with denoising autoen-
coders,” in Proceedings of the 25th international conference on
Machine learning. ACM, 2008, pp. 1096–1103.
[7] K. Janod, M. Morchid, R. Dufour, G. Linares, and R. De Mori,
“Deep stacked autoencoders for spoken language understanding,”
Matrix, vol. 1, p. 2, 2016.
[8] X. Lu, Y. Tsao, S. Matsuda, and C. Hori, “Speech enhancement
based on deep denoising autoencoder.” in Interspeech, 2013, pp.
436–440.
[9] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Man-
zagol, “Stacked denoising autoencoders: Learning useful repre-
sentations in a deep network with a local denoising criterion,”
Journal of Machine Learning Research, vol. 11, no. Dec, pp.
3371–3408, 2010.
[10] G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimension-
ality of data with neural networks,” Science, vol. 313, no. 5786,
pp. 504–507, 2006.
[11] J. B. Kuipers, Quaternions and Rotation Sequences. Princeton University Press, Princeton, NJ, USA, 1999.
[12] F. Zhang, “Quaternions and matrices of quaternions,” Linear Algebra and its Applications, vol. 251, pp. 21–57, 1997.
[13] J. Ward, Quaternions and Cayley numbers: Algebra and applica-
tions. Springer, 1997, vol. 403.
[14] P. Arena, L. Fortuna, G. Muscato, and M. G. Xibilia, “Multilayer perceptrons to approximate quaternion valued functions,” Neural Networks, vol. 10, no. 2, pp. 335–342, 1997.
[15] Y. Bengio, “Learning deep architectures for AI,” Foundations and Trends in Machine Learning, vol. 2, no. 1, pp. 1–127, 2009.
[16] F. Bechet, B. Maza, N. Bigouroux, T. Bazillon, M. El-Beze,
R. De Mori, and E. Arbillot, “Decoda: a call-centre human-human
spoken conversation corpus.” in LREC, 2012, pp. 1343–1347.
[17] G. Linares, P. Nocéra, D. Massonie, and D. Matrouf, “The LIA speech recognition system: From 10xRT to 1xRT,” in Text, Speech and Dialogue. Springer, 2007, pp. 302–308.
[18] D. M. Blei, A. Y. Ng, and M. I. Jordan, “Latent Dirichlet allocation,” Journal of Machine Learning Research, vol. 3, pp. 993–1022, 2003.
... Recently, researchers [10][11][12][13] have experimented with the quaternion extension of CNN and produced outperforming results compared to real-valued CNN. In this experiment, the channel and spatial attention modules of a quaternion residual quaternion network were employed to improve the performance of predicting Pneumonia from CXR images. ...
... The authors of 22 studied the influence of the Hamilton product on the grayscale-only reconstruction of color images. To reconstruct a unique grayscale image, a quaternion convolutional encoder-decoder architecture is created in 12 . In contrast to standard convolutional encoder-decoder networks, their method can efficiently learn to reconstruct an image's colors from its grayscale representation. ...
... ⊗ operator is used to represent the Hamiltonian product of two quaternions Q 1 and W 1, and it is defined as Eq. (12). ...
Article
Full-text available
Worldwide, pneumonia is the leading cause of infant mortality. Experienced radiologists use chest X-rays to diagnose pneumonia and other respiratory diseases. The diagnostic procedure's complexity causes radiologists to disagree with the decision. Early diagnosis is the only feasible strategy for mitigating the disease's impact on the patent. Computer-aided diagnostics improve the accuracy of diagnosis. Recent studies established that Quaternion neural networks classify and predict better than real-valued neural networks, especially when dealing with multi-dimensional or multi-channel input. The attention mechanism has been derived from the human brain's visual and cognitive ability in which it focuses on some portion of the image and ignores the rest portion of the image. The attention mechanism maximizes the usage of the image's relevant aspects, hence boosting classification accuracy. In the current work, we propose a QCSA network (Quaternion Channel-Spatial Attention Network) by combining the spatial and channel attention mechanism with Quaternion residual network to classify chest X-Ray images for Pneumonia detection. We used a Kaggle X-ray dataset. The suggested architecture achieved 94.53% accuracy and 0.89 AUC. We have also shown that performance improves by integrating the attention mechanism in QCNN. Our results indicate that our approach to detecting pneumonia is promising.
... In other words, KGC infers the implicit triples based on the true triplets that exist in the KGs. For example, if the triple (Bill Clinton, Friendship, Seven Spielberg) is correct, i.e. Bill Clinton has a friend- Conventional approaches for KGC have achieved substantial improvement via embedding entities and relations into low-dimensional continuous space, such as TransE (Bordes et al. 2013), TransH(Wang et al. 2014), TransD (Ji et al. 2015), TransR (Lin et al. 2015), etc. Instead of using a realvalued space, ComplEx (Trouillon et al. 2016), RotatE ) project entities and relations into a complex space preforming the strong representation ability. ...
... TransH donates the normal vector in the hyperplane as w r : S r (e) = e − w r ew r . TransR (Lin et al. 2015) proposes to use a relation-specific projection matrix M r rather than hyperplane to project the entities embedding vectors into the space: S r (e) = M r e. It can be seen that large-scale parameters will be designed, so training the model is demanding which requires a lot of storage space. ...
... The scoring functions of semantic matching models reflect the confidence of the semantic information of the triples. RESCAL (Nickel, Tresp, and Kriegel 2011) represents each relation as a full rank matrix, optimizing a scoring function that computes a bilinear product between head and tail entity embedding vectors and relation matrix. Due to the large number of parameters, the model has overfitting problem. ...
Preprint
Full-text available
In recent years, knowledge graph completion methods have been extensively studied, in which graph embedding approaches learn low dimensional representations of entities and relations to predict missing facts. Those models usually view the relation vector as a translation (TransE) or rotation (rotatE and QuatE) between entity pairs, enjoying the advantage of simplicity and efficiency. However, QuatE has two main problems: 1) The model to capture the ability of representation and feature interaction between entities and relations are relatively weak because it only relies on the rigorous calculation of three embedding vectors; 2) Although the model can handle various relation patterns including symmetry, anti-symmetry, inversion and composition, but mapping properties of relations are not to be considered, such as one-to-many, many-to-one, and many-to-many. In this paper, we propose a novel model, QuatDE, with a dynamic mapping strategy to explicitly capture a variety of relational patterns, enhancing the feature interaction capability between elements of the triplet. Our model relies on three extra vectors donated as subject transfer vector, object transfer vector and relation transfer vector. The mapping strategy dynamically selects the transition vectors associated with each triplet, used to adjust the point position of the entity embedding vectors in the quaternion space via Hamilton product. Experiment results show QuatDE achieves state-of-the-art performance on three well-established knowledge graph completion benchmarks. In particular, the MR evaluation has relatively increased by 26% on WN18 and 15% on WN18RR, which proves the generalization of QuatDE.
... Shang and Hiros [37], proposed a quaternion neural-network-based PolSAR for land classification in Pointcare-sphere-parameter space. Parcollet et al. studied the applications of a deep quaternion neural network to speech recognition [38,39]. Gaudet and Maidat [39], and Parcollet et al. [40] investigated the use of quaternion convolution networks for image processing on the CIFAR and KITTI datasets and an end-to-end automatic speech recognition problem respectively. ...
... Parcollet et al. studied the applications of a deep quaternion neural network to speech recognition [38,39]. Gaudet and Maidat [39], and Parcollet et al. [40] investigated the use of quaternion convolution networks for image processing on the CIFAR and KITTI datasets and an end-to-end automatic speech recognition problem respectively. Pavllo et al. modelled human motion using quarternion-based neural networks [40]. ...
Article
Full-text available
Recurrent Neural Networks (RNNs) are known for their ability to learn relationships within temporal sequences. Gated Recurrent Unit (GRU) networks have found use in challenging time-dependent applications such as Natural Language Processing (NLP), financial analysis and sensor fusion due to their capability to cope with the vanishing gradient problem. GRUs are also known to be more computationally efficient than their variant, the Long Short-Term Memory neural network (LSTM), due to their less complex structure and as such, are more suitable for applications requiring more efficient management of computational resources. Many of such applications require a stronger mapping of their features to further enhance the prediction accuracy. A novel Quaternion Gated Recurrent Unit (QGRU) is proposed in this paper, which leverages the internal and external dependencies within the quaternion algebra to map correlations within and across multidimensional features. The QGRU can be used to efficiently capture the inter- and intra-dependencies within multidimensional features unlike the GRU, which only captures the dependencies within the sequence. Furthermore, the performance of the proposed method is evaluated on a sensor fusion problem involving navigation in Global Navigation Satellite System (GNSS) deprived environments as well as a human activity recognition problem. The results obtained show that the QGRU produces competitive results with almost 3.7 times fewer parameters compared to the GRU. The QGRU code is available at https://github.com/onyekpeu/Quarternion-Gated-Recurrent-Unit.
... Similarly, the default values of forget gate, input gate and output gate are b g , b i , b o , b C .C g specifies candicate cell, C g specifies the current space of cell, h g at the moment the value of the operator is x at the time the current state of the cell is specified. Hyperbolic (tanh) and sigmoid (σ) tangents are activation functions used in cell LSTM ( [15]). ...
Article
Various predictive methods have been applied to predict the value of stocks. The purpose of this research is to implement the discrete Hilbert transform in stock returns. The ability to predict stock price movements has big implications for investors. Traditional methods are often limited in capturing the complexity of market dynamics. It was found that the proposed method obtained an average of MAE, RMSE and MAPE values of 0.02055, 0.02237, and 0.012985 which is lower than the conventional LSTM method. This research provides a new understanding of the application of discrete Hilbert transform in a dynamic global financial context.
... These hyper-complex numbers are composed of one real and three imaginary components making them ideal for three or four dimensional data. Quaternion neural networks (QNNs) have enjoyed a surge in recent research and show promising results [3,4,5,6,7,8]. Quaternion networks have been shown to be effective at capturing relations within multidimensional data of four or fewer dimensions. ...
... These hyper-complex numbers are composed of one real and three imaginary components making them ideal for three or four dimensional data. Quaternion neural networks (QNNs) have enjoyed a surge in recent research and show promising results [3,4,5,6,7,8]. Quaternion networks have been shown to be effective at capturing relations within multidimensional data of four or fewer dimensions. ...
Preprint
Full-text available
We show that the core reasons that complex and hypercomplex valued neural networks offer improvements over their real-valued counterparts is the weight sharing mechanism and treating multidimensional data as a single entity. Their algebra linearly combines the dimensions, making each dimension related to the others. However, both are constrained to a set number of dimensions, two for complex and four for quaternions. Here we introduce novel vector map convolutions which capture both of these properties provided by complex/hypercomplex convolutions, while dropping the unnatural dimensionality constraints they impose. This is achieved by introducing a system that mimics the unique linear combination of input dimensions, such as the Hamilton product for quaternions. We perform three experiments to show that these novel vector map convolutions seem to capture all the benefits of complex and hyper-complex networks, such as their ability to capture internal latent relations, while avoiding the dimensionality restriction.
... Recently, the combination of quaternions and neural networks has received more and more attention because quaternion numbers allow neural network-based models to code latent inter-dependencies between groups of input features during the learning process with fewer parameters than C-NNs. In particular, the deep quaternion network [29], [30], the deep quaternion convolutional network [31], [32], or the deep quaternion recurrent neural network [33] has been employed for challenging tasks such as images and language processing. However, there is still a lack of investigations on the combination of the quaternions and the capsule networks, so this is a topic worth studying. ...
Article
Full-text available
Knowledge graphs are collections of factual triples. Link prediction aims to predict lost factual triples in knowledge graphs. In this paper, we present a novel capsule network method for link prediction taking advantages of quaternion. More specifically, we explore two methods, including a relational rotation model called QuaR and a deep capsule neural model called CapS-QuaR to encode semantics of factual triples. QuaR model defines each relation as a rotation from the head entity to the tail entity in the hyper-complex vector space, which could be used to infer and model diverse relation patterns, including: symmetry/anti-symmetry, reversal and combination. Based on these characteristics of quaternions, we use the embeddings of entities and relations trained from QuaR as the input to CapS-QuaR model. Experimental results on multiple benchmark knowledge graphs show that the proposed method is not only scalable, but also able to predict the correctness of triples in knowledge graphs and significantly outperform the existing state-of-the-art models for link prediction. Finally, the evaluation of a real dataset for search personalization task is conducted to prove the effectiveness of our model.
Article
Quaternions are extensively used in several fields including physics, applied mathematics, computer graphics, and control systems because of their notable and unique characteristics. Embedding quaternions into deep neural networks has attracted significant attention to neurocomputing researchers in recent years. Quaternion’s algebra helps to reconstruct neural networks in the quaternionic domain. This paper comprehensively reviewed and analyzed the recent advancements in quaternion deep neural networks (QDNNs) and their practical applications. Several architectures integrating quaternions in deep neural networks such as quaternion convolutional neural networks, quaternion recurrent neural networks, quaternion self-attention networks, hypercomplex convolutional neural networks, quaternion long-short term memory networks, quaternion residual networks, and quaternion variational autoencoders are thoroughly examined and reviewed with applications. It is observed that they have outperformed conventional real-valued neural networks. This study also discusses the main discoveries and possible advanced mechanisms of QDNN for future research. The open challenges and future scopes of QDNNs are also addressed, which provides the right direction of work in this field. This review may help researchers interested in architectural advancements and their practical applications.
Article
The neurocomputing communities have focused much interest on quaternionic-valued neural networks (QVNNs) due to the natural extension in quaternionic signals, learning of inter and spatial relationships between the features, and remarkable improvement against real-valued neural networks (RVNNs) and complex-valued neural networks (CVNNs). The excellent learning capability of QVNN inspired the researchers working on various applications in image processing, signal processing, computer vision, and robotic control system. Apart from its applications, many researchers have proposed new structures of quaternionic neurons and extended the architecture of QVNN for specific applications containing high-dimensional information. These networks have revealed their performance with a lesser number of parameters over conventional RVNNs. This paper focuses on past and recent studies of simple and deep QVNNs architectures and their applications. This paper provides the future directions to prospective researchers to establish new architectures and to extend the existing architecture of high-dimensional neural networks with the help of quaternion, octonion, or sedenion for appropriate applications.
Conference Paper
Full-text available
We previously have applied deep autoencoder (DAE) for noise reduction and speech enhancement. However, the DAE was trained using only clean speech. In this study, by using noisyclean training pairs, we further introduce a denoising process in learning the DAE. In training the DAE, we still adopt greedy layer-wised pretraining plus fine tuning strategy. In pretraining, each layer is trained as a one-hidden-layer neural autoencoder (AE) using noisy-clean speech pairs as input and output (or transformed noisy-clean speech pairs by preceding AEs). Fine tuning was done by stacking all AEs with pretrained parameters for initialization. The trained DAE is used as a filter for speech estimation when noisy speech is given. Speech enhancement experiments were done to examine the performance of the trained denoising DAE. Noise reduction, speech distortion, and perceptual evaluation of speech quality (PESQ) criteria are used in the performance evaluations. Experimental results show that adding depth of the DAE consistently increase the performance when a large training data set is given. In addition, compared with a minimum mean square error based speech enhancement algorithm, our proposed denoising DAE provided superior performance on the three objective evaluations.
Article
Full-text available
Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
Article
Full-text available
The goal of the DECODA project is to reduce the development cost of Speech Analytics systems by reducing the need for manual annotation. This project aims to propose robust speech data mining tools in the framework of call-center monitoring and evaluation, by means of weakly supervised methods. The applicative framework of the project is the call-center of the RATP (Paris public transport authority). This project tackles two very important open issues in the development of speech mining methods from spontaneous speech recorded in call-centers : robustness (how to extract relevant information from very noisy and spontaneous speech messages) and weak supervision (how to reduce the annotation effort needed to train and adapt recognition and classification models). This paper describes the DECODA corpus collected at the RATP during the project. We present the different annotation levels performed on the corpus, the methods used to obtain them, as well as some evaluation of the quality of the annotations produced.
Conference Paper
Full-text available
The paper introduces new features for describing possible focus variation in a human/human conversation. The application considered is a real-life telephone customer care service. The purpose is to hypothesize the dominant theme of conversations between a casual customer calling. Conversations are processed by an automatic speech recognition system that provides hypotheses used for extracting word frequency. Features are extracted in different, broadly defined and partially overlapped, time segments. Combinations of each feature in different segments are represented in a quaternion algebra framework. The advantage of the proposed approach is made evident by the statistically significant improvements in theme classification accuracy.
Conference Paper
Full-text available
Previous work has shown that the dicul- ties in learning deep generative or discrim- inative models can be overcome by an ini- tial unsupervised learning step that maps in- puts to useful intermediate representations. We introduce and motivate a new training principle for unsupervised learning of a rep- resentation based on the idea of making the learned representations robust to partial cor- ruption of the input pattern. This approach can be used to train autoencoders, and these denoising autoencoders can be stacked to ini- tialize deep architectures. The algorithm can be motivated from a manifold learning and information theoretic perspective or from a generative model perspective. Comparative experiments clearly show the surprising ad- vantage of corrupting the input of autoen- coders on a pattern classification benchmark suite.
Article
Quaternions are a class of hypercomplex number systems, a four-dimensional extension of imaginary numbers, which are extensively used in various fields such as modern physics and computer graphics. Although the number of applications of neural networks employing quaternions is comparatively less than that of complex-valued neural networks, it has been increasing recently. In this chapter, the authors describe two types of quaternionic neural network models. One type is a multilayer perceptron based on 3D geometrical affine transformations by quaternions. The operations that can be performed in this network are translation, dilatation, and spatial rotation in three-dimensional space. Several examples are provided in order to demonstrate the utility of this network. The other type is a Hopfield-type recurrent network whose parameters are directly encoded into quaternions. The stability of this network is demonstrated by proving that the energy decreases monotonically with respect to the change in neuron states. The fundamental properties of this network are presented through the network with three neurons.
Article
In this paper a new type of multilayer feedforward neural network is introduced. Such a structure, called hypercomplex multilayer perceptron (HMLP), is developed in quaternion algebra and allows quaternionic input and output signals to be dealt with, requiring a lower number of neurons than the real MLP, thus providing a reduced computational complexity. The structure introduced represents a generalization of the multilayer perceptron in the complex space (CMLP) reported in the literature. The fundamental result reported in the paper is a new density theorem which makes HMLPs universal interpolators of quaternion valued continuous functions. Moreover the proof of the density theorem can be restricted in order to formulate a density theorem in the complex space. Due to the identity between the quaternion and the four-dimensional real space, such a structure is also useful to approximate multidimensional real valued functions with a lower number of real parameters, decreasing the probability of being trapped in local minima during the learning phase. A numerical example is also reported in order to show the efficiency of the proposed structure. © 1997 Elsevier Science Ltd. All Rights Reserved.
Article
We give a brief survey on quaternions and matrices of quaternions, present new proofs for certain known results, and discuss the quaternionic analogues of complex matrices. The methods of converting a quaternion matrix to a pair of complex matrices and homotopy theory are emphasized.