Content available from Security and Communication Networks
Research Article
Blind Key Based Attack Resistant Audio Steganography Using
Cocktail Party Effect
Barnali Gupta Banik¹ and Samir Kumar Bandyopadhyay²
¹St. Thomas' College of Engineering & Technology, Kolkata 700023, India
²University of Calcutta, Kolkata 700098, India
Correspondence should be addressed to Barnali Gupta Banik; gupta.barnali@gmail.com
Received 19 October 2017; Revised 13 January 2018; Accepted 5 February 2018; Published 16 April 2018
Academic Editor: Emanuele Maiorana
Copyright © Barnali Gupta Banik and Samir Kumar Bandyopadhyay. This is an open access article distributed under the
Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided
the original work is properly cited.
Steganography is a popular technique for digital data security. Among digital steganography methods, audio steganography is
especially delicate because the human auditory system is highly sensitive to noise; even a small modification to the audio can
have a significant audible impact. This paper proposes a key-based blind audio steganography method that is built on the
discrete wavelet transform (DWT) and the discrete cosine transform (DCT) and adheres to Kerckhoffs's principle. The secret
message is an image, which is preprocessed using Arnold's transform. To make the system more robust and harder to detect, a
well-known problem of audio analysis, the Cocktail Party Problem, is exploited to wrap the stego audio. The robustness of the
proposed method has been tested against steganalysis attacks such as noise addition, random cropping, resampling,
requantization, pitch shifting, and MP3 compression. The quality of the resulting stego audio and of the retrieved secret image
has been measured with several metrics, namely, peak signal-to-noise ratio, correlation coefficient, perceptual evaluation of
audio quality, bit error rate, and structural similarity index. The embedding capacity has also been evaluated and, as the
comparison results show, the proposed method outperforms other existing DCT-DWT based techniques.
1. Introduction
In the present era, communication over the Internet is vulnerable, since intruders may
eavesdrop on secret messages, capture them, and distribute them for unlawful purposes.
It is therefore necessary to camouflage a secret message in such a way that the stego
object cannot be identified as the carrier of a secret message. Camouflaging a secret
message inside a carrier object is the age-old technique of steganography. However, with
the enormous use of the Internet and the rise of various steganalysis attacks,
steganography techniques need an extra shield. This is the reason the cocktail party
effect has been explored in audio steganography: to ensure enhanced security during
data transmission.
2. Related Work
2.1. Audio Steganography Techniques. In audio steganography, audio is used as the cover
medium. In [], the authors have described different spatial and frequency domain
techniques of audio steganography. The popular spatial domain techniques are as follows.
Least Significant Bit (LSB) Encoding. This is the simplest method of audio steganography:
the least significant bit of each audio sample is replaced with a bit of the secret
message vector. Because the method is so widely used, it is prone to attack, and its
embedding capacity is poor compared to other methods. To increase capacity, the authors
of [] have proposed an enhanced LSB technique, showing that modification of the 2nd and
3rd LSB does not make an audible difference in the audio sample. In [], the authors have
suggested another enhancement of the LSB technique by shifting the LSB modification from
the 3rd bit to the th bit, which yields more embedding capacity than previous LSB
encoding methods.
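The basic LSB idea described above can be illustrated with a minimal sketch. It is not the implementation of any cited work; the function names `lsb_embed` and `lsb_extract`, the 16-bit sample type, and the sample values are assumptions made for illustration only.

```python
import numpy as np

def lsb_embed(samples: np.ndarray, bits: np.ndarray) -> np.ndarray:
    """Write one message bit into the least significant bit of each leading sample."""
    if len(bits) > len(samples):
        raise ValueError("message is longer than the cover")
    stego = samples.copy()
    # Clear bit 0 of each carrier sample, then OR in the message bit.
    stego[:len(bits)] = (stego[:len(bits)] & ~1) | bits
    return stego

def lsb_extract(stego: np.ndarray, n_bits: int) -> np.ndarray:
    """Read the least significant bit of the first n_bits samples."""
    return stego[:n_bits] & 1

# Hypothetical 16-bit PCM samples and a 4-bit message.
cover = np.array([1000, -2001, 37, 12, -5, 256], dtype=np.int16)
message = np.array([1, 0, 1, 1], dtype=np.int16)
stego = lsb_embed(cover, message)
recovered = lsb_extract(stego, len(message))
```

Each carrier sample changes by at most one quantization step, which is why the method is hard to hear but also easy to destroy or detect.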
Parity Encoding. In this approach, the audio signal is broken into a number of samples [].
Depending on each sample's parity bit, the secret message is embedded in the LSB of the
sample byte stream.
Hindawi, Security and Communication Networks, Volume 2018, Article ID 1781384, 21 pages, https://doi.org/10.1155/2018/1781384
Figure: Block diagram of the DWT (input I passes through high-pass and low-pass filter stages, each followed by downsampling by 2, producing the HH, HL, LH, and LL subbands).
Echo Hiding. In this method, a short echo signal is introduced
as part of cover audio where secret message is hidden [].
Study shows that the echo signal is inaudible provided the
delay between cover audio and echo signal is up to ms.
The widespread frequency domain techniques are as follows.
Phase Coding. Because the human auditory system cannot perceive modulation of the phase
component, in this technique secret data is embedded by modifying selected phase
components of the cover audio signal. Using a psychoacoustic model, a threshold is
calculated which can be used as the masking threshold []. In [], the authors have used
the difference between the phase values of the selected component frequencies and their
adjacent frequencies of the cover signal as a medium to hide secret data bits. This
method provides more robustness than the previous approaches.
Spread Spectrum. The basic principle of spread spectrum is to spread the secret message
over the frequency spectrum of the cover audio signal. In [], Direct Sequence Spread
Spectrum is used to hide text data in audio; a key is used to embed the message in the
noise. In [], the authors have found that a low spreading rate improves the performance
of spread spectrum audio steganography. The authors have therefore proposed a technique
that decreases the correlation between the original signal and the spread data signal by
applying a phase shift in each subband signal of the original audio.
Discrete Wavelet Transforms (DWT). The DWT decomposes a signal into four frequency
components, popularly known as subbands. These subbands are Low-Low (LL), Low-High (LH),
High-Low (HL), and High-High (HH), as shown in the DWT block diagram. The LL subband
describes approximation details. The HL band captures variation along the x-axis, or
horizontal details, and the LH band captures variation along the y-axis, or vertical
details []. In other words, the low frequency subband is a low-pass approximation of the
original signal and contains most of the signal's energy. The other subbands mainly
contain detail components, which have a low energy level. This is the reason the LH
subband is very popular for data hiding.
In [], the authors have proposed a method that computes the DWT of the cover audio and
selects the higher frequency to embed image data using a low-bit encoding technique.
In [], the authors have decomposed the cover audio signal using the Haar DWT and then
chosen coefficients in which to embed data; a precalculated threshold value is used to
flip data. In [], secret audio is embedded using a synchronizing code in the low
frequency part of the DWT of the cover audio.
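To make the four-subband decomposition concrete, the following is a minimal sketch of a one-level 2D Haar DWT and its inverse written directly in NumPy. The function names, the 8 × 8 reshaping of a short test signal, and the choice of which detail band is labeled LH versus HL are assumptions for illustration; labeling conventions differ between libraries.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_dwt2(x):
    """One-level 2D Haar DWT of an array with even height and width."""
    x = np.asarray(x, dtype=float)
    # Low-pass / high-pass filtering along each row, with downsampling by 2.
    lo = (x[:, 0::2] + x[:, 1::2]) / SQRT2
    hi = (x[:, 0::2] - x[:, 1::2]) / SQRT2
    # The same filtering along each column of both intermediate results.
    ll = (lo[0::2, :] + lo[1::2, :]) / SQRT2
    lh = (lo[0::2, :] - lo[1::2, :]) / SQRT2  # assumed label: low (rows), high (columns)
    hl = (hi[0::2, :] + hi[1::2, :]) / SQRT2
    hh = (hi[0::2, :] - hi[1::2, :]) / SQRT2
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2."""
    rows, cols = ll.shape
    lo = np.empty((2 * rows, cols))
    hi = np.empty((2 * rows, cols))
    lo[0::2, :], lo[1::2, :] = (ll + lh) / SQRT2, (ll - lh) / SQRT2
    hi[0::2, :], hi[1::2, :] = (hl + hh) / SQRT2, (hl - hh) / SQRT2
    x = np.empty((2 * rows, 2 * cols))
    x[:, 0::2], x[:, 1::2] = (lo + hi) / SQRT2, (lo - hi) / SQRT2
    return x

# A one-dimensional test signal reshaped into a matrix before the 2D transform.
audio = np.sin(np.arange(64) / 5.0)
matrix = audio.reshape(8, 8)
ll, lh, hl, hh = haar_dwt2(matrix)
restored = haar_idwt2(ll, lh, hl, hh)
```

Because the Haar filters used here are orthonormal, the total energy of the four subbands equals the energy of the input, and the inverse transform restores the input exactly up to floating-point error.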
Discrete Cosine Transforms (DCT). The DCT converts a signal from the spatial domain into
the frequency domain by decomposing it into a series of cosine functions. The
two-dimensional DCT can be computed by executing a one-dimensional DCT twice, first in
the x direction and then in the y direction. For an input signal f with M rows and N
columns, the 2D DCT output F is given by

$$F_{x,y} = \alpha_x \alpha_y \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} f_{ij} \cos\frac{\pi(2i+1)x}{2M} \cos\frac{\pi(2j+1)y}{2N},$$

where $0 \le x \le M-1$, $0 \le y \le N-1$, and

$$\alpha_x = \begin{cases} 1/\sqrt{M}, & x = 0 \\ \sqrt{2/M}, & 1 \le x \le M-1, \end{cases} \qquad \alpha_y = \begin{cases} 1/\sqrt{N}, & y = 0 \\ \sqrt{2/N}, & 1 \le y \le N-1. \end{cases}$$

The inverse 2D DCT transforms frequency domain coefficients back into the spatial domain
signal:

$$f_{ij} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \alpha_x \alpha_y F_{x,y} \cos\frac{\pi(2i+1)x}{2M} \cos\frac{\pi(2j+1)y}{2N},$$

where $0 \le i \le M-1$ and $0 \le j \le N-1$.
The DCT can be performed block by block, for example on 4 × 4, 8 × 8, and 16 × 16 blocks.
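The separable structure of the 2D DCT (a one-dimensional DCT applied along each dimension) can be sketched as two matrix products with an orthonormal DCT matrix. This is an illustrative sketch only; the helper names `dct_matrix`, `dct2`, and `idct2` and the 4 × 4 test block are assumptions.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT matrix C with C[x, i] = alpha_x * cos(pi * (2i + 1) * x / (2n))."""
    x = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.cos(np.pi * (2 * i + 1) * x / (2 * n))
    c[0, :] *= np.sqrt(1.0 / n)   # alpha for the zero-frequency row
    c[1:, :] *= np.sqrt(2.0 / n)  # alpha for all other rows
    return c

def dct2(block):
    """2D DCT: one 1D DCT along the rows dimension, one along the columns dimension."""
    m, n = block.shape
    return dct_matrix(m) @ block @ dct_matrix(n).T

def idct2(coeffs):
    """Inverse 2D DCT; the DCT matrix is orthonormal, so its transpose inverts it."""
    m, n = coeffs.shape
    return dct_matrix(m).T @ coeffs @ dct_matrix(n)

block = np.arange(16, dtype=float).reshape(4, 4)
coeffs = dct2(block)
restored = idct2(coeffs)
```

With this normalization the top-left coefficient of a 4 × 4 block equals the sum of the block divided by 4.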
As shown in Figure (a), the top left coefficient is called the DC coefficient and holds
the approximate value of the whole signal; it is the coefficient with zero frequency.
The remaining coefficients are called AC coefficients; they hold the most detailed
parameters of the signal and have nonzero frequency. Some DCT coefficients hold quite
similar values. The human brain is less sensitive to changes in regions where all the
elements hold more or less the same value. Therefore, this region of similar values can
be selected for data hiding. This region is known as the midband region, as shown in
Figure (b).
In [], the authors have used a speech signal as cover, where the voiced and nonvoiced
parts of the speech are separated by zero crossing count and short time energy. The
secret data is embedded by modifying the DCT coefficients of the nonvoiced part. In [],
the authors have decomposed the cover audio into 8 × 8 nonoverlapping blocks, and secret
data is hidden in the DC coefficient and the th AC coefficient in line. In [], the
authors have embedded secret data in the low frequency component of DCT quantization.
In [], the authors have decomposed the cover audio into 8 × 8 blocks and then decomposed
each of those blocks further into 4 × 4 frames. Embedding of the secret message depends
on the difference between the first or last two frames.
Figure: (a) DC and AC coefficients in a 4 × 4 block; (b) midband region of a 4 × 4 block.
2.2. Correlation Coefficient (CC). A correlation coefficient is a measure of the linear
relationship between two random variables. This term was first coined by Karl Pearson.
The value of a correlation coefficient can vary from −1 to 1. A value of exactly −1 or 1
indicates that the two variables are perfectly linearly related. A value of 0 indicates
that there is no linear relation between the variables. Moreover, the sign indicates
whether the variables are positively or negatively related [].
There are three types of correlation coefficients: Pearson's coefficient ($r_p$),
Spearman's rho coefficient ($r_s$), and Kendall's tau coefficient ($\tau$). Pearson's
coefficient, also known as the product-moment correlation coefficient, is the most
widely used. For paired measurements $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ it is
given by

$$r_p = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}},$$

where $\bar{x}$ and $\bar{y}$ are the means of $(x_1, x_2, \ldots, x_n)$ and
$(y_1, y_2, \ldots, y_n)$, respectively. The correlation coefficient can also be used as
a quality metric to measure the similarity between two signals.
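Pearson's product-moment coefficient for paired measurements can be computed directly from its definition, as in the short sketch below. The function name `pearson` and the sample data are illustrative assumptions.

```python
import numpy as np

def pearson(x, y):
    """Pearson's product-moment correlation coefficient of two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the mean of x
    dy = y - y.mean()  # deviations from the mean of y
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))

r_positive = pearson([1, 2, 3, 4], [2, 4, 6, 8])     # perfectly positively related
r_negative = pearson([1, 2, 3, 4], [8, 6, 4, 2])     # perfectly negatively related
r_unrelated = pearson([1, 2, 3, 4], [1, -1, -1, 1])  # no linear relation
```

The three calls illustrate the extreme values and the zero case discussed above.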
2.3. Arnold Transform. Arnold's Transform is a chaotic bidirectional map proposed by
Vladimir Arnold. A chaotic map is an evolution function that exhibits some sort of
chaotic behavior, as seen in the following transformation function:

$$\Gamma : \mathbb{T}^2 \to \mathbb{T}^2, \qquad \Gamma : (x, y) \to (2x + y,\; x + y) \bmod 1.$$

Figure: Representation of point (a, b) sheared to point (a′, b′).
An image is a collection of pixels arranged in rows and columns, which can form a square
or nonsquare shape. Applying the Arnold transform to an image scrambles it by repeated
iteration (fewer iterations scramble less and more iterations scramble more), which
makes the image content imperceptible. This unrecognizable image format can be used for
secure data hiding, as it does not reveal the existence of secret data. Hence scrambling
an image can be a preprocessing step of a data hiding technique.
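Traditionally the Arnold transform could be applied only to square matrices; it was
later extended to apply to any matrix, by

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \bmod N, \qquad \text{where } x, y \in \{0, 1, 2, \ldots, N-1\},$$

where (x, y) is the position of an element in the original matrix, (x′, y′) is its
position in the transformed matrix, and N is the order of the matrix. As shown in the
figure of the sheared point, the point (x, y) is sheared along the x- and y-axes to get
(x′, y′).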
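The function mod N is important for regenerating the original N × N image. The functions
for shearing along the x-axis, shearing along the y-axis, and taking the modulo are

$$\text{(a) shear along the } x\text{-axis:} \quad \begin{bmatrix} x \\ y \end{bmatrix} \to \begin{bmatrix} x + y \\ y \end{bmatrix}$$

$$\text{(b) shear along the } y\text{-axis:} \quad \begin{bmatrix} x \\ y \end{bmatrix} \to \begin{bmatrix} x \\ x + y \end{bmatrix}$$

$$\text{(c) modulo function:} \quad \begin{bmatrix} x \\ y \end{bmatrix} \to \begin{bmatrix} x \\ y \end{bmatrix} \bmod N$$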
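Arnold transformation is reversible []. There are two ways to recover the original image
from the scrambled image: the traditional way relies on periodicity, and the better
approach is to use the inverse matrix, which is also known as the Reverse Arnold
Transformation [] and is expressed by

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} x' \\ y' \end{bmatrix} \bmod N.$$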
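The forward map with matrix [1 1; 1 2] and the reverse map with matrix [2 −1; −1 1] can be sketched for a square N × N array as follows. This is a minimal illustration under the assumption that x indexes columns and y indexes rows; the function names and the 8 × 8 test array are hypothetical.

```python
import numpy as np

def arnold(img, iterations):
    """Scramble a square array: the pixel at (x, y) moves to (x + y, x + 2y) mod N."""
    n = img.shape[0]
    if img.shape != (n, n):
        raise ValueError("this sketch handles square arrays only")
    y, x = np.indices((n, n))  # y is the row index, x is the column index
    out = img
    for _ in range(iterations):
        scrambled = np.empty_like(out)
        scrambled[(x + 2 * y) % n, (x + y) % n] = out[y, x]
        out = scrambled
    return out

def inverse_arnold(img, iterations):
    """Undo arnold(): the pixel at (x', y') moves back to (2x' - y', -x' + y') mod N."""
    n = img.shape[0]
    y, x = np.indices((n, n))
    out = img
    for _ in range(iterations):
        restored = np.empty_like(out)
        restored[(-x + y) % n, (2 * x - y) % n] = out[y, x]
        out = restored
    return out

image = np.arange(64).reshape(8, 8)
scrambled = arnold(image, 3)
recovered = inverse_arnold(scrambled, 3)
```

Using the inverse matrix restores the image after the same number of iterations, without having to iterate forward through the full period of the map.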
In [], the authors have used Arnold's transformation to scramble the image before
embedding it into the DWT coefficients of the cover audio. In [], the authors have
embedded a scrambled image in the “Redundant Discrete Wavelet Transform” coefficients
using the Singular Value Decomposition (SVD) technique. In [], the authors have proposed
data hiding in the DWT and DCT domain using SVD, where the secret image is scrambled
before embedding.
2.4. Cocktail Party Problem. The Cocktail Party Problem is a classic example of source
separation and is very popular in digital signal processing. In this problem, several
people are talking to each other in a banquet room and a listener is trying to recognize
one specific speech from that crowd of partying guests. The human brain can distinguish
one specific signal component from a mixed signal in real time, which is popularly known
as “Auditory Scene Analysis.” In digital signal processing, however, it is difficult to
extract only one speaker's voice from the rest in a cocktail party situation.
In [], Colin Cherry first revealed the ability of the human auditory system to separate
a single speech or audio source from a combination of voices, which might otherwise turn
into noise, through properties such as pitch, gender, rate of speech, and/or direction
of speech. This task of separating a single source from noise is known as the dichotic
listening task []. In [], the authors have reviewed the same techniques for training a
machine to segregate signals. In [], Broadbent concluded that simultaneous listening is
possible for short messages but not for long ones. The human ability to identify audio
in a mixed signal improves when listening with two ears []. It has been seen that, in
ideal circumstances, the signal detection threshold of binaural listening is dB more
than that of monaural listening. In [], it has been stated that the cocktail party
effect can be explained by the Binaural Masking Level Difference (BMLD). As per BMLD, in
binaural listening the desired signal coming from one direction is ineffectively masked
by noise generated in a different direction. In [], Kassebaum et al. discussed two
methods for signal separation: Back Propagation (BP) and Self-Organizing Neural Network
(SONN). That experiment was carried out through a kHz channel using a modem data signal
and a male speech signal. It was concluded that BP requires more inputs and training
time than SONN.
In [], the authors have discussed three types of approach to solving the Cocktail Party
Problem:
(i) Temporal binding and oscillatory correlation
(ii) Cortronic network
(iii) Blind source separation.
In [], von der Malsburg explained the temporal binding technique. He stated that a
neuron carries two distinct signals and that the binding is accomplished by correlation.
The synchronization allows neurons to create a topological network. In [], von der
Malsburg and Schneider proposed a cocktail party processor that extends this idea: the
Oscillatory Correlation, which is the basis of Computational Auditory Scene Analysis.
In [, ], a multistage neural model has been proposed to separate speech from interfering
sounds using oscillatory correlation.
In [], the authors have proposed a biological approach to solving the Cocktail Party
Problem using an artificial neural network named the cortronic network. A cortronic
neural network describes the connections among neurons in several regions, showing the
output links of each neuron and the strength of the connections.
Blind Source Separation (BSS) is the technique of separating a signal from a mixed
source without knowledge of the source signals or the mixing process. There are
different methods of BSS, among which Principal Component Analysis (PCA), Independent
Component Analysis (ICA), and time and frequency domain approaches are significant. PCA
and ICA are both statistical approaches, which are better than time or frequency domain
approaches, since the Fourier components of data segments are fixed in the frequency
domain, whereas in the statistical domain the transformation depends on the data to be
analyzed [].
Table: Advantages and disadvantages of different approaches for solving the Cocktail Party Problem.
Approach: Temporal binding
  Advantages (as shown in [], this strategy helps): robustness against loss of network elements; richness of representation; processing speed enhancement.
  Disadvantages (as stated in [], this strategy results in): inflexible refocusing of the system onto events rapidly occurring in sequence.
Approach: Cortronic network
  Advantages (as mentioned in []): there is no requirement for knowledge of background sounds such as static, traffic, and music.
  Disadvantages (as shown in []): costly to implement, as it requires a separate artificial neural network.
Approach: Blind source separation
  Advantages (as shown in []): there is no need for knowledge of the source signals or the process of mixing; no need for defining a cut-off frequency for separation; low computational complexity; helps signal enhancement.
  Disadvantages (as reported in []): convergence speed is slow.

PCA is a mathematical technique for transforming a large correlated dataset into a small
number of major components known as principal components []. It is related to the
mathematical theory of Singular Value Decomposition (SVD), which is used to implement
PCA []. Independent Component Analysis can also be implemented with SVD, though there
are subtle differences between PCA and ICA. The aim of PCA is to find decorrelated
variables, whereas the aim of ICA is to find independent variables. Both PCA and ICA
perform matrix factorization for a linear transformation, though PCA performs low-rank
matrix factorization whereas ICA performs full-rank matrix factorization. The advantage
of ICA over PCA is that PCA removes only correlations, whereas ICA removes correlations
and higher order dependencies []. ICA is used extensively in biomedical imaging and
audio processing []. ICA transforms observed data into independent variables by
multiplying the observed data by a demixing matrix []. It relies on the assumption that
there are as many sources as channels of data available, which are to be separated as
independent sources; by utilizing this fact, ICA is used in Blind Source Separation.
In [], the author described a fast method for ICA using fixed point iteration. This
algorithm is popularly known as FastICA.
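The statement that PCA finds decorrelated variables and can be implemented with SVD can be checked with a short sketch. It illustrates only the PCA step, not ICA or FastICA; the function name `pca_svd`, the random test data, and the mixing matrix are assumptions for illustration.

```python
import numpy as np

def pca_svd(data):
    """PCA of data with shape (n_samples, n_features), computed through the SVD."""
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal directions; s holds the singular values.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt.T               # data expressed in principal components
    variances = s ** 2 / (len(data) - 1)   # variance captured by each component
    return scores, vt, variances

rng = np.random.default_rng(0)
sources = rng.normal(size=(500, 2))
mixed = sources @ np.array([[1.0, 0.6], [0.6, 1.0]])  # introduces correlation

scores, components, variances = pca_svd(mixed)
covariance = np.cov(scores, rowvar=False)
```

The covariance matrix of the scores is diagonal, so the components are decorrelated; they are not necessarily independent, which is the additional property ICA targets.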
The table of advantages and disadvantages compares the existing techniques for solving
the Cocktail Party Problem. Each technique has its own advantages and disadvantages.
However, since a blind steganographic approach is considered more robust and secure than
nonblind steganography techniques, the “Blind Source Separation” approach has been
chosen in the proposed method for handling the cocktail party effect.
3. Proposed Method
3.1. In a Nutshell. Steganography can be broadly grouped into two types: blind and
nonblind techniques. A technique in which the cover object is not required to retrieve
the secret is called blind steganography. A method in which the cover object is required
to recover the secret is called nonblind or cover escrow steganography. To create a
highly robust steganography method, a blind steganography technique is proposed here.
In the proposed method, an image is used as the secret message. This secret image is
scrambled using the Arnold transform. Then the Haar filter is applied to compute the
two-dimensional DWT of the cover source audio. Since audio is a one-dimensional signal,
it must be reshaped into a two-dimensional matrix before the 2D DWT can be performed.
Haar is simple, fast, and memory efficient compared to other available DWT filters such
as Daubechies and Coiflets. After the DWT, the LH subband is chosen for further
decomposition into 4 × 4 blocks, to which the two-dimensional DCT is applied. The
midband region of those 4 × 4 blocks (see the midband figure in Section 2.1) is chosen,
and embedding is performed by the following equation:

$$\operatorname{mid}(\operatorname{DCT}(C_a)) = \operatorname{mid}(\operatorname{DCT}(C_a)) + \alpha \times \mathrm{PN},$$

where mid(DCT(C_a)) indicates the midband frequency region of the DCT coefficients of
the cover audio C_a; α is the embedding factor; and PN is the pseudorandom number. The
equation is explained further in Section 3.6; the embedding factor (α) is discussed in
Section 3.4 and the pseudorandom number (PN) in Section 3.5.
After embedding, the resulting cover becomes the stego audio. To increase the security
of the proposed method, this stego audio is blended with other audio signals to produce
the cocktail party effect; the result is then transmitted through the web to the
intended recipient. Even if an intruder breaks into the communication channel and gains
access to the transmitted media, he would neither be able to undo the cocktail party
effect to identify the stego audio nor be able to decode the stego audio to recognize
the secret message without knowing the key required for extraction. The intended
receiver, who knows the key as well as the entire algorithm, can easily extract the
embedded secret message without any loss of data. The proposed method has also been
tested against well-known steganalysis attacks and the outcomes are encouraging
(discussed in Section 4), which supports the security of this technique.
Once the intended recipient receives the cocktail audio, s/he can separate the audio
signals using the demixing algorithm (discussed in Section 3.8) and, being aware of the
key, can then apply the extraction procedure to them. The extraction algorithm computes
correlations with the coefficients and extracts the secret bits, from which the
scrambled secret image is generated. Finally, applying the inverse Arnold transform
reconstructs the secret image. The flowcharts for the embedding and extraction
procedures are shown in the two flowchart figures.
Figure: Flowchart of embedding procedure. The secret image is scrambled by the Arnold transform; a key based linear feedback shift register produces the pseudorandom number (PN); the cover audio, source 1 (S1), goes through the 2D discrete wavelet transform and the 2D discrete cosine transform, and the midband coefficients are found; the maximum coefficient value of the LH subband (max_f) is found and the embedding factor (α) is set to max_f × multiplicative factor; the embedding function is applied, followed by the inverse 2D DCT and the inverse 2D DWT, which yields the stego audio (St); the mixing function combines St with audio source 2 (S2) into the cocktail stego audio.
Figure: Flowchart of extraction procedure. The demixing function separates the cocktail stego audio into the stego audio (St) and audio source 2 (S2); St goes through the 2D discrete wavelet transform and the 2D discrete cosine transform, and the midband coefficients are found; extraction compares correlation coefficients against the pseudorandom number (PN) from the key based linear feedback shift register; the resulting scrambled secret image is passed through the reverse Arnold transform to obtain the secret image.
3.2. Input Preparation
Cover Audio Source. Any speech or music can be used as a cover audio source. For this
demonstration, popular English songs have been chosen, as listed below. All the audio
sources have been sampled at kHz in monochannel with -bit depth, cut to seconds'
duration for optimizing embedding capacity calculation, and finally saved as .wav files.
The following are the audio sources used for this research experiment:
(1) “My Heart Will Go On” by Celine Dion from the film “Titanic” → saved as tt.wav
(2) “Beat It” by Michael Jackson from the album “Thriller” → saved as mj.wav
(3) “Like a Rolling Stone” by Bob Dylan from the album “Highway 61 Revisited” → saved as bob.wav
(4) Title song from the film “Mamma Mia!” by Meryl Streep → saved as mm.wav
(5) Title song from the film “High School Musical” by chorus → saved as hsm.wav.
Secret Image. Although any type of grayscale image (.jpg or .bmp) can be used as the
secret, binary images (.pbm) have been chosen for this experiment for better extraction
quality. In the proposed method, secret images must be transformed to binary, which is a
lossy conversion; hence true-color RGB images cannot be used, because after extraction
the retrieved image will have only two colors, black and white. The secret image size is
taken as 128 × 128, which can be increased if the length of the input cover audio source
is more than seconds. For this experiment, secret images have been either downloaded
from the Internet (these do not have any copyright restriction) or drawn with Microsoft
Paint software.
3.3. Scrambling and Descrambling Algorithm for Secret Image.
The “Arnold transform” algorithm randomizes the input image over a number of iterations
to create the scrambled image.
Input: Any binary image (in, of size m × n), number of iterations (iter)
Output: Scrambled image (out)
Algorithm: written as function Arnold(in, iter)
Step 1: Find the size of in and store it in m and n
Step 2:
for k = 1 to iter
    for y = 0 to m − 1
        for x = 0 to n − 1
            Find p = [1 1; 1 2] [x; y];
            out(mod(p(2), m) + 1, mod(p(1), n) + 1) ← in(y + 1, x + 1);
        end;
    end;
    in = out;
end;
Once applied to the scrambled image, the “Reverse Arnold Transform” algorithm returns
the original secret image after the specified number of iterations.
Input: Any scrambled binary image (in, of size m × n), number of iterations (iter)
Output: Descrambled image (out)
Algorithm: written as function iArnold(in, iter)
Step 1: Find the size of in and store it in m and n
Step 2:
for k = 1 to iter
    for y = 0 to m − 1
        for x = 0 to n − 1
            Find p = [2 −1; −1 1] [x; y];
            out(mod(p(2), m) + 1, mod(p(1), n) + 1) ← in(y + 1, x + 1);
        end;
    end;
    in = out;
end;
3.4. Embedding and Multiplicative Factors. As shown in the embedding equation in
Section 3.1, the embedding factor (α) is multiplied with PN to offset the increment of
the DCT coefficient value such that, after embedding, the stego audio does not have any
audible noise. Hence the value of α must be between 0 and 1. Repeated experiments showed
that when the embedding factor nears 1, the extracted message has very high PSNR and
SSIM, which indicates high robustness; at the same time, however, audible artifacts
appear in the stego audio that differentiate it from the cover audio. This signifies
that a value of α near 1 compromises imperceptibility. On the other hand, if the value
of α approaches 0, the stego audio is just like the original cover audio (the PSNR
between these two audios reaches around dB), whereas the extracted secret image is
completely corrupted. These test results indicate that, to get an optimum outcome, a
tradeoff must be made between robustness and imperceptibility.
While experimenting with several cover audios and various secret images, it was also
noticed that keeping a constant value of the embedding factor (α) cannot ensure a
similar quality of outcome after extraction. It was therefore decided to set α depending
on the cover to generate the optimal result. As the data hiding takes place in the LH
subband of the DWT, the maximum coefficient value of the LH subband has been chosen as
one term of the following formula for α:

Embedding Factor (α) = Multiplicative Factor × Max(coefficients of LH).

Finally, for the proposed method, a single value of the Multiplicative Factor has been
used universally, chosen on the basis of the experimental outcome shown in the table of
experimental results.
3.5. Pseudorandom Number. For embedding the secret into the cover, the proposed method
uses a “pseudorandom number” (PN); the PN is generated using a Linear Feedback Shift
Register (LFSR), as shown in the simplified LFSR block diagram. Here the LFSR has been
designed using only the right shift operator, and the operation of this shift register
is completely deterministic. It must be initialized with a set of numbers and, at any
given point, the value of the LFSR can be determined from its present state.
Table: Experimental results with different embedding and multiplicative factors. Columns: original secret; extracted secret image; embedding factor (a multiple of the maximum coefficient value of LH); PSNR of extracted secret; SSIM of extracted secret; PSNR of stego audio.
Figure: Simplified block diagram of LFSR (Bit 5, Bit 4, Bit 3, Bit 2, Bit 1).
In the proposed method, two simple algorithms have been designed to generate two
different sets of PN values for a given key with the same initial sequence of numbers.
This initial sequence can be altered at any time. Here, for ease of illustration,
“00001” has been chosen as the initial sequence.
Description: The algorithms below generate non-sequential lists of numbers in binary
base using a Linear Feedback Shift Register.
Input: A number as Key
Output: Pseudorandom numbers, PN1[] and PN2[], respectively.
Algorithm 1: written as function SRPN1(Key)
Step 1: set n = Key;
Step 2: set initial state of shift register as state = 00001
Step 3: set PN1 = [];
Step 4:
for i = 1 to n
    PN1 = [PN1 state(5)]
    if state(1) == state(4)
        then set temp = ;
        else set temp = ;
    end;
    set state(1) = state(2);
    set state(2) = state(3);
    set state(3) = state(4);
    set state(4) = state(5);
    set state(5) = temp;
end;
Algorithm 2: written as function SRPN2(Key)
Step 1: set n = Key;
Step 2: set initial state of shift register as state = 00001
Step 3: set PN2 = [];
Step 4:
for i = 1 to n
    PN2 = [PN2 state(5)]
    if state(1) == state(2)
        then set temp = ;
        else set temp = ;
    end;
    if state(4) == temp
        then set temp = ;
        else set temp = ;
    end;
    if state(5) == temp
        then set temp = ;
        else set temp = ;
    end;
    set state(1) = state(2);
    set state(2) = state(3);
    set state(3) = state(4);
    set state(4) = state(5);
    set state(5) = temp;
end;
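The two generators can be sketched in Python as a single function parameterized by tap positions. The values assigned to `temp` are not legible in the listing above, so this sketch assumes conventional XOR feedback (the feedback bit is 0 when the compared bits are equal and 1 otherwise); under that assumption Algorithm 1 taps positions 1 and 4 and Algorithm 2 taps positions 1, 2, 4, and 5. The function name `lfsr_bits` and the key value 8 are hypothetical.

```python
def lfsr_bits(key, taps, initial_state=(0, 0, 0, 0, 1)):
    """Generate `key` output bits from a 5-bit shift register.

    taps: 1-based register positions whose bits are XORed to form the feedback
    bit (XOR feedback is an assumption of this sketch).
    """
    state = list(initial_state)
    output = []
    for _ in range(key):
        output.append(state[4])          # emit state(5)
        feedback = 0
        for position in taps:
            feedback ^= state[position - 1]
        # Shift: state(k) takes state(k + 1); state(5) takes the feedback bit.
        state = state[1:] + [feedback]
    return output

pn1 = lfsr_bits(8, taps=(1, 4))        # assumed equivalent of Algorithm 1
pn2 = lfsr_bits(8, taps=(1, 2, 4, 5))  # assumed equivalent of Algorithm 2
```

Because the register is deterministic, sender and receiver obtain identical sequences whenever they use the same key, taps, and initial state.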
3.6. Embedding Algorithm. To ensure more security and imperceptibility, in the proposed
method the secret message is embedded in the transform domain using the discrete wavelet
transform (DWT) as well as the discrete cosine transform (DCT).
Description: algorithm for embedding secret data.
Input: a cover audio (C_a), secret message as an image (S_I)
Output: a stego audio (Steg Aud).
Algorithm:
Step 1: read cover audio (C_a)
Step 2: read secret message (S_I)
Step 3: set the number of iterations (iter)
Step 4: call function Arnold(S_I, iter), which returns the scrambled image (S_I)
Step 5: set Key as a number
Step 6: call function SRPN1(Key), which returns PN1[];
Step 7: call function SRPN2(Key), which returns PN2[];
Step 8: apply 2D DWT on C_a to decompose it into LL, LH, HL, and HH;
Step 9: find max_f = max(value of coefficients in LH);
Step 10: set embedding factor (α) = Multiplicative Factor × max_f
Step 11: apply 2D DCT over LH and get DCT(C_a).
Step 12: find the midband coefficient region of DCT(C_a) and term it mid(DCT(C_a));
Step 13: select PN1[] or PN2[] according to the value of the secret bit S_I(i, j), and
set mid(DCT(C_a)) = mid(DCT(C_a)) + α × PN, where PN is the selected sequence;
Step 14: perform inverse DCT to get new(LH).
Step 15: perform inverse DWT using LL, new(LH), HL, HH and get Stego
Step 16: write Stego in Steg Aud
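The core of the embedding step, together with the correlation test used later for blind extraction, can be sketched on a single 4 × 4 block of LH coefficients. This is a toy illustration under several assumptions that do not come from the paper: the midband positions (taken here as the anti-diagonal), the mapping of bit 0 to the first sequence and bit 1 to the second, the two short sequences, the block values, and the multiplicative factor of 2.0 are all hypothetical.

```python
import numpy as np

# Hypothetical midband positions of a 4x4 DCT block (the anti-diagonal).
MIDBAND = [(0, 3), (1, 2), (2, 1), (3, 0)]

def dct_matrix(n):
    """Orthonormal DCT matrix, so that C @ block @ C.T is the 2D DCT."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    c[0, :] = np.sqrt(1.0 / n)
    return c

C4 = dct_matrix(4)

def embed_bit(lh_block, bit, pn1, pn2, alpha):
    """Add alpha * PN to the midband DCT coefficients of one LH block."""
    coeffs = C4 @ lh_block @ C4.T
    pn = pn1 if bit == 0 else pn2  # assumed bit-to-sequence mapping
    for (r, c), p in zip(MIDBAND, pn):
        coeffs[r, c] += alpha * p
    return C4.T @ coeffs @ C4      # inverse DCT gives the new LH block

def extract_bit(lh_block, pn1, pn2):
    """Blind extraction: pick the sequence that correlates better with the midband."""
    coeffs = C4 @ lh_block @ C4.T
    mid = np.array([coeffs[r, c] for r, c in MIDBAND])
    corr1 = np.corrcoef(mid, pn1)[0, 1]
    corr2 = np.corrcoef(mid, pn2)[0, 1]
    return 0 if corr1 >= corr2 else 1

PN1 = [1, 0, 1, 0]  # hypothetical sequences for illustration
PN2 = [0, 1, 1, 0]
lh_block = 0.01 * np.arange(16, dtype=float).reshape(4, 4)
alpha = 2.0 * lh_block.max()  # multiplicative factor 2.0 is an arbitrary choice here
recovered = [extract_bit(embed_bit(lh_block, b, PN1, PN2, alpha), PN1, PN2)
             for b in (0, 1)]
```

Extraction needs only the two sequences, not the original cover block, which is what makes the scheme blind.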
3.7. Mixing Algorithm. This algorithm mixes two audio sources from two different
channels to create the cocktail effect of two audio signals.
Input: two monochannel .wav files (S1 and S2) having the same duration and a sampling
rate of Hz
Output: .wav files having the cocktail sound effect (S3 and S4)
Algorithm: written as function Mixing(S1, S2)
Step 1: set Gain Factor (g) as a decimal (0 < g < 1)
Step 2: read S1 and S2 into sig1 and sig2 while keeping their respective sampling
frequencies stored in Fs1 and Fs2
Step 3: set Mixed1 = sig1 + (g × sig2) and Mixed2 = sig2 + (g × sig1);
Step 4: write Mixed1 in audio file S3 with Fs1 and write Mixed2 in audio file S4 with Fs2
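The mixing step is a 2 × 2 linear mixture, which the sketch below makes explicit. The second function inverts the mixture with a known gain purely to show that the mixture is invertible; it is not blind source separation and is not the FastICA-based demixing used in the paper. Function names, test tones, and the gain of 0.5 are assumptions.

```python
import numpy as np

def mix(sig1, sig2, gain):
    """Cross-mix two equal-length signals with a gain factor between 0 and 1."""
    if not 0.0 < gain < 1.0:
        raise ValueError("gain must satisfy 0 < gain < 1")
    mixed1 = sig1 + gain * sig2
    mixed2 = sig2 + gain * sig1
    return mixed1, mixed2

def unmix_known_gain(mixed1, mixed2, gain):
    """Invert the mixing matrix [[1, g], [g, 1]]; requires knowing g (not blind)."""
    mixing = np.array([[1.0, gain], [gain, 1.0]])
    sources = np.linalg.solve(mixing, np.vstack([mixed1, mixed2]))
    return sources[0], sources[1]

# Two hypothetical test tones sampled at 8 kHz.
t = np.arange(800) / 8000.0
sig1 = np.sin(2 * np.pi * 440 * t)
sig2 = np.sin(2 * np.pi * 97 * t)
mixed1, mixed2 = mix(sig1, sig2, gain=0.5)
rec1, rec2 = unmix_known_gain(mixed1, mixed2, gain=0.5)
```

Since the determinant of the mixing matrix is 1 − g², any gain strictly between 0 and 1 keeps the mixture invertible.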
3.8. Demixing Algorithm. For demixing, the FastICA MATLAB package (ver. .) has been
used, which estimates the independent components from given multidimensional signals
using the Blind Source Separation technique.
Input: two .wav files (S3 and S4) containing mixed signals from different channels
Output: two unmixed source .wav files (S1, S2)
Algorithm: written as function Demixing(S3, S4)
Step 1: read S3 and S4 into m1 and m2 while keeping their respective sampling
frequencies stored in Fs1 and Fs2
Step 2: find the complex conjugate transpose of m1 and m2, and store them in m1t and m2t
Step 3: create one matrix from m1t and m2t, and store it in M
Step 4: set S = FastICA(M);
Step 5: extract two sources from S as source1 and source2
Step 6: write source1 in S1 with Fs1 and source2 in S2 with Fs2
3.9. Extraction Algorithm
Input: stego audio (Steg Aud)
Output: secret image (𝐼)
Algorithm:
Step 1: read Stego audio (Steg Aud) in 𝑎
Step 2: set Key as a number =
Step 3: call function SRPN() which returns PN1[];
Step 4: call function SRPN() which returns PN2[];
Step 5: apply D DWT on 𝑎 to decompose it in LL, LH, HL and HH;
Step 6: apply D DCT over LH and get (𝑎)
Step 7: find the mid-band coefficient region of (𝑎) and term it as mid((𝑎))
Step 8: if Correlation(mid((𝑎)), PN1[]) >= Correlation(mid((𝑎)), PN2[]) then 𝐼(,) = else 𝐼(,) = ; end;
Step 9: reshape the image bits stored in 𝐼 to get the secret scrambled image
Step 10: set iteration as a number =
Step 11: call function iArnold (𝐼, ) which returns secret image (𝐼)
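Two parts of the extraction procedure lend themselves to a short sketch: the correlation comparison of Step 8 and the inverse Arnold transform of Step 11. The Python code below is illustrative only. It assumes the Pearson coefficient as the correlation measure and the common Arnold map (x, y) → (x + y, x + 2y) mod N, neither of which is confirmed by the steps above. The function `closer_sequence` returns which PN sequence matched (1 or 2) and leaves the mapping to an image bit to the caller; all function names are hypothetical.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    norm = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return cov / norm if norm else 0.0


def closer_sequence(mid: Sequence[float], pn1: Sequence[float],
                    pn2: Sequence[float]) -> int:
    """Return 1 if mid correlates at least as strongly with pn1 as with pn2, else 2."""
    return 1 if pearson(mid, pn1) >= pearson(mid, pn2) else 2


def arnold(img: List[List[int]], iterations: int) -> List[List[int]]:
    """Scramble a square image with the map (x, y) -> (x + y, x + 2y) mod N."""
    n = len(img)
    for _ in range(iterations):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[(x + y) % n][(x + 2 * y) % n] = img[x][y]
        img = out
    return img


def iarnold(img: List[List[int]], iterations: int) -> List[List[int]]:
    """Undo `arnold` by reading each pixel back from its scrambled position."""
    n = len(img)
    for _ in range(iterations):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[x][y] = img[(x + y) % n][(x + 2 * y) % n]
        img = out
    return img


# A hypothetical 4 x 4 test image with distinct pixel values.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
print(iarnold(arnold(image, 3), 3) == image)  # True
```

The inverse needs the same iteration count that was used for scrambling, which is why the iteration value is set explicitly in Step 10.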
4. Experimental Results and Analysis
This proposed method has been applied to several sets of cover audio and secret images; however, for efficient use of space, here only sets of robustness test results have been presented for Steganalysis attacks.
4.1. Adherence to Kerckhoffs's Principle. In this research article, a key based steganography technique has been proposed. Hence it should follow Kerckhoffs's principle of cryptography [], which says that a method should remain secure even if the public is aware of all the details of that method except the key. As mentioned in Section ., here LFSR has been used both at the sender's end and at the receiver's end. It requires a unique key to generate the same set of pseudorandom numbers [], which are used in embedding equation () and again in Step 8 of the extraction algorithm for comparing correlation coefficients. If the exact same key is not used during embedding and extraction, then the LFSR will generate a different set of pseudorandom numbers, with which the secret image cannot be extracted from the stego audio. Hence the proposed method complies with Kerckhoffs's principle.
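The key dependence described above can be sketched with a small linear feedback shift register in Python. The 16-bit width and the tap positions (16, 14, 13, 11) are a well-known maximal-length configuration chosen here as an assumption; the register actually used in the proposed method may differ, and the function name `lfsr_bits` and both key values are hypothetical.

```python
from typing import List


def lfsr_bits(key: int, count: int) -> List[int]:
    """Return `count` output bits of a 16-bit Fibonacci LFSR seeded with `key`."""
    state = key & 0xFFFF
    if state == 0:
        raise ValueError("the key must be non-zero in its low 16 bits")
    bits = []
    for _ in range(count):
        # Feedback polynomial x^16 + x^14 + x^13 + x^11 + 1 (assumed).
        feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        bits.append(state & 1)
        state = (state >> 1) | (feedback << 15)
    return bits


sender = lfsr_bits(0xACE1, 64)       # sequence generated while embedding
receiver = lfsr_bits(0xACE1, 64)     # same key at the receiver
intruder = lfsr_bits(0xACE2, 64)     # a different key

print(sender == receiver)   # True
print(sender == intruder)   # False
```

Only a receiver holding the same key regenerates the sequence needed for the correlation test, even when the algorithm itself is public.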
4.2. Outcome of Quality Metrics
Embedding Capacity (EC). EC is measured by the ratio between the size of the hidden message (in bits) and the size of the cover (in bits), as shown in the equation below. In this research experiment, it has been observed that hiding a secret image of size 128×128 requires a cover audio size of bits, which implies an embedding capacity value of .%. Similarly, to implant a 64×64 secret image, bits of cover audio is needed; this again confirms the proportion of embedding capacity as .%.
$$\text{capacity} = \frac{\text{size of hidden data}}{\text{size of cover data}} \times 100\%.$$
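The capacity formula can be checked with a few lines of Python. The sizes below are hypothetical and are not the experimental figures: the example assumes a 128×128 image stored at 1 bit per pixel and an arbitrary cover size, and the function name `embedding_capacity` is illustrative.

```python
def embedding_capacity(hidden_bits: int, cover_bits: int) -> float:
    """Embedding capacity in percent: hidden size over cover size, times 100."""
    if cover_bits <= 0:
        raise ValueError("the cover size must be positive")
    return 100.0 * hidden_bits / cover_bits


# Hypothetical sizes for illustration only:
# a 128 x 128 image at 1 bit per pixel hidden in a 1,638,400-bit cover.
hidden = 128 * 128
print(embedding_capacity(hidden, 1_638_400))  # 1.0
```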
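Peak Signal-to-Noise Ratio (PSNR). PSNR represents the ratio between the maximum possible power of a signal and the power of the noise that distorts it, measured here between the reference signal and the test signal. The mathematical representation for PSNR is as follows:
$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{\mathrm{Max}_{sf}^{2}}{\mathrm{MSE}}\right),$$
where $\mathrm{Max}_{sf}$ is the maximum signal value or maximum fluctuation in the input image data type (e.g., for an 8-bit unsigned integer data type, $\mathrm{Max}_{sf}$ is 255) and MSE is the Mean Squared Error, which is given by
$$\mathrm{MSE} = \frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\left[\mathrm{Ref}(i,j) - \mathrm{Test}(i,j)\right]^{2},$$
where Ref represents the original signal; Test represents the degraded signal; $m$ and $n$ represent the numbers of rows and columns of the signal matrix, respectively; $i$ represents the index of the row and $j$ represents the index of the column.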
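A minimal Python sketch of the two formulas follows. The 2×2 matrices are hypothetical test data, and the default peak value of 255 assumes 8-bit unsigned samples.

```python
import math
from typing import Sequence


def mse(ref: Sequence[Sequence[float]], test: Sequence[Sequence[float]]) -> float:
    """Mean squared error between two equally sized matrices."""
    rows, cols = len(ref), len(ref[0])
    total = sum((ref[i][j] - test[i][j]) ** 2
                for i in range(rows) for j in range(cols))
    return total / (rows * cols)


def psnr(ref: Sequence[Sequence[float]], test: Sequence[Sequence[float]],
         max_value: float = 255.0) -> float:
    """PSNR in dB; returns infinity when the two matrices are identical."""
    error = mse(ref, test)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


ref = [[0, 255], [255, 0]]
test = [[0, 255], [255, 1]]        # one pixel differs by 1
print(mse(ref, test))              # 0.25
print(round(psnr(ref, test), 2))   # 54.15
```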
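Structural Similarity Index (SSIM). SSIM is a measurement of similarity, calculated through luminance, contrast, and structural differences between two images as given below.
$$\mathrm{SSIM}(S,E) = \frac{(2\mu_S\mu_E + C_1)(2\sigma_{SE} + C_2)}{(\mu_S^{2} + \mu_E^{2} + C_1)(\sigma_S^{2} + \sigma_E^{2} + C_2)},$$
where $\mu_S$ and $\mu_E$ are the means of secret image S and extracted image E, respectively; $\sigma_S$ and $\sigma_E$ are the standard deviations of S and E; $\sigma_{SE}$ is the covariance of S and E; and $C_1$ and $C_2$ are small constants that stabilize the division.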
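The SSIM expression can be sketched in Python as a single-window (global) computation over the whole image. Practical SSIM implementations usually average the index over local windows, so this is a simplification. The constants follow the commonly used defaults (0.01·L)² and (0.03·L)² for dynamic range L, which is an assumption because the constants are not specified here; the function name `global_ssim` and the test images are hypothetical.

```python
from typing import Sequence


def global_ssim(s: Sequence[Sequence[float]], e: Sequence[Sequence[float]],
                dynamic_range: float = 255.0) -> float:
    """Single-window SSIM computed over the whole image."""
    xs = [v for row in s for v in row]
    ys = [v for row in e for v in row]
    n = len(xs)
    mu_s, mu_e = sum(xs) / n, sum(ys) / n
    var_s = sum((v - mu_s) ** 2 for v in xs) / n
    var_e = sum((v - mu_e) ** 2 for v in ys) / n
    cov = sum((a - mu_s) * (b - mu_e) for a, b in zip(xs, ys)) / n
    # Commonly used stabilizing constants (assumed defaults).
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_s * mu_e + c1) * (2 * cov + c2)
    denominator = (mu_s ** 2 + mu_e ** 2 + c1) * (var_s + var_e + c2)
    return numerator / denominator


secret = [[10, 20], [30, 40]]
flipped = [[40, 30], [20, 10]]           # same pixels, reversed structure
print(global_ssim(secret, secret))       # approximately 1.0
print(global_ssim(secret, flipped) < 0)  # True
```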
Bit Error Rate (BER). BER is defined as the number of error bits divided by the total number of transmitted bits, as shown in the following equation:
$$\mathrm{BER} = \frac{\text{ErrorBit}}{\text{BitsTransmitted}} \times 100.$$
Here the BER is calculated between the original secret image and the extracted secret image.
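As a small illustration, the BER formula can be written in Python for two equal-length bit sequences. The bit values are hypothetical and the function name `bit_error_rate` is illustrative.

```python
from typing import Sequence


def bit_error_rate(original: Sequence[int], extracted: Sequence[int]) -> float:
    """Percentage of positions at which the two bit sequences differ."""
    if len(original) != len(extracted):
        raise ValueError("bit sequences must have the same length")
    if not original:
        raise ValueError("bit sequences must not be empty")
    errors = sum(1 for a, b in zip(original, extracted) if a != b)
    return 100.0 * errors / len(original)


print(bit_error_rate([1, 0, 1, 1], [1, 1, 1, 0]))  # 50.0
```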
Table shows the quality outcome of the secret and extracted images with respect to PSNR, SSIM, BER, and correlation coefficient (CC, discussed in Section .).
Table: Quality analysis of secret and extracted image. Columns: secret image (S); scrambled secret image; extracted scrambled image; extracted image (E); PSNR (S, E); SSIM (S, E); BER (S, E); CC (S, E).
Figure: Surface plot of NCC between secret and extracted image.
Perceptual Evaluation of Audio Quality (PEAQ). PEAQ is a standardized metric to evaluate audio quality utilizing human perceptual properties, the output of which is given on a scale of to (where signifies poor and implies excellent) depending on the Mean Opinion Score (MOS) of all listeners. The quality of the output audio is measured by comparing it with a reference audio.
Normalized Cross-Correlation (NCC). NCC quantifies the degree of similarity between two signals. Here it computes normalized two-dimensional cross-correlation values between two image matrices. The values of the correlation coefficients lie between −1 and 1, where 1 signifies identical images and −1 denotes totally different images. It is formulated as
$$\mathrm{NCC} = \frac{\sum_{p=1}^{A}\sum_{q=1}^{B} E(p,q)\,S(p,q)}{\sqrt{\sum_{p=1}^{A}\sum_{q=1}^{B} E(p,q)^{2}\,\sum_{p=1}^{A}\sum_{q=1}^{B} S(p,q)^{2}}},$$
where $E(p,q)$ is the extracted image and $S(p,q)$ is the reference image. NCC is used to produce a surface plot, which depicts the functional relationship between two independent variables, mapped to a plane parallel to the xy-plane. Here, in Figure , the surface plot of NCC between the secret and extracted image has been shown.
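The NCC computation can be sketched in Python as follows. The sketch assumes the usual form with a square root in the denominator and no mean subtraction. The signed 2×2 matrices are hypothetical test data; images with only non-negative pixel values would give NCC values between 0 and 1.

```python
import math
from typing import Sequence


def ncc(extracted: Sequence[Sequence[float]],
        reference: Sequence[Sequence[float]]) -> float:
    """Normalized cross-correlation of two equally sized matrices."""
    xs = [v for row in extracted for v in row]
    ys = [v for row in reference for v in row]
    numerator = sum(a * b for a, b in zip(xs, ys))
    denominator = math.sqrt(sum(a * a for a in xs) * sum(b * b for b in ys))
    return numerator / denominator if denominator else 0.0


pattern = [[1.0, -2.0], [3.0, -4.0]]
negated = [[-v for v in row] for row in pattern]
print(round(ncc(pattern, pattern), 6))   # 1.0
print(round(ncc(pattern, negated), 6))   # -1.0
```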
In Table , the quality analysis of the cover and stego audio is shown in terms of PSNR, PEAQ, and CC.
4.3. Robustness Tests by Steganalysis Attacks
By Random Cropping. On average, English music or a full song has a duration of over minutes, that is, more than seconds. In this proposed method, only seconds of audio is required to hide a secret image of size 128×128. This secret can be kept anywhere within the stego audio, that is, at the start, at the end, or after th seconds; in short, the secret can be moved throughout the cover and the exact place of hiding is not predetermined. That is why out of attempts of random cropping leave the secret image intact, as the stego audio has been cropped elsewhere. For the remaining out of attempts, that is, when the stego audio has been cropped in such a place where the secret image was embedded, Figures (a), (b), and (c) provide the results.
Figure: (a) Graphical plot of audio during cropping; (b) scrambled secret; (c) extracted secret.
As shown in Figure (a), from a stego audio of seconds’ duration, a -second-long window (from nd to th second) has been chosen and the remaining audio signal has been replaced with zero. When the intended recipient applies the extraction mechanism on such modified stego audio, it generates only a portion of the scrambled secret image, as shown in Figure (b). However, when the “Reverse Arnold Transform” has been applied on such a partially scrambled secret image, it still recovers the extracted secret as shown in Figure (c). Quality analysis of the extracted secret image has revealed a PSNR value of . and an SSIM value of ., when compared with the original secret image which was embedded.
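The cropping attack described above, in which a window is kept and the rest of the signal is replaced with zero, can be sketched in Python. The window is given in sample indices, so converting from seconds by multiplying with the sampling rate is left to the caller. The function name `crop_keep_window` and the sample values are hypothetical.

```python
from typing import List, Sequence


def crop_keep_window(samples: Sequence[float], start: int, stop: int) -> List[float]:
    """Zero every sample outside [start, stop) while keeping the length unchanged."""
    return [v if start <= i < stop else 0.0 for i, v in enumerate(samples)]


stego = [1.0, 2.0, 3.0, 4.0, 5.0]
print(crop_keep_window(stego, 1, 3))  # [0.0, 2.0, 3.0, 0.0, 0.0]
```

Keeping the length unchanged preserves the position of the surviving samples, so the extraction procedure still reads them from the expected place.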
Table: Quality analysis of cover and stego audio. Columns: secret image and cover audio; graphic plot of cover audio; graphic plot of stego audio; PSNR (in dB); PEAQ (in MOS); CC. Rows: secret image embedded in bob.wav, mm.wav, mj.wav, and tt.wav.
Table: Experimental results of AWGN Steganalysis attack. Columns: original secret; extracted secret after adding noise at each of three SNR levels (in dB), with PSNR and SSIM reported for each extracted secret.
Table: Outcome of resampling attack. Columns: original secret embedded in cover audio of sampling rate Hz; extracted secret after changing the sampling rate to each of three other rates (in Hz), with PSNR and SSIM reported for each extracted secret.
Table: Results of requantization attack. Columns: original secret embedded in cover audio of -bit depth; extracted secret after changing the audio bit depth to each of three other bit depths, with PSNR and SSIM reported for each extracted secret.
Table: Experimental results of pitch shifting attack. Columns: original secret in cover audio; extracted secret after % reduction of audio pitch; extracted secret after % increment of audio pitch (two levels), with PSNR and SSIM reported for each extracted secret.
Table: Experimental results of MP3 compression attack. Columns: secret embedded in cover audio .wav file; extracted secret from MP3 at each of three bitrates (in kbps), with PSNR and SSIM reported for each extracted secret.
By Adding White Gaussian Noise. In this type of attack, “Additive White Gaussian Noise” (AWGN) is added to the stego audio to distort the hidden message. AWGN can be added to any signal; it has uniform power across frequencies and a Gaussian amplitude distribution over time. As shown in Table , to test the robustness of the proposed method, here noise at , , and dB of SNR (Signal-to-Noise Ratio) per sample is added to the stego audio signal, assuming the power of the stego signal is dBW (decibel-watt is a unit of power in decibel scale, relative to 1 watt).
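The AWGN attack can be sketched in Python with the standard library only. Unlike the fixed signal power assumed in the description above, this sketch measures the power of the input signal before scaling the noise, which is a simplifying assumption. The test tone, the SNR of 20 dB, the seed, and the function names are all hypothetical.

```python
import math
import random
from typing import List, Sequence


def add_awgn(signal: Sequence[float], snr_db: float, seed: int = 0) -> List[float]:
    """Add white Gaussian noise so that the result has roughly the requested SNR."""
    rng = random.Random(seed)
    signal_power = sum(v * v for v in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


def measured_snr_db(clean: Sequence[float], noisy: Sequence[float]) -> float:
    """Empirical SNR in dB between a clean signal and its noisy version."""
    signal_power = sum(v * v for v in clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10.0 * math.log10(signal_power / noise_power)


# A hypothetical 440 Hz test tone sampled at 8 kHz, used only for illustration.
tone = [math.sin(2.0 * math.pi * 440.0 * t / 8000.0) for t in range(20000)]
noisy = add_awgn(tone, snr_db=20.0)
print(round(measured_snr_db(tone, noisy), 1))
```

The measured value fluctuates slightly around the target because the noise power is estimated from a finite number of samples.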
By Resampling. While writing audio data into a file, the sampling rate of the audio is generally mentioned as Fs. In the resampling attack, at first this sampling rate has been changed to a higher or lower frequency while saving the same audio in a new file. Because resampling changes the audio file length, the modified audio has been cut or filled with zeros to maintain the same length as the original cover. Once saved, resampling has been performed again on the modified audio to revert it back to the original sampling frequency. After this round trip no audible difference is noted; however, it will distort the embedded secret message (if any). In Table , the result of such a resampling attack has been shown.
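A rough Python sketch of the resampling round trip is given below. It uses plain linear interpolation without an anti-aliasing filter and applies the cut-or-zero-fill length correction once, after the round trip; both choices are simplifications of the procedure described above. The function names, the rates, and the sample values are hypothetical.

```python
from typing import List, Sequence


def resample_linear(samples: Sequence[float], src_rate: int,
                    dst_rate: int) -> List[float]:
    """Resample by linear interpolation (no anti-aliasing filter)."""
    if len(samples) < 2:
        return list(samples)
    out_len = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate
    out = []
    for k in range(out_len):
        pos = min(k * step, len(samples) - 1)
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out


def fit_length(samples: Sequence[float], length: int) -> List[float]:
    """Cut or zero-fill a signal to exactly `length` samples."""
    return list(samples[:length]) + [0.0] * max(0, length - len(samples))


def resampling_attack(samples: Sequence[float], original_rate: int,
                      attack_rate: int) -> List[float]:
    """Change the sampling rate, revert it, and restore the original length."""
    changed = resample_linear(samples, original_rate, attack_rate)
    restored = resample_linear(changed, attack_rate, original_rate)
    return fit_length(restored, len(samples))


signal = [float(i % 7) for i in range(100)]
attacked = resampling_attack(signal, 44100, 22050)
print(len(attacked) == len(signal))  # True
```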
By Requantization. The number of bits required to express each audio sample is known as bit depth. It is a measure of sound accuracy: the higher the bit depth, the more precise the sound. In the requantization attack, the bit depth of the stego audio has been changed to distort the embedded secret image. Table illustrates the outcome of the extraction process after the requantization attack.
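The effect of changing the bit depth can be sketched in Python for samples normalized to the range [−1, 1). The rounding rule, the clipping, and the function name `requantize` are assumptions made for illustration; the sample values and the bit depth of 8 are hypothetical.

```python
from typing import List, Sequence


def requantize(samples: Sequence[float], bits: int) -> List[float]:
    """Round samples in [-1, 1) to a signed `bits`-bit grid and scale them back."""
    levels = 2 ** (bits - 1)
    out = []
    for v in samples:
        q = round(v * levels)
        q = max(-levels, min(levels - 1, q))  # clip to the representable range
        out.append(q / levels)
    return out


print(requantize([0.5, -0.25, 0.3], 8))  # [0.5, -0.25, 0.296875]
```

Samples that already lie on the coarser grid pass through unchanged, while others move by at most half a quantization step, which is the distortion the embedded bits have to survive.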
By Pitch Shifting. Pitch means the tone of a signal; it describes the quality of a sound by the rate of vibrations. In the pitch shifting attack, the original pitch of an audio is raised or lowered without modifying its length, to destroy the hidden message embedded in a stego audio. Here pitch shifting has been done by utilizing a time-scale modification algorithm called “Phase Vocoder” [], the result of which is shown in Table .
By MP3 Compression. In this Steganalysis attack, the stego .wav file has been compressed to MP3 format to eliminate redundant data, with the aim of completely removing the embedded secret message. Here the mp3write MATLAB function has been used to convert the stego .wav file into MP3 format, and the mp3read MATLAB function has been applied to read from the MP3 file during the extraction process.
Table reflects the extraction outcome from three different MP3 files of the same stego audio, which has been encoded with bitrates of kbps, kbps, and kbps, respectively.
4.4. Comparison with Existing Method. For comparison with the proposed method, research articles published in SCI indexed journals have been searched, in which data hiding in audio has been performed by DWT along with DCT and the extraction mechanism is blind. The authors of [] have proposed a DCT-DWT based data hiding technique using a -bit Barker code as synchronizing code to accommodate a 64×64 binary image as secret message. The comparison results presented in Table indicate that the proposed method outperforms the existing one in terms of quality and robustness against Steganalysis attacks.
Table: Comparison results.
Features based comparison and robustness tests | Proposed method | Existing method []
Secret message size | 128×128 | 64×64
Adherence to Kerckhoffs's principle | | N
Peak signal to noise ratio | . | -
Structural similarity index | . | -
Perceptual evaluation of audio quality | | -
Addition of white Gaussian noise | |
Random cropping Steganalysis attack | | N
Resampling Steganalysis attack | |
Requantization Steganalysis attack | |
Pitch shifting Steganalysis attack | | N
MP3 compression Steganalysis attack | |
In Table , “” signifies “satisfactory result obtained”; “N” signifies “unsatisfactory result or method does not comply”; and “-” implies “details not mentioned.”
5. Conclusion
Secret communication using age-old steganography techniques often increases the chances of detectability through perceivable noise. Hence, in this article, the cocktail party effect has been considered, which has effectively reduced the probability of detectability. This has also been demonstrated with the help of different Steganalysis techniques. Additionally, PSNR, CC, and PEAQ values are analyzed to determine the perceptual noise recorded due to secret message embedding and extraction. Since all the above results verify the undetectability and robustness of the system, it can be concluded that this audio steganography technique is successful in secret communication with very high robustness.
In future, this proposed method can be further improved by utilizing the speaker diarization technique, which determines “who spoke when.” Application of speaker diarization along with speech recognition would identify a speaker’s voice, and this concept will permit segregating a secret audio stream into multiple speech segments, ensuring another novel approach to data hiding.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
References
[] B. G. Banik and S. K. Bandyopadhyay, “Review on steganography in digital media,” International Journal of Science and Research, vol. , no. , pp. –, .
[] M. Asad, J. Gilani, and A. Khalid, “An enhanced least significant bit modification technique for audio steganography,” in Proceedings of the 1st International Conference on Computer Networks and Information Technology (ICCNIT ’11), pp. –, IEEE, Pakistan, July .
[] N. Cvejic and T. Seppanen, “Increasing the capacity of LSB-based audio steganography,” in Proceedings of the 2002 5th IEEE Workshop on Multimedia Signal Processing (MMSP ’02), pp. –, IEEE, USA, December .
[] Jayaram, Ranganatha, and Anupama, “Information Hiding Using Audio Steganography - A Survey,” The International Journal of Multimedia & Its Applications, vol. , no. , pp. –, .
[] D. Gruhl, A. Lu, and W. Bender, “Echo hiding,” in Information Hiding, vol. of Lecture Notes in Computer Science, pp. –, Springer Berlin Heidelberg, Berlin, Heidelberg, .
[] D. Xiaoxiao, M. F. Bocko, and Z. Ignjatovic, “Robustness analysis of a digital audio steganographic method based on phase manipulation,” in Proceedings of the 7th International Conference on Signal Processing (ICSP ’04), vol. , pp. –, IEEE, Beijing, China, .
[] N. Parab, M. Nathan, and K. T. Talele, “Audio Steganography Using Differential Phase Encoding,” in Technology Systems and Management, vol. of Communications in Computer and Information Science, pp. –, Springer Berlin Heidelberg, Berlin, Heidelberg, .
[] R. M. Nugraha, “Implementation of Direct Sequence Spread Spectrum steganography on audio data,” in Proceedings of the 2011 International Conference on Electrical Engineering and Informatics (ICEEI ’11), pp. –, IEEE, Indonesia, July .
[] H. Matsuoka, “Spread spectrum audio steganography using sub-band phase shifting,” in Proceedings of the 2006 International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP ’06), pp. –, IEEE, USA, December .
[] G. Prabakaran and R. Bhavani, “A modified secure digital image steganography based on discrete wavelet transform,” in Proceedings of the 2012 International Conference on Computing, Electronics and Electrical Technologies (ICCEET ’12), pp. –, IEEE, India, March .
[] N. Gupta and N. Sharma, “Dwt and LSB based Audio Steganography,” in Proceedings of the 2014 International Conference on Reliability, Optimization and Information Technology (ICROIT ’14), pp. –, IEEE, India, February .
[] S. S. Verma, R. Gupta, and G. Shrivastava, “A novel technique for data hiding in audio carrier by using sample comparison in DWT domain,” in Proceedings of the 2014 4th International Conference on Communication Systems and Network Technologies (CSNT ’14), pp. –, IEEE, India, April .
[] W. Junjie, M. Qian, M. Dongxia, and Y. Jun, “Research for synchronic audio information hiding approach based on DWT domain,” in Proceedings of the 2009 International Conference on E-Business and Information System Security (EBISS ’09), pp. –, IEEE, China, May .
[] A. Kanhe and G. Aghila, “DCT based audio steganography in voiced and un-voiced frames,” in Proceedings of the 1st International Conference on Informatics and Analytics (ICIA ’16), pp. –, ACM Press, India, August .
[] Z. Zhou and L. Zhou, “A novel algorithm for robust audio watermarking based on quantification DCT domain,” in Proceedings of the 3rd International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIHMSP ’07), pp. –, IEEE, Taiwan, November .
[] W. Yongqi and Y. Yang, “A synchronous audio watermarking algorithm based on chaotic encryption in DCT domain,” in Proceedings of the 2008 International Symposium on Information Science and Engineering (ISISE ’08), pp. –, IEEE, China, December .
[] S. Roy, N. Sarkar, A. K. Chowdhury, and S. M. A. Iqbal, “An efficient and blind audio watermarking technique in DCT domain,” in Proceedings of the 18th International Conference on Computer and Information Technology (ICCIT ’15), pp. –, IEEE, Bangladesh, December .
[] B. Ratner, “The correlation coefficient: Its values range between +/−, or do they?” Journal of Targeting, Measurement and Analysis for Marketing, vol. , no. , pp. –, .
[] L. Min, L. Ting, and H. Yu-jie, “Arnold Transform Based Image Scrambling Method,” in Proceedings of the 3rd International Conference on Multimedia Technology (ICMT ’13), pp. –, Atlantis Press, Guangzhou, China, November .
[] L. Wu, J. Zhang, W. Deng, and D. He, “Arnold transformation algorithm and anti-Arnold transformation algorithm,” in Proceedings of the 1st International Conference on Information Science and Engineering (ICISE ’09), pp. –, IEEE, China, December .
[] N. V. Lalitha, S. Rao, and P. V. JayaSree, “DWT - Arnold Transform based audio watermarking,” in Proceedings of the 2013 IEEE Postgraduate Research in Microelectronics and Electronics Asia (PrimeAsia), pp. –, IEEE, Visakhapatnam, India, December .
[] S. Gaur and V. K. Srivastava, “Robust embedding of improved arnold transformed watermark in digital images using RDWT-SVD,” in Proceedings of the 4th IEEE International Conference on Parallel, Distributed and Grid Computing (PDGC ’16), pp. –, IEEE, India, December .
[] Z. Zhang, C. Wang, and X. Zhou, “Image watermarking scheme based on Arnold transform and DWT-DCT-SVD,” in Proceedings of the 13th IEEE International Conference on Signal Processing (ICSP ’16), pp. –, IEEE, China, November .
[] E. C. Cherry, “Some experiments on the recognition of speech, with one and with two ears,” The Journal of the Acoustical Society of America, vol. , no. , pp. –, .
[] R. Russell, Cognition: Theory and Practice, Worth Publishers, .
[] B. Arons, “A Review of The Cocktail Party Effect,” Journal of The American Voice I/O Society, vol. , pp. –, .
[] D. E. Broadbent, “Selective listening to speech,” in Perception and Communication, pp. –, Pergamon Press, .
[] N. I. Durlach and H. S. Colburn, “Binaural Phenomena,” in Hearing, pp. –, Elsevier, .
[] J. Blauert and R. A. Butler, “Spatial hearing: the psychophysics of human sound localization,” The Journal of the Acoustical Society of America, vol. , no. , pp. -, .
[] J. Kassebaum, M. F. Tenorio, and C. Schaefers, “The Cocktail Party Problem: Speech/Data Signal Separation Comparison between Backpropagation and SONN,” in Proceedings of the 2nd International Conference on Neural Information Processing Systems, pp. –, MIT Press Cambridge, Cambridge, USA, .
[] S. Haykin and Z. Chen, “The cocktail party problem,” Neural Computation, vol. , no. , pp. –, .
[] C. von der Malsburg, “The correlation theory of brain function,” in Models of Neural Networks, Temporal Aspects of Coding and Information Processing in Biological Systems, pp. –, Springer New York, New York, NY, USA, .
[] C. von der Malsburg and W. Schneider, “A neural cocktail-party processor,” Biological Cybernetics, vol. , no. , pp. –, .
[] D. L. Wang and G. J. Brown, “Separation of speech from interfering sounds based on oscillatory correlation,” IEEE Transactions on Neural Networks and Learning Systems, vol. , no. , pp. –, .
[] G. J. Brown and D. L. Wang, “An oscillatory correlation framework for computational auditory scene analysis,” in Advances in Neural Information Processing Systems 12, pp. –, MIT Press, .
[] B. Sagi, S. C. Nemat-Nasser, R. Kerr, R. Hayek, C. Downing, and R. Hecht-Nielsen, “A biologically motivated solution to the cocktail party problem,” Neural Computation, vol. , no. , pp. –, .
[] S. Ao, Z. Luo, N. Zhao, and R. Wang, “Blind source separation based on principal component analysis-independent component analysis for acoustic signal during laser welding process,” in Proceedings of the 2010 International Conference on Digital Manufacturing and Automation (ICDMA ’10), pp. –, IEEE, China, December .
[] I. T. Jolliffe, Principal Component Analysis, Springer Series in Statistics, Springer, New York, NY, USA, 2nd edition, .
[] M. E. Wall, A. Rechtsteiner, and L. M. Rocha, “Singular value decomposition and principal component analysis,” in A Practical Approach to Microarray Data Analysis, pp. –, Kluwer Academic Publishers, .
[] J. Wellhausen, “Audio signal separation using independent subspace analysis and improved subspace grouping,” in Proceedings of the 7th Nordic Signal Processing Symposium (NORSIG ’06), pp. –, IEEE, Iceland, June .
[] Q. Cai and X. Tang, “A digital audio watermarking algorithm based on independent component analysis,” in Proceedings of the 9th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, CISP-BMEI 2016, pp. –, IEEE, China, October .
[] A. Hyvärinen, “Independent component analysis: recent advances,” Philosophical Transactions of the Royal Society A: Mathematical, Physical & Engineering Sciences, vol. , no. , pages, .
[] A. Hyvärinen, “Fast and robust fixed-point algorithms for independent component analysis,” IEEE Transactions on Neural Networks and Learning Systems, vol. , no. , pp. –, .
[] H. Helfrich, Time and Mind II: Information Processing Perspectives, Hogrefe & Huber Publishers, Cambridge, MA, USA, .
[] M. F. Casanova and I. Opris, Eds., Recent Advances on the Modular Organization of the Cortex, Springer Netherlands, Dordrecht, .
[] C. Ionescu and R. De Keyser, “Exploring the advantages of blind source separation in monitoring input respiratory impedance during apneic events,” Journal of Control Engineering and Applied Informatics, vol. , no. , pp. –, .
[] Q. Su, Y. Shen, W. Jian, and P. Xu, “Blind source separation algorithm based on modified bacterial colony chemotaxis,” in Proceedings of the 5th International Conference on Intelligent Control and Information Processing (ICICIP ’14), pp. –, IEEE, Dalian, China, August .
[] F. A. P. Petitcolas, “Kerckhoffs’ Principle,” in Encyclopedia of Cryptography and Security, H. C. A. van Tilborg and S. Jajodia, Eds., p. , Springer, Boston, MA, USA, .
[] C. Paar and J. Pelzl, “Stream Ciphers,” in Understanding Cryptography, p. , Springer, Berlin, Heidelberg, Germany, .
[] J. Laroche and M. Dolson, “New phase-vocoder techniques for pitch-shifting, harmonizing and other exotic effects,” in Proceedings of the 1999 Workshop on Applications of Signal Processing to Audio and Acoustics, pp. –, IEEE, New Paltz, NY, USA.
[] X.-Y. Wang and H. Zhao, “A novel synchronization invariant audio watermarking scheme based on DWT and DCT,” IEEE Transactions on Signal Processing, vol. , no. , pp. –, .
Available via license: CC BY 4.0