From Frequency to Quefrency: A History of the Cepstrum

Abstract

Log-spectrum (equivalently, cepstral) averaging has been useful in many applications, such as audio processing, speech processing, speech recognition, and echo detection, for estimating and compensating for convolutional distortions. To suggest what prompted the invention of the term cepstrum, this article narrates the historical and mathematical background that led to its discovery. Analysis of a signal containing a simple echo showed that the spectrum of its log spectrum is a representation that belongs to neither the frequency domain nor the time domain. Bogert et al. (1963) chose to refer to it as the quefrency domain and termed the spectrum of the log of the spectrum of a time waveform the cepstrum. The article also recounts the work of Al Oppenheim in relation to the cepstrum. It was in his theory of nonlinear signal processing, referred to as homomorphic systems, that the characteristic system for homomorphic convolution was recognized as reminiscent of the cepstrum. To retain both the relationship to the work of Bogert et al. and the distinction, the term complex cepstrum was eventually applied to the nonlinear mapping in homomorphic deconvolution, and the term power cepstrum (or real cepstrum) to the cepstrum as originally defined. While most of the terms in the glossary have faded into the background, the term cepstrum has survived and has become part of the digital signal processing lexicon.
dsp history
Alan V. Oppenheim and Ronald W. Schafer
IEEE SIGNAL PROCESSING MAGAZINE
In their classic paper entitled “The Quefrency Alanysis of Time Series for Echoes: Cepstrum, Pseudo-Autocovariance, Cross-Cepstrum, and Saphe Cracking” [1], Bogert, Healy, and Tukey were perhaps in a very playful frame of mind when they coined the term cepstrum, along with a complete glossary that included, in addition to those in the title of their paper, terms such as rahmonics and liftering. The motivation for this strange new terminology, where familiar words were paraphrased by interchanging consonants, was succinctly stated as follows: “In general, we find ourselves operating on the frequency side in ways customary on the time side and vice versa” [1].
To suggest what prompted the invention of the term cepstrum, note that a signal with a simple echo can be represented as
$$x(t) = s(t) + \alpha s(t - \tau). \tag{1}$$
The Fourier spectral density (spectrum) of such a signal is given by
$$|X(f)|^2 = |S(f)|^2 \left[1 + \alpha^2 + 2\alpha\cos(2\pi f\tau)\right]. \tag{2}$$
Thus, we see from (2) that the spectral density of a signal with an echo has the form of an envelope (the spectrum of the original signal) that modulates a periodic function of frequency (the spectrum contribution of the echo). By taking the logarithm of the spectrum, this product is converted to the sum of two components; specifically
$$C(f) = \log|X(f)|^2 = \log|S(f)|^2 + \log\left[1 + \alpha^2 + 2\alpha\cos(2\pi f\tau)\right]. \tag{3}$$
Thus, $C(f)$ viewed as a waveform has an additive periodic component whose “fundamental frequency” is the echo delay $\tau$. In conventional analysis of time waveforms, such periodic components show up as lines or sharp peaks in the corresponding Fourier spectrum. Therefore, the “spectrum” of the log spectrum would likewise show a peak when the original time waveform contained an echo. This new “spectral” representation domain was not the frequency domain, nor was it really the time domain. So, looking to forestall confusion while emphasizing connections to familiar concepts, Bogert et al. chose to refer to it as the quefrency domain, and they termed the spectrum of the log of the spectrum of a time waveform the cepstrum. While most of the terms in the glossary at the end of the original paper have faded into the background, the term cepstrum has survived and become part of the digital signal processing lexicon.
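The echo argument can be checked numerically. The following is a minimal sketch, not taken from the original paper: it assumes a discrete-time version of the echo model, white noise as a stand-in for the original signal, and the inverse DFT as the "spectrum" applied to the log spectrum. The sampling rate, delay, and echo amplitude are illustrative values only.

```python
import numpy as np

rng = np.random.default_rng(0)

fs = 1000      # assumed sampling rate in Hz (illustrative)
n = 4096       # number of samples
delay = 100    # echo delay in samples, i.e. tau = 0.1 s
alpha = 0.5    # echo amplitude

# White noise stands in for the original signal s
s = rng.standard_normal(n)

# x[n] = s[n] + alpha * s[n - delay], a discrete analogue of the echo model
x = s.copy()
x[delay:] += alpha * s[:-delay]

# Log of the spectral density (small constant guards against log(0))
log_power = np.log(np.abs(np.fft.fft(x)) ** 2 + 1e-12)

# "Spectrum" of the log spectrum; the inverse DFT is used here
cepstrum = np.fft.ifft(log_power).real

# Look for the peak away from the low-quefrency region
search = np.arange(20, n // 2)
peak = search[np.argmax(cepstrum[search])]
print(f"estimated echo delay: {peak / fs:.3f} s")
```

The periodic ripple of the log spectrum collapses into a single sharp peak at the quefrency equal to the echo delay, while the log spectrum of the noise itself spreads thinly over all quefrencies.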
In the early 1960s, totally unrelated to, and independent of, the work by Bogert et al., Al Oppenheim was pursuing his doctoral research on a class of nonlinear signal processing techniques inspired by the concept of homomorphic (i.e., linear in a generalized sense) mappings between algebraic groups and vector spaces. His dissertation, “Superposition in a Class of Nonlinear Systems” [2], completed at MIT in May 1964, developed a theory for nonlinear signal processing referred to as homomorphic systems. The use of such systems for signal processing was termed homomorphic filtering.
The essential idea of homomorphic system theory was that many signal processing operations satisfy the same algebraic postulates as addition. Therefore, homomorphic mappings between signal spaces in which these other operations play the role of signal (vector) addition are, in essence, linear mappings in a generalized sense. This suggested a new approach to a variety of problems in separating signals that had been nonadditively combined, such as through convolution or multiplication. Various potential applications of homomorphic signal separation were actively considered, primarily for deconvolution and demultiplication.
The first step in the development of homomorphic filtering for deconvolution was to work out both the basic theory and ways to implement homomorphic systems for convolution. It was shown in Al Oppenheim’s Ph.D. dissertation that all homomorphic systems have a canonic representation consisting of a cascade of three systems. The first system is an invertible nonlinear characteristic operator (system) that maps a nonadditive combination operation such as convolution into ordinary addition. The second system is a linear system obeying additive superposition, and the third system is the inverse of the first nonlinear system. Thus, for signals combined by convolution, a homomorphic deconvolution system maps convolution into addition, then addition into addition, and finally addition into convolution.
For convolution (denoted $*$), the characteristic operator $D[\cdot]$ has the property that $D[x_1 * x_2] = \hat{x}_1 + \hat{x}_2$, where $D[x_1] = \hat{x}_1$ and $D[x_2] = \hat{x}_2$.
Theirs is a historical collaboration. It started in 1965, it continues to date, and it is planned to continue for at least another 40 years. From this collaboration resulted the coauthored books Digital Signal Processing (1975), Discrete-Time Signal Processing (1989, 1999), Computer-Based Exercises for Signal Processing Using MATLAB 5 (1998), and numerous research articles, all of which testify to the creative enthusiasm and challenging work of their authors. Their numerous research and teaching awards, including the IEEE Education Medal received by each, clearly acknowledge their great love and talent for both teaching and research. The quiz is easy, Dear Reader, and you may have guessed the answer right away. Our guests today, in a double feature of the “DSP History” column, are Dr. Al Oppenheim and Dr. Ron Schafer.
Al Oppenheim was born on 11 November 1937, in New York City. He obtained his S.B. (1961), S.M. (1961), and Sc.D. (1964) degrees from the Massachusetts Institute of Technology (MIT), Cambridge. He has been with the Department of Electrical Engineering and Computer Science at MIT for his entire academic career, with a two-year leave of absence at MIT Lincoln Laboratory (1967–1969) and two years part time as an associate division head at MIT Lincoln Laboratory (1978–1980). Dr. Oppenheim’s research focuses on algorithms for signal processing motivated by applications in speech processing, image processing, acoustics, communications, and more recently drifting toward biology. In addition to the books mentioned earlier, he also coauthored Signals and Systems (1982, 1997), authored two widely used video lecture series, and edited and coedited other books. Dr. Oppenheim is the recipient of numerous awards for excellence, including the IEEE Third Millennium Medal (2000), the IEEE Centennial Medal (1984), the IEEE Education Medal (1988), and election to the National Academy of Engineering (1987). In collaborators, he appreciates “creative, out-of-the-box thinking, willingness to assume that any idea is a potentially good idea, ability to listen and be open-minded, but also to challenge in a friendly and constructive way.” His hobbies include a variety of sports pursued intensely. Over the years these have included skiing, squash, tennis, offshore sailing and racing, windsurfing, biking, and flying. His current nonprofessional passions are magic (illusions), windsurfing, and flying (he received his private pilot’s license in 2003). He is a playful soul, even more so on St. Patrick’s Day, when (in true Irish spirit) his friends and colleagues call him “O’Ppenheim.”
Ron Schafer was born on 17 February 1938 in Tecumseh, Nebraska. He obtained the B.S.E.E. (1961) and M.S.E.E. (1962) degrees from the University of Nebraska, Lincoln, and the Ph.D. degree (1968) from MIT. After a career-defining six years with Bell Telephone Laboratories (1968–1974), he joined the Georgia Institute of Technology (Georgia Tech) in 1974 as John and Marilu McCarty Professor. In 2004, he retired from Georgia Tech as Professor Emeritus and took a position as distinguished technologist at Hewlett-Packard Laboratories in Palo Alto, California. Ron’s work has focused on speech processing, image processing, and applications of DSP to medicine and biology. In addition to the books mentioned earlier, he also coauthored Digital Processing of Speech Signals (1978), DSP First: A Multimedia Approach (1998), and Signal Processing First (2003). He is the recipient of several awards for excellence, including the IEEE Emanuel R. Piore Award (1980), the IEEE Education Medal (1992), the IEEE Signal Processing Society Education Award (2000), and election to the National Academy of Engineering (1992). In addition to his almost four-decade collaboration with Al, he has had long, fruitful, and greatly valued collaborations over many years with Larry Rabiner, Tom Barnwell, Russ Mersereau, and Jim McClellan. Ron appreciates in collaborators their ability to push and motivate him and their tolerance of his tendency to procrastinate. To his own children and a multitude of students, he has often quoted an instruction from the MIT catalog of 1963: “A student is expected to study at least three hours outside of class for every hour in the classroom.” To his chagrin, this suggestion was not generally met with an enthusiastic response. Dr. Schafer’s hobbies include photography, fly fishing, reading, walking, and volunteering.
We invite you to follow Dr. Al Oppenheim and Dr. Ron Schafer in their pursuit of reconstructing the captivating history of the cepstrum. Spell-checkers are not recommended...
Adriana Dumitras and George Moschytz, “DSP History” column editors (adrianad@ieee.org, moschytz@isi.ee.ethz.ch)
One intuitively appealing approach to the implementation of the nonlinear mapping $D[\cdot]$ is through the complex-valued logarithm of the complex-valued Fourier transform. That is, if two signals are convolved, their Fourier transforms are multiplied, and a suitably defined complex logarithm will produce the sum of two log Fourier transforms. The inverse Fourier transform of a sum is the sum of the individual inverse transforms, so the cascade of Fourier transform → complex logarithm → inverse Fourier transform maps convolution into a sum of corresponding signals. Interest in this implementation fortunately coincided nicely with the publication of the 1965 paper by Cooley and Tukey [3] on efficient implementation of the Fourier transform on digital computers, i.e., the fast Fourier transform (FFT). The FFT provided an efficient means of implementing the Fourier transform computations needed to implement the nonlinear mapping $D[\cdot]$.
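The cascade of Fourier transform, complex logarithm, and inverse Fourier transform can be illustrated with a short numerical sketch. The function name `complex_cepstrum`, the DFT length, and the two test sequences are assumptions made for this illustration. The sequences are deliberately minimum phase with positive first sample, so that simple phase unwrapping yields a valid "suitably defined" complex logarithm; general signals need additional care (for example, removal of a linear phase term) that is not shown here.

```python
import numpy as np

def complex_cepstrum(x, nfft=1024):
    """Inverse DFT of the complex logarithm of the DFT of x (sketch)."""
    spectrum = np.fft.fft(x, nfft)
    log_mag = np.log(np.abs(spectrum))
    phase = np.unwrap(np.angle(spectrum))   # continuous phase curve
    return np.fft.ifft(log_mag + 1j * phase).real

# Two short minimum-phase sequences (all zeros inside the unit circle)
x1 = np.array([1.0, 0.5])
x2 = np.array([1.0, -0.3, 0.2])

# Combine them by convolution
x12 = np.convolve(x1, x2)

# The mapping turns convolution into addition
lhs = complex_cepstrum(x12)
rhs = complex_cepstrum(x1) + complex_cepstrum(x2)
print("max deviation:", np.max(np.abs(lhs - rhs)))
```

The printed deviation is at the level of floating-point rounding, which is the numerical counterpart of the property that the characteristic operator maps a convolution into a sum.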
It was a fortuitous discussion in 1965 between Jim Flanagan of Bell Telephone Laboratories and Al Oppenheim that connected the work going on at MIT to the development of the cepstrum at Bell Laboratories. After hearing about homomorphic deconvolution from Oppenheim, Flanagan noted that the characteristic system for homomorphic convolution was reminiscent of the spectrum of the log of the spectrum (i.e., the cepstrum) as proposed by Bogert et al. Furthermore, he suggested looking at work by Michael Noll [4], [5] in the Journal of the Acoustical Society of America. Noll credits Manfred Schroeder (who was aware of the work of Bogert et al.) with suggesting to him that it might be interesting to apply cepstrum analysis on a short-time basis to speech signals. In the Journal of the Acoustical Society of America papers, Noll applied the cepstrum as a basis for pitch detection. The problem of pitch detection is very similar to detecting echo times in the sense that the basic speech model consists of representing speech as the convolution of the vocal tract impulse response with the quasi-periodic train of glottal pulses. The basic idea of cepstral pitch detection is illustrated in Figures 1 and 2.
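The idea of cepstral pitch detection can be sketched on synthetic data. The code below is an illustration under stated assumptions, not Noll's algorithm as published: the "speech" is an impulse train convolved with a single damped resonance (a toy stand-in for the vocal tract), the sampling rate is assumed to be 8 kHz, and the frame length (50 ms) and pitch period (12.5 ms) are chosen to match the values quoted for Figure 1.

```python
import numpy as np

fs = 8000        # assumed sampling rate in Hz
period = 100     # pitch period in samples (12.5 ms at 8 kHz)
n_frame = 400    # 50 ms analysis frame
nfft = 2048      # zero-padded DFT length

# Quasi-periodic excitation: an impulse train
excitation = np.zeros(n_frame + 200)
excitation[::period] = 1.0

# Toy vocal-tract impulse response: one damped resonance near 500 Hz
k = np.arange(200)
h = 0.97 ** k * np.cos(2 * np.pi * 500 * k / fs)

# Speech model: convolution of excitation and impulse response,
# skipping the start-up transient, then a Hamming window
speech = np.convolve(excitation, h)[200:200 + n_frame]
frame = speech * np.hamming(n_frame)

# Cepstrum: inverse DFT of the log magnitude of the DFT
log_mag = np.log(np.abs(np.fft.fft(frame, nfft)) + 1e-12)
cepstrum = np.fft.ifft(log_mag).real

# Search for the peak in a plausible pitch range (2.5 ms to 20 ms)
lo, hi = int(0.0025 * fs), int(0.020 * fs)
peak = lo + np.argmax(cepstrum[lo:hi])
print(f"estimated pitch period: {1000 * peak / fs:.2f} ms")
```

The lower search limit keeps the detector away from the low-quefrency region, where the contribution of the vocal tract response is concentrated.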
The success of cepstral pitch detection suggested to us the strong potential of applying homomorphic deconvolution to deconvolve speech, i.e., to separate the vocal tract impulse response, the glottal pulse shape, and the periodic excitation. Bogert et al. had introduced the concept of liftering, i.e., linear filtering of the log spectrum, as a way of emphasizing the periodic component of the log spectrum so as to enhance the detectability of echoes [1]. The concept of homomorphic filtering clearly suggested how liftering could be used to actually separate the various convolutional components, such as separating the vocal tract filter response from the periodic excitation spectrum. By applying a low-pass lifter to the cepstrum in Figure 2 to extract the low-quefrency components below the first rahmonic peak, the slowly varying curve (in red, upper graph) results. The low-quefrency components thus correspond to the resonance structure of the vocal tract and the high-frequency falloff of the speech spectrum due to the glottal pulse.
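Low-pass liftering can be sketched in a few lines. This is an illustrative example on a synthetic frame (impulse train convolved with one damped resonance, 8 kHz sampling rate assumed), not a reconstruction of the data behind Figure 2. The cutoff of 50 samples is an assumed value chosen to lie below the first rahmonic peak at 100 samples, and `total_variation` is a helper introduced here only to quantify smoothness.

```python
import numpy as np

fs, period, n_frame, nfft = 8000, 100, 400, 2048

# Synthetic voiced frame: impulse train convolved with a damped resonance
excitation = np.zeros(n_frame + 200)
excitation[::period] = 1.0
k = np.arange(200)
h = 0.97 ** k * np.cos(2 * np.pi * 500 * k / fs)
frame = np.convolve(excitation, h)[200:200 + n_frame] * np.hamming(n_frame)

# Log magnitude spectrum and its cepstrum
log_mag = np.log(np.abs(np.fft.fft(frame, nfft)) + 1e-12)
cepstrum = np.fft.ifft(log_mag).real

# Low-pass lifter: keep quefrencies below the cutoff, on both the
# positive and the mirrored negative side so the result stays real
cutoff = 50
lifter = np.zeros(nfft)
lifter[:cutoff] = 1.0
lifter[nfft - cutoff + 1:] = 1.0

# Back to the log-spectral domain: a slowly varying envelope
envelope = np.fft.fft(cepstrum * lifter).real

def total_variation(v):
    """Sum of absolute sample-to-sample changes (a roughness measure)."""
    return np.sum(np.abs(np.diff(v)))

print("raw log spectrum roughness:", total_variation(log_mag))
print("liftered envelope roughness:", total_variation(envelope))
```

The liftered curve follows the resonance structure while the rapid ripple caused by the periodic excitation is removed, which is why its roughness is far lower than that of the raw log spectrum.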
This example shows that the general concept of homomorphic deconvolution is fundamentally different from echo or pitch detection in a number of ways. One, of course, is that the ultimate objective in homomorphic deconvolution is signal separation and recovery rather than detection; i.e., the nonlinear mapping is followed by linear filtering and subsequently by the inverse of the nonlinear mapping. While the spectral density as used by Bogert et al. loses phase information, the objective of homomorphic deconvolution requires that phase information be retained or reconstructed. Consequently, in homomorphic deconvolution the complex-valued Fourier transform and complex logarithm must in general be used. To retain both the relationship to the work of Bogert et al. and the distinction, the term complex cepstrum was eventually applied to the nonlinear mapping in homomorphic deconvolution, and the term power cepstrum or real cepstrum was used for the cepstrum as originally defined by Bogert et al.
Figure 1. A segment of voiced speech, weighted by a Hamming window, taken during a voiced (vowel-like) time interval of 50 ms. Short-time analysis of speech involves the analysis of a succession of such segments taken sequentially in time. At this particular time in a vowel sound, the pitch period is approximately 12.5 ms. [Plot: windowed speech segment, amplitude versus time from 0 to 50 ms, with the fundamental period of 12.5 ms marked.]
Shortly after Al Oppenheim joined the MIT faculty and became aware of the work in [1] and [4], he and Ron Schafer were introduced to each other by Tom Stockham, a faculty member at MIT and a member of Oppenheim’s dissertation committee. At that time, Ron Schafer was a graduate student in search of an interesting doctoral dissertation topic. Thus began a close and wonderfully productive collaboration between a young professor and an enthusiastic graduate student, exploring and developing the theoretical concepts, techniques, and applications of the complex cepstrum. The resulting doctoral dissertation, “Echo Removal by Discrete Generalized Linear Filtering” [6], completed by Ron Schafer at MIT in 1968, focused on the issues of the discrete-time formulation of the complex cepstrum, phase computation, recursion relations, and applications to echo removal from speech. In our development of the complex cepstrum, a variety of alternate implementations of the complex cepstrum and relationships between the power cepstrum and complex cepstrum were developed, both in general and for minimum-phase, maximum-phase, and all-pass sequences. The work also led to the interpretation in the cepstral domain of the Hilbert transform relationship between Fourier transform magnitude and phase for minimum-phase signals.
The early work on the application of homomorphic filtering to deconvolution and demultiplication, and consequently on the extension of the power cepstrum to the complex cepstrum, was described in the Proceedings of the IEEE [7] and also developed and described more extensively in Chapter 10 of our first jointly authored textbook [8], published in 1975. As an interesting side note, throughout the various stages of proofreading of this book, we constantly had to maintain vigilance to be certain that this “strange” term cepstrum wasn’t inadvertently “corrected” to what seemed to be more appropriate. (Our text editing program also continues to complain as we write this article.) We breathed a sigh of relief when the last page proofs were returned to the publisher. When the first printing of the book appeared, it was clear that a particularly diligent proofreader at the publisher had caught the “error” at the last instant and cepstrum had been reversed to spectrum throughout.
This unusual term has caused linguistic difficulties in other ways. Since it was meant as a paraphrase of the term spectrum, its natural pronunciation would be kepstrum. On the other hand, based on its spelling the natural pronunciation would be sepstrum, and there have been lively debates on which should be used. The adopted pronunciation among those aware of the origin is kepstrum, and in fact some authors have chosen to spell it as kepstrum to make the pronunciation clear.
Figure 2. Log magnitude (in blue, upper graph), low-pass liftered (linearly filtered) log magnitude (in red, upper graph), and cepstrum (in blue, bottom graph) of the segment of voiced speech illustrated in Figure 1. The rapidly varying curve in the upper graph is the log magnitude of the discrete-time Fourier transform of the segment of speech in Figure 1. The lower plot is the cepstrum, i.e., the inverse discrete Fourier transform of the log magnitude in the upper plot. Note the peaks at “rahmonics” of 1/(80 Hz) = 12.5 ms, the fundamental quefrency of the quasi-periodic ripples in the upper graph. As can be seen by comparing the speech segment in Figure 1 to this cepstrum, this fundamental quefrency is also the period (pitch period) of the time waveform. The search for such peaks in the cepstrum was the basis for Noll’s pitch detection algorithm. [Upper panel: Log Magnitude of Fourier Transform, frequency axis from 0 to 4 kHz, with the fundamental “period” of 80 Hz marked. Lower panel: Cepstrum of Segment of Voiced Speech, quefrency axis from 0 to 50 ms, with the first rahmonic peak at 12.5 ms marked.]
Applications of the cepstrum, complex cepstrum, and homomorphic deconvolution have been explored in a variety of areas including audio processing, speech processing, geophysics, radar, medical imaging, and others. Early in the development of the concept of homomorphic filtering, Tom Stockham had shown particular interest in exploring potential applications to both demultiplication and deconvolution. The work on homomorphic demultiplication led to novel methods for dynamic range compression, as disclosed in a patent by Stockham and Oppenheim [9], and signal enhancement for both audio and images, the latter of which was discussed by Stockham in [10]. An early and very novel application of the cepstrum to deconvolution was the work by Stockham et al. [11] directed at blind deconvolution such as the restoration of old phonograph recordings. They focused in particular on restoration of recordings by Enrico Caruso.
The objective in the restoration of these old recordings was to compensate for the undesirable frequency response of the “morning glory” recording horn used in that era. Since the recording horn was retuned daily and since its frequency response was typically subject to temperature and humidity variations, the compensation could not simply be based on modern-day measurements of an archived recording horn; i.e., the spectral characteristics would have to be estimated from the recording itself. Stockham’s essential idea in the Caruso restoration was to use log spectral averaging (essentially cepstral averaging) and subtraction of the log spectral average of a modern recording of the same arias. The result was then used as an estimate of the log spectrum of the recording horn, which was to be compensated for. This idea of log spectrum, or equivalently, cepstrum averaging has been useful in many other applications for estimating and compensating for a convolutional distortion.
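The principle of log spectral averaging can be sketched numerically. The example below is a toy illustration, not Stockham's procedure: it assumes white noise frames for both the "old" and the "modern" recording (same statistics, different content), a hypothetical three-tap response `h` standing in for the recording horn, and distortion applied per frame as a multiplication in the DFT domain (circular convolution).

```python
import numpy as np

rng = np.random.default_rng(1)
n_frames, n = 400, 256

# Hypothetical distortion ("recording horn") and its frequency response
h = np.array([1.0, -0.8, 0.4])
H = np.fft.rfft(h, n)

# Frames of the old performance and of a modern recording of the same material
old = rng.standard_normal((n_frames, n))
new = rng.standard_normal((n_frames, n))

# Distorted old recording: each frame spectrum is multiplied by H
Y = np.fft.rfft(old, axis=1) * H
R = np.fft.rfft(new, axis=1)

# Averaging log magnitudes over frames: the source term averages out to
# (nearly) the same curve in both recordings, so the difference of the
# averages estimates log|H|
log_h_est = np.log(np.abs(Y)).mean(axis=0) - np.log(np.abs(R)).mean(axis=0)

err = np.mean(np.abs(log_h_est - np.log(np.abs(H))))
print(f"mean absolute error of the log|H| estimate: {err:.3f}")
```

Because the distortion is additive in the log spectral domain, it survives the averaging unchanged while the frame-to-frame variation of the source is smoothed away. The same reasoning underlies subtracting the cepstral mean to remove a fixed linear filter.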
Also in the context of audio processing, speech processing developed as a major application area for the cepstrum, both in the form of the power cepstrum and the complex cepstrum. For example, the cepstral pitch detector utilizing the power cepstrum became widely used. Homomorphic deconvolution was also successfully applied by us and others in the late 1960s to separate the glottal excitation and the vocal tract impulse response for speech modeling in general and for speech compression in particular.
The cepstrum also plays a significant role in many speech recognition systems. Specifically, the cepstral coefficients have been found empirically to be a more robust, reliable feature set for speech recognition and speaker identification than linear predictive coding (LPC) coefficients or other equivalent parameter sets. Cepstral parameter extraction in speech recognizers is, in some instances, based on converting LPC parameters to cepstral coefficients by utilizing the recursion relationship [7], [8] that we developed in our early research to obtain the complex cepstrum of minimum-phase signals without the need for explicit computation of the Fourier transform or phase unwrapping. An alternative use of the cepstrum in speech recognition is the mel-frequency cepstrum [12]. The mel-frequency cepstrum is based on calculating the cepstrum from the logarithm of the spectrum obtained from a filter bank with center frequencies and bandwidths determined by a constant mel-frequency interval. Cepstral averaging and time-differencing of cepstra are now standard methods of removing the effects of linear filtering prior to recognition.
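The conversion from LPC parameters to cepstral coefficients can be sketched as follows. The code uses the commonly quoted form of the recursion for an all-pole model $H(z) = 1/(1 - \sum_{k=1}^{p} a_k z^{-k})$ with unit gain, namely $c_n = a_n + \sum_{k=1}^{n-1} (k/n)\, c_k\, a_{n-k}$ with $a_n = 0$ for $n > p$. The function name `lpc_to_cepstrum`, the sign convention for the coefficients, and the example values are assumptions of this sketch; the FFT-based computation is included only as a cross-check.

```python
import numpy as np

def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c[1..n_ceps] of H(z) = 1 / (1 - sum_k a[k-1] z^-k).

    Assumes a stable (minimum-phase) all-pole model with unit gain,
    so that c[0] = 0 and c[n] = 0 for n < 0.
    """
    p = len(a)
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        # Only terms with 1 <= n - k <= p contribute
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]

# Example predictor: poles at radius sqrt(0.5), inside the unit circle
a = np.array([0.9, -0.5])
c_rec = lpc_to_cepstrum(a, 12)

# Cross-check against the transform-based definition: log H = -log A
nfft = 4096
A = np.fft.fft(np.concatenate(([1.0], -a)), nfft)
c_fft = np.fft.ifft(-np.log(A)).real[1:13]

print("max deviation:", np.max(np.abs(c_rec - c_fft)))
```

The recursion needs neither a Fourier transform nor phase unwrapping. The cross-check can rely on the principal value of the complex logarithm only because this particular denominator polynomial is minimum phase.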
While speech processing has been one of the more successful applications of cepstral analysis and homomorphic deconvolution, applications in other areas have been explored with considerable success. As mentioned previously, detection of echoes in seismic signals was the motivating application in the original work of Bogert et al. Following the publication of [7], there was considerable interest in exploring homomorphic deconvolution in the context of seismology and exploration geophysics. For example, the complex cepstrum was used in the early 1970s in determining seismic wavelets and in deconvolving seismic traces [13], [14].
There are many other innovations and applications that can be traced to the cepstrum and its subsequent development, but space does not permit an exhaustive discussion or bibliography. This is not surprising in light of the ubiquity of convolution as a model for what goes on in our physical world. It is almost certain that we have not seen the end of new ideas and new applications of cepstrum analysis and homomorphic filtering. For example, our original theoretical development showed that the complex cepstrum of a signal with a rational $z$-transform could be obtained in terms of the roots of the numerator and denominator [8]. This avoids the issue of phase computation completely. Steiglitz and Dickinson demonstrated this in [15] for polynomial $z$-transforms of only moderate length. However, a recent article [16] by Sitton, Burrus, Fox, and Treitel in IEEE Signal Processing Magazine has shown that polynomials of order up to 1 million can be accurately rooted using methods based on the FFT. At the time the complex cepstrum was introduced and developed, factoring such large polynomials was not even considered. With such powerful new tools it may be time to take another look at the complex cepstrum in the context of new applications.
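The root-based route to the complex cepstrum can be sketched for the simplest case. The code below assumes a finite-length sequence with $x[0] = 1$ whose zeros all lie inside the unit circle, for which the expansion $\log(1 - a z^{-1}) = -\sum_{n \ge 1} a^n z^{-n}/n$ gives $\hat{x}[n] = -\sum_k a_k^n / n$ for $n > 0$. The function name `cepstrum_from_roots` and the example sequence are illustrative assumptions; zeros outside the unit circle and poles contribute analogous terms that are not handled here, and `numpy.roots` is a general-purpose root finder, not the FFT-based method of [16].

```python
import numpy as np

def cepstrum_from_roots(x, n_ceps):
    """Complex cepstrum x_hat[1..n_ceps] of a minimum-phase FIR sequence.

    Sketch only: assumes x[0] = 1 and all zeros strictly inside the
    unit circle, so that x_hat[0] = 0 and x_hat[n] = 0 for n < 0.
    """
    zeros = np.roots(x)                      # zeros of X(z)
    if np.any(np.abs(zeros) >= 1.0):
        raise ValueError("sketch handles zeros inside the unit circle only")
    n = np.arange(1, n_ceps + 1)
    # Each zero a contributes -a**n / n at quefrency n > 0
    powers = zeros[:, None].astype(complex) ** n
    return -(powers.sum(axis=0).real) / n

# X(z) = (1 - 0.3 z^-1)(1 + 0.8 z^-1)
x = np.array([1.0, 0.5, -0.24])
xhat_roots = cepstrum_from_roots(x, 10)

# Cross-check with the transform-based definition (no unwrapping needed
# here because the phase of this minimum-phase example stays within +-pi)
nfft = 1024
xhat_fft = np.fft.ifft(np.log(np.fft.fft(x, nfft))).real[1:11]

print("max deviation:", np.max(np.abs(xhat_roots - xhat_fft)))
```

No phase computation appears anywhere in `cepstrum_from_roots`; the difficulty is shifted entirely to finding the roots accurately.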
References
[1] B.P. Bogert, M.J.R. Healy, and J.W. Tukey, “The quefrency alanysis of time series for echoes: Cepstrum, pseudo-autocovariance, cross-cepstrum, and saphe cracking,” in Time Series Analysis, M. Rosenblatt, Ed., 1963, ch. 15, pp. 209–243.
[2] A.V. Oppenheim, “Superposition in a class of nonlinear systems,” Ph.D. dissertation, MIT, May 1964.
[3] J.W. Cooley and J.W. Tukey, “An algorithm for the machine computation of complex Fourier series,” Math. Computation, vol. 19, pp. 297–301, Apr. 1965.
[4] A.M. Noll, “Short-time spectrum and ‘cepstrum’ techniques for vocal-pitch detection,” J. Acoust. Soc. Amer., vol. 36, no. 2, pp. 296–302, Feb. 1964.
[5] A.M. Noll, “Cepstrum pitch determination,” J. Acoust. Soc. Amer., vol. 41, no. 2, pp. 293–309, Feb. 1967.
[6] R.W. Schafer, “Echo removal by discrete generalized linear filtering,” Ph.D. dissertation, MIT, Jan. 1968.
[7] A.V. Oppenheim, R.W. Schafer, and T.G. Stockham, Jr., “Nonlinear filtering of multiplied and convolved signals,” Proc. IEEE, vol. 56, no. 8, pp. 1264–1291, Aug. 1968.
[8] A.V. Oppenheim and R.W. Schafer, Digital Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1975.
[9] A.V. Oppenheim and T.G. Stockham, Jr., “Signal compression and expansion system,” U.S. Patent 3 518 578, June 1970.
[10] T.G. Stockham, Jr., “Image processing in the context of a visual model,” Proc. IEEE, vol. 60, pp. 828–842, July 1972.
[11] T.G. Stockham, Jr., T.M. Cannon, and R.B. Ingebretsen, “Blind deconvolution through digital signal processing,” Proc. IEEE, vol. 63, pp. 678–692, Apr. 1975.
[12] S.B. Davis and P. Mermelstein, “Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences,” IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-28, no. 4, pp. 357–366, Aug. 1980.
[13] T.J. Ulrych, “Application of homomorphic deconvolution to seismology,” Geophysics, vol. 36, no. 4, pp. 650–660, Aug. 1971.
[14] P.L. Stoffa, P. Buhl, and G.M. Bryan, “The application of homomorphic deconvolution to shallow-water marine seismology–Part I: Models; Part II: Real data,” Geophysics, vol. 39, pp. 401–426, Aug. 1974.
[15] K. Steiglitz and B. Dickinson, “Computation of the complex cepstrum by factorization of the Z-transform,” in Proc. Int. Conf. Acoust., Speech and Signal Processing, 1977, pp. 723–726.
[16] G.A. Sitton, C.S. Burrus, J.W. Fox, and S. Treitel, “Factoring very high-degree polynomials,” IEEE Signal Processing Mag., vol. 20, no. 6, pp. 27–42, Nov. 2003.
For an extensive bibliography, see http://www.rle.mit.edu/dspg/pub_journal.html.
... Since its invention in 1963 [5], the cepstrum has been applied in various discrete-time signal processing problems, such as detecting the echo delay, deconvolution, feature representations for speech recognition like the Mel-Frequency Cepstral Coefficients (MFCC), and estimating the pitch of an audio signal. A thorough review of the cepstrum can be found in [43,44]. ...
... In practice, the irrelevant components lie in the low quefrency region. Therefore, we need to apply a long-pass lifter on U_f^{(h,γ)}(t, ξ), where the lifter refers to a "filter" processed in the cepstral domain, again by inverting the first four letters of "filter", to distinguish it from the filter processed in the spectral domain [5,43]. Moreover, since the quefrency is measured in the unit of time, a lifter is identified as a short-pass or long-pass one rather than a low-pass or a high-pass one [5,43]. In short, a long-pass lifter mainly passes the components of high quefrency (long period) while mainly rejecting the components of low quefrency (short period). ...
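The long-pass lifter described in this excerpt can be illustrated with a hard cutoff in the quefrency domain. This is a minimal sketch, not the cited authors' implementation: the function name `long_pass_lifter`, the hard (untapered) cutoff, and the assumption that the cepstrum is a DFT-length sequence whose index `q` and mirror index `n - q` share the same quefrency are all choices made for the example.

```python
def long_pass_lifter(cepstrum, cutoff):
    """Keep components whose quefrency is at least `cutoff` samples.

    Assumes a DFT-style cepstrum of length n, where index q and index
    n - q represent the same quefrency, so both are tested together.
    """
    n = len(cepstrum)
    return [
        c if min(q, n - q) >= cutoff else 0.0
        for q, c in enumerate(cepstrum)
    ]

# Hypothetical 8-point symmetric cepstrum: large low-quefrency terms,
# small high-quefrency terms.
example = [5.0, 2.0, 1.0, 0.1, 0.9, 0.1, 1.0, 2.0]
liftered = long_pass_lifter(example, cutoff=3)
print(liftered)
```

A short-pass lifter would simply invert the comparison. Practical designs often taper the transition rather than cutting abruptly.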
Preprint
We propose to combine cepstrum and nonlinear time-frequency (TF) analysis to study multiple component oscillatory signals with time-varying frequency and amplitude and with time-varying non-sinusoidal oscillatory pattern. The concept of cepstrum is applied to eliminate the wave-shape function influence on the TF analysis, and we propose a new algorithm, named de-shape synchrosqueezing transform (de-shape SST). The mathematical model, adaptive non-harmonic model, is introduced and the de-shape SST algorithm is theoretically analyzed. In addition to simulated signals, several different physiological, musical and biological signals are analyzed to illustrate the proposed algorithm.
... Typically, these methodologies involve two main stages. The first stage, known as estimation, computes preliminary formant values over short time-segments (e.g., 25 ms) using techniques such as linear prediction (LP) [16] or cepstral analysis [17]. The second stage, tracking, integrates the formant estimates from individual frames into continuous contours that span longer speech units, such as syllables, words, or phrases [11], [12]. ...
Article
Formant tracking is an area of speech science that has recently undergone a technology shift from classical model-driven signal processing methods to modern data-driven deep learning methods. In this study, these two domains are combined in formant tracking by refining the formants estimated by a data-driven deep neural network (DNN) with formant estimates given by a model-driven linear prediction (LP) method. In the refinement process, the three lowest formants, initially estimated by the DNN-based method, are frame-wise replaced with local spectral peaks identified by the LP method. The LP-based refinement stage can be seamlessly integrated into the DNN without any training. As an LP method, the study advocates the use of quasi-closed phase forward-backward (QCP-FB) analysis. Three spectral representations are compared as DNN inputs: mel-frequency cepstral coefficients (MFCCs), the spectrogram, and the complex spectrogram. Formant tracking performance was evaluated by comparing the proposed refined DNN tracker with seven reference trackers, which included both signal processing and deep learning based methods. As evaluation data, ground truth formants of the Vocal Tract Resonance (VTR) corpus were used. The results demonstrate that the refined DNN trackers outperformed all conventional trackers. The best results were obtained by using the MFCC input for the DNN. The proposed MFCC refinement (MFCC-DNNQCP-FB) reduced estimation errors by 0.8 Hz, 12.9 Hz, and 11.7 Hz for the first (F1), second (F2), and third (F3) formants, respectively, compared to the Deep Formants refinement (DeepFQCP-FB). When compared to the model-driven KARMA tracking method, the proposed refinement reduced estimation errors by 2.3 Hz, 55.5 Hz, and 143.4 Hz for F1, F2, and F3, respectively. A detailed evaluation across various phonetic categories and gender groups showed that the proposed hybrid refinement approach improves formant-tracking performance across most test conditions.
... Power cepstrum is an analytical technique that transforms a signal from the time domain into the frequency domain [21]. Cepstrum analysis is particularly effective in capturing the temporal evolution of characteristics within eLoran signals, offering a detailed insight into their dynamic behaviour over time. ...
Article
The Enhanced Long Range Navigation (eLoran) system serves as a crucial backup to the Global Navigation Satellite System (GNSS), leveraging advantages, such as low signal frequency, high transmitter power, and stable propagation distance. However, the prevailing demodulation techniques employed by the eLoran system, which are largely based on conventional digital signal processing, are susceptible to substantial inaccuracies when confronted with intense interference and complex environmental conditions. This paper introduces a novel GTCN‐Transformer network designed for the specific task of recognising message in eLoran pulse group signal. The network is constructed by enhancing the architecture of Temporal Convolutional Networks (TCN) and integrating the Transformer mechanism. In order to extract significant features from the pulse group signal, a sequence dataset was obtained by using cepstral analysis. Subsequently, the GTCN‐Transformer network is deployed to recognise the message contained within the eLoran pulse group signal. The experimental results demonstrate that the GTCN‐Transformer network achieves a recognition accuracy of over 95% for eLoran signal message information when the SNR exceeds 10 dB, even in the presence of sky‐wave and cross‐interference signals. Moreover, a comparative analysis with recurrent neural network (RNN) reveals that the GTCN‐Transformer network outperforms these architectures in terms of recognition accuracy.
... decomposition of spectral density functions as the product of minimum phase and maximum phase terms [9]. Cepstrum analysis eases the design of causal filters [10,11]. The latter have been extended to two dimensions [9,12,13,14,15], and generalized to the multidimensional case [16], through the definition of dD semi-causality. ...
Preprint
This paper proposes a new perspective on the problem of multidimensional spectral factorization, through helical mapping: d-dimensional (dD) data arrays are vectorized, processed by 1D cepstral analysis and then remapped onto the original space. Partial differential equations (PDEs) are the basic framework to describe the evolution of physical phenomena. We observe that the minimum phase helical solution asymptotically converges to the dD semi-causal solution, and allows to decouple the two solutions arising from PDEs describing physical systems. We prove this equivalence in the theoretical framework of cepstral analysis, and we also illustrate the validity of helical factorization through a 2D wave propagation example and a 3D application to helioseismology.
... where * represents the convolution, δ(t) is the unit IR function, and T is the period of the IR function. Cepstrum analysis, originally utilized to detect echoes generated by seismic events and explosive blasts [42], is a deconvolution technique that involves the analysis of two signals [43]: the input source signal and a series of impulses. Cepstrum analysis encompasses a logarithmic transformation of the spectrum of the signal of interest, followed by an inverse Fourier transform, resulting in the conversion of the convolved signal into an additive signal within the cepstral domain, as illustrated in Fig. 5, which provides a schematic representation of the inverse spectrum. ...
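The recipe in this excerpt (spectrum, logarithm, inverse Fourier transform) can be sketched for a single echo. The code below is an illustration under stated assumptions, not the cited paper's method: the helper names `fft` and `power_cepstrum`, the white-noise source, the circular (wrap-around) echo, and the values of `delay` and `gain` are all hypothetical choices for the example.

```python
import cmath
import math
import random

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def power_cepstrum(x):
    """Inverse transform of the log power spectrum of x."""
    n = len(x)
    # Small floor avoids log(0); its value is an arbitrary assumption.
    log_power = [math.log(abs(v) ** 2 + 1e-12) for v in fft(x)]
    return [v.real / n for v in fft(log_power, inverse=True)]

# Hypothetical test signal: white noise plus one circular echo.
random.seed(0)
n, delay, gain = 512, 40, 0.5
source = [random.gauss(0.0, 1.0) for _ in range(n)]
signal = [source[i] + gain * source[(i - delay) % n] for i in range(n)]

ceps = power_cepstrum(signal)
# Skip the lowest quefrencies, where the source itself dominates.
peak = max(range(10, n // 2), key=lambda q: ceps[q])
print("estimated echo delay (samples):", peak)
```

Convolution with the echo becomes multiplication in the spectrum and addition after the logarithm, so the echo shows up as an isolated cepstral peak near quefrency `delay`, with alternating-sign rahmonics at its multiples.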
Article
Determining fracture locations in hydraulic fracturing is essential for diagnostic purposes. Water hammer waves generated during pump shut-in in hydraulic fracturing create pressure fluctuations as they pass through fractures. The pressure signals collected at the wellhead contain valuable information about subsurface fracture positions. This study, based on the water hammer equation, establishes an internal flow model within pipelines, considering both the pump shut-in process and subsurface fracture boundary conditions (fracture permeability, fracture storage, and fracture inertia effects). The method of characteristics (MOC) is employed for numerical discretization to simulate the wellhead pressure fluctuations during pump shut-in. A novel fracture localization method is proposed, combining comprehensive filtering, cepstral analysis, and velocity conversion. Comprehensive filtering effectively removes various noises present in the collected signals. Subsequently, cepstral analysis identifies negative peaks in the cepstral domain generated by pulse functions at fracture locations. This information is then used to determine the propagation time of pressure waves from fractures to the wellhead, which is converted to depth by wave velocity. Through numerical simulations and field experiments, the method's effectiveness is validated, demonstrating its capability to efficiently filter out signal noise, identify cepstral negative peaks from pulse functions at fractures, and provide precise inversion of fracture locations. This method holds significant guidance for practical field applications.
Article
This study aimed to investigate the clinical utility of voiced sentence tasks for voice evaluation. To this end, we analyzed the correlation between perturbation-based acoustic measurements [jitter percent (jitter), shimmer percent (shimmer), Noise to Harmonic Ratio (NHR)] from sustained vowel phonation and cepstrum-based acoustic measurements [Cepstral Peak Prominence (CPP), Low/High spectral ratio (L/H ratio)] from voiced sentences. Analysis of data collected from 65 patients with voice disorders showed a significant correlation between CPP and jitter (r = -.624, p = .000), shimmer (r = -.530, p = .000), and NHR (r = -.469, p = .000). This suggests that cepstral measurement of voiced sentences can serve as an alternative where perturbation-based analysis of pathological voice is limited, for example when perturbation measures cannot be computed or when results differ with the analysis segment.
Article
Purpose Semiconductor fabrication facilities often suffer from undesired particle introduction into process chambers in vacuum systems. Ideally, it is unusual to observe particles formed in the exhaust pumping line inside the chamber, but non-volatile compound products at relatively low temperatures jeopardize the vacuum pumping system, gas scrubber and the wafer-in-process. This study proposes a monitoring system for constructing a complete condition-based maintenance system for diagnosing the powder build-up within exhaust pipes used in the semiconductor manufacturing industry. This system includes ultrasonic sensors and machine learning. Design/methodology/approach Employing ultrasonic sensors, physical and data-driven models are established. The time- or frequency-domain data acquired by the monitoring system are converted into cepstrums for modeling the powder layer thickness using machine learning. Findings The algorithms used in the proposed system successfully classified the thicknesses with an average accuracy of above 97%, and feature importance analysis identified the quefrency that varied with the thickness of the powder layer. Practical implications The limitation of this research lies within the lab environment. It is unfortunate that the suggested method has not been evaluated in actual semiconductor manufacturing facilities, as powder build-up may take more than a few months to be called the facility maintenance. However, the submitted paper is still valid in academic and engineering aspects to be utilized in industry. Originality/value We modeled the system using data acquired by an ultrasonic sensor, and we constructed a data-driven model that was trained using cepstral data to replace the physical models that monitor thickness. We are the first to use ultrasound and machine learning to estimate the thickness of powder in the exhaust vacuum pumping line.
Article
This paper provides a description of the recent evolution in the US of an emerging technology known as DNA Computation or more generally as Biomolecular Computation from its early stages to its development and extension into other areas such as nanotechnology, emerging as a viable sub-discipline of science and engineering.
Article
The complex cepstrum is investigated mathematically and through models for functions of interest in shallow-water marine seismology. Association of the slowly varying components of the phase spectrum with the source replaces the usual minimum-phase assumption. This is analogous to the usual treatment of the amplitude spectrum. Complex cepstrum expressions are developed for an arbitrary (but minimum-phase) reflector series, water-column multiple generator, and simplified bubble-pulse oscillation. While the complex cepstrum of all functions is of infinite extent, removing only the first n nonzero complex-cepstrum contributions of a decaying, impulsive, periodic time function (such as the water-column multiple generator) serves to eliminate the first n multiples entirely in the time domain and reduces the remaining multiples to at most 1/(n + 1) of their original value. A new method of computing the continuous, ramp-free phase spectrum required for complex-cepstrum analysis is developed on the basis of the derivative of the phase curve.
Article
The application of the techniques of Part I to seismic reflection data acquired on the Argentine continental shelf yields results which appear superior to time-domain, minimum-phase inverse filtering via the auto-correlation function. This is in part because of a narrow-band, maximum-phase source component. Minimum-phase deconvolution disperses this component rather than compressing it. Very slight exponential weighting (a^t, a = 0.998) appears to make the reflector series minimum phase. This weighting in conjunction with quadrupling the trace length by extending it with zeros virtually eliminates aliasing in the complex cepstrum. Simple zeroing of the complex-cepstrum terms works well as a deconvolution technique even though for exactness their harmonics at longer cepstrum periods should also be removed.
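The preprocessing mentioned in this abstract, exponential weighting by a^t followed by zero extension to four times the trace length, can be sketched as below. This is only an illustration of those two steps: the function name `weight_and_pad`, its parameter names, and the use of the sample index as t are assumptions, and the complex-cepstrum computation itself is not shown.

```python
def weight_and_pad(trace, a=0.998, pad_factor=4):
    """Multiply sample t by a**t, then extend with zeros.

    The output length is pad_factor * len(trace); pad_factor=4
    corresponds to quadrupling the trace length.
    """
    weighted = [x * a ** t for t, x in enumerate(trace)]
    return weighted + [0.0] * ((pad_factor - 1) * len(trace))

# Tiny hypothetical trace; a=0.5 is exaggerated so the decay is visible.
print(weight_and_pad([1.0, 1.0], a=0.5))
```

After processing in the cepstral domain, the weighting would be undone by multiplying the recovered sequence by a^(-t).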
Article
Homomorphic systems (Oppenheim, 1965a and 1965b) are a class of nonlinear systems which satisfy a generalized principle of superposition. Such systems are particularly useful in separating signals which have been combined through convolution. This paper deals with the application of homomorphic deconvolution to the recovery of the seismic wavelet from a time series formed by the convolution of this wavelet with an impulse train. The unique point about this approach is that it does not require the usual assumptions of a minimum‐phase wavelet and a random distribution of impulses.
Article
A spectrum analyzer based on a definition of short‐time power spectra has been designed and simulated on a digital computer. The analyzer is primarily intended for use in speech analysis. It has been designed to operate in real time, and to produce high‐resolution spectra without utilizing either heterodyning methods or bandpass filter banks. The logarithm of each consecutive amplitude spectrum thus obtained can be used as the input to a second similar spectrum analyzer. The output of this analyzer is then the “cepstrum” or power spectrum of the logarithm spectrum. The cepstrum of a speech signal has a peak corresponding to the fundamental period for voiced speech but no peak for unvoiced speech. Thus, a cepstrum analyzer can function both as a pitch and as a voiced‐unvoiced detector. Cepstral pitch detection has the important advantages that it is insensitive to phase distortion, and is also resistant to additive noise and amplitude distortion of the speech signal. The method does not require the presence of the fundamental frequency in the speech signal, and will give several separate cepstral peaks if several different pitch periods are present. Cepstral techniques appear to be even more reliable and efficient than visual methods for pitch detection. The short‐time spectrum and cepstrum analyzers described in this paper were simulated by a sampled‐data system on an IBM‐7090 digital computer. The simulation was programmed with the assistance of a special block‐diagram compiler.
Article
A new approach to separating convolved signals, referred to as homomorphic deconvolution, is presented. The class of systems considered in this report is a member of a larger class called homomorphic systems, which are characterized by a generalized principle of superposition that is analogous to the principle of superposition for linear systems. A detailed analysis based on the z-transform is given for discrete-time systems of this class. The realization of such systems using a digital computer is also discussed in detail. Such computational realizations are made possible through the application of high-speed Fourier analysis techniques. As a particular example, the method is applied to the separation of the components of a convolution in which one of the components is an impulse train. This class of signals is representative of many interesting signal-analysis and signal-processing problems such as speech analysis and echo removal and detection. It is shown that homomorphic deconvolution is a useful approach to either removal or detection of echoes.
Article
An efficient method for the calculation of the interactions of a 2^m factorial experiment was introduced by Yates and is widely known by his name. The generalization to 3^m was given by Box et al. (1). Good (2) generalized these methods and gave elegant algorithms for which one class of applications is the calculation of Fourier series. In their full generality, Good's methods are applicable to certain problems in which one must multiply an N-vector by an N × N matrix which can be factored into m sparse matrices, where m is proportional to log N. This results in a procedure requiring a number of operations proportional to N log N rather than N^2. These methods are applied here to the calculation of complex Fourier series. They are useful in situations where the number of data points is, or can be chosen to be, a highly composite number. The algorithm is here derived and presented in a rather different form. Attention is given to the choice of N. It is also shown how special advantage can be obtained in the use of a binary computer with N = 2^m and how the entire calculation can be performed within the array of N data storage locations used for the given Fourier coefficients. Consider the problem of calculating the complex Fourier series $X(j) = \sum_{k=0}^{N-1} A(k)\, W^{jk}, \quad j = 0, 1, \ldots, N-1$ (1)
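The contrast between N^2 and N log N operations in this abstract can be made concrete by evaluating the series both ways. The sketch below is not the in-place algorithm the abstract describes: it is a plain recursive radix-2 splitting for N a power of two, the function names are hypothetical, and W = exp(2πi/N) is assumed because the excerpt ends before W is defined.

```python
import cmath

def dft_direct(a):
    """Evaluate X(j) = sum_k A(k) W**(j*k) term by term: about N**2 multiplies."""
    n = len(a)
    w = cmath.exp(2j * cmath.pi / n)  # assumed definition of W
    return [sum(a[k] * w ** (j * k) for k in range(n)) for j in range(n)]

def fft_radix2(a):
    """Same series via even/odd splitting: about N log N operations.

    Requires len(a) to be a power of two.
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft_radix2(a[0::2])  # terms with k = 2m
    odd = fft_radix2(a[1::2])   # terms with k = 2m + 1
    out = [0j] * n
    for j in range(n // 2):
        t = cmath.exp(2j * cmath.pi * j / n) * odd[j]  # W**j times odd part
        out[j] = even[j] + t
        out[j + n // 2] = even[j] - t  # uses W**(j + N/2) = -W**j
    return out

coeffs = [1, 2, 3, 4, 0, 0, 0, 0]  # hypothetical A(k), N = 8
slow = dft_direct(coeffs)
fast = fft_radix2(coeffs)
print(max(abs(s - f) for s, f in zip(slow, fast)))  # rounding-level difference
```

Both routines compute the same X(j); the recursive version halves the problem at each level, which is where the log N factor comes from.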