The Discrete Fourier Transform - A practical approach

Author: David Dorran
Preprints and early-stage research may not have been peer reviewed yet.

Abstract

These notes on the Discrete Fourier Transform include numerous practical examples that make use of audio signals.
The DFT A Practical Approach
David Dorran Page 1
Table of Contents
How to use this series of documents
An Introduction to the Frequency-Domain
What are sinusoids?
All signals can be decomposed into sinusoids
A note on the duration of a sinusoid
How to use Octave/Matlab’s fft function
Using Octave/Matlab’s fft function to analyse a bass guitar signal
Interpreting the output of the fft function
The Discrete Fourier Transform
What the DFT does
A note on summing cosine and sine waveforms
The inner workings of the DFT
DFT analysis of well-defined signals
Key points and a brief note on negative frequencies
The DFT in Practice zero-padding and windowing
Time-frequency resolution trade-off
Spectral leakage
How and why zero-padding works
How and why windowing affects main-lobes and side-lobes
Why the spectral shape of window functions is frequency shifted
How to use this series of documents
Digital Signal Processing Foundations provides a gentle introduction to the world of
DSP. This series of documents deals with topics relevant to DSP and they provide links to
online video content in an effort to make some perhaps tricky concepts easier to
understand for the reader.
If you are completely new to DSP then I’d recommend you take a look at the foundations
document first. If you have already read the foundations document, you can skip straight
to the section entitled “How to use Octave/Matlab’s fft function”.
I believe that integrating text with video combines the unique visualisations that video
material has to offer with the more in-depth detail and speed of review that text-based
material provides. Most of the video material is from my own YouTube channel, which at
the time of writing (2021) had over 16,000 subscribers and 3 million views.
I also provide Octave/Matlab code examples throughout the document and I’d encourage
anyone who wants to develop practical DSP skills to download Octave, which is available
free of charge, and try out their own ideas.
My intention is to continue this series so as to deal with the major elements of DSP, such
as correlation, the Z-Transform and so on. I’ll post updates to this series at
http://www.pzdsp.com/docs if you want to check out any new additions.
If you find any of this work useful I’d be most grateful if you could cite the relevant
resource when appropriate to provide recognition.
Regards,
David
An Introduction to the Frequency-Domain
When someone plays the guitar, different sounds are created because the guitar strings
vibrate or oscillate at different frequencies. A similar effect can be heard if you stretch an
elastic band between your fingers and pluck it. You’d notice that changing either the
length or the tension of the band alters the frequency of the sound, since this causes
the band to vibrate at a different rate or frequency.
When something is oscillating a repeating pattern is being produced over time. This can
be seen with a vibrating elastic band as it moves backwards and forwards through its
initial position. Take a look at the following link to clearly see this effect in a slow motion
video of guitar strings as they oscillate: pzdsp.com/vid11 .
The repeating nature associated with the movement of a guitar string can also be seen in
a plot of the audio signal it produces, as shown below, where the amplitude of a bass guitar
audio signal is seen to move up and down over time as the strings vibrate. You should
note that the rate of oscillation of the string is the same as the rate of oscillation of the
audio signal since it is the string vibrations that cause pressure variations in the air which
we perceive as sound (The audio recording of the bass guitar signal shown above can be
downloaded from pzdsp.com/sig1). The change in air pressure can also be picked up by a
microphone and stored on a computer as a discrete signal i.e. a sequence of numbers that
were obtained by measuring the sound pressure at regular time intervals.
The frequency-domain representation of a signal is a convenient way of showing the
oscillation rate associated with a signal, as explained in the following paragraph.
[Figure: Time-domain view of a 1.5 second recording of a bass guitar (amplitude against
time), with a 200 ms segment shown in detail, alongside the frequency-domain view
(magnitude against frequency in Hz, 0 to 300 Hz) in which the fundamental at 55 Hz is
labelled.]
From the figure above, the sound pressure oscillates after the initial ‘attack’ or transient
component at the start of the signal. This plot of pressure variation over time is referred
to as a time-domain plot and by looking closely at this plot you can see that the time to
complete one cycle of an oscillation is about 18.2 milliseconds (approx. 11 cycles over a
200 ms segment). In other words the cycle is repeating about 55 times every second. To
the right of the time-domain plot is a plot of the magnitude spectrum, which is a
frequency-domain representation that can be used to quickly determine the rate of
oscillations in time-domain signals. The three relatively large ‘spikes’ shown in the
magnitude spectrum represent the fundamental frequency (55 Hz) and the first two
harmonics (110 Hz and 165 Hz). You should notice that you can tell the rate of oscillation
(55 Hz) quite easily when you look at the signal in the frequency-domain; much more
quickly and easily than by analysing the period of the time-domain signal.
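The arithmetic behind the period and frequency estimate can be checked with a couple of lines of Octave. This is only a sketch: the cycle count of 11 is the approximate value read from the plot, and the variable names are my own choice for illustration.

```octave
% Estimate the oscillation rate from a count of cycles in a time-domain segment.
segment_duration = 0.2;   % seconds (the 200 ms segment)
num_cycles = 11;          % approximate number of cycles counted in the segment

period = segment_duration / num_cycles;   % seconds per cycle
frequency = 1 / period;                   % cycles per second (Hz)

fprintf('Period: %.1f ms, frequency: %.0f Hz\n', period*1000, frequency);
```

Counting cycles by eye like this is only approximate, which is one reason the frequency-domain view is so convenient.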
This type of repeating pattern doesn’t just happen with audio signals; it can be
observed in many signals, including those from our heart. Your heart will beat at a
particular rate, or frequency, depending on what you are doing, and your heart rate will
increase if you go for a run or cycle. Engineers and scientists (and musicians and doctors!)
often analyse the repeating nature of signals, and the frequency-domain view of a
signal shows the frequency of the repeating patterns in a convenient graph.
The frequency-domain view of a signal provides another way of analysing a signal which
can provide valuable insight into a signal’s behaviour. I find it useful to relate this to the
way an architect has different drawings of a building depending on who she is dealing
with: a client would find it easier to visualise what the building would look like by
examining a 3-D view of the building, while a builder would require detailed plans in order
to construct the building. Both sets of drawings are representations of the same building
and both have their uses. It’s the same with the time-domain and frequency-domain
views of signals: both represent the same signal and both can be very useful when
analysing signals. Here’s a link to a video which demonstrates the benefit of both the
time-domain view and frequency-domain view of a signal: pzdsp.com/vid12.
[Figure: Time-domain view of an ECG signal (amplitude against time in seconds) and
frequency-domain view of the ECG signal (magnitude spectrum, magnitude against
frequency in Hz), with the fundamental at 1.1 Hz labelled.]
Frequency-domain graphs of signals are very easy to create using software tools like
Octave and Matlab and they make use of Fourier analysis techniques to extract frequency
information from a time-domain signal (more on this later!). The basic principle behind
all of the Fourier analysis techniques is that any signal can be broken down into a set of
sinusoidal signals and this concept is explored further in the next couple of subsections.
What are sinusoids?
A sinusoid is a waveform that oscillates smoothly over time (see the plot below) and is
associated with many signals that occur in nature. For example, when you whistle you
create pressure variations in the air which have a sinusoidal shape, or if you were to allow
an object attached to the end of a spring to bounce up and down then the motion of the
object would also be sinusoidal (see pzdsp.com/vid13). Even more interestingly, it turns
out that sinusoids are a fundamental building block of any signal, a fact that was shown
mathematically by a French mathematician called Jean Baptiste Joseph Fourier
(1768-1830). It’s therefore worth spending some time getting used to what sinusoids look
like and how they can be represented mathematically.
There are three features of sinusoidal waveforms that you’ll need to be comfortable with
to fully appreciate Fourier analysis: frequency, amplitude and phase offset.
The figure shows a time-domain plot of a cosine waveform to the left and its
corresponding magnitude spectrum to the right. From the time-domain view notice
that the sinusoid oscillates between 1.5 and -1.5, which means that the
amplitude of the sinusoid is 1.5. You’ll notice that the sinusoid repeats every 0.5
seconds; in other words it has a period of 0.5 seconds, which means that it has a frequency
of 2 Hz. I’d recommend you check out the interactive animation at pzdsp.com/sinusoids
in order to get a clearer idea about these parameters.
[Figure: A sinusoidal waveform of amplitude 1.5 and frequency 2 Hz. Time-domain view
(amplitude against time in seconds, 0 to 3 seconds) and frequency-domain view
(magnitude against frequency in Hz).]
The frequency-domain plot of the sinusoid above shows a single spike at a frequency of
2 Hz. Anytime you have a time-domain plot of a single sinusoid you will observe a single
spike in the frequency-domain, and the position of the spike on the frequency axis
corresponds to the frequency of the sinusoid. The magnitude (height) of the spike is
proportional to the amplitude of the sinusoid. You’ll see examples of signals with more
than one sinusoid present in the next section.
Before we look at the phase associated with this sinusoid let’s first take a look at a
mathematical function often used to represent a sinusoid, which is shown below:
()=cos(2 +)
The A parameter specifies the amplitude of the sinusoid; f specifies the frequency and
(Greek letter phi) parameter specifies the phase offset (also referred to as the initial phase
or phase). The t variable represents time and the mathematical expression is evaluated for
a range of values of t in order to create a time-domain signal. So, if you wanted to recreate
the plot of the sinusoid shown above you’d substitute A with 1.5, f with 2 and with 0 to
give x(t) = 1.5cos(4πt), and then you could evaluate this for a number of values of t before
finally plotting your graph of x(t) against time.
Octave code to create a plot of a sinusoid:
A = 1.5; % amplitude
f = 2; % frequency in Hz
phi = 0; % phase offset in radians
duration = 1; % 1 second
T = 1/f; % period in seconds
t=0:T/100:duration; % 100 time points per period
x = A*cos(2*pi*f*t + phi);
plot(t,x)
xlabel('time (seconds)')
ylabel('Amplitude')
You should notice that when the phase value is zero the waveform will be at a maximum
when t=0 and at every period of the waveform after that. Changing the phase will change
the times when the maximum of the sinusoid will occur. You should try this out for
yourself using the code above and you should also observe that adding 2π to any phase
offset value you try out will produce the exact same waveform. For example, the waveform
produced when the phase offset is set to 1.4 will be the same as the waveform produced
when the phase is set to 1.4+2π, or 1.4+4π, or even 1.4-2π for that matter. In fact, you will
find that for any integer k the following relationship holds:
$$\cos(2\pi f t + \phi) = \cos(2\pi f t + \phi + 2\pi k)$$
All signals can be decomposed into sinusoids
The French mathematician Jean-Baptiste Joseph Fourier showed that any signal can be
recreated by adding sinusoidal signals together. (See pzdsp.com/vid14 and
pzdsp.com/vid15 for video tutorials/demonstrations on this concept).
The frequency-domain view of a signal provides a way to visualise the sinusoids that make
up a signal i.e. the sinusoids that when added together reproduce the original signal. The
magnitude spectrum shows the amplitudes of the various sinusoids which make up a
signal, while the phase spectrum shows the phases of the sinusoids which make up a
signal.
The figure above shows a waveform (top) which is a plot of the time-domain signal
produced when the two sinusoids shown below it are added together. The frequency-
domain view of this signal contains two spikes; the spike at 2 Hz is larger than the one at
24 Hz because the 2 Hz sinusoid is larger (5 times larger) than the 24 Hz sinusoid.
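A signal like the one described can be sketched in a few lines of Octave. The amplitudes of 5 and 1 and the sampling rate of 100 Hz are assumptions on my part (the text only tells us that one sinusoid is 5 times larger than the other), and the fft function used here is explained later in the document.

```octave
% Add a 2 Hz and a 24 Hz sinusoid together and compare their spectral spikes.
fs = 100;                      % assumed sampling rate in Hz
t = (0:fs-1)/fs;               % one second of sample times
x1 = 5*cos(2*pi*2*t);          % 2 Hz sinusoid, assumed amplitude 5
x2 = 1*cos(2*pi*24*t);         % 24 Hz sinusoid, assumed amplitude 1
x = x1 + x2;                   % time-domain signal

X_mags = abs(fft(x));          % magnitude spectrum
% With one second of data, bin k corresponds to k Hz (array index k+1).
ratio = X_mags(2+1) / X_mags(24+1);
fprintf('Spike at 2 Hz is %.1f times the height of the spike at 24 Hz\n', ratio);
```

Plotting x against t, and X_mags against (0:length(x)-1)/length(x)*fs, should give plots similar in character to the time-domain and frequency-domain views described above.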
The figure to the right shows the magnitude spectrum of a signal in the bottom plot, with
the time-domain view of the same signal shown in the top plot. Each of the ‘spikes’ in the
magnitude spectrum represents a sinusoid (there are 4 in total, indicating the presence of
4 sinusoids in the signal; in other words the signal could be reproduced by adding four
sinusoids together). The four sinusoids, which when summed together produce the
time-domain signal shown in the top plot, are shown in the middle plot. The green
sinusoid has 5 cycles over the one second duration of the segment shown and therefore
has a frequency of 5 Hz; it has the largest amplitude, as can also be seen in the
corresponding magnitude spectrum plot where the ‘spike’ shown at 5 Hz is the largest.
It can also be seen in the magnitude spectrum that the ‘spike’ at 8 Hz is less than half the
height of the 5 Hz spike; this can also be seen in the middle plot whereby the sinusoid
with 8 cycles in one second has an amplitude of less than half the amplitude of the 5 Hz
sinusoid.
The phase values for each of the sinusoids present in the signal are 0, 0, 3.14, 2.13 radians
for the 1, 5, 8, and 10 Hz components. These phase values are phase shifts relative to cosine
waveforms. A plot of the phase spectrum shows the phase values plotted against frequency
in a similar way to the magnitude spectrum showing the magnitude values plotted against
frequency.
If you would like to see a practical application of the frequency-domain then take a look
at pzdsp.com/vid12.
A note on the duration of a sinusoid
From the mathematical description of a sinusoid, a sinusoidal waveform exists for all
instances of time. In this document I show plots of sinusoidal segments which have a finite
duration, and you’ll notice that I still refer to these plots of sinusoidal segments as
sinusoids. This is, strictly speaking, incorrect but makes the document a bit easier to
read.
How to use Octave/Matlab’s fft function
The fft function can be used to determine the amplitudes, frequencies and phases of the
sinusoids that a signal is comprised of and is frequently used to obtain a
frequency-domain plot of a signal. In this section I’ll explain how to use the fft function
without getting into detail on its inner workings. You should note that the fft function is
an implementation of the Discrete Fourier Transform, which is described in detail
in the next section.
In this section I’ll first show how to create a frequency-domain plot of the bass guitar
signal used in the Introduction. I’ll then provide another example, based on the popular
video on the subject that I created in 2012 (pzdsp.com/vid16), which provides more
insight on how to use the fft to analyse a signal.
Using Octave/Matlab’s fft function to analyse a bass guitar signal
The following code can be used to load in an audio signal and plot its frequency content.
The audio file in the example can be downloaded from pzdsp.com/sig1 and you should
make sure the audio file is stored/saved in the ‘present working directory’, which can be
determined by typing pwd at the command line.
% The variable b contains the audio samples. The audioread function also
% returns the sampling rate, fs, which is 44100 in this case.
>> [b, fs] = audioread('bass_note.wav');
% There are 67822 samples in the time-domain signal b. The fft function
% returns the same number of values as are in the signal being analysed,
% so 67822 complex numbers are stored in the array variable B. Note that,
% by convention, capital letters are used to store frequency-domain
% information while lowercase letters are used for time-domain.
>> B = fft(b);
% The abs function determines the magnitudes of the 67822 complex numbers.
>> B_mags = abs(B);
The second line in the code above is the one that does all the hard work. The fft function
analyses the time-domain signal b (as described in detail in the next section) to determine
the magnitudes and phases of the sinusoids required to reproduce the time-domain signal
b.
One of the most common ways to visually analyse the frequency content of a signal is to
plot the magnitudes of the values returned by the fft function, which provides a plot of
the magnitude spectrum. You should note that there are numerous ways to plot the
magnitude spectrum and the video available at pzdsp.com/vid17 provides a detailed
explanation on how to do so. The following code shows one common method of plotting
the magnitude spectrum against frequency using units of hertz, where the xlim([0 500])
command limits the range of frequencies being displayed to be from 0 to 500 Hz.
>> plot([0:length(b)-1]/length(b)*fs , B_mags)
>> xlim([0 500]); % limit the range of frequencies to be from 0 to 500 Hz
>> xlabel('Frequency (Hz)'); ylabel('Magnitude');
This code produces a plot in which the fundamental and first two harmonics of the
guitar note can be clearly seen. There is a fundamental frequency component at about
55 Hz, with strongly present harmonics at 110 Hz and 165 Hz, as indicated by the large
spikes at these frequencies.
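To get a feel for the frequency axis used in the plotting code, it can help to work out the spacing between adjacent fft values in hertz. The sketch below uses the sampling rate and signal length quoted for the bass guitar recording (44100 Hz and 67822 samples) without needing the audio file itself; the variable names are my own.

```octave
% Frequency axis used when plotting the magnitude spectrum in hertz.
fs = 44100;                    % sampling rate of the bass guitar recording
N = 67822;                     % number of samples (and of fft values)

freqs = (0:N-1)/N*fs;          % frequency in Hz associated with each fft value
bin_spacing = fs/N;            % spacing between adjacent fft values in Hz

% Find the fft value closest to 55 Hz (array indices start at 1).
[~, idx] = min(abs(freqs - 55));
fprintf('Spacing between fft values: %.2f Hz\n', bin_spacing);
fprintf('Closest value to 55 Hz is at index %d (%.2f Hz)\n', idx, freqs(idx));
```

The spacing works out at roughly 0.65 Hz, which is why the spikes in the magnitude spectrum of the bass note can be located quite precisely.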
The time-domain view of this signal can be plotted using the following code, and you can
see that the signal contains a sharp attack/transient element from when the bass guitar
string was plucked.
>> [b fs]= audioread('bass_note.wav');
>> t = [0:length(b)-1]*1/fs;
>> plot(t,b)
>> xlabel('Time (seconds)'); ylabel('Amplitude')
[Figure: Magnitude spectrum of the bass guitar note (magnitude against frequency in Hz,
0 to 500 Hz), as produced by the fft plotting code above.]
The time-domain plot also highlights a steady-state region in which the signal is
reasonably stationary. This steady-state/stationary segment can be reproduced reasonably
well by adding just three sinusoidal components, as shown by the code and plots below.
Note that the code doesn’t show how the amplitudes, frequencies and phases of the three
sinusoidal components are determined; how this could be done can be appreciated after
reading through the entire document!
>> [ip fs]= audioread('bass_note.wav');
>> N = 9670;
>> stationary_seg = ip(10000:10000+N-1);
>> t = [0:N-1]/fs; t_offset = 10000/fs;
>> subplot(2,2,1); plot(t+t_offset, stationary_seg)
>> title('Original segment')
>> xlabel('Time (seconds)'); ylabel('Amplitude');
>> subplot(2,2,2);
>> plot([0:N-1]/N*fs , abs(fft(stationary_seg)))
>> xlim([0 250]); %limit frequencies from 0 to 250 Hz
>> xlabel('Frequency (Hz) '); ylabel('Magnitude');
>> title('Magnitude Spectrum')
>> fundamental = 0.112*cos(2*pi*54.7*t-0.82);
>> harmonic1 = 0.072*cos(2*pi*109.7*t+3);
>> harmonic2 = 0.028*cos(2*pi*164.6*t + 0.83);
>> synth_sig = fundamental + harmonic1 + harmonic2 ;
>> subplot(2,2,3);plot(t+t_offset, synth_sig);
>> xlabel('Time (seconds)'); ylabel('Amplitude');
>> title('Synthesised segment')
>> subplot(2,2,4); plot(t+t_offset, fundamental);
>> hold on ; plot(t+t_offset, harmonic1);
>> plot(t+t_offset, harmonic2);
>> xlabel('Time (seconds)'); ylabel('Amplitude');
>> title('Three synthesis sinusoids')
[Figure: Time-domain plot of the bass guitar recording (amplitude against time in
seconds, 0 to 1.6 seconds) with the attack/transient and a steady-state region labelled.]
Interpreting the output of the fft function
The fft function returns a sequence of complex numbers, and these complex numbers
describe the amplitudes and phases of the sinusoidal waveforms that a time-domain signal
is comprised of. In this section I’ll explain how to interpret the complex numbers returned
by the fft function using some synthesised example signals, as explained in
pzdsp.com/vid16.
Let’s start by synthesising a time-domain signal that contains three sinusoids. The fft
function should be able to determine the amplitudes and phases of these three sinusoids
so let’s see how it does it.
>> fs = 1000;
>> t = 0 : 1/fs :1.5-1/fs;
>> x = 3*cos(2*pi*20*t + 0.2) + 1*cos(2*pi*30*t -0.3) + ...
2*cos(2*pi*40*t + 2.4);
>> plot(t,x);
>> xlabel('Time (seconds) ');
>> ylabel('Amplitude');
[Figure: Four panels from the steady-state bass guitar example: ‘Original segment’ and
‘Synthesised segment’ (amplitude against time in seconds, 0.25 to 0.4 seconds),
‘Magnitude Spectrum’ (magnitude against frequency in Hz, 0 to 250 Hz) and ‘Three
synthesis sinusoids’ (amplitude against time in seconds).]
We know that the synthesised time-domain signal contains three sinusoids of
frequencies 20 Hz, 30 Hz and 40 Hz with phase offsets of 0.2 radians, -0.3 radians and 2.4
radians, respectively, and amplitudes 3, 1, and 2, respectively. The time-domain signal
contains 1500 samples (sampling rate is 1000 Hz) and when we apply the fft function to
this signal 1500 complex numbers are returned. If we plot the magnitudes of these 1500
complex numbers, as shown below, we can see three ‘spikes’ on the left-hand side of the
plot with another three spikes ‘mirrored’ on the right-hand side. The three pairs of ‘spikes’
represent the three sinusoidal components that the original signal is comprised of.
>> X = fft(x);
>> plot(abs(X)); xlabel('Frequency (bins)');
ylabel('Magnitude');
>> title('Magnitude Spectrum')
[Figure: Time-domain plot of the synthesised signal (amplitude against time in seconds,
0 to 1.5 seconds) and its magnitude spectrum (magnitude against frequency in bins,
0 to 1500).]
The heights of the ‘spikes’ correspond to the amplitudes of the sinusoids. Referring to
the ‘spikes’ on the left-hand side: the spike furthest to the left corresponds to the 20 Hz
sinusoid, which has the largest amplitude; the middle spike has the lowest amplitude and
corresponds to the 30 Hz sinusoid; while the ‘spike’ to the right of the grouping is twice
the amplitude of the middle spike and corresponds to the sinusoid with a frequency of
40 Hz.
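The ‘mirrored’ spikes on the right-hand side can be examined numerically. The sketch below rebuilds the same synthesised signal (using (0:N-1)/fs for the time values, which gives the same 1500 sample times) and compares the value at bin 30 with the value at its mirror position, bin N-30. That the mirrored value is the complex conjugate is a standard property of the DFT of real-valued signals rather than something stated in the text above.

```octave
% Compare a spike on the left-hand side with its mirror on the right-hand side.
fs = 1000; N = 1500;
t = (0:N-1)/fs;                % same 1500 sample times as 0 : 1/fs : 1.5-1/fs
x = 3*cos(2*pi*20*t + 0.2) + 1*cos(2*pi*30*t - 0.3) + ...
    2*cos(2*pi*40*t + 2.4);
X = fft(x);

k = 30;                        % bin number of the 20 Hz spike
left = X(k+1);                 % bin k is stored at array index k+1
right = X(N-k+1);              % mirrored bin N-k is stored at index N-k+1

% For a real signal the mirrored value is the complex conjugate, so the
% two spikes have the same magnitude.
mirror_error = abs(right - conj(left));
fprintf('Left magnitude: %.1f, right magnitude: %.1f\n', abs(left), abs(right));
fprintf('Difference from complex conjugate: %g\n', mirror_error);
```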
If we took a closer look at the values of the variable X we’d see that they are complex
numbers that contain a lot of zero values. While showing all 1500 values is impractical, we
can use the following Matlab code to look at a few:
>> X(30:32) % X(30) X(31) X(32)
0+0j 2205.15+447j 0+0j
>> X(45:47) % X(45) X(46) X(47)
0+0j 716.5-221.64j 0+0j
>> X(60:62) % X(60) X(61) X(62)
0+0j -1106.09+1013.19j 0+0j
We can see that there are three non-zero values at indices 31, 46 and 61. The magnitudes
of these values are 2250, 750 and 1500, respectively, and these values can also be
determined from the plot of the magnitude spectrum by examining the amplitude of each
of the ‘spikes’. This should make sense, since the plot of the magnitude spectrum is simply
a visual representation of the magnitudes of the variable X. It’s worth noting that if we
divide these magnitude values by 750 (which is half the number of values in X) then we
get a result of 3, 1 and 2 which exactly match the amplitudes of the three sinusoids that
the synthesised signal is comprised of.
The phase angles of the complex numbers of X at indices 31, 46 and 61 are 0.2 radians, -0.3
radians and 2.4 radians, respectively. By referring to the code which synthesised the time-
domain signal we can see that these phase angles directly correspond to the phases of the
sinusoids that the synthesised signal is comprised of.
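The amplitude and phase values quoted above can be reproduced with the abs and angle functions. This is a minimal sketch that rebuilds the synthesised signal and assumes we already know that the non-zero values sit at indices 31, 46 and 61; the variable names are my own.

```octave
% Recover the amplitudes and phases of the three sinusoids from the fft output.
fs = 1000; N = 1500;
t = (0:N-1)/fs;                % same 1500 sample times as 0 : 1/fs : 1.5-1/fs
x = 3*cos(2*pi*20*t + 0.2) + 1*cos(2*pi*30*t - 0.3) + ...
    2*cos(2*pi*40*t + 2.4);
X = fft(x);

idx = [31 46 61];                      % indices of the non-zero values
amplitudes = abs(X(idx)) / (N/2);      % divide magnitudes by half the number of values
phases = angle(X(idx));                % phase angles in radians

disp(amplitudes);                      % amplitudes of the 20, 30 and 40 Hz sinusoids
disp(phases);                          % phases of the 20, 30 and 40 Hz sinusoids
```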
So, we can see that the fft function can determine the amplitudes and phases of the
sinusoids that a signal is comprised of. The remaining piece of information is the frequency
of each of those sinusoids, and to determine the frequency we have to examine the indices
of the non-zero values of X i.e. 31, 46 and 61. Before continuing, it’s important to note that
Matlab and Octave index the first value of an array with the number 1, while
mathematicians (and most other programming languages) index the first element of
an array with the number 0. The values of the array returned by the fft function (stored in
the variable X) are referred to as bin values, with the first element of the array being
referred to as bin number 0. The non-zero values of the variable X occurring at indices 31,
46 and 61 therefore correspond to bin numbers 30, 45 and 60. These bin numbers are
related to frequency, with the bin numbers associated with the left-hand side of the
magnitude spectrum being converted to frequency in hertz using the following formula:
$$f = \frac{k \, f_s}{N}$$
where f is the frequency associated with bin k, f_s is the sampling frequency (the variable
fs in the code) and N is the total number of bins (which is equal to the number of values in
the variable X).
Using this formula for bin values k set to 30, 45 and 60 gives frequencies 20 Hz, 30 Hz and
40 Hz, which correspond to the frequencies of the sinusoids used to synthesise the time-
domain signal.
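The conversion from array index to bin number to frequency can be sketched as follows. The threshold of 1 used to pick out the ‘non-zero’ values is an arbitrary choice of mine to ignore tiny rounding errors, and only the left-hand side of the spectrum is searched.

```octave
% Convert the indices of the non-zero fft values to frequencies in hertz.
fs = 1000; N = 1500;
t = (0:N-1)/fs;                % same 1500 sample times as 0 : 1/fs : 1.5-1/fs
x = 3*cos(2*pi*20*t + 0.2) + 1*cos(2*pi*30*t - 0.3) + ...
    2*cos(2*pi*40*t + 2.4);
X = fft(x);

left_side = X(1:N/2+1);                % left-hand side of the spectrum
indices = find(abs(left_side) > 1);    % array indices of the spikes (1-based)
bins = indices - 1;                    % bin numbers start at 0
freqs = bins*fs/N;                     % f = k*fs/N

disp(bins);                            % bin numbers of the spikes
disp(freqs);                           % corresponding frequencies in Hz
```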
Before moving on to how the DFT works, it’s important to appreciate how to interpret the
values returned by the fft function, which is simply an efficient implementation of the
DFT.
I would like to say that for many scientists and engineers a deep understanding of the DFT
is not required to undertake some basic frequency analysis and an ability to have a high-
level interpretation of the output of the DFT would be adequate in a large number of
cases. For those readers who feel they may not require a detailed understanding of the
DFT I would recommend skipping ahead to the section entitled “The DFT in Practice
zero-padding and windowing”.
The Discrete Fourier Transform
The Discrete Fourier Transform (DFT) is a mathematical process that can be used to
determine the sinusoids that a discrete signal is comprised of. It basically works out the
amplitudes and phases of the sinusoids that will reproduce the original discrete signal if
those sinusoids were added together.
Before getting into the detail of the DFT it’s
worth noting a couple of features associated
with the sinusoids that the DFT uses to
reproduce a discrete signal. Firstly, the
sinusoids only have an integer number of
cycles over N samples, where N is the number
of samples in the discrete signal being
analysed.
The figure to the right shows some examples of
such sinusoids. The second plot from the top
shows a sinusoid with 1 cycle over N samples;
the next one down has 2 cycles over N samples;
and so on. The plot at the very top is a special
case and has 0 cycles over N samples, in other
words, a DC signal that has a constant
amplitude.
The second key feature is that a discrete signal of length N samples will require at most
N/2+1 of these ‘integer cycle’ sinusoids to reproduce the original discrete-time
signal.
To understand the operation of the DFT, it’s useful to think of the frequency of the
sinusoids in terms of radians per sample rather than hertz. For example, the sinusoid
which oscillates at a rate of one cycle over N samples (i.e. the sinusoid second from top in
the above figure) has a frequency of 2π/N radians per sample, since a sinusoid with one
cycle over N samples undergoes 2π radians of a revolution over N samples. Similarly, the
sinusoid at the bottom of the figure has a frequency of 12π/N radians per sample, since it
undergoes 6 cycles over N samples i.e. 6 times 2π radians. Take a look at the video at
pzdsp.com/vid24 for an overview of angular frequency.
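The radians-per-sample idea can be tried out directly. In this sketch N = 64 is an arbitrary signal length that I’ve picked for illustration; the calculation works the same way for any N.

```octave
% Frequency in radians per sample of a sinusoid with k cycles over N samples.
N = 64;                        % assumed signal length (any length would do)
n = 0:N-1;                     % sample numbers
k = 6;                         % number of cycles over N samples

omega = 2*pi*k/N;              % radians per sample (12*pi/N when k = 6)
s = cos(omega*n);              % the sampled sinusoid itself

total_angle = omega*N;         % angle covered over N samples
cycles = total_angle/(2*pi);   % each cycle covers 2*pi radians
fprintf('omega = %.4f radians per sample, %d cycles over N samples\n', ...
        omega, round(cycles));
```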
[Figure: Sinusoids with 0, 1, 2, 3, 4, 5 and 6 cycles over a signal of length N samples,
each plotted on an amplitude axis from -1 to 1.]
Given the above two features, any discrete-time signal of 8 samples could be reproduced
by summing just 5 sampled sinusoids. You should note that no matter what the values of
the 8 samples are, or what sampling rate was used to capture the 8 samples, the frequency
of the sinusoids will be the same when expressed in radians per sample. The key point is, we
can reproduce any discrete-time signal containing 8 samples using the 5 sinusoids shown
in the figure below. All we have to do is modify the amplitude and phase of the 5 sinusoids
in a particular way. These sinusoids are referred to as sinusoidal basis functions, so
there are 5 sinusoidal basis functions which are modified in terms of amplitude and phase
before they are summed together to reproduce the original discrete signal. You should
note that all the sinusoidal basis functions have an amplitude of 1 and have a phase shift
of 0, relative to a cosine waveform.
It’s worth noting that the red dots shown in the above figure are the samples of sinusoids
which have an integer number of cycles over 8 samples and that the discrete signals
associated with each basis function are simply a sequence of 8 numbers. For example, the
discrete signal associated with the basis function with 4 cycles over 8 samples at the
bottom of the figure is the sequence 1, -1, 1, -1, 1, -1, 1, -1. The blue lines are shown to
make it easier to visualise the sinusoidal nature of these basis functions.
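The 5 sinusoidal basis functions for an 8-sample signal are easy to generate yourself. The sketch below stores each one as a row of a matrix (the name basis is my own choice) and prints the samples of the basis function with 4 cycles over 8 samples.

```octave
% Generate the N/2+1 sinusoidal basis functions for a signal of N = 8 samples.
N = 8;
n = 0:N-1;                             % sample numbers
basis = zeros(N/2+1, N);               % one basis function per row

for k = 0:N/2                          % k cycles over N samples
  basis(k+1, :) = cos(2*pi*k*n/N);     % amplitude 1, phase shift 0
end

% The first row (k = 0) is the DC signal; the last row (k = 4) alternates
% between 1 and -1. Rounding removes tiny floating-point errors.
disp(round(basis(end, :)));
```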
To make this a little bit clearer it can be useful to look at an example discrete-time signal
and its associated modified basis functions.
The plot shown in the bottom right corner of the figure above shows the discrete signal
given by the sequence
[0.7467 0.5378 3.4674 -0.4475 -0.6467 -1.0378 1.4326 -0.0525].
The red dots show the discrete sample points. The 5 basis functions are shown to the left
of the figure. Each basis function is modified in terms of magnitude and phase, as shown
by the waveforms to the right. For example, the basis function that has 2 cycles over 8
samples (the third waveform from the top) has a phase shift of 3.14 radians applied to it
and its amplitude is scaled by a factor of 1.2, i.e. increased by 20%.
The figure below shows a second example where the discrete-time signal in this case is
given by the sequence
[3.2127 0.8225 0.1968 2.1293 0.0723 1.4848 2.9182 1.1634]
Since both of the example signals contain 8 samples, they can both be reproduced by
modifying the same set of sinusoidal basis functions. However, the way in which each of
the basis functions is modified is different for each example.
The key point to appreciate is that any sequence of 8 samples can be reproduced by
modifying the 5 basis functions shown in the left of the figures above. Each unique
sequence of 8 samples will have a unique set of magnitude and phase modification
parameters though.
What the DFT does
The DFT determines how the magnitude and phase of the sinusoidal basis functions
should be modified so that they will reproduce the original signal when the modified basis
functions are added together.
In practice, when the DFT algorithm is applied to a discrete-time signal, it will output a
sequence of complex numbers where the magnitudes of the complex numbers correspond
to the magnitudes that each basis function should be multiplied by. Similarly, the phase
of each complex number corresponds to the phase shift that should be imparted on each
sinusoidal basis function. I’ll show an example of this in action in the paragraphs below
but before I do I’d like to point out that most DFT algorithms are implemented so that
they output N complex numbers rather than N/2 + 1. You’ll find that the ‘additional’
N/2 - 1 complex numbers are somewhat redundant though so, for the moment, don’t worry
too much about them!
[Figure: the Discrete Fourier Transform takes a discrete-time signal [x0 x1 x2 x3 x4 x5 ... xN-1]
and outputs a set of magnitudes [M0 M1 M2 M3 M4 M5 ... MN/2] and a set of phases
[φ0 φ1 φ2 φ3 φ4 φ5 ... φN/2].]
Let’s take a look at how the DFT is used on a discrete signal with 8 samples using a
Matlab/Octave example. The complete mathematical calculations are available in the next
section, but at this stage it’s not necessary to follow the mathematical detail; it’s
more beneficial to focus on what the DFT outputs. The code below shows the variable x
being set equal to an array of 8 values, which represents the 8 samples of a discrete
signal. The variable x is then processed by a function called fft (fft is an acronym for fast
Fourier transform, which is an efficient implementation of the DFT) and you can see that
the fft function returns, or outputs, 8 complex numbers.
>> x = [3.2127 0.8225 0.1968 2.1293 0.0723 1.4848 2.9182 1.1634]; % a discrete-time signal of length 8 samples
>> X = fft(x) % the fft returns, or outputs, 8 complex numbers. The last 3 are the
             % conjugates of earlier values and are generally not required
12.0+0.0i 1.9891+2.5067i 0.17+0.9854i 4.2917-2.9361i 0.8+0.0i
4.2917+2.9361i 0.17-0.9854i 1.9891-2.5067i
>> amps = abs(X(1:5)) % determine the magnitude of the first 5 values of X
12.0000 3.2000 1.0000 5.2000 0.8000
>> phases = angle(X(1:5)) % determine the phase values of the first 5 values of X
0 0.9000 1.4000 -0.6000 0
The Matlab example above demonstrates how to use the fft function to determine the set
of amplitude and phase values of the sinusoids that the time-domain signal x is comprised
of. The example signal used is the same as the second example signal shown earlier (the
sequence beginning 3.2127). Referring to the figure for that example, the phase values
determined by the fft function in the code above directly correspond to φ0, φ1, φ2, φ3 and
φ4 in the figure. Note that the phase of the DC component is zero, which is the first phase
value returned by Matlab’s angle function. The amplitude values returned by Matlab’s abs
function require some additional scaling: the first and last magnitude values of those
listed are divided by N, while the remainder are divided by N/2. This scaling rule applies
regardless of the length of the signal being analysed (assuming N is even, as it is here).
For the sake of completeness I feel I have to mention that the reason some of the sinusoids
are scaled by N/2 is because the ‘somewhat redundant’ sinusoids I mentioned above aren’t
included. I realise this comment is perhaps a distraction for some readers and you can
dismiss it if it causes any confusion.
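The redundancy mentioned above can be checked numerically. The short sketch below is an illustration rather than part of the original example: it assumes a real-valued signal of even length, and the variable name mirror_error is just a label chosen here. It compares bins N/2 + 1 to N - 1 with the complex conjugates of bins 1 to N/2 - 1.

```octave
% Illustrative check: for a real signal, bin N-k is the complex conjugate of bin k
x = [3.2127 0.8225 0.1968 2.1293 0.0723 1.4848 2.9182 1.1634];
N = length(x);                 % assumed to be even
X = fft(x);
k = 1 : N/2-1;                 % bin numbers 1 to N/2-1 (stored at indices k+1)
mirror_error = max(abs(X(N-k+1) - conj(X(k+1))));  % essentially zero for a real signal
```

Since the ‘additional’ bins carry no new information for a real signal, only the first N/2 + 1 bins are needed to describe it.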
To verify that these values represent the amplitudes and phases of the sinusoids that x
is comprised of, example Matlab code is shown below which adds appropriately scaled and
phase shifted sinusoids together. Note: a discrete sinusoid with k cycles over N samples is
given by the expression cos(2πkn/N).
>> N = 8;
>> n = 0 : N-1; % vector of sample numbers
>> dc_component = amps(1)*cos(0*n)/N;
>> one_cycle = amps(2)*cos(2*pi*n/N + phases(2))/(N/2);
>> two_cycle = amps(3)*cos(2*pi*n*2/N + phases(3))/(N/2);
>> three_cycle = amps(4)*cos(2*pi*n*3/N + phases(4))/(N/2);
>> four_cycle = amps(5)*cos(2*pi*n*4/N + phases(5))/N;
>> sinusoids_summed = dc_component + one_cycle + two_cycle ...
       + three_cycle + four_cycle % add the five components to reproduce the original signal
3.2127 0.8225 0.1968 2.1293 0.0723 1.4848 2.9182 1.1634
As can be seen by the numerical values shown on the last line above, the sinusoids added
together reproduce the original signal x.
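The same reconstruction can be written as a loop so that it works for other signal lengths. The following is a minimal sketch, assuming a real-valued signal with an even number of samples; the names x_rebuilt, scale and max_error are illustrative choices, and the signal used is the first 8-sample example sequence given earlier.

```octave
% Sketch: rebuild a real signal of even length N from its first N/2 + 1 bins
x = [0.7467 0.5378 3.4674 -0.4475 -0.6467 -1.0378 1.4326 -0.0525];
N = length(x);                 % assumed to be even
n = 0 : N-1;                   % vector of sample numbers
X = fft(x);
amps = abs(X(1 : N/2+1));
phases = angle(X(1 : N/2+1));
x_rebuilt = zeros(1, N);
for k = 0 : N/2
    if k == 0 || k == N/2
        scale = N;             % first and last components are divided by N
    else
        scale = N/2;           % all other components are divided by N/2
    end
    x_rebuilt = x_rebuilt + amps(k+1)*cos(2*pi*k*n/N + phases(k+1))/scale;
end
max_error = max(abs(x_rebuilt - x));   % essentially zero if the reconstruction worked
```

The loop adds one modified basis function per pass, which mirrors the five separate lines used in the 8-sample example above.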
The short example sequence of 8 samples above is useful in terms of being able to
appreciate the numerical values that the DFT returns, however, in practice, scientists and
engineers usually analyse discrete-time signals which are hundreds, thousands or even
millions of samples long. The reader is referred back to the section on ‘How to use
Octave/Matlab’s fft function’ to see a practical audio example in which a bass guitar signal
is analysed.
A note on summing cosine and sine waveforms
It’s worth noting that a sinusoid given by Acos(2πft + φ) can also be represented as the
sum of a sine waveform and a cosine waveform of the same frequency, f, with no phase
offsets i.e.
Acos(2πft + φ) = Ac cos(2πft) - As sin(2πft)
where A = √(Ac² + As²) and φ = atan2(As, Ac). (I won’t provide the proof of this here but the
interested reader could show this from Ptolemy’s identities).
The following example code will help illustrate this, whereby two sinusoids synthesised
in two different ways are shown to be equivalent.
>> t = 0 : 0.01 : 1;
>> f = 4;
>> x1 = 3*cos(2*pi*f*t) + 4*sin(2*pi*f*t); % sinusoid 1
>> x2 = sqrt(3^2+4^2)*cos(2*pi*f*t + atan2(-4,3)); % sinusoid 2 : 5cos(8πt - 0.9273)
>> plot(t,x1); hold on;
>> plot(t,x2, 'ro'); % plot in red circles
>> xlabel('Time (seconds)'); ylabel('Amplitude');
>> title('Both sinusoids are the same!')
So, any sinusoidal waveform (with any phase shift) can be further decomposed into a sine
waveform and cosine waveform with zero phase shifts.
You’ll notice that the relationship between the amplitude of the sinusoidal waveform A
and the amplitudes of the sine and cosine waveforms (As and Ac, respectively) can be
visualised using a right-angle triangle or an argand diagram (Readers familiar with phasor
representations will also appreciate this!), as shown below.
[Figure: plot of both sinusoids, amplitude (from -5 to 5) against time (0 to 1 seconds),
titled ‘Both sinusoids are the same!’.]
The key point to note from this is that the values returned from the fft function are
complex numbers, and while the previous section used the magnitude and phase values
of these complex numbers to apply an appropriate amplitude scaling and phase shift to
sinusoidal basis functions, an alternative approach is to use the real and imaginary parts
of the complex numbers to appropriately scale cosine and sine waveforms with no phase
shift. Compare the following code with the code in the section ‘What the DFT does’; you’ll
notice I get the same result using this alternative approach.
>> real_vals = real(X(1:5)) % determine the real values of the first 5 values of X
>> imag_vals = imag(X(1:5)) % determine the imaginary values of the first 5 values of X
>> N = 8;
>> n = 0 : N-1; % vector of sample numbers
>> dc_component = real_vals(1)*cos(0*n)/N; % bin 0 is purely real for a real signal
>> one_cycle = (real_vals(2)*cos(2*pi*n/N) - imag_vals(2)*sin(2*pi*n/N))/(N/2);
>> two_cycle = (real_vals(3)*cos(2*pi*2*n/N) - imag_vals(3)*sin(2*pi*2*n/N))/(N/2);
>> three_cycle = (real_vals(4)*cos(2*pi*3*n/N) - imag_vals(4)*sin(2*pi*3*n/N))/(N/2);
>> four_cycle = (real_vals(5)*cos(2*pi*4*n/N) - imag_vals(5)*sin(2*pi*4*n/N))/N;
>> sinusoids_summed = dc_component + one_cycle + two_cycle ...
       + three_cycle + four_cycle % add the five components to reproduce the original signal
3.2127 0.8225 0.1968 2.1293 0.0723 1.4848 2.9182 1.1634
The reason I’ve gone through this cosine/sine perspective is because the inner workings
of the DFT are more readily explained from this perspective.
The inner workings of the DFT
A brief overview of how the DFT works can be seen at pzdsp.com/vid26, which some
readers may find useful before getting into the mathematics.
Mathematically the DFT is described by the following equation

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j\omega_k n}$$

where x[n] is the discrete time signal being analysed; e is Euler’s number (approx. 2.71828),
and from Euler’s formula e^{jθ} = cos(θ) + jsin(θ); k is the so-called bin number, which is an
integer; ωk is the frequency associated with bin k, which is given by ωk = 2πk/N radians per
sample; N is the length of the signal x[n] being analysed. X[k] is typically evaluated over a
range of values of k from 0 up to N-1, however 0 up to N/2 is usually all that is required.
I’ll attempt to provide a visually intuitive perspective on this algorithm later but before
doing that it can be useful to see a worked example of the algorithm, so let’s return to our
example of a discrete signal given by the sequence of numbers [3.2127 0.8225 0.1968
2.1293 0.0723 1.4848 2.9182 1.1634]. Don’t worry too much about following the
calculations closely; I just want you to appreciate that all that’s going on is a
series of multiplications and additions using complex numbers!
Note that ωk = 0 when k = 0, so e^{-jωkn} = 1, since e^{0} = 1. Therefore

$$X[0] = \sum_{n=0}^{7} x[n] = x[0] + x[1] + x[2] + x[3] + x[4] + x[5] + x[6] + x[7]$$

Summing these values gives us a result of 12 for X[0].
When k = 1, ωk = π/4. Therefore

$$X[1] = \sum_{n=0}^{7} x[n]\, e^{-j\pi n/4} = x[0]e^{-j0} + x[1]e^{-j\pi/4} + x[2]e^{-j2\pi/4} + x[3]e^{-j3\pi/4} + x[4]e^{-j4\pi/4} + x[5]e^{-j5\pi/4} + x[6]e^{-j6\pi/4} + x[7]e^{-j7\pi/4}$$

This is equal to

$$x[0] + x[1]\left(\tfrac{1}{\sqrt{2}} - \tfrac{j}{\sqrt{2}}\right) + x[2](-j) + x[3]\left(-\tfrac{1}{\sqrt{2}} - \tfrac{j}{\sqrt{2}}\right) + x[4](-1) + x[5]\left(-\tfrac{1}{\sqrt{2}} + \tfrac{j}{\sqrt{2}}\right) + x[6](j) + x[7]\left(\tfrac{1}{\sqrt{2}} + \tfrac{j}{\sqrt{2}}\right)$$

Summing these values gives us a result of 1.9891 + 2.5067j for X[1].
When k = 2, ωk = π/2. Therefore

$$X[2] = \sum_{n=0}^{7} x[n]\, e^{-j\pi n/2} = x[0]e^{-j0} + x[1]e^{-j\pi/2} + x[2]e^{-j\pi} + x[3]e^{-j3\pi/2} + x[4]e^{-j2\pi} + x[5]e^{-j5\pi/2} + x[6]e^{-j3\pi} + x[7]e^{-j7\pi/2}$$

This is equal to

$$x[0] + x[1](-j) + x[2](-1) + x[3](j) + x[4](1) + x[5](-j) + x[6](-1) + x[7](j)$$

Summing these values gives us a result of 0.1700 + 0.9854j for X[2].
When k = 3, ωk = 3π/4. Therefore

$$X[3] = \sum_{n=0}^{7} x[n]\, e^{-j3\pi n/4} = x[0]e^{-j0} + x[1]e^{-j3\pi/4} + x[2]e^{-j6\pi/4} + x[3]e^{-j9\pi/4} + x[4]e^{-j12\pi/4} + x[5]e^{-j15\pi/4} + x[6]e^{-j18\pi/4} + x[7]e^{-j21\pi/4}$$

This is equal to

$$x[0] + x[1]\left(-\tfrac{1}{\sqrt{2}} - \tfrac{j}{\sqrt{2}}\right) + x[2](j) + x[3]\left(\tfrac{1}{\sqrt{2}} - \tfrac{j}{\sqrt{2}}\right) + x[4](-1) + x[5]\left(\tfrac{1}{\sqrt{2}} + \tfrac{j}{\sqrt{2}}\right) + x[6](-j) + x[7]\left(-\tfrac{1}{\sqrt{2}} + \tfrac{j}{\sqrt{2}}\right)$$

Summing these values gives us a result of 4.2917 - 2.9361j for X[3].
When k = 4, ωk = π. Therefore

$$X[4] = \sum_{n=0}^{7} x[n]\, e^{-j\pi n} = x[0]e^{-j0} + x[1]e^{-j\pi} + x[2]e^{-j2\pi} + x[3]e^{-j3\pi} + x[4]e^{-j4\pi} + x[5]e^{-j5\pi} + x[6]e^{-j6\pi} + x[7]e^{-j7\pi}$$

This is equal to

$$x[0](1) + x[1](-1) + x[2](1) + x[3](-1) + x[4](1) + x[5](-1) + x[6](1) + x[7](-1)$$

Summing these values gives us a result of 0.8 + 0j for X[4].
The above calculations show that the DFT is just undertaking a number of multiplications
and summations; that’s all! There isn’t anything too mysterious there really and it can be
useful to remind yourself of that. When the DFT is applied to longer signals there are just
going to be a greater number of multiplications and additions but the process doesn’t
change. It is worth pointing out that there are efficient ways to do these calculations that
take advantage of the fact that some calculations don’t need to be repeated (for example,
you’ll notice in the calculations above that x[3] and x[5] are both multiplied by 1/√2 when
k is 1 and also when k is 3). Efficient algorithms, such as the fast Fourier transform, take
advantage of this type of redundancy by only doing the multiplication once and storing
the result for later use. You should note that the output of the DFT is the same as that of
any of the efficient implementations, such as the fast Fourier transform; the only
difference is how the DFT outputs are calculated.
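To make the point concrete, the DFT equation can be evaluated directly with a loop and compared against the fft function. This is a minimal sketch of the multiply-and-add process described above, not an efficient implementation; the names X_direct, w_k and max_difference are illustrative.

```octave
% Sketch: evaluate the DFT equation directly and compare with the fft function
x = [3.2127 0.8225 0.1968 2.1293 0.0723 1.4848 2.9182 1.1634];
N = length(x);
n = 0 : N-1;                                   % vector of sample numbers
X_direct = zeros(1, N);
for k = 0 : N-1
    w_k = 2*pi*k/N;                            % frequency of bin k in radians per sample
    X_direct(k+1) = sum(x .* exp(-1j*w_k*n));  % multiply by the complex exponential, then sum
end
X_fft = fft(x);
max_difference = max(abs(X_direct - X_fft));   % essentially zero: same output, different method
```

For a signal of N samples the loop performs N multiplications and additions for each of the N bins, which is why the efficient algorithms matter for long signals.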
It’s useful to establish some terminology in order to discuss the DFT and in particular it’s
useful to become comfortable with the terms bin number and bin value. The variable k
used in the DFT calculations above is referred to as the bin number, and the complex
number X[k] is the bin value. For example, the bin value associated with bin number 0 is 12
i.e. X[0] = 12+0j. Bin number 3 has a value of 4.2917 - 2.9361j i.e. X[3] = 4.2917 - 2.9361j.
Bin numbers are always integers. Bin number k is associated with sinusoidal waveforms
that oscillate k cycles over N samples (as can be appreciated from the code and plots in
the paragraphs below). So, bin number 2 is associated with a sinusoid that has a frequency
of 2 cycles over N samples; bin number 3 is associated with a sinusoid that has a frequency
of 3 cycles over N samples and so on. When you are dealing with signals that contain more
samples then you’ll find that you have more DFT bin values, which basically just indicates
that you need more sinusoidal waveforms to create longer signals. Make sure that you
remember that a bin number k is always associated with a sinusoid that has k cycles over
N samples, and that k is always an integer.
While it’s nice to see how to calculate the output of the DFT it’s perhaps more insightful
to be able to visualise what the algorithm is doing and this is easier if we use Euler’s
formula (e^{jθ} = cos(θ) + jsin(θ)) to allow us to break the DFT equation into two parts, as
shown below.

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j\omega_k n} = \sum_{n=0}^{N-1} x[n]\cos(\omega_k n) - j\sum_{n=0}^{N-1} x[n]\sin(\omega_k n)$$

since $e^{-j\omega_k n} = \cos(\omega_k n) - j\sin(\omega_k n)$, from Euler’s formula.
Looking at the left-hand summation in the expression above it can be seen that the
discrete signal x[n] is being multiplied by cos(ωkn) and the samples of the resulting signal
(sequence of numbers) are being summed. A similar operation is taking place with a sine
waveform with the right-hand summation term, but note that the summation is
multiplied by -j, where j = √-1.
It’s important to visualise what the signal cos(ωkn) looks like, where ωk = 2πk/N. If you
evaluated this expression for 0 ≤ n < N, you’d find that the waveform associated with
cos(ωkn) is sinusoidal and that the sinusoid has exactly k cycles over N samples, as shown
in the example code below. The signal sin(ωkn) is, of course, similar, except the waveforms
associated with this expression will be π/2 radians out of phase with their cosine
counterparts.
>> N = 100;
>> n = 0 : N-1; % vector of sample numbers
>> k = 3;
>> x1 = cos((2*pi*k/N)*n);
>> x2 = sin((2*pi*k/N)*n);
>> plot(n, x1, n, x2);
>> title('Plot of cos((2{\pi}k/N)n) and sin((2{\pi}k/N)n) for k = 3')
The figure on the next page attempts to capture the DFT process visually. It shows a
discrete signal x[n] at the top being multiplied by a set of sine and cosine waveforms
(referred to as basis functions) to the left, where the basis functions all have an integer
number of cycles over N samples. The result of the multiplication is shown in the
waveforms to the right and the samples of these waveforms (note the sample values
displayed are rounded to 2 decimal places) are then summed to give a numerical result,
with the ‘sine’ terms multiplied by -j, as per the DFT equation. These numerical results
correspond to DFT bin values, where the imaginary part of the bin values are associated
with sine basis functions and the real part of the bin values are associated with the cosine
basis function. You should note how these values relate to the calculations shown earlier
in this section, for example X[1] = 1.9891+2.5067j, which correspond to the results
associated with the sine and cosine basis functions with 1 cycle over 8 samples.
The magnitude of the numerical result obtained by multiplying and summing gives an
indication of how ‘strongly present’ a particular basis function is in the signal x[n] and this
process of multiplying and summing is known as correlation (see pzdsp.com/vid25 for
an overview of correlation). Looking at the figure we can see that the cosine basis
function with 3 cycles is the most ‘strongly present’ sinusoidal basis function in the signal
x[n], since it is associated with the largest numerical result (4.2917) shown on the right, in
terms of magnitude. A visual inspection of the plot of x[n] also shows a waveform that
appears to have a pattern that repeats approximately three times; this is not a coincidence!
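The multiply-and-sum (correlation) step for a single bin can be sketched in a few lines. The example below is an illustration only, using the 8-sample example signal and bin 3; the names cos_corr, sin_corr and bin_value are chosen here for clarity.

```octave
% Sketch: correlate x[n] with the 3-cycle cosine and sine basis functions
x = [3.2127 0.8225 0.1968 2.1293 0.0723 1.4848 2.9182 1.1634];
N = length(x);
n = 0 : N-1;                            % vector of sample numbers
k = 3;                                  % bin number
cos_corr = sum(x .* cos(2*pi*k*n/N));   % close to 4.2917
sin_corr = sum(x .* sin(2*pi*k*n/N));   % close to 2.9361
bin_value = cos_corr - 1j*sin_corr;     % sine result is multiplied by -j, as per the DFT equation
X = fft(x);                             % X(k+1) holds bin k for comparison
```

Changing k to any other integer from 0 to 4 gives the corresponding bin value in the same way.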
The fact that bin 0 has a value of 12 means that the signal has a positive DC offset i.e. the
mean of the samples is greater than zero. If bin 0 has a value of zero this indicates that
the mean of the samples is zero while a negative result for bin 0 indicates the mean of the
samples is negative i.e. a negative DC offset.
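The link between bin 0 and the mean of the samples can be confirmed with a quick check. This is a small sketch using the example signal; dc_offset and sample_mean are illustrative names.

```octave
% Sketch: bin 0 divided by N is the mean (DC offset) of the samples
x = [3.2127 0.8225 0.1968 2.1293 0.0723 1.4848 2.9182 1.1634];
N = length(x);
X = fft(x);
dc_offset = real(X(1))/N;    % bin 0 is stored at index 1; 12/8 for this signal
sample_mean = mean(x);       % should match dc_offset
```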
It’s worth noting that the frequency associated with bin k is given by k·fs/N Hz, since there
are k cycles over N samples, and for a sampling rate fs this equates to k cycles over N/fs
seconds, and the unit of Hertz relates to the number of cycles per second.
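As a small worked example of this relationship, the sketch below converts bin numbers to frequencies in Hz. The sampling rate of 8000 Hz is a hypothetical value chosen purely for illustration; it is not taken from any signal in these notes.

```octave
% Sketch: convert bin numbers to frequencies in Hz
fs = 8000;                   % hypothetical sampling rate in Hz (assumption for illustration)
N = 8;                       % number of samples
k = 0 : N/2;                 % bin numbers 0 to N/2
bin_freqs_hz = k*fs/N;       % k cycles over N/fs seconds
```

With these values bin 1 corresponds to 1000 Hz and bin N/2 corresponds to fs/2 = 4000 Hz.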
At this stage I’d like to highlight that the result of summing the samples of the waveform
produced by multiplying the sine wave basis functions by the signal x[n] is also multiplied
by -j, where j = √-1. The reason for this can be appreciated by referring back to the DFT
equation, where it can be seen that sine basis functions are multiplied by -j.

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j\omega_k n} = \sum_{n=0}^{N-1} x[n]\cos(\omega_k n) - j\sum_{n=0}^{N-1} x[n]\sin(\omega_k n)$$
[Figure: visual representation of the DFT process. The signal being analysed,
x[n] = [3.2127 0.8225 0.1968 2.1293 0.0723 1.4848 2.9182 1.1634], is multiplied by the basis
functions cos(2πn(k)/N) and sin(2πn(k)/N) for k = 0 to 4, and the samples of each product
are summed to give the bin values. Bin 0: 12; Bin 1: 1.9891 + 2.5067j; Bin 2: 0.17 + 0.9854j;
Bin 3: 4.2917 - 2.9361j; Bin 4: 0.8 + 0j.]
Referring back to the calculations shown earlier in this section, we can see that the DFT
outputs a sequence of complex numbers with the real values associated with how ‘strongly
present’ cosine basis functions are in the signal and imaginary values associated with the
‘strength’ of the sine terms. Recall also, from the section entitled ‘A note on summing
cosine and sine waveforms’, that if a cosine and a sine term are present in a signal then
this is equivalent to a sinusoidal waveform of the same frequency with a phase shift being
present i.e.
Acos(ωkn + φ) = Ac cos(ωkn) - As sin(ωkn)
where A = √(Ac² + As²) = |Ac + jAs| and φ = atan2(As, Ac) = ∠(Ac + jAs)
This allows us to consider the fact that we obtained a correlation result of 4.2917 and
-2.9361j for the three-cycle cosine and sine basis functions as being equivalent to indicating
that a sinusoid given by the following equation is present in the signal (before the N/2
scaling described earlier is applied):
5.2cos(6πn/N - 0.6) = 4.2917cos(6πn/N) + 2.9361sin(6πn/N)
At this stage you should appreciate that the DFT identifies the presence of sinusoidal
components in a signal by correlating the signal of interest with a set of sine and cosine
basis functions. In the next section we’ll take a look at some specific signals with a view to
being able to predict/interpret what the DFT outputs for ‘well-defined’ signals.
DFT analysis of well-defined signals
It can be worthwhile considering a few particular examples to appreciate how the DFT
works and what it outputs. The remainder of this section examines four such examples:
• A sine waveform with 2 cycles over N samples and amplitude 2 i.e. x[n] = 2sin(4πn/N)
• A cosine waveform with 2 cycles over N samples and amplitude 3 i.e. x[n] = 3cos(4πn/N)
• A signal comprising a sine waveform with 1 cycle over N samples and a cosine
waveform with 3 cycles over N samples with amplitudes 2 and 0.5, respectively i.e.
x[n] = 2sin(2πn/N) + 0.5cos(6πn/N)
• A cosine waveform with amplitude 3, a phase shift of -1.0854 radians and a frequency of
3 cycles over N samples i.e. x[n] = 3cos(6πn/N - 1.0854)
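The four example signals listed above can be synthesised and passed to the fft function directly. The following is a sketch, assuming N = 8 as in the figures for this section; the variable names x1 to x4 and X1 to X4 are illustrative.

```octave
% Sketch: synthesise the four 'well-defined' example signals and compute their DFTs
N = 8;
n = 0 : N-1;                                   % vector of sample numbers
x1 = 2*sin(4*pi*n/N);                          % sine, 2 cycles, amplitude 2
x2 = 3*cos(4*pi*n/N);                          % cosine, 2 cycles, amplitude 3
x3 = 2*sin(2*pi*n/N) + 0.5*cos(6*pi*n/N);      % 1-cycle sine plus 3-cycle cosine
x4 = 3*cos(6*pi*n/N - 1.0854);                 % phase shifted cosine, 3 cycles
X1 = fft(x1); X2 = fft(x2); X3 = fft(x3); X4 = fft(x4);
% Bin k is stored at index k+1, so bin 2 is X1(3) and bin 3 is X4(4)
amp1 = abs(X1(3))/(N/2);                       % amplitude recovered from bin 2 of the first signal
amp2 = abs(X2(3))/(N/2);                       % amplitude recovered from bin 2 of the second signal
```

Bins that should be zero will typically come out as very small numbers rather than exact zeros because of floating-point rounding.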
For each of these four examples I’ll show a visual representation of how the DFT would
process each one and in the text I’ll point out the key features I’d like you to focus on.
You’ll notice that in all four cases the sinusoidal basis functions shown to the left of
the figures do not change, since in all four cases we are dealing with signals that have 8
samples i.e. N = 8.
For the first two examples you can see that the DFT bin values (the values to the right)
are zero except for those values associated with basis functions that have 2 cycles over N
samples. You’ll also notice that for the first two examples the amplitude of the
sinusoidal waveforms can be determined from the DFT output by dividing the magnitude of
the value of bin number 2 by N/2 = 4. This is an important fact to note: the magnitudes of
the bin values are proportional to the amplitudes of the sinusoidal basis functions present
in a signal. If a bin value is zero then this indicates that the amplitude of the sinusoidal
basis function associated with that bin value is also zero. In other words a particular basis
function is not present in a signal if its associated bin value is zero.
[Figure: first example. The signal being analysed, x[n] = 2sin(4πn/N) = [0 2 0 -2 0 2 0 -2],
is multiplied by the basis functions cos(2πn(k)/N) and sin(2πn(k)/N) for k = 0 to 4. All bin
values are zero except bin 2, which has a value of -8j.]
[Figure: second example. The signal being analysed, x[n] = 3cos(4πn/N) = [3 0 -3 0 3 0 -3 0],
is multiplied by the basis functions cos(2πn(k)/N) and sin(2πn(k)/N) for k = 0 to 4. All bin
values are zero except bin 2, which has a value of 12.]
The third example shows a case where the signal being analysed is comprised of two scaled
basis functions of different frequencies. As you might expect, the DFT bin values are
non-zero for the bins associated with these two sinusoids. Bin number 1 has a value of 0-8j,
where the imaginary value indicates that the signal being analysed contains a sine
waveform which oscillates at a frequency of one cycle every N samples. Bin number 3 has
a value of 2+0j, indicating that the signal being analysed contains a cosine waveform which
oscillates at a frequency of 3 cycles every N samples and that it is one quarter the
amplitude of the sine waveform.
[Figure: third example. The signal being analysed, x[n] = 2sin(2πn/N) + 0.5cos(6πn/N) =
[0.5 1.06 2.0 1.7678 -0.5 -1.06 -2.0 -1.7678], is multiplied by the basis functions
cos(2πn(k)/N) and sin(2πn(k)/N) for k = 0 to 4. All bin values are zero except bin 1, which
has a value of -8j, and bin 3, which has a value of 2.]
The fourth and last example is a case where the signal being analysed is comprised of a
cosine waveform of amplitude 3 with a phase shift of -1.0854 radians. From our discussion
earlier in this section it should be appreciated that this waveform can be considered to be
a sum of a sine waveform and a cosine waveform. The equation below shows this
relationship mathematically.
3cos(6πn/N - 1.0854) = 1.3997cos(6πn/N) + 2.6535sin(6πn/N)
We can see that the bin values of the DFT are all zero except for the case of bin 3. This
indicates that the signal only contains sinusoids that oscillate at 3 cycles over N samples,
as you should expect. The fact that bin number 3 contains both real and imaginary
non-zero values indicates that both a sine waveform and a cosine waveform are present in
the signal.
[Figure: fourth example. The signal being analysed, x[n] = 3cos(6πn/N - 1.0854) =
[1.4 0.887 -2.653 2.866 -1.4 -0.887 2.65 -2.867], is multiplied by the basis functions
cos(2πn(k)/N) and sin(2πn(k)/N) for k = 0 to 4. All bin values are zero except bin 3, which
has a value of 5.5987 - 10.614j.]
As before, the bin values should be divided by N/2 = 4 in order to determine how the
amplitudes of the basis functions should be scaled if you wanted to reproduce the original
signal by summing the scaled basis functions.
You’ll notice that the amplitudes of the scaled basis functions are equal to the amplitudes
of the cosine and sine waveforms shown in the equation given earlier.
You should also notice that the magnitude of the complex number associated with bin 3 i.e.
5.5987 - 10.614j, divided by N/2, is equal to the amplitude of the phase shifted cosine
waveform and that the phase shift is equal to the angle associated with this complex
number.
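These relationships can be checked numerically from the bin 3 value quoted above. The sketch below is illustrative; the names bin3, amplitude, phase_shift, cos_amp and sin_amp are chosen here and do not appear elsewhere in these notes.

```octave
% Sketch: recover amplitude, phase and cosine/sine amplitudes from bin 3 of the fourth example
N = 8;
bin3 = 5.5987 - 10.614j;          % bin value quoted in the text
amplitude = abs(bin3)/(N/2);      % close to 3
phase_shift = angle(bin3);        % close to -1.0854 radians
cos_amp = real(bin3)/(N/2);       % amplitude of the zero-phase cosine, close to 1.3997
sin_amp = -imag(bin3)/(N/2);      % amplitude of the zero-phase sine, close to 2.6535
```

The minus sign applied to the imaginary part reflects the -j that multiplies the sine summation in the DFT equation.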
Key points and a brief note on negative frequencies
The take away points from the previous section are that discrete signals of length N
samples can be reconstructed by adding together sinusoids and that the DFT can be used
to determine the nature of the sinusoids that will, when summed, reproduce the original
signal. A key feature of the sinusoidal waveforms is that they all have an integer number
of cycles over N samples.
There are two related perspectives on the sinusoidal summing process. From one
perspective you can consider the sinusoids to be N/2 + 1 cosine waveforms which have a
particular amplitude and phase shift applied, and that the magnitude and phase of the
complex values that DFT outputs correspond to the amplitude and phase shift that should
be applied to each of the N/2 + 1 cosine waveforms.
From a second perspective the sinusoids that are summed are sine waveforms and cosine
waveforms with no phase shift. In this case there are a total of N waveforms to be summed.
The real part of the complex values the DFT outputs corresponds to the amplitude of the
cosine waveforms and the imaginary part corresponds to the amplitude of the sine
waveforms, which should be further multiplied by -1.
Both of these perspectives are equivalent since a cosine of any amplitude with a phase
shift is mathematically equal to the sum of zero-phase shifted cosine and sine waveforms
of certain amplitudes (see the section entitled ‘A note on summing cosine and sine
waveforms’).
A third perspective worth considering is that of the discrete signals being a sum of what
are referred to as complex exponentials, which are described mathematically by the
expression e^{-jωt}. Complex exponentials are trickier to visualise than sinusoidal waveforms
because they occupy complex space (see pzdsp.com/vid27 for a way to visualise them) but
they are fundamentally a ‘more correct’ way of thinking about the DFT as they can be used
to recreate any signal, even those that are a sequence of complex numbers rather than the
discrete signals that are a sequence of real valued numbers that we have considered so far.
I’d like to stress that this perspective isn’t essential to understanding and using the DFT
to analyse signals. However, many text books do refer to complex exponentials (and their
associated negative frequencies) and they are required in certain contexts (quadrature
signal analysis for example) so the interested reader may find the following useful. You
should feel free to skip on to the next section without any loss in continuity if you prefer
though.
Complex exponentials have the general form of Ae^{j(ωt+φ)}, where ω represents angular
frequency, t represents time and φ represents a phase shift, and they are related to
sinusoidal waveforms by Euler’s formula, e^{jθ} = cos(θ) + jsin(θ).
Since cos(-θ) = cos(θ) and sin(-θ) = -sin(θ) then

$$\cos(\theta) = \frac{e^{j\theta} + e^{-j\theta}}{2}$$

Therefore

$$A\cos(\omega t + \phi) = \frac{A e^{j(\omega t + \phi)} + A e^{-j(\omega t + \phi)}}{2}$$

The equation above shows that any phase shifted cosine waveform can be considered as
being the sum of two complex exponentials, Ae^{j(ωt+φ)} and Ae^{-j(ωt+φ)}, each halved. You’ll
notice that the two complex exponential terms are very similar except that one has a
negative sign in its exponent. This negative sign controls the direction of rotation of the
complex exponential, which can be understood by referring to pzdsp.com/vid27.
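The relationship between a phase shifted cosine and a pair of complex exponentials can be verified numerically. The sketch below uses hypothetical values for A, f and phi chosen only for illustration; the names pos_term, neg_term and summed are likewise illustrative.

```octave
% Sketch: a phase shifted cosine as the sum of two complex exponentials
A = 3; f = 4; phi = -1.0854;              % hypothetical amplitude, frequency (Hz) and phase
t = 0 : 0.01 : 1;                         % time in seconds
w = 2*pi*f;                               % angular frequency
pos_term = (A/2)*exp( 1j*(w*t + phi));    % positive frequency complex exponential
neg_term = (A/2)*exp(-1j*(w*t + phi));    % negative frequency complex exponential
summed = pos_term + neg_term;             % imaginary parts cancel
max_imag = max(abs(imag(summed)));                      % essentially zero
max_error = max(abs(real(summed) - A*cos(w*t + phi)));  % essentially zero
```

The two terms are complex conjugates of each other, which is why their sum is purely real.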
Earlier in this document I provided a number of frequency-domain plots of signals which
showed what is known as the magnitude spectrum of the signal. The magnitude spectrum
shows the magnitudes of the sinusoids that are present in a signal, with magnitude shown
on the vertical axis of the plot and frequency along the horizontal axis. It is also very
common to show the magnitude spectrum as an illustration of the magnitudes of the
complex exponentials in a signal and in this case you will see negative frequencies as well
as positive frequencies. An example of this is shown below with details on how to create
these plots available at pzdsp.com/vid28. You should note that the concept of negative
frequency is only relevant in the context of complex exponentials where positive and
negative frequency complex exponentials rotate in the opposite direction in complex
space as illustrated in pzdsp.com/vid27.
[Figure: two plots of magnitude against frequency (Hz). The first, titled ‘Single-sided
Magnitude spectrum (Hertz)’, spans 0 to 30 Hz; the second, titled ‘Double-sided Magnitude
spectrum (Hertz)’, spans -30 to 30 Hz.]
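One possible way to compute the data for single-sided and double-sided magnitude spectra is sketched below. This is not necessarily the method shown in the referenced video: it assumes an even number of samples, uses Octave/Matlab’s fftshift function to move the negative-frequency bins to the left, and analyses a hypothetical 10 Hz cosine of amplitude 3 sampled at 64 Hz.

```octave
% Sketch: data for single-sided and double-sided magnitude spectra (N assumed even)
fs = 64;                              % hypothetical sampling rate in Hz
N = 64;                               % number of samples
n = 0 : N-1;
x = 3*cos(2*pi*10*n/fs);              % hypothetical test signal: 10 Hz cosine, amplitude 3
mags = abs(fft(x))/N;                 % magnitudes of the complex exponentials
f_single = (0 : N/2)*fs/N;            % frequencies of bins 0 to N/2
mag_single = mags(1 : N/2+1);
f_double = (-N/2 : N/2-1)*fs/N;       % negative and positive frequencies
mag_double = fftshift(mags);          % reorder so that bins line up with f_double
```

Calling plot(f_single, mag_single) and plot(f_double, mag_double) would then draw the two spectra; for this test signal the double-sided version has peaks of 1.5 at both -10 Hz and 10 Hz.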
The DFT in Practice - zero-padding and windowing
In the previous section I used some well-defined signals to show how the DFT works. The
signals I used were synthesised so that they all were perfectly sinusoidal and they all had
an integer number of cycles over 8 samples. I’m sure you can appreciate that this is very
unlikely to ever occur in a real-world situation.
At this stage I’d like to highlight a couple of issues with the DFT by going back to a
practical example whereby we are analysing the frequency content of a bass guitar signal
(see page 9). These issues are also explained in pzdsp.com/vid29. I’ve used the code below
to plot the magnitude spectrums of two very similar segments of the bass guitar signal, as
can be seen in the plots below. One of the time-domain segments is just 400 samples
longer than the other but they share 9270 of the exact same samples. What’s perhaps
surprising is that the magnitude spectrums of these two signals are significantly different,
despite the fact that they are so similar in the time-domain. You’ll notice that the
amplitude of the frequency content associated with the fundamental frequency (first peak
on the left) is very different. Also, in one magnitude spectrum the spectral shape of the
fundamental is similar to the first two harmonics while in the other one they are quite
different.
>> [ip, fs] = audioread('bass_note.wav'); % from pzdsp.com/sig1
>> N = 9670; %change this to 9270 for the second signal
>> seg = ip(10000:10000+N-1);
>> plot(seg)
>> xlabel('Samples');
>> ylabel('Amplitude');
>> fax = [0:N-1]*fs/N;
>> figure; plot(fax, abs(fft(seg))/N);
>> xlabel('Frequency (Hz)');
>> ylabel('Magnitude');
>> xlim([0 500]) % just show the first 500 Hz
This difference between magnitude spectrums can be problematic since engineers and
scientists are often looking for patterns in signals (in both time and frequency domains)
to diagnose problems or to extract useful information. If the shape (or pattern) of the
spectral information associated with a signal is heavily dependent on the length of the
time-domain signal being analysed then the frequency-domain perspective becomes less
useful in practice. Luckily techniques exist to deal with this issue!
The difference in the magnitude spectrums is caused by a phenomenon known as spectral
leakage (spectral spreading). In the next two sub-sections I’ll explain what causes this
phenomenon, but before I do I’d like to show you how DSP practitioners deal with it
using two techniques known as “windowing” and “zero-padding”. Later on, I’ll
explain why these techniques ensure that the length of the time-domain signal does not
significantly impact the magnitude spectrum, but for now
I’d just like you to see their effect and how straightforward they are to implement.
Zero-padding is simply the process of appending samples of zero amplitude to a signal as
shown in the code and plot below. In this case I have appended 5000 samples of zero
amplitude to the bass guitar segment. In other words, I have zero-padded the segment by
5000 samples.
>> [ip, fs] = audioread('bass_note.wav'); % from pzdsp.com/sig1
>> N = 9670; %change this to 9270 for the second signal
>> seg = ip(10000:10000+N-1);
>> zpad_seg = [seg ; zeros(5000,1)]; %zero-padding
>> plot(zpad_seg)
>> xlabel('Samples');
>> ylabel('Amplitude');
>> title('Bass segment zero padded by 5000 samples')
[Figure: time-domain plots (Samples vs Amplitude) and corresponding magnitude spectra (Frequency (Hz) vs Magnitude, 0 to 500 Hz) of the two bass guitar segments]
In practice it can be beneficial to zero-pad by much larger amounts. An appreciation for
how much you should zero-pad will be provided later on in this document. The following
code and plots show the impact of zero-padding the two bass segments dealt with earlier.
In this case zero-padding is five times the length of the original signal.
>> [ip fs]= audioread('bass_note.wav');
>> N = 9670; %change this to 9270 for the second plot
>> seg = ip(10000:10000+N-1);
>> zpad_seg = [seg; zeros(N*5,1)]; %append N*5 zeros
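>> N2 = length(zpad_seg); % length after zero-padding
>> plot(zpad_seg)
>> xlabel('Samples');
>> ylabel('Amplitude');
>> fax = [0:N2-1]*fs/N2; % frequency axis based on the zero-padded length
>> figure; plot(fax, abs(fft(zpad_seg))/N); % new figure so the time-domain plot is kept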
>> xlabel('Frequency (Hz)');
>> ylabel('Magnitude');
>> xlim([0 500]) % just show first 500 Hz
The time-domain plots below are the same as the ones shown above except they are
zero-padded i.e. one time-domain signal contains 9670 samples of the bass signal while the
other contains 9270. You can see that the frequency content of both zero-padded signals
looks much more similar than in the frequency-domain plots above. They will become even
more similar if the amount of zero-padding is increased, although more zero-padding
means the fft has more points to compute i.e. it will take the fft
longer to compute!
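As an aside, the fft function in Octave/Matlab accepts an optional second argument giving the transform length, and a signal shorter than that length is zero-padded automatically. The sketch below, which uses an arbitrary synthesised test vector rather than the bass segment (an assumption made so that it runs without the audio file), suggests that this gives the same result as appending the zeros explicitly.

```octave
N = 64;
x = cos(2*pi*0.07*(0:N-1)');            % arbitrary test signal (column vector)
n_zeros = 5*N;                          % zero-pad by five times the signal length

X_explicit = fft([x; zeros(n_zeros,1)]);  % zeros appended by hand
X_builtin  = fft(x, N + n_zeros);         % fft asked for a longer transform

max_diff = max(abs(X_explicit - X_builtin));
fprintf('largest difference between the two approaches: %g\n', max_diff);
```

Either form can be used; appending the zeros explicitly has the advantage that the zero-padded signal can also be plotted in the time-domain.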
[Figure: ‘Bass segment zero padded by 5000 samples’; Samples vs Amplitude]
You’ll notice the frequency domain plots have 3 relatively wide ‘peaks’ (referred to as
main-lobes) centred at the fundamental (55 Hz) and the first two harmonics (110 Hz and
165 Hz). There are also numerous narrower and smaller ‘peaks’ (referred to as side-lobes)
either side of the main-lobes. These side-lobes are considered as interference and can be
problematic when taking measurements and reducing them is often desirable. This
desired reduction in the magnitude of the side-lobes can be achieved using a process
called ‘windowing’.
Windowing a signal involves multiplying it by another signal (referred to as a windowing
function) to produce a new windowed signal, as shown in the code and plots below.
>> [ip fs]= audioread('bass_note.wav');
>> N = 9670;
>> seg = ip(10000:10000+N-1);
>> subplot(1,3,1);plot(seg)
>> xlabel('Samples'); ylabel('Amplitude');
>> title('Segment of bass signal')
>> hanning_win = hanning(N); % built-in function
>> subplot(1,3,2);plot(hanning_win)
>> xlabel('Samples'); ylabel('Amplitude');
>> title('Hann window function')
>> windowed_seg = seg.*hanning_win;
>> subplot(1,3,3);plot(windowed_seg)
>> xlabel('Samples'); ylabel('Amplitude');
>> title('Windowed segment')
As can be seen above, the effect of windowing a signal is to alter the shape of a signal so
that the start and end of the original signal are closer to zero. Windowing has the effect
of superimposing an ‘amplitude envelope’ on the original signal which matches the shape
of the window function. The example above makes use of the Hann window function but
there are a large number of window functions with some common ones shown below.
You’ll notice that all the window functions shown taper off towards the beginning and
end.
>> plot(hamming(1000));
>> hold on
>> plot(blackman(1000));
>> plot(bartlett(1000));
>> plot(tukeywin(1000));
>> ylabel('Amplitude'); xlabel('Samples')
>> legend({'Hamming', 'Blackman','Bartlett','Tukey'})
[Figure: three panels (Samples vs Amplitude): ‘Segment of bass signal’ multiplied by ‘Hann window function’ equals ‘Windowed segment’]
[Figure: ‘Some common window functions’ (Samples vs Amplitude): Hamming, Blackman, Bartlett, Tukey]
The code below shows the effect that windowing the bass guitar segment has in the
frequency-domain, whereby the side-lobes, discussed above, are reduced. Notice,
however, that the main-lobes are wider and that they are also lower in
amplitude, compared to the plots provided earlier (they are half the amplitude), as a
consequence of the windowing process. A Hann window function is used in this example
and if a different window was used you would find that there would be different levels of
side-lobe attenuation and main-lobe width. There is generally a trade-off in terms of
main-lobe width against side-lobe magnitude in the choice of window i.e. a narrow main-
lobe (typically desirable) comes at the expense of more significant side-lobes (typically
undesirable).
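The effect described above can be reproduced with a short sketch. To keep it self-contained it makes several assumptions: a synthesised 55 Hz cosine stands in for the bass guitar segment, the sampling rate and segment length are arbitrary, and the Hann window is written out explicitly rather than generated with the hanning function, so the exact numbers will differ from those obtained with the bass signal.

```octave
fs = 1000; N = 1000;                     % assumed sampling rate and segment length
t = (0:N-1)'/fs;
f0 = 55;
seg = cos(2*pi*f0*t);                    % stand-in for the bass segment

% Symmetric Hann window written out explicitly (an illustrative definition)
hann_win = 0.5 - 0.5*cos(2*pi*(0:N-1)'/(N-1));

nfft = 8*N;                              % zero-pad to 8 times the segment length
fax = (0:nfft-1)'*fs/nfft;
mag_rect = abs(fft(seg, nfft))/(N/2);            % no windowing
mag_hann = abs(fft(seg.*hann_win, nfft))/(N/2);  % Hann windowed

main_band = fax > f0-3   & fax < f0+3;   % region containing the main-lobe
side_band = fax > f0+2.2 & fax < f0+6;   % region containing only side-lobes

peak_rect = max(mag_rect(main_band));
peak_hann = max(mag_hann(main_band));
side_rect = max(mag_rect(side_band));
side_hann = max(mag_hann(side_band));

fprintf('main-lobe height: %.3f (none) vs %.3f (Hann)\n', peak_rect, peak_hann);
fprintf('largest side-lobe: %.3f (none) vs %.3f (Hann)\n', side_rect, side_hann);
```

With these assumed values the Hann windowed main-lobe is roughly half the height of the un-windowed one, while the largest side-lobe in the band examined is considerably smaller. Plotting mag_rect and mag_hann against fax with xlim([0 500]) shows the wider main-lobe as well.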
Windowing and zero-padding are reasonably easy to implement and their effect
is quite dramatic, so they are both routinely used in practice. By windowing and zero-
padding DSP practitioners will produce consistent looking frequency-domain plots in
which frequency components will not significantly interfere with other components. In
many cases it can be sufficient just to be aware of the benefits of these two techniques
without fully appreciating why side-lobe artefacts occur and how they can be ‘controlled’.
The next sections provide more detail on how and why zero-padding and windowing
work.
Before moving on to the last few sections it is worth noting another practical benefit of
zero-padding, which relates to our ability to estimate the frequency of sinusoidal
components using the DFT. The code below synthesises a sinusoidal segment of
frequency 2.3 Hz and duration 2 seconds, using a sampling rate of 50 Hz, and shows the
magnitude spectrums of this segment for two cases, with and without zero padding. You’ll
notice that the location of the maximum peak is at 2.5 Hz for the non-zero padded
spectrum while it is closer to the actual frequency of 2.3 Hz (highlighted with the dotted
line in the figure) when zero-padding is employed. By zero-padding the sinusoidal
segment we are improving the DFT frequency resolution which, in turn, improves our
ability to accurately determine the frequency of sinusoidal components present in a
signal. The peak of the magnitude spectrum of the bottom plot (associated with the zero-
padded signal) occurs at a frequency of 2.33 Hz which is much closer to the actual
frequency of 2.3 Hz.
A further improvement in frequency estimate accuracy could be achieved by zero-padding
by a greater amount. By zero-padding we are improving what is referred to as the DFT bin
frequency resolution, which is given by fs/N, where N is the length of the zero-padded
signal. The DFT bin frequency resolution dictates the maximum error in frequency
measurements, so by increasing the length of the signal you can reduce the frequency
estimate error. When no zero-padding is applied the bin resolution is 50/100 = 0.5 Hz (fs
= 50; N = 100), however, by zero-padding by 500 samples N then becomes 600 so the bin
resolution becomes a smaller value given by 50/600 = 0.0833 Hz.
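A minimal sketch of this frequency estimation experiment is given below, using the values quoted in the text (2.3 Hz, 2 seconds, fs = 50 Hz, zero-padding by 500 samples). The starting phase of the sinusoid is an assumption, and since the exact location of the zero-padded peak can depend on that phase, the sketch should be read as an illustration rather than as the code used to produce the figures.

```octave
fs = 50; f0 = 2.3; dur = 2;            % values taken from the text
N = fs*dur;                            % 100 samples
t = (0:N-1)/fs;
x = cos(2*pi*f0*t);                    % starting phase of zero is an assumption

% No zero-padding: bins are fs/N = 0.5 Hz apart
mag = abs(fft(x));
fax = (0:N-1)*fs/N;
[~, k] = max(mag(1:N/2));              % search the positive frequencies only
f_est = fax(k);

% Zero-padded by 500 samples: bins are fs/N2 = 0.0833 Hz apart
N2 = N + 500;
mag2 = abs(fft([x zeros(1,500)]));
fax2 = (0:N2-1)*fs/N2;
[~, k2] = max(mag2(1:N2/2));
f_est_zp = fax2(k2);

fprintf('bin spacing: %.4f Hz (no padding), %.4f Hz (zero-padded)\n', fs/N, fs/N2);
fprintf('peak location: %.3f Hz (no padding), %.3f Hz (zero-padded)\n', f_est, f_est_zp);
```

Without zero-padding the peak can only fall on a multiple of 0.5 Hz, so 2.5 Hz is the closest available bin. With zero-padding the peak falls on a bin much closer to 2.3 Hz.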
[Figure: time-domain plots (Time (seconds) vs Amplitude) and magnitude spectra (Frequency (Hz) vs Magnitude) of the 2.3 Hz sinusoid: without zero-padding (peak at 2.5 Hz) and with zero-padding (peak at 2.33 Hz)]
Time-frequency resolution trade-off
The choice of segment length, prior to zero-padding, affects our ability to identify and
analyse sinusoidal frequency components (see pzdsp.com/vid32 for an overview on this).
The code below analyses segments of the bass guitar audio signal and can be used to
produce the four magnitude spectrum plots below (the second line of the code should be
modified to reproduce each of the four plots).
>> [ip fs]= audioread('bass_note.wav');
>> N = 4000; % change this to generate the spectrum plots
>> seg = ip(10000:10000+N-1);
>> zpad_seg = [seg; zeros(50000,1)]; % zero-pad
>> N2 = length(zpad_seg);
>> fax = [0:N2-1]*fs/N2;
>> plot(fax, abs(fft(zpad_seg))/N);
>> xlabel('Frequency (Hz)'); ylabel('Magnitude');
>> xlim([0 500]) % just show first 500 Hz
>> title(['Magnitude Spectrum for N = ' num2str(N)])
[Figure: magnitude spectra (Frequency (Hz) vs Magnitude, 0 to 500 Hz) for N = 4000, 2000, 1000 and 500, with the corresponding time-domain segments (Samples vs Amplitude)]
You’ll notice in the four plots that it’s more difficult to clearly see the fundamental and
first two harmonics as the variable N (the segment length prior to zero-padding) gets
smaller. This is because the main-lobe width gets wider, as do the side-lobes (an
explanation of why this occurs is provided later in this document), and the separation
between the fundamental and harmonics becomes less distinctive. When N is 500 (bottom
right plot) the fundamental and harmonics cannot be clearly identified making it
impossible to measure any frequency characteristics of the three harmonically related
components. Even when N is 2000, when both the fundamental and harmonics can be
clearly seen, you’ll notice that the magnitude (height) of the main-lobe associated with
the first harmonic has been increased in comparison to the top plot, when N=4000. This
is predominantly due to interference from the side-lobes associated with the fundamental
frequency. You should note that this interference could be reduced through an
appropriate windowing process, as explained later in this document.
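The widening of the main-lobe as N gets smaller can be measured directly. The sketch below is a simplified illustration under stated assumptions: the sampling rate is an arbitrary value, and a single complex exponential at 55 Hz is used instead of the bass signal so that there is only one main-lobe and no interference from other components. It measures the distance between the first minima either side of the peak and compares it with 2*fs/N, the null-to-null main-lobe width when no window function is applied.

```octave
fs = 8000; f0 = 55;                    % fs is an assumed value
nfft = 2^18;                           % heavy zero-padding to sample the lobe finely
seg_lengths = [4000 2000 1000 500];
widths = zeros(size(seg_lengths));

for k = 1:numel(seg_lengths)
  N = seg_lengths(k);
  t = (0:N-1)/fs;
  x = exp(1j*2*pi*f0*t);               % complex exponential: one main-lobe only
  mag = abs(fft(x, nfft));
  [~, p] = max(mag);                   % index of the main-lobe peak
  r = p;
  while mag(r+1) < mag(r), r = r + 1; end   % walk right to the first minimum
  l = p;
  while mag(l-1) < mag(l), l = l - 1; end   % walk left to the first minimum
  widths(k) = (r - l)*fs/nfft;         % main-lobe width in Hz
end

expected = 2*fs./seg_lengths;
disp([seg_lengths' widths' expected']);   % columns: N, measured width, 2*fs/N
```

Each time N is halved the measured main-lobe width doubles, which is why the fundamental and harmonics merge into one another for short segments.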
From the plots above it’s clear to see that if we have more time-domain samples we can
get a clearer picture of the frequency-domain representation of that signal. Intuitively this
should make sense, since the frequency-domain is essentially a measure of the rate of
change of the time-domain and the more time-domain samples we have the more ‘rate of
change’ between samples we can measure. Taking this perspective to its extreme we can
consider how much frequency-domain information we could extract from a single sample
i.e. when N = 1. The answer is ‘very little’: just the DC level, which is of no real interest in
isolation from other samples.
From the analysis above alone, it would appear that the length of segment should always
be as long as possible, however if the segment is too long this can also cause problems in
interpreting the magnitude spectrum. To appreciate why this is the case let’s return to a
synthesised example, whereby two sinusoidal waveforms are present but the amplitudes of
the waveforms are slowly changing, as shown in the code and corresponding time-domain
plots below. Note that the variables amp_envelope1 and amp_envelope2 can be
used to determine accurate amplitudes of the two sinusoidal waveforms.
>> fs = 1000; N = 2000; t = [0:N-1]/fs;
>> amp_envelope1 = t+1;
>> amp_envelope2 = 2-t;
>> amp_envelope2(N/2+1:end)=fliplr(amp_envelope2(1:N/2));
>> waveform1 = cos(2*pi*25*t).*amp_envelope1;
>> waveform2 = cos(2*pi*15*t).*amp_envelope2;
>> synth_sig = waveform1+waveform2;
>> subplot(3,1,1); plot([0:N-1], waveform1);
>> xlabel('Samples');ylabel('Amplitude');
>> subplot(3,1,2); plot([0:N-1], waveform2);
>> xlabel('Samples');ylabel('Amplitude');
>> subplot(3,1,3); plot([0:N-1], synth_sig);
>> xlabel('Samples');ylabel('Amplitude');
>> title('Synthesised signal - sum of waveforms above')
[Figure: three time-domain plots (Samples vs Amplitude): the two waveforms and ‘Synthesised signal - sum of two waveforms above’]
For the purpose of understanding the impact of the choice of segment length, let’s
imagine a situation where we only have access to the synthesised signal and we want to
determine the evolution of the amplitude of both sinusoidal waveforms using the DFT.
If we applied a DFT to the entire synthesised signal (bottom plot above) then the
magnitude (height) of the two main-lobes in the magnitude spectrum would indicate the
average amplitude of the two sinusoids, as shown by the code and plot below. We can see
that the average amplitude of the 15 Hz frequency component is lower than that of
the 25 Hz component, which corresponds to our expectation since the average amplitude
of the 25 Hz component is 2 (mean(amp_envelope1)), while the average amplitude
of the 15 Hz component is 1.5 (mean(amp_envelope2)). This plot clearly cannot be
used to determine how the amplitudes of the sinusoids evolve as it simply shows the
average over the 2000 samples.
>> offset = 0; % change this to reproduce other plots
>> N = 2000; % change this to reproduce other plots
>> zpad_win_sig = [synth_sig(1+offset:N+offset) zeros(1,1000000)]; % select segment and zero-pad
>> N2 = length(zpad_win_sig);
>> fax = [0:N2-1]*fs/N2;
>> figure; plot(fax, abs(fft(zpad_win_sig))/(N/2)) % new figure; scaled so main-lobe height indicates amplitude
>> xlim([0 30]); ylabel('Magnitude'); xlabel('Frequency (Hz)')
>> title(['Magnitude spectrum for N = ' num2str(N) ', offset = ' num2str(offset)])
A better indication of the amplitude evolution of both sinusoidal components can be
obtained by segmenting the waveform into smaller segments (these segments are also
often referred to as windows or frames). The magnitude spectrum plot below is associated
with the first 500 samples of the entire 2000 samples of the signal synthesised. The
magnitude of the main-lobe associated with the 15 Hz component is approx 1.75 while the
25 Hz component is approx. 1.25. These magnitudes represent the average
(approximately!) of the sinusoidal waveforms over the first 500 samples and also
correspond to the amplitude of the sinusoidal waveforms at the centre of the segment i.e.
at sample number 250, since the increasing and decreasing amplitudes either side of this
centre point will have the effect of cancelling each other out. Notice, however, that the
main-lobe associated with the 25 Hz component is not centred at exactly 25 Hz due to
side-lobe interference associated with the 15 Hz component. This interference will
increase if the segments get shorter since the main-lobe widths increase and their centres
become closer to each other, as explained later in this document.
Note that the benefit of using synthesised signals is that we can determine the exact
amplitudes of the sinusoids by referring to the code used to synthesise the signals. The
amplitudes at sample 250 will be amp_envelope1(250), which is 1.249, and
amp_envelope2(250), which is 1.751.
[Figure: ‘Magnitude spectrum for N = 2000, offset = 0’; Frequency (Hz) vs Magnitude]
Let’s see what happens when the segment length is further reduced down to 200
samples. From analysis of the time-domain waveforms the amplitude of the 25 Hz sinusoid
after 100 samples will be 1.1 and the 15 Hz sinusoid will be 1.9 (these amplitudes can also
be determined from code using amp_envelope1(100) and amp_envelope2(100)).
Examining the plot of the magnitude spectrum of the first 200 samples, shown below,
indicates that the magnitude of the 25 Hz component is about 1.25 rather than 1.1. Also
notice how the centre of the main-lobe associated with the 25 Hz component is now
shifted further away from 25 Hz to 26 Hz. Both these inaccuracies (in terms of amplitude
and frequency) can be attributed to side-lobe interference associated with the 15 Hz
component.
The key point to take away from these examples is that a shorter analysis segment will
reduce the effect of averaging and can provide a more accurate estimate of the local
amplitude of each sinusoidal component within the synthesised signal. However, shorter
segments also result in more significant interference from side-lobes, and even main-lobes
for extremely short segments (see the segment length guidelines two paragraphs down for a
sense of what constitutes an adequate length segment). While the windowing techniques,
introduced earlier, and described in more detail later on, can reduce the effects of side-
lobe interference their effects will always be somewhat present. Note that for extremely
short segments the main-lobes associated with components will interfere with each other,
and meaningful frequency measurements can become impossible to determine.
[Figure: ‘Magnitude spectrum for N = 500, offset = 0’ and ‘Magnitude spectrum for N = 200, offset = 0’; Frequency (Hz) vs Magnitude]
This trade-off between the desire to choose a short segment to reduce the impact of
averaging and the desire to minimise interference by choosing a longer segment is
referred to as a time-frequency resolution trade-off. A long segment length improves
frequency resolution (our ability to visually ‘space-out’ or ‘separate’ components along the
frequency axis, which also impacts the interference between components) at the cost of
poorer time resolution (our ability to accurately estimate the characteristics of the
frequency components at a particular time).
A rough guideline for selecting a minimum segment length is to ensure that the segment
length equates to twice the period of the lowest frequency component present in the
signal, for periodic signals. When dealing with a number of non-harmonically related
quasi-sinusoidal waveforms you would need to determine the ‘likely’ frequency of all the
waveforms and find the smallest frequency difference between these sinusoids; the
segment length should then be twice the inverse of this minimum frequency difference. For
example, if sinusoidal components of 100 Hz, 107 Hz and 109 Hz were present in a signal
the segment length should equate to 1 second (2 × 1/2 Hz), since the smallest frequency difference
is 2 Hz (109 - 107 Hz).
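The ‘twice the inverse of the smallest frequency difference’ rule can be applied mechanically, as in the rough sketch below. The component frequencies are the ones from the example, while the sampling rate is a hypothetical value introduced only to convert the duration into a number of samples.

```octave
freqs = [100 107 109];                 % component frequencies in Hz (from the example)
min_diff = min(diff(sort(freqs)));     % smallest spacing between components, in Hz
min_dur = 2/min_diff;                  % twice the inverse of that spacing, in seconds

fs = 1000;                             % hypothetical sampling rate in Hz
min_samples = ceil(min_dur*fs);        % minimum segment length in samples

fprintf('smallest spacing %g Hz -> at least %g s (%d samples at fs = %d Hz)\n', ...
        min_diff, min_dur, min_samples, fs);
```

This is only a starting point; as noted in the text the guideline is a rough one, and the segment length would normally be refined by experimentation.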
Of course, these guidelines depend on prior knowledge of the frequencies present in the
signal and engineers and scientists may rely on expert input for this information. For
example, with ECG (heart signal) analysis scientists would rely on medical experts’
knowledge of the physiological limits of human heart rate. Audio engineers might rely on
the frequency range of certain musical genres in their analysis. In the event that you had
little or no prior knowledge of the expected frequency characteristics of a signal, estimates
of the frequency components present could be determined experimentally by applying
DFTs with a variety of segment lengths, with the optimum segment length determined
through refined experimentation.
Before concluding this section, I would like to highlight that the amplitude evolution of
the synthesised signal can be determined by adjusting the offset variable in the first
line of the code provided above. By incrementing the offset variable you will be able to
estimate the amplitudes of the two sinusoidal components at different times/samples.
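Rather than changing the offset by hand, the same measurement can be repeated in a loop. The sketch below re-synthesises the signal so that it is self-contained and then steps a 500 sample segment through it. The step size, the amount of zero-padding and the frequency bands searched for each main-lobe are assumptions chosen for illustration, and the main-lobe height is simply taken as the largest magnitude within each band.

```octave
% Re-create the synthesised signal from earlier
fs = 1000; Ntotal = 2000; t = (0:Ntotal-1)/fs;
amp_envelope1 = t + 1;
amp_envelope2 = 2 - t;
amp_envelope2(Ntotal/2+1:end) = fliplr(amp_envelope2(1:Ntotal/2));
synth_sig = cos(2*pi*25*t).*amp_envelope1 + cos(2*pi*15*t).*amp_envelope2;

N = 500;                               % segment length
nfft = 2^15;                           % zero-padded transform length (assumed)
fax = (0:nfft-1)*fs/nfft;
band15 = fax > 12 & fax < 18;          % search region for the 15 Hz main-lobe
band25 = fax > 22 & fax < 28;          % search region for the 25 Hz main-lobe

offsets = 0:250:1500;                  % assumed step of 250 samples
est15 = zeros(size(offsets)); est25 = zeros(size(offsets));
avg15 = zeros(size(offsets)); avg25 = zeros(size(offsets));
for k = 1:numel(offsets)
  idx = (1:N) + offsets(k);
  mag = abs(fft(synth_sig(idx), nfft))/(N/2);
  est15(k) = max(mag(band15));         % estimated amplitude of the 15 Hz component
  est25(k) = max(mag(band25));         % estimated amplitude of the 25 Hz component
  avg15(k) = mean(amp_envelope2(idx)); % actual average amplitude over the segment
  avg25(k) = mean(amp_envelope1(idx));
end
disp([offsets' est15' avg15' est25' avg25']);
```

Each row of the displayed table gives the offset, followed by the estimated and actual average amplitude of the 15 Hz component and then of the 25 Hz component. The estimates track the averages reasonably closely, with the remaining differences being due to the interference discussed above.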
I would also like to point out that the magnitude of the main-lobes won’t always
correspond to the amplitude of the sinusoidal waveforms at the centre of the segment
selected (even taking side-lobe interference into consideration!). In the example below,
the offset is 750 and the segment length is 500, and it can be seen that the magnitudes of the
main-lobes are 1.2 for the 15 Hz component and 2 for the 25 Hz component. The actual
amplitudes of the sinusoidal waveforms at the centre of the segment are 1 (i.e.
amp_envelope2(1000)) and 2 (i.e. amp_envelope1(1000)). The reason the 15 Hz
measurement is incorrect is because the average amplitude of the waveform is greater
than 1, which should be apparent from a visual inspection. This contrasts with the 25 Hz
component in which the average amplitude is equal to the amplitude at the centre of the
segment.
[Figure: ‘Magnitude spectrum for N = 500, offset = 750’; Frequency (Hz) vs Magnitude]
The plot below explicitly highlights the time-domain segment associated with the
magnitude spectrum shown above, highlighted in pink.
[Figure: three time-domain plots (Samples vs Amplitude) of the two waveforms and ‘Synthesised signal - sum of two waveforms above’, with the segment at offset = 750, N = 500 highlighted]
Spectral leakage
Spectral leakage refers to a spread of energy across the frequency axis. Referring to the figure below
(a modified version of a figure shown earlier) the magnitude spectrum in the bottom right shows
a spread of energy across a wider range of DFT bins in comparison to the spectrum in the top right,
particularly in the region of the fundamental frequency of 55 Hz. The figure has been adapted from
the earlier one to include the sinusoidal waveform associated with the fundamental frequency
(shown as a dotted line). What you should notice is that in the top left time-domain plot the
fundamental has close to 12 complete cycles, while the bottom left plot contains 11.5 cycles of the
fundamental. You’ll observe a significant spread of energy across DFT bins if there are sinusoidal
components present in a signal that