Proceedings of ICAD 05-Eleventh Meeting of the International Conference on Auditory Display, Limerick, Ireland, July 6-9, 2005
A COMPARISON OF AUDIO & VISUAL ANALYSIS OF COMPLEX TIME-SERIES DATA SETS
Sandra Pauletto & Andy Hunt
Media Engineering Group
Electronics Dept., University of York
Heslington, York, YO10 5DD, U.K.
{sp148,adh}@ohm.york.ac.uk
ABSTRACT
This paper describes an experiment to compare user
understanding of complex data sets presented in two different
modalities, a) in a visual spectrogram, and b) via audification.
Many complex time-series data sets taken from helicopter flight
recordings were presented to the test subjects in both modalities
separately. The aim was to see if a key set of attributes (noise,
repetitive elements, regular oscillations, discontinuities, and
signal power) were discernable to the same degree in the
different modalities. Statistically significant correlations were
found for all attributes, which shows that audification can be
used as an alternative to spectrograms for this type of analysis.
1. CONTEXT AND BACKGROUND
This paper describes an experiment to verify that sound can be
used as an alternative to graphs in the analysis of complex
signals. We have compared a visual and an audio display of the
same data sets in order to confirm that certain key attributes are
at least as discernable from a complex data set by sonification
as by visualization.
This verification is important to those projects which aim to
use sound representation for data analysis. The world is
currently dominated by visual techniques, and many people
need to be convinced that information will not somehow be
‘lost’ by representing it as sound. Once that has been
established, it becomes a lot easier to stress the advantages of
using sound.
1.1. Previous work on audio / visual comparisons
Visual representations of data have been used for a lot longer
than auditory representations. In fact, visual displays can be
said to be the norm, and particular visual displays (graphs,
diagrams, spectrograms) are widely understood. It is therefore
natural when evaluating new auditory displays that we compare
their efficacy in portraying information to that of a somewhat
equivalent visual display. In the literature there are various
studies which compare audio and visual displays. Nesbitt &
Barrass [1] compared a sonification of stock-market data with a
visual display of the same data and with the combined display
(audio-visual). Brown & Brewster [2] designed an experiment
to study the understanding of sonified line graphs. Peres and
Lane [3] evaluated different ways of representing statistical
graphs (box plots) with sound. Valenzuela et al [4] compared
the sonification of impact-echo signals (a method for non-
destructive testing of concrete and masonry structures) with a
visual display of the signal. Fitch and Kramer [5] compared the
efficacy of an auditory display of physiological data with a
visual display by asking the subjects (who play the role of
anesthesiologists) to try to keep alive a ‘digital patient’ by
monitoring his status with each display.
The evaluation methods used in the above examples are
dependent on the type of data, the type of auditory display and
the context in which the displays are used. These examples
show how important it is to compare auditory displays with
visual ones for their evaluation, but their results are specific to
the type of data, their complexity and the sonification used.
In this paper the sonification method used is audification,
i.e. where data are appropriately scaled and used as sound
samples. There are some studies in the literature about the
efficacy of audification of complex data. Audification is often
used for the sonification of data that are produced by physical
systems. Hayward [6] describes audification techniques of
seismic data. He finds that audification is a very useful
sonification method for such data, but he stresses that proper
evaluation and comparisons with visual methods are needed.
Dombois [7, 8] presents more evidence of the efficacy of
audification of seismic data which appears to complement the
visual representations.
Rangayyan et al [9] describe the use of audification to
represent data related to the rubbing of knee-joint surfaces. In
this case though the audification is compared to other
sonification techniques (not to a visual display) and it is not
found to be the best at showing the difference between normal
and abnormal signals.
In all these studies on audification of data, the scaling of the
data is informed by an a priori knowledge of the basic
properties of the data to be represented.
The novel slant of the experiment presented here is that no
assumption is made about the characteristics of the data.
1.2. So why use sound anyway?
This work is a small part of a larger project to work with
professionals who use data analysis on a day-to-day basis, but
are finding visual analysis techniques inadequate for the task.
We have built an interactive sonification toolkit [10] to allow
the human analyst to interact with the recorded data as sound, in
order to spot unusual patterns to aid in the diagnosis of system
faults. The power of a human interacting in a closed loop with
sonic feedback is described in [11], and in the IEEE Multimedia
special issue on Interactive Sonification [12].
The use of sound is a particularly good way of portraying time-
series data, because the time-base is preserved in sound
playback. The eye tends to scan a picture at its own speed, yet
sound is heard as it is revealed. This yields a particularly
natural portrayal of the dynamics of a complex data set.
Complex frequency responses in the data are often perceived
holistically as timbral differences. Large amounts of data can be
rendered rapidly, yet the microstructure is still manifest as
timbral artifacts. However, the purpose of this experiment is to
determine if some basic attributes of the data are lost by moving
from a visual representation to a sonic one.
In our project, we are working specifically with two groups
of professionals who need to analyze large quantities of
complex data which emanate from sensors connected to the
subject being studied.
Physiotherapists at the University of Teesside, UK, record
the complex bursts of activity from several EMG sensors
attached to the surface of a patient’s skin. From these signals
the therapists hope to build up a mental image of how the
patient’s muscles and joints are working, and what is perhaps
going wrong in a particular case. We are working with them in
sound as it appears to portray the dynamic response of the
muscles in a more natural way than by looking at traces on a
graph (which is the established, conventional technique).
However, our second collaborators have provided us with much
more complex data, the analysis of which is the focus of this
paper.
1.3. Helicopter flight analysis
We are working with flight analysis engineers at Westland
Helicopters, UK. These engineers are routinely required to
handle flight data and analyze it to solve problems in the
prototyping process. As we have reported in [10], flight data is
gathered from pilot controls and many sensors around the
aircraft. The many large data sets that are collected are currently
examined off-line using visual inspection of graphs. Printouts of
the graphs are laid across an open floor and engineers walk
around this paper display looking for anomalous values and
discontinuities in the signal. The paper is considered more
useful than the limited display on a computer monitor.
The current project aims to improve the analysis technique
by providing a sonic rendition of the data which can be heard
rapidly, and therefore will save valuable technician time and
speed up the analysis process. Sound representation also
provides the added benefit of allowing the presentation of
several time-series data sets together, for dynamic comparison
of two (or many more) signals. We are currently also working
on methods of portraying many tens of complex parameters
together to give a picture of the whole helicopter’s flight data.
<Reference to MMViz paper to come later for the camera copy>
The flight engineers are often given the task of analyzing
this data because a pilot has reported something wrong in a test
flight. The analysts now have a huge amount of data to sift
through in order to look for unusual events in the data. These
unusual events could be, for instance:
- unwanted oscillations,
- vibrations and noise superimposed on usually clean signals,
- unusual cyclic modes (data repeated, where it would normally be expected to progress),
- drifts in parameters that would normally be constant,
- non-standard variations in power or level,
- a change in the correlation between two parameters (e.g. signals which are normally synchronized becoming decoupled),
- discontinuities or 'jumps' in data which is in general smooth or constant.
Identification of such events helps to pinpoint problems in
the aircraft, and can provide enough information to launch a
further, more focused, investigative procedure.
We wish to determine whether any information from the
data series is going to be lost when rendered sonically rather
than graphically. So, for the purposes of this experiment we
have identified five basic attributes of data which we study both
visually and aurally. These are 1) Noise, 2) Repetitive
elements, 3) Oscillations at fixed frequencies, 4)
Discontinuities, and 5) Signal power level.
If a human analyst perceived the presence of one or more of
the first four attributes (or a change in overall signal strength)
in an area of the signal where it would not be expected, this
would prompt further investigation. So, our experiment
determines whether subjects rate the presence of the first four
attributes, and the average level of the signal power, to the same
degree using a) visual and b) aural presentation.
2. EXPERIMENTAL AIMS & HYPOTHESIS
The aim of the experiment is to compare how users rank the
above five attributes when a series of data sets is presented
visually or aurally. We are looking to see whether aural
presentation allows the identification of each attribute to the
same degree as visual presentation. We are interested in the
average response across a large group of subjects, rather than
identifying whether an individual subject can use visual or
audio presentation equally well.
2.1. Hypothesis
The experimental hypothesis is that for each data series, there
will be a strong correlation between the recognition of each of
the five data attributes in the visual domain and audio domain.
If this hypothesis is proved, then we have a strong basis for
trusting the analysis of the data using sound alone.
In this experiment we only try to verify whether the sound portrays
the data attributes at least as well as the visual display. If the
correlation turns out to be poor, this experiment alone cannot tell
us why; further experiments would be needed to discover the
reasons for a poor correlation.
2.2. Structure of the data under test
In consultation with the flight handling qualities group at
Westland Helicopters we have gathered 28 sets of time-
synchronized data taken from a half-hour test flight. Each data
set is taken from a sensor on the aircraft under test. The details
of the aircraft and the mapping of each individual sensor are
being kept confidential.
Each data set contains 106,500 samples, which were
originally sampled at 50 Hz. The helicopter parameters
measured are of highly differing natures: from the speed of the
rotors, to engine power, etc. Most of the data sets represent
physical parameters that change over time. For this experiment,
the knowledge of what each channel represents in the helicopter
system is not important, only whether the user perceives the
presence of noise (etc.) in both the visual and audio displays.
2.3. Overview of the experimental task
The visual display used in this experiment is the spectrogram of
each data set. The audio display is the audification of the data.
The subjects were presented with a screen containing
thumbnail pictures of the spectrograms of all the data sets. After
having had an overview of all the spectrograms, they were asked
to examine and score each spectrogram (on an integer scale
from 1 to 5) for the following characteristics:
a) presence of noise;
b) presence of a repetitive element in time;
c) oscillations at fixed frequencies;
d) presence of discontinuities or jumps in amplitude;
e) signal power.
For the sonic display, the subjects were presented with
icons, one for each data set, which played the audification
when clicked. Subjects were asked to listen to all the sounds at
least once. They were then asked to listen to each sound as
many times as required and to score it using the same categories
as for the spectrograms.
2.4. The audifications
Kramer describes the audification of data as “a direct translation
of a data waveform to the audible domain” [13]. The
audifications in this experiment were created by linearly scaling
the 28 data arrays between -1 and 1 and by converting each
array into a wave file of sampling rate 44100Hz in Matlab. Each
audification was therefore around 2.5 seconds long.
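As a rough illustration of this step, the sketch below shows an equivalent scaling and wave-file export. The authors used Matlab; this Python/SciPy outline, with assumed function and file names, is only a hedged approximation of the processing described above.

```python
# Illustrative sketch only; the paper's processing was done in Matlab.
import numpy as np
from scipy.io import wavfile

def audify(channel, fs_out=44100, path="audification.wav"):
    """Linearly scale one data channel into [-1, 1] and write it as a wav file."""
    x = np.asarray(channel, dtype=float)
    x = x - x.min()                      # shift so the minimum becomes 0
    if x.max() > 0:
        x = 2.0 * x / x.max() - 1.0      # scale into [-1, 1]
    wavfile.write(path, fs_out, x.astype(np.float32))
    return len(x) / fs_out               # playback duration in seconds

# 106,500 samples replayed at 44.1 kHz last roughly 2.4 s, matching the duration quoted above.
```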
2.5. The spectra
The spectrograms of the same data channels were created
using the Matlab function 'specgram'. The sampling frequency
specified when computing the spectrograms was ‘fs = 50’,
which corresponded to the original sampling frequency of the
data (50Hz). The minimum and maximum values of the color
scale of the spectrograms were set the same for each
spectrogram so that the spectrograms were comparable to each
other. All the spectrograms were saved as .jpg files.
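A minimal sketch of an equivalent spectrogram rendering follows, in Python/SciPy rather than the Matlab 'specgram' call actually used; the fixed colour-scale limits shown are assumed values, not those chosen in the study.

```python
# Illustrative sketch only; the paper's spectrograms were produced with Matlab's 'specgram'.
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import spectrogram

def save_spectrogram(channel, fs=50, vmin=-80, vmax=0, path="spectrogram.jpg"):
    """Plot a channel's spectrogram at the original 50 Hz rate with a fixed colour scale."""
    f, t, Sxx = spectrogram(np.asarray(channel, dtype=float), fs=fs)
    Sxx_db = 10 * np.log10(Sxx + 1e-12)                  # avoid log of zero
    plt.pcolormesh(t, f, Sxx_db, vmin=vmin, vmax=vmax)   # same limits for every channel
    plt.xlabel("time (s)")
    plt.ylabel("frequency (Hz)")
    plt.savefig(path)
    plt.close()
```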
2.6. The subjects
The subjects for this test were selected according to the
following criteria.
It was considered that the end user of such an auditory
display would be an experienced analyst, able to interpret
spectrograms and able to distinguish various characteristics in a
sound’s signal such as noise, repetitions, frequencies,
discontinuities and signal level.
Apart from this specific knowledge, the user could be any
gender or age or from any cultural background. A between-
subjects design, in which there are 2 groups of different subjects
(one of which scores the spectra and the other the sounds),
would have been ideal for this experiment. However, it would have
required the recruitment of more subjects than was realistic.
Instead, we chose a group of 23 subjects and used a
mixed within-subjects / between-subjects design, in which
mostly the same group of people scored both the spectra and the
sounds, but some only did one or the other. This design was due
to the fact that some subjects were available only for a short
time.
In order to minimize the errors in the results due to the
order of presentation of the task, the order in which the spectra
and the sounds were presented to each person was randomized
between subjects and tasks.
Out of the 23 subjects tested, 21 were men and 2 were
women. The average age of the subjects was 33. All the subjects
were lecturers, researchers or postgraduate students in media
and electronic engineering (with a specialisation in audio and
music technology) and one person was a computer music
composer. They all had experience in working with sounds and
spectrograms. The subjects’ understanding of sounds and
spectrograms was considered to be similar to the expected
understanding of the ideal end user. Subjects were from
different nationalities. All the subjects declared that they had no
known problems with their hearing and that they had good sight
or, if their sight had some defect, that it was fully corrected by
spectacles.
2.7. Procedure
Firstly, each subject was given a single-page written document
which explained the task. Then the subject was asked to fill in a
questionnaire to gather information about occupation,
gender, age, nationality, his/her familiarity with spectrograms
and sound interpretation, and any known hearing or sight
problems.
The audio test was carried out in a silent room (mostly in
the recording studio performance area at York). Good quality
headphones (Beyerdynamic DT990) with a wide frequency
response (5 Hz - 35,000 Hz) were used. This minimized the errors
that could be due to external sounds. The volume of the sounds
was kept the same for all subjects.
The spectrogram test was also conducted in a generally
quiet room which allowed concentration.
Subjects who were able to do both tests in one sitting were
asked to take at least a 2-minute rest between the visual and the
audio parts of the test. The total test, for each subject, lasted
about 45 minutes. Subjects were asked to record on a piece of
paper any comments about the test they thought could be
valuable.
For the experiment, a program was created in PD (Pure Data
[14]) and all the results of the test were automatically recorded
in a text file. Before presenting the spectrograms and the sounds
to each subject, the order of presentation of each data set on the
screen was randomized. This should minimize errors due to the
order of presentation.
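A minimal sketch of this per-subject randomization is given below, as hypothetical Python rather than the Pure Data patch actually used.

```python
# Hypothetical illustration of the randomization; the experiment itself ran in Pure Data.
import random

channel_ids = list(range(1, 29))                                    # the 28 data sets
presentation_order = random.sample(channel_ids, len(channel_ids))  # reshuffled for each subject
```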
The test began with an overview of all the spectrograms (see
Figure 1). Then by clicking on each thumbnail image a larger
version of the spectrogram appeared. A click on the ‘Test’
button brought up a further window, consisting of a series of
radio buttons (labeled from 1 to 5) for each parameter being
scored (noise, repetition, frequency, discontinuity and signal
power).
Figure 1: Thumbnails of Spectrograms
For the second part of the experiment the subject was presented
with a set of buttons, one for each audification. Before starting
scoring, the subject was asked to listen to all the sounds at least
once.
By clicking on a button the subject could hear each
audification through headphones. Again, a click on the ‘Test’
button brought up the scoring window, identical to that used for
the spectrograms (see Figure 2).
Figure 2: The scoring window superimposed upon the
buttons for each sound
3. RESULTS
The scores were divided and analyzed by the 5 attributes being
tested (noise, repetition, frequency, discontinuity and signal
power).
For each of the 28 data sets (i.e. channels of sensor
information from the helicopter) two mean scores were
calculated across all subjects; one for the sound display and one
for the visual display. Therefore for each attribute being tested
(noise, repetition, etc) we have 2 arrays of average scores (one
for the sound and one for the spectra), with an average score
across all subjects for each data set.
If the two displays portray information in exactly the same
way, then we might expect the two arrays of scores to be exactly
the same. A scatter plot (x axis = spectra scores, y axis = sounds
scores) was plotted for each of the five attributes under test.
This helps us to see if a linear relationship exists between the
spectra scores and the sound scores. Then the correlation factor
(1) was calculated.
Correlation factor:

$$ r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}} \qquad (1) $$
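As an outline of this calculation, the sketch below correlates the per-channel mean scores for the two displays as in Eq. (1). It is a Python sketch with placeholder values, not the study's actual scores.

```python
# Illustrative sketch; each score array is assumed to be (subjects x 28 channels) per display.
import numpy as np

def correlation_factor(x, y):
    """Pearson correlation coefficient r_xy = s_xy / (s_x * s_y), as in Eq. (1)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))

spectra_scores = np.random.randint(1, 6, size=(23, 28))    # placeholder 1-5 scores
sound_scores = np.random.randint(1, 6, size=(23, 28))
spectra_means = spectra_scores.mean(axis=0)                # one mean per channel
sound_means = sound_scores.mean(axis=0)
r = correlation_factor(spectra_means, sound_means)         # same value as np.corrcoef(...)[0, 1]
```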
3.1. Presence of noise
In all the following scatter plots, the continuous line represents
the line on which the dots would ideally sit, while the segmented
line represents the regression line calculated from the actual
points. Each dot is the average score across all subjects for one
of the 28 data sets.
Figure 3: Scatter plot for the attribute 'Presence of Noise' (x axis: spectra average scores, y axis: sounds average scores; channel points, regression line, and 1.0 correlation line shown).
Figure 4: Average Noise scores for each data set (x axis: data channels 1-28, y axis: average scores; separate traces for spectra and sounds).
For each category a second plot was made (e.g. see Figure 4) in
which along the x axis are the individual data sets (the
‘channels’) and along the y axis are the average scores across all
subjects. The solid line represents the results for the sound
display and the segmented line represents the spectra results.
The correlation (r: 0.88) between the auditory display
scores and the visual display scores is very high. Thus the
average scores for presence of noise are very similar whether
people are presented with a spectrogram or an audification of
the data sets. Another way of looking at this is as follows. Let
us round the average scores to the nearest integer (remembering
that people were asked to score with a 5-step integer scale) and
calculate the absolute value of the difference between the
rounded spectra scores and rounded audio scores for each
channel (see Table 1).
Rounded spectra scores   Rounded sounds scores   Abs(difference)
3 3 0
3 3 0
3 3 0
3 3 0
3 4 1
3 3 0
3 4 1
3 3 0
3 3 0
4 4 0
5 5 0
4 4 0
4 3 1
2 2 0
4 4 0
3 3 0
3 4 1
3 3 0
2 2 0
3 3 0
3 3 0
4 5 1
3 3 0
4 4 0
4 4 0
2 3 1
3 3 0
3 4 1
Table 1: Difference in rounded scores
We can see that only 7 data sets out of 28 are scored
differently in the visual display than in the audio display (for
the degree of noise present) and the difference is only 1 point.
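A minimal sketch of this rounding comparison follows, in Python, with the per-channel mean scores assumed as inputs.

```python
# Illustrative sketch of the comparison summarized in Table 1.
import numpy as np

def rounded_score_differences(spectra_means, sound_means):
    """Round per-channel mean scores to the nearest integer and count disagreements."""
    rs = np.rint(np.asarray(spectra_means)).astype(int)
    ra = np.rint(np.asarray(sound_means)).astype(int)
    diff = np.abs(rs - ra)
    return rs, ra, diff, int(np.count_nonzero(diff))   # e.g. 7 of the 28 channels for noise
```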
We now present the data in the same formats (scatter-plot
and average channel scores) for each of the remaining
attributes.
3.2. Presence of a repetitive element
Figure 5: Repetitive element scatter plot (x axis: spectra average scores, y axis: sounds average scores; channel points, regression line, and 1.0 correlation line shown).
Figure 6: Repetitive element scores for each data set (x axis: data channels 1-28, y axis: average scores; separate traces for spectra and sounds).
The correlation (r: 0.70) is still quite high but less than in the
noise case. 15 out of 28 rounded average scores are different
between the visual and the audio display: in 13 cases the difference
is 1 point and in 2 cases it is 2 points.
3.3. Presence of oscillations at fixed frequencies
Figure 7: Frequencies scatter plot (x axis: spectra average scores, y axis: sounds average scores; channel points, regression line, and 1.0 correlation line shown).
Figure 8: Frequencies scores for each data set (x axis: data channels 1-28, y axis: average scores; separate traces for spectra and sounds).
The correlation (r: 0.71) is close to the correlation calculated
for the repetitive element. 15 out of 28 rounded average scores
are different between the two displays: 11 differ by 1 point and
4 by 2 points.
3.4. Presence of discontinuities
Figure 9: Discontinuity scatter plot (x axis: spectra average scores, y axis: sounds average scores; channel points, regression line, and 1.0 correlation line shown).
Figure 10: Discontinuity scores for each data set (x axis: data channels 1-28, y axis: average scores; separate traces for spectra and sounds).
The correlation (r: 0.76) is quite high. 11 out of 28 rounded
average scores are different between the two displays: in all cases
the difference is 1 point.
3.5. Rating of signal power
Figure 11: Signal power scatter plot (x axis: spectra average scores, y axis: sounds average scores; channel points, regression line, and 1.0 correlation line shown).
Figure 12: Signal power scores for each data set (x axis: data channels 1-28, y axis: average scores; separate traces for spectra and sounds).
The correlation (r: 0.88) is very high. Only 9 out of 28 average
rounded scores are different between the displays, and these
differ by only 1 point.
4. DISCUSSION
For each of the five attributes the average scores for the spectra
show high correlation with the average scores for the sounds.
This means that the two displays do indeed allow users to
gather some basic information about the structure of the data to
a similar degree.
It is reasonable to think that the degree of similarity of the
two displays could be improved by considering the following:
- The audio display could be improved by choosing a different data scaling informed by sound perception principles.
- The subjects were presented with a complex task. They had to score 28 data sets for each of the 5 attributes, both in the visual mode and in the audio mode. It is possible that an easier task (e.g. scoring 10 channels for one category at a time) could show an even higher similarity between the scores.
- The subjects had to score very complex sounds containing (to varying degrees) noise, clicks, the presence of many frequency components, and often a complex evolution of the sound over time. Again, with easier sounds, i.e. simpler data structures, the similarity in the scores could be higher.
- The test questions were often ambiguous. For example, subjects often wondered whether the noise of the clicks, produced when there is a discontinuity in amplitude, should count towards 'presence of noise' since it was already accounted for under 'presence of discontinuity'. These ambiguities have surely contributed to the increase in variance in the results. Less ambiguous questions could yield better results.
The correlation between average scores for the visual display
and the auditory display in the Noise and Signal Power
attributes is higher than that for the other three. The reason for
this difference can probably be found in the nature of the data
displayed and the way the displays were built. For instance, the
perception of frequency influences the perception of loudness,
e.g. to perceive a 100 Hz and a 1000 Hz sound with the same
loudness, the level of the 100 Hz sound needs to be higher than
that of the 1000 Hz sound [15]. It is possible, therefore, that
frequencies that can be seen in the spectrogram are not easily
perceivable in the audification. The difference could also be due
to the different characteristics of the visual and the auditory
sense: one could be better at picking up certain elements than
the other. For instance, the ear could be better at perceiving
repetitive elements in time, since we are used to recognizing
rhythmic structures in sound, while repetitions could be harder
to spot in a spectrogram. A more precise analysis of the results
for each particular channel, focusing in particular on the
differences in scoring between audio and visual display, will be
done in the near future. From the results of such a deeper analysis,
new hypotheses could be made regarding the degree of
similarity and difference of these two displays, which will then
need to be tested with new experiments.
Finally, during the test, the subjects were free to write down any
comments about the spectra or the sounds or the test procedure.
13 out of 23 subjects chose to comment and here is a summary
of the most common observations:
- it is difficult to score noise, discontinuities and repetitions in particular (7 comments);
- I can hear more detail in the sounds than in the spectra (2 comments);
- I feel that I get better at scoring as I go along (4 comments);
- some data sets actually sound like a helicopter (2 comments).
5. CONCLUSIONS
This paper has described an experiment which compares a
visual display and an auditory display in their abilities to
portray basic information about complex time-series data sets.
The subjects of the experiment were asked to score the spectra
and the audifications of the data sets on an integer scale from 1
to 5 for the following attributes: presence of noise, presence of a
repetitive element, presence of discernible frequencies, presence
of amplitude discontinuities and overall signal power. It was
found that the scores for each data set, averaged over all
subjects, showed high correlation between the visual and
auditory displays for all five attributes. This means that the two
displays portray some basic information about these data sets
similarly well.
6. ACKNOWLEDGEMENTS
The data used in this paper was gathered as part of the project
‘Improved data mining through an interactive sonic approach’.
The project was launched in April 2003 and is funded by the
EPSRC (Engineering and Physical Sciences Research Council).
The research team consists of academics at the Universities of
York (User Interfacing and Digital Sound) and Teesside
(Physiotherapy) led by Prof. Tracey Howe, and engineers at
Westland Helicopters led by Prof. Paul Taylor. Many thanks are
due to all the people of KTH (Sweden) and the Electronics, Music
and Computer Science Departments of York University who
participated in this experiment.
7. REFERENCES
[1] Nesbitt K. V. and Barrass S., Finding Trading Patterns in
Stock Market Data, IEEE Computer Graphics and
Applications, September/October 2004, pp. 45-55
[2] Brown L. M. and Brewster S. A., Drawing by ear:
Interpreting Sonified Line Graphs, Proc. International
Conference on Auditory Display (ICAD), 2003
[3] Peres S. C. and Lane D. M., Sonification of Statistical
Graphs, Proc. ICAD, 2003
[4] Valenzuela M. L., Sansalone M. J., Streett W. B.,
Krumhansl C. L., Use of Sound for the Interpretation of
Impact-Echo Signals, Proc. ICAD, 1997.
[5] Tecumseh Fitch W. and Kramer G., Sonifying the Body
Electric: Superiority of an Auditory over a Visual Display
in a Complex, Multivariate System, In Kramer G. (ed)
Auditory Display: Sonification, Audification, and Auditory
Interface, Addison-Wesley, Reading, MA, 1994
[6] Hayward C., Listening to the Earth sing, In Kramer G.
(ed) Auditory Display: Sonification, Audification, and
Auditory Interface, Addison-Wesley, Reading, MA, 1994
[7] Dombois F., Using Audification in planetary seismology,
Proc. ICAD, 2002
[8] Dombois F., Auditory Seismology on free Oscillations,
Focal Mechanisms, Explosions and Synthetic Seismograms,
Proc. ICAD, 2002
[9] Krishnan S., Rangayyan R. M., Douglas Bell G., Frank C.
B., Auditory display of knee-joint vibration signals,
Journal of the Acoustical Society of America, 110(6),
December 2001, pp. 3292-3304.
[10] Pauletto, S., & Hunt, A.D., A Toolkit for interactive
sonification, Proc. ICAD, 2004
[11] Hunt, A. & Hermann, T., The importance of interaction in
sonification, Proc. ICAD, 2004.
[12] Hunt, A.D., & Hermann, T. (eds), Special Issue on
Interactive Sonification, IEEE Multimedia, Apr-Jun 2005.
[13] Kramer, G., 1994, Some organizing principles for
representing data with sound. In Kramer G. (ed) Auditory
Display: Sonification, Audification, and Auditory
Interface, Addison-Wesley, Reading, MA
[14] Pure Data: http://pd.iem.at/
[15] Howard, D. and Angus, J., 1996, Acoustics and
Psychoacoustics, Music Technology Series, Focal Press,
Oxford.