Article

Automatic detection of echolocation clicks based on a Gabor model of their waveform

Abstract

Prior research has shown that echolocation clicks of several species of terrestrial and marine fauna can be modelled as Gabor-like functions. Here, a system is proposed for the automatic detection of a variety of such signals. By means of mathematical formulation, it is shown that the output of the Teager-Kaiser Energy Operator (TKEO) applied to Gabor-like signals can be approximated by a Gaussian function. Based on these inferences, a detection algorithm involving the post-processing of the TKEO outputs is presented. The ratio of the outputs of two moving-average filters, a Gaussian and a rectangular filter, is shown to be an effective detection parameter. Detector performance is assessed using synthetic recordings and real recordings taken from the MobySound database. The detection method is shown to work readily with a variety of echolocation clicks and in various recording scenarios. The system exhibits low computational complexity and operates several times faster than real-time. Performance comparisons are made to other publicly available detectors, including PAMGuard.
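
The Gaussian approximation mentioned in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes a discrete Gabor click of the form A·exp(-(n-n0)²/(2σ²))·cos(Ω(n-n0)+φ), which is one common parameterization and may differ from the one used in the paper, and it uses the standard discrete TKEO, ψ[n] = x[n]² - x[n-1]·x[n+1]. The function names and all parameter values are hypothetical. For a slowly varying envelope, the TKEO output should stay close to A²·sin²(Ω)·exp(-(n-n0)²/σ²).

```python
import math


def gabor_click(n_samples, center, sigma, omega, amplitude=1.0, phase=0.0):
    """Synthetic Gabor-like click (assumed form): Gaussian envelope times a cosine carrier."""
    return [
        amplitude
        * math.exp(-((n - center) ** 2) / (2.0 * sigma ** 2))
        * math.cos(omega * (n - center) + phase)
        for n in range(n_samples)
    ]


def tkeo(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1] (end samples set to 0)."""
    psi = [0.0] * len(x)
    for n in range(1, len(x) - 1):
        psi[n] = x[n] * x[n] - x[n - 1] * x[n + 1]
    return psi


def gaussian_approx(n_samples, center, sigma, omega, amplitude=1.0):
    """Gaussian that the TKEO output should approach when the envelope varies slowly."""
    scale = (amplitude * math.sin(omega)) ** 2
    return [scale * math.exp(-((n - center) ** 2) / sigma ** 2) for n in range(n_samples)]


# Hypothetical values: carrier at 0.2 cycles per sample, envelope sigma of 20 samples.
omega = 2.0 * math.pi * 0.2
click = gabor_click(201, 100, 20.0, omega)
energy = tkeo(click)
approx = gaussian_approx(201, 100, 20.0, omega)
worst = max(abs(e - a) for e, a in zip(energy, approx))
print("largest deviation from the Gaussian approximation:", worst)
```

Note that the approximating Gaussian is narrower than the click envelope (its exponent is divided by σ² rather than 2σ²), because the operator responds to the squared envelope.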


... The pipeline of the system is shown in Fig. 1. The clicks were extracted by an automatic detection method presented by Madhusudhana [9], and were represented by the FFT magnitude spectrum. Using a click train, a prediction was made by a joint strategy based on the outputs of the k nearest neighbors method with reject option (kNNRO). ...
... The automatic method [9] was used to detect echolocation clicks. Prior research [2][8][11] has shown that the output of the Teager-Kaiser Energy Operator (TKEO) applied to echolocation clicks can be approximated by a Gaussian-like function. ...
... 2) Click detection and feature extraction: The parameters of the automatic click detection algorithm were the same as in [9]. The click duration varied from 40 to 70 μs [3]. ...
... The residuals followed a normal distribution and, according to the AIC selection, the best regression was found for the linear mixed model assessing a random effect on the intercept only (Table 4-1). The regression between the number of buzzes and the number of PCA was significantly positive (p < 0.001, Figure 4-3), and the intercept of the model is not significantly different from 0 because of the variance due to individuals (Intercept = 2.9 ± 6.2). ...
... Our detector allows for (i) manual labeling, based on both listening and visualization, and (ii) automatic detection. DeteClic includes 4 different automatic click detection approaches: (i) a Teager-Kaiser energy operator [1], (ii) an intercorrelation computation with a given reference signal, (iii) a spectrogram analysis [2], and (iv) a kurtosis-based statistical detection [3]. ...
Thesis
Many marine predator species feed on fish caught by fishers directly from the fishing gear. Known as depredation, this interaction has substantial socio-economic consequences for fishermen and conservation implications for the wildlife. Costs for fishers include damage to the fishing gear and increased fishing effort to complete quotas. For marine predators, depredation increases risks of mortality (lethal retaliation from fishers or bycatch on the gear). Longline fisheries are the most impacted worldwide, primarily by odontocete (toothed whale) depredation, which makes the development of mitigation solutions urgent. Most studies assessing depredation have relied primarily on surface observation data, so the way odontocetes interact with longlines underwater remains unclear. Moreover, the way fishermen respond to depredation during fishing operations, or can influence their detectability to odontocetes, has been poorly investigated. This thesis therefore aimed at investigating these aspects through passive acoustic monitoring, bio-logging and human ecology approaches, focusing on the French Patagonian toothfish (Dissostichus eleginoides) longline fisheries impacted by killer whales (Orcinus orca) and sperm whales (Physeter macrocephalus). Firstly, this thesis reveals that captains behave as optimal foragers but with different personal perceptions of competition and fishing fulfilment. Some captains would thus be more likely to stay within a patch or to haul the closest longline even in the presence of competition, suggesting these captains would show higher interaction rates. Additionally, the propagation of vessels’ acoustics varied depending on the type of manoeuvre (e.g. going backward vs. forward). The way captains use their vessels to navigate may therefore influence their detectability and so their depredation level.
Secondly, loggers deployed on both the longlines (accelerometers) and odontocetes (GPS-TDR) revealed that killer whales and sperm whales are able to depredate on longlines while soaking on the seafloor. These observations suggest, therefore, that odontocetes can localise fishing activity before the hauling, which could be partially explained by specific acoustic signatures recorded during the setting process. Altogether, the results of the thesis suggest that depredation rates on demersal longlines are most likely underestimated. The thesis also brings some important insights for mitigation measures, suggesting that countermeasures should start from setting to hauling.
... The echolocation clicks were automatically detected and extracted via a process described in detail in Madhusudhana et al. (2015). The process identifies regions of atypically high broadband energy. ...
... The filter difference ratio (FDR) of the outputs of the two filters was used to detect the echolocation clicks. The parameter values were the same as those described in Madhusudhana et al. (2015). Acoustic clips whose corresponding FDR values exceeded 0.65 were hypothesized to contain echolocation clicks. ...
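
The thresholding step described in this excerpt can be sketched as follows. The excerpt does not define the FDR, so this sketch assumes, purely for illustration, FDR = (g - r) / g, where g is a Gaussian-weighted moving average and r a rectangular moving average of the TKEO output over the same window. The definition and parameter values in Madhusudhana et al. (2015) may differ. Only the 0.65 threshold comes from the excerpt; the function names, window length and Gaussian width are hypothetical.

```python
import math


def gaussian_weights(half_width, std):
    """Normalized Gaussian taps over a window of 2 * half_width + 1 samples."""
    raw = [math.exp(-(k * k) / (2.0 * std * std)) for k in range(-half_width, half_width + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def filter_difference_ratio(psi, half_width, std, eps=1e-12):
    """Assumed FDR = (g - r) / g from two moving averages of a TKEO sequence psi.

    g: Gaussian-weighted average, r: rectangular average, both centred on sample n.
    Samples too close to the edges, or with g <= eps, are left at 0.
    """
    weights = gaussian_weights(half_width, std)
    n_taps = 2 * half_width + 1
    fdr = [0.0] * len(psi)
    for n in range(half_width, len(psi) - half_width):
        window = psi[n - half_width:n + half_width + 1]
        g = sum(w * v for w, v in zip(weights, window))
        r = sum(window) / n_taps
        if g > eps:
            fdr[n] = (g - r) / g
    return fdr


def detect(fdr, threshold=0.65):
    """Indices whose FDR exceeds the threshold (0.65 is the value quoted in the excerpt)."""
    return [n for n, value in enumerate(fdr) if value > threshold]


# A Gaussian-shaped TKEO peak yields a high ratio at its centre; a flat input yields zero.
peak = [math.exp(-((n - 200) ** 2) / 400.0) for n in range(401)]
print(detect(filter_difference_ratio(peak, half_width=80, std=10.0)))
```

Under this assumed definition the ratio is close to zero for a flat input and approaches one for a sharply peaked input, which is why a fixed threshold can separate click-like peaks from background.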
Article
A method based on a convolutional neural network for the automatic classification of odontocete echolocation clicks is presented. The proposed convolutional neural network comprises six layers: three one-dimensional convolutional layers, two fully connected layers, and a softmax classification layer. Rectified linear units were chosen as the activation function for each convolutional layer. The input to the first convolutional layer is the raw time signal of an echolocation click. Species prediction was performed for groups of m clicks, and two strategies for species label prediction were explored: the majority vote and maximum posterior. Two datasets were used to evaluate the classification performance of the proposed algorithm. Experiments showed that the convolutional neural network can model odontocete species from the raw time signal of echolocation clicks. With the increase in m, the classification accuracy of the proposed method improved. The proposed method can be employed in passive acoustic monitoring to classify different delphinid species and facilitate future studies on odontocetes.
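
The two group-level labelling strategies named in this abstract can be sketched in a few lines. This is a hedged illustration, not the paper's code: it assumes each click yields a vector of per-species posterior probabilities (for example, softmax outputs), and it interprets "maximum posterior" as choosing the class with the largest summed log-posterior over the m clicks, which is an assumption. All names are hypothetical.

```python
import math
from collections import Counter


def majority_vote(posteriors):
    """Label each click by its most probable class, then return the most frequent label."""
    votes = Counter(max(range(len(p)), key=p.__getitem__) for p in posteriors)
    return votes.most_common(1)[0][0]


def max_joint_posterior(posteriors, eps=1e-12):
    """Return the class with the largest summed log-posterior over the group (assumed rule)."""
    n_classes = len(posteriors[0])
    scores = [
        sum(math.log(max(p[c], eps)) for p in posteriors)
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=scores.__getitem__)


# Hypothetical group of m = 3 clicks and two species.
group = [[0.55, 0.45], [0.55, 0.45], [0.05, 0.95]]
print(majority_vote(group), max_joint_posterior(group))
```

In this example the two rules disagree: two weakly confident clicks win the vote for class 0, while the single highly confident click dominates the summed log-posterior and selects class 1.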
... The ratio of the outputs of two moving-average filters, a Gaussian and a rectangular filter, is effective for detecting click signals. This automatic method was used in this work [7]. ...
... The networks shared the same architecture but varied in kernel size. Networks with convolutional kernel sizes of 1, 3, 5, 7, 9, 11 and 13 were trained, resulting in 7 different models. ...
... A class-specific SVM was trained to identify click vocalizations from four odontocete species. Madhusudhana S et al. proposed an automatic detection method for echolocation clicks based on the Teager-Kaiser energy operator (TKEO) and a Gabor model [22]. The detection algorithm is well suited for processing continuous input audio samples which contain a variety of echolocation clicks. ...
... Experimental results show that both the scale feature-vector and the time feature-vector reflect well the differences in terms of energy distribution and duration between clicks from sperm whales and clicks from long-finned pilot whales. Compared with the existing methods of whale click classification presented in Section 1 [5,9,19-24], the method proposed in the paper shows a better classification performance for both whale species. Moreover, even in the case of a small training dataset size or a small number of features, a satisfactory classification performance for both whale species can also be obtained from the proposed method. ...
... Automation can also reduce biases and errors which often result with human analysts (Aide et al., 2013;Heinicke et al., 2015) and provide additional information from the data such as bearings calculated from time of arrival differences for signals received on multiple hydrophones. A suite of automatic processes are currently available for analysing acoustic data for a range of species, including click detectors and click classifiers (Gillespie et al., 2008;Madhusudhana et al., 2015;Miller & Miller, 2018), energy band comparisons (Klinck & Mellinger, 2011), extraction of spectral features (Gillespie et al., 2013;Lin & Chou, 2015) and more recently, machine learning methods (Bergler et al., 2019;Bermant et al., 2019;Jiang et al., 2018;Shamir et al., 2014). These methods differ in their computational requirements, performance and ability to process sounds from a range of species, the wider environment and anthropogenic sources. ...
Article
Passive acoustic surveys are becoming increasingly popular as a means of surveying for cetaceans and other marine species. These surveys yield large amounts of data, the analysis of which is time consuming and can account for a substantial proportion of the survey budget. Semi‐automatic processes enable the bulk of processing to be conducted automatically while allowing analyst time to be reserved for validating and correcting detections and classifications. Existing modules within the Passive Acoustic Monitoring software PAMGuard were used to process a large (25.4 Terabyte) dataset collected during towed acoustic ship transits. The recently developed ‘Multi‐Hypothesis Tracking Click Train Detector’ and the ‘Whistle and Moan Detector’ modules were used to identify occasions within the dataset at which vocalising toothed whales (odontocetes) were likely to be acoustically present. These putative detections were then reviewed by an analyst, with false positives being corrected. Target motion analysis provided a perpendicular distance to odontocete click events enabling the estimation of detection functions for both sperm whales and delphinids. Detected whistles were assigned to the lowest taxonomical level possible using the PAMGuard ‘Whistle Classifier’ module. After an initial tuning process, this semi‐automatic method required 91 hr of an analyst's time to manually review both automatic click train and whistle detections from 1,696 hr of survey data. Use of the ‘Multi‐Hypothesis Tracking Click Train Detector’ reduced the amount of data for the analyst to search by 74.5%, while the ‘Whistle and Moan Detector’ reduced data to search by 85.9%. In total, 443 odontocete groups were detected, of which 55 were from sperm whale groups, six were from beaked whales, two were from porpoise and the remaining 380 were identified to the level of delphinid group. An effective survey strip half width of 3,277 and 699 m was estimated for sperm whales and delphinids respectively. 
The semi‐automatic workflow proved successful, reducing the amount of analyst time required to process the data, significantly reducing overall project costs. The workflow presented here makes use of existing modules within PAMGuard, a freely available and open‐source software, readily accessible to acoustic analysts.
... Signal-and image-processing techniques have been used in the temporal and spectral domains to automatically detect and quantify fauna sounds. For example, click detectors operating on waveforms have been applied to recordings of dolphins and porpoises (e.g., Sostres and Nuuttila, 2015), belugas (Le Bot et al., 2015), sperm whales (Madhusudhana et al., 2015), beaked whales (e.g., Yack et al., 2010;Le Bien, 2017), and snapping shrimp (e.g., Bohnenstiehl et al., 2016;Du et al., 2018). Matched-filtering of spectrograms has been used to detect highly stereotypical sounds of some whales and fishes (e.g., Mellinger and Clark, 1997;Ricci et al., 2017;Madhusudhana et al., 2020;Ogundile and Versfeld, 2020). ...
Article
Aquatic environments encompass the world’s most extensive habitats, rich with sounds produced by a diversity of animals. Passive acoustic monitoring (PAM) is an increasingly accessible remote sensing technology that uses hydrophones to listen to the underwater world and represents an unprecedented, non-invasive method to monitor underwater environments. This information can assist in the delineation of biologically important areas via detection of sound-producing species or characterization of ecosystem type and condition, inferred from the acoustic properties of the local soundscape. At a time when worldwide biodiversity is in significant decline and underwater soundscapes are being altered as a result of anthropogenic impacts, there is a need to document, quantify, and understand biotic sound sources–potentially before they disappear. A significant step toward these goals is the development of a web-based, open-access platform that provides: (1) a reference library of known and unknown biological sound sources (by integrating and expanding existing libraries around the world); (2) a data repository portal for annotated and unannotated audio recordings of single sources and of soundscapes; (3) a training platform for artificial intelligence algorithms for signal detection and classification; and (4) a citizen science-based application for public users. Although individually, these resources are often met on regional and taxa-specific scales, many are not sustained and, collectively, an enduring global database with an integrated platform has not been realized. 
We discuss the benefits such a program can provide, previous calls for global data-sharing and reference libraries, and the challenges that need to be overcome to bring together bio- and ecoacousticians, bioinformaticians, propagation experts, web engineers, and signal processing specialists (e.g., artificial intelligence) with the necessary support and funding to build a sustainable and scalable platform that could address the needs of all contributors and stakeholders into the future.
... Gabor functions were used to generate simulated clicks to pretrain the DCAEs. The following representation (Madhusudhana et al., 2015) of the discrete Gabor signals was selected to generate the network training data: ...
Article
Ocean noise has a negative impact on the acoustic recordings of odontocetes' echolocation clicks. In this study, deep convolutional autoencoders (DCAEs) are presented to denoise the echolocation clicks of the finless porpoise (Neophocaena phocaenoides sunameri). A DCAE consists of an encoder network and a decoder network. The encoder network is composed of convolutional layers and fully connected layers, whereas the decoder network consists of fully connected layers and transposed convolutional layers. The training scheme of the denoising autoencoder was applied to learn the DCAE parameters. In addition, transfer learning was employed to address the difficulty in collecting a large number of echolocation clicks that are free of ambient sea noise. Gabor functions were used to generate simulated clicks to pretrain the DCAEs; subsequently, the parameters of the DCAEs were fine-tuned using the echolocation clicks of the finless porpoise. The experimental results showed that a DCAE pretrained with simulated clicks achieved better denoising results than a DCAE trained only with echolocation clicks. Moreover, deep fully convolutional autoencoders, which are special DCAEs that do not contain fully connected layers, generally achieved better performance than the DCAEs that contain fully connected layers.
... It was first proposed for echolocation detection by Kandia and Stylianou (2006) and can be smoothed when working on high-frequency data for effective click detection. Characteristics of the energy detections are frequently examined to determine if the detected signal should be considered a click, frequently looking at features such as duration, peak frequency, bandwidth, envelope shape, etc. (e.g., Soldevilla, 2008; Frasier, 2015; Madhusudhana et al., 2015). Additional methods examine spectral characteristics (Bermant et al., 2019), spectral and temporal characteristics (Zimmer et al., 2005a), and phase changes in the group delay of the time-domain signal (Kandia and Stylianou, 2008). ...
Article
This work demonstrates the effectiveness of using human-in-the-loop processes for constructing large training sets for machine learning tasks. A corpus of over 57 000 toothed whale echolocation clicks was developed by using a permissive energy-based echolocation detector followed by a machine-assisted quality control process that exploits contextual cues. Subsets of these data were used to train feed-forward neural networks that detected over 850 000 echolocation clicks that were validated using the same quality control process. It is shown that this network architecture performs well in a variety of contexts and is evaluated against a withheld data set that was collected nearly five years apart from the development data at a location over 600 km distant. The system was capable of finding echolocation bouts that were missed by human analysts, and the patterns of error in the classifier consist primarily of anthropogenic sources that were not included as counter-training examples. In the absence of such events, typical false positive rates are under ten events per hour even at low thresholds.
... The point with the maximal Teager-Kaiser energy is tagged as the central point of a click. Echolocation clicks of toothed whales can be approximated by the following Gabor-like function [4]: ...
Article
Passive acoustic monitoring records large amounts of acoustic data, and thus an efficient detection method is essential to analyse these data. A novel approach is proposed to automatically detect odontocete echolocation clicks. A time-frequency filter detects echolocation clicks in the time-frequency domain, and then the Teager-Kaiser energy operator and Gabor curve-fitting method determine the start and end points of each echolocation click precisely. Detector performance is assessed by using synthetic data. The experimental results showed that the recall rates were higher than 90% when the signal-to-noise ratio was above 10 dB.
... It would be impossible to analyse long-term recordings solely by ear and eye and therefore a multitude of autodetection and classification algorithms have been developed. These include matched filtering (Stafford, Fox & Clark 1998), spectrogram correlation (Mellinger & Clark 1997), peak energy detection in certain frequency bands or ratios of energy in target to non-target frequency bands for clicks (Klinck & Mellinger 2011) and tonal sounds, Shannon entropy (Erbe & King 2008) or Teager-Kaiser energy computation (Madhusudhana, Gavrilov & Erbe 2015), image-processing techniques such as edge and ridge detection (Kershenbaum & Roch 2013), neural networks (Erbe 2000) and other techniques. Detectors are often tuned to search for specific signal features and they perform well in a limited range of environments with which they were trained. ...
Conference Paper
The Australian marine soundscape exhibits a diversity of sounds, which can be grouped into biophony, geophony and anthrophony based on their sources. Animals from tiny shrimp, to lobsters, fish and seals, to the largest animals on Earth, blue whales, contribute to the Australian marine biophony. Wind, rain, surf, Antarctic ice break-up and marine earthquakes make up the geophony. Ship traffic, mineral and petroleum exploration and production, construction, defence exercises and commercial fishing add to the anthrophony. While underwater recorders have become affordable mainstream equipment, precise sound recording and analysis remain an art. Australia's Integrated Marine Observing System (IMOS) consists of a network of oceanographic and remote sensors, including passive acoustic listening stations managed by the Centre for Marine Science & Technology, Curtin University, Perth. All of the acoustic recordings are freely available online. Long-term records up to a decade exist at some sites. The recordings provide an exciting window into the underwater world. We present examples of soundscapes from around Australia and discuss various aspects of soundscape recording, analysis and reporting: the to-dos and not-to-dos.
1. INTRODUCTION
The marine soundscape is a rapidly growing field of research. At relatively low cost, marine soundscapes can be monitored over long periods of time. They provide information on geophysical events and weather, on human activities and on the animals living in the environment, entirely non-invasively by passively listening at a distance. Soundscapes are often compared to identify good versus bad habitat, or changes of an environment over time. However, comparisons can be difficult because of differences in sound measurement, analysis and reporting. The development of acoustic standards would help overcome some of these challenges, but is a long and tedious process depending entirely on voluntary time (Erbe, Ainslie, et al. 2016).
Also, people working in slightly different fields of research sometimes use terminology differently. International Standard ISO 12913-1 (2014) defines an acoustic environment as the "sound at the receiver from all sound sources as modified by the environment" (International Organization for Standardization (ISO) 2014). The soundscape, however, is a perceptual construct and requires a listener: "acoustic environment as perceived or experienced and/or understood by a person or people, in context". In underwater acoustics, the term soundscape is often used synonymously with acoustic environment and includes all sounds of an environment independent of a listener who might "filter" the received sound. The sounds of an acoustic environment are often grouped into geophony, anthrophony and biophony comprising abiotic, anthropogenic and biotic sounds respectively. These sounds vary with geographic location, recording depth below the sea surface, time of day, season and year. The sound propagation environment (characterised by the bathymetry, seafloor geology, water temperature, salinity, etc.) varies on similar scales and affects the spectral and temporal features of the sounds received. Finally, the chosen sound recording and analysis parameters (system calibration, sampling frequency, duty cycle, Fourier analysis settings, any filtering or averaging applied, etc.) affect the measured or displayed features of a soundscape. Here, we show examples of Australian marine soundscapes and provide some suggestions for soundscape recording, analysis and reporting.
Article
In the Anthropocene, there is an intensification of ecological disturbances that can be directly or indirectly attributed to human activities. Seagrass meadows are among the most imperiled ecosystems worldwide and are particularly sensitive to anthropogenic disturbance events. In Everglades National Park, Florida, USA, ecological disturbances, including widespread seagrass die offs, can in part be attributed to anthropogenic alteration of historic freshwater flow. In this study, we assess the impacts of seagrass die offs on the composition and behavior of aquatic organisms in acoustic communities at two ecologically similar sites in Florida Bay: Buoy Key, which experienced massive seagrass die off in 2015, and Little Rabbit Key, which did not. We focused on three sonic indicator species for North American estuaries and coastal ecosystems, and specifically Florida Bay: spotted seatrout (Cynoscion nebulosus, family Sciaenidae), gulf toadfish (Opsanus beta, family Batrachoididae), and snapping shrimp (Alpheus spp. and Synalpheus spp., family Alpheidae). The sounds produced by these species are collectively indicative of a broader range of ecosystem function and services, including trophic transfer and recreational fishing, and are more indicative of ecosystem functioning at varying temporal scales than any one species alone. Using a combination of automated detection and visual spectrographic analysis on passive acoustic data, we found that the occurrence of seatrout and toadfish calls and/or chorusing was significantly lower at the disturbed site than at the undisturbed site, with no seatrout or toadfish breeding choruses detected at the disturbed location. At both sites, toadfish call properties exhibited a direct relationship with water temperature.
Snapping shrimp snap rates were significantly lower at the disturbed site, and snap rates were positively related to temperature at the undisturbed location. Further, examination of a short-duration, pre-disturbance acoustic dataset showed minimal differences in the acoustic occurrence of the three species at the disturbed site between 2015 immediately after the die off and 2018. These results support the hypothesis that by 2018 the disturbed site had not fully recovered from the most recent seagrass die off and that restoration efforts may still be needed. Acoustic community composition and behavior may be used as a reliable indicator of ecosystem health and restoration success or efficacy. This work lays the conceptual groundwork for using acoustic communities as indicators in human altered systems to inform managers that habitats either may be approaching disturbance or may require continued restoration efforts to achieve recovery goals.
Article
An effective method for detecting the presence of dolphins is passive acoustic monitoring (PAM), where pod size can be estimated by counting individual whistles. The detection of dolphin whistles is commonly applied on a time-frequency representation, followed by denoising and whistle tracking to evaluate the number of whistles. However, due to harmonics, multipath and time-varying signal-to-noise ratio, a single dolphin whistle may be associated with multiple whistle traces. Thus, as a first step towards evaluating dolphins' abundance, our goal is to cluster individual whistle traces into unique whistles. Our scheme measures the similarity between each pair of whistle traces, and estimates the likelihood of whistle traces sharing the same cluster. Clustering is formalized as an optimization problem that aims to maximize the stability of clusters. Formalizing the problem as a minimal-cut optimization on a graph provides an effective solution based on spectral decomposition of the graph Laplacian. Our model of the likelihood of sharing a cluster provides a physically meaningful method to calculate the graph's connectivity parameters, thereby leading to a robust blind clustering. Based on numerical simulations and real recordings of dolphin whistles at sea, we demonstrate the applicability of our solution and its advance beyond alternative approaches.
Article
This paper introduces a feature extraction technique that identifies highly informative features from sonar magnitude spectra for automated target classification. The approach involves creating feature representations through convolution of a two-dimensional Gabor wavelet and acoustic color magnitudes to capture elastic waves. This feature representation contains extracted localized features in the form of Gabor stripes, which are representative of unique targets and are invariant of target aspect angle. Further processing removes non-informative features through a threshold-based culling. This paper presents an approach that begins connecting model-based domain knowledge with machine learning techniques to allow interpretation of the extracted features while simultaneously enabling robust target classification. The relative performance of three supervised machine learning classifiers, specifically a support vector machine, random forest, and feed-forward neural network are used to quantitatively demonstrate the representations' informationally rich extracted features. Classifiers are trained and tested with acoustic color spectrograms and features extracted using the algorithm, interpreted as stripes, from two public domain field datasets. An increase in classification performance is generally seen, with the largest being a 47% increase from the random forest tree trained on the 1–31 kHz PondEx10 data, suggesting relatively small datasets can achieve high classification accuracy if model-cognizant feature extraction is utilized.
Article
In this work, a convolutional neural network based method is proposed to automatically detect odontocete echolocation clicks by analyzing acoustic data recordings from a passive acoustic monitoring system. The neural network was trained to distinguish between click and non-click clips and was subsequently converted to a fully convolutional network. The performance of the proposed network was evaluated using synthetic data and real audio recordings. The experimental results indicate that the proposed method works stably with echolocation clicks of different species.
Article
In this paper we present an alternative to the usual energy-based approaches for detecting whale clicks. We suggest the use of the phase spectrum, since the information about the location of clicks is very well represented in phase spectra. The method is referred to as the phase slope function. It is shown that the phase slope function is robust to additive noise while it offers simplicity in click detection since it is independent of the click source level. We further discuss its properties regarding the mono-pulse and multi-pulse character of clicks by introducing the notion of center of gravity for clicks. To evaluate the suggested phase-based whale click detector, we labeled clicks by hand in recordings of sperm and beaked whales provided by the Atlantic Undersea Test and Evaluation Center (AUTEC). Detection tests demonstrate that 88% (on average) of the hand-labeled mono-pulsed clicks were detected within an accuracy of less than 1 ms. Regarding the detection of multi-pulsed clicks, we were able to detect over 95% of them by considering a multi-pulsed click as one acoustic event and not as a series of pulses.
Article
The reliable and accurate measurement of descriptive features to enable accurate description and statistical comparison of animal sounds has presented one of the major difficulties in bioacoustics. To facilitate a quantitative analysis of the acoustic repertoire of Hector's dolphins (Cephalorhynchus hectori), a computer-based measurement system in a signal processing language (SIGPROC) has been developed. This automated system allowed us to easily measure descriptive features from 7661 dolphin clicks. The reliability of the measurement system was confirmed by reconstructing some of the clicks from their measured features. The measurement and reconstruction techniques are described. Analyses of the measured data showed that most Hector's dolphin vocalizations are simple, high-frequency, narrow-band "clicks." A few examples of wideband clicks were found, as was a sound comprising two frequency components.
Article
Biosonar signals radiated along the beam axis of an Atlantic bottlenose dolphin resemble short transient oscillations. As the azimuth of the measuring hydrophones in the horizontal plane progressively increases with respect to the beam axis, the signals become progressively distorted. At approximately ±45°, the signals begin to divide into two components, with the time difference between the components increasing with increasing angle. At ±90°, or normal to the longitudinal axis of the animal, the time difference between the two pulses measured by the hydrophone on the right side of the dolphin's head is, on average, ∼11.9 μs larger than the time difference observed by the hydrophone on the left side of the dolphin's head. The center frequency of the first pulse is generally lower, by 33-47 kHz, than the center frequency of the second pulse. When considering the relative locations of the two phonic lips, the data suggest that the signals are being produced by one of the phonic lips, with the second pulse resulting from a reflection within the head of the animal. The generation of biosonar signals is a complex process and the propagation pathways through the dolphin's head are not well understood.
Article
Spectrogram correlation is an effective and widely used technique for detecting and classifying cetacean echolocation clicks. Because different species operate in different frequency bands, different spectral templates are needed to detect them. However, examination of the spectrograms from a wide range of whales and dolphins suggests they are generally similar in form while occupying different frequency ranges and spreading over different time intervals. This suggests that if the clicks from different species can be translated in frequency and temporally normalised, one detector will suffice for all species. This would have many advantages: the frequency shifting is easily achieved with simple analogue electronics; if all signals are translated to baseband, a sampling rate of 96 kHz would probably be adequate, so off-the-shelf audio equipment can be used; and lower sampling rates mean reduced computing requirements. This paper investigates these ideas and proposes a system design that can carry out the required processes, detect all species of echolocating cetacean, and classify the species from the degree of frequency shift required.
Article
The development of an algorithm for automatic detection of sperm whale clicks in large recordings is described. It is based on the Teager–Kaiser (TK) energy operator and is able to efficiently detect creaks as well as regular clicks. A matching filter is first used as a pre-processor in order to overcome the difficulties caused by the multi-pulse structure of regular clicks. Next, the TK energy operator is applied to the output of the matching filter. A first selection of clicks is performed based on statistical measurements on the TK output, while the final selection is carried out by a forward–backward search algorithm. The proposed system has been tested on a total duration of 25 min of data containing regular clicks as well as creak clicks, where the location of clicks has been marked by hand. An average rate of 94.05% of correct detections was achieved by comparison with the hand-labeled data created from the tested files. Compared with a standard method used for the same task, the proposed algorithm achieves a 30% higher detection rate and is much more accurate and robust.
Article
Sounds from Longman's beaked whale, Indopacetus pacificus, were recorded during shipboard surveys of cetaceans surrounding the Hawaiian Islands archipelago; this represents the first known recording of this species. Sounds included echolocation clicks and burst pulses. Echolocation clicks were grouped into three categories: a 15 kHz click (n = 106), a 25 kHz click (n = 136), and a 25 kHz pulse with a frequency-modulated upsweep (n = 70). The 15 and 25 kHz clicks were relatively short (181 and 144 μs, respectively); the longer 25 kHz upswept pulse was 288 μs. Burst pulses were long (0.5 s) click trains with approximately 240 clicks/s.
Article
Many odontocetes produce frequency modulated tonal calls known as whistles. The ability to automatically determine time × frequency tracks corresponding to these vocalizations has numerous applications including species description, identification, and density estimation. This work develops and compares two algorithms on a common corpus of nearly one hour of data collected in the Southern California Bight and at Palmyra Atoll. The corpus contains over 3000 whistles from bottlenose dolphins, long- and short-beaked common dolphins, spinner dolphins, and melon-headed whales that have been annotated by a human, and released to the Moby Sound archive. Both algorithms use a common signal processing front end to determine time × frequency peaks from a spectrogram. In the first method, a particle filter performs Bayesian filtering, estimating the contour from the noisy spectral peaks. The second method uses an adaptive polynomial prediction to connect peaks into a graph, merging graphs when they cross. Whistle contours are extracted from graphs using information from both sides of crossings. The particle filter was able to retrieve 71.5% (recall) of the human annotated tonals with 60.8% of the detections being valid (precision). The graph algorithm's recall rate was 80.0% with a precision of 76.9%.
Article
The spectral and temporal properties of echolocation clicks and the use of clicks for species classification are investigated for five species of free-ranging dolphins found offshore of southern California: short-beaked common (Delphinus delphis), long-beaked common (D. capensis), Risso's (Grampus griseus), Pacific white-sided (Lagenorhynchus obliquidens), and bottlenose (Tursiops truncatus) dolphins. Spectral properties are compared among the five species and unique spectral peak and notch patterns are described for two species. The spectral peak mean values from Pacific white-sided dolphin clicks are 22.2, 26.6, 33.7, and 37.3 kHz and from Risso's dolphins are 22.4, 25.5, 30.5, and 38.8 kHz. The spectral notch mean values from Pacific white-sided dolphin clicks are 19.0, 24.5, and 29.7 kHz and from Risso's dolphins are 19.6, 27.7, and 35.9 kHz. Analysis of variance analyses indicate that spectral peaks and notches within the frequency band 24-35 kHz are distinct between the two species and exhibit low variation within each species. Post hoc tests divide Pacific white-sided dolphin recordings into two distinct subsets containing different click types, which are hypothesized to represent the different populations that occur within the region. Bottlenose and common dolphin clicks do not show consistent patterns of spectral peaks or notches within the frequency band examined (1-100 kHz).
Article
This study presents a system for classifying echolocation clicks of six species of odontocetes in the Southern California Bight: Visually confirmed bottlenose dolphins, short- and long-beaked common dolphins, Pacific white-sided dolphins, Risso's dolphins, and presumed Cuvier's beaked whales. Echolocation clicks are represented by cepstral feature vectors that are classified by Gaussian mixture models. A randomized cross-validation experiment is designed to provide conditions similar to those found in a field-deployed system. To prevent matched conditions from inappropriately lowering the error rate, echolocation clicks associated with a single sighting are never split across the training and test data. Sightings are randomly permuted before assignment to folds in the experiment. This allows different combinations of the training and test data to be used while keeping data from each sighting entirely in the training or test set. The system achieves a mean error rate of 22% across 100 randomized three-fold cross-validation experiments. Four of the six species had mean error rates lower than the overall mean, with the presumed Cuvier's beaked whale clicks showing the best performance (<2% error rate). Long-beaked common and bottlenose dolphins proved the most difficult to classify, with mean error rates of 53% and 68%, respectively.
Article
Porpoise echolocation has been studied previously, mainly in target detection experiments using stationed animals and steel sphere targets, but little is known about the acoustic behaviour of free-swimming porpoises echolocating for prey. Here, we used small onboard sound and orientation recording tags to study the echolocation behaviour of free-swimming trained porpoises as they caught dead, freely drifting fish. We analysed porpoise echolocation behaviour leading up to and following prey capture events, including variability in echolocation in response to vision restriction, prey species, and individual porpoise tested. The porpoises produced echolocation clicks as they searched for the fish, followed by fast-repetition-rate clicks (echolocation buzzes) when acquiring prey. During buzzes, which usually began when porpoises were about 1-2 body lengths from prey, tag-recorded click levels decreased by about 10 dB, click rates increased to over 300 clicks per second, and variability in body orientation (roll) increased. Buzzes generally continued beyond the first contact with the fish, and often extended until or after the end of prey handling. This unexplained continuation of buzzes after prey capture raises questions about the function of buzzes, suggesting that in addition to providing detailed information on target location during the capture, they may serve additional purposes such as the relocation of potentially escaping prey. We conclude that porpoises display the same overall acoustic prey capture behaviour seen in larger toothed whales in the wild, albeit at a faster pace, clicking slowly during search and approach phases and buzzing during prey capture.
Article
We analyzed the echolocation sounds emitted by Mediterranean bottlenose dolphins. We extracted the click trains by visual inspection of data files recorded along the coast of Tuscany in collaboration with the CETUS Research Center. We modeled the extracted sonar clicks as Gaussian or exponential multicomponent signals, estimated the characteristic parameters, and compared the data with the signals reconstructed from the estimates. The estimation and data-fitting results are presented in detail in the paper.
Article
We tried to find discriminating features for sperm whale clicks in order to distinguish between clicks from different whales, or to enable unique identification. We examined two different methods to obtain suitable characteristics. First, a model based on the Gabor function was used to describe the dominant frequencies in a click, and then the model parameters were used as classification features. The Gabor function model was selected because it has been used to model dolphin sonar pulses with great precision. Additionally, it has the interesting property that it has an optimal time–frequency resolution. As such, it can indicate optimal usage of the sonar by sperm whales. Second, the clicks were expressed in a wavelet packet table, from which subsequently a local discriminant basis was created. A wavelet packet basis has the advantage that it offers a highly redundant number of coefficients, which allow signals to be represented in many different ways. From the redundant signal description a representation can be selected that emphasizes the differences between classes. This local discriminant basis is more flexible than the Gabor function, which can make it more suitable for classification, but it is also more complex. Class vectors were created with both models and classification was based on the distance of a click to these vectors. We show that the Gabor function could not model the sperm whale clicks very well, due to the variability of the click characteristics. Best performance was reached when three subsequent clicks were averaged to smooth the variability. Around 70% of the clicks were classified correctly in both the training and validation sets. The wavelet packet table adapted better to the changing characteristics, and gave better classification. Here, also using a 3-click moving average, around 95% of the training sets and 78% of the validation sets were classified correctly. These numbers dropped by only a few per cent when single clicks, instead of a moving average, were classified. This indicates that, while the features may show too much variability to enable unique identification of individual whales on a click by click basis, the wavelet approach may be capable of distinguishing between a small group of whales.
Article
On the basis of measured receptive field profiles and spatial frequency tuning characteristics of simple cortical cells, it can be concluded that the representation of an image in the visual cortex must involve both spatial and spatial frequency variables. In a scheme due to Gabor, an image is represented in terms of localized symmetrical and antisymmetrical elementary signals. Both measured receptive fields and measured spatial frequency tuning curves conform closely to the functional form of Gabor elementary signals. It is argued that the visual cortex representation corresponds closely to the Gabor scheme owing to its advantages in treating the subsequent problem of pattern recognition.
Article
Toothed whales (Odontoceti, Cetacea) navigate and locate prey by means of active echolocation. Studies on captive animals have accumulated a large body of knowledge concerning the production, reception and processing of sound in odontocete biosonars, but there is little information about the properties and use of biosonar clicks of free-ranging animals in offshore habitats. This study presents the first source parameter estimates of biosonar clicks from two free-ranging oceanic delphinids, the opportunistically foraging Pseudorca crassidens and the cephalopod eating Grampus griseus. Pseudorca produces short duration (30 μs), broadband (Q=2-3) signals with peak frequencies around 40 kHz, centroid frequencies of 30-70 kHz, and source levels between 201-225 dB re. 1 μPa (peak to peak, pp). Grampus also produces short (40 μs), broadband (Q=2-3) signals with peak frequencies around 50 kHz, centroid frequencies of 60-90 kHz, and source levels between 202 and 222 dB re. 1 μPa (pp). On-axis clicks from both species had centroid frequencies in the frequency range of most sensitive hearing, and lower peak frequencies and higher source levels than reported from captive animals. It is demonstrated that sound production in these two free-ranging echolocators is dynamic, and that free-ranging animals may not always employ biosonar signals comparable to the extreme signal properties reported from captive animals in long-range detection tasks. Similarities in source parameters suggest that evolutionary factors other than prey type determine the properties of biosonar signals of the two species. Modelling shows that interspecific detection ranges of prey types differ from 80 to 300 m for Grampus and Pseudorca, respectively.
Article
Rousettus aegyptiacus Geoffroy 1810 is a member of the only genus of Megachiropteran bats to use vocal echolocation, but the structure of its brief, click-like signal is poorly described. Although thought to have a simple echolocation system compared to that of Microchiroptera, R. aegyptiacus is capable of good obstacle avoidance using its impulse sonar. The energy content of the signal was at least an order of magnitude smaller than in Microchiropteran bats and dolphins (approximately 4 × 10⁻⁸ J m⁻²). Measurement of the duration, amplitude and peak frequency demonstrate that the signals of this animal are broadly similar in structure and duration to those of dolphins. Gabor functions were used to model signals and to estimate signal parameters, and the quality of the Gabor function fit to the early part of the signal demonstrates that the echolocation signals of R. aegyptiacus match the minimum spectral spread for their duration and amplitude and are thus well matched to its best hearing sensitivity. However, the low energy content of the signals and short duration should make returning echoes difficult to detect. The performance of R. aegyptiacus in obstacle avoidance experiments using echolocation therefore remains something of a conundrum.
Article
Blainville's beaked whales (Mesoplodon densirostris Blainville) echolocate for prey during deep foraging dives. Here we use acoustic tags to demonstrate that these whales, in contrast to other toothed whales studied, produce two distinct types of click sounds during different phases in biosonar-based foraging. Search clicks are emitted during foraging dives with inter-click intervals typically between 0.2 and 0.4 s. They have the distinctive form of an FM upsweep (modulation rate of about 110 kHz ms⁻¹) with a -10 dB bandwidth from 26 to 51 kHz and a pulse length of 270 μs, somewhat similar to chirp signals in bats and Cuvier's beaked whales (Ziphius cavirostris Cuvier), but quite different from clicks of other toothed whales studied. In comparison, the buzz clicks, produced in short bursts during the final stage of prey capture, are short (105 μs) transients with no FM structure and a -10 dB bandwidth from 25 to 80 kHz or higher. Buzz clicks have properties similar to clicks reported from large delphinids and hold the potential for higher temporal resolution than the FM clicks. It is suggested that the two click types are adapted to the separate problems of target detection and classification versus capture of low target strength prey in a cluttered acoustic environment.
Article
We propose a novel transform, an expansion of an arbitrary function onto a basis of multi-scale chirps (swept frequency wave packets). We apply this new transform to a practical problem in marine radar: the detection of floating objects by their "acceleration signature" (the "chirpyness" of their radar backscatter), and obtain results far better than those previously obtained by other current Doppler radar methods. Each of the chirplets essentially models the underlying physics of motion of a floating object. Because it so closely captures the essence of the physical phenomena, the transform is near optimal for the problem of detecting floating objects. Besides applying it to our radar image processing interests, we also found the transform provided a very good analysis of actual sampled sounds, such as bird chirps and police sirens, which have a chirplike nonstationarity, as well as Doppler sounds from people entering a room, and from swimmers amid sea clutter.
Article
The Transient Research Underwater Detector (TRUD) is designed to search for echolocation clicks from marine mammals. It uses a spectrogram correlation method with a set of reference matrices to search for clicks from multiple species. This paper describes the algorithm and presents the results of processing the workshop datasets from the Third International Workshop on the Detection and Classification of Marine Mammals using Passive Acoustics held in Boston in July, 2007. The work shows that TRUD can detect and classify the target species. Recommendations are made for further improvements to the algorithm.
Article
A species classifier is presented which decides whether or not short groups of clicks are produced by one or more individuals from the following species: Blainville's beaked whales, short-finned pilot whales, and Risso's dolphins. The system locates individual clicks using the Teager energy operator and then constructs feature vectors for these clicks using cepstral analysis. Two different types of detectors confirm or reject the presence of each species. Gaussian mixture models (GMMs) are used to model time series independent characteristics of the species feature vector distributions. Support vector machines (SVMs) are used to model the boundaries between each species' feature distribution and that of other species. Detection error tradeoff curves for all three species are shown with the following equal error rates: Blainville's beaked whales (GMM 3.32%/SVM 5.54%), pilot whales (GMM 16.18%/SVM 15.00%), and Risso's dolphins (GMM 0.03%/SVM 0.70%).
Article
Acoustic signals from free-ranging finless porpoises were recorded in the waters around Hong Kong during March 2000. Finless porpoises produced short-duration high-frequency clicks. Signal analysis showed finless porpoise clicks to be both "typical" phocoenid sounds, i.e. narrowband, high frequency ultrasonic pulses, and "atypical" broadband pulses with sharp onsets. Peak energy in the narrowband porpoise click spectrum occurred at 142 kHz, with negligible energy below 100 kHz. Energy was more diffuse in the spectra of broadband clicks, with a tendency towards higher frequencies. Mean pulse duration of narrowband clicks was 104 microseconds, whereas mean pulse duration of broadband clicks was 61 microseconds. Generally, finless porpoise clicks were inaudible to the human ear, except on occasion when faint, but distinct, pulses were heard from animals close to the hydrophone.
Article
Navy sonar has recently been implicated in several marine mammal stranding events. Beaked whales (particularly Mesoplodon densirostris) have been the predominant species involved in a number of these strandings. Monitoring and mitigating the effects of anthropogenic noise on marine mammals are active areas of research. Key to both monitoring and mitigation is the ability to automatically detect and classify animals, especially beaked whales. This paper presents a novel support vector machine based methodology for automated, species level classification of small odontocetes. The new classifier, called the class-specific support vector machine (CS-SVM), consists of multiple binary SVMs where each SVM discriminates between a class of interest and a common reference class. A main objective in the development of the CS-SVM was to realize a robust multi-class SVM whose implementation is simpler than existing multi-class SVM methods. A CS-SVM was trained to identify click vocalizations from four species of odontocetes including Mesoplodon densirostris. The algorithm processes time series data in a fully automated fashion, first detecting and then classifying click events. Results from the application of this automated classifier to the data sets provided by the 3rd International Workshop on Detection and Classification of Marine Mammals Using Passive Acoustics are presented.
Book
The sonar of dolphins has undergone evolutionary refinement for millions of years and has evolved to be the premier sonar system for short range applications. It far surpasses the capability of technological sonar, i.e. the only sonar system the US Navy has to detect buried mines is a dolphin system. Echolocation experiments with captive animals have revealed much of the basic parameters of the dolphin sonar. Features such as signal characteristics, transmission and reception beam patterns, hearing and internal filtering properties will be discussed. Sonar detection range and discrimination capabilities will also be included. Recent measurements of echolocation signals used by wild dolphins have expanded our understanding of their sonar system and their utilization in the field. A capability to perform time-varying gain has been recently uncovered which is very different from that of a technological sonar. A model of killer whale foraging on Chinook salmon will be examined in order to gain an understanding of the effectiveness of the sonar system in nature. The model will examine foraging in both quiet and noisy environments and will show that the echo levels are more than sufficient for prey detection at relatively long ranges.
Article
A large collection of underwater sounds (sonar clicks) from six specimens of Tursiops truncatus has been investigated with regard to their behaviour in dominant frequency. The recordings include not only Atlantic and Pacific Tursiops in captivity, but also the sonar sounds of free-ranging wild T. truncatus from the Atlantic. Applying a parametric description of the sonar waveform, following the Gabor model, a cluster representation of the two highest-ranking features of the sonar signal is given to illustrate the acoustic behaviour. Dominant frequencies for the whole signal collection range from a lower 40 kHz up to 80 kHz. Up to a certain level, separate clusters are distinguishable. The ellipsoid cluster for all the data in the scatter plot indicates for the dominant frequency a tendency towards a linear relationship based on the concept of constant relative bandwidth. This phenomenon is closely related to the observed functioning of the peripheral auditory systems of delphinids and humans.
Conference Paper
A simple algorithm is derived that permits on-the-fly calculation of the energy required to generate, in a certain sense, a signal. The results of applying this algorithm to a number of well-known signals are shown. Some of the invariance and noise properties of the algorithm are derived and verified by simulation. The implementation of the algorithm and its application to speech processing are briefly discussed.
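The abstract above does not spell out the operator itself. The following is a minimal sketch, assuming the commonly cited discrete form of the Teager-Kaiser energy operator, Ψ[x(n)] = x(n)² − x(n−1)·x(n+1). The function name `tkeo` and the test-tone parameters are illustrative and not taken from the paper. For a pure tone A·cos(Ωn + φ), this form returns the constant A²·sin²(Ω), which the example checks numerically.

```python
import math


def tkeo(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The output is two samples shorter than the input because the first
    and last samples have no neighbour on one side.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Hypothetical test tone: amplitude 2, digital frequency 0.3 rad/sample.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n + 0.7) for n in range(50)]

psi = tkeo(tone)

# For a pure tone the operator output is constant: A^2 * sin^2(omega).
expected = (amplitude * math.sin(omega)) ** 2
```

Because the operator needs only three adjacent samples, it can run sample by sample on a stream, which is consistent with the "on-the-fly" description in the abstract.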
Article
Hitherto communication theory was based on two alternative methods of signal analysis. One is the description of the signal as a function of time; the other is Fourier analysis. Both are idealizations, as the first method operates with sharply defined instants of time, the second with infinite wave-trains of rigorously defined frequencies. But our everyday experiences, especially our auditory sensations, insist on a description in terms of both time and frequency. In the present paper this point of view is developed in quantitative language. Signals are represented in two dimensions, with time and frequency as co-ordinates. Such two-dimensional representations can be called "information diagrams," as areas in them are proportional to the number of independent data which they can convey. This is a consequence of the fact that the frequency of a signal which is not of infinite duration can be defined only with a certain inaccuracy, which is inversely proportional to the duration, and vice versa. This "uncertainty relation" suggests a new method of description, intermediate between the two extremes of time analysis and spectral analysis. There are certain "elementary signals" which occupy the smallest possible area in the information diagram. They are harmonic oscillations modulated by a "probability pulse." Each elementary signal can be considered as conveying exactly one datum, or one "quantum of information." Any signal can be expanded in terms of these by a process which includes time analysis and Fourier analysis as extreme cases. These new methods of analysis, which involve some of the mathematical apparatus of quantum theory, are illustrated by application to some problems of transmission theory, such as direct generation of single sidebands, signals transmitted in minimum time through limited frequency channels, frequency modulation and time-division multiplex telephony.
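The "elementary signal" described above, a harmonic oscillation modulated by a Gaussian ("probability") pulse, can be sketched as follows. This is an illustration under stated assumptions: the real-valued parameterisation, the function name `gabor_pulse`, and all numeric values (sampling rate, centre frequency, width) are hypothetical choices, not values from the paper.

```python
import math


def gabor_pulse(t, amplitude, t0, sigma, f0, phase=0.0):
    """Real Gabor elementary signal evaluated at time t (seconds).

    A Gaussian envelope centred on t0 with standard deviation sigma
    multiplies a cosine carrier of frequency f0 (Hz).
    """
    envelope = amplitude * math.exp(-((t - t0) ** 2) / (2.0 * sigma ** 2))
    return envelope * math.cos(2.0 * math.pi * f0 * (t - t0) + phase)


# Hypothetical click-like pulse: 60 kHz carrier, 15 us envelope width,
# centred at 100 us and sampled at 500 kHz.
fs = 500_000.0
t0, sigma, f0 = 100e-6, 15e-6, 60_000.0
samples = [gabor_pulse(n / fs, 1.0, t0, sigma, f0) for n in range(100)]
```

With zero phase the waveform peaks at the envelope centre (sample 50 here), and its magnitude never exceeds the envelope amplitude.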
Conference Paper
This paper compares two real-time passive underwater acoustic methods for tracking multiple emitting whales using four or more omni-directional, widely spaced, bottom-mounted hydrophones. The Stochastic Matched Filter (SMF) is first used for whale tracking. The SMF with echo removal is compared to the Teager-Kaiser-Mallat (TKM) filter method. We briefly review the SMF and TKM theory; rough time delays of arrival are calculated, selected and filtered, and used to estimate the positions of whales for a constant or linear sound speed profile. The complete algorithm is tested on real data from the NUWC and the AUTEC. We evaluate the a priori performance of the system via the Cramer-Rao Lower Bound (CRLB) and Monte Carlo simulations, which are computed and compared with the tracking results. SMF shows higher performance than TKM, with more positions estimated. Results are validated against similar results from the US Navy and Hawaii Univ. labs in the case of one whale, and against similar whale counts from the Columbia Univ. ROSA lab in the case of multiple whales. The model is validated by its good agreement with the theoretical CRLB and the computed confidence ellipses. At this time, our tracking method is the only one giving typical speed and depth estimates for multiple (4) emitting whales located 1 to 5 km from the hydrophones.
Article
Passive acoustic monitoring (PAM) of marine mammal vocalizations has been efficiently used in a wide set of applications ranging from marine wildlife surveys to risk mitigation of military sonar emissions. The primary use of PAM is for detecting bioemissions, a good proportion of which are impulse sounds or clicks. A click detection algorithm based on kurtosis estimation is proposed as a general automatic click detector. The detector works under the assumption that click trains are embedded in stochastic but Gaussian noise. Under this assumption, kurtosis is used as a statistical test for detection. The algorithm explores acoustic sequences with the optimal frequency bandwidth for focusing on impulse sounds. The detector is successfully applied to field observations, and operates at low signal-to-noise ratios and in the presence of stochastic background noise. The algorithm adapts to varying click center frequency. Kurtosis appears to be a promising approach to detect click trains, alone or in combination with other click detectors, and to isolate individual clicks.
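The statistical idea in this abstract can be illustrated with a short sketch: Gaussian noise has a kurtosis close to 3, while a window containing an impulse has a much larger value. The code below covers only that windowed test. The bandwidth selection mentioned in the abstract is not reproduced, and the function names, window width, threshold, and synthetic data are all assumptions made for illustration.

```python
import random


def kurtosis(window):
    """Sample kurtosis m4 / m2^2 (about 3 for Gaussian data).

    Assumes the window is not constant, so the variance m2 is non-zero.
    """
    n = len(window)
    mean = sum(window) / n
    m2 = sum((v - mean) ** 2 for v in window) / n
    m4 = sum((v - mean) ** 4 for v in window) / n
    return m4 / (m2 * m2)


def detect_windows(signal, width, threshold):
    """Start indices of non-overlapping windows whose kurtosis exceeds threshold."""
    return [
        start
        for start in range(0, len(signal) - width + 1, width)
        if kurtosis(signal[start:start + width]) > threshold
    ]


# Synthetic data: unit-variance Gaussian noise with one strong impulse.
rng = random.Random(0)
signal = [rng.gauss(0.0, 1.0) for _ in range(1024)]
signal[600] += 20.0  # hypothetical click, 20 noise standard deviations

hits = detect_windows(signal, width=256, threshold=8.0)
```

In this synthetic case only the window starting at sample 512, which contains the impulse, should exceed the threshold. A practical detector would also need to choose the window length and threshold from the data.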
Article
A set of algorithms for real-time detection and localization of vocalizing marine mammals has been developed as part of the Marine Mammal Monitoring on Navy Ranges (M3R) program. These algorithms work on a broad variety of vocalizations including sperm whale clicks. The detection algorithm is a two stage process utilizing a binary thresholded FFT as the first stage. The second stage examines the FFT output to determine whether a click is present in a given FFT window. Detected clicks are split out of the data stream and sent to a data association algorithm called a scanning sieve. Time differences of arrival (TDOAs) are calculated which are then fed into 2D and 3D hyperbolic localization algorithms. Software written to implement the algorithms was used to process a data set consisting of sperm whale vocalizations provided as part of the 2nd International Workshop on Detection and Localization of Marine Mammals using Passive Acoustics. Real-time detection and localization results from the data set are provided, along with a detailed description of the algorithms.
Article
The energy ratio mapping algorithm (ERMA) was developed to improve the performance of energy-based detection of odontocete echolocation clicks, especially for application in environments with limited computational power and energy such as acoustic gliders. ERMA systematically evaluates many frequency bands for energy ratio-based detection of echolocation clicks produced by a target species in the presence of the species mix in a given geographic area. To evaluate the performance of ERMA, a Teager-Kaiser energy operator was applied to the series of energy ratios as derived by ERMA. A noise-adaptive threshold was then applied to the Teager-Kaiser function to identify clicks in data sets. The method was tested for detecting clicks of Blainville's beaked whales while rejecting echolocation clicks of Risso's dolphins and pilot whales. Results showed that the ERMA-based detector correctly identified 81.6% of the beaked whale clicks in an extended evaluation data set. Average false-positive detection rate was 6.3% (3.4% for Risso's dolphins and 2.9% for pilot whales).
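The last two steps of this pipeline, the Teager-Kaiser energy operator followed by a noise-adaptive threshold, can be sketched on a synthetic series of energy ratios. The operator itself is the standard discrete form, psi[n] = x[n]^2 - x[n-1]*x[n+1]. The threshold rule (median plus a multiple of the median absolute deviation) is an assumed stand-in for the paper's noise-adaptive threshold, and the band-selection step of ERMA is not shown.

```python
import math
import statistics

def tkeo(x):
    """Discrete Teager-Kaiser energy operator, zero-padded to len(x)."""
    inner = [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]
    return [0.0] + inner + [0.0]

def adaptive_detect(psi, k=8.0):
    """Indices where psi exceeds median + k * MAD (assumed threshold rule)."""
    med = statistics.median(psi)
    mad = statistics.median(abs(v - med) for v in psi)
    threshold = med + k * mad
    return [i for i, v in enumerate(psi) if v > threshold]

# Hypothetical energy-ratio series: a slowly varying baseline near 1.0
# with a single frame (index 50) where the in-band energy dominates.
ratios = [1.0 + 0.01 * math.sin(0.3 * n) for n in range(100)]
ratios[50] = 5.0

psi = tkeo(ratios)
detections = adaptive_detect(psi)
```

Because the operator responds to abrupt changes rather than to the slowly varying baseline, the isolated jump at index 50 produces a large positive value while the surrounding frames stay near zero.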
Article
Traditionally, sperm whale clicks have been described as multipulsed, long duration, nondirectional signals of moderate intensity and with a spectrum peaking below 10 kHz. Such properties are counterindicative of a sonar function, and quite different from the properties of dolphin sonar clicks. Here, data are presented suggesting that the traditional view of sperm whale clicks is incomplete and derived from off-axis recordings of a highly directional source. A limited number of assumed on-axis clicks were recorded and found to be essentially monopulsed clicks, with durations of 100 µs, with a composite directionality index of 27 dB, with source levels up to 236 dB re: 1 µPa (rms), and with centroid frequencies of 15 kHz. Such clicks meet the requirements for long-range biosonar purposes. Data were obtained with a large-aperture, GPS-synchronized array in July 2000 in the Bleik Canyon off Vesterålen, Norway (69°28' N, 15°40' E). A total of 14 h of sound recordings was collected from five to ten independent, simultaneously operating recording units. The sound levels measured make sperm whale clicks by far the loudest of sounds recorded from any biological source. On-axis click properties support previous work proposing the nose of sperm whales to operate as a generator of sound.
Article
An array of four hydrophones arranged in a symmetrical star configuration was used to measure the echolocation signals of the dusky dolphin (Lagenorhynchus obscurus) near the Kaikoura Peninsula, New Zealand. Most of the echolocation signals had bimodal frequency spectra with a low-frequency peak between 40 and 50 kHz and a high-frequency peak between 80 and 110 kHz. The low-frequency peak was dominant when the source level was low and the high-frequency peak dominated when the source level was high. The center frequencies in the dusky broadband echolocation signals are among the highest of dolphins measured in the field. Peak-to-peak source levels as high as 210 dB re 1 µPa were measured, although the average was much lower in value. The levels of the echolocation signals are about 9-12 dB lower than for the larger white-beaked dolphin (Lagenorhynchus albirostris), which belongs to the same genus but is over twice as heavy as the dusky dolphin. The source level varied in amplitude approximately as a function of the one-way transmission loss for signals traveling from the animals to the array. The waveform and spectrum of the echolocation signals were similar to those of other dolphins measured in the field.
Article
Strandings of beaked whales of the genera Ziphius and Mesoplodon have been reported to occur in conjunction with naval sonar use. Detection of the sounds from these elusive whales could reduce the risk of exposure, but descriptions of their vocalizations are at best incomplete. This paper reports quantitative characteristics of clicks from deep-diving Cuvier's beaked whales (Ziphius cavirostris) using a unique data set. Two whales in the Ligurian Sea were simultaneously tagged with sound and orientation recording tags, and the dive tracks were reconstructed allowing for derivation of the range and relative aspect between the clicking whales. At depth, the whales produced trains of regular echolocation clicks with mean interclick intervals of 0.43 s (± 0.09) and 0.40 s (± 0.07). The clicks are frequency modulated pulses with durations of approximately 200 µs and center frequencies around 42 kHz, -10 dB bandwidths of 22 kHz, and Q(3 dB) of 4. The sound beam is narrow with an estimated directionality index of more than 25 dB, source levels up to 214 dB(pp) re: 1 µPa at 1 m, and energy flux density of 164 dB re: 1 µPa² s. As the spectral and temporal properties are different from those of nonziphiid odontocetes, the potential for passive detection is enhanced.
Investigations on cetacean sonar XI: Intrinsic comparison of the wave shapes of some members of the Phocoenidae family
  • C Kamminga
  • A C Stuart
  • G K Silber
Kamminga, C., Stuart, A. C., and Silber, G. K. (1996). "Investigations on cetacean sonar XI: Intrinsic comparison of the wave shapes of some members of the Phocoenidae family," Aquat. Mamm. 22, 45-55.
Wave shape estimation of delphinid sonar signals, a parametric model approach
  • C Kamminga
  • A C Stuart
Kamminga, C., and Stuart, A. C. (1995). "Wave shape estimation of delphinid sonar signals, a parametric model approach," Acoust. Lett. 19, 70-76.
The Levenberg-Marquardt method
  • P E Gill
  • W Murray
  • M H Wright
Gill, P. E., Murray, W., and Wright, M. H. (1981). "The Levenberg-Marquardt method," in Practical Optimization (Academic, London), pp. 136-137.