Udo Zölzer

  • PhD
  • Professor at Helmut Schmidt University

About

259
Publications
125,588
Reads
2,509
Citations
Current institution
Helmut Schmidt University
Current position
  • Professor


Publications (259)
Conference Paper
Full-text available
Recently, drone detection has become a topic of interest due to the widespread usage of drones in various applications, particularly for recreational purposes. Such detection tasks are usually performed by deep learning models which require different kinds of image datasets to be trained on. Hence, a dataset of infrared images for drone detection i...
Conference Paper
Full-text available
Datasets of maritime objects are very important for training applications related to activity monitoring in locations around ports and shores. This paper introduces a maritime dataset for object detection, named as the shore livecam dataset. The dataset is a collection of high definition (HD), full high definition (FHD), and ultra high definition (...
Conference Paper
Full-text available
In this project, a digital ladder filter has been investigated and expanded. This structure is a simplified digital analog model of the well known analog Moog ladder filter. The goal of this paper is to derive the differentiation expressions of this filter with respect to its control parameters in order to integrate it in machine learning systems....
Article
Equalizers for audio signal processing play an important role in microphones, mixing systems, mastering processors, monitor loudspeakers, and consumer devices such as smart tablets, smart phones, wireless headphones, and hearing devices. We will introduce low-complexity filter designs for equalizers based on first- and second order recursive filter...
Article
Full-text available
Methods based on convolutional neural networks (CNNs) for drone detection in infrared images are explored in this paper. For the task of drone detection, a dataset of infrared images containing drones is initially built from a publicly available database and drone videos captured with our cameras. The required drone images are extracted and manuall...
Conference Paper
Full-text available
In this paper, the problem of image enhancement in the form of single image superresolution and compression artifact reduction is addressed by proposing a convolutional neural network with an inception module containing an attention mechanism. The inception module in the network contains parallel branches of convolution layers employing filters wit...
Conference Paper
Full-text available
Peak and shelving filters are parametric infinite impulse response filters which are used for amplifying or attenuating a certain frequency band. Shelving filters are parametrized by their cutoff frequency and gain, and peak filters by center frequency, bandwidth and gain. Such filters can be cascaded in order to perform audio processing tasks like...
Conference Paper
Full-text available
A convolutional neural network is proposed for the reduction of artifacts introduced by JPEG compression. In this work the existing DnCNN architecture has been considered as the basis network and it is extended by introducing inception blocks on top of a normal convolution block consisting of convolution, rectified linear unit and batch-normalizat...
Preprint
Full-text available
We develop a new contour tracing algorithm to enhance the results of the latest object contour detectors. The goal is to achieve a perfectly closed, 1 pixel wide and detailed object contour, since this type of contour could be analyzed using methods such as Fourier descriptors. Convolutional Neural Networks (CNNs) are rarely used for contour tracin...
Conference Paper
Full-text available
Head-related transfer functions (HRTFs) are used in many applications for 3D spatial audio through headphones. Often, the HRTFs are stored as FIR filters. However, IIR filters give the opportunity to approximate the magnitude of these FIR filters with fewer coefficients. By using a cascade of parametric IIR filters such as shelving and peak filter...
Article
Capacitor microphones are widely used to transduce sound waves into electrical voltages. The converting capacitor can either be operated in baseband (audio-frequency) or passband (radio-frequency, RF). A baseband operation can use a straightforward circuit implementation while a passband operation becomes more complex. The advantage of operating th...
Chapter
A ResNet-based multi-path refinement CNN is used for object contour detection. For this task, we prioritise the effective utilization of the high-level abstraction capability of a ResNet, which leads to state-of-the-art results for edge detection. Keeping our focus in mind, we fuse high, mid and low-level features in that specific order, which diff...
Conference Paper
Full-text available
In this paper different cost functions are studied for the optimization of FIR filters as minimum variance controllers in active noise cancelling headphones. The resulting controllers are implemented in a headphones prototype and their attenuation performances are measured using a dummy-head. The measurements are then compared and evaluated in term...
Conference Paper
Full-text available
In this work a convolutional neural network with two input channels is proposed for image super-resolution. Initially, the original image is downsampled with bicubic and nearest neighbour interpolation methods and the low-resolution image pair is used as the input to our network. Additionally, the input channels are randomly swapped as an augmentat...
Conference Paper
Full-text available
Headphones can be equipped with active noise cancelling for providing their users with the attenuation of the acoustical environmental noise that surrounds them. This technology is based on feedforward or feedback control schemes adapted for the control of sound. While feedback schemes can be implemented as fixed systems, the feedforward ones have...
Preprint
Full-text available
A ResNet-based multi-path refinement CNN is used for object contour detection. For this task, we prioritise the effective utilization of the high-level abstraction capability of a ResNet, which leads to state-of-the-art results for edge detection. Keeping our focus in mind, we fuse the high, mid and low-level features in that specific order, which...
Article
In this work analog guitar amplifiers are modeled with an automated procedure using input/output measurements and iterative optimization. The digital model is an extended Wiener-Hammerstein model consisting of a linear time-invariant (LTI) block, a nonlinear block with a nonlinear mapping function, and another LTI block connected in series. The mod...
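The series structure named in this abstract (LTI block, static nonlinear mapping, LTI block) can be illustrated with a minimal sketch. This is not the authors' extended model: the one-pole low-pass filters, the `tanh` mapping, and every parameter name and default (`a_in`, `drive`, `a_out`) are hypothetical choices made only to show how the three blocks chain together.

```python
import math

def one_pole_lowpass(x, a):
    """LTI block (assumed form): y[n] = (1 - a) * x[n] + a * y[n-1], with 0 <= a < 1."""
    y, state = [], 0.0
    for sample in x:
        state = (1.0 - a) * sample + a * state
        y.append(state)
    return y

def wiener_hammerstein(x, a_in=0.5, drive=4.0, a_out=0.3):
    """Chain LTI -> static nonlinearity -> LTI (all parameters are illustrative)."""
    pre = one_pole_lowpass(x, a_in)                # first linear block
    shaped = [math.tanh(drive * s) for s in pre]   # memoryless nonlinear mapping
    return one_pole_lowpass(shaped, a_out)         # second linear block

# Usage: process a short decaying sine burst.
burst = [math.sin(0.3 * n) * 0.9 ** n for n in range(32)]
output = wiener_hammerstein(burst)
```

In an identification setting such as the one the abstract describes, the filter coefficients and the shape of the mapping function would be the quantities adapted by the optimizer; here they are fixed by hand.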
Conference Paper
Full-text available
This paper proposes a method for detection of harbour porpoise from very high resolution images. The approach involves the detection of salient objects of very small sizes with the feature accelerated segment test method, creation of an annotated database and classification of those objects with different convolutional neural networks. Multiple dee...
Conference Paper
Full-text available
The classical feedforward and feedback active noise control structures can be combined in pairs into hybrid control systems, in order to partially compensate for their individual limitations. The increase in complexity is compensated by the flexibility and attenuation performance they can achieve together. Several combination strategies can be foun...
Conference Paper
Full-text available
Psychoacoustic active noise control aims to decrease the perceived loudness and annoyance of the acoustic noise present in the environment. In order to achieve this, the frequencies to which the human ear is more sensitive are attenuated with higher priority. An implementation of such a control system based on FeLMS and noise weighting curves has s...
Presentation
Full-text available
Virtual analog modeling of guitar equipment, especially tube-based guitar amplifiers, is a hot topic in sound effect research. However, there are no established standards on the evaluation of the accuracy of such modeling processes. This work presents a first approach on finding a metric for evaluating the similarity between the output of an analog...
Conference Paper
Full-text available
With the increasing number of applications for virtual reality also the research activity on 3D audio through headphones has risen. Thus, different approaches for improving the perception of the virtual experience have been developed. This paper summarizes and evaluates the work done on this topic during the last ten years. The investigations mainl...
Conference Paper
Full-text available
Virtual analog modeling of guitar amplifiers is an ongoing research topic and its aim is the recreation of an analog system as exactly as possible. A mathematical model is created which can be used reproducibly and independently of aging hardware and temperature dependence. In this way, the popular sound of vintage tube amplifiers can be combined w...
Conference Paper
Full-text available
An image enhancement approach with a Convolutional Neural Network (CNN) for infrared (IR) images from maritime environments is proposed in this paper. The approach includes different CNNs to improve the resolution and to reduce noise artefacts in maritime IR images. The denoising CNN employs a residual architecture which is trained to reduce grainine...
Conference Paper
This paper discusses an application of block-oriented modeling to a popular analog dynamic range compressor using iterative minimization. The reference device studied here is the UREI 1176LN, which has been widely used in music production and recording. A clone of the circuit built in a previous project has been used as a reference device to compar...
Conference Paper
Full-text available
Using bandlimited impulse train (BLIT) synthesis, it is possible to generate waveforms with a configurable number of harmonics with an equal amplitude. In contrast to the sinc-pulse, which is typically used for bandlimiting in BLIT and only allows to set the cutoff frequency, a Hammerich pulse can be tuned by two independent parameters for cutoff f...
Conference Paper
Full-text available
In this work, analog guitar amplifiers are modeled with an automated procedure using iterative optimization techniques. The digital model is divided into functional blocks, consisting of linear-time-invariant (LTI) filters and nonlinear blocks with nonlinear mapping functions and memory. The model is adapted in several steps. First the filters are...
Conference Paper
Full-text available
In the digital simulation of non-linear audio effect circuits, the arising non-linear equation system generally poses the main challenge for a computationally cheap implementation. As the computational complexity grows super-linearly with the number of equations, it is beneficial to decompose the equation system into several smaller systems, if pos...
Conference Paper
Full-text available
Feedforward control structures for active noise control headphones suffer from challenging processing delay constraints, due to the small distances between transducers and their varying relative orientation to the noise sources. On top of that, the context of its usage comprehends a multi-source environment, which prevents to provide the system wit...
Conference Paper
Full-text available
Connecting an Internal Model Controller (IMC) and a Minimum Variance Controller (MVC) into a hybrid structure aims to combine the attenuation capabilities of the individual structures without the need of additional microphones or speakers. Moreover, if connected in a particular way, both controllers can be designed independently from each other....
Article
Full-text available
Techniques for generating multichannel audio from stereo audio signals are supposed to enhance and extend the listening experience of the listener. To assess the quality of such upmix algorithms, subjective evaluations have been carried out. In this paper, we propose an objective evaluation test for stereo-to-multichannel upmix algorithms. Based on...
Article
Changes in room acoustics provide important clues about the environment of sound source-perceiver systems, for example, by indicating changes in the reflecting characteristics of surrounding objects. To study the detection of auditory irregularities brought about by a change in room acoustics, a passive oddball protocol with participants watching a...
Conference Paper
With an increasing variety of loudspeaker setups and the spread of 3D audio playback, automatic upmixing of legacy two-channel stereo recordings to arbitrary channel configurations became an important topic. We outline the basic principles of a recently developed low-complexity stereo to 9-channel 3D audio upmix algorithm. This includes the estimat...
Conference Paper
Full-text available
This paper describes a time-domain algorithm to upmix stereo recordings for an enhanced playback on a surround sound loudspeaker setup. It is mainly the simplified version of a previously published frequency-domain algorithm where the standard short-time Fourier transform is now replaced by an IIR filter bank. The design of complementary filter blo...
Conference Paper
Full-text available
The blending of audio signals, called cross-fading, is a very common task in audio signal processing. Therefore, digital audio workstations offer several fading curves to select from. The choice of the fading curve typically depends on the signal characteristics and is supposed to result in a mixed signal featuring power and loudness close to the...
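As a rough illustration of why the choice of fading curve matters, the sketch below compares two textbook curves: a linear fade, whose gains sum to one, and an equal-power fade, whose squared gains sum to one. It is not the method proposed in the paper; the function names and the `curve` argument are assumptions made for this example.

```python
import math

def crossfade_gains(t, curve="linear"):
    """Return (fade_out, fade_in) gains for a fade position t in [0, 1]."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if curve == "linear":
        # Gains sum to 1: amplitude is preserved for fully correlated signals.
        return 1.0 - t, t
    if curve == "equal_power":
        # Squared gains sum to 1: power is preserved for uncorrelated signals.
        return math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)
    raise ValueError(f"unknown curve: {curve}")

def crossfade(a, b, curve="linear"):
    """Blend equal-length sample lists: a fades out while b fades in."""
    if len(a) != len(b):
        raise ValueError("both signals must have the same length")
    n = len(a)
    out = []
    for i in range(n):
        t = i / (n - 1) if n > 1 else 1.0
        g_out, g_in = crossfade_gains(t, curve)
        out.append(g_out * a[i] + g_in * b[i])
    return out

# Usage: fade from a constant signal to silence with a linear curve.
faded = crossfade([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], curve="linear")
```

At the midpoint the linear curve applies a gain of 0.5 to each signal, while the equal-power curve applies about 0.707 to each, which is why the better choice depends on how correlated the two signals are.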
Conference Paper
Full-text available
The sound of a vacuum tube guitar amplifier may be significantly influenced by the non-linear behavior of its output transformer, which therefore should also be considered in digital simulations. In this work, we develop a model for inductors and transformers with the magnetization following the model of Jiles and Atherton. For this purpose, the or...
Conference Paper
Full-text available
This paper describes black-box modeling of distortion circuits. The analyzed distortion circuits all originate from guitar effect pedals, which are widely used to enrich the sound of an electric guitar with harmonics. The proposed method employs a block-oriented model which consists of a linear block (filter) and a non-linear block. In this study t...
Article
Full-text available
We propose the usage of Möbius transformations, defined in the context of Clifford algebras, for geometrically manipulating a point cloud data lying in a vector space of arbitrary dimension. We present this method as an application to signal classification in a dimensionality reduction framework. We first discuss a general situation where data anal...
Conference Paper
Full-text available
In the digital simulation of non-linear audio effect circuits, the arising non-linear equation generally poses the main challenge for a computationally cheap implementation. For any but the simplest circuits, using an iterative solver at execution time will be too slow, while exhaustive look-up tables quickly grow intolerably large. To better cope...
Conference Paper
Virtual analog modeling is the process of creating a digital model of an analog system. In this work a virtual analog model of a dynamic range compression circuit for electrical guitars is constructed by analyzing and measuring the analog reference system. The particular property of the chosen compression system is the use of an analog optical isol...
Conference Paper
In this paper we present a general low-complexity stereo signal decomposition approach. Based on a common stereo signal model, the panning coefficients and azimuth positions of the sources in a stereo mix are estimated. In a next step, this information is used to separate direct and ambient signal components. The simple algorithm can be implemented...
Conference Paper
Full-text available
Even in a time of surround and 3D sound, many tracks and recordings are still only available in mono or it is not feasible to record a source with multiple microphones for several reasons. In these cases, a pseudo stereo conversion of mono signals can be a useful preprocessing step and/or an enhancing audio effect. The conversion proposed in this p...
Conference Paper
Full-text available
In 1998, the ITU published a recommendation for an algorithm for objective measurement of audio quality, aiming to predict the outcome of listening tests. Despite the age, today only one implementation of that algorithm meeting the conformance requirements exists. Additionally, two open source implementations of the basic version of the algorithm a...
Conference Paper
Full-text available
An algorithm to estimate the perceived azimuth directions in a stereo signal is derived from a typical signal model. These estimated directions can then be used to separate direct and ambient signal components and to remix the original stereo track. The processing is based on the idea of a bandwise mid-side decomposition in the frequency-domain whi...
Conference Paper
Full-text available
The aim of this study is to demonstrate how ADPCM-based codec structures can be improved using cascaded prediction. The advantage of predictor cascades is to allow the adaption to several signal conditions, as it is done in block-based perceptual codecs like MP3, AAC, etc. In other words, additional predictors with a small order are supposed to enh...
Conference Paper
Full-text available
Several modern applications require audio encoders featuring low data rate and lowest delays. In terms of delay, Adaptive Differential Pulse Code Modulation (ADPCM) encoders are advantageous compared to block-based codecs due to their instantaneous output and therefore preferred in time-critical applications. If the audio signal transport is do...
Conference Paper
Full-text available
Virtual analog modeling is the process of digitally recreating an analog device. This study focuses on analog distortion pedals for guitarists, which are categorized as stompboxes, because the musician turns them on and off by stepping on the switch. While some of the current digital models of distortion effects are circuit-based, this study uses a...
Conference Paper
Full-text available
Virtual analog modeling is an important field of digital audio signal processing. It allows to recreate the tonal characteristics of real-world sound sources or to impress the specific sound of a certain analog device upon a digital signal on a software basis. Automatic virtual analog modeling using black-box system identification based on input/ou...
Conference Paper
Full-text available
In this paper a system for vowel conversion between different speakers using short-time speech segments is presented. The input speech signal is segmented into period-length speech segments whose fundamental frequency and first two formants are used to find the perceivable vowel-quality. These segments are used to represent a voiced phoneme, i.e. a...
Conference Paper
Full-text available
Digital emulation of analog circuits for musical audio processing, like synthesizers, guitar effect pedals, or vintage amplifiers, is an ongoing research topic. David Yeh proposed to use the nodal DK method to derive a non-linear state-space system from a circuit schematic in a very systematic way. However, this approach has some drawbacks and li...
Conference Paper
Nowadays, many applications related to maritime security and ship monitoring require a correct detection of ships. In the field of ship detection, different types of images are used depending on the application. Regarding high-resolution images, the variable characteristics of the sea environment often complicate a precise detection. These characte...
Article
We present a novel macroblock mode decision approach for video encoder. It aims at minimizing the rate given a certain distortion constraint. Motivated by the fact that humans perceive only a small area around the eye-fixation point very clearly, the presented mode decision algorithm is designed to tolerate a certain amount of distortion depending...
Conference Paper
Full-text available
An automated Passive Acoustic Monitoring (PAM) method for the detection and differentiation of sperm whale individuals is proposed. Various methods benefit from the correlation of multi-channel recordings to identify active whales. However, the proposed approach employs audio recordings from a single hydrophone and uses a correspondence analysis to...
Article
A visual saliency based approach for macroblock (MB) mode selection in video compression is presented. It is based on the Laplace distribution of transformed residuals. Visual saliency of a MB is used to make the encoder vote for a mode which requires less bits in a region which is visually less important. It shows a gain of up to 0.6 dB in terms o...
Article
We propose the use of genetic algorithms for optimizing 8-ary signal constellations. The constellations are either optimized for a high bit interleaved coded modulation (BICM) mutual information or a low bit error rate (BER) in the additive white Gaussian noise channel (AWGN). We obtained a constellation that outperforms the 8-cross by 0.25 dB for...
Conference Paper
Full-text available
In this study, a famous boxed effect pedal, also called stompbox, for electric guitars is analyzed and simulated. The nodal DK method is used to create a non-linear state-space system with Matlab as a physical model for the MXR Phase 90 guitar effect pedal. A crucial component of the effect are Junction Field Effect Transistors (JFETs) which are...
Conference Paper
Modification of the quantizer value based on a region's visual importance within a frame is a popular and widely-used procedure in perceptual video coding. The first contribution of this paper is an alternative modification scheme. Second, our previous work, where we use saliency maps in order to guide the macroblock (MB) mode decision during the r...
Conference Paper
Full-text available
Subjective listening tests are an essential tool for the evaluation and comparison of audio processing algorithms. In this paper we introduce BeaqleJS, a framework based on HTML5 and JavaScript to run listening tests in any modern web browser. This allows an easy distribution of the test environment to a significant amount of participants in combin...
Conference Paper
Full-text available
Today’s public Internet availability and capabilities allow manifold applications in the field of multimedia that were not possible a few years ago. One emerging application is the so-called Networked Music Performance, standing for the online, low-latency interaction of musicians. This work proposes a stand-alone device for that specific purpose a...
Conference Paper
This work comprises an extension of a backward adaptive quantizer which is employed together with a robust lattice predictor in an ADPCM coding scheme. Predictors of the ADPCM audio coding schemes are often considered as the part most sensitive to transmission errors. Nevertheless, a single transmission error causes a short destabilization of the a...
Conference Paper
Nowadays, many different image processing applications are of high interest to maritime authorities because of security reasons. Depending on the application, different kinds of images are employed. The extraction of ship silhouettes requires high resolution images in order to obtain accurate results. However, when the characteristics of the naval...
Conference Paper
Full-text available
A major problem in low-latency Audio over IP transmission is the unpredictable impact of the underlying network, leading to jitter and packet loss. Typically, error concealment strategies are employed at the receiver to counteract audible artifacts produced by missing audio data resulting from the mentioned network characteristics. Known concealmen...
Conference Paper
In this paper, a load balanced implementation of a delayless FxLMS algorithm for the purpose of active noise cancellation is proposed. Frequency-domain adaption algorithms using FFT’s are well-known for their efficiency. However, their block-based character will lead to a delay in the order of the used block length. A hybrid adaption approach is ch...
Conference Paper
Full-text available
The tonalness spectrum shows the likelihood of a spectral bin being part of a tonal or non-tonal component. It is a non-binary measure based on a set of established spectral features. An easily extensible framework for the computation, selection, and combination of features is introduced. The results are evaluated and compared in two ways. First wi...
Conference Paper
Full-text available
We consider the problem of transmission errors in the well known adaptive differential pulse code modulation (ADPCM) system. A single transmission error destabilizes the reconstruction process at the decoder side in the ADPCM coding scheme if a non-leaky algorithm is used. We propose a delay-free and fixed rate of 3 bit/sample audio source...
Conference Paper
Full-text available
In this study, receiver-based audio error concealment in the context of low-latency Audio over IP transmission is analyzed. Therefore, the well-known technique of audio extrapolation is investigated concerning its usability in real-time scenarios, its applied prediction techniques and various transmission parameters. A large-scale automated evalua...
Conference Paper
Due to the ever-growing importance of video content delivered through today's Internet, not only the availability of high-definition video material but also its high quality must be guaranteed. The latter can be achieved by means of objective video quality metrics. Recently, insufficient quality prediction accuracy for well-known full-reference obj...
Thesis
Full-text available
The purpose of this project is to investigate how Adaptive Differential Pulse Code Modulation (ADPCM) can be further improved using Recursively Indexed Quantizer units. ADPCM is a lossy low-delay audio compression codec, particularly interesting because it can run in real time and deliver high quality coding at a low bitrate. Theoretical background...
Conference Paper
Full-text available
A low delay audio coding scheme with good perceptual audio quality for a desired limited bit rate is presented. The proposed audio coding scheme is based on differential pulse code modulation (DPCM) and block companded (BC) quantization. Prediction is realized as a FIR filter in lattice structure. DPCM performs in feedback manner, therefore no tr...
Conference Paper
Real-time audio transmission requires good quality for a restricted channel capacity and minimum latency. An ultra-low delay audio coding scheme based on differential pulse code modulation (DPCM) and block companded quantization is presented. The prediction filter of the base backward DPCM codec is attained as a FIR filter in lattice structure. The...
Conference Paper
We investigate to what extent temporal pooling and spatio-temporal quality interaction can help to increase a video quality metric's prediction accuracy. Our results prove that a significant gain is possible. The evaluations are conducted on high-definition video material. We propose a stand-alone Gaussian weighting as an alternative pooling strate...
Conference Paper
The paper provided herein aims at investigating two state-of-the-art publicly available full-reference video quality assessment metrics, particularly with regard to high-definition video data. Concretely, we will concentrate on the performance of the Multi-Scale Structural Similarity index (MS-SSIM) and the NTIA General Video Quality Metric (VQM) c...
Conference Paper
High Frequency Surface Wave Radars (HFSWRs) play an important role in long-range ocean surveillance, with particular interest in reliable detection and tracking of far-distant ships. One of the biggest challenges in ship target detection in HFSWR is the non-homogeneous detection background. Depending on the chosen detection parameters, this non-hom...
Conference Paper
Image segmentation is a very important step in order to accomplish automatic tasks, such as classification, being continuously demanded by our society. However, the particular case of naval images entails issues that may exacerbate the conditions for segmentation. Specifically due to water reflections the gray distribution might be very disperse th...
Conference Paper
Full-text available
In this paper, a novel chroma extraction technique called Time-Domain Chroma Extraction (TDCE) is introduced. In comparison to many other known schemes, the calculation of a time-frequency representation is unnecessary since the TDCE is a pure sample-by-sample technique. It mainly consists of a pitch tracking module that is implemented with a phase-...
Conference Paper
Full-text available
An advanced phase vocoder technique for high quality audio pitch shifting and time stretching is described. Its main concept is based on the PVSOLA time stretching algorithm which is already known to give good results on monophonic speech. Some enhancements are proposed to add the ability to process polyphonic material at equal quality by distingui...
Article
Full-text available
Object recognition is a very interesting task with multiple applications and for that reason it has been dealt with very intensively in the last years. In particular, the application to naval ship pictures may facilitate the work of the coastguards or the navy. However, this type of images entails some difficulties due to their specific environment...
Conference Paper
Pitch detection and tracking can be used for several audio effects, such as time and pitch scaling and the generation of harmonic signals related to the detected fundamental frequency. A variety of audio effects has efficiently been implemented by time-domain processing. Pitch tracking of monophonic audio signals can also be performed by time domai...
Article
Digital audio effects are usually controlled by certain parameters and the incoming audio signal. The combination of user defined parameters and signal adaptive parameters leads to more exciting audio effects where the main effect parameters change accordingly to the audio input. The paper will cover several pitch-based audio effects and will discu...
Conference Paper
The radar group at Helmut Schmidt University / University of the Federal Armed Forces Hamburg (HSU) has been working in the field of coastal and mobile HF surface wave radar (SWR) since 2009. Supported by the Bundeswehr Technical Centre for Ships and Naval Weapons (WTD 71) in Eckernförde, new signal processing techniques for clutter suppression and enhanc...
