Andreas Hutter
Siemens · Department of Corporate Technology (CT)

Dr.-Ing.

About

76
Publications
12,499
Reads
1,158
Citations
Citations since 2016
13 Research Items
495 Citations
Additional affiliations
August 1993 - August 1999
Technische Universität München
Position
  • Research Assistant

Publications (76)
Preprint
Full-text available
We present a method to incrementally generate complete 2D or 3D scenes with the following properties: (a) it is globally consistent at each step according to a learned scene prior, (b) real observations of a scene can be incorporated while observing global consistency, (c) unobserved regions can be hallucinated locally in consistence with previous...
Preprint
Full-text available
While convolutional neural networks are dominating the field of computer vision, one usually does not have access to the large amount of domain-relevant data needed for their training. It thus became common to use available synthetic samples along domain adaptation schemes to prepare algorithms for the target domain. Tackling this problem from a di...
Conference Paper
Full-text available
With the increasing availability of large databases of 3D CAD models, depth-based recognition methods can be trained on an uncountable number of synthetically rendered images. However, discrepancies with the real data acquired from various depth sensors still noticeably impede progress. Previous works adopted unsupervised approaches to generate mor...
Conference Paper
Full-text available
Recent progress in computer vision has been dominated by deep neural networks trained with large amounts of labeled data. Collecting and annotating such datasets is, however, a tedious and in some contexts impossible task; hence a recent surge in approaches that rely solely on synthetically generated data from 3D models for their training. For depth...
Conference Paper
Full-text available
In this paper, we address the problem of 3D object instance recognition and pose estimation of localized objects in cluttered environments using convolutional neural networks. Inspired by the descriptor learning approach of Wohlhart et al., we propose a method that introduces the dynamic margin in the manifold learning triplet loss function. Such a...
Article
Pixel-wise linear prediction using backward-adaptive least-squares or weighted least-squares estimation of prediction coefficients is currently among the state-of-the-art methods for lossless image compression. While current research is focused on mean intensity prediction of the pixel to be transmitted, best compression requires occurrence probabil...
Article
Full-text available
This paper presents a moving object detection algorithm for H.264/AVC video streams that is applied in the compressed domain. The method is able to extract and analyze several syntax elements from any H.264/AVC-compliant bit stream. The number of analyzed syntax elements depends on the mode in which the method operates. The algorithm is able to per...
Article
In this paper, we present a novel moving object detection algorithm for H.264/AVC-compressed video streams. The algorithm does not require full decoding up to the pixel domain but only parsing the compressed bit streams. Thereby, only syntax elements for reconstructing (sub-)macroblock types and quantization parameters are extracted. These features...
Patent
Full-text available
Annotation of a sequence of digitized images in multimedia data is aided by a computer analyzing the multimedia data to identify one or more objects and assigning each object to a respective role. The role assignment is determined by processing context information representing a model of the multimedia data.
Article
In this paper we present a new hybrid framework for detecting and tracking persons in surveillance video streams compressed according to the H.264/AVC video coding standard. The framework consists of three stages and operates in both the compressed and the pixel domain of the video. The combination of compressed and pixel domain represents the hybr...
Patent
The disclosure relates to a method for encoding an XML-based document (DOC), where the contents of the document correspond to an XML-schema voice definition. According to one exemplary method, an encoded binary representation (BDOC) of the document is produced by associating the contents of the document with binary structural codes (SBC) using enco...
Patent
Full-text available
A prediction error (eq[x,y]) is added to a predicted frame (f̂[x,y]) or a predicted block for receiving a decoded frame (gq[x,y]) or a decoded block to be further used in a prediction loop by an encoder or to be sent to the output of a decoder. The reference frame (gq[x,y]) or the reference block includes a useful signal part and...
Conference Paper
A new approach for volumetric deformation compensation in temporally predictive coding of dynamic medical heart images is presented. Instead of using conventional vectors, motion is represented by deformation values to model 3-D muscle contractions. In this way, estimated motion is more homogeneous among the image domain and predictions do not cont...
Conference Paper
The recently introduced High Efficiency Video Coding (HEVC) standard is currently further investigated for potential use in professional applications. The considered Range Extensions should on the one hand introduce higher bit depths and additional color formats, and on the other hand the coding efficiency of HEVC for high fidelity compression as w...
Conference Paper
Medical imaging in hospitals requires fast and efficient image compression to support the clinical workflow and to save costs. Least-squares autoregressive pixel prediction methods combined with arithmetic coding constitute the state of the art in lossless image compression. However, a high computational complexity of both prevents the applicatio...
Conference Paper
This paper presents an intra-frame prediction scheme designed for lossless coding using HEVC. The proposed coding method comprises a pixel-wise prediction based on original samples. It is realized as a separate intra prediction mode, which replaces the PLANAR mode. In order to perform the prediction, a four-sample template around the pixel that is...
Conference Paper
This paper presents a method for iterative minimization of combined residual and prediction error for near-lossless compression of medical computed tomography acquisitions using pixel-wise least-squares prediction. While most other lossy state-of-the-art image compression systems like JPEG 2000 make use of transform-based coding, in lossless coding...
Article
Semantic queries involving image understanding aspects require the exploitation of multiple clues, namely the (inter-) relations between objects and events across multiple images, the situational context, and the application context. A prominent example for such queries is the identification of individuals in video sequences. Straightforward face r...
Article
This paper introduces a low complexity frame-based object detection algorithm for H.264/AVC video streams. The method solely parses and evaluates H.264/AVC macroblock types extracted from the video stream, which requires only partial decoding. Different macroblock types indicate different properties of the video content. This fact is used to segmen...
Conference Paper
Predictive coding is applied in many state-of-the-art lossless image compression algorithms like JPEG-LS, CALIC, or least-squares-based methods. We propose a new approach for accurate intensity prediction in pixel-predictive coding of computed tomography (CT) images. Exploiting their particular edge characteristic, the method only relies on a small...
Article
We present a method for compressed domain stitching of streams coded according to the (draft) High Efficiency Video Coding (HEVC) specification for video conferencing applications. The main challenges for mixing of video streams in the compressed domain are elaborated and divided into pixel processing level, the level of syntax elements, and entrop...
Article
We present a new approach for efficient estimation and storage of tissue deformation in dynamic medical image data like 3-D+t computed tomography reconstructions of human heart acquisitions. Tissue deformation between two points in time can be described by means of a displacement vector field indicating for each voxel of a slice, from which positio...
Conference Paper
Full-text available
Currently, the diagnostic process for tracking the course of a disease in hospitals is based on a manual assessment of patient data acquired at different times and with different modalities (e.g., CT scans vs. MRI). These images are stored in very large data archives (Picture Archiving and Communication System, PACS), ...
Conference Paper
Full-text available
Compression of noisy image sequences is a hard challenge in video coding. Especially for high quality compression the preprocessing of videos is not possible, as it decreases the objective quality of the videos. In order to overcome this problem, this paper presents an in-loop denoising framework for efficient medium to high fidelity compression of...
Conference Paper
This paper presents a novel change detection algorithm for the compressed domain. Many video surveillance systems in practical use transmit their video data over a network by using the Real-time Transport Protocol (RTP). Therefore, the presented algorithm concentrates on analyzing RTP streams to detect major changes within contained video content....
Conference Paper
The compression efficiency of coding noisy image sequences is highly dependent on the noise itself and on the used quantization parameter. For very high quality near lossless compression a promising approach to save bitrate is to introduce an in-loop denoising filter in the codec. In this paper, we enhance our in-loop denoising scheme by a low comp...
Conference Paper
This paper proposes an approach to enable automatic generation of probable semantic hypotheses for a given set of collected observations for forensic visual surveillance. As the video analytics used in visual surveillance mature, more automatically generated intermediate semantic metadata becomes available. In the sense of fore...
Article
Scalable video coding (SVC) is a good approach for video services over heterogeneous networks. To achieve efficient video broadcasting systems, a good understanding of the performance of coding tools in SVC is necessary. In this paper, the efficiency of inter-layer prediction tools in the SVC extension of H.264/AVC is thoroughly investigated by simulat...
Article
Full-text available
We present a new method for data-adaptive compression of dense vector fields in dynamic medical volume data. Conventional block-based motion compensation used for temporal prediction in video compression cannot conveniently cope with deformable motion typically found in medical image sequences encoded over time. Based on an approximation of physio...
Conference Paper
Full-text available
When compressing noisy image sequences, the compression efficiency is limited by the noise amount within these image sequences as the noise part cannot be predicted. In this paper, we investigate the influence of noise within the reference frame on lossy video coding of noisy image sequences. We estimate how much noise is left within a lossy coded...
Conference Paper
Full-text available
The major gain in video coding applications compared to single image coding is the use of temporal prediction, which exploits the correlation between adjacent frames. However, in high quality video coding, especially lossless video coding, the compression gain of P-frames over I-Frames becomes very small. The reason for that is that the reference f...
Article
Full-text available
In forensic analysis of visual surveillance data, conditional knowledge representation and inference under uncertainty play an important role for deriving new contextual cues by fusing relevant evidential patterns. To address this aspect, both rule-based (aka. extensional) and state based (aka. intensional) approaches have been adopted for situa...
Conference Paper
Full-text available
Quantization parameter (QP) cascaded hierarchical prediction structures have proven to be efficient techniques in hybrid video coding. However, the current QP cascading method is empirical and not adaptive. The reason for the higher coding efficiency of this method has not been fully explored so far. In this paper, the rate-distortion performance...
Conference Paper
The majority of recent work in forensic analysis of visual surveillance content has been focusing on automatic information extraction aspects. However, little attention has been paid to the intelligent reuse of extracted (meta)data. For reasoning upon such pre-acquired metadata, in our previous paper, we proposed the use of logic programming to rep...
Conference Paper
Full-text available
Nowadays multimedia data is produced and consumed at an ever increasing rate. In line with this trend, diverse storage approaches for multimedia data have been introduced. As a result, distributed and heterogeneous multimedia repositories exist, whereas a unified and easy access to the stored multimedia data is not given...
Article
Semantic queries involving image understanding aspects require the exploitation of multiple clues, namely the (inter-)relations between objects and events across multiple images, the situational context, and the application context. A prominent example for such queries is the identification of individuals in video sequences. Straightforward face re...
Article
Nowadays multimedia data is produced and consumed at an ever increasing rate. In line with this trend, diverse storage approaches for multimedia data have been introduced. As a result, distributed and heterogeneous multimedia repositories exist, whereas a unified and easy access to the stored multimedia data is not given...
Conference Paper
Full-text available
Originally, hierarchical prediction structures were proposed to achieve temporal scalability. Soon after, it was realized that with a proper quantization parameter cascading (QPC) scheme the general performance can be significantly improved by hierarchical coding. However, the theory behind the gain has not been explored so far. In this paper, the...
Conference Paper
Full-text available
In this paper, a one-pass budget allocation algorithm is proposed for hybrid video coding. Taking the percentage of skipped MBs as the measure of inter-frame dependency, the optimal budget allocation is first modeled for a two-frame case. Then this model is extended to a practical method in slow movement scenario, where the information of inter-fra...
Article
In this paper, an efficient one-pass frame level rate control algorithm is proposed for H.264/AVC, where the two essential problems in rate control, i.e., the budget allocation (BA) and the quantization parameter determination (QPD) are both considered. First, an efficient BA scheme is designed with special consideration of the inter-frame dependen...
Conference Paper
Full-text available
The Lagrangian multiplier based rate-distortion optimization (RDO) has been widely employed in single layer video coding. During the development of scalable video coding (SVC) extension of H.264/AVC, it was directly applied in a multilayer scenario. However, such an application is not very efficient since the correlation between layers is not consi...
Conference Paper
In traditional visual surveillance systems, retrieval has been relying on indexing events and features extracted by visual analytic algorithms that were developed for well-defined, specific domains. However, due to the increasing need for intelligent forensic retrieval with contextual semantics, this approach is reaching its limits, because it is a...
Conference Paper
Full-text available
In this paper, a one-pass multi-layer rate-distortion optimization algorithm is proposed for quality scalable video coding. To improve the overall coding efficiency, the MB mode in the base layer is selected not only based on its rate-distortion performance relative to this layer but also according to its impact on the enhancement layer. Moreov...
Article
Full-text available
In today's hybrid video coding, Rate-Distortion Optimization (RDO) plays a critical role. It aims at minimizing the distortion under a constraint on the rate. Currently, the most popular RDO algorithm for one-pass coding is the one recommended in the H.264/AVC reference software. It, or HR-lambda for convenience, is actually a kind of universal met...
Article
The Lagrange multiplier based rate-distortion optimization (RDO) has been widely employed in single layer video coding. During the development of scalable video coding (SVC) extension of H.264/AVC, it was directly applied in a multi-layer scenario. However, such an application is not very efficient since the correlation between layers is not consid...
Conference Paper
Scalable video coding (SVC) is standardized as H.264/AVC Annex G by the Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG. SVC extends the features of its base specification H.264/AVC by flexible scalability features in all directions (temporal, spatial and SNR) while maintaining high compression efficiency. This paper investigates scalable vid...
Conference Paper
Full-text available
This paper addresses disparity estimation from image pairs by error minimization. When taken from a monocular camera pan sequence, those images often suffer from luminance variations, caused by the spatially and temporally separated camera positions. Therefore, here we discuss and analyze an existing correction approach for the block matching algor...
Article
In this paper, a rate-distortion optimized frame-level rate control algorithm is presented for H.264/AVC. To improve the performance in both distortion and rate, two techniques are developed. First, an adaptive frame layer rate-distortion optimization technique is included into the rate control module so that the average distortion is decreased....
Article
Full-text available
This paper presents the integration of scalable video coding (SVC) into a generic platform for multimedia adaptation. The platform provides a full MPEG-21 chain including server, adaptation nodes, and clients. An efficient adaptation framework using SVC and MPEG-21 digital item adaptation (DIA) is integrated and it is shown that SVC can seamlessly...
Conference Paper
Full-text available
The Lagrangian multiplier based rate-distortion optimization has been proved to be an effective way in hybrid video coding. In this paper, an advanced Lagrange multiplier selection method is presented. Based on Laplace distribution, the variance of transformed residuals is introduced into the rate and distortion models. Moreover, inspired by the p-...
Article
Full-text available
The MPEG-21 standard defines a framework for the interoperable delivery and consumption of multimedia content. Within this framework the adaptation of content plays a vital role in order to support a variety of terminals and to overcome the limitations of the heterogeneous access networks. In most cases the multimedia content can be adapted by appl...
Article
Full-text available
The Lagrange multiplier based rate-distortion optimization has been proved an effective technique in hybrid video codec design. To improve the coding efficiency, an efficient Lagrange multiplier selection method is presented in this paper. As an extension to our previous work, the Laplace distribution based rate model is further refined by taki...
Article
Full-text available
XML-based metadata is widely adopted across the different communities and plenty of commercial and open source tools for processing and transforming are available on the market. However, all of these tools have one thing in common: they operate on plain text encoded metadata which may become a burden in constrained and streaming environments, i.e.,...
Article
Full-text available
Due to the large expansion of heterogeneous networks, the scalability features of next generation video coding schemes have become a very important topic for future developments. These algorithms make it possible to generate a universal (temporally, spatially and SNR) scalable bit-stream so that the video may be decodable by any kind of device from a mobile...
Conference Paper
Full-text available
The diversity of end-terminal and access network capabilities as well as the dynamic nature of wireless connections pose significant challenges to providers of multimedia streaming services. In this paper, we present a system based on MPEG-21 Digital Item Adaptation (DIA) technologies that automatically adapts scalable multimedia resources, like up...
Article
Full-text available
Due to the heterogeneity of the current terminal and network infrastructures, multimedia content needs to be adapted to specific capabilities of these terminals and network devices. Furthermore, user preferences and user environment characteristics must also be taken into consideration. The problem becomes even more complex by the diversity of mult...
Article
In this paper, a generic method is described to allow the adaptation of different multimedia resources by a single, media resource-agnostic processor. This method is based on an XML description of the media resource’s bitstream syntax, which can be transformed to reflect the desired adaptation and then be used to generate an adapted version of the...