H.-P. Seidel’s research while affiliated with Max Planck Institute for Informatics and other places


Publications (115)


[Figures: teaser of the fractal inversion method with 64× zoom-ins; a fern exhibiting self-similarities; an IFS fractal generated with the chaos game (Eq. 2) for varying point counts; model overview, in which the fractal point generator uses functions in ℱ and a parallel stochastic scheme to execute the chaos game, followed by differentiable splatting; qualitative fractal inversion comparisons across methods, each with a 64× zoomed-in view shown next to the input image Iref and the ground-truth fractal.]


Learning Image Fractals Using Chaotic Differentiable Point Splatting
  • Article
  • Full-text available

April 2025 · 5 Reads

A. Djeacoumar · F. Mujkanovic · H.-P. Seidel · T. Leimkühler

Fractal geometry, defined by self‐similar patterns across scales, is crucial for understanding natural structures. This work addresses the fractal inverse problem, which involves extracting fractal codes from images to explain these patterns and synthesize them at arbitrary finer scales. We introduce a novel algorithm that optimizes Iterated Function System parameters using a custom fractal generator combined with differentiable point splatting. By integrating both stochastic and gradient‐based optimization techniques, our approach effectively navigates the complex energy landscapes typical of fractal inversion, ensuring robust performance and the ability to escape local minima. We demonstrate the method's effectiveness through comparisons with various fractal inversion techniques, highlighting its ability to recover high‐quality fractal codes and perform extensive zoom‐ins to reveal intricate patterns from just a single image.
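
For context, the "chaos game (Eq. 2)" referenced in the figure captions is the standard stochastic algorithm for rendering an Iterated Function System. Below is a minimal NumPy sketch of the classic sequential version; the paper's parallel stochastic scheme and differentiable splatting stage are not reproduced here, and the Barnsley-fern coefficients are a textbook example, not the learned fractal codes.

```python
import numpy as np

def chaos_game(transforms, probs, n_points=100_000, n_warmup=100, seed=0):
    """Generate IFS fractal points via the chaos game: repeatedly apply a
    randomly chosen affine map x <- A x + b and record the visited points."""
    rng = np.random.default_rng(seed)
    x = np.zeros(2)
    points = np.empty((n_points, 2))
    for k in range(n_warmup + n_points):
        A, b = transforms[rng.choice(len(transforms), p=probs)]
        x = A @ x + b
        if k >= n_warmup:              # discard transient before the attractor is reached
            points[k - n_warmup] = x
    return points

# Barnsley fern (classic IFS example, not the paper's recovered codes)
fern = [
    (np.array([[ 0.00,  0.00], [ 0.00, 0.16]]), np.array([0.0, 0.00])),
    (np.array([[ 0.85,  0.04], [-0.04, 0.85]]), np.array([0.0, 1.60])),
    (np.array([[ 0.20, -0.26], [ 0.23, 0.22]]), np.array([0.0, 1.60])),
    (np.array([[-0.15,  0.28], [ 0.26, 0.24]]), np.array([0.0, 0.44])),
]
pts = chaos_game(fern, probs=[0.01, 0.85, 0.07, 0.07])
```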


Bracket Diffusion: HDR Image Generation by Consistent LDR Denoising

April 2025 · 6 Reads · 2 Citations

M. Bemana · T. Leimkühler · [...] · T. Ritschel

We demonstrate generating HDR images using the concerted action of multiple black-box, pre-trained LDR image diffusion models. Common diffusion models are not HDR because, first, there is no sufficiently large HDR image dataset available to re-train them, and, second, even if there were, re-training such models is infeasible for most compute budgets. Instead, we seek inspiration from the HDR image capture literature that traditionally fuses sets of LDR images, called "exposure brackets", to produce a single HDR image. We operate multiple denoising processes to generate multiple LDR brackets that together form a valid HDR result. To this end, we introduce a bracket consistency term into the diffusion process to couple the brackets such that they agree across the exposure range they share. We demonstrate HDR versions of state-of-the-art unconditional, conditional, and restoration-type (LDR2HDR) generative models.
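
The abstract does not spell out the consistency term; the following is a hypothetical sketch of the general idea of coupling several denoising processes so that overlapping exposures agree. The denoiser interface, the well-exposedness test, the blending weight lam, and the stop-based scaling are all our assumptions, not the paper's formulation.

```python
import torch

def sample_hdr_brackets(denoisers, stops, steps=50, lam=0.1, size=(1, 3, 64, 64)):
    """Hypothetical joint sampling of K LDR exposure brackets.
    denoisers[i](x, t) stands in for one reverse-diffusion step of a
    pre-trained LDR model; stops[i] is bracket i's exposure in stops."""
    xs = [torch.randn(size) for _ in stops]
    for t in reversed(range(steps)):
        xs = [d(x, t) for d, x in zip(denoisers, xs)]   # independent LDR denoising
        # Consistency: neighboring brackets should agree, up to exposure scaling,
        # on pixels that are well exposed in the darker bracket.
        for i in range(len(xs) - 1):
            scale = 2.0 ** (stops[i + 1] - stops[i])
            shared = (xs[i] > 0.05) & (xs[i] < 0.95)
            target = (xs[i] * scale).clamp(0.0, 1.0)
            xs[i + 1] = torch.where(shared, (1 - lam) * xs[i + 1] + lam * target, xs[i + 1])
    return xs  # merge as in classic exposure-bracket fusion to obtain the HDR image

# Toy usage with stand-in "denoisers" (real ones would be diffusion models):
brackets = sample_hdr_brackets([lambda x, t: 0.98 * x] * 3, stops=[-2, 0, 2])
```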


Enhancing image quality prediction with self‐supervised visual masking

April 2024 · 56 Reads

Full-reference image quality metrics (FR-IQMs) aim to measure the visual differences between a pair of reference and distorted images, with the goal of accurately predicting human judgments. However, existing FR-IQMs, including traditional ones like PSNR and SSIM and even perceptual ones such as HDR-VDP, LPIPS, and DISTS, still fall short in capturing the complexities and nuances of human perception. In this work, rather than devising a novel IQM model, we seek to improve the perceptual accuracy of existing FR-IQM methods. We achieve this by considering visual masking, an important characteristic of the human visual system that changes its sensitivity to distortions as a function of local image content. Specifically, for a given FR-IQM metric, we propose to learn a visual masking model that modulates reference and distorted images in a way that penalizes visual errors according to their visibility. Since ground-truth visual masks are difficult to obtain, we demonstrate how they can be derived in a self-supervised manner solely from mean opinion scores (MOS) collected on an FR-IQM dataset. Our approach yields enhanced FR-IQM metrics that are better aligned with human judgments, both visually and quantitatively.
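
As an illustration of the overall structure (not the paper's architecture), a masked FR-IQM wrapper might look like the following PyTorch sketch, where mask_net is a hypothetical stand-in for the learned visual masking model:

```python
import torch
import torch.nn as nn

class MaskedIQM(nn.Module):
    """Illustrative wrapper: a small CNN predicts a per-pixel visual mask from
    the reference; both images are modulated by it before the base FR-IQM is
    applied. Architecture and training details are stand-ins, not the paper's."""
    def __init__(self, base_metric):
        super().__init__()
        self.base_metric = base_metric          # e.g. a differentiable SSIM or LPIPS
        self.mask_net = nn.Sequential(          # hypothetical mask predictor
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, ref, dist):
        m = self.mask_net(ref)                  # visibility map in [0, 1]
        # Down-weight differences where masking predicts low visibility.
        return self.base_metric(ref * m, dist * m)

# Self-supervised training would regress the masked metric's score against MOS
# values from an FR-IQM dataset, so ground-truth masks are never required.
```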


Video frame interpolation for high dynamic range sequences captured with dual‐exposure sensors

May 2023 · 36 Reads · 2 Citations

Video frame interpolation (VFI) enables many important applications such as slow-motion playback and frame rate conversion. However, one major challenge in using VFI is accurately handling high dynamic range (HDR) scenes with complex motion. To this end, we explore the possible advantages of dual-exposure sensors that readily provide sharp short and blurry long exposures that are spatially registered and whose ends are temporally aligned. This way, motion blur registers temporally continuous information on the scene motion that, combined with the sharp reference, enables more precise motion sampling within a single camera shot. We demonstrate that this facilitates reconstruction of more complex motion in the VFI task, as well as HDR frame reconstruction, which so far has been considered only for the originally captured frames, not for in-between interpolated frames. We design a neural network trained on these tasks that clearly outperforms existing solutions. We also propose a metric for scene motion complexity that provides important insights into the performance of VFI methods at test time.
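
A toy model of the dual-exposure property the abstract relies on, assuming access to a high-frame-rate radiance sequence (our simulation, not the actual sensor readout): the long exposure integrates all sub-frames, so its blur traces the motion path, while the short exposure samples only the last sub-frame sharply; both end at the same instant.

```python
import numpy as np

def simulate_dual_exposure(frames, gain=8.0):
    """Toy dual-exposure model. frames: (T, H, W) or (T, H, W, C) radiance
    sub-frames covering one shot. The long exposure averages all sub-frames
    (its blur encodes the motion trajectory); the short exposure is the last
    sub-frame, amplified and clipped. Both end at the same time instant."""
    frames = np.asarray(frames, dtype=np.float32)
    long_exposure = frames.mean(axis=0)                    # blurry, well exposed
    short_exposure = np.clip(frames[-1] * gain, 0.0, 1.0)  # sharp, noisy/clipped
    return short_exposure, long_exposure
```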


Learning a self‐supervised tone mapping operator via feature contrast masking loss

May 2022 · 46 Reads · 9 Citations

High Dynamic Range (HDR) content is becoming ubiquitous due to the rapid development of capture technologies. Nevertheless, the dynamic range of common display devices is still limited; therefore, tone mapping (TM) remains a key challenge for image visualization. Recent work has demonstrated that neural networks can achieve remarkable performance in this task compared to traditional methods; however, the quality of the results of these learning-based methods is limited by the training data. Most existing works use as their training set a curated selection of best-performing results from existing traditional tone mapping operators (often guided by a quality metric), so the quality of newly generated results is fundamentally limited by the performance of such operators, and possibly further limited by the pool of HDR content used for training. In this work, we propose a learning-based, self-supervised tone mapping operator that is trained at test time specifically for each HDR image and does not need any labeled data. The key novelty of our approach is a carefully designed loss function built upon fundamental knowledge of contrast perception that allows for directly comparing the content in the HDR and tone-mapped images. We achieve this by reformulating classic VGG feature maps into feature contrast maps that normalize local feature differences by their average magnitude in a local neighborhood, allowing our loss to account for contrast masking effects. We perform extensive ablation studies and parameter exploration and demonstrate that, with a single set of fixed parameters, our solution outperforms existing approaches, as confirmed by both objective and subjective metrics.
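
The feature-contrast reformulation can be sketched as follows; the neighborhood size, normalization, and loss norm are assumptions for illustration, not the paper's exact definitions:

```python
import torch
import torch.nn.functional as F

def feature_contrast_map(feat, ksize=5, eps=1e-6):
    """Illustrative feature contrast map: local feature differences normalized
    by the average feature magnitude in a neighborhood (a stand-in for the
    paper's formulation). feat is a VGG feature tensor of shape (B, C, H, W)."""
    mean = F.avg_pool2d(feat, ksize, stride=1, padding=ksize // 2)
    return (feat - mean) / (mean.abs() + eps)   # contrast = difference / local magnitude

def contrast_loss(feat_hdr, feat_tm):
    """Compare HDR and tone-mapped content in contrast space, where values are
    commensurable despite the very different dynamic ranges of the inputs."""
    return (feature_contrast_map(feat_hdr) - feature_contrast_map(feat_tm)).abs().mean()
```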


Learning to Predict Image‐based Rendering Artifacts with Respect to a Hidden Reference Image

November 2019 · 62 Reads · 1 Citation

Image metrics predict the perceived per-pixel difference between a reference image and its degraded (e.g., re-rendered) version. In several important applications, the reference image is not available and image metrics cannot be applied. We devise a neural network architecture and training procedure that allows predicting the MSE, SSIM, or VGG16 image difference from the distorted image alone, without observing the reference. This is enabled by two insights: the first is to inject sufficiently many undistorted natural image patches, which are available in arbitrary amounts and are known to have no perceivable difference to themselves; this avoids false positives. The second is to balance the learning, carefully ensuring that all image errors are equally likely; this avoids false negatives. Surprisingly, we observe that the resulting no-reference metric can, subjectively, even outperform the reference-based one, as it had to become robust against misalignments. We evaluate the effectiveness of our approach in an image-based rendering context, both quantitatively and qualitatively. Finally, we demonstrate two applications which reduce light field capture time and provide guidance for interactive depth adjustment.
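
A possible reading of the two insights as a training-batch sampler (array shapes, bin count, and the 50/50 mixing ratio are our assumptions, not the paper's procedure):

```python
import numpy as np

def training_batch(distorted, errors, clean_patches, n=64, n_bins=8, seed=0):
    """Illustrative batch construction: (1) stratify distorted patches over
    error magnitude so all error levels are equally likely (avoids false
    negatives), and (2) mix in undistorted natural patches with a target error
    of exactly zero (avoids false positives)."""
    rng = np.random.default_rng(seed)
    # Quantile-based bins give roughly equal populations per error level.
    edges = np.quantile(errors, np.linspace(0, 1, n_bins + 1))
    bins = np.clip(np.digitize(errors, edges[1:-1]), 0, n_bins - 1)
    per_bin = n // (2 * n_bins)
    idx = np.concatenate([rng.choice(np.flatnonzero(bins == b), per_bin)
                          for b in range(n_bins)])
    # Second half of the batch: clean patches with zero-error targets.
    clean = rng.integers(0, len(clean_patches), n - len(idx))
    x = np.concatenate([distorted[idx], clean_patches[clean]])
    y = np.concatenate([errors[idx], np.zeros(len(clean))])
    return x, y
```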


Practical Capture and Reproduction of Phosphorescent Appearance

May 2017 · 35 Reads · 4 Citations

This paper proposes a pipeline to accurately acquire, efficiently reproduce, and intuitively manipulate phosphorescent appearance. In contrast to common appearance models, a model of phosphorescence needs to account for temporal change (decay) and previous illumination (saturation). For reproduction, we propose a rate equation that can be efficiently solved in combination with other illumination in a mixed integro-differential equation system. We describe an acquisition system to measure spectral coefficients of this rate equation for actual materials. Our model is evaluated by comparison to photographs of actual phosphorescent objects. Finally, we propose an artist-friendly interface to control the behavior of phosphorescent materials by specifying spatiotemporal appearance constraints.
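
The paper's spectral rate equation is not reproduced on this page; purely as an illustration of the two required effects, a generic single-band phosphor model with stored energy E(t), incoming illumination L_in, emitted radiance L_out, and constants α, β, γ could take the form:

```latex
\frac{\mathrm{d}E(t)}{\mathrm{d}t}
  = \underbrace{\alpha\, L_{\mathrm{in}}(t)\Bigl(1 - \tfrac{E(t)}{E_{\max}}\Bigr)}_{\text{charging, saturating at } E_{\max}}
  \;-\; \underbrace{\beta\, E(t)}_{\text{exponential decay}},
\qquad
L_{\mathrm{out}}(t) = \gamma\, E(t)
```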


Motion Parallax in Stereo 3D: Model and Applications

November 2016 · 87 Reads · 9 Citations

Binocular disparity is the main depth cue that makes stereoscopic images appear 3D. However, in many scenarios, the range of depth that can be reproduced by this cue is greatly limited and typically fixed due to constraints imposed by displays. For example, due to the low angular resolution of current automultiscopic screens, they can only reproduce a shallow depth range. In this work, we study the motion parallax cue, which is a relatively strong depth cue and can be freely reproduced even on a 2D screen without any limits. We exploit the fact that in many practical scenarios, motion parallax provides sufficiently strong depth information that binocular depth cues can be reduced through aggressive disparity compression. To assess the strength of the effect, we conduct psycho-visual experiments that measure the influence of motion parallax on depth perception and relate it to the depth resulting from binocular disparity. Based on the measurements, we propose a joint disparity-parallax computational model that predicts apparent depth resulting from both cues. We demonstrate how this model can be applied in the context of stereo and multiscopic image processing, and propose new disparity manipulation techniques which first quantify depth obtained from motion parallax and then adjust binocular disparity accordingly. This allows us to manipulate the disparity signal according to the strength of motion parallax to improve the overall depth reproduction. This technique is validated in additional experiments.
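
A deliberately simplified sketch of the kind of disparity manipulation described, with a linear trade-off standing in for the paper's measured joint disparity-parallax model:

```python
import numpy as np

def compress_disparity(disparity, parallax_depth, floor=0.1):
    """Hypothetical disparity manipulation: where predicted parallax-induced
    depth is strong, binocular disparity is compressed more aggressively;
    `floor` is the minimum disparity scale kept everywhere. The linear
    trade-off below is a stand-in for the paper's measured joint model."""
    strength = parallax_depth / (np.abs(parallax_depth).max() + 1e-8)
    scale = 1.0 - (1.0 - floor) * np.clip(strength, 0.0, 1.0)
    return disparity * scale
```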


Efficient Multi-image Correspondences for On-line Light Field Video Processing

October 2016 · 91 Reads · 36 Citations

Light field videos express the entire visual information of an animated scene, but their sheer size typically makes capture, processing, and display an off-line process, i.e., the time between initial capture and final display is far from real-time. In this paper, we propose a solution for one of the key bottlenecks in such a processing pipeline: reliable depth reconstruction, possibly for many views. This is enabled by a novel correspondence algorithm converting the video streams from a sparse array of off-the-shelf cameras into an array of animated depth maps. The algorithm is based on a generalization of the classic multi-resolution Lucas-Kanade correspondence algorithm from a pair of images to an entire array. A special inter-image confidence consolidation allows recovery from unreliable matching in some locations and some views. It can be implemented efficiently in massively parallel hardware, allowing for interactive computations. The resulting depth quality as well as the computation performance compares favorably to other state-of-the-art light-field-to-depth approaches, as well as stereo matching techniques. Another outcome of this work is a dataset of light field videos captured with multiple variants of sparse camera arrays.
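
For reference, a minimal version of the classic pairwise Lucas-Kanade building block (single level, global translation) that the paper generalizes to a whole camera array; the multi-resolution pyramid and inter-image confidence consolidation are not shown:

```python
import numpy as np

def lucas_kanade_shift(I, J, n_iter=10):
    """Estimate one global translation d = (dy, dx) such that J(x + d) ~ I(x),
    by iterating the Lucas-Kanade normal equations. Warping uses a crude
    nearest-integer shift to keep the sketch short."""
    d = np.zeros(2)
    gy, gx = np.gradient(I.astype(np.float64))      # gradients along rows / columns
    A = np.array([[np.sum(gy * gy), np.sum(gy * gx)],
                  [np.sum(gy * gx), np.sum(gx * gx)]])
    for _ in range(n_iter):
        # Warp J by the current estimate: Jw(x) = J(x + d).
        Jw = np.roll(J, shift=(-int(round(d[0])), -int(round(d[1]))), axis=(0, 1))
        e = (I - Jw).astype(np.float64)             # brightness-constancy residual
        b = np.array([np.sum(gy * e), np.sum(gx * e)])
        d += np.linalg.solve(A, b)                  # Gauss-Newton update
    return d
```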


Stream Line-Based Pattern Search in Flows

August 2016 · 27 Reads · 10 Citations

We propose a method that allows users to define flow features in the form of patterns represented as sparse sets of stream line segments. Our approach finds similar occurrences in the same or other time steps. Related approaches define patterns using dense, local stencils or support only single segments. Our patterns are defined sparsely and can have a significant extent, i.e., they are integration-based rather than local. This allows for greater flexibility in defining features of interest. Similarity is measured using intrinsic curve properties only, which enables invariance to location, orientation, and scale. Our method starts by splitting stream lines using globally consistent segmentation criteria. It strives to maintain the visually apparent features of the flow as a collection of stream line segments. Most importantly, it provides similar segmentations for similar flow structures. For user-defined patterns of curve segments, our algorithm finds similar ones that are invariant to similarity transformations. We showcase the utility of our method using different 2D and 3D flow fields.
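
Similarity from intrinsic curve properties can be illustrated with a turning-angle signature, which is invariant to translation, rotation, and uniform scaling; the descriptor and distance below are a stand-in for the paper's measure, not its definition:

```python
import numpy as np

def turning_angles(polyline):
    """Discrete curvature signature of a 2D polyline: the turning angle at each
    interior vertex, wrapped to (-pi, pi]. Invariant to similarity transforms."""
    v = np.diff(np.asarray(polyline, dtype=np.float64), axis=0)   # edge vectors
    a = np.arctan2(v[:, 1], v[:, 0])                              # edge directions
    return (np.diff(a) + np.pi) % (2 * np.pi) - np.pi

def segment_distance(p, q, n=32):
    """Compare two stream line segments by resampling their turning-angle
    signatures to a common length and taking an L2 distance."""
    sp, sq = turning_angles(p), turning_angles(q)
    t = np.linspace(0, 1, n)
    rp = np.interp(t, np.linspace(0, 1, len(sp)), sp)
    rq = np.interp(t, np.linspace(0, 1, len(sq)), sq)
    return np.linalg.norm(rp - rq) / np.sqrt(n)
```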


Citations (86)


... More recently, generative imaging has been utilized in inverse tonemapping, with GlowGAN [60] being the first to recover a multi-exposure bracket, as in traditional HDR capture [10]. Exposure Diffusion [3] improves this idea by adding exposure constraints to a pretrained diffusion model to generate relative exposures. While it capitalizes on the robust priors found in diffusion models, this method is restricted to pixel-space diffusion. ...

Reference:

GaSLight: Gaussian Splats for Spatially-Varying Lighting in HDR
Bracket Diffusion: HDR Image Generation by Consistent LDR Denoising

... To interpolate surfaces between four adjacent points, the polynomial functions also require the elevation values of neighboring points in multiple directions. The resulting system of equations grows in proportion to the square of the planar dimensions, including 16 equations and 16 unknowns or polynomial coefficients for an orthogonal system consisting of two governing directions (Spath 1995; Haber et al. 2001). Solving this system produces multidirectional elevation functions and the corresponding multidirectional derivatives. ...

Smooth approximation and rendering of large scattered data sets
  • Citing Conference Paper
  • January 2001

... However, high dynamic range (HDR) scenes [37,46], which are more consistent with the physical world, offer a broader dynamic range and provide a superior visual experience for humans. Traditional HDR image reconstruction techniques [29,47,8] still focus on 2D images. How to reconstruct 3D HDR scenes from multi-exposure unstructured LDR images remains a question worthy of investigation. ...

Learning a self‐supervised tone mapping operator via feature contrast masking loss

... Features can be largely distinguished by their representation, e.g. individual points [50], curves [17] that serve to connect feature points, or areas [13] that serve to partition the domain into homogeneous regions of flow. Topology-based feature extraction is concerned with extracting critical points in the domain, namely those locations where the vector's components are zero. ...

Extracting Higher Order Critical Points and Topological Simplification of 3D Vector Fields
  • Citing Conference Paper
  • January 2005

... Therefore, researchers tend to integrate the temporal perceptual mechanism in VQA methods. Among them, by analyzing the luminance adaptation and visual masking mechanisms, Aydin et al. [15] proposed a VQA method based on luminance adaptation, spatial-temporal contrast sensitivity, and visual masking. He et al. [16] explored and exploited the compact representation of energy in the 3D-DCT domain and evaluated video quality with three different types of statistical features. ...

Video Quality Assessment for Computer Graphics Applications
  • Citing Article
  • January 2010

ACM Transactions on Graphics

... To be able to efficiently use the most common photorealistic rendering systems, an artist is typically required to have an understanding of physical quantities pertaining to the most commonly modeled phenomena in light transport, e.g., indices of refraction, scattering and absorption albedos and more [STPP09, BS12, NSR17]. This modeling time can be cut down by techniques that enable editing bidirectional reflectance distribution function (BRDF) models directly within the scene [BAOR06, CPWAP08, SZC*07], however, with many of these methods, the artist is still required to understand the physical properties of light transport, often incurring a significant amount of trial and error. ...

Practical Capture and Reproduction of Phosphorescent Appearance
  • Citing Article
  • May 2017

... CDM-based crack growth models have been developed using techniques such as contact elements, node release, element removal, dynamic remeshing, and meshless methods to simulate crack extension [82]. Several localized CDM-based models have been proposed [83-88]. For example, JianPing and colleagues developed a CDM-based creep-fatigue crack growth model to predict steam turbine rotor failure [89], while Yatomi used a CDM-based creep crack growth model with nodal release to evaluate the C* and Q* integrals [90]. ...

Dynamic remeshing and applications
  • Citing Conference Paper
  • January 2003

... Given a single LDR image, some studies [24,68] estimate a 360° HDR environment map as spatially-varying light for inverse rendering and object insertion. Meanwhile, other studies [11,22,23,36,49] focus on estimating outdoor illumination from multiple images. ...

Relighting objects from image collections
  • Citing Conference Paper
  • June 2009

... Surface reconstruction from point clouds has been studied extensively for the last three decades. The field has seen significant evolution, from computational geometry methods (Amenta and Bern 1998; Dey and Goswami 2003) to implicit function techniques (Hoppe et al. 1992; Ohtake et al. 2003; Kazhdan, Bolitho, and Hoppe 2006; Kazhdan and Hoppe 2013; Hou et al. 2022; Liu et al. 2024), and more recently to deep learning approaches (Park et al. 2019; Chibane, Mir, and Pons-Moll 2020; Zhou et al. 2023; Ren et al. 2023; Wang et al. 2023a; Fainstein, Siless, and Iarussi 2024). Due to space constraints, this section primarily focuses on deep learning-based 3D reconstruction techniques. ...

Multi-level partition of unity implicits
  • Citing Article
  • January 2003

ACM Transactions on Graphics

... This is used to reconstruct coherent dynamic geometry from time-varying point clouds captured by real-time 3D scanning techniques. One widely used method is to reconstruct meshes for all frames first and then to fit a template mesh to all reconstructed meshes [2,18,49,50]. These methods always need additional markers or landmarks which must be specified by the users. ...

Template deformation for point cloud fitting
  • Citing Article
  • January 2006