Hans-Peter Seidel’s research while affiliated with Max Planck Institute for Informatics and other places

What is this page?


This page lists works of an author who doesn't have a ResearchGate profile or hasn't added the works to their profile yet. It is automatically generated from public (personal) data to further our legitimate goal of comprehensive and accurate scientific recordkeeping. If you are this author and want this page removed, please let us know.

Publications (797)


Uncertainty separation via ensemble quantile regression
  • Preprint
  • File available

December 2024 · 4 Reads

Hans-Peter Seidel · Vahid Babaei

This paper introduces a novel and scalable framework for uncertainty estimation and separation with applications in data-driven modeling in science and engineering tasks where reliable uncertainty quantification is critical. Leveraging an ensemble of quantile regression (E-QR) models, our approach enhances aleatoric uncertainty estimation while preserving the quality of epistemic uncertainty, surpassing competing methods, such as Deep Ensembles (DE) and Monte Carlo (MC) dropout. To address challenges in separating uncertainty types, we propose an algorithm that iteratively improves separation through progressive sampling in regions of high uncertainty. Our framework is scalable to large datasets and demonstrates superior performance on synthetic benchmarks, offering a robust tool for uncertainty quantification in data-driven applications.
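The separation idea can be sketched in a few lines: each ensemble member is a quantile regressor, the predicted inter-quantile width serves as an aleatoric estimate, and disagreement between members serves as an epistemic estimate. The Python sketch below uses scikit-learn's quantile-loss gradient boosting on bootstrap resamples; the model choice, quantile levels, and the width/disagreement heuristic are illustrative assumptions, not the paper's implementation (which additionally uses progressive sampling in high-uncertainty regions).

# Minimal sketch of uncertainty separation with an ensemble of quantile
# regressors (E-QR). Illustrative only; not the paper's code.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def fit_eqr(X, y, n_members=5, q_lo=0.05, q_hi=0.95, seed=0):
    """Fit an ensemble of quantile regressors on bootstrap resamples."""
    rng = np.random.default_rng(seed)
    ensemble = []
    for _ in range(n_members):
        idx = rng.integers(0, len(X), len(X))  # bootstrap resample
        lo = GradientBoostingRegressor(loss="quantile", alpha=q_lo).fit(X[idx], y[idx])
        med = GradientBoostingRegressor(loss="quantile", alpha=0.5).fit(X[idx], y[idx])
        hi = GradientBoostingRegressor(loss="quantile", alpha=q_hi).fit(X[idx], y[idx])
        ensemble.append((lo, med, hi))
    return ensemble

def separate_uncertainty(ensemble, X):
    """Aleatoric ~ mean predicted quantile width; epistemic ~ ensemble disagreement."""
    widths, medians = [], []
    for lo, med, hi in ensemble:
        widths.append(hi.predict(X) - lo.predict(X))
        medians.append(med.predict(X))
    aleatoric = np.mean(widths, axis=0)   # spread each member attributes to data noise
    epistemic = np.std(medians, axis=0)   # disagreement between ensemble members
    return aleatoric, epistemic

# Toy heteroscedastic data: the noise grows with x, so the aleatoric estimate should too.
X = np.linspace(0, 4, 400).reshape(-1, 1)
y = np.sin(X[:, 0]) + (0.1 + 0.2 * X[:, 0]) * np.random.default_rng(1).normal(size=400)
ale, epi = separate_uncertainty(fit_eqr(X, y), X)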


Figure captions:
Based on sparse images captured from various viewpoints and with different focus distances, apertures, and exposure times (left), our method generates an HDR 3D Gaussian representation that can be rendered with arbitrary depth of field. Our representation supports rendering images in a six-dimensional space of camera and lens parameters (right; five dimensions are shown).
Our pipeline for rendering HDR radiance fields with controllable depth of field.
Our training data includes underexposed and overexposed images (left). Our loss incorporates both the original data (Eq. 9) and a normalized version of it (Eq. 10). To manage defocus blur, a third loss component (Eq. 11) incorporates a defocus map.
Comparison of all-in-focus reconstruction. Our method provides results that are better than or comparable to the Deblur-Splatting method.
Comparison of HDR reconstruction. The left-most column presents input LDR images for a given view; the following columns show results from HDR-NeRF, the ground truth, and our method, respectively. Note that our method properly reconstructs the shape of the flame and successfully reproduces high-frequency content, while these details are missing in the results generated by HDR-NeRF. To convey the accuracy of both methods, we provide the HDR-VDP-3 error map in the top right corner of the images. To better show the differences in the highlights, we render these regions at -3 stops, as shown in the yellow box.

Cinematic Gaussians: Real‐Time HDR Radiance Fields with Depth of Field

November 2024 · 25 Reads

Chao Wang · Krzysztof Wolski · [...] · Thomas Leimkühler

Radiance field methods represent the state of the art in reconstructing complex scenes from multi‐view photos. However, these reconstructions often suffer from one or both of the following limitations: First, they typically represent scenes in low dynamic range (LDR), which restricts their use to evenly lit environments and hinders immersive viewing experiences. Secondly, their reliance on a pinhole camera model, assuming all scene elements are in focus in the input images, presents practical challenges and complicates refocusing during novel‐view synthesis. Addressing these limitations, we present a lightweight method based on 3D Gaussian Splatting that utilizes multi‐view LDR images of a scene with varying exposure times, apertures, and focus distances as input to reconstruct a high‐dynamic‐range (HDR) radiance field. By incorporating analytical convolutions of Gaussians based on a thin‐lens camera model as well as a tonemapping module, our reconstructions enable the rendering of HDR content with flexible refocusing capabilities. We demonstrate that our combined treatment of HDR and depth of field facilitates real‐time cinematic rendering, outperforming the state of the art.
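The "analytical convolutions of Gaussians based on a thin-lens camera model" admit a compact illustration: because a Gaussian blurred by a Gaussian defocus kernel is again a Gaussian, the circle of confusion predicted by the thin-lens model can be folded directly into each splat's projected 2D covariance. The Python sketch below shows this under simple assumptions; the function names, the CoC-to-sigma mapping, and the numeric example are illustrative and not taken from the paper.

# Minimal sketch of thin-lens depth of field for 2D Gaussian splats.
# Constants, units, and the CoC-to-sigma conversion are assumptions.
import numpy as np

def circle_of_confusion(depth, focus_dist, focal_len, aperture_diam):
    """Thin-lens circle-of-confusion diameter for a point at the given depth."""
    return aperture_diam * abs(depth - focus_dist) / depth * focal_len / (focus_dist - focal_len)

def defocused_covariance(cov2d, depth, focus_dist, focal_len, aperture_diam,
                         pixels_per_unit, coc_to_sigma=0.5):
    """Add an isotropic defocus Gaussian to a splat's projected 2x2 covariance."""
    coc_px = circle_of_confusion(depth, focus_dist, focal_len, aperture_diam) * pixels_per_unit
    sigma = coc_to_sigma * coc_px            # assumed mapping from CoC diameter to Gaussian sigma
    return cov2d + (sigma ** 2) * np.eye(2)  # Gaussian convolved with Gaussian: covariances add

# Example: a small in-focus splat grows visibly when rendered far from the focal plane.
cov = np.array([[1.0, 0.2], [0.2, 0.8]])
print(defocused_covariance(cov, depth=4.0, focus_dist=1.5, focal_len=0.05,
                           aperture_diam=0.02, pixels_per_unit=800.0))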


Figure captions:
Figure 3: Predictive performance of MoleVers, averaged over 5 splits, when finetuned on two assays with varying dataset size: (a) CHEMBL5291763, (b) CHEMBL2328568 (Zdrazil et al., 2024).
Ablation studies of our pretraining strategy: combining pretraining stages 1 and 2 gives the best performance on the downstream datasets.
Impact of pretraining (stage 1) dataset diversity, measured by the number of training samples: the downstream performance of MoleVers improves as the number of training samples increases.
Two-Stage Pretraining for Molecular Property Prediction in the Wild

November 2024 · 18 Reads

Accurate property prediction is crucial for accelerating the discovery of new molecules. Although deep learning models have achieved remarkable success, their performance often relies on large amounts of labeled data that are expensive and time-consuming to obtain. Thus, there is a growing need for models that can perform well with limited experimentally-validated data. In this work, we introduce MoleVers, a versatile pretrained model designed for various types of molecular property prediction in the wild, i.e., where experimentally-validated molecular property labels are scarce. MoleVers adopts a two-stage pretraining strategy. In the first stage, the model learns molecular representations from large unlabeled datasets via masked atom prediction and dynamic denoising, a novel task enabled by a new branching encoder architecture. In the second stage, MoleVers is further pretrained using auxiliary labels obtained with inexpensive computational methods, enabling supervised learning without the need for costly experimental data. This two-stage framework allows MoleVers to learn representations that generalize effectively across various downstream datasets. We evaluate MoleVers on a new benchmark comprising 22 molecular datasets with diverse types of properties, the majority of which contain 50 or fewer training labels reflecting real-world conditions. MoleVers achieves state-of-the-art results on 20 out of the 22 datasets, and ranks second among the remaining two, highlighting its ability to bridge the gap between data-hungry models and real-world conditions where practically-useful labels are scarce.
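A minimal skeleton of such a two-stage schedule is sketched below in Python/PyTorch: stage 1 trains on unlabeled molecules with masked-atom prediction plus coordinate denoising, and stage 2 continues training against cheaply computed auxiliary labels. The tiny encoder, pooling, and toy data are assumptions for illustration; in particular, the branching encoder architecture of MoleVers is not reproduced here.

# Schematic two-stage pretraining loop (illustrative, not the MoleVers code).
import torch
import torch.nn as nn

N_TYPES, HIDDEN = 16, 64
encoder    = nn.Sequential(nn.Linear(N_TYPES + 3, HIDDEN), nn.ReLU(), nn.Linear(HIDDEN, HIDDEN))
atom_head  = nn.Linear(HIDDEN, N_TYPES)   # stage 1: masked-atom prediction
noise_head = nn.Linear(HIDDEN, 3)         # stage 1: predict the injected coordinate noise
label_head = nn.Linear(HIDDEN, 1)         # stage 2: auxiliary label from a cheap computation

def stage1_step(atom_onehot, coords, opt, mask_p=0.25, noise_std=0.1):
    """Self-supervised step on an unlabeled molecule (masking + denoising)."""
    mask = torch.rand(len(atom_onehot)) < mask_p
    masked = atom_onehot.clone()
    masked[mask] = 0.0                                   # hide the masked atom types
    noise = noise_std * torch.randn_like(coords)
    h = encoder(torch.cat([masked, coords + noise], dim=-1))
    loss = nn.functional.mse_loss(noise_head(h), noise)
    if mask.any():
        loss = loss + nn.functional.cross_entropy(atom_head(h[mask]), atom_onehot[mask].argmax(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()

def stage2_step(atom_onehot, coords, computed_label, opt):
    """Supervised step on a label produced by an inexpensive computational method."""
    h = encoder(torch.cat([atom_onehot, coords], dim=-1)).mean(dim=0)   # pool atoms -> molecule
    loss = nn.functional.mse_loss(label_head(h), computed_label)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Toy molecule: 8 atoms with one-hot types and 3D coordinates.
atoms = torch.eye(N_TYPES)[torch.randint(0, N_TYPES, (8,))]
xyz = torch.randn(8, 3)
params = [*encoder.parameters(), *atom_head.parameters(), *noise_head.parameters(), *label_head.parameters()]
opt = torch.optim.Adam(params, lr=1e-3)
stage1_step(atoms, xyz, opt)                        # stage 1: large unlabeled corpus
stage2_step(atoms, xyz, torch.tensor([0.5]), opt)   # stage 2: computed auxiliary labels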


Jump Restore Light Transport

September 2024 · 1 Read

Markov chain Monte Carlo (MCMC) algorithms come to the rescue when sampling from a complex, high-dimensional distribution by a conventional method is intractable. Even though MCMC is a powerful tool, it is also hard to control and tune in practice. Simultaneously achieving both local exploration of the state space and global discovery of the target distribution is a challenging task. In this work, we present an MCMC formulation that subsumes all existing MCMC samplers employed in rendering. We then present a novel framework for adjusting an arbitrary Markov chain, making it exhibit invariance with respect to a specified target distribution. To showcase the potential of the proposed framework, we focus on a first simple application in light transport simulation. As a by-product, we introduce continuous-time MCMC sampling to the computer graphics community. We show how any existing MCMC-based light transport algorithm can be embedded into our framework. We empirically and theoretically prove that this embedding is superior to running the standalone algorithm. In fact, our approach converts any existing algorithm into a highly parallelizable variant with shorter running time, smaller error, and less variance.
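The tension between local exploration and global discovery that the abstract describes can be illustrated with a toy Metropolis-Hastings chain that randomly alternates between a small random-walk mutation and an occasional independent "large step" proposal; each kernel is corrected with its own acceptance test, so the mixture leaves the target invariant. This is a generic discrete-time Python illustration, not the paper's continuous-time jump/restore construction.

# Toy MH chain on a bimodal 1D target: local mutations explore a mode,
# occasional independent proposals discover the other mode.
import numpy as np

def target(x):
    """Unnormalized target with two well-separated modes."""
    return np.exp(-0.5 * (x - 3.0) ** 2) + np.exp(-0.5 * (x + 3.0) ** 2)

def mh_chain(n_steps=20000, p_large=0.1, local_sigma=0.3, seed=0):
    rng = np.random.default_rng(seed)
    x, samples = 0.0, []
    for _ in range(n_steps):
        if rng.random() < p_large:
            y = rng.uniform(-10.0, 10.0)        # independence proposal: global discovery
            # uniform proposal density, so the Hastings ratio reduces to the target ratio
        else:
            y = x + local_sigma * rng.normal()  # symmetric random walk: local exploration
        accept = target(y) / target(x)
        if rng.random() < accept:
            x = y
        samples.append(x)
    return np.array(samples)

print(mh_chain()[::5000])   # the chain visits both modes thanks to the occasional large jumps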


Neural Gaussian Scale-Space Fields

July 2024 · 1 Read

ACM Transactions on Graphics

Gaussian scale spaces are a cornerstone of signal representation and processing, with applications in filtering, multiscale analysis, anti-aliasing, and many more. However, obtaining such a scale space is costly and cumbersome, in particular for continuous representations such as neural fields. We present an efficient and lightweight method to learn the fully continuous, anisotropic Gaussian scale space of an arbitrary signal. Based on Fourier feature modulation and Lipschitz bounding, our approach is trained self-supervised, i.e., training does not require any manual filtering. Our neural Gaussian scale-space fields faithfully capture multiscale representations across a broad range of modalities, and support a diverse set of applications. These include images, geometry, light-stage data, texture anti-aliasing, and multiscale optimization.
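The Fourier feature modulation behind such scale spaces rests on a standard identity: convolving a signal with a Gaussian of covariance Sigma multiplies the frequency component at b by exp(-2*pi^2 * b^T Sigma b), so attenuating a field's Fourier features by this factor approximates evaluating the Gaussian-filtered signal. The sketch below demonstrates the identity with random Fourier features and a linear read-out; the tiny "field", frequency sampling, and covariances are illustrative assumptions, not the paper's architecture (which also relies on Lipschitz bounding).

# Gaussian filtering via attenuation of Fourier features (illustrative sketch).
import numpy as np

rng = np.random.default_rng(0)
B = rng.normal(scale=4.0, size=(64, 2))   # random 2D frequencies (cycles per unit)

def fourier_features(x, cov=None):
    """Fourier embedding of points x (N, 2); cov is the filter covariance Sigma (2, 2)."""
    phase = 2.0 * np.pi * x @ B.T                               # (N, 64)
    feats = np.concatenate([np.cos(phase), np.sin(phase)], -1)  # (N, 128)
    if cov is not None:
        atten = np.exp(-2.0 * np.pi ** 2 * np.einsum("kd,de,ke->k", B, cov, B))
        feats *= np.tile(atten, 2)                              # damp high frequencies
    return feats

# A linear "field" on top of the features: cov=None gives the sharp signal,
# an (an)isotropic cov approximates the Gaussian-filtered signal.
w = rng.normal(size=fourier_features(np.zeros((1, 2))).shape[1])
x = rng.uniform(size=(5, 2))
sharp    = fourier_features(x) @ w
filtered = fourier_features(x, cov=np.diag([0.05, 0.01])) @ w   # anisotropic smoothing
print(sharp, filtered)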


Learning Images Across Scales Using Adversarial Training

July 2024 · 8 Reads · 1 Citation

ACM Transactions on Graphics

The real world exhibits rich structure and detail across many scales of observation. It is difficult, however, to capture and represent a broad spectrum of scales using ordinary images. We devise a novel paradigm for learning a representation that captures an orders-of-magnitude variety of scales from an unstructured collection of ordinary images. We treat this collection as a distribution of scale-space slices to be learned using adversarial training, and additionally enforce coherency across slices. Our approach relies on a multiscale generator with carefully injected procedural frequency content, which allows the emerging continuous scale space to be explored interactively. Training across vastly different scales poses challenges regarding stability, which we tackle using a supervision scheme that involves careful sampling of scales. We show that our generator can be used as a multiscale generative model, and for reconstructions of scale spaces from unstructured patches. Significantly outperforming the state of the art, we demonstrate zoom-in factors of up to 256x at high quality and scale consistency.
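Treating an image collection as "a distribution of scale-space slices" can be made concrete with a small sampling routine: draw a scale log-uniformly, resample the image accordingly, and cut a fixed-size patch; such patches are what an adversarial discriminator would be shown. The Python sketch below is an assumption-laden illustration (sampling range, bilinear zoom, padding), not the paper's supervision scheme.

# Sample fixed-size patches at random scales from a single image (illustrative sketch).
import numpy as np
from scipy.ndimage import zoom

def sample_scale_slice(image, patch=64, log2_min=-4.0, log2_max=0.0, rng=None):
    """Return a (patch, patch, C) crop of `image` rendered at a random scale."""
    if rng is None:
        rng = np.random.default_rng()
    scale = 2.0 ** rng.uniform(log2_min, log2_max)    # careful (log-uniform) sampling of scales
    scaled = zoom(image, (scale, scale, 1), order=1)  # bilinear resampling
    if min(scaled.shape[:2]) < patch:                 # very zoomed out: pad instead of skipping
        pad_h = max(0, patch - scaled.shape[0])
        pad_w = max(0, patch - scaled.shape[1])
        scaled = np.pad(scaled, ((0, pad_h), (0, pad_w), (0, 0)), mode="edge")
    y = rng.integers(0, scaled.shape[0] - patch + 1)
    x = rng.integers(0, scaled.shape[1] - patch + 1)
    return scaled[y:y + patch, x:x + patch], scale

# Usage: build a minibatch of slices spanning several octaves from one image.
img = np.random.rand(1024, 1024, 3)
batch = [sample_scale_slice(img)[0] for _ in range(8)]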




Learning Images Across Scales Using Adversarial Training

June 2024 · 33 Reads

The real world exhibits rich structure and detail across many scales of observation. It is difficult, however, to capture and represent a broad spectrum of scales using ordinary images. We devise a novel paradigm for learning a representation that captures an orders-of-magnitude variety of scales from an unstructured collection of ordinary images. We treat this collection as a distribution of scale-space slices to be learned using adversarial training, and additionally enforce coherency across slices. Our approach relies on a multiscale generator with carefully injected procedural frequency content, which allows the emerging continuous scale space to be explored interactively. Training across vastly different scales poses challenges regarding stability, which we tackle using a supervision scheme that involves careful sampling of scales. We show that our generator can be used as a multiscale generative model, and for reconstructions of scale spaces from unstructured patches. Significantly outperforming the state of the art, we demonstrate zoom-in factors of up to 256x at high quality and scale consistency.


Cinematic Gaussians: Real-Time HDR Radiance Fields with Depth of Field

June 2024 · 34 Reads

Radiance field methods represent the state of the art in reconstructing complex scenes from multi-view photos. However, these reconstructions often suffer from one or both of the following limitations: First, they typically represent scenes in low dynamic range (LDR), which restricts their use to evenly lit environments and hinders immersive viewing experiences. Secondly, their reliance on a pinhole camera model, assuming all scene elements are in focus in the input images, presents practical challenges and complicates refocusing during novel-view synthesis. Addressing these limitations, we present a lightweight method based on 3D Gaussian Splatting that utilizes multi-view LDR images of a scene with varying exposure times, apertures, and focus distances as input to reconstruct a high-dynamic-range (HDR) radiance field. By incorporating analytical convolutions of Gaussians based on a thin-lens camera model as well as a tonemapping module, our reconstructions enable the rendering of HDR content with flexible refocusing capabilities. We demonstrate that our combined treatment of HDR and depth of field facilitates real-time cinematic rendering, outperforming the state of the art.


Citations (47)


... We argue that each value on the camera setting scale is crucial, as it embodies a deeper physical understanding rather than merely representing a few discrete values. Continuous sampling supports continuous-scale training [8,10,64], which enhances the model's ability to perform across any value on the scale. Therefore, when constructing contrastive data, we perform random sampling across the continuous camera setting scale to gather multiple values. ...

Reference:

Generative Photography: Scene-Consistent Camera Control for Realistic Text-to-Image Synthesis
Learning Images Across Scales Using Adversarial Training
  • Citing Article
  • July 2024

ACM Transactions on Graphics

... The second approach centers on inverting the tone mapping applied to the input LDR image, aiming to recover missing details in clipped regions [20,21,35,38,42,56,72,76]. With the growth of generative models, approaches to HDR generation also emerged. GlowGAN [63] is the first attempt to generate HDR content from scratch and can extend existing LDR images through GAN inversion. However, its effectiveness is limited by the class-specific training and generation constraints inherent to GANs. ...

GlowGAN: Unsupervised Learning of HDR Images from LDR Images in the Wild
  • Citing Conference Paper
  • October 2023

... Azinovic et al. [2019]; Nimier-David et al. [2020] highlight that gradient estimates can be biased when optimizing with MC samples, potentially distorting the converged scene parameters. To address this, methods such as using two uncorrelated samples for the rendering integral and its derivatives [Azinovic et al. 2019; Nimier-David et al. 2020; Vicini et al. 2021], employing dual-buffer techniques for the L2 loss [Deng et al. 2022; Pidhorskyi et al. 2022], and exploiting variance reduction techniques [Balint et al. 2023; Fischer and Ritschel 2023; Nicolet et al. 2023; Zhang et al. 2021] have been proposed. ...

Joint Sampling and Optimisation for Inverse Rendering
  • Citing Conference Paper
  • December 2023
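The dual-buffer technique for the L2 loss mentioned in the snippet above can be demonstrated with a toy estimator whose noise depends on the optimized parameter: differentiating the squared residual with the same noisy sample in both factors yields a biased gradient (the bias is the derivative of the estimator's variance), whereas pairing two independent buffers removes the correlation. The Python example below is a schematic illustration; the toy "renderer" is an assumption standing in for an actual path tracer.

# Bias of the naive L2 gradient with correlated MC estimates vs. the dual-buffer estimator.
import numpy as np

rng = np.random.default_rng(0)
theta, ref, sigma, n = 1.0, 2.0, 0.8, 200000

def estimate(eps):
    """One-sample MC estimate of I(theta); the noise scale depends on theta."""
    return theta + theta * eps

def d_estimate(eps):
    """Derivative of the SAME noisy estimate with respect to theta."""
    return 1.0 + eps

eps_a, eps_b = rng.normal(0, sigma, n), rng.normal(0, sigma, n)
naive = 2.0 * (estimate(eps_a) - ref) * d_estimate(eps_a)   # shared samples: correlated, biased
dual  = 2.0 * (estimate(eps_a) - ref) * d_estimate(eps_b)   # independent buffers: unbiased
print("true gradient:", 2.0 * (theta - ref))   # -2.0
print("naive mean:   ", naive.mean())          # about -2.0 + 2*theta*sigma^2 (biased)
print("dual mean:    ", dual.mean())           # about -2.0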

... AudioLM [10] employs neural codecs [78] for long-term consistency in audio generation, mapping input audio to discrete tokens and using a language modeling task. Meanwhile, DiffSound [83], using a text encoder, decoder, vector quantized variational autoencoder (VQVAE) [32], and "vocoder", introduces a mask-based text generation strategy (MBTG) for generating text descriptions from audio labels to address the scarcity of audio-text paired data. ...

An Implicit Neural Representation for the Image Stack: Depth, All in Focus, and High Dynamic Range
  • Citing Article
  • December 2023

ACM Transactions on Graphics

... To reduce the cost associated with constructing per-pixel kernels, Işik et al. [16] generate per-pixel features that are used to construct adaptive dilated kernels. Balint et al. [4] use a series of lightweight U-Nets to construct a low-pass pyramid filter, where the kernel constructor is also trained to separate the input radiance between pyramidal layers as an alternative to classical downsampling and upsampling approaches, which are often prone to aliasing. In common with the aforementioned methods, our approach uses a U-Net, although in our case it is integrated into a local linear model for improved computational efficiency, rather than generating kernels or output pixels directly. ...

Neural Partitioning Pyramids for Denoising Monte Carlo Renderings
  • Citing Conference Paper
  • July 2023

... Their algorithm should be general but is demonstrated in two dimensions (or on surfaces embedded in 3-d). A sliced approach also allows for multi-class color stippling [SGSS22]. An interesting generalization relates to the optimal transport approximation of a density by other non-punctual measures [MM99], such as a single (long) continuous curve [LdGKW19] (Fig. 10, right). ...

Scalable Multi-Class Sampling via Filtered Sliced Optimal Transport
  • Citing Article
  • November 2022

ACM Transactions on Graphics

... Real-Time VR Rendering. To achieve real-time VR rendering, prior works have leaned on image-based rendering or remote rendering [9,29,39,50,70]. Some systems directly stream rendered videos to clients, but they often suffer from bandwidth limits and require efficient video encoding [52]. ...

QuadStream: A Quad-Based Scene Streaming Architecture for Novel Viewpoint Reconstruction
  • Citing Article
  • November 2022

ACM Transactions on Graphics

... The highlights can easily damage the quality of the target image owing to the combined effect of the sunlight and the physical characteristics of the target's surface. The flow and strength of the illumination are determined by the object's category and dye, and also by the reservation between the external area and the source of light [2]. Due to these highlights, the brightness of the image is reduced in a sliding window, and they frequently cause unwanted discontinuities in the diffuse part of the object [3]. ...

Gloss management for consistent reproduction of real and virtual objects
  • Citing Conference Paper
  • November 2022

... The development team utilized models for motion capture and remix music from GarageBand®. Microsoft Visual Studio was used to debug and deploy the programming code from the development software to the OST-HMD device (Rahaman, Champion, & Bekele, 2019) to address potential issues related to vergence-accommodation conflict (Arabadzhiyska, Tursun, Seidel, & Didyk, 2022). Finally, the conjoint survey was developed via Sawtooth software. ...

Practical Saccade Prediction for Head-Mounted Displays: Towards a Comprehensive Model
  • Citing Article
  • October 2022

ACM Transactions on Applied Perception

... NeRF-based approaches [7,15,20,24,30,44,59] showed promising results in dehazing [12,27,50], underwater image restoration and novel view synthesis. ScatterNeRF [50] and DehazeNeRF [12] extended NeRF with atmospheric scattering for reconstructing scenes with low visibility due to haze, and rectifying color and geometry for hazed scenes. ...

Eikonal Fields for Refractive Novel-View Synthesis
  • Citing Conference Paper
  • August 2022