Tobias Ritschel's research while affiliated with Kingston College United Kingdom and other places

Publications (182)

Preprint
In many recent works, multi-layer perceptions (MLPs) have been shown to be suitable for modeling complex spatially-varying functions including images and 3D scenes. Although the MLPs are able to represent complex scenes with unprecedented quality and memory footprint, this expressive power of the MLPs, however, comes at the cost of long training an...
Article
We propose a relighting method for outdoor images. Our method mainly focuses on predicting cast shadows in arbitrary novel lighting directions from a single image while also accounting for shading and global effects such the sun light color and clouds. Previous solutions for this problem rely on reconstructing occluder geometry, e. g., using multi‐...
Preprint
Full-text available
We propose a relighting method for outdoor images. Our method mainly focuses on predicting cast shadows in arbitrary novel lighting directions from a single image while also accounting for shading and global effects such the sun light color and clouds. Previous solutions for this problem rely on reconstructing occluder geometry, e.g. using multi-vi...
Conference Paper
Computer-Generated Holography (CGH) offers the potential for genuine, high-quality three-dimensional visuals. However, fulfilling this potential remains a practical challenge due to computational complexity and visual quality issues. We propose a new CGH method that exploits gaze-contingency and perceptual graphics to accelerate the development of...
Preprint
There currently are two main approaches to reproducing visual appearance using Machine Learning (ML): The first is training models that generalize over different instances of a problem, e.g., different images from a dataset. Such models learn priors over the data corpus and use this knowledge to provide fast inference with little input, often as a...
Article
Recent advances in neural rendering indicate immense promise for architectures that learn light transport, allowing efficient rendering of global illumination effects once such methods are trained. The training phase of these methods can be seen as a form of pre-computation, which has a long standing history in Computer Graphics. In particular, Pre...
Article
High-dynamic range (HDR) video reconstruction using conventional single-exposure sensors can be achieved by temporally alternating exposures. This, in turn, requires computing exposure alignment which is difficult to achieve due to the exposure differences that notoriously creates problems for moving content, in particular, in larger saturated and...
Preprint
Scanning Transmission Electron Microscopes (STEMs) acquire 2D images of a 3D sample on the scale of individual cell components. Unfortunately, these 2D images can be too noisy to be fused into a useful 3D structure and facilitating good denoisers is challenging due to the lack of clean-noisy pairs. Additionally, representing a detailed 3D structure...
Conference Paper
Computer-Generated Holography (CGH) promises to deliver genuine, high-quality visuals at any depth. We argue that combining CGH and perceptually guided graphics can soon lead to practical holographic display systems that deliver perceptually realistic images. We propose a new CGH method called metameric varifocal holograms. Our CGH method generates...
Conference Paper
Ventral metamers, pairs of images which may differ substantially in the periphery, but are perceptually identical, offer exciting new possibilities in foveated rendering and image compression, as well as offering insights into the human visual system. However, existing lit-erature has mainly focused on creating metamers of static images. In this wo...
Preprint
Full-text available
Three-dimensional (3D) X-ray imaging techniques like tomography and confocal microscopy are crucial for academic and industrial applications. These approaches access 3D information by scanning the sample with respect to the X-ray source. However, the scanning process limits the temporal resolution when studying dynamics and is not feasible for some...
Preprint
We tackle the problem of generating novel-view images from collections of 2D images showing refractive and reflective objects. Current solutions assume opaque or transparent light transport along straight paths following the emission-absorption model. Instead, we optimize for a field of 3D-varying Index of Refraction (IoR) and trace light through i...
Preprint
Appropriate weight initialization has been of key importance to successfully train neural networks. Recently, batch normalization has diminished the role of weight initialization by simply normalizing each layer based on batch statistics. Unfortunately, batch normalization has several drawbacks when applied to small batch sizes, as they are require...
Article
Full-text available
Density estimation plays a crucial role in many data analysis tasks, as it infers a continuous probability density function (PDF) from discrete samples. Thus, it is used in tasks as diverse as analyzing population data, spatial locations in 2D sensor readings, or reconstructing scenes from 3D scans. In this paper, we introduce a learned, data-drive...
Preprint
Full-text available
Computer-Generated Holography (CGH) offers the potential for genuine, high-quality three-dimensional visuals. However, fulfilling this potential remains a practical challenge due to computational complexity and visual quality issues. We propose a new CGH method that exploits gaze-contingency and perceptual graphics to accelerate the development of...
Article
Full-text available
To peripheral vision, a pair of physically different images can look the same. Such pairs are metamers relative to each other, just as physically-different spectra of light are perceived as the same color. We propose a real-time method to compute such ventral metamers for foveated rendering where, in particular for near-eye displays, the largest pa...
Article
Full-text available
To peripheral vision, a pair of physically different images can look the same. Such pairs are metamers relative to each other, just as physically-different spectra of light are perceived as the same color. We propose a real-time method to compute such ventral metamers for foveated rendering where, in particular for near-eye displays, the largest pa...
Preprint
Full-text available
Density estimation plays a crucial role in many data analysis tasks, as it infers a continuous probability density function (PDF) from discrete samples. Thus, it is used in tasks as diverse as analyzing population data, spatial locations in 2D sensor readings, or reconstructing scenes from 3D scans. In this paper, we introduce a learned, data-drive...
Article
Full-text available
Controlled capture of real‐world material appearance yields tabulated sets of highly realistic reflectance data. In practice, however, its high memory footprint requires compressing into a representation that can be used efficiently in rendering while remaining faithful to the original. Previous works in appearance encoding often prioritized one of...
Article
Phase retrieval approaches based on deep learning (DL) provide a framework to obtain phase information from an intensity hologram or diffraction pattern in a robust manner and in real-time. However, current DL architectures applied to the phase problem rely on i) paired datasets, i. e., they are only applicable when a satisfactory solution of the p...
Article
We propose Blue Noise Plots, two-dimensional dot plots that depict data points of univariate data sets. While often one-dimensional strip plots are used to depict such data, one of their main problems is visual clutter which results from overlap. To reduce this overlap, jitter plots were introduced, whereby an additional, non-encoding plot dimensio...
Preprint
Our goal is to learn a deep network that, given a small number of images of an object of a given category, reconstructs it in 3D. While several recent works have obtained analogous results using synthetic data or assuming the availability of 2D primitives such as keypoints, we are interested in working with challenging real data and with no manual...
Preprint
We learn a latent space for easy capture, semantic editing, consistent interpolation, and efficient reproduction of visual material appearance. When users provide a photo of a stationary natural material captured under flash light illumination, it is converted in milliseconds into a latent material code. In a second step, conditioned on the materia...
Article
Previous work has demonstrated learning isolated 3D objects (voxel grids, point clouds, meshes, etc.) from 2D-only self-supervision. Here we set out to extend this to entire 3D scenes made out of multiple objects, including their location, orientation and type, and the scenes illumination. Once learned, we can map arbitrary 2D images to 3D scene st...
Preprint
Controlled capture of real-world material appearance yields tabulated sets of highly realistic reflectance data. In practice, however, its high memory footprint requires compressing into a representation that can be used efficiently in rendering while remaining faithful to the original. Previous works in appearance encoding often prioritised one of...
Preprint
We propose Blue Noise Plots, two-dimensional dot plots that depict data points of univariate data sets. While often one-dimensional strip plots are used to depict such data, one of their main problems is visual clutter which results from overlap. To reduce this overlap, jitter plots were introduced, whereby an additional, non-encoding plot dimensio...
Preprint
We seek to reconstruct sharp and noise-free high-dynamic range (HDR) video from a dual-exposure sensor that records different low-dynamic range (LDR) information in different pixel columns: Odd columns provide low-exposure, sharp, but noisy information; even columns complement this with less noisy, high-exposure, but motion-blurred data. Previous L...
Chapter
Massive semantically labeled datasets are readily available for 2D images, however, are much harder to achieve for 3D scenes. Objects in 3D repositories like ShapeNet are labeled, but regrettably only in isolation, so without context. 3D scenes can be acquired by range scanners on city-level scale, but much fewer with semantic labels. Addressing th...
Conference Paper
Full-text available
Massive semantically labeled datasets are readily available for 2D images, however, are much harder to achieve for 3D scenes. Objects in 3D repositories like ShapeNet are labeled, but regrettably only in isolation, so without context. 3D scenes can be acquired by range scanners on city-level scale, but much fewer with semantic labels. Addressing th...
Preprint
Full-text available
Previous work has demonstrated learning isolated 3D objects (voxel grids, point clouds, meshes, etc.) from 2D-only self-supervision. We here set out to extend this to entire 3D scenes made out of multiple objects, including their location, orientation and type, and the scenes illumination. Once learned, we can map arbitrary 2D images to 3D scene st...
Article
We suggest to represent an X-Field - -a set of 2D images taken across different view, time or illumination conditions, i.e., video, lightfield, reflectance fields or combinations thereof - -by learning a neural network (NN) to map their view, time or light coordinates to 2D images. Executing this NN at new coordinates results in joint view, time or...
Preprint
Full-text available
Phase retrieval approaches based on DL provide a framework to obtain phase information from an intensity hologram or diffraction pattern in a robust manner and in real time. However, current DL architectures applied to the phase problem rely i) on paired datasets, i. e., they are only applicable when a satisfactory solution of the phase problem has...
Preprint
The motion of picking up and placing an object in 3D space is full of subtle detail. Typically these motions are formed from the same constraints, optimizing for swiftness, energy efficiency, as well as physiological limits. Yet, even for identical goals, the motion realized is always subject to natural variation. To capture these aspects computati...
Preprint
We suggest to represent an X-Field -a set of 2D images taken across different view, time or illumination conditions, i.e., video, light field, reflectance fields or combinations thereof-by learning a neural network (NN) to map their view, time or light coordinates to 2D images. Executing this NN at new coordinates results in joint view, time or lig...
Conference Paper
We suggest a generative model of 2D and 3D natural textures with diversity, visual fidelity and at high computational efficiency. This is enabled by a family of methods that extend ideas from classic stochastic procedural texturing (Perlin noise) to learned, deep, non-linearities. Our model encodes all exemplars from a diverse set of textures witho...
Preprint
Proteins perform a large variety of functions in living organisms, thus playing a key role in biology. As of now, available learning algorithms to process protein data do not consider several particularities of such data and/or do not scale well for large protein conformations. To fill this gap, we propose two new learning operations enabling deep...
Preprint
Full-text available
Massive semantic labeling is readily available for 2D images, but much harder to achieve for 3D scenes. Objects in 3D repositories like ShapeNet are labeled, but regrettably only in isolation, so without context. 3D scenes can be acquired by range scanners on city-level scale, but much fewer with semantic labels. Addressing this disparity, we intro...
Article
Full-text available
Convolutional neural networks (CNNs) handle the case where filters extend beyond the image boundary using several heuristics, such as zero, repeat or mean padding. These schemes are applied in an ad-hoc fashion and, being weakly related to the image content and oblivious of the target task, result in low output quality at the boundary. In this pape...
Preprint
We propose a generative model of 2D and 3D natural textures with diversity, visual fidelity and at high computational efficiency. This is enabled by a family of methods that extend ideas from classic stochastic procedural texturing (Perlin noise) to learned, deep, non-linearities. The key idea is a hard-coded, tunable and differentiable step that f...
Article
Designing point patterns with desired properties can require substantial effort, both in hand-crafting coding and mathematical derivation. Retaining these properties in multiple dimensions or for a substantial number of points can be challenging and computationally expensive. Tackling those two issues, we suggest to automatically generate scalable...
Article
Displacement mapping is routinely used to add geometric details in a fast and easy‐to‐control way, both in offline rendering as well as recently in interactive applications such as games. However, it went largely unnoticed (with the exception of McGuire and Whitson [MW08]) that, when applying displacement mapping to a surface with a low‐distortion...
Preprint
We suggest representing light field (LF) videos as "one-off" neural networks (NN), i.e., a learned mapping from view-plus-time coordinates to high-resolution color values, trained on sparse views. Initially, this sounds like a bad idea for three main reasons: First, a NN LF will likely have less quality than a same-sized pixel basis representation....
Conference Paper
We introduce PLATONICGAN to discover the 3D structure of an object class from an unstructured collection of 2D images, i. e., neither any relation between the images is available nor additional information about the images is known. The key idea is to train a deep neural network to generate 3D shapes which rendered to images are indistinguishable f...
Article
Image metrics predict the perceived per‐pixel difference between a reference image and its degraded (e. g., re‐rendered) version. In several important applications, the reference image is not available and image metrics cannot be applied. We devise a neural network architecture and training procedure that allows predicting the MSE, SSIM or VGG16 im...
Conference Paper
We introduce PlatonicGAN to discover the 3D structure of an object class from an unstructured collection of 2D images, i.e., where no relation between photos is known, except that they are showing instances of the same category. The key idea is to train a deep neural network to generate 3D shapes which, when rendered to images, are indistinguishabl...
Article
We suggest a rasterization pipeline tailored towards the needs of HMDs, where latency and field-of-view requirements pose new challenges beyond those of traditional desktop displays. Instead of image warping for low latency, or using multiple passes for foveation, we show how both can be produced directly in a single perceptual rasterization pass....
Article
We suggest a method to directly deep‐learn light transport, i. e., the mapping from a 3D geometry‐illumination‐material configuration to a shaded 2D image. While many previous learning methods have employed 2D convolutional neural networks applied to images, we show for the first time that light transport can be learned directly in 3D. The benefit...
Preprint
We show that denoising of 3D point clouds can be learned unsupervised, directly from noisy 3D point cloud data only. This is achieved by extending recent ideas from learning of unsupervised image denoisers to unstructured 3D point clouds. Unsupervised image denoisers operate under the assumption that a noisy pixel observation is a random realizatio...
Preprint
Image metrics predict the perceived per-pixel difference between a reference image B and its degraded (e. g., re-rendering) version A. We devise a neural network architecture and training procedure that allows predicting the MSE, SSIM or VGG16 image difference A-B from the distorted image A alone, when the reference B is not observed. This is enabl...
Conference Paper
Deep learning systems extensively use convolution operations to process input data. Though convolution is clearly defined for structured data such as 2D images or 3D volumes, this is not true for other data types such as sparse point clouds. Previous techniques have developed approximations to convolutions for restricted conditions. Unfortunately,...
Conference Paper
In computer graphics, many traditional problems are now better handled by deep-learning based data-driven methods. In applications that operate on regular 2D domains, like image processing and computational photography, deep networks are state-of-the-art, beating dedicated hand-crafted methods by significant margins. More recently, other domains su...
Preprint
We develop PlatonicGAN to discover 3D structure of an object class from an unstructured collection of 2D images. The key idea is to learn a deep neural network that generates 3D shapes that are never objectionable to a discriminator looking only at its 2D projections, i.e. renderings of the generated volumes. Using such a 2D instead of a 3D discrim...
Preprint
We suggest a method to directly deep-learn light transport, i. e., the mapping from a 3D geometry-illumination-material configuration to a shaded 2D image. While many previous learning methods have employed 2D convolutional neural networks applied to images, we show for the first time that light transport can be learned directly in 3D. The benefit...
Article
Deep learning systems extensively use convolution operations to process input data. Though convolution is clearly defined for structured data such as 2D images or 3D volumes, this is not true for other data types such as sparse point clouds. Previous techniques have developed approximations to convolutions for restricted conditions. Unfortunately,...
Conference Paper
Faithful manipulation of shape, material, and illumination in 2D Internet images would greatly benefit from a reliable factorization of appearance into material (i.e. diffuse and specular) and illumination (i.e. environment maps). On the one hand, current methods that produce very high fidelity results, typically require controlled settings, expens...
Article
Simulating combinations of depth-of-field and motion blur is an important factor to cinematic quality in synthetic images but can take long to compute. Splatting the point-spread function (PSF) of every pixel is general and provides high quality, but requires prohibitive compute time. We accelerate this in two steps: In a pre-process we optimize fo...
Preprint
Sample patterns have many uses in Computer Graphics, ranging from procedural object placement over Monte Carlo image synthesis to non-photorealistic depiction. Their properties such as discrepancy, spectra, anisotropy, or progressiveness have been analyzed extensively. However, designing methods to produce sampling patterns with certain properties...
Preprint
We suggest a rasterization pipeline tailored towards the need of head-mounted displays (HMD), where latency and field-of-view requirements pose new challenges beyond those of traditional desktop displays. Instead of rendering and warping for low latency, or using multiple passes for foveation, we show how both can be produced directly in a single p...
Preprint
We propose an efficient and effective method to learn convolutions for non-uniformly sampled point clouds, as they are obtained with modern acquisition techniques. Learning is enabled by four key novelties: first, representing the convolution kernel itself as a multilayer perceptron; second, phrasing convolution as a Monte Carlo integration problem...
Article
Convolutional neural networks (CNNs) handle the case where filters extend beyond the image boundary using several heuristics, such as zero, repeat or mean padding. These schemes are applied in an ad-hoc fashion and, being weakly related to the image content and oblivious of the target task, result in low output quality at the boundary. In this pape...
Preprint
Convolutional neural networks (CNNs) handle the case where filters extend beyond the image boundary using several heuristics, such as zero, repeat or mean padding. These schemes are applied in an ad-hoc fashion and, being weakly related to the image content and oblivious of the target task, result in low output quality at the boundary. In this pape...
Article
As many different 3D volumes could produce the same 2D x‐ray image, inverting this process is challenging. We show that recent deep learning‐based convolutional neural networks can solve this task. As the main challenge in learning is the sheer amount of data created when extending the 2D image into a 3D volume, we suggest firstly to learn a coarse...
Article
We propose a deep representation of appearance, i. e. the relation of color, surface orientation, viewer position, material and illumination. Previous approaches have used deep learning to extract classic appearance representations relating to reflectance model parameters (e. g. Phong) or illumination (e. g. HDR environment maps). We suggest to dir...
Conference Paper
In computer graphics, many traditional problems are now better handled by deep-learning based data-driven methods. In an increasing variety of problem settings, deep networks are state-of-the-art, beating dedicated hand-crafted methods by significant margins. This tutorial gives an organized overview of core theory, practice, and graphics-related a...
Article
We suggest a rasterization pipeline tailored towards the need of head-mounted displays (HMD), where latency and field-of-view requirements pose new challenges beyond those of traditional desktop displays. Instead of rendering and warping for low latency, or using multiple passes for foveation, we show how both can be produced directly in a single p...
Article
Faithful manipulation of shape, material, and illumination in 2D Internet images would greatly benefit from a reliable factorization of appearance into material (i.e., diffuse and specular) and illumination (i.e., environment maps). On the one hand, current methods that produce very high fidelity results, typically require controlled settings, expe...
Article
As many different 3D volumes could produce the same 2D x-ray image, inverting this process is challenging. We show that recent deep learning-based convolutional neural networks can solve this task. As the main challenge in learning is the sheer amount of data created when extending the 2D image into a 3D volume, we suggest firstly to learn a coarse...
Conference Paper
Full-text available
How much does a single image reveal about the environment it was taken in? In this paper, we investigate how much of that information can be retrieved from a foreground object, combined with the background (i.e. the visible part of the environment). Assuming it is not perfectly diffuse, the foreground object acts as a complexly shaped and far-from-...