Petr Kellnhofer

Massachusetts Institute of Technology | MIT · Computer Science and Artificial Intelligence Laboratory

Dr.-Ing.

About

49
Publications
10,733
Reads
1,765
Citations
Additional affiliations
June 2015 - January 2016
Massachusetts Institute of Technology
Position
  • Visiting PhD Student
September 2012 - December 2016
Max Planck Institute for Informatics
Position
  • PhD Student
September 2007 - June 2012
University of West Bohemia
Position
  • Student
Education
September 2012 - November 2016
Universität des Saarlandes
Field of study
  • Computer Graphics
September 2010 - June 2012
University of West Bohemia
Field of study
  • Computer Graphics
September 2007 - June 2010
University of West Bohemia
Field of study
  • Computer Sciences

Publications

Preprint
Unsupervised learning of 3D-aware generative adversarial networks (GANs) using only collections of single-view 2D photographs has very recently made much progress. These 3D GANs, however, have not been demonstrated for human bodies and the generated radiance fields of existing frameworks are not directly editable, limiting their applicability in do...
Article
Virtual and augmented reality (VR/AR) displays strive to provide a resolution, framerate and field of view that matches the perceptual capabilities of the human visual system, all while constrained by limited compute budgets and transmission bandwidths of wearable computing systems. Foveated graphics techniques have emerged that could achieve these...
Preprint
Novel view synthesis is a long-standing problem in machine learning and computer vision. Significant progress has recently been made in developing neural scene representations and rendering techniques that synthesize photorealistic images from arbitrary views. These representations, however, are extremely slow to train and often also slow to render...
Article
Full-text available
A Correction to this paper has been published: https://doi.org/10.1038/s41586-021-03476-5.
Preprint
Virtual and augmented reality (VR/AR) displays strive to provide a resolution, framerate and field of view that matches the perceptual capabilities of the human visual system, all while constrained by limited compute budgets and transmission bandwidths of wearable computing systems. Foveated graphics techniques have emerged that could achieve these...
Preprint
Novel view synthesis is a challenging and ill-posed inverse rendering problem. Neural rendering techniques have recently achieved photorealistic image quality for this task. State-of-the-art (SOTA) neural volume rendering approaches, however, are slow to train and require minutes of inference (i.e., rendering) time for high image resolutions. We ad...
Article
Full-text available
The ability to present three-dimensional (3D) scenes with continuous depth sensation has a profound impact on virtual and augmented reality, human–computer interaction, education and training. Computer-generated holography (CGH) enables high-spatio-angular-resolution 3D projection via numerical simulation of diffraction and interference¹. Yet, exis...
Preprint
We have witnessed rapid progress on 3D-aware image synthesis, leveraging recent advances in generative visual models and neural rendering. Existing approaches, however, fall short in two ways: first, they may lack an underlying 3D representation or rely on view-inconsistent rendering, hence synthesizing images that are not multi-view consistent; seco...
Preprint
Understanding where people are looking is an informative social cue. In this work, we present Gaze360, a large-scale gaze-tracking dataset and method for robust 3D gaze estimation in unconstrained images. Our dataset consists of 238 subjects in indoor and outdoor environments with labelled 3D gaze across a wide range of head poses and distances. It...
Article
Full-text available
Humans can feel, weigh and grasp diverse objects, and simultaneously infer their material properties while applying the right amount of force, a challenging set of tasks for a modern robot¹. Mechanoreceptor networks that provide sensory feedback and enable the dexterity of the human grasp² remain difficult to replicate in robots. Whereas computer-vi...
Preprint
Motivated by the recent potential of mass customization brought by whole-garment knitting machines, we introduce the new problem of automatic machine instruction generation using a single image of the desired physical product, which we apply to machine knitting. We propose to tackle this problem by directly learning to synthesize regular machine in...
Article
Motivated by the recent potential of mass customization brought by whole-garment knitting machines, we introduce the new problem of automatic machine instruction generation using a single image of the desired physical product, which we apply to machine knitting. We propose to tackle this problem by directly learning to synthesize regular machine in...
Article
© 2019 IEEE. Understanding where people are looking is an informative social cue. In this work, we present Gaze360, a large-scale remote gaze-tracking dataset and method for robust 3D gaze estimation in unconstrained images. Our dataset consists of 238 subjects in indoor and outdoor environments with labelled 3D gaze across a wide range of head poses...
Preprint
We introduce a saliency-based distortion layer for convolutional neural networks that helps to improve the spatial sampling of input data for a given task. Our differentiable layer can be added as a preprocessing block to existing task networks and trained altogether in an end-to-end fashion. The effect of the layer is to efficiently estimate how t...
Chapter
Illumination is a critical element of photography and is essential for many computer vision tasks. Flash light is unique in the sense that it is a widely available tool for easily manipulating the scene illumination. We present a dataset of thousands of ambient and flash illumination pairs to enable studying flash photography and other applications...
Chapter
We introduce a saliency-based distortion layer for convolutional neural networks that helps to improve the spatial sampling of input data for a given task. Our differentiable layer can be added as a preprocessing block to existing task networks and trained altogether in an end-to-end fashion. The effect of the layer is to efficiently estimate how t...
Book
Illumination is a critical element of photography and is essential for many computer vision tasks. Flash light is unique in the sense that it is a widely available tool for easily manipulating the scene illumination. We present a dataset of thousands of ambient and flash illumination pairs to enable studying flash photography and other applications...
Article
© Springer Nature Switzerland AG 2018. We introduce a saliency-based distortion layer for convolutional neural networks that helps to improve the spatial sampling of input data for a given task. Our differentiable layer can be added as a preprocessing block to existing task networks and trained altogether in an end-to-end fashion. The effect of the l...
Conference Paper
Full-text available
Accommodative depth cues, a wide field of view, and ever-higher resolutions present major design challenges for near-eye displays. Optimizing a design to overcome one of them typically leads to a trade-off in the others. We tackle this problem by introducing an all-in-one solution - a novel display for augmented reality. The key components of our s...
Article
Full-text available
Stereoscopic 3D (S3D) movies have become widely popular in movie theaters, but the adoption of S3D at home is low even though most TV sets support S3D. It is widely believed that S3D with glasses is not the right approach for the home. A much more appealing approach is to use automultiscopic displays that provide a glasses-free 3D experience t...
Article
We propose a system to infer binocular disparity from a monocular video stream in real time. Different from classic reconstruction of physical depth in computer vision, we compute perceptually plausible disparity that is numerically inaccurate but results in a very similar overall depth impression with plausible overall layout, sharp edges, fine...
Article
Full-text available
Accommodative depth cues, a wide field of view, and ever-higher resolutions all present major hardware design challenges for near-eye displays. Optimizing a design to overcome one of these challenges typically leads to a trade-off in the others. We tackle this problem by introducing an all-in-one solution – a new wide field of view, gaze-tracked ne...
Article
Binocular disparity is the main depth cue that makes stereoscopic images appear 3D. However, in many scenarios, the range of depth that can be reproduced by this cue is greatly limited and typically fixed due to constraints imposed by displays. For example, due to the low angular resolution of current automultiscopic screens, they can only reproduc...
Conference Paper
Binocular disparity is the main depth cue that makes stereoscopic images appear 3D. However, in many scenarios, the range of depth that can be reproduced by this cue is greatly limited and typically fixed due to constraints imposed by displays. For example, due to the low angular resolution of current automultiscopic screens, they can only reproduc...
Article
Predicting human visual perception has several applications such as compression, rendering, editing, and retargeting. Current approaches, however, ignore the fact that the human visual system compensates for geometric transformations, e.g., we see that an image and a rotated copy are identical. Instead, they will report a large, false-positive diff...
Article
Producing a high quality stereoscopic impression on current displays is a challenging task. The content has to be carefully prepared in order to maintain visual comfort, which typically affects the quality of depth reproduction. In this work, we show that this problem can be significantly alleviated when the eye fixation regions can be roughly esti...
Article
Full-text available
From scientific research to commercial applications, eye tracking is an important tool across many domains. Despite its range of applications, eye tracking has yet to become a pervasive technology. We believe that we can put the power of eye tracking in everyone's palm by building eye tracking software that works on commodity hardware such as mobil...
Thesis
Full-text available
Virtual and Augmented Reality applications typically rely on stereoscopic presentation and involve intensive object and observer motion. A combination of high dynamic range and stereoscopic capabilities is becoming popular for consumer displays and is a desirable functionality of head-mounted displays to come. The thesis is focused on complex inte...
Conference Paper
Full-text available
Different from classic reconstruction of physical depth in computer vision, depth for 2D-to-3D stereo conversion is assigned by humans using semi-automatic painting interfaces and, consequently, is often dramatically wrong. Here we seek to better understand why it still does not fail to convey a sensation of depth. To this end, four typical dispari...
Article
When human luminance perception operates close to its absolute threshold, i.e., the lowest perceivable absolute values, appearance changes substantially compared to common photopic or scotopic vision. In particular, most observers report perceiving temporally-varying noise. Two reasons are physiologically plausible: quantum noise (due to the low a...
Conference Paper
Full-text available
Predicting human visual perception has several applications such as compression, rendering, editing and retargeting. Current approaches, however, ignore the fact that the human visual system compensates for geometric transformations, e.g., we see that an image and a rotated copy are identical. Instead, they will report a large, false-positive diffe...
Article
Full-text available
Predicting human visual perception has several applications such as compression, rendering, editing and retargeting. Current approaches, however, ignore the fact that the human visual system compensates for geometric transformations, e.g., we see that an image and a rotated copy are identical. Instead, they will report a large, false-positive diffe...
Conference Paper
Full-text available
Several approaches attempt to reproduce the appearance of a scotopic low-light night scene on a photopic display (“day-for-night”) by introducing color desaturation, loss of acuity, and the Purkinje shift toward blue colors. We argue that faithful stereo reproduction of night scenes on photopic stereo displays requires manipulation of not only colo...
Article
Full-text available
Several approaches attempt to reproduce the appearance of a scotopic low-light night scene on a photopic display ("day-for-night") by introducing color desaturation, loss of acuity, and the Purkinje shift toward blue colors. We argue that faithful stereo reproduction of night scenes on photopic stereo displays requires manipulation of not only colo...
Article
Presenting stereoscopic content on 3D displays is a challenging task, usually requiring manual adjustments. A number of techniques have been developed to aid this process, but they account for binocular disparity of surfaces that are diffuse and opaque only. However, combinations of transparent as well as specular materials are common in the real a...
Article
Full-text available
This paper investigates the presentation of moving stereo images on different display devices. We address three important issues. First, we propose temporal compensation for the Pulfrich effect when using anaglyph glasses. Second, we describe how content-adaptive capture protocols can reduce false motion-in-depth sensation for time-multiplexing ba...
Article
Beyond the careful design of stereo acquisition equipment and rendering algorithms, disparity post-processing has recently received much attention, where one of the key tasks is to compress the originally large disparity range to avoid viewing discomfort. The perception of dynamic stereo content, however, relies on reproducing the full disparity-tim...
Conference Paper
Full-text available
Musculoskeletal modelling and simulation is an essential step in the process of looking for an optimal strategy to provide patients suffering from various musculoskeletal disorders, such as osteoporosis, with better health care. In our previous work, we proposed a deformation method suitable for clinical practice that deforms each muscle represe...
Conference Paper
Full-text available
This paper proposes a gradient domain deformation for wrapping surface models of muscles around bones as they move during a simulation of physiological activities. Each muscle is associated with one or more poly-lines that represent the muscle skeleton to which the surface model of the muscle is bound so that transformation of the skeleton (caused...
