December 2024 · 4 Reads · 6 Citations
October 2024 · 1 Read · 1 Citation
October 2024 · 3 Reads · 1 Citation
September 2024 · 9 Reads
State-of-the-art techniques for 3D reconstruction are largely based on volumetric scene representations, which require sampling multiple points to compute the color arriving along a ray. Using these representations for more general inverse rendering -- reconstructing geometry, materials, and lighting from observed images -- is challenging because recursively path-tracing such volumetric representations is expensive. Recent works alleviate this issue through the use of radiance caches: data structures that store the steady-state, infinite-bounce radiance arriving at any point from any direction. However, these solutions rely on approximations that introduce bias into the renderings and, more importantly, into the gradients used for optimization. We present a method that avoids these approximations while remaining computationally efficient. In particular, we leverage two techniques to reduce variance for unbiased estimators of the rendering equation: (1) an occlusion-aware importance sampler for incoming illumination and (2) a fast cache architecture that can be used as a control variate for the radiance from a high-quality, but more expensive, volumetric cache. We show that removing these biases improves the generality of radiance-cache-based inverse rendering and increases quality in the presence of challenging light transport effects such as specular reflections.
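To make the control-variate idea above concrete, here is a minimal Python sketch of how a cheap cache g with a cheaply computable expectation can reduce the variance of an unbiased Monte Carlo estimate of an expensive quantity f. The function name, fixed coefficient, and one-dimensional setup are illustrative assumptions, not the paper's implementation.

    import numpy as np

    def control_variate_estimate(f_samples, g_samples, g_expectation, beta=1.0):
        # E[f - beta * g] + beta * E[g] equals E[f] for any fixed beta,
        # so the estimator remains unbiased; its variance shrinks
        # whenever the cheap cache g is correlated with the expensive f.
        return np.mean(f_samples - beta * g_samples) + beta * g_expectation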
July 2024 · 26 Reads · 19 Citations
ACM Transactions on Graphics
While surface-based view synthesis algorithms are appealing due to their low computational requirements, they often struggle to reproduce thin structures. In contrast, more expensive methods that model the scene's geometry as a volumetric density field (e.g. NeRF) excel at reconstructing fine geometric detail. However, density fields often represent geometry in a "fuzzy" manner, which hinders exact localization of the surface. In this work, we modify density fields to encourage them to converge towards surfaces, without compromising their ability to reconstruct thin structures. First, we employ a discrete opacity grid representation instead of a continuous density field, which allows opacity values to discontinuously transition from zero to one at the surface. Second, we anti-alias by casting multiple rays per pixel, which allows occlusion boundaries and subpixel structures to be modelled without using semi-transparent voxels. Third, we minimize the binary entropy of the opacity values, which facilitates the extraction of surface geometry by encouraging opacity values to binarize towards the end of training. Lastly, we develop a fusion-based meshing strategy followed by mesh simplification and appearance model fitting. The compact meshes produced by our model can be rendered in real time on mobile devices and achieve significantly higher view synthesis quality compared to existing mesh-based approaches. Our interactive web demo is available at https://binary-opacity-grid.github.io.
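As a sketch of the binary-entropy regularizer described above (third point), the following Python function is zero exactly when every opacity is 0 or 1, so minimizing it drives opacities to binarize; the name and exact weighting are assumptions for illustration.

    import numpy as np

    def binary_entropy_loss(opacity, eps=1e-7):
        # Clamp away from 0 and 1 to keep the logarithms finite.
        o = np.clip(opacity, eps, 1.0 - eps)
        # Per-value binary entropy, minimized at the endpoints {0, 1}.
        return float(np.mean(-o * np.log(o) - (1.0 - o) * np.log(1.0 - o)))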
June 2024 · 14 Reads · 91 Citations
June 2024 · 2 Citations
June 2024 · 2 Reads · 5 Citations
May 2024 · 12 Reads
Neural Radiance Fields (NeRFs) typically struggle to reconstruct and render highly specular objects, whose appearance varies quickly with changes in viewpoint. Recent works have improved NeRF's ability to render detailed specular appearance of distant environment illumination, but are unable to synthesize consistent reflections of closer content. Moreover, these techniques rely on large computationally expensive neural networks to model outgoing radiance, which severely limits optimization and rendering speed. We address these issues with an approach based on ray tracing: instead of querying an expensive neural network for the outgoing view-dependent radiance at points along each camera ray, our model casts reflection rays from these points and traces them through the NeRF representation to render feature vectors, which are decoded into color using a small, inexpensive network. We demonstrate that our model outperforms prior methods for view synthesis of scenes containing shiny objects, and that it is the only existing NeRF method that can synthesize photorealistic specular appearance and reflections in real-world scenes, while requiring optimization time comparable to current state-of-the-art view synthesis models.
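The core ray-tracing step described above can be sketched as follows: rather than querying a large network for view-dependent radiance, a reflection ray is spawned at each sample point by mirroring the view direction about the surface normal and then traced through the scene representation. This minimal NumPy sketch covers only the reflection geometry; names are illustrative.

    import numpy as np

    def reflection_ray(point, view_dir, normal):
        # view_dir: unit direction from the camera toward the point.
        # normal:   unit surface normal at the point.
        refl_dir = view_dir - 2.0 * np.dot(view_dir, normal) * normal
        # The reflected ray starts at the surface point and is traced
        # through the volumetric representation to gather features.
        return point, refl_dir / np.linalg.norm(refl_dir)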
April 2024 · 413 Reads · 35 Citations
The field of visual computing is rapidly advancing due to the emergence of generative artificial intelligence (AI), which unlocks unprecedented capabilities for the generation, editing, and reconstruction of images, videos, and 3D scenes. In these domains, diffusion models are the generative AI architecture of choice. Within the last year alone, the literature on diffusion-based tools and applications has seen exponential growth, and relevant papers are published across the computer graphics, computer vision, and AI communities, with new works appearing daily on arXiv. This rapid growth of the field makes it difficult to keep up with all recent developments. The goal of this state-of-the-art report (STAR) is to introduce the basic mathematical concepts of diffusion models and the implementation details and design choices of the popular Stable Diffusion model, and to survey important aspects of these generative AI tools, including personalization, conditioning, and inversion, among others. Moreover, we give a comprehensive overview of the rapidly growing literature on diffusion-based generation and editing, categorized by the type of generated medium, including 2D images, videos, 3D objects, locomotion, and 4D scenes. Finally, we discuss available datasets, metrics, open challenges, and social implications. This STAR provides an intuitive starting point for researchers, artists, and practitioners alike to explore this exciting topic.
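As a pointer to the basic mathematical concepts the report introduces, the following is a minimal NumPy sketch of the forward (noising) process of a DDPM-style diffusion model; the linear beta schedule follows the original DDPM paper, and the names and shapes are illustrative assumptions.

    import numpy as np

    # Linear beta schedule with 1000 steps, as in the original DDPM paper.
    betas = np.linspace(1e-4, 0.02, 1000)
    alphas_cumprod = np.cumprod(1.0 - betas)

    def forward_diffusion_sample(x0, t, rng):
        # Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I).
        abar_t = alphas_cumprod[t]
        noise = rng.standard_normal(x0.shape)
        x_t = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * noise
        return x_t, noise  # the noise is the denoiser's regression target

    x0 = np.zeros((3, 32, 32))  # placeholder "image"
    x_t, eps = forward_diffusion_sample(x0, t=500, rng=np.random.default_rng(0))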
... While other significant subsequent projects have been published very recently, e.g., NeRF-Casting [11] or UniSDF [12], they fall outside the scope of this work. ...
December 2024
... Most modern approaches heavily leverage the success of the computer graphics community, which has produced accurate and efficient models for the "forward" problem of rendering an image from an underlying 3D model [34]. Many modern inverse rendering techniques use radiance fields that, rather than mapping a 3D location and viewing direction to an outgoing color, map a 3D location to material properties and surface normals, which are then rendered according to some estimate of incident illumination [2,7,10,11,14,23,29,31,41,42]. Because this problem is inherently ill-posed, these techniques often critically rely on analytical priors to regularize the estimated scene decomposition. ...
October 2024
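A minimal illustration of the inverse-rendering parameterization described in the snippet above: a field maps a 3D point to material properties and a surface normal, which are then shaded under an estimated illumination. The Lambertian model and directional-light setup here are simplifying assumptions for illustration only.

    import numpy as np

    def shade_lambertian(albedo, normal, light_dirs, light_colors):
        # albedo:       (3,)   RGB reflectance predicted by the material field.
        # normal:       (3,)   unit surface normal predicted by the field.
        # light_dirs:   (L, 3) unit directions toward each estimated light.
        # light_colors: (L, 3) RGB radiance of each estimated light.
        cosines = np.clip(light_dirs @ normal, 0.0, None)          # (L,)
        incident = (cosines[:, None] * light_colors).sum(axis=0)   # (3,)
        return albedo / np.pi * incident  # Lambertian BRDF is albedo / pi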
... Curved Diffusion [39] explores conditioning T2I diffusion on diverse optical geometries to simulate lens-induced distortions, such as fisheye and panoramic effects. Additionally, some methods achieve zoom effects by implicitly modeling focal length [9,43]. ...
June 2024
... While significant progress has been made in rendering reflective objects, challenges arising from complex light interactions persist. Recent years have seen numerous studies addressing these issues, primarily by decomposing appearance into lighting and material properties (Bi et al., 2020; Boss et al., 2021; Li & Li, 2022; Srinivasan et al., 2020; Zhang et al., 2021b; Munkberg et al., 2022; Zhang et al., 2021a; Verbin et al., 2024a; Zhao et al., 2024). Building on this foundation, some research has focused on improving the capture and reproduction of specular reflections (Verbin et al., 2022; Verbin et al., 2024b), while others have leveraged signed distance functions (SDFs) to enhance normal estimation (Ge et al., 2023; Liang et al., 2023a; Liu et al., 2023b). ...
June 2024
... 3D scene reconstruction from multi-view images is a longstanding topic in computer vision. Recent advances in neural implicit representations [30,57] have enabled significant progress in novel-view rendering [3,18,93,95] and 3D geometry reconstruction [62,85,97]. Despite these advances, existing approaches are limited by representing an entire scene as a whole. ...
June 2024
... A key design decision is the choice of 3D scene representation to establish correspondences and capture structure. Popular representations include multi-plane images [4,13,27,65,71,72,82], neural fields [40,56], voxel grids [41,50,61], and Gaussian splatting [22]. NVS methods can be broadly divided into those that optimize scene representation at test time and those that directly predict it through a feed-forward network [5,12,52,59,67,78]. ...
July 2024
ACM Transactions on Graphics
... Cross-attention is a variant of self-attention [44] in which the attention mechanism is applied between the image latent and the conditioning embedding. Compared to the augmentation performed in CFG, cross-attention has been shown to handle complex conditioning information [45], which could help capture variations in observation positions. Santos et al. [21] have investigated applying a cross-attention-based deterministic method to field reconstruction tasks and demonstrated promising results. ...
April 2024
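The cross-attention mechanism described in the snippet above can be sketched in a few lines: queries come from the image latent, while keys and values come from the conditioning embedding. This single-head NumPy version omits multi-head projection and masking; all names and shapes are illustrative.

    import numpy as np

    def cross_attention(latent, cond, Wq, Wk, Wv):
        # latent: (N, d_model) image-latent tokens (queries).
        # cond:   (M, d_cond)  conditioning tokens (keys and values).
        q = latent @ Wq                                 # (N, d)
        k = cond @ Wk                                   # (M, d)
        v = cond @ Wv                                   # (M, d)
        scores = q @ k.T / np.sqrt(q.shape[-1])         # (N, M)
        scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax over cond tokens
        return weights @ v                              # (N, d)

    rng = np.random.default_rng(0)
    latent = rng.normal(size=(16, 64))  # 16 latent tokens
    cond = rng.normal(size=(8, 48))     # 8 conditioning tokens (e.g., a text embedding)
    Wq, Wk, Wv = rng.normal(size=(64, 32)), rng.normal(size=(48, 32)), rng.normal(size=(48, 32))
    out = cross_attention(latent, cond, Wq, Wk, Wv)  # (16, 32)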
... Vanilla NeRF involves querying a deep MLP model millions of times [13], leading to slow training and rendering speeds. Some research efforts have tried to speed up this process by using more efficient sampling schemes [14]-[16], while others have applied improved data structures to represent objects or scenes [7], [8], [17]-[19]. In addition, to improve NeRF training on low-quality images, enhancements have been made to handle degradations such as blurring [20]-[23], low light [24], and reflections [25], [26]. ...
March 2024
IEEE Transactions on Pattern Analysis and Machine Intelligence
... To address these issues, we have developed a feed-forward photorealistic style transfer network based on 3D Gaussians, which can perform inference at real-time speeds. 3D Scene Editing of Radiance Fields: 3D scene editing in radiance fields has attracted considerable attention, with approaches incorporating physical properties [12,44] or utilizing 2D generative models [11,30,45]. However, NeRF-based methods are limited by implicit representations, resulting in high computational costs for optimization. ...
January 2024
IEEE Transactions on Pattern Analysis and Machine Intelligence
... Novel view synthesis (NVS) has been a long-standing challenge in computer vision and graphics (e.g., [1,7,19]). Recent neural 3D scene representations for NVS (e.g., [2,47]) have been very successful, but (i) are fit per-scene, avoiding the scale ambiguity issue, and (ii) cannot handle unseen or disoccluded scene areas. To handle such uncertainties, generative NVS (GNVS) models have been devised (e.g., [20,29,30,43]), with most recent methods applying diffusion models (DMs) [5,12,21,22,41,45-47]. ...
October 2023