ACM Transactions on Graphics

Published by Association for Computing Machinery
Online ISSN: 1557-7368
Print ISSN: 0730-0301
We introduce a computational framework for discovering regular or repeated geometric structures in 3D shapes. We describe and classify possible regular structures and present an effective algorithm for detecting such repeated geometric patterns in point- or mesh-based models. Our method assumes no prior knowledge of the geometry or spatial location of the individual elements that define the pattern. Structure discovery is made possible by a careful analysis of pairwise similarity transformations that reveals prominent lattice structures in a suitable model of transformation space. We introduce an optimization method for detecting such uniform grids specifically designed to deal with outliers and missing elements. This yields a robust algorithm that successfully discovers complex regular structures amidst clutter, noise, and missing geometry. The accuracy of the extracted generating transformations is further improved using a novel simultaneous registration method in the spatial domain. We demonstrate the effectiveness of our algorithm on a variety of examples and show applications to compression, model repair, and geometry synthesis.
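The grid-fitting step can be conveyed with a deliberately stripped-down 1-D sketch (our own illustration, not the paper's algorithm; all names are ours): treat each pairwise transformation as a scalar offset and score candidate generators by how many offsets snap to integer multiples of them, so outliers and missing elements simply fail to vote.

```python
def lattice_generator(offsets, tol=0.05):
    """Toy 1-D grid fitting in 'transformation space': given noisy
    offsets that should be integer multiples of an unknown generator g
    (plus outliers), score candidate generators drawn from the offsets
    themselves by how many samples snap to the lattice."""
    best, best_score = None, -1
    for g in offsets:
        if g <= tol:
            continue
        score = sum(1 for o in offsets
                    if abs(o / g - round(o / g)) * g <= tol)
        if score > best_score:
            best, best_score = g, score
    return best

# noisy multiples of 1.0 plus one outlier (7.77):
g = lattice_generator([1.0, 2.01, 2.99, 4.0, 7.77, 1.98])  # -> 1.0
```

The outlier never accumulates votes, so the consensus generator survives clutter, which is the intuition behind the paper's more general optimization.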
(a) Rendering a cityscape with a pinhole aperture results in no perceptible blur. The scene looks large and far away. (b) Simulating a 60-m-wide aperture produces blur consistent with a shallow depth of field, making the scene appear to be a miniature model.
Comparison of blur patterns produced by three rendering techniques: consistent blur (a), simulated tilt-and-shift lens (b), and linear blur gradient (c). The settings in (b) and (c) were chosen to equate the maximum blur-circle diameters with those in (a). The percent differences in blur-circle diameters between the images are plotted in (d), (e), and (f). Panels (d) and (e) show that the simulated tilt-and-shift lens and linear blur gradient do not closely approximate consistent blur rendering. The large differences are due to the buildings, which protrude from the ground plane. Panel (f) shows that the linear blur gradient provides essentially the same blur pattern as a simulated tilt-and-shift lens. Most of the differences in (f) are less than 7%; the only exceptions are in the band near the center, where the blur diameters are less than one pixel and not detectable in the final images.
Focal distance as a function of relative distance and retinal-image blur. Relative distance is defined as the ratio of the distance to an object and the distance to the focal plane. The three colored curves represent different amounts of image blur, expressed as the diameter of the blur circle, c, in degrees. We use angular units because in these units, the image device's focal length drops out [Kingslake 1992]. The variance in the distribution was determined by assuming that pupil diameter is Gaussian distributed with a mean of 4.6 mm and a standard deviation of 1 mm [Spring and Stiles 1948]. For a given amount of blur, it is impossible to recover the original focal distance without knowing the relative distance. Note that as the relative distance approaches 1, the object moves closer to the focal plane. There is a singularity at a relative distance of 1 because the object is by definition completely in focus at this distance.
We present a probabilistic model of how viewers may use defocus blur in conjunction with other pictorial cues to estimate the absolute distances to objects in a scene. Our model explains how the pattern of blur in an image together with relative depth cues indicates the apparent scale of the image's contents. From the model, we develop a semiautomated algorithm that applies blur to a sharply rendered image and thereby changes the apparent distance and scale of the scene's contents. To examine the correspondence between the model/algorithm and actual viewer experience, we conducted an experiment with human viewers and compared their estimates of absolute distance to the model's predictions. We did this for images with geometrically correct blur due to defocus and for images with commonly used approximations to the correct blur. The agreement between the experimental data and model predictions was excellent. The model predicts that some approximations should work well and that others should not. Human viewers responded to the various types of blur in much the way the model predicts. The model and algorithm allow one to manipulate blur precisely and to achieve the desired perceived scale efficiently.
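The blur geometry underlying the model can be sketched directly from the relation in the preceding figure caption (a minimal sketch assuming the standard thin-lens blur-circle equation in angular units; the function names are ours):

```python
def blur_circle_rad(aperture_m, focal_dist_m, obj_dist_m):
    """Angular blur-circle diameter: c = A * |1/d_f - 1/d|.
    In angular units the imaging device's focal length drops out,
    as noted in the caption above (cf. Kingslake 1992)."""
    return aperture_m * abs(1.0 / focal_dist_m - 1.0 / obj_dist_m)

def focal_distance(c_rad, rel_dist, aperture_m=0.0046):
    """Invert the blur equation given the relative distance d/d_f:
    d_f = A * |1 - 1/rel| / c. Singular at rel_dist == 1, where the
    object is in perfect focus and carries no blur information."""
    if rel_dist == 1.0:
        raise ValueError("object lies in the focal plane: zero blur")
    return aperture_m * abs(1.0 - 1.0 / rel_dist) / c_rad

# an object at twice the focal distance, seen through a 4.6 mm pupil:
c = blur_circle_rad(0.0046, 1.0, 2.0)   # 0.0023 rad of blur
d_f = focal_distance(c, 2.0)            # recovers the 1.0 m focal distance
```

This is exactly the ambiguity the caption describes: the same c is consistent with many (d_f, rel_dist) pairs, so blur alone cannot fix absolute distance without a relative-depth cue.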
We present a new meshless animation framework for elastic and plastic materials that fracture. Central to our method is a highly dynamic surface and volume sampling method that supports arbitrary crack initiation, propagation, and termination, while avoiding many of the stability problems of traditional mesh-based techniques. We explicitly model advancing crack fronts and associated fracture surfaces embedded in the simulation volume. When cutting through the material, crack fronts directly affect the coupling between simulation nodes, requiring a dynamic adaptation of the nodal shape functions. We show how local visibility tests and dynamic caching lead to an efficient implementation of these effects based on point collocation. Complex fracture patterns of interacting and branching cracks are handled using a small set of topological operations for splitting, merging, and terminating crack fronts. This allows continuous propagation of cracks with highly detailed fracture surfaces, independent of the spatial resolution of the simulation nodes, and provides effective mechanisms for controlling fracture paths. We demonstrate our method for a wide range of materials, from stiff elastic to highly plastic objects that exhibit brittle and/or ductile fracture.
Benefits of Reliable CCD Queries: We highlight the benefits of our exact CCD algorithm on cloth simulation. Our algorithm can be used to generate a plausible simulation (a). If parameters are not properly tuned, floating-point-based CCD algorithms (b) can result in penetrations and artifacts. 
Benchmarks: We use five different benchmarks arising from cloth and FEM simulations. 
We present fast algorithms to perform accurate CCD queries between triangulated models. Our formulation uses properties of the Bernstein basis and Bézier curves and reduces the problem to evaluating signs of polynomials. We present a geometrically exact CCD algorithm based on the exact geometric computation paradigm to perform reliable Boolean collision queries. Our algorithm is more than an order of magnitude faster than prior exact algorithms. We evaluate its performance for cloth and FEM simulations on CPUs and GPUs, and highlight the benefits.
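The reduction to polynomial signs can be sketched in miniature (a simplified floating-point stand-in of ours for the paper's exact-arithmetic formulation; all names are ours): under linear motion, the vertex-triangle coplanarity function is a cubic in t, and expressing it in the Bernstein basis gives a conservative sign test for roots in [0, 1].

```python
def det3(u, v, w):
    """3x3 determinant of three row vectors."""
    return (u[0]*(v[1]*w[2] - v[2]*w[1])
          - u[1]*(v[0]*w[2] - v[2]*w[0])
          + u[2]*(v[0]*w[1] - v[1]*w[0]))

def coplanarity(t, P, V):
    """Signed volume spanned by a triangle (P[0..2]) and a vertex (P[3]),
    all moving linearly: p_i(t) = P[i] + t*V[i]. Zero means coplanar."""
    p = [[P[i][k] + t*V[i][k] for k in range(3)] for i in range(4)]
    e = lambda i: [p[i][k] - p[0][k] for k in range(3)]
    return det3(e(1), e(2), e(3))

def cubic_coeffs(f):
    """Recover power-basis coefficients of an (at most) cubic f by
    interpolating four samples (Gauss-Jordan on the Vandermonde system)."""
    ts = [0.0, 1/3, 2/3, 1.0]
    A = [[t**k for k in range(4)] + [f(t)] for t in ts]
    for c in range(4):
        piv = max(range(c, 4), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(4):
            if r != c:
                m = A[r][c] / A[c][c]
                A[r] = [x - m*y for x, y in zip(A[r], A[c])]
    return [A[k][4] / A[k][k] for k in range(4)]

def may_collide(P, V):
    """Conservative vertex-triangle CCD test over t in [0, 1]: by the
    variation-diminishing property, if all Bernstein coefficients of
    the coplanarity cubic share a strict sign, there is no root."""
    a = cubic_coeffs(lambda t: coplanarity(t, P, V))
    b = [a[0],
         a[0] + a[1]/3,
         a[0] + 2*a[1]/3 + a[2]/3,
         a[0] + a[1] + a[2] + a[3]]
    return not (all(x > 0 for x in b) or all(x < 0 for x in b))

tri = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
hit = may_collide(tri + [(0.2, 0.2, 1.0)], [(0, 0, 0)]*3 + [(0, 0, -2.0)])
miss = may_collide(tri + [(0.2, 0.2, 1.0)], [(0, 0, 0)]*3 + [(0, 0, 1.0)])
```

The exact algorithm evaluates such coefficient signs with exact arithmetic and filters, which is where its reliability (and speed over prior exact methods) comes from.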
We propose an efficient scheme for evaluating nonlinear subspace forces (and Jacobians) associated with subspace deformations. The core problem we address is efficient integration of the subspace force density over the 3D spatial domain. Similar to Gaussian quadrature schemes that efficiently integrate functions lying in particular polynomial subspaces, we propose cubature schemes (multi-dimensional quadrature) optimized for efficient integration of force densities associated with particular subspace deformations, particular materials, and particular geometric domains. We support generic subspace deformation kinematics and nonlinear hyperelastic materials. For an r-dimensional deformation subspace with O(r) cubature points, our method is able to evaluate subspace forces at O(r^2) cost. We also describe composite cubature rules for runtime error estimation. Results are provided for various subspace deformation models, several hyperelastic materials (St. Venant-Kirchhoff, Mooney-Rivlin, Arruda-Boyce), and multimodal (graphics, haptics, sound) applications. We show dramatically better efficiency than traditional Monte Carlo integration. CR Categories: I.6.8 [Simulation and Modeling]: Types of Simulation - Animation; I.3.5 [Computer Graphics]: Computational Geometry and Object Modeling - Physically based modeling; G.1.4 [Mathematics of Computing]: Numerical Analysis - Quadrature and Numerical Differentiation.
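A 1-D toy version of cubature fitting conveys the idea (our illustration; the paper additionally optimizes the point set and enforces nonnegative weights against training deformations): fix a few points and solve for weights that exactly reproduce the integrals of a small training family of densities.

```python
def fit_cubature_weights(points, basis, integrals):
    """Solve sum_i w_i * f_j(x_i) = integral_j for the weights w
    (Gauss-Jordan elimination; the system is square and nonsingular
    for these points and basis functions)."""
    n = len(points)
    A = [[f(x) for x in points] + [I] for f, I in zip(basis, integrals)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(n):
            if r != c:
                m = A[r][c] / A[c][c]
                A[r] = [a - m * b for a, b in zip(A[r], A[c])]
    return [A[k][n] / A[k][k] for k in range(n)]

# train on the densities {1, x, x^2} over [0,1]; integrals are 1, 1/2, 1/3
pts = [0.0, 0.5, 1.0]
w = fit_cubature_weights(pts,
                         [lambda x: 1.0, lambda x: x, lambda x: x * x],
                         [1.0, 1/2, 1/3])
# the fitted rule now integrates any quadratic density exactly:
est = sum(wi * (3*x*x - 2*x + 1) for wi, x in zip(w, pts))  # -> 1.0
```

The fit recovers Simpson's rule (1/6, 2/3, 1/6), illustrating how a handful of weighted sample points can stand in for full-domain integration once the family of integrands is restricted.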
(A) Image formation with converging cameras. P_o is the coordinates of point P, f is the camera focal length, t is the separation between the cameras, C is the distance to which the camera optical axes are converged, V_c is the angle between the cameras' optical axes, W_c is the width of the camera sensors, and x_cl and x_cr are the x-coordinates of P's projections onto the left and right camera sensors. (B) The cameras' optical axes can be made to converge by laterally offsetting the sensors relative to the lens axes. h is the offset between the sensor center and the intersection of the lens axis with the sensor. (C) Reconstruction of P from sensor images. Rays are projected from the eye centers through corresponding points on the picture; the ray intersection is the estimated location of P. E_l and E_r are the 3d coordinates of the left and right eyes; P_l and P_r are the locations of the image points of P in the pictures for the left and right eyes; I is the inter-ocular distance; d is the distance between the centers of the pictures. The green and red horizontal lines represent the images presented to the left and right eyes, respectively.

… another, the optical axes can be parallel (h = 0, V_c = 0) or converging (h ≠ 0, V_c ≠ 0) (Figure 1A and 1B). P's coordinates in the left and right cameras are (x_cl, y_cl) and (x_cr, y_cr), where x and y are horizontal and vertical coordinates in the sensors:

x_cl = f tan[ tan^-1( (t/2 + P_o(x)) / P_o(z) ) - V_c/2 ] - h

… describe this: one with its origin on the display surface and one with its origin at the viewer. For the first set, X and Y are the horizontal and vertical axes centered on the display surface and Z is orthogonal to them. In these coordinates, the eyes' positions are E_l and E_r. The positions of the points in the picture are:

P_l = (X_sl, Y_sl, 0),  P_r = (X_sr, Y_sr, 0)
Estimated 3d scenes for different acquisition and viewing situations. Each panel is a plan view of the viewer, stereo cameras, display surface, actual 3d stimulus, and estimated 3d stimulus. Red lines represent cameras’ optical axes. E) Proper viewing situation. Parameters are listed in Section 3. The actual and estimated stimuli are the same. B) Viewer is too distant from picture. H) Viewer is too close. D) Viewer is too far to the left relative to the picture. F) Viewer is too far to the right. A) Cameras are too close together for viewer’s inter-ocular distance. I) Cameras are too far apart. C) Distance between centers of the left and right stereo pictures is too great. G) Distance between the centers of pictures is too small. 
Anaglyph stereograms captured with the acquisition settings listed in Section 3.1. Top: cameras with parallel optical axes. Bottom: cameras' optical axes were converged at 0.55 m (center of the cube). To view the stereograms, use red-green glasses with the green filter over the left eye. Try different viewing situations. 1) Move closer to and farther away from the page. 2) Move left and right while holding the head parallel to the page. 3) Position yourself directly in front of the page and rotate the head about a vertical axis (yaw) and then about a forward axis (roll). In each case, notice the changes in the cube's apparent shape. Points in the cube were randomly perturbed to lessen contributions of perspective cues to the 3d percept.
Disparity as a function of azimuth and elevation. Fick coordinates (azimuth and elevation measured as longitudes and latitudes, respectively) were used. Vectors represent the direction and magnitude of disparities on the retinas produced by a stereoscopic image of a cube 0.3 m on a side, placed 0.55 m in front of the stereo cameras. Unless otherwise noted, the conditions listed in Section 3 were used to generate the figures. Arrow tails represent points on the right eye's retina, and arrowheads represent the corresponding points on the left eye's retina. Panels A, B, and C contain points from the proximal face of the cube, where the eyes are fixating. D, E, and F represent the cube's distal face. In A and D, the observer is viewing the display at a 45° angle. In B and E, the viewer's head has been rolled 20°. In C and F, the cameras converge at 0.55 m.

… from stereoscopic displays, particularly when the acquisition and viewing parameters are improper, as necessarily occurs with multiple viewers. The standard model makes reasonable predictions in many situations, but fails to make predictions in some important ones that are known to produce misperceptions. Those situations involve rotation of the viewer's head relative to the display and the use of converging cameras in acquisition with single displays for viewing. The skew rays that occur in those situations give rise to vertical disparities in the retinal images that were not present before the viewer rotation or before converging cameras were used. We described findings in the vision-science literature that point to how the visual system determines 3d structure in these situations. In particular, the system uses vertical disparity as an additional signal for determining the structure. Preliminary observations are consistent with the predictions derived from this model.
3d shape and scene layout are often misperceived when viewing stereoscopic displays. For example, viewing from the wrong distance alters an object's perceived size and shape. It is crucial to understand the causes of such misperceptions so one can determine the best approaches for minimizing them. The standard model of misperception is geometric. The retinal images are calculated by projecting from the stereo images to the viewer's eyes. Rays are back-projected from corresponding retinal-image points into space and the ray intersections are determined. The intersections yield the coordinates of the predicted percept. We develop the mathematics of this model. In many cases its predictions are close to what viewers perceive. There are three important cases, however, in which the model fails: 1) when the viewer's head is rotated about a vertical axis relative to the stereo display (yaw rotation); 2) when the head is rotated about a forward axis (roll rotation); 3) when there is a mismatch between the camera convergence and the way in which the stereo images are displayed. In these cases, most rays from corresponding retinal-image points do not intersect, so the standard model cannot provide an estimate for the 3d percept. Nonetheless, viewers in these situations have coherent 3d percepts, so the visual system must use another method to estimate 3d structure. We show that the non-intersecting rays generate vertical disparities in the retinal images that do not arise otherwise. Findings in vision science show that such disparities are crucial signals in the visual system's interpretation of stereo images. We show that a model that incorporates vertical disparities predicts the percepts associated with improper viewing of stereoscopic displays. Improving the model of misperceptions will aid the design and presentation of 3d displays.
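The back-projection step of the standard model can be sketched as follows (a minimal construction of ours, with the display in the plane z = 0; where rays are skew we return the midpoint of closest approach, whereas the standard model proper simply has no intersection to report):

```python
def back_project(E_l, E_r, P_l, P_r):
    """Back-project rays from each eye (E_l, E_r) through its picture
    point (P_l, P_r) and return the midpoint of closest approach,
    which is the exact intersection when the rays meet."""
    sub = lambda a, b: [x - y for x, y in zip(a, b)]
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    u, v, w0 = sub(P_l, E_l), sub(P_r, E_r), sub(E_l, E_r)
    a, b, c = dot(u, u), dot(u, v), dot(v, v)
    d, e = dot(u, w0), dot(v, w0)
    den = a * c - b * b          # zero only for parallel rays
    t = (b * e - c * d) / den    # parameter along the left-eye ray
    s = (a * e - b * d) / den    # parameter along the right-eye ray
    p1 = [x + t * y for x, y in zip(E_l, u)]
    p2 = [x + s * y for x, y in zip(E_r, v)]
    return [(x + y) / 2 for x, y in zip(p1, p2)]

# eyes 6.4 cm apart, 0.5 m from the display; picture points chosen so the
# rays intersect exactly at the simulated point (0, 0, -0.5) behind it:
percept = back_project((-0.032, 0, 0.5), (0.032, 0, 0.5),
                       (-0.016, 0, 0.0), (0.016, 0, 0.0))
```

For the failure cases in the abstract (yaw, roll, convergence mismatch), the two rays are skew and this geometric recipe stops being a model of the percept, which is what motivates the vertical-disparity extension.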
Example stimuli. Left: The corner in the center looks undistorted. Right: The corner looks like an obtuse angle. The corners participants are asked to look at are indicated by the crosshairs (or by a blinking red dot in the experiment).
The hinge device (left) and our experimental setup (right).  
(a) The deviation of averaged perceived angles (Experiment 1) from 90° has a similar pattern as (b), the interpolated medians of ratings from Experiment 2 (both linearly interpolated). (c) Perceived angles predicted by Eq. 19 mapped to ratings using Eq. 20, and (d) the same for an extended domain.
Image-based rendering (IBR) creates realistic images by enriching simple geometries with photographs, e.g., mapping the photograph of a building façade onto a plane. However, as soon as the viewer moves away from the correct viewpoint, the image in the retina becomes distorted, sometimes leading to gross misperceptions of the original geometry. Two hypotheses from vision science state how viewers perceive such image distortions, one claiming that they can compensate for them (and therefore perceive scene geometry reasonably correctly), and one claiming that they cannot compensate (and therefore can perceive rather significant distortions). We modified the latter hypothesis so that it extends to street-level IBR. We then conducted a rigorous experiment that measured the magnitude of perceptual distortions that occur with IBR for façade viewing. We also conducted a rating experiment that assessed the acceptability of the distortions. The results of the two experiments were consistent with one another. They showed that viewers' percepts are indeed distorted, but not as severely as predicted by the modified vision science hypothesis. From our experimental results, we develop a predictive model of distortion for street-level IBR, which we use to provide guidelines for acceptability of virtual views and for capture camera density. We perform a confirmatory study to validate our predictions, and illustrate their use with an application that guides users in IBR navigation to stay in regions where virtual views yield acceptable perceptual distortions.
We describe how to create, with machine learning techniques, a generative, videorealistic speech animation module. A human subject is first recorded with a video camera as he or she utters a predetermined speech corpus. After processing the corpus automatically, a visual speech module is learned from the data that is capable of synthesizing the human subject's mouth uttering entirely novel utterances not recorded in the original video. The synthesized utterance is re-composited onto a background sequence that contains natural head and eye movement. The final output is videorealistic in the sense that it looks like a video-camera recording of the subject. At run time, the input to the system can be either real audio sequences or synthetic audio produced by a text-to-speech system, as long as it has been phonetically aligned.
Intuitive explanation. (a) The two endpoints of the curve detect the new points in A and B; (b) the grown curve.
We present a novel algorithm based on least-squares minimization to approximate point-cloud data in the 2D plane with a smooth B-spline curve. The point-cloud data may represent an open curve with self-intersections and sharp corners. Unlike existing methods such as the moving least-squares method and the principal curve method, our algorithm does not need a thinning process. The idea of our algorithm is intuitive and simple: we make a B-spline curve grow along the tangential directions at its two endpoints, following the local geometry of the point cloud. Our algorithm generates appropriate control points of the fitting B-spline curve in the least-squares sense. Although presented for the 2D case, our method extends in a straightforward manner to fitting data points with a B-spline curve in higher dimensions.
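The least-squares step can be sketched with a single cubic Bézier segment (a simplification of ours; the paper fits a growing, multi-segment B-spline): clamp the endpoints and solve the 2x2 normal equations for the two interior control points.

```python
def fit_cubic_bezier(pts, params):
    """Least-squares fit of the interior control points P1, P2 of a
    cubic Bezier with fixed endpoints P0 = pts[0], P3 = pts[-1].
    params[i] is the parameter assigned to pts[i] (assumed known here;
    chord-length parameters are a common practical choice)."""
    B = lambda t: ((1-t)**3, 3*t*(1-t)**2, 3*t*t*(1-t), t**3)
    P0, P3 = pts[0], pts[-1]
    a11 = a12 = a22 = 0.0
    rhs1, rhs2 = [0.0, 0.0], [0.0, 0.0]
    for q, t in zip(pts, params):
        b0, b1, b2, b3 = B(t)
        r = [q[k] - b0*P0[k] - b3*P3[k] for k in range(2)]
        a11 += b1*b1; a12 += b1*b2; a22 += b2*b2
        for k in range(2):
            rhs1[k] += b1 * r[k]
            rhs2[k] += b2 * r[k]
    det = a11*a22 - a12*a12
    P1 = [(a22*rhs1[k] - a12*rhs2[k]) / det for k in range(2)]
    P2 = [(a11*rhs2[k] - a12*rhs1[k]) / det for k in range(2)]
    return P0, P1, P2, P3

# sample a known curve and recover its control polygon:
ctrl = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
def bez(t):
    b = ((1-t)**3, 3*t*(1-t)**2, 3*t*t*(1-t), t**3)
    return tuple(sum(c[k]*w for c, w in zip(ctrl, b)) for k in range(2))
ts = [i / 10 for i in range(11)]
P0, P1, P2, P3 = fit_cubic_bezier([bez(t) for t in ts], ts)
```

Because the samples lie exactly on the curve and the parameters are exact, the fit recovers the true interior control points; with noisy data the same normal equations give the least-squares optimum.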
In this paper, we study the generation of maximal Poisson-disk sets with varying radii. First, we present a geometric analysis of gaps in such disk sets; this analysis is the basis for maximal and adaptive sampling in Euclidean space and on manifolds. Second, we propose efficient algorithms and data structures to detect gaps and to update them when disks are inserted, deleted, moved, or have their radii changed, building on the concepts of the regular triangulation and the power diagram. Third, we show how our analysis contributes to the state of the art in surface remeshing.
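As a toy illustration of sampling with varying radii (our own brute-force sketch; the paper's contribution is precisely to avoid this brute force, and to reach provable maximality, via power diagrams and explicit gap tracking):

```python
import random, math

def adaptive_dart_throwing(radius_fn, n_attempts=2000, seed=1):
    """Dart throwing in the unit square with a spatially varying radius
    field. Conflict rule (one of several conventions in the literature):
    two disks conflict when their centers are closer than the larger of
    the two radii. Rejection sampling, for illustration only."""
    rng = random.Random(seed)
    pts = []
    for _ in range(n_attempts):
        x, y = rng.random(), rng.random()
        r = radius_fn(x, y)
        if all(math.hypot(x - px, y - py) >= max(r, pr)
               for px, py, pr in pts):
            pts.append((x, y, r))
    return pts

# radii grow from 0.05 on the left to 0.10 on the right of the square:
samples = adaptive_dart_throwing(lambda x, y: 0.05 + 0.05 * x)
```

Plain rejection sampling stalls long before the domain is saturated; detecting the remaining gaps, which is what guarantees maximality, is the hard part the paper addresses.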
Photo retouching enables photographers to invoke dramatic visual impressions by artistically enhancing their photos through stylistic color and tone adjustments. However, it is also a time-consuming and challenging task that requires advanced skills beyond the abilities of casual photographers. Using an automated algorithm is an appealing alternative to manual work, but such an algorithm faces many hurdles. Many photographic styles rely on subtle adjustments that depend on the image content and even its semantics. Further, these adjustments are often spatially varying. Because of these characteristics, existing automatic algorithms are still limited and cover only a subset of these challenges. Recently, deep machine learning has shown unique abilities to address hard problems that have long resisted machine algorithms. This motivated us to explore the use of deep learning in the context of photo editing. In this paper, we explain how to formulate the automatic photo adjustment problem in a way suitable for this approach. We also introduce an image descriptor that accounts for the local semantics of an image. Our experiments demonstrate that our deep learning formulation, applied using these descriptors, successfully captures sophisticated photographic styles. In particular, and unlike previous techniques, it can model local adjustments that depend on the image semantics. We show on several examples that this yields results that are qualitatively and quantitatively better than previous work.
Mapping a line to a point and vice versa.
This paper describes a general-purpose programming technique, called Simulation of Simplicity, which can be used to cope with degenerate input data in geometric algorithms. It relieves the programmer of the task of providing a consistent treatment for every special case that can occur. Programs that use the technique tend to be considerably smaller and more robust than those that do not. We believe this technique will become a standard tool in writing geometric software.
Frequently, data in scientific computing is, in its abstract form, a finite point set in space, and it is sometimes useful or required to compute what one might call the ``shape'' of the set. For that purpose, this paper introduces the formal notion of the family of $\alpha$-shapes of a finite point set in $\Real^3$. Each shape is a well-defined polytope, derived from the Delaunay triangulation of the point set, with a parameter $\alpha \in \Real$ controlling the desired level of detail. An algorithm is presented that constructs the entire family of shapes for a given set of size $n$ in worst-case time $O(n^2)$. A robust implementation of the algorithm is discussed, and several applications in the area of scientific computing are mentioned.
The control polygon of a Bézier curve is well-defined and has geometric significance: there is a sequence of weights under which the limiting position of the curve is the control polygon. For a Bézier surface patch, there are many possible polyhedral control structures, and none are canonical. We propose a not-necessarily-polyhedral control structure for surface patches, regular control surfaces, which are certain C^0 spline surfaces. While not unique, regular control surfaces are exactly the possible limiting positions of a Bézier patch when the weights are allowed to vary.
Blue noise refers to sample distributions that are random and well-spaced, with a variety of applications in graphics, geometry, and optimization. However, prior blue noise sampling algorithms typically suffer from the curse of dimensionality, especially when striving to cover a domain maximally. This hampers their applicability to high-dimensional domains. We present a blue noise sampling method that can achieve high quality and performance across different dimensions. Our key idea is spoke-dart sampling: sampling locally from hyper-annuli centered at prior point samples, using lines, planes, or, more generally, hyperplanes. Spoke-dart sampling is more efficient in high dimensions than the state-of-the-art alternatives, global sampling and advancing-front point sampling. Spoke-dart sampling achieves good quality as measured by differential domain spectrum and spatial coverage. In particular, it probabilistically guarantees that each coverage gap is small, whereas global sampling can only guarantee that the sum of gaps is not large. We demonstrate advantages of our method through empirical analysis and applications across dimensions 8 to 23 in Delaunay graphs, global optimization, and motion planning.
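The core move can be illustrated with a tiny advancing-front sketch (our construction, not the paper's full algorithm, which shoots whole spokes and samples the uncovered portion): from a random prior sample, shoot a random direction and place a candidate in the annulus [r, 2r] along it.

```python
import random, math

def spoke_dart_sample(dim=8, r=0.6, n_spokes=200, seed=7):
    """Toy spoke-dart advancing front in R^dim: grow a point set from
    the origin by shooting random spokes from existing samples and
    keeping candidates that are at least r from every prior sample.
    Directions are uniform on the sphere via normalized Gaussians."""
    rng = random.Random(seed)
    pts = [[0.0] * dim]
    for _ in range(n_spokes):
        base = rng.choice(pts)
        d = [rng.gauss(0, 1) for _ in range(dim)]
        n = math.sqrt(sum(x * x for x in d))
        t = r * (1 + rng.random())            # radius in [r, 2r)
        cand = [b + t * x / n for b, x in zip(base, d)]
        if all(math.dist(cand, p) >= r for p in pts):
            pts.append(cand)
    return pts

pts = spoke_dart_sample()
```

Working outward along 1-D spokes keeps each proposal local and cheap, which is why the approach scales to the 8-to-23-dimensional settings in the abstract far better than global rejection sampling.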
In many graphics applications, the computation of exact geodesic distance is very important. However, the high computational cost of existing geodesic algorithms means that they are not practical for large-scale models or time-critical applications. To tackle this challenge, we propose the parallel Chen-Han (PCH) algorithm, which extends the classic Chen-Han (CH) discrete geodesic algorithm to the parallel setting. The original CH algorithm and its variants lack a parallel solution because the windows (a key data structure that carries the shortest-distance information in the wavefront propagation) are maintained in a strict order or a tightly coupled manner, so only one window is processed at a time. We propose dividing the sequential CH algorithm into four phases, window selection, window propagation, data organization, and event processing, so that there are no data dependences or conflicts within each phase and the operations in each phase can be carried out in parallel. The proposed PCH algorithm propagates a large number of windows simultaneously and independently. We also adopt a simple yet effective strategy to control the total number of windows. We implement the PCH algorithm on modern GPUs (such as the Nvidia GTX 580) and analyze the performance in detail. The performance improvement (compared to the sequential algorithms) is highly consistent with GPU double-precision performance (GFLOPS). Extensive experiments on real-world models demonstrate an order of magnitude improvement in execution time compared to the state of the art.
Performance of our point vs. line darts.
We formalize sampling a function using k-d darts. A k-d dart is a set of independent, mutually orthogonal, k-dimensional hyperplanes called k-d flats. A dart has d-choose-k flats, aligned with the coordinate axes for efficiency. We show that k-d darts are useful for exploring a function's properties, such as estimating its integral or finding an exemplar above a threshold. We describe a recipe for converting some algorithms from point sampling to k-d dart sampling, provided the function can be evaluated along a k-d flat. We demonstrate that k-d darts are more efficient than point-wise samples in high dimensions, depending on the characteristics of the domain: for example, when the subregion of interest has small volume and evaluating the function along a flat is not too expensive. We present three concrete applications using line darts (1-d darts): relaxed maximal Poisson-disk sampling, high-quality rasterization of depth-of-field blur, and estimation of the probability of failure from a response surface for uncertainty quantification. Line darts achieve the same output fidelity as point sampling in less time. For Poisson-disk sampling, we use less memory, enabling the generation of larger point distributions in higher dimensions. Higher-dimensional darts provide greater accuracy for a particular volume estimation problem.
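The line-dart idea can be illustrated with a tiny area estimator (our construction, assuming a region whose extent along any axis-aligned line is known in closed form, so each 1-d dart contributes an exact line integral rather than a single point evaluation):

```python
import random, math

def line_dart_area(chord_len, n=5000, seed=3):
    """Estimate the area of a region in the unit square with line darts:
    each dart is a random vertical line x = u, and chord_len(u) returns
    the region's exact extent along that line. The mean chord length
    over uniform u equals the area."""
    rng = random.Random(seed)
    return sum(chord_len(rng.random()) for _ in range(n)) / n

# disk of radius 0.4 centered in the unit square; true area = pi * 0.16
r = 0.4
disk_chord = lambda u: 2 * math.sqrt(max(0.0, r*r - (u - 0.5)**2))
area = line_dart_area(disk_chord)
```

Each dart integrates out one dimension analytically, which is why line darts have lower variance than the same number of point samples, the effect the abstract exploits in higher dimensions.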
Left: for a facade layout we show a terminal region, R i , with parameters (x i , y i , w i , h i ). Right: a selection of terminal regions that are used in the layout is shown on the top. Two nonterminal regions are shown on the bottom. The location of these regions in the layout is highlighted in red on the left. 
A visual representation of the effect of the split rule, F1 (left), and the repeat rule, F2 (right), described in the text. The rectangles of F1 and F2 are not drawn to scale. 
In this paper, we address the following research problem: How can we generate a meaningful split grammar that explains a given facade layout? To evaluate whether a grammar is meaningful, we propose a cost function based on description length and minimize this cost using an approximate dynamic programming framework. Our evaluation indicates that our framework extracts meaningful split grammars that are competitive with those of expert users, while some users and all competing automatic solutions are less successful.
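A much-simplified 1-D analogue of the description-length search (our illustration, not the paper's grammar induction; single characters stand in for terminal regions) shows how split and repeat rules compete under a description-length cost:

```python
from functools import lru_cache

def mdl(layout):
    """Minimum description length of a 1-D label string: a run is
    either stored verbatim (cost = its length), written as a repeat
    rule over a shorter pattern (cost of the pattern + 1), or split
    into two parts whose costs add. Exhaustive memoized search."""
    @lru_cache(maxsize=None)
    def cost(s):
        best = len(s)                        # verbatim terminals
        for w in range(1, len(s) // 2 + 1):  # repeat rule: s == pattern^k
            if len(s) % w == 0 and s == s[:w] * (len(s) // w):
                best = min(best, cost(s[:w]) + 1)
        for i in range(1, len(s)):           # split rule
            best = min(best, cost(s[:i]) + cost(s[i:]))
        return best
    return cost(layout)

# a window-door row compresses to a repeat over the pattern "WD":
score = mdl("WDWDWDWD")  # -> 3
```

The exhaustive search here is exponential in general; the paper's approximate dynamic programming exists precisely to make this kind of minimization tractable on real 2-D facade layouts.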
The Zerobit Encoding Scheme
A Sample Slice in the Cropped Region (Abdomen)  
Sample Slices from the four 3D Textures
Four Renderings of Polygonal Models with 3D Textures  
Aliasing Artifacts of Compression-Based 3D Texture Mapping (2X)  
This paper presents a new 3D RGB image compression scheme designed for interactive real-time applications. In designing our compression method, we sought a compromise between two competing goals, high compression ratio and fast random access, and tried to minimize the overhead incurred during runtime reconstruction. Our compression technique is suitable for applications wherein data are accessed in a somewhat unpredictable fashion and real-time decompression performance is necessary. Experimental results on three different kinds of 3D images, from medical imaging, image-based rendering, and solid texture mapping, suggest that the compression method can be used effectively in developing real-time applications that must handle large volume data made of color samples taken in three- or higher-dimensional space. Keywords: 3D volume data, Data compression, Haar wavelets, Random access, Interactive real-time applications, Medical imaging, Image-based rendering, 3D text...
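The combination of Haar wavelets with fast random access can be sketched in 1-D (our illustration; the paper's scheme operates on 3D RGB blocks with quantization and encoding details omitted here): a single sample is reconstructed by descending the coefficient tree, never touching the rest of the data.

```python
def haar_encode(x):
    """Full 1-D unnormalized Haar decomposition (len(x) a power of 2).
    Returns [overall mean, then detail coefficients coarse-to-fine]."""
    out = []
    while len(x) > 1:
        x, d = ([(a + b) / 2 for a, b in zip(x[::2], x[1::2])],
                [(a - b) / 2 for a, b in zip(x[::2], x[1::2])])
        out = d + out            # prepend so coarse details come first
    return x + out

def haar_sample(coeffs, i, n):
    """Reconstruct sample i alone in O(log n): start from the mean and
    add or subtract one detail coefficient per level, depending on
    which half of each block the sample falls in. This per-sample
    random access is what interactive decompression relies on."""
    val, levels = coeffs[0], n.bit_length() - 1
    base = 1
    for lvl in range(levels):
        d = coeffs[base + (i >> (levels - lvl))]
        bit = (i >> (levels - 1 - lvl)) & 1
        val = val - d if bit else val + d
        base += 1 << lvl
    return val

signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
coeffs = haar_encode(signal)
```

Truncating small detail coefficients gives compression, while the logarithmic reconstruction path preserves the random access the abstract emphasizes.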
Valuable 3D graphical models, such as high-resolution digital scans of cultural heritage objects, may require protection to prevent piracy or misuse, while still allowing for interactive display and manipulation by a widespread audience. We have investigated techniques for protecting 3D graphics content, and we have developed a remote rendering system suitable for sharing archives of 3D models while protecting the 3D geometry from unauthorized extraction. The system consists of a 3D viewer client that includes low-resolution versions of the 3D models, and a rendering server that renders and returns images of high-resolution models according to client requests. The server implements a number of defenses to guard against 3D reconstruction attacks, such as monitoring and limiting request streams, and slightly perturbing and distorting the rendered images. We consider several possible types of reconstruction attacks on such a rendering server, and we examine how these attacks can be defended against without excessively compromising the interactive experience for non-malicious users.
The digitization of the 3D shape of real objects is a rapidly expanding field, with applications in entertainment, design, and archaeology. We propose a new 3D model acquisition system that permits the user to rotate an object by hand and see a continuously updated model as the object is scanned. This tight feedback loop allows the user to find and fill holes in the model in real time and determine when the object has been completely covered. Our system is based on a 60 Hz structured-light rangefinder, a real-time variant of ICP (iterative closest points) for alignment, and point-based merging and rendering algorithms. We demonstrate the ability of our prototype to scan objects faster and with greater ease than conventional model acquisition pipelines.
Equifaced tetrahedron in a rectangular parallelepiped.
Two quite different Tutte embeddings of the same mesh. Colored points mark corresponding vertices.  
Parametrization of 3D mesh data is important for many graphics applications, in particular for texture mapping, remeshing and morphing. Closed manifold genus-0 meshes are topologically equivalent to a sphere, hence this is the natural parameter domain for them. Parametrizing a triangle mesh onto the sphere means assigning a 3D position on the unit sphere to each of the mesh vertices, such that the spherical triangles induced by the mesh connectivity do not overlap. Satisfying the non-overlapping requirement is the most difficult and critical component of this process. We present a generalization of the method of barycentric coordinates for planar parametrization which solves the spherical parametrization problem, prove its correctness by establishing a connection to spectral graph theory and describe efficient numerical methods for computing these parametrizations.
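The spherical method generalizes barycentric (Tutte) embeddings from the plane. As a minimal planar sketch (hypothetical graph and function names, not the paper's spherical algorithm), the boundary of a small grid mesh is pinned to a convex polygon and each interior vertex is repeatedly moved to the centroid of its neighbours; Tutte's theorem guarantees the planar result is overlap-free.

```python
def tutte_embed(neighbors, boundary_pos, interior, iters=300):
    """Barycentric (Tutte) embedding by Gauss-Seidel relaxation:
    pin the boundary to a convex polygon, then repeatedly move every
    interior vertex to the average of its neighbours' positions."""
    pos = dict(boundary_pos)
    for v in interior:
        pos[v] = (0.0, 0.0)                       # arbitrary start
    for _ in range(iters):
        for v in interior:
            nb = neighbors[v]
            pos[v] = (sum(pos[u][0] for u in nb) / len(nb),
                      sum(pos[u][1] for u in nb) / len(nb))
    return pos

# 4x3 grid mesh: vertex v = row*4 + col; vertices 5 and 6 are interior.
neighbors = {5: [1, 4, 6, 9], 6: [2, 5, 7, 10]}
boundary = {v: (float(v % 4), float(v // 4)) for v in range(12)
            if v not in (5, 6)}
embedding = tutte_embed(neighbors, boundary, [5, 6])
```

On the sphere there is no fixed convex boundary to pin, which is exactly why the generalization in the paper requires the spectral-graph-theoretic analysis described above.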
Reflection paths obey Fermat's variational principle, stating that the length of the optical path connecting p and q is a local extremum. For any p and q there may be many such paths.
One-bounce reflection images generated by the perturbation method for a polygon (left) and a solid object (right). The visibility in the right image is correctly handled by z-buffering. The results are nearly identical to the ray traced image, yet the perturbed images can be computed very rapidly (approximately 0.1 seconds per update) as the lizard-shaped polygon or the cube is moved interactively.  
In this paper we apply perturbation methods to the problem of computing specular reflections in curved surfaces. The key idea is to generate families of closely related optical paths by expanding a given path into a high-dimensional Taylor series. Our path perturbation method is based on closed-form expressions for linear and higher-order approximations of ray paths, which are derived using Fermat's Variational Principle and the Implicit Function Theorem. The perturbation formula presented here holds for general multiple-bounce reflection paths and provides a mathematical foundation for exploiting path coherence in ray tracing acceleration techniques and incremental rendering. To illustrate its use, we describe an algorithm for fast approximation of specular reflections on curved surfaces; the resulting images are of high accuracy and nearly indistinguishable from ray traced images. Keywords: perturbation theory, implicit surfaces, optics, ray tracing, specular reflection
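Fermat's variational principle is easy to verify numerically in the simplest setting. The sketch below (a hypothetical planar scene with a flat mirror along y = 0, not the paper's curved-surface formulation) locates the one-bounce reflection point by minimizing path length, and the result agrees with the classical equal-angles law.

```python
import math

def path_length(m, p, q):
    """Length of the one-bounce path p -> (m, 0) -> q off the mirror y = 0."""
    return math.dist(p, (m, 0.0)) + math.dist((m, 0.0), q)

def fermat_bounce(p, q, lo, hi, tol=1e-10):
    """Find the reflection point by ternary search on path length.
    Fermat's principle: the true path is a local extremum of this length
    (a minimum for a single bounce off a flat mirror)."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if path_length(m1, p, q) < path_length(m2, p, q):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2
```

For curved reflectors the extremum condition becomes an implicit equation in the bounce points, which is where the paper's Taylor-series perturbation of known paths replaces this brute-force search.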
Ray tracers, which sample radiance, are usually regarded as offline rendering algorithms that are too slow for interactive use. In this article we present a system that exploits object-space, ray-space, image-space, and temporal coherence to accelerate ray tracing. Our system uses per-surface interpolants to approximate radiance while conservatively bounding error. The techniques introduced in this article should enhance both interactive and batch ray tracers. Our approach explicitly decouples the two primary operations of a ray tracer - shading and visibility determination - and accelerates each of them independently. Shading is accelerated by quadrilinearly interpolating lazily acquired radiance samples. Interpolation error does not exceed a user-specified bound, allowing the user to control performance/quality tradeoffs. Error is bounded by adaptive sampling at discontinuities and radiance nonlinearities. Visibility determination at pixels is accelerated by reprojecting interpolants as the user's viewpoint changes. A fast scan-line algorithm then achieves high performance without sacrificing image quality. For a smoothly varying viewpoint, the combination of lazy interpolants and reprojection substantially accelerates the ray tracer. Additionally, an efficient cache management algorithm keeps the memory footprint of the system small with negligible overhead.
Good character animation requires convincing skin deformations including subtleties and details like muscle bulges. Such effects are typically created in commercial animation packages which provide very general and powerful tools. While these systems are convenient and flexible for artists, the generality often leads to characters that are slow to compute or that require a substantial amount of memory and thus cannot be used in interactive systems. Instead, interactive systems restrict artists to a specific character deformation model which is fast and memory efficient but is notoriously difficult to author and can suffer from many deformation artifacts. This paper presents an automated framework that allows character artists to use the full complement of tools in high-end systems to create characters for interactive systems. Our method starts with an arbitrarily rigged character in an animation system. A set of examples is exported, consisting of skeleton configurations paired with the deformed geometry as static meshes. Using these examples, we fit the parameters of a deformation model that best approximates the original data yet remains fast to compute and compact in memory. Keywords: Interactive, Skin, Approximation
We describe a process for compositing a live performance of an actor into a virtual set wherein the actor is consistently illuminated by the virtual environment. The Light Stage used in this work is a two-meter sphere of inward-pointing RGB light emitting diodes focused on the actor, where each light can be set to an arbitrary color and intensity to replicate a real-world or virtual lighting environment. We implement a digital two-camera infrared matting system to composite the actor into the background plate of the environment without affecting the visible-spectrum illumination on the actor. The color response of the system is calibrated to produce correct color renditions of the actor as illuminated by the environment. We demonstrate moving-camera composites of actors into real-world environments and virtual sets such that the actor is properly illuminated by the environment into which they are composited.
Custom-fabricated circuit board containing flip-flops and H-bridge transistor arrays.
Our puck design includes a permanent magnet and an infrared LED for vision tracking.
The electromagnets (lower left) exert forces on the puck (upper right).
Magnetic field interactions between electromagnets. The top images show magnetic flux lines and the bottom images map flux density to brightness. The three image pairs show the fields resulting from a single center magnet turned on (left), the left and center magnets turned on (center), and all three magnets turned on (right). The effect of this field-shifting behavior can be modeled approximately using force summation. These images were generated with the VisiMag software package [2].
The Actuated Workbench is a device that uses magnetic forces to move objects on a table in two dimensions. It is intended for use with existing tabletop tangible interfaces, providing an additional feedback loop for computer output, and helping to resolve inconsistencies that otherwise arise from the computer's inability to move objects on the table. We describe the Actuated Workbench in detail as an enabling technology, and then propose several applications in which this technology could be useful.
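The force-summation approximation mentioned in the field figure above can be sketched in a few lines. The model below is a deliberate simplification with hypothetical names and constants (each energized electromagnet attracts the puck's permanent magnet with an inverse-square pull, scaled by its duty cycle), not a measured field solution for the actual hardware.

```python
import math

def net_force(puck, magnets, k=1.0):
    """Approximate net 2-D force on the puck by summing an attractive
    inverse-square pull toward each energized electromagnet.
    `magnets` is a list of ((x, y), duty) with duty in [0, 1]."""
    fx = fy = 0.0
    for (mx, my), duty in magnets:
        dx, dy = mx - puck[0], my - puck[1]
        d2 = dx * dx + dy * dy
        d = math.sqrt(d2)
        f = k * duty / d2              # magnitude ~ 1/d^2 (assumed model)
        fx += f * dx / d               # unit vector toward the magnet
        fy += f * dy / d
    return fx, fy
```

Energizing neighbouring magnets with intermediate duty cycles shifts the summed force vector smoothly between grid positions, which is the basis for moving a puck along continuous paths.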
This paper introduces a unified and general tessellation algorithm for parametric and implicit surfaces. The algorithm produces a hierarchical mesh that is adapted to the surface geometry and has a multiresolution and progressive structure. This representation offers advantages in several applications.
Images generated with the simple Blinn-Phong shader, two parameter settings for the encapsulated Blinn-Phong shader, and the wood shader derived using shader algebra operators. See Listings 1, 2, and 3.
An algebra consists of a set of objects and a set of operators that act on those objects. We treat shader programs as first-class objects and define two operators: connection and combination. Connection is functional composition: the outputs of one shader are fed into the inputs of another. Combination concatenates the input channels, output channels, and computations of two shaders. Similar operators can be used to manipulate streams and apply computational kernels expressed as shaders to streams. Connecting a shader program to a stream applies that program to all elements of the stream; combining streams concatenates the record definitions of those streams.
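The two operators are easy to prototype if a shader is modeled as a function from an input record to an output record. The sketch below is a plain-Python analogue with hypothetical mini-shaders; the actual system operates on compiled GPU shader programs, not Python closures.

```python
def connect(f, g):
    """Connection: functional composition, feeding g's outputs into f."""
    return lambda rec: f(g(rec))

def combine(f, g):
    """Combination: run both shaders on the record and concatenate their
    output channels (channel names assumed disjoint)."""
    return lambda rec: {**f(rec), **g(rec)}

def apply_to_stream(shader, stream):
    """Connecting a shader to a stream applies it to every element."""
    return [shader(rec) for rec in stream]

# Hypothetical mini-shaders over named channels:
diffuse = lambda rec: {"diffuse": rec["n_dot_l"] * rec["albedo"]}
tone    = lambda rec: {"color": min(1.0, 2.0 * rec["diffuse"])}
shade   = connect(tone, diffuse)        # tone applied after diffuse
```

Treating shaders as first-class values means `shade` can itself be connected, combined, or mapped over a stream like any other shader.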
Intersection and projection of two ellipses. The algorithm for computing the resultant matrix is similar to Algorithm I. In particular, let P(x, y, α) = [F(x, y)G(x, α) − F(x, α)G(x, y)] / (y − α).
The problem of computing the intersection of parametric and algebraic curves arises in many applications of computer graphics and geometric and solid modeling. Previous algorithms are based on techniques from elimination theory or subdivision and iteration. The former, however, is restricted to low-degree curves, mainly due to issues of efficiency and numerical stability. In this paper we use elimination theory and express the resultant of the equations of intersection as a matrix determinant. The matrix itself, rather than its symbolic determinant (a polynomial), is used as the representation. The problem of intersection is reduced to computing the eigenvalues and eigenvectors of a numeric matrix. The main advantage of this approach lies in its efficiency and robustness. Moreover, the numerical accuracy of these operations is well understood. For almost all cases we are able to compute accurate answers in 64-bit IEEE floating point arithmetic. Keywords: Intersection, curves, a...
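The final numeric step, extracting the roots of the resultant polynomial, can be illustrated without an eigensolver. The sketch below substitutes a Durand-Kerner simultaneous root iteration for the paper's eigenvalue formulation (a simpler stand-in, with hypothetical function names); both recover the parameter values at which two curves can intersect.

```python
def poly_roots(coeffs, iters=200):
    """All complex roots of a monic polynomial with coefficients in
    ascending order [c0, c1, ..., 1], via Durand-Kerner iteration.
    Used here in place of an eigenvalue computation on the companion
    (or resultant) matrix."""
    n = len(coeffs) - 1

    def p(x):
        return sum(c * x ** k for k, c in enumerate(coeffs))

    roots = [(0.4 + 0.9j) ** k for k in range(n)]   # standard distinct seeds
    for _ in range(iters):
        new = []
        for i, r in enumerate(roots):
            denom = 1.0
            for j, s in enumerate(roots):
                if i != j:
                    denom *= r - s
            new.append(r - p(r) / denom)
        roots = new
    return roots
```

In the intersection setting, `coeffs` would hold the resultant of the two curve equations, so the real roots give the candidate x-coordinates of intersection points.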
In view of the fundamental role that functional composition plays in mathematics, it is not surprising that a variety of problems in geometric modeling can be viewed as instances of the following composition problem: given representations for two functions F and G, compute a representation of the function H = F ∘ G. We examine this problem in detail for the case when F and G are given in either Bézier or B-spline form. Blossoming techniques are used to gain theoretical insight into the structure of the solution, which is then used to develop efficient, tightly codable algorithms. From a practical point of view, if the composition algorithms are implemented as library routines, a number of geometric modeling problems can be solved with a small amount of additional software. This paper was published in TOG, April 1993, pp. 113-135. Categories and Subject Descriptors: I.3.5 [Computer Graphics]: Computational Geometry and Object Modeling - curve, surface, and object representations; J.6 [...
Distribution of saccade directions 
Overall system architecture  
(a) Original eye image from the eyetracker (left), (b) Output of Canny Enhancer (right) distribution  
For an animated human face model to appear natural it should produce eye movements consistent with human ocular behavior. During face-to-face conversational interactions, eyes exhibit conversational turn-taking and convey thought processes through gaze direction, saccades, and scan patterns. We have implemented an eye movement model based on empirical models of saccades and statistical models of eye-tracking data. Face animations using stationary eyes, eyes with random saccades only, and eyes with statistically derived saccades are compared to evaluate whether they appear natural and effective while communicating.
A box, shown in dark at the center, and its 24 neighbors. 
The box on the left shows the large grid and the one on the right shows the small grid as well as the subboxes. In this figure, the two grid parameters are 4 and 16.
The lower bound construction showing ρ = Ω(α√σ).
Heuristics that exploit bounding boxes are common in algorithms for rendering, modeling, and animation. While experience has shown that bounding boxes improve the performance of these algorithms in practice, previous theoretical analysis concluded that bounding boxes perform poorly in the worst case. This paper reconciles this discrepancy by analyzing intersections among n geometric objects in terms of two parameters: α, an upper bound on the aspect ratio or elongatedness of each object; and σ, an upper bound on the scale factor or size disparity between the largest and smallest objects. Letting K_o and K_b be the number of intersecting object pairs and bounding box pairs, respectively, we analyze a ratio measure of the bounding boxes' efficiency, ρ = K_b / (n + K_o). The analysis proves that ρ = O(α√σ log²σ) and ρ = Ω(α√σ). One important consequence is that if α and σ are small constants (as is often the case in practice), then K_b = O(K_o) + O(n), so an algorithm that uses bounding boxes has time complexity proportional to the number of actual object intersections. This theoretical result validates the efficiency that bounding boxes have demonstrated in practice. Another consequence of our analysis is a proof of the output-sensitivity of an algorithm for reporting all intersecting pairs in a set of n convex polyhedra with constant α and σ. The algorithm takes time O(n log^(d−1) n + K_o log^(d−1) n) for dimension d = 2, 3. This running time improves on the performance of previous algorithms, which make no assumptions about α and σ.
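The quantities K_b, K_o, and ρ are easy to measure directly on a toy scene. The sketch below (a hypothetical scene of discs, which have aspect ratio α = 1) counts overlapping axis-aligned bounding-box pairs against actually-intersecting object pairs; since disc intersection implies box overlap, K_o is counted inside the box-overlap branch.

```python
def boxes_overlap(a, b):
    """Axis-aligned box overlap: the intervals overlap on every axis."""
    (ax0, ay0, ax1, ay1), (bx0, by0, bx1, by1) = a, b
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1

def bbox_efficiency(circles):
    """Return (K_b, K_o, rho) for a set of discs given as (x, y, r)."""
    n = len(circles)
    boxes = [(x - r, y - r, x + r, y + r) for x, y, r in circles]
    k_b = k_o = 0
    for i in range(n):
        for j in range(i + 1, n):
            if boxes_overlap(boxes[i], boxes[j]):
                k_b += 1
                (x1, y1, r1), (x2, y2, r2) = circles[i], circles[j]
                if (x1 - x2) ** 2 + (y1 - y2) ** 2 <= (r1 + r2) ** 2:
                    k_o += 1
    return k_b, k_o, k_b / (n + k_o)
```

For scenes of fat, similarly-sized objects the theorem above says ρ stays bounded, so box-overlap pairs do not greatly outnumber true intersections.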
Real-time avatar control in our system. (Top) The user controls the avatar's motion using sketched paths in maze and rough terrain environments. (Bottom left) The user selects from a number of choices in a playground environment. (Bottom right) The user is controlling the avatar by performing a motion in front of a camera. In this case only, the avatar's motion lags the user's input by several seconds.
Step stool example. (Top left) Choice-based interface. (Top right) Sketch-based interface. (Middle and bottom left) The user performing a motion in front of a video camera and her silhouette extracted from the video. (Middle and bottom right) The avatar being controlled through the vision-based interface and the rendered silhouette that matches the user's silhouette.  
Real-time control of three-dimensional avatars is an important problem in the context of computer games and virtual environments. Avatar animation and control is difficult, however, because a large repertoire of avatar behaviors must be made available, and the user must be able to select from this set of behaviors, possibly with a low-dimensional input device. One appealing approach to obtaining a rich set of avatar behaviors is to collect an extended, unlabeled sequence of motion data appropriate to the application. In this paper, we show that such a motion database can be preprocessed for flexibility in behavior and efficient search and exploited for real-time avatar control. Flexibility is created by identifying plausible transitions between motion segments, and efficient search through the resulting graph structure is obtained through clustering. Three interface techniques are demonstrated for controlling avatar motion using this data structure: the user selects from a set of available choices, sketches a path through an environment, or acts out a desired motion in front of a video camera. We demonstrate the flexibility of the approach through four different applications and compare the avatar motion to directly recorded human motion.
We present the Rigid Fluid method, a technique for animating the interplay between rigid bodies and viscous incompressible fluid with free surfaces. We use distributed Lagrange multipliers to ensure two-way coupling that generates realistic motion for both the solid objects and the fluid as they interact with one another. We call our method the rigid fluid method because the simulator treats the rigid objects as if they were made of fluid. The rigidity of such an object is maintained by identifying the region of the velocity field that is inside the object and constraining those velocities to be rigid body motion. The rigid fluid method is straightforward to implement, incurs very little computational overhead, and can be added as a bridge between current fluid simulators and rigid body solvers. Many solid objects of different densities (e.g., wood or lead) can be combined in the same animation.
We present an algorithm to efficiently and robustly process collisions, contact and friction in cloth simulation. It works with any technique for simulating the internal dynamics of the cloth, and allows true modeling of cloth thickness. We also show how our simulation data can be post-processed with a collision-aware subdivision scheme to produce smooth and interference free data for rendering.
We describe how to create with machine learning techniques a generative, videorealistic, speech animation module. A human subject is first recorded using a videocamera as he/she utters a predetermined speech corpus. After processing the corpus automatically, a visual speech module is learned from the data that is capable of synthesizing the human subject's mouth uttering entirely novel utterances that were not recorded in the original video. The synthesized utterance is re-composited onto a background sequence which contains natural head and eye movement. The final output is videorealistic in the sense that it looks like a video camera recording of the subject. At run time, the input to the system can be either real audio sequences or synthetic audio produced by a text-to-speech system, as long as they have been phonetically aligned.
We present a physically based method for modeling and animating fire. Our method is suitable for both smooth (laminar) and turbulent flames, and it can be used to animate the burning of either solid or gas fuels. We use the incompressible Navier-Stokes equations to independently model both vaporized fuel and hot gaseous products. We develop a physically based model for the expansion that takes place when a vaporized fuel reacts to form hot gaseous products, and a related model for the similar expansion that takes place when a solid fuel is vaporized into a gaseous state. The hot gaseous products, smoke and soot, rise under the influence of buoyancy and are rendered using a blackbody radiation model. We also model and render the blue core that results from radicals in the chemical reaction zone where fuel is converted into products. Our method allows the fire and smoke to interact with objects, and flammable objects can catch on fire.
We discuss a method for creating animations that allows the animator to sketch an animation by setting a small number of keyframes on a fraction of the possible degrees of freedom. Motion capture data is then used to enhance the animation. Detail is added to degrees of freedom that were keyframed, a process we call texturing. Degrees of freedom that were not keyframed are synthesized. The method takes advantage of the fact that joint motions of an articulated figure are often correlated, so that given an incomplete data set, the missing degrees of freedom can be predicted from those that are present.
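The correlation-based synthesis of missing degrees of freedom can be illustrated with the simplest possible model: an ordinary least-squares fit of one joint angle against another. This is only a one-dimensional sketch with hypothetical names and data; the paper's actual model draws on full motion capture sequences.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit y ~ a*x + b, exploiting the correlation
    between a keyframed DOF (xs) and an unkeyframed one (ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

def predict_missing(known_dof, a, b):
    """Synthesize the unkeyframed DOF from the keyframed one."""
    return [a * x + b for x in known_dof]
```

Given enough example frames, the same idea extends to predicting many missing joints at once from the few the animator actually keyframed.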
We present a new method for the animation and rendering of photorealistic water effects. Our method is designed to produce visually plausible three dimensional effects, for example the pouring of water into a glass (see figure 1) and the breaking of an ocean wave, in a manner which can be used in a computer animation environment. In order to better obtain photorealism in the behavior of the simulated water surface, we introduce a new "thickened" front tracking technique to accurately represent the water surface and a new velocity extrapolation method to move the surface in a smooth, water-like manner. The velocity extrapolation method allows us to provide a degree of control to the surface motion, e.g. to generate a windblown look or to force the water to settle quickly. To ensure that the photorealism of the simulation carries over to the final images, we have integrated our method with an advanced physically based rendering system.
A smoke galleon sails through a ring of smoke (side and front views).
Variables on the computational grid (in 2D).
Top row: a walking mouse. Bottom row: a leaping tiger.
In this paper we present a new method for efficiently controlling animated smoke. Given a sequence of target smoke states, our method generates a smoke simulation in which the smoke is driven towards each of these targets in turn, while exhibiting natural-looking interesting smoke-like behavior. This control is made possible by two new terms that we add to the standard flow equations: (i) a driving force term that causes the fluid to carry the smoke towards a particular target, and (ii) a smoke gathering term that prevents the smoke from diffusing too much. These terms are explicitly defined by the instantaneous state of the system at each simulation timestep. Thus, no expensive optimization is required, allowing complex smoke animations to be generated with very little additional cost compared to ordinary flow simulations.
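The two control terms above are added to a standard flow solver, whose core is semi-Lagrangian advection: each grid sample is traced backward along the velocity field and interpolated. The sketch below is a 1-D periodic version with a constant velocity (a deliberate simplification with hypothetical names, not the paper's control terms themselves).

```python
def advect(field, vel, dt, dx=1.0):
    """Semi-Lagrangian advection on a periodic 1-D grid: trace each
    sample backward along the (constant, assumed) velocity and linearly
    interpolate. Unconditionally stable, which is what lets additional
    driving forces be applied without blowing up the simulation."""
    n = len(field)
    out = []
    for i in range(n):
        x = (i - vel * dt / dx) % n        # backtraced position
        i0 = int(x) % n
        f = x - int(x)                     # fractional offset
        out.append((1 - f) * field[i0] + f * field[(i0 + 1) % n])
    return out
```

In the paper's setting, the driving-force and gathering terms modify the velocity and density between advection steps, at negligible extra cost per timestep.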
Monte Carlo sampling can be used to estimate solutions to global light transport and other rendering problems. However, a large number of observations may be needed to reduce the variance to acceptable levels. Rather than computing more observations within each pixel, if spatial coherence exists in image space it can be used to reduce visual error by averaging estimators in adjacent pixels. Anisotropic diffusion is a space-variant noise reduction technique that can selectively preserve texture, edges, and other details using a map of image coherence. The coherence map can be estimated from depth and normal information as well as interpixel color distances. Incremental estimation of the reduction in variance, in conjunction with statistical normalization of interpixel color distances, yields an energy-preserving algorithm that converges to a spatially nonconstant steady state.
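The edge-preserving smoothing at the heart of this approach can be sketched in 1-D with a Perona-Malik-style conductance that falls off at large pixel differences (hypothetical parameters; the paper's coherence map additionally uses depth and normal information).

```python
import math

def conductance(d, kappa):
    """Edge-stopping function: ~1 for small differences, ~0 across
    large (edge-like) differences, so edges are preserved."""
    return math.exp(-(d / kappa) ** 2)

def anisotropic_diffusion(u, iters=20, kappa=0.3, lam=0.2):
    """Explicit 1-D anisotropic diffusion: averages away small
    pixel-to-pixel noise while leaving sharp steps intact."""
    u = list(u)
    for _ in range(iters):
        nxt = u[:]
        for i in range(1, len(u) - 1):
            dr = u[i + 1] - u[i]           # difference to the right
            dl = u[i - 1] - u[i]           # difference to the left
            nxt[i] += lam * (conductance(dr, kappa) * dr +
                             conductance(dl, kappa) * dl)
        u = nxt
    return u
```

Replacing `conductance` with a map derived from per-pixel variance estimates gives the selective, energy-preserving smoothing the abstract describes.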
With recent improvements in methods for the acquisition and rendering of 3D models, the need for retrieval of models has gained prominence in the graphics and vision communities. A variety of methods have been proposed that enable the efficient querying of model repositories for a desired 3D shape. Many of these methods use a 3D model as a query and attempt to retrieve models from the database that have a similar shape. In this paper we consider the implications of anisotropy on the shape matching paradigm. In particular, we propose a novel method for matching 3D models that factors the shape matching equation as the disjoint outer product of anisotropy and geometric comparisons. We provide a general method for computing the factored similarity metric and show how this approach can be applied to improve the matching performance of many existing shape matching methods.
In this article we address the general problem of drawing nice-looking undirected straight-line graphs. Any proposed solution to this problem requires setting general criteria for the "quality" of the picture. Defining such criteria so that they apply to different types of graphs, but at the same time can be combined into a meaningful cost function that can then be subjected to general optimization methods, was one of the main objectives of our work. Another was to introduce flexibility, so that the user may change the relative weights of the criteria to obtain varying solutions that reflect his or her preferences.
This paper describes a framework that allows a user to synthesize human motion while retaining control of its qualitative properties. The user paints a timeline with annotations --- like walk, run or jump --- from a vocabulary which is freely chosen by the user. The system then assembles frames from a motion database so that the final motion performs the specified actions at specified times. The motion can also be forced to pass through particular configurations at particular times, and to go to a particular position and orientation. Annotations can be painted positively (for example, must run), negatively (for example, may not run backwards) or as a don't-care. The system uses a novel search method, based around dynamic programming at several scales, to obtain a solution efficiently so that authoring is interactive. Our results demonstrate that the method can generate smooth, natural-looking motion.
Decomposition of each polygon integral into edge integrals e1, e2, and e3.
Approximation of each edge integral as a weighted sum of line integrals. Form a trapezoid by projecting each polygon edge to the y axis. Edge integrals integrate within these gray trapezoidal regions; the sum of edge integrals equals the integral within the polygonal region.  
This paper introduces quadrature prefiltering, an accurate, efficient, and fairly simple algorithm for prefiltering polygons for scanline rendering. It renders very high quality images at reasonable cost, strongly suppressing aliasing artifacts. For equivalent RMS error, quadrature prefiltering is significantly faster than either uniform or jittered supersampling. Quadrature prefiltering is simple to implement and space-efficient; it needs only a small two-dimensional lookup table, even when computing non-radially-symmetric filter kernels. Previous algorithms have required either three-dimensional tables or a restriction to radially symmetric filter kernels. Though only slightly more complicated to implement than the widely used box prefiltering method, quadrature prefiltering can generate images with far less visible aliasing. Introduction: Good spatial antialiasing is computationally expensive, a significant consumer of computing resources when rendering images with very...
Confocal microscopy is a family of imaging techniques that employ focused patterned illumination and synchronized imaging to create cross-sectional views of 3D biological specimens. In this paper, we adapt confocal imaging to large-scale scenes by replacing the optical apertures used in microscopy with arrays of real or virtual video projectors and cameras. Our prototype implementation uses a video projector, a camera, and an array of mirrors. Using this implementation, we explore confocal imaging of partially occluded environments, such as foliage, and weakly scattering environments, such as murky water. We demonstrate the ability to selectively image any plane in a partially occluded environment, and to see further through murky water than is otherwise possible. By thresholding the confocal images, we extract mattes that can be used to selectively illuminate any plane in the scene.
This paper presents a method for approximating polyhedral objects to support a timecritical collision-detection algorithm. The approximations are hierarchies of spheres, and they allow the time-critical algorithm to progressively refine the accuracy of its detection, stopping as needed to maintain the real-time performance essential for interactive applications. The key to this approach is a preprocess that automatically builds tightly fitting hierarchies for rigid and articulated objects. The preprocess uses medial-axis surfaces, which are skeletal representations of objects. These skeletons guide an optimization technique that gives the hierarchies accuracy properties appropriate for collision detection. In a sample application, hierarchies built this way allow the time-critical collision-detection algorithm to have acceptable accuracy, improving significantly on that possible with hierarchies built by previous techniques. The performance of the time-critical algorithm in this appli...
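The time-critical use of a sphere hierarchy can be sketched with a two-level tree: test the root spheres first, and refine to the children only if the time budget allows. This is a minimal illustration with hypothetical names, not the paper's medial-axis construction or its full traversal.

```python
import math

def spheres_hit(a, b):
    """(x, y, r) spheres intersect iff center distance <= r_a + r_b."""
    return math.dist(a[:2], b[:2]) <= a[2] + b[2]

def collide(tree_a, tree_b, budget):
    """Progressive sphere-tree test; each tree is (root, children).
    With budget 0 we answer from the roots alone (conservative);
    with budget > 0 we refine to child-vs-child tests."""
    root_a, kids_a = tree_a
    root_b, kids_b = tree_b
    if not spheres_hit(root_a, root_b):
        return False         # roots disjoint: certainly no contact
    if budget == 0 or not kids_a or not kids_b:
        return True          # out of time: report a conservative hit
    return any(spheres_hit(ca, cb) for ca in kids_a for cb in kids_b)
```

Because each level's spheres enclose the geometry below them, stopping early can only over-report contacts, never miss one, which is what makes the early-out safe for real-time use.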
Densities dropping down a domain with curvilinear boundaries.
The velocity field is used as the principal direction of anisotropy in this sequence.
Textures created using a reaction-diffusion process.
In this paper we introduce a method to simulate fluid flows on smooth surfaces of arbitrary topology: an effect never seen before. We achieve this by combining a two-dimensional stable fluid solver with an atlas of parametrizations of a Catmull-Clark surface. The contributions of this paper are: (i) an extension of the Stable Fluids solver to arbitrary curvilinear coordinates, (ii) an elegant method to handle cross-patch boundary conditions and (iii) a set of new external forces custom tailored for surface flows. Our techniques can also be generalized to handle other types of processes on surfaces modeled by partial differential equations, such as reaction-diffusion. Some of our simulations allow a user to interactively place densities and apply forces to the surface, then watch their effects in real-time. We have also computed higher resolution animations of surface flows off-line.
This paper studies geometrically continuous spline curves of arbitrary degree. Based on the concept of universal splines, we obtain geometric constructions for both the spline control points and the Bézier points, and give algorithms for computing locally supported basis functions and for knot insertion. The geometric constructions are based on the intersection of osculating flats. The concept of universal splines is defined in such a way that these intersections are guaranteed to exist. As a result of this development we obtain a generalization of polar forms to geometrically continuous spline curves by intersecting osculating flats. The presented algorithms have been coded in Maple, and concrete examples illustrate the approach. Categories and Subject Descriptors: I.3.5 [Computer Graphics]: Computational Geometry and Object Modelling - curve, surface, solid, and object representations General Terms: Algorithms, Design Additional Key Words and Phrases: Bézier point, blossom, de...