A Fast Voxel Traversal Algorithm for Ray Tracing

John Amanatides
Andrew Woo
Dept. of Computer Science
University of Toronto
Toronto, Ontario, Canada M5S 1A4
ABSTRACT
A fast and simple voxel traversal algorithm through a 3D space partition is introduced.
Going from one voxel to its neighbour requires only two floating point comparisons
and one floating point addition. Also, multiple ray intersections with objects that
are in more than one voxel are eliminated.
Introduction
In recent years, ray tracing has become the algorithm of choice for generating high fidelity images.
Its simplicity and elegance allow one to easily model reflection, refraction and shadows [1]. Unfortunately,
it has a major drawback: computational expense. The prime reason for this is that the heart of ray tracing,
intersecting an object with a ray, is expensive and can easily take up to 95% of the rendering time. Unless
some sort of intersection culling is performed, each ray must be intersected with all the objects in the scene,
a very expensive proposition.
There are two general strategies for intersection culling: hierarchical bounding volumes [1, 2, 3, 4] and
space partitioning [5, 6, 7, 8].
The general idea of the first approach is to envelop complicated objects that take a long time to
intersect with simpler bounding volumes that are much easier to intersect, such as spheres or boxes.
Before intersecting the complicated object, the bounding volume is first intersected. (Actually, it is not a
full intersection test; all we care about is whether the ray hits the bounding volume, not where.) If there is no
intersection with the bounding volume, there is no need to intersect the complicated object, thus saving
time. For a complicated scene made up of many objects, a bounding volume is placed around the entire
scene, with each object also having its own bounding volume. If an object is made up of several parts, each
of these parts can also have a bounding volume. We can thus build a tree of bounding volumes, with each
node containing a bounding volume that envelops its children. Objects within a subtree are intersected
only if their parent node's bounding volume is intersected by the ray. In this manner, the number of actual
intersections is significantly reduced. Of course, we now have to spend time intersecting bounding volumes,
but this is more than offset by the reduced total number of intersections.
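The hit-or-miss nature of the bounding volume test can be illustrated with a minimal sketch in C, assuming a spherical bounding volume. The function name `ray_hits_sphere` and its parameters are hypothetical and do not come from any of the cited systems; the point is only that no intersection position is computed.

```c
#include <assert.h>

/* Returns 1 if the ray u + t*v (t >= 0) touches the sphere with centre c
   and radius r, 0 otherwise. Only a yes/no answer is produced; the
   intersection point itself is never computed. (Hypothetical helper.) */
int ray_hits_sphere(const double u[3], const double v[3],
                    const double c[3], double r)
{
    double b = 0.0, cc = 0.0, vv = 0.0;
    for (int i = 0; i < 3; i++) {
        double oc = c[i] - u[i];   /* origin-to-centre component */
        b  += oc * v[i];           /* projection of oc onto v */
        cc += oc * oc;             /* squared distance origin-centre */
        vv += v[i] * v[i];         /* squared length of the direction */
    }
    assert(vv > 0.0 && r >= 0.0);

    if (cc <= r * r) return 1;     /* origin already inside the sphere */
    if (b <= 0.0)    return 0;     /* centre lies behind the ray origin */

    /* Squared distance from the centre to the ray's supporting line. */
    return (cc - b * b / vv) <= r * r;
}
```

In a bounding volume tree, a test of this kind would be applied at each node, and the children would be visited only when it returns 1.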
The second approach to reducing intersections is to partition space itself into regions or voxels.
Each voxel has a list of objects that are in that voxel. If an object spans several voxels it is in more than
one list. When a ray is shot, we first look into the voxel in which it originates. If it hits any objects in the
starting voxel’s list, the intersections are sorted and the closest one is retained. If the intersection is in the
current voxel there is no need to intersect any other objects as we have found the closest intersection. If
no intersection is found in the current voxel or the object list is empty, we follow the ray into a neighbouring
voxel and check its object list. We continue until either we find an intersection or we completely traverse
the space partition. Since we intersect objects roughly in the order in which they occur along the ray and
trivially reject objects far from the ray, the number of intersections that need to be performed is vastly
reduced. There are two popular space partition schemes: octrees by Glassner [6], where voxels are of different
sizes, and constant size voxel partitioning (hereafter called a grid partition) by Fujimoto et al. [7, 8]. The
first conserves space but makes traversal difficult while the latter allows for simpler traversal at the
expense of more voxels.
In this paper, we introduce a fast and simple incremental grid traversal algorithm. Like that of Fujimoto et
al. [7, 8], it is a variant of the DDA line algorithm. However, instead of basing it on the simple DDA (Fujimoto
et al.), in which an unconditional step along one axis is required, ours has no preferred axis. This
considerably simplifies the inner loop and allows for easy testing of an intersection point to see if it is in
the current voxel. Along with the new traversal algorithm, we introduce a technique to eliminate multiple
intersections when an object spans several voxels. This technique can be used with all space subdivision
algorithms with minimal modifications.
The New Traversal Algorithm
Let us derive the new traversal algorithm. We consider the two dimensional case first; the extension
to three dimensions is straightforward. Consider figure 1:
[Figure 1: a ray crossing a two-dimensional grid (axes x and y), passing through voxels a, b, c, d, e, f, g and h]
To correctly traverse the grid, a traversal algorithm must visit voxels a, b, c, d, e, f, g and h in that order.
The equation of the ray is $\vec{u} + t\vec{v}$ for $t \geq 0$. The new traversal algorithm breaks down the ray
into intervals of t, each of which spans one voxel. We start at the ray origin and visit each of these voxels in
interval order.
The traversal algorithm consists of two phases: initialization and incremental traversal. The initialization
phase begins by identifying the voxel in which the ray origin, $\vec{u}$, is found. If the ray origin is outside
the grid, we find the point at which the ray enters the grid and take the adjacent voxel. The integer
variables X and Y are initialized to the starting voxel coordinates. In addition, the variables stepX and
stepY are initialized to either 1 or -1 indicating whether X and Y are incremented or decremented as the
ray crosses voxel boundaries (this is determined by the sign of the x and y components of $\vec{v}$).
Next, we determine the value of t at which the ray crosses the first vertical voxel boundary and
store it in variable tMaxX. We perform a similar computation in y and store the result in tMaxY. The
minimum of these two values indicates how far we can travel along the ray and still remain in the
current voxel.
Finally, we compute tDeltaX and tDeltaY. The variable tDeltaX indicates how far along the ray we must move
(in units of t) for the horizontal component of such a movement to equal the width of a voxel. Similarly,
we store in tDeltaY the amount of movement along the ray which has a vertical component equal to the
height of a voxel.
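The initialization described above can be sketched per axis as follows. This is a minimal illustration rather than the authors' code: the struct `AxisState`, the function `init_axis`, the assumption that the grid's lower edge sits at coordinate 0, and the handling of a zero direction component (which the text does not discuss) are all assumptions made here. The field names mirror the paper's variables.

```c
#include <assert.h>
#include <math.h>   /* INFINITY */

/* Traversal state for one axis; names follow X, stepX, tMaxX, tDeltaX. */
typedef struct {
    int    cell;    /* current voxel index along this axis */
    int    step;    /* +1 or -1 (0 if the ray never moves on this axis) */
    double tMax;    /* t at which the ray crosses the next voxel boundary */
    double tDelta;  /* t needed to cross one full voxel on this axis */
} AxisState;

/* u, v: ray origin and direction components along this axis.
   Assumes the grid starts at 0 and the origin is inside it (u >= 0),
   so truncation yields the starting voxel index. */
AxisState init_axis(double u, double v, double voxel_size)
{
    AxisState s;
    assert(voxel_size > 0.0 && u >= 0.0);
    s.cell = (int)(u / voxel_size);
    if (v > 0.0) {
        s.step   = 1;
        s.tMax   = ((s.cell + 1) * voxel_size - u) / v;  /* upper boundary */
        s.tDelta = voxel_size / v;
    } else if (v < 0.0) {
        s.step   = -1;
        s.tMax   = (s.cell * voxel_size - u) / v;        /* lower boundary */
        s.tDelta = -voxel_size / v;
    } else {
        /* Assumption: a ray parallel to the boundaries never crosses one. */
        s.step   = 0;
        s.tMax   = INFINITY;
        s.tDelta = INFINITY;
    }
    return s;
}
```

Calling `init_axis` once with the x components and once with the y components yields the six quantities the incremental phase needs.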
The incremental phase of the traversal algorithm is very simple. The basic loop is outlined below:
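    loop {
        if(tMaxX < tMaxY) {
            /* the next boundary crossed is vertical: step in x */
            tMaxX= tMaxX + tDeltaX;
            X= X + stepX;
        } else {
            /* the next boundary crossed is horizontal: step in y */
            tMaxY= tMaxY + tDeltaY;
            Y= Y + stepY;
        }
        NextVoxel(X,Y);
    }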
We loop until either we find a voxel with a non-empty object list or we fall out of the end of the grid.
Extending the algorithm to three dimensions simply requires that we add the appropriate z variables and
find the minimum of tMaxX, tMaxY and tMaxZ during each iteration. This results in:
    list= NIL;
    do {
        if(tMaxX < tMaxY) {
            if(tMaxX < tMaxZ) {
                X= X + stepX;
                if(X == justOutX)
                    return(NIL); /* outside grid */
                tMaxX= tMaxX + tDeltaX;
            } else {
                Z= Z + stepZ;
                if(Z == justOutZ)
                    return(NIL);
                tMaxZ= tMaxZ + tDeltaZ;
            }
        } else {
            if(tMaxY < tMaxZ) {
                Y= Y + stepY;
                if(Y == justOutY)
                    return(NIL);
                tMaxY= tMaxY + tDeltaY;
            } else {
                Z= Z + stepZ;
                if(Z == justOutZ)
                    return(NIL);
                tMaxZ= tMaxZ + tDeltaZ;
            }
        }
        list= ObjectList[X][Y][Z];
    } while(list == NIL);
    return(list);
The loop above requires two floating point comparisons, one floating point addition, two integer comparisons
and one integer addition per iteration. The initialization phase requires 33 floating point operations
(including comparisons) if the origin of the ray is inside the grid and up to 40 floating point operations
otherwise.
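To make the incremental phase concrete, the two-dimensional loop can be exercised with a small self-contained sketch. It assumes unit-sized voxels, a grid whose corner is at the origin, and a ray that starts inside the grid; the function name `traverse2d` and the output array are hypothetical, and the bounds check is written as a plain range test rather than the justOut comparison used in the three-dimensional listing.

```c
#include <assert.h>
#include <math.h>   /* INFINITY */

/* Walks the ray u + t*v (t >= 0) through an nx-by-ny grid of unit voxels
   and records each visited voxel in out[][2]. Returns the number of
   voxels recorded. Assumes the origin is inside the grid. */
int traverse2d(double ux, double uy, double vx, double vy,
               int nx, int ny, int out[][2], int max_out)
{
    int X = (int)ux, Y = (int)uy;            /* starting voxel */
    int stepX = (vx < 0.0) ? -1 : 1;
    int stepY = (vy < 0.0) ? -1 : 1;
    /* A zero component never crosses a boundary on that axis. */
    double tDeltaX = (vx != 0.0) ? stepX / vx : INFINITY;
    double tDeltaY = (vy != 0.0) ? stepY / vy : INFINITY;
    double tMaxX = (vx > 0.0) ? (X + 1 - ux) / vx
                 : (vx < 0.0) ? (X - ux) / vx : INFINITY;
    double tMaxY = (vy > 0.0) ? (Y + 1 - uy) / vy
                 : (vy < 0.0) ? (Y - uy) / vy : INFINITY;
    int n = 0;

    assert(vx != 0.0 || vy != 0.0);
    assert(ux >= 0.0 && uy >= 0.0 && X < nx && Y < ny);

    while (n < max_out) {
        out[n][0] = X;
        out[n][1] = Y;
        n++;
        if (tMaxX < tMaxY) {                 /* one float comparison */
            tMaxX += tDeltaX;                /* one float addition */
            X += stepX;
        } else {
            tMaxY += tDeltaY;
            Y += stepY;
        }
        if (X < 0 || X >= nx || Y < 0 || Y >= ny)
            break;                           /* fell off the grid */
    }
    return n;
}
```

For a ray starting at (0.5, 0.5) with direction (2, 1) in a 4 by 3 grid, the sketch visits (0,0), (1,0), (1,1), (2,1), (3,1) and (3,2) before leaving the grid.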
To correctly determine visibility, we must make sure that the "closest" intersection point is in the
current voxel. This is best illustrated by figure 2. The first voxel that has an object in its list that can
be intersected is b. But the actual intersection with that object (B) is in voxel d, with a closer object (A) in
voxel c.
[Figure 2: voxels a, b, c and d, with objects A and B]
At first, checking to make sure that the intersection point is within the voxel would seem to require six
floating point comparisons. But we can reduce these to just one: the value of t at the intersection point
compared to the maximum value of t allowed in the current voxel. If it is less than or equal to the maximum
allowed, it is in the current voxel; otherwise, we must continue traversal until either we reach the
voxel in which the intersection occurs or find a closer object. The easiest way to perform this comparison
is to include it in the incremental traversal code just after the minimum of tMaxX, tMaxY and tMaxZ is
determined. Of course, we don’t want to do this comparison until we have hit a surface, so we have two
traversal functions, one without the extra comparison and the other with. After an intersection is found
we call the incremental traversal function again; this time we call the version with the extra comparison.
If the intersection is in the current voxel, the function returns NIL and we stop. Otherwise, we continue
traversal until either we find a non-empty voxel or the voxel in which the intersection occurred. Thus, for
minimal cost, we can decide if the intersection is within the current voxel.
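The single comparison described above can be sketched as a pair of small C helpers. The function names are hypothetical; the only quantities used are the intersection parameter and the three tMax values that the traversal already maintains.

```c
#include <assert.h>

/* Largest t that still lies inside the current voxel: the smallest of
   the three boundary-crossing parameters. */
double voxel_exit_t(double tMaxX, double tMaxY, double tMaxZ)
{
    double m = (tMaxX < tMaxY) ? tMaxX : tMaxY;
    return (m < tMaxZ) ? m : tMaxZ;
}

/* Returns 1 if an intersection at parameter tHit falls inside the
   current voxel, using one comparison against the voxel's exit t. */
int hit_in_current_voxel(double tHit,
                         double tMaxX, double tMaxY, double tMaxZ)
{
    assert(tHit >= 0.0);
    return tHit <= voxel_exit_t(tMaxX, tMaxY, tMaxZ);
}
```

In the traversal code itself the minimum is already known from the nested comparisons, so only the final comparison against tHit would be added.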
Avoiding Multiple Intersections
A major drawback [4] with current space subdivision schemes is that since objects may reside in more
than one voxel, a ray may intersect the same object many times. We introduce a technique that blunts this
criticism. But first we will outline what happens when we perform an intersection test with an object.
We shall assume that the ray, along with $\vec{u}$ and $\vec{v}$, has stored with it a pointer to information
regarding the current visibility "winner"; that is, the object that is closest to the ray origin of all the objects that
have been intersected so far. In particular, we store the t value of the intersection point. As we intersect a
new candidate we compare the t value of the intersection point (if any) with that of the current winner. If
the candidate is closer, it becomes the new winner and updates the winner data. Otherwise, it is rejected.
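The winner bookkeeping can be sketched as below. The `Winner` struct, the use of an integer object index, and the convention that an infinite t means "no winner yet" are assumptions made for illustration; the text only states that the t value of the current winner is stored with the ray.

```c
#include <assert.h>
#include <math.h>   /* INFINITY */

/* Closest intersection found so far along one ray. */
typedef struct {
    double t;       /* t value of the winning intersection */
    int    object;  /* index of the winning object, -1 if none yet */
} Winner;

/* Initial state: nothing has been hit. */
Winner winner_none(void)
{
    Winner w = { INFINITY, -1 };
    return w;
}

/* Offers a candidate intersection at parameter t. Returns 1 if the
   candidate is closer and replaces the winner, 0 if it is rejected. */
int offer_candidate(Winner *w, double t, int object)
{
    assert(t >= 0.0);
    if (t < w->t) {
        w->t = t;
        w->object = object;
        return 1;
    }
    return 0;
}
```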
To solve the multiple intersection problem we add to each ray an integer variable, called rayID.
Every ray shot through the grid will have its own unique value for rayID. Also, each object will have
stored with it the rayID of the ray that most recently performed an intersection test with it (this variable is
initialized to 0, a value that no ray is ever allowed to have). Before an intersection test, the rayID of the
ray and that of the object are compared. If they are equal, the object has previously been intersected by the same
ray and the current intersection test is not necessary. Otherwise, the intersection test is performed and the
object’s rayID is set to that of the ray. In this manner, a ray intersection test need only be performed
once for any object. Of course, this approach requires an extra integer comparison; this extra comparison
is well worth it as a full intersection test is typically several orders of magnitude more expensive.
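A minimal sketch of the rayID check follows. The `Object` struct and the function `needs_test` are hypothetical names; the only behaviour taken from the text is the stored rayID per object, the reserved value 0, and the compare-then-stamp order.

```c
#include <assert.h>

/* Per-object record of the last ray that tested it. */
typedef struct {
    unsigned lastRayID;  /* 0 = not yet tested by any ray */
} Object;

/* Returns 1 if the full intersection test must be performed for this
   ray, and stamps the object with the ray's ID. Returns 0 if the same
   ray has already tested this object (for example, in another voxel). */
int needs_test(Object *obj, unsigned rayID)
{
    assert(rayID != 0);          /* 0 is reserved for "never tested" */
    if (obj->lastRayID == rayID)
        return 0;                /* one integer comparison, test skipped */
    obj->lastRayID = rayID;
    return 1;
}
```

A caller would wrap each object in a voxel's list with this check and run the expensive intersection routine only when it returns 1.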
The fact that multiple intersections are eliminated means that we do not have to spend as much time
in determining the minimal number of voxels that an object occupies. Glassner indicated that he would
include an object in a voxel’s list only if the object’s surface existed within the voxel. Computing these
voxels is non-trivial, especially for complicated objects. Faster, though slightly less tight algorithms can
be used as we no longer perform multiple intersections on the same object.
Results
The above traversal algorithm, with the indicated correction, has been implemented in C on a Sun
3/75 running Unix. The results of four test scenes are revealed in the table below. In all cases, the images
were computed at 512 by 512 resolution with one sample per pixel and with one light source casting a
shadow.
Figure  Time   Grid         Objects  Objects/  Intersections/  Intersections/
        (min)  Subdivision           Voxel     Ray             Ray (with rayID)
4       49.5   20           4        0.3       4.6             1.7
5       44.8   20           31       0.9       5.3             2.0
6       21.7   30           62       0.2       1.9             1.4
7       32.6   40           3540     1.1       6.6             3.8
We see that even for very large numbers of objects, the number of objects that must actually be intersected
stays small. Also, the rayID optimization significantly reduces the number of objects that must be
intersected. As the level of grid subdivision increases, the rayID optimization will become more significant
as objects cover more voxels. Unfortunately, as we increase grid subdivision, voxel initialization
time, traversal time as well as memory usage also become significant, thus ultimately limiting the subdivision
rate.
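As a rough check on the size of the rayID effect, the ratio of intersections per ray without and with rayID can be worked out directly from the figures reported in the table (rounded to two decimal places; this arithmetic is not part of the original results):

```latex
\begin{align*}
\text{Figure 4:}\quad & 4.6 / 1.7 \approx 2.71 \\
\text{Figure 5:}\quad & 5.3 / 2.0 = 2.65 \\
\text{Figure 6:}\quad & 1.9 / 1.4 \approx 1.36 \\
\text{Figure 7:}\quad & 6.6 / 3.8 \approx 1.74
\end{align*}
```

That is, between roughly a quarter and two thirds of the intersection tests are avoided in these four scenes.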
Figure 3 plots the execution time of rendering the scene in figure 7 for several different levels of
subdivision. We see that for low levels of subdivision, rendering the scene is very expensive but the cost
plummets with only moderate increases in the subdivision rate.
[Figure 3: rendering time in minutes (axis from 0 to 1000) plotted against grid subdivision (axis from 20 to 80)]
Conclusions and Future Research
We have introduced a space partition algorithm that requires very few floating point operations.
Multiple intersection tests for objects are eliminated.
We are currently studying the feasibility of reducing the algorithm to just integer operations to further
reduce operating time. Also, we are looking to see how easily the algorithm can be modified to traverse
octree space subdivision schemes.
The quick traversal times through a grid suggest its use for more than just simple intersection
culling. For example, we could store in the grid the seed points of iterative processes. A good candidate
is ray tracing bicubic parametric patches using Newton’s method for finding roots [9, 10].
Another possibility is using grids to reduce shadow rays. If there are many light sources, shadow
rays are very important [11]. At the cost of two bits per light source per voxel (one indicating that the voxel
is completely in shadow, the other indicating that nothing blocks the voxel from seeing the light source) a
great number of these rays are no longer necessary.
References
1. T. Whitted, “An Improved Illumination Model for Shaded Display,” Comm. of the ACM, 23(6), pp.
343-349 (June 1980).
2. S.M. Rubin and T. Whitted, “A 3-Dimensional Representation for Fast Rendering of Complex
Scenes,” Computer Graphics, 14(3), pp. 110-116 (July 1980).
3. H. Weghorst, G. Hooper, and D.P. Greenberg, “Improved Computational Methods for Ray Tracing,”
ACM Trans. on Graphics, 3(1), pp. 52-69 (January 1984).
4. T.L. Kay and J.T. Kajiya, “Ray Tracing Complex Scenes,” Computer Graphics, 20(4), pp. 269-278
(August 1986).
5. J.G. Cleary, B. Wyvill, G.M. Birtwistle, and R. Vatti, “Multiprocessor Ray Tracing,” Research
Report No. 83/128/7, Dept. of Computer Science, University of Calgary (1983).
6. A.S. Glassner, “Space Subdivision for Fast Ray Tracing,” IEEE Computer Graphics and Applications,
4(10), pp. 15-22 (October 1984).
7. A. Fujimoto and K. Iwata, “Accelerated Ray Tracing,” Proc. CG Tokyo ’85, pp. 41-65.
8. A. Fujimoto, T. Tanaka, and K. Iwata, “ARTS: Accelerated Ray-Tracing System,” IEEE Computer
Graphics and Applications, 6(4), pp. 16-26 (April 1986).
9. D.L. Toth, “On Ray Tracing Parametric Surfaces,” Computer Graphics, 19(3), pp. 171-179 (July
1985).
10. M.A.J. Sweeney and R.H. Bartels, “Ray Tracing Free-Form B-Spline Surfaces,” IEEE Computer
Graphics and Applications, 6(2), pp. 41-49 (February 1986).
11. E.A. Haines and D.P. Greenberg, “The Light Buffer: A Shadow-Testing Accelerator,” IEEE Computer
Graphics and Applications, 6(9), pp. 6-16 (September 1986).
... Second, the screw center lines were determined by applying Hough Transform [69]. The last step was finding the screw head locations achieved by fast voxel traversal for ray tracing [70]. In the following chapters, the details on implementation are reported. ...
... The algorithm returns the detected lines in a list including the following information: the number of points that have been assigned to the line, their center of mass and the line direction. In step three, screw entry points were identified using a fast voxel traversal method based on ray tracing [70] (Figure 4(3)). The center of mass per screw and the corresponding line direction, found through Hough transform, served as starting point and direction for the ray tracing algorithm within the thresholded segmentation masks. ...
Article
Full-text available
In clinical practice, image-based postoperative evaluation is still performed without state-of-the-art computer methods, as these are not sufficiently automated. In this study we propose a fully automatic 3D postoperative outcome quantification method for the relevant steps of orthopaedic interventions on the example of Periacetabular Osteotomy of Ganz (PAO). A typical orthopaedic intervention involves cutting bone, anatomy manipulation and repositioning as well as implant placement. Our method includes a segmentation based deep learning approach for detection and quantification of the cuts. Furthermore, anatomy repositioning was quantified through a multi-step registration method, which entailed a coarse alignment of the pre- and postoperative CT images followed by a fine fragment alignment of the repositioned anatomy. Implant (i.e., screw) position was identified by 3D Hough transform for line detection combined with fast voxel traversal based on ray tracing. The feasibility of our approach was investigated on 27 interventions and compared against manually performed 3D outcome evaluations. The results show that our method can accurately assess the quality and accuracy of the surgery. Our evaluation of the fragment repositioning showed a cumulative error for the coarse and fine alignment of 2.1 mm. Our evaluation of screw placement accuracy resulted in a distance error of 1.32 mm for screw head location and an angular deviation of 1.1° for screw axis. As a next step we will explore generalisation capabilities by applying the method to different interventions.
... New and improved techniques have come up, which handle a very low degree of approximation but are inherently complex and dynamic in nature. For instance, Heng et al. (2021) focused on the efficiency of signal line indexing in the tomography model, using a fast voxel traversal algorithm (FVTA), initially proposed by Amanatides and Woo (1987). Using FVTA in WV tomography improved the indexing efficiency of the signal line and provided accurate grid intercept positioning information for observation equations. ...
Article
Global navigation satellite system (GNSS) meteorology is utilized in predicting and monitoring extreme weather events using tropospheric products like precipitable water vapour for horizontal 2D detail and water vapour density profiles for 3D detail. In water vapour (WV) tomography, the slant delays from GNSS observations are used to model 4D variations of wet refractivity in the atmosphere above the study area. We present a comprehensive review on the evolution of GNSS WV tomography since its inception, the current state of the art, the challenges faced by the meteorology community, and a qualitative analysis of various techniques used in the process. With the growing infrastructure of meteorologically oriented GNSS station networks as well as increasing utilization of multi-source earth observation datasets powered with machine learning tools, GNSS tomography has shown improvements in producing accurate WV profiles. These new improvements have created the need for conducting multi-factor canonical analyses to understand the efficiency of these well-established methods in controlling the accuracy of the output field.
... intersected by the ray were listed using the Fast Voxel Traversal method (Amanatides & Woo, 1987). ...
Article
Full-text available
Many modelling applications require 3D meshes that should be generated from filtered/cleaned point clouds. This paper proposes a methodology for filtering of terrestrial laser scanner (TLS)‐derived point clouds, consisting of two main parts: an anisotropic point error model and the subsequent decimation steps for elimination of low‐quality points. The point error model can compute the positional quality of any point in the form of error ellipsoids. It is formulated as a function of the angular/mechanical stability, sensor‐to‐object distance, laser beam's incidence angle and surface reflectivity, which are the most dominant error sources. In a block of several co‐registered point clouds, some parts of the target object are sampled by multiple scans with different positional quality patterns. This situation results in redundant data. The proposed decimation steps removes this redundancy by selecting only the points with the highest positional quality. Finally, the Good, Bad, and the Better algorithm, based on the ray‐tracing concept, was developed to remove the remaining redundancy due to the Moiré effects. The resulting point cloud consists of only the points with the highest positional quality while reducing the number of points by factor 10. This novel approach resulted in final surface meshes that are accurate, contain predefined level of random errors and require almost no manual intervention.
Article
Background Monte Carlo (MC) simulations are considered the gold‐standard for accuracy in radiotherapy dose calculation; so far however, no commercial treatment planning system (TPS) provides a fast MC for supporting clinical practice in carbon ion therapy. Purpose To extend and validate the in‐house developed fast MC dose engine MonteRay for carbon ion therapy, including physical and biological dose calculation. Methods MonteRay is a CPU MC dose calculation engine written in C++ that is capable of simulating therapeutic proton, helium and carbon ion beams. In this work, development steps taken to include carbon ions in MonteRay are presented. Dose distributions computed with MonteRay are evaluated using a comprehensive validation dataset, including various measurements (pristine Bragg peaks, spread out Bragg peaks in water and behind an anthropomorphic phantom) and simulations of a patient plan. The latter includes both physical and biological dose comparisons. Runtimes of MonteRay were evaluated against those of FLUKA MC on a standard benchmark problem. Results Dosimetric comparisons between MonteRay and measurements demonstrated good agreement. In terms of pristine Bragg peaks, mean errors between simulated and measured integral depth dose distributions were between −2.3% and +2.7%. Comparing SOBPs at 5, 12.5 and 20 cm depth, mean absolute relative dose differences were 0.9%, 0.7% and 1.6% respectively. Comparison against measurements behind an anthropomorphic head phantom revealed mean absolute dose differences of with global 3%/3 mm 3D‐γ passing rates of 99.3%, comparable to those previously reached with FLUKA (98.9%). Comparisons against dose predictions computed with the clinical treatment planning tool RayStation 11B for a meningioma patient plan revealed excellent local 1%/1 mm 3D‐γ passing rates of 98% for physical and 94% for biological dose. 
In terms of runtime, MonteRay achieved speedups against reference FLUKA simulations ranging from 14× to 72×, depending on the beam's energy and the step size chosen. Conclusions Validations against clinical dosimetric measurements in homogeneous and heterogeneous scenarios and clinical TPS calculations have proven the validity of the physical models implemented in MonteRay. To conclude, MonteRay is viable as a fast secondary MC engine for supporting clinical practice in proton, helium and carbon ion radiotherapy.
Article
Purpose Increasingly 3D printing is used for parts of garments or for making whole garments due to their flexibility and comfort and for functionalizing or enhancing the aesthetics of the final garment and hence adding value. Many of these applications rely on complex programming of the 3D printer and are usually provided by the vendor company. This paper introduces a simpler, easier platform for designing 3D-printed textiles, garments and other artifacts, by predicting the optimal orientation of the target objects to minimize the use of plastic filaments. Design/methodology/approach The main idea is based on the shadow-casting analogy, which assumes that the volume of the support structure is similar to that of the shadow from virtual sunlight. The triangular elements of the target object are converted into 3D pixels with integer-based normal vectors and real-numbered coordinates via vertically sparse voxelization. The pixels are classified into several groups and their noise is suppressed using a specially designed noise-filtering algorithm called slot pairing. The final support structure volume information was rendered as a two-dimensional (2D) figure, similar to a medical X-ray image. Thus, the authors named their method modified support structure tomography. Findings The study algorithm showed an error range of no more than 1.6% with exact volumes and 6.8% with slicing software. Moreover, the calculation time is only several minutes for tens of thousands of mesh triangles. The algorithm was verified for several meshes, including the cone, sphere, Stanford bunny and human manikin. Originality/value Simple hardware, such as a CPU, embedded system, Arduino or Raspberry Pi, can be used. This requires much less computational resources compared with the conventional g-code generation. Also, the global and local support structure is represented both quantitatively and graphically via tomographs.
Article
Large-scale 3D mapping nowadays is a research hotspot in robotics. A greatly concerning issue is reconstructing high-accuracy maps in a hardware environment with limited memory. To address this problem, we propose a novel implicit neural mapping approach with higher accuracy and less memory. It first adopts an improved hierarchical hash encoder, independent of geometric bounding (e.g., bounding box or sphere), for a more compact map representation, and then leverages a spatial hash grid to restrict the encoding space to the proximity of geometric surfaces, preventing hash collisions between encoding in free space and near geometric surfaces. The hash grid indexes the scene point cloud produced by ranging data. Through a tiny MLP, features encoded from sampled points in the hash grid can be converted to truncated signed distance values. To further improve mapping accuracy, a new method is developed to instantly obtain more accurate signed distance labels from ranging data by computing the closest distances from sampled points to the point cloud indexed by the constructed hash grid, not just the distances from sampled points to geometric surfaces along rays, and then use these labels to supervise the learning of our hash encoder. Experimental evaluations performed on large-scale indoor and outdoor datasets demonstrate that our approach achieves state-of-the-art mapping performance with less than half of the memory consumption compared with previous advanced 3D mapping methods using ranging data.
Preprint
Full-text available
This study comprehensively explores static and dynamic occlusion issues in urban scenarios, focusing mainly on their interplay with the rising prevalence of connected automated vehicles (CAVs). We propose a unique methodology for pinpointing static and dynamic occlusions and examining the impacts of CAVs that integrate collective perception in their sensing systems. A crucial aspect of our investigation is identifying a critical point concerning the CAV penetration ratio, past which dynamic occlusion ceases to exert significant influence. Based on our investigation, a penetration rate of around 34% seems to alleviate the problems associated with dynamic occlusions. Nonetheless, our research also uncovers that issues related to static occlusion may en10 dure even with increased CAV penetration levels, thus requiring additional mitigation approaches. Furthermore, this study broadens the understanding of static and dynamic occlusion, creating a new metric to explain the Level of Visibility (LoV) in urban areas. The framework applied in our evaluations is disclosed in conjunction with this paper. This research represents a substantial advancement in understanding and improving the operation of CAVs in occluded scenarios.
Article
A 3D occupancy map that is accurately modeled after real-world environments is essential for reliably performing robotic tasks. Probabilistic volumetric mapping (PVM) is a well-known environment mapping method using volumetric voxel grids that represent the probability of occupancy. The main bottleneck of current CPU-based PVM, such as OctoMap, is determining voxel grids with occupied and free states using ray-shooting. In this paper, we propose an octree-based PVM, called OctoMap-RT, using a hybrid of off-the-shelf ray-tracing GPUs and CPUs to substantially improve CPU-based PVM. OctoMap-RT employs massively parallel ray-shooting using GPUs to generate occupied and free voxel grids and to update their occupancy states in parallel, and it exploits CPUs to restructure the PVM using the updated voxels. Our experiments using various large-scale real-world benchmarking environments with dense and high-resolution sensor measurements demonstrate that OctoMap-RT builds maps up to 41.2 times faster than OctoMap and 9.3 times faster than the recent SuperRay CPU implementation. Moreover, OctoMap-RT constructs a map with 0.52% higher accuracy, in terms of the number of occupancy grids, than both OctoMap and SuperRay.
Conference Paper
Full-text available
Hierarchical representations of 3-dimensional objects are both time and space efficient. They typically consist of trees whose branches represent bounding volumes and whose terminal nodes represent primitive object elements (usually polygons). This paper describes a method whereby the object space is represented entirely by a hierarchical data structure consisting of bounding volumes, with no other form of representation. This homogencity allows the visible surface rendering to be performed simply and efficiently. The bounding volumes selected for this algorithm are parallelepipeds oriented to minimize their size. With this representation, any surface can be rendered since in the limit the bounding volumes make up a point representation of the object. The advantage is that the visibility calculations consist only of a search through the data structure to determine the correspondence between terminal level bounding volumes and the current pixel. For ray tracing algorithms, this means that a simplified operation will produce the point of intersection of each ray with the bounding volumes. Memory requirements are minimized by expanding or fetching the lower levels of the hierarchy only when required. Because the viewing process has a single operation and primitive type, the software or hardware chosen to implement the search can be highly optimized for very fast execution.
Chapter
This paper proposes algorithms for dealing with two essential problems encountered in generating continuous-tone images by the ray tracing method: speed and aliasing. These two factors are considered an Achilles’ heel of the method and have been the main cause preventing the method from being widely used. The paper examines previous approaches to the problem and finally proposes a scheme based on the coherency of an auxiliary data structure imposed on the original object domain. Both simple spatial enumeration and a hybrid octree approach were investigated. 3DDDA (3D line generator) was developed for efficient traversing of both structures. It constitutes the essential factor providing a dramatic improvement (order of magnitude) in processing speed in comparison to other known ray tracing methods. In particular, processing time is found to be virtually independent of the number of objects involved in the scene. For a larger number of objects (around 1500), this method actually becomes faster than scan-line methods. In order to remove jags from edges, a scheme for identifying the edge orientation and the distance from a pixel center to the true edge has been implemented. The additional time required for antialiasing depends on the total length of the edges encountered in the scene, but is normally only a fractional addition to the time required to produce such a scene without antialiasing.
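The simple spatial enumeration case, stepping a ray from voxel to voxel through a uniform grid, can be illustrated with a short sketch. This is not the 3DDDA of the abstract above; it is an incremental traversal in the spirit of the voxel traversal paper this page describes, in which each step needs only comparisons to pick an axis and one addition to advance along it. The function name `traverse_grid`, the unit voxel size, and the assumption that the ray starts inside the grid are all hypothetical choices made for illustration.

```python
import math

def traverse_grid(origin, direction, grid_size, max_steps=1000):
    """Yield the integer voxel coordinates a ray visits, in order.

    Assumptions (illustrative only): voxels are unit cubes, the grid
    spans [0, grid_size[i]) on each axis, and origin lies inside it.
    """
    voxel = [int(math.floor(c)) for c in origin]
    step = [0, 0, 0]               # +1, -1 or 0 per axis
    t_max = [math.inf] * 3         # ray parameter at the next boundary per axis
    t_delta = [math.inf] * 3       # ray parameter needed to cross one voxel
    for i in range(3):
        d = direction[i]
        if d > 0:
            step[i] = 1
            t_max[i] = (voxel[i] + 1 - origin[i]) / d
            t_delta[i] = 1.0 / d
        elif d < 0:
            step[i] = -1
            t_max[i] = (voxel[i] - origin[i]) / d
            t_delta[i] = -1.0 / d
    for _ in range(max_steps):
        if not all(0 <= voxel[i] < grid_size[i] for i in range(3)):
            return                 # the ray has left the grid
        yield tuple(voxel)
        axis = t_max.index(min(t_max))   # nearest boundary decides the axis
        if t_max[axis] == math.inf:
            return                 # zero direction vector: nothing to step
        voxel[axis] += step[axis]
        t_max[axis] += t_delta[axis]
```

For example, `list(traverse_grid((0.5, 0.5, 0.5), (1.0, 0.5, 0.0), (3, 3, 1)))` visits `(0, 0, 0)`, `(1, 0, 0)`, `(1, 1, 0)` and `(2, 1, 0)` before leaving the grid.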
Article
A multiprocessor algorithm for ray tracing is described. The performance of the algorithm is analysed for a cubic and square array of processors with only local communication between near neighbours. Theoretical expressions for the speedup of the system as a function of the number of processors are derived. These analytic results are supported by simulations of ray tracing on a number of simple scenes with polygonal surfaces. It is found that a square network of processors generally performs better than a cubic network. Some comments are made on the construction of such a system using current (1985) microprocessor technology.
Conference Paper
A new method for ray tracing parametric surfaces is presented. The new algorithm solves the ray surface intersection directly using multivariate Newton iteration. This provides enough generality to render surfaces which could not be ray traced using existing methods. To overcome the problem of finding a starting point for the Newton algorithm, techniques from Interval Analysis are employed. The results are presented in terms of solving a general nonlinear system of equations f(x)= 0, and thus can be extended to a large class of problems which arise in computer graphics.
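As a rough illustration of solving the ray-surface intersection with multivariate Newton iteration, the sketch below treats the intersection as the nonlinear system F(u, v, t) = S(u, v) - (o + t d) = 0 in the three unknowns u, v and t. It is a minimal sketch under stated assumptions, not the method of the abstract: the caller supplies the starting point by hand (the interval-analysis step that finds a safe starting point is not shown), and the names `newton_ray_surface`, `solve3` and `det3` are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(cols, rhs):
    """Solve a 3x3 linear system, given its matrix columns, by Cramer's rule."""
    rows = [[cols[j][i] for j in range(3)] for i in range(3)]
    det = det3(rows)
    if abs(det) < 1e-12:
        raise ValueError("singular Jacobian")
    solution = []
    for k in range(3):
        m = [row[:] for row in rows]
        for i in range(3):
            m[i][k] = rhs[i]       # replace column k with the right-hand side
        solution.append(det3(m) / det)
    return solution

def newton_ray_surface(surface, partials, origin, direction, guess,
                       tol=1e-10, max_iter=50):
    """Find (u, v, t) with surface(u, v) == origin + t * direction.

    surface(u, v) returns a 3D point; partials(u, v) returns (dS/du, dS/dv).
    Returns None if the iteration does not converge within max_iter steps.
    """
    u, v, t = guess
    for _ in range(max_iter):
        p = surface(u, v)
        f = [p[i] - origin[i] - t * direction[i] for i in range(3)]
        if max(abs(x) for x in f) < tol:
            return u, v, t
        su, sv = partials(u, v)
        neg_d = [-x for x in direction]          # dF/dt = -d
        du, dv, dt = solve3([su, sv, neg_d], [-x for x in f])
        u, v, t = u + du, v + dv, t + dt
    return None

# Example surface (an assumption for illustration): the paraboloid z = u^2 + v^2.
def paraboloid(u, v):
    return (u, v, u * u + v * v)

def paraboloid_partials(u, v):
    return ((1.0, 0.0, 2.0 * u), (0.0, 1.0, 2.0 * v))
```

A ray fired straight down from `(0.5, 0.25, 5.0)` meets this paraboloid at u = 0.5, v = 0.25; the iteration reaches it from the guess `(0.0, 0.0, 0.0)`. Like any Newton method, it can diverge or land on a different root when the starting point is poor, which is the problem the abstract addresses with interval techniques.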
Conference Paper
A new algorithm for speeding up ray-object intersection calculations is presented. Objects are bounded by a new type of extent, which can be made to fit convex hulls arbitrarily tightly. The objects are placed into a hierarchy. A new hierarchy traversal algorithm is presented which is efficient in the sense that objects along the ray are queried in an efficient order. Results are presented which demonstrate that our technique is several times faster than other published algorithms. Furthermore, we demonstrate that it is currently possible to ray trace scenes containing hundreds of thousands of objects.
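One way to test a ray against a bounding extent is to clip the ray's parameter interval against pairs of parallel planes. The sketch below assumes the extent is built from such plane pairs and shows only the axis-aligned special case with three pairs; the abstract does not spell out its extent in this detail, and the name `ray_slab_interval` is hypothetical.

```python
import math

def ray_slab_interval(origin, direction, box_min, box_max):
    """Return (t_near, t_far) where the ray overlaps the box, or None on a miss.

    Each axis contributes one pair of parallel planes; the ray hits the box
    only if the intervals from all three pairs overlap for some t >= 0.
    """
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this plane pair: it must start between them.
            if o < lo or o > hi:
                return None
            continue
        t0 = (lo - o) / d
        t1 = (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return None
    return t_near, t_far
```

The returned `t_near` gives a natural sort key: visiting bounding volumes in increasing `t_near` is one plausible way to query objects along the ray front to back.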
Article
This paper describes algorithmic procedures that have been implemented to reduce the computational expense of producing ray-traced images. The selection of bounding volumes is examined to reduce the computational cost of the ray-intersection test. The use of object coherence, which relies on a hierarchical description of the environment, is then presented. Finally, since the building of the ray-intersection trees is such a large portion of the computation, a method using image coherence is described. This visible-surface preprocessing method, which is dependent upon the creation of an "item buffer," takes advantage of a priori image information. Examples that indicate the efficiency of these techniques for a variety of representative environments are presented.
Article
To accurately render a scene, global illumination information that affects the intensity of each pixel of the image must be known at the time the intensity is calculated. In a simplified form, this information is stored in a tree of “rays” extending from the viewer to the first surface encountered and from there to other surfaces and to the light sources. The visible surface algorithm creates this tree for each pixel of the display and passes it to the shader. The shader then traverses the tree to determine the intensity of the light received by the viewer. Consideration of all of these factors allows the shader to accurately simulate true reflection, shadows, and refraction as well as the effects simulated by conventional shaders. Anti-aliasing is included as an integral part of the visibility calculations. Surfaces displayed include curved as well as polygonal surfaces.
Article
An algorithm is described that speeds up ray-tracing techniques by reducing the number of time-consuming object-ray intersection calculations that have to be made. The algorithm is based on subdividing space into an octree, associating a given voxel with only those objects whose surfaces pass through the volume of the voxel. It includes a technique for obtaining fast access to any node and a mechanism for finding the next node intersected by a ray when it has hit nothing in the current node. This new algorithm makes possible the ray tracing of complex scenes by medium-scale and small-scale computers.
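Locating the octree node that contains a given point can be sketched as a descent from the root, choosing one of eight children at each level. This is a minimal illustration, not the fast node access technique of the abstract, which is not described here; the uniform subdivision to a fixed depth and the names `OctreeNode`, `build` and `find_leaf` are assumptions.

```python
class OctreeNode:
    """A cubic region; children is None for a leaf, else a list of 8 nodes."""
    def __init__(self, lo, hi, children=None):
        self.lo, self.hi = lo, hi
        self.children = children
        self.objects = []          # objects whose surfaces pass through this voxel

def midpoint(lo, hi):
    return tuple((a + b) / 2 for a, b in zip(lo, hi))

def build(lo, hi, depth):
    """Build a uniformly subdivided octree (an assumption; real ones adapt)."""
    if depth == 0:
        return OctreeNode(lo, hi)
    mid = midpoint(lo, hi)
    children = []
    for idx in range(8):
        # Bit i of idx selects the upper half along axis i.
        c_lo = tuple(mid[i] if (idx >> i) & 1 else lo[i] for i in range(3))
        c_hi = tuple(hi[i] if (idx >> i) & 1 else mid[i] for i in range(3))
        children.append(build(c_lo, c_hi, depth - 1))
    return OctreeNode(lo, hi, children)

def find_leaf(node, point):
    """Descend from node to the leaf voxel containing point."""
    while node.children is not None:
        mid = midpoint(node.lo, node.hi)
        idx = sum(1 << i for i in range(3) if point[i] >= mid[i])
        node = node.children[idx]
    return node
```

With such a lookup, one plausible way to find the next node along a ray that hit nothing is to take a point just beyond the ray's exit from the current voxel and call `find_leaf` on it again.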