Numerical Modeling of Space Plasma Flows: ASTRONUM-2009
ASP Conference Series, Vol. 429, 2010
Nikolai V. Pogorelov, Edouard Audit, and Gary P. Zank, eds.
VAPOR: Visual, Statistical, and Structural Analysis of Astrophysical Flows
John Clyne
National Center for Atmospheric Research, Boulder, Colorado USA
Kenny Gruchalla
National Renewable Energy Laboratory, Golden, Colorado USA
Mark Rast
University of Colorado, Boulder, Colorado USA
Abstract. In this paper we discuss recent developments in the capabilities of VAPOR: a desktop application that leverages today's powerful CPUs and GPUs to enable visualization and analysis of terascale data sets using only a commodity PC or laptop. We review VAPOR's current capabilities, highlighting support for Adaptive Mesh Refinement (AMR) grids, and present new developments in interactive feature-based visualization and statistical analysis.
1. Introduction
Interactive visualization and analysis of data from astrophysical flow simulations faces increasing challenges with the ever increasing size of those calculations. The mismatch between visualization/analysis and computational resources means that some form of data reduction must be employed to maintain interactivity in the visualization/analysis process. To date VAPOR has addressed this challenge via multiresolution access and Cartesian-volume region-of-interest (ROI) extraction (Clyne & Rast 2005). We briefly summarize here key elements of the VAPOR visual data analysis environment before discussing at some length VAPOR's support for Adaptive Mesh Refinement (AMR) grids and recent developments in interactive feature-based visualization and analysis. These developments extend the region-of-interest concept to volumes defined by the solution properties and field variable correlations, loosely termed structures or features.
In previous work we have described in detail many of the capabilities of the VAPOR package (Clyne & Rast 2005; Clyne et al. 2007; Rast & Clyne 2008; Mininni et al. 2008b). Three key components distinguish VAPOR from other advanced visualization packages that the authors are aware of:
VAPOR is open source, available at http://www.vapor
- a wavelet-based multiresolution data model that enables interactive data browsing of high-resolution simulation outputs using only modest computing resources (e.g., a conventional desktop or laptop)
- a feature set that is targeted toward the specialized analysis needs of the astrophysical and geophysical computational fluid dynamics communities
- a close coupling between VAPOR's highly interactive exploratory visualization capabilities and ITT's fourth-generation scientific data processing language, IDL
2.1. Multiresolution
VAPOR utilizes a hierarchical data representation as a strategy to approach the challenges of interactively analyzing large-scale data volumes. The simulation outputs are stored hierarchically, with each level in that hierarchy providing a coarsened approximation of the data at the preceding level. This approach exploits the fact that many visualization and analysis operations can tolerate a level of information loss by retrieving only the level of fidelity that is required for the current operations. For analysis operations that require access to the data at full resolution, this approach still allows the original data to be accessed in their entirety, without loss of information.
The hierarchical multiresolution access is accomplished through a wavelet decomposition and reconstruction scheme (Clyne 2003). The data are stored as a hierarchy of successively coarser wavelet coefficients, with each level representing a halving of the data resolution along each spatial axis, resulting in an eight-fold reduction in the size of the data volume and a corresponding reduction in required visualization and analysis resources. Storing the data hierarchy as wavelet coefficients avoids the penalty of keeping multiple data copies. A three-dimensional Haar wavelet (Haar 1910) is currently used for this transformation. The computational costs of the forward and inverse transforms are negligible compared to those incurred by reading or writing the data, allowing the reconstruction of the data at factor-of-two resolutions with only minimal overhead.
This hierarchical data access scheme allows an investigator to control the fidelity of data in accordance with the available resources, the desired interactivity, and the requirements of the analysis. This forms the basis for an iterative analysis process, where the investigator can interactively browse coarsened representations of the data across the global spatiotemporal domain to identify features of interest. Once identified, the analysis domain can be restricted to these features, increasing the level of data resolution that can be handled interactively. Often both visual inspection and numerical analysis are fairly insensitive to considerable data coarsening (Clyne & Rast 2005), providing substantial savings in computational costs and input/output overhead during the early exploratory stages of investigation when interactivity is most crucial. Of course, subsequent verification of the analysis results can be accomplished less interactively at full resolution if necessary.
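The wavelet hierarchy described in this section can be illustrated with a simplified, unnormalized 3D Haar scheme. This is a minimal NumPy sketch, not VAPOR's actual encoder, and the function names are hypothetical; but the structure mirrors the text: each level halves the resolution along every axis (an eight-fold volume reduction), and retaining the detail coefficients makes reconstruction lossless.

```python
import numpy as np

def haar3d_coarsen(volume):
    """One level of a simplified 3D Haar decomposition: each 2x2x2 block
    of the input is replaced by its mean (the approximation), giving an
    eight-fold reduction in volume size; the per-block deviations from
    the mean are the detail coefficients needed to invert the step."""
    v = volume.reshape(volume.shape[0] // 2, 2,
                       volume.shape[1] // 2, 2,
                       volume.shape[2] // 2, 2)
    approx = v.mean(axis=(1, 3, 5))                  # coarsened volume
    detail = v - approx[:, None, :, None, :, None]   # wavelet details
    return approx, detail

def haar3d_refine(approx, detail):
    """Invert one level: add the stored details back to the block means."""
    full = approx[:, None, :, None, :, None] + detail
    s = approx.shape
    return full.reshape(2 * s[0], 2 * s[1], 2 * s[2])

# Build a three-level hierarchy from a 64^3 volume.
data = np.random.rand(64, 64, 64)
levels = []
cur = data
for _ in range(3):
    cur, det = haar3d_coarsen(cur)
    levels.append(det)

# cur is now an 8^3 approximation; reconstruction is lossless:
rec = cur
for det in reversed(levels):
    rec = haar3d_refine(rec, det)
assert np.allclose(rec, data)
```

Because the detail coefficients at each level occupy exactly the storage freed by the coarsening, a hierarchy of this form holds no redundant copies of the data, consistent with the storage argument above.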
Figure 1. Tens of thousands of vortical structures (left) in a high-resolution simulation of Taylor-Green forced turbulence (Mininni, Alexakis, & Pouquet 2008a). A vortical feature identified and extracted from a data volume using high global values of vorticity and low absolute values of feature-local helicity (right), shown with streamlines seeded in the velocity field. These types of non-Cartesian regions of interest can be isolated as structures and examined both visually and quantitatively.
2.2. Targeted features
While VAPOR supports numerous general-purpose visualization algorithms, it also provides capabilities tailored toward astrophysical and geophysical CFD needs. One example, discussed in detail in Mininni et al. (2008b), is the integration and display of magnetic field lines advected by a velocity field. Other specialized algorithms, not reported elsewhere, include methods for visually guided placement of streamline and pathline seed points based on physical properties of the flow, such as local field maxima or minima. Interactive seeding is facilitated by cutting planes at arbitrary orientations in the volume and interactive probing of the data values.
A recent development (Gruchalla et al. 2009) has focused on broadening the concept of a ROI beyond Cartesian sub-volumes to coherent structures by combining multivariate volume visualization techniques (Kniss, Kindlmann, & Hansen 2002; Doleisch, Gasser, & Hauser 2003) with a connected component analysis (Suzuki, Horibia, & Sugie 2003). Structures can be broadly and iteratively defined by multivariate transfer functions, and can thus represent local regions of correlation or anticorrelation between field variables as well as those identified by more traditional thresholding of a single measure. The algorithm executes a connected component analysis of the volume, based on user-defined opacity values in the transfer function, to label individual structures. Structure-dependent histograms of the original or other derived variables can then be displayed, and structure statistics can be used to guide further selection, definition, and identification in an iterative refinement loop. For example, from the tens of thousands of vortical structures in a recent simulation of Taylor-Green forced turbulence (Mininni, Alexakis, & Pouquet 2008a), those regions with both high
vorticity and low helicity can be readily identified and extracted (Figure 1) and compared to other highly vortical but more helical regions. Such non-Cartesian ROI extraction can significantly reduce data volumes, with the coordinates of the voxels contained in the structures readily output for use in subsequent analysis.
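The label-then-summarize loop described above can be approximated in a few lines with an off-the-shelf connected-component labeler. This sketch substitutes scipy.ndimage.label for VAPOR's internal implementation and a simple boolean mask for the multivariate transfer-function opacity; the function and field names are hypothetical:

```python
import numpy as np
from scipy import ndimage

def extract_structures(field, opacity, min_voxels=1):
    """Label connected regions where the opacity mask is non-zero, then
    gather simple per-structure statistics.  `opacity` stands in for the
    user-defined transfer-function opacity described in the text."""
    labels, n = ndimage.label(opacity > 0)   # 3D connected components
    stats = []
    for i in range(1, n + 1):
        voxels = field[labels == i]
        if voxels.size >= min_voxels:
            stats.append({"id": i, "size": voxels.size,
                          "mean": voxels.mean(), "max": voxels.max()})
    return labels, stats

# Two disjoint high-value blobs in a toy scalar field:
w = np.zeros((32, 32, 32))
w[2:6, 2:6, 2:6] = 1.0
w[20:25, 20:25, 20:25] = 2.0
labels, stats = extract_structures(w, opacity=(w > 0.5))
# → two labeled structures, with size/mean/max recorded for each
```

The per-structure statistics returned here are exactly the kind of quantities that can drive the iterative refinement loop: a threshold on, say, mean helicity over each labeled structure selects the subset of structures for the next round of visualization.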
2.3. Coupling visualization with quantitative data analysis
VAPOR seamlessly interfaces with ITT's fourth-generation language IDL, allowing investigators to perform rigorous quantitative analyses guided by VAPOR's intrinsic 2D and 3D visualization capabilities. The integration of IDL and VAPOR is facilitated by metadata exchange defining the attributes and resolution of the data. A library of data-access routines allows IDL to read and write data in VAPOR's wavelet-encoded representation, an approach that is readily generalizable to other analysis packages. In typical usage, the investigator will maintain simultaneously active VAPOR and IDL sessions, visually identifying ROIs with VAPOR and exporting them to IDL for further study. Interactivity is maintained if the ROI is sufficiently small or if the operation is sufficiently well-behaved over coarsened approximations of the data (Clyne & Rast 2005). Quantities derived in the IDL session are imported back into the existing VAPOR session for continued visual investigation. Through the iteration of this process, large-scale data sets can be interactively explored, visualized, and analyzed without the usual delays caused by reading, writing, and operating on the data arrays in full.
The primary benefit of coupling visual data investigation with a high-level data analysis language is the ability to target expensive calculations of derived quantities to specific ROIs. The memory and computing requirements for calculating such variables in advance, across the entire domain, can be exorbitant, delaying or preventing further analysis. Moreover, the computation of some analysis quantities requires prior knowledge of the solution. This is particularly true if they are defined by field values (e.g., regions of maximum or minimum measure) or correlations between the flow variables. The new interactive non-Cartesian-volume feature-based ROI capabilities of VAPOR allow, via a multivariate transfer function, precise definition of ROIs based on solution properties, and can thus focus analysis on highly reduced sub-volumes.
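The resource argument for ROI-targeted computation is easy to make concrete. The sketch below uses NumPy in place of the IDL coupling the text describes, with a hypothetical vorticity_magnitude standing in for an expensive derived quantity: restricting the calculation to a visually identified sub-volume shrinks both the arithmetic and the memory footprint by the ratio of the domain sizes.

```python
import numpy as np

def vorticity_magnitude(u, v, w, dx=1.0):
    """|curl| of a velocity field sampled on a uniform grid,
    via centered finite differences (np.gradient)."""
    dwdy, dvdz = np.gradient(w, dx, axis=1), np.gradient(v, dx, axis=2)
    dudz, dwdx = np.gradient(u, dx, axis=2), np.gradient(w, dx, axis=0)
    dvdx, dudy = np.gradient(v, dx, axis=0), np.gradient(u, dx, axis=1)
    return np.sqrt((dwdy - dvdz)**2 + (dudz - dwdx)**2 + (dvdx - dudy)**2)

# Full-domain fields (random stand-ins for simulation output):
shape = (128, 128, 128)
u, v, w = (np.random.rand(*shape) for _ in range(3))

# A 16^3 ROI identified visually: computing the derived quantity on the
# ROI alone is 512x less work and memory than on the 128^3 full grid.
roi = (slice(40, 56), slice(40, 56), slice(40, 56))
omega_roi = vorticity_magnitude(u[roi], v[roi], w[roi])
```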
3. AMR
VAPOR supports a form of the block-structured AMR grid that is most closely described by MacNeice et al. (2000), implemented in the PARAMESH package, and presently employed by the FLASH astrophysical thermonuclear flash code. The computational domain is covered by a base-level, regular, Cartesian grid with uniform sampling. The base grid is partitioned into uniformly sized, non-overlapping blocks: each block contains the same number of samples and covers the same size physical space. Individual parent blocks may be refined by subdividing them into eight child octants. This refinement may be performed recursively, creating an octree hierarchy with varying levels of refinement. A
maximum depth of 10 or 20 levels is not uncommon. All blocks in the hierarchy contain the same number of uniformly distributed samples. VAPOR supports a somewhat less restrictive AMR mesh structure than that of PARAMESH, not requiring that adjacent blocks differ by no more than one level of refinement.
Figure 2. A 3D plume simulation computed with the FLASH astrophysical thermonuclear flash code. With on-the-fly resampling to a uniform grid, the full range of VAPOR capabilities is directly applicable to the AMR-gridded data.
Direct visualization of AMR grids is a complex task. Only recently have practical algorithms been published for such routine visualization operations as direct volume rendering or isosurface construction (Weber et al. 2001b,a). VAPOR supports numerous fundamental visualization algorithms as well as novel visualization methods not found in other packages. To avoid the onerous task of generalizing all of VAPOR's principal visualization methods to support both regular rectilinear grids and a variety of AMR strategies, the approach taken by VAPOR is to resample AMR grids onto a uniformly sampled Cartesian grid. The resampling is performed on the fly, as needed, and at the resolution selected by the user. The user-controlled sampling frequency matches a refinement level in the AMR grid: blocks in the ROI with coarser sampling than the user-specified level are refined through interpolation, and blocks with finer AMR sampling are coarsened. This treatment is analogous to the wavelet-based coarsening and refining that underlies VAPOR operations on regular rectilinear grids. The computational cost of this regridding is fairly modest and is ameliorated somewhat by VAPOR's extensive use of caching.
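The per-block resampling logic can be sketched as follows, assuming the octree conventions described above (factor-of-two refinement between levels, equal sample counts per block). This is an illustrative stand-in rather than VAPOR's code: sample replication substitutes for interpolation, and block averaging performs the coarsening.

```python
import numpy as np

def resample_block(block, block_level, target_level):
    """Bring one AMR block's samples to the user-selected refinement
    level.  Coarser blocks are refined by sample replication (a simple
    stand-in for interpolation); finer blocks are coarsened by block
    averaging, mirroring the Haar-style coarsening used on regular grids."""
    d = target_level - block_level
    if d == 0:
        return block
    if d > 0:                 # refine: repeat each sample 2^d times per axis
        r = 2 ** d
        return block.repeat(r, axis=0).repeat(r, axis=1).repeat(r, axis=2)
    r = 2 ** (-d)             # coarsen: average over r x r x r sample groups
    s = block.shape
    v = block.reshape(s[0] // r, r, s[1] // r, r, s[2] // r, r)
    return v.mean(axis=(1, 3, 5))

# An 8^3 block refined one level becomes 16^3; a 16^3 block coarsened
# one level becomes 8^3, matching the octree's factor-of-two levels.
coarse = resample_block(np.ones((8, 8, 8)), block_level=2, target_level=3)
fine = resample_block(np.arange(4096.).reshape(16, 16, 16), 3, 2)
```

Applying this to every block intersecting the ROI yields the single uniform Cartesian grid on which all of VAPOR's regular-grid visualization methods can then operate.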
At present the only AMR file format that VAPOR's interactive analysis tool, vaporgui, is capable of reading is VAPOR's own custom format. Preparing an AMR data set for analysis with VAPOR requires first translating the data into this format. Command line utilities are provided for translation of FLASH data sets stored in the HDF5 file format. Further, example codes are provided that may be customized for use with other AMR encodings.
4. Conclusion
VAPOR continues to evolve to meet the visualization and analysis challenges facing computational astrophysical and geophysical fluid dynamicists as we near petascale compute capabilities. The focus remains on providing a flexible and useful tool for interactive analysis. Both adaptive mesh refinement and data volume reduction in post-batch analysis will be essential to interactivity in the petascale environment, and efficient algorithmic merging of these remains an ongoing challenge.
Acknowledgments. This work was funded in part by the National Science Foundation under grant ITR-0325934. The AMR plume data was provided by Matthias Rempel, and that of Taylor-Green turbulence by Pablo Mininni.
References
Clyne, J. 2003, in Proceedings of Visualization, Imaging, and Image Processing '03
Clyne, J., Mininni, P., Norton, A., & Rast, M. 2007, New Journal of Physics, 9, 301
Clyne, J., & Rast, M. 2005, in SPIE-IS&T Electronic Imaging, Vol. 5669, 284–294
Doleisch, H., Gasser, M., & Hauser, H. 2003, in Proceedings of the Symposium on Data Visualisation 2003 (Grenoble, France: Eurographics Association), 239–248
Gruchalla, K., Rast, M., Bradley, E., Clyne, J., & Mininni, P. 2009, in IDA 2009, Lecture Notes in Computer Science, Vol. 5772, 321–332
Haar, A. 1910, Mathematische Annalen, 69, 331
Kniss, J., Kindlmann, G., & Hansen, C. 2002, IEEE Transactions on Visualization and Computer Graphics, 8, 270
MacNeice, P., Olson, K. M., Mobarry, C., de Fainchtein, R., & Packer, C. 2000, Computer Physics Communications, 126, 330
Mininni, P., Alexakis, A., & Pouquet, A. 2008a, Physical Review E, 77, 036306
Mininni, P., Lee, E., Norton, A., & Clyne, J. 2008b, New Journal of Physics, 10, 125007
Rast, M., & Clyne, J. 2008, in Numerical Modeling of Space Plasma Flows/Astronum-2007, ASP Conference Series, Vol. 385, 299–308
Suzuki, K., Horibia, I., & Sugie, N. 2003, Computer Vision and Image Understanding, 89, 1
Weber, G. H., Kreylos, O., Ligocki, T. J., Shalf, J., Hagen, H., Hamann, B., Joy, K. I., & Ma, K.-L. 2001a, in VMV '01: Proceedings of the Vision Modeling and Visualization Conference 2001 (Aka GmbH), 121–128
Weber, G. H., Kreylos, O., Ligocki, T. J., Shalf, J. M., Hagen, H., Hamann, B., & Joy, K. 2001b, in Data Visualization 2001 (Springer Verlag), 25–34