Contextual Compression of Large-Scale Wind Turbine Array Simulations
Kenny Gruchalla∗, Nicholas Brunhart-Lupo∗, Kristin Potter∗, and John Clyne†
∗National Renewable Energy Laboratory
†National Center for Atmospheric Research
Abstract—Data sizes are becoming a critical issue particularly
for HPC applications. We have developed a user-driven lossy
wavelet-based storage model to facilitate the analysis and
visualization of large-scale wind turbine array simulations.
The model stores data as heterogeneous blocks of wavelet
coefﬁcients, providing high-ﬁdelity access to user-deﬁned data
regions believed the most salient, while providing lower-ﬁdelity
access to less salient regions on a block-by-block basis. In
practice, by retaining the wavelet coefﬁcients as a function
of feature saliency, we have seen data reductions in excess of
94%, while retaining lossless information in the turbine-wake
regions most critical to analysis and providing enough (low-
ﬁdelity) contextual information in the upper atmosphere to
track incoming coherent turbulent structures. Our contextual
wavelet compression approach has allowed us to deliver interactive
visual analysis while providing the user control over where
data loss, and thus reduction in accuracy, occurs in the analysis.
We argue that this reduced but contextualized representation is a
valid approach and encourage contextual data management.
1. Introduction
There is a data analysis dilemma growing in the ﬁeld
of computational science: our ability to generate numerical
data from scientiﬁc computations is outpacing our ability
to store, move, and analyze those data. In large high-performance
computing (HPC) systems, I/O capabilities, including
bandwidth, memory, and storage capacity, have not scaled
with microprocessor performance. The issue is only expected
to worsen across the computational science community with
the architectural changes expected at the exascale. Generally, as
this data analysis disparity grows, some form of data reduction
will become necessary, even before post-processing.
The study of the interactions between turbine wakes and
the atmospheric boundary layer in wind farm simulations
led to our development of a multi-resolution approach to
data compression. Specifically, analysis of these datasets
requires the detailed examination of turbine wakes; however,
the larger computational domain is only needed for contextual
information. Thus, much of the data coming from a full-resolution
simulation is not necessary. As a response, we
developed a heterogeneous wavelet-based storage scheme
to contextually compress wind farm model data based on
The data volume is decomposed into independent blocks
of wavelet coefficients, which are fully retained in blocks
intersecting the turbine wakes, losslessly capturing their
dynamics, while retaining only a subset of coefficients in
the non-waked blocks, thereby providing a contextually
compressed volume (see Figure 1). This approach has shown
massive data reduction of up to 94.8% for these models,
improving storage cost and rendering times.
Figure 1. The study of the interactions between the atmospheric boundary
layer and a wind-turbine array provides inherent opportunities for
multi-resolution analysis. Detailed examinations of the turbine wakes are of
analytical interest, while the larger computational domain is only needed
for contextual information.
2. Wind Turbine Array Modeling
The U.S. Department of Energy’s 2008 report 20% Wind
Energy by 2030 envisioned that wind power could supply
20% of the U.S. electricity demand. However, signiﬁcant
advances in cost, performance, and reliability will be needed
to achieve this vision. For example, large wind plants are
consistently found to be sub-optimal in terms of performance
and reliability. One of the culprits of this underperformance
is hypothesized to be inadequate accounting for
the inter-turbine effects through the propagation of turbine
wakes. Wakes form immediately downstream of the wind
turbines with a lower mean velocity and an increased turbulence
intensity. Downstream turbines waked by upstream
turbines can experience vastly reduced power output and
abrupt, massive stressing of turbine components. The
dynamics of these wakes are generally not well understood,
and involve complex interactions of wind speed, momentum,
temperature, and moisture, resulting in complicated wake
motion, such as 3D periodic undulation, meandering, and
Computational models of large-scale wind farms are
being used to better understand these phenomena. Currently,
large-eddy simulations are being used to create atmospheric
winds and compute the wind turbine flows, while flexible,
revolving actuator lines model the structural and system
dynamics of the turbines. To capture relevant atmospheric
boundary layer scales, and applicable turbine and wake
dynamics, the simulation domain must represent at least
volume with a grid resolution of
and a temporal resolution of
As a consequence, capturing just a few minutes of ﬂow
through these farms can result in hundreds of terabytes of
data. As computational capabilities continue to improve,
higher ﬁdelity models of these wind farms are envisioned
that capture more of the relevant scales, spanning blade-level
turbulence to atmospheric ﬂow at the mesoscale, potentially
increasing the data demands by orders of magnitude.
Early versions of these models employed a uniform grid,
facilitating the post-processing and visualization of these
data but with signiﬁcant storage penalties. These models
have recently evolved to using a multi-resolution nested grid,
resolution in the peripheral boundary layer and
stepping down to
surrounding the array. The nested
grid signiﬁcantly improves simulation times and storage
requirements, but comes at the cost of increased complexity
and expense in visualization and analysis, as non-uniform grids
of this magnitude (on the order of billions of cells) are difficult
to interrogate interactively. Specifically, volume rendering
these large non-uniform grids becomes untenable without
using leadership-class HPC resources.
3. Wavelet-Based Compression
The existence of high and low areas of interest in turbine
simulations (see Figure 1) led us to the idea of acceptable
data loss and to explore the use of multi-resolution wavelets
for data compression. Multi-resolution wavelets allow the
reconstruction of a data set at varying resolutions in much the
same way as a MIP-mapping scheme. A wavelet transform
decomposes a signal into a set of wavelet functions of
different sizes and positions, and the result of the transform is
a set of coefﬁcients that describe how the wavelet functions
need to be modiﬁed to reconstruct the original signal. Scaling
the wavelets adapts them to different components of the
signal and provides a multi-resolution view of that signal;
the large scales provide an overview, while smaller scales
ﬁll in the details. The wavelet transform provides excellent
energy compaction, i.e., it concentrates energy (information)
into a small number of coefficients, and the multi-resolution
structure of the coefficients provides the ability
to reconstruct the data at varying resolutions. Clyne
et al. demonstrated that a simple hierarchical access
scheme based on the multi-resolution properties of wavelets
can enable the interactive analysis of large terabyte-sized
When wavelet reconstruction leads to an approximation
of the original signal, data loss occurs (referred to as lossy
compression). Computational scientists have recently begun
to investigate the role of lossy compression when applied
to scientific data, and many
visualization and analysis operations have been shown to be
relatively insensitive to this type of data coarsening.
However, lossy compression is not yet widely used in
practice in the computational science community, as many
scientists are hesitant to incur the loss of data that has been
computed at great cost. The idea of lossless compression
is certainly appealing; however, the randomness of lower
order bits in scientiﬁc data has limited lossless compression
rates to less than 2:1. As
a consequence, only with lossy techniques are we likely
to achieve compression rates high enough to balance the
coming disparity between computational and I/O subsystems.
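The energy compaction that makes lossy wavelet compression attractive can be illustrated with a minimal sketch. The example below assumes an orthonormal Haar wavelet and a synthetic one-dimensional signal, both chosen for brevity rather than taken from the work described here; all function names are hypothetical. It transforms the signal, keeps only the largest-magnitude eighth of the coefficients, and measures the resulting reconstruction error.

```python
import numpy as np


def haar_forward(signal):
    """Multi-level orthonormal Haar transform of a length-2^k signal."""
    coeffs = np.asarray(signal, dtype=float).copy()
    n = coeffs.size
    while n > 1:
        half = n // 2
        pairs = coeffs[:n].reshape(half, 2)
        averages = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)
        details = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)
        coeffs[:half] = averages      # coarse approximation, refined next pass
        coeffs[half:n] = details      # detail coefficients at this scale
        n = half
    return coeffs


def haar_inverse(coeffs):
    """Invert haar_forward, rebuilding the signal from coarse to fine."""
    out = np.asarray(coeffs, dtype=float).copy()
    n = 1
    while n < out.size:
        averages = out[:n].copy()
        details = out[n:2 * n].copy()
        out[0:2 * n:2] = (averages + details) / np.sqrt(2.0)
        out[1:2 * n:2] = (averages - details) / np.sqrt(2.0)
        n *= 2
    return out


def keep_largest(coeffs, fraction):
    """Zero all but the largest-magnitude fraction of the coefficients."""
    keep = max(1, int(round(fraction * coeffs.size)))
    order = np.argsort(np.abs(coeffs))[::-1]
    kept = np.zeros_like(coeffs)
    kept[order[:keep]] = coeffs[order[:keep]]
    return kept


# Synthetic signal (an assumption): a large-scale wave plus a weak fine-scale one.
x = np.linspace(0.0, 1.0, 1024)
signal = np.sin(2 * np.pi * 3 * x) + 0.05 * np.sin(2 * np.pi * 60 * x)

coeffs = haar_forward(signal)
lossless = haar_inverse(coeffs)                        # all coefficients
lossy = haar_inverse(keep_largest(coeffs, 1.0 / 8.0))  # 8:1 reduction
rel_error = np.linalg.norm(signal - lossy) / np.linalg.norm(signal)
print(f"relative L2 error at 8:1: {rel_error:.4f}")
```

Reconstructing from all coefficients recovers the signal to floating-point precision, while the truncated set yields an approximation whose error depends on how well the wavelet basis matches the signal.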
The novelty of the scheme we explored is that the degree of
compression can be varied throughout
the data volume, guided by domain knowledge, allowing the
domain expert to choose the regions of the data in which they are
willing to tolerate data loss. In contrast, lossy compression
techniques reported in the literature attempt to minimize the
information loss across the entirety of data largely without
any domain knowledge of the data or analyses that will be
The VAPOR Data Collection (VDC) provides a
progressive access data model based on a wavelet transform’s
multi-resolution and energy compaction properties.
Speciﬁcally, the model decomposes each time step of each
variable into a set of wavelet coefﬁcients. These coefﬁcients
are sorted based on their information content (i.e., magnitude)
and stored in a hierarchical level-of-detail sub-setting scheme.
The number of coefﬁcients across all the levels is equivalent
to the number of grid points in the original data. The
reconstruction of a variable from the wavelet space using all
of the coefﬁcients losslessly restores that variable. However,
the user can select a level-of-detail by using only a subset
of these levels (i.e., those containing the largest magnitude
coefﬁcients) to reconstruct an approximation of the original
The model stores each level of detail as a set of blocks
by decomposing the data volume into independent blocks
before the forward transform. Decomposing the data into
blocks improves the performance of both the forward and
inverse wavelet transformations, and facilitates the extraction
of regions of interest by limiting the amount of data that
needs to be traversed compared to a single large volume.
Larger block dimensions allow deeper levels of detail at
the expense of increased computational and storage access
The VDC is a lossless storage format, allowing scientists
to reconstruct the full-ﬁdelity data by accessing all the
wavelet coefﬁcients across all levels of detail. We introduce
an extension to this scheme by removing entire levels of
detail on a block-by-block basis, achieving a block-level
compression. This allows a domain expert to classify regions
of the data as a function of how salient that data will be
to the a posteriori analysis of the data. That classification
can be designed based on simple spatial regions of interest
or on more complicated feature identification algorithms.
Our extension stores blocks that overlap regions with high
analytical importance with a complete set of coefficients
across all levels of detail, while storing only a subset of
the levels in blocks that contain only regions of lesser
Figure 2. A comparison of renderings between the original full-fidelity velocity data (
/timestep) and the block-level compressed data (
a) The yellow shading illustrates the regions that are losslessly preserved at the original fidelity. b) Volume renderings of the two respective storage schemes.
In our context, the domain scientists are interested in the
turbine wakes in wind farms, identified as low-velocity flows
originating from fixed turbine positions. Therefore, we denote
wakes as regions of interest and the larger surrounding
atmospheric regions as contextual information. Wavelet
blocks that intersect wake regions retain the full complement
of wavelet coefﬁcients, while atmospheric blocks retain and
store only a subset of the coefﬁcients.
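The block-level scheme described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the VDC implementation: it substitutes an orthonormal Haar transform for whatever wavelet the VDC uses, uses hypothetical 8×8×8 blocks on a small synthetic velocity field, approximates "retaining only the most significant level of detail" by keeping the largest 1/512 of each contextual block's coefficients, and identifies the wake with a simple low-velocity threshold. All function names are hypothetical.

```python
import numpy as np


def haar_matrix(n):
    """Orthonormal Haar transform matrix; n must be a power of two."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        m = h.shape[0]
        averages = np.kron(h, np.array([[1.0, 1.0]]))
        details = np.kron(np.eye(m), np.array([[1.0, -1.0]]))
        h = np.vstack([averages, details]) / np.sqrt(2.0)
    return h


def block_slices(index, block):
    """Slices selecting the block with the given 3D block index."""
    return tuple(slice(i * block, (i + 1) * block) for i in index)


def compress(volume, salient_mask, block=8, ratio=512):
    """Return per-block coefficients, truncated in contextual blocks only."""
    h = haar_matrix(block)
    stored = {}
    for index in np.ndindex(*(s // block for s in volume.shape)):
        sl = block_slices(index, block)
        # Separable 3D transform: apply h along each axis of the block.
        coeffs = np.einsum('ai,bj,ck,ijk->abc', h, h, h, volume[sl])
        if not salient_mask[sl].any():
            # Contextual block: keep only the largest 1/ratio coefficients.
            keep = max(1, coeffs.size // ratio)
            flat = coeffs.ravel().copy()
            flat[np.argsort(np.abs(flat))[:-keep]] = 0.0
            coeffs = flat.reshape(coeffs.shape)
        stored[index] = coeffs
    return stored


def reconstruct(stored, shape, block=8):
    """Rebuild the volume; the inverse of an orthonormal transform is its transpose."""
    h = haar_matrix(block)
    volume = np.empty(shape)
    for index, coeffs in stored.items():
        volume[block_slices(index, block)] = np.einsum(
            'ai,bj,ck,abc->ijk', h, h, h, coeffs)
    return volume


# Synthetic stand-in for one velocity component (an assumption, not paper data).
rng = np.random.default_rng(0)
velocity = 8.0 + 0.3 * rng.standard_normal((16, 16, 32))
velocity[4:8, 4:8, 8:24] -= 4.0          # low-velocity "wake"
wake_mask = velocity < 6.0               # hypothetical saliency criterion

stored = compress(velocity, wake_mask)
restored = reconstruct(stored, velocity.shape)
kept = sum(np.count_nonzero(c) for c in stored.values())
print(f"stored {kept} of {velocity.size} coefficients")
```

In this sketch the two blocks that intersect the synthetic wake are restored exactly, while each contextual block collapses to its single largest coefficient, which here carries the block mean.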
In addition, we can quantify the expected level of compression
with this technique. A data set size is scaled by the
factor $\beta_u + \beta_c / r_c$, where $\beta_u$ is the fraction of
space occupied by uncompressed blocks, $\beta_c$ is the fraction
of space occupied by compressed blocks ($\beta_c = 1 - \beta_u$), and
$r_c$ is the compression rate (e.g., $r_c = 512$ for 512:1 compression).
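As a worked check of this estimate, the short sketch below assumes the scale factor takes the form $\beta_u + \beta_c / r_c$ with $r_c$ expressed as a ratio (512 for 512:1), and applies it to the block counts and raw size reported for the 48-turbine case in the evaluation (1,078 salient blocks out of 11,664, and 38.7 GB per time step). The function name is hypothetical.

```python
def scale_factor(beta_u, ratio):
    """Fraction of the original size left after block-level compression.

    beta_u: fraction of the volume held in uncompressed (salient) blocks.
    ratio:  compression ratio applied to the remaining contextual blocks.
    """
    beta_c = 1.0 - beta_u
    return beta_u + beta_c / ratio


# Block counts reported for the 48-turbine simulation.
salient_blocks, total_blocks = 1078, 11664
beta_u = salient_blocks / total_blocks

scale = scale_factor(beta_u, 512)
savings_percent = 100.0 * (1.0 - scale)
compressed_gb = 38.7 * scale

print(f"scale factor: {scale:.4f}")
print(f"savings: {savings_percent:.1f}%")
print(f"estimated size per time step: {compressed_gb:.2f} GB")
```

The estimate ignores metadata and any per-block storage overhead, so it should be read as an approximation of the stored size rather than an exact figure.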
We evaluate data from two turbine-array simulations:
a small array study with two turbines and a large 48-
turbine array model of the Lillgrund wind farm described
by Churchﬁeld, et al. .
The two-turbine data set has a computational domain
and a grid resolution of
. This represents 9.2 billion grid points or
represent one time step of three component velocity at
single ﬂoating-point precision. These data are decomposed
into 35,344 blocks of
voxels. We classify these blocks
into two sets: contextual blocks and salient blocks. Blocks
are classiﬁed as salient if the low-velocity wake region
intersects the blocks, while all non-wake-intersecting blocks
are classiﬁed as contextual. The full set of levels of detail
are retained for the salient blocks, while only the highest
(most signiﬁcant) level of detail is retained for the contextual
blocks. Therefore, the salient blocks can be reconstructed at
the full original ﬁdelity of the simulation, while the contextual
blocks can only be reconstructed at reduced ﬁdelity at 512:1
compression. In our two-turbine test case, the wake regions
intersect 1,715 blocks, classifying the remaining blocks as
contextual. This reduces the data size from
step to less than
per time step, a 94.8% savings in
storage and data movement (see Figure 2).
The results from the 48-turbine Lillgrund simulation are less
dramatic, with 48 wakes occupying a much larger proportion
of the full domain. This simulation domain represents a
volume with a coarser
resolution. This represents 3 billion grid points or 38.7 GB
to represent a single time step of three-component velocity.
The data was decomposed into 11,664 blocks of
and the blocks that intersect with turbine wakes are classiﬁed
as salient. The wakes of the 48 turbines intersect 1,078 of
the blocks, which are preserved at full fidelity. Compressing
the remaining contextual blocks at 512:1 reduces the data
down to a manageable
a single time step of three-component velocity – a 90.6%
For both of these data sets, there was no information loss
in the regions immediately surrounding the turbine wakes.
Therefore, no compression-induced error was propagated
to the quantitative analysis. Error and uncertainty has been
introduced to the contextual regions through the wavelet
compression. Fine small-scale structure have been lost,
but the large-scale formations are still intact and visible.
Qualitatively, the compression effects are difﬁcult to perceive
with the large global perspective shown in Figure 2. Taking a
detailed view of the compression boundary between a salient
and a contextual block, the compression effects become
more apparent (see Figure 3). The loss of the small-scale
turbulent structure at 512:1 compression is visible; however,
the compressed regions still provide clear context for the
analysis in the wake regions. We can identify incoming ﬂows
and distinguish the low velocity and high velocity air, which
provides the contextual information to better understand how
structures in the atmospheric boundary layer may correlate
to the wake behavior.
We evaluated the storage requirements (both data and
metadata) and rendering times for the 48-turbine data set,
comparing a large uniform resolution data volume stored
in the VDC format, a nested grid with coarse resolution
in the atmosphere and ﬁner resolution encapsulating the
wakes, and our block-level compressed extension approxi-
mating the ﬁdelity of the nested grid. For our block-level
compression, we also evaluated rendering times for CPU-
side wavelet reconstruction versus GPU-side reconstruction
of nested textures. Results are provided in Table 1. Storing
the data as an unstructured nested grid generally provides
the worst metrics. The metadata requires nearly
space, primarily to represent the grid, while a time step of
of space. We
did not volume render these grids, as the state of the art
would suggest there is no reasonable expectation of volume
rendering these 3 billion unstructured cells interactively,
except with the largest HPC resources. Storing
resolution across the entire domain in the
VDC format requires
of storage for a time step of
of storage for the VDC metadata, with a rendering rate of
0.5 frames per second. Storing mixed-fidelity blocks
of wavelet coefﬁcients in the VDC format reduces the
data storage to
and only increases the metadata by
410 bytes to
. The CPU-side reconstruction of the
mixed-ﬁdelity blocks creates an artiﬁcially large data texture
equivalent in size to the uniform case, and as expected, the
rendering performance is identical at 0.5 frames per second.
All reported rendering times were measured on an Intel Xeon E5-2470
2.3GHz 8-core CPU with 384GB of RAM and an NVIDIA Quadro 6000 with
6GB of RAM. Frame rate is reported once all data is memory resident.
The GPU-side reconstruction of the mixed-fidelity blocks
into nested data textures of varying resolutions improves the
rendering performance to near-interactive speeds of 3 to 12
frames per second. Early ray termination makes the frame
rate view-dependent and accounts for the large variance in
5. Conclusion & Future Work
We have presented a lossy wavelet-based compression
technique that empowers the domain scientist to control the
compression of data, preserving the most important areas of
the data in full ﬁdelity. Without the use of contextualized
data compression, the interactive visualization and analysis
of the wind turbine data would not have been possible. Using
our technique, researchers were able to interactively volume
render the ﬂow through the turbine array to understand the
turbine-to-turbine interactions, which is critical information
in wind-plant siting and the design of more efﬁcient and
reliable wind turbines.
Keeping an eye towards the exascale systems of the
future, we note that this scheme could be applied in situ
(i.e., while the simulation is running), though this would
require the domain scientist to classify regions or features of
interest beforehand. How to best exploit the data locality of
the simulation to perform that region of interest isolation in
conjunction with the wavelet transform is an open research
question, and a topic for future work.
The technique as presented is ideally suited to data with
discrete regions of interest and non-interest that can be spatially
isolated in wavelet blocks. The wind-energy simulations
are exemplary cases, with wake features intersecting
a relatively small number of blocks. An opportunity for
improvement of this work is the replacement of the block-
level compression scheme with a function-based feature
preservation capability. A user-supplied function could be
used to weight the wavelet coefﬁcients based on different
criteria. This approach would, for example, allow greater
ﬁdelity to be assigned to high-vorticity features, even when
those features uniformly populate the wavelet blocks of the
volume. More precisely, the approach would apply to features
independent of their blocks, providing a compression ratio
related directly to the volume of the features of interest rather
than the volume of the blocks those features intersect.
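The function-based weighting proposed above is future work, so there is no implementation to reference. Purely as an illustration of the idea, the sketch below assumes a one-dimensional orthonormal Haar transform, a hypothetical user-supplied saliency array sampled on the grid, and a weighting rule (the maximum saliency over each coefficient's spatial support) that is our own choice. Coefficients are then ranked by weighted magnitude instead of magnitude alone.

```python
import numpy as np


def haar_forward(signal):
    """Multi-level orthonormal Haar transform of a length-2^k signal."""
    coeffs = np.asarray(signal, dtype=float).copy()
    n = coeffs.size
    while n > 1:
        half = n // 2
        pairs = coeffs[:n].reshape(half, 2)
        averages = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)
        details = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)
        coeffs[:half] = averages
        coeffs[half:n] = details
        n = half
    return coeffs


def haar_inverse(coeffs):
    """Invert haar_forward."""
    out = np.asarray(coeffs, dtype=float).copy()
    n = 1
    while n < out.size:
        averages = out[:n].copy()
        details = out[n:2 * n].copy()
        out[0:2 * n:2] = (averages + details) / np.sqrt(2.0)
        out[1:2 * n:2] = (averages - details) / np.sqrt(2.0)
        n *= 2
    return out


def coefficient_weights(saliency):
    """Weight each coefficient by the peak saliency over its spatial support."""
    n = saliency.size
    weights = np.empty(n)
    weights[0] = saliency.max()      # scaling coefficient spans the whole signal
    size = 1
    while size < n:
        # `size` detail coefficients here, each spanning n // size samples.
        weights[size:2 * size] = saliency.reshape(size, n // size).max(axis=1)
        size *= 2
    return weights


def keep_top(coeffs, scores, count):
    """Keep the `count` coefficients with the highest scores, zero the rest."""
    kept = np.zeros_like(coeffs)
    top = np.argsort(scores)[-count:]
    kept[top] = coeffs[top]
    return kept


rng = np.random.default_rng(1)
signal = rng.standard_normal(1024)       # synthetic data (an assumption)
saliency = np.full(1024, 0.01)
saliency[256:384] = 1.0                  # hypothetical feature of interest

coeffs = haar_forward(signal)
weights = coefficient_weights(saliency)
plain = haar_inverse(keep_top(coeffs, np.abs(coeffs), 160))
weighted = haar_inverse(keep_top(coeffs, weights * np.abs(coeffs), 160))


def region_error(approx, region=slice(256, 384)):
    """Relative L2 error restricted to the salient region."""
    return (np.linalg.norm(approx[region] - signal[region])
            / np.linalg.norm(signal[region]))


print(f"salient-region error, magnitude only: {region_error(plain):.3f}")
print(f"salient-region error, weighted:       {region_error(weighted):.3f}")
```

With the same coefficient budget, the weighted ranking spends it on the coefficients that influence the salient region, which is the behavior the proposed feature-preservation capability is meant to provide independent of block boundaries.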
In the current scheme, the set of compressed blocks is
constant across both variable and time. Clearly there are
opportunities to manipulate the compression from variable
to variable and time step to time step, which could have
interesting impacts on analyses and the I/O systems that
support them, such as the promulgation of burst buffers
within compute nodes.
The research was performed using computational re-
sources sponsored by the Department of Energy’s Ofﬁce of
Energy Efﬁciency and Renewable Energy and located at the
National Renewable Energy Laboratory.
Figure 3. Illustration showing the time evolution of incoming ﬂow at a compression boundary. In the upper portion of the ﬁgure, we show a volumetric
rendering of the boundary, with the loss of small-scale structure at 512:1 compression visible; however, the compressed regions still provide clear context for
the analysis in the wake regions. In the lower portion of the ﬁgure, we plot rotor power over time as derived from the simulation software, with a moving
average in red. At time step 12210 (left), we see the turbine is in a low-velocity ﬂow with expanding lateral wake growth, strong downstream turbulent
mixing, and corresponding low power production. A high-velocity ﬂow followed by a low-velocity structure can be identiﬁed upstream in the compressed
area. At time step 12230 (right), that high-velocity flow has reached the turbine, affecting the lateral wake growth, turbulent mixing, and power production.
TABLE 1. STORAGE AND RENDERING MEASUREMENTS FOR THE 48-TURBINE ARRAY SIMULATION TEST CASE.
                        Unstructured Grid   Uniform Resolution VDC   Mixed Resolution VDC   Mixed Resolution VDC
Reconstruction          n/a                 CPU                      CPU                    GPU
Frame Rate              *                   0.5 fps                  0.5 fps                3 fps to 12 fps
Render Time             *                   2 s                      2 s                    0.08 s to 0.33 s
Time to Frame           hours               360 s                    300 s                  70 s
$\vec{u}$ Volume Size   3.9 GB              38.7 GB                  3.7 GB                 3.7 GB
Metadata Size           39.9 GB             673 kB                   674 kB                 674 kB
* Not volume rendered.
References
S. Ahern et al., “Scientiﬁc discovery at the exascale,” DOE ASCR 2011
Workshop on Exascale Data Management, Analysis, and Visualization,
Tech. Rep., 2011.
“20% Wind Energy by 2030 [electronic resource] : Increasing Wind
Energy’s Contribution to U.S. Electricity Supply,” National Renewable
Energy Laboratory, Tech. Rep. DOE/GO-102008-2567, May 2008.
M. Sprague, P. Moriarty, M. Churchﬁeld, K. Gruchalla, S. Lee,
J. Lundquist, J. Michalakes, and A. Purkayastha, “Computational
modeling of wind-plant aerodynamics,” in SciDAC 2011, 2011.
U. Hassan, G. Taylor, and A. Garrad, “The dynamic response of wind
turbines operating in wake ﬂow,” Journal of Wind Engineering and
Industrial Aerodynamics, vol. 27, pp. 113–126, 1988.
“OpenFOAM,” www.openfoam.org.
J. Jonkman and M. J. Buhl, “Fast user’s guide,” National Renewable
Energy Laboratory, Tech. Rep. NREL/TP-500-38230, 2005.
M. Churchfield, S. Lee, P. Moriarty, L. Martinez, S. Leonardi, G. Vijayakumar,
and J. Brasseur, “A large-eddy simulation of wind-plant
aerodynamics,” in 50th AIAA Aerospace Sciences Meeting including
the New Horizons Forum and Aerospace Exposition. American
Institute of Aeronautics and Astronautics, 2012.
H. Childs, D. Pugmire, S. Ahern, B. Whitlock, M. Howison, Prabhat,
G. Weber, and E. W. Bethel, “Visualization at Extreme Scale Concurrency,”
in High Performance Visualization: Enabling Extreme-Scale
Scientific Insight, E. W. Bethel, H. Childs, and C. Hansen, Eds. CRC
Press/Francis–Taylor Group, 2012, pp. 291–306.
J. Patchett, J. Ahrens, S. Ahern, and D. Pugmire, “Parallel visualization
and analysis with paraview on a cray xt4.” Cray User Group, May
2009, pp. 1–5.
K. Moreland, L. Avila, and L. A. Fisk, “Parallel unstructured volume
rendering in paraview,” vol. 6495, 2007, pp. 64 950F–64 950F–12.
J. Clyne, “Progressive data access for regular grids,” in High Performance
Visualization: Enabling Extreme-Scale Scientific Insight,
E. W. Bethel, H. Childs, and C. Hansen,
Eds. CRC Press/Francis–Taylor Group, 2012, pp. 145–170.
S. G. Mallat, “A theory for multiresolution signal decomposition: the
wavelet representation,” Pattern Analysis and Machine Intelligence,
IEEE Transactions on, vol. 11, no. 7, pp. 674–693, 1989.
J. Clyne, K. Gruchalla, and M. Rast, “Vapor: Visual, statistical, and
structural analysis of astrophysical ﬂows,” in Numerical Modeling of
Space Plasma Flows. Astronomical Society of the Paciﬁc Conference
Series, vol. 429, 2010, pp. 323–329.
N. Hübbe, A. Wegener, J. M. Kunkel, Y. Ling, and T. Ludwig,
“Evaluating lossy compression on climate data,” in Supercomputing.
Springer, 2013, pp. 343–356.
S. Lakshminarasimhan, N. Shah, S. Ethier, S. Klasky, R. Latham,
R. Ross, and N. F. Samatova, “Compressing the incompressible with
isabela: In-situ reduction of spatio-temporal data,” in Euro-Par 2011
Parallel Processing. Springer, 2011, pp. 366–379.
D. Laney, S. Langer, C. Weber, P. Lindstrom, and A. Wegener,
“Assessing the effects of data compression in simulations using
physically motivated metrics,” in Proceedings of SC13: International
Conference for High Performance Computing, Networking, Storage
and Analysis. ACM, 2013, p. 76.
J. Woodring, S. Mniszewski, C. Brislawn, D. DeMarle, and J. Ahrens,
“Revisiting wavelet compression for large-scale climate data using
jpeg 2000 and ensuring data precision,” in Large Data Analysis and
Visualization (LDAV), 2011 IEEE Symposium on, Oct 2011, pp. 31–38.
J. Clyne and M. Rast, “A prototype discovery environment for
analyzing and visualizing terascale turbulent ﬂuid ﬂow simulations,”
in Visualization and Data Analysis 2005, R. F. Erbacher, J. C. Roberts,
M. T. Grohn, and K. Borner, Eds., vol. 5669. San Jose, CA, USA:
SPIE, Mar. 2005, pp. 284–294.
P. Mininni, E. Lee, A. Norton, and J. Clyne, “Flow visualization and
ﬁeld line advection in computational ﬂuid dynamics: application to
magnetic ﬁelds and turbulent ﬂows,” New Journal of Physics, vol. 10,
no. 12, p. 125007, 2008.
S. Li, K. Gruchalla, K. Potter, J. Clyne, and H. Childs, “Evaluating
the Efﬁcacy of Wavelet Conﬁgurations on Turbulent-Flow Data,”
in Proceedings of IEEE Symposium on Large Data Analysis and
Visualization, Chicago, IL, Oct. 2015, pp. 81–89.
V. Engelson, D. Fritzson, and P. Fritzson, “Lossless compression of
high-volume numerical data from simulations.” in Data Compression
Conference. Citeseer, 2000, p. 574.
P. Ratanaworabhan, J. Ke, and M. Burtscher, “Fast lossless compres-
sion of scientiﬁc ﬂoating-point data,” in Data Compression Conference,
2006. DCC 2006. Proceedings. IEEE, 2006, pp. 133–142.
P. Lindstrom and M. Isenburg, “Fast and efﬁcient compression of
ﬂoating-point data,” Visualization and Computer Graphics, IEEE
Transactions on, vol. 12, no. 5, pp. 1245–1250, 2006.
M. Burtscher and P. Ratanaworabhan, “High throughput compres-
sion of double-precision ﬂoating-point data,” in Data Compression
Conference, 2007. DCC’07. IEEE, 2007, pp. 293–302.
M. Burtscher and P. Ratanaworabhan, “FPC: A high-speed compressor
for double-precision ﬂoating-point data,” Computers, IEEE Transac-
tions on, vol. 58, no. 1, pp. 18–31, 2009.
V. Norton and J. Clyne, “The vapor visualization application,” in High
Performance Visualization: Enabling Extreme-Scale Scientific Insight,
E. W. Bethel, H. Childs, and C. Hansen, Eds. CRC Press/Francis–Taylor
Group, 2012, pp. 145–170.