Contextual Compression of Large-Scale Wind Turbine Array Simulations
Kenny Gruchalla, Nicholas Brunhart-Lupo, Kristin Potter, and John Clyne
National Renewable Energy Laboratory
National Center for Atmospheric Research
Abstract—Data sizes are becoming a critical issue, particularly for HPC applications. We have developed a user-driven lossy wavelet-based storage model to facilitate the analysis and visualization of large-scale wind turbine array simulations. The model stores data as heterogeneous blocks of wavelet coefficients, providing high-fidelity access to user-defined data regions believed to be the most salient, while providing lower-fidelity access to less salient regions on a block-by-block basis. In practice, by retaining the wavelet coefficients as a function of feature saliency, we have seen data reductions in excess of 94%, while retaining lossless information in the turbine-wake regions most critical to analysis and providing enough (low-fidelity) contextual information in the upper atmosphere to track incoming coherent turbulent structures. Our contextual wavelet compression approach has allowed us to deliver interactive visual analysis while giving the user control over where data loss, and thus reduced accuracy, occurs in the analysis. We argue that this reduced but contextualized representation is a valid approach that encourages contextual data management.
1. Introduction
There is a data analysis dilemma growing in the field
of computational science: our ability to generate numerical
data from scientific computations is outpacing our ability
to store, move, and analyze those data. In large high-performance computing (HPC) systems, I/O capabilities, including bandwidth, memory, and storage capacity, have not scaled with microprocessor performance. The issue is only expected
to worsen across the computational science community with
architectural changes expected at the exascale. Generally, as
this data analysis disparity grows, some form of data reduc-
tion will become necessary, even before post-processing [1].
The study of the interactions between turbine wakes and the atmospheric boundary layer in wind farm simulations led to our development of a multi-resolution approach to data compression. Specifically, analysis of these datasets requires the detailed examination of turbine wakes; however, the larger computational domain is needed only for contextual information. Thus, much of the data coming from a full-resolution simulation is not necessary. In response, we developed a heterogeneous wavelet-based storage scheme to contextually compress wind farm model data based on feature saliency.
The data volume is decomposed into independent blocks of wavelet coefficients, which are fully retained in blocks intersecting the turbine wakes, losslessly capturing their dynamics, while only a subset of coefficients is retained in the non-waked blocks, thereby providing a contextually compressed volume (see Figure 1). This approach has shown massive data reduction, in excess of 94% for these models, improving storage costs and rendering times.

Figure 1. The study of the interactions between the atmospheric boundary layer and a wind-turbine array provides inherent opportunities for multi-resolution analysis. Detailed examinations of the turbine wakes are of analytical interest, while the larger computational domain is needed only for contextual information.
2. Wind Turbine Array Modeling
The U.S. Department of Energy’s 2008 report 20% Wind
Energy by 2030 [2] envisioned that wind power could supply
20% of the U.S. electricity demand. However, significant
advances in cost, performance, and reliability will be needed
to achieve this vision. For example, large wind plants are
consistently found to be sub-optimal in terms of performance
and reliability. One of the culprits of this underperformance is hypothesized to be related to inadequate accounting for
the inter-turbine effects through the propagation of turbine
wakes [3]. Wakes form immediately downstream of the wind
turbines with a lower mean velocity and an increased tur-
bulence intensity. Downstream turbines waked by upstream
turbines can experience vastly reduced power output, and
abrupt, massive stressing of turbine components [4]. The
dynamics of these wakes are generally not well understood,
and involve complex interactions of wind speed, momentum,
temperature, and moisture, resulting in complicated wake
motion, such as 3D periodic undulation, meandering, and
lateral growth.
Computational models of large-scale wind farms are
being used to better understand these phenomena. Currently,
large-eddy simulations are being used to create atmospheric
winds and compute the wind turbine flows [5], while flexible,
revolving actuator lines model the structural and system
dynamics of the turbines [6]. To capture relevant atmospheric
boundary layer scales, and applicable turbine and wake
dynamics, the simulation domain must represent at least
3 km
3 km
1 km
volume with a grid resolution of
1 m
and a temporal resolution of
1 Hz
As a consequence, capturing just a few minutes of flow
through these farms can result in hundreds of terabytes of
data. As computational capabilities continue to improve,
higher fidelity models of these wind farms are envisioned
that capture more of the relevant scales, spanning blade-level
turbulence to atmospheric flow at the mesoscale, potentially
increasing the data demands by orders of magnitude.
Early versions of these models employed a uniform grid,
facilitating the post-processing and visualization of these
data but with significant storage penalties. These models
have recently evolved to use a multi-resolution nested grid, with 10 m resolution in the peripheral boundary layer stepping down to 1 m surrounding the array. The nested grid significantly improves simulation times and storage requirements, but comes at the cost of increased complexity and cost in visualization and analysis, as non-uniform grids of this magnitude (on the order of billions of cells) are difficult to interrogate interactively. Specifically, volume rendering these large non-uniform grids becomes untenable without using leadership-class HPC resources [8], [9], [10].
3. Wavelet-Based Compression
The existence of high and low areas of interest in turbine
simulations (see Figure 1) led us to the idea of acceptable
data loss and to explore the use of multi-resolution wavelets
for data compression. Multi-resolution wavelets allow the
reconstruction of a data set at varying resolutions in much the
same way as a MIP-mapping scheme. A wavelet transform
decomposes a signal into a set of wavelet functions of
different sizes and positions, and the result of the transform is
a set of coefficients that describe how the wavelet functions
need to be modified to reconstruct the original signal. Scaling
the wavelets adapts them to different components of the
signal and provides a multi-resolution view of that signal;
the large scales provide an overview, while smaller scales
fill in the details. The wavelet transform provides excellent
energy compaction, i.e., it concentrates energy (information)
into a small number of coefficients [11], and the multi-
resolution structure of the coefficients provides the ability
to reconstruct the data at varying resolutions [12]. Clyne
et al. [13] demonstrated that a simple hierarchical access
scheme based on the multi-resolution properties of wavelets
can enable the interactive analysis of large terabyte-sized
data sets.
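The multi-resolution behavior described above can be sketched with a one-dimensional orthonormal Haar transform. This is a minimal stand-in for the wavelets used in practice, not the transform employed by the authors; the function names and the choice of the Haar basis are our own illustration:

```python
import numpy as np

def haar_forward(signal, levels):
    """Multi-level 1-D Haar transform.
    Returns [approximation, details_coarsest, ..., details_finest]."""
    coeffs = []
    approx = signal.astype(float)
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        coeffs.append((even - odd) / np.sqrt(2.0))  # high-pass: local differences
        approx = (even + odd) / np.sqrt(2.0)        # low-pass: local averages
    coeffs.append(approx)
    return coeffs[::-1]

def haar_inverse(coeffs, keep_levels=None):
    """Reconstruct the signal; if keep_levels is given, zero out all but the
    coarsest `keep_levels` detail bands, yielding a lower-fidelity overview."""
    approx, details = coeffs[0].copy(), coeffs[1:]
    if keep_levels is not None:
        details = [d if i < keep_levels else np.zeros_like(d)
                   for i, d in enumerate(details)]
    for detail in details:
        even = (approx + detail) / np.sqrt(2.0)
        odd = (approx - detail) / np.sqrt(2.0)
        out = np.empty(approx.size * 2)
        out[0::2], out[1::2] = even, odd
        approx = out
    return approx

# Demo: reconstruction from all coefficients is lossless; dropping the
# finer detail bands produces a coarse, MIP-map-like overview.
x = np.sin(np.linspace(0.0, 8.0 * np.pi, 512))
c = haar_forward(x, levels=4)
assert np.allclose(haar_inverse(c), x)
overview = haar_inverse(c, keep_levels=0)
```

Because the transform is orthonormal, keeping all coefficients restores the signal exactly, while truncating the finest scales trades fidelity for size, exactly the progressive behavior the hierarchical access schemes above exploit.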
When wavelet reconstruction leads to an approximation
of the original signal, data loss occurs (referred to as lossy
compression). Computational scientists have recently begun
to investigate the role of lossy compression when applied
to scientific data [11], [14], [15], [16], [17], and many
visualization and analysis operations have been shown to be
relatively insensitive to this type of data coarsening [18], [19],
[20]. However, lossy compression is not yet widely used in
practice in the computational science community, as many
scientists are hesitant to incur the loss of data that has been
computed at great cost. The idea of lossless compression
is certainly appealing; however, the randomness of lower
order bits in scientific data has limited lossless compression
rates to less than 2:1 [16], [21], [22], [23], [24], [25]. As
a consequence, only with lossy techniques are we likely
to achieve compression rates high enough to balance the
coming disparity between computational and I/O subsystems.
The novelty of the scheme we explored is that the level of
compression can be specified differently throughout the data volume, guided by domain knowledge, allowing the domain expert to choose the regions of the data in which they are willing to tolerate data loss. In contrast, lossy compression techniques reported in the literature attempt to minimize the information loss across the entirety of the data, largely without any domain knowledge of the data or of the analyses that will be subsequently applied.
3.1. Method
The VAPOR Data Collection (VDC) [26] provides a
progressive access data model based on a wavelet trans-
form’s multi-resolution and energy compaction properties.
Specifically, the model decomposes each time step of each
variable into a set of wavelet coefficients. These coefficients
are sorted based on their information content (i.e., magnitude)
and stored in a hierarchical level-of-detail sub-setting scheme.
The number of coefficients across all the levels is equivalent
to the number of grid points in the original data. The
reconstruction of a variable from the wavelet space using all
of the coefficients losslessly restores that variable. However,
the user can select a level-of-detail by using only a subset
of these levels (i.e., those containing the largest magnitude
coefficients) to reconstruct an approximation of the original
The model stores each level of detail as a set of blocks
by decomposing the data volume into independent blocks
before the forward transform. Decomposing the data into
blocks improves the performance of both the forward and
inverse wavelet transformations, and facilitates the extraction
of regions of interest by limiting the amount of data that
needs to be traversed compared to a single large volume.
Larger block dimensions allow deeper levels of detail at
the expense of increased computational and storage access
costs [26].
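The blocked, level-of-detail storage idea can be sketched as follows. This is our own toy illustration, not the VDC implementation: the block size, the number of levels, and the use of an orthonormal FFT as a stand-in for the wavelet transform are all assumptions:

```python
import numpy as np

BLOCK = 8      # block edge length (toy size; real data would use larger blocks)
N_LEVELS = 4   # number of level-of-detail subsets per block

def split_blocks(vol):
    """Yield ((k, j, i), block) for non-overlapping BLOCK^3 sub-volumes."""
    nz, ny, nx = (s // BLOCK for s in vol.shape)
    for k in range(nz):
        for j in range(ny):
            for i in range(nx):
                yield (k, j, i), vol[k*BLOCK:(k+1)*BLOCK,
                                     j*BLOCK:(j+1)*BLOCK,
                                     i*BLOCK:(i+1)*BLOCK]

def encode_block(block, keep_levels):
    """Transform a block, sort coefficients by information content (magnitude),
    and keep only the top `keep_levels` of N_LEVELS geometrically sized subsets."""
    coeffs = np.fft.fftn(block, norm="ortho").ravel()  # stand-in orthonormal transform
    order = np.argsort(np.abs(coeffs))[::-1]           # largest magnitude first
    kept = coeffs.size // 8 ** (N_LEVELS - keep_levels) # 3-D: each level is 8x smaller
    sparse = np.zeros_like(coeffs)
    sparse[order[:kept]] = coeffs[order[:kept]]
    return sparse

def decode_block(sparse):
    """Inverse transform from whatever coefficients were retained."""
    return np.fft.ifftn(sparse.reshape((BLOCK,) * 3), norm="ortho").real

# Demo: one salient block kept lossless, the rest at the coarsest level.
rng = np.random.default_rng(0)
vol = rng.standard_normal((16, 16, 16))
out = np.zeros_like(vol)
for (k, j, i), blk in split_blocks(vol):
    levels = N_LEVELS if (k, j, i) == (0, 0, 0) else 1
    out[k*BLOCK:(k+1)*BLOCK,
        j*BLOCK:(j+1)*BLOCK,
        i*BLOCK:(i+1)*BLOCK] = decode_block(encode_block(blk, levels))
```

The salient block reconstructs bit-for-bit (all coefficients retained), while contextual blocks reconstruct only an approximation, mirroring the heterogeneous block treatment described above.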
The VDC is a lossless storage format, allowing scientists
to reconstruct the full-fidelity data by accessing all the
wavelet coefficients across all levels of detail. We introduce
an extension to this scheme by removing entire levels of
detail on a block-by-block basis, achieving a block-level
compression. This allows a domain expert to classify regions
of the data as a function of how salient that data will be to the a posteriori analysis of the data. That classification can be designed based on simple spatial regions of interest or on more complicated feature-identification algorithms. Our extension stores blocks that overlap regions of high analytical importance with a complete set of coefficients across all levels of detail, while storing only a subset of the levels in blocks that contain only regions of lesser importance.

Figure 2. A comparison of renderings between the original full-fidelity velocity data (112 GB/time step) and the block-level compressed data (5.8 GB/time step): a) the yellow shading illustrates the regions that are losslessly preserved at the original fidelity; b) volume renderings of the two respective storage schemes.
In our context, the domain scientists are interested in the turbine wakes in wind farms, identified as low-velocity flows originating from fixed turbine positions. Therefore, we denote wakes as regions of interest and the larger surrounding atmospheric regions as contextual information. Wavelet blocks that intersect wake regions retain the full complement of wavelet coefficients, while atmospheric blocks retain and store only a subset of the coefficients.
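A simple wake-intersection test over a speed field might look like the following sketch. The function name, the exhaustive block loop, and the fixed speed threshold are our own illustrative assumptions; the paper identifies wakes as low-velocity flows originating from fixed turbine positions:

```python
import numpy as np

def classify_blocks(speed, block, wake_threshold):
    """Mark a block as salient when any voxel inside it falls below the wake
    speed threshold (turbine wakes are low-velocity regions downstream)."""
    nz, ny, nx = (s // block for s in speed.shape)
    salient = np.zeros((nz, ny, nx), dtype=bool)
    for k in range(nz):
        for j in range(ny):
            for i in range(nx):
                sub = speed[k*block:(k+1)*block,
                            j*block:(j+1)*block,
                            i*block:(i+1)*block]
                salient[k, j, i] = bool((sub < wake_threshold).any())
    return salient

# Demo: a synthetic 8 m/s freestream with one low-speed wake pocket.
speed = np.full((8, 8, 8), 8.0)
speed[:4, :4, :4] = 3.0
mask = classify_blocks(speed, block=4, wake_threshold=5.0)
```

Blocks flagged in `mask` would retain all wavelet coefficients; the rest would be stored at reduced fidelity.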
In addition, we can quantify the expected level of compression with this technique. A data set's size is scaled by the linear expression βu + βc/rc, where βu is the fraction of space occupied by uncompressed blocks, βc is the fraction of space occupied by compressed blocks (βc = 1 − βu), and rc is the compression rate.
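This expression can be checked numerically against the block counts reported in Section 4 (the function name is ours):

```python
def scaled_size_fraction(beta_u, r_c):
    """Fraction of the original data size after block-level compression:
    beta_u + beta_c / r_c, with beta_c = 1 - beta_u."""
    return beta_u + (1.0 - beta_u) / r_c

# Two-turbine case: 1,715 of 35,344 blocks are salient (uncompressed),
# and the contextual blocks are compressed at 512:1.
frac = scaled_size_fraction(1715 / 35344, 512)
savings = 1.0 - frac   # roughly 0.95, in line with the reported 94.8%
```

The small residual difference from the reported figure is expected, since the reported savings are computed from the actual on-disk sizes rather than this idealized linear model.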
4. Results
We evaluate data from two turbine-array simulations:
a small array study with two turbines and a large 48-
turbine array model of the Lillgrund wind farm described
by Churchfield, et al. [7].
The two-turbine data set has a computational domain of 3008 m × 3008 m × 1024 m and a grid resolution of 1 m. This represents 9.2 billion grid points, or 112 GB to represent one time step of three-component velocity at single floating-point precision. These data are decomposed into 35,344 blocks of 64³ voxels. We classify these blocks
into two sets: contextual blocks and salient blocks. Blocks
are classified as salient if the low-velocity wake region
intersects the blocks, while all non-wake-intersecting blocks
are classified as contextual. The full set of levels of detail
are retained for the salient blocks, while only the highest
(most significant) level of detail is retained for the contextual
blocks. Therefore, the salient blocks can be reconstructed at
the full original fidelity of the simulation, while the contextual
blocks can only be reconstructed at reduced fidelity at 512:1
compression. In our two-turbine test case, the wake regions
intersect 1,715 blocks, classifying the remaining blocks as
contextual. This reduces the data size from 112 GB per time step to less than 6 GB per time step, a 94.8% savings in storage and data movement (see Figure 2).
The results from the 48-turbine Lillgrund simulation are less dramatic, with 48 wakes occupying a much larger proportion of the full domain. This simulation domain represents a 4 km × 4 km × 1 km volume with a coarser 1.7 m resolution. This represents 3 billion grid points, or 38.7 GB to represent a single time step of three-component velocity. The data was decomposed into 11,664 blocks of 64³ voxels, and the blocks that intersect with turbine wakes are classified as salient. The wakes of the 48 turbines intersect 1,078 of the blocks, which are preserved in full fidelity. Compressing the remaining contextual blocks at 512:1 reduces the data from 38.7 GB down to a manageable 3.69 GB to represent a single time step of three-component velocity, a 90.6% space savings.
For both of these data sets, there was no information loss
in the regions immediately surrounding the turbine wakes.
Therefore, no compression-induced error was propagated
to the quantitative analysis. Error and uncertainty have been introduced to the contextual regions through the wavelet compression. Fine small-scale structures have been lost, but the large-scale formations are still intact and visible.
Qualitatively, the compression effects are difficult to perceive
with the large global perspective shown in Figure 2. Taking a
detailed view of the compression boundary between a salient
and a contextual block, the compression effects become
more apparent (see Figure 3). The loss of the small-scale
turbulent structure at 512:1 compression is visible; however,
the compressed regions still provide clear context for the
analysis in the wake regions. We can identify incoming flows
and distinguish the low velocity and high velocity air, which
provides the contextual information to better understand how
structures in the atmospheric boundary layer may correlate
to the wake behavior.
We evaluated the storage requirements (both data and
metadata) and rendering times
for the 48-turbine data set,
comparing a large uniform resolution data volume stored
in the VDC format, a nested grid with coarse resolution
in the atmosphere and finer resolution encapsulating the
wakes, and our block-level compressed extension approxi-
mating the fidelity of the nested grid. For our block-level
compression, we also evaluated rendering times for CPU-
side wavelet reconstruction versus GPU-side reconstruction
of nested textures. Results are provided in Table 1. Storing
the data as an unstructured nested grid generally provides
the worst metrics. The metadata requires nearly 40 GB of space, primarily to represent the grid, while a time step of three-component velocity, ~u, requires 3.9 GB of space. We
did not volume render these grids, as the state of the art
would suggest there is no reasonable expectation of volume
rendering these 3 billion unstructured cells interactively,
except with the largest HPC resources [8], [9], [10]. Storing a uniform 1.7 m resolution across the entire domain in the VDC format requires 38.7 GB of storage for a time step of ~u and 673 kB of storage for the VDC metadata, with rendering times of 0.5 frames per second. Storing mixed-fidelity blocks
of wavelet coefficients in the VDC format reduces the
data storage to
3.7 GB
and only increases the metadata by
410 bytes to
674 kB
. The CPU-side reconstruction of the
mixed-fidelity blocks creates an artificially large data texture
equivalent in size to the uniform case, and as expected, the
rendering performance is identical at 0.5 frames per second.
The GPU-side reconstruction of the mixed-fidelity blocks into nested data textures of varying resolutions improves the rendering performance to near-interactive speeds of 3 to 12 frames per second. Early ray termination makes the frame rate view-dependent and accounts for the large variance in the reported rates.

All reported rendering times were measured on an Intel Xeon E5-2470 2.3 GHz 8-core CPU with 384 GB of RAM and an NVIDIA Quadro 6000 with 6 GB of RAM. Frame rate is reported once all data is memory resident.
5. Conclusion & Future Work
We have presented a lossy wavelet-based compression
technique that empowers the domain scientist to control the
compression of data, preserving the most important areas of
the data in full fidelity. Without the use of contextualized
data compression, the interactive visualization and analysis
of the wind turbine data would not have been possible. Using
our technique, researchers were able to interactively volume
render the flow through the turbine array to understand the
turbine-to-turbine interactions, which is critical information
in wind-plant siting and the design of more efficient and
reliable wind turbines.
Keeping an eye towards the exascale systems of the
future, we note that this scheme could be applied in situ
(i.e., while the simulation is running), though this would
require the domain scientist to classify regions or features of
interest beforehand. How to best exploit the data locality of
the simulation to perform that region of interest isolation in
conjunction with the wavelet transform is an open research
question, and a topic for future work.
The technique as presented is ideally suited to data with
discrete regions of interest and non-interest that can be spatially isolated in wavelet blocks. The wind-energy simulations are prime examples, with wake features intersecting a relatively small number of blocks. An opportunity for
improvement of this work is the replacement of the block-
level compression scheme with a function-based feature
preservation capability. A user supplied function could be
used to weight the wavelet coefficients based on different
criteria. This approach would, for example, allow greater
fidelity to be assigned to high-vorticity features, even when
those features uniformly populate the wavelet blocks of the
volume. More precisely, the approach would apply to features
independent of their blocks, providing a compression ratio
related directly to the volume of the features of interest rather
than the volume of the blocks those features intersect.
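A minimal sketch of such a function-based scheme, under our own assumptions (a per-coefficient saliency weight is assumed to be available, e.g., mapped from a spatial vorticity field; the function name and ranking rule are illustrative, not a proposed implementation):

```python
import numpy as np

def weighted_truncate(coeffs, saliency, keep_fraction):
    """Rank coefficients by |coefficient| * saliency weight and keep only the
    top fraction, so coefficients tied to salient features survive truncation
    preferentially, independent of which block they fall in."""
    score = np.abs(coeffs) * saliency
    n_keep = max(1, int(keep_fraction * coeffs.size))
    keep = np.argsort(score)[::-1][:n_keep]
    out = np.zeros_like(coeffs)
    out[keep] = coeffs[keep]
    return out

# Demo: the first coefficient is small in magnitude but maps to a
# high-saliency feature, so it outranks larger unweighted coefficients.
coeffs = np.array([1.0, 5.0, 2.0, 0.5])
weights = np.array([10.0, 1.0, 1.0, 1.0])
kept = weighted_truncate(coeffs, weights, keep_fraction=0.5)
```

Under this weighting, the compression ratio tracks the volume of the salient features themselves rather than the volume of the blocks they intersect.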
In the current scheme, the set of compressed blocks is constant across both variable and time. Clearly there are
opportunities to manipulate the compression from variable
to variable and time step to time step, which could have
interesting impacts on analyses and the I/O systems that
support them, such as the promulgation of burst buffers
within compute nodes.
Acknowledgments

The research was performed using computational resources sponsored by the Department of Energy's Office of
Energy Efficiency and Renewable Energy and located at the
National Renewable Energy Laboratory.
Figure 3. Illustration showing the time evolution of incoming flow at a compression boundary. In the upper portion of the figure, we show a volumetric
rendering of the boundary, with the loss of small-scale structure at 512:1 compression visible; however, the compressed regions still provide clear context for
the analysis in the wake regions. In the lower portion of the figure, we plot rotor power over time as derived from the simulation software, with a moving
average in red. At time step 12210 (left), we see the turbine is in a low-velocity flow with expanding lateral wake growth, strong downstream turbulent
mixing, and corresponding low power production. A high-velocity flow followed by a low-velocity structure can be identified upstream in the compressed
area. At time step 12230 (right), that high-velocity flow has reached the turbine, affecting the lateral wake growth, turbulent mixing, and power production.
Table 1. Storage and rendering comparisons for the 48-turbine data set.

                   Unstructured Grid   Uniform Res. VDC   Mixed Res. VDC   Mixed Res. VDC
Reconstruction     n/a                 CPU                CPU              GPU
Frame Rate         *                   0.5 fps            0.5 fps          3 fps to 12 fps
Render Time        *                   2 s                2 s              0.08 s to 0.33 s
Time to Frame      hours               360 s              300 s            70 s
~u Volume Size     3.9 GB              38.7 GB            3.7 GB           3.7 GB
Metadata Size      39.9 GB             673 kB             674 kB           674 kB
References

[1] S. Ahern et al., "Scientific discovery at the exascale," DOE ASCR 2011 Workshop on Exascale Data Management, Analysis, and Visualization, Tech. Rep., 2011.
[2] "20% Wind Energy by 2030: Increasing Wind Energy's Contribution to U.S. Electricity Supply," National Renewable Energy Laboratory, Tech. Rep. DOE/GO-102008-2567, May 2008.
[3] M. Sprague, P. Moriarty, M. Churchfield, K. Gruchalla, S. Lee, J. Lundquist, J. Michalakes, and A. Purkayastha, "Computational modeling of wind-plant aerodynamics," in SciDAC 2011, 2011.
[4] U. Hassan, G. Taylor, and A. Garrad, "The dynamic response of wind turbines operating in wake flow," Journal of Wind Engineering and Industrial Aerodynamics, vol. 27, pp. 113–126, 1988.
[5] "OpenFOAM,"
[6] J. Jonkman and M. J. Buhl, "FAST user's guide," National Renewable Energy Laboratory, Tech. Rep. NREL/TP-500-38230, 2005.
[7] M. Churchfield, S. Lee, P. Moriarty, L. Martinez, S. Leonardi, G. Vijayakumar, and J. Brasseur, "A large-eddy simulation of wind-plant aerodynamics," in 50th AIAA Aerospace Sciences Meeting including the New Horizons Forum and Aerospace Exposition. American Institute of Aeronautics and Astronautics, 2012.
[8] H. Childs, D. Pugmire, S. Ahern, B. Whitlock, M. Howison, Prabhat, G. Weber, and E. W. Bethel, "Visualization at extreme scale concurrency," in High Performance Visualization—Enabling Extreme-Scale Scientific Insight, E. W. Bethel, H. Childs, and C. Hansen, Eds. CRC Press/Taylor & Francis Group, 2012, pp. 291–306.
[9] J. Patchett, J. Ahrens, S. Ahern, and D. Pugmire, "Parallel visualization and analysis with ParaView on a Cray XT4," Cray User Group, May 2009, pp. 1–5.
[10] K. Moreland, L. Avila, and L. A. Fisk, "Parallel unstructured volume rendering in ParaView," in Proc. SPIE, vol. 6495, 2007, p. 64950F.
[11] J. Clyne, "Progressive data access for regular grids," in High Performance Visualization—Enabling Extreme-Scale Scientific Insight, E. W. Bethel, H. Childs, and C. Hansen, Eds. CRC Press/Taylor & Francis Group, 2012, pp. 145–170.
[12] S. G. Mallat, "A theory for multiresolution signal decomposition: the wavelet representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 674–693, 1989.
[13] J. Clyne, K. Gruchalla, and M. Rast, "VAPOR: Visual, statistical, and structural analysis of astrophysical flows," in Numerical Modeling of Space Plasma Flows, Astronomical Society of the Pacific Conference Series, vol. 429, 2010, pp. 323–329.
[14] N. Hübbe, A. Wegener, J. M. Kunkel, Y. Ling, and T. Ludwig, "Evaluating lossy compression on climate data," in Supercomputing. Springer, 2013, pp. 343–356.
[15] S. Lakshminarasimhan, N. Shah, S. Ethier, S. Klasky, R. Latham, R. Ross, and N. F. Samatova, "Compressing the incompressible with ISABELA: In-situ reduction of spatio-temporal data," in Euro-Par 2011 Parallel Processing. Springer, 2011, pp. 366–379.
[16] D. Laney, S. Langer, C. Weber, P. Lindstrom, and A. Wegener, "Assessing the effects of data compression in simulations using physically motivated metrics," in Proceedings of SC13: International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, 2013, p. 76.
[17] J. Woodring, S. Mniszewski, C. Brislawn, D. DeMarle, and J. Ahrens, "Revisiting wavelet compression for large-scale climate data using JPEG 2000 and ensuring data precision," in Large Data Analysis and Visualization (LDAV), 2011 IEEE Symposium on, Oct. 2011, pp. 31–38.
[18] J. Clyne and M. Rast, "A prototype discovery environment for analyzing and visualizing terascale turbulent fluid flow simulations," in Visualization and Data Analysis 2005, R. F. Erbacher, J. C. Roberts, M. T. Grohn, and K. Borner, Eds., vol. 5669. San Jose, CA, USA: SPIE, Mar. 2005, pp. 284–294.
[19] P. Mininni, E. Lee, A. Norton, and J. Clyne, "Flow visualization and field line advection in computational fluid dynamics: application to magnetic fields and turbulent flows," New Journal of Physics, vol. 10, no. 12, p. 125007, 2008.
[20] S. Li, K. Gruchalla, K. Potter, J. Clyne, and H. Childs, "Evaluating the efficacy of wavelet configurations on turbulent-flow data," in Proceedings of IEEE Symposium on Large Data Analysis and Visualization, Chicago, IL, Oct. 2015, pp. 81–89.
[21] V. Engelson, D. Fritzson, and P. Fritzson, "Lossless compression of high-volume numerical data from simulations," in Data Compression Conference, 2000, p. 574.
[22] P. Ratanaworabhan, J. Ke, and M. Burtscher, "Fast lossless compression of scientific floating-point data," in Data Compression Conference (DCC 2006). IEEE, 2006, pp. 133–142.
[23] P. Lindstrom and M. Isenburg, "Fast and efficient compression of floating-point data," IEEE Transactions on Visualization and Computer Graphics, vol. 12, no. 5, pp. 1245–1250, 2006.
[24] M. Burtscher and P. Ratanaworabhan, "High throughput compression of double-precision floating-point data," in Data Compression Conference (DCC 2007). IEEE, 2007, pp. 293–302.
[25] M. Burtscher and P. Ratanaworabhan, "FPC: A high-speed compressor for double-precision floating-point data," IEEE Transactions on Computers, vol. 58, no. 1, pp. 18–31, 2009.
[26] V. Norton and J. Clyne, "The VAPOR visualization application," in High Performance Visualization—Enabling Extreme-Scale Scientific Insight, E. W. Bethel, H. Childs, and C. Hansen, Eds. CRC Press/Taylor & Francis Group, 2012, pp. 145–170.
Full-text available
Between the molecular and reactor scales, which are familiar to the chemical engineering community, lies an intermediate regime, here termed the “mesoscale,” where transport phenomena and reaction kinetics compete on similar time scales. Bioenergy and catalytic processes offer particularly important examples of mesoscale phenomena owing to their multiphase nature and the complex, highly variable porosity characteristic of biomass and many structured catalysts. In this review, we overview applications and methods central to mesoscale modeling as they apply to reaction engineering of biomass conversion and catalytic processing. A brief historical perspective is offered to put recent advances in context. Applications of mesoscale modeling are described, and several specific examples from biomass pyrolysis and catalytic upgrading of bioderived intermediates are highlighted. Methods including reduced order modeling, finite element and finite volume approaches, geometry construction and import, and visualization of simulation results are described; in each category, recent advances, current limitations, and areas for future development are presented. Owing to improved access to high-performance computational resources, advances in algorithm development, and sustained interest in reaction engineering to sustainably meet societal needs, we conclude that a significant upsurge in mesoscale modeling capabilities is on the horizon that will accelerate design, deployment, and optimization of new bioenergy and catalytic technologies.
Conference Paper
Full-text available
In this work, we present results of a large-eddy simulation of the 48 multi-megawatt turbines composing the Lillgrund wind plant. Turbulent inflow wind is created by performing an atmospheric boundary layer precursor simulation, and turbines are modeled using a rotating, variable-speed actuator line representation. The motivation for this work is that few others have done large-eddy simulations of wind plants with a substantial number of turbines, and the methods for carrying out the simulations are varied. We wish to draw upon the strengths of the existing simulations and our growing atmospheric large-eddy simulation capability to create a sound methodology for performing this type of simulation. We used the OpenFOAM CFD toolbox to create our solver. The simulated time-averaged power production of the turbines in the plant agrees well with field observations, except with the sixth turbine and beyond in each wind-aligned. The power produced by each of those turbines is overpredicted by 25-40%. A direct comparison between simulated and field data is difficult because we simulate one wind direction with a speed and turbulence intensity characteristic of Lillgrund, but the field observations were taken over a year of varying conditions. The simulation shows the significant 60-70% decrease in the performance of the turbines behind the front row in this plant that has a spacing of 4.3 rotor diameters in this direction. The overall plant efficiency is well predicted. This work shows the importance of using local grid refinement to simultaneously capture the meter-scale details of the turbine wake and the kilometer-scale turbulent atmospheric structures. Although this work illustrates the power of large-eddy simulation in producing a time-accurate solution, it required about one million processor-hours, showing the significant cost of large-eddy simulation. Copyright © 2012 by the American Institute of Aeronautics and Astronautics, Inc.
Conference Paper
Full-text available
This paper examines whether lossy compression can be used effectively in physics simulations as a possible strategy to combat the expected data-movement bottleneck in future high performance computing architectures. We show that, for the codes and simulations we tested, compression levels of 3-5X can be applied without causing significant changes to important physical quantities. Rather than applying signal processing error metrics, we utilize physics-based metrics appropriate for each code to assess the impact of compression. We evaluate three different simulation codes: a Lagrangian shock-hydrodynamics code, an Eulerian higher-order hydrodynamics turbulence modeling code, and an Eulerian coupled laser-plasma interaction code. We compress relevant quantities after each time-step to approximate the effects of tightly coupled compression and study the compression rates to estimate memory and disk-bandwidth reduction. We find that the error characteristics of compression algorithms must be carefully considered in the context of the underlying physics being modeled.
Accurately interpreting three-dimensional (3D) vector quantities output as solutions to high-resolution computational fluid dynamics (CFD) simulations can be an arduous, time-consuming task. Scientific visualization of these fields can be a powerful aid in their understanding. However, numerous pitfalls present themselves, ranging from computational performance to the challenge of generating insightful visual representations of the data. In this paper, we briefly survey current practices for visualizing 3D vector fields, placing particular emphasis on those data arising from CFD simulations of turbulence. We describe the capabilities of a vector field visualization system that we have implemented as part of an open source visual data analysis environment. We also describe a novel algorithm we have developed for illustrating the advection of one vector field by a second flow field. We demonstrate these techniques in the exploration of two sets of runs. The first comprises an ideal and a resistive magnetohydrodynamic (MHD) simulation. This set is used to test the validity of the advection scheme. The second corresponds to a simulation of MHD turbulence. We show the formation of structures in the flows, the evolution of magnetic field lines, and how field line advection can be used effectively to track structures therein.
Figure 1. Solar convection is dominated by the formation of thermal downflow plumes in the surface layer. This image displays the enstrophy in a three-dimensional compressible starting plume driven by cooling at the top and descending (left to right) through a highly stratified (increasing density with depth) medium.
Scientific visualization is routinely promoted as an indispensable component of the knowledge discovery process in a variety of scientific and engineering disciplines. However, our experiences with visualization at the National Center for Atmospheric Research (NCAR) differ somewhat from those described by many in the visualization community. Visualization at NCAR is used with great success to convey highly complex results to a wide variety of audiences, but the technology only rarely plays an active role in the day-to-day scientific discovery process. We believe that one reason for this is the mismatch between the size of the primary simulation data sets produced and the capabilities of the software and visual computing facilities generally available for their analysis. Here we describe preliminary results of our efforts to facilitate visual as well as non-visual analysis of terascale scientific data sets with the aim of realizing greater scientific return from such large-scale computation efforts.
ParaView is a popular open-source general-purpose scientific visualization application. One of the many visualization tools available within ParaView is the volume rendering of unstructured meshes. Volume rendering is a technique that renders a mesh as a translucent solid, thereby allowing the user to see every point in three-dimensional space simultaneously. Because volume rendering is computationally intensive, ParaView now employs a unique parallel rendering algorithm to speed the processes. The parallel rendering algorithm is very flexible. It works equally well for both volumes and surfaces, and can properly render the intersection of a volume and opaque polygonal surfaces. The parallel rendering algorithm can also render images for tiled displays. In this paper, we explore the implementation of parallel unstructured volume rendering in ParaView.
Meteorological and structural measurements are used in conjunction with mathematical models to consider the response of the Nibe B wind turbine to the flow conditions encountered when operating in the wake of Nibe A. Measurements and predictions are also presented for the machine operating in the free stream. It is shown that operation in a wake may increase the fatigue damage rate by over 170% and that extreme loads increase by 50%. Good agreement is obtained between the predicted and measured values.
While the amount of data used by today’s high-performance computing (HPC) codes is huge, HPC users have not broadly adopted data compression techniques, apparently because of a fear that compression will either unacceptably degrade data quality or that compression will be too slow to be worth the effort. In this paper, we examine the effects of three lossy compression methods (GRIB2 encoding, GRIB2 using JPEG 2000 and LZMA, and the commercial Samplify APAX algorithm) on decompressed data quality, compression ratio, and processing time. A careful evaluation of selected lossy and lossless compression methods is conducted, assessing their influence on data quality, storage requirements and performance. The differences between input and decoded datasets are described and compared for the GRIB2 and APAX compression methods. Performance is measured using the compressed file sizes and the time spent on compression and decompression. Test data consists of both 9 synthetic datasets that expose compression behavior and 123 climate variables output from a climate model. The benefits of lossy compression for HPC systems are described and are related to our findings on data quality.
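The evaluation above rests on two simple measurements per variable: compressed size and compression/decompression wall-clock time. A minimal sketch of that measurement loop, using zlib and a synthetic field as stand-ins for the GRIB2, JPEG 2000/LZMA, and APAX codecs and climate-model data (none of which are reproduced here):

```python
import time
import zlib
import numpy as np

# Hypothetical sketch of the per-variable measurements the study
# reports: compressed size and compression/decompression timings.
# zlib and a smooth synthetic field are stand-ins for the actual
# codecs and climate data.

x = np.linspace(0.0, 4.0 * np.pi, 256)
field = np.sin(x)[:, None] * np.cos(x)[None, :]          # smooth 2D field
raw = np.round(field * 1000).astype(np.int16).tobytes()  # quantized bytes

t0 = time.perf_counter()
packed = zlib.compress(raw, level=6)
compress_s = time.perf_counter() - t0

t0 = time.perf_counter()
restored = zlib.decompress(packed)
decompress_s = time.perf_counter() - t0

ratio = len(raw) / len(packed)  # compression ratio (>1 means smaller)
```

Repeating this over many variables and codecs yields the size/time trade-off curves such studies compare; for lossy codecs, the decoded bytes would additionally be diffed against the input to quantify quality loss.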