DETECTION OF INITIAL TRANSIENT AND ESTIMATION OF
STATISTICAL ERROR IN TIME-RESOLVED TURBULENT
FLOW DATA
C. Mockett, T. Knacke and F. Thiele
Institute of Fluid Mechanics and Engineering Acoustics,
Technische Universität Berlin, Sekr. MB1, Müller-Breslau-Str. 8, 10623 Berlin, Germany
charles.mockett@cfd.tu-berlin.de, thilo.knacke@cfd.tu-berlin.de
Abstract
Analysis techniques are proposed for estimating
the magnitude of statistical error and detecting the
presence of initial transient content in statistically sta-
tionary random data. The methods are tested for syn-
thesised analytical signals and real data from the un-
steady measurement and simulation of turbulent flows.
The results obtained demonstrate that the proposed
techniques are reliable and robust. The implications
for practical unsteady CFD and turbulent flow mea-
surement in terms of improved quality, minimised un-
certainty and reduced user burden are discussed, and
perspectives for further work are outlined.
1 Introduction
The growth in computing capacity has led to
an increasing application of turbulence-resolving ap-
proaches, such as large-eddy simulation (LES) or
detached-eddy simulation (DES), to industrial prob-
lems. The need to resolve turbulent structures enforces
small time steps, whereas random low-frequency fluc-
tuations generally make very long time samples nec-
essary for reliable statistical estimates. Furthermore,
the flow must progress from arbitrary initial conditions
to a fully-developed, statistically stationary state be-
fore valid sampling can begin. This “initial transient”
content often makes up a large proportion of the total
simulation time. These factors all combine to necessi-
tate the simulation of very large numbers of time steps,
which very often pushes current computing resources
to their limits.
The length of the initial transient is strongly case-
specific and is usually judged by intuitive inspection
of the histories of various quantities. To the authors’
knowledge, no analysis algorithm for this purpose ex-
ists. Regarding statistical time sample requirements,
again no established method to quantify the statistical
error magnitude is known and the behaviour is simi-
larly case-specific.
The goal of this work is therefore to develop algo-
rithms for the analysis of unsteady data that address
both of these issues. The method developed for the
quantification of statistical error is presented in Sect. 2.
An extension of this for initial transient detection is
dealt with in Sect. 3. Finally, the conclusions are sum-
marised, in which we also outline the potential benefits
for these methods in practice.
2 Quantification of random error
Definitions and assumptions
We assume that the data in question are single sample records from continuous stationary (ergodic) random processes. Stationarity specifies that all statistical moments are time-invariant. Implicit in this is also the assumption that the signals are free of initial transient.

The random nature of the signal, x(t), means that the true value of any statistical quantity φ (e.g. mean, µ_x, and standard deviation, σ_x) can only be precisely determined for infinite sample lengths (φ is "unobservable"). We must in practice make do with estimates, φ̂, obtained from finite samples. The random error¹ for the estimate is given by

\sigma[\hat{\phi}] = \sqrt{E[\hat{\phi}^2] - E^2[\phi]} ,  (1)

where E[ ] denotes the expected value of the bracketed quantity. For cases where φ ≠ 0, it is convenient to adopt the normalised random error expressed as

\varepsilon = \sigma[\hat{\phi}] / \phi .  (2)

Since it contains the unobservable true value, φ, the statistical error is also unobservable. An observable estimate² can however be made by replacing φ in (1) & (2) with the best available estimate obtained using the entire sample, φ̂_T.
Formulation of the method
A given signal, x(t), of total length T, is divided into a series of windows of length T_w. For T_w ≪ T, the statistical error can be estimated by

\varepsilon(T_w) \approx \sqrt{\left\langle \left( \hat{\phi}_{T_w} - \hat{\phi}_T \right)^2 \right\rangle} \Big/ \hat{\phi}_T ,  (3)

¹The bias or systematic error is zero, hence the total statistical error is given by the random component.
²The error defined as such is most accurately referred to as the residual; however, the term error is used to avoid confusion with the more common use of the word residual in numerics.
where ⟨ ⟩ denotes averaging over the available data windows. By varying the window size, the error trend as a function of sample length can be estimated. This estimate naturally becomes decreasingly accurate as T_w approaches T, since the number of available windows decreases³. Unless long time-resolved benchmark signals are available (e.g. from comparable experiments), the usefulness of this method alone is therefore limited.
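The windowed estimate of (3) is straightforward to implement; the sketch below is our own minimal illustration (function name and window handling are assumptions), using non-overlapping windows and the mean as the statistical quantity:

```python
import numpy as np

def windowed_error(x, n_w):
    """Normalised random error of the mean, in the spirit of Eq. (3),
    estimated from non-overlapping windows of n_w samples each."""
    n = (len(x) // n_w) * n_w            # truncate to a whole number of windows
    windows = x[:n].reshape(-1, n_w)     # one row per window
    phi_w = windows.mean(axis=1)         # estimate over each window
    phi_T = x.mean()                     # best estimate from the entire sample
    return np.sqrt(np.mean((phi_w - phi_T) ** 2)) / abs(phi_T)
```

For a stationary signal the returned error should shrink as the window length grows, tracing out the trend ε(T_w).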
A further component is introduced, which enables both an error estimate for the available sample length as well as a prognosis of the error development for still longer samples. Analytical expressions for the error of various statistical quantities as a function of sample length can be derived for the special case of bandwidth-limited Gaussian white noise (of bandwidth B). By integrating the corresponding autocovariance function, Bendat and Piersol (2000) arrive at the following formulae for the error on the mean and standard deviation:

\varepsilon[\hat{\mu}_x] \approx \frac{1}{\sqrt{2BT}} \left( \frac{\sigma_x}{\mu_x} \right) ,  (4)

\varepsilon[\hat{\sigma}_x] \approx \frac{1}{\sqrt{4BT}} .  (5)

The authors state that these formulae are valid for BT ≥ 5. By again replacing the unobservable quantities µ_x and σ_x with the best available estimates, the parameter B becomes the only remaining unknown in (4) & (5). This is treated as a curve-fitting parameter, obtained as the lowest B for which the curve from (4) intersects the relationship computed using (3) for φ̂ = µ̂. This B value is then also applied in (5). In this way, the curve fit is as reliable as possible (Eq. (3) is more accurate for µ_x than for σ_x) and as conservative as possible (the minimum intersecting B, hence maximum ε, is sought).
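The fitting rule can be expressed compactly: solving (4) for B at each computed point of the ε[µ̂](T_w) trend and taking the smallest value yields the most conservative fit. A sketch under these assumptions (the function name is hypothetical):

```python
import numpy as np

def fit_bandwidth(T_w, eps_mu, mu, sigma):
    """Fit the bandwidth B of Eq. (4) to a computed error trend eps_mu(T_w).
    For each point, the B that makes Eq. (4) pass through it is
    B = (sigma/mu)**2 / (2 * T_w * eps**2); the smallest such B gives the
    largest (most conservative) predicted error."""
    T_w = np.asarray(T_w, dtype=float)
    eps_mu = np.asarray(eps_mu, dtype=float)
    B_candidates = (sigma / mu) ** 2 / (2.0 * T_w * eps_mu ** 2)
    return B_candidates.min()
```

Applied to points lying exactly on a white-noise trend, the fit recovers the true bandwidth; applied to real data, the minimum over the trend implements the conservative choice described above.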
Assuming that the error behaviour for long sample lengths can be approximated by that of white noise is equivalent to assuming that the autospectral density function, G_xx(f), is flat at the lowest (and most poorly-resolved) frequencies, i.e.

G_{xx}(f) \propto f^p ,  (6)

where p = 0 for white noise. Indeed, the T^{-1/2} scaling of ε can be derived from (6)⁴, with the more general result

\varepsilon[\hat{\mu}_x] \propto T^{-\frac{1+p}{2}}  (7)

irrespective of the white noise assumption.

³The estimated error is indeed zero for T_w = T.
⁴This by adopting an equivalence of the windowed average over T_w to an ideal low-pass filter and integrating the spectrum between f = 0 and f = 1/T_w.
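The scaling (7) can be sketched by treating the windowed average as an ideal low-pass filter, as noted in footnote 4: the variance of the windowed mean is approximately the spectral content below f = 1/T,

```latex
\sigma^2[\hat{\mu}_x] \;\approx\; \int_0^{1/T} G_{xx}(f)\,\mathrm{d}f
\;\propto\; \int_0^{1/T} f^{p}\,\mathrm{d}f
\;=\; \frac{T^{-(1+p)}}{1+p},
\qquad\Rightarrow\qquad
\varepsilon[\hat{\mu}_x] \;\propto\; T^{-\frac{1+p}{2}},
```

valid for p > −1; setting p = 0 recovers the white noise result ε ∝ T^{-1/2}.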
It is seen therefore that the white noise error scaling for large T is the most conservative assumption compatible with the stationarity condition: p < 0 gives G_xx(0) = ∞. For signals with a statistical convergence more rapid than band-limited white noise (e.g. quasi-periodic signals), the method will therefore over-estimate the error.
Finally, the interpretation of the estimated error magnitude will be discussed. The collection of statistical values over data windows (φ̂_{T_w} in (3)) constitutes a sampling of the original random variable. φ̂_{T_w} is a random variable in its own right, with a probability distribution function P(φ̂_{T_w}) referred to as the sampling distribution. As the sample size T_w becomes large enough to encompass several uncorrelated samples⁵ of x(t), P(φ̂_{T_w}) approaches a Gaussian distribution. This practical outcome of the central limit theorem holds regardless of the distribution of x(t). This means that the quantified error can be used to establish confidence intervals for the unknown true values of the statistical quantities, namely

\frac{\hat{\phi}}{1+\varepsilon} \le \phi \le \frac{\hat{\phi}}{1-\varepsilon} \quad \text{with 68\% confidence, or}

\frac{\hat{\phi}}{1+2\varepsilon} \le \phi \le \frac{\hat{\phi}}{1-2\varepsilon} \quad \text{with 95\% confidence.}  (8)

In this manner, estimates of the random statistical error magnitude and the probability distributions of statistical quantities from unsteady simulations can be obtained, with the simulated time series as the sole input.
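The intervals of (8) follow directly from φ̂ and ε; a small illustration (the helper name is our own):

```python
def confidence_interval(phi_hat, eps, level=68):
    """Confidence interval for the true value phi according to Eq. (8):
    one standard error for 68% confidence, two for 95%."""
    k = {68: 1, 95: 2}[level]
    return phi_hat / (1 + k * eps), phi_hat / (1 - k * eps)
```

For example, φ̂ = 1.0 with ε = 0.05 gives the 95% interval [1/1.1, 1/0.9] ≈ [0.909, 1.111].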
Validation for analytical signals
As a first step towards demonstrating the performance of the error estimation methodology, validation using synthesised analytical signals is carried out. Gaussian white noise with µ = 1 and σ = 0.1 was generated between 0 s ≤ t ≤ 1 s with ∆t = 1×10⁻⁵ s, giving rise to N = 1×10⁵ + 1 samples. With the fluctuation energy distributed equally over all frequencies, the bandwidth is given by the Nyquist maximum resolved frequency, i.e. B = 1/(2∆t) = 5×10⁴ Hz.

Figure 1 shows the error trend computed for µ and σ from the white noise signal using (3) (symbols), together with the analytical distribution obtained by setting B = 5×10⁴ Hz in (4) & (5) (dashed lines). The computed data points follow the exact solution very well for small T, before some deviation is seen around T > 10⁻¹ s. This occurs because of the low number of independent statistical samples available. The automatically-determined value of B is under-predicted by around 17%, which leads to an over-prediction of ε of around 10%. This arises from choosing the lowest B arising from the data, motivated by the desire to err conservatively. The performance of the error estimation procedure is considered highly satisfactory on the basis of the white noise test.

⁵Bendat & Piersol (2000) state that the Gaussian assumption for the sampling distribution becomes reasonable above 4 and accurate above 10 samples in most cases.
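This white-noise test is easy to reproduce numerically; the sketch below generates the signal described above and compares the windowed error of the mean for one window length against the analytical value of (4). The seed and window count are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(1)                  # arbitrary seed
dt = 1e-5
x = 1.0 + 0.1 * rng.standard_normal(100001)     # mu = 1, sigma = 0.1
B = 1.0 / (2 * dt)                              # Nyquist bandwidth, 5e4 Hz

n_w = 1000                                      # window of T_w = 0.01 s
m = (len(x) // n_w) * n_w
mu_w = x[:m].reshape(-1, n_w).mean(axis=1)      # window means
eps_computed = np.sqrt(np.mean((mu_w - x.mean()) ** 2)) / abs(x.mean())
eps_exact = (0.1 / 1.0) / np.sqrt(2 * B * n_w * dt)   # Eq. (4) with T = T_w
```

With 100 windows available, the computed value scatters around the analytical one, just as the symbols scatter around the dashed line in Fig. 1.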
Figure 1: Comparison of estimated and exact normalised error trends. Gaussian white noise with B = 5×10⁴ Hz, µ = 1 and σ = 0.1. (Curves shown for both ε[µ] and ε[σ]: exact, Eq. (3), and Eq. (4) curve fit with B = 4.14×10⁴.)
Figure 2: Comparison of calculated and fitted normalised error trends. Sine wave with f = 20 Hz, σ = 0.1 and µ = 1. (Curves shown for both ε[µ] and ε[σ]: Eq. (3) and Eq. (4) curve fit with B = 4.31×10¹.)
To investigate the performance of the method for a case in which the underlying white noise scaling assumption is violated, a similar test was carried out for a sine wave signal. With the same µ, σ, T and ∆t as above, a sine wave of frequency f = 20 Hz was analysed. Figure 2 shows the error trends computed by (3) (symbols) together with the trends obtained by curve-fitting the white noise scaling of (4) & (5) (solid lines). A more rapid error decay, scaling as ε ∝ T⁻¹ for large T, is seen for the sine wave. The white noise trends therefore consistently over-estimate the error. The same value of B furthermore appears not to hold for both ε[µ̂] and ε[σ̂]. Based on the value judgement that it is better to over-estimate than to under-estimate the error, this test therefore demonstrates the benign failure mode of the methodology when applied to a periodic signal.
Demonstration using real data
Having validated the performance of the error estimation method for white noise and demonstrated the invalidity of the underlying assumptions when applied to a periodic signal, it is essential to establish the behaviour for a real turbulent signal. For this purpose, measurements of the unsteady lift coefficient, C_l, for a NACA0021 airfoil at angle of attack α = 60° and Reynolds number Re = 2.7×10⁵ are used. The airfoil is in deep stall and therefore presents a typical bluff-body flow, with strong quasi-periodic fluctuations from the large scale vortex shedding overlaid with broadband random turbulent fluctuations. This signal, measured by Swalwell et al. (2004) and plotted in Fig. 3, therefore presents an ideal test of the method due to its mixed tonal and broadband nature. The signal furthermore covers a very long physical time of around T* = 8800 non-dimensional units⁶, corresponding to around 1760 vortex shedding periods and hence providing a large statistical sample.

A qualitative impression of the statistical convergence of this signal is provided by the running average shown in Fig. 3. In addition, the 68% and 95% confidence intervals for µ_Cl are shown, as defined by (8). These completely envelop the fluctuations in the running mean, which at no point strays outside the 95% confidence interval.
Figure 4: Comparison of calculated and fitted normalised error trends. Lift coefficient signal from Swalwell et al. (2004) measurements. (Curves shown for both ε[µ] and ε[σ]: Eq. (3) and Eq. (4) curve fit with B = 0.135; the point BT* = 5 is marked.)
The error trends calculated from (3) are plotted for the mean and standard deviation of C_l in Fig. 4. The agreement with the white noise error scaling from (4) & (5) is very good for larger T (around T* > 20, i.e. 4 vortex shedding periods), and the value of B fitted to the mean curve applies well to the standard deviation also. The departure from the white noise trend for small T is expected, due to the strong correlation of the signal from the organised vortex shedding motion at these time scales. Indeed, the cited validity of (4) & (5) for BT ≥ 5 gives a good indication of the onset of ε ∝ T^{-1/2} scaling (BT* = 5 is plotted in Fig. 4).

⁶Non-dimensional time units formulated as t* = t|u|_∞/c, where c is the chord length of the airfoil and |u|_∞ is the free stream velocity magnitude.

Figure 3: Time history of lift coefficient (above). Development of running average compared to the best estimate of the mean with 68% and 95% confidence intervals (below). Both plots are zoomed to 0 ≤ t* ≤ 500 for clarity.
The applicability of the central limit theorem is demonstrated in Fig. 5. The probability density function (PDF) of the original signal deviates from the Gaussian PDF, in line with expectation. With growing T_w however, the PDF for [µ̂_Cl]_{T_w} becomes increasingly Gaussian, with very good agreement seen around T*_w > 20. Interestingly, this coincides with the value of T*_w at which the white noise error scaling begins to apply (cf. Fig. 4).
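This practical consequence of the central limit theorem is simple to observe numerically. The sketch below substitutes a strongly non-Gaussian synthetic (exponential) signal for the C_l data and uses excess kurtosis as a crude normality indicator (zero for a Gaussian); the signal choice and window size are our own assumptions:

```python
import numpy as np

def excess_kurtosis(y):
    """Excess kurtosis: fourth standardised moment minus 3 (0 for Gaussian)."""
    z = (y - y.mean()) / y.std()
    return np.mean(z ** 4) - 3.0

rng = np.random.default_rng(3)
x = rng.exponential(scale=1.0, size=200000)   # raw PDF far from Gaussian
w = x.reshape(-1, 100).mean(axis=1)           # 2000 window means

k_raw = excess_kurtosis(x)   # exponential distribution: around 6
k_win = excess_kurtosis(w)   # window means: close to 0 (Gaussian-like)
```

As in Fig. 5, the sampling distribution of the window means is far closer to Gaussian than the raw signal.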
Figure 5: PDFs computed for the raw C_l signal (left) and for the sample mean [µ̂_Cl]_{T_w} with T*_w = 21.90 (right), compared with the Gaussian PDF.
The analysis has until now benefitted from the large statistical sample offered by the experimental time series. However, particularly with application to numerical simulations in mind, this luxury cannot be assumed. To investigate the robustness of the method for short sample lengths, the running value of B has been plotted for growing T. As seen in Fig. 6, B rapidly converges to its final value, such that the resulting ε prediction is accurate to within 10% after T* ≈ 23, or roughly 5 vortex shedding periods. This indicates that the method is reliable even for very short time series and hence useful for unsteady CFD data.
Figure 6: Development of the parameter B with growing sample length T*. Lift coefficient signal from Swalwell et al. (2004) measurements. (Final value B = 0.1352; the range over which ε is predicted to within 10% is marked.)
After programming optimisation, the computational cost of the algorithm is acceptably low. A data set consisting of 10⁵ samples is processed in under 1 s using a single CPU core of a contemporary desktop PC, and the CPU time scales linearly with sample length with a negligible gradient.
3 Detection of initial transient
Having presented the formulation of the method
to quantify statistical error for stationary random data
typical of the time traces arising in turbulent flows, a
method for the detection of initial transient content in
such time traces will be proposed.
Figure 7: Time history of lift coefficient (above). Variation of the product of σ[µ̂_CL] and σ[σ̂_CL] as t*₀ is varied, illustrating the detection of t*_t (middle). Development of the predicted t*_t as the simulation progresses (below).
Formulation of the method
We consider a randomly fluctuating time trace, x(t), that at some point becomes statistically stationary, but which initially exhibits transient character (i.e. statistical quantities varying with time). This may for example arise from switching on an experimental facility or from arbitrary initialisation of a numerical simulation. We refer to the onset time of stationarity as t_t, such that the initial transient lies within 0 ≤ t < t_t. The initial transient may be manifest as a distortion of µ_x or σ_x, in isolation or in combination.

The data is successively shortened by removing samples from the beginning of the time trace, with the shifting start time denoted t₀. The absolute statistical random error, σ[φ̂_x], is estimated using (3–5) for both φ̂_x = µ̂_x and φ̂_x = σ̂_x as t₀ is varied. The underlying hypothesis is that the initial transient present when t₀ < t_t will cause an increase in σ[φ̂_x] relative to the signal without initial transient (t₀ ≥ t_t). Once the initial transient has been removed entirely and the signal is shortened further, σ[φ̂_x] will then begin to rise due to the reduction of T. t_t is therefore expected to correspond to a minimum in σ[φ̂_x](t₀) when the signal is successively shortened in this way. To accurately detect this minimum, the method therefore requires a signal for which a statistically steady state has been reached over a sufficient sample size.

To sensitise the method to initial transient both in the mean and in the fluctuation amplitude, the variations of σ[µ̂_x](t₀) and σ[σ̂_x](t₀) are computed. To arrive at a single value for t_t, these must be combined in some way. We have found that the most robust results are given by searching for a minimum in the product of these quantities, using two independent curve fit parameters B_µ̂ and B_σ̂.

Due to the shrinking statistical sample as t₀ approaches T, the computed errors become increasingly unreliable, which could cause spurious minima for large t₀. For this reason t₀ is limited to 0 ≤ t₀ ≤ T/2.
Demonstration and validation
As a test data set, a time trace of the integral lift coefficient⁷, C_L, from a detached-eddy simulation (DES)⁸ of the same deep-stall NACA0021 test case is used. This signal was recorded from the start of the simulation, which was initialised with a uniform flow field and hence contains pronounced initial transient. The signal is plotted in Fig. 7. A total of T* ≈ 880 was simulated using a time step of ∆t* = 0.025 (hence a total of N ≈ 35300 time steps). Further details concerning this simulation can be found in Garbaruk et al. (2009), Haase & Peng (2009) and Mockett (2009).

The onset of stationarity was detected at t*_t = 77.5, which agrees well with intuitive inspection. A clear minimum is seen in the product of σ[µ̂_CL] and σ[σ̂_CL] when t*₀ = t*_t, plotted in Fig. 7. Some spurious local minima are seen above t*₀ ≈ 300 due to the reduced sample size discussed previously; however, these do not affect the global minimum.

It is clear that the transient detection method requires a certain sample of stationary signal for t > t_t in order to accurately predict t_t. A feeling for the robustness of the method is given by Fig. 7. Here, the predicted initial transient is plotted as the simulation progresses (i.e. T* is increased), and data lying on the line t*_t = T* represent rejection of the entire signal. The method is initially indecisive, switching between t*_t = T* and several premature values for t*_t. However, when T* ≈ 165 the method finds the final transient and makes only minor adjustments thereafter. For this signal therefore, a total sample length of just over 2t_t suffices to predict t_t.

⁷In contrast to the experiment, the DES lift coefficient was integrated over the entire airfoil, measuring 3.24c in the spanwise direction. For this reason an upper case L is used in the subscript.
⁸See Spalart et al. (1997) and Spalart (2009).
This is typical of the method’s performance for
a wide variety of signals examined during extensive
testing. These include different force components
and point traces from a wide range of flows, com-
puted both with turbulence-resolving approaches and
URANS. Initial transient is also successfully detected
when manifest in the standard deviation only.
Due to repeated application of the error estimation algorithm, the transient detection method is naturally more computationally intensive, with CPU time proportional to N². With the current implementation, a signal of N = 5×10⁴ samples is processed in around 45 s using a single CPU core on a modern desktop PC. This is considered acceptable for routine application.
4 Conclusions and outlook
A robust and efficient method for the estimation
of the random statistical error has been proposed and
demonstrated. The results shown for synthesised sig-
nals serve to validate the algorithm and to demonstrate
its conservative behaviour when the underlying as-
sumptions are violated. In-depth analysis of an exper-
imental signal has indicated the validity of the white
noise scaling to describe the statistical error for large
sample sizes and the applicability of the central limit
theorem. Building upon this, a technique for the de-
tection of initial transient has been proposed. Results
shown for a DES time trace demonstrate both good
agreement with intuition and robust response to vary-
ing sample length. Extensive testing on a variety of
different signal types has proved equally satisfactory.
Importantly, both methods have been demonstrated
to deliver accurate estimates even for short input time
traces, indicating usefulness for unsteady CFD appli-
cations where long time traces are expensive. The al-
gorithms furthermore require no external input aside
from the data itself, and are hence suitable for au-
tomatic or script-based application. The tools can
naturally be combined, such that initial transient is
truncated before a statistical evaluation and error es-
timation. A perspective application certainly exists in
applying these methods to automatically control un-
steady simulations: The combination of the transient
detection algorithm with an archive of regularly saved
restart files allows a retrospective adjustment of the
Reynolds averaging time range. The total run time of
the simulation can furthermore be specified by the user
in terms of a desired statistical accuracy in a set of im-
portant engineering quantities. These features would
therefore significantly reduce user burden, of particu-
lar value in an industrial context.
The additional information regarding the statisti-
cal error magnitude adds significant value to simula-
tion results, in both industrial and academic CFD sce-
narios. It is expected that an approximate and cost-
effective solution with a quantified confidence inter-
val will in many industrial situations be considered
of higher value than a more precise but much more
expensive result. Should additional simulation for a
more precise result be desired, the error trends de-
rived can be used to forecast the number of additional
time steps needed. For academic investigations, it is
furthermore essential to demonstrate that differences
between simulation results exceed the statistical error
margin before sound conclusions can be drawn. The
error estimation method presented here was indeed
successfully applied for this purpose in the DES study
of Garbaruk et al. (2009). Knowledge of the proba-
bility distribution furthermore allows hypothesis tests
to be formulated, e.g. “Simulation A agrees with the
benchmark data with a probability of P%.”
Finally, it remains to be stated that the emphasis
placed on unsteady CFD in the field of applications en-
visaged for these methods is a natural outcome of the
authors’ previous experience in this field. Although
particularly pertinent as such, the methods are in prin-
ciple applicable to any applications involving the sim-
ulation or measurement of stationary random data.
Acknowledgements
The authors are grateful for the valuable input pro-
vided by S. Julien and P. Spalart. The simulation data
were generated during the course of the EU DESider
project (AST3-CT-2003-502842) for which CPU time
was granted by the North German Supercomputing
Alliance (HLRN, www.hlrn.de). The methods
are being applied within the ongoing EU ATAAC
and VALIANT projects (ACP8-GA-2009-233710 &
ACP8-GA-2009-233680 respectively). The experi-
mental data was kindly provided by K. Swalwell.
References
Bendat J. and Piersol, A. (2000), Random data: Analysis
and measurement procedures, John Wiley & Sons Inc.
Garbaruk A., Leicher S., Mockett C., Spalart P., Strelets, M.
and Thiele, F. (2009), Evaluation of time-sample and span-
size effects in DES of nominally 2D airfoils beyond stall,
In Proc. 3rd Symp. Hybrid RANS-LES Methods, Gdansk,
Poland.
Haase W., Braza M. and Revell, A., editors (2009), DESider
– A European effort on hybrid RANS-LES modelling,
Vol. 103 of Notes Numer. Fluid Mech. Multidisc. Design,
Springer Verlag.
Mockett C. (2009), A comprehensive study of detached-eddy simulation, PhD Thesis, Technische Universität Berlin.
Spalart P. (2009), Detached-eddy simulation. Ann. Rev.
Fluid Mech., Vol. 41, pp. 181–202.
Spalart P., Jou W., Strelets M. and Allmaras, S. (1997), Com-
ments on the feasibility of LES for wings, and on a hybrid
RANS/LES approach. Adv. DNS/LES, Vol. 1.
Swalwell K., Sheridan, J. and Melbourne, W. (2004), Fre-
quency analysis of surface pressure on an airfoil after stall.
In Proc. 21st AIAA Appl. Aerodyn. Conf.