
Feature Selection for Monitoring Erosive Cavitation on a Hydroturbine


Abstract

This paper presents a method for comparing and evaluating cavitation detection features, the first step towards estimating remaining useful life (RUL) of hydroturbine runners that are impacted by erosive cavitation. The method can be used to quickly compare features created from cavitation survey data collected on any type of hydroturbine, sensor type, sensor location, and cavitation sensitivity parameter (CSP). Although manual evaluation and knowledge of hydroturbine cavitation are still required for our feature selection method, the use of principal component analysis greatly reduces the number of plots that require evaluation. We present a case study based on cavitation survey data collected on a Francis hydroturbine located at a hydroelectric plant and demonstrate the selection of the most advantageous sensor type, sensor location, and CSP to use on this hydroturbine for long-term monitoring of erosive cavitation. Our method provides hydroturbine operators and researchers with a clear and effective means to determine preferred sensors, sensor placements, and CSPs while also laying the groundwork for determining RUL in the future.
Seth W. Gregg1, John P.H. Steele2, and Douglas L. Van Bossuyt3
1Logical Systems, LLC., Golden, Colorado, 80401, USA
2Department of Mechanical Engineering, Colorado School of Mines, Golden, Colorado, 80401, USA
3KTM Research, LLC., Tualatin, Oregon, 97062, USA
Cavitation events in hydroturbines can lead to damage to the
turbine runners and reduced remaining useful life (RUL). Cur-
rent methods of detecting cavitation events and prognosticat-
ing RUL have not been successful in providing hydroelectric
power plant operators with meaningful information. Structured
methods of data collection and feature selection, as well as
automated methods for cavitation detection and RUL prediction,
are needed to provide plant operators with a clear view
of hydroturbine health and RUL. The collection of useful data
from hydroturbines is well established and the tool chain to
calculate RUL is understood. However, feature selection and
automated cavitation detection remain to be addressed. In this
paper, we specifically examine feature selection in the larger
context of calculating RUL.

Seth W. Gregg et al. This is an open-access article distributed under the terms
of the Creative Commons Attribution 3.0 United States License, which permits
unrestricted use, distribution, and reproduction in any medium, provided
the original author and source are credited.

Hydropower is the largest renewable source of electricity in
the world. Many nations rely heavily on energy generated
from hydraulic turbines including China, Brazil, India, France,
Russia, Norway and Canada (IHA, 2015). In the United States,
hydropower is the largest and most mature renewable energy
source accounting for 48% of all renewable energy and 6.3%
of all electrical energy generated in the United States (U.S.
Energy Information Administration, 2015). In the Northwest-
ern region of the United States, including Washington, Ore-
gon, and Idaho, hydropower is the primary power source ac-
counting for over half the electrical energy generated.
Like other hydraulic machinery such as pumps, ships, and
valves, hydraulic turbines are susceptible to damage caused
by cavitation. Cavitation is a potentially destructive and com-
plex phenomenon involving the formation and rapid collapse
of vapor bubbles in the liquid. The vapor bubbles, or cavities,
form due to local pressure drops caused by sudden changes in
the fluid dynamics caused by rotating blades, sharp curves or
turbulence. Once formed, the cavities can gather into clouds
of vapor bubbles that periodically shed portions of the cloud
and violently collapse when they reach a higher pressure region
in the fluid (Dular & Petkovšek, 2015). When the vapor
cavities collapse, they radiate a high energy acoustic pressure
wave that can lead to pit formation and aggressive material
erosion in nearby surfaces.
International Journal of Prognostics and Health Management, ISSN2153-2648, 2017 003 1
Despite advancements in runner design and cavitation resis-
tant materials, damage caused by cavitation remains one of
the primary causes of turbine failure (Dorji & Ghomashchi,
2014; Kumar & Saini, 2010; Bourdon, Farhat, Mossoba, &
Lavigne, 1999). This problem is highlighted by recent and
ongoing cavitation surveys performed by the United States
Bureau of Reclamation at major hydroelectric plants in north-
ern California and eastern Washington that have recently ex-
perienced costly cavitation damage (Bajic, 2008; Germann &
DeHaan, 2013). These events highlight the need to develop
prognostic methods for estimating the RUL of hydraulic tur-
bine runners experiencing cavitation damage.
One starting point for estimating RUL is to calculate a cavita-
tion erosion rate by comparing the amount of cavitation dam-
age accumulated over a long period of time with the amount
of time the turbine runner experienced cavitation over the
same period. Turbine runners are inspected periodically and
standard methods exist for evaluating cavitation damage
(International Electrotechnical Commission, 2004). Many
methods of turbine cavitation event detection have been de-
veloped over the last 50 years; however, these methods are
not widely used in industry for a variety of reasons includ-
ing: 1) only a limited subset of cavitation events can be de-
tected, 2) too many false positives undermine confidence in
the methods, 3) methods are turbine-specific and not generalizable,
and 4) installing new instrumentation to detect cavitation
events is overly burdensome on hydro power plant operators,
especially with regard to operating budgets.
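
The erosion-rate starting point described above can be illustrated with a minimal sketch. It assumes a constant erosion rate per hour of cavitating operation, which is a simplification we introduce here for illustration only; the function names, the damage units (mm of pit depth), and all numbers are hypothetical and are not taken from any survey.

```python
def erosion_rate(damage_start, damage_end, cavitation_hours):
    """Average erosion per hour of cavitating operation between two inspections."""
    if cavitation_hours <= 0:
        raise ValueError("cavitation_hours must be positive")
    return (damage_end - damage_start) / cavitation_hours


def remaining_cavitation_hours(damage_now, damage_limit, rate):
    """Hours of cavitating operation left before damage reaches the limit.

    Assumes the erosion rate stays constant (an illustrative simplification).
    """
    if rate <= 0:
        return float("inf")
    return max(damage_limit - damage_now, 0.0) / rate


# Hypothetical inspection data: pit depth in mm at two inspections and the
# hours of detected cavitation between them.
rate = erosion_rate(damage_start=2.0, damage_end=5.0, cavitation_hours=1500.0)
hours_left = remaining_cavitation_hours(damage_now=5.0, damage_limit=12.0, rate=rate)
print(f"rate = {rate:.4f} mm/h, remaining cavitating hours = {hours_left:.0f}")
```

The quality of such an estimate depends directly on how reliably the hours of erosive cavitation are detected, which is why the choice of detection feature matters.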
When significant cavitation damage is discovered during routine
turbine runner inspections at maintenance intervals, hydro
power plant operators typically perform a cavitation damage
survey. The survey consists of heavily instrumenting
the hydroturbine and running it through a variety of operating
regimes in an attempt to understand what operating conditions
lead to cavitation events that can cause turbine runner
damage. After the survey is completed, the information
is used to develop operating guidelines to avoid operating
regions where damage can occur. While this approach
works to reduce damage in the short term, several problems
exist with the approach including: 1) cavitation damage surveys
often examine only the limited range of operating conditions
available during the survey; conditions such as hydrostatic
head, water temperature, and interference from sister
turbines within the power plant change seasonally or
year-to-year, especially due to drought conditions, 2) changes
to the hydroturbine and associated equipment during repair
or overhaul can change the operating regions in which cavitation
occurs, 3) data is not generally collected and used beyond
the cavitation damage survey to determine RUL during
routine operations, and 4) due to the time-consuming nature
of manual comparison, only a limited number of cavitation
detection features are typically compared, so the most effective
feature may not be used, which can lead to missed cavitation
events and excessive false positive identification of cavitation.
Of specific interest to this paper is determining appropriate
cavitation detection features to use on a specific hydroturbine.
Many cavitation detection features have been proposed in the
literature and have been used with varying degrees of success
in practice; however, no single cavitation detection method is
appropriate for all scenarios. The three constituent compo-
nents of a cavitation detection feature include: 1) sensor type,
2) sensor placement, and 3) cavitation sensitivity parameter (CSP).
1.1. Specific Contributions
In this paper, we present a method to rapidly compare cavita-
tion detection features and select which cavitation detection
features best identify when a hydroturbine runner is experi-
encing an erosive cavitation event. While Principal Compo-
nent Analysis (PCA) and feature selection are well-understood
methods of developing health monitoring for systems, to our
knowledge, this is the first work to apply this approach to hy-
droturbines. When compared to previous research aimed at
comparing sensors, sensor placement, or CSPs, (Bajic, Ser-
vices, Gmbh, & Zithe, 2003; Schmidt et al., 2014) our method-
ology uses a more objective, statistics-based approach to the
evaluation process. It is important to note here that the method
presented in this paper can discriminate between erosive and
non-erosive cavitation which is important in the ultimate goal
of determining RUL (not addressed in this paper). An added
benefit of using this method is determining the most use-
ful and cost-effective sensors for cavitation detection. This
method is an important step toward fully automated cavitation
detection and RUL calculation, which should lead to more robust
automated detection that operators can rely on.
In this section, we present background information on cavi-
tation damage in hydroturbines to demonstrate the need for
a method to rapidly compare cavitation detection features for
long term monitoring. A review of previous and current work
that has attempted to address hydroturbine cavitation damage
is provided. While efforts have been made to establish reli-
able RUL predictions, hydro power plant operators cannot or
choose not to use existing solutions. The method we present
in this paper builds upon the information presented in this
section toward the eventual goal of predicting RUL.
2.1. Cavitation
Cavitation occurs when vapor bubbles, or cavities, form in
a liquid due to a local decrease in pressure below the fluid
vapor pressure. In hydraulic machinery, cavitation typically
develops in localized areas where a flowing liquid reaches
higher than intended velocities. The liquid then becomes bro-
ken at several points and vapor cavities appear taking on dif-
ferent shapes depending on the structure of the flow (Dular
& Petkovšek, 2015). When the vapor cavities collapse, they
release a large amount of energy and can be very destructive
leading to material erosion on surrounding surfaces. Consequently,
cavitation and cavitation erosion are among the most
pervasive problems found in hydroturbines (see Figure 1),
pumps, and ship propellers.
Figure 1. Cavitation blade damage on a hydroturbine runner
(courtesy of the U.S. Bureau of Reclamation)
Cavitation damage was first noted on ship propellers in the
late 1800s (Thornycroft & Barnaby, 1895). By the early 1900s,
material research was underway to help reduce propeller dam-
age in ocean liners caused by cavitation (Silberrad, 1912).
Soon after, Lord Rayleigh published the first theoretical model
analyzing the collapse of cavitation bubbles in a liquid
(Rayleigh, 1917) helping to explain the high pressure pulses
emitted by the highly compressed bubble at the moment of collapse.
Since Rayleigh’s initial models, there have been ongoing ef-
forts to understand the bubble dynamics and wear mechanism
behind cavitation in greater depth (Harrison, 1952; Naude &
Ellis, 1961; Benjamin & Ellis, 1966; Blake, 1987; Philipp &
Lauterborn, 1998). These studies focus primarily on the dynamics
and damage caused by the collapse of single bubbles
near simple, flat surfaces – a situation not commonly found
in hydraulic machinery.
Recent cavitation studies use experimental setups that better
replicate realistic conditions of cavitation in rotating equipment
(Dular, Stoffel, & Širok, 2006; van Rijsbergen, Foeth,
Fitzsimmons, & Boorsma, 2012; Tan, Miorini, Keller, &
Katz, 2012; Jian, Petkovšek, Houlin, Širok, & Dular, 2015;
Dular & Petkovšek, 2015). These investigations have revealed
previously unseen complexity including a sheet bubble
structure, periodic shedding of bubble formations, and several
collapse modes that lead to varying amounts of surface dam-
age. The complex nature of cavitation leads to difficulties
in generating accurate computer models for predicting cavi-
tation erosion (Jian et al., 2015). Cavitation remains poorly
characterized in complex flow environments which limits the
ability to predict RUL of a hydroturbine runner using physics-
based simulations.
2.2. Cavitation in Hydroturbines
Hydroturbines create energy by taking advantage of water
falling between reservoirs at different elevations. The avail-
able water head and flow determine the design of the hydro-
turbine and play a large role in determining if cavitation will
develop during turbine operation (Avellan, 2004).
Large power plants typically have Kaplan or Francis style tur-
bines. The major difference in these two styles of turbine
is in the design of their impeller-like rotor called the runner.
Kaplan turbine runners are shaped like ship propellers and
are used when low water head is available. Francis turbine
runners are similar to Francis vane pump impellers and are
used for medium to high head applications (Gordon, 2001).
Both turbine types are susceptible to cavitation; however, the
location and type of cavitation typically observed can vary
slightly between turbine types (Escaler, Egusquiza, Farhat,
Avellan, & Coussirat, 2006). Pump-turbines are becoming in-
creasingly common and have a runner design similar to Fran-
cis turbines, but with the added advantage of being able to
be run in reverse as a pump. Pump-turbines are susceptible
to cavitation in either pump or turbine modes of operation
(Hasmatuchi, 2012; Cencîc, Hocevar, & Sirok, 2014).
Important hydroturbine components are shown in Figure 2.
Water flows from the inlet side of the runner into the draft
tube. The amount of power produced by the hydroturbine is
determined by the amount of water flowing through the im-
peller which is controlled by pivoting the inlet guide vanes
open or closed. The area of highest concern for cavitation
damage is on the blades of a turbine runner. For large tur-
bines, the runner can be from 2 to 9 meters in diameter and is
very expensive to replace or repair (“The Knowledge Stream
- Detecting Cavitation to Protect and Maintain Hydraulic Tur-
bines”, 2014).
Hydroturbines can be affected by several types of cavitation
which are characterized by the operating conditions that cause
cavitation to occur and the location where erosion damage
appears. Cavitation types that lead to erosion damage on
the runner include leading edge, traveling bubble, inter-blade
vortex and tip vortex cavitation. Other types of cavitation
including draft tube swirl can cause high vibration, loss of
efficiency and fluctuations in power production, but typically
do not lead to erosion damage (Escaler et al., 2006).
Figure 2. Side view of a Francis style hydroturbine with major
components labeled (CC BY-SA 3.0, Voith Siemens Hydro
Power Generation, n.d.)

Water head at the inlet and draft tube along with flow rate
through the impeller dictate the operating conditions of a
hydroturbine. Hydroturbines are designed to run free from
cavitation. However, the complex nature of cavitation makes
designing and constructing a turbine that is not prone to
cavitation under at least some conditions very difficult. Available
inlet head may also change sufficiently to lead to unexpected
cavitation damage through seasonal reservoir variations or
large climatic events such as drought or flood. To prevent
catastrophic failure, hydroturbine runners are inspected
periodically for cavitation erosion damage and repaired as
necessary. When severe damage is found, a cavitation survey¹
is performed to map out operating ranges where cavitation
is occurring (Germann & DeHaan, 2013; Escaler, Ekanger,
Francke, Kjeldsen, & Nielsen, 2014).
During a cavitation survey, the hydroturbine is temporarily
instrumented with sensors to detect vibration and acoustic
emissions of the shaft and surrounding structure as well as
pressure changes in the penstock. Next, the hydroturbine is
run at incrementally increasing flow rates while sensor data is
collected at each operating condition. The sensor data is then
analyzed to identify operating conditions where cavitation is
occurring so restrictions can be placed on the operating range
of the turbine. In cases where sensors and data collection
equipment are permanently installed, the cavitation survey
can also be used to establish threshold values for on-line
cavitation monitoring and be a basis for monitoring the con-
dition of the turbine runner (Jardine, Lin, & Banjevic, 2006;
Escaler et al., 2014).
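
As a rough illustration of how survey results could be turned into operating restrictions, the sketch below flags the power outputs whose CSP value exceeds a threshold. The threshold rule (three times the lowest CSP value observed in the survey), the function name, and the survey values are all hypothetical assumptions made for this example; in practice the threshold would be set from knowledge of the specific turbine.

```python
def restricted_outputs(survey, threshold):
    """Return the power outputs (MW) whose CSP value exceeds the threshold."""
    return [mw for mw, csp in sorted(survey.items()) if csp > threshold]


# Hypothetical survey: power output in MW mapped to one CSP value per condition.
survey = {10: 0.8, 20: 1.1, 30: 3.9, 40: 4.6, 50: 1.3}

# Assumed rule for illustration: flag anything above 3x the quietest condition.
threshold = 3.0 * min(survey.values())
flagged = restricted_outputs(survey, threshold)
print("operating points to avoid (MW):", flagged)
```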
Cavitation surveys provide valuable information, but the operating
conditions that can be observed during a cavitation
survey are limited by the available inlet and draft tube head at
the time of the survey. Additionally, hydroturbines often operate
in parallel with other turbines and the operating points
of these units can affect the survey findings. Data from a
cavitation survey is only a snapshot of current operating conditions
and cavitation zones rather than a long-term operating
plan, though in our experience many hydroturbine operators
must treat it as such.

¹ Note that the term "cavitation survey" is used internally at the Bureau of
Reclamation and is used in this paper to describe a study conducted on a
hydroturbine to identify operating conditions where cavitation is likely to occur.

2.2.1. Cavitation Detection Features for Hydroturbines
We define cavitation detection features to consist of three
components including: 1) sensor type, 2) sensor placement,
and 3) CSP. The process of extracting the appropriate infor-
mation to monitor (feature selection) is a key component of
both diagnostics and prognostics. Feature selection for cav-
itation monitoring on a hydraulic turbine involves choosing
sensors, sensor placement, data collection equipment, and a
CSP as well as considering location of the cavitation on the
runner, influence of the turbine structure on the sensor signal,
the number of turbines being operated, and the overall design
of the hydroelectric plant. The sheer number of factors that
influence cavitation feature selection for hydroturbines means
a single cavitation detection feature is not necessarily applica-
ble to multiple plants, turbines, and even operating conditions
of the same turbine. In the following subsections, we discuss the
three constituent components of cavitation detection features.
2.2.2. Sensor Type and Sensor Placement
The most common sensors used for cavitation diagnostics are
accelerometers, which produce a signal proportional to ac-
celeration, and acoustic emission sensors, which produce a
signal proportional to the amplitude of small stress waves
that travel through a material. Both sensors are based on
piezoelectric sensing elements and are able to record high
frequency events. Accelerometers used for cavitation diag-
nostics typically have a linear frequency response from 3 to
40,000 Hz while the acoustic emission sensors used respond
well between 40 and 400 kHz. In order to take advantage
of high frequency sensors, signal recording equipment must
be able to record the data at a high sampling rate – typically
around 1 MHz. Butterworth filters are also commonly applied
to recorded data in order to remove spurious signals and fre-
quency content beyond the useful range of the sensor (Escaler
et al., 2006; Germann & DeHaan, 2013; Cencîc et al., 2014;
Escaler et al., 2014).
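
The link between sensor bandwidth and the sampling rate quoted above can be checked with a short calculation. The sketch assumes a sampling margin of 2.5 times the highest frequency of interest; the Nyquist criterion only requires more than 2, and the 2.5 factor and the function name are illustrative assumptions rather than values from the cited works.

```python
def min_sampling_rate_hz(max_signal_hz, margin=2.5):
    """Sampling rate needed to capture content up to max_signal_hz.

    margin must exceed 2 (Nyquist); 2.5 is an assumed practical allowance.
    """
    if margin <= 2.0:
        raise ValueError("margin must be greater than 2 to satisfy Nyquist")
    return margin * max_signal_hz


# Upper frequency limits quoted in the text for each sensor type.
sensor_upper_limit_hz = {"accelerometer": 40_000.0, "acoustic emission": 400_000.0}
required = {name: min_sampling_rate_hz(f) for name, f in sensor_upper_limit_hz.items()}
for name, rate in required.items():
    print(f"{name}: sample at >= {rate / 1e3:.0f} kHz")
```

Under these assumptions the acoustic emission sensor drives the requirement to about 1 MHz, consistent with the sampling rate typically reported.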
Other sensors that are less frequently used for cavitation di-
agnostics include hydrophones and high frequency pressure
sensors, sensitive to pressure events between 2 and 180,000
Hz, and proximity probes that measure shaft movement from
0 to 10 kHz (Rus, Dular, Sirok, Hocevar, & Kern, 2007).
For the most part, proximity probes are used for detecting
lower frequency cavitation events typical of draft tube swirl
or non-cavitation related faults such as an unbalanced or
misaligned hydroturbine shaft. A recent exception to this is the
work by Pennacchi et al. (Pennacchi, Borghesani, & Chatterton,
2015) showing the potential for cavitation detection
using synchronous averaging and spectral kurtosis on low
frequency proximity probe signals in a Kaplan turbine.

Table 1. Cavitation Diagnostic Methods

Method: Varga et al.
  Application: Laboratory turbine and pump test rig
  Sensor type, number, and location: 1 condenser microphone, 1 ac-
  Signal processing and analysis steps: Spectrum analysis 20-40,000
  Cavitation sensitivity parameter: Relative average noise and overall acceleration

Method: Bajic 2002
  Application: 17 MW Francis
  Sensor type, number, and location: 20 acoustic emission sensors, 1 on each guide vane
  Signal processing and analysis steps: 1) normalized power spectra 0.2 kHz - 1 MHz across different turbine power output conditions and 2) polar modulation curve plots from each sensor
  Cavitation sensitivity parameter: Maximum signal amplitude in RMS

Method: Escaler et al. 2006
  Application: 73 MW Kaplan
  Sensor type, number, and location: 5 accelerometers, 2 on the lower guide bearing and 3 on the thrust bearing
  Signal processing and analysis steps: 1) power spectra of raw data 0 - 10,000 Hz and 2) power spectra of demodulated band-pass filtered data 5 - 10 kHz
  Cavitation sensitivity parameter: Maximum signal amplitude in RMS

  Application: 11 MW Francis
  Sensor type, number, and location: 3 accelerometers, 2 on the lower guide bearing, 1 on the inlet guide vane. 1 acoustic emission sensor on the lower guide bearing
  Signal processing and analysis steps: 1) power spectra of raw data 0 - 20,000 Hz and 2) power spectra of demodulated band-pass filtered data 15 - 20 kHz
  Cavitation sensitivity parameter: Maximum signal amplitude in RMS

  Application: 65 MW Francis
  Sensor type, number, and location: 3 accelerometers, 2 on the inlet guide vanes, 1 on the lower guide bearing. 1 acoustic emission sensor on the lower guide
  Signal processing and analysis steps: 1) overall RMS vibration up to 49 kHz, 2) power spectra of raw data 0 - 50,000 Hz, and 3) power spectra of demodulated band-pass filtered data 30 - 50 kHz
  Cavitation sensitivity parameter: Maximum signal amplitude in RMS

Method: Rus et al.
  Application: Kaplan turbine test rig
  Sensor type, number, and location: 1 accelerometer, 1 acoustic emission sensor, and 1 hydrophone on the test rig suction
  Signal processing and analysis steps: 1) power spectra of demodulated band-pass filtered data (several band-pass filter settings
  Cavitation sensitivity parameter: Sum of the Blade Pass Modulation Level (BPML) normalized by the maximum value

Method: and Sirok
  Application: 185 MW
  Sensor type, number, and location: 2 accelerometers, 1 on the lower bearing, 1 on the inlet guide vane. 1 acoustic emission sensor on the bearing. 1 pressure sensor on draft tube wall.
  Signal processing and analysis steps: 1) normalized power spectra, 2) overall RMS value of 5 different band-pass filtered frequency ranges, 3) selection of band-pass filtered value by highest coefficient of determination
  Cavitation sensitivity parameter: Discharge coefficient cavitation estimator (based on band-pass RMS amplitude, water flow rate, runner discharge diameter, and rotational speed)

Method: Escaler et al. 2015
  Application: 26 MW Francis turbine with leading edge cavitation and draft tube swirl
  Sensor type, number, and location: 4 accelerometers, 2 on the lower guide bearing, 1 on the guide vane, 1 on the draft tube wall. 1 acoustic emission sensor on the lower guide bearing, 1 pressure sensor on the draft
  Signal processing and analysis steps: 1) power spectra of raw data 0 - 45 kHz and 0 - 20,000 kHz, 2) RMS level of band-pass filtered data, 3) power spectra of demodulated band-pass filtered
  Cavitation sensitivity parameter: Power estimation of modulating frequencies

Typical sensor locations to monitor cavitation on hydrotur-
bines include: 1) upper and lower turbine bearings, 2) the
stem of an inlet guide vane (also called a wicket gate), and 3)
the draft tube wall. In experimental setups, sensors are some-
times attached to other locations including the hydroturbine
case, test stand frame, or directly to the hydroturbine shaft.
Sensor placement and orientation on one of the above identified
locations can significantly impact the signal response, as
Schmidt et al. (2014) show.
2.3. Diagnostic Methods
A summary of several options available to hydroturbine op-
erators for cavitation diagnostics is shown in Table 1. To be
practical for long term cavitation monitoring and RUL esti-
mation, a diagnostic method should: 1) be effective for the
turbine configuration and cavitation type, 2) produce a CSP
value that correlates with cavitation erosion rates, 3) consist
of sensors and hardware that are reasonable in cost and prac-
tical for installation in a power plant environment. Selecting
the right diagnostic method for a given hydroturbine is dif-
ficult since no method has been shown to meet all these re-
quirements in every situation. In addition, direct comparison
of diagnostic methods in literature is rare as research instead
focuses on demonstrating the efficacy of a newly proposed method.
Bajic (Bajic, 2002) promotes the use of a multidimensional
technique that he states is effective for all hydroturbines and
cavitation types; however, to be implemented it requires an
acoustic emission sensor be installed on every inlet guide
vane stem. Hydroturbines commonly have 20 or more in-
let guide vanes and installation of this number of sensors is
impractical in most hydro plants. The large quantity of data
produced from this number of sensors also means analysis is
a time consuming process and long term collection and stor-
age of data is cumbersome. Escaler and Rus (Escaler et al.,
2006; Rus et al., 2007; Escaler et al., 2014) show good cavi-
tation detection results by first band-pass filtering the sensor
signals, then using the power spectrum of the demodulated
signal to select frequency peaks sensitive to leading edge cav-
itation. Escaler suggests this technique is widely applicable;
however, Cencic (Cencîc et al., 2014) claims the methodology
is not often practical because, to be effective, the sensors
must be placed in largely inaccessible locations.
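
The band-pass-then-demodulate procedure described above can be sketched as follows. This is an illustrative approximation, not the implementation used in the cited works: it replaces the band-pass filter with an ideal FFT mask and obtains the envelope from the analytic signal of the band-limited data. The function names, the synthetic signal, and the 300 Hz modulation frequency are assumptions chosen only to make the example self-checking.

```python
import numpy as np


def bandpass_envelope(x, fs, f_lo, f_hi):
    """Envelope of x restricted to [f_lo, f_hi] Hz using a one-sided FFT mask."""
    n = len(x)
    spectrum = np.fft.fft(x)
    freqs = np.fft.fftfreq(n, d=1.0 / fs)
    # Keep only positive frequencies inside the band and double them; the
    # inverse transform is then the analytic signal of the band-passed data.
    mask = np.where((freqs >= f_lo) & (freqs <= f_hi), 2.0, 0.0)
    analytic = np.fft.ifft(spectrum * mask)
    return np.abs(analytic)


def envelope_spectrum(envelope, fs):
    """Single-sided amplitude spectrum of the mean-removed envelope."""
    n = len(envelope)
    amplitude = np.abs(np.fft.rfft(envelope - envelope.mean())) * 2.0 / n
    return np.fft.rfftfreq(n, d=1.0 / fs), amplitude


# Synthetic signal (not survey data): an 18 kHz carrier amplitude-modulated
# at 300 Hz, plus a strong 50 Hz component that lies outside the band.
fs = 100_000.0
t = np.arange(10_000) / fs
x = (1.0 + 0.5 * np.cos(2 * np.pi * 300.0 * t)) * np.sin(2 * np.pi * 18_000.0 * t)
x += 2.0 * np.sin(2 * np.pi * 50.0 * t)

env = bandpass_envelope(x, fs, 15_000.0, 20_000.0)
freqs, amp = envelope_spectrum(env, fs)
peak_hz = freqs[np.argmax(amp)]
print(f"dominant modulation frequency: {peak_hz:.0f} Hz")
```

In a real survey the peaks of interest in the envelope spectrum would be sought at frequencies tied to the machine, such as the blade passing frequency, rather than at a known synthetic value.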
Evaluation of the root mean square (RMS) amplitude of the
sensor signals is the most widely used technique for cavita-
tion diagnostics. Overall RMS calculated from raw sensor
signals is sensitive to cavitation events, but also picks up un-
wanted contributions from other machinery faults or outside
sources of noise. Two methods are suggested for reducing
the effects of unwanted contributions to the sensor signals
for RMS calculations: 1) apply a high-pass filter to sensor
signals to remove amplitude contributions from turbine run-
ning speed and low frequency faults, and 2) apply a band-
pass filter to the signals and calculate RMS amplitude from
a narrow frequency range that is only sensitive to cavitation
events. Some combination of high-pass and band-pass filtering
is used in every cavitation diagnostic method we reviewed.
Escaler et al. (Escaler et al., 2014) use a band-pass
filter range of 15 to 20 kHz for accelerometers and 40 to 45
kHz for acoustic emission sensors to reduce the influence of
outside noise. Cencic et al. (Cencîc et al., 2014) evaluated
five frequency ranges for their response to cavitation over several
operating conditions, and ultimately found that the frequency
range from 22 to 26 kHz for accelerometers and above 50
kHz for acoustic emission sensors showed the best sensitivity.
Bajic (Bajic, 2002) suggests different frequency ranges
can be used to detect different types of cavitation, but does
not suggest that a single best frequency range can be assumed
before analysis of the cavitation survey data.
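
The difference between overall, high-pass, and band-pass RMS can be made concrete with a small sketch. Rather than applying a time-domain filter, it computes the RMS contribution of a frequency band directly from the spectrum using Parseval's relation, which is a substitution we make to keep the example short. The function name, the synthetic signal, and the 1 kHz high-pass cutoff are illustrative assumptions; the 15 to 20 kHz band is the accelerometer range quoted above.

```python
import numpy as np


def band_rms(x, fs, f_lo=None, f_hi=None):
    """RMS of x restricted to [f_lo, f_hi] Hz, computed from its spectrum."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # Interior rfft bins count twice because their negative-frequency
    # mirror images are omitted; DC and Nyquist bins count once.
    weights = np.full(freqs.shape, 2.0)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[-1] = 1.0
    mean_square = weights * np.abs(spectrum) ** 2 / n**2
    keep = np.ones(freqs.shape, dtype=bool)
    if f_lo is not None:
        keep &= freqs >= f_lo
    if f_hi is not None:
        keep &= freqs <= f_hi
    return float(np.sqrt(mean_square[keep].sum()))


# Synthetic signal (not survey data): a large 200 Hz component standing in for
# running-speed vibration plus a small 18 kHz component standing in for cavitation.
fs = 1_000_000.0
t = np.arange(10_000) / fs
x = 3.0 * np.sin(2 * np.pi * 200.0 * t) + 1.0 * np.sin(2 * np.pi * 18_000.0 * t)

overall = band_rms(x, fs)
high_passed = band_rms(x, fs, f_lo=1_000.0)
band_passed = band_rms(x, fs, f_lo=15_000.0, f_hi=20_000.0)
print(overall, high_passed, band_passed)
```

In this synthetic case the overall RMS is dominated by the low-frequency component, while both filtered values isolate the high-frequency content.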
2.3.1. Prognostics
Prognostics is the process of using a system's state and degradation
rate to predict the health of the system at a future
state (Heng, Zhang, Tan, & Mathew, 2009). Prognostic meth-
ods typically utilize historical condition monitoring data com-
bined with either physics-based or data-driven models to pre-
dict the future trend of the condition monitoring data and es-
timate the remaining useful life (An, Kim, & Choi, 2013).
While physics-based approaches can provide accurate future
health information, adequate models of cavitation erosion in
complex hydraulic environments such as hydroturbines do
not exist or have not been validated outside of laboratory en-
vironments (Dular et al., 2006; Dular & Coutier-Delgosha,
2009; Jian et al., 2015). We advocate for a data-driven prognostic
method to estimate turbine runner erosion rates and RUL.
Feature selection for health monitoring is used in several fields
(Hamby, 1994; Moriasi et al., 2007; Malhi & Gao, 2004).
Sensor data of the system being monitored is analyzed with a
variety of sensitivity parameters, e.g., maximum signal amplitude
in RMS (Escaler et al., 2006), power estimation of
modulating frequencies (Escaler et al., 2014), and others
(Varga, Sebestyen, & Fay, 1969; Bajic, 2002; Rus et al.,
2007; Cencîc et al., 2014). PCA is performed on the sensitivity
parameters to analyze the features (T. Wang, Yu, Siegel,
& Lee, 2008). The principal component scores are often then
analyzed using correlation coefficients to determine the best
features for a specific health monitoring application (Guyon
& Elisseeff, 2003). While feature selection is well understood
in certain fields and industries, it has not been developed to
the extent that we present here.
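
The generic PCA-plus-correlation workflow described above can be sketched in a few lines. This is an illustration of the general technique only, under assumptions we state here: columns are z-scored before PCA, PCA is computed by singular value decomposition, and each feature is ranked by its Pearson correlation with the first principal component score. The function names and the synthetic three-column matrix are hypothetical.

```python
import numpy as np


def zscore(matrix):
    """Scale each column to zero mean and unit variance."""
    return (matrix - matrix.mean(axis=0)) / matrix.std(axis=0)


def pca_scores(matrix):
    """Return principal component scores and explained variance ratios."""
    centered = matrix - matrix.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u * s  # projection of each row onto the principal components
    explained = s**2 / np.sum(s**2)
    return scores, explained


def correlation_with_pc(matrix, scores, component=0):
    """Pearson correlation of each feature column with one PC score column."""
    return np.array([
        np.corrcoef(matrix[:, j], scores[:, component])[0, 1]
        for j in range(matrix.shape[1])
    ])


# Synthetic features (not survey data): two columns that rise with load and
# one alternating column that carries no trend.
load = np.linspace(0.0, 1.0, 50)
features = zscore(np.column_stack([load, load**2, (-1.0) ** np.arange(50)]))

scores, explained = pca_scores(features)
corr = correlation_with_pc(features, scores)
print("explained variance ratio:", np.round(explained, 3))
print("|correlation| with PC1:", np.round(np.abs(corr), 3))
```

The sign of a principal component is arbitrary, so the absolute value of the correlation is the quantity to compare across features.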
Existing attempts at data-driven hydroturbine cavitation ero-
sion prognostics or RUL prediction have not been fully suc-
cessful for a variety of reasons. Francois (Francois, 2012)
reports on Hydro Quebec's attempts at erosion rate estimation
that have produced no published results at the time of this
writing. Wolff, Jones and March (Wolff, Jones, & March,
2005) collected data between hydroturbine runner inspections
in an attempt to establish an erosion rate model based on in-
spection reports but insufficient data has stymied this effort.
Several researchers have suggested that their cavitation detec-
tion features and methodologies may possibly be used for ero-
sion estimation or RUL prediction but these researchers have
yet to demonstrate a successful implementation in a hydroturbine
operating at a hydroelectric plant (Dular et al., 2006;
Cencîc et al., 2014; Escaler et al., 2014). To our knowledge,
no one has published a successful hydroturbine cavitation
erosion prognostic or RUL prediction method. In addition,
no research group has addressed cavitation detection
feature selection, instead choosing a cavitation detection fea-
ture a priori for their studies. To date, no one has attempted to
address feature selection for cavitation detection in a repeat-
able, objective approach appropriate for hydroturbines. The
method presented in this paper attempts to provide a repeat-
able, structured approach for comparing and selecting cavita-
tion detection features for hydroturbines.
This section presents a method for determining the best cavi-
tation detection feature(s) for a hydroturbine that experiences
cavitation damage. The method is broken into three parts:
1) Data Preparation, 2) Feature Analysis, and 3) Feature
Selection. Within each part, several steps are presented
that guide the practitioner through down-selecting from all of
the possible cavitation detection features to the few that: 1)
provide the best sensitivity to erosive cavitation, 2) produce
the fewest false alarms, and 3) are the most practical to implement
given the specific hydro plant and hydroturbine configuration.
Figure 3 graphically shows the method.
3.1. Data Preparation
Data preparation comprises three steps that take place
after a cavitation survey has been performed: 1) CSPs are
calculated, 2) CSPs are organized into columns of a feature
matrix, and 3) the columns of the feature matrix are normal-
ized. The focus of this method is not on cavitation survey
data collection techniques; further information can be found
in (Escaler et al., 2006; Rus et al., 2007).
Step 1: Calculate Cavitation Sensitivity Parameters:
In this step, data collected during the cavitation survey
(including sensor types, sensor placements, and operating
conditions) are matched with the diagnostic methods found in
Table 1 and split into CSPs specific to each potential
combination of these variables. The CSPs are then calculated
to feed into the matrix developed in Step 2. Most CSPs listed
in Table 1 can be used with any combination of sensor type
and sensor location. For instance, a cavitation survey that
uses accelerometers located on the lower guide bearing and
inlet guide vanes could use overall RMS levels (Varga,
Sebestyen, & Fay, 1969), band-pass filtered RMS levels
(Cencic et al., 2014), and power estimates of modulating
frequencies (Escaler et al., 2014) for CSP values. It is
important to note that, for the columns of the feature matrix
to have the same length, the same quantity of CSP values must
be calculated for every feature at each running condition,
where a running condition is defined as the specific
hydroturbine power output (usually denoted in megawatts (MW);
see footnote 2). However, different operating conditions can
have different quantities of CSP values, if desired. If using
this method to evaluate experimental features, our
recommendation is to include at least one commonly accepted
sensor type, location, and CSP value (preferably a combination
listed in Table 1) to ensure useful comparative results. A
rigorous feature selection process, the primary goal of this
method, requires as many features as are practical to compare.
For example, in the case study presented in the next section,
we demonstrate the method using 61 features derived from 6
unique CSPs and 17 unique operating conditions. Note that the
data we have observed from hydroturbines seldom has a Gaussian
distribution. However, we treat the PCA performed later in
this paper in the same manner as Jolliffe, who states "PCA
[is] a mainly descriptive technique... many of the properties
and applications of PCA and related techniques... have no need
for explicit distributional assumptions" (Jolliffe, 2002).

Figure 3. Cavitation Feature Selection Process
Step 2: Form the Cavitation Feature Matrix from the Cavita-
tion Sensitivity Parameters:
Each combination of sensor type, sensor location and CSP
value (identified in Step 1) is a unique feature that will be
evaluated. In this step, we combine these features into a ma-
trix with a format conducive to the mathematical methods we
use for feature analysis and selection in later steps. Features
are first organized by grouping CSP values from the same op-
erating condition (in most cases, operating condition refers
to hydroturbine power output, but it could also include other
variables such as head, efficiency, or number of concurrently
running hydroturbines), and then creating a column vector, f
(note that a bold typeface indicates a vector or matrix), by
concatenating the groups by increasing power output. In this
way, the data can be viewed and manipulated in the operating
domain of the turbine instead of in the time domain. The
features are then combined into a cavitation feature matrix,
F, where each feature becomes a vertical column of blocks:

$$
\mathbf{F} =
\left[
\begin{array}{ccc}
f_{1,1} & \cdots & f_{1,n} \\
\vdots & & \vdots \\
f_{c,1} & \cdots & f_{c,n} \\ \hline
f_{c+1,1} & \cdots & f_{c+1,n} \\
\vdots & & \vdots \\
f_{2c,1} & \cdots & f_{2c,n} \\ \hline
\vdots & & \vdots \\ \hline
f_{(s-1)c+1,1} & \cdots & f_{(s-1)c+1,n} \\
\vdots & & \vdots \\
f_{sc,1} & \cdots & f_{sc,n}
\end{array}
\right]
\tag{1}
$$

The first block (rows 1 to c) holds Condition 1, the second
block (rows c+1 to 2c) holds Condition 2, and the last block
(rows (s-1)c+1 to sc) holds Condition s.
The column vectors of each block, f1, ..., fn, contain the
values of the features calculated from the different
operating conditions of the turbine. The number of columns,
n, is determined by the number of features being compared.
The number of rows in each block, c, is determined by how
many feature values are calculated for each operating
condition. The number of data blocks, s, is determined by the
number of operating conditions the turbine is run under
during the cavitation survey. In this way, a column vector
spanning all of the blocks contains feature values ranging
across all of the operating conditions the hydroturbine
experienced during the cavitation study.

Footnote 2: Note that while the vast majority of cavitation
surveys are conducted over the same underlying operating
conditions (e.g. hydrostatic head, water temperature,
turbidity, other turbines active at the plant, etc.), it is
possible to conduct a cavitation survey that varies more than
the hydroturbine power output. In this case, multiple running
conditions for each power output exist and each is treated as
an individual operating condition.
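The block structure described in Step 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_feature_matrix`, the nested-dictionary input layout, and the feature labels `Acc1-1` and `AE1-1` are hypothetical and are not taken from the survey data.

```python
import numpy as np

def build_feature_matrix(csp_by_feature, conditions):
    """Stack per-condition CSP values into one column per feature.

    csp_by_feature: dict mapping a feature name to a dict that maps each
        operating condition (here, power output in MW) to a sequence of
        CSP values (hypothetical input layout).
    conditions: iterable of the operating conditions in the survey.
    """
    ordered = sorted(conditions)  # concatenate blocks by increasing power output
    names = list(csp_by_feature)
    columns = []
    for name in names:
        blocks = [np.asarray(csp_by_feature[name][cond], dtype=float)
                  for cond in ordered]
        columns.append(np.concatenate(blocks))
    # Only the total column length is checked here.
    if len({col.size for col in columns}) != 1:
        raise ValueError("features have different numbers of CSP values")
    return np.column_stack(columns), names, ordered

# Toy example: n = 2 features, s = 3 conditions, c = 2 values per condition.
toy = {
    "Acc1-1": {5: [0.1, 0.2], 10: [0.4, 0.5], 15: [0.9, 1.0]},
    "AE1-1": {5: [1.0, 1.1], 10: [2.0, 2.1], 15: [3.0, 3.2]},
}
F, feature_names, ordered_conditions = build_feature_matrix(toy, [15, 5, 10])
print(F.shape)  # s*c rows by n columns
```

Sorting the conditions before concatenation is what moves the data from the time domain into the operating domain, regardless of the order in which the survey was run.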
Step 3: Normalize the Columns of the Cavitation Feature Matrix:
In the third step we normalize the columns of the feature ma-
trix. Normalization allows CSP values with different ampli-
tude scales and units to be directly compared without higher
magnitude CSPs being given undue weighting. CSPs in the
feature matrix can have different units depending on sensor
type and the method used to calculate the parameter.
We use the z-score (sometimes referred to as the standard
score) transformation (Holick, 2013) to normalize the columns
of the feature matrix. The importance of normalization when
comparing data is discussed in detail by Keogh and Kasetty
(Keogh & Kasetty, 2002), and normalization is common in
multivariate statistical analysis (Milligan & Cooper, 1988;
Jolliffe, 2002; Berkhin, 2006; Shalabi, Shaaban, & Kasasbeh,
2006; C. M. Wang & Huang, 2009) as well as machinery
diagnostics and prognostics (Saxena, Celaya, Saha, Saha,
& Goebel, 2009; Khelf, Laouar, Bouchelaghem, Rémond, &
Saad, 2013; Ramasso & Saxena, 2014; Kan, Tan, & Mathew,
2015). Z-score normalization linearly transforms the data to
have a mean of zero and a variance of 1. The new normalized
value has no units and is a measure of the distance, in
standard deviations, from the mean of the data. We recommend
Z-score normalization be applied to each column independently
using Equation 2, where f is the CSP value, f' is the
normalized CSP value, µ is the column mean, and σ is the
column standard deviation:

$$
f' = \frac{f - \mu}{\sigma} \tag{2}
$$
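As a minimal sketch of the column-wise z-score normalization, the following Python fragment may help. The function name `zscore_columns` is hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not state which estimator is used.

```python
import numpy as np

def zscore_columns(F):
    """Normalize each column of F to zero mean and unit variance."""
    F = np.asarray(F, dtype=float)
    mu = F.mean(axis=0)              # column means
    sigma = F.std(axis=0, ddof=1)    # column sample standard deviations (assumed)
    if np.any(sigma == 0):
        raise ValueError("a constant column cannot be z-score normalized")
    return (F - mu) / sigma

# Two toy features with very different amplitude scales.
F = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 200.0],
              [6.0, 400.0]])
F_norm = zscore_columns(F)
print(F_norm.mean(axis=0), F_norm.std(axis=0, ddof=1))
```

After normalization both columns are unitless and on the same scale, so neither dominates the later analysis because of its original magnitude.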
For the remainder of the feature selection process, unless
otherwise noted, the normalized features (f') are used.
3.2. Feature Analysis
Feature analysis consists of the following steps: 4) Perform
Principal Component Analysis on the feature matrix, and 5)
analyze the principal component scores to select the mode of
variance that is best related to erosive cavitation.
Step 4: Perform Principal Component Analysis of the Cavi-
tation Feature Matrix:
In Step 4, we find the underlying modes of variance within the
features in the running condition domain by applying PCA to
the feature matrix. One of the modes of variance will be re-
lated to erosive cavitation and will be used during the feature
selection steps to find the best sensor type, sensor location
and CSP for long term cavitation monitoring.
PCA as described by (Jolliffe, 2002) is one of the most im-
portant and popular methods in multivariate analysis for re-
ducing the dimensionality of data (van der Maaten, Postma,
& van den Herik, 2009). Reducing dimensions when deal-
ing with large data sets is helpful for both finding simpli-
fied structure within the data and removing variables or fea-
tures that do not contribute significantly to patterns in the data
(Shlens, 2014). PCA is commonly used in condition monitor-
ing for data exploration and feature selection in diagnostics
and prognostics (Baydar, Chen, Ball, & Kruger, 2001; Jar-
dine et al., 2006; Si, Wang, Hu, & Zhou, 2011; Ramasso &
Saxena, 2014; Kim, Uluyol, Parthasarathy, & Mylaraswamy,
2012). The difference between our use of PCA and a more
traditional use of PCA is that we use it as a tool for
graphically describing and analyzing variance modes that
relate to bulk analysis of physical phenomena. Some
limitations of PCA must be recognized by the hydroturbine
practitioner, including the fact that PCA only considers
orthogonal transformations and that dimensional reduction is
only applicable to correlated data.
PCA looks to re-express a data set into as few variables as
possible while keeping the variance of the original data. The
output of PCA is a new orthogonal basis matrix, P, consisting
of orthogonal row vectors referred to as the Principal
Components (PCs) of the original data, p1,...,pm. The first PC,
p1, is the direction in the new basis that accounts for the most
variance while the second PC is the orthogonal direction that
accounts for the next most variance and so on for the remain-
ing PCs.
In Step 4, we perform PCA on the correlation matrix of F
to obtain P. F is transformed by P to produce a new
representation of the original data, Y. The column vectors of
Y are the principal component scores and are interpreted as
the modes of variance of the feature matrix. Each of the PC
score vectors in Y is then plotted to view the mode of
variance for each principal component. The transformation is
expressed as:

$$
\mathbf{Y} = \mathbf{F}\mathbf{P}^{T} \tag{3}
$$
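A minimal sketch of Step 4 follows, assuming that PCA on the correlation matrix is carried out by eigendecomposition and that the scores are obtained by projecting the normalized features onto the PCs (the dimensionally consistent reading of "F is transformed by P"). The function name `pca_scores` and the synthetic data are hypothetical.

```python
import numpy as np

def pca_scores(F):
    """PCA on the correlation matrix of F (rows: CSP values, columns: features)."""
    F = np.asarray(F, dtype=float)
    Z = (F - F.mean(axis=0)) / F.std(axis=0, ddof=1)  # z-score each column
    R = np.corrcoef(Z, rowvar=False)                  # feature correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)              # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]                 # re-sort by decreasing variance
    eigvals = eigvals[order]
    P = eigvecs[:, order].T                           # rows of P are the PCs
    Y = Z @ P.T                                       # columns of Y are the PC scores
    return P, Y, eigvals

# Synthetic stand-in for a feature matrix: two features share a trend, one is noise.
rng = np.random.default_rng(0)
trend = np.linspace(0.0, 1.0, 40)
F = np.column_stack([
    trend + 0.05 * rng.standard_normal(40),
    2.0 * trend + 0.05 * rng.standard_normal(40),
    rng.standard_normal(40),
])
P, Y, eigvals = pca_scores(F)
explained = eigvals / eigvals.sum()
print(explained)
```

Plotting each column of `Y` against the running condition of its rows gives the score plots analyzed in the next step.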
Step 5: Analyze Principal Component Scores and Select the
Mode of Variance Related to Cavitation Erosion:
In the final feature analysis step, we view the feature scores in
the running condition domain and identify the modes of vari-
ance that only represent erosive cavitation. Step 5 is needed
because cavitation features pick up disturbances or events
in the hydroturbine not related to erosive cavitation. These
events may be related to non-erosive cavitation, bearing faults,
and noise, and will vary with hydroturbine running condition
in a different way than erosive cavitation. Since events other
than erosive cavitation have different modes of variance,
they will be represented by one or several principal
component scores from Step 4. Selecting only the principal
component scores related to erosive cavitation in Step 5
allows us to rank the features based on correlation in Step 6.
PCA on the feature matrix produces the same number of PCs
and PC scores as there are columns in the feature matrix.
Analyzing all the principal component scores is a
time-consuming process; however, applying PCA shifts a
majority of the important information, in terms of variance,
into the first few PCs, allowing the remaining PCs to be
discarded. Despite knowing that only a few PC scores need to
be retained, there is no straightforward test, such as a scree
plot, that determines which PCs to retain. It should be noted
that occasionally the PCs
with smaller variance do contain useful information and in
such cases should not be discarded. Further information is
provided by Jolliffe for specific cases where this may apply
(Jolliffe, 1982).
When mining large data sets where little information is known
about the data a priori, selecting the correct number of PCs
that truly represent the data is a difficult task and is explored
by (Minka, 2008) as well as (Gavish & Donoho, 2013). In
addition, (Jolliffe, 2002) discusses PC selection techniques
specific to time-series data similar to the feature matrix we
use in this method. The task of Step 5 is not to try and fully
represent the data in the feature matrix, but rather choose the
data of interest for detecting erosive cavitation. When PC se-
lection is looked at in this light, two points about the nature of
the data in the cavitation matrix provide insight into picking
a selection process:
1. As suggested by (Preisendorfer & Mobley, 1988), important
PCs from time-series data will contain clear patterns when
treated as time series themselves.
Similarly, when PC scores from the cavitation feature
matrix are plotted in the running condition domain, they
show clear patterns that can be interpreted by an analyst
knowledgeable about hydroturbine cavitation.
2. In our methodology, the cavitation matrix is built using
features expected to be sensitive to erosive cavitation.
Forming the matrix in this way builds a bias in the
variance of the matrix that promotes erosive-cavitation-related
PCs. This built-in bias ensures that even when a
large number of features are evaluated, only a handful of
PCs will be of significance and require analysis.
Because the cavitation feature matrix has both attributes, the
PC scores can be analyzed in order of decreasing overall vari-
ance until the PC relating to cavitation is found. A simple,
subjective method for selecting the number of principal com-
ponents to keep, such as a scree graph (Cattell, 1966; Jolliffe,
2002), should still be used to confirm Point 2 above; however,
if a scree graph indicates that more than approximately 5
principal components should be retained (see footnote 3), we
suggest the practitioner re-evaluate the features or CSPs used
to build the feature matrix.
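The values plotted in a scree graph can be computed directly from the eigenvalues of the correlation matrix. The sketch below is illustrative only: the eigenvalues are made-up numbers, the function name `explained_variance` is hypothetical, and reading the "elbow" of the graph remains a subjective judgement that the code does not attempt.

```python
import numpy as np

def explained_variance(eigvals):
    """Return per-PC and cumulative fractions of total variance."""
    eigvals = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    ratio = eigvals / eigvals.sum()
    return ratio, np.cumsum(ratio)

# Hypothetical eigenvalues for a six-feature matrix.
ratio, cumulative = explained_variance([6.0, 2.0, 1.0, 0.5, 0.3, 0.2])
for k, (r, c) in enumerate(zip(ratio, cumulative), start=1):
    print(f"PC{k}: {r:.3f} of variance, {c:.3f} cumulative")
```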
Analysis of the PCs is performed by plotting the PC scores
versus hydroturbine running condition and then looking for
changes in amplitude that match likely changes in cavitation
intensity. Figure 4 shows an example of a PC scores plot
showing changes in amplitude versus hydroturbine running
condition. Knowledge about the type(s) of cavitation the hy-
droturbine is experiencing (which can be gained by analyzing
the erosion damage areas) is useful during analysis to help
select PC scores appropriate to erosive cavitation. For addi-
tional guidance on cavitation diagnostics as well as matching
types of cavitation with erosive damage location, see (Escaler
et al., 2006). Further guidance is beyond the scope of this paper.
PCs related to erosive cavitation are retained and will be used
in the next step to evaluate cavitation detection features. For
the purpose of selecting a cavitation detection feature, PCs
not related to erosive cavitation should be discarded. It should
be noted that discarded PCs may be related to non-erosive
cavitation or other hydroturbine faults and as such could be
useful for selecting detection features sensitive to other faults
in the hydroturbine that are of interest to hydro plant
operators.

Footnote 3: Note that the number of principal components to be
retained depends upon what the cavitation feature matrix
contains, specifically how sensitive the features are to
erosive cavitation. For instance, if most of the features
are not sensitive to erosive cavitation, more PCs will be
required to find the erosive cavitation PC than if most of the
features are highly sensitive to erosive cavitation.
3.3. Feature Selection
The goal of the feature selection process is to pick the best
long term cavitation detection feature for an individual hy-
droturbine. There is no definitive way to measure best; how-
ever, we recommend comparing the features using two statis-
tical measurements before relying on subjective judgement.
In Step 6, the correlation coefficients between the PC scores
selected in Step 5 and the columns of the feature matrix are
calculated. In Step 7, feature variability is compared using
the estimated standard deviation at each feature's minimum and
maximum values. The final subjective evaluation is conducted
in Step 8 where features are yet again down selected based on
practical considerations for long term cavitation detection.
Step 6: Calculate and Compare Correlation Coefficients:
Once the principal components that represent erosive cavita-
tion are selected (Step 5), the first feature selection step is
to calculate the sample correlation coefficients between the
columns of the normalized feature matrix f1,...,fn and the
columns of the PC scores y1,...,yn related to the PCs
selected in Step 5. The values of the correlation coefficients are
then used as a basis for removing features that are not sensi-
tive to erosive cavitation (Al-Kandari & Jolliffe, 2005).
Sample correlation coefficients are a statistical measure of
linear dependence between two population samples (Faber,
2012). For our methodology, the population samples are the
cavitation features and the PC scores. The correlation
coefficients between these two populations are designated as
r_yf and are calculated by applying z-score normalization to
the principal component scores, then calculating the
normalized covariance between the score vectors and each
column of the feature matrix:

$$
r_{yf} = \rho(\mathbf{y}', \mathbf{f}') = \frac{1}{N-1} \sum_{i=1}^{N} (y'_i - \mu_{y'}) (f'_i - \mu_{f'})^{*} \tag{4}
$$

where (f'_i - µ_f')* is the complex conjugate, N is the number
of CSP values in each feature, µ_y' is the mean of the score
vector, and µ_f' is the sample mean of the feature vector.
Since the column vectors y' and f' are both real-valued and
normalized to have zero mean, the equation simplifies to:

$$
\rho(\mathbf{y}', \mathbf{f}') = \frac{1}{N-1} \sum_{i=1}^{N} y'_i f'_i \tag{5}
$$

resulting in a scalar value between -1 and 1. Features with
correlation coefficients close to 1 or -1 are linearly dependent
with the PC score, share a similar mode of variance, and are
therefore sensitive to erosive cavitation. Features with
coefficients closer to zero are not sensitive to erosive
cavitation and can be removed from the selection process.

Figure 4. Example of principal component scores plotted versus
hydroturbine running conditions.
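The correlation step can be sketched as follows. This is an illustration under assumptions: both the score vector and the feature columns are z-scored with the sample standard deviation, so the coefficient reduces to a scaled dot product; the function names and the three synthetic features are hypothetical, and the first column doubles as a stand-in for the selected PC score.

```python
import numpy as np

def zscore(a, axis=0):
    """Z-score along an axis using the sample standard deviation."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=axis)) / a.std(axis=axis, ddof=1)

def score_feature_correlations(F_norm, y):
    """Correlation between one PC score vector and every normalized feature."""
    y_norm = zscore(y)
    n_values = y_norm.size
    return (y_norm @ F_norm) / (n_values - 1)

# Synthetic example with a rising, a falling, and an unrelated feature.
t = np.linspace(0.0, 1.0, 50)
F = np.column_stack([t, -t**2, np.cos(6.0 * np.pi * t)])
F_norm = zscore(F, axis=0)
y = t  # stand-in for the PC score related to erosive cavitation
r = score_feature_correlations(F_norm, y)
keep = np.abs(r) >= 0.9  # threshold on the absolute value of the coefficient
print(np.round(r, 3), keep)
```

Features with strongly negative coefficients are kept as well, because the threshold applies to the absolute value of the coefficient.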
A rule of thumb guideline for comparing correlation coeffi-
cients (Holick, 2013) is shown in Table 2. The feature ma-
trix is built from features meant to be sensitive to cavitation;
therefore, several features will have a high or very high de-
gree of dependence with the PC scores associated with ero-
sive cavitation. Based on Table 2 and the expectation of very
high dependence, we recommend removing features with a
correlation coefficient that has an absolute value less than 0.9.
If a suitable feature is not found or the practitioner would like
to evaluate additional options, the threshold can be relaxed to
0.7, but no lower. Below 0.7, features are expected to be of
poor quality and will not be useful for cavitation monitoring.
Step 7: Compare sample standard deviation of each feature's
CSP values at minimum and maximum cavitation intensity:
In Step 7, we compare the dispersion (otherwise known as the
spread or variation of a dataset) of the cavitation features by
calculating the sample standard deviation for each feature at
the hydroturbine running condition with minimum and maximum
erosive cavitation as depicted by the PC scores plot(s)
from Step 5. Features with low standard deviation have less
dispersion (Faber, 2012), and are less likely to produce false
positive and false negative identification of erosive cavitation.

Absolute Value of
Correlation Coefficient ρ    Rule of Thumb
0 ≤ |ρ| < 0.3                Low degree of dependence
0.3 ≤ |ρ| < 0.5              Some degree of dependence
0.5 ≤ |ρ| < 0.7              Significant degree of dependence
0.7 ≤ |ρ| < 0.9              High degree of dependence
0.9 ≤ |ρ| < 1.0              Very high degree of dependence
Table 2. Rule of thumb for comparing correlation coefficients
(Holick, 2013)
The sample standard deviation at the minimum CSP value
(s_CSPmin) is calculated for each feature by first identifying
the hydroturbine running condition at the minimum CSP
value, then using only CSP values from this running condition
to calculate the standard deviation. The standard deviation
estimate at the maximum CSP value (s_CSPmax) is calculated in
a similar way using only CSP values from the same running
condition as the maximum CSP value.
The standard deviation calculation method above will result
in a pair of descriptive statistics for each feature. The standard
deviation pairs are used to rank the remaining features so that
the best cavitation features will have the lowest values for
both statistics. The following guidelines are suggested for
ranking the remaining features:
1. We suggest removing features with high s_CSPmin or
s_CSPmax values from consideration. Establish thresholds
for eliminating features by multiplying the overall
lowest s_CSPmin and s_CSPmax values by a scale factor,
sc (we use a factor of 2), as shown in Equation 6:

$$
\text{threshold}_1 = sc \cdot \min_{\text{features}}(s_{CSP\,min}), \qquad
\text{threshold}_2 = sc \cdot \min_{\text{features}}(s_{CSP\,max}) \tag{6}
$$

Eliminate any feature with an s_CSPmin or an s_CSPmax
value above the respective thresholds and keep the other
features for continued evaluation.
2. We have found that ranking the remaining features based on
the combination of their standard deviation values is a
useful means of comparing features:

$$
s_{CSP\,combined} = s_{CSP\,min} + s_{CSP\,max} \tag{7}
$$

3. Features with lower standard deviation around their
minimum value are given preference over features with lower
standard deviation around their maximum value. This is
done because identifying individual cavitation events is
not so critical, and erring toward reducing the number
of false positives is beneficial to long term acceptance
of cavitation prognostics (Saxena, Celaya, Saha, Saha, &
Goebel, 2010).
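The guidelines above can be sketched in Python as follows. Several points are assumptions made for illustration: the running condition "at the minimum CSP value" is taken to be the condition containing each feature's smallest value, the toy values stand in for normalized CSPs, and all function and feature names are hypothetical. The tie-breaking preference in guideline 3 is not implemented.

```python
import numpy as np

def dispersion_at_extremes(column, row_conditions):
    """Sample standard deviation at the conditions holding the min and max value."""
    column = np.asarray(column, dtype=float)
    row_conditions = np.asarray(row_conditions)
    cond_min = row_conditions[np.argmin(column)]
    cond_max = row_conditions[np.argmax(column)]
    s_min = column[row_conditions == cond_min].std(ddof=1)
    s_max = column[row_conditions == cond_max].std(ddof=1)
    return s_min, s_max

def rank_features(F, row_conditions, names, scale=2.0):
    """Drop features above the scaled thresholds, then rank by combined deviation."""
    stats = np.array([dispersion_at_extremes(F[:, j], row_conditions)
                      for j in range(F.shape[1])])
    threshold_min = scale * stats[:, 0].min()   # scale factor times lowest s_CSPmin
    threshold_max = scale * stats[:, 1].min()   # scale factor times lowest s_CSPmax
    keep = (stats[:, 0] <= threshold_min) & (stats[:, 1] <= threshold_max)
    combined = stats.sum(axis=1)                # s_CSPmin + s_CSPmax
    order = [j for j in np.argsort(combined) if keep[j]]
    return [(names[j], stats[j, 0], stats[j, 1], combined[j]) for j in order]

# Toy data: three conditions (5, 10, 15 MW) with four CSP values each.
row_conditions = np.repeat([5, 10, 15], 4)
F = np.column_stack([
    [0.0, 0.1, 0.0, 0.1, 1.0, 1.1, 1.0, 1.1, 2.0, 2.1, 2.0, 2.1],      # feature A
    [0.0, 0.6, -0.6, 0.0, 1.0, 1.6, 0.4, 1.0, 2.0, 2.6, 1.4, 2.0],     # feature B
    [0.0, 0.15, 0.0, 0.15, 1.0, 1.15, 1.0, 1.15, 2.0, 2.15, 2.0, 2.15] # feature C
])
ranking = rank_features(F, row_conditions, ["A", "B", "C"])
for name, s_min, s_max, s_combined in ranking:
    print(name, round(s_min, 4), round(s_max, 4), round(s_combined, 4))
```

In this toy case the widely scattered feature B exceeds both thresholds and is dropped, while A ranks ahead of C because its combined standard deviation is lower.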
An additional and roughly equivalent method for evaluating
feature dispersion is to calculate interquartile range around
the minimum and maximum CSP values. In this method, box
plots or quantile-quantile plots give the practitioner additional
insight into the structure of CSP variability (Faber, 2012). In
our experience with calculating dispersion of cavitation sen-
sitivity parameters, ranking features using either interquartile
range or standard deviation shows similar results.
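For comparison, the interquartile range at a single running condition can be computed alongside the sample standard deviation. The CSP values below are hypothetical, and the quartiles use NumPy's default linear interpolation, which is an assumption rather than a choice stated in the text.

```python
import numpy as np

def interquartile_range(values):
    """Difference between the 75th and 25th percentiles."""
    q1, q3 = np.percentile(np.asarray(values, dtype=float), [25, 75])
    return q3 - q1

# Hypothetical CSP values recorded at one running condition.
csp_values = [0.10, 0.12, 0.11, 0.13, 0.14]
spread_iqr = interquartile_range(csp_values)
spread_std = np.std(csp_values, ddof=1)
print(spread_iqr, spread_std)
```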
Step 8: Evaluate remaining features based on practical con-
siderations for long term cavitation detection:
In Step 8, we evaluate the remaining cavitation features and
make a final selection based on the following practical
considerations:
1. Are the sensor locations specific to the features practical
for long term usage in the hydroelectric plant?
2. Are the hardware and installation costs required to gen-
erate a feature drastically different than other features?
3. Are the features sensitive enough to erosive cavitation to
be used alone or is more than one feature required?
4. Are the hardware and software specific to each feature
reliable, and can they be maintained by plant personnel?
For the first practical consideration, each remaining feature
can be evaluated by how practical the sensor location is for
permanent installation. Cavitation survey data may have been
collected at sensor locations that work well for cavitation de-
tection, but due to access restrictions, safety regulations, or
the need for equipment modifications, the sensor locations
may not be deemed acceptable for permanent installation.
Cencic et al. (Cencic et al., 2014) note that sealed bearing
housings prevented the consideration of diagnostic techniques
that require a clean transmission path between the runner and
the sensor, such as demodulation, from being used for long
term cavitation monitoring.
The second consideration is to evaluate the hardware and in-
stallation costs required to generate each feature. Cavitation
diagnostics based on data acquired from a large number of
sensors (Bajic, 2002) or using custom-built wireless tech-
nology (Escaler & Egusquiza, 2003; Germann & DeHaan,
2013) have significantly higher equipment cost and complex-
ity when compared to the other methods shown in Table 1
based on fewer, commercially available sensors. The cost of
handling, storing and maintaining the data generated by each
feature is also a part of this consideration. The number of sen-
sors, required sample rate for recording the data, and duration
of the recorded signals all affect data storage requirements
and must be considered when evaluating feature costs.
The third consideration requires evaluating whether each fea-
ture is sensitive enough to erosive cavitation to be used on
its own for long term cavitation monitoring. Features that
generate noisy data or are sensitive to hydroturbine faults not
related to erosive cavitation are unreliable on their own, but
combining multiple features may lead to robust results. When
trying to monitor cavitation to determine erosion rates in a
hydroturbine, Wolff et al. (Wolff et al., 2005) reported
problems due to noisy data and noted that additional sensors
would have been helpful for making more accurate erosion rate
estimates.
The final evaluation for cavitation feature selection is to con-
sider reliability of the hardware and software system required
to generate the feature. If the system requires maintenance or
troubleshooting, consider whether the hydroelectric plant per-
sonnel have the resources to keep the system reliable. Bour-
don et al. (Bourdon et al., 1999; Bourdon, 2000) developed a
sophisticated monitoring system for cavitation detection meant
for making long term erosion rate estimates. Francois
(Francois, 2012) reports, however, that lack of reliability in
the monitoring system led to incomplete data over an 8 year
period, preventing erosion rates from being estimated. The
monitoring system was recently upgraded; however, cavitation
erosion estimates from the system have yet to be published.
3.4. End Result
The output of the feature selection method is the best feature
or group of features to use for long term erosive cavitation
monitoring in a specific hydroturbine. The selected feature(s)
specify the sensor type and sensor location for permanent
installation as well as the cavitation sensitivity parameter to
be monitored over time.
We present here a case study using a real cavitation survey
conducted on a Francis turbine at a hydro power plant lo-
cated in the American West. The data was collected using
the following sensors and sensor placement. An accelerom-
eter and acoustic emission sensor (Acc3 and AE3) were lo-
cated directly on the hydroturbine shaft and data was col-
lected at a sample rate of 1,330,000 S/s. One accelerome-
ter and one acoustic emission sensor were located on both
the lower guide bearing (Acc1 and AE1) and inlet guide vane
stem (Acc2 and AE2), and signals from these sensors were
sampled at a rate of 1,000,000 S/s. A total of four proximity
probes were mounted 90 degrees apart facing the shaft, two
near the lower bearing (PP1 and PP2) and two near the upper
bearing (PP3 and PP4) of the turbine. In addition, a pressure
sensor was located in the wall of the draft tube (PR1). Sig-
nals from the proximity probes and the pressure sensor were
sampled at a rate of 10,000 S/s. The turbine operating condi-
tions captured in the cavitation survey ranged from 5 MW to
85 MW in 5 MW increments resulting in 17 unique operating
conditions. Other running condition variables such as hydro-
static head, other turbines in the plant operating, and other
factors were held effectively constant throughout the survey.
Step 1: Calculate Cavitation Sensitivity Parameters:
In the first step of the feature selection process, we chose
to calculate six CSP values for each sensor where high fre-
quency data was collected and five CSP values for each sen-
sor where medium frequency data was collected. The num-
ber of calculated CSP values was selected to demonstrate the
method without added confusion from many tens or hundreds
of calculated CSP values. The practitioner can decide to use
more or fewer CSP values depending upon the situation and
desired results. In our experience, between five and ten CSPs
per sensor is effective in identifying desirable features.
Every CSP listed in Table 1 uses band-pass filters on raw sen-
sor data and RMS amplitude calculations as either the pri-
mary CSP or as a step to calculating the CSP. Since RMS am-
plitude of band-pass filtered data is so common in cavitation
detection, we use it as the basis for a majority of CSPs cal-
culated in this case study. Alternative CSPs were also calcu-
lated using peak amplitude, crest factor, and kurtosis, which
are common calculations used for condition monitoring out-
side of cavitation detection (Randall, 2010). The alternative
CSPs were included for experimental purposes to compare
methods other than RMS that are very rarely if ever found
in hydroturbine cavitation studies. Practitioners may wish to
include other experimental CSPs to determine if borrowing a
CSP from a different field may provide better results as com-
pared to CSPs traditionally used with hydroturbines.
Table 3 shows the formulas used for calculating RMS, peak,
crest factor, and kurtosis values. Table 4 lists the specific
CSPs calculated for each sensor type used in this case study.
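Calculation     Formula
RMS             $f_{rms} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x_i^2}$
Peak            $f_{peak} = \max(x)$
Crest Factor    $f_{cf} = \frac{f_{peak}}{f_{rms}}$
Kurtosis        $f_{kurt} = \frac{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^4}{\left(\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2\right)^2}$
Table 3. Calculations used for feature CSP values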
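The four calculations can be sketched in Python as below. The filter is an assumption: the case study does not state which band-pass filter design was used, so a crude FFT brick-wall mask stands in for it here. The kurtosis uses the standard fourth standardized moment, and the test signal, sample rate, and function names are all hypothetical.

```python
import numpy as np

def bandpass_fft(x, fs, f_lo, f_hi):
    """Crude brick-wall band-pass by zeroing FFT bins (illustrative stand-in)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs < f_lo) | (freqs > f_hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

def rms(x):
    return np.sqrt(np.mean(np.square(x)))

def peak(x):
    return np.max(x)  # follows the max(x) form listed in Table 3

def crest_factor(x):
    return peak(x) / rms(x)

def kurtosis(x):
    centered = np.asarray(x, dtype=float) - np.mean(x)
    return np.mean(centered**4) / np.mean(centered**2) ** 2

# Synthetic signal: 100 Hz tone plus a 2,000 Hz tone, sampled at 10,000 S/s for 1 s.
fs = 10_000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 100 * t) + 0.5 * np.sin(2 * np.pi * 2000 * t)
x_band = bandpass_fft(x, fs, 40, 1000)  # keeps only the 100 Hz tone
print(rms(x_band), peak(x_band), crest_factor(x_band), kurtosis(x_band))
```

For a pure sine wave the RMS is the amplitude divided by the square root of 2 and the kurtosis is 1.5, which gives a quick sanity check on the implementation.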
Step 2: Form the Cavitation Feature Matrix from the Cavita-
tion Sensitivity Parameters:
Next, we formed the cavitation feature matrix by calculating
the CSP values listed in Table 4 for the three acoustic emis-
sion sensors, three accelerometers, four proximity probes and
one pressure transducer used in the cavitation survey resulting
in 61 total features. 32 CSP values were calculated for each
of the 17 operating conditions resulting in 544 CSP values for
each feature. The cavitation feature matrix is therefore a 544
x 61 matrix organized as described in Step 2 of the methodol-
ogy. Throughout the rest of this document, we have adopted
the feature naming convention from the combination of the
abbreviation of the sensor type and the CSP number shown in
Table 4.
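As a quick check of the stated dimensions, assuming the six-CSP sets apply to the three accelerometers and three acoustic emission sensors (the high frequency data) and the five-CSP sets apply to the four proximity probes and the pressure transducer (the medium frequency data):

$$
n = (3 + 3) \times 6 + (4 + 1) \times 5 = 36 + 25 = 61, \qquad
sc = 17 \times 32 = 544
$$

These values agree with the 544 x 61 size of the cavitation feature matrix.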
Cavitation Sensitivity Parameters
1) RMS amplitude 1,000 - 20,000 Hz
2) RMS amplitude 20,000 - 30,000 Hz
3) RMS amplitude 30,000 - 100,000 Hz
4) Peak amplitude 1,000 - 20,000 Hz
5) Crest factor 1,000 - 20,000 Hz
6) Kurtosis 1,000 - 20,000 Hz
1) RMS amplitude 1,000 - 400,000 Hz
2) RMS amplitude 50,000 - 400,000 Hz
3) RMS amplitude 1,000 - 50,000 Hz
4) Peak amplitude 1,000 - 400,000 Hz
5) Crest factor 1,000 - 400,000 Hz
6) Kurtosis 1,000 - 400,000 Hz
1) RMS amplitude 40 - 1,000 Hz
2) RMS amplitude 1 - 40 Hz
3) Peak amplitude 40 - 1,000 Hz
4) Crest factor 40 - 1,000 Hz
5) Kurtosis 40 - 1,000 Hz
1) RMS amplitude 40 - 1,000 Hz
2) RMS amplitude 1 - 40 Hz
3) Peak amplitude 40 - 1,000 Hz
4) Crest factor 40 - 1,000 Hz
5) Kurtosis 40 - 1,000 Hz
Table 4. Cavitation sensitivity parameter details for each sen-
sor type
Steps 3 and 4: Normalize the Feature Matrix and Perform
Principal Component Analysis:
Steps 3 and 4 were performed using MATLAB Software and
resulted in a principal component scores matrix Y. The
matrix Y is not reproduced here due to the size of the matrix.
Step 5: Analyze Principal Component Scores and Select the
Mode of Variance Related to Cavitation Erosion:
In Step 5, we first created a scree plot from the PCA results to
determine the number of principal component scores to analyze.
Figure 5 shows the scree plot, which clearly indicates that the
first PC represents a large majority of the total variance. The
scree plot also indicates a slight drop-off in variance after the
fourth principal component. Based on the scree plot, we evaluated
the first four PC scores to capture the vast majority of the
variance. The first PC score plot (Figure 6) shows a steady
increase in normalized amplitude values from 35 to 45 MW, peak
amplitude from 50 to 60 MW, then an amplitude decrease from 60 to
70 MW. Based on previous cavitation diagnostics performed on this
hydroturbine by personnel at the Bureau of Reclamation, using
techniques and resources similar to those discussed by Escaler et
al. (2006), the first PC score plot best matches the operating
conditions and mode of variance associated with erosive cavitation.
Additionally, the second and third score plots (Figure 7 and
Figure 8) represent modes of variance associated with draft tube
swirl and draft tube vortex collapse. Draft tube swirl can be
damaging to hydroturbines; however, for this specific hydroturbine
it does not cause erosive damage and is therefore not of interest
for this case study. The fourth PC score plot was a twin to the
third PC score plot, but at a slightly different running condition.
Figure 5. Scree plot of PCA results on the cavitation
Figure 6. The first principal component score plot represents
a mode of variance related to erosive cavitation
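The values behind a scree plot are the fractions of total variance carried by each principal component. The sketch below shows one way to compute them in Python from a normalized feature matrix. The function name `scree_fractions` and the synthetic data (one dominant mode plus small noise) are assumptions for illustration only; the choice of how many components to keep is still made by inspecting the plot.

```python
import numpy as np

def scree_fractions(Z):
    """Fraction of total variance explained by each principal component.

    Z is an observations x features matrix; the fractions are the
    eigenvalues of its covariance matrix, sorted from largest to
    smallest and divided by their sum.
    """
    Z = np.asarray(Z, dtype=float)
    eigvals = np.linalg.eigvalsh(np.cov(Z, rowvar=False))[::-1]
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative round-off
    return eigvals / eigvals.sum()

# Synthetic data (assumption): five features that all follow one
# underlying trend, plus a small amount of noise.
rng = np.random.default_rng(1)
load = np.linspace(-1.0, 1.0, 40)
weights = np.array([1.0, 0.9, 1.1, 0.8, 1.2])
Z_demo = load[:, None] * weights + 0.05 * rng.normal(size=(40, 5))

fractions = scree_fractions(Z_demo)
cumulative = np.cumsum(fractions)  # variance captured by the first k components
```

Plotting `fractions` against the component index gives the scree plot, and `cumulative[k - 1]` gives the share of variance retained when the first k scores are analyzed.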
Step 6: Calculate and Compare Correlation Coefficients:
In Step 6, the correlation coefficients were calculated between
the first principal component scores and each feature. Correlation
coefficients for the accelerometers and acoustic emission sensors
are shown in Figure 9. Correlation coefficients for the proximity
probes and pressure transducer are shown in Figure 10. Because a
large number of features showed a very high degree of dependence on
the first principal component scores, features with a correlation
coefficient less than 0.9 were removed from consideration for the
remainder of the selection process.
Figure 7. The second principal component score plot repre-
sents a mode of variance related to early developing draft tube
Figure 8. The third principal component score plot represents
a mode of variance related to draft tube vortex collapse at
high power output
Figure 9. Correlation coefficients between the first principal
component scores, and features based on the accelerometers
and acoustic emission sensors
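The correlation screening in Step 6 can be illustrated with a short Python sketch. It computes the Pearson correlation coefficient between a vector of principal component scores and each feature column, then keeps features at or above the 0.9 cutoff. Comparing the absolute value of the coefficient is an assumption made here because the sign of a principal component is arbitrary. The feature names and data are hypothetical.

```python
import numpy as np

def correlate_with_scores(features, scores):
    """Pearson correlation between each feature column and the PC scores."""
    features = np.asarray(features, dtype=float)
    scores = np.asarray(scores, dtype=float)
    fc = features - features.mean(axis=0)
    sc = scores - scores.mean()
    return (fc.T @ sc) / (np.linalg.norm(fc, axis=0) * np.linalg.norm(sc))

def select_by_correlation(names, r, threshold=0.9):
    """Keep features whose |r| meets the threshold (absolute value is an assumption)."""
    return [name for name, ri in zip(names, r) if abs(ri) >= threshold]

# Hypothetical example: two features track the scores, one does not.
scores_demo = np.linspace(0.0, 1.0, 20)
features_demo = np.column_stack([
    2.0 * scores_demo + 1.0,     # perfectly correlated
    -scores_demo,                # perfectly anti-correlated
    np.tile([1.0, -1.0], 10),    # alternating signal, weakly correlated
])
names_demo = ["feat_a", "feat_b", "feat_c"]

r_demo = correlate_with_scores(features_demo, scores_demo)
kept = select_by_correlation(names_demo, r_demo)
```

In this toy case `kept` contains `feat_a` and `feat_b`, while the alternating signal is dropped.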
Step 7: Compare sample standard deviation at the features'
minimum and maximum CSP values:
In the next step, the standard deviations at the minimum
($s_{CSP_{min}}$) and maximum ($s_{CSP_{max}}$) CSP values were
calculated to compare dispersion within each of the remaining
features. As described in Step 7 of our methodology, features with
high standard deviations were removed from consideration and the
remaining features were ranked in order of their combined
$s_{CSP_{max}}$ and $s_{CSP_{min}}$ values.
Figures 11 and 12 show the features compared by $s_{CSP_{min}}$
and $s_{CSP_{max}}$ values, respectively. Both figures also show
the threshold line of two times the minimum standard deviation used
for determining which features to remove from consideration. Based
on the threshold line, 12 additional features were removed from
consideration. The remaining 12 features were ranked from smallest
to largest by their combined standard deviation values as shown in
Figure 13.
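The dispersion comparison in Step 7 can be sketched as follows. This Python example is one possible reading of the step, under stated assumptions: each feature is stored as a 2-D array with one row per operating condition and one column per repeated measurement, the minimum and maximum CSP values are taken as the conditions with the lowest and highest mean, and the two-times-minimum threshold is applied separately to each of the two standard deviations. All names and numbers are hypothetical.

```python
import numpy as np

def dispersion_at_extremes(samples):
    """Sample standard deviation at the lowest-mean and highest-mean condition.

    samples: rows are operating conditions, columns are repeated measurements
    (an assumed layout, not one given in the paper).
    """
    samples = np.asarray(samples, dtype=float)
    means = samples.mean(axis=1)
    stds = samples.std(axis=1, ddof=1)
    return stds[np.argmin(means)], stds[np.argmax(means)]

def rank_features(feature_samples, factor=2.0):
    """Drop features above factor x the smallest std, then rank by combined std."""
    stats = {name: dispersion_at_extremes(x) for name, x in feature_samples.items()}
    limit_min = factor * min(s_min for s_min, _ in stats.values())
    limit_max = factor * min(s_max for _, s_max in stats.values())
    kept = {name: s for name, s in stats.items()
            if s[0] <= limit_min and s[1] <= limit_max}
    return sorted(kept, key=lambda name: kept[name][0] + kept[name][1])

# Hypothetical features, each with two operating conditions and three repeats.
demo = {
    "f_tight": [[0.0, 0.1, -0.1], [5.0, 5.1, 4.9]],
    "f_mid":   [[0.0, 0.15, -0.15], [5.0, 5.15, 4.85]],
    "f_loose": [[0.0, 0.5, -0.5], [5.0, 5.1, 4.9]],
}
ranking = rank_features(demo)
```

Here `f_loose` is removed because its dispersion at the minimum condition exceeds twice the smallest value, and the remaining features are ordered from smallest to largest combined standard deviation.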
Step 8: Evaluate remaining features based on practical
considerations for long term cavitation detection:
In the final step of the feature selection process, we evaluated
the remaining 12 features from Step 7 based on the practical
considerations outlined in our methodology. Features Acc3 1,
Acc3 2, Acc3 3, AE3 1, AE3 2, and AE3 3 all come from shaft-mounted
sensors, which cost more and are more complex to install and
maintain. Given that there are 6 additional features that have a
similar sensitivity to erosive cavitation, we eliminated these
features based on their higher cost and complexity.
The remaining 6 features were based on RMS amplitude in different
frequency ranges and come from sensors mounted to the
hydroturbine's lower guide bearing. Based on the similarity between
their $s_{CSP_{min}}$ values and the low cost of installation and
maintenance, any of the remaining 6 features is adequate for
erosive cavitation monitoring. In addition to having low
$s_{CSP_{min}}$ values, the two highest ranking features, AE1 2 and
AE1 1, have the lowest combined $s_{CSP}$ value and as such, we
consider them the best features for long term monitoring of erosive
cavitation.
Figure 10. Correlation coefficients between the first principal
component scores, and features that use proximity probes or
a pressure transducer
Figure 11. Comparison of standard deviation around the
features’ minimum CSP value
Figure 12. Comparison of standard deviation around the
features’ maximum CSP value
Figure 13. Ranking of remaining features by standard deviation
around minimum and maximum CSP values
The methodology outlined in this paper provides several ben-
efits to a researcher or hydroturbine operator wishing to esti-
mate RUL on a hydroturbine runner through long term mon-
itoring of erosive cavitation. In this section we discuss the
benefits of using our cavitation feature selection process, and
issues that the practitioner must keep in mind. While the
method presented here does not yet provide RUL calcula-
tions, it is a step in the direction of a full RUL method for
hydroturbines – a long-sought goal in the industry.
The feature selection method described in this paper was
demonstrated on cavitation survey data taken on a Francis
hydroturbine experiencing leading edge erosive cavitation. While we
demonstrated the method on a Francis hydroturbine, the method can
be used on any hydroturbine regardless of type. This also holds
true for common sensors used in hydroturbine plants to monitor
cavitation and other health monitoring applications (e.g., bearing
monitoring), for sensor locations, and for multiple cavitation
types (e.g., trailing edge cavitation and draft tube swirl, the
latter actually seen in the case study's data). The cavitation
survey in the case study was performed using sensors and sensor
locations that are commonly found in hydroturbine industry
cavitation studies. Practitioners must note that the feature
selection process, specifically analysis of the principal component
scores, is difficult if the variance of all the features is
dominated by noise or events not related to erosive cavitation.
Thus, it is important that the initial features investigated
include at least some features known to be useful on other
hydroturbines, such as overall RMS values taken from an acoustic
emission sensor on a lower guide bearing.
One benefit of the presented method is that several aspects
of the feature comparisons are automated which allows many
different cavitation detection features to be compared quickly.
Increasing the number of different features being compared
increases the likelihood of finding the best all-around feature.
The sensitivity and precision of the features being compared are
ranked based on statistical values rather than purely subjective
evaluation. This, combined with the speed of the process, also
allows new or experimental features to be evaluated and compared.
However, it should be noted that the accuracy of the features is
not addressed in this paper due to the lack of visual confirmation
of cavitation intensity. In industrial settings, it is very rare to
have visual confirmation of cavitation intensity.
A few points to keep in mind when using this method include
that the methodology compares signal dispersion in order to
reduce the likelihood of false positives and false negatives,
but the methodology does not directly evaluate the false pos-
itives or false negatives associated with each feature. Do-
ing this evaluation requires establishing cavitation thresholds,
which is beyond the scope of the work presented in this pa-
per. Another important point is that cavitation intensity is not
addressed by the methodology presented here. This prevents
the method from being directly used to determine RUL. How-
ever, it is expected that future efforts with establishing cavi-
tation thresholds can help to adapt the method presented here
to be more useful in calculating RUL. A final point to note is
that data collection and data organization are two large sources
of error. Poor data collection and organization methods, sometimes
seen in cavitation studies, can lead to results that are not
accurate or relevant to hydroturbine operators.
No adequate measure of goodness of fit is available for the
method presented in this paper. This is because the typical
hydroturbine practitioner may not have a clear view of when
cavitation is or is not occurring. While a cavitation survey
may be available for a specific hydroturbine, it is unlikely
to remain valid as operating conditions change (e.g., hydrostatic
head, water temperature, interference from sister hydroturbines,
accumulated cavitation damage on the hydroturbine runner, etc.).
We view this situation as being similar to an unsupervised learning
method such as clustering, where there is no correct answer. As
with clustering, a practitioner selects a few features in the data
that are believed to be important to be used as performance
metrics. Here we have selected: 1)
the linear fit to the mode of variance measured through corre-
lation coefficients, 2) variance around the minimum and max-
imum signals, and 3) practical considerations. Potential mea-
sures of fit such as the mean square error between different
signals or between Principal Components of Interest (PCI)
and other signals are not valid with the method presented in
this paper. Such measurements are relative measurements and
do not offer additional validity beyond what a practitioner has
already deemed to be important (see above in this paragraph
for what we have deemed important). Attempting to use PCI
or mean square error between signals, or other similar good-
ness of fit measures leads to circular logic because we have
already used PCI within the method as a stepping stone for
finding a better representation of cavitation events within the
hydroturbine. Thus we do not recommend examining good-
ness of fit on the method presented here.
The method presented in this paper is a good starting point
for researchers and hydroturbine operators to better under-
stand how to monitor hydroturbines for cavitation during op-
eration. The method can be used to identify the most ap-
propriate sensors, sensor placements, and CSPs that provide
the most insight into erosive cavitation. Previously, opera-
tors and researchers did not have a direct method of compar-
ison for sensors, sensor placements, and CSPs. The method
presented here is already showing great promise with some
hydroturbine operators and is expected to be deployed in the
field soon.
We are actively pursuing several areas of future work and pro-
pose the hydroturbine prognostics community pursue several
larger goals. One area requiring further study is to better un-
derstand why different RMS frequency bands do not distin-
guish themselves from one another. We discovered this issue
using F-tests. A potential direction of research is an in-depth
investigation of spectral data produced from RMS frequency bands.
Another area that the community needs to investigate is the
evaluation of feature plots viewed in the z-score normalized
domain that may be useful for establishing thresholds for long
term cavitation detection or for training supervised machine
learning algorithms. Establishing cavitation detection thresh-
olds will lead to a better understanding of cavitation intensity
that can then be used to develop a RUL method for hydrotur-
bine operators.
Spectrum-based methods such as demodulation and spectral
kurtosis were not explored in this paper; however, the foun-
dation has been laid here for evaluating spectrum-based fea-
tures against traditional RMS-based features. It is possible
that spectrum-based methods may be more sensitive to dif-
ferent types of erosive cavitation on the same hydroturbine.
While we have demonstrated in this paper that we can detect
leading edge cavitation and we also have seen this method
work to detect draft tube swirl cavitation on the same dataset,
there are several other types of cavitation that can be impor-
tant depending upon the hydroturbine design and operating
conditions. Multiple erosive cavitation events can occur at
the same time and this should be captured for a complete un-
derstanding of RUL.
Proximity probes did not show as high a degree of dependence on
the mode of variance related to erosive cavitation as the
accelerometers and acoustic emission sensors; however, proximity
probes did show high dependence and were also sensitive to the
modes of variance associated with draft tube swirl. Because of
their low cost and higher likelihood of already being installed on
a hydroturbine to monitor common low speed faults such as bent
shafts, additional investigation of using these sensors for erosive
cavitation detection and broader condition monitoring is warranted.
It is possible that using proximity probes may make detecting
erosive cavitation significantly less expensive and intrusive for
hydroturbine operators.
Finally, experimental CSPs including peak, crest factor, and
kurtosis did not measure well against RMS for erosive cavi-
tation. These features do however show stark differences be-
tween different sensor locations – specifically between sen-
sors mounted on the shaft versus sensors mounted off the
shaft. We do not yet understand why this is the case. It is
possible that a deeper understanding of the physics of the sit-
uation may help to develop significantly improved CSPs.
This paper presents a method for comparing and evaluating
cavitation detection features, the first step toward estimating
RUL of hydroturbine runners. The method can be used to quickly
compare features created from cavitation survey data collected on
any type of hydroturbine, sensor type, sensor location, and CSP.
Although manual evaluation and knowledge of hydroturbine cavitation
are still required for our feature selection method, the use of
principal component analysis greatly reduces the number of plots
that require evaluation. We are not aware of anyone in academia or
industry taking this approach with hydroturbines. We applied the
method presented in this paper to cavitation survey data collected
on a Francis hydroturbine and were able to select the best sensor
type, sensor location, and CSP to use on this hydroturbine for long
term monitoring of erosive cavitation, thus demonstrating the
usefulness of the method. Our method provides
hydroturbine operators and researchers with a clear and effec-
tive way to determine preferred sensors, sensor placements,
and CSPs while also laying the groundwork for determining
RUL in the future.
The information, data, or work presented herein was funded
in part by the Office of Energy Efficiency and Renewable En-
ergy (EERE), U.S. Department of Energy, under Award Num-
ber DE-EE0002668 and the Hydro Research Foundation.
The authors wish to acknowledge the contributions of John
Germann and James DeHaan for collecting the cavitation sur-
vey data and their guidance with analysis of the data. The
authors further wish to acknowledge the code development
assistance of Logan Schuelke.
The information, data or work presented herein was funded
in part by an agency of the United States Government. Nei-
ther the United States Government nor any agency thereof,
nor any of their employees, makes any warranty, express or
implied, or assumes any legal liability or responsibility for
the accuracy, completeness, or usefulness of any information,
apparatus, product, or process disclosed, or represents that
its use would not infringe privately owned rights. Reference
herein to any specific commercial product, process, or ser-
vice by trade name, trademark, manufacturer, or otherwise
does not necessarily constitute or imply its endorsement, rec-
ommendation or favoring by the United States Government
or any agency thereof. The views and opinions of authors ex-
pressed herein do not necessarily state or reflect those of the
United States Government or any agency thereof.
BPML Blade Pass Modulation Level.
CSP cavitation sensitivity parameter.
MW Megawatts.
PC Principal Component.
PCA Principal Component Analysis.
PCI Principal Components of Interest.
RMS root mean square.
RUL remaining useful life.
Al-Kandari, N. M., & Jolliffe, I. T. (2005). Variable
selection and interpretation in correlation principal
components. Environmetrics,16(6), 659–672. doi:
An, D., Kim, N. H., & Choi, J.-H. (2013). Options for
Prognostics Methods : A review of data-driven and
physics- based prognostics. Annual Conference of the
Prognostics and Health Management Society, 1–14.
doi: 10.2514/6.2013-1940
Avellan, F. (2004). Introduction to cavitation in hydraulic
machinery. The International Conference on
Hydraulic Machinery . . . , 11–22. Retrieved from
Conferinta_MH/102Avellan.pdf
Bajic, B. (2002). Multidimensional Diagnostics of Turbine
Cavitation. Journal of Fluids Engineering,124(4),
943. doi: 10.1115/1.1511162
Bajic, B. (2008). Multidimensional Methods and Simple
Methods for Cavitation Diagnostics and Monitoring
(Tech. Rep.). Brasilia.
Bajic, B., Services, K. C., Gmbh, K., & Zithe, S. (2003).
Methods for vibro-acoustic diagnostics of turbine
cavitation Méthodes pour le diagnostic
vibro-acoustique de la cavitation de turbine. Analysis,
41(1), 87–96.
Baydar, N., Chen, Q., Ball, A., & Kruger, U. (2001).
Detection of Incipient Tooth Defect in Helical Gears
Using Multivariate Statistics. Mechanical Systems and
Signal Processing,15(2), 303–321. Retrieved from
retrieve/pii/S0888327000913153 doi:
Benjamin, T. B., & Ellis, A. T. (1966). The Collapse of
Cavitation Bubbles and the Pressures thereby
Produced against Solid Boundaries. Philosophical
Transactions of the Royal Society of London A:
Mathematical, Physical and Engineering Sciences,
260(1110), 221–240. Retrieved from
doi: 10.1098/rsta.1966.0046
Berkhin, P. (2006). A survey of clustering data mining
techniques. Grouping multidimensional data(c),
25–71. Retrieved from
10.1007/3-540-28349-8_2
Blake, J. (1987). Cavitation Bubbles Near Boundaries.
Annual Review of Fluid Mechanics,19, 99–123. doi:
Bourdon, P. (2000). Detection Vibratoire De L’Erosion De
Cavitation Des Turbines Francis (Unpublished
doctoral dissertation). Ecole Polytechnique Federale
De Lausanne.
Bourdon, P., Farhat, M., Mossoba, Y., & Lavigne, P. (1999).
Hydro Turbine Profitability and Cavitation Erosion.
Waterpower’99, 1–10. Retrieved from
40440(1999)76 doi: 10.1061/40440(1999)76
Cattell, R. B. (1966). The scree test for the number of
factors. Multivariate behavioral research,1(2),
Cencic, T., Hocevar, M., & Sirok, B. (2014). Study of
Erosive Cavitation Detection in Pump Mode of
Pump-Storage Hydropower Plant Prototype. ASME J.
Fluids Eng.,136(5), 51301. doi: 10.1115/1.4026476
Dorji, U., & Ghomashchi, R. (2014). Hydro turbine failure
mechanisms: An overview. Engineering Failure
Analysis,44, 136–147. Retrieved from
j.engfailanal.2014.04.013 doi:
Dular, M., & Coutier-Delgosha, O. (2009). Numerical
modelling of cavitation erosion. International Journal
for Numerical Methods in Fluids,61(12), 1388–1410.
doi: 10.1002/fld.2003
Dular, M., & Petkovšek, M. (2015). On the mechanisms of
cavitation erosion Coupling high speed videos to
damage patterns. Experimental Thermal and Fluid
Science,68, 359–370. Retrieved from
retrieve/pii/S0894177715001508 doi:
Dular, M., Stoffel, B., & Širok, B. (2006). Development of a
cavitation erosion model. Wear,261(5-6), 642–655.
doi: 10.1016/j.wear.2006.01.020
Escaler, X., & Egusquiza, E. (2003). Vibration Cavitation
Detection Using Onboard Measurements. Symposium
A Quarterly Journal In Modern Foreign
Literatures(July 2015), 1–7.
Escaler, X., Egusquiza, E., Farhat, M., Avellan, F., &
Coussirat, M. (2006). Detection of Cavitation in
Hydraulic Turbines. Mechanical Systems and Signal
Processing, 983 – 1007.
Escaler, X., Ekanger, J. V., Francke, H. H., Kjeldsen, M., &
Nielsen, T. K. (2014). Detection of Draft Tube Surge
and Erosive Blade Cavitation in a Full-Scale Francis
Turbine. Journal of Fluids Engineering,137(1),
011103. Retrieved from
doi: 10.1115/1.4027541
Faber, M. H. (2012). Statistics and Probability Theory - In
Pursuit of Engineering Decision Support. doi:
Francois, L. (2012). Vibratory detection system of
Cavitation Erosion: Historic and Algorithm
Validation. In Proceedings of the eighth international
symposium on cavitation (pp. 325 –330).
Gavish, M., & Donoho, D. L. (2013). The Optimal Hard
Threshold for Singular Values is 4/sqrt(3). , 60(8),
1–14. Retrieved from doi:
Germann, J., & DeHaan, J. (2013). Cavitation Detection
Tests at Judge Francis Carr Powerplant, Redding,
California (Tech. Rep.). Denver, Colorado: Bureau of
Reclamation Technical Services Center.
Gordon, J. L. (2001). Hydraulic turbine efficiency.
Canadian Journal of Civil Engineering,28(2),
238–253. doi: 10.1139/l00-102
Guyon, I., & Elisseeff, A. (2003). An introduction to
variable and feature selection. Journal of machine
learning research,3(Mar), 1157–1182.
Hamby, D. (1994). A review of techniques for parameter
sensitivity analysis of environmental models.
Environmental monitoring and assessment,32(2),
Harrison, M. (1952). An Experimental Study of Single
Bubble Cavitation Noise. The Journal of the
Acoustical Society of America,24(6), 776 – 782.
Hasmatuchi, V. (2012). Hydrodynamics of a Pump-Turbine
Operating at Off-Design Conditions in Generating
Mode (Doctoral dissertation). doi:
Heng, A., Zhang, S., Tan, A. C., & Mathew, J. (2009).
Rotating machinery prognostics: State of the art,
challenges and opportunities. Mechanical Systems and
Signal Processing,23(3), 724–739. Retrieved from
retrieve/pii/S0888327008001489 doi:
Holick, M. (2013). Introduction to Probability and Statistics
for Engineers. doi: 10.1007/978-3-642-38300-7
IHA. (2015). 2015 Key Trends in Hydropower. (Figure 1).
Retrieved from
International Electrotechnical Commission. (2004). IEC
60609-1:2004 Hydraulic turbines, storage pumps and
pump-turbines - Cavitation pitting evaluation - Part 1:
Evaluation in reaction turbines, storage pumps and
pump-turbines. International Electrotechnical
Commission. Retrieved from https://
Jardine, A. K., Lin, D., & Banjevic, D. (2006). A review on
machinery diagnostics and prognostics implementing
condition-based maintenance. Mechanical Systems
and Signal Processing,20(7), 1483–1510. Retrieved
retrieve/pii/S0888327005001512 doi:
Jian, W., Petkovšek, M., Houlin, L., Širok, B., & Dular, M.
(2015). Combined Numerical and Experimental
Investigation of the Cavitation Erosion Process.
Journal of Fluids Engineering,137(5), 051302.
Retrieved from http://fluidsengineering
doi: 10.1115/1.4029533
Jolliffe, I. T. (1982). A note on the use of principal
components in regression. Applied Statistics,
Jolliffe, I. T. (2002). Principal Component Analysis, Second
Edition. Encyclopedia of Statistics in Behavioral
Science,30(3), 487. Retrieved from
10.1002/0470013192.bsa501/full doi:
Kan, M. S., Tan, A. C., & Mathew, J. (2015). A review on
prognostic techniques for non-stationary and
non-linear rotating systems. Mechanical Systems and
Signal Processing,62-63, 1–20. Retrieved from
S0888327015000898 doi:
Keogh, E., & Kasetty, S. (2002). On the need for time series
data mining benchmarks. Proceedings of the eighth
ACM SIGKDD international conference on
Knowledge discovery and data mining - KDD ’02,
102. Retrieved from
citation.cfm?id=775047.775062 doi:
Khelf, I., Laouar, L., Bouchelaghem, A. M., Rémond, D., &
Saad, S. (2013). Adaptive fault diagnosis in rotating
machines using indicators selection. Mechanical
Systems and Signal Processing,40(2), 452–468.
Retrieved from
j.ymssp.2013.05.025 doi:
Kim, K., Uluyol, O., Parthasarathy, G., & Mylaraswamy, D.
(2012). Fault Diagnosis of Gas Turbine Engine LRUs
Using the Startup Characteristics. Phm, 1–10.
The Knowledge Stream - Detecting Cavitation to Protect and
Maintain Hydraulic Turbines. (2014). (Summer 2014).
Retrieved from
Kumar, P., & Saini, R. P. (2010). Study of cavitation in
hydro turbines-A review. Renewable and Sustainable
Energy Reviews,14(1), 374–383. doi:
Malhi, A., & Gao, R. X. (2004). PCA-based feature selection
scheme for machine defect classification. IEEE
Transactions on Instrumentation and Measurement,
53(6), 1517–1525.
Milligan, G. W., & Cooper, M. C. (1988). A study of
standardization of variables in cluster analysis.
Journal of Classification,5(2), 181–204. doi:
Minka, T. P. (2008). Automatic choice of dimensionality for
PCA. MIT Media Laboratory Perceptual Computing
Section Technical Report No. 514,2nd revisi.
Moriasi, D. N., Arnold, J. G., Van Liew, M. W., Bingner,
R. L., Harmel, R. D., & Veith, T. L. (2007). Model
evaluation guidelines for systematic quantification of
accuracy in watershed simulations. Transactions of
the ASABE,50(3), 885–900.
Naude, C. F., & Ellis, A. T. (1961). On the Mechanism of
Cavitation Damage by Nonhemispherical Cavities
Collapsing in Contact With a Solid Boundary. Journal
of Basic Engineering,83(4), 648. doi:
Pennacchi, P., Borghesani, P., & Chatterton, S. (2015). A
cyclostationary multi-domain analysis of fluid
instability in Kaplan turbines. Mechanical Systems
and Signal Processing,60-61, 375–390. Retrieved
retrieve/pii/S0888327015000163 doi:
Philipp, A., & Lauterborn, W. (1998). Cavitation erosion by
single laser-produced bubbles. Journal of Fluid
Mechanics,361, 75–116. doi:
Preisendorfer, R. W., & Mobley, C. D. (1988).
Principal component analysis in meteorology and
oceanography. Elsevier Amsterdam.
Ramasso, E., & Saxena, A. (2014). Performance
Benchmarking and Analysis of Prognostic Methods
for CMAPSS Datasets. International Journal of
Prognostics and Health
Management(ISSN2153-2648), 1–15.
Randall, R. B. (2010). Vibration-based Condition
Monitoring: Industrial, Aerospace and Automotive
Applications. doi: 10.1002/9780470977668
Rayleigh, L. (1917). On the Pressure developed in the
Liquid during the Collapse of a Spherical Cavity. The
London, Edinburgh, and Dublin Philosophical
Magazine and Journal of Science,34(200), 94 – 98.
Retrieved from http://
Rus, T., Dular, M., Sirok, B., Hocevar, M., & Kern, I.
(2007). An Investigation of the Relationship Between
Acoustic Emission, Vibration, Noise, and Cavitation
Structures on a Kaplan Turbine. Journal of Fluids
Engineering,129(September), 1112. doi:
Saxena, A., Celaya, J., Saha, B., Saha, S., & Goebel, K.
(2010). Metrics for Offline Evaluation of Prognostic
Performance. International Journal of Prognostics
and Health Management(1), 1–20. Retrieved from
.org/files/phm_submission/2010/
ijPHM_10_001.pdf
Saxena, A., Celaya, J. R., Saha, B., Saha, S., & Goebel, K.
(2009). On Applying the Prognostic Performance
Metrics. Proceedings of the annual conference of the
prognostics and health management society, 1–16.
Schmidt, H., Kirschner, O., Riedelbauch, S., Necker, J.,
Kopf, E., Rieg, M., . . . Mayrhuber, J. (2014).
Influence of the vibro-acoustic sensor position on
cavitation detection in a Kaplan turbine. IOP
Conference Series: Earth and Environmental Science,
22(5), 052006. Retrieved from
doi: 10.1088/1755-1315/22/5/052006
Shalabi, L. A., Shaaban, Z., & Kasasbeh, B. (2006). Data
Mining: A Preprocessing Engine. Journal of
Computer Science,2(9), 735–739. doi:
Shlens, J. (2014). A Tutorial on Principal Component
Analysis. arXiv preprint arXiv:1404.1100, 1–13. doi:
Si, X.-S. S., Wang, W., Hu, C.-H. H., & Zhou, D.-H. H.
(2011). Remaining useful life estimation - A review
on the statistical data driven approaches. European
Journal of Operational Research,213(1), 1–14.
Retrieved from
pii/S0377221710007903 doi:
Silberrad. (1912). Propeller erosion. Journal of the Franklin
Ins,174(1), 125.
Tan, D. Y., Miorini, R. L., Keller, J., & Katz, J. (2012).
Investigation of Cavitation Phenomena within an
Axial Waterjet Pump. In Proceedings of the eighth
international symposium on cavitation. doi:
Thornycroft, J. I., & Barnaby, S. W. (1895). Torpedo-Boat
Destroyers. In Minutes of the proceedings of the
institution of civil engineers (Vol. 122, pp. 51–69).
Thomas Telford-ICE Virtual Library. Retrieved from
U.S. Energy Information Administration. (2015). Electric
Power Monthly: with data for May 2015 (Tech. Rep.
No. May).
van der Maaten, L., Postma, E., & van den Herik, J. (2009).
Dimensionality Reduction : A Comparative Review.
October, 1–35.
van Rijsbergen, M., Foeth, E.-J., Fitzsimmons, P., &
Boorsma, A. (2012). High-Speed Video Observations
and Acoustic-Impact Measurements on a NACA 0015
foil. In International symposium on cavitation (pp.
399–406). doi: 10.3850/978-981-07-2826-7
Varga, J. J., Sebestyen, Gy., & Fay, A. (1969). Detection
of cavitation by acoustic and vibration-measurement
methods. La houille blanche(2), 137–150.
Wang, C. M., & Huang, Y. F. (2009). Evolutionary-based
feature selection approaches with new criteria for data
mining: A case study of credit approval data. Expert
Systems with Applications,36(3 PART 2), 5900–5908.
Retrieved from
j.eswa.2008.07.026 doi:
Wang, T., Yu, J., Siegel, D., & Lee, J. (2008). A
similarity-based prognostics approach for remaining
useful life estimation of engineered systems. In
Prognostics and health management, 2008. phm 2008.
international conference on (pp. 1–6).
Wolff, P. J., Jones, R. K., & March, P. (2005). Evaluation of
Results from Acoustic Emissions-Based Cavitation
Monitor , Grand Coulee Unit G-24. (October), 1–15.
... In this work, the feature being selected is the frequency range used for the CSP calculations used to predict when a hydroturbine is experiencing cavitation. This definition for a feature could easily be expanded to include the sensor type and sensor location when these additional options exist (Gregg, Steele, & Bossuyt, 2016). ...
... The correct class labels for the training set were created manually using more traditional cavitation detection methods as well as sensor data from accelerometers and acoustic emission sensors. Additional information on the full analysis and general cavitation detection methods used to create the class labels can be found in (Gregg et al., 2016; US Department of the Interior Bureau of Reclamation, 2014;Escaler et al., 2006;Escaler & Egusquiza, 2003). ...
Full-text available
Hydroturbine operators who wish to collect cavitation intensity data to estimate cavitation erosion rates and calculate remaining useful life (RUL) of the turbine runner face several practical challenges related to long term cavitation detection. This paper presents a novel method that addresses these challenges including: a method to create an adaptive cavi-tation threshold, and automation of the cavitation detection process. These two strategies result in collecting consistent cavitation intensity data. While domain knowledge and manual interpretation are used to choose an appropriate cavita-tion sensitivity parameter (CSP), the remainder of the process is automated using both supervised and unsupervised learning methods. A case study based on ramp-down data, taken from a production hydroturbine, is presented and validated using independently gathered survey data from the same hy-droturbine. Results indicate that this fully automated process for selecting cavitation thresholds and classifying cav-itation performs well when compared to manually selected thresholds. This approach provides hydroturbine operators and researchers with a clear and effective way to perform automated , long term, cavitation detection, and assessment.
... Sudden changes in the local pressure of the liquid form bubbles that collapse, radiating acoustic energy waves, and causing the erosion of nearby surfaces. 18 Sand erosion increases the likelihood of cavitation, since eroded surfaces increase wall turbulence and, consequently, reduce the local pressure. 19 Cavitation is more likely in Francis turbines and reversible pump turbines than in Kaplan turbines. ...
Industrial maintenance has become an essential strategic factor for profit and productivity in industrial systems. In the modern industrial context, condition-based maintenance guides the interventions and repairs according to the machine’s health status, calculated from monitoring variables and using statistical and computational techniques. Although several literature reviews address condition-based maintenance, no study discusses the application of these techniques in the hydroelectric sector, a fundamental source of renewable energy. We conducted a systematic literature review of articles published in the area of condition-based maintenance in the last 10 years. This was followed by quantitative and thematic analyses of the most relevant categories that compose the phases of condition-based maintenance. We identified a research trend in the application of machine learning techniques, both in the diagnosis and the prognosis of the generating unit’s assets, being vibration the most frequently discussed monitoring variable. Finally, there is a vast field to be explored regarding the application of statistical models to estimate the useful life, and hybrid models based on physical models and specialists’ knowledge, of turbine-generators.
... Multivariate statistical methods such as Principal Component Analysis (PCA) [25], Independent Component Analysis (ICA) [26] and Least Square Support Vector Machine (LS-SVM) [27][28][29][30], have been widely applied for fault detection and diagnosis in hydro-generating systems. For instance, PCA decomposition is applied to aid experts in identifying and selecting the main features which contribute to cavitation in hydro-turbines [31]. Recent studies have proposed a new monitoring method, based on ICA-PCA that can extract both non-Gaussian and Gaussian information of process data for fault detection and diagnosis [32]. ...
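As a rough illustration of how a PCA decomposition can help rank candidate features, the sketch below standardizes a small feature matrix, finds the first principal component by power iteration, and orders the features by the magnitude of their loadings. The feature names and sample values are hypothetical and are not taken from any of the cited studies; this is a minimal sketch of the general technique, not the procedure used by those authors.

```python
import math

# Hypothetical feature values at six operating points (illustrative only).
FEATURES = ["rms_accel", "rms_acoustic", "noise_channel"]
ROWS = [
    [0.10, 0.20, 0.5],
    [0.12, 0.22, 0.3],
    [0.30, 0.45, 0.6],
    [0.55, 0.80, 0.4],
    [0.80, 1.10, 0.5],
    [0.95, 1.30, 0.4],
]

def standardize(rows):
    """Scale each column to zero mean and unit sample standard deviation."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
        for j in range(m)
    ]
    return [[(r[j] - means[j]) / stds[j] for j in range(m)] for r in rows]

def covariance(rows):
    """Sample covariance matrix of data that is already centred."""
    n, m = len(rows), len(rows[0])
    return [
        [sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(m)]
        for i in range(m)
    ]

def first_component(cov, iters=500):
    """Dominant eigenvector and eigenvalue of a symmetric matrix (power iteration)."""
    m = len(cov)
    v = [1.0 + i for i in range(m)]  # arbitrary non-zero starting vector
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cv = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
    eigenvalue = sum(v[i] * cv[i] for i in range(m))
    return v, eigenvalue

cov = covariance(standardize(ROWS))
loadings, eigenvalue = first_component(cov)

# Fraction of total variance captured by the first component.
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))

# Features ordered from largest to smallest absolute loading on the first component.
ranked = [
    name
    for name, _ in sorted(
        zip(FEATURES, loadings), key=lambda pair: abs(pair[1]), reverse=True
    )
]
print(ranked, round(explained, 2))
```

In this toy data set the two correlated channels dominate the first component, while the uncorrelated channel receives a small loading. On real survey data an expert would still need to judge whether the dominant component actually corresponds to cavitation.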
Maintenance in small hydroelectric plants is fundamental for guaranteeing the expansion of clean energy sources and supplying the energy estimated to be necessary for the coming years. Most fault diagnosis models proposed so far for hydroelectric generating units are based on the distance between the normal operating profile and newly observed values. The extended isolation forest is a model based on binary trees that has been gaining prominence in anomaly detection applications. However, no study so far has reported the application of the algorithm in the context of hydroelectric power generation. We compared this model with the PCA and KICA-PCA models, using one year of operating data from a small hydroelectric plant and time-series anomaly detection metrics. The algorithm showed satisfactory results with less variance than the others; therefore, it is a suitable candidate for online fault detection applications in the sector.
In recent years there has been increased demand for readiness and availability metrics across many industries, and especially in national defense, to enable data-driven decision making at all levels of planning, maintenance, and operations, and to leverage integrated models that inform stakeholders of current operational system health and performance metrics. The digital twin (DT) has been identified as a promising approach for deploying these models to fielded systems, although several challenges exist in wide adoption and implementation. Two challenges examined in this article are that DT development is a system-specific endeavor, and that the development is usually an additional effort that begins after initial system fielding. A fundamental challenge with DT development, which sets it apart from traditional models, is that the DT itself is treated as a separate system, and therefore the physical asset/DT construct becomes a system-of-systems problem. This article explores how objectives in DT development align with those of model-based systems engineering (MBSE), and how the MBSE process can answer questions necessary to define the DT. The key benefits of the approach are that it leverages work already being performed during system synthesis and that DT development is pushed earlier in a system's lifecycle. This article contributes to the definition and development processes for DTs by proposing a DT development model and path, a method for scoping and defining requirements for a DT, and an approach to integrate DT and system development. An example case study of a Naval unmanned system is presented to illustrate the contributions. KEYWORDS: autonomy, digital twin, health monitoring, model-based systems engineering, prognostics, systems engineering, unmanned surface vessel
Hydrodynamics of a Pump-Turbine Operating at Off-Design Conditions in Generating Mode
Hydraulic turbines can be operated close to the limits of the operating range to meet the demand of the grid. When operated close to the limits, the risk increases that cavitation phenomena may occur at the runner and / or at the guide vanes of the turbine. Cavitation in a hydraulic turbine can cause material erosion on the runner and other turbine parts and reduce the durability of the machine leading to required outage time and related repair costs. Therefore it is important to get reliable information about the appearance of cavitation during prototype operation. In this experimental investigation the high frequency acoustic emissions and vibrations were measured at 20 operating points with different cavitation behaviour at different positions in a large prototype Kaplan turbine. The main goal was a comparison of the measured signals at different sensor positions to identify the sensitivity of the location for cavitation detection. The measured signals were analysed statistically and specific values were derived. Based on the measured signals, it is possible to confirm the cavitation limit of the examined turbine. The result of the investigation shows that the position of the sensors has a significant influence on the detection of cavitation.
Recently van Rijsbergen et al. [1], by simultaneous observation of cavitation and acoustic emission measurements, and Petkovsek & Dular [2], by simultaneous observation of both cavitation structures and cavitation damage, have pointed to the fact that the small scale structures and the topology of the cavitation clouds play a significant role in cavitation erosive potential. Although the two aforementioned studies opened some new insights into the physics of cavitation damage, many new questions appeared. In the present study we attached a thin aluminum foil to the surface of a transparent Venturi section using two-sided transparent adhesive tape. The surface was very soft and prone to being severely damaged by cavitation in a very short period of time. Using high speed cameras, which captured images at 30000 frames per second, we simultaneously recorded cavitation structures (from several perspectives) and the surface of the foil. Analysis of the images revealed that five distinctive damage mechanisms exist: spherical cavitation cloud collapse, horseshoe cavitation cloud collapse, and the “twister” cavitation cloud collapse; in addition, it was found that pits also appear at the moment of cavitation cloud separation and near the stagnation point at the closure of the attached cavity.
The field of prognostics has attracted significant interest from the research community in recent times. Prognostics enables the prediction of failures in machines resulting in benefits to plant operators such as shorter downtimes, higher operation reliability, reduced operations and maintenance cost, and more effective maintenance and logistics planning. Prognostic systems have been successfully deployed for the monitoring of relatively simple rotating machines. However, machines and associated systems today are increasingly complex. As such, there is an urgent need to develop prognostic techniques for such complex systems operating in the real world. This review paper focuses on prognostic techniques that can be applied to rotating machinery operating under non-linear and non-stationary conditions. The general concept of these techniques, the pros and cons of applying these methods, as well as their applications in the research field are discussed. Finally, the opportunities and challenges in implementing prognostic systems and developing effective techniques for monitoring machines operating under non-stationary and non-linear conditions are also discussed.
A set of empirical equations has been developed which defines the peak efficiency and shape of the efficiency curve for hydraulic turbines as a function of the commissioning date for the unit, rated head, rated flow, runner speed, and runner throat or impulse turbine jet diameter. The equations are based on an analysis of peak efficiency data from 56 Francis, 33 axial-flow, and eight impulse runners dating from 1908 to the present, with runner diameters ranging from just under 0.6 m to almost 9.5 m. The metric specific speeds (nq) ranged from 5.3 to 294. The root mean square error of the calculated peak efficiency for Francis and axial-flow runners was found to be 0.65%. The shape of the efficiency curves was derived from eight Francis, five Kaplan, three propeller, and four impulse turbines. Charts showing the relationship between calculated and actual efficiency curves for these 20 runners are provided. A good match between calculated and measured or guaranteed efficiency was obtained. The equations were also used to determine the relative increase in peak efficiency for new reaction runners installed in existing casings at 22 powerplants, with a root mean square accuracy of 1.0%. The equations can be used to (i) develop efficiency curves for new and old runners; (ii) compare the energy output of alternative types of turbines, where this choice is available; and (iii) calculate the approximate incremental energy benefit from installing a new runner in an existing reaction turbine casing, or onto the shaft of an impulse unit.
An experimental study of the noise produced by a single cavitation bubble has been made. The noise consists principally of a transient pressure pulse associated with the collapse of the bubble. The motion of the bubble has been photographed simultaneously with the measurement of the pressure pulse.
This paper introduces a feature-extraction method to characterize gas turbine engine dynamics. The extracted features are used to develop a fault diagnosis and prognosis method for startup-related sub-systems in gas turbine engines: the starter system, the ignition system, and the fuel delivery system. The startup of a gas turbine engine from ignition to idle speed is critical not only for achieving a fast and efficient startup without incurring stall, but also for health monitoring of the many subsystems involved. During startup, an engine goes through a number of phases during which various components become dominant. The proposed approach physically monitors the relevant phases of a startup by detecting distinct changes in engine behavior as it manifests in such critical variables as the core speed and the gas temperature. The startup process includes several known milestones, such as starter-on, light-off, peak gas temperature, and idle. As each of these is achieved, different engine components come into play and the dynamic response of the engine changes. Monitoring engine speed and exhaust gas temperature and their derivatives provides valuable insights into engine behavior. The approach of the fault diagnosis system is as follows. The engine startup profiles of the core speed (N2) and the gas temperature are obtained and processed into a compact data set by identifying critical-to-characterization instances. Principal component analysis is applied to a number of parameters, and the fault is detected and mapped to one of three engine component failures: starter system failure, ignition system failure, or fuel delivery system failure. In this work, actual engine test data was used to develop and validate the system, and results are shown for tests on engines that experienced startup-related system failures. The developed fault diagnosis system detected the failure successfully for all three component failures.
Explaining complex ideas in an easy-to-understand way, Vibration-based Condition Monitoring provides a comprehensive survey of the application of vibration analysis to the condition monitoring of machines. Reflecting the natural progression of these systems by presenting the fundamental material and then moving onto detection, diagnosis and prognosis, Randall presents classic and state-of-the-art research results that cover vibration signals from rotating and reciprocating machines; basic signal processing techniques; fault detection; diagnostic techniques, and prognostics. Developed out of notes for a course in machine condition monitoring given by Robert Bond Randall over ten years at the University of New South Wales, Vibration-based Condition Monitoring: Industrial, Aerospace and Automotive Applications is essential reading for graduate and postgraduate students/researchers in machine condition monitoring and diagnostics as well as condition monitoring practitioners and machine manufacturers who want to include a machine monitoring service with their product. Includes a number of exercises for each chapter, many based on Matlab, to illustrate basic points as well as to facilitate the use of the book as a textbook for courses in the topic. Accompanied by a website housing exercises along with data sets and implementation code in Matlab for some of the methods as well as other pedagogical aids. Authored by an internationally recognised authority in the area of condition monitoring.
A perfect fluid theory, which neglects the effect of gravity, and which assumes that the pressure inside a cavitation bubble remains constant during the collapse process, is given for the case of a nonhemispherical, but axially symmetric cavity which collapses in contact with a solid boundary. The theory suggests the possibility that such a cavity may deform to the extent that its wall strikes the solid boundary before minimum cavity volume is reached. High speed motion pictures of cavities generated by spark methods are used to test the theory experimentally. Agreement between theory and experiment is good for the range of experimental cavities considered, and the phenomenon of the cavity wall striking the solid boundary does indeed occur. Studies of damage by cavities of this type on soft aluminum samples reveal that pressures caused by the cavity wall striking the boundary are higher than those resulting from a compression of gases inside the cavity, and are responsible for the damage.