Failure mode and effects analysis (FMEA) was initiated by the aerospace industry in the 1960s to improve the reliability of systems. It is a part of total quality management programs and should be used to prevent potential failures that could affect safety, production, cost or customer satisfaction. FMEA can be used during the design, service or manufacturing processes to minimize the risk of failure, improving the customer’s confidence while also reducing costs. Complete paper is here: http://www.iienet2.org/details.aspx?id=37883
10 Industrial Management
One ISO requirement is to have a
method or system capable of controlling
the process that determines the accept-
ability of product or service quality.
Failure mode and effects analysis
(FMEA) is a good tool for improving the
reliability of the product and its lifecycle.
The tool can maximize the mean time
between failures by reducing the proba-
bility of failure, extending the lifecycle of
the product. This can be done during the
design phase, manufacturing phase or
maintenance service.
FMEA is a risk management tool
that is designed to work as a preventive
method rather than a corrective one.
Organizations that use the tool as a
corrective method will find that it does
not work as intended. FMEA can be
quite useful for design engineers in the
design phase of the product, as well as
research and development engineers
to help them develop new products
with better reliability, quality and safety.
FMEA helps manufacturing engineers
control the process and eliminate errors
during production, thus decreasing
warranty costs and wastes. Service
engineers can use FMEA to improve
the lifecycle of the product and lower
its service costs by developing a proper
maintenance program.
A team and a process
FMEA is not a job for one individual.
The best possible results come when
teams are composed of contributors
from different engineering perspectives.
The team should have four to
six members. Team size is determined
by the number of areas affected by the
FMEA, such as manufacturing, mainte-
nance, design, engineering, material and
technical service.
The customer adds another unique
perspective and should be considered for
team membership. If customers cannot
be included, the team should devise ways
to generate voice-of-the-customer data.
Analyzing failure
to prevent problems
BY MOHAMMED HAMED AHMED SOLIMAN
september/october 2014 11
The team should have a leader who
acts as a facilitator, not a decision-
maker. The team leader’s main goals are
to ensure that all resources are available,
coordinate the meetings, and make sure
the team moves toward completing the
FMEA process.
Brainstorming is a well-known
technique for generating a large number
of ideas in a short time period. It’s
preferable to use this tool during the
start of an FMEA process to determine
potential failure modes for each
component your team is studying.
Brainstorming also helps find the root
causes of each failure mode.
To encourage ideas, no theory should
be critiqued or commented on when
it is first offered. Each idea should be
listed and numbered, exactly as offered,
on a flip chart. Expect to generate at
least 50 to 60 concepts in a 30-minute
brainstorming session. Brainstorming
sessions should follow four general
rules: Do not comment on, judge or
critique ideas at the time they are offered;
encourage creative and offbeat ideas; the
goal is to end up with a large number of
ideas; and evaluate ideas later.
FMEA has sequential steps that
were summarized in the book Basics
of FMEA, Second Edition, by Raymond
J. Mikulak, Robin McDermott and
Michael Beauregard. Some of the steps
are obvious, but others aren’t. A basic
outline follows.
1. Select a high-risk process. This
will depend on the criticality of the
process and how a failure in this process
can affect safety, environment, health,
production or costs. For example, a
generator that will supply electricity to a
firefighting system during emergencies
is a critical safety component and must
be considered during an FMEA because
failures in such situations cannot be
accepted.
2. Review the process. This step
involves assigning a team that includes
people with various job responsi-
bilities and levels of experience, such
as the design engineers, maintenance
engineers, production engineers,
process engineers, safety engineers and
environmental engineers. The purpose
of the FMEA team is to bring a variety
of perspectives and experiences to the
project.
If the process is a manufacturing
process, then the team should review the
process flowcharts and walk through the
process at the gemba (the place where
the work is done) to observe the real
situation and collect all the data needed.
If the process is a product or machine,
then the team should review the
assembly drawing. The product should
be tested, and every team member
should be able to operate it and see how
it works.
In this step, everyone in the team
must have full knowledge of how the
process works and operates.
3. Break down the system into
components and subcomponents.
If the system is a large system, like a
water system that supplies an indus-
trial process, the pump can be a critical
component inside the system. A motor
pump is a critical subcomponent
because its failure can break down the
entire process. The motor pump should
be broken down into more subcompo-
nents that are likely to fail and will affect
the system, such as the motor’s bearings
and the rotor shaft. The FMEA will be
used to reduce the probability of failure
for each component or subcomponent.
4. Brainstorm potential failure
modes. Once everyone in the team has
a deep understanding about how the
process or product works, the team can
start thinking about things that could
happen to affect the process. After a
brainstorming session, organize the
ideas by grouping them into categories.
Failure modes can be categorized
in many different ways, including by
failure type (e.g., electrical, mechanical or
user-created).
A failure mode is an event that causes
a functional failure, any of the myriad
ways in which a product or process can
fail. Examples of failure modes abound.
Low discharge pressure could be a
compressor failure mode. Knocking
could be an engine failure mode. Seized
bearings are a bearing failure mode.
Burnout is a motor failure mode. A dead
battery is a car battery failure mode.
Note that failures are not limited to
problems with the product, and failures
could be tied to user mistakes. Those
types of failures should be included in
the FMEA. Anything that can be done
to ensure the product works correctly,
regardless of how the user operates it,
will move the product closer to 100
percent total customer satisfaction. The
use of mistake-proofing, also
known by its Japanese term poka-yoke,
can be a good tool for preventing failures
related to user mistakes.
For example, an FMEA involving a
coffee maker could try to engineer out
the user mistake of putting too much or
too little ground coffee in the filter. This
will ensure that the machine is making
the right coffee with the same quality of
taste for all users.
5. Assign an effect for each failure
mode. Each failure mode should have
an effect that determines the severity of
the failure. It is also known as the conse-
quence of failure.
The effect of a failure mode on the
system is influenced by the availability
of standby or redundancy in the system.
For example, a transformer that supplies
electricity is critical, but the existence
of a standby generator will reduce the
criticality of the system. However, the
standby’s performance must be considered and
compared. If the transformer failed,
would the generator be able to supply
the electricity needed with the same
efficiency? What is the time interval
between when the transformer fails and
when the generator starts to work? Will
any failures have a severe effect on the
product, the process or the whole system
that will cost a lot of money to repair?
One failure mode could have several
effects. For example, an electrical cutoff
in the home could stop the refrigerator
and damage food or prevent you from
doing work on the computer.
Several failure modes could have one
effect. A dead car battery or tire failure
has the same effect on your vehicle – it
will be difficult to make it to work on
time with such a failure early in the
morning.
The team must determine the
end-effect each failure mode has on
the system or the process. This means
examining how each failure affects the
entire system, the facility or the other
connected processes.
6. Assign severity rankings. Severity,
occurrence and detection are each ranked
on a 10-point scale, ranging from one as
the lowest ranking to 10 as the highest.
Figure 1 shows a standard example of
rankings for all three. In the severity
category, potential safety, health and
environmental failure modes generally
indicate high risk, with rankings of nine
and 10. Production losses and costs
rank from a low of two to a high of eight,
depending upon the length of potential
delays and the severity of their effects on
the entire system.
CATEGORIZING FAILURE
Figure 1. An FMEA process should use 10-point scales to rank the severity, occurrence and detection of each failure mode.
Severity ranking criteria
Description of failure effect | Effect | Ranking
No reason to expect failure to have any effect on safety, health, environment or mission. | None | 1
Minor disruption of production. Repair of failure can be accomplished during trouble call. | Very low | 2
Minor disruption of production. Repair of failure may be longer than trouble call but does not delay mission. | Low | 3
Moderate disruption of production. Some portion of the production process may be delayed. | Low to moderate | 4
Moderate disruption of production. The production process will be delayed. | Moderate | 5
Moderate disruption of production. Some portion of production function is lost. Moderate delay in restoring function. | Moderate to high | 6
High disruption of production. Some portion of production function is lost. Significant delay in restoring function. | High | 7
High disruption of production. All of production function is lost. Significant delay in restoring function. | Very high | 8
Potential safety, health or environmental issue. Failure will occur with warning. | Hazard | 9
Potential safety, health or environmental issue. Failure will occur without warning. | Hazard | 10

Detection ranking criteria
Ranking | Description
1-2 | Very high probability of detection
3-4 | High probability of detection
5-7 | Moderate probability of detection
8-9 | Low probability of detection
10 | Very low probability of detection

Occurrence ranking criteria
Ranking | Frequency of occurrence/operating hours | Description
1 | 1/10,000 | Remote probability of occurrence; unreasonable to expect failure to occur
2 | 1/5,000 | Low failure rate
3 | 1/2,000 | Low failure rate
4 | 1/1,000 | Occasional failure rate
5 | 1/500 | Moderate failure rate
6 | 1/200 | Moderate failure rate
7 | 1/100 | High failure rate
8 | 1/50 | High failure rate
9 | 1/20 | Very high failure rate
10 | 1/10 | Very high failure rate
7. Assign an occurrence ranking
for each failure mode. Occurrence
is the probability of failure during the
product’s expected lifecycle, usually
determined using the failure log
history. But when historical data are
not available or the failure never has
occurred before, the team can determine
the causes of each failure mode with
techniques such as the “five whys.”
Once the potential causes are deter-
mined, the team can estimate an occur-
rence ranking.
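When a failure log is available, the lookup against the occurrence table in Figure 1 can be sketched in a few lines of Python. This is only an illustration: the function name occurrence_ranking is hypothetical, and rounding an observed rate up to the next listed frequency is an assumption, since Figure 1 lists single frequencies rather than ranges.

```python
from fractions import Fraction

# Failures per operating hour for each occurrence ranking,
# transcribed from the occurrence table in Figure 1.
OCCURRENCE_TABLE = [
    (Fraction(1, 10_000), 1),
    (Fraction(1, 5_000), 2),
    (Fraction(1, 2_000), 3),
    (Fraction(1, 1_000), 4),
    (Fraction(1, 500), 5),
    (Fraction(1, 200), 6),
    (Fraction(1, 100), 7),
    (Fraction(1, 50), 8),
    (Fraction(1, 20), 9),
    (Fraction(1, 10), 10),
]


def occurrence_ranking(failures: int, operating_hours: int) -> int:
    """Return the lowest ranking whose listed frequency covers the observed rate.

    Assumption: a rate that falls between two table entries is rounded up
    to the higher (more pessimistic) ranking.
    """
    if failures < 0 or operating_hours <= 0:
        raise ValueError("failures must be >= 0 and operating_hours must be > 0")
    rate = Fraction(failures, operating_hours)
    for limit, ranking in OCCURRENCE_TABLE:
        if rate <= limit:
            return ranking
    return 10  # more frequent than 1 failure per 10 operating hours


# Hypothetical log: 3 failures in 10,000 operating hours.
print(occurrence_ranking(3, 10_000))
```

A team without historical data would still estimate the ranking from the causes found with techniques such as the "five whys"; the lookup only applies when a failure rate can be counted.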
8. Assign a detection ranking for
each failure mode. First, the current
control and prevention methods
applied to prevent, detect or control the
failure should be listed, reviewed and
evaluated. The detection ranking should
be assigned for each failure mode or
effect based on the current control/
prevention/detection methods. As with
the severity and occurrence rankings,
the detection ranking table in Figure
1 is standard. If one failure mode or
effect has several causes, detection and
occurrence rankings should be assigned
based on these causes. When potential
causes are eliminated, the risk of failure
is lowered.
9. Calculate the risk priority
number. The risk priority number
(RPN) gauges the risk associated with
potential problems identified during
the FMEA process. It is useful for
assessing risk and comparing compo-
nents to determine priorities. The RPN
is calculated by multiplying the severity,
occurrence and detection rankings for each
failure mode or effect. The number can
serve as a gauge to compare with the
revised RPN once the FMEA process is
completed and risk is lowered.
Many have commented that the
“ideal” tables in Figure 1 do not exactly
match their industry type or current
conditions. But remember that the
ideal is only a guide, and the tables can
be adapted and changed as needed.
However, it is important to keep the
rankings from one to 10 so that the
RPN scale has a minimum score of one
and a maximum score of 1,000.
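The calculation in this step is a single multiplication, and a short Python sketch makes the one-to-10 constraint explicit. The function name risk_priority_number is a hypothetical choice for illustration; the sample values are the severity, occurrence and detection rankings listed for the bushing short circuit in Figure 2.

```python
def risk_priority_number(severity: int, occurrence: int, detection: int) -> int:
    """Multiply the three rankings, each of which must lie between 1 and 10."""
    rankings = {
        "severity": severity,
        "occurrence": occurrence,
        "detection": detection,
    }
    for name, value in rankings.items():
        if not 1 <= value <= 10:
            raise ValueError(f"{name} ranking must be between 1 and 10, got {value}")
    return severity * occurrence * detection


# Bushing short circuit in Figure 2: severity 4, occurrence 1, detection 6.
print(risk_priority_number(4, 1, 6))  # 24
```

Rejecting rankings outside the one-to-10 range is what keeps every RPN between one and 1,000, so scores stay comparable even when a team adapts the tables to its own industry.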
10. Prioritize failure modes to
take action. This could be done with
something like a Pareto chart and the
80-20 rule. Failure modes should
be prioritized according to the risk
number. High-risk numbers should be
given attention first; then you can pay
attention to the severity rankings. Thus,
if several failure modes have the same
risk priority number, the failure mode
with the highest severity should be given
priority.
All RPNs above a certain cutoff point
should be considered for improvement.
The cutoff point number should be one
that will improve at least 50 percent of
the total risk priority number.
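The ordering rule in this step (highest RPN first, ties broken by severity, only RPNs above a cutoff) can be sketched as follows. The class and function names and the four sample failure modes are hypothetical and do not come from the article. The covered_share helper reflects one possible reading of the 50 percent guideline, namely the share of the total RPN carried by the failure modes above the cutoff.

```python
from typing import NamedTuple


class FailureMode(NamedTuple):
    name: str
    severity: int
    occurrence: int
    detection: int

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection


def prioritize(modes: list[FailureMode], cutoff: int) -> list[FailureMode]:
    """Keep modes whose RPN exceeds the cutoff, highest RPN first.

    Modes with equal RPNs are ordered by severity, highest first.
    """
    selected = [m for m in modes if m.rpn > cutoff]
    return sorted(selected, key=lambda m: (m.rpn, m.severity), reverse=True)


def covered_share(modes: list[FailureMode], cutoff: int) -> float:
    """Share of the total RPN carried by modes above the cutoff (assumed reading)."""
    total = sum(m.rpn for m in modes)
    above = sum(m.rpn for m in modes if m.rpn > cutoff)
    return above / total


# Hypothetical worksheet rows, for illustration only.
modes = [
    FailureMode("seal leak", 4, 2, 4),          # RPN 32
    FailureMode("overpressure", 8, 1, 4),       # RPN 32, higher severity
    FailureMode("insulation fault", 4, 1, 10),  # RPN 40
    FailureMode("loose cover", 4, 1, 4),        # RPN 16, not above the cutoff
]

ranked = prioritize(modes, cutoff=16)
print([m.name for m in ranked])
print(f"{covered_share(modes, cutoff=16):.0%} of total RPN is above the cutoff")
```

Sorting on the pair (RPN, severity) keeps the RPN as the primary criterion, so severity only decides between failure modes that would otherwise tie.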
11. Take action to eliminate or
reduce the high-risk failure modes.
Once the priorities are assigned,
organize action through continuous
improvement tasks and problem-solving
approaches, implementing countermea-
sures to reduce or eliminate the high-risk
failure modes.
Often, the easiest way to make an
improvement to the product or process
is to increase the detectability of the
failure, lowering the detection ranking.
Teams can improve the chances
of detecting failure through modifying
the preventive maintenance program,
using a proper condition-monitoring
method, eliminating the failure mode
during the manufacturing process by
changing materials or suppliers, or
considering a mistake-proofing method
during the design phase. An example
would be computer software that
automatically warns that you are running
out of memory.
12. Recalculate the risk priority number
as high risks are removed. After
corrective actions have been taken to
lower risks, recalculate the RPN. You
can compare this revised RPN with the
earlier number to gauge improvement.
The expectation is that the FMEA
approach will reduce the initial RPN by
at least 50 percent.
There always will be a potential for
failure modes to occur. The question
the company must ask is how much
relative risk the team is willing to take.
That answer might depend on the type
of industry and the seriousness of
the failure. For example, the nuclear
industry has little margin for errors,
as minor problems could escalate into
major disasters. Other industries might
find it acceptable to take higher risks.
A case of reliable improvement
A good example of a successful FMEA
process comes from the case of a
system that supplied electricity to a
glass-melting furnace in Egypt. The
electric transformer is considered
critical because a failure causes high
production losses – $5,000 an hour.
A standby generator could keep the
furnace running if the transformer
failed. The standby was sufficient to
avoid damaging the furnace but did not
supply enough electricity to continue
production.
The team broke down the transformer
into seven components: bushing, tank,
core, winding, oil, tap changer and solid
isolation. Each component has different
failure modes. For each failure mode
there is an effect. And for each failure
mode and effect there are several causes.
Figure 2 shows the seven components
and their failure modes, effects, causes,
RPNs, rankings, recommended actions
and other details.
The severity ranking number was
based on the effect of each failure mode.
Most of the failures had a medium effect
on production because standby was
available. An occurrence ranking was
assigned based on the potential causes
of each failure mode and the historical
data.
It is important to discover the
problem’s root cause first because the
cause will help determine the occur-
rence ranking. A detection ranking was
assigned based on an evaluation of the
transformer’s current preventive mainte-
nance program.
The transformer’s maintenance
program contained basic measurements
and analysis on a monthly and annual
basis. No advanced prediction methods
were used to detect severe problems
that might occur during the system’s
operation.
CURBING FUTURE PROBLEMS
Figure 2. This successful FMEA project reduced the RPN of a transformer’s seven components from 540 to 188.
COMPONENT NAME AND FUNCTION: Bushing, supply high voltage
Failure mode: Short circuit | Failure effect: Equipment shutdown | Severity: 4
Failure causes | Occurrence | Current control detection/prevention methods | Detection | RPN | Recommendations | Actions | Results: S | O | D | RPN
Fault in insulation material > Water penetration or dirt > Inelastic gasket > Aging | 1 | Visual inspection and cleaning | 6 | 24 | Improve inspection and detectability | Use a proper condition-monitoring technique such as ultrasound to detect insulation faults | 4 | 1 | 2 | 8
Fault in insulation material > Water penetration or dirt > Inelastic gasket > Lack of maintenance | 1 | Visual inspection and cleaning | 6 | 24 | Improve inspection and detectability | Use a proper condition-monitoring technique such as ultrasound to detect insulation faults | 4 | 1 | 2 | 8
Damage bushing > Sabotage stone, crash or careless handling | 1 | None | 4 | 16 | NA | NA | 4 | 1 | 4 | 16

COMPONENT NAME AND FUNCTION: Tank, enclose oil, protect active parts
Failure mode: Leakage | Failure effect: Equipment shutdown | Severity: 4
Failure causes | Occurrence | Current controls | Detection | RPN | Recommendations | Actions | Results: S | O | D | RPN
Tank damage (rupture) > Material/method > Inelastic gasket or corrosion > Aging | 1 | Visual inspection | 5 | 20 | Improve inspection and detectability | Use ultrasound analysis technique to detect arcing phenomena | 4 | 1 | 1 | 4
Tank damage (rupture) > Material/method > Inelastic gasket or corrosion > Insufficient maintenance | 1 | Visual inspection | 5 | 20 | Improve inspection and detectability | Use ultrasound analysis technique to detect arcing phenomena | 4 | 1 | 1 | 4
Tank damage (rupture) > Mechanical damage > High pressure due to gas generation > Arcing | 1 | None | 10 | 40 | Improve inspection and detectability | Use ultrasound analysis technique to detect arcing phenomena | 4 | 1 | 1 | 4
Tank damage (rupture) > Mechanical damage > Careless handling | 1 | | 1 | 4 | NA | NA | 4 | 1 | 1 | 4

COMPONENT NAME AND FUNCTION: Core, carry magnetic flux
Failure mode: Loss of efficiency (reduction of transformer efficiency) | Failure effect: Lower voltage, production disturbance | Severity: 4
Failure causes | Occurrence | Current controls | Detection | RPN | Recommendations | Actions | Resulting RPN
Mechanical failure > Displacement of the core seal during construction (construction fault) | 1 | Basic measurements and gauges monitoring on monthly basis | 4 | 16 | NA | NA | 16
DC magnetization | 1 | Basic measurements and gauges monitoring on monthly basis | 4 | 16 | NA | NA | 16

COMPONENT NAME AND FUNCTION: Winding, carry current
Failure mode: Short circuit | Failure effect: Equipment shutdown | Severity: 4
Failure causes | Occurrence | Current controls | Detection | RPN | Recommendations | Actions | Results: S | O | D | RPN
Fault insulation > Generation of copper sulfide | 1 | | 8 | 32 | Improve inspection and detectability | Use ultrasound analysis technique | 4 | 1 | 2 | 8
Hot spot > Low oil quality | 1 | Oil sampling | 1 | 4 | NA | NA | 4 | 1 | 1 | 4
Mechanical damage > Movement of transformer > Aging of cellulose | 1 | None | 5 | 20 | Improve inspection and detectability | Use ultrasound for early detection | 4 | 1 | 2 | 8
Transient overvoltage > Short circuit in the net | 1 | None | 5 | 20 | Improve inspection and detectability | Use ultrasound for early detection | 4 | 1 | 2 | 8
Transient overvoltage > Connection of transformer | 1 | None | 5 | 20 | Improve inspection and detectability | Use ultrasound for early detection | 4 | 1 | 2 | 8
Transient overvoltage > Lightning | 1 | None | 5 | 20 | Improve inspection and detectability | Use ultrasound for early detection | 4 | 1 | 2 | 8
Construction fault | 1 | None | 5 | 20 | Improve inspection and detectability | Use ultrasound for early detection | 4 | 1 | 2 | 8

COMPONENT NAME AND FUNCTION: Oil, the oil serves as both cooling medium and part of the insulation system
Failure effect: Equipment shutdown | Severity: 4
Failure causes | Occurrence | Current controls | Detection | RPN | Recommendations | Actions | Results: S | O | D | RPN
Short circuit in transformer > Particles in the oil > Overheated > Pump failure, dirty particles in the oil | 2 | Visual monitoring of gauges and oil sampling every three years | 4 | 32 | Increase the frequency of oil sampling to twice per year in the maintenance schedule | Sample oil every six months in the semiannual maintenance schedule | 4 | 1 | 2 | 8
Water in the oil > Overheated or aging; Overheated > Oil is not cooled > Oil circulation out of function, or air/water cooling is out of function > Fan/pump failure | 2 | Visual monitoring of gauges and oil sampling every three years | 4 | 32 | Increase the frequency of oil sampling to twice per year in the maintenance schedule | Sample oil every six months in the semiannual maintenance schedule | 4 | 1 | 2 | 8

COMPONENT NAME AND FUNCTION: Tap changers, regulate voltage (volt leveling)
Failure mode: Tap changes | Failure effect: Change of the voltage output | Severity: 3
Failure causes | Occurrence | Current controls | Detection | RPN | Recommendations | Actions | Results: S | O | D | RPN
Can’t change voltage level > Mechanical damage > Wear | 2 | Voltage measuring | 6 | 36 | Use a proper condition-monitoring technique to detect mechanical damages of the tap changers | Use infrared analysis to detect mechanical damages | 4 | 1 | 2 | 8

COMPONENT NAME AND FUNCTION: Solid isolation in cellulose-based products such as pressboard and paper. Its function is to provide dielectric and mechanical isolation to the windings.
Failure mode: Can’t supply insulation | Failure effect: Equipment shutdown | Severity: 4
Failure causes | Occurrence | Current controls | Detection | RPN | Recommendations | Actions | Results: S | O | D | RPN
Mechanical damage > Short circuit > Aging of cellulose | 1 | None | 10 | 40 | Improve inspection and detectability in the maintenance program | Use ultrasound to detect early isolation failures | 4 | 1 | 2 | 8
Mechanical damage > Movement of transformer > Aging of cellulose | 1 | None | 10 | 40 | Improve inspection and detectability in the maintenance program | Use ultrasound to detect early isolation failures | 4 | 1 | 2 | 8
Fault in insulation material > Hot spot > Low oil quality, or overload | 1 | | 1 | 4 | | | 4 | 1 | 1 | 8
Fault in insulation material > Generation of copper sulfide | 1 | | 10 | 40 | Use a proper condition-monitoring technique to detect insulation faults early | Use ultrasound to detect isolation failures early | 4 | 1 | 2 | 8
A risk priority number was calcu-
lated, with a cutoff RPN of 16. All RPNs
greater than 16 were considered for
improvement. The FMEA calculated a
total RPN of 540. Applying continuous
improvement actions to all RPNs
greater than 16 lowered the total RPN to
188. This revised RPN was a 65 percent
improvement. The reduction percentage
is calculated using this formula:
(RPN - revised RPN) / RPN × 100.
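As a quick check of the arithmetic, the reduction for this case can be reproduced with a short Python sketch. The function name rpn_reduction_percent is hypothetical; the two totals, 540 and 188, are the ones reported for the transformer.

```python
def rpn_reduction_percent(initial_rpn: float, revised_rpn: float) -> float:
    """Percentage reduction from the initial total RPN to the revised total."""
    if initial_rpn <= 0:
        raise ValueError("initial_rpn must be positive")
    return (initial_rpn - revised_rpn) / initial_rpn * 100


# Transformer case: total RPN lowered from 540 to 188.
reduction = rpn_reduction_percent(540, 188)
print(f"{reduction:.1f} percent")  # 65.2 percent
```

The result rounds to the 65 percent quoted for the case and clears the 50 percent expectation set out in step 12.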
The improvements that yielded
success included using ultrasound to
detect issues, increasing the frequency
of oil sampling and using infrared
analysis to detect mechanical damage.
An FMEA process can trigger a
number of such actions to improve
a product’s service or maintenance
processes. They include, but are not
limited to: Increase the detection rate
of high-risk failures using a proper
technique to monitor conditions;
increase the inspection rate for a
specific component or part; modify the
routine maintenance program; increase
the frequency of replacing a specific
spare part; modify the preventive
maintenance schedule; change a spare
part supplier; redesign a specific part
in the system – or redesign the whole
system; and use different types of
materials or spare parts.
While the above example involves
a piece of equipment and its parts,
FMEA can be applied in many other
areas, including the component proving
process; the outsourcing or resourcing
of a product; developing suppliers
to achieve quality; major changes in
processes, equipment or technology; cost
reductions; and analysis of new products
or designs.
Other important considerations
Failure mode and effects analysis can
maximize a product’s reliability. But
don’t mistake it for a standalone tool.
For example, to determine occurrence
ratings, FMEAs rely on the failure
log history, and the documentation
process also is important. Problem-
solving techniques like “five whys,”
brainstorming, fault-tree analysis and
Pareto analysis must be engaged. These
techniques will help determine potential
failure modes; assign the severity,
occurrence and detection rankings; and
provide solutions or actions to eliminate
those failures.
And it cannot be emphasized enough how
important customers are for a successful
FMEA. A proper FMEA process must
consider not only failures related to
your organization’s quality, but failures
and mistakes that can be introduced by
your customers. Collecting feedback is
important. For example, Toyota’s recalls
in recent years relied on its dealers
and service centers to play a big role in
collecting the important data needed to
let Toyota know what changes needed to
be made on its factory floors. The data
was based on customer feedback and
comments.
And, as industrial engineers and
managers know, the best tools will not
work without an inherent culture of
continuous improvement. Everything
runs a risk of failure. When failure
happens, the important thing is to find
out what the organization can do to
prevent those failures from occurring in
the future.
An FMEA is not a one-time job – it
should be repeated continuously to keep
the process improved. Once the quality
and cost of your company’s offerings
have been improved, competitors will
try to match or exceed your value propo-
sition. Continual FMEAs will bring your
processes closer to perfection, so the
continuous improvement culture should
be embedded throughout all levels and
with all employees.