DATA-DRIVEN APPROACH TO ESTIMATE MAINTENANCE LIFE CYCLE COST OF ASSETS
Roanoke, VA, USA
Different participants in the supply chain of an industrial asset, from original equipment manufacturer (OEM) to
owner/operator (O/O), know more than others about significant aspects of the asset. Sharing of information between these
participants is necessary to most effectively manage a product or asset for all stakeholders involved. In particular, one type
of data generated about an asset during its lifecycle is maintenance data. Field maintenance data collected over the usage
of a product provides valuable information about its failure patterns and performance in different operating contexts that can
benefit all. However, maintenance data by itself typically has data quality issues and needs to be understood and processed
in order for information of value to be extracted and used. In this article, we present a case study of how maintenance data
from the CMMS/EAM can be processed to return information that can be used to benefit everyone in the supply chain.
Lifecycle costs for industrial assets can mean different things from different perspectives of the supply chain. Product lifecycle
management (PLM) is the activity of effectively managing products across the product lifecycle (which spans phases from design, to
production, logistics, and maintenance to disposal/obsolescence of a product) (1). For the users of an industrial asset, lifecycle cost
analysis (LCC) measures the total cost of ownership (TCO), taking many different factors into account from the stages of an asset’s
lifespan, such as initial costs, annual operating and maintenance costs, and decommissioning costs. These two perspectives are
complementary, but how information is used depends on the stakeholder.
Different stakeholders in the supply chain include the original equipment manufacturer (OEM), the owner/operator (O/O) of the
asset, as well as middle parties such as dealers and suppliers. In practice, there is often an asymmetry of information between the
different participants, which is characterized by each of the participants knowing more than others about significant aspects of the asset.
For instance, the OEM knows more about the design characteristics and performance capabilities of the equipment it manufactures.
The dealers know more about parts and services as well as local and regional dynamics that affect the sale of new whole goods as well
as overhaul records. The O/O will know more about how the fleet was operated, for how long, and the conditions under which it was
operated, including dispatch and production data, utilization rate, scheduled and unscheduled downtime. Additionally, the O/O will
know more about how the fleet was serviced and maintained including parts consumption and labor.
Effective PLM depends on collaboration among the different participants in the supply chain and the sharing of information from
the data. OEMs can benefit from understanding the gap between how a product is intended to be used, and how it is actually used by
the O/O. Information about an individual asset in its operating context throughout its useful life can inform the product lifecycle for the
OEM such as for improving designs in new products or versions, improving the quality of product production, and for creating and
validating pricing structures. From an O/O perspective, benefits of collectively using information include reducing unplanned
downtime, optimal planned downtime, incremental capacity utilization, and improved certainty about the TCO. From a dealer
perspective, benefits include increase in parts and services revenue as well as increased opportunity for managed services.
Maintenance work order data contains information about failure patterns and maintenance activities of an asset through its lifecycle.
This data has the potential to generate actionable intelligence as well as field usage information which can be useful for all members of
the supply chain. However, simply sharing raw data alone will not give anyone these rewards. Adaptable work processes in which
actionable information can be shared across the supply chain effectively need to be developed (2) (3). To extract actionable information,
the gap between the raw collected maintenance data and actionable insights needs to be addressed.
This paper discusses challenges in the maintenance data and proposes data-driven analytical approaches for addressing these
challenges in order to extract relevant information from maintenance work orders that can be shared between the different participants of
the supply chain. This paper focuses specifically on estimating the annual costs around an asset from historical data and understanding
the costs and reliability from different failure modes from observed field data. We illustrate these concepts with a simple case study
comparing simulated life cycle costs between two comparable asset models in a system, which demonstrates both the challenges that
need to be considered to extract actionable insights as well as showing what this information could look like and how it could benefit
BACKGROUND - FIELD DATA FOR ANNUAL MAINTENANCE LIFECYCLE COSTS
Annual lifecycle costs include the costs of maintenance (both corrective and proactive), production losses from downtime, and other
regularly occurring activities that could incur costs or lost opportunity to produce. Maintenance data from Enterprise Asset Management
(EAM) or Computerized Maintenance Management systems (CMMS) contains information about work tasks such as planning,
scheduling, and reporting (4). The information in the CMMS/EAM contains records of all maintenance activities and costs across asset
fleets, but challenges in directly using this information arise due to data quality and consistency issues. Discussions of different data
quality challenges are well reviewed in (5) (6) (7) (8) (9) (10) (11). Some key data quality challenges relevant to this study include
missing breakdown indicator (unknown which events are failure events), missing and inconsistent failure modes, and the unstructured
nature of manufacturer and model nomenclature across a large registry.
In our case study, we show how historical maintenance data can be used to estimate annual lifecycle costs and the considerations
and assumptions made along the way. The GE Asset Answers database aggregates work history data from many industrial facilities
around the world by asset type, manufacturer, as well as many other characteristics. This data is anonymized and made available to
subscribers who can compare themselves against peer data. This work is part of an effort to develop ways that the maintenance data
in Asset Answers can be made more valuable to all participants, and to show how information sharing benefits all. We specifically compare
two similar makes and models of centrifugal pumps from different peer data, evaluate the data quality, and use natural language
processing to predict which events are failures and to structure the unstructured text. We use this information to estimate metrics to
inform a system reliability model and run a Monte Carlo simulation to compare annual lifecycle costs by risk events. A similar workflow
was used in (12) to estimate the contribution of a certain category of component failures on system reliability. To protect proprietary
information, all variables have been anonymized and age has been scaled.
Out of over 65,000 repair events for over 6,200 centrifugal pumps at 22 different companies over a 4 year period of time from the
Asset Answers database, we identified 2 comparable makes and models, AIC pumps and RELIABLE pumps. 8,000 repair records were
identified against these two models. The next step is to identify which of these repair events are a failure. This process and
considerations are described in (13), where we use a classifier in the GE Digital APM commercial software package which predicts if a
repair event was a failure or not. Of the 8,000 repair events, about 5,800 of them were predicted as failure events. For this dataset,
common themes among the repairs that were not predicted as failure events were either some undeterminable text, or routine procedures
such as inspect or service. A few examples are shown in Table 1.
Table 1 Example work order descriptions demonstrating failure classification of repair events for centrifugal pumps
Is A Failure?
Seal is leaking badly
Block valve is broken open and inoperable
00120-Pump 1 Work Request
Check impeller size
Once we had isolated which maintenance events corresponded to failures, text mining was utilized to characterize the failure mode
information. From this dataset, we characterized failure events by maintainable item and failure mechanisms and used text matching to
extract the information. Different types of approaches for structuring unstructured text are very well described in (14), and have been
compared and studied in (15) (16). Challenges that arise with extracting failure information from maintenance work orders include
naturally occurring class imbalance (certain components or failure events are going to happen at greater frequency than others), the
possibility of multiple correct labels per observation, and the challenges of the transactional text such as misspellings and abbreviations.
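The text-matching step described above can be illustrated with a minimal sketch. It is not the tooling used in the study: the keyword patterns, the label names, and the helper functions `match_labels` and `tag_work_order` are hypothetical, and a real dictionary would be built from the work orders themselves. The sketch returns every matching label, since an observation may carry more than one correct label, and includes a couple of abbreviations to reflect the transactional nature of the text.

```python
import re

# Hypothetical keyword patterns for illustration only; not the study's dictionary.
MAINTAINABLE_ITEMS = {
    "seal": r"\bseals?\b",
    "valve": r"\b(?:valves?|vlv)\b",
    "bearing": r"\b(?:bearings?|brg)\b",
}
FAILURE_MECHANISMS = {
    "leak": r"\bleak(?:s|ing|age)?\b",
    "broken": r"\b(?:broken|broke|cracked)\b",
    "vibration": r"\bvibrat\w*\b",
}


def match_labels(text, patterns):
    """Return every label whose pattern occurs in the text (multi-label)."""
    lowered = text.lower()
    return sorted(label for label, pattern in patterns.items()
                  if re.search(pattern, lowered))


def tag_work_order(description):
    """Tag one work order description with maintainable items and mechanisms."""
    return {
        "items": match_labels(description, MAINTAINABLE_ITEMS),
        "mechanisms": match_labels(description, FAILURE_MECHANISMS),
    }


if __name__ == "__main__":
    for text in ["Seal is leaking badly",
                 "Block valve is broken open and inoperable",
                 "Check impeller size"]:
        print(text, "->", tag_work_order(text))
```

Descriptions that match no pattern come back with empty lists, which makes it easy to see how much of the text a given dictionary leaves unlabeled.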
In our case study, the objective is to use the structured data to estimate annual life cycle costs from field data by different risk events.
We use a system reliability simulation to estimate the annual lifecycle costs throughout a 10-year period of time. The simulation tool
used is a Monte-Carlo system reliability simulation in the GE APM Reliability commercial software package. We build a simple system
reliability (reliability-availability-maintainability, or RAM) model to illustrate how these processes can be used as part of a larger
manufacturing process and other more detailed factors can also be incorporated. The block diagrams are shown in Figure 1.
Figure 1 System reliability scenario models for the two pumps. The system is simplistic but illustrates the framework, and we assume the risks are
all on the Pump subsystem
Reliability factors are incorporated using the failure mode information extracted from the data. Availability factors come from the
unplanned downtime over the course of the 10-year simulation due to these failure modes. Maintainability is modeled using the
time-to-repair (TTR) distribution estimated from the field data. The Society for Maintenance & Reliability Professionals (SMRP) defines TTR as the time
needed to restore an asset to its full operational capabilities after a failure (17). We estimate TTR as the difference between the
maintenance start and maintenance completion date on the work order, as an estimate of when the maintenance work was done. TTR
distribution is typically right skewed, and we model it using the lognormal distribution.
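As an illustration of the TTR estimate described above, the following sketch computes TTR in days from work order start and completion dates and estimates the lognormal parameters as the mean and standard deviation of the log-transformed values. The function names and the sample dates are hypothetical. Dropping same-day repairs (TTR of zero days) before taking logarithms is an assumption of this sketch, not a step stated in the study.

```python
import math
from datetime import date
from statistics import mean, pstdev


def ttr_days(maintenance_start, maintenance_complete):
    """TTR estimate: days between work order start and completion dates."""
    return (maintenance_complete - maintenance_start).days


def fit_lognormal(ttr_values):
    """Estimate lognormal mu and sigma from positive TTR values.

    Maximum likelihood for the lognormal is the mean and the population
    standard deviation of the log-transformed data.
    Assumption: zero-day TTRs are dropped because log(0) is undefined.
    """
    logs = [math.log(t) for t in ttr_values if t > 0]
    return mean(logs), pstdev(logs)


if __name__ == "__main__":
    # Hypothetical work order dates, for illustration only.
    orders = [
        (date(2016, 3, 1), date(2016, 3, 2)),
        (date(2016, 5, 10), date(2016, 5, 13)),
        (date(2016, 9, 4), date(2016, 9, 14)),
    ]
    ttrs = [ttr_days(start, done) for start, done in orders]
    mu, sigma = fit_lognormal(ttrs)
    print(ttrs, round(mu, 3), round(sigma, 3))
```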
To map the RAM simulation results to the financial implications, we use costs per repair event estimated from the data and assume
a user-specified production loss quantity. We use the average repair cost from the maintenance work orders as measures of unplanned
fixed corrective costs. We make the user-specified assumption that any loss of pump function will interrupt production that is valued at
$5,000 per 24-hour day to get estimates of the production losses.
We assume both the cost to repair and the time to repair do not vary between the two pump models, but do vary by the failure mode.
We assume that these factors are more specific to the site-specific maintenance and reliability practices than to the asset make and model
but do depend on the nature of the failure. Make and model independent estimated parameters used in our simulation are in Table 2(a).
We identified the 3 most common failure mode groupings to use as risk events. These were seal failures,
valve failures, and bearing failures, which together account for 46% of the total failures. For each population (one per make and model),
2-parameter Weibull distribution parameters were estimated by maximum likelihood using the probability distribution fitting tool in
GE APM Reliability Analytics. Estimated shape
parameters are summarized in Table 2(b).
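For readers without access to the commercial fitting tool, maximum likelihood estimation of a 2-parameter Weibull distribution can be sketched as follows. This is not the GE APM implementation. The sketch assumes complete (uncensored) times between failures, which field data often violates, and the function name `fit_weibull_mle` is hypothetical. The shape is found by bisection on the profile likelihood equation, and the scale then follows in closed form.

```python
import math
import random


def fit_weibull_mle(times, tol=1e-9):
    """MLE of Weibull shape (beta) and scale (eta) for complete data.

    Assumption: every time is an observed failure time (no censoring).
    """
    if len(times) < 2 or min(times) <= 0 or min(times) == max(times):
        raise ValueError("need at least two distinct positive times")
    t_max = max(times)
    x = [t / t_max for t in times]  # rescale to (0, 1] to avoid overflow
    mean_log = sum(math.log(v) for v in x) / len(x)

    def score(beta):
        # Profile likelihood equation for beta; increasing in beta.
        weights = [v ** beta for v in x]
        weighted_log = sum(w * math.log(v) for w, v in zip(weights, x))
        return weighted_log / sum(weights) - 1.0 / beta - mean_log

    lo, hi = 1e-3, 1.0
    while score(hi) < 0:  # expand the bracket until the root is enclosed
        hi *= 2.0
    while hi - lo > tol:  # bisection
        mid = 0.5 * (lo + hi)
        if score(mid) < 0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    eta = t_max * (sum(v ** beta for v in x) / len(x)) ** (1.0 / beta)
    return beta, eta


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic times between failures (days); parameters are illustrative.
    sample = [rng.weibullvariate(400.0, 1.5) for _ in range(5000)]
    print(fit_weibull_mle(sample))
```

A shape estimate below 1 points to early-life failures, a value near 1 to random failures, and a value above 1 to wear-out, which is why the shape parameter is the natural quantity to compare across the two pump populations.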
Table 2 Extracted parameters from field data for use in estimating 10 year asset lifecycle costs for two comparable pump models. (a)
Maintainability measures dependent on failure mode, but we assume independent across the different asset make and model numbers, (b)
comparison of estimated reliability (Weibull distribution) parameters between the two asset make and models.
Average Corrective Work Cost (USD)
TTR Distribution - 𝜇
TTR Distribution - 𝜎
(a) Make and model independent estimated metrics and distribution parameters for RAM simulation
Scale 𝜂 (days)
Valve failure
Scale 𝜂 (days)
(b) Make and model dependent estimated metrics and distribution parameters for RAM simulation
Simulating the RAM model over 10 years produces an estimate of the cost of unreliability per year for each vendor scenario. We
ran 1,000 iterations. The cost of unreliability in this scenario is a combination of the unplanned corrective costs and the lost production
costs (Table 3). We apply the Net Present Value (NPV) function using an initial investment of $0 and a discount rate of 10%. The resulting
NPV expresses the sequence of cash flows in today’s dollars and shows that AIC pumps are projected to incur lower costs over a
10-year period than RELIABLE pumps. Estimated annual costs can be used with other information, such as comparison to the
purchase price. For example, if AIC pumps has a higher acquisition price, this information can be used to justify its purchase to the
Table 3 Annual cost of unreliability as a sum of production losses
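The NPV step can be sketched in a few lines. The sketch assumes the common spreadsheet convention in which each annual amount falls at the end of its year and is discounted by (1 + rate) raised to the year number; the paper does not state which convention was used. The function name `npv` and the annual cost figures are hypothetical and are not results from the study.

```python
def npv(rate, annual_amounts, initial_investment=0.0):
    """Present value of end-of-year amounts, less an initial investment.

    Assumption: the first amount occurs at the end of year 1.
    """
    discounted = sum(amount / (1.0 + rate) ** year
                     for year, amount in enumerate(annual_amounts, start=1))
    return discounted - initial_investment


if __name__ == "__main__":
    # Hypothetical annual costs of unreliability (USD) for a 10-year horizon.
    annual_costs = [60000.0] * 10
    print(round(npv(0.10, annual_costs), 2))
```

Because the inputs here are costs, a lower result means a lower present-value cost of unreliability. Passing the acquisition price as `initial_investment` with costs entered as negative amounts is one way to fold purchase price into the same comparison.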
In this simulation, the costs are driven by the lost production, which is determined by the availability. We can compare the corrective
costs as well as the total downtime by failure mode for the two scenarios in Figure 2. Figure 2 shows that the unreliability and
unavailability incurred by seal failures is worse for RELIABLE pumps. However, the corrective cost from bearing failures is worse for
AIC pumps, but there is greater total unplanned downtime for RELIABLE pumps. Which pump model will incur the most cost annually
depends on the production losses.
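The way corrective cost and downtime accumulate by failure mode can be illustrated with a simplified Monte Carlo sketch. It is not the GE APM Reliability simulation. It assumes each failure mode is an independent renewal process with Weibull times to failure and lognormal repair times, that a repair returns the item to as-good-as-new condition, and that all downtime interrupts production at the $5,000 per day stated earlier. The function names and the parameter values in the example are hypothetical and are not the Table 2 estimates.

```python
import random

PRODUCTION_LOSS_PER_DAY = 5000.0  # user-specified value from the text
HORIZON_DAYS = 10 * 365           # 10-year simulation window


def simulate_mode(rng, shape, scale_days, ttr_mu, ttr_sigma,
                  horizon=HORIZON_DAYS):
    """Simulate one failure mode once; return (failure count, downtime days)."""
    clock, failures, downtime = 0.0, 0, 0.0
    while True:
        clock += rng.weibullvariate(scale_days, shape)  # time to next failure
        if clock >= horizon:
            return failures, downtime
        repair = min(rng.lognormvariate(ttr_mu, ttr_sigma), horizon - clock)
        failures += 1
        downtime += repair
        clock += repair  # assumption: as good as new after repair


def simulate_costs(modes, iterations=1000, seed=0):
    """Average corrective cost, downtime, and lost production per mode."""
    rng = random.Random(seed)
    summary = {}
    for name, p in modes.items():
        total_cost, total_down = 0.0, 0.0
        for _ in range(iterations):
            n, down = simulate_mode(rng, p["shape"], p["scale_days"],
                                    p["ttr_mu"], p["ttr_sigma"])
            total_cost += n * p["repair_cost"]
            total_down += down
        avg_down = total_down / iterations
        summary[name] = {
            "corrective_cost": total_cost / iterations,
            "downtime_days": avg_down,
            "lost_production": avg_down * PRODUCTION_LOSS_PER_DAY,
        }
    return summary


if __name__ == "__main__":
    # Hypothetical parameters for illustration only.
    modes = {
        "seal": {"shape": 1.2, "scale_days": 500.0, "ttr_mu": 0.5,
                 "ttr_sigma": 0.8, "repair_cost": 3000.0},
        "bearing": {"shape": 1.5, "scale_days": 900.0, "ttr_mu": 1.0,
                    "ttr_sigma": 0.6, "repair_cost": 4500.0},
    }
    for mode, result in simulate_costs(modes).items():
        print(mode, result)
```

Running the same function with one parameter set per pump model gives the kind of side-by-side comparison by failure mode discussed above, with lost production scaling directly with simulated downtime.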
Figure 2 Comparison of 10-year risks from simulation model for 2 pump models (a) Comparison of corrective costs over 10 years
by failure mode (b) Total unplanned downtime. Production losses are driven by unplanned downtime. AIC pumps have greater costs
with valve and seal failures than RELIABLE pumps, but more downtime associated with seal failures and bearing failures. The cost
trade-off of unreliability is determined largely by the production losses.
The case study in this paper illustrates steps and processes that can be used to process field maintenance data into actionable
information. The outputs from the study were comparative metrics for different failure modes for different models, as well as the output
from simulation models that can be used to make decisions.
Decisions are often made based on upfront costs, but they have long-lasting impacts on the life cycle cost of the system. RAM modeling
has been available to support these decisions for many years, but it has traditionally relied on the information available within
organizations or from published reference materials. Asset Answers provides a wealth of data that can be used to build high quality
reliability models of operating cost based on actual field performance data.
1. Stark, J. (2015). Product lifecycle management. Springer, Cham, 1-29.
2. Prajogo, D., & Olhager, J. (2012). Supply chain integration and performance: The effects of long-term relationships,
information technology and sharing, and logistics integration. International Journal of Production Economics, 514-522(1).
3. Li, J., Tao, F., Cheng, Y., & Zhao, L. (2015). Big data in product lifecycle management. The International Journal of
Advanced Manufacturing Technology, 667-684.
4. Gulati, R., & Smith, R. (2013). Maintenance and Reliability Best Practices, Second Edition. Industrial Press, Inc.,
5. Lukens, S., Naik, M., Hu, X., Doan, D. S., & Abado, S. (2017). The role of transactional data in prognostics and health
management work processes. Proceedings of the Annual Conference of the Prognostics and Health Management Society. St.
Petersburg, FL, 517-528.
6. Meeker, W. Q., & Hong, Y. (2014). Reliability meets big data: opportunities and challenges. Quality Engineering, 102-116.
7. Hodkiewicz, M., Kelly, P., Sikorska, J., & Gouws, L. (2006). A framework to assess data quality for reliability variables.
Engineering Asset Management. Springer, London. 137-147.
8. Koronios, A., Lin, S. & Gao. (2005). A data quality model for asset management in engineering organisations. Proceedings
of the 10th International Conference on Information Quality (ICIQ), Cambridge, MA, 27-51.
9. Lin, S., Gao, J., Koronios, A., & Chanana (2007). Developing a data quality framework for asset management in engineering
organisations. International Journal of Information Quality, 100-126.
10. Lukens, S. & Markham, M. (2018). Data science approaches for addressing RCM challenges. SMRP Conference
Proceedings, Orlando, FL.
11. Naik, M. & Saetia, K. (2018). Improving Data Quality By Using Best Practices And Cognitive Analytics. SMRP Conference
Proceedings, Orlando, FL.
12. Hodkiewicz, Melinda, Batsioudis, Z., Radomiljac, T., and Ho, Mark T.W. (2017). Why autonomous assets are good for
reliability - the impact of 'operator-related component' failures on heavy mobile equipment reliability. PHM Society Conference, St.
13. Lukens, S. & Markham, M. (2018). Data-driven application of PHM to asset strategies. Proceedings of the Annual
Conference of the Prognostics and Health Management Society, Philadelphia, PA.
14. Hodkiewicz, M., & Ho, M. T. W. (2016). Cleaning historical maintenance work order data for reliability analysis. Journal
of Quality in Maintenance Engineering, 146-163(2).
15. Sexton, T., Hodkiewicz, M., Brundage, M. P., & Smoker, T. (2018). Benchmarking for Keyword Extraction Methodologies in
Maintenance Work Orders. PHM Society Conference, Philadelphia, PA.
16. Sexton, T., Brundage, M. P., Hoffman, M., & Morris, K. C. (2017). Hybrid datafication of maintenance logs from AI-assisted
human tags. Big Data (Big Data), 2017 IEEE International Conference on.
17. SMRP Best Practices. (2017). Society for Maintenance & Reliability Professionals (SMRP) Atlanta, GA.