Proceedings of the 2017 Industrial and Systems Engineering Conference
K. Coperich, E. Cudney, H. Nembhard, eds.
Measurement System Analysis in Healthcare: Attribute Data
Omid M. Arani and Nadiye O. Erdil
Department of Mechanical and Industrial Engineering
University of New Haven
West Haven, CT 06516, USA
Abstract
Variation in a process can stem from one or more sources that are broadly categorized under 5 Ms: man, machine,
material, methods and measurements. This research focuses on process variation resulting from measurements and
provides guidelines to implement attribute measurement system analysis (MSA) in healthcare. If the measurement
contributes to the variation observed in the process, then it is difficult to separate the true process variation, and this
could lead to bad decision-making. MSA determines how much of the observed variability is due to the
measurement system. MSA has received significant attention to date; however, much of the research in this field focuses
on variables (continuous) data, and MSA finds wide application in manufacturing. Attribute (discrete/qualitative)
data are also abundant in many processes. In industries such as healthcare, attribute MSA can play an important role
in identifying variation. Medical errors resulting from system or human errors could possibly be linked to
measurement. In this paper, we discuss considerations and factors in the application of attribute MSA in healthcare,
describe key elements for successful implementation, and show why it is worth the effort. We then provide
guidelines to implement attribute MSA in a healthcare setting.
Keywords
Healthcare, measurement system analysis, attribute data, variation
1. Introduction and Related Literature
Medical errors are the third leading cause of death in the United States. Approximately 250,000 people die each
year due to medical errors [1], which accounts for 9.5% of all deaths. Errors that do not lead to death can
result in short-lived effects or cause permanent disabilities. The cost of medical errors was $17.1 billion in 2008 [2].
From a global perspective, research shows that about 4% of patients in developed countries are victims of medical
errors [3].
A 2005 joint study report from the National Academy of Engineering (NAE) and the Institute of Medicine (IOM)
highlights the potential of engineering tools and technologies in addressing the issues in healthcare related to safety,
efficacy, and efficiency [4]. The report emphasizes the role of systems engineering tools in improving quality,
productivity, and performance in healthcare. Overcoming medical errors and delivering safe care is one of the focus
areas discussed in the report in relation to improving care in the U.S. One of the most promising systems
engineering tools identified by the study committee is statistical process control (SPC). SPC involves
implementing methods that monitor, based on a number of measurements taken from the process, whether the
process produces consistent outputs that are within acceptable limits. Since the publication of the report, the
application of SPC in healthcare has grown tremendously [5-7].
A key step in implementing SPC, or any other tool that requires measurement, is to ensure that the measurement
methods are reliable. Measurement system analysis (MSA) is a procedure used to assess the capability of a
measurement system by quantifying the variation of the method used for taking measurements. MSA is widely used
in manufacturing, but has only recently drawn attention in the healthcare industry. MSA applications in healthcare
are mostly clustered around evaluating a single measurement instrument or comparing two or more different
measurement instruments. An assessment of optic disk topography using a retinal thickness analyzer was presented in
[8]. The results show that there is high variation in this process and that care must be taken when using it to diagnose
glaucoma. In another MSA study, which also focused on an eye-related issue, an instrument measuring
osmotic pressure was evaluated [9]. To verify the validity of ocular metrology, MSA was employed and the
guidelines to implement MSA for continuous data in ophthalmology were presented in [10]. The ultrasound
pachymeter is another measurement instrument that was studied, and the results showed excellent measurement
capability [11]. Two different blood pressure measurement instruments (a sphygmomanometer and a digital blood
pressure station) were compared in [12], and the results showed that the digital blood pressure station has less
variation and is a more capable measurement tool. Three different devices for measuring the eye’s anterior chamber
depth were evaluated in [13]. Another MSA study in healthcare compared the capability of three different instruments in
measuring central corneal thickness [14]. While there are numerous examples of MSA application in healthcare in
the literature, these studies are limited to continuous (variable) data.
Because humans are central to the healthcare system, from patients to healthcare workers, many
decisions are based on qualitative measurements. Variation in qualitative measurements, however, can lead to
incorrect decision-making [15]. For example, 37% of malpractice claims relate to diagnostic errors and 17% are
attributable to improper performance of a procedure [16]. MSA with attribute data, therefore, can find applications
in healthcare, including studies of medical errors. It can be very useful for determining the areas where procedures
and standards that produce consistent outputs are most needed.
The medical instrument, the healthcare worker, the environment in which healthcare is delivered, and their
interactions may all be sources of variation in measurement. This study focuses on variation in qualitative
measurements that might result in medical errors. The paper explores the applications of MSA in identifying errors
attributable to human-related issues. First, the sources of attribute data in measurement systems in healthcare are
identified; then, key elements for successful implementation of attribute MSA in healthcare and guidelines for
implementation are described.
2. Methodology
2.1. Measurement System Analysis
MSA determines how much of the observed variability is due to the measurement system. Errors in measurement
can be classified into two categories: accuracy and precision. Accuracy is the ability to produce results that are on
target (measured by the difference between the observed value and the master value), and precision is the ability to
produce results that are similar to each other (measured by the dispersion of the observed values). Table 1 shows the
components in each category that are evaluated to determine the reliability of a measurement system.
Table 1: Components of measurement error categories

| Error category | Component       | Description                                                                      |
|----------------|-----------------|----------------------------------------------------------------------------------|
| Accuracy       | Bias            | Accuracy of observed values compared to a gold standard                          |
| Accuracy       | Linearity       | Accuracy of observed values through the expected range of readings               |
| Accuracy       | Stability       | Accuracy of observed values over time                                            |
| Precision      | Repeatability   | Ability to get the same observed value in repeated measurements by the same appraiser |
| Precision      | Reproducibility | Ability to get the same observed value by different appraisers                   |
MSA with attribute data, also known as attribute gage R&R, is used for processes that require subjective inspection
or validation, for example checking a part visually to ensure that there are no cosmetic defects. R&R stands for
repeatability and reproducibility. Attribute gage R&R is simply an agreement analysis used to determine the
reliability of the assessments made by different appraisers. The measurement of this agreement can be quantified as
follows: 1) appraisers’ agreement with themselves (repeatability), 2) agreement between appraisers (reproducibility)
and 3) agreement against a known standard (overall accuracy). The first two quantify precision, and the last one
measures accuracy in terms of bias. The equations used for these calculations are shown in Equations (1)-(4).
Popular statistical software packages such as Minitab, Statistica, and Stata, as well as Excel add-ons, provide an easy
platform to analyze attribute data for an MSA study. For our study, Minitab’s Attribute Agreement Analysis statistics
and calculations are used to assess the reliability of an attribute MSA.
1) Appraisers’ agreement with themselves (repeatability)

\[
\text{Accuracy for each appraiser} = 100 \times \frac{\#\text{ of appraisals that matched for the } i\text{th appraiser}}{\#\text{ of appraisals made by the } i\text{th appraiser}} \tag{1}
\]

2) Agreement between appraisers (reproducibility)

\[
\text{Accuracy between appraisers} = 100 \times \frac{\#\text{ of appraisals for which all appraisers' assessments agree with each other}}{\#\text{ inspected}} \tag{2}
\]

3) Against a known standard (and overall accuracy)

\[
\text{Accuracy by standard} = 100 \times \frac{\#\text{ of appraisals that match the } i\text{th standard}}{\#\text{ of appraisals for the } i\text{th standard}} \tag{3}
\]

\[
\text{Overall accuracy} = 100 \times \frac{\#\text{ of appraisals that match the standard value}}{\#\text{ of appraisals}} \tag{4}
\]
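As a rough illustration of Equations (1)-(4), the following Python sketch computes the four percentages from a small, made-up data set. The function names, the data layout (one tuple of trials per item for each appraiser), and the toy ratings are assumptions introduced here and are not part of the study. In particular, an item is counted as "matched" in Equations (1) and (2) only when every relevant trial gives the same category, which is one reasonable reading of the equations rather than a definitive one.

```python
from typing import Dict, List, Sequence

# Hypothetical layout: ratings[appraiser][item] holds that appraiser's
# repeated trials on the item; standard[item] is the known correct category.
Ratings = Dict[str, List[Sequence[int]]]


def within_appraiser(ratings: Ratings, appraiser: str) -> float:
    """Eq. (1): percent of items on which one appraiser's trials all agree."""
    items = ratings[appraiser]
    matched = sum(1 for trials in items if len(set(trials)) == 1)
    return 100.0 * matched / len(items)


def between_appraisers(ratings: Ratings) -> float:
    """Eq. (2): percent of items on which every trial of every appraiser agrees."""
    per_item = list(zip(*ratings.values()))  # regroup the trials item by item
    matched = 0
    for trials_by_appraiser in per_item:
        calls = {call for trials in trials_by_appraiser for call in trials}
        if len(calls) == 1:
            matched += 1
    return 100.0 * matched / len(per_item)


def accuracy_by_standard(ratings: Ratings, standard: Sequence[int], category: int) -> float:
    """Eq. (3): percent of correct appraisals on items whose standard is `category`."""
    calls = [call
             for items in ratings.values()
             for trials, ref in zip(items, standard) if ref == category
             for call in trials]
    return 100.0 * sum(call == category for call in calls) / len(calls)


def overall_accuracy(ratings: Ratings, standard: Sequence[int]) -> float:
    """Eq. (4): percent of all appraisals that match the standard value."""
    pairs = [(call, ref)
             for items in ratings.values()
             for trials, ref in zip(items, standard)
             for call in trials]
    return 100.0 * sum(call == ref for call, ref in pairs) / len(pairs)


# Toy data (not from the study): two appraisers, four items, two trials each.
standard = [1, 2, 2, 1]
ratings = {
    "A": [(1, 1), (2, 2), (2, 1), (1, 1)],
    "B": [(1, 1), (2, 2), (2, 2), (2, 2)],
}

print(within_appraiser(ratings, "A"))
print(between_appraisers(ratings))
print(accuracy_by_standard(ratings, standard, 2))
print(overall_accuracy(ratings, standard))
```

In the toy data, appraiser B is perfectly repeatable yet consistently wrong on the last item, which is the kind of case where repeatability and accuracy against the standard need to be read together.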
Other metrics can also be calculated, such as accuracy by trial, accuracy by appraiser and standard,
misclassification rates, and the most frequently misclassified items. Statistical significance is assessed by calculating
Kappa and Kendall’s statistics. However, a rule of thumb of 90% agreement or higher is also employed in making
decisions.
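The kappa calculation is not spelled out in the text. A minimal sketch, assuming Cohen's kappa for one appraiser against the standard (statistical packages may report a different variant, such as Fleiss' kappa), corrects the observed agreement p_o for the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The function name and the example ratings below are hypothetical.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the two ratings agree.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a.keys() | count_b.keys()) / (n * n)
    if p_e == 1.0:
        return float("nan")  # undefined when both raters use one identical category
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical example: a known standard and one appraiser's calls on 8 items.
standard_calls = [1, 2, 2, 1, 2, 1, 2, 2]
appraiser_calls = [1, 2, 2, 1, 2, 2, 2, 2]
kappa = cohen_kappa(standard_calls, appraiser_calls)
print(round(kappa, 3))
```

Here the raw agreement is 7 of 8 (87.5%), but kappa is about 0.71 because part of that agreement would be expected by chance alone.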
At the conclusion of an attribute MSA, if bias is suspected, the recommended corrective actions are to introduce
operating instructions if none are in use and to review operational definitions, or create them if none exist [17]. A
repeatability problem indicates that appraisers are not clear about the measurement criteria, which calls for reviewing
or updating standard operating procedures. If reproducibility is the issue, unclear procedures, inadequate training,
and/or unclear operational definitions can be the root cause. Most of these corrective actions are inexpensive to
implement and can produce significant gains.
Attribute analysis aims to accomplish the following in the scope of manufacturing [18]:
• To indicate if appraisers inspect based on the same criteria
• To determine if the organization’s standard matches the customer’s standard
• To identify the probability of rejecting a good product and of failing to reject a defective one
• To identify weaknesses, such as processes that need improvement and areas that need training.
Within the scope of analyzing variation in qualitative measurements that might result in medical errors, all of these
objectives can readily be applied to healthcare processes, where appraisers are healthcare workers, standards and
procedures are healthcare service protocols, and defects are medical errors.
2.2. Sources of Attribute Data in Healthcare
To implement attribute MSA in healthcare, it is important to distinguish the source of attribute data in healthcare
measurement.
Human as measurement instrument
Attribute data in healthcare are commonly generated by humans acting as measurement instruments. “Organoleptic
control” refers to measures that use human senses or knowledge, such as visual inspection, touch, sound, scent, and
flavor [19]. In such cases, human senses and cognition are used to measure an object instead of a measurement
instrument [20]. The observed value is usually attribute data [15]. The possibility of considerable variation in
organoleptic control makes it a potential source of medical errors. For instance, assessing the movement of a ratchet
surgical tool (touch) [19], listening to the heartbeat to determine whether it is normal or abnormal (sound), and reading
a radiology image to diagnose a disease (visual inspection) could all vary depending on the appraiser.
Despite all of the technological development of measurement instruments, in healthcare the human factor still plays an
important role in obtaining measurements and, in some cases, is the measurement instrument itself [20].
Decision making
Regardless of the type of measurement instrument (human, device, or human and device), in some cases the output of
the measurement process is used to make a decision. Decisions based on an attribute measurement can result
in a nominal response, for instance a go/no-go decision. In the healthcare industry, diagnosis is one form of
decision-making that is mostly based on measurement (examination).
2.3. Medical Errors
In the previous section, the sources of attribute data in healthcare were discussed. In this section, we
present medical error classifications, since understanding these classifications helps to explain how attribute
data can contribute to medical errors. Human error is defined as “post hoc attribution of causes to an observed
outcome, where the cause refers to a human action or performance characteristic” [21]. One medical error
classification is based on categorizing human errors by the stage of patient care in the delivery of
services. These stages are medication, treatment, clerical and diagnosis [22]. Medication is further broken down into
over-use, under-use, or mis-use by incorrect medication, route, dose or administration. Diagnosis is also
differentiated into three groups: missed diagnosis, wrong diagnosis, and delayed diagnosis. In all four stages of patient
care, attribute data resulting from measurement could be the source of error. Actions that use a human as the
measurement instrument are heavily involved in medication administration and treatment procedures. Diagnosis is a
form of decision making that generates attribute output in most cases, such as whether or not a mammogram shows
dense breast tissue.
The second classification of medical errors is based on a psychological approach. In this classification, medical errors
are divided into two groups: mistakes and skill-based errors. While mistakes are defined as flaws in the treatment plan,
skill-based errors are defined as poor execution of a sound treatment plan. Mistakes consist of two sub-groups:
knowledge-based errors and rule-based errors. Skill-based errors are further divided into two categories:
action-based errors and memory-based errors [23]. Both mistakes and skill-based errors could happen during
measurement. For instance, failure to make an appropriate diagnosis based on examining the patient is a mistake (a
flaw in the treatment plan), and injecting an incorrect dose of medication could be an example of a skill-based error
in terms of organoleptic control.
3. Attribute MSA in Healthcare: A Numerical Example
While a variable measurement system results in continuous data, an attribute measurement system produces categorical
output. In the manufacturing industry, the most common attribute measurements are go/no-go decisions or classification
of the final product into classes such as class I to V (excellent, good, fair, poor, defect). In the healthcare industry,
an example of a go/no-go decision could be a visual examination of a patient to determine whether the patient
needs further care or not. There are also measurements that could produce results in more than two categories, for
instance identifying the number of root canals from a dental x-ray, which could range from one to four root canals.
A measurement system consists of different components that could influence the measurements. These components
are the appraiser, the standard, the instrument, and the measured object. As discussed above, the equations given in
Section 2.1 can be adapted to a healthcare setting, where appraisers are healthcare workers and standards and
procedures are healthcare service protocols. An example is discussed next to illustrate this application.
Diagnosing whether endodontic treatment is needed might sometimes involve subjective judgment [24,25]. Failure
to identify canals and high variation in root canal forms are among the reasons for unsuccessful root canal
treatment [26]. Tooth x-rays are used to identify root canals. The number of root canals to be treated varies from one
to four, depending on the tooth type. As an example, consider a dental clinic with four
residents that wants to ensure that the procedures it follows to identify root canal treatments are effective and
that all its residents are consistently making the same decisions regarding root canal treatments. A simple study
to evaluate the decisions made with respect to root canal treatment (i.e., the measurements) can be designed as
follows. Eight x-ray pictures of different teeth are presented to the residents in a random order, and two
evaluations are performed at different times. The data are shown in Table 2.
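Table 2: Judgments of four residents on 8 x-ray pictures

| X-ray # | Actual Classification | Resident 1, 1st Trial | Resident 1, 2nd Trial | Resident 2, 1st Trial | Resident 2, 2nd Trial | Resident 3, 1st Trial | Resident 3, 2nd Trial | Resident 4, 1st Trial | Resident 4, 2nd Trial |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 3 | 4 | 4 | 4 | 3 | 4 | 4 | 4 | 4 | 4 |
| 4 | 3 | 3 | 3 | 3 | 4 | 3 | 3 | 3 | 3 |
| 5 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 6 | 4 | 4 | 4 | 4 | 4 | 4 | 3 | 3 | 4 |
| 7 | 4 | 4 | 3 | 4 | 3 | 3 | 4 | 4 | 4 |
| 8 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |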
3.1. Consistency in residents’ individual judgments
In this example, residents play the role of the measurement instrument, using visual inspection and knowledge. To
evaluate each resident, Equation (1) can be employed, and the results show whether the residents are consistent
in their judgments (also known as repeatability). Figure 1 presents the confidence intervals for the
percentages within residents. It indicates that Residents 2 and 4 are highly consistent in their classifications.
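The text does not state how the confidence intervals in Figure 1 were computed. As a sketch only, the Wilson score interval below is used as a stand-in (an assumption, not necessarily the interval the software reports) to show how wide a 95% interval becomes when only eight items are appraised. The function name and the example counts are hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(matched: int, total: int, z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score interval, in percent, for a proportion matched/total.

    z is the standard normal quantile; the default corresponds to 95% coverage.
    """
    if total <= 0 or not 0 <= matched <= total:
        raise ValueError("need 0 <= matched <= total and total > 0")
    p = matched / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    lower = max(0.0, centre - half_width)
    upper = min(1.0, centre + half_width)
    return 100.0 * lower, 100.0 * upper


# Hypothetical case: an appraiser whose two trials agree on 7 of 8 items.
low, high = wilson_interval(7, 8)
print(f"{low:.1f}% to {high:.1f}%")
```

For 7 matches out of 8, the interval runs from roughly 53% to 98%, so with this few items a high point estimate alone says little about whether an appraiser meets a 90% agreement threshold.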
Figure 1 Agreement within residents (repeatability)
Figure 2 Residents’ judgment against the standard
3.2. Agreement between residents
To study the agreement between residents in reading the x-ray pictures, Equation (2) can be utilized. This index
indicates whether residents can reproduce each other’s results (known as reproducibility). The results help to
determine whether residents use the same criteria to read x-ray pictures. In this case, all residents reach the
same judgment on only 50 percent of the x-ray pictures, as shown in Figure 3.
Figure 3 Agreement between residents (reproducibility)
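The 50 percent figure can be checked directly against Table 2. The short sketch below types in the eight judgments for each x-ray picture in the order they are listed in the table and counts the pictures on which all of them coincide; the variable and function names are illustrative. Because only full agreement within a row matters, the result does not depend on which column belongs to which resident or trial.

```python
from typing import List

# Judgments from Table 2: one row per x-ray picture (1 to 8), each holding the
# eight judgments (four residents, two trials each) in the order listed.
TABLE_2_JUDGMENTS: List[List[int]] = [
    [3, 3, 3, 3, 3, 3, 3, 3],
    [2, 2, 2, 2, 2, 2, 2, 2],
    [4, 4, 3, 4, 4, 4, 4, 4],
    [3, 3, 3, 4, 3, 3, 3, 3],
    [2, 2, 2, 2, 2, 2, 2, 2],
    [4, 4, 4, 4, 4, 3, 3, 4],
    [4, 3, 4, 3, 3, 4, 4, 4],
    [1, 1, 1, 1, 1, 1, 1, 1],
]


def percent_all_agree(rows: List[List[int]]) -> float:
    """Percent of items on which every judgment in the row is identical."""
    agreed = sum(1 for row in rows if len(set(row)) == 1)
    return 100.0 * agreed / len(rows)


agreement = percent_all_agree(TABLE_2_JUDGMENTS)
print(f"Agreement between residents: {agreement:.1f}%")  # prints 50.0%
```

Full agreement occurs on x-ray pictures 1, 2, 5 and 8, that is, on 4 of the 8 pictures.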
3.3. Overall accuracy
Using Equation (4), overall accuracy can be calculated. This index helps to assess the clinic’s overall performance
in correctly diagnosing root canal treatments. Although the repeatability index shows that Resident 2 is highly
consistent in his/her judgment, the judgment was correct only 75% of the time, as shown in Figure 2. This result
indicates that Resident 2 needs additional training.
The output of most diagnoses is attribute data; therefore, the proposed guidelines are applicable to almost all fields
in medicine. Furthermore, attribute analysis can be employed in healthcare delivery processes. For instance,
prescription errors related to incorrect administration of drugs, incorrect dose, or errors in paperwork can be studied
using attribute analysis.
4. Conclusions and Recommendations
Medical errors draw a lot of attention, and numerous studies have been conducted to reduce the risk of their
occurrence. In this paper, the application of attribute MSA in the healthcare industry as a tool to deal with medical
errors has been studied. Based on different classifications of medical errors, we showed that medical errors could
be a result of errors in attribute measurements. In the classification by stage of healthcare delivery, errors could be
related to medication, treatment procedure, clerical procedure, and diagnosis, and all could stem from variation in
measurement. From a psychological perspective, errors in measurements could be linked to mistakes and skill-based
errors. A large portion of these errors occurs when a human plays the role of the measurement instrument. Attribute
MSA, in such cases, is an efficient method to evaluate healthcare standards and workers and to determine the
likelihood of medical error occurrence. To implement attribute MSA in healthcare, the first step is to identify the
sources of attribute data and establish regular plans to analyze the data. Total accuracy, health worker accuracy, and
standard accuracy are useful indices from which the healthcare industry can benefit. While the total accuracy index
presents the overall accuracy of the measurement system, the standard accuracy index helps to study healthcare
standards, instructions, and criteria, and the health worker index helps to evaluate the accuracy of the health
provider’s diagnoses, judgments and measurements. The latter, however, could pose a barrier to the implementation of
attribute analysis in healthcare. Therefore, the objectives of this methodology need to be communicated in the
organization before its implementation. Another issue to consider is that in the healthcare industry, unlike in the
manufacturing industry, judgments might be subjective depending on the context, and different or no standards may
apply, which would limit the use of the method.
References
1. McMains, V., 2016, “Medical Errors Are the Cause 250,000 U.S. Deaths a Year,” The Louisiana Weekly, 90(37).
2. Van Den Bos, J., et al, 2011, “The $17.1 Billion Problem: The Annual Cost Of Measurable Medical Errors,”
Health Affairs, 30(4), 596–603.
3. GeneralCologneRe, 2002, “Impending Changes in the European Health Care Sector and the Effect on Risk
Management and Malpractice Insurance”, Insurance Issues Europe, 1–8.
4. Reid, P.P., et al, 2005, “Building a Better Delivery System: A New Engineering/Health Care Partnership,”
Washington, D.C.: National Academies Press.
5. Meehan, J., et al, 2015, “A Process-Driven Simulation-Based Approach for Hospital Laboratory Redesign,”
Proceedings of the 2015 Industrial and Systems Engineering Research Conference, May 30-June 2, Nashville,
Tennessee, 3091-3090.
6. Allen, T.T., et al, 2010, “Improving the Hospital Discharge Process with Six Sigma Methods,” Quality
Engineering, 22, 13–20.
7. Lighter, D.E., and Tylkowski, C.M., 2004, “Case Study: Using Control Charts to Track Physician Productivity,”
The Physician Executive, 53–58.
8. Hoffmann, E.M., and Medeiros, F.A., 2006, “Repeatability and Reproducibility of Optic Nerve Head
Topography Using the Retinal Thickness Analyzer,” Graefes Arch Clin Exp Ophthalmol, (244), 192–98.
9. Eperjesi, F., Maana, A., and Hannah, B., 2012, “Reproducibility and Repeatability of the OcuSense TearLab
Osmometer,” Graefes Arch Clin Exp Ophthalmol, (250), 1201-5.
10. Mcalinden, C., Jyoti, K., and Konrad, P., 2011, “Statistical Methods for Conducting Agreement and Precision
Studies in Optometry and Ophthalmology,” Ophthalmic & Physiological Optics, 31, 330–338.
11. Peyman, M., and Lai, Y., 2014, “Accutome PachPen Handheld Ultrasonic Pachymeter: Intraobserver
Repeatability and Interobserver Reproducibility by Personnel of Different Training Grades,” International
Ophthalmology, 35, 651–55.
12. Dalalah, D., and Diabat, A., 2015, “Repeatability and Reproducibility in Med Labs: A Procedure to
Measurement System Analysis,” IET Sci. Meas. Technol, 9(7), 826–35.
13. Wang, Q. et al, 2015, “Anterior Chamber Depth Measurements Using Scheimpflug Imaging and Optical
Coherence Tomography: Repeatability, Reproducibility, and Agreement,” Journal of Cataract & Refractive
Surgery, 41(1), 178–85.
14. Lackner, B. et al, 2005, “Repeatability and Reproducibility of Central Corneal Thickness Measurement With
Pentacam, Orbscan, and Ultrasound,” Optometry and Vision Science, 82(10), 892–899.
15. Pendrill, L., and Peterson, N. 2016, “Metrology of Human-Based and Other Qualitative Measurements,”
Measurement Science and Technology, 27.
16. Brown, T.W., et al, 2010, “An Epidemiologic Study of Closed Emergency Department Malpractice Claims in a
National Database of Physician Malpractice Insurers,” Academic Emergency Medicine, 17(5), 553–560.
17. Stamatis D.H., 2015, “Quality Assurance: Applying Methodologies for Launching New Products, Services, and
Customer Satisfaction,” CRC Press, Boca Raton Fl.
18. Simion, C., 2016, “Evaluation of an Attributive Measurement System in the Automotive Industry,” In IOP Conf.
Series: Materials Science and Engineering.
19. Magdalena, D., and Kujawi, A., 2014, “Human Aspects of the Measurement System Analysis,” Proceedings of
the 5th International Conference on Applied Human Factors and Ergonomics AHFE, 19–23.
20. Pendrill, L., 2014, “Man as a Measurement Instrument,” NCSLI Measure J.
Meas. Sci., 9(4), 24–35.
21. Hollnagel, E., 1998, “Cognitive reliability and error analysis method: CREAM”, 1st Edition, Elsevier, Oxford,
New York.
22. Kopec, D., et al, 2003, “Human Errors in Medical Practice: Systematic Classification and Reduction With
Automated Information Systems,” Journal of Medical Systems, 27(4), 297–314.
23. Ferner, R.E., and Aronson, J.K., 2006, “Clarification of Terminology in Medication Errors,” Drug Safety,
29(11), 1011–22.
24. Glickman G.N., and Schweitzer J.L., 2013, “Endodontic Diagnosis,” Endodontics: Colleagues for Excellence.
http://www.aae.org/publications-and-research/endodontics-colleagues-for-excellence-newsletter/endodontic-
diagnosis.aspx, Retrieved on 1/15/2017.
25. Petrino, J.A., “Endodontic Diagnosis: How Lesions Can Cloud Determination of Root Canal Treatment,”
http://www.dentistryiq.com, Retrieved on 1/15/2017.
26. Sun, Y., et al, 2016, “The best radiographic method for determining root canal morphology in mandibular first
premolars: A study of Chinese descendants in Taiwan,” Journal of Dental Sciences, 11(2), 175-181.