All content in this area was uploaded by Larysa A. Kosheva on Feb 21, 2020
Assessment of the measurement method precision in
interlaboratory test by using the robust “Algorithm S”
Eugenij Volodarsky1, Zygmunt Warsza2, Larysa Kosheva3, and Adam Idzkowski4
1 Science National Technical University of Ukraine “KPI”, Department of Automation of
Experimental Studies, Kiev, Ukraine
vet1@ukr.net
2 Industrial Research Institute of Automation and Measurement (PIAP), Warsaw, Poland
zlw@op.pl
3 Department of Biocybernetics and Aerospace Medicine, Kiev, Ukraine
l.kosh@ukr.net
4 Bialystok University of Technology, Faculty of Electrical Engineering, Bialystok, Poland
a.idzkowski@pb.edu.pl
Abstract. The application of robust statistical methods to assess the precision
(uncertainty) of the results of interlaboratory testing of a measurement method is
presented. Such results may include outliers. The usual rejection of these data
reduces the reliability of the evaluation, especially for small samples. Robust
methods, in contrast, take all the sample data into consideration, including outliers.
The use of two robust methods is provided for in the international rules for
comparative studies in accredited laboratories. However, instructions on how to
apply them in practice are lacking. An evaluation of the precision of results obtained
with the same method for homogeneous objects in nine laboratories is presented.
In the traditional calculation, the estimate of the standard deviation without
rejection of the outlier was about 1.5 times higher than the estimate with rejection.
With the robust method “Algorithm S”, the obtained value lay between these two
estimates and was more reliable.
Keywords: robust statistics · outliers · measurement uncertainty ·
interlaboratory comparisons.
© Springer International Publishing Switzerland 2016
R. Jabłoński and T. Brezina (eds.), Advanced Mechatronics Solutions,
Advances in Intelligent Systems and Computing 393,
DOI 10.1007/978-3-319-23923-1_13
1 Introduction
During the certification and verification of test methods and their procedures, the
dependence of the accuracy on the changing conditions of measurement and on the
specific organization of the experiment in each laboratory must be taken into
account. A solution is to carry out the same measurements of homogeneous objects in
several accredited laboratories and to calculate the mean precision of all results.
These interlaboratory comparisons are an experimental implementation of a physical
model by specific test procedures under certain conditions. This model is created on
the basis of the measurement results obtained in laboratories of a similar essential
level of competence that specialize in a particular type of testing. The measurement
result is represented by the following statistical model [1]
$$ y = m_y + B + e \tag{1} $$
where:
$m_y = \mu + \delta$: mean value of the measurement results from all laboratories
(with $\mu$ the accepted reference value);
$\delta$: trueness component of the result, i.e. the shift of the mean value (bias) due to
the imperfections of the test procedure;
$B$: laboratory component of the result (under reproducibility conditions);
$e$: random measurement error component (under repeatability conditions).
The relationship between the parameters of model (1) is shown in Fig. 1. This
model formalizes the test procedure. The reproducibility variance $\sigma_R^2$ represents
the results of the interlaboratory test, conducted with the controlled measuring
method according to certified procedures. It is the sum of the between-laboratory
variance and the repeatability variance, i.e.
$$ \sigma_R^2 = \sigma_L^2 + \sigma_r^2 \tag{2} $$
The component $\sigma_L^2$ is the between-laboratory variance, which describes the dispersion
of the measurement results for homogeneous objects across individual laboratories
when the same measuring procedure is used. This spread results from the permissible
differences in the organization of the laboratory process. The repeatability variance
$\sigma_r^2$ is the component that expresses the scatter of results under repeatability
conditions. It also describes the average impact of changes in the quantities that
randomly interact within the limits allowed by the applicable standards.
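The decomposition in equation (2) can be illustrated with a minimal Python sketch. It assumes a balanced design (the same number of replicates in every laboratory) and the usual moment estimators for that design, which the text does not spell out; the function name `precision_components` and the sample numbers are hypothetical and serve only as an illustration.

```python
from statistics import mean, variance

def precision_components(results):
    """Estimate (var_r, var_L, var_R) from per-laboratory replicate lists.

    ASSUMPTION: balanced design, i.e. every laboratory reports the same
    number n of replicates (these estimators are not given in the text).
    """
    n = len(results[0])
    lab_means = [mean(lab) for lab in results]
    # Repeatability variance: average of the within-laboratory variances.
    var_r = mean(variance(lab) for lab in results)
    # Between-laboratory variance: variance of the laboratory means minus
    # the part caused by repeatability; negative estimates are set to zero.
    var_L = max(variance(lab_means) - var_r / n, 0.0)
    # Reproducibility variance, equation (2).
    var_R = var_L + var_r
    return var_r, var_L, var_R

# Hypothetical duplicate results from four laboratories (illustration only).
data = [[10.1, 10.3], [9.8, 9.9], [10.6, 10.4], [10.0, 10.2]]
var_r, var_L, var_R = precision_components(data)
print(f"var_r = {var_r:.4f}, var_L = {var_L:.4f}, var_R = {var_R:.4f}")
```

Setting a negative between-laboratory estimate to zero is a common convention; other treatments are possible.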
Usually, statistical data analysis is based on the assumption that the scatter of the
data is normally distributed. This assumption is also the basis for decisions in
statistical inference. In practice, however, a significant percentage of measurement
results can include outliers. This concerns in particular datasets with a small
number of samples.
Fig. 1. Basic statistical model of the measurement result [1]
The reasons for outlier values in datasets are: failure of measuring instruments,
non-compliance with the principles of an experiment, errors in the estimation of
results, and the impact of external factors. Rejecting these data in the calculations
can significantly affect the accuracy and reliability of the statistical evaluation of
research precision. The classic parametric evaluation of experimental results based
on the normal distribution, as well as the theory of statistical inference, is firmly
established in practice. Abandoning this approach would be inappropriate. Thus, a need
to adapt the “old” model to the new challenges has emerged. This can be achieved by
developing estimation methods which, under certain conditions, include the “data
outliers” or allow the parameters of the results to be assessed adequately on the
basis of the acquired data. Several such methods, called robust, were developed by
Tukey, Huber and others [2], [4], [7]. Some of them are applied in accredited
laboratory practice and in interlaboratory comparisons [1, part 5], [5-7].
2 Rules for assessing data scatter using robust methods
In practice, the most commonly used statistical analysis procedures assume a normal
distribution of data. However, they are quite sensitive to minor deviations from this
distribution, which applies particularly to the estimation of variance. In fact, we
often have to deal with distributions that differ from the ideal normal distribution.
For this reason, robust methods are increasingly used in the processing of
experimental data [3-5]. The term robust means that the determined parameters are
insensitive to various deviations in the data samples and to heterogeneity in the
scatter of their elements, both of which arise for unknown reasons.
The basic model used in the robust approach is not a single normal distribution but
a mixture. Different samples from the central part of the actual scatter of data can
be modeled by the same normal distribution. The lower parts of the real distribution
curve (the tail areas) are less stable and more stretched than the tails of the normal
distribution that fits the central area. Measurement observations in the tail areas
are less common. Some of them, especially for small samples, can be detected as
outliers or pseudo-outliers relative to the central area of the distribution. This
approach preserves the traditionally accepted assumption of homogeneity of the
general population as a basis for statistical evaluations. Some deviations from the
normal distribution of the central area are permitted in the tail areas of the real
distribution. However, some limitations are assumed for the tail areas: they are
modeled by a normal distribution with other parameters, or by other statistics.
The approach proposed by Tukey is often used [2]. He assumed a large number $n$ of
measurement data consisting of randomly mixed “good” and “bad” observations $x_i$
from a population with mean value $\mu$, occurring with probabilities $(1-\varepsilon)$ and
$\varepsilon$, respectively, where $\varepsilon$ is a small number. The two types of observations $x_i$
have different normal distributions, $N(\mu, \sigma^2)$ for the first and $N(\mu, 9\sigma^2)$ for
the second, but the same mean value $\mu$ (Fig. 2). The standard deviation of the “bad”
observations is 3 times higher than that of the “good” ones. Assuming that all values
$x_i$ are independent, the joint distribution can be expressed as
$$ F(x) = (1-\varepsilon)\,\Phi\!\left(\frac{x-\mu}{\sigma}\right) + \varepsilon\,\Phi\!\left(\frac{x-\mu}{3\sigma}\right) \tag{3} $$
where
$$ \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\,dy . $$
Fig. 2. Joint distribution $F(x) = (1-\varepsilon)\,N(\mu, \sigma^2) + \varepsilon\,N(\mu, 9\sigma^2)$ for $\varepsilon = 0.2$.
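The mixture in equation (3) is easy to evaluate numerically. The following Python sketch uses only the standard library; the function names `phi`, `tukey_cdf` and `mixture_std` are hypothetical, and the default parameters ($\mu = 0$, $\sigma = 1$, $\varepsilon = 0.2$) are chosen to match Fig. 2.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def tukey_cdf(x, mu=0.0, sigma=1.0, eps=0.2):
    """Contaminated-normal cdf of equation (3)."""
    good = phi((x - mu) / sigma)          # "good" observations, N(mu, sigma^2)
    bad = phi((x - mu) / (3.0 * sigma))   # "bad" observations, N(mu, 9 sigma^2)
    return (1.0 - eps) * good + eps * bad

def mixture_std(sigma=1.0, eps=0.2):
    """Standard deviation of the mixture; both components share the mean,
    so the variances add with the mixture weights."""
    return sigma * sqrt((1.0 - eps) + 9.0 * eps)

# Probability of an observation farther than 3 sigma from the mean.
tail_mixture = 2.0 * (1.0 - tukey_cdf(3.0))
tail_normal = 2.0 * (1.0 - phi(3.0))
print(f"mixture std = {mixture_std():.3f}")
print(f"P(|x - mu| > 3 sigma): mixture {tail_mixture:.4f}, pure normal {tail_normal:.4f}")
```

Even a moderate share of “bad” observations inflates both the standard deviation and the tail probability, which is why the classical variance estimate reacts so strongly to contamination.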
Among the robust methods and algorithms, the approach of Huber [4] is widespread
and is currently regarded as classical. He introduced a value $k$ that depends on the
degree of “contamination” of the general population. It defines the boundaries of the
central area of the histogram of the measurement data, i.e. the difference between
the upper and lower quartiles, which is modeled by the normal distribution (Fig. 3)
[6], [10].
Fig. 3. The interquartile range (IQR) of the probability density function (pdf) of the normal
distribution $N(\mu, \sigma^2)$, used for the central part of sample data with outliers.
Observations are less common in the lateral areas and, according to one of the
criteria, they can be considered outliers. In the IRLS (iteratively reweighted least
squares) method, extreme observations are subject to winsorizing, i.e. they are pulled
to the borders of the central area. This changes the mean value and standard deviation
of the new set of observations and narrows the central area. Therefore, the adjustment
of the extreme data has to be repeated. The process is iterated until the changes
become negligible.
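The iterative winsorizing described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the starting values (median and scaled median absolute deviation) and the constants 1.483, 1.5 and 1.134 are borrowed from Algorithm A of ISO 13528 and are not given in the text; the function name `winsorized_mean_std` and the sample data are hypothetical.

```python
from statistics import mean, median, stdev

def winsorized_mean_std(values, tol=1e-6, max_iter=100):
    """Iteratively winsorized estimates of location and scale.

    ASSUMPTION: constants 1.483, 1.5 and 1.134 follow Algorithm A of
    ISO 13528; the text itself does not specify them.
    """
    x_star = median(values)
    s_star = 1.483 * median(abs(v - x_star) for v in values)
    for _ in range(max_iter):
        delta = 1.5 * s_star
        low, high = x_star - delta, x_star + delta
        # Pull extreme observations to the borders of the central area.
        clipped = [min(max(v, low), high) for v in values]
        new_x = mean(clipped)
        new_s = 1.134 * stdev(clipped)
        converged = abs(new_x - x_star) < tol and abs(new_s - s_star) < tol
        x_star, s_star = new_x, new_s
        if converged:
            break
    return x_star, s_star

# Hypothetical sample with one gross outlier (illustration only).
sample = [10.0, 10.2, 9.9, 10.1, 10.3, 9.8, 10.0, 14.0]
robust_mean, robust_std = winsorized_mean_std(sample)
print(f"classical: mean = {mean(sample):.3f}, std = {stdev(sample):.3f}")
print(f"robust:    mean = {robust_mean:.3f}, std = {robust_std:.3f}")
```

The outlier is kept in the calculation but its influence is bounded, so the robust estimates stay close to the bulk of the data.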
The application of this robust method to assess the result obtained with a
measurement method, and to proficiency testing of laboratories with small data
samples in the presence of outliers, was presented in [10]. The difference between the
average values determined in the interlaboratory study is used to assess the
reproducibility of the result. The robust algorithms applied in these works rely on the
high stability of the interquartile range (Fig. 3), with “pollution” reaching up to 50%.
3 Robust analysis “Algorithm S”
The aim of the interlaboratory comparison study is to estimate and standardize the
variance that describes the repeatability of a particular method, based on the results
obtained with it in several accredited laboratories. Therefore, it is necessary to
determine a joint probability distribution of the variances obtained by the individual
laboratories participating in this joint experiment.
For tests conducted with this method, the impact of possible combinations of changes
in conditions (within acceptable limits) may be taken into consideration. In practice,
because of various limitations (e.g. the cost or duration of the experiment,
destructive testing), separate estimates often have to be made on the basis of only a
small number of measurement observations. Normally, their values are asymmetrically
distributed and may differ significantly from a Gaussian distribution. According to
Cochran’s C test, some of these observations would be regarded as outliers and should
therefore be removed from the statistical processing. Such an approach would be
acceptable if the average value were sought.
However, the goal of the interlaboratory experiment discussed here is to assess the
acceptable scatter of results from the laboratories on the basis of the obtained
experimental data. The assessment is used to standardize the repeatability of the
testing procedures performed with the method examined in this experiment. Robust
methods, being based on all the available experimental data, give a more reliable
estimate of the actual statistical dispersion of results. To obtain a stable estimate
of the repeatability variance (i.e. precision), the most suitable method is the robust
“Algorithm S” [1, part 5], [7].
The condition for implementing this algorithm is that the bias of the robust estimate
of the standard deviation of results from the laboratories should be equal to zero.
For real experimental data, at each $j$-th step of the iteration this estimate moves
closer to the standard deviation $\sigma$ of the normal distribution. An adjustment factor
$\xi$ is introduced to account for the shift of the variance. The following condition
should be satisfied
$$ E\{(s^*)^2\} = \frac{\sigma^2}{\xi^2} \tag{4} $$
The robust standard deviation $s^*$ should be stable with some probability, i.e. it
should lie within specified limits. Therefore, its maximum deviation is limited to
$\eta\sigma$ for the assumed distribution
$$ P\{s^* > \eta\sigma\} = \alpha \tag{5} $$
where: $\sigma$ is the standard deviation of a normally distributed population which
corresponds to the experimental data, assuming their “pure” normal distribution;
$\eta$ is the limit factor, which depends on the number of data in the sample;
$P = (1-\alpha)$ is the probability that the robust standard deviation $s^*$ stays within
the accepted limit for the expected normal distribution.
The values of the adjustment factor $\xi$ and the limit factor $\eta$ are usually determined
for $\alpha = 0.1$. This is done by intersecting the cumulative curves of unimodal
distributions near the point where the probability equals 0.9. This approach should
be examined analytically and its effectiveness should be assessed. The factor $\eta$
corresponds to the upper $(1-\alpha)\cdot 100\%$ point of the distribution describing the
scatter of the robust standard deviation $s^*$. The standard deviation of this
distribution may be used to assess the scatter. For a sample of $n$ elements it depends
on the number of degrees of freedom $\nu = n - 1$. This is taken into account by
multiplying both sides of equation (4) by $\nu$ and dividing them by $\sigma^2$
$$ E\left\{ \nu \left( \frac{s^*}{\sigma} \right)^{2} \right\} = \frac{\nu}{\xi^2} $$
or
$$ E\{\chi^2_{\nu,\,\mathrm{robust}}\} = \frac{\nu}{\xi^2} \tag{6} $$
According to (5), the probability $P$ that the $\chi^2$ variable exceeds its upper limit
is equal to
$$ P\{\chi^2_{\nu,\,\mathrm{robust}} > \nu\eta^2\} = \alpha \tag{7} $$
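The tail area of the $\chi^2$ distribution, which contains $\alpha \cdot 100\%$ of the values of the
random variable, can be approximated by a uniform distribution with density $\nu\eta^2$
$$ \int_{\nu\eta^2}^{\infty} x\, p_{\chi^2_\nu}(x)\, dx \approx \alpha\,\nu\eta^2 \tag{8} $$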
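From tables of the Pearson ($\chi^2$) distribution [8], [9], the value $\chi^2_{\nu,\,1-\alpha}$, i.e. the
quantile for the probability $P = 1-\alpha$, can be found, and then the limit factor $\eta$
for which condition (4) holds
$$ \eta^2 = \frac{\chi^2_{\nu,\,1-\alpha}}{\nu}, \qquad \alpha = 0.1 \tag{9} $$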
Starting from the relation $P\{\chi^2_{z} \le \nu\eta^2\} = 1-\alpha$ for the main part of the
distribution, the value $z$ corresponding to the probability $P$ can be found from the
tables, and then
$$ \xi = \frac{1}{\sqrt{z + 0.1\,\eta^2}} \tag{10} $$
This is the adjustment factor $\xi$ for the selected limit factor $\eta$, which ensures that
the robust estimate is not shifted.
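For a single degree of freedom ($\nu = 1$) both factors can be checked with the Python standard library alone. The sketch below rests on two assumptions that are not stated in the text: the $\chi^2$ quantile for $\nu = 1$ is obtained as a squared normal quantile, and $z$ in equation (10) is taken to be the $\chi^2$ cumulative probability with $\nu + 2 = 3$ degrees of freedom evaluated at $\nu\eta^2$. All function names are hypothetical. Under these assumptions the result agrees with the values $\eta = 1.645$ and $\xi = 1.097$ quoted for $\nu = 1$ in the worked example.

```python
from math import erf, exp, pi, sqrt
from statistics import NormalDist

ALPHA = 0.1  # tail probability used in the text

def limit_factor_nu1(alpha=ALPHA):
    """Limit factor eta for nu = 1, following equation (9)."""
    nu = 1
    # A chi-square variable with 1 degree of freedom is a squared standard
    # normal variable, so its (1 - alpha) quantile equals the squared
    # normal quantile at 1 - alpha / 2.
    chi2_quantile = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2
    return sqrt(chi2_quantile / nu)

def chi2_cdf_3dof(x):
    """Closed-form cdf of the chi-square distribution with 3 degrees of freedom."""
    return erf(sqrt(x / 2.0)) - sqrt(2.0 * x / pi) * exp(-x / 2.0)

def adjustment_factor_nu1(eta, alpha=ALPHA):
    """Adjustment factor xi for nu = 1, following equation (10)."""
    nu = 1
    # ASSUMPTION: z is the chi-square cdf with nu + 2 = 3 degrees of freedom
    # evaluated at nu * eta**2 (not stated explicitly in the text).
    z = chi2_cdf_3dof(nu * eta ** 2)
    return 1.0 / sqrt(z + alpha * eta ** 2)

eta = limit_factor_nu1()
xi = adjustment_factor_nu1(eta)
print(f"eta = {eta:.3f}, xi = {xi:.3f}")
```

For other numbers of degrees of freedom a general $\chi^2$ quantile and distribution function would be needed, for example from statistical tables.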
Let $s_j^*$ denote the robust standard deviation calculated at the $j$-th step of the
iteration. In the iterative calculation, the limit value corresponding to $s_j^*$ is
updated as follows
$$ \psi_j = \eta\, s_j^* \tag{11} $$
From the ordered series of variances of results from the laboratories participating
in the experiment, the median is selected as the initial estimate of the squared
standard deviation of the assumed normal population
$$ s_0^{*2} = \mathrm{Me}\{ s_i^{*2} \} \tag{12} $$
where $i = 1, \ldots, n$ is the number of the laboratory in the ordered series.
Then the laboratory standard deviations are modified according to the formula
$$ s_{ij}^* = \begin{cases} \psi_j & \text{when } s_i > \psi_j \\ s_i & \text{in other cases} \end{cases}, \qquad j = 0, 1, \ldots \tag{13} $$
On the basis of the value $\psi_j$ found in the current step, the deviations $s_{ij}^*$ in the
dataset are modified and the new value is calculated from
$$ s_{j+1}^* = \xi \sqrt{ \frac{1}{n} \sum_{i=1}^{n} (s_{ij}^*)^2 } \tag{14} $$
where $s_{ij}^*$ is the robust standard deviation at the $j$-th step of the iteration for the
$i$-th laboratory participating in the joint experiment ($n$ is the total number of
laboratories). The robust estimate $s_{j+1}^*$ is used to establish a new limit $\psi_{j+1}$.
The iterative procedure is continued until all standard deviations of the laboratories
involved in the experiment converge within the current limit.
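Equations (11) to (14) translate into a short loop. The Python sketch below is one possible reading, not a definitive implementation: the function name `algorithm_s`, the stopping rule based on the relative change of the estimate, and the sample data are assumptions made for illustration. In each step the limit is applied to the original laboratory values, as equation (13) suggests.

```python
from math import sqrt
from statistics import median

def algorithm_s(lab_sd, eta, xi, rel_tol=1e-6, max_iter=200):
    """Robust pooled standard deviation following equations (11)-(14).

    ASSUMPTION: iteration stops when the relative change of the estimate
    falls below rel_tol; the text only asks for convergence.
    """
    n = len(lab_sd)
    # Equation (12): start from the median of the squared deviations.
    s_star = sqrt(median(s * s for s in lab_sd))
    for _ in range(max_iter):
        psi = eta * s_star                                   # equation (11)
        clipped = [min(s, psi) for s in lab_sd]              # equation (13)
        new_s = xi * sqrt(sum(c * c for c in clipped) / n)   # equation (14)
        if abs(new_s - s_star) <= rel_tol * new_s:
            return new_s
        s_star = new_s
    return s_star

# Hypothetical laboratory standard deviations (illustration only),
# with the factors for one degree of freedom.
lab_sd = [0.20, 0.30, 0.25, 0.90, 0.28]
robust_sd = algorithm_s(lab_sd, eta=1.645, xi=1.097)
classical_sd = sqrt(sum(s * s for s in lab_sd) / len(lab_sd))
print(f"classical pooled sd = {classical_sd:.3f}, robust sd = {robust_sd:.3f}")
```

The large value 0.90 still contributes to the result, but only up to the current limit, so it cannot dominate the pooled estimate.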
4 Example of using “Algorithm S”
Nine laboratories with extensive experience in this type of research were selected for
the experiment. In each of them, two homogeneous physical objects were examined.
The absolute differences of the results in the $i$-th laboratory were
$$ w_i = |x_{i1} - x_{i2}|, \qquad i = 1, \ldots, n $$
where $x_{i1}$, $x_{i2}$ are the results of the two experiments in the $i$-th laboratory.
The range (standard deviation) values $w_i$ for all ($n = 9$) laboratories were as follows:
$w_1 = 0.28$; $w_2 = 0.49$; $w_3 = 0.40$; $w_4 = 0.00$; $w_5 = 0.35$; $w_6 = 1.98$; $w_7 = 0.80$;
$w_8 = 0.32$; $w_9 = 0.95$.
The variance (squared standard deviation) estimated from the difference of the two
results from the $i$-th laboratory was $s_i^2 = \frac{1}{2}(x_{i1} - x_{i2})^2$. The repeatability is
assessed from $\sum_{i=1}^{n} w_i^2$. The root mean squared range was equal to
$$ w_0 = \sqrt{ \frac{1}{9} \sum_{i=1}^{9} w_i^2 } = 0.827 . $$
Analyzing the absolute values of the differences $w_i$, it can be noticed that the value
$w_6 = 1.98$ differs significantly from the others. The hypothesis that the result of
laboratory no. 6 ($w_6 = 1.98$) is a statistical outlier was tested using Cochran’s C
test [8], [9]:
$$ G_p = \frac{1.98^2}{6.1663} = 0.636 . $$
From the table of this distribution [1, part 2, table 4], [8], [9], the critical values
are $G_{kr}(5\%) = 0.638$ and $G_{kr}(1\%) = 0.754$. Thus $G_p$ for $w_6$ lies just below the lower
limit of this range and $w_6$ should be treated as a quasi-outlier. According to the
rules of the traditional approach, the value $w_6 = 1.98$ should be omitted in further
data processing. Then, for $n = 8$, the “more precise” standard deviation $w_0' = 0.530$
is obtained. It is much smaller ($w_0' \approx 0.64\, w_0$) than the value $w_0$ obtained when all
data ($n = 9$) are considered in the calculations. These two estimates show clearly that
excluding only one difference from the source data, one lying close to the line that
separates outliers, has a significant impact on the outcome of the analysis. It
influences the assessment of the standard deviation of the scatter of the measurement
procedure.
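The conventional estimates discussed above can be reproduced with a few lines of Python. The sketch uses the nine differences $w_i$ listed in the text; the variable and function names are hypothetical. The printed values agree with the quoted figures to within rounding.

```python
from math import sqrt

# Absolute differences w_i reported by the nine laboratories.
w = [0.28, 0.49, 0.40, 0.00, 0.35, 1.98, 0.80, 0.32, 0.95]

def root_mean_square(values):
    """Root mean squared range of a set of differences."""
    return sqrt(sum(v * v for v in values) / len(values))

# Estimate from all nine laboratories.
w0 = root_mean_square(w)

# Cochran's C statistic: largest squared range over the sum of squared ranges.
sum_sq = sum(v * v for v in w)
cochran_c = max(w) ** 2 / sum_sq

# Estimate after rejecting the largest difference (laboratory no. 6).
w_reduced = [v for v in w if v != max(w)]
w0_reduced = root_mean_square(w_reduced)

print(f"sum of squares = {sum_sq:.4f}")
print(f"w0 (n = 9)     = {w0:.4f}")
print(f"Cochran C      = {cochran_c:.4f}")
print(f"w0' (n = 8)    = {w0_reduced:.4f}")
print(f"w0' / w0       = {w0_reduced / w0:.3f}")
```

Removing a single value changes the pooled estimate by more than a third, which illustrates how sensitive the conventional procedure is to the outlier decision.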
Next, one of the robust methods is considered. As already mentioned, these methods
use all the experimental data, including outliers, but the data are modified. The
robust method “Algorithm S” [1, part 5], [7] allows the precision of the controlled
interlaboratory measurement procedure to be evaluated on the basis of the results
obtained in all ($n = 9$) laboratories. The basic relationships for its implementation
were given above. The number of degrees of freedom is $\nu = 1$. The values of the
adjustment and limit factors, according to (10) and (9), are $\xi = 1.097$ and
$\eta = 1.645$, respectively. The iterative procedure is presented below.
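In the first step of the iteration, the limit value
$$ \psi_1 = \eta\, w_5^* = 1.645 \cdot 0.40 = 0.658 \approx 0.66 $$
is determined, where $w_5^* = 0.40$ is the median, i.e. the fifth value in the ordered
series of differences. From the raw data, the values $w_{07}^*, w_{08}^*, w_{09}^*$ of the ordered
series should be modified because they are greater than $\psi_1$. A new set of differences
$w_{i1}^*$ is obtained. In the first step of the iteration,
$$ w_1^* = \xi \sqrt{ \frac{1}{9} \sum_{i=1}^{9} (w_{i1}^*)^2 } = 1.097 \cdot 0.47 = 0.52 $$
is calculated on the basis of the obtained values. This gives a “new” limiting value
$\psi_2 = 1.645 \cdot 0.52 \approx 0.86$, and so on.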
In the fourth step, the robust value $w_4^* = 0.68$ is obtained, which differs from
$w_3^* = 0.66$ by $\frac{0.02}{0.66} \cdot 100 \approx 3\%$. Therefore $w^* = 0.68$ can be taken as the final
result. This value lies between the two estimates calculated conventionally, i.e. the
value $w_0 = 0.827$ of the precision results for all 9 laboratories and $w_0' = 0.530$ for
only 8 results, after rejection of the value recognized as a quasi-outlier. Finally,
the common robust standard deviation should be taken as
$$ s_r = \frac{1}{\sqrt{2}}\, w^* = \frac{1}{\sqrt{2}} \cdot 0.68 = 0.48 . $$
This robust estimate of the precision of the measurement method tested in this
interlaboratory experiment was obtained from the results in all 9 laboratories. It is
more reliable than the traditional one, which is based on the results from 8
laboratories, i.e. after rejection of the outlier.
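The four iteration steps of this example can be retraced with the following Python sketch. It uses the nine differences and the factors $\eta = 1.645$, $\xi = 1.097$ from the text; the variable names are hypothetical, and it is assumed that in every step the limit is applied to the original differences and that unrounded intermediate values are carried forward, so individual intermediate figures may differ from the quoted ones in the last digit.

```python
from math import sqrt
from statistics import median

# Absolute differences w_i reported by the nine laboratories.
w = [0.28, 0.49, 0.40, 0.00, 0.35, 1.98, 0.80, 0.32, 0.95]
ETA, XI = 1.645, 1.097  # limit and adjustment factors for nu = 1

w_star = median(w)      # initial value: the fifth value of the ordered series
trace = []
for step in range(1, 5):
    psi = ETA * w_star                                   # current limit
    clipped = [min(v, psi) for v in w]                   # winsorize the raw data
    w_star = XI * sqrt(sum(c * c for c in clipped) / len(clipped))
    n_modified = sum(1 for v in w if v > psi)
    trace.append((step, psi, n_modified, w_star))

for step, psi, n_modified, value in trace:
    print(f"step {step}: psi = {psi:.3f}, modified values = {n_modified}, w* = {value:.3f}")

# Common robust standard deviation derived from the final robust range.
s_r = w_star / sqrt(2)
print(f"s_r = {s_r:.2f}")
```

The number of modified values drops from three in the first step to one in the later steps, which is why the estimate rises from step to step before it settles.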
5 Summary
The method of determining the precision of a measurement method is briefly presented.
If the full model is not known, tests are conducted on homogeneous objects by the
same procedure in several laboratories with similar competencies. It can be assumed
that the scatter is modeled by a random variable with a normal distribution. On the
basis of the results of this research, a statistical model is created and its accuracy
is determined. In practice, outliers may occur in the results. Rejecting them from
further processing when only a small number of experimental data are available
diminishes the credibility of the assessment. Thus, a robust statistical method should
be applied.
For illustration, a numerical example was presented. The standard deviation obtained
in one of the nine laboratories was an outlier. The precision was evaluated both in
the conventional way, with rejection of the outlier, and by the robust method called
“Algorithm S” [1], [7]. The robust method uses all the experimental data. A joint
assessment of the standard deviation for all results was achieved. It was slightly
larger than the traditional assessment (with rejection of the outlier), but
statistically more reliable.
The evaluation of the reproducibility of a particular method carried out by a specific
procedure (assessment of precision) is derived from the results of interlaboratory
tests. If heterogeneous experimental data (with outliers) are obtained in such a
study, they should be evaluated using the robust “Algorithm S”, which is more reliable
than traditional methods.
References
1. International Organization for Standardization: Accuracy (trueness and precision) of
measurement methods and results. ISO 5725-2, ISO 5725-5:2002
2. Tukey, J.W.: Exploratory Data Analysis. Addison-Wesley (1978)
3. Willink, R.: What is robustness in data analysis? Metrologia 45, 442-447 (2008)
4. Huber, P.J., Ronchetti, E.M.: Robust Statistics, 2nd edition. Wiley (2011)
5. Wilrich, P.T.: Robust estimates of the theoretical standard deviation to be used in
interlaboratory precision experiments. Accreditation and Quality Assurance 12(5),
231-240 (2007)
6. Volodarsky, E.T., Warsza, Z.L.: Application of two robust methods on the example of
interlaboratory comparison. In: Pavese, F., Bremser, W., Chunovkina, A.G., Fischer, N.,
Forbes, A.B. (eds.) Advanced Mathematical and Computational Tools in Metrology and
Testing X. Series on Advances in Mathematics for Applied Sciences, vol. 86, World
Scientific Publishing Company, 385-391 (2015)
7. International Organization for Standardization: Statistical methods for use in
proficiency testing by interlaboratory comparisons (IDT), attachment C2. ISO 13528:2005
8. Zieliński, R.: Tablice statystyczne (Statistical Tables). PWN, Warszawa (1972)
9. Farrant, T.J.: Practical statistics for the analytical scientist: A bench guide. Royal
Society of Chemistry (1997)
10. Volodarsky, E., Warsza, Z., Kosheva, L., Idzkowski, A.: Evaluating the precision of
interlaboratory measurements using robust S-algorithm. Problems and Progress in
Metrology - PPM 2015, Proceedings of Commission Metrology of Polish Academy of Sciences
in Poland (division Katowice), Series: Conferences no. 20, 53-59 (2015)