Revista Brasileira de Computação Aplicada, July, 2019
DOI: 10.5335/rbca.v11i2.8906
Vol. 11, No. 2, pp. 74–85
Homepage: seer.upf.br/index.php/rbca/index
ORIGINAL ARTICLE
Application of Artificial Random Numbers and Monte Carlo Method in the Reliability Analysis of Geodetic Networks
Maria L. S. Bonimani1, Vinicius F. Rofatto1,2, Marcelo T. Matsuoka1,2 and
Ivandro Klein3,4
1Universidade Federal de Uberlândia, Instituto de Geografia, Engenharia de Agrimensura e Cartográfica, Monte Carmelo - MG - Brasil; 2Universidade Federal do Rio Grande do Sul, Programa de Pós-Graduação em Sensoriamento Remoto, Porto Alegre - RS - Brasil; 3Universidade Federal do Paraná, Programa de Pós-Graduação em Ciências Geodésicas, Paraná - PR - Brasil; 4Instituto Federal de Santa Catarina, Curso de Agrimensura, Florianópolis - SC - Brasil
*malubonimani@hotmail.com; vfrofatto@gmail.com; tomiomatsuoka@gmail.com; ivandroklein@gmail.com
Received: 2018-11-28. Revised: 2018-11-28. Accepted: 2018-11-28.
Abstract
A geodetic network is a network of points interconnected by direction and/or distance measurements or by using Global Navigation Satellite System receivers. Such networks are essential for most geodetic engineering projects, such as monitoring the position and deformation of man-made structures (bridges, dams, power plants, tunnels, ports, etc.), monitoring the crustal deformation of the Earth, and implementing an urban and rural cadastre, among others. One of the most important criteria that a geodetic network must meet is reliability. In this context, reliability concerns the network's ability to detect and identify outliers. Here, we apply the Monte Carlo Method (MMC) to investigate the reliability of a geodetic network. The key element of the MMC is the random number generator. Results for a simulated closed levelling network reveal that identifying an outlier is more difficult than detecting it. In general, considering the simulated network, the relationship between outlier detection and identification depends on the level of significance of the outlier statistical test.
Keywords: Computational Simulation; Geodetic Network; Hypothesis Testing; Monte Carlo Method; Outlier Detection; Quality Control.
Resumo
Uma rede geodésica consiste de pontos devidamente materializados no terreno, cujas coordenadas são estimadas por meio de medidas angulares e de distâncias entre os vértices, e/ou por meio de técnicas de posicionamento por Sistema Global de Navegação por Satélite. Estas redes são essenciais para os diversos ramos das Ciências e Engenharia, como por exemplo, no monitoramento de estruturas (barragens, pontes, usinas hidrelétricas, túneis, portos, etc.), no monitoramento da deformação da crosta terrestre, na implantação de um cadastro urbano e/ou rural georreferenciado, entre outros. Um dos critérios que uma rede geodésica deve atender é a confiabilidade. Neste contexto, a confiabilidade pode ser entendida como a capacidade da rede em detectar e identificar outliers a um certo nível de probabilidade. Aqui, usamos o Método Monte Carlo (MMC) para investigar a confiabilidade de uma rede geodésica. O elemento chave do MMC é o gerador de números aleatórios. Os resultados de uma rede de nivelamento simulada revelam que identificar um outlier é mais difícil que detectá-lo. De modo geral, a relação entre a detecção e a identificação de um outlier depende do nível de significância do teste estatístico empregado para tratar os outliers.
Palavras-Chave: Método Monte Carlo; Outliers; Redes Geodésicas; Simulação Computacional; Teste de Hipóteses; Controle de Qualidade.
1 Introduction
The foundation of the Monte Carlo Method (MMC) was Buffon's needle problem, posed by Georges-Louis Leclerc in the eighteenth century. Later, in the early twentieth century, William Sealy Gosset, otherwise known as 'Student', discovered the form of the t-distribution by a combination of mathematical and empirical work with random numbers, which is now regarded as an early application of the MMC. However, the MMC became well known in the 1940s, when Stanisław Ulam, Nicholas Metropolis, and John von Neumann worked on the atomic bomb project. The method was used to solve the problem of diffusion and absorption of neutrons, which was difficult to handle with analytical approaches (Stigler; 2002).
Despite advances in science and technology for solving highly complex systems, one of the major obstacles to running the MMC up until the 1980s was the required analysis time and computing resources (run time and memory). However, the advent of personal computers with powerful processors has rendered the MMC a particularly attractive and cost-effective approach to the performance analysis of complex systems. The MMC has therefore emerged as a solution that helps analysts understand how well a system performs under a given regime or set of parameters.
The key element of the MMC is the random number generator. A random number generator is an algorithm that generates a deterministic sequence of numbers which simulates a sequence of independent and identically distributed (i.i.d.) numbers chosen uniformly between 0 and 1. It is random in the sense that the generated sequence passes statistical tests for randomness. For this reason, random number generators are typically referred to as pseudo-random number generators (PRNGs). PRNGs are part of many machine learning and data mining techniques. In simulation, a PRNG is implemented as a computer algorithm in some programming language and is made available to the user via procedure calls or icons (Altiok and Melamed; 2007). A good generator produces numbers that are not distinguishable from truly random numbers in a limited computation time. This is, in particular, true for the Mersenne Twister (Matsumoto and Nishimura; 1998), a popular generator with a long period of $2^{19937} - 1$.
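To make the role of the PRNG concrete, the short sketch below draws a few uniform deviates from NumPy's implementation of the Mersenne Twister. It is an illustrative example only; the paper does not prescribe a particular library, and the seed is a hypothetical choice made for reproducibility.

```python
# Illustrative sketch (not from the paper): uniform pseudo-random numbers
# drawn from the Mersenne Twister (MT19937) as implemented in NumPy.
import numpy as np

seed = 20190704                                     # hypothetical seed, fixed for reproducibility
rng = np.random.Generator(np.random.MT19937(seed))

u = rng.random(5)                                   # i.i.d. numbers, uniform on [0, 1)
print(u)
```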
In essence, the MMC replaces random variables by computer-generated pseudo-random numbers, probabilities by relative frequencies, and expectations by arithmetic means over large sets of such numbers. A computation with one set of pseudo-random numbers is a Monte Carlo experiment (Lehmann and Scheffler; 2011); the number of such experiments is also referred to as the number of Monte Carlo simulations (Altiok and Melamed; 2007; Gamerman and Lopes; 2006).
It is evident that in the last decades the use of the MMC for quality control purposes in geodesy has been increasing. Hekimoglu and Koch (1999) pioneered the idea of using the MMC in geodesy for evaluating probabilities as simple ratios obtained from simulated experiments. Aydin (2012) used 5,000 MMC simulations to investigate the global test procedure in deformation analysis. Yang et al. (2013) used the MMC to analyze the probability levels of data snooping. Koch (2015) investigated the non-centrality parameter of the F-distribution using 100,000 simulated random variables. Klein et al. (2017) ran 1,000 experiments to verify the performance of sequential likelihood ratio tests for multiple outliers. Rofatto et al. (2018a) used the MMC for designing a geodetic network.
In this work, we seek to investigate the reliability of a geodetic network. One of the most frequently used reliability measures is the Minimal Detectable Bias (MDB); see e.g. Teunissen (2006) and Teunissen (1998). The MDB is a diagnostic tool for analyzing the network's ability to detect outliers. However, not the MDB but the Minimal Identifiable Bias (MIB) should be used as the proper diagnostic tool for outlier identification purposes (Imparato et al.; 2018). Unlike the MDB, the MIB is practically impossible to obtain in closed form. On the other hand, today we have fast and powerful computers, large data storage systems and modern software, which paves the way for the use of numerical simulation. In this sense, therefore, we propose the use of the MMC to analyze the reliability of a geodetic network in terms of the MIB.
The rest of the article is organized as follows: first, we provide a brief explanation of what an outlier is and explain the difference between outlier detection and outlier identification. Second, we present an MMC approach as a computational tool for analyzing the reliability of a geodetic network. Third, a numerical example of the proposed method is given for a levelling network. Finally, concluding remarks are summarised at the end of the article.
2 Outlier Detection and Identification
The most often quoted definition of an outlier is that of Hawkins (1980): "An outlier is an observation that deviates so much from other observations as to arouse suspicions that it was generated by a different mechanism". In geodesy, the term outlier is defined on the basis of a statistical hypothesis test for the presence of gross measurement errors in the observations (Baarda; 1968). Observations that are rejected by such an outlier test are called outliers. Therefore, an observation that is not grossly erroneous but is rejected by an outlier test can also be called an outlier. In this context, outliers are most often caused by gross errors, and gross errors most often cause outliers. On the one hand, however, outliers may in rare cases be the result of fully correct measurements; on the other hand, mistakes or malfunctioning instruments may not always lead to large deviations, e.g., a small correction wrongly applied (Lehmann; 2013).
Since Hawkins' definition and most other definitions of outliers restrict themselves to samples (repeated observations), we follow the definition of Lehmann (2013): "An outlier is an observation that is so probably caused by a gross error that it is better not used or not used as it is".
In this section we provide the elements related to hypothesis testing for the detection and identification of a single outlier in linear(ised) models.
2.1 Outlier Detection and the Minimal Detectable Bias (MDB)
Baarda (1968) proposed a procedure based on
hypothesis testing for the detection of a single
outlier in linear(ized) models, which he called data
snooping. Although data snooping was introduced as
a testing procedure for use in geodetic networks,
it is a generally applicable method (Lehmann;
2012). Baarda’s data snooping consists of screening
each individual observation for a possible outlier
(Teunissen;2006). Baarda’s w-test statistic for his
data snooping is given by a normalised least-squares
residual. This test, which is based on a linear mean-
shift model, can also be derived as a particular case
of the generalised likelihood ratio test.
In principle, Baarda's w-test only makes a decision between the null hypothesis $H_0$ and a single alternative hypothesis $H_i$. The null hypothesis, also called the working hypothesis, corresponds to a supposedly valid model describing the physical reality of the observations without the presence of an outlier. When it is assumed to be 'true', this model is used to estimate the unknown parameters, typically in a least-squares approach. Thus, the null hypothesis of the standard Gauss-Markov model in linear or linearised form is given by equation (1) (Koch; 1999):
$$H_0 : \mathrm{E}(y) = Ax, \qquad \mathrm{D}(y) = \Sigma_{yy} \qquad (1)$$

where:
$\mathrm{E}(\cdot)$ is the expectation operator;
$y \in \mathbb{R}^n$ is the vector of measurements;
$A \in \mathbb{R}^{n \times u}$ is the Jacobian matrix (also called the design matrix) of full rank $u$;
$x \in \mathbb{R}^u$ is the unknown parameter vector;
$\mathrm{D}(\cdot)$ is the dispersion operator; and
$\Sigma_{yy} \in \mathbb{R}^{n \times n}$ is the known positive definite covariance matrix of the measurements.

The redundancy (or degrees of freedom) of the model in (1) is $r = n - u$, where $n$ is the number of measurements and $u$ is the number of parameters.
Instead of $H_0$, Baarda (1968) proposed a mean-shift alternative hypothesis $H_i$, also referred to as a model misspecification by Teunissen (2006), as follows:

$$H_i : \mathrm{E}(y) = Ax + c_i \nabla_i, \qquad \mathrm{D}(y) = \Sigma_{yy} \qquad (2)$$

In equation (2), $c_i$ is a canonical unit vector, which consists exclusively of elements with values 0 and 1, where 1 means that an outlier of magnitude $\nabla_i$ affects the $i$-th measurement and 0 otherwise, e.g. $c_i = [0 \; 0 \; \ldots \; 1_i \; \ldots \; 0 \; 0]^{\top}$. Therefore, the purpose of the data snooping procedure is to screen each individual observation for an outlier.
To verify whether there is sufficient evidence to reject the null hypothesis, the test for the binary case is performed as in (3):

$$\text{Accept } H_0 \text{ if } |w_i| \leq \sqrt{\chi^2_{\alpha_0}(r=1, 0)} = k \qquad (3)$$

where

$$|w_i| = \frac{\left| c_i^{\top} \Sigma_{yy}^{-1} \hat{e}_0 \right|}{\sqrt{c_i^{\top} \Sigma_{yy}^{-1} \Sigma_{\hat{e}_0} \Sigma_{yy}^{-1} c_i}} \qquad (4)$$
In equations (3) and (4), $|w_i|$ is Baarda's w-test statistic for data snooping, which represents the normalised least-squares residual of each measurement; $\hat{e}_0$ is the least-squares residual vector under $H_0$; and $\Sigma_{\hat{e}_0}$ is the covariance matrix of the best linear unbiased estimator $\hat{e}_0$ under $H_0$. The critical value $k = \sqrt{\chi^2_{\alpha_0}(r=1, 0)}$ is computed from the central chi-squared distribution with $r = 1$ degree of freedom and type I error $\alpha_0$, also known as the false alarm rate or level of significance (note: the index '0' denotes the case of a single alternative hypothesis). The second argument of $\chi^2_{\alpha_0}(r=1, 0)$ is the non-centrality parameter $\lambda_{r=1}$, which in this case is $\lambda_{r=1} = 0$.
In the case of a decision in favour of $H_i$, there is an outlier that causes the expectation of $|w_i|$ to become non-zero, with associated non-centrality parameter $\lambda_{r=1}$. The non-centrality parameter $\lambda_{r=1}$ describes the discrepancy between $H_0$ of equation (1) and $H_i$ of equation (2), and it is given by (5):

$$\lambda_{r=1} = c_i^{\top} \Sigma_{yy}^{-1} \Sigma_{\hat{e}_0} \Sigma_{yy}^{-1} c_i \, \nabla_i^2 \qquad (5)$$
Because Baarda's w-test is, in essence, based on binary hypothesis testing, in which one decides between the null hypothesis $H_0$ of equation (1) and a single alternative hypothesis $H_i$ of equation (2), it may lead to a type I error $\alpha_0$ or a type II error $\beta_0$. The probability of a type I error $\alpha_0$ is the probability of rejecting the null hypothesis when it is true, whereas the type II error $\beta_0$ is the probability of failing to reject the null hypothesis when it is false.
Associated with $\alpha_0$ and $\beta_0$ are the confidence level ($\mathrm{CL} = 1 - \alpha_0$) and the power of the test ($\gamma_0 = 1 - \beta_0$), respectively. The first deals with the probability of accepting a true null hypothesis; the second, with the probability of correctly accepting the alternative hypothesis. Fig. 1 shows an example of the relationship between these quantities.
Note in (5) that the non-centrality parameter $\lambda_{r=1}$ requires knowledge of the outlier size $\nabla_i$, which in practice is unknown. On the other hand, $\lambda_{r=1}$ can be computed as a function of $\alpha_0$, $\gamma_0$ and $r = 1$. In that case, the term $c_i^{\top} \Sigma_{yy}^{-1} \Sigma_{\hat{e}_0} \Sigma_{yy}^{-1} c_i$ is a scalar, and the solution of the quadratic equation (5) is given by (6) (Teunissen; 2006):

$$|\nabla_i| = \mathrm{MDB}_i = \sqrt{\frac{\lambda_{r=1}(\alpha_0, \gamma_0)}{c_i^{\top} \Sigma_{yy}^{-1} \Sigma_{\hat{e}_0} \Sigma_{yy}^{-1} c_i}} \qquad (6)$$
In equation (6), $|\nabla_i|$ is the Minimal Detectable Bias ($\mathrm{MDB}_i$), which is computed for each of the $n$ alternative hypotheses according to equation (2). For more details about the MDB see e.g. Rofatto et al. (2018b).
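As an illustration of equation (6), the sketch below computes the non-centrality parameter $\lambda_{r=1}(\alpha_0, \gamma_0)$ numerically and evaluates the MDB of one measurement. It is a minimal sketch using SciPy, not the authors' implementation; the helper names are ours, and the covariance matrices $\Sigma_{yy}$ and $\Sigma_{\hat{e}_0}$ are assumed to be available (e.g. from equation (18) below).

```python
# Sketch of equation (6): MDB_i = sqrt(lambda(alpha0, gamma0) / (c_i' S^-1 Se0 S^-1 c_i)).
# Hypothetical helper names; Sigma_yy and Sigma_e0 must be supplied by the caller.
import numpy as np
from scipy import stats, optimize

def noncentrality(alpha0, gamma0):
    """Non-centrality parameter for which a chi-square(1) test of size alpha0
    reaches power gamma0 (single alternative hypothesis, r = 1)."""
    k2 = stats.chi2.ppf(1.0 - alpha0, df=1)               # squared critical value k^2
    power_gap = lambda lam: stats.ncx2.sf(k2, df=1, nc=lam) - gamma0
    return optimize.brentq(power_gap, 1e-6, 100.0)        # solve power(lambda) = gamma0

def mdb(i, Sigma_yy, Sigma_e0, alpha0=0.001, gamma0=0.80):
    """Minimal Detectable Bias of measurement i (0-based index), eq. (6)."""
    n = Sigma_yy.shape[0]
    c = np.zeros(n); c[i] = 1.0                            # canonical unit vector c_i
    S_inv = np.linalg.inv(Sigma_yy)
    denom = c @ S_inv @ Sigma_e0 @ S_inv @ c               # scalar of eqs. (5)-(6)
    return np.sqrt(noncentrality(alpha0, gamma0) / denom)
```

With Baarda's classical choice $\alpha_0 = 0.001$ and $\gamma_0 = 0.80$, the noncentrality helper returns a value close to the familiar textbook figure of about 17.1.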
Figure 1: A non-centrality parameter of $\lambda_{r=1} = 3.147$ with $\alpha_0 = 0.01$ ($k = 2.576$) leads to $\gamma_0 = 0.8$ (or $\beta_0 = 0.2$) (adapted from Rofatto et al. (2018b)).

Although Baarda's w-test belongs to the class of generalised likelihood ratio tests and has the property of being a uniformly most powerful invariant (UMPI) test when the null hypothesis is tested against a single alternative (Arnold; 1981; Teunissen; 2006), it is not necessarily UMPI when more than one alternative hypothesis is considered, as is the case in the data snooping procedure (Kargoll; 2007). In the next section, we briefly review the case of multiple alternative hypotheses and the Minimal Identifiable Bias (MIB).
2.2 Outlier Identification and the Minimal Identifiable Bias (MIB)
The sizes of the type I and type II errors above are given for a single alternative hypothesis $H_i$ of equation (2). Under this assumption, the MDB can be obtained as a lower bound for the outlier that can be successfully detected (Yang et al.; 2013). In practice, however, we do not have a single alternative hypothesis during the data snooping procedure, but multiple alternative hypotheses. Therefore, the data snooping procedure takes effect when it returns the largest absolute value among the $w_i$, i.e. (Teunissen; 2006):

$$w = \max_{i \in \{1, \ldots, n\}} |w_i| \qquad (7)$$
The concept of multiple testing says that if $H_0$ is rejected, then among all the $H_i$ the one that would have rejected $H_0$ with the smallest $\alpha$ should be accepted. In the case that all critical values are identical, this is simple: the $H_i$ with the maximum test statistic should be accepted. In order to check its significance, the maximum value $w$ should be compared with a critical value $k$ (Rofatto et al.; 2018b). In that case, the data snooping procedure is given as:

$$\text{Accept } H_0 \text{ if } w \leq k \qquad (8)$$

Otherwise,

$$\text{Accept } H_i \text{ if } w > k \qquad (9)$$

According to (8) and (9), if none of the $n$ w-tests is rejected, then we accept the null hypothesis $H_0$.
For the test with multiple alternative hypotheses, apart from the type I and type II errors, there is a third type of wrong decision when Baarda's data snooping is performed: the procedure can flag a non-outlying observation while the 'true' outlier remains in the dataset. This is referred to as the type III error (Hawkins; 1980). The determination of the type III error (here denoted by $\kappa_{ij}$) involves a separability analysis between the alternative hypotheses (Förstner; 1983). We are therefore now interested in the identification of the correct alternative hypothesis. In this case, rejection of $H_0$ does not necessarily imply the correct identification of a particular alternative hypothesis.
Under multiple alternative hypotheses, the probability of a type I error in the data snooping procedure for outlier identification, when there are no outliers, is given by (10):

$$\alpha_{0i} = \int_{|w_i| > |w_j| \,\forall j \neq i, \; |w_i| > k} f_0 \, \mathrm{d}w_1 \ldots \mathrm{d}w_n \qquad (10)$$
In equation (10), $f_0$ is the probability density function of the multivariate Baarda w-test statistics when their expectation is zero (i.e. $\mu_n = 0$).
Based on the assumption that one outlier is in the $i$-th position of the dataset, the probability of a correct identification is given by (11):

$$1 - \beta_{ii} = \int_{|w_i| > |w_j| \,\forall j \neq i, \; |w_i| > k} f_i \, \mathrm{d}w_1 \ldots \mathrm{d}w_n \qquad (11)$$

where $f_i$ is the probability density function of the multivariate Baarda w-test statistics when their expectation is not equal to zero ($\mu_n \neq 0$).
The probability of a type II error for multiple testing is given by (12):
$$\beta_{i0} = P\left[ \bigcap_{i=1}^{n} |w_i| \leq k \;\middle|\; H_i : \text{true} \right] \qquad (12)$$
In that case, the probability of type III error is
given by (13):
$$\sum_{j=1,\; j \neq i}^{n} P\left[ |w_j| > |w_i| \;\forall i, \; |w_j| > k \;\middle|\; H_i : \text{true} \right] = \sum_{j=1,\; j \neq i}^{n} \kappa_{ij} \qquad (13)$$
Testing $H_0$ against $H_1, H_2, H_3, \ldots, H_n$ is not a trivial task for identification purposes, because the higher the dimensionality of the alternative hypotheses, the more complicated the probability levels associated with the data snooping procedure become.
Teunissen (2018) recently introduced the concept of the Minimal Identifiable Bias (MIB) as the smallest outlier that leads to its identification for a given correct identification rate. Detection and identification are equal in the case where there is only one alternative hypothesis. Under $n$ alternative hypotheses (multiple testing), however, we have from equations (11), (12) and (13):

$$\beta_{ii} = \beta_{i0} + \sum_{j=1,\; j \neq i}^{n} \kappa_{ij} \qquad (14)$$

or

$$1 - \beta_{ii} = \gamma_0 - \sum_{j=1,\; j \neq i}^{n} \kappa_{ij}, \qquad \text{i.e.} \quad \gamma_0 = 1 - \beta_{ii} + \sum_{j=1,\; j \neq i}^{n} \kappa_{ij} \qquad (15)$$
The probability of correct detection $\gamma_0$ (the power of the test for a single alternative hypothesis) is thus the sum of the probability of correct identification $1 - \beta_{ii}$ (selecting the correct alternative hypothesis) and the probability of misidentification $\sum_{j \neq i} \kappa_{ij}$ (selecting one of the $n-1$ other hypotheses). Thus, we have the following inequality (Imparato et al.; 2018):

$$1 - \beta_{ii} \leq \gamma_0 \qquad (16)$$

As a consequence of inequality (16), the MIB will be larger than the MDB, i.e. $\mathrm{MIB} \geq \mathrm{MDB}$.
Because the acceptance region (as well as the critical region) for the case of multiple alternative hypotheses is analytically intractable, the computation of the MIB should be based on the Monte Carlo integration method (MMC). In this respect, Imparato et al. (2018) and Teunissen (2018) showed how to compute the MIB. They found that the larger the size of the outlier and/or the more precisely the outlier is estimated, the higher the probability of it being correctly identified. In addition, increasing the type I error (i.e. reducing the acceptance region) leads to higher probabilities of correct identification. Furthermore, increasing the number of alternative hypotheses leads to a lower probability of correct identification.
There is no difference between the MDB and the MIB in the case of a single alternative hypothesis. As the number of alternative hypotheses increases, however, the MDBs become smaller, whereas the MIBs become larger.
The theory presented so far applies to a single round of data snooping. In practice, however, data snooping is applied iteratively in a process of estimation, identification, and adaptation. First, the least-squares residual vector is estimated and Baarda's w-test statistics are computed by (4). Then, the detector given by (7) is applied to identify the most likely outlier. The identified outlier is excluded from the dataset and the least-squares adjustment is restarted without the rejected observation. Baarda's w-test (4) as well as the detector (7) are then computed again. Obviously, if redundancy permits, this procedure is repeated until no more (possible) outliers can be identified. This procedure is called iterative data snooping (IDS) (Teunissen; 2006).
In the case of IDS, a reliability measure cannot be easily computed for quality control purposes. Consequently, the MDB and MIB described above are valid only for the case where data snooping is run once, and they cannot be used as diagnostic tools for IDS. Because an analytical formula is not available, an MMC run should be used to obtain the MIB for IDS. The MMC allows insight into those cases where analytical solutions are extremely complex to fully understand, are doubted for one reason or another, or are not available (Rofatto et al.; 2018b).
Recent studies by Rofatto et al. (2017) showed how to extract the probability levels associated with Baarda's IDS procedure by means of the MMC. Furthermore, they introduced two new classes of wrong decisions for IDS, which they called over-identification. The first is the probability of IDS flagging the outlier and good observations simultaneously; the second is the probability of IDS flagging only good observations (more than one) as outliers while the outlier remains in the dataset. These two new classes of wrong decisions can obviously occur during the iterative process of estimation, identification, and exclusion, as is the case with IDS.
3 MIB based on Monte Carlo Method
The probability levels associated with IDS are not easy to study using analytical models owing to the paucity or lack of practically computable solutions (closed-form or numerical). Therefore, identifying an outlier is still a bottleneck in geodesy. On the other hand, an MMC procedure can almost always be run to generate system histories that yield useful statistical information on system operation and performance measures, as pointed out by Altiok and Melamed (2007).
A geodetic network is typically composed of distance and angle measurements. Generally, the random errors of good measurements are normally distributed with expectation zero. In order to obtain normal random errors, uniformly distributed random number sequences (produced by the Mersenne Twister algorithm, for example) are transformed into a normal distribution by means of the Box-Muller transformation (Box and Muller; 1958). Box-Muller has been used in geodesy for MMC applications (Lehmann; 2012).
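A minimal sketch of this transformation is given below, under the assumption that the standard normal deviates obtained from Box-Muller are subsequently scaled by a Cholesky factor of $\Sigma_{yy}$ to produce the random error vectors. The helper names and seed are ours, not the paper's; the diagonal $\Sigma_{yy}$ uses the standard deviations of the levelling example in Section 4.

```python
# Minimal sketch (not the authors' code): Box-Muller transform of Mersenne
# Twister uniforms into standard normal deviates, then scaling to Sigma_yy.
import numpy as np

def box_muller(u1, u2):
    """Map two independent U(0,1) arrays to two independent N(0,1) arrays."""
    r = np.sqrt(-2.0 * np.log(1.0 - u1))           # 1 - u1 avoids log(0)
    return r * np.cos(2.0 * np.pi * u2), r * np.sin(2.0 * np.pi * u2)

def normal_errors(rng, Sigma_yy):
    """Draw one random error vector e ~ N(0, Sigma_yy) via Box-Muller."""
    n = Sigma_yy.shape[0]
    u = rng.random((2, n))                          # Mersenne Twister uniforms
    z, _ = box_muller(u[0], u[1])                   # n standard normal deviates
    L = np.linalg.cholesky(Sigma_yy)                # Sigma_yy = L L'
    return L @ z                                    # scaled (correlated) errors

rng = np.random.Generator(np.random.MT19937(123))                   # hypothetical seed
Sigma_yy = np.diag(np.array([8.0, 5.6, 5.6, 8.0, 5.6, 8.0]) ** 2)   # mm^2, Section 4 example
e = normal_errors(rng, Sigma_yy)
```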
A procedure based on the MMC is applied to compute the probability levels of IDS as follows (summarised as a flowchart in Fig. 2).
In the first step, the design matrix $A \in \mathbb{R}^{n \times u}$ and the covariance matrix of the measurements $\Sigma_{yy} \in \mathbb{R}^{n \times n}$ are entered; then, the significance level $\alpha$ and the magnitude intervals of the simulated outliers are defined.
The magnitude intervals of the outliers are based on the standard deviation of the measurements (e.g. from $3\sigma$ to $9\sigma$ in absolute value, where $\sigma$ is the standard deviation of the measurement). The random error vectors are artificially generated from a multivariate normal distribution, because the assumed stochastic model for random errors is based on the covariance matrix of the measurements. In this work, we use the Mersenne Twister algorithm to generate a sequence of pseudo-random numbers and the Box-Muller transformation to convert it into a normal distribution. The magnitude of the outlier (one outlier at a time, $r = 1$) is selected from the magnitude interval of the outliers for each Monte Carlo experiment. We use the continuous uniform distribution to select the outlier magnitude. The uniform distribution is a rectangular distribution with constant probability, which implies that every range of values of the same length on the distribution's support has the same probability of occurrence. Thus, the total error $\varepsilon$ is a combination of the random errors and the corresponding outlier, and is given as follows:

$$\varepsilon = e + c_i \nabla_i \qquad (17)$$
where $e \in \mathbb{R}^n$ is the vector of pseudo-random numbers drawn from the normal distribution, i.e. $e \sim \mathcal{N}(0, \Sigma_{yy})$, and $c_i$ consists exclusively of elements with values 0 and 1, where 1 means that an outlier of magnitude $\nabla_i$ affects the $i$-th measurement, and 0 otherwise.
After the total error has been generated, the least-squares residual vector $\hat{e}_0$ is computed using equation (18):

$$\hat{e}_0 = R\varepsilon, \qquad \text{with } R = I - A(A^{\top} W A)^{-1} A^{\top} W \qquad (18)$$

In equation (18), $R \in \mathbb{R}^{n \times n}$ is the redundancy matrix, $W = \sigma_0^2 \Sigma_{yy}^{-1} \in \mathbb{R}^{n \times n}$ is the weight matrix of the measurements, where $\sigma_0^2$ is the variance factor, and $I \in \mathbb{R}^{n \times n}$ is the identity matrix (Koch; 1999).
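The sketch below strings equations (18), (4) and (7) together for a single contaminated error vector. The 6x3 design matrix is a hypothetical closed levelling loop introduced only to make the snippet runnable (the paper's design matrix is not reproduced here); the standard deviations are those of the example in Section 4.

```python
# Sketch of equations (18), (4) and (7); hypothetical design matrix, not the paper's.
import numpy as np

A = np.array([[ 1,  0,  0],    # A -> B   (heights of B, C, D are the unknowns; A is fixed)
              [-1,  1,  0],    # B -> C
              [ 0, -1,  1],    # C -> D
              [ 0,  0, -1],    # D -> A
              [-1,  0,  1],    # B -> D
              [ 1, -1,  0]],   # C -> B   (hypothetical extra tie)
             dtype=float)
sigma = np.array([8.0, 5.6, 5.6, 8.0, 5.6, 8.0])                  # mm, Section 4 example
Sigma_yy = np.diag(sigma ** 2)
W = np.linalg.inv(Sigma_yy)                                       # weight matrix, sigma_0^2 = 1

n, u = A.shape
R = np.eye(n) - A @ np.linalg.inv(A.T @ W @ A) @ A.T @ W          # redundancy matrix, eq. (18)
Sigma_e0 = R @ Sigma_yy                                           # residual covariance under H0 (sigma_0^2 = 1)

def w_statistics(e_hat):
    """Baarda's |w_i| for every measurement, eq. (4)."""
    w = np.empty(n)
    for i in range(n):
        c = np.zeros(n); c[i] = 1.0                               # canonical unit vector c_i
        num = abs(c @ W @ e_hat)
        den = np.sqrt(c @ W @ Sigma_e0 @ W @ c)
        w[i] = num / den
    return w

eps = np.zeros(n); eps[2] = 6.0 * sigma[2]       # total error: 6-sigma outlier in measurement 3, no noise
e_hat = R @ eps                                  # least-squares residuals, eq. (18)
w = w_statistics(e_hat)
print(w, int(w.argmax()))                        # detector of eq. (7): position of max |w_i|
```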
For IDS, the hypothesis of (2) with one outlier is assumed, and the corresponding test statistic is computed according to (4). Then, the maximum test statistic value is computed according to (7). After identifying the observation suspected of being the most likely outlier, it is excluded from the model, and least-squares estimation and data snooping are applied iteratively until no further outliers are identified in the dataset. The procedure is performed for $m$ experiments of random error vectors, with each experiment contaminated by an outlier.
If $m$ is the total number of MMC experiments, we count the number of times the outlier is correctly identified (denoted by $n_{CI}$), i.e. the experiments $\nu \in \{1, \ldots, m\}$ in which $\max |w_i^{\nu}| > k$ and the maximum occurs at the contaminated observation. Then, the probability of correct identification ($P_{CI}$) can be approximated as follows (Rofatto et al.; 2018b):
$$P_{CI} \approx \frac{n_{CI}}{m} \qquad (19)$$

The error probabilities are approximated in the same way:

$$P_{MD} \approx \frac{n_{MD}}{m} \qquad (20)$$

$$P_{WE} \approx \frac{n_{WE}}{m} \qquad (21)$$

$$P_{over+} \approx \frac{n_{over+}}{m} \qquad (22)$$

$$P_{over-} \approx \frac{n_{over-}}{m} \qquad (23)$$
where:
$n_{MD}$ is the number of experiments in which IDS does not detect the outlier;
$P_{MD}$ represents the type II error, also referred to as the missed detection probability;
$n_{WE}$ is the number of experiments in which the IDS procedure flags a non-outlying observation while the 'true' outlier remains in the dataset;
$P_{WE}$ represents the type III error, also referred to as the wrong exclusion probability;
$n_{over+}$ is the number of experiments in which IDS identifies the outlying observation correctly together with other observations;
$P_{over+}$ corresponds to the probability of the over+ class;
$n_{over-}$ is the number of experiments in which IDS identifies more than one non-outlying observation while the 'true' outlier remains in the dataset;
$P_{over-}$ corresponds to the probability of the over- class.
In practice, as the magnitudes of the outliers are unknown, one can fix the probability of correct identification in order to find the MIB for a given application. In the next section, the MMC-based procedure for the computation of the MIB is applied to a geodetic network. The relationship between detection (measured by the MDB) and identification (measured by the MIB) is also studied.
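A condensed sketch of the full procedure of Fig. 2, including the counting behind equations (19)-(23), is shown below. It reuses the hypothetical levelling design matrix of the previous snippet, counts only three of the five outcome classes (the over+ and over- classes can be counted in the same way), and uses a critical value, seed, and helper names that are our own assumptions rather than the authors' choices.

```python
# Condensed Monte Carlo sketch of Fig. 2 (illustrative only, not the authors' code).
import numpy as np

A = np.array([[1, 0, 0], [-1, 1, 0], [0, -1, 1],
              [0, 0, -1], [-1, 0, 1], [1, -1, 0]], dtype=float)   # hypothetical levelling loop
sigma = np.array([8.0, 5.6, 5.6, 8.0, 5.6, 8.0])                  # mm, Section 4 example
Sigma_yy = np.diag(sigma ** 2)
n = A.shape[0]

def ids(y_err, k, max_iter=3):
    """Iterative data snooping: return the list of flagged observation indices."""
    idx, flagged = list(range(n)), []
    for _ in range(max_iter):
        Ai, Qi = A[idx], Sigma_yy[np.ix_(idx, idx)]
        Wi = np.linalg.inv(Qi)                                    # weights (sigma_0^2 = 1)
        Ri = np.eye(len(idx)) - Ai @ np.linalg.inv(Ai.T @ Wi @ Ai) @ Ai.T @ Wi
        Qe = Ri @ Qi                                              # residual covariance under H0
        e_hat = Ri @ y_err[idx]
        den = np.sqrt(np.clip(np.diag(Wi @ Qe @ Wi), 1e-12, None))
        w = np.abs(Wi @ e_hat) / den                              # all |w_i| of eq. (4) at once
        if w.max() <= k or len(idx) <= A.shape[1] + 1:            # stop: no rejection or low redundancy
            break
        flagged.append(idx.pop(int(np.argmax(w))))                # adapt: exclude most suspicious obs.
    return flagged

m, k = 10_000, 3.29                                # experiments; k for alpha_0 = 0.001 (two-sided)
rng = np.random.Generator(np.random.MT19937(42))   # hypothetical seed
n_ci = n_md = n_we = 0
for _ in range(m):
    i = int(rng.integers(n))                                      # contaminated observation
    mag = rng.uniform(3.0, 9.0) * sigma[i] * rng.choice([-1, 1])  # outlier magnitude ~ U(3, 9) sigma
    # NumPy's multivariate normal is used here in place of an explicit Box-Muller step.
    eps = rng.multivariate_normal(np.zeros(n), Sigma_yy) + np.eye(n)[i] * mag
    f = ids(eps, k)
    if f == [i]:
        n_ci += 1                                  # correct identification
    elif not f:
        n_md += 1                                  # missed detection (type II)
    elif i not in f:
        n_we += 1                                  # wrong exclusion (type III)
print(n_ci / m, n_md / m, n_we / m)                # P_CI, P_MD, P_WE, eqs. (19)-(21)
```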
4 An example of the Monte Carlo Method applied to the reliability analysis of geodetic networks
As an example, the procedure based on MMC experiments for the computation of the probability levels of IDS is applied to the simulated closed levelling network given by Rofatto et al. (2018b), with one control (fixed) point (A) and three points with unknown heights (B, C, and D), totalling four minimally constrained points (Fig. 3). The simulated geodetic network has the minimal number of redundant measurements that allows the identification of a single outlier.

Figure 2: Flowchart of the procedure based on MMC for computation of the probability levels of IDS for each measurement (Rofatto et al.; 2018b).

Figure 3: Simulated geodetic levelling network.

Table 1: MDB and MIB for each significance level α (%) and for a power of γ = 0.8 (80.0%)

Measurement         α = 0.1%    α = 1%    α = 5%    α = 10%
1   MDB             5.3σ        4.4σ      3.6σ      3.2σ
    MIB             5.5σ        4.8σ      4.7σ      6.5σ
2   MDB             6.6σ        5.4σ      4.4σ      4.0σ
    MIB             6.8σ        6.0σ      5.8σ      > 9.0σ
3   MDB             6.6σ        5.4σ      4.4σ      4.0σ
    MIB             6.8σ        6.0σ      5.8σ      6.5σ
4   MDB             5.3σ        4.4σ      3.6σ      3.2σ
    MIB             5.5σ        4.8σ      4.7σ      > 9.0σ
5   MDB             6.6σ        5.4σ      4.4σ      4.0σ
    MIB             6.8σ        6.0σ      5.8σ      7.0σ
6   MDB             5.3σ        4.4σ      3.6σ      3.2σ
    MIB             5.5σ        4.8σ      4.7σ      6.0σ
It is important to mention that the geodetic network presents the minimum configuration needed to identify at least one single outlier. As noted by Xu (2005): "in order to identify outliers, one also has to further assume that for each model parameter, there must, at least, exist two good data that contain the information on such a parameter". For example, consider one unknown height in a levelling network (one-dimensional, 1D). Two observations would lead to different solutions and allow the detection of an inconsistency between them. Three observations would lead to different solutions and allow the identification of one outlying observation, and so on. Thus, in the general case, the number of possible identifiable outliers equals the minimal number of redundant measurements across each and every point, minus one.
There are $n = 6$ measurements, $u = 3$ unknowns, and $n - u = 3$ redundant measurements in this network. Therefore, the geodetic network is able to identify one outlier. Measurements 1, 2, 3, 4, 5, and 6 are assumed normally distributed, uncorrelated, and with nominal precision (a priori standard deviation $\sigma$) of ±8 mm, ±5.6 mm, ±5.6 mm, ±8 mm, ±5.6 mm, and ±8 mm, respectively. The magnitude interval of the outlier runs from a minimum of 3$\sigma$ to a maximum of 9$\sigma$, in steps of 0.1$\sigma$. Here, positive and negative outliers are considered for each measurement. Four values were considered for the significance level: $\alpha$ = 0.001 (0.1%), $\alpha$ = 0.01 (1%), $\alpha$ = 0.05 (5%) and $\alpha$ = 0.1 (10%). We ran 10,000 MMC experiments for each measurement and for each outlier magnitude, totalling 12,960,000 numerical experiments.
Figure 4 shows the power of the test and the type II and III errors of IDS, and Figure 5 shows the over-identification probabilities, for the case where a single outlier contaminates the measurements. In general, the larger the magnitude of the outlier, the higher the success rate (i.e. the power of the test). It can be noted that for $\alpha$ = 0.001 the type III error is the smallest, whereas the type II error is the largest. Furthermore, it is rare for an outlier of small magnitude, say 3$\sigma$ to 4$\sigma$, to be identified in this network.
In general, for the simulated network, the smaller $\alpha$, the larger $\beta$. On the other hand, the smaller $\alpha$, the smaller the type III error ($\kappa$). For the two classes of over-identification probabilities, in general, the chance of committing over-identification+ or over-identification- is directly related to the probability level $\alpha$: the larger $\alpha$, the larger the over-identification rates. Note that for $\alpha$ = 0.001, the over-identification cases are practically absent.
In addition, the MDB was computed for each measurement and for the four significance levels described above. The relationship between the MDB and the MIB is shown in Tab. 1. The higher the level of significance $\alpha$, the higher the probability of detecting an outlier, i.e. the smaller the MDB. This relationship, however, does not hold for the MIB. The MIB is slightly larger than the MDB for this geodetic network, except for the significance level of 10%, for which the MIB is approximately two times larger than the MDB. Therefore, due to the low redundancy of measurements in the network, it is not recommended to use a significance level of 10% for outlier identification purposes.
This example shows how to compute the MIB for the IDS case based on the MMC. Obviously, the MIB should be computed for a given probability of correct identification ($\gamma$) and significance level ($\alpha$).
Figure 4: Power of the test, type II and type III errors for each significance level α.
Figure 5: Over-identification probabilities for each significance level α.
5 Final Remarks
In this study, we highlighted that the Monte Carlo method (MMC) is a primary tool for deriving solutions to complex problems. We used the Monte Carlo method as a key tool for studying the IDS procedure. We emphasized that the method does not require real measurements. Rather, it is assumed that the random errors of the good measurements are normally distributed and can therefore be artificially generated by means of a PRNG. Thus, the only requirements are the geometrical network configuration (given by the design matrix); the uncertainty of the observations (which can be given by the nominal standard deviation of the equipment); and the magnitude intervals of the outliers.
We also highlighted that, in contrast to the well-defined theories of reliability, the IDS procedure is a heuristic method, and therefore there is no theoretical reliability measure for it. Hence, an analytical model with a tractable solution is unknown, and one needs to resort to the MMC. Based on the work by Rofatto et al. (2018b), we showed how to find the probability levels associated with IDS and how to obtain the MIB for each observation by means of the MMC for a given correct identification probability and significance level.
Acknowledgements
The authors would like to acknowledge the support from FAPEMIG (Fundação de Amparo à Pesquisa do Estado de Minas Gerais, research project 2018/7285). The authors would also like to extend their gratitude to the anonymous referee for valuable comments on a previous version of this text.
References
Altiok, T. and Melamed, B. (2007). Simulation Modeling and Analysis with Arena, 1 edn, Academic Press.
Arnold, S. (1981). The Theory of Linear Models and Multivariate Analysis, 1 edn, Wiley.
Aydin, C. (2012). Power of global test in deformation analysis, Journal of Surveying Engineering 138(2): 51-56. https://doi.org/10.1061/(ASCE)SU.1943-5428.0000064.
Baarda, W. (1968). A testing procedure for use in geodetic networks, Publications on Geodesy, New Series 2(5).
Box, G. E. P. and Muller, M. E. (1958). A note on the generation of random normal deviates, The Annals of Mathematical Statistics 29(2): 610-611. https://doi.org/10.1214/aoms/1177706645.
Förstner, W. (1983). Reliability and discernability of extended Gauss-Markov models, Seminar on Mathematical Models of Geodetic/Photogrammetric Point Determination with Regard to Outliers and Systematic Errors, Series A, Deutsche Geodätische Kommission, Munich, Germany, pp. 79-103.
Gamerman, D. and Lopes, H. F. (2006). Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference, 2 edn, Chapman and Hall/CRC.
Hawkins, D. M. (1980). Identification of Outliers, 1 edn, Springer Netherlands. https://doi.org/10.1007/978-94-015-3994-4.
Hekimoglu, S. and Koch, K. R. (1999). How can reliability of the robust methods be measured?, in M. O. Altan and L. Gründig (eds), Third Turkish-German Joint Geodetic Days, Vol. 1, Istanbul Technical University, Istanbul, Turkey, pp. 179-196.
Imparato, D., Teunissen, P. J. G. and Tiberius, C. C. J. M. (2018). Minimal detectable and identifiable biases for quality control, Survey Review. https://doi.org/10.1080/00396265.2018.1437947.
Kargoll, B. (2007). On the theory and application of model misspecification tests in geodesy, Doctoral thesis, University of Bonn, Landwirtschaftliche Fakultät, Bonn, Germany.
Klein, I., Matsuoka, M. T., Guzatto, M. P. and Nievinski, F. G. (2017). An approach to identify multiple outliers based on sequential likelihood ratio tests, Survey Review 49(357): 449-457. https://doi.org/10.1080/00396265.2016.1212970.
Koch, K. R. (1999). Parameter Estimation and Hypothesis Testing in Linear Models, 2 edn, Springer.
Koch, K. R. (2015). Minimal detectable outliers as measures of reliability, Journal of Geodesy 89(5): 483-490. https://doi.org/10.1007/s00190-015-0793-5.
Lehmann, R. (2012). Improved critical values for extreme normalized and studentized residuals in Gauss-Markov models, Journal of Geodesy 86(12): 1137-1146. https://doi.org/10.1007/s00190-012-0569-0.
Lehmann, R. (2013). On the formulation of the alternative hypothesis for geodetic outlier detection, Journal of Geodesy 87(4): 373-386.
Lehmann, R. and Scheffler, T. (2011). Monte Carlo based data snooping with application to a geodetic network, Journal of Applied Geodesy 5(3-4): 123-134. https://doi.org/10.1515/JAG.2011.014.
Matsumoto, M. and Nishimura, T. (1998). Mersenne twister: A 623-dimensionally equidistributed uniform pseudo-random number generator, ACM Transactions on Modeling and Computer Simulation 8(1): 3-30. https://dl.acm.org/citation.cfm?id=272991.
Rofatto, V. F., Matsuoka, M. T. and Klein, I. (2017). An attempt to analyse Baarda's iterative data snooping procedure based on Monte Carlo simulation, South African Journal of Geomatics 6(6): 416-435. http://dx.doi.org/10.4314/sajg.v6i3.11.
Rofatto, V. F., Matsuoka, M. T. and Klein, I. (2018a). Design of geodetic networks based on outlier identification criteria: An example applied to the leveling network, Bulletin of Geodetic Sciences 24(2): 152-170. http://dx.doi.org/10.1590/s1982-21702018000200011.
Rofatto, V. F., Matsuoka, M. T., Klein, I., Veronez, M., Bonimani, M. L. and Lehmann, R. (2018b). A half-century of Baarda's concept of reliability: A review, new perspectives, and applications, Survey Review. https://doi.org/10.1080/00396265.2018.1548118.
Stigler, S. M. (2002). Statistics on the Table: The History of Statistical Concepts and Methods, 1 edn, Harvard University Press.
Teunissen, P. J. G. (1998). Minimal detectable biases of GPS data, Journal of Geodesy 72(4): 236-244. https://doi.org/10.1007/s001900050163.
Teunissen, P. J. G. (2006). Testing Theory: An Introduction, 2 edn, Delft University Press.
Teunissen, P. J. G. (2018). Distributional theory for the DIA method, Journal of Geodesy 91(1): 59-80. https://doi.org/10.1007/s00190-017-1045-7.
Xu, P. (2005). Sign-constrained robust least squares, subjective breakdown point and the effect of weights of observations on robustness, Journal of Geodesy 79(1): 146-159.
Yang, L., Wang, J., Knight, N. and Shen, Y. (2013). Outlier separability analysis with a multiple alternative hypotheses test, Journal of Geodesy 87(6): 591-604. https://doi.org/10.1007/s00190-013-0629-0.