All content in this area was uploaded by Rüdiger Lehmann on Oct 19, 2016

Postprint of Journal of Geodesy (2014) 88(12)1117-1130, DOI:10.1007/s00190-014-0747-3

Transformation model selection by multiple hypotheses testing

Prof. Dr.-Ing. Rüdiger Lehmann

University of Applied Sciences Dresden

Faculty of Spatial Information

Friedrich-List-Platz 1

D-01069 Dresden

Tel +49 351 462 3146

Fax +49 351 462 2191

mailto:r.lehmann@htw-dresden.de

Abstract

Transformations between different geodetic reference frames are often performed such that first the

transformation parameters are determined from control points. If in the first place we do not know which of

the numerous transformation models is appropriate then we can set up a multiple hypotheses test. The

paper extends the common method of testing transformation parameters for significance, to the case that

also constraints for such parameters are tested. This provides more flexibility when setting up such a test.

One can formulate a general model with a maximum number of transformation parameters and specialize it

by adding constraints to those parameters, which need to be tested. The proper test statistic in a multiple

test is shown to be either the extreme normalized or the extreme studentized Lagrange multiplier. Both are

shown to outperform the more intuitive test statistics derived from misclosures. It is shown how

model selection by multiple hypotheses testing relates to the use of information criteria like AICc and

Mallows’ Cp, which are based on an information theoretic approach. Nevertheless, whenever comparable,

the results of an exemplary computation almost coincide.

Keywords: Coordinate transformation; Hypothesis test; Affine transformation; Similarity transformation;

Gauss-Markov model with constraints; Normalized Lagrange multiplier; Studentized Lagrange multiplier;

Akaike information criterion; Mallows’ Cp

1 Introduction

Today, geodetic information is often expressed with respect to various geodetic reference frames. To

convert such information (like coordinates of points) from one frame to another, coordinate transformations

are widely needed and applied in all branches of the modern geodetic profession. Those fields of application

range from satellite navigation (e.g. Zhang et al. 2012) to cadastral surveying (e.g. Deakin 1998, 2007) and

to photogrammetry (e.g. Goktepe and Kocaman 2010). The most important theoretical and practical

problems in these fields are solved.

Nonetheless, there are many new developments regarding the computation of coordinate transformation

parameters from a set of points with given coordinates in two different reference frames, known as control

points (synonymously referred to as identical or homologous points):

- There are new results regarding the transformation accuracy. Lehmann (2010) analyzes why and under which conditions the accuracy of transformed points is optimal in the barycenter of the control points.

- Nowadays, the adjustment of transformation parameters is often computed by robust methods (Kampmann 1996, Kanani 2000, Carosio et al. 2006, Ge et al. 2013). Here it is shown that outliers in coordinates of control points have less influence on the transformation parameters.

- A more recent development is the total least-squares approach (Schaffrin and Felus 2008, Mahboub 2012). Here, errors are assigned not only to the coordinates of control points, but also to the elements of the system matrix. However, based on the example of a planar similarity transformation, Neitzel (2010) shows that the total least-squares solution can be obtained easily from a rigorous evaluation of the Gauss–Helmert model.

In the following we restrict ourselves to three-dimensional (3D) reference frames. In geodesy, we use a

variety of 3D transformation models. The most important are (Andrei 2006):

- translation (three parameters)

- rotation (three parameters)

- rotation and translation (six parameters)

- rotation, one scale factor and translation (similarity transformation, seven parameters)

- rotation, two scale factors and translation (eight parameters)

- rotation, three scale factors and translation (nine parameters)

- affine transformation (twelve parameters)

Even more complex transformation models are used in geodesy, e.g. transformations using thin plate splines

(Donato and Belongie 2002). Alternatively we find piecewise approaches (Lippus 2004), where the region

covered by control points is partitioned. In each partition the transformation model is computed

independently and later the pieces are glued together into a single transformation.

In this paper, we are concerned with the following problem: Given a sufficiently large number of control

points, which transformation model should be selected? This problem belongs to a class of problems, which

is often referred to as model selection (Burnham and Anderson 2002). Often a strong preference for a

transformation model can be deduced from the relationship between the reference frames. For example, if

they are strictly related by a conformal mapping then the similarity transformation is a proper choice. But

often there is no such preference, and the proper model must be selected by means of the control points.

If one model contains some extra parameters with respect to the other, like rotation and translation versus

similarity transformation, the standard geodetic approach is to test statistically, if the additional parameters

are significant, i.e. if the estimated values of these parameters are significantly different from zero. This test

is often called v-test in geodesy (Teunissen 2000). Such an approach is e.g. used in Andrei (2006). In the case

of the rotation and translation versus the similarity transformation the task would be to test if the scale

parameter is significantly different from unity, which can be done by the v-test.

But often this is not possible because one model is set up with completely different parameters than

another. For example, the parameters of the spatial affine transformation are generally not scales and

rotation angles. Therefore, a comparison to the similarity transformation cannot be done by testing the

significance of some extra parameters. Making the v-test applicable would require (if possible at all) an

unnatural change of parameterization. In other cases we must select one out of more than two

transformation models.

In our paper we basically follow the hypothesis testing approach, but extend it, such that not only

parameters themselves can be tested, but also constraints on parameters. It will turn out that this case

arises e.g. in the problem of transformation model selection. Moreover, it is necessary to understand that

we are performing a multiple hypotheses test, which adheres to other laws than the classical hypothesis test

(Miller 1981). This is not always properly understood and leads to misconceptions. For example, the

significance test of the transformation model parameters performed by Ziggah et al. (2013) should have

been such a multiple test.

Other approaches to model selection used in statistics are based on information criteria (Burnham and

Anderson 2002). The idea is that more complex models can generally fit the data better, but this may result

in overfitting, i.e. unduly complex models partly fit the observation errors. Therefore, pure goodness of fit is

not a valid criterion for model selection, but a penalty term for model complexity needs to be introduced.

The most important information criteria are

- Akaike information criterion (AIC), see Akaike (1974),

- its alternate version (AICc, which means AIC with a correction for small data sets),

- Bayesian information criterion (BIC) and

- Mallows’ Cp, see Mallows (1973).

Given a set of candidate models for the data, the preferred model is the one with the minimum AIC or AICc

or BIC value, or the one with Cp value approaching the number of model parameters. The AICc has been used

for the transformation model selection by Felus and Felus (2009) because it is recommended for small sets

of observations. We will return to this approach in Sect. 8.

The paper is organized as follows: after introducing transformation equations and constraints for common

spatial coordinate transformations we set up a Gauss-Markov model (GMM) with constraints and solve the

model selection problem by a multiple hypotheses test. Following Lehmann and Neitzel (2013) it is shown

that the proper test statistics for such a test are the extreme normalized or externally studentized Lagrange

multipliers (LMs, also known as correlates in geodesy). We emphasize that the use of normalized LMs as test

statistics in geodesy goes back to Teunissen (1985). In a numerical example, it is shown how these tests work

and how they are superior to more intuitively defined test statistics. Finally, we comment on how

this model selection strategy relates to information criteria, exemplified for AICc.

2 Transformation equations and constraints

Given a number of points with coordinates in two different spatial reference frames (known as control

points), the problem is to find a good model for the transformation between these two frames. The given

coordinates may be affected by random observation errors with known stochastic properties.

We start from the 3D affine transformation, which obeys the following system of equations:

(1)

Here the coordinates of a point in the initial reference frame are mapped to the coordinates of the same point in the target reference frame. This transformation model has twelve transformation parameters: three translations and the nine elements of a 3×3 matrix. All other relevant models can be derived therefrom by imposing constraints on these twelve parameters as follows:

A nine parameter transformation is derived by imposing orthogonality of the rows of this matrix. In other words, its elements need to fulfill the following three constraints (e.g. Andrei 2006, chapter 2):

(2)

(3)

(4)

In this special case we can express (1) as

(5)

where the diagonal matrix holds three scale factors and the other factor is a rotation matrix (orthogonal matrix with determinant 1). One way of expressing this transformation by nine parameters is by three translations, three scale factors and three Eulerian rotation angles. Another nine parameter transformation can be defined by requiring the columns of the matrix in (1) to be orthogonal, rather than the rows. Here the diagonal matrix of scale factors and the rotation matrix are interchanged. The two transformations are essentially different.
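As a reading aid, the affine model (1), the row-orthogonality constraints (2)-(4) and the factorization (5) may be sketched as follows. All symbols (ξ, η, ζ for the initial frame, x, y, z for the target frame, t_i, t_ij, μ_i, R) are assumptions chosen for illustration; the original typesetting and the exact normalization of the constraints may differ.

```latex
% Sketch under assumed notation; not the original typesetting.
\begin{align*}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
  &= \begin{pmatrix} t_1 \\ t_2 \\ t_3 \end{pmatrix}
   + T \begin{pmatrix} \xi \\ \eta \\ \zeta \end{pmatrix},
  \qquad T = (t_{ij})_{i,j=1,2,3}
  && \text{cf. (1)} \\
\sum_{k=1}^{3} t_{ik}\, t_{jk} &= 0
  \quad \text{for } (i,j) \in \{(1,2),(1,3),(2,3)\}
  && \text{cf. (2)--(4)} \\
T &= \operatorname{diag}(\mu_1,\mu_2,\mu_3)\, R,
  \qquad R^{\mathsf T} R = I,\ \det R = 1
  && \text{cf. (5)}
\end{align*}
```

With this factorization, row i of T equals μ_i times row i of R, which is why the rows of T are mutually orthogonal while its columns generally are not.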

An eight parameter transformation is practically less important. It is motivated by the fact that horizontal

coordinates are sometimes determined by different technologies than vertical coordinates. Therefore, the

scales in horizontal directions may be equal, while the vertical scale is kept as an independent parameter (e.g.

Andrei 2006, chapter 3):

(6)

The specialization from (5) to (6) is equivalent to the constraint that both horizontal scales are equal, but since we want to impose this

constraint on the model (1), it must be expressed in terms of the transformation parameters of this model:

(7)

Also in (6) the diagonal matrix of scale factors and the rotation matrix may be interchanged, yielding a different

transformation model with eight parameters.

The seven parameter similarity transformation (in geodesy also known as spatial Helmert transformation) is

obtained by requiring all three scales to be equal. The system of transformation equations

now reads (e.g. Andrei 2006, section 1.2):

(8)

Here the scale factor and the rotation matrix can obviously be interchanged without changing the transformation. In order to restrict the

eight parameter transformation (6) to the similarity transformation (8), a further constraint must be added.

There are several equivalent possibilities, how such a constraint could read. We favor the following:

(9)

The reason is that this constraint makes sense even without (7): the vertical scale equals the quadratic mean of the two horizontal scales, or equivalently, its square equals the mean of their squares. This is instructive even if the two horizontal scales are different: A transformation using (9) without (7) deforms a sphere to an ellipsoid with one axis length being the quadratic mean of the other two axis lengths.

The spatial rotation and translation is obtained by requiring the common scale factor to equal unity:

(10)

In order to restrict the similarity transformation (8) to this transformation, we favor the following form of

the constraint that the scale factor equals unity:

(11)

The reason is that this constraint makes sense even without (7) and (9): The quadratic mean of the three scale factors equals unity, or equivalently the mean of their squares equals unity. This is instructive even if the three scale factors are different: A transformation using (11) without (7) and (9) deforms a sphere to an ellipsoid such that the space diagonal of the bounding cuboid preserves length.

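The pure spatial rotation is derived by additionally requiring that all three translation parameters vanish.

The pure spatial translation is obtained by requiring the transformation matrix to be the unit matrix. This could be achieved by

additionally imposing the constraints that its three main diagonal elements equal unity.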

(Remember that a rotation matrix with 1,1,1 on the main diagonal is uniquely determined as the unit

matrix.) But since such transformation models are rarely used in geodesy, the last two sets of constraints will

not be used in the sequel.
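The six constraints discussed above can be illustrated numerically. The following is a minimal sketch, assuming that the constraints are expressed through the rows of the 3×3 matrix of (1): normalized row products for (2),(3),(4) and squared row norms for (7),(9),(11). The function names, the dictionary keys and the exact algebraic form are assumptions for illustration and may differ from the formulation in the source.

```python
import math

def row_dot(T, i, j):
    """Scalar product of rows i and j of a 3x3 matrix given as nested lists."""
    return sum(T[i][k] * T[j][k] for k in range(3))

def constraint_misclosures(T):
    """Misclosures of the six constraints (assumed forms) for a 3x3 matrix T."""
    s = [row_dot(T, i, i) for i in range(3)]   # squared row norms = squared scales
    n = [math.sqrt(v) for v in s]              # row norms = scales
    return {
        # assumed counterparts of (2),(3),(4): rows mutually orthogonal
        "rows_1_2": row_dot(T, 0, 1) / (n[0] * n[1]),
        "rows_1_3": row_dot(T, 0, 2) / (n[0] * n[2]),
        "rows_2_3": row_dot(T, 1, 2) / (n[1] * n[2]),
        # assumed counterpart of (7): both horizontal scales equal
        "horizontal_scales_equal": s[0] - s[1],
        # assumed counterpart of (9): third scale is quadratic mean of the others
        "third_scale_vs_quadratic_mean": s[2] - 0.5 * (s[0] + s[1]),
        # assumed counterpart of (11): mean squared scale equals unity
        "mean_squared_scale_vs_unity": (s[0] + s[1] + s[2]) / 3.0 - 1.0,
    }

def rotation_z(angle):
    """Rotation matrix about the third axis (used only to build test input)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def scaled_rows(scales, R):
    """Multiply row i of R by scales[i], i.e. form diag(scales) * R."""
    return [[scales[i] * R[i][k] for k in range(3)] for i in range(3)]
```

For a pure rotation matrix all six misclosures vanish up to rounding; scaling only the third row leaves the first four at zero and violates the last two.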

If we want to decide if the general transformation like the affine transformation (1) is the correct model or a

more special one like the similarity transformation (8), then we have to decide, whether the constraints

restricting the generality of (1) to (8) are compatible with the given coordinates of the control points or not.

Due to inevitable observation errors we can in general not expect the estimated parameters to fulfill such

constraints exactly. But if the constraints show only small misclosures then we may assume that the special

model is sufficient to represent the relationship between both reference frames. It remains to be shown

how smallness of misclosures or a similar criterion is to be defined.

3 Gauss-Markov model

The general transformation problem can be formulated as a non-linear Gauss-Markov model (GMM)

(12)

where the observation vector holds the given coordinates of the control points and the parameter vector holds the unknown transformation model parameters, augmented by some unknown true values of coordinates (see below). The model function is a known non-linear operator mapping from the parameter space to the observation space. The error vector is an unknown random vector of normally distributed observation errors. The associated stochastic model

reads:

(13)

Here the weight matrix is known, and the a priori variance factor may be either

known or unknown.

It is also customary to formulate coordinate transformations as a Gauss-Helmert model. But this model can

be transformed into the GMM by the simple variable substitution given in (Koch 1999 p. 212). When testing

the compatibility of constraints for parameters, it is better to restrict ourselves to GMM because then we

can immediately use the results given by Lehmann and Neitzel (2013).

For the 3D affine transformation (1) the non-linear observation equations associated with a control point

having six observed coordinates read

(14)

Let a number of control points be given. Then the number of observations is six times the number of control points, and the vector of observations reads

(15)

Also, the number of GMM parameters is twelve plus three times the number of control points, and the vector of GMM parameters reads

(16)

Note the difference between transformation parameters and GMM parameters. The latter set also

comprises the unknown true values of coordinates in the initial frame.
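The bookkeeping of observations and GMM parameters can be made concrete with a small sketch. It assumes a particular ordering (twelve transformation parameters first, then the true initial-frame coordinates point by point; observations ordered as initial-frame then target-frame coordinates per point). This ordering and the function names are illustrative assumptions, not taken from the source.

```python
def affine_observations(params, p):
    """Evaluate the non-linear GMM function for p control points.

    params: [t1, t2, t3, t11, ..., t33 (row-wise)] followed by the 3*p
            unknown true coordinates in the initial frame (assumed order).
    Returns the 6*p error-free observations.
    """
    t = params[0:3]
    T = [params[3 + 3 * i: 6 + 3 * i] for i in range(3)]
    obs = []
    for j in range(p):
        xi = params[12 + 3 * j: 15 + 3 * j]
        obs.extend(xi)  # the three coordinates observed in the initial frame
        # the three coordinates observed in the target frame: products of
        # unknown parameters and unknown coordinates make the model non-linear
        obs.extend(t[i] + sum(T[i][k] * xi[k] for k in range(3)) for i in range(3))
    return obs

def redundancy(n_obs, n_params, n_constraints):
    """Redundancy of a GMM with constraints: each constraint adds one."""
    return n_obs - n_params + n_constraints
```

With six control points this gives 36 observations and 30 GMM parameters; adding six constraints yields a redundancy of 12.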

The model operator is clearly non-linear here. (The affine transformation model would be immediately linear only if

all coordinates in the initial frame could be treated as error-free.) In the sequel we exclude singular configurations of the control

points (like coplanarity), such that all parameters can be uniquely determined in the unconstrained GMM.

The constraints restricting the general transformation model to the special one can be formulated as

(17)

where the constraint operator is generally non-linear and maps the unknown GMM parameter vector to a known vector of constants. In fact, due to the non-linearity of the constraints (2),(3),(4),(7),(9),(11) the constraint operator related to the affine transformation problem is non-linear.

If the special transformation does not describe the relationship between the reference frames correctly then

we get true misclosures

(18)

After computing an estimate of the parameters of the unconstrained GMM we can insert it into the

constraints and come up with estimated misclosures:

(19)

Due to inevitable observation errors we get in general non-zero estimated misclosures

even if the true misclosures vanish. For example, the

misclosures related to constraints (2),(3),(4) can be interpreted as the sines of the three shear angles related

to the affine transformation.

4 Hypothesis test in the non-linear model

In geodesy the decision problem on the proper transformation model is generally posed as a statistical

hypothesis test. Opposing the special model represented by the GMM (12),(13) augmented by constraints

(17) to a general model represented by the unconstrained GMM is equivalent to opposing the null

hypothesis

(20a)

to the alternative hypothesis

(20b)

If H0 is to be rejected then we decide on the general model, otherwise the special model is used for the

transformation between the given reference frames.

The standard solution of the testing problem in classical statistics goes as follows (e.g. Tanizaki 2004 p. 49 ff):

1. A test statistic is introduced, which is known to assume extreme values if H0 does not hold true.

2. Under the condition that H0 holds true, the probability distribution of the test statistic is derived, represented by a cumulative distribution function (CDF).

3. A probability of type I decision error α (significance level) is suitably defined (say 0.01 or 0.05 or 0.10).

4. For one-sided tests a critical value is derived by evaluating the inverse CDF (also known as quantile function) of the test statistic under H0 at 1 − α. (For two-sided tests two critical values are needed, but this case does not show up in this investigation.)

5. The empirical value of the test statistic is computed from the given observations. If it exceeds the critical value then H0 must be rejected, otherwise we fail to reject H0.
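The five steps above can be sketched in a few lines. This is a minimal illustration, assuming a test statistic that is standard normally distributed under H0; the function name and the default distribution are assumptions, and any other null distribution with a quantile function could be substituted.

```python
from statistics import NormalDist

def one_sided_test(t_empirical, alpha, null_dist=NormalDist()):
    """Steps 2 to 5 for a one-sided test.

    null_dist: distribution of the test statistic under H0 (assumed here
               to be standard normal unless stated otherwise).
    Returns (critical_value, reject_h0).
    """
    # step 4: critical value from the quantile function at 1 - alpha
    critical = null_dist.inv_cdf(1.0 - alpha)
    # step 5: reject H0 if the empirical statistic exceeds the critical value
    return critical, t_empirical > critical
```

For example, `one_sided_test(2.0, 0.05)` compares the empirical value 2.0 with the 0.95-quantile of the standard normal distribution.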

In principle we are free to choose a test statistic. Even heuristic choices like

(21)

with some suitable norm are conceivable, although the statistical power (probability of rejecting H0

when it is false) of such a test might be suboptimal or even poor.

Consider for example the problem of opposing the affine transformation model (1) with the nine parameter transformation model (5). The estimated misclosures would be the vector of sines of the shear angles computed from the estimated affine transformation parameters. A possible test statistic would be the RMS or maximum absolute value of these estimated misclosures. In some instances the misclosure is directly interpreted as a deviation of a parameter from a fixed value: Consider for example the problem of opposing the similarity transformation model (8) with the rotation and translation model (10). The effective constraint is that the scale factor equals unity, and we can test if the estimated scale factor in (8) is significantly different from unity. This is equivalent to (21) because the deviation of the estimated scale factor from unity is nothing but the estimated misclosure (19) of the related constraint.

In geodesy we most often apply the likelihood ratio (LR) test (e.g. Tanizaki 2004 p. 54 ff). The rationale of the

LR test is provided by the famous Neyman–Pearson lemma (Neyman and Pearson 1933), which

demonstrates that under various assumptions such a test has the highest power among all competitors. It is

often applied even if we cannot exactly or only approximately make these assumptions in practice because

we know that the power is still larger than for rival tests (Teunissen 2000, Kargoll 2012).

Moreover, we can oppose the general model to a set of special models in parallel. This is equivalent to opposing H0 in (20a) to a set of multiple alternative hypotheses. Each of them proposes that only a subset of constraints is violated, or equivalently, that only a subset of the true misclosures is non-zero. In this way we come up with a multiple hypotheses test. It is performed by testing H0 against each alternative hypothesis in turn, and H0 is rejected if and only if it is rejected in any of these tests. However, these tests are not performed with the full probability of type I error α; instead α is portioned among the individual tests. The rationale for this is that the producer's risk must be portioned among the alternative hypotheses. Nonetheless, this treatment is fully valid only if the test statistics are statistically independent, which is often violated. Lehmann (2012) shows how to improve this in the case of geodetic outlier detection. A recommendable textbook on this topic is (Miller 1981).
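The portioning of α can be sketched as follows. This is a minimal illustration, assuming an equal split of α among the individual tests (a Bonferroni-type rule) and test statistics that are standard normal under H0 and tested two-sided. The function names are hypothetical, and the exact quantile probability used in the source may differ.

```python
from statistics import NormalDist

def portioned_critical_value(alpha, m, two_sided=True):
    """Critical value when alpha is split equally among m individual tests."""
    share = alpha / m                       # equal portion per alternative
    tail = share / 2.0 if two_sided else share
    return NormalDist().inv_cdf(1.0 - tail)

def multiple_test(test_statistics, alpha):
    """Reject H0 if it is rejected in any of the individual tests."""
    c = portioned_critical_value(alpha, len(test_statistics))
    rejected = [abs(t) > c for t in test_statistics]
    return c, any(rejected), rejected
```

The critical value grows with the number of alternatives, so each individual test becomes more conservative than a single test at level α.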

Consider for example the case that we want to test if

- the rotation and translation (10) is the correct model (null hypothesis) or

- the similarity transformation model (8) (first alternative hypothesis) or

- the eight parameter transformation model (6) (second alternative hypothesis) or

- the nine parameter transformation model (5) (third alternative hypothesis) or

- the affine transformation model (1) (fourth alternative hypothesis).

We set up the observation equations of the affine transformation and augment the resulting GMM by the constraints (2),(3),(4),(7),(9),(11). The first alternative hypothesis is then the hypothesis that (11) is in conflict with the observations and the rest of the constraints. The second alternative hypothesis is that (9) and (11) are in conflict with the observations and (2),(3),(4),(7). The third alternative hypothesis is that (7),(9),(11) are in conflict with the observations and (2),(3),(4), and finally the fourth alternative hypothesis is that all six constraints are in conflict with the observations (i.e. they produce non-zero true misclosures).

5 Linearization

In general non-linear models the desired CDF cannot be analytically derived. Not even the CDF of the estimates

can be analytically derived here. A numerical technique for deriving such distributions is the Monte Carlo

method, but it is often computationally costly.

The family of normal distributions enjoys the famous property of constituting a family of stable distributions,

i.e. linear combinations of normal random variables are also normally distributed. Therefore, in a linear

model, where the model and constraint operators are linear, the distributions of the estimated parameters, the estimated misclosures etc. are known to be normal too.

However, if the non-linear model is sufficiently close to a linear model, then the relevant distributions are still approximately normal. (Otherwise the representation of the solution by the estimate and possibly a covariance matrix associated with it would be meaningless.)

The common procedure in this case is to introduce approximate parameters, e.g. by solving 12 selected affine transformation equations neglecting observation errors. Then, with reduced observations and parameter increments relative to these approximate values, we get the linearized GMM

(22)

with linearized constraints

(23)

The coefficient matrices denote the Jacobian matrices of the model operator and of the constraint operator at the approximate parameters. Transposition of the constraint Jacobian is introduced here to come close to the standard geodetic notation, also used by Lehmann and Neitzel (2013).

The linearized true misclosures now read

(24)

In a 3D affine transformation it is customary to use

(25)

where the approximate transformation parameters are computed from four selected control points (not coplanar).
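Computing approximate affine parameters from four non-coplanar control points amounts to solving twelve linear equations, which decouple into three 4×4 systems sharing one coefficient matrix. The sketch below assumes error-free coordinates and uses a small hand-written Gaussian elimination; the function names are hypothetical and the approach is one possible way to obtain approximate values, not necessarily the one used in the source.

```python
def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system (coplanar control points?)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                  # back substitution
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def approximate_affine(initial, target):
    """Approximate translation t and matrix T from four point pairs.

    initial, target: sequences of four (x, y, z) tuples, treated as error-free.
    """
    A = [[1.0, p[0], p[1], p[2]] for p in initial]
    t, T = [], []
    for i in range(3):                            # one 4x4 system per target axis
        sol = solve_linear(A, [q[i] for q in target])
        t.append(sol[0])
        T.append(sol[1:])
    return t, T
```

If the four selected points are coplanar, the coefficient matrix is singular and the sketch raises a `ValueError`, which mirrors the requirement that the points must not be coplanar.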

The resulting linearized observation equations (14) and linearized constraints (2),(3),(4),(7),(9),(11) in the

form of (22),(23) are built of:

(26)

(27)

(28)

(29)

6 Hypothesis test in the linearized model

The problem is now to identify constraints which in the linearized GMM with constraints are in conflict with

the observations and the rest of the constraints, indicating that the transformation model is too special. The

standard approach is to test, if all constraints are in conflict with the observations. The hypotheses to be

tested here, read

versus

(30)

In the case developed in the last section this would mean discriminating between the two models of the

rotation & translation (H0 is true) and of the affine transformation (the alternative hypothesis is true).

As Lehmann and Neitzel (2013) have shown, it is possible to identify also conflicting subsets of constraints. If

e.g. only the last three constraints (7),(9),(11) are conflicting then the nine parameter transformation would

be the transformation model of choice. This is equivalent to eliminating the conflicting constraints. Since we

have more than two options how to build subsets of constraints, this requires a multiple test.

The most appealing layout of such a multiple test would be to test if an individual constraint is in conflict with the observations and the rest of the constraints. The drawback is that after eliminating one of the first five constraints, the resulting transformation does not have a name in geodesy. However, this does not preclude applying such a transformation in practice.

Since a priori we do not know which of the considered six constraints is the best candidate for elimination,

we need to test the compatibility of all individual constraints in parallel. Following Lehmann and Neitzel

(2013) the test statistic of such a test derived from the classical likelihood ratio is

- either the extreme normalized Lagrange multiplier (LM)

(31)

- or the extreme externally studentized Lagrange multiplier (LM)

(32)

The former should be used if the a priori variance factor is known and the latter should be used otherwise. For

convenience, the notation of (Lehmann and Neitzel 2013, eqs. 5.13, 5.14, 5.75, 5.76) is adopted here as

follows:

Here each LM estimate relates to one constraint and is obtained when solving the fully constrained GMM. In (31) it is divided by its standard deviation computed with the a priori variance factor, in (32) by its standard deviation computed with an estimate of the variance factor. This estimate is the common best quadratic unbiased estimate, but in the semi-constrained GMM. This means that for computing the estimate of the variance factor the respective constraint must be dropped.

From (Lehmann and Neitzel 2013, eqs. 5.53-5.56) the following distributional results can be adopted:

(33a)

(33b)

(34a)

(34b)

Here and denote the central and non-central Student’s t-distribution with degrees of freedom

and non-centrality parameter . Both in (33b) and (34b) the non-centrality parameter reads

(35)

Practically, the true misclosure in the -th constraint is unknown, and perhaps also the a priori variance

factor . The cofactor

of

is always known.

After choosing a probability of type I error α, a critical value must be taken from either of the distributions (33a) or (34a), but with the quantile probability adapted to the multiple test in both cases, because the actual test statistics in the multiple test are the extreme LMs (see section 4). If the critical value is exceeded by the related test statistic in (31) or in (32) then we are inclined to reject H0. This means here that the rotation & translation transformation model is not adequate. We should now drop the constraint at which the maximum in (31) or (32) is attained.

Not unlike the common practice in geodesy when dealing with extreme normalized or studentized residuals

in outlier detection, it is possible to iterate the procedure, until no further conflicting constraint can be

identified.
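The selection of the constraint to be dropped and the iteration can be sketched as follows. This is a minimal illustration, assuming that the LM estimates and their standard deviations are supplied by some adjustment routine; `solve` is a hypothetical callback standing in for that routine, and the critical value is kept fixed over the iterations for simplicity, which a careful implementation might revise.

```python
def extreme_normalized_lm(lm_estimates, lm_std):
    """Largest absolute ratio of LM estimate to its standard deviation.

    Returns (statistic, index of the constraint where it is attained).
    """
    ratios = [abs(k) / s for k, s in zip(lm_estimates, lm_std)]
    idx = max(range(len(ratios)), key=ratios.__getitem__)
    return ratios[idx], idx

def select_constraints(solve, constraints, critical_value):
    """Iteratively drop the most conflicting constraint.

    solve(active) is a hypothetical callback returning (lm_estimates, lm_std)
    for the GMM constrained by the list `active`.
    """
    active = list(constraints)
    while active:
        estimates, std = solve(active)
        stat, idx = extreme_normalized_lm(estimates, std)
        if stat <= critical_value:
            break               # no further conflicting constraint identified
        active.pop(idx)         # drop the constraint attaining the maximum
    return active
```

The constraints remaining in the returned list define the selected transformation model.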

7 Example: 3D transformation based on six control points

7.1 Null hypothesis is true

We illustrate and investigate the procedure in a setup of six control points forming a flattened octahedron,

see Table 1 and Figure 1. The height of the octahedron will be varied.

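The true transformation parameters are defined as zero, except for the three main diagonal elements of the transformation matrix, which equal a common scale factor. This means that the

similarity transformation model (8) is the proper model, except when this scale factor equals unity. In this case the rotation and

translation transformation model (10) is adequate. This setup has 36 observations (six coordinates for each of the six control points). (The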

frames could also have been rotated or translated with respect to each other without changing the

subsequent results because the constraints for “no rotation” and “no translation” are not tested in this

section.)

A priori we assume not to know which transformation model is adequate. Therefore we use the affine

transformation model (1) with 30 model parameters (twelve transformation parameters and 18 true coordinates) and try to specialize it by applying

constraints (2),(3),(4),(7),(9),(11) as derived above. This yields a total redundancy of 36 − 30 + 6 = 12.

Let us start with the case that the true scale factor equals unity. Here H0 is true and should be rejected by the multiple test only

with probability α.

Observations (14) are generated in two different ways, by adding pseudo random noise according to (13)

with (identity weight matrix), both with and with . From this we compute the

test statistics (31),(32) and compare them with their critical values, deciding if H0 must be rejected or not. In

a Monte Carlo approach we repeat this procedure 10^6 times and compute the relative frequency of H0 being

rejected. This value is expected to be equal to α because in this simulation we tacitly know that H0 is in fact

true. The results are given in Table 2. It is seen that relative frequencies for normalized LM are a little bit

smaller than expected, for studentized LM a little bit larger.

There are three potential causes why these relative frequencies are perhaps not exactly equal to α:

1. 10^6 repetitions are not enough. This cause can be ruled out by re-starting the procedure with different pseudo random numbers and comparing the results. If the relative frequencies differ from α in the same way then 10^6 repetitions are enough, otherwise the number must be increased.

2. The original model is non-linear, and consequently the distributional results (33a)-(34b) are at best approximately valid. This cause can be ruled out by re-computing with larger observation errors, which makes the effect of the nonlinearity worse. If the relative frequency now differs substantially more from α then the linearization is to blame. This can also be done in the reverse direction.

3. When computing the critical values of (31) or (32), the simple portioning of α onto the constraints by using an equal share of α as significance level is only valid if all individual test statistics are statistically independent. This is at best approximately true.

Although the results in Table 2 have been computed with two different noise levels and at this opportunity also with

different pseudo random numbers, the deviation from α is practically the same. This shows that cause 3

produces the observed effect. However, at least in this small-scale example, the deviations from α seem to

be tolerable.
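The Monte Carlo idea can be illustrated with a toy simulation. The sketch below does not reproduce the transformation example: it assumes m statistically independent standard normal test statistics and an equal split of α, and it uses far fewer repetitions than the study above. All names and settings are illustrative assumptions.

```python
import random
from statistics import NormalDist

def rejection_rate(alpha, m, repetitions, seed=1):
    """Relative frequency of rejecting a true H0 in a toy multiple test.

    Each repetition draws m independent standard normal statistics and
    rejects H0 if the extreme absolute value exceeds the critical value
    obtained with an equal share alpha / m (two-sided).
    """
    rng = random.Random(seed)                     # seeded for reproducibility
    critical = NormalDist().inv_cdf(1.0 - alpha / (2.0 * m))
    rejected = 0
    for _ in range(repetitions):
        extreme = max(abs(rng.gauss(0.0, 1.0)) for _ in range(m))
        if extreme > critical:
            rejected += 1
    return rejected / repetitions
```

For independent statistics the expected rate is 1 − (1 − α/m)^m, which is slightly below α; with correlated statistics, as in the transformation example, the rate may deviate from α in either direction.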

Table 1: True coordinates of control points used in section 7 (flattened octahedron)

A    0    0    0
B    0    100  0
C    100  100  0
D    100  0    0
E    71   71
F    71   71

Figure 1: Configuration of control points used in section 7 (flattened octahedron; axes ξ, η, ζ, control points A to F, height h)

Table 2: Various values of the relative frequency of H0 being rejected, when it is true

Normalized LM (31):

                      critical value     =10                 =100
significance level    from (33a)         ratio of rejected   ratio of rejected
0.01                  3.19               0.008  0.009        0.010  0.009
0.02                  2.98               0.017  0.017        0.019  0.019
0.05                  2.69               0.043  0.043        0.049  0.048
0.10                  2.45               0.084  0.083        0.095  0.094

Studentized LM (32):

                      critical value     =10                 =100
significance level    from (34a)         ratio of rejected   ratio of rejected
0.01                  3.80               0.012  0.013        0.013  0.013
0.02                  3.48               0.023  0.024        0.027  0.026
0.05                  3.06               0.055  0.057        0.064  0.063
0.10                  2.73               0.107  0.108        0.123  0.122

7.2 Null hypothesis is false

Next, we consider the case that H0 is false. The test should now reject H0. The ability to reject H0 when it is false is known as the power of the test. The power is usually smaller when H0 is only slightly violated (small true misclosure) and larger otherwise. This relationship is called the power function of the test.

More specifically, the power function gives the probability that H0 is rejected by the test as a function of the true misclosure. It is computed from the CDF of the test statistic, see (33b),(34b). Software packages that implement the quantile function of the non-central Student’s t-distribution in (34b) are less widespread. In MATLAB we find the function nctinv.

In our simulation study we are in the position to implement a true misclosure into the model. As a test we

implement

, i.e. a true misclosure . All other

constraints remain valid. As a test statistic we again use only the extreme normalized and externally

studentized LMs, i.e. we assume not to know which constraint is violated.

The results in terms of the two power functions are given in Figure 2. We restrict ourselves to

here and to , i.e. , because does not change when changes sign. Firstly, we observe

that the power increases with the significance level α. This is the typical behavior because a higher α means that H0 is more often rejected and therefore less often falsely accepted. Secondly, we observe that the power increases with the magnitude of the true misclosure. This is also expected because a more distinct separation between H0 and the alternative hypothesis makes type II decision errors (failures to reject a false H0) less probable.
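Both observations, and the symmetry with respect to the sign of the true misclosure, can be reproduced with a minimal sketch for a single normalized statistic. This is a simplification: the power functions in Figure 2 refer to the extreme LM over all constraints and follow from (33b),(34b), whereas the sketch assumes one standard normal statistic whose expectation is shifted by a hypothetical non-centrality delta.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_two_sided(delta, alpha):
    """Power of a two-sided test based on one normalized statistic.

    delta: expectation of the statistic under the alternative
           (non-centrality; hypothetical name), zero if H0 is true.
    """
    c = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    # H0 is rejected if the statistic falls above c or below -c.
    return (1.0 - STD_NORMAL.cdf(c - delta)) + STD_NORMAL.cdf(-c - delta)


if __name__ == "__main__":
    for alpha in (0.01, 0.05, 0.10):
        row = [round(power_two_sided(d, alpha), 3) for d in (0, 1, 2, 3, 4)]
        print(alpha, row)
```

At delta = 0 the function returns α itself, which is the probability of a type I decision error.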

The difference between the power of the test statistics (31),(32) is clearly seen. The normalized LMs require the variance factor to be known. If it is unknown then we must resort to studentized LMs with a significant loss of test power. The reason for this is a typical smearing effect: If constraint (11) is in effect, the inconsistent scales between both coordinate frames are partly interpreted as observation errors, increasing the estimated residuals. In this way the variance factor is mostly overestimated. This makes the estimated standard deviations in (32) too large. Consequently, the studentized test statistic (32) becomes too small, such that it does not exceed its critical value as often as the normalized test statistic (31) does.


Finally, we see nearly no effect on the power of the test statistics (31) and (32) when changing h, and in this way changing the conditioning of the normal system of the GMM. (Note that for h = 0 the system is singular due to the coplanarity of all control points.)

7.3 Using the extreme normalized misclosure as a test statistic

Test statistics need not be restricted to likelihood ratios, but could also be defined by plausibility reasoning. See (21) and the discussion below. Such a test statistic, not uncommon in practical geodesy, could be the extreme normalized or studentized estimated misclosure. In this subsection the extreme normalized estimated misclosure

(36)

is defined as a test statistic and will be considered as a substitute for the extreme normalized LM in (31). It uses the estimated misclosures of the unconstrained solution, which is the affine transformation model (14). A large value of such a test statistic would also indicate an incompatibility of the constraint for which the maximum in (36) is attained. The critical values of (36) are the same as those of (31) and can be taken from Table 2.

The resulting power functions are displayed in Figure 2 and must be compared to those of the test statistic (31), where the known variance factor has also been used. We see a similar behavior, but recognize a great loss of power when h is small, i.e. for nearly ill-conditioned normal systems. Here the misclosure is subject to a smearing effect: Incompatible scales of the coordinate frames produce large magnitudes of misclosures not primarily in the misclosure of the violated constraint (11), but the effect also smears over to other misclosures, such that the test statistic (36) often falls below the critical value.


Figure 2: Power functions related to the test statistics (31),(32),(36) for the example of section 7


7.4 Identification of the conflicting constraints

If the test statistic exceeds the critical value, then we are inclined to reject H0. This means that the model of rotation & translation is too special to describe the relationship between the control points. The next step would be to identify the constraint in conflict with the observations and the rest of the constraints. We can only hope that this is the constraint for which the maximum in (31),(32) or (36) is attained. We now investigate whether this is true.

Firstly, note that in this respect there is no difference between normalized and studentized LMs (and also not between normalized and studentized misclosures, although the latter values have not been used here). The reason is that these values differ only in that the studentized values are computed with the estimated variance factor, while for the normalized values the known value is used. Therefore, normalized and studentized values are proportional to each other and the maxima in (31) and (32) are attained at the same index. Consequently, studentized values are disregarded below.
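The proportionality argument can be checked with a few lines of code. The sketch below uses made-up normalized values and a hypothetical pair of known and estimated standard deviations (sigma, sigma_hat); it only shows that rescaling all values by one common positive factor leaves the index of the extreme value unchanged.

```python
def index_of_extreme(values):
    """Index (0-based) at which the largest absolute value is attained."""
    return max(range(len(values)), key=lambda k: abs(values[k]))


def studentize(normalized, sigma, sigma_hat):
    """Rescale normalized values by the common factor sigma / sigma_hat."""
    return [v * sigma / sigma_hat for v in normalized]


if __name__ == "__main__":
    normalized = [0.4, -1.1, 0.7, 0.2, -2.9, 3.4]  # made-up values
    studentized = studentize(normalized, sigma=1.0, sigma_hat=1.3)
    print(index_of_extreme(normalized), index_of_extreme(studentized))
```

Both calls return the same index, whatever positive values are chosen for sigma and sigma_hat.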

We use the computation of the last section for only. The relative frequencies, how often the maxima

of (31) and (36) are attained at a certain constraint, are displayed in Figure 3, regardless of their value, i.e. whether or not they exceed any critical value. In this way the investigation becomes independent of α. We expect those maxima to be attained mostly at the index of the violated constraint (11). If this violation is only weak, then these maxima are not attained primarily at this index. But unsurprisingly, the stronger (11) is violated, the more often the maxima are attained there. At a true misclosure at which H0 is almost certainly rejected, we almost certainly accept the correct alternative hypothesis.

Comparing the left and right parts of Figure 3 we see that for LMs we get the correct result more often than for

misclosures, which confirms that LMs are the superior test statistics. For the maximum in (31) is

oftentimes attained at , such that the constraint (9) is sometimes wrongly identified as conflicting. For

misclosures we find wrong identifications in all five other constraints.

Figure 3: Relative frequencies of rejection of a constraint, for which the maximum in (31), left subplot, and

(36), right subplot, is attained. Note that the curves for constraints 1-4 are practically overlapping


8 The use of information criteria

8.1 Comparison to Akaike information criterion

As an alternative to transformation model selection by hypothesis testing we employ the AICc information

criterion as proposed by Felus and Felus (2009). It is defined as (Burnham and Anderson 2002)

(37)

where the symbols for the numbers of observations, constraints and GMM parameters are as before, and WSSR is the weighted sum of squared residuals in the GMM. The first summand measures goodness of fit and the second is a penalty term for model complexity. The model with the smallest AICc is selected. Note that the standard formulae for AIC and AICc do not involve constraints. It is clear, however, that the number of constraints must be subtracted from the number of parameters to get the number of effective parameters as a measure of model complexity.
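As an illustration of how such a criterion is evaluated, the following sketch uses the least-squares form of AICc given by Burnham and Anderson (2002), written with two summands and with the number of effective parameters k taken as parameters minus constraints. This is an assumed form; it may differ from (37) in detail, for instance in whether the variance factor is counted as an additional parameter. All function and argument names are hypothetical.

```python
import math


def effective_parameters(n_params, n_constraints):
    """Number of GMM parameters minus number of constraints."""
    return n_params - n_constraints


def aicc(wssr, n_obs, n_params, n_constraints):
    """Least-squares AICc with a goodness-of-fit and a penalty summand.

    Assumed form (Burnham and Anderson 2002, least-squares case):
        n*ln(WSSR/n) + 2*k*n/(n - k - 1),  k = effective parameters.
    Whether the variance factor is counted in k is left open here.
    """
    k = effective_parameters(n_params, n_constraints)
    if n_obs - k - 1 <= 0:
        raise ValueError("AICc needs n_obs > k + 1")
    fit = n_obs * math.log(wssr / n_obs)
    penalty = 2.0 * k * n_obs / (n_obs - k - 1)
    return fit + penalty


def select_model(candidates):
    """Return the name of the candidate with the smallest AICc.

    candidates: dict name -> (wssr, n_obs, n_params, n_constraints)
    """
    return min(candidates, key=lambda name: aicc(*candidates[name]))


if __name__ == "__main__":
    # Made-up WSSR values; n_obs=36 and n_params=30 are illustrative only.
    models = {
        "rotation+translation": (14.0, 36, 30, 6),
        "one constraint dropped": (12.5, 36, 30, 5),
        "eight parameters": (11.8, 36, 30, 4),
    }
    print(select_model(models))
```

Dropping a constraint always lowers WSSR, so the first summand decreases, but the penalty summand grows; the model is only preferred if the gain in fit outweighs the penalty.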

We apply this criterion to the example of section 7. Note that AICc and similar criteria do not use a possibly known variance factor. Therefore, we can fairly compare AICc only to the extreme studentized LM in (32). In the following we restrict ourselves to AICc computed for
- the rotation and translation model, i.e. with all six constraints, denoted as AICc(0),
- all six transformation models with five constraints, denoted as AICc(1)…AICc(6), and
- the eight parameter transformation model, i.e. with four constraints, denoted as AICc(7), as an example of a model that overfits in any case.

First of all, if the true misclosure is zero then rotation and translation is the proper model, and we expect AICc(0) to be the smallest of the AICc values. This happens with a relative frequency of 0.94…0.95, as can be seen in Figure 4 at zero true misclosure. It means that the AICc effectively corresponds to a significance level of about 0.05…0.06 (although it is of course not a hypothesis test). If constraint (11) is violated then we expect that AICc(6) falls below AICc(0). For each model we display the relative frequency of selection according to AICc in Figure 4 together with the relative frequency of selection according to the hypothesis test based on the extreme studentized LM in (32). The latter frequencies are denoted analogously to the notation of AICc, i.e. separately for the rotation and translation model and for the six models with five constraints. Most notably, it is striking how

close the corresponding curves in Figure 4 are. Thus, AICc almost coincides with at a significance level

. The small difference is in favor of . (Remember that here selects the appropriate model.)

In agreement with the results of section 7.4, practically the only other model with five constraints that is selected here is the model without constraint (9). It is wrongly selected here, but this happens only in rare cases.

It is interesting to note that not only the model selections show equal frequencies in Figure 4, but also

coincide very well. For example, for the relative frequency of coincident model selections,

may be right or wrong, is 0.985. Here we have restricted ourselves to because the same figure for

or would be nearly indistinguishable from Figure 4.

In summary, model selection by AICc is very much the same as by extreme studentized LMs with significance level α = 0.05 here. This will be appreciated by users who are uncertain about the choice of α, because AICc does not require such a choice. Also, no critical value needs to be computed. This is welcome if the distribution of the test statistic is analytically intractable or unknown. On the other hand, if for example a


user wants the model with fewer parameters to be supported more strongly than AICc does, then she/he cannot simply choose a smaller α. Possibly, another information criterion like BIC can be employed.

Figure 4: Relative frequencies of selection of a model according to the AICc (37) [solid curves] and the

hypothesis test based on extreme studentized LMs (32) with significance level α=0.05 [dotted curves]

8.2 Comparison to Mallows’ Cp

Another common information criterion is Mallows’ Cp (Mallows 1973). So far it is less popular in geodesy, at

least under this name, but has been employed e.g. by Mahboub (2014). It is derived from an advanced

statistic as

(38)

with notation as in (37). As for AIC and AICc, the standard formula for Cp does not involve constraints. It is clear, however, that the number of constraints must be subtracted from the number of parameters to get the number of effective parameters, for which often the symbol p is used. The expectation of Cp is known to be equal to the number of effective parameters, possibly plus a positive bias term due to lack of fit. Therefore, the model with the smallest value of Cp that is still close to the number of effective parameters is selected.

Unlike AIC and AICc, Mallows’ Cp is also able to use the variance factor, should it be known. Therefore it is interesting to compare it to the extreme normalized LM in (31). We repeat the computations of the last subsection, coming up with Cp(0)…Cp(7) instead of AICc(0)…AICc(7) and with the test statistic (31) instead of (32).

To start with, we naively base the model selection on Mallows’ Cp in such a way that each time the model with Cp closest to the number of effective parameters is selected. The results are disappointing: if H0 is true, Mallows’ Cp selects the correct model only with a relative frequency of 0.06. Instead, the overfitting eight parameter transformation model is wrongly selected most often. The reason is that in any case the values of Cp scatter perfectly around the number of effective parameters, but the overfitting models show smaller scattering, such that there is a bigger chance to get a value of Cp closer

to . For Mallows’ Cp most often selects the correct model, but by far not as often as the

other methods.

This shows that it is crucial not to select the model “with Cp closest to the number of effective parameters”, as done before, but “with the smallest value of Cp that is still close to the number of effective parameters”. We must therefore define a permissible interval of values close to the number of effective parameters. The disadvantage is that it may happen that none of the values of Cp falls into this interval.


Some investigation using the numerical example at hand shows that the interval is optimal. It

is the smallest interval that ensures that with a relative frequency of 0.95 at least one value of Cp is permissible. It

is therefore used below.
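The selection rule with a permissible interval can be sketched as follows. The sketch assumes the common textbook form of Mallows’ Cp, WSSR divided by the variance factor, minus the number of observations, plus twice the number of effective parameters, which may differ from (38) in detail. The symmetric tolerance tol around the number of effective parameters is a hypothetical stand-in for the permissible interval, and all names are illustrative.

```python
def mallows_cp(wssr, sigma2, n_obs, k):
    """Mallows' Cp in its common textbook form WSSR/sigma2 - n + 2k.

    sigma2: variance factor, known or estimated (assumption of this sketch).
    k:      number of effective parameters (parameters minus constraints).
    """
    return wssr / sigma2 - n_obs + 2 * k


def select_by_cp(candidates, sigma2, n_obs, tol):
    """Smallest Cp among the models whose Cp lies within k +/- tol.

    candidates: dict name -> (wssr, k).
    Returns None if no model is permissible.
    """
    permissible = {}
    for name, (wssr, k) in candidates.items():
        cp = mallows_cp(wssr, sigma2, n_obs, k)
        if abs(cp - k) <= tol:
            permissible[name] = cp
    if not permissible:
        return None
    return min(permissible, key=permissible.get)


if __name__ == "__main__":
    # Made-up WSSR values for three candidate models.
    models = {"model 0": (12.5, 24), "model 6": (10.8, 25), "model 7": (10.1, 26)}
    print(select_by_cp(models, sigma2=1.0, n_obs=36, tol=2.0))
    print(select_by_cp(models, sigma2=1.0, n_obs=36, tol=0.05))
```

The second call shows the disadvantage mentioned above: with a narrow interval it can happen that no model is permissible, in which case the function returns None.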

The results are displayed in Figure 5. While model selection based on the extreme normalized LM (31) shows almost the same, but slightly superior, results as that based on the extreme studentized LM (32) (compare with Figure 4), the results with Mallows’ Cp are worse. If H0 is true, Mallows’ Cp selects the correct model only with a relative frequency of 0.87 (curve Cp(0) in Figure 5). This is much better than in the naïve approach, but not as good as for the other methods. The latter also applies to

the cases with : For example, for Mallows’ Cp selects the correct model only in

one out of two cases (curve Cp(6) in Figure 5). In one out of three cases constraint (9) is wrongly identified as

conflicting (curve Cp(5) in Figure 5). This shows that Mallows’ Cp is not a good alternative to multiple hypotheses testing. One reason could be that here the number of effective parameters does not vary much, only from 24 to 26. Therefore, Mallows’ Cp may be tried again in cases where all coordinates of control points in one system are considered as error-free quantities, such that the number of effective parameters shows a stronger relative variation, here from 6 to 8.

Figure 5: Relative frequencies of selection of a model according to Mallows’ Cp (38) [solid curves] and the

hypothesis test based on extreme normalized LMs (31) with significance level α=0.05 [dotted curves]

9 Conclusions

In geodesy we oftentimes need to transform points between two different coordinate frames. We are

routinely faced with the situation that we do not know which transformation model should be selected. It is

good tradition in geodesy to base such a decision on a statistical hypothesis test. If we consider more than

two models then the test must be a multiple hypotheses test.

It is often not possible to simply test the significance of parameters because the considered models may have different sets of parameters. For example, there is no natural implementation of a scale parameter into the spatial affine transformation. Therefore, it is better to start with a general transformation model and try to specialize it by adding constraints. The compatibility of those constraints needs to be tested. Such a test can be intuitively based on the estimated misclosures of the constraints in the unconstrained model. From (Lehmann and Neitzel 2013) we know that better test statistics should be based on the Lagrange multipliers (LMs, also known as correlates in geodesy) of the constrained solution. They should assume the form of


either the extreme normalized or the extreme externally studentized LM. The second test statistic comes

into effect if the a priori variance factor is unknown.

We worked out an example of a 3D coordinate transformation based on six control points. Here it is shown

that the LM-based test statistics have more statistical power than those based on the estimated misclosures.

This advantage is most pronounced if the configuration of the control points is poor. Moreover, in a multiple test the test statistics based on LMs more often identify the correct alternative model. They can be recommended not only for transformation problems, but for all geodetic adjustment problems posed as Gauss-Markov models with constraints.

For transformation model selection the AICc and Mallows’ Cp are considered as an alternative to the multiple

hypotheses test. It turns out that in the exemplary case the AICc almost always selects the same model as

the extreme externally studentized LM does. This is remarkable because the theoretical background is

different. Mallows’ Cp was also successfully applied, but here the results are inferior to the other methods.

This may be due to the fact that there is only a small difference in the number of parameters of the models

to be selected.

References

Akaike H (1974) A New Look at the Statistical Model Identification. IEEE Transactions on Automatic Control,

19:716-723.

Andrei CO (2006) 3D affine coordinate transformations. Masters of Science Thesis in Geodesy No. 3091.

School of Architecture and the Built Environment Royal Institute of Technology (KTH) Stockholm,

Sweden

Burnham KP, Anderson DR (2002) Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2nd ed., Springer New York, ISBN 0-387-95364-7

Carosio A, Piras M, Salvini D (2006) The Forward Search Method Applied to Geodetic Transformations.

Studies in Classification, Data Analysis, and Knowledge Organization, 2006, Part III, 181-188. DOI

10.1007/3-540-35978-8_21

Deakin RE (1998) 3D coordinate transformations. Surveying and Land Information Systems 58(4):223-234

Deakin RE (2007) Coordinate Transformations for Cadastral Surveying. School of Mathematical and Geospatial Sciences, RMIT University, pp 1-33

Donato G, Belongie S (2002) Approximate Thin Plate Spline Mappings. Lecture Notes in Computer Science

2352, Springer Berlin Heidelberg pp 21-31. DOI 10.1007/3-540-47977-5_2

Felus YA, Felus M (2009) On Choosing the Right Coordinate Transformation Method. Proc FIG Working Week

2009: Surveyors Key Role in Accelerated Development. Eilat, Israel, 3-8 May 2009

Ge Y, Yuan Y, Jia N (2013) More efficient methods among commonly used robust estimation methods for

GPS coordinate transformation. Survey Review 45(330):229-234. DOI

10.1179/1752270612Y.0000000028

Goktepe A, Kocaman E (2010) Analysis of camera calibrations using direct linear transformation and bundle

adjustment methods. Scientific Research and Essays 5(9):869-872

Kampmann G (1996) New Adjustment Techniques for the Determination of Transformation Parameters for

Cadastral and Engineering Purposes. Geomatica 50(1):27-34


Kanani E (2000) Robust Estimators for Geodetic Transformations and GIS. PhD thesis, Institut für Geodäsie

und Photogrammetrie, ETH Zürich

Kargoll B (2012) On the Theory and Application of Model Misspecification Tests in Geodesy. Deutsche Geodätische Kommission Reihe C, Nr. 674, München

Koch KR (1999) Parameter Estimation and Hypothesis Testing in Linear Models. 2nd ed., Springer Berlin. ISBN

978-3-662-03976-2

Lehmann R (2010) Im Schwerpunkt der Anschlusspunkte – Zur Genauigkeit geodätischer

Koordinatentransformationen (In the barycentre of the control points – on the accuracy of geodetic

coordinate transformations). Allgemeine Vermessungsnachrichten 4/2010. VDE-Verlag Offenbach

Lehmann R (2012) Improved critical values for extreme normalized and studentized residuals in Gauss-Markov models. J Geod 86(12):1137–1146

Lehmann R, Neitzel F (2013) Testing the compatibility of constraints for parameters of a geodetic adjustment

model. J Geod 87: 555-566. DOI 10.1007/s00190-013-0627-2

Lippus J (2004) Transformation of coordinates using piecewise conformal mapping. Journal of Geodesy 78(1-2):40-46. DOI 10.1007/s00190-003-0364-z

Mahboub V (2012) On weighted total least-squares for geodetic transformations. J Geod 86(5):359-367. DOI

10.1007/s00190-011-0524-5

Mahboub V (2014) Variance component estimation in errors-in-variables models and a rigorous total least squares approach. Stud Geophys Geod 58:17-40. DOI 10.1007/s11200-013-1150-x

Mallows CL (1973) Some Comments on CP. Technometrics 15(4):661-675. DOI 10.2307/1267380

Miller RG (1981) Simultaneous statistical inference. Springer-Verlag New York Inc. ISBN 0-387-90548-0

Neyman J, Pearson ES (1933) On the Problem of the Most Efficient Tests of Statistical Hypotheses.

Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences

231(694–706):289–337. DOI:10.1098/rsta.1933.0009

Neitzel F (2010) Generalization of total least-squares on example of unweighted and weighted 2D similarity

transformation. Journal of Geodesy 84(12):751-762. DOI 10.1007/s00190-010-0408-0

Schaffrin B, Felus YA (2008) On the multivariate total least-squares approach to empirical coordinate

transformations. Three algorithms. J Geod 82(6):373-383. DOI 10.1007/s00190-007-0186-5

Tanizaki H (2004) Computational Methods in Statistics and Econometrics. Marcel Dekker New York.

ISBN-13: 978-0824748043

Teunissen PJG (1985) Quality control in geodetic networks. In: E.W. Grafarend and F. Sanso (eds.)

Optimization and Design of Geodetic Networks, pp 526–547, Springer Berlin Heidelberg New York

Teunissen PJG (2000) Testing theory; an introduction. Series on mathematical geodesy and positioning, 2nd

ed., Delft University of Technology, The Netherlands. ISBN 90-407-1975-6

Zhang P, Xu C, Hu C, Chen Y (2012) Coordinate transformations in satellite navigation systems. In Advances in

Electronic Engineering, Communication and Management Vol. 2 pp 249-257, Springer Berlin

Heidelberg. DOI 10.1007/978-3-642-27296-7_40

Ziggah YY, Youjian H, Amans ChO, Fan DL (2013) Determination of GPS Coordinate Transformation

Parameters of Geodetic Data Between Reference Datums-A Case Study of Ghana Geodetic Reference

Network. International Journal of Engineering Sciences & Research Technology 2(4):956-971