

Jianqing Fan · Jianxin Pan (Editors)

Contemporary Experimental Design, Multivariate Analysis and Data Mining

Festschrift in Honour of Professor Kai-Tai Fang

Editors

Jianqing Fan, Department of Financial Engineering, Princeton University, Princeton, NJ, USA

Jianxin Pan, Department of Mathematics, The University of Manchester, Manchester, UK

ISBN 978-3-030-46160-7    ISBN 978-3-030-46161-4 (eBook)
https://doi.org/10.1007/978-3-030-46161-4

Mathematics Subject Classification: 62F, 62H, 62G, 62K, 62J, 62N, 62P

© Springer Nature Switzerland AG 2020


Chapter 21

A Bilinear Reduced Rank Model

Chengcheng Hao, Feng Li, and Dietrich von Rosen

Abstract This article considers a bilinear model that includes two different latent effects. The first effect has a direct influence on the response variable, whereas the second latent effect is assumed to first influence other latent variables, which in turn affect the response variable. In this article, latent variables are modelled via rank restrictions on unknown mean parameters, and the models that are used are often referred to as reduced rank regression models. This article presents a likelihood-based approach that results in explicit estimators. In our model, the latent variables act as covariates that we know exist, but their direct influence is unknown and will therefore not be considered in detail. One example arises when we observe hundreds of weather variables but cannot say which of these variables affect plant growth, or how they do so.

21.1 Introduction

In the early age of statistics, variations in data were studied through applications of linear models. Analysis of variance, regression analysis and analysis of covariance were developed simultaneously and were later put under the same umbrella with matrix theory. In the beginning, only one response variable was considered, but soon, due to the knowledge of how to simultaneously handle sample variances and sample covariances, multivariate analysis was developed, including multivariate analysis of variance (MANOVA), principal component analysis and canonical correlation analysis. An interesting article that reviews multivariate analysis up to the 1940s was written by Rao [6]. Shortly after Rao's article, Anderson [1] wrote a seminal paper on multivariate analysis in which, among other methods, reduced rank problems were considered; i.e., the cases in which the matrix of regression parameters is not of full rank, a case that obviously does not exist in univariate linear regression models, where the rank always equals one.

Dedicated to Professor Kai-Tai Fang and his wife Ting Mei.

C. Hao
Shanghai University of International Business and Economics, No. 1900 Wenxiang Road, Songjiang District, 201620 Shanghai, People's Republic of China
e-mail: chengcheng.hao@outlook.com

F. Li
Central University of Finance and Economics, 39 South College Road, Haidian District, Beijing 100081, People's Republic of China
e-mail: feng.li@cufe.edu.cn

D. von Rosen (✉)
Swedish University of Agricultural Sciences, Box 7032, 750 07 Uppsala, Sweden
e-mail: dietrich.von.rosen@slu.se
Linköping University, SE-581 83 Linköping, Sweden

© Springer Nature Switzerland AG 2020
J. Fan and J. Pan (eds.), Contemporary Experimental Design, Multivariate Analysis and Data Mining, https://doi.org/10.1007/978-3-030-46161-4_21

Bilinear regression models were indirectly considered in the above-mentioned article by Anderson [1], in which bilinear restrictions in a MANOVA model were handled. However, the article by Potthoff and Roy [5] is usually considered the first article on the analysis of bilinear models; it introduced the so-called growth curve model, although there had been several earlier contributions to the analysis of growth curves, all of which were bilinear. For references on bilinear regression models (growth curve models/GMANOVA) see, for example, von Rosen [9]. Moreover, our bilinear models are balanced multivariate models with a linearly structured mean, in contrast to a MANOVA model, for which an arbitrary mean structure is assumed to hold.

This article mainly considers reduced rank analysis applied to growth curve models, via knowledge about the analysis of extended bilinear regression models. The book by Reinsel and Velu [8] includes many references to reduced rank regression analysis and presents many examples in which the models are used. In von Rosen and von Rosen [10], some of Reinsel and Velu's [8] work on growth curves with rank restrictions was extended.

Reduced rank models can conceptually be connected to latent variables. For example, if one observes many weather variables, such as hourly temperature and precipitation data collected over a month, the effect on plant growth will occur via some unobserved latent processes. In this article, we will introduce the case in which latent variables influence a response variable directly, but the latent variables also influence other latent variables that then influence the response variable. In our example, temperature and precipitation also influence many soil characteristics, and these variables, through some unobserved latent processes, affect plant growth. Thus, temperature and precipitation also form a basis for latent variables to act indirectly. Some details connected to this example will be provided in Sect. 21.3. This way of thinking has been implemented in statistical model building connected to graphical models, but in that case, the focus is on modelling covariance matrices. Moreover, in factor analysis (i.e., structural equation modelling), one models latent variables via covariance structures.

Suppose that there is a parameter matrix $\Theta$ of size $p\times q$ and rank $r$; i.e., $r(\Theta)=r$. This supposition means that there exist linear combinations $L\Theta=0$, where $L:(p-r)\times p$ is unknown. Solving $L\Theta=0$ as a function of $\Theta$ implies the factorization $\Theta=\Theta_1\Theta_2$, with $\Theta_1: p\times r$ and $\Theta_2: r\times q$, where both matrices are of rank $r$. This implication follows from the fact that $\mathcal{C}(\Theta_1)=\mathcal{C}(L')^{\perp}$, where $\mathcal{C}(\bullet)$ and $\mathcal{C}(\bullet)^{\perp}$ denote the column space and its orthogonal complement, respectively, and $\Theta_2$ is an arbitrary matrix that generates all solutions to $L\Theta=0$. However, since $L$ is arbitrary, $\Theta_1$ is also arbitrary. Moreover, the rank restriction also implies restrictions among the columns of $\Theta$; i.e., $\Theta H=0$. Thus, $\Theta=\Theta_1\Theta_2$, where $\Theta_2$ is of full rank, satisfying $\mathcal{C}(\Theta_2')=\mathcal{C}(H)^{\perp}$, and of course, $\Theta_1$ is unknown. Therefore, with rank restrictions, we have difficulties interpreting $\Theta_1$ and $\Theta_2$ because we do not know whether we have row- or column-restrictions. It also follows that, without further conditions, $\Theta_1$ and $\Theta_2$ are not estimable; therefore, in this article, the focus will be on estimating $\Theta$ and not $\Theta_1$ and $\Theta_2$. We can conclude that putting rank restrictions on a matrix is not, by itself, very informative. However, there exists another type of modelling in which one starts with $\Theta_1$ and $\Theta_2$ and multiplies these matrices together; in this case, one usually has a clear interpretation of the matrices. This type of model has been studied in so-called cointegration analysis in econometrics (see Johansen [3]).
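As a numerical illustration of this factorization (a sketch added here, not part of the original text), the following constructs a rank-$r$ matrix and recovers one admissible pair $\Theta_1$, $\Theta_2$ via a truncated singular value decomposition; the SVD is only one of many valid choices, reflecting the non-uniqueness discussed above.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, r = 6, 5, 2

# Build a p x q matrix of rank r as a product of two full-rank factors.
Theta = rng.standard_normal((p, r)) @ rng.standard_normal((r, q))
assert np.linalg.matrix_rank(Theta) == r

# A truncated SVD gives one (non-unique) factorization Theta = Theta1 @ Theta2.
U, s, Vt = np.linalg.svd(Theta)
Theta1 = U[:, :r] * s[:r]          # p x r, full column rank
Theta2 = Vt[:r, :]                 # r x q, full row rank
assert np.allclose(Theta1 @ Theta2, Theta)

# L: (p - r) x p annihilates Theta, i.e. L @ Theta = 0,
# and C(Theta1) is the orthogonal complement of C(L').
L = U[:, r:].T
assert np.allclose(L @ Theta, 0.0)
```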

In Sect. 21.2, the proposed model is described in detail. To the best of our knowledge, this method for modelling the indirect effects of latent variables has not been considered before. Moreover, the chapter provides one example in which the model can be used. In Sect. 21.3, a likelihood-based approach to estimating the parameters is proposed.

21.2 Model

Denote by $X: p\times n$ the random matrix corresponding to $p$ repeated measurements on $n$ independent observations. The model that will be studied is given by
$$X=ABC_1+\Theta C_2+\Psi\Theta C_3+E, \qquad (21.1)$$
where $A: p\times q$ is a known within-individuals design matrix, the matrices $C_1: k\times n$, $C_2: k_1\times n$ and $C_3: k_1\times n$ are known between-individuals design matrices with column spaces satisfying $\mathcal{C}(C_3')\subset\mathcal{C}(C_2')\subset\mathcal{C}(C_1')$, and $E: p\times n$ is a random error matrix following the matrix normal distribution $N_{p,n}(0,\Sigma,I_n)$, where the covariance matrix $\Sigma: p\times p$ is an unknown positive definite matrix. Moreover, the matrices $B: q\times k$, $\Theta: p\times k_1$ and $\Psi: p\times p$ are unknown mean parameters, where $\Theta$ and $\Psi$ have rank restrictions $r(\Theta)=r_1<\min(p,k_1)$ and $r(\Psi)=r_2<p$. Models with a product such as $\Psi\Theta$ in (21.1), with rank restrictions on both included matrices, have, to the best of our knowledge, not been considered before. In this model, the main purpose is to estimate $B$ while adjusting for the latent effects that are introduced in the model via rank restrictions.
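To make the ingredients of (21.1) concrete, the following sketch simulates one data set from the model. The dimensions, the nested design matrices and the ranks are illustrative assumptions of this sketch, not values taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(1)
p, q, k, k1, n = 5, 2, 3, 2, 12
r1, r2 = 1, 2                                    # assumed ranks of Theta and Psi

# Between-individuals designs with nested row spaces C(C3') ⊂ C(C2') ⊂ C(C1').
C1 = np.kron(np.eye(k), np.ones((1, n // k)))    # 3 x 12: three groups of four
C2 = C1[:2, :]                                   # 2 x 12: first two groups only
C3 = np.vstack([C1[:1, :], np.zeros((1, n))])    # 2 x 12: first group only

A = np.column_stack([np.ones(p), np.arange(1, p + 1)])   # p x q, linear growth
B = rng.standard_normal((q, k))

# Rank-restricted mean parameters, built as products of full-rank factors.
Theta = rng.standard_normal((p, r1)) @ rng.standard_normal((r1, k1))
Psi = rng.standard_normal((p, r2)) @ rng.standard_normal((r2, p))

E = rng.standard_normal((p, n))                  # error with Sigma = I_p
X = A @ B @ C1 + Theta @ C2 + Psi @ Theta @ C3 + E
```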

The condition $\mathcal{C}(C_3')\subset\mathcal{C}(C_2')\subset\mathcal{C}(C_1')$ is a technical condition motivated by knowledge about extended bilinear regression models (see von Rosen [9]). In this condition, the strict subspace inclusions are necessary; they are pure estimability conditions, since $\Theta$ is included in two effect terms. It will also be assumed that
$$\mathcal{C}(A)\cap\mathcal{C}(\Theta)=\{0\},$$
which is a very natural condition and will, in principle, not put any restrictions on the use of the model in (21.1).

If the terms $\Theta C_2$ and $\Psi\Theta C_3$ are not included in (21.1), we have the traditional growth curve model (see Potthoff and Roy [5], von Rosen [9]); if $\Theta$ does not have rank restrictions and the term $\Psi\Theta C_3$ is not included, then the model is a GMANOVA+MANOVA model (see Chinchilli and Elswick [2]). Moreover, in (21.1), the term $\Theta C_2$ represents the direct latent effects, whereas the term $\Psi\Theta C_3$ mimics indirect latent effects, or the interaction between two latent variables within a nested regime. Throughout the article, it will be assumed that $n$ is large enough that the parameters can be estimated.

We finally stress that the main purpose is to estimate the growth curves adjusted for latent effects; we do not discuss the latent effects directly, because it is difficult to interpret the estimators of the rank-restricted parameter matrices.

21.2.1 Example

The purpose of this example is to motivate the model given by (21.1). We have chosen

to use plant growth and weather characteristics to illustrate the model, but currently,

when we have the ability to measure many complex systems, there are also many

other examples where the model can be applied, for example, within neurosciences

or when studying ﬁnancial markets.

Plant Growth

Suppose that the aim of a study is to compare two treatments (blocks). The study

comprises 10 types of plants. Two types of plants have long roots and are not affected

by the characteristics of the upper layer of soil (e.g., chemical variables, soil textures,

organic materials); two types of plants are very robust against weather conditions

(different types of summary measures of temperature and precipitation) and soil

characteristics, whereas the remaining six types of plants are affected by both weather

and soil variables. Let the study comprise n=200 observations, and suppose that for

all 10 plant types, there are two blocks per plant type, all of which are of equal size.

A between-individuals design matrix $C_1$ for specifying the "growth curve" part $ABC_1$ can have the following form:
$$C_1=I_{10}\otimes\begin{pmatrix}\mathbf{1}_{10}' & \mathbf{0}'\\ \mathbf{0}' & \mathbf{1}_{10}'\end{pmatrix}, \qquad 20\times 200,$$
where $\mathbf{1}_{10}$ is a vector of ones of size 10 and $\otimes$ denotes the Kronecker product of two matrices. The within-individuals design matrix $A$ in the model given in (21.1) equals, if $p=10$ and there is linear growth ($t_i$ represents the $i$th time point),
$$A'=\begin{pmatrix}1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1\\ t_1 & t_2 & t_3 & t_4 & t_5 & t_6 & t_7 & t_8 & t_9 & t_{10}\end{pmatrix}.$$
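The two design matrices above can be assembled numerically as follows; the equally spaced time points are an assumption for illustration.

```python
import numpy as np

# C1 = I_10 ⊗ [[1'_10, 0'], [0', 1'_10]]: 10 plant types, two blocks of 10 each.
block = np.kron(np.eye(2), np.ones((1, 10)))   # 2 x 20
C1 = np.kron(np.eye(10), block)                # 20 x 200

# A: 10 x 2 within-individuals design for linear growth at time points t_i
# (hypothetical equally spaced times t_i = i).
t = np.arange(1, 11)
A = np.column_stack([np.ones(10), t])

assert C1.shape == (20, 200)
assert np.all(C1.sum(axis=0) == 1)             # each plant belongs to one group
```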


Concerning the weather and soil variables, suppose that we have 10 weather variables and 10 soil variables. We will specify $C_2$ and $C_3$ in the model presented in (21.1). Let $g_i$ denote a vector of size 10 in which the 10 weather variables for plant type $i$ are stored, and let $s_i$ be a vector in which the soil observations for plant type $i$ are stored. It will be supposed that weather and soil variables are constant for each plant type, meaning that all plants from a specific plant type grow in places with the same soil and weather characteristics. Therefore, for each plant type, we have specific background matrices. To handle weather observations in the model, we define $V_i$, $i\in\{1,\dots,10\}$, such that
$$V_i=\begin{cases}\mathbf{1}_{20}'\otimes g_i, & 10\times 20, & \text{if } i\in\{1,\dots,8\},\\ \mathbf{1}_{20}'\otimes\mathbf{0}, & 10\times 20, & \text{if } i\in\{9,10\}.\end{cases}$$

Then $C_2$ in (21.1) is defined as
$$C_2=\mathrm{Block}(V_1,\dots,V_{10}), \qquad 100\times 200,$$
where Block denotes the block diagonal operator. It follows that $\mathcal{C}(C_2')\subset\mathcal{C}(C_1')$. To see this relationship, note that
$$V_i'=\begin{pmatrix}\mathbf{1}_{10} & \mathbf{0}\\ \mathbf{0} & \mathbf{1}_{10}\end{pmatrix}\begin{pmatrix}g_i'\\ g_i'\end{pmatrix}, \qquad i\in\{1,\dots,8\}.$$
Strict inclusion between the subspaces holds because, for $i\in\{9,10\}$, $V_i$ equals $0$. It can be noted that the model states that there are eight plant types that are directly affected by weather conditions via the term $\Theta C_2$.

Concerning the soil variables, let
$$M_i=\begin{cases}\mathbf{1}_{20}'\otimes s_i, & 10\times 20, & \text{if } i\in\{1,\dots,6\},\\ \mathbf{1}_{20}'\otimes\mathbf{0}, & 10\times 20, & \text{if } i\in\{7,\dots,10\}.\end{cases}$$
Then, $C_3$ in (21.1) is given by
$$C_3=\mathrm{Block}(M_1,\dots,M_{10}), \qquad 100\times 200.$$
Since $\mathcal{C}(M_i')=\mathcal{C}(V_i')$, $i\in\{1,\dots,6\}$, it follows that $\mathcal{C}(C_3')\subset\mathcal{C}(C_2')$. Here, $C_3$ is constructed so that only plant types on which there is an influence by both weather and soil characteristics are included. We can think of the latent processes of weather variables as affecting the soil characteristics, which in turn have an influence on plant growth via some new latent variables.
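The construction of $C_2$ and $C_3$, together with a numerical check of the nested row-space condition, can be sketched as follows; the weather and soil vectors $g_i$, $s_i$ are simulated stand-ins, not real data.

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(2)
g = rng.standard_normal((10, 10))   # g[i]: 10 weather variables for plant type i
s = rng.standard_normal((10, 10))   # s[i]: 10 soil variables for plant type i

# V_i = 1'_20 ⊗ g_i for types 1..8, zero for types 9..10; analogously for M_i.
V = [np.kron(np.ones((1, 20)), g[i][:, None]) if i < 8 else np.zeros((10, 20))
     for i in range(10)]
M = [np.kron(np.ones((1, 20)), s[i][:, None]) if i < 6 else np.zeros((10, 20))
     for i in range(10)]
C2 = block_diag(*V)                 # 100 x 200
C3 = block_diag(*M)                 # 100 x 200

C1 = np.kron(np.eye(10), np.kron(np.eye(2), np.ones((1, 10))))   # 20 x 200

def row_space_contained(Ca, Cb):
    """Check C(Ca') ⊆ C(Cb'): stacking Ca under Cb must not raise the rank."""
    return np.linalg.matrix_rank(np.vstack([Cb, Ca])) == np.linalg.matrix_rank(Cb)

assert row_space_contained(C3, C2) and row_space_contained(C2, C1)
```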


21.3 Estimation

In this section, likelihood-inspired estimators are established. For notational convenience, $(Q)(Q)'$ is written as $(Q)()'$, where $Q$ can be any matrix expression. The likelihood function for model (21.1) equals
$$L(B,\Theta,\Psi,\Sigma)=(2\pi)^{-\frac{1}{2}pn}\,|\Sigma|^{-\frac{1}{2}n}\,e^{-\frac{1}{2}\operatorname{tr}\{\Sigma^{-1}(X-ABC_1-\Theta C_2-\Psi\Theta C_3)()'\}}.$$
Using a well-known inequality (see Srivastava and Khatri [11], Theorem 1.10.4),
$$L(B,\Theta,\Psi,\Sigma)\le\bigl|n^{-1}(X-ABC_1-\Theta C_2-\Psi\Theta C_3)()'\bigr|^{-\frac{1}{2}n}(2\pi)^{-\frac{1}{2}pn}e^{-\frac{1}{2}pn}, \qquad (21.2)$$
with equality if and only if
$$n\Sigma=(X-ABC_1-\Theta C_2-\Psi\Theta C_3)()'.$$
Thus, $\Sigma$ will be estimated if $B$, $\Theta$ and $\Psi$ can be estimated.

Let $S_1=X(I-P_{C_1'})X'$, where $P_{C_1'}=C_1'(C_1C_1')^{-}C_1$, and let $V_1=XP_{C_1'}-ABC_1-\Theta C_2-\Psi\Theta C_3$, where for an arbitrary $Q$ the notation $Q^{-}$ denotes any g-inverse of $Q$, satisfying the well-known relation $QQ^{-}Q=Q$.
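These definitions can be checked numerically; the sketch below uses the Moore–Penrose inverse as one particular g-inverse and a small illustrative design (the dimensions are assumptions, not from the text).

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 12, 3
C1 = np.kron(np.eye(k), np.ones((1, n // k)))     # k x n between-design
X = rng.standard_normal((5, n))

# The Moore-Penrose inverse is one choice of g-inverse: Q Q^- Q = Q.
Q = C1 @ C1.T
assert np.allclose(Q @ np.linalg.pinv(Q) @ Q, Q)

# P_{C1'} = C1'(C1 C1')^- C1 projects onto the row space of C1 ...
P = C1.T @ np.linalg.pinv(C1 @ C1.T) @ C1
assert np.allclose(P @ P, P)                      # idempotent
assert np.allclose(P @ C1.T, C1.T)                # fixes C(C1')

# ... and S1 = X (I - P_{C1'}) X' is the within-group sum of squares.
S1 = X @ (np.eye(n) - P) @ X.T
assert np.all(np.linalg.eigvalsh(S1) > -1e-10)    # positive semidefinite
```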

Moreover, in the subsequent calculations, the determinant relation $|I+QR|=|I+RQ|$, valid for arbitrary conformable $Q$ and $R$, will be used many times. Minimizing the determinant in (21.2) will be the main objective, and we start by performing a number of calculations leading to the following chain of equalities:
$$\begin{aligned}
&|(X-ABC_1-\Theta C_2-\Psi\Theta C_3)()'|\\
&\quad=|S_1+(XP_{C_1'}-ABC_1-\Theta C_2-\Psi\Theta C_3)()'|=|S_1|\,|I+V_1'S_1^{-1}V_1|\\
&\quad=|S_1|\,\bigl|I+V_1'S_1^{-1}A(A'S_1^{-1}A)^{-}A'S_1^{-1}V_1+V_1'A^{o}(A^{o\prime}S_1A^{o})^{-}A^{o\prime}V_1\bigr|, \qquad (21.3)
\end{aligned}$$
where we have used the decomposition $S^{-1}=S^{-1}P_{A,S}+P_{A^{o},S^{-1}}S^{-1}$, with $P_{A,S}=A(A'S^{-1}A)^{-}A'S^{-1}$, and $A^{o}$ is any matrix generating $\mathcal{C}(A)^{\perp}$ (see Kollo and von Rosen [4], Theorem 1.2.25). Since $A^{o\prime}ABC_1=0$, it follows that
$$\text{the r.h.s. of }(21.3)\ \ge\ |S_1|\,\bigl|I+(XP_{C_1'}-\Theta C_2-\Psi\Theta C_3)'A^{o}(A^{o\prime}S_1A^{o})^{-}A^{o\prime}(XP_{C_1'}-\Theta C_2-\Psi\Theta C_3)\bigr| \qquad (21.4)$$
(r.h.s. is an abbreviation for "right-hand side"), with equality if and only if $A'S_1^{-1}V_1=0$; that is,
$$ABC_1=A(A'S_1^{-1}A)^{-}A'S_1^{-1}(XP_{C_1'}-\Theta C_2-\Psi\Theta C_3).$$


Thus, $B$ is estimated if $\Theta$ and $\Psi$ can be estimated, because as a function of $\Theta$ and $\Psi$ we have a consistent system of linear equations. The above block of calculations will be repeated twice before the estimators are obtained.

Let us continue with (21.4). We need a few more definitions (compare with $S_1$ and $V_1$). Let
$$\begin{aligned}
T_1&=S_1A^{o}(A^{o\prime}S_1A^{o})^{-}A^{o\prime}=P_{A^{o},S_1^{-1}}',\\
S_2&=S_1+T_1X(P_{C_1'}-P_{C_2'})X'T_1',\\
V_2&=XP_{C_2'}-\Theta C_2-\Psi\Theta C_3.
\end{aligned}$$

Then,
$$\begin{aligned}
\text{the r.h.s. of }(21.4)&=|S_1+T_1(XP_{C_1'}-\Theta C_2-\Psi\Theta C_3)()'T_1'|\\
&=|S_2+T_1V_2V_2'T_1'|=|S_2|\,|I+V_2'T_1'S_2^{-1}T_1V_2|\\
&=|S_2|\,\bigl|I+V_2'T_1'S_2^{-1}P_{T_1\Theta_1,S_2}T_1V_2+V_2'T_1'P_{(T_1\Theta_1)^{o},S_2^{-1}}S_2^{-1}T_1V_2\bigr|, \qquad (21.5)
\end{aligned}$$
because of the rank restriction $r(\Theta)=r_1$ and the factorization $\Theta=\Theta_1\Theta_2$ for some $\Theta_1: p\times r_1$ and $\Theta_2: r_1\times k_1$, both of which are of rank $r_1$ and unknown.

Moreover,
$$\text{the r.h.s. of }(21.5)\ \ge\ |S_2|\,\bigl|I+V_2'T_1'P_{(T_1\Theta_1)^{o},S_2^{-1}}S_2^{-1}T_1V_2\bigr|, \qquad (21.6)$$
with equality if and only if $\Theta_1'T_1'S_2^{-1}T_1V_2=0$, which in turn implies that
$$\Theta_1\Theta_2C_2=\Theta_1(\Theta_1'T_1'S_2^{-1}T_1\Theta_1)^{-1}\Theta_1'T_1'S_2^{-1}(XP_{C_2'}-\Psi\Theta_1\Theta_2C_3), \qquad (21.7)$$
where the inverse exists because $\Theta_1$ is of full column rank. If we can find an estimator of $\Theta_1$ and consider $\Psi\Theta C_3$ to be known, we have a consistent system of linear equations in $\Theta_2$.

We proceed with (21.6). Let
$$\begin{aligned}
T_2&=S_2(T_1\Theta_1)^{o}\bigl((T_1\Theta_1)^{o\prime}S_2(T_1\Theta_1)^{o}\bigr)^{-}(T_1\Theta_1)^{o\prime}=P_{(T_1\Theta_1)^{o},S_2^{-1}}',\\
S_3&=S_2+T_2T_1X(P_{C_2'}-P_{C_3'})X'T_1'T_2',\\
V_3&=XP_{C_3'}-\Psi\Theta C_3.
\end{aligned}$$

Then,
$$\begin{aligned}
\text{the r.h.s. of }(21.6)&=|S_2+T_2T_1(XP_{C_2'}-\Psi\Theta C_3)()'T_1'T_2'|\\
&=|S_3+T_2T_1(XP_{C_3'}-\Psi\Theta C_3)()'T_1'T_2'|=|S_3+T_2T_1V_3V_3'T_1'T_2'|\\
&=|S_3|\,|I+V_3'T_1'T_2'S_3^{-1}T_2T_1V_3|. \qquad (21.8)
\end{aligned}$$


The determinant $|S_3|$ is a function of $\Theta_1$, since $T_2$ is a function of $\Theta_1$. Now, we will minimize $|S_3|$ with respect to $\Theta_1$, which implies that we are not aiming to find maximum likelihood estimates, because $\Theta_1$ is also included in the other determinant of (21.8). However, by focusing only on $|S_3|$, it will be shown that explicit estimators can be obtained. Let
$$P_{C_2'\setminus C_3'}=P_{C_2'}-P_{C_3'}, \qquad R_1=I+P_{C_2'\setminus C_3'}X'H_1H_1'XP_{C_2'\setminus C_3'},$$
where, for some $H_1: p\times(p-r(A))$,
$$T_1'S_2^{-1}T_1=H_1H_1'.$$

It follows that
$$\begin{aligned}
|S_3|&=|S_2|\,\bigl|I+P_{C_2'\setminus C_3'}X'T_1'T_2'S_2^{-1}T_2T_1XP_{C_2'\setminus C_3'}\bigr| \qquad (21.9)\\
&=|S_2|\,\bigl|R_1-P_{C_2'\setminus C_3'}X'H_1H_1'\Theta_1(\Theta_1'H_1H_1'\Theta_1)^{-1}\Theta_1'H_1H_1'XP_{C_2'\setminus C_3'}\bigr|\\
&=|S_2|\,|R_1|\,\bigl|I-F_1'H_1'XP_{C_2'\setminus C_3'}R_1^{-1}P_{C_2'\setminus C_3'}X'H_1F_1\bigr|,
\end{aligned}$$
where
$$F_1=H_1'\Theta_1(\Theta_1'H_1H_1'\Theta_1)^{-1/2}, \qquad F_1'F_1=I_{r_1}, \qquad F_1:(p-r(A))\times r_1.$$

Let
$$U=I-H_1'XP_{C_2'\setminus C_3'}R_1^{-1}P_{C_2'\setminus C_3'}X'H_1,$$
which is positive definite, since
$$U^{-1}=I+H_1'XP_{C_2'\setminus C_3'}X'H_1$$
is positive definite. Thus,
$$\text{the r.h.s. of }(21.9)=|S_2|\,|R_1|\,|F_1'UF_1|\ \ge\ |S_2|\,|R_1|\prod_{i=1}^{r_1}\lambda_{p-r(A)-r_1+i}, \qquad (21.10)$$
where $\lambda_1\ge\cdots\ge\lambda_{p-r(A)}$ are the ordered eigenvalues of $U$, which are all independent of $\Theta_1$, since $U$ is not a function of $\Theta_1$. The inequality follows from Rao [7], Theorem 2.1 (the Poincaré separation theorem). Let $\{v_i\}$ be the eigenvectors corresponding to $\{\lambda_{p-r(A)-r_1+i}\}$. Then, the minimum in (21.10) is obtained if $F_1$ is chosen to equal
$$\widehat{F}_1=(v_1,\dots,v_{r_1})$$


and it remains to find a $\Theta_1$ such that
$$\widehat{F}_1=H_1'\Theta_1(\Theta_1'H_1H_1'\Theta_1)^{-1/2}.$$
Since $\widehat{F}_1'\widehat{F}_1=I_{r_1}$, one solution is given by
$$\widehat{\Theta}_1=H_1(H_1'H_1)^{-1}\widehat{F}_1. \qquad (21.11)$$
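The eigenvalue step behind (21.10)–(21.11) can be sketched numerically. Note that $U^{-1}=I+WW'$ with $W=H_1'XP_{C_2'\setminus C_3'}$, so the eigenvectors of $U$ belonging to its $r_1$ smallest eigenvalues are the top-$r_1$ eigenvectors of $WW'$. The inputs below ($H_1$ and the projected data) are random stand-ins, since producing them requires the full estimation pipeline.

```python
import numpy as np

def estimate_theta1(H1, XP, r1):
    """Sketch of (21.10)-(21.11): F1-hat collects the eigenvectors of U
    belonging to its r1 smallest eigenvalues, i.e. the top-r1 eigenvectors
    of W W' with W = H1' X P_{C2'\\C3'}; then solve for Theta1-hat."""
    W = H1.T @ XP
    mu, V = np.linalg.eigh(W @ W.T)        # ascending eigenvalues of W W'
    F1 = V[:, -r1:]                        # top-r1 of W W' = bottom-r1 of U
    # One solution of F1 = H1' Theta1 (Theta1' H1 H1' Theta1)^{-1/2}:
    Theta1 = H1 @ np.linalg.inv(H1.T @ H1) @ F1
    return Theta1, F1

# Illustrative inputs (not from a real fit): random H1 and projected data.
rng = np.random.default_rng(4)
p, pr, n, r1 = 6, 4, 20, 2                 # pr plays the role of p - r(A)
H1 = rng.standard_normal((p, pr))
XP = rng.standard_normal((p, n))
Theta1, F1 = estimate_theta1(H1, XP, r1)

# Verify the defining property: H1' Theta1 equals F1, and F1 is orthonormal.
assert np.allclose(H1.T @ Theta1, F1)
assert np.allclose(F1.T @ F1, np.eye(r1))
```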

We will later return to the estimation of $\Theta$ because, according to (21.7), the estimator of $\Theta$ will be a function of the estimator of $\Psi$; therefore, the factorization $\Psi=\Psi_1\Psi_2$, where $\Psi_1: p\times r_2$ and $\Psi_2: r_2\times p$, has to be discussed. Let us start with (21.8):
$$\begin{aligned}
\text{the r.h.s. of }(21.8)&=|S_3|\,\bigl|I+V_3'T_1'T_2'S_3^{-1}P_{T_2T_1\Psi_1,S_3}T_2T_1V_3+V_3'T_1'T_2'P_{(T_2T_1\Psi_1)^{o},S_3^{-1}}S_3^{-1}T_2T_1V_3\bigr|\\
&\ge|S_3|\,\bigl|I+V_3'T_1'T_2'P_{(T_2T_1\Psi_1)^{o},S_3^{-1}}S_3^{-1}T_2T_1V_3\bigr|. \qquad (21.12)
\end{aligned}$$

Equality holds if and only if
$$\Psi_1'T_1'T_2'S_3^{-1}T_2T_1V_3=0,$$
where $S_3$ and $T_2$ are functions of $\Theta_1$, for which an estimate was presented in (21.11). Hence,
$$\Psi_1\Psi_2\Theta C_3=\Psi_1(\Psi_1'T_1'T_2'S_3^{-1}T_2T_1\Psi_1)^{-1}\Psi_1'T_1'T_2'S_3^{-1}XP_{C_3'} \qquad (21.13)$$

and $\Psi_2$ can be estimated as a function of $\Psi_1$ and $\Theta$. Note that (21.13) implies that $\Psi\Theta C_3$ is determined if $\Psi_1$ is replaced by an estimate, since $\Theta_1$ in $S_3$ and $T_2$ has been estimated. Moreover, the expression in (21.13) can be inserted into (21.7), and given $\widehat{\Theta}_1$, we can write
$$\begin{aligned}
\widehat{\Theta C_2}&=\widehat{\Theta}_1(\widehat{\Theta}_1'T_1'S_2^{-1}T_1\widehat{\Theta}_1)^{-1}\widehat{\Theta}_1'T_1'S_2^{-1}(XP_{C_2'}-\widehat{\Psi\Theta}C_3) \qquad (21.14)\\
&=\widehat{\Theta}_1\widehat{\Theta}_1'T_1'S_2^{-1}(XP_{C_2'}-\widehat{\Psi\Theta}C_3),
\end{aligned}$$
where $\widehat{\Psi\Theta}$ indicates that (21.13) is used, assuming that $\Psi_1$ can be estimated. Thus, from this expression, $\Theta_2$ can be estimated; however, $\Theta_2$ is not unique and not of any greater interest.

It remains to estimate the parameter matrix $\Psi_1$, which is important because this will give explicit estimators of $\widehat{\Theta C_2}$ and $\widehat{\Psi\Theta C_3}$. The estimation will be carried out in the same way as when $\Theta_1$ was estimated. Let
$$\begin{aligned}
R_2&=I+P_{C_3'}X'T_1'T_2'S_3^{-1}T_2T_1XP_{C_3'},\\
T_1'T_2'S_3^{-1}T_2T_1&=H_2H_2', \qquad H_2: p\times(p-r(A:\Theta)),\\
F_2&=H_2'\Psi_1(\Psi_1'H_2H_2'\Psi_1)^{-1/2},
\end{aligned}$$
where $R_2$ and $H_2$ depend on only one unknown quantity, $\Theta_1$, which has been estimated in (21.11). From (21.12), it follows that
$$\text{the r.h.s. of }(21.12)=|S_3|\,|R_2|\,\bigl|F_2'(I-H_2'XP_{C_3'}R_2^{-1}P_{C_3'}X'H_2)F_2\bigr|. \qquad (21.15)$$

Furthermore, define
$$U_2=I-H_2'XP_{C_3'}R_2^{-1}P_{C_3'}X'H_2, \qquad (p-r(A:\Theta))\times(p-r(A:\Theta)),$$
with $U_2^{-1}=I+H_2'XP_{C_3'}X'H_2$; thus, $U_2$ is positive definite. The assumption $\mathcal{C}(A)\cap\mathcal{C}(\Theta)=\{0\}$ used in (21.1) implies that $p-r(A:\Theta)=p-r(A)-r_1$. Then,
$$\text{the r.h.s. of }(21.15)=|S_3|\,|R_2|\,|F_2'U_2F_2|\ \ge\ |S_3|\,|R_2|\prod_{i=1}^{r_2}\lambda_{p-r(A)-r_1-r_2+i},$$
where $\{\lambda_{p-r(A)-r_1-r_2+i}\}$ are eigenvalues of $U_2$. Moreover, let
$$\widehat{F}_2=(w_1,\dots,w_{r_2})$$
be the matrix of eigenvectors of $U_2$ corresponding to $\{\lambda_{p-r(A)-r_1-r_2+i}\}$, $i\in\{1,\dots,r_2\}$. Thus, an estimated $\Psi_1$ must satisfy
$$H_2'\Psi_1(\Psi_1'H_2H_2'\Psi_1)^{-1/2}=\widehat{F}_2.$$
Since $\widehat{F}_2'\widehat{F}_2=I$,
$$\widehat{\Psi}_1=H_2(H_2'H_2)^{-1}\widehat{F}_2 \qquad (21.16)$$
is an estimator.

This result means that we have estimated both $\widehat{\Theta C_2}$ and $\widehat{\Psi\Theta C_3}$, and therefore an explicit estimator of $B$ can be presented, which was the main purpose of this article. We do not present estimators of $\Theta_2$ and $\Psi_2$, since they are of no real interest, although they can be obtained from (21.14) and (21.13), respectively.

Proposition 1 For the model presented in Sect. 21.2 in (21.1), let $\widehat{\Theta C_2}$ and $\widehat{\Psi\Theta C_3}$ be given by (21.14) and (21.13), respectively, where for the last relation $\widehat{\Psi}_1$, presented in (21.16), has been inserted. The following estimators are proposed:

(i) $\widehat{\Theta}_1$ is given in (21.11);

(ii) $\widehat{\Psi}_1$ is given in (21.16);

(iii) if $r(A)=q$ and $r(C_1)=k$, then
$$\widehat{B}=(A'S_1^{-1}A)^{-1}A'S_1^{-1}\bigl(XC_1'(C_1C_1')^{-1}-\widehat{\Theta C_2}\,C_1'(C_1C_1')^{-1}-\widehat{\Psi\Theta C_3}\,C_1'(C_1C_1')^{-1}\bigr),$$
where $S_1=X(I-P_{C_1'})X'$;

(iv) $\widehat{ABC_1}=A(A'S_1^{-1}A)^{-}A'S_1^{-1}\bigl(XC_1'(C_1C_1')^{-}C_1-\widehat{\Theta C_2}-\widehat{\Psi\Theta C_3}\bigr)$;

(v) $n\widehat{\Sigma}=\bigl(X-\widehat{ABC_1}-\widehat{\Theta C_2}-\widehat{\Psi\Theta C_3}\bigr)\bigl(X-\widehat{ABC_1}-\widehat{\Theta C_2}-\widehat{\Psi\Theta C_3}\bigr)'$.

21.4 Discussion

The model is overparameterized, which implies estimability problems (parameter identifiability problems). In fact, estimability in complex statistical models has become an important topic in the era of analysing large data sets. Regularization of loss functions is one tool that is nowadays often applied. A different approach, which constitutes the main idea of this work, is to introduce rank restrictions on mean parameters to model the effects of latent variables, which are thought to govern a large set of measurable variables. Moreover, we link the latent mean variables with an extended bilinear regression model, which yields a new class of models. An explicit estimator of the latent variable effect is derived. In the future, based on this estimate, the aim is to study statistical properties, including the interpretation, of the estimators and estimates, and to study different types of model validation procedures.

Acknowledgements Chengcheng Hao and Feng Li are supported by the National Natural

Science Foundation of China (no. 11601319 and no. 11501587, respectively). Feng Li is also

supported by Beijing Universities Advanced Disciplines Initiative (no. 6JJ2019163). Dietrich von

Rosen is supported by the Swedish Research Council (2017-03003).

References

1. Anderson, T.W.: Estimating linear restrictions on regression coefficients for multivariate normal distributions. Ann. Math. Statist. 22, 327–351 (1951)
2. Chinchilli, V.M., Elswick, R.K.: A mixture of the MANOVA and GMANOVA models. Comm. Statist. Theory Methods 14, 3075–3089 (1985)
3. Johansen, S.: Estimation and hypothesis testing of cointegration vectors in Gaussian vector autoregressive models. Econometrica 59, 1551–1580 (1991)
4. Kollo, T., von Rosen, D.: Advanced Multivariate Statistics with Matrices. Springer, New York (2005)
5. Potthoff, R.F., Roy, S.N.: A generalized multivariate analysis of variance model useful especially for growth curve problems. Biometrika 51, 313–326 (1964)
6. Rao, C.R.: Tests of significance in multivariate analysis. Biometrika 35, 58–79 (1948)
7. Rao, C.R.: Separation theorems for singular values of matrices and their applications in multivariate analysis. J. Multivar. Anal. 9, 362–377 (1979)
8. Reinsel, G.C., Velu, R.P.: Multivariate Reduced-Rank Regression. Springer, New York (1998)
9. von Rosen, D.: Bilinear Regression Analysis: An Introduction. Springer, New York (2018)
10. von Rosen, T., von Rosen, D.: On estimation in some reduced rank extended growth curve models. Math. Methods Statist. 26, 299–310 (2017)
11. Srivastava, M.S., Khatri, C.G.: An Introduction to Multivariate Statistics. North-Holland, New York (1979)