
ORIGINAL RESEARCH

Decomposing Group Differences of Latent Means of Ordered Categorical Variables within a Genetic Factor Model

Seung Bin Cho · Phillip K. Wood · Andrew C. Heath

Received: 13 December 2007/Accepted: 11 October 2008/Published online: 14 November 2008

© Springer Science+Business Media, LLC 2008

Abstract

A genetic factor model is introduced for decomposition of group differences of the means of phenotypic behavior as well as individual differences when the research variables under consideration are ordered categorical. The model employs the general Genetic Factor Model proposed by Neale and Cardon (Methodology for genetic studies of twins and families, 1992) and, more specifically, the extension proposed by Dolan et al. (Behav Genet 22: 319–335, 1992) which enables decomposition of group differences of the means associated with genetic and environmental factors. Using a latent response variable (LRV) formulation (Muthén and Asparouhov, Latent variable analysis with categorical outcomes: multiple-group and growth modeling in Mplus. Mplus web notes: No. 4, Version 5, 2002), proportional differences of response categories between groups are modeled within the genetic factor model in terms of the distributional differences of latent response variables assumed to underlie the observed ordered categorical variables. Use of the proposed model is illustrated using a measure of conservatism in the data collected from the Australian Twin Registry.

Keywords

Genetic factor model · Latent response variable formulation · Ordered categorical variables · Mplus · Twins

The genetic factor model is used in behavior genetics to

determine the relative contributions of genetic and envi-

ronmental components on a phenotypic behavior (Martin

and Eaves 1977; Neale and Cardon 1992). In its basic form,

the genetic factor model is an application of multiple-group

confirmatory factor analytic models that decomposes

variances of observed phenotypic variables into genetic

and environmental factors with genetically informative

data collected from twin pairs. Although the genetic factor

model is often used to explain a phenotypic behavior

measured by a single indicator variable, the effect of

genetic and environmental factors on phenotypic behaviors

measured by multiple indicators can be estimated using

analogous latent variables in a multivariate genetic factor

model (Heath et al. 1989b; Neale and Cardon 1992).

Although applications of the genetic factor model have

primarily been focused on the decomposition of individual

differences in terms of covariance structures, the mean

structures of phenotypic indicators can also be modeled by

genetic and environmental factors. Dolan (1989) proposed

a model for the genetic factor analysis with mean structures

based on the adjoined sum of squares and cross products

(SSCP) matrix in which the intercepts of phenotypic indi-

cators are set to zero and factor means are estimated.

Although this model can estimate the means of genetic and

environmental factors, setting intercepts of all indicator

variables to zero ignores intercept differences between

variables. The assumption of identical zero intercepts is not

necessarily appropriate in many situations (Dolan et al.

1992). It is more reasonable to believe that the intercept of

each indicator is different across indicator variables or that

measurement artifacts may produce baseline differences in

indicator variables not due to the factors in the model. For

this reason, models which include manifest variables

intercepts in addition to factor means have been appealing.

Edited by Dorret Boomsma.

S. B. Cho (✉) · P. K. Wood

Psychological Sciences, University of Missouri,

200 South 7th Street, Psychology Building,

Columbia, MO 65211, USA

e-mail: sbcht3@mizzou.edu; sbcht3@yahoo.com

A. C. Heath

Department of Psychiatry, Washington University Medical

School, St. Louis, MO 63110, USA


Behav Genet (2009) 39:101–122

DOI 10.1007/s10519-008-9237-9


Unfortunately, models with both estimated intercepts

and factor means are not mathematically identified without

further restrictions in single-group analyses (Dolan et al.

1992). In multiple-group settings, however, relative dif-

ferences of factor means and variances across groups can

be estimated. Under multiple-group settings, factor means

and variances in a selected reference group are fixed to zero

and one, respectively, and factor means and variances in

non-reference groups can be estimated as departures from

those of a reference group, given that factor loadings and

intercepts are assumed invariant across groups (Sörbom

1974). Dolan et al. (1992) applied this method to a genetic

factor model to decompose factor mean differences

between groups into genetic and environmental factors, in

which zygosity groups are further divided by additional

grouping variables, such as sex or ethnicity. In this model,

factor means in the non-reference groups are estimated as

differences relative to the factor means in the reference

groups in which factor means are set to zero. Thus, the

mean of each factor indicator in non-reference groups is

determined by its intercept and the differences from the

reference groups due to each factor, while the mean of each

indicator variable in the reference groups is determined by

its intercept alone. This model enables estimation of the

direction and extent of the changes in means of phenotypic

behavior between groups due to genetic and environmental

factors without assuming zero intercepts of phenotypic

indicators.

The model proposed by Dolan et al. (1992) has been

employed in subsequent studies (Dolan and Molenaar

1994; Rowe and Cleveland 1996; Rowe and Rodgers 1997;

Cleveland et al. 2000; Heiman et al. 2003) to determine the

origins of within and between group variations on phenotypic behaviors due to genetic and environmental components. However, the model has yet to be applied to

phenotypic behaviors measured by ordered categorical

indicator variables. This is surprising, given that many

psychological measurements used in behavior genetics

involve categorical variables with relatively few ordered

response categories (e.g., Likert scales or attitude scales).

Direct applications of the traditional genetic factor model

to ordered categorical variables, in which ordered cate-

gorical variables are treated as continuous variables, can

often be misleading because the assumptions of factor

analytic models are not met (Bollen 1989). This problem

has been dealt with by assuming that the observed categorical variables are categorizations of normally distributed latent continuous variables which underlie observed categorical variables.

Applications of genetic factor models based on underlying latent continuous variables have been attempted using polychoric correlation matrices as input data (Loehlin 1993; Truett et al. 1992). However, using a polychoric correlation matrix is equivalent to analyzing standardized

variables and the decomposition of latent means is not

possible because no factor means can be estimated. Anal-

yses of ordered categorical variables via underlying

continuous variables were generalized under the frame-

work of the latent response variable (LRV) formulation

(Christoffersson 1975; Muthén 1984; Muthén and Asparouhov 2002; Skrondal and Rabe-Hesketh 2004). The LRV

formulation is an extension of the general idea of polych-

oric correlations in that it relates the observed categories of

responses to an underlying continuous variable. The LRV

formulation, however, more generally extends factor ana-

lytic and structural models to ordered categorical variables

by relating the latent response variables to observed cate-

gorical variables using thresholds given distributional

assumptions on latent response variables. Prescott (2004)

described genetic factor analyses of ordered categorical

variables using the LRV formulation in Mplus (Muthén and Muthén 2007). Although Prescott (2004) described several

types of possible genetic factor analyses with continuous

and categorical variables in single- and multiple-group

settings, the decomposition of group differences of phe-

notypic means into genetic and environmental factors, as in

Dolan et al. (1992), has not been applied to the case of

ordered categorical variables.

In this paper, a multiple-group genetic factor model is

proposed for decomposition of mean differences of a

phenotypic behavior between groups when the indicator

variables under consideration are ordered categorical. In

contrast to limitations inherent in analysis of the polychoric

correlation matrix, the LRV formulation allows estimation

of distributional differences of latent response variables

using equality constraints on thresholds across groups.

Analyses of ordered categorical variables in multiple-group settings have been described elsewhere (Muthén and Christoffersson 1981; Muthén and Lehman 1985; Muthén 1989; Muthén and Asparouhov 2002; Millsap and Yun-Tein 2004). However, decomposition of phenotypic means

within the genetic factor model requires additional model

specification and identification considerations beyond tra-

ditional multiple-group structural models due to the

rationale underlying the specification of genetic and envi-

ronmental effects in genetic factor models. Specifically,

under the genetic factor model, differential covariance

structures due to genetic and environmental factors are

identified across groups defined on the basis of the zyg-

osities of twin pairs. In multiple-group settings using the

genetic factor model, these zygosity groups are often fur-

ther divided based on additional grouping variables such as

sex or ethnicity. The standard approach in multiple-group

analyses is to select a group as a reference group, in which

some parameters are fixed to constants to determine scale

of measurement of the factors, while these respective parameters are freely estimated in the remaining groups. Under the multiple-group

genetic factor model, by contrast, reference and non-ref-

erence groups are determined based on the additional

grouping variable(s). All zygosity groups within a selected

level of additional grouping variables are set as the refer-

ence groups, with relevant parameters fixed to constants

and with the corresponding parameters estimated in the

remaining non-reference groups subject to equality con-

straints across zygosity groups in the same level of the

grouping variable. In order to avoid confusion in the pre-

sentation which follows, ‘‘group’’ refers to the group

divided by additional grouping variables and ‘‘zygosity

group’’ refers to grouping of twin pairs based on zygosity.

In addition, identification of factor analytic models with

ordered categorical variables is more complicated than for

factor models with continuous variables, in that latent

response variables are indirectly modeled via distribu-

tional assumptions on latent response variables and

thresholds. Thus, identification of means and variances of

both factors and latent response variables should be con-

sidered. Millsap and Yun-Tein (2004) derived general

conditions of identification that can be applied to various

forms of factor analytic models but, as they pointed out,

the identification conditions of a specific model should be

developed considering its structure and the hypotheses to

be tested. Specification of such models for the multiple-

group genetic factor model with factor means based on

ordered categorical manifest variables has not been

developed.

This paper begins with a brief presentation of the

genetic factor model for phenotypic means proposed by

Dolan et al. (1992) followed by a description of the

application of LRV formulation for decomposing mean

differences of latent response variables associated with

phenotypic behaviors measured by ordered categorical

variables into genetic and environmental factors. Model

specification and identification are derived within this

context, along with related discussions of intercept differ-

ences of variables, as pointed out by Dolan et al. (1992),

and factorial invariance within the context of the proposed

model. As an example of the approach, the model is used to

explore sex differences in genetic and environmental fac-

tors in items taken from the conservatism scale (Wilson

and Patterson 1968) in data from the Australian Twin

Registry using Mplus (Muthén and Muthén 2007).

The model

The genetic factor model for phenotypic means

The genetic factor model (Neale and Cardon 1992) is an

application of the multiple-group factor model used to

identify the relative contributions of genetic and envi-

ronmental factors on individual differences in phenotypic

behavior based on covariance matrices of genetically

informative data collected from twin pairs. When multi-

ple phenotypic indicators for a phenotypic behavior are

available, variances of each observed variable are repre-

sented as a linear function of genetic and environmental

factors and a residual variance term unique to each

indicator variable. This model is called the Independent

Pathway model or Biometric model, because each indi-

cator variable has a distinct factor loading from each

factor (Neale and Cardon 1992). Another way of modeling phenotypic behaviors measured by multiple indicators is the Common Pathway model or Psychometric model in which genetic and environmental factors

are assumed to have factor loadings on a single latent

variable extracted from multiple indicators. For reasons

described below, the present paper discusses the LRV

formulation for the biometric model rather than the

common pathway model due to identification difficulties

associated with the latter.

In multiple-group genetic factor models, in which

zygosity groups are further divided based on additional

grouping variables, mean difference of each indicator

variable across the groups due to each genetic and envi-

ronmental factors can be estimated (Dolan et al. 1992).

This model can be represented in matrix form as:

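$$
\begin{pmatrix} y_{1j}^{(g)} \\ y_{2j}^{(g)} \end{pmatrix}
=
\begin{pmatrix} \alpha_j^{(g)} \\ \alpha_j^{(g)} \end{pmatrix}
+
\begin{pmatrix}
\lambda_{Aj} & \lambda_{Cj} & \lambda_{Ej} & 0 & 0 & 0 \\
0 & 0 & 0 & \lambda_{Aj} & \lambda_{Cj} & \lambda_{Ej}
\end{pmatrix}
\begin{pmatrix} A_1^{(g)} \\ C_1^{(g)} \\ E_1^{(g)} \\ A_2^{(g)} \\ C_2^{(g)} \\ E_2^{(g)} \end{pmatrix}
+
\begin{pmatrix} \varepsilon_{1j}^{(g)} \\ \varepsilon_{2j}^{(g)} \end{pmatrix}
\qquad (1)
$$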

where $y_{1j}^{(g)}$ and $y_{2j}^{(g)}$ represent the observed variable j from the i-th twin (i = {1, 2}) from group g, which is defined by the levels of additional grouping variables. $A_i^{(g)}$, $C_i^{(g)}$, and $E_i^{(g)}$ represent the additive genetic factor, the common environmental factor shared by both twins in the same family, and the unique environmental factor, respectively, for the i-th twin. $\alpha_j^{(g)}$ is the intercept of variable $y_{ij}^{(g)}$ and $\varepsilon_{ij}^{(g)}$ is the residual of each variable which is not explained by the genetic and environmental factors. Differential structures in the correlation matrix of genetic and environmental factors are assumed across zygosity groups. Specifically:

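$$
\mathrm{corr}\begin{pmatrix} A_1 & C_1 & E_1 & A_2 & C_2 & E_2 \end{pmatrix}'
=
\begin{pmatrix}
1 & 0 & 0 & r & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
r & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
\qquad (2)
$$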

In Eq. 2 the correlation between $A_1$ and $A_2$, $r$, is set to 1 for

monozygotic (MZ) twin pairs and 0.5 for dizygotic (DZ)

twin pairs.
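As a minimal sketch of how Eqs. 1 and 2 combine, the following Python fragment builds the factor correlation matrix for a given $r$ and computes the covariance matrix of one indicator across the two twins, assuming unit factor variances (as in a reference group). The function names and the numeric loadings are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
def factor_corr(r):
    """Correlation matrix of (A1, C1, E1, A2, C2, E2) as laid out in Eq. 2."""
    R = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
    R[0][3] = R[3][0] = r      # additive genetic factors: r = 1 (MZ) or 0.5 (DZ)
    R[1][4] = R[4][1] = 1.0    # common environment is shared by both twins
    return R                   # unique environmental factors stay uncorrelated


def matmul(X, Y):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def implied_cov(lam_a, lam_c, lam_e, theta, r):
    """Covariance of (y_1j, y_2j) implied by Eqs. 1 and 2 with unit factor variances."""
    Lam = [[lam_a, lam_c, lam_e, 0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, lam_a, lam_c, lam_e]]
    LamT = [list(col) for col in zip(*Lam)]
    S = matmul(matmul(Lam, factor_corr(r)), LamT)
    S[0][0] += theta           # residual variance enters only the diagonal
    S[1][1] += theta
    return S


# Hypothetical loadings and residual variance, for illustration only.
mz = implied_cov(0.6, 0.4, 0.5, 0.23, r=1.0)
dz = implied_cov(0.6, 0.4, 0.5, 0.23, r=0.5)
```

With these illustrative values the cross-twin covariance is $\lambda_{Aj}^2 + \lambda_{Cj}^2$ for MZ pairs and $0.5\lambda_{Aj}^2 + \lambda_{Cj}^2$ for DZ pairs, which is the contrast between zygosity groups that the model relies on.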

The model in Eq. 1 is not identified because factor

means and intercepts of indicator variables cannot be

estimated simultaneously without further constraints.

However, in multiple-group settings, as Sörbom (1974)

discussed, factor mean differences between groups can be

estimated relative to a selected reference group in which

factor means and variances are set to zero and one,

respectively, provided that the intercept and factor loading

of each respective manifest variable are invariant across

groups. Dolan et al. (1992) applied this method to

decompose phenotypic mean differences between groups in

terms of genetic and environmental factors. Specifically,

zygosity groups are further divided by an additional

grouping variable and all zygosity groups within one level

of the grouping variable are set as the reference groups.

Factor means and variances are estimated in the remaining

groups which share the same level of the grouping variable

but are constrained to equality within level of the grouping

variable. Given that factor loadings and intercepts are

invariant across the reference and non-reference groups,

factor means and variances in the non-reference groups are

interpreted relative to the latent variable metric in the

reference groups. Under this model, the mean of variable j

is determined as

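$$
\mu_j^{(r)} = \alpha_j \quad \text{for the reference groups, and}
$$

$$
\mu_j^{(n)} = \alpha_j + \lambda_{Aj}\delta_A^{(n)} + \lambda_{Cj}\delta_C^{(n)} + \lambda_{Ej}\delta_E^{(n)} \quad \text{for the non-reference groups,}
$$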

where the $\delta^{(n)}$'s are the factor mean differences from the reference groups. Therefore the mean of each variable in non-reference groups is determined by the intercept and change in the mean from reference groups due to each factor, while the mean of each variable in the reference groups is determined solely by its intercept. The variance/covariance structure between twins is determined as

$$
\mathrm{cov}\left(y_{1j}^{(r)}, y_{2j}^{(r)}\right) =
\begin{pmatrix}
\lambda_{Aj}^2 + \lambda_{Cj}^2 + \lambda_{Ej}^2 + \theta_j^{(r)} & r\lambda_{Aj}^2 + \lambda_{Cj}^2 \\
r\lambda_{Aj}^2 + \lambda_{Cj}^2 & \lambda_{Aj}^2 + \lambda_{Cj}^2 + \lambda_{Ej}^2 + \theta_j^{(r)}
\end{pmatrix}
$$

for the reference groups and

$$
\mathrm{cov}\left(y_{1j}^{(n)}, y_{2j}^{(n)}\right) =
\begin{pmatrix}
\lambda_{Aj}^2\phi_A^{(n)} + \lambda_{Cj}^2\phi_C^{(n)} + \lambda_{Ej}^2\phi_E^{(n)} + \theta_j^{(n)} & r\lambda_{Aj}^2\phi_A^{(n)} + \lambda_{Cj}^2\phi_C^{(n)} \\
r\lambda_{Aj}^2\phi_A^{(n)} + \lambda_{Cj}^2\phi_C^{(n)} & \lambda_{Aj}^2\phi_A^{(n)} + \lambda_{Cj}^2\phi_C^{(n)} + \lambda_{Ej}^2\phi_E^{(n)} + \theta_j^{(n)}
\end{pmatrix}
$$

for the non-reference groups, where the $\phi^{(n)}$'s are the variances of factors in the non-reference groups, $\theta$ represents the residual variance of each variable, and $r$ is the correlation between additive genetic factors for twin 1 and 2, from Eq. 2. This model requires multiple indicator variables and, as mentioned above, the independent pathway model. Because there are three common factors for each twin at least three phenotypic means are needed to identify factor means. If the common pathway model is used, the model with factor means estimated in non-reference groups cannot be identified because three factor means in non-reference groups cannot be estimated based on a single latent variable. For this reason, the common pathway model cannot be employed for the purpose of current study.

Application to ordered categorical variables

As mentioned above, direct application of the factor analytic model to ordered categorical variables, in which ordered categorical variables are treated as continuous variables, is often problematic, especially when relatively few response categories are used (Olsson 1979; Johnson and Creech 1983; Lubke and Muthén 2004), due to violations of assumptions of factor analytic models of

continuous variables (Bollen 1989). In the LRV formu-

lation, a normally distributed latent response variable,

which underlies each observed categorical variable, is

assumed and factor models are applied to latent response

variables, instead of the observed categorical variables

(Muthén 1984; Muthén and Asparouhov 2002; Skrondal and Rabe-Hesketh 2004). Under the LRV formulation, Eq. 1 is expressed in terms of the latent response variables. Denoting $y_{ij}^*$ as continuous latent response variables that underlie observed variable $y_{ij}$ for the i-th twin (i = {1, 2}), with $C_j$ response categories, Eq. 1 then becomes:

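$$
\begin{pmatrix} y_{1j}^{*(g)} \\ y_{2j}^{*(g)} \end{pmatrix}
=
\begin{pmatrix} \alpha_j^{*(g)} \\ \alpha_j^{*(g)} \end{pmatrix}
+
\begin{pmatrix}
\lambda_{Aj} & \lambda_{Cj} & \lambda_{Ej} & 0 & 0 & 0 \\
0 & 0 & 0 & \lambda_{Aj} & \lambda_{Cj} & \lambda_{Ej}
\end{pmatrix}
\begin{pmatrix} A_1^{(g)} \\ C_1^{(g)} \\ E_1^{(g)} \\ A_2^{(g)} \\ C_2^{(g)} \\ E_2^{(g)} \end{pmatrix}
+
\begin{pmatrix} \varepsilon_{1j}^{*(g)} \\ \varepsilon_{2j}^{*(g)} \end{pmatrix},
\qquad (3)
$$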

where $\alpha_j^{*(g)}$ and $\varepsilon_{ij}^{*(g)}$ are the intercept and residual, respectively, of latent response variable $y_{ij}^*$. From Eq. 3, the mean of the latent response variable $y_{ij}^*$ is determined by its intercept and the means of common factors. The variance/covariance matrix between twins is determined by factor loadings and factor variances/covariances. A latent response variable $y_j^*$ maps onto observed categorical variable $y_j$ with $C_j - 1$ thresholds.

$$
y_{ij} =
\begin{cases}
1, & \text{if } y_{ij}^* \le \tau_{j1} \\
2, & \text{if } \tau_{j1} < y_{ij}^* \le \tau_{j2} \\
\vdots & \\
C_j - 1, & \text{if } \tau_{j,C_j-2} < y_{ij}^* \le \tau_{j,C_j-1} \\
C_j, & \text{if } \tau_{j,C_j-1} < y_{ij}^*
\end{cases}
\qquad (4)
$$

$\tau_{j1}, \tau_{j2}, \ldots, \tau_{j,C_j-1}$ are the thresholds which segment the latent response variable $y_j^*$ into $C_j$ categories. From Eqs. 3 and 4, the conditional probability that variable $y_j$ falls into category c, given the intercept and genetic and environmental factors, is determined as the cumulative probability associated with the (c - 1)-th category subtracted from cumulative probability of the c-th category. Assuming $\varepsilon_{ij}^*$ is normally distributed with mean of zero and variance of $\theta_{ij}$, given factors $A_i$, $C_i$, and $E_i$ and intercept $\alpha_j$, the probability that variable $y_{ij}$ falls into category c is,

$$
\begin{aligned}
P(y_{ij} = c \mid \alpha_j, A_i, C_i, E_i)
&= P(y_{ij} \le c \mid \alpha_j, A_i, C_i, E_i) - P(y_{ij} \le c - 1 \mid \alpha_j, A_i, C_i, E_i) \\
&= \Phi\left[\frac{\tau_{jc} - (\alpha_j + \lambda_{Aj}\kappa_A + \lambda_{Cj}\kappa_C + \lambda_{Ej}\kappa_E)}{\sqrt{\theta_{ij}}}\right]
 - \Phi\left[\frac{\tau_{j,c-1} - (\alpha_j + \lambda_{Aj}\kappa_A + \lambda_{Cj}\kappa_C + \lambda_{Ej}\kappa_E)}{\sqrt{\theta_{ij}}}\right],
\end{aligned}
\qquad (5)
$$

where $\Phi$ is the cumulative distribution function of standard normal distribution and the $\kappa$'s are factor means.
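The category probabilities in Eq. 5 can be sketched numerically as differences of normal cumulative probabilities. The helper below is a hypothetical illustration, not the estimation routine used in Mplus: `mean` stands for the whole term $\alpha_j + \lambda_{Aj}\kappa_A + \lambda_{Cj}\kappa_C + \lambda_{Ej}\kappa_E$, and `resid_sd` for $\sqrt{\theta_{ij}}$. The threshold values are taken from the three-category illustration used later in the text.

```python
from statistics import NormalDist


def category_probs(thresholds, mean, resid_sd):
    """Probabilities of categories 1..C for one ordered item, following Eq. 5.

    thresholds: increasing list of C - 1 thresholds
    mean:       intercept plus loading-weighted factor means
    resid_sd:   square root of the residual variance
    """
    cdf = NormalDist().cdf
    # Cumulative probabilities P(y <= c); the last category closes at 1.
    cum = [cdf((t - mean) / resid_sd) for t in thresholds] + [1.0]
    probs, prev = [], 0.0
    for c in cum:
        probs.append(c - prev)   # P(y = c) = P(y <= c) - P(y <= c - 1)
        prev = c
    return probs


# Thresholds -0.2533 and 1.2816 with a standard normal latent response variable.
probs = category_probs([-0.2533, 1.2816], mean=0.0, resid_sd=1.0)
```

Under these assumptions the three probabilities are close to 0.40, 0.50 and 0.10, that is, cumulative proportions of roughly 40 and 90%.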

The LRV formulation introduces additional parameters

(i.e., thresholds). Thresholds are estimated from the marginal distribution of each latent response variable $y_j^*$. In single-group analyses $y_j^*$ is usually assumed to follow a

standard normal distribution and thresholds are estimated

as z-scores corresponding to the cumulative proportion

associated with each response category, which is a con-

sistent estimator of the cumulative probability of the latent

response variable. However, latent response variables

assumed to have standard normal distributions cannot be

used for genetic factor models that decompose means of

latent response variables because no information is present

regarding mean differences of latent response variables

between groups. In multiple-group settings, the distribu-

tions of latent response variables do not necessarily have

means of zero and unit variances across all groups. By

setting the distributions of latent response variables in the

reference groups to have mean of zero and unit variance,

means and variances of latent response variables for the

non-reference groups can be estimated relative to the

metric of latent response variables in the reference groups.

Minimally, to estimate the mean and variance of each

latent response variable in the non-reference groups, two

thresholds per each variable need to be constrained equal

across groups in order to provide the location and scale of

the latent response variable in the non-reference groups.

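Denoting $z_{j1}^{(n)}$ and $z_{j2}^{(n)}$ as the z-scores corresponding to the cumulative proportions of the first and second categories of variable $y_j^{(n)}$ in the non-reference groups and $\tau_{j1}$ and $\tau_{j2}$ as the thresholds estimated from the reference groups, based on standard normal distribution, $\mu_j^{*(n)}$ and $\sigma_j^{*(n)}$, the mean and standard deviation of the latent response variable underlying $y_j^{(n)}$ (the variance being the square of $\sigma_j^{*(n)}$), can be estimated from the following.

$$
\frac{\tau_{j1} - \mu_j^{*(n)}}{\sigma_j^{*(n)}} = z_{j1}^{(n)}, \qquad
\frac{\tau_{j2} - \mu_j^{*(n)}}{\sigma_j^{*(n)}} = z_{j2}^{(n)}
\qquad (6)
$$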

Figures 1 and 2 illustrate estimation of means and

variances of a latent response variable across groups.


Assuming three ordered response categories, the upper

panel of Fig. 1 shows the thresholds based on a standard

normal distribution for the variable with cumulative

proportions of 40 and 90%. Thresholds are estimated as

z-scores corresponding to these cumulative proportions and

are -0.2533 and 1.2816, respectively. The lower panel

shows thresholds for the cumulative proportion of 20 and

80%. With the same distributional assumption on the latent

response variable, the first and second thresholds are

changed to -0.8416 and 0.8416, respectively. In Fig. 2,

using the same cumulative proportions as in Fig. 1,

thresholds are fixed at values from the upper panel and

the mean and variance of the latent response variable in the

lower panel are estimated by Eq. 6. The upper panel is

based on standard normal distribution so that the mean is

zero (as indicated by a dashed reference line) and the

variance is 1. In the lower panel of Fig. 2, for the cumulative proportions of 20 and 80%, the mean and scale are estimated based on the fixed thresholds and the distribution in the lower panel is changed to have the mean of 0.5141 (marked by a heavy dashed line) and the standard deviation of 0.9119 (a variance of about 0.83). In short, locational differences of

thresholds provide information on the mean differences

and the interval between thresholds provides information

on the variance differences of latent response variables.
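The worked example in Figs. 1 and 2 can be reproduced with a short sketch that solves the two equations of Eq. 6 for the location and scale of the latent response variable. The function names are hypothetical; only the Python standard library is used, and the cumulative proportions are those quoted in the text.

```python
from statistics import NormalDist


def z_scores(cum_props):
    """z-scores of a standard normal for the given cumulative proportions."""
    inv = NormalDist().inv_cdf
    return [inv(p) for p in cum_props]


def solve_location_scale(tau1, tau2, z1, z2):
    """Solve (tau_k - mu) / sigma = z_k for k = 1, 2, as in Eq. 6."""
    sigma = (tau2 - tau1) / (z2 - z1)   # interval between thresholds sets the scale
    mu = tau1 - sigma * z1              # location of the thresholds sets the mean
    return mu, sigma


# Reference group: cumulative proportions 40% and 90% under a standard normal.
tau1, tau2 = z_scores([0.40, 0.90])
# Non-reference group: cumulative proportions 20% and 80%, thresholds held fixed.
z1, z2 = z_scores([0.20, 0.80])
mu, sigma = solve_location_scale(tau1, tau2, z1, z2)
```

Here `tau1` and `tau2` come out near -0.2533 and 1.2816, `mu` near 0.5141 and `sigma` near 0.9119. Note that `sigma` is the divisor in Eq. 6, so the implied variance is `sigma ** 2`, roughly 0.83.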

Obtained means and variances of the latent response

variables can thus be modeled. In general multiple-group

analysis of ordered categorical variables, differences in

means and variances of a latent response variable across

the groups are estimated by constraining respective thresholds to be equal across a selected reference group and remaining non-reference groups (Muthén and Asparouhov 2002; Millsap and Yun-Tein 2004). In

multiple-group genetic factor analyses, the reference and

non-reference groups are defined by the levels of additional

grouping variables. All zygosity groups within the selected

level of the grouping variable are set as the reference

groups with appropriate equality constraints applied across

the reference and non-reference groups.

Even though the means and variances of latent response

variables can be estimated as departures relative to the

reference groups, this model is still under-identified. The

identification of the model is further complicated because

the latent response variables are indirectly modeled based

on distributional assumptions and thresholds. The model in

Eq. 3 can be identified, however, by applying constraints

on parameters. Millsap and Yun-Tein (2004) developed the

general minimum conditions to identify multiple-group

factor models of ordered categorical manifest variables.

Although these minimal identification conditions cover

factor analytic models generally, the unique structure and

requirements of the genetic factor analysis require additional consideration when developing identification constraints.

It can be shown that the following set of constraints can minimally identify the model in Eq. 3: (a) the mean and variance of each latent response variable, $y_j^*$, are set to zero and one, respectively, in the reference groups, (b) the mean and variance of each factor are set to zero and one, respectively, in the reference groups, (c) factor loadings are

Fig. 1 Different thresholds estimated for different response proportions based on the standard normal distributions (horizontal axis: y*; vertical axis: probability). In the upper panel, thresholds estimated for the cumulative proportions of 40 and 90% are -0.2533 and 1.2816 (vertical lines in the upper panel), respectively. In the lower panel, thresholds estimated for the cumulative proportions of 20 and 80% are -0.8416 and 0.8416 (vertical lines in the lower panel), respectively

Fig. 2 Different means and variances estimated based on the fixed thresholds (horizontal axis: y*; vertical axis: probability). In the upper panel, thresholds estimated for the cumulative proportions of 40 and 90% are -0.2533 and 1.2816 (vertical lines in the upper panel), respectively. In the upper panel, mean and variance are zero (dashed line in the upper panel) and one, respectively. In the lower panel, for the cumulative proportions of 20 and 80%, thresholds are fixed to the same values from the first panel (vertical lines in the lower panel) and the mean and standard deviation are estimated as in Eq. 6 based on the fixed thresholds, which are 0.5141 (heavy dashed line in the lower panel) and 0.9119, respectively

constrained to be equal across groups, (d) the intercept of

each variable is set to zero in both reference and non-

reference groups, and (e) for three selected indicator vari-

ables two thresholds are set to be equal across groups.

Constraints (a) and (b) identify all parameters in the ref-

erence groups by providing the scale for latent response

variables and factors. Constraint (c) identifies the factor

loadings for the non-reference groups. Constraint (e)

identifies the means and variances of the latent response

variables of the three chosen indicator variables in the non-

reference groups, which, in conjunction with the con-

straints (c) and (d), leads to the identification of the

distributions of the factors in the non-reference groups. The

means and variances of the latent response variables not

included in constraint (e) can be estimated based on the

distributions of the factor and factor loadings. Identification

of the model is detailed in Technical supplement A. Thus, the covariance structures are determined as follows

$$
\mathrm{cov}\left(y_{1j}^{*(r)}, y_{2j}^{*(r)}\right) =
\begin{pmatrix}
1 & r\lambda_{Aj}^2 + \lambda_{Cj}^2 \\
r\lambda_{Aj}^2 + \lambda_{Cj}^2 & 1
\end{pmatrix}
$$

for reference groups and

$$
\mathrm{cov}\left(y_{1j}^{*(n)}, y_{2j}^{*(n)}\right) =
\begin{pmatrix}
\lambda_{Aj}^2\phi_A^{(n)} + \lambda_{Cj}^2\phi_C^{(n)} + \lambda_{Ej}^2\phi_E^{(n)} + \theta_j^{(n)} & r\lambda_{Aj}^2\phi_A^{(n)} + \lambda_{Cj}^2\phi_C^{(n)} \\
r\lambda_{Aj}^2\phi_A^{(n)} + \lambda_{Cj}^2\phi_C^{(n)} & \lambda_{Aj}^2\phi_A^{(n)} + \lambda_{Cj}^2\phi_C^{(n)} + \lambda_{Ej}^2\phi_E^{(n)} + \theta_j^{(n)}
\end{pmatrix}
$$

for non-reference groups, where the $\phi^{(n)}$'s are the variances of factors. The mean of each latent response variable is determined as,

$$
\begin{aligned}
\mu_j^{*(r)} &= 0, && \text{for reference groups, and} \\
\mu_j^{*(n)} &= \lambda_{Aj}\delta_A^{(n)} + \lambda_{Cj}\delta_C^{(n)} + \lambda_{Ej}\delta_E^{(n)}, && \text{for non-reference groups,}
\end{aligned}
\qquad (7)
$$

where the $\delta^{(n)}$'s are the differences of factor means from the reference groups.

Several related issues of the proposed model require further elaboration. First, intercept differences between variables are not included in the current model. As noted by Dolan et al. (1992), setting intercepts of the variables to zero is equivalent to assuming that all indicator variables have the same intercepts, which ignores possible location differences among the variables. However, in the LRV formulation, intercepts and thresholds are not entirely distinct from each other and intercept differences between the latent response variables are absorbed into the differently estimated thresholds across variables (Muthén and Asparouhov 2002; Millsap and Yun-Tein 2004). To show this point, Eq. 5 is revisited.

$$
P(y_j \le c \mid A, C, E) = \Phi\left[\frac{\tau_{cj} - (\lambda_{Aj}\kappa_A + \lambda_{Cj}\kappa_C + \lambda_{Ej}\kappa_E)}{\sqrt{\theta_j}}\right]
\qquad (8)
$$

Differences of intercepts and thresholds cannot be simultaneously estimated. It requires either intercepts of variables fixed to a constant or one of the thresholds set equal across variables to estimate thresholds or intercept differences. Assuming that the variables have the same number of categories, estimating the intercept difference between variables requires that at least one threshold is set equal across the variables, to provide a reference point from which intercept differences can be estimated, and one intercept of a selected variable is fixed to a constant, usually zero. With the intercept differences in the model and a constrained c-th threshold, the conditional probability that the response $y_j$ is less than or equal to the c-th category, given intercept and factors, is,

$$
P(y_j \le c \mid \alpha_j, A, C, E) = \Phi\left[\frac{\tau_c - \alpha_j - (\lambda_{Aj}\kappa_A + \lambda_{Cj}\kappa_C + \lambda_{Ej}\kappa_E)}{\sqrt{\theta_j}}\right].
\qquad (9)
$$

The subscript j is omitted from $\tau_c$ because the c-th threshold of each variable has been set equal across variables. From Eq. 9 the threshold for the c-th category of variable j is adjusted from $\tau_c$ by $\alpha_j$, which is $(\tau_c - \alpha_j)$. Thus, if any intercept difference exists between variables, it is captured by differentially estimated thresholds across variables. A common threshold adjusted for variable j, $(\tau_c - \alpha_j)$, from Eq. 9, and the threshold estimated for variable j, $\tau_{cj}$, from Eq. 8 should be equivalent. Also, the numbers of estimated intercept and threshold parameters are the same in Eqs. 8 and 9. Supposing p variables with C categories each, Eq. 8 contains p(C - 1) estimated thresholds. In Eq. 9, parameters to be estimated are a threshold, $\tau_c$, constrained across variables; p(C - 2) remaining thresholds; and (p - 1) intercepts making a total of p(C - 1) estimated parameters. Thus, Eqs. 8 and 9 are re-parameterizations of each other and, given that the different intercept is not a main substantive question, estimating the differential intercept across the variables is unnecessary. Differences of intercepts across variables are therefore absorbed into different estimates of thresholds across the variables. Likewise,

intercept differences of a subset of variables across groups

are also captured by differently estimated thresholds. In the

identification constraints described above, two thresholds

per each of three selected indicator variables are constrained

to be equal across groups. Thus the rest of the thresholds can

be differently estimated across groups and the intercept

differences of those variables across groups are absorbed

into group specific thresholds for each indicator variable.
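The equivalence of Eqs. 8 and 9 can be checked numerically. The following is a minimal Python sketch, not part of the original analysis: the loadings, factor values, residual variance, threshold, and intercept are hypothetical illustration values, and the function names are introduced here only for the example. It shows that a common threshold with a variable-specific intercept (Eq. 9) gives the same cumulative probability as a variable-specific threshold with a zero intercept (Eq. 8), and that the two parameter counts agree.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal CDF, written with math.erf to avoid extra dependencies."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def cumulative_prob(threshold, intercept, loadings, factors, theta):
    """P(y_j <= c | factors) for a latent response variable with residual variance theta."""
    linear_part = sum(lam * f for lam, f in zip(loadings, factors))
    return normal_cdf((threshold - intercept - linear_part) / sqrt(theta))


# Hypothetical illustration values (not estimates from the paper).
loadings = (0.6, 0.3, 0.2)   # lambda_Aj, lambda_Cj, lambda_Ej
factors = (0.5, -0.2, 0.1)   # one set of values for A, C, E
theta_j = 0.5                # residual variance of the latent response variable
tau_c = 0.8                  # c-th threshold, shared across variables (Eq. 9)
alpha_j = 0.3                # intercept of variable j (Eq. 9)

# Eq. 9: common threshold plus a variable-specific intercept.
p_eq9 = cumulative_prob(tau_c, alpha_j, loadings, factors, theta_j)

# Eq. 8: intercept fixed at zero, variable-specific threshold tau_cj = tau_c - alpha_j.
p_eq8 = cumulative_prob(tau_c - alpha_j, 0.0, loadings, factors, theta_j)


def n_params_eq8(p, C):
    """Thresholds estimated in Eq. 8: C - 1 per variable."""
    return p * (C - 1)


def n_params_eq9(p, C):
    """Eq. 9: one shared threshold, p(C - 2) remaining thresholds, p - 1 intercepts."""
    return 1 + p * (C - 2) + (p - 1)
```

With six three-category items, for example, both counting functions give 12 parameters, which is the sense in which the two equations are re-parameterizations of each other.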

Second, although, in the minimal constraints described

above, differences of factor means and variances across

groups can be estimated based on the invariance of factor

loadings and thresholds of a subset of parameters, the

invariance of the remaining parameters not constrained can

further be investigated as part of a larger examination of

factorial invariance across groups. Testing invariance

hypotheses in factor models using ordered categorical

variables is more complicated than in the cases of contin-

uous variables because the latent response variables are

indirectly modeled using distributional assumptions and

thresholds. As such, factorial invariance involves interre-

lated equalities of factor loadings, intercepts, residual

variances, and thresholds of latent response variables

across groups. Testing invariance hypotheses using latent

response variables within the proposed model is more

complicated because constraints in the minimally identified

model require factor loadings to be constrained to equality

in order to estimate the means and variances of the latent

response variables and the genetic and environmental fac-

tors. Further, the location of the latent response variable

must be determined as a function of factor means in those

variables chosen for the minimal identification constraints.

Nevertheless, there is some flexibility when investigating

invariance hypotheses relative to a base model with minimal identification constraints. The base model, in which the thresholds not needed for identification are estimated freely in each group, can then be compared to a model in which these thresholds are set equal across groups. Under the LRV approach, as seen above (see

Eq. 6; Figs. 1, 2), thresholds play important roles because

they contain information about means and variances of

latent response variables. By comparing the fit of a model

with fully invariant thresholds and loadings to the fit of the

minimally constrained model, one can at least test whether a

model with fully invariant thresholds and loadings is a

parsimonious, well-fitting alternative. If it fits appreciably

worse than the minimally constrained model, either factor

loadings or thresholds may not be invariant. Although the specific source of misfit of the model may be difficult to pinpoint, general strategies for probing the possible sources of model misfit are described in the discussion section below.

Finally, factorial invariance of ordered categorical vari-

ables also involves the invariance of the residual variance of

the latent response variable associated with each indicator

variable. The invariance of residual variances can also be

tested by fitting the model with equality constraints on

residual variances. However, it should be noted that,

depending on the parameterization method used, residual

variances of latent response variables may not be independent parameters to be estimated (Muthén and Asparouhov

2002) and, thus, applying constraints on residual vari-

ances may not be possible. Note also that no genetic factor

structure is specified at the level of each indicator variable

in Eq. 3. It is possible to incorporate additional genetic and

environmental factor structures specific for each variable

(as was done for the original genetic factor model described

by Heath et al. 1989a for the case of continuous variables)

without modifying current identification constraints. How-

ever, given the computational burden introduced by

including genetic and environmental components unique to

each variable, sample size considerations, and the fact that

such effects are often not of substantive interest, allowing residual covariances between twins that vary across zygosity groups appears to be a reasonable accommodation of the

unique genetic and environmental factors associated with

individual indicator variables.

Illustrative example

Data collected from the Australian Twin Registry on the Wilson–Patterson conservatism scale (Wilson and Patterson 1968) were used as an example application of the proposed

model (see Martin et al. 1986 for more information on data

collection and summary statistics). Based on zygosity and

sex, 3,808 pairs of twins were divided into five groups.

There are 1,202 pairs of monozygotic female (MZF) twins,

567 pairs of monozygotic male (MZM) twins, 747 pairs of

dizygotic female (DZF) twins, 350 pairs of dizygotic male

twins (DZM), and 912 pairs of opposite sex (DZO) twins.

The Wilson–Patterson conservatism scale consists of 50 items with three response categories per item that indicate the degree of assent: "Yes", "?", and "No".

Response categories are assumed to be ordered because

they reflect the degree of a respondent’s supportive attitude

on each item. As originally developed, odd-numbered items

were worded to have positive relationships with conserva-

tive attitude (e.g. Apartheid, Church Authority, etc.) and

even-numbered items were worded to have negative rela-

tionships (e.g. Colored Immigration, Evolution Theory,

etc.) (Wilson and Patterson 1968).


Because analyzing all 50 items in one model poses great computational burdens on parameter estimation of the model being considered, items were first factor analyzed to determine general sub-dimensions of conservative attitude and each subset of items was separately analyzed. This practice is also in keeping with the original design of the scale, which assumes that an individual can have different attitudes on different sub-dimensions of conservatism (Wilson and Patterson 1968). An exploratory factor analysis performed in Mplus version 5 (Muthén and Muthén 2007) using promax rotation produced four oblique sub-dimensions. Each sub-dimension was named according to the items included: the political, religious, racial, and social dimensions. Within each factor, items with factor loadings with an absolute value greater than 0.4 were retained, and are shown in Table 1. The pattern of items included in each dimension is roughly consistent with earlier exploratory factor analyses of the data (Truett et al. 1992; Eaves et al. 1999).

Age was included in the model as a covariate because of

the age-cohort effect on conservatism (Truett 1993; Eaves

et al. 1999). Truett (1993) found strong evidence that

conservatism scores on this scale are greater in older respondents than in younger respondents after adjusting for a variety of other covariates and found that this change is more rapid after the fifth decade of life. In this sample, such a rapid change after the fifth decade of life was not found and, thus,

the linear effect of the deviation score from the mean of age

was used as a covariate. The regression coefficient of age on

each variable was constrained to be equal across groups.

Table 1 Items included in each dimension

Dimension   Items
Political   Patriotism, Licensing Law, Royalty, Censorship, Strict Rules, Inborn Conscience
Religious   Evolution Theory, Sabbath Observance, Birth Control, Divine Law, Legalized Abortion, Church Authority, Divorce
Racial      Death Penalty, Empire Building, Women Judges, Apartheid, Caning, Colored Immigration
Social      Hippies, Modern Art, Student Pranks, Nudist Camps, Jazz, Casual Living, Pyjama Party

Model specification

In matrix form, for the variable $j$, the model is expressed as

\[
\begin{pmatrix} y^{*(g)}_{1j} \\ y^{*(g)}_{2j} \end{pmatrix}
=
\begin{pmatrix} \beta_j \\ \beta_j \end{pmatrix} \mathrm{Age}
+
\begin{pmatrix}
\lambda_{Aj} & \lambda_{Cj} & \lambda_{Ej} & 0 & 0 & 0 \\
0 & 0 & 0 & \lambda_{Aj} & \lambda_{Cj} & \lambda_{Ej}
\end{pmatrix}
\begin{pmatrix} A^{(g)}_1 \\ C^{(g)}_1 \\ E^{(g)}_1 \\ A^{(g)}_2 \\ C^{(g)}_2 \\ E^{(g)}_2 \end{pmatrix}
+
\begin{pmatrix} \varepsilon^{*(g)}_{j1} \\ \varepsilon^{*(g)}_{j2} \end{pmatrix},
\]

where $\beta_j$ is the regression slope for variable $j$ on age. Since all observed variables have three ordered categories, the latent response variable $y^{*(g)}_j$ in group $g$ is mapped onto the observed variable $y^{(g)}_j$ with two thresholds, $\tau^{(g)}_{j1}$ and $\tau^{(g)}_{j2}$. In order to set female groups as reference groups, the MZF, DZF and female twins of DZO group were set as the reference groups, and MZM, DZM, and the male twins of DZO group were set as the non-reference groups. The path diagram in Fig. 3 illustrates this model with four indicator variables. To present the models for both female and male groups in one diagram, the left side of diagram is the model for female groups and the right side is the model for male groups. The triangle in the diagram denotes a column vector of ones and, as such, the loadings originating from this variable to factors represent factor means. Note that factor means in the female part are set to zero, but are estimated in the male part of the diagram. Latent response variables are represented as circles linked to observed categorical variables via filled circles to represent the transformation from latent response variables to observed categorical variables.

Mplus version 5 (Muthén and Muthén 2007) was used to estimate the model parameters. The default estimation method for ordered categorical variables in Mplus, weighted least squares (WLSMV), was used (Muthén and Muthén 2007). The variance of each variable in non-reference groups was estimated via scale parameters using the Delta Parameterization method (Muthén and Asparouhov 2002). The scale parameter is the inverse of the standard deviation of the marginal distribution of the latent response variable for each categorical indicator variable. Thus, the scale parameter for each variable was estimated in the non-reference groups and fixed to one in the reference groups. Although the delta parameterization has computational advantages over the theta parameterization (Muthén and Asparouhov 2002), which is an alternative parameterization in Mplus, each residual variance is not a free parameter to be estimated and no equality constraints across the groups on residual variances are permitted with the delta parameterization. The residual variances are computed as $1 - (\lambda^2_{Aj} + \lambda^2_{Cj} + \lambda^2_{Ej})$ in the reference groups and as $\sigma^{*2(n)}_j - (\lambda^2_{Aj}\phi^{(n)}_A + \lambda^2_{Cj}\phi^{(n)}_C + \lambda^2_{Ej}\phi^{(n)}_E)$ in the non-reference groups. Inability of imposing equality constraints on residual variances with the delta parameterization prevents the assessment of the invariance of residual variances. Equality constraints on residual variances can be applied with the theta parameterization, but the models using the theta parameterization for these data did not converge in any sub-dimensions.
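The residual variance computation under the delta parameterization can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than the Mplus implementation: the function names are introduced here, the loadings and male factor variances are the Patriotism values reported in Table 2, and the male scale parameter (1.01) is a hypothetical value, since scale parameter estimates are not reported in this section.

```python
def lrv_variance_from_scale(scale):
    """Delta parameterization: the scale parameter is 1 / SD of the latent response variable."""
    return 1.0 / scale ** 2


def residual_variance(lrv_variance, loadings, factor_variances):
    """Residual variance = latent response variance minus variance explained by A, C, E."""
    explained = sum(lam ** 2 * phi for lam, phi in zip(loadings, factor_variances))
    return lrv_variance - explained


def r_squared(loading, factor_variance, lrv_variance):
    """Proportion of latent response variance explained by one factor."""
    return loading ** 2 * factor_variance / lrv_variance


# Loadings on A, C, E for the item Patriotism (Table 2).
loadings = (0.632, 0.253, 0.103)

# Reference (female) groups: scale parameter and factor variances fixed to one.
theta_ref = residual_variance(lrv_variance_from_scale(1.0), loadings, (1.0, 1.0, 1.0))

# Non-reference (male) groups: factor variances from Table 2;
# the scale parameter 1.01 is HYPOTHETICAL, chosen only for illustration.
male_lrv_variance = lrv_variance_from_scale(1.01)
theta_nonref = residual_variance(male_lrv_variance, loadings, (0.934, 0.457, 1.190))
```

In the reference groups the squared loading is itself the proportion of explained variance, so `r_squared(0.632, 1.0, 1.0)` rounds to the 0.399 listed for females in Table 2. In the non-reference groups the same quantity depends on both the factor variance and the estimated scale parameter.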

The model was specified in accordance with the minimal

identification constraints described in previous sections.

Variances of latent response variables in MZF, DZF, and

the female part of DZO groups were set to one by setting

the scale parameters for each variable to one. The scale

parameters in MZM, DZM, and male part of DZO groups

were estimated. Factor means and variances were set to

zero and one, respectively, in female groups, and were

estimated in male groups. Setting factor means to zero in

female groups sets the means of latent response variables to

zero because Mplus does not allow intercepts for ordered

categorical variables and the intercepts are set to zero by

default. Factor loadings were set to be equal across groups.

The thresholds of the first three indicator variables were set

to be equal across female and male groups. As in other

genetic factor analyses, the model is symmetric and

respective parameters are constrained to be equal across

twins in the same pair. The covariance structure between

twins was specified by With statements in Mplus, but

With statements do not permit direct specification of

correlations between non-standardized variables. Instead,

to apply the correlation structure in Eq. 2, nonlinear

constraints were applied to covariances between factors

using Model constraint statements. Covariances

between common environmental factors (factor C) were

constrained to be equal to the variance of factor C in MZM

and DZM groups and the square root of the variance of

factor C estimated for the male part in the DZO group. Covariances between additive genetic factors (factor A) were constrained to equal the variance of factor A in the MZM group, half of the variance of factor A in the DZM group, and half of the square root of the variance of factor A estimated for the male part in the DZO group. Instead

of modeling genetic factor structure for each variable, the

covariance between twins for each variable was estimated.

The residual covariances between twins were allowed to

vary across zygosity groups (MZF and MZM for mono-

zygotic twins and DZF, DZM, and DZO groups for

dizygotic twins) to accommodate different correlation

structures between monozygotic and dizygotic twins. An

example Mplus program is in Table 7 with detailed

description in Technical supplement B.
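The covariance constraints described above all follow from converting a fixed factor correlation into a covariance, given the factor variances of the two twins. The Python sketch below illustrates that conversion; it is not the Mplus input in Table 7. The function name is introduced here, the correlations (1 for factor C in all groups; 1 for factor A in MZ twins and 0.5 in DZ twins) are assumed from the description of the model, and the male factor variances are the religious-dimension estimates in Table 3, used only as example inputs.

```python
from math import sqrt


def factor_covariance(correlation, var_twin1, var_twin2):
    """Covariance implied by a fixed correlation and the two twins' factor variances."""
    return correlation * sqrt(var_twin1 * var_twin2)


FEMALE_VAR = 1.0               # factor variances are fixed to one in female groups
phi_A, phi_C = 1.386, 1.418    # male factor variances (Table 3), example inputs

# Additive genetic factor: correlation 1 for MZ twins, 0.5 for DZ twins.
cov_A = {
    "MZM": factor_covariance(1.0, phi_A, phi_A),       # equals phi_A
    "DZM": factor_covariance(0.5, phi_A, phi_A),       # equals phi_A / 2
    "DZO": factor_covariance(0.5, FEMALE_VAR, phi_A),  # equals sqrt(phi_A) / 2
}

# Common environmental factor: correlation 1 in every group.
cov_C = {
    "MZM": factor_covariance(1.0, phi_C, phi_C),       # equals phi_C
    "DZM": factor_covariance(1.0, phi_C, phi_C),       # equals phi_C
    "DZO": factor_covariance(1.0, FEMALE_VAR, phi_C),  # equals sqrt(phi_C)
}
```

The square roots in the DZO group arise because the female twin's factor variance is fixed to one while the male twin's variance is estimated, so the product of the two standard deviations reduces to the square root of the male variance.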

Fig. 3 The path-diagram of the proposed model. The left side of diagram is the model for females and the right side is the model for males. The loadings originating from the triangle to factors represent the factor means. Latent response variables are represented as circles linked to observed categorical variables via filled circles. [Path diagram not reproduced. Panel titles: "Model for females", "Model for males". Diagram label: "1 for MZ twins and .5 for DZ twins".]

Analysis

Models with minimal identification constraints were fit first for all sub-dimensions. Even though $\chi^2$ values were significant, which might be due to the large sample size, other fit indices indicated acceptable fit for all sub-dimensions ($\chi^2(218) = 258.709$, CFI = 0.989, TLI = 0.987, RMSEA = 0.016 for the political dimension; $\chi^2(190) = 274.355$, CFI = 0.993, TLI = 0.993, RMSEA = 0.024 for the religious dimension; $\chi^2(201) = 294.627$, CFI = 0.981, TLI = 0.977, RMSEA = 0.025 for the racial dimension; $\chi^2(241) = 411.206$, CFI = 0.982, TLI = 0.980, RMSEA = 0.030 for the social dimension). Models with fully invariant thresholds were then fit and compared with the minimally constrained models. Since $\chi^2$ values from weighted least square estimation (WLSMV) are not valid for $\chi^2$ difference testing (Muthén and Muthén 2007), the difftest option in Mplus was used. For the political dimension and religious dimension, $\chi^2$ differences between the models with fully invariant thresholds and minimally constrained models were not significant ($\chi^2_{\mathrm{diff}}(5) = 1.166$, P = 0.948 for the political dimension and $\chi^2_{\mathrm{diff}}(6) = 7.628$, P = 0.267 for the religious dimension), while $\chi^2$ differences were significant for the racial dimension and social dimension ($\chi^2_{\mathrm{diff}}(5) = 17.714$, P = 0.003 and $\chi^2_{\mathrm{diff}}(5) = 19.813$, P = 0.002, respectively). As discussed above, worse fit of models with fully invariant thresholds than the minimally constrained model may indicate lack of factorial invariance between groups for those sub-dimensions, and further investigation to locate the origin of the lack of invariance is appropriate. However, because assessing the factorial invariance is not a major purpose of this illustration, it is not described here.

Tables 2, 3, 4, 5 show the parameters estimated from the model for each sub-dimension. Models with fully invariant thresholds for the political dimension (Table 2) and the religious dimension (Table 3) are shown in accordance with the results of $\chi^2$ difference tests of the model with fully invariant thresholds. The models for the racial dimension (Table 4) and the social dimension (Table 5) are the models with minimal identification constraints. The square of the factor loading associated with each factor represents the proportion of the variance in each latent response variable explained by each factor in female groups, but the same interpretation is not valid for male groups because variances of factors and latent response variables are not set to one in male groups. Thus, the $R^2$ values for each item due to each factor for female and male groups are shown to the right of the factor loadings to show the relative contribution of each factor to the variance of each latent response variable. Signs of factor loadings indicate the direction of the effect of the factors on each item. As mentioned in the previous section, odd-numbered items were positively worded and even-numbered items were negatively worded for conservative attitudes. Therefore factor loadings on positively worded items and negatively worded items are expected to have opposite signs, but, for some items, this pattern was not clearly present. Factor means are estimated means for the male groups relative to the factor means of female groups in which factor means were set to zero. Estimated means quantify the magnitude and direction of differences in factor means of male groups relative to females. A factor mean multiplied by its factor loading on each item represents the mean difference of the latent response variable between female and male groups due to each factor (see Eq. 7). Estimated means of each item for the male groups are shown in Table 6. Item means are computed as linear combinations of factor means as in Eq. 7. For ease of presentation, the signs of mean scores of the negatively worded items are reversed in Table 6, so that higher scores represent more conservative attitudes for all items. Factor variances in Tables 2, 3, 4, 5 represent the variances of factors in male groups relative to the factor variances of female groups, in which factor variances were set to one.

Table 2 Parameter estimates for the political dimension (a)

Item                Additive genetic              Common environment            Unique environment
                    Loading (SE)    R² F   R² M   Loading (SE)    R² F   R² M   Loading (SE)    R² F   R² M
Patriotism          0.632 (0.041)   0.399  0.381  0.253 (0.077)   0.064  0.030  0.103 (0.033)   0.010  0.013
Royalty             0.422 (0.029)   0.178  0.147  -0.086 (0.058)  0.007  0.003  0.215 (0.030)   0.046  0.049
Censorship          0.396 (0.035)   0.157  0.143  -0.148 (0.054)  0.022  0.010  0.503 (0.041)   0.253  0.294
Strict rules        0.367 (0.061)   0.135  0.156  -0.466 (0.047)  0.217  0.123  0.212 (0.030)   0.045  0.066
Licensing laws      0.219 (0.030)   0.048  0.073  0.031 (0.039)   0.001  0.001  0.383 (0.036)   0.147  0.284
Inborn conscience   0.297 (0.030)   0.088  0.066  0.156 (0.045)   0.024  0.001  0.111 (0.029)   0.012  0.012

                    Additive genetic                  Common environment                Unique environment
                    Estimate (SE)   95% CI            Estimate (SE)   95% CI            Estimate (SE)   95% CI
Factor mean (b)     0.231 (0.163)   -0.129, 0.590     -0.067 (0.167)  -0.395, 0.260     -1.431 (0.236)  -1.894, -0.968
Factor variance (c) 0.934 (0.148)   0.644, 1.225      0.457 (0.115)   0.232, 0.682      1.190 (0.239)   0.720, 1.190

Fit statistics: χ²(219) = 264.465, CFI = 0.988, TLI = 0.985, RMSEA = 0.017

R² F and R² M are the R² values for female and male groups, respectively
(a) For the political dimension, parameter estimates are from the model with fully invariant thresholds
(b) Factor means are estimated in male groups and are set to zero in female groups
(c) Factor variances are estimated in male groups and are set to one in female groups
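The decomposition of an item mean into factor contributions can be reproduced from the estimates in Table 2. The sketch below is an illustration in Python, assuming that Eq. 7 has the additive form described in the text (each factor mean multiplied by its loading, summed over the three factors); the function and variable names are introduced here for the example only.

```python
# Loadings on (A, C, E) for the political items, from Table 2.
LOADINGS = {
    "Patriotism": (0.632, 0.253, 0.103),
    "Royalty": (0.422, -0.086, 0.215),
    "Censorship": (0.396, -0.148, 0.503),
    "Strict rules": (0.367, -0.466, 0.212),
    "Licensing laws": (0.219, 0.031, 0.383),
    "Inborn conscience": (0.297, 0.156, 0.111),
}

# Male factor means for (A, C, E) from Table 2; female factor means are fixed at zero.
FACTOR_MEANS = (0.231, -0.067, -1.431)


def factor_contributions(loadings, factor_means):
    """Mean difference in the latent response variable due to each factor."""
    return tuple(lam * kappa for lam, kappa in zip(loadings, factor_means))


def item_mean(loadings, factor_means):
    """Male latent response mean relative to females: sum of the factor contributions."""
    return sum(factor_contributions(loadings, factor_means))


male_item_means = {
    item: round(item_mean(lams, FACTOR_MEANS), 3) for item, lams in LOADINGS.items()
}
```

For Patriotism, the three contributions are roughly 0.146 (additive genetic), -0.017 (common environment), and -0.147 (unique environment), which sum to about -0.018. The six rounded sums agree with the political column of Table 6.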

For the political dimension, in Table 2, $R^2$ values indicate that the additive genetic factor was an important factor that determined the variance of most items included in this dimension, while environmental factors were only important for some items ("Strict Rules" and "Licensing Laws"). The $R^2$ values were not noticeably different between female and male groups with a few exceptions (the common environmental factor on "Strict Rules" and the unique environmental factor on "Licensing Law"). The factor

mean of the unique environmental factor was negative and

the largest in absolute value while factor means of the

additive genetic factor and common environmental factor

were not significantly different from zero. The patterns of

factor means and loadings resulted in the negative net

effect on each item for males (the first column of Table 6).

Relative to women, men showed more variance in the

unique environmental factor but less variance in the addi-

tive genetic and common environmental factors.

For the religious dimension, as shown in Table 3, variances of items related to marriage and procreation were determined relatively more by the additive genetic factor, whereas the variances of items related to religious claims were largely determined by environmental factors. $R^2$ values indicate that the contributions of each factor on each latent response variable were not noticeably different between female and male groups. The mean of the common environmental factor for men was positive and the largest in absolute value and the corresponding mean of the additive genetic factor was negative. The mean of the unique environmental factor was negligible. Thus, mean differences between men and women were mainly determined by the common environmental and additive genetic factors, which resulted in the negative net effect on the items related to religious claims and positive net effect on the items related with procreation in men (second column of Table 6). Higher factor variances on the bottom of Table 3 indicate that the factor variances were larger in men.

Table 3 Parameter estimates for the religious dimension (a)

Item                Additive genetic              Common environment            Unique environment
                    Loading (SE)    R² F   R² M   Loading (SE)    R² F   R² M   Loading (SE)    R² F   R² M
Legalized abortion  0.739 (0.041)   0.546  0.515  0.389 (0.061)   0.152  0.147  0.195 (0.032)   0.038  0.041
Birth control       0.684 (0.038)   0.468  0.384  0.252 (0.064)   0.064  0.053  0.114 (0.040)   0.013  0.012
Divorce             0.571 (0.037)   0.326  0.298  0.332 (0.049)   0.110  0.103  0.180 (0.029)   0.032  0.034
Evolution theory    0.266 (0.042)   0.071  0.073  0.389 (0.031)   0.151  0.160  0.164 (0.023)   0.027  0.032
Sabbath observance  -0.159 (0.056)  0.025  0.025  -0.498 (0.025)  0.248  0.251  -0.451 (0.023)  0.203  0.232
Divine law          -0.137 (0.057)  0.019  0.022  -0.518 (0.024)  0.268  0.316  -0.464 (0.023)  0.215  0.287
Church authority    -0.203 (0.059)  0.041  0.040  -0.546 (0.027)  0.298  0.297  -0.448 (0.023)  0.201  0.226

                    Additive genetic                  Common environment                Unique environment
                    Estimate (SE)   95% CI            Estimate (SE)   95% CI            Estimate (SE)   95% CI
Factor mean (b)     -0.647 (0.271)  -1.178, -0.115    1.809 (0.719)   0.400, 3.218      -1.189 (0.776)  -2.711, 0.332
Factor variance (c) 1.386 (0.264)   0.828, 1.943      1.418 (0.230)   0.967, 1.870      1.602 (0.228)   1.156, 2.049

Fit statistics: χ²(190) = 274.355, CFI = 0.993, TLI = 0.993, RMSEA = 0.024

R² F and R² M are the R² values for female and male groups, respectively
(a) For the religious dimension, parameter estimates are from the model with fully invariant thresholds
(b) Factor means are estimated in male groups and are set to zero in female groups
(c) Factor variances are estimated in male groups and are set to one in female groups

For the racial dimension, factor loadings and $R^2$ values in Table 4 show no dominant factor affecting all items in this dimension. The additive genetic factor was relatively more important in determining the items "Death Penalty" and "Caning". The relative contributions of factors varied between female and male groups in some items. Factor means were not significant for any of the three factors, and thus

further interpretation of factor means is not necessary. Men

appeared much more variable in terms of their environ-

mental factors, as evidenced by the larger factor variances

associated with the environmental factors.

For the social dimension, in Table 5, the additive

genetic factor was an important factor for explaining the

variance of most items while unique environmental factors were important for items "Modern Art" and "Jazz." No noticeable differences between male and female groups were found for $R^2$ due to each factor. The factor mean was

negative and largest in absolute value for the unique

environmental factor. The factor mean for the additive

genetic factor was positive and the factor mean for unique

environmental factor was negative. Thus, the mean dif-

ferences between men and women were mainly determined

by the additive genetic factor and unique environmental

factor, which results in the pattern of net effect in Table 6.

Factor variances associated with environmental factors

were larger in men (shown on the bottom of Table 5).

Discussion

The model proposed in this study incorporates latent mean

structures for ordered categorical variables in a genetic

factor model using the LRV formulation. Between group

differences of means in estimated latent response variables

associated with the ordered categorical variables are

decomposed into differences due to genetic and environ-

mental factors. A main advantage of the proposed model is

its ability to test hypotheses regarding the origins of within and between group variations in phenotypic behaviors due to environmental and genetic factors in ordered categorical variables and to make statements as to whether these effects suppress or elevate levels of a phenotypic behavior. In the case of continuous phenotypic indicators, the genetic factor model with mean structure proposed by Dolan et al. (1992) provides a method in this regard, and the model proposed here extends it to the case of ordered categorical variables. The minimal constraints needed to identify the model with latent mean structures were derived and the models with further constraints on the subset of parameters of interest can be fit and compared to the model with minimal constraints to test related hypotheses.

Table 4 Parameter estimates for the racial dimension (a)

Item                Additive genetic              Common environment            Unique environment
                    Loading (SE)    R² F   R² M   Loading (SE)    R² F   R² M   Loading (SE)    R² F   R² M
Death Penalty       0.669 (0.051)   0.447  0.521  0.023 (0.025)   0.001  0.003  -0.097 (0.024)  0.009  0.029
Empire-building     0.340 (0.045)   0.115  0.046  0.402 (0.045)   0.162  0.319  -0.675 (0.053)  0.456  0.480
Women Judges        -0.074 (0.041)  0.005  0.004  -0.181 (0.039)  0.033  0.105  0.188 (0.039)   0.035  0.061
Apartheid           0.221 (0.037)   0.049  0.022  0.336 (0.037)   0.113  0.251  -0.345 (0.032)  0.119  0.142
Caning              0.386 (0.033)   0.149  0.201  -0.014 (0.023)  0.000  0.001  -0.084 (0.022)  0.007  0.0254
Colored immigration -0.349 (0.033)  0.122  0.073  -0.290 (0.034)  0.084  0.248  0.316 (0.030)   0.010  0.157

                    Additive genetic                  Common environment                Unique environment
                    Estimate (SE)   95% CI            Estimate (SE)   95% CI            Estimate (SE)   95% CI
Factor mean (b)     0.974 (1.086)   -1.154, 3.102     14.262 (11.995) -9.248, 37.772    10.108 (8.028)  -5.626, 25.843
Factor variance (c) 1.431 (0.380)   0.686, 2.175      7.074 (2.187)   2.787, 11.360     3.779 (0.915)   1.985, 5.573

Fit statistics: χ²(201) = 294.627, CFI = 0.981, TLI = 0.977, RMSEA = 0.025

R² F and R² M are the R² values for female and male groups, respectively
(a) For the racial dimension, parameter estimates are from the model with minimal identification constraints
(b) Factor means are estimated in male groups and are set to zero in female groups
(c) Factor variances are estimated in male groups and are set to one in female groups

The framework of the proposed model can also be

applied to the case of longitudinal panel studies. Analogous

to multiple-group settings, one of the occasions of repeated

measurements can be set as a reference point. If the same

variable is measured repeatedly for the same respondent,

one can justifiably assume invariant thresholds across

repeated measurements. Proportional differences across

repeated measurements can then be converted into distributional differences of the latent response variables based on equally constrained thresholds across repeated measurements (Bollen and Curran 2006; Mehta et al. 2004). If the distributional differences of latent response variables over time are estimated, then the autoregressive effects of genetic and environmental factors on mean differences over time can be estimated (Dolan et al. 1991) or genetic and environmental effects on the latent growth factors, intercept and slope, can be analyzed (McArdle 1986).

Table 5 Parameter estimates for the social dimension (a)

Item                Additive genetic              Common environment            Unique environment
                    Loading (SE)    R² F   R² M   Loading (SE)    R² F   R² M   Loading (SE)    R² F   R² M
Hippies             0.584 (0.021)   0.341  0.332  -0.042 (0.039)  0.002  0.005  0.316 (0.024)   0.100  0.123
Modern art          0.295 (0.026)   0.087  0.076  -0.083 (0.029)  0.007  0.017  0.553 (0.031)   0.306  0.336
Student pranks      0.526 (0.026)   0.277  0.354  -0.160 (0.039)  0.026  0.094  0.051 (0.023)   0.003  0.004
Nudist camps        0.644 (0.031)   0.415  0.371  0.274 (0.053)   0.075  0.194  0.290 (0.025)   0.084  0.095
Jazz                0.237 (0.027)   0.056  0.048  -0.042 (0.026)  0.002  0.004  0.487 (0.030)   0.237  0.256
Casual living       0.397 (0.028)   0.158  0.155  0.168 (0.036)   0.028  0.080  0.257 (0.026)   0.066  0.082
Pyjama party        0.427 (0.029)   0.182  0.171  0.194 (0.036)   0.038  0.102  0.298 (0.024)   0.089  0.105

                    Additive genetic                  Common environment                Unique environment
                    Estimate (SE)   95% CI            Estimate (SE)   95% CI            Estimate (SE)   95% CI
Factor mean (b)     0.670 (0.219)   0.240, 1.100      -0.675 (0.811)  -2.264, 0.915     -0.855 (0.189)  -1.226, -0.484
Factor variance (c) 1.357 (0.184)   0.966, 1.717      3.914 (0.803)   2.339, 5.488      1.716 (0.234)   1.258, 2.174

Fit statistics: χ²(241) = 411.206, CFI = 0.982, TLI = 0.980, RMSEA = 0.030

R² F and R² M are the R² values for female and male groups, respectively
(a) For the social dimension, parameter estimates are from the model with minimal identification constraints
(b) Factor means are estimated in male groups and are set to zero in female groups
(c) Factor variances are estimated in male groups and are set to one in female groups

Table 6 Estimated means of latent continuous variables for males (a)

Political           Mean    Religious           Mean    Racial               Mean    Social          Mean
Patriotism          -0.018  Birth control       0.122   Death penalty        -0.001  Hippies         -0.149
Licensing Law       -0.500  Legalized abortion  0.004   Empire-building      -0.758  Modern art      0.219
Royalty             -0.204  Divorce             -0.017  Women judges         0.753   Student pranks  -0.417
Censorship          -0.618  Evolution theory    -0.337  Apartheid            1.520   Nudist camps    0.001
Strict rules        -0.187  Sabbath observance  -0.262  Caning               -0.673  Jazz            0.229
Inborn conscience   -0.101  Divine law          -0.297  Colored immigration  1.282   Casual living   0.067
                            Church authority    -0.324                               Pyjama party    0.010

(a) The signs of mean scores of the negatively worded items were reversed, so that higher scores represent more conservative attitudes for all items

Although the model as proposed offers a useful method

for decomposing latent differences of ordered categorical

variables across groups via genetic and environmental

factors, several limitations of the model must be kept in

mind when applying and interpreting such a model. Most of

these limitations derive from the fact that distributional

differences of latent response variables are estimated by

means of proportional differences of observed categorical

variables, and the constraints required to identify the

model. First, although the minimum identification con-

straints derived can provide some flexibility for further

constraints on parameters, invariance constraints on

required parameters cannot be avoided. It is possible to

impose alternative sets of parameter constraints, and the use

of a different set of minimal parameter constraints would

result in different estimated parameters. This is a common

issue associated with factor models with ordered categorical

variables because latent response variables are modeled

indirectly through distributional assumptions and thresholds,

so that not all parameters can be estimated (Millsap

and Yun-Tein 2004). Thus, the purpose and structure of

specific models should be considered when identification

constraints are chosen. Although the identification con-

straints derived are appropriate for the purpose and

structure of the proposed model, these assumptions should

still be examined.
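
The dependence of the estimates on the chosen constraints can be illustrated for a single ordered categorical variable. The following is a minimal sketch, not the model estimated in this article: the function name `category_probs` and all numerical values are hypothetical, and a normal latent response variable cut at two thresholds is assumed. Shifting and rescaling the latent metric changes every latent parameter but leaves the observed category proportions unchanged, which is why some parameters must be fixed before the others can be estimated.

```python
from statistics import NormalDist


def category_probs(thresholds, mean, sd):
    """Category probabilities of an ordered variable obtained by cutting a
    normal latent response variable N(mean, sd^2) at the given thresholds."""
    lrv = NormalDist(mean, sd)
    cum = [0.0] + [lrv.cdf(t) for t in thresholds] + [1.0]
    return [hi - lo for lo, hi in zip(cum, cum[1:])]


# Hypothetical values, chosen only for illustration.
thresholds, mean, sd = [-0.5, 0.8], 0.3, 1.2

# Shift and rescale the latent metric: y* -> a * y* + b.
a, b = 2.0, -1.0
original = category_probs(thresholds, mean, sd)
rescaled = category_probs([a * t + b for t in thresholds], a * mean + b, a * sd)

# The two parameter sets differ, yet they imply the same observed proportions.
same = all(abs(p - q) < 1e-12 for p, q in zip(original, rescaled))
print(same)
```

Because the data cannot distinguish between the two parameter sets, the choice of which thresholds, means, or variances to fix determines the metric in which the remaining parameters are expressed.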

Recall that three variables are required to have two

invariant thresholds across groups. The three selected variables

function as anchoring variables across groups in order to

provide the scales of three factors—additive genetic,

common environmental, and unique environmental factors.

Although, in the example, the first three variables were

chosen for the threshold constraints, any set of three vari-

ables could have been selected and could lead to different

results. Moreover, if the variables have more than three

response categories, so that there are more than two

thresholds per variable, any two of the thresholds can be

constrained equal across groups. If the thresholds of one

subset of indicator variables are more invariant than those of

other variables, it is more reasonable to apply the equality

constraints on those variables. These questions are not

confined to genetic factor analyses, and Mehta et al. (2004)

suggested a mathematical formulation of threshold

invariance in the context of repeatedly measured

ordered categorical variables, which can also be applied to

multiple-group contexts. However, this method only

applies to the case of more than two thresholds per vari-

able. Determining which set of variables has invariant

thresholds may be analogous to finding anchoring items

with invariant item characteristics as described in studies of

differential item functioning (DIF) analyses within item

response theory (IRT). As such, iterative processes that

have been proposed to find such anchoring items (e.g.,

Candell and Drasgow 1988; Drasgow 1987) can be

utilized in the context of genetic factor analyses to find

an appropriate combination of variables on which to apply

equality constraints on thresholds.
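
The role of two invariant thresholds as anchors can be sketched for a single variable. The example below is an illustration under stated assumptions rather than the estimation procedure used in this article: the function name `lrv_mean_sd` and all numbers are hypothetical, the latent response variable is assumed normal, and the reference group is assumed to have a latent mean of zero and a variance of one. Given two thresholds held equal across groups, the cumulative proportions observed in a comparison group determine the mean and standard deviation of its latent response variable.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for probit transforms


def lrv_mean_sd(cum_props, anchor_thresholds):
    """Solve (tau_k - mean) / sd = z_k for two anchored thresholds.

    cum_props: observed P(y <= first cut) and P(y <= second cut) in a group.
    anchor_thresholds: the two threshold values held equal across groups.
    """
    z1, z2 = (STD_NORMAL.inv_cdf(p) for p in cum_props)
    t1, t2 = anchor_thresholds
    sd = (t2 - t1) / (z2 - z1)
    mean = t1 - sd * z1
    return mean, sd


# Reference group: LRV fixed at N(0, 1), so the thresholds are the probits
# of its cumulative proportions (hypothetical proportions).
thresholds = [STD_NORMAL.inv_cdf(p) for p in (0.30, 0.75)]

# Comparison group: generate proportions from a known mean and SD ...
true_mean, true_sd = 0.4, 1.3
comparison = NormalDist(true_mean, true_sd)
cum_props = [comparison.cdf(t) for t in thresholds]

# ... and recover them from the proportions and the anchored thresholds.
mean, sd = lrv_mean_sd(cum_props, thresholds)
print(round(mean, 6), round(sd, 6))
```

With more than two thresholds per variable, anchoring a different pair in `anchor_thresholds` would generally give a different mean and standard deviation unless all thresholds are in fact invariant, which is the sensitivity described above.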

As noted earlier, assessment of invariance of factor

loadings in the proposed model is somewhat complicated.

Factor loadings are set equal across groups in the minimal

identification constraints and freeing them across groups

to test factor loading invariance would under-identify the

model. Testing factor loading invariance therefore

requires the use of alternative approaches. One possible

solution is to constrain thresholds to be fully invariant and to

free any factor loadings not required to identify the

model. This model could be compared to the proposed

model, although these models are not nested within each

other and such comparison would have to be based on

information criteria. Alternatively, group-specific factor

loadings could be estimated if the means and variances of

both factors and latent response variables are set to zero

and one, respectively, in all groups. Group-specific thresholds can then

be estimated. This method may provide an alternative for

assessing factor loading invariance because different factor

loadings can be estimated across groups while the group

differences of means and variances of latent response

variables are captured by thresholds that are estimated

separately in each group. Factor loadings can then be

constrained to be equal across groups in this model, and the

two models can be compared to obtain evidence of factor

loading invariance. However, with this setting, the mean of

each latent response variable is not decomposed into genetic

and environmental factors, and adding factor means may

produce different patterns of factor loadings. These

strategies, while heuristic and labor intensive, may prove

quite useful in future applications.
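
The reason group-specific thresholds can absorb group differences in the latent response variables can be stated compactly for a single item. The notation here is assumed for illustration and is not taken from the article: $\mu_g$ and $\sigma_g$ denote the mean and standard deviation of the latent response variable in group $g$, $\tau_k$ is a threshold common to all groups, $\tau_{gk}$ is a group-specific threshold, and $\Phi$ is the standard normal distribution function.

```latex
P(y \le k \mid g)
  = \Phi\!\left(\frac{\tau_k - \mu_g}{\sigma_g}\right)
  = \Phi\!\left(\tau_{gk}\right),
\qquad
\tau_{gk} = \frac{\tau_k - \mu_g}{\sigma_g}.
```

Fixing $\mu_g = 0$ and $\sigma_g = 1$ in every group and estimating $\tau_{gk}$ freely therefore reproduces the same category proportions, but the group differences are then carried by the thresholds and are no longer available for decomposition into genetic and environmental factor means.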

Additionally, the delta parameterization method of

Mplus was used in the example analysis. Under the delta

parameterization, the latent variance of each ordered

categorical indicator is modeled in terms of a scale

parameter, which is the inverse of the standard deviation

of a latent response variable. Scale parameters are

allowed to vary across groups to estimate the across-group

differences of the variance of each latent response variable.

Although the delta parameterization method has a

computational advantage in model estimation (Muthén

and Asparouhov 2002), it has other disadvantages in
