Modeling Maternal-Offspring Gene-Gene Interactions: The Extended-MFG Test

Erica J. Childs1, Christina G.S. Palmer2,3, Kenneth Lange3,4, and Janet S. Sinsheimer1,3,4,*

1Department of Biostatistics, University of California, Los Angeles, California

2Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles,

California

3Department of Human Genetics, University of California, Los Angeles, California

4Department of Biomathematics, University of California, Los Angeles, California

Abstract

Maternal-fetal genotype (MFG) incompatibility is an interaction between the genes of a mother

and offspring at a particular locus that adversely affects the developing fetus, thereby increasing

susceptibility to disease. Statistical methods for examining MFG incompatibility as a disease risk

factor have been developed for nuclear families. Because families collected as part of a study can

be large and complex, containing multiple generations and marriage loops, we create the

Extended-MFG (EMFG) Test, a model-based likelihood approach, to allow for arbitrary family

structures. We modify the MFG test by replacing the nuclear-family based “mating type” approach

with Ott’s representation of a pedigree likelihood and calculating MFG incompatibility along with

the Mendelian transmission probability. In order to allow for extension to arbitrary family

structures, we make a slightly more stringent assumption of random mating with respect to the

locus of interest. Simulations show that the EMFG test has an appropriate type-I error rate and

power and gives precise parameter estimates when random mating holds. Our simulations and real data

example illustrate that the chief advantages of the EMFG test over the earlier nuclear family

version of the MFG test are improved accuracy of parameter estimation and power gains in the

presence of missing genotypes.

Keywords

maternal-fetal interaction; gene by environment; family based association; pedigree likelihood

INTRODUCTION

Studies show that prenatal environment may contribute to disease risk in offspring later in

their lives [Cannon, 1997; Cantor-Graae et al., 2000; Louey and Thornburg, 2005;

McKinney et al., 1999]. One example is Maternal-Fetal Genotype (MFG) incompatibility,

which arises through an interaction between maternal and fetal gene products [Laing et al.,

1995; Marcelis et al., 1998; Ober, 1998]. The underlying genetic basis for this

incompatibility is what allows us to study it even decades after the adverse environment has

passed.

© 2010 Wiley-Liss, Inc.

*Correspondence to: Janet S. Sinsheimer, Department of Human Genetics, 5357C Gonda Center, 695 Charles E. Young Drive South,

Box 957088, Los Angeles, CA 90095-7088. janet@mednet.ucla.edu.

NIH Public Access

Author Manuscript

Genet Epidemiol. Author manuscript; available in PMC 2011 September 2.

Published in final edited form as:

Genet Epidemiol. 2010 July ; 34(5): 512–521. doi:10.1002/gepi.20508.

NIH-PA Author Manuscript

Sinsheimer et al. [2003] developed the MFG incompatibility test, a likelihood-based,

affected-only statistical method for examining MFG incompatibility as a disease risk factor.

The original method uses parent-offspring trios in an adaptation of Weinberg’s log-linear

method [Weinberg et al., 1998] for estimating genotypic relative risks for maternal and

offspring main effects. The MFG test allows the user to jointly model offspring allelic

effects, maternal allelic effects, and maternal-offspring genotype interactions, such as

maternal-fetal genotype incompatibility. Kraft et al. [2004, 2005] extended the MFG test to

allow multiple siblings per nuclear family and to model locus-specific effects of maternal

immunological processes, such as prior exposure. Other researchers developed variations of the

MFG test for use with nuclear families or with case-mother control-mother data [Chen et al.,

2005, 2009; Cordell et al., 2004; Li et al., 2009]. Through the MFG and related tests, MFG

incompatibility has been tested as a potential risk factor for diseases as diverse as autism,

schizophrenia, and rheumatoid arthritis (RA) [Chen et al., 2005, 2009; Hsieh et al., 2006a;

Insel et al., 2005; Palmer et al., 2002, 2006, 2008; Zandi et al., 2006].

Although the MFG test is powerful and adept at jointly modeling effects without

confounding them, the restriction to nuclear families forces researchers who have extended

family data to make a difficult choice. They must either risk losing power by selecting a

single nuclear family per extended pedigree or risk introducing biases by partitioning their

extended pedigrees into a number of nuclear families treated as independent. For this reason,

we have created the Extended-MFG (EMFG) test to handle arbitrary pedigree structures in

testing hypotheses about maternal allelic effects, offspring allelic effects, and maternal-

offspring genotype incompatibilities. Before we develop the EMFG test, we review the

current, nuclear family MFG test. We then use simulations to illustrate the properties and

potential advantages and disadvantages of the EMFG test when compared to the nuclear

family MFG test. Finally, we provide a simple illustration of the EMFG test on a real data

set consisting of a single large pedigree.

MFG INCOMPATIBILITY CONCEPTS

Table I illustrates how the MFG test works in general for MFG incompatibility at a bi-allelic

locus. There are seven possible combinations of maternal and offspring genotypes (Table I,

Columns 1 and 2). The most general model (Table I, Column 3) allows the offspring’s

disease risk to differ for each MFG combination [Sinsheimer et al., 2003]. In Table I, these

risks are designated by δij, where i and j represent the number of variant alleles (allele 2)

present in the mother and offspring, respectively. Genotype interaction (MFG

incompatibility) is reflected by the lack of constraints on δij that force the multiplicative

model δij = ρi ηj and make maternal and offspring effects independent.
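
The count of seven combinations follows from Mendelian transmission: a 1/1 mother cannot have a 2/2 child, and a 2/2 mother cannot have a 1/1 child. The sketch below enumerates the compatible (i, j) pairs and checks whether a given risk table satisfies the multiplicative constraint δij = ρi ηj. The function names and the dictionary layout are illustrative assumptions, not part of the MFG software.

```python
from itertools import product

def compatible_mfg_pairs():
    """Return (i, j) pairs, where i and j count variant alleles (allele 2)
    in the mother and offspring, that are possible under Mendelian rules."""
    pairs = []
    for i, j in product(range(3), repeat=2):
        # The mother passes on exactly one allele, so the counts can
        # differ by at most one only when she is homozygous; a
        # heterozygous mother (i = 1) is compatible with every j.
        if abs(i - j) <= 1:
            pairs.append((i, j))
    return pairs

def is_multiplicative(delta, rho, eta, tol=1e-12):
    """Check delta[(i, j)] == rho[i] * eta[j] for every listed pair.
    rho and eta are hypothetical maternal and offspring main effects."""
    return all(abs(delta[(i, j)] - rho[i] * eta[j]) <= tol for (i, j) in delta)

pairs = compatible_mfg_pairs()
baseline = {pair: 1.0 for pair in pairs}          # no effects at all
rhd_like = dict(baseline)
rhd_like[(0, 1)] = 2.0                            # extra risk for one combination only
```

With unit main effects, `baseline` satisfies the multiplicative model while `rhd_like` does not, which is the sense in which an unconstrained δij encodes a genotype interaction.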

When prior evidence exists for a particular maternal-fetal genotype incompatibility

mechanism, the number of parameters can be reduced. Two examples are provided in Table

I and Figure 1A, B. Consider RHD incompatibility (Table I, Column 4 and Fig. 1A). In this

case, allele 2 corresponds to the antigen-coding allele (often coded as D) and allele 1

corresponds to the null allele (often coded as d). RHD incompatibility occurs when the

immune system of a mother with genotype 1/1 recognizes the fetus’ protein product from

the 1/2 genotype as foreign and mounts an immune response that can be detrimental to her

offspring. The potentially detrimental genotype combination has an increased risk of μ over

the baseline risk of β. All other maternal-offspring genotype combinations have the baseline

risk.

The second example (Table I, Column 5 and Fig. 1B) is derived from RA and HLA-DRB1,

where offspring allelic effects are highly significant and non-inherited maternal antigens

(NIMA) have been implicated to increase risk of disease in offspring [Harney et al., 2003;

Newton et al., 2004; van der Horst-Bruinsma et al., 1998]. Offspring with 1/2 or 2/2

genotypes are at increased risk (ρ1 and ρ2, respectively) over the baseline risk. NIMA effects

can occur in 1/1 offspring whose mother has genotype 1/2. This MFG combination carries

increased risk to the offspring.

THE NUCLEAR FAMILY MFG TEST

We start our model development with the affected-only, nuclear family MFG [Kraft et al.,

2004, 2005], which allows for multiple siblings in a family. The MFG test conditional

likelihood controls for ascertainment and has the following form for a completely genotyped

family:

(1)

Here, G = (G1,…,Gk) represents the observed genotypes of the k affected offspring in the

family, and Gr and Gs represent the genotypes of the mother and father. The denominator

sums over all possible ordered (phased) genotypes for offspring and parents (g, gr, gs). The

vector D = (D1,…,Dk) denotes the disease phenotypes of the k offspring in the family, with Dc

equal to 1 for each affected offspring c. There are six mating types (MT) to consider for a bi-allelic locus (Table

II). Using mating types controls for non-random mating at the trait locus.

The MFG test is an affected-only analysis; however, the genotypes of unaffected or

phenotype unknown offspring contribute if there are missing parental data. The conditional

likelihood (1) must be summed over all possible parental genotypes (gr, gs) consistent with

the observed genotypes in the family, which now include the genotypes of unaffected or

phenotype unknown offspring [Hsieh et al., 2006b; Kraft et al., 2004; Minassian et al., 2006;

Palmer et al., 2002]. The denominator remains the same. The likelihood for the entire

sample is the product of the likelihoods for separate and independent families.

THE EMFG TEST

Reliance on mating types limits the traditional MFG test to nuclear family analysis and

burdens the model with nuisance parameters. In extended pedigrees, it is unclear how to

weight matings in which one or both parents are non-founders. Assuming random mating at

the locus of interest avoids mating types. Random mating takes place when selection of a

mate is independent of marker genotype. Thus, the probability of a mating type equals the

product of the genotype frequencies (Table II, Column 3). The model now depends on

founder genotypes. Based on this assumption, the conditional likelihood of any pedigree can

be evaluated by taking the ratio of two unconditional pedigree likelihoods [Ott, 1974].
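
As a small illustration of the random mating assumption, the sketch below computes mating-type probabilities as products of genotype frequencies. It collapses the nine ordered (mother, father) pairs into six unordered mating types, which is an assumption about how the types are tabulated; the genotype labels and function name are likewise illustrative.

```python
from itertools import product

GENOTYPES = ("1/1", "1/2", "2/2")

def mating_type_probs(geno_freq):
    """Probability of each unordered mating type under random mating.

    geno_freq maps each genotype label to its population frequency.
    Each ordered (mother, father) pair has probability equal to the
    product of the two genotype frequencies; pairs that differ only in
    which parent carries which genotype are pooled.
    """
    probs = {}
    for mother, father in product(GENOTYPES, repeat=2):
        key = tuple(sorted((mother, father)))
        probs[key] = probs.get(key, 0.0) + geno_freq[mother] * geno_freq[father]
    return probs

# Example frequencies (Hardy-Weinberg proportions for P(1) = 0.33).
freqs = {"1/1": 0.33 ** 2, "1/2": 2 * 0.33 * 0.67, "2/2": 0.67 ** 2}
mt = mating_type_probs(freqs)
```

Because the probabilities are determined by the three genotype frequencies, no separate mating-type parameters need to be estimated.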

The MFG likelihood for a pedigree with n members is defined as the conditional likelihood

L(G|D) of the observed genotypes G given the trait phenotypes D in the pedigree. The

affected-only analysis is achieved by treating all unaffected individuals as phenotype

unknown. Hence, the vector D omits the disease phenotypes of pedigree founders. In

contrast, G includes the marker genotypes of founders when these genotypes are known. If

users have affected founders, they can be included in the analysis by introducing their

parents without phenotype or genotype data. If genotypes are missing for some individuals

or genotype phases are unknown, the likelihood is summed over the possible ordered

genotypes g that are consistent with the observed genotypes G in the family.
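
The summation over ordered genotypes can be pictured with a short sketch for a bi-allelic locus. The representation of an ordered genotype as a (maternal allele, paternal allele) tuple and the use of `None` for a missing observation are assumptions made for illustration.

```python
# Ordered (phased) genotypes: (allele inherited from mother, allele from father).
ORDERED_GENOTYPES = [(1, 1), (1, 2), (2, 1), (2, 2)]

def consistent_ordered_genotypes(observed):
    """List the ordered genotypes compatible with an observed genotype.

    observed is an unordered pair of alleles such as (1, 2), or None
    when the individual was not genotyped.
    """
    if observed is None:
        # A missing genotype places no restriction on the sum.
        return list(ORDERED_GENOTYPES)
    target = tuple(sorted(observed))
    return [g for g in ORDERED_GENOTYPES if tuple(sorted(g)) == target]
```

An observed heterozygote contributes two phased terms to the sum, a homozygote one, and an untyped individual all four.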

The conditional likelihood for the maternal-fetal genotype incompatibility effects is:

(2)

The probability of disease in offspring c is a function of both her and her mother’s genotype.

Prior(gj) represents founder j’s genotype frequency. These founder population genotype

frequencies are estimated by maximizing the likelihood along with the risk parameters. The

second term in the numerator, Pr(Gi|gi), is 1 if the proposed genotype for i, gi, is consistent

with the observed genotype Gi, and 0 if it is inconsistent. When Gi is missing, Pr(Gi|gi) = 1.

This term is calculated for each family member, regardless of affection status. Offspring and

maternal allelic effects, and maternal-fetal genotype interactions are parameterized through

Pr(Dc|gc,gr), which is calculated with Trans(gc|gr,gs), the transmission probability for

offspring, mother, and father triple (c,r,s). The denominator sums over all possible ordered

(phased) genotypes for the n family members, and is similar to standard ascertainment

correction [Lange, 2002]. When parental genotypes are unavailable for founders, we treat

them as phenotype unknown.
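
Collecting the terms described in this section, the conditional likelihood can be written along the following lines. This is a sketch assembled from the verbal definitions above, so the index sets and the ordering of the factors are assumptions: j ranges over founders, i over all n pedigree members, and (c, r, s) over offspring-mother-father triples, with the penetrance factor taken as 1 for any offspring whose phenotype is treated as unknown.

```latex
L(G \mid D) =
\frac{\displaystyle \sum_{g_1} \cdots \sum_{g_n}
        \prod_{j} \mathrm{Prior}(g_j)
        \prod_{i} \Pr(G_i \mid g_i)
        \prod_{(c,r,s)} \Pr(D_c \mid g_c, g_r)\, \mathrm{Trans}(g_c \mid g_r, g_s)}
     {\displaystyle \sum_{g_1} \cdots \sum_{g_n}
        \prod_{j} \mathrm{Prior}(g_j)
        \prod_{(c,r,s)} \Pr(D_c \mid g_c, g_r)\, \mathrm{Trans}(g_c \mid g_r, g_s)}
```

The numerator and denominator differ only in the indicator Pr(Gi|gi), so the ratio is the probability of the observed genotypes among all genotype configurations, weighted by the disease model.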

To illustrate the adaptability of Pr(Dc|gc,gr), we consider a few pertinent examples. First,

consider RHD incompatibility and hemolytic disease of the newborn where disease risk

differs for boys and girls.

(3)

MFG incompatibility occurs when the mother’s genotype is 1/1 and the offspring’s is 1/2.

In our second example we modify Pr(Dc|gc,gr) to allow for both NIMA and offspring allelic

effects as follows:

(4)

Parameters ρ1 and ρ2 represent the relative risk of disease for an offspring who carries one

or two copies of a risk allele at the locus of interest compared to an offspring who carries no

copies of a risk allele. MFG incompatibility occurs when the mother’s genotype is 1/2 and

the offspring’s is 1/1. Since Di is considered missing if i is unaffected or phenotype

unknown and the baseline population prevalence β cancels from the conditional likelihood,

we fix β = 1, which is equivalent to δ00 = 1 (Table I, Column 3).
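
The two risk models described above can be sketched as small functions of the mother's and offspring's variant-allele counts. The function names, the sex coding, and the use of the symbol mu for the NIMA effect are assumptions made for illustration; the defaults fix β = 1 as in the text.

```python
def rhd_risk(mother, child, child_sex, mu_m, mu_f, beta=1.0):
    """Pr(D_c | g_c, g_r) under RHD incompatibility with sex-specific effects.

    mother, child: number of copies of allele 2 (0 = 1/1, 1 = 1/2, 2 = 2/2).
    child_sex: "M" or "F" (hypothetical coding).
    """
    if mother == 0 and child == 1:
        # Incompatible combination: 1/1 mother with 1/2 offspring.
        return beta * (mu_m if child_sex == "M" else mu_f)
    return beta

def nima_risk(mother, child, rho1, rho2, mu, beta=1.0):
    """Pr(D_c | g_c, g_r) with offspring allelic effects and a NIMA effect."""
    if child == 1:
        return beta * rho1          # one copy of the risk allele
    if child == 2:
        return beta * rho2          # two copies of the risk allele
    if mother == 1:
        # 1/1 offspring of a 1/2 mother: the non-inherited allele 2 acts.
        return beta * mu
    return beta
```

Setting `mu_m == mu_f` in `rhd_risk` recovers the model without gender effects, and `mu = 1` in `nima_risk` leaves only the offspring allelic effects.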

IMPLEMENTING THE EMFG TEST

The EMFG test is implemented in the Mendel software [Lange et al., 2001] for pedigrees of

variable size and structure. The current program handles the generalized risk model for a bi-

allelic locus (Table I, Column 3). Parameters under specific hypotheses are estimated by

placing restrictions on this model. RHD incompatibility without gender effects imposes δ00

= δ10 = δ11 = δ12 = δ21 = δ22 = 1 and estimates only δ01. NIMA sets δ00 = 1 and imposes δ01

= δ11 = δ21 and δ12 = δ22, accounting for the offspring allelic effects. Thus, there are three

free parameters, two for offspring allelic effects and one (δ10) accounting for the NIMA

effect. Nested models are compared using a likelihood ratio (LR) test. For example, to test

for a significant NIMA effect in the presence of an offspring allelic effect, one can compare

a full model with three free parameters (NIMA and offspring allelic effects) to a restricted

model with two free parameters (offspring allelic effects) using a one-degree-of-freedom LR

test.
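
A one-degree-of-freedom LR comparison of this kind can be sketched as follows, assuming the usual asymptotic chi-square reference distribution. The log-likelihood values passed in would come from the fitted full and restricted models; the function name and the example values are hypothetical.

```python
import math

def lr_test_1df(loglik_full, loglik_restricted):
    """Likelihood ratio statistic and p-value for nested models that
    differ by one free parameter (chi-square with 1 df assumed)."""
    stat = 2.0 * (loglik_full - loglik_restricted)
    stat = max(stat, 0.0)  # guard against small negative values from optimizer noise
    # For 1 df, P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical maximized log-likelihoods for the two models.
stat, p_value = lr_test_1df(-100.0, -101.9207294)
```

In this made-up example the statistic is about 3.84, which sits at the 0.05 critical value for one degree of freedom.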

SIMULATION STUDY: OBJECTIVES AND CONDITIONS

The simulations are designed for three purposes. In Simulation Study 1, we demonstrate the

statistical properties of the EMFG test under realistic effect sizes for RHD and NIMA

incompatibility. In Simulation Study 2, we compare three possible study designs using (a)

extended pedigrees in their entirety, (b) one nuclear family per extended pedigree [Palmer et

al., 2008], and (c) all nuclear families available from extended pedigrees and treated as

independent. We anticipate that design (a) will be superior to (b) in power and (c) in

accuracy. In Simulation Study 3, we determine the effects of violating the random mating

assumption on parameter estimation, power, and type-I error rate of the EMFG test.

In our simulation studies, we characterize the properties of the EMFG test in terms of its

rejection rates and estimation accuracy. Unless otherwise mentioned, founder genotypes are

simulated according to Hardy-Weinberg equilibrium (HWE) with allele frequencies P(1) =

0.33 and P(2) = 0.67. Non-founder genotypes are simulated using Mendel’s gene dropping

option [Lange et al., 2001]. Samples are simulated with no MFG effect and with detrimental

MFG effects ranging from 1.5 to 2.5 in 0.1 increments (results only shown for certain

values). We chose this range because of previous estimates of MFG effects [Insel et al.,

2005; Kraft et al., 2004; Palmer et al., 2008]. We refer to these simulation conditions as

scenarios. For each scenario, 1,000 data sets are simulated. Parameter estimates and their

standard errors are averaged over the data sets. The number of pedigrees within a data set

varies. For each model, rejection rate and coverage are shown. Coverage is the proportion of

95% confidence intervals that contain the MFG relative risk’s true value. The rejection rate

is the proportion of samples where the LR test rejects the null hypothesis of no MFG effect

at a significance level of 0.05. In all scenarios, two-sided tests are used. Since each

performance statistic p is a proportion estimated from 1,000 data sets, its standard error is √[p(1 − p)/1,000].
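
For a proportion estimated from 1,000 independent simulated data sets, the usual binomial standard error can be computed as in the sketch below. The function name is illustrative.

```python
import math

def proportion_se(p, n_datasets=1000):
    """Binomial standard error of a proportion p estimated from
    n_datasets independent simulated data sets."""
    return math.sqrt(p * (1.0 - p) / n_datasets)

se_nominal = proportion_se(0.05)   # SE of a rejection rate at the nominal level
```

At the nominal level of 0.05 the standard error is about 0.007, so an observed type-I error rate within roughly 0.014 of 0.05 is consistent with simulation noise.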

RESULTS

SIMULATION STUDY 1: PROPERTIES OF THE EMFG TEST

For Simulation Study 1, each of 1,000 simulated data sets contains 200 three-generation

pedigrees. Only individuals 7 and 8 are affected (shown with dark shading in Fig. 2). First, we use

the EMFG Test to estimate the MFG effect in simulated samples with no gender-specific

MFG effect (Table III, Scenarios A and B). When μ = 1, the type-I error rate of 0.040 is

close to the desired level of 0.05. The EMFG test has 0.816 power to detect an MFG effect

of 1.7 when there are no gender effects (Scenario B, model μm = μf). When these same data

sets are analyzed under a model that allows for gender differences (μm≠μf), power is reduced

to 0.724. In all scenarios, the parameter estimates are close to the true MFG effect. Founder

genotype frequencies and their standard errors are the same across models, and are equal to

their expected values under HWE (P̂(1/1) = 0.109, SE = 0.012; P̂(1/2) = 0.443, SE = 0.018;

P̂(2/2) = 0.448, SE = 0.018). Accurate results are also obtained for data simulated with

1.5≤μ≤2.5 (data not shown).
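
The expected founder genotype frequencies under HWE follow directly from the simulated allele frequencies. A minimal sketch, with an illustrative function name and genotype labels:

```python
def hwe_genotype_freqs(p1):
    """Genotype frequencies at a bi-allelic locus under Hardy-Weinberg
    equilibrium, given the frequency p1 of allele 1."""
    p2 = 1.0 - p1
    return {"1/1": p1 * p1, "1/2": 2.0 * p1 * p2, "2/2": p2 * p2}

expected = hwe_genotype_freqs(0.33)
```

For P(1) = 0.33 this gives approximately 0.109, 0.442, and 0.449, each within one reported standard error of the corresponding estimate above.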

Several studies found that MFG incompatibility effects are confined to a single gender [Insel

et al., 2005; Palmer et al., 2006, 2008]. To mimic this scenario, samples are simulated with

males at increased risk of disease but females at baseline risk (μf = 1.0). Under a correct
