Disparities in Defining Disparities: Statistical Conceptual Frameworks

Naihua Duan1,2,*, Xiao-Li Meng3, Julia Y. Lin4,5, Chih-nan Chen5, and Margarita Alegria4,5

1Department of Psychiatry, Columbia University and New York State Psychiatric Institute, New York, NY 10032, U.S.A.

2Department of Biostatistics, Columbia University, New York, NY 10032, U.S.A.

3Department of Statistics, Harvard University, Cambridge, MA 02138, U.S.A.

4Department of Psychiatry, Harvard Medical School, Boston, MA 02215, U.S.A.

5Center for Multicultural Mental Health Research, Somerville, MA 02143, U.S.A.

SUMMARY

Motivated by the need to meaningfully implement the Institute of Medicine’s (IOM’s) definition

of health care disparity, this paper proposes statistical frameworks that lay out explicitly the

needed causal assumptions for defining disparity measures. Our key emphasis is that a

scientifically defensible disparity measure must take into account the direction of the causal

relationship between allowable covariates that are not considered to be contributors to disparity

and non-allowable covariates that are considered to be contributors to disparity, to avoid flawed

disparity measures based on implausible populations that are not relevant for clinical or policy

decisions. However, these causal relationships are usually unknown and undetectable from

observed data. Consequently, we must make strong causal assumptions in order to proceed. Two

frameworks are proposed in this paper. One is the conditional disparity framework under the

assumption that allowable covariates impact non-allowable covariates but not vice versa. The

other is the marginal disparity framework under the assumption that non-allowable covariates

impact allowable ones but not vice versa. We establish theoretical conditions under which the two

disparity measures are the same, and present a theoretical example showing that the difference

between the two disparity measures can be arbitrarily large. Using data from the Collaborative

Psychiatric Epidemiology Survey, we also provide an example where the conditional disparity is

misled by Simpson’s paradox, while the marginal disparity approach handles it correctly.

Keywords

Counterfactual populations; Disparities; Potential outcomes; Weighting; Mental health; Simpson’s

paradox

1. CAUSALITY AND DISPARITY MEASURES

1.1 The Causal Implication of the IOM Definition

The Institute of Medicine (IOM) [1] defines health care disparities as “racial or ethnic

differences in the quality of health care that are not due to access-related factors or clinical

Copyright © 2007 John Wiley & Sons, Ltd.

*Correspondence to: Naihua Duan, Department of Biostatistics, Mailman School of Public Health, Columbia University, 722 West

168th Street, Room 636, New York, NY 10032, U.S.A. Email: naihua.duan@columbia.edu.

NIH Public Access

Author Manuscript

Stat Med. Author manuscript; available in PMC 2009 September 10.

Published in final edited form as:

Stat Med. 2008 September 10; 27(20): 3941–3956. doi:10.1002/sim.3283.

NIH-PA Author Manuscript


needs, preference, and appropriateness of intervention.” This definition represents an

important advance in disparity research, because it explicitly recognizes the role of causality

in the determination of disparities through its reference to the causal expression “not due to”.

However, it leaves open the interpretation of the causal model underlying this causal

statement. In this paper we identify several causal models under which the IOM definition

can be implemented meaningfully, and propose the corresponding frameworks for defining

and comparing statistically justifiable disparity measures following these models. Our work

can be viewed as a statistically oriented conceptualization of research in this area (e.g.,

[2,3,4,5,6]). Although our work was directly motivated by the IOM definition, the proposed

general frameworks are equally applicable to other areas, such as in legal settings (e.g.,

[7,8,9]).

The statistical frameworks proposed in this paper assume that the covariates of interest have

been classified into allowable and non-allowable categories. Allowable covariates are those considered to be justifiable causes of difference, and hence should be adjusted for before measuring disparity. The remaining covariates are classified as non-allowable.

It is important to note that the classification of allowable versus non-allowable covariates

can, and should, vary from study to study, depending on the particular purpose for the study.

For example, IOM’s classification of access-related factors as allowable is appropriate for

studying disparity at the level of patient-clinician encounter, with the focus being the

treatment delivered during the encounter, controlled for all historical factors that occurred

prior to the encounter. However, when studying health care disparity at the level of service

systems, it would be more appropriate to classify access-related factors as non-allowable,

thus holding the service systems accountable for failure to engage disadvantaged patients

into care. The statistical frameworks we establish in this paper apply to any such classification.

As a specific example for illustration, suppose that covariates that might be predictive of

health care are classified as follows:

• Clinical needs and preference are considered allowable. Differences in health care due to these covariates are not considered to be part of health care disparity.

• All other covariates, such as knowledge about health, state of residency, insurance coverage, and education (to name a few), are considered non-allowable. Differences in health care due to these covariates are considered to be health-care disparity.

Given such a classification, our goal then is to measure the disparity that is “not due to” the

allowable covariates.

A seemingly obvious, and hence very common, approach is to substitute the levels of

allowable covariates of, for example, Afro-Caribbean with those of their non-Latino white

counterparts, while leaving the levels of non-allowable covariates unchanged. This

procedure is often used in Analysis of Covariance models that adjust for allowable

covariates across racial/ethnic groups. However, this approach is sensible in general only if

the allowable covariates are statistically independent of the non-allowable covariates, a

condition that is unlikely to hold in practice. Without this independence condition, this direct

substitution may lead to an implausible population, such as a hypothetical population with

a high level of income (as a non-allowable covariate that remains unchanged) and a high level

of chronic diseases (as an allowable covariate that was substituted with the levels from the

reference population). As a result, the disparity estimates obtained from this procedure may

not be relevant for clinical, policy or other purposes, because they are based on a postulated

population that cannot be realized by policy changes or disparities interventions.


Not accounting properly for the causal relationships between allowable and non-allowable

covariates is especially problematic when the two sets of covariates are highly correlated in

the observed data, and both sets of variables are included in our outcome model. In such

cases, the allowable covariates might appear to be very weak for predicting the outcome in

the fitted model due to the well-known “collinearity” phenomenon. Consequently, replacing

a minority group’s allowable covariates by their counterparts in the non-Latino white group

in the fitted model may only produce trivial adjustment, even if in reality a substantial part

of the observed racial/ethnic difference is indeed due to the difference in the allowable

covariates. This could be either because of their direct impact on the outcome (which would

not be detected by the fitted regression model because of the strong collinearity) or on the

non-allowable covariates, or both. The frameworks proposed in this paper can help to

substantially reduce such serious misestimation of disparity because they explicitly take into

account the causal relationship between the allowable and non-allowable covariates. For

example, our approaches permit an adjustment in allowable covariates to cause substantial

adjustment in the non-allowable ones, which in turn may lead to substantial adjustment in

the predicted outcome, even if the allowable predictor appears to be very weak in the fitted

model for predicting the outcome.

1.2 Explicating the Underlying Causal Assumptions

In order to measure disparity meaningfully, such as to implement the IOM definition for

health care disparity, one must be explicit about the underlying causal assumptions that are

imbedded in any disparity measure. The fact that the exact causal mechanisms may not be

known or may not even be knowable is not a reason to “sweep everything under the rug”. On the contrary, this is precisely the reason for us to be explicit about our assumptions, so that it is

possible for policy makers and subsequent researchers to correctly interpret the disparity

measures/estimates we obtain, as well as to determine the directions for correction or

improvement when newer information becomes available for the underlying causal

relationships.

The key reason that we need to make causal assumptions is that once an action is forced

upon a particular variable (e.g., by changing a minority group’s distribution of clinical needs

to match that of the non-Latino white population), it will have a ripple effect—in real life—

on other variables (e.g., income level) that are impacted by the one adjusted. However, this

ripple effect is not estimable without carrying out the actual (social) experiment, because the

observed relationships in a natural population may or may not be preserved after an

intervention. As an illustrative example, in a natural population, a person’s left-eye visual

acuity (AV) may be highly correlated with the person’s right-eye AV. However, this

correlation will be destroyed or at least reduced if we perform a vision correction laser

surgery on the right eye only. The two AVs will become independent shortly after the

surgery, but may become correlated again over time, though the cross-sectional data from a

natural population would tell us little about how large this correlation could be or whether it

would ever reach the same level as in the natural population.

Therefore, in order to measure the disparity “not due to” the allowable covariates, we must

postulate causal directions, as well as how any relationships among relevant variables are

preserved or altered with the change from a natural population to a hypothetical one. There

are two extreme types of unidirectional causal relationships: (A) allowable covariates impact

non-allowable covariates but not vice versa; and (B) non-allowable covariates impact

allowable covariates but not vice versa. The more realistic relationships are likely to be

either (C) allowable covariates and non-allowable covariates are inter-related and

reciprocally impact each other, or (D), which is (C) plus the possibility that both allowable

and non-allowable covariates are also impacted by the outcome itself (over time).


While (C) and (D) are most dynamic and realistic, they do not permit useful modeling

without further specifications on how the variables involved impact each other. As these

specifications are content dependent and can be extremely difficult to postulate, we will

pursue them in future work. In this paper, we lay out the statistical frameworks for the

simpler causal relationships (A) and (B). These two frameworks serve as building blocks for

more complex causal specifications, and at the same time provide plausible specifications

that might yield useful bounds on the true disparity when more complicated causal

relationships are present. In some applications, such as the one presented in Section 3.2,

such simplistic causal assumptions are actually reasonable, leading to sensible practical

solutions.

2. STATISTICAL FRAMEWORKS

2.1 Linking Natural and Hypothetical Joint Distributions

Let XN denote non-allowable covariates such as knowledge about health, and let XA denote

allowable covariates such as clinical needs. Let Y denote the outcome of interest, such as log

of the health care expenditure. To measure the disparity, we want to adjust the levels of

allowable covariates (XA) but not the levels of non-allowable covariates (XN). Note here that

all variables are measured for each individual i, but we suppress the subscript i throughout

the text to simplify the notation. To describe the distribution of these variables, we use the

common generic notation P( ), e.g., P(XN). Whenever needed, we will use subscript 1 to

denote the reference group (e.g., the non-Latino whites) and 2 the group of interest (e.g., a

minority group), for example, P1(XN) and P2(XN).

The goal of our modeling is to estimate the potential outcome if the group of interest has the

same levels of allowable covariates as the reference group. The first step in setting up our

proposed frameworks is to explicitly consider the joint distribution of (Y, XA, XN), and

recognize that there are two joint distributions of interest: one for the natural population, and

one for the adjusted hypothetical population. We use the superscript (H) to denote different populations, e.g. $P^{(H)}(Y, X_N, X_A)$, where (H) can refer to either an adjustment rule for a hypothetical population (e.g. $P^{(A)}(Y, X_N, X_A)$, for adjustment rule (A)) or a natural (or non-adjusted) population (e.g. $P^{(N)}(Y, X_N, X_A)$). But for any (H), we always have the following decomposition:

$$P^{(H)}(Y, X_N, X_A) = P^{(H)}(Y \mid X_N, X_A)\, P^{(H)}(X_N, X_A). \tag{1}$$

The importance of recognizing the dependence on H here is that only the natural population, $P^{(N)}(Y, X_N, X_A)$, can be estimated from the data. Therefore, in order to calculate disparities under a hypothetical population, we need to make strong assumptions to link the hypothetical population, such as $P^{(A)}(Y, X_N, X_A)$, to the natural population $P^{(N)}(Y, X_N, X_A)$.

Our first assumption, which appears to be taken for granted in much of the existing literature, is that the “forced action” of the adjustment has no impact on the conditional distribution of Y given (XN, XA). That is, for any adjustment rule (A), we assume

$$P^{(A)}(Y \mid X_N, X_A) = P^{(N)}(Y \mid X_N, X_A). \tag{2}$$

We will refer to (2) as the “predictively nature preserving” (PNP) assumption, meaning that

the predictive nature of {XN, XA} on Y is preserved despite the “forced action” on XA.


One can easily consider a scenario under which the PNP assumption is false, but without

such an assumption, the estimation of the disparity is essentially impossible. For example, in

our hypothetical eye vision example, two people may have identical AVs for both eyes (e.g.,

both are 20/20 in the right eye but 20/40 in the left eye), but they can have quite different

probabilities of having automobile accidents if one of them was born with such vision, but

the other achieved it via laser surgery to his right eye. Clearly, if this occurs, then it is

impossible to estimate—using only the data collected from the natural population—the

accident rate for the group of people with vision corrections done to their right eyes only.

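To carry the decomposition (1) further, we can decompose the component $P^{(H)}(X_N, X_A)$ in (1) into one conditional and one marginal distribution. This time, there are two possibilities:

$$P^{(H)}(X_N, X_A) = P^{(H)}(X_N \mid X_A)\, P^{(H)}(X_A) \tag{3}$$

and

$$P^{(H)}(X_N, X_A) = P^{(H)}(X_A \mid X_N)\, P^{(H)}(X_N). \tag{4}$$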

The first decomposition is the basis for our conditional framework, which assumes that non-

allowable covariates XN are causally dependent on allowable covariates XA but not vice

versa. The second decomposition is suitable for the marginal causal framework, which

assumes that the allowable covariates XA are causally dependent on the non-allowable

covariates XN but not vice versa. Below we show how we can create different counterfactual

populations, a standard practice in causal inferences (e.g., see [10]), using these

decompositions.
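As a small numerical illustration of decompositions (3) and (4), the sketch below factors a joint distribution of two binary covariates both ways and checks that each product recovers the joint. The probabilities are hypothetical values chosen for illustration only, not data from this paper; rows index XN and columns index XA.

```python
import numpy as np

# Hypothetical joint pmf of (XN, XA); rows index XN, columns index XA.
joint = np.array([[0.40, 0.10],
                  [0.20, 0.30]])

p_xn = joint.sum(axis=1)                 # marginal P(XN)
p_xa = joint.sum(axis=0)                 # marginal P(XA)
p_xn_given_xa = joint / p_xa             # P(XN | XA): each column sums to 1
p_xa_given_xn = joint / p_xn[:, None]    # P(XA | XN): each row sums to 1

# Decomposition (3): conditional of XN given XA times marginal of XA.
joint_via_3 = p_xn_given_xa * p_xa
# Decomposition (4): conditional of XA given XN times marginal of XN.
joint_via_4 = p_xa_given_xn * p_xn[:, None]

print(np.allclose(joint_via_3, joint), np.allclose(joint_via_4, joint))
```

Both factorizations describe the same joint distribution; they differ only in which factor an adjustment rule is allowed to replace, which is where the causal assumption enters.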

2.2 Conditional Disparity

Under the conditional framework, we adjust the marginal distribution of the allowable

covariates XA from the natural population (such as Latinos) to the corresponding marginal

distribution of the reference group (such as non-Latino whites), while preserving the

conditional distribution for non-allowable covariates XN given allowable covariates XA as in

the natural population. Specifically, the hypothetical joint distribution is obtained by

replacing the marginal distribution of XA in the natural population

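$$P_2^{(N)}(Y, X_N, X_A) = P_2^{(N)}(Y \mid X_N, X_A)\, P_2^{(N)}(X_N \mid X_A)\, P_2^{(N)}(X_A) \tag{5}$$

by that of the reference population (e.g., non-Latino whites), and thereby creating the following hypothetical population distribution:

$$P_2^{(C)}(Y, X_N, X_A) = P_2^{(N)}(Y \mid X_N, X_A)\, P_2^{(N)}(X_N \mid X_A)\, P_1^{(N)}(X_A). \tag{6}$$

Although $P_1^{(N)}(X_A)$ is taken from the natural population of the reference group, its insertion into (5) leads to a hypothetical population that retains the natural conditional distributions $P_2^{(N)}(Y \mid X_N, X_A)$ and $P_2^{(N)}(X_N \mid X_A)$, with the component $P_2^{(N)}(X_A)$ “mutated” into $P_1^{(N)}(X_A)$. We denote this adjustment rule under the conditional disparity framework as adjustment (C).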


In order for (6) to be a meaningful hypothetical population, our assumptions are (i) the PNP

assumption holds, and (ii) the adjustment action has no impact on the conditional

distribution of XN given XA either, that is,

$$P_2^{(C)}(X_N \mid X_A) = P_2^{(N)}(X_N \mid X_A), \tag{7}$$

which is plausible when the causal direction is from XA to XN but not vice versa. We will

refer to (7) as the “conditionally nature preserving” (CNP) assumption, meaning that the

natural conditional distribution P2(XN|XA) is preserved after the adjustment on XA.

The ratio between the adjusted joint density (6) and the natural joint density (5) is simply the

ratio of the marginal densities

$$R_C(X_A) = \frac{P_1^{(N)}(X_A)}{P_2^{(N)}(X_A)}. \tag{8}$$

Following the principle of importance weighting, the expected outcome under the

hypothetical population (6) can be expressed as the following weighted expectation of Y

under the natural population (5), with the importance weight RC(XA):

$$E_2^{(C)}[Y] = E_2^{(N)}[Y\, R_C(X_A)], \tag{9}$$

where $E_2^{(C)}$ denotes the expectation with respect to the hypothetical population in (6), and $E_2^{(N)}$ denotes the expectation with respect to the natural population in (5).

Expression (9) gives us a practical way to estimate $E_2^{(C)}[Y]$ because its right hand side only involves expectations with respect to the natural population (5), which we can estimate from the sample data. Since the current paper focuses on setting up conceptual frameworks, the detailed estimation procedures, particularly for estimating RC(XA), will be presented in a subsequent paper.

Intuitively, the adjustment under our conditional framework amounts to weighting the level

of health care (Y) among minorities by the density ratio RC(XA). Minorities with higher

density ratio RC get weighted up because a value of RC(XA) > 1 tells us that their levels of XA are relatively more common among non-Latino whites than among minorities. The

corresponding disparity is then measured as the difference between the expected value of Y

for the adjusted (hypothetical) population (6) and that of the reference population:

$$D_C = E_2^{(C)}[Y] - E_1^{(N)}[Y]. \tag{10}$$

We term DC of (10) as conditional disparity because the main source of disparity is in the difference in the conditional distributions $P_1^{(N)}(Y, X_N \mid X_A)$ and $P_2^{(N)}(Y, X_N \mid X_A)$. The difference in $P_1^{(N)}(X_A)$ and $P_2^{(N)}(X_A)$ may also be of interest in its own right, an issue we shall not pursue here due to page limitation, but will briefly touch upon in Section 3.3.


Applying expression (9) to the definition (10), we have the following expression for

conditional disparity that can be estimated using sample data:

$$D_C = E_2^{(N)}[Y\, R_C(X_A)] - E_1^{(N)}[Y]. \tag{11}$$

Notice that this expression for conditional disparity does not involve the non-allowable

covariates, XN. This is possible because of the assumption that XN is caused by XA. Under

such an assumption, we can greatly simplify the estimation task since (11) bypasses the need

to model XN. The theoretical implication of this simplification will be discussed in Section 4.
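For discrete covariates, the importance-weighting calculation of the conditional disparity can be sketched as follows. This is a minimal illustration rather than the estimation procedure deferred to the subsequent paper: the function name `conditional_disparity` and all cell probabilities and cell means are hypothetical, group 1 is the reference group, and every XA level is assumed to have positive probability in group 2 so that the weight is defined.

```python
import numpy as np

def conditional_disparity(p1, mu1, p2, mu2):
    """D_C = E_2[Y * R_C(XA)] - E_1[Y] for discrete covariates.

    p_k[i, j]  = P_k(XN = i, XA = j) in the natural population of group k
    mu_k[i, j] = E_k[Y | XN = i, XA = j]
    """
    p1, mu1, p2, mu2 = map(np.asarray, (p1, mu1, p2, mu2))
    r_c = p1.sum(axis=0) / p2.sum(axis=0)   # R_C(XA) = P_1(XA) / P_2(XA)
    e2_adjusted = np.sum(p2 * mu2 * r_c)    # weight each cell by R_C of its XA level
    return e2_adjusted - np.sum(p1 * mu1)

# Hypothetical 2 x 2 inputs (rows: XN = 0, 1; columns: XA = 0, 1).
p1 = np.array([[0.5, 0.3], [0.1, 0.1]])
mu1 = np.array([[0.10, 0.30], [0.05, 0.20]])
p2 = np.array([[0.1, 0.2], [0.4, 0.3]])
mu2 = np.array([[0.05, 0.20], [0.02, 0.10]])

d_c = conditional_disparity(p1, mu1, p2, mu2)

# The same quantity computed without modeling XN: only E_2[Y | XA] is needed.
e2_y_given_xa = (p2 * mu2).sum(axis=0) / p2.sum(axis=0)
d_c_without_xn = np.sum(p1.sum(axis=0) * e2_y_given_xa) - np.sum(p1 * mu1)

print(f"D_C = {d_c:.4f}, without XN = {d_c_without_xn:.4f}")
```

The second calculation collapses over XN before weighting and gives the same value, which is the simplification noted above: under this framework the non-allowable covariates never need to be modeled.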

2.3 Marginal Disparity

In contrast to conditional disparity, which equates the two marginal distributions of XA, the

marginal disparity framework replaces the conditional distribution of XA, conditioning on

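XN, of the population of interest (e.g., Latinos) by that of the reference population (e.g., non-Latino whites). Specifically, we replace the conditional distribution $P_2^{(N)}(X_A \mid X_N)$ in the natural population

$$P_2^{(N)}(Y, X_N, X_A) = P_2^{(N)}(Y \mid X_N, X_A)\, P_2^{(N)}(X_A \mid X_N)\, P_2^{(N)}(X_N) \tag{12}$$

by that of the reference population to create the following hypothetical population

$$P_2^{(M)}(Y, X_N, X_A) = P_2^{(N)}(Y \mid X_N, X_A)\, P_1^{(N)}(X_A \mid X_N)\, P_2^{(N)}(X_N). \tag{13}$$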

We denote this adjustment rule under the marginal disparity framework as adjustment (M).

Similar to the conditional disparity framework, in order for (13) to be a meaningful

hypothetical population, we have assumed that (i) the PNP assumption holds, and (ii) the

adjustment action has no impact on the marginal distribution of XN either; that is,

$$P_2^{(M)}(X_N) = P_2^{(N)}(X_N), \tag{14}$$

which is plausible when the causal direction is from XN to XA but not vice versa. We will

refer to (14) as the “marginally nature preserving” (MNP) assumption, meaning that the

marginal distribution P2(XN) is preserved after the adjustment on XA.

Similar to (8), the ratio between the joint densities (13) and (12) is given by the ratio

between the two conditional densities

$$R_M(X_A; X_N) = \frac{P_1^{(N)}(X_A \mid X_N)}{P_2^{(N)}(X_A \mid X_N)}. \tag{15}$$

Again, the ratio (15) can be used as the importance weight to express:

$$E_2^{(M)}[Y] = E_2^{(N)}[Y\, R_M(X_A; X_N)], \tag{16}$$

where $E_2^{(M)}$ denotes expectation under the hypothetical population (13), and $E_2^{(N)}$ denotes expectation under the natural population (12). Note that the right hand side of (16) can be estimated from sample data obtained in the natural population (12).

It is useful to visualize the adjustment under our marginal framework as first stratifying the

minority population by the level of the non-allowable covariates (e.g., knowledge of health).

We then apply the same weighting scheme as in the conditional disparity approach, but now within each stratum; the weight, previously the ratio of marginal densities RC(XA), is replaced by the ratio of the corresponding conditional densities RM(XA; XN). Minorities within a particular stratum, as defined by their values of XN, with a higher conditional density ratio RM get weighted up, because their levels of XA are relatively more common among non-Latino whites than among minorities in the same stratum.

The marginal disparity measure then is defined as the difference between the expected value of Y for the adjusted (hypothetical) population (13) and that of the reference population:

$$D_M = E_2^{(M)}[Y] - E_1^{(N)}[Y]. \tag{17}$$

We term DM as marginal disparity because the main source of the disparity is in the difference in the marginal distributions $P_1^{(N)}(X_N)$ and $P_2^{(N)}(X_N)$, in addition to any difference in $P_1^{(N)}(Y \mid X_N, X_A)$ and $P_2^{(N)}(Y \mid X_N, X_A)$. Again, applying expression (16) to the definition (17), we have the following expression for marginal disparity that can be estimated using sample data:

$$D_M = E_2^{(N)}[Y\, R_M(X_A; X_N)] - E_1^{(N)}[Y]. \tag{18}$$

The estimation of RM(XA; XN) is more complicated than estimating RC(XA) due to the

higher dimensionality. Again, these technical details will be addressed in a subsequent

paper.
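The stratified weighting behind the marginal disparity can be sketched in the same discrete setting. Again this is only an illustration under stated assumptions: the function name `marginal_disparity` and all numbers are hypothetical, group 1 is the reference group, and every (XN, XA) cell is assumed to have positive probability in group 2 so that the conditional density ratio is defined.

```python
import numpy as np

def marginal_disparity(p1, mu1, p2, mu2):
    """D_M = E_2[Y * R_M(XA; XN)] - E_1[Y] for discrete covariates.

    p_k[i, j]  = P_k(XN = i, XA = j) in the natural population of group k
    mu_k[i, j] = E_k[Y | XN = i, XA = j]
    """
    p1, mu1, p2, mu2 = map(np.asarray, (p1, mu1, p2, mu2))
    p1_xa_given_xn = p1 / p1.sum(axis=1, keepdims=True)  # P_1(XA | XN), rows sum to 1
    p2_xa_given_xn = p2 / p2.sum(axis=1, keepdims=True)  # P_2(XA | XN), rows sum to 1
    r_m = p1_xa_given_xn / p2_xa_given_xn                # R_M(XA; XN), one weight per cell
    e2_adjusted = np.sum(p2 * mu2 * r_m)
    return e2_adjusted - np.sum(p1 * mu1)

# Hypothetical 2 x 2 inputs (rows: XN = 0, 1; columns: XA = 0, 1).
p1 = np.array([[0.5, 0.3], [0.1, 0.1]])
mu1 = np.array([[0.10, 0.30], [0.05, 0.20]])
p2 = np.array([[0.1, 0.2], [0.4, 0.3]])
mu2 = np.array([[0.05, 0.20], [0.02, 0.10]])

d_m = marginal_disparity(p1, mu1, p2, mu2)
print(f"D_M = {d_m:.6f}")
```

Unlike the conditional case, the weight here varies with both XN and XA, so the full cross-classification is needed; this is the higher dimensionality referred to above.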

3. COMPARING CONDITIONAL AND MARGINAL DISPARITIES

With the two frameworks given above, a natural question is when do they give the same

disparity estimates, or more profoundly, do they give different values that would matter in

practice? The answer to the first part is a clean-cut theoretical result we present below. The

answer to the second part is obviously “it depends” because it depends critically on the

nature of the dependence structure between XA and XN, as well as the dependence of Y on

(XA, XN), in particular applications. We will illustrate this with two examples, one of which

shows the difference between getting it right or wrong, and the other gives a class of cases

where the difference can be made arbitrarily large. For the rest of this paper, we suppress the

superscript (N) as a notation for the natural population, whenever the context is clear.

3.1 A Theoretical Result Related to Local Dependence Function

The difference between DC and DM can be expressed as

$$\Delta D = D_C - D_M = E_2\big[Y\,\{R_C(X_A) - R_M(X_A; X_N)\}\big]. \tag{19}$$


The two disparity measures will be identical, ΔD = 0, if

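$$R_C(X_A) = R_M(X_A; X_N). \tag{20}$$

This condition is equivalent to the condition that

$$G_1(X_N, X_A) = G_2(X_N, X_A), \quad \text{where } G_k(X_N, X_A) = \frac{P_k(X_N, X_A)}{P_k(X_N)\, P_k(X_A)}, \quad k = 1, 2. \tag{21}$$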

Here the G function can be viewed as a measure of the dependence structure between XN

and XA, and therefore condition (21) says that as long as the dependence structure is the

same for the two groups (e.g., it remains the same across the two racial/ethnic groups), the

two disparity measures would be identical. As a special case, if XN and XA are independent

under both populations, then the two measures are the same because both G1 and G2 are

then identical to 1.
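The equivalence between conditions (20) and (21) follows from a short rearrangement, sketched here under the assumptions that all densities involved are strictly positive and that $G_k$ denotes the ratio of the joint density of (XN, XA) in group k to the product of its two marginals (the reading consistent with G being identically 1 under independence):

```latex
\begin{align*}
R_C(X_A) = R_M(X_A; X_N)
  &\iff \frac{P_1(X_A)}{P_2(X_A)} = \frac{P_1(X_A \mid X_N)}{P_2(X_A \mid X_N)} \\
  &\iff \frac{P_1(X_A \mid X_N)}{P_1(X_A)} = \frac{P_2(X_A \mid X_N)}{P_2(X_A)} \\
  &\iff \frac{P_1(X_N, X_A)}{P_1(X_N)\,P_1(X_A)} = \frac{P_2(X_N, X_A)}{P_2(X_N)\,P_2(X_A)} \\
  &\iff G_1(X_N, X_A) = G_2(X_N, X_A).
\end{align*}
```

The third line multiplies the numerator and denominator on each side by the corresponding marginal of XN, which turns each conditional density into a joint density.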

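For continuous variables, the notion that G is a measure of dependence structure can also be examined through the local dependence function (LDF), as defined in [11] and studied in [12] and [13],

$$\gamma_k(x_N, x_A) = \frac{\partial^2}{\partial x_N\, \partial x_A} \log p_k(x_N, x_A), \quad k = 1, 2, \tag{22}$$

where $p_k$ denotes the joint density of (XN, XA) in group k. Because

$$\gamma_k(x_N, x_A) = \frac{\partial^2}{\partial x_N\, \partial x_A} \log G_k(x_N, x_A), \tag{23}$$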

it is obvious that condition (21) implies that the LDF is independent of the group index, i.e.,

the LDF does not change with the racial/ethnic group. Note however that the reverse is not

necessarily true; that is, we can have LDF invariant to group index, but the condition (21)

does not hold. In this sense, the measure of dependence by the G function is more stringent

than that by the LDF.
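The reason the G function is more stringent can be seen from a brief calculation, sketched here with $p_k$ assumed to denote the (positive, twice differentiable) joint density of (XN, XA) in group k and $G_k$ the ratio of that joint density to the product of its marginals:

```latex
\begin{align*}
\log G_k(x_N, x_A) &= \log p_k(x_N, x_A) - \log p_k(x_N) - \log p_k(x_A), \\
\frac{\partial^2}{\partial x_N\,\partial x_A} \log G_k(x_N, x_A)
  &= \frac{\partial^2}{\partial x_N\,\partial x_A} \log p_k(x_N, x_A).
\end{align*}
```

Each marginal term depends on only one argument, so it vanishes under the mixed derivative. Equal LDFs for the two groups therefore only determine $\log G_1 - \log G_2$ up to an additive term of the form $a(x_N) + b(x_A)$, which need not be zero.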

Finally we note that condition (21) is sufficient but not necessary for ΔD = 0. A simple

example is that ΔD = 0 when the regression of Y on XA and XN, that is, E2[Y|XN, XA], is free of both XN and XA (note that this is a weaker requirement than the independence between

Y and (XN, XA) since only the conditional mean of Y is involved). This, of course, does not

happen when XN and/or XA are useful predictors of Y. But it reminds us that the difference

between DC and DM also depends on the relationship between Y and (XN, XA), and the

difference will be small when both XN and XA are weak predictors.
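These statements can be checked numerically in a discrete setting. The sketch below computes ΔD as the difference between the two weighted expectations under the natural population of group 2 (the reference-group mean cancels); the function name `delta_d` and all inputs are hypothetical and serve only as an illustration.

```python
import numpy as np

def delta_d(p1, p2, mu2):
    """Delta D = D_C - D_M = E_2[Y * (R_C(XA) - R_M(XA; XN))].

    p_k[i, j] = P_k(XN = i, XA = j);  mu2[i, j] = E_2[Y | XN = i, XA = j].
    """
    r_c = p1.sum(axis=0) / p2.sum(axis=0)
    r_m = (p1 / p1.sum(axis=1, keepdims=True)) / (p2 / p2.sum(axis=1, keepdims=True))
    return np.sum(p2 * mu2 * (r_c - r_m))

mu2 = np.array([[0.05, 0.20], [0.02, 0.10]])

# Case A: XN and XA independent in both groups, so G_1 = G_2 = 1.
p1_ind = np.outer([0.8, 0.2], [0.6, 0.4])
p2_ind = np.outer([0.3, 0.7], [0.5, 0.5])
delta_independent = delta_d(p1_ind, p2_ind, mu2)

# Cases B and C: dependent covariates with different dependence structures.
p1 = np.array([[0.5, 0.3], [0.1, 0.1]])
p2 = np.array([[0.1, 0.2], [0.4, 0.3]])
delta_flat_regression = delta_d(p1, p2, np.full((2, 2), 0.1))  # E_2[Y | XN, XA] constant
delta_general = delta_d(p1, p2, mu2)

print(delta_independent, delta_flat_regression, delta_general)
```

Cases A and B both give a difference of zero up to rounding, for different reasons: identical dependence structure in A, and a regression that is free of both covariates in B. Case C, where neither holds, gives a nonzero difference.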

We emphasize here that the statement we just made is true only when both XN and XA are

weak predictors. If one is weak but the other is not, the difference between the two measures

can still be very large if there is high correlation between XN and XA. Indeed, the appearance

of “one-weak and one-strong” scenario is quite common in practice when the two predictors

are highly correlated because of the well-known “collinearity” problem among the

predictors. And it is precisely in such cases that the recognition of the impact of the


allowable covariates on the non-allowable ones, or vice versa, is of critical importance. As mentioned in Section 1, the common approach of adjusting only the allowable covariates without considering their impact on the non-allowable covariates can lead to serious misestimation of the disparity when the allowable covariates appear to be weak predictors in the regression of Y on XN and XA.

3.2 A Discrete-Distribution Example

We start with a simple 2 × 2 × 2 contingency table example to illustrate both the basic calculations for DC and DM and their differences. We use data from the combined data set of three large epidemiological studies, collectively known as the NIMH Collaborative Psychiatric Epidemiology Survey (CPES): the National Latino and Asian American Study (NLAAS) [14], the National Comorbidity Survey Replication (NCS-R) [15], and the National Survey of American Life (NSAL) [16]. These studies focus on collecting epidemiological information

on mental health and substance disorders and services utilization among the general

population with special emphasis on ethnic minority groups in the NLAAS (Latinos and

Asians) and NSAL (African Americans and Afro-Caribbean) with non-Latino white

comparisons from the NCS-R. The studies were designed to allow integration as though they

were a single, nationally-representative study [17]. The combined data set is the largest

epidemiological data set available for examining the patterns and correlates of mental health

services use in minority populations in the United States. The sampling frames and sample

selection procedures are described in detail elsewhere [18]. For illustration purposes, here

we treat this combined data set as a population by itself, and therefore all the numbers below

are regarded as population quantities (e.g., probabilities), instead of sample estimates (e.g.,

sample proportions).

For simplicity, we focus on a dichotomous outcome, namely, Y = 1 means the respondent

had at least one visit to any mental health service provider (either specialist or generalist) in

the past year, and Y = 0 otherwise. The allowable covariate is also a binary variable

indicating clinical need: XA = 1 if there was a need, and XA = 0 if there was not. The non-

allowable covariate is a binary variable indicating nativity: XN = 1 if the respondent is an

immigrant, and XN = 0 if the respondent was born in United States.

Table I provides the data for the non-Latino white population, from which we can easily

calculate the service use rate for this population. In Table I, there are two numbers in each of

the cells in the 2 × 2 layout. The top number is the percentage of individuals who fall into

the (i, j)-cell defined by the values of (XN = i, XA = j), and the bottom bracketed number μij

is the percentage of people in that cell who have used services, that is, μij = P(Y = 1|XN = i,

XA = j). Consequently, the overall service rate for the non-Latino white population, namely

E1[Y] is obtained by multiplying the two numbers in each cell, and adding them up across all

cells. This leads to E1[Y] = 14.39%. Similarly, for the Afro-Caribbean population (Table II),

E2[Y] = 6.75%, so the observed racial/ethnic difference is

$$RD = E_2[Y] - E_1[Y] = 6.75\% - 14.39\% = -7.64\%. \tag{24}$$

This, however, is not necessarily the disparity in the sense of the IOM definition because it

has not adjusted for the difference in clinical needs.

Comparing Table I and Table II, we observe an interesting phenomenon. The percentages of

people in need are greater in the Afro-Caribbean population than in the non-Latino white population when conditioning on nativity: 55.75% versus 41.62% for the US-born population and 33.90% versus 30.91% for the immigrant population. The pattern, however,

is reversed for the marginal rates, that is, when we combine the US born and the immigrants


together: 41.18% for Afro-Caribbean versus 41.28% for non-Latino whites. Although the difference between these two marginal rates is minimal (there is no estimation error here, since we are treating the data as the entire population), it is nevertheless an example of the well-known Simpson’s paradox [19]. The reason is the extreme imbalance of

the nativity groups in the two populations: more than 95% of the non-Latino whites were US

born, but only 1/3 of the Afro-Caribbean were US born.
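The imbalance can be checked directly from the rates quoted above. The following is a minimal sketch, not a reproduction of Tables I and II: it backs out the US-born share implied by the conditional and marginal need rates using the law of total probability. The function name `implied_us_born_share` is ours, and because the quoted percentages are rounded, the recovered shares are approximate.

```python
def implied_us_born_share(need_us_born, need_immigrant, need_overall):
    """Solve need_overall = w * need_us_born + (1 - w) * need_immigrant for w."""
    return (need_overall - need_immigrant) / (need_us_born - need_immigrant)

# Need rates (percent) quoted in the text; rounded, so the results are approximate.
white = implied_us_born_share(41.62, 30.91, 41.28)
afro_caribbean = implied_us_born_share(55.75, 33.90, 41.18)

print(f"Implied US-born share, non-Latino whites: {white:.3f}")
print(f"Implied US-born share, Afro-Caribbeans:   {afro_caribbean:.3f}")
```

The two implied shares are consistent with the figures stated above (more than 95% and about one third). It is this difference in weights that allows the ordering within each nativity group to reverse in the aggregate.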

The implication of this phenomenon for our disparity measure is clear. First, given that the

difference in the marginal rates is so small, 41.18% versus 41.28%, one would expect that the
conditional disparity, which results from adjusting the Afro-Caribbeans’ marginal rate from
41.18% to the non-Latino whites’ marginal rate of 41.28%, will have a minimal impact on the

value of RD of (24). Indeed, as shown below, the conditional disparity in this case is DC =

−7.62%, nearly identical to RD = −7.64%.

Second, this adjustment is in fact in the wrong direction, because in this case the causal
assumption underlying the conditional disparity, namely that the allowable covariate (clinical
need) causes the non-allowable covariate (nativity), is clearly implausible. The marginal
disparity approach is much more sensible, because it adjusts for clinical

needs within each nativity category. Given the fact that the two nativity groups have very

different levels of clinical needs, with the US Born having more needs, it is intuitive that we

should make the adjustment after stratifying by nativity groups. Because the Afro-Caribbean

population has more needs in each of the nativity groups, it is also intuitive that had their

needs been the same as the non-Latino whites, the observed racial/ethnic difference would

be even larger. Indeed, as shown below, the marginal disparity in this case is DM = −8.84%.

In contrast to DC, which points in the wrong direction, DM shows that the disparity is

actually more pronounced than the unadjusted racial/ethnic difference by about (8.84 −

7.64)/7.64 ≈ 16%.

3.3. Disparity Calculations

The calculations of DC and DM can be best illustrated by creating two adjusted versions of

Table II, corresponding respectively to the two hypothetical populations as defined in (6)

and (13). They are given in Table III and Table IV respectively. To construct Table III,

which is for the conditional disparity, we need to compute the density ratio RC of (8). From

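the last row of Table I and Table II respectively, we can obtain this easily as RC(0) = 58.72%/58.82% ≈ 0.998 and RC(1) = 41.28%/41.18% ≈ 1.002, the ratios of the non-Latino white to the Afro-Caribbean marginal proportions of XA.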

We can then multiply each of the three un-bracketed proportions in the “No (0)” column of

Table II by RC(0), and multiply each of the three un-bracketed proportions in the “Yes (1)”

column of Table II by RC(1). This will yield the adjusted population corresponding to the

conditional disparity approach, as given in Table III, where the last column P(XA = 1|XN) has

also been changed using the adjusted cell probabilities. We see that Table III and Table I

have the same marginal distribution for XA (rounding errors notwithstanding), as intended.

The expected value of Y under this adjusted population can be easily obtained by
multiplying each cell probability in Table III by the corresponding μij from Table II and
then summing the products. This leads to an adjusted service use rate of 6.77%, and hence
DC = 6.77% − 14.39% = −7.62%.

To calculate the marginal disparity, we need first to compute the RM function of (15), which

is determined by the right most columns labeled “P(XA = 1|XN)” in Table I and Table II.

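Specifically, we have RM(i, j) = P1(XA = j|XN = i)/P2(XA = j|XN = i), so that RM(0, 0) = 58.38%/44.25% ≈ 1.319, RM(0, 1) = 41.62%/55.75% ≈ 0.747, RM(1, 0) = 69.09%/66.10% ≈ 1.045, and RM(1, 1) = 30.91%/33.90% ≈ 0.912.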

Table IV is then obtained by multiplying the (i, j)-cell proportion (the top un-bracketed
percentage) in Table II by the RM(i, j) just obtained, for i, j = 0, 1, and then computing the
corresponding P(XA = 1|XN) and the adjusted marginal distribution of XA accordingly. We note
that the resulting adjusted conditional distribution of XA given XN is the same as that from
Table I (rounding errors notwithstanding), as it should be, but the adjusted marginal
distribution of XA is now markedly different from that of the non-Latino whites, P1(XA). This
difference reflects the difference between the two approaches, because with the conditional
disparity approach the adjusted marginal distribution of XA equals P1(XA) by construction. As
we discussed previously, the seemingly natural “equating-the-need-level” approach is actually
misleading in this application because of Simpson’s paradox. Equating the need level after
stratifying on nativity is a much more sensible approach.

To find the expectation of Y under this adjusted Afro-Caribbean population, we multiply the

four cell percentages in Table IV respectively by the four μij values in Table II and then sum

them up. This yields an adjusted service use rate of 5.55%. Consequently, the marginal
disparity, which in this example can be regarded as a sensible measure of disparity, is given by
DM = 5.55% − 14.39% = −8.84%.
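The two reweighting recipes described in this section can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are ours, and the cell probabilities and service rates below are SYNTHETIC numbers chosen to mimic the qualitative pattern of the example; they are not the entries of Tables I and II.

```python
def expected_rate(cells, mu):
    """Sum of cell probability times cell service rate over the 2 x 2 layout."""
    return sum(cells[i][j] * mu[i][j] for i in range(2) for j in range(2))

def conditional_adjust(p2, p1):
    """Reweight p2 so its XA marginal matches p1: P2(XN | XA) * P1(XA)."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        ratio = (p1[0][j] + p1[1][j]) / (p2[0][j] + p2[1][j])  # plays the role of RC(j)
        for i in range(2):
            out[i][j] = p2[i][j] * ratio
    return out

def marginal_adjust(p2, p1):
    """Reweight p2 so XA given XN matches p1: P1(XA | XN) * P2(XN)."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        row1, row2 = sum(p1[i]), sum(p2[i])
        for j in range(2):
            ratio = (p1[i][j] / row1) / (p2[i][j] / row2)  # plays the role of RM(i, j)
            out[i][j] = p2[i][j] * ratio
    return out

# SYNTHETIC inputs for illustration only; NOT the values in Tables I and II.
# Index [i][j] means XN = i (0 = US born, 1 = immigrant), XA = j (0 = no need, 1 = need).
p1 = [[0.56, 0.40], [0.03, 0.01]]   # reference population cell probabilities
p2 = [[0.15, 0.18], [0.45, 0.22]]   # minority population cell probabilities
mu1 = [[0.05, 0.30], [0.04, 0.20]]  # reference service rates per cell
mu2 = [[0.02, 0.15], [0.01, 0.12]]  # minority service rates per cell

e1 = expected_rate(p1, mu1)
rd = expected_rate(p2, mu2) - e1
dc = expected_rate(conditional_adjust(p2, p1), mu2) - e1
dm = expected_rate(marginal_adjust(p2, p1), mu2) - e1
print(f"RD = {rd:.4f}, DC = {dc:.4f}, DM = {dm:.4f}")
```

In both adjustments the minority service rates are kept and only the covariate distribution is reweighted, which is why the same `mu2` appears in the expressions for `dc` and `dm`.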

3.4 A Continuous-Distribution Example

This theoretical example establishes the mathematical fact that the difference between the
conditional disparity and the marginal disparity can be arbitrarily large. It also illustrates another
form of Simpson’s paradox, that is, even when there is no disparity in any stratum defined

by the non-allowable variables XN, in the aggregated population one can still observe a

disparity due to the correlation between XN and race/ethnicity in the aggregated population

and the fact that XN is classified as non-allowable.

To see this, let us consider a simple linear regression case

(25)

where k = 1 indexes the non-Latino white population and k = 2 the minority population. To

simplify the algebra, suppose that in the natural populations (XN, XA) is bivariate normal, with
population-specific means, unit variances and correlation ρ(k). That is

(26)

Under this setting, for the conditional disparity, the hypothetical joint distribution for
(XN, XA) is given by P2(XN|XA)P1(XA), which is bivariate normal with the following
means and covariance matrix:

(27)

In contrast, under the marginal disparity approach, the hypothetical joint distribution for

(XN, XA) is given by P1(XA|XN)P2(XN), which is also bivariate normal but with the following

means and covariance matrix:

(28)

Simple algebra then yields that the difference between the two measures is

(29)

From (29), we have the following observations, two of which are special cases of what we

have discussed in general in Section 3.1. Specifically, we see that ΔD = 0 whenever one of

the following three conditions holds:

a. ρ(1) = ρ(2) = 0; that is, when XN and XA are independent in both populations;

b. the coefficients of XN and XA in (25) are both zero for k = 2; that is, when the regression (25) does not depend on either XN or XA in the population of interest (not necessarily in the reference population);

c. the mean of XN and the mean of XA are each the same for k = 1 and k = 2; that is, when the two populations have the same marginal distributions for both XN and XA.

Of course ΔD can be zero by many other (incidental) combinations of the parameter values,

but the above three are most useful for theoretical insights. Note in particular that

conditions (a) and (b) are applicable in general, but condition (c) only works when the

regression of Y is linear in both XN and XA. We emphasize that since the parameters in (29)

have no restrictions other than |ρ(k)| ≤ 1, ΔD can be arbitrarily large, including approaching

infinity.
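The algebra behind these observations can be sketched under an assumed parameterization. The symbols α, β and μ, and the sign convention ΔD = DC − DM, are our assumptions for illustration and need not match the notation of (25) to (29).

```latex
\begin{align*}
E_k[Y \mid X_N, X_A] &= \alpha^{(k)} + \beta_N^{(k)} X_N + \beta_A^{(k)} X_A,
\qquad (X_N, X_A) \sim N\!\left(
\begin{pmatrix}\mu_N^{(k)} \\ \mu_A^{(k)}\end{pmatrix},
\begin{pmatrix}1 & \rho^{(k)} \\ \rho^{(k)} & 1\end{pmatrix}\right) \\
\text{Conditional, } P_2(X_N \mid X_A)\,P_1(X_A): \quad
E[X_A] &= \mu_A^{(1)}, \qquad
E[X_N] = \mu_N^{(2)} + \rho^{(2)}\bigl(\mu_A^{(1)} - \mu_A^{(2)}\bigr) \\
\text{Marginal, } P_1(X_A \mid X_N)\,P_2(X_N): \quad
E[X_N] &= \mu_N^{(2)}, \qquad
E[X_A] = \mu_A^{(1)} + \rho^{(1)}\bigl(\mu_N^{(2)} - \mu_N^{(1)}\bigr) \\
\Delta D = D_C - D_M &=
\beta_N^{(2)} \rho^{(2)} \bigl(\mu_A^{(1)} - \mu_A^{(2)}\bigr)
- \beta_A^{(2)} \rho^{(1)} \bigl(\mu_N^{(2)} - \mu_N^{(1)}\bigr)
\end{align*}
```

Under this parameterization each of conditions (a) to (c) sets both terms to zero, and because the slopes and means are unrestricted, the expression is unbounded.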

We also remark on a special case of interest, that is, when Ek[Y|XN, XA] of (25) is free of both k
(i.e., the race/ethnicity index) and XN (i.e., the regression depends on XA only, with
coefficients common to both populations). In such cases, there is no racial/ethnic
disparity under the conditional disparity model, since XA is being adjusted to have the same
distribution for both racial/ethnic groups and (11) does not involve XN. Under the marginal
disparity model, however, the matter is more complicated. Although XN does not impact Y
directly, it impacts XA when it is correlated with XA. Consequently, the difference in the
marginal distributions of XN in the two racial/ethnic groups will result in differences in the
marginal distribution of XA even when, or rather especially when, the conditional
distribution of XA given XN is adjusted to be invariant to the race/ethnicity index k. It follows
then that there will be racial/ethnic disparity due to the indirect impact of XN on Y via XA.
Indeed, it is easy to verify for the current example that the marginal disparity is given by

(30)

This is zero only when (i) ρ(1) = 0 and hence XA and XN are independent in the reference
population so XN cannot impact XA in the hypothetical population, or (ii) the coefficient of
XA in (25) is zero and hence the impact of XN on XA does not translate into any impact on Y
in the hypothetical population, or (iii) the mean of XN is the same in both populations and
hence the distribution of XN is actually invariant to race/ethnicity.

Perhaps most important here is to notice Simpson’s paradox again. Although in the
aggregated population there is a marginal disparity for the case above, clearly there is no
disparity in any subpopulation defined by a particular value of XN, that is, when we
condition on XN, because the conditional distribution P2(XA|XN) has been adjusted to be
the same as P1(XA|XN). This of course is not paradoxical, just as Simpson’s paradox is not a

real paradox in the mathematical sense. Once we classify XN as a non-allowable variable,

then logically we have to accept any difference caused by it as a part of the overall disparity,

regardless of whether the difference comes from its direct impact or indirect impact on the

outcome Y. Of course, one may argue whether the indirect part really should be viewed as

disparity, which is not an easy issue to address, as one is then implying that XN is both a
non-allowable variable (for the direct impact) and an allowable variable (for the indirect impact via
XA). We shall pursue this complex issue in subsequent work.
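The special case discussed above can be checked numerically from the closed-form means of the two hypothetical populations. This is a sketch under assumptions: the dictionary keys (`alpha`, `beta_n`, `beta_a`, `mu_n`, `mu_a`, `rho`) are our own labels for the intercept, slopes, means and correlation, and the parameter values are hypothetical.

```python
def adjusted_means(mu_n1, mu_a1, rho1, mu_n2, mu_a2, rho2):
    """Means of (XN, XA) in the two hypothetical populations (unit variances)."""
    # Conditional approach, P2(XN | XA) P1(XA): XA takes the reference mean,
    # XN follows the minority regression of XN on XA.
    cond = (mu_n2 + rho2 * (mu_a1 - mu_a2), mu_a1)
    # Marginal approach, P1(XA | XN) P2(XN): XN keeps the minority mean,
    # XA follows the reference regression of XA on XN.
    marg = (mu_n2, mu_a1 + rho1 * (mu_n2 - mu_n1))
    return cond, marg

def disparities(ref, minority):
    """Return (DC, DM) for linear regressions; inputs are illustrative dicts."""
    cond, marg = adjusted_means(ref["mu_n"], ref["mu_a"], ref["rho"],
                                minority["mu_n"], minority["mu_a"], minority["rho"])

    def mean_y(pop, means):
        return pop["alpha"] + pop["beta_n"] * means[0] + pop["beta_a"] * means[1]

    e1 = mean_y(ref, (ref["mu_n"], ref["mu_a"]))
    return mean_y(minority, cond) - e1, mean_y(minority, marg) - e1

# Hypothetical values: the regression is free of k and of XN (beta_n = 0).
ref = dict(alpha=1.0, beta_n=0.0, beta_a=2.0, mu_n=0.0, mu_a=0.0, rho=0.5)
minority = dict(alpha=1.0, beta_n=0.0, beta_a=2.0, mu_n=3.0, mu_a=1.0, rho=0.2)

dc, dm = disparities(ref, minority)
print(f"DC = {dc}, DM = {dm}")
```

With these illustrative values the conditional disparity vanishes while the marginal disparity equals the XA slope times the reference correlation times the gap in XN means. Widening that gap makes the difference between the two measures as large as desired.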

4. FUTURE WORK

The IOM definition of disparities takes an indirect approach of elimination, and defines

health care disparity as the difference in health care that is not due to allowable covariates.

While this approach is appropriate for capturing disparity in its entirety irrespective of

source attribution, it leaves open the question of plausible causes for the disparity, and what

can be done to eliminate or reduce the disparity.

An alternative, direct and constructive approach is to define health care disparity attributable to
specific non-allowable covariates as the difference in health care that is due to these
covariates. This alternative approach can be implemented using statistical frameworks
similar to those proposed above, but with the roles of allowable and non-allowable covariates

switched. This approach does not capture disparity in its entirety, because it only captures

disparity attributable to the specific non-allowable covariates, and may miss the disparity

attributable to other non-allowable covariates, including those that may not have been

observed. However, this approach may have more direct policy implications, providing

guidance on the potential to reduce or even eliminate health care disparities through specific

policy implementations regarding the specific non-allowable covariates.

In practice, we believe both versions of the disparity are important. The elimination

approach is useful for estimating the magnitude of the overall disparity, whereas the

constructive approach is a tool for estimating how much disparity can be eliminated through

specific policy interventions. A comparison between the two is also important in revealing

how much of the overall disparity the policy intervention can eliminate. If a large portion

remains, a new policy intervention needs to be identified. We plan to explore these issues in

subsequent work, especially in the context of longitudinal data.

Another issue that we plan to investigate is the issue of variables that are not included in the

model for predicting the outcome Y but may actually be important. Traditionally there is not

much one can do about those variables other than trying one’s best to include as many

variables as one can find and afford to measure. For the conditional disparity framework as

we outlined, one may have noticed that the conditional disparity as defined by (11) does not

involve the non-allowable variables. This provides an opportunity to realize the implicit

assumption carried in the IOM definition, that is, the non-allowable category is the “catch

all” category that includes all covariates that have not been named explicitly in the allowable

category. Of course, without strong assumptions, nothing can be done for variables that are

not even identified. Recall the fundamental assumption underlying our conditional disparity

model is that the allowable variables, which clearly need to be identified and measured, are

causes for non-allowable variables. Therefore, in specific applications where such an
assumption can be viewed as reasonable even when the non-allowable variables form the
“catch all” category, the conditional disparity measure enjoys the property of being
more general than we discussed in the current paper.

However, the “catch-all” formulation of the non-allowable variables would not produce

anything meaningful under the marginal disparity model, because we simply cannot stratify
on variables that are not measured, nor should we expect to, as logically nothing can be done
when the causes are not even identified. All these issues remind us again of the fundamental

importance of explicitly formulating, identifying, and stating causal assumptions underlying

any disparity measure.

Acknowledgments

We thank J. Gastwirth, X. Xie and A. Zaslavsky for helpful exchanges.

Contract/grant sponsor: NIH; contract/grant number: P50-MHO73469-03, U01-MH06220-06A2

REFERENCES

1. Institute of Medicine. Unequal Treatment: Confronting Racial and Ethnic Disparities in Health Care.

Washington, DC: National Academy Press; 2002.

2. Asch DA, Armstrong K. Aggregating and partitioning populations in health care disparities

research: differences in perspective. Journal of Clinical Oncology. 2007; 25(15):2117–2121.

[PubMed: 17513818]

3. Cook B. Effect of Medicaid managed care on racial disparities in health care access. Health Services

Research. 2007; 42:124–145. [PubMed: 17355585]

4. McGuire TG, Alegria M, Cook BL, Wells KB, Zaslavsky AM. Implementing the Institute of

Medicine definition of disparities: an application to mental health care. Health Services Research.

2006; 41:1979–2005. [PubMed: 16987312]

5. Rao RS, Graubard BI, Breen N, Gastwirth JL. Understanding the factors underlying disparities in

cancer screening rates using the Peters-Belson approach. Medical Care. 2004; 42(8):789–800.

6. Fiscella K, Franks P, Doescher MP, Saver BG. Disparities in health care by race, ethnicity, and

language among the insured: findings from a national sample. Medical Care. 2002; 40(1):52–59.

[PubMed: 11748426]

7. Gastwirth JL. A clarification of some statistical issues in Watson v. Fort Worth Bank and Trust.

Jurimetrics Journal. 1989; 29:267–284.

8. Gastwirth JL, Greenhouse SW. Biostatistical concepts and methods in the legal setting. Statistics in

Medicine. 1995; 14:1641–1653. [PubMed: 7481200]

9. Nayak TK, Gastwirth JL. Statistical measures of economic discrimination useful in evaluating
fairness. Proceedings of Biopharmaceutical Section of the American Statistical Association.
1995:87–94.

10. Gelman, A.; Meng, X-L., editors. Applied Bayesian Modeling and Causal Inference from

Incomplete-data Perspectives. U.K.: Wiley & Sons; 2004.

11. Holland PW, Wang YJ. Dependence function for continuous bivariate densities. Comm. Statist.

Theory Methods. 1987; 16:863–876.
