Bias in the estimation of exposure effects with individual- or group-based exposure assessment.

Hyang-Mi Kim, David Richardson, Dana Loomis, Martie Van Tongeren, Igor Burstyn

Department of Mathematics and Statistics, University of Calgary, Calgary, Alberta, Canada.

Journal Article: Journal of Exposure Science and Environmental Epidemiology (impact factor: 2.72). 02/2010; 21(2):212-21. DOI: 10.1038/jes.2009.74

Abstract

In this paper, we develop models of bias in estimates of exposure-disease associations for epidemiological studies that use group- and individual-based exposure assessments. In a study that uses a group-based exposure assessment, individuals are grouped according to shared attributes, such as job title or work area, and assigned an exposure score, usually the mean of some concentration measurements made on samples drawn from the group. We considered bias in the estimation of exposure effects in the context of both linear and logistic regression disease models, and the classical measurement error in the exposure model. To understand group-based exposure assessment, we introduced a quasi-Berkson error structure that can be justified with a moderate number of exposure measurements from each group. In the quasi-Berkson error structure, the true value is equal to the observed one plus error, and the error is not independent of the observed value. The bias in estimates with individual-based assessment depends on all variance components in the exposure model and is smaller when the between-group and between-subject variances are large. In group-based exposure assessment, group means can be assumed to be either fixed or random effects. Regardless of this assumption, the behavior of estimates is similar: the estimates of regression coefficients were less attenuated with a large sample size used to estimate group means, when between-subject variability was small and the spread between group means was large. However, if groups are considered to be random effects, bias is present, even with large number of measurements from each group. This does not occur when group effects are treated as fixed. We illustrate these models in analyses of the associations between exposure to magnetic fields and cancer mortality among electric utility workers and respiratory symptoms due to carbon black.

Source: PubMed

Bias in the estimation of exposure effects with individual- or group-based
exposure assessment
HYANG-MI KIMa,e, DAVID RICHARDSONb, DANA LOOMISc, MARTIE VAN TONGERENd
AND IGOR BURSTYNe
aDepartment of Mathematics and Statistics, University of Calgary, Calgary, Alberta, Canada
bDepartment of Epidemiology, University of North Carolina, Chapel Hill, USA
cDepartment of Environmental and Occupational Health, University of Nevada, Nevada, USA
dInstitute of Occupational Medicine, Riccarton, UK
eDepartment of Medicine, University of Alberta, Edmonton, Alberta, Canada
Journal of Exposure Science and Environmental Epidemiology advance online publication, 24 February 2010; doi:10.1038/jes.2009.74
Keywords: quasi-Berkson type error structure, non-differential measurement error, bias, mixed exposure model, homogenous error.
Introduction
In epidemiological cohort studies of occupational and
environmental exposures, individual exposure measurements
are often not available for all members of the study, whereas
health outcome measures are obtained for each individual. In
such settings, a commonly employed approach is to derive
exposure estimates through a group-based strategy (Loomis
and Kromhout, 2004), also known as a semi-individual or
semi-ecological study design. Individuals are grouped according
to shared attributes, such as job title or work area, and
assigned an exposure score, usually the mean of some
concentration measurements made on samples drawn from
the group.
Interestingly, in some settings, the use of a group-based
strategy for assigning exposure scores can result in a less
biased estimate of an exposure–disease association than would
be achieved through individual exposure measurements. It is
well-known that non-differential measurement errors in
individual exposure estimates may lead to bias in estimates
of exposure–response associations; a group-based strategy
can minimize this attenuation bias by creating an error
structure that has some properties of a Berkson-type error.
The Berkson error model was originally proposed for
experimental situations, in which the investigator attempted
to set the exposure at a target value, but because of
imprecision of instrumentation, its true value was randomly
distributed around the target (Berkson, 1950). If the
experiment was replicated many times with the same target
value, the true value would be randomly distributed with an
estimated mean approaching the target value: the errors
would be independent of the target value. Kim et al. (2006)
showed that the group-based strategy leads to an approximate
Berkson measurement error structure when data are
available for a large number of subjects in each group. It is
approximate in the sense that assigned group means may
not be independent of error. To account for this complexity,
we formally introduce a novel quasi-Berkson error model in
this paper.
Received 24 August 2009; accepted 30 November 2009.
Address all correspondence to: Dr. Hyang-Mi Kim, Department of
Mathematics and Statistics, The University of Calgary, 2500 University
Drive N.W., Calgary, AB, Canada, T2N 1N4. Tel: +403 220 5691.
Fax: +403 282 5150. E-mail: hmkim@ucalgary.ca
Journal of Exposure Science and Environmental Epidemiology (2010), 1–10.
© 2010 Nature Publishing Group. All rights reserved 1559-0631/10/$32.00. www.nature.com/jes
When members of a cohort are grouped, mixed-effects
models are often fitted to the exposure data, as these models
allow an analyst to treat the group as either a fixed or a
random effect. A question arises as to whether estimation
using random group effect (RGE) modeling produces exposure–
response results that are consistent with those obtained using
fixed group effect (FGE) modeling. In some settings, the
rationale for treating exposure groups as fixed is clear. An
occupational cohort study that makes use of a previously
published job–exposure matrix implies an exposure assess-
ment in which groups are fixed. If such an assumption is
made, then conclusions can be drawn only about exposure–
response association in the occupational groups that were
investigated. This may well be desirable in narrowly targeted
studies of uncommon exposures (that only occur in the
studied workplaces) or in investigations undertaken by one
company's health and safety department (wherein the goal is
simply to understand the health risk to employees of a given
enterprise).
In contrast, when the investigator of an occupational cohort
wishes to draw conclusions about exposure–
response not only among a fixed set of studied occupational
groups but also in all possible occupational groups, it is more
natural to assume that groups are created through a random
draw from all possible groupings. This desire to generalize
findings beyond, say, one occupation in a given factory to all
similar jobs in a specific industry requires us to assume that
the observed groups provide information about the char-
acteristics of all possible exposure groups. This assumption
enables an investigator to estimate the variation in exposure
between groups (Goldstein, 2003).
The following assumptions were made for the purposes of
our study: a normal exposure distribution (a log transforma-
tion for log-normally distributed exposures was applied in the
examples), known constant error variance components, no
systematic error, non-differential error and no correlation
among errors. We also focused on the scenario in which the
disease under study was neither common nor extremely rare.
Throughout this paper, we define exposure as intensity or
concentration of substance, ignoring complications that arise
from time-varying exposure patterns and accumulation of
dose due to long-term exposure.
Our first aim is to examine, from a theoretical perspective,
the use of fixed and random group-based strategies for
assigning exposure scores in an epidemiological cohort. Next,
we illustrate how a researcher may obtain valid estimates of
exposure–disease associations through linear or logistic
regression methods, even when exposure measurements for
all subjects are not available, as long as an adequate
sample of measured values for each group is drawn and the
between-group variability is large. The impact of different
grouping schemes on parameter estimation is illustrated in
two examples: (a) occupational exposure to magnetic fields
among workers with any cancer (Kromhout et al., 1995;
Savitz and Loomis, 1995) and (b) respiratory health of
employees in the European carbon black manufacturing
industry in relation to exposure to carbon black dust (van
Tongeren et al., 1997). In Section 2, we present the bias
equations for individual- and group-based assessments for both
RGE and FGE models. Findings derived from the simulation
studies are described in Section 3. In Section 4, we provide
two examples, and the findings are discussed in Section 5.
Theoretical Study
Theoretical studies were considered by assuming an additive
measurement error model together with linear and logistic
response models. For both individual- and group-based
strategies, the conditional mean of the linear response model,
given the observed exposure (Harville, 1977), was used to
obtain the attenuation factor for the regression coefficient in
the linear model. For the logistic model, expressions for
the conditional mean and variance of the true
exposure, given the observed values, were derived first;
the expression for the attenuation in the
response model was then obtained from them.
For the group-based strategy, we considered two exposure
models: the RGE exposure model in which the group is
regarded as a random component, and the FGE exposure
model, in which the group is a fixed component. For both
exposure models, the Berkson error was induced from a
classical error structure under the assumptions that (1) the
number of measurements in each group is sufficiently large to
estimate the true group means closely and (2) the group
means are not correlated with the measurement error in the
Berkson error model. As this approximation of the Berkson
error depends on the sample size and the covariance between
the group mean and measurement error, we call this a quasi-
Berkson error model.
In the presence of fixed unknown group means, we assume
that the group means are fixed, with a different between-group
variability ($\omega^2$) for each distinct grouping scheme. However,
deriving the attenuation equation for logistic
regression models requires the exposure distribution to be
normal, and this assumption fails
when the between-group exposure variability ($\omega^2$) is large for
the FGE model. In such a situation, the exposures are
distributed as a mixture of normal distributions with the number
of components equal to the number of groups. Therefore, an
expression for attenuation cannot be easily derived, and in
this paper, we do not explore the theoretical behavior of the
regression coefficient when $\omega^2$ is large.
Models
We postulate a classical exposure measurement error model.
For the setting of RGE, the measurement error model is
$$W_{gi} = \mu + \nu_g + \gamma_{gi} + \eta_{gi} = X_{gi} + \eta_{gi} \tag{1}$$
and for FGE, the measurement error model is
$$W_{gi} = \mu_g + \gamma_{gi} + \eta_{gi} = X_{gi} + \eta_{gi} \tag{2}$$
where $W_{gi}$ represents the observed exposure of the $i$th subject
from the $g$th group and $X_{gi}$ represents the true exposure of
that subject; $\mu$ is the common true mean; $\mu_g$ is the fixed group
mean, $g = 1, \ldots, G$; $\nu_g \sim N(0, \sigma_g^2)$ is a random effect due to
group $g$, $g = 1, \ldots, G$; $\gamma_{gi} \sim N(0, \sigma_b^2)$ is a random effect due to
subject $i$ in each group, $i = 1, \ldots, N$; and $\eta_{gi} \sim N(0, \sigma_\eta^2)$ is a random
effect due to measurement error and daily fluctuations in
exposure that may arise in occupational settings from
variability of daily tasks (e.g., day-to-day variability for a
full-shift measurement, $W_{gi}$). The errors are mutually
independent.
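The two exposure models can be sketched as short data-generating functions. This is only an illustration of Eqs (1) and (2) as written above, not the authors' simulation code; the function names `draw_rge` and `draw_fge` and the tuple layout are hypothetical.

```python
import random

def draw_rge(rng, mu, sigma_g, sigma_b, sigma_eta, G, N):
    """Eq. (1): W_gi = mu + nu_g + gamma_gi + eta_gi (random group effect)."""
    rows = []
    for g in range(G):
        nu_g = rng.gauss(0.0, sigma_g)             # random effect of group g
        for _ in range(N):
            x = mu + nu_g + rng.gauss(0.0, sigma_b)  # true exposure X_gi
            w = x + rng.gauss(0.0, sigma_eta)        # observed exposure W_gi
            rows.append((g, x, w))
    return rows

def draw_fge(rng, group_means, sigma_b, sigma_eta, N):
    """Eq. (2): W_gi = mu_g + gamma_gi + eta_gi (fixed group means)."""
    rows = []
    for g, mu_g in enumerate(group_means):
        for _ in range(N):
            x = mu_g + rng.gauss(0.0, sigma_b)       # true exposure X_gi
            w = x + rng.gauss(0.0, sigma_eta)        # observed exposure W_gi
            rows.append((g, x, w))
    return rows

if __name__ == "__main__":
    rng = random.Random(2010)
    sample = draw_fge(rng, [0.1, 1.1, 2.1, 3.1, 4.1], 0.7, 1.0, 3)
    for g, x, w in sample[:3]:
        print(g, round(x, 3), round(w, 3))
```

The only difference between the two functions is where the group level comes from: a draw from $N(0, \sigma_g^2)$ around $\mu$ in the RGE case, and a supplied constant in the FGE case.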
For the association between exposure and response, we
consider the linear and logistic regression models given,
respectively, by
$$Y_{gi} = \beta_0 + \beta_1 X_{gi} + \varepsilon_{gi}$$
where $\beta_0$ and $\beta_1$ are the intercept and the slope parameters,
respectively, and $\varepsilon_{gi} \sim N(0, \sigma_\varepsilon^2)$, and
$$P(Z_{gi} = 1 \mid X_{gi}) = \Lambda(\beta_0 + \beta_1 X_{gi})$$
where $Z_{gi}$ is a binary variable for the health outcome and
$\Lambda(t) = 1/(1 + \exp(-t))$.
The conditional expectation of the response given the
observed values, $W_i$ ($W_{gi}$ for individual values, $\bar{W}_g$ for the group
mean), is, for the linear model,
$$E[Y_{gi} \mid W_i] = \beta_0 + \beta_1 E[X_{gi} \mid W_i] \tag{3}$$
and, for the logistic regression model,
$$E[P(Z_{gi} = 1 \mid X_{gi}) \mid W_i] \approx E[\Phi[c(\beta_0 + \beta_1 X_{gi})] \mid W_i] = \Lambda(\beta_0' + \beta_1' E(X_{gi} \mid W_i)) \tag{4}$$
where $c = 0.588$; $\beta_0'$ and $\beta_1'$ are functions of $V(X_{gi} \mid W_i)$, obtained by
using the approximation to the probit regression model (Reeves
et al., 1998); and $\Phi(t)$ is the cumulative distribution function of
the standard normal distribution. By obtaining $E(X_{gi} \mid W_i) = f(W_i)$
and $V(X_{gi} \mid W_i) = \psi(W_i)$, the bias factor can be formulated
(Burr, 1988; Wang et al., 1998; Carroll et al., 2006).
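The approximation behind Eq. (4) replaces the logistic function by a rescaled standard normal distribution function with scaling constant 0.588. The sketch below checks numerically how close the two curves are and shows the slope shrinkage that results when the true exposure is normal given the observed one. The helper names (`max_abs_gap`, `shrunken_slope`) are hypothetical, and the grid used for the check is an arbitrary choice.

```python
import math

C = 0.588  # scaling constant quoted in the text

def logistic(t):
    """Logistic function 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + math.exp(-t))

def std_normal_cdf(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def max_abs_gap(c=C, lo=-10.0, hi=10.0, steps=4001):
    """Largest |logistic(t) - Phi(c * t)| over an evenly spaced grid."""
    width = (hi - lo) / (steps - 1)
    grid = (lo + k * width for k in range(steps))
    return max(abs(logistic(t) - std_normal_cdf(c * t)) for t in grid)

def shrunken_slope(beta1, cond_var, c=C):
    """Slope after averaging over X | W ~ Normal with variance cond_var:
    beta1 / sqrt(1 + c^2 * beta1^2 * cond_var)."""
    return beta1 / math.sqrt(1.0 + (c * beta1) ** 2 * cond_var)

if __name__ == "__main__":
    print("max gap:", max_abs_gap())
    print("slope with V(X|W) = 2:", shrunken_slope(0.3, 2.0))
```

The gap between the two curves is small (on the order of 0.01), which is why the probit form can stand in for the logistic one; the square-root denominator in `shrunken_slope` is the same form that appears in the logistic bias expressions of the following section.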
Bias
In this section, the conditional expectation and variance are
calculated for both the RGE (Eq. (1)) and FGE (Eq. (2))
models and used to derive the bias factors for both linear and
logistic regression models.
Individual-Based Strategy
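With the RGE model, $E(X_{gi} \mid W_{gi}) = \mu(1 - \lambda^*) + \lambda^* W_{gi}$, where
$\lambda^* = (\sigma_g^2 + \sigma_b^2)/(\sigma_g^2 + \sigma_b^2 + \sigma_\eta^2)$, and the conditional variance is
given by $V(X_{gi} \mid W_{gi}) = V(X_{gi})(1 - \lambda^*) = \sigma_\eta^2 \lambda^*$. With the FGE
model, $E(X_{gi} \mid W_{gi}) = \bar{\mu}(1 - \lambda') + \lambda' W_{gi}$ and
$V(X_{gi} \mid W_{gi}) = \lambda'^2 \sigma_\eta^2 + (1 - \lambda')^2 \sigma_b^2$, where
$\lambda' = (\omega^2 + \sigma_b^2)/(\omega^2 + \sigma_b^2 + \sigma_\eta^2)$; the
between-group variability is defined as $\omega^2 = \sum_{g} (\mu_g - \bar{\mu})^2 / G$
and $\bar{\mu} = \sum_{g} \mu_g / G$.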
On the basis of Eqs (3) and (4), we obtained approximate
equations that describe the relationship between the true
regression coefficient, $\beta_1$, and the observed regression coefficient,
$\beta_1^*$, with the observed exposures, $W_i$. In the linear regression,
$$\beta_1^* = \lambda^* \beta_1 \quad \text{and} \quad \beta_1^* = \lambda' \beta_1$$
for RGE and FGE models, respectively. In the logistic
regression context with an RGE model,
$$\beta_1^* = \frac{\lambda^* \beta_1}{\sqrt{c^2 \beta_1^2 \sigma_\eta^2 \lambda^* + 1}} \tag{5}$$
and with the FGE model when the between-group variability is
small so that the exposures are approximately normally
distributed,
$$\beta_1^* = \frac{\lambda' \beta_1}{\sqrt{c^2 \beta_1^2 [\sigma_\eta^2 \lambda'^2 + (1 - \lambda')^2 \sigma_b^2] + 1}} \tag{6}$$
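The attenuation factors, $\lambda^* = (\sigma_g^2 + \sigma_b^2)/(\sigma_g^2 + \sigma_b^2 + \sigma_\eta^2)$ for RGE and
$\lambda' = (\omega^2 + \sigma_b^2)/(\omega^2 + \sigma_b^2 + \sigma_\eta^2)$ for FGE, for the linear and logistic models
(Eqs (5) and (6)) go to $\lambda = \sigma_b^2/(\sigma_b^2 + \sigma_\eta^2)$ as the between-group
variability decreases. There is attenuation, as $\lambda \le \lambda^* (\lambda') \le 1$. There
is less attenuation when the between-subject variability increases
for a model with fixed between-group variability:
$$\frac{\sigma_g^2(\omega^2) + \sigma_{b,1}^2}{\sigma_g^2(\omega^2) + \sigma_{b,1}^2 + \sigma_\eta^2} \le \frac{\sigma_g^2(\omega^2) + \sigma_{b,2}^2}{\sigma_g^2(\omega^2) + \sigma_{b,2}^2 + \sigma_\eta^2} \le 1 \quad \text{if } \sigma_{b,1}^2 < \sigma_{b,2}^2,$$
where $\sigma_g^2(\omega^2)$ stands for $\sigma_g^2$ under RGE and $\omega^2$ under FGE.
In addition, when the measurement error variance
increases, the attenuation increases.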
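The attenuation expressions for the individual-based strategy are simple enough to evaluate directly. The following is a minimal sketch, assuming the formulas for the attenuation factors and for Eqs (5) and (6) as stated above; the function names are hypothetical, and the example values (true slope 0.3, measurement error variance 1, between-group standard deviation 0.3, between-subject standard deviation 0.7 or 1.414) are taken from the simulation settings described later in the paper.

```python
import math

C = 0.588  # scaling constant from the probit approximation in the text

def attenuation_factor(between_group_var, sigma_b2, sigma_eta2):
    """lambda* (RGE, pass sigma_g^2) or lambda' (FGE, pass omega^2)."""
    signal = between_group_var + sigma_b2
    return signal / (signal + sigma_eta2)

def linear_observed(beta1, between_group_var, sigma_b2, sigma_eta2):
    """Observed slope in the linear model: lambda * beta1."""
    return attenuation_factor(between_group_var, sigma_b2, sigma_eta2) * beta1

def logistic_observed_rge(beta1, sigma_g2, sigma_b2, sigma_eta2, c=C):
    """Eq. (5): lambda* beta1 / sqrt(c^2 beta1^2 sigma_eta^2 lambda* + 1)."""
    lam = attenuation_factor(sigma_g2, sigma_b2, sigma_eta2)
    return lam * beta1 / math.sqrt((c * beta1) ** 2 * sigma_eta2 * lam + 1.0)

def logistic_observed_fge(beta1, omega2, sigma_b2, sigma_eta2, c=C):
    """Eq. (6): uses V(X|W) = lambda'^2 sigma_eta^2 + (1 - lambda')^2 sigma_b^2."""
    lam = attenuation_factor(omega2, sigma_b2, sigma_eta2)
    cond_var = lam ** 2 * sigma_eta2 + (1.0 - lam) ** 2 * sigma_b2
    return lam * beta1 / math.sqrt((c * beta1) ** 2 * cond_var + 1.0)

if __name__ == "__main__":
    for sigma_b2 in (0.49, 2.0):
        print(sigma_b2,
              round(linear_observed(0.3, 0.09, sigma_b2, 1.0), 3),
              round(logistic_observed_rge(0.3, 0.09, sigma_b2, 1.0), 3))
```

With these inputs the linear expression gives roughly 0.110 and 0.203, which is in line with the simulated RGE estimates of 0.108 and 0.202 quoted in the simulation results.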
Group-Based Strategy
In the group-based strategy, an average ($\bar{W}_g$) of the observed
measurements for a group $g$ is applied to all subjects in
the group (e.g., those with the same job title): $\bar{W}_g = \sum_{i=1}^{n} W_{gi}/n$,
where $n$ is the number of subjects sampled from a group of total
size $N$. For each subject, this group mean $\bar{W}_g$ is an
approximation of his/her true exposure ($X_{gi}$), if the number
of measurements is reasonably large.
The conditional expectation of the true exposure given the
observed group mean in this case is
$$E[X_{gi} \mid \bar{W}_g] = \bar{W}_g + (m - 1)(\bar{W}_g - \mu_g),$$
where $m = \mathrm{cov}(X_{gi}, \bar{W}_g)/\mathrm{var}(\bar{W}_g)$. The
derivation is made under the assumption of a classical
measurement error model. If the number of subjects in each
group is sufficient for the true mean and the estimated group
mean to be close in value, that is, $\bar{W}_g \approx E[X_{gi}]$, then we have
$E[X_{gi} \mid \bar{W}_g] \approx \bar{W}_g$. On the basis of this approximate property
($E[X_{gi} \mid \bar{W}_g] = \bar{W}_g$), we postulate a quasi-Berkson error
model with the assigned group mean ($\bar{W}_g$) and true exposures
($X_{gi}$), that is,
$$X_{gi} = \bar{W}_g + e_{gi}, \quad E[e_{gi} \mid \bar{W}_g] = 0 \tag{7}$$
This approximation may depend on the sample selected, but
only a moderately large sample size is needed to obtain this
property, wherein the true exposure of each worker varies randomly
about the group mean, $\bar{W}_g$, and this mean is
approximately the true group mean. This situation is
analogous to the Berkson error model, in which the true
exposure, given the observed exposure, has an expected value
equal to the observed exposure. However, the model is not a
true Berkson error model unless the group mean ($\bar{W}_g$) is
independent of the error ($e_{gi}$) (Kim et al., 2006). With RGE,
it is necessary to consider the possibility of correlation
between $\bar{W}_g$ and $e_{gi}$, because the group means are random
components that may be correlated with the error ($e_{gi}$). The Berkson
error structure will be approximated if this covariance is
small. The covariance can be either positive or negative and is a
function of $\sigma_g$: $\mathrm{cov}(\bar{W}_g, e_{gi}) = \xi(\sigma_g)$.
With FGE, one can derive that $\mathrm{cov}(\bar{W}_g, e_{gi}) = 0$ and
$\mathrm{cov}(X_{gi}, e_{gi}) = V(X_{gi} \mid \bar{W}_g) \ne 0$, as $\bar{W}_g$ can be considered a
constant when the number of measurements for the group
mean is large. The model, however, is not truly a Berkson
error model, because $\mathrm{cov}(\bar{W}_g, e_{gi}) = 0$ does not imply that the
observed value and the error are independent, which is
required for the Berkson error model.
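Applying the RGE model in the linear regression model
(Eq. (3)) turns non-differential error into differential error
if covariance exists between $\bar{W}_g$ and $e_{gi}$. As $X_{gi} = \bar{W}_g + e_{gi}$,
the linear model is expressed as $Y_{gi} = \beta_0 + \beta_1 \bar{W}_g + \varepsilon_{gi}^*$, where
$\varepsilon_{gi}^* = \beta_1 e_{gi} + \varepsilon_{gi}$. Thus,
$$\mathrm{cov}(\varepsilon_{gi}^*, \bar{W}_g) = \mathrm{cov}(\beta_1 e_{gi} + \varepsilon_{gi}, \bar{W}_g) \approx \beta_1 \mathrm{cov}(e_{gi}, \bar{W}_g),$$
that is, the model error ($\varepsilon_{gi}^*$) is correlated with
the covariate ($\bar{W}_g$). With RGE, the quasi-Berkson error structure
(Eq. (7)) and non-zero covariance,
$$\beta_1^* = \beta_1 + \frac{\mathrm{cov}(\bar{W}_g, e_{gi})}{\sigma_g^2} \tag{8}$$
whereas with FGE,
$$\beta_1^* \approx \beta_1 \tag{9}$$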
In a logistic regression analysis (Eq. (4)), it is necessary to find a
relationship among the variances in Eqs (1) and (2) under the
quasi-Berkson error model to derive the amount of attenuation
in the response models. Using the RGE model, the error
variance, $\sigma_e^2$, is obtained:
$$\sigma_e^2 = V(e_{gi}) = \sigma_b^2 - 2\,\mathrm{cov}(\bar{W}_g, e_{gi}) \ge 0$$
where $\mathrm{cov}(\bar{W}_g, e_{gi}) \le 0.5\sigma_b^2$. This equation implies that the bias
with the RGE model depends on the between-subject variance,
as well as the covariance between the group mean and the
measurement error of the quasi-Berkson error structure:
$$\beta_1^* = \frac{\beta_1}{\sqrt{c^2 \beta_1^2 [\sigma_b^2 - 2\,\mathrm{cov}(e_{gi}, \bar{W}_g)] + 1}} \tag{10}$$
Using the FGE model, and using a property of the Berkson error
structure when the between-group variance is small, the error
variance is obtained as
$$\sigma_e^2 = V(e_{gi}) = V(X_{gi} - \bar{W}_g) = V(X_{gi}) + V(\bar{W}_g) - 2\,\mathrm{cov}(X_{gi}, \bar{W}_g) = V(X_{gi}) \approx \sigma_b^2$$
This equation implies that the bias with the FGE model depends
on the between-subject variance when the number of sampled
subjects ($n$) is sufficiently large:
$$\beta_1^* = \frac{\beta_1}{\sqrt{c^2 \beta_1^2 \sigma_b^2 + 1}} \tag{11}$$
When grouping is used to assess exposure, the measurement
error variance vanishes as the sample size increases for the FGE
model. However, for the RGE model, the correlation between
the measurement error ($e_{gi}$) and the group mean leads to bias.
That is, the group-based strategy reduces the effect of
measurement error on the regression coefficient estimate under
FGE, but not under RGE.
From Eqs (8) and (10), the attenuation in both the linear
and logistic models depends on the between-group and
between-subject variability with the RGE model. As the
between-subject variability increases, the attenuation increases,
whereas when the between-group variability increases, the bias
decreases.
For the FGE model (Eq. (11)), the derived expression for
bias does not include the between-group variance component.
One reason for this is that we consider only small group
variability, so that the exposures are
approximately normally distributed. However, the simulation
study (below) shows that the bias decreases as the group means move
farther apart, just as with the RGE model.
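For the RGE model, Eqs (8) and (10) show how the covariance between the assigned group mean and the quasi-Berkson error feeds into the observed slope. The sketch below evaluates both expressions as written; the function names are hypothetical, and the covariance is treated as a free input because the text gives no closed form for it beyond the bound of half the between-subject variance.

```python
import math

C = 0.588  # scaling constant from the probit approximation in the text

def linear_group_rge(beta1, cov_wbar_e, sigma_g2):
    """Eq. (8): beta1 + cov(W_bar_g, e_gi) / sigma_g^2."""
    return beta1 + cov_wbar_e / sigma_g2

def logistic_group_rge(beta1, sigma_b2, cov_wbar_e, c=C):
    """Eq. (10): beta1 / sqrt(c^2 beta1^2 [sigma_b^2 - 2 cov] + 1)."""
    sigma_e2 = sigma_b2 - 2.0 * cov_wbar_e
    if sigma_e2 < 0.0:
        # The text requires cov <= 0.5 * sigma_b^2 so that V(e_gi) >= 0.
        raise ValueError("covariance exceeds 0.5 * sigma_b^2")
    return beta1 / math.sqrt((c * beta1) ** 2 * sigma_e2 + 1.0)

if __name__ == "__main__":
    beta1, sigma_b2 = 0.3, 0.49
    for cov in (-0.1, 0.0, 0.1):  # illustrative covariance values only
        print(cov,
              round(linear_group_rge(beta1, cov, 1.0), 3),
              round(logistic_group_rge(beta1, sigma_b2, cov), 4))
```

Setting the covariance to zero in `logistic_group_rge` reproduces the FGE expression of Eq. (11), which makes explicit that the two group-based results differ only through this covariance term.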
Sample Size in Group-based Exposure Assessment
The extent to which this quasi-Berkson error model fits the
data depends on (a) the variance of the group mean and
(b) the covariance between the group mean and measurement
error. It can be shown that the variance of $\bar{W}_g$ approaches
zero as the sample size, $n$, increases. For each group, the
variance of group means can be expressed based on the
sample size using a binary variable, which indicates that the
observation is in the sample with probability $n/N$:
$$V(\bar{W}_g) = \frac{1}{n}\left\{(\sigma_g^2 + \sigma_b^2 + \sigma_\eta^2) + \left(1 - \frac{n}{N}\right)\mu_g^2\right\} + \left(1 - \frac{1}{N}\right)\sigma_g^2$$
for RGE. The variance of the group mean depends on the
between-group variance, regardless of the sample size. This
variability together with the covariance affects the parameter
estimation in the response models (Eqs (8) and (10)).
Figure 1 shows how the variance changes as the sample
size increases. For the FGE model, the variance of group
means,
$$V(\bar{W}_g) = \frac{1}{n}\left\{(\sigma_b^2 + \sigma_\eta^2) + \left(1 - \frac{n}{N}\right)\mu_g^2\right\},$$
starts to approach zero with a relatively small number of exposure
measurements drawn from each group. This condition leads
to the quasi-Berkson model being a good approximation of
the Berkson error model, so that the bias in the slope
parameter is negligible.
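The behavior of the two variance expressions as the sample size grows can be tabulated with a few lines of code. This is a sketch under two assumptions that are not stated in the source: the RGE formula is read with the final between-group term lying outside the 1/n factor (which matches the statement that the variance depends on the between-group variance regardless of sample size), and the measurement error variance is set to 1. The function names are hypothetical; the other values follow the Figure 1 legend (N = 500, group mean 1, between-subject standard deviation 0.5).

```python
def var_group_mean_fge(n, N, mu_g, sigma_b2, sigma_eta2):
    """FGE: (1/n) * {(sigma_b^2 + sigma_eta^2) + (1 - n/N) * mu_g^2}."""
    return ((sigma_b2 + sigma_eta2) + (1.0 - n / N) * mu_g ** 2) / n

def var_group_mean_rge(n, N, mu_g, sigma_g2, sigma_b2, sigma_eta2):
    """RGE, assuming the last term sits outside the 1/n factor:
    (1/n) * {(sigma_g^2 + sigma_b^2 + sigma_eta^2) + (1 - n/N) * mu_g^2}
    + (1 - 1/N) * sigma_g^2."""
    inner = (sigma_g2 + sigma_b2 + sigma_eta2) + (1.0 - n / N) * mu_g ** 2
    return inner / n + (1.0 - 1.0 / N) * sigma_g2

if __name__ == "__main__":
    N, mu_g, sigma_b2 = 500, 1.0, 0.25
    sigma_eta2 = 1.0  # assumed value; not given in the figure legend
    for n in (1, 5, 10, 25, 100, 500):
        print(n,
              round(var_group_mean_fge(n, N, mu_g, sigma_b2, sigma_eta2), 4),
              round(var_group_mean_rge(n, N, mu_g, 1.0, sigma_b2, sigma_eta2), 4))
```

Under this reading the FGE variance shrinks toward zero, while the RGE variance levels off near the between-group variance, which is the contrast the paragraph above describes.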
Comparison between Individual and Group Strategies
In a linear regression analysis with the RGE model, if the
measurement error gets smaller, there is negligible attenuation in
using the individual-based assessment, while there is bias
depending on the covariance term, $\mathrm{cov}(\bar{W}_g, e_{gi})$, and the
between-group variability ($\sigma_g^2$) with group-based assessment. However,
with the FGE model, the group-based exposure assessment is
always superior if the sample size is moderately large, because it
leads to a quasi-Berkson error structure that gives no bias.
In logistic regression analysis, for a given between-group
variability, as the error variance ($\sigma_\eta^2$) gets smaller, there is
negligible attenuation when using the individual-based
assessment, while there is still bias depending on the error
variability ($\sigma_e^2$) when the group-based assessment is used on
the same data. Therefore, when the measurement error is
small and the error variability, $\sigma_e^2$, is large, the estimates with
individual-based assessment are expected to be less biased
than those with group-based assessment (Eq. (5) versus
Eq. (10) and Eq. (6) versus Eq. (11)).
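The logistic comparison for the FGE model (Eq. (6) versus Eq. (11)) can be made concrete by evaluating both expressions over a range of measurement error variances. The sketch below is illustrative only: the function names are hypothetical, and the inputs (true slope 0.3, between-group variability 0.09, between-subject variance 2) are chosen from the ranges used in the simulation study rather than from any reported comparison.

```python
import math

C = 0.588  # scaling constant from the probit approximation in the text

def individual_fge(beta1, omega2, sigma_b2, sigma_eta2, c=C):
    """Eq. (6): individual-based assessment under the FGE model."""
    lam = (omega2 + sigma_b2) / (omega2 + sigma_b2 + sigma_eta2)
    cond_var = lam ** 2 * sigma_eta2 + (1.0 - lam) ** 2 * sigma_b2
    return lam * beta1 / math.sqrt((c * beta1) ** 2 * cond_var + 1.0)

def group_fge(beta1, sigma_b2, c=C):
    """Eq. (11): group-based assessment under the FGE model, large n."""
    return beta1 / math.sqrt((c * beta1) ** 2 * sigma_b2 + 1.0)

if __name__ == "__main__":
    beta1, omega2, sigma_b2 = 0.3, 0.09, 2.0
    grouped = group_fge(beta1, sigma_b2)
    for sigma_eta2 in (0.01, 0.1, 0.5, 1.0):
        individual = individual_fge(beta1, omega2, sigma_b2, sigma_eta2)
        better = "individual" if abs(individual - beta1) < abs(grouped - beta1) else "group"
        print(sigma_eta2, round(individual, 4), round(grouped, 4), better)
```

The group-based value does not change with the measurement error variance, so the individual-based estimate is closer to the true slope only when that variance is small, as the paragraph above argues.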
Simulation study
Simulations were performed to examine attenuation in the
regression coefficient estimates in linear and logistic models
with individual- and group-based exposure assessment and a
disease with an expected risk of about 10% ($P \approx 0.1$) and less
than 10% ($P < 0.1$). We considered a cohort with time-invariant
exposure that segregates into five exposure groups.
We further assumed that disease risk depended only on
exposure intensity and not on its duration. Exposure
measurements were available for a sample of $n = 20$ (or 100) workers
out of all 500 subjects ($N$) in each group. Each subject was measured only once, and
it was assumed that all variance components were known
(the between-group, between-subject and measurement error
variance on each subject from the measurement error
models). The means of the estimates and
standard errors over the 1000 simulated data sets were calculated. In addition, the empirical
standard deviations of the estimates were calculated and the
empirical mean square error (MSE) was obtained.
The true regression coefficient was set to 0.3, and $-4$ (for
$P < 0.1$) or $-2$ (for $P \approx 0.1$) was used as the intercept
parameter for both regression models. The probability of
disease, $p = P(Z_{gi} = 1 \mid X_{gi}) = \Lambda(\beta_0 + \beta_1 X_{gi})$, was
calculated and used to assign binary disease status from a
Bernoulli distribution. The exposures were assumed to be
normally distributed with a common mean of 0.1 and
between-group standard deviations of 0.3, 0.5 and 1 for the
RGE model, and with group means of 0.1(0.3)1.3 ($\omega = 0.3$; that is, from 0.1 to 1.3
in steps of 0.3), 0.1(0.5)2.1 ($\omega = 0.5$)
and 0.1(1)4.1 ($\omega = 1$) from the first group to the fifth group for the
FGE model. To see the impact of the between-subject
standard deviation, we examined values that span a plausible
range (Kromhout et al., 1993), that is, a small value of 0.7
($\sigma_b^2 = 0.49$) and a large value of 1.414 ($\sigma_b^2 = 2$). As the
measurement error disappears with the grouping strategy
($\sigma_\eta^2$/sample size in each group), we considered only a
measurement error variance ($\sigma_\eta^2$) of 1 for
both the RGE and FGE models. For the group-based
strategy, the estimated mean exposure for each group was
assigned to all workers in a given group. The regression
coefficients were estimated using the generalized linear model
procedures of R software, an implementation of the S language,
which was developed by John Chambers and colleagues at
Bell Laboratories, Lucent Technologies (Chambers and Hastie, 1991). The
corresponding author will make the R code used in the
simulations available on request.
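A reduced version of the linear-model simulation can be sketched with the Python standard library alone. This is not the authors' R code: it covers only the FGE model with group means 0.1(1)4.1, uses 40 replications instead of 1000 to keep the run short, and assumes a response error standard deviation of 1, which the text does not specify. All function names are hypothetical.

```python
import random

def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def simulate_once(rng, group_means, sigma_b, sigma_eta, sigma_eps,
                  beta0, beta1, N, n):
    """One cohort: slopes from individual- and group-based assessment."""
    w_individual, w_group, y = [], [], []
    for mu_g in group_means:
        x = [mu_g + rng.gauss(0.0, sigma_b) for _ in range(N)]   # true exposure
        w = [xi + rng.gauss(0.0, sigma_eta) for xi in x]         # observed exposure
        # Subjects are exchangeable within a group, so the first n
        # measurements act as a random sample of size n.
        w_bar = sum(w[:n]) / n
        y.extend(beta0 + beta1 * xi + rng.gauss(0.0, sigma_eps) for xi in x)
        w_individual.extend(w)
        w_group.extend([w_bar] * N)   # group mean assigned to every member
    return ols_slope(w_individual, y), ols_slope(w_group, y)

def run(reps=40, seed=1):
    """Mean slope estimates over reps simulated cohorts (FGE, linear model)."""
    rng = random.Random(seed)
    group_means = [0.1, 1.1, 2.1, 3.1, 4.1]
    individual, grouped = [], []
    for _ in range(reps):
        b_ind, b_grp = simulate_once(rng, group_means, 0.7, 1.0, 1.0,
                                     -2.0, 0.3, 500, 100)
        individual.append(b_ind)
        grouped.append(b_grp)
    return sum(individual) / reps, sum(grouped) / reps

if __name__ == "__main__":
    mean_ind, mean_grp = run()
    print("individual-based mean slope:", round(mean_ind, 3))
    print("group-based mean slope:", round(mean_grp, 3))
```

With widely spaced group means, the individual-based slope should sit well below the true value of 0.3 while the group-based slope stays close to it, which is the pattern the theoretical section predicts for the FGE model.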
Individual-Based Strategy
Bias depends on all variance components for the RGE and
FGE models in both linear and logistic regression models. As
the measurement error variance increases, the bias increases,
as expected (not shown). When the measurement error
variance is fixed as $\sigma_\eta^2 = 1$ (Table 1), the bias is reduced when
the between-group variability and between-subject variability
are large, which shows that the bias also depends on the
variability of an unknown true covariate ($\sigma_g^2 + \sigma_b^2$ for RGE or
$\sigma_b^2$ for FGE). With the RGE model, for example, when the
between-group variability is $\sigma_g = 0.3$ and the between-subject
variability increases from $\sigma_b = 0.7$ to 1.414, the bias decreases
as the estimate (MSE) increases from 0.108 (0.037) to 0.202
(0.010) in a linear model and as the estimate (MSE) increases
from 0.110 (0.039) to 0.202 (0.011) in the logistic model.
Group-Based Strategy
Tables 1, 2 and 3 show the results for the RGE and FGE
models with a probability of disease of approximately 10%, the
condition assumed in the preceding theoretical developments.
The tables also present the results for the analogous set of
simulation parameters, except that the disease is "rare", occurring
in less than 10% of the subjects. Under the grouping, the bias
depends on the between-group and between-subject variances,
because the measurement error variance vanishes as
the number of measurements increases. As the between-
Figure 1. Variance of group mean in relation to sample size. (Plot not reproduced; x-axis: sample size, 0 to 25; y-axis: variance, 0 to 4; N = 500; μ = 1; μg = 1; σb = 0.5; curves: RGE with σg = 1, RGE with σg = 0.25, and FGE.)