Practice of Epidemiology
Methods for Pooling Results of Epidemiologic Studies
The Pooling Project of Prospective Studies of Diet and Cancer
Stephanie A. Smith-Warner [1,2], Donna Spiegelman [2,3], John Ritz [2,3], Demetrius Albanes [4], W. Lawrence Beeson [5], Leslie Bernstein [6], Franco Berrino [7], Piet A. van den Brandt [8], Julie E. Buring [2,9], Eunyoung Cho [10], Graham A. Colditz [2,10,11], Aaron R. Folsom [12], Jo L. Freudenheim [13], Edward Giovannucci [1,2,10], R. Alexandra Goldbohm [14], Saxon Graham [13], Lisa Harnack [12], Pamela L. Horn-Ross [15], Vittorio Krogh [7], Michael F. Leitzmann [16], Marjorie L. McCullough [17], Anthony B. Miller [18], Carmen Rodriguez [17], Thomas E. Rohan [19], Arthur Schatzkin [16], Roy Shore [20], Mikko Virtanen [21], Walter C. Willett [1,2,10], Alicja Wolk [22], Anne Zeleniuch-Jacquotte [20], Shumin M. Zhang [2,9], and David J. Hunter [1,2,10,11]
[1] Department of Nutrition, Harvard School of Public Health, Boston, MA.
[2] Department of Epidemiology, Harvard School of Public Health, Boston, MA.
[3] Department of Biostatistics, Harvard School of Public Health, Boston, MA.
[4] Nutritional Epidemiology Branch, National Cancer Institute, Bethesda, MD.
[5] Center for Health Research, School of Medicine, Loma Linda University, Loma Linda, CA.
[6] Department of Preventive Medicine and USC/Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA.
[7] Epidemiology Unit, National Cancer Institute, Milan, Italy.
[8] Department of Epidemiology, Faculty of Health Sciences, Maastricht University, Maastricht, the Netherlands.
[9] Division of Preventive Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA.
[10] Channing Laboratory, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA.
[11] Harvard Center for Cancer Prevention, Boston, MA.
[12] Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, MN.
[13] Department of Social and Preventive Medicine, University at Buffalo, State University of New York, Buffalo, NY.
[14] Department of Epidemiology, TNO Quality of Life, Zeist, the Netherlands.
[15] Northern California Cancer Center, Fremont, CA.
[16] Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD.
[17] Epidemiology and Surveillance Research, American Cancer Society, Atlanta, GA.
[18] Department of Public Health Sciences, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada.
[19] Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY.
[20] Department of Environmental Medicine, School of Medicine, New York University, New York, NY.
[21] Department of Epidemiology and Health Promotion, National Public Health Institute, Helsinki, Finland.
[22] Division of Nutritional Epidemiology, National Institute of Environmental Medicine, Karolinska Institute, Stockholm, Sweden.
Received for publication March 22, 2005; accepted for publication December 21, 2005.
Correspondence to Dr. Stephanie Smith-Warner, Department of Nutrition, Harvard School of Public Health, 665 Huntington Avenue, Boston, MA 02115 (e-mail: pooling@hsphsun2.harvard.edu).
American Journal of Epidemiology, Vol. 163, No. 11 (Am J Epidemiol 2006;163:1053–1064)
Copyright © 2006 by the Johns Hopkins Bloomberg School of Public Health. All rights reserved; printed in U.S.A.
DOI: 10.1093/aje/kwj127
Advance Access publication April 19, 2006
With the growing number of epidemiologic publications on the relation between dietary factors and cancer risk, pooled analyses that summarize results from multiple studies are becoming more common. Here, the authors describe the methods being used to summarize data on diet-cancer associations within the ongoing Pooling Project of Prospective Studies of Diet and Cancer, begun in 1991. In the Pooling Project, the primary data from prospective cohort studies meeting prespecified inclusion criteria are analyzed using standardized criteria for modeling of exposure, confounding, and outcome variables. In addition to evaluating main exposure-disease associations, analyses are also conducted to evaluate whether exposure-disease associations are modified by other dietary and nondietary factors or vary among population subgroups or particular cancer subtypes. Study-specific relative risks are calculated using the Cox proportional hazards model and then pooled using a random- or mixed-effects model. The study-specific estimates are weighted by the inverse of their variances in forming summary estimates. Most of the methods used in the Pooling Project may be adapted for examining associations with dietary and nondietary factors in pooled analyses of case-control studies or case-control and cohort studies combined.
cohort studies; diet; epidemiologic methods; meta-analysis; neoplasms
The growing number of epidemiologic publications on the relation between diet and cancer risk has heightened the need for methods of summarizing results from multiple studies. These methods include qualitative reviews and quantitative summaries such as meta-analyses of the published literature and pooled analyses of the primary data (also called meta-analyses of individual data) (1). A general framework for conducting pooled analyses entails 1) formulating study inclusion criteria; 2) identifying all potential studies meeting these criteria; 3) obtaining each study’s primary data; 4) creating a standardized database; 5) estimating study-specific exposure-disease associations; 6) examining whether the study-specific results are heterogeneous; 7) calculating pooled estimates, if applicable; and 8) conducting sensitivity analyses to evaluate whether the estimates are robust (2). There are many advantages to reanalyzing the primary data from multiple studies rather than extracting the study-specific relative risks from published articles (1–5). In a pooled analysis, the modeling of the exposure, confounding, and outcome variables, the choice of which variables to control for, and the type of analysis conducted can be standardized, thereby removing potential sources of heterogeneity across studies. Because of larger sample sizes, pooled analyses also offer investigators the opportunity to examine uncommon exposures, rare diseases, and variation in associations among population subgroups with greater statistical power than is possible in individual studies.
The pooling of data from observational studies has become more common recently (6–13). Summary estimates have been calculated using a weighted average of the study-specific estimates (8, 9, 11) or by combining studies into a single data set for the analysis (6, 7, 10, 12, 13). In this paper, we describe the methods that are being used within the ongoing Pooling Project of Prospective Studies of Diet and Cancer (the Pooling Project), an international consortium of cohort studies with the goal of providing the best available summary of data on associations between diet and cancer (14–30). Most of these methods can also be adapted to examine associations in pooled analyses of case-control studies or both case-control and cohort studies combined.
INCLUSION CRITERIA
To maximize the quality and comparability of the studies in the Pooling Project, we formulated several inclusion criteria a priori. First, we include prospective studies that 1) had at least one publication on the relation between diet and cancer; 2) used a dietary assessment method that was of sufficient detail to calculate intakes of most nutrients, including energy, and that assessed intake over a period of months or years; and 3) assessed the validity of their dietary assessment method or a closely related instrument. Second, for each cancer site evaluated, we specify a minimum number of cases required for a study to be included in the analysis. Additional inclusion criteria may also be specified for each cancer site. Third, for each analysis, we include only those studies that assessed the specified exposure and in which participants consumed the dietary item of interest. For analyses that are going on simultaneously in the Pooling Project and the European Prospective Investigation into Cancer and Nutrition (31), we intend to coordinate analyses so that, to the extent possible, we can use similar analytic approaches and provide comparable results.
COMPONENT STUDIES
Sixteen studies (32–46) are currently included in the Pooling Project (table 1). As we become aware of new studies meeting the inclusion criteria, the investigators from those studies are invited to join the Project. The Canadian National Breast Screening Study and the Netherlands Cohort Study are each analyzed as case-cohort studies (47), because the investigators in these two studies each selected a random sample of the cohort to provide the person-time data for the cohort and have processed questionnaires for only this random sample and the cases. We divide the person-time and numbers of cases compiled during follow-up of the Nurses’ Health Study into two segments to take advantage of the expanded food frequency questionnaire administered in 1986 as compared with 1980. In this paper, we refer to the follow-up period from 1980 to 1986 as “Nurses’ Health Study A”; the follow-up period beginning in 1986 is referred to as “Nurses’ Health Study B.” Following standard survival data analysis theory, blocks of person-time in different time periods are asymptotically uncorrelated, regardless of the extent to which they are derived from the same people (48, 49). Thus, pooling of the estimates from these two time periods produces estimates and standard errors that are as valid as those from a single time period.
Data collection
The investigators in each Pooling Project study send their primary data on select variables to the Harvard School of Public Health (Boston, Massachusetts). There we inspect the data for completeness and resolve inconsistencies with the investigators of each study.
Each study used a food frequency questionnaire or diet history instrument that was designed and pretested in its specific study population or a similar population (P. L. Horn-Ross, unpublished data; V. Krogh, unpublished data; A. Wolk, unpublished data) (50–59) (table 1). Although the numbers of items included in the food frequency questionnaires varied over fivefold across the studies (table 1), the study-specific correlation coefficients comparing the food frequency questionnaire used in each cohort or a closely related instrument with multiple dietary records or 24-hour recalls generally exceed 0.40 for total fat, dietary fiber, and several micronutrients (P. L. Horn-Ross, unpublished data; V. Krogh, unpublished data; A. Wolk, unpublished data) (50–59) (table 2).
Information on nondietary risk factors was collected at baseline in each study using self-administered questionnaires. For measured covariates, the proportion of missing data for nondietary risk factors is generally low across studies (table 3). The exception is the Swedish Mammography Cohort, in which some covariate information was available for only one of the two counties in the study.
Case ascertainment
Incident cancer diagnoses are identified through follow-up questionnaires, with subsequent medical record review (37, 44, 46), linkage with cancer registries (32, 36, 39–42, 45), or both (33–35, 38, 43). In addition, investigators in some studies ascertain incident and/or fatal outcomes using mortality registries (32, 34, 35, 37–39, 41–46). Case ascertainment has generally been estimated to be greater than 90 percent in each study (table 1).
STATISTICAL APPROACHES AND RATIONALE
For each cohort, after applying the exclusion criteria used in that study, we further exclude participants who reported $\log_e$-transformed energy intakes beyond three standard deviations from the study-specific $\log_e$-transformed mean energy intake of the baseline population (or subcohort, for the case-cohort studies) or who reported a history of cancer (except nonmelanoma skin cancer) at baseline. Additional exclusion criteria may be applied for analyses of specific cancer sites. Because many cancers appear to have hormonal antecedents and because lifestyle factors may differ between women and men, studies including both women and men are split into two studies: a cohort of women and a cohort of men. This conservative approach, in which all estimates are calculated separately for women and men in those studies including both genders, allows for potential effect modification by sex for every determinant of the outcome.
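The energy-intake exclusion described above can be illustrated with a minimal Python sketch. It is not the Pooling Project's code: the function name, the choice of the sample standard deviation, and the example intakes are assumptions made purely for illustration.

```python
import math
import statistics


def within_energy_limits(energy_intakes, n_sd=3.0):
    """Return indices of participants whose log_e energy intake lies within
    n_sd standard deviations of the study-specific log_e mean.

    Assumption: the sample standard deviation is used; the text does not
    state which estimator was applied.
    """
    logs = [math.log(e) for e in energy_intakes]
    mean_log = statistics.fmean(logs)
    sd_log = statistics.stdev(logs)
    return [i for i, v in enumerate(logs) if abs(v - mean_log) <= n_sd * sd_log]


# Hypothetical baseline intakes (kcal/day): 30 plausible reports followed by
# one implausibly large report.
intakes = [1800 + 20 * i for i in range(30)] + [60000]
kept = within_energy_limits(intakes)
```

In this made-up cohort only the last participant falls outside the three-standard-deviation window and would be excluded.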
Follow-up time is calculated for each participant from the date on which his/her baseline questionnaire was returned to the date of diagnosis of the specific cancer being examined, the date of death, the date on which the participant moves out of the study area (if applicable), or the end of follow-up, whichever comes first.
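A small Python sketch of this rule follows. The function name, argument names, and the conversion of days to years by dividing by 365.25 are illustrative assumptions and are not taken from the Pooling Project's programs.

```python
from datetime import date


def follow_up(baseline, study_end, diagnosis=None, death=None, moved_out=None):
    """Return (years of follow-up, case indicator) for one participant.

    Follow-up ends at the earliest of diagnosis, death, move out of the study
    area, or end of follow-up. Dates passed as None did not occur.
    """
    end = min(d for d in (diagnosis, death, moved_out, study_end) if d is not None)
    years = (end - baseline).days / 365.25  # assumed day-to-year conversion
    is_case = diagnosis is not None and diagnosis == end
    return years, is_case


# Hypothetical participant diagnosed four years after returning the questionnaire.
years, is_case = follow_up(
    baseline=date(1986, 1, 1),
    study_end=date(2000, 1, 1),
    diagnosis=date(1990, 1, 1),
    death=date(1992, 6, 1),
)

# Hypothetical participant censored on moving out of the study area.
years_censored, is_case_censored = follow_up(
    baseline=date(1986, 1, 1),
    study_end=date(2000, 1, 1),
    moved_out=date(1995, 1, 1),
)
```

A participant contributes person-time only up to the first of these dates and counts as a case only if that first date is the diagnosis.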
In our analyses, we create standardized categories for most confounding variables across studies. We create a missing-data indicator variable for missing responses for each measured confounder in a study, if applicable. As long as 1) the association between the confounding variable and the exposure of interest is weak, or the association between the confounding variable and the outcome is weak, or the confounding variable has little variability in the study and 2) the percentage of missing data within the study is low, the use of the missing-data indicator method is likely to improve efficiency without introducing appreciable bias in comparison with the complete case method (60, 61). As table 3 shows, the proportion of missing data for each covariate across studies is generally low, satisfying one of the conditions for valid use of the missing-data indicator method. In addition, potentially confounding factors generally have had moderate-to-weak associations with the cancer sites we have examined and have had low-to-moderate correlations with the dietary exposures that are of primary interest in the Pooling Project. Information on age, which is typically the strongest measured risk factor for cancer incidence, is never missing in the constituent studies.
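To make the missing-data indicator method concrete, the following Python sketch encodes a categorical confounder as 0/1 indicators with one extra indicator for missing responses. The function name, the category labels, and the choice of reference category are hypothetical.

```python
def indicator_columns(values, categories, reference):
    """Encode a categorical confounder as 0/1 indicator variables.

    The reference category gets no column. Missing responses (None) are
    flagged in a separate 'missing' indicator and are zero on all other
    indicators, so those participants stay in the model.
    """
    levels = [c for c in categories if c != reference]
    rows = []
    for v in values:
        row = {c: int(v == c) for c in levels}
        row["missing"] = int(v is None)
        rows.append(row)
    return rows


# Hypothetical smoking-status responses; None marks a missing answer.
smoking = ["never", "current", None, "past"]
rows = indicator_columns(smoking, ["never", "past", "current"], reference="never")
```

By contrast, a complete case analysis would drop the third participant entirely.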
Two-stage analysis
Our analytic approach generally is a two-stage process. In the first step, we calculate study-specific relative risks using the Cox proportional hazards model (49), defined through the hazard function h by

$$h_{jks}(t \mid u_{is}, x_{is}) = h_{0jks}(t)\exp(\alpha_s u_{is} + \beta_s x_{is}) \qquad (1)$$

for $s = 1, \ldots, S$, where s is the study number, t is follow-up time, $u_{is}$ and $x_{is}$ are the study-specific confounding and exposure variables, respectively, for individual i in study s, and $h_{0jks}(t)$ is the baseline incidence rate at age j (in years), in calendar year k, and for time since entry into the study t. The estimated study-specific log relative risks for a one-unit increase in the exposures, $x_{is}$, are given by the $\beta_s$. The study-specific log relative risks for a one-unit increase in the confounding variables, $u_{is}$, are given by the $\alpha_s$. Stratifying jointly by age at baseline (years) and the year in which the baseline questionnaire was returned (indexed by j and k, respectively) and treating follow-up time (in years) as the time metric in the Cox model is equivalent to treating age as the time metric in the Cox model and stratifying jointly on calendar time (in years) and duration of time in the study, with one exception: There is a difference in which two-way interactions are allowed. With our approach, no assumptions are made about the shape of the age or calendar-year incidence curves, each of which is fully adjusted for the other, and arbitrary two-way interactions of the joint dependency of the outcome on age and calendar time are allowed. Each case-cohort study is analyzed using EPICURE software (HiroSoft International Corporation, Seattle, Washington) (47, 62); each remaining study is analyzed using SAS PROC PHREG (SAS Institute, Inc., Cary, North Carolina) (63).
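The role of stratification in equation 1 can be illustrated with a short Python sketch of a stratified partial log-likelihood for a single exposure. This is a teaching sketch under stated assumptions (one covariate, Breslow handling of ties, a hypothetical record layout). It does not reproduce the EPICURE or SAS analyses.

```python
import math
from collections import defaultdict


def stratified_cox_loglik(beta, records):
    """Partial log-likelihood for one covariate with stratum-specific baselines.

    records: iterable of (stratum, time, event, x), where stratum could be a
    tuple such as (age at baseline, questionnaire year). Ties are handled with
    the Breslow approximation (an assumption of this sketch).
    """
    by_stratum = defaultdict(list)
    for stratum, time, event, x in records:
        by_stratum[stratum].append((time, event, x))

    loglik = 0.0
    for subjects in by_stratum.values():
        for time, event, x in subjects:
            if not event:
                continue
            # Risk set: members of the same stratum still under follow-up.
            denom = sum(math.exp(beta * xj) for tj, _, xj in subjects if tj >= time)
            loglik += beta * x - math.log(denom)
    return loglik


# Hypothetical data: two strata keyed by (age at baseline, questionnaire year).
records = [
    ((55, 1986), 1.0, True, 1.0),
    ((55, 1986), 2.0, False, 0.0),
    ((55, 1986), 3.0, True, 0.0),
    ((56, 1986), 1.0, True, 0.0),
    ((56, 1986), 2.0, True, 1.0),
]
loglik_at_zero = stratified_cox_loglik(0.0, records)
```

Because each risk set is formed within a stratum, the baseline hazard for every age-by-year combination cancels out of the likelihood, which is why no shape needs to be assumed for it.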
If case-control studies were included in our pooled analyses, the model for these studies would be similar to equation 1, except that we would stratify the participants by
TABLE 1. Characteristics of the studies included in the Pooling Project of Prospective Studies of Diet and Cancer, 1991–2004
| Study | Study population | Location | Study dates | Baseline cohort size*, women | Baseline cohort size*, men | Age (years) at baseline | Food frequency questionnaire/diet history instrument: no. of items | Time frame | Components measured | Outcome ascertainment | Estimated case ascertainment rate (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Adventist Health Study (33) | Non-Hispanic White men and women living in Seventh-Day Adventist households | California, United States | 1976–1982 | 18,403 | 12,896 | >24 | 46 | Past year | Frequency | FQs†/MRR†; cancer registry; mortality registry | >99 |
| Alpha-Tocopherol, Beta-Carotene Cancer Prevention Study (34) | Male smokers who participated in a randomized double-blind placebo-controlled clinical trial of α-tocopherol and β-carotene supplement use | Southwestern Finland | 1985 onward (ongoing) | 0 | 26,987 | 50–69 | 276 | Past year | Frequency and portion size | FQs/MRR; cancer registry; mortality registry | 100 |
| Breast Cancer Detection Demonstration Project Follow-up Cohort (35) | Subset of women participating in a breast cancer screening program in 1973–1980 who had been diagnosed with breast cancer or had undergone or been recommended to receive a breast biopsy, plus a random sample of the remaining women who had been screened | United States | 1987 onward (ongoing) | 41,987 | 0 | 40–93 | 62 | Past year | Frequency and portion size | FQs/MRR; cancer registry; mortality registry | 91 |
| California Teachers Study (45) | Active and retired female teachers and administrators participating in the California State Teachers Retirement System | California, United States | 1995 onward (ongoing) | 100,036 | 0 | 21–103 | 103 | Past year | Frequency and portion size | Cancer registry; mortality registry | >97‡ |
| Canadian National Breast Screening Study (36) | Women who participated in a multicenter randomized controlled trial of mammography screening for female breast cancer | Canada | 1980 onward (ongoing) | 56,837 | 0 | 40–59 | 86 | Past month | Frequency and portion size | Cancer registry | 100 |
| Cancer Prevention Study II Nutrition Cohort (38) | Subset of men and women participating in Cancer Prevention Study II who completed a diet questionnaire in 1992 | United States | 1992 onward (ongoing) | 74,053 | 66,090 | 50–74 | 68 | Past year | Frequency and portion size | FQs/MRR; cancer registry; mortality registry | >90 |
| Health Professionals Follow-up Study (37) | Male dentists, optometrists, osteopathic physicians, podiatrists, pharmacists, and veterinarians | United States | 1986 onward (ongoing) | 0 | 47,673 | 40–75 | 131 | Past year | Frequency of specified portions | FQs/MRR; mortality registry | >94 |
| Iowa Women’s Health Study (41) | Postmenopausal women selected randomly from the 1985 Department of Transportation’s driver’s license list in Iowa | Iowa, United States | 1986 onward (ongoing) | 34,603 | 0 | 55–69 | 116 | Past year | Frequency of specified portions | Cancer registry; mortality registry | 98§ |
| Netherlands Cohort Study (40) | Men and women from 204 municipal population registries throughout the Netherlands | The Netherlands | 1986 onward (ongoing) | 62,573 | 58,279 | 55–69 | 150 | Past year | Frequency and portion size | Cancer registry; pathology database | >95 |
| New York State Cohort (42) | Male and female residents who had had the same address and telephone number for the previous 18 years | New York, United States | 1980–1987 | 22,550 | 30,363 | 50–93 | 45 | Past year | Frequency | Cancer registry | ¶ |
| New York University Women’s Health Study (43) | Women visiting a breast screening clinic who had not used any hormonal medications or been pregnant in the previous 6 months | New York, United States | 1985 onward (ongoing) | 13,258 | 0 | 34–65 | 71 | Past year | Frequency and portion size | FQs/MRR; cancer registry; mortality registry | 95 |
| Nurses’ Health Study A (37) | Female registered nurses | United States | 1980–1986 | 88,651 | 0 | 34–59 | 61 | Past year | Frequency of specified portions | FQs/MRR; mortality registry | >94 |
| Nurses’ Health Study B (37) | Female registered nurses | United States | 1986 onward (ongoing) | 68,540 | 0 | 40–65 | 131 | Past year | Frequency of specified portions | FQs/MRR; mortality registry | >94 |
| Nurses’ Health Study II (46) | Female registered nurses | United States | 1991 onward (ongoing) | 93,894 | 0 | 26–46 | 133 | Past year | Frequency of specified portions | FQs/MRR; mortality registry | >90 |
| Prospective Study on Hormones, Diet and Breast Cancer (39) | Female volunteers recruited from the general population using mass media advertising and from breast cancer prevention units | Varese Province, Italy | 1987 onward (ongoing) | 9,027 | 0 | 35–69 | 177 | Past year | Frequency and portion size | Cancer registry; mortality registry; admissions and discharge reports; pathology database | >97 |
| Swedish Mammography Cohort (32) | Women who participated in a population-based mammography screening program | Västmanland and Uppsala counties, Sweden | 1987 onward (ongoing) | 61,463 | 0 | 40–74 | 67 | Past 6 months | Frequency | Cancer registry | 98 |
| Women’s Health Study (44) | Female health professionals who participated in a randomized, double-blind, placebo-controlled trial of low-dose aspirin, β-carotene, and vitamin E use | United States | 1993 onward (ongoing) | 38,384 | 0 | 45 | 131 | Past year | Frequency of specified portions | FQs/MRR | 96 |
* The baseline cohort size corresponds to the number of participants in the Pooling Project database for the renal cell cancer analyses in the California Teachers Study (45) and for the colorectal cancer analyses in the remaining studies.
† FQs, follow-up questionnaires; MRR, medical record review.
‡ For California residents only.
§ For Iowa residents only.
¶ Cancer outcomes in the New York State Cohort (42) were identified through linkage with a cancer registry; thus, it is difficult to determine the follow-up rate in the cohort. When a subset of the cohort was followed intensively, loss to follow-up was not related to exposure.
TABLE 2. Correlation coefficients (CCs) for nutrient intakes estimated using a food frequency questionnaire versus a comparison method for studies in the Pooling Project of Prospective Studies of Diet and Cancer, 1991–2004*
Nutrient columns, in order: Total fat; Saturated fat; Monounsaturated fat; Polyunsaturated fat; Dietary fiber; Alcohol; Vitamin A†; Vitamin C†; Vitamin E†; Folate†; Calcium†
| Study | Sex | No. of participants | Comparison method | Type of CC | CCs, left to right in nutrient column order (blank cells not shown) |
|---|---|---|---|---|---|
| Adventist Health Study (50) | Women | 103 | Five 24-hour recalls over 6 months | Spearman CCs‡ | 0.40, 0.45, 0.41, 0.26, 0.47§ |
| Adventist Health Study (50) | Men | 44 | Five 24-hour recalls over 6 months | Spearman CCs‡ | 0.38, 0.57, 0.29, 0.15, 0.50§ |
| Alpha-Tocopherol, Beta-Carotene Cancer Prevention Study (51) | Men | 178 | Twelve 2-day diet records over 6 months | Energy-adjusted, deattenuated Pearson CCs | 0.75, 0.79, 0.68, 0.85, 0.82, 0.85, 0.68, 0.71, 0.82, 0.74 |
| California Teachers Study (unpublished) | Women | 185 | Four 24-hour recalls over 10 months | Energy-adjusted, deattenuated Pearson CCs | 0.64, 0.82, 0.41, 0.23, 0.77, 0.82, 0.35¶, 0.62¶, 0.82¶, 0.73¶, 0.30¶ |
| Canadian National Breast Screening Study (52) | Women | 108 | 7-day diet records | Energy-adjusted Pearson CCs | 0.44, 0.61, 0.43#, 0.40‡‡, 0.60, 0.59, 0.60, 0.53, 0.67 |
| Cancer Prevention Study II Nutrition Cohort (53) | Women | 188‡ | Four 24-hour recalls over 1 year | Energy-adjusted, deattenuated Pearson CCs | 0.66, 0.66, 0.58#, 0.42‡‡, 0.61, 0.77, 0.65, 0.27, 0.43, 0.66 |
| Cancer Prevention Study II Nutrition Cohort (53) | Men | 229‡ | Four 24-hour recalls over 1 year | Energy-adjusted, deattenuated Pearson CCs | 0.58, 0.64, 0.61#, 0.48‡‡, 0.64, 0.82, 0.65, 0.23, 0.51, 0.57 |
| Health Professionals Follow-up Study (54) | Men | 127 | Two 7-day diet records over 6 months | Energy-adjusted, deattenuated Pearson CCs | 0.67, 0.75, 0.68, 0.37, 0.68, 0.86‡,**, 0.61, 0.77, 0.42, 0.70, 0.60 |
| Iowa Women’s Health Study (55) | Women | 44 | Five 24-hour recalls over 2 months | Energy-adjusted Pearson CCs | 0.62, 0.59, 0.62, 0.43, 0.24§, 0.32, 0.14, 0.53, 0.79, 0.26, 0.49 |
| Netherlands Cohort Study (56) | Women and men | 109 | Three 3-day diet records over 1 year | Energy- and sex-adjusted deattenuated Pearson CCs | 0.53, 0.58††, 0.80, 0.79, 0.86, 0.76, 0.58, 0.66 |
| New York State Cohort (unpublished) | Women | 190 | Simulated study | Energy-adjusted, deattenuated Pearson CCs‡ | 0.21, 0.18, 0.41, 0.25, 0.53, 0.16 |
| New York State Cohort (57) | Men | 127 | Simulated study | Energy-adjusted, deattenuated Pearson CCs | 0.57, 0.60, 0.61, 0.22, 0.65, 0.39, 0.76, 0.46, 0.60 |
| Nurses’ Health Study A (58) | Women | 173 | Four 7-day diet records over 1 year | Energy-adjusted Pearson CCs | 0.53, 0.59, 0.48, 0.58§, 0.90‡,**, 0.36, 0.66 |
| Nurses’ Health Study B (59) | Women | 191 | Two 7-day diet records over 1 year | Energy-adjusted, deattenuated Pearson CCs | 0.57, 0.68, 0.58, 0.48, 0.79, 0.76, 0.75 |
study-specific matching factors. Since the Cox model and the conditional logistic regression model produce algebraically identical log-(partial)-likelihood functions, SAS PROC PHREG could also be used for case-control studies to estimate study-specific odds ratios and their standard errors.
The second step consists of pooling the study-specific relative risks using a random-effects model (64–66) given by

$$\hat{\beta}_s = \beta + b_s + \varepsilon_s, \qquad (2)$$

where the $\hat{\beta}_s$ are the estimated study-specific exposure-disease effects, $\beta$ is the underlying common exposure-disease association, $b_s$ are the random between-studies effects, and $\varepsilon_s$ are the within-study errors. Both $b_s$ and $\varepsilon_s$ are assumed to be independent and asymptotically normally distributed with means of zero and variances of $\sigma^2_B$ and $\sigma^2_s$, respectively, and $\sigma^2_s = \mathrm{Var}(\hat{\beta}_s)$. The study-specific exposure-disease effects are weighted by the inverse of their variances using

$$\hat{\beta} = \sum_{s=1}^{S} w_s \hat{\beta}_s,$$

where

$$w_s = (\hat{\sigma}^2_B + \hat{\sigma}^2_s)^{-1} \Big/ \sum_{t=1}^{S} (\hat{\sigma}^2_B + \hat{\sigma}^2_t)^{-1}.$$

When the exposure variable is categorized into different levels, we calculate a pooled relative risk for each category separately.
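The weighting scheme above can be sketched in a few lines of Python. The function and variable names are hypothetical, and the between-studies variance is taken as a given input because the text does not spell out how it is estimated. The standard error uses the usual result for an inverse-variance weighted mean, which is an addition of this sketch and is not quoted from the text.

```python
import math


def pool_random_effects(betas, variances, tau2):
    """Pool study-specific log relative risks with random-effects weights.

    betas:     estimated study-specific log relative risks
    variances: their estimated within-study variances
    tau2:      estimated between-studies variance (supplied by the caller)
    """
    raw = [1.0 / (tau2 + v) for v in variances]
    total = sum(raw)
    weights = [w / total for w in raw]  # normalized weights w_s, summing to 1
    beta = sum(w * b for w, b in zip(weights, betas))
    se = math.sqrt(1.0 / total)  # standard result for the weighted mean
    return beta, se, weights


# Hypothetical estimates from two studies with equal precision.
beta, se, weights = pool_random_effects([0.1, 0.3], [0.04, 0.04], tau2=0.0)
pooled_rr = math.exp(beta)
ci_95 = (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))
```

With `tau2` set to zero the weights reduce to ordinary fixed-effects inverse-variance weights. A larger `tau2` pulls the weights toward equality across studies.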
We test for the statistical significance of between-studies heterogeneity among the study-specific exposure-disease estimates using the Q test statistic given by

$$Q = \sum_{s=1}^{S} w^*_s (\hat{\beta}_s - \hat{\beta})^2, \qquad (3)$$

where $w^*_s = \widehat{\mathrm{Var}}(\hat{\beta}_s)^{-1}$. The Q test statistic follows an approximate $\chi^2_{S-1}$ distribution (66, 67).
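A minimal Python sketch of the heterogeneity statistic is given below. Two points are assumptions of the sketch, since the text does not state them: the deviations are centered on the inverse-variance (fixed-effects) weighted mean, as in Cochran's Q, and the between-studies variance is obtained with the DerSimonian-Laird moment estimator. All names are hypothetical.

```python
def q_statistic(betas, variances):
    """Return (Q, degrees of freedom) for study-specific estimates.

    Assumption: deviations are taken from the inverse-variance weighted mean.
    """
    w = [1.0 / v for v in variances]
    center = sum(wi * bi for wi, bi in zip(w, betas)) / sum(w)
    q = sum(wi * (bi - center) ** 2 for wi, bi in zip(w, betas))
    return q, len(betas) - 1


def dersimonian_laird_tau2(betas, variances):
    """Moment estimate of the between-studies variance, truncated at zero.

    Assumption: the DerSimonian-Laird estimator is used for illustration.
    """
    q, df = q_statistic(betas, variances)
    w = [1.0 / v for v in variances]
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    return max(0.0, (q - df) / c)


# Hypothetical log relative risks from two equally precise studies.
q, df = q_statistic([0.0, 0.4], [0.01, 0.01])
tau2 = dersimonian_laird_tau2([0.0, 0.4], [0.01, 0.01])
```

The value of Q would then be referred to a chi-square distribution with `df` degrees of freedom to obtain the p value for heterogeneity.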
For the exposures of interest, we generally categorize participants into study-specific quantiles. Because the quantile approach does not take into account true differences in the distribution of population intakes across studies, we also create categories defined by identical absolute intake cutpoints across studies. Misclassification can also occur in the analyses based on identical absolute intake cutpoints, because reported intakes may differ across studies based on differences in the dietary assessment methods used. However, when possible we adjust our results for measurement error in the individual studies.
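The two categorization schemes can be contrasted with a short Python sketch. The intake values, the absolute cutpoints, the convention that a value equal to a cutpoint falls in the higher category, and the default quantile method of Python's `statistics.quantiles` are all assumptions made for illustration.

```python
import statistics


def quantile_cutpoints(intakes, n_groups=5):
    """Study-specific cutpoints that split intakes into n_groups quantiles."""
    return statistics.quantiles(intakes, n=n_groups)


def categorize(value, cutpoints):
    """Return the 0-based category index for value.

    Assumption: a value equal to a cutpoint is placed in the higher category.
    """
    return sum(value >= c for c in cutpoints)


# Hypothetical intakes (g/day) in two studies with different distributions.
study_a = [float(v) for v in range(1, 101)]
study_b = [float(v) for v in range(51, 151)]

# Study-specific quintiles: each study gets its own four cutpoints.
cuts_a = quantile_cutpoints(study_a)
cuts_b = quantile_cutpoints(study_b)

# Identical absolute cutpoints applied to every study.
absolute_cuts = [25.0, 50.0, 75.0, 100.0]
category_in_a = categorize(60.0, absolute_cuts)
category_in_b = categorize(60.0, absolute_cuts)
```

An intake of 60 g/day falls in the same absolute category in both hypothetical studies, whereas its quintile rank would differ between them because the study-specific cutpoints differ.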
Aggregated analysis
We can also conduct analyses in which the data from all
studies are combined into one data set (referred to as an
aggregated analysis). A single exposure-disease effect is
then calculated using the Cox proportional hazards model,
including stratification by study, age at baseline, and the
year in which the baseline questionnaire was returned.
TABLE 2. Continued
Nutrient columns, in order: Total fat; Saturated fat; Monounsaturated fat; Polyunsaturated fat; Dietary fiber; Alcohol; Vitamin A†; Vitamin C†; Vitamin E†; Folate†; Calcium†
| Study | Sex | No. of participants | Comparison method | Type of CC | CCs, left to right in nutrient column order (blank cells not shown) |
|---|---|---|---|---|---|
| Prospective Study on Hormones, Diet and Breast Cancer (unpublished) | Women | 104 | Fourteen 24-hour recalls over 1 year | Spearman CCs | 0.39, 0.49, 0.40, 0.42, 0.53, 0.88, 0.19, 0.37, 0.37, 0.32, 0.48 |
| Swedish Mammography Cohort (unpublished) | Women | 129 | Four 7-day diet records over 1 year | Energy-adjusted, deattenuated Pearson CCs | 0.49, 0.42, 0.51, 0.36, 0.54, 0.85, 0.49, 0.33, 0.29, 0.48, 0.48 |
| Median | | | | | 0.55, 0.60, 0.55, 0.40, 0.64, 0.84, 0.45, 0.65, 0.37, 0.46, 0.60 |
* A blank cell means that the investigators did not evaluate this nutrient in their validation study. The studies not mentioned in this table used questionnaires that were very similar to those that had been validated previously by other investigators. The Breast Cancer Detection Demonstration Project Follow-up Cohort (35) and the New York University Women’s Health Study (43) both used food frequency questionnaires similar to the food frequency questionnaire used in the Cancer Prevention Study II Nutrition Cohort (53). Nurses’ Health Study II (46) and the Women’s Health Study (44) both used food frequency questionnaires similar to the food frequency questionnaire used in Nurses’ Health Study B (59).
† Intake from foods only; supplemental intake was not included.
‡ The data presented were calculated using the validation study data that were sent to the Pooling Project.
§ Crude fiber.
¶ Correlations among nonusers of supplements only (n = 44).
# Oleic acid.
** Spearman CC.
†† Energy- and sex-adjusted Pearson CC.
‡‡ Linoleic acid.
Although combining the data from all studies is one way to take advantage of differences in the distributions of the exposure variable across studies, it assumes that the exposure was measured in comparable ways across studies. Because the distributions of dietary variables may differ across studies due to true differences in actual intake and due to differences in the dietary assessment methods used (and other study-specific sources of error), this assumption may not be reasonable, except for nutrients that come from a small number of food sources (e.g., alcohol). In addition, combining the studies into one data set assumes that there is no between-studies heterogeneity in the associations of the outcome with the exposure or any of the covariates. In the few instances where we have conducted both pooled and aggregated analyses, the results have been essentially identical (16, 25, 30). Nevertheless, because it is difficult to test the underlying assumptions, we have opted to use two-stage analyses as our primary analytic strategy.
Trend analysis
To test the significance of trends in disease risk over exposure categories, we conduct separate analyses in which participants are assigned the study-specific median value of their respective category (given by $\mathrm{med}_{js}$ for $j = 1, \ldots, J$, where $J$ is the number of levels in which the exposure variable is categorized). For each study, we fit a Cox proportional hazards model with regression terms $\beta_s z_{is}$ for $s = 1, \ldots, S$, where $s$ is the study number and $z_{is}$ takes on the values $\mathrm{med}_{js}$ corresponding to the category in which the individual's exposure value falls. We then compute the pooled estimate for the regression coefficient for trend using a random-effects model (64–66). The pooled test for trend is a Wald test of the hypothesis $H_0\colon \beta = 0$. We test for the statistical significance of between-studies heterogeneity among the study-specific regression coefficients using the Q test statistic (66, 67).
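The pooling step described above can be illustrated with a minimal sketch. It assumes the DerSimonian-Laird moment estimator (66) for the between-studies variance; the function name `pool_random_effects` and the inputs are hypothetical and are not taken from the Pooling Project's software. The function returns the pooled trend coefficient, its standard error, the two-sided Wald p value for H0: β = 0, and Cochran's Q with its degrees of freedom.

```python
import math


def pool_random_effects(betas, ses):
    """Pool study-specific log relative risks (illustrative DerSimonian-Laird sketch).

    betas: study-specific trend coefficients (log relative risk per unit of exposure)
    ses:   their estimated standard errors (assumed positive)
    """
    if len(betas) != len(ses) or len(betas) < 2:
        raise ValueError("need at least two studies with matching inputs")

    # Inverse-variance (fixed-effect) weights and pooled estimate, needed for Q.
    w = [1.0 / se ** 2 for se in ses]
    sum_w = sum(w)
    beta_fixed = sum(wi * b for wi, b in zip(w, betas)) / sum_w

    # Cochran's Q statistic for between-studies heterogeneity (df = S - 1).
    q = sum(wi * (b - beta_fixed) ** 2 for wi, b in zip(w, betas))
    df = len(betas) - 1

    # Moment estimate of the between-studies variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]
    sum_w_star = sum(w_star)
    beta = sum(wi * b for wi, b in zip(w_star, betas)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)

    # Two-sided Wald test of H0: beta = 0 using the standard normal distribution.
    z = beta / se
    p_wald = math.erfc(abs(z) / math.sqrt(2.0))

    return {"beta": beta, "se": se, "z": z, "p_wald": p_wald,
            "tau2": tau2, "q": q, "df": df}
```

Exponentiating `beta` gives the pooled relative risk per unit of exposure. A study with a large standard error receives little weight, and when Q does not exceed its degrees of freedom the estimated between-studies variance is zero and the result coincides with the fixed-effect estimate.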
We also evaluate whether associations between dietary factors and cancer risk are linear by comparing nonparametric regression curves using restricted cubic splines with the linear model using the likelihood ratio test, and by visual inspection of the restricted cubic spline graphs (68, 69). For these analyses, the studies are combined into a single data set stratified by study.
Evaluation of heterogeneity of effects
An advantage of a pooled analysis is the ability to evaluate whether the exposure-disease association is modified by other risk factors. In these analyses, if the exposure-disease association is log-linear and the potential effect modifier is an ordinal or binary variable, we first compute estimates of the exposure-disease association and their standard errors for each study within each category of the potential effect modifier. The model uses the same format as equation 1, but

$$h_{jksl}(t \mid u_{is}, x_{is}) = h_{0jksl}(t)\exp(\alpha_{sl} u_{is} + \beta_{sl} x_{is}), \tag{4}$$

where $l = 1, \ldots, L$ levels of the effect modifier, $x_{is}$ is the study-specific exposure variable, and $\alpha_{sl}$ are the estimated study-specific log relative risks for a one-unit increase in the confounding variables, $u_{is}$. The study-specific estimates $\hat{\beta}_{sl}$ for each stratum are then pooled across studies and exponentiated to obtain the relative risk for each level of the potential effect modifier. For assessment of the statistical significance of the interaction, the Cox proportional hazards model is

$$h_{jks}(t \mid u_{is}, x_{is}, m_{is}) = h_{0jks}(t)\exp(\alpha_{s} u_{is} + \eta_{s} m_{is} + \beta_{s} x_{is} + \gamma_{s} x_{is} m_{is}), \tag{5}$$

where $\gamma_s$ is the study-specific estimate for the cross-product term of the potential effect modifier variable ($m_{is}$) times the exposure variable ($x_{is}$) and $\eta_s$ is the study-specific main effect of the effect modifier. The study-specific estimates $\hat{\gamma}_s$ are then pooled across studies, and the p value corresponding to the test for interaction ($H_0\colon \gamma = 0$) is obtained from a Wald test based upon the pooled $\hat{\gamma}$.

TABLE 3. Prevalences of missing data for select nondietary factors across studies in the Pooling Project of Prospective Studies of Diet and Cancer, 1991–2004

Factor                             No. of studies in     % of missing data     Studies with <5%
                                   which the factor      across studies        missing data
                                   was measured*         (range)               No.      %
Age                                17                    0                     17       100
Education                          17†                   0–23                  14       82
Body mass index                    17                    0–8                   15       88
Smoking status                     15                    0–5                   15       100
Physical activity                  14                    0–12                  11       79
Multivitamin use                   14                    0–8                   10       71
Age at menarche‡                   14                    0–3                   14       100
Parity‡                            15                    0–10                  13       87
Menopausal status‡                 14                    0–18                  8        57
Oral contraceptive use‡            13                    0–21                  11       85
Postmenopausal hormone use‡,§,¶    13                    0–16                  12       92

* For this table, Nurses’ Health Study A (1980–1986) and Nurses’ Health Study B (1986–present) were counted as two separate studies (see Materials and Methods).
† All participants in the California Teachers Study, the Health Professionals Follow-up Study, the Nurses’ Health Study, and Nurses’ Health Study II were assumed to have received additional education after graduating from high school, because these populations were selected on the basis of their employment in occupations requiring a post-high-school education.
‡ Only cohort studies including women are included here. The prevalence of missing data was calculated only among the female participants.
§ For the Swedish Mammography Cohort, only the percentage of missing data for women living in Uppsala County is included, since these data were not collected for women living in Vastmanland County.
¶ Among postmenopausal women only.
We use a mixed-effects meta-regression model (70) to test for effect modification when the exposure-disease association is nonlinear, when the potential effect modifier is a polytomous nominal variable, or when effect modification can be assessed only between studies. As an example, consider the test for effect modification by gender. The model here is a slightly modified version of equation 2:

$$\hat{\beta}_s = \beta_0 + \beta_1 z_s + b_s + e_s \tag{6}$$

for $s = 1, \ldots, S$, where $s$ is the study number, the $\hat{\beta}_s$ are the estimated study-specific exposure-disease effects, $\beta_0$ is the log relative risk for the exposure in the reference level of the modifier (here, men), $\beta_1$ is the difference in the log relative risks between the reference level and each of the other levels (here, between genders), $z_s = 1$ if study $s$ is carried out among women and $z_s = 0$ if it is carried out among men, $b_s$ are the study-specific random effects, and $e_s$ are the within-study sampling errors. The Wald test statistic based on the estimate $\hat{\beta}_1$ and its standard error is used to test the null hypothesis ($H_0\colon \beta_1 = 0$) that there is no modification of the effect of exposure on the outcome by levels of the potential effect modifier (here, between genders).
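A between-studies test of this kind can be sketched as follows for a single binary study-level modifier. This is an illustration only: it swaps in a method-of-moments estimate of the between-studies variance rather than the likelihood-based linear mixed-effects fit of reference 70, and the function name `meta_regression_binary` and its inputs are hypothetical.

```python
import math


def meta_regression_binary(betas, ses, z):
    """Weighted meta-regression of study-specific log RRs on one study-level covariate.

    betas: study-specific log relative risks
    ses:   their standard errors (assumed positive)
    z:     study-level covariate, e.g., 1 = women, 0 = men
    """
    def wls(weights):
        # Closed-form weighted least squares for intercept b0 and slope b1.
        sw = sum(weights)
        swx = sum(w * x for w, x in zip(weights, z))
        swxx = sum(w * x * x for w, x in zip(weights, z))
        swy = sum(w * y for w, y in zip(weights, betas))
        swxy = sum(w * x * y for w, x, y in zip(weights, z, betas))
        det = sw * swxx - swx ** 2
        if det <= 0:
            raise ValueError("covariate must vary across studies")
        b0 = (swxx * swy - swx * swxy) / det
        b1 = (sw * swxy - swx * swy) / det
        return b0, b1, sw, swx, swxx, det

    k = len(betas)
    w = [1.0 / se ** 2 for se in ses]
    b0, b1, sw, swx, swxx, det = wls(w)

    # Residual heterogeneity statistic and its moment-based variance estimate.
    q_e = sum(wi * (y - b0 - b1 * x) ** 2 for wi, y, x in zip(w, betas, z))
    sw2 = sum(wi ** 2 for wi in w)
    sw2x = sum(wi ** 2 * x for wi, x in zip(w, z))
    sw2xx = sum(wi ** 2 * x * x for wi, x in zip(w, z))
    trace_p = sw - (swxx * sw2 - 2.0 * swx * sw2x + sw * sw2xx) / det
    tau2 = max(0.0, (q_e - (k - 2)) / trace_p) if trace_p > 0 else 0.0

    # Refit with random-effects weights and test H0: beta_1 = 0 with a Wald test.
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]
    b0, b1, sw, swx, swxx, det = wls(w_star)
    se_b1 = math.sqrt(sw / det)
    z_stat = b1 / se_b1
    p = math.erfc(abs(z_stat) / math.sqrt(2.0))
    return {"beta0": b0, "beta1": b1, "se_beta1": se_b1, "p": p, "tau2": tau2}
```

Here `beta1` estimates the difference in log relative risks between the two groups of studies, so exp(`beta1`) is the ratio of the two pooled relative risks.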
Assessment of heterogeneity by outcome subtype
We can also evaluate whether associations differ by cancer subtype. For these analyses, we fit separate Cox proportional hazards models (equation 1) for each subtype. Occurrences of the cancer under study that are of a different subtype are censored at their date of diagnosis. The relative risks obtained for each subtype that are estimated in this way are asymptotically uncorrelated (71–73). In addition, because these estimates are asymptotically normally distributed with variances given by the square of their respective estimated standard errors, any linear combination of the different estimates is normally distributed, and it follows from the Cramér-Wold device (74) that the multivariate vector obtained by combining all of the competing risk estimates is multivariate normal. The corresponding variances are in the diagonal of the covariance matrix, and zeroes are in the off-diagonal. To test the null hypothesis that there is no difference in the pooled exposure-disease parameters among the subtypes, we use a contrast test (75). For example, to test whether the pooled exposure-disease parameters differed among three subtypes, we would use the test statistic $Z^2$ given by

$$Z^2 = (C\hat{\beta})^{T}\,(C\hat{\Sigma}C^{T})^{-1}\,(C\hat{\beta}), \tag{7}$$

where $C$ is a contrast matrix whose first and second rows are $(1, -1, 0)$ and $(1, 0, -1)$, $\hat{\beta}$ is the vector of the pooled estimates of the exposure-disease association for the different subtypes, and $\hat{\Sigma}$ is its estimated covariance matrix. The $Z^2$ statistic in this example has an approximate $\chi^2$ distribution with 2 df (defined by the number of different subtypes minus 1) (75). These methods can also be used to construct tests for heterogeneity of effects between any set of cancers or other outcomes.
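For the three-subtype example, the contrast test can be written out directly, as in the sketch below. It assumes a diagonal covariance matrix (the asymptotically uncorrelated case described above) and uses the closed-form survival function of a chi-square distribution with 2 df; the function name `contrast_test_three_subtypes` and its inputs are hypothetical.

```python
import math


def contrast_test_three_subtypes(betas, ses):
    """Contrast test for equality of three pooled subtype-specific log relative risks.

    betas: pooled log relative risks for subtypes 1, 2, and 3
    ses:   their standard errors (estimates assumed uncorrelated)
    """
    if len(betas) != 3 or len(ses) != 3:
        raise ValueError("this sketch handles exactly three subtypes")
    b1, b2, b3 = betas
    v1, v2, v3 = (se ** 2 for se in ses)

    # C * beta with C = [[1, -1, 0], [1, 0, -1]].
    d1 = b1 - b2
    d2 = b1 - b3

    # C * Sigma * C^T for diagonal Sigma = diag(v1, v2, v3).
    a11 = v1 + v2
    a12 = v1
    a22 = v1 + v3
    det = a11 * a22 - a12 ** 2

    # Quadratic form (C beta)^T (C Sigma C^T)^{-1} (C beta) using the 2x2 inverse.
    z2 = (d1 ** 2 * a22 - 2.0 * d1 * d2 * a12 + d2 ** 2 * a11) / det

    # For 2 df the chi-square survival function is exp(-x / 2).
    p = math.exp(-z2 / 2.0)
    return z2, p
```

The choice of reference subtype in the contrast matrix does not affect the statistic; any full-rank set of contrasts spanning the same differences gives the same value.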
Measurement error correction
As with most exposures, measurement of dietary variables is not free from error. Measurement error in dietary data derives from normal within-person variation in intakes over time (76) and from errors associated with self-reports (77). Therefore, the relative risks will be biased, usually towards the null, but can be biased in either direction when there is also error in measuring confounding variables (78). One can use the validation data from each study to regress the ‘gold standard’ (or an unbiased estimate of the gold standard, an ‘alloyed’ gold standard (79)) on the error-prone measurement and confounding variables to obtain a correction factor. This correction factor can then be used to calibrate the uncorrected estimates of the exposure effect of interest obtained from logistic and Cox regression models (77, 79, 80). If the errors in the alloyed gold standard are correlated with the errors in the usual measure of dietary intake, the regression calibration method for measurement error correction will remove some, but not all, of the bias in the effect estimate (81). However, it appears that energy adjustment removes much of the bias in this method due to correlated errors for at least some dietary variables (e.g., protein) (82, 83). To remove the remaining bias, an additional method of assessment of intake is needed, such as a biomarker (81).
In the measurement error correction analyses, for each study, the true intake of the particular nutrient being evaluated or an unbiased estimate of the true intake (e.g., intakes calculated from several dietary records or 24-hour recalls) is regressed on the surrogate measurement of that nutrient (calculated from the food frequency questionnaire) to obtain the coefficient $\hat{\lambda}_s$ and its estimated standard error. We then derive the corrected estimate of the log relative risk as $\hat{\beta}_s / \hat{\lambda}_s$, where $\hat{\beta}_s$ is the uncorrected estimated effect in each study from a logistic regression or Cox proportional hazards regression analysis. The standard error of $\hat{\beta}_s / \hat{\lambda}_s$ is derived using the delta method (84). One can simultaneously correct for the error in several covariates in all point estimates and their standard errors using a multivariate extension of measurement error correction (79, 85). The corrected coefficient estimates are then pooled into a summary estimate. If a study has poor validity of nutrient measurements, its variance will be large, and the study will thus have little weight when the study-specific results are pooled. In addition, under the required assumption that the dietary records and 24-hour recalls provide an unbiased estimate of nutrient intake (even if subject to random error), this approach calibrates the estimated relative risks to a common unit of measurement across studies, thereby adjusting for systematic errors due to differences in the food frequency questionnaires used in the various studies.
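The single-covariate correction can be sketched as below. The sketch assumes that the uncorrected estimate and the calibration slope are statistically independent (for example, because they come from the main study and a separate validation sample), so the delta-method variance has no covariance term; the function name `calibrate_log_rr` and the example values are hypothetical.

```python
import math


def calibrate_log_rr(beta, se_beta, lam, se_lam):
    """Regression-calibration correction of a study-specific log relative risk.

    beta, se_beta: uncorrected log relative risk and its standard error
    lam, se_lam:   calibration slope from the validation study and its standard error
    """
    if lam == 0:
        raise ValueError("calibration slope must be nonzero")

    # Corrected point estimate: beta / lambda.
    beta_corrected = beta / lam

    # First-order delta method for a ratio of independent estimates:
    # var(beta / lam) ~= var(beta) / lam^2 + beta^2 * var(lam) / lam^4
    var_corrected = se_beta ** 2 / lam ** 2 + beta ** 2 * se_lam ** 2 / lam ** 4
    return beta_corrected, math.sqrt(var_corrected)


# Hypothetical example: a slope of 0.5 doubles the log relative risk
# and more than doubles its standard error.
corrected, se_corrected = calibrate_log_rr(0.10, 0.04, 0.5, 0.05)
```

Because the variance grows as the slope approaches zero, a study whose questionnaire tracks the reference method poorly contributes little weight once the corrected estimates are pooled.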
STRENGTHS AND LIMITATIONS
The Pooling Project of Prospective Studies of Diet and Cancer provides a large collection of data in which multiple diet-and-cancer hypotheses can be examined with greater statistical power than is available in any one study. Each study included in the Pooling Project is a prospective cohort study in which diet was assessed prior to development of disease, thereby limiting recall and selection biases. In the Pooling Project, we standardize the modeling of the exposure and confounding variables to remove potential sources of noncomparability and heterogeneity that occur in the published literature. We are able to examine associations over a wide range of intakes with greater precision than in the individual studies, because of the larger sample size and the different diets consumed across the populations. In addition, we can evaluate whether associations are modified by other factors and whether associations differ among cancer subtypes. Because inclusion of an individual study in a particular analysis is not dependent on whether those investigators have published findings on that association, publication bias does not affect our pooled analyses, as opposed to meta-analyses of the published literature, for which approximately half of the results may have some indication of publication bias (86). Finally, results from these pooled analyses may assist epidemiologists and other health professionals in synthesizing the vast amount of published data on specific diet-cancer associations.
A limitation of the Pooling Project is that it was planned retrospectively. Thus, there are differences in how the included studies were designed and implemented. First, the studies comprise populations from different geographic regions with different age ranges and education levels. However, these differences in study population characteristics may be considered a strength, particularly if the results are consistent across studies. Second, the dietary assessment methods used vary across studies, which may lead to artifactual differences in estimated intakes across studies, in addition to any true between-population differences in intakes. However, it is also possible that validity is enhanced by the use of study-specific questionnaires, since they may be tailored for use in each component study. Some heterogeneity of assessment instruments cannot be avoided, even in prospectively planned pooled studies (if, for instance, the language spoken and the foods consumed differ between populations). Another limitation of the Pooling Project is that only current diet at baseline was measured in most of the studies; thus, we cannot examine the effects of dietary changes occurring during follow-up or assess associations with diet at younger ages. There may be differential control for confounding across studies because the nondietary variables that were measured varied across studies, although many important potential confounders were measured in most studies. In addition, by standardizing which confounding variables are included in the multivariate models and their categorization, we have minimized between-studies heterogeneity resulting from how potentially confounding variables were modeled. A final restriction is our inability to examine effect modification by race and ethnicity, because the Pooling Project currently includes studies from only North America and Europe and a predominantly Caucasian population; however, as studies from other regions and with persons of different ethnicities become eligible to join the Pooling Project, the ethnic composition of the Pooling Project will expand.
Despite these limitations and restrictions, the data compiled in the Pooling Project are a valuable resource for prospectively investigating associations between diet and cancer, particularly for population subgroups, less common cancers, and specific cancer subtypes. In our analyses, we use standardized criteria to define each variable in order to reduce potential sources of between-studies heterogeneity. We then evaluate whether associations are consistent across different study populations. Finally, the methods that we use in the Pooling Project may be modified to pool data from both case-control and cohort studies to examine associations between dietary and nondietary risk factors and disease.
ACKNOWLEDGMENTS
This research was funded by National Institutes of Health
grants CA55075 and CA78548. The work was performed at
the Harvard School of Public Health (Boston, Massachusetts).
Conflict of interest: none declared.
REFERENCES
1. Blettner M, Sauerbrei W, Schlehofer B, et al. Traditional
reviews, meta-analyses and pooled analyses in epidemiology.
Int J Epidemiol 1999;28:1–9.
2. Friedenreich CM. Methods for pooled analyses of epidemio-
logic studies. Epidemiology 1993;4:295–302.
3. Steinberg KK, Smith SJ, Stroup DF, et al. Comparison of effect
estimates from a meta-analysis of summary data from pub-
lished studies and from a meta-analysis using individual pa-
tient data for ovarian cancer studies. Am J Epidemiol 1997;
145:917–25.
4. Lyman GH, Kuderer NM. The strengths and limitations of
meta-analyses based on aggregate data. BMC Med Res
Methodol 2005;5:14.
5. Ioannidis JP, Rosenberg PS, Goedert JJ, et al. Commentary:
meta-analysis of individual participants’ data in genetic epi-
demiology. Am J Epidemiol 2002;156:204–10.
6. Collaborative Group on Hormonal Factors in Breast Cancer.
Breast cancer and hormonal contraceptives: collaborative re-
analysis of individual data on 53 297 women with breast
cancer and 100 239 women without breast cancer from 54
epidemiological studies. Lancet 1996;347:1713–27.
7. Whittemore AS, Harris R, Itnyre J, et al. Characteristics re-
lating to ovarian cancer risk: collaborative analysis of 12 US
case-control studies. I. Methods. Am J Epidemiol 1992;
136:1175–83.
8. Plummer M, Herrero R, Franceschi S, et al. Smoking and
cervical cancer: pooled analysis of the IARC multi-centric
case-control study. Cancer Causes Control 2003;14:805–14.
9. Bosetti C, Kolonel L, Negri E, et al. A pooled analysis of case-
control studies of thyroid cancer. VI. Fish and shellfish con-
sumption. Cancer Causes Control 2001;12:375–82.
10. Arslan AA, Zeleniuch-Jacquotte A, Lundin E, et al. Serum
follicle-stimulating hormone and risk of epithelial ovarian
cancer in postmenopausal women. Cancer Epidemiol Bio-
markers Prev 2003;12:1531–5.
11. Pereira MA, O’Reilly E, Augustsson K, et al. Dietary fiber and
risk of coronary heart disease: a pooled analysis of cohort
studies. Arch Intern Med 2004;164:370–6.
12. Morton LM, Hartge P, Holford TR, et al. Cigarette smoking
and risk of non-Hodgkin lymphoma: a pooled analysis from
the International Lymphoma Epidemiology Consortium
(InterLymph). Cancer Epidemiol Biomarkers Prev 2005;14:
925–33.
13. Smith JS, Herrero R, Bosetti C, et al. Herpes simplex virus-2
as a human papillomavirus cofactor in the etiology of invasive
cervical cancer. J Natl Cancer Inst 2002;94:1604–13.
14. Hunter DJ, Spiegelman D, Adami H-O, et al. Cohort studies
of fat intake and the risk of breast cancer—a pooled analysis.
N Engl J Med 1996;334:356–61.
15. Hunter DJ, Spiegelman D, Adami H-O, et al. Non-dietary
factors as risk factors for breast cancer, and as effect modifiers
of the association of fat intake and risk of breast cancer. Cancer
Causes Control 1997;8:49–56.
16. Smith-Warner SA, Spiegelman D, Yaun S-S, et al. Alcohol and
breast cancer in women: a pooled analysis of cohort studies.
JAMA 1998;279:535–40.
17. Cho E, Smith-Warner SA, Spiegelman D, et al. Dairy foods,
calcium, and colorectal cancer: a pooled analysis of 10 cohort
studies. J Natl Cancer Inst 2004;96:1015–22.
18. van den Brandt PA, Spiegelman D, Yaun SS, et al. Pooled
analysis of prospective cohort studies on height, weight and
breast cancer risk. Am J Epidemiol 2000;152:514–27.
19. Smith-Warner SA, Spiegelman D, Yaun S-S, et al. Intake
of fruits and vegetables and risk of breast cancer: a pooled
analysis of cohort studies. JAMA 2001;285:769–76.
20. Smith-Warner SA, Spiegelman D, Adami HO, et al. Types
of dietary fat and breast cancer: a pooled analysis of cohort
studies. Int J Cancer 2001;92:767–74.
21. Missmer SA, Smith-Warner SA, Spiegelman D, et al. Meat
and dairy food consumption and breast cancer: a pooled
analysis of cohort studies. Int J Epidemiol 2002;31:78–85.
22. Smith-Warner SA, Ritz J, Hunter DJ, et al. Dietary fat and
risk of lung cancer in a pooled analysis of prospective studies.
Cancer Epidemiol Biomarkers Prev 2002;11:987–92.
23. Smith-Warner SA, Spiegelman D, Yaun SS, et al. Fruits,
vegetables and lung cancer: a pooled analysis of cohort stud-
ies. Int J Cancer 2003;107:1001–11.
24. Mannisto S, Smith-Warner SA, Spiegelman D, et al. Dietary
carotenoids and risk of lung cancer in a pooled analysis of
seven cohort studies. Cancer Epidemiol Biomarkers Prev
2004;13:40–8.
25. Cho E, Smith-Warner SA, Ritz J, et al. Alcohol intake and
colorectal cancer: a pooled analysis of 8 cohort studies. Ann
Intern Med 2004;140:603–13.
26. Koushik A, Hunter DJ, Spiegelman D, et al. Fruits and vege-
tables and ovarian cancer risk in a pooled analysis of 12 cohort
studies. Cancer Epidemiol Biomarkers Prev 2005;14:
2160–7.
27. Freudenheim JL, Ritz J, Smith-Warner SA, et al. Alcohol
consumption and risk of lung cancer: a pooled analysis of
cohort studies. Am J Clin Nutr 2005;82:657–67.
28. Cho E, Hunter DJ, Spiegelman D, et al. Intakes of vitamins A,
C and E and folate and multivitamins and lung cancer: a
pooled analysis of 8 prospective studies. Int J Cancer 2006;
118:970–8.
29. Genkinger JM, Hunter DJ, Spiegelman D, et al. A pooled
analysis of 12 cohort studies of dietary fat, cholesterol and egg
intake and ovarian cancer. Cancer Causes Control 2006;17:
273–85.
30. Park Y, Hunter DJ, Spiegelman D, et al. Dietary fiber intake
and risk of colorectal cancer: a pooled analysis of prospective
cohort studies. JAMA 2005;294:2849–57.
31. Riboli E, Kaaks R. The EPIC Project: rationale and study
design. Int J Epidemiol 1997;26(suppl 1):S6–14.
32. Wolk A, Bergström R, Hunter D, et al. A prospective study
of association of monounsaturated fat and other types of
fat with risk of breast cancer. Arch Intern Med 1998;158:
41–5.
33. Singh PN, Fraser GE. Dietary risk factors for colon cancer
in a low-risk population. Am J Epidemiol 1998;148:761–74.
34. The ATBC Cancer Prevention Study Group. The Alpha-
Tocopherol, Beta-Carotene Lung Cancer Prevention Study:
design, methods, participant characteristics, and compliance.
Ann Epidemiol 1994;4:1–10.
35. Flood A, Velie EM, Chaterjee N, et al. Fruit and vegetable
intakes and the risk of colorectal cancer in the Breast Cancer
Detection Demonstration Project Follow-up Cohort. Am J Clin
Nutr 2002;75:936–43.
36. Terry P, Jain M, Miller AB, et al. Dietary intake of folic acid
and colorectal cancer risk in a cohort of women. Int J Cancer
2002;97:864–7.
37. Michels KB, Giovannucci E, Joshipura KJ, et al. Prospec-
tive study of fruit and vegetable consumption and incidence
of colon and rectal cancers. J Natl Cancer Inst 2000;92:
1740–52.
38. Calle EE, Rodriguez C, Jacobs EJ, et al. The American Cancer
Society Cancer Prevention Study II Nutrition Cohort: ratio-
nale, study design, and baseline characteristics. Cancer 2002;
94:500–11.
39. Sieri S, Krogh V, Muti P, et al. Fat and protein intake and
subsequent breast cancer risk in postmenopausal women.
Nutr Cancer 2002;42:10–17.
40. Voorrips LE, Goldbohm RA, van Poppel G, et al. Vegetable
and fruit consumption and risks of colon and rectal cancer in
a prospective cohort study: The Netherlands Cohort Study on
Diet and Cancer. Am J Epidemiol 2000;152:1081–92.
41. Steinmetz KA, Kushi LH, Bostick RM, et al. Vegetables, fruit,
and colon cancer in the Iowa Women’s Health Study. Am J
Epidemiol 1994;139:1–15.
42. Bandera EV, Freudenheim JL, Marshall JR, et al. Diet and
alcohol consumption and lung cancer risk in the New York
State Cohort (United States). Cancer Causes Control 1997;
8:828–40.
43. Kato I, Akhmedkhanov A, Koenig K, et al. Prospective study
of diet and female colorectal cancer: The New York University
Women’s Health Study. Nutr Cancer 1997;28:276–81.
44. Higginbotham S, Zhang Z-F, Lee I-M, et al. Dietary glycemic
load and risk of colorectal cancer in the Women’s Health
Study. J Natl Cancer Inst 2004;96:229–33.
45. Horn-Ross PL, Hoggatt KJ, West DW, et al. Recent diet and
breast cancer risk: The California Teachers Study (USA).
Cancer Causes Control 2002;13:407–15.
46. Cho E, Spiegelman D, Hunter DJ, et al. Premenopausal fat
intake and risk of breast cancer. J Natl Cancer Inst 2003;95:
1079–85.
47. Prentice RL. A case-cohort design for epidemiologic cohort
studies and disease prevention trials. Biometrika 1986;73:
1–11.
48. Rothman KJ. Modern epidemiology. Boston, MA: Little,
Brown and Company, 1986.
49. Cox DR. Regression models and life tables (with discussion).
J R Stat Soc B 1972;34:187–220.
50. Abbey DE, Andress M, Fraser G, et al. Validity and reliability
of alternative nutrient indices based on a food frequency
questionnaire. (Abstract). Am J Epidemiol 1988;128(suppl):
934.
51. Pietinen P, Hartman AM, Haapa E, et al. Reproducibility
and validity of dietary assessment instruments. I. A self-
administered food use questionnaire with a portion size picture
booklet. Am J Epidemiol 1988;128:655–66.
52. Jain M, Howe GR, Rohan T. Dietary assessment in epidemi-
ology: comparison of a food frequency and a diet history
questionnaire with a 7-day food record. Am J Epidemiol
1996;143:953–60.
53. Flagg EW, Coates RJ, Calle EE, et al. Validation of the
American Cancer Society Cancer Prevention Study II Nutri-
tion Survey Cohort food frequency questionnaire. Epidemiol-
ogy 2000;11:462–8.
54. Rimm EB, Giovannucci EL, Stampfer MJ, et al. Reproduc-
ibility and validity of an expanded self-administered semi-
quantitative food frequency questionnaire among male health
professionals. Am J Epidemiol 1992;135:1114–26.
55. Munger RG, Folsom AR, Kushi LH, et al. Dietary assessment
of older Iowa women with a food frequency questionnaire:
nutrient intake, reproducibility, and comparison with 24-hour
dietary recall interviews. Am J Epidemiol 1992;136:192–200.
56. Goldbohm RA, van den Brandt PA, Brants HA, et al. Valida-
tion of a dietary questionnaire used in a large-scale prospec-
tive cohort study on diet and cancer. Eur J Clin Nutr 1994;
48:253–65.
57. Feskanich D, Marshall J, Rimm EB, et al. Simulated validation
of a brief food frequency questionnaire. Ann Epidemiol 1994;
4:181–7.
58. Willett WC, Sampson L, Stampfer MJ, et al. Reproducibility
and validity of a semiquantitative food frequency question-
naire. Am J Epidemiol 1985;122:51–65.
59. Willett W. Nutritional epidemiology. New York, NY: Oxford
University Press, 1998.
60. Miettinen OS. Theoretical epidemiology. New York, NY:
John Wiley and Sons, Inc, 1985.
61. Huberman M, Langholz B. Application of the missing-
indicator method in matched case-control studies with incom-
plete data. Am J Epidemiol 1999;150:1340–5.
62. HiroSoft International Corporation. EPICURE user’s guide:
the PEANUTS program. Seattle, WA: HiroSoft International
Corporation, 1993.
63. Allison PD. Survival analysis using the SAS system: a practi-
cal guide. Cary, NC: SAS Publishing, 1995.
64. Harville DA. Maximum likelihood approaches to variance
component estimation and to related problems. J Am Stat
Assoc 1977;72:320–38.
65. Laird NM, Ware JH. Random-effects models for longitudinal
data. Biometrics 1982;38:963–74.
66. DerSimonian R, Laird N. Meta-analysis in clinical trials.
Control Clin Trials 1986;7:177–88.
67. Cochran WG. The combination of estimates from different
experiments. Biometrics 1954;10:101–29.
68. Durrleman S, Simon R. Flexible regression models with cubic
splines. Stat Med 1989;8:551–61.
69. Smith PL. Splines as a useful and convenient statistical tool.
Am Stat 1979;33:57–62.
70. Stram DO. Meta-analysis of published data using a linear
mixed-effects model. Biometrics 1996;52:536–44.
71. Prentice RL, Kalbfleisch JD, Peterson AV, et al. The analysis
of failure times in the presence of competing risks. Biometrics
1978;34:541–54.
72. Tsiatis AA. Competing risks. In: Armitage P, Colton T, eds.
Encyclopedia of biostatistics. 1st ed. Vol 1. New York, NY:
John Wiley and Sons, Inc, 1988:824–34.
73. Cox DR, Oakes D. Analysis of survival data. New York, NY:
Chapman and Hall, Inc, 1993.
74. Billingsley P. Probability and measure. New York, NY: John
Wiley and Sons, Inc, 1995.
75. Anderson TW. Introduction to multivariate statistics. New
York, NY: John Wiley and Sons, Inc, 1984.
76. Beaton GH, Milner J, McGuire V, et al. Source of variance in
24-hour dietary recall data: implications for nutrition study
design and interpretation. Carbohydrate sources, vitamins, and
minerals. Am J Clin Nutr 1983;37:986–95.
77. Rosner B, Willett WC, Spiegelman D. Correction of logistic
regression relative risk estimates and confidence intervals for
systematic within-person measurement error. Stat Med 1989;
8:1051–69.
78. Kupper LL. Effects of the use of unreliable surrogate variables
on the validity of epidemiologic research studies. Am J Epi-
demiol 1984;120:643–8.
79. Spiegelman D, Schneeweiss S, McDermott A. Measurement
error correction for logistic regression models with an
‘alloyed gold standard.’ Am J Epidemiol 1997;145:184–96.
80. Wang CY, Xie, Prentice AM, et al. Recalibration based on an
approximate relative risk estimator in Cox regression with
missing covariates. Stat Sinica 2001;11:1081–104.
81. Spiegelman D, Zhao B, Kim J. Correlated errors in biased
surrogates: study designs and methods for measurement error
correction. Stat Med 2005;24:1657–82.
82. Kipnis V, Subar AF, Midthune D, et al. Structure of dietary
measurement error: results of the OPEN biomarker study. Am
J Epidemiol 2003;158:14–21.
83. Michels KB, Bingham SA, Luben R, et al. The effect of cor-
related measurement error in multivariate models of diet. Am J
Epidemiol 2004;160:59–67.
84. Bishop Y, Fienberg S, Holland P. Discrete multivariate anal-
ysis. Cambridge, MA: MIT Press, 1975.
85. Rosner B, Spiegelman D, Willett WC. Correction of logistic
regression relative risk estimates and confidence intervals for
measurement error: the case of multiple covariates measured
with error. Am J Epidemiol 1990;132:734–45.
86. Sutton AJ, Duval SJ, Tweedie RL, et al. Empirical assessment
of effect of publication bias on meta-analyses. BMJ 2000;
320:1574–7.
... A linear regression was used to calculate study-specific β-estimates and 95% confidence intervals (CI) for each dietary index (PHDS and uRFDS) at the baseline as a continuous exposure variable, and baseline anthropometric measures (weight, BMI, and WC) and percentual anthropometric changes (weight, BMI, and WC) as continuous outcome variables. Then, study-specific β-estimates and 95% CI with P-values were pooled with a two-staged random effects linear regression (34). The heterogeneity was tested between the pooled cohorts using Q-statistics. ...
... The heterogeneity was tested between the pooled cohorts using Q-statistics. Furthermore, to evaluate the plausibility of the PHDS and uRFDS in relation to its components, the pooled P-value for the trend in the association between the intake of each of the uRFDS or PHDS components and the scoring groups was determined with a two-staged random effects linear regression (34). ...
... The strengths of this study include the relatively long follow-up time and harmonized and pooled data from two population-based studies: the validated FFQ and professionally measured anthropometric variables (25)(26)(27)34). We acknowledge the limitations with the under-and overreporting and memory bias related to the FFQ method. ...
Background: Knowledge on the association between the EAT-Lancet Planetary Health Diet (PHD) or the Finnish Nutrition recommendations (FNR) and anthropometric changes is scarce. Especially, the role of the overall diet quality, distinct from energy intake, on weight changes needs further examination. Objectives: To examine the association between diet quality and weight change indicators and to develop a dietary index based on the PHD adapted for the Finnish food culture. Methods: The study population consisted of participants of two Finnish population-based studies (n = 4,371, 56% of women, aged 30−74 years at baseline). Dietary habits at the baseline were assessed with a validated food frequency questionnaire including 128−130 food items. We developed a Planetary Health Diet Score (PHDS) (including 13 components) and updated the pre-existing Recommended Finnish Diet Score (uRFDS) (including nine components) with energy density values to measure overall diet quality. Weight, height, and waist circumference (WC), and the body mass index (BMI) were measured at the baseline and follow-up, and their percentual changes during a 7-year follow-up were calculated. Two-staged random effects linear regression was used to evaluate β-estimates with 95% confidence intervals. Results: Adherence to both indices was relatively low (PHDS: mean 3.6 points (standard deviation [SD] 1.2) in the range of 0−13; uRFDS: mean 12.7 points (SD 3.9) in the range of 0−27). We did not find statistically significant associations between either of the dietary indices and anthropometric changes during the follow-up (PHDS, weight: β −0.04 (95% CI −0.19, 0.11), BMI: β 0.05 (−0.20, 0.10), WC: β −0.08 (−0.22, 0.06); uRFDS, weight: β 0.01 (−0.04, 0.06), BMI: β 0.01 (−0.04, 0.06), WC: β −0.02 (−0.07, 0.03)). 
Conclusion: No associations between overall diet quality and anthropometric changes were found, which may be at least partly explained by low adherence to the PHD and the FNR in the Finnish adult population.
... The associations were estimated using two-stage meta-analyses. We calculated cohort-specific HRs and 95% CIs for CRC risk by Cox proportional hazards multivariate models, after which we pooled the cohort-specific estimates, weighted by the inverse of their variances, using random-effects models [35]. Heterogeneity between the cohorts was tested by Q-statistics. ...
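The two-stage approach described in this excerpt (study-specific estimates pooled with inverse-variance weights under a random-effects model, with a Q-statistic for heterogeneity) can be sketched as follows. This is a minimal illustration that assumes the DerSimonian-Laird moment estimator for the between-study variance; the excerpt does not say which estimator was used, and the function name and the example inputs are hypothetical.

```python
import math

def pool_random_effects(log_estimates, std_errors):
    """Pool study-specific log relative risks (second stage of a two-stage analysis).

    Assumption: DerSimonian-Laird estimator for the between-study variance.
    """
    # Fixed-effect (inverse-variance) weights and summary estimate
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * b for wi, b in zip(w, log_estimates)) / sum_w

    # Cochran's Q statistic for between-study heterogeneity
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, log_estimates))
    df = len(log_estimates) - 1

    # Moment estimate of the between-study variance, truncated at zero
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * b for wi, b in zip(w_star, log_estimates)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))

    return {"pooled": pooled, "se": se_pooled, "q": q, "df": df, "tau2": tau2}

# Hypothetical inputs: two studies with log relative risks 0 and 1, each with SE 0.1
result = pool_random_effects([0.0, 1.0], [0.1, 0.1])
rr = math.exp(result["pooled"])
ci = (math.exp(result["pooled"] - 1.96 * result["se"]),
      math.exp(result["pooled"] + 1.96 * result["se"]))
```

When the study-specific estimates agree closely, the estimated between-study variance is zero and the random-effects summary coincides with the fixed-effect one.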
Article
Objectives Shifting from animal-based to plant-based diets could reduce colorectal cancer (CRC) incidence. Currently, the impacts of these dietary shifts on CRC risk are ill-defined. Therefore, we examined partial substitutions of red or processed meat with whole grains, vegetables, fruits or a combination of these in relation to CRC risk in Finnish adults. Methods We pooled five Finnish cohorts, resulting in 43 788 participants aged ≥ 25 years (79% men). Diet was assessed by validated food frequency questionnaires at study enrolment. We modelled partial substitutions of red (100 g/week) or processed meat (50 g/week) with corresponding amounts of plant-based foods. Cohort-specific hazard ratios (HR) for CRC were calculated using Cox proportional hazards models and pooled together using random-effects models. Adjustments included age, sex, energy intake and other relevant confounders. Results During the median follow-up of 28.8 years, 1124 CRCs were diagnosed. We observed small risk reductions when red meat was substituted with vegetables (HR 0.97, 95% CI 0.95 − 0.99), fruits (0.97, 0.94 − 0.99), or whole grains, vegetables and fruits combined (0.97, 0.95 − 0.99). For processed meat, these substitutions yielded 1% risk reductions. Substituting red or processed meat with whole grains was associated with a decreased CRC risk only in participants with < median whole grain intake (0.92, 0.86 − 0.98; 0.96, 0.93 − 0.99, respectively; p for interaction = 0.001). Conclusions Even small, easily implemented substitutions of red or processed meat with whole grains, vegetables or fruits could lower CRC risk in a population with high meat consumption. These findings broaden our insight into dietary modifications that could foster CRC primary prevention.
... A two-stage modeling approach was used to quantify the associations of reproductive factors and MHT with gastric cancer [27]. First, the study-specific odds ratios (ORs) and the corresponding 95% confidence intervals (CIs) were estimated for the association between each measure of reproductive factors and gastric cancer risk using multivariable unconditional logistic regression models. ...
Article
Background Gastric cancer incidence is higher in men, and a protective hormone-related effect in women is postulated. We aimed to investigate and quantify the relationship in the Stomach cancer Pooling (StoP) Project consortium. Methods A total of 2,084 cases and 7,102 controls from 11 studies in seven countries were included. Summary odds ratios (ORs) and 95% confidence intervals (CIs) assessing associations of key reproductive factors and menopausal hormone therapy (MHT) with gastric cancer were estimated by pooling study-specific ORs using random-effects meta-analysis. Results A duration of fertility of ≥ 40 years (vs. < 20), was associated with a 25% lower risk of gastric cancer (OR = 0.75; 95% CI: 0.58–0.96). Compared with never use, ever, 5–9 years and ≥ 10 years use of MHT in postmenopausal women, showed ORs of 0.73 (95% CI: 0.58–0.92), 0.53 (95% CI: 0.34–0.84) and 0.71 (95% CI: 0.50–1.00), respectively. The associations were generally similar for anatomical and histologic subtypes. Conclusion Our results support the hypothesis that reproductive factors and MHT use may lower the risk of gastric cancer in women, regardless of anatomical or histologic subtypes. Given the variation in hormones over the lifespan, studies should address their effects in premenopausal and postmenopausal women. Furthermore, mechanistic studies may inform potential biological processes.
... It provides reliable and accurate data that enables the computation of nationally representative estimates of demographic and health indicators in Nigeria among women aged 15-49 (National Population Commission [NPC] & ICF, 2019). From the two surveys (41,821 women in the 2018 NDHS and 38,948 women in the 2013 NDHS), adolescent girls' (aged 15-19) data were extracted (from the Individual Recode (IR)), pooled and analyzed to increase the sample size and enhance statistical precision (Friedenreich, 1993; Smith-Warner et al., 2006). A similar approach has been used in previous studies (Babalola, 2011; Okunlola et al., 2022; Solanke et al., 2018). ...
... The incidence rates were calculated using an aggregated analysis, obtained by combining results from all studies in one dataset, and the results were presented for each included publication in a table (11). The logit transformation was used to calculate the summary proportion, which resulted in a pooled proportion with a 95% CI. Additionally, the proportions (expressed as percentages), with their 95% CI, in individual studies were listed in a separate table. ...
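A logit-transformed confidence interval for an aggregated proportion, as mentioned in this excerpt, might be computed along the following lines. This is only a sketch: it assumes the usual delta-method standard error on the logit scale, sqrt(1/events + 1/non-events), which the excerpt does not confirm, and the function name and the per-study counts are hypothetical.

```python
import math

def aggregated_logit_ci(study_counts, z=1.96):
    """Pool (events, total) pairs into one dataset and return the
    proportion with a confidence interval built on the logit scale.

    Assumption: delta-method standard error; requires 0 < events < total.
    """
    events = sum(e for e, _ in study_counts)
    total = sum(n for _, n in study_counts)
    p = events / total

    # Work on the logit scale so the back-transformed interval stays within (0, 1)
    logit = math.log(p / (1.0 - p))
    se = math.sqrt(1.0 / events + 1.0 / (total - events))

    def expit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return p, expit(logit - z * se), expit(logit + z * se)

# Hypothetical counts for three studies: (events, participants)
proportion, lower, upper = aggregated_logit_ci([(20, 40), (15, 30), (15, 30)])
```

Because the interval is symmetric on the logit scale rather than on the proportion scale, it is asymmetric around the estimate whenever the proportion differs from 0.5.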
Article
Introduction High-speed boat operators constitute a population at risk of work-related injuries and disabilities. This review aimed to summarize the available knowledge on workplace-related injuries and chronic musculoskeletal pain among high-speed boat operators. Materials and Methods In this systematic review, we searched Medline, Embase, Scopus, and the Cochrane Library Database for studies, published from 1980 to 2022, on occupational health and hazards onboard high-speed boats. Studies and reports were eligible for inclusion if they evaluated, compared, used, or described harms associated with impact exposure onboard high-speed boats. Studies focusing on recreational injuries and operators of non-planing boats were excluded. The primary outcome of interest was the incidence of acute injuries. The secondary outcome measures comprised the presence of chronic musculoskeletal disorders, pain medication use, and days off work. Results Of the 163 search results, 5 (2 prospective longitudinal and 3 cross-sectional cohort studies) were included in this systematic review. A total of 804 cases with 3,312 injuries sustained during 3,467 person-years onboard high-speed boats were included in the synthesis of the results. The pooled incidence rate was 1.0 per person-year. The most common injuries were related to the lower back (26%), followed by neck (16%) and head (12%) injuries. The pooled prevalence of chronic pain was 74% (95% CI: 73–75%) and 60% (95% CI: 59–62%) of the cohort consumed analgesics. Conclusions Despite very limited data, this review found evidence that high-speed boat operators have a higher rate of injuries and a higher prevalence of chronic pain than other naval service operators and the general workforce. Given the low certainty of these findings, further prospective research is required to verify the injury incidence and chronic pain prevalence among high-speed boat operators.
... In a sensitivity analysis, we compared associations obtained with the simple imputation for waist circumference residuals to a complete case analysis. Potential heterogeneity in the associations between FABP-4 and colorectal cancer anatomical subsite (i.e., outcome subtypes) was determined using competing risk tests [41,42]. Heterogeneity by sex was evaluated using the Q-statistic from the inverse variance method, assuming a fixed-effect model of 1 degree of freedom [43]. ...
Article
Background Fatty acid binding protein 4 (FABP-4) is a lipid-binding adipokine upregulated in obesity, which may facilitate fatty acid supply for tumor growth and promote insulin resistance and inflammation and may thus play a role in colorectal cancer (CRC) development. We aimed to investigate the association between circulating FABP-4 and CRC and to assess potential causality using a Mendelian randomization (MR) approach. Methods The association between pre-diagnostic plasma measurements of FABP-4 and CRC risk was investigated in a nested case-control study in 1324 CRC cases and the same number of matched controls within the European Prospective Investigation into Cancer and Nutrition (EPIC) cohort. A two-sample Mendelian randomization study was conducted based on three genetic variants (1 cis, 2 trans) associated with circulating FABP-4 identified in a published genome-wide association study (discovery n = 20,436) and data from 58,131 CRC cases and 67,347 controls in the Genetics and Epidemiology of Colorectal Cancer Consortium, Colorectal Cancer Transdisciplinary Study, and Colon Cancer Family Registry. Results In conditional logistic regression models adjusted for potential confounders including body size, the estimated relative risk, RR (95% confidence interval, CI) per one standard deviation, SD (8.9 ng/mL) higher FABP-4 concentration was 1.01 (0.92, 1.12) overall, 0.95 (0.80, 1.13) in men and 1.09 (0.95, 1.25) in women. Genetically determined higher FABP-4 was not associated with colorectal cancer risk (RR per FABP-4 SD was 1.10 (0.95, 1.27) overall, 1.03 (0.84, 1.26) in men and 1.21 (0.98, 1.48) in women). However, in a cis-MR approach, a statistically significant association was observed in women (RR 1.56, 1.09, 2.23) but not overall (RR 1.23, 0.97, 1.57) or in men (0.99, 0.71, 1.37). 
Conclusions Taken together, these analyses provide no support for a causal role of circulating FABP-4 in the development of CRC, although the cis-MR provides some evidence for a positive association in women, which may deserve to be investigated further.
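The test of heterogeneity by sex described in the excerpt for this study (a Q-statistic from the inverse-variance method with 1 degree of freedom) can be illustrated with the sex-specific relative risks quoted in the abstract: 0.95 (0.80, 1.13) in men and 1.09 (0.95, 1.25) in women. The sketch below back-calculates standard errors from the rounded confidence limits, so the result is approximate and need not match the authors' own value; the function names are hypothetical.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Approximate SE of a log relative risk from its 95% confidence limits."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

def q_heterogeneity(log_rrs, std_errors):
    """Cochran's Q around the inverse-variance (fixed-effect) summary estimate."""
    weights = [1.0 / se ** 2 for se in std_errors]
    fixed = sum(w * b for w, b in zip(weights, log_rrs)) / sum(weights)
    q = sum(w * (b - fixed) ** 2 for w, b in zip(weights, log_rrs))
    return q, len(log_rrs) - 1

def chi2_sf_1df(q):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2.0))

# Sex-specific estimates as reported in the abstract: (RR, lower limit, upper limit)
men = (0.95, 0.80, 1.13)
women = (1.09, 0.95, 1.25)

log_rrs = [math.log(men[0]), math.log(women[0])]
ses = [se_from_ci(men[1], men[2]), se_from_ci(women[1], women[2])]

q, df = q_heterogeneity(log_rrs, ses)
p_value = chi2_sf_1df(q)
print(f"Q = {q:.2f} on {df} df, p = {p_value:.2f}")
```

With only two subgroups, Q reduces to the squared difference of the log relative risks divided by the sum of their variances, which is why a single degree of freedom applies.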
Article
It remains unclear if pre‐diagnostic factors influence the developmental pathways of colorectal cancer (CRC) that could enhance tumor aggressiveness. This study used prospective data from 205,489 cancer‐free US health professionals to investigate the associations of 31 known or putative risk factors with the risk of aggressive CRC. Tumor aggressiveness was characterized by three endpoints: aggressive CRC (cancer that causes death within 5 years of diagnosis), fatal CRC, and tumor stage at diagnosis. The data augmentation method was used to assess the difference in the associations between risk factors and endpoints. We documented 3201 CRC cases, of which 899 were aggressive. The protective associations of undergoing lower endoscopy (hazard ratios [HR] 0.43, 95% confidence interval (CI) 0.37, 0.49 for aggressive versus HR 0.61, 95% CI 0.56, 0.67 for non‐aggressive) and regular use of aspirin (HR 0.70, 95% CI 0.61, 0.81 versus HR 0.84, 95% CI 0.77, 0.92) were stronger for aggressive than non‐aggressive CRC (p for heterogeneity < 0.05). Lower intake of whole grains or cereal fiber and greater dietary inflammatory potential were associated with a higher risk of aggressive but not non‐aggressive CRC. The remaining risk factors showed comparable associations with aggressive CRC and non‐aggressive CRC. Aggressive cases were more likely to have KRAS‐mutated tumors but less likely to have distal or MSI‐high tumors (p < .007). Similar results were observed for fatal CRC and advanced tumor stages at diagnosis. These findings provide initial evidence for the role of pre‐diagnostic risk factors in the pathogenesis of aggressive CRC and suggest research priorities for preventive interventions.
Article
The Mediterranean diet is a well-studied cultural model of healthy eating, yet research on healthy models from other cultures and cuisines has been limited. This perspective article summarizes the components of traditional Latin American, Asian, and African heritage diets, their association with diet quality and markers of health, and implications for nutrition programs and policy. Though these diets differ in specific foods and flavors, we present a common thread that emphasizes healthful plant foods and that is consistent with high dietary quality and low rates of major causes of disability and deaths. In this perspective, we propose that nutrition interventions that incorporate these cultural models of healthy eating show promise, though further research is needed to determine health outcomes and best practices for implementation.
Chapter
This chapter provides an overview of nutritional epidemiology for those unfamiliar with the field. The field of nutritional epidemiology developed from an interest in the concept that aspects of diet may influence the occurrence of human disease. Although it is relatively new as a formal area of research, investigators have used basic epidemiologic methods for more than 200 years to identify numerous essential nutrients. The most serious challenge to research in nutritional epidemiology has been the development of practical methods to measure diet. Because epidemiologic studies usually involve at least several hundred and sometimes hundreds of thousands of subjects, dietary assessment methods must be not only reasonably accurate but also relatively inexpensive. Epidemiologic approaches to diet and disease and the interpretation of epidemiologic data are discussed.
Article
Objective.— To assess the risk of invasive breast cancer associated with total and beverage-specific alcohol consumption and to evaluate whether dietary and nondietary factors modify the association. Data Sources.— We included in these analyses 6 prospective studies that had at least 200 incident breast cancer cases, assessed long-term intake of food and nutrients, and used a validated diet assessment instrument. The studies were conducted in Canada, the Netherlands, Sweden, and the United States. Alcohol intake was estimated by food frequency questionnaires in each study. The studies included a total of 322647 women evaluated for up to 11 years, including 4335 participants with a diagnosis of incident invasive breast cancer. Data Extraction.— Pooled analysis of primary data using analyses consistent with each study's original design and the random-effects model for the overall pooled analyses. Data Synthesis.— For alcohol intakes less than 60 g/d (reported by >99% of participants), risk increased linearly with increasing intake; the pooled multivariate relative risk for an increment of 10 g/d of alcohol (about 0.75-1 drink) was 1.09 (95% confidence interval [CI], 1.04-1.13; P for heterogeneity among studies, .71). The multivariate-adjusted relative risk for total alcohol intakes of 30 to less than 60 g/d (about 2-5 drinks) vs nondrinkers was 1.41 (95% CI, 1.18-1.69). Limited data suggested that alcohol intakes of at least 60 g/d were not associated with further increased risk. The specific type of alcoholic beverage did not strongly influence risk estimates. The association between alcohol intake and breast cancer was not modified by other factors. Conclusions.— Alcohol consumption is associated with a linear increase in breast cancer incidence in women over the range of consumption reported by most women. Among women who consume alcohol regularly, reducing alcohol consumption is a potential means to reduce breast cancer risk.
Article
Introduction. Survival distributions. Single sample nonparametric methods. Dependence on explanatory variables. Model formulation. The multiplicative log-linear hazards model. Partial likelihood. Several types of failure. Further problems. Exercises. Bibliography. Index.
Article
Multiple-day food records or 24-hour dietary recalls (24HRs) are commonly used as “reference” instruments to calibrate food frequency questionnaires (FFQs) and to adjust findings from nutritional epidemiologic studies for measurement error. Correct adjustment requires that the errors in the adopted reference instrument be independent of those in the FFQ and of true intake. The authors report data from the Observing Protein and Energy Nutrition (OPEN) Study, conducted from September 1999 to March 2000, in which valid reference biomarkers for energy (doubly labeled water) and protein (urinary nitrogen), together with a FFQ and 24HR, were observed in 484 healthy volunteers from Montgomery County, Maryland. Accounting for the reference biomarkers, the data suggest that the FFQ leads to severe attenuation in estimated disease relative risks for absolute protein or energy intake (a true relative risk of 2 would appear as 1.1 or smaller). For protein adjusted for energy intake by using either nutrient density or nutrient residuals, the attenuation is less severe (a relative risk of 2 would appear as approximately 1.3), lending weight to the use of energy adjustment. Using the 24HR as a reference instrument can seriously underestimate true attenuation (up to 60% for energy-adjusted protein). Results suggest that the interpretation of findings from FFQ-based epidemiologic studies of diet-disease associations needs to be reevaluated. Keywords: bias (epidemiology); biological markers; diet; energy intake; epidemiologic methods; nutrition assessment; questionnaires; reference values
Article
Background: Frequent consumption of fruit and vegetables has been associated with a reduced risk of colorectal cancer in many observational studies. Methods: We prospectively investigated the association between fruit and vegetable consumption and the incidence of colon and rectal cancers in two large cohorts: the Nurses' Health Study (88764 women) and the Health Professionals' Follow-up Study (47325 men). Diet was assessed and cumulatively updated in 1980, 1984, 1986, and 1990 among women and in 1986 and 1990 among men. The incidence of cancer of the colon and rectum was ascertained up to June or January of 1996, respectively. Relative risk (RR) estimates were calculated with the use of pooled logistic regression models accounting for various potential confounders. All statistical tests were two-sided. Results: With a follow-up including 1743645 person-years and 937 cases of colon cancer, we found little association of colon cancer incidence with fruit and vegetable consumption. For women and men combined, a difference in fruit and vegetable consumption of one additional serving per day was associated with a covariate-adjusted RR of 1.02 (95% confidence interval [CI] = 0.98-1.05). A difference in vegetable consumption of one additional serving per day was associated with an RR of 1.03 (95% CI = 0.97-1.09). Similar results were obtained for women and men considered separately. A difference in fruit consumption of one additional serving per day was associated with a covariate-adjusted RR for colon cancer of 0.96 (95% CI = 0.89-1.03) among women and 1.08 (95% CI = 1.00-1.16) among men. For rectal cancer (total, 244 cases), a difference in fruit and vegetable consumption of one additional serving per day was associated with an RR of 1.02 (95% CI = 0.95-1.09) in men and women combined. None of these associations was modified by vitamin supplement use or smoking habits. 
Conclusions: Although fruits and vegetables may confer protection against some chronic diseases, their frequent consumption does not appear to confer protection from colon or rectal cancer.
Article
Background The use of review articles and meta-analysis has become an important part of epidemiological research, mainly for reconciling previously conducted studies that have inconsistent results. Numerous methodologic issues particularly with respect to biases and the use of meta-analysis are still controversial. Methods Four methods summarizing data from epidemiological studies are described. The rationale for meta-analysis and the statistical methods used are outlined. The strengths and limitations of these methods are compared particularly with respect to their ability to investigate heterogeneity between studies and to provide quantitative risk estimation. Results Meta-analyses from published data are in general insufficient to calculate a pooled estimate since published estimates are based on heterogeneous populations, different study designs and mainly different statistical models. More reliable results can be expected if individual data are available for a pooled analysis, although some heterogeneity still remains. Large prospective planned meta-analysis of multicentre studies would be preferable to investigate small risk factors, however this type of meta-analysis is expensive and time-consuming. Conclusion For a full assessment of risk factors with a high prevalence in the general population, pooling of data will become increasingly important. Future research needs to focus on the deficiencies of review methods, in particular, the errors and biases that can be produced when studies are combined that have used different designs, methods and analytic models.
Book
Probability. Measure. Integration. Random Variables and Expected Values. Convergence of Distributions. Derivatives and Conditional Probability. Stochastic Processes. Appendix. Notes on the Problems. Bibliography. List of Symbols. Index.