Methods for Pooling Results of Epidemiologic Studies: The Pooling Project of Prospective Studies of Diet and Cancer

July 2006
American Journal of Epidemiology 163(11):1053-64

DOI:10.1093/aje/kwj127


Practice of Epidemiology

Stephanie A. Smith-Warner¹,², Donna Spiegelman²,³, John Ritz²,³, Demetrius Albanes, W. Lawrence Beeson, Leslie Bernstein, Franco Berrino, Piet A. van den Brandt, Julie E. Buring²,⁹, Eunyoung Cho, Graham A. Colditz²,¹⁰,¹¹, Aaron R. Folsom, Jo L. Freudenheim, Edward Giovannucci¹,²,¹⁰, R. Alexandra Goldbohm, Saxon Graham, Lisa Harnack, Pamela L. Horn-Ross, Vittorio Krogh, Michael F. Leitzmann, Marjorie L. McCullough, Anthony B. Miller, Carmen Rodriguez, Thomas E. Rohan, Arthur Schatzkin, Roy Shore, Mikko Virtanen, Walter C. Willett¹,²,¹⁰, Alicja Wolk, Anne Zeleniuch-Jacquotte, Shumin M. Zhang²,⁹, and David J. Hunter¹,²,¹⁰,¹¹

¹ Department of Nutrition, Harvard School of Public Health, Boston, MA.
² Department of Epidemiology, Harvard School of Public Health, Boston, MA.
³ Department of Biostatistics, Harvard School of Public Health, Boston, MA.
⁴ Nutritional Epidemiology Branch, National Cancer Institute, Bethesda, MD.
⁵ Center for Health Research, School of Medicine, Loma Linda University, Loma Linda, CA.
⁶ Department of Preventive Medicine and USC/Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA.
⁷ Epidemiology Unit, National Cancer Institute, Milan, Italy.
⁸ Department of Epidemiology, Faculty of Health Sciences, Maastricht University, Maastricht, the Netherlands.
⁹ Division of Preventive Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA.
¹⁰ Channing Laboratory, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA.
¹¹ Harvard Center for Cancer Prevention, Boston, MA.
¹² Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, MN.
¹³ Department of Social and Preventive Medicine, University at Buffalo, State University of New York, Buffalo, NY.
¹⁴ Department of Epidemiology, TNO Quality of Life, Zeist, the Netherlands.
¹⁵ Northern California Cancer Center, Fremont, CA.
¹⁶ Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD.
¹⁷ Epidemiology and Surveillance Research, American Cancer Society, Atlanta, GA.
¹⁸ Department of Public Health Sciences, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada.
¹⁹ Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY.
²⁰ Department of Environmental Medicine, School of Medicine, New York University, New York, NY.
²¹ Department of Epidemiology and Health Promotion, National Public Health Institute, Helsinki, Finland.
²² Division of Nutritional Epidemiology, National Institute of Environmental Medicine, Karolinska Institute, Stockholm, Sweden.

Received for publication March 22, 2005; accepted for publication December 21, 2005.

Correspondence to Dr. Stephanie Smith-Warner, Department of Nutrition, Harvard School of Public Health, 665 Huntington Avenue, Boston, MA 02115 (e-mail: pooling@hsphsun2.harvard.edu).

Am J Epidemiol 2006;163:1053–1064

American Journal of Epidemiology, Vol. 163, No. 11

© 2006 by the Johns Hopkins Bloomberg School of Public Health

Advance Access publication April 19, 2006

With the growing number of epidemiologic publications on the relation between dietary factors and cancer risk, pooled analyses that summarize results from multiple studies are becoming more common. Here, the authors describe the methods being used to summarize data on diet-cancer associations within the ongoing Pooling Project of Prospective Studies of Diet and Cancer, begun in 1991. In the Pooling Project, the primary data from prospective cohort studies meeting prespecified inclusion criteria are analyzed using standardized criteria for modeling of exposure, confounding, and outcome variables. In addition to evaluating main exposure-disease associations, analyses are also conducted to evaluate whether exposure-disease associations are modified by other dietary and nondietary factors or vary among population subgroups or particular cancer subtypes. Study-specific relative risks are calculated using the Cox proportional hazards model and then pooled using a random- or mixed-effects model. The study-specific estimates are weighted by the inverse of their variances in forming summary estimates. Most of the methods used in the Pooling Project may be adapted for examining associations with dietary and nondietary factors in pooled analyses of case-control studies or case-control and cohort studies combined.

cohort studies; diet; epidemiologic methods; meta-analysis; neoplasms

The growing number of epidemiologic publications on the relation between diet and cancer risk has heightened the need for methods of summarizing results from multiple studies. These methods include qualitative reviews and quantitative summaries such as meta-analyses of the published literature and pooled analyses of the primary data (also called meta-analyses of individual data) (1). A general framework for conducting pooled analyses entails 1) formulating study inclusion criteria; 2) identifying all potential studies meeting these criteria; 3) obtaining each study’s primary data; 4) creating a standardized database; 5) estimating study-specific exposure-disease associations; 6) examining whether the study-specific results are heterogeneous; 7) calculating pooled estimates, if applicable; and 8) conducting sensitivity analyses to evaluate whether the estimates are robust (2). There are many advantages to reanalyzing the primary data from multiple studies rather than extracting the study-specific relative risks from published articles (1–5). In a pooled analysis, the modeling of the exposure, confounding, and outcome variables, the choice of which variables to control for, and the type of analysis conducted can be standardized, thereby removing potential sources of heterogeneity across studies. Because of larger sample sizes, pooled analyses also offer investigators the opportunity to examine uncommon exposures, rare diseases, and variation in associations among population subgroups with greater statistical power than is possible in individual studies.

The pooling of data from observational studies has become more common recently (6–13). Summary estimates have been calculated using a weighted average of the study-specific estimates (8, 9, 11) or by combining studies into a single data set for the analysis (6, 7, 10, 12, 13). In this paper, we describe the methods that are being used within the ongoing Pooling Project of Prospective Studies of Diet and Cancer (the Pooling Project), an international consortium of cohort studies with the goal of providing the best available summary of data on associations between diet and cancer (14–30). Most of these methods can also be adapted to examine associations in pooled analyses of case-control studies or both case-control and cohort studies combined.

INCLUSION CRITERIA

To maximize the quality and comparability of the studies in the Pooling Project, we formulated several inclusion criteria a priori. First, we include prospective studies that 1) had at least one publication on the relation between diet and cancer; 2) used a dietary assessment method that was of sufficient detail to calculate intakes of most nutrients, including energy, and that assessed intake over a period of months or years; and 3) assessed the validity of their dietary assessment method or a closely related instrument. Second, for each cancer site evaluated, we specify a minimum number of cases required for a study to be included in the analysis. Additional inclusion criteria may also be specified for each cancer site. Third, for each analysis, we include only those studies that assessed the specified exposure and in which participants consumed the dietary item of interest. For analyses that are under way simultaneously in the Pooling Project and the European Prospective Investigation into Cancer and Nutrition (31), we intend to coordinate analyses so that, to the extent possible, we can use similar analytic approaches and provide comparable results.

COMPONENT STUDIES

Sixteen studies (32–46) are currently included in the Pooling Project (table 1). As we become aware of new studies meeting the inclusion criteria, the investigators from those studies are invited to join the Project. The Canadian National Breast Screening Study and the Netherlands Cohort Study are each analyzed as case-cohort studies (47), because the investigators in these two studies each selected a random sample of the cohort to provide the person-time data for the cohort and have processed questionnaires for only this random sample and the cases. We divide the person-time and numbers of cases compiled during follow-up of the Nurses’ Health Study into two segments to take advantage of the expanded food frequency questionnaire administered in 1986 as compared with 1980. In this paper, we refer to the follow-up period from 1980 to 1986 as “Nurses’ Health Study A”; the follow-up period beginning in 1986 is referred to as “Nurses’ Health Study B.” Following standard survival data analysis theory, blocks of person-time in different time periods are asymptotically uncorrelated, regardless of the extent to which they are derived from the same people (48, 49). Thus, pooling of the estimates from these two time periods produces estimates and standard errors that are as valid as those from a single time period.

Data collection

The investigators in each Pooling Project study send their primary data on select variables to the Harvard School of Public Health (Boston, Massachusetts). There we inspect the data for completeness and resolve inconsistencies with the investigators of each study.

Each study used a food frequency questionnaire or diet history instrument that was designed and pretested in its specific study population or a similar population (P. L. Horn-Ross, unpublished data; V. Krogh, unpublished data; A. Wolk, unpublished data) (50–59) (table 1). Although the numbers of items included in the food frequency questionnaires varied over fivefold across the studies (table 1), the study-specific correlation coefficients comparing the food frequency questionnaire used in each cohort or a closely related instrument with multiple dietary records or 24-hour recalls generally exceed 0.40 for total fat, dietary fiber, and several micronutrients (P. L. Horn-Ross, unpublished data; V. Krogh, unpublished data; A. Wolk, unpublished data) (50–59) (table 2).

Information on nondietary risk factors was collected at baseline in each study using self-administered questionnaires. For measured covariates, the proportion of missing data for nondietary risk factors is generally low across studies (table 3). The exception is the Swedish Mammography Cohort, in which some covariate information was available for only one of the two counties in the study.

Case ascertainment

Incident cancer diagnoses are identified through follow-up questionnaires, with subsequent medical record review (37, 44, 46), linkage with cancer registries (32, 36, 39–42, 45), or both (33–35, 38, 43). In addition, investigators in some studies ascertain incident and/or fatal outcomes using mortality registries (32, 34, 35, 37–39, 41–46). Case ascertainment has generally been estimated to be greater than 90 percent in each study (table 1).

STATISTICAL APPROACHES AND RATIONALE

For each cohort, after applying the exclusion criteria used in that study, we further exclude participants who reported $\log_e$-transformed energy intakes beyond three standard deviations from the study-specific $\log_e$-transformed mean energy intake of the baseline population (or subcohort, for the case-cohort studies) or who reported a history of cancer (except nonmelanoma skin cancer) at baseline. Additional exclusion criteria may be applied for analyses of specific cancer sites. Because many cancers appear to have hormonal antecedents and because lifestyle factors may differ between women and men, studies including both women and men are split into two studies: a cohort of women and a cohort of men. This conservative approach, in which all estimates are calculated separately for women and men in those studies including both sexes, allows for potential effect modification by sex for every determinant of the outcome.
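The energy-intake exclusion described above can be sketched as follows. This is a minimal illustration, not the Project's own code: the function name `within_energy_limits` and its arguments are hypothetical, and the mean and standard deviation are computed from whatever list is passed in (for a case-cohort study, that would be the subcohort).

```python
import math
import statistics


def within_energy_limits(energy_intakes, n_sd=3.0):
    """Flag participants whose log_e energy intake lies within n_sd standard
    deviations of the study-specific mean log_e energy intake.

    energy_intakes: positive reported energy intakes for one study.
    Returns one boolean per participant (True = keep, False = exclude).
    """
    logs = [math.log(e) for e in energy_intakes]
    mean = statistics.fmean(logs)
    sd = statistics.stdev(logs)  # sample SD; an assumption of this sketch
    return [abs(v - mean) <= n_sd * sd for v in logs]
```

The flags would be computed separately within each study (and within each sex, where a study is split), so that the limits stay study specific.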

Follow-up time is calculated for each participant from the date on which his/her baseline questionnaire was returned to the date of diagnosis of the specific cancer being examined, the date of death, the date on which the participant moves out of the study area (if applicable), or the end of follow-up, whichever comes first.
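As a small sketch of this rule, the function below takes the earliest of the candidate exit dates. The function name, the argument names, and the conversion to years using 365.25 days are assumptions made for illustration only.

```python
from datetime import date


def follow_up_years(baseline, end_of_follow_up, diagnosis=None,
                    death=None, moved_out=None):
    """Return (follow-up time in years, is_case) for one participant.

    Follow-up ends at the earliest of diagnosis, death, moving out of the
    study area, or the administrative end of follow-up. Dates that do not
    apply are passed as None.
    """
    candidates = [d for d in (diagnosis, death, moved_out, end_of_follow_up)
                  if d is not None]
    exit_date = min(candidates)
    # The participant counts as a case only if diagnosis is the exit event.
    is_case = diagnosis is not None and exit_date == diagnosis
    years = (exit_date - baseline).days / 365.25  # assumed year length
    return years, is_case
```

For example, `follow_up_years(date(1986, 1, 1), date(1996, 1, 1), diagnosis=date(1990, 1, 1))` stops the clock at the diagnosis date and marks the participant as a case.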

In our analyses, we create standardized categories for most confounding variables across studies. We create a missing-data indicator variable for missing responses for each measured confounder in a study, if applicable. As long as 1) the association between the confounding variable and the exposure of interest is weak, or the association between the confounding variable and the outcome is weak, or the confounding variable has little variability in the study and 2) the percentage of missing data within the study is low, the use of the missing-data indicator method is likely to improve efficiency without introducing appreciable bias in comparison with the complete case method (60, 61). As table 3 shows, the proportion of missing data for each covariate across studies is generally low, satisfying one of the conditions for valid use of the missing-data indicator method. In addition, potentially confounding factors generally have had moderate-to-weak associations with the cancer sites we have examined and have had low-to-moderate correlations with the dietary exposures that are of primary interest in the Pooling Project. Information on age, which is typically the strongest measured risk factor for cancer incidence, is never missing in the constituent studies.
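The missing-data indicator method for a categorical confounder can be illustrated with a short sketch. The function name `dummy_code`, the `is_` prefix, and the example category labels are hypothetical; the idea shown is only that missing responses get their own indicator instead of causing the participant to be dropped.

```python
def dummy_code(values, reference):
    """Dummy-code a categorical confounder with a missing-data indicator.

    values: category labels, with None for a missing response.
    reference: the label used as the reference category.
    Returns one dict of 0/1 indicators per participant.
    """
    levels = sorted({v for v in values if v is not None and v != reference})
    rows = []
    for v in values:
        row = {f"is_{level}": int(v == level) for level in levels}
        row["is_missing"] = int(v is None)  # extra indicator for missing data
        rows.append(row)
    return rows
```

With `dummy_code(["never", "current", None, "past"], reference="never")`, the third participant has all category indicators set to 0 and `is_missing` set to 1, so that participant still contributes to the estimation of the exposure effect.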

Two-stage analysis

Our analytic approach generally is a two-stage process. In the first step, we calculate study-specific relative risks using the Cox proportional hazards model (49), defined through the hazard function $h$ by

$$h_{jks}(t \mid u_{si}, x_{si}) = h_{0jks}(t)\,\exp(\alpha_s^{\top} u_{si} + \beta_s^{\top} x_{si}) \qquad (1)$$

for $s = 1, \ldots, S$, where $s$ is the study number, $t$ is follow-up time, $u_{si}$ and $x_{si}$ are the study-specific confounding and exposure variables, respectively, for individual $i$ in study $s$, and $h_{0jks}(t)$ is the baseline incidence rate at age $j$ (in years), in calendar year $k$, and for time since entry into the study $t$. The estimated study-specific log relative risks for a one-unit increase in the exposures, $x_{si}$, are given by the $\hat{\beta}_s$. The study-specific log relative risks for a one-unit increase in the confounding variables, $u_{si}$, are given by the $\hat{\alpha}_s$. Stratifying jointly by age at baseline (years) and the year in which the baseline questionnaire was returned (indexed by $j$ and $k$, respectively) and treating follow-up time (in years) as the time metric in the Cox model is equivalent to treating age as the time metric in the Cox model and stratifying jointly on calendar time (in years) and duration of time in the study, with one exception: There is a difference in which two-way interactions are allowed. With our approach, no assumptions are made about the shape of the age or calendar-year incidence curves, each of which is fully adjusted for the other, and arbitrary two-way interactions of the joint dependency of the outcome on age and calendar time are allowed. Each case-cohort study is analyzed using EPICURE software (HiroSoft International Corporation, Seattle, Washington) (47, 62); each remaining study is analyzed using SAS PROC PHREG (SAS Institute, Inc., Cary, North Carolina) (63).

If case-control studies were included in our pooled analyses, the model for these studies would be similar to equation 1, except that we would stratify the participants by

TABLE 1. Characteristics of the studies included in the Pooling Project of Prospective Studies of Diet and Cancer, 1991–2004

| Study | Study population | Location | Study dates | Baseline cohort size*: women | Baseline cohort size*: men | Age (years) at baseline | Food frequency questionnaire/diet history instrument: no. of items | Time frame | Components measured | Outcome ascertainment | Estimated case ascertainment rate (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Adventist Health Study (33) | Non-Hispanic White men and women living in Seventh-Day Adventist households | California, United States | 1976–1982 | 18,403 | 12,896 | >24 | 46 | Past year | Frequency | FQs†/MRR†; cancer registry; mortality registry | >99 |
| Alpha-Tocopherol, Beta-Carotene Cancer Prevention Study (34) | Male smokers who participated in a randomized double-blind placebo-controlled clinical trial of α-tocopherol and β-carotene supplement use | Southwestern Finland | 1985 onward (ongoing) | 0 | 26,987 | 50–69 | 276 | Past year | Frequency and portion size | FQs/MRR; cancer registry; mortality registry | 100 |
| Breast Cancer Detection Demonstration Project Follow-up Cohort (35) | Subset of women participating in a breast cancer screening program in 1973–1980 who had been diagnosed with breast cancer or had undergone or been recommended to receive a breast biopsy, plus a random sample of the remaining women who had been screened | United States | 1987 onward (ongoing) | 41,987 | 0 | 40–93 | 62 | Past year | Frequency and portion size | FQs/MRR; cancer registry; mortality registry | |
| California Teachers Study (45) | Active and retired female teachers and administrators participating in the California State Teachers Retirement System | California, United States | 1995 onward (ongoing) | 100,036 | 0 | 21–103 | 103 | Past year | Frequency and portion size | Cancer registry; mortality registry | >97‡ |
| Canadian National Breast Screening Study (36) | Women who participated in a multicenter randomized controlled trial of mammography screening for female breast cancer | Canada | 1980 onward (ongoing) | 56,837 | 0 | 40–59 | 86 | Past month | Frequency and portion size | Cancer registry | 100 |
| Cancer Prevention Study II Nutrition Cohort (38) | Subset of men and women participating in Cancer Prevention Study II who completed a diet questionnaire in 1992 | United States | 1992 onward (ongoing) | 74,053 | 66,090 | 50–74 | 68 | Past year | Frequency and portion size | FQs/MRR; cancer registry; mortality registry | >90 |
| Health Professionals Follow-up Study (37) | Male dentists, optometrists, osteopathic physicians, podiatrists, pharmacists, and veterinarians | United States | 1986 onward (ongoing) | 0 | 47,673 | 40–75 | 131 | Past year | Frequency of specified portions | FQs/MRR; mortality registry | >94 |
| Iowa Women’s Health Study (41) | Postmenopausal women selected randomly from the 1985 Department of Transportation’s driver’s license list in Iowa | Iowa, United States | 1986 onward (ongoing) | 34,603 | 0 | 55–69 | 116 | Past year | Frequency of specified portions | Cancer registry; mortality registry | 98§ |
| Netherlands Cohort Study (40) | Men and women from 204 municipal population registries throughout the Netherlands | The Netherlands | 1986 onward (ongoing) | 62,573 | 58,279 | 55–69 | 150 | Past year | Frequency and portion size | Cancer registry; pathology database | >95 |
| New York State Cohort (42) | Male and female residents who had had the same address and telephone number for the previous 18 years | New York, United States | 1980–1987 | 22,550 | 30,363 | 50–93 | 45 | Past year | Frequency | Cancer registry | —¶ |
| New York University Women’s Health Study (43) | Women visiting a breast screening clinic who had not used any hormonal medications or been pregnant in the previous 6 months | New York, United States | 1985 onward (ongoing) | 13,258 | 0 | 34–65 | 71 | Past year | Frequency and portion size | FQs/MRR; cancer registry; mortality registry | |
| Nurses’ Health Study A (37) | Female registered nurses | United States | 1980–1986 | 88,651 | 0 | 34–59 | 61 | Past year | Frequency of specified portions | FQs/MRR; mortality registry | >94 |
| Nurses’ Health Study B (37) | Female registered nurses | United States | 1986 onward (ongoing) | 68,540 | 0 | 40–65 | 131 | Past year | Frequency of specified portions | FQs/MRR; mortality registry | >94 |
| Nurses’ Health Study II (46) | Female registered nurses | United States | 1991 onward (ongoing) | 93,894 | 0 | 26–46 | 133 | Past year | Frequency of specified portions | FQs/MRR; mortality registry | >90 |
| Prospective Study on Hormones, Diet and Breast Cancer (39) | Female volunteers recruited from the general population using mass media advertising and from breast cancer prevention units | Varese Province, Italy | 1987 onward (ongoing) | 9,027 | 0 | 35–69 | 177 | Past year | Frequency and portion size | Cancer registry; mortality registry; admissions and discharge reports; pathology database | >97 |
| Swedish Mammography Cohort (32) | Women who participated in a population-based mammography screening program | Västmanland and Uppsala counties, Sweden | 1987 onward (ongoing) | 61,463 | 0 | 40–74 | 67 | Past 6 months | Frequency | Cancer registry | 98 |
| Women’s Health Study (44) | Female health professionals who participated in a randomized, double-blind, placebo-controlled trial of low-dose aspirin, β-carotene, and vitamin E use | United States | 1993 onward (ongoing) | 38,384 | 0 | ≥45 | 131 | Past year | Frequency of specified portions | FQs/MRR | 96 |

* The baseline cohort size corresponds to the number of participants in the Pooling Project database for the renal cell cancer analyses in the California Teachers Study (45) and for the colorectal cancer analyses in the remaining studies.
† FQs, follow-up questionnaires; MRR, medical record review.
‡ For California residents only.
§ For Iowa residents only.
¶ Cancer outcomes in the New York State Cohort (42) were identified through linkage with a cancer registry; thus, it is difficult to determine the follow-up rate in the cohort. When a subset of the cohort was followed intensively, loss to follow-up was not related to exposure.

TABLE 2. Correlation coefficients (CCs) for nutrient intakes estimated using a food frequency questionnaire versus a comparison method for studies in the Pooling Project of Prospective Studies of Diet and Cancer, 1991–2004*

Nutrient columns: total fat; saturated fat; monounsaturated fat; polyunsaturated fat; dietary fiber; alcohol; vitamin; folate†; calcium†

| Study | Sex | No. of participants | Comparison method | Type of CC | Nutrient CCs |
|---|---|---|---|---|---|
| Adventist Health Study (50) | Women | 103 | Five 24-hour recalls over 6 months | Spearman CCs‡ | 0.40 0.45 0.41 0.26 0.47§ |
| Adventist Health Study (50) | Men | 44 | Five 24-hour recalls over 6 months | Spearman CCs‡ | 0.38 0.57 0.29 0.15 0.50§ |
| Alpha-Tocopherol, Beta-Carotene Cancer Prevention Study (51) | Men | 178 | Twelve 2-day diet records over 6 months | Energy-adjusted, deattenuated Pearson CCs | 0.75 0.79 0.68 0.85 0.82 0.85 0.68 0.71 0.82 0.74 |
| California Teachers Study (unpublished) | Women | 185 | Four 24-hour recalls over 10 months | Energy-adjusted, deattenuated Pearson CCs | 0.64 0.82 0.41 0.23 0.77 0.82 0.35¶ 0.62¶ 0.82¶ 0.73¶ 0.30¶ |
| Canadian National Breast Screening Study (52) | Women | 108 | 7-day diet records | Energy-adjusted Pearson CCs | 0.44 0.61 0.43# 0.40‡‡ 0.60 0.59 0.60 0.53 0.67 |
| Cancer Prevention Study II Nutrition Cohort (53) | Women | 188‡ | Four 24-hour recalls over 1 year | Energy-adjusted, deattenuated Pearson CCs | 0.66 0.66 0.58# 0.42‡‡ 0.61 0.77 0.65 0.27 0.43 0.66 |
| Cancer Prevention Study II Nutrition Cohort (53) | Men | 229‡ | Four 24-hour recalls over 1 year | Energy-adjusted, deattenuated Pearson CCs | 0.58 0.64 0.61# 0.48‡‡ 0.64 0.82 0.65 0.23 0.51 0.57 |
| Health Professionals Follow-up Study (54) | Men | 127 | Two 7-day diet records over 6 months | Energy-adjusted, deattenuated Pearson CCs | 0.67 0.75 0.68 0.37 0.68 0.86‡,** 0.61 0.77 0.42 0.70 0.60 |
| Iowa Women’s Health Study (55) | Women | 44 | Five 24-hour recalls over 2 months | Energy-adjusted Pearson CCs | 0.62 0.59 0.62 0.43 0.24§ 0.32 0.14 0.53 0.79 0.26 0.49 |
| Netherlands Cohort Study (56) | Women and men | 109 | Three 3-day diet records over 1 year | Energy- and sex-adjusted deattenuated Pearson CCs | 0.53 0.58†† 0.80 0.79 0.86 0.76 0.58 0.66 |
| New York State Cohort | Women (unpublished) | 190 | Simulated study | Energy-adjusted, deattenuated Pearson CCs‡ | 0.21 0.18 0.41 0.25 0.53 0.16 |
| New York State Cohort | Men (57) | 127 | Simulated study | Energy-adjusted, deattenuated Pearson CCs | 0.57 0.60 0.61 0.22 0.65 0.39 0.76 0.46 0.60 |
| Nurses’ Health Study A (58) | Women | 173 | Four 7-day diet records over 1 year | Energy-adjusted Pearson CCs | 0.53 0.59 0.48 0.58§ 0.90‡,** 0.36 0.66 |
| Nurses’ Health Study B (59) | Women | 191 | Two 7-day diet records over 1 year | Energy-adjusted, deattenuated Pearson CCs | 0.57 0.68 0.58 0.48 0.79 0.76 0.75 |

study-specific matching factors. Since the Cox model and the conditional logistic regression model produce algebraically identical log-(partial)-likelihood functions, SAS PROC PHREG could also be used for case-control studies to estimate study-specific odds ratios and their standard errors.

The second step consists of pooling the study-specific relative risks using a random-effects model (64–66) given by

$$\hat{\beta}_s = \beta + b_s + \varepsilon_s, \qquad (2)$$

where the $\hat{\beta}_s$ are the estimated study-specific exposure-disease effects, $\beta$ is the underlying common exposure-disease association, $b_s$ are the random between-studies effects, and $\varepsilon_s$ are the within-study errors. Both $b_s$ and $\varepsilon_s$ are assumed to be independent and asymptotically normally distributed with means of zero and variances of $\sigma_B^2$ and $\sigma_s^2$, respectively, and $\sigma_s^2 = \mathrm{Var}(\hat{\beta}_s)$. The study-specific exposure-disease effects are weighted by the inverse of their variances using

$$\hat{\beta} = \sum_{s=1}^{S} w_s \hat{\beta}_s, \quad \text{where} \quad w_s = \frac{(\hat{\sigma}_B^2 + \hat{\sigma}_s^2)^{-1}}{\sum_{t=1}^{S} (\hat{\sigma}_B^2 + \hat{\sigma}_t^2)^{-1}}.$$

When the exposure variable is categorized into different levels, we calculate a pooled relative risk for each category separately.

We test for the statistical significance of between-studies heterogeneity among the study-specific exposure-disease estimates using the Q test statistic given by

$$Q = \sum_{s=1}^{S} w_s (\hat{\beta}_s - \hat{\beta})^2, \qquad (3)$$

where, in this statistic, $w_s = [\mathrm{Var}(\hat{\beta}_s)]^{-1}$. The Q test statistic follows an approximate $\chi^2_{S-1}$ distribution (66, 67).
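The second-stage pooling and the heterogeneity test can be sketched together in a few lines. This is an illustration under stated assumptions, not the Project's implementation: the between-studies variance is estimated here with the DerSimonian-Laird moment estimator, Q is computed around the inverse-variance weighted mean, and the names `pool_random_effects` and `chi2_sf` are hypothetical. At least two studies are required.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1,
    built up with the incomplete-gamma recurrence (standard library only)."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def pool_random_effects(betas, variances):
    """Pool study-specific log relative risks with inverse-variance weights.

    betas: study-specific log relative risks.
    variances: their estimated within-study variances.
    """
    w = [1.0 / v for v in variances]                 # weights used in Q
    sum_w = sum(w)
    beta_fixed = sum(wi * b for wi, b in zip(w, betas)) / sum_w
    q = sum(wi * (b - beta_fixed) ** 2 for wi, b in zip(w, betas))
    df = len(betas) - 1
    # Moment estimator of the between-studies variance (assumed here).
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)
    w_re = [1.0 / (tau2 + v) for v in variances]     # random-effects weights
    sum_w_re = sum(w_re)
    beta = sum(wi * b for wi, b in zip(w_re, betas)) / sum_w_re
    return {
        "beta": beta,
        "se": math.sqrt(1.0 / sum_w_re),
        "relative_risk": math.exp(beta),
        "q": q,
        "df": df,
        "p_heterogeneity": chi2_sf(q, df),
        "tau2": tau2,
    }
```

When the estimated between-studies variance is zero, the random-effects weights reduce to the plain inverse-variance weights, so the pooled estimate coincides with the fixed-effect one. For a categorized exposure, the function would be called once per category.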

For the exposures of interest, we generally categorize participants into study-specific quantiles. Because the quantile approach does not take into account true differences in the distribution of population intakes across studies, we also create categories defined by identical absolute intake cutpoints across studies. Misclassification can also occur in the analyses based on identical absolute intake cutpoints, because reported intakes may differ across studies based on differences in the dietary assessment methods used. However, when possible we adjust our results for measurement error in the individual studies.
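The two categorization schemes can be contrasted with a short sketch. The function names, the use of quintiles, the "inclusive" quantile method, and the rule that a value equal to a cutpoint falls in the higher category are all assumptions made for illustration.

```python
import statistics
from bisect import bisect_right


def study_specific_cutpoints(intakes, n_groups=5):
    """Quantile cutpoints computed within one study (n_groups - 1 values)."""
    return statistics.quantiles(intakes, n=n_groups, method="inclusive")


def categorize(intake, cutpoints):
    """Return the 0-based category index of an intake given sorted cutpoints."""
    return bisect_right(cutpoints, intake)
```

With study-specific quantiles, the same absolute intake can fall in the top category of a low-intake study and the bottom category of a high-intake study. With identical absolute cutpoints, the same list (for example, a hypothetical `[50, 100, 150]`) is passed to `categorize` for every study, so a given intake always maps to the same category.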

Aggregated analysis

We can also conduct analyses in which the data from all studies are combined into one data set (referred to as an aggregated analysis). A single exposure-disease effect is then calculated using the Cox proportional hazards model, including stratification by study, age at baseline, and the year in which the baseline questionnaire was returned.

TABLE 2. Continued

| Study | Sex | No. of participants | Comparison method | Type of CC | Nutrient CCs |
|---|---|---|---|---|---|
| Prospective Study on Hormones, Diet and Breast Cancer (unpublished) | Women | 104 | Fourteen 24-hour recalls over 1 year | Spearman CCs | 0.39 0.49 0.40 0.42 0.53 0.88 0.19 0.37 0.37 0.32 0.48 |
| Swedish Mammography Cohort (unpublished) | Women | 129 | Four 7-day diet records over 1 year | Energy-adjusted, deattenuated Pearson CCs | 0.49 0.42 0.51 0.36 0.54 0.85 0.49 0.33 0.29 0.48 0.48 |
| Median | | | | | 0.55 0.60 0.55 0.40 0.64 0.84 0.45 0.65 0.37 0.46 0.60 |

* A blank cell means that the investigators did not evaluate this nutrient in their validation study. The studies not mentioned in this table used questionnaires that were very similar to those that had been validated previously by other investigators. The Breast Cancer Detection Demonstration Project Follow-up Cohort (35) and the New York University Women’s Health Study (43) both used food frequency questionnaires similar to the food frequency questionnaire used in the Cancer Prevention Study II Nutrition Cohort (53). Nurses’ Health Study II (46) and the Women’s Health Study (44) both used food frequency questionnaires similar to the food frequency questionnaire used in Nurses’ Health Study B (59).
† Intake from foods only; supplemental intake was not included.
‡ The data presented were calculated using the validation study data that were sent to the Pooling Project.
§ Crude fiber.
¶ Correlations among nonusers of supplements only (n = 44).
# Oleic acid.
** Spearman CC.
†† Energy- and sex-adjusted Pearson CC.
‡‡ Linoleic acid.

Although combining the data from all studies is one way to take advantage of differences in the distributions of the exposure variable across studies, it assumes that the exposure was measured in comparable ways across studies. Because the distributions of dietary variables may differ across studies due to true differences in actual intake and due to differences in the dietary assessment methods used (and other study-specific sources of error), this assumption may not be reasonable, except for nutrients that come from a small number of food sources (e.g., alcohol). In addition, combining the studies into one data set assumes that there is no between-studies heterogeneity in the associations of the outcome with the exposure or any of the covariates. In the few instances where we have conducted both pooled and aggregated analyses, the results have been essentially identical (16, 25, 30). Nevertheless, because it is difficult to test the underlying assumptions, we have opted to use two-stage analyses as our primary analytic strategy.

Trend analysis

To test the significance of trends in disease risk over exposure categories, we conduct separate analyses in which participants are assigned the study-specific median value of their respective category (given by $\mathrm{med}_{sj}$ for $j = 1, \ldots, J$, where $J$ is the number of levels in which the exposure variable is categorized). For each study, we fit a Cox proportional hazards model with regression terms $\beta_s z_{si}$ for $s = 1, \ldots, S$, where $s$ is the study number and $z_{si}$ takes on the values $\mathrm{med}_{sj}$ corresponding to the category in which the individual’s exposure value falls. We then compute the pooled estimate for the regression coefficient for trend using a random-effects model (64–66). The pooled test for trend is a Wald test of the hypothesis $H_0\colon \beta = 0$. We test for the statistical significance of between-studies heterogeneity among the study-specific regression coefficients using the Q test statistic (66, 67).
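The pooling step can be illustrated with a minimal sketch. It assumes the moment estimator of DerSimonian and Laird (66) for the between-studies variance; the function name `pool_random_effects` and the two-study inputs are hypothetical and are not taken from the Pooling Project's software. The p value for Q is not computed here; Q would be compared with a chi-square distribution on the returned degrees of freedom.

```python
import math

def pool_random_effects(betas, ses):
    """Pool study-specific coefficients with a DerSimonian-Laird random-effects model.

    betas: study-specific trend coefficients (log relative risk per unit).
    ses:   their estimated standard errors.
    """
    k = len(betas)
    w = [1.0 / se ** 2 for se in ses]  # inverse-variance weights
    sw = sum(w)
    beta_fixed = sum(wi * b for wi, b in zip(w, betas)) / sw

    # Cochran's Q statistic for between-studies heterogeneity (k - 1 df)
    q = sum(wi * (b - beta_fixed) ** 2 for wi, b in zip(w, betas))

    # Moment estimate of the between-studies variance, truncated at zero
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 else 0.0

    # Random-effects weights add tau2 to each within-study variance
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    beta = sum(wi * b for wi, b in zip(w_re, betas)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    z = beta / se  # Wald statistic for H0: beta = 0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    return {"beta": beta, "se": se, "z": z, "p": p, "Q": q, "df": k - 1, "tau2": tau2}


# Hypothetical inputs: two studies with equal precision but different slopes
result = pool_random_effects([0.0, 0.4], [0.1, 0.1])
print(result)
```

When the study-specific coefficients disagree, the estimated between-studies variance grows, the weights become more nearly equal, and the pooled standard error widens accordingly.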

We also evaluate whether associations between dietary factors and cancer risk are linear by comparing nonparametric regression curves using restricted cubic splines with the linear model using the likelihood ratio test, and by visual inspection of the restricted cubic spline graphs (68, 69). For these analyses, the studies are combined into a single data set stratified by study.

Evaluation of heterogeneity of effects

An advantage of a pooled analysis is the ability to evaluate whether the exposure-disease association is modified by other risk factors. In these analyses, if the exposure-disease association is log-linear and the potential effect modifier is an ordinal or binary variable, we first compute estimates of the exposure-disease association and their standard errors for each study within each category of the potential effect modifier. The model uses the same format as equation 1, but

$$h_{jksl}(t \mid x_{si}, u_{si}) = h_{0jksl}(t)\exp\left(\alpha_{sl}^{T} u_{si} + \beta_{sl} x_{si}\right), \tag{4}$$

where $l = 1, \ldots, L$ indexes the levels of the effect modifier, $x_{si}$ is the study-specific exposure variable, and $\alpha_{sl}$ are the estimated study-specific log relative risks for a one-unit increase in the confounding variables, $u_{si}$. The study-specific estimates for each stratum are then pooled across studies and exponentiated to obtain the relative risk for each level of the potential effect modifier. For assessment of the statistical significance of the interaction, the Cox proportional hazards model is

$$h_{jks}(t \mid x_{si}, m_{si}, u_{si}) = h_{0jks}(t)\exp\left(\alpha_{s}^{T} u_{si} + \xi_{s} m_{si} + \beta_{s} x_{si} + \gamma_{s} m_{si} x_{si}\right), \tag{5}$$

where $\gamma_s$ is the study-specific estimate for the cross-product term of the potential effect modifier variable ($m_{si}$) times the exposure variable ($x_{si}$) and $\xi_s$ is the study-specific main effect of the effect modifier. The study-specific estimates $\hat{\gamma}_s$ are then pooled across studies, and the p value corresponding to the test for interaction ($H_0\colon \gamma = 0$) is obtained from a Wald test based upon the pooled $\hat{\gamma}$.

TABLE 3. Prevalences of missing data for select nondietary factors across studies in the Pooling Project of Prospective Studies of Diet and Cancer, 1991–2004

Factor | No. of studies in which the factor was measured* | % of missing data across studies (range) | Studies with <5% missing data, No. | Studies with <5% missing data, %

Age | 17 | 0 | 17 | 100

Education | 17† | 0–23 | 14 | 82

Body mass index | 17 | 0–8 | 15 | 88

Smoking status | 15 | 0–5 | 15 | 100

Physical activity | 14 | 0–12 | 11 | 79

Multivitamin use | 14 | 0–8 | 10 | 71

Age at menarche‡,§ | 14 | 0–3 | 14 | 100

Parity‡ | 15 | 0–10 | 13 | 87

Menopausal status‡,§ | 14 | 0–18 | 8 | 57

Oral contraceptive use‡,§ | 13 | 0–21 | 11 | 85

Postmenopausal hormone use‡,§,¶ | 13 | 0–16 | 12 | 92

* For this table, Nurses’ Health Study A (1980–1986) and Nurses’ Health Study B (1986–present) were counted as two separate studies (see Materials and Methods).

† All participants in the California Teachers Study, the Health Professionals Follow-up Study, the Nurses’ Health Study, and Nurses’ Health Study II were assumed to have received additional education after graduating from high school, because these populations were selected on the basis of their employment in occupations requiring a post-high-school education.

‡ Only cohort studies including women are included here. The prevalence of missing data was calculated only among the female participants.

§ For the Swedish Mammography Cohort, only the percentage of missing data for women living in Uppsala County is included, since these data were not collected for women living in Vastmanland County.

¶ Among postmenopausal women only.

We use a mixed-effects meta-regression model (70) to test for effect modification when the exposure-disease association is nonlinear, when the potential effect modifier is a polytomous nominal variable, or when effect modification can be assessed only between studies. As an example, consider the test for effect modification by gender. The model here is a slightly modified version of equation 2:

$$\hat{\beta}_s = \beta_0 + \beta_1 z_s + b_s + \varepsilon_s \tag{6}$$

for $s = 1, \ldots, S$, where $s$ is the study number, the $\hat{\beta}_s$ are the estimated study-specific exposure-disease effects, $\beta_0$ is the log relative risk for the exposure in the reference level of the modifier (here, men), $\beta_1$ is the difference in the log relative risks between the reference level and each of the other levels (here, between genders), $z_s = 1$ if study $s$ is carried out among women and $z_s = 0$ if it is carried out among men, $b_s$ are the study-specific random effects, and $\varepsilon_s$ are the within-study sampling errors. The Wald test statistic based on the estimate $\hat{\beta}_1$ and its standard error is used to test the null hypothesis ($H_0\colon \beta_1 = 0$) that there is no modification of the effect of exposure on the outcome by levels of the potential effect modifier (here, between genders).
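For a binary between-studies modifier such as gender, the model in equation 6 can be sketched in a few lines. This is an illustration under stated assumptions, not the estimation procedure of reference 70: the between-studies variance is estimated here by a method-of-moments formula rather than by likelihood methods, and the function names and the four-study inputs are hypothetical. With a single 0/1 covariate, the weighted regression reduces to a difference between two weighted group means.

```python
import math

def _weighted_summary(betas, ses, tau2=0.0):
    """Return the weighted mean, the sum of weights, the residual Q, and the moment term."""
    w = [1.0 / (se ** 2 + tau2) for se in ses]
    sw = sum(w)
    mean = sum(wi * b for wi, b in zip(w, betas)) / sw
    q = sum(wi * (b - mean) ** 2 for wi, b in zip(w, betas))
    c = sw - sum(wi ** 2 for wi in w) / sw
    return mean, sw, q, c


def meta_regression_binary(betas, ses, z):
    """Fit beta_hat_s = beta0 + beta1 * z_s + b_s + e_s for a 0/1 study-level covariate z."""
    ref = [(b, se) for b, se, zi in zip(betas, ses, z) if zi == 0]
    oth = [(b, se) for b, se, zi in zip(betas, ses, z) if zi == 1]
    if not ref or not oth:
        raise ValueError("each level of z needs at least one study")
    ref_b, ref_se = zip(*ref)
    oth_b, oth_se = zip(*oth)

    # Residual heterogeneity around the two group means (fixed-effect weights)
    _, _, q0, c0 = _weighted_summary(ref_b, ref_se)
    _, _, q1, c1 = _weighted_summary(oth_b, oth_se)
    df = len(betas) - 2  # number of studies minus the two regression parameters
    tau2 = max(0.0, (q0 + q1 - df) / (c0 + c1)) if df > 0 else 0.0

    # Refit with random-effects weights; with a binary z this is two group means
    m0, sw0, _, _ = _weighted_summary(ref_b, ref_se, tau2)
    m1, sw1, _, _ = _weighted_summary(oth_b, oth_se, tau2)
    beta1 = m1 - m0
    se1 = math.sqrt(1.0 / sw0 + 1.0 / sw1)
    wald = beta1 / se1  # Wald statistic for H0: beta1 = 0
    p = math.erfc(abs(wald) / math.sqrt(2.0))  # two-sided normal p value
    return {"beta0": m0, "beta1": beta1, "se_beta1": se1, "z": wald, "p": p, "tau2": tau2}


# Hypothetical inputs: two studies of men (z = 0) and two studies of women (z = 1)
fit = meta_regression_binary([0.1, 0.3, 0.5, 0.7], [0.1, 0.1, 0.1, 0.1], [0, 0, 1, 1])
print(fit)
```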

Assessment of heterogeneity by outcome subtype

We can also evaluate whether associations differ by cancer subtype. For these analyses, we fit separate Cox proportional hazards models (equation 1) for each subtype. Occurrences of the cancer under study that are of a different subtype are censored at their date of diagnosis. The relative risks obtained for each subtype that are estimated in this way are asymptotically uncorrelated (71–73). In addition, because these estimates are asymptotically normally distributed with variances given by the square of their respective estimated standard errors, any linear combination of the different estimates is normally distributed, and it follows from the Cramér-Wold device (74) that the multivariate vector obtained by combining all of the competing risk estimates is multivariate normal. The corresponding variances are in the diagonal of the covariance matrix, and zeroes are in the off-diagonal. To test the null hypothesis that there is no difference in the pooled exposure-disease parameters among the subtypes, we use a contrast test (75). For example, to test whether the pooled exposure-disease parameters differed among three subtypes, we would use the test statistic $Z$ given by

$$Z = (C\hat{\beta})^{T}\,(C\hat{\Sigma}C^{T})^{-1}\,(C\hat{\beta}), \tag{7}$$

where $C$ is a contrast matrix whose first and second rows are $(1, -1, 0)$ and $(1, 0, -1)$, $\hat{\beta}$ is the vector of the pooled estimates of the exposure-disease association for the different subtypes, and $\hat{\Sigma}$ is its estimated covariance matrix. The statistic in this example has an approximate $\chi^2$ distribution with 2 df (defined by the number of different subtypes minus 1) (75). These methods can also be used to construct tests for heterogeneity of effects between any set of cancers or other outcomes.
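The three-subtype contrast test of equation 7 can be written out directly. The sketch below assumes the contrast rows (1, −1, 0) and (1, 0, −1), so that each row compares the first subtype with one of the others, and a diagonal covariance matrix built from the squared standard errors; the function name and the inputs are hypothetical. Because the reference distribution has 2 df, the chi-square tail probability has the closed form exp(−Z/2).

```python
import math

def contrast_test_three_subtypes(betas, ses):
    """Contrast test for equality of three pooled, uncorrelated log relative risks.

    betas: pooled estimates for the three subtypes.
    ses:   their standard errors (the covariance matrix is assumed diagonal).
    """
    b1, b2, b3 = betas
    v1, v2, v3 = (se ** 2 for se in ses)

    # C @ beta with rows (1, -1, 0) and (1, 0, -1)
    d1, d2 = b1 - b2, b1 - b3

    # C @ Sigma @ C^T for a diagonal Sigma is a symmetric 2 x 2 matrix
    m11, m12, m22 = v1 + v2, v1, v1 + v3
    det = m11 * m22 - m12 * m12

    # Quadratic form d^T M^{-1} d using the explicit 2 x 2 inverse
    stat = (m22 * d1 * d1 - 2.0 * m12 * d1 * d2 + m11 * d2 * d2) / det

    # Upper tail of a chi-square distribution with 2 df
    p = math.exp(-stat / 2.0)
    return stat, p


# Hypothetical pooled estimates for three subtypes
stat, p = contrast_test_three_subtypes([0.3, 0.1, 0.1], [0.1, 0.1, 0.1])
print(stat, p)
```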

Measurement error correction

As with most exposures, measurement of dietary variables is not free from error. Measurement error in dietary data derives from normal within-person variation in intakes over time (76) and from errors associated with self-reports (77). Therefore, the relative risks will be biased, usually towards the null, but can be biased in either direction when there is also error in measuring confounding variables (78). One can use the validation data from each study to regress the "gold standard" (or an unbiased estimate of the gold standard, an "alloyed" gold standard (79)) on the error-prone measurement and confounding variables to obtain a correction factor. This correction factor can then be used to calibrate the uncorrected estimates of the exposure effect of interest obtained from logistic and Cox regression models (77, 79, 80). If the errors in the alloyed gold standard are correlated with the errors in the usual measure of dietary intake, the regression calibration method for measurement error correction will remove some, but not all, of the bias in the effect estimate (81). However, it appears that energy adjustment removes much of the bias in this method due to correlated errors for at least some dietary variables (e.g., protein) (82, 83). To remove the remaining bias, an additional method of assessment of intake is needed, such as a biomarker (81).

In the measurement error correction analyses, for each study, the true intake of the particular nutrient being evaluated or an unbiased estimate of the true intake (e.g., intakes calculated from several dietary records or 24-hour recalls) is regressed on the surrogate measurement of that nutrient (calculated from the food frequency questionnaire) to obtain the coefficient $\hat{\lambda}_s$ and its estimated standard error. We then derive the corrected estimate of the log relative risk as $\hat{\beta}^{*}_s = \hat{\beta}_s / \hat{\lambda}_s$, where $\hat{\beta}_s$ is the uncorrected estimated effect in each study from a logistic regression or Cox proportional hazards regression analysis. The standard error of $\hat{\beta}^{*}_s$ is derived using the delta method (84). One can simultaneously correct for the error in several covariates in all point estimates and their standard errors using a multivariate extension of measurement error correction (79, 85). The corrected coefficient estimates are then pooled into a summary estimate. If a study has poor validity of nutrient measurements, the variance of its corrected estimate will be large, and the study will thus have little weight when the study-specific results are pooled. In addition, under the required assumption that the dietary records and 24-hour recalls provide an unbiased estimate of nutrient intake (even if subject to random error), this approach calibrates the estimated relative risks to a common unit of measurement across studies, thereby adjusting for systematic errors due to differences in the food frequency questionnaires used in the various studies.
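A minimal sketch of the univariate correction follows. It assumes that the uncorrected log relative risk and the calibration slope are estimated independently (main study versus validation study), so that the first-order delta-method variance has no covariance term; the function name, the argument names, and the numbers are hypothetical.

```python
import math

def regression_calibration(beta, se_beta, lam, se_lam):
    """Correct a log relative risk for measurement error by regression calibration.

    beta, se_beta: uncorrected log relative risk and its standard error.
    lam, se_lam:   slope from regressing the reference intake on the questionnaire
                   intake in the validation study, and its standard error.
    """
    beta_corr = beta / lam

    # First-order delta method for a ratio of two independent estimates
    var_corr = se_beta ** 2 / lam ** 2 + (beta ** 2) * se_lam ** 2 / lam ** 4
    return beta_corr, math.sqrt(var_corr)


# Hypothetical study: attenuated estimate 0.2 and calibration slope 0.5
beta_corr, se_corr = regression_calibration(0.2, 0.1, 0.5, 0.05)
print(beta_corr, se_corr, math.exp(beta_corr))
```

A small calibration slope, or one that is imprecisely estimated, inflates the standard error of the corrected estimate, which is why a study with poor validity of nutrient measurements receives little weight in the pooled result.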

STRENGTHS AND LIMITATIONS

The Pooling Project of Prospective Studies of Diet and Cancer provides a large collection of data in which multiple diet-and-cancer hypotheses can be examined with greater statistical power than is available in any one study. Each study included in the Pooling Project is a prospective cohort study in which diet was assessed prior to development of disease, thereby limiting recall and selection biases. In the Pooling Project, we standardize the modeling of the exposure and confounding variables to remove potential sources of noncomparability and heterogeneity that occur in the published literature. We are able to examine associations over a wide range of intakes with greater precision than in the individual studies, because of the larger sample size and the different diets consumed across the populations. In addition, we can evaluate whether associations are modified by other factors and whether associations differ among cancer subtypes. Because inclusion of an individual study in a particular analysis is not dependent on whether those investigators have published findings on that association, publication bias does not affect our pooled analyses, as opposed to meta-analyses of the published literature, for which approximately half of the results may have some indication of publication bias (86). Finally, results from these pooled analyses may assist epidemiologists and other health professionals in synthesizing the vast amount of published data on specific diet-cancer associations.

A limitation of the Pooling Project is that it was planned retrospectively. Thus, there are differences in how the included studies were designed and implemented. First, the studies comprise populations from different geographic regions with different age ranges and education levels. However, these differences in study population characteristics may be considered a strength, particularly if the results are consistent across studies. Second, the dietary assessment methods used vary across studies, which may lead to artifactual differences in estimated intakes across studies, in addition to any true between-population differences in intakes. However, it is also possible that validity is enhanced by the use of study-specific questionnaires, since they may be tailored for use in each component study. Some heterogeneity of assessment instruments cannot be avoided, even in prospectively planned pooled studies (if, for instance, the language spoken and the foods consumed differ between populations). Another limitation of the Pooling Project is that only current diet at baseline was measured in most of the studies; thus, we cannot examine the effects of dietary changes occurring during follow-up or assess associations with diet at younger ages. There may be differential control for confounding across studies because the nondietary variables that were measured varied across studies, although many important potential confounders were measured in most studies. In addition, by standardizing which confounding variables are included in the multivariate models and their categorization, we have minimized between-studies heterogeneity resulting from how potentially confounding variables were modeled. A final restriction is our inability to examine effect modification by race and ethnicity, because the Pooling Project currently includes studies from only North America and Europe and a predominantly Caucasian population; however, as studies from other regions and with persons of different ethnicities become eligible to join the Pooling Project, the ethnic composition of the Pooling Project will expand.

Despite these limitations and restrictions, the data compiled in the Pooling Project are a valuable resource for prospectively investigating associations between diet and cancer, particularly for population subgroups, less common cancers, and specific cancer subtypes. In our analyses, we use standardized criteria to define each variable in order to reduce potential sources of between-studies heterogeneity. We then evaluate whether associations are consistent across different study populations. Finally, the methods that we use in the Pooling Project may be modified to pool data from both case-control and cohort studies to examine associations between dietary and nondietary risk factors and disease.

ACKNOWLEDGMENTS

This research was funded by National Institutes of Health grants CA55075 and CA78548. The work was performed at the Harvard School of Public Health (Boston, Massachusetts).

Conflict of interest: none declared.

REFERENCES

1. Blettner M, Sauerbrei W, Schlehofer B, et al. Traditional reviews, meta-analyses and pooled analyses in epidemiology. Int J Epidemiol 1999;28:1–9.

2. Friedenreich CM. Methods for pooled analyses of epidemiologic studies. Epidemiology 1993;4:295–302.

3. Steinberg KK, Smith SJ, Stroup DF, et al. Comparison of effect estimates from a meta-analysis of summary data from published studies and from a meta-analysis using individual patient data for ovarian cancer studies. Am J Epidemiol 1997;145:917–25.

4. Lyman GH, Kuderer NM. The strengths and limitations of meta-analyses based on aggregate data. BMC Med Res Methodol 2005;5:14.

5. Ioannidis JP, Rosenberg PS, Goedert JJ, et al. Commentary: meta-analysis of individual participants’ data in genetic epidemiology. Am J Epidemiol 2002;156:204–10.


6. Collaborative Group on Hormonal Factors in Breast Cancer. Breast cancer and hormonal contraceptives: collaborative reanalysis of individual data on 53 297 women with breast cancer and 100 239 women without breast cancer from 54 epidemiological studies. Lancet 1996;347:1713–27.

7. Whittemore AS, Harris R, Itnyre J, et al. Characteristics relating to ovarian cancer risk: collaborative analysis of 12 US case-control studies. I. Methods. Am J Epidemiol 1992;136:1175–83.

8. Plummer M, Herrero R, Franceschi S, et al. Smoking and cervical cancer: pooled analysis of the IARC multi-centric case-control study. Cancer Causes Control 2003;14:805–14.

9. Bosetti C, Kolonel L, Negri E, et al. A pooled analysis of case-control studies of thyroid cancer. VI. Fish and shellfish consumption. Cancer Causes Control 2001;12:375–82.

10. Arslan AA, Zeleniuch-Jacquotte A, Lundin E, et al. Serum follicle-stimulating hormone and risk of epithelial ovarian cancer in postmenopausal women. Cancer Epidemiol Biomarkers Prev 2003;12:1531–5.

11. Pereira MA, O’Reilly E, Augustsson K, et al. Dietary fiber and risk of coronary heart disease: a pooled analysis of cohort studies. Arch Intern Med 2004;164:370–6.

12. Morton LM, Hartge P, Holford TR, et al. Cigarette smoking and risk of non-Hodgkin lymphoma: a pooled analysis from the International Lymphoma Epidemiology Consortium (InterLymph). Cancer Epidemiol Biomarkers Prev 2005;14:925–33.

13. Smith JS, Herrero R, Bosetti C, et al. Herpes simplex virus-2 as a human papillomavirus cofactor in the etiology of invasive cervical cancer. J Natl Cancer Inst 2002;94:1604–13.

14. Hunter DJ, Spiegelman D, Adami H-O, et al. Cohort studies of fat intake and the risk of breast cancer--a pooled analysis. N Engl J Med 1996;334:356–61.

15. Hunter DJ, Spiegelman D, Adami H-O, et al. Non-dietary factors as risk factors for breast cancer, and as effect modifiers of the association of fat intake and risk of breast cancer. Cancer Causes Control 1997;8:49–56.

16. Smith-Warner SA, Spiegelman D, Yaun S-S, et al. Alcohol and breast cancer in women: a pooled analysis of cohort studies. JAMA 1998;279:535–40.

17. Cho E, Smith-Warner SA, Spiegelman D, et al. Dairy foods, calcium, and colorectal cancer: a pooled analysis of 10 cohort studies. J Natl Cancer Inst 2004;96:1015–22.

18. van den Brandt PA, Spiegelman D, Yaun SS, et al. Pooled analysis of prospective cohort studies on height, weight and breast cancer risk. Am J Epidemiol 2000;152:514–27.

19. Smith-Warner SA, Spiegelman D, Yaun S-S, et al. Intake of fruits and vegetables and risk of breast cancer: a pooled analysis of cohort studies. JAMA 2001;285:769–76.

20. Smith-Warner SA, Spiegelman D, Adami HO, et al. Types of dietary fat and breast cancer: a pooled analysis of cohort studies. Int J Cancer 2001;92:767–74.

21. Missmer SA, Smith-Warner SA, Spiegelman D, et al. Meat and dairy food consumption and breast cancer: a pooled analysis of cohort studies. Int J Epidemiol 2002;31:78–85.

22. Smith-Warner SA, Ritz J, Hunter DJ, et al. Dietary fat and risk of lung cancer in a pooled analysis of prospective studies. Cancer Epidemiol Biomarkers Prev 2002;11:987–92.

23. Smith-Warner SA, Spiegelman D, Yaun SS, et al. Fruits, vegetables and lung cancer: a pooled analysis of cohort studies. Int J Cancer 2003;107:1001–11.

24. Mannisto S, Smith-Warner SA, Spiegelman D, et al. Dietary carotenoids and risk of lung cancer in a pooled analysis of seven cohort studies. Cancer Epidemiol Biomarkers Prev 2004;13:40–8.

25. Cho E, Smith-Warner SA, Ritz J, et al. Alcohol intake and colorectal cancer: a pooled analysis of 8 cohort studies. Ann Intern Med 2004;140:603–13.

26. Koushik A, Hunter DJ, Spiegelman D, et al. Fruits and vegetables and ovarian cancer risk in a pooled analysis of 12 cohort studies. Cancer Epidemiol Biomarkers Prev 2005;14:2160–7.

27. Freudenheim JL, Ritz J, Smith-Warner SA, et al. Alcohol consumption and risk of lung cancer: a pooled analysis of cohort studies. Am J Clin Nutr 2005;82:657–67.

28. Cho E, Hunter DJ, Spiegelman D, et al. Intakes of vitamins A, C and E and folate and multivitamins and lung cancer: a pooled analysis of 8 prospective studies. Int J Cancer 2006;118:970–8.

29. Genkinger JM, Hunter DJ, Spiegelman D, et al. A pooled analysis of 12 cohort studies of dietary fat, cholesterol and egg intake and ovarian cancer. Cancer Causes Control 2006;17:273–85.

30. Park Y, Hunter DJ, Spiegelman D, et al. Dietary fiber intake and risk of colorectal cancer: a pooled analysis of prospective cohort studies. JAMA 2005;294:2849–57.

31. Riboli E, Kaaks R. The EPIC Project: rationale and study design. Int J Epidemiol 1997;26(suppl 1):S6–14.

32. Wolk A, Bergström R, Hunter D, et al. A prospective study of association of monounsaturated fat and other types of fat with risk of breast cancer. Arch Intern Med 1998;158:41–5.

33. Singh PN, Fraser GE. Dietary risk factors for colon cancer in a low-risk population. Am J Epidemiol 1998;148:761–74.

34. The ATBC Cancer Prevention Study Group. The Alpha-Tocopherol, Beta-Carotene Lung Cancer Prevention Study: design, methods, participant characteristics, and compliance. Ann Epidemiol 1994;4:1–10.

35. Flood A, Velie EM, Chaterjee N, et al. Fruit and vegetable intakes and the risk of colorectal cancer in the Breast Cancer Detection Demonstration Project Follow-up Cohort. Am J Clin Nutr 2002;75:936–43.

36. Terry P, Jain M, Miller AB, et al. Dietary intake of folic acid and colorectal cancer risk in a cohort of women. Int J Cancer 2002;97:864–7.

37. Michels KB, Giovannucci E, Joshipura KJ, et al. Prospective study of fruit and vegetable consumption and incidence of colon and rectal cancers. J Natl Cancer Inst 2000;92:1740–52.

38. Calle EE, Rodriguez C, Jacobs EJ, et al. The American Cancer Society Cancer Prevention Study II Nutrition Cohort: rationale, study design, and baseline characteristics. Cancer 2002;94:500–11.

39. Sieri S, Krogh V, Muti P, et al. Fat and protein intake and subsequent breast cancer risk in postmenopausal women. Nutr Cancer 2002;42:10–17.

40. Voorrips LE, Goldbohm RA, van Poppel G, et al. Vegetable and fruit consumption and risks of colon and rectal cancer in a prospective cohort study: The Netherlands Cohort Study on Diet and Cancer. Am J Epidemiol 2000;152:1081–92.

41. Steinmetz KA, Kushi LH, Bostick RM, et al. Vegetables, fruit, and colon cancer in the Iowa Women’s Health Study. Am J Epidemiol 1994;139:1–15.

42. Bandera EV, Freudenheim JL, Marshall JR, et al. Diet and alcohol consumption and lung cancer risk in the New York State Cohort (United States). Cancer Causes Control 1997;8:828–40.

43. Kato I, Akhmedkhanov A, Koenig K, et al. Prospective study of diet and female colorectal cancer: The New York University Women’s Health Study. Nutr Cancer 1997;28:276–81.


44. Higginbotham S, Zhang Z-F, Lee I-M, et al. Dietary glycemic load and risk of colorectal cancer in the Women’s Health Study. J Natl Cancer Inst 2004;96:229–33.

45. Horn-Ross PL, Hoggatt KJ, West DW, et al. Recent diet and breast cancer risk: The California Teachers Study (USA). Cancer Causes Control 2002;13:407–15.

46. Cho E, Spiegelman D, Hunter DJ, et al. Premenopausal fat intake and risk of breast cancer. J Natl Cancer Inst 2003;95:1079–85.

47. Prentice RL. A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika 1986;73:1–11.

48. Rothman KJ. Modern epidemiology. Boston, MA: Little, Brown and Company, 1986.

49. Cox DR. Regression models and life tables (with discussion). J R Stat Soc B 1972;34:187–220.

50. Abbey DE, Andress M, Fraser G, et al. Validity and reliability of alternative nutrient indices based on a food frequency questionnaire. (Abstract). Am J Epidemiol 1988;128(suppl):934.

51. Pietinen P, Hartman AM, Haapa E, et al. Reproducibility and validity of dietary assessment instruments. I. A self-administered food use questionnaire with a portion size picture booklet. Am J Epidemiol 1988;128:655–66.

52. Jain M, Howe GR, Rohan T. Dietary assessment in epidemiology: comparison of a food frequency and a diet history questionnaire with a 7-day food record. Am J Epidemiol 1996;143:953–60.

53. Flagg EW, Coates RJ, Calle EE, et al. Validation of the American Cancer Society Cancer Prevention Study II Nutrition Survey Cohort food frequency questionnaire. Epidemiology 2000;11:462–8.

54. Rimm EB, Giovannucci EL, Stampfer MJ, et al. Reproducibility and validity of an expanded self-administered semiquantitative food frequency questionnaire among male health professionals. Am J Epidemiol 1992;135:1114–26.

55. Munger RG, Folsom AR, Kushi LH, et al. Dietary assessment of older Iowa women with a food frequency questionnaire: nutrient intake, reproducibility, and comparison with 24-hour dietary recall interviews. Am J Epidemiol 1992;136:192–200.

56. Goldbohm RA, van den Brandt PA, Brants HA, et al. Validation of a dietary questionnaire used in a large-scale prospective cohort study on diet and cancer. Eur J Clin Nutr 1994;48:253–65.

57. Feskanich D, Marshall J, Rimm EB, et al. Simulated validation of a brief food frequency questionnaire. Ann Epidemiol 1994;4:181–7.

58. Willett WC, Sampson L, Stampfer MJ, et al. Reproducibility and validity of a semiquantitative food frequency questionnaire. Am J Epidemiol 1985;122:51–65.

59. Willett W. Nutritional epidemiology. New York, NY: Oxford University Press, 1998.

60. Miettinen OS. Theoretical epidemiology. New York, NY: John Wiley and Sons, Inc, 1985.

61. Huberman M, Langholz B. Application of the missing-indicator method in matched case-control studies with incomplete data. Am J Epidemiol 1999;150:1340–5.

62. HiroSoft International Corporation. EPICURE user’s guide: the PEANUTS program. Seattle, WA: HiroSoft International Corporation, 1993.

63. Allison PD. Survival analysis using the SAS system: a practical guide. Cary, NC: SAS Publishing, 1995.

64. Harville DA. Maximum likelihood approaches to variance component estimation and to related problems. J Am Stat Assoc 1977;72:320–38.

65. Laird NM, Ware JH. Random-effects models for longitudinal data. Biometrics 1982;38:963–74.

66. DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials 1986;7:177–88.

67. Cochran WG. The combination of estimates from different experiments. Biometrics 1954;10:101–29.

68. Durrleman S, Simon R. Flexible regression models with cubic splines. Stat Med 1989;8:551–61.

69. Smith PL. Splines as a useful and convenient statistical tool. Am Stat 1979;33:57–62.

70. Stram DO. Meta-analysis of published data using a linear mixed-effects model. Biometrics 1996;52:536–44.

71. Prentice RL, Kalbfleisch JD, Peterson AV, et al. The analysis of failure times in the presence of competing risks. Biometrics 1978;34:541–54.

72. Tsiatis AA. Competing risks. In: Armitage P, Colton T, eds. Encyclopedia of biostatistics. 1st ed. Vol 1. New York, NY: John Wiley and Sons, Inc, 1988:824–34.

73. Cox DR, Oakes D. Analysis of survival data. New York, NY: Chapman and Hall, Inc, 1993.

74. Billingsley P. Probability and measure. New York, NY: John Wiley and Sons, Inc, 1995.

75. Anderson TW. Introduction to multivariate statistics. New York, NY: John Wiley and Sons, Inc, 1984.

76. Beaton GH, Milner J, McGuire V, et al. Source of variance in 24-hour dietary recall data: implications for nutrition study design and interpretation. Carbohydrate sources, vitamins, and minerals. Am J Clin Nutr 1983;37:986–95.

77. Rosner B, Willett WC, Spiegelman D. Correction of logistic regression relative risk estimates and confidence intervals for systematic within-person measurement error. Stat Med 1989;8:1051–69.

78. Kupper LL. Effects of the use of unreliable surrogate variables on the validity of epidemiologic research studies. Am J Epidemiol 1984;120:643–8.

79. Spiegelman D, Schneeweiss S, McDermott A. Measurement error correction for logistic regression models with an "alloyed gold standard." Am J Epidemiol 1997;145:184–96.

80. Wang CY, Xie, Prentice AM, et al. Recalibration based on an approximate relative risk estimator in Cox regression with missing covariates. Stat Sinica 2001;11:1081–104.

81. Spiegelman D, Zhao B, Kim J. Correlated errors in biased surrogates: study designs and methods for measurement error correction. Stat Med 2005;24:1657–82.

82. Kipnis V, Subar AF, Midthune D, et al. Structure of dietary measurement error: results of the OPEN biomarker study. Am J Epidemiol 2003;158:14–21.

83. Michels KB, Bingham SA, Luben R, et al. The effect of correlated measurement error in multivariate models of diet. Am J Epidemiol 2004;160:59–67.

84. Bishop Y, Fienberg S, Holland P. Discrete multivariate analysis. Cambridge, MA: MIT Press, 1975.

85. Rosner B, Spiegelman D, Willett WC. Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. Am J Epidemiol 1990;132:734–45.

86. Sutton AJ, Duval SJ, Tweedie RL, et al. Empirical assessment of effect of publication bias on meta-analyses. BMJ 2000;320:1574–7.

1064 Smith-Warner et al.

Am J Epidemiol 2006;163:1053–1064

by guest on June 4, 2013http://aje.oxfordjournals.org/Downloaded from
