Is Caregiving Hazardous to One’s Physical Health? A Meta-Analysis
Peter P. Vitaliano
University of Washington
Jianping Zhang
Indiana University–Purdue University, Indianapolis
James M. Scanlan
University of Washington
Caring for a family member with dementia is generally regarded as a chronically stressful process, with
potentially negative physical health consequences. However, no quantitative analysis has been conducted
on this literature. The authors combined the results of 23 studies to compare the physical health of
caregivers with demographically similar noncaregivers. When examined across 11 health categories,
caregivers exhibited a slightly greater risk for health problems than did noncaregivers. However, sex and
the health category assessed moderated this relationship. Stronger relationships occurred with stress
hormones, antibodies, and global reported health. The authors argue that a theoretical model is needed
that relates caregiver stressors to illness and proffers moderating roles for vulnerabilities and resources
and mediating roles for psychosocial distress and health behaviors.
For the past 30 years researchers have assumed without contro-
versy that providing care for an ill family member is hazardous to
one’s health. Although this research has been well meaning, no
quantitative procedures have summarized this literature to assess
the consistency of findings across studies. The fact is, it is still
unknown to what extent caregiving is hazardous to one’s health. In
this meta-analysis we examine and critique the literature on self-
reported health and physiological functioning in caregivers of
persons with dementia. Because meta-analyses are usually re-
stricted to studies with comparison groups (Shadish, Matt, Na-
varro, & Phillips, 2000), we focus here on research that compares
caregivers with demographically similar noncaregivers. To pro-
vide a rationale for this review, we first discuss why caregiver
health is important. We then argue that stressors, psychosocial
distress, and risky health habits influence physiological responses
in caregivers, which increase their risks for health problems. Next,
we discuss how illnesses and physiological functioning have been
measured in caregivers, as well as the rationales for the choice of
measures. We then argue that inferences about relationships of
caregiving with health problems should consider the types of
health measures used. Finally, we note that research on caregiver
health has been largely atheoretical and has not profited from
extensive mind–body research. We conclude by recommending a
model that uses constructs previously shown to be predictive of
illnesses in persons under chronic stress.
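To make the idea of assessing "the consistency of findings across studies" concrete, the following is a minimal sketch of one common way to pool effect sizes: a fixed-effect, inverse-variance weighted mean, with Cochran's Q as a homogeneity check. It is an illustration under stated assumptions, not the procedure reported in this article. The function name `pool_fixed_effect` and the three effect sizes and variances are hypothetical.

```python
from math import sqrt


def pool_fixed_effect(effects, variances):
    """Fixed-effect (inverse-variance) pooling of study effect sizes.

    Returns the weighted mean effect, its standard error, and Cochran's Q,
    which grows as study results become less consistent with one another.
    """
    if len(effects) != len(variances) or not effects:
        raise ValueError("need one variance per effect size")
    weights = [1.0 / v for v in variances]  # more precise studies count more
    total_weight = sum(weights)
    mean = sum(w * e for w, e in zip(weights, effects)) / total_weight
    standard_error = sqrt(1.0 / total_weight)
    q = sum(w * (e - mean) ** 2 for w, e in zip(weights, effects))
    return mean, standard_error, q


# Hypothetical illustration values, not data from any reviewed study.
effects = [0.10, 0.30, 0.20]
variances = [0.04, 0.04, 0.02]
mean, standard_error, q = pool_fixed_effect(effects, variances)
print(f"pooled effect = {mean:.2f}, SE = {standard_error:.2f}, Q = {q:.2f}")
```

Under the fixed-effect assumptions, Q is compared with a chi-square distribution with k − 1 degrees of freedom, where k is the number of studies. A large Q suggests that the studies do not share a single underlying effect.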
Who Are Informal Caregivers and Why Are They Important?
Informal caregivers are caregivers who are not financially com-
pensated for their services. They are usually relatives or friends
who provide assistance to persons who are having difficulties with
daily activities because of physical, cognitive, or emotional im-
pairments. Without such help, care recipients, such as persons with
Alzheimer’s disease (AD), would not be able to sustain them-
selves. Although caregiving is dependent on the specific needs of
each person, it usually involves helping with maintenance such as
bathing or higher level activities such as reading. In general,
caregiving is
based on a reverence for life and the belief that human beings have the
innate right to function to their highest level of mental and physical
capacity. The major mission of caregiving is to promote independence
by maintaining the person’s most functional state—physically, intel-
lectually, emotionally, and spiritually. (Bridges, 1995, p. 13)
Since prehistoric times, informal caregiving has been the stan-
dard way to protect people in poor health (Lebel et al., 2001).
Although caregivers have always been of socioeconomic value to
society, in the future they will be even more important. In the next
20 years, the portion of the U.S. population that will exceed 65
years of age will increase from 12% to 17% (U.S. Bureau of the
Census, 1996). The prevalence of AD will also increase. Estimates
of AD in the United States vary from 2.7% to 11.2% for persons
65 years of age or older (Ernst & Hay, 1997). This translates to
1.5–6.1 million persons. Note that although 75% of all persons
with dementia who are 65 years of age or older have AD (Ganguli,
Dodge, Chen, Belle, & DeKosky, 2000), stroke and Parkinson’s
disease also create cognitive and behavioral problems (Hooker,
Manoogian-O’Dell, Monahan, Frazier, & Shifren, 2000; Vetter et
al., 1999). Thus, the prevalence of all dementias, and the resulting
costs, are significantly greater than those for AD alone. For 1991
the total U.S. costs of AD alone were $20 billion (Johnson, Davis,
& Bosanquet, 2000), and families bore most of these expenses
(Leon, Cheng, & Neumann, 1988). Unfortunately, the costs of
caregiving may also be psychosocial and physical.
Peter P. Vitaliano and James M. Scanlan, Department of Psychiatry and
Behavioral Sciences, University of Washington; Jianping Zhang, Department
of Psychology, Indiana University–Purdue University, Indianapolis.
This research was supported by National Institute of Mental Health
Grant R01MH57663, National Institute of Aging Grant R01 AG10760,
Clinical Nutrition Research Unit Grant DK38516, and National Institutes
of Health Clinical Research Center Grant M01-RR00037. The title of the
article arose from a 1999 editorial in Psychosomatic Medicine by Igor
Grant. We thank William Shadish for his insights and generosity and
Roslyn Siegel for clerical support.
Correspondence concerning this article should be addressed to Peter P.
Vitaliano, Department of Psychiatry and Behavioral Sciences, University
of Washington, Box 356560, Seattle, Washington 98195. E-mail:
Psychological Bulletin, 2003, Vol. 129, No. 6, 946–972. Copyright 2003 by the American Psychological Association, Inc. 0033-2909/03/$12.00 DOI: 10.1037/0033-2909.129.6.946
Why Are Caregivers Expected To Be at Higher Risk for
Health Problems Than Noncaregivers?
Given the critical functions that caregivers perform, both gov-
ernment agencies and researchers have been concerned with main-
taining caregiver health. One assumption underlying this concern
is that negative responses to caregiving may interfere with a
caregiver’s ability to provide care. However, researchers have also
been concerned with caregiver health in its own right and concerned
that caregivers may not care for, or be able to care for,
themselves. These beliefs have generated extensive research. Since
Grad and Sainsbury (1963), the first researchers to examine per-
ceptions of stressors among caregivers, a plethora of observational
studies have been performed. Most of these studies did not include
comparison groups. In those that did, matching was not always
used to control confounders. In studies that matched on variables,
such as sex and age, the direction of relationships between care-
giving and health indicators could not be determined because of
the retrospective designs. Despite these problems, researchers have
persisted in studying caregiver health because of its importance to
society and the assumption that caregivers are at risk for illnesses.
We now discuss two literatures to support the belief that caregivers
are at risk for health problems. First we consider caregiver expe-
riences, such as chronic stressors, psychosocial distress, and risky
health habits, and then we review relationships known to exist
among chronic stress, distress, health habits, health indicators, and
potential physiological mechanisms.
Stressors, Distress, and Health Habits in Caregivers
To understand the stressors encountered by caregivers of vic-
tims of dementia and related disorders, one needs to understand
these diseases. Dementia is caused by degenerative brain (AD),
cerebrovascular (multi-infarct dementia), and neurological (Parkinson’s)
diseases. AD and vascular dementia are diagnosed using
established criteria (Diagnostic and Statistical Manual of Mental
Disorders, 4th ed.; DSM–IV; American Psychiatric Association,
1994; McKhann et al., 1984). AD involves impairments in memory,
attention, and cognition, and gradual progressive intellectual
and functional deterioration. Parkinson’s disease involves progressive
degeneration of motor function, tremors, and stiffness (Stern,
1988). Most caregivers of persons with dementia face 3–15 years
of exposure to physical and psychosocial demands. They absorb
household chores and are exposed to symptoms of depression,
anger, agitation, and paranoia in their care recipients (Teri et al.,
1992). These affects and behaviors intrude on their lives (Mendel-
sohn, Dakof, & Skaff, 1995; Vernon & Stern, 1988). As AD
progresses, caregivers must continually monitor their care recipi-
ents and witness their cognitive deterioration (Stephens, Kinney, &
Ogrocki, 1991). This exposure to chronic stressors can lead to
psychosocial distress and risky health behaviors.
One psychosocial response to caregiving is perceived burden. It
results from the physical, psychological, emotional, social and
financial problems experienced by families caring for impaired
older adults (George & Gwyther, 1986, p. 253). Burden includes
embarrassment, overload, feelings of entrapment, resentment, iso-
lation from society (Zarit, Reever, & Bach-Peterson, 1980), loss of
control, poor communication (Morris, Morris, & Britton, 1988),
and work pressures (Stephens et al., 1991). Given these responses,
it is not surprising that caregivers report more distress and risky
health behaviors than do noncaregivers. As an example, Blazer
(2003) reviewed three population-based studies of community-
residing older adults and reported a median rate of major and
minor depression of 11%. In contrast, in a review of caregiver
morbidities, the median rate of clinical depression (e.g., using
versions of the Structured Clinical Interview for DSM–IV) was
22% (Schulz, O’Brien, Bookwala, & Fleissner, 1995). When self-report
measures were used, however, the median rate of depressed
mood was 30%. These results are consistent with reviews that have
argued that rates of clinical depression in caregivers are significantly
higher than in the general population but that only a minority
of caregivers will meet criteria for clinical depression if they are
not actively seeking psychological intervention (Neundorfer, 1991;
Wright, Clipp, & George, 1993). These results also support the
belief that structured interviews yield lower estimates of depres-
sion than self-report inventories. In addition to depression, other
reactions may increase the risks of caregiver illnesses. These
include sleep problems, poor diets, and sedentary behaviors
(Fuller-Jonap & Haley, 1995; Gallant & Connell, 1997; Vitaliano
et al., 2002).
Relationships of Chronic Stressors With Health Indicators
Chronic stressors are associated with illnesses (S. Cohen,
Kessler, & Underwood-Gordon, 1997; Greenwood, Muir, Pack-
ham, & Madeley, 1996) and with disease progression in persons
who are already ill (Everson et al., 1997). Caregivers may expe-
rience prolonged anticipatory bereavement over lost aspects of
their relationships with their care recipients, and bereavement is
positively associated with physical illnesses (Kaprio, Koskenvuo,
& Rita, 1987), health care utilization (Prigerson et al., 1997; W.
Stroebe & Stroebe, 1987), and mortality (Goldman, Korenman, &
Weinstein, 1995). Chronic stressors and bereavement are also
associated with elevated physiological risks (Kawakami, Haratani,
& Araki, 1998; B. S. McCann et al., 1999; O’Connor, Allen, &
Kaszniak, 2002; Pickering et al., 1996). Such responses may
provide mechanisms for why chronic stressors are related to illness.
Two pathways may be relevant to relationships of illness with
chronic stressors and bereavement. One pathway appears to flow
from chronic stressors to psychosocial distress and then to stress
hormones. This primarily occurs via the hypothalamic–pituitary–adrenal
axis, from which corticotropin-releasing hormone, ACTH, and
cortisol are secreted, and the sympathetic adrenomedullary axis,
from which norepinephrine and epinephrine are secreted (Lovallo,
1997; Steptoe, Cropley, Griffith, & Kirschbaum, 2000). These
hormones stimulate peripheral activity, which can lead to allostatic
load, or wear-and-tear from repeated arousal and inefficient con-
trol of physiological responses (Chrousos & Gold, 1992; McEwen,
2000). Such compensation may lead to pathophysiology (Niaura,
Stoney, & Herbert, 1992; Schneiderman, 1983). In a second path-
way, distress may trigger risky health behaviors, such as poor diet,
sedentary behavior, and substance abuse.
These two possible pathways, among others, may contribute to
illness by increasing cardiovascular (Kannel & Vokonas, 1986),
metabolic (Keys, Fidanza, Karvonen, Kimura, & Taylor, 1972), or
immunologic (O’Leary, 1990) dysregulation. As such, they should
help to explain why depression, sleep problems, and risky be-
haviors are associated with illnesses in the general population
(Arntzenius et al., 1985; Epstein & Perkins, 1988; Erikssen,
Forfang, & Jervell, 1981; Fischer & Raschke, 1997; Leigh & Fries,
1992; Musselman, Evans, & Nemeroff, 1998) and why caregivers
are expected to be at greater risk for illnesses.
Current Knowledge About Physical Health Indicators in Caregivers
In one review (Schulz, Visintainer, & Williamson, 1990), only
11 of 34 caregiver studies examined physical health, and only 1
study included physiological measures. In studies that used self-
reports, the measures included global self-reported health, a single
item that assessed health from “poor” to “excellent” (Davies &
Ware, 1981); number and intensity of symptoms (Pennebaker,
1982); number of chronic illnesses (Rosencranz & Pihlblad, 1970);
number of medications (Harrison, 1997); and health care utiliza-
tion (Ritter et al., 2001). Caregivers were similar to matched
noncaregivers in some studies (George & Gwyther, 1986), but
poorer in health in other studies (Haley, Levine, Brown, Berry, &
Hughes, 1987). In a review of 40 additional studies (Schulz et al.,
1995), some researchers found that caregivers were higher than
noncaregivers in chronic illnesses and medications (Baumgarten et
al., 1992; Kiecolt-Glaser, Dura, Speicher, Trask, & Glaser, 1991),
whereas others did not observe differences (George & Gwyther,
1986; Haley et al., 1987). Since these reviews, more studies have
used physiological measures. Such measures may help to explain
observed associations between caregiving and illness. Indeed,
physiological measures may show associations with caregiver ex-
periences much earlier than do chronic illnesses.
Physiological Indicators
Table 1 contains a list of the categories and measures that have
been used to study physiological functioning in caregivers since
this research began 15 years ago. The categories include stress
hormones and neurotransmitters, and immunologic, cardiovascular,
and metabolic functioning. For each measure, we provide its
description and the references that support its relationships with
stress and illness. We do not consider the caregiver studies that
have examined these measures because they are discussed below.
We now argue that the type of health measure used may influence
its relationship with caregiving.
Health Indicators as Moderators
Because caregiver health problems have been assessed in many
different ways, it is important to examine whether certain measures
are more related to caregiving than others. If some health problems
are more associated with caregiving than others, this could have
both theoretical and clinical implications. Hence, one objective of
this article was to consider four contrasts, each of which is driven
by previous work. First, self-report measures may have inherent
advantages over physiological measures because of their relation-
ships with distress. That is, strong positive relationships exist
between caregiving and psychological distress (Schulz et al., 1995)
and between distress and reported health problems (Costa & Mc-
Crae, 1980; Watson & Clark, 1984). Thus, one needs to assess
whether caregiving is more related to reported health than to
physiological measures. Second, because relationships are especially
high for global self-reported health with distress (r = .70;
Hooker & Siegler, 1992), it would be useful to assess whether
relationships of caregiving with global health are greater than
relationships of caregiving with reports of medications, utilization,
and illnesses. Third, researchers in psychophysiology would ex-
pect measures that are immediately responsive to central nervous
system arousal, such as stress hormone levels, to be more related
to caregiving than more peripheral cardiovascular and metabolic
responses (Lovallo & Thomas, 2000). Fourth, variations in the
association of caregiving with immunity are relevant to research in
psychoneuroimmunology. For example, meta-analyses of immune
measures with depression and stress have reported that mean effect
sizes for antibodies to viruses are larger than mean effect sizes for
T-cell counts and natural killer cell activity (NKA; Herbert &
Cohen, 1993a, 1993b). In addition to health indicators, individual
differences may be moderators of relationships of caregiving with
health problems. However, prior to presenting some candidates,
we consider a model that supports the importance of individual
differences for predicting health problems in caregivers.
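Comparing how strongly caregiving relates to different kinds of health measures requires placing every measure on a common metric. A minimal sketch, assuming each study reports group means, standard deviations, and sample sizes, is the standardized mean difference (Cohen's d) and its conversion to a correlation. The function names and the example values are hypothetical, and this is not necessarily the effect size computation used in this article.

```python
from math import sqrt


def cohens_d(mean_cg, sd_cg, n_cg, mean_nc, sd_nc, n_nc):
    """Standardized mean difference: caregivers (cg) minus noncaregivers (nc),
    divided by the pooled within-group standard deviation."""
    pooled_var = ((n_cg - 1) * sd_cg ** 2 + (n_nc - 1) * sd_nc ** 2) / (n_cg + n_nc - 2)
    return (mean_cg - mean_nc) / sqrt(pooled_var)


def d_to_r(d, n_cg, n_nc):
    """Convert d to a point-biserial correlation for a two-group comparison."""
    a = (n_cg + n_nc) ** 2 / (n_cg * n_nc)  # equals 4 when the groups are equal in size
    return d / sqrt(d ** 2 + a)


# Hypothetical symptom scores: caregivers 12 (SD 4), noncaregivers 10 (SD 4), 50 per group.
d = cohens_d(12.0, 4.0, 50, 10.0, 4.0, 50)
r = d_to_r(d, 50, 50)
print(f"d = {d:.2f}, r = {r:.2f}")
```

Once every study outcome is expressed as d or r, self-report and physiological measures can be compared directly even though their raw units differ.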
A Theoretical Model of Caregiver Experiences, Individual
Differences, and Health Problems
Although research on caregiver health has steadily increased in
the past decade, relatively few attempts have been made to use a
theoretical model to unify this work. Figure 1 is one such attempt.
It illustrates the basic pathways that interrelate caregiver stressors,
psychosocial distress, risky health habits, physiological mediators,
and subsequent health problems. The model overlaps and parallels
previous attempts to relate psychosocial factors to illness
(Clark, Anderson, Clark, & Williams, 1999; Jorgensen, Johnson,
Kolodziej, & Schreer, 1996; Taylor & Repetti, 1997). It includes
individual differences, such as vulnerabilities and resources, that
moderate relationships of stressors with distress (Lazarus &
Folkman, 1984). Vulnerabilities and resources tend to be nega-
tively related. As such, some researchers have viewed them as
mirror images or opposite ends of the same continuum. However,
there is some tradition in sociology, psychology, epidemiology,
and psychiatry to distinguish these constructs according to their
stability and temporal place among pathways from stressors to
illness. For example, the term vulnerability has been used to refer
to hard-wired characteristics (Mechanic, 1967) such as age, sex,
disposition (Lazarus & Folkman, 1984), race (Robins, 1978), and
family history and heredity (Zubin & Spring, 1977). In contrast,
resources are more mutable and affected by interactions of the
person with the environment. These include process coping
(Folkman & Lazarus, 1980) and social supports (S. Cohen &
Wills, 1985). For these reasons, vulnerabilities typically occur
early in development and are not the result of caregiving, whereas
resources may be both predictors and outcomes of caregiving.
Using this framework, traitlike characteristics, such as disposi-
tions, may have positive (e.g., hostility) or negative (e.g., opti-
mism) relationships with illness, depending on their content, but
they are both categorized in the vulnerability domain. Hence,
persons high in optimism would be low in vulnerability. Likewise,
depending on their quality, social supports may provide either high
or low resources.
Given the model, one would expect the main effects of exposures
(E; here, caregiving), vulnerabilities (V), and resources (R) to
be directly associated with psychosocial distress and risky health
habits. In addition, interactions among the constructs’ indicators
are expected to influence illness over and above the main effects.
This belief is consistent with the stress–diathesis model (Mechanic,
1967). For example, two-way interactions of exposures, vulnerabilities,
and resources (e.g., E × V; E/R; V/R) predict distress
and poorer health habits beyond the main effects of E, V, and 1/R,
and they also magnify their effects (Vitaliano, Maiuro, Bolton, &
Armsden, 1987). Indeed, in one study caregivers with high vulnerability
and low resources (V/R) had greater burden 15–18
months later than did those with either low vulnerability or high
vulnerability and high resources, even after controlling for baseline
burden (Vitaliano, Russo, Young, Teri, & Maiuro, 1991). One
would expect such distress to result in greater illness risks for these
caregivers.
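The paragraph above names three main effects (E, V, and 1/R) and three two-way terms (the exposure-by-vulnerability product and the ratios E/R and V/R). A minimal sketch of how those six predictor terms could be formed for one person is given below. The coding is an assumption made for illustration: exposure is coded 1 for caregivers and 0 for noncaregivers, and vulnerability and resources are positive scores. The function name `model_terms` is hypothetical.

```python
def model_terms(exposure, vulnerability, resources):
    """Return the main-effect and two-way terms named in the text.

    exposure: 1 for caregiver, 0 for noncaregiver (assumed coding).
    vulnerability, resources: positive scores (assumed scales). Resources
    enter as a reciprocal, so fewer resources produce a larger term.
    """
    if resources <= 0:
        raise ValueError("resources must be positive to form 1/R")
    e, v, r = float(exposure), float(vulnerability), float(resources)
    return {
        "E": e,
        "V": v,
        "1/R": 1.0 / r,
        "E x V": e * v,  # exposure by vulnerability
        "E/R": e / r,    # exposure relative to resources
        "V/R": v / r,    # vulnerability relative to resources
    }


# Hypothetical caregiver with high vulnerability (4) and low resources (2).
print(model_terms(1, 4, 2))
# The same scores for a noncaregiver: every term involving E drops to zero.
print(model_terms(0, 4, 2))
```

In a regression on distress or health habits, the three two-way terms would be entered after E, V, and 1/R, so that their coefficients reflect prediction beyond the main effects.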
Individual Differences as Moderators
Consistent with Figure 1, another objective of this article was to
assess whether relationships of caregiving with health problems
are moderated by individual differences. Examples of some can-
didates for moderation include psychiatric history, personality,
ethnicity, comorbidities, social supports, socioeconomic status
(SES), and coping. In support of the stress–diathesis model,
Russo, Vitaliano, Brewer, Katon, and Becker (1995) found that
73% of caregivers with a history of depression had a recurrence of
depression while they were caregivers, but only 30% of non-
caregivers had a recurrence during a similar time frame. This is
important because depression has been found to be positively
related to health problems in caregivers (Schulz et al., 1995).
Caregivers also have elevated levels of anger (Gallagher, Wrabetz,
Lovett, Del Maestro, & Rose, 1989), and Vitaliano, Becker, Russo,
Magana-Amato, and Maiuro (1989) found that caregivers who
were critical of their spouses in a structured interview reported
more trait anger than those who were not critical of their spouses.
These results are relevant to caregiver health because in non-
caregivers, anger is a risk factor for elevated blood pressure
(Jamner, Shapiro, Goldstein, & Hug, 1991), greater body mass
index (Scherwitz, Perkins, Chesney, & Hughes, 1991), fat and
caloric intake (Scherwitz et al., 1991), and glucose and insulin
levels (Raikkonen, Keltikangas, & Hautanen, 1994). Indeed, care-
givers high in anger have higher levels of fasting glucose than do
noncaregivers who are high in anger, but no differences exist in
caregivers and noncaregivers who are low in anger (Vitaliano,
Scanlan, Krenz, & Fujimoto, 1996). Ethnicity may also be relevant
to caregiver health because it is related to health disparities
(Kaplan, 1992), and ethnic groups respond differently to caregiv-
ing (Hinrichsen & Ramirez, 1992). Also, Black caregivers may
have fewer economic resources than White caregivers, but they
may have more spiritual resources (Dilworth-Anderson & Ander-
son, 1994). Finally, comorbidities, such as coronary heart disease
(CHD) and cancer may moderate relationships between caregiving
and physiological measures that are manifestations of these dis-
eases (Vitaliano, Scanlan, Ochs, et al., 1998; Vitaliano, Scanlan,
Siegler, et al., 1998).
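The two recurrence rates attributed above to Russo et al. (1995), 73% and 30%, can be summarized with standard comparative measures. The sketch below computes the risk difference, risk ratio, and odds ratio from two proportions. Only the two percentages come from the text. The function name is hypothetical, and no sample sizes are assumed, so no confidence intervals are computed.

```python
def compare_proportions(p_exposed, p_unexposed):
    """Return (risk difference, risk ratio, odds ratio) for two proportions."""
    for p in (p_exposed, p_unexposed):
        if not 0.0 < p < 1.0:
            raise ValueError("proportions must lie strictly between 0 and 1")
    risk_difference = p_exposed - p_unexposed
    risk_ratio = p_exposed / p_unexposed
    odds_ratio = (p_exposed / (1.0 - p_exposed)) / (p_unexposed / (1.0 - p_unexposed))
    return risk_difference, risk_ratio, odds_ratio


# Recurrence of depression: 73% of caregivers versus 30% of noncaregivers.
rd, rr, odds = compare_proportions(0.73, 0.30)
print(f"risk difference = {rd:.2f}, risk ratio = {rr:.2f}, odds ratio = {odds:.2f}")
```

On these figures, recurrence was roughly 2.4 times as likely among caregivers, a difference of 43 percentage points.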
In noncaregivers, resources such as emotional and instrumental
supports (House, 1981) are related to better health habits (Peirce,
Frone, Russell, Cooper, & Mudar, 2000), less distress (Raikkonen
et al., 1994), and lower CHD prevalence (Niaura et al., 1992;
Williams et al., 1992). In caregivers, high levels of perceived
support also predict better reported health (Monahan & Hooker,
1995) and lower metabolic and cardiovascular risk (Vitaliano et
al., 2002). In noncaregivers, high SES is associated with better
physical health (Adler & Ostrove, 1999) and lower CHD preva-
lence (Niaura et al., 1992), and low SES is related to poorer health
in caregivers (Morrissey, Becker, & Rupert, 1990). Active coping,
such as problem solving, is inversely associated with elevated
distress (Folkman & Lazarus, 1980), poor health habits (Epstein &
Perkins, 1988), and physiological risk (Niaura et al., 1992) in
noncaregivers. In caregivers, problem-focused coping is associated
with less psychological distress (Vitaliano, Russo, Carr, Maiuro, &
Becker, 1985).
Demographic variables, such as a caregiver’s age, relationship
to the care recipient, and sex may be particularly relevant to his or
her risk of illness. In two of three studies reviewed by Schulz et al.
(1995), women reported greater health problems than did men, and
in two of two studies of caregiver age, no relationships were
observed with health (Schulz et al., 1995). However, these were
relationships for health problems with individual differences and
not relationships of effect sizes with individual differences. To our
knowledge, no studies have examined the latter. It is important to
note that the direction of moderation was, and still is, not obvious
for sex, age, and one’s relationship to the care recipient. Although
older persons have less resistance to illness (Rowe & Kahn, 1998),
caregiving may be more developmentally on time for older than
for younger caregivers (Neugarten, 1969). Also, spouses may be
frailer, more isolated, and more distressed than other caregivers
(C. A. Cohen et al., 1993), but caring for a spouse might be viewed
as a marriage commitment. In contrast, caring for a parent might
pose conflicts of equity when one must choose between parents
versus spouses and children (George, 1982). Sex may be an
important moderator of caregiver health because women report
more distress (Kessler et al., 1994) and health problems (Rahman,
Strauss, Gertler, Ashley, & Fox, 1994; Ross & Bird, 1994), and
they utilize more health care (Nathanson, 1990) than do men.
Alternatively, men exposed to laboratory stressors show larger and
more consistent increases in stress hormones, neurotransmitter
metabolites, and blood pressure than do women (Earle, Linden, &
Weinberg, 1999; Kirschbaum, Kudielka, Gaab, Schommer, &
Hellhammer, 1999). This response may be further exacerbated when men
face a stressor such as caregiving, which is inconsistent with
men’s traditional gender roles (Kramer, 1997). Finally, widowers
appear to have more illnesses in response to the loss of a spouse
than do widows (Chen et al., 1999). Hence, one might expect
associations of caregiving and health indicators to differ for men
and women.
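The distinction drawn above, between relationships of health problems with individual differences and relationships of effect sizes with individual differences, corresponds to a moderator analysis. One common version, sketched below under fixed-effect assumptions, pools effect sizes within each level of a moderator (for example, samples grouped by sex) and computes a between-groups Q statistic. This is an illustration, not the authors' reported procedure. The function names, subgroup labels, and all numbers are hypothetical.

```python
def weighted_mean(effects, variances):
    """Inverse-variance weighted mean and the sum of the weights."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    return sum(w * e for w, e in zip(weights, effects)) / total, total


def q_between(groups):
    """groups maps a moderator level to (effects, variances).

    Returns the pooled mean effect per level and the between-groups Q,
    which is larger when the levels differ more in mean effect size.
    """
    stats = {name: weighted_mean(es, vs) for name, (es, vs) in groups.items()}
    grand_weight = sum(w for _, w in stats.values())
    grand_mean = sum(m * w for m, w in stats.values()) / grand_weight
    qb = sum(w * (m - grand_mean) ** 2 for m, w in stats.values())
    means = {name: m for name, (m, _) in stats.items()}
    return means, qb


# Hypothetical effect sizes and variances for two subgroups of studies.
groups = {
    "subgroup_a": ([0.30, 0.10], [0.02, 0.02]),
    "subgroup_b": ([0.05, 0.15], [0.04, 0.04]),
}
means, qb = q_between(groups)
print(means, round(qb, 3))
```

Under the null hypothesis of no moderation, the between-groups Q is compared with a chi-square distribution with one fewer degree of freedom than the number of moderator levels.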
Study Objectives
Given the contributions that caregivers make to society and the
potential for improving understanding of relationships of chronic
stressors with illness, one goal of this article is to quantify rela-
tionships of caregiving with health problems. For practical and
Table 1
Measures of Stress and Physiological Functioning in Caregiver Research
Columns: Measure; Description; Stress related; Illness related

Stress hormones and neurotransmitters

ACTH (adrenocorticotropic hormone)
  Description: Peptide hormone secreted by pituitary to signal cortisol release.
  References: Chrousos & Gold (1992), Lovallo (1997), Lovallo & Thomas (2000)

Cortisol
  Description: Steroid glucocorticoid secreted by the adrenal gland. Affects cardiac and metabolic function.
  Stress related: Chrousos & Gold (1992), Lovallo (1997), Lovallo & Thomas (2000)
  Illness related: Chrousos & Gold (1992), Lovallo (1997), Lovallo & Thomas (2000), Sapolsky

Epinephrine
  Description: Secreted rapidly by adrenal medulla. Affects cardiac function and blood pressure (BP).
  References: Lovallo (1997), Lovallo & Thomas (2000)

Norepinephrine
  Description: Present in CNS, peripheral sympathetic nerves, and the adrenal medulla. Secretion affected by muscular exertion and effort. Affects cardiac function, BP, and immunity.
  Stress related: Lovallo (1997), Lovallo & Thomas (2000)
  Illness related: Kohm & Sanders (2001)

Prolactin
  Description: Peptide hormone secreted by anterior pituitary. Can increase with high physical and/or psychological stress exposure. Has an immune stimulating effect.
  Stress related: Irwin et al. (1997)
  Illness related: McMurray (2001)

Forskolin stimulation
  Description: Stimulates post-beta receptor adenylate cyclase activation. May be used to test changes in cyclic AMP independent of beta-2
  References: Mills et al. (1997)

Growth hormone
  Description: Growth hormone mRNA expression from B and T cells facilitates immunity. Low levels of growth hormone mRNA expression suggest reduced growth hormone production.
  Stress related: Wu et al. (1999)
  Illness related: Lovallo (1997), Lovallo & Thomas (2000), McCallum et al.

GABA
  Description: The neurotransmitter GABA is known to counter certain stress responses in animals.
  References: Pomara et al. (1989), Sanacora et al. (2000)
  Description (continued): Neurotransmitters found in cerebrospinal fluids (CSF) reflect CNS activity. Low CSF GABA has been found in people with

Neuropeptide Y
  Description: Neurotransmitter in both CNS and peripheral sympathetic nerves, which increases catecholamine activities and influences feeding
  Stress related: Irwin et al. (1991),
  Illness related: Irwin et al. (1991),

Beta receptors
  Description: Beta-adrenergic receptors mediate relationships among catecholamines, cardiac function, and BP.
  Stress related: Naga-Prasad et al. (2001)
  Illness related: Haeusler (1990), Herd (1991), Naga-Prasad et al. (2001)
Enumerative Cell counts, percentages, and CD4:CD8 ratios. High levels of some
immune cells are generally associated with health, others with
infectious illness; for example, high white blood cell count is
generally associated with active infections; CD3 refers to total T
lymphocyte cell count; CD4 cells are generally considered to be
positively related to the capacity to defend against many illnesses.
Low CD4 can be caused by severe disease, such as HIV infection,
and reflects disease vulnerability. High CD8 (cytotoxic T cells)
counts may reflect an overactive immune system or a current
illness the immune system has difficulty controlling. CD4:CD8
ratio is used as an immunological index, with higher ratios
presumed healthier. Higher NK (CD56) and B-cell counts are
generally thought to reflect better immune reserves.
Cohen & Herbert (1996),
Herbert & Cohen
(1993a, 1993b), Lovallo
(1997), Lovallo &
Thomas (2000)
Cohen & Herbert (1996),
Herbert & Cohen
(1993a, 1993b),
Lovallo (1997),
Lovallo & Thomas
Functional Lymphocyte proliferation in response to mitogen challenge. Index of
the ability of T and B cells to proliferate in response to mitogens
such as phytohemagglutinin, concanvillan A, and poke weed. Low
levels may suggest illness susceptibility.
Cohen & Herbert (1996),
Herbert & Cohen
(1993a, 1993b), Irwin
(1999), Lovallo (1997),
Lovallo & Thomas
Horiuchi et al. (1995),
Irwin (1999)
Natural killer cell activity (NKA) and lymphokine activated killing
are early immunological defenses against tumor cells and viral
infections independent of prior exposure. Common measures of
NKA include unstimulated killing response to tumor cells and
response to tumor cells after cytokine stimulation (such as IL-2 or
interferon gamma), which may better reflect in vivo NKA.
Moderate to high NKA is consistent with good health; chronically
low NKA may reflect disease susceptibility.
Cohen & Herbert (1996),
Herbert & Cohen
(1993a, 1993b), Lovallo
(1997), Lovallo &
Thomas (2000)
Lovallo (1997), Lovallo
& Thomas (2000)
Table 1 (continued)
Measure Description Stress related Illness related
Immunologic (continued)
Delayed type hypersensitivity response is the skin's capacity to respond
to multiple antigens (J. J. McCann, 1991, used tuberculin, tetanus,
streptococcus, diphtheria, candida, trichophyton, and proteus),
usually 48 hr after exposure. Two typical response measures are
the number of positive antigen reactions (antigen score) and the
size (diameter) of the antigenic skin response (induration).
Stress related: Dhabhar (2000)
Illness related: Dannenberg (1991)
Healing speed of a standardized skin puncture wound is used as a
general health index, and is dependent on interactions of
immunological, hormonal, and nutritional factors.
Stress related: Kiecolt-Glaser et al.
Illness related: Kiecolt-Glaser et al.
Tumor necrosis factor is an inflammatory cytokine. High levels are
thought to promote loss of muscle tissue, aging, and poor general
health. Cytokine secretion has complex effects on immunological
responses and can affect cellular immune responses (such as
activating macrophages) and humoral immunity through beta-cell
Stress related: Dantzer et al. (1999), Lovallo (1997), Lovallo & Thomas (2000)
Illness related: Dantzer et al. (1999), Lovallo (1997), Lovallo & Thomas (2000)
Vaccinations: Immunoglobulin G (IgG) vaccination response (influenza and
otherwise). High antibody responses reflect health. A four-fold
memory response postvaccination implies a vigorous immune
Stress related: Cohen & Herbert (1996), Herbert & Cohen (1993a, 1993b),
Lovallo (1997), Lovallo & Thomas (2000)
Illness related: Cohen & Herbert (1996), Herbert & Cohen (1993a, 1993b),
Lovallo (1997), Lovallo & Thomas (2000)
Epstein Barr virus: EBV virus capsid antigen IgG titers. High levels of
EBV antibodies, in the absence of vaccination or antigenic challenge, may
reflect latent EBV viruses that the cellular immune system is having
difficulty suppressing.
Stress related: Cohen & Herbert (1996), Herbert & Cohen (1993a, 1993b),
Lovallo (1997), Lovallo & Thomas (2000)
Illness related: Murray & Young (2002)
Immunoglobulins: A GAM profile includes measures of multiple plasma
immunoglobulins (IgG, IgA, IgM). Although different diseases can
result in either increases or decreases in immunoglobulins, in the
absence of specific disease, it is assumed that lower levels are
associated with poorer immunity. These are relatively general
measures as opposed to specific immunological memory responses,
such as antibody responses to influenza vaccination.
Stress related: Valdimarsdottir & Stone
Illness related: Pyne & Gleeson (1998)
Systolic BP (SBP): The point of highest recorded BP, usually measured at
rest. SBP ≥ 140 suggests hypertension, which is associated with coronary
heart disease (CHD) and mortality.
Stress related: Herd (1991), Lovallo (1997), Lovallo & Thomas (2000)
Illness related: Herd (1991), Lovallo (1997), Lovallo & Thomas (2000)
Diastolic BP (DBP): The point of lowest recorded BP, usually measured at
rest. DBP ≥ 90 suggests hypertension.
Stress related: Herd (1991), Lovallo (1997), Lovallo & Thomas (2000)
Illness related: Herd (1991), Lovallo (1997), Lovallo & Thomas (2000)
Heart rate: Measured in heart beats/min. Higher rates may be associated with
reduced heart function.
Stress related: Herd (1991), Lovallo (1997), Lovallo & Thomas (2000)
Illness related: Herd (1991), Lovallo (1997), Lovallo & Thomas (2000)
Insulin, glucose: Insulin, an anabolic hormone, causes glucose storage;
hyperinsulinemia reflects difficulty controlling glucose.
Stress related: Schneiderman & Skyler
Illness related: Schneiderman & Skyler
Glucose is necessary for muscular activity and proper brain function,
but high fasting glucose (≥ 126) suggests glucose intolerance and
possible diabetes, conditions associated with CHD.
Stress related: Schneiderman & Skyler
Illness related: Schneiderman & Skyler
Body mass index is the ratio of weight/height squared. High numbers
reflect obesity.
Stress related: Schneiderman & Skyler
Illness related: Schneiderman & Skyler
Transferrin: Transferrin is an index of stored iron in the body. Low iron
levels may be associated with anemia.
Bostian et al. (1976)
Plasma lipids: Low-density cholesterol and triglycerides are associated with
obesity and CHD. High-density cholesterol is associated with physical
fitness and reduced CHD.
Stress related: Schneiderman & Skyler
Illness related: Schneiderman & Skyler
Note. CNS = central nervous system; AMPA = alpha-amino-3-hydroxy-5-methylisoxazole-4-propionic acid;
GABA = gamma-aminobutyric acid; CD = cluster designation; NK = natural killer; IL-2 = interleukin 2;
GAM = profile for immunoglobulins G, A (IgA), and M (IgM).
theoretical reasons, however, we also examine whether certain
health indicators are more related to caregiving than others and
whether some subgroups of caregivers are at greater risk for health
problems. A final objective is to critique the state of research in
this area and provide recommendations for future work.
Inclusion and Exclusion Criteria Used to Search the
Caregiver Physical Health Literature
We located reports using electronic and manual searches and the reference
lists of identified articles. We sampled Current Contents since 1996,
MEDLINE since 1983, PsycINFO since 1967, Sociofile since 1974, Social
Work Abstracts since 1977, and Nursing and Allied Health since 1982. To
be included, a report had to examine caregivers of persons with dementia
and physical health problems or illnesses or physiology. In each database
we included the following key words or word stems (an asterisk serves as a
wild card, here indicating that the search contained but was not limited
to that word or word stem): dementia, Alzheimer's disease, cognitive
disorders, health, physical and health, physiolog*, illness*, hormones,
cholesterol, cardiovascular diseases, blood pressure, obesity, diabetes,
immun*, mortality, death, and caregiv*. Only articles in English were
searched, with an inclusion end date of April 1, 2001. Medical subject
headings (MeSH, the National Library of Medicine vocabulary) were also
used for MEDLINE articles. We also hand searched the Journals of
Gerontology, The Gerontologist, the Journal of the American Geriatrics
Society, Psychology and Aging, Psychosomatic Medicine, Health Psychology,
the Journal of Behavioral Medicine, the Annals of Behavioral Medicine,
the Journal of the American Medical Association, and the Journal of
Aging and Health. Bibliographies from reviews were also used (Schulz et
al., 1990, 1995; Vitaliano, 1997). We excluded articles that (a) were
non-data based, (b) examined caregivers of care recipients who did not
have dementia (e.g., cancer patients), (c) did not include health or
physiological data, or (d) lacked a noncaregiver comparison group. Because we
could not guarantee that all studies were uncovered, we examined the
robustness of our findings relative to possible sources of data censoring. To
do this, we used the trim and fill procedure (Sutton, Song, Gilbody, &
Abrams, 2000), which is described below.
Results of searches. For parsimony we provide only the results for
MEDLINE, as it yielded the vast majority of articles. After crossing the
key words dementia (Alzheimer's) by caregiv* by physical health (or
immunity, physiology, cardiovascular diseases, endocrinology), we obtained
81 articles. When these terms were crossed by control, noncaregiver,
or comparison, 35 articles resulted. The other databases yielded
7 additional articles. Hence, 42 (35 + 7) of 88 articles, or 48%, met criteria.
Using similar key words in the ProQuest Digital Dissertations database
(UMI from 1984), we obtained 10 entries; however, only 3 were retained
(Atkinson, 1995; Giefer, 1994; J. J. McCann, 1991) because the others had
been published and included above or they did not include physical health
data or noncaregivers. This resulted in 45 reports.
Independence of samples. To clarify whether multiple reports from one
research group came from independent samples, we compared the demo-
graphic data and descriptions of participants. Three labs produced reports
with partial to substantial participant overlap. These authors were con-
tacted, and it was determined that the Ohio State University (OSU) and
University of California, San Diego (UCSD) labs used three and two
independent samples, respectively, and the University of Washington
(UW), University of Alabama, and Medical College of Georgia each used
one sample. Of the three dissertations, two used independent samples, and
the third was based on data from the second OSU sample that had not been
published (Atkinson, 1995). Overall, the 42 articles and one dissertation
were generated from 21 independent samples, and two dissertations were
generated from 2 independent samples. This yielded a total of 23 samples.
Eighteen samples each produced 1 report, and 5 samples produced 27
reports. Of the latter, 10 articles were from the OSU laboratory, 8 articles
were from the UW laboratory, 5 articles were from the UCSD laboratory,
and 2 articles each were from the University of Alabama and the Medical
College of Georgia.
Health Categories Assessed
Table 2 summarizes the 45 reports according to the 23 samples that
generated them. For each report there are eight main columns containing
the (a) identifying number of the sample; (b) report's authors and year of
publication; (c) mean ages of the caregivers and noncaregivers; (d) types of
disorders of the care recipients; (e) caregiver sample size, stratified on sex;
(f) noncaregiver (control) sample size, stratified on sex; (g) types of
reported health measures; and (h) types of physiological measures. Eleven
samples used only reported health measures, six used only physiological
measures, and six used both. There were five self-reported health catego-
ries and six physiological categories.
The stem physiolog* subsumes terms such as physiology and physiological.
The stem caregiv* subsumes terms such as caregiver, caregivers, and caregiving.
Figure 1. Theoretical model of stress and health/illness.
Table 2
Summary of the 45 Reports Included in the Meta-Analysis
Sample | Report | Mean age | Care recipients | Caregivers (M, W) | Controls (M, W) | Self-reported measures | Physiological measures
1 | Kiecolt-Glaser et al. | CG: 59.3; CO: 60.3 | AD | 11, 23 | 11, 23 | Global self-reported health, number of physician visits, number of days ill | NK cell percentage, CD#, CD4, CD4:CD8 ratios, EBV antibody; EBV VCA IgG titers, albumin, transferrin
2 | Cacioppo et al. (1998) | CG and CO: 67.2 | AD or other dementia | 0, 27 | 0, 37 | | Proliferation (ConA, PHA), NKA, NK cell count, CD3, CD4, CD8, CD4/CD8
2 | Esterling et al. (1994) | CG: 70.4; CO: 70.9 | AD | 10, 21 | 9, 22 | Number of chronic conditions, number of physician visits, number of days ill | NKA, NK cell percentage, NKA response to rIFN or rIL-2, mean cell fluorescence
2 | Esterling et al. (1996) | CG: 65.7; CO: 68.9 | AD or other dementia | 10, 18 | 7, 22 | | NKA, NK cells response to rIFN or rIL-2, BMI
2 | Glaser & Kiecolt-Glaser (1997) | CG: 60.6; CO: 62.4 | AD, Parkinson's disease, Huntington's, Pick's | 13, 58 | 13, 45 | Percentage with cold sore history | Herpes antibody titers, HSV-1 specific T-cell response
2 | Kiecolt-Glaser et al. | CG: 67.3; CO: 67.8 | AD, multi-infarct dementia, Parkinson's disease, Huntington's, Pick's disease, unspecified dementia | 20, 49 | 20, 49 | Number of chronic conditions, number of physician visits, number of days ill | ConA, PHA, NK cell percentage, CD3, CD4, CD8, B cell, EBV VCA IgG titers
2 | Kiecolt-Glaser et al. | CG: 73.1; CO: 73.3 | AD | 14, 18 | 14, 18 | | CD3, CD4, CD8, monocytes, IgG response to vaccination (ELISA), IgG response to vaccination (HAI), cytokine response: IL-1b, cytokine response: IL-6, cytokine response: IL-2
2 | Uchino et al. (1992) | CG and CO: 63.5 | AD | 13, 23 | 6, 28 | | SBP, DBP, HR
2 | Wu et al. (1999) | CG: 64.0; CO: 68.0 | AD | 0, 9 | 0, 9 | | Growth hormone mRNA expression from B cells and T cells
2 | Glaser et al. (1998) | CG: 73.0; CO: 71.5 | AD or other dementia | 15, 39 | 20, 49 | | Four-fold antibody response to flu vaccine; baseline antibody levels
2 | Atkinson (1995) | CG: 67.2; CO: 68.1 | AD | 23, 52 | 23, 57 | Medication use | Weight change
3 | Kiecolt-Glaser et al. | CG: 62.3; CO: 60.4 | Dementia | 0, 13 | 0, 13 | | IL-1b mRNA expression in response to lipopolysaccharide, IL-1b mRNA expression in response to TNF, IL-1b mRNA expression in response to GM-CSF, number of days of wound
4 | Irwin et al. (1991) | CG and CO: 71.3 | AD | 18, 30 | 6, 11 | | EPI, NEPI, neuropeptide Y, NKA
5 | Irwin et al. (1997) | CG: 71.0; CO: 69.6 | AD | 43, 57 | 14, 19 | | ACTH, beta-endorphin, cortisol, prolactin, EPI, NEPI, neuropeptide Y, NKA
5 | Mills et al. (1997) | CG: 73.5; CO: 74.0 | AD | 6, 21 | 8, 2 | | CD4, CD8, CD16, cortisol, EPI, NEPI, beta-receptor sensitivity, beta-receptor density, forskolin stimulation, SBP, DBP, weight
5 | Patterson et al. (1998) | CG: 63.7; CO: 61.2 | AD | 68, 106 | 71, 69 | Number of physical symptoms |
5 | Shaw et al. (1997) | CG: 70.6; CO: 69.7 | AD | 53, 97 | 22, 24 | Medication use, hospitalization, objective health rated by nurse |
5 | Shaw et al. (1999) | CG: 70.5; CO: 70.2 | AD | 51, 93 | 24, 23 | Antihypertensive medication use | SBP, DBP, BMI
6 | Picot et al. (1997) | CG: 60.8; CO: 55.5 | AD and other chronic | 0, 18 | 0, 24 | Number of chronic conditions, antihypertensive medication use |
7 | Pomara et al. (1989) | CG: 63.6; CO: 64.5 | AD | 0, 5 | 2, 2 | | CSF, GABA
8 | Reese et al. (1994) | CG: 56.3; CO: 60.9 | AD, stroke | 16, 34 | 8, 17 | | NK cell, lymphocyte, CD3, CD4, CD8, CD4/CD8
9 | Vedhara et al. (1999) | CG: 73.0; CO: 68.0 | AD, dementia, Parkinson's | 24, 26 | 31, 36 | | Salivary cortisol, IgG responses to vaccines
10 | Scanlan et al. (1998) | CG: 69.8; CO: 69.1 | AD | 29, 52 | 25, 57 | Number of chronic conditions | CD4, CD8, CD4:CD8 ratio, BMI
10 | Vitaliano, Russo, et al. | CG: 69.4; CO: 68.5 | AD | 30, 52 | 30, 48 | Antihypertensive medication use | SBP, DBP, HR
10 | Vitaliano et al. (1995) | CG: 69.8; CO: 69.1 | AD | 36, 60 | 29, 62 | | Total cholesterol, LDLC, HDLC, triglycerides, BMI
10 | Vitaliano, Scanlan, Krenz, & Fujimoto | CG: 69.8; CO: 69.1 | AD | 25, 48 | 20, 49 | | Insulin, glucose, lipids, BMI
10 | Vitaliano, Scanlan, Krenz, Schwartz, & Marcovina (1996) | CG: 69.8; CO: 69.1 | AD | 29, 52 | 26, 60 | Antihypertensive medication use | BMI, lipids
10 | Vitaliano, Scanlan, Ochs, et al. (1998) | CG: 69.8; CO: 69.1 | AD | 27, 53 | 26, 59 | | NKA
10 | Vitaliano, Scanlan, Siegler, et al. (1998) | CG: 69.8; CO: 69.1 | AD | 24, 47 | 21, 49 | | Metabolic syndrome, SBP, DBP, insulin, glucose, BMI, HDLC, triglycerides
10 | Vitaliano et al. (1999) | CG and CO: 69.5 | AD | 29, 51 | 25, 60 | | NKA
11 | Almberg et al. (1998) | CG: 66.6; CO: 68.8 | Dementia | 15, 37 | 23, 43 | Health problems in a burden questionnaire, coded as yes/no |
12 | Baumgarten et al. | CG: 66.7; CO: 60.4 | Dementia | 40, 63 | 46, 69 | Global self-reported health, medication use, number of physical symptoms |
12 | Baumgarten et al. | CG: 63.2% 65; CO: above 65 | Dementia | 33, 62 | 41, 63 | Number of chronic conditions; medical record of number of physician visits |
13 | Grafstrom et al. (1992) | - | Dementia | 35, 75 | 73, 159 | Medication use, number of physician visits, percentage having health worse than expected |
14 | Haley et al. (1987) | CG: 57.8; CO: 53.4 | Dementia | 12, 32 | 12, 32 | Number of physical symptoms (PILL), global self-reported health, number of physician visits, number of current prescription medications, number of chronic conditions, health status |
15 | Haley et al. (1995) | CG: 58.7; CO: 56.4 | Dementia | 60, 115 | 58, 117 | Cardiovascular and respiratory symptoms (Cornell Medical Index); global self-reported health; number of physician visits, hospitalizations; medication use |
16 | Lorensini & Bates (1997) | CG: 31.2% under 50; under 50 | Dementia | 21, 64 | 6, 41 | Number of days in bed or hospital in past 12 months, number of physician visits in past 12 months |
17 | McNaughton et al. (1995) | CG: 71.1; CO: 70.4 | AD | N = 89 | N = 31 | Number of physical symptoms (Interim Medical Survey); objective health (ratings based on number of physician visits, medication use, number of physical symptoms, |
18 | O'Reilly et al. (1996) | CG: 56.5; CO: 56.4 | Parkinson's disease | 55, 99 | 47, 77 | Number of chronic illnesses; number of physician visits, hospitalizations; medication use |
19 | Rose-Rego et al. (1998) | CG: 70.4; CO: 72.5 | AD | 38, 61 | 39, 74 | Global self-reported health |
20 | Wright (1994) | CG: 69.0; CO: 68.6 | AD | 6, 24 | 4, 12 | Multilevel Assessment Inventory (MAI; self-rated health; number of physician visits in the past year, days of hospitalization in the past year, medications taken per day; and ratings of heart and circulatory problems make up combined score; a maximum of 27 indicates perfect |
20 | Wright et al. (1999) | CG: 67.2; CO: 63.9 | AD, stroke | 17, 11 | 9, 5 | Global self-rated health; MAI |
21 | Fuller-Jonap & Haley (1995) | CG: 74.5; CO: 74.1 | AD | 52, 0 | 53, 0 | Global self-rated health, number of physician visits; cardiovascular and respiratory symptoms (Cornell Medical Index) |
22 | J. J. McCann (1991) | CG: 67.5; CO: 67.2 | AD, multi-infarct, and unspecified dementia | 0, 34 | 0, 33 | Global self-rated health; number of physician visits, days of illness, chronic illnesses, medications | Number and percentage of total lymphocyte, CD3, CD4, CD8, CD56, and B cells; delayed skin
23 | Giefer (1994) | CG and CO: 75.0 | Dementia and frail older | 14, 26 | 14, 24 | | IgG profile (IgG, IgM, IgA)
Note. A dash indicates that the data could not be computed because some or all of the information was unavailable.
CG = caregiver; M = men; W = women; CO = control (noncaregiver); AD = Alzheimer's disease; NK = natural killer;
CD = cluster designation; EBV = Epstein Barr virus; VCA = virus capsid antigen; IgG = immunoglobulin G;
ConA = concanavalin A; PHA = phytohemagglutinin; NKA = natural killer cell activity; rIFN = interferon receptor;
rIL-2 = interleukin 2 receptor; BMI = body mass index; HSV-1 = herpes simplex virus Type 1;
ELISA = enzyme linked immunosorbent assay; HAI = hemagglutinin inhibition assay; IL-1b = interleukin-1b;
IL-6 = interleukin 6; IL-2 = interleukin 2; SBP = systolic blood pressure; DBP = diastolic blood pressure;
HR = heart rate; TNF = tumor necrosis factor; GM-CSF = granulocyte-macrophage colony stimulating factor;
EPI = epinephrine; NEPI = norepinephrine; CSF = cerebrospinal fluid; GABA = gamma-aminobutyric acid;
LDLC = low-density cholesterol; HDLC = high-density cholesterol; PILL = Pennebaker Inventory of Limbic Languidness;
IgM = immunoglobulin M; IgA = immunoglobulin A.
This effect size was not included in some of the meta-analyses.
Both men and women are included in this number (N) because the original study did not stratify the category by sex.
Five reported health categories. Table 2 includes the report references
for each category, so they are not repeated here.
1. Global self-reported health was examined in 10 samples (n = 717
caregivers; n = 879 noncaregivers). Five samples used a
single self-rated item, "How would you rate your current
health?" on a 4-point or 5-point scale from poor to excellent. As
noted below, these scores were reversed in the analysis so that a
higher score corresponded to poorer health. Other reports used
items such as "increase in health problems" and "percentage who
had worse health than expected" or composite health scores from
the Multilevel Assessment Inventory (Lawton, Moss, Fulcomer,
& Kleban, 1982) and the Duke University Center for the Study
of Aging and Human Development (1978).
2. Number of chronic conditions was assessed in seven samples
(n = 476 caregivers, n = 461 noncaregivers). Chronic illnesses
were measured by checklists. One example is the Health Status
Questionnaire (Belloc, Breslow, & Hochstim, 1971).
3. Number of physical symptoms was assessed in eight samples
(n = 742 caregivers, n = 649 noncaregivers). Measures included
the Pennebaker Inventory of Limbic Languidness (Pennebaker,
1982), the Cornell Medical Index, and the Interim Medical Survey.
4. Medication use was examined in 10 samples (n = 941 caregivers,
n = 960 noncaregivers). This included the number of medications
or percentage of people taking a medication. As this is a
review of physical health, we focused on somatic medications.
5. Health service utilization was assessed in 11 samples (n = 1,002
caregivers, n = 961 noncaregivers). This included number of
clinic visits in the past 3 or 6 months, days in hospital, percentage
of people who visited a physician more than once, and percentage
of people hospitalized.
Six physiological health categories. Here we only present information
on the measures used in each category and not the rationales, as they are
provided in Table 1.
1. Antibodies were assessed in four samples (n = 175 caregivers,
n = 187 noncaregivers), namely, immunoglobulin G (IgG) response
to herpes simplex virus (HSV) Type 1 and vaccination,
Epstein Barr virus (EBV) virus capsid antigen IgG titers, and
2. Other functional immune measures were examined in six samples
(n = 308 caregivers, n = 216 noncaregivers). The second and third
OSU samples examined T-cell proliferation responses
to mitogens (concanavalin A, phytohemagglutinin), responses
to cytokine stimulation (NKA in response to interferon-receptor,
lymphokine activated killing in response to interleukin
2 receptor), cytokine responses (interleukin-1b [IL-1b], etc.),
IL-1b mRNA in response to lipopolysaccharide, tumor necrosis
factor, granulocyte-macrophage colony-stimulating factor, and
delayed skin hypersensitivity from antigen and induration scores.
3. Enumerative immunity was examined in six samples (n = 266
caregivers, n = 226 noncaregivers), namely, lymphocyte counts
and percentages of different subsets (e.g., cluster designation
[CD]4, CD8).
4. Stress hormones and neurotransmitters were examined in five
samples (n = 176 caregivers, n = 119 noncaregivers). Measures
included ACTH, epinephrine, norepinephrine, cortisol, prolactin,
neuropeptide Y, gamma-aminobutyric acid (GABA), beta-receptor
sensitivity and density, forskolin stimulation, and
growth hormone from B and T cells.
5. Cardiovascular measures were examined in four samples (n = 217
caregivers, n = 161 noncaregivers). These included systolic
and diastolic blood pressure and heart rate.
6. Metabolic measures were examined in five samples (n = 309
caregivers, n = 256 noncaregivers). These included body mass,
weight, cholesterol, insulin, glucose, and transferrin.
The speed of healing in response to a standardized skin wound has been
used to assess a caregiver's ability to heal damaged tissue. It is dependent
on interactions of immune, hormonal, and nutritional factors. Because it
was only examined in one sample (Kiecolt-Glaser, Marucha, Malarkey,
Mercado, & Glaser, 1995), it could not be included in a category by itself,
but it was used to calculate global relationships.
Coding Procedures and Rater Reliability
Jianping Zhang and James M. Scanlan examined the 45 reports for
codeable data, such as means, t tests, and so on. Data were included if
information was available to transform a contrast to the point-biserial
correlation (r). We used the point-biserial correlation as the effect size
because it reflects the observational nature of caregiver research and it is
easy to understand and interpret. In each computation, we coded noncaregivers
as 0 and caregivers as 1. As such, larger point-biserial correlations
suggested greater health risks for caregivers. A total of 172 computable
point-biserial correlations were obtained. Rater reliability was .92 for two
raters across five reports with self-reported measures and six reports with
physiological measures.
Analyses Used to Calculate Within-Sample Point-Biserial Correlations
Lipsey and Wilson (2001) provided the main source for the point-
biserial correlation calculations. Below we discuss the decision rules used
for their derivation.
1. If the mean, standard deviation, sample size, or percentage was
included for each measure in each group but the article did not
report a statistical test for the difference between caregivers and
noncaregivers, we computed a t test or chi-square test for independent
samples using the pooled variance. The r was transformed
from t according to standard formulae.
2. If the tests and degrees of freedom were included, point-biserial
correlations were computed using the above transformation re-
gardless of whether the means and standard deviations were
reported. This was done because in cases of missing data or
outliers, the sample used to calculate the statistical test might not
have been the same as the one reported in the descriptive results.
3. If a probability value was reported without a test value and the
sample sizes were known, we estimated the test value using an inverse
distribution function and obtained the point-biserial correlation.
4. If an article reported that a difference between groups was not
significant, but it did not provide a test value, probability value,
or means and standard deviations, we recorded the comparison as
nsnd (nonsignificant, no data). In such cases, we calculated the
average point-biserial correlations twice for that sample and
category, namely, without the nsnd and by setting the nsnd
values to zero. This allowed us to observe how nonsignificant
findings influenced the average point-biserial correlations.
5. In one article (Mills et al., 1997), two groups of caregivers were
available. In this case, we first averaged the groups and then
computed the point-biserial correlations.
6. Some articles included not only groups of caregivers whose care
recipients were living at home, but also other types of caregivers.
The latter included former caregivers (Glaser, Kiecolt-Glaser,
Malarkey, & Sheridan, 1998), bereaved caregivers (Cacioppo et
al., 1998; Esterling, Kiecolt-Glaser, Bodnar, & Glaser, 1994;
Esterling, Kiecolt-Glaser, & Glaser, 1996; Lorensini & Bates,
1997; Wright, Hickey, Buckwalter, Hendrix, & Kelechi, 1999),
caregivers of care recipients in nursing homes (Wright et al.,
1999), and caregivers using adult day care (Lorensini & Bates,
1997). When the data were stratified, we used only current
caregivers, caregivers caring for care recipients in their homes, or
caregivers that were not using adult day care.
7. In the few samples with repeated measures data, we used the first
timepoint to compute point-biserial correlations because this was
commensurate with data from the vast majority of other samples.
8. If the same measure was used across overlapping reports, the
point-biserial correlations were averaged and represented only
once for that sample. The mean of the resulting means was then
obtained. For example, NKA was used in four samples with
seven reports. As such, it was first averaged across the reports in
each sample and then the resulting four sample means were averaged.
9. For each physiological measure we had to determine the direc-
tion of its most common relationship with illness. In doing so we
recognized that although indicators such as obesity and high
glucose are usually associated with negative health outcomes,
very low body weight and glucose might also result from star-
vation. Moreover, high NKA and CD4 are usually associated
with better health, but there are rare diseases in which they may
be elevated. In almost all cases, the most common association
was for higher values with greater health risk, but for some
immune markers and high-density lipoproteins, lower values
indicated more risk. In these cases, we reversed the coding so
that higher values represented more risk.
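The conversions in Rules 1, 2, and 9 can be sketched in a few lines of Python. This is an illustrative sketch only. It assumes the widely used conversion r = t / sqrt(t^2 + df) as the "standard formula," and the function names and example numbers are hypothetical, not taken from any report in the meta-analysis.

```python
import math

def t_from_summary(mean_cg, sd_cg, n_cg, mean_nc, sd_nc, n_nc):
    """Independent-samples t test with pooled variance (Rule 1).

    Caregivers (coded 1) minus noncaregivers (coded 0), so a positive t
    means caregivers score higher on the measure.
    """
    df = n_cg + n_nc - 2
    pooled_var = ((n_cg - 1) * sd_cg ** 2 + (n_nc - 1) * sd_nc ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n_cg + 1.0 / n_nc))
    return (mean_cg - mean_nc) / se, df

def r_from_t(t, df):
    """Point-biserial r from t (assumed formula: r = t / sqrt(t^2 + df))."""
    return t / math.sqrt(t * t + df)

def risk_coded(r, higher_is_healthier=False):
    """Rule 9: flip the sign when higher raw values mean better health,
    so that a larger r always represents more risk for caregivers."""
    return -r if higher_is_healthier else r

# Hypothetical summary statistics for one measure in one report.
t, df = t_from_summary(3.0, 1.0, 50, 2.5, 1.0, 50)
r = r_from_t(t, df)
r_protective = risk_coded(r, higher_is_healthier=True)
```

With these made-up inputs the pooled standard error is 0.2, so t = 2.5 on 98 degrees of freedom and r is roughly .24. A measure such as NKA, where higher values indicate better health, would enter the analysis with the sign reversed.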
Analyses Used to Assess Central Tendencies, Variability,
Significance, and Moderators
Random-effects and fixed-effects models. Inferences about central ten-
dencies relative to several sources of variation can be made using random-
effects or fixed-effects models. The random-effects model is usually used
when the goal of a meta-analysis is to generalize the findings beyond the
collection of observed or identical studies. Greater generalizability is
achieved by incorporating between-studies variability into error estimates,
variance, and statistical tests (Hedges & Vevea, 1998; Lipsey & Wilson,
2001). The fixed-effects model is often used when one's goal is to make
inferences to the observed or identical studies, except for random error
sampling of participants into each study. Although the random-effects
model is more generalizable, the costs of applying it are generally broader
confidence intervals and less powerful test results when making condi-
tional inferences (Hedges & Vevea, 1998). Alternatively, when heteroge-
neous effect sizes exist, generalizing the results of fixed error models
beyond the studies used is problematic because systematic variability is
unexplained. Here, the point-biserial correlations were assumed to be
heterogeneous across samples when the Q statistic exceeded the critical
value of the chi-square, with k - 1 degrees of freedom. In such cases, we
examined potential outliers (Shadish & Haddock, 1994). Because the
fixed-effects model also allows important inferences, we used both ap-
proaches. However, we only report fixed error results when they are
significant and the random error results are not.
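The homogeneity check described above can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the authors' procedure: it assumes Fisher-transformed correlations weighted by n - 3, and it adds a DerSimonian-Laird style estimate of between-studies variance (tau2) as one common way to move from a fixed-effects to a random-effects model. The function name and inputs are hypothetical.

```python
import math

def homogeneity(rs, ns):
    """Return (Q, df, tau2) for correlations rs with sample sizes ns."""
    zs = [math.atanh(r) for r in rs]      # Fisher z transform of each r
    ws = [n - 3 for n in ns]              # assumed inverse-variance weights
    sum_w = sum(ws)
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum_w
    # Q: weighted squared deviations from the fixed-effects mean.
    q = sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))
    df = len(rs) - 1                      # k - 1 degrees of freedom
    # Assumed DerSimonian-Laird estimate of between-studies variance.
    c = sum_w - sum(w * w for w in ws) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return q, df, tau2

# Two hypothetical samples with clearly different effects.
q, df, tau2 = homogeneity([0.0, 0.5], [103, 103])
heterogeneous = q > 3.841   # .05 critical value of chi-square with 1 df
```

When Q exceeds the critical value, as it does for these made-up inputs, the effects are treated as heterogeneous and the random-effects results carry more weight in interpretation.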
Mean point-biserial correlations. Two types of tests were conducted:
those in which mean point-biserial correlations were compared
with zero and tests of contrasts, in which mean point-biserial correlations
were compared with each other. To calculate a mean point-biserial
correlation, all rs were first transformed to Fisher's Z. Each sample was then
weighted by its appropriate mean weights (Shadish & Haddock, 1994). If
a sample generated one report, the function of the sample size was used to
weight the point-biserial correlation in the composite of point-biserial
correlations from that category. If a sample generated multiple reports
using the same measure, the sample sizes in the reports were first averaged
to weight the point-biserial correlation for that measure. To calculate the
grand mean point-biserial correlation of the 23 samples, we computed the
unweighted mean point-biserial correlation within each sample. The 23
mean point-biserial correlations were then each weighted by a function of
their sample size (mean n was used for the OSU, UW, and UCSD samples)
and used to compute the grand mean. When computing the point-biserial
correlations for the five reported health categories or the six physiological
categories, we first averaged across point-biserial correlations within each
sample in that grouping (i.e., k = 17 and 12, respectively). These values
were then weighted by their sample size functions, and their averages were
obtained across the samples. This was also done for the means of the 11
health categories. In all cases, the means of the point-biserial correlations
were computed from Lipsey and Wilson's (2001) formulae using the SPSS
macro procedure. For the random error model, a test was used that assumes
this model (Bryk, Raudenbush, & Congdon, 1996; Shadish & Haddock,
1994). Contrast tests were done for reported health versus physiological
measures; global self-reported health versus utilization, medications, and
chronic illnesses; stress hormones versus cardiovascular and metabolic
measures; antibodies versus enumerative measures; and antibodies versus
other functional immune measures.
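A weighted mean correlation of the kind described in this section can be sketched as follows. This Python example is a minimal illustration, not the SPSS macro the authors used. It assumes that the "function of the sample size" is the usual inverse-variance weight n - 3 for Fisher's Z, and that a random-effects version simply adds a between-studies variance term (tau2, supplied by the caller) to each sample's variance. All names and numbers are hypothetical.

```python
import math

def mean_correlation(rs, ns, tau2=0.0):
    """Weighted mean r, its z test against zero, and a two-sided p value.

    rs   : one (already averaged) correlation per independent sample
    ns   : the matching sample sizes
    tau2 : between-studies variance; 0.0 gives the fixed-effects mean
    """
    zs = [math.atanh(r) for r in rs]                  # r -> Fisher's Z
    ws = [1.0 / (1.0 / (n - 3) + tau2) for n in ns]   # inverse variance
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    se = math.sqrt(1.0 / sum(ws))
    z_test = z_bar / se
    p = math.erfc(abs(z_test) / math.sqrt(2.0))       # normal, two-sided
    return math.tanh(z_bar), z_test, p                # back-transform to r

# Two hypothetical samples of equal size.
mean_r, z_test, p = mean_correlation([0.10, 0.20], [103, 103])
```

For these made-up inputs the mean r is about .15 and differs from zero at the .05 level. Passing a positive tau2 widens the standard error, which mirrors the broader confidence intervals of the random-effects model.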
Analyses Used to Examine Data Censoring and Data
We examined data censoring in two ways. First, we compared the mean
point-biserial correlations of published and unpublished studies. Second,
we used the trim and fill procedure. This method is useful when the
primary goal of a meta-analysis is to establish the existence of a relation-
ship between two variables. In such cases it is important to examine the
robustness of findings relative to all possible sources of data censoring. The
trim and fill method (Duval & Tweedie, 2000) first develops a funnel plot
of studies, defined by the effect sizes on the x axis and the standard errors
or sample sizes on the y axis. Once the plot is created, studies from the
asymmetric outlying part of the funnel are trimmed. The symmetric re-
maining studies provide a more accurate effect size estimate. Correct
calculation of the variance for the pooled estimate requires that the
trimmed studies be replaced and their apparent missing counterparts on the
funnel plot be filled by imputing values. This is based on the assumption
that the sides of the funnel plot should be a mirror image of each other
(Sutton, Song, et al., 2000). The fill step allows for adjusted overall
confidence intervals to be calculated. Funnel plots may be asymmetric
from factors other than publication bias. These include study quality,
different outcome measures, and so forth. Therefore, changed results based
on imputed values should be interpreted with caution (Sutton, Duval,
Tweedie, Abrams, & Jones, 2000). When a mean effect size withstands this
procedure it has enhanced credibility. Here we used the STATA macro
procedure, METATRIM (Steichen, 2001), which employs the methods of
Duval and Tweedie (2000). Finally, to examine the meaning and interpret-
ability of the grand means of the point-biserial correlations, we assessed
study-level effect sizes. This was done by correlating the point-biserial
correlations for different categories of health measures across those studies
that assessed more than one health indicator.
Across the 23 samples (n = 3,072), 18 were from North America, 4 were
from Europe, and 1 was from Australia. The grand mean for age was 65.0
years, and the range of the age means was 55 to 75 years. The median
percentage of women was 65.1, and the range was 0% to 100%. The median
percentage of non-White participants was 7.4%, with the sample
percentages varying from 0 to 100. Overall, there were 1,594 caregivers
and 1,478 noncaregivers who were group matched on age and sex in the 23
observational studies. The mean ages were 65.6 years (SD = 5.9) for
caregivers and 64.6 years (SD = 6.4) for noncaregivers. The median
percentage of women was 65.0 for caregivers and 65.2 for noncaregivers.
Point-Biserial Correlations of Caregiving With Indicators
of Physical Health
Grand means for all studies, self-reported studies, and physiological
studies. There were 172 point-biserial correlations that could be
calculated across the 45 reports (13 from dissertations and 159 from
articles). Forty-one nonsignificant comparisons were also reported (or
19% of 213 comparisons), with insufficient data to calculate
point-biserial correlations. For the 23 samples, the overall grand mean
was significant when the nonsignificant values were set to zero
(r = .10, p < .01), and it was also significant when these values were
ignored (r = .12, p < .01). The 17 samples that examined the five
self-reported health categories generated 7 to 14 contrasts in each
category. A total of 57 contrasts were reported, of which 53
point-biserial correlations could be calculated. For the self-reported
health categories the mean was also significant (r = .10, p < .01), both
with and without nonsignificant values set to zero. Caregivers reported
more health problems than did noncaregivers. The 12 samples that
examined the six physiological categories contained 14 to 56 contrasts.
Of the 155 contrasts (154 + 1 contrast for wound healing), 118
point-biserial correlations could be calculated. The mean was
significant (r = .11, p < .05) when nonsignificant values were set to
zero, as well as when they were ignored (r = .15, p < .01). These
results suggest that caregivers had greater potential illness risks
than did noncaregivers.
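The two ways of handling nonsignificant results reported without data can be illustrated with a minimal sketch. It assumes a simple unweighted mean and uses made-up effect sizes; the function name `mean_r` and its parameters are hypothetical and are not taken from the analyses reported here.

```python
from statistics import mean

def mean_r(computed_rs, n_missing=0, missing_as_zero=False):
    """Unweighted mean correlation.

    computed_rs: correlations that could be calculated.
    n_missing: number of nonsignificant results reported without data.
    missing_as_zero: if True, each missing result is counted as r = 0.
    """
    values = list(computed_rs)
    if missing_as_zero:
        values += [0.0] * n_missing
    return mean(values)

# Hypothetical effect sizes, not values from the studies reviewed here.
computed = [0.20, 0.10, 0.15, 0.05]
ignored = mean_r(computed, n_missing=2)
zeroed = mean_r(computed, n_missing=2, missing_as_zero=True)
print(round(ignored, 3), round(zeroed, 3))
```

Setting the missing results to zero can only pull the mean toward zero, which is why it is the more conservative of the two choices.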
Mean point-biserial correlations and variability within subcat-
egories. For each sample (row) of Table 3, we provide point-
biserial correlations for the five self-reported health categories that
were examined. The last three rows include the mean point-biserial
correlations using the random error model, without and with the
nsnd results set to zero, and the number of point-biserial correla-
tions used to calculate the means and, in parentheses, the number
of nsnd results in each health category. Table 4 is similar to Table
3, but it contains the physiological results.
Table 5 provides a summary of 14 groups of point-biserial
correlations. These include the set of all point-biserial correlations
calculated, the point-biserial correlations for all self-reported
health measures, the point-biserial correlations for all physiologi-
cal measures, and the point-biserial correlations for the 11 subcat-
egories. For each group, the table includes columns containing (a)
category name; (b) number of independent samples (k); (c) sample
sizes for caregivers and noncaregivers (controls); (d) mean point-
biserial correlation for the random-effects model ignoring nsnd,
with significance levels, 95% confidence intervals, and Q statistic;
(e) mean point-biserial correlations for the random-effects model
with nsnd results set to zero, with significance levels, 95% confi-
dence intervals, and Q statistic; and (f) trim and fill estimates of the
mean point-biserial correlation with nsnd values set to zero.
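For readers who want to see how a random-effects mean, its 95% confidence interval, and the Q statistic fit together, the following is a minimal sketch. It assumes the DerSimonian-Laird estimator applied to Fisher z-transformed correlations with variance 1/(n - 3); the text above does not state which estimator was used, so these choices, the function name `pool_random_effects`, and the input values are all illustrative assumptions.

```python
import math

def pool_random_effects(rs, ns):
    """Pool correlations with a DerSimonian-Laird random-effects model (assumed)."""
    zs = [math.atanh(r) for r in rs]           # Fisher z transform
    vs = [1.0 / (n - 3) for n in ns]           # approximate sampling variance of z
    w = [1.0 / v for v in vs]                  # fixed-effect weights
    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    # Q: weighted squared deviations from the fixed-effect mean (heterogeneity)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))
    df = len(rs) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-study variance
    w_re = [1.0 / (v + tau2) for v in vs]      # random-effects weights
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    ci = (math.tanh(z_re - 1.96 * se), math.tanh(z_re + 1.96 * se))
    return {"r": math.tanh(z_re), "ci": ci, "q": q, "tau2": tau2}

# Hypothetical study correlations and total sample sizes.
result = pool_random_effects([0.05, 0.30, 0.12, 0.22], [80, 120, 60, 150])
print(result)
```

When Q is no larger than its degrees of freedom, the between-study variance is estimated as zero and the random-effects and fixed-effects means coincide.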
From the table, one can see that in eight categories there was
either no difference or a minor difference between the mean
point-biserial correlations for the calculations in which nsnds were
ignored versus the calculations in which they were set to zero. In
other categories, the drop in the mean point-biserial correlation
was more substantial when zeros were substituted for nsnds. These
included global self-reported health (16% drop), functional cellular
immunity (23% drop), all physiological samples (27% drop), stress
hormones (28% drop), and enumerative immunity (42% drop).
This last difference is not surprising because 25 of the 56
point-biserial correlations in this category were nsnd. To be
conservative, we focus here on results with nsnd coded as zero (see M
and Q columns for weighted r with nsnd, Table 5).
Table 3
Point-Biserial Correlation Coefficients for Self-Reported Health Measures
Sample  Report  Global self-reported health  Chronic illnesses  Physical symptoms  Medication use  Health service utilization
1 Almberg et al. (1998) .28
2 Baumgarten et al. (1992, 1997) .28 .04 .25 .38 .02
3 Grafstrom et al. (1992) .13 .05 .07
4 Haley et al. (1987) .18 .25 .11 .22 .19
5 Haley et al. (1995) .09 .02 .03 .03
6 Lorensini & Bates (1997) .09
7 McNaughton et al. (1995) .07
8 O'Reilly et al. (1996) .08 .11 .01
9 Rose-Rego et al. (1998) .22
10 Wright (1994), Wright et al. (1999) .09
11 Fuller-Jonap & Haley (1995) .12
12 Kiecolt-Glaser et al. (1987)
13 OSU second sample .12 .11 .04 .29
14 UCSD second sample .05 .06 .08
15 Picot et al. (1997) .11 .04
16 UW group .24 .01
17 J. J. McCann (1991) .25 .12 .18 .36 .21
Mean r, random-effects model without nsnd .18 .11 .10 .12 .06
Mean r, random-effects model with nsnd at 0 .16 .11 .10 .12 .05
Number of rs (nsnd) 10 (2) 7 (0) 13 (0) 11 (0) 12 (2)
Note. Point-biserial correlations (rs) are unweighted. A dash indicates that the value could not be computed because it was recorded as nonsignificant,
with no data (nsnd). OSU = Ohio State University; UCSD = University of California, San Diego; UW = University of Washington.
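The percentage drops quoted above follow directly from the weighted means in Table 5. The short sketch below recomputes them as relative changes; the helper name `percent_drop` is hypothetical, and the input pairs are the Table 5 means without and with nsnd results set to zero.

```python
def percent_drop(without_nsnd, with_nsnd):
    """Relative drop (in whole percent) when nsnd results are set to zero."""
    return round(100 * (without_nsnd - with_nsnd) / without_nsnd)

# (mean r without nsnd, mean r with nsnd at zero), taken from Table 5
table5_means = {
    "Global self-reported health": (0.19, 0.16),
    "Functional cellular immunity": (0.13, 0.10),
    "All physiological samples": (0.15, 0.11),
    "Stress hormones": (0.32, 0.23),
    "Enumerative immunity": (0.12, 0.07),
}

drops = {name: percent_drop(a, b) for name, (a, b) in table5_means.items()}
for name, drop in drops.items():
    print(f"{name}: {drop}% drop")
```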
Across categories, the largest point-biserial correlations were for
stress hormones (k = 5; r = .23, p < .05), global self-reported health
(k = 10; r = .16, p < .01), and antibodies (k = 4; r = .15, p < .01). In
all cases, caregivers reported greater health problems and/or had
greater potential risks than did noncaregivers. For health service
utilization, chronic illnesses, metabolic, and cardiovascular
categories, the point-biserial correlations for the random-effects
model were not significant, but the point-biserial correlations for the
fixed-effects model were significant for chronic illnesses (k = 7;
r = .08, p < .05) and health service utilization (k = 11; r = .04,
p < .05). Fixed-effects models are difficult to interpret with
heterogeneous effect sizes, which was true for chronic illnesses.
However, when an outlier (-.075 for O'Reilly, Finnan, Allwright, Smith,
& Ben-Shlomo, 1996) was dropped, the measure of heterogeneity became
nonsignificant (Q = 4.93, p = .33) and the point-biserial correlation
became .14 (p < .01; k = 6). O'Reilly et al. (1996) examined caregivers
of Parkinsonian patients, but the other samples examined AD caregivers.
It has been shown that when distress is not controlled (as in O'Reilly
et al., 1996), caregivers of Parkinson's patients report fewer health
problems than caregivers of persons with AD (Hooker, Monahan, Bowman,
Frazier, & Shifren, 1998).
Table 5
Mean Random-Effects Model Point-Biserial Correlations (rs), Q Statistics, and Trim and Fill Estimates for Each Category
                                  n             Weighted r without nsnd     Weighted r with nsnd        Trim and
Category                      k   CG     CO     M      95% CI      Q        M      95% CI      Q        fill r
All samples                   23  1,594  1,478  .12**  .07, .17    45.09**  .10**  .05, .15    38.96*   .09**
All self-report samples       17  1,405  1,388  .10**  .05, .15    23.23    .10**  .05, .14    23.72    .10**
All physiological samples     12  601    461    .15**  .05, .25    25.78**  .11*   .02, .19    19.02    .11*
Global self-reported health   10  717    879    .19**  .12, .25    8.60     .16**  .10, .23    13.41    .16**
Chronic illnesses             7   476    461    .10    .00, .21    14.18*   .11    .00, .21    14.18*   .02
Physical symptoms             8   742    649    .10**  .03, .17    10.97    .10**  .03, .17    10.97    .03
Medication use                10  941    960    .12*   .03, .21    34.67**  .12*   .03, .21    34.67**  .12*
Health service utilization    11  1,002  961    .06    -.01, .14   17.77*   .05    -.01, .12   18.14    .05
Functional cellular immunity  6   308    216    .13    -.09, .34   20.04**  .10    -.08, .27   19.42**  .10
Antibodies                    4   175    187    .15**  .05, .25    1.48     .15**  .04, .25    1.43     .15**
Enumerative immunity          6   266    226    .12*   .02, .23    6.51     .07    -.02, .16   3.42     .07
Stress hormones               5   176    119    .32**  .08, .52    13.42**  .23*   .00, .43    11.71*   .23*
Cardiovascular                4   217    161    -.02   -.13, .10   0.02     -.01   -.12, .09   0.04     -.01
Metabolic                     5   309    256    -.01   -.14, .12   8.56     -.01   -.14, .12   8.56     -.01
Note. k = number of independent samples; CG = caregiver; CO = control (noncaregiver); nsnd = nonsignificant without sufficient information to
compute effect sizes; CI = confidence interval; Q = measure of heterogeneity.
* p < .05. ** p < .01.
Table 4
Point-Biserial Correlation Coefficients for Physiological Measures
Sample  Report  Functional cellular immunity  Antibodies  Enumerative immunity  Stress hormones  Cardiovascular  Metabolic
1 Kiecolt-Glaser et al. (1987) .26 .27 .13
2 OSU second sample .29 .17 .07 .46 .10
3 Kiecolt-Glaser et al. (1995) .20
4 Irwin et al. (1991) .32
5 UCSD second sample .14 .07 .03 .03 .19
6 Picot et al. (1997) .03
7 Pomara et al. (1989) .89
8 Reese et al. (1994) .03
9 Vedhara et al. (1999) .08 .23
10 UW group .05 .03 .01 .01
11 Giefer (1994) .14
12 J. J. McCann (1991) .41 .32 .15
Mean r, random-effects model without nsnd .13 .15 .12 .32 -.02 -.01
Mean r, random-effects model with nsnd at 0 .10 .15 .07 .23 -.02 -.01
Number of rs (nsnd) 23 (3) 13 (1) 31 (25) 17 (3) 11 (5) 23 (0)
Note. Point-biserial correlations (rs) are unweighted. A dash indicates that the value could not be computed because it was recorded as nonsignificant,
with no data (nsnd). OSU = Ohio State University; UCSD = University of California at San Diego; UW = University of Washington.
Contrast Tests
As we noted at the beginning of our article, there are reasons to
examine three contrasts. These include the relationships of caregiving
with global health versus other self-reported health, stress hormones
versus cardiovascular and metabolic measures, and antibodies versus
other immunologic measures. In performing these tests we eliminated the
problem of nonindependence within a contrast level, such as for other
self-reported health (utilization, medications, and illnesses), by
averaging over the point-biserial correlations for the multiple
correlated measures within each sample. However, although the
point-biserial correlations were independent within contrast levels,
they were not all independent across contrast levels, for example,
global versus other self-reported health. In such cases, the use of
independent tests across correlated samples made our findings more
conservative. Our results suggest that the point-biserial correlation
for global self-reported health (.16) was greater than that for the
combination of utilization, medications, and illnesses (.06),
Q(1) = 8.98, p < .01. The point-biserial correlation for stress
hormones (.19), using the random-effects model, was not greater than
that for cardiovascular and metabolic measures (.01), Q(1) = 3.50,
p = .06. However, under the fixed-effects model, the point-biserial
correlation for stress hormones (.17) was greater than that for
cardiovascular and metabolic measures (.00), Q(1) = 6.52, p = .01. The
point-biserial correlation for antibodies under the fixed-effects model
(.15) was also greater than that for other functional immune measures
(.01), Q(1) = 4.70, p = .03, but under the random-effects model the
point-biserial correlations for antibodies (.15) and for other
functional immune measures (.01) were not different, Q(1) = 2.80,
p = .10. Finally, despite an almost 2:1 ratio in magnitude, the
point-biserial correlations for antibodies and enumerative immunity were
not different.
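A contrast between two mean correlations can be sketched as a one-degree-of-freedom Q test. The version below is a simplified illustration, not necessarily the procedure used here: it assumes each mean is Fisher z-transformed and has a known variance on the z scale, and the function names and the variances in the example are hypothetical. The p value uses the fact that a chi-square variate with 1 df is a squared standard normal.

```python
import math

def q_between(r1, var_z1, r2, var_z2):
    """Q(1) statistic for the difference between two mean correlations.

    var_z1 and var_z2 are the variances of the means on the Fisher z scale.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    return (z1 - z2) ** 2 / (var_z1 + var_z2)

def p_value_df1(q):
    """Upper-tail probability of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(q / 2.0))

# Hypothetical variances, chosen only to show the mechanics.
q = q_between(0.16, 0.0006, 0.06, 0.0005)
print(round(q, 2), round(p_value_df1(q), 4))

# Tail probabilities implied by two of the Q(1) values reported above
print(p_value_df1(8.98) < 0.01, round(p_value_df1(3.50), 2))
```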
Tests of Demographic Variables as Moderators
To perform these analyses, we first recorded the number of
samples that assessed each potential moderator. Sex, age, and
relationship to the care recipient were reported in enough studies to
meaningfully allow such analyses.
Tests of sex as a moderator. We examined sex as a moderator
using several types of contrasts. These included (a) comparisons of
men and women caregivers (rs of sex with health indicators in
caregivers), (b) comparisons of caregivers and noncaregivers separately
for men and women (rs of caregiver status with health indicators
stratified on sex), and (c) comparisons of the point-biserial
correlations obtained for men and women in (b) above. In
the first comparisons, which were similar to correlating percent-
ages of women (or men) in each study with the obtained point-
biserial correlations, we were cognizant of the fact that the base
rates of certain health indicators may differ across sex. For exam-
ple, on average, men have higher systolic blood pressure, and
women have higher high-density lipoprotein levels and self-reported
health problems. As such, the latter tests are better indicators of
sex's moderation of caregiving with health. Also, studies
that have compared male caregivers with male noncaregivers may
differ in inclusion criteria and assay assessments from studies that
have compared female caregivers with female noncaregivers. For
this reason, the comparisons in (c) above compared the point-
biserial correlations in studies that had simultaneously examined
differences in caregivers and noncaregivers in both men and
women. These analyses had the effect of first controlling for sex
differences across samples by stratifying on sex and then control-
ling for different inclusion criteria and assay assessments across
studies by requiring the same studies across each contrast. Al-
though such criteria limited the number of studies that could be
used in each comparison, this resulted in less confounding from
sources other than sex. Using this approach, global health was
assessed in enough studies to be a separate category, but we had to
collapse the physiological measures into immunologic and
hormone-cardiovascular-metabolic (HCM) categories.
However, the comparisons of male caregivers with female care-
givers did allow for sex comparisons of three additional studies
(Gallant & Connell, 1997; Neundorfer, 1991; Sparks, Farran,
Donner, & Keane-Hagerty, 1998). These were not included in the
45 reports cited above because they did not contain noncaregivers.
Table 6 contains the point-biserial correlations using the
random-effects model for the analyses that examined sex moder-
ation. In these analyses women were coded as 1, and men were
coded as 2. Hence, positive point-biserial correlations meant that
the observed values for male caregivers were greater than those for
female caregivers, and negative point-biserial correlations meant
the opposite. According to this framework, female caregivers (n = 500)
reported poorer global health than did male caregivers (n = 262;
r = -.10, p < .01) for the five available samples (Gallant & Connell,
1997; Grafstrom, Fratiglioni, Sandman, & Winblad, 1992; Neundorfer,
1991; Rose-Rego, Strauss, & Smyth, 1998; Sparks et al., 1998). For the
HCM measures there was no difference in the 86 caregiver men and 128
caregiver women in the three available studies (Irwin et al., 1991; OSU
second sample; UW sample; r = .07, p = .15). However, the
point-biserial correlation was positive, which contributed to the
difference in the point-biserial correlations for global health (-.10)
and HCM (.07), Q(1) = 3.78, p = .05. When comparing immunologic
measures for caregiver men and women, the two studies (Irwin et al.,
1991; UW sample; n = 181) yielded a point-biserial correlation of .02.
Table 6
Point-Biserial Correlations (rs) for Analyses of Sex Moderation for Global Self-Reported Health and Physiological Measures
                          Global self-report  HCM             Immunologic
                          r        n          r       n       r       n
CG men vs. CG women       -.10**   762        .07     214     .02     181
CG men vs. NCG men        .15      222        .11     112     .06     112
CG women vs. NCG women    .25**    450        -.01    186     -.05    186
Note. A positive r suggests that the observed values of men were greater than those of women or that the
observed values of caregivers (CGs) were greater than those of noncaregivers (NCGs); a negative r suggests
that the observed values of women were greater than those of men or that the observed values of NCGs were
greater than those of CGs. HCM = hormone-cardiovascular-metabolic.
** p < .01.
In male caregivers versus male noncaregivers (n = 222), a
nonsignificant result occurred for the random-effects model (r = .15,
p = .13) in the three studies that assessed global health (Almberg,
Jansson, Grafstrom, & Winblad, 1998; Grafstrom et al., 1992; Rose-Rego
et al., 1998). However, the result for the fixed-effects model was
significant (r = .15, p < .05), and the point-biserial correlations
were homogeneous. Here, male caregivers reported worse global health
than did male noncaregivers. Two studies compared male caregivers with
male noncaregivers on HCM measures (Irwin et al., 1997; UW sample). In
these, the relationship was not significant (r = .11, p = .12) because
the sample size was 112 and only two studies were available. Also, the
mean for the HCM measures (r = .11) was not significantly different
from the mean for global health (r = .15). When these same two studies
were used to examine immune measures, the result was nonsignificant
(r = .06).
Caregiver women reported worse global health than noncaregiver women
in three studies (n = 450; Almberg et al., 1998; Grafstrom et al., 1992;
Rose-Rego et al., 1998), with a mean point-biserial correlation of .25
(p < .01). Two studies of HCM measures (Irwin et al., 1997; UW sample;
n = 186) yielded a nonsignificant result (r = -.01, p = .44). The mean
for global health (r = .25) was greater than that for HCM measures
(r = -.01), Q(1) = 2.90, p < .05. The same two studies that assessed
immunologic measures (Irwin et al., 1997; UW sample; n = 186) yielded a
nonsignificant result (r = -.05). The point-biserial correlations for
the immunologic measures were not different for men (.06) and women
(-.05).
Tests of caregiver age and relationship to the care recipient as
moderators. To examine whether age and relationship to the care
recipient were associated with the obtained point-biserial correla-
tions, we first determined whether they themselves were related.
As noted for the contrast tests, we obtained independence of
point-biserial correlations in these tests by first averaging over the
point-biserial correlations within samples that had multiple measures,
that is, all self-report measures or all physiological measures. Hence,
when examining the relationships of point-biserial correlations with a
potential moderator, each independent sample was represented by only
one point-biserial correlation. Our analyses suggest that age was not
related to the percentage of spouses across the 20 samples with such
data (r = .36, p = .11). In the 15 studies that included age, the
point-biserial correlations for self-reported health were higher in
samples with older caregivers (r = .49, p = .04). For the physiological
measures, no relationship was observed for age with the point-biserial
correlations. Also, no associations occurred for reported health
(r = .07, k = 16) and physiological measures (r = -.23, k = 11) with
the caregiver's relationship to the care recipient (spouse vs. child
caregiver).
Data Censoring and Interpretation
Unpublished dissertations versus published reports. Only one
dissertation used an independent sample to examine reported
health (J. J. McCann, 1991), so it was not tested for publication
bias. Two samples used unpublished physiological data (Giefer,
1994; J. J. McCann, 1991) that were compared with samples with
published physiological data. The point-biserial correlations (.11 and
.14) were not different, Q(1) = 0.07, p = .80.
Trim and fill estimates. We performed the METATRIM pro-
cedure on all 14 groupings (see Table 5). In 11 of 14 categories, no
studies were trimmed and filled. The 3 categories that changed
involved the total set of studies, the chronic illnesses category, and
the physical symptoms category. Across the total set of 23
point-biserial correlations, two studies were trimmed and filled, and
the overall point-biserial correlation dropped by .01; however, it was
still significant (r = .09, p < .05). In the other two cases, the
procedure trimmed and filled four studies each. In the chronic
illnesses category (k = 7), the mean point-biserial correlation dropped
from .11 to .02 and became nonsignificant. Likewise, the mean
point-biserial correlation for the physical symptoms category (k = 8)
changed from .10 to .03 and became nonsignificant.
Study-level effect size. Intercorrelations of point-biserial cor-
relations were calculated across measures in studies that had
assessed health indicators in two or more categories. Some corre-
lations were based on two or three studies, but we only interpreted
those computed on four or more studies. Of these, nine pairs of
correlations were computed across the self-report measures. The
values ranged from .16 to .98 (median r = .73). Only three
correlations could be computed on physiological measures in four
or more studies. The three correlations ranged from .08 to .91
(median r = .49).
To quantitatively review and critique the literature on caregiver
health, we compared 1,594 caregivers of persons with dementia
with 1,478 demographically similar noncaregivers. The point-biserial
correlations were as follows: .09 for all health indicators (k = 23),
.10 for reported health indicators (k = 17), and .11 for the
physiological indicators (k = 12). Although these point-biserial
correlations were significantly greater than zero, questions remain
regarding their magnitudes and their clinical relevance. At first
glance, it might appear that these are weak relationships, yet the
decision to set nsnd point-biserial correlations to zero had some
important effects in five categories. Enumerative immunity had a
large number of nsnd values, and when these were set to zero, the
point-biserial correlation dropped in magnitude by 42%, from .12
to .07. In other cases the point-biserial correlation dropped by 16%
(global self-reported health), 23% (functional cellular immunity),
27% (all physiological samples), and 28% (stress hormones).
However, even if one uses the estimates of the point-biserial
correlations that include nsnd results set to zero, the means still
have important implications because of the large number of persons at
risk. For example, the binomial effect size display (BESD; Rosenthal &
Rubin, 1982) was used in the Physicians' Clinical Aspirin Trial
(n = 22,000) to show that a point-biserial correlation of .034
translated to 374 fewer myocardial infarctions in the physicians who
ingested aspirin every other day (Steering Committee of the Physicians'
Health Study Research Group, 1988).
Using the BESD, the overall point-biserial correlation of .09
observed here translates to a 9% greater risk of health problems in
caregivers than in demographically similar noncaregivers. This is
important because there are more than 5 million caregivers of
persons with dementia in the United States (American Association
of Retired Persons, 1988) and at least another 5 million care
recipients who may be affected if their caregivers become ill.
These numbers are telling when one considers that caregivers had
a 23% higher level of stress hormones compared with noncaregiv-
ers and that prolonged physiological reactions to elevated stress
hormones, such as elevated blood pressure and glucose levels, can
increase one's risk for hypertension and diabetes (see Table 1).
The 15% poorer antibody production for caregivers may also be
critical because their mean age was 65.1 years. Older adults are at
higher risk for influenza, and their responses to vaccination are
lower than those of younger adults (Bernstein, Gardner, Abrutyn, Gross, &
Murasko, 1998). Moreover, if older adults do not receive regular
vaccinations, as many as 62% may have reduced antibodies to
common pneumonia serotypes (Sankilampi, Isoaho, Bloigu,
Kivela, & Leinonen, 1997).
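The BESD arithmetic behind the figures above can be sketched in a few lines. The display recasts a correlation r as the difference between two rates centered on 50%, so that the rates are .50 + r/2 and .50 - r/2; the function name `besd` is a hypothetical label, and the interpretation of the difference as a gap of r x 100 percentage points is the standard BESD reading rather than an additional result.

```python
def besd(r):
    """Return the two rates of a binomial effect size display for correlation r."""
    return 0.5 + r / 2.0, 0.5 - r / 2.0

# Overall point-biserial correlation reported in this meta-analysis
caregiver_rate, noncaregiver_rate = besd(0.09)
print(f"{caregiver_rate:.3f} vs. {noncaregiver_rate:.3f}")
print(f"difference = {caregiver_rate - noncaregiver_rate:.2f}")
```

Under this display, r = .09 corresponds to rates of about 54.5% versus 45.5%, a difference of 9 percentage points.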
Health Indicator Categories as Moderators of Caregiving
With Health Outcomes
The fact that the point-biserial correlations for antibodies were
higher than those for other functional immune measures is consistent
with previous meta-analyses of immune measures with depression and
stress (Herbert & Cohen, 1993a, 1993b). Should we conclude,
therefore, that caregiving does not influence other
immunologic functions? To answer this question one must con-
sider how these variables were measured and the designs of the
studies in which they were used. First, functional immunity was
primarily assessed using unchallenged assessments, yet caregivers
have only been shown to have poorer functional responses than
noncaregivers when NKA is stimulated (Esterling et al., 1994,
1996). Second, research on NKA in caregivers has ignored indi-
vidual differences such as comorbidities, and comorbidities such
as cancer history may moderate relationships of caregiving with
NKA. For example, a history of cancer may predispose one to
immunologic risk (Fawzy et al., 1993), and caregivers with cancer
histories have lower NKA than do noncaregivers with cancer
histories. In contrast, NKA does not differ in caregivers and
noncaregivers free of cancer histories (Vitaliano, Scanlan, Ochs, et
al., 1998).
The disregard for comorbidities may also have contributed to
the small point-biserial correlations for cardiovascular and meta-
bolic measures. In one study (Vitaliano, Russo, et al., 1993) no
effect was observed for normotensive caregivers and normotensive
noncaregivers in systolic blood pressure reactivity, but hyperten-
sive caregivers showed greater reactivity than did hypertensive
noncaregivers. Also, caregivers with CHD had higher levels on the
metabolic syndrome (a linear combination of fasting glucose, insulin,
lipids, mean arterial pressure, and obesity) than did noncaregivers
with CHD; however, no difference occurred for caregivers and
noncaregivers free of such disease (Vitaliano, Scanlan,
Siegler, et al., 1998). For these reasons, relationships of caregiving
with reactivity and the metabolic syndrome may be moderated
respectively by hypertension and CHD. Unfortunately, in this
meta-analysis we could not examine comorbidities because they
were not reported or they were summarily excluded. Such exclu-
sions may have limited researchers to those caregivers who were
least likely to show dysregulation from chronic stress. This is
ironic because spouse caregivers are typically aged 65 and over,
and many are already ill when they enter research studies (among
older adults, 40% have hypertension, 25% have heart disease, and
18% have diabetes; Centers for Disease Control and Prevention,
1998). These caregivers have not been adequately represented in
In summary, if caregivers are more likely to have major
illnesses and to also have more health complications from the
combination of comorbidities and caregiving, past selection crite-
ria would have been biased against observing relationships with
In this analysis the point-biserial correlation for global health
was greater than that for the other reported health categories. We
expected this would occur because, although illness reports are
related to neuroticism and anxiety (Costa & McCrae, 1980;
Watson & Clark, 1984), they are less associated with distress than
is global health (Hooker & Siegler, 1992; Zhang, Vitaliano, Scan-
lan, & Savage, 2001). Also, global health is predictive of mortality
(Idler, Kasl, & Lemke, 1990). When one considers that global
health is highly related to strain and distress, this is consistent with
the 63% higher death rate observed among strained caregivers than
among noncaregivers in a 4-year follow-up study (Schulz &
Beach, 1999).
The antibody results may be surprising because antibodies are variable
in the face of antigens and lifetime viral exposure, whereas other
immune parameters are relatively constant, except when one is fighting
illnesses. To reduce variability, however, researchers typically
examine antigens to which the majority of older adults have had
exposure (e.g., EBV). Moreover, antibody responses are also larger than
is typically seen in other immune measures. Hence, one can see a
four-fold increase in antibodies in response to vaccination, but a 50%
change in NKA in response to a stimulus or a 20% change in CD4 counts
(in persons who are not HIV positive) would be unusual.
Many researchers also excluded persons on medications, disallowing
their interactions with stress and compromising external validity,
given the large number of older adults on medications.
Demographic Variables as Moderators of Relationships of
Caregiving With Health Indicators
Sex. The sex results depended on the measures used. The
point-biserial correlations for male versus female caregivers for
global self-reported health (-.10) and HCM measures (.07) were
significantly different because they were in opposite directions.
Female caregivers reported more health problems, but they did not
exhibit higher HCM risk. This pattern also occurred when differences
in caregivers and noncaregivers were stratified on sex. For women,
caregiving was not only more related to global self-reported health
(.25) than it was to HCM measures (-.01), but the latter
point-biserial correlation was minute. In contrast, for men, the HCM
point-biserial correlation (.11) was neither negligible nor
significantly lower than that for global self-reported health (.15).
How should we interpret these results? Do the self-report results
allow us to say that women are more vulnerable to caregiving than
are men? There may be several problems with this interpretation.
First, women report more health problems than men in many
situations (Bosworth et al., 1999; Rahman et al., 1994; Ross &
Bird, 1994). This may occur because women are more aware of
their problems (Barsky, Peekna, & Borus, 2001) and are more
likely to report them when they exist (King, Taylor, Albright, &
Haskell, 1990). Therefore, the current findings may not be unique
to caregiving. Second, global health is related to distress (Hooker
& Siegler, 1992), and distress is higher in female caregivers than
in male caregivers (Lutzky & Knight, 1994). As such, sex differ-
ences in distress may influence caregiver reports. Third, selection
bias may have played a greater role in the results for male care-
givers than for female caregivers. W. Stroebe and Stroebe (1987)
have shown that differences in depression between widows and
widowers who agreed to participate in face-to-face interviews
were minimal; however, widowers who only agreed to do postal
questionnaires reported significantly greater depression than did
widows who only did postal questionnaires. Hence, selection bias
in face-to-face research may work against finding sex differences.
A final issue involves the discrepancy between the magnitude
and significance of the point-biserial correlations for caregivers
and noncaregivers when stratified on sex. Although the HCM
point-biserial correlation for female caregivers versus female
noncaregivers was neither significant nor of meaningful magnitude
(-.01), this was not true for male caregivers versus male
noncaregivers. The latter point-biserial correlation (.11) was as large
in magnitude as the point-biserial correlation for male caregivers
versus female caregivers on global self-reported health (-.10,
p < .05), but it was not
significant because of the sample size. In fact, almost all physio-
logical point-biserial correlations were minute, and the HCM
point-biserial correlation for men was the only point-biserial cor-
relation greater than .10 (see Table 6). Indeed, the sample sizes and
number of studies that examined male caregivers versus male
noncaregivers were among the smallest of all comparisons in this
meta-analysis. The fact that the HCM point-biserial correlation for
men is the largest physiological point-biserial correlation is con-
sistent with research which has shown that men have greater stress
responses to similar physiologic measures (Earle et al., 1999;
Kirschbaum et al., 1999). Men also have greater negative re-
sponses to bereavement. Widowers have higher rates of physical
disabilities (Goldman et al., 1995), diseases of the circulatory
system (Joung, Glerum, vanPoppel, Kardaun, & Mackenbach,
1996), and mortality (Goldman et al., 1995) than do widows.
Taylor et al. (2000) have argued that women's biopsychosocial
adaptation to stress, in contrast to that of men, is to seek out
relaxation and affiliation. "This may help to explain the
seven-and-a-half non-specific years that women live longer than men . . .
the tend-and-befriend pattern proposed here may reduce women's
vulnerability to a broad array of stress-related disorders" (Taylor et
al., 2000, p. 423). Tend-and-befriend stress reactions include
nurturance, to protect the self and offspring and reduce distress, and
the creation of social networks that facilitate nurturance.
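The dependence of significance on sample size noted above can be illustrated with a minimal sketch. It assumes the usual t test of a correlation against zero, t = r * sqrt((n - 2) / (1 - r^2)), and uses 1.96 as a large-sample approximation to the two-tailed .05 critical value. The function names and the sample sizes in the usage lines are hypothetical and are not taken from the studies in this meta-analysis.

```python
import math


def t_statistic(r: float, n: int) -> float:
    """t statistic for testing a (point-biserial) correlation r against zero."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n < 3:
        raise ValueError("n must be at least 3")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def min_n_for_significance(r: float, critical: float = 1.96) -> int:
    """Smallest n at which |t| exceeds the critical value.

    Assumption: 1.96 approximates the two-tailed .05 critical t for large
    samples; the exact t critical value would give a slightly larger n.
    """
    if r == 0.0:
        raise ValueError("a zero correlation never reaches significance")
    n = 3
    while abs(t_statistic(r, n)) <= critical:
        n += 1
    return n


# Hypothetical sample sizes: the same r = .11 in a small and a large comparison.
small_sample_t = t_statistic(0.11, 100)
large_sample_t = t_statistic(0.11, 1000)
```

Under these assumptions, a correlation of .11 falls short of the critical value in the smaller hypothetical sample and exceeds it in the larger one, which is the pattern described for the male caregiver comparison.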
Caregiver age and relationship to the care recipient. In addition
to sex, we examined the caregiver's age and relationship to the
care recipient as potential moderators. Relationships of caregiving
with self-reported health were greater for older participants. Al-
though this supported the added vulnerability of older caregivers
versus older noncaregivers, relative to younger caregivers versus
younger noncaregivers, this was not observed for physiological
measures. One methodological reason for this is that the range of
age means was more restricted in samples with physiological
measures than in samples with self-report measures (data not
shown). A more substantive reason for this result may be that as
age increases, increases occur in physical illnesses and disabilities
(and in their variability; Rowe & Kahn, 1998), and these may be
exacerbated by psychosocial distress, a strong correlate of self-reported
health. The fact that the caregiver's relationship with the
care recipient was not shown to be a moderator is difficult to
interpret because of differences in the living arrangements of
spouse versus child caregivers. In 9 of 11 samples, all caregivers
were spouses, and 90% or more of the care recipients were living
at home. In contrast, of the eight studies that mixed spouse and
child caregivers, the percentage of care recipients at home varied
from 0 to 70. The overlap in the caregiver's relationship with the
care recipient and his or her place of residence may have affected
the above result.
Meta-Analytic Issues and Limitations
As in all meta-analyses, we had to make a number of decisions
about the criteria for choosing reports and how to combine reports
once obtained. Clearly, these issues have effects on the heteroge-
neity of both the samples obtained and the aggregation of point-
biserial correlations into subcategories. These decisions also affect
the publication bias and external validity of the results.
Sources of data censoring and data interpretation. Meta-
analyses of only published studies may yield higher point-biserial
correlations and be positively biased in favor of significant results.
As such, we supplemented the electronic search of published
articles with hand searches and electronic searches of dissertations.
The observed point-biserial correlation for dissertations (.15) ex-
ceeded the point-biserial correlation for articles (.10), which
helped to counter the problem of publication bias. Also, the trim
and fill procedure was used to examine the robustness of findings
relative to all possible sources of data censoring. These include
study quality and different outcome measures. Across all 14 group-
ings, only 3 categories were trimmed and filled, and 1 of these
changed only slightly and remained significant. In the chronic
illnesses and the physical symptoms categories, the procedure
trimmed and filled four studies, and they each dropped dramati-
cally in value and became nonsignificant. For this reason, the
original results for chronic illnesses and physical symptoms must
be interpreted with caution (Sutton, Duval, et al., 2000). In con-
trast, the findings in 12 of the 14 categories (see Table 5) withstood
the exacting trim and fill procedure, and therefore the results of
this meta-analysis have enhanced credibility (Sutton, Song, et al.,
2000). Moreover, the median intercorrelations of the point-biserial
correlations that assessed the interpretation of the study-level
effect sizes suggest that there are consistencies across the different
health measures.
These data appear to be inconsistent with a review by Kiecolt-Glaser and
Newton (2001), who concluded that "gender is an important moderator of
the pathway from negative marital conflict behaviors to physiological
functioning: This pathway is stronger for women than for men, and
women's physiological changes following marital conflict show greater
persistence than men's" (pp. 494–495). However, in the current article,
most spouses were 60–70 years old and were probably married longer, and the
reasons for their conflicts may have been different. Most AD caregivers do
not blame their care recipients for their AD (Vitaliano, Young, Russo,
Romano, & Magana-Amato, 1993). Hence, caregiver distress may be very
different from conflicting marital interactions.
Inclusion criteria. To reduce heterogeneity and address the
problem of combining studies with different samples, outcomes,
and designs, the studies in this meta-analysis had to include (a)
primarily caregivers of care recipients with dementia, (b) a non-
caregiver comparison group, and (c) physical health measures.
Although these criteria had advantages, we also recognize their
disadvantages. Care recipients with AD, vascular dementias, and
Parkinson's disease cluster together relative to other diseases, and
these illnesses produce similar demand characteristics for caregiv-
ers (Vitaliano, Young, & Russo, 1991). However, the aggregation
of care recipients across dementias may also result in heterogene-
ity because caregivers of patients with AD and vascular dementias
experience different levels of burden at different times (Vetter et
al., 1999). Such distinctions were not possible because in cases
with multiple illnesses, the exact numbers were not specified or the
samples were too small.
The decision to only use reports that
included noncaregiver comparisons also had a disadvantage. Al-
though it is accepted practice to limit conclusions about effect
sizes to studies with comparison groups, this criterion restricted
generalizability to such studies. Of all the reports that examined
health problems in caregivers of care recipients with dementia,
only 45 emerged when they were crossed with the terms control,
comparison, or noncaregivers. Moreover, caregivers and noncaregivers
were not demographically balanced on all potential confounds.
Some studies (e.g., Kiecolt-Glaser et al., 1991) had higher
levels of divorce and bereavement in noncaregivers than in care-
givers, but these differences may have made their results conser-
vative because such characteristics are related to poorer health
(House, Landis, & Umberson, 1988). However, because these
samples were not stratified, we could not compute point-biserial
correlations separately for each marital status group.
Categorical groupings. Meta-analyses harness the power of
multiple studies to generalize conclusions beyond those of one
study. In doing so, meta-analyses may encounter problems from
grouping studies together. Here the measures grouped in some
domains, such as metabolic variables, were correlated and part of
a syndrome. In others, such as the immunologic ones, categorical
assortment was more difficult. Antibodies to EBV and HSV were
placed in the same category even though they can arise from
humoral immunity, cellular immunity, or both. This categorical
approach was also used by Herbert and Cohen (1993a, 1993b).
Moreover, because of their relationships with each other, IgG
responses to specific vaccinations and IgG serum levels were
combined. Had we not done this, the number of studies per
category would have been greatly limited.
Additional Potential Limitations
Interpretative problems occur in meta-analyses in response to
measurement error and inadequate study designs. Here we con-
sider three such problems that are relevant to the current meta-
analysis. These include the validity of the health measures used
across studies, selection biases, and problems of confounding and
reverse causality.
Validity of measures. This article assumes that stressors neg-
atively affect physiological risk, which in turn increases illness
risks. Although there are many instances in which such pathways
are well known and accepted (Chrousos & Gold, 1992), there are
others in which the connections have less empirical support.
Clearly, stressors increase stress hormones, which increase glucose
and blood pressure. Cholesterol and hypertension increase CHD
risk, and high glucose and obesity increase risk for Type II
diabetes. Also, wound healing and antibodies to vaccinations and
to EBV/HSV each have strong links with health outcomes. In
contrast, associations of illnesses with resting plasma IgG
levels and lymphocyte proliferation may not be as strong in per-
sons who are not already immunocompromised. Despite such
variations, most measures in this review represent attempts by
caregiver researchers to use physiological measures that have
predictive and concurrent validity (see Table 1). Indeed, in the past
15 years the immunologic measures used have been shown to be
more responsive to stressors and clinically relevant. If we had
focused only on the small number of immunologic measures
currently thought to be important, this review would have been
limited and biased. Yet even with these caveats, the current results
for antibodies are still consistent with the stress, depression, and
immunity meta-analyses of Herbert and Cohen (1993a, 1993b).
Selection bias. All reports in this meta-analysis were based on
observations and not experiments, and this may have obscured
results. Premorbid differences between caregivers and noncaregiv-
ers could have caused the observed point-biserial correlations to be
different from those that would have been obtained from random-
ized studies. Moreover, none of these studies examined individuals
before they became caregivers. As such, prior to caregiving, care-
givers may have had greater distress and poorer health habits than
noncaregivers. Without random assignment of caregivers and non-
caregivers, characteristics that occur prior to caregiving and that
are known to covary within couples may influence differences in
the observed health of caregivers versus noncaregivers. Such correlations
can occur because of assortative mating (the tendency
for individuals to marry persons similar to themselves; Buss,
1984) and mutual influences on each other's behavior (Vogel &
Motulsky, 1986). High within-couple correlations have been found
for diet, alcohol consumption, caffeine, tobacco, and medications
(Davis, Murphy, Neuhaus, Gee, & Quiroga, 2000; Demers, Bisson,
& Palluy, 1999), and Buss (1983) has observed within-couple
correlations of .30 for weight, .43 for smoking, and .43 for drink-
ing. These results are important because health habits (Skoog,
1998) and distress (Gale, Braidwood, Winter, & Martyn, 1999;
Leonard, 2001) may contribute to the development of dementia
and related cognitive disorders. If this is the case, then the same
lifestyle that influenced the development of dementia in care
recipients may also have influenced the development of other
illnesses in caregivers, with genetic predispositions affecting how
these shared risk factors manifested themselves differently.
Confounding and reverse causation. To control for differences
in caregivers and noncaregivers from unknown confounders,
the noncaregivers in this meta-analysis were group matched to
caregivers on sex and age. However, it is impossible to match on
all variables, and biases can occur if important variables are
ignored. Income is one such variable because it is negatively
related to health (Kaplan, 1992). In some studies, caregivers and
noncaregivers and male caregivers and male noncaregivers did not
differ in income (Vitaliano et al., 2002), but this may not have been
true in all studies. Persons who become caregivers may have lower
incomes than persons who are eligible for caregiving but opt out of it.
Men may take on caregiving only when they do not have financial
resources or a daughter or sister to help them (Kramer, 1997). It is
important that income be controlled when comparing the health of
male caregivers with male noncaregivers.
Draper, Poulos, Cole, Poulos, and Ehrlich (1992) and Reese, Gross,
Smalley, and Messer (1994) found no difference in distress between AD
and stroke caregivers. Hooker et al. (1998) found that AD caregivers had
worse mental health than Parkinson's caregivers, but AD caregivers had
better physical health than Parkinson's caregivers, after mental health and
personality were controlled.
In addition to their possible confounding nature, the studies in
this meta-analysis are also limited by their cross-sectional design.
That is, they do not allow one to determine whether illnesses
preceded caregiving or vice versa. To address this concern, some
caregiver studies have used prospective designs, in which caregiv-
ers and noncaregivers were examined only if they were free of a
predicted outcome. Shaw et al. (1997) followed spousal caregivers
of AD patients and spouses of noncaregivers over 1–6 years,
depending on the care recipient's life course. Caregivers who
provided the most assistance had a greater hazard of reaching at
least one objective negative health event relative to other caregiv-
ers and noncaregivers. This suggests that caregiver stressors may
precede their health problems. Schulz and Beach (1999) followed
persons who were caregivers of various types of care recipients.
They observed that over an average of 4 years, strained caregivers
had a 63% higher death rate than noncaregivers. Finally, Vitaliano
et al. (2002) observed that men caring for a spouse with AD had
a greater prevalence of heart disease than demographically similar
noncaregiver men 27–30 months after study entry. Also, the incident
cases of heart disease showed a higher trend in male caregivers
than in male noncaregivers. No differences occurred for
women. In terms of mechanisms, a latent variable of caregiver
status and care recipient deficits in cognitive and functional status
explained variance in distress. This, in turn, explained variance in
poor health habits, which predicted elevated cardiovascular and
metabolic risk 15–18 months later. Such dysregulation predicted
new cases of heart disease over 27–30 months. As in Figure 1,
health habits are part of a major pathway from stressors to health
problems (Fuller-Jonap & Haley, 1995). Indeed, Gallant and Con-
nell (1997) observed that caregiver burden was associated with
poorer health habits, such as sedentary behavior, alcohol consump-
tion, and smoking. They also concluded that health habits repre-
sent one mechanism by which caregiver stressors influence ad-
verse health. Health habits, however, have not received enough
emphasis in caregiver research to be examined meta-analytically.
Advances and Recommendations
Despite the above concerns, this article has a number of advan-
tages. Caregiving allows one to examine a naturalistic chronic
stressor that is unambiguously defined by the caregiver's self-identification
and the care recipient's cognitive, functional, and
affective disabilities. Yet there have been relatively few qualitative
reviews of caregiver physical health, and this is the first quantita-
tive review. Although qualitative reviews can be of value, meta-
analytic reviews have fixed rules for aggregating point-biserial
correlations and providing quantitative summaries of the typical
strength of a relationship, its variability, and its significance.
Moreover, because point-biserial correlations are continuous, they
are more accurate than vote counts of significant findings in
qualitative reviews (Lipsey & Wilson, 2001). For this reason, the
results of this meta-analysis should be useful. However, in addition
to quantifying the risk of caregiver health problems relative to
noncaregivers, these results also suggest directions for future
research: namely, the use of a theoretical model of stress and
illness, more informative designs, and additional health measures.
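As an illustration of what "fixed rules for aggregating" correlations can look like, the following minimal sketch pools study-level correlations with a common fixed-effect approach: each r is converted to Fisher's z, weighted by n - 3, averaged, and transformed back. This is an assumption chosen for illustration and is not necessarily the procedure used in this meta-analysis; the function name and the (r, n) pairs are hypothetical.

```python
import math


def pool_correlations(studies):
    """Fixed-effect pooling of (r, n) pairs on the Fisher z scale.

    Returns the pooled r (typical strength), a 95% confidence interval,
    a z test of the pooled effect (significance), and Cochran's Q
    (variability across studies).
    """
    zs = [math.atanh(r) for r, _ in studies]
    ws = [n - 3 for _, n in studies]  # inverse of the variance of Fisher's z
    total_w = sum(ws)
    z_bar = sum(w * z for w, z in zip(ws, zs)) / total_w
    se = 1.0 / math.sqrt(total_w)
    q = sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))
    return {
        "r": math.tanh(z_bar),
        "ci95": (math.tanh(z_bar - 1.96 * se), math.tanh(z_bar + 1.96 * se)),
        "z_test": z_bar / se,
        "q": q,
    }


# Hypothetical (r, n) pairs, not values from this meta-analysis.
example = pool_correlations([(0.05, 120), (0.15, 80), (0.10, 200)])
```

Because the pooled estimate uses each study's magnitude and sample size, it retains information that a count of significant versus nonsignificant findings discards.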
Stress–illness models. Understanding health responses to caregiving
can be improved by a greater use of models that consider
individual differences (Lazarus & Folkman, 1984). Earlier we
presented a model in which illness is not just a function of
caregiving, but also of vulnerabilities, resources, and their inter-
actions (Vitaliano et al., 1987). Unfortunately, most caregiver
research on physical health has not used guiding stress models.
Indeed, we were not able to examine most parts of the proposed
model in this meta-analysis because very few studies had included
the variables necessary for such analyses. For example, caregiver
health research has tended to ignore individual differences that
would allow one to examine interactions with caregiving. Vari-
ables such as psychiatric history, personality, comorbidities, social
supports, and income were not reported in sufficient numbers to
allow analyses. Potential cognitive mediators relevant to stress–illness
relationships were also not reported in sufficient numbers.
These include appraised control, self-concept, expectations, and
the caregiver's cognitions about his or her illnesses and those of
his or her care recipient. The self-regulation model would be useful
here because it views caregivers as active agents in their adherence
to treatment regimens for themselves and their care recipients
(Leventhal, Nerenz, & Strauss, 1982). To change risky health
behaviors and manage their illnesses and those of their care recip-
ients, caregivers must possess accurate information about these
illnesses. Caregiver illnesses may also be affected by the self-
efficacy (Bandura, 1993) of the caregiver. In fact, self-efficacy
may influence the role of sex as a moderator of caregiver illnesses.
As noted by M. S. Stroebe (1998), widows have higher levels of
perceived efficacy than do widowers in interpersonal activities and
social support. A woman's confidence in her ability to manage
interpersonal activities and her generally larger social networks
may help her to adjust more easily to bereavement than a man.
Finally, some researchers have emphasized the positive aspects
of caregiving (C. A. Cohen, Gold, Schulman, & Zucchero, 1994)
and have reported more positive changes in self-concept in re-
sponse to caregiving than negative changes (Aneshensel, Pearlin,
Mullan, Zarit, & Whitlatch, 1985). As such, positive appraisals
need to be related to illness (Walker, Acock, Bowman, & Li,
1996). In the HIV/AIDS literature, the use of social coping among
caregivers has been shown to be associated with greater positive
affect and lower levels of physical symptoms (Billings, Folkman,
Acree, & Moskowitz, 2000). Moreover, Folkman, Chesney, Collette,
Boccellari, and Cooke (1996) have shown that if meaning is
derived from the caregiver experience, caregivers can maintain
positive morale. One would expect that this would also influence
physical health outcomes.
Design. Although the three prospective studies discussed
above allowed researchers to infer that caregiving preceded the
outcomes studied, they were only prospective for illness or mor-
tality and not for caregiving. Hence, no information about expe-
riences prior to caregiving was available. Only studies that are
doubly prospective, in which a cohort is examined before caregiv-
ing and before a target illness develops, allow one to examine
covariation in caregiver exposure and illness relative to psycho-
social, behavioral, and physiological changes. One expects that by
studying individuals before caregiving takes place, one can assess
whether self-selection occurs and whether postcaregiving illnesses
are influenced by precaregiving as opposed to postcaregiving
experiences. Doubly prospective studies can be done by combining
caregiver research with ongoing population studies. However,
illnesses take time to be detected, and extensive follow-up is
necessary. Indeed, even when clinical illness is not evident, more
innovative designs can be used to contrast caregivers and noncare-
givers under acute and chronic stressors. This is important because
relationships between acute stressors and psychological distress
are heightened in persons with chronic stress (Norris & Uhl, 1993).
Also, the hypothalamic–pituitary–adrenal axis becomes sensitized
by chronic stress to yield an amplified response to acute stressors
(Hauger, Lorang, Irwin, & Aguilera, 1990). For these reasons,
researchers should identify caregivers exposed to stressors that
exist in addition to caregiving. Comorbidities may increase care-
giver distress and interact with caregiving to exacerbate physio-
logical dysregulation (Vitaliano, Russo, Bailey, Young, & Mc-
Cann, 1993; Vitaliano, Scanlan, Ochs, et al., 1998; Vitaliano,
Scanlan, Siegler, et al., 1998). Such research allows one to study
whether caregivers with comorbid illnesses have more disease
progression than noncaregivers with such illnesses. To be of con-
cern to society, caregiving does not have to cause illnesses; it only
has to contribute to illness progression.
Measurement. Physiological assessments have been primarily
laboratory measures. Future research can increase ecological va-
lidity by assessing such measures in situ over longer time periods.
Home ambulatory blood pressures in caregivers may be greater
than clinic and work blood pressures (King, Oka, & Young, 1994),
especially in the presence of their care recipients (King & Brass-
ington, 1997). Assessments that include medical records, physical
exams, and death certificates are also important. Medical records
may provide lab results, date, and nature of diagnosis (or Interna-
tional Classification of Diseases, 9th edition, codes; Puckett,
1993), treatment, medications, and prognosis. Quality control
checklists document treatment regimen and symptoms to support
diagnostic codes and assess the internal consistency of records
(Hanken, 1989). Physical exams can uncover problems undetected
in medical records (Bates, 1995). In terms of illnesses, those with
dramatic sudden onsets or a need for medical attention may be
more reliably studied than illnesses without such features (Kasl,
1983). Also, measures should be matched to the temporal courses
of the stressors and the illnesses being studied (cf. S. Cohen,
Kaplan, & Matthews, 1994). Heart disease can take years to
develop, so one must specify pathways through which caregiving
is expected to influence its development. Measures that have
already been used in caregiver research can also be put to better
use. Instead of just counting illnesses, researchers should assess
which illnesses are most related to caregiving, as well as the
frequencies, dosages, and health implications of medications. One
can assess the latter by asking participants to bring their medications
to one's lab and having a pharmacologist code them. Results
for caregiver health service utilization are also difficult to interpret
because one must ensure that caregivers are evaluated separately
from care recipients and that one allows for the fact that caregivers
may be home bound. A final issue involves the simultaneous use
of self-report and physiological assessments.
Assessments that rely solely on self-reports will lead to a targeting of
resources away from some caregivers (e.g., men), who may be un-
willing or unable to express their distress. Assessments that incorpo-
rate cardiovascular reactivity should be better able to identify care-
givers who are truly distressed and in need of help. (Lutzky & Knight,
1994, p. 518)
Conclusions
We searched published and unpublished reports over a 38-year
period and found 23 studies that compared physical health indica-
tors in family caregivers of persons with dementia to health indi-
cators in noncaregivers who were generally matched on age and
sex. Caregivers had a 23% higher level of stress hormones and a
15% lower level of antibody responses than did noncaregivers.
Whereas these observational data do not allow us to infer definitively
that caregiving is hazardous to one's health, such potential
added risks are noteworthy because they may have clinical implications
for millions of caregivers. Although researchers would
like to determine whether caregiving causes illnesses,
the fact that caregiving may influence illness is still important,
regardless of the causes. As the world's population ages, caregivers
will play an even greater role in society. Science may develop a
cure for dementia, but other care recipients will need caregivers.
In this regard, the current work should be extended to caregivers of
care recipients with other chronic illnesses. Moreover, doubly
prospective studies of caregiving should be performed to clarify
the causes of such added risk, and subgroups of caregivers, such as
those with comorbidities, should be examined. General research on
chronic stress and illness suggests that persons with comorbidities
may be at higher risk for health problems in response to stressors
than persons exposed to chronic stressors who are free of comor-
bidities. The population of older caregivers with comorbidities is
large, yet this group essentially has been ignored in this literature,
as have other subgroups defined by individual differences. One
hopes that the combination of designs recommended here, which
incorporates assessments of individual differences and other health
indicators, will target high-risk caregivers and be used to develop
cost-effective treatments for those who can profit most from in-
terventions. By helping caregivers to maintain their health, such
interventions should also help care recipients and society.
References
Adler, N. E., & Ostrove, J. M. (1999). Socioeconomic status and health:
What we know and what we don't. In N. E. Adler, M. Marmot, et al.
(Eds.), Annals of the New York Academy of Sciences: Vol. 896. Socioeconomic
status and health in industrial nations: Social, psychological,
and biological pathways (pp. 3–15). New York: New York Academy of
Sciences.
Almberg, B., Jansson, W., Grafstrom, M., & Winblad, B. (1998). Differences
between and within sexes in caregiving strain: A comparison of
caregivers of demented and non-caregivers of non-demented elderly
people. Journal of Advanced Nursing, 28, 849–858.
American Association of Retired Persons. (1988). National survey of
caregivers: Summary of findings. Washington, DC: Author.
American Psychiatric Association. (1994). Diagnostic and statistical man-
ual of mental disorders (4th ed.). Washington, DC: Author.
Aneshensel, C. S., Pearlin, L. I., Mullan, J. T., Zarit, S. H., & Whitlatch,
C. J. (1985). Profiles in caregiving: The unexpected career. San Diego,
CA: Academic Press.
Arntzenius, A. C., Kromhout, D., Barth, J. D., Reiber, J. H. C., Bruschke,
A. V. G., Buis, B., et al. (1985). Diet, lipoproteins, and the progression
of coronary atherosclerosis. New England Journal of Medicine, 312,
Atkinson, C. L. (1995). Defensive coping, stress, and immunity. Disser-
tation Abstracts International, 56, 2313B. (UMI No. 9525987)
Bandura, A. (1993). Perceived self-efficacy in cognitive development and
functioning. Educational Psychologist, 28, 117–148.
Barsky, A. J.,