Schumm, L. P., McClintock, M., Williams, S., Leitsch, S., Lundstrom, J., Hummel, T., & Lindau, S. T. (2009). Assessment of sensory function in the National Social Life, Health, and Aging Project. Journal of Gerontology: Social Sciences, 64B(S1), i76–i85, doi:10.1093/geronb/gbp048.
© The Author 2009. Published by Oxford University Press on behalf of The Gerontological Society of America.
All rights reserved. For permissions, please e-mail: email@example.com.
As part of its attempt to obtain a comprehensive assessment of respondent health, the National Social Life, Health, and Aging Project (NSHAP) included measurements of each of the five senses. By taking advantage of several recent advances in the in-home collection of biomeasures (Lindau & McDade, 2007), NSHAP was able to obtain biomeasures of visual, tactile (touch), gustatory (taste), and olfactory (smell) function (auditory function was assessed via self-report only). The purposes of this article were to describe the methods by which these measurements were obtained, to report on the quality of the resulting data, and to illustrate how these data may be used analytically.
Sensory function is an important aspect of health, especially as people age. Declines in sensory function may be symptomatic of underlying disease and can affect personal safety (Anstey, Wood, Lord, & Walker, 2005), quality of life, and perceived health (Ostbye et al., 2006). In addition, a decline in sensory function may limit participation in intimate relationships and other types of social activities, which may in turn have additional negative consequences for health. Exploring this type of dynamic process in which health and social interaction are intertwined was one of NSHAP’s primary objectives (Lindau, Laumann, Levinson, & Waite, 2003).
Several population and clinical studies have documented age-related decline in sensory function. For example, studies in the United States (Murphy et al., 2002), Germany (Hummel, Kobal, Gudziol, & Mackay-Sim, 2007; Landis, Konnerth, & Hummel, 2004), and Sweden (Bramerson, Johansson, Ek, Nordin, & Bende, 2004) have found the likelihood of olfactory dysfunction to increase substantially after age 55, affecting a larger proportion of the population than previously thought (Landis & Hummel, 2006). Data for gustation are limited to much smaller clinical studies but also suggest a decline in gustatory function with age independent of the decline in olfaction (Fukunaga, Uematsu, & Sugimoto, 2005; Seiberling & Conley, 2004). National data on vision and hearing are available from the National Health and Nutrition Examination Survey and indicate that the presence of both visual and hearing impairments increases substantially with age (Li, Healy, Wanzer Drane, & Zhang, 2006; Vitale, Cotch, & Sperduto, 2006). Finally, a few small studies have found a decrease in hand sensibility with age (Desrosiers, Hebert, Bravo, & Dutil, 1996; Ranganathan, Siemionow, Sahgal, & Yue, 2001; Wickremaratchi & Llewelyn, 2006). Our objective was to confirm these results with the NSHAP data while investigating the psychometric properties of each measurement module.
Assessment of Sensory Function in the National Social
Life, Health, and Aging Project
L. Philip Schumm,1 Martha McClintock,2 Sharon Williams,3 Sara Leitsch,4 Johan Lundstrom,5 Thomas Hummel,6 and Stacy Tessler Lindau7
1 Department of Health Studies and 2 Institute for Mind and Biology and Department of Psychology, University of Chicago, Illinois.
3 Department of Anthropology, Purdue University, West Lafayette, Indiana.
4 National Opinion Research Center, Chicago, Illinois.
5 Monell Chemical Senses Center and Department of Psychology, University of Pennsylvania, Philadelphia.
6 Smell & Taste Clinic, Department of Otorhinolaryngology, University of Dresden Medical School, Dresden, Germany.
7 Department of Obstetrics and Gynecology, Department of Medicine and the MacLean Center for Clinical Medical Ethics, University of Chicago Pritzker School of Medicine, Illinois.
Objectives. The National Social Life, Health, and Aging Project assessed functioning of all 5 senses using both self-report and objective measures. We evaluate the performance of the objective measures and model differences in sensory function by gender and age. In the process, we demonstrate how to use and interpret these measures.
Methods. Distance vision was assessed using a standard Sloan eye chart, and touch was measured using a stationary 2-point discrimination test applied to the index fingertip of the dominant hand. Olfactory function (both intensity detection and odor identification) was assessed using odorants administered via felt-tip pens. Gustatory function was measured via identification of four taste strips.
Results. The performance of the objective measures was similar to that reported for previous studies, as was the relationship between sensory function and both gender and age.
Discussion. Sensory function is important in studies of aging and health both because it is an important health outcome and also because a decline in functioning can be symptomatic of or predict other health conditions. Although the objective measures provide considerably more precision than the self-report items, the latter can be valuable for imputation of missing data and for understanding differences in how older adults perceive their own sensory ability.
ASSESSMENT OF SENSORY FUNCTION IN NSHAP
Analytic Approach
The protocols for measuring olfactory, gustatory, and tactile functioning each involved administering several different stimuli in blinded fashion and asking respondents to identify each one from a list of possible alternatives. It is assumed that the ability to identify the stimulus correctly reflects the respondent’s underlying level of function in that particular sensory domain. A standard way to analyze such data is with an item-response model. Assuming $y_{ij}$ is the response for respondent $i$ to the $j$th item, the simplest such model is
$$g\{E(y_{ij})\} = \eta_{ij} = \alpha_i + \theta_j, \qquad (1)$$
where $E(y_{ij})$ is the mean of $y_{ij}$ conditional on $\eta_{ij}$, $g(\cdot)$ is known as the link function, and $\alpha_i$ and $\theta_j$ are sets of respondent-specific and item-specific parameters, respectively. When $y_{ij}$ takes the values 0 or 1, a common choice for $g(\cdot)$ is the logit function (Cox & Snell, 1989):
$$g(\mu) = \log\{\mu / (1 - \mu)\}.$$
Model 1 is the well-known Rasch model (Rasch, 1960) developed for evaluating and scoring educational tests composed of binary items and may be fit to NSHAP’s sensory data by coding each identification item as either correct (1) or incorrect (0). The parameter $\theta_j$ may then be thought of as the underlying difficulty of the $j$th identification task and the parameter $\alpha_i$ as the underlying ability of the $i$th respondent to perform such tasks.
Model 1 has two important limitations (Skrondal & Rabe-Hesketh, 2004, pp. 292–298). First, it assumes that differences in ability affect performance on all items equally. This may be true of a well-constructed test battery or psychological scale but may not be true of NSHAP’s sensory function modules because they were not designed with this criterion in mind. To accommodate this, we may extend Equation 1 in the following way:
$$g\{E(y_{ij})\} = \eta_{ij} = \lambda_j \alpha_i + \theta_j, \qquad (2)$$
where the $\lambda_j$ are analogous to factor loadings in a factor analytic model (note that, with $J$ items, only $J - 1$ of the $\lambda_j$ are identifiable). This is referred to as a two-parameter item-response model (Birnbaum, 1968) because each item is now represented by two parameters ($\theta_j$ and $\lambda_j$).
A second limitation evident from Equation 1 is that for any given item, the probability of a “correct” response approaches 0 as the respondent’s sensory ability decreases. This is clearly not realistic in cases where the response involves choosing from a fixed set of possibilities, as in the case of NSHAP’s sensory identification items (despite this, Equation 1 and Equation 2 are still often used to analyze data from multiple choice items). A more realistic model is
$$E(y_{ij}) = c + (1 - c)\, g^{-1}(\lambda_j \alpha_i + \theta_j), \qquad (3)$$
where in the case of the logit link $g^{-1}(x) = e^x/(1 + e^x)$. In this model, as ability approaches $-\infty$, the probability of a correct response approaches $c$, which can therefore be interpreted as the probability of a pure guess (i.e., by someone with no ability) being correct. Although $c$ is represented here as being constant across items, this can be relaxed (Birnbaum, 1968).
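The three measurement models described above are nested, which a short Python sketch can make concrete. The function names and all numeric values below are illustrative assumptions, not NSHAP estimates; in particular, the guessing probability of 0.25 is only one plausible choice for an item with four alternatives.

```python
import math

def inv_logit(x: float) -> float:
    """Inverse logit link, g^{-1}(x) = e^x / (1 + e^x), computed stably."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    ex = math.exp(x)
    return ex / (1.0 + ex)

def p_correct(alpha: float, theta: float, lam: float = 1.0, c: float = 0.0) -> float:
    """P(correct) = c + (1 - c) * g^{-1}(lam * alpha + theta).

    lam = 1 and c = 0 give the Rasch model (Model 1); a free lam gives the
    two-parameter model; c > 0 adds a floor for pure guessing.
    """
    return c + (1.0 - c) * inv_logit(lam * alpha + theta)

# Hypothetical item parameter theta = 1.0; c = 0.25 is an assumption.
for alpha in (-10.0, 0.0, 2.0):
    rasch = p_correct(alpha, theta=1.0)
    with_guessing = p_correct(alpha, theta=1.0, c=0.25)
    print(alpha, round(rasch, 3), round(with_guessing, 3))
```

As ability decreases, the Rasch probability falls toward 0 while the version with a guessing parameter levels off at the assumed value of c.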
In cases where sensory function is hypothesized to depend on certain observed covariates, this can be accommodated within the context of these models by specifying a structural model for $\alpha_i$:
$$\alpha_i = \mathbf{x}_i' \boldsymbol{\beta} + \zeta_i, \qquad (4)$$
where $\mathbf{x}_i$ is a vector of covariates for respondent $i$ (note that if $\mathbf{x}_i$ includes a constant, one of the $\theta_j$ must be set to 0 for identification). The resulting model is referred to as the Multiple Indicator Multiple Causes model (Joreskog & Goldberger, 1975).
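To illustrate how a structural model for ability feeds into the item responses, the following Python sketch simulates one respondent's answers. It assumes a linear structural model with a Gaussian disturbance and a logit link without a guessing parameter; the function name, covariate coding, and every coefficient are hypothetical values chosen for illustration only.

```python
import math
import random

def simulate_responses(x, beta, thetas, lambdas, sd_zeta=1.0, rng=None):
    """Draw one respondent's ability and binary item responses.

    Structural part:   alpha = x . beta + zeta,  zeta ~ N(0, sd_zeta^2)
    Measurement part:  P(y_j = 1) = inv_logit(lambda_j * alpha + theta_j)
    """
    rng = rng or random.Random()
    alpha = sum(xk * bk for xk, bk in zip(x, beta)) + rng.gauss(0.0, sd_zeta)
    responses = []
    for theta_j, lambda_j in zip(thetas, lambdas):
        p = 1.0 / (1.0 + math.exp(-(lambda_j * alpha + theta_j)))
        responses.append(1 if rng.random() < p else 0)
    return alpha, responses

# Hypothetical covariates: (female, age 65-74, age 75-85) indicators.
# No constant is included in x, so every theta_j remains identified.
alpha, y = simulate_responses(
    x=(1, 0, 1),
    beta=(0.3, -0.5, -1.0),            # assumed structural coefficients
    thetas=(1.0, 0.9, 1.5, 1.2, 2.4),  # assumed item parameters
    lambdas=(1.0, 1.0, 1.0, 1.0, 1.0), # loadings fixed at 1 for simplicity
    rng=random.Random(2009),
)
print(round(alpha, 3), y)
```

Simulated data of this kind can be useful for checking that a fitting routine recovers known parameter values before it is applied to real responses.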
In addition to models in which sensory function is treated as an outcome, one may also wish to treat it as an explanatory variable in analyses of other outcomes, as, for example, in exploring the hypothesis noted above that poor sensory function may limit one’s ability to engage in satisfying intimate relationships (in reality, this relationship may be bidirectional). Such analyses may be conducted in two ways. First, one may estimate a full structural equation model (Bollen, 1989) combining one of the measurement models above with another model in which $\alpha_i$ is also used as a predictor. A second, simpler approach is to use the fitted measurement model to compute empirical Bayes predictions of the $\alpha_i$ and then simply use these estimates $\hat{\alpha}_i$ as covariates in another model. It is important to note that although the $\hat{\alpha}_i$ differ from the true $\alpha_i$, this approach can still yield consistent (though less efficient) estimates for the model of interest, though the standard errors will be biased downward because the $\hat{\alpha}_i$ are being treated as nonrandom (Whittemore, 1989). A possible way to address this would be to bootstrap the entire process (i.e., both the estimation of $\hat{\alpha}_i$ and the model of interest; Efron & Tibshirani, 1993).
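The idea of bootstrapping the entire two-stage process can be sketched in Python as follows. This is only a schematic under strong simplifying assumptions: the first stage uses sample-standardized sum scores as a stand-in for empirical Bayes predictions, the second stage is a simple least-squares slope, and respondents are resampled as if from a simple random sample (a complex survey design would need a design-appropriate resampling scheme). All function names are hypothetical.

```python
import random
import statistics

def stage1_scores(items):
    """Stand-in for empirical Bayes predictions: standardized sum scores.

    The standardization depends on the sample, so this stage must be
    recomputed within every bootstrap replicate.
    """
    totals = [sum(row) for row in items]
    mean, sd = statistics.mean(totals), statistics.pstdev(totals)
    return [(t - mean) / sd if sd > 0 else 0.0 for t in totals]

def stage2_slope(scores, outcomes):
    """Least-squares slope of the outcome on the predicted ability."""
    mx, my = statistics.mean(scores), statistics.mean(outcomes)
    sxx = sum((s - mx) ** 2 for s in scores)
    sxy = sum((s - mx) * (o - my) for s, o in zip(scores, outcomes))
    return sxy / sxx if sxx > 0 else 0.0

def bootstrap_se(items, outcomes, n_boot=500, seed=0):
    """Resample respondents, rerun BOTH stages, return the SD of the slopes."""
    rng = random.Random(seed)
    n = len(items)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        scores = stage1_scores([items[i] for i in idx])
        slopes.append(stage2_slope(scores, [outcomes[i] for i in idx]))
    return statistics.stdev(slopes)
```

The essential design choice is that the ability predictions are re-estimated inside the loop, so that their sampling variability is carried through to the standard error of the second-stage coefficient.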
All the models described above are examples of Generalized Linear Latent and Mixed Models (GLLAMMs; Skrondal & Rabe-Hesketh, 2004) and may be fit in Stata (StataCorp, 2007) using the gllamm package (Zheng & Rabe-Hesketh, 2007). For convenience, we assume that the $\alpha_i$ are distributed Gaussian with mean given by the corresponding structural model (if specified).
Objective Versus Subjective Measurement
In addition to biomeasures of sensory function, NSHAP also obtained self-reported measures of function for each of the five senses. Because self-reported measures of function can have poor reliability (e.g., Landis, Hummel, Hugentobler, Giger, & Lacroix, 2003) and/or be affected by reporting bias, such measures are typically of interest only in cases where biomeasures are unavailable. For example, because NSHAP’s visual acuity testing was performed on only half of the respondents, while the self-reported measure was obtained from all, one could use the half sample with both measures to develop and estimate a model for the self-report process and then use this model to impute visual acuity for those who were not tested. Similarly, one might use self-reports of olfactory, gustatory, and tactile function to impute missing values for the corresponding biomeasures due to nonresponse (note, however, that in these three cases, the corresponding self-reported measures were obtained from only half of the respondents).
In addition to obtaining objective measurements of sensory function as part of a holistic assessment of health, the NSHAP research team was also interested in the effects that a decline in sensory function may have on an older adult’s participation in intimate and other types of social activity. Because such effects may be due in part to self-limitation, the way in which respondents perceive their sensory function, as distinct from their actual level of function, becomes important. This leads to consideration of the distribution of self-reported function conditional on $\alpha_i$, and the way in which this distribution depends on various covariates. Such an analysis can be performed using the methods described earlier. Because vision was assessed using only a single measure of visual acuity, one can model self-reported vision using visual acuity directly as a covariate (e.g., see Globe, Wu, Azen, & Varma, 2004).
At the conclusion of each interview, NSHAP interviewers were asked to rate the respondent’s vision and hearing using a 5-point scale. The resulting data (not presented here) may be used in models for imputing missing values of visual acuity and/or self-reported vision and hearing.
Olfactory function was assessed in all respondents using tests of both odor sensitivity (i.e., the lowest concentration at which an odor can be detected) and odor identification. Odorants were administered using commercially available felt-tip pens, each filled with an individual odorant at a specific concentration. This device is inexpensive, convenient, and ideally suited to delivering odorants at a constant concentration (Hummel, Sekinger, Wolf, Pauli, & Kobal, 1997). After we developed the NSHAP protocol, another short screening method using the pens was independently developed (Mueller & Renner, 2006).
The test of odor sensitivity involved presenting a series of five pens, the first containing only the diluent propylene glycol (1,2-propanediol) followed by steadily increasing concentrations of the odorant n-butanol (0.13%, 0.50%, 2.00%, and 8.00%). Each pen was held by the interviewer (who wore a cotton glove to eliminate residual odors) approximately half an inch from the respondent’s nostrils; respondents were then asked to inhale through the nose, during which time the pen was waved slowly back and forth for no more than 3–4 s. Following each pen, respondents used a visual analog scale ranging from 0 (labeled no smell at all) to 10 (labeled smells very strong) to record their perception of odor strength.
Although there is not sufficient space to present the sensitivity data here, we note that there was some variation among respondents in the way in which the visual analog scale was completed. Respondents were given a choice between recording their responses directly on the interviewer’s laptop computer (using the mouse to position a slider) or on a paper version of the scale. Nearly two thirds of those who completed the olfactory module chose the paper version (61%), and the likelihood of choosing paper was higher for women and increased with age. Despite instructions to mark an X on the line representing the scale, 17%–18% of those who recorded their answers on paper wrote an integer between 0 and 10 instead.
Following the sensitivity test, respondents were presented with a five-item identification test. A single odor was presented, and respondents were asked to identify it from a set of alternatives (responses were recorded by the interviewer on the computer). This was repeated using five individual odors. The response sets were as follows (in order of administration and with the true odorant marked with an asterisk): (a) chamomile, raspberry, rose*, or cherry; (b) smoke, glue, leather*, or grass; (c) orange*, blueberry, strawberry, or onion; (d) bread, fish*, cheese, or ham; and (e) chive, peppermint*, pine, or onion. Following a forced-choice paradigm, respondents were not permitted to answer “don’t know”; however, for each test, 1%–3% of respondents refused to answer. Although these refusals are excluded from the analyses presented here, they may in many cases reflect uncertainty about the correct response, and therefore, other analysts may wish to handle them differently.
Results for the identification tests are presented in Table 1. For this analysis, we have focused solely on whether the respondent was able to identify the odorant correctly; a more in-depth analysis might examine the distribution of responses among the various incorrect alternatives. Each of the five odorants was identified correctly by a majority of respondents, with peppermint identified correctly most often (92%) and leather identified correctly least often (71%). Item nonresponse (including item-specific refusal to give a response plus 61 respondents [2%] who declined the entire olfactory module, 1 respondent who broke off the interview at an earlier point, and 1 instance of an equipment problem) was highest for the first item (5%) and declined steadily thereafter. The increasing likelihood of a correct response coupled with decreasing item nonresponse is consistent with the possibility that some respondents became more adept at the task after the first couple of tries. However, because the order of the items was identical for all respondents, it is not possible to distinguish between such order effects and true item-specific differences in difficulty.
Estimates for Model 1 and two versions of Model 2 (see Multiple Measurements) are also presented in Table 1, obtained using all respondents for whom data from at least one item were available. Model 1 reflects the same ordering in item difficulty observed in the percent correct and estimates the variance of the $\alpha_i$ to be 1.27 on the logit scale, indicating that a 1 SD increase in individual ability roughly triples the odds of correctly identifying a given odor ($e^{\sqrt{1.27}} \approx 3.1$). Model 2A permits the effect of a change in individual ability to vary across items; a likelihood ratio test of this model against Model 1 yields a p value of <.001, indicating that the items do differ in their ability to discriminate among individuals. Estimates of the discrimination parameters $\hat{\lambda}_j$ indicate that peppermint provided the best discrimination, whereas rose and leather provided the worst.
Model 2B extends 2A by incorporating a structural model for the $\alpha_i$ containing the covariates gender and age group. The estimated odds ratio for women (vs. men) is $e^{0.32} = 1.38$, with an approximate 95% confidence interval of 1.20–1.58. The effects of age appear roughly linear, with a 67% decrease in the odds of identifying an odor correctly from the youngest (57–64 years) to the oldest (75–85 years) age group (odds ratio 0.33 with a 95% confidence interval of 0.26–0.43). These results are consistent with previous studies of gender and age differences in olfactory function assessed by odor identification (Hummel et al., 2007).
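The conversions from logit-scale estimates to odds ratios used above can be reproduced with a few lines of Python. The helper names are hypothetical, and the standard error of 0.07 for the gender coefficient is an assumption back-calculated from the reported interval of 1.20–1.58 rather than a value stated in the text.

```python
import math

def odds_ratio_ci(coef, se, z=1.96):
    """Odds ratio and approximate 95% Wald interval from a logit coefficient."""
    return math.exp(coef), math.exp(coef - z * se), math.exp(coef + z * se)

def odds_multiplier_per_sd(variance, loading=1.0):
    """Factor by which the odds change for a 1 SD increase in ability."""
    return math.exp(loading * math.sqrt(variance))

# Gender effect: coefficient 0.32 from the text; SE 0.07 is an assumption.
print([round(v, 2) for v in odds_ratio_ci(0.32, 0.07)])

# Variance of ability under Model 1 (1.27 on the logit scale).
print(round(odds_multiplier_per_sd(1.27), 2))
```

The interval is computed on the logit scale and then exponentiated, which is why it is not symmetric around the odds ratio.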
Assessment of gustatory function was performed on all respondents using a series of taste-impregnated strips of filter paper (Mueller et al., 2003). Four strips were presented in the same order to each respondent: The first tasted sour, the second bitter, the third sweet, and the fourth salty. Before tasting each strip, respondents were asked to take a sip of water; they were then instructed to put the strip on their tongue and to describe the taste using one of the following descriptors: “salty,” “sweet,” “bitter,” or “sour.” In addition, they were asked to rate how certain they were that they had identified the taste correctly using a visual analog scale ranging from 0 (labeled very uncertain) to 10 (labeled very certain). To facilitate use of the scale, respondents were asked to record their answers directly on the laptop; those who were uncomfortable doing so were provided with a paper version of the items. As with the olfactory module, the majority of respondents who participated in the assessment (58%) chose the paper version.
Table 1. Item-Response Models Fit to Odor Identification Data (SEs)
n = 2,928
Columns: Odor | Percent correct | Item nonresponse a | Parameter | Model 1 | Model 2A | Model 2B
Row labels: Item difficulty; $\theta_{\text{fish}}$; $\lambda_{\text{fish}}$; Gender (vs. men); Age (vs. 57–64 years): 65–74 years, 75–85 years; Var($\alpha_i$)
[Cell values were largely lost in extraction; surviving entries, in extraction order: −0.58 (0.21); −0.77 (0.20); −0.53 (0.27); −0.01 (0.30); −0.47 (0.09); −1.10 (0.13)]
Note: a Includes 65 respondents for whom entire smell module is missing (61 refusals, 1 equipment problem, 1 interview break-off, and 2 due to interviewer error) plus those who refused each specific item.

Nonresponse was higher for this module than for the olfactory module. One hundred and thirty-seven respondents (5%) declined to participate, and equipment problems prevented administering the module in an additional 58 cases (2%). Although respondents who recorded their answers on the laptop were not given the option “don’t know,” they were permitted to indicate that they had tried and were unable to perform the task, at which point no further strips were administered. Fifty-nine respondents (2%) reported being unable to rate one of the four strips. In addition, between 5 and 21 respondents who recorded their answers on the computer refused to answer each identification item, and between 9 and 20 respondents who recorded their answers on paper wrote in “don’t know.” Finally, between 66 and 161 respondents (2%–5%) who recorded their answers on paper left each identification item blank. Although respondents who were truly unable to identify a particular taste are likely represented in each of these categories, only those recorded as “tried, unable to do” or who wrote in “don’t know” are counted as legitimate (incorrect) responses in the analysis presented here; all others are excluded.
Results for the identification items are presented in Table 2. The least-recognized taste was sour (39% correct), whereas the most recognized taste was sweet (86% correct). Because the four tastes were presented in the same order to all respondents, it is not possible to distinguish between item-specific differences and a possible learning effect, though the fact that only 67% identified the final taste (salty) correctly suggests that a learning effect cannot account for all the differences observed. Item nonresponse was highest for the bitter strip, reflecting a larger number of blank, “don’t know,” and refused responses.
Estimates for Model 1 mirror the item ordering observed in the percent correct, whereas those for Model 2A suggest that the sour and salty tastes load more heavily on the single dimension of ability being picked up here (a likelihood ratio test comparing Model 2A with the more restrictive Model 1 yields a p value of .004). Under 2A, the variance of the $\alpha_i$ is estimated to be 0.95, indicating that a 1 SD increase in ability increases the odds of correctly identifying a taste with a loading of one by a factor of 2.65 ($e^{\sqrt{0.95}}$). Model 2B adds a structural model for the $\alpha_i$ including gender and age group as covariates. The ability of women to identify tastes correctly is estimated to be 0.60 logits higher than that of men, whereas age is associated with a more modest decline of only 0.22 logits by the oldest age group relative to the youngest.
For comparison, Table 3 shows the results from logistic regression models fit separately to each of the taste identification items including both gender and age group as covariates. The estimated gender effects are similar in magnitude to the estimate from the structural model; however, age-related declines are observed for only the tastes bitter and salty. This indicates that the unidimensional model is inadequate for the purpose of describing the effects of age on the ability to identify these four tastes.
Table 2. Item-Response Models Fit to Taste Identification Data (SEs)
n = 2,765
Columns: Taste | Percent correct a | Item nonresponse b | Parameter | Model 1 | Model 2A | Model 2B
Row labels: Item difficulty; Gender (vs. men); Age (vs. 57–64 years): 65–74 years, 75–85 years; Var($\alpha_i$)
[Cell values were largely lost in extraction; surviving entries, in extraction order: −0.56 (0.05); −0.64 (0.07); −1.85 (0.20); −0.24 (0.14); −0.08 (0.07); −0.22 (0.08)]
Notes: a Among those providing a response; responses “don’t know” and “tried, unable to do” are counted as “incorrect.”
b Includes 227 respondents for whom entire taste module is missing (137 refusals, 58 due to equipment problems, 1 interview break-off, and 31 paper-and-pencil response sheets lost in the field) plus those who refused (computer-assisted personal interview) or failed to mark (paper-and-pencil) each specific item.

Distant visual acuity was assessed in both eyes together at 3 m using a chart with Sloan optotypes manufactured by Precision Vision (catalog number 2104). Respondents who normally wear glasses or contact lenses for driving or distance vision were instructed to wear them during the test. Interviewers followed a detailed protocol to ensure consistent distance from the chart (using a premeasured string laid out on the floor), line of sight (respondent seated with interviewer holding chart at respondent’s eye level), and lighting (sufficient light for reading with low glare or strong backlighting). Respondents were asked to begin by reading the smallest discernible line and, depending on the outcome, were then successively directed up or down one line at a time until the smallest line that could be read accurately was identified.
Half of the 3,005 respondents were randomized to receive the vision assessment (1,506). Of these, 64 (4%) refused to participate, 1 broke off the interview at an earlier point, and in 4 cases, a problem with the equipment prevented conducting the test. Results for the remaining 1,437 respondents are shown in Table 3. Twenty-three respondents (1%) were unable to read the largest line at 3 m, indicating vision worse than 20/200. Using standard guidelines, 62% of the study population are estimated to have good vision (better than 20/40), 27% are estimated to have moderately decreased vision (between 20/40 and 20/60), and 11% are estimated to have poor vision (worse than 20/60). Seventy-three percent of the respondents wore glasses or contact lenses during the test, and 20 respondents who normally wear glasses or contacts did not wear them (these individuals are included in the analyses presented here).
In modeling these data, at least two approaches are possible. First, one might attempt to model the mean directly, either on the standard scale (i.e., 20/20, 20/25, etc.) or on some appropriate transformation thereof (e.g., the logarithm of the minimum angle of resolution, which linearizes the geometric sequence of the chart). A second approach is to model the cumulative probabilities of being at or above a given ability using ordinal regression (McCullagh & Nelder, 1989). The advantage of the latter is that it does not require advanced knowledge of the biometrics of visual acuity, and more importantly, it yields conclusions that are immediately interpretable in clinically relevant terms (e.g., the effect of a covariate on the log odds of having vision equal to or better than a given threshold).
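The transformation to the logarithm of the minimum angle of resolution (logMAR) mentioned above can be sketched in Python. The function name is hypothetical, and the list of Snellen denominators is an illustrative geometric progression, not necessarily the set of lines on the chart NSHAP used.

```python
import math

def snellen_to_logmar(denominator: float, numerator: float = 20.0) -> float:
    """logMAR = log10(MAR), with MAR = denominator / numerator of the Snellen fraction."""
    return math.log10(denominator / numerator)

# Illustrative denominators in an approximately geometric progression.
for denom in (20, 25, 32, 40, 50, 63, 80, 100, 125, 160, 200):
    print(f"20/{denom}: logMAR = {snellen_to_logmar(denom):.2f}")
```

On this scale 20/20 maps to 0 and 20/200 to 1, and successive lines of a geometrically spaced chart are separated by a roughly constant step, which is what makes a model for the mean more plausible after the transformation.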
Table 3. Logistic Models Fit to Data on Distance Vision, Hearing, and Touch (SEs) a
Columns: Item | Percent a | Female (vs. male) | Age 65–74 years (vs. 57–64 years) | Age 75–85 years (vs. 57–64 years)
Row labels: Distance vision (3 m) b, c: Unable to do; Proportional odds model. Self-rated hearing b: Proportional odds model. Taste identification d. 2-point discrimination e: 1 point only.
[Cell values were largely lost in extraction; surviving entries, in extraction order: −0.46 (0.33); −0.55 (0.22); −0.38 (0.20); −0.14 (0.17); −0.29 (0.14); −0.24 (0.14); −0.23 (0.13); −0.52 (0.18); −0.60 (0.24); −0.76 (0.56); −0.50 (0.40); −0.62 (0.33); −0.50 (0.24); −0.51 (0.18); −0.72 (0.15); −0.68 (0.13); −0.99 (0.18); −0.88 (0.22); −1.87 (0.45); −1.80 (0.37); −1.56 (0.29); −1.46 (0.21); −1.36 (0.17); −1.61 (0.15); −1.54 (0.19); −1.73 (0.29); −2.08 (0.47); −1.59 (0.12); −0.95 (0.27); −1.01 (0.15); −0.74 (0.09); −0.75 (0.16); −0.84 (0.09); −0.24 (0.11); −0.34 (0.17); −0.77 (0.16); −0.39 (0.23); −0.57 (0.17); −0.31 (0.11); −0.74 (0.11); −0.16 (0.27); −0.29 (0.14); −0.22 (0.10); −0.29 (0.13); −0.25 (0.09); −0.14 (0.13); −0.30 (0.15); < −0.01 (0.13); −0.33 (0.15); −0.03 (0.17); −0.27 (0.18)]
Notes: a Estimates weighted to account for differential probabilities of selection and differential nonresponse. Design-based standard errors obtained using the
b Unconstrained model (i.e., nonparallel regressions) in which the change in odds associated with a change in the value of the covariate(s) is permitted to vary across the different cutpoints; estimates represent the change in the log odds of being in or above the corresponding category.
c Categories at each end have been combined due to the small number of observations.
d Separate logistic regression models fit to the probability of a correct response; responses “don’t know” and “tried, unable to do” are counted as “incorrect.”
e Separate logistic regression models fit to the probability of a correct response; responses “didn’t feel any points” and “tried, unable to do” are counted as “incorrect.”

Table 3 shows estimates from a logistic model with the covariates gender (female vs. male) and age group (65–74 and 75–85 years, both relative to 57–64 years) in which the effects of the covariates on the odds of being at or above a given level of ability are allowed to differ at each level (this model may be fit using the gologit2 package for Stata [Williams, 2006]). The estimates for each covariate are roughly similar across the different cutpoints; likelihood ratio tests of equality across the cutpoints yield p values of .576, .398, and .210 for gender, age 65–74 years, and age 75–85 years, respectively. Estimates for the proportional odds model (in which the effects of the covariates are assumed equal across the cutpoints) are also provided and indicate that the odds of being at or above a given level of ability are estimated to be $(1 - e^{-0.31}) \times 100 = 26.7\%$ lower for women than for men. The effect of age appears roughly linear, with those aged 75–85 years having 79.6% lower odds of being above a given cutpoint than those aged 57–64 years. A likelihood ratio test of the interaction between gender and age (2 df) yields a p value of .847. Of course, although the proportional odds model does a good job of summarizing the effects of gender and age on visual acuity, there is no guarantee that this model will be appropriate for
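The defining property of the proportional odds model, namely that a covariate shifts the odds by the same factor at every cutpoint, can be checked numerically with a short Python sketch. The cutpoint values and function names are hypothetical; only the coefficient of −0.31 for women is taken from the text.

```python
import math

def cumulative_odds(cutpoints, linear_predictor):
    """Odds of being at or above each cutpoint: exp(cutpoint + linear predictor)."""
    return [math.exp(c + linear_predictor) for c in cutpoints]

def category_probs(cutpoints, linear_predictor):
    """Category probabilities implied by P(Y >= k) = inv_logit(cutpoint_k + lp).

    Cutpoints must be decreasing so the cumulative probabilities decrease in k.
    """
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-(c + linear_predictor))) for c in cutpoints]
    cum += [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

cutpoints = (2.0, 0.5, -1.0)   # hypothetical intercepts, one per cutpoint
beta_female = -0.31            # proportional odds estimate quoted in the text

men = cumulative_odds(cutpoints, 0.0)
women = cumulative_odds(cutpoints, beta_female)
print([round(w / m, 4) for w, m in zip(women, men)])   # same ratio at every cutpoint
print([round(p, 3) for p in category_probs(cutpoints, beta_female)])
```

In the unconstrained model, by contrast, each cutpoint would have its own coefficient, and the ratios printed in the first line would be free to differ.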
Note that because respondents were randomly selected for visual acuity assessment, analyses like those just described using only those individuals who were assessed will yield unbiased results. However, in situations where working with half of the sample is not adequate, one could use self-rated vision (asked of all respondents) together with other covariates to impute values of visual acuity for those who were not administered this module (Rubin, 1987).
Hearing was the one sense that NSHAP did not measure objectively. This was not due to a lack of interest; rather, portable audiometers (the standard method for assessment in the field) were too costly, and technical problems precluded the possible integration of an audiometric application directly into the computer-assisted personal interview instrument prior to going into the field. Thus, the primary measure of hearing was the question “Is your hearing excellent, very good, good, fair, or poor?” This question was asked of all respondents; respondents who use a hearing aid were asked to describe their hearing while using it. Respondents were also asked the question “Do you feel you have a hearing loss?” to which 44% answered yes; this single question has been shown to have reasonable sensitivity and specificity for hearing impairment (Sindhusake et al., 2001). Finally, as noted above, the interviewer’s rating of the respondent’s hearing is also available. Because the interviewer had engaged in a 90-min or longer face-to-face conversation with the respondent, one might expect his or her rating to be fairly sensitive to deficits severe enough to affect conversation.
Results from the self-report question are shown in Table 3. Only 18% of respondents rated their hearing as excellent, whereas roughly one third each rated their hearing as either very good or good. Twenty percent rated their hearing as either fair or poor. Estimates for the unconstrained ordered logit model are similar to those for the proportional odds model, with likelihood ratio tests comparing the two yielding p values of .867, .993, and .331 for the gender, age 65–74 years, and age 75–85 years coefficients, respectively. Thus, we estimate that the odds of being above a given cutpoint are $(e^{0.57} - 1) \times 100 = 76.8\%$ higher for women than for men and decrease with age at an increasing rate, declining by $(1 - e^{-0.84}) \times 100 = 56.8\%$ in the 75- to 85-year-old
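The percentage changes quoted above come from exponentiating the ordered logit coefficients. A short sketch of that arithmetic follows, using the coefficients reported in the text (0.57 for women vs. men, −0.84 for ages 75–85 vs. 57–64); the function name is illustrative only.

```python
import math


def percent_change_in_odds(coef):
    """Convert a logit coefficient into a percentage change in the odds."""
    return (math.exp(coef) - 1.0) * 100.0


women_vs_men = percent_change_in_odds(0.57)    # positive: odds are higher
oldest_vs_ref = percent_change_in_odds(-0.84)  # negative: odds are lower

print(round(women_vs_men, 1))   # 76.8
print(round(oldest_vs_ref, 1))  # -56.8
```

A negative result is read as a decline of that magnitude, so −56.8 corresponds to odds that are 56.8% lower.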
Tactile function was assessed via 2-point discrimination, a standard and reliable method for measuring the finger’s sensation to touch (Dellon & Keller, 1997; Dellon, Mackinnon, & Crosby, 1987; Finnell, Knopp, Johnson, Holland, & Schubert, 2004). These tests were performed using a multisided handheld discriminator with graded intraprong distances developed specifically for NSHAP by a metallurgical engineer. Although more sensitive and accurate devices are available (e.g., Mayfield & Sugarman, 2000), their cost and the time required to administer them preclude using them in a large, multipurpose study conducted by survey interviewers in the home (the desire to obtain data comparable to existing studies was also a factor). Respondents were first asked to close their eyes, after which the interviewer touched the tip of the index finger of their dominant hand lightly with two small metal points located a fixed distance apart. Respondents were then asked whether they had felt one or two points; responses such as “three points” or “I feel something but I’m not sure how many points” were recorded by the interviewer as one point. Four tests were performed in succession: the first at 12 mm apart, the second consisting of only a single point, the third at 8 mm, and the fourth at 4 mm.
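The recording and scoring rules for the four tests can be sketched as follows. This is an illustration under stated assumptions rather than the study's actual processing code: the response codes ("one", "two", "three", "unsure", "none", "unable") are hypothetical labels, and the sketch ignores the rule that testing stopped once a respondent reported being unable to perform the task. Responses of "didn't feel any points" or "tried, unable to do" are scored as incorrect, as in the analyses reported in this article.

```python
# Stimuli in the order administered: (label, true number of points).
STIMULI = [("12 mm", 2), ("1 point", 1), ("8 mm", 2), ("4 mm", 2)]

# Hypothetical response codes mapped to the number of points recorded.
RECORDED = {
    "one": 1,
    "two": 2,
    "three": 1,      # recorded by the interviewer as one point
    "unsure": 1,     # felt something but not sure how many points
    "none": None,    # did not feel any points
    "unable": None,  # tried, unable to do
}


def score_items(raw_responses):
    """Return 1 (correct) or 0 (incorrect) for each of the four stimuli."""
    if len(raw_responses) != len(STIMULI):
        raise ValueError("expected one response per stimulus")
    scores = []
    for raw, (_label, truth) in zip(raw_responses, STIMULI):
        recorded = RECORDED[raw]
        # A missing recorded count (None) never equals the truth,
        # so it is scored as incorrect.
        scores.append(1 if recorded == truth else 0)
    return scores


print(score_items(["two", "one", "two", "one"]))  # [1, 1, 1, 0]
```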
In order to reduce the average length of the interview, the assessment of tactile function was administered to the same random half sample who received the vision assessment. Of those 1,506 respondents, 28 (2%) declined to participate, an equipment problem was reported by the interviewer in three cases, and 1 respondent broke off the interview at an earlier point, leaving 1,474 respondents who completed at least part of the module. As with the taste identification module, after each stimulus, respondents were permitted to indicate that despite trying, they were unable to perform the task, at which point no further stimuli were administered. However, only 39 respondents (3% of those participating) indicated that they were unable to complete one of the tests; these responses, together with the response “I didn’t feel any points” (1%–4%), are treated as incorrect in the analyses
Table 4 shows the percent correct and item nonresponse
for each of the four stimuli. As expected, respondents found
it substantially more difficult to distinguish between points that were 4 mm apart (only 41% correctly identified these as two distinct points). Interestingly, however, the same percentage of respondents (79%) were able to distinguish between points 8 mm apart as were able to distinguish between points 12 mm apart, suggesting that there is a plateau in the response function over this interval. Only slightly more (86%) correctly identified the single point.
The same item-response models used above were also fit to the 2-point discrimination data. Model 2A shows that the 4 mm and single-point tasks load substantially less on the individual factor being captured by the model. Regressing that factor on gender and age group shows no difference in sensory ability between men and women but a decrease with age, especially among the oldest group. For comparison, Table 3 shows results from individual logistic regressions fit to each of the four items. The decline with age is most evident for the 4 and 12 mm items and is not evident at all for the single-point item. Although women were slightly less likely to discriminate between the 12 mm points (this could, e.g., reflect a tendency on the part of the interviewers to touch the discriminator initially less heavily against some women’s fingers), none of the other items exhibited a gender difference.
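To see what it means for an item to load less on the individual factor, consider a generic two-parameter logistic item-response function. The sketch below uses one common parameterization, logit(p) = loading × ability − difficulty; this is an assumption for illustration and may differ from the exact specification of the models fit in this article. All numeric values are hypothetical.

```python
import math


def prob_correct(ability, loading, difficulty):
    """P(correct) for one item: logit(p) = loading * ability - difficulty."""
    return 1.0 / (1.0 + math.exp(-(loading * ability - difficulty)))


def ability_gap(loading, difficulty=0.0, low=-1.0, high=1.0):
    """Difference in P(correct) between a high- and a low-ability respondent."""
    return prob_correct(high, loading, difficulty) - prob_correct(low, loading, difficulty)


# Hypothetical loadings: an item with a small loading separates
# high- and low-ability respondents much less sharply.
strong_item = ability_gap(loading=1.0)
weak_item = ability_gap(loading=0.3)
print(strong_item > weak_item)  # True
```

Under this form, an item with a small loading contributes little information about the underlying ability, which is one way to read the weaker loadings of the 4 mm and single-point tasks.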
For expository purposes, we have modeled the single-
point item in the same manner as the other three. However,
the primary function of the single-point item was to prevent
respondents from simply guessing that all the stimuli in-
volved two points. Given this, together with the fact that
identifying a single point clearly represents a different task
from discriminating between two points, one might argue
that this item should instead be modeled differently. This is
underscored by the fact that the likelihood of answering the
single-point item correctly was not related to age.
The National Social Life, Health, and Aging Project is the first U.S. national study of older adults that has attempted to obtain a comprehensive assessment of sensory function. Consistent with previous clinical and population-based studies, the data show age-related declines in functioning across each of the five senses. Researchers may now use this data set to examine whether certain subgroups exhibit greater declines than others. In addition, researchers may now, for the first time, begin to explore among older adults the relationships between sensory function and both the level of social participation and the quality of intimate relationships. Future waves of NSHAP will offer the ability to study changes in sensory function over time, providing an opportunity to explore causal hypotheses involving sensory function and social interaction. Use of the self-report measures in conjunction with the objective assessments should also prove informative here because one’s perception of one’s abilities may serve to mediate the effects of actual changes in function on social interaction.
To the best of our knowledge, NSHAP is the first survey study to attempt objective measurements of olfactory, gustatory, or tactile function. Although item nonresponse for the gustation component ranged from 10% to 14% (the slightly higher rate was perhaps to be expected given the more invasive nature of putting an object in one’s mouth), it was less than 5% for olfaction and touch. Results for olfaction and taste identification are similar to those obtained
Table 4. Item-Response Models Fit to 2-Point Discrimination Data (SEs), n = 1,474
Columns: Distance between points | Percent correct (note a) | Item nonresponse (note b) | Parameter | Model 1 | Model 2A | Model 2B
Distance rows recovered: 1 point only (remaining row labels not recoverable)
Parameter rows: Item difficulty (θ 12 mm, θ 1 point, θ 8 mm, θ 4 mm); λ 12 mm, λ 1 point, λ 8 mm, λ 4 mm; Gender (vs. men); Age (vs. 57–64 years): 65–74 years, 75–85 years; Var(α_i)
Recovered estimates (cell positions not recoverable): −0.52 (0.07), −0.41 (0.06), −1.01 (0.14), −0.16 (0.18), −0.33 (0.23), −1.04 (0.30)
Notes: a Responses “didn’t feel any points” and “tried, unable to do” are counted as incorrect.
b Includes 32 respondents for whom the entire tactile module is missing (28 refusals, 3 due to equipment problems, and 1 interview break-off).
from more in-depth studies using the same methodologies in more controlled settings, indicating that these protocols can be administered successfully by field interviewers with older adults in the home. The analyses presented here also show that the resulting data may be analyzed with standard item-response models. The 2-point discrimination data may prove to be an exception here because there is some indication that the three graded distances, when taken together, do not measure a single underlying dimension. More work is needed to determine whether and how these items should be
The analysis of the multi-item measures (i.e., olfaction, gustation, touch) presented here is intended merely to illustrate how the items may be combined for the purpose of investigating the relationship between sensory function and other variables. However, scoring each item as either correct or incorrect as we have done here ignores the possibility that additional information may be recovered by distinguishing between the various incorrect alternatives (e.g., the difference between salty and bitter may not be as large as the difference between salty and sweet). Polytomous choice models similar to those described here may be used to address this issue. Similarly, researchers interested in hearing may wish to analyze the two self-report measures together with the interviewer rating in order to investigate possible biases in the self-reports and to achieve a more efficient analysis.
Finally, we note that a detailed analysis of a particular sensory function will likely require explicit consideration of several specific physiological factors known to affect that function. Because NSHAP included a broad assessment of health, many of these factors have been measured and are therefore available for analysis. For example, a history of either nasal surgery or head injury is relevant to olfactory function and was included in the assessment of physical health. Similarly, the presence of several comorbid conditions (e.g., high blood pressure, diabetes) and a complete log of current medications were also obtained (see corresponding article in this volume for more details). These greatly enhance the value of the sensory data.
Author Contributions
M.M., S.W., S.L., J.L., T.H., and S.T.L. all participated in designing the
sensory function protocols used in the study. L.P.S. performed the data
analysis and drafted the manuscript. M.M., S.W., S.L., J.L., T.H., and S.T.L.
participated in revising the manuscript for important intellectual content.
The authors would like to thank Dr. D. Friedman for help in designing NSHAP’s vision module and P. Rathouz for directing us to the paper by Whittemore (1989) regarding the use of empirical Bayes predictions. The authors would also like to thank R. Williams for designing and making the discriminators used in this study.
Address correspondence to L. Philip Schumm, MA, Department of Health Studies, University of Chicago, 5841 South Maryland Avenue MC2007, Chicago, IL 60637. Email: firstname.lastname@example.org
Anstey, K. J., Wood, J., Lord, S., & Walker, J. G. (2005). Cognitive, sensory and physical factors enabling driving safety in older adults. Clinical Psychology Review, 25, 45–65.
Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 396–479). Reading, MA: Addison-
Bollen, K. A. (1989). Structural equations with latent variables. New York:
Bramerson, A., Johansson, L., Ek, L., Nordin, S., & Bende, M. (2004). Prevalence of olfactory dysfunction: The Skovde Population-Based Study. Laryngoscope, 114, 733–737.
Cox, D. R., & Snell, E. J. (1989). Analysis of binary data (2nd ed.). London: Chapman & Hall.
Dellon, A. L., & Keller, K. M. (1997). Computer-assisted quantitative sensorimotor testing in patients with carpal and cubital tunnel syndromes. Annals of Plastic Surgery, 38, 493–502.
Dellon, A. L., Mackinnon, S. E., & Crosby, P. M. (1987). Reliability of two-point discrimination measurements. Journal of Hand Surgery, 12, 693–696.
Desrosiers, J., Hebert, R., Bravo, G., & Dutil, E. (1996). Hand sensibility of healthy older people. Journal of the American Geriatrics Society, 44, 974–978.
Efron, B., & Tibshirani, R. (1993). An introduction to the bootstrap (Vol. 57). New York: Chapman & Hall.
Finnell, J. T., Knopp, R., Johnson, P., Holland, P. C., & Schubert, W. (2004). A calibrated paper clip is a reliable measure of two-point discrimination. Academic Emergency Medicine, 11, 710–714.
Fukunaga, A., Uematsu, H., & Sugimoto, K. (2005). Influences of aging on taste perception and oral somatic sensation. Journal of Gerontology: Biological Sciences and Medical Sciences, 60, 109–113.
Globe, D. R., Wu, J., Azen, S. P., & Varma, R. (2004). The impact of visual impairment on self-reported visual functioning in Latinos: The Los Angeles Latino Eye Study. Ophthalmology, 111, 1141–1149.
Hummel, T., Kobal, G., Gudziol, H., & Mackay-Sim, A. (2007). Normative data for the “Sniffin’ Sticks” including tests of odor identification, odor discrimination, and olfactory thresholds: An upgrade based on a group of more than 3,000 subjects. European Archives of Oto-Rhino-Laryngology, 264, 237–243.
Hummel, T., Sekinger, B., Wolf, S. R., Pauli, E., & Kobal, G. (1997). Sniffin’ Sticks: Olfactory performance assessed by the combined testing of odor identification, odor discrimination and olfactory threshold. Chemical Senses, 22, 39–52.
Joreskog, K. G., & Goldberger, A. S. (1975). Estimation of a model with multiple indicators and multiple causes of a single latent variable. Journal of the American Statistical Association, 70, 631–639.
Landis, B. N., & Hummel, T. (2006). New evidence for high occurrence of olfactory dysfunctions within the population [Letter to the editor]. American Journal of Medicine, 119, 91–92.
Landis, B. N., Hummel, T., Hugentobler, M., Giger, R., & Lacroix, J. S. (2003). Ratings of overall olfactory function. Chemical Senses, 28, 691–694.
Landis, B. N., Konnerth, C. G., & Hummel, T. (2004). A study on the frequency of olfactory dysfunction. Laryngoscope, 114, 1764–1769.
Li, Y., Healy, E. W., Wanzer Drane, J., & Zhang, J. (2006). Comorbidity between and risk factors for severe hearing and memory impairment in older Americans. Preventive Medicine, 43, 416–421.
Lindau, S. T., Laumann, E. O., Levinson, W., & Waite, L. J. (2003). Synthesis of scientific disciplines in pursuit of health: The Interactive Biopsychosocial Model. Perspectives in Biology and Medicine, 46(3 Suppl.), S74–S86.
Lindau, S. T., & McDade, T. W. (2007). Minimally invasive and innovative methods for biomeasure collection in population-based research. In M. Weinstein, J. W. Vaupel, & K. W. Wachter (Eds.), Biosocial surveys (chap. 13). Washington, DC: The National Academies Press.
Mayfield, J. A., & Sugarman, J. R. (2000). The use of the Semmes-Weinstein monofilament and other threshold tests for preventing foot ulceration and amputation in persons with diabetes. Journal of Family Practice, 49(11 Suppl.), S17–S29.
McCullagh, P., & Nelder, J. A. (1989). Generalized linear models (2nd ed.). London: Chapman & Hall.
Mueller, C., Kallert, S., Renner, B., Stiassny, K., Temmel, A. F. P., Hummel, T., & Kobal, G. (2003). Quantitative assessment of gustatory function in a clinical context using impregnated “taste strips”. Rhinology, 41, 2–6.
Mueller, C., & Renner, B. (2006). A new procedure for the short screening of olfactory function using five items from the “Sniffin’ Sticks” identification test kit. American Journal of Rhinology, 20, 113–116.
Murphy, C., Schubert, C. R., Cruickshanks, K. J., Klein, B. E. K., Klein, R., & Nondahl, D. M. (2002). Prevalence of olfactory impairment in older adults. Journal of the American Medical Association, 288, 2307–2312.
Ostbye, T., Krause, K. M., Norton, M. C., Tschanz, J., Sanders, L., Hayden, K., Pieper, C., & Welsh-Bohmer, K. A. (2006). Ten dimensions of health and their relationships with overall self-reported health and survival in a predominately religiously active elderly population: The Cache County Memory Study. Journal of the American Geriatrics Society, 54, 199–209.
Ranganathan, V. K., Siemionow, V., Sahgal, V., & Yue, G. H. (2001). Effects of aging on hand function. Journal of the American Geriatrics Society, 49, 1478–1484.
Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen, Denmark: Nielsen and Lydiche.
Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.
Seiberling, K. A., & Conley, D. B. (2004). Aging and olfactory and taste function. Otolaryngologic Clinics of North America, 37, 1209–
Sindhusake, D., Mitchell, P., Smith, W., Golding, M., Newall, P., Hartley, D., & Rubin, G. (2001). Validation of self-reported hearing loss. The Blue Mountains Hearing Study. International Journal of Epidemiology, 30, 1371–1378.
Skrondal, A., & Rabe-Hesketh, S. (2004). Generalized latent variable modeling: Multilevel, longitudinal, and structural equation models. Boca Raton, FL: Chapman & Hall/CRC.
StataCorp. (2007). Stata statistical software: Release 10. College Station, TX: StataCorp LP.
Vitale, S., Cotch, M. F., & Sperduto, R. D. (2006). Prevalence of visual impairment in the United States. Journal of the American Medical Association, 295, 2158–2163.
Whittemore, A. S. (1989). Errors-in-variables regression using Stein estimates. American Statistician, 43, 226–228.
Wickremaratchi, M. M., & Llewelyn, J. G. (2006). Effects of ageing on touch. Postgraduate Medical Journal, 82, 301–304.
Williams, R. (2006). Generalized ordered logit/partial proportional odds models for ordinal dependent variables. Stata Journal, 6, 58–82.
Zheng, X., & Rabe-Hesketh, S. (2007). Estimating parameters of dichotomous and ordinal item response models with gllamm. Stata Journal, 7, 313–333.
Received July 28, 2008
Accepted February 9, 2009
Decision Editor: Robert B. Wallace, MD, MSc