Use of population health data to refine diagnostic decision-making for pertussis.
ABSTRACT To improve identification of pertussis cases by developing a decision model that incorporates recent, local, population-level disease incidence.
Retrospective cohort analysis of 443 infants tested for pertussis (2003-7).
Three models (based on clinical data only, local disease incidence only, and a combination of clinical data and local disease incidence) to predict pertussis positivity were created with demographic, historical, physical exam, and state-wide pertussis data. Models were compared using sensitivity, specificity, area under the receiver-operating characteristics (ROC) curve (AUC), and related metrics.
The model using only clinical data included cyanosis, cough for 1 week, and absence of fever, and was 89% sensitive (95% CI 79 to 99), 27% specific (95% CI 22 to 32) with an area under the ROC curve of 0.80. The model using only local incidence data performed best when the proportion positive of pertussis cultures in the region exceeded 10% in the 8-14 days prior to the infant's associated visit, achieving 13% sensitivity, 53% specificity, and AUC 0.65. The combined model, built with patient-derived variables and local incidence data, included cyanosis, cough for 1 week, and the variable indicating that the proportion positive of pertussis cultures in the region exceeded 10% 8-14 days prior to the infant's associated visit. This model was 100% sensitive (p<0.04, 95% CI 92 to 100), 38% specific (p<0.001, 95% CI 33 to 43), with AUC 0.82.
Incorporating recent, local population-level disease incidence improved the ability of a decision model to correctly identify infants with pertussis. Our findings support fostering bidirectional exchange between public health and clinical practice, and validate a method for integrating large-scale public health datasets with rich clinical data to improve decision-making and public health.
Use of population health data to refine diagnostic
decision-making for pertussis
Andrew M Fine,1 Ben Y Reis,1,2 Lise E Nigrovic,1 Donald A Goldmann,3
Tracy N LaPorte,4 Karen L Olson,1,2 Kenneth D Mandl1,2,5
Bordetella pertussis outbreaks can infect hundreds of
people across all age groups, though the infection is
most dangerous for young infants.1–3 Pertussis is
difficult to diagnose, especially in its early stages,
and definitive test results are not available for
several days. Timely administration of antibiotics
decreases transmissibility of the disease.4 Most
patients with cough do not have pertussis, but a
missed case of the contagious disease is likely to
have important consequences for the patient, her
contacts, and the public health.5 A patient’s risk of
exposure to infection varies by local disease
burden,6 7 though clinicians rarely have ready access
to information about epidemiologic context8 (the
recent regional incidence of an infectious disease)
when making management decisions. The prolifer-
ation of real-time infectious disease surveillance
and electronic laboratory reporting
systems11 creates an opportunity to open a key
communications channel between public health
agencies and point-of-care providers. Currently
there are no clinical decision-support systems that
incorporate epidemiologic context into
management algorithms in real time.12
Because of the temporal and geographic varia-
bility of pertussis outbreaks, the delay in diagnostic
test results, and the personal and public health
ramifications13 of incorrect management decisions
at the point of care, pertussis is a prototypical
disease for which real-time public health incidence
data might inform, guide, and improve clinical
decision-making. The purpose of this study is to
quantify the value of recent, local disease incidence,
derived from public health sources, in improving
management of pertussis in the clinical setting.
Design, setting, and subjects
A retrospective review was conducted of charts for
infants tested for pertussis by culture, presenting to
the pediatric emergency department (ED) of a large
urban tertiary care US hospital from 1 January 2003
to 31 December 2007. The ED volume exceeds
50 000 patients per year. The study received insti-
tutional review board approval.
Inclusion and exclusion criteria
Subjects included all infants tested for pertussis by
culture from 2003 to 2007. If a patient had multiple
pertussis cultures from 2003 to 2007, only the first
test was included.
An infant was defined as pertussis-positive or
pertussis-negative based on culture result, which is
widely regarded as the gold standard.14 15 Alternate
tests like PCR, serology, and direct fluorescent
antibody (DFA) were not used in the case definition.
Positive culture from a nasopharyngeal specimen is
100% specific for pertussis.4 16 Sensitivity, however,
may be limited for several reasons including the
organism’s fastidious nature, specimen collection
technique, when the patient is tested in the course
of the illness, and prior or concurrent use of
antibiotics.16 17 While PCR may have a better sensitivity, we did not rely on it because there is no FDA-approved test kit available, because test characteristics vary widely by laboratory, and because outbreaks have recently been attributed to PCR false positives.4 18 PCR may, in fact, be oversensitive, and requires correlation with at least 2 weeks of cough and paroxysm, whoop, or post-tussive emesis,4 which are difficult to assess accurately in a retrospective review. Serology is not recommended for infants, and DFA is not widely available.19

1 Division of Emergency Medicine, Children’s Hospital Boston and Department of Pediatrics, Harvard Medical School, Boston, Massachusetts,
Informatics Program at the Harvard-MIT, Division of Health Sciences and Technology, Boston, Massachusetts, USA
3 Division of Infectious Diseases, Children’s Hospital Boston and Harvard Medical School, Boston,
4 Massachusetts Department of Public Health, Jamaica Plain,
Manton Center for Orphan Disease Research, Children’s Hospital Boston, Boston,
Dr A M Fine, Division of Emergency Medicine, Main 1, Children’s Hospital Boston, 300 Longwood Avenue, Boston, MA 02115, USA; andrew.fine@
Received 12 May 2008
Accepted 23 August 2009
J Am Med Inform Assoc 2010;17:85–90. doi:10.1197/jamia.M3061
Clinical data collection
Demographics, signs, and symptoms commonly associated with
infant pertussis, local disease incidence data and outcomes were
collected for each patient.4 20 21 Demographics included visit
date, gender, and age (months). Signs and symptoms included
cough duration (days), fever duration (days), history of apnea,
post-tussive emesis, cyanosis, seizure, and contact with a person
with known pertussis. If the record did not contain information
about these symptoms, they were coded as absent. Cough
descriptors like paroxysm, staccato, and “whoop” were not
included because they could not be measured accurately by chart
review. Outcome data including antibiotic use, hospitalization,
and mortality were collected to help describe the study
population.
In the initial review, the pertussis culture result for each
patient was obtained from the hospital laboratory information
system. Subsequently, the chart abstractor (an attending
physician specializing in pediatric emergency medicine) respon-
sible for collecting and entering patient data into structured
forms was blinded to the culture result. The culture result was
accessible through a unique laboratory link to a PDF file from
the external laboratory that performed the culture. These results
were kept separated from the portion of the electronic chart
used from the ED clinical encounter. No linkage between the
culture result and the clinical portion of the chart was conducted
until after all clinical charts had been reviewed. Historical and
physical exam features were based on the EMR generated during
the ED encounter. Outcome data were collected from the ED
EMR, inpatient discharge summaries, and outpatient follow-up
visits. To assess inter-rater reliability, a second abstractor (also an
attending physician specializing in pediatric emergency medicine)
reviewed a random sample of 7% of charts.22 23 The two
chart abstractors had over 90% agreement (range 91–97%) and
κ values24 from 0.52 to 0.87 for all candidate predictors.
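The agreement figures above can be illustrated with a minimal sketch. It assumes the reported statistic is Cohen's κ for two raters coding one dichotomous predictor; the function names and the ten example codes are hypothetical and are not study data.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of charts on which the two raters recorded the same code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: sum over categories of the product of marginal rates.
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical codes for one predictor (1 = present, 0 = absent) on 10 charts.
first = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
second = [1, 0, 0, 0, 0, 0, 0, 1, 0, 0]
print(percent_agreement(first, second))
print(round(cohen_kappa(first, second), 2))
```

High raw agreement can coexist with a moderate κ when a finding is rare, which is one reason both figures are usually reported together.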
Local disease-incidence data collection
A query of the State Laboratory of the Massachusetts Depart-
ment of Public Health database yielded 19 907 pertussis culture
results from patients of all ages over the study period (2003–7).
These data were obtained through a limited data sharing agree-
ment. State data about cultures included date sent and culture
result, but not demographics, clinical findings or outcomes.
Aggregate disease incidence variables were created for the
number of pertussis cultures performed, the number of positives
and the proportion positive at the state laboratory. Each of these
variables was tabulated over a range of different timescales: 1–7,
8–14, 15–21, and 22–28 days prior to each visit date. Based on
date of presentation, the corresponding public health incidence
variables (number of cultures performed, number positive, and proportion
positive in the prior and cumulative 1–4 weeks) were assigned to
each patient.
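As an illustration of how such window-based incidence variables might be derived, the sketch below tabulates cultures sent in a given range of days before a visit. The record layout (date sent, result), the function name, and the example records are assumptions made for illustration; they are not the study's data or code.

```python
from datetime import date, timedelta


def window_summary(cultures, visit_date, start_day, end_day):
    """Summarize cultures sent start_day..end_day days before a visit.

    cultures: iterable of (date_sent, is_positive) pairs.
    Returns (number performed, number positive, proportion positive),
    with the proportion set to None when no cultures fall in the window.
    """
    earliest = visit_date - timedelta(days=end_day)
    latest = visit_date - timedelta(days=start_day)
    in_window = [pos for sent, pos in cultures if earliest <= sent <= latest]
    performed = len(in_window)
    positive = sum(in_window)
    proportion = positive / performed if performed else None
    return performed, positive, proportion


# The four timescales described in the text (days prior to the visit).
WINDOWS = [(1, 7), (8, 14), (15, 21), (22, 28)]

# Hypothetical state-laboratory records, for illustration only.
cultures = [
    (date(2005, 3, 1), False),
    (date(2005, 3, 3), True),
    (date(2005, 3, 5), False),
    (date(2005, 3, 6), False),
    (date(2005, 3, 10), True),
]
visit = date(2005, 3, 15)
for start, end in WINDOWS:
    print((start, end), window_summary(cultures, visit, start, end))
```

Returning None for an empty window, rather than zero, keeps "no cultures sent" distinct from "cultures sent, none positive".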
Building the decision models
The same sequence of steps was used to build three decision
models: (1) a “clinical only” model, in which candidate predictors included
only clinical data based on demographics, history, and physical
exam; (2) a “local disease incidence” model, in which candidate predictors
included only public health incidence data; and (3) a “contextualized”
model, in which all clinical and public health predictors were
considered.
Variable discretization and selection
Dichotomous variables (history of apnea, post-tussive emesis,
cyanosis, and seizure) associated with positive pertussis culture
in the clinical data set were identified. Significance of association
was tested with a χ² goodness-of-fit test (p<0.05). Continuous
variables (duration of cough, duration of fever, and local disease
incidence variables) were dichotomized at categorical cut-offs
considered by the clinical investigators to be clinically useful and
easy to remember (eg, cough at least 1 week, presence of fever,
and proportion positive past 21 days >0.10).
In the multivariate analysis, candidate variables were entered
into a forward stepwise logistic regression to identify inde-
pendent predictors of infants testing pertussis positive. Cut-offs
for entry and departure for the logistic regression model were
0.25 and 0.10.
For the local disease incidence model, each variable was consid-
ered for entry into the model as an independent predictor.
Because of the interdependence of these variables, it was estab-
lished a priori that no more than one candidate incidence variable
would be contained in the final model. Thresholds were defined
for the numbers of tests performed, positive, and proportion
positive over 1–4 weeks. For proportion positive, thresholds were
tested from 0.01 to 0.20 in increments of 0.01.
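The threshold search for proportion positive can be sketched as follows. This is a simplified illustration, not the authors' procedure: it scores each cut-off by sensitivity and specificity and picks the cut-off with the greatest specificity among those with the highest sensitivity, whereas the study also used the area under the ROC curve. All data and names are hypothetical.

```python
def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for two equal-length boolean lists."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fp = sum(p and (not a) for p, a in zip(predicted, actual))
    return tp / (tp + fn), tn / (tn + fp)


def sweep_thresholds(proportions, actual):
    """Evaluate the rule 'proportion positive > t' for t = 0.01 ... 0.20."""
    results = []
    for step in range(1, 21):
        threshold = step / 100  # integer steps avoid floating-point drift
        predicted = [p > threshold for p in proportions]
        sens, spec = sensitivity_specificity(predicted, actual)
        results.append((threshold, sens, spec))
    return results


# Hypothetical proportion positive at each visit and culture outcome.
proportions = [0.02, 0.04, 0.05, 0.08, 0.11, 0.12, 0.15, 0.03]
actual = [False, False, False, True, True, False, True, False]

results = sweep_thresholds(proportions, actual)
# Highest sensitivity first, then greatest specificity; ties keep the
# lowest threshold because max() returns the first maximal element.
best = max(results, key=lambda r: (r[1], r[2]))
print(best)
```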
For the contextualized model, each clinical and local disease
incidence variable was considered for entry into the multivariate
model. Variables not included in the final clinical only or final local
disease incidence model were still considered for inclusion in the
contextualized model.
After selecting the best final model for each analysis (clinical,
local disease incidence, or contextualized), a bootstrap validation
was performed. Predictors that were selected in over 50% of the
1000 bootstrap samples were retained in the final model.25–27
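The bootstrap retention rule (keep predictors chosen in over 50% of 1000 resamples) can be sketched as below. The selector here is a deliberately simple stand-in based on a risk difference; it is an assumption for illustration and is not the forward stepwise logistic regression used in the study. The data set is synthetic.

```python
import random


def select_predictors(rows, names, min_risk_difference=0.10):
    """Stand-in selector (NOT stepwise logistic regression): keep a
    predictor when outcome risk differs by at least min_risk_difference
    between records with and without it."""
    chosen = set()
    for name in names:
        exposed = [r["outcome"] for r in rows if r[name]]
        unexposed = [r["outcome"] for r in rows if not r[name]]
        if not exposed or not unexposed:
            continue  # predictor is constant in this resample
        diff = sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)
        if abs(diff) >= min_risk_difference:
            chosen.add(name)
    return chosen


def bootstrap_selection_frequency(rows, names, n_samples=1000, seed=0):
    """Fraction of bootstrap resamples in which each predictor is selected."""
    rng = random.Random(seed)
    counts = {name: 0 for name in names}
    for _ in range(n_samples):
        sample = rng.choices(rows, k=len(rows))  # resample with replacement
        for name in select_predictors(sample, names):
            counts[name] += 1
    return {name: counts[name] / n_samples for name in names}


# Synthetic records: "cyanosis" tracks the outcome, "noise" does not.
rows = [
    {
        "outcome": i % 10 == 0,
        "cyanosis": i % 10 == 0 or i % 25 == 1,
        "noise": (i // 10) % 2 == 0,
    }
    for i in range(200)
]
names = ["cyanosis", "noise"]
frequency = bootstrap_selection_frequency(rows, names)
retained = [name for name in names if frequency[name] > 0.5]
print(frequency, retained)
```

Swapping in a real model-selection routine for `select_predictors` leaves the resampling and counting logic unchanged.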
Measurement of model performance
Sensitivity, specificity, positive and negative predictive values,
area under the receiver-operating characteristics (ROC) curve
(AUC), and percent correct classification were used to compare
performance. The best model was defined as that with the
greatest specificity among those with highest sensitivity, in
order to minimize missed pertussis cases, and also minimize
misclassification of those without pertussis.
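The performance measures and the model-selection rule just described can be written out as a short sketch. The counts and scores below are hypothetical and are not taken from the study; the AUC is computed with the rank-based (Mann-Whitney) definition, which is one common way to obtain it.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic measures from true/false positive/negative counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "percent_correct": (tp + tn) / (tp + fp + fn + tn),
    }


def auc(scores_pos, scores_neg):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as one half)."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def best_model(models):
    """Greatest specificity among the models with the highest sensitivity."""
    top = max(m["sensitivity"] for m in models.values())
    candidates = {k: m for k, m in models.items() if m["sensitivity"] == top}
    return max(candidates, key=lambda k: candidates[k]["specificity"])


# Hypothetical confusion-matrix counts for two candidate models.
models = {
    "model_a": diagnostic_metrics(tp=9, fp=30, fn=1, tn=60),
    "model_b": diagnostic_metrics(tp=10, fp=25, fn=0, tn=65),
}
print(best_model(models))
print(auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.4, 0.1]))
```

Ranking on sensitivity first reflects the stated priority of not missing pertussis cases; specificity only breaks ties among the most sensitive models.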
Comparing clinician performance with decision models
Clinicians’ actual performance was compared with the clinical,
local disease incidence, and contextualized models by measuring
correct classification. Clinician performance of correct classi-
fication was judged by utilization or omission of antibiotics in
the clinical encounter. The clinical actions taken, as determined
by chart review, were compared with what would have been
recommended based on the three decision models generated.
Four hundred and forty-three infants had a pertussis culture sent
from 2003 to 2007, and 38 (8.4%) tested positive. Nineteen
thousand nine hundred and seven cultures were performed at
the State Laboratory Institute of the Massachusetts Department
of Public Health during the study; 1103 (5.5%) tested positive.
For these 19 907 cultures, the weekly proportion positive ranged
from 0 to 32% (mean 6.8%, median 5.6%, interquartile range
(IQR) 3.3 to 8.8%). A mean of 4.2 cultures tested positive each
week (median 4, range 0 to 20, IQR 2 to 6). Weekly and monthly
(figure 1A,B) proportion positive at the state laboratory demon-
strate that the timing, height, width, and total number of
pertussis peaks vary annually.
Development of “clinical only” decision model
Infants testing positive for pertussis were younger, more likely to
have a history of apnea or cyanosis, or cough for at least
1 week and were less likely to have fever than those who tested
negative (table 1). There were no significant differences between
groups in gender, history of post-tussive
emesis or seizure, or exposure to a contact with known pertussis.
History of cyanosis was the best predictor of pertussis, followed
by history of cough for at least 1 week and absence of fever
(table 2). Gender, history of post-tussive emesis or seizure,
exposure to pertussis contact, and age in months were not
included in the final clinical only model.
Figure 1 (A) Weekly proportion positive pertussis culture, 2003–7. (B) Monthly proportion positive pertussis cultures, 2003–7. (A) and (B) demonstrate that the number, timing, height, and duration of pertussis peaks vary annually. While the graph suggests some seasonality, it also shows that pertussis varies substantially from year to year.
Development of “local disease incidence” decision model
Selection of incidence variables
In the 1–4 weeks prior to each visit date, the ranges of mean
numbers of cultures performed (102 to 110), positive (4.7 to 4.9)
and proportion positive (0.058–0.060) showed a small variation.
Because means, medians, ranges, and SD correlated closely for
1–7, 8–14, 15–21, and 22–28 days, metrics from a single time
interval were chosen to represent “local disease incidence.” Due
to the time required to definitively process a pertussis culture,4
8–14 days (2 weeks) was chosen as the best metric available for a
clinical application. A range of thresholds were examined to
determine the cut-off that would optimize specificity for
maximum sensitivity. The area under the ROC curve for these
variables was maximized when the proportion positive from 8 to
14 days prior exceeded 0.10 (p<0.0001).
Development of contextualized decision model
Among clinical variables, presence of cyanosis and cough for at
least 1 week met criteria for selection into the final logistic
regression model (table 3). The proportion positive 8–14 days
prior to the test date also met criteria for selection into the final
logistic regression model. Proportion positive thresholds ranging
from 0.01 to 0.20 in increments of 0.01 were tested. In
conjunction with the clinical variables above, the maximal area
under the ROC curve occurred with a cut-off of 0.10. The best
contextualized model contained three variables: history of
cyanosis, cough for at least 1 week, and proportion positive >0.10
eight to 14 days earlier. The incidence variable was a stronger
predictor than any clinical factor considered, except for history of
cyanosis.
All predictors from the multivariate analyses were validated by
the bootstrap method and retained in the final models. Cyanosis
was selected in over 99%, public health pertussis proportion
positive ≥0.10 eight to 14 days prior in over 90%, and cough for
at least 1 week in over 80% of 1000 bootstrap samples.
Measurement of performance of decision models
The best model derived in the clinical only analysis (history of
cyanosis, cough for at least 1 week and absence of fever), gener-
ated an area under the ROC of 0.80, with 89% sensitivity and
27% specificity (table 4). Addition of variables not significant in
the univariate analysis (gender, history of post-tussive emesis,
history of apnea, history of seizure, exposure to known pertussis
case, and age under 3 months) did not improve the area under the
ROC. The best local disease incidence model achieved only 13%
sensitivity and 53% specificity. In the best contextualized model
(history of cyanosis, proportion positive 8–14 days earlier ≥0.10,
and cough for at least 1 week) the area under the ROC was 0.82
with 100% sensitivity and 38% specificity.
The contextualized model outperformed the clinical and local
disease incidence models across all metrics (table 4). Compared
with the clinical model, the contextualized model achieved superior
sensitivity (100% vs 89%), specificity (38% vs 27%), PPV (15% vs 12%), NPV
(100% vs 96%), and area under the ROC (0.82 vs 0.80).
Comparing clinician performance with decision models
The percent of positives treated with antibiotics and the percent of negatives not treated with antibiotics were compared with hypothetical outcomes generated by the three decision models (table 5). The contextualized model missed no patients with pertussis. Among models that did not miss any cases (100% sensitivity), the contextualized model misclassified the fewest patients (62%, 95% CI 57% to 67%) without pertussis, which would have resulted in the most judicious antibiotic use and correct categorization of infants. While clinicians did not do as well as the contextualized model, clinicians outperformed the clinical only model, misclassifying about the same number of patients with pertussis (13% vs 11%) but misclassifying fewer patients without the disease (60% vs 73%) (table 5).
Table 1 Subject characteristics by culture result

| Characteristic | Pertussis negative (n=405) N (%) | Pertussis positive (n=38) N (%) | p Value |
|---|---|---|---|
| Mean age (months) | 3.4 (3, 1 to 5) | 2.3 (2, 1 to 3) | |
| Mean days of cough | 8.8 (4, 1 to 10) | 9.6 (7, 5 to 14) | 0.71 |
| Cough ≥1 week | | | |
| History of apnea | | | |
| History of cyanosis | | | |
| History of fever | | | |
| History of seizure | | | |
| Exposure to pertussis | | | |
| Public health incidence: mean percent positive pertussis tests 2 weeks prior (median, IQR) | 5.6 (4.8, 2.9 to 7.7) | 9.9 (8.8, 5.8 to 14) | |

*Pertussis culture has 100% specificity and variable sensitivity (4, 16).
IQR, interquartile range.
Table 2 Multivariate logistic regression analysis (“clinical only” model)

| Predictor | OR | 95% CI | p Value |
|---|---|---|---|
| Cyanosis | | 2.9 to 13.4 | |
| Cough ≥1 week | | 1.4 to 7.0 | |
| Absence of fever | | 1.3 to 118 | |
Table 3 Multivariate logistic regression analysis with local disease incidence variables included (contextualized model)

| Predictor | OR | 95% CI | p Value |
|---|---|---|---|
| Cyanosis | | 3.3 to 16 | |
| Proportion positive prior 8–14 days >10% | | 2.3 to 13 | |
| Cough ≥1 week | | 1.5 to 7.9 | |
Table 4 Performance of decision models with 95% CIs

| Metric | Clinical only | Local disease incidence only | Contextualized |
|---|---|---|---|
| Sensitivity (%) | 89 (79 to 99) | 13 | 100 (92 to 100) |
| Specificity (%) | 27 (22 to 32) | 53 | 38 (33 to 43) |
| Positive predictive value (%) | 12 (8 to 16) | | 15 (11 to 19) |
| Negative predictive value (%) | 96 (92 to 99.96) | | 100 (98 to 100) |
| Area under receiver-operating characteristics curve | 0.80 | 0.65 | 0.82 |

Sensitivity, specificity, positive and negative predictive value, and area under the receiver-operating characteristics curve for the best model in each category. The contextualized model was superior to the clinical model for all metrics, and was statistically better for sensitivity (p<0.04), specificity (p<0.001), and negative predictive value (p<0.02).
Clinicians make critical decisions in the face of uncertainty, and typically rely on individual clinical experience and discussion with close colleagues when making decisions about diagnosis and treatment.28–30 Previously, we showed that local disease incidence information about meningitis from a single hospital provides valuable epidemiologic context and enhances a decision model for distinguishing aseptic from bacterial meningitis.8 Here, we demonstrate for the first time how an external public health surveillance source improves a clinical decision model, by incorporating state-wide “epidemiologic context.”
Previous prediction models, derived from small numbers of patients, have identified clinical predictors of infantile pertussis like cyanosis and cough, and some models have considered seasonality, but none have incorporated local disease incidence.20 21 Seasonality is not a substitute for accurate real-time information about pertussis incidence as pertussis outbreaks are sporadic and do not follow consistent seasonal or geographic patterns.15 31–34 In our analysis, epidemiologic context was stronger than all but one clinical predictor (cyanosis). This finding underscores the importance of “situational awareness” in the clinical setting. Understanding the epidemiologic context in which a patient presents may provide critical information about the etiology of the patient’s problem, but currently, this type of information is not formally processed, considered, or utilized in
Our findings support a general approach of estimating clinical
risk of disease at the point of care, accounting for local disease
incidence. This approach uses epidemiologic context in the
clinical decision-making process rather than relying solely on
history, physical exam, heuristics, and preliminary diagnostic
test results.10 29 35 36 It is becoming increasingly feasible to deliver
public health information to clinicians at the point of care. The
emergence of robust, real-time surveillance systems, automated
reporting to public health, and widespread adoption of electronic
health records present opportunities for bidirectional communi-
cation between clinical practice and public health. Our study also
promotes the value of disease reporting and surveillance at the
We demonstrate a useful synergy between clinical and public
health information in the generation and refinement of clinical
decision rules. Public health data have not previously been used
to generate decision models because, while they contain detailed
information about those who test positive, they contain limited
information about those testing negative. Public health efforts
focus on tracking, interviewing, and following patients with
reportable diseases, so public health data sets contain far greater
detail about individuals who test positive. Data about those
testing negative are limited even further by patient privacy laws,
which prohibit collection of detailed information about people
without the reportable disease. This unbalanced data stream
creates a unique challenge to the use of public health datasets for
the creation of decision models, which rely on rich information
about patients both with and without the disease.37 In an effort
to use available high-quality data, we approached this problem
by integrating a large statewide public health dataset with a
detailed hospital-based clinical dataset to develop a decision
model for a disease with major public health importance:
pertussis.
The design of this study was retrospective, so a further vali-
dation would be necessary prior to integration into a clinical
setting.38 The retrospective nature of the study also required
basing the clinical models on patients who had pertussis tests
ordered, potentially biasing the sample toward subjects whom clinicians
already suspected of having pertussis. While the contextualized model
outperformed the clinical only model, the clinical only model is
most limited by its retrospective nature. First, a prospectively
derived clinical model where testing was based on symptoms
and structured data were acquired systematically might
improve the performance of a clinical model. Second, while the
study was carried out at a single site, this site provides care for
75% of the children who live in and around this large
metropolitan area. Third, the incidence data are state-wide,
while the patients are from a single, large metropolitan area.
Fourth, we relied on the most conservative method for evalu-
ating pertussis—culture—because it is widely regarded as the
gold standard.14 15 As delineated in the case definition section of
the methods, PCR may be oversensitive, and requires correlation
with at least 2 weeks of cough and paroxysm, whoop, or post-tussive
emesis,4 which are difficult to assess accurately in a
retrospective review. Serology is not recommended for infants,
and DFA is not widely available.19 Fifth, the study was limited
by a lack of immunization data on the subjects because primary
care records were not accessible for these ED patients. Sixth,
most patients did not have blood tests performed as part of the
evaluation, and so lymphocytosis could not be included in the
models; however, while lymphocytosis is classically associated
with pertussis, it has been shown to be neither sensitive nor
specific.
This study validates a scientific method for integrating
incidence data into a clinical decision model and suggests
that “epidemiologic context” could be an important compo-
nent of future clinical decision-support systems. A software
application integrated with an electronic health record might
display data to physicians about ambient public health condi-
tions and prompt appropriate management, treatment and
reporting processes based on a calculation that considered
patient factors in a specific epidemiologic situation. This
important refinement of clinical decision-making requires
communication between public health and clinical settings, and
programs to enable integration of public health data with
Funding This work was supported by grants K01HK000055 and 1 P01 HK000088 from
the Centers for Disease Control and Prevention and by G08LM009778 and R01
LM007677 from the National Library of Medicine.
Competing interests None.
Ethics approval The Committee on Clinical Investigation of Children’s Hospital Boston
approved the study.
Contributors All authors made substantial contributions to conception, design,
analysis, and interpretation of data. AMF and KDM drafted the manuscript, and all
authors were involved in revising it critically for important intellectual content and final
approval. AMF is guarantor.
Provenance and peer review Not commissioned; externally peer reviewed.

Misclassification of best models versus actual clinician

Model | Percent of patients pertussis positive but not treated (95% CI) | Percent of patients pertussis negative but treated (95% CI)
Clinician’s actual treatment per electronic chart review | 13% (2 to 23) | 60% (55 to 65)
Best clinical only model | 11% (1 to 21) | 73% (69 to 78)
Best local disease incidence | 61% (45 to 76) | 15% (11 to 18)
Best contextualized model | 0% (0 to 8) | 62% (57 to 67)

J Am Med Inform Assoc 2010;17:85–90. doi:10.1197/jamia.M3061
1. Cherry JD. Pertussis in adults. Ann Intern Med 1998;128:64–6.
2. Lee GM, Lett S, Schauer S, et al. Societal costs and morbidity of pertussis in
adolescents and adults. Clin Infect Dis 2004;39:1572–80.
3. Sotir MJ, Cappozzo DL, Warshauer DM, et al. A countywide outbreak of
pertussis: initial transmission in a high school weight room with subsequent
substantial impact on adolescents and adults. Arch Pediatr Adolesc Med
4. Murphy TV, Slade BA, Broder KR, et al. Prevention of pertussis, tetanus, and
diphtheria among pregnant and postpartum women and their infants:
recommendations of the Advisory Committee on Immunization Practices (ACIP).
MMWR Morb Mortal Wkly Rep 2008;57:1–47.
5. Davis JP. Clinical and economic effects of pertussis outbreaks. Pediatr Infect Dis J
6. Fiore AE, Shay DK, Haber P, et al. Prevention and control of influenza.
Recommendations of the Advisory Committee on Immunization Practices (ACIP),
2007. MMWR Recomm Rep 2007;56:1–54.
7. Fisman DN. Seasonality of infectious diseases. Annu Rev Public Health
8. Fine AM, Nigrovic LE, Reis BY, et al. Linking surveillance to action: incorporation of
real-time regional data into a medical decision rule. J Am Med Inform Assoc
9. Brownstein JS, Kleinman KP, Mandl KD. Identifying pediatric age groups for
influenza vaccination using a real-time regional surveillance system. Am J Epidemiol
10. Reis BY, Pagano M, Mandl KD. Using temporal context to improve biosurveillance.
Proc Natl Acad Sci U S A 2003;100:1961–5.
11. Overhage JM, Grannis S, McDonald CJ. A comparison of the completeness and
timeliness of automated electronic laboratory reporting and spontaneous reporting of
notifiable conditions. Am J Public Health 2008;98:344–50.
12. Kukafka R, Ancker JS, Chan C, et al. Redesigning electronic health record systems to
support public health. J Biomed Inform 2007;40:398–409.
13. Calugar A, Ortega-Sanchez IR, Tiwari T, et al. Nosocomial pertussis: costs of an
outbreak and benefits of vaccinating health care workers. Clin Infect Dis
14. Baughman AL, Bisgard KM, Cortese MM, et al. Utility of composite reference
standards and latent class analysis in evaluating the clinical accuracy of diagnostic
tests for pertussis. Clin Vaccine Immunol 2008;15:106–14.
15. Centers for Disease Control and Prevention. Manual for the surveillance of
vaccine-preventable diseases. 3rd edn. Atlanta, GA: CDC, 2002.
16. Halperin SA. The control of pertussis: 2007 and beyond. N Engl J Med
17. Mattoo S, Cherry JD. Molecular pathogenesis, epidemiology, and clinical
manifestations of respiratory infections due to Bordetella pertussis and other
Bordetella subspecies. Clin Microbiol Rev 2005;18:326–82.
18. Centers for Disease Control and Prevention (CDC). Outbreaks of
respiratory illness mistakenly attributed to pertussis: New Hampshire,
Massachusetts, and Tennessee, 2004–2006. MMWR Morb Mortal Wkly Rep
19. American Academy of Pediatrics. Section 3: Summaries of infectious diseases:
pertussis (whooping cough). In: Pickering LK, Baker CJ, Long SS, et al, eds. Red book
2006: report of the committee on infectious diseases. 27th edn. Elk Grove Village: AAP,
20. Guinto-Ocampo H, Bennett JE, Attia MW. Predicting pertussis in infants. Pediatr
Emerg Care 2008;24:16–20.
21. Mackey JE, Wojcik S, Long R, et al. Predicting pertussis in a pediatric emergency
department population. Clin Pediatr (Phila) 2007;46:437–40.
22. Gilbert EH, Lowenstein SR, Koziol-McLain J, et al. Chart reviews in emergency
medicine research: where are the methods? Ann Emerg Med 1996;27:305–8.
23. Gorelick MH, Yen K. The kappa statistic was representative of empirically
observed inter-rater agreement for physical findings. J Clin Epidemiol
24. Landis JR, Koch GG. The measurement of observer agreement for categorical data.
25. Austin PC, Tu JV. Bootstrap methods for developing predictive models. Am Stat
26. Efron B, Gong G. A leisurely look at the bootstrap, the jackknife, and cross-validation.
Am Stat 1983;37:36–48.
27. Glaser N, Barnett P, McCaslin I, et al. Risk factors for cerebral edema in children with
diabetic ketoacidosis. N Engl J Med 2001;344:264–9.
28. Patel VL, Kaufman DR, Arocha JF. Emerging paradigms of cognition in medical
decision-making. J Biomed Inform 2002;35:52–75.
29. Tversky A, Kahneman D. Judgment under uncertainty: heuristics and biases. Science
30. Wolf FM, Gruppen LD, Billi JE. Differential diagnosis and the competing-hypotheses
heuristic. A practical approach to judgment under uncertainty and Bayesian
probability. JAMA 1985;253:2858–62.
31. Crowcroft NS, Pebody RG. Recent developments in pertussis. Lancet
32. Fine PE, Clarkson JA. Seasonal influences on pertussis. Int J Epidemiol
33. Skowronski DM, De Serres G, MacDonald D, et al. The changing age and seasonal
profile of pertussis in Canada. J Infect Dis 2002;185:1448–53.
34. Tanaka M, Vitek CR, Pascual FB, et al. Trends in pertussis among infants in the
United States, 1980–1999. JAMA 2003;290:2968–75.
35. Croskerry P. The importance of cognitive errors in diagnosis and strategies to
minimize them. Acad Med 2003;78:775–80.
36. Morse SS. Global infectious disease surveillance and health intelligence. Health Aff
37. Wasson JH, Sox HC, Neff RK, et al. Clinical prediction rules. Applications and
methodological standards. N Engl J Med 1985;313:793–9.
38. Charlson ME, Ales KL, Simon R, et al. Why predictive indexes perform less well
in validation studies. Is it magic or methods? Arch Intern Med
39. Long S, Pickering L, Prober C. Principles and practice of pediatric infectious diseases.
3rd edn. New York: Churchill Livingstone, 2008.