261 Vol. 39, No. 4
April 2007 Family Medicine
The Use of Standardized Patients to Evaluate
Family Medicine Resident Decision Making
Richard Terry, DO; Erik Hiester, DO; Gary D. James, PhD
From the Wilson Family Practice Residency, Johnson City, NY.
Background and Objectives: This study was intended to assess the usefulness of standardized patients
(SPs) in the evaluation of family medicine residents’ clinical decision-making skills in ambulatory
settings. Methods: A pool of SPs was trained about the symptoms of one of three clinical conditions
(depression, headache, or irritable bowel syndrome). These patients were then surreptitiously
incorporated into the office hours of 11 residents on one occasion during their second year and once
during their third year of training. A different SP was used at each encounter, but the same clinical
case was presented each year for that particular resident. The SPs were given a questionnaire after
each office visit to evaluate whether the resident completed elements of history, physical exam, and
clinical decision making specific to their case and to report their satisfaction with the encounter. In
addition, the assessment component of the resident’s progress notes for each SP visit was reviewed
by a single examiner to determine if the correct diagnosis was made by the resident. Results: The
residents showed a statistically significant decrease in the percentage of checklist items completed
for all clinical cases from the second year (82.70%) to the third year (75.55%). However, the average
patient satisfaction was unchanged, as was the number of correct diagnoses, even though fewer
questions were asked. Conclusions: The use of SPs is a feasible and potentially useful method for
evaluating family medicine resident decision making. Several factors may account for the differences
in resident performance with SP scenarios.
(Fam Med 2007;39(4):261-5.)
The objective of residency training in family medicine
is to produce competent family physicians. Yet,
competency is difficult to quantify. Most family medicine
residency programs rely on subjective appraisals from
attending physicians and the In-training Examinations
as instruments to evaluate a resident’s general
competence.1 An objective structured clinical examination
(OSCE) as a measurement of clinical competence
has also been used in graduate medical education.1-4
While common, these forms of evaluation fail to
comprehensively measure clinical performance skills.
Standardized exams in particular, such as the In-training
Examination, have shown a poor correlation with
performance on the OSCE.5
The use of standardized patients (SPs) has proven to
be an effective tool to assess physicians’ competence.1
There is evidence that changing physician behavior is
most effective when data from real practice is used,6,7
and in particular, studies have demonstrated that
unannounced SPs can accurately assess a resident’s clinical
skills with regard to specific disease states.8
Family medicine residency training programs
emphasize longitudinal ambulatory care and thus provide
a unique setting in which to evaluate a resident’s
clinical competence over time.6 While many family
medicine residency programs are using an OSCE as
an evaluative method for residents, few if any are
integrating SPs in the resident’s practice panel.7,9 The
purposes of our study were (1) to examine whether the use of
SPs is a feasible method to evaluate residents’ clinical
competency, (2) to assess whether SPs can be used to
measure advancement of clinical skills through a family
medicine residency program, and (3) to evaluate
the challenges/limitations of using SPs as an evaluation tool.
This study was conducted over a 2-year period,
during which time we followed residents as they
progressed from their second to third year of training. The
protocol was submitted to the United Health Services
Hospitals’ Institutional Review Board and was
determined to be exempt from formal review.
The subjects of the study were a cohort of 11 residents,
five women and six men, whose ages ranged from 28 to 43.
These residents were part of the Wilson Hospital Family
Practice Program in Johnson City, NY. One resident
did not complete his third-year evaluation; hence,
only 10 residents completed both evaluations.
The residents were informed during their orientation
for the residency program that they would be evaluated
by SPs during their second and third year of training.
Though residents did not receive formal training in any
of the clinical cases with which they were presented, it
was assumed that during their PGY-1 year they would
have had adequate exposure to patients with these
conditions, since they are commonly encountered by
family physicians. In addition, as part of the PGY-1
residency curriculum, the residents received at least one
1-hour lecture on depression, headache, and irritable
bowel syndrome.
A pool of SPs was trained in one of three clinical
scenarios: depression, headache, and irritable bowel
syndrome. The SPs were paid, semi-professional ac-
tors who were also used to evaluate medical students
at the State University of New York (SUNY) Upstate
Medical University and thus had considerable experi-
ence portraying clinical scenarios. Their preparation
included reading handouts for each clinical case that
outlined elements of chief complaint, social and fam-
ily history, and recommended physical manifestations
(for example, flat affect for depression). In addition,
all SPs received 1 hour of individualized training by
the SP program coordinator 1 week prior to their visit
with the resident to review relevant symptoms on their
These SPs were then incorporated into the office
hours of 11 residents at different times during their sec-
ond and third year of training. The residents and office
staff were not informed that they would be seeing an
SP on that day. Each resident was evaluated two times
with only one of the above-mentioned cases, using a
different SP each time. The first time was during the
PGY-2 year, the second time during the PGY-3 year.
For example, resident “A” was first evaluated using an
SP trained with the depression case during the PGY-2
year of training. Then a different SP, trained with the
same depression case, was incorporated into this same
resident’s office hours during the third year of training.
The two SP checklist questionnaire responses
for each resident (PGY-2 and PGY-3) were
then compared.
The clinical questionnaires were selected from the
case log of the Association of Standardized Patient
Educators and used by the SUNY Upstate Medi-
cal University OSCE program. The SPs were given
a questionnaire after each office visit to evaluate if
the resident completed elements of history, physical
exam, and clinical decision making specific to each
clinical case (Table 1). For the purpose of analysis,
the SP was instructed to mark an item yes if the
resident asked the corresponding question on the
questionnaire and no if the resident did not.
For example, if the resident asked about suicidal
ideation for the depression case, then a yes was re-
corded for that item. In addition, there was a section
on the questionnaire for evaluation of the resident’s
ability to share information, determined by asking if
residents addressed what might be occurring with the
SP. Following this, there were two questions in regard
to patient-physician interaction, determined by asking
if the residents introduced themselves and if the resi-
dents washed their hands prior to the physical exam.
There was also a question asking if the SP was satisfied
with the encounter. Finally, there was a section on the
questionnaire for a numerical assessment of
communication (0=poor, 2=excellent). Questions to evaluate
communication included (1) Did the resident
demonstrate sensitivity and empathy regarding the concerns?
(2) Did the resident behave warmly and professionally
throughout the encounter? (3) Did the resident use
words easily understood when discussing the problem?
and (4) Did the resident encourage questions and never
avoid giving the answer? These data were all collected
for each resident and analyzed as noted below.
Feasibility was assessed by the ability to integrate
the SP into the resident’s practice without the SP be-
ing identified. It was also evaluated by the return of
completed questionnaires by the SPs.
For each resident, the percentage of questionnaire items
that were answered yes across the entire questionnaire
was calculated (Table 2). The scores on the last four
questions of the questionnaire, which dealt with
communication skills, were also averaged for each resident (Table 2).
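The two per-resident summary scores described above can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the function names are hypothetical, the example responses are invented, and the assumption that every yes/no item counts equally toward the percentage is ours rather than something the text states.

```python
from statistics import mean


def percent_yes(responses):
    """Return the percentage of yes/no checklist items marked yes (True)."""
    if not responses:
        raise ValueError("at least one checklist response is required")
    return 100.0 * sum(1 for r in responses if r) / len(responses)


def mean_communication(scores):
    """Average the communication items, each rated from 0 (poor) to 2 (excellent)."""
    if not scores:
        raise ValueError("at least one communication score is required")
    for s in scores:
        if not 0 <= s <= 2:
            raise ValueError("communication scores must lie between 0 and 2")
    return mean(scores)


# Hypothetical encounter: 24 of 30 yes/no items marked yes,
# and four communication items rated 2, 2, 1, and 2.
example_checklist = [True] * 24 + [False] * 6
example_communication = [2, 2, 1, 2]

print(percent_yes(example_checklist))             # 80.0
print(mean_communication(example_communication))  # 1.75
```

One such pair of values per resident and per training year would supply the inputs for the second-year versus third-year comparison.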
The statistical software SPSS for Windows (version
11.0, released September 19, 2001) was used to analyze
the data.7 Paired t tests were performed comparing
resident second-year to third-year performance
on total clinical case percentages, individual clinical
case percentages, and communication scores. The
chart notes for the clinical cases were reviewed by a single
examiner to determine if the resident arrived at the
correct diagnosis. Residents were given credit for the
correct diagnosis if the assessment in their progress
note included the mention of tension headache, irritable
bowel syndrome, or depression, depending on their
case.
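For readers unfamiliar with the paired t test, the following is a minimal sketch of the calculation using only the Python standard library. It is not the SPSS procedure used in the study. The function name and the per-resident percentages are hypothetical (the individual scores are not reproduced here), and the critical value is the standard two-sided 0.05 cutoff for 9 degrees of freedom, which corresponds to 10 paired residents.

```python
import math
from statistics import mean, stdev


def paired_t(second_year, third_year):
    """Return (t statistic, degrees of freedom) for paired observations.

    Each position in the two lists must refer to the same resident.
    """
    if len(second_year) != len(third_year):
        raise ValueError("both lists must contain one value per resident")
    diffs = [y3 - y2 for y2, y3 in zip(second_year, third_year)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical checklist percentages for 10 residents (illustration only).
pgy2 = [85, 80, 90, 78, 84, 88, 79, 83, 86, 81]
pgy3 = [78, 75, 85, 74, 70, 80, 77, 76, 79, 72]

T_CRIT_DF9 = 2.262  # two-sided 0.05 critical value for 9 degrees of freedom

t_stat, df = paired_t(pgy2, pgy3)
print(f"t = {t_stat:.2f}, df = {df}, significant = {abs(t_stat) > T_CRIT_DF9}")
```

A negative t statistic here means the third-year percentages were lower on average than the second-year percentages for the same residents.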
Our primary challenge was concealing the identity
of the SPs from the residents. We accomplished
this by spacing the SPs’ visits at least 1
month apart and using different SPs for each case.
Completed questionnaires were returned from all
but one SP encounter. The residents’ scores showed a
statistically significant decrease in the percentage of
questionnaire items checked yes for all clinical cases from
second year (82.70%) to third year (75.55%) (P<.05).
When each clinical scenario was analyzed individually,
there was no statistically significant difference
from second year to third year (depression P=.502,
headache P=.685, irritable bowel syndrome P=.067) in
the percentages of yes answers on the SP questionnaire.
There was also no statistically significant difference
in communication scores with patient encounters from
second year to third year (P=.669). The SPs were satisfied
with all but one encounter (Resident J in the second
year). The number of correct diagnoses was similar in the
second and third years (8 of 11 and 7 of 10, respectively)
(Table 2). There were two residents who
missed the diagnosis both as a second-year resident and
as a third-year resident. There was one resident who
arrived at the correct diagnosis as a PGY-2 but missed
the diagnosis as a PGY-3. There was also one resident
who had an incorrect diagnosis as a PGY-2 but arrived
at the correct diagnosis as a third-year resident.
Table 1. Standardized Patient Questionnaire
Did the resident ask?
1. Any family history of chief complaint
2. Any sexual history
3. If you take any prescription or OTC
4. If you smoke
5. If you drink alcohol
6. If you take any illegal drugs
7. If you have any allergies
8. About any aches and pains you
9. About the source of your unhappiness
10. About family history of emotional
11. About weight change
12. About change in appetite
13. Changes in sleep patterns
14. How long you have felt low or sad
15. Whether you have had any thoughts of
16. If you have had any hallucinations or
17. Specific questions about anemia or
18. Specific questions about possible thyroid
Did the resident: (Yes/No)
19. Check inside lower eyelid
20. Look in throat
21. Palpate thyroid
22. Feel lymph nodes
23. Listen to heart and lungs
24. Palpate abdomen
Did the resident?
25. Address what might be going on
26. Indicate need for counseling
27. Suggest antidepressant
Did the resident?
28. Introduce him/herself (Must give last name)
29. Wash his/her hands before the physical
30. I was satisfied with this encounter (Yes/No)
How was the resident you saw at: (0=Poor to 2=Excellent)
31. Demonstrating sensitivity and empathy regarding the concern
32. Behaving warmly and professionally throughout the encounter
33. Using words easily understood when discussing the problem
34. Encouraging questions and never avoiding giving answer
OTC: over the counter
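The diagnosis counts reported in the Results (8 of 11 in the second year, 7 of 10 in the third year) can be cross-checked against the paired outcomes described in the same paragraph. The short Python tally below is only an arithmetic sketch. The variable names are ours, and two values are inferred rather than stated in the text: the number of residents who were correct in both years, and the second-year result of the resident who did not complete the third-year evaluation.

```python
# Residents with both a PGY-2 and a PGY-3 evaluation.
PAIRED_RESIDENTS = 10

# Paired outcomes stated in the Results.
missed_both = 2
correct_then_missed = 1
missed_then_correct = 1

# Inferred: the remaining paired residents were correct in both years.
correct_both = PAIRED_RESIDENTS - (
    missed_both + correct_then_missed + missed_then_correct
)

pgy3_correct = correct_both + missed_then_correct
pgy2_correct_paired = correct_both + correct_then_missed

# Reported PGY-2 total is 8 of 11, so the one unpaired resident
# must account for the difference (an inference, not a stated result).
unpaired_pgy2_correct = 8 - pgy2_correct_paired


def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


print(correct_both, pgy3_correct, unpaired_pgy2_correct)  # 6 7 1
print(pct(8, 11), pct(7, 10))                             # 72.7 70.0
```

Under these assumptions the reported totals are internally consistent, and the proportions of correct diagnoses (about 73% and 70%) differ by less than three percentage points.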
Our study shows that it is feasible to use SPs to
evaluate family medicine residents’ clinical skills. We
were successful in maintaining the anonymity of the
SPs; in no case were they identified as SPs. This further
indicates that the training of SPs was adequate, since
they appeared to be genuine patients to our residents.
In addition, checklists were returned for all but one
encounter; the missing checklist belongs to the resident
who graduated early and thus did not complete the study.
This further supports the feasibility of our methods.
Our data demonstrate that the third-year residents
asked fewer history questions and performed fewer
physical exam maneuvers, yet there was no difference
between second-year and third-year residents’ ability
to arrive at the correct diagnosis. One could interpret
this to mean that a third-year resident is less thorough
than a second-year resident. However, this may instead
indicate that the third-year resident is clinically more
astute, did not need to ask as many history questions
or perform as many physical exam maneuvers to arrive
at the correct diagnosis, and thus used the valid clinical
shortcutting that comes with experience.
Table 2. Residents’ Ratings on Questionnaire Items
* Data represented as percentage of questionnaire checklist items that were marked yes by the SP (standardized patient).
** Resident C graduated early and did not complete third evaluation.
IBS: irritable bowel syndrome
Our data also showed that there was no statistically
significant difference in the SPs’ perception of resident
communication from second to third year. Factors that
may have influenced this include the decrease in allotted time
for the clinical encounter from 30 minutes to 20 minutes
for third-year residents and the implementation of an
electronic medical record (EMR) system for documentation
midway through the study period.
Thus, our study does not definitively answer whether
SPs can be used to measure the advancement of
a resident’s clinical skills longitudinally.
We are currently collecting more data and
expect that larger sample sizes will help answer this
question.
This study has several challenges and limitations.
First, the length of our study (1 year) may
be insufficient to comment on improvements
in a resident’s clinical decision making. Second, we
tested just three clinical cases of depression, headache,
and irritable bowel syndrome. This is by no means a
comprehensive measurement of a resident’s clinical
competence. Each resident had just two clinical ex-
posures; although the case remained the same, the SP
was different for each encounter. This may have led to
some variability in the patient rating of the provider.
Third, the clinical case questionnaire’s items were a
simple yes/no. This format may not accurately reflect
a resident’s clinical knowledge base. Finally, we did not
specifically quantify the extent to which the implemen-
tation of our EMR midway through our study affected
resident communication ratings.
Our study is unique in that we attempted to deter-
mine if there was a longitudinal improvement in a
resident’s clinical skills as assessed by SPs. Wilkinson
and Fontaine have demonstrated that observed SPs can
reliably assess a student’s clinical skills and that there
is high correlation between aggregate patient opinions
and aggregate OSCE scores.11 No study has examined
a resident’s clinical performance as evaluated by SPs
One might question whether the use of SPs was worth
the effort as compared to a standard OSCE testing sta-
tion or videotaping, because this scenario is contrived
and may lead to unauthentic physician behavior.7
However, cost and authenticity are advantages of
this program. There was no need for special sessions
or additional faculty time, and because SPs are the
sole evaluators, the only costs are to train the SPs and
One might also question the ethics of using SPs
to evaluate resident performance. Each resident was
aware, however, that at any time an SP may visit the
clinic. They were given the option to opt out of the
evaluation if they were uncomfortable with the pro-
gram. Most viewed this as an opportunity to advance
their clinical skills. We believe that by allowing the
resident to voluntarily participate, the ethics of the
program were upheld.
Our results also bring up an important discussion
about the proper measure of clinical competence.
Is competence defined as the ability to diagnose
or the ability to cover all questions and complete the
physical examination as determined by a question-
naire? It would be interesting to use the same SPs in
the evaluation of practicing physicians to see if this
trend of clinical shortcutting continues.
Our pilot study suggests that employing SPs may be
a useful tool to evaluate residents’ clinical competence.
We are currently using SPs in our program and plan
on expanding the program. Further studies with more
resident clinical exposures are needed to more fully
determine the validity of this approach.
Acknowledgments: This study was presented at the 2005 American College
of Osteopathic Family Physicians Conference in Phoenix and at the 2005
New York College of Osteopathic Medicine Educational Consortium Poster
Competition in Long Island, NY.
The authors would like to acknowledge administrative assistance by
Debra Van De Weert.
Corresponding Author: Address correspondence to Dr Terry, Wilson Family
Practice Residency, 40 Arch Street, Johnson City, NY 13790. 607-763-5334.
Fax: 607-763-5415. firstname.lastname@example.org.
1. Sloan D, Donnelly M, Schwartz R, Strodel W. The objective structured
clinical examination. Ann Surg 1995;222(6):735-42.
2. Dupras D, Li J. Use of an objective structured clinical examination to
determine clinical competence. Acad Med 1995;70(11):1029-34.
3. Peabody J, Luck J, Glassman P, Dresselhaus T, Lee M. Comparison of
vignettes, standardized patients, and chart abstraction. JAMA 2000;
4. Nagoshi M, Williams S, Kasuya R, Sakai D, Masaki K, Blanchette L.
Using standardized patients to assess the geriatrics medicine skills of
medical students, internal medicine residents, and geriatric medicine
fellows. Acad Med 2004;79(7):698-702.
5. Rifkin W, Rifkin A. Correlation between housestaff performance on the
United States Medical Licensing Examination and standardized patient
encounters. Mt Sinai J Med 2005;72(1):47-9.
6. Wendling A. Assessing resident competency in an outpatient setting.
Fam Med 2004;36(3):178-84.
7. Rethans JJ, Drop R, Sturmans F, Van Der Vleuten C. A method for
introducing standardized (simulated) patients into general practice
consultations. Br J Gen Pract 1991;41:94-6.
8. Wilk A, Jensen N. Investigation of a brief teaching encounter using
standardized patients. J Gen Intern Med 2002;17:356-60.
9. Skinner B, Newton W, Curtis P. The educational value of an OSCE in
a family practice residency. Acad Med 1997;72(8):722-4.
10. Garrison G, Bernard M, Rasmussen N. 21st-century health care: the
effect of computer use by physicians on patient satisfaction at a family
medicine clinic. Fam Med 2002;34(5):362-8.
11. Wilkinson T, Fontaine S. Patients’ global ratings of student competence.
Unreliable contamination or gold standard? Med Educ 2002;36:1117-