
A Controlled Trial of Arthroscopic Surgery for Osteoarthritis of the Knee


Abstract and Figures

Many patients report symptomatic relief after undergoing arthroscopy of the knee for osteoarthritis, but it is unclear how the procedure achieves this result. We conducted a randomized, placebo-controlled trial to evaluate the efficacy of arthroscopy for osteoarthritis of the knee. A total of 180 patients with osteoarthritis of the knee were randomly assigned to receive arthroscopic débridement, arthroscopic lavage, or placebo surgery. Patients in the placebo group received skin incisions and underwent a simulated débridement without insertion of the arthroscope. Patients and assessors of outcome were blinded to the treatment-group assignment. Outcomes were assessed at multiple points over a 24-month period with the use of five self-reported scores--three on scales for pain and two on scales for function--and one objective test of walking and stair climbing. A total of 165 patients completed the trial. At no point did either of the intervention groups report less pain or better function than the placebo group. For example, mean (+/-SD) scores on the Knee-Specific Pain Scale (range, 0 to 100, with higher scores indicating more severe pain) were similar in the placebo, lavage, and débridement groups: 48.9+/-21.9, 54.8+/-19.8, and 51.7+/-22.4, respectively, at one year (P=0.14 for the comparison between placebo and lavage; P=0.51 for the comparison between placebo and débridement) and 51.6+/-23.7, 53.7+/-23.7, and 51.4+/-23.2, respectively, at two years (P=0.64 and P=0.96, respectively). Furthermore, the 95 percent confidence intervals for the differences between the placebo group and the intervention groups exclude any clinically meaningful difference. In this controlled trial involving patients with osteoarthritis of the knee, the outcomes after arthroscopic lavage or arthroscopic débridement were no better than those after a placebo procedure.
N Engl J Med, Vol. 347, No. 2
July 11, 2002
The New England Journal of Medicine
Copyright © 2002 by the Massachusetts Medical Society
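J. Bruce Moseley, M.D., Kimberly O'Malley, Ph.D., Nancy J. Petersen, Ph.D., Terri J. Menke, Ph.D., Baruch A. Brody, Ph.D., David H. Kuykendall, Ph.D., John C. Hollingsworth, Dr.P.H., Carol M. Ashton, M.D., M.P.H., and Nelda P. Wray, M.D., M.P.H.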
(N Engl J Med 2002;347:81-8.)
From the Houston Veterans Affairs Medical Center (J.B.M., K.O., N.J.P.,
T.J.M., D.H.K., C.M.A., N.P.W.); the Department of Orthopedic Surgery
(J.B.M.), the Department of Medicine, Section of Health Services Research
(K.O., N.J.P., T.J.M., C.M.A., N.P.W.), and the Center for Medical Ethics
and Health Policy (B.A.B.), Baylor College of Medicine; and International
Survey Research (D.H.K.) — all in Houston; and the Laguna Honda Hos-
pital, San Francisco (J.C.H.). Address reprint requests to Dr. Wray at the
Section of Health Services Research, Baylor College of Medicine, 2002
Holcombe Blvd. (M.R. 152), Houston, TX 77030, or at nwray@bcm.
When medical therapy fails to relieve the pain of osteoarthritis of the knee, arthroscopic lavage or débridement is often recommended. More than 650,000 such procedures are performed each year at a cost of roughly $5,000 each. In uncontrolled studies of knee arthroscopy for osteoarthritis, about half the patients report relief from pain. However, the physiological basis for the pain relief is unclear.
There is no evidence that arthroscopy cures or arrests
the osteoarthritis. Therefore, we conducted a random-
ized, placebo-controlled trial to assess the efficacy of
arthroscopic surgery of the knee in relieving pain
and improving function in patients with osteoarthri-
tis. Both patients and assessors of outcome were blind-
ed to the treatment assignments.
Methods
The college and hospital institutional review board approved the protocol. A data and safety monitoring board monitored the study.
Study Participants
Participants were recruited from the Houston Veterans Affairs
Medical Center from October 1995 through September 1998.
Patients were eligible if they were 75 years old or younger, had osteoarthritis of the knee as defined by the American College of Rheumatology, reported at least moderate knee pain on average (≥4 on a visual-analogue scale ranging from 0 to 10) despite maximal medical treatment for at least six months, and had not undergone arthroscopy of the knee during the previous two years.
The severity of osteoarthritis in the study knee (that with the greatest pain-induced limitation of function) was assessed radiographically and graded on a scale of zero to four. The scores for the three compartments were added together to generate a severity grade of 0 to 12. Criteria for exclusion were a severity grade of 9 or higher, severe deformity, and serious medical problems.
All patients provided informed consent, which included writing
in their chart, “On entering this study, I realize that I may receive
only placebo surgery. I further realize that this means that I will
not have surgery on my knee joint. This placebo surgery will not
benefit my knee arthritis.” Of the 324 consecutive patients who
met the criteria for inclusion, 144 (44 percent) declined to par-
ticipate. Participants were younger than those who declined to
participate (52.3±11.3 years vs. 55.3±12.4 years, P=0.002), were
more likely to be white (62.2 percent vs. 50.7 percent, P=0.03),
and had more severe arthritis (25.0 percent vs. 12.5 percent with
grade 7 or 8 arthritis, P<0.001).
Randomization Process and Treatment Groups
Participants were stratified into three groups according to the
severity of osteoarthritis (grade 1, 2, or 3; grade 4, 5, or 6; and
grade 7 or 8). A stratified randomization process with fixed blocks
of six was used. Sealed, sequentially numbered, stratum-specific
envelopes containing treatment assignments were prepared and
given to the research assistant. After the patient was in the oper-
ating suite, the surgeon was handed the envelope. The treatment
assignment was not revealed to the patient.
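The randomization scheme described above (three severity strata, fixed blocks of six, one sequence of assignments per stratum) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how the envelope sequences were generated, so the use of a seeded pseudorandom shuffle, the function names, and the stratum labels are all hypothetical.

```python
import random

ARMS = ("placebo", "lavage", "debridement")
BLOCK_SIZE = 6  # fixed block size reported in the text


def stratum_for(severity_grade):
    """Map a radiographic severity grade to one of the three strata in the text."""
    if 1 <= severity_grade <= 3:
        return "grade 1-3"
    if 4 <= severity_grade <= 6:
        return "grade 4-6"
    if 7 <= severity_grade <= 8:
        return "grade 7-8"
    raise ValueError("grade is outside the strata described in the text")


def make_block(rng):
    """Return one shuffled block in which every arm appears equally often."""
    block = list(ARMS) * (BLOCK_SIZE // len(ARMS))
    rng.shuffle(block)
    return block


def build_envelopes(n_blocks_per_stratum, seed=0):
    """Build one sequentially ordered list of assignments per stratum.

    The seed and the number of blocks are illustrative choices, not
    values taken from the trial.
    """
    rng = random.Random(seed)
    envelopes = {}
    for stratum in ("grade 1-3", "grade 4-6", "grade 7-8"):
        sequence = []
        for _ in range(n_blocks_per_stratum):
            sequence.extend(make_block(rng))
        envelopes[stratum] = sequence
    return envelopes


if __name__ == "__main__":
    envelopes = build_envelopes(n_blocks_per_stratum=2)
    # The next unopened envelope for a patient with a grade 5 knee:
    print(envelopes[stratum_for(5)][0])
```

Because every block of six contains each arm exactly twice, the three groups stay balanced within each severity stratum after every completed block.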
Participants were randomly assigned to arthroscopic débridement,
arthroscopic lavage alone, or the placebo procedure. One orthope-
dist performed all the operations. Patients in the débridement group
or the lavage group received standard general anesthesia with endo-
tracheal intubation. Patients in the placebo group received a short-
acting intravenous tranquilizer and an opioid and spontaneously
breathed oxygen-enriched air.
After diagnostic arthroscopy in patients in the lavage group, the
joint was lavaged with at least 10 liters of fluid. Anything that could
be flushed out through arthroscopic cannulas was removed. Nor-
mally, no instruments were used to mechanically débride or remove
tissue. However, if a mechanically important, unstable tear in the
meniscus (e.g., a displaced “bucket-handle” tear) was encountered,
the torn portion was removed and the remaining meniscus was
smoothed to a firm, stable rim. (There is general agreement that it is inappropriate to leave this type of meniscal tear untreated [11,13,19,20].) No other débridement was performed.
After diagnostic arthroscopy in patients in the débridement group,
the joint was lavaged with at least 10 liters of fluid, rough articular
cartilage was shaved (chondroplasty was performed), loose debris
was removed, all torn or degenerated meniscal fragments were
trimmed, and the remaining meniscus was smoothed to a firm and
stable rim. No abrasion arthroplasty or microfracture was performed.
Typically, bone spurs were not removed, but any spurs from the
tibial spine area that blocked full extension were shaved smooth.
Placebo Procedure
To preserve blinding in the event that patients in the placebo
group did not have total amnesia, a standard arthroscopic débride-
ment procedure was simulated. After the knee was prepped and
draped, three 1-cm incisions were made in the skin. The surgeon
asked for all instruments and manipulated the knee as if arthros-
copy were being performed. Saline was splashed to simulate the
sounds of lavage. No instrument entered the portals for arthros-
copy. The patient was kept in the operating room for the amount
of time required for a débridement. Patients spent the night after
the procedure in the hospital and were cared for by nurses who
were unaware of the treatment-group assignment.
Postoperatively, there were two minor complications and no
deaths. Incisional erythema developed in one patient, who was giv-
en antibiotics. In a second patient, calf swelling developed in the leg
that had undergone surgery; venography was negative for thrombo-
sis. In no case did a complication necessitate the breaking of the
randomization code.
Postoperative care was delivered according to a protocol spec-
ifying that all patients should receive the same walking aids, grad-
uated exercise program, and analgesics. The use of analgesics after
surgery was monitored; during the two-year follow-up period,
the amount used was similar in the three groups.
End Points
Study personnel who were unaware of the treatment-group as-
signments performed all postoperative outcome assessments; the
operating surgeon did not participate in any way. Data on end points
were collected 2 weeks, 6 weeks, 3 months, 6 months, 12 months,
18 months, and 24 months after the procedure. To assess whether
patients remained unaware of their treatment-group assignment,
they were asked at each follow-up visit to guess which procedure
they had undergone. Patients in the placebo group were no more
likely than patients in the other two groups to guess that they had
undergone a placebo procedure. For example, at two weeks, 13.8
percent of the patients in the placebo group guessed that they had
undergone a placebo procedure, and 13.2 percent of the patients
in the lavage and débridement groups guessed that they had under-
gone a placebo procedure.
The primary end point was pain in the study knee 24 months after the intervention, as assessed by a 12-item self-reported Knee-Specific Pain Scale (KSPS) created for this study (see Supplementary Appendix 1, available with the full text of this article). Scores on this scale range from 0 to 100, with higher scores indicating more severe pain. In addition, to ensure our ability to detect any benefit, we used five secondary efficacy end points: two additional assessments of pain and three assessments of function at all time points. Arthritis pain in general (i.e., not specifically in the study knee) was assessed by means of the four-item pain subscale of the Arthritis Impact Measurement Scales (AIMS2-P). Higher scores on this subscale indicate more severe pain. Body pain (i.e., not necessarily from arthritis and not necessarily in the knee) was assessed with the 2-item pain subscale of the Medical Outcomes Study 36-item Short-Form General Health Survey (SF-36-P). Higher scores on this subscale indicate less severe pain. The AIMS2-P and the SF-36-P scores were transformed into scores on a scale from 0 to 100.
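The text does not give the formula used for the transformation to a 0-to-100 scale. A common choice is a linear rescaling of the raw subscale score, sketched below; the function name and the raw ranges in the example are assumptions for illustration, not values from the study.

```python
def rescale_to_100(raw, raw_min, raw_max):
    """Linearly map a raw subscale score onto a 0-to-100 scale.

    Assumption: a plain min-max rescaling. The study's exact scoring
    rules are not stated in the text.
    """
    if raw_max <= raw_min:
        raise ValueError("raw_max must be greater than raw_min")
    if not raw_min <= raw <= raw_max:
        raise ValueError("raw score is outside the stated raw range")
    return 100.0 * (raw - raw_min) / (raw_max - raw_min)


if __name__ == "__main__":
    # Hypothetical subscale with raw scores from 0 to 10.
    print(rescale_to_100(7.5, 0, 10))
```

A rescaling of this kind preserves the direction of each scale, which is why the text has to state separately that higher AIMS2-P scores mean more pain whereas higher SF-36-P scores mean less pain.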
Two self-reported measures of physical function were used: the 5-item walking–bending subscale from the AIMS2 (AIMS2-WB, transformed into scores on a scale from 0 to 100, with higher scores indicating more limited function) and the 10-item physical-function subscale from the SF-36 (SF-36-PF, transformed into scores on a scale from 0 to 100, with higher scores indicating better function). As an objective measure, we devised the Physical Functioning Scale (PFS) to record the amount of time in seconds that a patient required to walk 30 m (100 ft) and to climb up and down a flight of stairs as quickly as possible. Longer times indicate poorer functioning.
All six outcome scales had good reliability. The median Cronbach's alpha (according to analyses of data from eight time points for all scales) exceeded 0.80. Results for all the outcome measures at all the time points that are not reported here are summarized in Supplementary Appendix 2, available with the full text of this article.
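For readers unfamiliar with the reliability statistic, Cronbach's alpha for a k-item scale is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below implements that standard formula; the function name and the example responses are hypothetical and are not trial data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    if len(item_scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(item_scores[0])
    if k < 2 or any(len(row) != k for row in item_scores):
        raise ValueError("every respondent needs the same k >= 2 items")
    # Sample variance of each item across respondents.
    item_variance_sum = sum(variance(column) for column in zip(*item_scores))
    # Sample variance of the total (summed) score.
    total_variance = variance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


if __name__ == "__main__":
    # Hypothetical responses from four people to a three-item scale.
    responses = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
    print(cronbach_alpha(responses))
```

Values close to 1 indicate that the items move together; the threshold of 0.80 quoted in the text is a conventional marker of good internal consistency.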
Statistical Analysis
Our pilot study indicated that it would be feasible to recruit 60 patients per year. The trial was designed to have 90 percent power, with a two-sided type I error of 0.04, to detect a moderate effect size (0.55) between the placebo group and the combined arthroscopic-treatment groups in terms of body pain as measured by the SF-36-P at two years, with an enrollment of 180 patients and 16 or fewer lost to follow-up (i.e., 164 or more completing the two-year follow-up). The primary hypothesis was that the patients in the two arthroscopic-intervention groups combined would report the same amount of knee pain at two years as the patients assigned to the placebo group. All statistical tests compared the treatment groups in terms of the values at each visit rather than analyzing the changes from base line. (Scores for these changes [“change scores”] were analyzed, with results that did not differ from the results presented here.) The data and safety monitoring board reviewed interim data 15 months and 24 months after enrollment began, using the Haybittle–Peto group-sequential method, with stopping boundaries of P=0.001 for the two interim analyses. All reported P values are two-sided and have not been adjusted for multiple comparisons.
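The stated design figures can be checked approximately with a normal-approximation power calculation for a two-group comparison of means. The sketch below is not the authors' calculation: the formula is the textbook approximation, and the split of the 164 completers into 55 placebo patients and 109 patients in the combined arthroscopic groups is an assumption made only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample(effect_size, n1, n2, alpha):
    """Approximate power of a two-sided comparison of two means.

    effect_size is the standardized difference (difference divided by SD).
    Uses the normal approximation rather than the noncentral t distribution.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    noncentrality = effect_size * sqrt(n1 * n2 / (n1 + n2))
    return z.cdf(noncentrality - z_crit) + z.cdf(-noncentrality - z_crit)


if __name__ == "__main__":
    # Hypothetical 1:2 split of 164 completers (placebo vs. combined arms).
    print(round(power_two_sample(0.55, 55, 109, alpha=0.04), 3))
```

Under these assumptions the approximation gives a power of about 0.90, in line with the 90 percent quoted in the text.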
Our prespecified analytic strategy was to test, at all time points, for the superiority of the arthroscopic procedures over the placebo procedure. Lacking evidence of superiority, we tested for evidence that the arthroscopic procedures were equivalent to the placebo procedure by determining the extent to which the study was powered to reject the hypothesis that the arthroscopic treatments caused a small but clinically important improvement (the “minimal important difference”). The minimal important difference for a scale is the smallest change score associated with a patient’s perception of a change in health status, but it can vary somewhat according to the method of calculation and the study sample. Minimal important differences for each of the six study scales were calculated on the basis of the trial data by two different methods: the change ratings of patients (their scores on a single-item scale that asked patients if their condition was the same, somewhat better [or worse], or much better [or worse] than before surgery) and the standard error of measurement (the SD of the instrument multiplied by the square root of one minus its reliability coefficient). Estimates were also obtained from the literature. For each scale, we tested the hypothesis that the placebo procedure was equivalent to the arthroscopic procedures, using as the minimal important difference the midpoint of the range of the minimal important differences reported in the literature or
calculated on the basis of our data. If the 95 percent confidence interval around the estimated size of the effect does not include the minimal important difference, one can reject the hypothesis that the arthroscopic procedures have a small but clinically important benefit.

Table 1. Baseline Characteristics of the Three Study Groups.*

Characteristic                                        Placebo Group   Lavage Group   Débridement Group
                                                      (N=60)          (N=61)         (N=59)
Age (yr)                                              52.0±11.1       51.2±10.5      53.6±12.2
Male sex (%)                                          93.3            88.5           96.6
Race (%)
  White                                               60.0            59.0           61.0
  Black                                               31.7            31.2           22.0
  Other                                               8.3             9.9            17.0
Severity of osteoarthritis in knee (%)†
  Mild                                                28.3            27.9           30.5
  Moderate                                            46.7            45.9           45.8
  Severe                                              25.0            26.2           23.7
Analgesic use (%)‡
  Nonprescription                                     70.0            67.2           64.4
  Prescription                                        21.7            21.3           15.3
Mean score on Knee Society Clinical Rating Scale‡§
  Knee symptoms                                       49.4            50.2           51.4
  Function                                            62.2            62.4           57.6
Psychological attributes¶
  Anxiety                                             27.0±21.0       30.2±19.9      28.4±22.4
  Depression                                          20.0±32.0       28.1±37.2      22.0±35.3
  Expectations for benefit∥                           3.5±1.0         3.5±0.9        3.6±1.1
  Optimism                                            72.6±21.0       74.5±19.4      73.7±17.1
  Satisfaction with general health                    39.3±25.1       43.7±22.4      46.5±24.8
  Social functioning                                  65.5±25.6       60.3±23.9      67.6±25.2
  Somatization                                        11.3±12.7       9.6±12.4       10.0±10.7
  Stress                                              28.4±19.7       26.1±18.2      27.9±18.8
  Vitality                                            54.8±21.0       52.7±19.7      57.7±19.3

*There were no significant differences among the three groups. Plus–minus values are means ±SD.
†Severity of osteoarthritis was assessed by radiography.
‡Percentages are based on the number of patients who participated in the trial.
§Scores on the Knee Society Clinical Rating Scale range from 0 to 100, with higher scores indicating fewer knee-related symptoms.
¶Scores for all psychological attributes, except expectations for benefit, range from 0 to 100, with higher scores indicating more of the variable (e.g., more anxiety, depression, or optimism).
∥Expectations were retrospectively assessed two weeks after the procedure. Scores range from 1 to 5, with higher scores indicating higher expectations.

Results
A total of 180 patients underwent randomization;
60 were assigned to the placebo group, 61 to the la-
vage group, and 59 to the débridement group. Base-
line characteristics were similar in the three study
groups (Table 1).
At no point did either arthroscopic-intervention
group have greater pain relief than the placebo group
(Fig. 1, Table 2, and Supplementary Appendix 2). For
example, there was no difference in knee pain between
the placebo group and either the lavage group or the
débridement group at one year (mean [±SD] KSPS
scores, 48.9±21.9, 54.8±19.8, and 51.7±22.4, re-
spectively; P=0.14 for the comparison with the la-
vage group, and P=0.51 for the comparison with
the débridement group) or at two years (mean KSPS
scores, 51.6±23.7, 53.7±23.7, and 51.4±23.2, re-
spectively; P=0.64 and P=0.96, respectively). Sim-
ilarly, there was no significant difference in arthritis
pain between the placebo group and the lavage
group or the débridement group at one or two years
(Table 2).
Furthermore, at no time point did either arthro-
scopic-intervention group have significantly greater
improvement in function than the placebo group
(Fig. 2, Table 3, and Supplementary Appendix 2). For
example, there was no significant difference between
the placebo group and either the lavage group or the
débridement group in the self-reported ability to
walk and bend at one year (mean AIMS2-WB scores,
49.4±25.5, 49.6±29.1, and 56.4±28.4, respectively;
P=0.98 for the comparison with the lavage group,
and P=0.19 for the comparison with the débride-
ment group) or at two years (mean AIMS2-WB score,
53.8±27.5, 51.1±28.3, and 56.4±29.4, respectively;
P=0.61 and P=0.64, respectively). Indeed, objective-
ly measured walking and stair climbing were poorer in
the débridement group than in the placebo group at
two weeks (mean PFS score, 56.0±21.8 vs. 48.3±
13.4; P=0.02) and one year (mean PFS score, 52.5±
20.3 vs. 45.6±10.2; P=0.04) and showed a trend to-
ward worse functioning at two years (mean PFS score,
52.6±16.4 vs. 47.7±12.0; P=0.11) (Table 3).
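The comparison at two weeks can be reproduced approximately from the summary statistics alone. The sketch below uses an unequal-variance standard error and a normal approximation for the confidence interval and P value; the text does not state which test the authors used, so this choice is an assumption, as is the function name. The group sizes (59 for placebo, 57 for débridement) are the numbers of patients with data at two weeks as read from Table 3.

```python
from math import sqrt
from statistics import NormalDist


def compare_means(mean1, sd1, n1, mean2, sd2, n2, level=0.95):
    """Difference of two means with an approximate CI and two-sided P value.

    Assumption: unequal-variance standard error with a normal (z)
    approximation, not necessarily the test used in the trial.
    """
    diff = mean1 - mean2
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = NormalDist()
    z_crit = z.inv_cdf(1 - (1 - level) / 2)
    p_value = 2 * (1 - z.cdf(abs(diff) / se))
    return diff, (diff - z_crit * se, diff + z_crit * se), p_value


if __name__ == "__main__":
    # PFS at two weeks: placebo 48.3±13.4 (n=59) vs. débridement 56.0±21.8 (n=57).
    diff, (low, high), p = compare_means(48.3, 13.4, 59, 56.0, 21.8, 57)
    print(round(diff, 1), round(low, 1), round(high, 1), round(p, 2))
```

With these inputs the approximation gives a placebo-minus-débridement difference of -7.7 seconds with an interval of roughly -14.3 to -1.1 and a P value of about 0.02, which agrees with the corresponding entries in Table 3.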
Figure 1. Mean Values (and 95 Percent Confidence Intervals) on the Knee-Specific Pain Scale.
Assessments were made before the procedure and 2 weeks, 6 weeks, 3 months, 6 months, 12 months, 18 months, and 24 months after the procedure. Higher scores indicate more severe pain.
[Plot: x-axis, time after procedure (2 wk, 6 wk, 3 mo, 6 mo, 1 yr, 18 mo, 2 yr); y-axis, mean Knee-Specific Pain Scale score.]

Lacking evidence of the superiority of the arthroscopic treatments over the placebo procedure in relieving pain or improving function, we considered whether the 95 percent confidence intervals for the differences in outcome between each arthroscopic procedure and the placebo procedure included clinically important differences. The minimal important differences used for this evaluation were as follows: a difference of 13.5 points on the KSPS, 10.0 on the AIMS2-P, 11.8 on the SF-36-P, 12.8 on the AIMS2-WB, 11.3 on the SF-36-PF, and 4.5 on the PFS. At almost all time points during follow-up (72 of 84 comparisons), the confidence intervals excluded these minimal important differences.
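The two calculations behind this evaluation can be sketched as follows: the standard error of measurement described under Statistical Analysis (the SD multiplied by the square root of one minus the reliability coefficient), and the check of whether a confidence interval excludes the minimal important difference. The function names are hypothetical. The sign convention is an assumption inferred from the daggered entries in Tables 2 and 3: differences are placebo minus arthroscopic, so on scales where higher scores are worse a benefit of arthroscopy appears as a positive difference. For the SF-36 scales, where higher scores are better, the sign would have to be flipped first.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), as described in the text."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * sqrt(1 - reliability)


def ci_excludes_benefit(ci_low, ci_high, mid):
    """True if a benefit as large as `mid` lies above the whole interval.

    Assumes differences are placebo minus arthroscopic on a scale where
    higher scores are worse (an inference from the tables, not a stated rule).
    """
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")
    return ci_high < mid


if __name__ == "__main__":
    # Hypothetical instrument: SD of 20 points, reliability of 0.84.
    print(standard_error_of_measurement(20.0, 0.84))
    # Interval flagged in Table 2 (18 months, placebo vs. débridement),
    # checked against the AIMS2-P threshold of 10.0 on the assumption
    # that the surviving Table 2 panel is the AIMS2-P.
    print(ci_excludes_benefit(-4.5, 14.3, 10.0))
    # Unflagged interval from the same table (2 years, placebo vs. lavage).
    print(ci_excludes_benefit(-13.5, 5.1, 10.0))
```

The first interval reaches above 10.0 and so cannot rule out a clinically important benefit, whereas the second lies entirely below it; the same logic applied to the PFS threshold of 4.5 reproduces the flag on the -8.5 to 5.6 interval in Table 3.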
Discussion
This study provides strong evidence that arthroscopic
lavage with or without débridement is not
better than and appears to be equivalent to a place-
bo procedure in improving knee pain and self-report-
ed function. Indeed, at some points during follow-up,
objective function was significantly worse in the dé-
bridement group than in the placebo group.
Arthroscopy is the most commonly performed type of orthopedic surgery, and the knee is by far the most common joint on which it is performed. Uncontrolled, retrospective case series have reported substantial pain relief after arthroscopic lavage or arthroscopic débridement for osteoarthritis of the knee. In the only previous double-blind, randomized, controlled trial of knee arthroscopy of which we are aware, patients with minimal osteoarthritis as assessed by radiography were assigned to undergo arthroscopic lavage with either 3000 ml of fluid (treatment) or 250 ml of fluid (control) and were followed for one year. Both the treatment and the control groups reported improvement in function at 12 months, and although the report interprets the study as having proved the efficacy of lavage, there was no statistically significant difference between the groups in terms of the primary outcome at any point during follow-up.
ed after these procedures, some have proposed that
the fluid that is flushed through the knee during ar-
throscopy cleanses the knee of painful debris and in-
flammatory enzymes.
Others have suggest-
ed that the improvement is due to the removal of
flaps of articular cartilage, torn meniscal fragments,
hypertrophied synovium, and loose debris.
ever, our study found that outcomes after arthro-
scopic treatment are no better than those after a pla-
cebo procedure. This lack of difference suggests that
the improvement is not due to any intrinsic efficacy
of the procedures. Although patients in the placebo
groups of randomized trials frequently have improve-
ment, it may be attributable to either the natural his-
tory of the condition or some independent effect of
the placebo.
Because we found no evidence that lavage or débridement is superior to a placebo procedure, the question arises whether these arthroscopic procedures could have small but clinically important benefits that we missed because of our limited sample size. To evaluate this possibility, we determined the size of the clinical benefit that the trial was able to rule out, using the minimal important difference for each of our scales. Because estimates of minimal important differences based on different samples and different methods do not yield the same values, we used the midpoint of the range of available minimal important differences in order to test our hypothesis about the equivalence of the three procedures. For the great majority of comparisons, the 95 percent confidence intervals did not contain the minimal important difference, indicating that there was not a clinically important improvement that the study had simply failed to detect.

Table 2. Assessments of Pain.*

Time point   Placebo group         Lavage group          Débridement group
             No.   Score           No.   Score           No.   Score
Before       59    59.5±18.5       61    59.3±16.7       58    59.3±22.2
2 wk         59    47.9±23.9       59    51.9±20.3       58    53.2±21.7
6 wk         57    50.8±23.2       57    52.4±22.1       59    49.9±23.3
3 mo         56    50.1±21.3       59    53.7±23.1       58    49.9±21.7
6 mo         57    50.0±20.7       59    54.8±21.6       55    52.0±20.8
1 yr         54    53.6±22.1       57    57.8±23.5       51    53.3±25.4
18 mo        52    55.6±23.6       57    55.4±24.6       51    50.7±24.4
2 yr         55    52.5±25.1       56    56.7±24.1       53    54.0±23.3

Time point   Placebo group vs. lavage group             Placebo group vs. débridement group
             Difference   95% CI           P value      Difference   95% CI           P value
Before       0.2          -6.2 to 6.6      0.95         0.3          -7.2 to 7.7      0.94
2 wk                      -12.1 to 4.1     0.33                      -13.6 to 3.1     0.22
6 wk                      -10.1 to 6.8     0.70                      -7.7 to 9.4      0.84
3 mo                      -11.8 to 4.6     0.39                      -7.7 to 8.2      0.95
6 mo                      -12.5 to 3.0     0.23                      -9.8 to 5.7      0.60
1 yr                      -12.8 to 4.4     0.34                      -8.9 to 9.5      0.95
18 mo                     -8.9 to 9.4      0.95                      -4.5 to 14.3†    0.30
2 yr                      -13.5 to 5.1     0.37                      -10.8 to 7.7     0.75

*Plus–minus values are means ±SD. CI denotes confidence interval. Difference denotes the difference between mean scores. Scores were transformed into scores on a scale ranging from 0 to 100, with higher scores indicating more severe pain.
†This 95 percent confidence interval includes the minimal important difference.

One surgeon performed all the procedures in this
study. Consequently, his technical proficiency is crit-
ical to the generalizability of our findings. Our study
surgeon is board-certified, is fellowship-trained in
arthroscopy and sports medicine, and has been in
practice for 10 years in an academic medical center.
He is currently the orthopedic surgeon for a National Basketball Association team and was the physician for the men's and women's U.S. Olympic basketball teams in 1996.
The principal limitation of this study is that our
participants may not be representative of all candi-
dates for arthroscopic treatment of osteoarthritis of
the knee. Almost all participants were men, because
the study was conducted at a Veterans Affairs medical
center. We do not know whether our findings may
be generalized to women, although uncontrolled
studies do not indicate that there are differences between the sexes in responses to arthroscopic procedures. A selection bias might have been introduced by the fact that 44 percent of the eligible
patients declined to participate in the study. We be-
lieve this high rate of refusal to participate resulted
from the fact that all patients knew they had a one-
in-three chance of undergoing a placebo procedure.
Patients who agreed to participate might have been
so sure that an arthroscopic procedure would help
that they were willing to take a one-in-three chance
of undergoing the placebo procedure. Such patients
might have had higher expectations of benefit or
been more susceptible to a placebo effect than those
who chose not to participate.
Figure 2. Mean Values (and 95 Percent Confidence Intervals) on the Walking–Bending Subscale of the Arthritis Impact Measurement Scales (AIMS2).
Assessments were made before the procedure and 2 weeks, 6 weeks, 3 months, 6 months, 12 months, 18 months, and 24 months after the procedure. Higher scores indicate poorer functioning.
[Plot: x-axis, time after procedure (2 wk, 6 wk, 3 mo, 6 mo, 1 yr, 18 mo, 2 yr); y-axis, AIMS2 Walking–Bending Subscale score.]
If the efficacy of arthroscopic lavage or débride-
ment in patients with osteoarthritis of the knee is no
greater than that of placebo surgery, the billions of
dollars spent on such procedures annually might be
put to better use. This study has also shown the great
potential for a placebo effect with surgery, although
it is unclear whether this effect is due solely to the
natural history of the condition or whether there is
some independent effect. Researchers should recon-
sider the best ways of testing the efficacy of surgical
procedures performed purely for the improvement
of symptoms. In the debate about placebo-controlled
trials of surgery, the critical ethical considerations
surround the choice of the placebo. Finally, health
care researchers should not underestimate the place-
bo effect, regardless of its mechanism.
Supported by a grant from the Department of Veterans Affairs.
1. Owings MF, Kozak LJ. Ambulatory and inpatient procedures in the
United States, 1996. Vital and health statistics. Series 13. No. 139. Hyatts-
ville, Md.: National Center for Health Statistics, November 1998. (DHHS
publication no. (PHS) 99-1710.)
2. Baumgaertner MR, Cannon WD Jr, Vittore JM, Schmidt ES, Maurer
RC. Arthroscopic debridement of the arthritic knee. Clin Orthop 1990;
3. Bert JM, Maschka K. The arthroscopic treatment of unicompartmental
gonarthrosis: a five-year follow-up study of abrasion arthroplasty plus ar-
throscopic debridement and arthroscopic debridement alone. Arthroscopy
4. Chang RW, Falconer J, Stulberg SD, Arnold WJ, Manheim LM, Dyer
AR. A randomized, controlled trial of arthroscopic surgery versus closed-
needle joint lavage for patients with osteoarthritis of the knee. Arthritis
Rheum 1993;36:289-96.
5. Gross DE, Brenner SL, Esformes I, Gross ML. Arthroscopic treatment
of degenerative joint disease of the knee. Orthopedics 1991;14:1317-21.
6. Jackson RW, Silver R, Marans H. Arthroscopic treatment of degenera-
tive joint disease. Arthroscopy 1986;2:114.
7. Jennings JE. Arthroscopic debridement as an alternative to total knee
replacement. Arthroscopy 1986;2:123-4.
8. McLaren AC, Blokker CP, Fowler PJ, Roth JN, Rock MG. Arthroscopic
debridement of the knee for osteoarthritis. Can J Surg 1991;34:595-8.
9. Ogilvie-Harris DJ, Fitsialos DP. Arthroscopic management of the de-
generative knee. Arthroscopy 1991;7:151-7.
10. Rand JA. Role of arthroscopy in osteoarthritis of the knee. Arthros-
copy 1991;7:358-63.
11. Richards RN Jr, Lonergan RP. Arthroscopic surgery for relief of pain
in the osteoarthritic knee. Orthopedics 1984;7:1705-7.
12. Salisbury RB, Nottage WM, Gardner V. The effect of alignment on
results in arthroscopic debridement of the degenerative knee. Clin Orthop
13. Sprague NF III. Arthroscopic debridement for degenerative knee joint
disease. Clin Orthop 1981;160:118-23.
14. Timoney JM, Kneisl JS, Barrack RL, Alexander AH. Arthroscopy in
the osteoarthritic knee: long-term follow-up. Orthop Rev 1990;19:371-3,
15. Ike RW, Arnold WJ, Rothschild EW, Shaw HL. Tidal irrigation versus
conservative medical management in patients with osteoarthritis of the
knee: a prospective randomized study. J Rheumatol 1992;19:772-9.
16. Livesley PJ, Doherty M, Needoff M, Moulton A. Arthroscopic lavage
of osteoarthritic knees. J Bone Joint Surg Br 1991;73:922-6.
17. Workshop on etiopathogenesis of osteoarthritis. J Rheumatol 1986;
18. Kellgren JH, Lawrence JS. Radiological assessment of osteo-arthrosis.
Ann Rheum Dis 1957;16:494-502.
19. Jackson RW, Rouse DW. The results of partial arthroscopic meniscec-
tomy in patients over 40 years of age. J Bone Joint Surg Br 1982;64:481-5.
Time point         Placebo group              Lavage group               Débridement group
                   No. with data  Score*      No. with data  Score*      No. with data  Score*
Before procedure   59             48.5±14.5   59             50.0±14.3   58             52.1±20.2
2 wk               59             48.3±13.4   57             53.0±25.3   57             56.0±21.8
6 wk               56             45.9±12.0   54             49.5±19.4   58             51.7±24.7
3 mo               54             47.3±16.0   55             48.8±21.0   56             49.5±17.4
6 mo               54             47.0±13.0   52             49.4±20.4   54             49.8±17.4
1 yr               49             45.6±10.2   54             50.4±17.6   47             52.5±20.3
18 mo              46             48.5±12.4   49             51.1±18.8   44             52.8±20.9
2 yr               44             47.7±12.0   50             53.2±21.6   44             52.6±16.4

Time point         Placebo group vs. lavage group          Placebo group vs. débridement group
                   Difference between mean                 Difference between mean
                   scores (95% CI)          P value        scores (95% CI)          P value
Before procedure   −1.6 (−6.8 to 3.7)       0.56           −3.6 (−10.0 to 2.8)      0.27
2 wk               (−12.0 to 2.8)           0.22           (−14.3 to −1.1)          0.02
6 wk               (−9.7 to 2.5)            0.25           (−13.1 to 1.4)           0.11
3 mo               (−8.5 to 5.6)†           0.69           (−8.5 to 4.1)            0.49
6 mo               (−9.0 to 4.1)            0.47           (−8.7 to 3.1)            0.34
1 yr               (−10.6 to 0.9)           0.09           (−13.3 to −0.4)          0.04
18 mo              (−9.2 to 3.8)            0.41           (−11.5 to 2.8)           0.23
2 yr               (−12.8 to 1.8)           0.13           (−11.0 to 1.2)           0.11

*Scores indicate the number of seconds the patient required to walk 30 m (100 ft) and to climb up and down a flight of stairs as quickly as possible; longer times indicate poorer functioning. Plus–minus values are means ±SD. CI denotes confidence interval.
†This 95 percent confidence interval includes the minimal important difference.
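As a rough check on how the tabulated intervals relate to the group means, standard deviations, and numbers with data, the sketch below recomputes the difference between the placebo and lavage groups before the procedure (48.5±14.5 with 59 patients versus 50.0±14.3 with 59 patients). It assumes a large-sample normal approximation with unpooled variances; the statistical method actually used for the table is not stated here, and the function name `mean_difference_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_difference_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, level=0.95):
    """Difference between two group means with a normal-approximation CI.

    Assumption: independent groups, unpooled variances, z critical value.
    This is an illustration, not necessarily the method used in the trial.
    """
    diff = mean_a - mean_b
    # Standard error of the difference between two independent means.
    se = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    # Two-sided critical value, about 1.96 for a 95 percent interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se


# Placebo vs. lavage, before the procedure (values taken from the table).
diff, low, high = mean_difference_ci(48.5, 14.5, 59, 50.0, 14.3, 59)
print(f"{diff:.1f} ({low:.1f} to {high:.1f})")
```

Under these assumptions the result is about −1.5 (−6.7 to 3.7), close to the tabulated −1.6 (−6.8 to 3.7); the small gap is what one would expect from rounding of the published means and from using a z rather than a t critical value.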
88 · N Engl J Med, Vol. 347, No. 2 · July 11, 2002 ·
The New England Journal of Medicine
20. Rand JA. Arthroscopic management of degenerative meniscus tears in patients with degenerative arthritis. Arthroscopy 1985;1:253-8.
21. Meenan RF, Mason JH, Anderson JJ, Guccione AA, Kazis LE. AIMS2: the content and properties of a revised and expanded Arthritis Impact Measurement Scales health status questionnaire. Arthritis Rheum
22. Liang MH, Fossel AH, Larson MG. Comparisons of five health status instruments for orthopedic evaluation. Med Care 1990;28:632-42.
23. Ware JE, Sherbourne CD. The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Med Care 1992;30:
24. Kantz ME, Harris WJ, Levitsky K, Ware JE Jr, Davies AR. Methods for assessing condition-specific and generic functional status outcomes after total knee replacement. Med Care 1992;30:Suppl:MS240-MS252.
25. Haybittle JL. Repeated assessment of results in clinical trials of cancer treatment. Br J Radiol 1971;44:793-7.
26. Peto R, Pike MC, Armitage P, et al. Design and analysis of randomized clinical trials requiring prolonged observation of each patient. I. Introduction and design. Br J Cancer 1976;34:585-612.
27. Jones B, Jarvis P, Lewis JA, Ebbutt AF. Trials to assess equivalence: the importance of rigorous methods. BMJ 1996;313:36-9. [Erratum, BMJ
28. Altman DG, Bland JM. Absence of evidence is not evidence of absence. BMJ 1995;311:485.
29. Greene WL, Concato J, Feinstein AR. Claims of equivalence in medical research: are they supported by the evidence? Ann Intern Med 2000;
30. Jaeschke R, Singer J, Guyatt GH. Measurement of health status: ascertaining the minimal clinically important difference. Control Clin Trials
31. Kosinski M, Zhao SZ, Dedhiya S, Osterhaus JT, Ware JE Jr. Determining minimally important changes in generic and disease-specific health-related quality of life questionnaires in clinical trials of rheumatoid arthritis. Arthritis Rheum 2000;43:1478-87.
32. Wyrwich KW, Tierney WM, Wolinsky FD. Further evidence supporting an SEM-based criterion for identifying meaningful intra-individual changes in health-related quality of life. J Clin Epidemiol 1999;52:861-73.
33. Angst F, Aeschlimann A, Stucki G. Smallest detectable and minimal clinically important differences of rehabilitation intervention with their implications for required sample sizes using WOMAC and SF-36 quality of life measurement instruments in patients with osteoarthritis of the lower extremities. Arthritis Rheum 2001;45:384-91.
34. Kalunian KC, Moreland LW, Klashman DJ, et al. Visually-guided irrigation in patients with early knee osteoarthritis: a multicenter randomized, controlled trial. Osteoarthritis Cartilage 2000;8:412-8.
35. Hróbjartsson A, Gøtzsche PC. Is the placebo powerless? An analysis of clinical trials comparing placebo with no treatment. N Engl J Med 2001;344:1594-602. [Erratum, N Engl J Med 2001;345:304.]
Copyright © 2002 Massachusetts Medical Society.
... Arthroscopy is often considered a less invasive approach than open surgery, which generally results in less postoperative pain, smaller scars, and a faster recovery. For patients with knee OA, arthroscopy can be used to carry out procedures such as (Moseley et al., 2002). ...
... However, the efficacy of arthroscopy in the treatment of knee OA has been questioned. A notable study published in the New England Journal of Medicine in 2002, known as the "Moseley study", concluded that knee arthroscopy provided no significant benefit over conservative treatment, such as physical therapy and medication, for patients with knee OA (Moseley et al., 2002). This finding led to a reassessment of the practice of arthroscopy in cases of knee OA. ...
This article reviews modern approaches to the treatment of osteoarthritis (OA), a chronic joint disease that causes pain and functional limitation. Treatment options include physical therapy, weight management, medication, and complementary therapies for early OA. For more advanced cases, total joint arthroplasty (TJA) is an effective option, although it carries surgical risks. Patient education plays a crucial role, equipping individuals with information about OA and self-care strategies. Ongoing research aims to develop more effective therapies for this debilitating condition. The choice of treatment should be individualized, taking into account the severity of OA, the patient’s individual needs, and the goals of treatment. Effective management of OA involves a multifaceted approach focused on reducing pain and improving joint function and the patient’s quality of life.
... In Asian populations, MOWHTO tends to be more frequently performed in females than in males. Our findings might have been biased by the disproportionate female sex predominance [47][48][49][50]. Nevertheless, we believe that our study provides a reasonable evaluation of important factors associated with MOWHTO treatment success by applying machine learning to develop a clinically validate web-based model to provide patient-specific predictions for the management of medial compartmental osteoarthritis. ...
Background: Although high tibial osteotomy (HTO) is an established treatment option for medial compartment osteoarthritis, the predictive factors for HTO treatment success remain unclear. This study aimed to identify informative variables associated with HTO treatment success and to develop and internally validate machine learning algorithms to provide patient-specific predictions of which patients will achieve HTO treatment success for medial compartmental osteoarthritis. Methods: This study retrospectively reviewed patients who underwent medial opening-wedge HTO (MOWHTO) at our center between March 2010 and December 2015. The primary outcomes were the lack of conversion to total knee arthroplasty (TKA) and achievement of the minimal clinically important difference of improvement in the Knee Injury and Osteoarthritis Outcome Score (KOOS) at a minimum of five years postoperatively. Recursive feature selection was used to identify the combination of variables from an initial pool of 25 features that optimized model performance. Five machine learning algorithms (XGBoost, multilayer perceptron, support vector machine, elastic-net penalized logistic regression, and random forest) were trained using five-fold cross-validation three times and applied to an independent test set of patients. The performance of the model was evaluated by the area under the receiver operating characteristic curve (AUC). Results: A total of 231 patients were included, and 200 patients (86.6%) achieved treatment success at the mean of 9 years of follow-up.
A combination of seven variables optimized algorithm performance, and specific cutoffs increased the likelihood of MOWHTO treatment success: body mass index (BMI) ≤26.8, preoperative KOOS for pain ≤46.0, preoperative KOOS for quality of life ≤33.0, preoperative International Knee Documentation Committee score ≤42.0, preoperative Short-Form 36 questionnaire (SF-36) score >42.25, three-month postoperative hip-knee-ankle angle >1.0, and three-month postoperative medial proximal tibial angle (MPTA) >91.5 and ≤94.7. The random forest model demonstrated the best performance (F1 score: 0.93; AUC: 0.81) and was transformed into an online application as an educational tool to demonstrate machine learning capabilities. Conclusions: The random forest machine learning algorithm best predicted MOWHTO treatment success. Patients with a lower BMI, poor clinical status, slight valgus overcorrection, and postoperative MPTA <94.7, more frequently achieved a greater likelihood of treatment success. Level of Evidence: Level III, retrospective cohort study.
... This soft-tissue-friendly approach enables faster wound healing and rehabilitation and has made arthroscopy one of the most frequently performed outpatient surgical interventions. Critical consideration should be given to whether the fairly minor surgical procedure is not performed beyond guideline-recommended indications and is supported by current evidence (28,21). In addition to a general increase in the number of arthroscopies performed, there has also been a widespread increase in scientific publications (seen as the vehicle through which new discoveries in this field of orthopaedics are conveyed to the rest of the world) (29,39). ...
PURPOSE OF THE STUDY A global bibliometric comparison of the level of scientific interest and output in the two research areas hip and knee arthroscopy (H-ASC and K-ASC) was carried out. In addition, the different degrees of publication activity in the countries and institutes performing this research were investigated. MATERIAL AND METHODS Publications from 1945-2020 listed in the Web of Science Core Collection were included in the study. Using the web application Science Performance Evaluation (SciPE), quantitative and qualitative aspects were evaluated. Subsequently, the date of publication, author information, and other metadata were analysed. RESULTS Since 1945, 3,924 studies have been published on K-ASC and 2,163 on H-ASC. The majority of the publications which have appeared since 2016 dealt with the topic of H-ASC (H-ASC: 241.2 publications/year; K-ASC: 217.4 publications/year). The USA published the most on both topics (H-ASC: 1,123 publications; K-ASC: 1,078 publications). More countries and institutes participated in K-ASC (3,008 institutes, 82 countries) than in H-ASC (103 institutes, 57 countries). The ten institutes with the most publications accounted for 36.71% and 12.34% of all publications on H-ASC and K-ASC, respectively. H-ASC received 78.12% of its funding from private sponsors while K-ASC was supported mainly by governmental/nonprofit sponsors (70.92%). CONCLUSIONS This study provides the first scientific comparison between H-ASC and K-ASC. Measured by qualitative and quantitative aspects, K-ASC was the most flourishing research area overall. In the last ten to five years, interest has shifted towards H-ASC with an increasing number of publications and a higher rate of citations. Key words: knee arthroscopy, hip arthroscopy, bibliometric comparison.
... In another systematic review, authors found no difference between patients receiving debridement or partial meniscectomy versus nonoperative management [28]. In addition, many studies and systematic reviews found lavage and debridement increase risk of infection and other complications and may cause patients to require TKA sooner [27, 29-33]. Given a lack of efficacy and safety concerns, the majority of studies indicate lavage and debridement is not cost-effective compared to non-operative management, in terms of health-related quality of life and QALY [34][35][36]. ...
Osteoarthritis is a chronic, degenerative disease leading to pain and decreased functionality in millions of people in the United States every year. The knee joint is the most commonly affected joint and has a direct impact on patients’ mobility and ability to perform activities of daily living. Osteoarthritis is treated conservatively with a variety of modalities, including anti-inflammatories, physical therapy, corticosteroid, viscosupplementation, and platelet-rich plasma injections. The gold standard surgical treatment for eligible patients with severe osteoarthritis is total or unicompartmental knee arthroplasty. However, many patients may not be candidates for, or may opt to delay, knee replacement but still require surgery. In these cases, patients may receive lavage and debridement, microfracture, osteochondral allograft or autograft transplantation, autologous chondrocyte implantation, or high tibial osteotomy. This review will provide an overview of literature surrounding these non-arthroplasty, surgical treatment options for treating cartilage defects and knee osteoarthritis. In addition to being evaluated based on efficacy and discussing the ideal patient populations for each procedure, these surgeries will be compared in terms of cost-effectiveness and impact on quality of life.
... In comparison to literature prior to 2021, ChatGPT3's information was limited with arthroscopic debridement or lavage being found to have no significant benefit for knee OA at 2 years with inconclusive evidence for arthroscopic meniscal debridement [7,8]. Furthermore, limited information on unicompartmental knee arthroplasty (UKA) and high tibial osteotomy (HTO) was provided, failing to mention any differences between the two interventions. ...
Chat Generative Pre-Trained Transformer 3 (ChatGPT3) is an open artificial intelligence (AI) platform that uses deep learning to produce human-like text, which could greatly reduce the time spent on literature search, data analysis, and research writing in the future and ensure academic standards of writing. This study aims to evaluate the information provided by ChatGPT3 and its use in orthopedic surgery research writing. A series of five prompted questions on the surgical management of knee osteoarthritis (OA) was put to ChatGPT3. The answers were reviewed and scrutinized for how up to date, accurate, and succinctly presented the information was in text and referencing. The information ChatGPT3 provided was accurate, albeit surface-level. It lacked analytical abilities, missed vital studies, and all reference links were incorrect. It seems that while the algorithm has access to all information on the internet until 2021, it lacked the analytical ability to dissect important limitations about knee OA, which would not be conducive to potentiating creative ideas and solutions in orthopedic surgery. ChatGPT3 only promotes convergent thinking and prevents innovation, and its use should therefore be limited within the scope of research or at least reviewed under the guidance of experts.
Placebo effects raise some fundamental questions concerning the nature of clinical and medical research. This Element begins with an overview of the different roles placebos play, followed by a survey of significant studies and dominant views about placebo mechanisms. It then critically examines the concept of placebo and offers a new definition that avoids the pitfalls of other attempts. The main philosophical lesson is that background medical theories provide the ontology for clinical and medical research. Because these theories often contain incoherent and arbitrary classifications, the concept of placebo inherits the same messiness. The Element concludes by highlighting some impending challenges for placebo studies.
Osteoarthritis (OA) is a condition that can cause substantial pain, loss of joint function, and a decline in quality of life in patients. Numerous risk factors, including aging, genetics, and injury, have a role in the onset of OA, characterized by structural changes within the joints. Most therapeutic approaches focus on the symptoms and try to change or improve the structure of the joint tissues. Even so, no treatments have been able to stop or slow the progression of OA or give effective and long-lasting relief of symptoms. In the absence of disease-modifying drugs, regenerative medicine is being investigated as a possible treatment that can change the course of OA by changing the structure of damaged articular cartilage. In regenerative therapy for OA, mesenchymal stem cells (MSCs) have been the mainstay of translational investigations and clinical applications. In recent years, MSCs have been discovered to be an appropriate cell source for treating OA due to their ability to expand rapidly in culture, their nontumorigenic nature, and their ease of collection. MSCs’ anti-inflammatory and immunomodulatory capabilities may provide a more favorable local environment for the regeneration of injured articular cartilage, which was thought to be one of the reasons why they were seen as more suited for OA. In addition to bone marrow, MSCs have also been isolated from adipose tissue, synovium, umbilical cord, cord blood, dental pulp, placenta, periosteum, and skeletal muscle. Adipose tissue and bone marrow are two of the most essential tissues for therapeutic MSCs. Positive preclinical and clinical trial results have shown that, despite current limitations and risks, MSC-based therapy is becoming a promising approach to regenerative medicine in treating OA.
Over the last two decades, there has been a growing emphasis on publication quality in foot and ankle research. A level-of-evidence rating system for clinical scientific papers has been proposed by the Centre for Evidence-Based Medicine in Oxford, United Kingdom. Unlike other subspecialties, foot and ankle surgery deals with a wide variety of clinical problems and surgical solutions, which in turn leads to a generally low number of patients available for study groups. However, level III and IV studies still have a valuable place in orthopaedic research, given the challenges in running high-level studies. The measurement of outcomes in medicine from the patient’s perspective (PROMs: patient-reported outcome measures) has grown almost exponentially in all surgical specialties, including foot and ankle surgery. There are many PROMs available to foot and ankle surgeons, but there is little consensus on which assessment is most appropriate for a given procedure or diagnosis. Their use in research and clinical practice offers many advantages, but there are also some downsides.
A 36-item short-form (SF-36) was constructed to survey health status in the Medical Outcomes Study. The SF-36 was designed for use in clinical practice and research, health policy evaluations, and general population surveys. The SF-36 includes one multi-item scale that assesses eight health concepts: 1) limitations in physical activities because of health problems; 2) limitations in social activities because of physical or emotional problems; 3) limitations in usual role activities because of physical health problems; 4) bodily pain; 5) general mental health (psychological distress and well-being); 6) limitations in usual role activities because of emotional problems; 7) vitality (energy and fatigue); and 8) general health perceptions. The survey was constructed for self-administration by persons 14 years of age and older, and for administration by a trained interviewer in person or by telephone. The history of the development of the SF-36, the origin of specific items, and the logic underlying their selection are summarized. The content and features of the SF-36 are compared with the 20-item Medical Outcomes Study short-form.
The non-equivalence of statistical significance and clinical importance has long been recognised, but this error of interpretation remains common. Although a significant result in a large study may sometimes not be clinically important, a far greater problem arises from misinterpretation of non-significant findings. By convention a P value greater than 5% (P>0.05) is called “not significant.” Randomised controlled clinical trials that do not show a significant difference between the treatments being compared are often called “negative.” This term wrongly implies that the study has shown that there is no difference, whereas usually all that has been shown is an absence of evidence of a difference. These are quite different statements.
The sample size of controlled trials is generally inadequate, with a consequent lack of power to detect real, and clinically worthwhile, differences in treatment. Freiman et al [1] found that only 30% of a sample of 71 trials published in the New England Journal of Medicine in 1978-9 with P>0.1 were large enough to have a 90% chance of detecting even a 50% difference in the effectiveness of the treatments being compared, and they found no improvement in a similar sample of trials published in 1988. To interpret all these “negative” trials as providing evidence of the ineffectiveness of new treatments is clearly wrong and foolhardy. The term “negative” should not be used in this context [2].
A recent example is given by a trial comparing octreotide and sclerotherapy in patients with variceal bleeding [3]. The study was carried out on a sample of only 100 despite a reported calculation that suggested that 1800 patients were needed. This trial had only a 5% chance of getting a statistically significant result if the stated clinically worthwhile treatment difference truly existed. One consequence of such low statistical power was a wide confidence interval for the treatment difference. The authors concluded that the two treatments were equally effective despite a 95% confidence interval that included differences between the cure rates of the two treatments of up to 20 percentage points.
Similar evidence of the dangers of misinterpretation of non-significant results is found in numerous metaanalyses (overviews) of published trials, when few or none of the individual trials were statistically large enough. A dramatic example is provided by the overview of clinical trials evaluating fibrinolytic treatment (mostly streptokinase) for preventing reinfarction after acute myocardial infarction. The overview of randomised controlled trials found a modest but clinically worthwhile (and highly significant) reduction in mortality of 22% [4], but only five of the 24 trials had shown a statistically significant effect with P<0.05. The lack of statistical significance of most of the individual trials led to a long delay before the true value of streptokinase was appreciated.
While it is usually reasonable not to accept a new treatment unless there is positive evidence in its favour, when issues of public health are concerned we must question whether the absence of evidence is a valid enough justification for inaction. A recent publicised example is the suggested link between some sudden infant deaths and antimony in cot mattresses. Statements about the absence of evidence are common, for example in relation to the possible link between violent behaviour and exposure to violence on television and video, the possible harmful effects of pesticide residues in drinking water, the possible link between electromagnetic fields and leukaemia, and the possible transmission of bovine spongiform encephalopathy from cows. Can we be comfortable that the absence of clear evidence in such cases means that there is no risk or only a negligible one?
When we are told that “there is no evidence that A causes B” we should first ask whether absence of evidence means simply that there is no information at all. If there are data we should look for quantification of the association rather than just a P value. Where risks are small P values may well mislead: confidence intervals are likely to be wide, indicating considerable uncertainty. While we can never prove the absence of a relation, when necessary we should seek evidence against the link between A and B, for example from case-control studies. The importance of carrying out such studies will relate to the seriousness of the postulated effect and how widespread is the exposure in the population.
References
1. Freiman JA, Chalmers TC, Smith H, Kuebler RR. The importance of beta, the type II error, and sample size in the design and interpretation of the randomized controlled trial: survey of two sets of “negative” trials. In: Bailar JC, Mosteller F, eds. Medical uses of statistics. 2nd ed. Boston, MA: NEJM Books, 1992:357-73.
2. Chalmers I. Proposal to outlaw the term “negative trial.” BMJ 1985;290:1002.
3. Sung JJY, Chung SCS, Lai C-W, Chan FKL, Leung JWC, Yung M-L, Kassianides C, et al. Octreotide infusion or emergency sclerotherapy for variceal haemorrhage. Lancet 1993;342:637-41.
4. Yusuf S, Collins R, Peto R, Furberg C, Stampfer MJ, Goldhaber SZ, et al. Intravenous and intracoronary fibrinolytic therapy in acute myocardial infarction: overview of results on mortality, reinfarction and side-effects from 33 randomized controlled trials. Eur Heart J 1985;6:556-85.
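The point about inadequate sample size and low power can be made concrete with a small calculation. The sketch below approximates the power of a two-sided comparison of two group means using a normal approximation. All inputs are hypothetical (a true difference of 10 points, a standard deviation of 20 points, and several group sizes) and are not taken from any of the trials discussed; the function name `two_sample_power` is likewise an assumption for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two group means.

    Assumptions: equal group sizes, a common known standard deviation,
    and a normal approximation; real trials typically use t-based methods.
    """
    std_normal = NormalDist()
    # Standard error of the difference between the two group means.
    se = sd * sqrt(2.0 / n_per_group)
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(delta) / se
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical scenario: true difference of 10 points, SD of 20 points.
for n in (20, 60, 200):
    power = two_sample_power(delta=10.0, sd=20.0, n_per_group=n)
    print(f"n per group = {n}: power = {power:.2f}")
```

With these made-up inputs, a trial of 20 patients per group would miss the difference more often than it detects it, which is the situation in which a non-significant result says little about the absence of an effect.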
Background: Most clinical studies are done to show comparative superiority, but many reports now claim equivalence between the investigated entities. These assertions may not always be supported by the methods used and the results obtained. Purpose: To assess the justification and support for claims of clinical or therapeutic equivalence in medical journals. Data Sources: A search of MEDLINE for articles published from 1992 through 1996. Study Selection: From 1209 citations that contained the word equivalence in the title or abstract or contained the Medical Subject Heading therapeutic equivalency, we excluded 1121 studies reporting nonoriginal research, purely laboratory or other nonhuman research, and studies in which equivalence was not the main claim. The remaining 88 eligible papers were evaluated for five methodologic attributes. Data Synthesis: Only 45 (51%) of the 88 reports were specifically aimed at studying equivalence; the others either tried to show superiority or did not state a research aim. The quantitative distinctions regarded as equivalent ranged from 0% to 21% for direct increments and from 0% to 76% for proportionate differences. An equivalence boundary was set and confirmed with an appropriate statistical test in only 23% of reports. In 67% of reports, equivalence was declared after a failed test for comparative superiority, and in 10%, the claim of equivalence was not statistically evaluated. The sample size needed to confirm results had been calculated in advance for only 33% of reports. Sample size was 20 patients per group or fewer in 25% of reports. Conclusions: Many studies of clinical equivalence do not set boundaries for equivalence. Claims of difference or similarity are often made not by thoughtful examination of the data but by tests of statistical significance that are often misapplied or accompanied by inadequate sample sizes. These methodologic flaws can lead to false claims, inconsistencies, and harm to patients.
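The distinction between setting an equivalence boundary in advance and declaring equivalence after a failed superiority test can be sketched in a few lines. The example below classifies a confidence interval for a difference against a prespecified symmetric margin. The margin values, the interval, and the function name `classify_interval` are hypothetical, and the interval-inclusion rule shown is one common way of framing equivalence, not the procedure of any particular study.

```python
def classify_interval(ci_low, ci_high, margin):
    """Compare a confidence interval for a difference with +/- margin.

    Assumption: `margin` is a prespecified, clinically motivated boundary.
    Returns one of three labels describing what the interval supports.
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")
    if -margin < ci_low and ci_high < margin:
        # Every plausible difference is smaller than the boundary.
        return "equivalence supported"
    if ci_low >= margin or ci_high <= -margin:
        # Every plausible difference is at least as large as the boundary.
        return "difference exceeds margin"
    # The interval straddles the boundary: the data cannot settle it.
    return "inconclusive"


# Hypothetical interval that includes zero (a "non-significant" result).
print(classify_interval(-6.8, 3.7, margin=10.0))
print(classify_interval(-6.8, 3.7, margin=5.0))
```

The same non-significant interval supports equivalence under the wider hypothetical margin and is inconclusive under the narrower one, which is why the boundary has to be chosen before looking at the data.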
This retrospective study reviews 43 patients with osteoarthritis of the knee who underwent arthroscopic surgery, from January 1979 until April 1982. Percutaneous drilling of an osteochondral defect in the femoral condyle was performed in 22 patients to relieve the high intraosseous pressure and rest pain associated with this disease; successful results were recorded in 80% of patients at an average followup of 25.1 months. Partial meniscectomy was performed in 21 patients to remove an obstructing degenerative meniscal tear; 81% had successful results with an average followup of 40.6 months. There were no postoperative complications. Percutaneous drilling and excision of degenerative meniscal tears can be valuable arthroscopic procedures in properly selected patients with osteoarthritis of the knee.