
American Journal of Epidemiology

Copyright © 1998 by The Johns Hopkins University School of Hygiene and Public Health

All rights reserved

Vol. 147, No. 8

Printed in USA.

Confidence Limits Made Easy: Interval Estimation Using a Substitution

Method

Leslie E. Daly

The use of confidence intervals has become standard in the presentation of statistical results in medical

journals. Calculation of confidence limits can be straightforward using the normal approximation with an

estimate of the standard error, and in particular cases exact solutions can be obtained from published tables.

However, for a number of commonly used measures in epidemiology and clinical research, formulae either are

not available or are so complex that calculation is tedious. The author describes how an approach to

confidence interval estimation which has been used in certain specific instances can be generalized to obtain

a simple and easily understood method that has wide applicability. The technique is applicable as long as the

measure for which a confidence interval is required can be expressed as a monotonic function of a single

parameter for which the confidence limits are available. These known confidence limits are substituted into the

expression for the measure—giving the required interval. This approach makes fewer distributional assump-

tions than the use of the normal approximation and can be more accurate. The author illustrates his technique

by calculating confidence intervals for Levin's attributable risk, some measures in population genetics, and the

"number needed to be treated" in a clinical trial. Hitherto the calculation of confidence intervals for these

measures was quite problematic. The substitution method can provide a practical alternative to the use of

complex formulae when performing interval estimation, and even in simpler situations it has major advantages.

Am J Epidemiol 1998; 147:783-90.

binomial distribution; confidence intervals; epidemiologic methods; Poisson distribution; statistics

Confidence intervals are now required by most med-

ical journals for the presentation of statistical results.

A confidence interval is a range of likely values for an

unknown population parameter at a given confidence

level. The endpoints of this range are called the con-

fidence limits.

A number of different methods can be used to

estimate confidence limits. Exact limits can be ob-

tained using published tables (1-3) or appropriate soft-

ware (4, 5) for a single proportion, percentage, or risk

(binomial limits), as well as for a count (Poisson

limits). However, the most commonly used method of

calculating confidence limits involves the normal ap-

proximation, in which a multiple of the standard error

(SE) is added to and subtracted from the sample value

for the measure. For 95 percent confidence limits, the

general expression is

statistic ± 1.96 SE(statistic),

(1)

Received for publication October 13, 1994, and in final form

October 3, 1997.

Abbreviations: LAR, Levin's attributable risk; NNT, number

needed to be treated; RR, relative risk; SE, standard error.

From the Department of Public Health Medicine and Epidemiol-

ogy, University College Dublin, Earlsfort Terrace, Dublin 2, Ireland.

(Reprint requests to Prof. Leslie E. Daly at this address).

where SE(statistic) is the standard error of the relevant

quantity and 1.96 is the appropriate percentile of the

normal distribution. Confidence limit estimation is

relatively straightforward using this approach, and

methods for use with single means, proportions, or

counts, for differences between these, and for relative

risk-type measures are well known (1, 2, 6). Several

commonly used standard error formulae are given in

the Appendix.

Although the normal approximation (expression 1)

can often be used directly for confidence interval

estimation, sometimes it must be used on a transformation of the measure of interest. For instance, 95

percent confidence limits for the relative risk (RR) can

be based on the limits for $\log_e \mathrm{RR}$:

$\log_e \mathrm{RR} \pm 1.96\,\mathrm{SE}(\log_e \mathrm{RR}),$

(2)

where $\mathrm{SE}(\log_e \mathrm{RR})$ is the standard error of the natural

logarithm of RR (expression A2 in the Appendix).

Transforming back to the original scale, the exponen-

tial of these limits gives the limits for the relative risk

itself. A similar approach can also be used for the odds

ratio. It is important to realize that it is the actual limits

of the transformed quantity that must be back-

transformed. When the limits are transformed in this


by guest on July 8, 2011

aje.oxfordjournals.org

Downloaded from


way, the confidence limits are not symmetrical around

the point estimate. A common error for the unwary is

to back-transform the "plus-and-minus" part of the

expression, which gives a symmetrical but incorrect

interval.
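The back-transformation of expression 2, and the common error just described, can be sketched in a few lines of Python. This is an illustrative sketch only: it borrows the cell counts of table 2 (presented later in the text) and the standard error formula A2 from the Appendix, and the variable names are hypothetical.

```python
import math

# Cell counts taken from table 2 (a, b: low birth weight; c, d: normal birth weight).
a, b, c, d = 618, 4597, 422, 67093

r1 = a / (a + b)   # risk in the low birth weight group
r2 = c / (c + d)   # risk in the normal birth weight group
rr = r1 / r2       # relative risk

# Standard error of log_e(RR), expression A2 in the Appendix.
se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))

# Correct: form the limits on the log scale, then exponentiate each limit.
z = 1.96
rr_lower = math.exp(math.log(rr) - z * se_log_rr)
rr_upper = math.exp(math.log(rr) + z * se_log_rr)

# Incorrect: exponentiate only the "plus-and-minus" part and add/subtract it.
wrong_half_width = math.exp(z * se_log_rr)
wrong_lower, wrong_upper = rr - wrong_half_width, rr + wrong_half_width

print(f"RR = {rr:.3f}; 95% limits {rr_lower:.3f} to {rr_upper:.3f}")
print(f"symmetrical but incorrect limits: {wrong_lower:.3f} to {wrong_upper:.3f}")
```

The correct limits are asymmetrical about the point estimate, whereas the incorrect ones are symmetrical and far too narrow for these data.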

Unfortunately, however, for a number of measures

used in epidemiology or clinical research, either no

standard error formulae are available or the formulae

are complex and tedious to use. In addition, these more

complex formulae often can only be found in specialist

articles or books, and they are rarely implemented in

computer software packages.

Other techniques for confidence limit estimation

have also been proposed, the best known of which

include Miettinen's test-based limits (1, pp. 197-200)

and Thomas and Gart's (7) computationally difficult

procedure for parameters of 2 × 2 tables. These techniques are of limited applicability.

This paper describes a particular approach to confi-

dence limit estimation which, though previously de-

scribed for certain simple situations, gives rise to a

hitherto unrecognized general method. It is easily un-

derstood and simple to apply, makes fewer assump-

tions than the normal approximation approach, is in-

herently more accurate, and is applicable in many

situations that were previously intractable.

THE SUBSTITUTION METHOD

The following example, pertaining to an incidence

rate, illustrates a simple application of what might be

called "the substitution method" for estimating confi-

dence limits. Seventeen cases of Wilson's disease

were detected in 1,240,091 births in Ireland (8), giving

a birth incidence of

$I = x/N$ = 17/1,240,091 = 13.71 per million. (3)

In this situation, the number of cases ($x$) can be assumed to have a Poisson distribution, and the denominator ($N$), which is large and based on census data, can be considered fixed and without sampling variation. Confidence limits for $I$ can then be based on confidence limits for $x$ (1, pp. 67-8). If $x_l$ and $x_u$ are the lower and upper confidence limits for $x$, respectively, then the lower and upper confidence limits for $I$ are simply

$I_l = x_l/N$; (4)

$I_u = x_u/N$. (5)

In Poisson confidence interval tables (1, pp. 393-5),

the 95 percent limits for this number of cases (x = 17)

are 9.903 and 27.219. Substitution into expressions 4

and 5 then gives a 95 percent confidence interval for

the incidence rate of 7.99-21.95 per million.
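For readers without access to Poisson tables, the calculation can be sketched in Python. The code below is a minimal illustration, not the routine used for the published tables: it finds the exact 95 percent Poisson limits by bisection on the cumulative distribution and then substitutes them into expressions 4 and 5. The function names are hypothetical.

```python
import math

def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), with mu > 0, summed term by term."""
    if k < 0:
        return 0.0
    return sum(math.exp(i * math.log(mu) - mu - math.lgamma(i + 1))
               for i in range(k + 1))

def solve_mu(k, target):
    """Find mu with P(X <= k | mu) = target by bisection (the CDF falls as mu rises)."""
    lo, hi = 1e-12, k + 20.0 * math.sqrt(k + 1.0) + 20.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if poisson_cdf(k, mid) > target:
            lo = mid   # mu still too small
        else:
            hi = mid
    return (lo + hi) / 2.0

def exact_poisson_limits(x, conf=0.95):
    """Exact two-sided confidence limits for a Poisson count x."""
    alpha = 1.0 - conf
    # Lower limit: P(X >= x | mu) = alpha/2, i.e. P(X <= x - 1 | mu) = 1 - alpha/2.
    lower = 0.0 if x == 0 else solve_mu(x - 1, 1.0 - alpha / 2.0)
    # Upper limit: P(X <= x | mu) = alpha/2.
    upper = solve_mu(x, alpha / 2.0)
    return lower, upper

x, n = 17, 1_240_091
x_l, x_u = exact_poisson_limits(x)

# Substitution (expressions 4 and 5): divide the limits for x by the fixed N.
rate_l = x_l / n * 1e6
rate_u = x_u / n * 1e6
print(f"Poisson limits for x: {x_l:.3f} to {x_u:.3f}")
print(f"incidence limits: {rate_l:.2f} to {rate_u:.2f} per million")
```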

Although the substitution of known confidence lim-

its into an expression for a quantity has also been

described for interval estimation of a standardized

mortality ratio (1, p. 279) and the ratio of two rates (1,

p. 200), the approach has never been considered as a

method for confidence limit estimation in its own

right, and its general applicability has never been

exploited. In each of these examples, the measure for

which a confidence interval is required is expressed as

a function of a single quantity for which limits are easy

to calculate. The confidence limits for this single

quantity are then substituted into the formula for the

measure of interest to obtain the required interval.

It is interesting to note that confidence limit estima-

tion based on a transformation of a particular quantity

can also be considered an application of this substitu-

tion method. For instance, any quantity is a function

(the exponential or back-transformation) of its loga-

rithm. Taking the relative risk as an example (see

above), the confidence limits for $\log_e \mathrm{RR}$ are readily

computed, and their exponentiation, giving the confi-

dence limits for the RR itself, is equivalent to their

substitution into the formula for the RR.

The general applicability of the substitution method

is illustrated below by extending its application to

three situations which hitherto were quite problematic.

Other examples can easily be found. In each situation,

the use of the method is illustrated using previously

published data, and further calculations are performed

on example data sets. Twenty-one 2 × 2 contingency tables were chosen for illustrating two of the applications. These example tables (table 1), each of which

shows a statistically significant (uncorrected chi-

square, p < 0.05) difference in mortality between two

groups, cover a range of sample sizes, baseline risks,

relative risks, and risk differences.

Levin's attributable risk

Several attributable risk-type measures are sug-

gested in the literature (9). One in particular is called

Levin's attributable risk. Table 2 gives infant mortality

by birth weight for 72,730 births among whites in New

York City for 1974 (10, p. 77). If low birth weight

births did not occur in the population, the population

risk would be reduced to that observed for infants with

normal birth weight. Letting $R_T$ = 1,040/72,730 = 0.0143 and $R_2$ = 422/67,515 = 0.0063 represent,

respectively, the total observed risk in the population

and the risk in normal birth weight infants, the differ-

ence between these is the amount of risk in the pop-

ulation that is attributable to low birth weight. Levin's


TABLE 1. Structure of example data sets used to illustrate the application of the substitution method for estimation of confidence limits*

                                                                          Cell entries
          Total sample   Risk in                                      Group 1          Group 2
          size (each     group 1   Relative            Risk
Example   group)         (R1)      risk       R2/R1    difference   Dead   Alive     Dead   Alive
A             50         0.72      1.5        0.67     0.24           36     14        24     26
B             50         0.72      2.0        0.50     0.36           36     14        18     32
C             50         0.72      3.0        0.33     0.48           36     14        12     38
D            100         0.72      1.5        0.67     0.24           72     28        48     52
E            100         0.72      2.0        0.50     0.36           72     28        36     64
F            100         0.72      3.0        0.33     0.48           72     28        24     76
G            100         0.48      1.5        0.67     0.16           48     52        32     68
H            100         0.48      2.0        0.50     0.24           48     52        24     76
I            100         0.48      3.0        0.33     0.32           48     52        16     84
J            500         0.72      1.5        0.67     0.24          360    140       240    260
K            500         0.72      2.0        0.50     0.36          360    140       180    320
L            500         0.72      3.0        0.33     0.48          360    140       120    380
M            500         0.48      1.5        0.67     0.16          240    260       160    340
N            500         0.48      2.0        0.50     0.24          240    260       120    380
O            500         0.48      3.0        0.33     0.32          240    260        80    420
P            500         0.24      1.5        0.67     0.08          120    380        80    420
Q            500         0.24      2.0        0.50     0.12          120    380        60    440
R            500         0.24      3.0        0.33     0.16          120    380        40    460
S            500         0.12      1.5        0.67     0.04           60    440        40    460
T            500         0.12      2.0        0.50     0.06           60    440        30    470
U            500         0.12      3.0        0.33     0.08           60    440        20    480

* Each data set is a 2 × 2 table relating mortality in two groups of equal sample sizes. (Group 1 has the higher risk.)

attributable risk (LAR) is this quantity expressed as a

proportion of the total population risk:

$\mathrm{LAR} = \dfrac{R_T - R_2}{R_T} = 0.563.$

(6)

A complex standard error formula for LAR was pro-

posed by Walter (9), and an alternative for the loga-

rithm of LAR was proposed by Fleiss (10, pp. 76-7),

both of which enable estimation of confidence limits.

TABLE 2. Infant mortality among whites in New York City, by birth weight, 1974*

                              Outcome at 1 year
Birth weight (g)        Dead           Alive            Total
≤2,500                  618 (a)†       4,597 (b)        5,215 (a + b)
>2,500                  422 (c)        67,093 (d)       67,515 (c + d)
Total                   1,040          71,690           72,730

* Data from Fleiss (10, p. 77).
† a, b, c, and d are cell entries.

However, application of the substitution method provides a far easier solution. First, LAR must be expressed as a function of a single parameter for which confidence limits are easy to obtain. A small amount

of algebraic manipulation gives the following expres-

sion for LAR in terms of the relative risk of infant

death among low birth weight infants compared with

normal birth weight infants (RR = $R_1/R_2$ = 18.959, where $R_1$ = 618/5,215 = 0.119) and the prevalence of low birth weight in the population (Prev = 5,215/72,730 = 0.0717):

$\mathrm{LAR} = \dfrac{\mathrm{Prev}(\mathrm{RR} - 1)}{1 + \mathrm{Prev}(\mathrm{RR} - 1)}.$

(7)

This expression for LAR is in common use, even

though its interpretation is not intuitively obvious. If

the prevalence of low birth weight is assumed to be

free of sampling variation (equivalent to assuming that

one of the margins of table 2 is fixed), LAR is seen to

be expressed in terms of the relative risk, for which

confidence limits are easily obtained. These limits can

then be substituted into expression 7 to obtain limits

for LAR. The lower and upper 95 percent confidence

limits for this RR, $\mathrm{RR}_l$ and $\mathrm{RR}_u$, estimated using


expressions 2 and A2, are 16.807 and 21.387, respec-

tively. By substitution, the limits for LAR are

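$\mathrm{LAR}_l = \dfrac{\mathrm{Prev}(\mathrm{RR}_l - 1)}{1 + \mathrm{Prev}(\mathrm{RR}_l - 1)} = \dfrac{0.0717(16.807 - 1)}{1 + 0.0717(16.807 - 1)} = 0.531;$ (8)

$\mathrm{LAR}_u = \dfrac{\mathrm{Prev}(\mathrm{RR}_u - 1)}{1 + \mathrm{Prev}(\mathrm{RR}_u - 1)} = \dfrac{0.0717(21.387 - 1)}{1 + 0.0717(21.387 - 1)} = 0.594.$ (9)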

These are almost identical to the values of 0.530 and

0.594 calculated using Fleiss' standard error formula

(10) and to the limits of 0.532 and 0.594 given by

Walter's more complex approach (9). Of course, the

limits obtained by the substitution method depend

on which formula is employed for the relative risk

limits.
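The algebra behind expression 7 is short: the total risk is a weighted average of the two group risks, $R_T = \mathrm{Prev}\,R_1 + (1 - \mathrm{Prev})R_2 = R_2[1 + \mathrm{Prev}(\mathrm{RR} - 1)]$, and substituting this into expression 6 gives expression 7. The whole calculation for table 2 can be sketched in Python as follows; the sketch assumes the log-based relative risk limits of expressions 2 and A2, and the function name `lar_from_rr` is hypothetical.

```python
import math

# Cell entries of table 2.
a, b, c, d = 618, 4597, 422, 67093
n = a + b + c + d
prev = (a + b) / n                      # prevalence of low birth weight, treated as fixed

def lar_from_rr(rr, prev):
    """Levin's attributable risk as a function of the relative risk (expression 7)."""
    excess = prev * (rr - 1)
    return excess / (1 + excess)

# Relative risk and its 95% limits via the log transformation (expressions 2 and A2).
rr = (a / (a + b)) / (c / (c + d))
se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
rr_l = math.exp(math.log(rr) - 1.96 * se_log_rr)
rr_u = math.exp(math.log(rr) + 1.96 * se_log_rr)

# Substitution: LAR increases with RR, so the limits map lower-to-lower, upper-to-upper.
lar = lar_from_rr(rr, prev)
lar_l = lar_from_rr(rr_l, prev)
lar_u = lar_from_rr(rr_u, prev)

# Cross-check against the direct definition (expression 6).
r_total = (a + c) / n
r_unexposed = c / (c + d)
lar_direct = (r_total - r_unexposed) / r_total

print(f"LAR = {lar:.3f}; 95% limits {lar_l:.3f} to {lar_u:.3f}")
```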

Table 3 compares the 95 percent confidence limits

obtained by means of Walter's method, Fleiss'

method, and the substitution method (used as de-

scribed above) for the 21 example tables. For this

application, group 1 is taken as the exposed group

and group 2 as the nonexposed. There is good

agreement between the three approaches, but in

general the substitution limits (lower and upper)

tend to be lower than those given by Walter's

method and higher than those given by Fleiss'.

Thus, the substitution limits would seem to be a

better approximation of Walter's limits than the

limits proposed by Fleiss. In example table S, for

instance, Fleiss' lower limit is less than zero, which

would correspond to a statistically nonsignificant

(5 percent level) association between exposure and

mortality. Based on the (conservative) continuity-

corrected chi-square, the association is statistically

significant, and both the substitution method

and Walter's method give the required nonnegative

limits.

Population genetics

Under Hardy-Weinberg equilibrium, the frequency

(the proportion or percentage) of a rare recessive gene

TABLE 3. Lower and upper 95 percent confidence limits for Levin's attributable risk (LAR_l and LAR_u), calculated using two published methods and the substitution method in 21 example data sets*

                                     Confidence limit calculation method
          Levin's
          attributable     Method of             Method of             Substitution
          risk             Fleiss (10)           Walter (9)            method
Example   (LAR)            LAR_l     LAR_u       LAR_l     LAR_u       LAR_l     LAR_u
A         0.200            0.017     0.349       0.039     0.361       0.035     0.355
B         0.333            0.118     0.498       0.152     0.515       0.142     0.501
C         0.500            0.251     0.666       0.304     0.696       0.280     0.670
D         0.200            0.075     0.308       0.086     0.314       0.084     0.311
E         0.333            0.188     0.453       0.205     0.462       0.200     0.455
F         0.500            0.335     0.624       0.361     0.639       0.349     0.626
G         0.200            0.011     0.353       0.031     0.369       0.027     0.361
H         0.333            0.124     0.493       0.154     0.513       0.144     0.499
I         0.500            0.272     0.657       0.315     0.685       0.294     0.662
J         0.200            0.146     0.250       0.149     0.251       0.148     0.250
K         0.333            0.272     0.390       0.276     0.391       0.275     0.389
L         0.500            0.432     0.560       0.438     0.562       0.436     0.559
M         0.200            0.120     0.273       0.125     0.275       0.124     0.274
N         0.333            0.247     0.410       0.253     0.414       0.251     0.411
O         0.500            0.408     0.577       0.417     0.583       0.413     0.578
P         0.200            0.068     0.314       0.078     0.322       0.075     0.318
Q         0.333            0.193     0.449       0.207     0.460       0.202     0.453
R         0.500            0.356     0.612       0.374     0.626       0.364     0.615
S         0.200           -0.005     0.363       0.017     0.383       0.013     0.374
T         0.333            0.117     0.497       0.147     0.520       0.136     0.506
U         0.500            0.277     0.654       0.316     0.684       0.295     0.661

* See table 1.


(q) in the population can be estimated from the square

root of the birth incidence of homozygotes ($I$) (11, p. 5):

$q = \sqrt{I}.$

(10)

An approximation for the standard error of this estimate of $q$ is usually given by (11, p. 5):

$\mathrm{SE}(q) = \sqrt{(1 - q^2)/4N},$

(11)

where N is the number of births on which the birth

incidence is based. In the study described above, the

17 (homozygous) cases of Wilson's disease in

1,240,091 births gave an incidence of 13.71 per mil-

lion and a gene frequency of 0.37 percent. The 95

percent confidence interval for the latter figure using

the normal approximation (expressions 1 and 11) is

0.28-0.46 percent.

The substitution method offers an alternative to this

approach. The 95 percent confidence limits for the

incidence rate were previously determined from ex-

pressions 4 and 5 (using the substitution method with

the Poisson distribution) to be 7.99 per million and

21.95 per million. Applying the substitution method

again by taking the square roots of these limits (ex-

pression 10), the lower and upper 95 percent confi-

dence limits for the gene frequency are 0.28 percent

and 0.47 percent—almost identical to those obtained

using the standard error method.
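The two calculations can be compared directly in a short Python sketch. The Poisson limits for x = 17 (9.903 and 27.219) are taken from the tables cited above rather than recomputed, and the variable names are hypothetical.

```python
import math

x, n = 17, 1_240_091          # homozygous cases and total births
incidence = x / n
q = math.sqrt(incidence)      # gene frequency, expression 10

# Usual method: normal approximation with the standard error of expression 11.
se_q = math.sqrt((1 - q ** 2) / (4 * n))
normal_limits = (q - 1.96 * se_q, q + 1.96 * se_q)

# Substitution method: square roots of the limits for the incidence,
# themselves obtained from the tabulated exact Poisson limits for x.
x_l, x_u = 9.903, 27.219
substitution_limits = (math.sqrt(x_l / n), math.sqrt(x_u / n))

print(f"q = {100 * q:.2f} percent")
print("normal approximation: "
      f"{100 * normal_limits[0]:.2f} to {100 * normal_limits[1]:.2f} percent")
print("substitution method:  "
      f"{100 * substitution_limits[0]:.2f} to {100 * substitution_limits[1]:.2f} percent")
```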

Table 4 compares the substitution method with the

usual approach for a series of birth incidence examples

covering a range of gene frequencies and total births.

Agreement is close, particularly with large numbers of

affected births. When there was only one affected birth

(examples AA and EE), however, the lower 95 percent

limits for the gene frequency were much higher using

the substitution method. It should be realized, of

course, that use of expressions 1 and 11 does not give

exact limits for the gene frequency, since the calcula-

tion is based on an approximate formula and on the

assumption that the sampling distribution of q is nor-

mal. The substitution limits, on the other hand, based

on a transformation of the exact Poisson limits for

the incidence rate, can be considered exact in this

situation.

Although the substitution method is more accurate,

the simplicity of the standard error formula in this

example gives no advantage to the new approach in

terms of ease of use. If, however, the estimation were

to allow for inbreeding, the formula relating the gene

frequency to the incidence is more intricate and includes a population inbreeding coefficient (11, p. 20).

TABLE 4. Lower and upper 95 percent confidence limits for gene frequency (q_l and q_u), calculated using the usual method and the substitution method in a series of 22 birth incidence examples

                                                   Confidence limit calculation method
          No. of      Total       Gene
          affected    no. of      frequency        Usual method          Substitution method
Example   births      births      (q) (%)          q_l       q_u         q_l       q_u
AA            1        1,000       3.16            0.06      6.28        0.50      7.46
BB            2        1,000       4.47            1.38      7.57        1.56      8.50
CC            5        1,000       7.07            3.98     10.16        4.03     10.80
DD           10        1,000      10.00            6.92     13.08        6.92     13.56
EE            1        5,000       1.41            0.03      2.80        0.23      3.34
FF            2        5,000       2.00            0.61      3.39        0.70      3.80
GG            5        5,000       3.16            1.78      4.55        1.80      4.83
HH           10        5,000       4.47            3.09      5.86        3.10      6.06
II           20        5,000       6.32            4.94      7.71        4.94      7.86
JJ           50        5,000      10.00            8.62     11.38        8.62     11.48
KK            5       10,000       2.24            1.26      3.22        1.27      3.42
LL           10       10,000       3.16            2.18      4.14        2.19      4.29
MM           25       10,000       5.00            4.02      5.98        4.02      6.07
NN           50       10,000       7.07            6.09      8.05        6.09      8.12
OO          100       10,000      10.00            9.02     10.98        9.02     11.03
PP            5       50,000       1.00            0.56      1.44        0.57      1.53
QQ           10       50,000       1.41            0.98      1.85        0.98      1.92
RR           25       50,000       2.24            1.80      2.67        1.80      2.72
SS           50       50,000       3.16            2.72      3.60        2.72      3.63
TT          100       50,000       4.47            4.03      4.91        4.03      4.93
UU          250       50,000       7.07            6.63      7.51        6.63      7.52
VV          500       50,000      10.00            9.56     10.44        9.56     10.45

The standard error for this corrected estimate of the

gene frequency is not given in standard textbooks, but

the substitution method allows for easy estimation of

gene frequency confidence limits. In addition, many

genetic parameters, such as the population proportion

of heterozygotes, are functions of the gene frequency.

Although standard error formulae are not easily found,

further repeated application of the substitution method

again allows for easy confidence interval estimation.

Number needed to be treated

A measure summarizing the results of a clinical trial

was described several years ago by Laupacis et al.

(12), without any explicit formulation for estimating

its confidence limits. The "number needed to be

treated" (NNT) is the number of patients that would

have to be treated with a trial therapy to prevent one

adverse event, and thus it gives clinicians and patients

a measure of the effort required to achieve a particular

result.

The effect of an insulin-glucose infusion followed

by intensive subcutaneous insulin in diabetic patients

with myocardial infarction was examined in a random-

ized controlled trial (13). After 1 year of follow-up,

there were 58 deaths in the 306 patients receiving the

new therapy (19.0 percent) as compared with 82

deaths in the 314 control patients on standard therapy

(26.1 percent). On the basis of these figures, one

would expect 261 deaths in 1,000 patients on standard

therapy. However, if these patients had received the

new treatment, there would have been just 190 deaths.

Thus, treatment of 1,000 patients would have pre-

vented 71 deaths (261 - 190 = 71), meaning that 14

patients (1,000/71) would have had to be treated to

prevent one death. NNT is then 14. It is easy to see

that, in fact, NNT is simply the reciprocal of the

absolute risk reduction (the difference in the risk of an

event between the treated and control groups).

$\mathrm{NNT} = \dfrac{1}{R_1 - R_2} = \dfrac{1}{0.261 - 0.190} = \dfrac{1}{0.071} = 14.08,$

(12)

where $R_1$ is the event risk in the control group and $R_2$ is the risk in the treated group.

In the original paper describing this measure, pub-

lished in 1988 (12), it was suggested that confidence

limits might be obtained using a complex technique

requiring a special computer program (7). Applying

this to the above data gives 95 percent confidence

limits for NNT of 7.2 and 378.5. The substitution

method, however, provides a very simple solution for

this problem. Using expressions 1 and A3, the 95

percent confidence limits for the difference between

the proportions of events in the treatment and placebo

groups ($R_1 - R_2$ = 0.071) are 0.006 and 0.137, respectively. Using the substitution principle on expression 12, the reciprocal of these limits gives the 95 percent limits for the NNT as 7.3 and 166.7. The upper

limit is considerably lower than that obtained using the

Laupacis et al. method, and this is due to the fact that

the lower confidence limit for the risk difference is

close to zero. In cases like this, where the significance level is not very high, the stability of the upper limit

for NNT may be in question. Since a zero risk differ-

ence corresponds to an NNT of infinity, small changes

in the lower limit for the risk difference close to zero

can result in very large changes in the NNT estimate.
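The NNT calculation can be sketched in Python as below, using expression A3 for the standard error of the risk difference. The function name is hypothetical. Note that the sketch works with unrounded risks, so it gives an NNT of about 14.0 and an upper limit of roughly 164, whereas the figures in the text (14.08 and 166.7) are reciprocals of the rounded values 0.071 and 0.006; the gap itself illustrates the instability just described.

```python
import math

def risk_difference_limits(events1, n1, events2, n2, z=1.96):
    """Risk difference R1 - R2 with normal-approximation limits (expressions 1 and A3)."""
    r1, r2 = events1 / n1, events2 / n2
    se = math.sqrt(r1 * (1 - r1) / n1 + r2 * (1 - r2) / n2)
    diff = r1 - r2
    return diff, diff - z * se, diff + z * se

# Control group: 82 deaths in 314 patients; treated group: 58 deaths in 306 patients.
diff, diff_l, diff_u = risk_difference_limits(82, 314, 58, 306)

# Substitution into expression 12. The reciprocal is a decreasing function, so the
# upper limit of the risk difference gives the lower limit of NNT and vice versa.
# This step is only meaningful when both limits of the risk difference are positive.
nnt = 1 / diff
nnt_l, nnt_u = 1 / diff_u, 1 / diff_l

print(f"risk difference = {diff:.3f} ({diff_l:.3f} to {diff_u:.3f})")
print(f"NNT = {nnt:.1f} ({nnt_l:.1f} to {nnt_u:.1f})")
```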

In 1996, Chatellier et al. (14) published a nomogram

for estimating NNT from the relative risk and the risk

in the control group, also showing how to estimate its

confidence limits. Their application is actually equiv-

alent to substitution of the limits for the relative risk

into a formula for NNT expressed in terms of RR and

$R_1$. This approach must be less accurate than the method described above, which substituted limits for the absolute risk reduction: Chatellier et al.'s method assumes that $R_1$ is without sampling variation, and

estimating the limits for RR requires more stringent

assumptions and a greater degree of approximation

than the estimation of limits for the risk reduction.

Table 5 compares the 95 percent confidence limits

for NNT calculated on the 21 example tables using the

suggestions of Laupacis et al. (12), the suggestions of

Chatellier et al. (14), and the substitution method.

Here group 1 is taken as the controls, with group 2

representing the treated patients. Expression A2 was

employed to estimate the relative risk limits for Chatellier et al.'s method, and the limits for the absolute risk reduction based on expression A3 were employed for the substitution method.

In general, the substitution limits were wider than

those of Chatellier et al., which can be explained by

the more stringent assumptions underlying the latter.

The substitution method agrees well with the complex

approach of Laupacis et al., especially for larger sam-

ple sizes, though it seems to give a consistently

smaller upper limit. As was noted above, a large

discrepancy can be expected for results that are close

to significance, as in example S.

TABLE 5. Lower and upper 95 percent confidence limits for the number needed to be treated (NNT_l and NNT_u), calculated using two published methods and the substitution method in 21 example data sets*

                                     Confidence limit calculation method
          No. needed
          to be          Method of               Method of               Substitution
          treated        Laupacis et al. (12)    Chatellier et al. (14)  method
Example   (NNT)          NNT_l     NNT_u         NNT_l     NNT_u         NNT_l     NNT_u
A          4.2            2.3       35.1          2.7       20.8          2.3       18.6
B          2.8            1.8        6.8          2.1        5.6          1.8        5.6
C          2.1            1.5        3.7          1.7        3.2          1.5        3.2
D          4.2            2.7       10.5          2.9        9.0          2.7        9.2
E          2.8            2.0        4.6          2.2        4.2          2.0        4.3
F          2.1            1.7        2.9          1.8        2.7          1.7        2.8
G          6.3            3.4       69.7          3.9       39.4          3.4       38.4
H          4.2            2.7       10.2          3.1        8.3          2.7        9.0
I          3.1            2.3        5.4          2.6        4.6          2.3        5.0
J          4.2            3.3        5.6          3.5        5.4          3.3        5.5
K          2.8            2.4        3.3          2.5        3.2          2.4        3.3
L          2.1            1.8        2.4          1.9        2.3          1.9        2.3
M          6.3            4.5       10.3          4.8        9.5          4.5       10.0
N          4.2            3.4        5.6          3.6        5.2          3.4        5.5
O          3.1            2.7        3.8          2.8        3.6          2.7        3.8
P         12.5            7.7       34.9          8.6       29.7          7.7       32.6
Q          8.3            6.0       14.0          6.7       12.4          6.0       13.7
R          6.3            5.0        8.7          5.5        7.8          4.9        8.6
S         25.0           13.0    1,105.6         15.3      336.6         13.0      345.5
T         16.7           10.7       43.5         12.4       34.9         10.5       40.5
U         12.5            9.2       21.7         10.5       18.3          8.8       21.4

* See table 1.

DISCUSSION

The substitution method of estimating confidence limits described in this paper does not seem to have been proposed explicitly before, although some specific applications are well known. The kernel of the method is expressing the measure for which confidence limits are required as a function of a single quantity for which limits are easily obtained. This is often not difficult, and in many cases the usual formula for the measure will be sufficient. It is important to note, however, that the measure must be a function of a single parameter in order for the substitution method to work. For example, it is not possible to obtain a confidence interval for a relative risk by using the confidence limits for the two component absolute risks.

In some cases it is necessary to assume that some of

the quantities that make up the relevant formula are

without sampling variation and are thus essentially

constant. If the measure is derived from a contingency

table, this will often be equivalent to the assumption

that one or both of the margins of the table are fixed,

making the analysis conditional on those margins.

Thus, for Levin's attributable risk, the prevalence of

the condition was taken as constant. This is a common

assumption in contingency table analysis. The condi-

tional assumption can sometimes be avoided by judi-

cious choice of the parameter to be substituted, as in

the case of the NNT discussed above.

The substitution method will be applicable as long

as the relation between the measure and the parameter

for which limits are available is fairly simple. Tech-

nically, the requirement is that the functional relation

be monotonically increasing (or decreasing). This

means that if the parameter increases, the measure

must always either remain the same or increase (or

decrease). (In the latter case, the lower limit for the

parameter will give the upper limit for the measure and

vice versa.) It is difficult to imagine a practically

useful measure in medical or epidemiologic applica-

tions for which this condition will not hold.
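The monotonicity requirement, including the swap of limits for a decreasing function, can be captured in a small generic helper. The Python sketch below is illustrative only; the function name `substitute_limits` is hypothetical, and the helper simply assumes (without checking) that the supplied function is monotonic over the whole interval between the two limits.

```python
import math

def substitute_limits(measure, param_lower, param_upper):
    """Substitute known parameter limits into a monotonic function of the parameter.

    For a decreasing function the lower parameter limit yields the upper limit
    of the measure, so the two substituted values are reordered.
    """
    first, second = measure(param_lower), measure(param_upper)
    return min(first, second), max(first, second)

# Increasing function: gene frequency as the square root of the birth incidence,
# using the incidence limits of 7.99 and 21.95 per million.
q_l, q_u = substitute_limits(math.sqrt, 7.99e-6, 21.95e-6)

# Decreasing function: NNT as the reciprocal of the absolute risk reduction,
# using the risk difference limits of 0.006 and 0.137.
nnt_l, nnt_u = substitute_limits(lambda diff: 1 / diff, 0.006, 0.137)

print(f"gene frequency: {100 * q_l:.2f} to {100 * q_u:.2f} percent")
print(f"NNT: {nnt_l:.1f} to {nnt_u:.1f}")
```

The reciprocal is monotonic only on intervals that exclude zero, so the helper should not be applied to a risk difference whose limits have opposite signs.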

Although the examples presented in this paper can-

not be taken as a formal comprehensive numerical

evaluation of the substitution method for confidence

interval estimation, there is good agreement with the

more established procedures in the cases considered.

Even though there is no explicit formula for the con-

fidence limits, the substitution method is without

doubt easier to explain and to use. A suitable formula

for the measure of interest is all that is required, and

the usually incomprehensible expressions for standard

errors are avoided entirely.


Another major advantage of the method is that no

distributional assumptions are necessary for the sam-

pling distribution of the measure for which the confi-

dence limits are required. If exact confidence limits

are known for the underlying parameter (as in the

binomial or Poisson cases), the limits for a function of

the parameter will also be exact. Thus, there is a

distinct advantage to the substitution method even

when an alternative exists using known standard error

formulae.

For measures that are a function of a single param-

eter, a Taylor series expansion is often used for inter-

val estimation (6, pp. 91-2). The standard error of the

function of a parameter f(x) is given by

$\mathrm{SE}[f(x)] \approx \pm \dfrac{df}{dx}\,\mathrm{SE}(x),$

where the derivative of $f$ is evaluated at the mean value of $x$.

Not only is this standard error an approximation but

the additional assumption of normality is required to

derive the confidence limits for the function using

expression 1. The standard error formula for the gene

frequency (expression 11) can be derived in this way.

The requirements for valid use of the Taylor series

expansion method are also more stringent than those

for the substitution method, in that the functional

relation must be strictly monotonically increasing (or

decreasing) and must have a nonzero first derivative.

(A strictly monotonic function requires that the func-

tion always changes as the parameter changes.) Thus,

the substitution method can and should always be used

instead of a Taylor series expansion.
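As an illustration of the Taylor series approach, the derivation of expression 11 might run as follows. This is a reconstruction rather than a derivation given in the text, and it assumes that the number of affected births is binomial, so that the standard error of the incidence $I = q^2$ is given by expression A1.

```latex
\begin{align*}
q &= f(I) = \sqrt{I}, \qquad \frac{df}{dI} = \frac{1}{2\sqrt{I}} = \frac{1}{2q}, \\
\mathrm{SE}(I) &= \sqrt{\frac{I(1 - I)}{N}} = \sqrt{\frac{q^2(1 - q^2)}{N}}, \\
\mathrm{SE}(q) &\approx \frac{df}{dI}\,\mathrm{SE}(I)
  = \frac{1}{2q}\sqrt{\frac{q^2(1 - q^2)}{N}}
  = \sqrt{\frac{1 - q^2}{4N}}.
\end{align*}
```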

For the end user, the general approach of the sub-

stitution method and its lack of reliance on complex

formulae make it clearer what the confidence limits

are measuring. It is particularly suitable for "hand"

calculations when specialized computer software is

not available. The substitution method should be

adopted as a practical alternative to complex formulae

when performing interval estimation. Even in simple

cases, the inherent accuracy of the method suggests

that it should replace some standard approaches.

ACKNOWLEDGMENTS

The author thanks Dr. Douglas G. Altman of the Imperial

Cancer Research Fund (Oxford, England) and Prof.

Marcello Pagano of the Department of Biostatistics, Har-

vard School of Public Health (Brookline, Massachusetts),

for valuable suggestions and encouragement.

REFERENCES

1. Daly LE, Bourke GJ, McGilvray J. Interpretation and uses of

medical statistics. 4th ed. London, England: Blackwell Scientific Publications, 1991.

2. Gardner MJ, Altman DG, eds. Statistics with confidence:

confidence intervals and statistical guidelines. London,

England: British Medical Journal, 1989.

3. Lentner C, ed. Geigy scientific tables. 8th ed. Version 2.

Basel, Switzerland: Ciba-Geigy Ltd, 1982.

4. Daly L. Simple SAS macros for the calculation of exact

binomial and Poisson confidence limits. Comput Biol Med

1992;22:351-61.

5. Gardner MJ, Gardner SB, Winter PD. Confidence interval

analysis microcomputer program manual. London, England:

British Medical Journal, 1989.

6. Armitage P, Berry G. Statistical methods in medical research.

3rd ed. Oxford, England: Blackwell Scientific Publications, 1994.

7. Thomas DG, Gart JJ. A table of exact confidence limits for

differences and ratios of two proportions and their odds ratios.

J Am Stat Assoc 1977;72:73-6.

8. Reilly M, Daly L, Hutchinson M. An epidemiological study of

Wilson's disease in the Republic of Ireland. J Neurol Neurosurg Psychiatry 1993;56:298-300.

9. Walter SD. The estimation and interpretation of attributable

risk in health research. Biometrics 1976;32:829-49.

10. Fleiss JL. Statistical methods for rates and proportions. 2nd

ed. New York, NY: John Wiley and Sons, Inc, 1981.

11. Emery AEH. Methodology in medical genetics: an introduc-

tion to statistical methods. Edinburgh, Scotland: Churchill

Livingstone, 1976.

12. Laupacis A, Sackett DL, Roberts RS. An assessment of clin-

ically useful measures of the consequences of treatment.

N Engl J Med 1988;318:1728-33.

13. Malmberg K. Prospective randomised study of intensive in-

sulin treatment on long term survival after acute myocardial

infarction in patients with diabetes mellitus. BMJ 1997;314:

1512-15.

14. Chatellier G, Zapletal E, Lemaitre D, et al. The number

needed to treat: a clinically useful nomogram in its proper

context. BMJ 1996;312:426-9.

APPENDIX

Some common standard error formulae which were

employed in the derivation of results presented in this

paper are shown below.

For a binomial proportion (p) in a sample size of n,

the standard error is calculated as

$\mathrm{SE}(p) = \sqrt{p(1 - p)/n}.$ (A1)

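For the natural logarithm of the relative risk ($\log_e \mathrm{RR}$) ($a$, $b$, $c$, and $d$ are table entries; see table 2 in the text), the standard error is calculated as

$\mathrm{SE}(\log_e \mathrm{RR}) = \sqrt{\dfrac{1}{a} - \dfrac{1}{a + b} + \dfrac{1}{c} - \dfrac{1}{c + d}}.$

(A2)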

For the difference between two risks ($R_1 - R_2$) in sample sizes of $n_1$ and $n_2$, the standard error is calculated as

$\mathrm{SE}(R_1 - R_2) = \sqrt{\dfrac{R_1(1 - R_1)}{n_1} + \dfrac{R_2(1 - R_2)}{n_2}}.$

(A3)
