
American Journal of Epidemiology

Copyright © 1998 by The Johns Hopkins University School of Hygiene and Public Health

All rights reserved

Vol. 147, No. 8

Printed in USA.

Confidence Limits Made Easy: Interval Estimation Using a Substitution Method

Leslie E. Daly

The use of confidence intervals has become standard in the presentation of statistical results in medical

journals. Calculation of confidence limits can be straightforward using the normal approximation with an

estimate of the standard error, and in particular cases exact solutions can be obtained from published tables.

However, for a number of commonly used measures in epidemiology and clinical research, formulae either are

not available or are so complex that calculation is tedious. The author describes how an approach to

confidence interval estimation which has been used in certain specific instances can be generalized to obtain

a simple and easily understood method that has wide applicability. The technique is applicable as long as the

measure for which a confidence interval is required can be expressed as a monotonic function of a single

parameter for which the confidence limits are available. These known confidence limits are substituted into the

expression for the measure—giving the required interval. This approach makes fewer distributional assump-

tions than the use of the normal approximation and can be more accurate. The author illustrates his technique

by calculating confidence intervals for Levin's attributable risk, some measures in population genetics, and the

"number needed to be treated" in a clinical trial. Hitherto the calculation of confidence intervals for these

measures was quite problematic. The substitution method can provide a practical alternative to the use of

complex formulae when performing interval estimation, and even in simpler situations it has major advantages.

Am J Epidemiol 1998; 147:783-90.

binomial distribution; confidence intervals; epidemiologic methods; Poisson distribution; statistics

Confidence intervals are now required by most med-

ical journals for the presentation of statistical results.

A confidence interval is a range of likely values for an

unknown population parameter at a given confidence

level. The endpoints of this range are called the con-

fidence limits.

A number of different methods can be used to

estimate confidence limits. Exact limits can be ob-

tained using published tables (1-3) or appropriate soft-

ware (4, 5) for a single proportion, percentage, or risk

(binomial limits), as well as for a count (Poisson

limits). However, the most commonly used method of

calculating confidence limits involves the normal ap-

proximation, in which a multiple of the standard error

(SE) is added to and subtracted from the sample value

for the measure. For 95 percent confidence limits, the

general expression is

statistic ± 1.96 SE(statistic),

(1)

Received for publication October 13, 1994, and in final form

October 3, 1997.

Abbreviations: LAR, Levin's attributable risk; NNT, number

needed to be treated; RR, relative risk; SE, standard error.

From the Department of Public Health Medicine and Epidemiol-

ogy, University College Dublin, Earlsfort Terrace, Dublin 2, Ireland.

(Reprint requests to Prof. Leslie E. Daly at this address).

where SE(statistic) is the standard error of the relevant

quantity and 1.96 is the appropriate percentile of the

normal distribution. Confidence limit estimation is

relatively straightforward using this approach, and

methods for use with single means, proportions, or

counts, for differences between these, and for relative

risk-type measures are well known (1, 2, 6). Several

commonly used standard error formulae are given in

the Appendix.
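As a minimal sketch of expression 1, the helper below adds and subtracts 1.96 standard errors from an estimate. The function name `normal_ci` and the numbers in the usage line are hypothetical and chosen only for illustration; the standard error itself would come from the appropriate formula for the measure in question.

```python
def normal_ci(estimate, se, z=1.96):
    """Return (lower, upper) limits as estimate -/+ z * SE (expression 1)."""
    half_width = z * se
    return estimate - half_width, estimate + half_width


# Hypothetical example: an estimate of 10.0 with a standard error of 0.5.
lower, upper = normal_ci(10.0, 0.5)
print(f"95% confidence interval: {lower:.2f} to {upper:.2f}")
```

The interval produced this way is always symmetrical around the estimate, which is one reason the approach can fail for measures with skewed sampling distributions.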

Although the normal approximation (expression 1)

can often be used directly for confidence interval

estimation, sometimes it must be used on a transformation of the measure of interest. For instance, 95

percent confidence limits for the relative risk (RR) can

be based on the limits for loge RR:

$$\log_e \mathrm{RR} \pm 1.96\,\mathrm{SE}(\log_e \mathrm{RR}), \tag{2}$$

where SE(loge RR) is the standard error of the natural

logarithm of RR (expression A2 in the Appendix).

Transforming back to the original scale, the exponen-

tial of these limits gives the limits for the relative risk

itself. A similar approach can also be used for the odds

ratio. It is important to realize that it is the actual limits

of the transformed quantity that must be back-

transformed. When the limits are transformed in this


way, the confidence limits are not symmetrical around

the point estimate. A common error for the unwary is

to back-transform the "plus-and-minus" part of the

expression, which gives a symmetrical but incorrect

interval.
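The difference between the correct and the incorrect back-transformation can be sketched in a few lines. The cell counts below are those of example A in table 1 (36 deaths among 50 in group 1, 24 among 50 in group 2). The standard error used for log_e RR is the commonly quoted large-sample formula; it is assumed here to correspond to expression A2 of the Appendix, which is not reproduced in this section.

```python
import math


def rr_with_log_limits(a, b, c, d, z=1.96):
    """Relative risk and limits obtained by back-transforming log limits.

    a, b: dead and alive in group 1; c, d: dead and alive in group 2.
    The SE formula is an assumption (the usual large-sample expression).
    """
    rr = (a / (a + b)) / (c / (c + d))
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    log_rr = math.log(rr)
    lower = math.exp(log_rr - z * se_log)
    upper = math.exp(log_rr + z * se_log)
    return rr, lower, upper, se_log


rr, rr_lower, rr_upper, se_log = rr_with_log_limits(36, 14, 24, 26)

# Incorrect: back-transforming only the "plus-and-minus" part of expression 2.
wrong_half_width = math.exp(1.96 * se_log)
wrong_lower, wrong_upper = rr - wrong_half_width, rr + wrong_half_width

print(f"RR = {rr:.2f}")
print(f"correct limits:   {rr_lower:.3f} to {rr_upper:.3f} (asymmetrical)")
print(f"incorrect limits: {wrong_lower:.3f} to {wrong_upper:.3f} (symmetrical)")
```

The correct limits lie closer to the point estimate on the lower side than on the upper side, whereas the incorrect interval is symmetrical and far too wide.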

Unfortunately, however, for a number of measures

used in epidemiology or clinical research, either no

standard error formulae are available or the formulae

are complex and tedious to use. In addition, these more

complex formulae often can only be found in specialist

articles or books, and they are rarely implemented in

computer software packages.

Other techniques for confidence limit estimation

have also been proposed, the best known of which

include Miettinen's test-based limits (1, pp. 197-200)

and Thomas and Gart's (7) computationally difficult

procedure for parameters of 2 × 2 tables. These techniques are of limited applicability.

This paper describes a particular approach to confi-

dence limit estimation which, though previously de-

scribed for certain simple situations, gives rise to a

hitherto unrecognized general method. It is easily un-

derstood and simple to apply, makes fewer assump-

tions than the normal approximation approach, is in-

herently more accurate, and is applicable in many

situations that were previously intractable.

THE SUBSTITUTION METHOD

The following example, pertaining to an incidence

rate, illustrates a simple application of what might be

called "the substitution method" for estimating confi-

dence limits. Seventeen cases of Wilson's disease

were detected in 1,240,091 births in Ireland (8), giving

a birth incidence of

I = x/N = 17/1,240,091 = 13.71 per million. (3)

In this situation, the number of cases (x) can be assumed to have a Poisson distribution, and the denominator (N), which is large and based on census data, can be considered fixed and without sampling variation. Confidence limits for I can then be based on confidence limits for x (1, pp. 67-8). If x_l and x_u are the lower and upper confidence limits for x, respectively, then the lower and upper confidence limits for I are simply

$$I_l = x_l/N; \tag{4}$$

$$I_u = x_u/N. \tag{5}$$

In Poisson confidence interval tables (1, pp. 393-5),

the 95 percent limits for this number of cases (x = 17)

are 9.903 and 27.219. Substitution into expressions 4

and 5 then gives a 95 percent confidence interval for

the incidence rate of 7.99-21.95 per million.
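When tables are not to hand, the exact Poisson limits can be computed numerically. The sketch below is one possible approach, not the procedure used for the published tables: it finds by bisection the Poisson means for which the observed count sits at the 2.5 percent tail on each side. The function names and the search bracket are illustrative choices.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for a Poisson variable with mean mu."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def bisect_root(f, lo, hi, tol=1e-9):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid < 0) == (f_lo < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exact_poisson_limits(x, alpha=0.05):
    """Exact two-sided limits for the mean of a Poisson count x."""
    if x == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= x | mu) = alpha / 2.
        lower = bisect_root(
            lambda mu: (1.0 - poisson_cdf(x - 1, mu)) - alpha / 2, 0.0, float(x))
    # Upper limit: P(X <= x | mu) = alpha / 2; the bracket is deliberately wide.
    bracket = x + 10.0 * math.sqrt(x + 1) + 10.0
    upper = bisect_root(
        lambda mu: poisson_cdf(x, mu) - alpha / 2, float(x), bracket)
    return lower, upper


births = 1_240_091
x_l, x_u = exact_poisson_limits(17)
rate_l = x_l / births * 1e6  # expression 4, per million births
rate_u = x_u / births * 1e6  # expression 5, per million births
print(f"limits for x: {x_l:.3f} to {x_u:.3f}")
print(f"limits for the incidence: {rate_l:.2f} to {rate_u:.2f} per million")
```

For x = 17 this reproduces the tabulated limits of 9.903 and 27.219, and hence the interval of 7.99-21.95 per million.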

Although the substitution of known confidence lim-

its into an expression for a quantity has also been

described for interval estimation of a standardized

mortality ratio (1, p. 279) and the ratio of two rates (1,

p. 200), the approach has never been considered as a

method for confidence limit estimation in its own

right, and its general applicability has never been

exploited. In each of these examples, the measure for

which a confidence interval is required is expressed as

a function of a single quantity for which limits are easy

to calculate. The confidence limits for this single

quantity are then substituted into the formula for the

measure of interest to obtain the required interval.
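The general recipe can be written as a short helper. This is a minimal sketch that assumes, as the method requires, that the measure is a monotonic function of the single quantity; the helper name `substitute_limits` is hypothetical. Sorting the two substituted values covers the case of a decreasing function, where the lower limit of the quantity yields the upper limit of the measure.

```python
def substitute_limits(measure, param_lower, param_upper):
    """Confidence limits for measure(param) by substitution.

    Assumes `measure` is monotonic between param_lower and param_upper.
    """
    a = measure(param_lower)
    b = measure(param_upper)
    return min(a, b), max(a, b)


# Increasing function: incidence per million from the Poisson limits for x = 17.
births = 1_240_091
rate_limits = substitute_limits(lambda x: x / births * 1e6, 9.903, 27.219)
print(tuple(round(v, 2) for v in rate_limits))

# Decreasing function (a reciprocal) with hypothetical limits 0.10 and 0.25:
# the order of the substituted values is reversed.
reciprocal_limits = substitute_limits(lambda p: 1 / p, 0.10, 0.25)
print(reciprocal_limits)
```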

It is interesting to note that confidence limit estima-

tion based on a transformation of a particular quantity

can also be considered an application of this substitu-

tion method. For instance, any quantity is a function

(the exponential or back-transformation) of its loga-

rithm. Taking the relative risk as an example (see

above), the confidence limits for log_e RR are readily

computed, and their exponentiation, giving the confi-

dence limits for the RR itself, is equivalent to their

substitution into the formula for the RR.

The general applicability of the substitution method

is illustrated below by extending its application to

three situations which hitherto were quite problematic.

Other examples can easily be found. In each situation,

the use of the method is illustrated using previously

published data, and further calculations are performed

on example data sets. Twenty-one 2 × 2 contingency

tables were chosen for illustrating two of the applica-

tions. These example tables (table 1), each of which

shows a statistically significant (uncorrected chi-

square, p < 0.05) difference in mortality between two

groups, cover a range of sample sizes, baseline risks,

relative risks, and risk differences.

Levin's attributable risk

Several attributable risk-type measures are sug-

gested in the literature (9). One in particular is called

Levin's attributable risk. Table 2 gives infant mortality

by birth weight for 72,730 births among whites in New

York City for 1974 (10, p. 77). If low birth weight

births did not occur in the population, the population

risk would be reduced to that observed for infants with

normal birth weight. Letting R_T = 1,040/72,730 = 0.0143 and R_2 = 422/67,515 = 0.0063 represent,

respectively, the total observed risk in the population

and the risk in normal birth weight infants, the differ-

ence between these is the amount of risk in the pop-

ulation that is attributable to low birth weight. Levin's


TABLE 1. Structure of example data sets used to illustrate the application of the substitution method for estimation of confidence limits*

                                                                             Cell entries
                                                                             Group 1          Group 2
Example   Total sample size   Risk in        Relative   R2/R1   Risk         Dead    Alive    Dead    Alive
          (each group)        group 1 (R1)   risk               difference
A         50                  0.72           1.5        0.67    0.24         36      14       24      26
B         50                  0.72           2.0        0.50    0.36         36      14       18      32
C         50                  0.72           3.0        0.33    0.48         36      14       12      38
D         100                 0.72           1.5        0.67    0.24         72      28       48      52
E         100                 0.72           2.0        0.50    0.36         72      28       36      64
F         100                 0.72           3.0        0.33    0.48         72      28       24      76
G         100                 0.48           1.5        0.67    0.16         48      52       32      68
H         100                 0.48           2.0        0.50    0.24         48      52       24      76
I         100                 0.48           3.0        0.33    0.32         48      52       16      84
J         500                 0.72           1.5        0.67    0.24         360     140      240     260
K         500                 0.72           2.0        0.50    0.36         360     140      180     320
L         500                 0.72           3.0        0.33    0.48         360     140      120     380
M         500                 0.48           1.5        0.67    0.16         240     260      160     340
N         500                 0.48           2.0        0.50    0.24         240     260      120     380
O         500                 0.48           3.0        0.33    0.32         240     260      80      420
P         500                 0.24           1.5        0.67    0.08         120     380      80      420
Q         500                 0.24           2.0        0.50    0.12         120     380      60      440
R         500                 0.24           3.0        0.33    0.16         120     380      40      460
S         500                 0.12           1.5        0.67    0.04         60      440      40      460
T         500                 0.12           2.0        0.50    0.06         60      440      30      470
U         500                 0.12           3.0        0.33    0.08         60      440      20      480

* Each data set is a 2 × 2 table relating mortality in two groups of equal sample sizes. (Group 1 has the higher risk.)

attributable risk (LAR) is this quantity expressed as a

proportion of the total population risk:

$$\mathrm{LAR} = \frac{R_T - R_2}{R_T} = 0.563. \tag{6}$$

A complex standard error formula for LAR was pro-

posed by Walter (9), and an alternative for the loga-

rithm of LAR was proposed by Fleiss (10, pp. 76-7),

both of which enable estimation of confidence limits.

However, application of the substitution method pro-

vides a far easier solution. First, LAR must be ex-

TABLE 2. Infant mortality among whites in New York City, by birth weight, 1974*

                     Outcome at 1 year
Birth weight (g)     Dead         Alive          Total
≤2,500               618 (a)†     4,597 (b)      5,215 (a + b)
>2,500               422 (c)      67,093 (d)     67,515 (c + d)
Total                1,040        71,690         72,730

* Data from Fleiss (10, p. 77).
† a, b, c, and d are cell entries.

pressed as a function of a single parameter for which

confidence limits are easy to obtain. A small amount

of algebraic manipulation gives the following expres-

sion for LAR in terms of the relative risk of infant

death among low birth weight infants compared with

normal birth weight infants (RR = R_1/R_2 = 18.959, where R_1 = 618/5,215 = 0.119) and the prevalence of low birth weight in the population (Prev = 5,215/72,730 = 0.0717):

$$\mathrm{LAR} = \frac{\mathrm{Prev}(\mathrm{RR} - 1)}{1 + \mathrm{Prev}(\mathrm{RR} - 1)}. \tag{7}$$
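The algebraic step from the definition of LAR to its expression in terms of the relative risk can be sketched as follows. The sketch writes R_1 for the risk in low birth weight infants and uses the fact that the total population risk is the prevalence-weighted average of the two group risks, which holds for the counts in table 2.

```latex
\begin{align*}
R_T &= \mathrm{Prev}\,R_1 + (1 - \mathrm{Prev})\,R_2
     = R_2\,[\,1 + \mathrm{Prev}(\mathrm{RR} - 1)\,], \\
\mathrm{LAR} &= \frac{R_T - R_2}{R_T}
     = \frac{R_2\,\mathrm{Prev}(\mathrm{RR} - 1)}{R_2\,[\,1 + \mathrm{Prev}(\mathrm{RR} - 1)\,]}
     = \frac{\mathrm{Prev}(\mathrm{RR} - 1)}{1 + \mathrm{Prev}(\mathrm{RR} - 1)}.
\end{align*}
```

With Prev = 0.0717 and RR = 18.959, the final expression gives 0.563, in agreement with the value obtained directly from the two risks.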

This expression for LAR is in common use, even

though its interpretation is not intuitively obvious. If

the prevalence of low birth weight is assumed to be

free of sampling variation (equivalent to assuming that

one of the margins of table 2 is fixed), LAR is seen to

be expressed in terms of the relative risk, for which

confidence limits are easily obtained. These limits can

then be substituted into expression 7 to obtain limits

for LAR. The lower and upper 95 percent confidence

limits for this RR, RR_l and RR_u, estimated using


expressions 2 and A2, are 16.807 and 21.387, respec-

tively. By substitution, the limits for LAR are

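$$\mathrm{LAR}_l = \frac{\mathrm{Prev}(\mathrm{RR}_l - 1)}{1 + \mathrm{Prev}(\mathrm{RR}_l - 1)} = \frac{0.0717(16.807 - 1)}{1 + 0.0717(16.807 - 1)} = 0.531; \tag{8}$$

$$\mathrm{LAR}_u = \frac{\mathrm{Prev}(\mathrm{RR}_u - 1)}{1 + \mathrm{Prev}(\mathrm{RR}_u - 1)} = \frac{0.0717(21.387 - 1)}{1 + 0.0717(21.387 - 1)} = 0.594. \tag{9}$$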

These are almost identical to the values of 0.530 and

0.594 calculated using Fleiss' standard error formula

(10) and to the limits of 0.532 and 0.594 given by

Walter's more complex approach (9). Of course, the

limits obtained by the substitution method depend

on which formula is employed for the relative risk

limits.
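The whole calculation for the birth weight data can be sketched in a few lines. The standard error of log_e RR used below is the usual large-sample formula and is assumed to correspond to expression A2 of the Appendix, which is not reproduced in this section; the variable names are illustrative.

```python
import math

# Cell entries from table 2 (a, b: low birth weight; c, d: normal birth weight).
a, b = 618, 4597
c, d = 422, 67093

prev = (a + b) / (a + b + c + d)        # prevalence of low birth weight
rr = (a / (a + b)) / (c / (c + d))      # relative risk

# Assumed standard error of log(RR), then limits for RR via expression 2.
se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
rr_l = math.exp(math.log(rr) - 1.96 * se_log_rr)
rr_u = math.exp(math.log(rr) + 1.96 * se_log_rr)


def lar_from_rr(rr_value, prevalence):
    """Levin's attributable risk from the relative risk (expression 7)."""
    excess = prevalence * (rr_value - 1)
    return excess / (1 + excess)


lar = lar_from_rr(rr, prev)
lar_l = lar_from_rr(rr_l, prev)  # substitute the lower RR limit
lar_u = lar_from_rr(rr_u, prev)  # substitute the upper RR limit

print(f"RR = {rr:.3f} ({rr_l:.3f} to {rr_u:.3f})")
print(f"LAR = {lar:.3f} ({lar_l:.3f} to {lar_u:.3f})")
```

Because LAR increases with RR when the prevalence is fixed, the lower and upper limits for RR map directly onto the lower and upper limits for LAR.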

Table 3 compares the 95 percent confidence limits

obtained by means of Walter's method, Fleiss'

method, and the substitution method (used as de-

scribed above) for the 21 example tables. For this

application, group 1 is taken as the exposed group

and group 2 as the nonexposed. There is good

agreement between the three approaches, but in

general the substitution limits (lower and upper)

tend to be lower than those given by Walter's

method and higher than those given by Fleiss'.

Thus, the substitution limits would seem to be a

better approximation of Walter's limits than the

limits proposed by Fleiss. In example table S, for

instance, Fleiss' lower limit is less than zero, which

would correspond to a statistically nonsignificant

(5 percent level) association between exposure and

mortality. Based on the (conservative) continuity-

corrected chi-square, the association is statistically

significant, and both the substitution method

and Walter's method give the required nonnegative

limits.
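The claim about example S can be checked with a short calculation. The sketch below computes the uncorrected and continuity-corrected chi-square statistics for that table and compares them with 3.841, the 5 percent critical value on one degree of freedom; it then obtains the lower substitution limit for LAR, with the standard error of log_e RR again assumed to be the usual large-sample formula.

```python
import math

# Example S from table 1: group 1 (exposed) and group 2 (nonexposed).
a, b = 60, 440   # group 1: dead, alive
c, d = 40, 460   # group 2: dead, alive
n = a + b + c + d

# Product of the four marginal totals of the 2 x 2 table.
margins = (a + b) * (c + d) * (a + c) * (b + d)
chi2_uncorrected = n * (a * d - b * c) ** 2 / margins
chi2_corrected = n * (abs(a * d - b * c) - n / 2) ** 2 / margins
critical_value = 3.841  # chi-square, 1 degree of freedom, 5 percent level

# Lower substitution limit for LAR, with equal group sizes (Prev = 0.5).
prev = (a + b) / n
rr = (a / (a + b)) / (c / (c + d))
se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
rr_l = math.exp(math.log(rr) - 1.96 * se_log_rr)
lar_l = prev * (rr_l - 1) / (1 + prev * (rr_l - 1))

print(f"chi-square: {chi2_uncorrected:.3f} (uncorrected), "
      f"{chi2_corrected:.3f} (continuity-corrected)")
print(f"lower limit for LAR by substitution: {lar_l:.3f}")
```

Both statistics exceed the critical value, and the lower substitution limit is positive, so the interval and the significance test agree for this table.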

Population genetics

Under Hardy-Weinberg equilibrium, the frequency

(the proportion or percentage) of a rare recessive gene

TABLE 3. Lower and upper 95 percent confidence limits for Levin's attributable risk (LAR_l and LAR_u), calculated using two published methods and the substitution method in 21 example data sets*

                    Confidence limit calculation method
                    Method of Fleiss (10)   Method of Walter (9)    Substitution method
Example   LAR       LAR_l       LAR_u       LAR_l       LAR_u       LAR_l       LAR_u
A         0.200     0.017       0.349       0.039       0.361       0.035       0.355
B         0.333     0.118       0.498       0.152       0.515       0.142       0.501
C         0.500     0.251       0.666       0.304       0.696       0.280       0.670
D         0.200     0.075       0.308       0.086       0.314       0.084       0.311
E         0.333     0.188       0.453       0.205       0.462       0.200       0.455
F         0.500     0.335       0.624       0.361       0.639       0.349       0.626
G         0.200     0.011       0.353       0.031       0.369       0.027       0.361
H         0.333     0.124       0.493       0.154       0.513       0.144       0.499
I         0.500     0.272       0.657       0.315       0.685       0.294       0.662
J         0.200     0.146       0.250       0.149       0.251       0.148       0.250
K         0.333     0.272       0.390       0.276       0.391       0.275       0.389
L         0.500     0.432       0.560       0.438       0.562       0.436       0.559
M         0.200     0.120       0.273       0.125       0.275       0.124       0.274
N         0.333     0.247       0.410       0.253       0.414       0.251       0.411
O         0.500     0.408       0.577       0.417       0.583       0.413       0.578
P         0.200     0.068       0.314       0.078       0.322       0.075       0.318
Q         0.333     0.193       0.449       0.207       0.460       0.202       0.453
R         0.500     0.356       0.612       0.374       0.626       0.364       0.615
S         0.200     -0.005      0.363       0.017       0.383       0.013       0.374
T         0.333     0.117       0.497       0.147       0.520       0.136       0.506
U         0.500     0.277       0.654       0.316       0.684       0.295       0.661

* See table 1.


(q) in the population can be estimated from the square

root of the birth incidence of homozygotes (I) (11, p. 5):

$$q = \sqrt{I}. \tag{10}$$

An approximation for the standard error of this esti-

mate of q is usually given by (11, p. 5):

$$\mathrm{SE}(q) = \sqrt{\frac{1 - q^2}{4N}}, \tag{11}$$

where N is the number of births on which the birth

incidence is based. In the study described above, the

17 (homozygous) cases of Wilson's disease in

1,240,091 births gave an incidence of 13.71 per mil-

lion and a gene frequency of 0.37 percent. The 95

percent confidence interval for the latter figure using

the normal approximation (expressions 1 and 11) is

0.28-0.46 percent.

The substitution method offers an alternative to this

approach. The 95 percent confidence limits for the

incidence rate were previously determined from ex-

pressions 4 and 5 (using the substitution method with

the Poisson distribution) to be 7.99 per million and

21.95 per million. Applying the substitution method

again by taking the square roots of these limits (ex-

pression 10), the lower and upper 95 percent confi-

dence limits for the gene frequency are 0.28 percent

and 0.47 percent—almost identical to those obtained

using the standard error method.
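The two calculations for the Wilson's disease data can be placed side by side. This sketch assumes that expression 11 reads SE(q) = sqrt((1 - q^2)/4N), which reproduces the interval quoted for the normal approximation, and it takes the Poisson limits of 9.903 and 27.219 for x = 17 directly from the published tables.

```python
import math

cases, births = 17, 1_240_091

# Point estimate: gene frequency as the square root of the birth incidence.
q = math.sqrt(cases / births)

# Usual method: normal approximation with the (assumed) form of expression 11.
se_q = math.sqrt((1 - q ** 2) / (4 * births))
usual_l, usual_u = q - 1.96 * se_q, q + 1.96 * se_q

# Substitution method: square roots of the limits for the incidence, which are
# themselves obtained from the tabulated exact Poisson limits for the count.
x_l, x_u = 9.903, 27.219
subst_l, subst_u = math.sqrt(x_l / births), math.sqrt(x_u / births)

print(f"gene frequency: {100 * q:.2f} percent")
print(f"usual method:        {100 * usual_l:.2f} to {100 * usual_u:.2f} percent")
print(f"substitution method: {100 * subst_l:.2f} to {100 * subst_u:.2f} percent")
```

The substitution interval is slightly asymmetrical around the estimate, reflecting the skewness of the Poisson distribution, whereas the normal approximation forces symmetry.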

Table 4 compares the substitution method with the

usual approach for a series of birth incidence examples

covering a range of gene frequencies and total births.

Agreement is close, particularly with large numbers of

affected births. When there was only one affected birth

(examples AA and EE), however, the lower 95 percent

limits for the gene frequency were much higher using

the substitution method. It should be realized, of

course, that use of expressions 1 and 11 does not give

exact limits for the gene frequency, since the calcula-

tion is based on an approximate formula and on the

assumption that the sampling distribution of q is nor-

mal. The substitution limits, on the other hand, based

on a transformation of the exact Poisson limits for

the incidence rate, can be considered exact in this

situation.

Although the substitution method is more accurate,

the simplicity of the standard error formula in this

example gives no advantage to the new approach in

terms of ease of use. If, however, the estimation were

to allow for inbreeding, the formula relating the gene

frequency to the incidence is more intricate and in-

TABLE 4. Lower and upper 95 percent confidence limits for gene frequency (q_l and q_u), calculated using the usual method and the substitution method in a series of 22 birth incidence examples

                                                  Confidence limit calculation method
                                                  Usual method        Substitution method
Example   No. of        Total no.     Gene        q_l       q_u       q_l       q_u
          affected      of births     frequency
          births                      (q, %)
AA        1             1,000         3.16        0.06      6.28      0.50      7.46
BB        2             1,000         4.47        1.38      7.57      1.56      8.50
CC        5             1,000         7.07        3.98      10.16     4.03      10.80
DD        10            1,000         10.00       6.92      13.08     6.92      13.56
EE        1             5,000         1.41        0.03      2.80      0.23      3.34
FF        2             5,000         2.00        0.61      3.39      0.70      3.80
GG        5             5,000         3.16        1.78      4.55      1.80      4.83
HH        10            5,000         4.47        3.09      5.86      3.10      6.06
II        20            5,000         6.32        4.94      7.71      4.94      7.86
JJ        50            5,000         10.00       8.62      11.38     8.62      11.48
KK        5             10,000        2.24        1.26      3.22      1.27      3.42
LL        10            10,000        3.16        2.18      4.14      2.19      4.29
MM        25            10,000        5.00        4.02      5.98      4.02      6.07
NN        50            10,000        7.07        6.09      8.05      6.09      8.12
OO        100           10,000        10.00       9.02      10.98     9.02      11.03
PP        5             50,000        1.00        0.56      1.44      0.57      1.53
QQ        10            50,000        1.41        0.98      1.85      0.98      1.92
RR        25            50,000        2.24        1.80      2.67      1.80      2.72
SS        50            50,000        3.16        2.72      3.60      2.72      3.63
TT        100           50,000        4.47        4.03      4.91      4.03      4.93
UU        250           50,000        7.07        6.63      7.51      6.63      7.52
VV        500           50,000        10.00       9.56      10.44     9.56      10.45
