# Generalized P-Values and Confidence Intervals: A Novel Approach for Analyzing Lognormally Distributed Exposure Data

**Abstract**

The problem of assessing occupational exposure using the mean of a lognormal distribution is addressed. The novel concepts of generalized p-values and generalized confidence intervals are applied for testing hypotheses and computing confidence intervals for a lognormal mean. The proposed methods perform well, are applicable to small sample sizes, and are easy to implement. Power studies and sample size calculation are also discussed. Computational details and a source for the computer program are given. The procedures are also extended to compare two lognormal means and to make inference about a lognormal variance. In fact, our approach based on generalized p-values and generalized confidence intervals is easily adapted to deal with any parametric function involving one or two lognormal distributions. Several examples involving industrial exposure data are used to illustrate the methods. An added advantage of the generalized variables approach is the ease of computation and implementation: the procedures can be easily coded in a programming language. Furthermore, extensive numerical computations by the authors show that the results based on the generalized p-value approach are essentially equivalent to those based on Land's method. We want to draw the attention of the industrial hygiene community to this accurate and unified methodology to deal with any parameter associated with the lognormal distribution.

# Figures

Journal of Occupational and Environmental Hygiene, 3:642–650

ISSN: 1545-9624 print / 1545-9632 online

Copyright © 2006 JOEH, LLC

DOI: 10.1080/15459620600961196


K. Krishnamoorthy,¹ Thomas Mathew,² and Gurumurthy Ramachandran³

¹ Department of Mathematics, University of Louisiana at Lafayette, Lafayette, Louisiana

² Department of Mathematics and Statistics, University of Maryland, Baltimore, Maryland

³ Division of Environmental Health Sciences, School of Public Health, University of Minnesota, Minneapolis, Minnesota


Keywords: confidence interval, hypothesis test, Type 1 error

Address correspondence to: K. Krishnamoorthy, Department of Mathematics, 217 Maxim D. Doucet Hall, P.O. Box 41010, University of Louisiana at Lafayette, Lafayette, LA 70504; e-mail: krishna@louisiana.edu.

INTRODUCTION

It has been well established that occupational exposure data and pollution data very often follow the lognormal distribution. Since Oldham's(1) 1953 report that the distribution of dust levels in coal mines is approximately lognormal, several authors have postulated the lognormal model for studying and analyzing workplace pollutant data.(2–8) The most common explanation for this phenomenon is as follows: workplace concentrations are related to rates of contaminant generation and ventilation rates that are variable. Workers move around in this nonuniform environment, and their activity patterns also vary from day to day. The workers' exposures are related to the above factors in a multiplicative manner. Irrespective of the distribution of contaminant generation rates, ventilation rates, and worker activity patterns, their multiplicative interactions typically lead to exposure distributions that are right skewed and described well by the lognormal probability distribution.

The validity of the lognormality assumption for a given data set can be easily tested. By definition, the data $y_1, \ldots, y_n$ follow a lognormal distribution if $\ln(y_1), \ldots, \ln(y_n)$ follow a normal distribution (where "ln" denotes the natural logarithm). Thus, testing for lognormality is simply a matter of validating the normality assumption for the logged data, and this can be done using widely available software programs such as Minitab, SPSS, and SAS, or using popular methods such as the Shapiro-Wilk test or the Anderson-Darling test.

If we let $y$ denote the lognormally distributed exposure measurement of an employee, then $x = \ln(y)$ is distributed normally with mean and standard deviation denoted by $\mu_l$ and $\sigma_l$, respectively, and the mean of the lognormal distribution (say, $\mu$) is given by

$$\mu = \exp(\eta), \quad \text{where } \eta = \mu_l + \sigma_l^2/2. \tag{1}$$

If repeated exposure measurements are available from a single worker, then $\mu$ can be viewed as the mean of the worker, and our approach can be used to estimate the individual worker's mean. Our approach is also applicable to estimate the mean of a similarly exposed group (SEG) of workers if only one exposure measurement is obtained per worker or to estimate the mean contaminant level in a workplace. If multiple measurements exist for each worker, and both between- and within-worker variability are significant and need to be accounted for, then one should use the random effects model.(9,10,11)

642 Journal of Occupational and Environmental Hygiene November 2006

NOMENCLATURE

| Symbol | Meaning |
|---|---|
| $y_1, \ldots, y_n$ | sample from a lognormal distribution |
| $x_1, \ldots, x_n$ | logged data; $x_i = \ln(y_i)$, $i = 1, \ldots, n$ |
| $\mu_l$ | population mean of the logged data |
| $\sigma_l$ | population standard deviation of the logged data |
| $\mu$ | mean of the lognormal distribution; $\mu = \exp(\mu_l + \sigma_l^2/2)$ |
| $\sigma^2$ | variance of the lognormal distribution; $\sigma^2 = \exp(2\mu_l + \sigma_l^2)[\exp(\sigma_l^2) - 1]$ |
| $\sigma_g$ | geometric standard deviation; $\sigma_g = \exp(\sigma_l)$ |
| $\bar{x}$ | sample mean of the logged data |
| $s$ | sample standard deviation of the logged data |
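As a quick illustration of how the lognormal quantities relate to the log-scale parameters, the following minimal Python sketch evaluates the lognormal mean, variance, and geometric standard deviation from $\mu_l$ and $\sigma_l$. The function name `lognormal_summary` and the dictionary layout are illustrative choices, not part of the original article.

```python
import math


def lognormal_summary(mu_l: float, sigma_l: float) -> dict:
    """Lognormal quantities from the log-scale mean mu_l and SD sigma_l."""
    eta = mu_l + sigma_l ** 2 / 2.0          # eta = mu_l + sigma_l^2 / 2
    return {
        "eta": eta,
        "mean": math.exp(eta),               # lognormal mean, exp(eta)
        # lognormal variance, exp(2 mu_l + sigma_l^2) * [exp(sigma_l^2) - 1]
        "variance": math.exp(2.0 * mu_l + sigma_l ** 2) * math.expm1(sigma_l ** 2),
        "gsd": math.exp(sigma_l),            # geometric standard deviation
    }
```

For example, `lognormal_summary(0.0, 1.0)` returns a mean of $e^{1/2}$ and a geometric standard deviation of $e$.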

The sample mean exposure can be used as an estimate of the long-term average exposure or the average exposure for a SEG of workers over an extended period of time. For substances that cause health effects due to chronic exposures, day-to-day variability in long-term exposures is less health relevant than the long-term mean. For such exposures, the arithmetic mean is the best measure of cumulative exposure over a biologically relevant time period, since the body would have integrated exposures over this time period.(9) The long-term mean is of relevance in occupational epidemiology where the estimated value of the long-term mean is assigned to all workers in a SEG.

Once lognormality has been verified for an exposure sample, inferences on the parameters of the lognormal distribution can be made. Whereas there are currently only a few legal standards and threshold values based on long-term averages, some researchers have explored the statistics of exposures exceeding long-term limits.(3) To show that the mean exposure does not exceed the long-term average exposure limit (LTA-OEL), we may want to test the hypotheses

$$H_0: \mu \ge \text{LTA-OEL} \quad \text{vs.} \quad H_a: \mu < \text{LTA-OEL}. \tag{2}$$

Note that the null and alternative hypotheses in Eq. 2 are set up to look for evidence in favor of $H_a$. Rejection of the null hypothesis in Eq. 2 implies that the exposure level is acceptable.
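Because the exponential function is strictly increasing and the lognormal mean equals $\exp(\eta)$ with $\eta = \mu_l + \sigma_l^2/2$, the hypotheses in Eq. 2 can be restated on the log scale. The restatement is a routine step, written out here because the test is later carried out in terms of $\eta$:

$$H_0: \eta \ge \ln(\text{LTA-OEL}) \quad \text{vs.} \quad H_a: \eta < \ln(\text{LTA-OEL}), \qquad \eta = \mu_l + \sigma_l^2/2.$$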

Another method of assessing workplace exposure, suggested by some investigators,(2,5,8) is based on the proportion of exposure data in excess of the LTA-OEL. Because the proportion of the measurements that are above the LTA-OEL is equal to the proportion of the logged measurements that are above ln(LTA-OEL), this approach reduces to the problem of hypothesis testing about an upper quantile of a normal distribution. This hypothesis testing can be carried out using an appropriate tolerance limit of the normal distribution, and it has been well addressed in the context of assessing occupational exposure by Tuggle,(2) Selvin et al.,(5) and Lyles and Kupper.(8)

In the context of exposure assessment, the problem of comparing two lognormal means will arise when we want to compare exposure levels of two similarly exposed groups of workers, or when we want to compare two exposure assessment methods or two different sampling devices. Thus, let $y_1$ and $y_2$ be lognormally distributed random variables denoting exposure levels at two different sites or measurements obtained by two different methods, and let $\mu_{l1}$, $\mu_{l2}$ and $\sigma_{l1}^2$, $\sigma_{l2}^2$ denote the respective means and variances of the normally distributed random variables $x_1 = \ln(y_1)$ and $x_2 = \ln(y_2)$. Then the means of $y_1$ and $y_2$, say $\mu_1$ and $\mu_2$, respectively, are given by

$$\mu_1 = \exp(\eta_1) \quad \text{and} \quad \mu_2 = \exp(\eta_2), \quad \text{where } \eta_1 = \mu_{l1} + \sigma_{l1}^2/2 \text{ and } \eta_2 = \mu_{l2} + \sigma_{l2}^2/2. \tag{3}$$

For comparing the exposure levels at the two sites, it is of interest to test the hypotheses

$$H_0: \mu_1 \le \mu_2 \quad \text{vs.} \quad H_a: \mu_1 > \mu_2. \tag{4}$$

Land(12) has proposed exact methods for constructing confidence intervals and hypothesis tests for the lognormal mean. His methods, however, are computationally intensive and depend on the standard deviation of the logged data, which makes the necessary tabulation difficult. For this reason, Rappaport and Selvin(3) proposed a simple approximate method that is satisfactory as long as $\sigma_l^2 \le 3$ and $n > 5$. Zhou and Gao(13) reviewed and compared several approximate methods and concluded that all the approximate methods are either too conservative or liberal, except for large samples, in which case a method developed by Cox(12) is satisfactory. Armstrong(14) compared four approximate methods for estimating the confidence intervals (CI) with Land's(12) exact interval. These were (a) the "simple t-interval," (b) the "lognormal t-interval," (c) the Cox interval proposed by Land,(15) and (d) a variation of the Cox interval. Armstrong(14) found that whereas some of these approximate intervals were adequate for large sample sizes ($n \ge 25$) or small geometric standard deviations ($\sigma_g = 1.5$), none of them were accurate for small sample sizes and large $\sigma_g$, which are precisely the situations commonly encountered in occupational exposure assessment. Hewett and Ganser(16) have developed procedures that considerably simplify the calculation of Land's exact confidence interval. In a recent article, Taylor and colleagues(17) evaluated several approximate confidence intervals in terms of their coverage probabilities and also suggested an improved approximation.

Very little work is available on the problem of comparing two lognormal means. A large sample test is derived in Zhou et al.(18) for testing the equality of two lognormal means.


The purpose of this article is to illustrate the application of a novel approach for carrying out tests and confidence intervals for a single lognormal mean, for the ratio of two lognormal means, for a single lognormal variance, and for the ratio of two lognormal variances. The approach is based on the concepts of generalized p-values and generalized confidence intervals, collectively referred to as the generalized variables method. The generalized variables methodology is already described in Krishnamoorthy and Mathew(19) for obtaining tests and confidence intervals for a single lognormal mean, and for comparing two lognormal means; however, the lognormal variance is not considered in that article. In this article, we extend this approach for obtaining confidence intervals for the lognormal variance. Even though the lognormal mean is addressed by Krishnamoorthy and Mathew, we shall first briefly review the generalized variables procedure for the lognormal means described in their article and then apply it to the lognormal variance. We want to draw the attention of the industrial hygiene community to an accurate and unified methodology to deal with any parameter associated with the lognormal distribution. An added advantage of the generalized variables approach is the ease of computation and implementation: the procedures can be easily coded in a programming language. Furthermore, extensive numerical results by the authors(19) show that for one-sided tests concerning a single lognormal mean, the results based on the generalized p-value approach are essentially equivalent to those based on Land's(12) method.

The concept of generalized p-value was originally introduced by Tsui and Weerahandi,(20) and the concept of generalized confidence intervals was introduced by Weerahandi.(21) A later book by Weerahandi(22) illustrates several nonstandard statistical problems where the generalized variable approach produced remarkably useful results. Because the concepts are not well known, we have presented them in a brief outline in Appendix 1. In this article, we first present generalized variables for making inferences about a normal mean and variance. We then outline the hypothesis testing and interval estimation procedures for a single lognormal mean and then for the difference between two lognormal means. The necessary algorithms and Fortran and SAS programs to carry out our procedures are posted at http://www.ucs.louisiana.edu/~kxk4695 and are available as an appendix to the online version of this article on the JOEH website. In a later section, we also address the problem of obtaining tests and confidence intervals concerning a single lognormal variance, or the ratio of two lognormal variances. A confidence interval for the lognormal variance should be of interest to assess the variability among exposure measurements.

We have used two examples to illustrate our methods. The first example involves a sample of air lead levels collected from a lab by the National Institute for Occupational Safety and Health (NIOSH) health hazard evaluation staff. The problem is to assess the contaminant level within the facility based on a sample. We also illustrate the generalized variable method for testing the equality of the means of measurements obtained by two different methods. For this purpose we used the data presented in O'Brien et al.(23)

Generalized Variables for the Mean and Variance of a Normal Distribution

As the mean of a lognormal distribution is a function of the mean and variance of a normal distribution, we present the generalized variables for the mean and variance of a normal population. The details of construction of generalized variables can be found in Krishnamoorthy and Mathew(19) or in Weerahandi,(22) and for easy reference they are provided in Appendix 1. Let $X_1, \ldots, X_n$ be a sample from a normal population with mean $\mu_l$ and variance $\sigma_l^2$, $N(\mu_l, \sigma_l^2)$. The sample mean and the sample variance of the $X_i$'s are respectively given by

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \quad \text{and} \quad S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2. \tag{5}$$

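Let $Z$ and $V$ be independent random variables with

$$Z = \frac{\sqrt{n}(\bar{X} - \mu_l)}{\sigma_l} \sim N(0, 1), \quad \text{and} \quad V^2 = \frac{(n-1)S^2}{\sigma_l^2} \sim \chi^2_{n-1}, \tag{6}$$

where $\chi^2_r$ denotes the central chi-square distribution with $r$ degrees of freedom. Let $\bar{x}$ and $s$ be the observed values of $\bar{X}$ and $S$, respectively. Following the procedure outlined in the appendix, a generalized variable for making inferences on $\mu_l$ is given by

$$\begin{aligned} G_{\mu_l} &= \bar{x} - \frac{\bar{X} - \mu_l}{\sigma_l/\sqrt{n}}\,\frac{\sigma_l}{\sqrt{n}}\,\frac{s}{S} - \mu_l \\ &= \bar{x} - \frac{Z}{V/\sqrt{n-1}}\,\frac{s}{\sqrt{n}} - \mu_l \\ &= T_{\mu_l} - \mu_l, \end{aligned} \tag{7}$$

where

$$T_{\mu_l} = \bar{x} - \frac{Z}{V/\sqrt{n-1}}\,\frac{s}{\sqrt{n}}, \tag{8}$$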

and $Z$ and $V$ are as defined in Eq. 6. In the above, $G_{\mu_l}$ denotes the generalized test variable for $\mu_l$, and $T_{\mu_l}$ denotes the generalized pivot statistic (the statistic that can be used for making inference about the unknown parameter) for $\mu_l$. We shall now show that $G_{\mu_l}$ satisfies the three conditions given in Eq. A3 of Appendix 1: (1) for a given $\bar{x}$ and $s$, the distribution of $G_{\mu_l}$ does not depend on the nuisance parameter $\sigma_l^2$; (2) it follows from Step 1 of Eq. 7 that the value of $G_{\mu_l}$ at $(\bar{X}, S) = (\bar{x}, s)$ is 0, because $T_{\mu_l}$ then equals $\mu_l$; (3) it follows from Step 3 of Eq. 7 that, for a given $\bar{x}$ and $s$, the generalized variable is stochastically decreasing with respect to $\mu_l$. Hence the generalized p-value for testing $H_0: \mu_l \ge \mu_{l0}$ vs. $H_a: \mu_l < \mu_{l0}$ is given by

$$\begin{aligned} \sup_{H_0} P(G_{\mu_l} \ge 0) &= P(G_{\mu_l} \ge 0 \mid \mu_l = \mu_{l0}) \\ &= P(T_{\mu_l} \ge \mu_{l0}) \\ &= P\left(t_{n-1} < \frac{\bar{x} - \mu_{l0}}{s/\sqrt{n}}\right), \end{aligned}$$

which is the p-value based on the usual t-test. To get the last equality, we used the fact that $Z/(V/\sqrt{n-1})$ follows a Student's t distribution with $n - 1$ degrees of freedom, $t_{n-1}$.

For a given $\bar{x}$ and $s$, the lower $\alpha/2$ quantile $T_{\mu_l,\alpha/2}$ of $T_{\mu_l}$ and the upper $\alpha/2$ quantile $T_{\mu_l,1-\alpha/2}$ of $T_{\mu_l}$ form a $1 - \alpha$ generalized confidence interval for $\mu_l$. This generalized CI is indeed equal to the usual t-interval; that is,

$$\left(T_{\mu_l,\alpha/2},\ T_{\mu_l,1-\alpha/2}\right) = \left(\bar{x} - t_{n-1,1-\alpha/2}\,\frac{s}{\sqrt{n}},\ \bar{x} + t_{n-1,1-\alpha/2}\,\frac{s}{\sqrt{n}}\right),$$

where $t_{m,p}$ denotes the 100$p$th percentile of the Student's t distribution with $m$ degrees of freedom.

The generalized test variable for the variance $\sigma_l^2$ is given by

$$G_{\sigma_l^2} = \frac{s^2}{V^2/(n-1)} - \sigma_l^2 = T_{\sigma_l^2} - \sigma_l^2, \tag{9}$$

where

$$T_{\sigma_l^2} = \frac{s^2}{V^2/(n-1)} \tag{10}$$

is the generalized pivot statistic, and $V$ is as defined in Eq. 6. Again, for a given $s^2$, the generalized $1 - \alpha$ CI for $\sigma_l^2$ is formed by the lower and upper $\alpha/2$ quantiles of $T_{\sigma_l^2}$ and is equal to the usual CI based on a chi-square distribution with $n - 1$ degrees of freedom.
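The pivots for the normal mean and variance can be simulated directly, because for fixed $\bar{x}$ and $s$ they depend only on a standard normal variate and an independent chi-square variate. The following Python sketch, using only the standard library, draws the two pivots and reads off empirical quantiles. It is an illustration under stated assumptions, not the authors' Fortran or SAS program; the function names, the simple quantile rule, the default number of draws, and the seed are all choices made here.

```python
import math
import random


def draw_normal_pivots(xbar, s, n, rng):
    """One draw of the generalized pivots for mu_l and sigma_l^2."""
    z = rng.gauss(0.0, 1.0)                      # Z ~ N(0, 1)
    v2 = rng.gammavariate((n - 1) / 2.0, 2.0)    # V^2 ~ chi-square(n - 1)
    scaled = v2 / (n - 1)                        # V^2 / (n - 1)
    t_mu = xbar - z / math.sqrt(scaled) * s / math.sqrt(n)
    t_sigma2 = s ** 2 / scaled
    return t_mu, t_sigma2


def empirical_quantile(sorted_values, p):
    """Simple empirical quantile of an ascending list."""
    index = min(len(sorted_values) - 1, int(p * len(sorted_values)))
    return sorted_values[index]


def generalized_ci_normal(xbar, s, n, alpha=0.05, draws=100_000, seed=1):
    """Monte Carlo generalized CIs for mu_l and sigma_l^2."""
    rng = random.Random(seed)
    pairs = [draw_normal_pivots(xbar, s, n, rng) for _ in range(draws)]
    t_mu = sorted(p[0] for p in pairs)
    t_sigma2 = sorted(p[1] for p in pairs)
    ci_mu = (empirical_quantile(t_mu, alpha / 2),
             empirical_quantile(t_mu, 1 - alpha / 2))
    ci_sigma2 = (empirical_quantile(t_sigma2, alpha / 2),
                 empirical_quantile(t_sigma2, 1 - alpha / 2))
    return ci_mu, ci_sigma2
```

Up to Monte Carlo error, the two intervals returned should agree with the usual t-interval and chi-square interval, which gives a convenient check on any implementation.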

Even though the generalized variable method produced exact inferential procedures for the normal parameters, in general, the generalized variable method is not necessarily exact. In other words, the generalized p-value may not satisfy the conventional properties of the usual p-value. In such cases, the properties (such as Type I error rates of the generalized variable test and coverage probability of the generalized confidence limits) of the generalized variable method should be evaluated numerically.

Suppose we are interested in making inference about a function of $\mu_l$ and $\sigma_l^2$, say, $q(\mu_l, \sigma_l^2)$. Then, the generalized test variable for $q(\mu_l, \sigma_l^2)$ is given by $q(T_{\mu_l}, T_{\sigma_l^2}) - q(\mu_l, \sigma_l^2)$, and the generalized pivot statistic is given by $q(T_{\mu_l}, T_{\sigma_l^2})$. For a given $\bar{x}$ and $s$, the pivot $q(T_{\mu_l}, T_{\sigma_l^2})$ depends only on the random variables $Z$ and $V$, whose distributions do not depend on any unknown parameters. Therefore, Monte Carlo simulation can be used to find a generalized CI for $q(\mu_l, \sigma_l^2)$. This will be illustrated for the lognormal case in the following section.

Inference about a Lognormal Mean

Let $y_1, \ldots, y_n$ be a sample of exposure measurements and let $x_i = \ln(y_i)$, $i = 1, \ldots, n$. Then, $x_1, \ldots, x_n$ is a random sample from a $N(\mu_l, \sigma_l^2)$ distribution. Since the lognormal mean $\exp(\mu_l + \sigma_l^2/2)$ is a function of $\mu_l$ and $\sigma_l^2$, the results of the preceding section can be readily applied to construct a generalized test variable and a generalized pivot statistic for the lognormal mean. From the preceding section, we have the generalized test variable for making inference on $\eta = \mu_l + \sigma_l^2/2$ as

$$\begin{aligned} G_\eta &= T_{\mu_l} + \frac{T_{\sigma_l^2}}{2} - \eta \\ &= \bar{x} - \frac{Z}{V/\sqrt{n-1}}\,\frac{s}{\sqrt{n}} + \frac{s^2}{2V^2/(n-1)} - \eta \\ &= T_\eta - \eta, \end{aligned} \tag{11}$$

where

$$T_\eta = \bar{x} - \frac{Z}{V/\sqrt{n-1}}\,\frac{s}{\sqrt{n}} + \frac{s^2}{2V^2/(n-1)} \tag{12}$$

and $Z$ and $V$ are as defined in Eq. 6. For given sample statistics $\bar{x}$ and $s$, we note that $G_\eta$ is stochastically decreasing in $\eta$, and hence the generalized p-value for testing the hypotheses in Eq. 2 is given by

$$P(G_\eta \ge 0 \mid \eta = \ln(\text{LTA-OEL})) = P(T_\eta \ge \ln(\text{LTA-OEL})). \tag{13}$$

The null hypothesis in Eq. 2 will be rejected whenever the probability in Eq. 13 is less than the nominal level $\alpha$.

The generalized pivot statistic for interval estimation of $\eta$ is given by $T_\eta$. Appropriate quantiles of $T_\eta$ can be used to obtain confidence intervals for $\eta$ or for the lognormal mean $\exp(\eta)$. Specifically, if $T_{\eta,p}$, $0 < p < 1$, denotes the $p$th quantile of $T_\eta$, then $(T_{\eta,\alpha/2},\ T_{\eta,1-\alpha/2})$ is a $1 - \alpha$ generalized confidence interval for $\eta$, and $(\exp(T_{\eta,\alpha/2}),\ \exp(T_{\eta,1-\alpha/2}))$ is a $1 - \alpha$ generalized confidence interval for the lognormal mean $\exp(\eta)$. One-sided limits for $\eta$ and $\exp(\eta)$ can be similarly obtained. In particular, a $1 - \alpha$ lower limit for $\exp(\eta)$ is given by $\exp(T_{\eta,\alpha})$.

Through numerical results, Krishnamoorthy and Mathew(19) noted that the confidence limits based on Land's(12) approach and the generalized confidence interval are practically the same. However, computationally, our approach is very easy to implement. The simple algorithm presented in Appendix 2 of Krishnamoorthy and Mathew(19) can be used for computing the generalized p-value and the generalized confidence interval.
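The Monte Carlo computation of the generalized p-value in Eq. 13 and of the generalized confidence interval for the lognormal mean can be sketched in Python as follows. This is a minimal illustration assuming only the standard library; it is not the authors' posted program, and the function names, the default of 100,000 draws, the seed, and the quantile rule are assumptions made here. The `AIR_LEAD` list holds the air lead measurements from Example 1, and any `limit` passed in stands for a user-supplied LTA-OEL.

```python
import math
import random
import statistics

# Air lead levels (ug/m^3) from Example 1 of this article.
AIR_LEAD = [200, 120, 15, 7, 8, 6, 48, 61, 380, 80, 29, 1000, 350, 1400, 110]


def draw_t_eta(xbar, s, n, rng):
    """One draw of the generalized pivot T_eta (Eq. 12)."""
    z = rng.gauss(0.0, 1.0)                                  # Z ~ N(0, 1)
    scaled = rng.gammavariate((n - 1) / 2.0, 2.0) / (n - 1)  # V^2 / (n - 1)
    return xbar - z / math.sqrt(scaled) * s / math.sqrt(n) + s ** 2 / (2.0 * scaled)


def lognormal_mean_inference(y, limit, alpha=0.05, draws=100_000, seed=1):
    """Generalized p-value for H0: mean >= limit and a 1 - alpha CI for the mean."""
    x = [math.log(v) for v in y]              # logged data
    n = len(x)
    xbar = statistics.fmean(x)
    s = statistics.stdev(x)                   # sample SD with divisor n - 1
    rng = random.Random(seed)
    t = sorted(draw_t_eta(xbar, s, n, rng) for _ in range(draws))
    # Eq. 13: proportion of pivot draws at or above ln(limit)
    p_value = sum(v >= math.log(limit) for v in t) / draws
    lower = t[int(draws * alpha / 2)]
    upper = t[min(draws - 1, int(draws * (1 - alpha / 2)))]
    return p_value, (math.exp(lower), math.exp(upper))
```

A call such as `lognormal_mean_inference(AIR_LEAD, limit=L)` returns the p-value for a chosen limit `L` together with the two-sided interval; the null hypothesis in Eq. 2 would be rejected when the returned p-value falls below $\alpha$.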

Power Studies and Sample Size Calculation for Testing a Lognormal Mean

We shall now discuss the power of the test based on the generalized p-value in Eq. 13. For a given sample size $n$ and for a given value of $\mu_l$ and $\sigma_l$ such that $H_a$ in Eq. 2 holds (i.e., $\eta = \mu_l + \sigma_l^2/2 < \ln(\text{LTA-OEL})$), the power of the test can be estimated by Monte Carlo simulation. In practice, however, practitioners are mainly interested in finding the required sample size to have a specified power at a given level of significance. The sample size can be calculated using an iterative method. For power calculation, an algorithm and Fortran and SAS programs based on the algorithm are posted at http://www.ucs.louisiana.edu/~kxk4695 and are available as an appendix to the online version of this article. Using this program, we computed sample sizes that are required to have a power of 0.90 at the level of significance $\alpha = 0.05$ for various parameter configurations, and these are presented in Table I. As an example, if an employer speculates that the mean exposure level is 40% (the value $R$ in Table I) of the LTA-OEL, and the geometric standard deviation is 2.0, then the required sample size to have a power of at least 0.90 at the level 0.05 is 13.

We observe from Table I that the power of the test increases as the ratio $R$ decreases, which is a natural requirement for a test. We also note that the power decreases as $\sigma_g$ increases and, hence, large samples are required to make correct decisions when $\sigma_g$ is expected to be large.

TABLE I. Sample Size for Testing Equation 2 to Attain a Power of 0.90 at the Level of 0.05, Using the Generalized P-Value Test

| $R$ | $\sigma_g = 1.5$ | $\sigma_g = 2.0$ | $\sigma_g = 2.5$ | $\sigma_g = 3.0$ | $\sigma_g = 3.5$ |
|---|---|---|---|---|---|
| 0.1 | 4 (.96) | 6 (.94) | 8 (.90) | 11 (.90) | 13 (.91) |
| 0.2 | 4 (.90) | 7 (.91) | 11 (.91) | 16 (.90) | 21 (.91) |
| 0.3 | 5 (.93) | 10 (.91) | 16 (.90) | 21 (.91) | 30 (.90) |
| 0.4 | 6 (.91) | 13 (.90) | 23 (.91) | 35 (.91) | 45 (.90) |
| 0.5 | 8 (.93) | 18 (.91) | 33 (.90) | 52 (.90) | 68 (.90) |
| 0.7 | 18 (.91) | 56 (.90) | 99 (.90) | 162 (.90) | 235 (.90) |
| 0.8 | 37 (.90) | 120 (.90) | 241 (.90) | 363 (.90) | 563 (.90) |

Note: $R = \mu/\text{LTA-OEL}$, where $\mu$ is the lognormal mean; $\sigma_g = \exp(\sigma_l)$ = geometric standard deviation; the numbers in parentheses represent actual attained powers; LTA-OEL = 1.0.
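The Monte Carlo power estimate and the iterative sample size search described in this section can be sketched as follows. The sketch is a nested simulation written in Python with the standard library only, and it is not the authors' Fortran or SAS program. The function names, the simple upward search over $n$, and the simulation sizes are assumptions made for illustration; the sufficient statistics $\bar{x}$ and $s$ are simulated directly rather than generating individual exposures, and the LTA-OEL is fixed at 1 as in Table I.

```python
import math
import random


def generalized_p_value(xbar, s, n, log_limit, rng, draws):
    """Monte Carlo estimate of P(T_eta >= log_limit), as in Eq. 13."""
    count = 0
    for _ in range(draws):
        z = rng.gauss(0.0, 1.0)
        scaled = rng.gammavariate((n - 1) / 2.0, 2.0) / (n - 1)   # V^2 / (n - 1)
        t_eta = (xbar - z / math.sqrt(scaled) * s / math.sqrt(n)
                 + s ** 2 / (2.0 * scaled))
        if t_eta >= log_limit:
            count += 1
    return count / draws


def estimate_power(n, ratio, gsd, alpha=0.05, outer=500, inner=1000, seed=1):
    """Estimated power when lognormal mean / LTA-OEL = ratio and LTA-OEL = 1."""
    rng = random.Random(seed)
    sigma_l = math.log(gsd)
    mu_l = math.log(ratio) - sigma_l ** 2 / 2.0    # so that exp(eta) = ratio
    rejections = 0
    for _ in range(outer):
        # Simulate xbar and s from their sampling distributions.
        xbar = rng.gauss(mu_l, sigma_l / math.sqrt(n))
        s = sigma_l * math.sqrt(rng.gammavariate((n - 1) / 2.0, 2.0) / (n - 1))
        if generalized_p_value(xbar, s, n, 0.0, rng, inner) < alpha:
            rejections += 1
    return rejections / outer


def required_sample_size(ratio, gsd, target=0.90, n_max=200, **kwargs):
    """Smallest n (up to n_max) whose estimated power reaches the target."""
    for n in range(2, n_max + 1):
        if estimate_power(n, ratio, gsd, **kwargs) >= target:
            return n
    return None
```

Because each power value is itself a Monte Carlo estimate, the sample size returned near the 0.90 boundary can shift by one or two units between runs; larger `outer` and `inner` values reduce that noise at the cost of run time.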

Comparison of Two Lognormal Means

Consider the independent lognormal random variables $y_1$ and $y_2$ so that $x_1 = \ln(y_1) \sim N(\mu_{l1}, \sigma_{l1}^2)$ and $x_2 = \ln(y_2) \sim N(\mu_{l2}, \sigma_{l2}^2)$. Then the lognormal means are given by $E(y_1) = \exp(\eta_1)$ and $E(y_2) = \exp(\eta_2)$, where

$$\eta_1 = \mu_{l1} + \sigma_{l1}^2/2 \quad \text{and} \quad \eta_2 = \mu_{l2} + \sigma_{l2}^2/2. \tag{14}$$

Thus, hypothesis tests and confidence intervals for the ratio of the two lognormal means are respectively equivalent to those for the difference $\eta_1 - \eta_2$. We shall now develop generalized p-values and generalized confidence intervals for this problem.

We shall first consider the testing problem

$$H_0: \eta_1 \le \eta_2 \quad \text{vs.} \quad H_a: \eta_1 > \eta_2. \tag{15}$$

Let $y_{1j}$, $j = 1, \ldots, n_1$, and $y_{2j}$, $j = 1, \ldots, n_2$, denote random samples from the lognormal distributions of $y_1$ and $y_2$, respectively. Let $x_{1j} = \ln(y_{1j})$, $j = 1, \ldots, n_1$, and $x_{2j} = \ln(y_{2j})$, $j = 1, \ldots, n_2$. The sample means $\bar{x}_1$ and $\bar{x}_2$ and the sample variances $s_1^2$ and $s_2^2$ are then given by

$$\bar{x}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij} \quad \text{and} \quad s_i^2 = \frac{1}{n_i - 1}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2, \quad i = 1, 2. \tag{16}$$

It follows from Eq. 12 that the generalized variable for $\eta_i$ can be expressed as

$$T_{\eta_i} = \bar{x}_i - \frac{Z_i}{V_i/\sqrt{n_i - 1}}\,\frac{s_i}{\sqrt{n_i}} + \frac{s_i^2}{2V_i^2/(n_i - 1)}, \quad i = 1, 2, \tag{17}$$

where $Z_i \sim N(0, 1)$ and $V_i^2 \sim \chi^2_{n_i - 1}$, for $i = 1, 2$, and all these random variables are independent. The generalized test variable for testing the hypotheses in Eq. 15 is given by

$$G_{\eta_1 - \eta_2} = T_{\eta_1} - T_{\eta_2} - (\eta_1 - \eta_2), \tag{18}$$

and the generalized pivot statistic to construct a CI for $\eta_1 - \eta_2$ is given by

$$T_{\eta_1 - \eta_2} = T_{\eta_1} - T_{\eta_2}. \tag{19}$$

For given sample statistics, $G_{\eta_1 - \eta_2}$ is stochastically decreasing in $\eta_1 - \eta_2$. Thus the generalized p-value for testing the hypotheses in Eq. 15 is given by

$$\sup_{H_0} P(G_{\eta_1 - \eta_2} \le 0) = P(G_{\eta_1 - \eta_2} \le 0 \mid \eta_1 - \eta_2 = 0) = P(T_{\eta_1 - \eta_2} \le 0). \tag{20}$$

For given sample statistics, the confidence intervals for $\eta_1 - \eta_2$ can be computed using the percentiles of $T_{\eta_1 - \eta_2}$. Because, given $\bar{x}_1$, $\bar{x}_2$, $s_1^2$ and $s_2^2$, the distribution of $T_{\eta_1 - \eta_2}$ is free of any unknown parameters, the percentiles of $T_{\eta_1 - \eta_2}$ can be estimated using Monte Carlo simulation. We can also construct confidence intervals for the difference between the lognormal means, that is, $\exp(\eta_1) - \exp(\eta_2)$. For this, we can use the percentiles of $\exp(T_{\eta_1}) - \exp(T_{\eta_2})$, where $T_{\eta_1}$ and $T_{\eta_2}$ are given in Eq. 17. Note that algorithms similar to Algorithm 1 can be easily developed for computing the above generalized p-values and confidence intervals. An algorithm and Fortran and SAS programs for computing the generalized p-value test and the CI for $\exp(\eta_1) - \exp(\eta_2)$ are posted at http://www.ucs.louisiana.edu/~kxk4695 and are available as an appendix to the online version of this article.
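A two-sample version of the Monte Carlo computation can be sketched in Python as below. It draws the two pivots independently, estimates the generalized p-value for $H_0: \eta_1 \le \eta_2$ as the proportion of draws in which the difference of pivots is at most zero, and reads off percentile intervals. This is an illustrative sketch rather than the authors' posted program; the function names, the returned dictionary keys, and the clamp applied before exponentiating are assumptions introduced here.

```python
import math
import random
import statistics


def draw_t_eta(xbar, s, n, rng):
    """One draw of the generalized pivot for eta_i (Eq. 17)."""
    z = rng.gauss(0.0, 1.0)
    scaled = rng.gammavariate((n - 1) / 2.0, 2.0) / (n - 1)   # V_i^2 / (n_i - 1)
    return xbar - z / math.sqrt(scaled) * s / math.sqrt(n) + s ** 2 / (2.0 * scaled)


def compare_lognormal_means(y1, y2, alpha=0.05, draws=100_000, seed=1):
    """Generalized p-value for H0: eta1 <= eta2 and percentile intervals."""
    rng = random.Random(seed)
    summaries = []
    for y in (y1, y2):
        x = [math.log(v) for v in y]
        summaries.append((statistics.fmean(x), statistics.stdev(x), len(x)))
    eta_diff, mean_diff = [], []
    for _ in range(draws):
        t1 = draw_t_eta(*summaries[0], rng)
        t2 = draw_t_eta(*summaries[1], rng)
        eta_diff.append(t1 - t2)
        # Clamp before exponentiating so extreme tail draws cannot overflow.
        mean_diff.append(math.exp(min(t1, 700.0)) - math.exp(min(t2, 700.0)))
    p_value = sum(d <= 0.0 for d in eta_diff) / draws      # Eq. 20
    eta_diff.sort()
    mean_diff.sort()
    lo = int(draws * alpha / 2)
    hi = min(draws - 1, int(draws * (1 - alpha / 2)))
    return {
        "p_value": p_value,
        "ci_eta_diff": (eta_diff[lo], eta_diff[hi]),
        "ci_ratio": (math.exp(eta_diff[lo]), math.exp(eta_diff[hi])),
        "ci_mean_diff": (mean_diff[lo], mean_diff[hi]),
    }
```

The interval for the ratio of the lognormal means is obtained by exponentiating the interval for $\eta_1 - \eta_2$, which reflects the equivalence noted at the start of this section.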

Power Properties of the Generalized Test for the Two-Sample Case

For given sample sizes $n_1$ and $n_2$ and parameters $\mu_{l1}$, $\mu_{l2}$, $\sigma_{l1}$, and $\sigma_{l2}$, the powers of the generalized test based on Eq. 20 can be estimated using the Monte Carlo method. A Fortran program and SAS codes for computing the power (along with a help file) are posted at http://www.ucs.louisiana.edu/~kxk4695 and are available as an appendix to the online version of this article. The help file also contains an algorithm that can be coded in any desired computing language. Krishnamoorthy and Mathew(19) computed powers for several sample sizes and parameter combinations. It is observed in that article that the generalized test possesses all natural properties. However, the power of the test depends on $\mu_{l1} - \mu_{l2}$, $\sigma_{l1}$, and $\sigma_{l2}$. Therefore, to compute the required sample sizes to attain a specified power, the practitioner should have knowledge about $\mu_{l1} - \mu_{l2}$, $\sigma_{l1}$, and $\sigma_{l2}$.

Inference about a Lognormal Variance and Geometric Standard Deviation

For the assessment of the extent of variability among the exposure measurements, confidence intervals or tests concerning the variance become necessary. If $y$ denotes the lognormally distributed exposure measurements, then $x = \ln(y)$ is distributed normally with mean $\mu_l$ and variance $\sigma_l^2$. The variance of $y$, to be denoted by $\sigma^2$, is given by

$$\sigma^2 = \exp\left(2\mu_l + \sigma_l^2\right)\left[\exp\left(\sigma_l^2\right) - 1\right]. \tag{21}$$

TABLE II. Monte Carlo Estimates of the Sizes of the Generalized P-Value Test Based on Equation 24 for Lognormal Variance in Equation 21; Nominal Level = 0.05

| $\mu_l$ | $\sigma_l = 0.5$, $n = 10$ | $\sigma_l = 0.5$, $n = 15$ | $\sigma_l = 0.5$, $n = 20$ | $\sigma_l = 1.0$, $n = 10$ | $\sigma_l = 1.0$, $n = 15$ | $\sigma_l = 1.0$, $n = 20$ | $\sigma_l = 1.5$, $n = 10$ | $\sigma_l = 1.5$, $n = 15$ | $\sigma_l = 1.5$, $n = 20$ |
|---|---|---|---|---|---|---|---|---|---|
| 0.00 | .050 | .041 | .047 | .047 | .048 | .046 | .050 | .044 | .049 |
| 0.30 | .046 | .056 | .052 | .053 | .044 | .049 | .043 | .046 | .056 |
| 0.70 | .047 | .050 | .052 | .045 | .050 | .048 | .049 | .044 | .049 |
| 1.00 | .050 | .050 | .054 | .048 | .049 | .050 | .048 | .044 | .049 |
| 1.30 | .054 | .046 | .043 | .052 | .049 | .045 | .045 | .051 | .050 |
| 1.50 | .048 | .060 | .048 | .053 | .048 | .049 | .048 | .050 | .045 |
| 1.70 | .053 | .053 | .051 | .043 | .045 | .048 | .052 | .048 | .042 |
| 2.00 | .055 | .053 | .054 | .046 | .054 | .054 | .054 | .048 | .047 |

As far as we are aware, no procedures (except obvious large sample procedures) are known for computing a confidence interval or for testing hypotheses concerning $\sigma^2$. It turns out that the ideas of generalized p-values and generalized confidence intervals provide solutions to this problem, regardless of the sample size. We shall now construct a generalized pivot statistic that can be used to compute a confidence interval for $\sigma^2$, and a generalized test variable that can be used for testing the hypotheses

$$H_0: \sigma^2 \ge \sigma_0^2 \quad \text{vs.} \quad H_a: \sigma^2 < \sigma_0^2, \tag{22}$$

where $\sigma_0^2$ is a known constant. Note that it is by rejecting $H_0$ that we conclude that the variability is small, that is, below the bound $\sigma_0^2$.

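Using earlier notations, the generalized test variable for $\sigma^2$ is given by

$$\begin{aligned} G_{\sigma^2} &= \exp\left(2T_{\mu_l} + T_{\sigma_l^2}\right)\left[\exp\left(T_{\sigma_l^2}\right) - 1\right] - \sigma^2 \\ &= \exp\left(2\left(\bar{x} - \frac{Z}{V/\sqrt{n-1}}\,\frac{s}{\sqrt{n}}\right) + \frac{s^2}{V^2/(n-1)}\right)\left[\exp\left(\frac{s^2}{V^2/(n-1)}\right) - 1\right] - \sigma^2, \end{aligned} \tag{23}$$

where $Z$ and $V$ are as defined in Eq. 6. The generalized pivot statistic for constructing a CI for $\sigma^2$ is given by

$$T_{\sigma^2} = \exp\left(2\left(\bar{x} - \frac{Z}{V/\sqrt{n-1}}\,\frac{s}{\sqrt{n}}\right) + \frac{s^2}{V^2/(n-1)}\right)\left[\exp\left(\frac{s^2}{V^2/(n-1)}\right) - 1\right]. \tag{24}$$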

Arguing as in previous sections, the generalized p-value for

testing the hypotheses in Eq. 22 is given by

P

G

σ

2

≥ 0

σ

2

= σ

2

0

= P

T

σ

2

≥ σ

2

0

. (25)

Furthermore, the percentiles of $T_{\sigma^2}$ can be used for computing a generalized confidence interval for $\sigma^2$. An algorithm (similar to Algorithm 1 in Appendix 2) can be easily developed for computing the above generalized p-value and confidence interval. We also note that the above procedure can be easily extended for the purpose of comparing two lognormal variances.
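One way such an algorithm could look is sketched below in Python. This is only an illustration under stated assumptions, not the authors' program: the function names `log_pivot_draws` and `variance_inference`, the seed, and the bound `sigma0_sq=1.0e7` are hypothetical, the chi-square variate is drawn as a gamma variate with shape $(n-1)/2$ and scale 2, and the pivot of Eq. 24 is kept on the log scale so that the nested exponentials do not overflow.

```python
import math
import random


def log_pivot_draws(xbar, s, n, m, rng):
    """Return m draws of log T_{sigma^2} (Eq. 24); logs avoid overflow."""
    draws = []
    for _ in range(m):
        z = rng.gauss(0.0, 1.0)                              # Z ~ N(0, 1)
        w = rng.gammavariate((n - 1) / 2.0, 2.0) / (n - 1)   # V^2 / (n - 1)
        t_sig = s * s / w                                    # pivot for sigma_l^2
        t_mu = xbar - z / math.sqrt(w) * s / math.sqrt(n)    # pivot for mu_l
        # log of exp(2*t_mu + t_sig) * (exp(t_sig) - 1)
        draws.append(2.0 * t_mu + 2.0 * t_sig + math.log1p(-math.exp(-t_sig)))
    return draws


def variance_inference(xbar, s, n, sigma0_sq, alpha=0.05, m=100_000, seed=1):
    """Generalized p-value for Eq. 22 and a 1 - alpha interval for sigma^2."""
    rng = random.Random(seed)
    logs = sorted(log_pivot_draws(xbar, s, n, m, rng))
    log_bound = math.log(sigma0_sq)
    p_value = sum(v >= log_bound for v in logs) / m          # Eq. 25

    def pct(p):  # empirical 100p-th percentile of the sorted draws
        return logs[max(0, min(m - 1, math.ceil(p * m) - 1))]

    return p_value, (math.exp(pct(alpha / 2)), math.exp(pct(1 - alpha / 2)))


# Summary statistics of the logged data; the bound 1e7 is a made-up illustration.
p_value, (lower, upper) = variance_inference(4.333, 1.739, 15, sigma0_sq=1.0e7)
print(p_value, lower, upper)
```

Because the percentiles are taken on the log scale and exponentiated afterwards, the interval endpoints are the same as those obtained from the percentiles of $T_{\sigma^2}$ itself.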

To understand the validity of the generalized test based on Eq. 25, we estimated its sizes (Type I error rates) using the Monte Carlo method for various values of $\mu_l$, $\sigma_l$ and $n$ = 10, 15, and 20. The sizes are estimated for testing the hypotheses in Eq. 22 at the nominal level 0.05, and they are given in Table II. For a good test, the estimated sizes should be close to the nominal level. We see from Table II that the estimated sizes are very close to the nominal level for all the cases considered.
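A size study of this kind can be sketched as a nested simulation: the outer loop generates logged samples with the true variance sitting on the boundary of $H_0$, and the inner loop estimates the generalized p-value of Eq. 25 for each sample. The Python below is a rough illustration only; the function names, the parameter values $\mu_l = 0$, $\sigma_l = 1$, $n = 10$, and the small replication counts are assumptions chosen to keep the run short, not the settings behind Table II.

```python
import math
import random


def generalized_p_value(xbar, s, n, sigma0_sq, m, rng):
    """Monte Carlo estimate of P(T_{sigma^2} >= sigma0^2), Eq. 25."""
    log_bound = math.log(sigma0_sq)
    hits = 0
    for _ in range(m):
        z = rng.gauss(0.0, 1.0)
        w = rng.gammavariate((n - 1) / 2.0, 2.0) / (n - 1)   # V^2 / (n - 1)
        t_sig = s * s / w
        t_mu = xbar - z / math.sqrt(w) * s / math.sqrt(n)
        # log T_{sigma^2}, kept on the log scale to avoid overflow
        log_t = 2.0 * t_mu + 2.0 * t_sig + math.log1p(-math.exp(-t_sig))
        hits += log_t >= log_bound
    return hits / m


def estimated_size(mu_l, sigma_l, n, alpha=0.05, reps=400, m=1000, seed=1):
    """Proportion of simulated samples in which H0 of Eq. 22 is rejected."""
    rng = random.Random(seed)
    # True variance from Eq. 21; testing against it puts us on the H0 boundary.
    sigma0_sq = math.exp(2.0 * mu_l + sigma_l ** 2) * math.expm1(sigma_l ** 2)
    rejections = 0
    for _ in range(reps):
        logged = [rng.gauss(mu_l, sigma_l) for _ in range(n)]
        xbar = sum(logged) / n
        s = math.sqrt(sum((v - xbar) ** 2 for v in logged) / (n - 1))
        rejections += generalized_p_value(xbar, s, n, sigma0_sq, m, rng) < alpha
    return rejections / reps


print(estimated_size(mu_l=0.0, sigma_l=1.0, n=10, reps=200))
```

With so few replications the estimate is noisy; a serious study would use far larger values of `reps` and `m`.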

The generalized variable for a geometric standard deviation $\sigma_g = \exp(\sigma_l)$ is given by

$$G_{\sigma_g} = \exp\left(\sqrt{G_{\sigma_l^2}}\right), \tag{26}$$

where the generalized variable $G_{\sigma_l^2}$ for $\sigma_l^2$ is given in Eq. 9.

However, it was pointed out earlier that the generalized variable approach gives the same confidence interval for $\sigma_l^2$ as the conventional chi-square interval. From this, a confidence interval for the geometric standard deviation is easily obtained as

$$\left(\exp\left(s\sqrt{\frac{n-1}{\chi^2_{n-1,\,1-\alpha/2}}}\right),\ \exp\left(s\sqrt{\frac{n-1}{\chi^2_{n-1,\,\alpha/2}}}\right)\right), \tag{27}$$

where $\chi^2_{m,p}$ denotes the $100p$th percentile of the central chi-square distribution with df = $m$. The expression in Eq. 27 is an exact $1 - \alpha$ confidence interval for $\sigma_g$.

Similarly, a test for

$$H_0: \sigma_g \le c \quad \text{vs.} \quad H_a: \sigma_g > c, \tag{28}$$

is essentially a test concerning the variance $\sigma_l^2$, and the usual chi-square test for the variance can be applied.


Illustrative Examples

Example 1

The data represent air lead levels collected by NIOSH at the Alma American Labs, Fairplay, Colorado, for health hazard evaluation purposes (HETA 89-052) on February 23, 1989. The air lead levels were collected from 15 different areas within the facility.

Air Lead Levels (µg/m³): 200, 120, 15, 7, 8, 6, 48, 61, 380, 80, 29, 1000, 350, 1400, 110

For these data, the mean (=254) is much larger than the median (=80), which is an indication that the distribution is right skewed. The normal probability plots (Minitab 14.0, default method) were constructed for the actual lead levels (Figure 1A) and for the logged lead levels (Figure 1B). It is clear from Figures 1A and 1B that the distribution of the data is far away from a normal distribution (p-value < 0.05), but a lognormal model adequately describes the data (p-value 0.871). The p-values are based on the Anderson-Darling test. Therefore, we apply the methods of this paper to make valid inferences about the mean lead level. Based on the logged data, we have the observed values $\bar{x} = 4.333$ and $s = 1.739$. Using these numbers in Algorithm 1, we computed the 95% upper limit for $\exp(\eta)$ as 2405. We also computed the 95% lower limit for the lognormal mean as 141. That is, the mean air lead level within the facility exceeds 141 µg/m³ with 95% confidence.

Suppose we want to test whether the mean is greater than some arbitrary value (e.g., 120 µg/m³) that could be a limit value:

$$H_0: \mu \ge 120 \quad \text{vs.} \quad H_a: \mu < 120,$$

where $\mu = \exp(\eta)$ (with $\eta = \mu_l + \sigma_l^2/2$) denotes the actual unknown mean air lead level within the lab facility. Using Algorithm 1 again, we computed the generalized p-value as 0.97, and so we conclude that the data do not provide enough evidence to indicate that the mean air lead level within the facility is less than 120 µg/m³.

FIGURE 1. Normal probability plots of (A) actual lead levels, and (B) logged air lead levels

TABLE III. Summary Statistics for Airborne Concentration of Metalworking Fluids (MWF) in 23 Plants

| Method | Sample Size | $\bar{x}$ | $s$ |
|---|---|---|---|
| Thoracic MWF aerosol (gravimetric analysis) | 23 | −1.277 | 0.835 |
| Closed-face MWF analysis | 23 | −0.979 | 0.917 |

Note: $\bar{x}$ = mean of the logged data; $s$ = standard deviation of the logged data.

Regarding the lognormal variance, we computed the maximum likelihood estimate as 2337098 (µg/m³)². This estimate is obtained by replacing $\mu_l$ and $\sigma_l^2$ in Eq. 21, respectively, by $\bar{x}$ and $(n-1)s^2/n$. We also computed a 95% confidence interval for the lognormal variance, using the generalized pivot statistic $T_{\sigma^2}$ in Eq. 24, as (128538, 2956026772).

Finally, we computed the 95% CI for the geometric standard deviation using the generalized variable in Eq. 26 as (3.57, 15.49); using the exact formula in Eq. 27, we get (3.57, 15.53).
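The exact interval can be checked by hand. The worked arithmetic below uses $n = 15$, $s = 1.739$ and $\alpha = 0.05$; the chi-square percentiles $\chi^2_{14,\,0.975} \approx 26.119$ and $\chi^2_{14,\,0.025} \approx 5.629$ are taken from standard tables and are not stated in the text, so they should be read as an assumption of this check.

$$\begin{aligned}
\text{lower limit} &= \exp\left(s\sqrt{\frac{n-1}{\chi^2_{14,\,0.975}}}\right) = \exp\left(1.739\sqrt{\frac{14}{26.119}}\right) \approx \exp(1.273) \approx 3.57, \\
\text{upper limit} &= \exp\left(s\sqrt{\frac{n-1}{\chi^2_{14,\,0.025}}}\right) = \exp\left(1.739\sqrt{\frac{14}{5.629}}\right) \approx \exp(2.743) \approx 15.53.
\end{aligned}$$

Both limits agree with the exact interval (3.57, 15.53) quoted above.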

Example 2

In this example, we shall illustrate the generalized variable procedures for testing the equality of the means of measurements obtained by two different methods. The data were reported in Table I of O'Brien et al.(23) and represent total mass of metalworking fluids (MWF) obtained by thoracic MWF aerosol and closed-face MWF aerosol. Normal probability plots of logged data indicated that the lognormality assumption about the original data is tenable. The means and the standard deviations of the logged data are given in Table III. Let $\mu_t$ and $\mu_c$ denote the true means of the airborne concentrations by thoracic MWF aerosol and closed-face MWF aerosol, respectively. To test the equality of the means, we consider

$$H_0: \mu_t = \mu_c \quad \text{vs.} \quad H_a: \mu_t \ne \mu_c.$$

Using the summary statistics in Table III, we simulated $D = \exp(T_{\eta_1}) - \exp(T_{\eta_2})$, where $T_{\eta_1}$ and $T_{\eta_2}$ are given in Eq. 17, 100,000 times. The generalized p-value for the above two-tail test can be estimated by 2 × min{proportion of Ds < 0, proportion of Ds > 0}. Our simulation yielded the generalized p-value of 0.244. The lower 2.5 and the upper 2.5 percentiles of $D$ form a 95% confidence interval for the difference between the means, which is computed as (−0.657, 0.145). Thus, at the 5% level, both the generalized p-value and the confidence interval indicate that there is no significant difference between the means.
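The simulation of $D$ can be sketched in a few lines of Python. The sketch assumes that $T_{\eta_1}$ and $T_{\eta_2}$ in Eq. 17 have the same form as $T_\eta$ in Algorithm 1 of Appendix 2, applied to each sample with independent random draws; Eq. 17 is not reproduced here, so that form is an assumption, as are the function names `t_eta` and `compare_means` and the seed.

```python
import math
import random


def t_eta(xbar, s, n, rng):
    """One draw of the pivot for eta = mu_l + sigma_l^2 / 2 (assumed form)."""
    z = rng.gauss(0.0, 1.0)
    w = rng.gammavariate((n - 1) / 2.0, 2.0) / (n - 1)   # V^2 / (n - 1)
    return xbar - z / math.sqrt(w) * s / math.sqrt(n) + 0.5 * s * s / w


def compare_means(stats1, stats2, alpha=0.05, m=100_000, seed=1):
    """Two-sided generalized p-value and 1 - alpha interval for mu_1 - mu_2."""
    rng = random.Random(seed)
    d = sorted(
        math.exp(t_eta(*stats1, rng)) - math.exp(t_eta(*stats2, rng))
        for _ in range(m)
    )
    below = sum(v < 0 for v in d) / m
    above = sum(v > 0 for v in d) / m
    p_value = 2.0 * min(below, above)
    lower = d[max(0, math.ceil(alpha / 2 * m) - 1)]
    upper = d[min(m - 1, math.ceil((1 - alpha / 2) * m) - 1)]
    return p_value, (lower, upper)


# (mean, standard deviation, sample size) of the logged data from Table III
thoracic = (-1.277, 0.835, 23)
closed_face = (-0.979, 0.917, 23)
print(compare_means(thoracic, closed_face))
```

The results vary slightly with the seed and with `m`, which is why a large number of runs is advisable.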

CONCLUSIONS

Several attempts have been made in the literature for drawing inferences concerning the mean of a single lognormal distribution. To a much lesser extent, attempts have also been made to draw inferences for the ratio of the means of two lognormal distributions. These problems have certain inherent difficulties associated with them, and the available solutions are either approximate, or are applicable only to large samples, or are difficult to compute. In this article, we have explored a novel approach for solving these problems, based on the concepts of generalized p-values and generalized confidence intervals. It turns out that these concepts provide a unified and versatile approach for handling any parametric function associated with one or two lognormal distributions. Even though analytic expressions are not available for the resulting confidence intervals or p-values, their computation is both easy and straightforward. We have provided the necessary programs for their computation, and we have also illustrated our approach using several examples dealing with the analysis of exposure data. In writing this article, our intention has been to draw the attention of industrial hygienists to this new methodology.

ACKNOWLEDGMENT

This research was supported by a grant from the National Institute of Occupational Safety and Health (NIOSH).

REFERENCES

1. Oldham, P.: The nature of the variability of dust concentrations at the coal face. Br. J. Ind. Med. 10:227–234 (1953).

2. Tuggle, R.M.: Assessment of occupational exposure using one-sided tolerance limits. Am. Ind. Hyg. Assoc. J. 43:338–346 (1982).

3. Rappaport, S.M., and S. Selvin: A method for evaluating the mean exposure from a lognormal distribution. Am. Ind. Hyg. Assoc. J. 48:374–379 (1987).

4. Selvin, S., and S.M. Rappaport: Note on the estimation of the mean value from a lognormal distribution. Am. Ind. Hyg. Assoc. J. 50:627–630 (1989).

5. Selvin, S., S.M. Rappaport, R. Spear, J. Schulman, and M. Francis: A note on the assessment of exposure using one-sided tolerance limits. Am. Ind. Hyg. Assoc. J. 48:89–93 (1987).

6. Borjanovic, S.S., S.V. Djordjevic, and M.D. Vukovic-Pal: A method for evaluating exposure to nitrous oxides by application of lognormal distribution. J. Occup. Health 41:27–32 (1999).

7. Saltzman, B.E.: Health risk assessment of fluctuating concentrations using lognormal models. J. Air Waste Manag. Assoc. 47:1152–1160 (1997).

8. Lyles, R.H., and L.L. Kupper: On strategies for comparing occupational exposure data to limits. Am. Ind. Hyg. Assoc. J. 57:6–15 (1996).

9. Rappaport, S.M.: Assessment of long-term exposures to toxic substances in air. Ann. Occup. Hyg. 35:61–121 (1991).

10. Lyles, R.H., L.L. Kupper, and S.M. Rappaport: Assessing regulatory compliance of occupational exposures via the balanced one-way random effects ANOVA model. J. Agric. Biol. Environ. Statist. 2:64–86 (1997).

11. Krishnamoorthy, K., and T. Mathew: One-sided tolerance limits in balanced and unbalanced one-way random models based on generalized confidence limits. Technometrics 46:44–52 (2004).

12. Land, C.E.: Hypotheses tests and interval estimates. In Lognormal Distribution (E.L. Crow and K. Shimizu, eds.). New York: Marcel Dekker, 1988. pp. 87–112.

13. Zhou, X.H., and S. Gao: Confidence intervals for the lognormal mean. Statist. Med. 16:783–790 (1997).

14. Armstrong, B.G.: Confidence intervals for arithmetic means of lognormally distributed exposures. Am. Ind. Hyg. Assoc. J. 53:481–485 (1992).

15. Land, C.: An evaluation of approximate confidence interval methods for lognormal means. Technometrics 14:145–158 (1972).

16. Hewett, P., and G.H. Ganser: Simple procedures for calculating confidence intervals around the sample mean and exceedance fraction derived from lognormally distributed data. Appl. Occup. Environ. Hyg. 12:132–142 (1997).

17. Taylor, D.J., L.L. Kupper, and K.E. Muller: Improved approximate confidence intervals for the mean of a log-normal random variable. Statist. Med. 21:1443–1459 (2002).

18. Zhou, X.H., S. Gao, and S.L. Hui: Methods for comparing the means of two independent lognormal samples. Biometrics 53:1129–1135 (1997).

19. Krishnamoorthy, K., and T. Mathew: Inferences on the means of lognormal distributions using generalized p-values and generalized confidence intervals. J. Statist. Plan. Infer. 115:103–121 (2003).

20. Tsui, K.W., and S. Weerahandi: Generalized p-values in significance testing of hypotheses in the presence of nuisance parameters. J. Am. Statist. Assoc. 84:602–607 (1989).

21. Weerahandi, S.: Generalized confidence intervals. J. Am. Statist. Assoc. 88:899–905 (1993).

22. Weerahandi, S.: Exact Statistical Methods for Data Analysis. New York: Springer-Verlag, 1995.

23. O'Brien, D.M., G.M. Piacitelli, W.K. Sieber, R.T. Hughes, and J.D. Catalano: An evaluation of short-term exposures to metal working fluids in small machine shops. Am. Ind. Hyg. Assoc. J. 62:342–348 (2001).

APPENDIX 1

The Generalized Confidence Interval and Generalized P-Value

A general setup where the concepts of generalized confidence intervals and generalized p-values are defined is as follows. Consider a random variable $X$ whose distribution depends on a scalar parameter of interest $\theta$ and a nuisance parameter (a parameter that is not of direct inferential interest) $\eta$, where $\eta$ could be a vector. Here $X$ could also be a vector. Suppose we are interested in computing a confidence interval for $\theta$. Let $x$ denote the observed value of $X$, that is, $x$ represents the data that have been collected. To obtain a generalized confidence interval for $\theta$, we need a generalized pivot statistic (the pivotal quantity based on which inferential procedures will be developed) $T_1(X; x, \theta, \eta)$ that is a function of the random variable $X$, the observed data $x$, and the parameters $\theta$ and $\eta$, and that satisfies the following two conditions:

(i) Given $x$, the distribution of $T_1(X; x, \theta, \eta)$ is free of the unknown parameters $\theta$ and $\eta$;

(ii) The observed value of $T_1(X; x, \theta, \eta)$, namely, $T_1(x; x, \theta, \eta)$, is equal to $\theta$. (A1)

The percentiles of $T_1(X; x, \theta, \eta)$ can then be used to obtain confidence intervals for $\theta$. Such confidence intervals are referred to as generalized confidence intervals. For example, if $T_{1-\alpha}$ denotes the $100(1-\alpha)$th percentile of $T_1(X; x, \theta, \eta)$, then $T_{1-\alpha}$ is a generalized upper confidence limit for $\theta$. A lower confidence limit or two-sided confidence limits can be similarly defined.

Now suppose we are interested in testing the hypothesis

$$H_0: \theta \le \theta_0 \quad \text{vs.} \quad H_a: \theta > \theta_0, \tag{A2}$$

where $\theta_0$ is a specified quantity. Suppose we can define a generalized test variable $T_2(X; x, \theta, \eta)$ satisfying the following conditions:

(i) For a given $x$, the distribution of $T_2(X; x, \theta, \eta)$ is free of the nuisance parameter $\eta$;

(ii) The observed value of $T_2(X; x, \theta, \eta)$, namely, $T_2(x; x, \theta, \eta)$, is free of any unknown parameters;

(iii) For a given $x$ and $\eta$, the distribution of $T_2(X; x, \theta, \eta)$ is stochastically monotone in $\theta$ (i.e., stochastically increasing or decreasing in $\theta$). (A3)

In general, for a given $x$ and $\eta$, $T_2(X; x, \theta, \eta)$ is stochastically decreasing in $\theta$, and the generalized p-value for testing Eq. A2 is given by $P\left(T_2(X; x, \theta, \eta) \le t\right)$, where $t = T_2(x; x, \theta, \eta)$. On the other hand, if $T_2(X; x, \theta, \eta)$ is stochastically increasing in $\theta$, then the generalized p-value for Eq. A2 is defined as $P\left(T_2(X; x, \theta, \eta) \ge t\right)$. In general, the observed value $t$ is equal to $\theta_0$, and as the distribution of $T_2(X; x, \theta, \eta)$ is free of the nuisance parameter $\eta$, the generalized p-value at $\theta_0$ can be computed using Monte Carlo simulation.

APPENDIX 2

Algorithm for Computing the Generalized P-Value and the Generalized Confidence Interval

The following algorithm given by Krishnamoorthy and Mathew(19) can be used for computing the generalized p-value and the generalized confidence interval.

For a given logged data set, compute the observed sample mean and variance, namely, $\bar{x}$ and $s^2$, respectively.

For $i = 1$ to $m$

Generate a standard normal variate $Z$

Generate a chi-square random variate $V^2$ with degrees of freedom $n - 1$

Set $T_{\eta i} = \bar{x} - \dfrac{Z}{V/\sqrt{n-1}}\,\dfrac{s}{\sqrt{n}} + \dfrac{s^2}{2V^2/(n-1)}$

Set $K_i = 1$ if $T_{\eta i} > \ln(\text{LTA-PEL})$, else $K_i = 0$

(end $i$ loop)

$\frac{1}{m}\sum_{i=1}^{m} K_i$ is the generalized p-value for testing the hypotheses in Eq. 2. The $100(1-\alpha)$th percentile of $T_{\eta 1}, \ldots, T_{\eta m}$, denoted by $T_{\eta,1-\alpha}$, is the $1-\alpha$ generalized upper confidence limit for $\eta = \mu_l + \sigma_l^2/2$. Furthermore, $\exp(T_{\eta,1-\alpha})$ is the $1-\alpha$ generalized upper limit for the lognormal mean.
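The steps above translate almost line for line into code. The following Python sketch is an illustration rather than the Fortran or SAS program mentioned in this appendix: the function name `algorithm1`, the argument name `limit` (standing for the LTA-PEL), and the seed are hypothetical, and the chi-square variate is generated as a gamma variate with shape $(n-1)/2$ and scale 2. It also returns the lower limit, obtained from the $100\alpha$th percentile in the same way.

```python
import math
import random
import statistics


def algorithm1(logged, limit, alpha=0.05, m=100_000, seed=1):
    """Generalized p-value and one-sided limits for a lognormal mean."""
    n = len(logged)
    xbar = statistics.mean(logged)
    s = statistics.stdev(logged)           # sample standard deviation (n - 1)
    rng = random.Random(seed)
    t = []
    for _ in range(m):
        z = rng.gauss(0.0, 1.0)                              # Z
        w = rng.gammavariate((n - 1) / 2.0, 2.0) / (n - 1)   # V^2 / (n - 1)
        t.append(xbar - z / math.sqrt(w) * s / math.sqrt(n) + 0.5 * s * s / w)
    t.sort()
    # Proportion of K_i = 1, i.e. of draws with T_eta > ln(limit)
    p_value = sum(v > math.log(limit) for v in t) / m
    lower = math.exp(t[max(0, math.ceil(alpha * m) - 1)])
    upper = math.exp(t[min(m - 1, math.ceil((1 - alpha) * m) - 1)])
    return p_value, lower, upper


# Air lead levels (µg/m³) from Example 1, logged before analysis
lead = [200, 120, 15, 7, 8, 6, 48, 61, 380, 80, 29, 1000, 350, 1400, 110]
p_value, lower, upper = algorithm1([math.log(v) for v in lead], limit=120.0)
print(p_value, lower, upper)
```

Run on the Example 1 data, the output can be compared with the values reported there (a p-value of 0.97 and 95% lower and upper limits of 141 and 2405), allowing for Monte Carlo variation.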

Based on our experience, we recommend simulations consisting of at least 100,000 runs (i.e., the value of $m$) to get consistent results regardless of the initial seed used for random number generation. The above algorithm can be easily programmed in any programming language. A Fortran program and SAS codes for computing generalized p-values for one-tail tests and one-sided confidence limits are posted at http://www.ucs.louisiana.edu/~kxk4695. Interested readers can download these files from this address.
