Article

Conditioning to reduce the sensitivity of general estimating functions to nuisance parameters

Authors:

Abstract

A conditional method is presented that renders an estimating function insensitive to nuisance parameters. The approach is a generalisation of the conditional score method to a general estimating function context and does not require complete specification of the probability model. We exploit the informal relationship between general estimating functions and score functions to derive simple generalisations of sufficient and partially ancillary statistics, referred to as G-sufficient and G-ancillary statistics, respectively. These two types of statistic are defined in a manner that does not require complete knowledge of the probability model and thus are more suitable for use with estimating functions. If we condition on a G-sufficient statistic for the nuisance parameters, the resulting conditional estimating function is insensitive to nuisance parameters and in particular achieves the plug-in unbiasedness property. Furthermore, if the conditioning argument is also G-ancillary for the parameters of interest, then the conditional estimating function possesses an attractive optimality property. Copyright Biometrika Trust 2003, Oxford University Press.
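As a point of reference, the classical conditional score that the abstract says the method generalises can be sketched on binary matched pairs. This is a minimal illustration under a fully specified logistic model, not the G-sufficient construction of the paper. The function names, the normal distribution used to draw the pair-specific intercepts, and the sample size are all assumptions made for the example.

```python
import math
import random


def simulate_pairs(n_pairs, theta, seed=0):
    """Simulate binary matched pairs with pair-specific nuisance intercepts a_i.

    Assumed model: logit P(Y_i1 = 1) = a_i and logit P(Y_i2 = 1) = a_i + theta.
    The distribution of a_i below is an arbitrary choice for illustration.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        a = rng.gauss(0.0, 1.5)  # nuisance intercept (hypothetical distribution)
        p1 = 1.0 / (1.0 + math.exp(-a))
        p2 = 1.0 / (1.0 + math.exp(-(a + theta)))
        y1 = 1 if rng.random() < p1 else 0
        y2 = 1 if rng.random() < p2 else 0
        pairs.append((y1, y2))
    return pairs


def conditional_score(theta, pairs):
    """Score of the likelihood conditional on s_i = y_i1 + y_i2.

    s_i is sufficient for a_i, so a_i cancels: given s_i = 1,
    P(Y_i2 = 1) = exp(theta) / (1 + exp(theta)). Pairs with s_i in {0, 2}
    carry no information about theta and drop out.
    """
    p = 1.0 / (1.0 + math.exp(-theta))
    return sum(y2 - p for y1, y2 in pairs if y1 + y2 == 1)


def conditional_estimate(pairs):
    """Closed-form root of the conditional score: log(n01 / n10)."""
    n01 = sum(1 for y1, y2 in pairs if (y1, y2) == (0, 1))
    n10 = sum(1 for y1, y2 in pairs if (y1, y2) == (1, 0))
    return math.log(n01 / n10)


if __name__ == "__main__":
    pairs = simulate_pairs(20000, theta=1.0, seed=1)
    print(conditional_estimate(pairs))
```

The conditional score never evaluates the intercepts a_i, which is the sense in which conditioning on a sufficient statistic for the nuisance parameters makes the estimating function insensitive to them.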


... In addition, these models yield inconsistent inferences about the correlation parameters of interest, owing to sensitivity to the fitting of the many stratum-specific effects: this is a manifestation of the notorious 'infinitely many nuisance parameters' problem that typically occurs with sparse data [10]. Removal of the aberrant effects of nuisance parameters by conditioning generally is not possible with these models, since the minimal sufficient statistic for the nuisance parameters is the entire sample, leading to a conditional likelihood for the interest parameters that is degenerate [2]. In contrast to fixed-effects models, multi-level random-effects models accommodate the nested structure of the study design and avoid the nuisance parameters problem. ...
... In studies that are non-sparse, where the number of strata K is small compared to the stratum sizes, it is well known that the most efficient choice of weights in estimating functions (2) and (3) for α and λ_i is ...
Article
Standard methods for the analysis of cluster-correlated count data fail to yield valid inferences when the study is finely stratified and the interest is in assessing the intracluster correlation structure. We present an approach, based upon exactly adjusting an estimating function for the bias induced by the fitting of stratum-specific effects, that requires modeling only the first two joint moments of the observations and that yields consistent and asymptotically normal estimators of the correlation parameters.
Article
When the data are sparse but not exceedingly so, we face a trade-off between bias and precision that makes the usual choice between conducting either a fully unconditional inference or a fully conditional inference unduly restrictive. We propose a method to relax the conditional inference that relies upon commonly available computer outputs. In the rectangular array asymptotic setting, the relaxed conditional maximum likelihood estimator has smaller bias than the unconditional estimator and smaller mean square error than the conditional estimator.
Article
We propose an orthogonal locally ancillary estimating function that provides first-order bias correction of inferences. It requires the specification of merely the first two moments of the observations when applying to analysis of stratified clustered (continuous or binary) data with the parameters of interest in both the first and second joint moments of dependent data. Simulation results confirm that the estimators obtained using the proposed method are substantially improved over those using regular profile estimating functions.
Article
SUMMARY When the number of nuisance parameters increases in proportion to the sample size, the Cramer-Rao bound does not necessarily give an attainable lower bound for the asymptotic variance of an estimator of the structural parameter. The present paper presents a new lower bound under a criterion called information uniformity. The bound is expressed as the inverse of the sum of the partial information and a certain nonnegative term, which is derived by differential-geometrical considerations. The optimal estimating function meeting this lower bound, when it exists, is also obtained in a decomposed form. The first term is the modified score function, and the second term is, roughly speaking, given by the normal component of the mixture covariant derivative of some random variable. Furthermore, special versions of these results are given in concise form, and these are then applied to elucidate the efficiency of some examples.
Article
This paper proposes a semiparametric extension of the projected score method of R.P. Waterman and B.G. Lindsay [ibid. 83, 1–13 (1996; Zbl 0866.62011)] for the elimination of nuisance parameters. The procedure addresses cases where only the mean and the variance of the response variable are specified and where the mean function involves both parameters of interest and nuisance parameters. Important applications of the semiparametric model include quasilikelihood models for matched designs and for measurement error models. As a result of the optimality and information-unbiasedness of the quasi-score function, a second-order quasi-score basis of estimating functions for the nuisance parameter is derived. Second-order locally ancillary estimating functions are then obtained by solving a simple linear system that corresponds to a true projection for canonical exponential family distributions. Asymptotic arguments and simulations show that the impact of nuisance parameters is considerably reduced when adopting the proposed approach.
Article
This paper extends the projected score methods of C. G. Small and D. L. McLeish [ibid. 76, No. 4, 693-703 (1989; Zbl 0681.62008)]. It is shown that the conditional score function may be approximated, with arbitrarily small stochastic error, in terms of a natural basis for the space of centred likelihood ratios. The utility of using this basis is established by identifying a U-statistic representation theorem and a class of expectation identities for the basis elements, making higher order asymptotics more tractable. The results are applied to a canonical exponential family model, where it is shown that the projected scores with estimated nuisance parameters can provide an accurate approximation to the conditional score function.
Article
This paper examines statistical methods based upon estimating functions, i.e. functions of both the parameter and data that are designed to permit inference about an unknown parameter in a statistical model. We explore reductions of such estimating functions by projection. This reduction, analogous to the process of Rao–Blackwellization, may be used either to increase the power of a test, the efficiency of a point estimator, or alternatively to render an inference function insensitive to the value of a nuisance parameter. In the case where a complete sufficient statistic exists for a parameter of interest the methods reduce to increasing sensitivity through Rao–Blackwellization. When this same parameter is regarded as a nuisance parameter, the techniques lead us to condition on the complete sufficient statistic for this parameter. However the techniques are seen to be more widely applicable than for models permitting reduction through complete sufficiency. Examples involving mixture models will be developed.
Article
This paper concerns the efficiency of the conditional likelihood method for inference in models which include nuisance parameters. A new concept of ancillarity, asymptotic weak ancillarity, is introduced. It is shown that the conditional maximum likelihood estimator and the conditional score test of θ, the parameter of interest, are asymptotically equivalent to their unconditional counterparts, and hence are asymptotically efficient, provided that the conditioning statistic is asymptotically weakly ancillary. The key assumption that the conditioning statistic is asymptotically weakly ancillary is verified when the underlying distribution is from exponential families. Some illustrative examples are given.
Article
SUMMARY Godambe (1976) put forward two concepts of ancillarity in the presence of nuisance parameters. In this paper they are unified and extended using the concept of Fisher information.
Article
S The often conservative nature, for discrete data, of so-called exact tests seems usually the result of unnecessarily precise conditioning. We consider avoiding this by conditioning only approximately on the sufficient statistics for nuisance parameters. Modest relaxation of conditioning results in small loss in terms of the rationale for conditional inference, but can greatly reduce the difficulties caused by discreteness. Exact calculation of p-values based on approximate conditioning is possible, but unattractive both in terms of the amount of calculation involved and in requiring explicit specification of the extent to which conditioning is to be relaxed. It is shown that there is a highly accurate, easily computed and very natural asymptotic approximation that avoids these difficulties.
Article
In many studies, the scientific objective can be formulated in terms of a statistical model indexed by parameters, only some of which are of scientific interest. The other "nuisance parameters" are required to complete the specification of the probability mechanism but are not of intrinsic value in themselves. It is well known that nuisance parameters can have a profound impact on inference. Many approaches have been proposed to eliminate or reduce their impact. In this paper, we consider two situations: where the likelihood is completely specified; and where only a part of the random mechanism can be reasonably assumed. In either case, we examine methods for dealing with nuisance parameters from the vantage point of parameter estimating functions. To establish a context, we begin with a review of the basic concepts and limitations of optimal estimating functions. We introduce a hierarchy of orthogonality conditions for estimating functions that helps to characterize the sensitivity of inferences to nuisance parameters. It applies to both the fully and partly parametric cases. Throughout the paper, we rely on examples to illustrate the main ideas.
Article
SUMMARY This note shows that concepts of ancillarity in the presence of a nuisance parameter suggested by Andersen (1970) and Basawa (1981) are equivalent for exponential families of distributions. Some illustrative examples are given and the relation to Fisher's information is discussed.
Article
The approximate conditional likelihood method proposed by Cox & Reid (1987) is applied to the estimation of a scalar parameter Θ, in the presence of nuisance parameters. The estimating function of Θ based on the approximate conditional likelihood is shown to be preferable to that based on the profile likelihood. A sufficient condition for both approaches to be equivalent is given. The role of parameter orthogonality is emphasized. Several examples including bivariate normal means with known coefficient of variation are presented.
Article
Cox & Reid (1987) proposed the technique of orthogonalizing parameters, to deal with the general problem of nuisance parameters, within fully parametric models. They obtained a large-sample approximation to the conditional likelihood. Along the same lines Davison (1988) studied generalized linear models. In the present paper we deal with the problem of nuisance parameters, within a semiparametric setup which includes the class of distributions associated with generalized linear models. The technique used is that of optimum orthogonal estimating functions (Godambe & Thompson, 1989). The results are related to those of Cox & Reid (1987).
Article
This paper concerns the impact of litter effects on the inference of dose-response relationships in teratological experiments with binary response. Kupper et al. (1986, Biometrics 42, 85-98) concluded that when intra-litter correlations are dose-dependent, the use of a common intra-litter correlation likelihood based on the beta-binomial distribution could lead to severe bias in estimation of the parameter beta, which characterizes the dose-response relationship. We show here that the problems of bias and coverage probability for beta could still be substantial when one uses the likelihood with heterogeneous intra-litter correlations. We then examine through a simulation study the performance of the quasi-likelihood method (Wedderburn, 1974, Biometrika 61, 439-447) and recommend that this method with a common intra-litter correlation parameter be used when the number of litters is small or modest.
Article
In settings where the full probability model is not specified, consider a general estimating function g(θ, λ; y) that involves not only the parameters of interest, θ, but also some nuisance parameters, λ. We consider methods for reducing the effects on g of fitting nuisance parameters. We propose a Cox-Reid-type adjustment to the profile estimating function, g(θ, λ̂_θ; y), that reduces its bias by two orders. Typically, only the first two moments of the response variable are needed to form the adjustment. Important applications of this method include the estimation of the pairwise association and main effects in stratified, clustered data and estimation of the main effects in a matched pair study. A brief simulation study shows that the proposed method considerably reduces the impact of the nuisance parameters. Copyright Biometrika Trust 2003, Oxford University Press.
Article
In a parametric model the maximum likelihood estimator of a parameter of interest ψ may be viewed as the solution to the equation l′_p(ψ) = 0, where l_p denotes the profile loglikelihood function. It is well known that the estimating function l′_p(ψ) is not unbiased and that this bias can, in some cases, lead to poor estimates of ψ. An alternative approach is to use the modified profile likelihood function, or an approximation to the modified profile likelihood function, which yields an estimating function that is approximately unbiased. In many cases, the maximum likelihood estimating functions are unbiased under more general assumptions than those used to construct the likelihood function, for example under first- or second-moment conditions. Although the likelihood function itself may provide valid estimates under moment conditions alone, the modified profile likelihood requires a full parametric model. In this paper, modifications to l′_p(ψ) are presented that yield an approximately unbiased estimating function under more general conditions. Copyright Biometrika Trust 2002, Oxford University Press.
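The bias of the profile score can be seen in the textbook Neyman-Scott setting, sketched below. This is an illustration of the problem only, not the modification proposed in the paper. The model (normal observations with a separate mean per stratum and a common variance), the function names, and the simulation settings are assumptions chosen for the example.

```python
import random


def simulate_strata(n_strata, m, sigma2, seed=0):
    """Draw m normal observations per stratum, each stratum with its own mean.

    The stratum means are the nuisance parameters; their uniform distribution
    is an arbitrary choice for illustration. sigma2 is the parameter of interest.
    """
    rng = random.Random(seed)
    sd = sigma2 ** 0.5
    data = []
    for _ in range(n_strata):
        mu = rng.uniform(-5.0, 5.0)  # nuisance mean (hypothetical distribution)
        data.append([rng.gauss(mu, sd) for _ in range(m)])
    return data


def within_ss(data):
    """Within-stratum sum of squares around each fitted stratum mean."""
    total = 0.0
    for ys in data:
        ybar = sum(ys) / len(ys)
        total += sum((y - ybar) ** 2 for y in ys)
    return total


def profile_estimate(data):
    """Root of the profile score for sigma2: within SS divided by N.

    Its expectation is sigma2 * (m - 1) / m, so the profile score is biased
    and the bias does not vanish as the number of strata grows.
    """
    n_obs = sum(len(ys) for ys in data)
    return within_ss(data) / n_obs


def adjusted_estimate(data):
    """Root of the bias-corrected score: within SS divided by N - K."""
    n_obs = sum(len(ys) for ys in data)
    return within_ss(data) / (n_obs - len(data))


if __name__ == "__main__":
    data = simulate_strata(20000, m=2, sigma2=4.0, seed=3)
    print(profile_estimate(data), adjusted_estimate(data))
```

With two observations per stratum the profile estimate settles near half the true variance however many strata are added, whereas dividing by N - K removes the bias. Both estimators use only the within-stratum residuals, which is why a correction of this kind needs no more than second-moment assumptions in this particular example.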