# J. A. Nelder's research while affiliated with Imperial College London and other places

**What is this page?**

This page lists the scientific contributions of an author, who either does not have a ResearchGate profile, or has not yet added these contributions to their profile.

It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.

If you're a ResearchGate member, you can follow this page to keep up with this author's work.

If you are this author, and you don't want us to display this page anymore, please let us know.

## Publications (115)

A modelling approach has been useful for the analysis of data from robust designs for quality improvement. Recently, Robinson et al. (J. Qual. Technol. 2006; 38:65–75) proposed the use of generalized linear mixed models (GLMMs) and they used the marginal quasi-likelihood (MQL) method of Breslow and Clayton (J. Am. Statist. Ass. 1993; 88:9–25). Hier...

Rejoinder to "Likelihood Inference for Models with Unobservables: Another View" by Youngjo Lee and John A. Nelder [arXiv:1010.0303]. Published in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org); DOI: http://dx.doi.org/10.1214/09-STS277REJ.

There have been controversies among statisticians on (i) what to model and (ii) how to make inferences from models with unobservables. One such controversy concerns the difference between estimation methods for the marginal means not necessarily having a probabilistic basis and statistical models having unobservables with a probabilistic basis. Ano...

The quasilikelihood estimator is widely used in data analysis where a likelihood is not available. We illustrate that with a given variance function it is not only conservative, in minimizing a maximum risk, but also robust against a possible misspecification of either the likelihood or cumulants of the model. In examples it is compared with estima...
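
The quasi-likelihood idea in this abstract can be illustrated with a minimal quasi-Poisson sketch: with variance function V(mu) = phi*mu, the mean estimates solve the same estimating equations as Poisson maximum likelihood, and phi is then estimated from the Pearson statistic. The counts below are hypothetical, not data from the paper.

```python
# Quasi-Poisson sketch: with variance function V(mu) = phi*mu the mean
# estimates solve the same estimating equations as Poisson ML (here, group
# means), and the dispersion phi is estimated from the Pearson statistic.
# Counts are hypothetical.
y = {"A": [2, 4, 9, 1], "B": [3, 7, 14, 0]}
mu = {g: sum(v) / len(v) for g, v in y.items()}      # Poisson ML fits
pearson = sum((x - mu[g]) ** 2 / mu[g] for g, v in y.items() for x in v)
n_obs = sum(len(v) for v in y.values())
n_par = len(y)                                       # one mean per group
phi = pearson / (n_obs - n_par)                      # moment estimator of phi
```

With phi-hat well above 1, standard errors from a plain Poisson fit would be understated by roughly a factor of sqrt(phi-hat).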

The various stages in the history of statistical computing are illustrated by personal experiences. The stages include batch processing, interactive working, and the consultative mode. Statistical aspects associated with one or more of these stages include the development of general algorithms, procedures for model checking, and data-driven non-par...

Two papers have recently been published in this journal which purport to deal with the mixed-models controversy (Lencina et al., 2005; Lencina & Singer, 2006, to be referred to as L1 and L2). In my view they do not represent the current state of thinking on this subject. My own work begins with Nelder (1977) and continues with papers in 1982, and p...

In recent issues of this journal it has been asserted in two papers that the use of h-likelihood is wrong, in the sense of giving unsatisfactory estimates of some parameters for binary data (Kuk and Cheng, 1999; Waddington and Thompson, 2004), or theoretically unsound (Kuk and Cheng, 1999). We wish to refute both these assertions.

LIST OF NOTATIONS PREFACE INTRODUCTION CLASSICAL LIKELIHOOD THEORY Definition Quantities derived from the likelihood Profile likelihood Distribution of the likelihood-ratio statistic Distribution of the MLE and the Wald statistic Model selection Marginal and conditional likelihoods Higher-order approximations Adjusted profile likelihood Bayesian an...

We propose a class of double hierarchical generalized linear models in which random effects can be specified for both the mean and dispersion. Heteroscedasticity between clusters can be modelled by introducing random effects in the dispersion model, as is heterogeneity between clusters in the mean model. This class will, among other things, enable...

When there are two alternative random-effect models leading to the same marginal model, inferences from one model can be used for the other model. We illustrate how a likelihood method for fitting models with independent random effects can be applied to seemingly very different models with correlated random effects. We also discuss some merits of u...

Since their introduction in 1972, generalized linear models (GLMs) have proven useful in the generalization of classical normal models. Presenting methods for fitting GLMs with random effects to data, Generalized Linear Models with Random Effects: Unified Analysis via H-likelihood explores a wide range of applications, including combining informati...

For inferences from random-effect models Lee and Nelder (1996) proposed to use hierarchical likelihood (h-likelihood). It allows inference from models that may include both fixed and random parameters. Because of the presence of unobserved random variables h-likelihood is not a likelihood in the Fisherian sense. The Fisher likelihood framework has...

By combining a keen interest in agriculture and natural science with his ability as a mathematician, John Nelder has enjoyed a long and distinguished career as an agricultural statistician. In conversation with Helen Joyce, he offers a personal view of the world of statistics and his own contributions to it.

For inferences from random-effect models Lee and Nelder (1996) proposed to use hierarchical likelihood (h-likelihood). It allows inference from models that may include both fixed and random parameters. Because of the presence of unobserved random variables h-likelihood is not a likelihood in the Fisherian sense. The Fisher likelihood framework...

Restricted likelihood was originally introduced as the criterion for the estimation of dispersion components in normal mixed linear models. Lee & Nelder (2001a) showed that it can be extended to a much wider class of models via double extended quasi-likelihood. We give a detailed description of the new method and show that it gives an efficient est...

A search for a good parsimonious model is often required in data analysis. However, unfortunately we may end up with a falsely parsimonious model. Misspecification of the variance structure causes a loss of efficiency in regression estimation and this can lead to large standard-error estimates, producing possibly false parsimony. With generalized l...

A single data transformation may fail to satisfy all the required properties necessary for an analysis. With generalized linear models (GLMs), the identification of the mean-variance relationship and the choice of the scale on which the effects are to be measured can be done separately, overcoming the shortcomings of the data-transformation approac...
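
The separation this abstract describes can be illustrated numerically: the mean-variance relationship V(mu) ~ mu**p can be identified from replicated data (here by regressing log group variance on log group mean), independently of the scale chosen for measuring the effects. The data below are hypothetical, constructed so that p = 2.

```python
import math

# Identify the mean-variance relationship V(mu) ~ mu**p from replicated
# (hypothetical) data by regressing log group variance on log group mean;
# the link/scale for the effects can then be chosen separately.
groups = [[1.0, 1.2, 0.8], [5.0, 6.0, 4.0], [10.0, 12.0, 8.0]]
means = [sum(g) / len(g) for g in groups]
varis = [sum((x - m) ** 2 for x in g) / (len(g) - 1)
         for g, m in zip(groups, means)]
lx = [math.log(m) for m in means]
ly = [math.log(v) for v in varis]
mx = sum(lx) / len(lx)
my = sum(ly) / len(ly)
p_hat = (sum((a - mx) * (b - my) for a, b in zip(lx, ly))
         / sum((a - mx) ** 2 for a in lx))           # estimated power p
```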

In multi-centre clinical trials, heterogeneities in individual hospital treatment effects can be modelled as random effects. Estimates of the individual hospital treatment effects and estimate of the mean treatment effect, allowing for the presence of overall hospital differences, are required, together with some measure of their uncertainty. Syste...

For contingency tables with one factor as a response, log-linear models can be used provided that a minimal model, which constrains the predicted sample sizes of the response factor to equal the actual sizes, is fitted first. Only models that contain this minimal term make inferential sense. For response factors with two levels the results from fit...
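
A minimal illustration of the "minimal model" constraint, with a hypothetical 2x2 table: the independence log-linear model (the smallest model containing both main effects) has fitted values row_total * col_total / N, so the fitted totals for the response factor equal the observed ones.

```python
# Independence log-linear model for a hypothetical 2x2 table: fitted values
# row_total * col_total / N. Because the model contains the response-factor
# main effect (the "minimal model" term), the fitted response totals equal
# the observed ones.
table = [[10, 20], [30, 40]]                  # rows: treatment, cols: response
N = sum(sum(r) for r in table)
row = [sum(r) for r in table]
col = [sum(t[j] for t in table) for j in range(2)]
fitted = [[row[i] * col[j] / N for j in range(2)] for i in range(2)]
fitted_col = [sum(f[j] for f in fitted) for j in range(2)]
```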

Hierarchical generalised linear models are developed as a synthesis of generalised linear models, mixed linear models and structured dispersions. We generalise the restricted maximum likelihood method for the estimation of dispersion to the wider class and show how the joint fitting of models for mean and dispersion can be expressed by two intercon...

This article illustrates a technique for visualizing nonlinear mappings f : R^k → R^m that arise frequently in engineering applications. The idea is based on viewing sections f⁻¹(B) of the domain R^k, and f(A) of the range R^m, respectively. After suitable discretization, such sections are easily approximated with familiar brushing operations in scatterplot...

We introduce a model class that includes many types of correlation structures for non-Gaussian models. We then show how to check the underlying model assumptions to discriminate between different correlation patterns and demonstrate how to select suitable models. Strawberry data are used to discuss the choice between fixed- and random-effect models...

We describe a tool which supports the activity of a human being in fitting a mathematical model to measured or simulated data. The tool offers two principal advantages; its use requires a minimum of statistical knowledge, and its visual and interactive nature ensures that its use is intuitive. The tool is novel in that, in the iterative and often e...

The authors demonstrate the significant advantages of response surface methodology for the optimisation of pass-transistor logic circuits with a large number of design variables. It is shown how the “curse of dimensionality” in this type of optimisation problem can be effectively dealt with by using methods from the set of techniques referred to as...

The human sex ratio data, collected in Saxony in the 19th century by Geissler, are reanalysed by joint modelling of the mean and dispersion. Extended quasi-likelihood and the unnormalized double-exponential family are shown to lead to identical inference. The use of the unnormalized form is discussed. The relationship between multinomial and Poisso...

For non-normal data assumed to have distributions, such as the Poisson distribution, which have an a priori dispersion parameter, there are two ways of modelling overdispersion: by a quasi-likelihood approach or with a random-effect model. The two approaches yield different variance functions for the response, which may be distinguishable if adequa...
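
The two variance functions in question can be compared directly: quasi-likelihood overdispersion gives V(mu) = phi*mu, while a gamma random-effect (negative binomial) model gives V(mu) = mu + mu**2/theta. A small sketch with hypothetical parameter values shows how they diverge as mu grows, which is what makes the two approaches distinguishable with adequate data.

```python
# Compare the two overdispersion variance functions with hypothetical
# dispersion parameters: quasi-likelihood V(mu) = phi*mu versus
# negative-binomial (gamma random effect) V(mu) = mu + mu**2/theta.
phi, theta = 2.0, 5.0
for mu in (0.5, 2.0, 10.0, 50.0):
    v_ql = phi * mu
    v_nb = mu + mu ** 2 / theta
    print(f"mu={mu:5.1f}  QL={v_ql:7.1f}  NB={v_nb:7.1f}")
```

With these values the quasi-likelihood variance is the larger at small means but the random-effect variance dominates at large means, so replicated data spanning a range of means can separate them.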

Models described as using quasi-likelihood (QL) are often using a different approach based on the normal likelihood, which I call pseudo-likelihood. The two approaches are described and contrasted, and an example is used to illustrate the advantages of the QL approach proper.

We propose a technique for visualizing nonlinear mappings f : R^k → R^m that arise frequently in engineering applications. The proposal is based on viewing sections of the form f⁻¹(B) and f(A) with scatterplot matrices of the domain R^k and the range R^m, respectively. The obvious approach is to evaluate f on a factorial grid in R^k and view...
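
The section idea can be sketched in a few lines for a hypothetical map f : R^2 → R: evaluate f on a factorial grid, then approximate the section f⁻¹(B) by keeping ("brushing") exactly the grid points whose images land in the brushed interval B of the range.

```python
# Approximate the section f^{-1}(B) of a hypothetical map f: R^2 -> R by
# brushing a factorial grid: keep the grid points whose images land in the
# brushed interval B of the range.
def f(x1, x2):
    return x1 ** 2 + x2 ** 2

grid = [(i / 10, j / 10) for i in range(-10, 11) for j in range(-10, 11)]
B = (0.9, 1.1)                                   # brushed interval in the range
section = [pt for pt in grid if B[0] <= f(*pt) <= B[1]]
```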

Hierarchical generalized linear models (HGLMs) are developed as a synthesis of (i) generalized linear models (GLMs) (ii) mixed linear models, (iii) joint modelling of mean and dispersion and (iv) modelling of spatial and temporal correlations. Statistical inferences for complicated phenomena can be made from such a HGLM, which is capable of being d...

Well-formed polynomials contain the marginal terms of all terms; for example, they contain both x1 and x2 if x1x2 is present. Such models have a goodness of fit that is invariant to linear transformations of the x variables. Recently, selection procedures have been proposed which may not give well-formed polynomials. Analysis of two data sets...
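
The well-formedness (marginality) condition is easy to state as code: a model is well formed if every marginal sub-term of each of its terms is also present. A small checker, applied to hypothetical term sets:

```python
from itertools import combinations

def is_well_formed(terms):
    """Marginality check: every sub-term of each term (e.g. x1 and x2 for
    the interaction x1x2) must itself be in the model."""
    term_set = {frozenset(t) for t in terms}
    return all(frozenset(sub) in term_set
               for t in term_set
               for k in range(1, len(t))
               for sub in combinations(t, k))

# {x1, x2, x1x2} is well formed; {x1, x1x2} is not (x2 is missing).
```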

This paper reviews the established practice of providing evidence to regulatory authorities about the claimed properties (such as efficacy and safety) of new pharmaceutical products. The established conventions and procedures are contrasted with scientific concepts and principles. The following issues are discussed: (a) recruitment of subjects and...

It is asserted that statistics must be relevant to making inferences in science and technology. The subject should be renamed statistical science and be focused on the experimental cycle, design–execute–analyse–predict. Its part in each component of the cycle is discussed. The P-value culture is claimed to be the main prop of non-scientific statist...

The use of a response surface in the design of an electromagnetic system is described. The surface is generated using a minimum number of points based on the design of experiments. The final surface is integrated in SPICE to provide dynamic performance modeling.

Response surface models (RSMs) can be used to model expensive simulations or real measurements [1]. An RSM is constructed by simulating solutions at systematic points in a design space and fitting a model to them. The RSM is then employed in subsequent calculations (for instance in optimization) effectively replacing the original expensive simulato...
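
A toy version of the RSM workflow just described, with a hypothetical stand-in for the expensive simulator: evaluate it at a few design points, fit a quadratic surrogate through them, and optimise the cheap surrogate instead of the simulator.

```python
# Toy response-surface workflow: sample an "expensive" simulator (a
# hypothetical stand-in) at three design points, fit an exact quadratic
# surrogate via divided differences, then optimise the surrogate.
def simulate(x):
    return (x - 3.0) ** 2 + 1.0                # pretend this is expensive

xs = [0.0, 2.0, 5.0]                           # design points
ys = [simulate(x) for x in xs]
x0, x1, x2 = xs
y0, y1, y2 = ys
c = ((y2 - y1) / (x2 - x1) - (y1 - y0) / (x1 - x0)) / (x2 - x0)  # quadratic coeff
b = (y1 - y0) / (x1 - x0) - c * (x0 + x1)                        # linear coeff
x_opt = -b / (2 * c)                           # minimiser of the surrogate
```

Here the surrogate is exact, so its minimiser coincides with the simulator's; with noisy or higher-dimensional responses one would fit by least squares over a designed set of points.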

Generalized linear models may be extended in several ways. This paper describes five such extensions: (i) generalized additive models; (ii) the use of quasi-likelihood; (iii) joint modelling of mean and dispersion; (iv) introduction of extra random components to give hierarchical generalized linear models; (v) modelling of correlated responses with...

A novel method is presented for the statistical modelling of magnetic flux density using response surface methodology. It is shown that careful consideration of the class of response surface model, using domain knowledge, gives models which are two orders of magnitude better than the more conventional polynomial models.

Model selection under the weak-heredity principle allows models that contain compound terms such as x1x2 to have only one of the corresponding x1 and x2 terms in the model. It is shown that the conditions required to justify use of the principle are so restrictive as to make it unusable in practice. An example is given to illustrate this.

Generalized linear models provide a useful tool for analyzing data from quality-improvement experiments. We discuss why analysis must be done for all the data, not just for summarizing quantities, and show by examples how residuals can be used for model checking. A restricted-maximum-likelihood-type adjustment for the dispersion analysis is develop...

Since the early 1980s, industry has embraced the use of designed experiments as an effective means for improving quality. For quality characteristics not normally distributed, the practice of first transforming the data and then analyzing them by standard normal-based methods is well established. There is a natural alternative called generalized li...

In this paper, we present a design and optimization procedure which utilizes a case-based approach to design and a response surface methodology, implementing a fractional factorial approach, to determine the empirical models relating the performance of the device and the design parameters. With these models constructed, a genetic algorithm can be u...

We consider hierarchical generalized linear models which allow extra error components in the linear predictors of generalized linear models. The distribution of these components is not restricted to be normal; this allows a broader class of models, which includes generalized linear mixed models. We use a generalization of Henderson's joint likeliho...

In recent years, designed experiments and response-surface methods of analysis have been used in the design and optimization of engineering artefacts. The key to this statistical approach is to build a set of accurate response models of the performances of such artefacts. This paper introduces the class of generalized linear regression models (GLMs...

The responses to a recent paper by Dallal in this journal are evaluated by reference to the ideas of Frank Yates. It is concluded that much unnecessary complication has been introduced into the computer analysis of linear models by (1) the imposition of constraints on parameters, (2) neglect of marginality relations in forming hypotheses, and (3) c...

The discussion has shown, I fear, that the formulation of linear models, one of the basic tools of our trade, is in an unsatisfactory state; also that the argument is not just one between Nelder and the rest of the world. Why the sorry state? I believe that the problem is that statistical theory is driven too often by mathematics rather than by the...

In recent years, designed experiments and response surface methods of analysis have been used in the design and optimization of electronic circuits. The key to this statistical approach is to build a set of accurate response models of the circuit performances. This paper introduces the class of Generalized Linear Models (GLMs), which extend classic...

Inference from the fitting of linear models is basic to statistical practice, but the development of strategies for analysis has been hindered by unnecessary complexities in the descriptions of such models. Three false steps are identified and discussed: they concern constraints on parameters, neglect of marginality constraints, and confusion betwe...

Whatever actually happens in the next ten years, one general prophecy is likely to be fulfilled. This is that advances will depend at least as much on the imagination of statisticians as on their mathematical and computing skills. I don't think we pay enough attention to trying to develop imagination among our students.

There is considerable interest in the fitting of models jointly to the mean and dispersion of a response. For the mean parameter, the Wedderburn estimating equations are widely accepted. However, there is some controversy about estimating the dispersion parameters. Finite sampling properties of several dispersion estimators are investigated for thr...

It is more than a decade since Genichi Taguchi's ideas on quality improvement were introduced in the United States. His parameter-design approach for reducing variation in products and processes has generated a great deal of interest among both quality practitioners and statisticians. The statistical techniques used by Taguchi to implement paramete...

The efficiency of the finite element approach in determining object heights from metric space photography and parallax bar observations has been tested using two stereoscopic models. The results show that heighting accuracy of 0.2‰ H (±50 m on ...

Confusion arises in the output from statistical packages, caused by (1) failure to distinguish between a hypothesis and the non-centrality parameter used to test it, and (2) neglect of marginality relations between main effects and interactions in linear models. The batch-oriented output from some packages is criticised, and a suitable form of outp...

Although the traditional unrestricted ('non-parametric') estimators of directly standardized rates and rate differences remain unbiased in sparse data, they tend to suffer from instability (low precision). As a result, many authors have proposed more precise estimators based on parametric models for the rates. This paper provides a general approach...
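
For context, the unrestricted directly standardized rate mentioned here is just a weighted average of stratum-specific rates under a standard population's weights; the sparse-data instability arises because small stratum denominators make the stratum rates noisy. A minimal computation with hypothetical counts:

```python
# Directly standardized rate: a weighted average of stratum-specific rates
# using a standard population's weights. All numbers are hypothetical.
events = [2, 10, 30]            # events per age stratum
pyears = [1000, 2000, 3000]     # person-years per stratum
std_w = [0.5, 0.3, 0.2]         # standard-population weights (sum to 1)
rates = [e / p for e, p in zip(events, pyears)]
dsr = sum(w * r for w, r in zip(std_w, rates))   # directly standardized rate
```

Model-based estimators replace the raw stratum rates with fitted rates from a parametric model, trading a little bias for much lower variance in sparse strata.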

We show that generalized linear models are well suited to the analysis of enzyme-kinetic data, and illustrate the process of modelling using data published in this journal.
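
The connection to GLMs can be sketched as follows: the Michaelis-Menten rate v = Vmax·s/(Km + s) satisfies 1/v = 1/Vmax + (Km/Vmax)(1/s), i.e. a reciprocal link with a linear predictor in 1/s; a full GLM fit would add a suitable error distribution and iterative weighting. The parameter values below are hypothetical, and with exact (noise-free) data the linearized fit recovers them.

```python
# Michaelis-Menten as a reciprocal-link linear predictor: with exact
# (noise-free) rates the linearized fit recovers Vmax and Km. Parameter
# values are hypothetical; a real GLM fit would add gamma errors and IRLS.
Vmax, Km = 10.0, 2.0
s = [0.5, 1.0, 2.0, 4.0, 8.0]                      # substrate concentrations
v = [Vmax * si / (Km + si) for si in s]            # reaction rates
x = [1.0 / si for si in s]
y = [1.0 / vi for vi in v]                         # 1/v = 1/Vmax + (Km/Vmax)/s
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
slope = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
         / sum((xi - xbar) ** 2 for xi in x))
intercept = ybar - slope * xbar
est_Vmax, est_Km = 1.0 / intercept, slope / intercept
```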

Recent interest in Taguchi's methods has led to developments in joint analysis of the mean and dispersion from designed experiments. A commonly used method is the analysis of variance of the transformed data. However, a single transformation cannot necessarily produce the Normality, constancy of variance and linearity of systematic effects for the...
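
The joint-modelling alternation alluded to in several of these abstracts can be caricatured with group-wise constants standing in for the two interlinked GLMs: fit the mean model, then fit a dispersion model to the squared residuals, then (in general) refit the mean model with weights 1/phi-hat. The data below are hypothetical.

```python
# Caricature of joint mean-dispersion fitting, with group-wise constants
# standing in for the two interlinked GLMs. Data are hypothetical: the
# "high" group has the larger dispersion.
data = {"low": [9.8, 10.1, 10.4], "high": [4.0, 6.0, 8.0]}
mu = {g: sum(v) / len(v) for g, v in data.items()}        # mean model
phi = {g: sum((x - mu[g]) ** 2 for x in v) / len(v)       # dispersion model,
       for g, v in data.items()}                          # fed by residuals
# With shared covariates one would now refit the mean model with weights
# 1/phi and iterate the two fits to convergence.
```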

## Citations

... When checking assumptions for the Poisson regression model, it fulfills the assumption of equi-dispersion. 15 Because of that, Poisson regression was used in this study. Finally, the odds ratios and the corresponding 95% confidence intervals were calculated for each independent variable. ...

... The Jaccard Index is used (Goslee and Urban, 2007). Then, for each dissimilarity matrix a GLM (McCullagh and Nelder, 1989) is fitted, with the biological dissimilarity as the explained value and the environmental statistics as the explanatory factors. An iterative modelling process is run to select the best GLM and the best set of environmental factors, using the deviance explained as the testing criterion (McCullagh and Nelder, 1989). ...

... Using CRF in the named-entity problem [11], we denote x = (x_1, x_2, …, x_n) as the input sequence (words of a sentence) and y = (y_1, y_2, …, y_n) as the sequence of output states (named-entity tags) and modeled the conditional probability p(y | x) as ...

... Habitat suitability maps were calculated using 12 different algorithms. These algorithms are: Bioclim [33,48,49], Domain [76], Generalized Linear Models (GLM) [77,78], Generalized Additive Models (GAM) [78,79], Multivariate Adaptive Regression Splines (MARS) [80][81][82], Flexible Discriminant Analysis (FDA) [83][84][85], Classification Tree Analysis (CTA) [86], Artificial Neural Network (ANN) [87,88], Random Forest (RF) [89,90], Support Vector Machine (SVM) [91][92][93], Maximum Entropy (Maxent) [24,35,94], and Kernel Density Estimation (KDE) [95,96]. All algorithms were used in R [97]. ...

... However, in the iterative process to build joint models of the mean and dispersion there is some uncertainty in the estimation of φ_i; thus Pinto and Pereira (2021) assume that φ_i is replaced by τφ_i, where τ is an unknown constant. In this way, if H_c and H_d are two nested hypotheses of dimension c < d, that is η_c ⊂ η_d, then, considering the quasi-likelihood ratio statistic and according to McCullagh (1983), under H_c the change in the extended quasi-deviance, given by ...

... To make inferences about the random effects z, Lee et al. (2017) proposed the use of the predictive likelihood: L_p(z | y; θ) = f_θ(z | y) = f_θ(y, z) / f_θ(y) = H(θ, z) / L_m(θ), which is analogous to the use of a Bayesian posterior under a flat prior on θ. ...

Reference: Enhanced Laplace Approximation

... an asymptotic bias when the underlying distribution is asymmetric and outside the exponential family. Smyth (1989) proposed the class of double generalized linear models (DGLMs) for modeling of mean and dispersion; further developments for EQL were provided by Godambe and Thompson (1989) and McCullagh and Nelder (1989); Nelder and Lee (1991) introduced the JMMD, using EQL, for robust parameter design, see Taguchi (1986); Nelder and Lee (1992) compared EQL and PL in simulation studies with finite samples and showed that the maximum extended quasi-likelihood estimator is usually superior to the pseudo-likelihood estimator in minimizing the mean-squared error, see also Nelder (2000); Verbyla (1993) used restricted maximum likelihood (REML) for normal heteroscedastic models; Lee and Nelder (1996) introduced the class of hierarchical generalized linear models (HGLMs) as a synthesis of JMMD, generalizing the ideas of quasi-likelihood and EQL, see Lee and Nelder (2001) and Lee et al. (2006); Lee and Nelder (1998) introduced the adjustment for the REML in the JMMD proposed by Nelder and Lee (1991); Smyth and Verbyla (1999a) and Smyth and Verbyla (1999b) considered extensions of REML estimation of variance parameters for DGLMs; Lee and Nelder (2000) showed that the double exponential family, see Efron (1986), and the EQL generate identical inferences; Cuervo and Gamerman (2001) presented a Bayesian version of JMMD for normal heteroscedastic linear models; Antoniadis et al. (2016) considered estimation and variable selection for JMMD in proper dispersion models, see Jorgensen (1987) and Jorgensen (1997); Bonat and Jorgensen (2016) proposed the class of multivariate covariance generalized linear models (MCGLMs). ...

... Assuming that Y | β ∼ N(Xβ, σ²I_n), Ridge regression can be derived as the mean of a posterior distribution with the prior β ∼ N(0_d, σ²λ⁻¹I_d) (van Wieringen, 2015) and, as in Bayesian hierarchical linear regression, likelihood-based methods maximize the likelihood with respect to σ² and λ using, for instance, an iterative method (Lee and Nelder, 1996). Unlike goodness-of-fit-based methods, the advantage of likelihood-based approaches is, on the one hand, that they do not require grid selection for the regularization parameters. ...
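
The ridge-as-posterior-mean identity in this snippet is easy to verify in the single-covariate case, where the estimate has the closed form β̂ = Σxy / (Σx² + λ) and shrinks toward zero as λ grows. A sketch with hypothetical data:

```python
# Single-covariate ridge: closed form sum(x*y)/(sum(x*x) + lam); it shrinks
# the least-squares slope toward zero as lam grows. Data and lam are
# hypothetical.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]                   # centered covariate
y = [-4.1, -1.9, 0.0, 2.1, 3.9]
sxy = sum(a * b for a, b in zip(x, y))
sxx = sum(a * a for a in x)
ols_slope = sxy / sxx
ridge_slope = sxy / (sxx + 5.0)                   # lam = 5.0
```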

... Usually, in the analysis of multi-environment trial (MET) data, the variety-specific regression terms for covariates are taken as fixed effects (Denis 1988). However, when the target region is divided into zones and variety effects are modelled as random to borrow strength across zones, then variety-specific regression coefficients must be modelled as random, giving rise to random coefficient models (Longford 1993;Buntaran et al. 2021). ...

... Beta regression, however, confines parameter estimates between the (0, 1) bounds, resulting in estimates that are both biologically relevant and mathematically sound. Prior to beta regression, logistic regression was recommended for continuous proportion data (Hamada & Nelder, 1997). Logistic regression confines the model to the data scale, between 0 and 1. ...