This paper establishes three properties of F-statistics for inference about the mean vector in multiple regression and analysis of variance. The extra SSE due to imposing a set of linear conditions on the model tests the estimable part of those conditions. All other possible numerator sums of squares that test the same have not-lesser degrees of fr...

Explicit expressions for the estimated mean \(\tilde {\mathbf {y}}_k = X\tilde {\boldsymbol {\beta }}_k = H_k\mathbf {y}\) and effective degrees of freedom \(\nu_k = \operatorname{tr}(H_k)\) by penalized least squares, with penalty \(k\|D\boldsymbol{\beta}\|^2\), can be found readily when \(X'X + D'D\) is nonsingular. We establish them here in general under only the condition that X be a non-zero...
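For the easy case mentioned above, where X′X + D′D is nonsingular, the hat matrix and effective degrees of freedom can be sketched as follows. This is only an illustration of the standard penalized least squares formulas, not the general construction the paper establishes; the function name `penalized_fit` and the example data are assumptions made here.

```python
import numpy as np

def penalized_fit(X, D, y, k):
    """Penalized least squares with penalty k * ||D beta||^2.

    Assumes X'X + k D'D is nonsingular (the easy case only).
    Returns the fitted mean H_k y and the effective degrees of
    freedom tr(H_k).
    """
    A = X.T @ X + k * (D.T @ D)
    # H_k = X (X'X + k D'D)^{-1} X'
    H = X @ np.linalg.solve(A, X.T)
    return H @ y, float(np.trace(H))

# Hypothetical example data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)
D = np.eye(3)  # ridge-type penalty

for k in (0.0, 1.0, 100.0):
    _, df = penalized_fit(X, D, y, k)
    print(k, df)
```

With a full-column-rank X and k = 0 the trace equals the number of columns of X, and it shrinks as k grows.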

The numerator sum of squares in the conventional F-statistic for testing a linear hypothesis in a general linear model can be viewed as following the heuristic that K. Pearson used in his seminal 1900 paper. That is, find a statistic (Formula presented.) that has expected value (Formula presented.) under the null hypothesis and form from it (Formul...

[Please note that this chapter is substantially different from previous editions of the book.]
CHAPTER SUBHEADINGS
- Introduction
- Carrion insect succession does not track the physical stages of decomposition, but corpse appearance is still important
- Distinguishing the analytical result from the interpretation
- Carrion insect age may equal minimum...

The object of inverse prediction is to infer the value of a condition x* that caused an observed response y*, based on a linear model relating responses to conditions fit to training data. Four methods of inverse prediction are investigated here. Their performances are compared in terms of the rates at which they reject potential values x0 of the t...

Type III methods were introduced by SAS to address difficulties in dummy-variable models for effects of multiple factors and covariates. They are widely used in practice; they are the default method in several statistical computing packages. Type III sums of squares (SSs) are defined by a set of instructions; an explicit mathematical formulation do...

In 1934, F. Yates described a sum of squares for testing factor main effects in saturated unbalanced models for effects of two factors. He claimed no particular properties of this sum of squares other than that it provided an "efficient estimate of the variance from the A means of the sub-class means... ." Although it became widely regarded as the...

SAS introduced Type III methods to address difficulties in dummy-variable models for effects of multiple factors and covariates. Type III methods are widely used in practice; they are the default method in many statistical computing packages. Type III sums of squares (SSs) are defined by an algorithm, and an explicit mathematical formulation does n...

Abstract from conference presentation: To our knowledge an estimate of time since death is almost never accompanied by the kind of mathematically explicit probability statement that is the standard in most scientific disciplines. This has been a problem both for death investigation casework (and court testimony) and for research, because scientists...

The most common forensic entomological application is the estimation of some portion of the time since death, or postmortem interval (PMI). To our knowledge, a PMI estimate is almost never accompanied by an associated probability. Statistical methods are now available for calculating confidence limits for an insect-based prediction of PMI for both...

The between-within split of total sum of squares in one-way analysis of variance (ANOVA) is intuitively appealing and computationally simple, whether balanced or not. In the balanced two-factor setting, the same heuristic and computations apply to analyse treatment sum of squares into main effects and interaction effects sums of squares. Accomplish...
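The between-within split in the one-way case can be checked numerically with a short sketch. The function name `between_within` and the toy data are assumptions made for illustration; the code covers only the one-way split, balanced or not, and not the two-factor extension the abstract goes on to discuss.

```python
import numpy as np

def between_within(y, groups):
    """Split the total sum of squares of y into between-group and
    within-group parts for a one-way layout (balanced or not)."""
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    grand_mean = y.mean()
    ss_between = 0.0
    ss_within = 0.0
    for g in np.unique(groups):
        yg = y[groups == g]
        # Between: group size times squared deviation of the group mean.
        ss_between += len(yg) * (yg.mean() - grand_mean) ** 2
        # Within: squared deviations from the group's own mean.
        ss_within += ((yg - yg.mean()) ** 2).sum()
    return ss_between, ss_within

# Hypothetical unbalanced data.
y = [1, 2, 3, 4, 5, 9]
groups = ["a", "a", "b", "b", "b", "c"]
ssb, ssw = between_within(y, groups)
total = ((np.asarray(y, dtype=float) - np.mean(y)) ** 2).sum()
print(ssb, ssw, total)
```

The two parts add up to the total sum of squares about the grand mean regardless of the group sizes.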

The age of a carrion insect associated with a corpse may represent a minimum postmortem interval. No method has been proposed before for constructing a confidence set on age based on development stage modeled as a categorical response. This paper illustrates the application of exact p values, first developed for succession data, to construct a conf...

Inverse prediction (IP) is reputed to be computationally inconvenient for multivariate responses. This paper describes how IP can be formulated in terms of a general linear mixed model, along with a flexible modeling approach for both mean vectors and variance–covariance matrices. It illustrates that results can be had as standard output from widel...

It is shown that the sum of squares by Yates's method of weighted squares of means is equivalent to numerator sums of squares formulated by other methods. These relations are established first for hypotheses about fixed effects in a general linear model, in the process showing how Yates's method can be extended. They are then illustrated in the une...

Distributions of a response y (height, for example) differ with values of a factor t (such as age). Given a response y* for a subject of unknown t*, the objective of inverse prediction is to infer the value of t* and to provide a defensible confidence set for it. Training data provide values of y observed on subjects at known values of t. Models re...
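One common way to build such a confidence set, sketched here for a univariate response and a straight-line model, is to keep every candidate value of t at which y* would not be flagged as an outlier by a prediction-type t-test. This is an assumption-laden illustration rather than any of the specific methods the paper compares: the function name `inverse_prediction_set`, the grid of candidates, and the caller-supplied critical value `t_crit` are all hypothetical.

```python
import numpy as np

def inverse_prediction_set(t, y, y_star, grid, t_crit):
    """Return the grid values t0 at which y_star is not rejected as an
    outlier under a simple linear regression of y on t.

    t_crit is the two-sided Student-t critical value with n - 2 degrees
    of freedom, supplied by the caller.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = len(t)
    t_bar = t.mean()
    s_tt = ((t - t_bar) ** 2).sum()
    slope = ((t - t_bar) * (y - y.mean())).sum() / s_tt
    intercept = y.mean() - slope * t_bar
    resid = y - (intercept + slope * t)
    s2 = (resid @ resid) / (n - 2)
    # Standard error of the difference between a new response and the
    # fitted mean at each candidate t0.
    se = np.sqrt(s2 * (1.0 + 1.0 / n + (grid - t_bar) ** 2 / s_tt))
    stat = np.abs(y_star - (intercept + slope * grid)) / se
    return grid[stat <= t_crit]

# Hypothetical training data with a deterministic perturbation.
t = np.arange(10)
y = 2.0 + 3.0 * t + 0.5 * (-1.0) ** t
kept = inverse_prediction_set(t, y, y_star=15.5,
                              grid=np.linspace(0, 9, 181), t_crit=2.306)
print(kept.min(), kept.max())
```

Here 2.306 is the 0.975 quantile of Student's t with 8 degrees of freedom, so the retained grid values form an approximate 95% confidence set for t*.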

Many authors have produced carrion insect development data for predicting the age of an insect from a corpse. Under some circumstances, this age value is a minimum postmortem interval. There are no standard protocols for such experiments, and the literature includes a variety of sampling methods. To our knowledge, there has been no investigation of how...

Given training data, a model relating a multivariate response y to x, and y* from a mystery specimen, the objective is to infer what values x* might have given rise to y*. Two approaches are investigated and illustrated here. In one, inverse prediction, tenable values of x* are those at which y* does not test as an outlier. In the other, based...

The Gram-Schmidt construction, with a little extension, can be used to establish results in linear algebra, multiple regression analysis, and the theory of linear models. This article describes and illustrates how it serves to develop the basic results required for statistical inference in the Gauss–Markov model. For upper-level theory courses, the...

Our studies have demonstrated that chronic Δ9-tetrahydrocannabinol (THC) administration results in a generalized attenuation of viral load and tissue inflammation in simian immunodeficiency virus (SIV)-infected male rhesus macaques. Gut-associated lymphoid tissue is an important site for HIV replication and inflammation that can impact disease prog...

Social support has been shown to influence health outcomes in later life. In this study, we focus on social engagement as an umbrella construct that covers select social behaviors in a lifespan sample that included oldest-old adults, a segment of the adult population for whom very little data currently exist. We examined relationships among social...

Often, the response variables on sampling units are observed repeatedly over time. The sampling units may come from different populations, such as treatment groups. This setting is routinely modeled by a random coefficients growth curve model, and the techniques of general linear mixed models are applied to address the primary research aim. An alte...

Δ9-Tetrahydrocannabinol (Δ9-THC), the primary psychoactive component in marijuana, is FDA approved to ameliorate AIDS-associated wasting. Because cannabinoid receptors are expressed on cells of the immune system, chronic Δ9-THC use may impact HIV disease progression. We examined the impact of chronic Δ9-THC administration (0.32 mg/kg im, 2...

Alcohol abuse is associated with an increased incidence and severity of pneumonia. In both the general population and individuals consuming excess alcohol, Streptococcus pneumoniae is the most frequent lung infection pathogen. Alcoholic patients with pneumonia frequently present with granulocytopenia, which is predictive of increased mortality. The...

In a linear model Y ∼ (Xβ, σ²I), powers of tests of H0: H′Xβ = 0 are developed following Pearson’s (1900) formulation. The class considered comprises all tests based on linear statistics A′Y that have expected value 0 under H0. The standard F-statistic, which is in this class, has good power properties, but others may be preferred in some setting...

Sufficiency and minimal sufficiency are examined in settings in which the parameter space and the sample space are both finite, where these properties can be presented in terms of linear algebra. Minimal sufficient partitions are identified constructively. This provides direct, hands-on treatment of these topics at a level accessible to a wide audi...

A framework is proposed for discussing p-values in models for binary response. Within that framework, the target of a p-value is defined, and accuracy is described relative to that target. Likelihood-ratio, F, and Pearson chi-squared approximate p-values, the exact conditional p-value based on the score statistic, and its mid-p version are examine...

The topic of testing linear hypotheses about parameters of fixed effects in models with variance-covariance components has been investigated extensively in the past decades. The main question is the determination of the degrees of freedom of an approximate F-test. The approximation is based on the choice of the estimate of the approximate variance...

The original derivation of the widely cited form of the REML likelihood function for mixed linear models is difficult and indirect. This paper derives it directly using familiar operations with matrices and determinants.

Exact conditional p-values based on the likelihood-ratio statistic in logistic regression require accurate computation of the supremum of the likelihood function, particularly for outcomes in the sample space that represent completely-separated or quasi-completely-separated data sets. Current software does not always handle these cases well. Three...

In this five-week study, we tested the hypotheses that free access to a maintenance diet supplemented with L-carnitine (L-C) would reduce body fat in adult, sedentary, ovariectomized (OVX) rats, and that there would be an additive effect of L-C on weight reduction in swim-trained animals. As expected, serum carnitine was higher in rats fed the L-C...

Cotton plants were infested with brown stink bug, Euschistus servus (Say), to define cotton boll age classes (based on heat unit accumulation beyond anthesis) that are most frequently injured during each of the initial 5 wk of flowering. Bolls from each week were grouped into discrete age classes and evaluated for the presence of stink bug injury....

Brown stink bug, Euschistus servus (Say), was infested on cotton, Gossypium hirsutum L., plants during reproductive stages to determine the effects on boll injury and seedcotton yield. During each week in 2002 and 2003, significantly more bolls with ≥1 injured locule, bolls with ≥2 injured locules, and bolls with discolored lint were re...

We investigate a simple situation with one discrete explanatory variable, say X, with finite levels and the response variable being a vector of dichotomous responses, say Y, with replicated values at each level of X. The mean of Y is modeled by a vector, say p, such that E(Y)=p(X) and var(Y)=V. The main question, whether the probability p depends o...

Two approaches, one based on the likelihood-ratio statistic and the other based on unconditioning Fisher's exact test, are examined for obtaining a p-value in the comparison of the combination of arthropod species present on a mystery carcass to the observed frequency distribution of species combinations on carcasses exposed to the elements for a k...

Deleted-case diagnostic statistics in regression analysis are based on changes in estimates due to deleting one or more cases. Bounds on these statistics, suggested in the literature for identifying influential cases, are widely used. In a linear regression model for Y in terms of X and Z, the model is “collapsible” with respect to Z if the Y−X r...

Inability to consistently rear healthy Trichoplusia ni led to a study of its rearing diseases. Four diseases were designated after preliminary research which included electron microscopy: cytoplasmic polyhedrosis (due to cytoplasmic polyhedrosis virus, or CPV), nuclear polyhedrosis (due to nucleopolyhedrovirus, or NPV), "neonate death" syndrome (mo...

The general form of prediction intervals is presented as the set of realized values of the response variable that would not lead to rejection of the hypothesis of equal population means, an approach that emphasizes consonance of the hypothesis with the data rather than confidence that the interval will cover the as yet unrealized response.

Invariant quadratics play an important role in inferences about variance components in linear models. Sometimes called “translation invariance” or “location invariance”, this property is widely defined in a way that assumes that the parameter set and the random variable in question fill the linear subspaces in which they take on values. In this not...

In a general linear model, it is shown that all admissible linear estimators are limits of linear estimators that are uniquely best at some point in an extended parameter set. The principal result shows that a linear estimator that is uniquely best at a point \(W_2\) among multiple linear estimators that are best at a point \(W_1\) is the limit of uniquely b...

A study was completed to evaluate the nutritional status of females at 3 stages of the HIV/AIDS disease. The 3 stages were identified by CD4 serum levels of ≥500/μL, 200-499/μL, and <200/μL, respectively. Nutritional status was determined by dietary assessments, anthropometrics and serum levels (albumin, WBC, hemoglobin, CD4, CD4/CD8). A total of...

This paper illustrates procedures that permit any linear hypothesis to be tested in terms of extra SS for a subset of predictor variables, and that provide RLS estimates and estimated standard deviations under any linear constraints on the regression coefficients. These can be accomplished using standard multiple regression computations.
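A minimal sketch of the extra-SS idea for a homogeneous hypothesis H0: Cβ = 0 follows. It is not necessarily the procedure the paper illustrates: it imposes the constraint by reparametrizing β = Nγ, where the columns of N span the null space of C, and then runs two ordinary least squares fits. The names `sse`, `null_space_basis`, and `extra_ss`, and the example data, are assumptions made here.

```python
import numpy as np

def sse(X, y):
    """Residual sum of squares from an ordinary least squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def null_space_basis(C, tol=1e-10):
    """Orthonormal basis (as columns) for {b : C b = 0}, via the SVD."""
    _, s, vt = np.linalg.svd(C)
    rank = int((s > tol).sum())
    return vt[rank:].T

def extra_ss(X, y, C):
    """Extra SSE due to imposing C beta = 0 on the model y = X beta + e."""
    N = null_space_basis(C)
    # Under the constraint beta = N gamma, so the restricted model
    # is an ordinary regression on the columns of X N.
    return sse(X @ N, y) - sse(X, y)

# Hypothetical data: test equality of the three slope coefficients.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(30), rng.normal(size=(30, 3))])
y = rng.normal(size=30)
C = np.array([[0.0, 1.0, -1.0, 0.0],
              [0.0, 0.0, 1.0, -1.0]])
print(extra_ss(X, y, C))
```

When X has full column rank, this extra SS agrees with the familiar quadratic form (Cb)′[C(X′X)⁻¹C′]⁻¹(Cb) in the unrestricted estimate b, which gives a convenient numerical check.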

Forensic entomological evidence is most often used to estimate the postmortem interval (PMI). Satisfactory techniques have not been available to quantify the precision of such a PMI estimate. For Cochliomyia macellaria (F.) (Diptera: Calliphoridae), we describe construction of a confidence interval on age of a larva, given its weight. The method re...

Some Studentized linear estimates do not follow Student's t distributions because the numerator and denominator are not independent. Recognized in residuals, the problem can occur in other contexts as well. A remedy is easily obtained by adding another predictor variable to the regression model that removes the shared degree of freedom.

In linear models with two variance components, mean squared errors (MSEs) of invariant quadratic estimators of linear combinations of the variance components are quadratics in the variance components. However, it is revealing to note that MSEs are linear in convex combinations of the squares and products of the variance components, so that surfaces...

All intensity data in a powder XRD pattern can be used to provide useful results for the interpretation of the mineralogical variability of geological samples. Simple statistical methods using Bonferroni's theorem permit the recognition of areas usually associated with mineral peaks where sample patterns are different at a predetermined level of co...

The effect of deleting case i from a linear model can be studied by adding a dummy variable that is equal to one for case i and equal to zero for all other cases. A case is outlying if the coefficient of the corresponding dummy is nonzero. By including a dummy for each case, one can detect outliers using variable selection techniques. The objective...
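The equivalence between deleting case i and adding its dummy variable can be verified with a small sketch. Only the single-case version is shown, not the variable selection procedure the abstract goes on to describe; the function name `fit_with_case_dummy` and the example data are assumptions made for illustration.

```python
import numpy as np

def fit_with_case_dummy(X, y, i):
    """Fit y on X plus a dummy that is 1 for case i and 0 otherwise.

    Returns (coefficients on X, coefficient on the dummy).
    """
    dummy = np.zeros(len(y))
    dummy[i] = 1.0
    X_aug = np.column_stack([X, dummy])
    coef, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
    return coef[:-1], coef[-1]

# Hypothetical data.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(12), rng.normal(size=12)])
y = rng.normal(size=12)
i = 4

beta_dummy, gamma = fit_with_case_dummy(X, y, i)

# Same regression with case i removed from the data.
beta_deleted, *_ = np.linalg.lstsq(np.delete(X, i, axis=0),
                                   np.delete(y, i), rcond=None)

print(np.allclose(beta_dummy, beta_deleted))
print(gamma, y[i] - X[i] @ beta_deleted)
```

The coefficients on X match the deleted-case fit, and the dummy's coefficient equals the prediction error for case i from that fit, which is why a nonzero dummy coefficient marks an outlying case.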

The LM test is modified to test any value of the ratio of two variance components in a mixed effects linear model with two variance components. The test is exact, so it can be used to construct exact confidence intervals on this ratio. Exact Neyman-Pearson (NP) tests on the variance ratio are described. Their powers provide attainable upper bounds on...

This paper reexamines the issue of common stock market risk stationarity by applying a newly available exact test for random-walk regression coefficients. For each eight-year subperiod tested in the 1951–1974 interval, betas for individual New York Stock Exchange-listed stocks appeared to be nonstationary. The statistical powers of the exact test a...

Problems encountered in estimating variance components in mixed-effects analysis of variance (ANOVA) models have stimulated research in linear estimation. Pukelsheim (1976) noted that quadratics in the observations which are invariant to fixed effects follow a linear model in the variance components, and that results in linear estimation theory can...

Operational measurement methods may be developed to measure the value of information in the data reported by the U.S. Government. Illustrative measures for cost of cotton production statistics indicate that the benefits from these data in certain important uses may far exceed their costs. If similar measures could be provided for major Government d...

Necessary and sufficient conditions for a linear estimator to be admissible among linear estimators are described. The model assumed is general, allowing for relations between elements of the mean vector and covariance matrix, and allowing the covariance matrix to vary in an arbitrary subset of nonnegative definite symmetric matrices.

A linear regression function is developed for use in a classification procedure. The regression function is chosen to minimize the number of misclassifications and, secondarily, to minimize the sum of absolute residuals. In situations where the categories of classification correspond to ranges of values of a criterion variable in one dimension, thi...

Shortcomings of commonly used estimators of variance components have been noted often in previous work. Hodges and Lehmann [2] noted that the sample variance is dominated by a simple multiple of itself. Klotz, Milton and Zacks [3] demonstrated that the customarily-used estimator of the among-groups variance component in the balanced, random, one-wa...

Exact tests of the hypothesis that the mean vector of an observed T-variate random variable follows a conventional fixed-coefficient linear model, against the alternative that the regression parameters vary according to a first-order Markov process, are derived. Power functions of the tests are investigated in order to provide means of choosing a t...

A class of linear estimators, called Bayes linear estimators, is developed by finding, among all linear estimators, ones which have least average total mean squared error, averaged over parameter points. Ridge, generalized ridge, restricted least squares, subset least squares, least squares, best, and generalized inverse linear estimators are all e...

This paper develops a new algorithm for the estimation of components of variance in the mixed ANOVA model. This algorithm is 'efficient' since the computational effort (measured by the number of products) is proportional to n, the number of observations. The method of estimation on which the algorithm is based can be identified with special cases o...

The collection of variance-covariance matrices for any linear model may be represented, without altering relationships among linear unbiased estimators, as a compact convex subset of nonnegative definite matrices throughout the relative interior of which all matrices are positive definite.

Admissibility and completeness of linear unbiased estimators in a linear model with general covariance structure are examined. Under rather unrestrictive conditions, a minimal complete class of linear unbiased estimators is described.

Multinomial samples may be fully or partially classified. In this paper, procedures for optimally determining sample size allocations between fully and partially classified data sets are described and illustrated. Expressions for the large-sample variances of the maximum likelihood estimators using both completely and partially classified data are...

There are no uniformly best invariant quadratic estimators of the variance components in the random, one-way ANOVA model. In the balanced model the ANOVA estimators are best among unbiased estimators, but are inadmissible among all invariant quadratic estimators. Two classes of admissible invariant quadratic estimators are presented in this paper:...

Human chorionic somatomammotropin (HCS) levels were studied in normal smoking and nonsmoking primiparous adolescent pregnancies. 136 teenagers, aged 12-18 years, were divided into groups: nonsmokers, deep and shallow inhalers, long and short puffers, high and low tar, and high and low nicotine. Shallow inhaling and low nicotine exposure pati...

Linear combinations of variance components for which there exist unbiased, non-negative quadratic estimators are characterized. It is shown that the ‘error’ component in ANOVA models is the only single component which can be so estimated.

Best quadratic estimators of the variance components in a general linear model are presented for each of several classes of estimators of the form Y'AY. Of the classes examined, three contain biased estimators and the others contain only unbiased estimators. Attainable lower bounds on mean squared errors of estimators in each class are described. I...

A convenient representation of the covariance matrix for a general $k$-level random, nested AOV model is obtained, as well as an expression for its inverse and determinant.

The report describes a computer program implementing the Hartley-Hocking convex programming algorithm. The two parts of this report are, respectively, a description of the Hartley-Hocking method as extracted from the original paper, and the documentation of the computer program.

A number of criteria have been proposed for selecting the best subset or subsets of independent variables in linear regression analysis. Applying these criteria to all possible subsets is, in general, not feasible if the number of variables is large. Many of the criteria are monotone functions of the residual sum of squares; hence the problem is red...

Pectin esterase (PE) activities in abscission zones, other portions of leaves, and adjacent stem tissues were compared in attached leaves and abscissing petioles (previously debladed) of Coleus blumei Benth. and Phaseolus vulgaris L., cv. Canadian Wonder. Earlier findings of Osborne in bean were confirmed and changes in PE activity in coleus were shown to resemble tho...