The fallacy of ratio correction to address confounding factors.
ABSTRACT Scientists aspire to measure cause and effect. Unfortunately, confounding variables, ones that are associated with both the probable cause and the outcome, can lead to an association that is true but potentially misleading. For example, altered body weight is often observed in a gene knockout; however, many other variables, such as lean mass, will also change as the body weight changes. This leaves the researcher asking whether the change in that variable is expected for that change in weight. Ratio correction, often referred to as normalization, is a method commonly used to remove the effect of a confounding variable. Although ratio correction is used widely in biological research, it is not the method recommended in the statistical literature to address confounding factors; instead, regression methods such as the analysis of covariance (ANCOVA) are proposed. ANCOVA examines the difference in means after adjusting for the confounding relationship. Using real data, this manuscript demonstrates how the ratio correction approach is flawed and can result in erroneous calls of significance, leading to inappropriate biological conclusions. This arises because some of the underlying assumptions are not met. The manuscript goes on to demonstrate that researchers should instead use ANCOVA, and discusses how graphical tools can readily be used to judge the robustness of this method. This study is therefore a clear example of why assumption testing is an important component of a study and thus why it is included in the Animal Research: Reporting of In Vivo Experiments (ARRIVE) guidelines.
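The contrast between the two approaches can be made concrete with a small simulation, a minimal sketch in Python assuming a lean-mass example like the one above (all variable names, numbers, and seeds are illustrative, not taken from the paper): lean mass follows the same weight relationship in both genotypes, so there is no true genotype effect, yet the ratio test can still flag one while ANCOVA adjusts for weight.

```python
# Illustrative simulation only: ratio correction vs ANCOVA for a weight-confounded trait.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 40
genotype = np.repeat(["WT", "KO"], n // 2)
# Knockouts are lighter, but lean mass follows the same weight relationship
# (with a non-zero intercept) in both groups: no genuine genotype effect.
body_weight = rng.normal(28, 2, n) - (genotype == "KO") * 3
lean_mass = 5 + 0.5 * body_weight + rng.normal(0, 0.5, n)

df = pd.DataFrame({"genotype": genotype,
                   "body_weight": body_weight,
                   "lean_mass": lean_mass,
                   "ratio": lean_mass / body_weight})

# Ratio correction ("normalization"): test lean mass / body weight between genotypes.
# Because the intercept is non-zero, the ratio still depends on body weight,
# so the lighter knockouts tend to show a spuriously different ratio.
ratio_fit = smf.ols("ratio ~ genotype", data=df).fit()

# ANCOVA: model lean mass with body weight as a covariate and test the
# genotype difference in means after adjusting for the confounding relationship.
ancova_fit = smf.ols("lean_mass ~ genotype + body_weight", data=df).fit()

print("ratio-test p-value:", ratio_fit.pvalues["genotype[T.WT]"])
print("ANCOVA p-value:    ", ancova_fit.pvalues["genotype[T.WT]"])
```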
ABSTRACT: A significant challenge of in vivo studies is the identification of phenotypes with a method that is robust and reliable. The challenge arises from practical issues that lead to experimental designs which are not ideal. Breeding issues, particularly in the presence of fertility or fecundity problems, frequently lead to data being collected in multiple batches. This problem is acute in high-throughput phenotyping programs. In addition, in a high-throughput environment, operational issues lead to controls not being measured on the same day as knockouts. We highlight how application of traditional methods, such as a Student's t-test or a two-way ANOVA, in these situations gives flawed results and should not be used. We explore the use of mixed models using worked examples from the Sanger Mouse Genome Project, focusing on Dual-Energy X-Ray Absorptiometry data from mouse knockouts, and compare the results to a reference-range approach. We show that mixed-model analysis is more sensitive and less prone to artefacts, allowing the discovery of subtle quantitative phenotypes essential for correlating a gene's function to human disease. We demonstrate how a mixed-model approach has the additional advantage of being able to include covariates, such as body weight, to separate the effect of genotype from that of the covariates. This is a particular issue in knockout studies, where altered body weight is a common phenotype; accounting for it will enhance the precision of assigning phenotypes and the subsequent selection of lines for secondary phenotyping. The use of mixed models in in vivo studies has value not only in improving the quality and sensitivity of the data analysis but also ethically: as a method suitable for small batches, it reduces the breeding burden of a colony. This will reduce the use of animals, increase throughput, and decrease cost whilst improving the quality and depth of knowledge gained.
PLoS ONE 2012; 7(12): e52410.
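A minimal sketch of how such a mixed model could be set up, assuming simulated data with illustrative column names (bone_density, genotype, body_weight, batch); the actual pipeline, traits, and variable names in the project differ:

```python
# Illustrative simulation only: genotype and body weight as fixed effects,
# batch (measurement day) as a random intercept.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_batches, per_batch = 8, 10
batch = np.repeat(np.arange(n_batches), per_batch)
# A few knockouts per batch measured alongside wild-type controls (illustrative design).
genotype = np.where(np.arange(n_batches * per_batch) % per_batch < 2, "KO", "WT")
body_weight = rng.normal(27, 2, n_batches * per_batch)
batch_shift = rng.normal(0, 0.1, n_batches)[batch]          # day-to-day drift
bone_density = (0.02 * body_weight + batch_shift
                + (genotype == "KO") * 0.05
                + rng.normal(0, 0.05, n_batches * per_batch))

df = pd.DataFrame({"bone_density": bone_density, "genotype": genotype,
                   "body_weight": body_weight, "batch": batch})

# Fixed effects: genotype (of interest) and body weight (covariate);
# random intercept per batch absorbs the temporal/batch-to-batch variation.
fit = smf.mixedlm("bone_density ~ genotype + body_weight",
                  data=df, groups=df["batch"]).fit()
print(fit.summary())
```

In the fitted summary, the genotype coefficient is the knockout effect after adjusting for body weight and batch variation, which is the separation of genotype effect from covariates described in the abstract.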