A Simple Method of Sample Size Calculation for Linear and Logistic Regression

Department of Statistics, Stanford University, Palo Alto, California, United States
Statistics in Medicine (Impact Factor: 1.83). 08/1998; 17(14):1623-34. DOI: 10.1002/(SICI)1097-0258(19980730)17:143.0.CO;2-S
Source: PubMed

ABSTRACT

A sample size calculation for logistic regression involves complicated formulae. This paper suggests use of sample size formulae for comparing means or for comparing proportions in order to calculate the required sample size for a simple logistic regression model. One can then adjust the required sample size for a multiple logistic regression model by a variance inflation factor. This method requires no assumption of low response probability in the logistic model as in a previous publication. One can similarly calculate the sample size for linear regression models. This paper also compares the accuracy of some existing sample-size software for logistic regression with computer power simulations. An example illustrates the methods.
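The two-step procedure described in the abstract (a two-sample formula for the simple model, then a variance inflation adjustment for the multiple model) can be sketched as follows. The abstract does not state the formulas, so the closed forms below are the ones commonly attributed to this approach and should be checked against the full text before use. All function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist


def _z(p):
    """Standard normal quantile."""
    return NormalDist().inv_cdf(p)


def n_continuous(p1, log_or_per_sd, alpha=0.05, power=0.80):
    """Simple logistic regression with one continuous covariate.

    Assumed form: n = (z_{1-alpha/2} + z_{power})^2 / (p1 (1 - p1) b^2),
    where p1 is the event rate at the covariate mean and b is the log
    odds ratio for a one-standard-deviation increase in the covariate.
    """
    za, zb = _z(1 - alpha / 2), _z(power)
    return (za + zb) ** 2 / (p1 * (1 - p1) * log_or_per_sd ** 2)


def n_binary(p1, p2, b, alpha=0.05, power=0.80):
    """Simple logistic regression with one binary covariate.

    Assumed form, based on the comparison of two proportions:
    p1 = event rate when X = 0, p2 = event rate when X = 1,
    b = proportion of the sample with X = 1.
    """
    za, zb = _z(1 - alpha / 2), _z(power)
    p = (1 - b) * p1 + b * p2  # overall event rate
    numerator = (
        za * math.sqrt(p * (1 - p) / b)
        + zb * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2) * (1 - b) / b)
    ) ** 2
    return numerator / ((p1 - p2) ** 2 * (1 - b))


def inflate(n_simple, r_squared):
    """Variance inflation adjustment for a multiple regression model.

    r_squared is the squared multiple correlation between the covariate
    of interest and the other covariates in the model.
    """
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must be in [0, 1)")
    return n_simple / (1 - r_squared)


if __name__ == "__main__":
    n1 = n_continuous(p1=0.5, log_or_per_sd=math.log(1.5))
    print("simple model:", math.ceil(n1))
    print("multiple model, R^2 = 0.2:", math.ceil(inflate(n1, 0.2)))
```

Rounding up with `math.ceil` is a design choice here, so that the planned sample never falls below the computed requirement.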

Available from: Michael D Larsen, Aug 25, 2015
  • Source
    • "To determine the minimum required sample size, we use "knowledge of required period of the first ANC visit for pregnant women" as the principal independent variable. Then, the sample size required to detect an odds ratio of 02 with statistical power of 0.80 and α = 0.05 using logistic regression was 300 [10]. All pregnant women, regardless of rank and length of pregnancy, were selected as they came for their ANC visit and were interviewed after giving consent."

    Preview · Article · Sep 2015
  • Source
    • "Clearly, in most situations this is an upper bound [25][26] that leads to conservative estimates, but it may be useful in many practical applications. In order to evaluate Equation (25), we need to calculate $R_i^2$ and $\sigma^2$."

    Full-text · Article · Jan 2015 · Open Journal of Statistics
  • Source
    • "Using a significance threshold (alpha-level) of 5% and power of 70%, the required sample size for African ancestry to be associated with the three Dengue outcome severity measures are: 298, 116 and 85 respectively. This is done in the same multivariate regression model that was fitted in Table 6, i.e. includes all the covariates, and assumes the same regression coefficients (Hsieh et al., 1998; Mefford and Witte, 2012). All the analyses were performed using the R statistical software (www.r-project.org) "
    ABSTRACT: The wide variation in severity displayed during Dengue Virus (DENV) infection may be influenced by host susceptibility. In several epidemiological approaches, differences in disease outcomes have been found between some ethnic groups, suggesting that human genetic background has an important role in disease severity. In the Caribbean, it has been reported that populations of African descent present a considerably lower frequency of severe forms than Mestizo and White self-reported groups. Admixed populations offer advantages for genetic epidemiology studies because of the variation and distribution of alleles, such as those involved in disease susceptibility, and because they can help explain individual variability in clinical outcomes. The current study analysed three Colombian populations which, like most Latin American populations, are the product of complex admixture processes between European, Native American and African ancestors. Its main goal was to assess the effect of genetic ancestry, estimated with 30 Ancestry Informative Markers (AIMs), on DENV infection severity. We found that African ancestry has a protective effect against severe outcomes under several systems of clinical classification: Severe Dengue (OR: 0.963 for every 1% increase in African ancestry, 95% confidence interval (0.934 - 0.993), p-value: 0.016), Dengue Haemorrhagic Fever (OR: 0.969, 95% CI (0.947 - 0.991), p-value: 0.006), and occurrence of haemorrhages (OR: 0.971, 95% CI (0.952 - 0.989), p-value: 0.002). Conversely, a decrease from 100% to 0% African ancestry significantly increases the chance of severe outcomes: OR is 44-fold for Severe Dengue, 24-fold for Dengue Haemorrhagic Fever, and 20-fold for occurrence of haemorrhages. Furthermore, several warning signs also showed statistically significant associations, giving more evidence at specific stages of DENV infection. These results provide consistent evidence from which to infer statistical models that offer a framework for future genetic epidemiology and clinical studies.
    Full-text · Article · Oct 2014 · Infection Genetics and Evolution

Questions & Answers about this publication

  • Luke C Pilling asked a question in Epidemiology:
    What is the best method for calculating the power of an epidemiological study of DNA methylation (using 450k array)?
    I know of methods available to calculate the power/sample-size required for a 'simple' epidemiological study using binary or linear regression models (e.g. https://www.researchgate.net/publication/13586507_A_simple_method_of_sample_size_calculation_for_linear_and_logistic_regression) - I was wondering whether anyone knows of methods specific to studies of genome-wide methylation (specifically, Illumina 450k arrays)?