
Sample size determination. Influencing factors and calculation strategies for survey research

Abstract

The paper reviews both the influencing factors and the calculation strategies of sample size determination for survey research. It indicates the factors that affect the sample size determination procedure and explains how they do so. It also provides calculation methods (including formulas) that can be applied directly and easily to estimate the sample size needed in the most common situations.
Neurosciences 2003; Vol. 8 (2): 79-86
Ali A. Al-Subaihi, PhD.
From the Institute of Public Administration, Riyadh, Kingdom of Saudi Arabia.
Published simultaneously with special permission from Saudi Medical Journal.
Address correspondence and reprint requests to: Dr. Ali A. Al-Subaihi, Assistant Professor of Research Methodology and Applied Statistics, Institute of Public
Administration, Riyadh 11141, Kingdom of Saudi Arabia. Tel. +966 (1) 4745146. Fax. +966 (1) 4792136. E-mail: subaihia@ipa.edu.sa
Descriptive research is one of the 3 broad categories
of research (the other 2 categories are correlational
and experimental) and is used to describe the
characteristics of the subjects of a study. Descriptive
research can be classified, in terms of how data are
collected, as either survey research or observational
research. Survey research is the best-known type of
self-report research and a widely used technique in many
fields, including health administration, public
administration, education, sociology, and economics.1 A
survey is an attempt to collect data from members of a
population in order to determine the current status of
that population with respect to one or more variables.
Surveys can be divided into 2 broad categories: the
questionnaire and the interview. A questionnaire is
usually a paper-and-pencil instrument that the respondent
completes, whereas an interview is completed by the
interviewer based on the respondent's answers. In survey
research, as in other research types, once the research
problem has been defined, the first 2 questions that
concern researchers are: How many subjects are needed for
the study, and how can they be selected? Unfortunately,
the answers to these questions are not as easy as the
researcher would like. Some factors influence the
determination of the sample size, and others influence
the choice of sampling design. The researcher needs to
know these factors and their effects beforehand to
succeed in determining an adequate sample size. This
paper attempts to highlight the factors relevant to
determining the minimum sample size needed for
descriptive studies and to introduce some useful
strategies that can be employed for sample size
determination.
Factors influencing sample size determination.
Determining the adequate sample size is the most
important design decision that faces the researcher.2 The
reason is that with too small a sample, the research will
lack the precision to provide reliable answers to the
questions under investigation, whereas with too large a
sample, time and resources will be wasted, often for
minimal gain. As stated previously, several factors play
a vital role in determining the sample size. Knowing
these factors and their effects helps the researcher to
determine the sample
size needed appropriately. These factors are: the
sampling design, statistical analysis, level of precision,
level of confidence, degree of variability, and
non-response rate. Each of these factors and its
influence is described below:
1) The sampling design. Because characteristics vary
among subjects in the population, the researcher
typically applies a scientific sampling design in the
sample selection process to reduce the probability of
obtaining a biased view of the population. The researcher
may draw a sample using a simple random sampling
procedure or a complex multistage sampling procedure
that includes stratification, clustering, and unequal
probabilities of selection. Alternatively, they might
draw a sample using one of the non-probability sampling
designs, such as convenience sampling, judgmental
sampling, quota sampling, or snowball sampling.3-7
The objective of the survey, as well as other factors,
helps to determine the appropriate sampling design and a
valid data collection methodology. In order to describe
the target population adequately and make statistically
valid inferences for the population using the sample
survey data, the researcher should incorporate the sample
design in both procedures: the sample size determination
and the data analysis.8,9 That is, the researcher ought to
use the sample size determination procedure that matches
the sampling design to be applied, since the sample size
required differs from one sampling design to another. To
illustrate, the sample size needed to estimate the mean
monthly spending of a finite population (N = 8,000)
within a maximum allowable difference between the
estimate and the true value of d = $2 at the 95% level of
confidence is 217 under simple random sampling, whereas a
smaller sample suffices under stratified random sampling
(for more details, see examples 3-5-1 and 3-5-3).
Moreover, if one of the non-probability sampling designs
is going to be used, a judgmental sample size
determination procedure must be employed. The judgmental
technique is normally recommended for non-probability
sampling designs because the mathematical formulas that
help in determining the sample size cannot be derived
when the selection probabilities are unknown.
The lack of an objective method for non-probability
sampling designs, together with the widespread
familiarity of the simple random design's sample size
determination techniques, leads most researchers to apply
the random design's method to non-probability designs.
This is done so frequently that it has become quite
normal to see a published survey study report that one of
the probability sample size determination methods was
employed when a non-probability sampling design was
actually applied. This is unfortunate, since the validity
of a sample size determination technique that does not
match the sampling design is questionable.
2) The statistical analysis that is planned. Another
consideration is the number of subjects needed for the
data analysis. If descriptive statistics (for example,
mean, SD, frequencies, and so forth) are going to be
used, the situation is complicated because no single
method (or formula) works in every survey study. If
inferential statistics (for example, t-test, analysis of
variance, multiple regression, and so forth) are to be
used, power analysis should be employed to determine the
sample size needed. The main goal of power analysis is to
allow the researcher to decide how large a sample is
needed for statistical judgments that are accurate and
reliable. The power of an inferential statistical test is
the probability of rejecting a false null hypothesis, and
it is determined by 4 factors: sample size, level of
significance, size of the population SD (σ), and
magnitude of the difference between the means. Making an
appropriate change in any of these factors increases the
power of a test. The sample size has a direct
relationship with the power of a test. Thus, the simplest
way to increase the power is to increase the sample size.
For more details on determining sample size using the
power of a test, the reader is referred to Cohen,10
Kirk,11 and Trochim.1
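As a rough illustration of how these 4 factors interact, the sketch below computes the power of a two-sided, one-sample z-test with known σ. The choice of test, the function names, and the example values (δ = 0.5, σ = 1, α = 0.05) are assumptions made here for illustration only; the paper does not prescribe a particular test, and the references cited above should be consulted for real designs.

```python
from statistics import NormalDist


def z_test_power(n, delta, sigma, alpha=0.05):
    """Power of a two-sided one-sample z-test (known sigma) for a true mean shift delta."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    shift = abs(delta) * n ** 0.5 / sigma       # true shift measured in standard errors
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


def smallest_n_for_power(target, delta, sigma, alpha=0.05, n_max=100_000):
    """Smallest sample size whose power reaches the target (simple linear search)."""
    for n in range(1, n_max + 1):
        if z_test_power(n, delta, sigma, alpha) >= target:
            return n
    raise ValueError("target power not reached within n_max")


# Power rises with the sample size when the other three factors are held fixed.
for n in (10, 20, 40, 80):
    print(n, round(z_test_power(n, delta=0.5, sigma=1.0), 3))

# Sample size needed for 80% power under the assumed values.
print(smallest_n_for_power(0.80, delta=0.5, sigma=1.0))
```

Under these assumptions, raising n, raising α, shrinking σ, or enlarging δ each increases the returned power, which mirrors the 4 factors listed in the text.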
3) The level of precision. The sampling error is
defined as the difference between the parameter (a
numerical quantity, such as the mean or SD, calculated
from data collected from the entire population) and the
statistic (a numerical quantity calculated from data
collected from a sample). The sampling error is called
precision in sampling contexts, and it gives the
researcher some idea of the accuracy of the statistical
estimate. The level of precision, which can also be
expressed as a percentage such as ± 3%, ± 5%, ± 7%, or
± 10% (the values commonly used in human studies), is
the range within which the true value of the parameter
is estimated to lie. For instance, if the researcher
finds that 80% of subjects in the sample have acquired a
skill (or knowledge) under study with a precision level
of ± 10%, the researcher might conclude that between 70%
and 90% of subjects in the population have acquired the
skill. The level of precision has an inverse relationship
with the sample size. That is, the smaller the
predetermined level of precision, the larger the sample
size needed. The reason is that the larger the sample,
the closer it is to the actual population itself. If the
researcher takes a sample that contains the entire
population, there is no sampling error (namely,
parameter = statistic). The relationship between the
level of precision and the sample size is not linear but
curvilinear (Figure 1a); that is, the rate of improvement
in the precision decreases as the sample size increases.2
For example, from Figure 1a, one can see that a sample of
size 250 is only half as precise as a sample of size
1,000: quadrupling the sample size merely halves the
sampling error.
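The curvilinear relationship can be checked numerically. The sketch below is a minimal illustration, assuming the setting of Figure 1a (a normally distributed population with σ = 1) and the margin of error d = Zσ/√n obtained by rearranging formula 1-b given later in the paper; the function name is hypothetical.

```python
from statistics import NormalDist


def margin_of_error(n, sigma=1.0, confidence=0.95):
    """Half-width d = Z * sigma / sqrt(n) of the interval around a sample mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / n ** 0.5


# Each doubling of n shrinks the margin by a factor of sqrt(2), not 2.
for n in (250, 500, 1000, 2000, 4000):
    print(n, round(margin_of_error(n), 4))

# Ratio of the margins at n = 250 and n = 1,000.
print(margin_of_error(250) / margin_of_error(1000))
```

The ratio printed on the last line is 2 (up to floating-point rounding), because the margin is proportional to 1/√n.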
4) The level of confidence. The level of confidence,
which is based on ideas encompassed by the central limit
theorem (CLT) (for more details on the CLT, the reader is
referred to Glass and Hopkins),12 is a value that
indicates the probability that the range around the
sample estimate contains the parameter being estimated.
Like the level of precision, the level of confidence is
expressed as a percentage, such as 90%, 95%, or 99%,
which are the values commonly used in social studies. If
a 95% level is selected, 95 out of 100 samples will have
the true population parameter within the range of
precision specified earlier (Figure 1b). There is always
a chance, however, that the sample obtained does not
capture the true population value. The shaded areas in
Figure 1b represent such samples with extreme values.
This risk is reduced for 99% confidence levels and
increased for 90% confidence levels.13 The level of
confidence has a direct relationship with the sample
size. That is, other things being held constant, the
higher the predetermined confidence level, the larger the
sample size needed (Figure 1c). For example, the sample
sizes needed to estimate the population parameter with a
± 3% precision level at the 90%, 95%, and 99% confidence
levels are 699, 964, and 1556, respectively.
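The text does not state which population size or variability lies behind these three figures. As a hedged check, the sketch below evaluates formula 2-a (given later in the paper) under the assumption p = 0.5 and N = 10,000; both values are guesses made here, chosen because they reproduce the quoted sample sizes to within rounding. The function name is hypothetical, and the Z values are those of Table 1.

```python
# Z values for the common confidence levels, as listed in Table 1.
Z_VALUES = {90: 1.645, 95: 1.960, 99: 2.576}


def srs_proportion_n(N, p, e, z):
    """Formula 2-a: sample size for a proportion under simple random sampling."""
    variability = p * (1 - p)
    return N * variability / ((N - 1) * e ** 2 / z ** 2 + variability)


# Assumed inputs (not stated in the paper): N = 10,000 and p = 0.5, with e = 0.03.
sizes = {cl: srs_proportion_n(10_000, 0.5, 0.03, z) for cl, z in Z_VALUES.items()}
for cl, n in sizes.items():
    print(f"{cl}% confidence: n = {n:.1f}")
```

Whatever inputs were actually used, the ordering is the point: with precision and variability fixed, a higher confidence level always demands a larger sample.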
5) The degree of variability. In some studies, the
objective is to estimate the percentage (proportion, p)
of subjects in the population having some attribute. For
example, the researcher may wish to estimate the
proportion of females in a nurse population, the
proportion of divorced women in a working-woman
population, or the proportion of teenagers in a smoker
population. In this situation, the variable of interest
is an indicator variable, namely:
$$x_i = \begin{cases} 1 & \text{if subject } i \text{ has the attribute} \\ 0 & \text{if subject } i \text{ does not have it} \end{cases}$$
where $x_i$ denotes the variable of interest. The
proportion of subjects in the population having some
attribute refers to the distribution of attributes in the
population. The degree of variability in the attributes
being measured equals p (1-p) and has a direct
relationship with the sample size. That is, the greater
the variability of the distribution of attributes in the
population, the larger the sample size required to obtain
a given level of precision; the less variable the
population, the smaller the sample size. For example, the
sample sizes needed to estimate true proportions p = 0.20
and 0.50 with a 3% precision level and a 95% confidence
level are 683 and 1067 subjects, respectively (Figure
1d). From Figure 1d, one may notice that a proportion of
0.50 (or, equivalently, 50%) requires the largest sample
size, since it indicates greater variability than any
other proportion value. Thus, p = 0.50 is usually used to
determine a conservative sample size when the true
variability of the population attribute is unknown.
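The effect of variability can be tabulated with formula 2-b (the infinite-population formula given later in the paper). The sketch below is only an illustration; the function name is hypothetical, and Z = 1.96 and e = 0.03 are taken from the example in the text.

```python
def infinite_population_n(p, e=0.03, z=1.96):
    """Formula 2-b: n = Z^2 * p * (1 - p) / e^2 for an infinite or unknown population."""
    return z ** 2 * p * (1 - p) / e ** 2


# Sample size as a function of the true proportion; it peaks at p = 0.5.
for p in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9):
    print(p, round(infinite_population_n(p)))
```

Because p (1-p) is symmetric around 0.5, proportions such as 0.2 and 0.8 call for the same sample size, and p = 0.5 gives the upper bound used as the conservative choice.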
6) The non-response rate. The sample size needed
refers to the number of valid responses, not the number
of subjects selected. In other words, the sample size
that the researcher obtains is not the number of subjects
who were selected to participate in the research; rather,
it is the number of subjects who responded correctly to
the survey. The difference between the 2 numbers is
called non-response error, which is common in survey
research. This is unfortunate because a high non-response
rate might lead to biased results.19 Common sources of
non-response are refusal, inability to answer, and
failure to locate the subject. The consequences of a high
non-response error vary. As the non-response rate
increases, the possibility of having a biased sample
increases, because the responses obtained from a
probability sample may no longer be representative of the
target population. In addition, non-response might reduce
a probability sample to a convenience sample, and
consequently the conclusions are weaker.14 In an effort to
obtain enough data for analysis, the researcher should
increase the sample size needed by a certain percentage
to compensate for non-response. This percentage varies
according to who is surveyed. The non-response rate among
busy people (such as politicians, businessmen, general
managers, and so forth), for example, is higher than
among the general public. Thus, the researcher who wants
to survey busy persons must take that into consideration
and select more subjects to ensure an adequate sample
size. Since the non-response rate varies according to who
is surveyed, the researcher is advised to consult
previous studies in the research area to determine the
percentage needed for the non-response adjustment.
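The paper gives no formula for this adjustment. One common convention, assumed here rather than taken from the text, is to divide the number of valid responses required by the expected response rate. The function name and the 20% non-response rate in the sketch below are hypothetical.

```python
import math


def subjects_to_select(valid_responses_needed, non_response_rate):
    """Inflate the required number of valid responses by the expected non-response.

    Assumption: selected = needed / (1 - non_response_rate), rounded up.
    """
    if not 0 <= non_response_rate < 1:
        raise ValueError("non_response_rate must lie in [0, 1)")
    return math.ceil(valid_responses_needed / (1 - non_response_rate))


# 385 valid responses needed, with 20% of those selected expected not to respond.
print(subjects_to_select(385, 0.20))  # 482
```

Adding the percentage directly (385 x 1.20 = 462) is a simpler alternative, but it undershoots: 80% of 462 is only about 370 responses.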
Strategies for determining sample size. There are
several approaches to determining the sample size. These
include using a census for small populations, imitating
the sample size of similar studies, using an Internet
sample size calculator, using published tables, and using
formulas. Each strategy is discussed below:
A) Using a census for small populations. Survey
research that attempts to acquire data from each and
every member of the population is called a census survey,
and one approach to determining the sample size is to use
the entire population as the sample. If the population
that the researcher wishes to study is small (for
example, <200), the researcher should measure the
variables of interest for every subject in the
population. The reason is that a census eliminates
sampling error, and in small populations virtually the
entire population would have to be sampled anyway to
achieve a desirable level of precision.
B) Using a sample size of a similar study. Another
approach to determining the size of the sample needed for
a particular study is to use the same sample size as
studies similar to the one being planned. However, the
researcher must note that without reviewing the
procedures employed in these studies, they run the risk
of repeating errors that were made in determining the
sample size for those studies. Nevertheless, reviewing
the literature in the study's discipline, along with the
procedures employed, can reduce the possibility of
repeating the same errors and can provide guidance on the
sample sizes that are typically used.
C) Using an Internet sample size calculator. A third
way to determine sample size is to use one of the
Internet sample size calculators, which provide the
sample size for a given set of criteria. Sites such as:
http://www.surveysystem.com/sscalchtm
http:/ebook.stat.ucla.edu/calculators/sampsize.phtm and
http://www.azplansite.com/samplesize.htm (and many
others) provide an interactive way to determine the
sample sizes that would be necessary for given
combinations of precision, confidence level, and
variability. These sites and similar ones are designed to
provide a sample size that reflects the number of
responses obtained, which does not necessarily equal the
number of surveys that should be mailed or interviews
that must be planned (for more details about the
difference between the two numbers, see the non-response
rate factor); that presumes the attributes being measured
are distributed normally (or nearly so) with an estimated
proportion p = 0.5; and that assumes a simple random
sampling design is going to be employed. Note that if the
normality assumption cannot be met, then using the entire
population should be considered.
Figure 1 - A chart showing a) the relationship between the precision and the sample size required to estimate the mean of a normally distributed population
with σ = 1; b) the 95% of sample means within 2 SD; c) the relationship between the level of confidence and sample size; d) the
relationship between the degree of variability and sample size; e) the sample size determination flowchart. CL - confidence level
D) Using published tables. A fourth way to determine
sample size is to use published sample size tables, which
are available in almost every research methodology and
sampling textbook. These tables are designed in exactly
the same way as the Internet calculators, but only for
fixed, predetermined combinations of precision,
confidence level, and variability. This limitation,
together with the spread of Internet usage among
researchers, has reduced the usefulness of the tables.
Thus, tables are not provided here.
E) Using mathematical formulas. The traditional
method of determining the sample size needed is to use
the mathematical formulas directly. These formulas cover
most of the probability sampling designs and can be used
in studies that aim to estimate either the population
mean or the population proportion. Indeed, they are the
ones used in the Internet sample size calculators and the
published tables. The most frequently used formulas are:
(i) Simple random sampling (mean). Simple random
sampling is a method of selecting n subjects out of the N
subjects in the population such that every one of the
$$C_N^n = \frac{N!}{n!\,(N-n)!}$$
distinct samples has an equal chance of being drawn.1
The sample size for a study that employs a simple random
sampling design to estimate the population mean is given
by formula 1-a:
$$n = \frac{N S^2}{(N-1)\dfrac{d^2}{Z^2} + S^2}$$
where n is the sample size needed, N is the population
size, Z is the inverse of the standard normal cumulative
distribution that corresponds to the level of confidence,
$\sigma^2$ is the variance of the attribute in the
population, which is usually estimated by a pilot sample
variance ($S^2$), and d is the maximum allowable
difference between the estimate and the true value. The Z
values that correspond to the frequently used confidence
levels are shown in Table 1. When the population size is
infinite or unknown, the sample size is estimated by
using formula 1-b:
$$n = \frac{Z^2 \sigma^2}{d^2} \approx \frac{Z^2 S^2}{d^2}$$
Survey samplers interested in estimating µ rarely have
an estimate of σ readily available for use in either
formula 1-a or formula 1-b. Thus, rather than guessing
its value, the researcher should conduct a pilot study to
estimate the population SD σ. Typically, the pilot sample
is relatively small (namely, n = 10 or 20).13
A pilot study (example 3-5-1) was conducted to
estimate the population variance ($\sigma^2$) of monthly
spending on health care among families. The sample
variance ($S^2$) was found to be 232 (dollars squared).
What is the sample size needed to estimate the population
mean (µ) within d = $2 of the true mean with 95%
confidence if simple random sampling is planned and
N = 8,000? The sample size needed is:
$$n = \frac{N S^2}{(N-1)\dfrac{d^2}{Z^2} + S^2} = \frac{8{,}000\,(232)}{(8{,}000-1)\dfrac{(2)^2}{3.841} + 232} = 216.77 \approx 217$$
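Formulas 1-a and 1-b are easy to script. The following is a minimal Python sketch rather than the author's implementation; the function names are hypothetical, Z = 1.96 is the 95% value from Table 1, and rounding up to the next whole subject is a choice made here.

```python
import math


def srs_mean_n(N, s2, d, z=1.96):
    """Formula 1-a: sample size for a mean under simple random sampling, finite population."""
    return N * s2 / ((N - 1) * d ** 2 / z ** 2 + s2)


def srs_mean_n_infinite(s2, d, z=1.96):
    """Formula 1-b: the same quantity when the population is infinite or unknown."""
    return z ** 2 * s2 / d ** 2


# Example 3-5-1: N = 8,000, pilot variance S^2 = 232, d = 2, 95% confidence.
print(math.ceil(srs_mean_n(8000, 232, 2)))       # 217
# Dropping the finite-population term gives a slightly larger sample.
print(math.ceil(srs_mean_n_infinite(232, 2)))    # 223
```

As N grows, the result of formula 1-a approaches that of formula 1-b, which is why the latter serves as the fallback for unknown population sizes.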
Table 1 - The Z values that correspond to the frequently used confidence levels.
Level of confidence (%)   Z       Z²
90                        1.645   2.706
95                        1.960   3.841
99                        2.576   6.635
Z - the inverse of the standard normal cumulative distribution that corresponds to the level of confidence
(ii) Simple random sampling (proportion). When a
simple random sampling design is employed to estimate the
population proportion, the following formula can be used
to determine the sample size (formula 2-a):
$$n = \frac{N p\,(1-p)}{(N-1)\dfrac{e^2}{Z^2} + p\,(1-p)}$$
where p is the estimated proportion of the population
having the attribute of interest (so that p (1-p) is its
variability), e is the precision level, and the remaining
variables are as defined in formula 1-a. When the
population size is infinite or unknown, the sample size
is estimated by using formula 2-b:
$$n = \frac{Z^2 p\,(1-p)}{e^2}$$
If the actual variability of the attribute of interest in
the population is unknown, p in formula 2-b is replaced
by 0.5 to give formula 2-c:
$$n = \frac{0.25\,Z^2}{e^2}$$
In example 3-5-2, suppose that the nurse population in
the Kingdom of Saudi Arabia is 5,000. What is the sample
size needed for a study to estimate the proportion of
women in the nurse population with 5% precision and 95%
confidence if the simple random sampling design is going
to be used? Since there is no prior information on the
degree of variability, formula 2-c is used to estimate
the sample size required:
$$n = \frac{0.25\,Z^2}{e^2} = \frac{0.25\,(3.841)}{(0.05)^2} = 384.1 \approx 385 \text{ subjects}$$
(iii) Stratified random sampling (mean). In stratified
sampling, the population of N subjects is divided into
subpopulations (or strata) according to a specific
characteristic such as age, gender, race, and so forth,
and a sample is drawn from each stratum. When a simple
random sample is taken in each stratum, the whole
procedure is described as stratified random sampling.1 To
determine the sample size needed to estimate the
population mean under a stratified random sampling
design, formula 3 is used:
$$n = \frac{\sum_{i=1}^{L} \dfrac{N_i^2 S_i^2}{w_i}}{N^2 \dfrac{d^2}{Z^2} + \sum_{i=1}^{L} N_i S_i^2}$$
where L is the total number of strata, $N_i$ is the size
of stratum i, $S_i^2$ is the estimated variance of the
attribute in stratum i, $w_i$ is the proportion of $N_i$
to N, and the remaining variables are as previously
defined.15
In example 3-5-3, the researcher wants to estimate the
mean monthly household spending on health care in a
particular city, where households are divided into 3
strata: high, medium, and low income. The mean spending
is expected to vary among the 3 strata. The population
size of the city is 8,000 households, of which 2,000 are
high income, 1,000 are medium income, and 5,000 are low
income. A pilot study was conducted to estimate the
variance in each stratum, which was found to be 225, 300,
and 170 (dollars squared). What are the sample sizes
needed to estimate the population mean (µ) within d = $2
with 95% confidence? Here N = 8,000; $N_i$ = 2,000,
1,000, 5,000; $S_i^2$ = 225, 300, 170; and
$$w_i = \frac{N_i}{N} = 0.25,\ 0.125,\ 0.625$$
Plugging these values into formula 3 yields
$$n = \frac{\dfrac{(2000)^2 (225)}{0.25} + \dfrac{(1000)^2 (300)}{0.125} + \dfrac{(5000)^2 (170)}{0.625}}{(8000)^2 \dfrac{(2)^2}{(1.96)^2} + \left[(2000)(225) + (1000)(300) + (5000)(170)\right]} = \frac{1.28 \times 10^{10}}{6.824 \times 10^{7}} = 187.58 \approx 188$$
Applying the strata weights, the researcher should
sample $n_1 = w_1 n$ = 0.25 (188) = 47; $n_2 = w_2 n$ =
0.125 (188) = 23.5 ≈ 24; and $n_3 = w_3 n$ = 0.625 (188)
= 117.5 ≈ 118 households from stratum 1, stratum 2, and
stratum 3, respectively.
(iv) Stratified random sampling (proportion). The
sample size needed to estimate the population proportion
when stratified random sampling is used is given by
formula 4:
$$n = \frac{\sum_{i=1}^{L} \dfrac{N_i^2\, p_i (1-p_i)}{w_i}}{N^2 \dfrac{e^2}{Z^2} + \sum_{i=1}^{L} N_i\, p_i (1-p_i)}$$
where $p_i$ is the subpopulation proportion for stratum
i, and the remaining variables are as defined
previously.15
In example 3-5-4, refer to example 3-5-3. Find the
sample sizes $n_1$, $n_2$, and $n_3$ needed to estimate
the population proportion with 5% precision and 95%
confidence. Since there is no information on the strata
proportions, the conservative values ($p_1 = p_2 = p_3$
= 0.5) should be used.
$$n = \frac{\dfrac{(2000)^2 (0.25)}{0.25} + \dfrac{(1000)^2 (0.25)}{0.125} + \dfrac{(5000)^2 (0.25)}{0.625}}{(8000)^2 \dfrac{(0.05)^2}{(1.96)^2} + \left[(2000)(0.25) + (1000)(0.25) + (5000)(0.25)\right]} = \frac{1.6 \times 10^{7}}{43{,}649.3} = 366.56 \approx 367$$
Applying the strata weights, the researcher should
sample $n_1 = w_1 n$ = 0.25 (367) = 91.75 ≈ 92; $n_2 =
w_2 n$ = 0.125 (367) = 45.88 ≈ 46; and $n_3 = w_3 n$ =
0.625 (367) = 229.38 ≈ 230 households from stratum 1,
stratum 2, and stratum 3, respectively.
(v) Cluster sampling design (mean). Cluster sampling
is a sampling technique in which the entire population is
divided, usually geographically, into clusters, and a
random sample of these clusters, not of individuals, is
selected. The number of clusters needed to estimate the
population mean when cluster sampling is employed can be
determined by formula 5:
$$n = \frac{N S_c^2}{\dfrac{N d^2 \bar{M}^2}{Z^2} + S_c^2}$$
where n is the number of clusters in a simple random
sample, N is the number of clusters in the population,
$\bar{M}$ is the average size of the clusters in the
population,
$$S_c^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x}\, m_i)^2}{n - 1}$$
is the estimated variance of the cluster totals, $x_i$ is
the total of all subjects in cluster i, $m_i$ is the size
of cluster i, and
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} m_i}$$
is the estimated population mean.
In example 3-5-5, a health insurance company wanted
to estimate the mean number of insurance subscriptions
per household in a town containing 1,500 households. The
town is divided into 200 non-overlapping geographic
areas. What is the sample size (namely, the number of
clusters) needed to estimate the mean number of insurance
subscriptions (µ) within d = 1 of the true value with 95%
confidence if the estimated variance of the cluster
totals ($S_c^2$) is 100? Here N = 200; M = 1,500; d = 1;
$S_c^2$ = 100; and $\bar{M}$ = M divided by N = 7.5. The
sample size needed is:
$$n = \frac{N S_c^2}{\dfrac{N d^2 \bar{M}^2}{Z^2} + S_c^2} = \frac{200\,(100)}{\dfrac{200\,(1)^2 (7.5)^2}{(1.96)^2} + 100} = 6.6 \approx 7$$
(vi) Cluster sampling design (proportion). The sample
size required to estimate the population proportion when
cluster sampling is used can be estimated by using
formula 6:
$$n = \frac{N S_c^2}{\dfrac{N e^2 \bar{M}^2}{Z^2} + S_c^2}$$
where
$$S_c^2 = \frac{\sum_{i=1}^{n} (\alpha_i - p\, m_i)^2}{n - 1}$$
is the estimated variance of the number of successes per
cluster, $\alpha_i$ is the number of subjects in cluster
i having the attribute of interest, and
$$p = \frac{\sum_{i=1}^{n} \alpha_i}{\sum_{i=1}^{n} m_i}$$
is the proportion.
In example 3-5-6, refer to example 3-5-5. Find the
sample size (namely, the number of clusters) needed to
estimate the population proportion (p) with 5% precision
and 95% confidence, if the estimated variance of the
cluster totals ($S_c^2$) is 5. The sample size is:
$$n = \frac{N S_c^2}{\dfrac{N e^2 \bar{M}^2}{Z^2} + S_c^2} = \frac{200\,(5)}{\dfrac{200\,(0.05)^2 (7.5)^2}{(1.96)^2} + 5} = 81.16 \approx 82$$
The flowchart in Figure 1e helps guide the researcher to
the suitable method for determining the sample size
needed for survey research.
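The stratified and cluster formulas (formulas 3 to 6) can be evaluated with a short script. The following is a minimal Python sketch, not the author's implementation: the function names are hypothetical, the strata weights are assumed to be proportional (w_i = N_i / N), Z = 1.96 is used throughout, and rounding allocations up to whole units is a choice made here. Formula 4 is obtained from formula 3 by substituting p_i (1 - p_i) for the stratum variance, and formulas 5 and 6 share one function because they differ only in whether the tolerance is d or e.

```python
import math


def stratified_mean_n(sizes, variances, d, z=1.96):
    """Formula 3: total sample size for a mean under stratified random sampling."""
    N = sum(sizes)
    weights = [Ni / N for Ni in sizes]  # proportional weights w_i = N_i / N
    numerator = sum(Ni ** 2 * s2 / w for Ni, s2, w in zip(sizes, variances, weights))
    denominator = N ** 2 * d ** 2 / z ** 2 + sum(Ni * s2 for Ni, s2 in zip(sizes, variances))
    return numerator / denominator


def stratified_proportion_n(sizes, proportions, e, z=1.96):
    """Formula 4: formula 3 with p_i * (1 - p_i) in place of the stratum variance."""
    variabilities = [p * (1 - p) for p in proportions]
    return stratified_mean_n(sizes, variabilities, e, z)


def cluster_n(n_clusters, s2_cluster, tolerance, mean_cluster_size, z=1.96):
    """Formulas 5 and 6: number of clusters to sample (tolerance is d or e)."""
    penalty = n_clusters * tolerance ** 2 * mean_cluster_size ** 2 / z ** 2
    return n_clusters * s2_cluster / (penalty + s2_cluster)


def allocate(n, sizes):
    """Split a total sample across strata in proportion to stratum size, rounding up."""
    N = sum(sizes)
    return [math.ceil(n * Ni / N) for Ni in sizes]


strata = [2000, 1000, 5000]

# Inputs of example 3-5-3 (stratified mean, d = 2).
n_mean = stratified_mean_n(strata, [225, 300, 170], d=2)
print(n_mean, allocate(math.ceil(n_mean), strata))

# Inputs of example 3-5-4 (stratified proportion, conservative p_i = 0.5, e = 0.05).
n_prop = stratified_proportion_n(strata, [0.5, 0.5, 0.5], e=0.05)
print(n_prop, allocate(math.ceil(n_prop), strata))

# Inputs of examples 3-5-5 and 3-5-6 (200 clusters, average cluster size 7.5).
print(math.ceil(cluster_n(200, 100, 1, 7.5)))    # 7 clusters
print(math.ceil(cluster_n(200, 5, 0.05, 7.5)))   # 82 clusters
```

With proportional weights the numerator of formula 3 simplifies to N times the sum of N_i S_i², which offers a quick hand check of any stratified result.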
In conclusion, the first question that comes to the
researcher's mind after writing the research question is
"How many subjects are needed?". The question is both
essential and difficult: its importance comes from the
desire to survey neither too small nor too large a
sample, and its difficulty comes from the absence of a
direct answer. Several factors play an important role in
determining the sample size, and in order to determine
the required sample size adequately, the researcher must
answer some supporting questions. The first is: Is one of
the probability sampling designs going to be used to
select the subjects? If the answer is no (namely, one of
the non-probability sampling designs is going to be
used), the researcher needs to stop here and consult
experts or previous studies in the field to determine a
satisfactory sample size, because no objective sample
size determination method can be derived for
non-probability sampling designs. If the answer is yes,
the researcher needs to answer the next question: Are
only inferential statistics going to be used to analyze
the data? If yes, the researcher has to stop here and use
statistical power analysis to determine a sufficient
sample size. The Sample Power software developed by SPSS,
or any statistical software that performs power analysis,
is recommended here. If no (namely, descriptive
statistics will be used), the researcher ought to answer
the next question: What is to be estimated (mean or
proportion)? If the population mean will be estimated,
the researcher needs to choose formula 1, 3, or 5
according to the probability sampling design that is
planned. The predetermined precision and confidence
levels are then plugged in. If the population proportion
is to be estimated, the researcher has to choose formula
2, 4, or 6, again according to the probability sampling
design to be used. The precision level, level of
confidence, and degree of variability in the attributes
being measured are then plugged into the corresponding
proportion formula.
Finally, the researcher needs to take into consideration
who is surveyed in order to adjust for the non-response
rate. The non-response rate adjustment should always be
applied, regardless of the sampling design or the
statistical analysis planned. The sample size required is
the number of valid responses, not the number of subjects
selected to participate in the study.
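The sequence of questions above can be written down as a small decision function. This is only a sketch of the procedure as described in the conclusion, with hypothetical function and argument names; it returns a label for the method to use rather than computing a sample size.

```python
# Formula numbers as assigned in the paper, keyed by (quantity estimated, sampling design).
FORMULAS = {
    ("mean", "simple random"): "formula 1",
    ("proportion", "simple random"): "formula 2",
    ("mean", "stratified"): "formula 3",
    ("proportion", "stratified"): "formula 4",
    ("mean", "cluster"): "formula 5",
    ("proportion", "cluster"): "formula 6",
}


def recommend_method(probability_design, inferential_only, target=None, design=None):
    """Walk through the supporting questions in the order given in the conclusion."""
    if not probability_design:
        return "consult experts or previous studies"
    if inferential_only:
        return "statistical power analysis"
    try:
        return FORMULAS[(target, design)]
    except KeyError:
        raise ValueError(f"no formula listed for {target!r} with {design!r}") from None


print(recommend_method(False, False))
print(recommend_method(True, True))
print(recommend_method(True, False, target="mean", design="stratified"))
```

Whichever branch is taken, the resulting sample size still has to be adjusted for the expected non-response rate.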
References
1. Trochim WM. Research Methods Knowledge Base. 2nd ed.
Available from
URL: http://trochim.human.cornell.edu/kb/index.htm.
2. McClave JT, Benson PG, Sincich T. Statistics for Business and
Economics. 8th ed. Upper Saddle River (NJ): Prentice Hall
International, Inc; 2001.
3. Hansen MH, Hurwitz WN, Madow WG. Sample Survey
Methods and Theory. Volumes I and II. New York (NY): John
Wiley & Sons Inc; 1953.
4. Kalton G. Introduction to Survey Sampling. SAGE University
Paper series on Quantitative Applications in the Social Sciences.
Series no. 07-035, Beverly Hills (CA), London (UK): SAGE
Publications Inc; 1983.
5. Cochran WG. Sampling Techniques. 3rd ed. New York (NY):
John Wiley & Sons Inc; 1977.
6. Kish L. Survey Sampling. New York (NY): John Wiley & Sons
Inc; 1965.
7. Lunsford T, Lunsford B. The Research Sample. Part I: Sampling.
Journal of Prosthetics and Orthotics 1995; 7: 105-112.
8. Israel GD. Sampling the Evidence of Extension Program Impact,
Program Evaluation and Organizational Development, IFAS,
PEOD-5. Florida (FL): University of Florida; 1992-A.
9. SAS Institute. SAS/STAT User's Guide, Version 6. 4th ed.
Cary (NC): SAS Institute Inc; 1989.
10. Cohen J. Statistical Power Analysis for the Behavioral Sciences.
2nd ed. Hillsdale (NJ): Erlbaum; 1988.
11. Kirk RE. Experimental Design. 3rd ed. Pacific Grove (CA):
Brooks/Cole Publishing Company; 1995.
12. Glass, Hopkins. Basic Statistics for the Behavioral Sciences. 3rd
ed. Allyn and Bacon; 1996.
13. Israel GD. Determining Sample Size, Program Evaluation and
Organizational Development, IFAS. PEOD-6. Florida (FL):
University of Florida; 1992-B.
14. Israel GD. Sampling Issues: Non-response, Program Evaluation
and Organizational Development, IFAS. PEOD-9. Florida (FL):
University of Florida; 1992-C.
15. Sincich T. Business Statistics By Examples. 5th ed. Upper
Saddle River (NJ): Prentice Hall International Inc; 1996.