Who is Slovin, and where and how did Slovin's Formula for determining the sample size for survey research originate?
Slovin's Formula is quite popularly used in my country for determining the sample size for survey research, especially in undergraduate theses in education and the social sciences, perhaps because it is easy to use and the computation is based almost solely on the population size. Slovin's Formula is given as follows: n = N/(1 + Ne^2), where n is the sample size, N is the population size, and e is the margin of error to be decided by the researcher. However, its misuse is now also a popular subject of research here in my country, and students are usually discouraged from using the formula even though the reasons are not clear enough to them. Perhaps it would be helpful if we could know who Slovin really is and what the bases of his formula were.
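For concreteness, the computation in the formula above can be sketched in a few lines of Python. The function name, the input checks, and the choice to round up are illustrative assumptions, not part of any published statement of the formula.

```python
import math


def slovin_sample_size(population_size: int, margin_of_error: float) -> int:
    """Return n = N / (1 + N * e^2), rounded up to a whole respondent.

    Hypothetical helper for illustration only; rounding up is an assumption.
    """
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be between 0 and 1")
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(n)


# Example values are made up: N = 1000 and e = 0.05.
print(slovin_sample_size(1000, 0.05))
```

Note that nothing about the variable being measured enters the computation, only N and e.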
Khalid Hassan referenced this to Slovin (1960). An Internet search identified Slovin's first name as Ramus, and several papers referenced Slovin's formula to 'Sevilla et al., 1960:182'.
Thank you, Dr. Wilson. But the problem is that, although Dr. Sevilla cited Slovin (1960) in her Research Methods book, that work by Slovin was not included in the Bibliography of the book, so it is still "questionable". By the way, Dr. Sevilla and I come from the same country (the Philippines), where Slovin's Formula is frequently used (or perhaps misused). Also, "Ramus" or "Rumus" may not be the first name of Slovin. Rumus, if I am not mistaken, is an Indonesian word for "formula". Again, thank you very much.
I have never heard of Slovin before, but as to "...where and how did the Slovin's Formula for determining the sample size for a survey research originated?" I think I have a good idea.
In order to estimate a sample size, you start with the standard deviation of the population. At least you do for continuous data with which I am familiar, and similarly for proportions, in spite of the 'worst case' calculators on the internet, and I suppose analogously with other data.
The standard deviation of a population is a fixed but generally unknown value that you need to estimate. (Perhaps a pilot study is needed.) To estimate sample size needs, we generally estimate the number of observations required to estimate the mean of the population (mean or total of a finite population), with a given standard error, and the estimated standard error of the mean is a function of sample size and standard deviation of the population. If you set your margin of error as a number of standard errors (say t or z if normally distributed), then your sample size is that needed to obtain an estimated standard error of the mean that gives you the confidence interval you desire.
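The relationship described above can be sketched as follows for simple random sampling, ignoring any finite population correction. This is a minimal illustration, assuming a normal approximation (a z multiplier) and a planning value for the standard deviation, for example from a pilot study. The function name and the example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def srs_sample_size_for_mean(std_dev: float, margin_of_error: float,
                             confidence: float = 0.95) -> int:
    """Smallest n with z * std_dev / sqrt(n) <= margin_of_error.

    Assumes simple random sampling from a large population and a
    normal approximation; std_dev is a planning estimate.
    """
    # Two-sided multiplier, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * std_dev / margin_of_error) ** 2)


# Made-up planning values: standard deviation 15, margin of error 2.
print(srs_sample_size_for_mean(15, 2))
```

The margin of error here is explicitly a multiple of the standard error, which is the point of the paragraph above.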
There are a number of good sampling books that would help. Here are two of them:
Cochran, W.G. (1977), Sampling Techniques, 3rd ed., John Wiley & Sons.
Blair, E. and Blair, J. (2015), Applied Survey Sampling, Sage Publications.
The 'formula' for estimating standard errors is found in many places, and you can see that you just need a big enough sample size, n, to make the standard error small enough. The margin of error is then some multiple of that estimated standard error that gives you a confidence interval that is reasonable for your application.
Often it is helpful to include a finite population correction (fpc) factor, which accounts for the fact that as the sample size n approaches the population size N, there is less and less sampling error. (But don't forget nonsampling error.)
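The effect of the fpc on a sample size requirement can be sketched as below. This assumes the commonly used approximate adjustment n = n0 / (1 + n0/N), where n0 is the requirement computed without the fpc. The function name and the numbers are hypothetical.

```python
import math


def apply_fpc(n0: float, population_size: int) -> int:
    """Adjust an infinite-population sample size n0 for a finite population.

    Uses the approximate form n = n0 / (1 + n0 / N); this form is an
    assumption made for illustration.
    """
    if n0 <= 0 or population_size <= 0:
        raise ValueError("n0 and population_size must be positive")
    return math.ceil(n0 / (1 + n0 / population_size))


# A made-up requirement of n0 = 217 shrinks noticeably when N is small
# and hardly at all when N is very large.
print(apply_fpc(217, 1000))
print(apply_fpc(217, 1_000_000))
```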
This (above) has all been considering simple random sampling. Note that other designs are similar, but more involved. Also note that the data are often considered 'normal' or at least that the Central Limit Theorem will help, but there is always Chebyshev's Inequality to fall back on if you must.
FINALLY, getting to Slovin:
If you look at page 77 in Cochran (1977), then you may understand better what Slovin apparently did. Because he has N in his (her?) 'formula,' he apparently was considering the fpc. He would apparently need to have started with equation 4.4 on page 77 of Cochran. But he then solved for n differently than Cochran shows, though the format looks similar. It appears that an approximation was made. The problem that I see with Slovin's Formula is probably also what appeals to some. Though simple, the formula may be misleading when you do not make it clear, as Cochran does, that the margin of error has to be thought of as a multiple of the estimated standard error for a confidence interval, and not some arbitrary number that sounds fixed. Cochran even points out that in practice "...it is often more stable and easier..." to consider the ratios of the standard deviations to the means, the "coefficients of variation," when estimating sample size, than to determine a reasonable standard deviation alone. This is also on page 77.
So, as far as "...where and how did the Slovin's Formula for determining the sample size for a survey research originated?" is concerned, it must have been a result of determining a relationship between sample size and the standard error of the estimated mean for a simple random sample, with an fpc, but apparently using some other approximation in the algebra. The most dangerous part of its use may be that it is likely passed around for use without explanation as to its limitations and applications.
[Similarly, for proportions, it is easy to sidestep the embedded estimation of a standard error by using an online calculator that assumes p=q=0.5 (a worst case) and/or ignores the fpc (a further 'worst case'), and is apparently commonly used by people without their knowing that it is only for yes/no questions anyway. So, if the use of Slovin's Formula is discouraged, it may be because it is similarly misused. Because it looks so simple, people may often not understand how to use it.]
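To show what such a 'worst case' calculator is implicitly doing, here is a minimal sketch for a single yes/no proportion under simple random sampling. It assumes a normal approximation and the approximate fpc form n0 / (1 + n0/N). The function name, defaults, and example values are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Optional


def proportion_sample_size(margin_of_error: float, p: float = 0.5,
                           confidence: float = 0.95,
                           population_size: Optional[int] = None) -> int:
    """Sample size for estimating one proportion under SRS.

    p = 0.5 maximizes p * (1 - p), which is the 'worst case'.
    The fpc step uses the approximation n0 / (1 + n0 / N).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n0 = z ** 2 * p * (1 - p) / margin_of_error ** 2
    if population_size is not None:
        n0 = n0 / (1 + n0 / population_size)
    return math.ceil(n0)


print(proportion_sample_size(0.05))                        # worst case, no fpc
print(proportion_sample_size(0.05, p=0.1))                 # far from 0.5
print(proportion_sample_size(0.05, population_size=1000))  # worst case with fpc
```

A planning value of p far from 0.5 requires a much smaller sample, which is why a calculator that hides this choice can mislead.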
It also may be a problem that I did not see exactly how Slovin's formula was estimated from equation 4.4 in Cochran.
One of the references in Cochran's 1977 Sampling Techniques book is the 1960 book titled Sampling in a Nutshell by M. J. Slonim. I think it is possible that Slonim was simply misspelled as Slovin. However, we still need to check whether the "formula" in question, or something similar to it, is in the said book by Slonim.
As your two questions on "Slovin" have come together now, I will repeat this response under both of them.
Yes, I had noticed a reference in Cochran (1977) to "Slonim," and wondered about it, but according to the author index, Slonim (1960) is only referenced on page 4 as follows: "The books by Deming (1960) and Slonim (1960) contain many interesting examples showing the range of applications of the sampling method in business." (W. Edwards Deming became quite famous regarding business and quality, by the way. He was also one of the well-known authors of a population sample study in Greece in 1947. See Cochran on page 160: Jessen et al. (1947). I'm glad you found a copy of the 3rd ed of Cochran.)
I don't know where you can find Slonim (1960), but I note that Eddie See does not believe that Slonim came up with this formula. It is hard to say when things get lost in time. I think the worst thing missing is probably the caveats about its limitations whenever someone uses it. But as for giving it a name, I can see Eddie's point that if he (she?) did not first develop it, why name it that way? But then, I can also see that if he made (hopefully) good use of it, then why not give it his name if the true origin is unknown and you want to refer to it quickly? But actually, I don't really see it as being very useful anyway.
Maybe, as I think that perhaps you indicated, the origins may be discovered.
By the way, I did a similar historical check in the area of weighted least squares regression/heteroscedasticity, regarding the coefficient of heteroscedasticity and its use in a commonly employed format for regression weights. Phil Kott, a US statistician, had told me he'd seen it as far back as Cochran's 1st edition, as I recall, but Ken Brewer, the author of one of the books I cited in at least one of these two "Slovin" questions, sent me to a much earlier reference:
Fairfield Smith (1938). An empirical law describing heterogeneity in the yields of agricultural crops. The J. Agri. Sci., 28, 1-23.
(I'm told that "Fairfield Smith" is an unhyphenated, two-word last name.)
So things do get lost in time. On page 158 in the 3rd ed of Cochran, 1977, Cochran gives Ken Brewer (1963), and Richard Royall (1970) credit, basically, for starting model-based inference for finite populations (or at least its modern use). However, in one of my papers I have this:
"From Cochran (1978), page 5, an even earlier application of an “…estimated ratio of population to births …” than that of Laplace was documented by John Graunt in 1662. Cochran (1978), page 7, went on to say, that Laplace made use of superpopulations, and that 'So far as I know, this use of an infinite superpopulation in studying the properties of sampling methods was not reintroduced into sample survey theory until 1963, when Brewer (1963), followed by Royall (1970), applied it to the ratio estimator ...'"
- This is an example of model-based inference for a finite population.
Cochran, W.G. (1978), “Laplace's Ratio Estimator,” Contributions to Survey Sampling and Applied Statistics, ed. H.A. David. New York: Academic Press, 3-10.
So ... it can be interesting to know these things, and perhaps it might be best if names were not given to methodologies where the origin is not clear. People like to use names, however. - Consider the way new diseases are named. :-)
I am on the second day of a seminar on Multivariate Statistical Models organized by EINS Consultancy, where the speaker is Sir Johnny Amora. He has a presentation on the history of Slovin's formula. Bottom line: Slovin's formula is mistakenly named after a Slovin. The right person is Yamane, and the formula is only good for proportions.
There is a way to look at a worst case in sample size estimation for proportions, but that is not necessary here. The 'formula' presented in the question could even be for a mean.
That is why I'd rather write estimated sample size requirements in terms of relative standard error. Note that in Cochran, W.G. (1977), Sampling Techniques, 3rd ed., John Wiley & Sons, there is a chapter on sample size determination for means and for proportions, and around page 77, as I recall, he writes a sample size requirement estimator for simple random sampling (SRS) for estimating means, and he uses the "e" type of approach, but in the very next chapter, he uses a relative standard error type of approach to writing sample requirements when he talks about stratified random sampling. Perhaps his SRS related 'formula' was influenced by this historical perspective, but I find it less confusing to always use the relative standard error related approach. You can see the confusion here, for this question, when we have to consider what it is that we mean by "e."
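As a rough illustration of a relative standard error approach, one can solve RSE^2 = (CV^2 / n)(1 - n/N) for n under simple random sampling, which gives n = CV^2 / (RSE^2 + CV^2 / N). This is a generic sketch under that assumption, not necessarily the exact format used in any of the references mentioned here. The function name and the planning values are hypothetical.

```python
import math


def sample_size_for_rse(cv: float, target_rse: float,
                        population_size: int) -> int:
    """Sample size so the estimated mean has the target relative standard error.

    cv is a planning value for the population coefficient of variation
    (standard deviation divided by mean). Assumes SRS with an fpc.
    """
    if cv <= 0 or target_rse <= 0 or population_size <= 0:
        raise ValueError("all inputs must be positive")
    n = cv ** 2 / (target_rse ** 2 + cv ** 2 / population_size)
    return math.ceil(n)


# Made-up planning values.
print(sample_size_for_rse(cv=0.5, target_rse=0.05, population_size=1000))
print(sample_size_for_rse(cv=1.0, target_rse=0.05, population_size=1000))
```

With cv set to 1, the expression reduces algebraically to N / (1 + N * rse^2), which has the same form as the 'formula' in the question. That makes the hidden assumptions easier to see.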
At any rate, I don't see that this needs to be confined to proportions. The Pennsylvania State University webpages linked above can be seen to show this.
Cheers.
PS - Actually the "e" value is only different between mean and total, because of the factor N, and really, the only other issues are the z value, and what you put in for sigma. But still, I don't care for this format. I use the relative standard error format at the bottom of page 2888 in
This may be well-intentioned, but unfortunately is extremely misleading, as they do not describe what the "error" means that they wish you to enter. Please see the Pennsylvania State University links I supplied above.
This "calculator" oversimplifies to the point of essentially being meaningless, and likely very misleading.
Most online "calculators" are for a special worst case of sampling for proportions. The calculator above is more general, perhaps used more for means, I imagine, but in both cases they are misleading, not really being properly annotated.
In both cases, you'd be expected to be using simple random sampling, but there are other designs.
Beware of using any software if you don't know what it all means. And when you do know what you want, the software may be doing something else, so beware. Perhaps finding an example, and using the software to duplicate the result, could help.
Just yesterday, we also had a seminar-workshop in Statistics at our University as part of the National Statistics Month celebration, with Prof. Johnny Amora also as resource speaker. He also mentioned the "derivation" of the Slovin (or Yamane) sample size formula from that of Cochran. But how can we say that Slovin's or Yamane's formula was "derived" from Cochran when the references show that the book of Slovin (if it really existed) was published in 1960 and Yamane's in 1967, while Cochran's was published much later, in 1977?
You may want to see also my other question/thread on "Slovin" for more information from RG colleagues (see attached link).
Well, Cochran wrote various articles (I think maybe some from the 1930s and maybe earlier), was a co-author of at least one other book, and as for his quite famous book, Sampling Techniques, Cochran had three editions: 1953, 1963, 1977. The one people usually cite now is the 3rd edition, from 1977. He has quite an enduring reputation based on his works (books, articles, Prof at Harvard, perhaps substantially more).
Anyway, William G. Cochran was quite a pioneer, so I don't know if he did this much earlier, or simultaneously with another, or what, but I think, given his wide range of early work, it may be highly likely that he did this first, as well as explaining it well.
The main question was about the derivation of the formula, n = N/(1 + Ne^2).
The best way to answer the question is to check the original publications, that is, Slovin (if it really existed) and Yamane (1967). The famous book on Sampling Techniques by Cochran has three editions: 1953, 1963, and 1977. The early editions of the book can be a good source.
Dear Aqeel Hussain: The Slovin's formula is given in the question. Please see also the answers in another question on the said formula given in the following link:
Ah! I just revisited this. I see that Johnny found the answer as to the derivation of that 'formula' in an apparently open access publication which shows the circumstances/assumptions needed for application on pages 130-131. Mystery solved.
Johnny, it would be great for you to provide this answer to other questions on Slovin which have occurred on ResearchGate, such as Romer's other question, noted above. I thought I remembered figuring out the assumptions, but I don't see such an answer here. But since the link you provided shows this, and if it is open access, it would be good for more people to know about the special circumstances needed for Slovin to be applicable, if you would provide that link. - Otherwise, you could just summarize the information from those two pages. - Cheers.
Yes, I have also read about this formula and have seen some people using it, but I was never interested in exactly who developed it. Thanks to all colleagues for sharing this interesting information.
n = N/(1 + Ne^2): who is the primary author of this formula, Yamane or Slovin? I know the book by Slovin was published in 1960 and Yamane's in 1967. Researchers use the two names interchangeably. Which one is the right author?
Dear Bekele: The existence of the book of Slovin (and even "his" own existence) is actually a big question. (You may also read the other previous answers to this question in this thread.)
Sir Johnny Amora, may I ask about Slovin's formula? I am currently working on a study and I'm thinking of using Slovin's, but after reading the comments here I am not sure anymore. I also can't read your link since it can't be opened. Thank you.
If you use statistical models (such as the t-test, ANOVA, Pearson's r, regression analysis, path analysis, or SEM, among others) to test the hypotheses of your study, then I suggest you conduct a statistical power analysis to compute your minimum sample size. Sample size is a function of the following components: effect size, errors in decision (Type I and Type II), and the complexity of the statistical model, among others. Statistical power analysis, or simply power analysis, means finding the optimal combination of these components. You can use the G*Power software, which can be downloaded for free. Just search for it on Google.
Slovin's formula has been taught by "irresponsible professors" in Philippine colleges and universities. Sorry for the strong words, but that is true. Because they do not have formal training in statistics, they taught the wrong things to their students. It is like the blind leading the blind.
Anyway, you may read a published article titled "On the misuse of Slovin's Formula".
I am traveling now and am just using my phone to reply to your message. I will email you the said article later if you want. If interested, just email me at johnny.amora@gmail.com and I will send you some materials, including the said article.
How can researchers justify the use of one technique over another for estimating sample size, for example the use of G*Power versus other online techniques? Are there any peer-reviewed research studies that justify their use?
Tahir Sufi, G*Power and the others you described as "online techniques" are just calculators. Sample size for research that uses statistical tests such as the t-test, ANOVA, or regression analysis, among others, depends on effect size, errors in decision (Type I and Type II errors), statistical power, the complexity of the statistical test being used, etc. You should choose your effect size, errors in decision (Type I and Type II errors), and statistical power.
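To make the roles of these components concrete, here is a minimal sketch of a power-based sample size for comparing two independent group means. It uses the textbook normal approximation n per group = 2 * ((z_(1-alpha/2) + z_(power)) / d)^2, where d is the standardized effect size. This is not a reproduction of what G*Power computes internally, and exact t-based calculations typically give a slightly larger number. The function name and defaults are hypothetical.

```python
import math
from statistics import NormalDist


def two_sample_n_per_group(effect_size: float, alpha: float = 0.05,
                           power: float = 0.80) -> int:
    """Approximate n per group for a two-sided, two-sample comparison of means.

    effect_size is the standardized mean difference (Cohen's d).
    Normal approximation only; an illustrative sketch.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # controls Type I error
    z_power = NormalDist().inv_cdf(power)          # power = 1 - Type II error
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


for d in (0.2, 0.5, 0.8):
    print(d, two_sample_n_per_group(d))
```

Unlike the 'formula' in the question, the population size does not appear at all. The required sample is driven by the effect size, the two error rates, and the test being used.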
There appears to be no person named Slovin with published work of this kind. You can search Google yourself. Given how widely the formula is used, if a person named Slovin existed, we should be able to find him. The hard truth today is that we cannot. Further, the literature on Slovin's formula appeared only recently, while the same formula had already been put forward by Yamane, designed for sampling from a finite population for estimation involving proportions. It assumes the maximum variation pq, where q = 1 - p and p = 0.5, and a two-tailed significance level of alpha = 0.05, with the normal z score of 1.96 rounded to 2.0. Applying Cochran's formula for a finite population under these assumptions leads to the so-called Slovin's formula. The truth, therefore, is that it is a special-case formula, valid only when these specific assumptions are satisfied. Otherwise it should not be used. If I were you, I would just not use it. Use Yamane and Cochran.
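The reduction described above can be written out as a short derivation. This is a sketch that assumes the common approximate form of the finite population adjustment, n = n_0 / (1 + n_0/N), where n_0 is the sample size for an infinite population. The symbol n_0 is introduced here only for illustration.

```latex
\begin{align*}
n_0 &= \frac{z^2 p q}{e^2} = \frac{(2)^2 (0.5)(0.5)}{e^2} = \frac{1}{e^2}, \\
n   &= \frac{n_0}{1 + n_0/N} = \frac{1/e^2}{1 + 1/(N e^2)} = \frac{N}{1 + N e^2}.
\end{align*}
```

Read this way, e is a margin of error for a proportion at roughly 95% confidence under maximum variance, which is why the formula should not be applied outside those conditions.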
Hello dear colleagues, In my opinion, the sample size should be determined according to the objectives of the research and the type of study. It is not possible to consider a general rule for all studies. For example, for survey studies, one method should be used to determine the sample size and other methods should be used for factor analysis and experimental research designs. This formula is very simplistic in determining the sample size. Thanks
Sampling may also vary with the scope and delimitation of the research. In some cases, students in basic education are minors and have limited access to their possible respondents, which is why they use Slovin's formula. But I totally agree with Sir Johnny, since power analysis conventionally targets 80% statistical power.