Sampling - Science method

Sampling is concerned with the selection of a subset of individuals from within a population to estimate characteristics of the whole population.
Questions related to Sampling
  • asked a question related to Sampling
Question
5 answers
I came across a young Prosopis juliflora plant, which is highly invasive. What is the best method for collecting sand around the roots to study the potential secretion of allelopathic compounds?
Relevant answer
Answer
The method for collecting sand around Prosopis juliflora roots to study allelopathic compound secretion is informed by established research in rhizosphere sampling and allelopathy studies. Key sources include:
  1. Rhizosphere Soil Method (RSM): This technique involves collecting soil directly from the rhizosphere to evaluate allelopathic activity. Fujii et al. (2005) and Karmegam et al. (2014) have utilized this method to detect allelochemicals in the root zone.
  2. Metabolite Profiling of Rhizosphere Soil: Studies have employed metabolite profiling to identify allelochemicals in rhizosphere soils. For instance, research on allelopathic rice varieties demonstrated the accumulation of specific metabolites in the rhizosphere, providing insights into allelopathic interactions.
  3. Extraction Techniques for Allelochemicals: Methods involving the extraction of allelochemicals from rhizosphere soil using solvents like methanol have been documented. For example, a study on the allelopathic effects of Saussurea lappa detailed the extraction of potential allelochemicals from rhizosphere soil samples.
  4. Standardized Protocols for Rhizosphere Sampling: Protocols for the independent extraction of bulk, rhizosphere, and rhizoplane soil fractions have been proposed to ensure consistency and accuracy in sampling. Such standardized methods facilitate the study of plant-microbe-soil interactions and allelopathic effects. These sources provide foundational methodologies for sampling rhizosphere soil and analyzing allelopathic compounds, which can be adapted to study Prosopis juliflora.
  • asked a question related to Sampling
Question
5 answers
I have been studying a particular set of issues in methodology, and looking to see how various texts have addressed this.  I have a number of sampling books, but only a few published since 2010, with the latest being Yves Tille, Sampling and Estimation from Finite Populations, 2020, Wiley. 
In my early days of survey sampling, William Cochran's Sampling Techniques, 3rd ed, 1977, Wiley, was popular. I would like to know which books are most popularly used today to teach survey sampling (sampling from finite populations).
I posted almost exactly the same message as above to the American Statistical Association's ASA Connect and received a few recommendations, notably Sampling: Design and Analysis,  Sharon Lohr, whose 3rd ed, 2022, is published by CRC Press.  Also, of note was Sampling Theory and Practice, Wu and Thompson, 2020, Springer.
Any other recommendations would also be appreciated. 
Thank you  -  Jim Knaub
Relevant answer
Answer
Here are some recommended ones:
1. "Sampling Techniques" by William G. Cochran. This classic covers a wide range of sampling methods with practical examples. It is comprehensive and delves into both theory and application, making it valuable for students and professionals.
2. "Survey Sampling" by Leslie Kish. Another foundational text, known for its detailed treatment of survey sampling design and estimation methods. Kish's book is especially useful for those interested in practical survey applications.
3. "Model Assisted Survey Sampling" by Carl-Erik Särndal, Bengt Swensson, and Jan Wretman. This book introduces model-assisted methods for survey sampling, which blend traditional design-based methods with model-based techniques. It is ideal for more advanced readers interested in complex survey designs.
4. "Sampling of Populations: Methods and Applications" by Paul S. Levy and Stanley Lemeshow. Widely used in academia, this text provides thorough explanations of different sampling methods with a focus on real-world applications, and includes case studies and practical exercises, making it helpful for hands-on learners.
5. "Introduction to Survey Sampling" by Graham Kalton. A concise and accessible overview of survey sampling methods, well suited for beginners who need a straightforward introduction to key concepts.
6. "Designing Surveys: A Guide to Decisions and Procedures" by Johnny Blair, Ronald F. Czaja, and Edward A. Blair. This book focuses on the practical aspects of designing and conducting surveys, with particular emphasis on decision-making and procedural choices in the survey process.
  • asked a question related to Sampling
Question
9 answers
I am conducting a study testing the effectiveness of a kind of group psychotherapy. There are 10 participants in my experimental group and 14 participants in my control group. At first, I planned random assignment to the groups, but because of the timing of the group therapy, 14 of the participants wanted to be in the waitlist control group. After I created the control group, I ran a t test to compare the two groups on some study variables and found no significant difference between them. In summary, the groups have similar characteristics (e.g., age, educational level, romantic relationship status, mean scores of the participants), but the group sizes differ. Can I do my analysis with 10 people in the experimental group and 14 people in the control group? If not, how do I remove the 4 extra people in the control group?
Relevant answer
Answer
Ceren Bektaş-Aydın, please report n, mean, and SD for each of your groups. Thanks.
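For reference, unequal group sizes (10 vs. 14) are not by themselves a problem for a t test: Welch's version assumes neither equal variances nor equal n, so there is no need to remove the 4 extra controls. A minimal sketch in Python with made-up placeholder scores:

import numpy as np
from scipy import stats

# Hypothetical outcome scores; replace with the real data.
experimental = np.array([12, 15, 11, 14, 13, 16, 12, 15, 14, 13])                  # n = 10
control = np.array([11, 13, 12, 14, 12, 15, 11, 13, 12, 14, 13, 12, 11, 13])       # n = 14

# Welch's t test: does not assume equal variances or equal group sizes.
t, p = stats.ttest_ind(experimental, control, equal_var=False)
print(f"t = {t:.3f}, p = {p:.3f}")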
  • asked a question related to Sampling
Question
6 answers
I have 23,500 points. I sorted them in Excel from smallest to largest and plotted them as a scatter chart. Now I want to find the point after which the chart's slope becomes very steep (approaching 90 degrees), in other words, the point where the curve starts growing much faster.
Relevant answer
Answer
Hello Javid,
You're seeking, apparently, the point of inflection for a presumed function which links the two variables summarized in the display.
If you've fitted a model, then solve for the second derivative set to zero (an inflection is where the curvature changes sign; setting the first derivative to zero would locate an extremum instead).
If you haven't fitted a model, then you'll need to:
1. Define some measure of slope change. This could be the ratio of successive differences, (Y[k] - Y[k-1]) / (Y[k-1] - Y[k-2]).
2. Define some degree of slope change that represents a slope "shift" and not just a graduated increase.
3. Compute the slope change (#1 above) for each point on the graph, going out to the right.
4. When you come to a point at which the slope change exceeds your criterion level (#2 above), then this is your shift point.
Good luck with your work.
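To make the recipe above concrete, here is a minimal sketch in Python; the threshold of 3.0 is an arbitrary stand-in for the criterion chosen in step 2:

import numpy as np

def find_shift_point(y, threshold=3.0):
    """Return the index where the ratio of successive slopes
    first exceeds `threshold` (step 4 of the recipe above)."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                      # successive slopes (unit x-spacing)
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = dy[1:] / dy[:-1]         # slope change, step 1
    for k, r in enumerate(ratio, start=2):
        if np.isfinite(r) and r > threshold:
            return k                     # first point exceeding the criterion
    return None

y = np.sort(np.random.lognormal(mean=0.0, sigma=1.5, size=23500))  # stand-in data
print(find_shift_point(y))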
  • asked a question related to Sampling
Question
4 answers
Is it permissible for the achieved sample size to be reduced, i.e., to fall short of the calculated sample size, because many people are not willing to join the study as respondents? What should I do?
Thank you
Relevant answer
Answer
Ainul Q K -
It sounds as if you have estimated the sample size needed to obtain a given standard error for a given quantitative variable result, say for continuous data or for a proportion, noting this would have to be investigated for all important items on a survey, and hopefully the data collection method is consistent with the method used in your sample size "calculation." For example, if your calculation assumed simple random sampling for a proportion, say with a finite population correction factor, then you did try to collect data according to a simple random sample, or whatever you had decided.
Now it sounds like your problem is nonresponse. That does not just make your attainable standard error larger, which it does, it may also introduce bias because the mean of, or proportion for the nonrespondent data might have been very different from that of the responses obtained.
To reduce that bias, you could consider "response propensity." You could weight data in groups so that data like cases not obtained will be weighted more. You could research response propensity groups.
If you have covariate data, there are other things you could do, but they could be very complex. See, for example, "Comparing Alternatives for Estimation from Nonprobability Samples," by Richard Valliant, December 2019, Journal of Survey Statistics and Methodology 8(2),
DOI: 10.1093/jssam/smz003
https://doi.org/10.1093/jssam/smz003. Richard Valliant has other papers on ResearchGate. Another place to look for papers on nonprobability sampling would be under J. Michael Brick.
If you are able to obtain the missing responses on a second try, that might help. You could also try asking other 'similar' members of the population - something called "nearest neighbor."
I like using a ratio model-based approach if you have a census of good, highly related data, which form a straight-line relationship to the origin.
At any rate, you are correct in being concerned about nonresponse, not just because of higher variance, but also because bias away from the mean adds to the mean square error. You should at least note the problem, and perhaps you could do a sensitivity analysis to see how far off you might reasonably be.
Best wishes - Jim Knaub
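As a small illustration of the weighting-class idea described above, here is a hedged sketch in Python; the groups, weights, and response flags are all hypothetical:

import numpy as np
import pandas as pd

# Hypothetical frame: each sampled case has a design weight and a
# weighting class (e.g., a response-propensity group); `responded`
# flags whether the case actually answered.
df = pd.DataFrame({
    "weight":    [10.0] * 8,
    "group":     ["A", "A", "A", "A", "B", "B", "B", "B"],
    "responded": [True, True, False, True, True, False, False, True],
})

# Within each class, inflate respondents' weights so they also carry
# the weight of the nonrespondents in that class.
totals = df.groupby("group")["weight"].sum()
resp_totals = df[df["responded"]].groupby("group")["weight"].sum()
factor = (totals / resp_totals).rename("factor")
df = df.join(factor, on="group")
df["adj_weight"] = np.where(df["responded"], df["weight"] * df["factor"], 0.0)
print(df)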
  • asked a question related to Sampling
Question
3 answers
Where is the jade attracted by the brick we have thrown out? (After the Chinese idiom of casting out a brick to attract jade.)
A brand-new conception of preferable probability and its evaluation has been created. The book, entitled "Probability-Based Multi-Objective Optimization for Material Selection," is published by Springer and opens a new way for multi-objective orthogonal experimental design, uniform experimental design, response surface design, robust design, discretization treatment, sequential optimization, etc.
It aims to provide a rational approach free of personal or other subjective coefficients, and is available at https://link.springer.com/book/9789811933509,
DOI: 10.1007/978-981-19-3351-6.
Best regards.
Yours
M. Zheng
  • asked a question related to Sampling
Question
9 answers
According to my view, even the mere opening of a drawer or a cupboard already damages cultural remains. The shock of opening a drawer, the change in environment between the inside and the outside of the drawer, and the sudden light falling on an artifact that has been in the dark all damage any item, especially parchment, papyrus, and paper.
When it comes to 'taking a sample' needed for applying an analytical technique to an artifact, one can only speak of less and more destructive, because destructive it is.
For example, in Neutron Activation and Petrography, one needs either about 80 mg of pottery powder or a thin section, submitted respectively as a pellet to a nuclear reactor or on a glass slide to be examined under a microscope.
I think the formula "non-destructive sampling technique" was invented by scientists to obtain the samples they needed from a curator or conservator. I therefore suggest omitting the word "non-destructive" from the Cultural Heritage vocabulary.
Relevant answer
I think we should make a difference between sampling, materials and techniques when talking about non-destructive or non-invasive.
Taking a micro-sample from an artefact is invasive and destructive, while its analysis can be non-invasive and non-destructive. The difference between invasive and destructive should also be kept in mind: invasive means there is an interaction between the material and the technique, while destructive means the material is destroyed, i.e., cannot be re-analysed.
Based on that, some techniques are invasive, but non-destructive.
Also, semi-invasive techniques do not exist (at least in my opinion), while semi-destructive techniques do. A preparation technique such as a cross-section, for example, is semi-destructive: during preparation part of the sample is polished away, i.e., destroyed, but the rest of the sample remains whole and could be re-analysed (extraction from the resin can be discussed). We can argue about how invasive this preparation is, but that largely depends on the method of preparation/protection.
When the cross-section is analysed under an optical microscope, the destructiveness depends strongly on the material observed (staining excluded). In most cases (hard substrates, inorganic materials, resins, some organic materials as well), it is non-destructive and non-invasive.
When SEM is used, the same principle applies, which means we will have a non-destructive but invasive analysis. Again, it depends on the material analysed.
In conclusion, the terminology should be carefully explained, and what counts as non-destructive, semi-destructive, or destructive (and likewise for invasiveness) should be carefully described.
  • asked a question related to Sampling
Question
13 answers
More exactly, do you know of a case where there are repeated, continuous data, sample surveys, perhaps monthly, and an occasional census survey on the same data items, perhaps annually, likely used to produce Official Statistics?   These would likely be establishment surveys, perhaps of volumes of products produced by those establishments. 
I have applied a method which is useful under such circumstances, and I would like to know of other places where this method might also be applied.   Thank you. 
Relevant answer
Answer
This is for the crushed stone industry in the US:
I'm told these quarterly surveys are for "a select set of companies" which reminds me of how the quasi-cutoff sample of electric sales in the US got started. The electric sales survey of a select group of entities was later modified and used as a sample, first a stratified random sample with a large company censused stratum, and then only the censused stratum as a quasi-cutoff sample, all after starting an annual census of all electric sales by economic sector (residential, etc.), from the production/supply side. If one wanted to monitor the crushed stone industry the same way, I would suggest this approach using a quasi-cutoff sample with a ratio model for prediction, as is done at the US Energy Information Administration (EIA).
Does anyone know of other surveys, each of a select group of larger establishments being followed, where there is a chance to instead have an occasional census to be used for regressor data for the same data items in a more frequent sample?
  • asked a question related to Sampling
Question
3 answers
Like MPFP, importance sampling, etc.
Relevant answer
Answer
You should perform Monte Carlo simulations with at least 10,000 iterations at the worst process corner and temperature.
For HSNM and RSNM: FS corner and 125 °C
For WSNM: SF corner and -40 °C
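As an illustration of the Monte Carlo idea only (not a substitute for SPICE runs at the FS/SF corners), here is a sketch in Python; the mismatch sigma, the SNM function, and the failure criterion are all hypothetical placeholders for a real corner simulation:

import numpy as np

rng = np.random.default_rng(0)
N = 10_000                                    # at least 10,000 iterations, as suggested

# Hypothetical: local Vth mismatch for the 6 transistors of an SRAM cell.
dvth = rng.normal(0.0, 0.04, size=(N, 6))     # sigma = 40 mV (made up)

def snm(dvth_row):
    # Placeholder for a real corner simulation returning the
    # static noise margin in volts for one mismatch sample.
    return 0.18 - 0.9 * np.abs(dvth_row).max()

margins = np.apply_along_axis(snm, 1, dvth)
fail = np.mean(margins < 0.05)                # hypothetical 50 mV pass criterion
print(f"estimated failure probability: {fail:.4f}")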
  • asked a question related to Sampling
Question
4 answers
As I am carrying out my research in the area of cloud forensics, I want to run my analysis on a sample live data set or database to show the results. Can anyone suggest FTK tools that are freely available for producing my results?
Relevant answer
Answer
As part of my research I have generated cloud forensic datasets, which are published at the following links; you can check them out. A description of the datasets is also given. You can refer to my papers or contact me for more details.
Cheers.
All these datasets are generated by collecting data at the hypervisor level.
  • asked a question related to Sampling
Question
4 answers
Good afternoon,
I am carrying out monthly invertebrate sampling for future molecular studies (DNA). I am euthanizing my arthropods with 70% ethanol right after capture and then storing them in a freezer. Would 96% or pure ethanol be better for DNA preservation?
Relevant answer
Answer
I have heard, but do not remember the source, that 100% (absolute) alcohol can be used for DNA studies. Normally 70% alcohol is used for storage in alcohol, and some recommend 70% with 5% glycerol for long-term storage. If possible, freezing, as mentioned, is best for DNA studies; the colder the better. -18 °C is standard in household freezers. Dry ice holds -78 °C and can be used for short-term freezing. Liquid nitrogen is even colder, but requires special equipment.
  • asked a question related to Sampling
Question
11 answers
In the latest approaches to trying to infer from nonprobability samples, multiple covariates are encouraged.  For example, see https://www.researchgate.net/publication/316867475_Inference_for_Nonprobability_Samples.  However, in my experience, when a simple ratio model can be used with the only predictor being the same data item in a previous census (and results can be checked and monitored with repeated sample and census surveys), results can be very good.  When more complex models are needed, I question how often this can be done suitably reliably.  With regard to that, I made comments to the above paper.  (That paper is available through Project Euclid, using the DOI found at the link above.) 
Analogously, for heteroscedasticity in regression, for Yi associated with larger predicted-yi, sigma should be larger.  However, when a more complex model is needed, this is less likely to be empirically apparent.  For a one-predictor ratio model where the predictor is the same data item in a previous census, and you have repeated sample and census surveys for monitoring, this, I believe, is much more likely to be successful, and heteroscedasticity is more likely to be evident. 
This is with regard to finite population survey statistics.  However, in general, when multiple regression is necessary, this always involves complications such as collinearity and others.  Of course this has been developed for many years with much success, but the more variables required to obtain a good predicted-y "formula," the less "perfect" I would expect the modeling to be.  (This is aside from the bias variance tradeoff which means an unneeded predictor tends to increase variance.) 
[By the way, back in Cochran, W.G.(1953), Sampling Techniques, 1st ed, John Wiley & Sons, pages 205-206, he notes that a very good size measure for a data item is the same data item in a previous census.] 
People who have had a lot of experience successfully using regression with a large number of predictors may find it strange to have this discussion, but I think it is worth mulling over. 
So, "When more predictors are needed, how often can you model well?"
Relevant answer
Answer
Prediction-based inference from finite population sampling is well-established.   See
Valliant, R, Dorfman, A.H., and Royall, R.M.(2000), Finite Population Sampling and Inference: A Prediction Approach, Wiley Series in Probability and Statistics,
and
Chambers, R, and Clark, R(2012), An Introduction to Model-Based Survey Sampling with Applications, Oxford Statistical Science Series. 
This is also in parts of other sampling books such as Thompson, S.K.(2012), Sampling, 3rd ed, John Wiley & Sons. 
A classical ratio model is based on less heteroscedasticity than one may usually find in survey data, except when the effective coefficient of heteroscedasticity is reduced by data quality issues from the small responders.  It can be very useful in Official Statistics.  We can estimate sample size requirements for a subpopulation or a population modeled well by a single such model, using a sample drawn in any way which does not miss any important subdivision which would indicate more than one model is needed.  The format for the "formula" for estimating the sample size in such a case is similar to what is found for a simple random sample in Cochran, W.G.(1977), Sampling Techniques, 3rd ed, John Wiley & Sons.  See https://www.researchgate.net/publication/261947825_Projected_Variance_for_the_Model-based_Classical_Ratio_Estimator_Estimating_Sample_Size_Requirements.  Both cutoff/quasi-cutoff sampling and balanced sampling are discussed. 
Thus we can estimate the sample size required here to make satisfactory inference from a population or subpopulation which falls under the purview of one model-based classical ratio estimator.  This simple result is relatively easily verified. 
So we can see that when a simple model is appropriate, as when a single predictor is the same data item in a previous census, we may more feasibly infer from a nonprobability sample.  
Does anyone have any other experience to discuss?
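To make the single-predictor case concrete, here is a minimal sketch of the model-based classical ratio estimator in Python; x is the same data item from a previous census, and all numbers are made up:

import numpy as np

# Previous-census values for all N establishments (the regressor).
x_all = np.array([120., 95., 410., 30., 260., 75., 55., 340., 15., 180.])

# Quasi-cutoff sample: the current data item, observed only for the
# larger establishments (indices of the sampled units).
s = np.array([2, 7, 4, 9, 0])
y_s = np.array([430., 355., 270., 190., 125.])

b = y_s.sum() / x_all[s].sum()              # ratio estimator of the slope
mask = np.ones(len(x_all), dtype=bool)
mask[s] = False
total = y_s.sum() + b * x_all[mask].sum()   # predict the unobserved units
print(f"b = {b:.3f}, estimated total = {total:.1f}")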
  • asked a question related to Sampling
Question
3 answers
I have a system and I want to apply a force to drag part of it in a certain direction. I am using GROMACS patched with PLUMED: I defined two COMs (one for the group to be dragged, the other for the target position), defined the distance between them, and applied a harmonic restraint on this distance.
My question is how to determine the value of the force constant KAPPA.
An example of my input is the following:
outerP: COM ATOMS=1-10
N: COM ATOMS=20-30
d1: DISTANCE ATOMS=outerP,N NOPBC COMPONENTS
restraint1: RESTRAINT ARG=d1.z KAPPA=??? AT=0.0
Thanks
Relevant answer
Answer
Usually you 'drag' the COM of the nanoparticle by linking it to a virtual point that is moved at a certain velocity, with the COM harmonically restrained to that moving point. There are a few publications that investigate the resistance offered by a bilayer to a penetrating nanoparticle/nanotube etc.; those could be a good starting point to get an idea. I think in general you will have to try more than one value of k to see whether your results depend on it!
Best,
Nicola
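One hedged way to choose a starting KAPPA is from the thermal fluctuation of a harmonic restraint: equipartition gives <(d - d0)^2> = kB*T/kappa, so pick the positional tolerance sigma you can accept and solve for kappa (GROMACS/PLUMED units of kJ/mol and nm assumed):

# For U(d) = 0.5 * kappa * (d - d0)^2, equipartition gives
# <(d - d0)^2> = kB*T / kappa, so kappa = kB*T / sigma^2.
kB_T = 2.494            # kJ/mol at ~300 K
sigma = 0.05            # acceptable fluctuation in nm (your choice)
kappa = kB_T / sigma**2
print(f"KAPPA ~ {kappa:.0f} kJ/mol/nm^2")   # ~1000 kJ/mol/nm^2 here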
  • asked a question related to Sampling
Question
24 answers
If I use a sample size of 320 with a purposive sampling technique, how can I validate the sample size for generalizing the results? Could 320 responses be statistically sufficient to generalize the results?
Relevant answer
Answer
All due respect to you for this question
  • asked a question related to Sampling
Question
4 answers
Hello, the population of my research is university students from a specific island (my country is an archipelago), so I know the total population from the national government census reports. I did multistage random sampling from island -> province -> cities -> universities. But after I selected the universities as the sampling units, I realized I didn't have a sampling frame for each university. So for this last step of selecting respondents, is it acceptable to use non-probability sampling, such as purposive sampling, if I plan to do regression analysis? What should I do in this situation?
Side note: I have actually carried out the survey that way, and the results showed no problems, i.e., the assumptions of linear regression are fulfilled and the validity and reliability of the scales are acceptable. But I'm not sure whether what I did was justifiable...
Relevant answer
Answer
Technically no, but you have done better than most surveys like this, and further, the frames aren't available to anyone else either. I would report all of this and say that this was the best that anyone could do. Good luck and best wishes to you, David Booth
  • asked a question related to Sampling
Question
3 answers
Is it possible to use Finite Population Correction (FPC) to decide the minimum required sample size when we use Respondent Driven Sampling (RDS) approach to recruit hidden populations? Kindly share any reading material on this? An introduction to RDS is attached for your information. Thanks in advance for kind support.
Relevant answer
Answer
Suchira Suranga -
If your weights are good, so you expect reasonable inference, do you have a way to estimate variance? I am not familiar with this, but if you can estimate variance, then you generally want to only apply that to the data not in the sample. You estimate variance from what is in the sample, and apply it only to what was not in the sample. I have a Sage Encyclopedia entry for the finite population correction (fpc) factor, which I explained in terms of both design-based, and model-based methods. I suspect that the same idea applies here.
I obtained permission from Sage to post this.
Cheers - Jim
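For the design-based simple random sampling case, the fpc enters the standard error as in this minimal sketch (N, n, and s are made up):

import math

N, n, s = 5000, 400, 12.3           # population size, sample size, sample SD
fpc = math.sqrt((N - n) / (N - 1))  # finite population correction factor
se = fpc * s / math.sqrt(n)
print(f"fpc = {fpc:.4f}, SE of the mean = {se:.4f}")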
  • asked a question related to Sampling
Question
20 answers
Hello I am trying to reconstruct the far field pattern of a patch antenna at 28 GHz (lambda = 10.71 mm ). I am using a planar scanner to sample the near field with a probe antenna. The distance between patch and probe is 5 cm. The resolution of the scan is 1.53 x 1.53 mm². The total scanned surface is 195x195 mm. The NF patterns are shown in the NF_raw file.
The complex near field is then transformed using the 2D IFFT to compute the modal spectrum of the plane waves constructing the scanned near field (see C. A. Balanis, eqs. 17-6a and 17-7b). The modal components are shown in the IFFT file. The problem is that I observe an oscillation in the phase of those modal components that reminds me of aliasing effects in digital images (Moiré patterns).
These effects also propagate when I resample the modal spectrum in spherical coordinates, as seen in the Sampling file. The transformed phase therefore changes too fast per radian. The absolute value of the pattern looks reasonable.
Could someone explain why these effects occur and what steps I can implement to prevent them? Thank you for any helpful input.
Relevant answer
Answer
I scaled the phase wrong!
Here are the correct ones (I think) and the input files as well
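For readers who hit the same symptom: a spectrum phase that flips sign from one sample to the next is the classic signature of transforming an aperture whose origin sits in the middle of the array while the FFT assumes it sits at index 0. A hedged numpy sketch of the usual fix:

import numpy as np

# E is the complex sampled near field, with the probe origin at the
# center of the grid (as in a centered planar scan).
E = np.ones((128, 128), dtype=complex)   # placeholder field

# Wrong: treats sample [0, 0] as the origin, so a linear phase ramp
# (alternating +/- pi) rides on the plane-wave spectrum.
spec_bad = np.fft.ifft2(E)

# Better: move the aperture center to index [0, 0] first, then put
# zero spatial frequency back in the middle for display.
spec = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(E)))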
  • asked a question related to Sampling
Question
6 answers
My research aims to evaluate how incumbent companies can face newcomers effectively based on a case study of the Mobile Phone industry.
In this regard, I am collecting data through a survey targeting Smartphone users to better understand the strategies adopted by Mobile Phone companies.
I personally believe it is impossible (or at least very difficult) to use probability sampling, as the number of smartphone users is very large (2.9 billion users in 2017). I would like to use non-probability sampling, but I am not sure whether it would be acceptable in a research paper. What do you think?
Relevant answer
Answer
Why not.
  • asked a question related to Sampling
Question
30 answers
I am conducting a single-case study research as part of my dissertation for a Master's degree. The topic is in the area of public procurement and innovation. The aim is to explore to what extent standards referenced in public procurement allow innovation in State-Owned Enterprises (in a one country).
The research is designed as a single case-study. As identified by Robert K. Yin in his book Case Study Research, one of the rationales of a single case study is the representative or typical case. As a result, I have arranged for an interview with one procurement professional from the selected organization. However, my supervisor informed me that a single interview will not be sufficient to get unbiased and comprehensive data for analysis and discussion. Additionally, I was advised to conduct surveys if it is difficult to arrange interviews.
I do not understand why it is necessary to involve more than one participant and conduct more than one interview, or how surveys are going to help me get sufficient data, given that I am conducting qualitative research. As for data analysis, I am going to use thematic analysis, in which I will link what is said in the interview with my findings from the literature.
I would appreciate it if you could advise me on what I should do.
Relevant answer
Answer
This might depend on the scope of the study
  • asked a question related to Sampling
Question
7 answers
Hi! As I'm just starting to teach and mentor students in their coursework, I often come across a particular issue related to sampling in qualitative research. Whenever students are assigned a preliminary qualitative study, or devise a qualitative research strategy that involves getting information from other people, they often resort to using Facebook as a place to distribute their invitations to participate in research, to post links to online surveys, etc. I do not find this particularly problematic, but I sometimes encounter MA thesis proposals that resort to this strategy even though the proposed research is not presented as situated in the context of social media. I've also come across some studies that use a more structured approach, where social media is used as a platform to implement the snowball sampling principle.
My questions are:
1. Do you have any experience with that (in terms of students using social media in their sampling strategies)?
2. Have you used social media in your sampling strategies and what were your justifications to do so?
3. Should this approach be encouraged or discouraged if students are aware of the limitations of their sample creation strategies?
4. Does it matter whether the focus of such research has something to do with social media or not?
Relevant answer
Answer
  • Research is of value only when the findings from a sample can be generalized to a meaningful population. When the population addressed by the survey cannot be described, and when the sample is contaminated by self-selected respondents, findings from online surveys cannot be generalized and may therefore mislead.
  • Online surveys are becoming increasingly popular, perhaps because of their ease, convenience, and low cost of data collection, and they provide sampling frames that would otherwise be unavailable (or extremely difficult to obtain). But they commonly suffer from two serious methodological limitations: the population to which they are distributed cannot be described, and respondents with biases may select themselves into the sample.
  • Research findings are of scientific value only if they can be generalized, at the very least from the sample to the population from which the sample was drawn. This requires a representative sample, which in turn requires two conditions: first, that the population be known (it is impossible to extrapolate findings to an undefined population); and second, that a valid sampling method was used (a method that over-recruits cases with one characteristic cannot represent the population).
  • Personally speaking, I am not a fan of online surveys used for research purposes.
  • asked a question related to Sampling
Question
2 answers
Dear all,
Could you recommend any review paper (or book) comparing various downsampling methods applicable to volumetric data (preferably light microscopy or cell tomography data)?
Relevant answer
Answer
Some publications on downsampling for volumetric data:
1. Optimal distribution-preserving downsampling for large biomedical data.
2. Downsampling method for medical datasets (available via CORE: https://core.ac.uk).
  • asked a question related to Sampling
Question
2 answers
The DBS is planned for a community based study
Relevant answer
Answer
Working with large Whatman 903 cards, you need to know about inter- and intra-lot variation, which will affect your method's repeatability and precision. It would be better to test inter- and intra-lot thickness and weight. You may also need to test the effects of drying time.
  • asked a question related to Sampling
Question
11 answers
I am trying to measure the power of my study, in which I measured the level of awareness about a disease (PCOS) among university students (level of awareness was measured with a score of 22 points and served as the dependent variable in the further analyses). I sampled about 1000 students and my target population is about 30,000 students. I do not know the target population awareness score, as no similar studies have been conducted in my country. How can I calculate the post-hoc power of my study?
Relevant answer
Answer
Hello Sharad,
Please note that the formula that was provided by Getasew has to do with estimating needed sample size for purposes of estimating a population parameter (here, proportion of cases) with some desired degree of precision and at some target risk level of being entirely wrong (by not capturing the true value within the confidence interval estimate); it's not a power function determination.
If your goal is to estimate some population characteristic that is suitably expressed as a proportion (e.g., a dichotomous variable), then the approach works well for simple random sampling.
If, however, you want to test a specific hypothesis and want that test to be run with a specific a priori power level to detect an effect at least as large as some specific a priori value, then you'll need to try another approach. Here's a link that you may find helpful for this latter possibility:
Good luck with your work.
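For the a priori case described above, here is a minimal sketch with statsmodels; the effect size d = 0.5 is an arbitrary illustration, not a recommendation:

from statsmodels.stats.power import TTestIndPower

# Sample size per group to detect d = 0.5 at alpha = .05 with power = .80.
n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(round(n_per_group))   # about 64 per group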
  • asked a question related to Sampling
Question
6 answers
How does the Bioanalyzer determine the RIN of an RNA sample?
Why does the RIN sometimes appear as not applicable? What could have happened?
Relevant answer
Answer
RIN is determined from the result of a microcapillary electrophoretic RNA separation. Several features, including the total RNA ratio, the height of the 28S peak, the fast area ratio, and the marker height, are taken into consideration in the RIN algorithm.
For detail, you can read the following paper.
Schroeder, A., Mueller, O., Stocker, S., Salowsky, R., Leiber, M., Gassmann, M., Lightfoot, S., Menzel, W., Granzow, M., and Ragg, T. (2006). The RIN: an RNA integrity number for assigning integrity values to RNA measurements. BMC Mol Biol 7, 3.
The RIN value represents the degradation of the RNA. Usually, RNA with RIN < 7 is considered unsuitable for subsequent RNA sequencing.
  • asked a question related to Sampling
Question
7 answers
What`s the minimum sample of treated and non-treated observations for a study that uses a combination of Propensity Score Matching (PSM) and Difference in Difference (DID)?
I searched several articles online, and I could not find any "rule" or something that states what could be the minimum sample size of treated and non-treated observations on a study that uses PSM and DID approach as a combination. 
Relevant answer
Answer
I did propensity score matching separately for 3 different outcome measures using the same matching process (1:1, caliper 0.4, common support):
psmatch2 $treatment $dlist, outcome($ylist) caliper(0.4) ate logit common
First outcome: hospital admission within 30 days. Second outcome: mortality within 1 year. Third outcome: duration of hospital stay for the admission within 30 days. The sample size I obtained was 38 for the first outcome vs. 58 for the second outcome measure. Is it correct to obtain different sample sizes for different outcomes? Please help.
  • asked a question related to Sampling
Question
3 answers
Hi,
I am new to DL and I'm trying to classify one Landsat 8 image into 3 categories using VGG-19. I am using 8 bands (B2 to B7, B10, and panchromatic). I performed the sampling procedure, and my samples are named "1_id_b2" (category_id sample_Landsat band). I have my training and test samples in separate folders; the folder structure is similar to the attached image (folder_str). I've read that I need to create training and test labels, but I don't understand why, because I already labeled my samples.
Relevant answer
Answer
Also, the website mentioned in the 4th point does not work
  • asked a question related to Sampling
Question
7 answers
Dear Scientists,
I have a question: we want to use ANNs in regression analysis, which is a fairly straightforward use of ANNs, but the question is how many samples we need for training. Could 12 samples be enough? I produced these 12 samples by the Fractional Factorial Design (FFD) method and I need to be sure about this. I would be grateful for any information about this subject.
Many thanks in advance for your time and kind consideration.
Regards
Mohsen
Relevant answer
Answer
Dear Mohsen, there is a rule of thumb from Upadhyaya & Eryurec that says:
H = I * log2(N), where H is the number of weights, I is the size of the input vector, and N is the number of training patterns. If you have 12 training patterns and an input of size 1, you get only about 3 weights, so it is not possible to have a hidden layer. Hope it helps you.
References:
  • asked a question related to Sampling
Question
8 answers
In the process of conducting a correlational study, I got stuck planning my sampling technique. The aim is to investigate the predictors of reading performance among EFL university students of low and high proficiency levels. The population of the study consists of EFL university students majoring in English. The sample needs to comprise two groups, low and high proficiency students. However, the students' language proficiency across academic levels is not defined, which means I need to administer a placement test to divide participants into two proficiency groups. I'm thinking of including only the beginning and advanced academic levels and administering the test so as to take only low proficiency students from the beginning levels and high proficiency students from the advanced levels. The reason I need to administer the test is that students' proficiency varies considerably within levels, and hence their academic levels are not the best indicators of their proficiency. My question is: what is the best sampling method for my study?
Relevant answer
Answer
You may select the sample by applying stratified random sampling.
  • asked a question related to Sampling
Question
10 answers
How would you defend the claim that your quantitative research results represent the population even though you're using non-probability sampling (in which not every member of the population has the same chance of being included in the sample)?
Please correct me if i'm wrong. Thank you!!
Relevant answer
Answer
Rahmawati -
There is another approach, probably more often used for establishment surveys. The model-based approach. See for example Royall, R.M.(1992), "The model based (prediction) approach to finite population sampling theory," Institute of Mathematical Statistics Lecture Notes - Monograph Series, Volume 17, pp. 225-240.  Available under Project Euclid, open access:
Here are a couple of textbooks:
Chambers, R, and Clark, R(2012), An Introduction to Model-Based Survey Sampling with Applications, Oxford Statistical Science Series
And
Valliant, R, Dorfman, A.H., and Royall, R.M.(2000), Finite Population Sampling and Inference: A Prediction Approach, Wiley Series in Probability and Statistics
The idea is that you use regression. So you have to have another data source for an independent variable on everything. That is the hard part in many cases, having the auxiliary data.
There are other books where the methods are "combined," as Ken Brewer has said, and he wrote this as well:
Brewer(2014), "Three controversies in the history of survey sampling," Survey Methodology, Dec 2013 -  Ken Brewer/Waksberg Award: 
When you can use regression, you have more choices for sampling because you have data on everything and know your population better. (You might look up "balanced sampling.") Sometimes auxiliary data are available from a recently past census. Sometimes there is administrative data available online for businesses, if you have that. But my guess is that you do not have the necessary auxiliary/regressor/predictor data available to you.
Best wishes - Jim
  • asked a question related to Sampling
Question
9 answers
I'm in the market to buy a DNA extraction robot, and would really appreciate any suggestions/experience/advice.
With the projects we recently landed we're expecting to process on average about 3000-5000 noninvasive samples per year (scats, urine, saliva, all taken from the environment, not from the animal directly). DNA extraction is a total bottleneck in our lab; it's difficult to do quality control when hand-extracting (sample mixups, pipetting errors...), and it is too labor intensive (hence expensive) and slow.
I'm not too keen on the magnetic beads technology (tested some machines, didn't like them) and I'd like something that could automate regular spin column (silica membrane) extraction. QiaCube from Qiagen seems an option, but it only does 12 samples at a time. I'm looking at about 100 samples per day throughput, and can spend about 30,000€ on this (well, 40,000€ tops). Contamination prevention is critical with noninvasive sampling applications. 
I'd really appreciate any help with this.
Relevant answer
Answer
I am also eager to hear your thoughts on the robot comparison! Can you share? Do you think bead technology would work with eDNA samples?
  • asked a question related to Sampling
Question
3 answers
Please enlighten me, and attach references to past studies.
thanks
Relevant answer
Answer
Cohen's book, Statistical Power Analysis for the Behavioral Sciences, was a groundbreaking text on the determination of sample size. G*Power is simply a statistical program that implements and extends Cohen's original ideas.
  • asked a question related to Sampling
Question
16 answers
I have a high-aspect-ratio flexible aircraft wing of 2 meters, along which I want to place 6 gyroscopes to measure its deflection for research purposes. I want to collect all the data at a 100 Hz rate from all the gyroscopes (at the same time) to feed an estimator. It is not an easy task because the communication protocol needs to be fast, robust to noise generated by the BLDC motor, suitable for long distances, and cheap.
Please see specs below :
- The longest distance between the control unit and any IMU will not exceed 2 meters.
- The Data collected from all the IMU’s should be relatively at the same time.
- The communication protocol that to be used should be highly robust to noise.
- The protocol to be used can be adapted with available microcontrollers.
- Data should be collected at 100 Hz frequency in control unit (T sampling = 10 ms).
There are a lot of IMU sensors which can be used, from Adafruit, SparkFun, or Silicon Labs. Currently I have two candidates, the Thunderboard Sense 2 and the SparkFun Razor IMU, both of which can be used as a sensor and a microcontroller at the same time since they have an ARM processor and can be programmed.
Can anyone suggest a suitable way to connect and interface with these sensors?
Can anyone suggest a cyber-physical system architecture in which we can connect these sensors and gather data with interrupts while respecting the above specs?
Thank You.
Relevant answer
Answer
Ali Srour I guess it must be OK for a moderate clock. E.g., if you are using 5 MHz, the wavelength of the signal in the cable will be 30 meters (at half the speed of light in the cable), and considering the back-and-forth delay, the clock domain has a radius of 15 meters. One can use, e.g., 1 or 2 MHz for peace of mind, which is still fine for 100 Hz (one can transfer almost 1e6/100 = 1e4 bits per sample period at 1 MHz). Alternatively, you can poll the sensors sequentially, but that complicates things, e.g., by adding a time shift between them. For sequential polling, however, I2C would be easier at that (still rather low) sample rate (there are nice I2C isolators too, considering EMI hardening).
  • asked a question related to Sampling
Question
7 answers
There are two arguments I have found about sample size:
statisticians say it should be 30 to get accurate statistical results, whilst
software engineers say 5 users can find the majority of the faults in software.
My research experience shows 11 users can find more than 80% of the problems;
see at :
Kindly tell me: how many experts should be used to verify a framework developed in a multidisciplinary research area?
The research to be verified is shown at
Relevant answer
Answer
I mean, what kind of sample? Every science has its own kinds of data and analysis methods, so the representative sample differs according to the phenomena addressed.
  • asked a question related to Sampling
Question
4 answers
I have a set of data collected as part of a hydroacoustic survey: essentially a boat drove back and forth over a harbour and took a snapshot of the fish biomass/density underneath the boat every 5 minutes using a sonar-like device. I was worried that all of these snapshots could be considered pseudoreplicates, in that they wouldn't be independent of each other; i.e., fish sampled at time X could be resampled at time X+1 if they happened to move with the boat. To correct for this I performed a test of spatial independence using a Moran's I test, which came back as non-significant. I also compared the delta AICs of models that included a spatial correction against the basic model with no spatial correction, and the basic model had a lower score. Does this mean that I can consider my samples collected via the hydroacoustic survey as independent from one another and proceed with non-spatially-corrected analyses?
Relevant answer
Answer
I am a PhD student in geography, specifically using spatial analysis in environmental geochemistry. I would consider both the size of the data set you collected and the research design of the fish survey (as you said, you drove back and forth). Spatial autocorrelation is an index affected by both of these, so if your results are not significant, maybe the sampling design needs to be improved, or maybe what you have should be called spatial dependence rather than autocorrelation. Here is a paper I read for you on correlated spatial autocorrelation in ecology: . Another paper, written by my supervisor, on the effect of sample size on statistical significance: . And you can find how we interpret a significant z-score of standardized residuals in a case where we do not want one (for residuals there should be no spatial autocorrelation): . Hope this helps!
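For anyone wanting to reproduce the Moran's I check, here is a sketch with the PySAL stack (libpysal and esda assumed); the coordinates and biomass values are random placeholders:

import numpy as np
from libpysal.weights import KNN
from esda.moran import Moran

rng = np.random.default_rng(1)
coords = rng.uniform(0, 1000, size=(200, 2))   # snapshot positions (m)
biomass = rng.lognormal(size=200)              # placeholder densities

w = KNN.from_array(coords, k=8)                # spatial weights, 8 neighbours
w.transform = "r"                              # row-standardise the weights
mi = Moran(biomass, w)
print(mi.I, mi.p_sim)   # non-significant p_sim -> little evidence of dependence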
  • asked a question related to Sampling
Question
2 answers
When creating & optimizing mathematical models with multivariate sensor data (i.e. 'X' matrices) to predict properties of interest (i.e. dependent variable or 'Y'), many strategies are recursively employed to reach "suitably relevant" model performance which include ::
>> preprocessing (e.g. scaling, derivatives...)
>> variable selection (e.g. penalties, optimization, distance metrics) with respect to RMSE or objective criteria
>> calibrant sampling (e.g. confidence intervals, clustering, latent space projection, optimization..)
Typically & contextually, for calibrant sampling, a top-down approach is utilized, i.e., from a set of 'N' calibrants, subsets of calibrants may be added or removed depending on the "requirement" or model performance. The assumption here is that a large number of datapoints or calibrants are available to choose from (collected a priori).
Philosophically & technically, how does the bottom-up pathfinding approach for calibrant sampling or "searching for ideal calibrants" in a design space, manifest itself? This is particularly relevant in chemical & biological domains, where experimental sampling is constrained.
E.g., Given smaller set of calibrants, how does one robustly approach the addition of new calibrants in silico to the calibrant-space to make more "suitable" models? (simulated datapoints can then be collected experimentally for addition to calibrant-space post modelling for next iteration of modelling).
:: Flow example ::
N calibrants -> build & compare models -> model iteration 1 -> addition of new calibrants (N+1) -> build & compare models -> model iteration 2 -> so on.... ->acceptable performance ~ acceptable experimental datapoints collectable -> acceptable model performance
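One concrete instance of this bottom-up path is incremental max-min (Kennard-Stone-style) selection: repeatedly propose the in-silico candidate farthest from the current calibrant set, collect it experimentally, refit, and repeat. A sketch in Python, with a random candidate pool standing in for a real design grid:

import numpy as np

def next_calibrant(calibrants, candidates):
    """Return the index of the candidate with the largest minimum
    distance to the existing calibrant set (max-min criterion)."""
    d = np.linalg.norm(candidates[:, None, :] - calibrants[None, :, :], axis=-1)
    return int(np.argmax(d.min(axis=1)))

rng = np.random.default_rng(0)
X_cal = rng.uniform(size=(5, 3))         # small initial calibrant set
pool = rng.uniform(size=(500, 3))        # in-silico candidate design points

for _ in range(3):                       # N -> N+1 -> N+2 -> ...
    i = next_calibrant(X_cal, pool)
    X_cal = np.vstack([X_cal, pool[i]])  # "collect" the proposed point
    pool = np.delete(pool, i, axis=0)    # then rebuild and compare models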
  • asked a question related to Sampling
Question
4 answers
I am trying to perform the cell-weighting procedure on SPSS, but I am not familiar with how this is done. I understand cell-weighting in theory but I need to apply it through SPSS. Assume that I have the actual population distributions.
Relevant answer
Answer
I might be misunderstanding your question, or his answer, but in my reading of what you are trying to do, I think the approach suggested by David Morse is missing a final step.
I'll assume, as David did, that the population, with N=3200, consists of 500 cases (or 15.625%) in subgroup A, 700 (21.875%) in B, and 2000 (62.5%) in C. I'll assume, further, that you have a sample, with n=80, that includes 10 cases (or 12.5%) in A, 20 (25%) in B, and 50 (62.5%) in C. If so, then the cell weights for your sample should be (15.625/12.5 = 1.25) for cell A, (21.875/25 = 0.875) for cell B, and (62.5/62.5 = 1.0) for cell C. That will keep your weighted sample size at 80, the same as your unweighted sample size, but will make the proportion of cases in A, B, and C in your weighted sample equal to the population proportions.
Forming the weights as ratios of the % in the population divided by the % in the sample will inflate the under-represented cells and deflate the over-represented cells in your sample by exactly the right amount.
If, instead, you also want to make the total number of cases in your sample equal to the total population size, then each of the three initial weights (1.25, 0.875, and 1.0) should be multiplied by (3200/80 = 40), yielding three new weights (50, 35, and 40).
Multiplying by the ratio of population size divided by sample size inflates all of your initially weighted sample counts by exactly the right amount to equal the population count.
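The same arithmetic written out in code (a minimal sketch; any language works, Python shown), which can be checked before applying WEIGHT BY in SPSS:

pop    = {"A": 500, "B": 700, "C": 2000}   # population counts, N = 3200
sample = {"A": 10,  "B": 20,  "C": 50}     # sample counts, n = 80

N, n = sum(pop.values()), sum(sample.values())
# Cell weight = population share divided by sample share.
weights = {g: (pop[g] / N) / (sample[g] / n) for g in pop}
print(weights)   # {'A': 1.25, 'B': 0.875, 'C': 1.0}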
  • asked a question related to Sampling
Question
5 answers
Hi,
I want to start testing pitfall traps to obtain ant samples, but I need to conduct molecular analysis on those insects. So what kind of fluid can I use? Ethanol evaporates too quickly, and I need to leave the trap on the ground for a day, or at least 10-12 hours. I looked for literature on the topic, but with scarce results.
Thank you!
Relevant answer
I use 96% ethanol as a fixative; DNA can be isolated even after long-term storage.
  • asked a question related to Sampling
Question
4 answers
Hello,
I'm currently working with a system consisting of an accelerometer that samples in bursts of 10 seconds at a sample frequency of 3.9 Hz, then goes into deep sleep for an extended (and yet undetermined) period, wakes up, samples for another 10 seconds, and so on.
I've recently taken over this project from someone else, and I can see that this person implemented a Kalman filter to smooth the noise from the accelerometer. I don't know much about Kalman filters, but it seems to me that the long deep-sleep period would make the previous states too outdated to be useful in predicting the new ones.
So my question is: can the previous states become outdated?
Relevant answer
Answer
The application of a Kalman Filter requires four models:
1. The measurement model, expressed by the Jacobian of the measurement variables with respect to the state variables (H).
2. The measurement error model (R).
3. The state model, which allows the state to be extrapolated into the future (F).
4. The state error model, which models the unpredictability of the state extrapolation, i.e., the degree to which the state deviates from the prediction for whatever reason (Q).
For the exact mathematical meaning of H,R,F,Q refer to Wikipedia: Kalman Filter.
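To see why a long deep-sleep gap need not break the filter: the predict step inflates the state covariance by Q at every missed step, so after a long gap the prior is nearly uninformative and the first new measurements dominate. A one-dimensional sketch in Python with made-up values:

F, H = 1.0, 1.0          # random-walk state model, direct measurement
Q, R = 0.01, 0.25        # process and measurement noise (per step)

x, P = 0.0, 1.0          # state estimate and its variance
gap_steps = 6000         # a one-minute sleep at the 10 ms sampling period

# Predict across the gap: variance grows linearly for a random walk.
P = P + gap_steps * Q    # P is now large -> the prior is nearly uninformative

# First measurement after wake-up: the Kalman gain is ~1, so the
# estimate essentially resets to the new data. "Outdated" states are
# harmless as long as Q models the sleep period honestly.
z = 0.8
K = P / (P + R)
x = x + K * (z - x)
P = (1 - K) * P
print(x, P, K)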
  • asked a question related to Sampling
Question
6 answers
I want to determine the percentage of ductile and brittle fracture for some samples from an impact test.
Relevant answer
Answer
In the SEM it is possible to distinguish the ductile zone and calculate the percentage.
  • asked a question related to Sampling
Question
3 answers
Hi all,
In my lab we are designing acute osmotic and salt treatments on plants of an endemic tomato variety to analyze the relative transcript levels of different genes by qRT-PCR at different times. One discussion we are having is how to perform the sampling. On one hand, some believe it is best to pool samples and then perform the RNA extraction (3 plants per pool, 2 pools); on the other hand, some believe in performing the RNA extraction and qRT-PCR experiments on each individual without pooling samples.
What do you recommend is the best approach?
Thanks!
Relevant answer
Answer
I believe that qRT-PCR of the genes of each individual wouldn't give you many answers. As already stated it will bring a lot of variation in the experiment. It would be more sound to do some pooling and make sure you've got at least 3 biological replicates for each of your treatments. For example you could pool x individuals together into one pool and do that for at least 3 pools. I usually do qRT-PCR on individual plants if THAT plant has a special phenotype that others might not have but I'm very careful in interpreting these results.
  • asked a question related to Sampling
Question
12 answers
When there are a large number of documents on the same topic, I can't get hold of them all, nor can I analyze them all. I'm wondering how I can decide which documents to use in an analysis. For example, out of a hundred commentaries on the same issue, how many, and which ones, should be selected? I did some searching about this, but I still feel the need for some more advice. Thank you very much.
Relevant answer
Answer
Thank you so much for the shared information. My direct question would be: how does one choose a sampling method over the relevant categories of documents, and how does one determine the sample size?
  • asked a question related to Sampling
Question
3 answers
We are trying to design a clinical trial on type 2 diabetes patients. The main data that we want to assess include FBS, 2hpp, HbA1c, insulin, and HOMA-IR. Also, we will assess the lipid profile and stress oxidative indices (MDA and TAC). The problem is that we could not find any similar study to determine the sample size. In this situation is it possible to use the Cohen formula? If not what is the right way for determining the sample size?
Relevant answer
You can use a G*Power calculation to determine your sample size; it is a common approach, particularly when setting up a clinical trial.
Please check out this publication for more details:
Use of G*Power Software | SpringerLink by:
Verma J.P., Verma P. (2020) Use of G*Power Software. In: Determining Sample Size and Power in Research Studies. Springer, Singapore. https://doi.org/10.1007/978-981-15-5204-5_5
Hope it would be helpful and all best luck!
  • asked a question related to Sampling
Question
12 answers
Is it possible to compare the theoretical maximum adsorption capacity (qm of Langmuir) of my sample with other materials when the R2 of the Langmuir fit is about 0.82 and the R2 of the Freundlich model is about 0.98?
Relevant answer
Answer
How to apply the Langmuir adsorption isotherm: a complete guide.
  • asked a question related to Sampling
Question
5 answers
I wish to assess the level of stress among a specific group of nurses redeployed to other hospital settings (i.e, research nurses) during the COVID pandemic in my research proposal.
May I ask for your thoughts as to which sampling method is best and may I ask why?
Relevant answer
Answer
Hello Daryl,
It depends on your research question, your population, and your measure(s). If you're talking about a modest number (say, 100) of nurses, it might be best to try to obtain a census sample (all of the cases). If the population is much larger, then the ideal approach would likely be a simple random sample or a stratified random sample. Of the two, I suspect stratification (for example, based on the type of reassignment venue) would likely be far better. Any probability sampling method pretty much requires that you have a way of identifying/locating the members of the target population so as to be able to: (a) make tentative selections; and (b) make the solicitation to participate in your study.
On the other hand, if your research aim is more along the lines of a qualitative inquiry, then perhaps a more purposive approach (and not a probability sample) would be better for you, as suggested by Dean Whitehead .
Good luck with your work.
  • asked a question related to Sampling
Question
5 answers
I am conducting a study to assess the quality of selected parts of some herbal materials and also develop acceptance criteria for their quality attributes. I am supposed to sample these materials from across the length and breadth of the country and I am hoping to stratify the country into strata and further divide each stratum into clusters, and then randomly sample the materials from each of the clusters picked up through systematic sampling.
My challenge is with the calculation of a 'realistic' sample size that can then be used to determine the number of clusters and the number of samples from each cluster. Very often what I see in the literature tends to be convenience sampling, which may not be representative of the population. The focus of my study, however, requires that my sampling be representative of the population in the country (and also realistic), especially because of the part that has to do with setting acceptance criteria.
I would be very grateful for your technical assistance. Thank you.
Relevant answer
Answer
Hi Emmanuel,
A whole country is too large for this type of survey to cover directly. I suggest you use multi-stage sampling, also known as cluster sampling.
You will need to stratify the country into separate homogeneous units (e.g. based on ecology, ethnicity, administrative units, etc.). Within each cluster, again find a homogeneous distinguishing feature. Go on identifying subunits down to the smallest possible one, and then randomly select units within that last unit. A very useful text on this sampling approach is De Vaus (2014): De Vaus, D. 2014. Surveys in Social Research. 6th edition. Routledge.
What I have not yet figured out is, if I have e.g. 10 units at the second stage (cluster), what proportion of these 10 I should take.
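To make the mechanics concrete, here is a minimal sketch of two-stage selection in Python. All of the numbers (10 districts, 20 sites each, 4 clusters drawn, 30% within-cluster fraction) are assumptions for illustration; in practice the allocation is chosen to balance cost against the design effect.

```python
# Two-stage (multi-stage) sampling sketch: randomly select clusters first,
# then randomly select units within each chosen cluster. Numbers are assumed.
import random

random.seed(1)
clusters = {f"district_{i}": [f"site_{i}_{j}" for j in range(20)]
            for i in range(10)}

n_clusters = 4           # first stage: clusters to draw (assumed)
within_fraction = 0.30   # second stage: fraction per chosen cluster (assumed)

chosen = random.sample(list(clusters), n_clusters)
sample = {
    c: random.sample(clusters[c], max(1, int(within_fraction * len(clusters[c]))))
    for c in chosen
}
for cluster, units in sample.items():
    print(cluster, units)
```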
  • asked a question related to Sampling
Question
4 answers
Is there a Python project where a commercial FEA (finite element analysis) package is used to generate input data for a freely available optimizer, such as scipy.optimize, pymoo, pyopt, pyoptsparse?
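I am not aware of a canonical open project to point to, but the usual pattern is a thin wrapper that runs the FEA solver (through its scripting API or a subprocess call), parses a scalar result, and hands it to the optimizer as the objective. A hypothetical sketch with scipy.optimize follows; run_fea() is a stand-in for a real solver invocation, and the "physics" inside it is a placeholder.

```python
# Hypothetical coupling of a commercial FEA run to scipy.optimize.minimize.
# run_fea() stands in for a real solver call (scripting API, subprocess, etc.).
import numpy as np
from scipy.optimize import minimize

def run_fea(thickness: float, width: float) -> float:
    """Stand-in for the FEA call; returns max displacement (placeholder model)."""
    return 1.0 / (thickness ** 3 * width + 1e-9)

def objective(x: np.ndarray) -> float:
    thickness, width = x
    mass = thickness * width                       # placeholder mass model
    displacement = run_fea(thickness, width)
    penalty = max(0.0, displacement - 0.01) * 1e6  # displacement limit as penalty
    return mass + penalty

result = minimize(objective, x0=np.array([1.0, 1.0]),
                  bounds=[(0.1, 5.0), (0.1, 5.0)], method="L-BFGS-B")
print(result.x, result.fun)
```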
  • asked a question related to Sampling
Question
8 answers
Dear researchers greetings,
I'm working on egg quality for my Ph.D. thesis and I want to know the protocol for egg sampling from a production unit.
The main questions I have are the following:
- What is the number of eggs to be collected with respect to the production unit's capacity?
- How are the eggs collected with respect to their position in the batch?
- How are the eggs preserved prior to the tests in the laboratory?
Warmest regards.
Relevant answer
Answer
You have taken on quite a job!
First: Mycotoxins will enter the laying hens exclusively via the feed. Any levels found in eggs are, due to the physiology of egg formation, a reflection of the intake during the last 14 days (or even more).
Second: the large poultry farms certainly get their (new) feed from commercial feed mills every week or 10 days (or even more often). Given the size of your country these mills will be located relatively nearby.
Third: You need a quite sensitive and specific analytical method to assess the amounts of mycotoxins and possibly some of their (toxic) metabolites, as levels in products of animal origin are usually quite low. Remember that plants and plant material are the primary source of mycotoxins and the passage of the mycotoxins through the animal will considerably reduce these levels by metabolism.
Fourth: Any mycotoxins in eggs will be rather evenly spread among the hens, so 10-30 eggs from one farm at one time seems adequate to me. These eggs can be kept for up to a week or so at room temperature. For longer storage, break the eggs and store the mixed whole egg at -20 °C.
Sampling of feed mills might be a first step to learn which mycotoxins are most likely to be found.
In the end, the contribution of animal food products to the mycotoxin burden of the population might be small compared to the one from plant-derived food products.
Good Luck
  • asked a question related to Sampling
Question
4 answers
I have an energy spectrum acquired from experimental data. After normalization, it can be used as a probability density function (PDF). I can construct a cumulative distribution function (CDF) on a given interval using its definition as the integral of the PDF. This integral simplifies to a sum because the PDF is given in discrete form. I want to generate random numbers from this CDF.
I used inverse transform sampling, replacing the CDF integral with a sum, and from there followed the standard inverse-transform routine, solving it for the sum instead of an integral.
My sampling visually fits the experimental data, but I wonder whether this procedure is mathematically correct and how it could be proved.
Relevant answer
Answer
The idea is fine, but you need to show that your sums converge to the integral. Referring to a text on harmonic analysis or numerical analysis would probably be helpful.
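For reference, the discrete construction described in the question is standard: normalize the histogram to a probability mass function, take its cumulative sum as an empirical CDF, and invert that CDF with uniform variates. A minimal numpy sketch, with made-up bin values:

```python
# Inverse transform sampling from a discrete (histogram) PDF.
import numpy as np

rng = np.random.default_rng(0)
energies = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # bin centers (illustrative)
counts   = np.array([10, 40, 30, 15, 5])        # measured spectrum (illustrative)

pdf = counts / counts.sum()   # normalize to a probability mass function
cdf = np.cumsum(pdf)          # discrete CDF: the sum replaces the integral

u = rng.random(100_000)       # uniform(0, 1) variates
# Invert the CDF; the clip guards against floating-point round-off at the top.
idx = np.minimum(np.searchsorted(cdf, u, side="right"), len(cdf) - 1)
samples = energies[idx]

# Empirical frequencies should converge to pdf as the sample size grows.
print(np.bincount(idx, minlength=len(pdf)) / len(u))
```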
  • asked a question related to Sampling
Question
7 answers
After collecting dental unit water samples post-flushing, I found some microbes on Gram staining. They are long rods with breaks in between. Please suggest what they could be.
Relevant answer
Answer
I am not sure if you have identified the organisms since this was posted, but be aware that among the problematic organisms in water systems are a variety of species in the genus Mycobacterium. These grow slowly, or are unculturable, and have been associated with a number of infections.
  • asked a question related to Sampling
Question
13 answers
Dear peers,
It would be much appreciated if you could suggest papers or reports that emphasize the sampling considerations for microplastics in soil/terrestrial/agricultural environments.
Thanks!
Relevant answer
Answer
Hey Gabriel, others came up with very nice papers. You may also like to take a look at the last EGU presentations. (https://meetingorganizer.copernicus.org/EGU2020/session/35952).
You may find some interesting people/contacts there.
Cheers,
  • asked a question related to Sampling
Question
2 answers
I have mortality data for Trout and Daphnia tested in the same sample of water, repeated for water samples taken over many days. I end up with a data table like this:
Sample  Tsamplesize  #TDead  PropTdead  Dsamplesize  #Ddead  PropDdead
1       10           1       0.1        30           3       0.1
2       10           2       0.2        30           5       0.167
3       10           3       0.3        10           2       0.2
etc.
The Daphnia sample size is either 10 or 30, but the Trout sample size is always 10.
I want to test whether the paired Trout and Daphnia results are statistically similar and correlated. What is the appropriate test for the paired proportions in this case? I'm sure this problem is common in case-control studies and interlaboratory testing, but I can't seem to find the appropriate test details. I thought of using a paired t-test with an arcsine transform. Any suggestions or references would be appreciated. I've attached a data file.
Relevant answer
Answer
I think I finally found the answer I was looking for. Their explanation and method are simple and make sense.
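For readers with the same problem, here is a minimal sketch of the arcsine-square-root approach the questioner mentions, using only the three example rows from the table above. Note that this treats each pair symmetrically; because the Daphnia sample sizes vary (10 vs. 30), a weighted analysis or a binomial GLMM may be preferable.

```python
# Paired comparison of Trout vs. Daphnia mortality proportions using the
# variance-stabilizing arcsine square-root transform, plus their correlation.
import numpy as np
from scipy import stats

prop_trout   = np.array([0.1, 0.2, 0.3])    # from the question's table
prop_daphnia = np.array([0.1, 0.167, 0.2])  # from the question's table

t_trout   = np.arcsin(np.sqrt(prop_trout))
t_daphnia = np.arcsin(np.sqrt(prop_daphnia))

t_stat, p_val = stats.ttest_rel(t_trout, t_daphnia)  # paired difference
r, p_r = stats.pearsonr(t_trout, t_daphnia)          # correlation
print(f"paired t = {t_stat:.2f} (p = {p_val:.3f}); r = {r:.2f} (p = {p_r:.3f})")
```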
  • asked a question related to Sampling
Question
6 answers
Let's say that we are doing an online survey among a group of people with the same profession (a cross-sectional study). The population size was identified and the sample size calculated. Since the sample size was small (n = 588), a census (universal sampling) was planned. Along the way, it emerged that the population size had been underestimated: the real population size is N = 1070, and the recalculated sample size is n = 780. Therefore, sampling needs to be done. Because of time constraints, my question is: can we do sampling and randomization after the data have been collected? And if so, is there a research article that has done this before? One way to avoid bias is that the data carry no identifiers except the name of the workplace. Can that be done?
Relevant answer
Answer
Doing randomization after the results are known is akin to putting the cart before the horse!
Randomization of the subjects in a randomized study has to be done before a treatment is allocated to two or more groups, whether the trial is placebo-controlled or compares different treatment groups. The purpose is to remove bias from the results. Randomizing after getting the results would yield a flawed, biased study.
For further information please have a look at these articles:
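Separately from the randomization point above: if what is actually meant is drawing a random subsample of already-collected survey responses, that step is mechanically simple, although whether it is methodologically defensible must be argued and reported transparently. A minimal sketch with pandas; the data frame and subsample size are hypothetical.

```python
# Drawing a simple random subsample of already-collected survey records.
# This is subsampling, not randomization of treatment allocation.
import pandas as pd

responses = pd.DataFrame({"workplace": ["A", "B", "C"] * 200})  # hypothetical
subsample = responses.sample(n=100, random_state=2024)          # n assumed
print(len(subsample), subsample["workplace"].value_counts().to_dict())
```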
  • asked a question related to Sampling
Question
5 answers
I would like to do research in which the target population would be parents, and I want to conduct this research in the OPD clinics of different private practitioners. For the sample size calculation, do I need to calculate the sample size from the population of parents or from the population of clinics?
Suppose the population of parents in Karachi is 1 million; keeping a confidence interval of 95%, a margin of error of 5%, and an outcome factor of 50%, the sample size would be 384.
We don't have an exact figure for healthcare clinics run by private practitioners in Karachi. I have searched some sources and, combining them, found 306 clinics and hospitals in Karachi. If we take this as the population and keep a confidence interval of 95%, a margin of error of 5%, and an outcome factor of 50%, the sample size would be 169.
If I choose case 2, how many parents do I have to choose from each clinic? My university has asked me to use a cluster sampling or systematic sampling technique rather than non-probability sampling.
So please suggest which option is more suitable in this case, or how many clinics and how many participants per clinic I should recruit, so that the sample represents the population.
Thank you in advance
 
Relevant answer
Answer
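A minimal sketch of how the cluster option could be made concrete: start from the simple-random-sample size for parents (384 in the question), inflate it by a design effect to account for clustering, and then spread it over clinics. The cluster size m and the intraclass correlation (ICC) below are assumed placeholders, not values from the question.

```python
# Design-effect adjustment for two-stage (cluster) sampling of parents
# within clinics. ICC and cluster size are assumed for illustration.
import math

n_srs = 384   # SRS sample size (95% confidence, 5% margin of error, p = 0.5)
m = 10        # parents recruited per clinic (assumed)
icc = 0.02    # intraclass correlation of the outcome within clinics (assumed)

deff = 1 + (m - 1) * icc            # design effect for equal cluster sizes
n_total = math.ceil(n_srs * deff)   # inflated total number of parents
n_clinics = math.ceil(n_total / m)  # clinics needed at m parents each
print(f"DEFF = {deff:.2f}; parents = {n_total}; clinics = {n_clinics}")
```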
  • asked a question related to Sampling
Question
5 answers
Currently, I am planning to use a survey method in one of my research projects related to business units. The Orbis database (Company information across the globe | BvD) or something similar would be useful for drawing a sample according to certain criteria and obtaining contacts. My organization does not provide access to the Orbis database. Perhaps someone has access to this database and could provide me with data from it, or could recommend free alternatives?
Thank you in advance.
Relevant answer
Answer
The alternatives are:
However, those databases require you to have an account. Do you require data on private companies?
Some countries allow you to buy the audited reports of companies listed on the stock market.
  • asked a question related to Sampling
Question
20 answers
I will be collecting carbonatite samples for LA-ICP-MS. They will be ground in order to handpick zircon crystals for U-Pb geochronology. I want to get 100 zircon grains. What sample weight should I take?
Relevant answer
Answer
Lots of different approaches, but at the end of the day, you can't go wrong with about 5 kg. Don't separate all of it, start with just 500 g. Don't do anything special, just crush and gold pan it, and if the sample is rich, the zircon will be there. A UV lamp can be used to see it in a dark room, or just go ahead and pick it under the stereomicroscope. If it is detrital, it won't get locked up in other grains. Baddeleyite is different, and the only good way is to extract using a Wilfley Table.
  • asked a question related to Sampling
Question
5 answers
R programming language
I am wondering whether it is appropriate to use two different randomly chosen samples from one huge database to run two logistic regressions separately on the same subject. The main reason is the low computing power of my machine and the impossibility of running my own multimatching function, which binarizes the whole data set into 0 and 1 (follow / not follow).
The database consists of 1,500,000 observations and 54 variables (a data.frame). The DV reflects the act of following one of two presidential candidates (1 and 0), and the IVs reflect the act of following particular media outlets on Twitter (also 1 and 0). The aim is to present the association between media and the political agenda, and the predictive power of particular media.
Unfortunately, I am forced to sample the data because of the computing time. Hence, I am going to draw two random samples (2 x 100k records), run the regression on the first, and then confirm it using the second. Is this consistent with methodological/statistical practice? Thank you in advance.
Relevant answer
Yes, you can report experiment 1 and then experiment 2. You can write a general methods section covering both experiments, or separate methods for experiments 1 and 2. The same goes for the discussion.
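The asker works in R, but the split-sample validation pattern is easy to illustrate in Python with scikit-learn; the synthetic 0/1 matrix below stands in for the real Twitter follow data, and the planted coefficients are arbitrary.

```python
# Split-sample validation: fit a logistic regression on one random subsample
# and check that it holds up on a second, disjoint subsample.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n, p = 20_000, 54
X = rng.integers(0, 2, size=(n, p)).astype(float)           # IVs: media follows
logit = 1.5 * X[:, 0] - 1.0 * X[:, 1] - 0.5                 # planted structure
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)  # DV: candidate

idx = rng.permutation(n)
a, b = idx[:n // 2], idx[n // 2:]                           # two disjoint samples

model = LogisticRegression(max_iter=1000).fit(X[a], y[a])
print("AUC, fitting sample:   ", roc_auc_score(y[a], model.predict_proba(X[a])[:, 1]))
print("AUC, validation sample:", roc_auc_score(y[b], model.predict_proba(X[b])[:, 1]))
```

Close agreement between the two AUCs (and between coefficient estimates across the two fits) is the kind of confirmation the question describes.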
  • asked a question related to Sampling
Question
3 answers
Hello Esteemed Researchers
I have a question, and I was hoping the experts in the field could guide me. I have never worked with the shoot apical meristem (SAM) and am really curious to learn more about it in wheat. However, I do not know how to identify its location in a grown wheat plant. I tried searching for articles on the SAM of rice and other monocots, but the method of sampling and its location are not stated.
I would really appreciate it if someone could advise me on this as well as explain to me thoroughly on how to identify the SAM region and what would be the best procedures to sample it.
Thank you so much in advance. Any form of guidance will be fully appreciated.
With gratitude
Dee
Relevant answer
Answer
Dear Dee!
Typically, apical meristems are located at the top of the shoots (main and lateral). Such an arrangement of meristems is determined already in the initial phases of ontogenesis.
Perhaps the following articles on the SAM in wheat will help you:
Na-Sheng Lin and W. G. Langenberg. Distribution of Barley Stripe Mosaic Virus Protein in Infected Wheat Root and Shoot Tips
Ahmad, A., Zhong, H., Wang, W. et al. Shoot apical meristem: In vitro regeneration and morphogenesis in wheat (Triticum aestivum L.)
Wang PJ., Charles A. (1991) Micropropagation Through Meristem Culture. In: Bajaj Y.P.S. (eds) High-Tech and Micropropagation I. Biotechnology in Agriculture and Forestry, vol 17. Springer, Berlin, Heidelberg
  • asked a question related to Sampling
Question
13 answers
I am directing a study updating information about the trees in the public space of the Partido de Morón, Provincia de Buenos Aires, Argentina.
Between the years 2005 and 2008, a group of professors and students of the Floriculture chair ("Floricultura") did a census of the trees in the public space. For each of the trees (approximately 100,000 plants) they recorded the date of evaluation, the genus and species, the common name, and several quantitative and qualitative variables.
In 2013 and the first trimester of 2014, we took a random sample of 100 blocks in the same population. For each individual tree, we registered any change in the information between the census and the random sample. There are approximately 5,000 plants in the random sample.
We obtained a database containing, for each tree, its value for each variable (qualitative or quantitative) on the two dates.
The purpose of our study is to update the census information to March 2014.
We are using ratio, regression, and post-stratified estimators, but would welcome any suggestions for obtaining the most reliable estimator of each variable for this population. We want to take into consideration the time between the observations.
Thank you in advance for any help!
Relevant answer
Answer
The census can serve as the frame for a subsequent probability sampling process. For example, dividing the total area into segments: at the census, each segment had a population of trees (auxiliary variable x), and in the sampled segments there is a population of trees (variable of interest y) that may have changed with respect to the census. Any sampling method that provides unbiased estimators with estimable variance could be used in the study. One example I find interesting is the one I proposed earlier in this question.
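A minimal sketch of the ratio estimator the questioners mention, with the census counts as the auxiliary variable. The per-block counts below are illustrative; only the 100,000 census total comes from the question.

```python
# Ratio estimator: update a census-era total using a later random sample.
# x = census count per sampled block, y = count re-observed in 2013-14.
import numpy as np

x = np.array([48, 52, 61, 39, 55], dtype=float)  # census counts (illustrative)
y = np.array([45, 50, 60, 36, 51], dtype=float)  # resampled counts (illustrative)

X_total = 100_000            # approximate census total from the question
ratio = y.sum() / x.sum()    # estimated change ratio between the two dates
Y_hat = ratio * X_total      # ratio estimate of the updated total
print(f"ratio = {ratio:.3f}; updated total = {Y_hat:,.0f}")
```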
  • asked a question related to Sampling
Question
17 answers
First, why do it? Well, ambient MS sampling methods are by nature destructive, and rare and precious analyte objects can't be indiscriminately subjected to moisture, stripping, discoloration, or burning. DESI (and DART) operate continuously. If a target is at the center of a surface, one has to drag the desorbing flow across the surface to get to the target and have it all positioned optimally, disrupting more area than should be necessary.
One could just turn the DESI voltage or syringe pusher on and off until the sample is positioned, but it's my understanding that the flow needs to be stable to get good signal. Some "start-up" emitted solvent would expose the target before optimum conditions are reached. Diverting the flow back and forth from the emitter would presumably have the same effect.
Perhaps one could protect the sample surface before exposure by using a shutter, as with DART (Analytical Methods 10 (9), 1038-1045). A shutter vane is probably going to be 0.01"/0.25 mm thick. Of course, the shutter can't contact the DESI emitter at >1 kV and needs clearance to move over the sample surface, so one has to allow at least 0.75 mm between the emitter and sample. The greater that distance, the greater the sample area exposed. Also, what happens to the solvent that builds up on the shutter while closed? Tricky.
Instead of a swinging shutter, one could mask the entire sample surface save for the target area. Of course, more than one target would require laborious change in masks and apparatus repositioning.
One could abandon DESI entirely and use some liquid microjunction or nano-DESI sampling with 3D sample manipulation, but that's not the point of this thought experiment. Some day the Venter lab or someone is going to perfect protein sampling with DESI, and then I'll really want it to be discontinuous. I've been thinking about this off and on for years. How would you do it?
Relevant answer
Answer
We haven't tried it, but past thought experiments led us to believe that a physical barrier or baffle to divert the spray is the best approach: it prevents instability in the spray current, overcomes back pressure in the spray-solvent capillary, avoids large droplets dripping off the spray tip, etc.
  • asked a question related to Sampling
Question
5 answers
To perform data quality assessment in the pre-processing phase (in a big data context), should data profiling be performed before data sampling (on the whole data set), or is it OK to profile a subset of the data?
If we consider the second approach, how can sampling be done without having information about the data (without even some level of profiling)?
Relevant answer
Answer
Hadi Fadlallah, yes. That should decrease computational expense and allow investigation of a subset instead of the whole set. It is similar to a data-science workflow in which a small dataset is analyzed first and the methods are then applied to the big data set.
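The second part of the question, sampling without prior knowledge of the data, is exactly what single-pass methods such as reservoir sampling address: they draw a uniform random sample of fixed size k from a stream whose length and distribution are unknown upfront. A minimal sketch (Algorithm R):

```python
# Reservoir sampling (Algorithm R): a single pass draws a uniform random
# sample of k records without knowing the stream's size or distribution.
import random

def reservoir_sample(stream, k, seed=0):
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)            # fill the reservoir first
        else:
            j = rng.randint(0, i)             # item survives with prob. k/(i+1)
            if j < k:
                reservoir[j] = item
    return reservoir

print(reservoir_sample(range(1_000_000), k=5))
```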
  • asked a question related to Sampling
Question
2 answers
Anything in AMBER?
Relevant answer
Answer
Dear Suchetana,
Did you find a solution?
  • asked a question related to Sampling
Question
4 answers
If I use purposive sampling in my qualitative study, do I need to set the sample size? If yes, then how?
Relevant answer
Answer
Dear Md. Mokshedur Rahman
In fact, it is an interesting question. I agree with David Morse. Moreover, you can check the article published by Sim J. et al. at the following URL: https://www.researchgate.net/publication/324042278_Can_sample_size_in_qualitative_research_be_determined_a_priori
  • asked a question related to Sampling
Question
3 answers
I've taken 156 samples from a population of 2,500, with an 80% confidence level and a 5% margin of error. How do I calculate the sampling intensity in this case?
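On the usual definition, sampling intensity is simply the sampled fraction of the population; the confidence level and margin of error matter when choosing n, not when computing the intensity itself. With the figures given:

```python
# Sampling intensity as the sampled fraction of the population.
n, N = 156, 2500
print(f"Sampling intensity = {n / N:.2%}")  # 6.24%
```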
  • asked a question related to Sampling
Question
5 answers
Dear colleagues,
Would anyone be willing to start a collaboration by sampling freshwater atyid shrimps in Egypt, in particular in the Faiyum Oasis?
In an integrative taxonomic approach combining morphological and molecular data, this would help me to delineate species.
Relevant answer
Answer
Hi Valentin. I am interested. I also work with my team on similar issues. If you are still interested, my email is khaled.mohammed@icman.csic.es
  • asked a question related to Sampling
Question
6 answers
I selected five firms in an industrial sector (where the total number of firms in that sector was more than 1,000). These five firms comprised 2,090 relevant individuals whom I wanted to contact for participation in a survey study. A sample of 1,000 was drawn randomly from these 2,090 potential participants, and a survey was sent to them. I received 337 usable responses, which were then used for the analysis.
In your opinion, what is the best way to report the above sampling procedure in terms of target population, sampling frame, etc.? Any authoritative reference will be much appreciated.
Relevant answer
Answer
Hello again, Fawad,
Your selection of the five firms was purposive, and made with intent to fulfill whatever conditions/criteria you had set for your screening method. Just be clear as to what those conditions/criteria were and give an explanation of why you imposed them as the basis for firm selection. That lets readers judge for themselves how applicable the results might be to their target universe.
Good luck with finishing your work.
  • asked a question related to Sampling
Question
3 answers
I am looking to do a content analysis on how left and right wing UK newspapers presented the link between MMR and autism. However, the number of articles I get back when searching the terms 'autism' and 'MMR' on Nexus for each newspaper is huge. The number also differs for each newspaper.
How can I reduce the number of articles to a manageable size? Stratified sampling?
Relevant answer
Answer
There are a number of online programs that allow you to determine sample size for estimating the value of a parameter such as a percentage. Another alternative is to examine what is known as the power of your test such as a t-Test, using the software G*Power.
  • asked a question related to Sampling
Question
14 answers
Does a sampling technique known as Infinite Population Random Sampling exist? If it exists, could it be applied to internet user/social media studies, and how can it be employed?
Relevant answer
Answer
Please let me point out that I said "any" when I said "But if you want any kind of random sampling, that means controlling the probability of selection in each case." To go from considering a finite to an infinite population in simple random sampling, you drop the finite population correction (fpc) factor, which is the same thing you do when the population is so large relative to the sample that the fpc is negligible. Here, you would also have such a case. But if your population were truly infinite, a selection probability of 0.1, for example, would still mean an infinite-sized sample. What you need is a relative probability of selection. Sequential sampling is not exactly random sampling, but close enough.
So if you took every tenth case, that would be something like a probability of 0.1 of selection, but only for those cases considered. That is a problem with internet studies. You can say you looked at one in ten of "these," but what about all others?
So you are back to the problem I noted: "But if you want any kind of random sampling, that means controlling the probability of selection in each case." That is most likely not going to be possible for most internet studies.
To make the distribution of your sample look like the infinite (or finite) population distribution, you have to know what that population distribution looks like. But if it were truly infinite, that would be impossible to know. Even for an Internet study, which is not really infinite, how would you know? It might even be a multimodal distribution. If you only had data near one mode, and no other information, you would not know about any other mode in the distribution. (If you did, that would be a good reason for stratified random sampling, rather than simple random sampling.) This is a problem with a large or infinite population. You may not know about some parts of the distribution, while having a great deal of information on other parts. Big Data could fail that way.
I'd say that it is generally not possible to have a truly random design for an Internet study.
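For completeness, the "every tenth case" idea mentioned above is systematic sampling. Here is a minimal sketch over an explicit finite frame (the frame itself is hypothetical); as the answer notes, this covers only the cases you can enumerate, so it does not by itself solve the frame problem for internet populations.

```python
# Systematic sampling: a random start, then every k-th case from the frame.
import random

frame = list(range(1, 1001))   # hypothetical sampling frame of 1,000 cases
k = 10                         # sampling interval ("every tenth case")
start = random.Random(7).randrange(k)
sample = frame[start::k]       # 100 cases, each selected with prob. 1/k
print(len(sample), sample[:5])
```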
  • asked a question related to Sampling
Question
2 answers
Dear colleagues,
My question is regarding suggested methodologies for snow sampling in, for example, mountains or on peaks. Some ice sampling techniques for these environments would also be appreciated. Note that these samples are going to be processed to identify microplastics in snowy mountain ecosystems.
Thanks in advance,
Relevant answer
Answer
Hi Gabriel! Not sure what scale you're hoping to work on -- here are some links to various snow and ice core methods:
https://www.wcc.nrcs.usda.gov/factpub/ah169/SnowSurveySamplingGuideHandout.pdf