Sampling - Science method
Sampling is concerned with the selection of a subset of individuals from within a population to estimate characteristics of the whole population.
Questions related to Sampling
I came across a young Prosopis juliflora plant, which is highly invasive. What is the best method for collecting sand around the roots to study the potential

I have been studying a particular set of issues in methodology, and looking to see how various texts have addressed this. I have a number of sampling books, but only a few published since 2010, with the latest being Yves Tille, Sampling and Estimation from Finite Populations, 2020, Wiley.
In my early days of survey sampling, William Cochran's Sampling Techniques, 3rd ed, 1977, Wiley, was popular. I would like to know which books are most popularly used today to teach survey sampling (sampling from finite populations).
I posted almost exactly the same message as above to the American Statistical Association's ASA Connect and received a few recommendations, notably Sampling: Design and Analysis, Sharon Lohr, whose 3rd ed, 2022, is published by CRC Press. Also, of note was Sampling Theory and Practice, Wu and Thompson, 2020, Springer.
Any other recommendations would also be appreciated.
Thank you - Jim Knaub
I am conducting a study testing the effectiveness of a kind of group psychotherapy. There are 10 participants in my experimental group and 14 in my control group. At first, I planned random assignment to the groups, but because of the timing of the group therapy, 14 of the participants wanted to be in the waitlist control group. After creating the control group, I ran a t-test to compare the two groups on some study variables and found no significant difference between them. In summary, the groups have similar characteristics (e.g., age, educational level, romantic relationship status, mean scores on the study variables). However, the group sizes are different. Can I do my analysis with 10 people in the experimental group and 14 in the control group? If not, how should I remove 4 people from the control group?
I have 23,500 points. I sorted them in Excel from lowest to highest and plotted them in a scatter chart. Now I want to find the point after which the curve starts to rise very steeply (slope approaching 90 degrees), in other words, where the chart begins to grow much faster.
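A minimal Python sketch of one common heuristic for this (the point farthest from the straight line joining the first and last points of the sorted curve); the lognormal data below is only a stand-in for the 23,500 Excel values:

import numpy as np

def knee_index(y):
    """Index of the point farthest from the chord joining the first and last
    points of a sorted curve -- a simple 'elbow'/'knee' heuristic."""
    y = np.asarray(y, dtype=float)
    x = np.linspace(0.0, 1.0, len(y))
    yn = (y - y.min()) / (y.max() - y.min())   # normalize both axes to [0, 1]
    chord = np.array([1.0, yn[-1] - yn[0]])
    chord /= np.linalg.norm(chord)
    # perpendicular distance of every point from the chord
    dist = np.abs((x - x[0]) * chord[1] - (yn - yn[0]) * chord[0])
    return int(np.argmax(dist))

rng = np.random.default_rng(0)
y_sorted = np.sort(rng.lognormal(mean=0.0, sigma=1.0, size=23500))  # stand-in for the real values
i = knee_index(y_sorted)
print(i, y_sorted[i])   # index and value where the curve starts rising steeply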


Is it acceptable for the number of samples to be reduced, or to fall short of the calculated sample size, because many prospective respondents are not willing to join the study? What should I do?
Thank you
Where is the jade attracted by throwing out a brick and a paving stone?
A brand-new concept of preferable probability and its evaluation has been created. The book, entitled "Probability-Based Multi-Objective Optimization for Material Selection" and published by Springer, opens a new way for multi-objective orthogonal experimental design, uniform experimental design, response surface design, robust design, discretization treatment, sequential optimization, etc.
It aims to provide a rational approach without personal or other subjective coefficients. It is available at https://link.springer.com/book/9789811933509,
DOI: 10.1007/978-981-19-3351-6.
Best regards.
Yours
M. Zheng
According to my view, even the mere opening of a drawer or a cupboard is already damaging to cultural remains. The shock of opening a drawer, the changing environment between the inside and the outside of a drawer, as well as a sudden light that falls on an artifact that has been in the dark is damaging to any item, especially to parchment, papyrus and paper.
When it comes to 'taking a sample' needed to apply an analytical technique to an artifact, one can only speak of less or more destructive, because destructive it is.
For example, in Neutron Activation analysis and petrography, one needs either about 80 mg of pottery powder or a thin section before submitting the sample, as powder or as a pellet, to a nuclear reactor, or mounting it on a glass slide to be examined under a microscope.
I think that the formula "non-destructive sampling technique" was invented by scientists to obtain the samples they needed from a curator or conservator. I therefore suggest omitting the word "non-destructive" from the Cultural Heritage vocabulary.
More exactly, do you know of a case where there are repeated, continuous data, sample surveys, perhaps monthly, and an occasional census survey on the same data items, perhaps annually, likely used to produce Official Statistics? These would likely be establishment surveys, perhaps of volumes of products produced by those establishments.
I have applied a method which is useful under such circumstances, and I would like to know of other places where this method might also be applied. Thank you.
Like MPFP, importance sampling, etc.
As I am carrying out my research in the area of cloud forensics, I want to work through a sample live dataset or database to demonstrate my results. Can anyone suggest FTK tools that are freely available for carrying out this work?
Good afternoon,
I am carrying out monthly invertebrate sampling for future molecular (DNA) studies. I am euthanizing my arthropods with 70% ethanol right after capture and then storing them in a freezer. Would 96% or absolute ethanol be better for DNA preservation?
In the latest approaches to trying to infer from nonprobability samples, multiple covariates are encouraged. For example, see https://www.researchgate.net/publication/316867475_Inference_for_Nonprobability_Samples. However, in my experience, when a simple ratio model can be used with the only predictor being the same data item in a previous census (and results can be checked and monitored with repeated sample and census surveys), results can be very good. When more complex models are needed, I question how often this can be done suitably reliably. With regard to that, I made comments to the above paper. (That paper is available through Project Euclid, using the DOI found at the link above.)
Analogously, for heteroscedasticity in regression, the sigma for a y_i associated with a larger predicted y_i should be larger. However, when a more complex model is needed, this is less likely to be empirically apparent. For a one-predictor ratio model where the predictor is the same data item in a previous census, and you have repeated sample and census surveys for monitoring, this, I believe, is much more likely to be successful, and heteroscedasticity is more likely to be evident.
This is with regard to finite population survey statistics. However, in general, when multiple regression is necessary, it always involves complications such as collinearity and others. Of course, this has been developed over many years with much success, but the more variables required to obtain a good predicted-y "formula," the less "perfect" I would expect the modeling to be. (This is aside from the bias-variance tradeoff, which means an unneeded predictor tends to increase variance.)
[By the way, back in Cochran, W.G.(1953), Sampling Techniques, 1st ed, John Wiley & Sons, pages 205-206, he notes that a very good size measure for a data item is the same data item in a previous census.]
People who have had a lot of experience successfully using regression with a large number of predictors may find it strange to have this discussion, but I think it is worth mulling over.
So, "When more predictors are needed, how often can you model well?"
I have a system and I want to apply a force to drag a part of it in a certain direction. I am using GROMACS patched with PLUMED; I defined two COMs (one for the group to be dragged and the other for the target position), defined the distance between them, and applied a harmonic restraint on this distance.
My question is: how do I determine the value of the force constant KAPPA?
An example of my input is the following:
outerP: COM ATOMS=1-10
N: COM ATOMS=20-30
d1: DISTANCE ATOMS=outerP,N NOPBC COMPONENTS
restraint1: RESTRAINT ARG=d1.z KAPPA=??? AT=0.0
Thanks
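One common rule of thumb (an assumption on my part, not taken from the question or the PLUMED manual for this specific case): choose KAPPA from the fluctuation you are willing to allow around AT, since for a harmonic restraint U = 0.5*KAPPA*(d - AT)^2 equipartition gives <(d - AT)^2> = kB*T/KAPPA. A minimal sketch in PLUMED/GROMACS units (kJ/mol, nm), where T = 300 K and sigma = 0.05 nm are illustrative assumptions:

kB = 0.0083144621            # Boltzmann constant, kJ/mol/K
T = 300.0                    # simulation temperature in K (assumption)
sigma = 0.05                 # allowed spread of d1.z around AT, in nm (assumption)
kappa = kB * T / sigma**2    # stiffer restraint for tighter target spread
print(f"KAPPA ~ {kappa:.0f} kJ/mol/nm^2")   # ~1000 for these numbers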
If I use a sample size of 320 with a purposive sampling technique, how can I validate the sample size for generalizing the results? Could 320 responses be statistically sufficient to generalize the results?
Hello, so the population of my research is university students from a specific island (my country is an archipelago); thus, I know the total population size from the national government census reports. I did multistage random sampling from island->province->cities->universities. But after I selected the universities as the sampling units, I realized I didn't have a sampling frame for each university. So for this last step of selecting the respondents, is it acceptable to use non-probability sampling, such as purposive sampling, if I plan to do regression analysis? What should I do in this situation?
Side note: I've actually carried out the survey that way and found no problem with the results, i.e., the assumptions of linear regression are fulfilled, and the validity and reliability of the scales were also acceptable. But I'm not sure whether what I did was justifiable or not...
Is it possible to use Finite Population Correction (FPC) to decide the minimum required sample size when we use Respondent Driven Sampling (RDS) approach to recruit hidden populations? Kindly share any reading material on this? An introduction to RDS is attached for your information. Thanks in advance for kind support.
Hello I am trying to reconstruct the far field pattern of a patch antenna at 28 GHz (lambda = 10.71 mm ). I am using a planar scanner to sample the near field with a probe antenna. The distance between patch and probe is 5 cm. The resolution of the scan is 1.53 x 1.53 mm². The total scanned surface is 195x195 mm. The NF patterns are shown in the NF_raw file.
The complex near field is then transformed using a 2D IFFT to compute the modal spectrum of the plane waves constructing the scanned near field (see C. A. Balanis, equations 17-6a and 17-7b). The modal components are shown in the IFFT file. The problem is that I observe an oscillation in the phase of those modal components that reminds me of aliasing effects in digital images (Moiré patterns).
These effects also propagate when I resample the modal spectrum in spherical coordinates, as seen in the Sampling file. The transformed phase therefore changes too fast per radian. The absolute value of the pattern looks reasonable.
Could someone explain why these effects occur and what steps I can implement to prevent them? Thank you for any helpful input.
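One common cause worth ruling out (a guess, not a diagnosis of this particular setup): if the spatial origin of the scanned aperture sits at the centre of the array, applying the 2D IFFT without an ifftshift leaves a linear phase ramp of roughly pi per sample on the modal spectrum, which looks exactly like a fast checkerboard phase oscillation while the magnitude stays reasonable. A minimal numpy sketch with a synthetic stand-in field:

import numpy as np

dx = 1.53e-3                      # sample spacing from the question, in metres
n = 128
x = (np.arange(n) - n // 2) * dx  # aperture coordinates centred on the patch
X, Y = np.meshgrid(x, x)
E_near = np.exp(-(X**2 + Y**2) / (2 * (10e-3)**2)).astype(complex)  # stand-in near field

# Move the spatial origin to index 0 before the transform, then re-centre the spectrum.
A = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(E_near)))
kx = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(n, d=dx))
ky = kx.copy()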



My research aims to evaluate how incumbent companies can face newcomers effectively based on a case study of the Mobile Phone industry.
In this regard, I am collecting data through a survey targeting Smartphone users to better understand the strategies adopted by Mobile Phone companies.
I personally believe it is impossible (or at least very difficult) to use probability sampling, as the number of Smartphone users is very large (2.9 billion users in 2017). I would like to use non-probability sampling, but I am not sure whether it would be acceptable in a research paper. What do you think?
I am conducting single-case study research as part of my dissertation for a Master's degree. The topic is in the area of public procurement and innovation. The aim is to explore to what extent standards referenced in public procurement allow innovation in State-Owned Enterprises (in one country).
The research is designed as a single case study. As identified by Robert K. Yin in his book Case Study Research, one of the rationales for a single case study is the representative or typical case. As a result, I have arranged an interview with one procurement professional from the selected organization. However, my supervisor informed me that a single interview will not be sufficient to obtain unbiased and comprehensive data for analysis and discussion. Additionally, I was advised to conduct surveys if it is difficult to arrange interviews.
I do not understand why it is necessary to involve more than one participant in the research and to conduct more than one interview. Also, how are surveys going to help provide sufficient data, given that I am conducting qualitative research? As for data analysis, I am going to use thematic analysis, in which I will link what is said in the interviews with my findings from the literature.
I would appreciate it if you could advise me on what I should do.
Hi! As I'm just starting to teach and mentor students in their coursework, I often come across a particular issue related to sampling in qualitative research. Whenever students are assigned to do a preliminary qualitative study or devise a qualitative research strategy that involves getting information from other people, they often resort to using Facebook as a place to distribute their invitations to participate in research, post links to online surveys, etc. I do not find this particularly problematic, but I sometimes encounter MA thesis proposals that resort to this strategy even though the proposed research is not really presented as situated in the context of social media. I've also come across some studies that use a more structured approach, where social media is used as a platform to implement the snowball sampling principle.
My questions are:
1. Do you have any experience with that (in terms of students using social media in their sampling strategies)?
2. Have you used social media in your sampling strategies and what were your justifications to do so?
3. Should this approach be encouraged or discouraged if students are aware of the limitations of their sample creation strategies?
4. Does it matter whether the focus of such research has something to do with social media or not?
Dear all,
Could you recommend any review paper (or book) comparing various downsampling methods applicable to volumetric data (preferably light microscopy or cell tomography data)?
The DBS is planned for a community-based study.
I am trying to measure the power of my study, in which I measured the level of awareness about a disease (PCOS) among university students (level of awareness was measured with a score of 22 points and served as the dependent variable in the further analyses). I sampled about 1000 students and my target population is about 30,000 students. I do not know the target population awareness score, as no similar studies have been conducted in my country. How can I calculate the post-hoc power of my study?
How does the Bioanalyzer determine the RIN of an RNA sample?
Why does the RIN appear as "not applicable" in some cases? What could have happened?
What's the minimum sample of treated and non-treated observations for a study that uses a combination of Propensity Score Matching (PSM) and Difference-in-Differences (DID)?
I searched several articles online, and I could not find any "rule" or anything that states what the minimum sample size of treated and non-treated observations could be in a study that uses the PSM and DID approaches in combination.
Hi,
I am new to DL and I'm trying to classify one Landsat 8 image into 3 categories using VGG-19. I am using 8 bands (B2 to B7, B10, and panchromatic). I performed the sampling procedure and my samples are named "1_id_b2" (category_id sample_Landsat band). My training and test samples are in separate folders; the folder structure is similar to the attached image (folder_str). I've read that I need to create training and test labels, but I don't understand why, because I have already labeled my samples.
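On why labels are still needed: the network is trained against a numeric label array aligned with the input tensors, and at the moment the class information lives only in the file names, so it has to be extracted into such an array (or derived from the folder structure by the data loader). A minimal sketch that parses the first underscore-separated token of names like "1_id_b2"; the "train" folder name and ".tif" extension are assumptions for illustration only:

from pathlib import Path
import numpy as np

def label_from_name(filename):
    # "1_id_b2" -> the category is the first underscore-separated token,
    # following the naming scheme described above
    return int(Path(filename).stem.split("_")[0])

files = sorted(Path("train").glob("*.tif"))          # hypothetical folder and extension
labels = np.array([label_from_name(f.name) for f in files])
print(labels[:10])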

Dear Scientists,
I have a question: we want to use ANNs for regression analysis, which is a fairly straightforward application of ANNs, but the question is, how many samples do we need for training? Could 12 samples be enough? I produced these 12 samples by the Fractional Factorial Design (FFD) method and I need to be sure about this. Therefore, I would be grateful if you could provide me with any information on this subject.
Many thanks in advance for your time and kind consideration.
Regards
Mohsen
Reference for FFD method: https://en.wikipedia.org/wiki/Fractional_factorial_design
In the process of conducting a correlational study, I got stuck planning my sampling technique. The aim is to investigate the predictors of reading performance among EFL university students of low and high proficiency levels. The population of the study consists of EFL university students majoring in English. The sample needs to comprise two groups: low- and high-proficiency students. However, the students' language proficiency across academic levels is not defined, which means I need to administer a placement test to divide participants into two proficiency groups. I am thinking of including only the beginning and advanced academic levels and administering the test so that I take only low-proficiency students from the beginning levels and high-proficiency students from the advanced levels. The reason I need to administer the test is that students' proficiency varies considerably within levels, so their academic level is not the best indicator of their proficiency. My question is: what is the best sampling method for my study?
How would you defend that your quantitative research results represent the population even though you are using non-probability sampling (in which not everyone has the same chance of being selected into the sample)?
Please correct me if I'm wrong. Thank you!
I'm in the market to buy a DNA extraction robot, and would really appreciate any suggestions/experience/advice.
With the projects we recently landed, we are expecting to process on average about 3000-5000 noninvasive samples per year (scats, urine, saliva - all taken from the environment, not from the animal directly). DNA extraction is a total bottleneck in our lab; it is difficult to do quality control when hand-extracting (sample mix-ups, pipetting errors...) and it is too labor intensive (hence expensive) and slow.
I'm not too keen on magnetic-bead technology (I tested some machines and didn't like them) and I'd like something that could automate regular spin-column (silica membrane) extraction. The QIAcube from Qiagen seems an option, but it only does 12 samples at a time. I'm looking at a throughput of about 100 samples per day, and can spend about 30,000€ on this (well, 40,000€ tops). Contamination prevention is critical with noninvasive sampling applications.
I'd really appreciate any help with this.
Please enlighten me and attach references to past studies.
thanks
Maybe co-precipitation with BSA?
I have a high-aspect-ratio flexible aircraft wing of 2 meters along which I want to place 6 gyroscopes to measure its deflection for research purposes. I want to collect all the data from all the gyroscopes at 100 Hz (at the same time) to feed an estimator. This is not an easy task because the communication protocol needs to be fast, robust to the noise generated by a BLDC motor, usable over long distances, and cheap.
Please see the specs below:
- The longest distance between the control unit and any IMU will not exceed 2 meters.
- The data collected from all the IMUs should be acquired at approximately the same time.
- The communication protocol to be used should be highly robust to noise.
- The protocol to be used should work with readily available microcontrollers.
- Data should be collected at 100 Hz frequency in control unit (T sampling = 10 ms).
There are a lot of IMU sensors that could be used, from Adafruit, SparkFun, or Silicon Labs. Currently I have two candidates, the Thunderboard Sense 2 and the SparkFun Razor IMU, both of which can be used as a sensor and a microcontroller at the same time, since they have an ARM processor and can be programmed.
Can anyone suggest a suitable way to connect and interface with these sensors?
Can anyone suggest a cyber-physical system in which we can connect these sensors in a specific architecture and gather data with interrupts while respecting the above specs?
Thank You.
There are two arguments I have found regarding sample size:
statisticians say it should be 30 to get accurate statistical results, whilst
software engineers say 5 users can find the majority of the faults in the software.
My research experience shows that 11 users can find more than 80% of the problems;
see:
Conference Paper Selecting a Usability Evaluation User Group -A Case Study th...
Kindly tell me how many experts should be used to verify a framework developed in a multidisciplinary research area.
The research to be verified is shown at
I have a set of data collected as part of a hydroacoustic survey: essentially, a boat drove back and forth over a harbour and took a snapshot of the fish biomass/density underneath the boat every 5 minutes using a sonar-like device. I was worried that all of these snapshots could be considered pseudoreplicates, in that they would not be independent of each other, i.e., fish sampled at time X could be resampled at time X+1 if they happened to move with the boat. To correct for this, I performed a test of spatial independence using Moran's I, which came back non-significant. I also compared the delta AICs of a model that included a spatial correction and the basic model with no spatial correction, and the basic model had a lower score. Does this mean that I can consider the samples collected via the hydroacoustic survey independent of one another and proceed with non-spatially-corrected analyses?
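For reference, a minimal sketch of the Moran's I check described above, assuming the libpysal and esda packages and using stand-in coordinates and densities in place of the survey data:

import numpy as np
from libpysal.weights import KNN
from esda.moran import Moran

rng = np.random.default_rng(0)
coords = rng.uniform(0, 1000, size=(200, 2))      # snapshot positions (stand-in)
density = rng.gamma(2.0, 1.0, size=200)           # fish density per snapshot (stand-in)

w = KNN.from_array(coords, k=8)                   # spatial weights from nearest neighbours
w.transform = "r"                                 # row-standardize the weights
mi = Moran(density, w, permutations=999)
print(mi.I, mi.p_sim)                             # Moran's I and permutation p-value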
When creating & optimizing mathematical models with multivariate sensor data (i.e. 'X' matrices) to predict properties of interest (i.e. dependent variable or 'Y'), many strategies are recursively employed to reach "suitably relevant" model performance which include ::
>> preprocessing (e.g. scaling, derivatives...)
>> variable selection (e.g. penalties, optimization, distance metrics) with respect to RMSE or objective criteria
>> calibrant sampling (e.g. confidence intervals, clustering, latent space projection, optimization..)
Typically & contextually, for calibrant sampling, a top-down approach is utilized, i.e., from a set of 'N' calibrants, subsets of calibrants may be added or removed depending on the "requirement" or model performance. The assumption here is that a large number of datapoints or calibrants are available to choose from (collected a priori).
Philosophically & technically, how does the bottom-up pathfinding approach for calibrant sampling or "searching for ideal calibrants" in a design space, manifest itself? This is particularly relevant in chemical & biological domains, where experimental sampling is constrained.
E.g., Given smaller set of calibrants, how does one robustly approach the addition of new calibrants in silico to the calibrant-space to make more "suitable" models? (simulated datapoints can then be collected experimentally for addition to calibrant-space post modelling for next iteration of modelling).
:: Flow example ::
N calibrants -> build & compare models -> model iteration 1 -> addition of new calibrants (N+1) -> build & compare models -> model iteration 2 -> so on.... ->acceptable performance ~ acceptable experimental datapoints collectable -> acceptable model performance
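One possible concrete form of the bottom-up addition step in the flow above (a sketch under stated assumptions, not a recommendation): a Kennard-Stone-style max-min rule that, at each iteration, proposes the in-silico candidate farthest in X-space from the calibrants selected so far; the distance criterion can be swapped for a model-based uncertainty measure when a model from the previous iteration is available.

import numpy as np

rng = np.random.default_rng(1)
candidates = rng.uniform(size=(2000, 3))                             # in-silico design space (placeholder)
selected = list(rng.choice(len(candidates), size=5, replace=False))  # small initial calibrant set

def next_calibrant(X, selected):
    """Kennard-Stone-style pick: the candidate farthest (in X-space) from all
    calibrants chosen so far -- one simple bottom-up addition rule."""
    d = np.linalg.norm(X[:, None, :] - X[selected][None, :, :], axis=2)
    d_nearest = d.min(axis=1)          # distance of each candidate to its nearest calibrant
    d_nearest[selected] = -np.inf      # never re-pick an existing calibrant
    return int(np.argmax(d_nearest))

for _ in range(10):                    # add one calibrant per modelling iteration
    selected.append(next_calibrant(candidates, selected))
    # ... rebuild the model on 'selected', check RMSE / objective, stop when acceptable
print(selected)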
I am trying to perform a cell-weighting procedure in SPSS, but I am not familiar with how this is done. I understand cell weighting in theory, but I need to apply it through SPSS. Assume that I have the actual population distributions.
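A minimal sketch of the arithmetic behind cell weighting (cell weight = population share / sample share), shown in Python with a hypothetical sex-by-age example; the resulting weight variable is what one would compute in SPSS and then activate with WEIGHT BY:

import pandas as pd

# Toy respondent file; the weighting cells (sex x age group) are only an example
sample = pd.DataFrame({"sex": ["F", "F", "M", "M", "M"],
                       "age": ["18-34", "35+", "18-34", "18-34", "35+"]})

# Known population shares per cell (assumed available, as stated above)
pop_share = {("F", "18-34"): 0.30, ("F", "35+"): 0.25,
             ("M", "18-34"): 0.20, ("M", "35+"): 0.25}

sample_share = sample.groupby(["sex", "age"]).size() / len(sample)
cell_weight = {cell: pop_share[cell] / share for cell, share in sample_share.items()}
sample["weight"] = [cell_weight[(s, a)] for s, a in zip(sample["sex"], sample["age"])]
print(sample)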
Hi,
I want to start testing pitfall traps to obtain ant samples, but I need to conduct molecular analyses on those insects. So, what kind of fluid can I use? Ethanol evaporates too quickly, and I need to leave the trap in the ground for a day, or at least 10-12 hours. I did look for literature on the topic, but with scarce results.
Thank you!
Hello,
I'm currently working with a system consisting of an accelerometer, that samples in bursts of 10 seconds with a sample frequency of 3.9 Hz, before going into deep sleep for an extended (and yet undetermined) time period, then waking up again and sampling for 10 seconds and so on.
I've recently taken over this project from someone else and I can see that this person has implemented a Kalman filter to smooth the noise from the accelerometer. I don't know much about Kalman filters, but to me it seems that the long deep sleep period would make the previous states too outdated to be useful in predicting the new ones.
So my question is: can the previous states become outdated?
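A minimal sketch (not the project's actual filter) of one way this is commonly handled: scale the process noise by the elapsed time, so that after a long deep sleep the predicted variance is large, the gain approaches 1, and the filter effectively restarts from the first new measurement; the old state is then heavily discounted rather than harmful.

import numpy as np

def kf_step(x, P, z, dt, q, r):
    """One predict + update step for a scalar random-walk state.
    q: process-noise density, r: measurement variance (both assumptions)."""
    # Predict: the estimate is carried over, but its uncertainty grows with dt
    P = P + q * dt
    # Update: large P -> gain near 1 -> the new measurement dominates
    K = P / (P + r)
    x = x + K * (z - x)
    P = (1.0 - K) * P
    return x, P

x, P = 0.0, 1.0
for dt, z in [(0.256, 0.10), (0.256, 0.12), (3600.0, 0.90)]:   # 3.9 Hz samples, then a long sleep
    x, P = kf_step(x, P, z, dt, q=0.01, r=0.05)
    print(round(x, 3), round(P, 4))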
I want to determine the percentage of ductile and brittle fracture for some samples from an impact test.
Hi all,
In my lab we are designing some acute osmotic and salt treatments in plants of an endemic tomato variety to analyze the relative transcript levels of different genes by qRT-PCR at different time points. One of the discussions we are having is how to perform the sampling. On the one hand, some believe the best approach is to pool samples and then perform the RNA extraction (3 plants per pool, 2 pools); on the other hand, some believe in performing the RNA extraction and qRT-PCR experiments on each individual without pooling samples.
Which do you recommend as the best approach?
Thanks!
When there are a large number of documents on the same topic, I can't get hold of them all, nor can I analyze them all. I'm wondering how I can decide which documents I should use in an analysis. For example, out of a hundred commentaries on the same issue, how many commentaries, and which ones, should be selected? I did some searching on this, but I still feel the need for some more advice. Thank you very much.
We are trying to design a clinical trial in type 2 diabetes patients. The main outcomes we want to assess include FBS, 2hpp, HbA1c, insulin, and HOMA-IR. We will also assess the lipid profile and oxidative stress indices (MDA and TAC). The problem is that we could not find any similar study to use for determining the sample size. In this situation, is it possible to use Cohen's formula? If not, what is the right way to determine the sample size?
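A minimal sketch of the effect-size (Cohen) route for one continuous outcome such as HbA1c, assuming the statsmodels package is available; the standardized effect size d = 0.5 is an illustrative assumption, not a recommendation:

from statsmodels.stats.power import TTestIndPower

# Cohen's conventions: d = 0.2 small, 0.5 medium, 0.8 large
n_per_arm = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05,
                                        power=0.80, alternative="two-sided")
print(round(n_per_arm))   # about 64 per arm, before any dropout allowance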

Is it possible to compare the theoretical maximum adsorption capacity (Langmuir qm) of my sample with other materials when the R² of the Langmuir model is about 0.82 and the R² of the Freundlich model is about 0.98?
I wish to assess the level of stress among a specific group of nurses redeployed to other hospital settings (i.e., research nurses) during the COVID pandemic for my research proposal.
May I ask your thoughts on which sampling method is best, and why?
I am conducting a study to assess the quality of selected parts of some herbal materials and also develop acceptance criteria for their quality attributes. I am supposed to sample these materials from across the length and breadth of the country and I am hoping to stratify the country into strata and further divide each stratum into clusters, and then randomly sample the materials from each of the clusters picked up through systematic sampling.
My challenge is with the calculation of a 'realistic' sample size that can then be used to determine the number of clusters and the number of samples from each cluster. Very often, what I see in the literature tends to be convenience sampling, which may not be representative of the population. The focus of my study, however, requires that my sampling be representative of the population in the country (and also realistic), especially because of the part that has to do with setting acceptance criteria.
I would be very grateful for your technical assistance. Thank you.
Is there a Python project where a commercial FEA (finite element analysis) package is used to generate input data for a freely available optimizer, such as scipy.optimize, pymoo, pyopt, pyoptsparse?
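A minimal sketch of the usual coupling pattern, with the FEA side treated as a black box; the solver executable, input files, and output file below are placeholders, not a real API:

import subprocess
from pathlib import Path
from scipy.optimize import minimize

def objective(params):
    """Run the commercial FEA code as a black box and return a scalar objective.
    'fea_solver', the file names, and the file formats are hypothetical."""
    Path("design.txt").write_text(" ".join(f"{p:.6e}" for p in params))
    subprocess.run(["fea_solver", "model.inp", "design.txt"], check=True)
    return float(Path("objective.txt").read_text())   # e.g. mass or peak stress

# Derivative-free search, since gradients are rarely available from commercial FEA
result = minimize(objective, x0=[1.0, 2.0], method="Nelder-Mead",
                  options={"xatol": 1e-3, "fatol": 1e-3})
print(result.x, result.fun)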
Dear researchers greetings,
I'm working on egg quality for my Ph.D. thesis and I want to know the protocol for egg sampling from a production unit.
The main questions I have are the following:
- What number of eggs should be collected with respect to the production unit capacity?
- How should the eggs be collected with respect to their position in the batch?
- How should the eggs be preserved prior to the tests in the laboratory?
Warmest regards.
I have an energy spectrum acquired from experimental data. After normalization, it can be used as a probability density function (PDF). I can construct a cumulative distribution function (CDF) on a given interval using its definition as the integral of the PDF. This integral simplifies to a sum because the PDF is given in discrete form. I want to generate random numbers from this CDF.
I used inverse transform sampling, replacing the CDF integral with a sum. From there I follow the standard routine of inverse transform sampling, solving it over the summation range instead of an integral range.
My sampling visually fits the experimental data, but I wonder whether this procedure is mathematically correct and how it could be proved.
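A minimal sketch of the discrete procedure described above. Mathematically, the probability integral transform guarantees correctness for the exact CDF; with a discretized CDF the samples follow the piecewise approximation of the PDF implied by the binning, and the approximation error vanishes as the bin width shrinks. The Gaussian-shaped counts below are only a stand-in for the measured spectrum:

import numpy as np

rng = np.random.default_rng(0)

E = np.linspace(0.0, 10.0, 500)                      # energy grid (placeholder)
counts = np.exp(-0.5 * ((E - 3.0) / 0.8) ** 2)       # stand-in for the experimental spectrum

dE = E[1] - E[0]
pdf = counts / (counts.sum() * dE)                   # normalize to a PDF on the grid
cdf = np.cumsum(pdf) * dE                            # the CDF integral replaced by a sum

u = rng.random(100_000)
samples = np.interp(u, cdf, E)                       # invert the discrete CDF by interpolation

# Sanity check: a density histogram of 'samples' should reproduce the input PDF
hist, edges = np.histogram(samples, bins=50, density=True)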
After collecting dental water unit samples post-flushing, I have got some microbes on Gram staining. They are long rods with breaks in between. Please suggest what they could be.


Dear peers,
It would be much appreciated if you could suggest papers or reports that emphasize the sampling considerations for microplastics in soil/terrestrial/agricultural environments.
Thanks!
I have mortality data for Trout and Daphnia tested in the same sample of water, repeated for water samples taken over many days. I end up with a data table like this:
Sample  Tsamplesize  #TDead  PropTdead  Dsamplesize  #Ddead  PropDdead
1       10           1       0.1        30           3       0.1
2       10           2       0.2        30           5       0.167
3       10           3       0.3        10           2       0.2
etc.
The Daphnia sample size is either 10 or 30 but the Trout sample size is always 10.
I want to test whether the paired Trout and Daphnia results are statistically similar and correlated. What is the appropriate test for the paired proportions in this case? I'm sure this problem is common in case-control studies and interlaboratory testing, but I can't seem to find the appropriate test details. I thought of using a paired t-test with an arcsine transform. Any suggestions or references would be appreciated. I've attached a data file.
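One simple option consistent with the arcsine idea mentioned above (a sketch, not a definitive recommendation): transform both proportion series, then use a paired t-test for agreement in level and a correlation for association. Note that this ignores the unequal Daphnia group sizes (10 vs 30); a binomial model on the raw counts would account for them. The three rows shown above stand in for the full attached data:

import numpy as np
from scipy import stats

prop_trout   = np.array([0.10, 0.20, 0.30])
prop_daphnia = np.array([0.10, 0.167, 0.20])

# Variance-stabilizing arcsine-square-root transform
at = np.arcsin(np.sqrt(prop_trout))
ad = np.arcsin(np.sqrt(prop_daphnia))

t_stat, p_diff = stats.ttest_rel(at, ad)    # paired test of mean difference
r, p_corr = stats.pearsonr(at, ad)          # association across water samples
print(t_stat, p_diff, r, p_corr)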
Let's say that we are doing an online survey among a group of people with the same profession - a cross-sectional study. The population size was identified and the sample size calculation was done. Since the calculated sample size was small (n=588), a census was planned (universal sampling). Along the way, it turned out the population size had been underestimated; with the real population size of N=1070, the calculated sample size is n=780. Therefore, sampling needs to be done. Because of time constraints, my question is: can we do the sampling and randomization after the data have been collected? And if so, is there a research article that has done this before? One way to avoid bias is that the data have no identifiers except the name of the workplace. Can that be done?
I would like to do research in which the target population would be parents, and I want to conduct it in the OPD clinics of different private practitioners. For the sample size calculation, do I need to calculate the sample size from the population of parents or from the number of clinics?
Suppose the population of parents in Karachi is 1 million; keeping a confidence interval of 95%, a margin of error of 5%, and an outcome factor of 50%, the sample size would be 384.
We do not have an exact figure for the healthcare clinics run by private practitioners in Karachi; I searched some links and combined them, and found 306 clinics and hospitals in Karachi. If we take this as the population and keep a confidence interval of 95%, a margin of error of 5%, and an outcome factor of 50%, the sample size would be 169.
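For reference, a minimal sketch of the two calculations quoted above (Cochran's formula plus the finite population correction); the small difference from 169 comes from rounding conventions in different calculators:

def cochran_n0(p=0.5, moe=0.05, z=1.96):
    """Cochran's formula for a very large population."""
    return z**2 * p * (1 - p) / moe**2

def fpc(n0, N):
    """Finite population correction for a population of size N."""
    return n0 / (1 + (n0 - 1) / N)

n_parents = cochran_n0()             # ~384, matching the first case above
n_clinics = fpc(cochran_n0(), 306)   # ~170, close to the 169 quoted above
print(round(n_parents), round(n_clinics))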
If I choose case 2, how many parents do I have to choose from each clinic? Actually, my university has asked me to work with a cluster sampling or systematic sampling technique rather than non-probability sampling.
So please suggest which option is more suitable in this case, and how many clinics or how many participants per clinic I should recruit so that the sample represents the population.
Thank you in advance
Currently, I am going to implement the survey method in one of my research projects related to business units. The Orbis database (company information across the globe, by Bureau van Dijk) or something similar would be useful for me to build a sample according to certain criteria and obtain contacts. My organization does not provide access to the Orbis database. Maybe someone has access to this database and could provide me with data from it, or could recommend free alternatives?
Thank you in advance.
I will be collecting carbonatite samples for LA-ICP-MS. They will be ground in order to handpick zircon crystals for U-Pb geochronology. I want to get 100 zircon grains. What sample weight should I take?
R programming language
I am considering whether it is appropriate to use two different randomly chosen samples from one huge database to run two logistic regressions separately on the same subject. The main reason is the low computing power of my machine and the impossibility of using my own multimatching function, which binarizes the whole dataset into 0 and 1 (follow / not follow).
The database consists of 1,500,000 observations and 54 variables (a data.frame). The DV reflects the act of following one of two presidential candidates (1 and 0), and the IVs reflect the act of following particular media outlets on Twitter (also 1 and 0). The aim is to present the association between media and the political agenda and the predictive power of particular media outlets.
Unfortunately, I am forced to sample the data because of the computing time. Hence, I am going to draw two random samples (2 x 100k records), run the regression, and then confirm the first sample's results using the second. Is this consistent with methodological/statistical practice? Thank you in advance.
Hello Esteemed Researchers
I have a question and I was hoping the experts in the field could guide me. I have never worked with the shoot apical meristem (SAM) and am really curious to learn more and more about it in wheat. However, I do not know how to identify its location in a grown wheat plant. I tried searching for articles on the SAM of rice and other monocots, but the sampling method and its location have not been stated.
I would really appreciate it if someone could advise me on this, explain thoroughly how to identify the SAM region, and describe the best procedures for sampling it.
Thank you so much in advance. Any form of guidance will be fully appreciated.
With gratitude
Dee
I am directing a study to update information about the trees in the public space of the Partido de Morón, Provincia de Buenos Aires, Argentina.
Between 2005 and 2008, a group of professors and students of the "Floricultura" chair carried out a census of trees in the public space. For each of the trees (approximately 100,000 plants) they recorded the date of evaluation, the genus and species, the common name, and several quantitative and qualitative variables.
In 2013 and the first trimester of 2014, we took a random sample of 100 blocks in the same population. For each individual tree, we registered any change in the information between the census and the random sample. There are approximately 5,000 plants in the random sample.
We obtained a database containing, for each tree, its value for each variable (qualitative or quantitative) on the two dates.
The purpose of our study is to produce an update of the census information as of March 2014.
We are using ratio, regression, and post-stratified estimators, but we would welcome any suggestions for obtaining the most reliable estimator of each variable for this population. We want to take into consideration the time between the observations.
Thank you in advance for any help!
First, why do it? Well, ambient MS sampling methods are by nature destructive, and rare and precious analyte objects can't be indiscriminately subjected to moisture, stripping, discoloration, or burning. DESI (and DART) operate continuously. If a target is at the center of a surface, one has to drag the desorbing flow across the surface to get to the target and have it all positioned optimally, disrupting more area than should be necessary.
One could just turn the DESI voltage or syringe pusher on and off until the sample is positioned, but it's my understanding that the flow needs to be stable to get good signal. Some "start-up" emitted solvent would expose the target before optimum conditions are reached. Diverting the flow back and forth from the emitter would presumably have the same effect.
Perhaps one could protect the sample surface before exposure by using a shutter, as with DART (Analytical Methods 10 (9), 1038-1045). A shutter vane is probably going to be 0.01"/0.25 mm thick. Of course, the shutter can't contact the DESI emitter at >1 kV and needs clearance to move over the sample surface, so one has to allow at least 0.75 mm between the emitter and sample. The greater that distance, the greater the sample area exposed. Also, what happens to the solvent that builds up on the shutter while closed? Tricky.
Instead of a swinging shutter, one could mask the entire sample surface save for the target area. Of course, more than one target would require laborious change in masks and apparatus repositioning.
One could abandon DESI entirely and use some liquid microjunction or nano-DESI sampling with 3D sample manipulation, but that's not the point of this thought experiment. Some day the Venter lab or someone is going to perfect protein sampling with DESI, and then I'll really want it to be discontinuous. I've been thinking about this off and on for years. How would you do it?
To perform data quality assessment in the data pre-processing phase (in a Big Data context), should data profiling be performed before data sampling (on the whole dataset), or is it OK to profile a subset of the data?
If we take the second approach, how is sampling done without having information about the data (without even some level of profiling)?
If I use purposive sampling in my qualitative study, do I need to set the sample size? If yes, then how?
I've taken 156 samples out of a population of 2,500, with an 80% confidence level and a 5% margin of error. How do I calculate the sampling intensity in this case?
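A minimal worked example, assuming "sampling intensity" means the sampled fraction of the population: intensity = n/N = 156/2500 = 0.0624, i.e. about 6.2% (roughly one sampled unit for every 16 population units).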
Dear colleagues,
Would anyone be willing to start a collaboration by sampling freshwater atyid shrimps in Egypt, in particular in the Faiyum Oasis?
In an integrative taxonomic approach combining morphological and molecular data, this would help me to delineate species.
I selected five firms in an industrial sector (where the total number of firms in that sector was more than 1,000). These five firms comprised 2,090 relevant individuals whom I was interested in contacting to participate in a survey study. A sample of 1,000 was drawn randomly from these 2,090 potential participants and a survey was sent to them. I received 337 usable responses, which were then used for the analysis.
In your opinion, what is the best way to report the above sampling procedure in terms of target population, sampling frame, etc.? Any authentic reference would be much appreciated.
I am looking to do a content analysis of how left- and right-wing UK newspapers presented the link between MMR and autism. However, the number of articles returned when searching the terms 'autism' and 'MMR' on Nexis for each newspaper is huge, and it differs for each newspaper.
How can I reduce these articles to a manageable number? Stratified sampling?
Does a sampling technique known as Infinite Population Random Sampling exist? If it exists, could it be applied to internet user/social media studies, and how can it be employed?
Dear colleagues,
My question is regarding suggested methodologies for snow sampling in, for example, mountains or on peaks. Some ice sampling techniques for these environments would also be appreciated. Consider that these samples are going to be processed to identify microplastics in snowy mountain ecosystems.
Thanks in advance,