Science method

# Design of Experiments - Science method

Explore the latest questions and answers in Design of Experiments, and find Design of Experiments experts.
Questions related to Design of Experiments
Question
SNR for dynamic response: 10 log10( (1/r)(S_beta - V_e) / V_e ).
There is one important paper of George Box about it.
Signal-to-Noise Ratios, Performance Criteria, and Transformations
George Box
Technometrics
Vol. 30, No. 1 (Feb., 1988), pp. 1-17
After that, many other papers discussing it were published. In my opinion, the best extension of ANOVA adapted to the SN ratio uses Double (or Joint) Generalized Linear Models, proposed by Smyth (1989) and Nelder and Lee (1991). This was later extended further into Double Hierarchical Generalized Linear Models, implemented in the hglm package for the R statistical computing language.
Good luck,
Afrânio Vieira.
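For readers who want to compute the dynamic SN ratio from data, here is a minimal Python sketch of the standard Taguchi formulas; the signal and response values below are made up purely for illustration:

```python
import numpy as np

def dynamic_sn_ratio(M, y):
    """Taguchi dynamic signal-to-noise ratio for a zero-intercept
    linear system y = beta * M + noise.

    M : signal-factor levels (one per observation)
    y : measured responses
    Returns SN in decibels: 10*log10(((1/r) * (S_beta - V_e)) / V_e).
    """
    M = np.asarray(M, dtype=float)
    y = np.asarray(y, dtype=float)
    n = y.size
    r = np.sum(M ** 2)                # effective number of signal units
    S_T = np.sum(y ** 2)              # total sum of squares
    S_beta = np.sum(M * y) ** 2 / r   # sum of squares due to the slope
    V_e = (S_T - S_beta) / (n - 1)    # error variance
    return 10 * np.log10(((1 / r) * (S_beta - V_e)) / V_e)

# Example: a nearly linear response with small noise gives a high SN ratio.
M = [1, 1, 2, 2, 3, 3]
y = [2.02, 1.98, 4.01, 3.97, 6.03, 5.99]
print(dynamic_sn_ratio(M, y))
```

The closer the data are to a noise-free straight line through the origin, the larger the ratio becomes.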
Question
I have done a full factorial experiment (using 3 factors) and three center points. The experiment was repeated thrice at main points, overall producing 27 design points.
Now, analyzing the data, I am not able to get the desired model. I analyzed the design using a first-order model with all interactions, but R-squared and adjusted R-squared were very low.
After various iterations I finally ended up with a model having only main effects. However, the curvature effect was quite significant.
Subsequently, when I tried to analyze the design using RSM, the model worsened and I had to revert to the original model.
I am getting R-squared and adjusted R-squared of 79% and 75%, respectively. I feel that is too low to accept the model.
Can someone tell me whether curvature effects are taken into account when building a first-order model? I mean, is y = a0 + a1*x1 + a2*x2 + a3*x3 sufficient even though the center points show a significant curvature effect?
If it is not included, can someone please explain how to improve the model or include the effect of curvature?
Dear Deewajar
If you add parameters to a model, its R-squared always grows; this is the reason for the adjusted R-squared. You can use R-squared to compare linear models with the same number of parameters, but for models with different numbers of parameters it is better to use other measures, such as the adjusted R-squared or Mallows' Cp.
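To see the point numerically, here is a minimal Python sketch (simulated data; all variable names are illustrative) showing that plain R-squared can only grow when a pure-noise predictor is added, while adjusted R-squared penalizes the extra parameter:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30
x1 = rng.normal(size=n)
noise_x = rng.normal(size=n)                  # pure-noise predictor
y = 2.0 + 1.5 * x1 + rng.normal(size=n)

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit with intercept."""
    X = np.column_stack([np.ones(len(y))] + list(X))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

def adj_r_squared(r2, n, p):
    """Adjusted R^2 for a model with p predictors (excluding intercept)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

r2_small = r_squared([x1], y)
r2_big = r_squared([x1, noise_x], y)          # extra useless predictor

assert r2_big >= r2_small                     # plain R^2 can only grow
print(adj_r_squared(r2_small, n, 1), adj_r_squared(r2_big, n, 2))
```

The adjusted values make the two models comparable despite their different parameter counts.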
Question
We have a project that requires us to design an experiment to show how protozoans metabolize.
Are you talking about breaking down ingested substrates? In any case, you could measure metabolism as either O2 consumption or CO2 production; there are different techniques. I do not know whether you are interested in individual responses or whether you have cultures. Since they are so small, you may need to follow several individuals as a group. "Metabolize" is unclear to me.
Question
9 factors (A-I) each at three levels and interactions AB and CD are to be considered. What DOE would be suitable so as to avoid incorrect analysis, faulty conclusion and minimize the confounding effect of factors and interactions?
Dear Fahraz
I used an L9 array with 3 factors at 3 levels, with a Taguchi confirmation test, in my last work, and I explain it step by step in my article. You can see my new publication at the link below.
Question
I want to optimize an analytic procedure. I have 4 factors and 4 levels. So which design is best fitted to my experiment? I am using Design Expert Software. Does anyone have experience working with this software?
Also, I employed an L9 design for 4 factors at 3 levels with 3 responses, with a final confirmation test at the optimum, via the Taguchi method. I explain it, step by step, in my article.
“Shahavi M. H., Hosseini M., Jahanshahi M., Meyer R. L. and Najafpour G. D. (2015) Clove oil nanoemulsion as an effective antibacterial agent: Taguchi optimization. Desalination and water treatment, Taylor & Francis, Impact Factor: 1.173, ISSN: 1944-3994, DOI: 10.1080/19443994.2015.1092893”
You can see my article in below link:
Question
In an attempt to determine the transcriptional responses, I used three animals (at each time point) to conduct the experiment. I collected the tissues and POOLED equal amounts of tissue from these triplicates to extract the total RNA. Subsequently, cDNA was synthesized, and for each cDNA sample a triplicate (n = 3) qPCR assay was performed to detect the transcript levels. Regarding the interpretation of the data, a question arose from one of my peers, stating that "transcriptional data presented from one single RNA pool obtained from three individuals is not enough to support the results". What could be the logical explanation I might add to my argument against this question?
Unfortunately, there is no way to draw conclusions about differential expression unless there are biological replicates. By pooling samples as described, the 'n' has effectively been reduced to one. There is no way to know whether any given result is associated with the experimental variable or technical variations in the pooling procedure, an outlier sample, etc. The experiment can be best used to show that you can or cannot detect a given transcript, or as the basis for a follow-up experiment with biological replicates.
Question
If anybody has any idea or any related material please send them.
This is a very wide area! Have you googled it?
In what sense will you use it?
/E
Question
The experimental parameters used in the literature for my problem all assume parallel execution, but I want to run a sequential execution.
Dahi,
The problem of choosing the right parameters is not yet adequately solved in the literature. Keep in mind that those parameters are interdependent. For example, as you increase the crossover rate you increase the exploration rate of the GA, and thus you will end up searching a larger space, but this will slow your convergence; on the other hand, higher mutation rates increase exploration of new solutions, whereas with too little mutation you may end up trapped in a local optimum instead of at the global optimum that you need. The same goes for all the other parameters. You may want to build a GA to optimize your original GA's parameters, as every problem may need a different set of parameters; in other words, the same GA needs a different parameter set for each different problem it is applied to. You may want to refer to the "No Free Lunch" theorem to gain some insight into this problem.
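To experiment with these trade-offs cheaply, here is a minimal, generic GA sketch in Python on a toy "one-max" problem; all defaults here are illustrative, not recommendations for any real problem, and the crossover and mutation rates are exposed so their effects can be compared:

```python
import random

def run_ga(crossover_rate, mutation_rate, n_bits=30, pop_size=40,
           generations=60, seed=1):
    """Minimal generational GA maximizing the number of 1-bits.

    The two rate parameters illustrate the exploration/convergence
    trade-off discussed above; every value here is a toy default.
    """
    rng = random.Random(seed)
    fitness = lambda ind: sum(ind)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)]
           for _ in range(pop_size)]
    for _ in range(generations):
        def pick():
            # Binary tournament selection.
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            if rng.random() < crossover_rate:      # one-point crossover
                cut = rng.randrange(1, n_bits)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            for i in range(n_bits):                # bit-flip mutation
                if rng.random() < mutation_rate:
                    child[i] = 1 - child[i]
            nxt.append(child)
        pop = nxt
    return max(fitness(ind) for ind in pop)

# Moderate rates usually get close to the optimum on this toy problem.
print(run_ga(crossover_rate=0.8, mutation_rate=0.01))
```

Re-running with extreme rates (e.g. mutation near 0.5) shows the degenerate behaviour described above.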
Question
In sample size estimation, before starting our research, we have to decide on the significance level and our estimated power for the study. When we are able to collect a larger sample size, could we consider as an example 0.01 instead of 0.05? How does this change affect the result?
You did not mention why you would want to lower your type-I error rate threshold, so I'm not sure whether the following applies to your study. Regardless, I will share my thoughts. There are some good papers in the ecological literature that address methods for determining appropriate type-I and type-II error rates for monitoring programs based on the costs of each mistake. These methods are based on the idea that the ratio of the cost of a false positive to that of a false negative should determine the appropriate ratio of the type-I and type-II error rates. Once you know the appropriate ratio for your problem, you can adjust the sample size to achieve the desired level of both. Hope this is helpful.
Mudge, J., Baker, L., Edge, C., & Houlahan, J. (2012). Setting an optimal α that minimizes errors in null hypothesis significance tests. PLoS ONE, 7(2).
Field, S. A., Tyre, A. J., & Jonzén, N. (2004). Minimizing the cost of environmental management decisions by optimizing statistical thresholds. Ecology Letters, 7, 669-675.
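As a rough illustration of the cost-ratio idea from these papers, here is a Python sketch for a one-sided z-test; the effect size, sample size, and cost ratios below are hypothetical, and the papers themselves give the rigorous treatment:

```python
import numpy as np
from scipy.stats import norm

def optimal_alpha(cost_ratio, effect, n, sigma=1.0):
    """Grid-search the alpha that minimizes
        cost_ratio * alpha + beta(alpha)
    for a one-sided z-test, where beta is the type-II error rate at the
    stated effect size and cost_ratio = cost(FP) / cost(FN).
    """
    delta = effect * np.sqrt(n) / sigma          # noncentrality parameter
    alphas = np.linspace(1e-4, 0.5, 2000)
    betas = norm.cdf(norm.ppf(1 - alphas) - delta)
    total = cost_ratio * alphas + betas
    return alphas[np.argmin(total)]

# Hypothetical numbers: equal costs vs false positives 4x as costly.
print(optimal_alpha(1.0, effect=0.5, n=25))
print(optimal_alpha(4.0, effect=0.5, n=25))
```

As expected, making false positives relatively more costly pushes the optimal alpha below the conventional 0.05-style defaults.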
Question
To get the optimal parameters in a machining process.
We used a Taguchi design in our last work, and we explain this method step by step. You can see it at the link below.
Question
As per current ICH guidelines, Q8(R2) is the current demand in pharmaceutical formulation development: to implement Quality by Design in the pharma industry. Therefore, chemists and scientists should have knowledge of QbD. However, academia shows the least interest.
Well, can't it be included as part of the curriculum for pharmacy graduates, given its vast requirement in research and industry?
Question
I have three factors, A, B, and C, with 15, 2, and 2 levels respectively. The standard deviation of the population, from a pilot survey, is 1.8. I want to fit the three-way ANOVA model:
y_ijkl = mu + alpha_i + beta_j + gamma_k + (alpha*beta)_ij + (alpha*gamma)_ik + (alpha*beta*gamma)_ijk + error_ijkl
Our main hypothesis concerns finding the best level of factor A in interaction with the levels of B and C. How do you calculate the sample size for testing this hypothesis? And could you give me R/SAS code for calculating the sample size by simulation?
G*Power is a flexible, free tool for calculating sample sizes. http://www.gpower.hhu.de/en.html
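Since the question also asked for simulation code (R/SAS were requested; this Python sketch shows the same idea), power can be estimated by Monte Carlo. Here is a deliberately simplified one-way layout in factor A only, with hypothetical group means and the pilot standard deviation of 1.8; the full three-way model would be simulated analogously:

```python
import numpy as np
from scipy.stats import f_oneway

def simulated_power(group_means, sigma, n_per_group, alpha=0.05,
                    n_sim=2000, seed=42):
    """Monte-Carlo power of a one-way ANOVA F-test: simulate data under
    the stated means and record how often p < alpha."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sim):
        groups = [rng.normal(m, sigma, n_per_group) for m in group_means]
        if f_oneway(*groups).pvalue < alpha:
            hits += 1
    return hits / n_sim

# Hypothetical effects for 3 of the levels of A, sd = 1.8 (pilot value):
means = [0.0, 1.0, 2.0]
for n in (5, 10, 20):
    print(n, simulated_power(means, sigma=1.8, n_per_group=n))
```

Increasing `n_per_group` until the estimated power reaches the target (e.g. 0.80) gives the required sample size.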
Question
I calculated the heat of formation of ammonia using different levels of theory, namely Hartree-Fock, MP2, CCSD, and CCSD(T), and different basis sets, namely 3-21G, 4-31G, 6-31G**, and 6-31++G**.
What I expected was that the answer obtained with CCSD(T) and the 6-31++G** basis set would be closest to the experimental result, but this was not the case.
I have checked the calculation twice but the answer is the same.
Is there an explanation for this?
Thanks.
Usually when such things happen the problem is either:
1. a systematic measurement error (i.e. something is not being measured correctly, you should check that measurement is working properly and perform careful recalibration).
2. a "better" theory is not actually better in this particular case (or not better enough), because it does not take into account something important.
3. a combination of both.
Question
In my thesis, I have a couple of variables which may affect the final result. When I checked the number of experiments to be conducted, it was 128. Even if I decide to do each one only once, performing 128 tests seems quite impossible. I cannot reduce the factors because all of them seem important. Is there any other way to reduce the number of experiments?
All fractional factorial designs (FFDs) can be used for screening, depending on your purpose and requirements. To my knowledge, Plackett-Burman and Taguchi designs are effective screening designs; both have their own merits and demerits. With FFDs you will be able to screen more variables with fewer trials. Also, you didn't mention the levels of your factors.
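As a small illustration of how a fractional factorial cuts the run count, here is a Python sketch that builds a 2^(4-1) half fraction with the defining relation I = ABCD (Plackett-Burman and Taguchi arrays themselves come from published catalogs or software):

```python
from itertools import product

# Build a 2^(4-1) half fraction: vary A, B, C over all 8 combinations
# and set D = A*B*C (defining relation I = ABCD, resolution IV).
runs = []
for a, b, c in product((-1, 1), repeat=3):
    d = a * b * c
    runs.append((a, b, c, d))

print(f"{'A':>3}{'B':>3}{'C':>3}{'D':>3}")
for r in runs:
    print("".join(f"{v:>3}" for v in r))
# 8 runs instead of 16: main effects are clear of 2-factor interactions,
# but 2-factor interactions are aliased in pairs (AB with CD, etc.).
```

The same generator idea scales up: a 2^(7-4) design would screen 7 factors in 8 runs, at the cost of heavier aliasing.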
Question
Optimization & DoE
1) Design and analysis of experiments by D.C. Montgomery
2) Design of experiments using the Taguchi approach: 16 steps to product and process improvement by Ranjit K. Roy
Question
In a pre-test/post-test experimental design, both tests should be identical (same questions). Is it possible to have a pre-test and post-test which assess the same concepts but use different questions?
In my experiment I had a control group and an experimental group. Both groups answered the pre-test and post-test questions; however, the questions in the pre-test were a bit different from the questions in the post-test.
Dear Mona
The answer is yes, but the pre-test should be the same for both groups and the post-test should be the same for both groups, to obtain meaningful results. If, in addition, you have some items common to both tests (pre and post), then you may use IRT (http://en.wikipedia.org/wiki/Item_response_theory) to analyse this; it can produce more detailed results for the comparison.
Question
I have results of the experiments designed as you can see below (or in the attached file). What is your suggestion for the analysis of such designed experiments? Actually, my problem is with the center points, which were run separately on different days; otherwise, the main experiments could be treated as a split-plot design, taking A and B as hard-to-change factors.
- doing complementary experiments is impossible.
- Finding the significance of curvature is important.
- Estimating the effect of main factors and their 2nd order interactions is necessary.
A B C D
------------------------------------1st day
0 0 0 0
------------------------------------2nd day
-1 1 1 1
-1 1 1 -1
-1 1 -1 1
-1 1 -1 -1
------------------------------------3rd day
-1 -1 -1 -1
-1 -1 -1 1
-1 -1 1 -1
-1 -1 1 1
------------------------------------4th day
0 0 0 0
------------------------------------5th day
0 0 0 0
----------------------------------- 6th day
1 -1 -1 -1
1 -1 -1 1
1 -1 1 -1
1 -1 1 1
------------------------------------ 7th day
1 1 1 1
1 1 1 -1
1 1 -1 1
1 1 -1 -1
-------------------------------------8th day
0 0 0 0
-------------------------------------9th day
0 0 0 0
You can still analyze the experiments as a split-plot design (using the approach I gave in my paper). The problem is that because the design is not balanced, the test statistics will not have known distributions under the null hypothesis, i.e., the degrees of freedom at the subplot level are complicated. You could use simulation to estimate the null distribution of the test statistics, and then compare the observed values of the test statistics from the experiments to this simulated distribution. You could also approximate the subplot degrees of freedom using the method of Kenward and Roger. Do you have your response data? I have another idea for analyzing unbalanced split-plot designs that I have been working on more recently. If you like, I can apply this to your data and see what results we find.
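The simulation idea can be sketched as a whole-plot permutation test: permute the day-level assignments of a hard-to-change factor and rebuild the null distribution of a statistic. A hedged Python illustration with synthetic data (not the questioner's actual design; the statistic here is the simplest possible one):

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic split-plot-like data: 6 whole plots (days), 4 subplots each.
# 'a' is a whole-plot factor that is constant within a day.
n_days, n_sub = 6, 4
day = np.repeat(np.arange(n_days), n_sub)
a = np.repeat([-1, -1, -1, 1, 1, 1], n_sub)      # whole-plot factor
y = 0.5 * a + rng.normal(0, 1, n_days * n_sub)

def stat(a_levels, y, day):
    """|difference of whole-plot means| between the two A levels."""
    day_means = np.array([y[day == d].mean() for d in range(n_days)])
    day_a = np.array([a_levels[day == d][0] for d in range(n_days)])
    return abs(day_means[day_a == 1].mean() - day_means[day_a == -1].mean())

observed = stat(a, y, day)

# Permute A across whole plots (days), never across subplots:
perm_stats = []
for _ in range(2000):
    perm = rng.permutation([-1, -1, -1, 1, 1, 1])
    perm_stats.append(stat(np.repeat(perm, n_sub), y, day))

p_value = np.mean(np.array(perm_stats) >= observed)
print(p_value)
```

With only 6 days there are just 20 distinct assignments, so this is purely illustrative; a real analysis would use a proper mixed-model statistic inside the same permutation loop.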
Question
I found that the optimal design for my experiment would be ABBA/BAAB/AABB/BBAA, but it only made sense to do ABBA/BAAB.
My question is: what properties of the optimum are lost through this reduction of test groups?
Is it still uniform and strongly balanced? If yes, what is the disadvantage of this reduction of groups?
At https://onlinecourses.science.psu.edu/stat509/book/export/html/123 are some definitions, but I'm not sure about whether or not these properties also match with my cross over design...
If you are running a simple A vs B test, randomizing the sequence in which you take samples is important. If you use any 2 of the blocks, you will get good results. I wouldn't worry about the "Optimal" sequence.
Question
I would like to conduct an experiment where participants are put in different emotional states and then need to take decisions. Is there a common/well-known experimental approach for that? Maybe some game?
I also considered that, but I am afraid that answers may have little validity because of the artificial situation. I was looking for something a little more subtle, like the prisoners dilemma game is used to assess cooperation...
Question
I am used to the corpus linguistic paradigm, but now I need to do some linguistic experiments. As I want to avoid making fundamental mistakes, I search for literature that describes the general methodology of experimental linguistics.
You can find lists of references, teaching materials, tutorials, etc. on our webpage for fieldworkers/corpus linguists who want to carry out experiments (and for psycholinguists who want to carry out work in the field): http://experimentalfieldlinguistics.wordpress.com/
Question
Scientists have a choice to start with either the theory or the experiment.
Sure, the literature search comes first, and with the Internet and Google it is much easier than in the old days. Then a certain degree of theoretical understanding is required in order to design effective experiments. Working in a large research establishment, one has to queue for the machine shop to produce parts for the required equipment.
Then the experimental stage comes. The experiment more often corrects the researcher's initial theoretical model, which allows the theory to be refined. If the theory describes the observed experiments and is capable of predicting different scenarios, the project is well done. But that is from my personal experience; other scientists can have a different modus operandi. Only the result is king.
Question
If there are at least 4-5 factors to consider, there will be too many samples. I have read something about 2k factorial level design, and some researchers used to screen out the important factors.
Table 1: Number of runs for a 2^k full factorial
Number of Factors (k)    Number of Runs (2^k)
2                        4
3                        8
4                        16
5                        32
6                        64
7                        128
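The run counts in Table 1 are simply 2^k; enumerating the coded runs with Python's itertools reproduces the table:

```python
from itertools import product

def full_factorial_runs(k):
    """All 2^k level combinations for k two-level factors, coded -1/+1."""
    return list(product((-1, 1), repeat=k))

# Reproduce the run counts from Table 1:
for k in range(2, 8):
    print(k, len(full_factorial_runs(k)))
```

This is why screening designs matter: the count doubles with every added factor, and fractional designs trade aliasing for a much smaller run budget.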
Question
On every occasion! Effect size fulfils an important requirement for the external validity of your research (the degree to which your research tells us something about the world in which we live). By translating the abstract language of numbers into real-life measures, we make the final step between question and answer.
Stating that there is a "significant relationship" between Vioxx and the risk of fatal and non-fatal heart attack is not sufficient. But when you estimate (as the FDA did) that Vioxx was responsible for around 27,000 fatal and non-fatal heart attacks, you understand the extent of the manufacturer's culpability in not withdrawing it once it was clear from the data that it was unsafe.
Question
I am conducting an experiment on a public goods dilemma with groups of 4. My design has two kinds of treatment: an experimental treatment and a control treatment. However, I have very limited financial support, so I am wondering how many participants I should invite to my experiment. So far I have collected 8 groups (32 participants) for the experimental treatment and 7 groups (35 participants) for the control treatment, so the total number is 68 persons. Is that enough?
I have checked many related papers, but many of them have more than 100 participants and some have only 64 subjects, for example "Climate change in a public goods game: Investment decision in mitigation versus adaptation".
By the way, the result of the experiment with the current data is pretty good and it has verified my hypothesis.
Any idea will be most helpful.
Yours sincerely
Joanna Zhang
____________________________________
With so many excellent answers, I have learned a lot, thank you!
Calculate the sample size you need; sample size calculators are available.
Good luck
Béatrice
Question
Hi all,
I would like to implement DoE into our bioprocessing unit for animal cell culture. I would like to ask you the following:
- What are your experiences with DoE
- Which program do you recommend
- Is there literature you can suggest to get more familiar with DoE
- Any course/workshops you can recommend
Thank you so much - looking forward to an interesting discussion,
David
Experience with DOE in general, or experience with DOE for animal cell culture?
For the first I have a lot; for the second, nothing.
Be careful: a lot of wrong applications are around.
See, e.g., the newly uploaded document "Case no. 21: a wrongly awarded, wrong paper on DOE; the awarders are not reliable! Quality must be loved, disquality must be hated."
Question
Can it be used for experimental design?
I am critiquing a paper which has no methodology section; it only says that the sample was drawn through advertising.
Question
I did a two-level full factorial design in order to find the significance of three factors (A, B, C) and their interactions on a response (R). Due to the interaction between the minimum level of A and the maximum level of C, a situation arose at two runs where R is not detectable. How should I treat such a case? Shall I leave these two runs out and analyze the remaining ones? If so, what should I do about the resulting non-orthogonality?
One way to tackle this is to treat them as missing data.
Question
Hi, I'm planning to conduct a social experiment on online communication apprehension. I've been informed that the number of participants for a social experiment should not exceed 30 people per experimental group. Any idea where I can get references to support this claim, or otherwise?
Hi Nigen & Alese,
Thank you so much for your response. Appreciate it much :)
Alese, I'll read the articles and will email you should I need further discussion.
have a great weekend ahead. Cheers!
Question
To test the effect of defoliation on an invasive weed, I will be clipping plants in different seasons (spring, early summer, late summer, autumn [fall]) and at different frequencies (none (= control), 1, 2, 3, or all 4).  The attached file "Clipping design" shows it diagrammatically.
Is it correct to describe this as a full factorial design with 4 factors (Spring, ESummer, LSummer, Autumn) each varied at 2 levels (clipped or not clipped)?
It just seems a complicated explanation of a simple design.  And analysis would allow for high-order interactions (up to Spring*ESummer*LSummer*Autumn), which also seems a bit complicated.
Frequency is aliased with the other factors, so it can effectively be ignored, or analysed separately.
Thanks so much for that - really very helpful.  We are setting up the experiment now, and I'll put a reminder in place to let you know how it turned out.
Question
I need to design an experiment to measure or record dislocation movements inside material grain under different strain conditions.
Cláudio,
I am studying the deformations induced by riveting processes on aluminum alloys, mainly AA2024-T3 and 7050-T6. The alloy anisotropy (grain direction) has a very strong influence on this effect, but I don't know the reason yet. I was instructed to investigate the dislocation motion inside the grains in order to reach a better understanding of its effect on the riveting-induced deformation dynamics. The attached file has some related information.
Question
I am working on an experiment using Response Surface Methodology (design of experiments). I use Stat-Ease software for this. During analysis, the software shows that the cubic model for my data is aliased. However, the cubic model is significant (p < 0.05) and has an insignificant lack of fit (p > 0.05). Moreover, the R-squared value is also very good: 0.985.
Can I use this model for prediction of optimized conditions, given that the model is aliased but statistically significant?
In Design Expert (Stat-Ease software), try letting the software reduce your model with the "Backward" option, which takes all variables and interactions into account and eliminates the ones that are not significant. Use the resulting model, bearing in mind that terms aliased in this final model cannot be resolved (for example, the interaction ABC may be aliased, i.e. confounded, with A²B or something similar, according to what the software's analysis says). You can use the resulting model if the main factors (pure variables) are not aliased. Greetings.
Question
.
I suggest four papers and one book where you can find an appropriate answer to your question.
Costa, N., Pires, R., & Ribeiro, C. (2006). Guidelines to help practitioners of design of experiments. Total Quality Management Magazine, 18, 386-399.
Simpson, J., Listak, C., & Hutto, G. (2013). Guidelines for Planning and Evidence for Assessing a Well-Designed Experiment. Quality Engineering, 25, 333-355.
Tanco, M., Viles, E., Ilzarbe, L., & Alvarez, M. (2009). Implementation of Design of Experiments projects in industry. Applied Stochastic Models in Business and Industry, 25, 478-505.
Freeman, L., Ryan, A., Kensler, J., Dickinson, R., & Vining, G. (2013). A Tutorial on the Planning of Experiments. Quality Engineering, 25, 315-332.
Myers, R., Montgomery, D., & Anderson-Cook, C. (2009). Response Surface Methodology: process and product optimization using designed experiments (3rd ed.). New Jersey: John Wiley & Sons.
I hope you consider this information useful.
Best regards
Question
Can we design an experiment to prove that the speed of a particle cannot go from below the speed of light to above it? I am not talking only about accelerating particles; the experiment must be able to prove that no method, including pushing, jumping, or tunneling particles from below the speed of light to above it, can exist. I am not looking for a theoretical derivation but an actual lab experiment.
Dear Shalender, to my understanding (and this is partly gained by studying the history of science), this is not the way physics, as a scientific discipline, works: one does not begin with a set of perceived advantages, exterior to science, to be had if something were true, and then set out to prove the truth of that thing; this is at best a form of reverse engineering. We are supposed to observe nature, insofar as it is accessible to observation, and through scientific methods come to an understanding of these observations within the body of established knowledge, which of course can itself be subject to revision. This understanding can be formulated in terms of laws, expressed most conveniently in mathematical language. To test whether our understanding is complete, or at least not erroneous, we use the laws we have written down to predict observations yet to be made; later observations then verify these predictions. From my very personal perspective, any scientific investigation must be embedded in a greater whole; its links to the available body of knowledge must be evident to us, the researchers. Of course, one can make discoveries by serendipity while doing research, but this is only an added bonus for thinking about nature and attempting to understand it. You may wish to consult one of the better biographies of Albert Einstein, or of any other great scientist for that matter, to see how they practised physics. Come to think of it, one only needs to go through the publications of Einstein (the ones he wrote in his most productive years) to realise that they invariably begin with a discussion of some experimental observations.
Question
I think that Taguchi is very useful; however, I do not have much experience using these techniques, because my experiments are expensive and very time-consuming.
A, D or I optimal? Are there any contrasts that we should not test? Do we want to look at quadratic terms? What is the number of runs we have to work with? Since factors are continuous, we can test all the factors at 3+ levels. Would that be of interest?
With 36 runs, I can create a model that will find ALL 2-way interactions and quadratic terms for your last 2 factors. With 19 runs, I can find all main and quadratic effects, plus the 4 interactions, and still have 5 extra runs to create an error term and check model validity, look for other interactions, or create blocks in the design.
What are the goals and constraints for the design?
Question
Is there software nowadays that can design a mixed-level fractional factorial experiment? Perhaps in R, Matlab, or Unscrambler?
I have 2 factors with 2 levels and 3 factors with 3 levels. The full factorial is 2^2 × 3^3 = 108 runs, but I want to reduce it to 54. Does anyone know how to do this?
There is a type of experimental design called an Optimal Response Surface. You can create a design that fits your experimental space. I use Design Expert software to create my experimental designs. JMP would be another good product to check out. Both offer free trials.
Question
I want to carry out remediation in a screen house using different macrophytes to test their remediation potential on 5 different industrial effluents. How do I design the experiment?
Hi
We've been doing some phytoremediation experiments of late, but they are not yet published (though my student's PhD has now been successfully defended). I'd suggest a two-factor randomized block design with three replicates:
Factor 1: macrophyte species: (n = 5 levels)
level 1 species A
level 2 species B
level 3 species C
level 4 species D
level 5 species E
Factor 2: effluent (n = 3 levels)
level 1 no pollution (clean water: control)
level 2 intermediate: 50% effluent: 50 % clean water
level 3: 100% effluent
Blocks I, II, III (n = 3 blocks)
So you would need 5 x 3 x 3 = 45 treatment units to do the experiment (tanks, buckets, or whatever you want to use to hold your plants). Each unit must be independent of the others; don't start putting different plants in the same treatment unit, or you'll end up with a pseudoreplicated design, which is useless and can't show you the effects of your treatments (which, of course, are what you are interested in).
Each block will occupy one discrete area of your screening house and will contain 15 treatment units, arranged at random within the block, representing a full set of treatment-combinations  (species A clean, species A intermediate, species A polluted; species B clean etc).
The advantage of the block design is that it takes out an element of the error (due to "spatial position" of your treatments) and hence makes it easier to find significance in your results (if there is any) because the error term is reduced in the ANOVA.
If you are pushed for resources you could eliminate the intermediate treatment, thereby losing one third of the treatment units needed, but don't reduce the number of blocks; 3 reps is essential to have any chance of getting a good statistical outcome.
There are thousands of examples of this simple but effective design in the literature.
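For what it's worth, here is a small Python sketch that generates such a randomized-block layout; the species and effluent labels are placeholders for whatever you actually use:

```python
import random
from itertools import product

species = ["A", "B", "C", "D", "E"]          # factor 1: 5 levels
effluent = ["clean", "50%", "100%"]          # factor 2: 3 levels
combos = list(product(species, effluent))    # 15 treatment combinations

rng = random.Random(2024)                    # fixed seed for a repeatable plan
layout = {}
for block in ("I", "II", "III"):
    order = combos[:]                        # every block gets the full set
    rng.shuffle(order)                       # randomize positions within block
    layout[block] = order

for block, order in layout.items():
    print(block, order[:3], "...")           # first 3 positions of each block
```

Each block holds all 15 treatment combinations in its own random spatial order, which is exactly what lets the block term soak up positional variation in the ANOVA.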
Cheers
Kevin
Question
I am planning to prepare a review, for a reputable high-impact journal, that will give a brief and lucid idea of the application of design of experiments and statistics in pharmaceutical research.
If interested, please respond and be a coauthor of the publication.
You can contact me at my mail, hassan23pharma@gmail.com.
Question
Most experiments which could be used will leave induced associative learning recorded in the memory of the lion. How can we remove that part and conduct an experiment?
Hi Abhishek!
Why do you want to avoid associative learning?
Not knowing much about lions I would think the easiest way to test that is by using a two choice procedure rewarding them for choosing specific colours and then testing for transfer to gray scale stimuli with the same luminance - basically the same experiment Karl von Frisch did with bees.
To avoid associative learning you would have to design an experiment that uses their natural behaviours as a measure for whether or not they see colours, e.g. two hidden prey items, one in colour (which would be a lot easier to detect for a human) and one in gray or the same colour as the background, and then observe which one the lion attacks. If the lion gets rewarded for both choices, i.e. he gets to eat the prey no matter which one he chooses, there should be no associative learning.
Regards
Question
I am working on a transcriptomics project to identify biomarkers which can separate two classes of hepatotoxicants. There are 10 compounds: 5 belong to one class and the other 5 to the second class. For each compound, 3 concentrations are tested. In total there are 32 samples from both classes in one experiment (biological replicate), including the control sample (the same control will be used for all comparisons). With a sample size of 6, I have 192 samples from 6 independent experiments. Now I need to perform hybridization of the 192 samples in 2 runs or batches.
I am wondering what is the best way to randomize the samples to avoid possible batch effects.
The best option is to stratify the random sampling so that each batch contains the same number of samples from each group. Then you can estimate the batch effect from the difference of the batch means and correct/adjust for it.
Another way is to include the batch effect straight away in the model. This also requires that both batches have some arrays from all the groups, but not necessarily the same number. If it is the same number, the solution is again similar to the strategy stated above. Additionally, the model will account for some of the extra uncertainty that comes with possible batch-to-batch differences. And you have the choice of treating the batch as a random factor in the model.
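A minimal Python sketch of the stratified split described above; the group labels and per-group counts are illustrative, not the questioner's exact design:

```python
import random

# Hypothetical sample labels: 4 groups x 8 samples each.
samples = [(g, i) for g in ("grp1", "grp2", "grp3", "grp4")
           for i in range(8)]

rng = random.Random(0)
batch1, batch2 = [], []
for group in ("grp1", "grp2", "grp3", "grp4"):
    members = [s for s in samples if s[0] == group]
    rng.shuffle(members)              # randomize within the stratum
    half = len(members) // 2
    batch1.extend(members[:half])     # equal split of each group per batch
    batch2.extend(members[half:])

# Each batch now holds the same number of samples from every group.
print(len(batch1), len(batch2))
```

Because every group appears equally in both batches, a group-by-batch comparison estimates the batch effect without confounding it with the biological contrasts of interest.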
Question
I did a 2-level full factorial design with 3 factors in order to find the significance of each factor and their interactions on a response. After ANOVA on the non-transformed response, I found no factors with a significant p-value, while 2 factors had significant p-values when the same response was log-transformed. What can I say about the significance of the factors? Are they significant or not?
First of all, if something is significant or not is your decision, and not (simply) the result of a test.
Actually, to interpret the test results without bias, you should know before analysing the data if and how the response should be transformed. However, you obviously don't know it... so the test results are not interpretable in the way you'd like.
The typical way is to have a look at the residuals. Their distribution should not be too contradictory to the model assumptions (symmetry, variance homogeneity -> "normality"). From the look at the residuals you may be able to decide which model (the additive model with the untransformed response or the multiplicative model with the log-transformed response) is more appropriate (in light of the given data!).
Since you then used the very same data to decide about the model and to analyze the model coefficients, the p-values must be interpreted with greater care. Selecting a coefficient with p<0.05, for instance, does not anymore guarantee to keep a type-I error rate of 5%. If this is your aim, then you will either need some justification for the model you chose that comes from outside of the data (and the results) or you need new data that you will then analyze based on the insights you have got from the present data.
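The residual check described above can be sketched in plain Python. On an orthogonal 2^3 design, each least-squares coefficient of a main-effects model is a simple contrast average, so no matrix algebra is needed; the response values below are purely illustrative, not taken from the question:

```python
import math

# standard-order 2^3 design and illustrative (made-up) responses
design = [(-1, -1, -1), (1, -1, -1), (-1, 1, -1), (1, 1, -1),
          (-1, -1, 1), (1, -1, 1), (-1, 1, 1), (1, 1, 1)]
y = [8.7, 8.9, 69.4, 67.5, 0.4, 6.6, 0.5, 56.9]

def main_effects_residuals(ys):
    """Fit y = b0 + b1*x1 + b2*x2 + b3*x3 by least squares; on an
    orthogonal design each b_j is just a contrast average."""
    n = len(ys)
    b0 = sum(ys) / n
    b = [sum(x[j] * yy for x, yy in zip(design, ys)) / n for j in range(3)]
    fitted = [b0 + sum(bj * x[j] for j, bj in enumerate(b)) for x in design]
    return [yy - f for yy, f in zip(ys, fitted)]

res_raw = main_effects_residuals(y)
res_log = main_effects_residuals([math.log(v) for v in y])
# inspect both residual sets (e.g. with normal plots) and prefer the
# scale whose residuals look more symmetric and homoscedastic
```

The comparison of the two residual sets is then done graphically; the code only supplies the numbers.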
Question
Does anyone know some proxies to measure audit quality as they are used by Kaplan/Mauldin (2008) or Knapp (1991)?
read the basics of ISO 11000
Question
I did the following full factorial experiments to find the significance of variables and fit model. I analyzed the data using the standard least squares model and generalized linear model (normal distribution/ identity link function). The results of both analyses are attached. Why do they show completely different p-values while the model estimates are exactly the same? What can I say about the significance of the variables?
X1   X2   X3     Y1
 1    1   -1   67.5
-1    1    1    0.5
 1   -1   -1    8.9
-1   -1    1    0.4
 1    1    1   56.9
-1   -1   -1    8.7
 1   -1    1    6.6
-1    1   -1   69.4
 0    0    0   37.1
(all variables are continuous)
Generally, how can I find that which model (GLM or standard least squares) should be used for analysis when I have no idea about the response distribution?
----------------------
Y2
0.034
0.001
0.011
0
0.144
0.007
0.035
0.021
0.053
Consider Y2 as another response of the designed experiments above. When you check the distribution of Y2, you will find 0.144 as an outlier. Does it mean that the least squares model cannot be suitable, as it is affected by outliers?
Why do they show completely different p-values while the model estimates are exactly the same?
Actually, both analyses should give identical results. Maybe the problem here is that the GLM tests are performed using the Chi² distribution; using F-tests here would be better.
Generally, how can I find which model (GLM or standard least squares) should be used for the analysis when I have no idea about the response distribution?
The response distribution (more specifically: the distribution of the *errors*) must be known or assumed. If it is not known, the assumption should be reasonable. There is not much more to do. One may look at the actual residual distribution to see if it is strongly contradictory to the assumption; one may also simulate data following the assumption and see if the "real" results are in strong contradiction to the simulated results.
does it mean that least squares model can not be suitable as it is affected by outliers?
If 0.144 is a physically impossible or really nonsensical value, you should remove it anyway. Otherwise you actually have too little data to judge from the data alone that this is an outlier. Typically the best option is then to do the analysis once with and once without this value; if the results are grossly different, some careful reasoning is required (otherwise it doesn't matter anyway).
Outliers affect all analyses, not only least-squares. But only for least-squares it is relatively simple to detect outliers (given enough data). This does not mean that least-squares is more sensitive to the presence of outliers than other methods, except, for sure, robust methods (that usually have other drawbacks).
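The with-and-without check suggested above is easy to script. Using the Y2 values from the question (here just summarized by mean and standard deviation; the same idea applies to a full model fit):

```python
from statistics import mean, stdev

y2 = [0.034, 0.001, 0.011, 0.0, 0.144, 0.007, 0.035, 0.021, 0.053]
suspect = 0.144

# run the same summary once with and once without the suspect value
y2_trim = [v for v in y2 if v != suspect]
m_all, m_trim = mean(y2), mean(y2_trim)
s_all, s_trim = stdev(y2), stdev(y2_trim)
print(m_all, m_trim)  # the single point shifts the mean noticeably
```

If the two sets of results differ grossly, as they do here, the careful reasoning mentioned above is required before keeping or dropping the point.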
Question
~10% of the Reactive oxygen species can originate from extramitochondrial sites like NADPH oxidases. Besides knocking out mitochondrial genes, is there a way to determine the differences in cell culture. I am trying to design an experiment to explain that the origin of ROS is extra mitochondrial. As far as I know, fluorescence based assays do not tell the difference from where they come from. Am I missing something?
Like everyone told you before me, MitoSOX is a good tool. Please do not use DCFDA. It is not easy to measure the production of oxidants using DCFDA as a probe; the number of controls and the methods needed for a correct interpretation are extremely demanding. You can refer to Kalyanaraman, B., V. Darley-Usmar, K. J. Davies, P. A. Dennery, H. J. Forman, M. B. Grisham, G. E. Mann, K. Moore, L. J. Roberts, 2nd and H. Ischiropoulos (2012), "Measuring reactive oxygen and nitrogen species with fluorescent probes: challenges and limitations", Free Radical Biology & Medicine 52(1): 1-6, for better information on ROS probes. The problem with MitoSOX is that it reacts with superoxide but not with hydrogen peroxide, and even though hydrogen peroxide can be formed from superoxide, not all of it is produced that way. So if you have mitochondrial hydrogen peroxide formation without changes in superoxide formation, you will not see it. There are a few more selective probes for hydrogen peroxide, some of them targeted to the mitochondria, that you could use simultaneously with a cytoplasmic hydrogen peroxide scavenger, such as overexpression of catalase; that might work. The truth is that there is no easy answer to your question, and the methods to use will greatly depend on the specific experimental model and conditions.
Question
I'm trying to design an experiment on domain wall resonance by alternating field (magnetic field or current). My concern is about frequency domain of the resonance (ideally in Permalloy and with 50nm domain wall size) and what kind of experiments has been done. I'm using geometry constrictions to trap domain walls.
No matter what the experiment is, first of all you should decide on the factors and their dependency. After that, you should select the most suitable design (considering complexity, time and budget constraints etc) for your research. I strongly recommend you "Design and Analysis of Experiments by Douglas C. Montgomery".
Question
Individual variables        Symbol    -1    0    1
Extraction time             X1        20   24   28
Extraction temperature      X2         4   30   56
Vol. of enzyme solution     X3         4    5    6
Question
Taguchi Method and crystallization
The Taguchi method is not suitable for your work; as Fausto highlighted, it has many drawbacks.
Question
I have tried water in which extract is insoluble and DMSO & other organic solvent itself is showing inhibitory effect in the assay.
Thanks for the suggestions!!!...
Question
I am planning to study a hypothesis for understanding the pathophysiology in a particular disease. I am therefore wanting to know how many disease cases do I need to study for such research.
Dear Wasim, in disease-related research:
1. You have to find the prevalence of the particular disease in the same geographical area.
2. Decide and fix the power of the test.
3. Set the type I and type II error rates.
4. Use the appropriate formula for finding the sample size for the particular study.
If your study relates to prevalence, or the prevalence was already found in a previous study, then you can use the following formula:

Sample size  n = 4pq / d^2

where p = prevalence (in percent), q = 100 - p, and d = admissible error (10% of the prevalence).
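The formula above can be sketched as a small Python helper (the 10% admissible-error convention follows the answer; round up, since sample sizes are whole subjects):

```python
import math

def sample_size(prevalence_pct, rel_error=0.10):
    """n = 4*p*q / d^2, with p the prevalence in percent,
    q = 100 - p, and d the admissible error (10% of p by default)."""
    p = prevalence_pct
    q = 100.0 - p
    d = rel_error * p
    return math.ceil(4.0 * p * q / d ** 2)

print(sample_size(20))  # 4*20*80 / 2^2 = 1600
```

Note that rarer diseases (smaller p) demand much larger samples under this rule, because d shrinks with p.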
Question
I want to do 2^4 (+ 3 center points) experiments in order to find the effect of 4 factors on a response at two levels. One of the factors is temperature. If I run the experiments in random order it takes a long time, because I have just one equipment set for keeping the temperature fixed at the max, min, or middle value. Can I do all experiments requiring the max (or min or middle) temperature together instead of running them randomly? As far as I know, we cannot block the experiments in terms of one factor (temperature).
Randomization of the experimental trials (runs) protects against the effect of the so called “lurking variables”. A lurking variable is one that has important effect however it is not included in the experimental configuration. This is expected especially in the first steps of an experimental study when the effect of one or more significant variables might have been underestimated.
So in general it is necessary to randomize the experimental runs.
In addition it is as equally important to consider doing at least two replicates (repeat each experiment twice) and this way obtain an estimation of “pure error” which represents common cause variation of your experiments.
Since you already have four variables, a suggestion would be to consider in the first step a half-fraction design with two replicates, where you can include your 3 center points. This configuration brings a total of 19 experimental runs in one block, randomized if possible.
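The suggested plan can be sketched as follows; the generator D = ABC is an assumed, standard choice for a 2^(4-1) half fraction, and the randomized run order is what protects against lurking variables:

```python
import itertools
import random

def half_fraction_order(n_center=3, replicates=2, seed=1):
    """2^(4-1) half fraction (generator D = ABC), replicated,
    plus center points, returned in a randomized run order."""
    base = [list(p) + [p[0] * p[1] * p[2]]
            for p in itertools.product([-1, 1], repeat=3)]
    runs = [list(r) for r in base for _ in range(replicates)]
    runs += [[0, 0, 0, 0]] * n_center
    rng = random.Random(seed)
    rng.shuffle(runs)
    return runs

order = half_fraction_order()
print(len(order))  # 8*2 + 3 = 19 runs
```

Changing the seed gives a fresh randomization; fixing it makes the run sheet reproducible.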
Question
I used to use full and partial factorial DoE methods for mathematical analysis. Right now I am using Taguchi concept. I like Taguchi because of easier calculations and higher precision, moreover there is no need of the deriving any mathematical models. However, in my applications very often interactions between inputs play an important role and Taguchi can be in such applications insufficient. Does any one of you can suggest me any other design of the experiment methods and give me some info about additional literature?
L9 Taguchi optimization with title  "Clove oil nanoemulsion as an effective antibacterial agent: Taguchi optimization. (2015) Desalination and water treatment, Taylor & Francis, Impact Factor: 1.173, ISSN: 1944-3994, DOI: 10.1080/19443994.2015.1092893” is available via the link below!
Question
Plotting graphs is one of the best means of analysing our results, but if the plots are made wrongly, we will surely draw wrong conclusions.
Many biologists are not very familiar with programming languages, so user-friendly software may be preferred over software that involves a command-line interface.
The question is simple but the answer can be quite involved.
First of all, the choice of a tool will depend on the kind of data to be plotted. One of the main factors is the dimension of the domain over which the data is plotted: 1D (for instance a function of one variable) versus 2D and 3D (fields defined over two- or three-dimensional regions). Another issue is whether you deal with static or time-dependent data.
Also, the term "user friendly" can be understood in different ways. Most of the time it is understood in the sense that the tool provides a GUI, where you can make some clicks and get the results. That might be the case for some users, but for me the indicator of "friendliness" is the level of automation I can get from the tool. If I have a single data set and I am going to prepare only one version of the plot, then GUI tools are OK. However, if I have hundreds of data sets and I would like to use visualisation to understand the data, then it often turns out that "the GUI way" is hard to automate. Thus I would not dismiss command-line tools because they seem archaic or difficult to use; in the long term the investment in them will be very beneficial.
As for a more concrete answer, I would suggest ParaView http://www.paraview.org/ and other VTK-based tools for visualisation of multidimensional fields (e.g. tensor ones) over complex 2D or 3D domains (also for time-dependent data).
Question
I was wondering if anyone could help me with a question regarding the assumptions for a statistical test I am running as part of a manuscript that I am revising now for publication (after receiving initial comments by the reviewers).
Basically, I recorded voices of men and women in two languages, responding to both men and women who were categorised either as attractive or unattractive, and I am analysing 4 different acoustic parameters of their voices. So, it is a 2 x 2 x 2 x 2 mixed design MANOVA, for which I have 2 between-subject variables (sex of the speaker and language) and 2 within-subject variables (target sex and target attractiveness), all with two levels, and 4 outcome variables. In total, I have 110 participants (30 men and 30 women in one language, and 25 men and 25 women in the other), and 4 observations per participant for each one of the 4 outcome variables. Therefore the degrees of freedom are 4, 103 for multivariate results, and 1, 106 for univariate.
Because all independent variables have only two levels, sphericity is not an issue. However, I have not been able to find clear information about multivariate normality and the variance-covariance matrices for a mixed design like mine, or how robust the test is to violations of these. I can run a Box's test, but it seems that to be able to interpret the results it is essential to test multivariate normality, which apparently is not possible in SPSS (which is what I normally use). Moreover, all the information I have found seems contradictory regarding the importance of group sizes and how they affect the robustness of the main MANOVA to possible violations of the assumptions.
As you can see, I am very confused. I would appreciate any advice.
You are attempting to do a lot with a small sample. Have you performed any type of power analysis? Are you going to use a Bonferroni correction?
If I had 4 outcome variables, I would. There are several good articles on Monte Carlo studies demonstrating that univariate ANOVA is robust to violations of the normality and homogeneity-of-variance assumptions. There must be some multivariate equivalents; if you can't find any, reference the univariate studies and explain why it is likely that those findings generalize. I think you are using a much too complex design with too many outcome variables; the interactions will be tough to interpret. Good luck.
Question
Epitope prediction have many tool online, but which one will be a generally suitable for polycolone antibody production (maybe for western blot)?
I second that NetMHCII can be good as well. It depends whether your interest is class I or class II. Class II predictions are still developing and are at times difficult to get right.
Question
Mixture design of organic solvents
The Simplex-Lattice can estimate a full cubic model, whereas the Simplex-Centroid cannot - it can estimate the special cubic. Thus, the Simplex-Lattice would only be better if you suspect that the response(s) you are measuring might require a full cubic.
Question
I have recently conducted a randomized experiment. I have one major (in my opinion) finding and potentially two more interesting findings. What would be a better strategy to get this into these journals: A) Simply focus on the main finding, B) Discuss the main finding and then add the other two as added complexity, C) Advance all three hypotheses together in the lit review section, and then test all three and present three findings together.
Hello,
I'd prefer option C, assuming that you created a priori hypotheses and did not alter them based on your results. The main finding that you propose should be given special credit, but the supplementary findings may be deemed as (if not more) important than the former. You could also review the latest issues of the journals you are looking to publish in, to determine their trends in stating hypotheses in the introduction and discussion sections. Hope this helps :)
Question
I am using backstepping position control for an electrohydraulic servo system. I am using a dSPACE DS1005 board and Real-Time Workshop to develop the experimental setup; a 4/3-way servo valve and an asymmetric cylinder are also used. The simulation works very well, but during the experiment the extension case shows noise and disturbance while the retraction case works very well.
I think you should go back and check/validate the model first.
Question
I am carrying out a three-factor ANOVA to test for significant differences between three factors: site, shore and station. Now I am finding interaction terms that are significant, either between site x shore or even among all three factors. Though conditions within a site vary from other sites, I see no way to explain an interaction between site and shore type... In such cases, are interactions left uninterpreted?
Even if sites were alike one would still expect random differences between them (due to other sources of variation than those accounted for) - that is exactly why replicates are required in any scientific study.
Question
I have to measure vibrations of blades fixed to the rotor. Unfortunately I have a very limited budget. We've just started the project and tried to get some funds. At present I have to prepare the measuring system on the rotor with diameter of ca. 5m. What is the cheapest way to do this? The rotating system can rotate with various angular velocities in the range from 0 to 1000 rpm.
First, look at the static frequency response of your device (without the dynamic loads due to rotation): a simple measurement with one accelerometer and an impact hammer can give you helpful results about your structure's behaviour. However, dynamic loads due to rotation should not be neglected.
For this kind of measurement, you can use a cheap accelerometer mounted on the axis bearings and correlate the measurement with a trigger given by a (white) mark on the axis, detected by an optical sensor. This will give you a good idea about the rotor's out-of-axis movement, which you may correlate to blade vibrations.
You may also use piezoelectric film (PVDF) stuck directly onto the blades and measure the blade vibration, but you will then need a rotating contact (with all the problems it may cause).
The best, however anything but cheap, option: use a Polytec laser scanning system with a de-rotator and you will measure the blade vibrations directly. It is worth the price of several hundred thousand dollars... if you get the funding. You may also rent such a test device.
Question
Effects of ambient light; How to design experimental device?
Hi Liu
First use a hand-held NIR device that is itself light-compensated. Then select a tree having fruit facing different directions (N, S, E, W) and under full light and partial light. Tag those fruits and mark the equatorial region of each fruit for NIR scanning. I suggest scanning the fruit once it reaches a scannable size, then once a week until maturity, and thereafter twice a week. During each scanning, take five representative fruits (not the tagged ones) into your lab for destructive analysis. Scan those destructive fruits and measure the quality attributes of interest, e.g. DM, colour, pigment, acid, firmness etc. Then use this sample set to develop a calibration and use it to predict the quality of on-tree fruit later.
Cheers,
Question
I have six treatment groups (different cell types) with four biological replicates per group that I would like to compare protein levels in. I have come up with several experimental designs, each with pros and cons. For my first design, I thought to run four separate gels, each containing one lane of each treatment group. For a second design, I would run six different gels, one for each cell type, with one lane per biological replicate. The trade-offs between the two are inter-gel variability and culture issues (different growing times, passage numbers, etc.).
Does anyone have any recommendations or thoughts on what the best way to go about this would be?
Good question!
Go with the first design! In this way you can compare groups and consider the gels as replicates.
Suggestion: load one common sample on all gels to see how consistent your runs are.
Question
For instance, in a 2x2 factorial experiment with 4 treatments, what minimum number of animals can be used for each treatment?
It depends on the variability of the response variables evaluated. In general I would suggest a minimum of 10 animals per treatment, i.e. 40 animals.
Question
I'm working with a colleague on series of experiments treating mice with compound A to see if their condition improves. Baseline behavior tests were administered to determine the difference between WT and diseased mice, and then each genotype was to be separated, 50% for no treatment, 50% for treatment with compound A. My initial reaction was to randomly assign the mice to either treated or not treated based upon drawing mouse numbers out of a hat.
Instead, my colleague analyzed the baseline tests and decided which mice would be treated or not treated based upon these tests in an effort to "ensure accurate representation of all baseline levels". Now, ignoring the fact that I would have rather done this completely blind, and had someone else randomly assign them, is this an acceptable way of deciding on treated vs. untreated mice? I'm not 100% comfortable with it, so I'm curious what the scientific community at large feels.
I see two possible approaches here.
1. Randomly assign the animals and then use the baseline data as a covariate in an analysis of covariance when you have collected your final data. That will test whether the baseline affects the final measurement and also correct for it.
2. Match the animals on the baseline data, then assign matched animals to the treatments at random. For example, if you have two treatment groups (say control and treated), you should look for pairs of individuals with the same baseline value and assign them at random to the two treatments. This then becomes a randomised block experimental design, and it should be analysed by a two-way analysis of variance without interaction (I am assuming quantitative data). These designs are described more fully at www.3Rs-reduction.co.uk.
Michael
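The matched-pair approach (option 2 above) can be sketched like this; the mouse IDs and baseline scores are hypothetical:

```python
import random

def matched_pairs_assign(baseline, seed=7):
    """Rank animals by baseline score, pair adjacent animals, and
    randomly assign one of each pair to control, the other to treated."""
    rng = random.Random(seed)
    ranked = sorted(baseline, key=baseline.get)
    control, treated = [], []
    for i in range(0, len(ranked) - 1, 2):
        a, b = ranked[i], ranked[i + 1]
        if rng.random() < 0.5:      # coin flip within each matched pair
            a, b = b, a
        control.append(a)
        treated.append(b)
    return control, treated

scores = {"m%d" % i: s for i, s in enumerate([12, 15, 9, 22, 18, 11, 20, 14])}
control, treated = matched_pairs_assign(scores)
print(control, treated)
```

Pairing adjacent ranks balances the baseline between groups, while the coin flip keeps the assignment itself random, which is exactly the randomised block idea.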
Question
When optimized through a DOE algorithm, the parameter of interest lies outside the range of the DOE dataset, indicating that there is an interaction effect between variables. What is the simplest regression method if we know the correlation between the individual variables that are part of the input parameters?
Kateryna - a very interesting question, and I am afraid I don't have a definitive answer for you but instead a question - does DOE stand for Department of Energy, or is it a particular data type/analysis?
Question
Please let me know a location for a ball mill.
Do you know about a sintering facility for silicon nitride?
Question
I'm analysing the interaction between a plough tool and the soil. I have a DOE for this analysis, but my response is a matrix.
Hm, your problem may be resolved in two ways:
a) modelling the matrix components as separate output variables; be careful! the tensor must be symmetric, i.e. you have only six independent components;
b) modelling some scalar surrogate, e.g. the HMH criterion (Huber-Mises-Hencky), as the output variable.
An error estimator may be created by two approaches:
a) assume that your model is true and the variance is spatially constant; then the mean residual error is a good enough estimator of the variance;
b) try a Monte Carlo approach: jackknife (systematic subsampling) or bootstrap (massive random resampling); if you select this, look at the book by Shao and Tu, "The Jackknife and Bootstrap".
Random perturbation is a good approach, but: what is the random matrix added to? And why N(0,1)?
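A minimal bootstrap sketch of approach (b), estimating the standard error of a statistic by massive random resampling (the data values are illustrative):

```python
import random
import statistics

def bootstrap_se(data, stat=statistics.mean, n_boot=2000, seed=3):
    """Resample the data with replacement n_boot times and return the
    spread of the resampled statistic as its standard-error estimate."""
    rng = random.Random(seed)
    values = [stat([rng.choice(data) for _ in data]) for _ in range(n_boot)]
    return statistics.stdev(values)

data = [2.1, 2.4, 1.9, 2.8, 2.2, 2.6, 2.0, 2.5]  # e.g. model residuals
se = bootstrap_se(data)
print(round(se, 3))
```

The same resampling loop works for any statistic, which is the appeal of the bootstrap over analytic error formulas.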
Question
Can this be done? And if so, could someone supply the design please?
Refer to this paper: Sung H. Park & Kiho Kim (College of Natural Sciences, Seoul National University, Korea), "Construction of central composite designs for balanced orthogonal blocks", Journal of Applied Statistics, Vol. 29, No. 6, 2002, pp. 885-893.
Question
When using stratified sampling to build focus groups, it may be harder to find groups with certain qualities (in my case a certain style of religiosity). Sometimes it is even necessary to boost samples. Of necessity, I have been unable to find equal numbers of each instance of religiosity style - I have held 8 focus groups with one religiosity style (heritage) but only one with a contrasting style (convert) as the latter is quite rare. To what extent can I base conclusions on the contrast between the two types of group - and how would I qualify my findings in a write up?
Use multistage quota sampling
Question
The lack of experimental details seems to be a problem for most posts I visit to try to offer advice. There needs to be a policy or standard set of information that is offered up to get proper advice. Check out some of the Q&A spaces in the computer science realm to see what I mean (e.g. Stack Overflow). The majority of replies to a post seem to always be clarification questions, and then a few people take a stab at an answer.
I sort of understand most people's reluctance to provide details, as they fear being scooped, but if you want help you have to give a sufficient amount of detail. Most problems stem from experimental artifacts or a fundamental misunderstanding of biological principles. For example, you can hide the name of a gene but should provide functional details, such as whether it is a transcription factor or a kinase. I think the name of the sequencing platform or details on how a library is prepared are very pertinent pieces of information that will not divulge your research to the world.
The posts on this system are too much like Twitter. That's fine for quips or to point people at a resource/news article/event, but not okay for any serious conversation.
Totally agree with the feeling. Many questions on Research Gate are of very poor quality (no background or goal/objective information, short and imprecise question). It reduces the overall quality of the entire site.
Guidelines, distributed to all members upon registrations AND shown upon adding a question or answer (as suggested by Stuart Jenkins) would be great.
Another thing that I have seen make a difference is for members to politely tell the OP that their question is poorly formed and would benefit from more context, goals/objectives information, and being formulated at length, with care for important details.
My suggestion is that you create a template in a text file and, any time you think the OP deserves the speech, copy-paste it as an answer, adding appropriate information about the OP (name...) or the question itself.
This is all a question of forum naivety or newbie-ness. You can't just state: "Please be an advanced and well-informed forum user". You have to explain at (some) length what that means and what is expected of a forum member.
Question
I work on the optimization of surveillance network for an insect ecosystem. I'm finding out some models or methodologies for this study. If you have some experiences, please share me some references or links?
Hello. It seems to be an interesting problem. I'm sorry my specialty is not in this field, but I do have experience in optimization. A detailed mathematical model of an ecosystem is very difficult, but I think a simple predator-prey model may be a good choice for preliminary analysis.
For an optimization problem, you must identify the performance index along with the optimization variables. The basic principle is to find the optimal variables in order to optimize (minimize or maximize) the performance index. However, I'm not sure what you are going to do. Maybe I can give more advice if you can offer some more detail. Thank you!
Question
For example, when the nature of the experiments indicates that it is highly unlikely that the random component of the observations will affect the estimates of the factor effects.
Question
I am planning to conduct experiments with composites of different proportions of Si3N4 and hBN, to investigate the effect of the proportion on the tribological behaviour of the composite. What should the design of experiments be?
How do I carry out optimization of the wear rate with different contents of Si3N4 and hBN? Which method would be suitable?
Question
Dear friend: S/N ratio analysis is the most suitable technique for finding the sensitivity of beta. You can apply an orthogonal nested type of design for the analysis and interpretation; if the sensitivity is >70, then your experimentation and statement are correct.
Question
The dependent variable should be operationally defined in measurable terms. As such they should be characterized as reliable and valid. Could someone clarify these concept?
Reliability shows the extent to which a variable (or a set of variables) is consistent in what it measures. Validity relates not to what should be measured but to how it is measured; that is, validity is how well a measure (or a set of measures) reflects the concept of the study, with the study free from any systematic or non-systematic error.
In other words, reliability refers to the consistency of the measure or set of measures, while validity refers to how properly the concept is captured by the measure(s).
Kind regards
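Reliability-as-consistency is often quantified with Cronbach's alpha; a minimal sketch in plain Python (the questionnaire items and scores are hypothetical):

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, same respondents in each.
    alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var = sum(variance(it) for it in items)
    return k / (k - 1) * (1 - item_var / variance(totals))

# three hypothetical questionnaire items, five respondents
items = [[3, 4, 2, 5, 4],
         [2, 4, 2, 5, 3],
         [3, 5, 1, 4, 4]]
print(round(cronbach_alpha(items), 3))  # prints 0.92
```

Perfectly consistent (identical) items give alpha = 1; uncorrelated items push alpha toward 0, which is what "consistency of the set of measures" means operationally.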
Question
As happens in some disciplines, a researcher conducted an experiment without having designed it beforehand. The experiment consisted of exposing 4 different groups of mice to two types of food (simultaneously measuring preference and the amount ingested), and each group was in turn exposed to different conditions.
What would be the appropriate statistical procedure?
The problem I have is the dependence in the preference measurements.
As you are well aware, any analysis made on relationships seen or hypothesis made a posteriori requires the most conservative statistical analysis to exclude investigator bias. So if you are going to use a 2-way analysis of variance that Mark Perry has proposed for repeated measurements over time and within groups (and I agree), it behooves you to use the most conservative post hoc analysis available such as the Bonferroni method (J M Bland, D G Altman. Multiple significance tests: the Bonferroni method. BMJ 1995; 310(6973):170).
I have a bias that I need to express. As statistical packages evolved and continue to evolve, there has been a knee-jerk tendency to quote the name of the package (SAS, SPSS, etc.) in the methods section of a paper. Understand this: it is far more important to state the statistical method used than how you calculated it (on your computer, a hand-held calculator, an abacus, or your fingers and toes). Just a suggestion from a reviewer who has tried to hammer this concept home for years.
Question
Friedman and Sunder defined experimental data as "data deliberately created for scientific (or other) purposes under controlled conditions", and laboratory data as "data gathered in an artificial environment designed for scientific (or other purposes)." Based on these definitions, I would like to know if experimental data are in anyway different from laboratory data. Where can the boundary be identified if they actually differ?
I guess some hints can also come from the use of data generated by computer software (e.g. the atomic coordinates of an atomic assembly).
Astonishingly, those can be considered "experimental data". Can they be considered laboratory data? It is a matter of the meaning of the words: a computer program is in any case an "artificial environment" (even if not a physical one), and we "gather" the data from a computer, even if that is different from "recording" them.
Question
What is difference between standard method and S/N method in data analyzing?
Dr. Taguchi of the Nippon Telegraph and Telephone Company, Japan, developed a method based on "orthogonal array" experiments, which gives a much reduced variance for the experiment with optimum settings of the control parameters. Thus the marriage of design of experiments with optimization of control parameters to obtain the best results is achieved in the Taguchi method. But the Taguchi method has some errors and inconsistencies; I suggest you use more efficient modern methods.
Question
Whenever individuals or groups of scientists plan an experiment, small or large, there is a bias in the estimated accuracy or 'outcome' which more often than not favours high accuracies. This 'higher than before' accuracy is often used as the driver for funding and scientific acceptance of the proposal. How does our subjective confidence (being higher than our objective accuracy) influence our sciences? Is it a positive or negative influence? Do we actually achieve more this way? Is there a balance between overconfidence, optimism and actual achievement?
Well, seeing that there are no answers yet, consider Gravity Probe B as an example, or the Hubble Space Telescope. Long in the making, expensive, with subjective confidence driving the science targets. Results were (at least initially) not as expected, requiring more time and more funding, driven once more by overconfidence and optimism. The outcomes were fairly favourable in the end... with a slightly dubious taste in the mouth at one end and an empty pocket at the other. Without the over-optimism, neither of these very commendable projects would have been supported in terms of funding. Does the collective memory of these projects influence the outcome of future projects? Probably...
Question
Assume that we have 8 factors and want to design experiments using a 2-level fractional factorial design. At the same resolution, the 1/8 fraction and the 1/16 fraction can require a similar number of runs:
37 runs with 1 replicate (32 + 5 center points)
42 runs with 2 replicates (16×2 + 5×2 center points)
Given that we can run no more than 42 experiments, which gives the better estimate: the design with replication or the one without?
Dear Selvaraju,
In the link below it is argued that there is no need to replicate a fractional factorial design; replication does not make much sense here. Instead, if resources are available for more runs, it makes much more sense to use those runs for a larger (less heavily fractionated) design.
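The trade-off above can be made concrete. A 2^(8−3) design (the 1/8 fraction, 32 runs) runs a full 2^5 factorial in five base factors and defines the remaining three as interactions of the base factors; the generators below (F = ABC, G = ABD, H = BCDE) are one illustrative choice, not the only one. A sketch in Python:

```python
from itertools import product

# Full 2^5 factorial in base factors A..E (32 runs, coded -1/+1).
runs = []
for a, b, c, d, e in product([-1, 1], repeat=5):
    f = a * b * c       # generator F = ABC (illustrative choice)
    g = a * b * d       # generator G = ABD
    h = b * c * d * e   # generator H = BCDE
    runs.append((a, b, c, d, e, f, g, h))

assert len(runs) == 32  # a 1/8 fraction of the full 2^8 = 256 runs
```

Each of the eight coded columns is balanced and pairwise orthogonal, which is what makes the main effects estimable from only 32 runs.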
Question
There are many methods, such as 2-level factorial, central composite design, Taguchi, and so on. The question is: which design is most suitable for DOE on concrete?
Factorial and central composite designs are for independent variables only. Taguchi is for the case where you want to cope with noise in the variables, e.g. in electronics.
As long as you want to optimize the proportions of the concrete components (cement, sand, water) in order to improve, say, its 28-day strength, you need DOE for constrained mixtures. It may be combined with some independent variables, e.g. the quantity of reinforcement. Mixture, because you have to vary the components of the mixture (cement, sand, etc.); constrained, because 0 % cement, for instance, makes no sense. See:
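For the constrained-mixture case, a common starting point is the {q, m} simplex-lattice design, which takes all blends whose proportions are multiples of 1/m and sum to 1. A small generator in Python (q = 3 components and m = 3 are just example values; lower-bound constraints such as a minimum cement fraction would then be imposed on top of this candidate set):

```python
from itertools import product
from fractions import Fraction

def simplex_lattice(q, m):
    """All q-component mixtures with proportions k/m summing to 1."""
    points = []
    for counts in product(range(m + 1), repeat=q):
        if sum(counts) == m:
            points.append(tuple(Fraction(k, m) for k in counts))
    return points

# A {3, 3} lattice: C(3+3-1, 3) = 10 candidate blends, from pure
# components (1, 0, 0) down to the 1/3-1/3-1/3 centroid.
design = simplex_lattice(3, 3)
```

Exact fractions are used so the proportions sum to 1 without rounding error.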
Question
What would be a good behavioral task to test rats' sensory discrimination difficulties?
It should be acceptable to use a psychophysical auditory discrimination task requiring a bar press for reward (food or water). I'm still not sure what the hypothesis is...
Question
I'm using a design-of-experiments method, specifically the Box-Behnken matrix. Now comes the time to plot some figures to see the effects of the three factors, and I started to do it as in this paper:
The thing is that when plotting 2 of the 3 factors, there are two results for the same point, given the constitution of the matrix. Matlab automatically averages those two values, but I wonder whether that is really appropriate, as the third factor has an effect on the response. So is this averaging method still valid for analyzing the effects?
Can you suggest another method which could be better?
Hi Jonathan,
sounds like you need to reshape your data. I'm not exactly sure I understood your problem correctly, but I imagine your data is formatted like
x1 --- y1 --- z11
x1 --- y2 --- z12
x1 --- y3 --- z13
x2 --- y1 --- z21
and so on?
If yes, try the following:
nx = numel(unique(x));   % number of distinct x values
ny = numel(unique(y));   % number of distinct y values
X = reshape(x, nx, ny);  % rebuild the grid from the column vectors
Y = reshape(y, nx, ny);
Z = reshape(z, nx, ny);
pcolor(X, Y, Z);         % pseudocolor plot of z over the (x, y) grid
does that help?
Question
My goal is to detect differential gene expression in two plant biotypes. The genes involved in hormone biosynthesis and regulation are what I'm most interested in. Which is the best way to do this on a limited budget? Should I use a smaller sample size with higher sequencing depth (100 bp PE), or a larger sample size with lower sequencing depth (50 bp SE)?
Hi Shu Chen,
You should sequence at least 3 to 5 samples (biological replicates) from each biotype. This is a must. Without replicates you lose power and you won't be able to tell whether the differences you find are the effect of the biotypes or of normal variation between individuals.
Target a sequencing depth of about 30 million reads per library and use barcodes to multiplex samples. At current HiSeq 2000 throughput, one lane should produce about 180 million reads, so you could run 6 samples per lane with multiplexing.
If you have to reduce costs, choose 50 bp single-end reads. However, if there is no genome or good transcriptome to map your reads against, you will have to sequence 2×100 bp to get a decent de novo assembly.
Regards,
Gustavo
Question
This question needs to be resolved, seeing that otherwise it is impossible to include qualitative attributes in a pivot design. I have so far not tracked down a single article that has done so (they all use only quantitative attributes, which can simply be coded linearly).
I would think so.
Question
I've got some data from an experiment in which participants were asked to complete a task (i.e. placing a group of objects onto a target using a tool) as fast as they could with the minimum number of errors (dropped objects). As time is dependent on error (dropped objects cannot be picked up again, so the experiment terminates earlier than if all objects were placed without error), I would like to combine the time and error data into a single figure by assigning a time penalty to each error, but I do not want to select this number arbitrarily. Are there any classic methods for determining what this penalty should be? I imagine it will be some combination of maximum/minimum/average time and errors.
Any suggestions appreciated.
Mind that your times won't be normally distributed, so simply averaging is not a good idea. There's another conceptual problem with analyzing error data: if someone does not correctly complete the task, what did they actually do? Are there different, well-enough-defined error modes, etc.?
I'd have a different suggestion. You might treat your data as censored: if someone dropped the ball at time x, they didn't handle it successfully up to that time. In fact, I'm working right now on a tool to analyze time data in such situations; it would be interesting to try it on your case. Can we get in touch via mail? (myfirstname.lastname@affiliation.com)
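Treating the times as censored leads naturally to a Kaplan-Meier (product-limit) estimate of the probability of surviving (not yet dropping) past each time. A minimal estimator in Python (the demo data are invented; libraries such as lifelines implement this with confidence intervals and tests):

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.
    times: observed times; events: 1 if the event (e.g. a dropped object)
    occurred at that time, 0 if the observation was censored."""
    curve = []
    s = 1.0
    # The survival curve steps down only at distinct event times.
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for ti in times if ti >= t)
        d = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        s *= 1.0 - d / at_risk
        curve.append((t, s))
    return curve

# Invented data: drop times in seconds; the 0 marks a censored trial
# (the participant finished without dropping by 15.3 s).
curve = kaplan_meier([12.1, 15.3, 9.8, 20.0], [1, 0, 1, 1])
```

The censored trial still contributes to the at-risk count up to its observation time, which is exactly the information plain averaging throws away.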
Question
The yield of a chemical product (Y) is a function of the concentrations of 3 ingredients (X1, X2, X3), temperature (T), and pH. I want to find a fitted model which relates Y to all the factors (X1, X2, X3, T, pH).
* T and pH are adjusted independently in each experiment (they do not depend on the amount (or concentration) of the ingredients or on each other).
** I want to investigate each factor at 3 levels (3 concentrations per ingredient, 3 temperatures and 3 pH values).
*** There is no linear relation between Y and the factor levels.
I suggest using a neural network model for this data fitting. I can help you if you wish.
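Before (or alongside) a neural network, a full second-order polynomial response-surface model is the usual baseline for three-level factors, since it captures curvature and two-way interactions. A sketch with numpy least squares (the five factor columns and the response function below are synthetic, invented purely to show the mechanics):

```python
import numpy as np
from itertools import combinations_with_replacement

def quadratic_design_matrix(X):
    """Second-order model columns: intercept, main effects, and all
    squared and two-way cross-product terms."""
    n, p = X.shape
    cols = [np.ones(n)] + [X[:, i] for i in range(p)]
    for i, j in combinations_with_replacement(range(p), 2):
        cols.append(X[:, i] * X[:, j])
    return np.column_stack(cols)

# Synthetic 3-level coded data for 5 factors (X1, X2, X3, T, pH);
# the "true" response is invented for illustration only.
rng = np.random.default_rng(0)
X = rng.choice([-1.0, 0.0, 1.0], size=(60, 5))
y = 2.0 + X[:, 0] - 0.5 * X[:, 3] ** 2 + 0.3 * X[:, 1] * X[:, 4]
coef, *_ = np.linalg.lstsq(quadratic_design_matrix(X), y, rcond=None)
```

If the quadratic fit leaves large lack-of-fit, that is the point at which a neural network (or another flexible learner) starts to earn its extra complexity.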
Question
I've been observing my liquid sample under an optical microscope and found that the digital picture shown on the computer is obscure and lower in quality than the actual picture seen through the eyepiece. I assume this is common to all microscopes. How can I capture this eyepiece picture to a video?
If you can see something with your eye, you can usually take a picture of it with a camera if the camera lens is small enough. Your best option might be a compact camera whose objective you place at the exit pupil of the eyepiece. You will probably have to use a long focal length and focus at infinity.
Generally though, the standard camera mount (typically at the top of the microscope, imaging the intermediate image) should give you a high quality image. What is the make and model of your microscope? Is the CCD chip of your camera in good shape?
Question
We have isolated genomic DNA from three biological replicates (3 different petri plates of filamentous fungi) of our samples (2 treatments and a control). We are then proceeding to qPCR, examining the effect of our chemical treatment on starting copy number available for amplification. In order to optimize the assay, I would like to run a standard curve to determine our reaction efficiency (I believe this is also a suggestion for high-quality data from MIQE). I am looking at a 5-point serial dilution with three technical replicates of each point, taken from my control sample's gDNA. But how do my biological replicates play into this? Do I pick just one at random, or do I have to run a standard curve for each bio-rep and hope that all three are statistically similar? And if the latter is the appropriate course, then I won't be able to run my standard curves and samples on the same plate (I'm limited to 48 wells). In such a case, would it be appropriate to run my samples immediately after my standard curves?
I'm new to this, so I appreciate any advice!
Struggling with understanding and calculating qPCR amplification efficiency? This article breaks it down in an easy and comprehensible way, and also provides a treasure chest of resources that will take your qPCR game to the next level:
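For the standard curve itself, amplification efficiency is conventionally estimated from the slope of Cq versus log10(template amount): E = 10^(−1/slope) − 1, where a perfect assay doubles each cycle (slope ≈ −3.32, E = 1.0, i.e. 100 %). A sketch in Python with an idealized dilution series (the numbers are synthetic):

```python
import math

def efficiency_from_curve(log10_amounts, cq_values):
    """Least-squares slope of Cq vs log10(amount), then
    E = 10**(-1/slope) - 1  (E = 1.0 means 100 % efficiency)."""
    n = len(log10_amounts)
    mx = sum(log10_amounts) / n
    my = sum(cq_values) / n
    slope = (sum((x - mx) * (y - my)
                 for x, y in zip(log10_amounts, cq_values))
             / sum((x - mx) ** 2 for x in log10_amounts))
    return 10 ** (-1.0 / slope) - 1.0

# Ideal 5-point 10-fold dilution series: Cq rises by log2(10) ~ 3.32
# per 10-fold dilution when the template doubles every cycle.
logs = [0, -1, -2, -3, -4]
cqs = [20 + (-x) * math.log2(10) for x in logs]
```

Efficiencies far outside roughly 0.9 to 1.1 usually mean the assay should be re-optimized before comparing samples.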
Question
I am doing a cell experiment on OP9 cells. My sample turns into a suspension when I dilute it with the media. Would it be all right to use that sample even though it is a suspension?
Question
I want to analyze the gas evolved in minute amount (<0.5 ml). It is also necessary to store multiple gas samples.
Please use an airtight gas syringe for sample collection and storage.
Question
I am planning an experiment layout of random sampling using a quadrat of size x over an entire field, rather than individual plots. Does anyone know of any good material that can support my choice of experimental design?
An Evaluation of Plotless Sampling Using Vegetation Simulations and Field Data from a Mangrove Forest
Question
I'm looking for a way to inspect the flow behavior of vapors and its influence on my chemical system.
Question
How can one determine the coefficient of discharge for oscillating helium flow from 22 bar to 8 bar? The oscillation is produced by a rotary valve which generates a sinusoidal pressure waveform to create an oscillating flow. Which instrument would be helpful? A Coriolis meter could be useful, but I do not know whether it is suitable for a 22 bar to 8 bar pressure variation with to-and-fro motion of the gas.
In fact, there is a variety of distinct flow-metering technologies, each with its respective advantages and disadvantages.
The fact that you work with pure He is an advantage for thermal technology, which depends strongly on knowledge of the fluid properties.
On the other hand, thermal meters are usually less sensitive to flow variations because of their operating principle itself: the sensor would have to warm and cool at the same frequency as the flow.
I confess that I am not a specialist in thermal technologies, so I suggest consulting a supplier as well.
My suggestion of a PVT or gravimetric method refers to a customized flow-calibration procedure of your own, taking into account these particular pressure fluctuations and the meter installation.
Unfortunately, I don't have any literature in English, but I am sure it will not be hard to find some, since these are widely known calibration procedures.
Question
I would like to know the best and best-known software able to do experimental design, specifically OPTIMAL design.
Thanks! How can I get this software? Can you suggest some sources where I can obtain it or access a demo?
Question
We have the sequence of an siRNA we have used to knock down a gene of interest. Rather than keep ordering it, we converted it to an shRNA and cloned it into the pU6YH vector.
Cloning worked and cells are transfected, but there is no knockdown relative to the siRNA control. Can all siRNAs be converted to shRNAs? We have siRNA-resistant constructs that we do not want to re-design. Could the sequence/length of the loop make a big difference (we are using one that worked for someone else)?
Some commercially available siRNAs are modified (e.g. 2'-O-methyl) to "force" the antisense strand to be recruited by RISC; also, many siRNAs may start with C or T as the first nucleotide of the sense strand. However, the U6 promoter for shRNA expression prefers G as the starting nucleotide, and H1 is more flexible (it may prefer A or G). Overall, the siRNA-to-shRNA conversion may not be 100 % compatible, but it is doable.
Question
.
1. when complete randomization of certain treatments is constrained logistically, e.g. tillage, irrigation, etc.
2. when we need to introduce new treatments into an already-established set of treatments.
According to popular belief, the split-plot design is preferred for factorial experiments. But I believe that, since the accuracy of estimation of one factor (the main-plot factor) is compromised in this design, it is better to use an RBD with the treatments arranged in factorial combination. It will serve the purpose better without compromising the precision of any factor's estimate.
Last term we ran a monopoly experiment relating to Nelson, R. & Beil, R. (1994).
Question
We made new findings and developed enhancements. Which journal might be interested?
Question
I am wondering what possibilities there are for testing whether one factor in an experimental design is more important than another, importance being measured as the amount of variance explained. I think testing the regression coefficients should work. Do you have any other ideas? Or do you know of research in which such a regression approach has been used?
Christoph, there is a statistical technique that effectively supplements regression analysis, called "relative importance analysis". Basically, traditional regression cannot effectively compare the relative importance of multiple predictors because correlated predictors (by definition) explain some degree of shared variance with one another. Relative importance analysis applies a transformation to the predictors in your equation to make them orthogonal to each other. This allows the researcher to examine the independent amounts of variance explained by each predictor and rightfully conclude which predictors are "more important" than others within the current sample.
Scott Tonidandel & James LeBreton have a website which probably does a better job of explaining the procedure than I ever could and also includes a very helpful macro to help you conduct the analysis. You can find it here: http://relativeimportance.davidson.edu/
There are also several recent papers written on this topic from these two researchers in journals such as the Journal of Business and Psychology.
Hope this is of some help to you.
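The transformation described above can be sketched directly: replace the predictors by their closest orthonormal counterparts (via the SVD), regress the outcome on those, and map the explained variance back to the original predictors; the resulting weights are non-negative and sum to the model R². A sketch of Johnson-style relative weights in Python with numpy (the demo data are synthetic; the davidson.edu macro linked above is the authoritative implementation):

```python
import numpy as np

def relative_weights(X, y):
    """Johnson-style relative weights: per-predictor shares of the model R^2."""
    n = len(y)
    # Scale so that Xs'Xs is the predictor correlation matrix and ||ys|| = 1.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1) / np.sqrt(n - 1)
    ys = (y - y.mean()) / y.std(ddof=1) / np.sqrt(n - 1)
    P, delta, Qt = np.linalg.svd(Xs, full_matrices=False)
    Z = P @ Qt                        # closest orthonormal counterpart of Xs
    lam = Qt.T @ np.diag(delta) @ Qt  # so that Xs = Z @ lam
    beta = Z.T @ ys                   # regression of ys on the orthonormal Z
    return (lam ** 2) @ (beta ** 2)   # non-negative weights summing to R^2

# Synthetic demo with a deliberately redundant third predictor.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:, 2] = 0.7 * X[:, 0] + 0.3 * rng.normal(size=200)
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=200)
weights = relative_weights(X, y)
```

Unlike raw squared betas, the weights partition the full R² even when predictors are correlated, which is the whole point of the method.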
Question
I am measuring the level of a protein in different genetic backgrounds.
For example, I am measuring the level of protein X in the wild type and the atg8a mutant.
I ran both the wild-type and mutant samples on the same gel and probed initially for protein X, followed by actin (loading control).
Bands were analyzed using analysis software (TotalLab).
Then I did a couple of normalizations:
1) The level of protein X was normalized to the respective actin control.
2) The control (wild-type) condition was normalized to 1 and all other experimental conditions were compared to this.
Following is an example of what I have done.
| SDS fraction | Vol. of protein X | Vol. of actin |
| --- | --- | --- |
| Wild type | 695432.72 | 174080.04 |
| atg8a mutant | 948245.24 | 61598.79 |
1) Normalization: the volume of protein X in lane A divided by the volume of actin in lane A.
That gives: wild type = 3.99
atg8a mutant = 15.39
2) Relative protein level in relation to the wild type: divide each sample, including the control, by the control (wild type).
That gives: wild type = 1
atg8a mutant = 3.85
I repeated the experiment two more times and analyzed the data as described above. The other two experiments also follow the same trend (more protein in the mutant). Since these data are from three independent experiments (three different westerns), how should I apply the statistics?
Should I use the values normalized to actin (the first values above, 3.99 and 15.39) or the values normalized to 1 (1 and 3.85) for the analysis?
Should I do a paired t-test? Or a two-way ANOVA with experiment as a random factor and group (wild type vs. mutant) as a fixed factor?
Thank you very much. Any help will be appreciated.
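For what it's worth, the two normalization steps described above reduce to two divisions per lane, which can be sketched as follows (band volumes taken from the table in the question):

```python
# Raw band volumes from the example blot.
protein_x = {"wild type": 695432.72, "atg8a mutant": 948245.24}
actin     = {"wild type": 174080.04, "atg8a mutant": 61598.79}

# Step 1: normalize protein X to the actin loading control, lane by lane.
ratio = {k: protein_x[k] / actin[k] for k in protein_x}

# Step 2: express each ratio relative to the wild-type control.
relative = {k: v / ratio["wild type"] for k, v in ratio.items()}
# ratio    ~ wild type 3.99, atg8a mutant 15.39
# relative ~ wild type 1.00, atg8a mutant 3.85
```

A natural choice for the statistics is then a paired test on the actin-normalized values across the three blots (e.g. `scipy.stats.ttest_rel`), often on a log scale, since ratios are multiplicative.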