Hypothesis Testing - Science topic

Explore the latest questions and answers in Hypothesis Testing, and find Hypothesis Testing experts.
Questions related to Hypothesis Testing
  • asked a question related to Hypothesis Testing
Question
5 answers
Hi all,
I am trying to find the demographic association with a variable on customer engagement.
I need suggestions on two queries:
1. Is the alternative hypothesis statement correct?
2. Are the decision statements correct?
Alternative hypothesis statement:
There is a significant association of gender with customer engagement.
Decision rule:
1. The alpha value is taken as 0.05.
2. Reject the hypothesis if the p-value is less than or equal to alpha.
3. Fail to reject the hypothesis if the p-value is greater than alpha.
Please help.
Regards,
Uday Bhale
Relevant answer
Answer
The decision rule for a chi-square test of association is about the null hypothesis.
If you are looking to assess an alternative hypothesis, something like TOST (two one-sided tests) might be appropriate, but I don't know precisely how this would be applied in a chi-square test situation.
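As an illustration of the decision rule discussed here, a minimal Python sketch might look like the following; scipy is assumed, and the 2x3 gender-by-engagement table is invented for illustration only.

```python
# Hedged sketch: chi-square test of association between gender and an
# engagement category, applying the alpha = 0.05 decision rule from the
# question. The contingency table is made-up illustration data.
from scipy.stats import chi2_contingency

# rows: gender (male, female); columns: engagement level (low, medium, high)
table = [[30, 45, 25],
         [20, 40, 40]]

chi2, p, dof, expected = chi2_contingency(table)

alpha = 0.05
if p <= alpha:
    decision = "reject the null hypothesis of no association"
else:
    decision = "fail to reject the null hypothesis"
print(f"chi2 = {chi2:.3f}, p = {p:.4f}, dof = {dof} -> {decision}")
```

Note that rejecting the null here only supports the existence of *some* association; it does not by itself quantify how strong the association is.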
  • asked a question related to Hypothesis Testing
Question
9 answers
I am researching potential success factors for the internationalization of small and medium enterprises (SMEs) into emerging markets.
The identified success factor is "cooperation with local firms." As there is no existing literature on this specific topic (SMEs + Cooperation + Emerging Markets), I plan to use qualitative content analysis on data collected from expert interviews.
The schematic structure of the research will be as follows:
  1. Theoretical basics (Internationalization, SMEs, Emerging Markets)
  2. Theoretical basics (Cooperation) -> Ending with a research question (for example: "How important is cooperation, especially for SMEs, in emerging markets and what are possible reasons?")
  3. Results of the interviews -> Ending with a hypothesis ("It is an advantage for SMEs to cooperate with local firms to internationalize in emerging markets.")
Is this a reasonable approach?
Relevant answer
Answer
In your case, it is appropriate to adopt a comprehensive approach; the priority is to understand how SMEs cooperate. Starting from a non-essential hypothesis ("there is cooperation!"), you are not inventing this fact; your job is to understand deeply how this cooperation is possible. In this case, you can conduct a qualitative approach, or you can adopt a mixed approach.
Take a look at mixed research approaches, especially the exploratory sequential design.
  • asked a question related to Hypothesis Testing
Question
6 answers
Hello everyone,
I am currently doing research on the impact of online reviews on consumer behavior. Unfortunately, statistics are not my strong point, and I have to test three hypotheses.
The hypotheses are as follows: H1: There is a connection between the level of reading online reviews and the formation of impulsive buying behavior in women.
H2: There is a relationship between the age of the respondents and susceptibility to the influence of online reviews when making a purchase decision.
H3: There is a relationship between respondents' income level and attitudes that online reviews strengthen the desire to buy.
Questions related to age, level of income and level of reading online reviews were set as ranks (e.g. 18-25 years; 26-35 years...; 1000-2000 Eur; 2001-3000 Eur; every day; once a week; once a month etc.), and the questions measuring attitudes and impulsive behavior were formed in the form of a Likert scale.
What statistical method should be used to test these hypotheses?
Relevant answer
Answer
Go with a test of association (chi-square test).
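A minimal sketch of that suggestion, assuming Python with scipy and a handful of invented respondent records for H2 (age group vs. susceptibility to review influence), where each variable has been collapsed to categories:

```python
# Hedged sketch: build a contingency table from two categorical/ordinal
# variables, then run a chi-square test of association. All records invented.
from collections import Counter
from scipy.stats import chi2_contingency

ages = ["18-25", "26-35", "18-25", "36-45", "26-35", "18-25", "36-45", "26-35"]
influence = ["high", "low", "high", "low", "high", "high", "low", "low"]

age_levels = sorted(set(ages))
inf_levels = sorted(set(influence))
counts = Counter(zip(ages, influence))

# contingency table: rows = age groups, columns = influence levels
table = [[counts[(a, i)] for i in inf_levels] for a in age_levels]

chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

Since both variables here are ordinal, a rank-based measure such as Spearman's correlation could complement the chi-square test, which ignores the ordering of categories.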
  • asked a question related to Hypothesis Testing
Question
3 answers
I'm pretty new to statistics. I have a temporal dataset consisting of nitrate levels in water collected from 5 different locations. (In the long run, I want to correlate it with water pH in the region.) Based on box plots, it appears that nitrate levels at locations A and C are significantly higher than at the other spots in a given month. I would like to show, via hypothesis testing and p-values, that A and C contribute more to the total nitrate levels of that month than the other spots. Is there any hypothesis test or post hoc test to do that? (The data are not normally distributed.) Thank you.
Relevant answer
Answer
Dear Good Sirs,
Thank you very much for your information. Really appreciate your answers.
  • asked a question related to Hypothesis Testing
Question
3 answers
We are currently working on a research paper aiming to develop ink from organic waste. My research group mates and I are debating on what statistical test to use for our study. We want to see the effect of different particle sizes on ink characteristics such as viscosity, pH, drying time, erasability, density, etc.
Most of the related literature we found did not use a specific test, only graphic/tabulating and then describing the data they have obtained.
Relevant answer
Answer
1. Depends on the characteristic you are interested in.
2. Regression; the type depends on the characteristics of the dependent variable.
3. Hypothesis tests, depending on the samples involved: size, method of collection, etc.
The point I have been trying to make is that there are lots of possibilities, often depending on the particular question you wish to ask. There's no omnibus test or method that covers everything you asked about. A good reference for your questions is Montgomery, Design and Analysis of Experiments. There's no easy, simple answer to your question. David Booth
  • asked a question related to Hypothesis Testing
Question
8 answers
Given a time series with Events, I want to test whether events in two time series are occurring differently. 
See, for example, the attached image. There are 12 events (orange) between 2000 and 2007 with different lengths. Let's pretend these are drought periods in countries, or any kind of event that can last for a certain time, where multiple events can overlap (my actual dataset is returns, and the events are some kind of patterns).
I can simulate data to generate events. For example, I can simulate a weather dataset, and if certain conditions are met, it is an event that will last for some time (e.g. no clouds = no rain). Thus, I can generate as many datasets (under H0) as I want.
I want to check if there is any kind of "systematic occurrence" or if there is the same amount and distribution of events in the simulated datasets and actual dataset. 
I could use a simple t-test and test the average number of events from the simulated datasets against the actual one. But this would not account for the problem that the number of events could be the same, yet the events could be clustered (e.g. the twelve events in the actual dataset are always at the beginning).
Does anyone know of a similar problem with a solution, or any kind of test for this kind of problem?
My only idea is to split the tests:
(1) Test for the number of occurrences 
(2) Somehow test if the structure is different
Thanks,
Nico
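The split approach proposed in the question can be framed as a Monte Carlo test: compute a statistic on the observed series, recompute it on many simulated null series, and take the rank of the observed value as a p-value. A hedged sketch follows, with a stand-in null simulator and a simple clustering statistic (mean gap between event start times); all numbers are invented.

```python
# Hedged sketch of the two-part Monte Carlo test: (1) event count and
# (2) a clustering statistic, each compared against simulated null series.
# In practice, plug in the weather-style simulator described in the question.
import random

random.seed(1)

def simulate_event_starts(n_steps=400, p=0.03):
    """Stand-in null model: each time step independently starts an event."""
    return [t for t in range(n_steps) if random.random() < p]

def event_stats(starts):
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
    return len(starts), mean_gap

observed = [5, 9, 14, 20, 26, 33, 41, 50, 60, 71, 83, 96]  # clustered early
obs_n, obs_gap = event_stats(observed)

n_sims = 2000
count_ge = 0   # simulations with at least as many events as observed
gap_le = 0     # simulations at least as clustered (mean gap as small)
for _ in range(n_sims):
    n, gap = event_stats(simulate_event_starts())
    count_ge += n >= obs_n
    gap_le += gap <= obs_gap

# one-sided Monte Carlo p-values with the +1 correction
p_count = (count_ge + 1) / (n_sims + 1)
p_gap = (gap_le + 1) / (n_sims + 1)
print(f"p (number of events) = {p_count:.3f}, p (clustering) = {p_gap:.3f}")
```

Here a small p for the clustering statistic, but not for the count, would suggest the events differ in structure rather than frequency; if both parts are tested, the two p-values should be adjusted for multiplicity.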
  • asked a question related to Hypothesis Testing
Question
7 answers
Hi, I am a student doing research on the MBA level.
In my research, I have linear regressions, chi-squares, and t-tests due to the different types of variables. I have around 8 variables that are tested against each other. I have two main hypotheses, but several sub-hypotheses that are tested in SPSS.
Since my study involves several analyses, are there any issues with having (too) many hypotheses, i.e. 16-20? Are there any arbitrary limits, or is "too many" a recognized problem in research?
Thanks
Relevant answer
Answer
Dear Dilpreet Singh
If you have many research hypotheses that are related to your topic, and you have elaborated them according to the requirements of the topic, then it should not be a problem.
It is also possible to introduce some control variables...
I suggest you read the relevant literature for more specifics.
  • asked a question related to Hypothesis Testing
Question
8 answers
I am studying the effects of social media on fad dieting in males and females. My hypothesis is: "There is going to be more of a social media influence on females than males regarding fad dieting." I'm recruiting participants from my university; however, I'm stumped on figuring out a procedure to test my hypothesis. Can anyone help me? Thank you in advance.
Relevant answer
Answer
Social media is a concept, and a concept is not a variable. Social media cannot be measured directly with quantitative tools.
  • asked a question related to Hypothesis Testing
Question
8 answers
The scenario is as follows:
- Imagine a phenomenon was studied by others;
- They concluded that there is a correlation between the increase in that phenomenon and the increase in its outcomes;
- After reading these studies, can I hypothesize that if the phenomenon prevails/increases, the outcomes will also prevail/increase?
- Will my hypothesis be valid, or is it incorrect to draw future conclusions from past ones?
- P.S. Sorry not to reveal what I am hypothesizing because, once revealed, it will lose its magic :)
Relevant answer
Answer
The good part is that the technique can be applied to any area. However, the bad part is that it is not that easy. I used it on the question of free will (philosophy); the second part of that article is to be published soon, hopefully. Admittedly, the technique requires a long, steep learning curve. Note that it is not Buckingham's theorem, which appears in many books and papers. You can read more in the dimensional analysis chapter of my book Basics of Fluid Mechanics (not easy to read, but it provides the flavor even to non-fluid-mechanics people).
  • asked a question related to Hypothesis Testing
Question
9 answers
In hypothesis testing, why do we report the confidence interval? For instance, when we say a 90% or 95% confidence interval, what does it mean?
Relevant answer
Answer
Blaine Tomkins, nothing in the definition I quoted suggests that one has to be able to compute the CIs for all possible samples of size N.
The correct definition of a frequentist CI is based on the hypothetical notion of drawing all possible samples of size N from a population and computing the x% CI for each sample. If one could do that, x% of those intervals would include the true value of the parameter being estimated, and 100-x% of the intervals would not.
In the usual case, we have just one sample and one of those possible intervals. But as the authors of the analogy put it, we do not know if it is one of the x% of intervals that tell the truth (i.e., they include the parameter) or one of the 100-x% of intervals that lies (i.e., they do not contain the parameter).
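The definition quoted above can be made concrete with a small simulation (Python, scipy assumed for the t quantile; the population parameters are invented): draw many samples of size N from a known population, compute a 95% t-interval for each, and count how often the interval contains the true mean.

```python
# Hedged illustration of the frequentist CI definition: about 95% of the
# intervals computed from repeated samples should contain the true mean.
import random
import statistics
from scipy.stats import t

random.seed(0)
true_mean, n, reps = 10.0, 30, 2000
crit = t.ppf(0.975, df=n - 1)  # two-sided 95% critical value

covered = 0
for _ in range(reps):
    sample = [random.gauss(true_mean, 2.0) for _ in range(n)]
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / n ** 0.5
    covered += (m - crit * se) <= true_mean <= (m + crit * se)

print(f"coverage = {covered / reps:.3f}")  # close to 0.95
```

Any single interval from this loop either contains the true mean or it does not, which is exactly the "truth-teller vs. liar" point made above.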
  • asked a question related to Hypothesis Testing
Question
5 answers
Kindly, let me know which regression model I can use specifically for hypothesis testing.
Relevant answer
Answer
Have you three hypotheses or one?
Please tell us a little more. It's well-nigh impossible to answer the question as it stands.
  • asked a question related to Hypothesis Testing
Question
1 answer
In the study (EEG band power analysis), there were six sample points (six subjects) in the pre- and post-groups, but we couldn't get a significant difference between the groups. I intend to apply a sliding window of 1 s with no overlap to reduce variance; let's assume we then have 100 sample points per participant, i.e. 600 sample points in the pre-group and 600 in the post-group. I intend to run a paired t-test on the data obtained after applying the sliding window. The dilemma:
should I,
1. average the windows generated per participant to increase the SNR, returning our data to 6 sample points per group (we didn't get any significant difference with this method)?
2. keep the 600 sample points generated by the sliding window and run the test on those?
Relevant answer
Answer
This is a really bad strategy. You can do a lot of bad things with such data, leading to irrelevant or wrong conclusions, and there is a high risk that you will miss the relevant information. I suggest you sit together with a statistician, ideally one who is a bit experienced in analyzing this kind of data.
  • asked a question related to Hypothesis Testing
Question
9 answers
Hi
I have applied several conditional independence testing methods:
1- Fisher's exact test
2- Monte-Carlo Chi-sq
3- Yates correction of Chi-sq
4- CMH
The number of distinct feature segments that reject independence (the null hypothesis) differs across the methods. Which method is more reliable, and why?
(The data satisfy the prerequisites of all of these methods.)
Relevant answer
Answer
With low expected cell counts there are various suggestions as to when chi-squared is no longer trustworthy. I like the Np > 15 rule because it's simple and seems to work well. In this rule, N is the total number of observations, and p is the proportion in the smaller group. So suppose you have a table that contains 50 observations, and 10% of them fall into the smaller group (so p = 0.1). Then Np is 50 x 0.1 = 5. The table fails the Np > 15 test: not enough data to do a chi-squared test.
  • asked a question related to Hypothesis Testing
Question
3 answers
In my dissertation I made a Multiple Linear Regression Model, and to test the variable significance I display the p-values with the hypothesis test as follows:
H0: There is a relation between the feature and the output
H1: There is no relation between the feature and the output
1. Is this hypothesis formulation correct?
2. When I say "relation between the feature and the output", does it mean that this relation is linear?
And if the variables present a non-linear relation, for example an exponential relation, will the variable not be significant?
Relevant answer
Answer
I'll just deal with the first one from David Morse
"1. In an ordinary least squares regression model having more than one independent variable ("predictor"), regarding the test of an individual predictor, the p-value informs you as to whether it is more reasonable to presume that the population value is zero (the null hypothesis) or non-zero (the alternative hypothesis)."
Unless I am misreading this, it could be written as (assuming alpha = 7%): if p < .07, H0 has a probability greater than 50% (and would it be higher if alpha = 5%?). This is incorrect, starting with in most situations the prior probability of H0 as a point hypothesis would be very low (perhaps 0). I could go on, but I assume I am misreading something.
Caty Gonçalves , I think a general point is that the beta associated with each individual feature will be conditional on all the others, so is this what you want? Also, how many features are there, and does it make sense to explore patterns among them before prediction? If not, have you considered lasso-like procedures (and there are lasso procedures for the case where you think the features come in groups)?
  • asked a question related to Hypothesis Testing
Question
16 answers
Good day.
I am doing linear regression between a set of data and predictions made by two models, that I'll call A and B. Both models have the same number of parameters.
If I do a simple regression with excel, I get the following:
- Model A has R2 = 0.97.
- Model B has R2 = 0.29.
- The least-squares fit to model A has a slope m = 2.43.
- The slope for model B is m = 0.29
From this simple analysis, I would conclude that model A is better than model B in capturing the trend of experimental outcomes. I even tested it on a set of unseen data and it performed better at predicting the trends.
Now, I was asked to confirm this by hypothesis testing, and here it gets tricky probably due to my lack of experience. Due to the large slope of model A, the residual sum of squares for model A is huge, almost 5 times larger than that for model B. Since the number of data points and parameters is the same for both models, this suggests that model B is better than model A.
What am I doing wrong? I feel that I'm not formulating my problem correctly, but I'm honestly lost.
Also, I've seen that there are endless flavors of hypothesis testing, but the more I read the less I know where to start.
Is there a simple prescription to formulate and test my hypothesis?
Many thanks in advance!
Relevant answer
The sample sizes for A and B are both 11, which is quite small (n = 11). Sometimes that is not enough for regression analysis.
  • asked a question related to Hypothesis Testing
Question
4 answers
Which countries use permanent income hypothesis for managing their oil wealth?
  • asked a question related to Hypothesis Testing
Question
7 answers
Hi
I have a huge dataset for which I'd like to assess the independence of two categorical variables (x,y) given a third categorical variable (z).
My assumption: I have to do the independence tests for each unique "z", and if even one of these tests rejects the null hypothesis (independence), it would be rejected for the whole dataset.
Results: I have done Chi-Sq, Chi with Yates correction, Monte Carlo and Fisher.
- Chi-Sq is not a good method for my data due to sparse contingency table
- Yates and Monte Carlo show rejection of the null hypothesis
- For Fisher, all the p values are equal to 1
1) I would like to know if there is something I'm missing or not.
2) I have already discarded the "z"s that have DOF = 0. If I keep them how could I interpret the independence?
3) Why does Fisher's test result in a p-value of 1 all the time?
4) Any suggestion?
#### Apply Fisher's exact test
fish = fisher.test(cont_table, workspace = 6e8, simulate.p.value = TRUE)
#### Apply chi-squared tests
chi_cor = chisq.test(cont_table, correct = TRUE)   # Yates correction
chi = chisq.test(cont_table, correct = FALSE)
chi_monte = chisq.test(cont_table, simulate.p.value = TRUE, B = 3000)
Relevant answer
Answer
Hello Masha,
Why not use the Mantel-Haenszel test across all the z-level 2x2 tables for which there is some data? This allows you to estimate the aggregate odds ratio (and its standard error), thus you can easily determine whether a confidence interval includes 1 (no difference in odds, and hence, no relationship between the two variables in each table) or not.
That seems simpler than having to run a bunch of tests, and by so doing, increase the aggregate risk of a type I error (false positive).
Good luck with your work.
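For reference, the Mantel-Haenszel pooled odds ratio itself is easy to compute by hand. Below is a minimal numpy sketch with invented z-stratified 2x2 tables; a full implementation with the test and standard errors is available elsewhere, e.g. statsmodels' StratifiedTable.

```python
# Hedged sketch: Mantel-Haenszel pooled odds ratio across strata.
# For a 2x2 table [[a, b], [c, d]] with total n, the pooled OR is
# sum(a*d/n) / sum(b*c/n) over all strata. Tables below are invented.
import numpy as np

tables = [
    np.array([[12, 8], [5, 15]]),
    np.array([[20, 10], [8, 22]]),
    np.array([[7, 9], [6, 11]]),
]

num = sum(t[0, 0] * t[1, 1] / t.sum() for t in tables)
den = sum(t[0, 1] * t[1, 0] / t.sum() for t in tables)
or_mh = num / den
print(f"Mantel-Haenszel pooled OR = {or_mh:.3f}")
```

A pooled OR near 1, with a confidence interval including 1, would indicate no association between x and y conditional on z.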
  • asked a question related to Hypothesis Testing
Question
17 answers
Hello,
I am writing a research proposal in the field of marketing theory right now. The hypothesis must be developed based on the theory (from a paper by Ofek et al. (2011)) that multichannel retailers must increase in-store assistance levels to decrease product returns. The authors therefore propose that retailers with a high level of in-store assistance have lower return rates than retailers with a low level.
I want to test this relationship. My hypothesis is based on my belief (grounded in previous literature) that this relationship as described by the authors has changed. Therefore, I expect both groups (retailers with high vs. low in-store assistance levels) to have the same product return rates.
Now my question:
Is this hypothesis in a statistical context correct?
"The average product return rate of a B&C retailer with a high level of in-store assistance is similar to a B&C retailer with a low level of in-store assistance in the clothing market."
My problem is: if I conduct, for example, an ANOVA test and the results are significant, I would normally conclude that "there is a significant difference in the group means, therefore I can reject the null hypothesis."
In my case, with a significant test result, I would then need to *reject* my hypothesis. Is this allowed, or statistically incorrect?
Thank you so much for any help or feedback. I would appreciate any thoughts on this.
Kind regards,
Johanna
Relevant answer
Answer
Just a note to the post of Ronán Michael Conroy : TOST is equivalent to interpreting a (two-sided) confidence interval, something that might be easier to understand.
Johanna Schulz , maybe you can contact Ofek et al. to get the data, or discuss directly what they would consider a trivial effect. Getting the data would be ideal, because then you could directly test the hypothesis that the effect in their study equals the effect in your study. If the estimated difference of these effects is negative (the estimate for your study is smaller than for theirs) and you can reject the hypothesis, then you have reason to believe that your effect is smaller than theirs (there may still be an effect, just not as large as that found by Ofek et al.).
PS: This seems to be a nice example demonstrating how worthless publications are when they do not report and test meaningful models, don't give any effect quantification, and don't provide the actual data.
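To make the TOST suggestion concrete, here is a hedged Python sketch (scipy assumed). The return-rate data and the equivalence margin delta are invented; in a real study the margin would need substantive justification, e.g. from the effect size in Ofek et al.

```python
# Hedged TOST (two one-sided tests) sketch for equivalence of two group means.
# Equivalence holds if the true mean difference lies within (-delta, +delta).
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(42)
high_assist = rng.normal(0.20, 0.05, 50)  # invented product return rates
low_assist = rng.normal(0.21, 0.05, 50)

delta = 0.03  # equivalence margin: smaller differences count as "similar"

# H0a: diff <= -delta  vs.  diff > -delta
p_lower = ttest_ind(high_assist, low_assist - delta, alternative="greater").pvalue
# H0b: diff >= +delta  vs.  diff < +delta
p_upper = ttest_ind(high_assist, low_assist + delta, alternative="less").pvalue

p_tost = max(p_lower, p_upper)  # equivalence is declared if p_tost < alpha
print(f"p_lower = {p_lower:.4f}, p_upper = {p_upper:.4f}, TOST p = {p_tost:.4f}")
```

This rejects both one-sided nulls only when the data rule out differences larger than the margin in either direction, which is the logic of "testing for similarity" asked about in the question.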
  • asked a question related to Hypothesis Testing
Question
10 answers
When we teach about family-wise error rates, we usually use straightforward examples such as:
I conduct an ANOVA and obtain a statistically significant result. Next, I conduct post hoc comparisons to determine which cells significantly differ. However, I need to adjust alpha since I'm doing multiple hypothesis tests.
...but are family-wise error rates an issue only in these narrow situations, namely when multiple hypothesis tests are done on the same data and all relate to a single underlying "question" or "inference"?
For instance, if a study contains 3 experiments, all slightly different in design, but all intended to address a single question or phenomenon - is a statistical adjustment required? Why or why not?
*My point in asking this question is that what exactly constitutes a "family" is far from clear (at least to me).
Relevant answer
Answer
I'd say yes, testing the same "structure" using three different drugs (assuming that there are no other effect interfering with the action of that structure), the evidence from the experiments should pile up. As an extreme case you may think of using three drugs with minimal differences, actually with no difference at all - so essentially you use the same drug in three experiments. In each experiment you make 5 independent measurements. Due to the large variance all the p-values are rather large. "Piling up evidence" means that you could pool the three times 5 measurements and so have a higher power (i.e. you get a lower p-value for the same "true" effect). This should be (more or less) equivalent to combining the three p-values.
To the second part (dose-effect of at least one of the drugs). I also agree here. This is a screening. The difference to before is that here you assume that the drugs work at different doses because of some feature of these drugs (independent of "structure") which are all different between the drugs. You are testing a family of different features (or feature-combinations).
And I also finally agree that this can be a rather complex question in practice. Most authors don't seem to think about such things at all. They just always use some correction for multiple testing when doing some kind of ANOVA, and otherwise they completely ignore the topic (except in meta-analyses, where evidence is purposefully combined).
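The "combining the three p-values" step mentioned above can be sketched with scipy's combine_pvalues; the three p-values below are illustrative only.

```python
# Hedged sketch: pooling evidence from three underpowered but consistent
# experiments via Fisher's method (Stouffer's method is also available
# through the method argument).
from scipy.stats import combine_pvalues

p_values = [0.12, 0.08, 0.20]  # invented per-experiment p-values
stat, p_combined = combine_pvalues(p_values, method="fisher")
print(f"Fisher statistic = {stat:.2f}, combined p = {p_combined:.4f}")
```

The combined p-value can be smaller than any individual one, which is the "piling up evidence" idea: three individually non-significant results can jointly carry considerable weight.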
  • asked a question related to Hypothesis Testing
Question
6 answers
Is Bartlett's test alone enough for hypothesis testing, or does chi-square have to be tested along with it, in a dissertation for a Ph.D. study in the field of social science?
Relevant answer
Answer
For exploratory factor analysis?
Bartlett's test will tell you whether your data are fit for factor analysis.
And the chi-square test will tell you whether a sufficient number of factors has been extracted to explain the variance observed in the data.
I think you will require both.
Best!!
AN
  • asked a question related to Hypothesis Testing
Question
4 answers
For example, in the model y = b1 + b2*x2 + b3*x3 + e, I want to test the linear combination b2 + b3 = 0 in R, with the model estimated by quantile regression. I checked the quantreg package of Koenker; it has the command anova.rq, but it tests only nested models, similar to the usual anova command in base R. I want to test a linear combination in a single multiple regression or time series model. For linear models, the package 'AER' has such a command, 'linearHypothesis', but it does not work for QR.
I know Eviews can be used for this and I have used this in my paper Iqbal (2017)"Does gold hedge stock market, inflation and exchange rate risks: An econometric investigation", International Review of Economics and Finance.
Any help is appreciated.
Relevant answer
Answer
check this, if not useful, please DM me https://doi.org/10.1016/j.renene.2022.03.017
  • asked a question related to Hypothesis Testing
Question
62 answers
I have large sample (just one sample with 1100 cases) and I want to test a hypothesis about comparing mean of my sample in the two groups (each group has 550 cases).
Some statisticians told me, "you can use the formal Student t-test because the data are normal, based on the Central Limit Theorem."
I'm confused; the Central Limit Theorem is about the "mean of sample means". For example, if we have data with 100,000 cases which are not normal, then we can take 100 samples. In this case, the average of the 100 sample means would be normal. Now I can use the t-test.
If my sample is large, can I use parametric statistics (or hypothesis tests) with a non-normal distribution of the data?
Relevant answer
Answer
As a complement to this discussion and the valuable remarks made by Prof. David Eugene Booth , let me also put this quote from Wilcox [1]. It totally agrees with my experience in a field well known for generating "difficult" datasets (clinical trials -> clinical biochemistry).
1.4 The Central Limit Theorem
When working with means or least squares regression, certainly the best-known method for dealing with non-normality is to appeal to the central limit theorem. Put simply, under random sampling, if the sample size is sufficiently large, the distribution of the sample mean is approximately normal under fairly weak assumptions. A practical concern is the description "sufficiently large". Just how large must n be to justify the assumption that X̄ has a normal distribution? Early studies suggested that n = 40 is more than sufficient, and there was a time when even n = 25 seemed to suffice. These claims were not based on wild speculations, but more recent studies have found that these early investigations overlooked two crucial aspects of the problem. The first is that early studies looking into how quickly the sampling distribution of X̄ approaches a normal distribution focused on very light-tailed distributions where the expected proportion of outliers is relatively low. In particular, a popular way of illustrating the central limit theorem was to consider the distribution of X̄ when sampling from a uniform or exponential distribution. These distributions look nothing like a normal curve, the distribution of X̄ based on n = 40 is approximately normal, so a natural speculation is that this will continue to be the case when sampling from other non-normal distributions. But more recently it has become clear that as we move toward more heavy-tailed distributions, a larger sample size is required. The second aspect being overlooked is that when making inferences based on Student's t, the distribution of T can be influenced more by non-normality than the distribution of X̄.
In particular, even if the distribution of X̄ is approximately normal based on a sample of n observations, the actual distribution of T can differ substantially from a Student's t-distribution with n−1 degrees of freedom. Even when sampling from a relatively light-tailed distribution, practical problems arise when using Student's t, as will be illustrated in Section 4.1. When sampling from heavy-tailed distributions, even n = 300 might not suffice when computing a 0.95 confidence interval via Student's t.
[1] Wilcox, Rand. (2012). Introduction to Robust Estimation and Hypothesis Testing. 10.1016/C2010-0-67044-1.
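Wilcox's point is easy to reproduce in a small simulation (Python with numpy/scipy): with n = 40, Student-t interval coverage is close to nominal for a light-tailed uniform population but drops for a heavy-tailed lognormal one. The particular distributions and parameters below are illustrative choices, not taken from the book.

```python
# Hedged simulation: 95% t-interval coverage at n = 40 for a light-tailed
# vs. a heavy-tailed population.
import numpy as np
from scipy.stats import t

rng = np.random.default_rng(0)
n, reps = 40, 4000
crit = t.ppf(0.975, df=n - 1)

def coverage(draw, true_mean):
    hits = 0
    for _ in range(reps):
        x = draw(n)
        m, se = x.mean(), x.std(ddof=1) / np.sqrt(n)
        hits += (m - crit * se) <= true_mean <= (m + crit * se)
    return hits / reps

unif = coverage(lambda k: rng.uniform(0, 1, k), 0.5)
lognorm = coverage(lambda k: rng.lognormal(0, 1.5, k), np.exp(1.5 ** 2 / 2))
print(f"95% t-interval coverage: uniform {unif:.3f}, lognormal {lognorm:.3f}")
```

The uniform case lands near the nominal 0.95 even though the population looks nothing like a normal curve, while the lognormal case falls clearly short, matching the distinction Wilcox draws between light- and heavy-tailed distributions.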
  • asked a question related to Hypothesis Testing
Question
5 answers
My understanding of conventional practice in this regard is that when there are more than two independent proportions being compared (e.g., comparing the proportion of people who contracted COVID-19 at a given period between the <18 year-old group, 18-64 year-old group and >64 year-old group), one of the groups being compared will serve as a reference group (which will automatically have an OR or RR = 1) through which the corresponding OR or RR of the remaining groups will be derived from. As far as I know, it seems that the generated OR or RR from the latter groups, through logistic regression or by-hand manual computation, will have a p-value whose threshold for significance testing is not adjusted with respect to the number of pairwise comparisons performed.
I understand that in the case of more than two independent means, we implement one-way ANOVA/Kruskal-Wallis technique first as omnibus/global hypothesis test which is followed by the appropriate post-hoc tests with the p-value thresholds adjusted if the former test finds something "statistically significant." I imagine that if the same stringency is applied to more than two independent proportions, we should be doing something like a Chi-square test of association (with the assumptions of the test being met) first as omnibus/global hypothesis test, followed by an appropriate post-hoc procedure (possibly Fisher exact tests with p-value threshold adjustment depending on the number of pairwise comparisons performed) if the former test elicits a "statistically significant" difference between the independent proportions.
I would like to ask some clarification (i.e., what concepts/matters I am getting wrong) on this. Thank you in advance.
Relevant answer
Answer
Indeed, do run the chi-square test. In the attachment is an R script with post hoc tests.
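Since the attached script may not be accessible to every reader, here is a hedged Python sketch of the omnibus-then-post-hoc approach described in the question (scipy assumed; the counts are invented), using Bonferroni-adjusted pairwise Fisher exact tests after a global chi-square.

```python
# Hedged sketch: global chi-square over three age groups, then pairwise
# 2x2 Fisher exact tests at a Bonferroni-adjusted threshold. Counts invented.
from itertools import combinations
from scipy.stats import chi2_contingency, fisher_exact

# (cases, non-cases) per age group
groups = {"<18": (30, 270), "18-64": (120, 380), ">64": (90, 110)}

table = [list(v) for v in groups.values()]
chi2, p_global, dof, _ = chi2_contingency(table)
print(f"global chi-square: p = {p_global:.4g}")

pairs = list(combinations(groups, 2))
alpha_adj = 0.05 / len(pairs)  # Bonferroni adjustment for 3 comparisons
for g1, g2 in pairs:
    _, p = fisher_exact([groups[g1], groups[g2]])
    verdict = "significant" if p < alpha_adj else "ns"
    print(f"{g1} vs {g2}: p = {p:.4g} ({verdict} at {alpha_adj:.4f})")
```

The pairwise step is only run if the global test is significant, mirroring the ANOVA-then-post-hoc logic for means described in the question.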
  • asked a question related to Hypothesis Testing
Question
4 answers
The starting point for Interpretative Phenomenological Analysis (IPA) is induction. It aims at filtering out theoretical and conceptual assumptions in order to allow the lived experiences of research participants vis-à-vis a phenomenon to speak on their own terms. But, although IPA strongly emphasises an inductive process, many PhD students who choose IPA as their main methodological point of departure appear to make empirical and theoretical assumptions, as well as develop hypotheses, prior to embarking on their fieldwork (this is, at least, my experience). So, the question is: can IPA incorporate deductive processes, such as preconceived hypotheses?
Relevant answer
Answer
This raises the classic issue of "bracketing" in phenomenology, with the goal of becoming aware of, and setting aside, one's own preconceptions. Eatough & Smith (2017) give a nice one paragraph summary of the IPA position on preconceptions, with the argument that IPA researchers need to be reflexive in detecting and restricting these prejudices throughout the research process.
But this is very different from a deductive approach, which would build these preconceptions into the research process. In their book, Smith et al. (2009) do acknowledge the possibility of having some "secondary" interview questions that are theory-driven, but these should always be subordinate to the core goal of hearing the participants' experiences in their own terms.
  • asked a question related to Hypothesis Testing
Question
3 answers
Hello everyone. I really need your help on xtfrontier.
I'm using Stata and the paper by Battese and Coelli (1995) to estimate the efficiency of firms using the two-stage procedure.
As in the paper, there are some hypothesis tests for the parameters of the inefficiency frontier model. What commands should I run in Stata to perform those hypothesis tests and to get the inefficiency scores? For example, how do I do the log-likelihood ratio test as in the paper?
I run as follows:
xtfrontier lnY lnK1 lnK2 lnL, tvd
predict TE, te
xtreg TE ...(some independent var)
Thanks in advance for your help!
Relevant answer
Answer
Dear researcher,
I hope you may find the answer. From my understanding, you can find everything you need under xtfrontier postestimation: type help xtfrontier in Stata and follow the link to xtfrontier postestimation. Both hausman and lrtest are available for your research.
  • asked a question related to Hypothesis Testing
Question
4 answers
In hypothesis testing, many studies state a null hypothesis, while stating alternative hypotheses is also widespread, particularly in the structural equation modeling approach.
It is important for researchers to know when to use which approach. Could you give your expert opinion on this?
Relevant answer
Answer
In statistics, you use a model to explain how the observed data may be generated. Typically the data is understood as realizations of a random variable with some distribution (binomial, Poisson, beta, Weibull, gamma, Gaussian etc.) and one is interested in the expected value (mean) of this distribution, that may depend on other factors (treatment, time, disease group, country, ...).
The expected value is formulated as a parametric function, flexible enough to explain essentially any possible observation. The set of all possible values of all the parameters in this model is called the parameter space. Now the observations you actually made may be more or less well explained by some combination of parameter values. A specific subset of the parameter space is called a (statistical) hypothesis; its complement (all parameter values not contained in this specific subset) is called the alternative. You determine whether your observed data are considerably more likely under the alternative than under the hypothesis. A hypothesis test calculates the probability of a larger discrepancy between the likelihood of the data under the hypothesis and under the alternative, given that your model is correct and that the true parameter values lie in the subset of the parameter space you selected as the hypothesis. If this is very unlikely, then your observed data provide a considerable amount of information against your model/hypothesis combination. If you believe the model itself is actually ok, then the conclusion is that your hypothesis was a "bad choice" and the alternative, therefore, is the better choice. You would thus reject the hypothesis in favour of the alternative, as the alternative is "everything else that is not part of the hypothesis you just rejected".
We often test a point null hypothesis that divides the alternative into two parts to see if the data provide enough information to decide in which of these two parts the parameter value should be located.
Example:
You want to model the probability of a successful job application under a more or less rigorously defined set of circumstances (e.g. what job, who applies, and how the application is done and evaluated). You define the Bernoulli variable Y having the value 1 for a success and 0 for a failure of an application. The parameter is the success probability π, which here is identical to the expected value E(Y).
You collect data to learn something about π. Let's say you have a single observation, and that person got the job, so the observed realization of Y1 is 1. This observation is most likely under the hypothesis π = 1 and least likely under the hypothesis π = 0. Now you get a second observation, and that happens to be Y2 = 0. If these two observations are independent, then the combination of the two is most likely under the hypothesis π = 0.5 and least likely under the hypothesis π = 0 as well as under π = 1. After many more (independent) observations you will have a set of data that is considerably more likely under a very narrow range of possible values for π than under any other possible value of π. In fact, it will be most likely for π = sum(Yi)/n, the average of the realizations, which I will denote m (like "mean").
Testing a point hypothesis (π = h, with h some value between 0 and 1) essentially compares the maximum likelihood of the data under this hypothesis to the global maximum likelihood. As our simple model has only a single parameter, the maximum likelihood under this hypothesis is just the likelihood of the data assuming π = h. The global maximum is obtained assuming π = m. The further h is from m, and the narrower the likelihood peaks around m, the larger the ratio of these two likelihoods, and the lower the probability of observing even larger likelihood ratios when π = h. If this is so unlikely that you reject π = h, then you also reject any value of π even further away from m than h. This is because the likelihood has a single maximum (at π = m). So the test of π = h is equivalent to the test of π > h when h > m, and to the test of π < h when h < m. Hence, when rejecting π = h, you claim to have enough information from your observations to say whether π > h or π < h. If you cannot reject π = h, then your data do not provide enough information to decide whether π > h or π < h.
So what we actually want is to see if we can distinguish π > h from π < h. We do this by testing the point hypothesis π = h. If we can reject this, then π = h is "out" to explain the observed data, because there is some other hypothesis within the alternative (which is any π ≠ h) under which the data is considerably more likely. And since the likelihood is maximal at π = m, we know that this value must be "on the same side" of h as m.
For more complex hypotheses, e.g. compound hypotheses about several parameters, there is no simple direction anymore, so the practical meaning of testing point hypothesis (where the point is a point in the multidimensional parameter space) may not be very obvious. But one can certainly define explicit subsets of the parameter space to test more "meaningful" hypotheses against more "meaningful" alternatives.
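The Bernoulli example above can be sketched numerically. The data here are hypothetical application outcomes, and comparing the likelihood-ratio statistic against a chi-square distribution is the standard asymptotic approximation, an assumption beyond what the answer itself derives:

```python
import numpy as np
from scipy import stats

# Hypothetical data: outcomes of 20 independent job applications (1 = success)
y = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1])
n, s = len(y), int(y.sum())
m = s / n                                  # maximum likelihood estimate of pi

def log_lik(p):
    # Bernoulli/binomial log-likelihood of s successes in n trials
    return s * np.log(p) + (n - s) * np.log(1 - p)

h = 0.5                                    # point hypothesis pi = h
lr_stat = 2 * (log_lik(m) - log_lik(h))    # log-likelihood-ratio statistic
p_value = stats.chi2.sf(lr_stat, df=1)     # asymptotic chi-square approximation

# Since m > h here, rejecting pi = h would support pi > h.
```

With 13 successes in 20 trials the p-value comes out well above 0.05, so these data would not carry enough information to decide on which side of 0.5 the true π lies.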
  • asked a question related to Hypothesis Testing
Question
3 answers
Esteemed members of Research Gate
I have wondered for many years, and still wonder, about hypothesis testing, which seems unable to produce practical results. Just accepting some proposition or rejecting it would not be right. Ironically, the research we do here depends heavily on it. I am starting this discussion so that you can provide insights into the issue.
My point is this. When I did my graduate work in statistics, I was instructed to set up the null hypothesis as a statement of "no significant difference". If the calculated value is less than the tabulated value, then accept the null hypothesis; if not, reject the null hypothesis. But in research reviews and some lectures, I have seen the opposite. If I ignore the null hypothesis statement and instead take a directional hypothesis, assuming that there is a significant difference between the variables under study, as my primary hypothesis, does the same rule apply? I mean, if calculated < tabulated, do I then have to accept the primary hypothesis, which assumes significance, and reject it if the opposite comes true? Is this right?
To bypass this issue, I have come across Bayesian methods, which are very clear. But I do not have a model to follow in teaching students.
So kindly set out the correct way of teaching hypothesis testing, as it is still the predominant practice in psychological research.
If possible, refer me to some free resources for teaching research methodology to psychology teachers. Unless psychology teachers are well trained in this issue, genuine research will not come forth.
Relevant answer
Answer
There are several conceptual errors in what you wrote you had been instructed.
Very generally speaking, you do hypothesis tests to find out if your observed data contain enough information for some kind of interpretation. That's all. We call it a "(statistically) significant result" when there is enough information for an interpretation, and otherwise we call it a "(statistically) non-significant result". The latter only means that the observed data don't give us enough information about the feature we would like to interpret.
The previous paragraph was very general, but it contains an extremely important message: why, in general, we do tests at all and what the test results tell us (and what they don't tell us). About 80% (at least) of all applied stats books I know get this wrong, and so do most researchers.
This general idea becomes clearer with a more concrete example. A prototype example is the effect of a treatment on some quantitative response variable. In your statistical model you formulate the response as a random variable from some distribution family. The interesting feature of this random variable is its expected value, which may depend on the treatment. We don't know this expected value, which I will denote "µ". However, observed data give us some information about it: for any possible value of µ we can use the distribution model to calculate how likely the observed data would be if µ were set to that value. We can so obtain the likelihood of the data for any hypothesized value of µ. The likelihood will be larger when we choose a µ somewhere in the "middle" of the observed values, and it will decrease the further away from the data we choose a value of µ.
The value of µ where the likelihood is maximal is called the maximum likelihood estimate (of µ). This is the sample mean which I will denote with "m".
If we have a lot of observations, this drop will be steep: the likelihood will be much lower if we choose a value only slightly different from the sample mean. If we have only a few observations, the likelihood won't change so quickly: there is a larger interval of values we can choose for µ and still find a relatively large likelihood of the data. This shows that, and how, data bring information about µ.
You may now be interested in finding out whether that information about µ is sufficient to expect µ to be on the same side of some value h as the sample mean m. If the difference between m and h is small and the sample size is small, the ratio of the likelihoods at m and h will be close to 1. But when there is a lot of data, the likelihood drops quickly around m and the likelihood ratio will be much smaller than 1.
The genius trick is that one can derive a probability distribution of this likelihood ratio under the assumption that µ = h. So it is possible to calculate the probability of even smaller likelihood ratios than the one obtained for the observed data. This is the famous p-value. If this probability is very small, then the observed data are very incompatible with the statistical model and our hypothesis that µ = h. Therefore, the data are even more incompatible with all values of µ on the opposite side of h from m (because there the likelihood ratios would be even smaller). When, for instance, m > h and the data are very incompatible with the assumption that µ = h, then the data are even more incompatible with any µ < h. But this means: if any value is compatible with the data, then this value must be larger than h. We therefore "reject the hypothesis that µ < h" and conclude that µ > h (because the data are too incompatible with any µ < h).
Often it is stated that only µ = h is rejected. Although this is not wrong, it obscures that therewith all hypotheses µ < h (when m > h) or µ > h (when m < h) are also rejected.
Regarding your "Bayesian bypass": this addresses an entirely different question. In the hypothesis test you try to find out whether the information in the data is sufficient for a statement about the sign of the difference h - m. A Bayesian analysis requires a prior distribution for µ and modifies it, based on the information in the data, into a posterior expressing your current knowledge about µ (and not whether you have a large enough sample size to make a statement about whether µ > h or µ < h).
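To make the directional reading concrete, here is a small sketch with hypothetical data using scipy's one-sample t-test, the standard operationalisation of this likelihood-ratio idea for a normal mean:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=5.4, scale=1.0, size=30)   # hypothetical sample
h = 5.0                                       # hypothesized value of mu
m = x.mean()

t_stat, p_two = stats.ttest_1samp(x, popmean=h)

# Rejecting mu = h rejects, with it, every mu on the far side of h from m:
# if m > h and p_two is small, the conclusion is mu > h (and vice versa).
```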
  • asked a question related to Hypothesis Testing
Question
8 answers
I am using the Mann-Kendall test and Sen's slope to assess the trends in monthly rainfall datasets for 64 years, e.g., Jan 1957, Jan 1958, ..., Jan 2020. Since the region is a semi-arid one, there are a lot of zero values (NOT missing values) in the time series. For example, the time series for rainfall in January has only 15 non-zero values out of 64 data points. My question is how this will affect the trend test (Mann-Kendall) and the trend slope (Theil-Sen)?
Relevant answer
Answer
To me this sounds like a point process with (say) time intervals between the rain events (a procedure in the Genstat software gives the following description):
>>>
A point process, or series of events, is characterized both by the times at which events occur, and the intervals between events. The Poisson process is the most basic point process, with Poisson counts in any interval, and independent exponentially distributed intervals between events.
A comprehensive account of methods for analysing point processes is given by Cox & Lewis (1966). PTDESCRIBE implements many of the test and summary statistics they give and should be used in conjunction with the text for a full discussion of the motivation and context of their use. All equations referred to below are from Cox & Lewis (1966).
The DATA variate may contain either the times at which events occur, the intervals between events, or a sequence of 0's and 1's, with 1's indicating the times of events on an integer time scale. The option REPRESENTATION specifies which of these is used. If REPRESENTATION=time and the process is measured from some time other than zero, the initial time should be given in the parameter START. Otherwise the START time is assumed to be zero. The first interval is taken to lie between the START time and the first event. If the process is observed beyond the last event, the total duration of the process should be given in the parameter LENGTH. Checks are carried out on START, LENGTH and the length of each interval, and the procedure terminates if these are inconsistent. If REPRESENTATION=time, the DATA variate may be restricted, facilitating the analysis of truncated or thinned point processes.
>>>
Reference
Cox, D.R. & Lewis, P.A.W. (1966). The Statistical Analysis of Series of Events. Methuen, London.
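Returning to the Mann-Kendall question itself: a minimal sketch of the statistic (normal approximation with the standard tie correction; the rainfall series below is made up) shows exactly where the zeros enter. Each tied pair, such as two zero-rainfall years, contributes sign 0 to S and shrinks Var(S) through the correction term:

```python
import numpy as np

def mann_kendall(x):
    # Minimal Mann-Kendall trend statistic with tie-corrected variance
    x = np.asarray(x, dtype=float)
    n = len(x)
    # S: sum of sign(x_j - x_i) over all pairs j > i; tied pairs add 0
    s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    # Each group of t equal values (e.g. a run of zeros) reduces Var(S)
    _, counts = np.unique(x, return_counts=True)
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in counts)) / 18
    z = (s - np.sign(s)) / np.sqrt(var_s) if var_s > 0 else 0.0
    return s, z

# Hypothetical semi-arid January series: mostly zeros
rain = [0, 0, 12, 0, 0, 3, 0, 25, 0, 0, 7, 0, 18, 0, 30]
s_rain, z_rain = mann_kendall(rain)
```

With many zeros, S is driven entirely by the few non-zero years and the tie correction shrinks Var(S), so the effective information about trend comes from far fewer than 64 points.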
  • asked a question related to Hypothesis Testing
Question
9 answers
Dear Community,
I was wondering whether it is possible to validate hypothesis testing based on an OLS regression with only 2 independent variables, or whether doing so would decrease the credibility of the OLS regression results.
If we push it to the extreme, is it possible to validate hypothesis testing with a univariate regression analysis?
Relevant answer
Answer
Hello Yuko,
I'm not sure I understand exactly what you're asking.
You certainly can:
1. Use OLS regression to develop a model having two IVs and one DV (which is a univariate model: one DV).
2. Use OLS regression, with additional data, to determine how well a previously developed model having two IVs works to account for variance in the DV in the added/new data batch.
3. Use OLS regression to evaluate the relationship between scores on a single IV with those of a single DV (simple bivariate regression).
4. Use OLS regression to address questions of whether a categorical IV can help explain differences in scores on a single metric DV.
If none of these gets at the intention of your query, perhaps you could elaborate a bit so as to increase the likelihood of getting a more constructive response.
Good luck with your work.
  • asked a question related to Hypothesis Testing
Question
1 answer
While working on a hypothesis, it was assumed that marine microorganisms called dinoflagellates need to be injected via microinjection into the xylem tissue of a plant once cultured in a nutritive medium.
I want to know: is it possible to do so in practice? Are there any consequences for the survival of the plant or of the microorganism?
Relevant answer
Answer
Before deriving the hypothesis/theoretical assumption, and even before trying to apply it practically, there are a lot of things to consider. First, most marine organisms (except euryhaline ones, those that can adapt to various salt concentrations) would die by osmosis in a fluid whose salt concentration is too low or too high, although some marine organisms can tolerate different osmotic pressures.
Secondly, you are using a dinoflagellate, which is an autotrophic organism. Unless your dinoflagellate has special adaptations for dark conditions, it is highly unlikely to survive within the xylem of the plant.
Apart from light, dinoflagellates require inorganic nutrients for normal growth.
Thirdly, consider whether the organism to be tested can evade the plant's immunity and acclimatize to the xylem fluid conditions.
  • asked a question related to Hypothesis Testing
Question
2 answers
I have heard some academics argue that the t-test can only be used for hypothesis testing, and that it is too weak a tool to analyse a specific objective when carrying out academic research. For example, is the t-test an appropriate analytical tool to determine the effect of credit on farm output?
Relevant answer
Answer
Depending on your objective statement: if your objective is to compare variables that influence a particular problem, you can use the t-test to compare them and then give justification.
  • asked a question related to Hypothesis Testing
Question
2 answers
I am confused: when testing the EKC, the studied variable GDP and its square have very small coefficients (0.01 and 0.00), with positive and negative signs respectively, at the 1% level of significance. Is it okay to have these values, or is there something wrong with the data I am using in the study? Kindly help me out in this matter. Thanks
Relevant answer
Answer
Thank you very much for sharing supporting material
  • asked a question related to Hypothesis Testing
Question
3 answers
I want to establish a relationship between two constructs, both of which are second-order reflective-formative constructs. Please suggest how to test hypotheses between such constructs using the SmartPLS software. Kindly share a research paper, if possible, for this kind of analysis. Thanks in advance.
Relevant answer
Answer
dears
see the material below.
  • asked a question related to Hypothesis Testing
Question
21 answers
I have two independent groups of animals (n=4 in each group) observed under two different conditions (A and B) at the same time of the day and equal time interval (12 hours). For each group, behavioral data were recorded, divided into 7 mutually exclusive categories of behaviour. As a result, we have a different number and frequency of categories of behavioral acts between two groups (Total: Group A - 795 behavioural acts, Group B - 867).
I just visualized the total frequency of the categories in each group and the relative frequency as a percentage, but I do not know which statistical method to use with this data to determine the significance of the difference in the frequencies of each behavioral category between group A and B.
Can you suggest one?
Relevant answer
Answer
T Test?!
Mann–Whitney?!
Sometimes I want to bang my head against the wall at Research Gate.
  • asked a question related to Hypothesis Testing
Question
3 answers
Hello, everyone
I want to discuss with you about Hypothesis Testing.
Briefly speaking,
about 350 thousand people have bad liver values (abnormal AST, ALT), so they were given a medical test for Hepatitis C,
and only 38 of them actually carry the Hepatitis C virus.
Another 850 thousand people were not tested for Hepatitis C because they have normal AST and ALT
(so we don't know whether they are Hepatitis C patients or not).
In this case, we want to compare two groups:
group C: all people who carry the Hepatitis C virus
group D: all people who don't carry the virus
We want to test whether there is a significant difference between the mean BMI of group C and the mean BMI of group D
(and similarly whether there is a significant difference between the mean weights of the two groups, and so on).
Two serious complications in this situation are:
We don't know, for some people, whether they belong to group C or group D.
And there is an extreme imbalance between the sizes of group C and group D (group C is very small).
Given this, I would like to discuss with you
what the best strategy or test for this situation is.
Thank you all
Relevant answer
Answer
Do a two-sample t-test of the difference in means between the tested-negative and tested-positive groups only.
Because the groups are of very unequal size they will not satisfy the equal-variance condition, and you have a Fisher-Behrens problem, for which there are several approaches, including the Welch test and the Behrens test.
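A minimal sketch of the suggested Welch test with hypothetical BMI data; in scipy, `equal_var=False` requests the Welch rather than the pooled-variance t-test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical, strongly unbalanced groups: 38 tested-positive vs many negatives
bmi_pos = rng.normal(25.0, 4.0, size=38)
bmi_neg = rng.normal(24.0, 3.5, size=5000)

# Welch's t-test: equal_var=False drops the pooled (equal) variance assumption
t_stat, p_val = stats.ttest_ind(bmi_pos, bmi_neg, equal_var=False)
```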
  • asked a question related to Hypothesis Testing
Question
7 answers
Dear Researchers, I would like to discuss experiences with the statistical method of the Wilcoxon paired test. In the core algorithm of the method, the values are transferred onto ranks. However, the test does not consider whether some value (before transferring onto ranks) is, e.g., 5 times or 15 times greater than the other values, i.e. the scale of the values. Am I right that the Kolmogorov-Smirnov test for comparing distributions is then appropriate as a necessary second, additive procedure for obtaining information on differences driven by the actual values (not ranks) in the data file?
I suppose that a sequence before ranking such as 1 1 1 2 1 ... is transferred in the Wilcoxon paired test in the same way as, e.g., 1 1 1 20 1 ...; is it therefore right to use the K-S test to check whether the two situations differ in the sense of the values themselves for the paired comparisons?
If the Wilcoxon paired test finds that the medians of the ranked data are statistically significantly "the same", the influence of the concrete values is not included; is it then right to make a following additive test on differences between variances by the Kolmogorov-Smirnov test?
Thanks for possible discussions
With wishes of statistically significant summer days
Tomas Barot
Dpt. of Mathematics with Didactics
Uni. of Ostrava, Czechia
Relevant answer
Answer
If I understand the issue, I think the answer is simply that different tests test different things.
In the group of two-sample paired tests, on one end is the sign test, which cares only about whether an observation in one group is larger than its paired observation in the other group, with no care about how large this difference is.
The Wilcoxon signed rank test begins by looking at the difference between the paired observations, but then uses the ranks of these differences to perform the test. So, it partially cares about the magnitude of the differences.
And then there's the t-test, that includes the magnitude of the differences. But has certain assumptions about the data.
There are probably other tests that don't have the same assumptions about the data that the t-test has. If you can take the difference between pairs, I'm sure that there's a permutation test that tests if the differences are not symmetric about 0.
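The three paired tests contrasted above can be run side by side; the before/after measurements here are hypothetical:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
before = rng.normal(10.0, 2.0, size=25)
after = before + rng.normal(0.8, 1.5, size=25)   # hypothetical treatment shift
d = after - before

# Sign test: only the sign of each difference matters
n_pos = int((d > 0).sum())
p_sign = stats.binomtest(n_pos, n=len(d), p=0.5).pvalue

# Wilcoxon signed-rank: uses the ranks of |d|, so partial magnitude information
p_wilcoxon = stats.wilcoxon(before, after).pvalue

# Paired t-test: uses the magnitudes directly, assumes roughly normal d
p_t = stats.ttest_rel(before, after).pvalue
```

When its assumptions hold, the t-test is typically the most powerful of the three and the sign test the least.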
  • asked a question related to Hypothesis Testing
Question
4 answers
My analysis concerns candidates who participated in 8 Ukrainian parliamentary elections more than once, regardless of whether they won or lost. I am testing hypotheses about whether gender, place of living, occupation, electoral rules (majoritarian/party lists), and prior experience of victory affect the number of attempts to win elections. I need literature that analyses one of these problems, or the phenomenon of repeated participation in elections, and, perhaps, a theoretical frame for this study.
Relevant answer
  • asked a question related to Hypothesis Testing
Question
4 answers
I have samples of monthly totals of births for families in different locations and want to compare the seasonality of births as an indicator of distinctions in work patterns. The suggestion is that if sailors are away from their families at particular times of year, the pattern of births in their families will reflect these absences. The data are arranged by months and years, with the births in each month given as a percentage of that year's total, and the totals for each month added up and expressed as a proportion of total births. Like this:
Parish 1
        Jan   Feb   Mar   Apr   May   ...   Total
1614      0     0     1     4     2   ...      31
         0%    0%    3%   13%    6%   ...    100%
1615    ...
Total    15    11    13     7     9   ...     144
        19%    8%    9%    5%    6%
Parish 2
1614      1     3     1     3     1   ...      20
         5%   15%    5%   15%   15%   ...    100%
1615    ...
Total    15    16    13    25    21   ...     183
         8%    9%    7%   14%   11%   ...    100%
To test for the significance of the differences, I have been performing simple t-tests on each month. Is this the correct approach, or are there other ways to test differences in the pattern of seasonality? In my statistical work I've not gone much further than hypothesis testing and linear regression.
Any thoughts would be much appreciated.
Relevant answer
Answer
dear
see the paper under
  • asked a question related to Hypothesis Testing
Question
3 answers
I have two data sets to compare, which are responses to statements using Likert-scale response anchors (7 in total, from "always untrue" to "always true", coded as 1-7), so the data are non-normally distributed and ordinal. In the 2 groups (n=24 and n=34, responses collected a year apart), 5 cases appear in both data sets (responses at the two time points from the same people), but all other cases come from people who responded at only one of the two time points.
If I am looking to hypothesis-test for group differences, what is the best way to work with these groups? Split them into independent and non-independent cases? Analyse them together, ignoring the 5 cases where responses come from the same people? Previously I have analysed other data sets (two and three groups) where all cases were independent of each other, both between and within groups, using Mann-Whitney and Kruskal-Wallis tests. But as I have a mix of independent and non-independent cases here, I am unsure how best to treat the data set and which hypothesis tests to run.
Relevant answer
Answer
Hi Hayley, Maybe this is a half full, half empty situation? How about trying ... 24 participants completed the survey and a year later (time 2) 29 different participants drawn from the same population completed the survey. Run your analyses as between-subject. Then say something like, "Additionally, 5 participants from time 1 repeated the survey at time 2. Though too few for statistical comparison, I (you) explored for additional insights and hypothesis generation. You can then say things like if the highest and lowest scoring participants remained the same. You can say if their survey responses uniformly increased or decreased across time like, or unlike, those in your between-subject study. Good luck with your project! -Maddie
  • asked a question related to Hypothesis Testing
Question
2 answers
We're working with the lovely garden eels: snake-like fishes that live in big colonies, attached to the sandy sea bottom. They feed on plankton and hide in their burrows whenever something big approaches. Here's a small video of them: https://www.youtube.com/watch?v=v2WEkd9qMlw
To test whether they use social information in their evasive behaviour, we found an edge of the colony and, after staying put for 3 minutes to ensure they were not hiding at that point, one of us slowly approached until the first eel retracted. We marked that point as our zero. Then we marked the positions where the closest and farthest eels hid, and measured the distances between our zero and the closest (Ri) and farthest (R1) points.
Now, our null hypothesis is that Ri and R1 are equal, meaning the information (the evasive behaviour) is not spreading and therefore no social information is used. Our H1 is that if information is spreading, then R1 > Ri. As every pair of R1 and Ri was taken at the same time, with respect to the same point of reference (zero), and our data did not pass the Shapiro normality test, we are considering a paired Wilcoxon test. Is this appropriate? Our sample size is 68.
Thank you in advance.
Relevant answer
Answer
Yes, the Wilcoxon signed-rank test is appropriate for matched pairs of non-normally distributed data.
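A sketch of the one-sided paired Wilcoxon matching H1: R1 > Ri; the distances below are made-up illustrative values, not the actual eel data:

```python
from scipy import stats

# Hypothetical paired distances (metres) from the zero point at each trial
R1 = [3.2, 4.1, 2.8, 5.0, 3.7, 4.4, 3.0, 3.8]   # farthest eel that hid
Ri = [1.1, 1.5, 0.9, 2.0, 1.3, 1.7, 1.0, 1.6]   # closest eel that hid

# One-sided paired Wilcoxon: tests whether R1 - Ri tends to be positive
res = stats.wilcoxon(R1, Ri, alternative="greater")
```

Using `alternative="greater"` makes the test directional, matching the stated H1 rather than a two-sided question; with the real n = 68 pairs the test should have reasonable power.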
  • asked a question related to Hypothesis Testing
Question
11 answers
If somebody frames a research hypothesis for his/her study, can he/she draw conclusions based on the frequency of responses alone, or must he/she apply a statistical hypothesis test?
Relevant answer
Hypothesis tests allow one, based on the result obtained, to accept or reject the null hypothesis, and to state at a significance level of 0.05 that the results are satisfactory and that significant differences are observed when comparing before and after the proposal is applied in educational practice.
  • asked a question related to Hypothesis Testing
Question
5 answers
(My research is about the development of an FR intumescent coating by the addition of additives. My goal is to have a sample that is more thermally resistive than the control sample, without neglecting time (aim: a lower average temperature, or a better, negatively related, temperature-vs-time relationship).) I performed a horizontal-burner fire test on 8 different samples, recording the temperature of each sample every minute for 1 hour. What statistical tool/method should I use on the gathered data (time vs temperature) to conclude that sample A is better than the control sample? I am not quite sure about using correlation, and I am not knowledgeable enough about other tests. Thank you.
Relevant answer
Answer
Do you have several readings (Y) per sample over time (T)? For example, you put a sample in fire and measure every minute over one hour, so you have about 60 readings per sample.
If so:
Is there some theory specifying the functional relationship between Y and T (e.g. should that be an exponential or linear relationship)?
Plot Y against T for each sample and see what the relationship in your sample data might be, or whether it is in accordance with theory.
Fit the relationship per sample, using a regression model. The model should include some parameter of interest, e.g. a slope or a half-time. Collect the fitted values of this parameter.
You can finally test hypotheses about the difference or the ratio of this parameter depending on the additive. For instance, if the regression model is a simple linear regression and the relevant parameter is the slope of the regression line, you can use a t-test. If the regression model is exponential and the parameter of interest is the half-time, you can use a t-test on the logarithm of the half-time.
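The linear-regression variant of this recipe can be sketched as follows, assuming a simple linear temperature-time relationship and several replicate samples per condition (all data below are hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
t = np.arange(60)                      # minutes

def fit_slope(y):
    # Slope of a simple linear regression of temperature on time
    return np.polyfit(t, y, 1)[0]

# Hypothetical replicate heating curves: control vs additive-treated samples
control = [20 + 2.0 * t + rng.normal(0, 5, t.size) for _ in range(4)]
treated = [20 + 1.6 * t + rng.normal(0, 5, t.size) for _ in range(4)]

slopes_control = [fit_slope(y) for y in control]
slopes_treated = [fit_slope(y) for y in treated]

# t-test on the fitted slopes, one value per sample
t_stat, p_val = stats.ttest_ind(slopes_control, slopes_treated)
```

Each curve is reduced to one slope, so the final test compares a handful of slopes per group rather than the raw, serially correlated readings.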
  • asked a question related to Hypothesis Testing
Question
3 answers
To show a mediation effect, how should we develop the hypotheses? Do we test three hypotheses, for the paths a to b, b to c, and a to c?
Can someone clarify the number of hypotheses to be tested and how to develop them? A discussion based on an example would be a great help.
Relevant answer
Answer
Looks like a mediation to me.
  • asked a question related to Hypothesis Testing
Question
14 answers
I performed a specific experiment involving temperature vs time. I ran 8 experiments yielding 8 different data sets (time vs temperature). Can I use the Pearson correlation coefficient to represent the behaviour of each graph, and if so, how can I compare the coefficients to see whether they are significantly different from one another?
Relevant answer
Answer
Livey Czar Aligno I interpret this new information like this: you have an independent variable that is the experimental group (1, 2, ..., 8), and for each group you have temperature responses at fixed times (1, 2, ..., 60), and you need to compare the temperature behaviours as a function of time for the 8 groups. I dare to think that it is a problem of comparing 8 regression models where the data of the dependent variable (temperature) have been taken as repeated measures at the set times: a repeated-measures (therefore correlated) design for 8 independent experimental groups.
  • asked a question related to Hypothesis Testing
Question
4 answers
Hi everyone!
I analyzed 20 tissue samples of oral leukoplakia (OL - an oral potentially malignant disease) through untargeted metabolomics to compare the metabolic profile of those OL who had malignant transformation (5) and those who did not (15). I know that the small sample size is one important limitation of the study, but OL is a rare disease and I have to deal with it.
Well, when I use my complete dataset (around 4k compounds) to perform multivariate analysis such as PLS-DA, my model is overfitted, exhibiting a negative Q2. However, when I use the 72 compounds considered statistically significant by the univariate methods (hypothesis tests) as the input data, my Q2 rises to 0.6. The improvement also occurs when I use this small dataset to build the heatmap, which clearly distinguishes the malignantly transformed from the non-transformed OL. Interestingly, most of the compounds ranked on the PLS-DA VIP list are the same whether I use my whole data or the 72 discriminant features as the input.
I recently presented my thesis to a metabolomics specialist and she told me that my analysis is curious and that she cannot tell me whether it is right or wrong.
Would anyone here help me with this question?
Thanks!
Relevant answer
Answer
Hi, you are actually asking two different questions here, as PCA is only a data-reduction method while PLS and clustering aim to interpret the data.
You have 2 groups and only 20 samples in all, so of course way too many compounds and potential confounds. And I would not rely exclusively on "significantly different" between the 2 groups to select data: with 4k variables, it is more likely than not that you have false positives (depending on how stringently you ran the statistical test).
My approach would be to do the PCA and look: is there some spontaneous separation of the 2 groups on one of the early principal components? Do your data "cluster", i.e. do you have groups of very correlated variables? In that case you can probably simplify each group to a single variable.
You have only 20 samples, so where there are significant differences (72 variables) it might be worthwhile to actually plot and look: is the difference pulled by a few samples, or is it nicely repeatable within one group?
Any way you look at it, you are in trouble: too many variables for few samples, so this is going to require a lot of brain exercise. This is where statistical tools, however good they are, must give way to knowledge and human intelligence. Do you have hypotheses about the related biological mechanisms to help you sift through the data? And the predictive value, as said by Guillermo Quintas, will stay poor. Tentative, I'd say; it might be a tool to help the practitioner but not totally reliable as a diagnostic.
  • asked a question related to Hypothesis Testing
Question
10 answers
As we know there are 2 types of errors in hypothesis testing. Type I & type II.
When applied to real-life situations, which is more catastrophic?
Examples would be helpful.
Thanks.
Relevant answer
Answer
Both could be catastrophic. Here are 2 examples:
Type I error: You incorrectly assume that a new expensive cancer treatment is helpful when in fact it is not effective. People may die because they receive the new (ineffective) treatment rather than a treatment that could really help them. Plus you waste a lot of money and time on something that isn't useful.
Type II error: You fail to see that an effective cancer treatment works. People die because they do not receive the treatment.
  • asked a question related to Hypothesis Testing
Question
2 answers
Hello! I've published five randomized controlled trials revealing that a virtual reality job interview training tool increases the odds of employment in those using the virtual tool compared to a community control group (~OR = 2.0 with two-tailed tests). The trials were in various groups with serious mental illness. I'm now conducting what was supposed to be a fully-powered RCT, but COVID-19 prematurely ended our recruitment at 68% of our anticipated sample.
Given we have 5 RCTs finding the same outcome, I proposed an a priori directional hypothesis that the virtual interview tool would again increase employment in the latest study. That said, is there a way to compute a directional/one-sided confidence interval for the Odds Ratio?
Relevant answer
Answer
Hello Matthew,
A one-sided CI for an odds ratio would simply be the appropriate tail estimate from a 100(1 - 2*risk level)% CI. For example, a 95% upper-limit-only CI for a data set would be found by asking for a 90% CI and using only the upper limit.
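Applied to the question's directional hypothesis (OR > 1), the relevant bound is the one-sided 95% lower limit, i.e. the lower limit of the two-sided 90% Woolf (log) interval. The 2x2 counts below are invented for illustration.

```python
import math
from scipy import stats

# 2x2 table: employed / not employed by treatment / control (invented counts)
a, b, c, d = 40, 20, 25, 35
or_hat = (a * d) / (b * c)                      # sample odds ratio
se_log = math.sqrt(1/a + 1/b + 1/c + 1/d)       # SE of log(OR), Woolf method
z = stats.norm.ppf(0.95)                        # 95th percentile -> two-sided 90% CI
lower = math.exp(math.log(or_hat) - z * se_log) # one-sided 95% lower bound
print(or_hat, lower)
```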
As Professor Booth notes, the Rosner text is very helpful for this and a host of other questions.
Good luck with your work.
  • asked a question related to Hypothesis Testing
Question
4 answers
I have a large dataset (https://www.kaggle.com/teejmahal20/airline-passenger-satisfaction) regarding airline passenger satisfaction. I applied a decision tree on this dataset and I extracted the feature importance and it seems that the quality of the inflight wifi service is the best predictor for the final satisfaction level of the passengers. Please keep in mind that the target variable is binary in this dataset (satisfied or dissatisfied).
I would like to cross-check this result by using "classical" statistics - hypothesis testing - whether the quality / level of satisfaction with respect to the wifi service is really a good indicator of whether the passenger will be satisfied or not. The final purpose of my research is to create an algorithm that can provide quality information for a business decision making process from the airlines' point of view (is it worth to invest in X service in order to improve our passenger's satisfaction level? - if the quality of the service is improved by a% then b% of the passengers become satisfied and are more likely to fly with our airline again).
I've identified the PSM (propensity score matching) as a way to "create" these control & test group for my hypothesis, but I'm not sure how to apply this or whether it is what I am really looking for.
Can anyone shed some light into this problem? Any help with respect to properly selecting a control group and a test group for this hypothesis testing will be greatly appreciated!
Many thanks!
Relevant answer
Answer
Hello again Serban,
OK, so the usual nomenclature is "training" (or model building) and "test" (or validation) groups. The training data set is used to develop the model, then the model's accuracy/efficacy is ascertained by applying it to the test/validation sample.
With a large enough initial sample, you could randomly select some fraction of that to serve as the training/model building sample, and hold the remainder out to use as the test/validation sample. There are fancier schemes (so-called k-fold models) that data mining folks like to use, but this is the basic framework.
Choose the initial sample size to be sufficiently large to allow a high degree of precision (and, stability: all other things equal, sample results are less volatile across large samples than across moderate or small samples). It's up to you to declare what degree of precision you'd like for any parameter estimates deriving from model building, and to select the training sample size accordingly.
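A minimal version of the random training/validation split described above (the 80/20 fraction and sample size are arbitrary choices; scikit-learn's train_test_split wraps the same idea with more options):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10000                        # rows in the full sample (invented)
idx = rng.permutation(n)         # shuffle row indices
n_train = int(0.8 * n)           # e.g. an 80/20 split
train_idx, test_idx = idx[:n_train], idx[n_train:]
print(len(train_idx), len(test_idx))
```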
Good luck with your work.
  • asked a question related to Hypothesis Testing
Question
5 answers
My study is about the development of an intumescent coating through the addition of additives A and B. In my research I used samples with different ratios of the two, with the control at 0:0. I performed the horizontal fire test and recorded the temperature of the coated steel over time. I need to show whether there is a significant difference between the samples and compare the correlation coefficients of the samples (whether the differences are statistically significant). I tried using one-way ANOVA and a post-hoc test, but I think time affects the temperatures. Should I try two-way ANOVA?
Relevant answer
Answer
If normality is ok, definitely Dunnett's test (see Google).
Good luck
  • asked a question related to Hypothesis Testing
Question
4 answers
In order to verify the validity of a hypothesis, what is the minimum sample size? I only have 45 samples; would that be enough? Is it okay to use qualitative analysis methods to validate the hypothesis?
Relevant answer
Answer
Sample Size Rule
Sekaran (2003) wrote:
"Roscoe (1975) proposes the following rules of thumb for determining sample size:
1. Sample sizes larger than 30 and less than 500 are appropriate for most research.
2. Where samples are to be broken into sub-samples (males/females, juniors/seniors, etc.), a minimum sample size of 30 for each category is necessary.
3. In multivariate research (including multiple regression analyses), the sample size should be several times (preferably 10 times or more) as large as the number of variables in the study.
4. For simple experimental research with tight experimental controls (matched pairs, etc.), successful research is possible with samples as small as 10 to 20 in size."
Reference
Sekaran, U., 2003. Research methods for business: A skill building approach. John Wiley & Sons.
  • asked a question related to Hypothesis Testing
Question
2 answers
Hi there!!! I have done a meta-analysis with 6 different datasets to find out significantly differentially abundant bacteria across all datasets.
I have calculated the standardized mean difference (effect size) between the control and test groups for each bacterium in each dataset. So, for a single bacterium, I now have 6 different effect sizes. Across these 6 effect sizes, I have run a random-effects model to find the overall effect size across populations for that particular bacterium, and I got the p-value.
I have done the same procedure for all the bacteria (200 in total). As I have done multiple hypothesis testing, I have adjusted the p-values with FDR correction. After adjustment, I get 8, 11, 14, and 21 differentially abundant bacteria at FDR cutoffs of 0.05, 0.1, 0.15, and 0.2 respectively. In this case,
  • can I report the bacteria with FDR < 0.2 or < 0.15? Will it be acceptable in high-quality journals?
  • Do the journals have any restrictions for high FDR values like 0.15 or 0.2?
Thanks,
Deep
Relevant answer
Answer
What a journal will accept or not is usually a gamble. It depends on the reviewers, although sometimes the question falls to a technical editor who has a better grasp of what is acceptable for a given journal.
There is nothing magical about the 0.05 cutoff for p-values. If we are in a situation where we are screening potential candidates for something, and we aren't worried about type-I errors at this step, then it makes sense to relax our p-value cutoff to, say, 0.10 or 0.15.
I'm wondering though, if it might make more sense (strategically), to not adjust the p-values with FDR, and keep to the 0.05 cutoff. The logic is the same: If you are interested in screening many bacteria, and aren't worried about an inflated type-I error, then there's no reason to be strict about the false discovery rate.
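To see what happens at the different cutoffs, the Benjamini-Hochberg adjustment can be computed directly (the p-values below are invented; statsmodels' multipletests with method='fdr_bh' gives the same values):

```python
import numpy as np

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values (q-values)."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # enforce monotonicity from the largest p downwards
    adj = np.minimum.accumulate(ranked[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.clip(adj, 0, 1)
    return out

p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
q = bh_adjust(p)
print((q < 0.05).sum(), (q < 0.2).sum())  # discoveries at FDR 0.05 vs 0.2
```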
  • asked a question related to Hypothesis Testing
Question
16 answers
The question has been answered- Closed Thread
.
.
Hello, for my undergrad dissertation I have a model where the dependent variable is Behavioral Intention (BI), and it has many independent variables. I first run regression analysis on SPSS by putting BI in the dependent box, and the rest of the variables (as well as the control variables) in the independent box. Almost all of my hypotheses were accepted, except 2 where the significance was over 0.05. Then I decided to run the analysis by testing the variables one by one instead of putting them all together (however I still included the control variables). I then realized that in this way, the standardized b coefficients were higher and the significance was almost always 0.000 (i.e. more strong relationships, and all hypotheses accepted). I know that probably the first method is more correct (multiple linear regression analysis), but why does this happen? Note: there are no issues of multicollinearity
Relevant answer
Answer
"Almost all of my hypotheses were accepted"
How? Can you say what your hypotheses were? Were they of the type B_1 = 0?
  • asked a question related to Hypothesis Testing
Question
20 answers
I was trying to determine whether there are differences in the frequencies of words (lemmas) in a given language corpus starting with the letter K and starting with the letter M: some 50,000 words starting with K and 54,000 words starting with M altogether. I first tried using the chi-square test, but the comments below revealed that this was an error.
Relevant answer
Answer
Did you try Python word count?
  • asked a question related to Hypothesis Testing
Question
8 answers
Hello research fellows,
I would like to understand how I can create a statistical test of the claim that the relationship between A and B is zero. That is, my ALTERNATIVE HYPOTHESIS predicts a zero relationship between these two variables.
What I understand is: the solution cannot be an insignificant relationship between A and B from a normal t-test, as that statistical test is created under the assumption that the null hypothesis predicts COR(A, B) = 0, while the alternative hypothesis predicts COR(A, B) ≠ 0. So here I would have no coherence between my null hypothesis about the phenomenon and my statistical test.
Can anybody suggest literature for how to test for zero relationships, preferably in the realm of psychology?
Thank you very much!
Best
Rafael
Relevant answer
Answer
Hello Rafael,
The type of test you envision is conceptually comparable to equivalence (non-inferiority) testing. Since a mean difference between two groups can be evaluated as testing the correlation that r(group membership, score) = 0, I think you could recast this method to your purpose.
Here's a link that offers a simple overview of equivalence testing:
Good luck with your work.
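To make the equivalence-testing idea concrete for a correlation, here is a TOST-style sketch using the Fisher z transformation; the equivalence bound of ±0.2 and the data are illustrative assumptions, not recommendations.

```python
import numpy as np
from scipy import stats

def tost_correlation(x, y, bound=0.2):
    """TOST: reject |rho| >= bound in favour of practical equivalence to zero."""
    r = stats.pearsonr(x, y)[0]
    n = len(x)
    se = 1 / np.sqrt(n - 3)          # SE of Fisher z
    z = np.arctanh(r)
    zb = np.arctanh(bound)
    p_lower = stats.norm.sf((z + zb) / se)   # H0: rho <= -bound
    p_upper = stats.norm.cdf((z - zb) / se)  # H0: rho >= +bound
    return r, max(p_lower, p_upper)          # TOST p-value is the larger one

rng = np.random.default_rng(3)
x, y = rng.normal(size=500), rng.normal(size=500)  # truly unrelated variables
r, p = tost_correlation(x, y)
print(r, p)  # small p -> evidence the correlation is within +-0.2
```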
  • asked a question related to Hypothesis Testing
Question
3 answers
Hi, I have 2 datasets of au values that were analysed with different analytical methods; dataset A has N = 60 and dataset B has N = 252. Which hypothesis-testing method can I use to test whether the two datasets are significantly different from each other?
Relevant answer
Answer
Both samples are large enough (i.e., >30) to allow use of the standard normal z-test of equality of their means.
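A sketch of that z-test on invented values with the stated sample sizes (Welch's t-test, scipy.stats.ttest_ind with equal_var=False, would give a very similar result here):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
a = rng.normal(1.2, 0.4, 60)    # dataset A (invented values)
b = rng.normal(1.3, 0.5, 252)   # dataset B (invented values)

# Two-sample z-statistic for a difference in means, unequal variances
z = (a.mean() - b.mean()) / np.sqrt(a.var(ddof=1)/a.size + b.var(ddof=1)/b.size)
p = 2 * stats.norm.sf(abs(z))   # two-sided p-value
print(z, p)
```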
  • asked a question related to Hypothesis Testing
Question
3 answers
Has anyone ever used Bayesian modelling when hypothesis testing instead of p values in classical hypothesis testing?
Relevant answer
Answer
Yes, it's increasingly common in some fields (notably psychology) to use Bayesian approaches. Bayes factors are only one way to do this.
  • asked a question related to Hypothesis Testing
Question
6 answers
Hello Everyone,
I have one population with a sample size of 200, and I need to find the correlation between variables A and B. Also, which t-test should I choose for hypothesis testing?
Please advise which statistical test using SPSS will be good for below research questions:
Is there a relationship between Variable A and Variable B?
Are there any differences in risk score by gender ( Male vs. Female)?
Thanks in advance.
Will appreciate your response.
Best Regards,
Meraj Farheen Ansari.
Relevant answer
Answer
If your variables A & B are on the same scale, you can run a repeated measures ANOVA, with Gender as the between factor and the 2 variables as the within factor.
If A is an independent variable, you can run an ANCOVA, i.e. a regression:
B = A + gender (in that order)
Note that the interaction "A * Gender" should not be significant. If there is an interaction between Gender and variable A, you can rely on the Johnson-Neyman technique to see where there could be differences.
see:
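The suggested regression B = A + gender, with the A:Gender interaction check, can be sketched with plain least squares (all data below are invented for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
n = 200
a = rng.normal(size=n)                   # continuous predictor A
gender = rng.integers(0, 2, n)           # gender coded 0/1
b = 0.5 * a + 0.3 * gender + rng.normal(size=n)   # outcome B

# Design matrix: intercept, A, gender, A:gender interaction
X = np.column_stack([np.ones(n), a, gender, a * gender])
coef, *_ = np.linalg.lstsq(X, b, rcond=None)
resid = b - X @ coef
df = n - X.shape[1]
se = np.sqrt(resid @ resid / df * np.linalg.inv(X.T @ X).diagonal())

# t-test on the interaction coefficient
t_int = coef[3] / se[3]
p_int = 2 * stats.t.sf(abs(t_int), df)
print(p_int)  # if clearly non-significant, drop the interaction and refit B ~ A + gender
```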
  • asked a question related to Hypothesis Testing
Question
6 answers
Hi Colleagues- What do you think about the following methodological issues:
1. Is it methodologically wrong to formulate quantitative research questionnaire based on inductive reasoning?
2. Is it mandatory to test hypotheses and/or theory in quantitative research? If not, what are the justifications? (Please suggest an example of an exemption.)
Thanks in advance
Relevant answer
Answer
To your first question, it is not right to use quantitative data for inductive analysis. Inductive analysis is driven by reasoning based on the data emerging from observations and participants' views. Inductive analysis is therefore based on qualitative data. Quantitative means you are either testing a hypothesis or a theory that you already know something about. My opinion.
  • asked a question related to Hypothesis Testing
Question
7 answers
Hello everyone, I am using SPSS and the IPA method to analyse my data. My questionnaire included 31 questions about two variables, "Importance" and "Usage" (the participants answered from 1-5). My first hypothesis is:
H0- There is no significant difference between the level of importance and the level of usage...
In order to answer, I have used a paired t-test in SPSS for each question, BUT 19/31 questions show that the mean difference is NOT statistically significant, while the remaining 12 questions show that the mean difference is statistically significant. Hence I do not know how to proceed.
Do I accept or reject the null hypothesis and why ??
Thanks!
Relevant answer
Answer
It sounds like what you want to do is first create a composite variable for each of Importance and Usage, and use those as your measured variables. Of course, if you are comparing these two variables, they need to be commensurate, e.g. each composed of a respondent's average response to the relevant questions.
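The composite-variable step can be sketched like this (respondent count and answers are invented):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n_resp, n_items = 100, 31
importance = rng.integers(1, 6, (n_resp, n_items))  # 1-5 Likert answers (invented)
usage = rng.integers(1, 6, (n_resp, n_items))

imp_score = importance.mean(axis=1)   # one composite score per respondent
use_score = usage.mean(axis=1)

# One paired t-test on the composites instead of 31 item-level tests
t_stat, p = stats.ttest_rel(imp_score, use_score)
print(t_stat, p)
```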
  • asked a question related to Hypothesis Testing
Question
8 answers
I have collected data on a variable; I want to identify the underlying distribution and then draw samples from it.
I have tried to fit several distributions, and almost all of them fail the hypothesis tests. What should I do? How should I approach this question?
Relevant answer
Answer
  1. Document analysis.
  2. Surveys.
  3. Interviews.
  4. Observations.
  5. Focus groups.
  6. Case studies.
  • asked a question related to Hypothesis Testing
Question
11 answers
Dear RG Researchers,
When we apply the null hypothesis to test the relationship for two data sets, we compare the F-statistic with the critical value (P-value), and upon this comparison we reject the null hypothesis (meaning there is a relationship between the two data sets; F > P-value) or we accept it (F < P-value). Although we know this procedure, I would like to know the scientific (statistical) reasons behind this comparison to reject or accept.
Best wishes,
Sincerely.
Relevant answer
Answer
You don't compare an F-value to a P-value. You compare the (observed) F-value to the critical F-value. The critical F-value is a value for which Pr(F > Fcrit | H0) = alpha, with alpha being the level of significance (the "size" of the test). If F > Fcrit you know that this probability is smaller than alpha, in which case you reject H0.
Today, we don't need to use tables with Fcrit and can calculate Pr(F>Fobs | H0) directly. This is the p-value, and we can compare this p-value to alpha and reject H0 if p < alpha.
We NEVER accept H0. Failure to reject H0 means that we don't have enough data to interpret the statistic w.r.t. H0.
The reason behind this procedure is to dare an interpretation of a statistic calculated from the data (e.g. a mean difference, the slope of a regression line, an odds ratio, etc.) only if the data provide enough information about this statistic. The statistical significance is a proxy for this amount of information.
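The direct calculation described above, in code (the observed F and the degrees of freedom are illustrative):

```python
from scipy import stats

f_obs, df1, df2 = 4.2, 3, 36
p = stats.f.sf(f_obs, df1, df2)        # Pr(F > F_obs | H0), the p-value
f_crit = stats.f.ppf(0.95, df1, df2)   # critical F at alpha = 0.05
print(p, f_crit)  # p < alpha exactly when F_obs > F_crit
```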
  • asked a question related to Hypothesis Testing
Question
7 answers
I have an enormous dataset, and for each row I have a predicted value along with a few characteristics (independent variables) in the same row. I have fit an ideal linear regression to this dataset. Now I want to compare the set of independent variables of my ideal regression with the regression of each and every row of my dataset. I appreciate any help... thank you!
Relevant answer
Answer
Chi-square tests
  • asked a question related to Hypothesis Testing
Question
4 answers
I would be most grateful for advice on interesting clinical cases where interventions have been approved based on hypothesis test results from high-quality RCTs, but where it has subsequently been discovered that the hypothesis test results corresponded to false positives. I am particularly interested in cases where, despite the positive RCT finding, the scientific rationale behind the hypothesis was later discredited.
Many thanks in advance!
Relevant answer
Answer
You would have to go very far back, I think, to the era before systematic reviews, and even there the problem was more generalisation of RCT findings to groups that hadn't been included in the trial, such as the use of beta-blockers for hypertension in older patients.
The only case I can think of where there were positive results, but the underlying rationale was later discredited was the Paris streetscape. After the cholera epidemic in (?) 1832, they demolished slums, widened streets and installed sewers to reduce exposure to 'miasma' (foul air, believed to cause cholera). It worked, but not for the reasons they thought.
  • asked a question related to Hypothesis Testing
Question
5 answers
I am studying the characteristics of GitHub issues for a project. Based on some criteria, I have classified these issues into two separate groups. For example, if I have a total of 1000 issues for a project, 20 goes to the first group and the remaining 980 goes to the second group. Also, the two groups are highly unbalanced (e.g., 1 issue in the first category to 100 issues in the second category). For all of the issues, I have measured different characteristics, and the measured values for each feature do not follow a normal distribution.
Now, I want to do a null hypothesis testing for each of the measured characteristics to find out if the feature is different for the two groups and ideally how different are they. For example, feature X is significantly different in the two groups and it has higher values in the first group compared to the second group.
Can someone kindly help me on which methods I can use for this purpose.
Relevant answer
Answer
One-way ANOVA in SPSS can be helpful.
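Given the non-normal distributions and the strong imbalance described in the question, a nonparametric two-group comparison such as the Mann-Whitney U test may also be worth considering; a sketch on invented, unbalanced data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(13)
group1 = rng.exponential(2.0, 20)    # the small group (e.g. 20 issues), invented
group2 = rng.exponential(1.5, 980)   # the large group, invented

# Rank-based test of whether one group tends to have larger values
u_stat, p = stats.mannwhitneyu(group1, group2, alternative="two-sided")
print(u_stat, p)
```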
  • asked a question related to Hypothesis Testing
Question
25 answers
Given a study hypothesis, what are the up-to-date alternatives to null hypothesis significance testing in finding the evidence for or against the truth of that hypothesis and can you summarize you they work?
Relevant answer
Answer
Hening Huang , the things you enumerated are no real alternatives. These are all just flavors of the same thing. It's like changing the color of a car and trying to sell it as an alternative to individual automotive mobility...
• (1-a)-confidence intervals are simply intervals of the "hypothesis space" within which p ≥ a.
• effect size analysis requires judging the signal-to-noise ratio, and this is done via p-values.
• Bayesian factors are likelihood ratios (likelihood factors). p-values are results of likelihood ratio tests (even if not, they can be shown to be equivalent to LRTs).
  • S-values are just log-transformed p-values.
  • The term "exceedence probability" I know only in the context of extreme-cases (hazards) analyses. If you mean some kind of Bayesian analysis, then it's likely asymptotically equivalent to the interpretation of a confidence interval.
At the end of the day, you have some data from which you calculate some estimated effect (size), and you must use some measure to judge whether the information in the data justifies your interpretation of this estimate. This judgement is done by comparing the estimated size against the estimated noise one should reasonably expect (which is usually also estimated from the observed data). And this is statistical significance, eventually. No need to stick with p-values - we may use transformations thereof (like S-values), or intermediate statistics (like likelihood ratios), or even informal ways (like 6-sigma rules or whatever). These are all different ways to judge the very same thing: the statistical significance of the observed data. They are no alternatives to the actual process of (and need for) looking at and judging the statistical significance.
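For reference, the S-value transform mentioned above is just s = -log2(p), the number of bits of information against the null:

```python
import math

def s_value(p):
    """Shannon surprisal of a p-value, in bits."""
    return -math.log2(p)

print(s_value(0.05), s_value(0.005))  # about 4.3 and 7.6 bits
```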
  • asked a question related to Hypothesis Testing
Question
6 answers
I want to test the impact of a process change on an operational metric, which is a continuous variable. I have two data sets, pre-test and post-test. Both data sets represent the entire population of events that occurred during the specified time periods, and both have a population size of > 5,000. I want to know if there was a positive or negative change following the intervention and whether that change is statistically significant.
My intuition is to apply a two-tailed z-test, however this particular metric is reported using its 90th percentile rather than its mean. A z-test for proportion doesn't seem to fit either. Essentially, I want to know if a change in the 90th percentile was statistically significant.
Relevant answer
Answer
You would do better to use non-parametric tests. Visually, a Q-Q plot is insightful. I think what has to be tested is the change in the probability of the initial interval. Is the change statistically significant?
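One nonparametric route in this spirit is a bootstrap interval for the change in the 90th percentile itself; a sketch on invented pre/post data (with populations of >5,000 events, even a modest number of resamples is informative):

```python
import numpy as np

rng = np.random.default_rng(9)
pre = rng.lognormal(1.0, 0.5, 5000)    # pre-intervention metric (invented)
post = rng.lognormal(1.0, 0.45, 5000)  # post-intervention metric (invented)

obs_diff = np.percentile(post, 90) - np.percentile(pre, 90)

# Bootstrap the difference in 90th percentiles
boot = np.empty(2000)
for i in range(2000):
    boot[i] = (np.percentile(rng.choice(post, post.size), 90)
               - np.percentile(rng.choice(pre, pre.size), 90))

lo, hi = np.percentile(boot, [2.5, 97.5])  # 95% percentile interval
print(obs_diff, (lo, hi))  # significant at the 5% level if the interval excludes 0
```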
  • asked a question related to Hypothesis Testing
Question
3 answers
I have a dataset with an unusual configuration and was hoping for guidance on choosing a method to test for changes in mean.
For context, this is operational data tracking vehicle arrivals at 10 physical sites, and measures relative site volume using a ratio of daily arrivals in relation to the site's capacity. Variation among the 10 ratios is calculated for each day (coefficient of variation).
A program was started that changes the conditions under which vehicles determine which site they drive to (goal is to reduce variation in above described ratios). There have been three different changes in conditions, yet the time lengths they were implemented for were all different:
Baseline - 30 days (cannot be extended)
Phase 1 - 78 days
Phase 2 - 116 days
Phase 3 - 87 days
I'm being asked to determine whether there were significant changes in the mean variation during each phase compared to Baseline. Since I'm testing the same group (the entire vehicle/site system) under three different conditions, I believe three separate paired t-tests would be appropriate. However, I know the sample size must be identical for each pair. Generating a proportional random sample would still give me different sample sizes (obviously). My question is whether it's acceptable to choose a constant number of days to sample from each phase (e.g. 15 days of ratio variations) or if there would be a more appropriate test to use?
Relevant answer
Answer
Yes, a mixed model would be appropriate (time and condition as fixed and site as a random factor). And since your dependent variable consists of counts, this should be a negative binomial model in which the site's capacity is used as an offset. The model should include the time:condition interaction, which gives you the difference in time course between the conditions.
  • asked a question related to Hypothesis Testing
Question
7 answers
Hello all! I am new to applying statistical analyses to research problems. I need some help regarding choosing the correct statistical test to analyze my experiments:
I am trying to measure the concentration of a metabolite that cells secrete in response to particular compounds I treat them with, and see if the compounds affect metabolite secretion.
1) I am trying to see how the mean concentration of the metabolite differs between the four groups : control, treatment 1 alone, treatment 2 alone and treatment 1 and 2 in combination. (treatment 1 refers to when I treat my cells with compound 1, treatment 2 refers to when I treat cells with compound 2.)
2) I am trying to see whether the mean metabolite concentration differs between control and treatment 1 at four different time points: 12hours, 24 hours, 36 hours and 48 hours post-treatment. At the same time, I am also comparing the mean metabolite concentration when I give treatment 1 at 12 hours with the metabolite concentration at 24 hours with the concentration at 36 hours with the concentration at 48 hours post-treatment.
Any help is appreciated!
Relevant answer
Answer
You might want to look at the comparison of absolute concentrations to enzyme binding-site affinities for substrates and for inhibitors. Enzyme active sites are in general more saturated than inhibitor sites (p-values are from the Kolmogorov-Smirnov test). Otherwise, look at GEMpress and the Shortest-Path Method.
Good luck.
  • asked a question related to Hypothesis Testing
Question
5 answers
Hello, I am looking for some advice regarding choosing a direction for hypothesis/statistical testing for a multivariate analysis. I am interested in determining if there is relationship between cyanobacteria bloom frequency and wildfires. So, my dependent continuous variable is the amount of blooms detected, while I have both categorical (NLCD class) and quantitative (fire frequency, wind speed, temperature, etc) independent variables. I am now planning to do a PCA in order to reduce the dimensionality of all these variables, which may lead me to do a multiple regression.
I was wondering if there is another hypothesis test that I am missing for multivariate data whose independent variables may be related to one another (e.g., fire frequency and area burned may be related). Someone has suggested that I look at a PERMANOVA, but I am not sure if that is the other route I could take.
I am a novice in statistics, so I apologize if I said something incorrect and would appreciate any suggestions/advice.
Relevant answer
Answer
James R Knaub Thank you so much for the thorough answer, and I did look into the linked paper. Very informative and your help was much appreciated!
Austin Pearce Really cool link, will help a lot for any statistical analysis. Yeah when I did some simple regression relationships between fire freq. and burned area there appeared a limiting factor, which led to me having to integrate so many factors.
Andrew Paul McKenzie Pegman Thank you for the suggestion! Yeah I felt that the regression seemed almost redundant at that point, but wanted an outside opinion, much appreciated :)
  • asked a question related to Hypothesis Testing
Question
6 answers
I have two methods for doing Monte Carlo simulations. With both of them I have run several simulations and obtained the mean and variance of their results. I would like to determine whether the two methods are equivalent. How can I compare them, since hypothesis tests like Student's t-test only determine that there is not strong evidence to refute the null hypothesis?
Relevant answer
Answer
Null-hypothesis significance tests (NHST), such as t-tests, are only able to provide evidence against the null hypothesis. Therefore, a non-significant NHST cannot be interpreted as evidence against the presence of an effect, which would be equivalent to an interpretation in favor of the null hypothesis.
I think possible solutions might come from the area of equivalence testing, i.e., evidence that two values are practically equivalent.
This can be done by either using frequentist approaches (Lakens, D. (2017). Equivalence Tests: A Practical Primer for t Tests, Correlations, and Meta-Analyses. Social Psychological and Personality Science, 8(4), 355–362. https://doi.org/10/gbf8nt ) or Bayesian approaches (Kruschke, J. K., & Liddell, T. M. (2018). The Bayesian New Statistics: Hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective. Psychonomic Bulletin & Review, 25(1), 178–206. https://doi.org/10/gc3gmn).
Also from the Bayesian perspective comes the idea of the Bayes factor, which expresses the likelihood of a hypothesis (the simulation results from both MC simulations are equal) given the data (Lakens, D., McLatchie, N., Isager, P. M., Scheel, A. M., & Dienes, Z. (2018). Improving Inferences About Null Effects With Bayes Factors and Equivalence Tests. The Journals of Gerontology: Series B. https://doi.org/10/gdk2z5). Not to be confused with the likelihood of a value occurring given a certain hypothesis, as in NHST.
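A frequentist TOST along the lines of Lakens (2017), using two one-sided Welch t-tests; the equivalence margin of ±0.2 and the simulated results are illustrative assumptions:

```python
import numpy as np
from scipy import stats

def tost_two_means(a, b, delta):
    """TOST: reject |mean(a) - mean(b)| >= delta in favour of equivalence."""
    va, vb = a.var(ddof=1) / a.size, b.var(ddof=1) / b.size
    se = np.sqrt(va + vb)
    # Welch-Satterthwaite degrees of freedom
    df = (va + vb) ** 2 / (va**2 / (a.size - 1) + vb**2 / (b.size - 1))
    d = a.mean() - b.mean()
    p1 = stats.t.sf((d + delta) / se, df)   # H0: diff <= -delta
    p2 = stats.t.cdf((d - delta) / se, df)  # H0: diff >= +delta
    return d, max(p1, p2)                   # TOST p-value

rng = np.random.default_rng(2)
a = rng.normal(10.0, 1.0, 400)  # results from MC method 1 (invented)
b = rng.normal(10.0, 1.0, 400)  # results from MC method 2 (invented)
d, p = tost_two_means(a, b, delta=0.2)
print(d, p)  # p < alpha -> the means are equivalent within +-0.2
```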
Bests,
Sven