Quantitative Data Analysis - Science topic

Explore the latest questions and answers in Quantitative Data Analysis, and find Quantitative Data Analysis experts.
Questions related to Quantitative Data Analysis
  • asked a question related to Quantitative Data Analysis
Question
3 answers
I am reviewing a paper for a statistics class. In it, the authors combine, for example, the Cohen Perceived Stress Scale (0-24) with the Epworth Sleepiness Scale (14-70) to give a final scale (14-94) in order to assess general perceived stress. I wanted to know whether these types of scales can be combined in this way.
I suppose it is not a good idea, since they don't even have the same min/max values, meaning each point carries more weight in the first scale (with only 24 max points) than in the Epworth scale (70 max points). If combined, someone who scored 24/24 and 20/70 would be scored the same way as someone who scored 0/24 and 44/70.
Relevant answer
Answer
Hello Paolo,
On the face of it, the practice doesn't sound like a good idea. Minimally, you'd want to do some sort of prior work to demonstrate that both instruments are functioning as valid measures of the purported construct (general perceived stress) before choosing to combine their scores.
The arithmetic underlying a simple sum of the two instrument scores is such that, potential score ranges notwithstanding, the actual weight of each instrument towards the combined score is proportional to its standard deviation (SD).
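To make the SD point concrete, here is a minimal R sketch with simulated scores (ranges taken from the question; all numbers hypothetical):
set.seed(1)
pss <- round(runif(100, 0, 24))    # stress scale scored 0-24
ess <- round(runif(100, 14, 70))   # sleepiness scale scored 14-70
raw_sum <- pss + ess               # raw sum: each scale's weight follows its SD
z_sum <- scale(pss) + scale(ess)   # z-scoring first gives the scales equal weight
sd(pss); sd(ess)                   # the unequal SDs that drive the unequal weighting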
So, without some supporting evidence, I would be skeptical of this practice.
Good luck with your review.
  • asked a question related to Quantitative Data Analysis
Question
12 answers
Hello,
Thank you for reading my question; I will appreciate any help. I am doing my master's dissertation, in which I am investigating gender representation in textbooks. My method is content analysis, and I will collect data on the genders of poets, scientists, authors, leaders and so on. I will also look at the appearance of each gender in various areas. The data will be gathered from texts and pictures, and I will present it (after analysing) in charts and diagrams.
My question is: is this process considered qualitative, quantitative or mixed methods? And is the data qualitative or quantitative?
Thank you
Fatema
Relevant answer
Answer
Mixed methods requires two sets of results, one qualitative and one quantitative, as well as the integration of those sets of results. A straightforward content analysis such as you propose would not meet this definition.
Instead, it sounds like you want to do a quantitative content analysis (i.e., based on counting things), but you want this analysis to be essentially descriptive (i.e., you will not be testing any hypotheses).
  • asked a question related to Quantitative Data Analysis
Question
4 answers
Hi Everyone
Can anyone provide references of a thesis or article where systematic literature review is used to identify gaps followed by a quantitative data analysis method to fill the gap?
Relevant answer
Answer
Saima Munawar Yes, a systematic review can serve as your Ph.D. dissertation, but be cautious about what you wish for. During your Ph.D. studies, you can either become a specialist in systematic reviewing or learn additional quantitative and qualitative study designs alongside it.
Systematic reviews (SRs) have been advocated as a research approach suitable for a graduate research thesis.
  • asked a question related to Quantitative Data Analysis
Question
7 answers
I am currently attending a PhD program in innovative Urban Leadership, and I am interested in how and in what ways church leadership has responded to the pandemic of the past two years. I want to learn how most leaders responded to the scenario created by the COVID-19 pandemic. What innovative leadership techniques have been employed by church pastors, and how were those techniques, principles and values employed to address the dire situation of churchgoers? What models have been employed to address the holistic needs of church members? How did church leadership overcome its own vulnerability in this dire situation while on the front line of fighting the pandemic? What learning have most institutions generated, and how do we capture those lessons and use them in the future when similar incidents happen?
Relevant answer
Answer
SPSS and SEM could be useful for quantitative research. In contrast, a qualitative approach could employ NVivo software.
  • asked a question related to Quantitative Data Analysis
Question
3 answers
I have six kinds of compounds, which I tested for antioxidant activity using the DPPH assay and for anticancer activity on five types of cell lines, so I have two groups of data:
1. Antioxidant activity data
2. Anticancer activity (5 types of cancer cell line)
Each dataset consists of 3 replications. Which correlation test is the most appropriate to determine whether there is a relationship between the two activities?
Relevant answer
Answer
Just do logistic regression is what I had in mind. The DV might be anticancer activity (yes/no), and the same for antioxidant activity. Best wishes, David Booth
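A minimal R sketch of the suggested logistic regression, with hypothetical dichotomized activities (all values and names invented):
anticancer  <- c(1, 1, 0, 1, 0, 0)        # active (1) / inactive (0), one value per compound
antioxidant <- c(55, 62, 20, 70, 15, 25)  # e.g. % DPPH inhibition
fit <- glm(anticancer ~ antioxidant, family = binomial)
summary(fit)
Note that with only six compounds the estimates will be very unstable; this only shows the mechanics of the suggestion.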
  • asked a question related to Quantitative Data Analysis
Question
7 answers
I am particularly interested in resources on the best ways to learn to interpret datasets and numeric evidence, and correlations between variables. Thank you!
Relevant answer
Answer
Try Statistics for People Who (Think They) Hate Statistics by Neil Salkind & Bruce Frey. It's a great intro to statistics that I've used to help quite a few students struggling in statistics.
  • asked a question related to Quantitative Data Analysis
Question
48 answers
Dear researchers,
I have estimated internal consistency for a questionnaire with 50 items (five-point Likert) using Cronbach's alpha. Alpha was .98.
What is your interpretation?
What could the cause be, from your point of view?
Thanks for your help.
Relevant answer
Answer
Exactly, Cronbach's alpha indicates whether the items measure the same construct. The minimum acceptable value for Cronbach's alpha is about 0.70; below this value, internal consistency is considered low. Meanwhile, the maximum expected value is about 0.90; above this value, alpha is perceived as indicating redundancy or duplication among the items.
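For readers who want to inspect which items drive an alpha of .98, a minimal sketch using the psych package in R (the data frame name is hypothetical):
library(psych)
# items: a data frame with the 50 Likert items as columns
res <- alpha(items)
res$total        # overall alpha
res$alpha.drop   # alpha if each item is dropped; near-identical values suggest redundancy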
  • asked a question related to Quantitative Data Analysis
Question
4 answers
I have 10 variables (1 DV, 8 IVs and 1 moderator) in my model and a sample size of 267 observations.
I am a newbie in AMOS and want to test the interaction between the variables of this model through AMOS. The measurement model estimates are quite good. The results of the Hayes PROCESS model in SPSS for moderation are also fine. However, when I run the "Model Fit Measures" plugin or calculate estimates for the interaction, the error "RMSEA is not defined" is shown. May I know the reason and the solution, please?
Note: The standardized values of the IVs and moderator were used for the interaction.
Relevant answer
Answer
Nice answer, Mr. David Morse, thanks. Your link is very good. Mr. Muhammad, follow these instructions.
  • asked a question related to Quantitative Data Analysis
Question
4 answers
Hi everyone,
I am having trouble measuring correlation in my dissertation study.
In two questions, I asked my respondents "How many times have you been to ..." and "How familiar are you with the music at ...".
I am trying to establish the relationship between those who have frequented a store most often and those who expressed high familiarity with the music; they are the same group of respondents. Will a one-sample t-test in SPSS be sufficient to test this relationship?
Hope to get all of your expertise on this!!
Relevant answer
Answer
Assuming your two questions yield categorical data on the same respondents, you could go for Pearson Chi-Square. You might check out the following for insights into carrying out Pearson Chi-Square on SPSS.
Lund Research Ltd. (2018). Chi-Square test for association using SPSS statistics. SPSS Statistics Tutorials and Statistical Guides | Laerd Statistics. https://statistics.laerd.com/spss-tutorials/chi-square-test-for-association-using-spss-statistics.php
Good luck,
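A minimal R sketch of the suggested test, assuming both responses are binned into categories (variable names hypothetical):
tab <- table(visits, familiarity)  # cross-tabulate the two categorical responses
chisq.test(tab)                    # Pearson chi-square test of association
If both responses are ordinal rather than nominal, a rank correlation such as cor.test(visits, familiarity, method = "spearman") preserves the ordering.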
  • asked a question related to Quantitative Data Analysis
Question
8 answers
Which programs can be used for structural equation modeling?
I used AMOS previously, but the trial period has run out, so are there any other programs available for use that are free of charge?
thanks
Relevant answer
Answer
JASP is one of the newest packages, and it is free. If you need any assistance, you can contact me.
  • asked a question related to Quantitative Data Analysis
Question
8 answers
How can I validate a questionnaire for hospitals' senior managers?
Hello everyone
-I performed a systematic review for the strategic KPIs that are most used and important worldwide.
-Then, I developed a questionnaire in which I asked the senior managers at 15 hospitals to rate these items based on their importance and their performance at that hospital on a scale of 0-10 (Quantitative data).
-The sample size is 30 because the population is small (however, it is an important one to my research).
-How can I perform construct validation for the 46 items, especially since EFA and CFA will not be suitable for such a small sample?
-These items can be classified into 6 components based on the literature (such as financial, managerial, customer, etc.).
-Bootstrapping in validation was not recommended.
-I found a good article with a close idea but they only performed face and content validity:
Ravaghi H, Heidarpour P, Mohseni M, Rafiei S. Senior managers’ viewpoints toward challenges of implementing clinical governance: a national study in Iran. International Journal of Health Policy and Management 2013; 1: 295–299.
-Do you recommend using EFA for each component separately, which would contain around 5-9 items, treating each as a separate scale and defining its sub-components? (I tried this option and it gave good results and sample adequacy.) I am not sure whether this is acceptable to do. If you can think of other options, I would be thankful if you could enlighten me.
Relevant answer
Answer
This can be done after the survey is completed, but it is better to increase the number of samples studied so that the result will be better.
  • asked a question related to Quantitative Data Analysis
Question
10 answers
Hi everyone,
I am trying to run statistical analysis on two datasets: two researchers have each independently measured a quantitative, continuous variable (i.e. wound closure rate) from the same population. What statistical tests would be most appropriate to test the variance between the two researchers' measurements, and to determine if it is okay to combine the two datasets?
What do you think of the following method:
1. Testing both datasets for normality
2. Testing for a significant difference between the means of the two datasets (Student's t-test for parametric data, Mann-Whitney U test for non-parametric data)
3. Testing for a significant difference between the variances of the two datasets (F-test for equality of two variances for parametric data, Levene's test for non-parametric data)
4. If means and variance are not significantly different, the two datasets can be combined
Any comments and suggestions would be much appreciated!
Relevant answer
Answer
A t-test would be appropriate.
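For what it is worth, a minimal R sketch of the steps the question proposes (r1 and r2 are the two researchers' measurement vectors; names hypothetical):
shapiro.test(r1); shapiro.test(r2)  # step 1: normality of each dataset
t.test(r1, r2)                      # step 2: difference in means (Welch's t-test)
wilcox.test(r1, r2)                 # step 2, non-parametric alternative
var.test(r1, r2)                    # step 3: F-test of equal variances (assumes normality)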
  • asked a question related to Quantitative Data Analysis
Question
10 answers
GLM is an advanced quantitative data analysis technique, but I could not find enough materials for SPSS. Please suggest some good resources for teaching and learning GLM using SPSS.
Relevant answer
Answer
I agree with Daniel Wright, the package is secondary. This may help you:
Generalized Linear Models With Examples in R
Peter K. Dunn, Gordon K. Smyth
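In the spirit of that book, a minimal R sketch of two common GLMs (data frame and variable names hypothetical); if I recall correctly, SPSS exposes the same family of models under Analyze > Generalized Linear Models:
fit_pois <- glm(count ~ treatment, family = poisson, data = dat)  # count outcome
fit_bin  <- glm(success ~ dose, family = binomial, data = dat)    # binary outcome
summary(fit_pois)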
  • asked a question related to Quantitative Data Analysis
Question
7 answers
Hi,
I am doing my dissertation on the effects of complaints on sonographers in obstetric ultrasound.
I am doing a survey as part of a mixed-methods design, specifically a convergent design (questionnaire/data validation variant). I was advised to use descriptive statistics only for the quantitative data analysis, but I cannot find any justification for this. Is this acceptable? Creswell seems to suggest I should be using inferential statistics as well.
I know it is standard for surveys to be used for quantitative data only, but I have done a lot of work on the justification for using one in a mixed-methods study.
Also is thematic analysis standard in this type of study for the qualitative data analysis?
Many thanks
Gina
Relevant answer
Answer
Without seeing your questionnaire and research question/hypothesis it is difficult to answer your question. However, if you are simply wanting to find out the impact of a complaint on sonographers and are not testing a hypothesis or aiming to generalise your findings (depending on the level of study) I think it is acceptable to use descriptive stats. Also, thematic analysis is often utilised for qualitative survey data but whether it is appropriate for your study depends on the research paradigm. Hope this helps and good luck with your dissertation.
  • asked a question related to Quantitative Data Analysis
Question
9 answers
Hello everyone,
I am struggling with quantitative data analysis, as I have not done it before. My questionnaire is about electric vehicles, and I am looking at the key factors that influence people's willingness to adopt electric vehicles. So, consumer behaviour in short.
I use SurveyMonkey to collect data; therefore, it is easy to extract data into Excel: figures, percentages, key words, etc. What I need your help with is:
1. Which method should I use to analyse the data? (Best if it is the easiest one :)
2. Can I indeed just analyse it by myself? This might be silly, but SurveyMonkey gives everything I need: percentages, key words, etc. Can I, for instance, give a general overview of percentages, responses, etc., and then start analysing and discussing question by question?
I am super lost and any help is appreciated! :)
Many thanks
Relevant answer
Answer
Mr Banerjee, we can assign approximate scores to open-ended questions on the basis of preset criteria, much as exam questions are evaluated to produce marks.
  • asked a question related to Quantitative Data Analysis
Question
3 answers
Hello,
In relation to another recent, as yet unanswered, post (https://www.researchgate.net/post/How_to_calculate_comparable_effect_sizes_from_both_t-test_and_ANOVA_statistics), I am wondering how I can calculate the sample variance of an effect size, in order for me to then infer the confidence intervals.
So far, I have been calculating Cohen's d effect sizes from studies' experiments, using the t-value of two-sample t-tests and the sample size (n) for each group. I then convert the Cohen's d into an unbiased Hedges' g effect size.
I understand that, normally, in order to calculate the sample variance, I would also need to know the means of the two groups and their standard deviations. However, these are not reported in most studies when a t-test is presented. Is there any way I can calculate the sample variance of my Hedges' g effect sizes without this information?
Many thanks,
Alex
Relevant answer
Answer
Florina Erbeli, can this method also be used for within-participant designs when you do not have access to the r statistic, please? I expect not, as it would then suggest double the sample size. I am struggling to find a way to calculate the sample variance with only a t statistic, sample size and effect size.
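For reference, the large-sample approximations given in standard meta-analysis texts (e.g. Borenstein et al., Introduction to Meta-Analysis) need only the group sizes and d itself:
Var(d) = (n1 + n2)/(n1*n2) + d^2 / (2*(n1 + n2))
and, with the correction factor J = 1 - 3/(4*(n1 + n2 - 2) - 1),
g = J*d, Var(g) = J^2 * Var(d).
For within-participant (matched) designs the variance does depend on the pre-post correlation r, e.g. Var(d) = (1/n + d^2/(2*n)) * 2*(1 - r), so it cannot be recovered from the t statistic and sample size alone.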
  • asked a question related to Quantitative Data Analysis
Question
7 answers
Dear all,
I am conducting my dissertation, which has a quantitative data analysis process, and, as I have never done it before, I need your help to understand the needed steps.
The basic model will require three separate regressions to confirm the relationships between constructs. The data were collected with a Likert-style questionnaire where multiple questions measure each variable.
Before I start running the regressions, which steps should I implement?
I believe I need to assess reliability; do I test Cronbach's alpha for each question or for the variables altogether?
Similarly for the regression, do I take the Likert average for each variable, or do I test each question separately?
Thank you for your help! If anyone were also available for a video call, it would be an amazing support.
Elena
Relevant answer
Answer
Hi Kenneth W. Cromer, great, thank you very much for confirming this! Now back to the drawing board to reverse the steps; good thing I saved the initial data set!
  • asked a question related to Quantitative Data Analysis
Question
5 answers
I have applied the Yoon-Nelson kinetic model to experimental data obtained for CO2 adsorption on solid adsorbents. For the two materials studied, the experimental values of adsorption capacity were found to be 5.9% and 2.3% higher than the empirical (model-calculated) values. However, the R² values were 0.998 and 0.985, respectively. Now a journal reviewer is expressing disagreement with the values, pointing out that the experimental values are greater than the empirical values. I feel that the high R² values and the closeness of the experimental and empirical values are sufficient to show that the model fits well, as it is only empirical, not theoretical. Please comment on this.
Relevant answer
Answer
First, the calculated (not empirical) value can be greater than the experimental value, but it can also be less than or equal to it. Secondly, the value of R² applies to the entire range of changes in the adsorption capacity, not just the end value. Third, ARE (average relative error) can be used in parallel with R². It would also be good to know which form of the Y-N equation you have been using: explicit with respect to q, or implicit.
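For reference, the Yoon-Nelson model is usually written as
C_t/C_0 = 1 / (1 + exp(k_YN * (tau - t))),
or in linearized form ln(C_t/(C_0 - C_t)) = k_YN*t - tau*k_YN, where tau is the time required for 50% breakthrough. Which form was fitted (and whether R² was computed on the linearized or the nonlinear form) affects how the R² values should be read.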
  • asked a question related to Quantitative Data Analysis
Question
11 answers
Hello everyone,
I am struggling with quantitative data analysis, as I have not done it before. My questionnaire is about electric vehicles, and I am looking at the key factors that influence people's willingness to adopt electric vehicles. So, consumer behaviour in short.
I use SurveyMonkey to collect data; therefore, it is easy to extract data into Excel: figures, percentages, key words, etc. What I need your help with is:
1. Which method should I use to analyse the data? (Best if it is the easiest one :)
2. Can I indeed just analyse it by myself? This might be silly, but SurveyMonkey gives everything I need: percentages, key words, etc. Can I, for instance, give a general overview of percentages, responses, etc., and then start analysing and discussing question by question?
I am super lost and any help is appreciated! :)
Many thanks
Relevant answer
Answer
The type of data analysis and even the software you use are highly dependent on your research questions and hypotheses. Are you going to report frequencies, or do you want to measure the correlation between two variables?
  • asked a question related to Quantitative Data Analysis
Question
6 answers
Dear Scientists
While studying a corrosion inhibitor's performance and doing Tafel fitting, a statistical value appears called "Chi Squared", with a wide range of values from very low (0.0001) to tens.
Please can you advise how I can decide, based on the "Chi Squared" value, whether my Tafel fit is correct or not?
I am using GAMRY Potentiostat. 
Thank you in advance
Aasem
Relevant answer
Answer
Dear Dr. Aasem Zeino ,
a very small chi-squared statistic means that your observed data fit your expected (fitted) data extremely well, whereas a very large chi-squared statistic means that the data do not fit well.
For more details, please see the source:
Chi-Square Statistic: How to Calculate It / Distribution by Statistics How To
My best regards, Pierluigi Traverso.
  • asked a question related to Quantitative Data Analysis
Question
7 answers
1. In a research study I am doing using SEM, there are six paths that are theoretically well established in the literature, and three paths (from the exogenous construct to three mediating constructs) whose literature is not strong. Should I still use PLS-SEM or not?
2. Using two different techniques for determining the sample size for PLS-SEM, I ended up with 146 and 147 cases, both of which are over the number of cases in the population (140). This seems a little odd to me.
The two methods are Hair et al.'s (2013) table of sample size recommendations and the gamma-exponential method by Kock and Hadaya (2016).
If PLS-SEM is the right approach, should I continue by studying the whole population (140)?
Relevant answer
Answer
In that web link I commented on an online power calculator that estimates the sample size in structural equation models from the number of latent and observed variables; hundreds of studies refer to its use.
  • asked a question related to Quantitative Data Analysis
Question
1 answer
Hi, I'm writing a research paper for a theoretical experiment.
It's a repeated-measures design. The estimated sample size is 385, consisting of participants aged 3 to 12 who would be exposed to 4 different intervention methods and then complete a Wong-Baker FACES pain scale (children under 8) or a Likert scale (children 8 and above).
What quantitative data analysis would be appropriate?
Relevant answer
Answer
I assume that there will be 4 groups of participants, with each group being exposed to one of the 4 interventions. It is not stated how many repeated measures there will be, but a common design would be for all groups to have one pre-test and one post-test. Then it would be a 4 by 2 mixed experimental design, with the first factor being between-groups and the second a within-subject factor.
If some attrition is expected, then a common procedure for analysing the results would be linear mixed modelling (LMM), with a random intercept and slope. If the dependent variable is not (sufficiently) normally distributed, then generalized LMM could be used.
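A minimal sketch of the suggested LMM using the lme4 package in R (variable names hypothetical):
library(lme4)
# pain: outcome; intervention: between-group factor; time: pre/post; id: participant
fit <- lmer(pain ~ intervention * time + (1 + time | id), data = dat)
summary(fit)
# With only two waves, a random intercept alone, (1 | id), may be all the data support;
# glmer(..., family = ...) gives the generalized LMM for non-normal outcomes.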
  • asked a question related to Quantitative Data Analysis
Question
7 answers
Hello everyone,
As part of my mixed-methods thesis, I ran a small survey about neurologists' practice in breaking bad news. I am relatively inexperienced with quantitative studies, so, although I have managed to derive all the relevant descriptive statistics, I was wondering whether I could perform any additional analyses on such a small sample.
I have tried to consult Google and several books but got a bit confused. Do I have to test normality and then perhaps perform non-parametric tests? As an example, I would like to look at whether age, experience, perceived difficulty of breaking bad news, etc., are associated with attitudes and practice.
Thank you!
Relevant answer
Answer
Hello Lefteris,
The answer depends on: (a) the specific variables involved and how they are quantified; and (b) the specific research question each analysis is intended to address. Since this is for your thesis, might I suggest that you confer with your thesis advisor and committee members (if there is a committee), since these are the folks who must approve your work?
Unfortunately, without more specific information (about a, b, above), I can't really offer specific recommendations.
Good luck with your work.
  • asked a question related to Quantitative Data Analysis
Question
1 answer
I have collected data using questionnaires from athletes at two time points. I have an independent variable (leadership) and a few dependent variables, but also some mediating or indirect variables such as trust, etc. All of the variables were measured using the same questionnaire at both time points.
What data analysis method would be best for analysing the data? So far I have used the PROCESS macro in SPSS, which uses OLS regressions, but I am unsure whether this is the best method.
I essentially want to see how the IV relates to/increases the dependent variables over time, and whether this change occurred directly from the IV or indirectly through the mediators.
Would these be appropriate research questions for the type of data I have and for the appropriate analysis technique?
Relevant answer
Answer
Hello Ella,
The "best" analysis will depend on: (a) your specific research question/s; (b) the nature of the variables that you have collected and how they are quantified; and (c) your sampling procedures. You may find that an SEM framework would be easier for expressing your hypothesized model of how the variables do or don't fit together (even though Hayes' PROCESS add-in is pretty darn versatile). But, your query isn't fully elaborated, so (as an example) I have no way to judge whether you'd be better off with a univariate vs. a multivariate approach.
It sounds as if it might be worth your while to chat with someone from your institution to help arrive at an analytic approach that would answer your specific questions (a, above), while being defensible (given b and c, above).
Good luck with your work!
  • asked a question related to Quantitative Data Analysis
Question
4 answers
Hi everyone!
In my thesis, I use partner gender (woman vs. man) as a between-subjects independent variable and priming figure (partner vs. acquaintance) as a within-subject independent variable. The dependent variable is a negative affect recovery score (range: -5 to +5).
So my study design is a 2x2 mixed design.
Also, I gave all participants the LGB Identity Scale and asked their relationship duration prior to the experiment. I planned to use the LGB Identity Scale (1-7 Likert type) and relationship duration (in months) as control variables.
The problem is that the two control variables are continuous, and what they measure is completely different from what I measure in the experiment; therefore, I cannot imagine how these covariates can remove any effect from the dependent variable.
Another question is which analysis to use.
Actually, I need to understand the logic of any analysis method in order to interpret these variables.
Thank you!
(*I use Tabachnick and Fidell's book; you can also refer me to a book page.)
Relevant answer
Answer
The outcome scores look like ordinal data; why not try a non-parametric test such as the Mann-Whitney U to compare scores between genders?
  • asked a question related to Quantitative Data Analysis
Question
1 answer
I would like to know how to interpret the result when the interaction term is significant while both of the main effects are not.
For instance, I would like to understand the direct effect of A and the moderating effect of B on dependent variable D. The results show that the direct effects of A and B are insignificant; however, the interaction term of A and B is significant.
1. In this case, should I interpret this as there being no direct effect of either A or B, yet B moderates the direct effect of A?
2. Is it okay to plot an interaction chart when there are no direct effects?
Thank you very much!
Relevant answer
Answer
Yes, you would be right to report the absence of direct effects. The relationship follows a chain effect for significance to be achieved.
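Worth adding: a crossover interaction can produce exactly this pattern, since simple slopes of opposite sign cancel out in the main effects. A minimal R sketch for fitting and probing it (names hypothetical):
fit <- lm(D ~ A * B, data = dat)
summary(fit)
# model-implied lines of D on A at low vs. high B reveal any crossover
Bq <- quantile(dat$B, c(.25, .75))
newd <- expand.grid(A = seq(min(dat$A), max(dat$A), length.out = 50), B = Bq)
newd$Dhat <- predict(fit, newdata = newd)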
  • asked a question related to Quantitative Data Analysis
Question
2 answers
Hello,
I am currently undertaking a meta-analysis of previous studies that harness pharmacological agents to ameliorate associative fear memories in rodents. The concept of effect sizes is fundamental to meta-analyses, but I am not so familiar with it. As a result, I have a (perhaps) elementary question related to the calculation of effect sizes:
I have decided that Hedges' g will be the most appropriate metric of effect size for my analysis, given the relatively small sample sizes of my included studies. Often, authors will report an observation of interest (i.e. a treatment vs. control effect) in the form of a (two-sample) t-test. I have found that the reported values of a t-test can easily be used to calculate a Hedges' g effect size - all good so far.
However, an observation of interest is often reported as an ANOVA (e.g. when the treatment group is compared with more than one control type), and other meta-analyses I have read seem to derive their effect sizes from both t-test and ANOVA statistics, depending on which was carried out. Again, I think I have found the correct equation to derive a Hedges' g effect size from the reported F-value of an ANOVA. However, I am struggling to understand how effect sizes can be comparable when they are derived from both t-test and ANOVA statistics. Surely ANOVA-derived effect sizes are less informative, because you cannot tease out the individual contrasts of interest from an F value, as you obviously can for a two-sample t-test.
As a result, my initial instinct would be to request the raw data from the authors (when ANOVAs have been reported), calculate a t-test for the specific contrast of interest (i.e. treatment vs. one particular control group) and calculate the Hedges' g effect size from the resulting t-test value.
Apologies for the long question, but I would hugely appreciate someone to enlighten me, and show me how to conflate t-tests and ANOVAs when generating effect sizes for a meta-analysis.
Many thanks!
Alex Nagle
Relevant answer
You cannot tear ANOVAs apart to perform individual t-tests, because you would remove other interactions. The post hoc tests of an ANOVA are comparable with t-tests. You can directly compare t and ANOVA statistics because, for a two-group comparison, they are essentially the same thing.
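To make that last point explicit: for a two-group contrast, the ANOVA F with 1 numerator degree of freedom and the two-sample t are related by
F(1, df) = t(df)^2, i.e. t = sqrt(F),
so converting F to an effect size is only safe when the F comes from a two-group (1-df) comparison. An omnibus F over three or more groups cannot be decomposed back into a single contrast of interest, which is why requesting the raw data (or group means and SDs) for the specific contrast, as proposed, is the cleaner route.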
  • asked a question related to Quantitative Data Analysis
Question
3 answers
I have collected data using questionnaires from athletes at two time points. I have an independent variable (leadership) and a few dependent variables but also some mediating or indirect variables such as trust ect. All of the variables were measured using the same questionnaire at two time points.
What data analysis method would be best to analyse the data? So far I have used the Process macro on SPSS which uses OLS regressions but I am unsure if this is the best method.
I essentially want to see how the IV relates/ increases the dependant variables over time and whether this changed occurred directly from the IV and indirectly through the mediators. Would these be appropriate research questions for the type of data I have and for the appropriate analysis technique?
Relevant answer
Answer
Hello Ela,
a) learn SEM--for instance with the lavaan-package (lavaan.org)
b) For your research question, autoregressive models would be a possibility in which you can model lagged or synchronous direct and indirect effects.
Here is an applied example. If this fits what you want, I'll give you further papers.
Frese, M., Garst, H., & Fay, D. (2007). Making things happen: Reciprocal relationships between work characteristics and personal initiative in a four-wave longitudinal structural equation model. Journal of Applied Psychology, 92(4), 1084-1102.
(don't be distracted that they used 4 waves)
Best,
Holger
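If it helps, a minimal two-wave sketch in lavaan (all variable names hypothetical; leader1/trust1/perf1 and trust2/perf2 are the Time 1 and Time 2 measures):
library(lavaan)
model <- '
  trust2 ~ trust1 + a * leader1               # stability plus lagged effect of leadership
  perf2  ~ perf1 + b * trust2 + c * leader1   # synchronous mediation via Time 2 trust
  indirect := a * b                           # defined indirect effect
'
fit <- sem(model, data = dat, se = "bootstrap")
summary(fit, standardized = TRUE)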
  • asked a question related to Quantitative Data Analysis
Question
11 answers
So I'm going crazy trying to analyze three online newspapers. My paper has a very small scope, so I decided to study only 10 articles per newspaper. The reason I want to analyze so few is that I am also going to take into consideration the comments under the articles, to figure out what kinds of reactions they generate directly. The problems here concern reliability and validity, which I struggle to demonstrate; for example, can I just say that the selection is random? I have selected specific dates, from April to June 2014, and keywords such as "EP elections", "immigration", "economic crisis" and "austerity". Another problem is that I couldn't find any other paper that dealt with the same type of data. If anyone could provide some sources or help out in any way, it would be greatly appreciated. Thanks in advance.
Relevant answer
Answer
If your content analysis is qualitative, I would suggest that you use qualitative terms for quality instead of validity and reliability: for example, the four criteria suggested by Lincoln & Guba (credibility, dependability, confirmability, and transferability; they later added a fifth, authenticity, in 1994). They are very much used in qualitative research.
Lincoln YS, Guba EG. Naturalistic Inquiry. Beverly Hills, CA: Sage; 1985.
  • asked a question related to Quantitative Data Analysis
Question
5 answers
Hello everyone,
I am trying to statistically analyze whether data from 3 thermometers differs significantly. At the moment, because of COVID-19, several control points have come up at the company for which I work. We have been using infrared thermometers to check up on people and to be aware if they have a fever or not. However, we don't own a control thermometer with which we could easily calibrate our equipment, we thought that using a statistical test would be helpful, but at this point, we are lost.
Normally, we would compare our data to our control thermometer and that would be it. Our other thermometers are allowed to have a difference of +-1°C at max when we compare them to their controls; we can't do that now.
What I have been doing is collecting 5 to 10 measurements from each thermometer and comparing them through an ANOVA test, then assessing the results (when needed) by running Fisher's Least Significant Difference test. I don't know whether it is right to do so, because sometimes the data I collect do not seem to vary much (the mean difference is NEVER greater than +-1°C), and even so the test concludes that they differ significantly.
What would be right here? We don't want to work with the wrong kind of equipment or put away operating thermometers without a solid reason; we just want to do what's best for our people.
Could you guys please help me?
Relevant answer
Answer
The question is one of informed reliability and instrument validation testing. If you think about it, the question is not that complex, and applying ANOVA would just confuse matters.
  • You have three thermometers, and you don't know whether any of them performs entirely accurately.
  • You do not know if thermometers provide precise measurements upon retesting - within re-test consistency.
  • You don't know if the thermometers perform in a similar fashion in comparison to each other - between method reliability.
The only way to honestly solve your problem is to set an unstandardized tolerance cutoff that can you would then use to determine if it makes sense to continue using these instruments. Cutoff could be ±1 degree C, just as an arbitrary example.
My approach:
  • Test each instrument many times on the same standard.
  • My favorite statistic is the RANGE. Do any thermometers perform outside of your cutoff? If so, then somebody may be incorrectly told they have COVID, so it is best not to use that instrument. It really depends on your context.
  • There are many wonderful descriptive and reliability analyses that can be done in your situation. Get the median absolute deviation (MAD) of each thermometer and decide whether the dispersion is acceptable.
  • Check histograms of each and look for skew. If it exists, is it acceptable, given the direction?
  • Plot the three measurements in a modified Bland-Altman plot, with the average of the three on the x-axis and sorted temperatures on the y-axis. Preferably you know the temperature reference, so draw a horizontal line in the plot at that point; otherwise use the mean.
  • Visually check the residuals vs the reference. Maybe get a root mean square error of Y values vs reference.
  • You may want to repeat the process with another subject or temperature range.
  • In the end, make an informed value judgment about the precision and probable validity of the instruments. If the values are too far apart, then you cannot use the thermometers.
This is really the only way to make these kinds of decisions, in my opinion. I am open to discussing other reliability analyses, but be wary if others suggest repeated-measures ANOVA, two one-sided t-tests, or even Cronbach's alpha. Your goal is to get the most representative assessment of the situation, not to p-hack your way to safety!
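A minimal R sketch of the range/MAD checks above (temps is a hypothetical matrix: rows are repeated readings of the same standard, one column per thermometer):
apply(temps, 2, range)  # does any thermometer stray beyond the +-1 degree C tolerance?
apply(temps, 2, mad)    # median absolute deviation: is the dispersion acceptable?
hist(temps[, 1])        # repeat per column to eyeball skew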
  • asked a question related to Quantitative Data Analysis
Question
6 answers
I am currently conducting a systematic review which uses the Downs & Black checklist for measuring quality.
I am having trouble with Question 27, the power question, as I am not sure how to go about doing my calculations, even with the recommended guidelines.
Does anybody have any suggestions, or is anyone able to talk me through it? I am quite lost.
Also, are there any articles which state that this checklist remains valid even after Question 27 has been modified? Please advise.
Thanks
Relevant answer
Answer
I just came across this question. I have the answer from Dr. Black, in a personal communication to me of 07-25-2018, in which he sent me an MS Word doc. Dr. Nick Black is still involved with quality-of-health-services research issues and has great material out there that I like, along with this checklist that I really like; this seems to have been a one-time interest for "Downs"; "Black" is the person on task for this.
I am not too satisfied by this answer and I have adopted a work-around, alternate approach for scoring item 27 similar to what others have adopted: to give the point if a defensible power analysis has been done before the study was conducted. [Don't ask what "defensible" is. That is a long discussion - just ask my advisees.]
I am going to try to post a pdf of this - pdf so it is in some relatively immutable form. I am posting to help OP or anyone else. For reference, the "properties" of that MS Word doc say "Authors:" = "padmnbla" and "Company" says "LSHTM" and it says "last printed" on 7/14/2004.
Essentially, the scoring levels are for study "power" (1 minus beta, or 1 minus the Type II error rate). The higher this value is in percent, the more powerful a study is: the less likely you are to make a Type II error, i.e. to fail to reject a false hypothesis. [Type I error is alpha, AKA the p-level, the chance of rejecting a true hypothesis.]
Conventionally, a typical "power level," a typical 1-Beta level, is 80%. Downs and Black say to give points according to increasingly stringent power levels: 1 pt for a power level of 70%, 2 pts for power level of 80%, 3 pts for power level of 85%, 4 pts for power level of 90%, 5 pts for power level of 95%, and 6 pts for power level of 99%.
How do you know power level?
Apparently, you must calculate this yourself. This includes determining a clinically meaningful difference on the outcome of interest. In their note, this is at point 1, where 50% or 60% is mentioned.
You will have to consult other sources to grasp the task of determining a clinically meaningful difference. This is very subjective and situation-specific.
Having been trained to calculate power analyses, and having done several - more than 5, maybe a dozen - I can say that each is unique, with a different specific formula depending on the nature of the sample, the measure, and the analysis (power analysis for a survival analysis is different from power analysis for a logistic regression, etc.). Having also done study quality review in many forms, I believe it is too much for a study quality rater, for the applications we typically have, to conduct a power analysis of all of the studies being reviewed for study quality. To do this, a project would truly have to include a statistician or other researcher very adept at a range of power analyses.
Many major projects barely enlist said statistician for the main power analysis of the project itself. This puts a literature quality review far out of the grasp of most otherwise qualified researchers and makes it cost-prohibitive. Finally, if seeking funding for this, many funders would not even recognize the one budget-buster line item if you included consultant costs for the statistician.
Study assigning one point to D/B power analysis question:
Vasileios Korakakis, Rodney Whiteley, Alexander Tzavara, Nikolaos Malliaropoulos. The effectiveness of extracorporeal shockwave therapy in common lower limb conditions: a systematic review including quantification of patient-rated pain reduction. British Journal of Sports Medicine, Vol. 52, Issue 6, May 2018, Supplement 4.
Below, I am posting the text from the document sent to me 07-2018 by Nick Black. Cheers!
Downs and Black checklist (JECH 1998;52:377-384)
Question 27: Power
This is in essence similar to a power calculation.
1. Decide on what constitutes a clinically or socially significant difference between the two groups being compared (eg difference in desired outcome 60% versus 50% success)
2. Select a probability value for such a difference – we suggest 5% as commonly accepted value.
3. Select a range of study powers against which you want to assess papers. These are represented as A to F in Question 27. For example, A=70%, B=80%, C=85%, D=90%, E=95%, F=99%.
4. You can now determine the number of subjects that would need to be in the smallest group (though the likelihood is there will be the same number in all groups in the study in question). These are designated as n1 to n8. These can be derived from standard software for calculating sample sizes for randomised trials.
5. Now you can use Question 27 to assess the power of all the studies being assessed by applying the number of subjects in the smallest group to the table and the right-hand column gives you the value (from 0 to 5).
6. Warning: this approach may overestimate the power of non-randomised trials (prospective cohort studies) but there is no simple, alternative method available at present.
  • asked a question related to Quantitative Data Analysis
Question
15 answers
Does the indirect effect (in a simple mediation model) of your mediator only consider the IV, or also take into account the control variables in your model?
I am trying to check the robustness of the indirect effect in my simple mediation model (with 6 control variables), with the Sobel/Aroian/Goodman Test. Do I need to include my controls to check the indirect effect or not?*
*I have tried both. The indirect effect is significant as I include my controls, but insignificant as I exclude them. In my initial methodology, my indirect effect is shown to be insignificant (using bootstrapping method).
Relevant answer
Answer
I'd disagree slightly. There is no general reason to use SEM for a mediation model without latent variables. There are specific reasons why SEM approaches might be helpful by making the analysis a single step or by providing fit statistics or in fitting a wider range of models. However, a simple mediation analysis fit as a path model using multiple regression should be equivalent to the SEM formulation. There are also intermediate approaches like piecewise SEM models that can be more attractive than full SEM (if you don't have latent variables).
Generally most regression models including mediation models can be specified in equivalent ways using different modeling approaches.
  • asked a question related to Quantitative Data Analysis
Question
16 answers
Hi,
I have analysed my data using multivariate multiple regression (8 IVs, 3 DVs), and significant composite results have been found.
Before reporting my findings, I want to discuss in my results chapter (briefly) how the composite variable is created.
I have done some reading, and in the sources I have found, authors simply state that a 'weighted linear composite' or a 'linear combination of DVs' is created by SPSS (the software I am using).
They do not explain how they are weighted, and as someone relatively new to multivariate statistics, I am still unclear.
Are the composite DVs simply a mean score of the three DVs I am using, or is a more sophisticated method used on SPSS?
If the latter is true, could anyone either a) explain what this method is, or b) signpost some useful (and accessible) readings which explain the method of creating composite variables?
Many thanks,
Edward Noon
Relevant answer
Answer
OK.
MANOVA works by analysing a linear function of the DVs rather than the raw DVs. For example it could be
Ycomp = a*DV1 + b*DV2 + c*DV3
The weights a, b and c are optimised so that the variance in Ycomp explained by the grouping variable is maximised (as if in a one-way ANOVA).
This seems like a reasonable thing to do, but it has some strange consequences. Firstly, it's atheoretical, so it might not be interpretable even if the separate DVs are. Second, it optimises on the data, so it will capitalise on sampling variability and other sample characteristics; thus an identical replication using the same analysis will actually use a different composite DV. Third, standardised effect size statistics relate to the composite, so they are even harder to interpret than normal.
I'm not a fan of MANOVA for these and other reasons (e.g., it doesn't protect against Type I error in the way people assume).
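To see the composite concretely, a minimal R sketch (data frame and names hypothetical); the optimized weights correspond, up to scaling, to the discriminant coefficients:
fit <- manova(cbind(DV1, DV2, DV3) ~ group, data = dat)
summary(fit, test = "Wilks")
# The weights a, b, c above can be recovered via linear discriminant analysis:
library(MASS)
lda(group ~ DV1 + DV2 + DV3, data = dat)$scaling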
  • asked a question related to Quantitative Data Analysis
Question
16 answers
I have finished an online course on NVivo, but before starting to analyze my data with it, I wanted to know my peers' experience with any other software for qualitative data analysis.
Relevant answer
Answer
I have used MAXQDA extensively. It seems to be extremely useful for analyzing large volumes of data (e.g. a few hundred interviews, institutional field data, etc.). You can use colorful highlighting and various support instruments, and link ideas to people by location, age or sex. You can easily print all excerpts on any topic or subtopic separately. You can also easily quantify all qualitative data. I like this program enormously, even though I come from and use large-scale quantitative data, currently 29 million publications...
Take care! Marek
  • asked a question related to Quantitative Data Analysis
Question
12 answers
If the variances are not homogeneous in a one-way ANOVA and the group sample sizes are equal, which of the Dunnett's T3, Games-Howell or Dunnett's C tests is preferred to determine the source of the difference?
Relevant answer
Answer
Firstly, thank you for your answer.
I looked at your blog and I will read your article.
Regards...
  • asked a question related to Quantitative Data Analysis
Question
3 answers
I wish to collect data on the annual average concentrations of the SO2 and NO2 indicators, state-wise. The available data are city-wise. Can I take the average of all the cities within a state and use it for state-wise analysis?
Relevant answer
Answer
Belt and braces would be to take a population-weighted average of the cities, but it might be better to explain what the problem is, as there might be a better analytical solution to what you are trying to do.
  • asked a question related to Quantitative Data Analysis
Question
5 answers
How do narratives, oral histories and videography add to and enhance quantitative data analysis for efforts like this?
Relevant answer
Answer
Do you have a set of results from analyzing quantitative data? If so, you might use a mixed method approach called an "explanatory sequential design" where the follow-up qualitative methods gives you a better understanding of your quantitative results.
  • asked a question related to Quantitative Data Analysis
Question
3 answers
Dear colleagues,
I've been working on some lipid extraction analysis with QDa HPLC.
I start with biological samples that are spiked with different final concentrations of a lipid mix (consisting of a few known lipids), in order to see the minimal concentration (and the recovery %) of the lipids that can be detected. In parallel, I use a freshly prepared calibration curve (0; 0.5; 1; 2; 4; 6; 8; 10; 20).
The problem is that I am not sure about the way I analyze my data, because in some samples I see clear peaks (with intensities up to ~10^6-10^7), yet in my Excel calculations I get very low or non-existent concentrations, as if the extraction was not successful at all. For example, I have clear peaks and higher calculated concentrations for samples with a 50 ug/ml final concentration, but for samples with 100 ug/ml I only have clear peaks with higher intensity, while the calculations and recovery % show very low/non-existent values.
The way I analyze my data is the following:
1) Use calibration curve (slope&intercept) to calculate the concentration of my samples;
2) I tried to normalize for the blanks (for ex. if I have some very low signal in some of the blank samples)
3) I tried to normalize for the dilution factor (if diluted 10 times, multiple the concentration calculation by 10).
I believe there is a problem with the calibration curve because a) R = 0.95-0.98 (approximately), and b) because of the slope or intercept, I get a lot of negative values in my sample concentration calculations.
I already checked the calculations multiple times and I am using a precise automatic pipette, so I really don't understand what the problem might be. Knowing how simple Cal.curve preparation and calculation is, I feel very frustrated that I cannot get proper values every time I do the analysis.
Do you maybe have any suggestions/comment/advice on a proper calculation, on how the slope&intercept affect my concentration and the calibration curve problem?
Thanks in advance,
Margarita
Relevant answer
When you analyse your linear range and calibration range, you have to check your column. I suspect your column is not appropriate for your data.
  • asked a question related to Quantitative Data Analysis
Question
3 answers
In an exploratory study, if I want to state that certain components of counselling (7 items to assess), environmental modification (8 items) and therapeutic interventions (8 items) result in the practice of social case work, what analysis should I do?
NB: we have no items to assess the practice of social case work. Instead, we want to state that the practice of the other three components results in the practice of social case work.
Relevant answer
Answer
If according to Morgan
  • asked a question related to Quantitative Data Analysis
Question
3 answers
If you have expertise in quantitative data analysis, please let me know! I have data from a questionnaire survey and need help in analysing those data. Interested persons can contact me at miladhasan@yahoo.com for collaboration as well.
Relevant answer
Answer
Hello! I am good at SPSS for quantitative analysis. My background is Economics.
  • asked a question related to Quantitative Data Analysis
Question
5 answers
For example, in an exploratory study I have assessed social factors, emotional factors, cultural factors and emotional intelligence. How can I arrive at a model by identifying the relevant factors influencing emotional intelligence?
Relevant answer
Answer
It looks like a regression analysis from IVs to DV, as David says.
  • asked a question related to Quantitative Data Analysis
Question
3 answers
I'm preparing for a management research and analysis exam and in it we have to devise a hypothetical research study to solve an organisational problem.
My independent variables in the example I'm completing are communication frequency, trust levels within team, and number of team members. My dependent variable is team performance.
If I wanted to suggest that trust had a moderating effect on the relationship between communication and team performance, what steps would I have to go through, which data analysis techniques would I use, and what kind of results would I expect to see if there was a moderation effect? It doesn't need to be too in-depth, and it might be worth noting I'm using a longitudinal research design with questionnaires as my data collection method.
Many thanks.
Relevant answer
Answer
Dear researcher,
As for the choice of the questionnaire method for data collection, it is a valid one. The questionnaire should be structured, preferably of the Likert-scale type. Assessing the moderating effects in your research model requires the use of multi-group analysis. You may want to assess the unconstrained model against the constrained models that show the effects of your moderating variables.
Kind regards
  • asked a question related to Quantitative Data Analysis
Question
26 answers
Can someone please advise me as to the best software to use to analyse quantitative data.
Relevant answer
Answer
You may want to take a look at this paper and see which one best suits your needs:
  • asked a question related to Quantitative Data Analysis
Question
6 answers
Could anyone share with me the steps to analyze ERP data using BrainVision Analyzer 2? My data were collected using a Brain Products V-Amp (8 channels).
Thank you in advance.
Relevant answer
Answer
Hello
You can analyze your ERP data by following the steps mentioned in this attached file.
Thank you
  • asked a question related to Quantitative Data Analysis
Question
7 answers
I am new to using R and am stuck on the best code to analyze a split-plot experiment. Does anyone have sample code?
Relevant answer
Answer
library(agricolae)

# fp: data frame with columns rep (replications/blocks), yr (main-plot factor),
# trt (sub-plot factor) and pt (response variable)
fp$rep <- as.factor(fp$rep)
fp$yr  <- as.factor(fp$yr)
fp$trt <- as.factor(fp$trt)

# Split-plot ANOVA: sp.plot(block, main-plot factor, sub-plot factor, response)
model <- with(fp, sp.plot(rep, yr, trt, pt))

# Degrees of freedom and error mean squares for the two error strata
gla <- model$gl.a  # df of error (a), main plot
glb <- model$gl.b  # df of error (b), sub plot
ea  <- model$Ea    # mean square of error (a)
eb  <- model$Eb    # mean square of error (b)

# LSD tests for main-plot, sub-plot and interaction effects
Mainplot    <- with(fp, LSD.test(pt, yr, gla, ea, console = TRUE))
Subplot     <- with(fp, LSD.test(pt, trt, glb, eb, console = TRUE))
Interaction <- with(fp, LSD.test(pt, yr:trt, glb, eb, console = TRUE))
  • asked a question related to Quantitative Data Analysis
Question
6 answers
Hey everybody,
I'm going to be up front with this: I am not super confident when it comes to quantitative data analysis. The study I am working on uses a series of Likert-style questions to generate data (including a basic BFAS questionnaire), which I am using to highlight potential areas of interest before I conduct my primary analysis using qualitative data analysis. I have two focus groups (the first n = 31 and the second n = 13), with two equally large control groups randomly selected from the remaining sample that did not qualify for the focus groups.
Anyway, that's more than you all probably needed to know. My issue is that I'm not sure what information I need to put into my quantitative report, especially since that data is basically just a discussion starter for my qualitative analysis. I've run correlations (r) on the appropriate values and computed effect sizes (d) between the focus and control groups. Is there anything else you would suggest I do?
Thanks in advance for the help. We English studies types don't do much in the way of quantitative studies.
Relevant answer
Answer
Ryan - that makes more sense. You seem to be hiding your light under a bushel. What you identify here sounds useful and interesting - and 'is what it is'. You may not have significance in some areas - but that's not a problem. To me, the correlation is actually one of the more important areas and might be your main point of originality. I would pursue it more than 'ancillary'. Back to your qualitative phase. You still have to move beyond the 'looking for specifics'. If what you would 'like' to see emerges naturally from the collected narrative/data - then all well and good - but don't give an impression to markers/reviewers that you have 'cherry-picked'.
  • asked a question related to Quantitative Data Analysis
Question
4 answers
My data consist of observations rated on 5 different dimensions (5-point Likert scale). A sixth variable describes the outcome (binary 0-1) I am interested in. Rather than understanding the individual contributions of these dimensions to the outcome variable, I am interested in finding the optimal combination of dimensions resulting in the highest probability of the outcome equalling 1.
Do you have any advice on a methodological approach for me?
Thank you very much in advance; your help is highly appreciated!
Kind regards,
Jessica Birkholz
Relevant answer
Answer
Hi Jessica, for a multivariate analysis like yours, we would suggest two methods. If all variables are observed variables (e.g. age, weight, frequency), then one can go for regression. But if any one of the variables is a construct (e.g. stress or satisfaction, which cannot be measured in numbers or units but must be measured with a set of Likert-scale statements called items), then we go for SEM, which includes the CFA method of analysis.
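A minimal R sketch of the regression route, assuming the five dimensions are columns d1-d5 and the binary outcome is y (all names hypothetical):
fit <- glm(y ~ d1 + d2 + d3 + d4 + d5, family = binomial, data = dat)
summary(fit)
dat$phat <- predict(fit, type = "response")  # fitted probability of outcome = 1
dat[which.max(dat$phat), ]                   # profile with the highest fitted probability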
  • asked a question related to Quantitative Data Analysis
Question
9 answers
Dear sirs/madams,
Hi.
I have a question the answer to which I couldn't find. I was wondering if you could help me.
I used random matching (matched pairs) based on one variable to assign 132 participants to two groups, i.e. 66 participants each. However, after deleting outliers and checking the assumptions, the groups became imbalanced: one group has 58 participants and the other has 53.
Running an independent-samples t-test showed no significant difference between the means of the two groups, yet I was doubtful about the number of participants in each group. Should the numbers be made exactly equal by deleting the counterparts of the participants that were deleted?
Relevant answer
Answer
How about using a mixed model?
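On the narrower question of the unequal ns: a Welch t-test (R's default) requires neither equal group sizes nor equal variances, as this sketch with invented data and the question's group sizes shows.
set.seed(7)
g1 <- rnorm(58, mean = 50, sd = 10)   # group 1 after outlier removal
g2 <- rnorm(53, mean = 52, sd = 12)   # group 2 after outlier removal
t.test(g1, g2)   # var.equal = FALSE by default, i.e. the Welch correction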
  • asked a question related to Quantitative Data Analysis
Question
6 answers
After submitting this paper to a journal, I received feedback that it is not possible to make a contribution to this literature with analysis at such a macro level, and that I should look at firm-level data or case studies instead. This can't be correct, can it?
Relevant answer
Answer
It's a two-way process: on one hand you may analyze entrepreneurship at the macro level to inform policy formation, and on the other hand you may evaluate entrepreneurship at the micro level to determine the policy's impact or provide feedback to it. This purely depends on your research question at hand and the scope of your research project.
I did some work looking at entrepreneurship from a macro level, and then at a micro level blended with the macro level (the research items referenced here have since been deleted).
You can also look at the GEM model to understand the linkages between macro-level and micro-level analysis of entrepreneurship (https://www.gemconsortium.org).
  • asked a question related to Quantitative Data Analysis
Question
9 answers
My current research project is a sequential explanatory mixed methods project. I plan to give high school teachers in one school district a demographic questionnaire and the West Virginia 21st Century Teaching and Learning Survey [WVDE-CIS-28]. This is a 5-point Likert-type survey on the frequency of integration of 21st century skills. Teachers with the highest frequency will be chosen for the qualitative interviews (and used to define successful integration), and a thematic analysis will be used for the qualitative phase.
I am trying to determine what statistical analysis should be used to determine whether a relationship exists between frequency of integration and the following:
a) years of teaching experience
b) level of education
c) number of professional development hours completed related to 21st century skills/learning
I have limited knowledge of quantitative data analysis. Is there anyone out there who can provide me with some insight?
Relevant answer
Answer
I think the qualitative analysis in this area is most appropriate to help you get average results for a large sample of people
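For the statistical side of the question, one hedged possibility in R: Spearman rank correlations for the ordinal predictors and a Kruskal-Wallis test for the few-level education variable. The data and column names below are invented.
# Invented teacher data with the three predictors from the question
set.seed(3)
teachers <- data.frame(
  freq             = sample(1:5, 80, replace = TRUE),   # integration frequency
  years_experience = sample(1:30, 80, replace = TRUE),
  education_level  = factor(sample(c("BA", "MA", "EdD"), 80, replace = TRUE)),
  pd_hours         = sample(0:40, 80, replace = TRUE)
)

cor.test(teachers$freq, teachers$years_experience, method = "spearman", exact = FALSE)
cor.test(teachers$freq, teachers$pd_hours, method = "spearman", exact = FALSE)
kruskal.test(freq ~ education_level, data = teachers)  # few ordered categories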
  • asked a question related to Quantitative Data Analysis
Question
3 answers
My study is about skilled returnees to their home country, of whom there are two types: those who came back with a job offered by the government, and those who found a job on their own after returning. The reason for coming back is ranked on a scale of 1-10.
The question I need to answer for my project is:
Do government-fellowship and non-fellowship returnees differ significantly in their reasons for returning to the home country?
Should I use the Mann-Whitney test or the Kruskal-Wallis test?
Thanks
Relevant answer
Answer
Hi Kanika,
The Mann-Whitney U test is used to examine the difference between two groups, while the Kruskal-Wallis H test is used to examine differences among more than two groups. So the selection of the Mann-Whitney U test is correct.
Examine the distribution of the groups on the relevant dependent variable.
Note that nonparametric tests have less power than parametric tests.
Best, İbrahim.
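A minimal R sketch of this recommendation, with invented data (R's wilcox.test with two groups is the Mann-Whitney U test):
set.seed(11)
returnees <- data.frame(
  group = rep(c("govt_fellow", "non_fellow"), each = 40),
  reason_rank = sample(1:10, 80, replace = TRUE)
)
wilcox.test(reason_rank ~ group, data = returnees)   # Mann-Whitney U test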
  • asked a question related to Quantitative Data Analysis
Question
5 answers
When I analyze NMR data for metabolomics combined with other omics, can I use relative quantification data rather than absolute quantification for the multi-omics data analysis?
Relevant answer
Answer
Hello,
Yes, you absolutely CAN. For multi-omics / integrated omics you hardly need anything in ABSOLUTE terms (neither do genomics, proteomics, or transcriptomics!).
If you truly want to do NMR-based metabolomics (I'm unsure what you mean by relative quantification in NMR), then you MUST read and understand this excellent recent paper, High-Throughput Metabolomics by 1D NMR: https://onlinelibrary.wiley.com/doi/full/10.1002/anie.201804736, and follow all the original references it points to for the rationale of annotation and of using such data for NMR metabolomics.
Also you may want to consult Dr. Ebbels' tools for NMR data quantification, esp. BATMAN, i.e.:
Hao, J., Astle, W., De Iorio, M. and Ebbels, T.M., 2012. BATMAN – an R package for the automated quantification of metabolites from nuclear magnetic resonance spectra using a Bayesian model. Bioinformatics, 28(15), pp. 2088-2090.
and approaches such as:
Ebbels, T.M., Rodriguez-Martinez, A., Dumas, M.E. and Keun, H.C., 2018. Advances in Computational Analysis of Metabolomic NMR Data. NMR-based Metabolomics, 14, p. 310.
Beckonert, O., Keun, H.C., Ebbels, T.M., Bundy, J., Holmes, E., Lindon, J.C. and Nicholson, J.K., 2007. Metabolic profiling, metabolomic and metabonomic procedures for NMR spectroscopy of urine, plasma, serum and tissue extracts. Nature Protocols, 2(11), p. 2692.
Weljie, A.M., Newton, J., Mercier, P., Carlson, E. and Slupsky, C.M., 2006. Targeted profiling: quantitative analysis of 1H NMR metabolomics data. Analytical Chemistry, 78(13), pp. 4430-4442.
Cavill, R., Kamburov, A., Ellis, J.K., Athersuch, T.J., Blagrove, M.S., Herwig, R., Ebbels, T.M. and Keun, H.C., 2011. Consensus-phenotype integration of transcriptomic and metabolomic data implies a role for metabolism in the chemosensitivity of tumour cells. PLoS Computational Biology, 7(3), p. e1001113.
Hope it helps.
Thanks,
Biswa
P.S. Ignore the comments above from others, which are non-specific.
  • asked a question related to Quantitative Data Analysis
Question
1 answer
I would like to cluster users into at least 3 and at most 7 segments based on different credit and repayment variables collected from two different databases. In the attached file, I put an example of the table I will use for this. The user IDs (A-E) represent my users, which in reality number about 20,000. I list different potential variables that would be part of the clustering; I assume there can be 5-15 of them (in the example, dunning state, number of missed dates, and so on).
My questions are:
1) Is it possible to do a k-means clustering to achieve such a segmentation? Or do I need to pursue other statistical means?
2) Which software can do the job well (R or SPSS)?
Excuse any naivety on my part, as I am new to quantitative data analysis.
Relevant answer
Very interesting question you have raised.
There is a solution to your problem; kindly try to follow this procedure:
1. First reconstruct the missing data in the dataset using a dynamic data cleaning technique (refer to the paper: " "); I hope this concept is very helpful for reconstructing the missing data points in the dataset.
2. Use the DAAC (Dynamic Automatic Agglomerative Clustering) scheme to automatically split the pre-processed dataset into the finest number of dissimilar clusters without user input (refer to the paper "An Improved Frequency Based Agglomerative Clustering Algorithm for Detecting Distinct Clusters on Two Dimensional Dataset").
3. You can easily evaluate the clustering result of the user dataset using the cluster validation scheme from the same paper.
I hope you can follow the above procedure and identify the distinct groups in your user dataset for your analysis.
Thank you
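To the software question: k-means is straightforward in R. A hedged sketch on simulated data follows; the variable names are placeholders for the credit/repayment variables, and k = 3-7 is screened with the elbow method rather than the DAAC scheme mentioned above.
# Simulated stand-in for the ~20,000-user table
set.seed(5)
users <- data.frame(
  dunning_state   = sample(0:3, 1000, replace = TRUE),
  missed_payments = rpois(1000, 2),
  credit_used     = runif(1000, 0, 1),
  days_overdue    = rpois(1000, 10)
)
scaled <- scale(users)   # k-means is scale-sensitive: standardize first

# Try k = 3..7 and compare total within-cluster sum of squares (elbow method)
wss <- sapply(3:7, function(k) kmeans(scaled, centers = k, nstart = 25)$tot.withinss)
plot(3:7, wss, type = "b", xlab = "k", ylab = "Total within-cluster SS")

fit <- kmeans(scaled, centers = 4, nstart = 25)   # pick k from the elbow plot
table(fit$cluster)                                # segment sizes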
  • asked a question related to Quantitative Data Analysis
Question
12 answers
Usually a simple model such as linear regression requires a dataset in which each sample carries all the information/parameters. For some reason, probably data security or cost, I only get the marginal distributions of the given parameters.
While I know this is not optimal and information (especially the interrelationships) is lost, I want to ask whether you think this is a hopeless case. I personally cannot think of any possible way to analyse such data (trivially, I'm able to see the marginal histograms or calculate their kernel densities, but nothing more than that).
I would not go so far as to say that analysis with such data is technically impossible. Given only that an object's shadow is a circle from three different perspectives, one could still reliably estimate that the object is probably a ball. In the same manner, one might be able to analyse something (with some degree of information loss) when given only marginal distributions.
What is your opinion on this, and would you suggest a concrete statistical/stochastic method for such data?
Relevant answer
Answer
Hi Christopher,
The only thing I am aware of that you can do is 'testing for marginal homogeneity'.
It appears that formulas have been derived for binary, multicategory, ordinal, and multidimensional tests of marginal homogeneity in categorical and ordinal data settings.
In categorical data analysis there are results like 'complete symmetry implies marginal homogeneity, but the converse does not hold (except for 2x2 contingency tables)', which is the problem.
Alan Agresti's book 'Categorical Data Analysis' has a section on 'Marginal Models and Quasi-Symmetry Models for Matched Sets', but all the good stuff looks like it requires at least some matched data. It mentions something about different types of marginal homogeneity in the multidimensional setting.
-Matt
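As a small concrete illustration of what remains possible with marginals alone, here is a chi-squared test of homogeneity comparing two binned marginal distributions in R; the counts are invented, and note that this says nothing about the lost interrelationships between parameters.
# Invented binned counts for two parameters over the same bins
marginal_a <- c(low = 120, mid = 340, high = 540)
marginal_b <- c(low = 150, mid = 310, high = 540)
chisq.test(rbind(marginal_a, marginal_b))   # do the two marginals differ?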
  • asked a question related to Quantitative Data Analysis
Question
3 answers
When you sacrifice a mouse from a study group due to its large tumor size, how do you record the result? Since the average tumor volume of the group will then be smaller, you should see a dip in the tumor growth chart, but this does not reflect the real situation. Also, at the end of the study, the number of mice in the study group will be smaller due to sacrifices, so the SEM will be bigger. Do you record the change in study group size in the growth chart?
thanks,
Relevant answer
Answer
Usually the study group is run under fixed parameters (see our examples here: http://altogenlabs.com/xenograft-models/). The growth curve is then constructed from each of the tumors, ending at some fixed volume. Error bars usually resolve the issue of representing lower averages, but in general you should have consistency among your xenografts with regard to procedures.
  • asked a question related to Quantitative Data Analysis
Question
6 answers
Dear all,
I am currently working on my MA dissertation in Event Management, analysing the impact of technology on UK music festival attendees' experience.
I would like to analyse data collected via a 25-question survey (5-point Likert scale, strongly agree to strongly disagree). My main objective is to test 8 hypotheses. My hypotheses are classified into 4 groups (2 per group), each group relating to a different facet. I aim to find a positive relationship between the different groups/facets.
Example:
H1: Sound equipment has a positive influence on Music experience*
H2: Large screen has a positive influence on Music experience
H3: Social Media has a positive influence on Social experience**
H4: 4G has a positive influence on Social experience... and so on
*1st facet
**2nd facet
However, I am not sure of the right method to follow. Which kind of data analysis would you go for?
Thank you
Relevant answer
Answer
Hi Cyrielle,
Did you use just one question for each hypothesis?
If so, it doesn't make sense to use EFA.
Let me use an example from the marketing area to illustrate the whole picture.
Let's assume I want to know if Customer Satisfaction affects Customer Loyalty to a brand or a product.
So my hypothesis is that Customer Satisfaction affects Customer Loyalty positively and significantly.
Both are latent variables, as neither can be measured directly.
To measure them I use two scales, one for each latent variable.
These scales use a Likert range similar to the one you mentioned for your research (usually the format is: 1) Strongly disagree 2) Disagree 3) Neither agree nor disagree 4) Agree 5) Strongly agree, but what you propose is not wrong).
Each scale has to have at least three statements
(Hair, Anderson, Tatham, & Black, 2009); for instance, for Customer Loyalty you could use: 1) I prefer to use the products of this company, 2) I think this company has the best offers at present, 3) I prefer to buy this brand instead of other brands.
After that you should check at least the internal consistency of each scale (Cronbach's alpha), but this is the minimum, since in reality you should also check the factorial validity, convergent validity and discriminant validity of each scale, which can be done when you run a CFA.
Assuming the scales for both latent variables are OK, you should then calculate scores for Customer Satisfaction and Customer Loyalty and submit these data to regression analysis, checking whether the hypothesis can be confirmed.
Although EFA + regression analysis can be used, depending on the journal you want to submit to, it may not be accepted, and you may be asked to perform CFA + SEM instead (I have already had the bad experience of a journal refusing my paper because of the methodology I used).
So, using one Likert-type question to test a hypothesis sounds a little weird, at least considering what I have seen in the literature.
Attached are some articles illustrating my example of Customer Satisfaction and Customer Loyalty (as I'm more familiar with this kind of research, I decided to use it to show you the whole process).
Sorry if I was too basic in my discussion, but I'd rather present the whole picture so you can compare my example with your situation.
I hope this helps you. Anything else, feel free to get back to me.
Bibliography for my discussion (the book has an English version):
Hair, J. F., Anderson, R., Tatham, R., & Black, W. (2009). Análise Multivariada de Dados (6th ed.). Porto Alegre: Bookman.
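A minimal R sketch of the internal-consistency step described above, using the psych package; the three items are invented (and random, so alpha will be low here, whereas real scale items should correlate):
# Invented 5-point responses for a three-item scale
set.seed(9)
loyalty_items <- as.data.frame(replicate(3, sample(1:5, 150, replace = TRUE)))
names(loyalty_items) <- c("item1", "item2", "item3")

library(psych)                # install.packages("psych") if needed
psych::alpha(loyalty_items)   # Cronbach's alpha plus item-level statistics
The subsequent CFA step can be done in R with the lavaan package (lavaan::cfa) if you do not have access to AMOS or similar software.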
  • asked a question related to Quantitative Data Analysis
Question
1 answer
I have 8 months of load consumption and PV generation data available from a site and want a prediction/forecasting analysis for the next 4 months. Kindly advise on any new/effective methods.
Thanks
Relevant answer
Answer
Dear Mahmood,
I kindly suggest you have a look at ARIMA and recurrent neural network models.
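A hedged sketch of the ARIMA option in R with the forecast package; the series below is simulated daily load data (roughly 8 months), and the horizon approximates 4 months.
library(forecast)   # install.packages("forecast") if needed

set.seed(13)
load_kw <- ts(100 + 10 * sin(2 * pi * (1:240) / 7) + rnorm(240, 0, 5),
              frequency = 7)   # weekly seasonality
fit <- auto.arima(load_kw)     # automatic (p, d, q) order selection
fc <- forecast(fit, h = 120)   # ~4 months ahead
plot(fc)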
  • asked a question related to Quantitative Data Analysis
Question
8 answers
I applied a sequential explanatory design in which quantitative data analysis was followed by qualitative data collection through semi-structured interviews. The qualitative interviews were based on the themes/sections of the quantitative data. I am wondering what type of qualitative data analysis would be the best choice at this stage. The qualitative data will support the quantitative results. Thanks
Relevant answer
Answer
Perhaps you could use some type of content analysis?
  • asked a question related to Quantitative Data Analysis
Question
18 answers
I am familiar with coding and analyzing Likert-type questions in SPSS, but how can I code multiple-answer questions in SPSS?
For example:
Q: What are you wearing right now?
o Short-sleeves shirt dress
o Long sleeves shirt dress
o Straight trouser
o Sleeveless shirt
o Skirt
o Sleeveless vest
o Long sleeves pyjamas
o Calf length socks (length between knee and foot)
o Pant
o Boots
o Sandal
o Cap
o Scarf
o Abaya
A single respondent may tick several of these options.
1) How this can be coded in SPSS ?
2) How to analyze It?
Relevant answer
Answer
Dear Dr. Khalil,
My understanding of your question is that a single question has multiple options and you want to code them in SPSS. I think it is easy before you think about any analysis.
The example has 14 options; each response goes into its own column, and it can be entered as short sleeves: yes or no;
long sleeves: yes or no;
straight trouser: yes or no; and so on.
At the end of it you can see how many respondents say yes to (i.e., are wearing) short sleeves, long sleeves, straight trousers, and so on.
Good luck with your coding.
Thank you.
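The same one-column-per-option idea, sketched in R for comparison (the responses below are invented); in SPSS the analogous route after coding is to define a multiple response set and run frequencies on it.
responses <- list(
  r1 = c("Skirt", "Sandal", "Scarf"),
  r2 = c("Pant", "Boots"),
  r3 = c("Skirt", "Pant", "Scarf")
)
garments <- c("Skirt", "Pant", "Boots", "Sandal", "Scarf")

# One 0/1 indicator column per option, one row per respondent
coded <- t(sapply(responses, function(x) as.integer(garments %in% x)))
colnames(coded) <- garments
coded
colSums(coded)   # how many respondents ticked each option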
  • asked a question related to Quantitative Data Analysis
Question
10 answers
I'm conducting a study investigating attraction to Dark Triad individuals. I am using a questionnaire with several Likert-scale rating sections about attraction, as well as some vignettes where participants rate their attraction to the vignette person, plus the Big Five Inventory (BFI-S).
I'm not sure which method of data analysis I should use for this study.
Relevant answer
Answer
Which method of data analysis should I use for a study using questionnaires?
What data analysis to use depends on your conceptual framework / research model, which depicts all the constructs / variables, their relationships, and the hypotheses you intend to analyze; e.g. BFI-S can be one variable, and the Dark Triad can be one variable or can be split into 3 variables, i.e. narcissism, Machiavellianism, and psychopathy. Once you nail down the conceptual framework / research model, you will know what data analysis to adopt, e.g. a correlational study, a mediation or moderation study, structural equation modeling, analysis of variance, etc. However, the conceptual framework / research model depends on your research objectives & research questions, which in turn depend on the research problem you want to resolve / address.
  • asked a question related to Quantitative Data Analysis
Question
14 answers
In my dataset there are 8 explanatory variables. I log-transformed 7 of them, as doing so improves normality, while one explanatory variable remains untransformed, since log-transforming it decreases normality, as shown in a normal Q-Q plot. In this situation, can I run a regression model with one variable untransformed while all other variables are log-transformed? Note that the dependent variable was also log-transformed.
Relevant answer
Answer
The explanatory variables don't need to be normally distributed. Only the distribution of the dependent variable should be (approximately) normal conditional on the explanatory variables.
It is more important that the (conditional) relationship between the explanatory variables and the dependent variable is linear.
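So yes, mixing logged and unlogged predictors is fine. A small R sketch with invented variables; what to inspect afterwards is the residual plot, not the predictors' marginal distributions:
set.seed(21)
d <- data.frame(x1 = rlnorm(100), x2 = rlnorm(100), x3 = runif(100, 0, 10))
d$y <- exp(0.5 * log(d$x1) + 0.2 * log(d$x2) + 0.05 * d$x3 + rnorm(100, 0, 0.2))

fit <- lm(log(y) ~ log(x1) + log(x2) + x3, data = d)  # x3 enters untransformed
summary(fit)
plot(fit, which = 1)   # residuals vs fitted: the check that actually matters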
  • asked a question related to Quantitative Data Analysis
Question
5 answers
Which method should be considered for treating missing demographic information when the respondents have answered most of the questionnaire items?
Relevant answer
Answer
There are different approaches:
1. When the data can be deduced from the answers of other people. For example, if the group of respondents is from the same neighborhood, the answer can be deduced without problems even if the question has not been answered.
2. When a response can be imputed from a calculation, for example assigning the population mean or the group mean.
3. When the answer can be deduced by checking other answers, for example the employment situation from contracts, the place of work, or similar.
4. When you have to interpret what the non-response meant. This last case is very doubtful.
In any case, it does not seem advisable to replace the non-response, but only to interpret in the report why the respondent did not answer.
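Two of these options sketched in R on invented data: keeping the non-response as its own category, and model-based multiple imputation with the mice package.
set.seed(17)
df <- data.frame(
  age    = sample(18:70, 100, replace = TRUE),
  gender = factor(sample(c("F", "M", NA), 100, replace = TRUE,
                         prob = c(0.45, 0.45, 0.10)))
)

# Option A: treat the non-response as an explicit category and interpret it
df$gender_cat <- addNA(df$gender)
table(df$gender_cat)

# Option B: multiple imputation (install.packages("mice") if needed)
library(mice)
imp <- mice(df[, c("age", "gender")], m = 5, printFlag = FALSE)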
  • asked a question related to Quantitative Data Analysis
Question
28 answers
Why is it important to examine the assumption of linearity when using Regression?
Relevant answer
Answer
Linear regression is an analysis that assesses whether one or more predictor variables explain the dependent (criterion) variable. The regression has five key assumptions:
Linear relationship
Multivariate normality
No or little multicollinearity
No auto-correlation
Homoscedasticity
A note about sample size: in linear regression the rule of thumb is that the analysis requires at least 20 cases per independent variable.
First, linear regression needs the relationship between the independent and dependent variables to be linear. It is also important to check for outliers, since linear regression is sensitive to outlier effects. The linearity assumption can best be tested with scatter plots.
Secondly, the linear regression analysis requires all variables to be multivariate normal. This assumption can best be checked with a histogram or a Q-Q plot. Normality can be checked with a goodness-of-fit test, e.g. the Kolmogorov-Smirnov test. When the data are not normally distributed, a non-linear transformation (e.g. a log transformation) might fix the issue.
Thirdly, linear regression assumes that there is little or no multicollinearity in the data. Multicollinearity occurs when the independent variables are too highly correlated with each other. It may be tested with four central criteria:
1) Correlation matrix – when computing the matrix of Pearson's bivariate correlations among all independent variables, none of the coefficients should be very close to 1.
2) Tolerance – the tolerance measures the influence of one independent variable on all other independent variables; it is calculated with a first-step linear regression analysis and defined as T = 1 – R² for that regression. With T < 0.1 there might be multicollinearity in the data, and with T < 0.01 there certainly is.
3) Variance Inflation Factor (VIF) – the variance inflation factor is defined as VIF = 1/T. With VIF > 10 there is an indication that multicollinearity may be present; with VIF > 100 there certainly is multicollinearity among the variables.
4) Condition Index – the condition index is calculated using a factor analysis on the independent variables. Values of 10-30 indicate mediocre multicollinearity, and values > 30 indicate strong multicollinearity.
If multicollinearity is found, centering the data (that is, deducting the variable's mean from each score) might help to solve the problem. The simplest remedy, however, is to remove independent variables with high VIF values; another alternative is to conduct a factor analysis and rotate the factors to ensure independence of the factors in the regression.
Fourthly, linear regression analysis requires that there is little or no autocorrelation in the data. Autocorrelation occurs when the residuals are not independent from each other, i.e. when the value of y(x+1) is not independent from the value of y(x); this typically occurs in stock prices, where the price is not independent from the previous price. While a scatterplot allows you to check for autocorrelation, you can test the model formally with the Durbin-Watson test. Durbin-Watson's d tests the null hypothesis that the residuals are not linearly auto-correlated. While d can assume values between 0 and 4, values around 2 indicate no autocorrelation; as a rule of thumb, 1.5 < d < 2.5 shows that there is no autocorrelation in the data. Note that the Durbin-Watson test only analyses linear autocorrelation, and only between direct neighbours (first-order effects).
The last assumption of the linear regression analysis is homoscedasticity, meaning the residuals are equal across the regression line. A scatter plot of the residuals is a good way to check this. The Goldfeld-Quandt test can also be used: it splits the data into two groups and tests whether the variances of the residuals are similar across the groups. If heteroscedasticity is present, a non-linear correction might fix the problem.
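These checks can be run compactly in R on a fitted lm object; a hedged sketch with simulated data, using the car and lmtest packages:
library(car)      # vif()
library(lmtest)   # dwtest(), gqtest()

set.seed(19)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(100)
fit <- lm(y ~ x1 + x2, data = d)

plot(fit, which = 1)  # linearity: residuals vs fitted values
plot(fit, which = 2)  # normality: Q-Q plot of the residuals
vif(fit)              # multicollinearity: variance inflation factors
dwtest(fit)           # autocorrelation: Durbin-Watson test
gqtest(fit)           # heteroscedasticity: Goldfeld-Quandt test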
  • asked a question related to Quantitative Data Analysis
Question
8 answers
I'm planning to perform an SEM in order to investigate intention to innovate. This dependent variable depends on attitude toward innovation (ATI) and entrepreneurial self-efficacy (ESE), following the scheme of the Theory of Planned Behaviour. The literature indicates that other variables play a role, such as gender and family exposure to entrepreneurship (FEE).
How should I perform the analysis in order to see how gender and FEE impact the scheme? The data come from a survey with 1200 answers.
At first I thought of a mediating model, but I believe the effect is not a causal relationship between gender and the dependent variable, but a moderator effect. Nevertheless, I am struggling to implement it in Stata. Most of the literature discusses mediation and moderation at the same time, which is not my case.
Other suggestions for the analysis would also be really helpful.
Best regards, Pedro
Relevant answer
Answer
interesting question
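One concrete way to test a binary moderator such as gender, sketched in R/lavaan rather than Stata as a hedged illustration: fit the structural model per group and compare a free model against one with the regression paths constrained equal. The data and the composite scores ATI, ESE and INT (intention to innovate) are invented.
library(lavaan)

set.seed(23)
survey <- data.frame(
  gender = rep(c("F", "M"), each = 100),
  ATI = rnorm(200), ESE = rnorm(200)
)
survey$INT <- 0.5 * survey$ATI + 0.3 * survey$ESE +
  0.3 * survey$ATI * (survey$gender == "M") + rnorm(200)

model <- "INT ~ ATI + ESE"
fit_free  <- sem(model, data = survey, group = "gender")
fit_equal <- sem(model, data = survey, group = "gender",
                 group.equal = "regressions")
anova(fit_free, fit_equal)  # a significant difference suggests moderation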
  • asked a question related to Quantitative Data Analysis
Question
13 answers
When I try to fit a model using AMOS, I get a message that the following variables are endogenous but have no residual (error) variable. My question is: is it compulsory to impose a residual (error) term on endogenous variables? Without imposing the error residual, the results are found to be good. Kindly respond to this.
Relevant answer
Answer
@Peter: 90% agreement (though I would like to differentiate the regression model as a structure from "regression", that is, fitting a plane into a multidimensional scatterplot). SEM does not involve the latter.
Best
Holger
  • asked a question related to Quantitative Data Analysis
Question
5 answers
The β-weights of the items in the factor pattern will be substantially reduced, I suppose, but will that be true for the item-factor correlations in the factor structure as well?
Relevant answer
Answer
Raiswa, I advise you to put your question to the RG participants in general. Add more information about your research subject, measurement instrument(s), model, and the fit indices you inspected. Clarify the less common abbreviations such as MSV and AVE. Report also the chi-square, its df, and its significance value, and explain how you determined the instrument's discriminant validity. People more acquainted with structural equation modeling than I am will then be in a position to answer your question. As it is presented now, nobody will be able to.
  • asked a question related to Quantitative Data Analysis
Question
6 answers
People think numbers are more accurate than words, although the quality of the data analysis matters.
Which one do you think is more convenient in applied research? What are the considerations?
Reflections are appreciated in advance.
Relevant answer
Hi Awoke,
Actually I believe it depends on the problem. In predictive problems, if someone is interested in a selection problem (for instance, identifying the top 10), then even with quantitative data we can categorize the response into classes and use powerful classifiers popular nowadays, like deep learning models (neural nets using categorical likelihoods, etc.). However, if one is interested in quantifying some reliable quantitative value, like the incidence of cancer in some continent or country, quantitative analysis using hierarchical models can give highly accurate results.
Please, if the comment was useful, recommend it for others.
  • asked a question related to Quantitative Data Analysis
Question
3 answers
Closed
Relevant answer
Answer
You may think they are the same age, in the same context, and with the same characteristics, but their personality and motivation are surely not the same.
If I were you, I would examine their approach towards learning; alternatively, the most interesting research topic could be to analyse how different teaching strategies or tools play a role in different groups. Moreover, the evolution of group dynamics could also be investigated.
  • asked a question related to Quantitative Data Analysis
Question
2 answers
I was working with the PEAKS quantitative data analysis, but I am confused about what the Ratio vs Quality graph means. Further, the output of the quantitative analysis indicates Quality ≥ 1 under the result filtration parameters, and I don't understand what this ≥ 1 value is telling me.
Thanks
Relevant answer
Answer
thank you very much
  • asked a question related to Quantitative Data Analysis
Question
8 answers
I have annual data on real total remittances, real per capita income, primary education enrollment, domestic saving, age dependency ratio, and education expenditure as a percentage of GDP. All variables are annual time series observations from 1972-2012.
I would greatly appreciate your expertise in this regard.
Relevant answer
Answer
Please see p. 37; perhaps it will help you.