Science topic

SPSS - Science topic

Explore the latest questions and answers in SPSS, and find SPSS experts.
Questions related to SPSS
  • asked a question related to SPSS
Question
3 answers
Hello everyone, I use IBM SPSS Statistics 29.0.2.0 and I have a list of the following dichotomous variables:
  • Number of people who are self-employed Recoded,
  • Number of people with multiple jobs,
  • Number of household members who work full time,
  • Number of household members who work part-time,
  • Number of unemployed household members,
  • Household members who are retired,
  • Number of household members who are disabled,
  • Members who are not working for some reason.
This is an example of how they are coded: {1.00, No household members who are self-employed} and {2.00, Household members who are self-employed}; each variable has its corresponding values. Now, I want to combine all of these variables into one comprehensive dichotomous variable called Household economic activity status, with the following values: {1.00, Households with at least one economically active member} and {2.00, Households with at least one economically inactive member}.
I tried running different syntax, but none of it worked. I would greatly appreciate any help on how to do this.
Thanks and best wishes,
Amina
Relevant answer
Answer
Years ago I used the book SPSS analysis without anguish.
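To sketch one possible approach in syntax: the variable names below are hypothetical placeholders for the recoded variables, each coded 1 = no such member, 2 = at least one such member. Note that a household can contain both active and inactive members, so the two target categories overlap and a priority rule is needed; here "active" wins, which should be adjusted to whatever the coding scheme requires.
* Flag households with at least one economically active / inactive member.
COMPUTE anyActive = ANY(2, SelfEmp_r, MultiJobs, FullTime, PartTime).
COMPUTE anyInactive = ANY(2, Unemployed, Retired, Disabled, NotWorking).
* Apply the priority rule and label the result.
IF (anyActive = 1) HHEconStatus = 1.
IF (anyActive = 0 AND anyInactive = 1) HHEconStatus = 2.
VALUE LABELS HHEconStatus
  1 'Household has at least one economically active member'
  2 'Household has economically inactive members only'.
EXECUTE.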
  • asked a question related to SPSS
Question
4 answers
I ran a reliability test on a 5-point Likert scale in SPSS, but the value of Cronbach's Alpha is .697 and does not improve even after removing items.
Relevant answer
Answer
@s. Rama Gokula Krishnan THANK YOU SIR
  • asked a question related to SPSS
Question
2 answers
Suppose there is a research model with one IV, one DV, and one continuous moderator. How should the parametric assumptions be checked in SPSS?
Include an Interaction Term:
Create an interaction term by multiplying the IV and the moderator.
Run the Moderated Regression Analysis:
Set up a regression model where the DV is the outcome variable, and the IV, moderator, and interaction term are the predictors.
Check Parametric Assumptions for the Model
Relevant answer
Answer
First, use the Shapiro-Wilk test, Q-Q plots, or histograms to determine whether the residuals are normal. Examine scatter plots of the relationships between the IV, moderator, and DV to confirm linearity. To verify constant variance (homoscedasticity), plot residuals against predicted values. Use Variance Inflation Factor (VIF) values to assess multicollinearity; values greater than 10 suggest problems. Additionally, use the Durbin-Watson statistic to verify the independence of errors; a value near 2 is desired. Lastly, center the continuous variables before constructing the interaction term to reduce multicollinearity and enhance model interpretability. These checks help ensure that the data satisfy the assumptions required for a reliable regression analysis.
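A minimal syntax sketch of those steps, with hypothetical variable names iv, mod, and dv (the AGGREGATE step simply adds the grand means used for centering):
* Center the predictors and build the interaction term.
AGGREGATE OUTFILE=* MODE=ADDVARIABLES /iv_mean=MEAN(iv) /mod_mean=MEAN(mod).
COMPUTE iv_c = iv - iv_mean.
COMPUTE mod_c = mod - mod_mean.
COMPUTE ivxmod = iv_c * mod_c.
EXECUTE.
* Moderated regression with collinearity, Durbin-Watson, and residual diagnostics.
REGRESSION
  /STATISTICS COEFF R ANOVA COLLIN TOL
  /DEPENDENT dv
  /METHOD=ENTER iv_c mod_c ivxmod
  /RESIDUALS DURBIN NORMPROB(ZRESID)
  /SCATTERPLOT=(*ZRESID, *ZPRED).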
  • asked a question related to SPSS
Question
2 answers
There is a research model with one IV, one DV and one mediator. So, when checking the parametric assumptions in SPSS, what will be done to the mediator?
Is the following approach correct?
Path 1: IV to Mediator
For the path where the IV predicts the mediator, the mediator is treated as the outcome.
  • Run a simple regression with the IV as the predictor and the mediator as the dependent variable. Check parametric assumptions for this regression:
    • Linearity
    • Normality of Residuals
    • Homoscedasticity
    • Independence of Errors
Path 2: Mediator to DV (and IV to DV)
In this step, the mediator serves as a predictor of the DV, alongside the IV.
  • Run a regression with the IV and the mediator as predictors and the DV as the outcome. Check assumptions for this second regression model:
    • Linearity
    • Normality of Residuals
    • Homoscedasticity
    • Independence of Errors
    • Multicollinearity
Relevant answer
Answer
This sounds correct to me.
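In syntax form, a sketch of the two models listed in the question (iv, med, and dv are hypothetical names; the residual subcommands cover normality, independence, and homoscedasticity, and collinearity statistics are added to the second model):
* Path 1: IV predicting the mediator.
REGRESSION
  /DEPENDENT med
  /METHOD=ENTER iv
  /RESIDUALS DURBIN NORMPROB(ZRESID)
  /SCATTERPLOT=(*ZRESID, *ZPRED).
* Path 2: IV and mediator predicting the DV.
REGRESSION
  /STATISTICS COEFF R ANOVA COLLIN TOL
  /DEPENDENT dv
  /METHOD=ENTER iv med
  /RESIDUALS DURBIN NORMPROB(ZRESID)
  /SCATTERPLOT=(*ZRESID, *ZPRED).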
  • asked a question related to SPSS
Question
6 answers
I am looking for a way to analyze repeated measurements.
I have data from subjects who have a varying number of measurements (from 2 to 10) over fixed time periods (Week 1, Week 2, etc.).
The subjects are divided into two groups (A and B).
What I want to check in my analysis is:
1. For all subjects together, does Week 1 differ from Week 2, Week 2 from Week 3, etc.?
2. Are the changes over time in group A different from the changes in group B? From my data, I expect group A to have higher deltas between timepoints than group B.
Some issues with my data:
1. some values are missing for most individuals;
2. plotting the data over time reveals non-linear trend: There is a trend of increasing values during the first 3 weeks, and from weeks 3 to 10 - a decrease of the values;
3. measurements of different patients are quite variable (think body weight - from 50kg to 120 kg) - a wide variation
- Repeated measures ANOVA is not a good option as I understand because it cannot deal with missing data.
- Linear mixed models (LMM) seems to be a good fit, as it allows for missing data, and allows entering the subjects as a random factor (so each subject has their own intercept).
The problem I see is the slope - it is not linear.
I know SPSS does not have a non-linear mixed effects model at all and I am not skilled in any of the other statistical programs. Is there any other solution for my data or workaround to use LMM?
Relevant answer
Answer
Dear Bruce Weaver , I agree that if the shape of the relationship looks like a triangle, a piecewise linear model (aka segmented regression) would do the trick. However, I have the strong feeling :) that the relationship does not really look like a triangle but is more curved instead, with a smooth transition between increasing and decreasing trends. I also find that piecewise models should be applied only when there is an external reason for a "break point" to exist, for instance the time point when a treatment is applied (one that works quickly relative to the observed time period), which implies that there are values recorded before and after the treatment.
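One workaround that stays inside SPSS's MIXED procedure: a smoothly curved trajectory can often be approximated with a quadratic time term, since the model only has to be linear in the parameters, not in time. A hedged sketch with hypothetical variables outcome, week, group, and id:
COMPUTE week_sq = week**2.
EXECUTE.
MIXED outcome BY group WITH week week_sq
  /FIXED=group week week_sq group*week group*week_sq
  /RANDOM=INTERCEPT | SUBJECT(id)
  /PRINT=SOLUTION.
The group*week and group*week_sq terms test whether the curved time course differs between groups A and B, and subjects with missing occasions contribute whatever measurements they do have.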
  • asked a question related to SPSS
Question
3 answers
We need a research scholar who has done factor analysis before: EFA, CFA, reliability and validity checking of items, and stability, in SPSS or SPSS Amos.
Relevant answer
Answer
I want to know more about you. Can you please tell me more about how you could be a good team member for this work?
  • asked a question related to SPSS
Question
4 answers
Hello everyone,
I hope this message finds you well. I am currently working on my doctoral thesis and have been using IBM SPSS for data analysis. However, my trial version has recently expired, and I am unable to access the software at the moment.
I am looking for alternatives or any possible ways to obtain a free version of SPSS, either through academic institutions or other resources. If anyone has suggestions or can share experiences about accessing SPSS for free, I would greatly appreciate it.
Thank you in advance for your help!
Best regards, [Ferial]
Relevant answer
Answer
I suppose the lack of responses to my Sep 30 query means that Jamovi & JASP do not currently have straightforward ways to generate command syntax to document exactly what has been done, and to make it easily reproducible. I'm disappointed, if that is true, and I hope such functionality will be added eventually.
  • asked a question related to SPSS
Question
3 answers
I am comparing two groups' effects on 6-minute walk test scores. Now I want to find out whether age, gender, and laboratory values have any indirect effect on the scores. But I am confused, as the scores are the DV and gender is an IV. Regression does give an OR, but it does not tell me which gender has the effect.
  • asked a question related to SPSS
Question
1 answer
I collected data in a cluster survey using KoBoToolbox and imported them into SPSS v26.
After coding all variables, I weighted them. The total collected sample is 1881, but after weighting the data in SPSS (Data > Weight Cases), the total frequency changed to 2420.
What do you recommend regarding the difference between the actual collected and the weighted frequencies?
Relevant answer
Answer
A simple (and simplistic) approach is to calculate NEWWT1 = OLDWT * (1881/2420) and then substitute NEWWT1 for OLDWT in the WEIGHT command. That will give you a weighted sample size that is the same as your collected sample size (perhaps plus or minus a small decimal fraction).
A better approach would take into account the relative sampling efficiency of your cluster sample compared to simple random sampling (SRS). If you are able to calculate that, or approximate it defensibly, you can further adjust the weighting variable so that significance tests and confidence intervals generated by SPSS reflect the reduced efficiency (and consequent increased standard errors) from your sample design. For example, suppose you have reason to believe your cluster sample is only two-thirds as efficient as SRS. If so, you would calculate NEWWT2 = OLDWT * (1881/2420) * (2/3),
or equivalently, NEWWT2 = NEWWT1 * (2/3), and use NEWWT2 as the weighting variable in the WEIGHT command. That will give you a weighted sample size that is two-thirds your collected sample size, making your significance tests and confidence intervals correspondingly more conservative. A good, clear source on this topic is a book by Dorofeev and Grant, Statistics for Real-Life Sample Surveys, Cambridge University Press 2006.
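In syntax form, the two adjustments described above (OLDWT stands for the existing weight variable):
COMPUTE NEWWT1 = OLDWT * (1881/2420).
* Optional further adjustment for an assumed two-thirds design efficiency.
COMPUTE NEWWT2 = NEWWT1 * (2/3).
EXECUTE.
WEIGHT BY NEWWT1.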
  • asked a question related to SPSS
Question
3 answers
I administered pre and post-achievement tests on one sample (N=16), where I obtained two paired variables and intended to run a paired sample T-test. However, while pre-test data are normally distributed, post-test data are not. Should I go for the Wilcoxon signed-rank test on SPSS?
Relevant answer
Answer
Both the paired t-test and the signed-rank test are conducted on the paired differences. All assumptions for the tests apply to these differences. No assumptions apply to the pre-test or post-test data separately. The tests, and the assumptions, are exactly the same as for the one-sample versions of these tests applied to the differences.
If you have any references that suggest that the assumptions apply to pre- and post- groups separately, you can throw them in the garbage.
Hope that helps...
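In syntax, with hypothetical variable names pre and post, both tests plus a normality check on the paired differences (the only place where normality matters here):
COMPUTE diff = post - pre.
EXECUTE.
* Shapiro-Wilk test and normal Q-Q plot for the differences.
EXAMINE VARIABLES=diff /PLOT=NPPLOT.
T-TEST PAIRS=post WITH pre (PAIRED).
NPAR TESTS /WILCOXON=pre WITH post (PAIRED).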
  • asked a question related to SPSS
Question
5 answers
Hello, fellow researchers! I'm hoping to find someone well familiar with Firth's logistic regression. I am trying to analyse whether certain emotions predict behaviour. My outcomes are 'approached', 'withdrew', & 'accepted' - all coded 1/0 & tested individually. However, in some conditions the outcome behaviour is a rare event, leading to extremely low cell frequencies for my 1's, so I decided to use Firth's method instead of standard logistic regression.
However, I can't get the data to converge & get warning messages (see below). I've tried to reduce predictors (from 5 to 2) and increase iterations to 300, but no change. My understanding of logistic regression is superficial so I have felt too uncertain to adjust the step size. I'm also not sure how much I can increase iterations. The warning on NAs introduced by coercion I have ignored (as per advice on the web) as all data looks fine in data view.
My skill set is only very 'rusty' Python coding, so I can't use other systems. Any SPSS-friendly help would be greatly appreciated!
***
Warning messages:
1: In dofirth(dep = "Approach_Binom", indep = list("Resent", "Anger"), :
NAs introduced by coercion
2: In options(stringsAsFactors = TRUE) :
'options(stringsAsFactors = TRUE)' is deprecated and will be disabled
3: In (function (formula, data, pl = TRUE, alpha = 0.05, control, plcontrol, :
logistf.fit: Maximum number of iterations for full model exceeded. Try to increase the number of iterations or alter step size by passing 'logistf.control(maxit=..., maxstep=...)' to parameter control
4: In (function (formula, data, pl = TRUE, alpha = 0.05, control, plcontrol, :
logistf.fit: Maximum number of iterations for null model exceeded. Try to increase the number of iterations or alter step size by passing 'logistf.control(maxit=..., maxstep=...)' to parameter control
5: In (function (formula, data, pl = TRUE, alpha = 0.05, control, plcontrol, :
Nonconverged PL confidence limits: maximum number of iterations for variables: (Intercept), Resent, Anger exceeded. Try to increase the number of iterations by passing 'logistpl.control(maxit=...)' to parameter plcontrol
Relevant answer
Answer
Data may fail to converge in Firth logistic regression in SPSS due to issues like complete separation, small sample size, multicollinearity, or outliers. It's essential to check the data for these problems and ensure that the model is specified correctly. Consider simplifying the model or collecting more data if necessary.
  • asked a question related to SPSS
Question
2 answers
Is there a difference in the resulting factor loadings when you use SPSS or JASP?
Relevant answer
Answer
A difference in software should not affect factor loadings, provided the same extraction method, rotation, and number of factors are specified. Some programs may round the decimals differently, but apart from that, the results should not vary significantly.
  • asked a question related to SPSS
Question
1 answer
Hi,
I have conducted an RCT (psychological intervention) where participants were randomised to 1 of 3 conditions.
Initially I conducted a per-protocol analysis but now I would like to do an intention to treat analysis and compare the results. The issue is that there is a lot of missing data (about 19%) and the pattern is MNAR.
What is the best approach to doing an intention to treat analysis in this situation? Can I still conduct the analysis with the missing data (not replacing them)?
Relevant answer
Answer
Depending on why and how you want to use the results of the ITT analysis, you could just replace each missing value by the "worst", i.e. most conservative, possible value, whatever that is in your specific trial. Somewhat more sophisticated, you could model the missing values taking the reasons for their missingness into account. To find out which model works best, one can introduce artificial missingness into the complete cases according to the assumed MNAR mechanism, then predict the missing values using different models and compare the predictions to the original values. Finally, it seems that multiple imputation works OK in many cases even if the missingness pattern is MNAR. The following recent article might offer additional insights:
  • asked a question related to SPSS
Question
5 answers
I am pursuing a Ph.D. in Environmental Sciences at Sambalpur University, Odisha, India. I specialise in water and soil analysis. I use SPSS and MS Excel for preparing graphs and other statistical analyses. However, I find myself lacking in data representation in my research articles. I want to improve the visibility of my research papers through the R programming language. Please suggest a few study materials, including YouTube videos, for quick learning.
Relevant answer
Answer
  • "R for Data Science" by Garrett Grolemund and Hadley Wickham (available free online) is excellent for beginners.
  • "The Art of R Programming" by Norman Matloff focuses on learning R as a programming language.
Stay Consistent and Patient
  • Practice Regularly: Set aside time each day or week to practice coding.
  • Solve Real Problems: Try solving problems you’re interested in (e.g., analyzing your own data or datasets from Kaggle).
  • asked a question related to SPSS
Question
4 answers
What are the processes to extract ASI microdata with STATA and SPSS?
I have microdata in Stata and SPSS formats. I want to know about the process. Is there any tutorial on YouTube for ASI microdata?
Relevant answer
Answer
Good morning Sir Florian Schütze
Thank you very much for your reply/comment.
I have visited there. I found videos for PLFS and NSS, but did not find one for ASI.
From the MoSPI microdata catalogue, I have downloaded the data, but I am unable to get the quantities for specific variables. Variables like the number of firms and operating firms I did get, but I am unable to get fixed capital, input, output, and other variables. I merged two blocks and applied the formula, but perhaps there is some mistake, so I am not getting the values.
  • asked a question related to SPSS
Question
14 answers
Hello everyone,
I have some questions regarding the use of these two effect sizes, and I am a bit uncertain about which one to choose for t-tests, independent t-tests, pairwise comparisons, or non-parametric tests.
The primary difference between Cohen’s d and Hedges’ g is that both standardize the mean difference by a pooled standard deviation, but Hedges’ g additionally applies a small-sample bias correction based on the degrees of freedom (n − 1) (https://doi.org/10.1037/h0087427). From this perspective, Hedges’ g may be more practical to use, as the required standard deviation values are typically reported in studies.
Furthermore, based on various opinions:
- For smaller sample sizes, Hedges’ g is considered more appropriate.
- For samples of unequal sizes, Hedges’ g may provide more accurate estimates.
However, many online calculation tools allow the use of Cohen’s d even for unequal sample sizes, and the standard deviation values seem sufficient for calculating Cohen’s d as well.
I would appreciate your thoughts and insights on this matter. Thank you in advance for your contributions.
BERMAN.
Relevant answer
Answer
Thanks for sharing that Turner & Bernard article, Sal Mangiafico. I had not seen it before, but it does look useful.
In response to the original question, I would add that sometimes a raw (or simple) effect size may be preferred over a standardized effect size. See Thom Baguley's 2009 article, for example. HTH.
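For completeness, the usual small-sample relation between the two estimators (Hedges, 1981) is g = J × d, where J ≈ 1 − 3/(4·df − 1) and df = n1 + n2 − 2 for two independent groups. With n1 = n2 = 10, J ≈ 1 − 3/71 ≈ 0.958, so g is about 4% smaller than d; as samples grow, J approaches 1 and the two estimates converge, which is why the choice matters mainly for small studies.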
  • asked a question related to SPSS
Question
2 answers
I'm currently conducting a study on factors that may influence the chances of patients having delirium post-surgery. I have around 30 variables that have been collected, including continuous (Hb, HbA1c, urea levels (pre-op, peri-op, and urea difference), alcohol AUDIT-C score, CPB duration, etc.), categorical (blood transfusion yes/no, smoking status, drinking status, surgery type, etc.), and demographic information (gender, age, ethnicity). The study also looks at whether our current measurements of delirium risk are good predictors of actual delirium (the DEAR score, which consists of 5 yes/no questions and a final total score).
As with many studies using data from previous patients (around 750), there is a lot of missing information in many categories. I have already been conducting assumption tests, including testing the linearity of the logits, and this has excluded some variables. I am using SPSS, if anyone knows of anything on that system I can use.
QUESTIONS:
1. (More of a clarification) I have not been using the pre-op, peri-op, and urea-difference scores in the same models, as I assume this violates the assumption of independence between variables - is this correct? If so, I assume that other variables measuring the same thing in different ways (e.g., age at time of surgery and the delirium-risk question that asks whether patients are over 80) should also be excluded from the same model, and instead I should test the model with each alternative and select the strongest model for prediction?
2. What should I do with my missing data? There is a big chunk (around 50% of the ~750 patients included) that don't have a delirium risk score - should I fit my model only on patients who have a score if I'm investigating the validity of the DEAR score for predicting delirium, or will SPSS select those cases automatically for me? Other missing data include HbA1c (we do not test every patient), ethnicity (the patient has not declared their ethnicity on the system), drinking status (no AUDIT-C score was recorded because the patient either doesn't drink or was not asked), etc. I've seen some discussion about using theory to generate predictions for the missing information, but using that for gender, for example, wouldn't be sensible, as my population is heavily male-centric.
3. Part of our hypothesis involves identifying separate prediction models for males and females if they show different significant influences on the chances of delirium. Can I simply split my data file by gender and conduct the regression that way to get different models for each gender? When I have done this, I have not used gender as a variable in the regression; however, I tested it with all the data and found a significant influence of gender, but only when tested with age and ethnicity or on its own (not in a model that includes all of my variables, or in a model that includes only the significant variables determined from testing various models). Should I just ignore gender altogether?
Sorry for what may seem to be very silly or 'dealer's choice' questions - I am not very experienced with studies with this many variables or cases, and I usually have full data sets (normally collecting data in the here and now, not from previous patients).
Any help or suggestions would be much appreciated!
Relevant answer
Answer
Hello Lucy Birch. Given the number & nature of your questions, I recommend that you find someone local to consult. A quick search took me to this NIHR page:
Perhaps they could help you find a good consultant or collaborator.
Best of luck with your research.
  • asked a question related to SPSS
Question
4 answers
Hello everyone,
I’m seeking advice on the best approach for analyzing data on ESL learners’ self-regulation capacities in vocabulary learning, specifically in relation to socio-demographic variables like age, gender, major, and IELTS scores (comparison). I just wanted to detect differences in self-regulation across background variables, without hypotheses.
Initially, I used MANOVA to identify overall differences across these variables, focusing on five dimensions of self-regulation. I conducted separate MANOVAs for each variable (e.g., age, gender) with the five domains of self-regulation and then summarized the results in a comprehensive table.
Additionally, I performed one-way ANOVAs for each socio-demographic variable across the five domains, which yielded insightful and significant differences. After I wrote up all the results and discussion, however, my supervisor suggested a different approach: using a single MANOVA model that includes all socio-demographic variables and all self-regulation domains simultaneously. This method produced different results, highlighting only some differences in age and English levels. My supervisor is particularly interested in examining interaction effects.
The challenge is that my sample consists of approximately 550 participants, but the distribution is uneven across specific age groups and proficiency levels, leading to some empty cells in the analysis of interaction effects. This imbalance complicates the analysis, and I’m torn between the two approaches. Even the discussion becomes more challenging.
Personally, I favor the first approach because I didn't hypothesize any interaction effects, and the unequal sample sizes across groups make the results from the second approach less reliable. Additionally, the findings from the first approach are more interesting and better aligned with my research objectives.
Given these considerations, I’m concerned about which method is more appropriate. I would greatly appreciate any insights or suggestions on how to proceed with this analysis. Note: I have both the analysis and the discussion written in separate files, but I am concerned about what is best for my thesis.
Thank you in advance for your help!
Relevant answer
Answer
IMO, anyone who is contemplating using MANOVA should read the following articles before proceeding.
HTH.
  • asked a question related to SPSS
Question
3 answers
Q1: I analyzed my data using a nonparametric test, and I have 3 covariates. How can I test their effects in my study? Should I first run the nonparametric test and then run another analysis to test the covariates (all three together, or each covariate separately)?
Q2: When I analyzed my serial mediation effect with Hayes' PROCESS in SPSS, the X variable was group 1, so I got an error message saying that the X variable is a constant. (What should I put in as the X variable? I have to test the effect of group 1 on the DV through serial mediators, and then do it again with group 2, because it has different serial mediators.)
Thank You
Relevant answer
Answer
This answer is almost certainly a "cut and paste" from an AI such as ChatGPT. As such, it shows very little understanding of Question #1. In particular, what does it mean to "Stratify your data: Divide your sample into subgroups based on the covariate(s) and perform the nonparametric test within each subgroup"? If your covariates are things like age and education, this is nonsensical. Even if you have limited categories in your covariates (e.g., gender), how do you compare the results of the non-parametric tests across the subgroups?
The more useful answer is that there is no direct statistical equivalent to the analysis of covariance when you are conducting non-parametric tests.
  • asked a question related to SPSS
Question
5 answers
Hi there,
I was looking for a scoring guide or SPSS/Stata/R syntax for scoring SF 12 version-2. Can anyone help me in this regard? My email address is m.alimam@cqu.edu.au
Thanks in advance.
Relevant answer
Answer
SPSS Syntax
* SF12 V2 scoring.
RECODE SF12HF_1 (1=5) (2=4) (3=3) (4=2) (5=1) INTO SF12HF_1_r.
EXECUTE.
RECODE SF12HF_8 (1=5) (2=4) (3=3) (4=2) (5=1) INTO SF12HF_8_r.
EXECUTE.
RECODE SF12HF_9 SF12HF_10 (1=6) (2=5) (3=4) (4=3) (5=2) (6=1) INTO SF12HF_9_r SF12HF_10_r.
EXECUTE.
RECODE SF12HF_1_r SF12HF_2 SF12HF_3 SF12HF_4 SF12HF_5 SF12HF_6 SF12HF_7 SF12HF_8_r SF12HF_9_r
SF12HF_10_r SF12HF_11 SF12HF_12 (ELSE=Copy) INTO Item1 Item2A Item2B Item3A Item3B Item4A Item4B
Item5 Item6A Item6B Item6C Item7.
EXECUTE.
COMPUTE PF_1=Item2A+Item2B.
EXECUTE.
COMPUTE RP_1=Item3A+Item3B.
EXECUTE.
COMPUTE BP_1=Item5.
EXECUTE.
COMPUTE GH_1=Item1.
EXECUTE.
COMPUTE VT_1=Item6B.
EXECUTE.
COMPUTE SF_1=Item7.
EXECUTE.
COMPUTE RE_1=Item4A+Item4B.
EXECUTE.
COMPUTE MH_1=Item6A+Item6C.
EXECUTE.
COMPUTE PF_2=100*(PF_1 - 2)/4.
EXECUTE.
COMPUTE RP_2=100*(RP_1 - 2)/8.
EXECUTE.
COMPUTE BP_2=100*(BP_1 - 1)/4.
EXECUTE.
COMPUTE GH_2=100*(GH_1 - 1)/4.
EXECUTE.
COMPUTE VT_2=100*(VT_1 - 1)/4.
EXECUTE.
COMPUTE SF_2=100*(SF_1 - 1)/4.
EXECUTE.
COMPUTE RE_2=100*(RE_1 - 2)/8.
EXECUTE.
COMPUTE MH_2=100*(MH_1 - 2)/8.
EXECUTE.
* Transform scores to z-scores.
COMPUTE PF_Z = (PF_2 - 81.18122) / 29.10588 .
EXECUTE.
COMPUTE RP_Z = (RP_2 - 80.52856) / 27.13526 .
EXECUTE.
COMPUTE BP_Z = (BP_2 - 81.74015) / 24.53019.
EXECUTE.
COMPUTE GH_Z = (GH_2 - 72.19795) / 23.19041.
EXECUTE.
COMPUTE VT_Z = (VT_2 - 55.59090) / 24.84380 .
EXECUTE.
COMPUTE SF_Z = (SF_2 - 83.73973) / 24.75775 .
EXECUTE.
COMPUTE RE_Z = (RE_2 - 86.41051) / 22.35543 .
EXECUTE.
COMPUTE MH_Z = (MH_2 - 70.18217) / 20.50597 .
EXECUTE.
* Create physical and mental health composite scores.
COMPUTE AGG_PHYS = (PF_Z * 0.42402) +
(RP_Z * 0.35119) +
(BP_Z * 0.31754) +
(GH_Z * 0.24954) +
(VT_Z * 0.02877) +
(SF_Z * -.00753) +
(RE_Z * -.19206) +
(MH_Z * -.22069).
EXECUTE.
COMPUTE AGG_MENT = (PF_Z * -.22999) +
(RP_Z * -.12329) +
(BP_Z * -.09731) +
(GH_Z * -.01571) +
(VT_Z * 0.23534) +
(SF_Z * 0.26876) +
(RE_Z * 0.43407) +
(MH_Z * 0.48581) .
EXECUTE.
* Transform composite and scale scores to T-scores.
COMPUTE AGG_PHYS_T= 50 + (AGG_PHYS * 10).
EXECUTE.
COMPUTE AGG_MENT_T = 50 + (AGG_MENT * 10).
EXECUTE.
COMPUTE PF_T = 50 + (PF_Z * 10) .
EXECUTE.
COMPUTE RP_T = 50 + (RP_Z * 10) .
EXECUTE.
COMPUTE BP_T = 50 + (BP_Z * 10) .
EXECUTE.
COMPUTE GH_T = 50 + (GH_Z * 10) .
EXECUTE.
COMPUTE VT_T = 50 + (VT_Z * 10) .
EXECUTE.
COMPUTE RE_T = 50 + (RE_Z * 10) .
EXECUTE.
COMPUTE SF_T = 50 + (SF_Z * 10) .
EXECUTE.
COMPUTE MH_T = 50 + (MH_Z * 10) .
EXECUTE.
  • asked a question related to SPSS
Question
12 answers
I am carrying out research on fracture rates in patients with sarcopenia, using the SF-12 version 2 as the QoL tool.
I was wondering if anyone is using the same questionnaire and calculating the scores using SPSS syntax? Thank you very much!
Relevant answer
Answer
SPSS Syntax: the SF-12 v2 scoring syntax is identical to the one posted in the previous answer above (recodes, 0-100 rescaling, z-scores, composite scores, and T-score transformations).
  • asked a question related to SPSS
Question
7 answers
Similar to this poster (https://www.researchgate.net/post/Is_there_a_non-parametric_alternative_to_repeated_measures_ANOVA), I'm trying to run an RM test with non-parametric continuous data, and am now interested in a between-subjects factor (sex). Since the big limitation of the Friedman test is an inability to include between-subjects factors, I was led to GEE, but the process appears complex. Does anyone know of any straightforward guides/steps to run this kind of analysis?
Relevant answer
Answer
Hey,
in the "Type of Model" tab, choose the scale response "gamma with log link". It can work for skewed, non-normal data.
Also, here is a useful file for you.
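For reference, a hedged GENLIN sketch of such a GEE, assuming a long-format file with hypothetical variables id, time, sex, and score; the gamma-with-log-link choice follows the suggestion above and should be checked against your data:
GENLIN score BY sex time
  /MODEL sex time sex*time DISTRIBUTION=GAMMA LINK=LOG
  /REPEATED SUBJECT=id WITHINSUBJECT=time CORRTYPE=EXCHANGEABLE
  /PRINT MODELINFO SOLUTION.
The sex*time interaction is where the between-subjects factor enters the repeated-measures comparison.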
  • asked a question related to SPSS
Question
3 answers
Dear Researchers:
When we do a regression analysis using SPSS and want to measure a specific construct, some researchers take the average of the items under each measure, while others add up the values of the items. Which one is more reliable? Which one produces better results?
Thanks in advance
Relevant answer
Answer
Summing really is not a great approach unless the summed scores are scaled in a way that’s well known in application (and even then is problematic).
The mean has two advantages. 1) If the items have a fixed range, interpretation is usually easier: if it’s a 1-to-7 scale, then 1 is lowest, 7 highest, and 4 in the middle. A summed scale is hard to interpret because of variation in the number of items. 2) With missing items, the sum effectively treats missing as zero, so it distorts the score. The mean treats them as if missing completely at random. This is not perfect, but better than treating values as 0. Also, with only a few missing items, the problems are minor.
Ideally one would impute missing values but I’ve seen quite a few analyses that sum scores with missing data and end up with essentially garbage outcomes (if 0 is an impossible value it can really mess with results).
If there’s no missing data the analyses will be identical except for the interpretation issue. So generally the mean is the better and safer default.
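A small illustration of the difference, assuming five hypothetical items item1 to item5 on one scale; MEAN.4 refuses to produce a score unless at least 4 of the 5 items are valid:
COMPUTE scale_mean = MEAN.4(item1 TO item5).
* SUM() quietly skips missing items, shrinking the total and distorting the score.
COMPUTE scale_sum = SUM(item1 TO item5).
EXECUTE.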
  • asked a question related to SPSS
Question
5 answers
Hello everyone!
I am writing my master thesis together with a fellow student and we are having problems analyzing our data.
We conducted a daily diary study. Our model is a mediation model with the following variables:
- Trait self-control (predictor, estimated at the beginning, level 2)
- Digital Media Self-Control Failure (mediator, estimated via daily diary - evening questionnaire, 5 days, level 1)
- Goal pursuit (outcome, estimated via daily diary - evening questionnaire, 5 days, level 1)
We are really unsure how to analyze the data (since we didn't cover HLM or multilevel mediation in our statistics lecture).
We think the lme4 package for R might be suitable for our analysis, or perhaps Rockwood's MLMED macro for SPSS.
What we have found out so far is that we need to center our mediator using CWC (centering within cluster). We also think that we need an intercept-only model to estimate the ICC. After that, we are not so sure what to do. We found some literature on HLM and on 2-1-1 mediation, but nothing that explained what to do in R/SPSS or how to do it. We are really lost at the moment.
I'm really scared that I might fail my master's thesis because we are not able to do the analysis.
I really hope that someone has an idea or some other input that can help us.
Thank you so much!
Sophie Kittlaus
Relevant answer
Answer
You're on the right track,
1) As you state, center your lvl 1 predictor (DMSCF) around each individual's mean (within-person centering),
## Calculate individual means & centering (don't know what your variables are called)
data$DMSCF_mean <- ave(data$DMSCF, data$id, FUN = mean)
data$DMSCF_CWC <- data$DMSCF - data$DMSCF_mean
2) ICC inspection (using lme4 package)
## Intercept-only model for ICC estimation
library(lme4)
m0 <- lmer(GoalPursuit ~ 1 + (1 | id), data = data)
summary(m0)
vc <- as.data.frame(VarCorr(m0))   # variance components: subject intercept, residual
icc <- vc$vcov[1] / sum(vc$vcov)   # between-person variance / total variance
print(icc)
3) Move on to the multilevel 2-1-1 mediation. First check whether the lvl 2 predictor ("trait self-control") predicts the mediator (DMSCF),
model_1 <- lmer(DMSCF_CWC ~ TraitSelfControl + (1 | id), data = data)
summary(model_1)
Then check the direct effect of "trait self-control" on "goal pursuit",
model_2 <- lmer(GoalPursuit ~ TraitSelfControl + (1 | id), data = data)
summary(model_2)
Next you can check the mediation model,
model_3 <- lmer(GoalPursuit ~ TraitSelfControl + DMSCF_CWC + (1 | id), data = data)
summary(model_3)
4) Check the mediation by testing (Sobel's test) the significance of the product of coefficients (a*b),
## Extract coefficients
a <- fixef(model_1)["TraitSelfControl"]
b <- fixef(model_3)["DMSCF_CWC"]
se_a <- sqrt(vcov(model_1)["TraitSelfControl", "TraitSelfControl"])
se_b <- sqrt(vcov(model_3)["DMSCF_CWC", "DMSCF_CWC"])
ab <- a * b
## Do a Sobel test (computed manually: z = ab / SE_ab)
se_ab <- sqrt(b^2 * se_a^2 + a^2 * se_b^2)
z_sobel <- ab / se_ab
p_sobel <- 2 * pnorm(-abs(z_sobel))
print(c(z = z_sobel, p = p_sobel))
a in this case is the effect of the predictor (trait self-control) on the mediator (DMSCF); b is the effect of the mediator (DMSCF) on the outcome (goal pursuit), controlling for the predictor. If everything is OK, the Sobel z-value can be compared to standard normal critical values to determine significance.
Finally, if the indirect effect a*b is significant (from Sobel), the idea is that you can claim that DMSCF mediates (partially or fully) the relationship between "trait self-control" and "goal pursuit". If the direct effect (TraitSelfControl on GoalPursuit) in model_3 remains significant after including the mediator, it suggests partial mediation. If it becomes non-significant, it suggests full mediation. I don't know the name of your variables and there might be some hiccups in the code as I haven't tested it, but the general idea is there.
Remember to report the ICC in your analysis to justify the use of a multilevel model though! And report all the fixed effects from each model (+mediation effects)
  • asked a question related to SPSS
Question
6 answers
Then, how can I compare whether there is a significant difference between the two independent groups?
Relevant answer
Answer
Hello Israa al Omari. In case it was not clear to you, the linear mixed model Jochen Wilhelm suggested would have measurement occasions (ages) clustered within subjects. I.e., Subject is the Level 2 variable.
If you have not already done so, you should make a plot showing a scatter-plot for each subject separately with X = Age and Y = the DV. See the examples on this UCLA page:
If you show that plot here, readers will be better able to advise you about whether you should treat time (age) as categorical or quantitative. If the number of ages is relatively small, and if measurements are taken at the same ages for all subjects, you might prefer treating it as categorical, for example. If the number of ages is larger, and especially if the ages are not the same for all subjects, you will almost certainly want to treat age as a quantitative variable. In that case, the plots will help you judge whether a simple model with a linear fit makes sense (versus a model with one or more polynomial terms, or a spline, etc.).
Finally, the webpage for Lesa Hoffman's book has SPSS code and output for many examples you might find helpful.
HTH.
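If age ends up being treated as quantitative, a minimal sketch of the random-intercept model described above (variable names dv, age, group, and id are hypothetical):
MIXED dv BY group WITH age
  /FIXED=group age group*age
  /RANDOM=INTERCEPT | SUBJECT(id)
  /PRINT=SOLUTION.
The group*age term speaks to the follow-up question of whether the change over age differs between the two independent groups.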
  • asked a question related to SPSS
Question
3 answers
What is the difference between the path Nonparametric Tests -> Independent Samples (then selecting Mann-Whitney) and the path Nonparametric Tests -> Legacy Dialogs -> 2 Independent Samples?
While doing both, the calculated Mann-Whitney value differs between the two outputs.
Are these different approaches?
Relevant answer
Answer
There are two different procedures in SPSS Statistics to run a Mann-Whitney U test: the Nonparametric Tests > Independent Samples procedure and the Legacy Dialogs > 2 Independent Samples procedure. We recommend the Nonparametric Tests > Independent Samples procedure if your two distributions have the same shape because it is a little easier to carry out, but the Legacy Dialogs > 2 Independent Samples procedure is fine if your two distributions have different shapes.
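The two dialogs paste different commands, which makes the comparison easy to reproduce; with hypothetical variables score and group (coded 1 and 2):
* New procedure.
NPTESTS /INDEPENDENT TEST (score) GROUP (group) MANN_WHITNEY.
* Legacy procedure.
NPAR TESTS /M-W= score BY group(1 2).
Among other things, the U statistic can be reported with respect to the other group (the two versions sum to n1*n2), and one output may show an exact p-value where the other shows an asymptotic one, so differing numbers do not necessarily mean one run is wrong.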
  • asked a question related to SPSS
Question
4 answers
I would like to learn more about SPSS and its applications, especially in regard to data analysis. Please suggest how I can learn more about it.
Thank you so much.
Relevant answer
Answer
You can find some excellent SPSS tutorials for basic and advanced analyses on YouTube.
  • asked a question related to SPSS
Question
8 answers
Are there any statistical methods to justify your sampling technique using SPSS or AMOS?
Relevant answer
Answer
In snowball sampling, true statistical representativeness is difficult to achieve due to its non-random nature. While there isn’t a specific statistical test to confirm representativeness, researchers can:
1. Compare the sample’s characteristics with known population data (if available) to assess representativeness.
2. Start with multiple initial participants (“seeds”) to increase diversity.
3. Apply statistical adjustments like post-stratification weighting to better align the sample with the population.
  • asked a question related to SPSS
Question
1 answer
Using SPSS, I made my edits in one bar chart — e.g., font type and size, hiding grid lines, and colours, namely: blue, orange, green, purple, and grey for 5 bars, respectively, etc. — and I saved the chart as a template to apply to the remaining charts in my data/item analysis. When I analysed another item (from the same questionnaire section) and applied the saved template to it, the font features and grid lines applied successfully; however, the colors part did not take effect (and graph bars remained a default blue), although I thought I made no mistake in the template saving procedure.
Doesn't SPSS save bar colors in templates or have I made a mistake?
Relevant answer
Answer
You can see www.Stats4Edu.com
  • asked a question related to SPSS
Question
6 answers
I want to use SPSS Amos to calculate SEM because I use SPSS for my statistical analysis. I have already found some workarounds, but they are not useful for me. For example, using a correlation matrix where the weights are already applied seems way too confusing to me and is really error prone since I have a large dataset. I already thought about using Lavaan with SPSS, because I read somewhere that you can apply weights in the syntax in Lavaan. But I don't know if this is true and if it will work with SPSS. Furthermore, to be honest, I'm not too keen on learning another syntax again.
So I hope I'm not the first person who has problems adding weights in Amos (or SEM in general) - if you have any ideas or workarounds I'll be forever grateful! :)
Relevant answer
Answer
You can see www.Stats4Edu.com
  • asked a question related to SPSS
Question
3 answers
Afternoon!
Using SPSS, I made my edits in one bar chart—e.g., font type and size, hiding grid lines, and colours, namely: blue, orange, green, purple, and grey for 5 bars, respectively, etc.—and I saved the chart as a template to apply to the remaining charts in my data/item analysis. When I analysed another item (from the same questionnaire section) and applied the saved template to it, the font features and grid lines applied successfully; however, the colour part did not take effect (and graph bars remained a default blue), although I thought I made no mistake in the template saving procedure.
Doesn't SPSS save bar colours in templates, or have I made a mistake?
Relevant answer
Answer
It sounds like you've hit a common snag with SPSS templates. While SPSS does save a lot of chart formatting in templates, bar colors can sometimes be tricky.
Here are a few things to check:
  1. Color Scheme: Make sure your template is using a specific color scheme rather than individual colors for each bar. SPSS might be overriding the individual colors when you apply the template.
  2. Chart Type: Ensure that the chart types match exactly between the original chart and the one you're applying the template to. Even small differences can affect color application.
  3. Data Structure: Verify that the data structure for both charts is identical. The number of categories and values should match.
  4. Template Overwrite: Try creating a new chart from scratch and then applying the template. Sometimes, existing chart settings can interfere with template application.
  • asked a question related to SPSS
Question
3 answers
I am conducting my analysis using SPSS. I log-transformed my data using ln(X+1), as my data contain zero values. However, when I want to back-transform the regression coefficients generated from my regression analyses, I encounter the following problems:
  1. I saw online that back-transformation of ln(X+1) can be done by (e^y)-1. However, since the regression coefficients generated from my log-transformed data are between 0 and 2 (say b = 0.15), after I back-transform (exp(0.15) = 1.16) and then subtract 1 (i.e. 1.16 - 1 = 0.16), the back-transformed value becomes less than 1. I read in a paper that a back-transformed value below 1.0 would correspond to a decrease. Back-transformation using (e^y)-1 therefore makes it seem as if, for each one-unit increase in X, the dependent variable Y decreases by ...%, when in fact with each one-unit increase in X the dependent variable Y should "increase" by ...%. Is this the correct way to back-transform the data? If not, how should I do it?
  2. Besides, some of the regression coefficients generated from the log-transformed data are less than 0 (e.g. -0.15). Should I ignore the negative sign, back-transform it (i.e. exp(0.15) = 1.16), and add back a negative sign? Or should I do it another way?
  3. To be honest, only 0.5% of the responses in my data are 0, and I have more than 1000 responses, so to simplify things, do you think it is appropriate to use ln(X) instead?
Thank you in advance for your help.
Relevant answer
Answer
Hello Pcl Lee. It may be that some of Jochen Wilhelm's guesses about your situation were correct. Nevertheless, it would help if you provided more information. For example:
  • What is the dependent variable?
  • Why did you log-transform it?.
  • What are the explanatory variables?
  • What is your research question?
Thanks for clarifying.
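In the meantime, a small worked note on point 1, under the assumption that the DV really was transformed as ln(Y+1): a coefficient of b = 0.15 back-transforms to e^0.15 ≈ 1.16, meaning each one-unit increase in X multiplies (Y+1) by about 1.16, i.e. roughly a 16% increase. Subtracting 1 merely converts the ratio 1.16 into the proportional change +0.16; it is e^b itself, not e^b − 1, that must fall below 1 to signal a decrease. Likewise for point 2, the sign is kept when exponentiating: b = −0.15 gives e^(−0.15) ≈ 0.86, i.e. about a 14% decrease.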
  • asked a question related to SPSS
Question
3 answers
I am conducting a qualitative-driven mixed-methods study. The role of my quantitative data is to corroborate the findings of the qualitative data. Qualitative data were collected through interviews at the end of the project, and quantitative data with a pre-survey before the project and a post-survey at the end. One question that arises when analyzing the quantitative data from my first project is whether I should use a paired or an unpaired t-test. In the first project, I had a small group of participants: the same 18 participants responded to both the pre- and post-questionnaires, but in total I have 28 participants; the remaining 10 responded to only the pre- or only the post-questionnaire. I am unsure whether 28 participants are enough for an unpaired t-test; however, 18 participants might be enough for a paired t-test. I would appreciate your help with this question, and any resources or videos I can use to analyse pre- and post-questionnaires with SPSS, as I am unsure whether I need to load the data from the pre- and post-questionnaires into the same SPSS file or into separate files. Thanks very much.
Relevant answer
Answer
You clearly have a pre-post design, so you should incorporate this information into your analyses. A repeated-measures (paired) t-test might be suitable if all assumptions are reasonably met.
But since you have additional information, i.e. the other participants who delivered data for either the pre or the post measure, you might think about a multilevel model. This would enable you to use all participants, so the 10 extra participants could contribute to a better estimate of the pre and post mean values.
  • asked a question related to SPSS
Question
5 answers
SPSS, PLS-SEM
Relevant answer
Answer
The book by DC Jain said CVS. PLS-SEM has free books on their website, so check it out
  • asked a question related to SPSS
Question
6 answers
Hello network,
Is there anyone who could help me with my empirical research - especially with the statistical elaboration - on the topic of entrepreneurial intention and business succession in German SMEs, or who has experience with the preparation of Structural Equation Modeling?
Please feel free to send me a private message.
Thank you and best regards
Julia
Relevant answer
Answer
Do you have results and want interpretation, or have you not performed the analysis yet?
  • asked a question related to SPSS
Question
2 answers
Looking for international collaboration with someone with a deep understanding of SPSS (specifically of difference-in-differences models, both DID and SDID).
Relevant answer
Answer
Hi, I will surely look into it. Thank you!
  • asked a question related to SPSS
Question
5 answers
I'm evaluating the reproductive performance of quails. I have the fertility and hatchability percentages for six different sire groups and would like to determine whether there are significant differences between the sires' percentages using SPSS. I'd appreciate it if the steps were shared.
Relevant answer
Answer
Reuben Friday Osemeke, ANOVA suggests a quantitative DV. I may have misunderstood, but I thought that for each individual quail, some binary outcome either occurs or does not occur (e.g., fertile vs not fertile), and that the question is whether the proportions differ across the 6 sire groups. If my understanding is correct, a Chi-square test of association seems more appropriate. But let's wait for clarification from Samuel Ter Vincent.
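If it is indeed one row per bird with a binary outcome, a minimal sketch of that Chi-square test (variable names sire_group and fertile are hypothetical):
CROSSTABS /TABLES=sire_group BY fertile
  /STATISTICS=CHISQ
  /CELLS=COUNT ROW.
The row percentages reproduce the fertility percentage per sire group, and the Pearson Chi-square tests whether those proportions differ across the six groups.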
  • asked a question related to SPSS
Question
2 answers
Video tutorial will be effective
Relevant answer
Answer
Suhel Sen, providing more information about your data will greatly increase the chances of someone being able to help you.
Relying almost entirely on interneTelepathy (credit to the late David Marso for coining that expression), I suspect you are estimating a binary logistic regression model with occurrence of a landslide (Yes/No) as the outcome variable. If so, use the SAVE sub-command to save the predicted probability as a new variable in your dataset, and use it as input to the command that generates the ROC curve. But the best thing would be to provide a lot more info about what you are doing! HTH.
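A syntax sketch of that workflow, with a hypothetical binary outcome landslide and hypothetical predictors:
LOGISTIC REGRESSION VARIABLES landslide
  /METHOD=ENTER slope rainfall landcover
  /SAVE=PRED.
* SAVE=PRED adds the predicted probability (PRE_1 by default); feed it to ROC.
ROC PRE_1 BY landslide (1)
  /PLOT=CURVE(REFERENCE)
  /PRINT=SE COORDINATES.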
  • asked a question related to SPSS
Question
6 answers
Hi everyone, 
I am working on case-control genetic data. I have data for a SNP with genotypes AA, AG and GG. I want to check the effect of these individual genotypes on the disease outcome, which in my case is diabetes. Now I want to calculate unadjusted and adjusted (for age and gender) odds ratios in SPSS. I calculated the unadjusted odds ratio using multinomial regression (is this suitable for my data?), but I do not know how to calculate the odds ratio after adjustment for age and gender.
Secondly, how can I apply dominant, recessive and co-dominant models to check which fits best?
I would be highly obliged if you could provide a flowchart of commands, like
Spss--> analyze--> regression--> .........
Thanks, 
Misbah
Relevant answer
Answer
I also want to do the same analysis. I coded the genotypes as 1, 2, 3 for TT, TC and CC, and want to estimate the risk of infection as a dichotomous yes/no outcome, but I cannot get a significant OR. However, when I compared the infection percentages among the three genotypes, I found a significant p-value by chi-square. I found the highest infection rate among TC carriers, but in the logistic regression the OR for TC was the lowest and non-significant. I am confused; it's my PhD work.
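Returning to the original question, a hedged sketch of the adjusted analysis via binary logistic regression (variable names are hypothetical; genotype coded 1=AA, 2=AG, 3=GG, and sex assumed to be numeric 0/1). Entering genotype as a categorical predictor alongside age and sex gives adjusted ORs in the Exp(B) column; the same model without age and sex gives the unadjusted ORs:
LOGISTIC REGRESSION VARIABLES diabetes
  /METHOD=ENTER genotype age sex
  /CATEGORICAL=genotype
  /CONTRAST (genotype)=INDICATOR(1)
  /PRINT=CI(95).
* Dominant and recessive codings (hypothetically taking G as the risk allele); fit each
* in place of genotype and compare model fit to see which inheritance model works best.
RECODE genotype (1=0) (2=1) (3=1) INTO dominant.
RECODE genotype (1=0) (2=0) (3=1) INTO recessive.
EXECUTE.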
  • asked a question related to SPSS
Question
2 answers
Dear all,
For my study, I ran a multiple regression with robust standard error, and a clustered multiple regression with robust standard error (using Huang's (2020) SPSS macro).
I noticed that the Rsquare of the simple multiple regression was the same as that of the robust standard error regression.
But when I run the regression with the clustered robust standard error, I don't get the Rsquare value. I was wondering if this is the same as for regression with robust standard error?
If the Rsquare doesn't change with different adjustments of the standard error ( robust standard error, or cluster robust standard error ) , could someone explain the idea behind it?
If not, could you tell me how to get the “Model Summary” output containing the Rsquare for the clustered robust standard error regression?
I am using IBM SPSS version 27
Thanks in advance,
Relevant answer
Answer
Moi Iannello R2 is an assessment of how well the IVs explain the variance in the DV.
When you use robust standard errors, whether heteroskedasticity-robust or clustered, you are only adjusting the standard errors of the regression coefficients so that they are more reliable under certain conditions (like heteroskedasticity or clustering). The coefficient estimates, fitted values, and therefore the R-square are unchanged, which is why the R-square matched in your first comparison; the clustered macro may simply not print a model summary, so you can report the R-square from the ordinary regression output.
  • asked a question related to SPSS
Question
3 answers
Relevant answer
Answer
As a mathematical statistician, I note that:
- SPSS and Minitab have limited machine learning (ML) capabilities compared to STATA and Python.
- However, they offer interfaces to Python, allowing integration of Python's ML libraries.
- This enables users to combine the strengths of both worlds: statistical rigor and ML capabilities.
- I emphasize the importance of rigorous statistical thinking when applying ML techniques.
  • asked a question related to SPSS
Question
3 answers
Hello everyone,
I am a physician interested in research. For beginners, which one do you recommend: SPSS or Stata? Your response will be much appreciated. Thank you.
Relevant answer
Answer
It is important to carefully consider your specific research needs. As a research-oriented physician, the choice between SPSS and Stata totally depends on your individual requirements and preferences.
SPSS is well-suited for conducting complex data modeling and provides analysis capabilities that are ready for production. On the other hand, Stata is a versatile package that offers robust features for both data management and analysis.
I would suggest using SPSS if you need to model and analyze complex data. However, it is recommended that you thoroughly assess both options to determine which tool aligns better with your research goals and the data you have for analysis.
  • asked a question related to SPSS
Question
3 answers
I did a moderation analysis using SPSS and PROCESS macro model 3, in which the three-way interaction X × Z × W was not significant, but the interaction between X and Z was. I am confused about how to interpret this; could someone please help me get a grip on it?
Relevant answer
Answer
Yes I can, thank you!
  • asked a question related to SPSS
Question
2 answers
How can I analyse a split-plot design in SPSS?
Relevant answer
Answer
To analyze a split-plot design in SPSS:
1. Organize your data, ensuring you have columns for the dependent variable, between-subjects factors, and within-subjects factors.
2. Load your dataset, then go to the Analyze menu, select General Linear Model, and choose Repeated Measures.
3. Define your within-subjects factor by naming it (like "Time") and specifying the number of levels (e.g., three time points), then assign your within-subjects variables (e.g., measurements at different time points).
4. Move your between-subjects factor (e.g., "Group") to the appropriate box.
5. Specify the model, create interaction plots if desired, and choose additional options like descriptive statistics or effect sizes.
6. Run the analysis by clicking OK. SPSS will produce tables showing means, standard deviations, and significance tests for the effects and interactions of both within- and between-subjects factors. Interpret these results to understand how the factors influence your dependent variable.
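For the syntax-minded, a minimal sketch assuming wide-format data with hypothetical variables t1, t2, t3 (the repeated measurements) and group (the between-subjects factor):
GLM t1 t2 t3 BY group
  /WSFACTOR=time 3 Polynomial
  /WSDESIGN=time
  /DESIGN=group.
The output automatically includes the time effect, the group effect, and the time*group interaction that characterises the split-plot (mixed) design.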
  • asked a question related to SPSS
Question
4 answers
I ran a multiple linear regression in SPSS, and it turned out that one of the variables was not statistically significant.
In the multiple regression equation, if X1 is not significant at p < 0.05, can this variable still be entered into the equation?
Y = A + B1X1 + B2X2 + ...
Relevant answer
Answer
In my research there are hypotheses; to test them I used multiple linear regression, and afterwards I want to write a prediction equation for the dependent variable,
according to this equation: Y = A + B1X1 + B2X2 + ...
  • asked a question related to SPSS
Question
3 answers
Hello,
I would really appreciate some help interpreting the output from my simple mediation analysis using PROCESS macro in SPSS please.
For context, the X predictor is severity of nausea and emesis in pregnancy (PUQE), the hypothesised mediator M is emetophobia (SPOVI), and the outcome Y is parental stress (PSS).
I believe the overall indirect effect of X on Y is significant [Effect = .1094, 95% CI (.0070, .2603)]; however, when you look at the significance of the individual paths, a = 1.1232 (.3070) is significant at p = .0003, but b = .0974 (.0532) does not reach statistical significance at p = .0686.
I am stuck as to how to best interpret this result - is it the overall indirect effect that matters, or the significance of the individual paths? Why would one be significant but not the other?
I have attached the full output, thank you in advance for any help/insight.
Relevant answer
Answer
"Statistical significance" is a tricky and somewhat controversial issue regardless of the specific context of mediation analysis. Regardless of the general debate as to whether "statistical significance testing" is useful at all, I would be hesitant (or at least cautious) to interpret a "significant" indirect effect when the b path (or the a path) themselves are not "significant" and/or not substantial in size. That is, it does not make sense to me logically that there could be a "significant" indirect effect of X on Y when either the a or b paths are not statistically different from zero. When the b path is zero in the population, this means M does not cause variation in Y. In that case, there cannot be an indirect effect of X on Y via M.
That being said, this may be an issue of statistical power. You may have enough power to show that the indirect effect is "significantly different from zero" using the bootstrap confidence interval, while you do not have sufficient power to "detect" the b path alone as "significant." The stronger a path may drive the "significance" of the indirect effect. (Notice however, that the lower bound for your confidence interval for the indirect effect is also really close to zero.)
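(As a quick check on the mechanics: the indirect effect is the product of the two paths, a*b = 1.1232 * 0.0974 ≈ 0.1094, which matches the effect you report for the bootstrap confidence interval. So the question is really whether that product is distinguishable from zero, which the bootstrap CI addresses directly, versus whether each path is individually, which the separate p-values address.)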
  • asked a question related to SPSS
Question
14 answers
Hello,
I am writing a thesis where I want to test the moderation effect of 4 groups (that I transformed into 3 dummies) on the main effect of the independent variable on the dependent variable. The 3 dummies are: Female Senior, Female Adult, and Male Senior (the baseline being Male Adult). However, I do not know if I am allowed to just test the moderation of the 3 of them separately and compare their coefficients, or if I should put the 3 dummies together in PROCESS. If the latter is true, I do not know how to do this.
Could you please let me know if I should put the 3 of them together, and if so, how?
Thank you for your help!
Relevant answer
Answer
Oluwaseyi Ayorinde Mohammed, please elaborate in detail on where I contradicted myself, and please provide adequate references. Thank you.
  • asked a question related to SPSS
Question
5 answers
Each row is a therapy session, identified by a variable labelled ID. Each unique ID has between 1 and 50 rows (sessions) in SPSS. How can I select only those unique IDs with 3 or more sessions?
Relevant answer
Answer
This can surely and simply be done in SPSS. You can create your desired 'Number_of_sessions'-variable with only one small syntax-command:
aggregate outfile=* mode=addvariables /break=ID /Number_of_sessions=N.
You do not even have to sort the cases by 'ID' beforehand (although that used to be necessary, but not anymore). Sorting by 'ID' is nevertheless probably a nice thing to do anyway, so if the records are not already grouped by 'ID', you can do:
sort cases by ID.
You can use the new variable 'Number_of_sessions' like any other variable, for instance to select the records you need (3 or more sessions):
select if Number_of_sessions>=3.
Or use it for filtering purposes, if you do not want a 'permanent' selection. Instead of filtering, you can also opt for a temporary selection, which is only active for the next procedure (frequencies, correlation, regression, anova, or whatever). For example:
temporary.
select if Number_of_sessions>=3.
frequencies Number_of_sessions /histogram /statistics.
So, open that syntax window, type 'aggregate etc.' (or copy/paste it) and... run! If you want to know the ins and outs of 'aggregate', place the cursor on the word 'aggregate' in the syntax (anywhere on that line, actually) and press the function key 'F1' on your keyboard. Note that these kinds of operations are much easier to accomplish with one or two syntax lines (which you can save for later reference and/or re-use) than with the menus.
That's all.
  • asked a question related to SPSS
Question
6 answers
Hi everyone,
Does anyone have a detailed SPSS (v. 29) guide on how to conduct Generalised Linear Mixed Models?
Thanks in advance!
Relevant answer
Answer
Ravisha Jayawickrama, don't thank Onipe Adabenege Yahaya, thank ChatGPT; you could have gotten the same answer yourself.
  • asked a question related to SPSS
Question
3 answers
Hello,
If I have the ROC curve from SPSS output, how can I determine the cut-off?
Relevant answer
Answer
I would compute sensitivity and specificity values for various threshold points and identify the threshold where the sum of sensitivity and specificity is highest.
Please check Youden's Index (Sensitivity + Specificity - 1).
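In SPSS, the coordinates needed for this are available directly from the ROC procedure. A minimal sketch, assuming a predictor named testvar and a state variable named outcome with the positive state coded 1 (hypothetical names):
ROC testvar BY outcome (1)
  /PLOT=CURVE(REFERENCE)
  /PRINT=COORDINATES.
The resulting 'Coordinates of the Curve' table lists sensitivity and 1 - specificity for every candidate cut-off; compute J = sensitivity + specificity - 1 for each row and take the cut-off with the largest J.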
  • asked a question related to SPSS
Question
3 answers
In SPSS 25, I have a categorical variable which I would like to display in a frequency table. However, one of my categories was not selected by any of my respondents. I can generate a frequency distribution for this variable, but the unselected category is not included. How can I generate a frequency distribution table to show a zero count for this category?
Relevant answer
Answer
Eva Tsouparopoulou I fail to see why including the full scale in a frequency table is problematic. In fact, it could be informative if, say, the empty level is towards the very bottom or top of the scale.
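If you do want the zero-count row to appear, one option is the CTABLES procedure (part of the Custom Tables add-on module), which can force empty categories into the table. A minimal sketch, assuming a variable named myvar coded 1 to 5 (hypothetical name and codes):
CTABLES
  /TABLE myvar [COUNT]
  /CATEGORIES VARIABLES=myvar [1, 2, 3, 4, 5] EMPTY=INCLUDE.
The EMPTY=INCLUDE keyword keeps categories with zero observed cases in the output.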
  • asked a question related to SPSS
Question
4 answers
I plan to apply multinomial logistic regression using the complex samples option of SPSS. The dependent variable has 4 categories (low, moderate, high, and very high), and the 5 independent variables are categorized as yes/no. The 'low' category of the dependent variable will be the reference. 'No' will be the reference category for each independent variable.
Relevant answer
Answer
Gabriel Ezenri The procedure you described is well known to me. But the problem is that it does not give the odds ratio with a corresponding p-value; it only gives the odds ratio.
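If it is the significance of the odds ratios you are after, note that the odds ratio is just Exp(B), so testing B = 0 is equivalent to testing Exp(B) = 1; the p-value reported for each coefficient in the Parameter Estimates table therefore applies to the corresponding odds ratio as well, even though it is not printed next to it.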
  • asked a question related to SPSS
Question
3 answers
I plan to apply multinomial logistic regression using the complex samples option of SPSS. The dependent variable has 4 categories (low, moderate, high, and very high), and the 5 independent variables are categorized as yes/no. The 'low' category of the dependent variable will be the reference. 'No' will be the reference category for each independent variable.
Relevant answer
Answer
Heba Ramadan These videos do not sufficiently describe the procedures.
  • asked a question related to SPSS
Question
11 answers
In my study I have 2 groups of patients (diagnosed with stage 1 and 2 at the beginning of treatment), where the dependent variable is "lesion volume", which was measured before and after treatment. Treatment time is not the same for each patient. I want to compare the response to treatment between the 2 groups.
What is the best statistical test to compare the response of lesion volume to treatment? Is a mixed ANOVA a possible test for my study?
Relevant answer
Answer
If you have good reasons to assume that the volume should reduce, then you can divide the p value corresponding to a two-tailed distribution by 2 and take p = 0.008, one-tailed (for Stage 0-1). One tail of the distribution corresponds to the volume increasing and the other tail to the volume decreasing.
This result doesn't affect the independent-samples t-test because you are analyzing different things.
  • asked a question related to SPSS
Question
3 answers
Dear Colleagues
I carried out a multinomial logistic regression to predict the choice of three tenses based on the predictor variables as shown on the image. According to the SPSS output below, the predictor variable "ReportingVoice" appears to have the same result as the intercept. I wonder why this issue happens and how I should deal with this problem. Thank you for your help. Please note that I'm not good at statistics, so your detailed explanation is very much appreciated.
Relevant answer
Answer
Marius Ole Johansen Thank you for your very clear answer. I think I'll exclude this variable.
  • asked a question related to SPSS
Question
3 answers
Dear All,
I need your help :)
For my thesis, I've done an experiment and I need to analyze it on SPSS.
Specifically, I have 112 participants who had to read 7 different scenarios, and after reading each scenario, they had to indicate their degree of agreement, on a rating scale, with five different statements that were intended to measure my dependent variables. The aim is to analyze how the different scenarios impact the dependent variables.
independent variables: scenario 1, scenario 2, scenario 3, scenario 4, scenario 5, scenario 6, and scenario 7
dependent variables: motivation, satisfaction, collaboration and help.
control variables: gender (male, female), age, and status (employed, unemployed, student, self-employed, retired, other)
My supervisor advised me to do a multiple linear regression that takes clustered standard errors into account (in order to control for individual heterogeneity). He explained that this would account for my within-subject design.
Unfortunately, I'm a beginner in statistics and this is the first time I've used SPSS, so I can't figure out how to perform a test that takes clustered standard errors into account.
Could you please help me and explain how to do it in SPSS?
Relevant answer
Answer
You mention a supervisor. Questions like this ought to be asked with supervisors. Otherwise you may get into a situation where you've used a method, perhaps devoting a lot of time to it, and then your supervisor says it is wrong and you've wasted your time. If your supervisor doesn't know how to do something you've been asked to do, ask other professors in your department. If you find no expertise in this way, ask your supervisor who to contact, or what resources might be helpful. If you are in a decent university, there will be a library. It would not be a bad idea to go there and spend a few days looking at all the statistics books.
  • asked a question related to SPSS
Question
11 answers
Hello every one,
I ran a binary logistic regression in SPSS but did not get results because of complete separation. How can I solve this problem?
Thanks in advance.
Relevant answer
Answer
If you need to install it manually, you can do so fairly easily via the Extension Hub: Extensions > Extension Hub. Then search on "Firth", as in the attached image. HTH.
  • asked a question related to SPSS
Question
13 answers
In my research, I have two independent variables (one within-subjects, the other between-subjects) and a binary dependent variable (i.e., yes or no).
As I have two IVs and one of them is a repeated measure, I cannot conduct analyses such as chi-square and logistic regression in SPSS. I came across an article saying that for analysis purposes we can sometimes treat a binary variable as continuous, but that in certain circumstances doing so can affect the interpretation of results.
Therefore, I am wondering whether, in my case, I can treat my binary DV as continuous and conduct a mixed-model ANOVA. Can I interpret the result the same way as if the DV were continuous? Thank you in advance for your assistance.
Relevant answer
Answer
Dear Bruce Weaver , sorry for the long delay to answer. You are absolutely right, my answer was not correct about the differences in the log-odds. Somehow, I was thinking in a totally wrong direction, maybe because I find the log-odds very unintuitive to grasp. That's why I would prefer the transformations proposed in my former post.
But nonetheless, my answer was not correct concerning the differences!
I will correct my former post.
  • asked a question related to SPSS
Question
5 answers
I am running a PCA in JASP and SPSS with the same settings; however, the PCA in SPSS shows some factors with negative values, while in JASP all of them are positive.
In addition, when running an EFA in JASP, it allows me to produce results with Maximum Loading, while SPSS does not. JASP goes so far with the EFA that I can choose to extract 3 factors and somehow get the results that one would have expected from previous research. However, SPSS does not run under the Maximum Loading setting, regardless of setting it to 3 factors or eigenvalue.
Has anyone come across the same problem?
UPDATE: Screenshots were updated. EFA also shows results in SPSS, just without cumulative values, because value(s) are over 1. But why the difference in positive and negative factor loadings between JASP and SPSS?
Relevant answer
Answer
Hi all.
Why can't I see the communalities table for the items in the EFA results in JASP?
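Regarding the sign differences in the original question: component loadings are only identified up to reflection. If v is an eigenvector of the correlation matrix R with eigenvalue lambda (Rv = lambda*v), then -v is an equally valid eigenvector, so two programs can legitimately report the same component with all loadings flipped in sign. The solutions are equivalent; you can multiply all loadings of a component by -1 to match the other program's output.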
  • asked a question related to SPSS
Question
8 answers
I'm testing the change in lesion volume between disease stages. I have measured lesion volume at T0 and T1, and calculated the change in volume with the formula:
(VolumeT1 - VolumeT0) / VolumeT0 * 100
In my sample, several subjects have an increase in volume (positive number) and several subjects have a decrease in volume (negative number).
Can I just input all these mixed values into SPSS? And do these mixed values affect the distribution of my data set and the choice between parametric and nonparametric tests? Thank you so much
Relevant answer
Answer
Onipe Adabenege Yahaya is at it again.
Please stop spamming questions you don't know the answers to.
  • asked a question related to SPSS
Question
6 answers
I have fit a Multinomial Logistic Regression (NOMREG) model in SPSS, and have a table of parameter estimates. How are these parameter estimates used to compute the predicted probabilities for each category (including the reference value) of the dependent variable?
Relevant answer
Answer
PCP. The results in this column are obtained by selecting the Predicted category probability option. If you pay attention, the value in this column is the same as the largest number in the EST1, EST2 and EST3 columns. In other words, the numbers in this column can be seen as the probability of each person being placed in the category shown in the PRE column.
ACP. The results in this column are obtained by selecting the Actual category probability option. It is also very simple to understand. The numbers in this column show the probability of each person being placed in their actual, observed group. For example, in the picture I sent, person number eight is interested in mathematics, as shown in the Subject column. Now, based on the multinomial model, the estimated probability that this person is interested in mathematics is 26%.
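To answer the original question about the formula: for a multinomial logit with reference category K, each non-reference category k has a linear predictor built from its row of parameter estimates, and the predicted probabilities follow by the softmax-style transformation
$$\eta_k = \beta_{k0} + \beta_{k1}x_1 + \cdots + \beta_{kp}x_p, \qquad P(Y=k) = \frac{\exp(\eta_k)}{1 + \sum_{j=1}^{K-1}\exp(\eta_j)},$$
with the reference category receiving the remainder, $P(Y=K) = 1 / (1 + \sum_{j=1}^{K-1}\exp(\eta_j))$. These per-category probabilities are what SPSS saves as EST1, EST2, EST3, and so on.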
  • asked a question related to SPSS
Question
7 answers
I have a question. I made a comparison between women and men on a series of attitudes concerning romantic relationships. Significant differences were found in a series of t-tests, but I also wanted to say something about the great similarity between the two groups on many issues, such as recovery time from separation, responsibility for keeping in touch, and more.
Correct me if I'm wrong, but where similarity is found, or no difference is found between women and men, that is a finding no less important than significant results in a t-test for differences.
Now here is the question:
What measure of similarity between groups do we have? Is there a procedure in SPSS that can give an idea of the extent of similarity?
Or should I assume that everything that shows no difference shows similarity? And how would you support that claim?
Relevant answer
Answer
Hello Ornat Turin. If you do want to read up on equivalence testing, I think the following articles are quite accessible:
  • asked a question related to SPSS
Question
13 answers
Hello fellow researchers,
In my research, I investigate two members of the same household. The members of the same household share the same ID number (Nomem_encr). I want to retain household pairs where one member is the household head (position=1) and the other member is the residing child (position=5). Currently, alongside the pairs I desire (position=1 and position=5), I also have household pairs where both members are household heads (position=1 and position=1) or both are residing children (position=5 and position=5). How can I remove these unwanted pairs without losing my desired pairs using SPSS?
Kind Regards,
Raquel
Relevant answer
Answer
Hello Raquel Garcia. It will be much easier to give advice if you provide a small dataset showing how your data is set up. Judging from what you said about the members of pairs sharing the same ID number (Nomem_encr), I think you probably have a LONG file structure, with one row per person, so two rows per pair. But even if that is true, there are other unanswered questions. E.g.,
  • How many rows per ID number are possible?
  • What are the other possible values for the position variable (if any)?
Thank you for clarifying.
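If it turns out the file is indeed long (one row per person) and position takes only the values 1 and 5 within these pairs, a minimal sketch of one way to do the selection, using hypothetical helper-variable names:
COMPUTE is_head = (position = 1).
COMPUTE is_child = (position = 5).
AGGREGATE OUTFILE=* MODE=ADDVARIABLES
  /BREAK=Nomem_encr
  /n_heads=SUM(is_head)
  /n_children=SUM(is_child).
SELECT IF (n_heads = 1 AND n_children = 1).
This keeps only households containing exactly one head and exactly one residing child, dropping the head-head and child-child pairs.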
  • asked a question related to SPSS
Question
3 answers
How can I produce calibration curves for prediction models in SPSS? I think this is important in addition to the external validation.
Relevant answer
Answer
Amount of bleeding, or presence of bleeding (yes/no)? If the latter, then the Cross Validated thread I mentioned earlier looks relevant.
  • asked a question related to SPSS
Question
1 answer
I'm an M.Phil English student; my area of interest is exploring teachers' perceptions of implementing inquiry-based teaching. I need your guidance regarding a theory or theoretical model. My base paper uses constructivist theory with a mixed (qualitative and quantitative) method, including interviews and a questionnaire survey. The researcher did not use SmartPLS or SPSS to run the data. What will be suitable for me? Which theory or theoretical model should I use?
Relevant answer
Answer
A theoretical model is very different from an analysis tool such as SmartPLS or SPSS. If you want to test a theoretical model, you first need to specify hypotheses that can be converted into measurable variables; then you would conduct a survey to collect data on those variables, which you would analyze with SmartPLS or SPSS.
  • asked a question related to SPSS
Question
3 answers
Hello,
I have data in SPSS from a scale I administered. I have 1 variable which is age (separated into 3 categories using values 1-3), a 2nd variable which is gender (again separated into 3 categories using values), and finally a 3rd IV of education level (separated into 5 categories). I need to compare each category against their scores on a 10-item scale.
I am measuring attitudes, each person completed the same scale and I have input them into SPSS but I am unsure which test I need to do to see if there is a difference between each variable's category and their scale scores.
Relevant answer
Answer
In your case, you have one dependent variable and three independent variables, which means comparing a scale variable between the groups formed by different combinations of those variables. For this analysis, you need to use an independent factorial ANOVA (see the sketch below). Another option is to use multiple regression analysis. Best wishes
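A minimal sketch of the factorial ANOVA in syntax, assuming the 10-item total is named scale_score and the factors are named age_group, gender, and edu_level (hypothetical names):
UNIANOVA scale_score BY age_group gender edu_level
  /PRINT=DESCRIPTIVE ETASQ.
By default UNIANOVA fits the full factorial model (all main effects and interactions). Note that 3 x 3 x 5 = 45 cells can get thin quickly; with a modest sample you may prefer a main-effects-only design (/DESIGN=age_group gender edu_level).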
  • asked a question related to SPSS
Question
4 answers
In my research, I have 11 multiple-choice questions about environmental knowledge, each question with one correct option, three incorrect options, and one "I don't know" option (5 options in total). When I coded my data in SPSS (1 for correct and 0 for incorrect responses) and ran a reliability analysis (Cronbach's Alpha), it was around 0.330. I also ran a KR-20 analysis since the data are dichotomous, but it was still not over 0.70.
These eleven questions have been used in previous research, and when I checked, they all reported a reliability over 0.80 with sampling similar to that of my research. This got me thinking whether I was doing something wrong.
Might the low reliability be caused by each question measuring knowledge from a different environmental topic? If this is the case, do I still have to state the scale's reliability when using the results in my study? For example, I can give correct and incorrect response percentages, calculate the sum scores, etc.
Thank you!
Relevant answer
Answer
If the questions tap into different topics (i.e., in case of multidimensionality), it likely does not make sense to apply a reliability measure such as Cronbach's alpha. Alpha implies a unidimensional scale/measurement model (i.e., all items measuring a single common factor).
  • asked a question related to SPSS
Question
13 answers
I am using IBM SPSS version 21 as statistical analysis software. My research is about comparing 2 different populations. Let's say it's group A and group B.
Each group has variables changing between 2 different timelines : T0 and T1, and these variables are qualitative. Let's say one of these variables is called X.
X is coded either 0 for no, or 1 for yes.
As a qualitative variable, the frequency of X is calculated in percentage by SPSS.
The difference between the frequencies in T0 and T1 is calculated manually by this formula : ( Freq of X(T0) in group A - Freq of X (T1) in group A )/ Freq of X(T0) in group A * 100
So we obtain the variation between these 2 timelines in percentage.
My question is: how do I compare the variation of group A versus the variation of group B between these two timelines (T0 and T1) using SPSS?
Relevant answer
Answer
Thank you for clarifying, Mariam Mabrouk. In that case, I see that you have two possible ways to account for the correlated nature of the data:
1) Using GENLIN with generalized estimating equations (GEE); or
2) Using GENLINMIXED with occasions clustered within patients.
In both cases, I would likely choose a logit model.
For option 1, there are some relevant examples here:
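In outline, the GEE model in GENLIN syntax might look like the following sketch (hypothetical variable names: binary outcome y, factors group and time, patient identifier id):
GENLIN y (REFERENCE=FIRST) BY group time
  /MODEL group time group*time DISTRIBUTION=BINOMIAL LINK=LOGIT
  /REPEATED SUBJECT=id WITHIN=time CORRTYPE=EXCHANGEABLE.
The /REPEATED subcommand is what makes this a GEE fit, with measurement occasions clustered within patients.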
For option 2, this book has some relevant info:
Cheers,
Bruce
  • asked a question related to SPSS
Question
3 answers
Hi! My hypothesis has 2 Likert-scaled variables to check the effect on one dichotomous dependent variable. Which test should I use in SPSS? Can the dichotomous variable later be used as a DV in mediation analyses?
Relevant answer
Answer
Usually logistic regression.
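A minimal sketch of that test in syntax, assuming the two Likert-scaled predictors are named x1 and x2 and the dichotomous outcome is named y, coded 0/1 (hypothetical names):
LOGISTIC REGRESSION VARIABLES=y
  /METHOD=ENTER x1 x2
  /PRINT=CI(95).
And yes, a dichotomous variable can serve as the DV in a mediation analysis; PROCESS, for example, automatically switches to a logistic model for the outcome equation when Y is binary.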
  • asked a question related to SPSS
Question
3 answers
1. Is SmartPLS the only way to deal with formative constructs?
If we have formative constructs, can't we use SPSS and AMOS? If not, does that mean that all the studies done with SPSS and AMOS use only reflective constructs (with similar-meaning items)?
2. What should I do if I have formative constructs but want to do EFA & CFA? Can I do ANOVA, MANOVA, and other multivariate tests on formative constructs?