Applied Psychometrics - Science topic
Questions related to Applied Psychometrics
Hello! I have this scale which had 10 items initially. I had to remove items 8 and 10 because they correlated negatively with the scale, and then I removed item 9 because Cronbach's alpha and McDonald's omega were both below .7, and after removing it they are now above .7, as it shows in the picture.
My question is, should I also remove item 7 (IntEm_7) because it would raise the reliability coefficients even more and its item-rest correlation is low (0.16), or should I leave it in? Is it necessary to remove it? And also, would it be a problem if I'm now left with only 6 items out of 10?
My goal is to see the correlation between three scales, and this is one of them. I am using JASP.
Any input is appreciated, thank you!

Currently, I am adapting the Workforce Agility scale to the Indonesian language and culture. To clarify the items further, I am adding contextual information in parentheses "()" for better understanding, alongside the original translated items. Is this permissible?
For example:
1. Saya senang bekerja sama dengan orang lain (baik dengan tim yang sama ataupun lintas departemen/fungsional) ["I enjoy working together with other people (whether within the same team or across departments/functions)"]
2. Saya suka mengambil tanggung jawab atas berbagai urusan di tempat kerja (selain tugas-tugas yang menjadi tanggung jawab saya) ["I like taking responsibility for various matters at work (beyond the tasks that are my own responsibility)"]
The AVE of the scale is below 0.5, while the rest of the parameters, viz. CR and discriminant validity, are above the threshold level.
My objective is to generate a composite index of risk perceptions.
I want to assess whether there are differences in risk perceptions of 2 subpopulations. Using Likert-scale responses (scale of 1-5 where 1=extremely serious and 5=not serious) to several Likert-type questions, I will build an index of risk perceptions for each subpopulation. I have 10 Likert-type questions and for each question (or risk perception category), I understand that I have to calculate the weighted mean of responses for each respondent, X:
X = {5 x F(5) + 4 x F(4) + … + 1 x F(1)} / N, where F(k) is the frequency of responses at level k and N is the total number of respondents/observations
(5 = extremely serious, …, 1 = not serious)
I have a total of 14 questions (or risk perception categories). To get the index (or mean score) for each subpopulation, I read that I just need to take the ratio of each weighted mean X to the total number of questions (i.e. 14). Is this correct?
If so, how do I do that in Stata? This index will then be used in an ordered logit regression as one of the dependent variables.
Thank you in advance.
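A minimal sketch of the index computation described above, written in Python/pandas rather than Stata (the data frame and column names are hypothetical). The point is only that the weighted mean X for a question equals the arithmetic mean of that question's responses, and the composite index is the average of those per-question means.

```python
import pandas as pd

# Hypothetical responses: one row per respondent, one column per risk question (1-5).
df = pd.DataFrame({
    "q1": [5, 4, 3, 5, 2],
    "q2": [4, 4, 5, 3, 3],
    "q3": [2, 5, 4, 4, 5],
})

# Weighted mean per question, X = sum(k * F(k)) / N, which is simply the
# arithmetic mean of that question's responses.
question_means = df.mean(axis=0)

# Composite index for the (sub)population: the sum of the weighted means divided
# by the number of questions, i.e. the average of the per-question means.
risk_index = question_means.mean()
print(question_means)
print("Composite risk index:", risk_index)
```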
Hello!
I am validating an adapted tool which supposedly has 4 factors. The EFA agrees with this, but the CFA shows bad fit indices (to make it work I need to free 3 items). Why is this happening? Is it correct to free this many items (the scale has 14 items)?
I came across different definitions of low, moderate and high growth, but I have not been able to find a reliable reference to cite where these definitions are presented.
Some suggest adding up the total score and define scores below 45 as none to low and scores above 46 as moderate to high. Others suggest summing all the scores and calculating a mean score, which they then group as 1-3, 3-4, and 4-6 for low, moderate, and high, respectively.
Any references would be highly appreciated.
Have you worked with standard scales/measures/instruments and have modified them in any way?
What modifications are acceptable to standard scales?
What steps should be taken to ensure these modifications are sound:
- dropping scale item(s)
- changing the Likert response scale: adding anchors, changing what the anchors read, etc.
- splitting double-barrelled items into two
- changing the order of the items.
Dear Research Community,
I am asking for your participation and especially for your feedback on our Self-Assessment for Digital Transformation Leaders: https://t1p.de/mwod
The goal is to provide leaders with a mirror for reflecting on themselves and on the skills and personal attributes required for digital transformation. In the end, participants receive an integrated presentation of their results (see appendix).
- Are all questions understandable?
- Which questions lack precision?
- In your opinion (as a digital leader), are essential aspects still missing? If so, which ones?
I am looking forward to any kind of suggestions.
Best regards
Alexander Kwiatkowski
Hello,
Is anyone interested in helping with our analysis? We are working on a project relating to personality and friendship. Leave your email address if you are interested.
Regards,
I doubt it is halfway between low and high arousal, or halfway between negative and positive valence. Does anyone know where listeners rate emotionally "neutral", conversational speech?
I'm doing a split-half estimation on the following data:
trial one: mean = 5.12 (SD = 5.76)
trial two: mean = 7.62 (SD = 8.5)
trial three: mean = 8.57 (SD = 12.66)
trial four: mean = 8.11 (SD = 10.7)
(SD = standard deviation)
I'm creating two subset scores (one from trials one & two, and one from trials three & four; I realise this is not the usual odd/even split):
Subset 1 (t1 & t2): mean = 12.73 (SD = 11.47)
Subset 2 (t3 & 4): mean = 16.68 (SD= 17.92)
I'm then computing a correlation between these two subsets, after which I'm computing the reliability of this correlation using the Spearman-Brown formulation.
However, in the literature I've found, it all suggests that the data must meet a number of assumptions, specifically that the mean and variance of the subsets (and possibly the items of these subsets) must all be equivalent.
As one source states:
“the adequacy of the split-half approach once again rests on the assumption that the two halves are parallel tests. That is, the halves must have equal true scores and equal error variance. As we have discussed, if the assumptions of classical test theory and parallel tests are all true, then the two halves should have equal means and equal variances.”
Excerpt From: R. Michael Furr. “Psychometrics”. Apple Books.
My question is, must variances and means be equal for a split-half estimate of reliability? If so, how can equality be tested? And is there a guide to how similar the means may be (surely the means and variances across subsets cannot be expected to be exactly equal)?
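For reference, a minimal sketch of the split-half procedure described above, with hypothetical subset scores: correlate the two halves, then apply the Spearman-Brown prophecy formula, reliability = 2r / (1 + r).

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical subset scores (one value per participant), built from
# trials 1+2 and trials 3+4 as described above.
subset1 = np.array([12.0, 5.0, 20.0, 9.0, 15.0, 31.0, 2.0, 11.0])
subset2 = np.array([14.0, 8.0, 25.0, 7.0, 22.0, 40.0, 3.0, 14.0])

# Correlation between the two halves.
r_half, _ = pearsonr(subset1, subset2)

# Spearman-Brown correction: estimated reliability of the full-length test.
reliability = 2 * r_half / (1 + r_half)
print(f"half-test r = {r_half:.3f}, Spearman-Brown reliability = {reliability:.3f}")
```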
I am looking for a brief and widely accepted measure for self-esteem. I would like to be able to determine intrinsic self-esteem forces rather than extrinsic benchmarks for a "successful" person with this measure.
Hello, I have a questionnaire that consists of four sections, with each section focusing on different variables.
First, each section has 9-10 items, and the items follow different response scales. For instance, the first section has 10 items with no Likert scale, and the participants have to choose from two, three, or more specific options. The second section has 9 items; the first five items use a six-point Likert scale, while for the remaining items the respondents have to choose from four specific options. The third section has 10 items, each on a six-point Likert scale. The fourth section has 9 items with no Likert scale, and the participants have to choose from three, four, or more specific options.
Second, in some of the items the respondents were also allowed to select multiple answers for the same item.
Now my question is, how do I calculate Cronbach's alpha for this questionnaire? If we cannot calculate Cronbach's alpha, what are the alternatives for establishing the reliability and internal consistency of the questionnaire?
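Because Cronbach's alpha assumes a set of items scored on comparable numeric scales and intended to measure the same construct, it is usually computed per section rather than across mixed response formats. A minimal sketch of the computation itself, with hypothetical data for one homogeneous section:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a respondents-by-items matrix of numeric scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the total score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical responses for one section with a consistent six-point format.
section3 = np.array([
    [5, 4, 5, 4, 5, 4, 5, 5, 4, 5],
    [2, 3, 2, 3, 2, 2, 3, 2, 3, 2],
    [4, 4, 5, 4, 4, 5, 4, 4, 5, 4],
    [3, 2, 3, 3, 2, 3, 3, 2, 3, 3],
])
print("alpha for section 3:", cronbach_alpha(section3))
```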
We can say that these factor-analytic approaches are generally used for two main purposes:
1) a more purely psychometric approach in which the objectives tend to verify the plausibility of a specific measurement model; and,
2) with a more ambitious aim in speculative terms, applied to represent the functioning of a psychological construct or domain, which is supposed to be reflected in the measurement model.
What do you think of these general uses?
The opinion given by Wes Bonifay and colleagues can be useful for the present discussion:
Hello everyone,
I was wondering if I should calculate the ceiling and flooring effect for each item separately or just calculate it for the total score of my questionnaire?
Thanks in advance,
Sara
What if the Cronbach's alpha of a scale (4 items) measuring a control variable is between .40 and .50 in your research? However, the scale is the same scale used in previous research, in which it received a Cronbach's alpha of .73.
Do you have to make some adjustments to the scale or can you use this scale because previous research showed it is reliable?
What do you think?
I am using the technical manual and am unsure whether I am doing this correctly, or whether it is even possible; I'm trying to calculate a linear T-score for a scale that is not part of the test. The CRIN scale is on the MMPI-A-RF but not the MMPI-2-RF. I am doing a project that could really benefit from having the same linear T-scores for CRIN along with what I have in my dataset for the other validity scales. Please and thank you.
I am working on developing a new scale. On running the EFA, only one factor emerged clearly, while the other two factors were messy, with multiple items loading from the different factors.
1. Can I remove the cross-loading items one by one, re-running the analysis, to reach a better factor structure?
2. If multiple items still load on one factor, what criteria should I use to determine what this factor is?
I am trying to obtain permission to use the Muller/McCloskey scale on job satisfaction among nurses, and in the meantime would love to see how it is scored.
I found references for factor loadings, correlations of the subscales, etc., but need to look at the scoring, which I think builds up to a continuous variable.
Thanks in advance to anybody who can help.
How many (what percentage) 'unengaged' respondents would you allow in your respondent database, and what criteria would you employ in order to make a decision to eliminate or keep them?
With respect to Likert scales, there are times, especially in lengthy self-reported questionnaires, when respondents provide 'linear' answers, i.e., they give, constantly, the same rating to all questions, regardless if the questions are reversed or not.
Some call this type of respondent 'unengaged'. This is rather a 'friendly' term, since there can also be malicious intent in responding, but that is another discussion. However we label them, the first direct effect on the data is reduced variability.
There may be other effects as well (feel free to list them based on your experience or knowledge), which can affect the relations between the constructs and the final inferences.
Thus, how do you proceed in these situations and what criteria do you base your decision on (supporting arguments are welcome and references to written texts are especially appreciated)?
(Edit): I realised that the original question may induce some confusion. This is not a case that requires substitution (although, in general terms, it may be). Please consider that all cases are complete (all respondents answered each item). The problem lies within the pattern of responses. Some respond in a straight line, hence the name 'straightlined' answers; others respond in various patterns (zig-zagging, for instance), hence the name 'patterned'. While for scales that include reversed items some cases (for instance, 'straightlined' ones) can be easily spotted, for scales without reversed items this is harder to do. However, the question pertains to both situations (scales with and without reversed items).
Another particularity of the question is that I am less interested in "how to identify" these cases (methods) and much more in "what to do"/"how to deal" with them, i.e., what criteria, rules of thumb, or best practices to consider.
Responses including references to academic papers or discussions are much appreciated!
Two outcome measures: the Numeric Pain Rating Scale and the Pain Self-Efficacy Scale are both in the public domain; however, I need written confirmation of this.
Thanks for your help.
Teina
The MID is based on the standard deviation; consequently, I should not use it with a non-normal distribution. Am I right? If so, what may replace the MID?
I've recently come across several published articles where questionnaires composed of more than one section (where every section is meant to gather data for separate variables/dimensions of the study) are applied to a sample in order to calculate each section's coefficients AND the coefficient of the entire questionnaire.
What could be the use of a coefficient calculated for an intentionally multi-dimensional instrument?
J.
Dear Colleagues,
I am working on a new questionnaire and I would like to verify measurement invariance. I have conducted a multi-group CFA and am now verifying metric invariance. I have computed factor loadings for each item in both groups (women and men), but I have not found any criteria for assessing whether these factor loadings are appropriately similar in both groups.
I have not found any strict criteria for the acceptable level of difference between factor loadings, either for each item or (on average) for the whole scale (or the whole questionnaire). Some items have similar factor loadings, but in a few cases they differ. It also happens that within a scale one item has a higher factor loading for men and another for women.
Do you know of any strict criteria for assessing metric invariance?
Kind regards,
Kamil Janowicz
I am using three questionnaires
1) has 6 items on a 6-point Likert scale (professional self-doubt, a subscale of the Development of Psychotherapists Common Core Questionnaire)
2) has 10 items on a 6-point Likert scale (developmental experience, a subscale of the Development of Psychotherapists Common Core Questionnaire)
3) has 14 items on a 5-point Likert scale (Warwick-Edinburgh Mental Wellbeing Scale)
These are already established scales
My dissertation supervisor has advised me that I needed to calculate Cronbach's alpha before distributing my questionnaires, but I was unsure how to do this. (I have contacted her about this, but she is currently on annual leave.)
I ended up distributing the questionnaires, as I had already received ethical approval and was allowed to start. I calculated Cronbach's alpha using the responses of 64 participants.
Question 1: Is this enough participants?
Question 2: How could I have calculated Cronbach's alpha before the questionnaires were sent out?
Question 3: I have recruited 107 participants; is this enough? (I am using a cross-sectional, quantitative design and multiple regression analysis.)
Thank you for taking the time to read this, I apologise in advance if any of this does not make sense.
Hi all,
I am designing a student profiling system (in university-level technical education) in which student competencies are mapped and used for further intervention. I want to test achievement motivation, study habits, engineering aptitude and English proficiency. I have selected the following tests. Are these good enough? Which one can I use for English proficiency?
1. Study Habit Inventory by M.N. Palsane and S. Sharma
2. Engineering Aptitude Test Battery by Swarn Pratap
3. English Language Proficiency Test by K S Misra & Ruchi Dubey
Hi all! I'm conducting a study where I have designed a path model in which most latent variables have polytomous indicators (ordinal data). However, one instrument I intend to use measures a construct with a mix of dichotomous and polytomous items. My question is: can I somehow model a single latent variable with both of these types of data? I've seen models where a factor has either dichotomous or polytomous indicators, but not both types together.
(I planned on using an estimation method that performs better for these types of data, such as WLSMV.)
I'm an undergraduate student who has a course in test construction as well as research. The construct we are studying is mental toughness, and we plan on using the MTQ-48 to assess it. I am aware that there is a technical manual for the MTQ-48 available online; unfortunately, it does not contain a scoring procedure and does not have information about which items belong to which subscale. Using other references, what we got is that the MTQ-48 is scored on a 5-point Likert scale, with 1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, 5 = strongly agree. We are also aware of what items fall under each subscale, such as items 4, 6, 14, 23, 30, 40, 44, and 48, which fall under the challenge subscale. However, we were not able to find a reference stating which items are negatively scored. While we could make an educated guess, it is required that we have a source for the scoring procedure. If anyone here has such a reference, or knows the items, it would be highly appreciated. Thanks!
And what would the conceptual diagram look like with direct, indirect, and total effect pathways?
If the original questionnaire is validated, then a linguistic validation is sufficient.
If you think statistical revalidation of a translated version is mandatory:
- So can you give an example where a statistical validation of a translated version questionnaire was not successful?
- Can you explain how statistical validation may change the strategy of the whole process?
I'm currently designing a questionnaire for measuring a variable. There are 5 questions related to the variable; 4 of them use a 5-point and 1 of them a 6-point Likert scale (based on the literature review). Will it be a problem for further analysis of this variable? For your information, this variable will be a mediating one in the main model.
How should I interpret Neutral responses when using a Likert scale, given that I use the mode? Any help is appreciated.
I'm trying to specify McDonald's Omega. I'm using MPlus (based on FAQ Guide for Reability - https://www.statmodel.com/download/Omega%20coefficient%20in%20Mplus.pdf).
It works perfectly for regular models, but when I try to use it with factor-analysis random-intercept models, I get omegas higher than 1.00 for some factors.
That’s my model (not the syntax, only a brief approach to the example):
f1 by item1-item5
f2 by item6-item10
fRI by item1-item10
fRI with f1-f2@0
Based on the Mplus FAQ, omega is calculated as (sum of loadings)^2 / ((sum of loadings)^2 + sum of residual variances), right?
So I wonder whether it is 'OK' to have an omega higher than 1.00, and what it would mean.
Also, there's an example of my Omega’s calcs (again, not my syntax, just a brief exercise of what I’ve been trying to do):
OmegaFactor1 = (sum of loadings, items 1-5)^2 / ((sum of loadings, items 1-5)^2 + sum of residual variances, items 1-5)
OmegaFactor2 = (sum of loadings, items 6-10)^2 / ((sum of loadings, items 6-10)^2 + sum of residual variances, items 6-10)
OmegaRandomIntercept = (sum of loadings, items 1-10)^2 / ((sum of loadings, items 1-10)^2 + sum of residual variances, items 1-10)
I'm not sure if that approach is correct (or even whether omega should be used in this case), but I have an intuition that it's happening because I'm failing to specify the residual variances for Factor 1 and Factor 2 in my model constraint, since the random-intercept factor is 'interacting' with them.
Could anyone give me a tip about that theme?
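A minimal sketch of the omega formula quoted from the Mplus FAQ above, with hypothetical loadings and residual variances (the function and values are illustrative, not the poster's model). Note that with non-negative residual variances this formula cannot exceed 1.00, so a value above 1.00 usually means the summed residual-variance term is negative or that some variance component for the items (e.g. the variance carried by the other factor in a random-intercept/bifactor-style model) has been left out of the denominator.

```python
import numpy as np

def mcdonald_omega(loadings, residual_variances):
    """Omega = (sum of loadings)^2 / ((sum of loadings)^2 + sum of residual variances)."""
    common = np.sum(loadings) ** 2
    return common / (common + np.sum(residual_variances))

# Hypothetical unstandardized loadings and residual variances for items 1-5 on f1.
loadings_f1 = [0.70, 0.60, 0.80, 0.50, 0.65]
resvar_f1 = [0.51, 0.64, 0.36, 0.75, 0.58]
print("omega f1:", mcdonald_omega(loadings_f1, resvar_f1))
```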
There is software available for item response theory, but it is very hard for me to understand how these programs work. Can anyone provide information on this?
Thank you
I'm looking for an intelligence test to examine verbal and non-verbal intelligence of children (like WISC or K-BIT) that can be administered in a group setting. My main interest is not to determine a precise IQ, but to obtain a reliable measure of cognitive functioning to include as control variable and compare different groups of school-aged children. K-BIT test would be ideal, since it does not take too much time and includes two vocabulary and one matrices subtest, but I don't know if it can be simultaneously administered to a group of children.
Any idea?
Thank you so much for your attention!
The effect size in statistics reflects the magnitude of a measured phenomenon. It may be expressed as a coefficient of determination (R2), which states how much variance can be explained by the controlled parameter.
If a direct and simple relationship between the variables is claimed (eg. concentration of a measurand vs. absorbance in determination of calibration plot for ELISA), the effect size may be close to 100%. Similarly, in pharmacology, the effect size of relationship between direct phenomena may significantly exceed 50%.
In social sciences, however, where the links between the observed phenomena are not so simple and largely multifactorial, I suspect, the effect sizes are much smaller.
How large must an effect size (expressed as R2) in the social sciences be to be considered relatively "large"? Any examples? Any references?
Thank you for your help!
Respected fellows,
I have recently started a deep dive into the gamification of psychological concepts to help me understand it, like Pymetrics, Cognifit, Lumosity, etc., and I am having difficulty finding literature that might help me in adapting these concepts into games. Can someone please help me find any literature regarding this, or advise how to proceed further?
Regards,
Azka Safdar
For testing national culture theory, for example, there are four dimensions, each with a set of related items. From the sample we get, I would think it better to analyze each dimension for its own reliability, KMO, and Bartlett's test, and not overall (all four dimensions together), since it is clearly a multidimensional instrument and not unidimensional. Please advise?
Does anyone know what are the 10 response bias items in the Narcissistic Injury Scale (Slyter 1991)? I have found 50 items for this scale, 10 of them are for response set bias, 2 items are dropped by the researchers. So only 38 items need scoring, but I don't know which 38 items out of those 50.
Thank you for your help in advance.
Dear Researchers,
I'm designing an interdisciplinary study (with public health, statistics, psychology, etc.) on diseases, stigma, discrimination, mental health and quality of life.
So I'm a bit confused about what the ultimate construct for human life should be; I mean, what must be there for a human being?
Quality of life is sometimes seen as secondary; primarily, people should have a moving life.
The whole set of United Nations (Sustainable Development) Goals talks about society and people's wellbeing.
But what is the single most important factor for humans?
Please share your thoughts.
Best Regards,
Abhay
There were ten questions in a pre-intervention test given to 40 participants. A correct answer was marked as 1 and a wrong one as 0. The content validity was tested by a subject expert. Can anyone suggest ways to improve the value? The output is attached herewith as a Word document. Thank you.
Can somebody direct me towards some good readings on the subject?
Thank you
Recently I've been reviewing how to handle social desirability in testing.
After much theoretical review, I have come to the conclusion that the best way to do this is to neutralize the items with respect to social desirability. The format I will use is a Likert scale.
For example, an item that says "I fight with my coworkers" would be transformed into "sometimes I react strongly to my coworkers" (the second is somewhat more neutral).
The idea comes from the work done by Professor Martin Bäckström.
Now the question I have is: is there any methodology that can help make this neutralization?
If not, what would be good ways to achieve it? What elements should I consider?
I think a good idea might be to "depersonalize" the item. For example, instead of "I fight with my bosses," the item would become "I think an employee has the right to fight with his or her boss".
Another option I've thought of is to modify the frequency. For example, instead of "I get angry easily," I'd use "Sometimes I get angry easily."
However, I do not know if these options would affect the validity of the item to measure the construct.
Thank you so much for the help.
Hello everyone,
Currently, I am developing longitudinal research applying psychometric questionnaires to the same sample at two time points. However, I am looking for literature or evidence about which is the stronger statistical test, the paired-samples t-test or the Wilcoxon signed-rank test, and about the assumptions behind the application of each test.
Is there any possible way?
I understand that if the options point to the same trait, it can be done. For example, a question of the type:
I work better:
(a) individually
(b) with other persons
Either of the two options is valid for the person (helping to avoid bias), and, for example, if I'm measuring the trait of teamwork, I may think that a person who selects option (b) will have a higher degree of the teamwork trait. Am I making a mistake in assuming this?
Now, is there any way to do this when the response options point to different traits? I want to be able, based on the data from forced-choice items, to carry out normative analysis (to be able to compare with other subjects).
PS: I'm clear that with ipsative items you can't make comparisons between people; however, if you handle the scoring in a different way, could you do it somehow?
I've been working on common psychometric tests. In other words, the traits to be measured are selected, some key behaviors are selected, and subjects must respond on a Likert scale. Subsequent to this I look at validity and reliability, take a normative group, and calculate the scores (working with classical test theory), etc.
But now I want to do something different: I want to do a test in which they have multiple response options and all of them can be correct. For example, an item could be of the type:
Choose the phrase with which you are most identified (only one).
a) I am respectful
b) I am honest
c) I am a good worker
I have seen several of these tests but would like to know how to calculate their reliability, validity, and score tables.
Many thanks
I understand that at least three first-order factors are needed for identifying a second-order, hierarchical CFA model. I also read in Reise et al. (2010) that at least three group factors are suggested for bifactor EFA. Does the same guideline apply to bifactor CFA? I've seen many influential papers that use bifactor CFA with only two group factors, but I want to make sure this is a correct decision. References are always very appreciated.
Can we model stress? Just curious about it.
(Keeping in mind that this is the era of artificial intelligence, machine learning, data science and so on...)
Can we have a predictive model of stress and behaviour as KPIs?
Also, philosophically, there must be paths between stress, behaviour, emotions, intelligence, etc. Can we also test these paths and find their coefficients?
Regards,
Abhay
The items could be survey or behavioral performance ratings scaled according to an interval or ordinal format.
For an assignment I need to imagine designing a scale to measure hypomanic symptoms, and write about how this would be done.
Seeing as hypomanic episodes may be present for various periods of time (a few days, a few weeks etc.) is it possible to measure test-retest reliability?
Thanks
I am conducting a research project in which I am using an SEM model. My exogenous variable (world system position) is ordinal with 4 categories. I am not sure how creating so many dummy variables will work in an SEM model, thus I would like to treat it as a continuous variable. But I am not sure if I will be violating any statistical assumption by doing this. Can somebody help me with a suggestion on this?
If you have run both EFA and CFA, should you calculate discriminant and convergent validity on the basis of the EFA or the CFA factor loadings? Please guide me. I would appreciate it if you could provide any references.
I am testing measurement invariance of a questionnaire. I am aware of the fact that some methodologists think it unnecessary to test for invariance of residual variances and covariances. However, the measure I am working with has a highly replicated correlated errors issue, and I would like to test its invariance since possible noninvariance could affect the measure's reliability for some groups. What's the correct sequence I should follow? Should correlated errors invariance be tested before or after intercept invariance? Intuitively, I think it would make sense to test it together or immediately before residual variances invariance, since both are usually considered under the label 'strict invariance'. However, the Mplus tutorial in Byrne's book suggests testing correlated residuals invariance after factor loadings invariance and before intercept invariance, and it also makes sense to me. Readable references would be very appreciated.
In my study using the Student Adaptation to College Questionnaire (N = 104, 0 missing values) with a 9-point Likert scale, reliability is very low across all 4 subscales, especially the Personal-Emotional and Social Adjustment factors, in SPSS 23. But I checked what happens if the questions are not reverse coded (in SPSS), and that surprisingly produced a highly reliable output, although it is meaningless without reverse coding the questions. I have tried different methods of reverse coding, but the issue remains.
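For reference, a minimal sketch of reverse scoring on a 1-9 response scale; the usual rule is new = (max + min) - old, which is what an SPSS RECODE of the negatively worded items accomplishes. The data frame and item names below are hypothetical.

```python
import pandas as pd

# Hypothetical items on a 1-9 scale; the 'r' columns are negatively worded.
df = pd.DataFrame({
    "item1":  [9, 7, 8, 2],
    "item2r": [1, 3, 2, 8],   # negatively worded item
    "item3r": [2, 2, 1, 9],   # negatively worded item
})

reverse_items = ["item2r", "item3r"]
scale_max, scale_min = 9, 1

# Reverse-score: new = (max + min) - old, so 1 becomes 9, 2 becomes 8, etc.
df[reverse_items] = (scale_max + scale_min) - df[reverse_items]
print(df)
```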
My team and I are implementing a project on assessing cyberbullying perceptions. We have decided to use FRS in order for respondents to answer our 8 situational items.
Could you suggest some practical applications of FRS based on sinusoidal functions?
Thank you very much, Dana
Hi there!
I've just done Cronbach's Alpha for my UTAUT2 technology acceptance model.
The results are good. One construct is at 0.78, and the rest are > 0.84.
However, SPSS indicates that, with the deletion of some items, I can make some minor improvements (e.g. a jump from 0.87 to 0.89 for a scale).
Should I purify my scale simply as an attempt to increase the Cronbach's Alpha for each one? Or will this cause other problems?
Could you please provide a reference which suggests best practice in this scenario?
Thank you!
Sam
Hi,
I'd really appreciate it if someone could help explain to me (in simple terms - I'm not a statistician!) why the factor loadings of items in an exploratory factor analysis (principal axis factoring, using direct oblimin) differ from the factor loadings of a confirmatory factor analysis when both are carried out on the same sample? (I'm aware that normally you would perform a CFA in a different sample, but that isn't my focus for this question.)
There isn't a massive difference in factor loadings, and from what I've read and seen in other papers this is normal, but I'm just trying to understand why? What does the CFA do that results in a different 'loading' of the items?
Any help would be appreciated :)
I've been assessing the reliability of constructs using both Cronbach's alpha and composite reliability (CR) scores. But I found that Cronbach's alpha is higher than the CR for the same construct; however, I was expecting Cronbach's alpha to be a lower estimate than composite reliability. I wonder which one is more suitable for checking internal consistency.
I have a questionnaire data set which has seven categories or dimensions and 13 items across all the dimensions put together. The items, however, are distributed unequally across the categories: in some there are as many as 4 items and in some just 1. I would like to know how to combine these into seven variables.
I have just 41 observations altogether... I feel the low sample size rules out traditional methods such as factor analysis...
Would like to hear some ideas.
I measured the severity of farmers' risk perception items/questions using a Likert scale (1 to 5). But there are some items/questions that are not relevant to some respondents; this is real in practice. In this case, I suggest including a Not Applicable (NA) response option apart from the 5-point scale. I think this NA response is not a missing value; rather, it is missing by design. In factor analysis or any other parametric or non-parametric analysis, what would be the best approach to treat the NA response? Is it better to omit NA responses? Or to replace NA with zero? Or to treat NA like a missing value and handle it using any of the methods for treating missing values? What else? I believe we need a methodology that addresses the NA response to avoid biased estimates and wrong inferences.
Hi,
I have data from 200 patients. I want to check the association between cognitive impairment and the diminution of quality of life. My variables are mostly categorical, such as sex, age, and cancer type. I also have numerical variables, such as cognitive test scores and quality of life survey scores (Likert type). Can anybody tell me if a linear regression could be a good start, or some other analysis? What other analyses could be suitable for this kind of data?
Thank you very much as I’m beginning with this discipline,
Dear all,
my dependent and independent variables are measured on 7-point Likert scales and measure perceptions of effective leadership; therefore I assume my data are of an ordinal nature.
The independent variables are six constructs with 5 questionnaire items each, which measure cultural values.
The dependent variables are 2 constructs with 19 items each, which measure the perception of effective leader attributes.
So I hypothesize that each of the culture dimensions is associated with perceived effective leader attributes. I have collected the data and intend to do the statistical analysis as follows:
1. Reliability Analysis - Cronbach's Alpha
2. Factor Analysis
But then I'm torn between ANCOVA and Spearman's rho to test the association. I understand that ANCOVA is used for interval data and Spearman for ordinal data.
Could you please advise on an appropriate method?
Many thanks,
Ahmad
Hello,
I have seen many questions of this kind and have tried the suggestions on my data, but no solution seems to address my problem.
I have 3 instruments of 13, 11 and 13 statements, respectively. I scored them on a 1-5 scale. After recoding, most answers (80%) score around 4-5 (N = 100) and seem quite similar. My Cronbach's alphas, however, are around .229 to .430. I am still surprised because when I combine all three instruments the Cronbach's alpha comes out at .642; however, for each individual instrument it is lower than the required limit.
Any help will be highly appreciated. Thanks
I am running a validation study in which I compare two measures of the same process. One variable is continuous, the other is categorical (5 increasing categories). I want to assess the agreement between the two measures, but am unsure what method to use... Can anyone help?
I had planned to complete a multiple regression but my SPSS knowledge is poor and I am struggling.
Example: five questions measuring "Perceived Usefulness" (dependent variable) in which respondents have to select one of the following:
1= Strongly Disagree
2=Disagree
3=Neutral
4=Agree
5=Strongly Agree
6= Not Applicable
Hi,
Does anyone know / have a reference for what the standardised factor loadings (highlighted in the attached) should be when performing confirmatory factor analysis? Is it the same as the rule of thumb for factor loadings when performing an exploratory factor analysis (> .4)?
Thanks,
Emma.
I want to find the correlation between a psychological well-being scale (5-point Likert type) and a mental health battery (2-point scale). How can I do this?
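One common route, sketched below with hypothetical scores: correlate the two total scores with Pearson's r (or Spearman's rho if you prefer to treat them as ordinal); if the battery instead yields a single dichotomous 0/1 outcome, the point-biserial correlation applies, which is simply Pearson's r with a binary variable.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr, pointbiserialr

# Hypothetical totals: well-being scale (sum of 5-point Likert items)
# and mental health battery (sum of dichotomously scored 0/1 items).
wellbeing = np.array([22, 30, 18, 27, 35, 25, 29, 16])
mh_total = np.array([3, 8, 2, 6, 9, 5, 7, 1])

print(pearsonr(wellbeing, mh_total))    # both treated as roughly continuous totals
print(spearmanr(wellbeing, mh_total))   # both treated as ordinal

# If the battery yields a single dichotomous (0/1) score, use the point-biserial r.
mh_binary = np.array([0, 1, 0, 1, 1, 1, 1, 0])
print(pointbiserialr(mh_binary, wellbeing))
```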
Can any researcher send me a copy of the 12-item Procrastination Assessment Scale for Students questionnaire and its scoring manual or instructions?
What is your suggestion to obtain reliability for Visual Analogue Scale?
Note: there are no classes with lower and upper limits, as there are in a Likert scale.
Suppose that the phenomenon we study occurs in about 3% of the population and we made a two-item screening instrument (with dichotomous items). If we think in terms of Rasch/IRT, what is the optimal item difficulty for these two items? And if it is contingent on their covariance, how can we test it?
Good afternoon,
I do have a question about my thesis project. I am estimating the predictive validity of an assessment; however, I am a bit confused. The assessment consists of major factors, which in turn consist of smaller dimensions, which in turn consist of smaller constructs. I need to formulate hypotheses, but I do not know which level of the assessment they should involve: major factors, dimensions, or smaller constructs?
Thank you!
I have translated the Ryff Psychological Well-being scale into Vietnamese and collected 800 responses from university students. The main construct is psychological well-being, which is comprised of 6 distinct dimensions. I used the 54-item scale with 9 items per scale (9 times 6 = 54).
The items appear to be formative and not reflective; thus I am told that an EFA is not appropriate for formative items.
I want/expect to confirm that the Vietnamese version also shows the 6 dimensions found by previous research. So how should I proceed? I have not found a clear standardized approach to handle this question. The videos and writings I have read do not seem to agree on the appropriate method to take.
Anyone who has done this kind of work I would love to hear from you. Thanks.
The Likert scale items are intended to answer the question of whether communication between hospital staff is effective or not.
When a dataset is being analyzed by applying a rubric, without having two coders,
1. what is the best procedure to determine the reliability of the rubric application? (repeated administration of the rubric to the same data set?)
2. Is there an analog to (for example) Cohen's Kappa for computing rater-reliability with just one coder?
Sample size < 50
No. of Items = 38
The KMO and Bartlett's test output is not appearing.
I have a self-designed 27-item scale that assesses a particular experience. I have run an exploratory factor analysis with a promax rotation to identify potential subscales (I have good reason to believe that the factors are not orthogonal). I end up with a four-factor solution. However, no matter which loading I use as my cutoff (I like to use .5, but I toyed with .3 to .7), I end up with several items that load on more than one factor. Using a loading of .5, I have subscales that make sense but, again, some items (about 8-9 of the 27) are used more than once. The reason why it makes sense is that one specific type of experience (one subscale) can also be a part of a different type of experience (another subscale). On the other hand, adding up all of the items to make a single score yields a score with good internal consistency (Cronbach's alpha = .77).
My question is this: Is it acceptable to present a scale with items that are used in multiple subscales, or is it better to simply call it one overall trait?
dichotomous data from questionnaire
So I have a data set where the researchers opted to provide a "not apply" answer option for items that are not related in any way to the subject's activity (the activity being nonexistent); it represents an answer level different and distinct from the minimum trait level. The other answer levels are on a 6-level Likert scale, in order to avoid a middle term.
So... usually I treat "not apply" as missing, but that's prohibitive for some analyses, as they mostly demand complete cases. Winsteps at least considers response patterns, so missing data doesn't matter up to a point. Still, if I intended to go into R and run a GPCM or other analysis, how should I treat those answers?
I have a single-item question, "How often has the statement 'you worried about cancer coming back' been true for you in the past four months?", with 7 possible responses: never, seldom, sometimes, about as often as not, frequently, very often and always. In order to analyse it, I need to convert it into 3 categories (Low fear, Moderate fear, High fear). However, I'm not sure which responses to include in each category (i.e. should never, seldom and sometimes be low fear; about as often as not and frequently moderate fear; and very often and always high fear?). I tried looking in the literature to see how other studies have categorized it, as I believe I need some evidence to support this decision, but none of the articles I read clarify how they stratified the groups. Do any of you have any suggestions as to how I should approach this, or do you know where I can find guidance on how to make this decision?
sample size: 1,056
Thank you very much in advance!
Gabi
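Whatever grouping is chosen (and the cut-points should be justified substantively or from prior literature rather than from the sketch below), the recoding itself is mechanical. A minimal sketch with hypothetical responses and one assumed, non-validated grouping:

```python
import pandas as pd

# Hypothetical responses to the single fear-of-recurrence item.
responses = pd.Series([
    "never", "sometimes", "frequently", "always", "seldom",
    "about as often as not", "very often",
])

# One possible (assumed, not validated) grouping into three categories.
recode_map = {
    "never": "Low fear", "seldom": "Low fear", "sometimes": "Low fear",
    "about as often as not": "Moderate fear", "frequently": "Moderate fear",
    "very often": "High fear", "always": "High fear",
}
fear_category = responses.map(recode_map)
print(fear_category.value_counts())
```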
Does anyone have a copy of the revised Self-Perception Profile Questionnaire? With the scoring information? This questionnaire was developed as part of the following study:
Kalmet, N., & Fouladi, R. T. (2008). A comparison of physical self-perception profile questionnaire formats: structured alternative and ordered response scale formats. Measurement in Physical Education and Exercise Science, 12, 88-112.
Currently I am looking at subjective measurements of stress for my university project and have recognised that the DSSQ would be ideal for my particular situation. However, I cannot seem to find any literature on how to use this tool. Furthermore, I am unable to locate/access the original paper on the DSSQ which I believe is called 'Validation of a comprehensive stress state questionnaire: Towards a state big three?' by Matthews, G., Joyner, L., Gilliland, K., Huggins, J., & Falconer, S. (1999). Any advice on how I would go about applying this tool or links to appropriate papers would be much appreciated. Thanks in advance.
According to the literature, I need a sample size of between 50 and 100 to run a factor analysis (depending on the number of items). I also want to test the reliability of the scale, for which approximately 50 would be fine. However, my question is: could I simply get one sample of 100 to complete my questionnaire and run both tests on the collected data, or do I have to recruit two separate samples?
I have a questionnaire with 18 questions, and each question has seven items that should be evaluated on a scale between 1 and 5 (Likert items). How should I assess the internal consistency of the questionnaire?
There is an increase in the use of bifactor modeling to test for general factors in multidimensional scales. However, its use seems to make sense only when there are high correlations between first-order factors. How large do you think these correlations should be in order to make bifactor modeling a reasonable approach?
Is anyone aware of a resource (e.g. website, journal article, book chapter) in which the various different neuropsychological tests are linked to specific cognitive functions?
For example, I know that the Stroop test reflects an individual's selective attention capacity and processing speed, whereas the Wechsler Memory Scale is a measure of five different types of memory. Is there anywhere I can learn about the most appropriate tests for each cognitive function? And, furthermore, is there any consensus on the group of cognitive functions which comprise an individual's overall cognitive capacity and functioning?
Thanks in advance for any responses
Experience with the ZP score.
I intend to carry out Spatial Navigation evaluations in the elderly with low level of education.
Thank you!
During the process of translating a questionnaire from its original language - after accurate translation - you need to validate it in a target sample in order to confirm a similar structure in factor analysis as well as to test Cronbach's alphas. In my study I obtain a slightly different structure for the Polish questionnaire than the structure of the original English one. Is it justified to change the subscale that an item originally belonged to? Is it justified to remove it completely due to redundancy (because it decreases Cronbach's alpha)?
Thank you.
I'm trying to measure the extent of agreement/disagreement with a number of statements for an attitude questionnaire. I managed to deal with the data from the Likert-scale questions, but I can't find a way to enter and analyze the VAS data in SPSS (the scale is a continuous 10 cm line on which participants marked their answers, ranging from 0 = totally disagree to 10 = totally agree).
Hi all,
I am very much interested in how to convert ordinal scale data into interval scale? What are your suggestions?
Let's say I have data measured with a 4-point Likert scale (1 = strongly disagree to 4 = strongly agree). While it is clearly ordinal, what would be the best option for converting these data into an interval scale?
Method of Successive Interval (MSI)?
The Rasch model?
Your comments/suggestions are highly appreciated
Best regards,
Davit
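A rough sketch of one of the options mentioned above, the method of successive intervals, for a single item: each ordinal category is rescaled to the mean of a standard normal variable over the interval implied by the category's cumulative proportion. The function and data are illustrative assumptions, not a validated implementation; a Rasch model would be a different route entirely.

```python
import numpy as np
from scipy.stats import norm

def successive_interval_values(responses, categories=(1, 2, 3, 4)):
    """Method-of-successive-intervals sketch for one item.
    Assumes every category is observed at least once."""
    responses = np.asarray(responses)
    props = np.array([(responses == c).mean() for c in categories])
    cum = np.concatenate(([0.0], np.cumsum(props)))            # cumulative proportions
    z = norm.ppf(np.clip(cum, 1e-6, 1 - 1e-6))                 # interval boundaries on z scale
    # Mean of a standard normal truncated to (a, b):
    # E[Z | a < Z < b] = (pdf(a) - pdf(b)) / (cdf(b) - cdf(a))
    values = (norm.pdf(z[:-1]) - norm.pdf(z[1:])) / props
    return dict(zip(categories, values))

# Hypothetical 4-point Likert responses for a single item.
item = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 3, 2, 4, 3, 2]
print(successive_interval_values(item))
```

The resulting scale values are monotone in the original categories but no longer equally spaced, which is the point of the transformation.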
Hello, I was wondering what the minimum size of my sample should be if I want to use a simple moderation model? Is there any article on that topic?
Thanks
Hi,
I am a Master of Nursing Science student. Currently, I am conducting research on patients with breast cancer. I have chosen social support as an independent variable to examine its predictive power for health-related quality of life. To measure social support, I used the Modified Medical Outcomes Study Social Support Survey (eight items). The scoring manual didn't exactly explain the cut-off scores for poor, fair and good social support measured on a 0-100 point scale. But I referred to a previous similar study (I only found one) which categorized it as poor (less than 60), fair (60-79) and good social support (more than 80), and explained that it was categorized according to Bloom's theory. But there is no citation to search further. I tried many times to confirm this by searching, but I could not find an article that used the same score-based categorization. So I kindly request you to let me know how it can be done, or to post the article here if you have an idea.
Thank you.
Is it okay to produce 2 articles from a single questionnaire study if using different sets of variables for each?
I collected data on 2 occasions (before and after an intergroup conflict). I wrote a report that includes mind attribution, perceived liking and morality. Then there are several variables that I didn't use at that time, such as national identity, perceived suffering of the target, infra-humanization, etc. It was a research project... partially for exploratory purposes.
Now my question is: is it okay to produce another article that includes the variables I didn't use in my previous paper?
Thank you in advance.
Sincerely,
I am comparing two groups of sex offenders: those who are satisfied with sex offender treatment programs versus those who are not, splitting them by satisfaction, and then seeing if they recidivate. I am pretty sure I would use a two-group t-test to compare how many recidivated in the satisfied group versus the non-satisfied group. Is this right?
Hypothesis: Those who are satisfied with the sex offender treatment program are less likely to recidivate than those who are not satisfied.
I wish to moderate alpha particles of energy 5.4 MeV from an americium-241 source down to 2 MeV.
What type of moderator would be suited for this?
Cheers
Shri
Any experience or suggestions on response options when measuring «belief strength» and «outcome evaluation» within the TPB framework?
(a) unipolar scoring (from 1 to 7)
or
(b) bipolar scoring (-3 to + 3)
or
(c) a combination of both?