
Applied Psychometrics - Science topic

Explore the latest questions and answers in Applied Psychometrics, and find Applied Psychometrics experts.
Questions related to Applied Psychometrics
  • asked a question related to Applied Psychometrics
Question
4 answers
The AVE of the scale is below 0.5, while the rest of the parameters, viz. CR and discriminant validity, are above their threshold levels.
Relevant answer
Answer
Lam, L. W. (2012). Impact of competitiveness on salespeople's commitment and performance. Journal of Business Research, 65(9), 1328-1334.
  • asked a question related to Applied Psychometrics
Question
5 answers
I came across different definitions of low, moderate and high growth, but I have not been able to find a reliable reference to cite where these definitions are presented.
Some suggest adding up the total score and defining scores below 45 as none to low and scores above 46 as moderate to high. Others suggest summing all the scores and calculating a mean score, which they then group as 1-3, 3-4, and 4-6 for low, moderate, and high, respectively.
Any references would be highly appreciated.
Relevant answer
Answer
Hello, has any of you found the cutoff point for the PTGI, with a reference? I could not find any... Thank you.
  • asked a question related to Applied Psychometrics
Question
4 answers
Dear Research Community,
I am asking for your participation and especially for your feedback on our Self-Assessment for Digital Transformation Leaders: https://t1p.de/mwod
The goal is to provide leaders with a mirror for reflecting on themselves and on the skills and personal attributes required for digital transformation. In the end, participants receive an integrated presentation of their results (see appendix).
  1. Are all questions understandable?
  2. Which questions lack precision?
  3. In your opinion (as a digital leader), are essential aspects still missing? If so, which ones?
I am looking forward to any kind of suggestions.
Best regards
Alexander Kwiatkowski
Relevant answer
Answer
It depends upon the structure and culture of the organisation.
  • asked a question related to Applied Psychometrics
Question
34 answers
Hello,
Is anyone interested in helping with our analysis? We are working on a project relating to personality and friendship. Leave your email address if you are interested.
Regards,
Relevant answer
Answer
It would be good to know the type of psychological data you want to analyse and, possibly, the research hypotheses.
  • asked a question related to Applied Psychometrics
Question
3 answers
I doubt it is halfway between low and high arousal, or halfway between negative and positive valence. Does anyone know where listeners rate emotionally "neutral", conversational speech?
Relevant answer
Answer
I read this somewhere in the RAVDESS paper: "Many studies incorporate a neutral or 'no emotion' control condition. However, neutral expressions have produced mixed perceptual results [70], at times conveying a negative emotional valence [71]. Researchers have suggested that this may be due to uncertainty on the part of the performer as to how neutral should be conveyed [66]. To compensate for this a calm baseline condition has been included, which is perceptually like neutral, but may be perceived as having a mild positive valence."
Intuitively, neutral would be a low arousal state to me so I bet listeners would judge it like so. Based on the excerpt above, I would expect listeners to perceive such utterances as slightly negative in valence on average.
  • asked a question related to Applied Psychometrics
Question
8 answers
I'm doing a split-half estimation on the following data:
trial one: mean = 5.12 (SD = 5.76)
trial two: mean = 7.62 (SD = 8.5)
trial three: mean = 8.57 (SD = 12.66)
trial four: mean = 8.11 (SD = 10.7)
(SD = standard deviation)
I'm creating two subset scores (one from trials one & two, and one from trials three & four - I realise this is not the usual odd/even split):
Subset 1 (t1 & t2): mean = 12.73 (SD = 11.47)
Subset 2 (t3 & 4): mean = 16.68 (SD= 17.92)
I'm then computing a correlation between these two subsets, after which I'm computing the reliability of this correlation using the Spearman-Brown formulation.
However, in the literature I've found, it all suggests that the data must meet a number of assumptions, specifically that the mean and variance of the subsets (and possibly the items of these subsets) must all be equivalent.
As one source states:
“the adequacy of the split-half approach once again rests on the assumption that the two halves are parallel tests. That is, the halves must have equal true scores and equal error variance. As we have discussed, if the assumptions of classical test theory and parallel tests are all true, then the two halves should have equal means and equal variances.”
Excerpt From: R. Michael Furr. “Psychometrics”. Apple Books.
My question is: must the variances and means be equal for a split-half estimate of reliability? If so, how can equality be tested? And is there a guide to how similar the means can be (surely the means and variances across subsets cannot be expected to be exactly equal?!)?
Relevant answer
Answer
Yes, unfortunately it's common practice to just compute Cronbach's alpha without first testing whether the variables are essentially or strictly tau-equivalent. This may in part be because SPSS calls the procedure MODEL = ALPHA (which does not make sense in my opinion) but does not provide a test of fit for essential or strict tau-equivalence as part of the procedure (for whatever reason). When the variables are not at least essentially tau-equivalent (when they are "only" congeneric, i.e., have different loadings), Cronbach's alpha leads to an underestimate of reliability (McDonald's omega is appropriate for congeneric measures). Even worse is the (probably frequent!) case where the indicators are multidimensional (i.e., they measure more than one factor/true score). In that case, Cronbach's alpha is completely meaningless, yet you wouldn't know from SPSS output.
Essential tau-equivalence can be tested in lavaan and other SEM/CFA programs by specifying a 1-factor model with all factor loadings fixed to one and intercepts and error variances freely estimated (not set equal across variables). Strict tau-equivalence requires equal intercepts (means) across variables (otherwise same specification as essential tau equivalence).
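Regarding the mechanics in the original question (separate from the tau-equivalence issue above), here is a minimal Python sketch of the subset correlation and the Spearman-Brown step; the trial scores are hypothetical stand-ins for the real data:
```python
import pandas as pd

# Hypothetical data frame: one column per trial (t1..t4), one row per participant.
df = pd.DataFrame({
    "t1": [3, 5, 8, 2, 6],
    "t2": [4, 7, 10, 3, 9],
    "t3": [5, 9, 12, 4, 11],
    "t4": [6, 8, 11, 5, 10],
})

# Form the two halves exactly as described in the question (t1+t2 vs. t3+t4).
half1 = df["t1"] + df["t2"]
half2 = df["t3"] + df["t4"]

# Pearson correlation between the two half-test scores.
r = half1.corr(half2)

# Spearman-Brown prophecy formula for a test twice as long as each half.
spearman_brown = (2 * r) / (1 + r)
print(f"half-half r = {r:.3f}, Spearman-Brown reliability = {spearman_brown:.3f}")
```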
  • asked a question related to Applied Psychometrics
Question
52 answers
I am looking for a brief and widely accepted measure for self-esteem. I would like to be able to determine intrinsic self-esteem forces rather than extrinsic benchmarks for a "successful" person with this measure.
Relevant answer
Answer
Self-confidence in the first degree, and the respect of others. However, conceit can become intermingled with self-esteem.
  • asked a question related to Applied Psychometrics
Question
16 answers
Hello, I have a questionnaire that consists of four sections, with each section focusing on different variables.
First, each section has 9-10 items, with the items following different scales. For instance, the first section has 10 items with no Likert scale, and the participants have to choose from two, three, or more specific options. The second section has 9 items, with the first five items on a six-point Likert scale, while for the remaining items the respondents have to choose from four specific options. The third section has 10 items, each on a six-point Likert scale. The fourth section has 9 items with no Likert scale, and the participants have to choose from three, four, or more specific options.
Second, for some of the items the respondents were also allowed to select multiple answers.
Now my question is: how do I calculate Cronbach's alpha for this questionnaire? And if Cronbach's alpha cannot be calculated, what are the alternatives for assessing the reliability and internal consistency of the questionnaire?
Relevant answer
Answer
Amjad Pervez Strictly speaking, Cronbach's alpha only makes sense when your variables are measured on an interval scale (i.e., when you have continuous/metrical/scale-level variables) and when the variables are in line with the classical test theory (CTT) model of (essential) tau-equivalence (or stricter models). Essential tau-equivalence implies that the variables/items measure a single factor/common true score variable (i.e., that they are unidimensional) with equal loadings. For variables that are only congeneric (measure a single factor/dimension but have different factor loadings), Cronbach's alpha underestimates reliability. For multidimensional scales, Cronbach's alpha tends to be completely meaningless. For categorical (binary and ordinal) variables, psychometric models and scaling procedures from item response theory are usually more appropriate than procedures derived from CTT, which assumes continuous (scale-level) variables.
Maybe you could describe the content of your variables (and the answer options) in a bit more detail. That would make it easier for folks on Researchgate to see which procedure may be appropriate for you.
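For any section that does turn out to be a unidimensional set of Likert-type items, a minimal Python sketch of the standard Cronbach's alpha computation (the item names and responses below are hypothetical):
```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a set of items (columns) scored in the same direction."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical responses: 5 participants x 4 items from one section.
items = pd.DataFrame({
    "q1": [4, 5, 3, 2, 5],
    "q2": [4, 6, 3, 2, 5],
    "q3": [3, 5, 4, 1, 6],
    "q4": [4, 5, 3, 2, 6],
})
print(round(cronbach_alpha(items), 3))
```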
  • asked a question related to Applied Psychometrics
Question
14 answers
Hello everyone,
I was wondering whether I should calculate the ceiling and floor effects for each item separately or just calculate them for the total score of my questionnaire?
Thanks in advance,
Sara
Relevant answer
Answer
Hi Sara,
you give little additional info, but there are a couple of relatively simple points you should probably consider:
1) If there is, say, a 7-point response scale, and every participant answers an item with 1 (e.g., fully disagree) or with 7 (e.g., fully agree), then this item is uninformative. In other words, there will be no variance on these items, and they will hardly add explanatory power to any analysis.
2) If the overall score (summed item scores) is to be interpreted by its extent (e.g., a sum score of 30 means 'high' on a certain trait or attitude), then adding items which only yield high item scores is not informative either, because you basically add the same number (a constant) to each individual's score. In other words, an item with a ceiling or floor effect was probably not worded in a valid way (e.g., "You do not like being insulted in front of your family, colleagues and partners" - fully agree; who would not? But does this raise my score on "being vulnerable to insults"?).
So I suggest checking this before drawing inferences on overall scores. Usually, such items would be removed from analyses (also for statistical reasons).
Hope this helps,
René
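As a rough illustration of the per-item check described above, a minimal Python sketch (hypothetical 1-7 responses in a pandas DataFrame) of the proportion of responses at the scale floor and ceiling, per item and for the total score:
```python
import pandas as pd

# Hypothetical responses on a 1-7 scale: rows = participants, columns = items.
df = pd.DataFrame({
    "item1": [7, 7, 6, 7, 7, 7],
    "item2": [2, 4, 3, 5, 1, 6],
    "item3": [1, 1, 2, 1, 1, 1],
})
scale_min, scale_max = 1, 7

# Per-item proportion of responses at the floor and at the ceiling.
floor_pct = (df == scale_min).mean()
ceiling_pct = (df == scale_max).mean()
print(pd.DataFrame({"floor": floor_pct, "ceiling": ceiling_pct}))

# The same idea for the total (summed) score.
total = df.sum(axis=1)
n_items = df.shape[1]
print("total at floor:", (total == scale_min * n_items).mean())
print("total at ceiling:", (total == scale_max * n_items).mean())
```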
  • asked a question related to Applied Psychometrics
Question
6 answers
What if the Cronbach's alpha of a 4-item scale measuring a control variable is between .40 and .50 in your research, but the scale is the same scale used in previous research, in which it received a Cronbach's alpha of .73?
Do you have to make some adjustments to the scale or can you use this scale because previous research showed it is reliable?
What do you think?
Relevant answer
Answer
Hello Lisa,
as I suspected. These are clearly not indicators of a common underlying factor. Hence, alpha and every other internal-consistency approach to reliability is inappropriate. For its control function, however, the scale will do its job, as it can be regarded as a composite of specific facets. And, yes, each of the facets won't be a perfect, error-free indicator of its underlying attribute, but that should not hurt much.
All the best,
Holger
  • asked a question related to Applied Psychometrics
Question
9 answers
I am using the technical manual and am unsure whether I am doing it correctly, or am even able to; I'm trying to calculate a linear T-score for a scale that is not part of the test. The CRIN scale is on the MMPI-A-RF but not the MMPI-2-RF. I am doing a project that could really benefit from having the same linear T-scores for CRIN along with what I have in my dataset for the other validity scales. Please and thank you.
Relevant answer
Answer
If you have the complete test kit ("specimen set"), then, in addition to scoring each subject's profile by hand, with its T scores, using the relevant "ad hoc" templates, you can also score it electronically; the kit gives you the means to do so. With the Inventory Manual alone, however, IT IS IMPOSSIBLE!
  • asked a question related to Applied Psychometrics
Question
11 answers
I am working on developing a new scale. On running the EFA, only one factor emerged clearly, while the other two factors were messy, with items loading on multiple factors.
1- Can I remove the cross-loading items one by one and re-run the analysis to reach a better factor structure?
2- If multiple items still load on one factor, what criteria should I use to determine what this factor is?
Relevant answer
Answer
Exploratory factor analysis (EFA) is a method for exploring the underlying structure of a set of observed variables, and it is a crucial step in the scale development process. ... After extracting the best factor structure, we can obtain a more interpretable factor solution through factor rotation.
  • asked a question related to Applied Psychometrics
Question
2 answers
I am trying to obtain permission to use the Mueller/McCloskey scale on job satisfaction among nurses, and in the meantime I would love to see how it is scored.
I have found references for factor loadings, correlations of the subscales, etc., but I need to look at the scoring, which I think builds up to a continuous variable.
Thanks in advance to anybody who can help.
Relevant answer
Answer
The table in the paper below seems to show the original composition of the subscales:
Lee, S.E., Dahinten, S.V. and MacPhee, M. (2016), Psychometric evaluation of the McCloskey/Mueller Satisfaction Scale. Jpn J Nurs Sci, 13: 487-495. https://doi.org/10.1111/jjns.12128
  • asked a question related to Applied Psychometrics
Question
3 answers
How many (what percentage of) 'unengaged' respondents would you allow in your respondent database, and what criteria would you employ in order to make a decision to eliminate or keep them?
With respect to Likert scales, there are times, especially in lengthy self-report questionnaires, when respondents provide 'linear' answers, i.e., they constantly give the same rating to all questions, regardless of whether the questions are reversed or not.
Some call this type of respondent 'unengaged'. This is rather a 'friendly' term, since there can also be malicious intent behind such responses, but that is another discussion. However we call them, the first direct effect on the data is reduced variability.
There may be other effects as well (feel free to list them based on your experience or knowledge), which can affect the relations between the constructs and the final inferences.
Thus, how do you proceed in these situations, and what criteria do you base your decision on (supporting arguments are welcome and references to written texts are especially appreciated)?
(Edit): I realised that the original question may induce some confusion. This is not a case that requires substitution (although, in general terms, it may be). Please consider that all cases are complete (all respondents answered each item). The problem lies in the pattern of responses. Some respond in a straight line, hence the name 'straightlined' answers, while others respond in various patterns (zig-zagging, for instance), hence the name 'patterned'. While for scales which include reversed items some cases (for instance, 'straightlined' ones) can be easily spotted, for scales without reversed items this is harder to do. However, the question pertains to both situations (scales with and without reversed items).
Another particularity of the question is that I am less interested in 'how to identify' these cases (methods) and much more in 'what to do'/'how to deal' with them, i.e., what criteria, rules of thumb, or best practices to consider.
Responses including references to academic papers or discussions are much appreciated!
Relevant answer
Answer
I use the standard deviation to identify unengaged responses. If the st. dev. is 0, then I exclude the respondent from the analysis. I am aware that this might pose a problem when there are few responses available.
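A minimal sketch of that rule in Python (hypothetical Likert responses; note that the zero-standard-deviation criterion only flags pure straightlining, not other patterned responding):
```python
import pandas as pd

# Hypothetical Likert responses: rows = respondents, columns = items.
df = pd.DataFrame({
    "q1": [3, 4, 5, 2],
    "q2": [3, 5, 1, 2],
    "q3": [3, 4, 2, 2],
    "q4": [3, 3, 4, 2],
})

# Standard deviation across items for each respondent; 0 means a straight line.
respondent_sd = df.std(axis=1, ddof=1)
straightliners = respondent_sd == 0

print(df[straightliners])          # flagged respondents
cleaned = df[~straightliners]      # analysis sample with straightliners removed
```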
  • asked a question related to Applied Psychometrics
Question
9 answers
Two outcome measures, the Numeric Pain Rating Scale and the Pain Self-Efficacy Scale, are both in the public domain; however, I need written confirmation of this.
Thanks for your help.
Teina
Relevant answer
Answer
Visual analogue scale
  • asked a question related to Applied Psychometrics
Question
3 answers
The MID is based on the standard deviation; consequently, I should not use it with a non-normal distribution. Am I right? If so, what may replace the MID?
Relevant answer
Answer
If you have a non-normal distribution, then you may have to remove outliers to restore approximate normality and simplify your work.
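As a side note, one distribution-based convention that is sometimes used (not mentioned in this thread, so treat the specific choices here as assumptions) is to anchor the MID to half the baseline standard deviation or to the standard error of measurement; a minimal Python sketch with hypothetical numbers:
```python
import numpy as np

# Hypothetical baseline scores on a patient-reported outcome measure.
scores = np.array([42, 55, 38, 61, 47, 50, 44, 58, 40, 53], dtype=float)
reliability = 0.85  # hypothetical test-retest or internal-consistency estimate

sd = scores.std(ddof=1)
mid_half_sd = 0.5 * sd               # "half SD" convention
sem = sd * np.sqrt(1 - reliability)  # standard error of measurement
print(f"0.5*SD = {mid_half_sd:.2f}, SEM = {sem:.2f}")
```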
  • asked a question related to Applied Psychometrics
Question
9 answers
I've recently come across several published articles in which questionnaires composed of more than one section (where every section is meant to gather data on a separate variable/dimension of the study) are administered to a sample in order to calculate each section's coefficient AND the coefficient of the entire questionnaire.
What could be the use of a coefficient calculated for an intentionally multidimensional instrument?
J.
Relevant answer
Answer
Javier
There are brilliant contributions here!
Why do people calculate alpha for a multi-dimensional instrument? A cynic may think that this is done because the answer tends to look good! Unmodeled sources of systematic variation will tend to inflate alpha for a set of indicators. We showed this in a set of simulations...
and it was also examined in a much more elegant fashion in...
Many years ago Raykov made a convincing case for estimating reliability as part of a factor-analytic study - I suppose this has gained some traction, as it is now not uncommon to see composite reliability being reported, though not as frequently as alpha!
I suppose it takes time for people to change habits, and this is not helped if reviewers for journals either don't understand or don't care about such issues. Indeed, it would appear to be an uphill struggle, as it also seems to be ingrained to report reliability and validity as properties of actual scales or questionnaires rather than of the scores. Not quite the medical equivalent of leeches and humours, but (speaking as a psychologist) we don't often demonstrate sound practice or knowledge on issues of measurement.
Mark
  • asked a question related to Applied Psychometrics
Question
12 answers
Hello
I am researching the appropriate cut-off point for five anxiety scales in cancer patients.
But I cannot draw an overall ROC curve in SPSS for all the scales.
How can I draw this curve, like the example below?
Relevant answer
Answer
Mohamad reza Davoudi, it's several years since I looked closely at ROC curves (I was considering writing an article about them because I thought it's easy to misunderstand them and I wanted to shed some light on them in case my attempt helped others), and my sense back then was that it's not simply a matter of deciding where a curve comes closest to the upper left-hand corner of the ROC space - i.e., where sensitivity and specificity are "equally maximized".
About 3 years ago, I wrote an article about sensitivity, specificity, and predictive values in which I considered some of the ins and outs concerning sensitivity and specificity that researchers, clinicians, students, and teachers seem to have problems with. Near the end of that article, I dealt with some important considerations that clinicians might take into account with regard to predictive values (which would flow back to sensitivity and specificity) - thus demonstrating that matching up sensitivity and specificity isn't always the best tack to take. Sometimes, it could be better to have one noticeably higher than the other.
In case you're interested, my article is in an open access journal, so freely available:
Trevethan, R. (2017). Sensitivity, specificity, and predictive values: Foundations, pliabilities, and pitfalls in research and practice. Frontiers in Public Health, 5:307. https://doi.org/10.3389/fpubh.2017.00307
I hope it might be helpful if you look at it.
All the best with your research.
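SPSS aside, if you can export the scale scores and the diagnostic status, one way to overlay several ROC curves on one plot is sketched below in Python with scikit-learn; the scale names and data are hypothetical:
```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical data: a binary anxiety diagnosis and scores on several scales.
rng = np.random.default_rng(0)
diagnosis = rng.integers(0, 2, size=200)
scales = {
    "Scale A": diagnosis * 5 + rng.normal(10, 3, size=200),
    "Scale B": diagnosis * 3 + rng.normal(10, 3, size=200),
}

# One ROC curve per scale, drawn on the same axes.
for name, score in scales.items():
    fpr, tpr, _ = roc_curve(diagnosis, score)
    auc = roc_auc_score(diagnosis, score)
    plt.plot(fpr, tpr, label=f"{name} (AUC = {auc:.2f})")

plt.plot([0, 1], [0, 1], linestyle="--", color="grey")  # chance line
plt.xlabel("1 - specificity (false positive rate)")
plt.ylabel("Sensitivity (true positive rate)")
plt.legend()
plt.show()
```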
  • asked a question related to Applied Psychometrics
Question
7 answers
Dear Colleagues,
I am working on a new questionnaire and I would like to verify measurement invariance. I have conducted a multi-group CFA and am now verifying metric invariance. I have computed factor loadings for each item in both groups (women and men), but I have not found any criteria for assessing whether these factor loadings are appropriately similar in both groups.
I have not found any strict criteria for the acceptable level of difference between factor loadings for each item or (on average) for the whole scale (or the whole questionnaire). Some items have similar factor loadings, but in a few cases they differ. It also happens that, within a scale, one item has a higher factor loading for men and another for women.
Do you know any strict criteria for assessing metric invariance?
Kind regards,
Kamil Janowicz
  • asked a question related to Applied Psychometrics
Question
7 answers
Hi all, 
I am very much interested in how to convert ordinal-scale data into an interval scale. What are your suggestions?
Let's say I have data measured on a 4-point Likert scale (1 = strongly disagree to 4 = strongly agree). While these data are clearly ordinal, what would be the best option for converting them to an interval scale?
The Method of Successive Intervals (MSI)?
The Rasch model?
Your comments/suggestions are  highly appreciated 
Best regards, 
Davit 
  • asked a question related to Applied Psychometrics
Question
9 answers
I am using three questionnaires
1) has 6 items on a 6-point Likert scale (professional self-doubt, a subscale of the Development of Psychotherapists Common Core Questionnaire)
2) has 10 items on a 6-point Likert scale (developmental experience, a subscale of the Development of Psychotherapists Common Core Questionnaire)
3) has 14 items on a 5-point Likert scale (Warwick-Edinburgh Mental Wellbeing Scale)
These are already established scales
My dissertation supervisor has advised me that I needed to calculate Cronbach's alpha before distributing my questionnaires, but I was unsure how to do this. (I have contacted her about this, but she is currently on annual leave.)
I ended up distributing the questionnaires, as I had already received ethical approval and was allowed to start. I calculated Cronbach's alpha using the responses of 64 participants.
Question 1) Is this enough participants?
Question 2) How could I have calculated Cronbach's alpha before the questionnaires were sent out?
Question 3) I have recruited 107 participants; is this enough? (I am using a cross-sectional, quantitative design and multiple regression analysis.)
Thank you for taking the time to read this, I apologise in advance if any of this does not make sense.
Relevant answer
Answer
Hello Sid,
A bit out of order, but here are some thoughts on your three questions.
Q2: You have to have responses to the measures in order to estimate Cronbach's alpha (or any indicator of response consistency), so you were correct in trying to collect some data. Otherwise, since you note these are existing measures (and, presumably, others have compiled evidence concerning their technical quality), all you'd have been able to do would be to report what others have found (with their samples, which might or might not be comparable to yours).
Q1: Is 64 cases enough? For a pilot study, it's certainly enough to permit you to estimate internal consistency reliability. When you finish with the "full sample," it would be good practice to report score reliability estimates based on the larger batch as well.
Q3: Whether 107 cases is enough for multiple regression analysis depends on: (a) how many IVs (and DVs) you're evaluating in your model(s); (b) the smallest effect size (for example, how low an R-squared) you would consider noteworthy given the variables and the population you're investigating; and (c) how much statistical power you'd like your study to have to detect that effect size, if it exists in the population (when testing at your chosen alpha level, of course). Programs like the freely available G*Power program can help with this process (https://www.psychologie.hhu.de/arbeitsgruppen/allgemeine-psychologie-und-arbeitspsychologie/gpower.html)
Good luck with your work.
  • asked a question related to Applied Psychometrics
Question
13 answers
Hi all,
I am designing a student profiling system (in university-level technical education) in which student competencies are mapped and used for further intervention. I want to test achievement motivation, study habits, engineering aptitude, and English proficiency. I have selected the following tests. Are these good enough? Which can I use for English proficiency?
1.     Study Habit Inventory by M.N. Palsane and S. Sharma 
2.     Engineering Aptitude Test Battery by Swarn Pratap
3.     English Language Proficiency Test by K S Misra & Ruchi Dubey 
  • asked a question related to Applied Psychometrics
Question
4 answers
Hi all! I'm conducting a study in which I have designed a path model wherein most latent variables have polytomous indicators (ordinal data). However, one instrument I intend to use measures a construct with a mix of dichotomous and polytomous items. My question is: can I somehow model a single latent variable with both these types of data? I've seen models where a factor has either dichotomous or polytomous indicators, but not both types together.
(I planned on using an estimation method that performs better for these types of data, such as WLSMV.)
Relevant answer
Answer
Cristian Ramos-Vera, Sir, in that case, when we have two factors, will the measurement model, the SEM, and the path model all have the same two categories? Or, if the hypothesis is that one consolidated factor affects another, do we combine the factors and do a path analysis? Please guide.
  • asked a question related to Applied Psychometrics
Question
7 answers
We can say that these factor-analytic approaches are generally used for two main purposes:
1) a more purely psychometric approach, in which the objective tends to be to verify the plausibility of a specific measurement model; and,
2) a more ambitious aim in speculative terms, applied to represent the functioning of a psychological construct or domain, which is supposed to be reflected in the measurement model.
What do you think of these general uses?
The opinion given by Wes Bonifay and colleagues can be useful for the present discussion:
Relevant answer
Answer
Hi! Factor analysis helps to identify the underlying dimensions. Sometimes the measurement items vary across contexts. Besides, if the measure is not well established, conducting factor analysis can produce clear dimensions that can be used for the particular research model. Thanks
  • asked a question related to Applied Psychometrics
Question
5 answers
I'm an undergraduate student taking courses in test construction and research. The construct we are studying is mental toughness, and we plan on using the MTQ-48 to assess it. I am aware that there is a technical manual for the MTQ-48 available online; unfortunately, it does not contain a scoring procedure and does not have information about which items belong to which subscale. Using other references, what we found is that the MTQ-48 is scored on a 5-point Likert scale, with 1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, 5 = strongly agree. We are also aware of which items fall under each subscale, such as items 4, 6, 14, 23, 30, 40, 44, and 48, which fall under the challenge subscale. However, we were not able to find a reference stating which items are negatively scored. While we could make an educated guess, it is required that we have a source for the scoring procedure. If anyone here has such a reference, or knows the items, it would be highly appreciated. Thanks!
Relevant answer
Answer
If you want a short unidimensional measure of mental toughness that has demonstrated promising invariance across sports and general samples let me know.
Happy to provide items (NO COSTS) and assist.
  • asked a question related to Applied Psychometrics
Question
5 answers
And what would the conceptual diagram look like with direct, indirect, and total effect pathways?
Relevant answer
Answer
Hi Franklin,
yes, with a 4-category variable, you would create 3 dummies, with one category representing the reference group and the others representing the comparison between a respective category and the reference.
Please google how to create dummies. You will find tons of tutorials.
Best,
Holger
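A minimal sketch of that dummy coding in Python with pandas (the variable name and categories are hypothetical):
```python
import pandas as pd

# Hypothetical 4-category variable.
df = pd.DataFrame({"group": ["A", "B", "C", "D", "B", "A"]})

# Three dummies, with category "A" as the (dropped) reference group.
dummies = pd.get_dummies(df["group"], prefix="group", drop_first=True).astype(int)
print(pd.concat([df, dummies], axis=1))
```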
  • asked a question related to Applied Psychometrics
Question
6 answers
If the original questionnaire is validated, then a linguistic validation is sufficient.
If you think statistical revalidation of a translated version is mandatory: - So can you give an example where a statistical validation of a translated version questionnaire was not successful? - Can you explain how statistical validation may change the strategy of the whole process?
Relevant answer
Answer
As you mentioned yourself, linguistic validation is sufficient for using the questionnaire (correct translation is covered by using validated translation methods such as FACIT trans), but it is the cultural differences/relevance that need to be taken into account during statistical analyses (especially if you have a questionnaire which is based on IRT modelling).
I have a very simple example for you that we (and other countries) encountered when translating some questionnaires on arm/hand functioning from English into our native language. Certain items were simply not relevant for other countries (such as doing yard work, while most Koreans do not have yards, or being able to go to the toilet without help, where some countries have high toilets and others low ones). Due to these differences you might run into differential item functioning (DIF), which means questions behave differently for subgroups defined by some background characteristic (such as language). This also has an impact on the domain/sum scores. A solution to DIF could be changing the method for calculating sum scores or replacing/removing the question with DIF.
Basically, as Cristian Ramos-Vera mentioned, measurement invariance (i.e., the same factorial structure) is required across different cultures (which involves more than just the language).
Best,
Michiel
  • asked a question related to Applied Psychometrics
Question
10 answers
I'm currently designing a questionnaire to measure a variable. There are 5 questions related to the variable; 4 of them are on a 5-point Likert scale and 1 of them is on a 6-point Likert scale (based on the literature review). Will this be a problem for further analysis of this variable? For your information, this variable will be a mediator in the main model.
Relevant answer
Answer
follow
  • asked a question related to Applied Psychometrics
Question
5 answers
How do I interpret neutral responses when using a Likert scale, given that I use the mode? Any help is appreciated.
Relevant answer
Answer
Nowadays, even in the corporate world, they are not using five-point or odd-numbered Likert scales. The use of even-numbered Likert scales is very popular, as this at least gives us the positive or negative opinion of the consumer.
  • asked a question related to Applied Psychometrics
Question
6 answers
I'm trying to specify McDonald's omega. I'm using Mplus (based on the FAQ guide for reliability - https://www.statmodel.com/download/Omega%20coefficient%20in%20Mplus.pdf).
It works perfectly for regular models, but when I try to use it with random-intercept factor analysis models, I get omegas higher than 1.00 for some factors.
That’s my model (not the syntax, only a brief approach to the example):
f1 by item1-item5
f2 by item6-item10
fRI by item1-item10
fRI with f1-f2@0
Based on the Mplus FAQ, omega is calculated as (sum of loadings)^2 / ((sum of loadings)^2 + sum of residual variances), right?
So, I wonder whether it is 'OK' to have an omega higher than 1.00, and what it would mean.
Also, here is an example of my omega calculations (again, not my syntax, just a brief sketch of what I've been trying to do):
OmegaFactor1 = (loaditems1to5)^2/((loadsitems1to5)^2+resvarianceitems1to5)
OmegaFactor2 = (loaditems6to10)^2/((loadsitems6to10)^2+resvarianceitems6to10)
OmegaRandomIntercept = (loaditems1to10)^2/((loadsitems1to10)^2+resvarianceitems1to10)
I'm not sure whether that approach is correct (or even whether omega should be used in this case), but my intuition is that this is happening because I am failing to constrain the residual variances for Factor 1 and Factor 2 in my model, since the random intercept factor is 'interacting' with them.
Could anyone give me a tip about that theme?
Relevant answer
Answer
Could you try the FACTOR 10.9 software to understand the case? If omega is higher than 1.00, it may mean that a residual variance is negative. If it is negative, there is an improper solution. Check the Mplus output.
Bests
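For reference, a minimal sketch of the omega formula quoted in the question, in plain Python; the loadings and residual variances are hypothetical numbers standing in for estimates read off the model output:
```python
def mcdonald_omega(loadings, residual_variances):
    """Omega = (sum of loadings)^2 / ((sum of loadings)^2 + sum of residual variances)."""
    num = sum(loadings) ** 2
    return num / (num + sum(residual_variances))

# Hypothetical unstandardized estimates for items 1-5 on factor 1.
loadings_f1 = [0.7, 0.8, 0.6, 0.75, 0.65]
residuals_f1 = [0.5, 0.4, 0.6, 0.45, 0.55]
print(round(mcdonald_omega(loadings_f1, residuals_f1), 3))
```
Note that with a random-intercept factor loading on the same items, each item's variance is split across two factors, so plugging one factor's loadings together with the items' full residual variances into this one-factor formula no longer partitions the variance cleanly; and, as noted above, a negative residual variance would push the ratio above 1 and signals an improper solution.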
  • asked a question related to Applied Psychometrics
Question
4 answers
I'm looking for an intelligence test to examine the verbal and non-verbal intelligence of children (like the WISC or K-BIT) that can be administered in a group setting. My main interest is not to determine a precise IQ, but to obtain a reliable measure of cognitive functioning to include as a control variable and to compare different groups of school-aged children. The K-BIT would be ideal, since it does not take too much time and includes two vocabulary subtests and one matrices subtest, but I don't know whether it can be administered simultaneously to a group of children.
Any idea?
Thank you so much for your attention!
Relevant answer
Answer
Dear Angel.
I think that the WISC is a good test for evaluating children's verbal and non-verbal intelligence in a classroom setting. As you know, the WISC gives you two IQs: a verbal IQ and a performance IQ. The WISC can also be administered in a group setting.
Kind regards,
Orlando
  • asked a question related to Applied Psychometrics
Question
5 answers
The effect size in statistics reflects the magnitude of a measured phenomenon. It may be expressed as a coefficient of determination (R²), which states how much variance can be explained by the controlled parameter.
If a direct and simple relationship between the variables is claimed (e.g., concentration of a measurand vs. absorbance when determining a calibration plot for an ELISA), the effect size may be close to 100%. Similarly, in pharmacology, the effect size of a relationship between direct phenomena may significantly exceed 50%.
In the social sciences, however, where the links between the observed phenomena are not so simple and are largely multifactorial, I suspect the effect sizes are much smaller.
How large must an effect size (expressed as R²) be in the social sciences to be considered relatively "large"? Any examples? Any references?
Thank you for your help!
Relevant answer
Answer
A lot depends on what you are measuring (or attempting to measure). If it is something as "soft" as attitudes, then you can expect a lot of measurement error, as well as a potentially large number of relevant omitted variables and interaction effects.
I did my graduate work in a very quantitatively oriented program, and we were taught that it was a favorable outcome when you could explain 30% of the variance in attitudes. Of course, that leaves three possibilities: your measures are highly unreliable (lots of measurement error), your model is mis-specified (you left out important predictors), or what you are trying to explain has a very large random component. Those are all serious problems, but when everyone in your field is struggling with those same problems, you learn to live with it.
  • asked a question related to Applied Psychometrics
Question
11 answers
Respected fellows,
I have recently started a deep dive into the gamification of psychological concepts (e.g., Pymetrics, CogniFit, Lumosity) to help me understand it, and I am having difficulty finding literature that might help me adapt these concepts into games. Can someone please help me find any literature on this, or advise how to proceed further?
Regards,
Azka Safdar
  • asked a question related to Applied Psychometrics
Question
3 answers
For testing national culture theory, for example, there are four dimensions, each with a set of related items. For the sample we obtain, I would think it better to analyze each dimension separately for reliability, KMO, and Bartlett's test, rather than the overall instrument (all four dimensions together), since it is clearly a multidimensional instrument and not unidimensional. Please advise.
Relevant answer
Answer
Are there any references that can be cited for computing the KMO partially (by dimension)?
  • asked a question related to Applied Psychometrics
Question
7 answers
Does anyone know which are the 10 response-bias items in the Narcissistic Injury Scale (Slyter, 1991)? I have found 50 items for this scale; 10 of them are for response-set bias and 2 items were dropped by the researchers. So only 38 items need scoring, but I don't know which 38 items out of those 50.
Thank you for your help in advance. 
Relevant answer
Answer
Hello,
I am also working on a project and would like to use the NIS. Did anyone get a response from the author?
Thank you
  • asked a question related to Applied Psychometrics
Question
31 answers
Dear Researchers,
I'm designing an interdisciplinary study (spanning public health, statistics, psychology, etc.) on diseases, stigma, discrimination, mental health, and quality of life.
So I'm a bit confused about what the ultimate construct for human life should be; I mean, what must be there for a human being?
Quality of life is sometimes regarded as secondary; primarily, people should have a moving life.
The United Nations (Sustainable Development) Goals are all about society and people's wellbeing.
But what is the single most important factor for humans?
Please share your thoughts.
Best Regards,
Abhay
Relevant answer
Answer
First, I would keep my distance from sustainability concepts concerning human life quality (because this boils down to economic reform in a capitalist system, i.e., system maintenance). Second, a human bias is built into every questionnaire concerning life quality; it is always built on the subjective perspective of the interviewer. Third, I do think that all research based around happiness is a good starting point for your work; most foundations apply a happiness index, and these indices could be methodically compared.
  • asked a question related to Applied Psychometrics
Question
13 answers
I conducted a principal components analysis on a subset of my data, then used the remaining participants to conduct a confirmatory factor analysis. Was this correct, or am I able to use the entire sample for the CFA, even though some participants were used for the PCA?
Please note that I did not conduct an exploratory factor analysis (EFA). My goal was specifically to reduce the number of variables into the proposed constructs, so I am confident that I needed to use a PCA, not an EFA. And I know that with EFA you should definitely run the CFA on a different sample. But does this rule apply to a PCA too, when you need to run a CFA afterwards?
So essentially, what I want to know is: can I conduct a PCA and a CFA on the same data or not? Any literature citations you may have are greatly appreciated. Thank you.
Relevant answer
Answer
If the scale's validity has already been established in another setting or another language, then it is preferable to do a confirmatory factor analysis. If the results support the original instrument structure, then there is no need to do an exploratory factor analysis. However, if the CFA does not support the original tool's validity, then you have to do an EFA.
  • asked a question related to Applied Psychometrics
Question
5 answers
There were ten questions in a pre-intervention test given to 40 participants. Correct answers were marked 1 and wrong answers 0. Content validity was assessed by a subject expert. Can anyone suggest ways to improve the value? The output is attached as a Word document. Thank you.
Relevant answer
Answer
Hi
Any value below 0.6 for measures such as item (alpha) reliability, split-half reliability, or inter-rater reliability is not accepted.
Negative values require reconstruction of your questionnaire and of the validation process.
Please refer to,
1. Landau, S. and Everitt, B. S. (2004). A Handbook of Statistical Analyses Using SPSS.
2. Howitt, D. and Cramer, D. (2008). Introduction to SPSS.
3. Validity and Reliability of Students and Academic Staff's Surveys to Improve Higher Education. Educational Alternatives, Journal of International Scientific Publications, Vol. 14, pp. 242-263.
Regards,
Zuhair
  • asked a question related to Applied Psychometrics
Question
5 answers
I'm struggling to find the answer to this question. Can I get some help, please? Thank you.
Relevant answer
Answer
Hi
1. Regarding reliability: the reliability of a questionnaire cannot be measured unless the scale of measurement is based on numerical values.
2. Regarding the validity of an open-ended questionnaire: which type of validity do you want to measure?
Regards,
  • asked a question related to Applied Psychometrics
Question
11 answers
Can somebody direct me towards some good readings on the subject?
Thank you
Relevant answer
Answer
Welcome back! How did you go about measuring meta-emotion?
  • asked a question related to Applied Psychometrics
Question
9 answers
Recently I've been reviewing how to handle social desirability in testing.
After much theoretical review, I have come to the conclusion that the best way to do this is to neutralize the items with respect to social desirability. The format I will use is a Likert scale.
For example, an item that says "I fight with my coworkers" would be transformed into "Sometimes I react strongly to my coworkers" (the second is somewhat more neutral).
The idea comes from the work done by Professor Martin Bäckström.
Now the question I have is: is there any methodology that can help make this neutralization?
If not, what would be good ideas to realize it? What elements should I consider?
I think a good idea might be to "depersonalize" the item. For example, instead of "I fight with my bosses", it would become "I think an employee has the right to fight with his or her boss".
Another option I've thought of is to modify the frequency. For example, instead of "I get angry easily", I'd use "Sometimes I get angry easily".
However, I do not know if these options would affect the validity of the item to measure the construct.
Thank you so much for the help.
Relevant answer
Answer
This is a problem I am trying to solve in my study on developing a spiritual serenity scale... In the first study, I found that the items were socially desirable. A Rasch model analysis found that the items fell below the respondents' ability level...
In the next study, I will try changing the item descriptors to be more difficult... For example: never, seldom, sometimes, often, very often, and always.
The options "often", "very often", and "always" will then be chosen only by persons with high ability...
  • asked a question related to Applied Psychometrics
Question
8 answers
Hello everyone,
Currently, I am developing longitudinal research applying psychometric questionnaires to the same sample at two time points. However, I am looking for bibliography or evidence about which is the stronger statistical test, the paired-samples t-test or the Wilcoxon, and about the assumptions behind the application of each test.
Relevant answer
Answer
Hello Dante,
If all parametric assumptions are met, the paired t-test is slightly more powerful than the Wilcoxon signed-rank test (about 4.5-5% less for the Wilcoxon). Otherwise, you can't be certain without running a simulation study of the specific data set characteristics.
Paired t-test: Requires that difference scores are normally distributed, and, therefore are of at least interval scale strength. Tests hypothesis that mean difference = constant (typically, zero). Virtually any software package will compute correct probability (under null hypothesis) of result as or more extreme than that observed in the data set.
Wilcoxon test: Requires that difference scores are at least ordinal strength, and therefore may be converted to ranks. Tests hypothesis that sum of the positive ranked differences = sum of the negatively ranked differences (which simplifies to median ranked difference = 0). Most textbooks will furnish exact critical values for N up to about 20 to 25, but then recommend a "large sample" approximation to the normal distribution thereafter. SPSS, I believe, uses the large sample approximation regardless of sample size.
Good luck with your work!
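A minimal sketch of running both tests on the same paired data in Python with SciPy (the scores are hypothetical), which also makes the differing assumptions easy to see side by side:
```python
import numpy as np
from scipy import stats

# Hypothetical scores for the same respondents at two time points.
rng = np.random.default_rng(1)
time1 = rng.normal(50, 10, size=30)
time2 = time1 + rng.normal(2, 5, size=30)   # small average increase

# Paired t-test: assumes the difference scores are (roughly) normally distributed.
t_stat, t_p = stats.ttest_rel(time1, time2)

# Wilcoxon signed-rank test: only assumes the differences can be meaningfully ranked.
w_stat, w_p = stats.wilcoxon(time1, time2)

print(f"paired t: t = {t_stat:.2f}, p = {t_p:.3f}")
print(f"Wilcoxon: W = {w_stat:.2f}, p = {w_p:.3f}")
```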
  • asked a question related to Applied Psychometrics
Question
9 answers
Is there any possible way?
I understand that if the options point to the same trait, it can be done. for example a question of the type:
I work better:
(a) individually
(b) with other persons
either of the two options is valid for the person (helping avoid bias) and for example if I'm measuring the trait of teamwork I may think that a person who selects option b will have a higher degree in the trait of teamwork. Am I making a mistake in assuming this?
now, is there any way to do this when they point to different traits in response options? I want to be able, based on the data of forced response items, to carry out normative analysis (to be able to compare with other subjects).
PS: I'm clear that with ipsatives items you can't make comparisons between people, however, if you manage the punctuation in a different way could you do it somehow?
Relevant answer
Answer
Hi Ale,
there are recent developments in IRT that allow extracting normative scores from forced-choice questionnaires. The Thurstonian IRT (TIRT) model by Brown and Maydeu-Olivares and the MUPP model by Stark and colleagues are good examples.
From my own experience, the TIRT model works best in practice (i.e., in terms of reliability and validity).
  • asked a question related to Applied Psychometrics
Question
8 answers
I've been working on common psychometric tests. In other words, the traits to be measured are selected, some key behaviors are chosen, and subjects must respond on a Likert scale. After this, I look at validity and reliability, take a normative group, and calculate the scores (working with classical test theory), etc.
But now I want to do something different: I want to do a test in which they have multiple response options and all of them can be correct. For example, an item could be of the type:
Choose the phrase with which you are most identified (only one).
a) I am respectful
b) I am honest
c) I am a good worker
I have seen several of these tests but would like to know how to calculate their reliability, validity, and score tables.
Many thanks
Relevant answer
Answer
What you are referring to are the so-called ipsative tests. In general, as far as I know, there are not many ways of studying their psychometric properties. The reliability is usually estimated using the stability method (test-retest). The following article can give you more information:
Best regards
  • asked a question related to Applied Psychometrics
Question
2 answers
I understand that at least three first-order factors are needed for identifying a second-order, hierarchical CFA model. I also read in Reise et al. (2010) that at least three group factors are suggested for bifactor EFA. Does the same guideline apply to bifactor CFA? I've seen many influential papers that use bifactor CFA with only two group factors, but I want to make sure this is a correct decision. References are always very appreciated.
Relevant answer
Answer
In a bifactor EFA you need three factors to have 1 general factor and 2 specific factors; if there were only 2, you would have a general and a specific factor loading on the same items, and the model would not be identified. In CFA, 2 specific factors should be OK, but I suspect that this depends on having enough indicators. I have run some (informal) simulations, and more indicators seem to help with identification.
Mark
  • asked a question related to Applied Psychometrics
Question
3 answers
Closed
  • asked a question related to Applied Psychometrics
Question
4 answers
Can we model stress? Just curious about it.
(Keeping in mind that this is the era of artificial intelligence, machine learning, data science, and so on.)
Can we have a predictive model with stress and behaviour as KPIs?
Also, philosophically, there must be paths between stress, behaviour, emotions, intelligence, etc. Can we also test and find the coefficients for these paths?
Regards,
Abhay
Relevant answer
Answer
Dear Luca,
Thank you very much for references and suggestions.
Regards,
Abhay
  • asked a question related to Applied Psychometrics
Question
4 answers
For an assignment I need to imagine designing a scale to measure hypomanic symptoms and write about how this would be done.
Seeing as hypomanic episodes may be present for various periods of time (a few days, a few weeks, etc.), is it possible to measure test-retest reliability?
Thanks
Relevant answer
Answer
It depends on how the questions are asked. If the scale asks about hypomanic symptoms right now, or during the past week, test-retest reliability would not be appropriate. If it asks about the average level, or maximum level, of symptoms during an episode, then test-retest reliability is necessary to validate the scale.
Regards,
Simon Young
  • asked a question related to Applied Psychometrics
Question
6 answers
I am conducting a research project in which I am using an SEM model. My exogenous variable (world system position) is ordinal with 4 categories. I am not sure how creating so many dummy variables will work in an SEM model, so I would like to treat it as a continuous variable. But I am not sure whether I would be violating any statistical assumptions by doing this. Can somebody help me with suggestions on this?
Relevant answer
Answer
As the others point out, it is possible to incorporate an ordinal variable. It also seems you are interested in whether you can treat it as continuous and that is another matter. There are several considerations.
1) Continuous scales assume equivalent differences between intervals on the scale. This might not be valid if all you have is the rank order from the ordinal scale. For example, can you reasonably assume that the distance between "agree" and "strongly agree" is the same for all respondents to your questionnaire?
2) I would also consider the purpose of the study. How high are the stakes in your results?
3) How has the scale been treated in your discipline? It is very common in social science research to treat ordinal scales as continuous. What about the scale you are using? How have others treated it?
In the strictest sense, you have an ordinal scale. On the other hand, what are truly ordinal scales are often treated as continuous in social science research. Though it's not often explicitly discussed, I think one should be explicit about treating the scale as continuous if doing so is novel in your field. Knowing nothing about your research, to be precise I would treat the variable as ordinal and, if you are curious about that particular scale, conduct research on how reasonable it is to treat it as continuous, if such research doesn't already exist for your scale and population. Stevens (1946) is a starting point on this issue.
Best wishes
  • asked a question related to Applied Psychometrics
Question
4 answers
If you have run both EFA and CFA, should you calculate discriminant and convergent validity on the basis of the EFA or the CFA factor loadings? Please guide. I would appreciate it if you could provide any references.
Relevant answer
Answer
Dear Mazhar,
You can go to the following link. This will be helpful for you.
  • asked a question related to Applied Psychometrics
Question
6 answers
I am testing measurement invariance of a questionnaire. I am aware of the fact that some methodologists think it unnecessary to test for invariance of residual variances and covariances. However, the measure I am working with has a highly replicated correlated errors issue, and I would like to test its invariance since possible noninvariance could affect the measure's reliability for some groups. What's the correct sequence I should follow? Should correlated errors invariance be tested before or after intercept invariance? Intuitively, I think it would make sense to test it together or immediately before residual variances invariance, since both are usually considered under the label 'strict invariance'. However, the Mplus tutorial in Byrne's book suggests testing correlated residuals invariance after factor loadings invariance and before intercept invariance, and it also makes sense to me. Readable references would be very appreciated.
Relevant answer
Answer
After intercept invariance testing! If you have intercept invariance, your model is scalar invariant. Even that is often hard to establish. Usually you need to free some noninvariant intercepts, giving you a partial scalar invariance model, which is fully acceptable if at least two items have invariant intercepts. See the references in the enclosed article. Correlated-errors invariance is seldom required, but how to test it is described in the enclosed book (see link). Best of luck.
  • asked a question related to Applied Psychometrics
Question
6 answers
In my study involving the Student Adaptation to College Questionnaire (N = 104, 0 missing values), which uses a 9-point Likert scale, reliability in SPSS 23 is very low across all 4 subscales, especially the Personal-Emotional and Social Adjustment factors. But I checked what happens if the questions are not reverse-coded (in SPSS), and it surprisingly produced a highly reliable output, although this is meaningless without reverse-coding the questions. I have tried different methods of reverse coding, but the issue remains.
Relevant answer
Answer
 If Cronbach's alpha is higher when you do not correct the coding of reverse-worded items, this would indicate either (a) you have incorrectly coded responses (i.e., in fact you did not correctly reverse-score the reverse-worded items), or (b) there was some sort of response-set going on among participants (i.e., they were not really paying attention to the item wording, just the content/theme, and/or they were lazily ticking the same, or similar, score across all items.)
As others have implied, be intimate with your data - run the complete inter-item correlation matrix on the "final coded" data (i.e., with reverse-worded items reverse-scored). Then look at the pairwise item correlations. Any negative pairwise correlations would imply incorrect item coding. If all reverse-worded items correlate positively with each other, but negatively with all other items (and vice versa), this indicates that perhaps you have incorrectly coded your data (e.g., you reverse-scored the items, then reverse-scored them a second time, reverting to the original scoring).
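A minimal sketch of that checking routine in Python (hypothetical 9-point responses; the item names and the reverse-worded item are assumptions): reverse-score with (min + max) - x, then inspect the inter-item correlation matrix for negative entries:
```python
import pandas as pd

# Hypothetical 9-point Likert responses; item3 is worded in reverse.
df = pd.DataFrame({
    "item1": [8, 7, 2, 9, 6],
    "item2": [7, 8, 3, 9, 5],
    "item3": [2, 3, 8, 1, 4],
})

scale_min, scale_max = 1, 9
reverse_items = ["item3"]

# Reverse-score the reverse-worded items: new = (min + max) - old.
recoded = df.copy()
recoded[reverse_items] = (scale_min + scale_max) - recoded[reverse_items]

# Inspect the inter-item correlations; negative entries suggest coding problems.
print(recoded.corr().round(2))
```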
  • asked a question related to Applied Psychometrics
Question
2 answers
My team and I are implementing a project on assessing cyberbullying perceptions. We have decided to use FRS in order for respondents to answer our 8 situational items. 
Could you suggest me some practical applications of FRS based on sinusoidal functions?
Thank you very much, Dana
Relevant answer
Answer
Hi, Dana and Dibakar!
I don't know whether you need the sinusoidal shape for some additional reason. Although the statistical methods already developed for FRS data can be applied to sinusoidal and other data shapes, the trapezoidal one usually makes the computations easier and exact.
Kindest regards,
maria 
  • asked a question related to Applied Psychometrics
Question
16 answers
Hi there! 
I've just done Cronbach's Alpha for my UTAUT2 technology acceptance model. 
The results are good. One construct is at 0.78, and the rest are > 0.84. 
However, SPSS indicates that, with the deletion of some items, I can make some minor improvements (e.g. a jump from 0.87 to 0.89 for a scale). 
Should I purify my scale simply as an attempt to increase the Cronbach's Alpha for each one? Or will this cause other problems?
Could you please provide a reference which suggests best practice in this scenario?
Thank you! 
Sam 
Relevant answer
Answer
Be careful about getting stuck viewing the veins on the leaves of the trees and not the whole forest here. Certainly, deleting items may improve Cronbach's alpha, but does it improve the theoretical structure of the scale? Second, be cautious about a statistical fishing expedition in search of better statistics. Do you have a theoretical or other psychometric rationale for omitting these item(s)? Do they have low loadings, cross-loadings, or other problematic indicators? Those would be more potent reasons for omitting an item.
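If you want to reproduce SPSS's "alpha if item deleted" column outside SPSS before deciding, here is a small sketch in Python on toy data (the item names and data are hypothetical):

```python
# Cronbach's alpha and "alpha if item deleted" for a set of items.
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)"""
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    k = items.shape[1]
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(1)
common = rng.normal(size=(200, 1))                      # shared "trait" component
items = pd.DataFrame(common + rng.normal(size=(200, 5)),
                     columns=[f"q{i}" for i in range(1, 6)])

print("alpha (all items):", round(cronbach_alpha(items), 3))
for col in items.columns:
    print(f"alpha if {col} deleted:",
          round(cronbach_alpha(items.drop(columns=col)), 3))
```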
  • asked a question related to Applied Psychometrics
Question
7 answers
What is a good statistical approach for testing common method variance?
Relevant answer
Answer
Please let me know if this reference (freely downloadable) is helpful to you:
Podsakoff, P. M., MacKenzie, S. B., Lee, J.-Y., & Podsakoff, N. P. (2003). Common method biases in behavioral research: A critical review of the literature and recommended remedies. Journal of Applied Psychology, 88(5), 879-903.
From the abstract: "Interest in the problem of method biases has a long history in the behavioral sciences. Despite this, a comprehensive summary of the potential sources of method biases and how to control for them does not exist. Therefore, the purpose of this article is to examine the extent to which method biases influence behavioral research results, identify potential sources of method biases, discuss the cognitive processes through which method biases influence responses to measures, evaluate the many different procedural and statistical techniques that can be used to control method biases, and provide recommendations for ..."
Dennis Mazur
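As a concrete (if much-criticized) illustration of one statistical check discussed in that literature, Harman's single-factor test can be sketched as follows; the data here are random placeholders, so only the mechanics are shown:

```python
# Harman's single-factor test: if one factor/component accounts for the
# majority of the common variance across all items, method bias may be a
# concern. Podsakoff et al. stress this test is insensitive and should not
# be the only remedy.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(150, 12))               # hypothetical item responses

R = np.corrcoef(X, rowvar=False)             # inter-item correlation matrix
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]
first_component_share = eigenvalues[0] / eigenvalues.sum()

print(f"Variance explained by the first component: {first_component_share:.1%}")
```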
  • asked a question related to Applied Psychometrics
Question
10 answers
Hi, 
I'd really appreciate it if someone could help explain to me (in simple terms - I'm not a statistician!) why the factor loadings of items in an exploratory factor analysis (principal axis factoring with direct oblimin rotation) differ from the factor loadings of a confirmatory factor analysis when both are carried out on the same sample? (I'm aware that normally you would perform a CFA on a different sample, but that isn't my focus for this question.)
There isn't a massive difference in factor loadings, and from what I've read and seen in other papers this is normal, but I'm just trying to understand why? What does the CFA do that results in a different 'loading' of the items?
Any help would be appreciated :)
Relevant answer
Answer
Hello,
Short answer: they are different analyses. One explores relations, the other confirms them.
To my knowledge, EFA is used when you are testing new relations that have not been tested before, whereas CFA is used to confirm your relational model in a new sample. One concrete reason the loadings differ: in a typical CFA each item is allowed to load on only one factor (cross-loadings are fixed to zero), whereas an EFA estimates all cross-loadings, so the estimates will not be identical even in the same sample. The key point is that you cannot use the same data or responses to test both models at the same time. After the EFA, you need to collect responses from another sample from your study population to run the CFA and support the reliability of your model. That is the main reason most studies use research models already tested in the literature (e.g., TAM, UTAUT).
Hope my comments help your work.
Bests,
  • asked a question related to Applied Psychometrics
Question
3 answers
I've been assessing the reliability of constructs using both Cronbach's alpha and composite reliability (CR) scores. I found that Cronbach's alpha is higher than CR for the same construct, whereas I was expecting Cronbach's alpha to be a lower estimate than composite reliability. I wonder which one is more suitable for checking internal consistency.
Relevant answer
I'd be inclined to go with the standard in the research literature that you're working with. If the papers in your area have set a precedent for reporting internal consistency using Cronbach's alpha, then I would consider keeping the work in line with the standard in your field.
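For reference, composite reliability is usually computed from the standardized CFA loadings; a minimal sketch, assuming uncorrelated errors and hypothetical loadings:

```python
# Composite reliability: CR = (sum of loadings)^2 /
#                             [(sum of loadings)^2 + sum of error variances]
import numpy as np

loadings = np.array([0.72, 0.65, 0.81, 0.58])   # hypothetical standardized loadings
error_variances = 1 - loadings**2                # unique variances, assuming no correlated errors

cr = loadings.sum()**2 / (loadings.sum()**2 + error_variances.sum())
print(f"Composite reliability: {cr:.3f}")
```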
  • asked a question related to Applied Psychometrics
Question
8 answers
I have a questionnaire data set with seven categories or dimensions and 13 items across all the dimensions put together. The items, however, are distributed unequally across the categories: some have as many as 4 items and some have just 1. I would like to know how to combine these into seven variables.
I have just 41 observations altogether... I feel the low sample size rules out traditional methods such as factor analysis...
I would like to hear some ideas.
Relevant answer
Answer
I disagree with Jan! It depends on the quality of the data and the population. If the population is 100 and you tested 50, that is more than enough.
The other aspect is the quality of the data. If people really engaged with the survey (rather than just ticking everything in the middle), you have a chance to test your theory. You need to reduce the number of hypotheses. As long as your model fit indices are good, you can work with a smaller sample size.
If you can get more answers, very good; if not, there are still options. You will not be able to publish in a top journal (unless it is a VERY interesting sample or it covers half of your population), but you can go for a C-level journal. Again, it is all about the quality of your data and what you do with it.
So don't give up and keep researching!
Good luck!
Eugene
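If you do fall back on simple composites rather than factor analysis, forming the seven variables can be as plain as averaging the items belonging to each dimension. A small sketch with a hypothetical item-to-dimension mapping and toy data:

```python
# Build seven composite variables as the mean of the items in each dimension.
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
df = pd.DataFrame(rng.integers(1, 6, size=(41, 13)),
                  columns=[f"item{i}" for i in range(1, 14)])  # toy 1-5 responses

dimensions = {                     # hypothetical mapping of the 13 items
    "dim1": ["item1", "item2", "item3", "item4"],
    "dim2": ["item5", "item6"],
    "dim3": ["item7"],
    "dim4": ["item8", "item9"],
    "dim5": ["item10"],
    "dim6": ["item11", "item12"],
    "dim7": ["item13"],
}

composites = pd.DataFrame({name: df[cols].mean(axis=1)
                           for name, cols in dimensions.items()})
print(composites.head())
```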
  • asked a question related to Applied Psychometrics
Question
18 answers
Hi, 
I have data from 200 patients. I want to check the association between cognitive impairment and reduced quality of life. My variables are mostly categorical, such as sex, age, and cancer type. I also have numerical variables such as cognitive test scores and a quality of life survey score (Likert type). Can anybody tell me if a linear regression could be a good start, or what other analyses could be suitable when many of the variables are categorical?
Thank you very much as I’m beginning with this discipline,
Relevant answer
Answer
As others have said, you are doing quantitative analysis. Whether you use regression depends on how your variables are measured. For example, is your dependent variable, quality of life, measured as a scale with interval-level scoring? The same question applies to your independent variable, cognitive impairment.
If those are both interval-level variables, then you can use regression and treat your other variables as "control variables" (i.e., what is the effect of cognitive impairment, above and beyond other factors such as gender etc.).
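A hedged sketch of such a regression with control variables, using statsmodels and hypothetical variable names and toy data:

```python
# Quality of life regressed on a cognitive test score, with age and
# categorical controls (sex, cancer type) entered via C().
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 200
df = pd.DataFrame({
    "qol": rng.normal(50, 10, n),                       # quality of life score
    "cognition": rng.normal(100, 15, n),                # cognitive test score
    "age": rng.integers(30, 85, n),
    "sex": rng.choice(["F", "M"], n),
    "cancer_type": rng.choice(["breast", "lung", "colon"], n),
})

model = smf.ols("qol ~ cognition + age + C(sex) + C(cancer_type)", data=df).fit()
print(model.summary())
```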
  • asked a question related to Applied Psychometrics
Question
3 answers
Dear all,
my dependent and independent variables are measured on 7-point Likert scales and capture perceptions of effective leadership, therefore I assume my data are ordinal in nature.
The independent variables are six constructs with 5 questionnaire items each, which measure cultural values.
The dependent variables are 2 constructs with 19 items each, which measure the perception of effective leader attributes.
So I hypothesize that each of the culture dimensions are associated with perceived effective leader attributes. I have collected the data and intend to do the statistical analysis as follows:
1. Reliability Analysis - Cronbach's Alpha
2. Factor Analysis
But then I'm struggling between ANCOVA or Spearman's Rho to test association. I understand that ANCOVA is used for interval and Spearman for ordinal data.
Could you please advise on an appropriate method?
Many thanks,
Ahmad
Relevant answer
Answer
If your predictors were categorical, then analysis of covariance would be suitable. But you mention Likert scales, so presumably you have 1-7 ratings, in which case ANCOVA may not be the most appropriate choice.
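If you go the ordinal route, Spearman's rho between the composite scores is straightforward; a minimal sketch on toy data:

```python
# Spearman's rho between two Likert-based composite scores (ordinal-friendly).
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(5)
culture_score = rng.integers(1, 8, 120)                          # toy 7-point composite
leadership_score = np.clip(culture_score + rng.integers(-2, 3, 120), 1, 7)

rho, p_value = spearmanr(culture_score, leadership_score)
print(f"Spearman's rho = {rho:.2f}, p = {p_value:.4f}")
```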
  • asked a question related to Applied Psychometrics
Question
6 answers
Hello, 
I have seen many questions of this kind and tried the suggestions on my data, but no solution seems to address my problem.
I have 3 instruments of 13, 11, and 13 statements, respectively, each scored on a 1-5 scale. After recoding, most answers (80%) score around 4-5 (N = 100) and look quite similar. My Cronbach's alpha values, however, are around .229 to .430. I am surprised because when I combine all three instruments, Cronbach's alpha comes out at .642, whereas for each individual instrument it is lower than the required limit.
Any help will be highly appreciated. Thanks
Relevant answer
Answer
Alpha will rise with either (a) increasing the number of items, and/or (b) increasing the average intercorrelation among the items.  The results you mention suggest that all of the items have a modest average intercorrelation, so that when you combine them all, the large total number of items generates a reasonably high alpha.  However, it appears that the average intercorrelation within any subset of the items is also only modest at best, and therefore alpha is low because the subsets have fewer items.  
I suggest you look closely at the correlation matrix for each of your subsets. Are there any items that correlate noticeably low with the other items in the subset? If so, your alpha may well increase if you drop those items from the subset, leaving you with even fewer items but a higher average intercorrelation within that subset. Also, are there any items that correlate negatively with the others? If so, this will make it almost impossible to get a high alpha. If there are any, then perhaps those items are worded in the reverse (a positive answer means a negative attitude). In that case, you should reverse the coding on the reverse-worded items (recode 5 to 1, 4 to 2, etc.) before computing your alpha or forming your scale. If an item correlates negatively with the rest, but not because of reverse wording, then that item is not measuring the same thing as the other items in the subset. It should be dropped altogether.
Any items dropped from one subset might in fact correlate better with the items in one of the other subsets.  So it wouldn't hurt to take a look at the big correlation matrix, with all the variables included, to look for unexpected patterns like that.  
You could get to the same conclusions by doing an exploratory factor analysis, as others suggest, but it might be quicker and easier to just take a careful look at the correlation matrix for each subset, and for all the variables together.
I also agree with others that what all the items have in common might be a methods artifact, like social desirability response, rather than whatever you are trying to measure. If the subsets are supposed to be measuring different things, but only form an acceptably reliable scale when all combined together, then the subsets may lack not only internal consistency reliability, but also discriminant validity.  But you might still be able to rescue the subsets if examining the correlation matrices leads you to discover that the problem can be traced to a few bad (or reversed) items.
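A quick numerical illustration of the first point above (alpha rises with the number of items for a fixed average inter-item correlation), using the standardized-alpha formula; the average correlation value is hypothetical:

```python
# Standardized alpha: alpha = k * r_bar / (1 + (k - 1) * r_bar),
# where k is the number of items and r_bar the mean inter-item correlation.
def standardized_alpha(k, r_bar):
    return k * r_bar / (1 + (k - 1) * r_bar)

print(round(standardized_alpha(k=13, r_bar=0.05), 3))   # one subscale: low alpha (~.41)
print(round(standardized_alpha(k=37, r_bar=0.05), 3))   # all items combined: much higher (~.66)
```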
  • asked a question related to Applied Psychometrics
Question
12 answers
I am running a validation study in which I compare two measures of the same process. One variable is continuous, the other is categorical (5 increasing categories). I want to assess the agreement between the two measures, but am unsure what method to use... Can anyone help?
Relevant answer
Answer
Thanks very much everyone for all your suggestions. It will take the whole of the summer to process the electromyography data (it mostly needs to be done by hand). I will definitely try your suggestions. I suppose R would be preferred for @Joachim's suggestions?
It would be interesting to see whether the various statistical methods come to the same result as well :)
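One possible agreement analysis, offered only as a sketch: bin the continuous measure into the same five ordered categories as the categorical measure, then compute a quadratic-weighted kappa. The cut points and data below are hypothetical:

```python
# Quadratic-weighted Cohen's kappa between a binned continuous measure and a
# 5-category ordinal measure.
import numpy as np
import pandas as pd
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(6)
continuous = rng.uniform(0, 100, 80)                      # e.g., EMG-derived score
categorical = np.clip((continuous // 20).astype(int)      # toy 5-category rating
                      + rng.integers(-1, 2, 80), 0, 4)

binned = pd.cut(continuous, bins=[0, 20, 40, 60, 80, 100],
                labels=False, include_lowest=True)        # hypothetical cut points

kappa = cohen_kappa_score(binned, categorical, weights="quadratic")
print(f"Quadratic-weighted kappa: {kappa:.2f}")
```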
  • asked a question related to Applied Psychometrics
Question
3 answers
I had planned to complete a multiple regression but my SPSS knowledge is poor and I am struggling.
Relevant answer
Answer
One early step would be to examine the reliability of the scale formed by your Likert-scored items, using Analyze > Scale > Reliability Analysis.
Note that all your items need to be scored in the "same direction" for this analysis (i.e., no negative correlations).
  • asked a question related to Applied Psychometrics
Question
4 answers
Example: five questions measuring "Perceived Usefulness" (dependent variable), for each of which respondents have to select one of the following:
1= Strongly Disagree
2=Disagree
3=Neutral
4=Agree
5=Strongly Agree
6= Not Applicable
Relevant answer
Answer
Most Likert scales are composed of items with 5- to 7-point ratings. The original authors of those scales usually state how to score the scale and any subscales. To use Likert scales in inferential statistics, researchers typically sum the item scores and treat the total as an (approximately) interval-level measurement.
However, it is acceptable practice to work with the individual items in descriptive and non-parametric statistics.
The main issue in the case you presented is option #6 (NA): either treat it as missing or consult the original author's scoring guide.
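A small sketch of the "treat Not Applicable as missing" option in Python; the item names and values are hypothetical:

```python
# Recode the "Not Applicable" option (6) as missing, then score the scale as
# the mean of the remaining items for each respondent.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "pu1": [5, 4, 6, 3, 2],
    "pu2": [4, 4, 5, 6, 2],
    "pu3": [5, 3, 4, 3, 6],
    "pu4": [4, 5, 5, 2, 1],
    "pu5": [6, 4, 4, 3, 2],
})

items = df.replace(6, np.nan)                      # 6 = "Not Applicable" -> missing
df["perceived_usefulness"] = items.mean(axis=1, skipna=True)
print(df)
```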
  • asked a question related to Applied Psychometrics
Question
4 answers
Hi,
Does anyone know of, or have a reference for, what the standardised factor loadings (highlighted in the attached) should be when performing confirmatory factor analysis? Is it the same as the rule of thumb for factor loadings when performing an exploratory factor analysis (>.4)?
Thanks,
Emma.
Relevant answer
Answer
Hi Emma,
"Common variance, or the variance accounted for by the factor, which is estimated on the basis of variance shared with other indicators in the analysis; and (2) unique variance, which is a combination of reliable variance that is specifc to the indicator (i.e., systematic factors that influence only one indicator) and random error variance (i.e., measurement error or unreliability in the indicator)." (Brown, 2015).
it can be said that If the factor loading is 0.75, observed variable explains the latent variable variance of (0.75^2=0,56) %56. It is good measure. So if your factor loading is 0.40, it explains %16 variance. As a cut point 0.33 factor loading can be given. Beacuse of it explains %10 variance.
  • asked a question related to Applied Psychometrics
Question
6 answers
I want to find the correlation between a psychological well-being scale (5-point Likert type) and a mental health battery (2-point scale). How do I do this?
Relevant answer
Answer
Thank you so much to all of you.
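For later readers of this thread: if the mental health battery yields a dichotomous (0/1) score and the well-being scale a summed score, the point-biserial correlation (a special case of Pearson's r) is one common option; Spearman's rho is an alternative if you prefer to treat both as ordinal. A minimal sketch on toy data:

```python
# Point-biserial correlation between a dichotomous score and a continuous
# (summed Likert) score.
import numpy as np
from scipy.stats import pointbiserialr

rng = np.random.default_rng(7)
wellbeing_total = rng.integers(20, 100, 60)                     # toy summed 5-point items
mental_health = (wellbeing_total + rng.normal(0, 20, 60) > 60).astype(int)  # toy 0/1 score

r_pb, p_value = pointbiserialr(mental_health, wellbeing_total)
print(f"Point-biserial r = {r_pb:.2f}, p = {p_value:.4f}")
```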
  • asked a question related to Applied Psychometrics
Question
6 answers
Can any researcher send me a copy of the 12-item Procrastination Assessment Scale for Students questionnaire and its scoring manual or instructions?
Relevant answer
Answer
I am attaching two files from the author's web site: one consists of the complete PASS, the other of the scoring instructions (with additional information - basically a test manual).
Although this measure is much longer than what you want, I believe the 12-item version can easily be reconstructed based on the information provided in the manual. 
Section One of the scale measures the frequency of procrastination. It is 18 items long: 6 mini-sections, 3 items each. The "total score" for this section is the sum of the first two items in each section (1+2+4+5+7+8+10+11+13+14+16+17: 12 items!). This makes sense, because the third item in each section asks about the test-taker's desire to change the behavior in question. So although I'm not absolutely certain, it looks like these 12 items are the short form that is so widely used.
I suggest contacting the authors directly with any follow-up questions.
  • asked a question related to Applied Psychometrics
Question
8 answers
What is your suggestion for obtaining a reliability estimate for a Visual Analogue Scale?
Relevant answer
Answer
There was excellent work in 1987 by Jagodzinski in Sociological Methods on using quasi-simplex models to get estimates of the reliability and stability of single items. Tremendous papers, and still very relevant when trying to assess the psychometric properties of single items.
  • asked a question related to Applied Psychometrics
Question
3 answers
Note: there is no class interval for finding the lower and upper limits of a Likert scale.
Relevant answer
Answer
It is preferable not to take only the extreme readings on a Likert scale. If you have 5 points and want a dichotomous reading, you can combine 4 and 5 together and 1, 2, and 3 together.
  • asked a question related to Applied Psychometrics
Question
12 answers
Suppose the phenomenon we study occurs in about 3% of the population and we made a two-item screening instrument (with dichotomous items). If we think in terms of Rasch/IRT, what is the optimal item difficulty for these two items? And if it is contingent on their covariance, how do we test that?
Relevant answer
Answer
Excellent, thanks! 
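For later readers: a minimal sketch of why, under a Rasch model, a screening item is most informative when its difficulty sits near the trait level that separates the targeted 3% from the rest (about 1.88 SD on a standard normal trait); the difficulty values tried below are hypothetical:

```python
# Item information for a dichotomous Rasch item: I(theta) = P(theta) * (1 - P(theta)),
# which peaks when the item difficulty b equals the trait level theta.
import numpy as np

def rasch_probability(theta, b):
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def item_information(theta, b):
    p = rasch_probability(theta, b)
    return p * (1 - p)

theta_target = 1.88                      # ~97th percentile of a standard normal trait
for b in [0.0, 1.0, 1.88, 2.5]:          # hypothetical item difficulties
    print(f"b = {b:4.2f}  information at target = {item_information(theta_target, b):.3f}")
```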
  • asked a question related to Applied Psychometrics
Question
4 answers
Good afternoon,
I have a question about my thesis project. I am estimating the predictive validity of an assessment; however, I am a bit confused. The assessment consists of major factors, which in turn consist of smaller dimensions, which in turn consist of smaller constructs. I need to formulate hypotheses, but I do not know which level of the assessment they should address: major factors, dimensions, or smaller constructs?
Thank you! 
Relevant answer
Answer
Hi Maru,
This is a hard question and one that I don't think you will be able to get a concrete answer to on ResearchGate, at least without being very specific about the construct, dimensions, and outcomes you'd like to predict.
I suggest you schedule a meeting with your mentor and discuss the problem. Be prepared for an answer that consists of suggested readings rather than specific hypotheses.
Here are some other answers that might work for you:
(a) If this is an established instrument, read the original publication carefully for statements by the author(s) to determine the intended use of the instrument, which should implicitly or explicitly include statements about how the numbers derived from the tool should be useful for predicting outcomes.
(b) In my experience, it is better to formulate your hypotheses at the level of broader domains rather than specific ones. If all of your factors, dimensions, and constructs are positively correlated, you will be unlikely to be able to detect specific effects attributable to the different aspects of the measurement. And even if you do detect differences, they are unlikely to be reliable or reproducible.
Good Luck
Rich
  • asked a question related to Applied Psychometrics
Question
6 answers
I have translated the Ryff Psychological Well-being scale into Vietnamese and collected 800 responses from university students.  The main construct is psychological well-being, which is comprised of 6 distinct dimensions. I used the 54-item scale with 9 items per scale (9 times 6 = 54).
The items appear to be formative rather than reflective, and thus I am told that an EFA is not appropriate for formative items.
I want/expect to confirm that the Vietnamese version also shows the 6 dimensions found by previous research. So how should I proceed? I have not found a clear, standardized approach to this question. The videos and papers I have read do not seem to agree on the appropriate method.
Anyone who has done this kind of work I would love to hear from you.  Thanks.
Relevant answer
Answer
Dr. Ahmad,
Thanks for your response. The original scale identified six dimensions, but my initial EFA showed 17 factors. This may be because the items in each scale are formative rather than reflective in nature; I was told that EFA does not work on formative items. And the CFA indices do not support the original model. Do you see my situation?
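One thing worth checking before concluding that the structure has failed: an EFA that returns 17 factors is often relying on the eigenvalue-greater-than-1 rule, which tends to overextract. A hedged sketch of Horn's parallel analysis as one alternative check (the data below are random placeholders; substitute the 800 x 54 item matrix):

```python
# Horn's parallel analysis: retain only factors whose observed eigenvalues
# exceed the eigenvalues obtained from random data of the same size.
import numpy as np

rng = np.random.default_rng(8)
data = rng.normal(size=(800, 54))                    # placeholder item responses

def eigenvalues_of_corr(X):
    return np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]

observed = eigenvalues_of_corr(data)

n_sims = 100
random_eigs = np.array([eigenvalues_of_corr(rng.normal(size=data.shape))
                        for _ in range(n_sims)])
threshold = np.percentile(random_eigs, 95, axis=0)   # 95th-percentile criterion

n_factors = int(np.sum(observed > threshold))
print("Factors suggested by parallel analysis:", n_factors)
```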