Questions related to Applied Psychometrics
The AVE of the scale is below 0.5, while the rest of the parameters, viz. CR and discriminant validity, are above their threshold levels.
I came across different definitions of low, moderate and high growth, but I have not been able to find a reliable reference to cite where these definitions are presented.
Some suggest adding up the total score, defining scores below 45 as none to low and scores above 46 as moderate to high. Others suggest summing all the scores and calculating a mean score, which they then group as 1-3, 3-4, and 4-6 for low, moderate, and high, respectively.
Any references would be highly appreciated.
Dear Research Community,
I am asking for your participation and especially for your feedback on our Self-Assessment for Digital Transformation Leaders: https://t1p.de/mwod
The goal is to provide leaders with a mirror for reflecting on themselves and on the skills and personal attributes required for digital transformation. In the end, participants receive an integrated presentation of their results (see appendix).
- Are all questions understandable?
- Which questions lack precision?
- In your opinion (as a digital leader), are essential aspects still missing? If so, which ones?
I am looking forward to any kind of suggestions.
Is there anyone interested in helping with our analysis? We are working on a project relating to personality and friendship. Leave your email address if you are interested.
I doubt it is halfway between low and high arousal or halfway between negative and positive valence. Does anyone know where listeners rate emotionally "neutral", conversational speech?
I'm doing a split-half estimation on the following data:
trial one: mean = 5.12 (SD = 5.76)
trial two: mean = 7.62 (SD = 8.5)
trial three: mean = 8.57 (SD = 12.66)
trial four: mean = 8.11 (SD = 10.7)
(SD = standard deviation)
I'm creating two subset scores (from trials one and two, and from trials three and four; I realise this is not the usual odd/even split):
Subset 1 (t1 & t2): mean = 12.73 (SD = 11.47)
Subset 2 (t3 & 4): mean = 16.68 (SD= 17.92)
I'm then computing a correlation between these two subsets, after which I'm computing the reliability of this correlation using the Spearman-Brown formulation.
However, in the literature I've found, it all suggests that the data must meet a number of assumptions, specifically that the mean and variance of the subsets (and possibly the items of these subsets) must all be equivalent.
As one source states:
“the adequacy of the split-half approach once again rests on the assumption that the two halves are parallel tests. That is, the halves must have equal true scores and equal error variance. As we have discussed, if the assumptions of classical test theory and parallel tests are all true, then the two halves should have equal means and equal variances.”
Excerpt From: R. Michael Furr. “Psychometrics”. Apple Books.
My question is: must variances and means be equal for a split-half estimate of reliability? If so, how can equality be tested? And is there a guide to how similar the means may be (surely the means and variances across subsets cannot be expected to be exactly equal)?
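For what it's worth, the computation itself is straightforward. Here is a minimal Python sketch (with simulated subset scores, since only summary statistics are given above) of correlating two half-test scores and stepping the result up with the Spearman-Brown formula:

```python
import numpy as np

def split_half_reliability(subset1, subset2):
    """Split-half reliability: correlate the two half-test scores,
    then step the correlation up with the Spearman-Brown formula."""
    r = np.corrcoef(subset1, subset2)[0, 1]   # Pearson r between halves
    return (2 * r) / (1 + r)                  # Spearman-Brown prophecy

# Hypothetical per-participant subset scores (t1+t2 and t3+t4),
# simulated as a common true score plus independent error
rng = np.random.default_rng(0)
true_score = rng.normal(10, 5, size=50)
s1 = true_score + rng.normal(0, 3, size=50)
s2 = true_score + rng.normal(0, 3, size=50)
print(round(split_half_reliability(s1, s2), 3))
```

As a side note, if the equal-variance assumption is a worry, the Flanagan-Rulon approach, which estimates reliability from the variance of the difference between the two halves, is usually cited as not requiring equal half variances.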
Hello, I have a questionnaire that consists of four sections, with each section focusing on different variables.
First, each section has 9-10 items, with each item following a different scale. For instance, the first section has 10 items with no Likert scale, and the participants have to choose from two, three, or more specific options. The second section has 9 items, with the first five items on a six-point Likert scale, while for the remaining items the respondents have to choose from four specific options. The third section has 10 items, each on a six-point Likert scale. The fourth section has 9 items with no Likert scale, and the participants have to choose from three, four, or more specific options.
Second, in some of the items the respondents were also allowed to select multiple answers for the same item.
Now my question is: how do I calculate Cronbach's alpha for this questionnaire? If Cronbach's alpha cannot be calculated, what are the alternatives for assessing the reliability and internal consistency of the questionnaire?
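For reference, here is a minimal sketch of the standard Cronbach's alpha computation on toy data. One common workaround when response formats differ, as they do here, is to standardize items (z-scores) before summing, which amounts to the "standardized" alpha; whether alpha is meaningful at all for a deliberately heterogeneous questionnaire is exactly the open question.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x k_items) array."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # per-item variances
    total_var = items.sum(axis=1).var(ddof=1)      # variance of total score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Toy data: 6 respondents x 3 items on the same response format
x = np.array([[4, 5, 4],
              [2, 2, 3],
              [5, 5, 5],
              [3, 4, 3],
              [1, 2, 2],
              [4, 4, 5]])
print(round(cronbach_alpha(x), 3))
```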
I was wondering if I should calculate the ceiling and floor effects for each item separately, or just for the total score of my questionnaire?
Thanks in advance,
What if the Cronbach's alpha of a 4-item scale measuring a control variable is between .40 and .50 in your research, while the same scale received a Cronbach's alpha of .73 in previous research?
Do you have to make some adjustments to the scale, or can you use it because previous research showed it is reliable?
What do you think?
I am using the technical manual and am unsure if I am doing it correctly, or am even able to: I'm trying to calculate linear T-scores for a scale that is not part of the test. The CRIN scale is on the MMPI-A-RF but not the MMPI-2-RF. I am doing a project that could really benefit from having the same linear T-scores for CRIN alongside what I have in my dataset for the other validity scales. Please and thank you.
I am working on developing a new scale. On running the EFA, only one factor emerged clearly, while the other two factors were messy, with items loading across the different factors.
1. Is it possible to remove the cross-loading items one by one, re-running the analysis each time, to reach a better factor structure?
2. If multiple items still load on one factor, what criteria should I use to determine what this factor is?
I am trying to obtain permission to use the Muller/McCloskey scale on job satisfaction among nurses, and in the meantime I would love to see how it is scored.
I found references for factor loadings, correlations of the subscales, etc., but need to look at the scoring, which I think builds into a continuous variable.
Thanks in advance to anybody who can help.
How many (what percentage of) 'unengaged' respondents would you allow in your respondent database, and what criteria would you employ to decide whether to eliminate or keep them?
With respect to Likert scales, there are times, especially in lengthy self-report questionnaires, when respondents provide 'linear' answers, i.e., they constantly give the same rating to all questions, regardless of whether the questions are reversed or not.
Some call these 'unengaged' respondents. This is rather a 'friendly' term, since there can also be malicious intent in providing such answers, but that is another discussion. However we label them, the first direct effect on the data is reduced variability.
There may be other effects as well (feel free to list them based on your experience or knowledge), which can affect the relations between the constructs and the final inferences.
Thus, how do you proceed in these situations and what criteria do you base your decision on (supporting arguments are welcome and references to written texts are especially appreciated)?
(Edit): I realised that the original question may induce some confusion. This is not a case that requires substitution (although, in general terms, it may be). Please consider that all cases are complete (all respondents answered each item). The problem lies within the pattern of responses. Some respond in a straight line, hence the name 'straightlined' answers; others in various patterns (zig-zagging, for instance), hence the name 'patterned'. While for scales which include reversed items some cases (for instance, 'straightlined' ones) can easily be spotted, for scales without reversed items this is harder to do. However, the question pertains to both situations (scales with and without reversed items).
Another particularity of the question is that I am less interested in 'how to identify' these cases (methods) and much more in 'what to do'/'how to deal' with them, i.e., what criteria, rules of thumb, or best practices to consider.
Responses including references to academic papers or discussions are much appreciated!
Two outcome measures, the Numeric Pain Rating Scale and the Pain Self-Efficacy Scale, are both in the public domain; however, I need written confirmation of this.
Thanks for your help.
The MID is based on the standard deviation; consequently, I should not use it with a non-normal distribution. Am I right? If so, what may replace the MID?
I've recently come across several published articles where questionnaires composed of more than one section (where every section is meant to gather data for a separate variable/dimension of the study) are applied to a sample in order to calculate each section's coefficient AND the coefficient of the entire questionnaire.
What could be the use of a coefficient calculated for an intentionally multi-dimensional instrument?
I am researching the appropriate cut-off point measurement for the five anxiety scales in cancer patients.
But I cannot draw an overall ROC curve with SPSS for all scales.
How can I draw this curve, like the example below?
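Outside SPSS, the per-scale AUCs behind such a plot are easy to compute directly. A minimal numpy sketch (with hypothetical scores and case/non-case labels) using the rank formulation of the AUC:

```python
import numpy as np

def roc_auc(scores, labels):
    """AUC via the rank (Mann-Whitney) formulation: the probability
    that a randomly chosen case scores higher than a non-case."""
    scores, labels = np.asarray(scores, float), np.asarray(labels)
    pos, neg = scores[labels == 1], scores[labels == 0]
    # Compare every case with every non-case; ties count half
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties

# Hypothetical data: anxiety status and two of the five scale scores
labels = np.array([1, 1, 1, 0, 0, 0, 0, 1])
scale_a = np.array([8, 7, 9, 3, 4, 2, 5, 6])
scale_b = np.array([5, 6, 4, 4, 3, 5, 2, 7])
for name, s in [("scale A", scale_a), ("scale B", scale_b)]:
    print(name, round(roc_auc(s, labels), 3))
```

In SPSS itself, entering all five scale scores together as Test Variables in the ROC Curve dialog should overlay the curves in a single chart.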
I am working on a new questionnaire and I would like to verify measurement invariance. I have conducted a multi-group CFA and am now verifying metric invariance. I have computed factor loadings for each item in both groups (women and men), but I have not found any criteria for assessing whether these factor loadings are appropriately similar in both groups.
I have not found any strict criteria for the acceptable level of difference between factor loadings for each item or (on average) for the whole scale (or the whole questionnaire). Some items have similar factor loadings, but in a few cases they are different. It also happens that within a scale one item has a higher factor loading for men and another for women.
Do you know any strict criteria for assessing metric invariance?
I am very much interested in how to convert ordinal scale data into interval scale data. What are your suggestions?
Let's say I have data measured on a 4-point Likert scale (1 = strongly disagree to 4 = strongly agree). While these data are clearly ordinal, what would be the best option for converting them into an interval scale?
The Method of Successive Intervals (MSI)?
The Rasch model?
Your comments/suggestions are highly appreciated
I am using three questionnaires
1) 6 items on a 6-point Likert scale (professional self-doubt, a subscale of the Development of Psychotherapists Common Core Questionnaire)
2) 10 items on a 6-point Likert scale (developmental experience, a subscale of the Development of Psychotherapists Common Core Questionnaire)
3) 14 items on a 5-point Likert scale (Warwick-Edinburgh Mental Wellbeing Scale)
These are already established scales
My dissertation supervisor advised me that I needed to calculate Cronbach's alpha before distributing my questionnaires, but I was unsure how to do this. (I have contacted her about it, but she is currently on annual leave.)
I ended up distributing the questionnaires, as I had already received ethical approval and was allowed to start. I calculated Cronbach's alpha using the responses of 64 participants.
Question 1) Is this enough participants?
Question 2) How could I have calculated Cronbach's alpha before the questionnaires were sent out?
Question 3) I have recruited 107 participants; is this enough? (I am using a cross-sectional, quantitative design with multiple regression analysis.)
Thank you for taking the time to read this, I apologise in advance if any of this does not make sense.
I am designing a student profiling system (in university-level technical education) in which student competencies are mapped and used for further intervention. I want to test achievement motivation, study habits, engineering aptitude, and English proficiency. I have selected the following tests. Are these good enough? Which can I use for English proficiency?
1. Study Habit Inventory by M.N. Palsane and S. Sharma
2. Engineering Aptitude Test Battery by Swarn Pratap
3. English Language Proficiency Test by K S Misra & Ruchi Dubey
Hi all! I'm conducting a study with a path model wherein most latent variables have polytomous indicators (ordinal data). However, one instrument I intend to use measures a construct with a mix of dichotomous and polytomous items. My question is: can I somehow model a single latent variable with both these types of indicators? I've seen models where a factor has either dichotomous or polytomous indicators, but not both together.
(I planned on using an estimation method that performs better for these types of data, such as WLSMV.)
We can say that these factor-analytic approaches are generally used for two main purposes:
1) a more purely psychometric approach in which the objectives tend to verify the plausibility of a specific measurement model; and,
2) with a more ambitious aim in speculative terms, applied to represent the functioning of a psychological construct or domain, which is supposed to be reflected in the measurement model.
What do you think of these general uses?
The opinion given by Wes Bonifay and colleagues can be useful for the present discussion:
I'm an undergraduate student taking a course in test construction as well as research. The construct we are studying is mental toughness, and we plan on using the MTQ-48 to assess it. I am aware that there is a technical manual for the MTQ-48 available online; unfortunately, it does not contain a scoring procedure, nor information about which items belong to which subscale. Using other references, what we have found is that the MTQ-48 is scored on a 5-point Likert scale, with 1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, 5 = strongly agree. We also know which items fall under each subscale, such as items 4, 6, 14, 23, 30, 40, 44, and 48, which fall under the challenge subscale. However, we were not able to find a reference stating which items are negatively scored. While we could make an educated guess, we are required to have a source for the scoring procedure. If anyone here has such a reference, or knows the items, it would be highly appreciated. Thanks!
If the original questionnaire is validated, then a linguistic validation is sufficient.
If you think statistical revalidation of a translated version is mandatory: can you give an example where statistical validation of a translated questionnaire was not successful? Can you explain how statistical validation may change the strategy of the whole process?
I'm currently designing a questionnaire to measure a variable. There are 5 questions related to the variable; 4 of them are on a 5-point and 1 of them is on a 6-point Likert scale (based on the literature review). Will this be a problem for further analysis of this variable? For your information, this variable will be a mediator in the main model.
I'm trying to specify McDonald's omega. I'm using Mplus (based on the FAQ guide for reliability: https://www.statmodel.com/download/Omega%20coefficient%20in%20Mplus.pdf).
It works perfectly for regular models, but when I try to use it with factor-analysis random-intercept models, I get omegas higher than 1.00 for some factors.
That’s my model (not the syntax, only a brief approach to the example):
f1 by item1-item5
f2 by item6-item10
fRI by item1-item10
fRI with f1-f2@0
Based on the Mplus FAQ, omega is calculated as (sum of loadings)^2 / ((sum of loadings)^2 + sum of residual variances), right?
So, I wonder: is it 'OK' to have an omega higher than 1.00, and what would it mean?
Also, there's an example of my Omega’s calcs (again, not my syntax, just a brief exercise of what I’ve been trying to do):
OmegaFactor1 = (loaditems1to5)^2/((loadsitems1to5)^2+resvarianceitems1to5)
OmegaFactor2 = (loaditems6to10)^2/((loadsitems6to10)^2+resvarianceitems6to10)
OmegaRandomIntercept = (loaditems1to10)^2/((loadsitems1to10)^2+resvarianceitems1to10)
I’m not sure if that approach is correct (even if Omega should be used or not in that case), but I do have an intuition that it’s happening because I’m lacking to specify in my model constraint ‘resvariance’ for Factor 1 and Factor 2, once the Random Intercept Factor is ‘interacting with it’.
Could anyone give me a tip about that theme?
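Not a definitive answer, but one way to sanity-check the hand calculation: for a group factor in a bifactor/random-intercept model, the general factor's loadings arguably belong in the denominator too, since they contribute to the items' model-implied variance. Also, (sum of loadings)^2 / ((sum of loadings)^2 + sum of residual variances) cannot exceed 1.00 when all residual variances are non-negative, so an omega above 1.00 usually points to a negative residual-variance estimate (a Heywood case) in the solution. A sketch with hypothetical estimates:

```python
import numpy as np

def omega_subscale(load_specific, load_general, res_var):
    """Omega for a group factor in a bifactor / random-intercept model.
    The general (random-intercept) factor's contribution is included in
    the denominator as part of the items' total model-implied variance."""
    num = np.sum(load_specific) ** 2
    den = num + np.sum(load_general) ** 2 + np.sum(res_var)
    return num / den

# Hypothetical estimates for items 1-5 (loadings on f1 and on fRI)
lam_f1 = np.array([0.6, 0.7, 0.5, 0.6, 0.7])
lam_ri = np.array([0.3, 0.3, 0.3, 0.3, 0.3])
theta = np.array([0.4, 0.3, 0.5, 0.4, 0.3])
print(round(omega_subscale(lam_f1, lam_ri, theta), 3))
```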
I'm looking for an intelligence test to examine verbal and non-verbal intelligence of children (like WISC or K-BIT) that can be administered in a group setting. My main interest is not to determine a precise IQ, but to obtain a reliable measure of cognitive functioning to include as control variable and compare different groups of school-aged children. K-BIT test would be ideal, since it does not take too much time and includes two vocabulary and one matrices subtest, but I don't know if it can be simultaneously administered to a group of children.
Thank you so much for your attention!
The effect size in statistics reflects the magnitude of a measured phenomenon. It may be expressed as a coefficient of determination (R2), which states how much variance can be explained by the controlled parameter.
If a direct and simple relationship between the variables is claimed (e.g., concentration of a measurand vs. absorbance in the calibration plot for an ELISA), the effect size may be close to 100%. Similarly, in pharmacology, the effect size of a relationship between direct phenomena may significantly exceed 50%.
In social sciences, however, where the links between the observed phenomena are not so simple and largely multifactorial, I suspect, the effect sizes are much smaller.
How large must the effect size (expressed as R2) be to be considered relatively "large" in the social sciences? Any examples? Any references?
Thank you for your help!
I have recently started a deep dive into the gamification of psychological concepts (like Pymetrics, Cognifit, Lumosity, etc.) to help me understand it, and I am having difficulty finding literature that might help me adapt these concepts into games. Can someone please help me find any literature on this, or suggest how to proceed?
For testing national culture theory, for example, there are four dimensions, each with a set of related items. From the sample we get, I would think it better to analyse each dimension for its own reliability, KMO, and Bartlett's test, and not for the overall instrument (all four dimensions together), since it is clearly a multidimensional instrument and not unidimensional. Please advise?
Does anyone know which are the 10 response bias items in the Narcissistic Injury Scale (Slyter, 1991)? I have found 50 items for this scale; 10 of them are for response set bias, and 2 items were dropped by the researchers. So only 38 items need scoring, but I don't know which 38 items out of those 50.
Thank you for your help in advance.
I'm designing an interdisciplinary study (with public health, statistics, psychology, etc.) on diseases, stigma, discrimination, mental health, and quality of life.
So I'm a bit confused about what the ultimate construct for human life should be; I mean, what must be there for a human being?
Quality of life is sometimes observed as secondary; primarily, people should have a moving life.
The whole set of United Nations Sustainable Development Goals talks about society and people's wellbeing.
But what is the one most important factor for humans?
Please share your thoughts.
I conducted a principal components analysis on a subset of my data, then used the remaining participants to conduct a confirmatory factor analysis. Was this correct, or could I have used the entire sample for the CFA, even though some participants were used for the PCA?
Please note that I did not conduct an exploratory factor analysis (EFA). My goal was specifically to reduce the number of variables into the proposed constructs, therefore I am confident that I needed a PCA, not an EFA. And I know that with EFA you should definitely run the CFA on a different sample. But does this rule apply to PCA too, when you need to run a CFA afterwards?
So essentially, what I want to know is: can I conduct a PCA and CFA on the same data or not? Any literature citations you may have are greatly appreciated. Thank you.
There were ten questions in a pre-intervention test given to 40 participants. A correct answer was marked 1 and a wrong one 0. The content validity was assessed by a subject expert. Can anyone suggest ways to improve the reliability value? The output is attached as a Word document. Thank you.
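Since the items are dichotomous (1/0), the usual internal-consistency index is KR-20, which is Cronbach's alpha specialized to dichotomous items. A minimal sketch with a toy response matrix:

```python
import numpy as np

def kr20(responses):
    """Kuder-Richardson 20 for an (n_examinees x k_items) 0/1 matrix;
    algebraically equal to Cronbach's alpha for dichotomous items."""
    x = np.asarray(responses, dtype=float)
    k = x.shape[1]
    p = x.mean(axis=0)                      # proportion correct per item
    q = 1 - p
    total_var = x.sum(axis=1).var(ddof=0)   # variance of total scores
    return (k / (k - 1)) * (1 - (p * q).sum() / total_var)

# Toy matrix: 5 examinees x 4 items
x = [[1, 1, 0, 1],
     [0, 0, 0, 1],
     [1, 1, 1, 1],
     [0, 1, 0, 0],
     [1, 1, 1, 1]]
print(round(kr20(x), 2))
```

Improving the value usually means revising or dropping items with low or negative item-total correlations, or adding items that tap the same content.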
Recently I've been reviewing how to handle social desirability in testing.
After much theoretical review, I have come to the conclusion that the best way to do this is to neutralize the items with respect to social desirability. The format I will use is a Likert scale.
For example, an item that says "I fight with my coworkers" would be transformed into "Sometimes I react strongly to my coworkers" (the second is somewhat more neutral).
The idea comes from the work done by Professor Martin Bäckström.
Now the question I have is: is there any methodology that can help with this neutralization?
If not, what would be good ways to achieve it? What elements should I consider?
I think a good idea might be to "depersonalize" the item. For example, instead of "I fight with my bosses", it would become "I think an employee has the right to fight with his or her boss".
Another option I've thought of is to modify the frequency. For example, instead of "I get angry easily", I'd use "Sometimes I get angry easily".
However, I do not know if these options would affect the validity of the item to measure the construct.
Thank you so much for the help.
Currently, I am developing longitudinal research applying psychometric questionnaires to the same sample at two time points. However, I am looking for bibliography or evidence about which is the stronger statistical test, the paired-sample t-test or the Wilcoxon signed-rank test, and about the assumptions behind the application of each.
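As a rough illustration of what each test assumes, here is a minimal numpy sketch (hypothetical pre/post scores) computing the paired-t statistic, which assumes approximately normal difference scores, and the Wilcoxon signed-rank statistic, which only assumes the differences are symmetric about their median:

```python
import numpy as np

def paired_t(x1, x2):
    """Paired-sample t statistic: mean difference over its standard error."""
    d = np.asarray(x1, float) - np.asarray(x2, float)
    return d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))

def wilcoxon_w(x1, x2):
    """Wilcoxon signed-rank W: rank the absolute nonzero differences
    (average ranks for ties) and sum the ranks of positive differences."""
    d = np.asarray(x1, float) - np.asarray(x2, float)
    d = d[d != 0]                       # zero differences are dropped
    order = np.abs(d).argsort()
    ranks = np.empty(len(d))
    ranks[order] = np.arange(1, len(d) + 1)
    for v in np.unique(np.abs(d)):      # average ranks over tied |d|
        mask = np.abs(d) == v
        ranks[mask] = ranks[mask].mean()
    return ranks[d > 0].sum()

t1 = np.array([12, 15, 11, 18, 14, 13, 16, 12])  # time 1 (hypothetical)
t2 = np.array([14, 16, 11, 21, 15, 13, 19, 15])  # time 2
print(round(paired_t(t1, t2), 2), wilcoxon_w(t1, t2))
```

When the differences are close to normal, the paired t-test has slightly more power; when they are skewed or essentially ordinal, the Wilcoxon test is the safer choice.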
Is there any possible way?
I understand that if the options point to the same trait, it can be done. For example, a question of the type:
I work better:
(b) with other persons
Either of the two options is valid for the person (helping avoid bias), and if, for example, I'm measuring the trait of teamwork, I may think that a person who selects option (b) will have a higher degree of the teamwork trait. Am I making a mistake in assuming this?
Now, is there any way to do this when the response options point to different traits? I want to be able, based on the data from forced-choice items, to carry out normative analysis (to be able to compare subjects with one another).
PS: I'm clear that with ipsative items you can't make comparisons between people; however, if you manage the scoring in a different way, could you do it somehow?
I've been working on common psychometric tests. In other words: the traits to be measured are selected, some key behaviours are chosen, and subjects respond on a Likert scale. After this, I look at validity and reliability, take a normative group, and calculate the scores (working with classical test theory), etc.
But now I want to do something different: I want to do a test in which they have multiple response options and all of them can be correct. For example, an item could be of the type:
Choose the phrase with which you identify most (only one).
a) I am respectful
b) I am honest
c) I am a good worker
I have seen several of these tests, but I would like to know how to calculate their reliability, validity, and score tables.
I understand that at least three first-order factors are needed for identifying a second-order, hierarchical CFA model. I also read in Reise et al. (2010) that at least three group factors are suggested for bifactor EFA. Does the same guideline apply to bifactor CFA? I've seen many influential papers that use bifactor CFA with only two group factors, but I want to make sure this is a correct decision. References are always very appreciated.
Can we model stress? Just curious about it.
(Keeping in mind that this is the era of artificial intelligence, machine learning, data science, and so on...)
Can we have a predictive model of stress and behaviour as KPIs?
Also, philosophically there must be paths between stress, behaviour, emotions, intelligence, etc. Can we also test these paths and estimate their coefficients?
For an assignment I need to imagine designing a scale to measure hypomanic symptoms, and write about how this would be done.
Seeing as hypomanic episodes may be present for various periods of time (a few days, a few weeks etc.) is it possible to measure test-retest reliability?
I am conducting a research project in which I am using an SEM model. My exogenous variable (world-system position) is ordinal with 4 categories. I am not sure how creating so many dummy variables will work in an SEM model, so I would like to treat it as a continuous variable. But I am not sure whether I will be violating any statistical assumption by doing this. Can somebody offer a suggestion on this?
If you have run both EFA and CFA, should you calculate discriminant and convergent validity on the basis of the EFA or the CFA factor loadings? Please guide me. I would appreciate it if you could provide any references.
I am testing measurement invariance of a questionnaire. I am aware of the fact that some methodologists think it unnecessary to test for invariance of residual variances and covariances. However, the measure I am working with has a highly replicated correlated errors issue, and I would like to test its invariance since possible noninvariance could affect the measure's reliability for some groups. What's the correct sequence I should follow? Should correlated errors invariance be tested before or after intercept invariance? Intuitively, I think it would make sense to test it together or immediately before residual variances invariance, since both are usually considered under the label 'strict invariance'. However, the Mplus tutorial in Byrne's book suggests testing correlated residuals invariance after factor loadings invariance and before intercept invariance, and it also makes sense to me. Readable references would be very appreciated.
In my study using the Student Adaptation to College Questionnaire (N = 104, no missing values) with a 9-point Likert scale, reliability is very low across all 4 subscales in SPSS 23, especially the Personal-Emotional and Social Adjustment factors. I checked whether it works when the questions are not reverse coded (in SPSS), and surprisingly that produced a highly reliable output, although it is meaningless without reverse coding the questions. I have tried different methods of reverse coding, but the issue remains.
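One thing worth double-checking is the reversal formula itself. For a 9-point scale, a sketch of the standard rule:

```python
import numpy as np

# Reverse-code on a 9-point scale: a response of 1 becomes 9, 9 becomes 1.
# The general rule is reversed = (scale_min + scale_max) - original.
SCALE_MIN, SCALE_MAX = 1, 9

def reverse_code(scores):
    scores = np.asarray(scores)
    return (SCALE_MIN + SCALE_MAX) - scores

item = np.array([1, 9, 5, 7, 2])
print(reverse_code(item))          # -> [9 1 5 3 8]
```

After correct reversal, negatively worded items should correlate positively with their subscale total; inspecting the corrected item-total correlations in SPSS's reliability output will show which items still misbehave.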
My team and I are implementing a project on assessing cyberbullying perceptions. We have decided to use FRS so that respondents can answer our 8 situational items.
Could you suggest some practical applications of FRS based on sinusoidal functions?
Thank you very much, Dana
I've just computed Cronbach's alpha for my UTAUT2 technology acceptance model.
The results are good. One construct is at 0.78, and the rest are > 0.84.
However, SPSS indicates that, with the deletion of some items, I can make some minor improvements (e.g. a jump from 0.87 to 0.89 for a scale).
Should I purify my scales simply as an attempt to increase the Cronbach's alpha of each one? Or will this cause other problems?
Could you please provide a reference which suggests best practice in this scenario?
I'd really appreciate it if someone could explain to me (in simple terms; I'm not a statistician!) why the factor loadings of items in an exploratory factor analysis (principal axis factoring with direct oblimin rotation) differ from the factor loadings of a confirmatory factor analysis when both are carried out on the same sample. (I'm aware that normally you would perform a CFA on a different sample, but that isn't my focus for this question.)
There isn't a massive difference in the factor loadings, and from what I've read and seen in other papers this is normal, but I'm trying to understand why. What does the CFA do that results in a different 'loading' of the items?
Any help would be appreciated :)
I've been assessing the reliability of constructs using both Cronbach's alpha and composite reliability (CR) scores. But I found that Cronbach's alpha is higher than the CR for the same construct, whereas I was expecting Cronbach's alpha to be a lower estimate than composite reliability. I wonder which one is more suitable for checking internal consistency.
I have a questionnaire data set which has seven categories or dimensions and 13 items across all the dimensions put together. The items, however, are distributed unequally across the categories: in some there are as many as 4 items, and in some just 1. I would like to know how to combine these into seven variables.
I have just 41 observations altogether, and I feel the low sample size rules out traditional methods such as factor analysis.
Would like to hear some ideas.
I have data from 200 patients. I want to check the association between cognitive impairment and diminished quality of life. My variables are mostly categorical, such as sex, age, and cancer type. I also have numerical variables, such as cognitive test scores and quality-of-life survey scores (Likert-type). Can anybody tell me whether a linear regression could be a good start, or suggest some other analysis? What other analyses could be suitable for this kind of data?
Thank you very much, as I'm just beginning in this discipline.
My dependent and independent variables are on 7-point Likert scales and measure perceptions of effective leadership; therefore, I assume my data are of an ordinal nature.
The independent variables are six constructs with 5 questionnaire items each, which measure cultural values.
The dependent variables are 2 constructs with 19 items each, which measure the perception of effective leader attributes.
So I hypothesize that each of the culture dimensions are associated with perceived effective leader attributes. I have collected the data and intend to do the statistical analysis as follows:
1. Reliability Analysis - Cronbach's Alpha
2. Factor Analysis
But then I'm torn between ANCOVA and Spearman's rho to test the association. I understand that ANCOVA is used for interval data and Spearman for ordinal data.
Could you please advise on an appropriate method?
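If the data are treated as ordinal, Spearman's rho is the more defensible choice for a simple association; it is just the Pearson correlation of the ranks, as this small sketch (hypothetical construct scores) shows:

```python
import numpy as np

def rankdata_avg(x):
    """Ranks with average ranks assigned to ties."""
    x = np.asarray(x, float)
    order = x.argsort()
    ranks = np.empty(len(x))
    ranks[order] = np.arange(1, len(x) + 1)
    for v in np.unique(x):
        mask = x == v
        ranks[mask] = ranks[mask].mean()
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return np.corrcoef(rankdata_avg(x), rankdata_avg(y))[0, 1]

# Hypothetical construct scores (item means of the 7-point items)
culture = np.array([3.2, 4.1, 5.0, 2.8, 4.6, 3.9])
leader = np.array([4.0, 5.5, 4.4, 3.1, 5.2, 4.1])
print(round(spearman_rho(culture, leader), 3))
```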
I have seen many questions of this kind and tried the suggestions on my data, but no solution seems to address my problem.
I have 3 instruments of 13, 11, and 13 statements. I scored them on a 1-5 scale. After recoding, most answers (80%) score around 4-5 (N = 100) and seem quite similar. My Cronbach's alpha, however, ranges from about .229 to .430. I am surprised because when I combine all three instruments, Cronbach's alpha comes out at .642; for each individual instrument, however, it is lower than the required limit.
Any help will be highly appreciated. Thanks
I am running a validation study in which I compare two measures of the same process. One variable is continuous; the other is categorical (5 increasing categories). I want to assess the agreement between the two measures, but am unsure which method to use. Can anyone help?
I had planned to complete a multiple regression but my SPSS knowledge is poor and I am struggling.
Example: five questions measuring "Perceived Usefulness" (the dependent variable), for which respondents have to select one option each:
1= Strongly Disagree
6= Not Applicable
Does anyone know, or have a reference for, what the standardised factor loadings (highlighted in the attachment) should be when performing confirmatory factor analysis? Is it the same as the rule of thumb for factor loadings in exploratory factor analysis (>.4)?
Can any researcher send me a copy of the 12-item Procrastination Assessment Scale for Students questionnaire and its scoring manual or instructions?
Suppose that the phenomenon we study comprises about 3% of the population and we have made a two-item screening instrument (with dichotomous items). If we think in terms of Rasch/IRT, what is the optimal item difficulty for these two items? And if it is contingent on their covariance, how can that be tested?
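Under the Rasch model, an item is most informative where the response probability is 0.5, i.e., where its difficulty matches the latent level of interest; for a screening test, that is the latent cut point implied by the roughly 3% base rate rather than the population mean. A sketch (the 1.88 cut assumes a standard normal latent trait):

```python
import numpy as np

def rasch_p(theta, b):
    """Rasch probability of a positive response at trait level theta."""
    return 1 / (1 + np.exp(-(theta - b)))

def item_information(theta, b):
    """Fisher information of a Rasch item: P(1-P), maximal when b = theta."""
    p = rasch_p(theta, b)
    return p * (1 - p)

# With ~3% prevalence, the cut point sits high in the latent distribution:
theta_cut = 1.88   # z-score with roughly 3% of a standard normal above it
for b in [0.0, 1.0, 1.88, 2.5]:
    print(b, round(item_information(theta_cut, b), 3))
```

The covariance question is separate: local dependence between the two items would be examined with residual correlations (e.g., Yen's Q3 statistic) rather than by moving the difficulties.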
I have a question about my thesis project. I am estimating the predictive validity of an assessment; however, I am a bit confused. The assessment consists of major factors, which in turn consist of smaller dimensions, which in turn consist of smaller constructs. I need to formulate hypotheses, but I do not know which level of the assessment they should address: major factors, dimensions, or smaller constructs.
I have translated the Ryff Psychological Well-Being Scale into Vietnamese and collected 800 responses from university students. The main construct is psychological well-being, which comprises 6 distinct dimensions. I used the 54-item version, with 9 items per dimension (9 x 6 = 54).
The items appear to be formative rather than reflective, and I am told that an EFA is not appropriate for formative items.
I want/expect to confirm that the Vietnamese version also shows the 6 dimensions found in previous research. So how should I proceed? I have not found a clear, standardized approach to this question, and the videos and writings I have consulted do not seem to agree on the appropriate method.
Anyone who has done this kind of work I would love to hear from you. Thanks.