Scale Development - Science topic

Explore the latest questions and answers in Scale Development, and find Scale Development experts.
Questions related to Scale Development
  • asked a question related to Scale Development
Question
2 answers
I hope this email finds you well. My name is Areeba Shafique and I am a bachelor's student at Fatima Jinnah Women University. I am conducting a research study on "Impact of Neuroticism and Social Media Exposure on Climate Change Anxiety among Young Adults: Role of Environmental Concern". As part of my research, I am interested in using the Environmental Attitudes Inventory (EAI-24) scale developed by John Duckitt.
Could you please provide me with the contact information of the author of the EAI-24 scale, specifically John Duckitt, so I can request permission to use the scale in my research? Alternatively, if you could forward my request to the author or share an email address, I would greatly appreciate it.
Thank you for your time and assistance.
Best regards,
Areeba Shafique
Relevant answer
Answer
  • asked a question related to Scale Development
Question
3 answers
Hello.
The panel advised me to use a sequential exploratory design for scale development. However, the construct that I will be studying is from the 70s. Also, there are existing scales about it.
My question is: should I write a review of related literature (RRL) before conducting interviews? Aside from that, if an RRL is advisable, how should it guide my interview questions?
Thank you.
Relevant answer
Answer
I understood your question very differently than David Morgan. Were you suggesting your literature review would be your qualitative component? I presumed your interviews (qualitative) were phase 1, and being exploratory, that the scale (quantitative) would be phase 2. Nevertheless, it is never unwise to examine each scale for reliability and validity with your sample and context. Please clarify.
  • asked a question related to Scale Development
Question
2 answers
Hello. I'm currently writing my thesis proposal about innovative work behavior. Does anyone know the scoring and interpretation mechanics of Kleysen & Street's innovative work behavior scale? Does it have a global score? I attached the original study and a subsequent study below. Thank you.
Relevant answer
Answer
I happen to have experience in conducting innovation assessments. I use the 3W+P method, which consists of:
Winning Team: how solidly the team collaborates to complete an innovation project, using the abilities and skills of each team member, in accordance with the goals and targets that have been set.
Winning Concept: how the project was chosen based on major issues drawn from global issues, government issues, customer issues, business issues, and process or product issues that have a direct impact on the innovation project being carried out.
Winning System: how the project follows a standard system, makes improvements, and then standardizes them back into a new legal standard.
Performance: how well the project is controlled and monitored so that it can be evaluated on the aspects taken into account beforehand (e.g. Q C D S M P E).
A good innovation project applies the 3W+P concept properly, using improvement tools suited to their purpose to monitor, control, and evaluate, so that the impact of the project can be seen clearly by everyone, from people with only a basic understanding of the innovation to scientists.
  • asked a question related to Scale Development
Question
1 answer
Hello,
I am looking for a reference that specifies acceptable thresholds for McDonald's Omega. However, upon reviewing the existing cited works, I have noticed that no specific threshold has been determined for McDonald's Omega. Instead, the values generally attributed to Cronbach’s Alpha are applied to McDonald's Omega.
I have also examined studies that suggest there is no difference in cutoff points between McDonald's Omega and Cronbach’s Alpha.
As I am currently conducting a scale development study, could you recommend a reference that supports the acceptability of my obtained value?
Thank you for your assistance.
Relevant answer
Answer
Because McDonald's Omega is so seldom used, the reviewers for your work are likely to ask you for the value of alpha.
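For readers comparing the two coefficients, here is a minimal Python sketch of how omega and (standardized) alpha relate. It assumes a one-factor model with standardized loadings and uncorrelated errors; the loading values are hypothetical and are not taken from the question.

```python
def mcdonald_omega(loadings):
    """Omega for a one-factor model with standardized loadings
    and uncorrelated errors (an assumption of this sketch)."""
    common = sum(loadings) ** 2
    unique = sum(1 - l ** 2 for l in loadings)
    return common / (common + unique)


def standardized_alpha(loadings):
    """Standardized alpha implied by the same one-factor model.
    Model-implied inter-item correlations are products of loadings."""
    k = len(loadings)
    pairs = [loadings[i] * loadings[j]
             for i in range(k) for j in range(i + 1, k)]
    mean_r = sum(pairs) / len(pairs)
    return k * mean_r / (1 + (k - 1) * mean_r)


# Hypothetical loadings: equal (tau-equivalent) versus unequal (congeneric).
equal_loadings = [0.7, 0.7, 0.7, 0.7]
unequal_loadings = [0.9, 0.7, 0.5, 0.3]

omega_equal = mcdonald_omega(equal_loadings)
alpha_equal = standardized_alpha(equal_loadings)
omega_unequal = mcdonald_omega(unequal_loadings)
alpha_unequal = standardized_alpha(unequal_loadings)

print(round(omega_equal, 3), round(alpha_equal, 3))
print(round(omega_unequal, 3), round(alpha_unequal, 3))
```

Under equal loadings the two coefficients coincide, and with unequal loadings alpha falls below omega, which may be one reason the same cutoffs are usually applied to both.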
  • asked a question related to Scale Development
Question
1 answer
Which book should I refer to for formative scale development and validation?
Relevant answer
Answer
For formative scale development and validation, consider these comprehensive books that cover methodologies, best practices, and theoretical foundations:
Recommended Books:
"Constructing Measures: An Item Response Modeling Approach" by Wilson, Mark
Focuses on scale development using item response theory, which can be adapted for formative scales.
"Scale Development: Theory and Applications" by Robert F. DeVellis
A classic resource for understanding both reflective and formative scale construction, offering clear guidelines and examples.
"The Measurement of Psychological Constructs: Theory and Practice" by Geoffrey Walford
Discusses diverse measurement approaches, including those relevant to formative constructs.
"Handbook of Structural Equation Modeling" edited by Rick H. Hoyle
Provides in-depth coverage of structural equation modeling techniques, essential for validating formative constructs.
"Formative Indicators: Interpretation and Applications" by Jarvis, Mackenzie, and Podsakoff (Chapters in Research Handbooks)
While not standalone books, chapters from relevant research handbooks provide focused insights into formative scale approaches.
"Designing Surveys: A Guide to Decisions and Procedures" by Johnny Blair, Ronald F. Czaja, and Edward A. Blair
Discusses the design and validation of survey instruments, including formative measures.
If you are particularly focused on formative scale validation techniques like Confirmatory Factor Analysis (CFA), literature from journals like Psychological Methods or books on multivariate analysis might also be helpful.
  • asked a question related to Scale Development
Question
6 answers
I used the 21st century skills scale developed by Ravitz in my study. One of the comments I received from one of the professors was: "You cannot simply use an instrument developed for another country and apply the same instrument to your country. You have to contextualize the instrument."
Relevant answer
Answer
Your professor is referring to a problem I have named Measurement Disjuncture (Sul, 2019). This refers to "the misalignment that occurs when elements of an instrument-development process from one worldview are applied to the instrument-development process of another worldview" (Sul, 2019, p. 7). My prescription for this problem is to focus on culturally specific assessments (Sul, 2019) that are designed and developed from within the cultural perspective in which the instrument will be applied.
  • asked a question related to Scale Development
Question
5 answers
Hi,
I have developed a semi-structured interview assessment tool for a clinical population which gives scores - 1,2,3 (each score is qualitatively described on ordinal scale), using a mixed methodology. The tool has 31 questions.
The content validation was done in the Phase 1&2 and the tool was administered on a small sample (N=50) in Phase 3 to establish psychometric properties. The interview was re-administered on 30 individuals for retest reliability. The conceptual framework on which the tool is developed is multidimensional.
When I ran Cronbach's alpha for the entire interview it was .75, but for the subscales it comes out between 0.4 and 0.5. The inter-item correlations are pretty low; some are negative and many are non-significant. The item-total correlation for many items is below 0.2, and some are negative. Based on the established criteria, we would have to remove many items.
I am hitting a roadblock in the analysis and was wondering if there is any other way in which we can establish the reliability of a qualitative tool with a small sample, or other ways to interpret the data where a mixed methodology has been used (QUAL-quan).
Since the sample is small I will be unable to do factor analysis, but will be establishing convergent/divergent validity with other self-report scales.
Thanks in advance
Relevant answer
Answer
To get a high level of internal consistency, all of your measures need to be positively correlated. So, each of your scores would need to indicate frequently, mostly, always for strongly related behaviors. If any of them are in the "wrong" direction (i.e., negatively scored), they will need to be reversed.
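To illustrate the point about item direction, here is a minimal Python sketch of Cronbach's alpha before and after reverse-scoring one item. The responses are made up for illustration (a 1-3 scale, three items, six respondents, with the third item keyed in the opposite direction); nothing here comes from the asker's data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha from a list of respondents' item scores."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def reverse_score(rows, item, low=1, high=3):
    """Flip one item so that a high score means the same as on the others."""
    return [[(low + high - v) if j == item else v
             for j, v in enumerate(r)] for r in rows]


# Hypothetical responses; the third item runs in the "wrong" direction.
responses = [
    [1, 1, 3],
    [1, 2, 3],
    [2, 2, 2],
    [2, 3, 2],
    [3, 3, 1],
    [3, 2, 1],
]

alpha_raw = cronbach_alpha(responses)
alpha_fixed = cronbach_alpha(reverse_score(responses, item=2))

print(round(alpha_raw, 2), round(alpha_fixed, 2))
```

In this toy data the unreversed item drives alpha below zero, and reversing it restores a high value, so checking item keying is a cheap first step before dropping items.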
  • asked a question related to Scale Development
Question
5 answers
I have seen many comments implying that if a newly developed scale has a solid theoretical model behind it, EFA can (or better, should) be skipped. In a cognitive scale that I have recently developed, I had a clear design for my items, based on previous theory. However, after administering it to my study population, I ran a WLSMV CFA with two first-order factors and saw that some items (out of a total of 50) have weak (<0.30) or very weak (<0.10) loadings and possible cross-loadings.
My fit indices improved to an excellent range after deleting some of the lowest-loading items. Even after that, I have items with factor loadings of ~0.20. I have good reliability when they stay, and they don't look bad theoretically. After pruning them to have a minimum loading of 0.3, not only do my already good fit indices not improve much, but my reliability also gets lower, and I lose a good chunk of items. You don't want to assess cognitive skills with 15 items, since almost all batteries have 30-40 items minimum. Should I keep them?
Also, some of the items with a ceiling effect (98% correct responses) stay in the CFA model with good loadings. Should I keep them?
There are clear guidelines on item-deleting strategies for EFA. What about CFA?
Relevant answer
Answer
A reviewer will rip you apart if you skip EFA for a newly developed scale. One of the key points of an EFA is to uncover any discrepancies between the hypothesized factor structure and theory, but it will also reveal item redundancy, poor loadings, and potential cross-loadings (both of which you observed), which could already tell you which items to remove.
  • asked a question related to Scale Development
Question
2 answers
I am glad to announce the 5th Annual Research Faculty Development Program (FDP), titled 'Scale Development and Data Analysis with PLS-SEM in SmartPLS 4.0' (23 – 29 December 2024), organized by the Jaipuria Institute of Management, Noida.
PROGRAM OUTCOMES
  • The participants will be able to conduct cross-sectional research in social science.
  • The participants will be able to frame theoretical frameworks in social science research.
  • The participants will be able to develop correct scales for conducting effective research in the industry.
  • The participants can use advanced statistical techniques of cross-sectional research to forecast and predict consumer preferences.
*The participants will receive a free two-month professional license key worth 56 Euros (INR 5,393).
Registration link: https://lnkd.in/gp3-hGfX
A certificate will be provided on successful completion of the program.
Seats: 50 (first-come, first-served basis).
Platform: Offline at Jaipuria Institute of Management, Sector 62, Noida, India.
For more details, please contact Ms. Shuchita Tewari (9811050643) or email annualresearch.fdpnoida@jaipuria.ac.in
Relevant answer
Answer
Please check the attachment.
  • asked a question related to Scale Development
Question
2 answers
Hello
I recently developed a tool to generate the item pool for a psychological assessment, based on the definition and literature review of the variable of interest.
Could you review it and write your thoughts about it?
Any input is appreciated
Thank You
Relevant answer
Answer
Might there be copyright issues with established tests? You need to state where the items were collected from (the source for each item).
  • asked a question related to Scale Development
Question
3 answers
Hi researchers. I am developing a scale on psychosocial well-being. I have done the factor analysis, using principal component analysis (PCA) to obtain the minimum number of factors required to represent the available data set. Two of the dimensions have eigenvalues greater than 1. Should I remove these two? What does it mean in simple language (not the statistical explanation)? Thank you in advance.
Relevant answer
Answer
Hi Soumya
If you want to do a factor analysis then don't do a PCA, because PCA is not a type of factor analysis. There is a lot of information on the distinction between the models (Google "PCA versus factor analysis") and one of the best papers on this is...
Bollen, K., & Lennox, R. (1991). Conventional wisdom on measurement: A structural equation perspective. Psychological Bulletin, 110(2), 305–314. https://doi.org/10.1037/0033-2909.110.2.305
I think you should get the underlying model correct first - I suspect that psychosocial well-being is not best explained by a PCA model
Mark
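Setting aside the PCA-versus-factor-analysis point, the eigenvalue question itself can be illustrated with a small sketch. Under the common eigenvalue-greater-than-1 (Kaiser) rule, components with eigenvalues above 1 are the ones that are kept, not removed: each accounts for more variance than a single standardized item. The correlation matrix below is hypothetical (six items in two clusters) and NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical correlation matrix: two clusters of three items each,
# correlating 0.5 within a cluster and 0.1 across clusters.
within, between = 0.5, 0.1
block = np.full((3, 3), within)
np.fill_diagonal(block, 1.0)
cross = np.full((3, 3), between)
corr = np.block([[block, cross], [cross, block]])

# Eigenvalues of the correlation matrix, largest first.
eigenvalues = np.linalg.eigvalsh(corr)[::-1]

# Kaiser rule: retain components whose eigenvalue exceeds 1.
retained = int((eigenvalues > 1).sum())
explained = eigenvalues[:retained].sum() / eigenvalues.sum()

print(np.round(eigenvalues, 2))
print(retained, round(float(explained), 2))
```

In this toy example two components pass the rule, matching the two clusters built into the matrix. The rule is only a heuristic, so a scree plot or parallel analysis is often checked alongside it.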
  • asked a question related to Scale Development
Question
6 answers
Suppose X is a variable with four sub-constructs (dimensions), and a researcher adds two constructs, based on the literature, that hold valid in the current research. Is proceeding directly with EFA and CFA an acceptable way to establish the validity and reliability of the variable in the current context, without conducting the interviews and focus groups that are normally part of scale development? Here, only dimensions are added.
Relevant answer
Answer
You can run a pair of CFAs, starting with a single 6-dimensional scale. Then run another with two correlated scales, one of which has the original four dimensions and the other the two new dimensions. Then compare the fit of the two versions.
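One common way to compare the fit of two nested CFA models is a chi-square difference test. The sketch below is a minimal pure-Python version under the assumption of ordinary maximum-likelihood estimation; with robust or WLSMV estimators a scaled difference test is needed instead. The fit values are hypothetical placeholders, not results from any real analysis.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (integer df >= 1)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Closed form for even df: exp(-x/2) * sum of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Closed form for odd df, built on the complementary error function.
    term, total = 1.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    tail = math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total
    return math.erfc(math.sqrt(x / 2)) + tail


def chi_square_difference(chi2_restricted, df_restricted, chi2_free, df_free):
    """Compare a restricted model with a less restricted, nested one."""
    delta_chi2 = chi2_restricted - chi2_free
    delta_df = df_restricted - df_free
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)


# Hypothetical fit results: one factor (restricted) vs. two correlated factors.
delta_chi2, delta_df, p_value = chi_square_difference(61.3, 9, 20.1, 8)
print(round(delta_chi2, 1), delta_df, p_value)
```

A small p-value would suggest the two-factor version fits better than the single scale; information criteria such as AIC or BIC are a reasonable cross-check.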
  • asked a question related to Scale Development
Question
3 answers
Could someone please give me the scoring system of the Academic Stress Scale developed by Rajendra and Kalikapan?
Relevant answer
Answer
Is this the actual scoring system of this tool as given by the author? Please advise.
  • asked a question related to Scale Development
Question
15 answers
I am working on scale development in behavioral finance by undertaking a mixed-method approach using the exploratory sequential design. The phenomenon has diverse meanings in existing literature (some measuring it in terms of behavior while others use combinations of dimensions such as knowledge and access). I am unclear about its definition, so I want to explore the perspectives about the concept and what components participants feel it includes by taking a phenomenological approach.
My notation for the research design is qual → QUAN → QUAN. Please advise whether my approach is right. Do I need to go into so much depth, given that my main aim is to develop and validate the scale rather than to undertake a qualitative study? I just need a qualitative viewpoint to support my framework or to guide the initial item bank and dimensions I created using a literature review. Second, if not, does it still remain a mixed-method design? And if yes, how large should the sample be? I referred to Creswell & Poth's (2018) Qualitative Inquiry and Research Design, which says 3-10 (Dukes, 1984) or 5-25 (Polkinghorne, 1989). I am confused, so if you have any concrete reference, please suggest it here.
Thanks in advance to the reviewers!
Relevant answer
Answer
David C. Coker I disagree with your estimates for sample size, which might be appropriate for a full-scale, stand-alone qualitative study, but are too high for a survey development process. There is a reason why such studies are summarized as qual --> QUAN in mixed methods: the "small qual" is given the specific purpose of creating items that meet the needs of the quantitative study that drives the design as a whole.
  • asked a question related to Scale Development
Question
3 answers
Could someone please give me the scoring system of the Academic Stress Scale developed by Rajendra and Kalikappan?
Relevant answer
Answer
I am unable to open the attached file
  • asked a question related to Scale Development
Question
3 answers
Is there an exact guide for reporting scale development methods? What should we actually report in the results?
Relevant answer
Answer
Hello @ Parami
Inter-rater reliability is only required where it applies, namely when a measure includes some element of subjective judgment, e.g. ratings of job candidates, scoring of projective measures, or awareness-of-deception measures. (Scoring TAT stories for achievement motivation is one example of a projective test.) Inter-rater reliability clearly does not apply to "objective" measures like Likert scales, Thurstone scales, multiple-choice tests, or true-false tests.
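Where inter-rater reliability does apply, Cohen's kappa is one commonly reported statistic for two raters assigning categories. Here is a minimal sketch; the two rating lists are invented for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels."""
    n = len(rater_a)
    # Proportion of cases where the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of ten job candidates by two raters.
rater_a = ["high", "high", "low", "low", "high",
           "low", "high", "low", "low", "high"]
rater_b = ["high", "high", "low", "high", "high",
           "low", "high", "low", "low", "low"]

kappa = cohen_kappa(rater_a, rater_b)
print(round(kappa, 2))
```

The raters agree on 8 of 10 cases, but half of that agreement is expected by chance given their marginal frequencies, which is why kappa is lower than the raw agreement rate.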
  • asked a question related to Scale Development
Question
1 answer
Does anyone have information about the validity of the diabetes self-efficacy scale developed by the Stanford Patient Education Research Center? There are eight items on the instrument.
Relevant answer
Answer
Validity Assumptions for Stanford's Self-Efficacy for Diabetes Scale
The Stanford Self-Efficacy for Diabetes Scale is a widely used instrument designed to measure the confidence of individuals with diabetes in managing their condition. Establishing the validity of this scale involves several key assumptions and processes:
1. Content Validity
  • Definition: Ensures that the scale covers all relevant aspects of diabetes self-management.
  • Assumption: The items included in the scale comprehensively reflect the necessary skills and knowledge required for effective diabetes self-management.
  • Process: Experts in diabetes care and self-management typically review the items to confirm they are representative of the domain.
2. Construct Validity
  • Definition: Determines whether the scale measures the theoretical construct it is intended to measure.
  • Assumption: The scale’s items should correlate with other measures that assess similar constructs (convergent validity) and not correlate with measures of different constructs (discriminant validity).
  • Process: Statistical techniques like factor analysis are often used to confirm that the items on the scale group together in a way that is consistent with theoretical expectations. For example, items measuring diet management should load onto a diet management factor.
3. Criterion Validity
  • Definition: Assesses how well one measure predicts an outcome based on another, established measure.
  • Assumption: The scale should correlate well with other established measures of diabetes self-efficacy and related outcomes (e.g., HbA1c levels, adherence to medication).
  • Process: Correlational studies compare the scores from the Stanford Self-Efficacy for Diabetes Scale with those from other validated instruments or objective health outcomes.
4. Reliability
  • Definition: Refers to the consistency of the scale over time and across different populations.
  • Assumption: The scale produces stable and consistent results when administered under similar conditions.
  • Process: Test-retest reliability and internal consistency measures (e.g., Cronbach’s alpha) are calculated to ensure that the scale is reliable.
Example Studies and Applications
  • Content Validity Example: A study might involve a panel of endocrinologists, diabetes educators, and patients who review the scale items to ensure they comprehensively cover all necessary aspects of diabetes management, such as diet, exercise, blood glucose monitoring, and medication adherence.
  • Construct Validity Example: Researchers might use factor analysis to determine if the items on the Stanford scale align with expected theoretical factors, such as self-efficacy in diet, exercise, and medication management.
  • Criterion Validity Example: The Stanford scale scores could be correlated with HbA1c levels to determine if higher self-efficacy scores predict better glycemic control.
References for Further Reading
  • Stanford Patient Education Research Center: Provides details on the development and validation of the self-efficacy scales.
  • Studies on Diabetes Self-Efficacy: Lorig K, Ritter PL, Villa FJ, Armas J. Community-based peer-led diabetes self-management: a randomized trial. Diabetes Educ. 2009 Jan-Feb;35(1):21-32.
By ensuring these validity assumptions are met, the Stanford Self-Efficacy for Diabetes Scale can be considered a reliable and valid tool for assessing self-efficacy in diabetes management.
  • asked a question related to Scale Development
Question
1 answer
I am asking this for a scale development study. We have mutual/same items asked to both groups, and some items are asked only to a specific group (e.g., people of immigrant descent). Is there a way to conduct EFA for both groups at the same time using the mutual/same items asked to the participants of immigrant and non-immigrant descent?
Relevant answer
Answer
The Mplus software allows you to run multigroup EFA. See Chapter 5, Example 5.27 on the Mplus website:
You can find example Mplus syntax files here:
  • asked a question related to Scale Development
Question
3 answers
Hello everyone,
Thank you for your time and patience in advance! I attempt to develop a pedagogical framework and a competence scale for my PhD project. I have developed the drafts of both. I adapted several theories and previous similar frameworks for the current pedagogical framework development. I wonder if the expert panel is an indispensable step before I design teaching materials based on this framework. (I totally agree that having an expert panel can without any doubt benefit my design)
I ask this question because organizing the expert panel discussion might be challenging for me due to limited time and resources.
Many thanks for your answers!
Best.
Bonnie
Relevant answer
Answer
Expert panels can be of great importance for the development of some educational measures, but whether they are indispensable depends on various factors such as the complexity of the topic, the scope of the measure, and the target audience or application.
There are some things to take into consideration.
Complexity of the topic:
Validity and reliability:
Different viewpoints:
Practical application and feasibility of the application:
User sharing:
  • asked a question related to Scale Development
Question
6 answers
I am currently encountering some issues with a specific scale in my research. Could anyone here help me?
In my study, I am employing the "Servant Leadership Scale" developed by Barbuto and Wheeler in 2006. This scale comprises 15 items, such as: “I encourage my followers to be hopeful about the company; I can help my followers overcome negative emotions," etc.
I have two questions regarding this scale:
(1) Can I use a scale developed in a corporate setting to conduct research in an educational setting?
Specifically, Barbuto and Wheeler developed this scale within a corporate setting. Hence, I wonder if it is suitable for me to use this scale in an academic context to assess "servant leadership" among university professors in higher education. If so, should I acknowledge in the manuscript that this study used the scale developed by Barbuto and Wheeler with revisions? For instance, should I state that items such as "I encourage my followers to be hopeful about the company" were adapted to "I encourage my followers to be hopeful about the university", to explicitly illustrate the revisions made to the original scale and to underscore the rigor of the paper to editors and reviewers?
(2) If the scale was designed for managers to complete, could employees complete the survey instead?
Specifically, the "Servant Leadership Scale" developed by Barbuto and Wheeler (2006) expects leaders, rather than followers, to take part in the survey to assess leaders' servant leadership. However, I intend to distribute this survey to students (students are regarded as followers of professors) and invite them to evaluate the servant leadership of their professors. Should I explicitly state in the paper that this study employed the scale developed by Barbuto and Wheeler (2006) with revisions? For instance, should I state that items such as "I can help my followers overcome negative emotions" were adapted to "My professor can help me overcome negative emotions"?
This is really urgent, so I look forward to your earliest help!
Relevant answer
Answer
Hello! I would recommend you review other scales that better fit the educational context, such as: Authentic Leadership in Education (ALE) or Multifactor Leadership Inventory for Educators (MLQ). Greetings!
  • asked a question related to Scale Development
Question
2 answers
I am working in the area of digital finance, and I want to develop a scale, but there is no pre-existing scale for the variable concerned. All the existing measures are taken from national survey items. Moreover, there are varied conceptualizations of the concept. I have developed my own conceptual definition using two theoretical models. Now I am developing my initial pool of items, but I am facing problems because the concept has not been studied in the digital context. Please suggest a way forward and any references. Also, I want to know whether it is valid to develop a scale from a theoretical background and to add dimensions to my proposed scale on the basis of those theories, because papers on scale development usually do not mention any theory. Please guide me.
Relevant answer
Answer
You can generate a new scale through inductive and deductive approaches.
  • asked a question related to Scale Development
Question
3 answers
Does anybody know how I can get permission to use the Bisht Battery of Stress Scales developed by Dr. Abha Rani Bisht?
Relevant answer
Answer
It is commercially published. In general, you have to buy it.
  • asked a question related to Scale Development
Question
1 answer
Dear researchers,
I am looking for psychometric scales to use in my survey, measuring depression, anxiety, resilience, and social desirability. There are so many scales available, but it is hard to find scales that consider the public's opinion in the development process using focus groups and cognitive interviews. Can you tell me if you happen to know any? It would be even better if extra steps were taken in the scale development process, such as reviews of existing scales, expert opinion, and theoretical construction.
Thanks a lot!
Sincerely,
Siying
Relevant answer
Answer
Hi Siying,
Could you please be a little more specific about your population and about which development process you are referring to?
  • asked a question related to Scale Development
Question
3 answers
I have a multi-part question that is a little ill-formed but the gist of it is this: I have 15 or so dichotomous items that divide into drug use behaviors that put one at risk for an overdose and protective behaviors that potentially reduce risk. I also have 3 socioeconomic indicators - also dichotomous.
My goal is to analyze these variables to obtain a scale that assesses risk for an overdose. I do have past-year overdoses experienced against which I could use any scale to determine predictive validity (sensitivity, specificity).
Prior analyses I did on the risk items only indicated they could be represented by two factors. Preliminary analyses that included the protective items found, not surprisingly, they formed a third factor.
So it seems I have a preliminary "scale" that includes 3 subscales. And I am stuck at this point with a few questions about it:
1) I have read some work and watched videos about assessing the structure of scales with multiple subscales, which leads to assessing a bifactor or higher-order factor structure. It's interesting, but does that kind of analysis really add anything to scale construction, as opposed to conceptual/theoretical clarity?
2) I have also read about the use of multidimensional item response theory (mIRT), as opposed to factor analysis, as a means of scale development. I appreciate the more detailed information IRT provides about scale items, but again, is there any advantage to using mIRT for scale development versus EFA/CFA? And even so, what is the path from mIRT to setting up an actual scale, determining how to score the items, and then testing the predictive validity of the scale and scoring?
Any insights would be much appreciated.
Relevant answer
Answer
If you know the structure of the scale, you should use CFA. If there is a correlation between the 3 factors (r > 0.40 for example), then you can conduct a second order CFA.
MIRT is actually used in scales known to be multidimensional and its purpose is to obtain ability estimates of individuals. If your aim is to examine the structure of the scale, then you should use CFA. MIRT does not provide as much information about the structure of the scale as CFA. You can examine item statistics, but items that are not suitable for MIRT will not be suitable in CFA either.
Second-order CFA gives you the opportunity to examine whether a total score (or average score) can be obtained from the scale. In other words, if you get a model-data fit as a result of second-order CFA, you can use the scale as unidimensional and you can also use the sub-dimensions separately.
In this respect, second order CFA gives you extra information.
Bests.
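On the predictive-validity part of the question (sensitivity and specificity against past-year overdose), a minimal sketch of the calculation might look like this. The scale scores, outcomes, and cutoffs are hypothetical, and a simple summed score is assumed rather than any particular scoring rule.

```python
def sensitivity_specificity(scores, outcomes, cutoff):
    """Classify score >= cutoff as 'at risk' and compare with observed outcomes."""
    pairs = list(zip(scores, outcomes))
    tp = sum(s >= cutoff and y == 1 for s, y in pairs)
    fn = sum(s < cutoff and y == 1 for s, y in pairs)
    tn = sum(s < cutoff and y == 0 for s, y in pairs)
    fp = sum(s >= cutoff and y == 0 for s, y in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical summed risk scores and past-year overdose (1 = yes, 0 = no).
scores = [0, 1, 1, 2, 2, 3, 3, 4, 5, 5]
outcomes = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

# Scan candidate cutoffs to see the sensitivity/specificity trade-off.
for cutoff in range(1, 6):
    sens, spec = sensitivity_specificity(scores, outcomes, cutoff)
    print(cutoff, round(sens, 2), round(spec, 2))

sens_at_3, spec_at_3 = sensitivity_specificity(scores, outcomes, 3)
```

Lower cutoffs catch more true cases at the cost of more false alarms, so the choice of cutoff depends on which error matters more in the intended use of the scale.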
  • asked a question related to Scale Development
Question
2 answers
Hello everyone,
My interest is scale development and psychometric evaluation, and until now I have worked within classical test theory. I would really like to evaluate the psychometric properties of the tools with item response theory. I would appreciate it if someone could, as a collaborator, analyze the collected data in this way. The data are from the field of nursing.
Thanks in advance,
Sincerely,
Reza Ghanei Gheshlagh
+98 9144050284
https://scholar.google.com/citations?user=DA1hhmIAAAAJ&hl=en
Relevant answer
Answer
Hi Reza, What is your actual need? Do you want assistance in the evaluations of a tool? Or did you use a tool and now you want help with analyzing the outcomes on a raw data level? Or how can I understand your question better?
Cheers
Harald
  • asked a question related to Scale Development
Question
4 answers
Hi, I am developing and validating a clinical interview-based tool which will be administered by clinicians.
To assess its qualitative face validity, I have done cognitive interviews with the target population of individuals diagnosed with addiction.
However, I want to know if there is any way to assess face validity through quantitative scores. I have come across "impact scores" but they are mostly used for self-report scales.
Is it suggested to apply the same process for a clinician administered measure?
In this case, can some patients read the interview and score the importance of the items to assess face validity?
Thank you!
Relevant answer
Answer
I think cognitive interviews are currently seen as the "best practice" for assessing questionnaires. If you want further evidence of face validity, you could get experts in the field to rate and critique the items.
  • asked a question related to Scale Development
Question
5 answers
Hi,
I am about to develop a scale that could measure sustainability literacy. For this, I would like to include different types of questions in the scale, such as:
- Questions with yes/no responses.
- Questions with multiple choice options "where one is correct".
- Question with 5 point likert-scale "strongly disagree - strongly agree"
My main question is: will I be able to conduct factor analysis (exploratory and confirmatory) on these different types of questions, or do I need to stick to one type (most researchers choose a Likert scale)?
Relevant answer
Answer
Hello Laila,
I agree with the second part of your answer. Regarding the first, could it be that you are confusing factor analysis with PCA (principal component analysis)? The two terms are often used synonymously. Factor analysis specifically is a *model* that tries to identify underlying common causes of the observed variables, not to reduce the number of items. Reducing the number of items is the goal of PCA, which applies no model and is a purely technical data reduction technique, so its components (not factors) carry no surplus meaning.
Best,
Holger
  • asked a question related to Scale Development
Question
1 answer
To cut a long story short, I'm measuring perceptions regarding organisational change across schools in SEAsia.
Having developed my own framework based upon existing frameworks and a scale developed and adapted from existing scales, I have designed and conducted a mixed-method Likert-scale and open ended survey.
Whilst I have a matrix framework which shows how the scales align to each factor on the framework, received personal positive feedback from published researchers working in field recognising its validity and had positive feedback from educators working within this context, I still think it might benefit from extra analysis...
Would there be any real benefit including EFA since I am quite confident attributing items to certain factors?
P.S. I was going to opt for Confirmatory Factor Analysis, but my sample size is only 71. So is this just something for me to write about in my limitations section?
Any thoughts or feedback appreciated!
Relevant answer
Answer
This of course depends on the number and quality of your items and other factors, but in any case, N = 71 seems really small for either EFA or CFA. You would not have much statistical power to detect an incorrect factor model based on model fit statistics, and there would also potentially be problems with the (im)precision of the parameter estimates (e.g., factor loadings).
  • asked a question related to Scale Development
Question
3 answers
I've started running an analysis for a new outcome measure. I was not part of the face validity design. The scale started with 13 items. I was wondering if there is any literature or references that guide the ideal number of items to have at the start of scale development, or references that discusses number of items?
Relevant answer
Answer
Hello Leng,
the most relevant question is whether a set of observed variables is intended to measure, and actually does measure/reflect, an underlying latent variable (or several), or whether it rather denotes some kind of summative index (like a shopping list). If it is the latter, there is no validity in the strict sense (that the variables measure what they should measure) but rather the issue of theoretical "adequacy" or comprehensiveness (is every facet of the umbrella term captured?), that is, "content validity".
If it is the former (i.e., a common factor model is assumed), then the most important thing is to check whether the assumptions inherent in the common factor model are true, namely whether the set of measures really measures the supposed factor. Hence, in contrast to the former criterion of "adequacy", we now have the notion of "truth" and thus validity. With 13 items, I strongly doubt that this reasoning is plausible from the start, but go on and test it (why not give it a try?). But don't deceive yourself by explaining things away if the model's story isn't the one you would like to hear :)
A higher number of indicators, IF they all really measure one underlying factor, has some advantages; see
Marsh, H. W., Hau, K., Balla, J. R., & Grayson, D. (1998). Is more ever too much? The number of indicators per factor in confirmatory factor analysis. Multivariate Behavioral Research, 33, 181-220.
But as I said, I NEVER saw a cleanly fitting model with more than 4 indicators, so Marsh's study is a nice simulation with little practical relevance (sorry for the pun). That being said, a smaller number (down to 2) of really valid, strong indicators serves to achieve the main goal, namely to enable estimating measurement error, so for scientific latent variable models (SEM) this will be enough. If, however, the main goal is to later create summative composite scores, then again the number of indicators will matter, as reliability increases with the number of indicators.
So, long story short: yes, a higher number has advantages, but that doesn't count for much, as the items will most probably violate THE most basic assumption, validity. From my point of view, having a more reliable bunch of invalid measures is not much of a good thing :)
Hope that helps
Holger
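The remark that composite-score reliability rises with the number of indicators can be illustrated with the Spearman-Brown prophecy formula. This is only a minimal sketch: it assumes parallel items (equal loadings and error variances), which real items rarely satisfy, and the single-item reliability of 0.30 is a made-up value chosen for illustration.

```python
def spearman_brown(single_item_reliability: float, n_items: int) -> float:
    """Predicted reliability of a sum score built from n parallel items."""
    r = single_item_reliability
    return n_items * r / (1 + (n_items - 1) * r)


# Hypothetical single-item reliability, chosen only for illustration.
r_item = 0.30
for k in (2, 4, 8, 13):
    print(k, round(spearman_brown(r_item, k), 3))
```

The formula rewards adding items whether or not they measure the intended factor, so a higher predicted reliability says nothing about validity, which is the caveat raised in the answer.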
  • asked a question related to Scale Development
Question
2 answers
My objective is scale development, but the indicators in the construct are also associated with one another, so I want to know whether it is valid to study the interactions between a construct's indicators. My reasoning is that studying interactions could help me understand whether the effects of specific indicators differ depending on the levels or presence of other indicators.
Relevant answer
Answer
Interactions can bring out a construct's complexity, which is something you are trying to measure. Social science constructs frequently have many facets and are subject to various influences. You can learn more about how these elements combine to form the whole construct by looking at interactions.
Effects in different contexts: Interactions can show how the relationships between indicators may change. Depending on the situation or the presence of other factors, some indicators may have stronger or weaker effects.
Increased measurement accuracy: Examining interactions can help with the construct's measurement accuracy. The scale can be improved and the most pertinent items for accurately capturing the construct identified by understanding how indicators interact with one another.
Relevance to intervention or treatment strategies: Understanding interactions may have application to such plans. It can direct you in determining particular sets of indicators that are especially pertinent in particular circumstances, assisting you in adapting your approach.
Theoretical base: Make sure your study is based on a strong theoretical base. Where possible, the postulated interactions should be supported by existing research and prior empirical data.
Sample size: To accurately identify significant effects in interaction analyses, larger sample sizes are frequently needed. Ascertain that the sample size you choose can support the complexity of the analysis.
Statistical analysis: To effectively explore and test the interactions, use the appropriate statistical techniques. Common techniques for examining interactions include regression analysis with interaction terms, moderation analysis, and structural equation modelling (SEM).
The interpretation of interaction effects should be done with care. It can be more difficult to interpret interactions than to understand and explain their implications. If necessary, seek advice from professionals in your field.
In conclusion, investigating how indicators interact during scale development is a legitimate and potentially enlightening strategy. It can improve your comprehension of the construct, how it is measured, and how various indicators might interact to affect the final result. Just make sure your method is rigorously analysed and has a strong theoretical foundation.
  • asked a question related to Scale Development
Question
2 answers
I'm not able to reach Professor Reynolds directly and none of the authors of the papers that have used his scale have replied to my emails. If anyone has used this scale then please let me know if I can use it with relevant citation. Thank you
Relevant answer
Answer
Since Taylor and Francis holds the copyright here (follow the DOI), you need to request permission from them, as Ajit Singh already wrote. On the page of the original article you will find the item "requests and permissions", under which you will probably select "academic permission". That will redirect you to Rightslink, which is the standard tool used by almost all big publishers for such occasions.
  • asked a question related to Scale Development
Question
2 answers
Hi all,
I'm currently in the analysis phase for a program evaluation/scale development project where most of the questions follow a standardized option response format:
Strongly agree
Agree
Neither agree nor disagree
Disagree
Strongly disagree
However, there is one item whose response options are:
Strongly agree
Agree
Neutral
Disagree
Strongly disagree
My question is: for the type of analysis I'm doing (year-over-year differences, some validity/reliability work using R), would this item, whose mid-point response option is "Neutral" as opposed to "Neither agree nor disagree", make a significant difference in my analysis results? Could I interpret the results with confidence given the change in wording for the mid-point option, and could I move forward with the analysis with this item included? I would omit it entirely, but it belongs to a scale comprising 5 items.
Thank you in advance for any guidance!
Relevant answer
Answer
Dr-Zaffar Ahmad Thank you for the insights! Very true about the psychological implications.
In terms of the statistical analyses, both response options ("Neither agree nor disagree" and "Neutral") would be coded as 3, since they occupy that "middle" third position within the response options themselves.
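The coding described above can be sketched as a simple lookup in which both mid-point wordings map to the same value. The direction of the coding (1 = "Strongly disagree" up to 5 = "Strongly agree") and the names `LIKERT_CODES` and `code_response` are assumptions for illustration, not something stated in the thread.

```python
# Assumed coding direction: 1 = Strongly disagree ... 5 = Strongly agree.
LIKERT_CODES = {
    "strongly disagree": 1,
    "disagree": 2,
    "neither agree nor disagree": 3,
    "neutral": 3,  # alternate mid-point wording, same position on the scale
    "agree": 4,
    "strongly agree": 5,
}


def code_response(label: str) -> int:
    """Map a response label to its numeric code, ignoring case and padding."""
    return LIKERT_CODES[label.strip().lower()]


responses = ["Agree", "Neutral", "Neither agree nor disagree", "Strongly disagree"]
codes = [code_response(r) for r in responses]
print(codes)
```

Identical codes only make the item usable in the same computations; whether respondents read the two labels the same way is a separate question that the coding cannot answer.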
  • asked a question related to Scale Development
Question
2 answers
Hi everyone,
This may be bit of a stretch, but if you have used the following scale before, could you please tell me what higher (and lower) scores indicate? I cannot find the scale in English.
Fragebogen zur Erfassung von Einstellungen gegenüber übergewichtigen Menschen [Questionnaire for Measurement of Attitudes toward Obese People], Degner 2006)
Relevant answer
Answer
Hello there! The questionnaire uses a 7-point Likert scale (see pages 146 & 147 of the dissertation you are referencing). 1 represents "lehne deutlich ab," which translates to strongly disagree. 4 represents "weder noch," which translates to neither agree nor disagree. And 7 represents "stimme deutlich zu," which means strongly agree.
  • asked a question related to Scale Development
Question
14 answers
For a research on a psychotherapeutic intervention for health care employees in the face of COVID-19 I am looking for the Nursing Stress Scale or a similar instrument to measure stress in nursing-staff. It would be great if anyone could help me.
Thank you :)
Relevant answer
Answer
I am in the same situation. Could you let me know the email address or the best contact?
  • asked a question related to Scale Development
Question
4 answers
EFA shows unidimensionality, with only one factor extracted, and CFA has been conducted. What should the next steps be for the following in scale development/construction?
i) convergent validity
ii) discriminant validity
Relevant answer
Answer
If you have a model that specifies the relationships between your independent and dependent variables and any relevant covariates, then you can use this to assess convergent and discriminant validity. In other words, the validity tests do not need to be a separate step from your overall analysis.
  • asked a question related to Scale Development
Question
1 answer
We are working on a project investigating the impact of electricity reduction/loadshedding on the small business sector. We would appreciate any similar studies in other contexts. Including scales used to measure impact of electricity reduction/loadshedding.
#electricityreduction #loadshedding
Relevant answer
Answer
Here are a few studies and resources that may be relevant to your project:
  1. "The impact of load shedding on small and medium enterprises in South Africa" by P. Adendorff and J. van der Walt (2018). This study investigates the impact of load shedding on small and medium enterprises in South Africa and includes a survey to measure the impact on businesses.
  2. "Assessing the impact of power cuts on small and medium enterprises in Zimbabwe" by J. M. Tawanda and T. M. Mabvure (2017). This study examines the impact of power cuts on small and medium enterprises in Zimbabwe and includes a survey to measure the impact on businesses.
  3. The World Bank's "Guidance Note on the Management of Load Shedding" (2014) provides guidance on how to manage load shedding, including a section on measuring the impact of load shedding on different sectors.
  4. The International Energy Agency's "Energy Access Outlook 2021" (2021) includes a section on the impact of power outages and load shedding on the energy access of households and businesses in low- and middle-income countries.
  5. The United Nations Development Programme's "Access to Energy for the Poor" (2018) includes a section on the impact of power outages and load shedding on businesses and economic development in low- and middle-income countries.
These resources may provide useful insights and methodologies for measuring the impact of electricity reduction and load shedding on small businesses.
  • asked a question related to Scale Development
Question
6 answers
Dear researchers, can both EFA and CFA be applied to the database obtained at the end of the scale development process at the same time?
Relevant answer
Answer
Hello Ömer,
I'm uncertain as to what you mean by "...at the same time."
If you are uncertain as to the underlying structure for a set of variables, and have no prior evidence, claim, or theory as to what that structure should be, then EFA is certainly a viable option for exploring and making initial judgments as to what factor space might be appropriate.
Once you determine a defensible candidate structure, it is certainly a good idea to verify that the structure can adequately account for observed relationships among the variables by using CFA with data not used to derive the EFA.
The reason for not applying to the very same data as used for the EFA is the concern that the EFA process can be opportunistic, resulting in a solution which overfits the data set from which it was derived, and therefore may not generalize.
You can find a number of published studies in which the author(s) have taken part of a data set to derive a candidate structure via EFA, then evaluated that structure, via CFA, with the remainder of the data set.
If this doesn't address your concern, perhaps you could elaborate.
Good luck with your work.
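The split-sample approach described above can be sketched as a reproducible random partition of row indices, one part for the EFA and the remainder for the CFA. The function name `split_sample`, the 50/50 fraction, and the sample size of 400 are illustrative assumptions only.

```python
import random


def split_sample(n_rows: int, efa_fraction: float = 0.5, seed: int = 0):
    """Return two disjoint, sorted lists of row indices: (EFA part, CFA part)."""
    rng = random.Random(seed)  # fixed seed so the split can be reported and reproduced
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_efa = round(n_rows * efa_fraction)
    return sorted(indices[:n_efa]), sorted(indices[n_efa:])


# Hypothetical sample of 400 respondents.
efa_rows, cfa_rows = split_sample(400)
print(len(efa_rows), len(cfa_rows))
```

Because the two index lists never overlap, the CFA is run on data that played no part in deriving the EFA solution.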
  • asked a question related to Scale Development
Question
9 answers
SSQ (Simulator Sickness Questionnaire) is known to have a complex factor structure, with items loading on multiple dimensions.
In the original study (Kennedy et al., 1993), it is stated that "The N, O, and D scores are then calculated from the weighted totals using the conversion formulas given at the bottom of the table."
Those formulas are:
Nausea = [ Sum obtained by adding symptom scores ] x 9.54
Oculomotor = [ Sum obtained by adding symptom scores ] x 7.58
Disorientation = [ Sum obtained by adding symptom scores ] x 13.92
Total Severity = (Nausea + Oculomotor + Disorientation) x 3.74
It is not clear from the article how those multipliers (9.54, 7.58, 13.92 and 3.74) were derived.
Question A: How did they derive those multipliers?
I am working on a Turkish translation of SSQ, and my results are promising. However, it looks like I need to remove some items, and make some changes in scoring.
The attached file contains a comparison of the factor weights from my results and Kennedy et al.'s original work, alongside Balk et al.'s (2013) results from some driving simulator experiments. My results are more similar to the Kennedy et al. study than to the Balk et al. study.
The data were collected from 84 participants, each of whom completed 2 different VR game sessions. The SSQ-TR factor analysis used principal components with varimax rotation, and 3 factors emerged under the eigenvalue > 1 criterion.
Question B: I am seeking suggestions for factoring the SSQ-TR.
I have some ideas on removing some items and re-adjusting item/load structure, indicated on the shared spreadsheet.
References
Kennedy, R. S., Lane, N. E., Berbaum, K. S., & Lilienthal, M. G. (1993). Simulator Sickness Questionnaire: An Enhanced Method for Quantifying Simulator Sickness. The International Journal of Aviation Psychology, 3(3), 203–220. doi:10.1207/s15327108ijap0303_3
Balk, S. A., Bertola, M. A., & Inman, V. W. (2013). Simulator sickness questionnaire: Twenty years later.
Relevant answer
Answer
So, for the chart that I posted above, the following would be the scores?
N (general discomfort + increased salivation + sweating + nausea + difficulty concentrating + stomach awareness + burping) = 1+1+3+2+2+0+1 = 10 x 9.54 = 95.4
O (General discomfort + Fatigue + Headache + Eye strain + Difficulty focusing + Difficulty concentrating + Blurred vision) = 1+2+1+1+0+2+1 = 8 x 7.58 = 60.64
D (Difficulty focusing + Nausea + Fullness of head + Blurred vision + Dizzy (eyes open) + Dizzy (eyes closed) + Vertigo) = 0+2+3+1+2+1+1 = 10 x 13.92 = 139.2
Total severity = (95.4+60.64+139.2) x 3.74 = 1104.2
Which doesn't seem correct?
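A possible source of the implausible total is which sums go into the last formula. The sketch below recomputes the example under the assumption that total severity is built from the unweighted raw subscale sums (the bracketed "sum obtained by adding symptom scores" terms) rather than from the already-weighted N, O and D scores. That reading should be checked against the scoring table in Kennedy et al. (1993). The symptom ratings are copied from the worked example above, and the variable names are illustrative.

```python
# Symptom ratings copied from the worked example above.
ratings = {
    "general discomfort": 1, "fatigue": 2, "headache": 1, "eye strain": 1,
    "difficulty focusing": 0, "increased salivation": 1, "sweating": 3,
    "nausea": 2, "difficulty concentrating": 2, "fullness of head": 3,
    "blurred vision": 1, "dizzy (eyes open)": 2, "dizzy (eyes closed)": 1,
    "vertigo": 1, "stomach awareness": 0, "burping": 1,
}

# Items per subscale and the multipliers quoted in the question.
subscales = {
    "nausea": (9.54, ["general discomfort", "increased salivation", "sweating",
                      "nausea", "difficulty concentrating", "stomach awareness",
                      "burping"]),
    "oculomotor": (7.58, ["general discomfort", "fatigue", "headache",
                          "eye strain", "difficulty focusing",
                          "difficulty concentrating", "blurred vision"]),
    "disorientation": (13.92, ["difficulty focusing", "nausea",
                               "fullness of head", "blurred vision",
                               "dizzy (eyes open)", "dizzy (eyes closed)",
                               "vertigo"]),
}

raw_sums = {name: sum(ratings[item] for item in items)
            for name, (_, items) in subscales.items()}
scores = {name: raw_sums[name] * weight
          for name, (weight, _) in subscales.items()}

# Assumption: the total uses the raw sums, not the weighted subscale scores.
total_severity = sum(raw_sums.values()) * 3.74

print(raw_sums)
print({name: round(value, 2) for name, value in scores.items()})
print(round(total_severity, 2))
```

With these ratings the raw sums are 10, 8 and 10, so under this reading the total is 28 x 3.74 = 104.72, whereas multiplying the sum of the weighted subscale scores by 3.74 gives the 1104.2 obtained above.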
  • asked a question related to Scale Development
Question
8 answers
These items are taken from WVS7 questionnaire to find religious tolerance :
  1. Do you trust people of other religion?
  2. Whenever science and religion conflict, religion is always right
  3. The only acceptable religion is my religion
  4. Mention if you would not have people of a different religion as your neighbor
All of them are on a 4-point scale. A higher value is meant to indicate higher tolerance. The alpha value is below 0.2.
What should be done? Should I carry on ignoring the alpha? Is alpha even appropriate in this case?
Relevant answer
Answer
It is not clear to me that all your items are in the same direction. To be certain, you should examine a correlation matrix to check whether you have only positive correlations.
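The check suggested above can be sketched as follows: inspect the sign of the inter-item correlations, reverse-code any item worded in the opposite direction, and recompute alpha. The eight response patterns are fabricated purely for illustration and are not WVS data. Which items need reversing (here items 2 to 4, on the assumption that agreement with them signals lower tolerance) depends on how the items were actually coded.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cronbach_alpha(items):
    """Cronbach's alpha; items is a list of score lists, one list per item."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - sum(variance(i) for i in items) / variance(totals))


def reverse_code(scores, low=1, high=4):
    """Flip a 1-4 item so that high values point in the opposite direction."""
    return [low + high - s for s in scores]


# Fabricated responses from 8 hypothetical respondents (4-point items).
item1 = [1, 2, 2, 3, 3, 4, 4, 1]  # trust in people of other religions
item2 = [4, 3, 4, 2, 2, 1, 2, 4]  # "religion is always right"
item3 = [4, 3, 3, 2, 1, 1, 1, 3]  # "only acceptable religion is mine"
item4 = [3, 4, 3, 2, 2, 1, 1, 4]  # would not want such neighbours

r_12 = pearson(item1, item2)  # a negative value flags opposite wording
alpha_raw = cronbach_alpha([item1, item2, item3, item4])
alpha_recoded = cronbach_alpha(
    [item1, reverse_code(item2), reverse_code(item3), reverse_code(item4)]
)
print(round(r_12, 2), round(alpha_raw, 2), round(alpha_recoded, 2))
```

In this toy data the raw alpha is negative and the recoded alpha is high, which shows how mixed item direction alone can produce a very low alpha. A low alpha that survives recoding would instead point to items that do not share a common factor.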
  • asked a question related to Scale Development
Question
1 answer
I have found only one valid 40-item scale, developed by De Vries et al. in 2012. I need a shorter one. I would be glad if someone could provide me with a shorter scale.
Thanks
Relevant answer
Answer
Mortaza Bonjakhi The "Teacher Professional Development Impact Scale" established by Rossman and Rolheiser (2016) is a shorter scale for measuring teachers' continuous professional development that you may find useful. It consists of 12 items that examine how professional development affects instructors' knowledge, practice, and student learning. Another alternative is the Darling-Hammond and McLaughlin (1995) "Impressions of Professional Development Scale," which comprises 15 items and examines teachers' perceptions of professional development design and execution. Both scales have fewer items than De Vries et al. (2012).
  • asked a question related to Scale Development
Question
1 answer
Hi researchers
I would like to use Seidman and Zager's Teacher Burnout Scale, developed in 1987. Does anyone have a copy of this questionnaire?
Relevant answer
Answer
  • asked a question related to Scale Development
Question
5 answers
I am conducting research that involves scale development for emotional experiences with Wanghong (Internet-famous) restaurant dining consumption. Following the steps in prior literature, I have already conducted interviews and an expert review of the measurement scales. It is interesting to see that the emotional experiences may be categorized into three stages: pre-, during-, and post-dining experience. I have conducted the first study, with the objective of purifying the scale. I ran one EFA on all the measurement items without considering the three stages, and four factors emerged. To reflect the finding that emotional experiences differ across the three stages, I think three separate EFAs should perhaps be conducted. It seems to me that the first way is more methodologically correct, while the second way is more theoretically or conceptually correct. I would appreciate it if anyone could give me some advice on this! Thanks a lot!
Relevant answer
Answer
Yes, I do.
With our appreciation for your scientific efforts, doctor,
  • asked a question related to Scale Development
Question
7 answers
Dear Colleagues,
I am looking for cross-cultural studies on identity development. I have found various studies using the Utrecht-Management of Identity Commitments Scale. However, I struggle with finding any studies using the Dimensions of Identity Development Scale developed by Koen Luyckx. I've tried to find something in Google Scholar, but the outcomes of my research were poor. If you know about this kind of study, let me know.
Kind regards,
Kamil Janowicz
Relevant answer
Answer
Thanks, Kamil Janowicz! That's a good reason to do this kind of research. I'm looking forward to seeing your study!
Theo
  • asked a question related to Scale Development
Question
4 answers
Dear researchers,
I've been working on two different scales for two separate studies. I'm not getting the expected results from Exploratory Factor Analysis (EFA). Confirmatory Factor Analysis, on the other hand, provides me with results that confirm the construct's validity. Given this, do I need to perform EFA in the following cases?
1. A study of scale development based on Bandura's (1977) self-efficacy model. Bandura theorizes that efficacy expectations vary on several dimensions, including magnitude, generality, and strength. As a result, I accepted these dimensions as scale sub-factors and wrote some items about them.
2. A scale development study based on curriculum learning objectives. There are four units in this curriculum. Each unit has its own set of learning objectives. I wrote some items around these learning objectives. I wanted to learn students' perspectives on how well they met these learning objectives, so I accepted each unit as a dimension (sub-factor) of the scale in CFA...
Source: Bandura, A. (1977). Self-efficacy: toward a unifying theory of behavioral change. Psychological review, 84(2), 191.
Relevant answer
Answer
In my opinion, to test the theoretical/hypothesized factor structure of a scale that you developed, you do not need exploratory factor analysis (EFA). EFA is mostly for exploring the number of factors. In your case, you would have clear a priori assumptions about the number of factors and the loading pattern (i.e., which item/variable is supposed to measure which factor). Therefore, confirmatory factor analysis is the method of choice to test the hypothesized factor structure.
  • asked a question related to Scale Development
Question
3 answers
Hi everyone,
As part of my PhD I'm validating a patient-reported disease severity scale for patients with a rare condition. It assesses severity across 5 symptom groups using a 0-5 Likert rating. It has been adapted from a previously validated clinician-reported version to form a lay-reported version so that patients can report their own disease severity. The symptom groups are the same, but the response options are worded differently in the two questionnaires, which means this version needs validating. Initial testing of the questionnaire suggests there isn't much differentiation across the response options on most of the items. I was thinking about interviewing patients, amending the questionnaire and then running some quantitative analyses to validate the scale.
I'm looking at using IRT, as the scale is not to be utilized in clinical settings; there is already a validated clinician-reported tool to measure disease severity in the population. However, the main problem I face is that the patient population is incredibly small and I'm unlikely to get more than, say, 100 participants. Everything I'm reading on scale development says I need a lot more data, otherwise the analysis won't have sufficient power.
Has anyone got any experience validating questionnaires using small sample sizes or has any advice regarding different validation strategies?
Many thanks!
Relevant answer
Answer
Before testing the content domain, I strongly suggest reading the following paper. If the conceptualization is based on well-developed measurement theory, then a small sample size is sufficient for your scale validation:
Specifying the Problem of Measurement Models Misspecification in Management Sciences Literature
  • asked a question related to Scale Development
Question
3 answers
For convenience, I collected data from a single large sample for scale development, and then I randomly split it into two samples for EFA and CFA.
In this case, I am wondering which sample (the total sample or the CFA sample) should be used to evaluate the criterion validity and reliability of the newly developed scale.
Relevant answer
Answer
In general, I am in favor of using as much data as possible for reliability and validity assessment. However, if you choose to split your sample and use the first half of the data for purely exploratory purposes (EFA, determination of the number and nature of factors), I don't find it logical to then combine the data again at the end. It seems more logical to me in that case to only use the second portion of the sample for reliability estimation & validation as I view it as part of the second, "confirmatory" step.
This is one downside of the split-sample approach. You "lose" data/power for the "confirmatory" step. In my view, the exploratory step can often be avoided in scale development. Typically, when we develop new scales, we already have a fairly clear theory about which factor(s) should be measured by which variable/item. If that is true in your case, you may not need to explore the number of factors. In other words, you could avoid EFA altogether and use only CFA on the entire sample (no split) to begin with.
  • asked a question related to Scale Development
Question
3 answers
Hello!
I am in the process of adapting the questionnaire for my research. The data on one of the scales has a strong deviation from the normal distribution - 64% of the answers have the lowest possible value (the sample consisted of 282 people). The rest of the responses were also distributed towards the low values. There were 3 items on the scale. A Likert scale of 1 to 7 was used for responses.
However, it seems to me that the construct being measured implies such a result in this sample.
Can you tell me whether such a scale can be considered usable? Or is it too heavily skewed? Are there any formal criteria for this?
I've already done a CFA, and I've checked the test-retest reliability and convergent validity (the results are acceptable).
Thank you very much in advance for your answers.
Relevant answer
Answer
It depends on what statistical analyses/estimators you are planning to use. For example, for conventional CFA/SEM with maximum likelihood (ML) estimation (assuming continuous variables), extreme skew could be a problem because ML estimation rests on the assumption of multivariate normality of the variables. Although robust ML estimators are available, skewness could still be a problem because the inter-item covariances/correlations can be affected/biased by skewness (and, as a consequence, also the factor structure). In that situation, it may be useful to try out alternative estimation methods (e.g., WLSMV estimation using polychoric correlations vs. ML estimation) and conduct sensitivity analyses (i.e., checking how much the results differ across different estimation methods).
I recommend studying the extensive literature on non-normality and use of categorical (ordinal) variables in CFA/SEM. A Google Scholar search using terms like "ordinal items CFA", "non-normal data CFA" and the like will point you to the relevant papers.
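Before comparing estimators, it can help to quantify how severe the floor effect and skew actually are. The sketch below computes the share of responses at the lowest category and a moment-based skewness coefficient. The response distribution is hypothetical, constructed only so that roughly 64% of 282 responses fall on the lowest value, and there is no formal cutoff implied by either number.

```python
from statistics import mean


def floor_proportion(scores, floor):
    """Share of responses sitting at the lowest possible value."""
    return sum(1 for s in scores if s == floor) / len(scores)


def sample_skewness(scores):
    """Moment-based skewness g1 = m3 / m2**1.5 (no small-sample correction)."""
    m = mean(scores)
    m2 = mean([(s - m) ** 2 for s in scores])
    m3 = mean([(s - m) ** 3 for s in scores])
    return m3 / m2 ** 1.5


# Hypothetical counts per response category (1-7), totalling 282 respondents.
counts = {1: 180, 2: 50, 3: 25, 4: 15, 5: 7, 6: 3, 7: 2}
scores = [value for value, n in counts.items() for _ in range(n)]

print(len(scores))
print(round(floor_proportion(scores, floor=1), 3))
print(round(sample_skewness(scores), 2))
```

These descriptive figures do not decide whether the scale is usable; they only document the distribution so that the sensitivity analyses across estimation methods can be interpreted against it.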
  • asked a question related to Scale Development
Question
4 answers
Could you please elaborate on the specific differences between scale development and index development (based on formative measurement) in the context of management research? Is it essential to use only the pre-defined or pre-tested scales to develop an index, such as brand equity index, brand relationship quality index? Suggest some relevant references.
Relevant answer
Answer
Kishalay Adhikari, you might find some useful information in Chapter 12 of the following book:
Hair, J. F., Babin, B. J., Anderson, R. E., & Black, W. C. (2019). Multivariate data analysis (8th ed.). Cengage.
I think that some of this chapter could have been written a bit more effectively, but overall it is helpful in drawing distinctions between scales and indexes.
All the best with your research.
  • asked a question related to Scale Development
Question
10 answers
I have "value co-creation behavior" variable (Yi and Gong, 2013) in my research model. It is the product of a scale development paper (Yi and Gong, 2013) and a third-order construct. When I test it as a second-order construct, there is no any problem. But in its original paper, it is specified as a third-order construct. So, I would like to test it as it is but do not know how to make it on AMOS. If you have experience in testing third-order constructs on AMOS, Could you please help me with it together with the images from the program?
Thanks in advance.
Yi, Y. and Gong, T. (2013). Customer value co-creation behavior: Scale development and validation. Journal of Business Research, 66, 1279-1284.
Relevant answer
Answer
It would be appropriate to use SmartPLS, or to use AMOS add-ins, to extract measurement validity results. For more, watch this video.
  • asked a question related to Scale Development
Question
5 answers
Exploratory Factor Analysis and Confirmatory Factor Analysis are used in scale development studies. Rasch Analysis method can also be used in scale development. There are some researchers who consider the Rasch Analysis as up-to-date analysis. Frankly, I don't think so, but is there a feature that makes EFA, CFA or Rasch superior to each other in Likert type scale development?
Relevant answer
Answer
Rasch is better for a unidimensional scale
  • asked a question related to Scale Development
Question
5 answers
Hello, I am new to R so a little stuck. I have 164 items as part of a scale development project (n=1271), and I want to set a cutoff of .40 for factor analysis. I tried using logical operators and used this script
loload <- tempdf1 %>% filter(Q1.0<.40) which set up the new datafile for 'loload' but didn't put any data in there. I then tried using this script with all 164 items separated by a comma, which returned an error message.
I'm quite stuck; numerous google searches don't offer a lot unless I want to do things to one specific variable.
Any help is appreciated.
Relevant answer
Answer
I'm not sure I understand your question. You have used code from a package without naming it, have not shown what things like Q1.0 are, and have not included a minimal working example, so you should add these. Say your 164 items are in a matrix X and you want all values less than .4 turned into missing values:
X[X < .4] <- NA
But you have thrown the phrase "factor analysis" into your question as if it were relevant. It might make sense to delete this question and start over in order to get more useful responses.
  • asked a question related to Scale Development
Question
1 answer
Looking for the questionnaire on EMOTIONAL STABILITY SCALE developed by CHATURVEDI (2010) with description.
Relevant answer
Answer
I think you may directly contact at following email for the tool:
mamtarameshpsych@
  • asked a question related to Scale Development
Question
4 answers
Dear Sir/Ma'am,
Greetings!
I developed a psychometric tool consisting of 95 items and shared it with 5 experts for content validation. 51 items were removed after the content validation process. According to Polit & Beck (2006) and Polit et al. (2007), the acceptable CVI value should be 1. The 51 items did not reach the required CVI value, so 44 items were retained.
My question is: should I circulate the questionnaire with the 44 retained items to students for criterion-related validity, or should I go with the 95 original items?
If I go with 95 items, then what is the use of doing content validation or computing the CVI? Am I following the right path if I decide to go with the 44 retained items? Please advise.
Big Thank you,
Narottam Kumar
Relevant answer
Answer
I don't understand why you would want to include the items that were rejected by the experts, especially given the issues with CVI. Can you please explain why you are considering this option?
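For readers unfamiliar with how the item-level CVI (I-CVI) in the question is usually computed, here is a minimal sketch. It assumes the common convention of a 4-point relevance scale on which a rating of 3 or 4 counts as "relevant"; the item names and ratings are made up for illustration and are not the asker's data.

```python
def item_cvi(ratings, relevant_min=3):
    """I-CVI: proportion of experts who rated the item as relevant.

    Assumes a 4-point relevance scale where ratings >= relevant_min
    (3 or 4) count as "relevant".
    """
    return sum(r >= relevant_min for r in ratings) / len(ratings)


# Hypothetical ratings from 5 experts (illustrative data only).
expert_ratings = {
    "item_01": [4, 4, 3, 4, 3],
    "item_02": [4, 2, 3, 4, 3],  # one expert rated this item not relevant
    "item_03": [3, 3, 4, 4, 4],
}

cutoff = 1.0  # the threshold the asker applied with 5 experts

i_cvi = {item: item_cvi(r) for item, r in expert_ratings.items()}
retained = [item for item, value in i_cvi.items() if value >= cutoff]

# Scale-level CVI (averaging method) over the retained items.
s_cvi_ave = sum(i_cvi[item] for item in retained) / len(retained)

for item, value in i_cvi.items():
    print(f"{item}: I-CVI = {value:.2f}")
print("Retained:", retained, "S-CVI/Ave =", round(s_cvi_ave, 2))
```

With a cutoff of 1.00, a single dissenting expert is enough to drop an item, which is one reason so many items can be lost with a panel of only five raters.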
  • asked a question related to Scale Development
Question
5 answers
I am a research student. I noticed that many scholars use scales developed in a Western context. However, many constructs manifest differently in different cultures. Thus, I have to develop scales myself. After reading some papers related to scale development, I found the "item generation" stage the most difficult. I invite researchers who have experience in scale development to share their tips on item generation and other stages. For example, is interviewing the only approach to generating items? Are there any tips for item generation? Are there any other suggestions for scale development?
Thanks !!!
Relevant answer
Answer
Hi,
There might be some constructs which can be applied universally. Regarding the contextual, geography specific nature of constructs, it depends on the domain and research problem and research area.
I have developed the scale for project success in Indian context.
Yes, item generation is one of the crucial stages in scale development, since choosing the wrong items can result in an ill-fitting scale. Interviewing is NOT the only approach to generating items. Again, it depends on the research problem. If you are addressing a research problem for which there is no previous theory (i.e., you are aiming for theory generation or theoretical framework generation), then you can go for qualitative interviews, since they have the potential to generate insights and concepts and expand our understanding.
However, since you mentioned that constructs manifest differently in different cultures, it seems that theory already exists and you are addressing the research problem in a different context. For this, the best way is an extensive review of the literature to collect the scale items and their latent dimensions. From this pool of items you can short-list the constructs and related scale items, after realigning them and removing highly similar items, to start the process of scale development in your contextual setting. For the scale development procedure you can refer to DeVellis (2013) and Boateng et al. (2018).
Virender Kumar
  • asked a question related to Scale Development
Question
5 answers
Hello,
Using factor analysis, I recently created a social questionnaire with four factors, each containing four items. My next step is validating the questionnaire.
I want to show that this questionnaire can differentiate between 3 product categories that have different social characteristics.
- What statistical method should I use to prove the differences?
- Should I ask the same respondents to evaluate all 3 products? or it is ok to have separate respondents for each product category.
Thank you
Relevant answer
Answer
I also recommend that the sample you use to check the validity of your questionnaire be separate from the experimental sample you use to differentiate between products.
  • asked a question related to Scale Development
Question
3 answers
This 2022 article provides extensive details of the steps involved in a robust scale (measurement) study -
Syed Mahmudur Rahman, Jamie Carlson, Siegfried P. Gudergan, Martin Wetzels, Dhruv Grewal. (2022). Perceived Omnichannel Customer Experience (OCX): Concept, measurement, and impact. Journal of Retailing.
Open access; freely available at https://doi.org/10.1016/j.jretai.2022.03.003
See Table 1 for an overview of the steps.
Relevant answer
Answer
well written scale
  • asked a question related to Scale Development
Question
3 answers
Hello, it will be a great help, if anyone would help me find the scale and the scoring of the levels of emotional awareness scale developed by Richard D Lane in 1990. Thank you in anticipation.
Relevant answer
Answer
Try PsyToolkit.org. Good luck with your search.
  • asked a question related to Scale Development
Question
16 answers
Hi,
I have found a questionnaire prepared in English. I want to translate it into Turkish and then administer it. But I am not sure whether I should analyze its validity and reliability, because it is not a scale. I hope I can pick your brains. Thank you in advance.
Relevant answer
Thank you for the interesting question. In my opinion, experts consider content validity and reliability useful and important whether or not you have modified the questions. This also applies when the instrument is translated into a different language.
Kind Regards,
  • asked a question related to Scale Development
Question
3 answers
I developed the scale using a qualitative approach (projective technique and depth interviews). Can I complement the scale development procedure with the recommendations given by Carpenter (2018)? Please suggest.
Also, please shed light on the scale development procedure prescribed by Carpenter (2018).
Relevant answer
Answer
Imran Anwar, thanks for providing the citation. I've read the article just now and think it's quite good in a number of respects. However, in many ways I think a novice reader would be little better off for having read it. Certainly, I know what the author is getting at only because I am previously aware of those things.
Apart from that, the article needs a bit of editing. For example, there are three problems in the following:
"Two-itemed scales are only recommended if items are highly correlated (i.e., r < .70)."
But, again, thanks for being so helpful.
  • asked a question related to Scale Development
Question
11 answers
Dear researchers
I developed a scale with 6 items and I want to compare it with a widely used scale with 4 items. These two scales have similar validity and reliability, and in the EFA they load on the same factor. In addition, I have run a CFA with these two scales (or, better said, these two latent variables), and the correlation between them is 1. Finally, I ran further CFAs relating two other latent variables to these scales (separately, that is, one at a time), and the correlations between the latent variables and the scales were similar, though not identical.
Are there any suggestions on how I should proceed?
Thank you
Relevant answer
Answer
Strictly speaking, it depends on what exactly you mean by "equivalent" because there are different levels and definitions of "equivalence" of latent variables in measurement theory, for example, tau equivalence versus essential tau equivalence versus tau congenericity in classical test theory.
"Less strictly speaking" (which may be enough for the purposes of your study), if you can demonstrate that the observed variables in question load highly on the same factor in a CFA or EFA and/or that two factors for these item sets are perfectly or close to perfectly correlated (like you have already done), then I would say it's fairly plausible that these variables measure very similar attributes or even the exact same attribute.
  • asked a question related to Scale Development
Question
1 answer
Hello,
I am currently working on the discriminant validity on my scale development.
I have a second-order model that consists of four first-order latent variables and one global second-order variable (as you can see in the figure). I also have other related constructs that are needed for the discriminant validity analysis.
When I calculate the average variance extracted, do I need to use the factor loadings and variances from a model that includes the related constructs, or from one without them?
Looking forward to your answer!
Relevant answer
Answer
Hyungwoo Oh, your CFA model should include all your latent variables, both exogenous and endogenous. You then calculate AVE based on the relationships (loadings) between the latent variables and their respective indicators/items.
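As a rough illustration of that calculation: with standardized loadings, AVE for a factor reduces to the mean of its squared loadings, and the Fornell-Larcker check compares the square root of each AVE with the correlation between the latent variables. The loadings and the latent correlation below are invented for the example, and the factor names are hypothetical; in practice they would come from the full CFA that includes the related constructs.

```python
from math import sqrt


def average_variance_extracted(std_loadings):
    """AVE from standardized loadings: mean of the squared loadings."""
    return sum(loading ** 2 for loading in std_loadings) / len(std_loadings)


# Hypothetical standardized loadings taken from one CFA containing both factors.
loadings = {
    "factor_a": [0.80, 0.70, 0.60],
    "factor_b": [0.90, 0.80, 0.70],
}
latent_correlation = 0.55  # hypothetical estimated correlation between the factors

ave = {name: average_variance_extracted(l) for name, l in loadings.items()}

# Fornell-Larcker criterion: sqrt(AVE) of each factor should exceed
# its correlation with the other factor.
discriminant_ok = all(sqrt(value) > abs(latent_correlation) for value in ave.values())

for name, value in ave.items():
    print(f"{name}: AVE = {value:.3f}, sqrt(AVE) = {sqrt(value):.3f}")
print("Fornell-Larcker criterion met:", discriminant_ok)
```

This sketch does not address how to treat the second-order factor; it only shows the arithmetic for first-order factors.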
  • asked a question related to Scale Development
Question
3 answers
Hello,
I was wondering if anyone would be willing to share any literature/knowledge on determining item order in scale development.
For context, we have developed and tested a scale. Question order was randomized in Qualtrics and the scale has since been validated. I have read several papers on scale development and validation, but they don't discuss item order within individual questionnaires at depth or at all. As the order of the items in a questionnaire should be consistent when used in the future, I was wondering at what stage this order is decided and how it can be tested (e.g., in terms of validity and reliability).
Thank you kindly in advance!
Kat Schneider
Relevant answer
Answer
Hello Jekaterina,
In surveys, one common consideration associated with item/question/stimulus order is the advice to separate sensitive or controversial probes from others with more neutral items, and to begin the survey with more neutral items.
In fixed-forms achievement testing (same test given to all), item order is commonly recommended to follow an "easy to hard" sequence, based on the challenge or demand of the questions. The rationale for this was not to stun an examinee at the outset of a test with overly hard items. However, a number of studies comparing randomized to difficulty-sequenced items have shown negligible differences.
Response-contingent testing (computer-administered, usually following an IRT model) is, in fact, a method that purposively selects items based on their level of challenge. In the absence of other information, one would start administering a "moderate" difficulty item/question/task to the examinee, then adjust the subsequent item based on whether the given answer was correct (harder items follow correct responses; easier items follow incorrect responses). So, this ad hoc sequencing method (adapting to the proficiency of the respondent) is pretty common. This tactic leads to a very efficient process for estimation of proficiency.
Since your initial analysis was based on a randomized order of items and appears to have functioned satisfactorily, there's probably little need to consider changing that going forward.
Good luck with your work.
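The response-contingent sequencing described above can be sketched with a toy rule: start at a moderate item, then move to the nearest harder item after a correct response and the nearest easier item after an incorrect one. This is a simplified illustration only; operational adaptive tests typically select items from an IRT-based ability estimate, which this sketch does not do. The difficulty values and the simulated examinee are assumptions made up for the example.

```python
def next_item(remaining, current_difficulty, last_correct):
    """Nearest harder item after a correct answer, nearest easier one otherwise."""
    if last_correct:
        harder = [d for d in remaining if d > current_difficulty]
        return min(harder) if harder else max(remaining)
    easier = [d for d in remaining if d < current_difficulty]
    return max(easier) if easier else min(remaining)


def administer(difficulties, answers_correctly):
    """Return the order in which items (identified by difficulty) are presented."""
    remaining = sorted(difficulties)
    current = remaining[len(remaining) // 2]  # start with a moderate item
    remaining.remove(current)
    order = [current]
    while remaining:
        correct = answers_correctly(current)
        current = next_item(remaining, current, correct)
        remaining.remove(current)
        order.append(current)
    return order


# Hypothetical item difficulties and an examinee who answers
# correctly whenever the difficulty is at most 0.5.
item_difficulties = [-2, -1, 0, 1, 2]
order = administer(item_difficulties, lambda d: d <= 0.5)
print(order)
```

For an attitude scale with randomized item order, as in the question, none of this is needed; the sketch only makes the "harder after correct, easier after incorrect" logic concrete.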
  • asked a question related to Scale Development
Question
6 answers
Greetings, researchers. I am a Ph.D. scholar and I would like to read a Ph.D. thesis that could guide me on the correct methodology for developing a scale or tool to measure anxiety. I will be grateful for your cooperation in this respect. Thanks.
Relevant answer
Answer
The simplest approach would be a self-report inventory. If you want to construct your own such scale, begin by examining widely accepted descriptions of anxiety - for example, in the International Classification of Diseases and/or David Barlow's "Anxiety and its disorders" and/or psychodynamic writings on the subject. Extract specific statements about the inner experience and reported symptoms. Use them as your item pool. Aim for about twice as many items as you will want in the final scale. Administer this set of items to a pilot sample. Run item analyses (such as corrected item-whole correlations) to eliminate weak items, then have a fresh sample complete the revised scale. You'll also want to think of a good validity study.
Do remember that such inventories already exist. If there is not a good one in Urdu, then maybe a translation project would make more sense. Include collection of local norms and it would be valuable!
An alternative would be to work on a semi-structured interview and a set of clinical ratings; something like the Hamilton Rating Scale.
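The item analysis step mentioned above (corrected item-whole correlations) can be sketched as follows. Each item is correlated with the sum of the remaining items, so the item does not inflate its own correlation. The response matrix is invented pilot data, and the 0.30 screening threshold is a common rule of thumb used here as an assumption, not a fixed standard.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(responses):
    """Correlate each item with the total of the other items.

    responses: list of respondents, each a list of item scores.
    """
    n_items = len(responses[0])
    correlations = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]
        correlations.append(pearson(item, rest))
    return correlations


# Hypothetical pilot data: 6 respondents x 4 items (illustrative only).
pilot = [
    [1, 2, 1, 3],
    [2, 2, 2, 1],
    [3, 3, 3, 4],
    [4, 4, 3, 2],
    [4, 5, 4, 3],
    [5, 5, 5, 1],
]

threshold = 0.30  # assumed screening cutoff
r_values = corrected_item_total(pilot)
weak_items = [j + 1 for j, r in enumerate(r_values) if r < threshold]

for j, r in enumerate(r_values, start=1):
    print(f"item {j}: corrected item-total r = {r:.2f}")
print("Candidates for removal:", weak_items)
```

A real pilot sample would of course be far larger than six respondents; the tiny matrix is only there to keep the arithmetic visible.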
  • asked a question related to Scale Development
Question
5 answers
Can I use a combination of a vignette (for one variable, i.e., the dependent variable) with a self-report survey questionnaire (for all other variables: IVs, mediators, and moderators)? If I can, what types of analysis and software may I use? If I can't, what should I do? (Scale development is not a good solution, and no survey scale for that dependent variable has been used or is available in previous research.) In other words, can I use a vignette for one variable in combination with self-report scales for all other variables? It is, in a sense, a mix of experimental and self-report methodology.
Relevant answer
Answer
Hi Ijaz,
I am assuming you would be using the vignette to set the stage for asking questions or seeking responses about it. If that is the case, the answer is yes, you can use a vignette as part of a survey questionnaire. Regarding what analysis or analyses you could use, that depends on the nature of your sampling and responses, the research questions you are asking, and the viability of your assumptions. Analyses could range from simple frequencies to much more complex statistics and even qualitative methods.
Good luck,
J. McLean
  • asked a question related to Scale Development
Question
3 answers
Dear RG Community, Can you please share how to do Item Analysis in scale development? I have developed a scale, now need to do the item analysis before try out.
Relevant answer
Answer
Maria Dolores Tolsá Thank you very much for your kind answer. I will love to collaborate for my ongoing project in the research publication stage.
  • asked a question related to Scale Development
Question
3 answers
I am trying to make a risk assessment type scale that measures the likelihood/risk of someone experiencing a psychological disorder in a setting. There are about 10 factors (e.g., gender, sleep quality, self-esteem, resilience, loneliness) that increase the likelihood and I would like to include these in the new scale. There are existing validated scales that examine each of these factors in the literature, but these are relatively long and including all these scales would make the new scale too long.
My plan at the moment is to pick 4-5 best questions (for confirmatory factor analysis later on) from each of these existing scales and adapt them to suit the setting I am interested in, and run the new scale with their full-length counterparts. As 11 scales (10 existing and 1 new one) would be difficult to complete in one sitting, I was thinking of breaking it down to 2 or 3 different testing sessions, preferably with the same participants, but if not possible, with different participants. Is this a reasonable approach to take?
I am aware that there are many books/articles about scale developments (e.g., Scale development: theory and applications by DeVellis), but I feel that these either target single construct (unidimensional) or multidimensional construct in which the subscales sort of “form” the multidimensional construct (e.g., personality with openness, neuroticism, agreeableness, conscientiousness and extraversion), and it is difficult to find the information on the type of scale I am trying to make. If anybody could recommend a good reference for the type of scale I am trying make, it would be most appreciated.
Relevant answer
The article is very interesting
  • asked a question related to Scale Development
Question
5 answers
In scale development, we know that Cronbach's α coefficient increases as the number of items increases. Generally, when Cronbach's α is above 0.7, reliability is considered acceptable. When a sub-scale has only 3 items, what value of Cronbach's α is needed for the reliability to be acceptable?
Relevant answer
Answer
Scales with three items can be quite reliable. Even single-item measures, when well-constructed, can be highly reliable.
From my perspective, the issues with Cronbach's alpha stem from its frequent inappropriate application rather than the measure itself. It is logical for longer scales to have higher reliabilities, as errors of measurement have more opportunity to "average out" when there are more items. This is a fundamental principle or "law" of classical test theory/measurement error theory that is also formalized / implied by the Spearman-Brown composite reliability formula. It therefore does not make sense to "criticize" alpha for being larger with an increasing number of items, other things being equal. The problems lie in its incorrect application to multidimensional scales that capture more than one true score variable or to scales that are otherwise not in line with the underlying psychometric model. Cronbach's alpha does not make sense for multidimensional scales; however it is often applied without testing unidimensionality first.
More formally, Cronbach's alpha is only correct for computing composite reliability when the items are continuous/metrical variables and when they are in line with the classical test theory models of tau equivalence or essential tau equivalence. These models are fairly restrictive as they imply unidimensionality (single factor models) of the items and equal factor loadings of all items. However, these models are rarely tested in practice before computing alpha. For items that measure multiple factors or items that have different loadings, alpha does not apply and may lead to incorrect reliability estimates of the composite/scale score. The more items you have, the less likely it is that they will be in line with (essential) tau equivalence and the more likely that they will be multidimensional.
The proper procedure is to first test the models of (essential) tau equivalence using confirmatory factor analysis. If these models fit, then it makes sense to compute alpha. It doesn't matter how many items you have as long as the models fit. Reliability/alpha estimates will tend to be higher for longer scales. This makes good sense.
There is no fixed rule as to how reliable a three-item scale needs to be. It depends on your purpose in using the scale (e.g., is it for individual diagnostic purposes or for research?). When used solely in research, you can use methods such as confirmatory factor analysis and structural equation modeling to correct for measurement error, so that a lower alpha value for a shorter scale may not matter that much (as long as it is not unreasonably low). For diagnostic (e.g., clinical) purposes, this may be different.
The most important thing to do first is to see whether Cronbach's alpha applies in your case at all, that is, test the models of (essential) tau equivalence for the items to see if one or both models will fit your data.
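To make the relationship between scale length and reliability concrete, here is a small sketch of Cronbach's alpha together with the Spearman-Brown formula mentioned above. The three-item response matrix is made-up data, and the sketch does not test (essential) tau equivalence; it only shows the arithmetic that would apply once that model has been found to fit.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha from a respondents x items matrix (list of lists)."""
    k = len(responses[0])
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def spearman_brown(reliability, length_factor):
    """Predicted reliability when the scale is lengthened by length_factor."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Hypothetical responses: 6 respondents x 3 items (illustrative only).
three_item_scale = [
    [2, 3, 2],
    [3, 3, 4],
    [4, 4, 4],
    [4, 5, 5],
    [5, 5, 4],
    [1, 2, 2],
]

alpha = cronbach_alpha(three_item_scale)
print(f"alpha for the 3-item scale: {alpha:.3f}")

# Doubling a scale with reliability 0.60 using comparable items:
doubled = spearman_brown(0.60, 2)
print(f"predicted reliability after doubling: {doubled:.2f}")
```

The Spearman-Brown line shows why longer scales tend to have higher alpha, other things being equal, which is the point made in the answer.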
  • asked a question related to Scale Development
Question
19 answers
What should be done if items do not load in the EFA?
Should they be discarded? Or does that mean they are unique and do not have partners?
What if these items are important or represent our dependent variables? Can we include them as single items?
Relevant answer
Answer
Faten Amer, Usually items not loading on their respective constructs are discarded. However, at times discarded items can be combined to make a new construct.
Robert has already given you advice regarding single items.
Can you share a little more information regarding the EFA, so that we could together see what you have done and how could we help you further? Information such as:
- Which extraction method are you using?
- How was the number of factors identified?
- and, which rotation method was employed?
PCA and EFA are commonly confused, and at times the right settings (above) are not selected. Consider reading the following article, if you have not already done so:
Costello, A. B., & Osborne, J. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical assessment, research, and evaluation, 10(1), 7.
  • asked a question related to Scale Development
Question
8 answers
Which programs can be used for structural equation modeling?
I have used AMOS previously, but the trial period has run out. Are there any other programs available that are free of charge?
Thanks
Relevant answer
Answer
JASP is one of the latest software and it is free. If you need any assistance you can contact
  • asked a question related to Scale Development
Question
6 answers
Hello everyone,
My dependent and independent variables load on the same component when I perform EFA. Also, CFA does not show a good fit when they are combined in one model but separated into different components.
I think this is due to the high correlation between them. So should I separate the questionnaire into different scales, perform EFA/CFA for each scale independently, and then combine them in a path analysis later?
Relevant answer
Answer
Christian,
Thank you.
Yes, I believe this is the problem.
However, I am validating the Balanced Scorecard, which has 5 perspectives; under each perspective I could identify different constructs.
Actually, I find it hard to delete or ignore any perspective.
So I thought, why not split the items into patient real-experience items and patient attitude items, and then identify the constructs of each scale separately?
The CFA and EFA results became much better, and the constructs became more logical.
My theory is that experiences of the 5 Balanced Scorecard perspectives (financial, customer, internal, external, and innovation and knowledge) affect the images of those 5 perspectives, which in turn lead to patient satisfaction and loyalty.
So do you think it is not a good idea to split the items into dependent and independent items and then perform the construct validation for each set separately?
  • asked a question related to Scale Development
Question
11 answers
How can I validate a questionnaire for a small sample of hospitals' senior executive managers?
Hello everyone
-I performed a systematic review for the strategic KPIs that are most used and important worldwide.
-Then, I developed a questionnaire in which I asked the senior managers at 15 hospitals to rate these items based on their importance and their performance at that hospital on a scale of 0-10 (Quantitative data).
-The sample size is 30 because the population is small (however, it is an important one to my research).
-How can I perform construct validation for the items which are 46 items, especially that EFA and CFA will not be suitable for such a small sample.
-These 45 items can be classified into 6 components based on literature (such as the financial, the managerial, the customer, etc..)
-Bootstrapping in validation was not recommended.
-I found a good article with a close idea but they only performed face and content validity:
Ravaghi H, Heidarpour P, Mohseni M, Rafiei S. Senior managers’ viewpoints toward challenges of implementing clinical governance: a national study in Iran. International Journal of Health Policy and Management 2013; 1: 295–299.
-Do you recommend using EFA for each component separately, each containing around 5-9 items, so as to treat each as a separate scale and define its sub-components? I tried this option and it gave good results and sample adequacy, but I am not sure whether it is acceptable. If you can think of other options, I will be thankful if you can enlighten me.
Relevant answer
Answer
Cronbach's alpha differs between the dependent and independent factors.
Run a reliability test.
  • asked a question related to Scale Development
Question
8 answers
How can I validate a questionnaire for hospitals' senior managers?
Hello everyone
-I performed a systematic review for the strategic KPIs that are most used and important worldwide.
-Then, I developed a questionnaire in which I asked the senior managers at 15 hospitals to rate these items based on their importance and their performance at that hospital on a scale of 0-10 (Quantitative data).
-The sample size is 30 because the population is small (however, it is an important one to my research).
-How can I perform construct validation for the items which are 46 items, especially that EFA and CFA will not be suitable for such a small sample.
-These 45 items can be classified into 6 components based on literature (such as the financial, the managerial, the customer, etc..)
-Bootstrapping in validation was not recommended.
-I found a good article with a close idea but they only performed face and content validity:
Ravaghi H, Heidarpour P, Mohseni M, Rafiei S. Senior managers’ viewpoints toward challenges of implementing clinical governance: a national study in Iran. International Journal of Health Policy and Management 2013; 1: 295–299.
-Do you recommend using EFA for each component separately, each containing around 5-9 items, so as to treat each as a separate scale and define its sub-components? I tried this option and it gave good results and sample adequacy, but I am not sure whether it is acceptable. If you can think of other options, I will be thankful if you can enlighten me.
Relevant answer
Answer
After the survey is completed. But it is better to increase the sample size, so that the results will be better.
  • asked a question related to Scale Development
Question
3 answers
ALTS is an altruism scale developed by S.N Rai and S.Singh.
Relevant answer
Answer
Glad you found it.
  • asked a question related to Scale Development
Question
6 answers
Executive functioning = working memory, attention shifting, and inhibition.
Please suggest a scale that assesses all three of them; I am also ready to go with different scales. Kindly mention the authors as well.
I want to use these tests for my M.Phil. thesis, which is unfunded, and I am unable to bear publishers' fees.
Relevant answer
Answer
Yes Talha, you get free access to the MoCA test at https://www.mocatest.org
You will have to demonstrate your competence qualifications to get the required badge.
A useful example of a Functional backup is to ask the client to carry out a simple task AND report back to you through her/his nurse (or email /phone message, etc). Simple tasks like "Please send me your contact information tomorrow", or "Please draw a copy of the shape on this page, then give it to your nurse tomorrow morning", require INITIATIVE, a major executive function skill not tested by simple tests like MOCA where initiative is provided by the tester.
Note that the second task mentioned above requires LESS initiative than the first, since you leave behind a tangible cue for compliance.
Good fortune, Paul
xPsychologist Paul McGaffey, PhD(ABD)
  • asked a question related to Scale Development
Question
5 answers
I am trying to validate a translated version of the PCFS scale, developed by Klok et al. (2020), in my country. Can you suggest how I should proceed? For example:
  • Confirmatory Factor Analysis
  • Comparing the results with another scale evaluating the same domain, e.g., EQ-5D.
Relevant answer
Machado, F. V., Meys, R., Delbressine, J. M., Vaes, A. W., Goërtz, Y. M., van Herck, M., ... & Spruit, M. A. (2021). Construct validity of the Post-COVID-19 Functional Status Scale in adult subjects with COVID-19. Health and quality of life outcomes, 19(1), 1-10.
Jokiniemi, K., Pietilä, A. M., & Mikkonen, S. (2021). Construct validity of clinical nurse specialist core competency scale: An exploratory factor analysis. Journal of Clinical Nursing, 30(13-14), 1863-1873.
Tzivinikou, S., Charitaki, G., & Kagkara, D. (2021). Distance Education Attitudes (DEAS) during Covid-19 crisis: Factor structure, reliability and construct Validity of the brief DEA scale in Greek-speaking SEND teachers. Technology, Knowledge and Learning, 26(3), 461-479.
I hope it helps
Best regards
Ph.D. Ingrid del Valle García Carreño
  • asked a question related to Scale Development
Question
7 answers
I want a beginner friendly book about the process of adapting and validating a scale or health measurement for psychology. If I plan to do a scale validation as a dissertation project, what should I start reading about? Thanks!
Relevant answer
Answer
Anything by Anne Anastasi, a pioneer in psychometrics and test theory whose work is not yet outdated; e.g., her excellent book "Psychological Testing".
  • asked a question related to Scale Development
Question
5 answers
I am in the midst of a scale development project for a psychological construct and wondering if a scale can still be used as a composite score in other analyses if there are four factors and there is no higher order factor. If anyone has insights on implications and/or examples of this type of construct/use, that would be great! Thank you.
Relevant answer
Answer
Hello Jaclyn,
beyond what my predecessors said. Of course you can use a multi-faceted composite for research or practice (it is done all the time when using "indexes"--for instance, in economy) but the ontological properties change. While measurement of anything implies the attribute you measure exists independently from your measurement procedure, the computation of indexes just refers to the mixing process of things into the index and has no surplus meaning. Practically, if you get some estimates involving the index, interpretation is ambiguous.
HTH
Holger
  • asked a question related to Scale Development
Question
5 answers
I am currently conducting research. I want to measure the academic stress of undergraduate students. Does anyone have the original version of the Academic Stress Scale developed by Kohn and Frazer (1986)? Most of the versions I have found are adapted.
Relevant answer
Answer
Here is the reference for the original article that contains it; it also includes a very useful list of bibliographic references:
Kohn, J. P., and Frazer, G. H. (1986). An academic stress scale: Identification and rated importance of academic stressors.
Psychological Reports, 59, 415-426.
Another, but in Spanish: Cabanach, R. G.; Valle, A.; Rodríguez, S.; Piñeiro, I. and Freire, C.: ACADEMIC STRESS COPING SCALE (A-CEA); Ibero-American Journal of Psychology and Health, vol. 1, no. 1, January 2010, pp. 51-64.
Here, on "RG": "Exams-university-and-health-A-study-psychosocial ..." by MGF Javier; ... and on Google you can see in full: images of the original version of the academic stress scale from Kohn and Frazer 1986; on DIALNET: from JP Montero: Academic Stress, College Students, College Stressor Scale, Stress Assessment ... P. Kohn and Gregory Frazer in 1986; P $ SQ.
Finally, the doctoral thesis EMOTIONAL REGULATION AND ACADEMIC STRESS IN STUDENTS (...), with a similar scale: the "CORE" scale and, on the other hand, the "ECEA", which is part of the Academic Stress Questionnaire developed by Cabanach et al. (2008).
Good luck!
  • asked a question related to Scale Development
Question
7 answers
We can say that these factor analysis approaches are generally used for two main purposes:
1) a more purely psychometric approach in which the objectives tend to verify the plausibility of a specific measurement model; and,
2) with a more ambitious aim in speculative terms, applied to represent the functioning of a psychological construct or domain, which is supposed to be reflected in the measurement model.
What do you think of these general uses?
The opinion given by Wes Bonifay and colleagues can be useful for the present discussion:
Relevant answer
Answer
Hello
Factor Analysis is a multivariate statistical technique applied to a single set of variables when the investigator is interested in determining which variables in the set form logical subsets that are relatively independent of one another. In other words, factor analysis is particularly useful to identify the factors underlying the variables by means of clubbing related variables in the same factor:
Thanks
  • asked a question related to Scale Development
Question
4 answers
Do yes/no response scales give worse results in factor analysis than Likert scales? Should we avoid yes/no response scales?
Relevant answer
Answer
depends on measure.
  • asked a question related to Scale Development
Question
3 answers
I am completing a study that intends to use a cognitive load scale for different formats of instruction. The simplest, most common scales are “How difficult was the lesson you just studied?” with a 7-point subjective rating scale developed by Kalyuga, Chandler, and Sweller (1999), ranging from (1) extremely easy to (7) extremely difficult, and “What level of effort did you put into learning the lesson?” with a 7-point subjective rating scale developed by Paas and Van Merriënboer (1993), ranging from (1) extremely low to (7) extremely high.
What is the benefit of having the two questions instead of just the perceived effort question? Is there a benefit to providing the questionnaire just once rather than after each trial? Finally, I'm having a hard time finding the reported reliability/validity information for the instruments. Much appreciated.
Relevant answer
Answer
Cognitive load measures using the dual-task paradigm require a learner to perform two tasks simultaneously. It is assumed that performance for the second task drops when the primary task, i.e., the learning task, becomes more loading.
  • asked a question related to Scale Development
Question
8 answers
I have developed a scale based on a literature review. Many scales already exist for these variables (knowledge management and ICT), and many studies in the literature have proposed various factors, as these are multidimensional scales. I have conducted a CFA to confirm these factors. Please point me to strong citations from the social sciences that support this process, where EFA is not needed for scale development.
Relevant answer
Answer
There you go (random selection, most of my literature is in German):
Stevens, J. P. (2009). Applied multivariate statistics for the social sciences (5th ed.). New York, NY: Routledge.
Hair, J. F., Black, W. C., Babin, B. J. & Anderson, R. E. (2010). Multivariate data analysis: A global perspective (7th ed.). Upper Saddle River, NJ: Pearson.
Field, A. (2018). Discovering statistics using IBM SPSS statistics (5th ed.). London: SAGE.
Kline, R. B. (2011). Principles and practice of structural equation modeling (3rd ed.). New York, NY: Guilford Press.
There are countless articles on both EFA and CFA that discuss this in greater detail.
Furthermore, there are at least three threads on RG that address this issue as well.
Best
Marcel
  • asked a question related to Scale Development
Question
4 answers
Academic Motivation Scale - College Version (AMS-C 28) score interpretation
Relevant answer
Answer
- Núñez Alonso, Juan Luis; Martín-Albo Lucas, José; Navarro Izquierdo, José G.; & Grijalvo Lobera, Fernando (2006). Validación de la Escala de Motivación Educativa (EME) en Paraguay. Revista Interamericana de Psicología / Interamerican Journal of Psychology, 40(3), 391-398. [Retrieved 9 June 2021]. ISSN: 0034-9690. https://www.redalyc.org/articulo.oa?id=28440314
- Núñez Alonso, Juan Luis; Martín-Albo Lucas, José; & Navarro Izquierdo, José Gregorio (2005). Validación de la versión española de la Échelle de Motivation en Éducation. Psicothema, 17(2), 344-349. ISSN 0214-9915. CODEN PSOTEG.
- On "RG": Bruno, Flavia Eugenia; Fernández Liporace, Mercedes; & Stover, Juliana Beatriz (May 2020). Escala de motivación situacional académica para estudiantes universitarios: desarrollo y análisis psicométricos. Interdisciplinaria, 37(1). DOI: 10.16888/interd.2020.37.1.8. License CC BY 4.0.
  • asked a question related to Scale Development
Question
1 answer
I currently have a psychological scale and am ready to gather data for preliminary analyses such as EFA and correlation analyses.
My question is: how can I ensure that the sample is representative of the whole population?
Relevant answer
Answer
A random sample is, on average, representative of the population it is drawn from.
  • asked a question related to Scale Development
Question
13 answers
Hello all,
This is my first time doing CFA in AMOS.
Initially, I developed a 17-item, 5-factor scale for a specific industry, based on theory from other industries. This proposed scale was tested with two datasets from a single institution: the first with n=91 (year 1) and the second with n=119 (year 2). EFA identified 3 underlying factors in both datasets; no items were deleted.
During year 3, a sample of n=690 consisting of participants all over the nation was used to do CFA using SPSS AMOS. Following is the output:
1) Based on EFA (3 factors, 17 items)
a) Chi-square = 1101.449 and df = 116 [χ2/df = 9.495]
b) GFI = 0.805
c) NFI = 0.898
d) IFI = 0.908
e) TLI = 0.892
f) CFI = 0.908
g) RMSEA = 0.111 (PClose 0.000)
h) Variances
              Estimate   S.E.   C.R.     P     Label
F1            .573       .056   10.223   ***
F2            .668       .043   15.453   ***
F3            .627       .040   15.620   ***
i) Covariances
              Estimate   S.E.   C.R.     P     Label
F1 <--> F2    .446       .036   12.502   ***
F1 <--> F3    .365       .032   11.428   ***
2) Based on theory (5 factors, 17 items)
a) Chi-square = 440.594 and df = 109 [χ2/df = 4.042]
b) GFI = 0.926
c) NFI = 0.959
d) IFI = 0.969
e) TLI = 0.961
f) CFI = 0.969
g) RMSEA = 0.066 (PClose 0.000)
h) Variances
              Estimate   S.E.   C.R.     P     Label
F1            .677       .047   14.334   ***
F2            .670       .043   15.493   ***
F3            .648       .054   12.100   ***
F4            .741       .061   12.103   ***
F5            .627       .040   15.620   ***
i) Covariances
              Estimate   S.E.   C.R.     P     Label
F1 <--> F2    .503       .036   14.057   ***
F1 <--> F3    .581       .041   14.262   ***
F1 <--> F4    .546       .041   13.388   ***
F1 <--> F5    .398       .032   12.321   ***
F2 <--> F3    .457       .036   12.848   ***
F2 <--> F4    .403       .035   11.405   ***
F2 <--> F5    .458       .033   13.899   ***
F3 <--> F4    .553       .042   13.036   ***
F3 <--> F5    .360       .032   11.275   ***
F4 <--> F5    .358       .033   10.754   ***
My questions:
1. Do I have to normalize the data before the CFA? (I am finding conflicting information, since my scale is a Likert scale and extreme values are not really outliers.)
2. Can I report that the theory-based model is a better fit than the EFA-based model? Would doing so be appropriate?
3. Is there anything else I need to do?
Any guidance will be greatly appreciated.
Thank you,
Sivarchana
Relevant answer
Answer
Hi Robert Trevethan thanks for your question.
I think ML would be suited to continuous normal data, as you suggest, and robust ML for skewed/non-normal continuous (or interval data with say 7-11 categories). So far I have only collected Likert, so because that involves polychoric correlations ML might not estimate accurately. I don't actually know if PAF in SPSS is equal or superior to, say, a DWLS in R. I think the main point I wanted to emphasise is to use PAF instead of PCA - I got this wrong in my earlier days - as PAF will be more accurate if using SPSS.
I must say Robert, your answers really make me think, which is good :) If I need to be corrected, I'm open.
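As a rough cross-check of the fit statistics reported in the question above, the χ2/df ratios and RMSEA values can be recomputed from the chi-square, df and sample size alone. The sketch below assumes the common single-group point estimate, RMSEA = sqrt(max(χ2 - df, 0) / (df * (N - 1))), and N = 690; the function name `rmsea` is only illustrative. It also prints the chi-square difference between the two models, which is only interpretable as a test if the 3-factor model is nested within the 5-factor model (an assumption not confirmed in the question).

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate of RMSEA, assuming the common single-group (N - 1) formula."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Chi-square and df copied from the question; n = 690 is the year-3 sample.
models = {
    "EFA-based (3 factors)": (1101.449, 116),
    "Theory-based (5 factors)": (440.594, 109),
}
n = 690

for name, (chi_square, df) in models.items():
    ratio = chi_square / df
    print(f"{name}: chi2/df = {ratio:.3f}, RMSEA = {rmsea(chi_square, df, n):.3f}")

# Difference between the two models (meaningful as a test only if they are nested).
delta_chi_square = 1101.449 - 440.594
delta_df = 116 - 109
print(f"delta chi2 = {delta_chi_square:.3f} on {delta_df} df")
```

Under these assumptions the recomputed values agree with the AMOS output (0.111 and 0.066). The difference of 660.855 on 7 df is far above 14.07, the .05 critical value for 7 df, but that comparison says which model fits better, not whether either model fits well.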
  • asked a question related to Scale Development
Question
4 answers
We often use scales developed by others in our research. Sometimes a scale has low validity and reliability values in our sample, and in such situations I always run into problems in the analysis. In this kind of case, is it a problem to remove items that do not function well in order to obtain acceptable values, and then to rerun the necessary exploratory and confirmatory factor analyses?
I would be very happy if you add a source with your feedback. Thank you.
Relevant answer
Answer
Esat Şanlı, I'm sorry that I can't provide you with a citation concerning the issue you face, but I would certainly recommend not using the original version of the scale if you are aware that it lacks reliability and validity in your sample.
If you have enough participants, I'd recommend conducting exploratory factor analyses to identify the most appropriate factor structure to move ahead with.
I am interested to see what other advice you might receive . . .
  • asked a question related to Scale Development
Question
20 answers
I am reviewing studies that assessed stress. The most commonly used tool is the Perceived Stress Scale. I am attempting to interpret the data, which are reported as mean scores. I haven't found any studies that give cut-off scores for levels/intensity of stress. I have read internet articles where scores are interpreted in terms of intensity, but they don't give the research behind the cut-off scores.
I would be grateful if somebody could guide me to research that validates cut off scores for each level/intensity of stress.
Thanks,
Lloyd
Relevant answer
Answer
Yoko Nomura
Dear Yoko,
How did you evaluate cut-off scores for the PSS-14? I didn't find any references either.
Thanks for your reply.
Regards
  • asked a question related to Scale Development
Question
5 answers
Hi RG colleagues,
Is anyone familiar with a good climate-related vulnerability scale? I have not found a valid climate-related psychological vulnerability scale so far, so I am wondering if someone can make a recommendation. Would adapting the Psychological Vulnerability Scale be a good idea? The Psychological Vulnerability Scale is a 6-item scale (PVS; Sinclair & Wallston, 1999). I have used a COVID-related adaptation of the PVS in the past, with high reliability.
Best wishes,
Gulnaz
Relevant answer
Answer
Thanks for your answer. I must admit, I'm still unclear on the context. "We" was changed to "he" in the passage you included here, which is otherwise identical to the first passage I asked about, but I have no idea who "he" or "we" are referring to.
  • asked a question related to Scale Development
Question
1 answer
I'm adapting a scale from English into my own language, and I also want to make some small changes that are not related to cultural issues. The original scale is designed to measure a general concept; I want to change the items slightly so that it measures that concept in a context-specific way.
So I want to ask: what is the right procedure? Should I run a translation-adaptation process first (translate, make cultural changes if needed, and run a pilot study to check the reliability and validity of the adapted scale), and then make the context-specific changes that I want for the research topic;
or should I translate and make the changes at the same time, and then run the pilot study to check the reliability and validity of the adapted scale?
Thank you in advance.
Relevant answer
Answer
This would be helpful... Good luck
  • asked a question related to Scale Development
Question
4 answers
I am a student who needs to do some research for my master thesis about the expected product care of a phone in the future. To be precise: I’m doing research on the impact that a Country of Origin label has on the usage of a mobile phone. In this research, I’ll give different information to different respondent groups (the “made-in-label” will differ). Now I want to develop or use an existing scale that measures how well the respondent thinks he’ll take care of the product. For instance: “I will use this product longer than a year”, “I will take care of this product” etc.
Does anyone know a reliable existing scale or how to develop a new one?
Relevant answer
Answer
What about trying satisfaction scales, since they relate to consumer behavior and attitudes towards a product and to consumer decision making (taking care of a product or not)?
  • asked a question related to Scale Development
Question
5 answers
I'm in the process of scale development and have as many as 7 to 11 items cross-loading (>.32) on two or more factors/sub-scales. How acceptable and common is the practice of including an item under two or more of them?
Relevant answer
Answer
Hello Prachi,
If factors are correlated more than a little, then cross-loadings can and will be observed (note that I am talking about variable-factor loadings/correlations, not the factor pattern matrix, which outlines emphasis given to variables in the identification of a factor). So, the first question for me is, are your factors correlated, and if so, by more than a little?
Interpretation of factors can indeed be easier when variables load on one and only one factor (as in Thurstone's idealized "simple structure" solution). If you embrace that ideal, then you must be prepared to jettison variables which don't conform to the intended pattern. However, there are several personality measures that include multiple items/questions/stimuli with salient loadings on multiple factors, and these measures still offer utility.
You're perfectly free to set the condition that no cross-loadings are allowed in your solution; but there's nothing about the basic factor model that prevents you from having cross-loadings.
Good luck with your work.
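To make the cross-loading check concrete, here is a minimal sketch that flags items with salient loadings on two or more factors. The loading matrix and item names are invented for illustration; only the .32 cut-off comes from the question.

```python
# Hypothetical loading matrix: item -> loadings on (F1, F2, F3). Values are invented.
loadings = {
    "item1": (0.71, 0.10, 0.05),
    "item2": (0.45, 0.38, 0.02),
    "item3": (0.08, 0.66, 0.12),
    "item4": (0.35, 0.12, 0.41),
}
THRESHOLD = 0.32  # cut-off mentioned in the question


def salient_factors(row, threshold=THRESHOLD):
    """Return the indices of factors whose absolute loading exceeds the threshold."""
    return [index for index, value in enumerate(row) if abs(value) > threshold]


# Keep only the items that load saliently on two or more factors.
cross_loading_items = {
    item: salient_factors(row)
    for item, row in loadings.items()
    if len(salient_factors(row)) >= 2
}
print(cross_loading_items)
```

Whether flagged items are dropped, kept on several subscales, or assigned to their strongest factor remains the judgment call described above; the code only lists the candidates.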
  • asked a question related to Scale Development
Question
3 answers
Need the scale for a dissertation. Would be better if I could get it for free since my work is not funded. I need the English version of this scale.
Relevant answer
Answer
Great discussion
  • asked a question related to Scale Development
Question
13 answers
Hello,
I want to use a motivation scale with my students (secondary school) for my research, but the scale is in a 4-point Likert style with no neutral option (Certainly true - True - Not True - Certainly not true). It is a valid and reliable scale. The question is, would it still be valid and reliable if I added a neutral option in the middle of the answers? More importantly, would it be better that way?
A second question: as seen above, the answer options of the scale go from positive to negative. Because I use the opposite order (negative to positive) in the rest of my research instruments, I'd like to do the same with this motivation scale too. Would that be OK?
Thanks a lot for your time in advance.
Yusuf Polat
Relevant answer
Answer
Hello Yusuf,
You may add the neutral option, but doing so may change some of the characteristics of the resultant scores. You could certainly run a quick pilot study to determine empirically whether this might be the case with your target population. (The reason hinges mostly on the debate between having the options being "forced choice"--no neutral option, vs. that of whether the neutral category captures only instances of genuine ambivalence vs. some other respondent attribute, such as not bothering to consider the stimulus.)
Reversing the order of the scale should have no effect on its utility, so long as the layout makes it perfectly clear to the respondents as to the direction of the scale. In countries where text is read from left to right, the usual practice is low end of the scale on left hand side, high(er) end of the scale on the right-hand side.
Good luck with your work.
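If the options are presented in reversed order, the numeric codes may need to be flipped so that scores stay comparable with the original scoring. A minimal sketch, assuming the four options are coded 1 to 4 (the function name and the example responses are hypothetical):

```python
def reverse_code(score: int, low: int = 1, high: int = 4) -> int:
    """Mirror a response code within [low, high], e.g. 1 <-> 4 and 2 <-> 3."""
    if not low <= score <= high:
        raise ValueError(f"score {score} is outside the range {low}-{high}")
    return low + high - score


# Hypothetical responses recorded with the reversed layout.
responses = [1, 2, 4, 3]
recoded = [reverse_code(response) for response in responses]
print(recoded)
```

The same rule (`low + high - score`) applies to a 5-point version if a neutral option is added, with `high=5`.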
  • asked a question related to Scale Development
Question
3 answers
I am developing a scale to investigate self-understanding in relation to cultural diversity among university students. I will be using this scale to investigate criterion validity for the scale I am developing.
Relevant answer
Answer
A very nice topic.
I need something like this; perhaps we can benefit from your study, dear doctor.
  • asked a question related to Scale Development
Question
15 answers
Hi,
I'm conducting an EFA for my scale development article. I was wondering whether anyone has an article or book I can refer to regarding the process of rerunning the EFA after item deletion?
Thanks!
Relevant answer
Answer
Chelly Maes Here is another paper that will definitely be helpful for you; please check paragraph 3 on the first page:
Beavers, A. S., Lounsbury, J. W., Richards, J. K., Huck, S. W., Skolits, G. J., & Esquivel, S. L. (2013). Practical considerations for using exploratory factor analysis in educational research. Practical Assessment, Research, and Evaluation, 18(1), 6.
  • asked a question related to Scale Development
Question
7 answers
The MMPI-2 and its much-improved variant, the MMPI-2-RF, produce clinical scales that are derived from participants responding to several hundred true/false items. The test developers mention using factor analysis in the scale development, but I thought factor analysis couldn't be computed with dichotomous data? What am I missing?
Relevant answer
Answer
Hi Mark
As usual, Robert and Holger have provided great answers, and their approaches are based on the idea that the binary variable is a crude indicator of a continuous underlying variable. You might instead want to treat the scores as truly categorical, and this would bring item response theory (IRT) models into the frame. It's an exciting time, as the boundaries between these seemingly different psychometric models (FA/IRT) are now pretty faint.
Mark
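As a small illustration of what an IRT model for a true/false item looks like, the sketch below evaluates the two-parameter logistic (2PL) item response function, P(true) = 1 / (1 + exp(-a(θ - b))). The 2PL is one common choice and is used here only as an example; the discrimination `a` and difficulty `b` values are invented and are not MMPI parameters.

```python
import math


def two_pl_probability(theta: float, a: float, b: float) -> float:
    """Probability of endorsing an item under the 2PL model.

    theta: respondent's latent trait level
    a: item discrimination (hypothetical value in the example below)
    b: item difficulty/location (hypothetical value in the example below)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Endorsement probability for one hypothetical item across trait levels.
for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    probability = two_pl_probability(theta, a=1.5, b=0.0)
    print(f"theta = {theta:+.1f}: P(true) = {probability:.3f}")
```

When θ equals b the probability is exactly 0.5, and a larger `a` makes the curve steeper around that point.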
  • asked a question related to Scale Development
Question
1 answer
Scale development is a sequential, systematic and scientific process of exploring and confirming the variables under a construct....
Relevant answer
Answer
Broadly, I would point to the following steps:
1) Literature review to thoroughly comprehend the theoretical framework surrounding the latent construct you want to measure
2) Developing an explicit understanding of the ultimate purpose of the scale to better determine the type of items to be included in the scale, the length of the scale etc.
3) Generation of item pool (way more than the length of what you intend your final scale to be)
4) Evaluation of face and content validity
5) Administration to a suitable sample
6) Evaluation of construct validity (perhaps the most difficult thing to establish - this is where the soundness of theoretical understanding of the construct helps in accruing evidence for the construct validity of the scale)
7) Reliability analysis - Internal consistency of the scale, test-retest reliability
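The internal-consistency part of step 7 is often summarized with Cronbach's alpha, α = k/(k - 1) × (1 - Σ item variances / variance of total scores). A minimal sketch with invented responses (rows are respondents, columns are items); the function name and data are hypothetical:

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])  # number of items
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of five people to three items on a 1-5 scale.
data = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]
alpha = cronbach_alpha(data)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Test-retest reliability needs a second administration of the scale and is not covered by this sketch.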
  • asked a question related to Scale Development
Question
11 answers
I am using three scales in my study. Two of them are five-point Likert scales, while the third, Self-Perceived Communication Competence (SPCC; developed by McCroskey & McCroskey, 1988), is a percentage scale on which people rate their perception of their own communication competence from 1 to 100.
Can I convert the scale using 1-20 = Strongly Disagree, 21-40 = Disagree, 41-60 = Neutral, 61-80 = Agree and 81-100 = Strongly Agree?
Is there any way to convert scales?
Thank you in advance
PS: Scale is attached.
Relevant answer
Answer
Muhammad Adnan, I'm pretty sure you could go ahead and obtain correlations without worrying about converting your scales.
However, in regression you might need to consider standardized rather than unstandardized coefficients.
You should be able to find information about that in stats textbooks, on the web and/or in YouTube presentations.
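The reason conversion is unnecessary for correlations is that Pearson's r does not change under a positive linear rescaling of either variable. A minimal sketch with invented scores (the variable names, data and the 1-5 rescaling rule are all hypothetical):

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values to mean 0 and (sample) standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(value - m) / s for value in values]


def pearson_r(x, y):
    """Pearson correlation computed from standardized scores."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


# Hypothetical data: percentage-style scores and mean scores on a 1-5 Likert scale.
spcc = [35.0, 60.0, 72.5, 88.0, 95.0]
likert = [2.0, 3.0, 3.5, 4.0, 5.0]

r_raw = pearson_r(spcc, likert)
# Linearly squeezing the percentage scores into a 1-5 range leaves r unchanged.
r_rescaled = pearson_r([1 + 4 * value / 100 for value in spcc], likert)
print(f"r (raw) = {r_raw:.4f}, r (rescaled) = {r_rescaled:.4f}")
```

Cutting the 1-100 scores into five bins, by contrast, discards information, and standardizing all variables (as `z_scores` does) is what puts regression coefficients on a comparable scale.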
  • asked a question related to Scale Development
Question
7 answers
I am using AMOS-21 for the estimation of the model fit for a questionnaire. I have found the following values. Are these values enough and appropriate? Which of the following are important to present while reporting?
CMIN/df= 2.426
RMR= .05
GFI=.921
AGFI= .890
NFI= .951
TLI=.964
CFI=.970
RMSEA=.064
Relevant answer
Answer
Hello Shamas,
1) Forget GFI and AGFI, they are outdated
2) Forget the ratio of chi-square to df. It makes no sense and never has.
3) With regard to the rest, there is quite a debate about whether using fit indices is useful at all. I am on the side of the argument that you use the statistical chi-square test and omit the fit indices (although I tend to report the CFI and RMSEA, as reviewers want to see them).
The problem with the fit indices is that they measure the degree of deviation between the empirical covariances and the covariances implied by the model. The "degree" of similarity, however, does not reflect the degree of correctness. Hence you can have very close matches despite the model being complete nonsense. The chi-square test has a clear statistical theory underlying it. If the test is significant, it means that there are deviations greater than those expected by pure chance. This MAY mean that the model is slightly wrong, vastly wrong or that something with the underlying data assumptions is wrong. At any rate, it provides the basis to understand/learn how the model can be improved.
I don't know your test but would bet (as in most cases, also mine) that it is significant. No reason for despair, just learn what's wrong.
For more info see the following thread
and papers
Antonakis, J., Bendahan, S., Jacquart, P., & Lalive, R. (2010). On making causal claims: A review and recommendations. The Leadership Quarterly, 21, 1086-1120. doi:10.1016/j.leaqua.2010.10.010
McIntosh, C. (2007). Rethinking fit assessment in structural equation modeling: A commentary and elaboration on Barrett (2007). Personality and Individual Differences, 42(5), 859-867.
Thoemmes, F., Rosseel, Y., & Textor, J. (2018). Local fit evaluation of structural equation models using graphical criteria. Psychological Methods, 23(1), 27-41. doi:10.1037/met0000147
Hayduk, L. A. (2014). Shame for disrespecting evidence: The personal consequences of insufficient respect for structural equation model testing. BMC Medical Research Methodology, 14(1), 124-124. doi:10.1186/1471-2288-14-124
Best,
--Holger
  • asked a question related to Scale Development
Question
9 answers
Hello
I want to build a questionnaire about emotional and cognitive strategies. My question is:
How should I formulate the sentences or items?
Should they mostly begin with "I think" or "I feel", to represent the emotional and cognitive side of the subject?
or
Should they begin with a verb that presents a behavior or an act ("I do"), to represent the strategies?
Thank you
Relevant answer
Answer
The sun was just an example. I would make the statement and then use a Likert scale with 1 as "never", going up the scale to 5 as "always". Sorry, I should have put that.