Translation, adaptation and validation of instruments or
scales for use in cross-cultural health care research:
a clear and user-friendly guideline
Valmi D. Sousa PhD RN1 and Wilaiporn Rojjanasrirat PhD RNC IBCLC2
1Associate Professor, The University of Kansas School of Nursing, Kansas City, Kansas, USA
2Associate Professor, Graceland University School of Nursing, Independence, Missouri, USA
Keywords
back-translation, cross-cultural validation,
health care research, translation
Correspondence
Valmi D. Sousa
The University of Kansas School of Nursing
3901 Rainbow Boulevard
Kansas City, KS 66160
USA
E-mail: vsousa@kumc.edu
Accepted for publication: 3 February 2010
doi:10.1111/j.1365-2753.2010.01434.x
Abstract
Rationale, aims and objectives The diversity of the population worldwide suggests a
great need for cross-culturally validated research instruments or scales. Researchers and
clinicians must have access to reliable and valid measures of concepts of interest in their
own cultures and languages to conduct cross-cultural research and/or provide quality
patient care. Although there are well-established methodological approaches for translat-
ing, adapting and validating instruments or scales for use in cross-cultural health care
research, a great variation in the use of these approaches continues to prevail in the health
care literature. Therefore, the objectives of this scholarly paper were to review published
recommendations of cross-cultural validation of instruments and scales, and to propose and
present a clear and user-friendly guideline for the translation, adaptation and validation of
instruments or scales for cross-cultural health care research.
Methods A review of highly recommended methodological approaches to translation,
adaptation and cross-cultural validation of research instruments or scales was performed.
Recommendations were summarized and incorporated into a seven-step guideline. Each
one of the steps was described and key points were highlighted. An example of a project using
the proposed steps of the guideline was fully described.
Conclusions Translation, adaptation and validation of instruments or scales for cross-
cultural research is very time-consuming and requires careful planning and the adoption of
rigorous methodological approaches to derive a reliable and valid measure of the concept
of interest in the target population.
Introduction
Globalization and migration have contributed to an increasing
diversity of the population in many countries, particularly in the
USA, where the background of the population is extremely
diverse regarding culture, language and ethnicity [1,2]. Therefore,
there is a need for relevant cross-cultural research to address a
number of problems among these multinational and multicultural
populations. However, health care researchers conducting cross-
cultural studies must have access to reliable and cross-validated
instruments in other cultures and/or in other languages [3–6].
Findings from cross-cultural research may have great clinical
implications for physicians, nurses and other health care profes-
sionals who provide care for diverse populations because the
delivery of quality care depends on the accurate assessment and
deeper understanding of an individual’s cultural, linguistic and
ethnic background.
Background and significance
The increase in diverse populations worldwide and the need for
cross-cultural and multinational research indicate a great need for
clinicians and researchers to have access to reliable and valid
instruments or measures cross-validated among diverse cultural
segments of the population and/or in other languages [3,4,6]. This
would enhance the validity, the generalization and the translation
of cross-cultural health care research. Although there are well-
established methodological approaches for translating, adapting
and validating instruments for use in cross-cultural health care
research [2–5,7–15], a great variation in the use of these
approaches continues to prevail in the health care literature.
A recent review of 47 methodological studies focusing on the
translation and validation of instruments for cross-cultural
research reported that the quality and methodological approaches
of the reviewed studies varied greatly [16]. There was no clear
consensus among researchers on how the approaches should be
used or combined, great variation in the qualifications of trans-
lators, and a lack of detailed information about the translation,
back-translation, validation, testing, and revision and refinement
of the instruments.
Corroborating these findings, another researcher [2] reported
that, unfortunately, translating, adapting and cross-culturally
validating a research instrument is treated as an unimportant
step of study protocols. Furthermore, the most commonly used
and reported methodological approach was the forward trans-
lation only, often using an unqualified translator. In addition,
the same author [2] reported that only a few researchers have
described the use of strategies and steps or processes for the
adaptation and/or validation of the instruments, and empha-
sized that it is not sufficient to forward translate an instrument
without carefully evaluating its adaptation and cross-cultural
validation. The procedure should consist of a comprehensive
process that involves not only translation of an instrument,
but also thorough evaluation of its adaptation and cross-cultural
validation.
These data suggest that despite the existing recommendations
and guidelines to use a comprehensive multistep process for
translating, adapting and cross-validating instruments, research-
ers have not been doing this. It may be that the methodological
approaches are not clearly presented in a user-friendly format,
which makes it difficult for researchers to adopt and follow
the recommendations. Therefore, the objectives of this scholarly
paper were to review published recommendations of cross-
cultural validation of instruments and scales, and to propose and
present the highly recommended methodological approaches
for translating, adapting and validating instruments for cross-
cultural health care research in a clear and user-friendly guide-
line. This would lead to better understanding and use of
these approaches by health care researchers, particularly nurse
researchers worldwide.
Methodological approaches
Of the two main categories of translation (symmetrical and asym-
metrical), the symmetrical category is the most recommended
approach because it refers to faithfulness of meaning and collo-
quialness in both the source language (SL; original language of
the instrument) and the target language (TL; desired language)
and not to a literal translation [12]. The purpose of translation is
to achieve equivalence between the instrument in the SL and the
instrument in the TL [2]. The symmetrical translation is the only
category that facilitates the comparison of responses from indi-
viduals of one culture to those of another [6,12,13] and the deter-
mination of the most relevant types of cross-cultural equivalence:
the semantic, conceptual, content, technical and criterion [1,17].
In addition, the process known as centring [7], in which both
the SL and the TL of an instrument are equally important, should
be used.
The process of translation, adaptation and cross-cultural
validation of an instrument for use in other cultures, languages
and countries requires careful planning and adoption of com-
prehensive, rigorous and well-established methodological
approaches [2–5,7–15]. Because there are variations among these
approaches, we have incorporated the most recommended
ones in a user-friendly guideline to facilitate adoption, consis-
tency and use.
Step 1: translation of the original instrument
into the target language (forward translation
or one-way translation)
The instrument in the source (original) language is forward
translated to the TL (target language) by at least two independent
translators, preferably certified, whose mother language is the
desired TL of the instrument. The translators must be bilingual
(i.e. fluent in the source and desired TL of the instrument) and
preferably bicultural (i.e. having in-depth experience in the culture
of the source and desired TL of the instrument). In addition, the
two translators must have distinct backgrounds. The first translator
must be knowledgeable about health care terminology and the
content area of the construct of the instrument in the desired TL.
The second translator must be familiar with colloquial phrases,
health care slang and jargon, idiomatic expressions, and emotional
terms in common use in the desired TL. The second translator
should not be knowledgeable about medical terminology and/or
the construct of the instrument. This approach will generate two
translated versions that contain words and sentences that cover
both the medical and the usual spoken language with its cultural
nuances. Therefore, choosing well-qualified translators is the key
to high-quality translations. If resources are available, translations
can also be done by two teams of independent translators (each
team of translators must have the same characteristics as the two
individual independent translators described above), which may
result in higher-quality translations by minimizing the introduc-
tion of personal idiosyncrasies when using only two independent
translators.
Key points
1. Instrument in the SL translated to TL (TL1 and TL2) to
produce two forward-translated versions of the instrument.
2. Use two bilingual and bicultural translators whose mother
language is the desired TL, but who have distinct backgrounds:
One translator must be knowledgeable about health termi-
nology and the content area of the construct of the instrument
in the TL.
The other translator must be knowledgeable about the cultural
and linguistic nuances of the TL.
3. Two independent teams of translators can also be used (each
team of translators must have the same characteristics as the two
individual independent translators).
Step 2: comparison of the two translated
versions of the instrument (TL1 and
TL2): synthesis I
The instructions, the items and the response format of the two
forward-translated versions of the instrument (TL1 and TL2) and
both the TL1 and the TL2 with the original version of the instru-
ment in the SL are initially compared by a third bilingual and
preferably bicultural independent translator regarding ambigui-
ties and discrepancies of words, sentences and meanings. Any
ambiguities and discrepancies must be discussed and resolved
using a committee approach. Consensus should be achieved with
the participation of the third translator, the two translators from
Step 1, and the investigator and/or other members of the research
team. This process will generate the preliminary initial translated
version of the instrument in the TL (PI-TL).
Key points
1. Use a third independent translator to compare the TL1 and TL2,
and to compare both the TL1 and TL2 with the SL version of the
instrument.
2. Use a committee approach (third independent individual or
translator, translators who participated in Step 1, and investigator
and/or other members of the research team) to resolve ambiguities and
discrepancies and derive the PI-TL.
Step 3: blind back-translation (blind backward
translation or blind double translation) of
the preliminary initial translated version
of the instrument
The PI-TL is translated back into the SL by two other independent
translators with the same qualifications and characteristics
described above in Step 1. For this step, the translators' mother
language should be the SL of the original instrument, and they
should be completely blind to the original version of the instru-
ment (they had never seen the original version of the instrument).
They will produce two back-translated versions of the instrument.
Again, the first translator must be knowledgeable about health care
terminology and the content area of the construct of the instrument
in the SL, but have no prior knowledge of the instrument being back-
translated. The second translator must be familiar with colloquial
phrases, health care slang and jargon, idiomatic expressions,
and emotional terms in common use in the SL. The second translator
should not be knowledgeable about medical terminology and/or
construct of the instrument and should have no prior knowledge of
the instrument being back-translated as well. If resources are avail-
able, back-translation can also be done by two teams of translators,
which may result in higher-quality back-translations by minimiz-
ing the introduction of personal idiosyncrasies when using only
one independent back-translator to generate each initial back-
translated version of the instrument. This process will result in
two back-translated versions of the instrument in its original
SL (B-TL1 and B-TL2). This step allows for clarification of
words and sentences used in the translations. As noted in Step 1,
choosing well-qualified translators is the key to high-quality
back-translations.
Key points
1. The PI-TL back-translated to SL (B-TL1 and B-TL2) to
produce two back-translated versions.
2. Use two bilingual and bicultural translators whose mother
language is the SL, but who have distinct backgrounds:
One translator must be knowledgeable about health termi-
nology and the content area of the construct of the instrument
in the SL.
The other translator must be knowledgeable about the cultural
and linguistic nuances of the SL.
3. Two independent teams of translators can also be used (each
team of translators must have the same characteristics as the two
individual independent translators).
Step 4: comparison of the two back-translated
versions of the instrument (B-TL1 and B-TL2):
synthesis II
Initially, the instructions, items and response format of the two
back-translations (B-TL1 and B-TL2) are compared by a multi-
disciplinary committee with the instructions, items and response
format of the original instrument in the SL regarding format,
wording, and grammatical structure of the sentences, similarity
in meaning, and relevance. It is highly recommended that the
committee should include at least one methodologist (who can be
the investigator and/or a member of the research team), one health
care professional who is familiar with the content areas of the
construct of the instrument, and all four bilingual and bicultural
translators involved in Step 1 (forward translation of the instru-
ment into the TL) and Step 3 (back-translation of the instrument
from the TL into the SL). It is also recommended that the devel-
oper of the original instrument in the SL participate and provide
insights on the construct of the instrument and clarify any ques-
tions that might arise. Having at least one monolingual committee
member whose mother language is the TL of the instrument would
enhance the quality of the pre-final version of the translated in-
strument. Any ambiguities and discrepancies regarding cultural
meaning and colloquialisms or idioms in words and sentences of
the instructions, the items, and the response format between the
two back-translations (B-TL1 and B-TL2) and between each one
of the two back-translations (B-TL1 and B-TL2) and the original
instrument in the SL are discussed and resolved through consensus
among the committee members to derive a pre-final version of
the instrument in the TL (P-FTL).
If discrepancies cannot be resolved, it may be necessary to
repeat Steps 1 through 4: two other independent bilingual and
bicultural translators must be used to translate the original instru-
ment (SL) again to generate two translations, and two other inde-
pendent bilingual and bicultural translators must be used to back-
translate the translated versions of the instrument (TL) following
the same procedure described above (known as the repetition
approach). Alternatively, only items that do not retain their original
meaning are re-translated and back-translated. The evaluation of
the translated and back-translated versions follows the same vali-
dation process described above. This process is repeated until no
ambiguities or discrepancies are found.
These methodological approaches of Step 4 will establish the
initial conceptual, semantic and content equivalence of the P-FTL.
Conceptual equivalence refers to the degree to which a concept of
the items of the instrument exists in both the source and target
cultures. Semantic equivalence refers to sentence structure, collo-
quialisms or idioms that ensure that the meaning of the text or idea
of the items of the instrument in the SL is present in the TL.
Finally, content equivalence refers to the relevance and pertinence
of the text or idea of the items of the instrument in each culture.
The committee’s role is to evaluate, revise and consolidate the
instructions, items and response format of the back-translated
instruments that have conceptual, semantic and content equiva-
lency and to develop the P-FTL for pilot and psychometric testing.
Key points
1. Comparison between the two back-translations (B-TL1 and
B-TL2) of the instrument, and between both B-TL1 and B-TL2 and
the original SL instrument:
Evaluate similarity of the instructions, items and response
format regarding wording, sentence structure, meaning and
relevance.
2. Use a multidisciplinary committee:
One methodologist (researcher or a member of the research
team).
One health care professional.
All four bilingual and bicultural translators used in Step 1 and
Step 3: two translators whose mother language is the desired TL
of the instrument and two translators whose mother language
is the SL of the original instrument.
3. If possible, the developer of the original instrument should parti-
cipate in the discussions.
4. If ambiguities and discrepancies cannot be resolved, Steps 1
through 4 may be repeated as many times as necessary. Alter-
natively, only items that do not retain their original meaning are
re-translated and back-translated.
Step 5: pilot testing of the pre-final version of
the instrument in the target language with a
monolingual sample: cognitive debriefing
The P-FTL is pilot tested among participants whose language is
the TL of the instrument to evaluate the instructions, response
format and the items of the instrument for clarity. Participants
should be recruited from the target population in which the instru-
ment will be used (e.g. if the instrument measures self-care among
individuals with type 2 diabetes, then the sample must consist of
individuals with type 2 diabetes). A sample size of 10–40 individu-
als is recommended [3,18]. Each participant is asked to rate the
instructions and items of the scale using a dichotomous scale
(clear or unclear). Participants who rate the instructions, response
format or any item of the instrument as unclear are asked to provide
suggestions as to how to rewrite the statements to make the lan-
guage clearer. Instructions, response format and items of the instru-
ment that are found to be unclear by at least 20% of the sample must
be re-evaluated [19]. Therefore, the minimum inter-rater agreement
among the sample is 80%. This step is used to further support the
conceptual, semantic and content equivalency of the translated
instrument and further improve the structure of sentences used
in the instructions and items of the P-FTL to be easily understood
by the target population prior to psychometric testing.
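The 80% agreement criterion in this pilot test reduces to a simple proportion check for every element of the instrument. The sketch below (Python, with hypothetical clear/unclear ratings; not part of the original guideline) illustrates how such pilot responses can be screened against the 20% threshold for revision.

```python
# Minimal sketch of the Step 5 clarity screen (assumption: ratings coded 1 = clear,
# 0 = unclear). Elements rated unclear by 20% or more of the monolingual pilot
# sample (i.e. < 80% agreement on clarity) are flagged for revision.

ratings = {                       # hypothetical pilot data (10 participants)
    "instructions": [1, 1, 1, 1, 1, 1, 1, 1, 0, 1],
    "item_1":       [1, 1, 0, 1, 1, 1, 1, 1, 1, 1],
    "item_2":       [1, 0, 0, 1, 1, 0, 1, 1, 1, 1],   # 30% unclear -> revise
}

THRESHOLD = 0.80                  # minimum inter-rater agreement recommended above

for element, votes in ratings.items():
    agreement = sum(votes) / len(votes)
    status = "keep" if agreement >= THRESHOLD else "revise and re-evaluate"
    print(f"{element}: {agreement:.0%} rated clear -> {status}")
```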
To further determine the conceptual and content equivalence of
the items of the P-FTL, the use of an expert panel is highly recom-
mended. The instructions, response format and the items of the
instrument are evaluated for conceptual equivalence (clarity) by
six to ten members of an expert panel [20,21] who are knowledge-
able about the content areas of the construct of the instrument
and the target population in which the instrument will be used
and whose mother language is the TL of the instrument. When
possible, a committee of 10 members is preferable [20,21]. Each
member of the committee who rates the instructions, response
format or any item of the instrument as unclear is asked to provide
suggestions as to how to rewrite the statements and make the
language clearer. Instructions, response format and items of
the instrument that are found to be unclear by at least 20% of the
committee members must be revised and re-evaluated [19]. The
minimum inter-rater agreement among the expert panel is 80%.
This process will further determine the conceptual equivalence
of the translated instrument.
The expert panel is then asked to evaluate each item of the
instrument for content equivalence (content-related validity [rel-
evance]) using the following scale: 1 = not relevant; 2 = unable
to assess relevance; 3 = relevant but needs minor alteration;
4 = very relevant and succinct. Items classified as 1 (not relevant)
or 2 (unable to assess relevance) should be revised [20,21].
Content validity index at the item level (I-CVI) and at the scale
level (S-CVI) should be calculated. There are three methods to
calculate S-CVI [20–22], but the averaging calculation (S-CVI/
Ave) method is preferred [22]. Using 10 experts, an I-CVI of
0.78 or above [20] and an S-CVI/Ave of 0.90 or above [21] are
the minimum acceptable indices. Items that do not achieve the
minimum acceptable indices are revised and re-evaluated. New
content validity indices are calculated. The process continues until
acceptable indices of content-related validity or content equiva-
lence are achieved. It is also recommended that the kappa coeffi-
cient of agreement be determined to increase confidence in the
content validity of the instrument [23]. A kappa of 0.60 is generally
the minimum acceptable coefficient to determine good agreement
[24]. The purpose of Step 5 is to continue developing the P-FTL
for preliminary (pre-field) and/or full psychometric testing.
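The content-equivalence indices described above are straightforward to compute once the expert ratings are tabulated. The sketch below (Python, hypothetical ratings) computes the I-CVI for each item, the S-CVI/Ave across items, and a chance-adjusted agreement coefficient; the modified kappa used here is one common variant and is not necessarily the exact coefficient of reference [23].

```python
# Sketch of the Step 5 content-equivalence indices (assumption: relevance is rated
# on the 1-4 scale described above, and ratings of 3 or 4 count as "relevant").
from math import comb

# Hypothetical ratings: rows = items, columns = 10 experts.
ratings = [
    [4, 4, 3, 4, 4, 3, 4, 4, 4, 3],
    [3, 4, 4, 2, 4, 3, 4, 4, 3, 4],
    [4, 2, 3, 1, 4, 4, 3, 2, 4, 4],
]

def i_cvi(item_ratings):
    """Item-level content validity index: proportion of experts rating 3 or 4."""
    return sum(r >= 3 for r in item_ratings) / len(item_ratings)

def modified_kappa(icvi, n_experts):
    """Chance-adjusted I-CVI, assuming a 50/50 chance of a 'relevant' rating."""
    agreeing = round(icvi * n_experts)
    p_chance = comb(n_experts, agreeing) * 0.5 ** n_experts
    return (icvi - p_chance) / (1 - p_chance)

item_cvis = [i_cvi(item) for item in ratings]
s_cvi_ave = sum(item_cvis) / len(item_cvis)        # averaging method (S-CVI/Ave)

for idx, icvi in enumerate(item_cvis, start=1):
    flag = "OK" if icvi >= 0.78 else "revise"
    print(f"Item {idx}: I-CVI = {icvi:.2f} ({flag}), kappa* = {modified_kappa(icvi, 10):.2f}")
print(f"S-CVI/Ave = {s_cvi_ave:.2f} (minimum acceptable 0.90)")
```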
Key points
1. Pilot test of the P-FTL among individuals whose language is the
TL of the instrument:
Evaluate the instructions, items and response format for clarity.
Use a sample size of 10–40 participants.
2. It is highly recommended to use an expert panel to further
examine the instrument for:
Clarity of the instructions, items and response format.
Content equivalence (content-related validity) using I-CVI,
S-CVI/Ave and the kappa coefficient of agreement.
Use a sample of 6–10 experts (10 experts are preferred).
Step 6: preliminary psychometric testing of the
pre-final version of the translated instrument
with a bilingual sample
This step is rarely used; however, when a bilingual population is
accessible, it is recommended that the instrument be pre-field
tested among bilingual individuals (fluent in the SL of the original
instrument and the TL of the translated instrument). If this is not
possible, skip this step and move to Step 7. In general, the recom-
mendation is to use at least five subjects per item of the instrument
to conduct the preliminary psychometric testing of a new instru-
ment [25]. Ideally, the bilingual sample should be from the target
population in which the instrument will be used (e.g. adult indi-
viduals with type 2 diabetes, African–American women with heart
failure). However, in many instances, this may be difficult and
unrealistic; thus other alternatives can be used such as sampling
bilingual college students and faculty or workers in travel agen-
cies, currency exchange agencies, international trade companies,
embassies and consulates, and language schools.
Initially, participants are given the P-FTL and are asked to
answer the items. The participants respond to the items of the
P-FTL without seeing the original instrument in the SL. After
completion of the P-FTL, participants are given the original in-
strument in the SL and are asked to answer the items. They may
complete a demographic questionnaire and/or other instruments of
interest. The order of the items of the original instrument must be
mixed to be in a different order from that of the items of the P-FTL.
Responses on both versions of the instrument are then compared
(i.e. interpretation of scores is the same in both cultures) to estab-
lish criterion equivalency (a type of construct validity). Statistical
analyses used for comparison purposes may consist of descriptive
statistics, correlation coefficients, and paired t-test or one-way
ANOVA. Scale and item analysis is also used to establish the initial
preliminary psychometric properties of the instrument (internal
consistency reliability) and to compare these properties of the
P-FTL with those of the original instrument in the SL. When the
instrument's purpose is to serve as a diagnostic or screening test, preliminary
calculation of sensitivity and specificity is recommended. This
Step 6 also determines initial technical equivalency (the method
of assessment) and is useful to support the conceptual, semantic,
content and construct validity of the P-FTL prior to conducting full
psychometric field testing.
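As an illustration of the Step 6 analyses, the sketch below (Python with numpy and scipy; the simulated responses and their generating assumptions are ours, not part of the guideline) compares total scores on the two versions with a paired t-test, correlates corresponding items, and estimates Cronbach's alpha for each version.

```python
# Sketch of the Step 6 bilingual comparison on simulated data. Each row is one
# bilingual participant, each column one item; `sl` holds responses to the original
# instrument, `tl` responses to the pre-final translated version (P-FTL).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
latent = rng.normal(size=(40, 1))                                  # 40 subjects
sl = np.clip(np.round(2.5 + 1.2 * latent + rng.normal(0, 0.8, (40, 8))), 0, 5)
tl = np.clip(sl + np.round(rng.normal(0, 0.5, sl.shape)), 0, 5)    # near-identical

def cronbach_alpha(scores):
    """Internal consistency of a subjects-by-items score matrix."""
    k = scores.shape[1]
    item_var = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

# Criterion equivalence: total scores should not differ between versions...
t, p = stats.ttest_rel(tl.sum(axis=1), sl.sum(axis=1))
# ...and corresponding items should correlate strongly.
item_r = [stats.pearsonr(tl[:, i], sl[:, i])[0] for i in range(sl.shape[1])]

print(f"paired t-test on total scores: t = {t:.2f}, P = {p:.3f}")
print(f"item-level Pearson r: min = {min(item_r):.2f}, max = {max(item_r):.2f}")
print(f"alpha (TL) = {cronbach_alpha(tl):.2f}, alpha (SL) = {cronbach_alpha(sl):.2f}")
```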
Key points
1. When possible, pilot test the P-FTL among bilingual individuals
to:
Compare the P-FTL with the original instrument in the SL.
Establish criterion equivalency and further support the
conceptual, semantic, content and construct equivalency of the
P-FTL.
2. Use at least five subjects per item of the instrument.
3. Subjects complete the P-FTL first without seeing the original
instrument in the SL.
4. Subjects complete the original instrument in the SL, in which
items have been mixed in a different order from the P-FTL.
Step 7: full psychometric testing of the
pre-final version of the translated instrument
in a sample of the target population
This last step is used to establish the initial full psychometric
properties of the newly translated, adapted and cross-validated
instrument with a sample of the target population of interest. The
sample size for this step depends on the types of psychometric
approaches that will be used. The more complete the psychometric
approaches for evaluation of the translated instrument the more
confidence will be generated in its reliability and validity proper-
ties. In general, per rule of thumb, it is highly recommended to use
at least 10 subjects per item of the instrument scale and item
analysis and exploratory factor analysis [25–28]. If there is a plan
to use confirmatory factor analysis to test the factor structure of the
instrument, the rule-of-thumb recommendation is approxi-
mately 300–500 subjects [28,29]. Power
analysis based on the number of degrees of freedom, an alpha level
(0.05 or 0.01), and a desired power (80% or above) can also be
calculated [30,31].
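For the power-analysis option, a sketch in the spirit of the RMSEA-based approach of MacCallum et al. [30] is given below (Python with scipy; the model degrees of freedom and the RMSEA values for close and not-close fit are assumptions chosen only for illustration).

```python
# Sketch of an RMSEA-based power analysis for covariance structure models,
# following the general logic of MacCallum et al. [30] (test of close fit).
from scipy.stats import ncx2

def rmsea_power(n, df, rmsea0=0.05, rmsea_a=0.08, alpha=0.05):
    """Power to reject close fit (RMSEA = rmsea0) when the true RMSEA = rmsea_a."""
    ncp0 = (n - 1) * df * rmsea0 ** 2          # noncentrality under H0
    ncp_a = (n - 1) * df * rmsea_a ** 2        # noncentrality under the alternative
    crit = ncx2.ppf(1 - alpha, df, ncp0)       # critical chi-square value
    return 1 - ncx2.cdf(crit, df, ncp_a)

# Hypothetical example: a one-factor model for an 8-item scale has df = 20.
for n in (100, 200, 300, 400, 500):
    print(f"N = {n}: power = {rmsea_power(n, df=20):.2f}")
```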
The most recommended and commonly used psychometric
approaches in this step are estimation of: (1) internal consistency
reliability (or sensitivity and specificity); (2) stability reliability
(test–retest reliability); (3) homogeneity; (4) construct-related
validity such as convergent and/or divergent (discriminant) valid-
ity; (5) criterion-related validity such as concurrent and/or
predictive validity; (6) factor structure of the instrument (dimen-
sionality); and (7) model fit. Although it is not the purpose of this
user-friendly guideline to describe the many statistical approaches
that can be used in Step 7, the most common statistical approaches
are scale and item analysis, Pearson’s correlation analysis, explor-
atory factor analysis and confirmatory factor analysis. The purpose
of Step 7 is to revise and refine the items of the P-FTL as
needed to derive the final psychometrically sound FTL consisting
of adequate estimates of reliability, homogeneity, and validity and
with a stable factor structure and/or model fit.
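To make the Step 7 item analysis and dimensionality check concrete, the sketch below (Python with numpy, simulated data) computes corrected item-to-total correlations and inspects the eigenvalues of the item correlation matrix; a principal-component extraction is used here only as a simple stand-in for a full exploratory factor analysis with rotation.

```python
# Sketch of Step 7 item analysis and a one-factor check on simulated data
# (120 subjects answering an 8-item scale driven by a single latent trait).
import numpy as np

rng = np.random.default_rng(1)
latent = rng.normal(size=(120, 1))
items = latent @ rng.uniform(0.5, 0.9, size=(1, 8)) + rng.normal(0, 0.6, (120, 8))

# Corrected item-to-total correlations (each item vs. the sum of the other items).
totals = items.sum(axis=1)
item_total_r = [np.corrcoef(items[:, i], totals - items[:, i])[0, 1]
                for i in range(items.shape[1])]

# Eigenvalues of the correlation matrix: dimensionality and variance explained.
corr = np.corrcoef(items, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
loadings = np.abs(eigvecs[:, 0]) * np.sqrt(eigvals[0])   # first-component loadings

print("item-to-total r:", np.round(item_total_r, 2))
print(f"first component explains {eigvals[0] / eigvals.sum():.0%} of item variance")
print("loadings:", np.round(loadings, 2))
```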
Key points
1. Full psychometric testing of the P-FTL among individuals from
the target population to:
Revise and refine the items of the final version of the instru-
ment in the TL.
Establish internal consistency reliability (or sensitivity and
specificity), stability reliability, homogeneity, construct-related
validity, criterion-related validity, factor structure and model fit
of the instrument.
2. Use at least 10 subjects per item of the instrument for general
psychometric approaches (scale and item analysis, Pearson's cor-
relations and exploratory factor analysis).
3. Use 300–500 subjects for confirmatory factor analysis or
conduct a power analysis.
Example of a project to translate, adapt
and validate a research instrument
A project to translate, adapt and cross-validate an instrument for
cross-cultural research may take several years, and it is normally
conducted using more than one study to adhere to the recom-
mended methodological approaches described above. One study
might set as its initial goal to translate, adapt and cross-validate a
research instrument using Steps 1, 2, 3, 4 and 5 only. In a second
study, the researchers might set a single goal to establish the
preliminary psychometrics of the translated instrument with
bilingual participants using Step 6. Then, in a third study, the
researchers’ goal might be to establish the initial full psychometric
properties of a translated instrument in a sample of the target
population of interest using the approaches described in Step 7.
Depending on the psychometric approaches used in this third
study, other studies might be necessary to continue the develop-
ment and psychometric evaluation of the translated instrument. To
illustrate the use of the methodological steps to translate, adapt and
cross-validate an instrument and to evaluate the preliminary and
initial full psychometric properties of an instrument, we present
an example of a project that used two studies to adhere to the
recommended guideline.
Study one: cross-cultural equivalence and
psychometric properties of the Portuguese
version of the depressive cognition scale
In this study [6], the depressive cognition scale (DCS) was trans-
lated from English into Portuguese using Steps 1 and 2, and
back-translated from Portuguese into English and cross-culturally
validated using Steps 3 and 4. Preliminary psychometric evalua-
tion of the scale was conducted with a bilingual sample using Step
6. Note that Step 5, important to determine clarity of the instruc-
tions, response format and sentence structure of the items, was
not done because the committee comprised three members
whose mother language was Portuguese, who were convinced that
all those aspects of the scale were completely clear. The DCS was
originally developed in English to measure depressive cognitions
[32]. The theoretical basis for the development of the scale was
Beck’s theory of depression and Erikson’s theory of psychosocial
development. The DCS consists of eight items on a 6-point Likert-
type scale ranging from 0 (strongly disagree) to 5 (strongly agree).
Each item of the scale measures a specific cognition: hopeless-
ness, helplessness, purposelessness, powerlessness, worthlessness,
loneliness, emptiness and meaninglessness.
Using Steps 1 through 4, the DCS English version was trans-
lated into Portuguese by two bilingual translators who were fluent
in both English and Portuguese languages to generate two versions
of the translated scale. The Portuguese versions of the scale were
blindly back-translated into English by two different translators
who never saw the original version of the DCS in English. The two
versions of the translated scale were compared with each other,
and each one of these versions was compared with the original
English version of the scale to determine its conceptual, semantic
and content equivalence using a panel of three Brazilian bilinguals
and an American monolingual expert in the content area, in
consultation with the translators who participated in the translation
and back-translation of the versions of the scale. Ambiguities and
discrepancies regarding conceptual and semantic equivalence on
two items that measured emptiness and purposelessness were dis-
cussed and resolved by the committee members. A final version of
the translated scale in Portuguese was derived and named ‘Escala
Cognitiva de Depressão’ (ECD).
As we stated previously, we skipped Step 5 and proceeded to
Step 6 to evaluate the preliminary psychometrics of the Portuguese
version of the scale (ECD), with a bilingual sample of individuals
from the general population, mostly students and faculty members
of a major Brazilian university. This sample was chosen because
of the anticipated difficulty to approach and recruit individuals
from the target population of interest (i.e. individuals with diabetes
mellitus). We used the recommended sample size of five subjects
per item of the scale; therefore, 40 bilingual adults participated in
the study. Using a two-step approach for data collection, partici-
pants received and responded to the Portuguese version of the
scale (ECD) only. Then participants answered a demographic
questionnaire and completed the original English version of the
scale (DCS). We used descriptive statistics, scale and item analy-
sis, and paired t-tests to compare the responses of participants in
both the Portuguese (ECD) and the English (DCS) items of the
scales. Descriptive statistics showed strong similarities in partici-
pants’ responses on both the Portuguese and English versions of
the scale (identical responses ranged from 78% to 98%). Pearson’s
correlation coefficients between each item of the ECD and each
item of the DCS ranged from 0.64 to 0.96 and were statistically
significant (P < 0.001). Estimates of internal consistency reli-
ability and homogeneity were similar in both the ECD and DCS
versions of the scale (Cronbach’s alphas were 0.79 and 0.78,
respectively; item-to-total correlation coefficients ranged from
0.32 to 0.75 on the ECD and from 0.29 to 0.77 on the DCS). There
was no statistically significant difference between the overall mean
score of the ECD (M = 5.05, SD = 3.49) and the overall mean
score of the DCS (M = 5.20, SD = 3.56); t(1,39) = 0.76, P = 0.45.
At the item level, a statistically significant difference was found
between the mean score of item 2 on the ECD (M = 0.80,
SD = 0.76) and the mean score of item 2 on the DCS (M = 0.90,
SD = 0.74); t(1,39) = 2.08, P = 0.04. The findings of Step 6 sug-
gested adequate criterion equivalence and supported the concep-
tual, semantic and content validity of the ECD. The ECD had
adequate internal consistency reliability, homogeneity and initial
support for construct validity. The full psychometric properties of
the ECD could be further tested among individuals from the target
population of interest.
Study two: psychometric properties of the
Portuguese version of the depressive cognition
scale in Brazilian adults with diabetes mellitus
The second study [33] was conducted to continue the psycho-
metric evaluation of the ECD using Step 7. The purpose of the
study was to evaluate the full psychometric properties of the ECD
among adults with diabetes mellitus (i.e. target population of inter-
est). We used the recommended sample size of at least 10 subjects
per item of the scale. The sample consisted of 82 Brazilian adults
with diabetes mellitus. The estimate of internal consistency reli-
ability was a Cronbach’s alpha of 0.88. Scale and item analysis
indicated that the inter-item correlation coefficients ranged from
0.37 to 0.69 and the item-to-total correlation coefficients ranged from
0.51 to 0.71. Exploratory factor analysis indicated that the ECD
was unidimensional, explaining 56.73% of its item variance, and
had factor loadings ranging from 0.60 to 0.80 and communality
values ranging from 0.36 to 0.67. In addition, the ECD total score
was statistically significantly correlated with the Portuguese
version of the Beck Depression Inventory (r = 0.24, P < 0.05).
The study findings further supported the reliability, homogeneity
and construct-related validity of the ECD among Brazilians with
diabetes mellitus.
Conclusions
The diversity of the population worldwide shows a great need for
cross-culturally validated instruments or scales. Translation, adap-
tation and validation of an instrument or scale for cross-cultural
research is time-consuming and requires careful planning and
adoption of rigorous methodological approaches to derive a reli-
able and valid measure of the concept of interest in the target
population. This scholarly paper has reviewed and incorporated
the highly recommended methodological approaches in a clear and
user-friendly guideline. The adoption of symmetrical translation
using the centring process will lead to more accurate adaptation and
cross-cultural validation of a translated instrument. The steps of
the proposed guideline provide clear directions to cross-culturally
validate research instruments or scales. Choosing the right trans-
lators and committee members is a key aspect that must be consid-
ered to enhance the quality of the translation, back-translation and
cross-validation of an instrument or scale. Pilot testing of a trans-
lated instrument or scale among participants whose language is the
TL of the instrument to evaluate the instructions, the items and the
response format of the instrument for clarity also enhances the
quality of the final version of the translated instrument. Finally,
using well-established approaches to test the preliminary psy-
chometric properties of the translated instrument or scale among
bilingual individuals and full psychometrics of the translated
instrument or scale among individuals from the target population
of interest will derive a reliable and valid scale in the TL of
interest.
References
1. Hilton, A. & Skrutkowski, M. (2002) Translating instruments into
other languages: development and testing processes. Cancer Nursing,
25 (1), 1–7.
2. Sperber, A. D. (2004) Translation and validation of study instru-
ments for cross-cultural research. Gastroenterology, 126 (1), S124–
S128.
3. Beaton, D. E., Bombardier, C., Guillemin, F. & Ferraz, M. B. (2000)
Guidelines for the process of cross-cultural adaptation of self-report
measures. Spine, 25 (24), 3186–3191.
4. Beaton, D. E., Bombardier, C., Guillemin, F. & Ferraz, M. B. (2002)
Recommendations for the cross-cultural adaptation of health status
measures. Available at: http://www.dash.iwh.on.ca/assets/images/
pdfs/xculture2002.pdf (last accessed 1 October 2009).
5. Bracken, B. A. & Barona, A. (1991) State of the art procedures for
translating, validating and using psychoeducational tests in cross-
cultural assessment. School Psychology International, 12 (1–2), 119–
132.
6. Sousa, V. D., Zauszniewski, J. A., Mendes, I. A. C. & Zanetti, M. L.
(2005) Cross-cultural equivalence and psychometric properties of
the Portuguese version of the depressive cognition scale. Journal of
Nursing Measurement, 13 (2), 87–99.
7. Brislin, R. W. (1970) Back-translation for cross-cultural research.
Journal of Cross-Cultural Psychology, 1 (3), 185–216.
8. Brislin, R. W. (1986) The wording and translation of research instru-
ments. In Field Methods in Cross-Cultural Research (eds W. J. Lonner
& J. W. Berry), pp. 137–164. Beverly Hills, CA: Sage Publications.
9. Chapman, D. W. & Carter, J. F. (1979) Translation procedures for the
cross cultural use of measurement instruments. Educational Evalua-
tion and Policy Analysis, 1 (3), 71–76.
10. Guillemin, F., Bombardier, C. & Beaton, D. (1993) Cross-cultural
adaptation of health-related quality of life measures: literature review
and proposed guidelines. Journal of Clinical Epidemiology, 46 (12),
1417–1432.
11. Jones, E. (1987) Translation of quantitative measures for use in cross-
cultural research. Nursing Research, 36 (5), 324–327.
12. Jones, E. G. & Kay, M. (1992) Instrumentation in cross-cultural
research. Nursing Research, 41 (3), 186–188.
13. Jones, P. S., Lee, J. W., Phillips, L. R., Zhang, X. E. & Jaceldo, K. B.
(2001) An adaptation of Brislin’s translation model for cross-cultural
research. Nursing Research, 50 (5), 300–304.
14. McDermott, M. A. N. & Palchanes, K. (1994) A literature review of
the critical elements in translation theory. Image: Journal of Nursing
Scholarship, 26 (2), 113–117.
15. Wild, D., Grove, A., Martin, M., Eremenco, S., McElroy, S.,
Verjee-Lorenz, A. & Erikson, P. (2005) Principles of good practice for
the translation and cultural adaptation process for patient-reported
outcomes (PRO) measures: report of the ISPOR task force for trans-
lation and cultural adaptation. Value in Health, 8 (2), 94–104.
16. Maneesriwongul, W. & Dixon, J. K. (2004) Instrument translation
process: a method review. Journal of Advanced Nursing, 48 (2), 175–
186.
17. Beck, C. T., Bernal, H. & Froman, R. D. (2003) Methods to document
semantic equivalence of a translated scale. Research in Nursing and
Health, 26 (1), 64–73.
18. Sousa, V. D., Hartman, S. W., Miller, E. H. & Carroll, M. A. (2009)
New measures of diabetes self-care agency, diabetes self-efficacy, and
diabetes self-management for insulin-treated individuals with type 2
diabetes. Journal of Clinical Nursing, 18 (9), 1305–1312.
19. Topf, M. (1986) Three estimates of interrater reliability for nominal
data. Nursing Research, 35 (4), 253–255.
20. Lynn, M. R. (1986) Determination and quantification of content valid-
ity. Nursing Research, 35 (6), 382–385.
21. Waltz, C. F., Strickland, O. L. & Lenz, E. R. (2005) Measurement in
Nursing and Health Research, 3rd edn. New York: Springer Publishing
Company.
22. Polit, D. F. & Beck, C. T. (2006) The content validity index: are you
sure you know what’s being reported? Critique and recommendations.
Research in Nursing and Health, 29 (5), 489–497.
23. Wynd, C. A., Schmidt, B. & Schaefer, M. A. (2003) Two quantitative
approaches for estimating content validity. Western Journal of
Nursing Research, 25 (5), 508–518.
24. Streiner, D. L. & Norman, G. R. (2008) Health Measurement Scales:
A Practical Guide for Their Development and Use, 4th edn. New York:
Oxford University Press.
25. Nunnally, J. C. & Bernstein, I. H. (1994) Psychometric Theory, 3rd
edn. New York: McGraw-Hill.
26. Hair, J. F. J., Anderson, R. E., Tatham, R. L. & Black, W. C. (1998)
Multivariate Data Analysis, 5th edn. Englewood Cliffs, NJ: Prentice-
Hall, Inc.
27. Stevens, J. (2002) Applied Multivariate Statistics for the Social
Sciences, 4th edn. Mahwah, NJ: Lawrence Erlbaum Associates.
28. Tabachnick, B. G. & Fidell, L. S. (2001) Using Multivariate Statistics,
4th edn. Needham Heights, MA: Allyn & Bacon.
29. Gonzalez, R. & Griffin, D. (2001) Testing parameters in structural
equation modeling: every ‘one’ matters. Psychological Methods, 6 (3),
258–269.
30. MacCallum, R. C., Browne, M. W. & Sugawara, H. M. (1996) Power
analysis and determination of sample size for covariance structure
modeling. Psychological Methods, 1 (2), 130–149.
31. MacCallum, R. C., Browne, M. W. & Cai, L. (2006) Testing differ-
ences between nested covariance structure models: power analysis and
null hypotheses. Psychological Methods, 11 (1), 19–35.
32. Zauszniewski, J. A. (1995) Development and testing of a measure of
depressive cognitions in older adults. Journal of Nursing Measure-
ment, 3 (1), 31–41.
33. Sousa, V. D., Zanetti, M. L., Zauszniewski, J. A., Mendes, I. A. C. &
Daguano, M. O. (2008) Psychometric properties of the Portuguese
version of the depressive cognition scale in Brazilian adults
with diabetes mellitus. Journal of Nursing Measurement, 16 (2), 125–
135.
... The development and validation of the PEACHD survey followed a structured process with 3 main steps. 25,26 Initially, a draft questionnaire was created based on the validated In-Center Hemodialysis (ICH) CAHPS survey, 27 which was translated into Italian through a forward and backward translation process by 2 independent translators. During this phase, a first cultural and contextual review of the translated items was conducted by a researcher and a nephrologist involved in the project. ...
... The panel rated the relevance of each item on a 4-point Likert scale (1 = not relevant to 4 = very relevant). 25 Items rated as 1 or 2 by any expert were revised. Face validity, which evaluates whether the survey items appear clear and understandable to respondents, was evaluated through a cognitive debriefing with 21 people undergoing hemodialysis. ...
... A "think-aloud" technique was used to gather feedback on the meaning of questions and response options, 26 and items reported as unclear by at least 20% of participants were revised. 25,26 Qualitative feedback was also used to adjust the wording of some questions. ...
Article
Full-text available
Patient experience is a crucial measure of healthcare quality with the potential to increase value for several health stakeholders. However, various barriers often hinder its impact on quality improvement. Therefore, valid and reliable instruments developed through structured and collaborative processes are needed to establish methodological and organizational practices and ensure consensus and credibility among all stakeholders. This study presents the development and validation of the Patient Experience Assessment of in-Center Hemodialysis (PEACHD) survey. An expert panel, cognitive interviews, and a pilot test were conducted, involving both people receiving hemodialysis care and professionals from four Italian hospitals. The questionnaire evaluates key aspects of the in-center hemodialysis experience, including the provision of medical information, involvement in treatment decision-making, and communication with professionals. The PEACHD survey demonstrated strong content and face validity, acceptable construct validity, and good internal consistency reliability. Pilot data highlighted that the professional delivering care (i.e. nephrologist or dialysis nurse) significantly influenced patient experience and emphasized the need for a holistic and person-centered approach. The PEACHD survey enables effective patient experience evaluation, enhancing value for both service users and professionals.
... Between 30 and 40 persons or subjects from the target setting should be evaluated (Beaton et al., 2000), totaling 10-40 participants ideally. The purpose of this stage was to ensure the clarity and accuracy of the selection of words and phrases based on the target group's perspective (Beaton et al., 2000); Maneesriwongul and Dixon, 2004;Sousa and Rojjanasrirat, 2011). The extent to which the target group understood the meaning, ease of reading, and the estimated time required to complete the questionnaire were determined. ...
... At this stage, the participants provided feedback both verbally and through observations during the pilot study. After the instrument was prepared, the participants provided feedback on its items (Sousa and Rojjanasrirat, 2011). The results show changes in the diction of "children". ...
Article
Full-text available
Introduction: This study aimed to translate and validate the Self-Efficacy Questionnaire (SEQ-C) and its subscale for Indonesian adolescents, which has potential implications for bullying prevention. Methods: Cross-cultural adaptation was carried out using the Beaton guidelines. An assessment of psychometric testing was carried out during January and February 2024. The eligibility criteria for participants were students aged 13 to 15. Students who declined to participate were excluded. The research involved 120 children. Testing the questionnaire's structural factors used Confirmatory Factor Analysis (CFA). IBM SPSS 25 and AMOS 29 were used for the analysis. Results: Following the criteria established for CFA, two items (ASE10 and SSE18) were eliminated due to their low factor loadings. This resulted in a refined SEQ-C structure of 22 items distributed across three factors. The alpha reliability coefficients showed robust internal consistency for the entire scale at first test and retest (α=0.884; α =0.911) and for each of the three subscales (all >0.80). The model fit indices indicated satisfactory values for the Comparative Fit Index (CFI)=0.906; Root Mean Square Error of Approximation (RMSEA)=0.063; and the Minimum Discrepancy Function by Degrees of Freedom divided (CMIN/DF)=1.474). Conclusion: The SEQ-C emerges as a trustworthy and valid tool for evaluating self-efficacy across three key components: intellectual, social, and emotional. It can assess adolescent self-efficacy for research, education, and nursing interventions, as part of enhancing the life skills of adolescents.
... After receiving authorization from the original authors of the MUCSS (Bartolome G., personal communication), the translation process followed these steps [44]: However, two items did not achieve 80% agreement on clarity among the evaluators. The main concerns involved the terminology and grammatical structure of the original scale. ...
... Therefore, the aim of this study was to test the psychometric properties of IT-MUCSS. First, the scale has been appropriately translated and culturally adapted for the Italian population to ensure its validity, in accordance with the guidelines established by Sousa et al. [44]. Then, in this study, we contribute to verifying the inter-rater reliability and internal consistency of the MUCSS-IT, ensuring the scale's robustness and coherence among its components, as the scale has preserved its accuracy and reliability. ...
Article
Full-text available
Background/Objectives: Oral intake and secretions need to be assessed separately, especially in patients with tracheal tubes, as they are vital for dysphagia treatment and may require different management strategies. This study aims to validate the Italian version of the Munich Swallowing Score (IT-MUCSS) by examining its content and construct validity in relation to the fiberoptic endoscopic evaluation of swallowing (FEES) and oral intake in adults with neurogenic dysphagia, as well as assessing intra- and inter-rater reliability. This tool is clinically and scientifically useful as it includes two subscales: IT-MUCSS-Saliva, which assesses saliva/secretion management and the presence of a tracheal tube, and IT-MUCSS-Alimentazione, which evaluates feeding methods. Methods: In this prospective cross-sectional study, a total of 50 dysphagic patients with a neurological diagnosis were recruited from a neuro-rehabilitation hospital and underwent both clinical and instrumental assessments. The main outcome measures included evaluating food and liquid intake using the Italian versions of the Functional Oral Intake Scale (FOIS-It) and the IT-MUCSS. Pharyngeal residues were assessed using the Yale Pharyngeal Residue Severity Rating Scale (IT-YPRSRS), and airway penetration/aspiration were evaluated using the Penetration–Aspiration Scale (PAS) during FEES. Results: The IT-MUCSS demonstrated excellent reproducibility (K = 0.91) and internal consistency (Cronbach’s alpha = 0.72). Strong correlations were found between IT-MUCSS and the FOIS-It scale, indicating the effective assessment of dysphagia. Test–retest reliability was high (ICC = 0.96 for total score). Construct validity was confirmed through significant correlations with instrumental measures during FEES. Conclusions: The IT-MUCSS is a valid tool for assessing functional oral intake and the management of saliva/secretions, specifically in relation to the level of saliva/secretions management compared to FEES measures of swallowing safety and efficiency in patients with neurogenic dysphagia.
... The study employed a convenience sampling approach and utilized WeChat, a popular social media platform in China, to administer an online survey via the professional platform "Questionnaire Star. " The sample size required for instrument validation was calculated to be 5-20 times the total number of items [19]. Given that the study included 44 items, we initially collected 1,064 responses. ...
Article
Full-text available
Background In recent decades, mental health and stress among medical students have become a global concern. Currently, China lacks a scale specifically designed to assess stress levels in medical school settings. This study aims to cross-culturally translate and adapt the Perceived Medical School Stress (PMSS) Scale into Chinese, evaluate its psychometric properties in medical schools, and analyze the associated factors of medical students’ stress levels. Methods Data collection for the Chinese version of the PMSS was conducted from October to November 2023, among medical students from selected medical schools in North and East China. Exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) were used to evaluate the underlying factor structure. Content validity was assessed using the Content Validity Index (CVI). Criterion validity was evaluated with the Chinese version of the Perceived Stress Scale (PSS). Internal consistency was assessed by calculating Cronbach’s alpha coefficient, McDonald’s Omega coefficient, and test-retest reliability. Additionally, relationships between medical school stress and general demographic characteristics, insomnia severity, and self-efficacy were examined. Results The final Chinese version of the PMSS supports a two-factor structure with 13 items, defined as “psychological stress and environment” and “resilience and expectations.” The scale’s Content Validity Index (CVI) was 0.980, with a criterion validity of 0.767. The Cronbach’s alpha coefficient was 0.911, McDonald’s Omega coefficient was 0.914, and the test-retest reliability was 0.794. Medical school stress levels showed significant differences based on gender and educational background (P < 0.05). Stress levels were positively correlated with insomnia severity and negatively correlated with self-efficacy. Conclusions The Chinese version of the PMSS is a reliable and valid tool for assessing stress levels among medical students in Chinese medical schools. Female students and those pursuing graduate degrees report higher levels of medical stress. Insomnia severity and self-efficacy significantly influence stress levels among medical students.
... First, the corresponding author communicated with one of the developers of the tool and permission and the original version of SIDAS were received. Then the translation of SIDAS from English to Farsi was carried out according to the guidelines [20] and during 5 steps in the following order: forward translation (in Persian) by two people, integration of forward translations, backward translation (in English) by two other people, integration of backward translations and preparation of the final version and finally translation of the final version into Persian and application of Persian SIDAS to check validity and reliability. ...
Article
Full-text available
Background The prevalence of suicidal ideation is increasing worldwide, and a suitable tool is needed to identify suicidal thoughts at an early stage. This study aimed to translate and culturally adapt the Suicidal Ideation Attributes Scale (SIDAS) for the Iranian general population. Methods This study was conducted on 1297 participants (EFA = 364 samples, CFA = 933 samples) in 2024. The psychometric properties of the SIDAS, including face validity, content validity, construct validity (exploratory and confirmatory factor analysis), Cronbach’s alpha coefficient, intraclass correlation coefficient (ICC) and McDonald’s omega coefficient, were evaluated. Results The S-CVI/Ave and CVR of the SIDAS were both 1. In the EFA, one factor with an eigenvalue greater than 1 was extracted, explaining 70.95% of the variance of the SIDAS; the Kaiser-Meyer-Olkin measure of sampling adequacy was 0.817. In the CFA, the one-factor structure was evaluated and confirmed based on the goodness-of-fit indices (for example, CFI = 0.995, RMSEA = 0.061, NFI = 0.994, GFI = 0.992). For reliability, the McDonald’s omega coefficient, Cronbach’s alpha coefficient and ICC of the SIDAS were 0.931, 0.779 and 0.880, respectively. Conclusion The Persian version of the SIDAS was confirmed with one factor and five items; this brief scale is appropriate for measuring the presence and severity of suicidal thoughts in the general population.
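The content validity figures quoted here (S-CVI/Ave and CVR both equal to 1) follow from standard formulas: Lawshe's CVR = (n_e - N/2) / (N/2), and S-CVI/Ave is the mean of the item-level CVIs. The small sketch below is illustrative; the panel size and ratings are hypothetical and not from the SIDAS study.

def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR: (n_e - N/2) / (N/2)."""
    return (n_essential - n_experts / 2) / (n_experts / 2)

def scale_cvi_average(item_cvis):
    """S-CVI/Ave: the mean of the item-level content validity indices."""
    return sum(item_cvis) / len(item_cvis)

# A CVR and S-CVI/Ave of 1 require every expert to rate every item essential/relevant
print(content_validity_ratio(n_essential=10, n_experts=10))   # 1.0
print(scale_cvi_average([1.0, 1.0, 1.0, 1.0, 1.0]))           # 1.0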
Article
Background Nurse managers’ (NMs) assessment of nurses’ competences is needed to analyse how well the educational preparation corresponds with the requirements of nursing practice in Europe. Aim To assess newly graduated nurses’ professional competence in the transition phase as perceived by NMs and to identify possible background factors related to their assessments. Methods A descriptive cross-sectional multinational study. Data were collected in 2019 from NMs (n = 425) in Finland, Germany, Iceland, Lithuania and Spain using the structured Nurse Competence Scale and statistically analysed. Results NMs assessed the level of newly graduated nurses’ competence as ‘good’. However, the overall competence varied between different countries. In all countries, the subcategory ‘Managing situations’ scored the highest and ‘Therapeutic interventions’ the lowest. NMs’ background factors were related to their assessment. Conclusions Newly graduated nurses were assessed to have a good level of professional competence to meet the demands of their work in the transition phase, although there is room for improvement. The results can be used for cooperation between working life and nursing education to identify areas where the professional competence of newly qualified nurses can be improved and to promote their transition and continuous professional development in Europe.
Article
Full-text available
This study aimed to translate, adapt, and validate the Player Experience of Need Satisfaction (PENS) – modified version for Malaysian Multiplayer Online Battle Arena (MOBA) players. The translation and adaptation process involved a rigorous procedure with nine experts from the mental health and linguistic fields. A non-probability sampling method was applied by recruiting 491 participants from MOBA Facebook groups. The parallel analysis indicated a two-factor structure (autonomy and competence as one factor and relatedness as another factor). Confirmatory factor analysis demonstrated a better model fit for the translated version compared to the original English version, with satisfactory psychometric properties. Future studies should address psychometric concerns and evaluate the scale across diverse populations and game genres. The study provides a foundation for understanding and enhancing the gaming experience in the dynamic and rapidly growing MOBA gaming market.
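Parallel analysis, the factor-retention method mentioned above, compares the eigenvalues of the observed correlation matrix with those obtained from random, uncorrelated data of the same dimensions. The sketch below is a generic Horn-style implementation run on simulated two-factor data; it is not the PENS analysis itself, and all names and values are illustrative.

import numpy as np

def parallel_analysis(data, n_sims=500, percentile=95, seed=0):
    """Horn's parallel analysis on the correlation matrix of `data` (n_obs x n_vars).

    Returns the number of components whose observed eigenvalues exceed the chosen
    percentile of eigenvalues from random, uncorrelated data of the same shape.
    """
    rng = np.random.default_rng(seed)
    n_obs, n_vars = data.shape
    obs_eigs = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    sim_eigs = np.empty((n_sims, n_vars))
    for i in range(n_sims):
        random_data = rng.standard_normal((n_obs, n_vars))
        sim_eigs[i] = np.linalg.eigvalsh(np.corrcoef(random_data, rowvar=False))[::-1]
    threshold = np.percentile(sim_eigs, percentile, axis=0)
    return int(np.sum(obs_eigs > threshold))

# Simulated two-factor data (illustrative only): six items loading on two latent factors
rng = np.random.default_rng(1)
factors = rng.standard_normal((300, 2))
loadings = np.array([[0.8, 0.0], [0.7, 0.1], [0.8, 0.1],
                     [0.1, 0.8], [0.0, 0.7], [0.1, 0.8]])
data = factors @ loadings.T + 0.5 * rng.standard_normal((300, 6))
print(parallel_analysis(data))   # expected to suggest 2 factors for this simulated data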
Preprint
Full-text available
Background Unintentionally pregnant individuals in Germany seeking an abortion face challenges due to legal regulations, stigma and difficult access to abortion care. Abortion attitudes of (prospective) physicians influence the care situation. To measure these attitudes, psychometrically sound instruments like the Abortion Attitude Scale (AAS) are necessary. So far, no instruments assessing attitudes toward abortion are available in German. The aim of this study is to translate, culturally adapt and psychometrically test the AAS. Methods This is a secondary analysis of a cross-sectional study on abortion attitudes of medical students in Germany. The English 14-item AAS was translated into German and adapted using a team translation protocol. Comprehensibility was tested via cognitive interviews (n = 10 medical students). We analyzed acceptance (completion rate), factorial structure (confirmatory factor analysis (CFA), model fit), item characteristics (response distribution, item difficulties, corrected item-total correlations, inter-item correlations) and reliability (McDonald’s omega). Results The translated and adapted AAS version was comprehensible. AAS data from 305 medical students could be included in the analysis. The completion rate was above 98% for all items. The CFA results confirmed a one-factorial structure, but a model without item 10 and with a correlation between item 8 and item 13 showed the best model fit. Floor or ceiling effects were found for 7 items; item difficulties ranged between 0.39 and 0.94; corrected item-total correlations ranged between 0.460 and 0.766 for the best model; inter-item correlations ranged from 0.129 to 0.681; and McDonald’s omega was above 0.9 for both models. Conclusion The German AAS is a brief measure with high acceptance and good psychometric properties. Removal of item 10 could be discussed. The AAS can ease and improve the evaluation of attitudes toward abortion in Germany. This could potentially lead to the development of targeted interventions to reduce barriers and improve care for unintentionally pregnant individuals.
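Of the item characteristics reported in this abstract, the corrected item-total correlation is the easiest to illustrate: each item is correlated with the sum of the remaining items rather than with the full total. The sketch below is generic, with invented responses rather than the AAS data.

import numpy as np

def corrected_item_total_correlations(item_scores):
    """Correlation of each item with the sum of the remaining items (item removed)."""
    x = np.asarray(item_scores, dtype=float)
    total = x.sum(axis=1)
    corrs = []
    for j in range(x.shape[1]):
        rest = total - x[:, j]                        # total score without item j
        corrs.append(np.corrcoef(x[:, j], rest)[0, 1])
    return corrs

# Hypothetical 5-point responses to a 4-item attitude scale (rows = respondents)
responses = np.array([[5, 4, 5, 4],
                      [2, 2, 3, 2],
                      [4, 4, 4, 5],
                      [1, 2, 1, 2],
                      [3, 3, 4, 3]])
print([round(r, 2) for r in corrected_item_total_correlations(responses)])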
Article
This study adapts the “Scales for Identifying Gifted Students (SIGS‐2)” into Turkish for use from preschool onward, specifically during the candidate nomination stage. Conducted with 974 parents (675 mothers, 299 fathers) of children aged 5–10, it employs Confirmatory Factor Analysis (CFA) to evaluate the scale's structure and reliability. CFA results show excellent fit indices (CFI = 0.998, GFI = 0.994, IFI = 0.998, NFI = 0.993, NNFI = 0.998, RFI = 0.993) and an RMSEA of 0.017, indicating good model fit. Factor loadings ranged from 0.36 to 0.89, and item‐total correlations were between 0.32 and 0.79, demonstrating effective discrimination. Reliability coefficients were high, with Cronbach's alpha, McDonald's omega, and Composite Reliability (CR) ranging from 0.87 to 0.96. The SIGS‐2 Home Rating Scale aligns well with existing measures and reflects changes in IQ levels, showing its suitability for assessing gifted children in Türkiye. While parental nominations are valuable, they may be less reliable than test results in identifying giftedness.
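Composite reliability (CR), one of the coefficients reported above, can be computed directly from standardized factor loadings as (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances), with each error variance taken as 1 minus the squared loading. The loadings in the sketch below are illustrative values within the 0.36-0.89 range reported, not the actual SIGS-2 estimates.

def composite_reliability(std_loadings):
    """Composite reliability from standardized loadings, with error variance 1 - loading**2."""
    sum_loadings = sum(std_loadings)
    sum_errors = sum(1 - l ** 2 for l in std_loadings)
    return sum_loadings ** 2 / (sum_loadings ** 2 + sum_errors)

# Hypothetical standardized loadings for a single factor (illustrative only)
print(round(composite_reliability([0.36, 0.55, 0.70, 0.82, 0.89]), 2))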
Article
Full-text available
A problem with standard errors estimated by many structural equation modeling programs is described. In such programs, a parameter's standard error is sensitive to how the model is identified (i.e., how scale is set). Alternative but equivalent ways to identify a model may yield different standard errors, and hence different Z tests for a parameter, even though the identifications produce the same overall model fit. This lack of invariance due to model identification creates the possibility that different analysts may reach different conclusions about a parameter's significance level even though they test equivalent models on the same data. The authors suggest that parameters be tested for statistical significance through the likelihood ratio test, which is invariant to the identification choice.
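The likelihood ratio test recommended in this abstract is usually carried out as a chi-square difference test between nested models: a restricted model with the parameter fixed and the full model with it freely estimated. A minimal sketch follows, using hypothetical fit statistics rather than values from the article.

from scipy.stats import chi2

def chi_square_difference_test(chisq_restricted, df_restricted, chisq_full, df_full):
    """Likelihood ratio (chi-square difference) test for nested SEM models."""
    delta_chisq = chisq_restricted - chisq_full
    delta_df = df_restricted - df_full
    p_value = chi2.sf(delta_chisq, delta_df)
    return delta_chisq, delta_df, p_value

# Hypothetical example: fixing one parameter to zero raises chi-square from 52.3 (df = 24) to 61.8 (df = 25)
print(chi_square_difference_test(chisq_restricted=61.8, df_restricted=25,
                                 chisq_full=52.3, df_full=24))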
Book
Clinicians and those in health sciences are frequently called upon to measure subjective states such as attitudes, feelings, quality of life, educational achievement and aptitude, and learning style in their patients. This fifth edition of Health Measurement Scales enables these groups to both develop scales to measure non-tangible health outcomes, and better evaluate and differentiate between existing tools. Health Measurement Scales is the ultimate guide to developing and validating measurement scales that are to be used in the health sciences. The book covers how the individual items are developed; various biases that can affect responses (e.g. social desirability, yea-saying, framing); various response options; how to select the best items in the set; how to combine them into a scale; and finally how to determine the reliability and validity of the scale. It concludes with a discussion of ethical issues that may be encountered, and guidelines for reporting the results of the scale development process. Appendices include a comprehensive guide to finding existing scales, and a brief introduction to exploratory and confirmatory factor analysis, making this book a must-read for any practitioner dealing with this kind of data.