Translation, adaptation and validation of instruments or
scales for use in cross-cultural health care research:
a clear and user-friendly guideline
Valmi D. Sousa PhD RN1 and Wilaiporn Rojjanasrirat PhD RNC IBCLC2
1Associate Professor, The University of Kansas School of Nursing, Kansas City, Kansas, USA
2Associate Professor, Graceland University School of Nursing, Independence, Missouri, USA
Keywords
back-translation, cross-cultural validation,
health care research, translation
Correspondence
Valmi D. Sousa
The University of Kansas School of Nursing
3901 Rainbow Boulevard
Kansas City, KS 66160
USA
E-mail: vsousa@kumc.edu
Accepted for publication: 3 February 2010
doi:10.1111/j.1365-2753.2010.01434.x
Abstract
Rationale, aims and objectives The diversity of the population worldwide suggests a
great need for cross-culturally validated research instruments or scales. Researchers and
clinicians must have access to reliable and valid measures of concepts of interest in their
own cultures and languages to conduct cross-cultural research and/or provide quality
patient care. Although there are well-established methodological approaches for translat-
ing, adapting and validating instruments or scales for use in cross-cultural health care
research, a great variation in the use of these approaches continues to prevail in the health
care literature. Therefore, the objectives of this scholarly paper were to review published
recommendations of cross-cultural validation of instruments and scales, and to propose and
present a clear and user-friendly guideline for the translation, adaptation and validation of
instruments or scales for cross-cultural health care research.
Methods A review of highly recommended methodological approaches to translation,
adaptation and cross-cultural validation of research instruments or scales was performed.
Recommendations were summarized and incorporated into a seven-step guideline. Each
one of the steps was described and key points were highlighted. An example of a project
using the proposed steps of the guideline was fully described.
Conclusions Translation, adaptation and validation of instruments or scales for cross-
cultural research is very time-consuming and requires careful planning and the adoption of
rigorous methodological approaches to derive a reliable and valid measure of the concept
of interest in the target population.
Introduction
Globalization and migration have contributed to an increasing
diversity of the population in many countries, particularly in the
USA, where the background of the population is extremely
diverse regarding culture, language and ethnicity [1,2]. Therefore,
there is a need for relevant cross-cultural research to address a
number of problems among these multinational and multicultural
populations. However, health care researchers conducting cross-
cultural studies must have access to reliable and cross-validated
instruments in other cultures and/or in other languages [3–6].
Findings from cross-cultural research may have great clinical
implications for physicians, nurses and other health care profes-
sionals who provide care for diverse populations because the
delivery of quality care depends on the accurate assessment and
deeper understanding of an individual’s cultural, linguistic and
ethnic background.
Background and significance
The increase in diverse populations worldwide and the need for
cross-cultural and multinational research indicate a great need for
clinicians and researchers to have access to reliable and valid
instruments or measures cross-validated among diverse cultural
segments of the population and/or in other languages [3,4,6]. This
would enhance the validity, the generalization and the translation
of cross-cultural health care research. Although there are well-
established methodological approaches for translating, adapting
and validating instruments for use in cross-cultural health care
research [2–5,7–15], a great variation in the use of these
approaches continues to prevail in the health care literature.
A recent review of 47 methodological studies focusing on the
translation and validation of instruments for cross-cultural
research reported that the quality and methodological approaches
of the reviewed studies varied greatly [16]. There was no clear
Journal of Evaluation in Clinical Practice ISSN 1365-2753
© 2010 Blackwell Publishing Ltd, Journal of Evaluation in Clinical Practice 17 (2011) 268–274
consensus among researchers on how the approaches should be
used or combined, a great variation on the qualifications of trans-
lators, and a lack of detailed information about the translation,
back-translation, validation, testing, and revision and refinement
of the instruments.
Corroborating these findings, another researcher [2] reported
that, unfortunately, translating, adapting and cross-culturally
validating a research instrument is treated as an unimportant
step of study protocols. Furthermore, the most commonly used
and reported methodological approach was the forward trans-
lation only, often using an unqualified translator. In addition,
the same author [2] reported that only a few researchers have
described the use of strategies and steps or processes for the
adaptation and/or validation of the instruments, and empha-
sized that it is not sufficient to forward translate an instrument
without carefully evaluating its adaptation and cross-cultural
validation. The procedure should consist of a comprehensive
process that involves not only translation of an instrument,
but also thorough evaluation of its adaptation and cross-cultural
validation.
These data suggest that despite the existing recommendations
and guidelines to use a comprehensive multistep process for
translating, adapting and cross-validating instruments, research-
ers have not been doing this. It may be that the methodological
approaches are not clearly presented in a user-friendly format,
which makes it difficult for researchers to adopt and follow
the recommendations. Therefore, the objectives of this scholarly
paper were to review published recommendations of cross-
cultural validation of instruments and scales, and to propose and
present the highly recommended methodological approaches
for translating, adapting and validating instruments for cross-cultural
health care research in a clear and user-friendly guideline.
This would lead to better understanding and use of
these approaches by health care researchers, particularly nurse
researchers worldwide.
Methodological approaches
Of the two main categories of translation (symmetrical and asym-
metrical), the symmetrical category is the most recommended
approach because it refers to faithfulness of meaning and collo-
quialness in both the source language (SL; original language of
the instrument) and the target language (TL; desired language)
and not to a literal translation [12]. The purpose of translation is
to achieve equivalence between the instrument in the SL and the
instrument in the TL [2]. The symmetrical translation is the only
category that facilitates the comparison of responses from indi-
viduals of one culture to those of another [6,12,13] and the deter-
mination of the most relevant types of cross-cultural equivalence:
the semantic, conceptual, content, technical and criterion [1,17].
In addition, the process known as centring [7], in which both
the SL and the TL of an instrument are equally important, should
be used.
The process of translation, adaptation and cross-cultural
validation of an instrument for use in other cultures, languages
and countries requires careful planning and adoption of com-
prehensive, rigorous and most established methodological
approaches [2–5,7–15]. Because there are variations among these
approaches, we have incorporated the most recommended
ones in a user-friendly guideline to facilitate adoption, consis-
tency and use.
Step 1: translation of the original instrument
into the target language (forward translation
or one-way translation)
The instrument in the source (original) language is forward
translated to the TL (target language) by at least two independent
translators, preferably certified, whose mother language is the
desired TL of the instrument. The translators must be bilingual
(i.e. fluent in the source and desired TL of the instrument) and
preferably bicultural (i.e. having in-depth experience in the culture
of the source and desired TL of the instrument). In addition, the
two translators must have distinct backgrounds. The first translator
must be knowledgeable about health care terminology and the
content area of the construct of the instrument in the desired TL.
The second translator must be familiar with colloquial phrases,
health care slang and jargon, idiomatic expressions, and emotional
terms in common use in the desired TL. The second translator
should not be knowledgeable about medical terminology and/or
the construct of the instrument. This approach will generate two
translated versions that contain words and sentences that cover
both the medical and the usual spoken language with its cultural
nuances. Therefore, choosing well-qualified translators is the key
to high-quality translations. If resources are available, translations
can also be done by two teams of independent translators (each
team of translators must have the same characteristics as the two
individual independent translators described above), which may
result in higher-quality translations by minimizing the introduc-
tion of personal idiosyncrasies when using only two independent
translators.
Key points
1. Instrument in the SL → translated to the TL (TL1 and TL2) to
produce two forward-translated versions of the instrument.
2. Use two bilingual and bicultural translators whose mother
language is the desired TL, but who have distinct backgrounds:
• One translator must be knowledgeable about health terminology
and the content area of the construct of the instrument
in the TL.
• The other translator must be knowledgeable about the cultural
and linguistic nuances of the TL.
3. Two independent teams of translators can also be used (each
team of translators must have the same characteristics as the two
individual independent translators).
Step 2: comparison of the two translated
versions of the instrument (TL1 and
TL2): synthesis I
The instructions, the items and the response format of the two
forward-translated versions of the instrument (TL1 and TL2) and
both the TL1 and the TL2 with the original version of the instru-
ment in the SL are initially compared by a third bilingual and
preferably bicultural independent translator regarding ambigui-
ties and discrepancies of words, sentences and meanings. Any
ambiguities and discrepancies must be discussed and resolved
using a committee approach. Consensus should be achieved with
the participation of the third translator, the two translators from
Step 1, and the investigator and/or other members of the research
team. This process will generate the preliminary initial translated
version of the instrument in the TL (PI-TL).
Key points
1. Use a third independent translator to compare the TL1 and TL2,
and to compare both the TL1 and TL2 with the SL version of the
instrument.
2. Use a committee approach (third independent individual or
translator, translators who participated in Step 1, and investigator
and/or other members of the research team) to resolve ambiguities and
discrepancies and derive the PI-TL.
Step 3: blind back-translation (blind backward
translation or blind double translation) of
the preliminary initial translated version
of the instrument
The PI-TL is translated back into the SL by two other independent
translators with the same qualifications and characteristics
described above in Step 1. For this step, the translators’ mother
language should be the SL of the original instrument, and they
should be completely blind to the original version of the instrument
(i.e. they have never seen the original version of the instrument).
They will produce two back-translated versions of the instrument.
Again, the first translator must be knowledgeable about health care
terminology and the content area of the construct of the instrument
in the SL, but must have no prior knowledge of the instrument being
back-translated. The second translator must be familiar with colloquial
phrases, health care slang and jargon, idiomatic expressions,
and emotional terms in common use in the SL. The second translator
should not be knowledgeable about medical terminology and/or
construct of the instrument and should have no prior knowledge of
the instrument being back-translated as well. If resources are avail-
able, back-translation can also be done by two teams of translators,
which may result in higher-quality back-translations by minimiz-
ing the introduction of personal idiosyncrasies when using only
one independent back-translator to generate each initial back-
translated version of the instrument. This process will result in
two back-translated versions of the instrument in its original
SL (B-TL1 and B-TL2). This step allows for clarification of
words and sentences used in the translations. As noted in Step 1,
choosing well-qualified translators is the key to high-quality
back-translations.
Key points
1. The PI-TL → back-translated to the SL (B-TL1 and B-TL2) to
produce two back-translated versions.
2. Use two bilingual and bicultural translators whose mother
language is the SL, but who have distinct backgrounds:
• One translator must be knowledgeable about health terminology
and the content area of the construct of the instrument
in the SL.
• The other translator must be knowledgeable about the cultural
and linguistic nuances of the SL.
3. Two independent teams of translators can also be used (each
team of translators must have the same characteristics as the two
individual independent translators).
Step 4: comparison of the two back-translated
versions of the instrument (B-TL1 and B-TL2):
synthesis II
Initially, the instructions, items and response format of the two
back-translations (B-TL1 and B-TL2) are compared by a multidisciplinary
committee with the instructions, items and response
format of the original instrument in the SL regarding format,
wording, grammatical structure of the sentences, similarity
in meaning, and relevance. It is highly recommended that the
committee should include at least one methodologist (who can be
the investigator and/or a member of the research team), one health
care professional who is familiar with the content areas of the
construct of the instrument, and all four bilingual and bicultural
translators involved in Step 1 (forward translation of the instru-
ment into the TL) and Step 3 (back-translation of the instrument
from the TL into the SL). It is also recommended that the developer
of the original instrument in the SL participate and provide
insights on the construct of the instrument and clarify any questions
that might arise. Having at least one monolingual committee
member whose mother language is the TL of the instrument would
enhance the quality of the pre-final version of the translated in-
strument. Any ambiguities and discrepancies regarding cultural
meaning and colloquialisms or idioms in words and sentences of
the instructions, the items, and the response format between the
two back-translations (B-TL1 and B-TL2) and between each one
of the two back-translations (B-TL1 and B-TL2) and the original
instrument in the SL are discussed and resolved through consensus
among the committee members to derive a pre-final version of
the instrument in the TL (P-FTL).
If discrepancies cannot be resolved, it may be necessary to
repeat Steps 1 through 4: two other independent bilingual and
bicultural translators must be used to translate the original instru-
ment (SL) again to generate two translations, and two other inde-
pendent bilingual and bicultural translators must be used to back-
translate the translated versions of the instrument (TL) following
the same procedure described above (known as the repetition
approach). Alternatively, only items that do not retain their original
meaning are re-translated and back-translated. The evaluation of
the translated and back-translated versions follows the same vali-
dation process described above. This process is repeated until no
ambiguities or discrepancies are found.
These methodological approaches of Step 4 will establish the
initial conceptual, semantic and content equivalence of the P-FTL.
Conceptual equivalence refers to the degree to which a concept of
the items of the instrument exists in both the source and target
cultures. Semantic equivalence refers to sentence structure, collo-
quialisms or idioms that ensure that the meaning of the text or idea
of the items of the instrument in the SL is present in the TL.
Finally, content equivalence refers to the relevance and pertinence
of the text or idea of the items of the instrument in each culture.
The committee’s role is to evaluate, revise and consolidate the
instructions, items and response format of the back-translated
instruments that have conceptual, semantic and content equiva-
lency and to develop the P-FTL for pilot and psychometric testing.
Key points
1. Comparison between the two back-translations (B-TL1 and
B-TL2) of the instrument, and between both B-TL1 and B-TL2 and
the original SL instrument:
• Evaluate similarity of the instructions, items and response
format regarding wording, sentence structure, meaning and
relevance.
2. Use a multidisciplinary committee:
• One methodologist (researcher or a member of the research
team).
• One health care professional.
• All four bilingual and bicultural translators used in Step 1 and
Step 3: two translators whose mother language is the desired TL
of the instrument and two translators whose mother language
is the SL of the original instrument.
3. If possible, the developer of the original instrument should
participate in the discussions.
4. If ambiguities and discrepancies cannot be resolved, Steps 1
through 4 may be repeated as many times as necessary. Alternatively,
only items that do not retain their original meaning are
re-translated and back-translated.
Step 5: pilot testing of the pre-final version of
the instrument in the target language with a
monolingual sample: cognitive debriefing
The P-FTL is pilot tested among participants whose language is
the TL of the instrument to evaluate the instructions, response
format and the items of the instrument for clarity. Participants
should be recruited from the target population in which the instru-
ment will be used (e.g. if the instrument measures self-care among
individuals with type 2 diabetes, then the sample must consist of
individuals with type 2 diabetes). A sample size of 10–40 individuals
is recommended [3,18]. Each participant is asked to rate the
instructions and items of the scale using a dichotomous scale
(clear or unclear). Participants who rate the instructions, response
format or any item of the instrument as unclear are asked to provide
suggestions as to how to rewrite the statements to make the lan-
guage clearer. Instructions, response format and items of the instru-
ment that are found to be unclear by at least 20% of the sample must
be re-evaluated [19]. Therefore, the minimum inter-rater agreement
among the sample is 80%. This step is used to further support the
conceptual, semantic and content equivalency of the translated
instrument and further improve the structure of sentences used
in the instructions and items of the P-FTL to be easily understood
by the target population prior to psychometric testing.
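The 20% rule described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the guideline itself: the function name, the data layout (one list of clear/unclear votes per element of the instrument) and the pilot data are all hypothetical.

```python
def flag_unclear_items(ratings, threshold=0.20):
    """Return the elements rated 'unclear' by at least `threshold` of the sample.

    `ratings` maps a label (instructions, response format or an item) to a
    list of booleans, one per participant, where True means 'unclear'.
    The 0.20 default mirrors the 20% cut-off cited in the text [19].
    """
    flagged = {}
    for label, votes in ratings.items():
        if not votes:
            continue  # nothing to evaluate for this element
        share_unclear = sum(votes) / len(votes)
        if share_unclear >= threshold:
            flagged[label] = share_unclear
    return flagged


# Hypothetical pilot data from 10 participants (illustration only).
pilot = {
    "instructions": [False] * 10,           # 0% unclear
    "item_1": [True, True] + [False] * 8,   # 20% unclear, must be re-evaluated
    "item_2": [True] + [False] * 9,         # 10% unclear, retained as is
}
flagged = flag_unclear_items(pilot)
print(flagged)
```

Elements that are not flagged have an agreement on clarity above 80%, which corresponds to the minimum inter-rater agreement stated above.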
To further determine the conceptual and content equivalence of
the items of the P-FTL, use of an expert panel is highly recommended.
The instructions, response format and the items of the
instrument are evaluated for conceptual equivalence (clarity) by
six to ten members of an expert panel [20,21] who are knowledge-
able about the content areas of the construct of the instrument
and the target population in which the instrument will be used
and whose mother language is the TL of the instrument. When
possible, a committee of 10 members is preferable [20,21]. Each
member of the committee who rates the instructions, response
format or any item of the instrument as unclear is asked to provide
suggestions as to how to rewrite the statements and make the
language clearer. Instructions, response format and items of
the instrument that are found to be unclear by at least 20% of the
committee members must be revised and re-evaluated [19]. The
minimum inter-rater agreement among the expert panel is 80%.
This process will further determine the conceptual equivalence
of the translated instrument.
The expert panel is then asked to evaluate each item of the
instrument for content equivalence (content-related validity [relevance])
using the following scale: 1 = not relevant; 2 = unable
to assess relevance; 3 = relevant but needs minor alteration;
4 = very relevant and succinct. Items classified as 1 (not relevant)
or 2 (unable to assess relevance) should be revised [20,21].
Content validity index at the item level (I-CVI) and at the scale
level (S-CVI) should be calculated. There are three methods to
calculate S-CVI [20–22], but the averaging calculation (S-CVI/Ave)
method is preferred [22]. Using 10 experts, an I-CVI of
0.78 or above [20] and an S-CVI/Ave of 0.90 or above [21] are
the minimum acceptable indices. Items that do not achieve the
minimum acceptable indices are revised and re-evaluated. New
content validity indices are calculated. The process continues until
acceptable indices of content-related validity or content equiva-
lence are achieved. It is also recommended that the kappa coeffi-
cient of agreement be determined to increase confidence in the
content validity of the instrument [23]. A kappa of 0.60 is generally
the minimum acceptable coefficient to determine good agreement
[24]. The purpose of Step 5 is to continue developing the P-FTL
for pre-field test for preliminary and/or full psychometric testing.
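The indices named above can be illustrated with a short Python sketch. It assumes the usual definitions, which the text does not spell out: I-CVI is the proportion of experts rating an item 3 or 4, and S-CVI/Ave is the mean of the I-CVIs. The kappa shown is one commonly used chance-adjusted form (a "modified kappa" that treats the probability of chance agreement as binomial with p = 0.5); whether this is the exact coefficient intended by [23] is an assumption. All names and ratings are hypothetical.

```python
from math import comb


def item_cvi(ratings):
    """I-CVI: share of experts rating the item 3 or 4 on the 4-point scale."""
    return sum(1 for r in ratings if r >= 3) / len(ratings)


def scale_cvi_ave(panel):
    """S-CVI/Ave: mean of the I-CVIs across all items."""
    cvis = [item_cvi(r) for r in panel.values()]
    return sum(cvis) / len(cvis)


def modified_kappa(ratings):
    """Chance-adjusted I-CVI (assumed formulation, see lead-in)."""
    n = len(ratings)
    agree = sum(1 for r in ratings if r >= 3)
    p_chance = comb(n, agree) * 0.5 ** n   # binomial chance agreement
    return (agree / n - p_chance) / (1 - p_chance)


# Hypothetical relevance ratings from 10 experts (illustration only).
panel = {
    "item_1": [4] * 10,
    "item_2": [4, 4, 3, 4, 3, 4, 4, 3, 2, 1],
    "item_3": [4, 3, 3, 4, 2, 3, 1, 4, 2, 3],
}
i_cvis = {item: item_cvi(r) for item, r in panel.items()}
s_cvi_ave = scale_cvi_ave(panel)
# Cut-offs taken from the text: I-CVI >= 0.78 and S-CVI/Ave >= 0.90.
items_to_revise = [item for item, v in i_cvis.items() if v < 0.78]
print(i_cvis, round(s_cvi_ave, 3), items_to_revise)
```

In this made-up panel only item_3 falls below the item-level cut-off, and the scale-level average is also below 0.90, so the items would be revised and the indices recalculated as described above.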
Key points
1. Pilot test of the P-FTL among individuals whose language is the
TL of the instrument:
• Evaluate the clarity of the instructions, items and response format.
• Use a sample size of 10–40 participants.
2. It is highly recommended to use an expert panel to further
examine the instrument for:
• Clarity of the instructions, items and response format.
• Content equivalence (content-related validity) using I-CVI,
S-CVI/Ave and kappa coefficient of agreement.
• Use a sample of 6–10 experts (10 experts are preferred).
Step 6: preliminary psychometric testing of the
pre-final version of the translated instrument
with a bilingual sample
This step is rarely used; however, when a bilingual population is
accessible, it is recommended that the instrument be pre-field
tested among bilingual individuals (fluent in the SL of the original
instrument and the TL of the translated instrument). If this is not
possible, skip this step and move to Step 7. In general, the recom-
mendation is to use at least five subjects per item of the instrument
to conduct the preliminary psychometric testing of a new instru-
ment [25]. Ideally, the bilingual sample should be from the target
population in which the instrument will be used (e.g. adult indi-
viduals with type 2 diabetes, African–American women with heart
failure). However, in many instances, this may be difficult and
unrealistic; thus other alternatives can be used such as sampling
bilingual college students and faculty or workers in travel agen-
cies, currency exchange agencies, international trade companies,
embassies and consulates, and language schools.
Initially, participants are given the P-FTL and are asked to
answer the items. The participants respond to the items of the
P-FTL without seeing the original instrument in the SL. After
completion of the P-FTL, participants are given the original in-
strument in the SL and are asked to answer the items. They may
complete a demographic questionnaire and/or other instruments of
interest. The order of the items of the original instrument must be
mixed to be in a different order from that of the items of the P-FTL.
Responses on both versions of the instrument are then compared
(i.e. interpretation of scores is the same in both cultures) to estab-
lish criterion equivalency (a type of construct validity). Statistical
analyses used for comparison purposes may consist of descriptive
statistics, correlation coefficients, and paired t-test or one-way
ANOVA. Scale and item analysis is also used to establish the initial
preliminary psychometric properties of the instrument (internal
consistency reliability) and to compare these properties of the
P-FTL with those of the original instrument in the SL. When the
purpose of the instrument is to serve as a diagnostic or screening test, preliminary
calculation of sensitivity and specificity is recommended. This
Step 6 also determines initial technical equivalency (the method
of assessment) and is useful to support the conceptual, semantic,
content and construct validity of the P-FTL prior to conducting full
psychometric field testing.
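As a rough illustration of the comparison described above, the sketch below computes a paired t statistic and a Pearson correlation for total scores that the same bilingual participants obtained on the two versions. It uses only the Python standard library; the function names and the scores are hypothetical, and a p-value would still have to be read from the t distribution (for example with a statistics package such as SciPy's ttest_rel).

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(target_scores, source_scores):
    """Paired t statistic and degrees of freedom for matched total scores."""
    diffs = [t - s for t, s in zip(target_scores, source_scores)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all paired differences are identical")
    n = len(diffs)
    return mean(diffs) / (sd / sqrt(n)), n - 1


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical total scores of six bilingual participants (illustration only).
p_ftl_totals = [12, 8, 15, 10, 9, 14]   # translated version (TL)
sl_totals = [11, 9, 14, 12, 8, 13]      # original version (SL)

t_stat, df = paired_t(p_ftl_totals, sl_totals)
r = pearson_r(p_ftl_totals, sl_totals)
print(round(t_stat, 3), df, round(r, 3))
```

A small t statistic together with a high correlation would be consistent with participants scoring similarly on both versions; the real analysis should of course use the full sample recommended above rather than six cases.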
Key points
1. When possible, pilot test the P-FTL among bilingual individuals
to:
• Compare the P-FTL with the original instrument in the SL.
• Establish criterion equivalency and further support the
conceptual, semantic, content and construct equivalency of the
P-FTL.
2. Use at least five subjects per item of the instrument.
3. Subjects complete the P-FTL first without seeing the original
instrument in the SL.
4. Subjects then complete the original instrument in the SL, in which
the items have been mixed into a different order from that of the P-FTL.
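The requirement to present the SL items in a different order from the P-FTL items can be handled with a small helper such as the one below. This is only a sketch: the function names, the seed and the item labels are hypothetical, and any procedure that records the presented order so that responses can be realigned item by item would serve the same purpose.

```python
import random


def mixed_order(item_ids, seed):
    """Return the item ids in a shuffled order that differs from the input order."""
    rng = random.Random(seed)          # seeded so the order can be reproduced
    order = list(item_ids)
    while len(order) > 1 and order == list(item_ids):
        rng.shuffle(order)             # reshuffle if the order came out unchanged
    return order


def realign(responses, presented_order, reference_order):
    """Put responses collected in `presented_order` back into `reference_order`."""
    by_item = dict(zip(presented_order, responses))
    return [by_item[item] for item in reference_order]


# Hypothetical eight-item instrument (illustration only).
p_ftl_order = [f"item_{i}" for i in range(1, 9)]
sl_order = mixed_order(p_ftl_order, seed=2010)

# Responses to the SL version arrive in the mixed order...
sl_responses = list(range(len(sl_order)))
# ...and are realigned before being compared with the P-FTL responses.
sl_realigned = realign(sl_responses, sl_order, p_ftl_order)
print(sl_order)
```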
Step 7: full psychometric testing of the
pre-final version of the translated instrument
in a sample of the target population
This last step is used to establish the initial full psychometric
properties of the newly translated, adapted and cross-validated
instrument with a sample of the target population of interest. The
sample size for this step depends on the types of psychometric
approaches that will be used. The more complete the psychometric
approaches for evaluation of the translated instrument the more
confidence will be generated in its reliability and validity properties.
In general, as a rule of thumb, it is highly recommended to use
at least 10 subjects per item of the instrument for scale and item
analysis and exploratory factor analysis [25–28]. If there is a plan
to use confirmatory factor analysis to test the factor structure of the
instrument, the recommendation as a rule of thumb is approximately
300–500 subjects [28,29]. Power
analysis based on the number of degrees of freedom, an alpha level
(0.05 or 0.01), and a desired power (80% or above) can also be
calculated [30,31].
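The rules of thumb for sample size in Steps 6 and 7 reduce to simple arithmetic, sketched below. The function and key names are hypothetical, and the 300–500 figure for confirmatory factor analysis is treated as a total sample, as in the key points for Step 7. A formal power analysis would replace these heuristics where one is feasible.

```python
def rule_of_thumb_samples(n_items):
    """Minimum sample sizes implied by the rules of thumb in Steps 6 and 7."""
    return {
        "step6_bilingual_pretest": 5 * n_items,      # at least 5 subjects per item
        "step7_scale_item_and_efa": 10 * n_items,    # at least 10 subjects per item
        "step7_cfa_total_range": (300, 500),         # total subjects, not per item
    }


# Example: an eight-item scale.
sizes = rule_of_thumb_samples(8)
print(sizes)
```

For an eight-item scale this gives at least 40 bilingual participants for Step 6 and at least 80 participants for scale and item analysis and exploratory factor analysis in Step 7.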
The most recommended and commonly used psychometric
approaches in this step are estimation of: (1) internal consistency
reliability (or sensitivity and specificity); (2) stability reliability
(test–retest reliability); (3) homogeneity; (4) construct-related
validity such as convergent and/or divergent (discriminant) valid-
ity; (5) criterion-related validity such as concurrent and/or
predictive validity; (6) factor structure of the instrument (dimensionality);
and (7) model fit. Although it is not the purpose of this
user-friendly guideline to describe the many statistical approaches
that can be used in Step 7, the most common statistical approaches
are scale and item analysis, Pearson’s correlation analysis, explor-
atory factor analysis and confirmatory factor analysis. The purpose
of Step 7 is to revise and refine the items of the P-FTL as
needed to derive the final psychometrically sound FTL consisting
of adequate estimates of reliability, homogeneity, and validity and
with a stable factor structure and/or model fit.
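Internal consistency reliability is most often summarized with Cronbach's alpha, although the text does not name a specific coefficient, so its use here is an assumption. The sketch below computes alpha from a participants-by-items matrix with the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name and the response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-participant response lists."""
    k = len(rows[0])                                  # number of items
    items = list(zip(*rows))                          # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in rows])  # variance of total scores
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical responses of five participants to four items scored 0-5.
responses = [
    [0, 1, 0, 1],
    [2, 2, 1, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 1, 2, 1],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

The tiny data set is for illustration only; with the sample sizes recommended for Step 7, the same calculation would be run on the full response matrix, typically alongside item-total correlations.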
Key points
1. Full psychometric testing of the P-FTL among individuals from
the target population to:
• Revise and refine the items of the final version of the instrument
in the TL.
• Establish internal consistency reliability (or sensitivity and
specificity), stability reliability, homogeneity, construct-related
validity, criterion-related validity, factor structure and model fit
of the instrument.
2. Use at least 10 subjects per item of the instrument for general
psychometric approaches (scale and item analysis, Pearson’s correlations
and exploratory factor analysis).
3. Use 300–500 subjects for confirmatory factor analysis or
conduct a power analysis.
Example of a project to translate, adapt
and validate a research instrument
A project to translate, adapt and cross-validate an instrument for
cross-cultural research may take several years, and it is normally
conducted using more than one study to adhere to the recom-
mended methodological approaches described above. One study
might set as its initial goal to translate, adapt and cross-validate a
research instrument using Steps 1, 2, 3, 4 and 5 only. In a second
study, the researchers might set a single goal to establish the
preliminary psychometrics of the translated instrument with
bilingual participants using Step 6. Then, in a third study, the
researchers’ goal might be to establish the initial full psychometric
properties of a translated instrument in a sample of the target
population of interest using the approaches described in Step 7.
Depending on the psychometric approaches used in this third
study, other studies might be necessary to continue the develop-
ment and psychometric evaluation of the translated instrument. To
illustrate the use of the methodological steps to translate, adapt and
cross-validate an instrument and to evaluate the preliminary and
initial full psychometric properties of an instrument, we present
an example of a project that used two studies to adhere to the
recommended guideline.
Study one: cross-cultural equivalence and
psychometric properties of the Portuguese
version of the depressive cognition scale
In this study [6], the depressive cognition scale (DCS) was trans-
lated from English into Portuguese using Steps 1 and 2, and
back-translated from Portuguese into English and cross-culturally
validated using Steps 3 and 4. Preliminary psychometric evalua-
tion of the scale was conducted with a bilingual sample using Step
6. Note that Step 5, which is important to determine the clarity of the instructions,
response format and sentence structure of the items, was
not done because the committee included three members
whose mother language was Portuguese and who were convinced that
all those aspects of the scale were completely clear. The DCS was
originally developed in English to measure depressive cognitions
[32]. The theoretical basis for the development of the scale was
Beck’s theory of depression and Erikson’s theory of psychosocial
development. The DCS consists of eight items on a 6-point Likert-
type scale ranging from 0 (strongly disagree) to 5 (strongly agree).
Each item of the scale measures a specific cognition: hopeless-
ness, helplessness, purposelessness, powerlessness, worthlessness,
loneliness, emptiness and meaninglessness.
Using Steps 1 through 4, the DCS English version was trans-
lated into Portuguese by two bilingual translators who were fluent
in both English and Portuguese languages to generate two versions
of the translated scale. The Portuguese versions of the scale were
blindly back-translated into English by two different translators
who never saw the original version of the DCS in English. The two
versions of the translated scale were compared with each other,
and each was compared with the original English version of the
scale to determine its conceptual, semantic and content equivalence.
This comparison was made by a panel of three Brazilian bilinguals
and an American monolingual expert in the content area, in
consultation with the translators who participated in the translation
and back-translation of the versions of the scale. Ambiguities and
discrepancies regarding conceptual and semantic equivalence on
two items that measured emptiness and purposelessness were dis-
cussed and resolved by the committee members. A final version of
the translated scale in Portuguese was derived and named ‘Escala
Cognitiva de Depressão (ECD).’
As we stated previously, we skipped Step 5 and proceeded to
Step 6 to evaluate the preliminary psychometrics of the Portuguese
version of the scale (ECD), with a bilingual sample of individuals
from the general population, mostly students and faculty members
of a major Brazilian university. This sample was chosen because
of the anticipated difficulty of approaching and recruiting individuals
from the target population of interest (i.e. individuals with diabetes
mellitus). We used the recommended sample size of five subjects
per item of the scale; therefore, 40 bilingual adults participated in
the study. Data were collected in two steps: participants first
received and responded to the Portuguese version of the scale (ECD)
only, and then answered a demographic
questionnaire and completed the original English version of the
scale (DCS). We used descriptive statistics, scale and item analysis,
and paired t-tests to compare participants’ responses on
both the Portuguese (ECD) and the English (DCS) items of the
scales. Descriptive statistics showed strong similarities in partici-
pants’ responses on both the Portuguese and English versions of
the scale (identical responses ranged from 78% to 98%). Pearson’s
correlation coefficients between each item of the ECD and the corresponding
item of the DCS ranged from 0.64 to 0.96 and were statistically
significant (P<0.001). Estimates of internal consistency reli-
ability and homogeneity were similar in both the ECD and DCS
versions of the scale (Cronbach’s alphas were 0.79 and 0.78,
respectively; item-to-total correlation coefficients ranged from
0.32 to 0.75 on the ECD and from 0.29 to 0.77 on the DCS). There
was no statistically significant difference between the overall mean
score of the ECD (M = 5.05, SD = 3.49) and the overall mean
score of the DCS (M = 5.20, SD = 3.56); t(1,39) = 0.76, P=0.45.
At the item level, a statistically significant difference was found
between the mean score of item 2 on the ECD (M = 0.80,
SD = 0.76) and the mean score of item 2 on the DCS (M = 0.90,
SD = 0.74); t(1,39) = 2.08, P=0.04. The findings of Step 6
suggested adequate criterion equivalence and supported the conceptual,
semantic and content validity of the ECD. The ECD had
adequate internal consistency reliability, homogeneity and initial
support for construct validity. The full psychometric properties of
the ECD could be further tested among individuals from the target
population of interest.
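The Step 6 statistics named above (internal consistency, item-to-total correlations and a paired comparison of the two language versions) can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names are hypothetical, the item-to-total correlation is assumed to be the corrected form (each item against the sum of the remaining items), sample variances are assumed, and the responses in the demonstration are made-up numbers rather than study data.

```python
import math
from statistics import mean, stdev, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cronbach_alpha(rows):
    """Cronbach's alpha; rows[i][j] is respondent i's score on item j."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def corrected_item_total(rows):
    """Correlation of each item with the sum of the remaining items
    (assumption: the 'corrected' item-to-total correlation)."""
    k = len(rows[0])
    return [
        pearson_r([r[j] for r in rows], [sum(r) - r[j] for r in rows])
        for j in range(k)
    ]


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for matched scores."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return mean(d) / (stdev(d) / math.sqrt(n)), n - 1


if __name__ == "__main__":
    # Hypothetical responses (5 respondents x 3 items), NOT study data.
    translated = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [4, 3, 5], [5, 4, 4]]
    original = [[0, 1, 1], [1, 2, 2], [2, 3, 2], [4, 3, 5], [5, 5, 4]]

    print("alpha (translated):", round(cronbach_alpha(translated), 2))
    print("item-total r:", [round(r, 2) for r in corrected_item_total(translated)])

    # Compare total scores of the same respondents on both versions.
    t, df = paired_t([sum(r) for r in translated], [sum(r) for r in original])
    print("paired t:", round(t, 2), "df:", df)
```

The sketch returns only the t statistic and its degrees of freedom; obtaining a P value requires the t distribution, which a statistics package provides (e.g. `scipy.stats.ttest_rel`).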
Study two: psychometric properties of the
Portuguese version of the depressive cognition
scale in Brazilian adults with diabetes mellitus
The second study [33] was conducted to continue the psychometric
evaluation of the ECD using Step 7. The purpose of the
study was to evaluate the full psychometric properties of the ECD
among adults with diabetes mellitus (i.e. target population of inter-
est). We used the recommended sample size of at least 10 subjects
per item of the scale. The sample consisted of 82 Brazilian adults
with diabetes mellitus. The estimate of internal consistency reli-
ability was a Cronbach’s alpha of 0.88. Scale and item analysis
indicated that the inter-item correlation coefficients ranged from
0.37 to 0.69 and the item-to-total correlation coefficients ranged from
0.51 to 0.71. Exploratory factor analysis indicated that the ECD
was unidimensional, explaining 56.73% of its item variance, and
had factor loadings ranging from 0.60 to 0.80 and communality
values ranging from 0.36 to 0.67. In addition, the ECD total score
was statistically significantly correlated with the Portuguese
version of the Beck Depression Inventory (r=0.24, P<0.05).
The study findings further supported the reliability, homogeneity
and construct-related validity of the ECD among Brazilians with
diabetes mellitus.
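The subjects-per-item rule used in both studies is simple arithmetic, and a short sketch makes the check explicit. The function names are hypothetical; the inputs (eight items, five subjects per item for the preliminary testing, at least 10 per item for the full testing, and samples of 40 and 82) are the figures reported above.

```python
def minimum_sample_size(n_items, subjects_per_item):
    """Smallest sample that satisfies a subjects-per-item rule."""
    return n_items * subjects_per_item


def meets_rule(n_participants, n_items, subjects_per_item):
    """True if the achieved sample reaches the recommended minimum."""
    return n_participants >= minimum_sample_size(n_items, subjects_per_item)


DCS_ITEMS = 8  # the DCS/ECD consists of eight items

# Study one: five subjects per item, 40 bilingual participants.
print(minimum_sample_size(DCS_ITEMS, 5), meets_rule(40, DCS_ITEMS, 5))   # 40 True

# Study two: at least 10 subjects per item, 82 participants.
print(minimum_sample_size(DCS_ITEMS, 10), meets_rule(82, DCS_ITEMS, 10))  # 80 True
```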
Conclusions
The diversity of the population worldwide shows a great need for
cross-culturally validated instruments or scales. Translation, adap-
tation and validation of an instrument or scale for cross-cultural
research is time-consuming and requires careful planning and
adoption of rigorous methodological approaches to derive a reli-
able and valid measure of the concept of interest in the target
population. This scholarly paper has reviewed and incorporated
the highly recommended methodological approaches in a clear and
user-friendly guideline. The adoption of symmetrical translation
using a centring process will lead to more accurate adaptation and
cross-cultural validation of a translated instrument. The steps of
the proposed guideline provide clear directions to cross-culturally
validate research instruments or scales. Choosing the right translators
and committee members is a key aspect that must be considered
to enhance the quality of the translation, back-translation and
cross-validation of an instrument or scale. Pilot testing of a trans-
lated instrument or scale among participants whose language is the
TL of the instrument to evaluate the instructions, the items and the
response format of the instrument for clarity also enhances the
quality of the final version of the translated instrument. Finally,
using well-established approaches to test the preliminary psy-
chometric properties of the translated instrument or scale among
bilingual individuals and full psychometrics of the translated
instrument or scale among individuals from the target population
of interest will yield a reliable and valid scale in the TL of
interest.
References
1. Hilton, A. & Skrutkowski, M. (2002) Translating instruments into
other languages: development and testing processes. Cancer Nursing,
25 (1), 1–7.
2. Sperber, A. D. (2004) Translation and validation of study instru-
ments for cross-cultural research. Gastroenterology, 126 (1), S124–
S128.
3. Beaton, D. E., Bombardier, C., Guillemin, F. & Ferraz, M. B. (2000)
Guidelines for the process of cross-cultural adaptation of self-report
measures. Spine, 25 (24), 3186–3191.
4. Beaton, D. E., Bombardier, C., Guillemin, F. & Ferraz, M. B. (2002)
Recommendations for the cross-cultural adaptation of health status
measures. Available at:
http://www.dash.iwh.on.ca/assets/images/pdfs/xculture2002.pdf (last accessed 1 October 2009).
5. Bracken, B. A. & Barona, A. (1991) State of the art procedures for
translating, validating and using psychoeducational tests in cross-
cultural assessment. School Psychology International, 12 (1–2), 119–
132.
6. Sousa, V. D., Zauszniewski, J. A., Mendes, I. A. C. & Zanetti, M. L.
(2005) Cross-cultural equivalence and psychometric properties of
the Portuguese version of the depressive cognition scale. Journal of
Nursing Measurement, 13 (2), 87–99.
7. Brislin, R. W. (1970) Back-translation for cross-cultural research.
Journal of Cross-Cultural Psychology, 1 (3), 185–216.
8. Brislin, R. W. (1986) The wording and translation of research instru-
ments. In Field Methods in Cross-Cultural Research (eds W. J. Lonner
& J. W. Berry), pp. 137–164. Beverly Hills, CA: Sage Publications.
9. Chapman, D. W. & Carter, J. F. (1979) Translation procedures for the
cross cultural use of measurement instruments. Educational Evalua-
tion and Policy Analysis, 1 (3), 71–76.
10. Guillemin, F., Bombardier, C. & Beaton, D. (1993) Cross-cultural
adaptation of health-related quality of life measures: literature review
and proposed guidelines. Journal of Clinical Epidemiology, 46 (12),
1417–1432.
11. Jones, E. (1987) Translation of quantitative measures for use in cross-
cultural research. Nursing Research, 36 (5), 324–327.
12. Jones, E. G. & Kay, M. (1992) Instrumentation in cross-cultural
research. Nursing Research, 41 (3), 186–188.
13. Jones, P. S., Lee, J. W., Phillips, L. R., Zhang, X. E. & Jaceldo, K. B.
(2001) An adaptation of Brislin’s translation model for cross-cultural
research. Nursing Research, 50 (5), 300–304.
14. McDermott, M. A. N. & Palchanes, K. (1994) A literature review of
the critical elements in translation theory. Image: Journal of Nursing
Scholarship, 26 (2), 113–117.
15. Wild, D., Grove, A., Martin, M., Eremenco, S., McElroy, S.,
Verjee-Lorenz, A. & Erikson, P. (2005) Principles of good practice for
the translation and cultural adaptation process for patient-reported
outcomes (PRO) measures: report of the ISPOR task force for trans-
lation and cultural adaptation. Value in Health, 8 (2), 94–104.
16. Maneesriwongul, W. & Dixon, J. K. (2004) Instrument translation
process: a method review. Journal of Advanced Nursing, 48 (2), 175–
186.
17. Beck, C. T., Bernal, H. & Froman, R. D. (2003) Methods to document
semantic equivalence of a translated scale. Research in Nursing and
Health, 26 (1), 64–73.
18. Sousa, V. D., Hartman, S. W., Miller, E. H. & Carroll, M. A. (2009)
New measures of diabetes self-care agency, diabetes self-efficacy, and
diabetes self-management for insulin-treated individuals with type 2
diabetes. Journal of Clinical Nursing, 18 (9), 1305–1312.
19. Topf, M. (1986) Three estimates of interrater reliability for nominal
data. Nursing Research, 35 (4), 253–255.
20. Lynn, M. R. (1986) Determination and quantification of content valid-
ity. Nursing Research, 35 (6), 382–385.
21. Waltz, C. F., Strickland, O. L. & Lenz, E. R. (2005) Measurement in
Nursing and Health Research, 3rd edn. New York: Springer Publishing
Company.
22. Polit, D. F. & Beck, C. T. (2006) The content validity index: are you
sure you know what’s being reported? Critique and recommendations.
Research in Nursing and Health, 29 (5), 489–497.
23. Wynd, C. A., Schmidt, B. & Schaefer, M. A. (2003) Two quantitative
approaches for estimating content validity. Western Journal of
Nursing Research, 25 (5), 508–518.
24. Streiner, D. L. & Norman, G. R. (2008) Health Measurement Scales:
A Practical Guide for Their Development and Use, 4th edn. New York:
Oxford University Press.
25. Nunnally, J. C. & Bernstein, I. H. (1994) Psychometric Theory, 3rd
edn. New York: McGraw-Hill.
26. Hair, J. F. J., Anderson, R. E., Tatham, R. L. & Black, W. C. (1998)
Multivariate Data Analysis, 5th edn. Englewood Cliffs, NJ: Prentice-
Hall, Inc.
27. Stevens, J. (2002) Applied Multivariate Statistics for the Social
Sciences, 4th edn. Mahwah, NJ: Lawrence Erlbaum Associates.
28. Tabachnick, B. G. & Fidell, L. S. (2001) Using Multivariate Statistics,
4th edn. Needham Heights, MA: Allyn & Bacon.
29. Gonzalez, R. & Griffin, D. (2001) Testing parameters in structural
equation modeling: every ‘one’ matters. Psychological Methods, 6 (3),
258–269.
30. MacCallum, R. C., Browne, M. W. & Sugawara, H. M. (1996) Power
analysis and determination of sample size for covariance structure
modeling. Psychological Methods, 1 (2), 130–149.
31. MacCallum, R. C., Browne, M. W. & Cai, L. (2006) Testing differ-
ences between nested covariance structure models: power analysis and
null hypotheses. Psychological Methods, 11 (1), 19–35.
32. Zauszniewski, J. A. (1995) Development and testing of a measure of
depressive cognitions in older adults. Journal of Nursing Measure-
ment, 3 (1), 31–41.
33. Sousa, V. D., Zanetti, M. L., Zauszniewski, J. A., Mendes, I. A. C. &
Daguano, M. O. (2008) Psychometric properties of the Portuguese
version of the depressive cognition scale in Brazilian adults
with diabetes mellitus. Journal of Nursing Measurement, 16 (2), 125–
135.