School Psychology International
2017, Vol. 38(1) 3–21
© The Author(s) 2016
Reprints and permissions:
sagepub.co.uk/journalsPermissions.nav
DOI: 10.1177/0143034316684672
journals.sagepub.com/home/spi
Article
Cultural-linguistic test adaptations: Guidelines for selection, alteration, use, and review
S. Kathleen Krach
Florida State University, Tallahassee, FL, USA
Michael P. McCreery
University of Nevada Las Vegas, Las Vegas, NV, USA
Jessika Guerard
Florida State University, Tallahassee, FL, USA
Abstract
In 1991, Bracken and Barona wrote an article for School Psychology International focusing
on state of the art procedures for translating and using tests across multiple languages.
Considerable progress has been achieved in this area over the 25 years between that
publication and today. This article seeks to provide a more current set of suggestions
for altering tests originally developed for other cultures and/or languages. Beyond
merely describing procedures for linguistic translations, the authors provide suggestions
on how to alter, use, and review tests as part of a cultural-linguistic adaptation process.
These suggestions are described in a step-by-step manner that is usable both by test
adapters and by consumers of adapted tests.
Keywords
multicultural, multilingual, psychometrics, tests, translations
HWÆT, WE GAR-DEna in geardagum,
Þeodcyninga Þrym gefrunon,
hu ða æþelingas ellen fremedon!
(Beowulf, in Old English, Klaeber, 1922)
LO, praise of the prowess of people-kings
of spear-armed Danes, in days long sped,
we have heard, and what honor the athelings won!
(Beowulf, in modern English, Gummere, 1910)

Corresponding author:
S. Kathleen Krach, Florida State University, 1114 West Call Street, Tallahassee, FL 32304, USA.
Email: skrach@fsu.edu
Although both of these samples are in English, one may be unreadable to modern-
day speakers of the language. That is because only one word (‘in’) from the
old-English version is still the same in the modern-day version. So, why are the
two versions (both in English) so different? These changes did not happen quickly,
but came about steadily over time due to several factors. One is that new words are
always coming into use to serve previously unknown purposes. For example, the
2015 Oxford Dictionary’s English word of the year was ‘emoji,’ a word that barely
existed in the lexicon ten years ago, and not at all 50 years ago. In addition, as new
words enter a language, other words exit due to shifts in popularity (e.g., words like
‘boffin’ or ‘bouffant’). Words may be lost because they no longer serve a current
purpose (e.g., ‘45 rpm adapter’ for converting a record player). Thus, it is clear that
a single language can change dramatically from decade to decade, eventually result-
ing in such changes as seen in the examples above.
These types of changes happen not only to the English language, but to other
languages as well. As each language can be presumed to change at a similarly rapid
rate, then attempting to translate between two (or more) languages at a time may
prove to be a difficult challenge. Add to this the possibility of needing translated
versions for more than 7,000 independent languages worldwide (Ethnologue, 2008;
http://www.ethnologue.com/). And these are only simple linguistic concerns.
Sometimes the literal meaning of a word (denotation) and the value accompanying
the word (connotation) may change within a single language, as well as from lan-
guage to language. One example of an English word that has different connotations
depending on context would be ‘feminist’. This is because the value attached to this
word by some individuals is positive, while others view it as negative.
Such language issues are complicated enough when trying to translate a work of
fiction or technical documentation, but they become compounded with test and
measurement issues when trying to make a test developed for a specific cultural-
linguistic group available for a different language and/or culture. One such issue is
the translation of content; one may translate an item from its origin language and
maintain the item’s cultural value, but may lose the value (connotation) of or level
(difficulty) of the construct of interest being measured (Hambleton, 2005; Sattler,
Oades-Sese, Kitzie, & Krach, 2014). For example, if the goal is to measure reading
ability, then translated items should have the same number of phonemes. If they do
not (e.g., as happens when changing the one-syllable English word ‘car’ to the two-
syllable Spanish word ‘coche,’) then this changes the item’s difficulty level. This
same translation may also subtly change the value of the task as well. For example,
the use of the Castillan-derived word ‘coche’ may be seen as more ‘authentic’ by
native Spanish speakers than the use of the English-influenced word ‘carro’ or the
Germanic-influenced version ‘auto’ (Roggen, 2014).
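To make the difficulty concern concrete, the following minimal sketch (our illustration, not a procedure from the literature cited here) flags translated word pairs whose rough syllable counts differ. The naive vowel-group counter and the word list are assumptions for demonstration, not a validated phonological analysis.

```python
import re

def estimate_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouyáéíóú]+", word.lower())))

# Toy origin/target pairs (illustrative only)
item_pairs = [("car", "coche"), ("dog", "perro"), ("sun", "sol")]

for origin, target in item_pairs:
    o, t = estimate_syllables(origin), estimate_syllables(target)
    if o != t:
        print(f"Review '{origin}' -> '{target}': {o} vs. {t} syllables")
```

A mismatch flagged by such a screen would prompt the kind of substitution discussed above (e.g., choosing a different concrete noun rather than the literal translation).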
Value differences may also make it difficult to translate the actual meaning of
certain words. For example, translating a rating scale item of ‘I feel blue’ from
English does not allow for a direct word-for-word translation into Chinese.
Instead, the item has to be translated as ‘I feel sad’, which is similar to the same
construct but does not hold the same value (Ren, Amick, Zhou, & Gandek, 1998).
And, finally, certain activities are valued differently for different groups. For exam-
ple, a rating-scale item stating, ‘My child prepares a simple meal’ may produce
dramatically different answers depending on the cultural group to whom it is
administered. This is because, in some countries, children start preparing food at
an earlier age than others, thereby changing the age-expectation value of the task.
In other countries, only women prepare the meals, thereby changing the gender-
expectation of the task. Finally, the idea that preparing food is a solitary experience
(and not performed as a group or family) can also be culturally loaded.
Given all of these considerations, the term ‘test translation’ is too narrow a
description for the modern process. Instead, the term ‘test adaptation’ is more
appropriate. Test adaptation includes all considerations of the cultural-linguistic
transfer of a test from one group to another instead of the singular aspect of simple
translation (Hambleton, 2005). The purpose of this article is not to dissuade the
reader from using tests designed for other populations. Instead, it is to provide
guidance in the daunting task of making decisions regarding test adaptation. Thus,
this article attempts to meet two goals. The first is to serve as a primer for test
adapters to consider when using a test originally designed for a different culture/
language. The second is to provide test users assistance in selecting, using, and inter-
preting adapted tests when necessary.
Step one: Familiarizing yourself with recommendations from the field
Bracken and Barona (1991) published one of the first multi-step guides for properly
translating tests. One difference between their work and the current article was that
the procedures they recommended were for translating tests largely for intrana-
tional use, not international. Specifically, they discussed translating tests to be used
within the United States with speakers of languages other than English. This
population is different from an international audience in that, although hundreds
of languages are spoken in the United States, the assumed goal of those who speak
these languages is to eventually speak English. This is not true of speakers of
languages other than English who live in their native countries. This difference is
subtle but important. The most important aspect of this difference is not linguistic
but cultural. Specifically, when test items are written and published for use in the
United States, these items address the extent to which the individuals tested
conform to US cultural norms. For these reasons, their work focused on the
linguistic translation and not the cultural-linguistic adaptation that encompasses
the current work.
In Bracken and Barona (1991), their first step was to enlist a bilingual individual
familiar with the test to conduct a word-for-word (or meaning-for-meaning)
translation from the origin test to the target test. The second step consisted of
having someone who had never seen the origin test perform a back-translation
from the target version to the origin language. These steps were repeated as
often as necessary until only a minimal gap between the origin test and its back-
translation existed. The third step was to have a multinational or multiregional
bilingual committee review both versions to ensure the target test met the require-
ments for content and construct similarity (Bracken & Barona, 1991).
Upon approval of the target test by the committee, the next step was to conduct
pilot testing in which the target instrument was administered to individuals who
shared the same language but hailed from different cultural, social, and economic
backgrounds. Upon completion of pilot testing and related changes, field-testing
was completed to evaluate psychometric issues such as reliability, validity, and
normative data collection (Bracken & Barona, 1991). The last step focused on
test version equivalency using statistical techniques such as factor analyses and
multi-trait-multi-method analyses (Bracken & Barona, 1991).
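As an illustration of the translate/back-translate loop, the sketch below (our assumption, not Bracken and Barona's actual procedure) quantifies the 'gap' between each origin item and its back-translation with a simple string-similarity ratio and flags items for another cycle. String similarity is only a crude proxy for meaning, so a real review would still rely on bilingual judges.

```python
from difflib import SequenceMatcher

THRESHOLD = 0.85  # illustrative cutoff; a real study would justify this value

def gap_score(origin: str, back_translation: str) -> float:
    """Similarity in [0, 1]; 1.0 means an identical round trip."""
    return SequenceMatcher(None, origin.lower(), back_translation.lower()).ratio()

items = [  # toy (origin, back-translated) pairs
    ("I feel blue", "I feel sad"),
    ("My child prepares a simple meal", "My child makes a simple meal"),
]

for origin, back in items:
    score = gap_score(origin, back)
    status = "OK" if score >= THRESHOLD else "RETRANSLATE"
    print(f"{status}  {score:.2f}  {origin!r} vs. {back!r}")
```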
Following the work of Bracken and Barona (1991), other authors (Butcher,
1996; Geisinger, 1994) built similar sets of guidelines for test translation.
Figure 1 provides a unified combination of these steps. Following their combined
work, many professional organizations began to establish definitive rules and ethi-
cal restrictions for test adaptation. For example, in 1999, the International Test
Commission (ITC) developed its testing regulations, as did a joint partnership of
the American Educational Research Association (AERA), American Psychological
Association (APA), and National Council on Measurement in Education (NCME).
In 2001, the US Office of Minority Health also added standards for providing
competent services to culturally and linguistically diverse clients. More recently,
the ITC (2005) updated its standards, as did the National Association of School
Psychologists (NASP, 2010) and the International School Psychology Association
(ISPA, 2011). Table 1 provides a list of the guidelines from each of these organiza-
tions specific to test interpretation. Please note that these guidelines focus mostly
on the following concepts: 1) who can translate and administer the tests (e.g.,
fluency, testing knowledge, cultural competency) and 2) what type of tests to use
(e.g., culturally appropriate, psychometrically sound, etc.). There is an additional emphasis that the methods used for translating an instrument should be documented and should be evidence-based as well as culturally and linguistically appropriate. Although these guidelines are helpful, the organizations provide no specific information on what these evidence-based practices are, nor on what constitutes culturally and linguistically appropriate approaches.
Gudmundsson (2009) provides one of the most recent and comprehensive sets of
guidelines for adapting tests. He laid out an eight-step procedure for both test
translation and adaptation. The first step was selecting an instrument for transla-
tion and adaptation, with an emphasis on instruments that are psychometrically
sound. The second step emphasizes the need to select skilled translators who are
fluent in both the language of, and knowledge about, the origin test. The third step
requires the selection of subject-area experts to ensure that the construct remains
sound across both versions. The fourth step emphasizes the need to be thoughtful
in the method of adaptation (word-for-word, meaning-for-meaning, construct-for-
construct), with a goal of decreasing overall bias in the target test. The fifth step
calls for the use of the selected method of adaptation; the sixth focuses on reducing
bias in the target version. The seventh step includes a pilot study with a focus on
item analysis and administration techniques. The final step emphasizes psycho-
metrics such as reliability, validity, and equivalency. Although Gudmundsson’s
(2009) work is well-considered and follows the ethical and legal structures set
forth by the professional organizations, it is non-specific. The methods described
in the rest of this article are designed to provide more specifics on the process by
integrating previous research as well as ethical best practices.
Figure 1. Steps to test translation
Note: This figure was adapted from Bracken and Barona (1991), Butcher (1996), and Geisinger (1994).
Table 1. Organizational guidelines for translators and translating tests

AERA, APA, NCME (1999): Interpreters used in assessments should
  - be experts in translating;
  - be fluent in the original and target language;
  - have a basic understanding of the assessment process.

APA (2010): Psychologists
  - take into account culture and language in test interpretation;
  - ensure consent for testing is without linguistic or cultural bias;
  - are knowledgeable about cultural or linguistic differences.

NASP (2011): School psychologists should
  - ensure that consent for testing is understandable, taking into consideration the language and culture of the client;
  - practice in a non-discriminatory manner regarding individuals who are linguistically different;
  - conduct fair assessments taking into consideration culture and language.

ISPA (2011): Use these steps when choosing someone to work with linguistically diverse clients:
  - first, identify a school psychologist who speaks the language;
  - next, use a knowledgeable colleague who speaks the language;
  - finally, bring in a properly prepared translator.
School psychologists are responsible for ensuring that the translator
  - is prepared;
  - translates with accuracy;
  - maintains client confidentiality.

ITC (2005): Ensure score equivalence across culturally and linguistically diverse groups by having test developers
  - verify cultural and linguistic differences when adapting a test;
  - write test materials (e.g., handbooks, directions, etc.) to include any language issues related to the intended population;
  - ensure that test procedures used are familiar to all populations;
  - present evidence (including statistical evidence) that ensures and documents equivalency across all language versions;
  - consider content validity, ensuring items meet standards for cultural/linguistic equivalency;
  - offer test instructions in the original and translated languages;
  - document any changes from one translated version to another (including evidence of equivalence and validation);
  - provide information on the influence of socio-cultural and ecological context when interpreting scores.

Notes: AERA = American Educational Research Association; APA = American Psychological Association; ITC = International Test Commission; ISPA = International School Psychology Association; NASP = National Association of School Psychologists; NCME = National Council on Measurement in Education.
Note: This table was copied with permission from the Trainers of School Psychology Forum (TSP, 2015).
Step two: Determining rationale for testing
Although the first step in other works is test selection (Bracken & Barona, 1991; Butcher, 1996; Geisinger, 1994), it is not even the second step in the current model, which focuses more on determining why and what the practitioner would like to test rather than on a specific instrument. This model uses the work of Kunnan (2005), who developed a Test Context Framework (Figure 2) as the basis
for all test adaptation decisions. According to this framework, every test is
embedded in a system comprising laws and ethics governing its use; political and
economic reasons for its development; technological constraints and affordances;
and educational, social, and cultural constructs that guide its purpose. Without considering the reason for adapting a test using a framework like Kunnan's (2005), all other decisions are stalled.
The following demonstrates how a single test can be embedded across several
aspects of this framework. In this example, an achievement test is administered to a
child in primary school. Ethics and laws govern the manner in which the test is
administered. The specific items on the test are determined by politicians’ decisions
on the academic standards for that child’s level. The child’s performance may
influence how much money the school receives. The test may be administered by
a computer, but only if the school has the infrastructure to support such
technology.
Figure 2. Test context framework
Note: This figure was copied with permission from Kunnan (2005), p. 241.

Finally, there may be test-taker issues that influence test administration and interpretation (Sattler et al., 2014). Amongst these considerations is the possibility that the child may struggle with the test because she is from a cultural group where
academic support is not offered in the home. On the other hand, she may have
social pressure to be successful on the test, so many supports are provided by the
school and family (van de Vijver & Poortinga, 1997). Also of note is that the very act of
having a school psychologist administer a test may create cross-cultural barriers.
For example, if children are raised not to look adults or members of the opposite
sex in the eye, that may influence test performance. In addition, test-taker interac-
tions with specific test tasks may be quite different from person to person.
A specific example would be adapting a timed-test for use within a culture that
emphasizes accuracy over speed, thereby decreasing the validity of the instrument.
There are some simple guidelines for determining the need to create a new test
adaptation. First, it is important to make sure that the publisher does not already
have a translated version available. This may seem intuitive, but finding a pub-
lished, adapted test, is not as easy as entering the name of a test into a search engine
and getting a list of all versions available. Search engines modify searches based on
the country you are searching from, so that in the US, your search engine might
default to ‘publisher.com.’ Many publishers do not list all of their possible transla-
tions on their ‘publisher.com’ website. Instead, in the US, the publisher may only
list the US English and Spanish versions. Thus, if you want a version for a different
language, you must navigate to that language’s native country’s version of the
website; for example, if there is an Australian version of the test, it might be listed on ‘publisher.com.au’.
By surveying publishers directly, Krach, Doss, and McCreery (2015) found that,
of 45 social/emotional/behavioral tests examined, there were more than 143 pub-
lished versions available. While a few tests only had an English version, some had
as many as 30 different translated or adapted versions available. It should be noted
that, although some of the different published versions of the test were full adapta-
tions, many were only simple translations. For example, in the Krach and collea-
gues (2015) study, 26 of the 42 tests had versions in languages other than English,
but these provided only a translation of the test with no new normative data. Thus,
just because a publisher provides a translated version does not mean that the test
was adapted. In such cases, it may be necessary to use information from both
Figure 1 and Table 1 to review what has been done before determining whether
the test adaptation is sufficient for your needs.
If a publisher’s translated/adapted version of a test is not sufficient for your
needs, please note that better instruments for your target population than the one
you have chosen as your origin test may be available. Do not ignore tests that were
originally developed for the target language or cultural group of interest. Tests
developed in your client’s country of origin should be seriously considered before
trying to adapt a test on your own. However, though published tests exist for many
cultural-linguistic groups, the precise one you need may not yet have been adapted
or even written, given the 7000 languages spoken on this planet. It is not financially
practical for publishers to undergo all of the formal steps of a proper cultural-
linguistic test adaptation for all language options. If this is the case for you, then
you might choose to adapt a test on your own; however, it should be noted that
adapting a test should be your final option.
If you must adapt a test on your own, all components of Kunnan's (2005)
framework must be considered. Out of these, the two of utmost importance are:
1) for what reason is a specific test selected for adaptation, and 2) in what ways does
the reason influence the adaptation process. For example, if a test is selected for ‘high
stakes’ reasons (e.g., holding a child back a level, teacher pay, making a diagnosis,
etc.), then any test adaptation should undergo the most rigorous requirements pos-
sible in the adaptation process. This more stringent approach to adaptation would
include teams of experts working on the translation, determining equivalency
between the origin and target versions and collecting normative data for compar-
isons. However, if the stakes are lower (e.g., monitoring progress, screening, etc.),
then a less formal adaptation process involving translation, collecting data, and
determining equivalency may be acceptable. However, the amount of data needed
and the methods used to determine equivalency might be less burdensome.
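The stakes-based decision described above can be summarized in a short sketch. The step lists are our illustrative condensation of Figure 1, not an official checklist.

```python
REQUIRED_STEPS = {
    "high": [  # e.g., diagnosis, grade retention, teacher pay
        "expert translation team", "back-translation", "expert panel review",
        "pilot study", "field testing", "equivalence analyses", "new norms",
    ],
    "low": [   # e.g., screening, progress monitoring
        "qualified translator", "back-translation", "independent review",
        "small pilot", "basic reliability and equivalence checks",
    ],
}

def adaptation_plan(stakes):
    """Return the minimum set of adaptation steps for the given stakes."""
    return REQUIRED_STEPS[stakes]

print(adaptation_plan("high"))
```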
In the next two steps, the article will discuss the recommended professional
practice in test adaptation, as well as ideal (and less than ideal) adaptation
procedures.
Step three: Appropriate adaptation procedures
This adaptation procedures section is divided into two sets of guidelines for adapt-
ing tests. The first includes a list of ideal guidelines and the second includes a list of
less than ideal guidelines. Please note, even if you choose to use the less than ideal
guidelines as your standard, you should still strive to reach the goals listed in the
ideal version.
Ideal adaptation procedures. There is no mystery surrounding how to culturally and
linguistically adapt tests. The list provided in Figure 1 is an excellent place to start.
The following describes each of these steps in more detail, adding information that
has been updated since the authors cited in Table 1 first developed their plans.
Choose origin test. According to the first step, it is vital to choose the appro-
priate origin test from the beginning. Arguably, the best way to do this would be to
build all targeted cultural-linguistic variations of an origin test concurrently with
the development of the original instrument (Solano-Flores, Trumbull, & Nelson-
Barber, 2002). This would ensure that all cultural and linguistic issues are
addressed prior to development, and no version would be a lesser version. All
items included originally would have a reduction in bias; therefore, no additions,
subtractions, or item alterations would be needed to ensure cultural equivalency.
It should be noted that no test will be completely free of bias; instead, the goal
should be to reduce bias as much as possible.
When concurrent test development is not feasible, then the origin test must be
chosen carefully to ensure that it can be altered to measure a similar construct
in the target version; additional considerations include the need to ensure that the
origin test is psychometrically sound and uses simple items with basic instructions
(Bracken & Barona, 1991). Sometimes, the selection of the origin test is limited by
variables such as 1) the dominance of the test (e.g., the Wechsler scales may be
chosen because of popularity), 2) the type of data required (e.g., special education
law or the diagnostic manual), or 3) a lack of other assessments that measure the
same construct. When the origin test options are narrowed, then it is vitally impor-
tant that all remaining adaptation steps are followed with fidelity.
Translate the test. Next, the initial translation begins. It is
at this earliest stage that the focus should be on bias (Brislin, 1980). Each of the
three types of bias comes from a different source; Van de Vijver and Poortinga
(1997) provide a helpful table that is reproduced in Table 2.
Table 2. Overview of kinds of bias and their possible causes

Construct bias:
  - Incomplete overlap of definitions of the construct across cultures
  - Differential appropriateness of item content (e.g., skills do not belong to the repertoire of either cultural group)
  - Poor sampling of all relevant behaviors (e.g., short instruments covering broad constructs)
  - Incomplete coverage of the psychological construct

Method bias:
  - Differential social desirability
  - Differential response styles such as extremity scoring and acquiescence
  - Differential stimulus familiarity
  - Lack of comparability of samples (e.g., differences in educational background, age, or gender composition)
  - Differences in physical testing conditions
  - Differential familiarity with response procedures
  - Tester effects
  - Communication problems between subject and tester in either cultural group

Item bias:
  - Poor item translation
  - Inadequate item formulation (e.g., complex wording)
  - One or a few items may invoke additional traits or abilities
  - Incidental differences in appropriateness of the item content (e.g., topic of an educational test item not in the curriculum in one cultural group)

Note: Copied with permission from van de Vijver, F. J., & Poortinga, Y. H. (1997). Towards an integrated analysis of bias in cross-cultural assessment. European Journal of Psychological Assessment, 13(1), p. 34.

At the simplest level, one may assume that there is limited bias in the construct measured (e.g., belief that phonemic awareness is a universal construct) and/or assume there is limited bias in the method of assessment (e.g., assume rating scales are a universal method of collecting data). However, such assumptions must
always be confirmed empirically. If they are not, then results from the instrument
cannot be interpreted with fidelity. In the phonemic awareness example used here,
the assumption of a lack of construct bias would be erroneous because not all
languages use the same phonemes. The same holds for the method-bias assumption about rating scales: not all cultural groups respond to rating scales in the same way (Hui & Triandis, 1989).
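As a concrete example of screening for the method bias just described, the sketch below (our illustration, not a procedure from Hui and Triandis) compares extreme-response rates on a 1-5 rating scale across two groups; the response arrays are toy data.

```python
import numpy as np

def extremity_rate(responses: np.ndarray, low: int = 1, high: int = 5) -> float:
    """Proportion of responses at either endpoint of the scale."""
    return float(np.mean((responses == low) | (responses == high)))

group_a = np.array([1, 5, 5, 3, 1, 5, 2, 5, 1, 4])  # toy data
group_b = np.array([3, 3, 2, 4, 3, 2, 3, 4, 3, 3])

print(f"Group A extremity rate: {extremity_rate(group_a):.2f}")
print(f"Group B extremity rate: {extremity_rate(group_b):.2f}")
# A large gap suggests differential response style (method bias, Table 2)
# and calls for formal follow-up analyses before scores are compared.
```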
Once it is determined that neither construct nor method bias is an issue, the test
adapter should focus on item bias (Van de Vijver & Leung, 1996). Item bias occurs
when there is either a poor translation of the item or when the item may have
different cultural-specific interpretations. However, empirical investigations have
shown that when test developers focus on both construct and how that construct
manifests across cultural groups, they are able to minimize item bias (Byrne & Van
de Vijver, 2010).
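One widely used screen for item bias is differential item functioning (DIF) analysis. The sketch below implements a Mantel-Haenszel common odds ratio, stratifying examinees on total score. The data are fabricated toy values, and this is a standard technique offered as an illustration, not a method prescribed by the authors cited here.

```python
import numpy as np

def mantel_haenszel_or(correct, group, total_score):
    """correct: 0/1 item scores; group: 0 = reference, 1 = focal;
    total_score: matching variable. Returns the MH common odds ratio."""
    correct, group, total_score = map(np.asarray, (correct, group, total_score))
    num = den = 0.0
    for s in np.unique(total_score):
        m = total_score == s
        a = np.sum(m & (group == 0) & (correct == 1))  # reference, right
        b = np.sum(m & (group == 0) & (correct == 0))  # reference, wrong
        c = np.sum(m & (group == 1) & (correct == 1))  # focal, right
        d = np.sum(m & (group == 1) & (correct == 0))  # focal, wrong
        t = a + b + c + d
        if t > 0:
            num += a * d / t
            den += b * c / t
    return num / den  # values near 1.0 suggest little DIF on this item

# Toy data: 12 examinees answering one item
correct     = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1]
group       = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
total_score = [5, 3, 5, 4, 3, 4, 5, 3, 5, 4, 3, 4]
print(f"MH odds ratio: {mantel_haenszel_or(correct, group, total_score):.2f}")
```

An odds ratio well above or below 1.0 indicates that, at matched ability, the item favors one group, flagging it for the review steps below.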
The goal in the earliest steps is not to determine bias but to prevent it. In the
later steps of review and validation, the researchers will run analyses to examine
each source of bias. In this earliest step, the test adapter is simply determining
whether it is even feasible to make a version of an instrument for a new cultural-
linguistic group, or if the test will be too biased for an adaptation to be possible.
When it appears that bias will keep a literal translation from being possible, there
are only two choices: adapt only parts of the instrument, or create an entirely new
instrument (Van de Vijver & Hambleton, 1996).
Review. If either a part or the whole of a test can be adapted without bias, then
one may move forward with a cultural-linguistic translation. After that, it is vital
that the adapted instrument be reviewed prior to moving forward with any addi-
tional steps. One common method of review is through a process called back-
translation. Brislin (1980) describes translation as the process of rendering a document from one language (L1) into another (L2), and back-translation as the process of translating it back again (L2 to L1). In addition to back-translation,
the review step provides the opportunity for a qualitative analysis of the test items.
A panel of experts should provide a comprehensive review of the translated items
that are problematic in terms of either cultural or linguistic equivalency (Geisinger,
1994). Members of the panel should review the information separately and come
together later to make joint recommendations.
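One way to aggregate such independent panel ratings is an item-level content validity index (I-CVI): the share of panelists rating an item as culturally and linguistically equivalent. The sketch below is our illustration under that assumption; the ratings are toy data, and the 0.78 cutoff is a commonly used criterion rather than one taken from this article.

```python
ratings = {  # item -> one 1-4 equivalence rating per panelist (toy data)
    "item_01": [4, 4, 3, 4, 3],
    "item_02": [2, 3, 1, 2, 2],  # likely flagged for revision
    "item_03": [4, 3, 4, 4, 4],
}

for item, scores in ratings.items():
    # I-CVI: proportion of panelists rating the item 3 or 4
    i_cvi = sum(r >= 3 for r in scores) / len(scores)
    flag = "" if i_cvi >= 0.78 else "  <- revise"
    print(f"{item}: I-CVI = {i_cvi:.2f}{flag}")
```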
Alterations. The next step is to make appropriate alterations to the translated
test as needed. The test developer starts by making changes based on the panel’s
recommendations. The revised version of the materials then goes back through the
panel review process repeatedly until all concerns are addressed. After the panel
clears the work, pilot testing is needed. The goal of pilot testing is to understand
how the translated test works in a real-world setting. Van Teijlingen and Hundley
(2002) provide a good outline on the steps of pilot testing, writing that pilot ver-
sions of a test should be administered in the same manner as the final version.
However, afterwards, test-takers should provide feedback on item difficulty,
clarity, and confusion, as well as answer questions about the testing process and
expectations. Following the collection of pilot data, the test adapters should make
changes that reflect the feedback from the pilot tests and address any bias or
psychometric concerns that arise from the collected data.
There are several possible techniques for assessing the pilot data for bias. One method is to determine equivalence across both the construct and the method of testing; these two forms are referred to as structural equivalence and measurement equivalence (Byrne & van De Vijver, 2010). To have structural equivalence, the
meaning of the construct measured must be independent from both cultures (Van
de Vijver & Tanzer, 2004), and all of the construct’s facets, dimensions, or under-
lying structures are presumed to be equal across cultures (Byrne & van De Vijver,
2010). Structural equivalence is mostly a theoretical concept that is difficult to
quantify.
So, despite the importance of structural equivalence, overall test equivalence is
traditionally determined through measurement equivalence, which provides an
empirical method to analyse construct structures across cultures (Byrne & van
De Vijver, 2010). Specifically, measurement equivalence is the evaluation of factor-
ial structure and loading (e.g., do items that load on one factor for the English
version still load on that factor for the Spanish version) as well as item content and
mean equivalency (i.e., do the means and standard deviations for each item equate
across cultural-linguistic groups, or does one group score higher or lower).
For more information on the specific statistical procedures for each of these techniques, see Kankaraš and Moors (2010). It is important to note that measurement
equivalence implies that different cultures are measuring the same construct. In
other words, one may achieve structural equivalence but not measurement equiva-
lence, but one must achieve structural equivalence to achieve measurement
equivalence.
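As one concrete piece of this equivalence work, the sketch below compares factor loadings estimated separately in each cultural group using Tucker's congruence coefficient. This is an illustrative assumption on our part; full measurement-equivalence studies would instead fit multi-group confirmatory factor models (see Kankaraš & Moors, 2010). The loading vectors are fabricated.

```python
import numpy as np

def tucker_phi(a: np.ndarray, b: np.ndarray) -> float:
    """Congruence between two loading vectors; values around 0.95 or
    higher are often read as factor similarity across groups."""
    return float(np.dot(a, b) / np.sqrt(np.dot(a, a) * np.dot(b, b)))

loadings_english = np.array([0.71, 0.64, 0.58, 0.69, 0.62])  # toy loadings
loadings_spanish = np.array([0.68, 0.60, 0.55, 0.72, 0.59])

print(f"Tucker's phi: {tucker_phi(loadings_english, loadings_spanish):.3f}")
```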
Validation and publication. It is the responsibility of the developer to provide
documentation of all steps taken to ensure the instrument is psychometrically
sound. As part of this, data from any pilot, bias, or validation studies should be
provided so that users can interpret the appropriate use of the instrument for
themselves (American Educational Research Association et al., 1999). In the case
of cultural-linguistic test adaptation, it is the developers’ responsibility to also
disclose the translation and review procedures in addition to any validation or
equivalency studies conducted as part of the adaptation process (Geisinger,
1994). Specifically, test publishers must disclose attempts to overcome the different
types of bias embedded in the adaptation process (see Table 2 for specific sources of
bias). A reviewer would peruse all of the potential types of bias described in Table 2 and expect to find corresponding information in the manual documenting how the test developers considered, overcame, or worked around each type of bias.
Administration. The final step is to oversee initial administrations of the test.
This means that the test adapter is responsible for determining the level of training
needed to administer the test, setting up opportunities for training, and providing
a method for users in the field to offer feedback on the test after publication
(Geisinger, 1994). In addition, the adapter is responsible for disseminating updates
to the adapted tests to users in the field as new information becomes available. Finally, the adapter must consider the need to update the test periodically to accommodate problems associated with societal changes (Van der Velde, Feij, & van Emmerik, 1998) and the Flynn effect (Flynn, 1998).
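For a sense of scale, the Flynn effect is often summarized as roughly 0.3 IQ points of gain per year, so a test normed years earlier will tend to overstate current standing. The calculation below is a back-of-the-envelope illustration; the rate and the dates are assumptions, and the actual rate varies by test and country.

```python
FLYNN_RATE = 0.3  # commonly cited points per year; varies by test and country

def expected_inflation(norm_year: int, test_year: int,
                       rate: float = FLYNN_RATE) -> float:
    """Approximate score inflation attributable to outdated norms."""
    return rate * (test_year - norm_year)

print(f"Norms from 2000, tested in 2016: ~{expected_inflation(2000, 2016):.1f} points")
```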
Less than ideal adaptation procedures. The previous section on the ideal methods for
adapting a test works well for professional publishers who have the time and
resources to complete all of the steps. However, individuals who need a translation
of a test that is unavailable or nonexistent may fall back upon a less than ideal
method of test adaptation. Just as with the ideal method of adapting an instrument,
Figure 1 should be your guide.
Choose origin test. As before, the process starts by choosing the test you wish to
adapt. This step is virtually unchanged from the ideal version. However, you do not
have the freedom of choice that the origin test publisher may have unless the
copyright of the origin test allows for adaptation. If the test copyright does not
allow for adaptation, you must seek permission from the publisher before starting
the process.
Translate the test and review. Once permission is granted, it is time to start the
translation itself. The translation (and back-translation) of the instrument remains
the same (Brislin, 1980), as does the need to consider bias (construct, method, and
item). Although having a comprehensive panel to help with this process may be
impossible and impractical, it is recommended that the translation/back-transla-
tion be conducted by two different people. You should have a third person involved
with the review process. Each of these individuals should be knowledgeable about
both languages and cultures (APA, 2010; NASP, 2010). The final reviewer should
work independently from anyone else involved in this process.
Alterations and validation. Alterations should be made based upon the final
reviewer’s comments. This may include adding, removing, or substituting items.
Validation procedures for the less than ideal adaptation are the biggest area of
difference between the two methods. For instruments adapted using this method,
normative data from the original version can no longer be used (Bracken & Barona,
1991; Rhodes, Ochoa, & Ortiz, 2005). Because there is no need to worry about the
normative effect, changing items is not problematic. However, if the test user is going
to use a raw score-to-raw score comparison, the user must either administer all items
(no basals/ceilings) or make any changes to both versions to ensure accuracy.
For example, in the ‘car/coche’ translation described above, the problem is that ‘car’
has three phonemes and ‘coche’ has four. The adapter’s choice is either to do a literal translation using a word pair that has three phonemes in each language, or to change ‘coche’ to a different concrete common noun that has three phonemes.
As part of the validation process, the test adapter will still want to create a pilot
version of this test. Preferably, they would do this with at least five test takers who
are fluent in both languages. The pilot group should not include any referred
individuals who need the test, because the test has not been validated and should
not be used for diagnostic purposes. After the pilot test, changes should be made as
needed and reliability data should be analysed.
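At this stage, even a small pilot permits a basic internal-consistency check. The sketch below computes Cronbach's alpha on a toy examinees-by-items matrix; with only five test takers the estimate would be unstable, so it should be read as a screening value, not as evidence of reliability.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """scores: examinees x items matrix of item scores."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

pilot = np.array([  # 5 bilingual pilot takers, 4 items (toy data)
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [3, 3, 2, 3],
    [5, 4, 5, 4],
])
print(f"Cronbach's alpha: {cronbach_alpha(pilot):.2f}")
```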
Publication and administration. The adapted version of an instrument cannot be
distributed without the publisher’s permission. Even though the test adapter will
not be selling or distributing the test, the individual should write down all of the
steps used in the adaptation process. This information is needed for future users, as
well as for the adapter’s own use, in case there is a need to defend the findings from
the adapted test in a court of law. There is another copyright consideration in the
use of an adapted version of a test. The user may still be responsible for purchasing
the blank record form of the origin version to accompany the target version; the
two should be stored together. This is because the right to use still belongs to the
original publisher. Finally, the test adapter must ensure that the person adminis-
tering the test is both fluent in the target language and knowledgeable about testing
procedures (Hallberg, 1996; Rhodes et al., 2005).
Even less than ideal version. There is no ‘even less than ideal version’ that you can use
in this process. For example, you should not be conducting ‘ad-hoc’ translations as
part of your practice. Ad-hoc testing does not follow any of the guidelines listed in
Figure 1, nor does it address any of the sources of bias found in Table 2. Thus, any
test findings cannot be considered psychometrically or theoretically sound, and are
not fit for use. Although the desire to use a shortcut may be strong, doing so is strenuously discouraged.
Unfortunately, the use of ad-hoc translations is a common procedure among
school psychologists in the US (about 50% say they have done it; Ochoa, Riccio,
Jimenez, de Alba, & Sines, 2004). Compounding the problem is that most school
psychologists have not been trained to identify and recruit appropriate translators
(Ochoa, Gonzalez, Galarza, & Guillemard, 1996). Inappropriate translators
include individuals such as secretaries and janitors (Paone, Malott, & Maddux,
2010), the referred child or an older sibling (García-Sánchez, Orellana, & Hopkins, 2011; Tse, 1995), friends of the child or family (Lynch & Hanson, 1996), or foreign-language teachers at the school (Swender, 2003). Appropriate translators should be individuals who are trained for this specific task and are thoroughly aware of the national standard for translator ethics (e.g., the American Translators Association Code of Ethics, n.d.). These individuals should be certified and/or
licensed in the field of translation services (if applicable in your area).
Step four: Drawing conclusions from adapted test data
Once a test has been adapted, an interpretation plan must be established. When
interpreting results from any tests, it is vital to consider corroborating data in
decision-making (NASP, 2010). For example, if a teacher reports that a child is
performing well in school, but a test score says otherwise, then it is vital to inves-
tigate further to determine why there is a discrepancy between the two sources
(Sattler, 2008). This is, and should be, common practice. Please note that in this
practice, each piece of data may not hold as much weight as every other piece in the
decision-making process. For example, scores from a psychometrically sound
instrument might be weighed more heavily than interview data from a novice
teacher; however, interview data from an experienced teacher should be more
strongly considered than data from a poorly adapted test.
The most important consideration for interpreting a culturally-linguistically
adapted test is ensuring that you get the weightings correct for a given piece of
data. These weightings are based purely on clinical judgment, with a primary
concern that the further away from the ‘ideal’ adaptation of the test, the less the
findings from that instrument should be weighted in decision-making processes
(van de Vijver & Poortinga, 1997). Instead, when the origin and target versions
of a test differ considerably, then users should depend more on outside data sources
for corroborative support (e.g., review of records, interviews, observations, etc.).
In addition, tests that have been adapted using the ‘less than ideal method’ should be weighted less heavily than those adapted using the ‘ideal method’, while any that rely on ad-hoc data should be disregarded. Studies show that, for the most
part, clinicians weight data differently when making interpretations for cultural-
linguistically diverse test-takers. In a study by Sotelo-Dynega and Dixon (2014), 98.3% of the school psychologists surveyed considered the validity of the test scores in their interpretations, and the majority (55.8%) also included informal assessments as supplemental materials for consideration in their analyses. These findings suggest that test users understand the need for best practice in test interpretation.
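To make the weighting idea tangible, the toy sketch below down-weights evidence from less rigorous adaptations. The weights, labels, and numbers are illustrative assumptions only; in practice, the weighting rests on clinical judgment rather than on a fixed formula.

```python
# Illustrative weights reflecting the hierarchy described above (assumptions)
SOURCE_WEIGHTS = {
    "ideal_adaptation": 1.0,
    "less_than_ideal_adaptation": 0.5,
    "ad_hoc_translation": 0.0,   # disregard entirely
    "records_interviews_observations": 0.8,
}

evidence = [  # (source, degree of support for a concern, in [0, 1]); toy data
    ("less_than_ideal_adaptation", 0.9),
    ("records_interviews_observations", 0.3),
]

weighted = sum(SOURCE_WEIGHTS[s] * v for s, v in evidence)
total_weight = sum(SOURCE_WEIGHTS[s] for s, _ in evidence)
print(f"Weighted support for the concern: {weighted / total_weight:.2f}")
```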
Discussion
In 1991, Bracken and Barona wrote an article for School Psychology International
focusing on ‘state of the art’ procedures used in translating and using tests for
multiple languages. This seminal piece from 25 years ago led to the creation of a
barrage of test translation/test adaptation policies, guidelines, and ethical require-
ments that have been developed and updated ever since. The current article seeks to
build on their original work by providing more specific suggestions and a more
current analysis of recent literature. This article attempted to establish empirically
based procedures beyond simple linguistic translations with a focus on more com-
plicated cultural-linguistic adaptation. These suggestions have been described in
a step-by-step manner for both ideal and less-than-ideal adaptation methods.
The first step in the process was to familiarize oneself with all of the legal and
ethical considerations around using tests with multilingual and multicultural
populations (AERA, APA, & NCME, 1999; ISPA, 2011; ITC, 2005; NASP, 2010).
The next is to determine why a child needs to be tested and what systemic issues
should be considered as part of any test selection, adaptation, and interpretation
(Kunnan, 2005). These considerations help to determine whether an ideal or a less
than ideal method of test adaptation is needed. It is only once these two steps are
completed that the appropriate adaptation process may begin.
When adapting a test, there are similar procedures to follow for both ideal and
less than ideal methods to decrease test bias: 1) Choose origin test, 2) translate the
test, 3) review, 4) alterations, 5) validation, and 6) publication (Bracken & Barona,
1991; Butcher, 1996; Geisinger, 1994). Table 2 describes sources of test bias in more
detail, and Figure 1 provides more guidance on each of these procedures. There is
no even less than ideal method of test adaptation. Specifically, ad-hoc test transla-
tions should never be used for diagnostic purposes (Ochoa et al., 2004; Rhodes
et al., 2005).
Finally, when using adapted tests, be careful when drawing conclusions from the
findings. Even when adapted using ideal methods, the target version of a test will
not be as good as the original test. Therefore, test users should use professional
judgment and make weighted decisions utilizing multiple data sources (Sattler,
2008). Even if the test adaptation process goes perfectly, test users must interpret
the data using best practices or the findings will be suspect.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, author-
ship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication
of this article.
Note
The authors would like to thank the reviewers for their input in this article. Specifically, the
recommendations by Bruce Bracken were valued.
References
American Educational Research Association, American Psychological Association, &
National Council on Measurement in Education. (1999). Standards for educational and
psychological testing. Washington, DC: American Educational Research Association
American Psychological Association (APA, 2010). American Psychological Association’s
Ethical Principles of Psychologists and Code of Conduct. Washington, DC: Author.
American Translators Association (n.d.). Code of Ethics. Alexandria, VA: Author. Retrieved
from https://www.atanet.org/governance/code_of_professional_conduct.php
Bracken, B. A., & Barona, A. (1991). State of the art procedures for translating, validating
and using psychoeducational tests in cross-cultural assessment. School Psychology
International,12, 119–132. doi: 10.1177/0143034391121010.
Brislin, R. W. (1980). Translation and content analysis of oral and written material. In H.
C. Triandis, & J. W. Berry (Eds.), Handbook of cross-cultural psychology (Vol. 1,
pp. 389–444). Boston, MA: Allyn & Bacon.
Butcher, J. N. (1996). Translation and adaptation of the MMPI-2 for international use. In J.
N. Butcher (Ed.), International adaptations of the MMPI-2: Research and clinical appli-
cations. (pp. 26–43). Minneapolis, MN: University of Minnesota Press.
Byrne, B. M., & van De Vijver, F. J. (2010). Testing for measurement and structural equiva-
lence in large-scale cross-cultural studies: Addressing the issue of nonequivalence.
International Journal of Testing,10, 107–132. doi: 10.1080/15305051003637306.
Flynn, J. R. (1998). IQ gains over time: Toward finding the causes. In U. Neisser (Ed.),
The rising curve. Long term gains in IQ and related measures (pp. 25–66). Washington,
DC: American Psychological Association.
García-Sánchez, I. M., Orellana, M. F., & Hopkins, M. (2011). Facilitating intercultural
communications in parent-teacher conferences: Lessons from child translators.
Multicultural Perspectives,13, 148–154. doi: 10.1080/15210969.2011.594387.
Geisinger, K. F. (1994). Cross-cultural normative assessment: Translation and adaptation
issues influencing the normative interpretation of assessment instruments. Psychological
Assessment,6(4), 304–312. doi: 10.1037/1040-3590.6.4.304.
Gudmundsson, E. (2009). Guidelines for translating and adapting psychological instru-
ments. Nordic Psychology,61(2), 29–45. doi: 10.1027/1901-2276.61.2.29.
Gummere, F. B. (1910). Beowulf, translated. In C. W. Eliot (Ed.), The Harvard Classics,
Vol. 49. New York, NY: P.F. Collier & Son. Retrieved from https://legacy.fordham.edu/
halsall/basis/beowulf.asp.
Hallberg, G. R. (1996). Assessing bilingual and LEP students: Practical issues in the use of
interpreters. NASP Communique,25(1), 16–18.
Hambleton, R. K. (2005). Issues, designs, and technical guidelines for adapting tests into
multiple languages and cultures. In R. K. Hambleton, P. F. Merenda, & C.
D. Spielberger (Eds.), Adapting educational and psychological tests for cross-cultural
assessment (pp. 3–38). Mahwah, NJ: Lawrence Erlbaum Associates.
Hui, C. H., & Triandis, H. C. (1989). Effects of culture and response format on extreme
response style. Journal of Cross-Cultural Psychology,20, 296–309. doi: 10.1177/
0022022189203004.
International School Psychology Association (ISPA, 2011). Code of ethics. Retrieved
from http://www.ispaweb.org/wp-content/uploads/2013/01/The_ISPA_Code_of_Ethics_
2011.pdf.
International Test Commission (ITC, 1999). International test commission guidelines for
translating and adapting tests. Johannesburg, South Africa: Author.
International Test Commission (ITC, 2005). International test commission guidelines for
translating and adapting tests. Retrieved from http://www.intestcom.org/files/guideline_
test_adaptation.pdf.
Kankaraš, M., & Moors, G. (2010). Researching measurement equivalence in cross-cultural
studies. Psihologija,43(2), 121–136. doi: 10.2298/PSI1002121K.
Klaeber, F. (Ed.) (1922). Beowulf and the fight at Finnsburg. Boston, MA: D.C. Heath & Co.
Retrieved from https://legacy.fordham.edu/halsall/basis/beowulf-oe.asp
Krach, S. K., Doss, K. M., & McCreery, M. P. (2015). Multilingual versions of
popular social, emotional, and behavioral tests: Considerations for training school
psychologists. Trainer’s Forum: Journal of the Trainers of School Psychologists,33(3),
3–26.
Kunnan, A. J. (2005). Towards a model of test evaluation: Using the test fairness and the test context frameworks. In L. Taylor, & C. J. Weir (Eds.), Multilingualism and assessment: Achieving transparency, assuring quality, sustaining diversity. Proceedings of the ALTE Berlin Conference (pp. 239–251). Cambridge: Cambridge University Press.
Lynch, E. W., & Hanson, M. J. (1996). Ensuring cultural competence in assessment. In M.
McLean, D. B. Bailey, Jr., & M. Wolery (Eds.), Assessing infants and preschoolers with
special needs (2nd ed.) (pp. 69–94). Englewood Cliffs, NJ: Prentice Hall.
National Association of School Psychologists. (NASP, 2010). Principles for professional
ethics. Bethesda, MD: Author.
Ochoa, S. H., Gonzalez, D., Galarza, A., & Guillemard, L. (1996). The training and use of
interpreters in bilingual psycho-educational assessment: An alternative in need of study.
Diagnostique,21, 19–40. doi: 10.1177/073724779602100302.
Ochoa, S. H., Riccio, C., Jimenez, S., de Alba, R. G., & Sines, M. (2004). Psychological
assessment of English language learners and/or bilingual students: An investigation of
school psychologists’ current practices. Journal of Psychoeducational Assessment,22(3),
185–208. doi: 10.1177/073428290402200301.
Oxford Dictionary (2015). Word of the Year:2015. Oxford: Oxford University Press.
Retrieved from http://blog.oxforddictionaries.com/2015/11/word-of-the-year-2015-emoji/
Paone, T. R., Malott, K. M., & Maddux, C. (2010). School counselor collaboration with
language interpreters: Results of a national survey. Journal of School Counseling,8(13),
1–30.
Ren, X. S., Amick, B., Zhou, L., & Gandek, B. (1998). Translation and psychometric
evaluation of a Chinese version of the SF-36 Health Survey in the United States.
Journal of Clinical Epidemiology, 51(11), 1129–1138. doi: 10.1016/S0895-4356(98)00104-8.
Rhodes, R. L., Ochoa, S. H., & Ortiz, S. O. (2005). Assessing culturally and linguistically
diverse students: A practical guide. New York, NY: Guilford Press.
Roggen, V. (2014). Expanding the area of classical philology: International words. Nordlit,
(33), 321–328. doi: 10.7557/13.3166.
Sattler, J. (2008). Assessment of children: Cognitive foundations 5th ed. San Diego, CA:
Author.
Sattler, J. M., Oades-Sese, G. V., Kitzie, M., & Krach, S. K. (2014). Chapter 4: Culturally and linguistically diverse children. In J. Sattler, & R. D. Hogue (Eds.), Assessment of children: Behavioral and clinical applications (4th ed.) (pp. 125–159). San Diego, CA: Author.
Solano-Flores, G., Trumbull, E., & Nelson-Barber, S. (2002). Concurrent development of
dual language assessments: An alternative to translating tests for linguistic minorities.
International Journal of Testing,2(2), 107–129. doi: 10.1207/S15327574IJT0202_2.
Sotelo-Dynega, M., & Dixon, S. G. (2014). Cognitive assessment practices: A survey of
school psychologists. Psychology in the Schools,51(10), 1031–1045. doi: 10.1002/
pits.21802.
Swender, E. (2003). Oral proficiency testing in the real world: Answers to frequently asked questions. Foreign Language Annals, 36, 520–526. doi: 10.1111/j.1944-9720.2003.tb02141.x.
Tse, L. (1995). Language brokering among Latino adolescents: Prevalence, attitudes, and
school performance. Hispanic Journal of Behavioral Sciences,17, 180–193. doi: 10.1177/
07399863950172003.
US Office of Minority Health (2001). Eliminating racial and ethnic disparities in health.
Washington, DC: US Department of Health and Human Services.
Van der Velde, M. E., Feij, J. A., & van Emmerik, H. (1998). Change in work values and
norms among Dutch young adults: Ageing or societal trends? International Journal of
Behavioral Development,22(1), 55–76. doi: 10.1080/016502598384513.
Van Teijlingen, E., & Hundley, V. (2002). The importance of pilot studies. Nursing Standard,
16(40), 33–36. doi: 10.7748/ns2002.06.16.40.33.c3214.
Van de Vijver, F. J. R., & Hambleton, R. K. (1996). Translating tests: Some practical
guidelines. European Psychologist,1(2), 89–99. doi: 10.1027/1016-9040.1.2.89.
Van de Vijver, F. J. R., & Leung, K. (1996). Methods and data analysis of comparative
research. In J. W. Berry, Y. H. Poortinga, & J. Pandey (Eds.), Handbook of cross-cultural
psychology.Vol.1:Theory and method, (2nd ed.) (pp. 257–300). Boston, MA: Allyn &
Bacon.
Van de Vijver, F. J., & Poortinga, Y. H. (1997). Towards an integrated analysis of bias in
cross-cultural assessment. European Journal of Psychological Assessment,13(1), 29–37.
doi: 10.1027/1015-5759.13.1.29.
Van de Vijver, F., & Tanzer, N. K. (2004). Bias and equivalence in cross-cultural assessment:
An overview. Revue Européenne de Psychologie Appliquée / European Review of Applied Psychology, 54(2), 119–135. doi: 10.1016/j.erap.2003.12.004.
Author biographies
S. Kathleen Krach, PhD, NCSP is an Assistant Professor at Florida State
University in the School Psychology Program. The primary purpose of her research
is to help improve the practice of school psychology. This is accomplished by
studying school psychology training practices as well as the tools used by school
psychologists.
Michael P. McCreery, PhD, is an Assistant Professor of Interaction and Media
Sciences at the University of Nevada, Las Vegas. He is a psychometrician whose
primary research focus is understanding how principles of human-computer inter-
action can be applied to the creation of technology-based psychoeducational
assessments.
Jessika Guerard, BA, is a PhD student in the Counseling and School Psychology
program at Florida State University. Her research focuses on bilingual assessments
and training practices of school psychologists.
... The careful cross-cultural adaptation of measurement instruments is essential in ensuring their reliability, validity and relevance in diverse cultural contexts [29,30]. This process goes beyond mere translation and involves comprehensive cultural adaptation that preserves the conceptual equivalency between the original and adapted versions [29]. ...
... The careful cross-cultural adaptation of measurement instruments is essential in ensuring their reliability, validity and relevance in diverse cultural contexts [29,30]. This process goes beyond mere translation and involves comprehensive cultural adaptation that preserves the conceptual equivalency between the original and adapted versions [29]. Ensuring this approach's success requires the consideration of multiple types of equivalence. ...
... Semantic equivalence ensures an accurate and meaningful translation, operational equivalence verifies the appropriateness of measurement methods, and metric equivalence evaluates the comparability of psychometric properties across versions [30]. However, cultural biases can pose significant challenges [29]. Methodological bias may arise from differences in response styles, question formats or social norms, whereas content bias occurs when certain items are inappropriate or unclear to the target culture [29,30]. ...
Article
Full-text available
To effectively mitigate the health impacts of climate change, future nurses must be equipped with the requisite knowledge and competencies. Knowing their levels of eco-literacy would help to make them more effective. Background/Objectives: This descriptive study will use a three-round, multicentre, modified e-Delphi survey to establish an expert panel’s consensus on the Climate, Health, and Nursing Tool’s (CHANT) item-level and scale-level content validity indices. It will also examine potential associations between the expert panel members’ sociodemographic and professional characteristics and their content validity index assessments of the CHANT. Methods: The study will be conducted in the French-speaking regions of Switzerland, running its three-round e-Delphi survey between January and April 2025. After each round, the CHANT’s overall scale-level and individual item-level content validity indices will be computed. Comparisons between different types of healthcare professionals’ profiles will also be conducted. Results: The three-round modified e-Delphi survey should allow the expert panel to reach a consensus on the CHANT’s overall content validity index. The tool should then be considered suitable for pilot testing. The first round brought together 16 experts from different regions, namely French-speaking Switzerland, France, and Belgium. Conclusions: To ensure that the nursing discipline is well positioned to meet future challenges, the development of eco-literacy must be integrated into nursing education. Ensuring the CHANT’s conceptual and psychometric validity will be essential in strengthening nursing competencies in and knowledge about planetary health and in implementing future educational interventions.
... Not only equivalence in verbal content must be considered, but also the cultural context giving sense to the very act of measuring that specific construct, with that specific method, and including those specific contents (item words, phrases, and meanings). The current distinction between test translation and test adaptation, and the consensual option for the latter (International Test Commission [ITC], 2017; Krach et al., 2017), stresses that it is crucial to assure levels of equivalence other than linguistic, like construct and method equivalence (Krach et al., 2017; van de Vijver & Tanzer, 2004). The need for cultural adaptation implies item rephrasing, or even content change, to overcome cultural differences in item reading level and interpretation. This is particularly true when a test written in an Anglo-Saxon language is transposed to a Latin language, as the kind of colloquial wording required to preserve reading level, without interfering in the psychometric and psychological value of each item, often demands substantial verbal change (Krach et al., 2017). ...
Article
Full-text available
Bilingual-sample studies are regarded as a useful tool for confirming the equivalence between linguistically different versions of a test. Yet such studies are rare in the literature, as they require technical issues to be considered before any conclusion about equivalence can be reached. This paper discusses some of these issues, taking the example of the recent MMPI-2-RF Portuguese adaptation and standardization study. The results of a bilingual study (N = 53) using a single-sample design are analyzed at the item, scale, profile, and structural levels, allowing an encouraging general conclusion about the equivalence of the Portuguese MMPI-2-RF to the North American original version, but also pointing out some directions for improvement. The shortcomings of classical bilingual studies, and the specific limitations due to the obstacles to bilingual-sample recruitment in Portugal, are considered. The limited sample size and some other methodological shortcomings are discussed, considering their implications for future Portuguese MMPI equivalence studies.
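In a single-sample bilingual design like the one described above, each bilingual participant completes both language versions, and scale-level equivalence is commonly probed with a cross-language correlation plus a paired comparison of means. The sketch below illustrates that logic with simulated data; the scores, scale metric, and sample values are all invented, not MMPI-2-RF results.

```python
import numpy as np
from scipy import stats

# Simulated scores for illustration only: each of 53 bilingual participants
# completes both language versions of the same scale.
rng = np.random.default_rng(0)
n = 53
original = rng.normal(50, 10, n)          # original-language version (T-score metric)
adapted = original + rng.normal(0, 4, n)  # adapted version: noisy but unbiased

# (1) Do the two versions rank-order people the same way?
r, r_p = stats.pearsonr(original, adapted)

# (2) Does either version yield systematically higher scores?
t, t_p = stats.ttest_rel(original, adapted)

print(f"cross-language r = {r:.2f} (p = {r_p:.3f})")
print(f"paired t = {t:.2f} (p = {t_p:.3f})")
```

A high correlation together with a non-significant mean difference is the pattern usually taken as supporting scale-level equivalence, though item- and structure-level checks are needed as well.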
... This is due to two fundamental factors: first, in the case of the Practice Environment Scale of the Nursing Work Index (1), which measures the nursing work environment and, therefore, the practice of professionals in the field, the local context differs significantly from what is established in North American countries, so certain items required more detailed review, analysis, and adaptation. These results underscore the importance of the linguistic adaptation of measurement instruments in order to obtain results that reflect the reality of each country (16,27,28). Our team followed a process similar to that of Squires et al. (29) in the pilot study on the work climate of nurses in Mexico, in which language adaptations had to be made and some items added to reflect the reality experienced by professionals in that country. ...
... As a limitation, a certain weakness is noted in the linguistic review process of the instruments originally written in English. The literature indicates that working with translators in a first stage of the translation and back-translation process is fundamental (16,27,28); however, it should be noted that this point was analyzed in depth during the research team's critical review and the work with experts with extensive backgrounds in labor-related issues and/or the development of these thematic areas in academia and research. It is recommended that future studies incorporate this stage, as well as include cognitive interviews in the pilot phase to obtain richer feedback data from the study population (35). ...
Article
Full-text available
Objective: To present the cross-cultural adaptation process of the instruments used to determine the relationship between workplace psychological harassment, the work environment, and professional quality of life and the intention to resign among nursing professionals in two South American countries: Chile and Peru. Material and Methods: A cross-sectional, methodological study conducted in 2022 and 2023. Content validity was addressed in two stages: 1) a critical review of the instruments by the research team; 2) a review of the content through work with experts, applying the Delphi method together with the calculation of content validity indices. For internal consistency, the instruments were piloted using the QuestionPro platform, applying convenience sampling through an open call on social media (Facebook, Twitter, and LinkedIn) to nursing professionals who met the inclusion and exclusion criteria. The pilot phase used a sample of 30 professionals per country. Cronbach's alpha was calculated as a measure of internal consistency. Results: Between seven and nine experts in the areas of interest for the research problem participated in the content validation. The first round of the Delphi technique yielded good item-level and average content validity indices; the universal content index indicated the need for linguistic adjustments. All instruments showed acceptable levels of internal consistency. Conclusions: Adapted instruments are now available for analyzing the phenomenon under study in each country.
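The pilot-phase internal consistency reported above relies on Cronbach's alpha, which can be computed directly from a respondents-by-items matrix. A minimal sketch, with a simulated 30-person pilot standing in for the real QuestionPro data:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a 2-D array of shape (respondents, items)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the total score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical pilot data for illustration: 30 respondents (mirroring the
# per-country pilot size) answering 8 items on a 1-5 scale, with a shared
# component so the items correlate.
rng = np.random.default_rng(1)
base = rng.integers(1, 6, size=(30, 1))
pilot = np.clip(base + rng.integers(-1, 2, size=(30, 8)), 1, 5)

print(f"alpha = {cronbach_alpha(pilot):.2f}")
```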
... The confidentiality of the participants was maintained throughout the whole study process. Procedure of Translation: The translation followed the guidelines of Krach et al. (2017). First, the AHS was translated into Urdu, in line with the demands of the sample, using the forward/backward translation method. It comprised 12 items. ...
Article
Full-text available
The present research aims to translate the English version of the Adult Hope Scale into Urdu. A cross-sectional research design with a descriptive research method was used. A sample of 200 participants, 98 male and 102 female university students from four universities in Faisalabad, was selected to validate the Adult Hope Scale. The data were collected through simple random sampling. The Adult Hope Scale (Snyder et al., 1991) was used to measure dispositional hope. The data were analyzed with SPSS (version 24) and AMOS (version 25). The results of confirmatory factor analysis established the factorial validity of the single-factor and two-factor models of the Adult Hope Scale. Factor loadings and model fit indices confirmed that the two-factor model (pathways and agency) is better supported than the single-factor model. The Urdu version of the Adult Hope Scale showed acceptable Cronbach's alpha values (overall hope = .86, pathways = .87, and agency = .86), indicating good reliability. A significant positive correlation was found between the English and Urdu versions of the Adult Hope Scale. Test-retest reliability, evaluated over a one-week interval, showed a strong, significant positive correlation (r = .91). The results showed significant gender differences: mean scores for hope, pathways, and agency were significantly higher among male university students than among female university students. This research confirmed that the Urdu version of the Adult Hope Scale is reliable and valid for measuring hope among university students.
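The single-factor versus two-factor comparison reported for the Adult Hope Scale is typically run as two confirmatory factor analyses whose fit indices are compared. The sketch below shows that workflow using the third-party semopy package (its Model / fit / calc_stats API is assumed here); the simulated agency and pathways items are illustrative, not the AHS data.

```python
import numpy as np
import pandas as pd
import semopy  # third-party SEM package; API assumed as documented

# Simulated item data: two correlated latent factors (agency, pathways)
# each drive four observed items.
rng = np.random.default_rng(2)
n = 200
agency = rng.normal(size=n)
pathways = 0.6 * agency + rng.normal(scale=0.8, size=n)

data = {}
for i in range(1, 5):
    data[f"a{i}"] = 0.8 * agency + rng.normal(scale=0.6, size=n)
    data[f"p{i}"] = 0.8 * pathways + rng.normal(scale=0.6, size=n)
df = pd.DataFrame(data)

one_factor = "hope =~ a1 + a2 + a3 + a4 + p1 + p2 + p3 + p4"
two_factor = """
agency =~ a1 + a2 + a3 + a4
pathways =~ p1 + p2 + p3 + p4
"""

# Fit both measurement models and compare fit indices (CFI, RMSEA, AIC, ...)
# to decide which structure the data better support.
for name, desc in [("one-factor", one_factor), ("two-factor", two_factor)]:
    model = semopy.Model(desc)
    model.fit(df)
    print(name)
    print(semopy.calc_stats(model).T)
```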
... The field of assessment has historically been challenged with two broad issues: accuracy and alignment. Accuracy often addresses psychometric issues as described in the Standards for Educational and Psychological Testing (AERA, APA, NCME, 2014) such as reliability, validity, norming, as well as socially desirable responding (McCreery et al., 2022) and cultural / linguistic variations (Krach et al., 2017). If scores on an assessment are not reliable or valid, then no other information is needed for interpretation (William, 2001). ...
... For the translation and expert-judge validation process, guidelines for test translation are suggested, beginning with the selection of the test, the simplification of its items in relation to a cultural context, the translation of the test by a team of translators, and the review of the translation by mental health experts and by the scale's author, with their respective recommendations regarding the items and the implementation of the scale (19). ...
Article
Full-text available
Introduction: Continuity of care is considered a process involving orderly care and the uninterrupted movement of people among the various elements of the service delivery system. There is insufficient evidence regarding measurement instruments in Ibero-America. Therefore, the objective of the present study is to describe the process of translation and cultural adaptation to a Latin American context, as well as the internal consistency and construct validity, of the Alberta Continuity of Services Scale for Mental Health (ACSS-MH). Method: The instrument underwent expert evaluation of content validity and was administered to a rural population in a Colombian context. Internal consistency and construct validity tests were performed for each part of the scale. Results: Based on expert consensus, changes were made to some items, seeking to better adapt the instrument to the linguistic characteristics of Spanish without losing sight of the evaluation objective of each item of the original questionnaire. The analysis of Part A converged on 5 components explaining 69.69% of the variance with 24 items; likewise, the analysis of Part B grouped 13 items into four components explaining 72.02% of the variance. Conclusions: This instrument could be implemented to improve the delivery of mental health services in Latin American contexts, where continuity of care has faced significant difficulties.
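Component solutions of the kind reported above ("5 components explaining 69.69% of the variance") come from the cumulative explained-variance ratio of a principal component analysis. A minimal sketch with simulated 24-item data; the published study's exact extraction and rotation choices may differ.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical item responses for illustration (not the ACSS-MH data):
# 150 respondents x 24 items sharing a 5-dimensional latent structure.
rng = np.random.default_rng(3)
latent = rng.normal(size=(150, 5)) @ rng.normal(size=(5, 24))
items = latent + rng.normal(scale=1.0, size=(150, 24))

pca = PCA()
pca.fit(items)

# Cumulative share of variance retained by the first k components --
# the quantity reported as "k components explaining X% of the variance".
cumvar = np.cumsum(pca.explained_variance_ratio_)
print(f"first 5 components explain {100 * cumvar[4]:.1f}% of the variance")
```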
... The SDQ parent form was previously validated in the Latvian, Italian, and Portuguese languages. The Latvian SDQ self-report form and CD-RISC-10 were translated according to recommendations in the literature (68), namely translation, review, back-translation, review by experts, piloting in the target group, and final agreement among the experts' committee to reach appropriate cultural, semantic, and conceptual equivalence with the original measure. The same procedure was applied to the Brief SSIS-SEL scales for their Latvian, Italian, and Portuguese versions (see Table 4). ...
Article
Full-text available
Objectives: The consequences of long-lasting restrictions related to the COVID-19 pandemic have become a topical question in the latest research. The present study aims to analyze longitudinal changes in adolescents' social emotional skills, resilience, and behavioral problems. Moreover, the study addresses the impact of adolescents' social emotional learning on changes in their resilience and behavioral problems over the course of seven months of the pandemic. Methods: The Time 1 (T1) and Time 2 (T2) measuring points were in October 2020 and May 2021, characterized by high mortality rates and strict restrictions in Europe. For all three countries combined, 512 questionnaires were answered by both adolescents (aged 11-13 and 14-16 years) and their parents. The SSIS-SEL and SDQ student self-report and parent forms were used to evaluate adolescents' social emotional skills and behavioral problems. The CD-RISC-10 scale was administered to adolescents to measure their self-reported resilience. Several multilevel models were fitted to investigate the changes in adolescents' social emotional skills, resilience, and behavioral problems, controlling for age and gender. Correlation analysis was carried out to investigate how changes in the adolescents' social emotional skills were associated with changes in their resilience and mental health adjustment. Results: Comparing T1 and T2 evaluations, adolescents claim they have more behavioral problems, have less social emotional skills, and are less prosocial than perceived by their parents, and this result applies across all countries and age groups. Both informants agree that COVID-19 had a negative impact, reporting an increment in the mean internalizing and externalizing difficulties scores and reductions in social emotional skills, prosocial behavior, and resilience scores. However, these changes are not very conspicuous, and most of them are not significant. Correlation analysis shows that changes in adolescents' social emotional skills are negatively and significantly related to changes in internalized and externalized problems and positively and significantly related to changes in prosocial behavior and resilience. This implies that adolescents who experienced larger development in social emotional learning also experienced more increase in resilience and prosocial behavior and a decrease in difficulties. Conclusion: Due to its longitudinal design, sample size, and multi-informant approach, this study adds to a deeper understanding of the pandemic's consequences on adolescents' mental health. Keywords: COVID-19; adolescents; behavioral problems; longitudinal research; mental health; multi-informant approach; social emotional learning.
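The multilevel models described above treat the repeated measures (T1, T2) as nested within adolescents, estimating change over time while controlling for age and gender. A minimal random-intercept sketch with statsmodels follows; the two-wave data are simulated and none of the coefficients reflect the actual study.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated two-wave data: 100 adolescents, each measured at T1 and T2.
rng = np.random.default_rng(4)
n = 100
person = np.repeat(np.arange(n), 2)        # each id appears twice
time = np.tile([0, 1], n)                  # 0 = T1, 1 = T2
intercepts = rng.normal(0, 1, n)[person]   # random intercept per adolescent
age = rng.integers(11, 17, n)[person]
gender = rng.integers(0, 2, n)[person]
skills = 3.5 + intercepts - 0.1 * time + 0.02 * age + rng.normal(0, 0.5, 2 * n)

df = pd.DataFrame({"id": person, "time": time, "age": age,
                   "gender": gender, "skills": skills})

# Random-intercept growth model: fixed effects for time, age, and gender;
# repeated measures nested within adolescents via the grouping variable.
model = smf.mixedlm("skills ~ time + age + gender", df, groups=df["id"])
result = model.fit()
print(result.summary())
```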
Article
Full-text available
This study addresses the process of adapting the High-Performance Work Systems (HPWS; Turkish: Yüksek Performanslı İş Sistemleri, YPİS) Scale into Turkish and the analyses conducted on the scale's validity and reliability. A data set was compiled from a total of 201 white-collar participants working at technology firms that develop software and hardware or provide services, and exploratory factor analysis (EFA), confirmatory factor analysis (CFA), and convergent and discriminant validity analyses were applied to the data. Descriptive statistical methods (frequency, ratio) were used in the evaluation process. The findings show that the factors form a meaningful and comprehensive structure among the variables in the original data set. The CFA indicated that the data exhibited acceptable and strong fit. The convergent and discriminant validity analyses support the scale's validity. In conclusion, this scale is considered to have the potential to be a suitable tool for evaluating and improving the human resources practices of businesses in Turkey.
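Convergent and discriminant validity checks of the kind reported above are commonly operationalized through the average variance extracted (AVE), composite reliability (CR), and the Fornell-Larcker criterion. A minimal sketch; the construct names, loadings, and latent correlation are hypothetical, not estimates from the Turkish HPWS adaptation.

```python
import numpy as np

# Hypothetical standardized CFA loadings for two constructs.
loadings = {
    "ability":    np.array([0.78, 0.81, 0.74, 0.69]),
    "motivation": np.array([0.72, 0.80, 0.77]),
}
construct_corr = 0.55  # assumed latent correlation between the constructs

for name, lam in loadings.items():
    # Convergent validity: AVE >= .50 and CR >= .70 are the usual benchmarks.
    ave = np.mean(lam ** 2)  # average variance extracted
    cr = lam.sum() ** 2 / (lam.sum() ** 2 + (1 - lam ** 2).sum())  # composite reliability
    print(f"{name}: AVE = {ave:.2f}, CR = {cr:.2f}")
    # Fornell-Larcker criterion for discriminant validity:
    # sqrt(AVE) should exceed the construct's correlations with other constructs.
    print(f"  sqrt(AVE) = {np.sqrt(ave):.2f} vs latent r = {construct_corr:.2f}")
```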
Article
Full-text available
The main objective of the study was the Urdu translation and cultural adaptation of the Glasgow Antipsychotic Side-effect Scale (GASS). The study comprised two phases and employed a mixed-methods design. The multiple forward translation method was used to translate and adapt the GASS, and a pre-clinical version was administered to 58 inpatients. Later, psychometric properties and gender patterns of side effects were assessed. The pre-final version of the tool had good content validity, with an S-CVI of .94 and I-CVIs of .8 to 1. Cronbach's alpha for men indicated good internal consistency and reliability, while it was low for women. Both men and women exhibited mild to moderate severity of symptoms. This study provides a tool to assess the side effects of second-generation antipsychotics (SGAs) in our setting.
Article
Full-text available
The classical languages, Greek and Latin, have a special kind of afterlife, namely through their explosive expansion into other languages, from antiquity until today. The aim of the present paper is to give a broad survey of this field of study – enough to show that there is a lot to find. English, Spanish, and Norwegian are chosen as examples – three Indo-European languages, all of them with rich material for our purpose. In the national philologies, the treatment of the Greek and Latin elements is often not given special attention; they are studied alongside other aspects of the language in question. A cooperation with classical philology would be an advantage. Moreover, only classical philology can give the full picture, seen from the point of view of Greek and Latin, and explain why and how these languages have lent so many words and word elements to so many vernacular languages. Another aspect of the field, which I call 'international words', is the enormous potential that these words have if disseminated in a good way to the general population. If taught systematically, the learner will be able to see the connections between words, learn new words faster, and develop a deeper understanding of the vocabularies of – for example – English, Spanish, and Norwegian.
Article
Full-text available
The present article describes an exploratory study regarding the preferred cognitive assessment practices of current school psychologists. Three hundred and twenty-three school psychologists participated in the survey. The results suggest that the majority of school psychologists endorsed that they base their assessment practices on an underlying theoretical framework, specifically Cattell-Horn-Carroll (CHC) theory. Despite this finding, the majority of those sampled continue to engage in traditional assessment practices that are not consistent with CHC theory. Furthermore, the majority of those sampled reported that they assess culturally and linguistically diverse students and modify their practices when doing so. Unfortunately, the modifications endorsed by those surveyed might be discriminatory. The implications of these findings are discussed herein.
Article
The ACTFL Oral Proficiency Interview (OPI) is used to assess the ability of individuals to use language for real-world purposes. Today, OPIs are used by academic institutions, government agencies, and private corporations for many purposes: academic placement, student assessment, program evaluation, professional certification, hiring, and promotional qualification. Through Language Testing International (LTI), the exclusive ACTFL testing office, ACTFL conducts, rates, and archives 8,000 to 10,000 oral proficiency interviews each year. This article addresses questions that are frequently asked by educators, test takers, employers, certification boards, and others who require information about an individual's level of oral proficiency. The frequently asked questions (FAQs) addressed in this article are: (1) Does taking an OPI over the phone produce a different rating than a face-to-face interview? (2) Are there differences in testing performance from one testing occasion to another when there is no significant opportunity for learning or forgetting between the two tests? (3) How proficient are today's foreign language undergraduate majors? (4) What minimum levels of proficiency are required in the workplace? The answers to questions 1 and 2 are based on the results of an ACTFL-sponsored testing project that compared face-to-face with telephonic interviews. The findings indicated that there is no significant difference in the ratings assigned using face-to-face versus telephone test administration. The data from the same study indicated that comparable results are obtained in test/retest situations. Questions 3 and 4 are answered using data from the ACTFL Test Archives. The majority of undergraduate language majors have achieved proficiency levels that cluster around the Intermediate-High/Advanced-Low border. Different jobs require different levels of proficiency. Charts are provided to summarize the findings.
Article
This study examined critical components of the assessment procedures school psychologists use when conducting evaluations for emotional disturbance with students who are English language learners (ELLs). A random sample of 1,500 members of NASP from 12 states with high limited-English-proficient populations was surveyed. A total of 439 respondents (29.27%) returned the survey. Only 223 of the respondents indicated that they had assessed ELLs. The results indicate that school psychologists are assessing ELLs from many different language groups, Spanish being the most common language group assessed. A significant number of school psychologists used interpreters when assessing ELLs. The following assessment methods were employed by over 90% of the respondents: behavioral observation, child interview, teacher interview, and parent interview. These four methods were judged to be very helpful. The most frequently used measures included the Bender Visual Motor Gestalt Test (75.8%), Draw-A-Person (71.7%), House-Tree-Person (58.4%), Kinetic Family Drawing (55.3%), and generic sentence completion forms (52.5%). The Acculturation Rating Scale for Mexican Americans (ARSMA), all BASC measures (PRS in English and Spanish, TRS, SRP, SDH, and SOS), the Millon, and the Haak Sentence Completion obtained the highest mean ratings for level of helpfulness. Implications of the results with respect to professional standards and recommended practices are discussed.
Article
This article describes some of the issues affecting measures that are translated and/or adapted from an original language and culture to a new one. It addresses steps to ensure (a) that the test continues to measure the same psychological characteristics, (b) that the test content is the same, and (c) that the research procedures needed to document that it effectively meets this goal are available. Specifically, the notions of test validation, fairness, and norms are addressed. An argument that such adaptations may be necessary when assessing members of subpopulations in U.S. culture is proposed.
Article
Outcome measures are rapidly becoming standard tools in the assessment of clinical effectiveness and in the measurement of health status in populations. In this article we document the development of a self-administered Chinese version of the Medical Outcomes Study 36-Item Short-Form Health Survey (SF-36) and report the results of psychometric testing among 156 adult Chinese Americans in Boston, Massachusetts. Following the standard guidelines, a Chinese version of the SF-36 was developed through forward-backward translation techniques and committee review. We used psychometric methods to test assumptions underlying the construction and scoring of scales and to evaluate the reliability and validity of the Chinese SF-36 as a measure of health status. The preliminary results indicated that missing value rates for the 36 items were consistently low. Item-discriminant validity was high (over 90% scaling successes) for six of the eight scales (Physical Functioning, Role-Physical, Bodily Pain, General Health, Role-Emotional, and Mental Health). Cronbach's alpha coefficient was above the 0.70 criterion for all scales except Social Functioning. Reliability estimates also appeared to vary by sample characteristics. We discuss the implications of these findings and identify where further work will be required.
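The "scaling successes" figure quoted above comes from item-discriminant validity testing: each item's correlation with its own scale (corrected for overlap by removing the item from its own scale total) is compared against its correlations with all other scales, and each comparison the item wins counts as a success. A minimal two-scale sketch with simulated data (not SF-36 responses):

```python
import numpy as np
import pandas as pd

# Simulated responses for two 4-item scales driven by correlated factors.
rng = np.random.default_rng(5)
n = 156
f1 = rng.normal(size=n)
f2 = 0.4 * f1 + rng.normal(scale=0.9, size=n)
df = pd.DataFrame(
    {f"pf{i}": 0.8 * f1 + rng.normal(scale=0.6, size=n) for i in range(4)}
    | {f"mh{i}": 0.8 * f2 + rng.normal(scale=0.6, size=n) for i in range(4)}
)
scales = {"PF": [f"pf{i}" for i in range(4)], "MH": [f"mh{i}" for i in range(4)]}

successes, tests = 0, 0
for own, items in scales.items():
    for item in items:
        # Own-scale correlation, corrected for overlap: drop the item
        # from its own scale total before correlating.
        own_r = df[item].corr(df[items].drop(columns=item).sum(axis=1))
        for other, other_items in scales.items():
            if other == own:
                continue
            tests += 1
            successes += own_r > df[item].corr(df[other_items].sum(axis=1))

print(f"scaling successes: {successes}/{tests} ({100 * successes / tests:.0f}%)")
```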