European Journal of Psychological Assessment

Published by Hogrefe
Online ISSN: 1015-5759
Article
This study provides evidence that an Italian version of the Positive and Negative Affect Schedule (PANAS) is a reliable and valid self-report measure. In an Italian sample (N = 600), the PANAS showed solid psychometric properties, and several American findings with the PANAS were replicated. The replicability of the PANAS factor structure was confirmed by high congruence coefficients between the American and Italian varimax solutions. Alternative models were tested with Confirmatory Factor Analysis; as in previous studies, the two-factor model achieved the best fit, but absolute fit indices varied with the estimation methods used. The independence/bipolarity issue was also explored: Positive and negative affect scales remain substantially independent after accounting for measurement error and acquiescence. Some predictions from the tripartite model of anxiety and depression were confirmed, and external correlates of the PANAS replicated those found in other languages and cultures. These analyses offer strong support for the construct validity of the Italian PANAS.
 
Number of participants in the baseline assessment, in the cell phone-based EMA, and those who repeatedly took part in the EMA (columns: N at baseline; N in the EMA in total; N repeated EMA participation)
Proportion of submitted, returned and unreturned questionnaires among participants who replied to at least 5 questionnaires
Response time among participants who replied to at least 5 questionnaires
Article
Rapid advances in mobile data-transfer technologies offer new possibilities in the use of cell phones to conduct assessments of a person's natural environment in real time. This paper describes features of a new Internet-based, cell phone-optimized assessment technique (ICAT), which consists of a retrospective baseline assessment combined with text messages sent to the participants' personal cell phones providing a hyperlink to an Internet-stored cell phone-optimized questionnaire. Two participation conditions were used to test variations in response burden. Retention rates, completion rates, and response times in different subgroups were tested by means of χ² tests, Cox regression, and logistic regression. Among the 237 initial participants, we observed a retention rate of 90.3% from the baseline assessment to the cell-phone part, and 80.4% repeated participation in the 30 daily assessments. Each day, 40-70% of the questionnaires were returned, a fourth in less than 3 minutes. Qualitative interviews underscored the ease of use of ICAT. This technique appears to be an innovative, convenient, and cost-effective way of collecting data on situational characteristics while minimizing recall bias. Because of its flexibility, ICAT can be applied in various disciplines, whether as part of small pilot studies or large-scale, crosscultural, and multisite research projects.
 
Article
Contemporary theories of social anxiety emphasize the role of cognitive processes. Although social anxiety disorder is one of the most common mental health problems in adolescents, there are very few self-report instruments available to measure cognitive processes related to social anxiety in adolescents, let alone non-English instruments. The Self-Statements during Public Speaking Scale (SSPS; Hofmann & DiBartolo, 2000) is a brief self-report measure designed to assess self-statements related to public speaking, the most commonly feared social performance situation. To fill this gap in the literature, we translated the SSPS into Spanish and administered it to 1,694 adolescents from a community sample, a clinical sample composed of 71 subjects with a principal diagnosis of social anxiety disorder, and a clinical control group consisting of 154 patients. The scale showed good psychometric properties, supporting the use of the Spanish version of the SSPS in adolescents.
 
The SDS-17 and the Marlowe-Crowne Scale across age groups of 18-89 years. 
Article
Presents 4 studies (with a total of 440 Ss) that investigate the convergent validity, discriminant validity, and relationship with age of the Social Desirability Scale-17 (SDS-17). As to convergent validity, SDS-17 scores showed correlations between .52 and .85 with other measures of social desirability. With respect to the Balanced Inventory of Desirable Responding, SDS-17 scores showed a unique correlation with impression management, but not with self-deception. As to discriminant validity, SDS-17 scores showed nonsignificant correlations with neuroticism, extraversion, psychoticism, and openness to experience, whereas there was some overlap with agreeableness and conscientiousness. With respect to relationship with age, the SDS-17 was administered in a sample stratified for age, with age ranging from 18 to 89 yrs. In all but the oldest age group, the SDS-17 showed substantial correlations with the Marlowe-Crowne Scale. The influence of age (cohort) on mean scores, however, was significantly smaller for the SDS-17 than for the Marlowe-Crowne Scale. In sum, results indicate that the SDS-17 is a reliable and valid measure of social desirability, suitable for adults of the target age range.
 
Article
Presents an obituary of Hans-Jürgen Eysenck (1916-1997). Hans J. Eysenck, one of the most prominent psychologists of this century, died on September 4, 1997. He was born on March 4, 1916, in Berlin. Eysenck can be characterized by three main traits. First of all, he considered psychology a science, so that methodology was tremendously important to him. Second, he was an honest researcher, and therefore he was a very controversial one. Finally, he was a very supportive scholar. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
We conducted a historical analysis of the articles published in the first (1992–1996) and last 5 years (2005–2009) of the European Journal of Psychological Assessment (EJPA), mainly on the basis of an analysis of abstracts and keywords of articles. We dealt with the impact of EJPA, the main characteristics of its articles, its evolution, and the extent to which the main features of psychological assessment are represented in the journal. EJPA is a journal with a steadily rising impact factor that is relatively high for the field of assessment. Authorship is mainly European and coauthors usually come from the same country. The personality domain has gained popularity at the expense of cognition and education. Questionnaires are the most frequently used and an increasingly popular assessment method; there is also a tendency to employ multiple instruments and methods, and computerized assessment. More recent volumes have fewer substance-oriented and more measurement-oriented studies, notably studies in which validity is addressed by factor-analytic procedures. The incomplete coverage of recent developments in psychological assessment is discussed. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
In this study, a job analysis method was used to derive important and observable personal qualities (PQs) which were used to assess 116 military officer candidates (graduates, non-graduates, and staff) within a structured, life-history, general selection interview. 17 subject matter experts, who were knowledgeable of the job and interview technique, were used. After correcting for range restriction and adjusting for number of variates, the multiple correlation of the PQs against success at the next stage of training was: 0.41 for non-graduates; 0.28 for staff; and 0.18 for graduates. 2 possible explanations, both to do with observability of PQs, are proposed to explain these differences in predictive validity. It is argued that the proposed method can have similar validity to the situational interview for some groups of candidates without the problems and limitations of the situational interview. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
With reference to EJPA’s unique and broad scope, the current study analyzed the characteristics of the authors as well as the topics and research aims of the 69 empirical articles published in the years 2009–2010. Results revealed that more than one third of the articles were written by authors affiliated with more than one country. With reference to their research aims, an almost comparable number of articles (1) presented a new measure, (2) dealt with adaptations of measures, or (3) dealt with further research on existing measures. Analyses also revealed that most articles did not address any particular field of application. The second largest group was comprised of articles related to the clinical field, followed by the health-related field of application. The majority of all articles put their focus on investigating questionnaires or rating scales, and only a small number of articles investigated procedures classified as tests or properties of interviews. As to further characteristics of the method(s) used, a majority of EJPA contributions addressed self-report data. Results are discussed with reference to publication demands as well as the current and future challenges and demands of psychological assessment. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
Ambulatory assessment aims to capture psychological, behavioral, and physiological data in "real time" using in-field data acquisition systems. Although ambulatory assessment research has flourished particularly in the last decades, overviews of hardware and software solutions for monitoring are scarce and, if found, are often outdated. In this review, we give an overview of current software and hardware solutions, focusing on multichannel systems for physiological data acquisition and hand-held computer-based "experience sampling" systems. We aim at offering the reader guidance with regard to their choice of psychological and physiological monitoring solutions, giving special emphasis to key features relevant for different research questions. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
Examines the psychometric properties of the scaled version of the General Health Questionnaire (GHQ-28) and its 4 subscales: somatic symptoms, anxiety and insomnia, social dysfunctioning, and severe depression. Data from 4 European countries were used, including 691 patients with rheumatoid arthritis. Psychometric evaluations of the GHQ-28 for each country, both separately and simultaneously, were carried out. Results from Simultaneous Component Analysis, internal consistencies, intercorrelations between the subscales across countries, and group-mean comparisons indicated that the original 4-factor structure was present in all 4 countries. Evidence was also found for the unidimensionality of each of the 4 subscales. For cross-national comparison on the subscale level, it is concluded that sufficient evidence is found for all 4 subscales, as originally suggested by D. P. Goldberg and V. F. Hillier (1979). (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Factor loadings from the confirmatory factor analysis of the French translation of EPQR-A.
Factor correlation of the French translation of the EPQR-A.
Article
There is increasing interest in the abbreviated form of the Revised Eysenck Personality Questionnaire (EPQR-A) as a research tool for psychologists. The present study evaluated the psychometric properties of a French translation of the EPQR-A in order to facilitate its use among French researchers. Data from a sample of 515 French undergraduate university students (462 females and 53 males; mean age 20.46 yrs) were used. The dimensionality of the EPQR-A was examined in terms of the underlying latent factors. Using confirmatory factor analysis, the authors found evidence for the unidimensionality of the four EPQR-A subscales of extraversion, neuroticism, psychoticism, and the lie scale. These results are consistent with those of previous research with the original English version of the EPQR-A (L. J. Francis et al., 1992; S. Forrest et al., 2000). (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
Examined the structural validity of the multiple-choice items of the Sternberg Triarchic Abilities Test (STAT) regarding the existence and separability of the 3 aspects (creative, analytical, and practical) of intelligence in 3 content modalities, by using the techniques of confirmatory factor analysis on a combined sample of 3,278 school students (12–18 yrs old) from the US, Finland, and Spain. The results of the comparison of a number of models—using the strategy of hierarchical confirmatory factor analysis (HCFA) and comparing nested and alternative models, specified under different assumed theories relative to a unidimensional concept of general intelligence, a traditional factorial concept, and a triarchic model—illustrate that the second-order factor model based on the triarchic theory of intelligence achieves the best (albeit far from perfect) fit to the empirical data. The results of this study provide some support for the construct validity of the STAT and of the triarchic theory of intelligence on which it is based. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
In the US normative population for the Wechsler Adult Intelligence Scale-Revised (WAIS-R), differences (Ds) between persons' verbal and performance IQs (VIQs and PIQs) tend to increase with an increase in full scale IQs (FSIQs). This suggests that norm-referenced interpretations of Ds should take FSIQs into account. Two new graphs are presented to facilitate this type of interpretation. One of these graphs estimates the mean of absolute values of D (called typical D) at each FSIQ level of the US normative population. The other graph estimates the absolute value of D that is exceeded only 5% of the time (called abnormal D) at each FSIQ level of this population. A graph for the identification of conventional "statistically significant Ds" (also called "reliable Ds") is also presented. A reliable D is defined in the context of classical true score theory as an absolute D that is unlikely to be exceeded by a person whose true VIQ and PIQ are equal. As conventionally defined, reliable Ds do not depend on the FSIQ. The graphs of typical and abnormal Ds are based on quadratic models of the relation of sizes of Ds to FSIQs. Implications of the three juxtaposed graphs for the interpretation of VIQ-PIQ differences are discussed. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
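The notion of a reliable D can be illustrated with a short sketch. It assumes the textbook classical-test-theory formula for the standard error of a difference between two scores on the same scale, SD × sqrt(2 − r_V − r_P); the abstract does not state which formula the authors used, and the function name, the SD of 15, and the reliabilities below are hypothetical inputs chosen for illustration only.

```python
import math

def reliable_difference(sd, rel_verbal, rel_performance, z=1.96):
    """Smallest absolute VIQ-PIQ difference treated as reliable at the given z.

    Assumes both scales share the same standard deviation `sd` and that
    errors of measurement are uncorrelated (classical true score theory).
    """
    se_diff = sd * math.sqrt(2.0 - rel_verbal - rel_performance)
    return z * se_diff

# Hypothetical inputs: SD = 15, reliabilities of .97 and .93.
threshold = reliable_difference(15.0, 0.97, 0.93)
print(round(threshold, 2))
```

Because the formula contains only the scale SD and the two reliabilities, the resulting threshold is the same at every FSIQ level, which is the contrast the abstract draws with typical and abnormal Ds.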
 
Means, standard deviations, Cronbach's αs, and correlations between the French ABQ subscales and motivation, self-confidence and anxiety 
Article
This research develops a psychometrically sound measure of the Athlete Burnout Questionnaire (ABQ; Raedeke & Smith, 2001) in French (Le Questionnaire du Burnout Sportif, QBS). We first developed a preliminary version and then had 895 French adolescents involved in competitive sport or physical education at school complete the survey. The results showed good internal consistency (all Cronbach’s α values > .75). Confirmatory factor analysis with the three subscales of the ABQ (emotional and physical exhaustion, reduced sense of accomplishment, and devaluation) confirmed the structure of the instrument and good data fit (NNFI = .95, CFI = .96, GFI = .95, RMSEA = .07) in accordance with the results obtained in previous studies (e.g., Cresswell & Eklund, 2005a,b; Raedeke & Smith, 2001). Furthermore, the patterns of relationships between the ABQ subscales and motivation, self-confidence, and anxiety provide concurrent validity of the ABQ. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
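For readers unfamiliar with the internal-consistency index reported here, the following is a minimal sketch of Cronbach's α using its standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the toy responses are hypothetical and are not data from the study.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for `rows`, one list of item scores per respondent."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose: one tuple per item
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var / total_var)

# Toy data (hypothetical): 3 respondents answering 2 items.
print(cronbach_alpha([[1, 2], [2, 1], [3, 3]]))
```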
 
Article
The Inventory of Personality Organization (IPO, Kernberg & Clarkin, 1995; Lenzenweger, Clarkin, Kernberg, & Foelsch, 2001) is a self-report instrument intended to measure a patient’s level of personality organization. This manuscript describes the development of a shortened version of the IPO (the IPO-R). Construct validity of the IPO-R is determined by investigating (a) its latent structure, (b) the equivalence of this latent structure in a normal and a clinical sample (structural validity), and (c) differences between mean scores of the IPO-R scales for a normal population, axis-I disordered and axis-II disordered patients (concurrent validity). The IPO-R showed adequate construct validity in a normal and a clinical sample. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
In a study of 69 Ss (mean age 20.56 yrs), the authors examined the absolute and relative accuracy of retrospective estimates of positive and negative mood as well as of specific factors within positive and negative mood. Absolute accuracy was defined as the difference between average daily estimates within a period of 35 to 42 days and retrospective mood estimates for the same period, which were obtained one week after the end of the day-to-day estimates. The results show statistically significant differences between average daily and retrospective mood estimates, both for positive and negative mood, for all specific factors of positive mood, and for sadness as a specific factor of negative mood. In all cases retrospective estimates are significantly higher than the average day-to-day estimates. The correlation coefficients, which reflect the relative accuracy, are statistically significant and high for all mood factors. The results are discussed in the context of the cognitive and motivational processes that can be operative in retrospective mood estimates, and the main measurement implications of the results are indicated. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
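The two accuracy notions can be sketched as follows. This is an illustration under assumptions: absolute accuracy is taken as the mean of (retrospective estimate minus mean daily estimate) across participants, so positive values indicate retrospective overestimation, and relative accuracy is taken as the Pearson correlation between the two sets of scores. The function names and the toy numbers are hypothetical.

```python
import math
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mood_accuracy(daily, retrospective):
    """Return (absolute, relative) accuracy.

    daily: one list of day-to-day ratings per participant.
    retrospective: one retrospective rating per participant.
    """
    daily_means = [mean(d) for d in daily]
    absolute = mean(r - m for r, m in zip(retrospective, daily_means))
    relative = pearson(daily_means, retrospective)
    return absolute, relative

# Toy data (hypothetical): 3 participants, 3 daily ratings each.
print(mood_accuracy([[2, 3, 4], [1, 1, 1], [4, 5, 3]], [4, 2, 5]))
```

A pattern like the one the abstract reports would show up as a positive absolute value together with a high relative value: retrospective ratings are inflated, yet they preserve the rank order of participants.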
 
Comparative data among three different samples.
Article
The conditions are investigated in which Spanish university teachers carry out their teaching and research functions. 655 teachers from the University of Oviedo took part in this study by completing the Academic Setting Evaluation Questionnaire (ASEQ). Of the three dimensions assessed in the ASEQ, Satisfaction received the lowest ratings, Social Climate was rated higher, and Relations with students was rated the highest. These results are similar to those found in two studies carried out in the academic years 1986/87 and 1989/90. Their relevance for higher education is twofold because these data can be used as a complement of those obtained by means of students' opinions, and the crossing of both types of data can facilitate decision making in order to improve the quality of the work (teaching and research) of the university institutions. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
The German Pupils Academy (Deutsche Schüler-Akademie) is a summer-school program for highly gifted secondary-school students. Three types of program evaluation were conducted. Input evaluation confirmed the participants as intellectually highly gifted students who are intrinsically motivated and interested to attend the courses offered at the summer school. Process evaluation focused on the courses attended by the participants as the most important component of the program. Accordingly, the instructional approaches meet the needs of highly gifted students for self-regulated and discovery oriented learning. The product or impact evaluation was based on a multivariate social-cognitive framework. The findings indicate that the program contributes to promoting motivational and cognitive prerequisites for transforming giftedness into excellent performances. To some extent, the positive effects on students' self-efficacy and self-regulatory strategies are due to qualities of the learning environments established by the courses. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
In the 1st study, 46 Ss (aged 17–57 yrs) filled in the Big Five under the spontaneous and accountability conditions. Contrary to expectations, there was a small but significant effect. When Ss were asked to give answers they would have to account for, they scored higher on conscientiousness and emotional stability. In the 2nd study, Ss filled in the Big Five for 2 jobs differing in the extent to which the applicant has to manage people or systems. In line with expectations, there was an effect of autonomy but contrary to expectations not of conscientiousness and extraversion. The practical consequences of the accountability instruction for the validity of personality questionnaires and of job types for norms are discussed. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
Addresses 4 problems that arise when translating achievement tests for use in cross-national studies: (1) selecting translators, (2) identifying the appropriate language for the target version of the test, (3) identifying and minimizing cultural differences, and (4) finding equivalent words or phrases. Ways in which these problems might be resolved are identified, and judgmental and statistical methods for establishing the equivalence of scores from the test presented in different languages are reviewed. Two basic judgmental methods are identified in the educational and psychological literature, and 3 data collection designs are used in establishing test score equivalence of the source and target language versions of a test. The author also provides 14 preliminary guidelines for persons doing test translations and equivalence studies. (French abstract) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
The Achievement Motives Scale (AMS) is a well-established and frequently used scale to assess hope of success and fear of failure. In three studies with German-speaking samples (N = 3523, N = 132, N = 126), the authors developed a revised form of the AMS using confirmatory factor analysis. As found in previous research, the original 30-item set of the AMS did not provide an acceptable fit to a two-factor model. In contrast, a revised 10-item version (AMS-R) provided an adequate fit to the theoretically intended two-factor model. The adequate fit could be validated in cross-validation procedures. Furthermore, the revised scales provided adequate reliability, lower interscale correlations, and criterion-related validity with respect to typical criteria of achievement-related behavior. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
The objective of the present study was to compare alternative factorial structures of the French-Canadian version of the Positive and Negative Affect Schedule (PANAS; Watson, Clark, & Tellegen, 1988) across samples of athletes at different stages of a sport competition. The first sample (N=305) was used to assess, compare, and improve the measurement model of the PANAS. The second sample (N=217) was used to cross-validate the model that provided the best fit with the calibration sample. Results of confirmatory factor analyses suggested that a modified three-factor model with cross-loadings provided a better fit to the data than either the hypothesized or the modified two-factor models. This model was partially replicated on the second sample. Results of a multiple-group confirmatory factor analysis have shown that the model was partially invariant across the two samples. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
The construct of behavioral undercontrol is often assessed as a potential risk factor in studies of health-risk behaviors, but few studies have examined psychometric properties of measures of behavioral undercontrol. The present study tested the factor structure of the Behavioral Undercontrol Questionnaire (BUQ), a 20-item self-report measure, across gender and racial/ethnic groups, using a college sample (N = 648). We hypothesized that the factor structure would vary by both gender and race/ethnicity. A single-factor solution was identified and confirmed within each group. However, analyses yielded differences across gender and racial/ethnic groups. Findings support the overall validity of the BUQ, but also suggest that caution should be exercised in making comparisons across gender and racial/ethnic groups. These data also highlight the importance of assessing the psychometric properties of measures of behavioral undercontrol and other externalizing constructs. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Unstandardized values of loadings and intercepts from multisample CFA of the PESE and the PSSE
Results of the construct validity for the PESE and PSSE scales
Article
The Perceived Empathic Self-Efficacy Scale (PESE) and the Perceived Social Self-Efficacy Scale (PSSE) were developed to assess, respectively, individuals’ self-efficacy beliefs regarding both empathic responding to others’ needs or feelings and managing interpersonal relationships. In this study of young adults, a unidimensional factorial structure of both scales was found in Italy, the United States, and Bolivia. Complete invariance at the metric level and partial invariance at the scalar level were found across gender and countries for both scales. The construct and incremental validity of both PESE and PSSE were further examined in a different sample of Italian young adults. Patterns of association of the PESE or PSSE with self-esteem, psychological well-being, and the use of adaptive and maladaptive coping strategies were found, often over and beyond their associations with empathy or extraversion, respectively. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Differences in perception of specific environmental concern according to place of residence.
Article
Pro-environmental orientation constitutes one of the basic referents of modern culture. However, this pro-environmental orientation of a general nature does not permit us to predict pro-environmental behaviors. In order to explain this incongruence, it is necessary to take into account the sociostructural factors and socialization experiences through which people form their environmental values, attitudes, and behaviors. In this study we compare the values, attitudes, and behaviors of a rural sample and an urban sample, measured by means of three scales: the New Ecological Paradigm Scale, a moral obligation scale specifically designed for this study, and a scale of pro-environmental behavioral intentions. The results indicate high levels of environmental concern and low levels of pro-environmental behavior in both samples. On comparing the two samples it was found that those living in cities assume a larger number of environmental responsibility values but show less pro-environmental orientation when the attitude and behavioral intention scales are used. People living in the rural context present more attitudes of environmental responsibility and greater consistency on expressing behavioral intentions compatible with the protection of the environment. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Synthesis of prototypical items in the principal five-factor solution on the LERQ. 
Results of discriminant analysis for the LERQ executive study regulation scales (N = 336 psychology freshmen). 
Results of discriminant analysis for the ILS executive study regulation scales (N = 336 psychology freshmen). 
Mean examination scores (standard deviations in parentheses) by level of executive regulation and aptitude group for the ILS and the LERQ questionnaires (N = 517). 
Article
Examined cultural bias or situatedness in the assessment of executive regulation activities (flexible decision making components about one's own cognitive ability) in higher education by comparing the results of a local executive regulation questionnaire with a foreign questionnaire; the latter was constructed within a different educational context. 592 freshmen completed the Leuven Executive Regulation Questionnaire (local [LERQ]) and the Inventory of Learning Styles (foreign [ILS]). The ILS was developed at the Open University (distance education). The ILS appears to be less discriminative and predictive than the LERQ. Due to cultural bias, the external regulation activities cause the main difference. Specificity of measuring metacognitive skills about study processes is biased by the university setting. The factorial structure of the regulation activity scales seems to be invariant over different domains of study within the same educational setting. The bias in the assessment of regulation activities is significantly stronger in the group of Ss with the lower and medium level of general thinking skills. A higher level of regulation skills can compensate for a lower level of general thinking skills. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
Addresses 3 goals: First, to present comments on the 22 guidelines that have resulted from the numerous field-tests (including achievement tests, cognitive tests, and personality tests) and reviews of the International Test Commission (ITC) Test Translation and Adaptation Guidelines; second, where possible, to describe specific suggestions for revising the ITC Guidelines; third, to present 3 suggestions for essential research to improve the methodology associated with translating and adapting tests. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
In this study, the Eating Disorder Inventory-3 (EDI-3) was adapted to Spanish, and the internal psychometric properties of the test were analyzed in a clinical sample of females with eating disorders. The results showed a high internal consistency of the scores as well as high temporal stability. The factor structure of the scale composites was analyzed using confirmatory factor analysis. The results supported the existence of a second-order structure beyond the psychological composites. The second-order factor showed high correlation with the factor related to eating disorders. Overall, the Spanish version of the EDI-3 showed good psychometric qualities in terms of internal consistency, temporal stability and internal structure. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
This study constitutes the first stage of the Spanish adaptation of the Matching Familiar Figures Test 20 (MFFT-20; Cairns & Cammock, 1978). The normative data for reflexivity-impulsivity and efficiency-inefficiency were obtained by means of the Salkind and Wright formula (1977). The study sample was composed of 700 Spanish children, aged 6 to 12. This paper presents the results for reliability and latency-error correlation. The development of errors and latencies at older ages and gender differences were also analyzed. No significant differences between boys and girls were observed with respect to any of the variables. A high reliability was established for errors as well as for latencies. Errors decreased and latency increased with age. It is important to note that the data for both sexes at 8, 9, and 10 years of age reveal a stabilization in the latencies even though the errors continue to decrease. The psychometric properties of the Spanish adaptation of the MFFT-20 are discussed. Results are analyzed by examining the instrument's validation using the normative data presented here. The study concludes that the MFFT-20 is a reliable and valid measure of reflexivity-impulsivity. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Intercorrelations among latent variables 
Article
This study developed and validated a French version of the Adult Temperament Questionnaire short form (ATQ; Evans & Rothbart, 2007). The ATQ is a self-report instrument that evaluates four temperamental dimensions: negative affect, effortful control, surgency/extraversion, and orienting sensitivity. The French version was elaborated following adaptation and translation procedures that are precisely described. A first sample of 141 young adults completed the ATQ. Internal consistency and test-retest correlations over a 4-week period suggest an adequate reliability, and a confirmatory factor analysis revealed a 4-factor solution consistent with the original instrument. Internal consistency and factorial structure were reexamined with a second sample (N = 385). Criterion-related validity was explored in relation to Big Five model dimensions and yielded results comparable to those of the original instrument. Overall, results indicate a good equivalence between the original and the adapted instrument. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
The peoples of Europe use many languages for communication. This variety of languages is, on the one hand, advantageous for the expression of local specialties and peculiarities; but there are also disadvantages. One of them is the restriction on the applicability of psychological measures since psychological assessment by means of questionnaires, tests, and other assessment instruments can only be accomplished by taking the clients’ linguistic capabilities into account. As a consequence, measures have to be developed and validated separately for each and every European language. The development and validation of measures can be achieved in two ways: They can be developed according to one master plan or by following quite different routes. Fortunately, there is presently the tendency to accept major theoretical developments and related measures as master plans and to transfer such measures from the original language into other languages. This makes it possible that many scientists concentrate their research efforts on key concepts and theories. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
Reviews some assumptions developed about the notion of "illness adaptation" and proposes an index to assess this concept in cancer patients based on a questionnaire developed by A. Font (1988). A pilot study was conducted with 40 cancer patients (aged 28–79 yrs) who received chemotherapy for the 1st time. There were no significant differences between adjuvant (ADJ), palliative (PAL), and curative chemotherapy Ss; however, data agree with the assumptions that can be inferred from the illness stage. With chemotherapy, ADJ Ss decreased their quality of life, while PAL Ss increased it. The adaptation index distinguished between the 3 groups of patients, identifying a higher psychological disturbance rate and a poor adaptation in ADJ Ss. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
The Five Factor Personality Inventory (FFPI) was translated and adapted to a Spanish population of 567 Ss (mean age 19.3 yrs). A principal component analysis using orthogonal Procrustes rotation replicated the 5-component structure of the original FFPI questionnaire. The coefficients of congruence between the loading matrices obtained in the Dutch sample and the Spanish sample were also computed showing high factorial convergence. The Spanish version of the FFPI showed adequate reliability. Further, convergent and discriminant validity were studied using other well-known Big Five and PEN questionnaires. The results fully supported the psychometric properties of the FFPI questionnaire in the present population. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
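The factorial convergence described above is usually quantified column by column between two loading matrices. The following is a minimal sketch, assuming the coefficient in question is Tucker's congruence coefficient; the loadings shown are hypothetical illustration values, not data from the Dutch or Spanish samples.

```python
import math

def congruence(x, y):
    """Tucker's congruence coefficient between two loading vectors.

    x and y hold the loadings of the same items on one factor
    in two different samples (same item order in both).
    """
    if len(x) != len(y):
        raise ValueError("loading vectors must have the same length")
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if denominator == 0.0:
        raise ValueError("a loading vector is all zeros")
    return numerator / denominator

# Hypothetical loadings of four items on one factor in two samples.
sample_a = [0.72, 0.65, 0.10, -0.05]
sample_b = [0.68, 0.70, 0.15, 0.02]
print(round(congruence(sample_a, sample_b), 3))
```

The coefficient is insensitive to a uniform rescaling of either vector, which is why it is typically computed after rotating one solution toward the other (here, the Procrustes rotation mentioned in the abstract).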
 
Article
To facilitate the development of valid multicultural/multilingual tests, the International Test Commission (ITC) prepared the ITC Guidelines on Test Adaptations. This paper reviews the current version (cf. Van de Vijver & Hambleton, 1996), which consists of 22 guidelines on recommended practices pertaining to context, development, administration, documentation, and test-score interpretation, by identifying key principles in test adaptations and comparing them to a content analysis of the ITC Guidelines. The content analysis revealed a number of inconsistencies and ambiguities in a few guidelines, and proposals for reformulating them are given. A checklist to supplement the more narrative guidelines would also be helpful. Nevertheless, the review clearly demonstrates that the ITC Guidelines on Test Adaptations address key principles in test adaptations and constitute a significant standard or "code of conduct" in this field. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
Describes judgmental and statistical procedures for adapting tests for cross-cultural assessments to establish test item equivalence. Judgmental methods include the use of forward and backward adaptation designs, both of which provide information about equivalence of source and target language tests. Statistical methods used to identify differential item functioning between 2 or more tests in different languages are characterized by the statistical design and procedures used. The statistical design depends on the characteristics of the Ss and on the version of the adapted test. Statistical procedures depend on whether a common scale is used and whether conditional or unconditional procedures are applied. These factors determine the specific analytic procedures that are best suited to identify differential item functioning, such as factor analysis, item response theory, and logistic regression. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Mean scores on nine semantic differential items, by test type.
Descriptive and inferential results for each dependent variable, by type of test.
Article
A new type of self-adapted test (S-AT), called Assisted Self-Adapted Test (AS-AT), is presented. It differs from an ordinary S-AT in that prior to selecting the difficulty category, the computer advises examinees on their best difficulty category choice, based on their previous performance. Three tests (computerized adaptive test, AS-AT, and S-AT) were compared regarding both their psychometric (precision and efficiency) and psychological (anxiety) characteristics. Tests were applied in an actual assessment situation, in which test scores determined 20% of term grades. A sample of 173 high school students participated. No differences in posttest anxiety or ability were obtained. Concerning precision, AS-AT was as precise as CAT, and both revealed more precision than S-AT. It was concluded that AS-AT acted as a CAT concerning precision. Some hints of, but no conclusive support for, the psychological similarity between AS-AT and S-AT were also found. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
Provides a progress report on the work of the International Test Commission, a committee of psychologists who are preparing a set of guidelines for adapting educational and psychological tests across cultures and languages. The committee has worked for 2 yrs to produce drafts of 22 guidelines organized into 4 categories: context, instrument development and adaptation, administration, and documentation/score interpretations. The final version of the test adaptation guidelines will be available in 1995. (French abstract) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
In 1994, the International Test Commission (ITC) and seven other international organizations published a draft set of guidelines for adapting educational and psychological tests from one language and culture to other languages and cultures. The purposes of the research described in this paper were to (1) fieldtest the ITC Guidelines in an actual test adaptation project and (2) suggest any necessary revisions to the Guidelines. The fieldtest involved the adaptation of a 69-item grade-8 mathematics test from English to Chinese. The results were informative because they highlighted the sorts of problems that arise in test adaptation projects. Also, as the first formal evaluation of the ITC Test Adaptation Guidelines, this work was useful to the ITC in suggesting revisions and clarifications. The findings should also be interesting to psychologists interested in cross-cultural research because the Guidelines are being widely adopted for use around the world and evidence of their validity is important. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Summary of hierarchical moderated regression analysis for variables predicting AMT score
Article
We investigated the effects of test anxiety on test performance using computerized adaptive testing (CAT) versus conventional fixed item testing (FIT). We hypothesized that tests containing mainly items with medium probabilities of being solved would have negative effects on test performance for test takers high in test anxiety. A total of 110 students (aged 16 to 20) from a German secondary modern school filled out a short form of the Test Anxiety Inventory (TAI-G; Wacker, Jaunzeme, & Jaksztat, 2008) and then were presented with items from the Adaptive Matrices Test (AMT; Hornke, Etzel, & Rettig, 1999) on the computer, either in CAT form or in a fixed item test form with a selection of items arranged in order of increasing item difficulty. Additionally, half of the students were given a short summary of information about the mode of item selection in adaptive testing before working on the CAT. In a moderated regression approach, a significant interaction of test anxiety and test mode was revealed. The effect of test mode on the AMT score was stronger for students with higher scores on test anxiety than for students with lower test anxiety. Furthermore, getting information about CAT led to significantly better results than receiving standard test instructions. Results are discussed with reference to test fairness. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
The Frankfurt Adaptive Concentration Test (FACT-2) requires discrimination between geometric target and nontarget items as quickly and accurately as possible. Three forms of the FACT-2 were constructed, namely FACT-I, FACT-S, and FACT-SR. The aim of the present study was to investigate the convergent validity of the FACT-SR with self-reported cognitive failures. The FACT-SR and the Cognitive Failures Questionnaire (CFQ) were completed by 191 participants. The measurement models confirmed the concentration performance, concentration accuracy, and concentration homogeneity dimensions of FACT-SR. The four dimensions of the CFQ (i.e., memory, distractibility, blunders, and names) were not confirmed. The results showed moderate convergent validity of concentration performance, concentration accuracy, and concentration homogeneity with two CFQ dimensions, namely memory and distractibility/blunders. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
Item parameters for hundreds of items (including analogies and number problems) were estimated based on empirical data from thousands of Ss. The logistic one-parameter (1PL) and two-parameter (2PL) model estimates were evaluated. However, model fit showed that only a subset of items complied sufficiently, so the remaining ones were assembled in well-fitting item banks. In simulation studies, 5,000 simulated responses were generated in accordance with a computerized adaptive test procedure along with person parameters. A general reliability of .80 or a standard error of measurement of .44 was used as a stopping rule to end computerized adaptive testing (CAT). The authors also recorded how often each item was used by all simulees. Person-parameter estimates based on CAT correlated higher than .90 with true values simulated. For all 1PL-fitting item banks, most simulees used more than 20 items but fewer than 30 items to reach the pre-set level of measurement error. However, testing based on item banks that complied with the 2PL revealed that, on average, 10 items were sufficient to end testing at the same measurement error level. Both clearly demonstrate the precision and economy of CAT. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
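The stopping rule above pairs a reliability of .80 with a standard error of .44, and the reported test lengths follow from how much information a single item can contribute. The sketch below works through that arithmetic under stated assumptions: trait variance is fixed at 1, the standard error is taken as the inverse square root of test information, and every item is administered at its point of maximum information. The discrimination value of 1.5 is a hypothetical illustration, not a parameter reported in the study.

```python
import math

def sem_from_reliability(reliability, trait_sd=1.0):
    """Standard error of measurement implied by a reliability coefficient."""
    return trait_sd * math.sqrt(1.0 - reliability)

def min_items(target_se, discrimination=1.0):
    """Lower bound on test length needed to reach target_se.

    A logistic item contributes at most discrimination**2 / 4 information
    (when the probability of a correct response is .5), and
    SE = 1 / sqrt(total information).
    """
    max_info_per_item = discrimination ** 2 / 4.0
    required_info = 1.0 / target_se ** 2
    return math.ceil(required_info / max_info_per_item)

print(round(sem_from_reliability(0.80), 3))   # 0.447
print(min_items(0.44))                        # 21 (1PL, discrimination fixed at 1)
print(min_items(0.44, discrimination=1.5))    # 10 (hypothetical 2PL discrimination)
```

Under these assumptions a 1PL bank cannot reach the criterion in fewer than 21 items, which is in line with the "more than 20 items" reported, while more discriminating 2PL items can reach it much sooner.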
 
Article
Tested the effects of response mode (choice vs judgment) on decision-making strategies when Ss were faced with the task of deciding the adequacy of a set of tests for a specific assessment situation in 3 experiments with a total of 300 undergraduates. Compared with choice, judgment was predicted to lead to more information (IN) sought, more time spent on the task, a less variable pattern of search, and a greater amount of interdimensional search. Time pressure (TP), IN load, and decision importance were variables hypothesized as potential moderators of the response mode effects. Using an IN board, Ss made choice and judgment decisions on tests for a concrete assessment situation, under high or low TP, high or low IN load, and high or low decision importance. Response mode produced strong effects on all measures of decision behavior except for pattern of search. Moderator effects occurred for TP and IN load. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
How people perceive situations and their own reactions to these situations relates to how well they tolerate and control stress. In this paper, the hypothesis was tested that D and AdjD, Rorschach measures of tolerance and stress control, can be shown to relate to individuals' beliefs, both about whether it is possible to control internal states and their perceptions of how well they themselves do so. The study was carried out in a nonclinical population. The findings were, first, that variations in levels of D, but not AdjD and its component variables, were associated with lower or higher scores on the Perceived Control of Internal States Inventory (1998), a multidimensional questionnaire on individuals' beliefs about the possibility of controlling internal states. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Graphic representation of three variables in a bifactorial space
Factorial solution for the 50 selected adjectives.
Correlations among the factors of the three instruments and the benchmark factors.
Correlations among the poles of the three instruments and the benchmark factors.
Article
The Big Five factors structure is currently the benchmark for personality dimensions. In the domain of adjectives, various instruments have been developed to measure the Big Five. In this contribution the authors propose a methodology to find a simple factorial structure and apply this methodology to the domain of Big Five as measured by adjectives. Using data collected on a sample of 337 Ss (mean age 21.69 yrs), a five-factor benchmark structure is proposed derived from the 50 best marker adjectives selected among the adjectives contained in three instruments specifically developed to measure the Big Five (i.e., L. R. Goldberg's 100 adjectives list, IASR-B5, and SACBIF (1992)). They use this common factor structure (or benchmark structure) to investigate the differences and the similarities between the three operationalizations of the Big Five, and to investigate the placements of the full set of adjectives contained in the three instruments. The main features of the proposed methodology and the generalizability of the obtained results are discussed. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Simplicity index and factor projections for the 50 SACBIF's adjectives (N = 377).
Parameter Estimates for the regression model. a) Factor Loadings
The Big Five in the SACBIF space.
The EPQ, MPQ, PANAS and RTTI in the SACBIF space.
Article
Presents the construction and validation of a short adjective-based measure of the Five Factor Model (FFM) of personality, the Short Adjectives Checklist of Big Five (SACBIF). The total sample comprised 961 Ss (mean age 26.7 yrs). Fifty adjectives were selected with a selection procedure, the "Lining Up Technique," specifically used to identify the best factorial markers of the FFM. The factorial structure and the psychometric properties of the SACBIF were studied. The SACBIF factorial structure was correlated with some main measures of the FFM to establish its construct validity and with some other personality dimensions to determine how well these dimensions could be represented in the SACBIF factorial space. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
Investigated the reliability and validity of the Dyadic Adjustment Scale (DAS) in a Turkish sample of 264 married individuals (mean age 36.8 yrs). The α coefficient for the DAS was .92 and the computed split-half reliability coefficient was .86. The DAS correlated .82 with the Locke-Wallace Marital Adjustment Test. The original factor structure of the DAS was replicated in the present study. In general, the data indicated that the DAS provides a reliable and valid measure of marital adjustment for a Turkish sample. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Example of boundary characteristic curves and response characteristic curves of the graded response model (a = 1.25, b1 = -2, b2 = -1, b3 = 1, b4 = 2).
Standard error (classical test theory and item response theory). 
Article
Item response theory (IRT) provides valuable methods for the analysis of the psychometric properties of a psychological measure. However, IRT has been mainly used for assessing achievements and ability rather than personality factors. This paper presents an application of IRT to a personality measure. Thus, the psychometric properties of a new emotional adjustment measure that consists of 28 six-point graded-response items are shown. Classical test theory (CTT) analyses as well as IRT analyses are carried out. Samejima's (1969) graded-response model has been used for estimating item parameters. Results show that the bank of items fulfills model assumptions and fits the data reasonably well, demonstrating the suitability of the IRT models for the description and use of data originating from personality measures. In this sense, the model fulfills the expectations that IRT has undoubted advantages: (1) the invariance of the estimated parameters, (2) the treatment given to the standard error of measurement, and (3) the possibilities offered for the construction of computerized adaptive tests (CAT). The bank of items shows good reliability. It also shows convergent validity compared to the Eysenck Personality Questionnaire (EPQ-A; Eysenck & Eysenck, 1975) and the Big Five Questionnaire (BFQ; Caprara, Barbaranelli, & Borgogni, 1993). (PsycINFO Database Record (c) 2012 APA, all rights reserved)
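The boundary and response characteristic curves named in the figure caption can be computed directly from the item parameters. The following is a minimal sketch of the graded response model using the caption's example values (a = 1.25; thresholds -2, -1, 1, 2). It assumes the plain logistic form without the 1.7 scaling constant, which the abstract does not specify, and the function names are illustrative only.

```python
import math

def boundary_curve(theta, a, b):
    """Probability of responding above threshold b (boundary characteristic curve)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def category_probabilities(theta, a, thresholds):
    """Response characteristic curves evaluated at theta.

    thresholds must be in ascending order; k thresholds define k + 1
    ordered response categories.
    """
    # Cumulative curves: 1 for the lowest category, 0 beyond the highest.
    cumulative = [1.0] + [boundary_curve(theta, a, b) for b in thresholds] + [0.0]
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[i] - cumulative[i + 1] for i in range(len(thresholds) + 1)]

a = 1.25
thresholds = [-2.0, -1.0, 1.0, 2.0]
for theta in (-3.0, 0.0, 3.0):
    probs = category_probabilities(theta, a, thresholds)
    print(theta, [round(p, 3) for p in probs])
```

Because the example thresholds are symmetric around zero, the category probabilities at theta = 0 are symmetric as well, with the middle category the most likely response.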
 
Demographic data of the study population
PAS-Q sections translated into ICD-10 PD and DSM-IV-TR PD classification
Information about the informants and correlation with the SCID-II and the SAP for any PD
Correlation between the short self report screen and the full SCID-II interview (columns: Correlation, P, Cluster)
Internal consistency, test-retest, Phi, and corrected total correlation coefficients for items of the short self report screen
Article
The internal consistency, test-retest reliability, and predictive validity of the Iowa Personality Disorder Screen (IPDS) as a screening instrument for personality disorders (PDs) were studied in 195 Dutch psychiatric outpatients, using the SCID-II as the gold standard. All patients completed a self-administered version of the IPDS. Internal consistency was moderate (0.64), and the test-retest reliability after a 2-week interval was good (0.87). According to the SCID-II, 97 patients (50%) had at least one PD. The IPDS correctly classified 81.0% of all participants in the category PD present/absent. The sensitivity and specificity were 77% and 88%, respectively. Positive and negative predictive values were 83% and 79%, respectively. These results are comparable with those reported in earlier studies with respect to the interview version of the IPDS and more promising than previously reported results obtained with a self-report version of the IPDS. Therefore, it is concluded that a self-report version of the IPDS may be useful as a screening measure for determining the presence/absence of PD in a population of psychiatric outpatients. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
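The screening indices reported above all derive from a 2 x 2 cross-classification of screen result against the gold-standard diagnosis. The sketch below shows how each index is defined; the cell counts are invented for illustration and are not the IPDS study data, which the abstract does not report at cell level.

```python
def screening_metrics(tp, fp, fn, tn):
    """Diagnostic accuracy indices from a 2 x 2 table.

    tp: screen positive, disorder present   fp: screen positive, disorder absent
    fn: screen negative, disorder present   tn: screen negative, disorder absent
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # share of true cases detected
        "specificity": tn / (tn + fp),   # share of non-cases correctly cleared
        "ppv": tp / (tp + fp),           # share of positive screens that are cases
        "npv": tn / (tn + fn),           # share of negative screens that are non-cases
        "accuracy": (tp + tn) / total,   # overall correct classification rate
    }

# Hypothetical counts for 100 screened patients.
metrics = screening_metrics(tp=40, fp=5, fn=10, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are properties of the screen itself, whereas the predictive values also depend on how common the disorder is in the sample, so they would change in a population with a different base rate.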
 
Article
The aim of this study was to analyze the psychometric properties of the Questionnaire about Interpersonal Difficulties for Adolescents (QIDA; Inglés, Méndez, & Hidalgo, 2000). In Study 1, the questionnaire was administered to a sample of 4,240 high school pupils. Exploratory factor analysis identified five factors accounting for 42.86% of the variance: Assertiveness, Heterosexual Relationships, Public Speaking, Family Relationships, and Close Friendships. Internal consistency was high (.90). In Study 2, 538 high school pupils answered a set of social anxiety and personality self-report measures. Test-retest reliability, over a 2-week period, was adequate (.78). Correlations between the QIDA and the Personal Report of Confidence as Speaker (r = .43), the Social Phobia and Anxiety Inventory (r = .61), and the Eysenck Personality Questionnaire (r = -.38, Extraversion; r = .34, Neuroticism) were statistically significant. A significant difference was found between the total QIDA score for adolescents with and without social phobia (d = 1.53), supporting the construct validity of the questionnaire. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Article
The aim of this research was to design a program of psychological intervention for adolescents and to assess its effects on factors of emotional development. The study used a pretest-intervention-posttest design with control groups and a sample of 174 subjects aged 12 to 14. There were 125 experimental subjects and 49 control subjects. Before and after the program, six instruments were administered to measure: empathy, anxiety, self-concept, image of classmates, and ability to analyze feelings. The intervention program applied to the experimental subjects consisted of one 2-h intervention session per week throughout an academic year. Two activities and their corresponding debates generally took place in each session. Each one of the 60 activities stimulates communication and friendly and cooperative interactions, and their objectives are to improve self-concept, promote identification, understanding and expression of emotions, and develop empathic feelings. Results suggest that the program had a highly positive effect on factors of emotional development. A decrease in state-trait anxiety was observed and there was improvement in the ability for empathy, in self-concept, in image of others, and in the ability to analyze feelings. The results are discussed in terms of the... (PsycINFO Database Record (c) 2012 APA, all rights reserved)
 
Top-cited authors
Ronald K Hambleton
  • University of Massachusetts Amherst
Evangelia Demerouti
  • Eindhoven University of Technology
Ralf Schwarzer
  • Freie Universität Berlin
Urte Scholz
  • University of Zurich
Benicio Gutiérrez-Doña
  • Universidad Estatal a Distancia