Assessing professional competence: From methods to programs

Department of Educational Development and Research, University of Maastricht, Maastricht, The Netherlands.
Medical Education (Impact Factor: 3.62). 04/2005; 39(3):309-17. DOI: 10.1111/j.1365-2929.2005.02094.x
Source: PubMed

ABSTRACT
INTRODUCTION: We use a utility model to illustrate that, firstly, selecting an assessment method involves context-dependent compromises, and secondly, that assessment is not a measurement problem but an instructional design problem, comprising educational, implementation and resource aspects. In the model, assessment characteristics are differently weighted depending on the purpose and context of the assessment.
EMPIRICAL AND THEORETICAL DEVELOPMENTS: Of the characteristics in the model, we focus on reliability, validity and educational impact and argue that they are not inherent qualities of any instrument. Reliability depends not on structuring or standardisation but on sampling. Key issues concerning validity are authenticity and integration of competencies. Assessment in medical education addresses complex competencies and thus requires quantitative and qualitative information from different sources as well as professional judgement. Adequate sampling across judges, instruments and contexts can ensure both validity and reliability. Despite recognition that assessment drives learning, this relationship has been little researched, possibly because of its strong context dependence.
ASSESSMENT AS INSTRUCTIONAL DESIGN: When assessment should stimulate learning and requires adequate sampling, in authentic contexts, of the performance of complex competencies that cannot be broken down into simple parts, we need to make a shift from individual methods to an integral programme, intertwined with the education programme. Therefore, we need an instructional design perspective.
IMPLICATIONS FOR DEVELOPMENT AND RESEARCH: Programmatic instructional design hinges on a careful description and motivation of choices, whose effectiveness should be measured against the intended outcomes. We should not evaluate individual methods, but provide evidence of the utility of the assessment programme as a whole.

    • "The means, standard deviations and reliability estimates are similar within each administration. The reliability estimates under all models are moderately high, ranging from 0.74 to 0.78, consistent with reliability for OSCE examinations such as the MCCQE Part II of two to four hours in length (Van der Vleuten & Schuwirth 2005). More importantly, the three simpler scoring models yielded scores that are as reliable "
    ABSTRACT: Background: Past research suggests that the use of externally applied scoring weights may not appreciably impact measurement qualities such as reliability or validity. Nonetheless, some credentialing boards and academic institutions apply differential scoring weights based on expert opinion about the relative importance of individual items or test components of Objective Structured Clinical Examinations (OSCEs). Aims: To investigate the impact of simplified scoring models that make little to no use of differential weighting on the reliability of scores and decisions on a high-stakes OSCE required for medical licensure in Canada. Method: We applied four weighting models of varying complexity to data from three administrations of the OSCE. Across the models and administrations, we compared score reliability, pass/fail rates, correlations between the scores, and classification decision accuracy and consistency. Results: Less complex weighting models yielded reliability and pass rates similar to those of the more complex weighting model. Minimal changes in candidates' pass/fail status were observed, and there were strong and statistically significant correlations between the scores for all scoring models and administrations. Classification decision accuracy and consistency were very high and similar across the four scoring models. Conclusions: Adopting a simplified weighting scheme for this OSCE did not diminish its measurement qualities. Instead of developing complex weighting schemes, experts' time and effort could be better spent on other critical test development and assembly tasks with little to no compromise in the quality of scores and decisions on this high-stakes OSCE.
    Medical Teacher 05/2014; 36(7). DOI:10.3109/0142159X.2014.899687 · 2.05 Impact Factor
    • "More recently, a utility model has been used to illustrate that, firstly, selecting an assessment method involves context-dependent compromises, and secondly, that assessment is not a measurement problem but an instructional design problem, comprising educational, implementation and resource aspects. In the model, assessment characteristics are differently weighted depending on the purpose and context of the assessment (Van der Vleuten and Schuwirth, 2005). After this brief overview of some relevant references regarding competence assessment methods and procedures (based on their practical observation and analysis), a proposed methodology will be described together with its validation by a case study. "
    ABSTRACT: The proposed methodology compares employers’ professional competencies development with the competencies gained during the education process (developed by university curricula) that provide a specific qualification. The balance refers to the comparison of professional competencies, which are outputs for the education providers and inputs for the employer organization (competencies balance card and profile design). The research motivation lies in: (a) harmonizing and adapting university curricula to the specific needs of real organizations, and vice versa, and (b) satisfying the development needs of real organizations through advanced human resources competencies. A case study will demonstrate the effectiveness of the proposed methodology.
    Procedia - Social and Behavioral Sciences 01/2014; 109:193–197. DOI:10.1016/j.sbspro.2013.12.443
    • "Fitness for purpose: Alignment between curriculum goals and what and how is assessed. Criteria and standards should address all competences and the mix of methods should be fit to assess competence (Brown 2004; Miller and Linn 2000)
      Cognitive complexity: CAPs should enable the judgment of thinking process, besides assessing the product or outcome (Maclellan 2004)
      Self-assessment: CAPs should stimulate self-regulated learning, for example by using self-assessments, and letting students formulate their own learning goals (Tillema, Kessels, and Meijers 2000)
      Authenticity: The degree of resemblance of a CAP to the future workplace (Gulikers, Bastiaens, and Kirschner 2004)
      Transparency: CAP should be clear and understandable for all stakeholders (Frederiksen and Collins 1989; Linn, Baker and Dunbar 1991)
      Comparability: Assessment tasks, criteria, working conditions and procedures should be consistent with respect to key features of interest (Baartman, Bastiaens et al. 2007)
      Reproducibility of decisions: Decisions about students should be based on multiple assessors, multiple tasks and multiple situations (Moss 1994; van der Vleuten and Schuwirth 2005)
      Fairness: Students should get a fair chance to demonstrate their competences, for example by letting them express themselves in different ways and making sure the assessors do not show biases (Dierick and Dochy 2001; Hambleton 1996; Linn, Baker and Dunbar 1991)
      Acceptability: All stakeholders should approve of the assessment criteria and methods (Stokking et al. 2004)
      Meaningfulness: CAPs should be learning opportunities in themselves and generate useful feedback for all stakeholders (Linn, Baker and Dunbar 1991)
      Educational consequences: The degree to which the CAP yields positive effects on learning and teaching (Messick 1994; Schuwirth and van der Vleuten 2004)
      Costs and efficiency: The feasibility of carrying out the CAP for assessors and students (Hambleton 1996; Linn, Baker and Dunbar 1991)
      opportunity to provide input to the decision-making process (Wikeley, Stoll, and Lodge 2002). Other research (McNamara and O'Hara 2006) shows that the theoretical ideal of teachers collecting evidence and making quality judgments based on that evidence is far from reality at the moment. "
    ABSTRACT: The development of assessments that are fit to assess professional competence in higher vocational education requires a reconsideration of assessment methods, quality criteria and (self)evaluation. This article examines the self-evaluations of nine courses of a large higher vocational education institute. Per course, 4–11 teachers and 3–10 students participated. The purpose of this article is to critically examine the quality of assessment in higher vocational education, to identify critical factors influencing assessment quality and to study whether self-evaluation leads to concrete points for improvement. Results show that strong points are fitness for purpose, comparability and fairness. Weak points are reproducibility of decisions and development of self-regulated learning. Critical factors are the translation of competences into assessment criteria to be used in daily lessons and the involvement of the work field. The self-evaluations generated many points for improvement, but not all were translated into actions. Altogether, this article provides a rich picture of assessment quality in higher education and identifies quality aspects that need improvement, (partly) confirming other research on current assessment methods.
    Assessment & Evaluation in Higher Education 12/2013; 38(8):1-20. DOI:10.1080/02602938.2013.771133 · 0.84 Impact Factor