Evaluating the Adaptation of a Learning System before
the Prototype is Ready: A Paper-based Lab Study
Tobias Ley1,2, Barbara Kump3, Antonia Maas2, Neil Maiden4, Dietrich Albert2
1 Know-Center, Inffeldgasse 21a,
8010 Graz, Austria
2 Cognitive Science Section, University of Graz, Universitätsplatz 2,
8010 Graz, Austria
{tobias.ley, antonia.maas, dietrich.albert}
3 Knowledge Management Institute, Graz University of Technology, Inffeldgasse 21a,
8010 Graz, Austria
4 Centre for HCI Design, City University London, Northampton Square, College Building,
London, EC1V 0HB, United Kingdom
Abstract. We report on results of a paper-based lab study that used information
on task performance, self appraisal and personal learning need assessment to
validate the adaptation mechanisms for a work-integrated learning system. We
discuss the results in the wider context of the evaluation of adaptive systems
where the validation methods we used can be transferred to a work-based
setting to iteratively refine adaptation mechanisms and improve model validity.
Keywords: Adaptive Learning Systems, Evaluation, Task-based Competency
Assessment, Learning Need Analysis, Knowledge Space Theory
1 Evaluating Adaptive Systems in Due Time
Learning systems that adapt to the characteristics of their users have had a long
history. Due to the complexity of most adaptive systems, it has been acknowledged
that rigorous evaluation is indispensable in order to deliver worthwhile adaptive
functionality and to justify the considerable effort of implementation. This is also
reflected in the substantial amount of evaluations that have been published so far. Van
Velsen et al. [1] present an overview and have noted several limitations in current
evaluation practices. A variety of evaluation frameworks have been presented [2], [3],
[4], all of which propose to break down the adaptive system into assessable, self-
contained functional units.
The core research question when evaluating an adaptive system concerns the
appropriateness of the adaptation. Typically, two aspects are distinguished: (a) the
inference mechanisms and (b) the adaptation decision. While endeavors related to (a)
seek to answer the question of whether user characteristics are successfully detected
by the adaptive system, evaluations of (b) ask whether the adaptation decisions are
valid and meaningful, given the selected assessment results.

Citation: Ley, T., Kump, B., Maas, A., Maiden, N., & Albert, D. (2009). Evaluating the
Adaptation of a Learning System before the Prototype Is Ready: A Paper-Based Lab
Study. In G. Goos, J. Hartmanis & J. van Leeuwen (Eds.), Lecture Notes in Computer
Science - User Modeling, Adaptation, and Personalization, vol. 5535. Springer.
It is recommended that these two research questions are investigated in an
experimental setting using a running system (or prototype) where the algorithms are
already implemented [1], [3]. The problem is that in many situations the development
cycle of the software product is short and the evaluation might become obsolete as
soon as a new version has been developed [4].
For this reason, we are pursuing a multifaceted evaluation approach for adaptive
systems. By gathering both field and experimental evidence, we are checking validity
of models and appropriateness of the adaptation mechanisms over the course of
design, implementation and use of the system in an iterative manner. With this article,
we describe an experimental evaluation that seeks to answer the above mentioned
research questions in a controlled lab situation but without a running prototype, that
is, in due time before the system is actually developed. After a brief presentation of
the results, we will discuss the wider implications of our approach for evaluation
research for adaptive systems.
2 Evaluation of an Adaptive Work-Integrated Learning System
Our paper-based evaluation has been conducted in the course of the APOSDLE1
project. APOSDLE is a system for supporting adaptive work-integrated learning
(WIL). With WIL, we refer to learning that happens directly in a user’s work context,
which is deemed beneficial for maximising learning transfer [5]. APOSDLE offers
learning content and recommends experts based on both the demands of the current
tasks, as well as the user’s state of knowledge with regard to this task. APOSDLE is
currently available for five different application domains. The experiment in this
article has been conducted for the requirements engineering domain.
2.1 Adaptation in APOSDLE
Corresponding to the basic ideas of competence-based knowledge space theory [6],
the users’ knowledge states in APOSDLE are modelled in terms of sets of
competencies (single elements of domain related cognitive skill or knowledge). In
order to make inferences on a user’s competencies, APOSDLE observes the tasks a
user has worked on in the past. Each of the tasks is linked to a set of competencies
(task demand). Taking into account the task demands of all previously performed
tasks, their frequency and success, APOSDLE builds the user’s instance of the user
model by making inferences on the likely state of knowledge. In the following, this
procedure shall be termed task-based competency assessment.
1 APOSDLE ( has been partially funded under grant 027023 in the IST work
programme of the European Community.
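The union-based inference described above can be sketched as follows. This is an illustrative sketch only: the task names, competency labels, and the `assess_knowledge_state` helper are invented for the example and are not the actual APOSDLE domain model or implementation, which also weighs frequency and success of task executions.

```python
# Illustrative sketch of task-based competency assessment in the spirit of
# competence-based knowledge space theory. All names are invented examples.

# Task demands: each task is mapped to the set of competencies it requires.
TASK_DEMANDS = {
    "write_use_case": {"use_case_structure", "normal_course_spec"},
    "stakeholder_analysis": {"stakeholder_types", "stakeholder_onion_model"},
}

def assess_knowledge_state(performed_tasks):
    """Infer a user's likely knowledge state as the union of the demands
    of all tasks the user has performed successfully."""
    state = set()
    for task, success in performed_tasks:
        if success:
            state |= TASK_DEMANDS[task]
    return state

# A user who succeeded at the use-case task but failed the stakeholder analysis:
history = [("write_use_case", True), ("stakeholder_analysis", False)]
print(sorted(assess_knowledge_state(history)))
# -> ['normal_course_spec', 'use_case_structure']
```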
Evaluating the Adaptation of a Learning System before the Prototype is Ready: A Paper-
based Lab Study 3
In order to adapt to the needs of a user in a given situation, APOSDLE performs a
learning need analysis (also termed competency gap analysis elsewhere): The task
demand of a task is compared to the set of competencies of the user. If there is a
discrepancy (learning need), APOSDLE suggests learning content which should help
the user acquire exactly these missing competencies in a pedagogically reasonable
sequence. In order to perform these adaptations, the domain model of APOSDLE
contains tasks and competencies as well as a mapping that assigns required
competencies to tasks. A prerequisite relation exists both for competencies and for tasks.

For the present study, the domain model was modelled in terms of the tasks in the
requirements engineering domain (e.g. Complete the normal course specification for
a use case, or Carry out a stakeholder analysis), as well as the competencies needed
to perform these tasks (e.g. Understanding of strategic dependency models, or
Knowledge of different types of system stakeholders). The model has been
constructed, initially validated and refined in a previous study [7].
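The learning need analysis described above amounts to a set difference between a task's demand and the user's inferred knowledge state. The following sketch uses invented competency names; the actual APOSDLE content selection and pedagogical sequencing are not reproduced here.

```python
# Illustrative learning need analysis (competency gap analysis): compare the
# demand of the current task with the user's inferred knowledge state.
# Competency names are invented for the example.

def learning_need(task_demand, knowledge_state):
    """Competencies required by the task that the user does not yet possess."""
    return task_demand - knowledge_state

demand = {"stakeholder_types", "stakeholder_onion_model"}
state = {"stakeholder_types"}
print(sorted(learning_need(demand, state)))  # -> ['stakeholder_onion_model']
```

If the resulting set is empty, no learning need exists for that task and no content needs to be recommended.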
2.2 Design, Procedure and Hypotheses of the Study
The aim of our study was to test different algorithms for task-based competency
assessment and learning need analysis. The participants were a sample of nineteen
requirements engineering (RE) students. We had selected eight tasks from two sub-
domains of the RESCUE process (Requirements Engineering with Scenarios in User-
Centred Environments, [8]). According to the domain model, 22 competencies were
required in total to perform well in these eight tasks.
Each student had to work on four exercises which had been constructed to directly
map to the tasks from the task model. For example, they were asked to write a use
case specification for an iPod, or to carry out a stakeholder analysis for a real-time
travel alert system for an underground railway. The exercises were constructed to be
ecologically valid, i.e. that they corresponded well to tasks that would have to be
conducted by requirements engineers in a work-based setting. The sequence of
exercises was randomized across participants.
Before conducting the exercises, students gave both competency and task self
appraisals. Performance in the exercises was measured by marks assigned by a
professor of RE. After each exercise, students were asked for an appraisal of their
performance for the exercise just conducted. They were also asked to indicate which
additional knowledge they would have required to perform better, both in a free
answer and a multiple choice format. Answers from the free answer format were later
subjected to a deductive content analysis that mapped each free answer to a
competency from the domain, or a new one. The multiple choice items contained all
competencies assigned to the particular task in the domain model as well as a number
of distractors, i.e. other competencies not assigned to that task. Competencies had
been reformulated to describe personal learning needs (e.g. I would need to learn
what is a domain lexicon and how to apply it).
Self appraisal was included in this study as it is a common and economical way to
assess competencies or performance in the workplace [9]. In accordance with prior
research [10], we expected that self appraisals would correspond to actual task
performance (hypothesis 1). The second hypothesis looked at the personal learning
needs indicated by the students. We assumed that competencies selected by the
students for each task would, in a substantial proportion of cases, correspond to
competencies assigned to the task in the domain model. If this were not the case,
learning need analysis based on the task-competency assignment in the domain model
would not be possible. Lastly, we employed different algorithms for task-based
competency assessment and investigated whether they would correspond to
competency self appraisal by the students (hypothesis 3).
2.3 Results of the Study
2.3.1 Hypothesis 1: Self appraisal and task performance
A one-way Analysis of Variance which compared the marks received for the
exercises between those students that had indicated they were able to perform the task
without assistance and those that had indicated otherwise showed that contrary to our
expectations there was no relationship between self appraisal before task performance
and task performance as assessed by the marks received (F (1,69) = .007, ns.). There
was, however, a moderate relationship between self appraisal after task performance
and task performance itself as measured by a Spearman rank correlation coefficient
(ρ = -.38, p < .01). It appears that students were not able to realistically predict their
performance in the tasks before they conducted the exercise. Their appraisals after
task performance, then, were slightly more accurate.
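The two analyses for hypothesis 1 can be reproduced with standard routines, here shown on synthetic data (the study's actual marks and self appraisals are not reproduced): a one-way ANOVA on marks grouped by prior self appraisal, and a Spearman rank correlation between marks and post-task self appraisal.

```python
# Sketch of the hypothesis 1 analyses on invented data.
from scipy import stats

# Marks grouped by self appraisal given BEFORE the exercise:
marks_can_do = [2, 1, 3, 2, 2]   # students who said they could perform the task alone
marks_cannot = [2, 3, 1, 2, 3]   # the remaining students
f, p_anova = stats.f_oneway(marks_can_do, marks_cannot)

# Correlation of marks with self appraisal given AFTER the exercise
# (lower mark = better grade, so a negative rho indicates agreement):
marks = [1, 2, 2, 3, 4, 5]
post_appraisal = [5, 4, 4, 3, 2, 2]
rho, p_corr = stats.spearmanr(marks, post_appraisal)
print(f"ANOVA: F={f:.3f}, p={p_anova:.3f}; Spearman rho={rho:.3f}")
```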
2.3.2 Hypothesis 2: Personal Learning Needs
Asked for their personal learning needs after the exercises, the students were
significantly more likely to choose learning needs assigned to the particular tasks (M =
2.63) than distractors (M = 1.06) (t = 5.23; p < .001). This confirms the hypothesis and
is an indication of the overall validity of the modeled structures. Similarly, learning
needs extracted from student free answers in the content analysis were to a large
degree those originally assigned to the particular task (60 vs. 39). Two of the tasks
account for more than two thirds of the contradicting answers, namely task 3 (17
contradicting learning needs) and task 4 (11 contradicting learning needs). This gives
strong reason to believe that there had been missing competency assignments for
these tasks. Particularly, six new competencies that had not been part of the original
list were suggested from analyzing the free answers. These include items like
Knowledge about different types of requirements. These missing competencies may
have also led to violations of the prerequisite relation on tasks which were found
when comparing the relation to the obtained answer patterns.
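The comparison underlying hypothesis 2 is a paired t-test on per-student counts of selected assigned learning needs versus selected distractors. The sketch below uses invented per-student counts, not the study's data.

```python
# Sketch of the hypothesis 2 comparison on invented per-student counts.
from scipy import stats

assigned_chosen = [3, 2, 4, 3, 2, 3, 1, 3]    # selections matching the domain model
distractor_chosen = [1, 1, 0, 2, 1, 1, 0, 1]  # selections of distractor items
t, p = stats.ttest_rel(assigned_chosen, distractor_chosen)
print(f"t = {t:.2f}, p = {p:.4f}")
```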
2.3.3 Hypothesis 3: Task-based Competency Assessment
Assessing competencies from the observation of task performance is one of the key
benefits of using competence-based knowledge space theory. The usual way to do this
is to take the union of all assigned competencies for all successfully mastered tasks.
As [11] has shown previously, this method may lead to contradictions, especially in
the case where the numbers of competencies assigned to tasks are large, and therefore
suggests using both positive as well as negative task performance information. In the
present study, we have compared two algorithms to predict the knowledge state of the
students from task based information. Three predictors for task information were used
(task self appraisal prior to task, task self appraisal after task, and task performance
assessed by the expert) and each was correlated with competency self appraisal.
Although in all three cases, correlation coefficients were higher for the algorithm
that took negative task performance information into account, the coefficients were of
only small magnitude, ranging between ρ = -.017 and ρ = .129 (Spearman rank
correlation), and with only one becoming significant. We partly attribute these low
correlations to the fact that competency self appraisal is probably not a very accurate
criterion for the actual knowledge state of our subjects.
3 Discussion and Outlook
The results caution against the use of self appraisal information as a criterion variable
for evaluating the adaptation of a learning system, and also as an input variable for the
user model. Self appraisal by our subjects proved to be unrelated to their actual
performance. A possible reason for this may be that the students were rather
inexperienced in the domain. We assumed this also holds for the case of work-
integrated learning, which is in line with [12] who found high validity of self
appraisal only for experienced job holders. Also social desirability may have resulted
in answer tendencies, as all performance appraisals before task execution were much
higher than after.
The results for task-based competency assessment were largely unsatisfying due to
low validity of the criterion variable. Future research will show whether our
algorithms prove to be more successful than traditional measures. In any case, the
question of a valid criterion variable for a knowledge state (which at the same time
has ecological validity), will continue to be a challenge in work-based learning.
Checking for personal learning needs has proven to be a promising way to identify
parts of the models with low validity (missing competencies in our case). In
combination with indicators that estimate violations of the prerequisite relation from
answer patterns, these methods can be used to iteratively refine models once they are in
use.

We are currently planning an extensive summative evaluation of the APOSDLE
system and the components contained therein. A purpose of the study reported here
was to gain an understanding of how paper-based methods could be applied for
evaluating the adaptation of a learning system specifically in the context of adaptive
work-integrated learning so that they may be incorporated in a more comprehensive
evaluation approach in a field setting. For that reason, all the validation methods
employed here can be easily transferred to a setting where the learning system is in
operation and provides suggestions for learning needs and learning content during
actual task performance. The role of the RE professor in our study could then be taken
by supervisors of those working in the tasks. Short and unobtrusive system dialogues
after task execution could be used for collecting self appraisal as well as indications
of actual personal learning needs from the learners. This information could then be
fed back to adaptation designers to iteratively refine the adaptation decision or the
underlying domain model, such as suggesting additional competency assignments for
particular tasks or missing competencies altogether.
Acknowledgments. The Know-Center is funded within the Austrian COMET Program - Competence
Centers for Excellent Technologies - under the auspices of the Austrian Federal
Ministry of Transport, Innovation and Technology, the Austrian Federal Ministry of
Economy, Family and Youth and by the State of Styria. COMET is managed by the
Austrian Research Promotion Agency FFG. Contributions of four anonymous
reviewers to an earlier draft of this paper are kindly acknowledged.
References

1. Van Velsen, L., Van Der Geest, T., Klaassen, R., Steehouder, M.: User-centered evaluation
of adaptive and adaptable systems: a literature review. The Knowledge Engineering Review,
23 (3), 261-281 (2008)
2. Brusilovsky, P., Karagiannidis, C., Sampson, D.: The Benefits of Layered Evaluation of
Adaptive Applications and Services. In: S. Weibelzahl; D. Chin; G. Weber (eds.): Empirical
evaluation of adaptive systems. Workshop at the UM 2001, pp. 1-8, (2001)
3. Paramythis, A., Totter, A., Stephanidis, C.: A modular approach to the evaluation of
adaptive user interfaces. In: S. Weibelzahl, D. Chin, G. Weber (eds.): Empirical evaluation of
adaptive systems: Workshop at the UM 2001, pp. 9-24 (2001)
4. Weibelzahl, S., Lauer, C. U.: Framework for the evaluation of adaptive CBR-systems. In: I.
Vollrath; S. Schmitt; U. Reimer (eds.): Experience Management as Reuse of Knowledge.
GWCBR 2001, pp. 254-263, Baden-Baden, Germany (2001)
5. Lindstaedt, S. N., Ley, T., Scheir, P., Ulbrich, A.: Applying Scruffy Methods to Enable
Work-integrated Learning. Upgrade: The European Journal of the Informatics Professional,
9 (3) 44-50 (2008)
6. Korossy, K.: Extending the theory of knowledge spaces: A competence-performance
approach. Zeitschrift für Psychologie, 205, 53-82 (1997)
7. Ley, T., Ulbrich, A., Scheir, P., Lindstaedt, S. N., Kump, B., Albert, D.: Modelling
Competencies for supporting Work-integrated Learning in Knowledge Work. Journal of
Knowledge Management, 12 (6), 31-47 (2008)
8. Maiden, N. A., Jones, S. V.: The RESCUE Requirements Engineering Process - An
Integrated User-centered Requirements Engineering Process, Version 4.1. Centre for HCI
Design, The City University, London/UK (2004)
9. Hoffman, C., Nathan, B. & Holden, L.: A Comparison of Validation Criteria: Objective
versus Subjective Performance Measures and Self- versus Supervisor Ratings, Personnel
Psychology, 44, 601-619 (1991)
10. Mabe, P. & West, S.: Validity of Self-Evaluation of Ability: A Review and Meta-Analysis,
Journal of Applied Psychology, 67, 280-296 (1982)
11. Ley, T.: Organizational Competency Management - A Competence Performance Approach.
Shaker, Aachen (2006)
12. Muellerbuchhof, R. & Zehrt, P.: Vergleich subjektiver und objektiver Messverfahren für die
Bestimmung von Methodenkompetenz am Beispiel der Kompetenzmessung bei technischem
Fachpersonal. Zeitschrift für Arbeits- und Organisationspsychologie, 48, 132-138 (2004)
... This study is part of a design-based research which has undergone three iterations. The feedback from end-users has been sought in each iteration to improve the final product, meaning that we seek to involve teachers into the design and research process from early on (Ley et al., 2009). The previous two iterations are briefly described below to provide the reader with an overview of the process and decisions made before introducing the principal focus of the study: iteration 3. ...
... The first iteration involved conducting a needs analysis and presenting eight in-service teachers with a paper prototype (see Fig. 1) to encourage teachers to discuss their concrete needs rather than abstract wishes (Ley et al., 2009). The interviewed teachers in the first iteration reported on wanting to know about the "power dynamics" of the group (who was more dominant, passive in the group); additionally, they were interested in knowing what the individual contribution of each student within the group is "to have an objective basis for assessing the students" (Kasepalu et al., 2019). ...
Full-text available
Monitoring and guiding multiple groups of students in face-to-face collaborative work is a demanding task which could possibly be alleviated with the use of a technological assistant in the form of learning analytics. However, it is still unclear whether teachers would indeed trust, understand, and use such analytics in their classroom practice and how they would interact with such an assistant. The present research aimed to find out what the perception of in-service secondary school teachers is when provided with a dashboard based on audio and digital trace data when monitoring a collaborative learning activity. In a vignette study, we presented twenty-one in-service teachers with videos from an authentic collaborative activity, together with visualizations of simple collaboration analytics of those activities. The teachers perceived the dashboards as providers of useful information for their everyday work. In addition to assisting in monitoring collaboration, the involved teachers imagined using it for picking out students in need, getting information about the individual contribution of each collaborator, or even as a basis for assessment. Our results highlight the need for guiding dashboards as only providing new information to teachers did not compel them to intervene and additionally, a guiding dashboard could possibly help less experienced teachers with data-informed assessment.
... One of the ways it is intended to overcome the transferability issue is by using design-science research, where the feedback from end-users is sought in each cycle to improve the final product. This is the reason why teachers were involved early on into the design process [5], another strategy will be to conduct workshops inviting teachers to come and learn how to have a technological assistant with them in the classroom. The aim of the present study is to investigate in dialogue with in-service teachers what they need in order to be able to support students during CL more effectively. ...
... In the first cycle of the research, a needs analysis was conducted researching the literature and presenting eight inservice teachers with a paper prototype of a dashboard, to support teachers' voicing of needs beyond mere questions in the abstract [5]. The interviewed teachers in the first phase reported on wanting to know about the "power dynamics" of the group and were interested in knowing what the individual contribution of each student within the group is "to have an objective basis for assessing the students" [7]. ...
... However, the multimodal data complexity makes it difficult to design dashboards for teachers to monitor CSCL activities in the classroom. There is also a need to include the teacher early on in the design process to make the dashboard effective and suitable for the teacher's needs (Ley et al. 2009). In this paper, we present our findings from co-designing a dashboard using multimodal data -audio and logs-with 58 in-service teachers together in an iterative manner. ...
Full-text available
The understanding of collaboration quality is crucial for teachers to become aware of the activities going on in the groups and also for identifying groups in need to offer support in CSCL (Computer-Supported Collaborative Learning). Multimodal data captured during CSCL activity in the classroom can facilitate a holistic understanding of collaboration behavior. However, part of the problem is an adequate graphical representation of multimodal data that can help teachers to understand the collaboration and its related sub-constructs (e.g., argumentation). This aspect has been scarcely investigated in CSCL research using multimodal data. This paper presents a study with 58 participants co-designing a dashboard using multimodal data to aid teachers monitoring collaboration and supporting students in the classroom. According to our findings, teachers' preferences include: abstract representation over quantitative measures (e.g., showing the group's written contribution with a pen icon and an amount of writing as size of pen), quantitative details on request, and being notified when problems are identified.
... Evaluate loop. The research team introduced the mock-ups using oversized paper prototypes (Bødker and Grønbaek, 1991;Ley et al., 2009), i.e. a paper-based user interface simulating the canvases, separate small paper snippets depicting different user interface elements (e.g. contextual menus, drop-down boxes, arrows or maps) and paper icons representing the collected bits of experiences ( Figure 6). ...
Full-text available
Purpose Introducing technology at work presents a special challenge as learning is tightly integrated with workplace practices. Current design-based research (DBR) methods are focused on formal learning context and often questioned for a lack of yielding traceable research insights. This paper aims to propose a method that extends DBR by understanding tools as sociocultural artefacts, co-designing affordances and systematically studying their adoption in practice. Design/methodology/approach The iterative practice-centred method allows the co-design of cognitive tools in DBR, makes assumptions and design decisions traceable and builds convergent evidence by consistently analysing how affordances are appropriated. This is demonstrated in the context of health-care professionals’ informal learning, and how they make sense of their experiences. The authors report an 18-month DBR case study of using various prototypes and testing the designs with practitioners through various data collection means. Findings By considering the cognitive level in the analysis of appropriation, the authors came to an understanding of how professionals cope with pressure in the health-care domain (domain insight); a prototype with concrete design decisions (design insight); and an understanding of how memory and sensemaking processes interact when cognitive tools are used to elaborate representations of informal learning needs (theory insight). Research limitations/implications The method is validated in one long-term and in-depth case study. While this was necessary to gain an understanding of stakeholder concerns, build trust and apply methods over several iterations, it also potentially limits this. Originality/value Besides generating traceable research insights, the proposed DBR method allows to design technology-enhanced learning support for working domains and practices. The method is applicable in other domains and in formal learning.
... Since the first introduction of the term in 2000, the scientific community has adopted this concept in planning and conducting empirical studies. The researcher acknowledge that many authors explicitly refer back to the foundational papers published on the topic to justify experimental designs, to provide rationale for goals or structure of their evaluation studies (Ortigosa and Carro, 2003; Gena, 2005; Goren-Bar et al., 2005; Petrelli and Not, 2005; Arruabarrena et al., 2006; Glahn et al., 2007; Kobsa, 2007; Nguyen and Santos Jr, 2007; Ley et al., 2009; Limongelli et al., 2008; Popescu, 2009; Santos and Boticario, 2009) or to demonstrate methodological shortcomings of existing studies (Masthoff, 2002; Gena, 2005; Brusilovsky et al., 2006; Yang and Huo, 2008; Brown et al., 2009). The fact that layered evaluation received such a high level of attention in the literature reaffirms the claim that the evaluation of adaptive systems implicates some inherent difficulties. ...
Full-text available
A current problem with the research of adaptive systems is the inconsistency of evaluation applied to the adaptive systems. However, evaluating an adaptive system is a difficult task due to the complexity of such systems. Evaluators need to ensure correct evaluation methods and measurement metrics are used. This paper reviews a variety of evaluation techniques applied in adaptive and user-adaptive systems. More specifically, it focuses on the user-centred evaluation of adaptive systems such as personalised recommender systems and adaptive information retrieval systems. The review tackles the question of "リHow have user-centred evaluations of adaptive and user-adaptive systems been conducted and how can these evaluation practices be improved?' Based on the analysed results of the: (a) evaluation approaches, (b) user-centred evaluation techniques, and (c) evaluation metrics, we propose an evaluation framework for end-user experience in evaluating adaptive systems (EFEx).
... Here, the research question is: Is the adaptation provided by the models appropriate? In [9], we asked a group of requirements engineering students to perform a number of typical exercises that were related to the tasks modeled in the APODLE domain model. Before and after completing the exercises, they were asked to complete task and skill selfassessments . ...
Conference Paper
Full-text available
Evaluation frameworks have been presented that suggest layered evaluation of adaptive systems along two dimensions: (i) the software development cycle, and (ii) the component of the adaptive system that shall be looked at. We argue that a third dimension is crucial: the question whether an evaluation should take place in the lab or in the field. We present a refined systematization of evaluation approaches using the evaluation of the adaptive WIL system APOSDLE as an example.
... Since the first introduction of the term in 2000, the scientific community has adopted this concept in planning and conducting empirical studies. Many authors explicitly refer back to the foundational papers published on the topic to justify experimental designs, to provide rationale for goals or structure of their evaluation studies (Arruabarrena et al. 2002; Ortigosa and Carro 2003; Petrelli and Not 2005; Cena et al. 2006; Goren-Bar et al. 2006; Glahn et al. 2007; Kosba et al. 2007; Nguyen and Santos Jr 2007; Stock et al. 2007; Carmagnola et al. 2008; Limongelli et al. 2008; Ley et al. 2009; Popescu 2009; Santos and Boticario 2009), or to demonstrate methodological shortcomings of existing studies (Masthoff 2002; Gena 2005; Brusilovsky et al. 2006; Yang and Huo 2008; Brown et al. 2009). The fact that layered evaluation received such a high level of attention in the literature reaffirms the claim that the evaluation of adaptive systems implicates some inherent difficulties. ...
Full-text available
The evaluation of interactive adaptive systems has long been acknowledged to be a complicated and demanding endeavour. Some promising approaches in the recent past have attempted tackling the problem of evaluating adaptivity by “decomposing” and evaluating it in a “piece-wise” manner. Separating the evaluation of different aspects can help to identify problems in the adaptation process. This paper presents a framework that can be used to guide the “layered” evaluation of adaptive systems, and a set of formative methods that have been tailored or specially developed for the evaluation of adaptivity. The proposed framework unifies previous approaches in the literature and has already been used, in various guises, in recent research work. The presented methods are related to the layers in the framework and the stages in the development lifecycle of interactive systems. The paper also discusses practical issues surrounding the employment of the above, and provides a brief overview of complementary and alternative approaches in the literature.
Within this chapter we first outline the important role learning plays within knowledge work and its impact on productivity. As a theoretical background we introduce the paradigm of Work-Integrated Learning (WIL) which conceptualizes informal learning at the workplace and takes place tightly intertwined with the execution of work tasks. Based on a variety of in-depth knowledge work studies we identify key requirements for the design of work-integrated learning support. Our focus is on providing learning support during the execution of work tasks (instead of beforehand), within the work environment of the user (instead of within a separate learning system), and by repurposing content for learning which was not originally intended for learning (instead of relying on the expensive manual creation of learning material). In order to satisfy these requirements we developed a number of context-aware knowledge services. These services integrate semantic technologies with statistical approaches which perform well in the face of uncertainty. These hybrid knowledge services include the automatic detection of a user’s work task, the ‘inference’ of the user’s competencies based on her past activities, context-aware recommendation of content and colleagues, learning opportunities, etc. A summary of a 3 month in-depth summative workplace evaluation at three testbed sites concludes the chapter.
Reviews 55 studies in which self-evaluations of ability were compared with measures of performance, showing a low mean validity coefficient (mean r = .29) with high variability (SD = .25). A meta-analysis following the procedures of J. E. Hunter et al (1982) calculated sample-size weighted estimates of r̄ and SD_r and estimated the appropriate adjustments of these values for sampling error and unreliability. Among person variables, high intelligence, high achievement status, and internal locus of control were associated with more accurate evaluations. Much of the variability in the validity coefficients (R = .64) could be accounted for by 9 specific conditions of measurement, notably (a) the rater's expectation that the self-evaluation would be compared with criterion measures, (b) the rater's previous experience with self-evaluation, (c) instructions guaranteeing anonymity of the self-evaluation, and (d) self-evaluation instructions emphasizing comparison with others. It is hypothesized that conditions increasing self-awareness would increase the validity of self-evaluation.
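The sample-size weighting and the correction for unreliability described in this abstract can be sketched as follows. This is a simplified illustration, not the full Hunter et al. procedure; the study coefficients and reliability values are invented for the example:

```python
from math import sqrt

def weighted_mean_r(rs, ns):
    """Sample-size weighted mean validity coefficient (r-bar)."""
    return sum(r * n for r, n in zip(rs, ns)) / sum(ns)

def correct_for_attenuation(r, rel_x, rel_y):
    """Classical correction of an observed r for unreliability
    in both the self-evaluation and the criterion measure."""
    return r / sqrt(rel_x * rel_y)

# Hypothetical studies: (observed r, sample size)
studies = [(0.20, 100), (0.35, 250), (0.29, 150)]
r_bar = weighted_mean_r([r for r, _ in studies], [n for _, n in studies])
rho = correct_for_attenuation(r_bar, rel_x=0.80, rel_y=0.70)
print(round(r_bar, 3), round(rho, 3))  # → 0.302 0.404
```

The corrected coefficient is necessarily larger than the observed one, since measurement error in either variable attenuates the observed correlation.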
Adaptivity is a way of increasing the usability of interactive software. Several systems use CBR as a basic inference mechanism to model the user and to reason about adaptive actions. However, empirical evaluations of adaptive systems are rare. This paper introduces an evaluation framework of six necessary steps for adaptive CBR systems. The framework is applied to a case-based product recommendation system. Finally, possible experimental designs are presented. Evaluation criteria, both generally applicable as well as criteria that are specific to e-commerce applications, are discussed. Keywords: evaluation framework, criteria, adaptivity, usability, behavioral complexity.
1 Problems and Barriers in the Evaluation of Adaptive CBR-Systems. Making interactive systems adaptive is an emerging field. Many systems have been developed that adapt to the user, e.g., an adaptive product retrieval system might adapt to the user's preferences, an adaptive learning environment might adapt to the learner's current knowledge and goals, and an adaptive help system might adapt to the user's current task. In some of these systems CBR is used as the inference mechanism to provide the adaptive features, e.g., PTV (Smyth & Cotter, 1999) adaptively recommends TV programs, DUMBO (Melchiors & Tarouco, 1999) supports network maintenance, ELM-PE (Weber, 1995) teaches programming by examples, and CASTLE (Weibelzahl, 1999) recommends vacation homes. These systems aim at optimizing human-computer interaction; adaptive features are one possibility of improving usability. However, empirical evaluations of adaptive systems are hard to find. Nevertheless, they are strictly required to justify the enormous efforts of implementation. Several reasons have been identified as responsible for this lack (e.g., Eklund, 1999). One structural reason is that computer science has little tradition of empirical research and, thus, evaluations of adaptive systems are usually not required for publication. Second, the development cycle of software products is short: evaluations might become obsolete as soon as a new version has been developed, and the resources consumed by the evaluation cannot be put to use for further development.
Purpose – The purpose of this paper is to suggest a way to support work-integrated learning for knowledge work, which poses a great challenge for current research and practice. Design/methodology/approach – The authors first suggest a workplace learning context model, which has been derived by analyzing knowledge work and the knowledge sources used by knowledge workers. The authors then focus on the part of the context that specifies competencies by applying the competence performance approach, a formal framework developed in cognitive psychology. From the formal framework, a methodology is then derived of how to model competence and performance in the workplace. The methodology is tested in a case study for the learning domain of requirements engineering. Findings – The Workplace Learning Context Model specifies an integrative view on knowledge workers' work environment by connecting learning, work and knowledge spaces. The competence performance approach suggests that human competencies be formalized with a strong connection to workplace performance (i.e. the tasks performed by the knowledge worker). As a result, competency diagnosis and competency gap analysis can be embedded into the normal working tasks and learning interventions can be offered accordingly. The results of the case study indicate that experts were generally in moderate to high agreement when assigning competencies to tasks. Research limitations/implications – The model needs to be evaluated with regard to the learning outcomes in order to test whether the learning interventions offered benefit the user. Also, the validity and efficiency of competency diagnosis need to be compared to other standard practices in competency management. Practical implications – Use of competence performance structures within organizational settings has the potential to more closely relate the diagnosis of competency needs to actual work tasks, and to embed it into work processes. 
Originality/value – The paper connects the latest research in cognitive psychology and in the behavioural sciences with a formal approach that makes it appropriate for integration into technology-enhanced learning environments.
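The task-to-competency assignment and competency gap analysis described in this abstract can be sketched as follows. The task and competency names are invented for illustration; the actual approach formalizes such assignments within the competence-performance framework:

```python
# Hypothetical expert assignments of required competencies to work tasks
# (here from a requirements engineering domain, as in the case study).
TASK_COMPETENCIES = {
    "elicit_requirements": {"interviewing", "domain_analysis"},
    "write_use_cases": {"domain_analysis", "uml_modelling"},
    "review_specification": {"uml_modelling", "quality_criteria"},
}

def inferred_competencies(performed_tasks):
    """Competencies evidenced by the tasks a worker has performed well:
    the union of the competencies assigned to those tasks."""
    out = set()
    for task in performed_tasks:
        out |= TASK_COMPETENCIES[task]
    return out

def competency_gap(target_task, performed_tasks):
    """Competencies required for the target task but not yet evidenced,
    i.e. candidates for a learning intervention."""
    return TASK_COMPETENCIES[target_task] - inferred_competencies(performed_tasks)

gap = competency_gap("review_specification",
                     ["elicit_requirements", "write_use_cases"])
print(sorted(gap))  # → ['quality_criteria']
```

Because the diagnosis is driven entirely by which tasks were performed, it can run as a by-product of normal work rather than as a separate assessment step.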
Abstract. The study presented here addresses conceptual and methodological problems in measuring one facet of professional action competence among technical personnel: fault-diagnosis competence. A subjective instrument assessing the self-concept of methodological competence is contrasted with an objective competence measure. The study examines how strongly the two assessment approaches correlate and whether one type of instrument is preferable under particular conditions of use. To this end, an objective measurement methodology was developed according to rational principles and tested empirically. In a correlational study, technical maintenance personnel from the high-tech sector were assessed with the different methodological approaches to competence characteristics. The results show in part highly significant correlations between the subjective measures (methodological approach and self-concept of methodological competence, respectively) and the objective competence measure …
This work is concerned with a new approach to Organizational Competency Management. The goal is to develop a method that is practically feasible for organizational settings, is firmly based in psychological conceptions of human competence and performance in the workplace, and employs a degree of mathematical formalization that improves possibilities for establishing the validity of the implementation. Competency Management is defined to encompass all instruments and methods used in an organization to systematically assess current and future competencies required for the work to be performed, and to assess available competencies of the workforce. Competencies are defined as the cognitive (e.g. knowledge and skills), affective (e.g. attitudes and values), behavioral and motivational (e.g. motives) characteristics or dispositions of a person which enable him or her to perform well in a specific situation (e.g. Boyatzis, 1982; Erpenbeck & Rosenstiel, 2003). A process model is introduced which encompasses five steps that usually guide the implementation of a Competency Management initiative. In the first step, the setting and purpose of the initiative are analyzed (analyzing setting and purpose). The second step encompasses the definition of a model for the specific organization, detailing which competencies should be measured (defining competencies). In the third step, available competencies of the workforce are assessed (assessing competencies). The fourth step brings about an evaluation of the models and the assessment (evaluating models), and finally the last step puts the models to use (using models). The steps are used as a frame of reference for reviewing existing approaches and methods. A review of current approaches to organizational Competency Management in the Human Resource Management (HRM) and Knowledge Management (KM) fields leads me to conclude that instruments that are integrated in existing work processes …
This study compared four criteria, two objective (production quantity and production quality) and two subjective (supervisor ratings and self-ratings), for their predictability in a criterion-related validity study. Results from this sample of 212 maintenance, mechanic, and field service workers replicated previous meta-analytic results with clerical workers (Nathan & Alexander, 1988): supervisor ratings and objective productivity indices provided similar and significant validity coefficients with a unit-weighted composite of five cognitive ability tests. The objective quality index and employee self-ratings resulted in near-zero correlations with the same predictor battery. Additional objective productivity and quality criterion data were available for two years after the original validation study; no change in validity was found.