Instructional sensitivity in vocational education
Viola Deutscher, Esther Winther
University of Mannheim, Business School, L4 1, 68161 Mannheim, Germany
German Institute for Adult Education (DIE), Heinemannstraße 12-14, 53175 Bonn, Germany
Article info
Article history:
Received 10 November 2016
Received in revised form
30 June 2017
Accepted 7 July 2017
Available online 8 September 2017
Keywords:
Instructional sensitivity
Differential item functioning (DIF)
Vocational educational training (VET)
Competence development
Competence-based assessment
Apprentices' performance after vocational educational training (VET) is commonly attributed to the effectiveness of the training. This implies the assumption that learners' development of vocational knowledge and ability is significantly affected by vocational instruction. However, the few analyses that have been made of instructional sensitivity within the general school-based educational system have in most cases shown little or no effect of instruction (time in school) on performance in assessments. The question as to whether, and to what extent, VET in adult education is effective (in the sense that it fosters the development of vocational knowledge and ability), as well as the related question of whether we are able to track the resulting learning progress with adequate measures (i.e., assessments), has hardly been investigated. In the present study, we propose modeling of instructional sensitivity via differential item functioning (DIF), and apply this method to a sample of n = 534 apprentices. We find that during vocational instruction, apprentices significantly improved their performance in an assessment of vocational knowledge and ability, and that we were able to track these changes in the quality of their abilities over the span of a three-year initial VET program: that is, the first program of vocational study in which apprentices become qualified to work in a given trade. Moreover, with this proposed method, it is possible to identify items that are particularly sensitive to instruction and that therefore appear to be amenable to the future development of vocational assessments.
© 2017 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license.
1. Premise
Schooling/training is commonly assumed to be responsible for learning (Burstein, 1989; Naumann, Hochweber, & Hartig, 2014). Somewhat surprising, therefore, are empirical hints that performance on assessments in general education is often little or not at all sensitive to the effects of instruction. Diverse research (e.g., Chen, 2012; Court, 2013; Pham, 2009; Phillips & Mehrens, 1988; Popham, 2007; Popham & Ryan, 2012) suggests that many achievement tests fail to effectively reflect whether students successfully receive and absorb curricular content during instruction.
This apparent paradox might result from one of two causes (or conceivably both): (1) learners have indeed learned during instruction, but the assessment applied was not able to capture the learning progress made. For example, Goe (2007) and Polikoff (2010) caution that the failure to detect instructional sensitivity does not necessarily imply that no learning progress has been made. Rather, the weak relationship between curricular instruction and student performance could be due to the applied measurement tools not being sufficiently sensitive to capture the effect of instruction. These measures of learning outcomes possibly indicate what students know, but not necessarily what they learn during instruction (Popham, 2007).
The second possible cause (2) is expressed by Wiliam (2007, p. 12) who, providing an insightful analysis of the relevant research addressing instructional sensitivity, goes one step further, arguing for a more pessimistic second-order explanation:

the fundamental issue is not that tests are insensitive to instruction; it is that achievement is insensitive to instruction. Put bluntly, most of what happens in classrooms doesn't change what students know very much, especially when we measure deep, as opposed to surface, aspects of a subject.
This second explanation in turn might result from two causes: either students' knowledge as a latent structure is generally insensitive to instruction, or instruction may not have been delivered (or not delivered effectively).
*Corresponding author.
E-mail address: (V. Deutscher).
Learning and Instruction 53 (2018) 21-33
Even without any clear indication of which of the two explanations (or, conceivably, a combination) accounts for the empirical findings, both interpretations of the instructional insensitivity of diverse outcome measures pose a severe threat, especially to educational accountability. In some nations (e.g., the US), outcome measures have been used in recent times not only to evaluate the effectiveness of schools and teachers on the basis of their students' test proficiency, but also to allocate educational resources on the basis of test results (e.g., state tests used for the purposes of the No Child Left Behind Act). Without a doubt, an accountability test would, as one prerequisite among other aspects of validity, at least have to be instructionally sensitive in order to form an appropriate basis for making decisions with potentially far-reaching consequences. However, given unreliable and possibly inaccurate test-based evidence of achievement or learning progress, or even the lack thereof, instructional sensitivity cannot be accurately determined; this leaves the danger that teachers and schools will be misjudged, and even be unfairly denied resources.
Considering these potentially severe consequences, Polikoff (2010, p. 34), summarizing the overall state of instructional sensitivity research, comes to the conclusion that the lack of documentation of instructional sensitivity in accountability tests constitutes a grievous oversight. Even more strongly, Popham and Ryan (2012, p. 2) assail the current lack of empirical evidence regarding instructional sensitivity in most educational tests, describing it as an intolerable state of affairs. In view of the above, the internationally observable trend towards test-based accountability systems, and political reliance on outcome measures in making decisions affecting education, seems highly questionable. For this reason, some authors have demanded that the concept of instructional sensitivity become an explicit and integral part of a broadened conception of validity, for common standards in educational and psychological testing (e.g., AERA, APA, & NCME, 1999). They call for this to be applied at least for the outcome measures that are used to assess changes in learning and for those testing system effectiveness (e.g., teacher or school effectiveness; for example, Polikoff, 2010; Popham & Ryan, 2012).
Way (2014, p. 4) raises the concern that despite these recent imperatives for explicitly making assessments instructionally sensitive, "there is not agreement about how this is to be done (...)". Naumann et al. (2014) similarly believe that the question of whether outcome measures are indeed sensitive to instruction is hardly empirically engaged, due to the lack of a commonly accepted definition and operationalization of the concept of instructional sensitivity. The methodological approaches to modeling instructional sensitivity are diverse, to say the least: this has led to mainly psychometric papers on the topic, and few practical applications combining the proposed methods with a didactical perspective (for one such application, however, see the recent study by Naumann et al., 2014).
Although, as we have noted, instructional sensitivity is a crucial concept in instructional science, to our knowledge no studies have addressed the modeling of instructional sensitivity with respect to vocational education of adults. In Germany, about half of the population takes vocational educational training (VET) rather than academic training after their school education. Most of this VET (60%) relates to commercial professions: for bankers, industrial management assistants, salesmen (National Educational Report, Hasselhorn et al., 2014). While development of measures of vocational knowledge and ability for this branch of education is very relevant, it is still in its infancy. In general, however, significant progress has been made in the last decade with respect to the measurement of learning outcomes in the vocational domains of auto mechanics (e.g., Nickolaus, Lazar, & Norwig, 2012) and apprenticeships in commercial professions: for example, industrial or logistics apprentices (e.g., Klotz, Winther, & Festner, 2015; Rausch, Seifried, Wuttke, Kögler, & Brandt, 2016; Seeber, 2008; Weber et al., 2016; Winther & Achtenhagen, 2009). More recently, there has also been notable progress in the area of social health care (e.g., Seeber, 2015; Seeber, Ketschau, & Rüter, 2016). Therefore, the purpose of this study is to conceptualize and model instructional sensitivity in the area of vocational education, and to detect which item types are especially relevant to modeling the learning progress. More precisely, we focus on the occupation of industrial management assistant, and seek to explore whether instructional sensitivity is detectable in an assessment of vocational knowledge and ability.
According to Polikoff (2010, pp. 8-9), it is impossible to say whether a finding of low or no sensitivity in any particular study is due to a poor-quality test that is actually insensitive to instruction, or to poor-quality instruction, so that the test results actually reflect the instruction received by students. In contrast, a finding of high sensitivity indicates both effective instruction and also a high-quality assessment sensitive to that instruction. Clearly, the goal is always to have instruction of maximum effectiveness, and to design a test to capture the effects of [that instruction].
So if we do not find instructional sensitivity, this does not necessarily mean that learners have not learned anything (e.g., due to poor instruction); it may instead mean that our assessment failed to capture their learning (i.e., instructional insensitivity of the assessment). However, if we find instructionally sensitive items, this must mean that vocational knowledge and ability are being acquired during VET and that we are able to capture them. More precisely, in this study, the following research questions are addressed:

1. Is the developed assessment of vocational knowledge and ability sensitive to instruction (meaning that learning progress is made during VET and that we are able to capture that progress)?

2. Is the learning of specific (vocational) knowledge and ability equally sensitive to instruction as is the learning of generic knowledge and ability?
In order to explore this matter, the paper begins by reviewing different definitions of instructional sensitivity and different methodological approaches to its detection. Subsequently, the item and test design of an instrument to capture apprentices' knowledge and ability is introduced. We then apply the IRT-DIF approach to a vocational sample of n = 877 industrial apprentices, and outline and discuss the results.
2. Defining and detecting instructional sensitivity
In the theoretical research into instructional sensitivity, this term has often been used interchangeably with instructional validity, with both terms being treated as subfacets of other, common aspects of test validity, such as curricular validity and content validity (Polikoff, 2010). Li et al. (2012b, p. 2) note that the intended meaning of the term sometimes relates exclusively to the extent to which the curriculum content is taught successfully (e.g., Linn, 1983). Occasionally, however, it also includes the nature of the teaching of the content (e.g., Burstein, Aschbacher, Chen, & Lin, 1990; Popham & Ryan, 2012; Yoon & Resnick, 1998). A definition that is open to both interpretations is the originally used, more technical definition of Haladyna and Roid (1981, p. 40), who define instructional sensitivity as the tendency for an item to vary in difficulty as a function of instruction. This relation is then specified
either by the duration of instruction only (Opportunity to Learn [OTL] as time for learning; see, e.g., Yu, Lei, & Suen, 2006) or by aspects referring to the quality, content, and nature of instruction, such as is often implemented in broader OTL conceptions (e.g., Kao, 1990; Switzer, 1993; Yu et al., 2006). In this study we adopt the broader approach, as we focus on the effectiveness of VET and its assessment as a whole, and therefore define instructional sensitivity as the tendency for a test or a single item to vary in difficulty as a function of the duration of vocational educational training. According to this definition, if vocational instruction is reasonably effective, items should be easier for instructed students and more difficult when administered to uninstructed students. Conversely, if time on training does not change apprentices' performance on an assessment to any marked degree, then that assessment must be insensitive to instruction (Ruiz-Primo et al., 2012; Wiliam, 2007).
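The definition above can be illustrated with a toy computation. The sketch below (all solution rates are invented for illustration, not taken from the study) estimates each item's difficulty as the negative log-odds of its solution rate in an uninstructed and an instructed group; a large drop in difficulty marks an instructionally sensitive item.

```python
import math

def difficulty(p_correct):
    """Item difficulty on the logit scale: harder items get higher values."""
    return -math.log(p_correct / (1 - p_correct))

# Invented solution rates per item: (uninstructed group, instructed group).
solution_rates = {
    "item_A": (0.30, 0.70),  # clearly easier after instruction -> sensitive
    "item_B": (0.50, 0.55),  # barely changes -> largely insensitive
}

for item, (p_novice, p_advanced) in solution_rates.items():
    shift = difficulty(p_novice) - difficulty(p_advanced)
    print(f"{item}: difficulty drop = {shift:+.2f} logits")
```

Item A drops by roughly 1.7 logits, item B by about 0.2, mirroring the definition's contrast between sensitive and insensitive items.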
In the literature on detecting instructional sensitivity, a variety of approaches are distinguishable, but they can be subsumed under two basic headings: (1) Judgmental approaches are usually integrated into test design and development processes (Popham, 2007), but can potentially also be applied as ex-post evaluations of instructional sensitivity (e.g., Rovinelli & Hambleton, 1977). Judgmental approaches rely on trained experts in a domain rating the specified attributes of a test's instructional sensitivity. Ideally, in such methods, only instructionally sensitive items are selected for a final test instrument. However, the major drawback of these approaches is that it has not yet been demonstrated that experts can validly and reliably distinguish between tasks that are instructionally sensitive and those that are not (Chen, 2012; Polikoff, 2010; Way, 2014).

The second approach (2) is an empirical investigation of instructional sensitivity in learners' test outcomes, and includes a variety of empirical methods and respective designs.
One empirical method, an IRT-based differential item functioning (DIF) approach, has won recognition over recent years. In several studies (e.g., Naumann et al., 2014; Polikoff, 2010; Popham & Ryan, 2012) it has proved to be well suited to the purpose of detecting instructional sensitivity. This method goes back to the conceptual framework of Masters (1988). The major finding of Masters' framework is that, aside from the fact that high- and low-achieving students will usually score differently on a test, differential instructional sensitivity is reflected in some items being more highly discriminating than others. So the key technical element of DIF-based studies (e.g., Polikoff, 2010; Popham & Ryan, 2012) is that they compare the performance of groups on an assessment, controlling for the overall ability of the groups. In this respect, in either longitudinal or cross-sectional designs, DIF analyses can be run to compare instructed students to novice students, indicating whether the items are sensitive to the instruction experienced by the students (Polikoff, 2010, p. 17).
In this study, we seek to combine a judgmental with an
empirical approach; Ruiz-Primo et al. (2012) have offered an
example of such a triangulating approach.
3. Assessment design for VET
Especially in vocational education, where ability has to be demonstrated in the workplace on a regular basis, the concept of competence is more significant than the concept of mere knowledge as a target construct of vocational assessment. We define competence, in line with Mulder, Weigel, and Collins (2006, p. 79), as the capability to perform by using knowledge, skills, and attitudes that are integrated in the professional repertoire of the individual. A paper-pencil assessment is utilized to infer the apprentices' cognitive structures by assessing how well the test takers can do on authentic workplace-related tasks that they are expected to master at the end of their VET. So, in this contribution we specifically consider knowledge and ability as major cognitive prerequisites for the capability to perform in vocational situations, rather than attitude-related aspects of vocational competence in terms of attitudes and beliefs.
Following the item classification system of Ruiz-Primo et al. (2012), our developed measure may be considered proximal to instruction: it is designed to take a snapshot of the relevant knowledge and skills in the curriculum. However, the exact content (e.g., a situation in the workplace) can be different to that studied during instruction. Our assessment was explicitly designed to align with the intended VET curriculum for industrial apprentices, and to be used as a paper-pencil test for the final examination of apprentices at the end of their VET (summative assessment). The design process was inspired by recent assessment theory (Pellegrino, Chudowsky, & Glaser, 2001; Mislevy & Haertel, 2006; Wilson, 2005, 2008).
In order to assure the validity of the assessment of instructional sensitivity, we undertook several assessment phases. After (1) defining our theoretical construct (as above), we (2) undertook a curricular analysis, in order to closely align our assessment of vocational knowledge and ability to the intended industrial VET curriculum. A particular feature of this phase was that in a VET assessment, we have to pay attention to the curricula of two learning sites: the German VET system is structured so as to equip apprentices with practical and theoretical knowledge through a dual system of company-based training programs provided by the private sector (where the apprentices work about three days per week and are paid a wage by their employer), together with a school-based component (about two days per week, provided by the public sector).
Consequently, not only did we analyze the official curriculum of vocational schools, but we also made a survey study in the industrial sector, investigating what content is commonly taught and considered necessary by the apprentices' training companies. The specific job analysis was guided by several questions: What content is processed in which departments? What materials are used? How does internal/external communication take place (infrastructure)? All results and data were incorporated into the development of a model of the typical business processes that occur within companies (Winther, 2010). The model, which followed the process perspective of the St. Galler Management Model (Rüegg-Stürm, 2004), includes three central processes in (industrial) companies: value chain processes, related to quantifiable goods and services and their marketing; control processes, including decision support for management; and management processes, which comprise business management and organization concerns.
The phase of item construction (3) was implemented according to three guiding principles, the first of which was (a) authenticity of the vocational assessment (e.g., Achtenhagen & Weber, 2003; Shavelson & Seminara, 1968). In order to secure maximum authenticity, we modeled a simulated company that produces ceramic products such as tableware, bath tubs, or sinks (see Appendix). All assessment items developed were implemented within the simulated company framework, together with additional realistic material and information with which respondents were to solve the items (e.g., product lists or e-mails; see Appendix). With respect to the design of the single items, the assessment tasks were designed to measure economic knowledge and skills in the commercial sector by representing job-related skills in the industrial sector. For this purpose, the item format of all tasks was open-ended.

[Footnote: For a comparison of different approaches to an empirical investigation of instructional sensitivity, see Li, Ruiz-Primo, and Wills (2012a).]

[Footnote: However, such aspects presumably also influence task solutions during the assessment and are therefore in all likelihood integrated in our measurement approach to some (unknown) extent.]
In order to attend to (b) the varying cognitive demands of vocational practice, the items were developed on three cognitive levels, according to the conceptual framework of Greeno, Riley, and Gelman (1984), which represents an action schema for performing vocational tasks. On the first level, conceptual competence implies an understanding of the principles in the domain. It corresponds to factual knowledge that can be translated into an action schema. At the second level, procedural competence takes the form of knowledge in action, such as dealing with facts, structures, and knowledge nets. At the third level, interpretational competence refers to strategic decision making that reflects the cognitive process of grounded interpretation of the findings obtained through conceptual and procedural knowledge. To assess these different types of cognitive process, we modeled six conceptual items, seven procedural items, and three interpretative items.
The third principle of item construction refers to (c) the administration of tasks of varying specificity. In line with Gelman and Greeno (1989), we distinguish between domain-linked and domain-specific item content in the business domain. The former, decontextualized aspect is generally relevant to the business domain, while the latter is highly situational and reflects the specific aspects, guidelines, and action maxims of a particular occupation. More precisely, domain-linked aspects refer to basic knowledge and skills that are generic but are nonetheless relevant prerequisites for solving vocational problems (Klotz, Winther, & Festner, 2015). In business domains, concepts such as literacy and numeracy are examples of this type of general preknowledge (OECD, 2003; Winther & Achtenhagen, 2009). Domain-linked knowledge and ability is needed, for example, to perform simple exchange rate calculations in the workplace. Such calculations do not require any specific vocational knowledge or ability, but can be dealt with simply by applying the general mathematical concept of the rule of three, with which learners were already familiar from their general school education. Domain-specific knowledge and ability, on the other hand, entails job- or enterprise-specific knowledge and skills (Oates, 2004). In a business domain, an example of this kind of knowledge and ability might be rules that are newly acquired during vocational educational training: for example, for preparing a balance sheet in accounting. Both aspects of vocational knowledge and ability, domain-linked and domain-specific, are prerequisites for solving workplace-related tasks (for sample items see Appendix). For the study at hand, we modeled 10 domain-specific items and 6 domain-linked items.
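The rule-of-three calculation named above as an example of a domain-linked task can be sketched as follows; the exchange rate and amounts are invented for illustration and do not appear in the study's items.

```python
def rule_of_three(base_quantity, base_value, query_quantity):
    """Proportional reasoning: if base_quantity corresponds to base_value,
    what does query_quantity correspond to?"""
    return base_value / base_quantity * query_quantity

# Invented exchange-rate task: 100 EUR buy 108 USD; how many USD for 250 EUR?
usd = rule_of_three(100, 108, 250)
print(f"250 EUR = {usd:.2f} USD")  # prints: 250 EUR = 270.00 USD
```

The point of the example is that solving it requires only general school mathematics, not any vocational content.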
In the (4) test assembly phase (see, e.g., Mislevy & Haertel, 2006), an important principle of assessment design for vocational education relates to the assembly of single tasks into one coherent business process (Klotz, 2015). The test therefore starts with a simulated typical event in the company (e.g., an e-mail from a potential client) demanding certain responses from the test takers, which in turn lead to further events and tasks (see Appendix).
The final step of our assessment design process included (5) validating our test design. We asked 24 vocational experts (12 experts for each item) to rate all tasks in terms of authentic item design (relevance of content and realistic situational setting), as well as to rate the items as either domain-linked or domain-specific. Items that received an average value below 3.5 on the five-point Likert scale, in respect of workplace relevance, were excluded from the instrument. Moreover, we used the expert judgments of the items as being either domain-linked or domain-specific as a basis for the empirical analysis. The experts mostly agreed in their categorization of each item; this is reflected in the high degree of inter-rater reliability (intraclass correlation coefficient [ICC] = 0.940).
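The exclusion rule in phase (5) amounts to a simple filter over mean expert ratings. The sketch below illustrates it with invented item labels and ratings (the study's actual items and rating data are not reproduced here).

```python
# Invented five-point Likert ratings of workplace relevance, one list per item.
expert_ratings = {
    "invoice_processing": [5, 4, 5, 4],
    "generic_smalltalk": [3, 2, 4, 3],      # mean 3.0 -> excluded
    "balance_sheet_entry": [4, 4, 5, 5],
}

THRESHOLD = 3.5  # items rated below this average are dropped from the instrument

retained = {
    item: ratings
    for item, ratings in expert_ratings.items()
    if sum(ratings) / len(ratings) >= THRESHOLD
}
print(sorted(retained))  # generic_smalltalk is filtered out
```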
4. Theoretical assumptions about instructional sensitivity in VET
As Ruiz-Primo et al. (2012, p. 693) note, most research studies concerned with instructional sensitivity focus on evaluating assessment instruments already developed and used, but are silent on how to construct instructionally sensitive assessments. In our research, in contrast, we implemented theoretical design principles into the assessment ex ante, principles that could be manipulated systematically to model items of varying instructional sensitivity. With respect to Research Questions 1 and 2, we are interested not only in whether the assessment is instructionally sensitive, but also, if so, why. Often, item attributes causing difficulty in tasks also reflect sources of instructional sensitivity. Detection of instructional sensitivity therefore requires strong familiarity with the vocational area and its theoretical difficulty, or gleaning the necessary attributes through interaction with vocational experts. Ideally, both circumstances would apply, to enable determining which vocational activities are complex, and for what reason, and how the capacity to achieve them might develop over time, with instruction.
In our assessment, the item design characteristics potentially causing difficulty were the level of cognitive processing and the degree of specificity of the learning content. However, we believe that only the latter attribute plays a predominant role in generating instructional sensitivity. In line with Billett (1994), we argue that most often, vocational novices do not lack cognitive ability. Rather, in most instances, apprentices lack the specific knowledge and experience within a vocational domain (Glaser, 1990) that would otherwise enable them to conceptualize and categorize workplace-related problems and to deploy their cognitive structures more effectively (Billett, 1994, p. 4). Similarly, Dreyfus and Dreyfus (1980) describe vocational learning as an expansion of novices' generic preknowledge, which develops with relevant knowledge about aspects, specific guidelines, and action schemes, such that it transforms into an increasingly organized form, as specific knowledge and ability. The newly acquired specific knowledge is then stored, in addition to general knowledge and ability (domain-linked), to provide the learner with a broad knowledge base from which to act in similar vocational situations.
The existing theoretical and qualitative research offers support for the idea of vocational learning as acquisition of specific knowledge and ability (see research on the expert-novice paradigm in diverse vocational domains: e.g., Dreyfus & Dreyfus, 1980; Benner, 2004; Worthy, 1996; Ryan, Fook, & Hawkins, 1995; Campbell, Brown, & DiBello, 1992; Chmiel & Loui, 2004). We therefore assume that instructional sensitivity in vocational domains is determined by the extent of content specificity of items in an assessment. More precisely, two hypotheses can be formulated in reference to the above-stated research questions:

1. Advanced vocational learners improve significantly, compared to novices, in respect of their performance in the assessment (Hypothesis 1).

2. Items that are domain-specific are significantly more instructionally sensitive than items that relate to domain-linked generic contents (Hypothesis 2).
[Footnote: The tasks were scored with 0 for none or a wrong solution, 1 for a partially correct answer, and 2 for a fully correct answer.]
5. Data acquisition and method
A cross-sectional design was used for the acquisition of data. This design was sufficient for our purpose of detecting instructional sensitivity in an assessment, as we did not seek to estimate or explain individual differences within the cohort, but only to ascertain whether items were instructionally sensitive at the aggregate cohort level. Moreover, longitudinal data would have caused test repetition effect issues (e.g., Hoffman, Hofer, & Sliwinski, 2011; Salthouse & Tucker-Drob, 2008). The cross-sectional data were gathered in 2013 as a non-random sample from visits to vocational schools in locations spread widely across Germany (Munich, Hanover, Bielefeld, and Paderborn). For economic efficiency, schools with a large proportion of industrial apprentices were selected. Access was initiated by the German Chamber of Industry and Commerce (IHK). Within these schools, all students enrolled in industrial apprentice programs were selected, and all agreed to participate. Table 1 presents the sample, subdivided into the two groups of vocational novices (n = 136) and advanced vocational learners (n = 398), and the basic characteristics of these groups. Even though the data were gathered as a non-random sample, the two groups were remarkably similar in regard to the distributional characteristics of all collected variables, and showed no differences with regard to gender (T = 0.748; p = 0.455), educational career paths (T = 0.169; p = 0.866), and migrational background (T = 1.011; p = 0.313). The two subsets (group 1 and group 2) only differed significantly with regard to the average time spent on vocational educational training (years spent in vocational training) and average age (T = 8.630; p = 0.000). Moreover, the distributions of the two collected subsamples are comparable to the general population of industrial apprentices in Germany (Table 1).
During test taking we observed, in regard to test motivation, that the students engaged very well with the instrument, most probably because it had been represented to them as a useful preparation for their final examination, and because we had assured them of individual feedback. This also likely explains the low rate of missing values (1.68%). The solutions to the items were corrected and coded according to a detailed scoring guide (Wilson, 2008). Two independent raters corrected a random 16% of all 534 tests, in order to estimate the accuracy of the scoring process. The intraclass correlation coefficient (ICC) indicated a satisfactory degree of scoring objectivity (ICC = 0.914).
To analyze the open-ended items, we used a multidimensional random coefficient multinomial logit model (Adams, Wilson, & Wang, 1997) and analyzed the polytomous database of varying scaling with the program ConQuest (Wu, Adams, & Wilson, 1997). Then, thresholds were estimated for the two groups: vocational novices (group 1) and advanced learners at the end of their training (group 2). A downward shift in the difficulty of items in a comparison of group 1 with group 2 would mean that the learners must have progressed in their vocational knowledge and ability, as the items were relatively easy for them to solve in comparison with vocational novices. In order to determine the difficulty of all items in both groups, we used a differential item functioning (DIF) approach. DIF analyses explore whether the probabilities of solving items differ between groups, after controlling for overall group performance (Holland & Wainer, 1993; Wilson, 2005, p. 165). For this purpose, the simple Rasch model was extended by a group term and by an interaction term between group membership and the single assessment items; this interaction functioned as the empirical criterion for the existence of differential difficulty between the groups.
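The extended model can be sketched as an item response probability. This is a schematic sketch of the model class, not ConQuest's parameterization; the function name and the sign convention (negative DIF means the item is relatively easier for the focal group) are assumptions based on the description above.

```python
import math

def p_correct(theta, delta, group_shift=0.0, dif=0.0):
    """Rasch model extended by a group term and an item-by-group (DIF) term.

    theta: person ability in logits; delta: item difficulty for the reference
    group; group_shift: overall difficulty reduction for the focal group;
    dif: item-specific deviation from that shift (the empirical DIF criterion).
    """
    logit = theta - (delta + dif - group_shift)
    return 1.0 / (1.0 + math.exp(-logit))

# At equal ability, an instructionally sensitive item (negative DIF) is
# easier for advanced learners than the group shift alone would predict:
p_novice = p_correct(theta=0.0, delta=0.5)
p_advanced = p_correct(theta=0.0, delta=0.5, group_shift=1.446, dif=-1.160)
```

If the interaction term is zero for every item, the group difference reduces to a uniform shift of all difficulties; DIF is exactly the item-wise departure from that uniform shift.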
6. Results
The item statistics suggest good fit for all items included in the model (0.81 ≤ wMNSQ ≤ 1.12) and satisfactory reliability values (EAP/PV reliability = 0.846). Applying the DIF approach to our database, we obtained the results given in Tables 2 and 3. As can be seen in Table 2, there was a significant difference in performance on our assessment from the beginning until the end of VET instruction. Given the large chi-square and p-value < 0.001, we reject H0 (that there is no difference between novices and advanced learners). The estimated value of vocational knowledge and ability of group 2 for the assessment was, on average, 1.446 logits higher than for group 1, a large effect size: tasks were harder for beginners than for advanced learners. This means that learners acquired a significant additional degree of vocational knowledge and ability during training, and that we were able to capture this with our developed assessment instrument (Hypothesis 1).
Apart from this general change on the scale of vocational knowledge and ability, it was also possible to look at each item's difficulty for the total sample and compare it to the difficulty in each group. If the change in difficulty from group 1 to group 2 was larger than what could be expected from the general advancement given in Table 2 (1.446), the item must have been exposed to DIF. Table 3 shows the item difficulty for the total sample, and changes in the subsamples.
With respect to Table 3 it is important to note that the DIF approach is a strictly relative analysis. That is, every item
Table 1
Sample description (n = 534).

Characteristic                 Group 1 (n = 136)          Group 2 (n = 398)          Statistical population*
Average years of initial VET   0.1 (= beginners)          2.3 (= advanced)           Ø 0.0 years (= beginners)
Age                            19.2                       21.3                       19.1
Gender                         Female: 56%                Female: 59%                Female: 58%
Educational career             Secondary school: 1%       Secondary school: 1%       Secondary school: 2%
                               Intermediate school: 30%   Intermediate school: 33%   Intermediate school: 36%
                               High school diploma: 69%   High school diploma: 66%   High school diploma: 62%
Migration background**         21%                        19%                        (not available)

* Numbers for the Federal Republic of Germany, according to the Federal Institute for Vocational Education and Training (BIBB), of apprentices entering their initial vocational training.
** Migration background was assessed in the questionnaire by asking for the language spoken at the apprentice's parental home.
Adams and Khoo (1996) advocate fit for items with a weighted mean square (wMNSQ) value from 0.75 to 1.33.
The expected a posteriori/plausible value (EAP/PV) reliability indicates how much variance in a person's estimated ability is accounted for by the measurement model, on average across all testees. As a scale reliability it can be compared to Cronbach's alpha and should be 0.80 or preferably higher for research designs based on correlative relations (Nunnally, 1978).
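One common operationalization of this coefficient is the ratio of the variance of the EAP ability estimates to that variance plus the mean posterior variance. A minimal sketch with made-up estimates; this illustrates the formula, not the data underlying the reported value of 0.846.

```python
import numpy as np

def eap_reliability(eap_estimates, posterior_variances):
    """EAP/PV reliability: Var(EAP) / (Var(EAP) + mean posterior variance)."""
    explained = np.var(eap_estimates, ddof=1)
    return explained / (explained + np.mean(posterior_variances))

# Illustrative person estimates (logits) and their posterior variances
eap = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
post_var = np.full(5, 0.1)
rel = eap_reliability(eap, post_var)
```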
According to Paek (2002), absolute differences on the logit scale less than 0.426
are negligible. Differences up to 0.638 indicate medium-sized effects, and those
higher than this level indicate a strong effect size for a learning progression.
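Paek's (2002) cut-offs can be written as a small classification helper; the treatment of the exact boundary values is my assumption.

```python
def dif_effect_size(dif_logits):
    """Classify an absolute DIF difference on the logit scale (Paek, 2002)."""
    size = abs(dif_logits)
    if size < 0.426:
        return "negligible"
    if size <= 0.638:
        return "medium"
    return "large"
```

Applied to the values reported in Table 3, this reproduces the paper's counts, e.g. item 1 (0.528) is medium and item 9 (1.160) is large.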
V. Deutscher, E. Winther / Learning and Instruction 53 (2018) 21–33
on the assessment shows a positive gain from beginner to advanced, in absolute terms. However, looking at the DIF from group 1 to group 2, it becomes obvious that all items that were disproportionately easier for subsample 2 compared to subsample 1 (indicated by a negative sign) were domain-specific tasks (ds). These assessment items were highly sensitive to instruction.
Domain-linked items (dl), on the other hand, underestimated the total improvement of learners during VET. This does not mean that learners did not improve in respect of their general abilities during VET (the group effect that adds to the DIF analysis was 1.446 logits, and thus was always larger than the disadvantage for advanced learners of an item being domain-linked), but that their improvement with respect to those items was less than what we would expect given the total learning progress on the scale of vocational knowledge and ability. In our assessment, nine items demonstrated negligible DIF, two (both domain-linked) demonstrated medium DIF, and five items (three domain-specific and two domain-linked) demonstrated large DIF.
In order to further quantify this difference, it is possible to calculate the average DIF for domain-linked and domain-specific items. Domain-linked items demonstrated an average DIF of 0.389 as a disadvantage for advanced learners. Domain-specific items, by contrast, demonstrated an average DIF of 0.648 as an advantage for advanced learners, indicating that domain-specific items were significantly more instructionally sensitive than items that addressed domain-linked contents (Hypothesis 2). In absolute terms, learners progress by 1.057 logits on domain-linked items and by 2.094 logits on domain-specific items. So the increase in specific knowledge and ability is, on average, roughly double the increase in domain-linked ability.
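These averages can be reproduced from the Table 3 values. In the sketch below, the negative signs (advantage for advanced learners) on the domain-specific DIF values are restored per the running text, and item 6, the exception discussed below, is excluded from the domain-specific average; both choices are my reading of the extraction, not stated explicitly in the table.

```python
group_shift = 1.446  # general advancement in logits (Table 2)

# Table 3 DIF values for domain-specific items 3, 5, 9, 10, 12, 15
# (signs restored; item 6, the noted exception, excluded)
ds_dif = [-0.986, -0.202, -1.160, -1.068, -0.370, -0.102]

avg_ds_dif = sum(ds_dif) / len(ds_dif)   # average ds advantage
ds_progress = group_shift - avg_ds_dif   # absolute progress on ds items
dl_progress = group_shift - 0.389        # reported average dl disadvantage
```

Under these assumptions the text's figures are recovered exactly: an average ds DIF of -0.648, an absolute ds progress of 2.094 logits, and a dl progress of 1.057 logits, roughly half as large.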
Fig. 1 summarizes all results graphically. The IRT Wright map on the left-hand side orders the items of the assessment for the total sample from least to most difficult. Items 6 and 11 were the most demanding with respect to the required quality of vocational knowledge and ability, items 12 and 1 were of about average difficulty, and items 9 and 10 were the easiest items of the assessment. The instructional function given in the middle of the graph illustrates that for advanced vocational learners (group 2), the average difficulty of the whole assessment dropped from 0.723 to -0.723. This in turn means that vocational knowledge and ability must have improved by 1.446 logits, given the higher probability of solving items. The last column shows the interaction of the single items with the duration of instruction. For example, item 9 had a DIF effect of -1.160 logits for group 2 (9.2) compared to group 1 (9.1).
If we add the DIF effect of (for example) item 9 to the total group effect (1.446), we can calculate the absolute difference in difficulty for this item between the two groups. This value (2.606), which can be read from Fig. 2, indicates the absolute distance of an item on the logit scale between the groups (from 1.9 to 2.9). Here, the first number of an item label indicates the group to which it belongs, the second the item name, and the third the threshold of the (possibly polytomous) item. Looking at the graph, it again becomes obvious that all items were easier for advanced learners (shaded items) than they were for vocational novices.
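The arithmetic behind the 2.606 logits for item 9 is simply the group shift plus the item's DIF advantage (sign restored from the text):

```python
group_shift = 1.446  # general group effect (Table 2)
item9_dif = -1.160   # DIF of item 9 for group 2 vs. group 1 (Table 3)

# Absolute distance of item 9 on the logit scale between the two groups
absolute_distance = group_shift - item9_dif
```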
Although it has been shown that domain-specific items were more instructionally sensitive and therefore advantaged advanced learners in an assessment, in our data there was one exception to this rule. The one item not fitting into this scheme was item 6 (see Appendix), which disadvantaged advanced learners. A possible explanation for this phenomenon is that gains in knowledge lead to more intra-individual cognitive conflicts (see Foster, 2011; Naumann et al., 2014; Vosniadou, 2007). As learners gain additional knowledge, newly acquired knowledge structures might conflict with existing knowledge, generating greater uncertainty in the answering process. This explanation becomes plausible when one looks at learners' answers. Item 6 asked students whether a binding purchase contract was in place, in respect of the business process described in the realistic scenario. Learners new to the domain mostly followed their intuition, arguing that no binding purchase contract was in place, as the purchase offer had already expired by the acceptance date (correct). Advanced learners, on the other hand, who could give a legal definition of a purchase contract, were often less sure whether a purchase
Table 2
General advancement in performance on the assessment for vocational knowledge and ability.

Subgroups                      Z    Ability estimate    Standard error    Chi-square (df)    p-value
Group 1 (novices)              1     0.723              0.017             1797.35 (1)        0.000
Group 2 (advanced learners)    2    -0.723              0.017
Table 3
Item difficulty for total sample and subsamples.

Item      Absolute item difficulty (total sample)    DIF (item × instruction) for group 2 vs. group 1    Error    Chi-square (df)    p-value
1 (dl)      0.026      0.528    0.038    698.48 (15)    0.000
2 (dl)      0.220      0.066    0.039
3 (ds)      0.793     -0.986    0.046
4 (dl)      1.126      0.786    0.035
5 (ds)      0.483     -0.202    0.044
6 (ds)      1.846      0.218    0.045
7 (dl)      0.603      0.058    0.038
8 (dl)      0.587      0.480    0.037
9 (ds)     -0.910     -1.160    0.046
10 (ds)    -0.926     -1.068    0.046
11 (dl)     1.201      0.160    0.044
12 (ds)     0.107     -0.370    0.045
13 (dl)     0.139      0.378    0.039
14 (dl)     0.210      0.360    0.039
15 (ds)     0.343     -0.102    0.037
16 (dl)     0.446      0.852    0.161
contract was in place, as their theoretical definition did not quite fit the situational setting of the business scenario.
7. Conclusion and discussion

The definition and detection of instructional sensitivity is not an isolated endeavor, but rather a matter of what is supposed to be taught in the classroom (curriculum), what is actually taught in the classroom (instruction), and how well tests and items align with what is taught (assessment). Instructional sensitivity should therefore be evaluated according to the notion of the curriculum-instruction-assessment triad (Pellegrino, 2012). With respect to Research Question 1, the results suggest that during vocational instruction, apprentices significantly improve their performance (a large effect) and that it is possible to track these changes in the quality of vocational knowledge and ability over the span of initial VET via an instructionally sensitive assessment of vocational knowledge and ability that aligns with the vocational curriculum.
The results strengthen the proposition that dual vocational learning is a powerful system for skill acquisition (Bonnal, Mendes, & Sofer, 2002; Griffin, 2016; OECD, 2008; OECD, 2010) positioned at the boundary between learning and working (Harteis, Rausch, & Seifried, 2014). The fact that almost all items proved instructionally sensitive over a relatively short period of time (about two years) points to dual VET being an effective education system for successfully conveying workplace-related knowledge and ability to adults.
With respect to Research Question 2, we were interested not only in whether the assessment was instructionally sensitive, but why it proved to be so. As we have demonstrated, theoretically and empirically, the question of instructional sensitivity is also a theoretical question in respect of the target construct assessed. The more generic the assessed knowledge and ability, the less sensitive this construct is to instruction; the more specific the assessed knowledge and ability, the more sensitive it is to instruction. Hence, the results empirically support Dreyfus and Dreyfus's (1980) profound conception of vocational learning as an expansion of novices' generic abilities through specific knowledge and ability, together allowing for the solving of situational problems. In the past this theory has been supported by qualitative research in diverse vocational domains (e.g., Benner, 2004; Campbell et al., 1992; Chmiel & Loui, 2004). The results also point to the possibility that for adults, the acquisition of specific knowledge and ability is less laborious, and more amenable to instruction, than is the acquisition of general abilities.
Fig. 1. DIF analysis for the assessment.
More surprising, however, is the finding that during VET, adults also significantly increase their general abilities, such as numeracy and literacy, although to a lesser extent. These abilities should already have been learnt at school, but were apparently only successfully acquired during VET. This challenges the notion that vocational learning consists solely of the transition from generic to specific knowledge. Rather, it appears that vocational learning settings also incidentally stimulate the acquisition of general abilities, presumably through experience and/or the didactical approaches of situated or problem-based learning (as suggested, e.g., by Brown, Collins, & Duguid, 1989; Lave & Wenger, 1991; Gruber, Harteis, & Rehrl, 2008). This suggestion is explicitly confirmed by research in the domain of management learning (Kolb & Kolb, 2009; see also the empirical research of Klotz, 2015).
On a practical level, the finding of different categories of vocational items with varying degrees of instructional sensitivity allows future assessment development to model item characteristics that can be systematically manipulated to produce items and assessments that prove to be instructionally sensitive. Via an ex ante classification of item specificity by experts, items that are sensitive to instruction, and that are therefore especially valuable with respect to the information they impart about learning success, could be identified.
However, this study is subject to several limitations that need to be considered and that may inspire future research endeavors. First, the results reported here are limited by the fact that a convenience sample of participants was used, and consequently the obtained results cannot be considered strongly generalizable. However, the two groups were remarkably similar in regard to the distributional characteristics of all collected variables, and to the general population of industrial apprentices in Germany. Second, the cross-sectional nature of the data did not allow for controlling the baseline achievement of the two subsamples. Third, as noted above, according to Polikoff (2010, p. 9), a finding of high sensitivity indicates both good instruction and a high-quality test that is sensitive to that instruction. However, this speaks only to the effectiveness of training, and it is possible that the test questions, or even the learning goals set in the curriculum, were relatively unchallenging. That is, while on the basis of our results the training appears to have achieved the desired outcomes adequately, more could actually have been achieved (see, e.g., Popham & Ryan, 2012). Therefore, on the basis of this study, we are not able to make a statement about the efficiency of the VET.
Another limitation can be seen in the way we assessed the authenticity of the assessment. For the purposes of determining and improving the authenticity of the assessment, we only gathered expert data. Authenticity, however, is in the eye of the beholder (Gulikers, Bastiaens, Kirschner, & Kester, 2008), so future research in the vocational domain should also gather data on apprentices' perceptions of the authenticity of an assessment, as the perspectives of experts and testees may yield different outcomes (Khaled, Gulikers, Biemans, & Mulder, 2015).
Finally, while the data showed significant progress in domain-linked and domain-specific knowledge and ability, they did not show why. So, while we know that the assessment was instructionally sensitive, we do not know which aspect or aspects of instruction yielded the educational outcomes. For instance, we are as yet unable to say which part of the dual education (the learning at a vocational school, or the working and learning at a training company) contributed most to this finding, given that instruction, for our sample, refers to the dual VET treatment as a whole. The same applies to the respective didactical methods used by the teachers in vocational schools and at the workplace.
Therefore, the exact causal mechanisms behind the learning processes observed here, on the boundary between learning and working, remain hidden. Future research might adopt a broader understanding of the topic of instructional sensitivity, including measures of the pedagogic quality of instruction, in order to probe the issue more deeply and, with respect to vocational education, to understand more fully the qualities and potentials of vocational training as an environment in which not merely domain-specific, but also broader, educational goals can be addressed.
Conflict of interest

The authors declare that they have no conflict of interest.

Acknowledgements

This research was supported by the German Research Foundation, within the projects "Competence development through enculturation" (KL 3076/2-1) and "Competence-oriented assessments in VET and professional development" (Wi 3597/1-2).
Fig. 2. Absolute item difficulty in each sample (group 1 and group 2).
References

Achtenhagen, F., & Weber, S. (2003). Authentizität in der Gestaltung beruflicher Lernumgebungen. In A. Bredow, R. Dobischat, & J. Rottmann (Eds.), Berufs- und Wirtschaftspädagogik von A-Z. Grundlagen, Kernfragen und Perspektiven. Festschrift für Günter Kutscha (pp. 185–199). Baltmannsweiler: Schneider.
Adams, R. J., & Khoo, S. T. (1996). Quest: Interactive test analysis system. Victoria: Australian Council for Educational Research.
Adams, R. J., Wilson, M., & Wang, W.-C. (1997). The multidimensional random coefficient multinomial logit model. Applied Psychological Measurement, 21, 1–23.
AERA, APA, & NCME. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
Benner, P. (2004). Using the Dreyfus model of skill acquisition to describe and interpret skill acquisition and clinical judgement in nursing practice and education. Bulletin of Science, Technology and Society, 24(1), 188–199.
Billett, S. (1994, December). Situated cognition: Reconciling culture and cognition. In Paper presented at Reforming post compulsory education and training: Reconciliation and reconstruction (Brisbane, Australia).
Bonnal, L., Mendes, S., & Sofer, C. (2002). School-to-work transition: Apprenticeship versus vocational schools in France. International Journal of Manpower, 23(5),
Brown, J. S., Collins, A., & Duguid, P. (1989). Situated cognition and the culture of learning. Educational Researcher, 18(1), 32–34.
Burstein, L. (1989, March). Conceptual considerations in instructionally sensitive assessment. In Paper presented at the annual meeting of the American Educational Research Association. San Francisco, CA.
Burstein, L., Aschbacher, P., Chen, Z., & Lin, L. (1990). Establishing the content validity of tests designed to serve multiple purposes: Bridging secondary-postsecondary mathematics. CSE Technical Report 313. Center for Research on Evaluation, Standards, and Student Testing. Los Angeles, CA: University of California, Los Angeles.
Campbell, R. L., Brown, N. R., & DiBello, L. A. (1992). The programmer's burden: Developing expertise in programming. In R. R. Hoffman (Ed.), The psychology of expertise: Cognitive research and empirical AI (pp. 269–294). New York: Springer.
Chen, J. (2012). Impact of instructional sensitivity on high-stakes achievement test items: A comparison of methods. Lawrence, KS: University of Kansas.
Chmiel, R., & Loui, M. C. (2004). Debugging: From novice to expert. In Proceedings of the 35th SIGCSE technical symposium on computer science education (pp. 17–21). New York, USA.
Court, S. (2013, November). DIF and SPGS: Implementing the Popham-Ryan design. In Presentation at the first international conference on instructional sensitivity (Lawrence, Kansas).
Dreyfus, S. E., & Dreyfus, H. L. (1980). A five-stage model of the mental activities involved in directed skill acquisition. Berkeley: University of California.
Foster, C. (2011). A slippery slope: Resolving cognitive conflict in mechanics. Teaching Mathematics and Its Applications, 30(4), 216–221.
Gelman, R., & Greeno, J. G. (1989). On the nature of competence: Principles for understanding in a domain. In L. B. Resnick (Ed.), Knowing and learning: Essays in honor of Robert Glaser (pp. 125–186). Hillsdale, NJ: Erlbaum Associates.
Glaser, R. (1990). Re-emergence of learning theory within instructional research. American Psychologist, 45(1), 29–39.
Goe, L. (2007). The link between teacher quality and student outcomes: A research synthesis. Washington, DC: National Comprehensive Center for Teacher Quality.
Greeno, J. G., Riley, M. S., & Gelman, R. (1984). Conceptual competence and children's counting. Cognitive Psychology, 16(1), 94–143.
Griffin, T. (2016). Costs and benefits of education and training for the economy, business and individuals. Adelaide: NCVER.
Gruber, H., Harteis, C., & Rehrl, M. (2008). Professional learning: Skill formation between formal and situated learning. In K. U. Mayer, & H. Solga (Eds.), Skill formation: Interdisciplinary and cross-national perspectives (pp. 207–229). Cambridge: Cambridge University Press.
Gulikers, J. T. M., Bastiaens, T. J., Kirschner, P. A., & Kester, L. (2008). Authenticity is in the eye of the beholder: Student and teacher perceptions of assessment authenticity. Journal of Vocational Education and Training, 60(4), 401–412.
Haladyna, T., & Roid, G. (1981). The role of instructional sensitivity in the empirical review of criterion-referenced test items. Journal of Educational Measurement, 18(1), 39–53.
Harteis, C., Rausch, A., & Seifried, J. (Eds.). (2014). Discourses on professional learning: On the boundary between learning and working. Dordrecht: Springer.
Hasselhorn, M., Baethge, M., Füssel, H.-P., Hetmeier, H.-W., Maaz, K., Rauschenbach, T., et al. (2014). National Educational Report. Education in Germany 2014: An indicator-based report including an analysis of the situation of people with special educational needs and disabilities. Federal Ministry of Education and Research: Bertelsmann.
Hoffman, L., Hofer, S. M., & Sliwinski, M. J. (2011). On the confounds among retest gains and age-cohort differences in the estimation of within-person change in longitudinal studies: A simulation study. Psychology and Aging, 26(4), 778–791.
Holland, P. W., & Wainer, H. (1993). Differential item functioning. Hillsdale, NJ:
Kao, C.-F. (1990). An investigation of instructional sensitivity in mathematics achievement test items for U.S. eighth grade students. Los Angeles: University of
Khaled, A., Gulikers, J., Biemans, H., & Mulder, M. (2015). How authenticity and self-directedness and student perceptions thereof predict competence development in hands-on simulations. British Educational Research Journal, 41(2), 265–286.
Klotz, V. K. (2015). Diagnostik beruflicher Kompetenzentwicklung: Eine wirtschaftsdidaktische Modellierung für die kaufmännische Domäne. Berlin: Springer.
Klotz, V. K., Winther, E., & Festner, D. (2015). Modeling the development of vocational competence: A psychometric model for business domains. Vocations and Learning, 8(3), 247–268.
Kolb, A. Y., & Kolb, D. A. (2009). Experiential learning theory: A dynamic, holistic approach to management education and development. In S. J. Armstrong, & C. V. Fukami (Eds.), The SAGE handbook of management learning, education and development (pp. 42–68). London: Sage.
Lave, J., & Wenger, E. (1991). Situated learning: Legitimate peripheral participation. Cambridge: Cambridge University Press.
Linn, R. L. (1983). Curricular validity: Convincing the courts that it was taught without precluding the possibility of measuring it. In G. F. Madaus (Ed.), The courts, validity, and minimum competency testing (pp. 115–132). Boston, MA: Kluwer-Nijhoff Publishing.
Li, M., Ruiz-Primo, M. A., Giamellaro, M., Wills, K., Mason, H., & Lan, M.-C. (2012b). Instructional sensitivity and transfer of learning at different distances: Close, proximal and distal assessment items. In Paper presented at the AERA annual meeting (Vancouver, Canada).
Li, M., Ruiz-Primo, M. A., & Wills, K. (2012a). Comparing methods to estimate the instructional sensitivity of items. DEISA. Paper 4.
Masters, J. R. (1988). A study of the differences between what is taught and what is tested in Pennsylvania. In Paper presented at the annual meeting of the National Council on Measurement in Education. New Orleans, LA.
Mislevy, R. J., & Haertel, G. D. (2006). Implications of evidence-centered design for educational testing. Educational Measurement: Issues and Practice, 25(1), 6–20.
Mulder, M., Weigel, T., & Collins, K. (2006). The concept of competence in the development of vocational education and training in selected EU member states: A critical analysis. Journal of Vocational Education and Training, 59(1),
Naumann, A., Hochweber, J., & Hartig, J. (2014). Modeling instructional sensitivity using a longitudinal multilevel differential item functioning approach. Journal of Educational Measurement, 51(4), 381–399.
Nickolaus, R., Lazar, A., & Norwig, K. (2012). Assessing professional competences and their development in vocational education in Germany: State of research and perspectives. In S. Bernholt, K. Neumann, & P. Nentwig (Eds.), Making it tangible: Learning outcomes in science education (pp. 141–161). Münster:
Nunnally, J. C. (1978). Psychometric theory. New York: McGraw-Hill.
Oates, T. (2004). The role of outcome-based national qualifications in the development of an effective vocational education and training system: The case of England and Wales. Policy Futures in Education, 2(1), 53–71.
OECD. (2003). The PISA 2003 assessment framework: Mathematics, reading, science and problem solving knowledge and skills. Paris: OECD.
OECD. (2008). Costs and benefits in vocational education and training. Paris: OECD.
OECD. (2010). Learning for jobs: Synthesis report of the OECD reviews of vocational education and training. Paris: OECD.
Paek, I. (2002). Investigation of differential item functioning: Comparisons among approaches, and extension to a multidimensional context. Berkeley, CA: University of California.
Pellegrino, J. W. (2012). The design of an assessment system focused on student achievement: A learning sciences perspective on issues of competence, growth and measurement. In S. Bernholt, K. Neumann, & P. Nentwig (Eds.), Making it tangible: Learning outcomes in science education (pp. 79–107). Münster:
Pellegrino, J. W., Chudowsky, N., & Glaser, R. (Eds.). (2001). Knowing what students know: The science and design of educational assessment. Washington, DC: National Academy Press.
Pham, V. H. (2009). Computer modeling of the instructionally insensitive nature of the Texas Assessment of Knowledge and Skills (TAKS) exam. Austin: The University of Texas at Austin.
Phillips, S. E., & Mehrens, W. A. (1988). Effects of curricular differences on achievement test data at item and objective levels. Applied Measurement in Education, 1(1), 33–51.
Polikoff, M. S. (2010). Instructional sensitivity as a psychometric property of assessments. Educational Measurement: Issues and Practice, 29(4), 3–14.
Popham, J. W. (2007). Instructional insensitivity of tests: Accountability's dire drawback. Phi Delta Kappan, 89(2), 146–155.
Popham, W. J., & Ryan, J. M. (2012, April). Determining a high-stakes test's instructional sensitivity. In Paper presented at the annual meeting of the National Council on Educational Measurement. Vancouver, B.C., Canada.
Rausch, A., Seifried, J., Wuttke, E., Kögler, K., & Brandt, S. (2016). Reliability and validity of a computer-based assessment of cognitive and non-cognitive facets of problem-solving competence in the business domain. Empirical Research in Vocational Education and Training, 8(9).
Rovinelli, R. J., & Hambleton, R. K. (1977). On the use of content specialists in the assessment of criterion-referenced test item validity. Dutch Journal of Educational Research, 2(1), 49–60.
Rüegg-Stürm, J. (2004). Das neue St. Galler Management-Modell. In R. Dubs, D. Euler, J. Rüegg-Stürm, & C. Wyss (Eds.), Einführung in die Managementlehre (pp. 65–134). Bern: Haupt.
Ruiz-Primo, M. A., Li, M., Wills, K., Giamellaro, M., Lan, M.-C., Mason, H., et al. (2012). Developing and evaluating instructionally sensitive assessments in science. Journal of Research in Science Teaching, 49(6), 691–712.
Ryan, M., Fook, J., & Hawkins, L. (1995). From beginner to graduate social worker: Preliminary findings of an Australian longitudinal study. British Journal of Social Work, 25(1), 17–35.
Salthouse, T. A., & Tucker-Drob, E. M. (2008). Implications of short-term retest effects for the interpretation of longitudinal change. Neuropsychology, 22(1),
Seeber, S. (2008). Ans
atze zur Modellierung beruicher Fachkompetenz in
annischen Ausbildungsberufen. Zeitschrift für Berufs- und Wirt-
adagogik, 104(1), 74e97.
Seeber, S. (2015, April). The impact of institutional training settings on competence
development in the area of social and health care. In Presentation at the AERA
annual meeting (Chicago).
Seeber, S., Ketschau, T., & Rüter, T. (2016). Structure and level of vocational
competence of Medical assistance. Unterrichtswissenschaft, 44(2), 185e203.
Shavelson, R. J., & Seminara, J. (1968). Effect of lunar gravity on man's performance
of basic maintenance tasks. Journal of Applied Psychology, 52(3), 177e183 .
Switzer, D. M. (1993). Differential item functioning and opportunity to learn:
Adjusting the Mantel-Hansel chi-square procedure. Practical Assessment,
Research &Evaluation, 13(7), 1e16 .
Vosniadou, S. (2007). The cognitive-situative divide and the problem of conceptual
change. Educational Psychologist, 42(1), 55e66.
Way, W. (2014). Memorandum on instructional sensitivity considerations for the
PARCC assessments. Washington DC: Pearson.
Weber, S., Wiethe-K
orprich, M., Bley, S., Weiß, C., Draxler, C., & Gürer, C. (2016).
Modellierung und Validierung eines Intrapreneurship-Kompetenz-Modells bei
Industriekaueuten. Unterrichtswissenschaft, 44(2), 149e168.
Wiliam, D. (2007). Sensitivity to instruction: The missing ingredient in large-scale
assessment systems?. In Paper presented at the annual meeting of the interna-
tional association for educational assessment (Baku, Azerbaijan).
Wilson, M. (2005). Constructing measures: An item response modeling approach.
Mahwah, NJ: Lawrence Erlbaum Associates.
Wilson, M. (2008). Cognitive diagnosis using item response models. Journal of
Psychology, 216(2), 74e88.
Winther, E. (2010). Kompetenzmessung in der beruichen Bildung. Bielefeld: W.
Bertelsmann Verlag.
Winther, E., & Achtenhagen, F. (2009). Measurement of vocational competencies. A contribution to an international large-scale-assessment on vocational education and training. Empirical Research in Vocational Education and Training, 1(1).
Worthy, C. (1996). Clinical ladders: Can we afford them? Nursing Management, 27(1), 33–34.
Wu, M. L., Adams, R. J., & Wilson, M. R. (1997). ConQuest: Multi-aspect test software.
Camberwell: Australian Council for Educational Research.
Yoon, B., & Resnick, L. B. (1998). Instructional validity, opportunity to learn and equity: New standards examinations for California Mathematics Renaissance. CSE Technical Report 484. Los Angeles: National Center for Research on Evaluation, Standards, and Student Testing (CRESST), University of California.
Yu, L., Lei, P.-W., & Suen, H. K. (2006). Using a differential item functioning (DIF) procedure to detect differences in opportunity to learn (OTL). Paper presented at the annual meeting of the American Educational Research Association, San Francisco, CA.
V. Deutscher, E. Winther / Learning and Instruction 53 (2018) 21–33
... Apprenticeship is a type of vocational training in which people learn a trade through a mixture of classroom instruction and on-the-job training under the guidance of a trained person. According to Deutscher et al. (2018), post-VET apprenticeship performance is typically linked to program efficiency. It suggests that vocational education has a significant impact on the development of learners' professional knowledge and skills [56]. ...
... The expected a posteriori/plausible value (EAP/PV) reliability was also computed for the CT practices test. It indicates how much variance in an individual's estimated ability is accounted for by the measurement model, averaged over all test takers (Deutscher & Winther, 2018). This reliability is comparable to Cronbach's alpha, and values higher than 0.70 indicate acceptable reliability (Nunnally, 1978). ...
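The EAP reliability described in this excerpt can be computed directly from the person parameter output of an IRT calibration, as the share of ability variance the model recovers: var(EAP) / (var(EAP) + mean posterior error variance). The sketch below runs on simulated values; the numbers and variable names are hypothetical, not output of the cited study:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-person EAP ability estimates and posterior SDs, as an
# IRT package (e.g., ConQuest or TAM) would report them after calibration.
eap = rng.normal(0.0, 1.0, size=500)       # EAP point estimates
post_sd = rng.uniform(0.3, 0.6, size=500)  # posterior standard deviations

# EAP reliability: ability variance recovered by the measurement model
# relative to that variance plus the average posterior error variance.
var_eap = eap.var(ddof=1)
mean_err_var = (post_sd ** 2).mean()
eap_reliability = var_eap / (var_eap + mean_err_var)

print(round(eap_reliability, 3))
```

Like Cronbach's alpha, the resulting value lies between 0 and 1, with values above 0.70 conventionally read as acceptable.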
... We conducted the DIF analysis within the 3PL modeling framework to explore whether the probabilities of solving the CT practices test items differed between the control (group 1) and intervention (group 2) schools at the end of the intervention (T3), while controlling for the overall performance of the entire sample. This DIF analysis served as an empirical criterion for differences between groups attributable to the intervention (Deutscher & Winther, 2018). Effect sizes were estimated to determine which CT practices abilities are more strongly influenced by the intervention and which are less so. ...
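The matched-group logic behind such a DIF check can be illustrated with the classical Mantel-Haenszel common odds ratio, which compares the odds of solving an item between two groups after stratifying on total score (cf. Switzer, 1993, in the reference list above). The sketch below uses simulated data; group labels, sample sizes, and item parameters are invented for illustration and do not reproduce the study's 3PL procedure:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated responses: 400 learners, 10 dichotomous items, two groups.
n, n_items = 400, 10
group = rng.integers(0, 2, size=n)                # 0 = reference, 1 = focal
ability = rng.normal(0, 1, size=n) + 0.5 * group  # focal group more able
difficulty = np.linspace(-1.5, 1.5, n_items)
p = 1 / (1 + np.exp(-(ability[:, None] - difficulty[None, :])))
resp = (rng.random((n, n_items)) < p).astype(int)

def mantel_haenszel_or(item, group, total):
    """Common odds ratio for one item, stratified by matched total score."""
    num = den = 0.0
    for t in np.unique(total):
        m = total == t
        a = np.sum((group[m] == 0) & (item[m] == 1))  # reference correct
        b = np.sum((group[m] == 0) & (item[m] == 0))  # reference incorrect
        c = np.sum((group[m] == 1) & (item[m] == 1))  # focal correct
        d = np.sum((group[m] == 1) & (item[m] == 0))  # focal incorrect
        nt = m.sum()
        num += a * d / nt
        den += b * c / nt
    return num / den if den > 0 else float("nan")

total = resp.sum(axis=1)
odds = [mantel_haenszel_or(resp[:, j], group, total) for j in range(n_items)]
# Ratios far from 1 flag items whose solving probability differs between
# groups after matching on overall performance, i.e. candidate DIF items.
print(np.round(odds, 2))
```

Because the simulated group difference is a pure ability shift that the total-score matching removes, the odds ratios here should scatter around 1; a real instructional effect on a specific item would push its ratio away from 1.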
Computational thinking (CT) practices is a critical cognitive tool for students to overcome challenges in this technological era. Few studies investigated CT practices from a development perspective. A two-year intervention was implemented in 30 intervention and 22 control schools in Hong Kong to promote CT education (T1 (baseline): N = 13,056; T2: N = 16,367; and T3: N = 15,269). Senior primary students (grades 4 to 6) around 8–12 years old were targeted. Among them, 34% were girls at T1 (baseline). Their learning progression of cognitive development was assessed with a CT practices test. Data were analyzed using item response analysis. Learning trajectories of CT practices were modeled to understand how students’ cognitive development differed between schools. Insights of findings were discussed in bringing CT practices to life.
... Contemporary WPL research has, on the one hand, indeed shown that learning occurs at workplaces and that it supports the development of specific knowledge, abilities and generic working-life competences (Kyndt et al., 2014; Klotz et al., 2015; Deutscher and Winther, 2018), intuition and expertise (Harteis and Billett, 2013), and positive personal traits and attitudes toward work, such as a vocational identity (Billett, 2007; Klotz et al., 2015). On the other hand, large differences have become apparent in the extent to which workplaces are indeed rich learning environments (Negrini et al., 2016), affecting competence development and drop-out rates from educational WPL programs. ...
Purpose This study aims to support researchers and practitioners in finding suitable instruments for future research studies and organizational quality assessments. Design/methodology/approach Employees’ success in learning at work is strongly influenced by the quality of the workplace learning environment. In recent decades, growing effort has gone into the development of surveys to measure the quality of workplace learning, resulting in a large number of available survey instruments. This study conceptually draws on a 3-P model and uses a qualitative metasynthesis to collect and categorize n = 94 surveys that intend to measure the quality of workplace learning (WPL). Findings The results underline that research on WPL environments is a highly interdisciplinary endeavor, where every discipline enriches the field with a new perspective and its own foci. Overall, this study finds a focus on learning culture and working conditions, on social and functional inclusion of the learner, and on support and feedback during training. Products of WPL, such as professional competences or career aspirations, play a minor role. Originality/value With its integration of quality measurement instruments from various research studies, this study produces an interactive online instrument map that gives a broad, yet organized overview of available quality measures in the WPL field.
... Based on technology-based instruments for competence diagnostics, information on the performance of apprentices at the end of training is now available for various occupations. The results indicate, however, that the (curricularly required) problem-solving and reflective competences are not consistently attained (Seeber & Seifried 2019), whereas the domain-specific vocational competences can partly (though not without exception) be realized at a high level (Deutscher & Winther 2018). This finding raises questions, because in many occupational fields the future will increasingly demand mastering non-routinized, complex tasks. ...
In this contribution, we take stock of the outcomes and potentials of the ASCOT initiative. The central ideas of the various projects are presented along the C-I-A triad. The information on apprentices' performance generated by technology-based competence diagnostics allows a closer look, for various occupations, at possibilities for fostering competence.
... Responding to changes that occur so quickly, vocational education practice must prepare innovative ways to use management knowledge in producing high-quality, competitive vocational education graduates. Related research states that knowledge management can also be applied to assess changes in learning and the effectiveness of testing systems, teachers, or schools (Deutscher & Winther, 2018). ...
This study analyzes the differences between the levels of management knowledge possessed by vocational high school teachers. The problem addressed in this research is the shortage of vocational teachers who teach productive subjects, which creates a gap in the skills possessed by graduates. This study uses a quantitative approach with chi-squared analysis; data were collected by distributing online questionnaires and documents. The sample comprised 62 respondents, obtained through a non-probability sampling technique, namely saturated sampling. The results showed that there was no significant difference in knowledge management between the teacher categories (vocational and non-vocational). This means that every teacher who teaches at Muhammadiyah vocational schools has the same knowledge competence in designing, organizing, carrying out tasks, and controlling school programs.
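The chi-squared comparison described in this abstract follows the standard Pearson test of independence on a contingency table. A minimal sketch on an invented 2 × 3 table (the counts are hypothetical, not the study's data):

```python
import numpy as np

# Hypothetical 2x3 contingency table: teacher category (vocational /
# non-vocational) by knowledge-management level (low / medium / high).
observed = np.array([[8, 15, 9],
                     [10, 13, 7]])

# Pearson chi-square: compare observed counts with the counts expected
# under independence of row and column variables.
row = observed.sum(axis=1, keepdims=True)
col = observed.sum(axis=0, keepdims=True)
expected = row @ col / observed.sum()
chi2 = ((observed - expected) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

# The critical value for alpha = .05 with 2 degrees of freedom is 5.991;
# a statistic below it means no significant association, as reported here.
print(round(chi2, 3), dof)
```

With these invented counts the statistic falls well below the critical value, mirroring the "no significant difference" conclusion of the abstract.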
... The school can provide training to improve teachers' competence (Wahyuni, Agustini, Sindu, & Sugihartini, 2020), such as pedagogical competence (Zain, 2021; Budaghyan, 2015). Thus, instruction in vocational education can be structured according to needs (Deutscher & Winther, 2018). This means that teacher competence plays a very important role in facing and developing education according to 21st-century skill standards. ...
This study aims to determine the level of interest of vocational school teachers in the implementation of the competency certification program organized by the National Professional Certification Agency. This competency certification is part of what is required in the era of globalization to produce quality graduates. The study uses a qualitative approach with a survey method and was conducted at a vocational school in Bekasi. Data were collected using interviews, documents, and questionnaires. The data analysis process follows the Miles and Huberman model through data reduction, data presentation, drawing conclusions, and triangulation. The results show that 40% of schools encouraged teachers to take part in competency certification test training. Teachers need competency improvement activities, such as training in mastery of competency certification test materials, and report difficulties in keeping up with developments in knowledge and technology. The data also show that 60% of teachers are interested in competency certification and wish to take the competency certification test.
... Competency tests in vocational education schools should have a comprehensive multi-step structure. Basically, different levels of competence can be distinguished, including procedural ability and translation capabilities (Deutscher and Winther, 2018; Winther and Klotz, 2013). All of these capabilities provide operational features that meet the needs of global businesses. ...
This evaluation aimed to determine the context of implementing the student skill competency assessment in office administration at Dharma Karya Vocational High School, South Jakarta; the readiness of students, educators, and education personnel; financing; facilities and infrastructure; the process of implementing the assessment; and the achievement of program implementation. The method used in this evaluation is a descriptive qualitative method based on empirical data and facts, and the model used is the CIPP model (Context, Input, Process, Product). Data collection was carried out scientifically and included observation, structured interviews, and documentation. Interviews were conducted with the Principal, the Deputy Head of Curriculum, the Deputy Head of Facilities and Infrastructure, the Head of the Office Administration Department, the Chair of the Examination Committee, internal examiners, external examiners, and students. The data obtained were then analyzed using source triangulation, data display, and conclusion drawing. The evaluation concludes that the background for the student skill competency assessment at Dharma Karya Vocational High School, South Jakarta, is a government policy toward vocational high schools that aligns with environmental needs and the vision and mission of the school.
... More commonly known as internship, work-integrated learning (WIL) is a distinct education experience which emphasises practical application of knowledge in workplace. WIL is useful for extending the education-to-work transition and improving the performance of VET students (Deutscher & Winther, 2018). Unlike countries such as Germany and Switzerland, educational tracking in Hong Kong is normally initiated when a student is about to reach adulthood. ...
Applied degree programmes are an innovative form of vocational education and training. The aim of this chapter is to examine the challenges and opportunities of the implementation of an existing vocational degree programme and its transformation into an applied degree. The Bachelor of Science in Horticulture, Arboriculture and Landscape Management, BSc(HALM), a degree programme at the Technological and Higher Education Institute of Hong Kong, is used throughout as a case study. The purposes of BSc(HALM) and the methods to achieve them are examined. Knowledge classification, workplace training, and the cultivation of transferable skills are explored. The process of incorporating the positive impacts of COVID-19 is elaborated to evince the evolution of applied degree programmes. The challenges and potential solutions, at individual, programme, industry, and social levels, are identified and discussed. Teaching and learning experiences serve as evidence to support the arguments in this chapter. Keywords: Applied degree; Vocational education and training; Urban greening; Urban forestry; Horticulture; Arboriculture
Vocational education has a vital role in overall social development and contributes to improving learners' quality. Thus, learners' preparation in competencies and work professionalism through vocational education is critical. The current problem is the 8.49% unemployment rate among vocational education graduates, which shows their lack of ability to compete in the 21st century. The right solution to realize quality, industry-recognized vocational education is to develop vocational schools into centres of excellence that improve teacher and learning capacity. This research is a literature review. Generally, this paper presents the link-and-match concept, the key links and matches between vocational high schools (Sekolah Menengah Kejuruan/SMK) and industry, and the centre of excellence. This paper aims to explain the link-and-match model through the centre of excellence to improve vocational education quality.
Background To measure higher-order outcomes of vocational education and training (VET) we developed a computer-based assessment of domain-specific problem-solving competence. In modeling problem-solving competence, we distinguish four components of competence: (1) knowledge application, (2) metacognition, (3) self-concept, and (4) interest as well as thirteen facets of competence, each of which is assigned to one of the four components. Methods With regard to ecological and content validity, rather than apply highly structured items (e.g. multiple choice items), we developed three authentic problem scenarios and provided an open-ended problem space in terms of an authentic office simulation. The assessment was aimed at apprentice industrial clerks at the end of a 3-year apprenticeship program and focused on the domain of controlling (i.e., support of managerial decisions, cost planning, cost control, cost accounting, etc.). The computer-based office simulation provided typical tools (e.g., email client, spreadsheet software, file system, notebook, calculator, etc.). In order to assess the non-cognitive components in our competence model, we implemented an integrated measurement of self-concept and interest that we refer to as ‘Embedded Experience Sampling’ (EES). Test-takers are requested to spontaneously answer short prompts (EES items) during the test that are embedded in typical social interactions in the workplace. The empirical section is based on a study with 780 VET students from three commercial training occupations in Germany (industrial clerks and apprentices from two similar VET programs). The focus of the contribution is on testing a theoretically derived competence model based on item response theory, the implemented scoring methods and reliability of the instrument. 
Fine-grained response patterns from automated codings and human ratings were condensed into one partial credit item for each scenario and each of the facets in the cognitive component ‘knowledge application’. Results The multidimensional Rasch analysis revealed satisfactory EAP/PV reliabilities, which are between .78 and .84 for the ‘knowledge application’ facets and between .77 and .85 for the non-cognitive facets. Furthermore, the achievement differences between the industrial clerks and their comparison groups are as assumed. Conclusions In our study, we introduced an innovative method to measure non-cognitive facets of problem-solving competence in the course of complex problem scenarios. Furthermore, by using authentic problem scenarios and providing an open-ended and authentic problem space, our assessment of domain-specific problem-solving competence focuses on ecological validity but also ensures reliability.
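The partial credit items mentioned in this abstract score each scenario in ordered categories; in Masters' partial credit model, the probability of reaching category k grows with the ability surplus over the accumulated step difficulties. A minimal sketch with invented step difficulties (not the study's estimates):

```python
import numpy as np

def pcm_probs(theta, deltas):
    """Partial credit model: category probabilities for one item.

    theta  -- person ability
    deltas -- step difficulties delta_1..delta_m (category 0 has no step)
    """
    # Cumulative logits: psi_0 = 0, psi_k = sum over j <= k of (theta - delta_j)
    psi = np.concatenate(([0.0], np.cumsum(theta - np.asarray(deltas))))
    expv = np.exp(psi - psi.max())  # subtract max for numerical stability
    return expv / expv.sum()

# Illustrative item with three score categories (0, 1, 2) and invented
# step difficulties; a more able person earns the top category more often.
probs_low = pcm_probs(-1.0, [-0.5, 0.8])
probs_high = pcm_probs(1.5, [-0.5, 0.8])
print(np.round(probs_low, 3), np.round(probs_high, 3))
```

Condensing fine-grained codings into one such polytomous item per scenario keeps the ordered information in the response pattern while staying within the Rasch family, which is what makes the reported EAP/PV reliabilities comparable across facets.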
This article discusses the development of vocational competence through economic vocational educational training (VET) from a theoretical and psychometric perspective. Most assessment and competence models tend to adopt a state perspective toward assessments of competence and carve out different structures of competence for diverse vocational domains. However, the order and at what stages of development these identified structures actually occur remains uncertain. This study therefore moves beyond a static perspective to denote changes in competence over the duration of vocational training, using item response theory-based scaling and a cross-sectional database of 877 economic apprentices. The resulting four-stage psychometric model represents a systematization of the development of vocational competence, characterized by the degree of occupational specificity and different forms of cognitive processing. This proposed psychometric model can be used to inform educational researchers and practitioners about the different stages of competence development, such that they can both assess and teach economic competence more effectively.
This book analyses and elaborates on learning processes within work environments and explores professional learning. It presents research indicating general characteristics of the work environment that support learning, as well as barriers to workplace learning. Themes of professional development, lifelong learning and business organisation emerge through the chapters, and contributions explore theoretical and empirical analyses on the boundary between working and learning in various contexts and with various methodological approaches. Readers will discover how current workplace learning approaches can emphasise the learning potential of the work environment and how workplaces can combine the application of competence, that is working, with its acquisition or learning. Through these chapters, we learn about the educational challenge to design workplaces as environments of rich learning potential without neglecting business demands. Expert authors explore how learning and working are both to be considered as two common aspects of an individual’s activity. Complexity, significance, integrity, and variety of assigned work tasks as well as scope of action, interaction, and feedback within its processing, turn out to be crucial work characteristics, amongst others revealed in these chapters. Part of the Professional and Practice-based Learning series, this book will appeal to anyone with an interest in workplaces as learning environments: those within government, community or business agencies and within the research communities in education, psychology, sociology, and business management will find it of great interest.
Abstract: Within a globalized, knowledge-oriented world, creative and innovative thinking and acting increasingly become a "key qualification". This study deals with the construction of an instrument for a large-scale assessment that captures and measures innovative thinking and acting at business workplaces. We operationalized and modelled the corresponding latent construct as intrapreneurship (IP) competence, drawing on business as well as teaching-learning research. The model is validated with 51 technology-based test items on a cross-sectional ad hoc sample of N = 357 industrial clerks at the end of their 2.5- resp. 3-year apprenticeship in summer/autumn 2013. The test data were IRT- resp. Rasch-scaled. By running Andersen likelihood-ratio and Wald tests, we checked whether model assumptions, e.g., common discrimination for each item, were violated, and applied the One-Parameter Logistic Model (OPLM) for correction. The statistics show acceptable to good fit values for representing the modelled IP competence: (1) in line with theory, a two-dimensional model structure was found. (2) After the OPLM correction, the items show comparable discrimination values and no relevant DIF effects for subgroups such as gender, age, education, desired occupation, and size of the apprenticeship firm. (3) The constructed items have acceptable fit values (EAP/PV reliabilities are 0.64 for dimension I and 0.78 for dimension II). (4) Furthermore, the technology-based items cover the modelled content area sufficiently in width and depth.
Our study thus provides evidence for a test instrument that measures the trend-setting intrapreneurship competence in a reliable and valid way. For generalizing this model, further replication studies are necessary. The item pool also offers first tasks for fostering IP competence in the practice of business education. Key words: Intrapreneurship, Competence, Apprenticeship, Item Response Theory (IRT), Rasch scaling, One-Parameter Logistic Model (OPLM)
A central issue for social work educators is to delineate the process of how social work students become competent practitioners. Previous literature has tended to concentrate on value and attitudinal change as a result of education. The authors are currently engaged in a five-year longitudinal study of a cohort of 39 Victorian social workers which explores this issue, as well as the changes in knowledge, skills and theory use. The study also explored whether the Dreyfus model of skill acquisition was applicable in social work. This paper describes the study and its preliminary findings at mid-point in the study with respondents having completed two years of social work education and now practising as social workers. Participants have been interviewed twice yearly and data consists of participants' descriptions of critical incidents from their course experience and responses to case vignettes. After content analysis, preliminary findings indicate that some stages similar to the Dreyfus model, can be identified. Particular themes emerged including a predominantly individualizing approach to problems, a reluctance to deal with men and situations involving conflict and a disillusionment about the nature of social work. Some implications of these findings, including competency-based education, are discussed.