Firm-sponsored Training and Poaching

Externalities in Regional Labor Markets∗

Samuel Muehlemann

University of Berne

January 31, 2008

Abstract

This paper examines whether firms are less likely to provide ap-

prenticeship training in dense local labor markets, where the probabil-

ity that workers can be poached by other firms after training is high.

Regional labor markets are defined based on travel time rather than

travel distance or political borders. The results show that firms provide

less training in dense labor markets. Applying count data hurdle mod-

els to Swiss firm-level data, it can be shown that the effect is strongest

at the extensive margin, i.e. to provide any training, but less severe at

the intensive margin, which is the number of apprenticeship positions

offered once the training decision has been made.

JEL Classification: J23, J24, J42

Keywords: Regional labor markets, poaching, firm density, apprentice-

ship training

∗The study is based on two surveys from the years 2000 and 2004. The first is financed

by the Commission for Technology and Innovation (CTI credit 4289.1 BFS and 5630.1

BFS), and the second by the Swiss Federal Office for Technology and Innovation. The

author acknowledges the assistance of the Swiss Federal Statistical Office. We thank Guido

Imbens, Steve Machin, Catherine Sofer, Rainer Winkelmann and Ludger Woessmann for

helpful comments and discussions. The usual disclaimer holds.

1 Introduction

Apprenticeship training is of great importance, especially in German-speaking

countries, but also in other OECD member countries. In Switzerland,

about 60% of a cohort enroll in a dual apprenticeship program after com-

pulsory schooling. Since firms are free to choose whether they want to hire

apprentices and train them or simply recruit workers in the external la-

bor market, it is crucial to understand why firms would be willing to train

apprentices.

One important determinant of the decision to train is the cost associated
with it. In Switzerland, training apprentices is profitable on average (for a
recent study see Muehlemann et al. 2007). Conversely, an apprenticeship
program is costly from the firm's perspective in Germany (see Beicht et al.
2004). Firms for which training results in net costs during the
apprenticeship period must somehow be able to recoup these costs afterwards,
since otherwise it would not be rational for them to train apprentices. The
key determinant of the firm's training decision is whether it is able to
retain a former apprentice after graduation as a skilled worker and pay a
wage below worker productivity. In a perfectly competitive labor market,
firms and workers have full information about productivity, and workers are
not restrained by mobility costs. Hence, in a competitive setting, firms
would not be able to set a wage below productivity without losing the
apprentice with probability one. Since about one third of the firms in
Switzerland incur net costs of training, these firms must be able to recoup
at least part of their investment; otherwise their behavior would not be
rational from an economic perspective. While part of the positive net costs
could also be explained by firm-specific human capital, it is very unlikely
that firm-specific human capital accounts for all of these costs, since the
curriculum of vocational training is designed so that apprentices are taught
mainly general skills, with the goal of increasing mobility in the labor
market.

While labor markets in the real world are not completely competitive, eco-

nomic theory of industrial organization predicts that competition increases

with the number of firms within a market. If training apprentices is indeed

mainly general from a human capital point of view, then an apprentice can

use these skills for productive activities in any other firm. While workers will

not receive a wage equal to their marginal productivity if there are frictions

in the labor market (see e.g. Acemoglu and Pischke 1998), the gap between

wage and productivity is negatively related to labor market competition.

Hence, the training probability should be lower if many other firms in the

same industry are located close to the training firm, because increased

competition narrows the gap between wage and worker productivity and

therefore reduces the probability that a firm can recoup its training

investments. Note that the threat of poaching is only relevant for firms

whose net costs of training are positive.

Studies for the U.K. (Brunello and Gambarotto 2007) and Italy (Brunello and
De Paola 2008) have shown that regional effects such as the density of the
labor market play an important role in employer-provided training. While
Switzerland is a relatively small country with a well-developed
transportation system, there are still regions where traveling is very
time-consuming, especially in mountainous or more rural areas. Switzerland is
divided into 26 cantons, so it would be a natural starting point to define
regional labor markets by political borders, especially since Switzerland is
a federal state, i.e. cantons have a high degree of independence with regard
to labor market policies and schooling. But while some cantons are too large
to be considered a single regional labor market, a number of cantons are very
small, and the economic activities of firms located in such cantons are
closely tied to those of firms located in neighboring cantons. In this paper,
regional labor markets are therefore defined by travel time. The relevant
regional labor market for a firm is considered to be the area that can be
reached by car in less than half an hour. A previous study for Switzerland
(Muehlemann and Wolter 2007) has shown that regional effects matter for
employer-provided training.1

The effect of labor market competitiveness on the firm's demand for
apprentices is estimated at the extensive margin, i.e. whether firms provide
training or not, and at the intensive margin, which is the number of
apprentices hired by training firms. The results show that a dense labor
market significantly reduces the probability that firms offer training.
Furthermore, firms that train apprentices offer slightly fewer apprenticeship
positions if they are located in a region with many other firms close by that
could potentially poach their apprentices.

1 This paper extends and improves the earlier study in two ways. First, it
pools two cross-sectional data sets instead of relying on a single
cross-section. Second, the use of additional econometric methods makes it
possible to test further hypotheses.

The paper is organized as follows: Section 2 describes the Swiss apprentice-

ship system. Section 3 briefly discusses the theory on firm training. Section

4 introduces the data and the sample design. Section 5 introduces the em-

pirical estimation strategies and presents the results. Section 6 concludes.

2 The Swiss apprenticeship system

Dual apprenticeships are the most important part of the post-compulsory

schooling system, with more than 60% of young adults per cohort enrolling

in this form of vocational training each year. While an apprenticeship

certificate can also be obtained by graduating from full-time school-based

forms of education, dual apprenticeship programs are the most popular, with

a share of over 88% of total vocational training.2 In total, there are over

200 professions to choose from. Although one of the virtues of the

apprenticeship system is its inclusiveness for less academically inclined

school leavers (Switzerland has one of the lowest percentages of the over-16

population not having attended any form of non-compulsory schooling in the

OECD), apprentices can qualify for further education at the tertiary level.

There is

a possibility to acquire a professional baccalaureate (either part-time dur-

ing, or full-time after an apprenticeship program), which gives access to

the universities of applied sciences (Fachhochschule), and - with an extra-

curriculum - even to universities. Furthermore, there is an opportunity to

enroll in higher vocational education programs at the tertiary level (ISCED

5B). The proportion of apprentices that continue their education at the

tertiary level has steadily risen over the last decade. Hence, from the per-

spective of an individual, a dual apprenticeship program is in no way a dead

end.

2 Apprenticeship programs correspond to the OECD classification ISCED 3C.

The two main types of apprenticeship training programs last either three or
four years. During this time, an apprentice spends, depending on the training
profession, about 1 to 2 days per week in a public vocational school. During
the remainder of the time, the apprentice either receives further on-the-job
training from in-house training personnel or participates in the production
process. In 2004, Swiss firms invested about 4.7 billion Swiss francs in the
training of apprentices (about 1% of GDP), while the value of the
apprentices' productive work during the training program amounted to 5.2
billion Swiss francs (Muehlemann et al., 2007). While apprenticeship training
is profitable on average, about one third of all apprenticeship contracts end
with positive net costs for the training firm. These figures make it clear
that apprentices are, on average, not just cheap substitutes for low-skilled
labor, but that they nevertheless make a significant contribution to the
firm's production process. The apprenticeship contract ends automatically at
the end of the training program. Hence, if the employer and the apprentice
want to continue their employment relationship, they have to negotiate a new
labor contract. In Switzerland, mobility of apprentices is relatively high:
only 37% of all apprentices remain in the training firm one year after
graduation. This fact is consistent with the observation that only one third
of firms incur positive net costs of training. Since the Swiss labor market
is relatively flexible, especially compared to other (European) countries,3
firms are forced to train apprentices in a cost-efficient manner, because the
probability that part of the training costs can be recouped later on depends
on whether apprentices remain in the training firm or not.

3 Theory

From a theoretical point of view, it is of interest whether the firm or the

worker pays for training. In a perfectly competitive labor market, firms will

not invest in the training of their workers if the acquired human capital is

purely general in the spirit of Becker (1964) or transferable in the sense of

Stevens (1994), such that the skills can be used productively in other firms.

Instead, the workers have to pay for their own training, either directly by

paying a tuition fee, or indirectly by accepting a wage below their produc-

tivity during the training period.

The recent training literature has tried to explain the frequently observed
phenomenon that firms are willing to pay for general or transferable training
of their workers, which contradicts traditional human capital theory.4 There
are several theoretical models that can explain such behavior if one
considers imperfections in the labor market. For example, Acemoglu and
Pischke (1998, 1999) argue that firms will find it optimal to invest in
general training if there are frictions in the labor market, such as search
costs, asymmetric information about worker productivity, firm-specific human
capital, efficiency wages or other wage floors. They show that although firms
invest in training, the equilibrium outcome will not be efficient. Instead,
there will be under-investment in training because not all revenues can be
internalized, since not all apprentices remain in the training firm later on.
Firms anticipate this behavior ex ante, which results in a deviation from the
efficient allocation and hence in sub-optimal investment in training.
Therefore, if the net costs of training apprentices are positive, as assumed
in the model of Acemoglu and Pischke, training will decrease with the share
of the revenues that cannot be internalized. Acemoglu and Pischke (1998)
assume that firms are able to counter any wage offers from outside firms that
want to poach their apprentices afterwards, which would result in a winner's
curse, because the training firm would bid up to the point where the outside
option of a worker exceeds productivity. On the other hand, Acemoglu and
Pischke (1998) allow for the possibility that workers might be unhappy in the
training firm, which gives them a disutility if they remain in the same firm.
Thus, if mobility costs are sufficiently high, workers will not change firms
even if they are unhappy. However, if there are many potential employers in
the same local labor market, mobility costs decrease, and more apprentices
will leave the training firm after graduation. Firms operating in dense labor
markets will anticipate this behavior and therefore, if training apprentices
is costly, will not provide training if the probability that an apprentice
quits after graduation is sufficiently high. Consequently, denser labor
markets would result in less firm-sponsored training. Stevens (1996)
developed a theoretical model in which the threat of poaching has an adverse
effect on firm-sponsored general training.

3 See e.g. OECD (1999, p. 57).

4 For a comprehensive summary of the literature on firm training see Leuven
(2005).

There is also a potential positive effect of dense local labor markets due

to knowledge spill-over and diffusion, improved matching due to higher

turnover or lower transportation costs in dense areas, which would result in

higher productivity (see e.g. Ciccone and Hall 1996; Ciccone 2002). Firms

can only achieve higher productivity if they have a skilled workforce that

is able to adapt new knowledge and technologies (see e.g. Acemoglu 2002).

Brunello and De Paola (2008) highlight an endogeneity problem that might

arise because skilled workers potentially move to regions with a high degree

of knowledge spill-over, since they will be more productive in such regions

due to better technology. However, this endogeneity problem is not likely to

play an important role in the case of vocational training, since young adults

enter these programs at the age of 16, when they usually still live with their

parents and have high mobility barriers. As well, vocational training is a

relatively small part of a firm’s overall strategy, hence it is unlikely that a

firm would relocate for the sole reason of avoiding poaching externalities.

4 Data

4.1 Survey design and data

The data used in this paper are from two representative surveys conducted

in Swiss firms in the years 2000 and 2004 by the Centre for Research in

Economics of Education at the University of Berne and the Swiss Federal

Statistical Office. The survey was carried out at the establishment level.

All establishments with more than 50 workers were included in the survey

population, whereas firms with fewer than 50 employees were drawn at random.

The Federal Statistical Office calculated the appropriate weights to account

for the survey structure.5 The data set contains a total of 7,593 firms, of

which 4,312 firms train and 3,281 firms do not train

apprentices. A total of 1,265 firms have been excluded because they either

operate in the whole country, are part of the federal government or use a

centralized training scheme. The reason why these firms have been excluded

is that regional labor market characteristics do not influence their training

decision. Furthermore, firms that cannot make independent decisions about

apprenticeship training, because they are part of a larger enterprise, have

been excluded as well. Detailed data on the number of workers, training

profession and the number of skilled workers is available at the firm level

(see summary statistics in Table 8).

5 All calculations in this paper have been performed using survey weights.

4.2 Regional labor markets

Switzerland is a small country with an area of only 41,000 square kilometers.
Despite its small size, Switzerland has a federal system with 26 cantons,
each with its own government and parliament. Cantons have the power to decide
on the level of cantonal taxes and, to a large extent, on the structure of
the education system. They also have some decision power with regard to labor
market regulations. The lowest level of political decision-making is the
commune, which can still make its own decisions about communal taxes and, to
some extent, about the education system.

Another criterion for a firm's choice of location is the road and railway
infrastructure. While transportation is easy and well developed between the
major cities in Switzerland, travel time to more remote areas can increase
substantially. Hence, the potential employees of a firm are located in a
certain area around the firm. Obviously, it is very difficult to define such
a region exactly. From a worker's perspective, travel time to work is costly.
Thus, for a given wage, a worker will reject a job offer if travel costs are
too high, i.e. if the firm is located too far away. While travel distance is
costly for a worker due to the costs of gasoline or public transportation,
these costs are usually small compared to the costs of travel time. From an
economic perspective, it is more important to focus on travel time rather
than travel distance, especially because travel time is not simply a monotone
function of travel distance. For example, a worker A who lives close to a
firm may have to spend more time traveling to work than a worker B who lives
further away but has direct access to a highway or fast public
transportation.

Summing up, there are several possibilities to define local labor markets:

1. Firstly, political borders such as cantons could be used to define a

region of economic activity. But since some cantons are very small

and potentially border several other cantons (see Figure 2), it is highly

unlikely that the relevant labor market for a firm stops at the cantonal

border.

2. The second possibility is to define regional labor markets based on

travel distance. The advantage of this approach is that it can be im-

plemented rather easily by using a coordinate system. The drawback of

this approach is that such regions might not reflect an area of economic

activity, if there are e.g. mountains or lakes that hinder traveling.

3. Lastly, one could define local labor markets based on travel time. This

might be the appropriate choice if time is seen as the driving factor

of transportation costs. The disadvantage of this approach is that the

borders of a region are somewhat ad-hoc, because maximum travel

time has to be defined. In addition, the definition of regions is rather

complicated because there is no computer-based software that auto-

matically assigns such a region to a firm based on travel time. As well,

one has to decide whether travel time of private or public transporta-

tion should be used.

Given the federalistic structure of Switzerland with many small cantons and

geographic factors such as mountainous areas and lakes that can lengthen

travel time considerably, even for small distances, a labor market definition

based on travel time seems to be the appropriate choice for the potential

supply of workers to a firm.

Table 1: Descriptive statistics of regions

Variable                                      Mean     Median  Min    Max    Obs
Average share of training firms in a region   0.322    0.318   0.129  0.620  67
                                              (0.100)
Average number of local firms in the same     0.023    0.018   0.001  0.093  67
industry per hectare                          (0.021)

Regions are defined as follows: each of the 67 largest Swiss cities and towns
forms the center of a region. All towns that can be reached from this center
by car within 30 minutes constitute a regional labor market.6

6 The limit of 30 minutes travel time was chosen based on Swiss census
information (year 2000) showing that the majority of the working population
has a travel time below 30 minutes. Only 16% of all workers have a travel
time above 30 minutes. Furthermore, because the research question about the
threat of poaching is targeted at very young people between ages 17 and 19, a
travel time longer than 30 minutes seems rather inadequate, since young
people are often more mobility constrained, e.g. because they might not own a
car. Moreover, if the true maximum travel time that individuals are willing
to accept were longer, then the effect of poaching should be smaller. The
travel time was measured with the software "Microsoft Autoroute 2005".

In densely populated areas, regions can overlap, in the sense that a firm
located at the intersection of two regions could potentially poach
apprentices of firms situated in both regions (see Figure 3). It should be
noted that a firm always belongs to a single region when it enters the
analysis as an observation of the dependent variable. If a firm is situated
in a town that is not large enough to constitute a regional center of its
own, it is assigned to the closest regional center.7 The overlapping regions
are only relevant for the construction of the independent variable "local
number of firms in the same industry per hectare". Table 1 shows the
descriptive statistics of the average training ratio and the variable of
interest, the number of local firms in the same industry per hectare.
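The region construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the town names, travel times, firm counts and areas are invented, and dividing the firm count by the total area of the towns in a region is an assumed way of obtaining "firms per hectare", not necessarily the procedure used for the paper's data.

```python
# Hypothetical travel times (minutes by car) from regional centers to towns.
# All names and numbers are illustrative, not taken from the survey data.
travel_minutes = {
    "CenterA": {"CenterA": 0, "Town1": 12, "Town2": 28, "Town3": 45},
    "CenterB": {"CenterB": 0, "Town2": 25, "Town3": 18, "Town4": 31},
}
firms_same_industry = {
    "CenterA": 40, "CenterB": 25, "Town1": 5, "Town2": 10, "Town3": 8, "Town4": 3,
}
area_hectares = {
    "CenterA": 900, "CenterB": 600, "Town1": 300, "Town2": 500, "Town3": 400, "Town4": 200,
}

MAX_MINUTES = 30  # travel-time threshold stated in the text


def region_members(center):
    """Towns reachable from `center` within MAX_MINUTES (regions may overlap)."""
    return sorted(t for t, m in travel_minutes[center].items() if m <= MAX_MINUTES)


def density(center):
    """Firms in the same industry per hectare within the region of `center`.

    Assumption: the denominator is the summed area of all member towns.
    """
    towns = region_members(center)
    firms = sum(firms_same_industry[t] for t in towns)
    area = sum(area_hectares[t] for t in towns)
    return firms / area


for c in travel_minutes:
    print(c, region_members(c), round(density(c), 4))
```

In this toy example "Town2" belongs to both regions, which mirrors the overlap discussed in the text: its firms count toward the density of each region it can be reached from.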

5 Econometric models and empirical analysis

The empirical estimation strategy proceeds in two steps. In the first sub-

section, the effect of labor market density on the number of apprenticeship

positions offered by firms is estimated by local polynomial regression. In

the second subsection, the effect of labor market density is estimated by

applying multivariate count data models.

5.1 Local polynomial regression

In this subsection, the functional form of the number of apprentices $n_i$
hired by a firm with respect to regional labor market density is estimated by
local polynomial regression. The regression model is of the form

\[ n_i = m(d_i) + \varepsilon_i, \qquad i = 1,\dots,7593, \]

where $d_i$ denotes the number of local firms in the same industry per
hectare for firm $i$. I am interested in the functional form $m(d)$, which is
assumed to be linear in the neighborhood of $d_0$, such that
$m(d) = a_0 + b_0(d - d_0)$ in the neighborhood of $d_0$.8 The local linear
regression estimator minimizes

\[ \sum_{i=1}^{N} K\!\left(\frac{d_i - d_0}{h}\right)
   \bigl(n_i - a_0 - b_0(d_i - d_0)\bigr)^2 \]

w.r.t. the parameters $a_0$ and $b_0$, where $K$ denotes the kernel weighting
function and $h$ the bandwidth. As a result,
$\hat{m}(d) = \hat{a}_0 + \hat{b}_0(d - d_0)$ in the neighborhood of $d_0$.
There are different estimators that can be applied. An Epanechnikov kernel
with a first-degree polynomial has been used in the regression displayed in
Figure 1.9

7 Obviously, it would be optimal to construct a region, including the number
of firms in the same industry per hectare, for each firm in the sample.
However, it is doubtful whether the construction of over 7,500 individual
regions would have improved the analysis substantially.

8 See Cameron and Trivedi (2006), p. 320.

Obviously, this regression only serves descriptive purposes to illustrate the

bivariate relationship between the demand for apprentices and labor market

density.

[Figure 1: Local polynomial regression. Local polynomial smooth of the number
of apprentices (vertical axis) on the local number of firms in the same
industry per hectare (horizontal axis), with 95% CI; kernel = epanechnikov,
degree = 1, bandwidth = .03, pwidth = .05.]

As shown in Figure 1, there is a negative relationship between local labor

market density and the number of apprentices a firm hires.
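A minimal sketch of the local linear estimator from this subsection is given below, written in Python with NumPy rather than Stata. The data are synthetic, the function names are hypothetical, and the kernel uses the textbook Epanechnikov form on [-1, 1]; kernel scaling conventions differ across software packages, so the output need not match what -lpoly- would report for the same bandwidth.

```python
import numpy as np


def epanechnikov(u):
    """Textbook Epanechnikov kernel: 0.75 * (1 - u^2) for |u| <= 1, else 0."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)


def local_linear(d, n, d0, h):
    """Minimize sum_i K((d_i - d0)/h) * (n_i - a0 - b0*(d_i - d0))^2.

    Returns (a0, b0); a0 is the estimate of m(d0), b0 the local slope.
    """
    w = epanechnikov((d - d0) / h)
    X = np.column_stack([np.ones_like(d), d - d0])
    XtW = X.T * w                      # weight each observation by its kernel value
    a0, b0 = np.linalg.solve(XtW @ X, XtW @ n)
    return a0, b0


# Synthetic illustration only: a linear, downward-sloping relationship plus noise.
rng = np.random.default_rng(0)
d = rng.uniform(0.0, 0.15, size=2000)
n = 0.8 - 3.0 * d + rng.normal(0.0, 0.1, size=2000)

m_hat, slope_hat = local_linear(d, n, d0=0.05, h=0.03)
print(m_hat, slope_hat)
```

Evaluating `local_linear` on a grid of `d0` values traces out the fitted curve; the negative local slope in this toy data plays the role of the downward-sloping smooth in Figure 1.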

5.2 Count data models

The number of apprentices that a firm chooses to train is a count variable
that only takes on nonnegative integer values. Hence, ordinary least squares
regression is not the optimal choice for the analysis. A natural starting
point to analyze count data is the Poisson regression model. Let
$n_i = 0,1,2,\dots$ denote the number of apprentices employed by firm $i$ and
$d$ the local labor market density.10 Then (see e.g. Greene 2003, p. 740),

\[ \mathrm{Prob}(N_i = n_i \mid x_i) = \frac{e^{-\lambda_i}\lambda_i^{n_i}}{n_i!},
   \qquad n_i = 0,1,2,\dots \]

where $\ln\lambda_i = x_i'\beta$ in the standard loglinear version of the
Poisson model; $\beta$ is the coefficient vector of the explanatory variables
$x_i$. The expected number of apprentices hired by a firm is given by

\[ E[n_i \mid x_i] = \mathrm{Var}[n_i \mid x_i] = \lambda_i = e^{x_i'\beta} \]

The individual marginal effect of a small change in $x_i$ on
$E[n_i \mid x_i]$ is therefore

\[ \frac{\partial E[n_i \mid x_i]}{\partial x_i} = \lambda_i\beta
   = E[n_i \mid x_i]\,\beta \]

The coefficient vector $\beta$ can also be interpreted directly as the
relative change in $E[n_i \mid x_i]$ associated with a small change in $x_i$
(see Winkelmann 2003, p. 68), since

\[ \beta = \frac{\partial E[n_i \mid x_i] / E[n_i \mid x_i]}{\partial x_i} \]

The parameter vector $\beta$ can be estimated by maximum likelihood, and the
estimator is therefore efficient. The log-likelihood function is

\[ \ln L = \sum_{i=1}^{k} \left[-\lambda_i + n_i x_i'\beta - \ln n_i!\right] \]

The requirement that the mean is equal to the variance is referred to as
equidispersion, but it is often violated in practice. In the data used here,
the mean of the number of apprentices [E(n) = 0.7] is much smaller than the
variance [Var(n) = 4.2]. This problem is referred to as overdispersion. While
the Poisson regression still yields consistent estimates even if there is
overdispersion, the standard errors will be grossly deflated (see e.g.
Cameron and Trivedi 2006, p. 670).

9 The estimations were carried out in Stata using the -lpoly- command.

10 To estimate the regression models, the "cluster" command implemented in Stata has

been applied, where a regional labor market denotes a cluster. It is assumed that unob-

served effects within a cluster are uncorrelated with the regressors. Therefore, it suffices

to adjust the standard errors of the regression coefficients, because the point estimates re-

main unchanged. For a detailed treatment of data with cluster structure, see e.g. Cameron

and Trivedi (2006), pp. 829.
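The Poisson log-likelihood and its maximization can be sketched as follows. This is a hedged illustration on synthetic data, not the paper's estimation: the regressor, the coefficient values and the function names are assumptions, the fit uses plain Newton-Raphson, and neither survey weights nor cluster-robust standard errors are handled.

```python
import numpy as np
from math import lgamma


def poisson_loglik(beta, X, n):
    """ln L = sum_i [-lambda_i + n_i * x_i'beta - ln(n_i!)], lambda_i = exp(x_i'beta)."""
    xb = X @ beta
    log_fact = np.array([lgamma(k + 1.0) for k in n])   # ln(n_i!)
    return float(np.sum(-np.exp(xb) + n * xb - log_fact))


def fit_poisson(X, n, iters=25):
    """Newton-Raphson: score X'(n - lambda), negative Hessian X' diag(lambda) X."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        lam = np.exp(X @ beta)
        beta = beta + np.linalg.solve((X.T * lam) @ X, X.T @ (n - lam))
    return beta


# Synthetic data with hypothetical coefficients (intercept 0.5, slope -1.0).
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=5000)
X = np.column_stack([np.ones_like(x), x])
true_beta = np.array([0.5, -1.0])
n = rng.poisson(np.exp(X @ true_beta))

beta_hat = fit_poisson(X, n)
print(beta_hat, poisson_loglik(beta_hat, X, n))
```

Because $\beta$ is the relative change in the conditional mean, a slope of -1.0 in this toy setup means that a small increase of 0.01 in the regressor lowers the expected count by roughly 1%.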

A typical alternative that allows for overdispersion in the data is the
negative binomial model. It can be interpreted as a generalization of the
Poisson model that introduces unobserved heterogeneity, such that
$\ln\mu_i = x_i'\beta + \varepsilon_i$. Hence,

\[ E[n_i \mid x_i, \varepsilon_i] = e^{x_i'\beta + \varepsilon_i} = \mu_i = h_i\lambda_i \]

where $h_i = e^{\varepsilon_i}$ is assumed to have a gamma distribution with
mean normalized to 1 and variance $1/\delta$. Thus,
$E[n_i \mid x_i] = \lambda_i$ since $E[h_i] = 1$. Therefore, the
interpretation of the parameter vector $\beta$ remains the same as in the
Poisson regression model.11
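The Poisson-gamma mixture can be illustrated by simulation. In the sketch below, the mean of 0.7 is the sample mean reported in the text, while the value of $\delta$ is a hypothetical choice made only so that the implied variance $\lambda + \lambda^2/\delta$ equals the reported sample variance of 4.2; it is not an estimate from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

lam = 0.7       # Poisson mean, set to the sample mean reported in the text
delta = 0.14    # hypothetical value; Var(h) = 1/delta

# h ~ Gamma(shape=delta, scale=1/delta) has mean 1 and variance 1/delta.
h = rng.gamma(shape=delta, scale=1.0 / delta, size=1_000_000)

# Conditional on h, the count is Poisson with mean h * lam.
n = rng.poisson(lam * h)

sample_mean = n.mean()
sample_var = n.var()
theoretical_var = lam + lam**2 / delta   # variance of the mixture

print(sample_mean, sample_var, theoretical_var)
```

The simulated mean stays close to $\lambda$ while the variance is several times larger, which is the overdispersion pattern the negative binomial model is designed to accommodate.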

Overall, about 70% of Swiss firms in our sample do not train any apprentices
(see Table 2). Neither a Poisson nor a negative binomial model is able to
predict such an "excess of zeros". To overcome this problem, so-called hurdle
models can be applied to relax the assumption that zeros and positive
outcomes are the results of the same data generating process. A count data
hurdle model combines a binary model for the training decision with a
truncated-at-zero count data model for the number of apprentices employed by
training firms (for a detailed exposition see e.g. Winkelmann 2003). From an
economic perspective, these models can be interpreted as a two-stage decision
process.

Table 2: Apprentices hired by firms

Apprentices            0      1      2      3      4      5      6      7+
Frequency (%)          70.05  13.25  9.08   3.61   1.82   0.78   0.43   0.98
Cumulative freq. (%)   70.05  83.3   92.38  96     97.82  98.6   99.02  100

In this paper, firms are assumed to first decide whether apprenticeship
training is desirable in their firm or not. Many small and specialized firms
do not engage in apprenticeship training because they do not have the
necessary infrastructure or training personnel. For some firms, the relevant
profession in which they have a need for skilled workers might not even exist
in the form of an apprenticeship program, hence they are forced to recruit
workers on the external labor market. Once the basic decision has been made
as to whether apprentices should be trained or not, a firm will choose the
optimal number of apprentices it wants to hire. A hurdle model makes it
possible to test whether the effects of the variables of interest are the
same at the hurdle and for positive outcomes of the count variable.

11 Different distributions could be chosen for the error term
$\varepsilon_i$, such as the normal distribution, which leads to a
Poisson-log-normal model. For a detailed exposition see e.g. Winkelmann
(2003). See Appendix A for a detailed derivation of the negative binomial
model as a Poisson-gamma mixture.
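The "excess of zeros" can be made concrete with a back-of-the-envelope check. The sketch below compares the observed share of non-training firms from Table 2 with the zero probability of a Poisson distribution that has the reported sample mean. It is an unconditional comparison without covariates, so it only illustrates the size of the gap and is not a test of the regression models.

```python
from math import exp

mean_apprentices = 0.7         # sample mean reported in the text
observed_zero_share = 0.7005   # share of firms with zero apprentices (Table 2)

# For a Poisson distribution with mean lambda, P(n = 0) = exp(-lambda).
poisson_zero_share = exp(-mean_apprentices)
excess_zeros = observed_zero_share - poisson_zero_share

print(round(poisson_zero_share, 3), round(excess_zeros, 3))
```

A Poisson distribution with mean 0.7 puts just under half of its mass at zero, well below the roughly 70% of firms that do not train.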

The zero outcomes are determined by a density $f_1(\cdot)$, such that
$f(n = 0) = f_1(0)$. Positive outcomes are determined by a truncated density
$f_2(\cdot)$, such that
$f(n = k) = \frac{1 - f_1(0)}{1 - f_2(0)} f_2(k)$, $k = 1,2,3,\dots$. Hence,
the probability distribution of a hurdle-at-zero model is given by

\[ g(n) = \begin{cases}
     f_1(0) & \text{if } n = 0 \\
     \dfrac{1 - f_1(0)}{1 - f_2(0)}\, f_2(n) & \text{if } n \geq 1
   \end{cases} \]

A standard model is the Poisson hurdle model proposed by Mullahy (1986), with
$f_1$ and $f_2$ being two Poisson distributions, where
$\lambda_{1i} = e^{x_i'\beta_1}$ and $\lambda_{2i} = e^{x_i'\beta_2}$. The
advantage of this model is that it nests the simple Poisson model. Hence, the
parametric restriction $H_0: \beta_1 = \beta_2$ can be tested using a Wald
test. If $H_0$ cannot be rejected, the Poisson hurdle model reduces to the
standard model since $f_1 = f_2$. The likelihood function of the Poisson
hurdle model (see e.g. Winkelmann 2003) is given by

\[ L = \prod_{i=1}^{k} \exp(-\exp(x_i'\beta_1))^{d_i}\,
       [1 - \exp(-\exp(x_i'\beta_1))]^{1-d_i}
       \times \left[ \frac{\exp(-\exp(x_i'\beta_2))\exp(n_i x_i'\beta_2)}
                          {n_i!\,(1 - \exp(-\exp(x_i'\beta_2)))} \right]^{1-d_i} \]

where $d_i = 1 - \min\{n_i, 1\}$ indicates a zero outcome (and is distinct
from the density measure $d$).
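The hurdle density and the corresponding log-likelihood can be written down directly from these definitions. The following Python sketch is illustrative only: the function names are hypothetical, and it evaluates the likelihood for given coefficients without maximizing it or handling survey weights.

```python
import numpy as np
from math import exp, lgamma, log


def poisson_pmf(k, lam):
    """Poisson probability exp(-lam) * lam^k / k!, computed on the log scale."""
    return exp(-lam + k * log(lam) - lgamma(k + 1.0))


def hurdle_pmf(n, lam1, lam2):
    """g(n): f1(0) at zero; (1 - f1(0)) / (1 - f2(0)) * f2(n) for n >= 1."""
    f1_zero = exp(-lam1)
    if n == 0:
        return f1_zero
    f2_zero = exp(-lam2)
    return (1.0 - f1_zero) / (1.0 - f2_zero) * poisson_pmf(n, lam2)


def hurdle_loglik(beta1, beta2, X, n):
    """Log-likelihood of the Poisson hurdle model with lam_ji = exp(x_i'beta_j)."""
    lam1 = np.exp(X @ beta1)
    lam2 = np.exp(X @ beta2)
    return sum(log(hurdle_pmf(int(k), l1, l2)) for k, l1, l2 in zip(n, lam1, lam2))


# The hurdle probabilities sum to one, and equal rates reproduce the Poisson model.
total = sum(hurdle_pmf(k, 0.4, 1.5) for k in range(60))
print(total, hurdle_pmf(3, 0.9, 0.9), poisson_pmf(3, 0.9))
```

Setting `lam1 == lam2` makes the rescaling factor equal to one, which is the nesting property that the Wald test of $H_0: \beta_1 = \beta_2$ exploits.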

Besides the Poisson hurdle model, there are other possible specifications,

such as the logit-negative binomial hurdle model, which combines a logit

model for the hurdle and a truncated-at-zero negative binomial model for

positive outcomes. The results of this model will be presented in the next

section.12

12Many other specifications are possible, such as the Probit-Poisson-log-normal model,

with a probit model at the hurdle and a truncated-at-zero Poisson-log-normal distribution

for positive outcomes (see e.g. Winkelmann 2003 for a detailed treatment, or Muehlemann

et al. 2007 for a recent empirical application to vocational training). This model will not

be applied in this paper because it has not yet been implemented for the use of survey

weights and the calculation of cluster-robust standard errors.

The main interest of this paper is the elasticity of the firm's demand for
apprentices $n$ with respect to the local labor market density $d$ at the
extensive margin, which is given by

\[ \eta_1 = \frac{\partial P(n > 0)}{\partial d}\,\frac{d}{P(n > 0)} \]

and the elasticity at the intensive margin, given by

\[ \eta_2 = \frac{\partial E(n \mid n > 0)}{\partial d}\,\frac{d}{E(n \mid n > 0)} \]

The overall elasticity equals $\eta_1 + \eta_2$ since
$E(n) = P(n > 0)E(n \mid n > 0)$.
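The additivity of the two elasticities follows from taking logarithms of the decomposition of $E(n)$. A short derivation, using only the symbols defined above and the identity $\partial \ln f / \partial \ln d = (\partial f / \partial d)(d / f)$, is:

```latex
\begin{align*}
\ln E(n) &= \ln P(n > 0) + \ln E(n \mid n > 0) \\
\frac{\partial \ln E(n)}{\partial \ln d}
  &= \frac{\partial \ln P(n > 0)}{\partial \ln d}
   + \frac{\partial \ln E(n \mid n > 0)}{\partial \ln d} \\
\frac{\partial E(n)}{\partial d}\,\frac{d}{E(n)} &= \eta_1 + \eta_2
\end{align*}
```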

The advantage of hurdle models is that they allow for different effects of the

independent variables in different parts of the distribution. For example, the

effect of labor market density could have a negative impact at the hurdle, i.e.

on the training decision, but no impact on the demand for apprentices once

a firm has decided to train. The simple Poisson model does not allow for

this because both $\eta_1$ and $\eta_2$ are functions of a single index with

parameter $\beta$.

The estimation strategy is as follows: First, simple Poisson and negative bi-

nomial regression models are estimated and then compared to hurdle models.

Second, it is tested whether hurdle models are the appropriate statistical

tools to model the firm’s demand for apprentices.

5.3 Results

The results of the simple Poisson regression (Table 4) show that the local

number of firms per hectare has a negative and highly significant effect

on the number of apprentices hired by firms. This result remains robust

if a negative binomial model is applied. While the null hypothesis of no

overdispersion can be rejected by a likelihood ratio test of the dispersion

parameter (H0: α = 0), the coefficient on local labor market density remains

negative and significant (Table 5).

The Poisson hurdle model gives additional insights with regard to the effect

of local labor market density (Table 6). While the coefficient of interest is

negative and highly significant at the hurdle, it is still negative but small and

not significant for positive outcomes. To test the hurdle model against the

simple Poisson model, a Wald test can be performed. The null hypothesis
