Identification of Social Interactions through Partially Overlapping Peer Groups
ABSTRACT In this paper, we demonstrate that, in a context where peer groups do not overlap fully, it is possible to identify all the relevant parameters of the standard linear-in-means model of social interactions. We apply this novel identification structure to study peer effects in the choice of college major. Results show that one is more likely to choose a major when many of her peers make the same choice. We also show that peers can divert students from majors in which they have a relative ability advantage, with adverse consequences on academic performance, entry wages, and job satisfaction. (JEL I23, J24, J31, Z13)

Identification of Social Interactions through Partially
Overlapping Peer Groups
By Giacomo De Giorgi, Michele Pellizzari and Silvia Redaelli∗
In this paper we demonstrate that, in a context where peer groups do
not overlap fully, it is possible to identify all the relevant parameters
of the standard linear-in-means model of social interactions. We apply
this novel identification structure to study peer effects in the choice of
college major. Results show that, indeed, one is more likely to choose
a major when many of her peers make the same choice. We also show
that peers can divert students from majors in which they have a relative
ability advantage, with adverse consequences on academic performance,
entry wages and job satisfaction.
JEL: J0, I21
Keywords: Social interactions, peer effects, identification, mismatch.
The importance of peers in shaping individual behavior has been widely recognized in
both the economic and the sociological literature (Matthew O. Jackson 2006). Numerous
studies have produced empirical evidence documenting the existence of relevant peer
effects in many areas, from schooling performance to criminal behavior and financial
decisions (Anne Case and Lawrence F. Katz 1991, Caroline Hoxby 2000, Bruce Sacerdote
2001, Esther Duflo and Emmanuel Saez 2003, Piero Cipollone and Alfonso Rosolia 2007,
Patrick Bayer, Randi Pintoff and David Pozen 2009, Scott E. Carrell, Richard L. Fullerton
and James E. West 2009, Scott E. Carrell, Fredrick V. Malmstrom and James E. West
2008, Andreas Ammermueller and Jörn-Steffen Pischke 2009).
Despite the vast literature on the topic, the identification of social interactions remains
very problematic because of two well-known issues: endogeneity, due either to peers’
self-selection or to common group (correlated) effects, and reflection, a particular case
of simultaneity (Charles F. Manski 1993, William A. Brock and Steven N. Durlauf 2001,
Robert Moffitt 2001, Adriaan R. Soetevent 2006).
∗De Giorgi: Stanford University and NBER, 579 Serra Mall, 94305-6072 Stanford, CA,
degiorgi@stanford.edu. Pellizzari: Bocconi University, IGIER and IZA, via Roentgen 1, 20136, Milan,
Italy, michele.pellizzari@unibocconi.it. Redaelli: Bocconi University, via Roentgen 1, 20136, Milan, Italy,
sredaelli@worldbank.org. We thank the editors Thomas Lemieux and Esther Duflo, Ran Abramitzki,
Liran Einav, Eliana La Ferrara, Caroline Hoxby, Enrico Moretti, Derek Neal, Steve Pischke, Luigi Pistaferri
and Imran Rasul for their comments. We also thank seminar participants at IZA/CEPR Summer
Symposium, Stanford University, Bocconi University, the London School of Economics, ISER, University of
California at Irvine, University of California at Berkeley, University of California at Davis, University of
British Columbia, the Universities of Bologna, Modena and Verona and the NBER Education Meeting.
We are grateful to Stefano Gagliarducci, Pietro Garibaldi, Francesco Giavazzi, Andrea Ichino and Enrico
Rettore for providing the initial data for this project. We also thank Giacomo Carrai, Mariele Chirulli,
Alessandro Ciarlo, Cherubino Profeta, Alessandra Startari, Mariangela Vago and the administration of
Bocconi University for allowing access to their archives and helping with the extraction of additional
data. Enrica Greggio, in particular, has been a constant source of precious information and help. The
usual disclaimer applies.
2AMERICAN ECONOMIC JOURNALMONTH YEAR
In this paper we propose a novel strategy for the identification and estimation of peer
effects and particularly of their endogenous component, defined as the impact of the
average peers’ outcome on the individual outcome. Such a strategy takes advantage of a
common feature of social networks, namely the existence of partially overlapping groups
of peers, hence it is very general and readily applicable to several analytical settings. By
definition, groups are partially overlapping if the sets of peers of two individuals who are
peers to each other do not perfectly coincide. By exploiting such a feature it is possible to
solve issues related to both reflection and correlated effects (or group shocks) and, hence,
to identify all the parameters of interest in the standard linear-in-means model of social
interactions (Manski 1993).
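For reference, the linear-in-means model can be written in the generic form below. The notation is an assumption made here for illustration and is not taken from this paper: y_i is the outcome of individual i, x_i her exogenous characteristics, G_i her peer group, and bars denote averages over the members of G_i.

```latex
y_i = \alpha + \beta\,\bar{y}_{G_i} + \gamma\,\bar{x}_{G_i} + \delta\,x_i + u_i
```

In this sketch, \beta is the endogenous effect (the impact of the average peers' outcome on the individual outcome), \gamma collects the exogenous effects of peers' characteristics, and the error u_i may contain a common group shock, which is the source of correlated effects.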
The intuition behind our identification result is twofold. First, partially overlapping
groups generate peers of peers (or excluded peers) who act as exclusion restrictions in the
simultaneous equation model of social interactions and, thus, solve the reflection problem.
Second, a potentially large set of instruments naturally arises from the group structure
and allows us to deal with correlated effects, i.e. macro-shocks to one’s group. Specifically,
such instruments are the exogenous characteristics of the excluded peers: they are correlated
with peers’ performance by means of social interactions but uncorrelated with the
individual group shock.
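The notion of excluded peers can be illustrated with a minimal sketch. It assumes that peer groups are stored as a mapping from each individual to the set of her direct peers; the function name `excluded_peers` and both toy networks are hypothetical and serve only to show that peers of peers exist when groups overlap partially and vanish when they coincide.

```python
def excluded_peers(groups):
    """Return, for each individual, the peers of peers who are not own peers.

    `groups` maps an individual to the set of that individual's direct peers.
    """
    excluded = {}
    for i, peers in groups.items():
        reachable = set()
        for j in peers:
            reachable |= groups.get(j, set())
        # Drop i and i's direct peers: what remains is linked to i only indirectly.
        excluded[i] = reachable - peers - {i}
    return excluded


# Hypothetical partially overlapping groups: a-b, b-c and c-d are peers.
partial = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}

# Hypothetical fully overlapping group: everyone is a peer of everyone else.
full = {"x": {"y", "z"}, "y": {"x", "z"}, "z": {"x", "y"}}

partial_excluded = excluded_peers(partial)
full_excluded = excluded_peers(full)
```

In the partially overlapping network every individual has at least one excluded peer whose characteristics could serve as an instrument, whereas in the fully overlapping group the sets are empty and no exclusion restriction is available.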
We present a simple application of this identification strategy to the choice of major by
college students. Such an application is of interest in its own right, since major choice is
fundamental for determining occupations and, ultimately, the labor market performance
of an individual.1
Our empirical analysis is based on a newly constructed set of administrative data
on undergraduate students from Bocconi University. In these data, groups of peers are
nicely defined by institutional features. At the time covered by our data, Bocconi students
initially enrolled in a common program and only at the end of their third semester did
they choose whether to specialize in one of two majors: business or economics. During
the first three semesters, all students took nine compulsory courses and attended lectures
in randomly assigned classes for each course. This process of repeated random allocation
naturally defines groups of peers that vary at the level of each single student and are,
thus, immune from simultaneity (or reflection). Moreover, since the allocation into the
classes is random, endogeneity in peer group formation is also excluded by construction
(George A. Akerlof 1997).2
Having peer groups that vary at the individual level guarantees the presence of excluded
classmates, i.e. students who did not attend classes with student i but did attend some
courses with some of i’s peers. The exogenous characteristics of such excluded peers are
a natural set of instruments to overcome potential endogeneity generated by common
(correlated) group effects.3
1Sacerdote (2001) does not find any significant influence of peers on major choice. Notice, however,
that in Sacerdote’s paper, endogenous and exogenous effects could not be separated.
2Note that the exogenous formation of groups only simplifies our analysis, but it is not a necessary
feature of the novel identification strategy that we propose. In principle, our approach can be applied
to settings in which groups form endogenously, in which case one would have to take that process into
account.
VOL. VOL NO. ISSUEPEERS IN PARTIALLY OVERLAPPING GROUPS3
Individual-specific groups and the use of excluded peers as instruments are the key
features of our identification strategy which, far from being peculiar to our specific setting,
can be applied to numerous other contexts. In the real world one rarely belongs to only
one fixed group of peers. On the contrary, we all have friends, mates and colleagues that
may overlap, but very rarely perfectly coincide. Whenever the units of analysis are linked
directly to some other units (the peers) but only indirectly (through peers) connected to
others (the peers of peers), our identification could be fruitfully adopted.4
The spectrum of applications of our identification strategy is, therefore, potentially
larger than that of current empirical approaches to the estimation of peer effects. Perhaps,
a limitation might come from the stricter data requirements. Our strategy, in fact,
requires knowledge of all (or many) indirect links among peers of peers. However, data
of this type are becoming increasingly common and easily available. In fact, it is likely
that in many existing datasets our approach could be readily replicated. Some recent
papers could already exploit our identification with no additional data requirement, like
Bayer, Pintoff and Pozen (2009), a study of criminal behavior where peers’ characteristics
are weighted by the time spent in the same correctional facility; Antoni Calvó-Armengol,
Eleonora Patacchini and Yves Zenou (2009), who focus on the influence of the position
of a player in a particular network on academic performance; Luigi Guiso and Fabiano
Schivardi (2007), who look at spillovers in employment and investments of firms located
in the same district; and Alexandre Mas and Enrico Moretti (2009) in their analysis of
social interactions among supermarket cashiers.
Our econometric methodology differs from the existing literature that tries to recover
peer effects using either laboratory experiments (Armin Falk and Andrea Ichino 2006),
natural experiments (Sacerdote 2001, David J. Zimmerman 2003, Cipollone and Rosolia
2007), quasi-experimental designs (Hoxby 2000), fixed effects (Eric A. Hanushek, John F.
Kain, Jacob M. Markman and Steven G. Rivkin 2003) or higher order moments (Bryan S.
Graham 2008). Ron Laschever (2005) is, to our knowledge, the first application of a
multiple group framework. In a paper developed independently and at the same time as
ours, Yann Bramoullé, Habiba Djebbari and Bernard Fortin (2009) discuss identification
of peer effects in a network framework that is very close to ours.
Besides the methodological innovation, our empirical application also provides an important
contribution to the literature on peer effects in education. Social interactions,
and particularly the endogenous influence of peers, are likely to be crucial determinants
of students’ choices, above and beyond their documented effect on performance. For
example, when students decide whether to go to college, which particular school to attend or
in which field to specialize, peers may provide valuable and otherwise costly information.
Moreover, to many students, classmates also represent a natural reference group
and conformist behavior is extremely common during adolescence (George A. Akerlof
and Rachel E. Kranton 2002). Alternatively, there might simply be a utility gain from
attending school or courses with one’s friends.
3The usual suspects for group shocks in the education framework are teachers’ effects or classmates’
disruptive behaviors.
4In the network literature (Antoni Calvó-Armengol and Matthew O. Jackson 2004, Jackson 2006)
this corresponds to the existence of links of (at least) diameter 2.
Peer effects in schooling choices can be particularly important in explaining why such
decisions are often inefficient, as documented in several branches of the economic literature.
For example, recent evidence in the large literature on wage inequality (Claudia
Goldin and Lawrence F. Katz 2007) suggests that most of the dynamics in wage inequality
in the 1980s and 1990s has been driven by the slow response of (skilled) labor supply
rather than by the acceleration in the demand for skills. Although our focus is on major
choice and not specifically on college enrollment, these two decisions should not be
fundamentally different, particularly regarding the role of peers.
Another dimension of inefficient schooling decisions, more closely connected to the
choice of major, is skill mismatch. While the fact that large fractions of workers are
employed in jobs unrelated to their studies was documented some time
ago (Henry S. Farber 1999), the issue has remained largely unexplored. Among the few
recent contributions, Peter Gottschalk and Michael Hansen (2003) analyze the trends in
college graduates who are employed in "non-college" jobs, while John Robst (2007) shows
that being employed in a job outside one’s major field negatively and significantly affects
the returns to schooling. Keith A. Bender and John S. Heywood (2009) provide similar
evidence for a sample of PhD graduates in the US. Correctly understanding the dynamics
of schooling choices is, thus, crucial for any policy intervention aimed at reducing these
inefficiencies and peer effects are likely to play a key role in this framework.
We show that peers’ influence can indeed lead students to make schooling decisions
that contrast with their revealed abilities. Further, by observing students in their first
job after graduation we are also able to estimate the effect of different major decision
modes on wages and on the (self-reported) probability of job mismatch. In particular,
we find that those students who choose a major following their peers and in contrast to
their revealed ability graduate with lower final marks, end up in lower paid jobs and are
more likely to be mismatched.
Our findings do not directly support any specific policy intervention, with the possible
exception of providing better information to students about the available schooling options.
However, we contribute substantially to the understanding of the mechanism that
underlies a large set of schooling (and potentially occupational) choices, thus allowing a
better design and evaluation of any policy aimed at improving the allocation of skills to
jobs.
The paper is organized as follows: Section I describes the institutional structure of
Bocconi University, the available data and the details of the allocation of students into
classes; Section II presents our approach for the construction of the peer groups; Section
III discusses the identification strategy and Section IV presents the results of the analysis
of the choice of major. In Section V we provide a number of robustness checks. Section VI
discusses the effects of the decision modes on final academic and labor market outcomes.
Finally, Section VII concludes.
I. Data and institutional details5
The analysis in this paper is based on administrative data from Bocconi University,
an Italian private institution of higher education that specializes in management and
economics. We complement these data with a series of surveys conducted by the university
on its recent graduates to collect information concerning their labor market outcomes.
Our analysis uses information on one full cohort of students who entered university in the
academic year 1998/99 and enrolled in the most popular degree offered by Bocconi
at that time.
Students in this program would first follow a common track of nine exams during
the initial three semesters and would then choose whether to specialize in Business or
Economics (see Figure 1).6 The nine common compulsory courses are listed in Figure
1 and can be classified by subject areas according to the department responsible for the
teaching: business, economics, quantitative subjects and law.
[FIGURE 1]
Excluding a few missing values for our variables of interest and those students who
did not complete the courses of the first 3 semesters, our working sample consists of the
1,141 observations described in Table 1.7 A few of them (less than 10 percent) have
not graduated, either because they dropped out, moved to another university or are still
enrolled. The distribution across majors is strongly skewed towards business with only
about 13 percent of students choosing Economics.
[TABLE 1]
Table 2 reports some descriptive statistics on the ability and performance of these
two groups of students. Considering all the common exams in the first three semesters,
the 145 students choosing Economics score on average almost 2 grade points above the
Business students (exams are graded on a scale of 0 to 30, with a passing grade equal
to 18). This difference is even higher when the exams are disaggregated by field. As
expected, Economics students perform relatively better in economics and quantitative
subjects, while the difference is considerably smaller for the average grade in business
courses.
[TABLE 2]
Further, we exploit a number of surveys conducted by the university on its alumni,
who are contacted and interviewed about 1.5 years after graduation. Unfortunately, the
response rates are not particularly high, especially for the earlier years and for males, who
often did their compulsory military service right after graduation.8 For these reasons, we
can recover labor market information only for 448 of the 1,027 students in our working
sample who eventually graduated from Bocconi. This selection is obviously nonrandom;
however, most of it is driven by observable characteristics, namely the survey wave, the
respondent’s gender and the place of residence.9 Notice, additionally, that in all our
analyses we condition on a very large set of observable characteristics, many of
which are often unobservable in other datasets (e.g. ability as measured by the entry test
score).
5In this section we describe only the most important features of the data and the institutional
background. Further details can be found in the web Appendix A.
6The original name of the degree program that we consider is CLEA/CLEP, where CLEA (Corso di
Laurea in Economia Aziendale) refers to the management specialization and CLEP (Corso di Laurea in
Economia Politica) to economics.
7We also exclude students transferring from other universities and students from abroad who were
given reserved places.
The available labor market data include questions on monthly wages in the first job,
the type of occupation and contract, and a number of questions on satisfaction with the
job and the university. The last rows of Table 1 report some descriptive statistics for the
variables that will be explored later in Section A. Wages are observed for all graduates
who have had a job between the day of their graduation and the day of the interview
(a large majority of 97 percent) and are recorded in 7 intervals, from below 750 to a
maximum of 5,000 euros per month at steps of 250 or 500 euros. The reported mean and
standard deviation refer to an imputed measure of wages computed at the midpoint of
the interval indicated by the respondent. All monetary values are in euros evaluated at
2005 prices.
Our labor market data also allow us to construct a more direct measure of job mismatch
using a question regarding work conditions and difficulties encountered by the student
in her first job. The specific question reads as follows: "In your first job, have you
experienced any of the following problems/difficulties?" The list of possible answers is:
tasks were too easy, tasks were too demanding, problems with team work, relationship
problems with colleagues, difficulties with finding one’s position in the organization, job
is not secure, low pay, job does not fit personal attitudes. More than one item can
be indicated. The measure we adopt is a simple dummy variable that equals 1 if the
respondent indicates at least one problem and zero otherwise. However, as we will discuss
at length in Section A, our main results are extremely robust to the particular definition
of the mismatch variable.
A. Lecturing classes
Within each of the nine compulsory courses students are randomly assigned to teaching
classes.10 The number of classes for each course depends on the number of available lecturers.
Moreover, the capacity of the available classrooms at Bocconi varies considerably
and the number of students in each class had to be determined accordingly. The decision
to adopt a random allocation algorithm was dictated by the need to avoid congestion in
the classrooms resulting from students wanting to attend lectures with their friends or
with the best teachers.
8The military service was 10 months long and university students could postpone it until graduation.
Over the years, the reasons for complete exemption have been expanded (for example, around 2000
a set of new rules allowed permanent exemption from the service to students who enrolled in a PhD
programme). The compulsory military service was abolished in 2001 for all citizens born after 1985.
9These variables alone explain about 30 percent of the probability of survey participation. Moreover,
the indicator for survey participation is never significant in any regression with measures of performance
or ability as dependent variables. This result is unaffected by the introduction of additional covariates.
10The terms class and lecture often have different meanings in different countries and sometimes also
in different schools within the same country. In most British universities, for example, lecture indicates
a teaching session where an instructor, typically a faculty member, presents the main material of the
Towards the end of each term, students had to enroll in the courses of the following
term either at the administration desk or through computer terminals located in the
university buildings. Moreover, students who failed to pass an exam during the academic
year in which they had attended the corresponding course were required to re-register
and were also assigned randomly to a new class (together with other students). For these
reasons, the total number of students enrolled in each course (the sum over all the classes)
may vary slightly across subjects.
When a student enrolled for a course, the algorithm would randomly assign her to a class
and communicate the allocated class number. By no means could the students interfere
with the algorithm. For example, there was no guarantee that two students enrolling in
the same course one right after the other would be placed in the same teaching class.
In principle, students were required to attend lectures in their assigned classes, but
enforcement varied substantially over time, becoming stricter in more recent years. Actually,
the evolution of enforcement practices is closely related to the availability of the
information on lecturing classes: as the enforcement of the allocations was made more
and more stringent, lecturing classes were also recorded on various official documents and
thus maintained in the administration’s archives.
The mere fact that lecturing classes have been carefully recorded for the 1998/1999
cohort is an indication that the system was effectively enforced.11 Additionally, students
were forced to attend their assigned classes by various methods. First, lecturers were
supposed to circulate attendance sheets at the beginning of each class for students to
sign their presence. Obviously, with a large number of students in each class (the average
class size was 202 students), it was relatively easy for those who wanted to attend a
different class to have someone else sign for them. Midterms were also important in
encouraging students to attend their assigned classes. In fact, while the final exams were
identical for all students regardless of their classes, midterms were organized directly
by the lecturers. Therefore, if a student wanted to take the midterms (which were not
compulsory, but highly recommended and very popular among the students), it would
be in her interest to attend her assigned class, as the exam was prepared and marked by
the same lecturer.
These institutional features also lead to very high levels of attendance, as indicated by
the students’ evaluation questionnaires.12 Both the number of questionnaires collected
on the day of the evaluation and the self-reported percentage of attended lectures are
typically very high, with students being present at over 80 percent of the lectures for
economics, management and quantitative courses.
course. Classes are instead practical sessions where a teaching assistant solves problem sets and applied
exercises with the students. At Bocconi there was no such distinction, meaning that the same randomly
allocated groups were kept for both regular lectures and applied classes. Hence, in the remainder of the
paper we use the two terms interchangeably.
11Fewer than 2 percent of the values are missing.
12As is now customary in most universities, at the end of each course students are asked to evaluate
both the teaching and the logistics of the lectures by filling in a questionnaire. The congestion variable
Only law subjects have very low attendance levels (10-20 percent of formally enrolled
students complete an evaluation questionnaire at the end of the term). At that time
Bocconi did not have a law department and relied exclusively on external professors (from
other universities). For this reason, the number of law classes that could be created was
relatively small (4) and their size was consequently extremely large; the administration
was well aware of the low attendance in these courses. For these reasons we never use
information on the law subjects for the definition of the peer groups.
The number of classes ranges from 4 in the law courses, to 6 in the two economics
courses, 8 in statistics and management II and 10 in all remaining courses (mathematics,
management and accounting) and the average number of enrolled students varies
accordingly, from over 350 to 140.13
II. Peer group definition
Our definition of peers is based on students attending courses in the same classes and
it is meant to capture the network in which students interact academically and socially.
The underlying assumption is that these interactions are fostered by class attendance so
that the relevant set of peers for each student overlaps (at least partly) with classmates.
Class attendance heavily influences how, where and with whom students spend most
of their time. Each of the 9 compulsory courses is taught in 3 weekly sessions of 2 hours.
Hence, students taking only one course together would be sitting in the same classroom
for 6 hours per week for a term period of 12 or 13 weeks. Students taking all 9 courses
together would spend as much as 54 hours per week in the same room.
While considering classes is standard in the literature on peer effects in school, in our
case effective attendance and the large size of the lecturing classes may cast doubts on
the possibility of capturing relevant peer interactions by looking at assigned classmates.
We address this problem by excluding the two law courses from our definition of peer
groups and, more importantly, also by weighting peers by the number of common courses
attended together.
In our preferred specification the weights are nonlinear, as only students who attended
at least 4 of the 7 common courses in the same classes are considered peers.14 This
restricted definition is particularly interesting because it leads to group sizes that are
comparable to those in other papers in the literature, particularly those that look at high school
classes. For completeness, however, we also present all our results using a simpler, but
also looser, definition where any two students who have taken at least one course together
are considered peers.
reported in Table 4 is computed from such questionnaires which are available in our dataset at the level
of each single class.
13Detailed statistics about the teaching classes can be found in the web Appendix C, Table C.2.
14We choose the threshold of 4 courses because it is the highest that guarantees a nonempty peer
group for all students (i.e. there are some students who have never taken more than 4 courses with
others).
More formally, individual $i$’s peer group ($G_i$) includes all individuals $j$ who were
assigned to the same class as individual $i$ for at least 4 of the 7 courses that we consider
(all 9 common exams minus the 2 law subjects). Furthermore, each $j \in G_i$ is given
an importance weight, $\omega_{i,j} \in (0,1]$, according to the number of common courses taken
together with $i$, i.e. $\omega_{i,j} = 1$ if $j$ attends all 7 courses in the same class as $i$,
$\omega_{i,j} = 4/7$ if $j$ attends 4 courses with $i$, and so on.15 When computing means of variables
at the group level the weights are normalized to sum to one within each group.
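The construction of the groups and weights can be illustrated with a small sketch. It is not the authors' code: the input format (a dictionary mapping each student to a tuple of class labels, one per course), the function name `peer_weights` and the four toy students are assumptions made for illustration only.

```python
def peer_weights(assignments, min_common=4):
    """Return raw and normalized peer weights.

    assignments maps each student to a tuple of class labels, one per course
    (a hypothetical input format, not the layout of the original data).
    """
    n_courses = len(next(iter(assignments.values())))
    raw = {i: {} for i in assignments}
    for i, classes_i in assignments.items():
        for j, classes_j in assignments.items():
            if i == j:
                continue
            # Number of courses that i and j attended in the same class.
            common = sum(a == b for a, b in zip(classes_i, classes_j))
            if common >= min_common:
                raw[i][j] = common / n_courses  # e.g. 4/7, ..., 7/7
    normalized = {}
    for i, weights in raw.items():
        total = sum(weights.values())
        normalized[i] = {j: w / total for j, w in weights.items()} if total else {}
    return raw, normalized


# Toy data: 7 courses, class labels are arbitrary integers.
assignments = {
    "A": (1, 1, 1, 1, 1, 1, 1),
    "B": (1, 1, 1, 1, 2, 2, 2),
    "C": (2, 2, 2, 1, 2, 2, 2),
    "D": (1, 1, 1, 1, 1, 1, 1),
}
raw, normalized = peer_weights(assignments)
# Weighted group size: the sum of the raw weights within each group.
effective_size = {i: sum(w.values()) for i, w in raw.items()}
```

In this toy example A's group is {B, D} while B's group is {A, C, D}, so the groups overlap only partially. Setting `min_common=1` gives the looser definition, and `effective_size` is one way to read the weighted group sizes discussed in the text.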
[TABLE 3]
The first two columns of Table 3 report some characteristics of these groups. In column
1 we consider as peers only students who have attended at least 4 courses in the same
classes, while in column 2 the groups are constructed according to the looser definition.
The mean raw group size is approximately 18 students in the stricter definition and goes
up to 674 when all peers are considered. On average, students in these groups are assigned
to the same classes for 4.2 and 1.6 courses respectively, which implies that, when peers
are weighted by the number of courses taken together, the size of the groups goes down
to 10.7 in our main definition and to 151 with the looser definition.
[TABLE 4]
As a first check of the validity of our definition of peers, Table 4 shows that, after the
initial 3 semesters, students who have attended lectures in the same random classes also
show remarkably similar academic patterns. In the upper panel of Table 4, for example,
we show that students who were randomly assigned to the same classes are significantly
more likely to graduate in the same session.16 For the average student in our sample,
approximately 12.5 percent of the non-peers graduate in the same session. This number
goes up to 13.4 percent for peers in our most comprehensive definition (column 1) and
it increases steadily as the definition becomes more stringent (columns 2 to 4). In our
strictest definition the probability of graduating in the same session (22.4 percent) is
almost twice that for non-peers in our sample. The differences are always strongly statistically
significant.
In the middle panel of Table 4 we contrast the number of peers and non-peers who
choose the same sub-major (i.e. field). Within each of the main majors (economics and
business) students could further specialize in different fields, like marketing or accounting
within business and finance or theory within economics. The students in our sample
15 The weights adopted in the core of the paper are linear in the number of courses attended together
both for the restricted and the looser definition of peer group. We have experimented with many other
specifications and the results are robust to the weighting scheme; see Section V.
16 In the period covered by our data, students could graduate in several different sessions throughout
the year (almost one session per month). During these sessions, which lasted one or two days, students
presented their final dissertation to a committee which decided their final mark (based on both the
dissertation and their GPA). Students could freely choose when to graduate, a decision that is usually
affected both by how quickly they complete their coursework and by how much time they spend on their
dissertation (Pietro Garibaldi, Francesco Giavazzi, Andrea Ichino and Enrico Rettore 2007).
10AMERICAN ECONOMIC JOURNAL MONTH YEAR
could choose among 8 sub-majors within the economics area and 16 sub-majors within
the business area. Among students who have attended at least one of the 7 common
courses in the same random class, on average slightly more than 9.6 percent of peers
choose the same sub-major. This compares to a marginally lower incidence of students
making similar choices among the non-peers. As we restrict our definition of peers to
students who have attended more and more courses in the same classes, the difference
between peers and non-peers increases and becomes statistically significant. Only with
the strictest definition (column 4) does this difference become smaller and insignificant
again.
Finally, in the lower panel of Table 4 we look at the probability of choosing the same
thesis supervisor (advisor). Once again, students who have been assigned to the same
classes in the initial three semesters are substantially more likely to choose the same
advisor roughly two years later, and such probability increases with the number of courses
taken together.17
The evidence in Table 4 shows that randomly assigned peers eventually follow similar
academic patterns, suggesting that they actually interact with each other. Moreover,
the stronger effects that emerge for peers who have attended more courses together
support the idea behind our weighting scheme, which should indeed emphasize the most
intense interactions.
Despite all our efforts, our peer groups could still be measured with error, an issue
that we discuss more at length in Section V. Nevertheless, a more general point can
already be made here. Without some knowledge of the mechanism that generates social
interactions, it is extremely hard to establish a priori who is going to influence whom,
hence measurement error in the definition of the groups affects virtually all studies of
peer effects. In fact, the level and degree of interactions are entirely specific not only to
the context but also to the specific social mechanism. For example, if peer effects arise
due to imitation, it is unclear that the definition of the groups should be limited to close
friends, since one may in fact follow the behavior of the average person, thus including
close friends as well as simple acquaintances.
In the literature the definition of peer groups varies substantially, from possibly the
most comprehensive, i.e. same race in the State of residence in the US (Kerwin K.
Charles, Erik Hurst and Nicolai Roussanov 2009) to the entire school cohort (Carrell,
Fullerton and West 2009), down to the very restrictive roommate in a college dorm
(Sacerdote 2001). Our definition is toward the restrictive end of the spectrum, and it appears
to be the most natural given the institutional setting. In Section V, we perform additional
robustness checks by experimenting with alternative definitions and we also discuss the
results of a simulation exercise, which is presented in fuller detail in the web Appendix B.
17 Note that the pattern in the probability of choosing the same advisor might also be influenced by
the fact that students who have attended the same classes have also met the same professors. However,
students typically pick their advisors among the teachers of later elective courses, hence the evidence in
Table 4 can hardly be explained solely by the fact that students have met the same professors in the
initial compulsory courses.
III. Identification strategy
The identification of social interaction effects has been the topic of several papers
(Manski 1993, Brock and Durlauf 2001, William A. Brock and Steven N. Durlauf 2007,
Moffitt 2001, Bryan S. Graham and Jinyong Hahn 2005, Graham 2008) and it rests on
two distinct dimensions: endogeneity and reflection. Endogeneity may arise for at least
two reasons: first, people usually choose their peers endogenously and, second, common
unobserved shocks may hit the group as a whole (teacher effects are the usual suspect
in studies of education). As a consequence, when we observe a significant correlation
between individual and group outcomes it is hard to say whether this result is due to
true peer effects or simply to endogenous group formation and/or correlated effects.
The second problem, reflection, arises because in a peer group everyone’s behavior
affects the others and, as in a mirror reflection, we cannot know if one’s action is the
cause or the effect of peers’ actions. Although particularly cumbersome, this is essentially
a problem of simultaneity.
Let us start with a discussion of how we address reflection. This problem has been
commonly described by using a simple linear-in-means model:
(1)   $y_i = \alpha + \beta E(y \mid G_i) + \gamma E(x \mid G_i) + \delta x_i + u_i$
In our framework, $y_i$ is the chosen major (i.e. economics or business), $x_i$ is a set
of individual traits, and $E(x \mid G_i)$ contains the averages of the $x$s in the peer group of
individual $i$, denoted by $G_i$. Following the literature, $\beta$ measures the endogenous effect,
and $\gamma$ the exogenous effects. For now assume $E(u_i \mid G_i, x_i) = 0$, i.e. no correlated effects
or self-selection into groups.
In the standard framework, peer groups are fixed across individuals, i.e. if A and B are
both in the peer group of C, it must also be that A and B are in the same group. Put in
the wording of equation (1), if $i$ and $j$ are in the same peer group, then the two groups
coincide, i.e. $G_i = G_j$. In this situation, endogenous effects cannot be distinguished from
exogenous effects (Manski 1993). In fact, it is easy to show, by simply averaging equation
(1) over group $G_i$, that $E(y \mid G_i)$ is a linear combination of the other regressors:
(2)   $E(y \mid G_i) = \left(\frac{\alpha}{1-\beta}\right) + \left(\frac{\gamma+\delta}{1-\beta}\right) E(x \mid G_i)$
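The averaging step behind equation (2) can be spelled out as follows. This is a sketch under the assumptions already made in the text: all members of a group share the same $G_i$, $E(u_i \mid G_i, x_i) = 0$, and $\beta \neq 1$. The last line substitutes the result back into equation (1).

```latex
\begin{align*}
E(y \mid G_i) &= \alpha + \beta E(y \mid G_i) + \gamma E(x \mid G_i)
                 + \delta E(x \mid G_i) + E(u \mid G_i) \\
(1-\beta)\, E(y \mid G_i) &= \alpha + (\gamma+\delta)\, E(x \mid G_i) \\
E(y \mid G_i) &= \frac{\alpha}{1-\beta} + \frac{\gamma+\delta}{1-\beta}\, E(x \mid G_i) \\
y_i &= \frac{\alpha}{1-\beta} + \frac{\gamma+\beta\delta}{1-\beta}\, E(x \mid G_i)
       + \delta x_i + u_i
\end{align*}
```

The individual-level equation in the last line has three coefficients, while the structural model has four parameters ($\alpha$, $\beta$, $\gamma$, $\delta$), which is one way to see why $\beta$ and $\gamma$ cannot be recovered separately when groups coincide.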
Hence, with perfectly overlapping peer groups $E(y \mid G_i)$ does not vary at the individual
level as it is constant for all members of the same group. This fact alone would prevent
the separate identification of endogenous and exogenous effects, even if the groups were
randomly formed and in the absence of correlated effects or group shocks. In fact, one
popular approach to the reflection problem simply consists in estimating a composite
parameter that incorporates both the endogenous and exogenous effects without attempting
to separate the two.18
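A quick numerical check makes the collinearity concrete. The sketch below is illustrative only: the parameter values, group sizes and variable names are assumptions, and group means are taken over all members of a group, as in the averaging argument behind equation (2).

```python
import numpy as np

rng = np.random.default_rng(0)
n_groups, group_size = 50, 10
alpha, beta, gamma, delta = 0.5, 0.4, 0.3, 1.0  # illustrative values, not estimates

# Fully overlapping groups: every student belongs to exactly one group.
group = np.repeat(np.arange(n_groups), group_size)
x = rng.normal(size=n_groups * group_size)

# Group mean of x, repeated for every member of the group.
x_bar = np.array([x[group == g].mean() for g in range(n_groups)])[group]

# Equation (2): the expected group outcome is an exact linear function of x_bar.
y_bar = alpha / (1 - beta) + (gamma + delta) / (1 - beta) * x_bar

# Regressors of equation (1): constant, E(y|G), E(x|G), x.
X = np.column_stack([np.ones_like(x), y_bar, x_bar, x])
rank = np.linalg.matrix_rank(X)  # one less than the number of columns
```

The design matrix has four columns but rank three, so the coefficients on $E(y \mid G_i)$ and $E(x \mid G_i)$ cannot both be estimated.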
In our framework peer groups are instead individual-specific, a feature that guarantees
the existence of excluded peers, i.e. students who are not in one’s peer group but are
included in the groups of one’s peers. Later on we will discuss how to exploit such excluded
peers in an IV strategy that addresses the potential endogeneity due to correlated effects,
but it is important to clarify now that the existence of the excluded peers is the key to
generating within-group variation in $E(y \mid G_i)$ and that we would not need to instrument
if the only identification issue were reflection.
A simple example may help to illustrate this point. Consider the simple case of only
three students. Students A and B study together (e.g., they attend 4 courses in the
same classes); however, B also studies with C (e.g., they attend some of the remaining 3
courses in the same class, different from A’s class). A’s peer group includes only B, while
B’s peer group includes both A and C. This identification can also be seen as a case of
triangularization. In the standard simultaneous equation model, at least one exogenous
variable is excluded from each equation; here, A is excluded from the peer group of C,
who is excluded from the peer group of A.
With 7 courses, each divided into 6 to 10 lecturing classes, our data exhibit enough
variation to generate peer groups that vary at the level of the single individual, so that
every student has a distinct group of peers. The weighting scheme described in the
previous section adds even more variation to such groups.
To formally see the advantage of this framework in solving the reflection problem,
rewrite equation (2) allowing peer groups to vary at the level of the single individual:
(3)   $E(y_i \mid G_i) = \alpha + \beta E[E(y \mid G_j) \mid G_i] + \gamma E[E(x \mid G_j) \mid G_i] + \delta E(x_i \mid G_i)$
where $j$ is a generic member of $i$’s peer group. The key to understanding this equation
is the fact that $j$’s peer group $G_j$ never coincides with $G_i$.
This result can be further clarified by going back to the previous example with 3
students: A, B and C, where A and B are in the same class for one subject and B and C
sit together in another course. This structure implies that $G_A = \{B\}$, $G_B = \{A, C\}$ and
$G_C = \{B\}$. Equation (1), then, translates into the following three equations:
$y_A = \alpha + \beta y_B + \gamma x_B + \delta x_A + u_A^A$
$y_B = \alpha + \beta \left(\frac{y_A + y_C}{2}\right) + \gamma \left(\frac{x_A + x_C}{2}\right) + \delta x_B + u_B^B$
$y_C = \alpha + \beta y_B + \gamma x_B + \delta x_C + u_C^C$
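The three-student system can also be written in matrix form and solved numerically. The sketch below is illustrative: the parameter values are arbitrary, and the matrix `W` (each row holding the weights a student places on her peers) is notation introduced here, not the paper's.

```python
import numpy as np

alpha, beta, gamma, delta = 0.5, 0.4, 0.3, 1.0  # illustrative values, not estimates

# Row i holds the weights student i puts on each peer (order: A, B, C).
W = np.array([
    [0.0, 1.0, 0.0],   # G_A = {B}
    [0.5, 0.0, 0.5],   # G_B = {A, C}
    [0.0, 1.0, 0.0],   # G_C = {B}
])
I = np.eye(3)

# Structural system: y = alpha + beta*W@y + gamma*W@x + delta*x + u.
# Solving for y gives the matrix that maps the x's into the y's.
x_to_y = np.linalg.solve(I - beta * W, gamma * W + delta * I)

# Effect of x_C on y_A: C is not A's peer, so it works only through y_B.
effect_xC_on_yA = x_to_y[0, 2]

# The same coefficient obtained by substituting the equations into one another.
by_substitution = beta * (gamma + delta * beta) / (2 * (1 - beta**2))
```

The entry `x_to_y[0, 2]` is nonzero even though C does not appear in A's structural equation. This is the variation that the excluded peer contributes, and it vanishes when $\beta = 0$.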
18 Such a composite parameter (or set of composite parameters) is in fact the coefficient on
$E(x \mid G_i)$ in equation (2).
Now, consider the corresponding reduced form equations:
$y_A = \left[\alpha + \frac{\alpha\beta(1+\beta)}{1-\beta^2}\right] + \left[\frac{\beta(\gamma\beta+\delta)}{1-\beta^2} + \gamma\right] x_B + \left[\frac{\beta(\gamma+\delta\beta)}{1-\beta^2}\right] \left(\frac{x_A+x_C}{2}\right) + \delta x_A + \eta_A^A$
$y_B = \left[\frac{\alpha(1+\beta)}{1-\beta^2}\right] + \left[\frac{\gamma\beta+\delta}{1-\beta^2}\right] x_B + \left[\frac{\gamma+\delta\beta}{1-\beta^2}\right] \left(\frac{x_A+x_C}{2}\right) + \eta_B^B$
$y_C = \left[\alpha + \frac{\alpha\beta(1+\beta)}{1-\beta^2}\right] + \left[\frac{\beta(\gamma\beta+\delta)}{1-\beta^2} + \gamma\right] x_B + \left[\frac{\beta(\gamma+\delta\beta)}{1-\beta^2}\right] \left(\frac{x_A+x_C}{2}\right) + \delta x_C + \eta_C^C$
where the new reduced form error terms $\eta_A^A$, $\eta_B^B$ and $\eta_C^C$ are linear combinations of the
structural error terms $u_A^A$, $u_B^B$ and $u_C^C$.19
The example above shows how we achieve identification: we are left with four reduced
form parameters and four structural ones. Notice, additionally, that in this particular
case the last equation for $y_C$ is redundant and, in fact, only observations with distinct groups
of peers contribute to identification.20
Clearly, as written above, our identification rests on the assumption that the excluded
peer C does not interact with A directly. In our setting this seems a plausible assumption.
Alternatively, identification could also be achieved in a setting where direct interactions
with one’s excluded peers are allowed, by simply assuming that the strength of
the interactions declines with distance in the network. This is the approach taken by
Calvó-Armengol, Patacchini and Zenou (2009) and, contrary to ours, it requires some
additional parametric assumptions on such a smoothing factor. Moreover, we provide a
simulation exercise in the web Appendix B where we show that our strategy is generally
robust to small deviations from our key exclusion restriction.
Although this particular setting allows us to solve reflection, one might still worry about
the presence of correlated effects, i.e. common unobservable shocks at the group level
which could undo the previous identification result. Suppose, in fact, that the general
error term is of the following form:
(4)   $u_i^g = \mu_i + \theta_g + \varepsilon_i$
with $g = A, B, C$ and where $\mu_i$ is an individual fixed effect, $\theta_g$ a group fixed effect (e.g.
teacher quality, class disruptions), and $\varepsilon_i$ an i.i.d. random component.21 If we were
to substitute (4) into (1) we would face two problems of endogeneity arising from the
individual effect ($\mu_i$) and the group effect ($\theta_g$).
19 The meaning of the double indexing (subscript and superscript) will become clear in a few
paragraphs.
20 In fact, A and C here have the same peer group, {B}, although they are not peers to each other.
In our data, however, there are no such cases and each single student has a distinct group of peers.
21 The double indexing of the previous error terms should clarify the fact that these errors include both
an individual-specific error ($\mu_i$) and a group shock ($\theta_g$).
In our particular case, the random nature of the peer groups rules out correlation
between the individual effect and any endogenous or exogenous effect ($E(y \mid G_i)$ and
$E(x \mid G_i)$).22 However, unobservable group shocks could still be present and induce
endogeneity, i.e. $Cov(E(y \mid G_i), \theta_g) \neq 0$.23 Even if our strategy effectively solves reflection,
the presence of correlated effects may still generate endogeneity of $E(y \mid G_i)$ and impede
identification.
One possible solution is to use instrumental variables. Fortunately, this setting naturally
offers valid instruments, namely peers of peers who are not in one’s own peer group.
In fact, by construction, the $x$s of students who are excluded from $i$’s peer group but
included in the group of one or more of $i$’s peers are uncorrelated with the group fixed
effect of $i$ and correlated with the mean outcome of $i$’s group through endogenous
interactions. In our previous example, $x_C$ would be a valid instrument for $y_B$ in group A. The
logical chain is the following: $x_C$, which is uncorrelated with $\theta_A$, affects $y_C$ and, since C
is a peer of B, through endogenous effects $y_C$ also affects $y_B$. By the same reasoning $x_A$
would be a valid instrument for $y_B$ in group C.
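The logic of the instrument can be illustrated with a small simulation. This is a sketch under strong simplifying assumptions, not the authors' estimator or data: students are placed on a ring, each student's peers are her two neighbours, the excluded peers are the neighbours' other neighbours, and correlated effects are modelled as a shock shared by each pair of neighbours. All names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
alpha, beta, gamma, delta = 0.5, 0.4, 0.3, 1.0  # illustrative "true" values


def peer_mean(v):
    """Average over the two neighbours on a ring (peers of i are i-1 and i+1)."""
    return 0.5 * (np.roll(v, 1) + np.roll(v, -1))


x = rng.normal(size=n)
# Correlated effects: a shock shared by each pair of neighbours (i, i+1).
pair_shock = rng.normal(size=n)
u = 0.5 * (pair_shock + np.roll(pair_shock, 1)) + rng.normal(size=n)

# Solve y = alpha + beta*peer_mean(y) + gamma*peer_mean(x) + delta*x + u
# by fixed-point iteration (a contraction because |beta| < 1).
exog = alpha + gamma * peer_mean(x) + delta * x + u
y = exog.copy()
for _ in range(100):
    y = exog + beta * peer_mean(y)

# Excluded peers of i: students i-2 and i+2 (peers of i's peers, not i's peers).
z = 0.5 * (np.roll(x, 2) + np.roll(x, -2))

ones = np.ones(n)
X = np.column_stack([ones, peer_mean(y), peer_mean(x), x])  # regressors
Z = np.column_stack([ones, z, peer_mean(x), x])             # instruments

X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]            # first stage
coef_iv = np.linalg.lstsq(X_hat, y, rcond=None)[0]          # second stage (2SLS)
coef_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_iv, beta_ols = coef_iv[1], coef_ols[1]
```

In this setup the OLS coefficient on the peers' mean outcome is contaminated by simultaneity and by the shared shocks, while the two-stage estimate based on the excluded peers' characteristics should be close to the value of `beta` used to generate the data.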
Bramoullé, Djebbari and Fortin (2009), a paper developed independently and at the
same time as ours, present a more general approach for the identification of social
interactions that includes our specific case. As far as we know, our paper is the first to also
have an empirical application of the methodology.
In our data, the group of peers of peers (which we label excluded peers for clarity) for
a generic student $i$ includes all other students who have never taken any of the 9 common
courses in the same lecturing classes as $i$, but have taken some of the 7 courses that we
consider with one or more of $i$’s peers. Importantly, we maintain the same definition of
excluded peers also when working with groups defined over students who take at least 4
courses together. This guarantees that the excluded peers of any student $i$ never attended
any course in the same class as $i$, regardless of how we define actual peers.
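The definition of excluded peers can be sketched in a few lines. As a simplification that is not in the paper, the example uses a single tuple of class labels per student both for the "never shared a class" condition and for the link to one's peers, whereas the text uses all 9 common courses for the former and the 7 considered courses for the latter. The function names and the toy data are hypothetical.

```python
def common_courses(a, b):
    """Number of courses two students attended in the same class."""
    return sum(p == q for p, q in zip(a, b))


def excluded_peers(assignments, min_common=4):
    """Return (peers, excluded) dictionaries of sets, keyed by student."""
    students = list(assignments)
    shared = {
        (i, j): common_courses(assignments[i], assignments[j])
        for i in students for j in students if i != j
    }
    peers = {
        i: {j for j in students if j != i and shared[i, j] >= min_common}
        for i in students
    }
    excluded = {}
    for i in students:
        # k is excluded for i if k shared a class with one of i's peers
        # but never shared any class with i.
        excluded[i] = {
            k
            for j in peers[i]
            for k in students
            if k not in (i, j) and shared[j, k] >= 1 and shared[i, k] == 0
        }
    return peers, excluded


assignments = {
    "A": (1, 1, 1, 1, 1, 1, 1),
    "B": (1, 1, 1, 1, 2, 2, 2),
    "C": (2, 2, 2, 2, 2, 2, 2),
    "D": (1, 2, 2, 2, 2, 2, 2),
}
peers, excluded = excluded_peers(assignments)
```

Here C is an excluded peer of A (linked through B, never in A's class), while D is neither a peer nor an excluded peer of A because the two shared one class.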
The average raw size of these groups is 252 students, as reported in the third column
of Table 3. Notice additionally that the union of the groups of excluded and actual
peers never spans the entire sample. The student with the largest groups is linked either
directly or indirectly to 1085 students, thus allowing for more than 50 totally excluded
peers. On average, the sum of the two groups is 927.
Random assignment. To better document the absence of self-selection in our setup,
Figure 2 compares the distributions of some selected individual characteristics in the
entire population and in one randomly selected group of peers and excluded peers (Jonathan
Guryan, Kory Kroft and Matt Notowidigdo 2009). In the upper panels of the figure we
show the kernel density plots of the distributions of two important measures of ability and
academic outcomes, namely the entry test score and the high school grade.
[FIGURE 2]
22 Additionally, our data include several observable proxies for variables that are generally unobservable
to the econometrician (e.g. standardized ability test, high school grades, type of high school, preferences,
etc.) and we make use of all of them to purge our results from potential residual endogeneity.
23 Note that correlated effects cannot induce endogeneity of the exogenous effect, i.e.
$Cov(E(x \mid G_i), \theta_g) = 0$, since the $x$s are determined prior to the allocation to the groups.
Not surprisingly the distributions of these variables are extremely similar in the
population and in the randomly selected groups of peers or excluded peers. We also performed
two-sample Kolmogorov–Smirnov tests for the equality of such distributions and in no
case could we reject the null of equal distributions.
Two features of the distributions of test scores and high school results are worth noticing.
First, the sharp increase of the density of test scores around the (normalized) value
of 55 is due to the practice adopted by Bocconi of rejecting students with particularly
low test scores, regardless of the availability of admission places. Second, the evident
right skewness of the distribution of high school grades confirms the well known fact that
Bocconi attracts a pool of positively selected students.
In the lower panels of Figure 2 we look at two other characteristics: gender and ex-ante
determinedness to major in economics.24 Again the proportions of both female and
economics-determined students are very similar. Two-sample tests of proportions also
fail to reject the null of equality. Additionally, to show that the randomization that we
exploit is not subject to the bias described in Guryan, Kroft and Notowidigdo (2009), we
also run a battery of regressions of individual on peers’ characteristics. For brevity, the
results from those regressions are reported in the web Appendix (Table C.3).
IV. Peer effects in major choices
As already mentioned, students in our sample chose between economics and business
majors after the initial three common semesters, and the remaining five terms were clearly
differentiated across the two majors.25
To estimate the effect of peers on one’s decision to specialize in economics versus
business, we run a linear probability model similar to equation (1), where $y_i = 1$ if a
student chooses economics and 0 otherwise. $E(y \mid G_i)$ is the (weighted) share of peers
choosing economics and $x_i$ is a set of controls for individual characteristics that includes
a gender dummy, household income (as recorded at the first registration), a dummy for
students who reside outside the city of Milan (the site of Bocconi), a set of dummies for
the region of origin, a series of controls for academic performance and ability (high school
type and grades, results of the admission test) and an indicator of ex-ante preferences
over the two majors (i.e. whether a student was determined to major in economics at
enrollment).26
[TABLE 5]
Table 5 reports the results of the estimation of linear probability models for two
definitions of peer groups: our preferred one, based on the restricted set of peers who have
24 This variable is derived from the students’ original applications, where they are asked to rank degrees
and majors according to their preferences. See the web Appendix A for more details.
25 Although some elective courses could be picked from either of the two majors, such practice was quite
uncommon and the number of such options very limited.
26 We obtain very similar, actually statistically more significant, results with a probit model. However,
we prefer the linear specification simply because it shows more clearly the features of our identification
strategy. See Brock and Durlauf (2007) for a more detailed discussion of identification of nonlinear
models of social interactions.