
The Multigroup Entropy Index (Also Known as Theil's H or the Information Theory Index)

John Iceland1
University of Maryland
December 2004
1 These indexes were prepared under contract to the U.S. Census Bureau.
Table of Contents
Summary
Data Source
Race and Ethnicity
Geographic Areas
Residential Pattern Measures
Dual-Group Entropy Indexes
References
Summary
This website contains three sets of residential-pattern indicators for 1980, 1990, and 2000:
1. The highlighted measure is the “multigroup entropy index,” which is also known as the
multigroup version of Theil’s H or the multigroup information theory index. This is a
measure of “evenness.”
2. “Diversity” scores are also available; these are used in the calculation of the multigroup
entropy index. A diversity score measures the extent to which several groups are present
in a metropolitan area, regardless of their distribution across census tracts.
3. Dual-group entropy indexes are included here, where the reference group consists of all
people not of the main group in question.2 Two-group entropy indexes are computed for
Non-Hispanic Whites, Non-Hispanic African Americans, Non-Hispanic Asians and
Pacific Islanders, Non-Hispanic American Indians and Alaska Natives, Non-Hispanics of
other races, and Hispanics.
Data Source
These indexes are based on data from the 1980, 1990, and 2000 decennial censuses (the
100 percent data). The main data issues involved in calculating racial and ethnic residential
patterns revolve around the definition of racial and ethnic categories, geographic boundaries, and
residential-pattern measures.
2 Dual-group entropy indexes were also included in the 2002 report by Iceland, Weinberg, and Steinmetz. In that
report, the entropy index indicated the segregation of each of several groups (Blacks, Hispanics, Asians and Pacific
Islanders, and American Indians and Alaska Natives) from non-Hispanic Whites.
Race and Ethnicity
In 1977, the Office of Management and Budget (OMB) issued its Statistical Policy
Directive 15, which provided the framework for federal data collection on race and ethnicity to
federal agencies, including the Census Bureau for the 1980 decennial census. The OMB directed
agencies to focus on data collection for four racial groups – White; Negro or Black; American
Indian, Eskimo, or Aleut; and Asian or Pacific Islander – and one ethnicity – Hispanic, Latino, or
Spanish origin. The questions on the 1980 and 1990 censuses asked individuals to self-identify
with one of these four racial groups and whether they were Hispanic or not.3
After much research and public comment in the 1990s, the OMB revised the Nation’s
racial classification to include five categories – White, Black or African American, American
Indian or Alaska Native, Asian, and Native Hawaiian or other Pacific Islander. An additional
major change was to permit the self-identification of individuals as “one or more races.” While a
small fraction of the population had already been doing so on previous census forms, this new
directive made this practice permissible in data collection activities.
This change naturally challenges researchers to determine the best way to present
historically compatible data. To facilitate comparisons across time, this analysis uses
race/ethnicity definitions that can be reproduced fairly closely across the three censuses and
that closely approximate the 1990 census categories. Six mutually exclusive and exhaustive
categories were constructed: Non-Hispanic Whites, Non-Hispanic African Americans, Non-
Hispanic Asians and Pacific Islanders, Non-Hispanic American Indians and Alaska Natives,
Non-Hispanics of other races, and Hispanics. Having mutually exclusive and exhaustive
categories is essential for constructing a single multiracial index. For Census 2000, this involved
combining the Asian and Native Hawaiian or other Pacific Islander groups. In addition,
non-Hispanic people who identified themselves as being of two or more races in 2000 were
categorized as “Other,” since people could not mark more than one race in 1980 or 1990. Census
2000 figures indicate that 4.6 million people, or 1.6 percent of the population, designated
themselves as multiracial (and non-Hispanic). Because of the relatively small number of
multiracial people, the impact of the creation of this category in Census 2000 on segregation is
small.4 People who reported being Hispanic were categorized as such, regardless of their
response to the race question.
3 The Population Censuses have a special dispensation from OMB to allow individuals to
designate “Some Other Race” rather than one of those specifically listed. Because of
Congressional directives, the decennial census questions also ask about specific Asian and
Pacific Islander races (e.g., Chinese).
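The recoding rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the label strings, the `classify` function, and the `SINGLE_RACE_TO_CATEGORY` table are hypothetical names introduced here, not the codes or procedures used in the census files.

```python
# Hypothetical label strings; actual census files use their own codes.
SINGLE_RACE_TO_CATEGORY = {
    "White": "Non-Hispanic White",
    "Black or African American": "Non-Hispanic African American",
    # Asian and Pacific Islander responses are combined for comparability.
    "Asian": "Non-Hispanic Asian and Pacific Islander",
    "Native Hawaiian or Other Pacific Islander": "Non-Hispanic Asian and Pacific Islander",
    "American Indian or Alaska Native": "Non-Hispanic American Indian and Alaska Native",
    "Some Other Race": "Non-Hispanic Other",
}


def classify(is_hispanic, races):
    """Assign one of six mutually exclusive categories.

    is_hispanic: bool; races: collection of race labels the person marked.
    """
    # Hispanic origin takes precedence over any race response.
    if is_hispanic:
        return "Hispanic"
    # Two or more races (or, as an assumption here, no race marked) go to "Other".
    if len(races) != 1:
        return "Non-Hispanic Other"
    (race,) = races
    return SINGLE_RACE_TO_CATEGORY.get(race, "Non-Hispanic Other")


print(classify(True, {"White"}))            # Hispanic
print(classify(False, {"White", "Asian"}))  # Non-Hispanic Other
```

Because every person lands in exactly one category, the six group counts in each tract sum to the tract's total population, which is what a single multigroup index requires.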
Geographic Areas
Residential pattern indexes often measure the distribution of different groups across units
within larger areas. Thus, to measure residential patterns, one has to define both the appropriate
larger area and its component parts. The larger areas here are represented by metropolitan areas,
as these are reasonable approximations of housing markets. These are operationalized by using
independent and primary metropolitan statistical areas, referred to hereafter as metropolitan
areas, or MAs. To facilitate comparisons over time, the MA boundary definitions in effect
during Census 2000 (issued by the Office of Management and Budget on June 30, 1999) were
used. Minor Civil Division-based MAs were used in New England. To address the second
geographic consideration, this analysis uses census tracts. These units are designed with the
intent of representing neighborhoods, are delineated with substantial local input, and are
thereby a reasonable choice from a heuristic perspective.
4 As a way of testing the sensitivity of the information theory index calculated here to differences
in race categories, an alternative race classification scheme with the Census 2000 data was
tested: instead of the six categories described above, eight were constructed. The two extra were
created by splitting the Asian and Pacific Islander category into two (Asians, and Native
Hawaiians and Other Pacific Islanders), and splitting the non-Hispanic Other category into
non-Hispanic “Other” and non-Hispanics who marked two or more races. The mean entropy index
for all 331 metropolitan areas in 2000 was 0.181 using six categories and 0.180 using the eight
categories, indicating that the choice between these two alternatives has a very small effect. The
correlation between the two is over 0.99.
In 2000, there were 331 MAs in the U.S. For this analysis, six MAs were omitted
(Barnstable-Yarmouth, MA; Flagstaff, AZ-UT; Greenville, NC; Jonesboro, AR; Myrtle Beach,
SC; and Punta Gorda, FL) because they had fewer than 9 census tracts and populations of less
than 41,000 in 1980. All other MAs used had populations of at least 50,000 in 1980, which is
typically one of the criteria for defining an area as an MA.
Residential Pattern Measures
Residential pattern measures, usually referred to as “residential segregation” measures in
the social scientific literature, have been the subject of extensive research for many years, and a
number of different measures have been developed over time (e.g., see Massey and Denton,
1988; Iceland, Weinberg, and Steinmetz, 2002). Reardon and Firebaugh (2002) note that all
major reviews of such indexes limit their discussion to dichotomous measures (e.g., Duncan and
Duncan, 1955; James and Taeuber, 1985; Massey and Denton, 1988; White, 1986; Zoloth, 1976;
Massey, White, and Phua, 1996). The earliest of the multigroup indexes is the information theory
index (H) (sometimes referred to as the entropy index), which was defined by Theil (Theil, 1972;
Theil and Finizza, 1971).
The entropy index is a measure of “evenness”—the extent to which groups are evenly
distributed among organizational units (Massey and Denton 1988). More specifically, Theil
described the entropy index as a measure of the average difference between a unit’s group
proportions and those of the system as a whole (Theil 1972). H can also be interpreted as the
difference between the diversity (entropy) of the system and the weighted average diversity of
individual units, expressed as a fraction of the total diversity of the system (Reardon and
Firebaugh 2002).
The entropy score, which is a measure of diversity, and the entropy index, which
measures the distribution of groups across neighborhoods, are discussed below. A measure of the
first is used in the calculation of the latter. The entropy score is defined by the following
formulas, from Massey and Denton (1988). First, a metropolitan area’s entropy score is
calculated as:
\[ E = \sum_{r} \Pi_r \ln\left(\frac{1}{\Pi_r}\right) \]
where Πr refers to a particular racial/ethnic group’s proportion of the whole metropolitan area
population. All logarithmic calculations use the natural log.5
Unlike the entropy index defined below, this formula describes the diversity in a
metropolitan area. The higher the number, the more diverse the area. The maximum level of
entropy is given by the natural log of the number of groups used in the calculations. With six
racial/ethnic groups, the maximum entropy is ln 6, or about 1.792. The maximum score occurs when
all groups have equal representation in the geographic area, such that with six groups each would
comprise about 17 percent of the area’s population. This is typically not referred to as a measure
of “segregation” because it does not measure the distribution of these groups across a
metropolitan area. A metropolitan area, for example, can be very diverse if all minority groups
are present, but also very highly “segregated” if all groups live exclusively in their own
neighborhoods.
5 When the proportion of a particular group in a given census tract (Πr) is 0, then the log is set to
0. This is the preferred procedure here, as the absence of a group (or multiple groups) should
result in a 0 increase in the diversity score (where a higher score indicates more diversity).
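As a rough illustration of the diversity (entropy) score, the following Python sketch sums p × ln(1/p) over the group proportions and lets groups with a zero share contribute nothing, as the footnote on zero proportions describes. The function name `entropy_score` and the example shares are assumptions made for this sketch, not part of the original calculations.

```python
import math


def entropy_score(proportions):
    """Diversity (entropy) score: sum of p * ln(1/p) over groups.

    Groups with a zero share are skipped, so they add 0 to the score.
    """
    return sum(p * math.log(1.0 / p) for p in proportions if p > 0)


# Hypothetical area in which six groups each make up one sixth of the population.
equal_shares = [1 / 6] * 6
print(round(entropy_score(equal_shares), 3))  # 1.792, i.e. ln 6

# Hypothetical area made up of a single group: no diversity.
print(entropy_score([1.0, 0, 0, 0, 0, 0]))  # 0.0
```

The first case reproduces the maximum of ln 6 noted above; any unequal split of the same six groups yields a lower score.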
A unit within the metropolitan area, such as a census tract, would analogously have its
entropy score, or diversity, defined as:
\[ E_i = \sum_{r} \pi_{ri} \ln\left(\frac{1}{\pi_{ri}}\right) \]
where πri refers to a particular racial/ethnic group’s proportion of the population in tract i.
The entropy index is the weighted average deviation of each unit’s entropy from the
metropolitan-wide entropy, expressed as a fraction of the metropolitan area’s total entropy:
\[ H = \sum_{i=1}^{n} \frac{t_i (E - E_i)}{E\,T} \]
where ti refers to the total population of tract i, T is the metropolitan area population, n is
the number of tracts, and Ei and E represent tract i's diversity (entropy) and metropolitan area
diversity, respectively. The entropy index varies from 0, when all areas have the same
composition as the entire metropolitan area (i.e., maximum integration), to a high of 1, when all
areas contain one group only (maximum segregation). While the diversity score is influenced by
the relative size of the various groups in a metropolitan area, the entropy index, being a measure
of evenness, is not. Rather, it measures how evenly groups are distributed across metropolitan
area neighborhoods, regardless of the size of each of the groups.
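The entropy index described above can be sketched in Python as the population-weighted deviation of each tract's entropy from the metropolitan entropy, divided by the metropolitan entropy. The function names and the two toy metropolitan areas below are hypothetical, and the sketch assumes tract data arrive as lists of group counts; it is not the procedure used to produce the published indexes.

```python
import math


def entropy_score(counts):
    """Entropy of one area from its group counts; zero counts add nothing."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum((c / total) * math.log(total / c) for c in counts if c > 0)


def entropy_index(tracts):
    """H = sum over tracts of t_i * (E - E_i), divided by E * T.

    tracts: list of per-tract group counts, e.g. [[white, black], ...].
    """
    metro_counts = [sum(group) for group in zip(*tracts)]
    T = sum(metro_counts)
    E = entropy_score(metro_counts)
    if E == 0:
        raise ValueError("H is undefined when the whole area has only one group")
    return sum(sum(t) * (E - entropy_score(t)) for t in tracts) / (E * T)


# Hypothetical two-tract areas with two groups.
integrated = [[50, 50], [50, 50]]   # every tract mirrors the whole area
segregated = [[100, 0], [0, 100]]   # every tract holds a single group
print(entropy_index(integrated))  # 0.0
print(entropy_index(segregated))  # 1.0
```

Raising an error when the metropolitan entropy is zero is a design choice of this sketch; with only one group present there is nothing to distribute, so the ratio has no meaning.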
Other multigroup segregation indexes exist, such as a generalized dissimilarity index and
an index of relative diversity. In a detailed review of six multigroup indexes (dissimilarity, Gini,
entropy, squared coefficient of variation (CV), relative diversity, and normalized exposure),
Reardon and Firebaugh (2002) conclude that the entropy index is clearly the superior measure.
They note, for example, that entropy is the only index that obeys the “principle of transfers” (the
index declines when an individual of group m moves from unit i to unit j, where the proportion of
persons of group m is higher in unit i than in unit j). The entropy index can also be decomposed
into its component parts. For these reasons, the entropy index was calculated here.
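The principle of transfers can be checked numerically on a toy case. The Python sketch below uses two hypothetical tracts and moves one member of group 0 from the tract where that group's share is higher to the tract where it is lower; the counts and function names are assumptions chosen for illustration, and a single example does not prove the property in general.

```python
import math


def entropy_score(counts):
    """Entropy of one area from its group counts; zero counts add nothing."""
    total = sum(counts)
    return sum((c / total) * math.log(total / c) for c in counts if c > 0)


def entropy_index(tracts):
    """Population-weighted deviation of tract entropy from area entropy."""
    metro = [sum(group) for group in zip(*tracts)]
    E, T = entropy_score(metro), sum(metro)
    return sum(sum(t) * (E - entropy_score(t)) for t in tracts) / (E * T)


# Group 0 is 80 percent of tract A and 20 percent of tract B.
before = [[80, 20], [20, 80]]
# One member of group 0 moves from tract A to tract B.
after = [[79, 20], [21, 80]]

h_before = entropy_index(before)
h_after = entropy_index(after)
print(h_after < h_before)  # True: the move makes the tracts more alike
```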
Dual-Group Entropy Indexes
In addition to the multigroup entropy index, indexes for particular groups are also
available here. These employ a two-group entropy index (H) calculation, which uses the same
formulas specified above, where the distribution of each of the six groups in question (Non-Hispanic
Whites, Non-Hispanic African Americans, Non-Hispanic Asians and Pacific Islanders, Non-
Hispanic American Indians and Alaska Natives, Non-Hispanics of other races, and Hispanics) is
compared to the distribution of all other groups combined. In other words, the reference group
for these calculations consists of those who are not of the racial/ethnic group being considered.
Additional discussion and analyses of these indexes are contained in Iceland (2004).
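A dual-group index of this kind can be sketched in Python by collapsing each tract's counts into two columns, the group in question and everyone else, and then applying the same entropy index calculation. The helper name `dual_group_tracts`, the other function names, and the three-group toy data are hypothetical and serve only to illustrate the idea.

```python
import math


def entropy_score(counts):
    """Entropy of one area from its group counts; zero counts add nothing."""
    total = sum(counts)
    return sum((c / total) * math.log(total / c) for c in counts if c > 0)


def entropy_index(tracts):
    """Population-weighted deviation of tract entropy from area entropy."""
    metro = [sum(group) for group in zip(*tracts)]
    E, T = entropy_score(metro), sum(metro)
    return sum(sum(t) * (E - entropy_score(t)) for t in tracts) / (E * T)


def dual_group_tracts(tracts, group):
    """Collapse each tract to [members of `group`, all other residents]."""
    return [[t[group], sum(t) - t[group]] for t in tracts]


# Hypothetical area: three groups in two tracts.
tracts = [[100, 0, 0], [0, 50, 50]]

# Group 0 lives entirely apart from everyone else.
print(entropy_index(dual_group_tracts(tracts, 0)))  # 1.0

# Group 1 shares a tract with group 2, so its index falls between 0 and 1.
h_group1 = entropy_index(dual_group_tracts(tracts, 1))
print(0.0 < h_group1 < 1.0)  # True
```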
References
Duncan, Otis Dudley and Beverly Duncan. 1955. “A methodological analysis of segregation
indexes.” American Sociological Review 20: 210-17.
Iceland, John. 2004. “Beyond Black and White: Residential Segregation in Multiethnic
America.” Social Science Research 33, 2 (June): 248-271.
Iceland, John, Daniel H. Weinberg, and Erika Steinmetz. 2002. Racial and Ethnic Residential
Segregation in the United States: 1980-2000. U.S. Census Bureau, Census Special
Report, CENSR-3, Washington, DC: U.S. Government Printing Office.
James, David R. and Karl E. Taeuber. 1985. “Measures of segregation.” Sociological
Methodology 14: 1-32.
Massey, Douglas S. and Nancy A. Denton. 1988. "The Dimensions of Residential Segregation."
Social Forces 67:281-315.
Massey, Douglas S., Michael J. White, and Voon Chin Phua. 1996. "The Dimensions of
Segregation Revisited." Sociological Methods and Research 25, 2 (November): 172-206.
Reardon, Sean F., and Glenn Firebaugh. 2002. “Measures of MultiGroup Segregation.”
Sociological Methodology 32, 1 (January): 33-67.
Theil, Henri. 1972. Statistical decomposition analysis. Amsterdam: North-Holland Publishing
Theil, Henri and Anthony J. Finizza. 1971. “A note on the measurement of racial integration of
schools by means of informational concepts.” Journal of Mathematical Sociology 1: 187-
White, Michael J. 1986. “Segregation and diversity measures in population distribution.”
Population Index 52: 198-221.
Zoloth, Barbara S. 1976. “Alternative measures of school segregation.” Land Economics 52:
... A multi-racial entropy index was calculated using White populations, Black/African American populations, Asian populations, and Hispanic populations to measure racial segregation [16,23,24]. The entropy index is an evenness residential segregation index that measures each census tract's weighted average deviation from the city's racial or ethnic diversity. ...
... The values of an entropy index range between 0 and 1. A value of 0 means that all census tracts have the same composition as the entire city (integration), and a value of 1 means that all census tracts contain only one racial group (segregation) [16,23]. We also accounted for the clustering of the White population, the clustering of the Black population, the percentage of the Asian population, and the percentage of the Hispanic population. ...
... We used a multi-racial entropy index because, while most segregation measures account for segregation between only two or even one group, such as clustering measures, the multi-racial entropy index was used to measure segregation between multiple groups [23]. This is important because segregation manifests itself in diverse forms within an urban environment, and hence, the entropy index was used to provide a holistic approach of combining multiple races (White population, Black/African American population, Asian population, and Hispanic population) in a single segregation measure, unlike the clustering approach, which measures only a single racial composition. ...
Full-text available
Introduction: The food environment influences the availability and affordability of food options for consumers in a given neighborhood. However, disparities in access to healthy food options exist, affecting Black and low-income communities disproportionately. This study investigated whether racial segregation predicted the spatial distribution of supermarkets and grocery stores better than socioeconomic factors or vice versa in Cleveland, Ohio. Method: The outcome measure was the count of supermarket and grocery stores in each census tract in Cleveland. They were combined with US census bureau data as covariates. We fitted four Bayesian spatial models. The first model was a baseline model with no covariates. The second model accounted for racial segregation alone. The third model looked at only socioeconomic factors, and the final model combined both racial and socioeconomic factors. Results: Overall model performance was better in the model that considered only racial segregation as a predictor of supermarkets and grocery stores (DIC = 476.29). There was 13% decrease in the number of stores for a census tract with a higher majority of Black people compared to areas with a lower number of Black people. Model 3 that considered only socioeconomic factors was less predictive of the retail outlets (DIC = 484.80). Conclusions: These findings lead to the conclusion that structural racism evidenced in policies like residential segregation has a significant influence on the spatial distribution of food retail in the city of Cleveland.
... Non-Hispanic White (White), Non-Hispanic Black or African American (Black), Non-Hispanic Asian (Asian), and Hispanic 66 . There are altogether six categories of indices of dissimilarity: White-Black, White-Asian, White-Hispanic, Black-Asian, Black-Hispanic, and Asian-Hispanic.Next, we calculate the entropy index68 ...
Cities play an important role in achieving sustainable development goals (SDGs) to promote economic growth and meet social needs. Especially satellite imagery is a potential data source for studying sustainable urban development. However, a comprehensive dataset in the United States (U.S.) covering multiple cities, multiple years, multiple scales, and multiple indicators for SDG monitoring is lacking. To support the research on SDGs in U.S. cities, we develop a satellite imagery dataset using deep learning models for five SDGs containing 25 sustainable development indicators. The proposed dataset covers the 100 most populated U.S. cities and corresponding Census Block Groups from 2014 to 2023. Specifically, we collect satellite imagery and identify objects with state-of-the-art object detection and semantic segmentation models to observe cities' bird's-eye view. We further gather population, nighttime light, survey, and built environment data to depict SDGs regarding poverty, health, education, inequality, and living environment. We anticipate the dataset to help urban policymakers and researchers to advance SDGs-related studies, especially applying satellite imagery to monitor long-term and multi-scale SDGs in cities.
... The entropy score is a multi-group measure that assesses the extent to which race/ethnic groups are evenly distributed on a scale of 0 to 1. 3 This index has a minimum value of 0 indicates a county with no diversity and composed entirely of a single race/ethnic group. and a maximum value of 1 indicates a county in which Black, White, and Latinx are equally represented, with larger values indicating more diversity (Feldmeyer, 2009;Iceland, 2004;Massey & Denton, 1988). Other structural variables include immigration, measured as the percentage of the county population that is foreign born; crime prone population, measured as the percentage of males between ages 18 and 24; residential instability, measured as the percentage of a county population that are renters; and police presence, calculated as the number of full-time police officers per 100,000 residents. ...
Full-text available
Structural disadvantage has long been empirically linked to violent crime across different race/ethnic groups. More recently conceptualized as “racial invariance,” observed racial differences in crime rates are hypothesized to be the result of disparities in community-level structural conditions. However, most investigations into this hypothesis have focused on urban settings, with limited attention to rural contexts. The current study seeks to fill this gap by comparing county-level structural predictors of homicide victimization for Black, White, and Latinx populations in both urban and rural communities. Consistent with the racial invariance hypothesis, findings reveal that disadvantage strongly predicts homicide across race/ethnicity in both rural and urban counties. Closer inspection of results, however, exposes noteworthy differences in the effects in rural and urban settings.
... • Population density, which is the estimated number of people per square mile (U.S. Census Bureau, 2010); • Index of Medical Underservice (IMU) score, which ranges from 0 (highest need for medical care access) to 100 (lowest need for medical care access), calculated based on the ratio of primary care physicians per 1,000 population, the infant mortality rate, the percentage of the population with incomes below the federal poverty level and the percentage of the population aged 65 or over (HRSA, no date) -the score applies only to MCBHOs that are located in medically underserved areas and serve medically underserved populations; • Theil's H Index, ranging from 0, suggesting that sub-areas have a composition similar to the larger area (even distribution, less segregation), to 1, suggesting that the racial and ethnic composition of sub-areas within a larger area deviates from the larger area (non-uniform distribution, more segregation) (Iceland, 2004); ...
Muslim community-based health organisations (MCBHOs) represent a new wave of non-profit organisations outside of mosques and Islamic community centres. In this article we examine MBCHOs’ core management competencies because they are instantiations of institutional logics, which result in different forms of organisational hybridity within the third sector. Theoretically, we focus on the instantiations that are associated with a societal institutional logic (religion) and two organisational field logics (voluntarism and healthcare). Empirically, we draw from a survey, maps, tax filings and strategic plans. We observed convergences in financial and human resource management and divergences in community engagement and patient assessment among 110 MCBHOs located in the United States. Volunteering and patient care hold the meaning of faith. Our findings suggest that most MCBHOs resemble an assimilated hybrid, characterised by managerial practices that adhere to the core logics of healthcare and voluntarism, with traces of the Islamic religious logic. We thus introduce the concept of ‘faithwashing’.
... Independent variables--The focal independent variable is K-12 school diversity, which we measure through a multigroup entropy index of school desegregation. The entropy index is a multi-group measure of "evenness" and describes how groups are distributed across schools within the metropolitan areas (Iceland 2004;Massey and Denton 1988). ...
... The higher the value of h i , the higher the diversity of drinking water sources. Lower values indicate lower diversity while a value of zero indicates that the community has only one drinking water source (Iceland, 2004;Reardon & Firebaugh, 2002). High diversity is associated with largely vendor water sources particularly sachet water while low diversity is associated with mainly the piped water sources. ...
Full-text available
Universal access to safe drinking water is essential to population health and well-being, as recognized in the Sustainable Development Goals (SDG). To develop targeted policies which improve urban access to improved water and ensure equity, there is the need to understand the spatial heterogeneity in drinking water sources and the factors underlying these patterns. Using the Shannon Entropy Index and the Index of Concentration at the Extremes at the enumeration area level, we analyzed census data to examine the spatial heterogeneity in drinking water sources and neighborhood income in the Greater Accra Metropolitan Area (GAMA), the largest urban agglomeration in Ghana. GAMA has been a laboratory for studying urban growth, economic security, and other concomitant socio-environmental and demographic issues in the recent past. The current study adds to this literature by telling a different story about the spatial heterogeneity of GAMA’s water landscape at the enumeration area level. The findings of the study reveal considerable geographical heterogeneity and inequality in drinking water sources not evidenced in previous studies. We conclude that heterogeneity is neither good nor bad in GAMA judging by the dominance of both piped water sources and sachet water (machine-sealed 500-ml plastic bag of drinking water). The lessons from this study can be used to inform the planning of appropriate localized solutions targeted at providing piped water sources in neighborhoods lacking these services and to monitor progress in achieving universal access to improved drinking water as recognized in the SDG 6 and improving population health and well-being.
... Multigroup entropy (H) is a measure of evenness that considers the simultaneous distribution of three or more mutually exclusive population subgroups among a set of smaller geographies nested within a larger geography (Iceland, 2004;. Again, the smaller geographies are enumeration districts, which nest within a county. ...
The extant theory posits that ethno-racial diversity promotes entrepreneurship by increasing the novelty of information and perspectives available for recombination in a region. This view presupposes the flow of novel information among potential entrepreneurs. Yet, we know comparatively little about how regional social structures (e.g., collective social capital) that affect information flows condition this relationship. We build on the sociological literature to theorize how the interplay between collective social capital and residential segregation moderates the relationship between ethno-racial diversity and entrepreneurship. We test, and find empirical support for, our hypotheses among all registered new ventures started in the United States between 1990 and 2018.
An assessment of the differentiation of the ethnodemographic factor combining the spatial and temporal development of interrelated processes of population movement and ethnodemographic heterogeneity, cultural evolution in the United States of America for the period 1981-2020 in the context of individual territories is carried out. The empirical results obtained indicate an ever-increasing, but multidirectional change in the regional interaction of demographic, ethnic and cultural processes in the space - time continuum of the United States. As a result of the comparative analysis, the conclusion is made about the «American specificity» and at the same time the typicality (in comparison with other countries) of the studied spatial and temporal dynamics of the influence of the ethnodemographic factor on the nature of cultural evolution observed in different groups of US territories over the past 40 years. The presence of a qualitative transition is empirically substantiated - a cultural shift characterized by the achievement of a certain quantitative limit of the linear development of cultural values to democracy, tolerance, freedom, national unity. The noted shift «highlighted» the critical essence (threshold values) of the multidirectional evolution of the American nation over the past 20 years, determined by the predominance of factional or polarization trends in population development, migration, and localization «pressure» on the concentration of carriers of different cultures
Full-text available
Abstract We examine trends in five dimensions of segregation for African Americans, Hispanics, Asians and Pacific Islanders, and American Indians and Alaska Natives: evenness, exposure, concentration, centralization, and clustering. The trend for African Americans is clearest— declines in segregation over the 1980 to 2000 period, regardless of the dimension considered. Nevertheless, segregation is still higher for African Americans than for the other groups across all measures. Latinos are generally the next most highly segregated group, followed by Asians and Pacific Islanders and then American Indians and Alaska Natives. Asians and Pacific Islanders and Hispanics both tended to experience increases in segregation over the period, though not across all dimensions. Increases were generally larger for Asians and Pacific Islanders than for Hispanics. The story of American Indian and Alaska Native residential segregation is mixed, with declines across some dimensions of segregation and increases in others. 1 Racial and Ethnic Residential Segregation in the United States: 1980-2000 Residential segregation has been the subject of considerable research for many,years. An extensive tour through any major American city reveals that many neighborhoods,are racially and ethnically homogenous. In addition to controversiesabout the causes and consequences,of residential segregation, there are substantial disagreements as to how to best measure it. Massey and Denton (1988) identified 19 residential segregation indexes and used cluster analysis to distinguish five key dimensions: evenness, exposure, concentration, centralization, and clustering.
This paper conceives of residential segregation as a multidimensional phenomenon varying along five distinct axes of measurement: evenness, exposure, concentration, centralization, and clustering. Twenty indices of segregation are surveyed and related conceptually to one of the five dimensions. Using data from a large set of U.S. metropolitan areas, the indices are intercorrelated and factor analyzed. Orthogonal and oblique rotations produce pattern matrices consistent with the postulated dimensional structure. Based on the factor analyses and other information, one index was chosen to represent each of the five dimensions, and these selections were confirmed with a principal components analysis. The paper recommends adopting these indices as standard indicators in future studies of segregation.
In this article the authors replicate and extend the methodological analysis of Massey and Denton (1988), which conceptualized residential segregation as a multidimensional construct with five axes of spatial variation: evenness, exposure, concentration, centralization, and clustering. To reproduce their work, the authors of this article factor analyzed 20 indexes of segregation computed in 1990 for three groups in 58 metropolitan areas. They extended Massey and Denton's analysis by expanding the set of metropolitan areas to include all 318 defined for 1990, and they broadened it by carrying out systematic comparisons across ethnic groups. This study's analyses reconfirm the multidimensional nature of residential segregation; however the authors also find that the indexes recommended by Massey and Denton to measure concentration and clustering do not function quire as well in 1990 as in 1980. Alternative indexes are considered as possibilities, but in the end, using the same indexes is recommended to maintain continuity.
This note is concerned with the measurement of racial integration of schools in a way that permits a simple aggregation of the measure to sets of schools such as school districts. [The procedure can be applied to institutions other than schools, but we prefer a more specific terminology.] There are several standard procedures for measuring integration, but the dissimilarity index appears to be more popular than any other index. This index is based on a comparison of the number of white students in each school, measured as a fraction of the total number of white students in the city, and the analogous nonwhite proportion of the same school. The index is defined as one-half of the sum over all schools of the absolute differences of these proportions. The value of the index is zero when the racial compositions of all schools are identical and it takes larger and larger positive values (up to a maximum of 1) when the racial compositions are more different; also, it has a simple interpretation in terms of minimum shifts which are needed in order to obtain identical racial compositions in all schools. However, the use of absolute differences makes the dissimilarity index less suitable when one wants to aggregate schools to school districts. The objective of this note is to show that there are some simple measures derived from information theory which are superior in this respect, and to illustrate their use by means of data on the elementary schools of the Chicago public school system for the years 1963–1969.
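The verbal definition of the dissimilarity index given above (one-half of the sum, over all schools, of the absolute difference between each school's share of the city's white students and its share of the city's nonwhite students) can be sketched in a few lines of Python. This is a minimal illustration only: the function name, argument names, and the enrollment figures used to exercise it are hypothetical and do not come from the note or its Chicago data.

```python
def dissimilarity_index(white_counts, nonwhite_counts):
    """Two-group dissimilarity index for a set of schools.

    white_counts[i] and nonwhite_counts[i] are the (hypothetical) numbers
    of white and nonwhite students enrolled in school i.
    """
    if len(white_counts) != len(nonwhite_counts):
        raise ValueError("both lists must cover the same schools")

    total_white = sum(white_counts)
    total_nonwhite = sum(nonwhite_counts)
    if total_white == 0 or total_nonwhite == 0:
        raise ValueError("each group needs at least one student")

    # Half the sum of absolute differences between each school's share
    # of the city's white students and of its nonwhite students.
    return 0.5 * sum(
        abs(w / total_white - n / total_nonwhite)
        for w, n in zip(white_counts, nonwhite_counts)
    )


# Illustrative (made-up) enrollments for two schools.
identical_mix = dissimilarity_index([10, 20], [5, 10])
fully_separate = dissimilarity_index([10, 0], [0, 10])
partly_separate = dissimilarity_index([30, 10], [10, 30])
```

With these made-up figures, schools that share the same racial composition give a value of zero, complete separation gives the maximum of 1, and the intermediate case falls in between, matching the range described in the abstract.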
In this paper we derive and evaluate measures of multigroup segregation. After describing four ways to conceptualize the measurement of multigroup segregation—as the disproportionality in group (e.g., race) proportions across organizational units (e.g., schools or census tracts), as the strength of association between nominal variables indexing group and organizational unit membership, as the ratio of between–unit diversity to total diversity, and as the weighted average of two–group segregation indices—we derive six multigroup segregation indices: a dissimilarity index (D), a Gini index (G), an information theory index (H), a squared coefficient of variation index (C), a relative diversity index (R), and a normalized exposure index (P). We evaluate these six indices against a set of seven desirable properties of segregation indices. We conclude that the information theory index H is the most conceptually and mathematically satisfactory index, since it alone obeys the principle of transfers in the multigroup case. Moreover, H is the only multigroup index that can be decomposed into a sum of between– and within–group components.
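As a rough illustration of the information theory index H discussed above, the sketch below uses the form in which H is commonly stated: the population-weighted average shortfall of each unit's entropy (diversity) relative to the entropy of the whole area, divided by the area's entropy. The function names and the sample counts are hypothetical; natural logarithms are assumed, and a group with zero share is taken to contribute zero to the entropy.

```python
import math


def entropy(proportions):
    """Entropy (diversity) score: sum of p * ln(1/p), skipping p == 0."""
    return sum(p * math.log(1.0 / p) for p in proportions if p > 0)


def multigroup_h(counts):
    """Multigroup information theory index H.

    counts[i][r] is the (hypothetical) number of people in group r
    living in unit i, for example a census tract or a school.
    """
    n_groups = len(counts[0])
    unit_totals = [sum(row) for row in counts]
    total = sum(unit_totals)
    group_totals = [sum(row[r] for row in counts) for r in range(n_groups)]

    overall_entropy = entropy([g / total for g in group_totals])
    if overall_entropy == 0:
        raise ValueError("H is undefined when only one group is present")

    # Population-weighted gap between overall and unit-level entropy.
    weighted_gap = 0.0
    for row, unit_total in zip(counts, unit_totals):
        if unit_total == 0:
            continue  # empty units carry no weight
        unit_entropy = entropy([c / unit_total for c in row])
        weighted_gap += unit_total * (overall_entropy - unit_entropy)

    return weighted_gap / (overall_entropy * total)


# Illustrative (made-up) counts for three groups.
h_separate = multigroup_h([[10, 0, 0], [0, 10, 0], [0, 0, 10]])
h_even = multigroup_h([[10, 20, 30], [5, 10, 15]])
```

In this sketch, H is 1 when every unit contains a single group and 0 when every unit mirrors the overall composition. Because the numerator is a sum over units, the contributions of subsets of units can be added up separately, which is the aggregation property that motivates using entropy-based measures.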
Whether greater racial and ethnic diversity in the United States is being accompanied by greater integration remains unclear. This analysis examines segregation in the multi-ethnic context over the 1980–2000 period by using the multi-race information theory index (H), which simultaneously takes the presence of many groups into account, and by also looking at the segregation of each group separately. Results indicate that segregation has been decreasing, mainly due to declines in African American segregation and White segregation, with little change or slight increases in Asian and Hispanic segregation. Growing diversity was associated with increases in overall segregation, White segregation, Hispanic segregation, and Asian segregation, though strongly associated with declines in Black segregation. For Hispanics and Asians, it was the growth in Hispanic and Asian and Pacific Islander populations, respectively, that was associated with increases in segregation, suggesting that this population growth likely buttressed ethnic enclaves.