Abstract
People sort over space due to (among other factors) the demand for amenities. This demand is partially determined by income. Using data from southern California, we investigate how variation in air quality affects residential sorting by income. We find that sorting caused by air quality differences may exacerbate income segregation. (JEL Q51, R14)
I. Introduction
The levels of ambient pollution in southern California, among the highest in the United States, vary dramatically over the Los Angeles basin.1 The scarcity and location-specific nature of clean air in Los Angeles should make the amenity a location determinant among households. The Los Angeles area has, however, experienced declining ambient pollution levels since the late 1970s, despite a large increase in population.2 The region, mandated to eventually attain federal pollution standards, is projected to experience continued declines in ozone and particulate matter levels,3 which should decrease absolute air quality disparities in the area.
This paper examines the effect variation in air quality has on the location decisions of households in the Los Angeles area. We specifically study the effect the amenity has on the sorting of households by income. To project the effect of the area attaining federal standards, we simulate changes in the geographic distribution of relocating households arising from the reduction in air quality disparities in the Los Angeles region. Spatial variation in any amenity could induce sorting along some demographic dimension if demand for the amenity varies by demographic group. Evidence that the demand for air quality varies by income—that clean air, for instance, is a normal good—is, surprisingly, mixed (Kahn and Matsusaka 1997). We directly measure household clustering across neighborhoods by income and estimate the effect air quality variation has on our sorting measure.
We model the location choice of a sample of households that migrated to (or within) the inland counties of the Los Angeles metropolitan area in 1999-2000. Only movers are included; their location choice would more likely reflect the contemporary costs and benefits of the neighborhoods evaluated. The spatial distribution of the sampled relocating households may characterize that of the population if households are generally prone to move.4
We examine households that relocated to the two counties that make up the inland basin of the region: Riverside and San Bernardino. This inland region has the most severe peak air pollution levels in the metropolitan area. Also, because the two counties are landlocked, their air quality is not closely correlated with proximity to the ocean. For households residing in the coastal areas of Los Angeles, where the air is substantially cleaner, it would be difficult to distinguish their demand for unpolluted air from their demand for other amenities associated with the coast.
Income sorting resulting from location-specific amenities is predicted by the basic market theory on urban land use (Alonso 1964). If living space is a normal good whose growth in demand with income outstrips the increase in the time costs of commuting, the theory predicts residential sorting in which higher-income households locate away from the urban center. Geographic sorting has been empirically modeled to arise from demand for a number of location-specific amenities, such as school quality (Black 1999; Bogart and Cromwell 1997), safety (Cullen and Levitt 1999), and public transportation (LeRoy and Sonstelie 1983; Glaeser, Kahn, and Rappaport 2008).
We estimate the effect air quality has on the choices of relocating households. The empirical literature on sorting and local environmental amenities normally estimates a relationship between net changes in population (or demographic subgroup) and environmental quality.5 Cameron and McConnaha (2006) and Greenstone and Gallagher (2008) define location at the census tract level and estimate net population changes induced by proximity to Superfund sites. Their evidence on the household sorting effect of environmental disparities is mixed. Banzhaf and Walsh (2008), defining communities at the sub-census-tract level, find that neighborhoods exposed to higher levels of toxic emissions suffer net population losses and income declines. We find that relocating households of middle and upper incomes are deterred by high levels of ambient pollution. We present evidence suggesting disparities in air quality may contribute to household sorting by income.
II. Neighborhood and Household Characteristics
The neighborhood of the household, this study’s geographic unit of observation, is defined by the Public Use Micro Area (PUMA) constructed by the U.S. Census. Our dependent variable is constructed from Public Use Micro Sample (PUMS) data generated from approximately 5% of the respondents to the 2000 Census. We narrow the sample to the inland households in the Los Angeles area that relocated to their present address within one year of the census survey. Most of the right-hand-side characteristics are as of 1999-2000. We include only those households that rent or own their place of residence.
San Bernardino County in southern California is divided into 11 PUMAs, Riverside into 9. We eliminate the easternmost PUMA in each county. Both PUMAs are large, sparsely settled areas that include the San Bernardino Mountains and do not characterize the remaining relatively urbanized region across the two counties (see Figure 1).
Figure 1. Map of Los Angeles Area and the Specific Public Use Micro Area Neighborhoods in Inland Los Angeles
The relatively large size of the sampled PUMAs—between 108,000 and 288,000 in population—matches the breadth of our ambient pollution measure. We measure the proportion of days in 1998 in which neighborhood ground-level ozone met California’s one-hour standard, which is an hourly ozone concentration of 0.09 parts per million. Excessive ozone exposure has been found to reduce lung function.6 We use data from the South Coast Air Quality Management District (AQMD), which monitors daily air quality from a number of fixed stations in the Los Angeles region. The AQMD provides data on the number of days in a year each monitoring station detects ozone levels exceeding federal or state standards.7 Each PUMA is assigned the monitoring station closest to its center. The assignments are not unique. Eight monitoring stations are assigned to the 18 PUMA neighborhoods.
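The station assignment and the air quality measure described above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the text does not say which distance metric was used, so great-circle distance is an assumption, and the coordinates, identifiers, and the 365-day denominator are all hypothetical.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def assign_nearest_station(puma_centers, stations):
    """Map each PUMA id to the id of the monitoring station closest to its center."""
    return {
        puma: min(stations, key=lambda s: haversine_km(lat, lon, *stations[s]))
        for puma, (lat, lon) in puma_centers.items()
    }

def share_of_days_met(exceedance_days, days_in_year=365):
    """Proportion of days on which the ozone standard was met (assumed denominator)."""
    return 1 - exceedance_days / days_in_year

# Hypothetical coordinates (not from the paper), for illustration only.
stations = {"station_a": (34.10, -117.30), "station_b": (33.95, -117.40)}
puma_centers = {
    "puma_1": (34.12, -117.28),
    "puma_2": (33.90, -117.45),
    "puma_3": (34.00, -117.41),
}
assignment = assign_nearest_station(puma_centers, stations)
```

Because several PUMAs can share a nearest station, the mapping is many-to-one, which is how eight stations can cover 18 neighborhoods.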
We estimate the probability households locate in specific neighborhoods controlling for a number of neighborhood and household characteristics. Neighborhood characteristics include population density and average annual household income. We account for the unit cost of housing by controlling for mean monthly rent and the mean number of rooms per housing structure by neighborhood. The number of rooms is an approximation of the size of the housing unit. We also control for the proportion of residents who are non-Hispanic white and college educated in each neighborhood. The variables were constructed by aggregating data from census tracts falling within the specific neighborhoods. The proportion college educated is the percentage of those aged 25 years or older with at least a bachelor’s degree. Mean rent and number of rooms were constructed by summing census tract aggregate rent and rooms and dividing by the total number of rental units and total housing units. Mean household income by neighborhood was similarly constructed.
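The tract-to-neighborhood aggregation described above amounts to summing tract totals and dividing by the matching unit counts. The sketch below illustrates that arithmetic; the field names and the sample figures are hypothetical and do not come from the census files.

```python
from collections import defaultdict

FIELDS = ("aggregate_rent", "rental_units", "aggregate_rooms", "housing_units")

def neighborhood_means(tracts):
    """Sum tract-level aggregates within each PUMA, then divide by unit counts."""
    totals = defaultdict(lambda: dict.fromkeys(FIELDS, 0.0))
    for tract in tracts:
        for field in FIELDS:
            totals[tract["puma"]][field] += tract[field]
    return {
        puma: {
            # mean rent uses rental units only; mean rooms uses all housing units
            "mean_rent": agg["aggregate_rent"] / agg["rental_units"],
            "mean_rooms": agg["aggregate_rooms"] / agg["housing_units"],
        }
        for puma, agg in totals.items()
    }

# Hypothetical tract records, for illustration only.
tracts = [
    {"puma": "A", "aggregate_rent": 50000, "rental_units": 100,
     "aggregate_rooms": 1000, "housing_units": 200},
    {"puma": "A", "aggregate_rent": 30000, "rental_units": 50,
     "aggregate_rooms": 600, "housing_units": 100},
]
means = neighborhood_means(tracts)
```

Dividing summed aggregates by summed units weights each tract by its size, which differs from averaging the tract means directly.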
The household-specific characteristics we control for include income, number of children under 18, and qualitative variables indicating whether the household is nonwhite and if the householder is over the age of 65.8 We also construct two variables that vary by both household and neighborhood. We control for the effect proximity to work has on neighborhood choice by calculating the distance between the householder’s place of work (by PUMA) and all the potential neighborhood choices in the sample.9 We also account for the household relocating to areas whose racial makeup corresponds to its own by constructing a variable measuring the proportion of the neighborhood’s residents that are of the same race as the relocating household.
III. Logit Model and Estimates
In our model the household chooses among 18 neighborhoods. We estimate the initial stage in what can be considered a two-stage decision in which the relocating household first chooses neighborhood before deciding on final residence. The household makes a utility maximizing selection evaluating the factors that vary by neighborhood. We employ a multinomial logit model to estimate the choice probabilities of the relocating households.
The household is able to assign a specific level of utility to each location. Equation [1] gives the level of well-being household m derives from locating in neighborhood i:

The household’s utility is determined by its own characteristics represented by the vector λm, neighborhood attributes, xi, as well as characteristics in the vector ηmi, which vary by both household and neighborhood. The probability that household m selects neighborhood i is the probability that its utility from neighborhood i exceeds its utility from every other neighborhood j ≠ i. The vector ρ represents the coefficients of the neighborhood-specific covariates, which include our measure of air quality. In the multinomial model, the chooser-specific variables are constructed such that a separate parameter is estimated for each choice (minus one). The vector θ represents the 17 parameters estimated for each of the household-specific covariates in λm.
A single parameter is estimated for each of the covariates within ηmi. The vector includes the distance-to-work variable and the covariate accounting for the proportion of each neighborhood’s residents who are of the household’s race. Both covariates vary by choice and chooser. The vector ηmi also includes two variables interacting air quality with household income. We want to examine whether income partially determines sensitivity to air quality differences. We interact the air quality measure with a dummy variable that defines middle-income households (between the 25th and 75th percentile in the sample) and another defining high income (above the 75th percentile). The uninteracted air quality covariate in the logit model with the interaction terms represents the effect variation in air quality has on the location decisions of low-income households, the left-out group.
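For reference, a utility specification consistent with the definitions above can be written as follows. This is a sketch rather than a reproduction of the paper's equation [1]: the symbols U, V, γ, and ε are assumed labels, as is the normalization of θ to zero for one reference neighborhood. The closed-form probability is the standard multinomial logit result when the errors are independent type I extreme value draws.

```latex
% Assumed notation: U_{mi} utility, V_{mi} its systematic part,
% \gamma the coefficients on \eta_{mi}, \varepsilon_{mi} the random term.
U_{mi} = V_{mi} + \varepsilon_{mi},
\qquad
V_{mi} = \theta_i' \lambda_m + \rho' x_i + \gamma' \eta_{mi}

% Household m chooses i when U_{mi} > U_{mj} for all j \neq i, which gives
P_{mi} = \Pr\left(U_{mi} > U_{mj} \;\; \forall j \neq i\right)
       = \frac{\exp(V_{mi})}{\sum_{j=1}^{18} \exp(V_{mj})}
```

With 18 neighborhoods and one θ normalized to zero, each household-specific covariate carries 17 parameters, matching the count given in the text.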
Table 1 shows specifications of the multinomial logit model for the full sample and a subsample of working households with recorded job location information. Models I and II include the full sample, while III and IV include the distance-to-work covariate, which limits the sample to households where the variable is observed. All of the logit specifications include household-specific covariates—income, number of children, elderly, and nonwhite status—each generating a separate parameter estimate for each of the 18 neighborhood choices (minus one). These covariates, which account for heterogeneity among the relocating households, generate a total of 68 parameter estimates and are therefore not shown.
Table 1. Logit Model Results for Location Choice
Many of the nonenvironmental determinants of household location in Table 1 indicate consistent, intuitive effects across specifications. Households are attracted to more educated areas, and neighborhoods in which a larger percentage of residents are of the household’s race. They are deterred by areas distant from their place of employment and also by population density, although this effect achieves statistical significance only in the working sample.10 Households in the full sample are, counterintuitively, deterred by an area’s income; the relationship switches sign but turns insignificant for the subsample of working households.11
The mean rent covariate and the measure for the average number of rooms control for the cost and size of housing by neighborhood. Household location is consistently negatively correlated with the level of rent across specifications. Holding rent (and the other factors) constant, households are attracted to areas with smaller housing units. This result, statistically significant for the full sample, may reflect households clustering to consume a location-specific amenity not controlled for in our models.
The first and third specifications in Table 1 estimate the average location effect of air quality variation for the full sample and subsample. The results in both specifications suggest air quality is a positive amenity to relocating households, but the point estimates are statistically insignificant. Air quality turns statistically significant when interacted with income in Models II and IV, suggesting a single air quality effect estimated across income groups is an incorrect specification.
While the marginal effects in Table 2 suggest the impact of air quality, averaged across all households, is small, its decomposition uncovers marked disparities by household income. The empirical results suggesting air quality is a deterrent to low-income households are statistically insignificant. Neighborhood air quality is a positive amenity, however, for middle- and high-income households; the statistically significant marginal relationships increase in size with income. The effects in Table 2 indicate a 10 percentage point increase in a neighborhood’s acceptable air quality days would increase a high-income household’s location probability by 1.2%; it would decrease the probability for a low-income household by 0.4%.12 The magnitude of the air quality effects for the subsample of working households is larger for each income class, compared to the full sample.
Table 2. Effects of a Marginal Change in the Proportion of Days the Air Quality Standard Was Met
If air quality is fully capitalized in rent, the household’s marginal willingness to pay for the amenity can be derived from the logit model parameter estimates.13 The estimated willingness to pay for a 10% increase in the number of days air quality standards are met varies from $44.76 per month for middle-income households to $72.74 for high-income households in the full sample. The logit results imply low-income households have a negative willingness to pay.
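The willingness-to-pay calculation can be illustrated as the rent change that leaves indirect utility unchanged when air quality improves. The sketch below assumes air quality enters as a proportion of days, that a 10-point improvement is a change of 0.10, and that each income group's air quality coefficient is the base coefficient plus its interaction term. The coefficient values are hypothetical and are not the estimates behind the dollar figures above.

```python
def monthly_wtp(rho_air, rho_rent, rho_interaction=0.0, delta_air=0.10):
    """Rent change that holds V = rho_air*air + rho_rent*rent constant
    when air quality rises by delta_air (air measured as a proportion of days)."""
    if rho_rent == 0:
        raise ValueError("rho_rent must be nonzero")
    return -(rho_air + rho_interaction) / rho_rent * delta_air

# Hypothetical coefficients, NOT the paper's estimates.
rho_air, rho_rent = -0.5, -0.004
interactions = {"low": 0.0, "middle": 2.0, "high": 3.0}
wtp = {
    group: monthly_wtp(rho_air, rho_rent, shift)
    for group, shift in interactions.items()
}
```

A negative base coefficient with a negative rent coefficient yields a negative willingness to pay for the omitted low-income group, which is the sign pattern the text reports.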
The examination of household sorting by income induced by spatial differences in air quality is premised on income partially determining the demand for the environmental good, which we find. Our evidence on the normality of the environmental good is more consistent than what has been found, in other contexts, in the literature. Kahn and Matsusaka (1997) estimate the probability of supporting various California environmental initiatives and find that many environmental goods are normal over only a range of the income distribution. With the exception of Banzhaf and Walsh (2008), there is little consistent evidence in the literature that demand for environmental goods is normal (e.g., Greenstone and Gallagher 2008; Cameron and McConnaha 2006).
IV. The Geographic Distribution of Relocating Households
The geographic distribution of the environmental amenity is shown in Figure 1. Among the 18 neighborhoods modeled, the two northern PUMAs in San Bernardino County suffered the most high-ozone days in 1998. The least-polluted areas fell largely in mid-Riverside County. The Los Angeles region has in recent years experienced dramatic decreases in ambient pollution and is mandated to eventually fall within federal standards. Figure 1 suggests the sampled households would have redistributed themselves away from Riverside County northward toward San Bernardino if variation in air quality had been eliminated. We simulate this redistribution using estimates from the logit model.
Table 3 illustrates the number of households predicted by the logit model to locate in neighborhoods distinguished by the air quality categories shown in Figure 1. The table shows the effect of a discontinuous change in the variation in air quality. We utilize the logit specification in Table 1 estimating separate income effects for the full sample (Specification II) to generate the individual probability distribution for each household across neighborhoods. We treat the average probabilities calculated across the 9,202 households for each choice as relative frequencies and predict the number of relocating households by neighborhood. We also simulate, using the logit function, the location probability distributions after eliminating geographic differences in air quality. We set air quality to be invariant across neighborhoods and calculate location probabilities using the estimated logit function with only the remaining nonenvironmental determinants.
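The simulation logic described above can be sketched as follows: compute each household's choice probabilities, sum them across households to get predicted counts, then repeat with air quality held constant across neighborhoods. All inputs here are hypothetical, and `base_utility` is a stand-in for the nonenvironmental part of the estimated logit function. Setting the common air quality level to the mean is also an assumption, although any common level gives the same probabilities because a term that is constant across alternatives cancels in the logit formula.

```python
import math

def choice_probabilities(utilities):
    """Logit probabilities over one household's systematic utilities."""
    top = max(utilities)  # subtract the max for numerical stability
    expu = [math.exp(u - top) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]

def predicted_counts(households, neighborhoods, air_effect, base_utility,
                     equalize_air=False):
    """Sum choice probabilities over households to get expected counts."""
    common_air = sum(n["air"] for n in neighborhoods) / len(neighborhoods)
    counts = [0.0] * len(neighborhoods)
    for h in households:
        slope = air_effect[h["income_group"]]  # income-specific air quality effect
        utils = [
            base_utility(h, n) + slope * (common_air if equalize_air else n["air"])
            for n in neighborhoods
        ]
        for i, p in enumerate(choice_probabilities(utils)):
            counts[i] += p
    return counts

def no_other_factors(household, neighborhood):
    """Hypothetical placeholder: all nonenvironmental determinants set to zero."""
    return 0.0

# Hypothetical inputs, for illustration only.
neighborhoods = [{"air": 0.9}, {"air": 0.6}]
households = [{"income_group": "high"}, {"income_group": "high"},
              {"income_group": "low"}]
air_effect = {"low": -1.0, "high": 3.0}

actual = predicted_counts(households, neighborhoods, air_effect, no_other_factors)
equalized = predicted_counts(households, neighborhoods, air_effect,
                             no_other_factors, equalize_air=True)
```

Comparing `actual` with `equalized` by income group and air quality category gives the kind of before-and-after counts the simulation reports.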
Table 3. Simulated Distributions of Relocating Households across Neighborhoods
The advantage high air quality areas would otherwise have in attracting households would be lost upon eliminating differences in the amenity over space. The simulations in Table 3 imply high air quality areas would lose 8.31% of their middle-income households and 12.53% of their high-income households if variation in the amenity were eliminated. The simulated effect is greater, in percentage terms, in low air quality areas; the experiment leads to a 14.39% increase in middle-income households, 23.91% in high-income households. The inverse relationship estimated in the logit models between air quality and the location of low-income households corresponds to the simulated effect, suggesting those households would relocate toward the high air quality areas once spatial differences in air quality were eliminated.
The comparison of the total number of high- and low-income households that would change neighborhoods provides a sense of the disparate effect air quality variation has by income. Of the 2,301 low-income households, 68 are predicted to relocate across air quality categories, a total that is less than one-third of the 207 high-income households (out of 2,302) that are estimated to relocate given the elimination in disparities in the environmental good. This large movement of high- (and middle-) income households leads to the question of whether the resulting redistribution of households would be more spatially segregated.
We measure spatial income clustering across neighborhoods using the Moran statistic. The statistic, measuring the similarity of geographic neighbors along a given dimension, explicitly incorporates the physical proximity of spatial units. The statistic can be used to detect spatial clustering that could arise from household sorting. In the Moran statistic in equation [2], Zik is the number of households within income category k locating in neighborhood i; Z̄k is the mean number of relocating households with income k across neighborhoods; and n is the number of neighborhoods:

$$I_k = \frac{n}{\sum_{i}\sum_{j} w_{ij}} \cdot \frac{\sum_{i}\sum_{j} w_{ij}\,(Z_{ik}-\bar{Z}_k)(Z_{jk}-\bar{Z}_k)}{\sum_{i}(Z_{ik}-\bar{Z}_k)^2} \qquad [2]$$
The weight matrix, wij, defines geographic proximity between neighbors. We use a contiguity matrix, in which wij characterizes neighbors as those with a common border. Positive spatial correlation implies geographically contiguous neighbors are more similar, in terms of the number of relocating households, than noncontiguous neighborhoods.
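A minimal sketch of the Moran statistic with a binary contiguity matrix follows, using the standard textbook form of the statistic. The four-neighborhood layout and household counts are hypothetical and chosen only to show how clustered and alternating patterns differ in sign.

```python
def morans_i(values, weights):
    """Moran's I for a list of values and a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # cross-products of deviations for every pair of neighbors
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    sum_sq = sum(d * d for d in dev)
    return (n / w_sum) * (cross / sum_sq)

# Hypothetical layout: four neighborhoods in a row, each bordering the next.
contiguity = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([10, 8, 3, 1], contiguity)     # similar values adjoin
alternating = morans_i([10, 1, 10, 1], contiguity)  # dissimilar values adjoin

# Expected value under no spatial pattern for 18 spatial units.
expected_random = -1 / (18 - 1)
```

Here `clustered` is positive and `alternating` is negative, while `expected_random` is the benchmark against which a statistic near zero would be judged.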
Table 4 illustrates the Moran statistics calculated from the alternative simulated distributions of households using the estimated variation in air quality and assuming no variation. We use as a benchmark the calculated spatial correlation for the full sample of households. The clustering implied by the positive Moran statistics in Table 4 for all households signifies that neighborhoods receiving a larger than average share of relocating households likely shared borders with neighbors that also held a disproportionate share. The table suggests variation in air quality increases the geographic clustering of all relocating households. Households, in aggregate, cluster across neighborhoods that offer high air quality. In Table 4, the geographic concentration of middle-income households approximates that of all households.
Table 4. Spatial Correlation for the Simulated Distributions of Relocating Households across Neighborhoods
The Moran statistics indicate that although the settlement pattern for high-income households is less spatially correlated than that for the full sample, the estimated clustering of the households is highly sensitive to variation in air quality. The simulated geographic distribution of high-income households assuming no variation in air quality generates a Moran statistic that approaches random assignment across neighborhoods.14 This compares to a positive clustering of high-income households from the simulated distribution given the actual variation of air quality. The elimination of air quality differences would change the location patterns of high-income households such that they would be less segregated from all others across neighborhoods (although segregation could remain within the neighborhoods).
The Moran statistics indicate low-income households are more geographically clustered than is the case for high-income households, regardless of the spatial variation in air quality. The relative isolation of low-income households actually grows with the elimination of air quality differences. The simulation in Table 3 indicates that, given the elimination of differences in the amenity, low-income households would relocate to areas the other two income groups would be moving away from. The Moran statistics suggest low-income households would replace the middle-income group as the most segregated class once air quality differences were eliminated.
V. Conclusion
We present evidence on the effect spatial variation in air quality has on the sorting of households in the Los Angeles area. We find that middle- and high-income households are deterred by high ambient pollution levels. This effect contributes to geographic clustering among those households across neighborhoods. Air quality improvements in the Los Angeles area should reduce the geographic isolation of upper-income households, although not necessarily lower-income households. The results suggest the elimination of disparities in ground-level ozone would decrease the neighborhood-level heterogeneity of high- and middle-income households but possibly increase the isolation of lower-income households.
Acknowledgments
The authors acknowledge helpful input by reviewers of this journal as well as participants at the 2007 NARSC and 2008 ERSA meetings.
Appendix
Mean and Standard Deviation of Variables
Footnotes
The authors are, respectively, professor, Department of Economics and Statistics, California State University, Los Angeles; assistant professor, Department of Urban and Public Affairs, University of Louisville; and professor, Department of Economics and Statistics, California State University, Los Angeles.
1 In 2005, the central San Bernardino Mountain region in southern California recorded the highest ozone concentrations in the nation. The Los Angeles basin recorded the highest particulate matter (PM 2.5) concentrations in the country as well (South Coast Air Quality Management District 2007).
2 For example, at least one area in the Los Angeles basin exceeded the federal ozone standard (an average of 0.085 parts per million over an 8-hour period) on 166 days in 1973; by 2007 this had fallen to 108 days. A simple linear trend over the period indicates the number of days out of compliance fell by 3.7 days per year. See www.arb.ca.gov/adam/ from the California Air Resources Board.
3 Projections are within the 2007 Air Quality Management Plan by the South Coast Air Quality Management District (2007).
4 Wasi and White (2005) estimate that owner households in California had, as of 2000, an average tenure of 13.44 years. For renters, the figure is 5.25 years.
5 There is also a literature that examines the relationship between geographic sorting and broader environmental amenities such as climate (see Graves 1979, 1980; Cragg and Kahn 1997; and Rappaport 2007).
6 The U.S. Environmental Protection Agency website www.epa.gov/air/ozonepollution/health.html summarizes the known deleterious effects of exposure to ground-level ozone.
7 Our empirical results are not sensitive to the specific ozone standard used. Data are also available on actual recorded ozone levels by monitoring station. Although average ozone levels may better convey the magnitude of air quality differences across neighborhoods, we find the calculated means to be highly correlated with our measure of air quality.
8 In the PUMS data, the householder is normally the person who answers the PUMS questionnaire. We define the household as nonwhite if the householder is nonwhite.
9 Many in the sample worked in a neighborhood outside of the PUMAs sampled for this study. Distances were calculated for all households with recorded job location information.
10 The distance to work variable is an exogenous location determinant only if the place of work is established before the household’s relocation decision. This cannot be completely determined in the PUMS data.
11 The results from the county dummy variable also show that, holding other factors constant, households are prone to choose Riverside County neighborhoods over those in San Bernardino County. This may reflect the proximity of many Riverside neighborhoods to the amenities of Orange County, a densely populated upper-income area south of Los Angeles County.
12 The estimated sorting may only partially reflect the household response to variation in environmental amenities. The amenity may be partially capitalized in the form of lower labor market returns even within an urban area (see Blomquist, Berger, and Hoehn 1988). This caveat was provided by one of the referees of this paper.
13 The covariate vector in the logit model and its parameter estimates can be considered the household’s indirect utility function. The function includes the air quality and rent variables: V = ρ1 × air quality + ρ2 × rent. Willingness to pay is the derivative of rent with respect to air quality holding V constant, −ρ1/ρ2.
14 The expected value of the statistic assuming no spatial pattern is −1/(n−1), where n is the number of spatial units. For our sample, the expected value of the statistic assuming random assignment is −0.0588.