Abstract
This paper evaluates whether the property value capitalization effects measured with quasi-experimental methods offer reliable estimates of willingness to pay for changes in amenities. We propose the use of a market simulation as a robustness check. Two applications establish the method’s relevance. The first examines the conversion of land cover from desert to wet landscape. The second examines cleanup of hazardous waste sites. We find that even when quasi-experimental methods have access to ideal instruments, their performance in measuring general equilibrium willingness to pay cannot be assumed ideal. It needs to be evaluated considering the specific features of each application. (JEL Q51)
I. Introduction
There is a fundamental distinction between estimating the effect of a policy that influences the value of a parcel on that land’s price and estimating what an individual would be willing to pay to obtain the policy. This issue is important to nearly all of the reduced form quasi-experimental (QE) and hedonic property value analyses conducted over the past decade. This distinction arises because the source of identifying information used to avoid biases in hedonic estimates that can arise from omitted variables and sorting behavior is not neutral to the economic interpretation of what is measured.1 Two approaches have been used to evaluate the empirical significance of this logical distinction in recovering estimates of economic trade-offs associated with a change in a nonmarket service. The first uses analytical models to describe the properties of these trade-off estimates, using the evaluation logic often associated with quasi experiments.2 The second approach uses simulation methods to evaluate the quantitative importance of distinguishing specific types of changes in site-specific amenities and compares the evaluation logic to conventional cross-sectional hedonic methods.
The theoretical analysis by Kuminoff and Pope (2012) is an example of the first strategy. They adapt the Tinbergen (1959)-Epple (1987) description of the features of a hedonic price function to describe a hedonic equilibrium. With this model they demonstrate that for an infinitesimal, exogenous change in a spatial attribute, conveyed with a house, the prechange and postchange marginal willingness to pay (MWTP) measures will be equal and correspond to the incremental price capitalization. However, in other situations the price differential associated with capitalization may not correspond to either the prechange or the postchange MWTP. In evaluating policies that are inherently nonmarginal, the close relationship between capitalization and willingness to pay (WTP) may not hold. In the current paper, we use simulation methods originating in the logic developed by Cropper, Deck, and McConnell (1988) and Kuminoff, Parmeter, and Pope (2010) to provide a strategy for developing an understanding of this relationship as it arises in each specific type of application. An economic model, calibrated to a specific market, is used to simulate different hedonic equilibria and then to evaluate the performance of conventional cross-sectional hedonic models and methods based on the logic of program evaluation for estimating specific trade-offs people would make in response to changes in spatially varying amenities.
Our analysis complements the existing hedonic simulation papers and extends them to demonstrate how a market simulation can serve as a robustness check on the maintained assumptions of the evaluation logic when it is used to develop measures in property value applications of the trade-offs a person would make to secure more of a desirable amenity. For small changes, analysts have interpreted these measures as point estimates of the MWTP. For large, discrete changes associated with some applications of the evaluation framework, the appropriate interpretation of these measures is a topic of debate. Our analysis provides additional guidance on the interpretation of these measures. We focus on situations where the measure of interest is the general equilibrium willingness to pay (GE WTP) for changes in amenities, which is often the goal of policy analysis. We present two examples to illustrate the importance of a simulation check. Our findings in these examples imply that quasi experiments that are routinely a part of the evaluation logic can have large errors when their estimates of price capitalization are treated as estimates of WTP. We also find that the use of instruments with cross-sectional hedonic modeling can improve the quality of the estimates for the WTP for discrete changes in amenities. This is true even when the changes are large enough to induce re-sorting and result in a new hedonic price function. Finally, we find that the context for each application matters, so that general conclusions about robust strategies for estimating GE WTP do not follow; and therefore, it would be prudent to consider the use of similar simulations as a complement to empirical research on a case-by-case basis.
II. Measuring Trade-Offs For Localized Amenities
Two issues are generally raised in discussions of the challenges posed in estimating households’ WTP for spatially varying amenities with hedonic price functions. The first concerns the prospects for unobservables that co-vary with both the local amenity of interest and housing prices. The realtors’ mantra— “location, location, location”—implies that attributes attached to records of housing sales based on location are likely to be only a subset of the full set of “things” important to buyers and sellers that also vary with location and are not readily observed by the analyst. The second issue is more subtle. An ideal experiment would randomly assign people to different levels of the amenities of interest as if they were in assigned treatment and control conditions. Households’ decisions to re-sort based on their initial assignment would then help to isolate what each individual would trade off to change the level of the amenity he received. Unfortunately, for most hedonic applications, we observe the outcomes after people have sorted, and we don’t know where they started.3 As a result, self-selection to locations based on each person’s preferences is a source of bias in estimating the trade-offs people make to obtain local amenities.
Policy evaluation methods address these problems in several ways. First, they routinely include fixed effects at various spatial scales (or other institutional boundaries) to capture the effects of unobservables.4 Second, they consider differences in the spatial context or in exogenous events as sources of instruments to mimic the treatment and control logic implied by a random assignment. When the evaluation logic is linked to spatial differences or temporal "events," they are usually labeled as QE methods. In these cases the instruments are often indicator variables that identify exogenous events (or areas) that are related to changes in or differences in the amenity of interest that could not be anticipated in advance by the people choosing their residences.
As an example of the QE approach, Chay and Greenstone (2005) consider the case of estimating a household’s WTP to avoid air pollution and use indicators for locations identified as nonattainment areas in a specific time period preceding new regulations as instruments. This initial nonattainment status was interpreted as signaling locations that could be expected to improve because of the mandates from air quality regulations. Measures of changes in housing prices after these regulations took effect might be expected to reflect appreciation in response to anticipated improvements. Thus, the measure of air pollution could be correlated with the model’s error as a direct reflection of the sorting adjustments in response to anticipated improvements in some areas. The earlier nonattainment designation was argued to be exogenous—and outside the control of the households. Using initial nonattainment as an instrument is designed to provide a surrogate for the ideal experiment, which would be initial random assignment and the ability to sort out price changes in response to air quality improvements by examining starting and ending locations of randomly assigned households.
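The role such an instrument plays can be illustrated with a small simulated example. In the sketch below, every number is an assumption chosen only to mimic the structure of the problem, not an estimate from Chay and Greenstone's data: a binary, predetermined designation z (analogous to initial nonattainment status) shifts the amenity change x but is unrelated to the unobservable u that contaminates a naive regression of price changes y on x.

```python
import numpy as np

# Synthetic illustration of the instrumental variables logic. All
# parameter values are assumptions for this sketch.
rng = np.random.default_rng(1)
n = 50_000

z = rng.integers(0, 2, n)                    # exogenous binary instrument
u = rng.normal(size=n)                       # unobservable tied to sorting
x = 1.0 * z + 0.8 * u + rng.normal(size=n)   # amenity change (endogenous)
beta_true = -0.5
y = beta_true * x + u + rng.normal(size=n)   # change in log housing price

# Naive OLS slope is biased because x and u co-vary.
c = np.cov(x, y)
ols = c[0, 1] / c[0, 0]

# Wald/IV estimator: ratio of group-mean differences across the instrument.
wald = (y[z == 1].mean() - y[z == 0].mean()) / \
       (x[z == 1].mean() - x[z == 0].mean())
```

With these assumed parameters the Wald ratio recovers beta_true while OLS does not; the same arithmetic underlies two-stage least squares with a single binary instrument.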
Regression discontinuity (RD) designs exploit situations where the treatment is a deterministic, but discontinuous, function of some independent variable. In this case neither space nor time is required to serve as a basis for mimicking the ideal random assignment. Identification arises from whether an individual observation happens to have a value of this variable above or below a predefined threshold.5 The arguments for RD depend on the analyst’s willingness to extrapolate across the values for this independent variable in the region of the discontinuity. The specification of the model is then important to the logic.
For the most part, evaluation strategies rely on maintained assumptions about the data-generating process that assure that the hypothesized exogenous variables, used as instruments to reduce self-selection bias or as the basis for assigning treatments, meet the required properties. There is less focus on the properties of the economic relationships that are being estimated. Lee and Lemieux’s (2010) excellent review of the use of RD designs in economics notes that they can be treated as if they were randomized experiments. More specifically, they observe: "If the variation in the treatment near the threshold is approximately randomized, then it follows that all ’baseline characteristics’—all those variables determined prior to the realization of the assignment variable—should have the same distribution just above and just below the cutoff" (p. 3). This argument underlies comparisons of structural and demographic characteristics for collections of census tracts with different hazard ranking scores (HRS), above and below the threshold for the National Priorities List (NPL) presented by Greenstone and Gallagher (2008) (see their Table II). It is also one of the reasons for Black’s (1999) comparison of housing attributes for homes sold in corridors defined by their distances to the boundaries defining school attendance zones in her sample of housing sales for three counties in Massachusetts (see her Table I).
However, both of these studies include further analyses of randomization using subsamples to estimate hedonic price functions. This next step in their robustness checks compromises the economic interpretation of their respective models. Black’s analysis introduces an RD design by exploiting the fact that Boston neighborhoods can be divided so that the home owners’ children go to different schools because one portion of a neighborhood is in one district and another portion is in a separate one. To gauge the robustness of the RD structure, the model is estimated in progressively smaller samples, each restricting the initial sample of sales to houses closer to the boundary.6 This practice brings the check for random assignment into potential conflict with the maintained economic assumptions for the model being estimated. There is no economic reason to believe the results of these smaller samples should be consistent with the overall sample. Each represents a different assumption about the extent of the market relevant to defining the hedonic equilibrium.
Overall then, the key distinction is that most descriptions of the QE and RD logic focus on models for outcome measures without a direct connection to an economic model that relies on the sample composition in relation to the economic process it is intended to represent.7 In these applications, the estimating equations are simply statistical summaries of the outcomes of a "deeper" structural process. Thistlethwaite and Campbell’s (1960) introduction of RD assessed the effect of merit awards on future academic outcomes using the threshold test score for the award. Similarly, another classic example by Angrist and Lavy (1999) compares the effects of class size on educational performance using a cap on class size in Israel to specify an RD design. The effects measured in both of these applications of the evaluation logic have implications for economic and social policy, but the empirical models cannot be used to describe a market. In environmental applications, the hedonic price function has a direct economic interpretation as a description of the market equilibrium.
The importance of these distinctions for welfare measurement can be seen directly using an augmented version of a figure Palmquist (1992) developed two decades ago. Figure 1 expands Palmquist’s figure to include two hedonic price functions. ph(q) and ph*(q) are hedonic price functions corresponding to different equilibria. Following his example, we focus on one local amenity, labeled q, but acknowledge that the graph is holding other attributes of houses and locations constant. The B0, B1 curves represent different cross sections of an individual’s bid function, B(q), at different utility levels (see Ellickson 1981).
Figure 1. Interpreting Hedonic Estimates for Large Changes in Site-Specific Amenities.
Consider first the effects of omitted variables. This problem can imply that an analyst would estimate ph(q) and associate its slope at q0 with the MWTP for the individual with preferences described by B0(q). If the analysis omits important variables (due to specification errors or because they are not easily measured) and they co-vary with q, then the true hedonic price function could properly be assumed to be represented by ph*(q). The individual with preferences B0(q) would actually select qa, and a different person, with her associated MWTP (the slope at A’, not at A), would be found at the q0 point. This difference is comparable to what Chay and Greenstone (2005) found when they used controls for unobservables along with their nonattainment instrument for total suspended particulates.
The interpretation of an exogenous change in q can also be described using this figure.8 Suppose there is a change that results in the location initially having q0 moving to q1, and assume the hedonic price function is correctly estimated as ph(q). If the hedonic price function does not change, the individuals with preferences described by B0 and B1 will not be satisfied by q1 with the higher price of the house. This is true regardless of whether the initial occupant is an owner or renter. With zero moving costs, the individual (as a renter) will prefer a location at q0 and move to it. The increase in wealth (ph1 - ph0) of the owner in this case measures the welfare gain. If the individual is the owner, this difference still measures the gain, but this owner may select a new location (due to the income effect of the enhanced value of the initial location). This newly selected location may have a new level of q. Nonetheless, so long as the hedonic price function doesn’t change, the price difference for the location experiencing the change measures the benefit realized by the person at that location.9
Once we consider a situation where the hedonic price function may change as a result of the exogenous change in q, it is essential to distinguish price capitalization effects from measures of the WTP for the change in the local amenity that caused the price function to change. This can be seen in our amended form of Palmquist’s figure. Now suppose the change causing the location initially at q0 to have q1 is large enough that the hedonic price function changes to ph*(q). ph1 - ph0 is then a biased estimate of the capitalization of the change; phR - ph0 is the correct measure of capitalization. Moreover, as Kuminoff and Pope (2012) suggest, there is no reason to believe that the ratio (phR - ph0)/(q1 - q0) offers a reliable measure of the MWTP for q. There are two MWTPs—the slope at A associated with the initial equilibrium (q0) and the slope at D at the new value of q (q1) for the location that originally (before the exogenous event) had q0. Even if the slopes of ph*(q) and ph(q) were equal at A and D, this condition alone does not assure that (phR - ph0) measures the WTP for the discrete change (q1 - q0) or that (phR - ph0)/(q1 - q0) offers a reliable measure of the marginal WTP.10 The WTP for (q1 - q0) with a change in the hedonic price function is a general equilibrium concept and must be developed in these terms.
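The distinction between these capitalization measures can be made concrete with a toy calculation. Both hedonic price functions below, and the values of q0 and q1, are invented for illustration; nothing here is estimated from data.

```python
# Toy version of the amended Palmquist figure. Both hedonic price
# functions and all parameter values are assumptions for illustration.

def ph(q):        # initial equilibrium hedonic price function
    return 100.0 + 20.0 * q - 0.5 * q**2

def ph_star(q):   # post-change hedonic price function after re-sorting
    return 105.0 + 16.0 * q - 0.3 * q**2

q0, q1 = 4.0, 8.0

naive_cap = ph(q1) - ph(q0)        # p_h1 - p_h0: biased once the function shifts
true_cap = ph_star(q1) - ph(q0)    # p_hR - p_h0: correct capitalization
avg_rate = true_cap / (q1 - q0)    # need not equal MWTP at either endpoint

slope_at_A = 20.0 - 1.0 * q0       # MWTP in the initial equilibrium (at q0)
slope_at_D = 16.0 - 0.6 * q1       # MWTP in the new equilibrium (at q1)
```

Here the two capitalization measures differ (56.0 versus 41.8), and the average rate 41.8/4 = 10.45 matches neither the slope at A (16.0) nor the slope at D (11.2), mirroring the point made in the text.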
Our analysis of this amended version of the Palmquist diagram, together with the Kuminoff-Pope analysis, suggests that the insights about what can be learned using the evaluation logic within a hedonic property value (or wage) model will depend on whether the analyst can assume the equilibrium hedonic price function remains unchanged in the face of the manipulations used to mimic random assignment. For most applications there is no clear-cut a priori answer to this question. It will vary with each housing market and problem to be evaluated. It should not be surprising that there has been considerable debate about the ability of the QE approach, with appropriate instruments, to provide more reliable estimates of economic trade-offs than those obtained using traditional cross-sectional hedonic methods.
What is needed is a new robustness check. This paper demonstrates that such a check can be tailored to each problem and application using a simulation framework. We show it is possible to calibrate an assignment model to the specific details of a particular economic application and to use the resulting model to construct simulated hedonic equilibria. With these simulated data we can evaluate whether actual hedonic comparisons can be interpreted as revealing the GE WTP for a change in a spatial amenity. This process then serves as a robustness check on whether assumptions about the nature of the hedonic equilibrium are compromised by the strategies used to control for unobservables and other threats to the statistical properties of the estimates.
III. Illustrating Our Proposal: Two Simulation Experiments
Research Questions
A controlled setting is required to assess how the size and character of changes to spatially varying amenities affect the interpretation of hedonic price functions that we can estimate. We follow the Cropper, Deck, and McConnell (1988) and Kuminoff, Parmeter, and Pope (2010) strategies and use an assignment model to simulate different hedonic equilibria, and then evaluate the performance of evaluation strategies and conventional cross-sectional hedonic approaches in recovering WTP measures associated with policy changes.
We select two actual problems—one for a cross-sectional QE analysis, and a second intended to describe an exogenous event where a temporal context might seem more appropriate. There is nothing in the logic of using simulation that would preclude defining something that looked like an RD design. Each simulation is based on models calibrated using actual data that reflect an existing spatially delineated amenity or disamenity. These actual conditions influence the spatial pattern of connections between the source of the amenity and the houses in each of the samples used to evaluate the estimation methods. This process assures that the spatial correlations in these features, observed in each sample of the housing transactions, will be embedded in the evaluation.
The first simulation involves trade-offs inherent in changing landscaping from wet to desert. This problem parallels cross-sectional sources of experimental control, such as those used by Black (1999). Our second simulation uses hazardous waste sites as an example and is intended to resemble exogenous events that occur over time, such as those presented by Chay and Greenstone (2005) and Greenstone and Gallagher (2008). In each case the standard we use to gauge the performance of each set of methods is based on how close the estimates for WTP are to the average of the true GE WTP for the amenity change in each example. These simulations are designed to answer three specific questions:
1. With a single source of amenity change, how well does the traditional cross-sectional hedonic estimate for GE WTP compare to a first-difference hedonic?
2. Is an ideal instrument necessary, and does it affect the performance or sensitivity of the model’s estimate of GE WTP?
3. Should subsample robustness checks be relied upon to evaluate the QE approach?
The Assignment Logic
The assignment logic for describing an equilibrium, at the micro level, for the housing market defines the indirect utility function in terms of household income, mi, a vector of housing attributes, Aj, and a vector of household characteristics, Ci, with j an index for houses and i the index for households, as in equation [1]. For simplicity we drop consideration of other goods from the analysis:

vij = V(mi - Pj, Aj, Ci). [1]

Bji(vij) is household i’s bid for house j with its utility level at vij; Xji = 1 when household i occupies house j, and Xji = 0 otherwise. An assignment equilibrium arises when there is (for a given, fixed supply of N houses and a fixed number, N, of households) a set of utilities and housing prices, together with the N × N matrix X, such that the four equations in [2] are satisfied (with vk denoting household k’s equilibrium utility level):

Pj = Bji(vij) if Xji = 1, for all j
Pj ≥ Bjk(vk) for all k and all j
Σj Xji = 1 for all i
Σi Xji = 1 for all j. [2]
The first component of equation [2] equates the equilibrium price for each house to the maximum WTP of the household occupying it. The second equation maintains that no one is willing to pay more for a house than the person assigned to the house. The last two equations imply that all households have houses and all houses are occupied.
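For intuition, the four conditions can be satisfied numerically with a Bertsekas-style auction on a small, invented bid matrix. This is only a sketch of the equilibrium concept (with an eps tolerance, since exact equilibria require tie-breaking), not the calibration procedure used in the paper.

```python
import numpy as np

def assignment_equilibrium(bids, eps=1e-3):
    """Auction algorithm for an N-household, N-house assignment market.
    bids[i, j] is household i's bid for house j. Returns owner[j] (the
    household occupying house j) and supporting prices satisfying the
    equilibrium conditions in [2] up to the tolerance eps."""
    n = bids.shape[0]
    prices = np.zeros(n)
    owner = np.full(n, -1)
    unassigned = list(range(n))
    while unassigned:
        i = unassigned.pop()
        surplus = bids[i] - prices            # net gain of each house to i
        j = int(np.argmax(surplus))
        second, best = np.partition(surplus, -2)[-2:]
        prices[j] += best - second + eps      # bid up the price of i's best house
        if owner[j] >= 0:
            unassigned.append(owner[j])       # displaced household re-enters
        owner[j] = i
    return owner, prices

# Invented 3x3 bid matrix (rows: households, columns: houses).
bids = np.array([[10.0, 5.0, 2.0],
                 [8.0, 9.0, 3.0],
                 [7.0, 6.0, 8.0]])
owner, prices = assignment_equilibrium(bids)
```

At termination every house is occupied, every household is housed, and no household would pay more than eps above the posted price to displace another occupant, matching the four conditions up to the tolerance.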
Two solutions to the assignment model will be developed for each of our applications. The first corresponds to a baseline condition defined by the actual spatial distribution of the amenity (or disamenity) associated with each problem. The second corresponds to changes in the spatially delineated amenity that are associated with the specific, discrete policies considered in each example.
Our simulation design requires household-specific preference parameters. These correspond to a distinct vector of parameters that describes each household. Because these designations uniquely identify agents, the analysis of the simulated equilibria can track the re-sorting of households among houses from the baseline to the policy solutions. This tracking permits two separate interpretations of the comparison between postpolicy and prepolicy solutions. The first matches each house, keeping track of the differences in the households assigned to each house in the two solutions. This matching corresponds to the typical data structure used for quasi experiments based on hedonic property value studies. It will be the primary basis for our analysis of each application. The second interpretation would match each household, recording their baseline and policy scenario housing "selections" and associated prices.11 Matches that track households correspond to the types of data and to applications of the QE framework in labor- and health-related applications. For our assessment of the GE WTP we focus on the agent initially assigned to each house.
Calibrating Preferences
Our simulations require two different preference calibrations—one for each application. In developing preference parameters for each application we propose to use market data for our study area—Maricopa County, Arizona. Actual housing sales information, census characteristics, and spatially delineated information describing measures that characterize actual amenities and disamenities in the area are used. We assume preferences can be characterized using a generalized Leontief specification as in equation [3]:

The virtual price function for an attribute given these preferences, for example A1 (with πA1 the virtual price for A1), is given in equation [4]:

Taking the logarithm of both sides, we have equation [5]:

The parameter β cannot be separately identified from the αj parameters. As a result we normalize it to unity. We estimate the preference parameters for all of the attributes jointly, using the estimates from the marginal price equations for each hedonic function associated with our applications. The resulting estimates for the αj’s along with their estimated covariance matrix provide the basis for calibrating the preference heterogeneity across households in our simulation. We draw parameter values using a multivariate normal error structure with a vector of means corresponding to the estimated parameters for each application and the covariance structure associated with those estimates. Our estimate for household income is derived from the U.S. Census and corresponds to the block group median income assigned to each house within a particular block group.12
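The heterogeneity draw can be sketched in a few lines. The mean vector and covariance matrix below are hypothetical placeholders, not our estimates:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical first-stage output: point estimates of the preference
# parameters (alpha_j) and their estimated covariance matrix.
alpha_hat = np.array([0.8, -0.3, 0.5])
cov_hat = np.diag([0.04, 0.01, 0.02])

# One parameter vector per household induces preference heterogeneity
# centered on the estimated parameters.
n_households = 1000
alpha_i = rng.multivariate_normal(alpha_hat, cov_hat, size=n_households)
```

Each row of alpha_i then serves as one simulated household's preference vector when solving for the assignment equilibrium.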
To recover the preference distributions we draw from for use in our simulation, we require two different sets of calibrated functions. Both rely on housing sales in Maricopa County prior to the recent dramatic shocks to the housing market. Each is treated as an independent exercise. The first concerns landscape amenities associated with predominately wet landscapes for subdivisions in the county. The second exploits the presence of seven Superfund sites in the county.
Our estimates for the hedonic price function used to recover preference parameters for the landscape simulation are based on housing sales from 1980 through 2004 in Maricopa County. We estimated a semilog specification for the price equation using housing attributes (square feet of living space, lot size in acres, number of stories, age of home, presence of a pool, and presence of a garage) and two landscape-related variables: a dummy variable for the wet versus desert classification and the July minimum temperature for each parcel. We also measured the distance to the central business district. The model includes fixed effects for the year of the home sale. There are 398,200 observations, and, as one might expect with this large a sample, we are able to estimate the effects of the housing and site attributes with considerable precision. Table A1 in Appendix A reports summary statistics for the sample, as well as our estimates of the first-stage hedonic model we use for the landscape application.
Our primary focus is on the "wet" versus "desert" attribute describing subdivisions in our sample. We are able to identify approximately 23,000 unique subdivisions. For each subdivision, we calculate the percentage of homes classified as wet and restrict attention to subdivisions with a minimum of 25 single-family residential parcels. Among those subdivisions, we define a subdivision as wet if at least 90% of the houses within it are classified as wet, and we define a desert subdivision as one having no wet houses. Appendix C provides the estimated preference parameters from the second-stage analysis for the landscape scenario.
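The classification rule can be written out explicitly. The helper below is a hypothetical sketch of that rule applied to parcel-level records, not code from our data pipeline:

```python
from collections import defaultdict

def classify_subdivisions(houses, min_parcels=25, wet_cutoff=0.90):
    """houses: iterable of (subdivision_id, is_wet) pairs, one per parcel.
    Returns subdivision_id -> 'wet', 'desert', or None (excluded)."""
    counts = defaultdict(lambda: [0, 0])        # id -> [n_parcels, n_wet]
    for sub, wet in houses:
        counts[sub][0] += 1
        counts[sub][1] += int(wet)
    labels = {}
    for sub, (n, n_wet) in counts.items():
        if n < min_parcels:
            labels[sub] = None                  # too few parcels
        elif n_wet / n >= wet_cutoff:
            labels[sub] = 'wet'                 # at least 90% wet parcels
        elif n_wet == 0:
            labels[sub] = 'desert'              # no wet parcels at all
        else:
            labels[sub] = None                  # mixed: excluded
    return labels
```

Mixed subdivisions fall through both tests and are excluded, which is what restricts the sample to sharply wet or sharply desert neighborhoods.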
Our estimates for the hedonic price function for the hazardous waste application consider a smaller sample of housing sales from 1990 to 1999 in Maricopa County. We estimate a semilog specification for the price equation using housing attributes (square feet of living space, lot size in acres, number of stories, age of home, presence of a pool, presence of a garage, and the distance to the central business district). We include city and block group fixed effects to account for other spatially varying unobservable factors that could impact housing prices, as well as temporal fixed effects. We use inverse distance to the closest site with hazardous substances as a measure of the disamenity. Our sample selection criteria for housing sales assured that the seven hazardous waste sites were listed as NPL sites and known to the public for these sales dates. There were 242,827 single-family residential transactions included in this sample. Results are shown in Appendix B, with preference parameters shown in Appendix C.
IV. Findings
Implementing our simulation logic requires at least two separate simulations for each application. The first defines the baseline conditions, and the second defines the change being used for estimation to mimic either spatial differences in environmental services or temporal differences. If our analysis had also assumed the analyst must consider the potential for measurement error or unobserved heterogeneity in the attributes of the housing units included in the equilibrium, then multiple replications of the baseline and altered conditions would be required to estimate the sampling distributions involved. For our example, we assume the only source of unobserved variation is preference heterogeneity. That is, the agents choosing among homes know their preferences and know all the features of each potential house. The analysts do not know how people differ. Under these conditions there is no reason to replicate the simulations.13
We develop our results in four parts. The first part summarizes the features used to simulate our policy examples for the landscape and hazardous waste examples. After that we describe the results from each example separately. The last subsection draws together the common conclusions from our simulations.
Designing the Simulations
The baseline sample for the landscape scenario consists of 1,000 houses. We reduced the full sample of houses used to estimate the preference parameters to only houses within wet and desert subdivisions; 20 houses were selected from each of 25 wet and 25 desert subdivisions.14 As a result, our sample reflects the unobserved spatial correlation between distinct types of subdivisions that would be expected in an actual application of the QE methodology to estimate the economic effects of a landscape amenity on housing prices. The landscape treatment assumes a discrete change in conditions: half of the baseline wet homes switch to desert landscape and half of the desert homes to wet. This treatment means 250 of the homes identified as located in desert subdivisions are assigned wet status, with the associated average nighttime temperature for wet locations. Similarly, 250 homes located in wet subdivisions are assigned desert status, with the associated average nighttime temperature for desert locations. All other attributes of these houses, including accessibility to the central business district and other structural features, are unchanged. Our instrument in this case corresponds to a dummy variable that identifies the specific homes switching from wet to desert, or the reverse.
The baseline solution for the hazardous waste cleanup also selects 1,000 homes from the Phoenix housing sample to portray the spatial array of houses in relation to NPL sites in a way that reflects proximity in Maricopa County. We limit the simulation to houses located near three of the hazardous waste sites. The three sites are the Motorola 52nd Street plant, the 19th Avenue Landfill, and the Indian Bend Wash site. Appendix B provides a brief description of the three sites. As shown in Figure 2, the Motorola and Indian Bend sites are located relatively close to each other, while the 19th Avenue Landfill site is located some distance away from these sites.15 Five hundred houses were randomly selected based on being located within 3 miles of both Indian Bend Wash and the Motorola 52nd Street plant. The remaining five hundred houses are randomly drawn from those within 3 miles of the 19th Avenue Landfill and not within 3 miles of the other hazardous waste sites. Our selection criteria assured that we had 250 houses whose closest site was Indian Bend and 250 houses whose closest site was the Motorola plant. Our preference structure assumes the disamenity effect is for the nearest hazardous waste site. As a result, the simulations for the baseline will not allow for effects from the more distant site even though it is within 3 miles of each house.
Figure 2. Spatial Distribution of Impact Areas for Hazardous Waste Sites.
We developed two experiments. In the first, the Motorola site in the baseline solution was cleaned up. This policy leads to a change in distance to the closest site for a subset of the houses. The houses whose initial closest site was Motorola become closest to the Indian Bend site after the cleanup, as they are still within 3 miles of that site. Thus, the quality change is represented as an increase in the distance to the closest site with hazardous substances for the houses initially nearest the Motorola site. Figure 2 displays the positioning of the two sites, along with the overlap in the areas with homes that could be affected by the cleanup of one of the sites. This overlap is the basis for a cleanup altering the site that is closest to a subset of the homes (i.e., the houses initially closest to Motorola become closest to Indian Bend when the Motorola site is cleaned up, but are farther from Indian Bend than they initially were from Motorola).
The second experiment assumes the policy causes simultaneous changes in two sites. One of the changes involves the assumed cleanup of the Motorola site, as before. For those houses initially closest to the Motorola site, their closest site once again becomes the Indian Bend site. The second cleanup is based on the 19th Avenue Landfill. To maintain an isolated NPL site for a comparison following the cleanup, we divide the 500 houses selected near the 19th Avenue site into two groups of 250 houses, chosen randomly. In effect, this strategy converts the single, isolated 19th Avenue Landfill site into what could be described as the equivalent of two isolated sites with an identical structure for the underlying unobserved spatial correlation. The policy cleans up the 19th Avenue Landfill site for only one set of 250 houses, reducing the disamenity effect of proximity to zero. While there is a differential effect for each of the houses close to this site in the baseline (because they are all at different distances from the site), after the treatment some houses no longer experience any effect due to proximity to a hazardous waste site. The disamenity thus takes a "corner," or zero, value for some households (and houses).
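As a rough sketch of how the two-site policy re-codes each house's disamenity, the function below maps a house to its post-cleanup inverse distance to the nearest remaining dirty site. The dictionary field names and group labels are illustrative assumptions, not the authors' data structures.

```python
def post_cleanup_inverse_distance(house):
    """Post-policy inverse distance (the disamenity index) for one house.

    house: dict with the house's area and distances in miles; the keys
    "group", "treated", "d_indian_bend", and "d_landfill" are assumed names.
    """
    if house["group"] == "motorola_area":
        # Motorola is cleaned, so Indian Bend becomes the closest dirty site
        # (a smaller disamenity, since Indian Bend is farther away).
        return 1.0 / house["d_indian_bend"]
    if house["treated"]:
        # Landfill cleaned for this 250-house subsample: no dirty site
        # remains nearby, so the disamenity hits the zero "corner."
        return 0.0
    # Untreated landfill subsample: the disamenity is unchanged.
    return 1.0 / house["d_landfill"]

# A house initially nearest Motorola now faces the more distant Indian Bend site:
print(post_cleanup_inverse_distance(
    {"group": "motorola_area", "d_indian_bend": 2.5}))    # 0.4
# A treated house near the landfill drops to the corner value:
print(post_cleanup_inverse_distance(
    {"group": "landfill_area", "treated": True}))         # 0.0
```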
The analytical challenge in the first cleanup scenario arises because a subset of the houses close to the Motorola site in the treated sample experiences an improvement. The cleanup changes the distance to a site with hazardous waste, which contributes to the equilibrium hedonic price function. In the second scenario, distance changes associated with proximity to the closest site mean different things depending on which site was cleaned up.
In the first of these experiments, cleaning a single site, the ideal instrument identifies which of the homes designated as having Indian Bend as their closest site following the cleanup were initially designated as closest to Motorola and therefore received the cleanup. For the second example, this instrument is used together with a variable that identifies houses close to the isolated site that was cleaned.
Changes in Landscape
Table 1 summarizes the results of our assessment of different cross-sectional models using the landscape scenario. The objective is to estimate households’ WTP for changes to wet landscapes. Three types of models are considered for each of three samples. The models are distinguished based on functional form (linear and semilog16) and the use of an instrument for the key focus—a fixed effect identifying the wet homes.17 Each model is a hedonic price function, and our focus is on using the estimated parameter for the fixed effect identifying wet homes to calculate WTP measures.
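A minimal sketch of the two functional forms and of converting the wet fixed effect into a WTP measure is given below. The data and coefficients are fabricated, and only one structural characteristic is included; the semilog dummy is converted with the Halvorsen-Palmquist correction described in footnote 16.

```python
import numpy as np

rng = np.random.default_rng(1)

# Fabricated market: price depends on square footage and a wet-landscape
# dummy; the $25,000 premium and other coefficients are assumptions.
n = 1000
sqft = rng.uniform(1200, 3500, n)
wet = rng.integers(0, 2, n)
price = 50_000 + 120 * sqft + 25_000 * wet + rng.normal(0, 10_000, n)

X = np.column_stack([np.ones(n), sqft, wet])

# Linear hedonic: the wet fixed effect is read directly in dollars.
b_lin, *_ = np.linalg.lstsq(X, price, rcond=None)
print("linear wet premium ($):", b_lin[2])

# Semilog hedonic: a dummy coefficient c implies a proportional premium of
# exp(c) - 1 (Halvorsen-Palmquist), applied to the price level.
b_log, *_ = np.linalg.lstsq(X, np.log(price), rcond=None)
print("semilog wet premium (share of price):", np.exp(b_log[2]) - 1)
```

In the paper's application the hedonic includes the full set of structural characteristics listed in footnote 17; this sketch only shows how the fixed-effect estimate maps into the WTP calculation.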
The table reports the average and median value as summaries of the GE WTP for wet landscape for the simulated households assigned to each house after the policy. The "true" WTP is based on the initial assignment of households. Because this analysis is confined to matched houses, the "true" values will differ, as the individuals who selected each type of home change between the baseline and the treated samples. They will also differ for the subsample of wet homes in the baseline and treated simulations.
With the full sample the QE semilog specification provides the best estimates, while the QE linear specification is preferred in the subsample of wet houses. Under the linear specification for the full sample and the semilog specification for the wet sample, the simple cross-sectional hedonic provides superior estimates, with errors of less than 6% compared to over 10% for the QE estimates. Overall, these findings indicate that a simple cross-sectional hedonic based on equilibrium prices following the policy change provides estimates comparable to the QE instrumental variable (IV) models.
Cleanup of Hazardous Waste Sites
Our analysis of the cleanup of hazardous waste sites uses inverse distance as the indicator of the amenity effect and assumes the analyst recognizes it as the way households perceive the disamenity. Thus, our difference and IV difference models use the change in the distance to the closest site as the basis for measuring the change in price. Since these differences are small, the effects attributed to cleanups are small.18 The magnitude of the effect is not important for our purposes, because our focus is on the comparative performance of QE strategies in estimating the WTP. The sample used to estimate each model and the standard for evaluating it are identified in each row of Tables 2 and 3. The sample restrictions also relate to the standard used for performance: houses near the affected sites or the full sample. Thus, this feature of the assessment demonstrates how the average true value of the WTP changes with the subsampling process.
Single Hazardous Waste Model Performance Using Matched Houses
Table 2 provides the results for the cleanup of a single site. In this case we compare two hedonic equilibria, before and after cleanup, to mimic the exogenous event. The model represents the event by changing the distance measure for some of the affected houses. Linear and semilog models are estimated. Our comparison considers the full sample, averaging estimates across all houses, and then the subsample of houses affected by the cleanup. Because distances change in this scenario depending on the subgroup under consideration, the WTP measures for the linear models are not constant across subgroups, unlike in the discrete landscape change scenario. The table reports the true value for the average WTP in annualized dollars, mean and median, as well as the range of estimates for the households initially located near hazardous waste sites before the policy. Below the true values, each row corresponds to a different model. The results show both first-difference and IV first-difference specifications. These models are repeated in linear and log-linear form, with and without the identification of the households affected by the policy as an instrument. We further report results using the full sample of all households as a comparison to the subsample that experiences changes due to the policy.
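The two estimators compared in Table 2 can be sketched on fabricated data as follows. A hand-rolled two-stage least squares stands in for the IV first-difference model; the treated indicator plays the role of the ideal instrument, and the true capitalization rate is an assumption made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(2)

# dy: change in log price; dx: change in inverse distance to the closest
# site; z: ideal instrument flagging houses whose closest site was cleaned.
n = 1000
z = (rng.random(n) < 0.5).astype(float)       # treated indicator
dx = z * rng.uniform(0.05, 0.25, n)           # distance change only if treated
beta = -2.0                                   # assumed true capitalization rate
dy = beta * dx + rng.normal(0, 0.05, n)

# First-difference OLS: regress the price change on the distance change.
X = np.column_stack([np.ones(n), dx])
b_ols = np.linalg.lstsq(X, dy, rcond=None)[0]

# IV first difference (2SLS by hand): project dx on the instrument, then
# regress dy on the fitted values.
Z = np.column_stack([np.ones(n), z])
dx_hat = Z @ np.linalg.lstsq(Z, dx, rcond=None)[0]
b_iv = np.linalg.lstsq(np.column_stack([np.ones(n), dx_hat]), dy, rcond=None)[0]

print("OLS slope:", b_ols[1], "IV slope:", b_iv[1])
```

With the distance change measured without error and exogenous, as in the single-site scenario, both estimators recover the assumed coefficient, which is consistent with the paper's finding that the instrument offers no advantage in this case.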
Overall the results indicate that when using an instrument, the QE strategy is inferior to a simple difference model based on the GE WTP standard and the correctly measured index for the disamenity change. While the differences between the QE and this simple approach can be small—a 9% error with the conventional first differences versus 12% with the QE IV estimator for the full sample—the ranking is consistent for all samples and model specifications. Thus in this case a QE estimation strategy offers no advantage over conventional first-difference models that do not attempt to isolate an instrument for the source of the "treated" houses experiencing cleanup outcomes.
Table 3 considers the results focusing on matched houses for the policy associated with cleaning up two sites simultaneously. In this case the layout of the table is comparable, but there are more evaluations possible. The first five rows use the full sample for two different evaluations: (1) cleanup of the Motorola site which changes distance for those houses close to the site before cleanup; and (2) elimination of a site through cleanup in the case of the 19th Avenue Landfill.
Multiple Hazardous Waste Model Performance Using Matched Houses
In these cases the instruments are essential for the reduced form models to recover estimates of the GE WTP. When the effects are estimated with ideal instruments, the errors are consistent with what we found for the one-site cleanup, which did not use instruments. The results are comparable when we evaluate cleanup of the Motorola site, even though a second site may affect the market equilibrium. When we consider the evaluation of this second site, the landfill, the performance of either a difference model or one that exploits instruments is much less promising, with a 70% error when using the full sample. When examining only the subset of homes near the landfill site, the performance improves markedly. Overall, using the full sample for the landfill cleanup dramatically underestimates the gains compared to when the sample is limited to houses near the cleaned-up site.
Answering Questions
We identified three questions to be considered using our simulation. Our findings are similar across the experiments, but this consistency does not imply they generalize to other applications. Briefly, they support the following answers to the questions posed in Section III.
Performance of Cross-Sectional Hedonics
For simple cross-sectional and time-series designs, the errors associated with cross-sectional hedonic analysis are comparable to those from the evaluation method with an ideal IV for the landscape case and the one-site hazardous waste cleanup case. This is true even though the policy implied discrete changes for the landscape and different changes for each house in the hazardous waste scenario. Of course, our standard is the average GE WTP, not a house-by-house microestimate.19
The Need for an Instrument
When the policy change alters a more complex system, an instrument is essential and the evaluation logic is clearly superior to the cross-sectional hedonic. This case is illustrated when the hazardous waste cleanup policy affects multiple hazardous waste sites and the realized measure for the disamenity effect arises from several sites. In this case a cleanup policy implies different sites will become the closest source of the disamenity for different houses. From a modeling perspective these changes imply a diverse pattern of distance changes that arise from the policy and re-sorting. The instrument allows them to be distinguished.
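To illustrate why the instrument matters in this more complex case, the sketch below (all numbers fabricated) lets re-sorting correlate the observed distance change with the unobserved price shock. OLS on the change is then biased toward zero, while the policy instrument recovers the assumed coefficient.

```python
import numpy as np

rng = np.random.default_rng(3)

# z: houses near the cleaned site (the policy instrument); e: unobserved
# price shock; dx: observed change in inverse distance, which mixes the
# policy effect with re-sorting correlated with the shock.
n = 2000
z = (rng.random(n) < 0.5).astype(float)
e = rng.normal(0, 0.05, n)
dx = 0.15 * z + 0.5 * e + rng.normal(0, 0.02, n)  # re-sorting ties dx to e
beta = -2.0                                       # assumed true coefficient
dy = beta * dx + e

# OLS on the observed change: biased because cov(dx, e) != 0.
X = np.column_stack([np.ones(n), dx])
b_ols = np.linalg.lstsq(X, dy, rcond=None)[0][1]

# 2SLS using the policy indicator: isolates the policy-driven variation.
Z = np.column_stack([np.ones(n), z])
dx_hat = Z @ np.linalg.lstsq(Z, dx, rcond=None)[0]
b_iv = np.linalg.lstsq(np.column_stack([np.ones(n), dx_hat]), dy, rcond=None)[0][1]

print("biased OLS:", b_ols, "IV:", b_iv)
```

The instrument works here because it shifts the distance change through the policy alone, which is the role the text assigns to it in separating policy-driven changes from re-sorting.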
Robustness Checks When the Standard Is Measuring GE WTP
The practice of using subsamples to confirm the findings from the evaluation method is not a reliable gauge of the methodology. In hedonic applications this approach presumes that the hedonic equilibrium price function can be reliably estimated with subsamples. Equally important, it presumes the true GE WTP does not vary with subsamples. Neither condition is assured in practice.
V. Implications
Kuminoff, Parmeter, and Pope (2010) concluded their comprehensive Monte Carlo assessment of the functional forms to be used with hedonic methods when they are supplemented with spatial and temporal fixed effects by noting that the stylized facts for hedonic analysis must change. These "facts" must now acknowledge the importance of QE methods and the merits of controlling for unobservables. Moreover, flexible functional forms such as the quadratic Box-Cox and fixed effects were found in their experiments to be important to realizing desirable properties for the estimates of the MWTP for site attributes.
Our analysis adds to this research as well as to the Kuminoff-Pope (2012) theoretical analysis of the QE logic in four ways. First, our simple graphical analysis and the simulation experiments confirm that a general verdict on the superiority of QE methods with changes in housing prices is unlikely. Our criterion in developing this assessment is a specific one. It is based on how well the estimated capitalization effects correspond to the GE WTP. The definition of the amenity determines both the amount of the change for each household and the "size" of the change from a market perspective. These need not be the same. Households can experience small changes, and yet the market impact can be sufficient to alter the hedonic price equilibrium. This distinction is important and influences the credibility of general arguments for interpreting price changes as measures for GE WTP. To address this issue we suggest a way the calibration, simulation, and measurement logic associated with our experiments could be used as a robustness check on studies where the issue of the size of the effect is especially troubling.
Second, we illustrate the logic for a case that matches one of the most important QE studies from a policy perspective—the Greenstone-Gallagher analysis of hazardous waste cleanup policies. Our results suggest that for situations likely to match the Greenstone-Gallagher application, namely, a policy that affects multiple sites and houses in a single market simultaneously, attention should focus on the quality of the instruments used to sort out the separate effects on different sites. With high-quality instruments our results are consistent with those of Kuminoff, Parmeter, and Pope for the case of MWTP.
Third, robustness checks for QE methods with hedonic models that subsample observations of housing sales should be considered as altering the assumptions of the maintained economic model. As a result, they cannot be treated as reliable assessments of the statistical elements of the research design.
Finally, this overall structure illustrates how simulation can be used as a new type of robustness check, in the spirit of Chetty’s (2009) and Heckman’s (2010) proposals to reconsider Marschak’s (1953) maxim and design models to link strategic combinations of structural parameters to the reduced form logic of the QE framework for evaluation.
Acknowledgments
Thanks are due to two anonymous reviewers, Nick Kuminoff, and seminar participants at the University of Calgary and Oregon State University for very helpful comments on earlier drafts of this research. All remaining errors are our responsibility. Thanks are also due Natalie Cardita for preparing all the drafts of this paper. This material is based upon work supported by the National Science Foundation under grants BCS-1026865 and DEB-0423704 CAP LTER.
Appendix A: First-Stage Hedonic Price Function For Landscape Amenities
Landscape Calibration First-Stage Hedonic
Appendix B: Description Of Hazardous Waste Sites
The Indian Bend Wash site was proposed for listing on the NPL in December 1982 and received final listing in September 1983. The site was listed due to contaminated groundwater and includes 12 square miles of land stretching from Scottsdale to Tempe. As a result of the contamination of groundwater, six city wells were closed. Over 350,000 people live in the contaminated area. The site received "construction completed" status in 2006 but has yet to be fully deleted from the active NPL list.
The second hazardous waste site we focus on is the Motorola 52nd Street plant. This site was proposed for listing in June 1988 and received final listing in December 1989. To date, the site has not received construction complete status and remains an active cleanup site. This site includes a former semiconductor manufacturing plant and encompasses 90 acres in the midst of a residential and commercial area. As a result of a leaking underground storage tank, groundwater and soil were contaminated. The contaminated water has spread several miles underground and is not being used for drinking water, but it resulted in the closure of several wells.
The third and final site we examine is the 19th Avenue Landfill, which was proposed for listing in December 1982 and received final listing in September 1983. The site was delisted in September 2006. The 213-acre landfill is located in an industrial area adjacent to the Salt River. Over 16,000 people live within 6 miles of the site, with the closest residents located only a third of a mile away. As with the previous two sites, this site is responsible for contaminated groundwater, and the problem has been intermittently worsened by flooding of the nearby Salt River, which has breached areas of the closed landfill. Unlike the other two sites, there are no residential wells located in the immediate vicinity of the landfill. Cleanup of the site ultimately cost $22 million.
Hazardous Waste Calibration First-Stage Hedonic
Appendix C: Second-Stage Hedonic Preference Parameter Estimates
Preference Calibration for Second-Stage Hedonic
Footnotes
The authors are, respectively, assistant professor, Department of Agricultural, Environmental, and Development Economics, The Ohio State University, Columbus; and Regents’ Professor and W. P. Carey Professor of Economics, Arizona State University, Tempe, university fellow, Resources for the Future, Washington, D.C., and research associate, National Bureau of Economic Research, Cambridge, Massachusetts.
↵1 This point is one implication of Heckman’s (2010) more general analysis and proposal that the analysis of causal models be distinguished based on first defining the set of counterfactuals and then considering the task of inference from data and the associated identification strategy. The first of these tasks is about how specific, observable outcomes are selected and how they relate to economic agents’ preferences.
↵2 In the context of hedonic property value models, we use the term "evaluation methods" to describe the set of methods that attempt to minimize the bias in reduced form estimates of the effects of a spatially delineated attribute of a location. The effects of omitted variables and sorting can introduce simultaneity bias in conventional cross-sectional hedonic estimation methods. QE methods refer to efforts to identify an exogenous source of variation in the spatial attribute of interest. Ideally it is related to the attribute but not known to the households selecting locations at the time they make their decisions. It can be a feature that varies over space or over time. This variation allows households to be separated into groups, akin to the treated and control samples of a true experiment with random assignment. Because it is not a true random assignment, its logic is referred to as QE, with the ability to avoid bias resting on the quality of the factor that distinguishes the groups. Sometimes identification is based on a discrete threshold of another variable: being above or below a threshold that is outside the control of the agent selecting a location implies different values for the locational attribute. These designs are usually referred to as regression discontinuity designs. Because they require added assumptions, we have labeled them separately from the basic QE methods. See Parmeter and Pope (2013) for a detailed overview of these methods, and see Freeman (2003) for discussion of the use of hedonic models in the context of nonmarket valuation.
↵3 Moreover, the choices are usually the result of a more complex process. They are not responses to a simple exogenous change.
↵4 See Kuminoff, Parmeter, and Pope (2010) for an evaluation of these strategies, and Abbott and Klaiber (2011) for an alternative approach for addressing these issues.
↵5 In practice this threshold is sometimes determined by a spatial boundary.
↵6 Lee and Lemieux also raise issues with Black’s practice of dropping some of the attendance zone boundaries, on the grounds that in this case the attendance zone boundaries may be potentially confounded with other attributes of the neighborhoods. Lee and Lemieux suggest this practice has the potential to convert a strategy for using RD to assure random assignment of treatment and control into a more arbitrary judgment.
↵7 As Heckman (2010) notes, a key element in the structural approach is the joint modeling of the outcomes we can observe in relationship to the economic choices we need to understand in order to evaluate program outcomes. In some cases the outcomes may not be as clearly linked to economic processes. Nonetheless, the analyst needs to appreciate how behavioral processes causing people to be observed in a school district (in these examples) might condition the interpretation of what is measured with the QE or RD analysis.
↵8 This was the original reason for Palmquist’s analysis.
↵9 To demonstrate this result, consider each in analytical terms. Let Vi(·) be the indirect utility function for the ith household with income mi, facing a price schedule p(·) for homes in different locations, and let P be a vector of prices for all other goods and services. Our example is an owner who has a property at a location with site-specific amenities given by q0. The owner selects a location with q0. His income is mi plus the rent p(q0) received from his property, so the left side of the equation defines his baseline condition: Vi(P, mi + p(q0); q0). When the amenity at the owned property changes to q1, nothing changes but the rent's contribution to income, so the willingness to pay for the change equals the capitalization effect, p(q1) − p(q0).
↵10 Theorem 1 in Kuminoff and Pope (2012) details the required assumptions for price capitalization to measure MWTP.
↵11 Actually, these are the assignments required to satisfy the equilibrium conditions given in equation [2]; see Wheaton (1974). For an alternative strategy for computing equilibrium see Kuminoff and Jarrah (2010).
↵12 Our estimates for household preference heterogeneity are held constant for the baseline and policy simulation in each application. We investigated the effects of increasing the covariance matrix for the estimated αj’s by a factor of 10. This increase did not alter any of our findings.
↵13 In any simulation there will be approximation errors. With a sample size of 1,000, these are a relatively small influence on our findings.
↵14 Only wet houses were selected from wet neighborhoods.
↵15 This design is intended to reflect the concerns Nancy Bockstael raised with the Greenstone-Gallagher use of census aggregates in attempting to detect the effects of cleanup of hazardous waste sites on housing values. See Smith (2007), the report of a U.S. Environmental Protection Agency workshop on assessing the benefits from land cleanup policies, for a summary of this discussion.
↵16 The computations for the semilog follow Halvorsen and Palmquist’s (1980) interpretation and correct the coefficients for the dummy variables.
↵17 The hedonic specification includes all the structural characteristics used in the assignment—square feet of living space, lot size, number of stories, number of baths, age in years, presence of a pool, garage, temperature, distance to the central business district, and the dummy variable for presence in a subdivision with mesic landscape.
↵18 Our estimated effects with the actual sales data (see the hedonic price function in Appendix B) are about 2% of the price for a one-unit change in inverse distance. The point is simply that the average value for the inverse distance for the full sample is only 0.15 and becomes smaller with our restricted sample.
↵19 This is different than the summary measures of MWTP presented by Cropper, Deck, and McConnell (1988) and Kuminoff, Parmeter, and Pope (2010).