Combining Revealed and Stated Preference Data to Estimate Preferences for Residential Amenities: A GMM Approach

Daniel J. Phaneuf, Laura O. Taylor and John B. Braden

Abstract

We show how stated preference information obtained from a choice experiment, and revealed preference information based on housing market transactions, can be combined via generalized method of moments (GMM) estimation. Specifically, we use a moment condition matching the predicted marginal willingness to pay (WTP) from a first-stage hedonic model to the marginal WTP formula implied by the choice experiment utility function. This is coupled with other moments from the choice experiment to produce GMM-based estimates of parameters that reflect the strengths of each data source. Our application values remediation of a contaminated site in Buffalo, New York, and we find evidence in support of estimates arising from our approach. (JEL Q51, Q53)

I. Introduction

In this paper we propose a new means of combining revealed and stated preference data in the context of property value models for nonmarket valuation. We show how stated preference (SP) information obtained from a conjoint experiment, and revealed preference (RP) information based on market transactions, can be combined using a generalized method of moments (GMM) estimator. In particular, we propose using a moment condition matching the predicted marginal willingness to pay (WTP) from a first-stage hedonic model to the marginal WTP formula implied by the conjoint model utility function. This moment is coupled with other moments implied by the distributional assumptions used in the conjoint model, to produce estimates of preference parameters that reflect the strengths of each data source. We demonstrate our method using an application valuing remediation of a contaminated site in Buffalo, New York, and find evidence in support of estimates arising from our approach.

The starting point for our proposal is the first-stage hedonic property value model. This model is well established in environmental economics due to its ability to provide estimates of households’ marginal WTP for an amenity, using readily available data on residential housing transactions. The attractiveness of valuation estimates arising from the hedonic model is rooted in their connection to a large and consequential market decision on the part of households. The first-stage hedonic model, however, is notably lacking in its ability to provide valuation measures for discrete changes in an amenity. For this the second-stage hedonic model is needed, but agreement on how best to implement a second stage of estimation for a single market has proven elusive. This has caused researchers to examine SP methods, particularly choice experiments, as a means of obtaining the variability needed to estimate households’ preferences for residential amenities. While this approach has considerable appeal, as with all SP methods, concerns about hypothetical bias cannot be readily dismissed.

Property value applications provide a good example of a larger issue in nonmarket valuation regarding RP and SP data. It is often the case that observed behavior and natural variation in the environment can be used to obtain good estimates of behavioral functions under baseline conditions. However, in many cases, there is insufficient variation to consistently estimate shifts in behavioral functions arising from nonmarginal changes in environmental conditions. SP methods are well suited for generating variation in behavior and environmental conditions through designed experiments, but they typically lack a connection to consequential behavioral outcomes. This observation is the basis for a large literature examining different ways that RP and SP data can be productively combined (see Whitehead et al. 2008 for a review). One strand of this literature focuses explicitly on the notion that RP data can be used to calibrate preferences at baseline conditions, while SP data is used to estimate the size of movements away from the revealed baseline. Von Haefen and Phaneuf (2008) provide an example of this logic for combining RP and SP data to estimate random utility maximization models of recreation behavior. In this paper, we describe a strategy for combining RP and SP data in a model of residential location that is similar in concept but distinct in its approach.

Our combined RP/SP approach envisions a data environment in which a conjoint experiment is conducted in a market for which home sales transactions information is simultaneously gathered. We constructed a database of this type to measure values held by households in Buffalo, New York, for remediation of a contaminated aquatic site near the eastern end of Lake Erie (Braden et al. 2006). In particular, the database contains transactions data and GIS information needed to relate home sales prices to measures of distance to the contaminated site. These data are used to estimate the hedonic price equation, from which we produce estimates of households’ marginal WTP for proximity to the site at baseline conditions. A subsample of households who purchased a home was also asked to complete a survey. The survey included a choice experiment in which respondents were asked to weigh hypothetical houses that differ in price, size, and distance to the contaminated site, against their actual purchase. Our approach uses the two data sources to quantify the trade-offs people are willing to make between housing price and distance to the contaminated site, while also assuring to the extent possible that our preference estimates are grounded in the reality of baseline market conditions.

We find that our RP/SP models provide estimates of preference parameters that are qualitatively similar in sign to the conjoint-only (SP) model, but that their magnitudes imply substantially different welfare measures. In particular, the marginal WTP estimate from our proposed model is more than two times larger than the comparable measure from our SP-only model. Likewise, nonmarginal welfare measures based on a counterfactual increase in distance to the contaminated site result in economically significant differences between the two approaches. Comparisons to estimates from the hedonic model provide evidence of our proposed model’s ability to calibrate the SP responses to a consequential market baseline. This, combined with the ability to conduct discrete change welfare analysis, suggests our GMM approach to combining RP and SP data produces a characterization of preferences that may be preferred to either the RP or SP approach in isolation.

II. Hedonic Model

To motivate our approach, consider the standard property value hedonic model as reviewed by Taylor (2003, 2008) and Palmquist (2005). In this framework a residential property is completely characterized by the variables x and q, where x is a J-dimensional vector of property characteristics that is broadly defined to include all structural, parcel, and neighborhood attributes, and q is an environmental (dis)amenity. For purposes of exposition, we assume q is a scalar good. The market price of a house is determined by an equilibrium price schedule P(x,q)—the hedonic price function—that arises via the interaction of all buyers and sellers in the market. A household participates in the market by choosing the levels of attributes x and q to maximize utility, subject to its budget constraint and the price schedule. Formally, households solve the problem

$$\max_{x,q,z}\; U(x,q,z,h) \quad \text{s.t.} \quad y = z + P(x,q) \qquad [1]$$

where z is the strictly positive numeraire good, y is income, and h summarizes the household’s characteristics, such as family size and composition. Focusing specifically on q, the first-order conditions for this problem lead to the familiar result that the household selects the level of q to equate its marginal rate of substitution between q and the numeraire to the implicit price of q:

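$$\frac{\partial U(\cdot)/\partial q}{\partial U(\cdot)/\partial z} = \frac{\partial P(x,q)}{\partial q} \qquad [2]$$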

Since ∂U(·)/∂z is the marginal utility of income, the first-order conditions imply that, in equilibrium, households select q to equate their marginal WTP for q to the marginal implicit price of q. This is the primary result upon which much of RP analysis of property value markets is based. We return to this point below.
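The step from the household's problem to this tangency condition can be written out with a Lagrangian. The following is a sketch only: the multiplier λ and the assumption of an interior solution are introduced here for illustration and are not notation from the text.

```latex
\begin{aligned}
\mathcal{L} &= U(x,q,z,h) + \lambda\,\bigl[\,y - z - P(x,q)\,\bigr],\\[4pt]
\frac{\partial \mathcal{L}}{\partial q} &= \frac{\partial U(\cdot)}{\partial q}
  - \lambda\,\frac{\partial P(x,q)}{\partial q} = 0,\\[4pt]
\frac{\partial \mathcal{L}}{\partial z} &= \frac{\partial U(\cdot)}{\partial z} - \lambda = 0
\;\;\Longrightarrow\;\;
\frac{\partial U(\cdot)/\partial q}{\partial U(\cdot)/\partial z}
  = \frac{\partial P(x,q)}{\partial q}.
\end{aligned}
```

The second condition identifies λ with the marginal utility of the numeraire, which is why the ratio on the left can be read as a marginal WTP in money terms.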

First, however, we gain further conceptual insight by examining Rosen’s (1974) bid function b(x,q,y,h,u) for a representative level of utility U, which is implicitly defined by

Embedded Image[3]

The bid b(·) is the maximum amount that the household would (and could) pay for a house with attributes x and q, given its income and characteristics, while holding utility fixed at U. Differentiating [3] with respect to q allows us to relate the bid function to the marginal WTP at any point q:

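$$\frac{\partial b(\cdot)}{\partial q} = \frac{\partial U(\cdot)/\partial q}{\partial U(\cdot)/\partial z} \qquad [4]$$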

Bockstael and McConnell (2007) note that the properties of b(·) imply that ∂b(·)/∂q is not a function of income, and so we can rewrite [4] as

$$\frac{\partial b(x,q,h,U)}{\partial q} = \pi_q(x,q,h,U) \qquad [5]$$

where πq(x,q,h,U) is the marginal WTP (compensated inverse demand) function for q. Connecting this back to the first-order conditions, we see that in equilibrium

$$\pi_q(x_0,q_0,h,U_0) = \frac{\partial P(x_0,q_0)}{\partial q} \qquad [6]$$

where x0 and q0 are the household’s observed choices for the attributes, and U0 is the level of utility obtained. The primary objective of empirical models of household residential choice is to obtain an estimate of all or parts of πq(x,q,h,U) for individual households, since this function contains the information needed to conduct welfare analysis for changes in q. In particular, the WTP (compensating variation) for a discrete improvement in q from q0 to q1 is given by

$$WTP = \int_{q_0}^{q_1} \pi_q(x_0,q,h,U_0)\,dq \qquad [7]$$

This WTP measure is what Bockstael and McConnell (2007) refer to as “pure willingness to pay,” since it holds the person at his current location (as opposed to allowing adjustment through mobility) and reflects only the preference effect net of any change in the actual price paid. In what follows we describe the typical RP and SP techniques that have been used to estimate WTP as defined by this expression.
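As a numerical illustration of this definition, the sketch below integrates a marginal WTP curve over a discrete change in q. The curve a/q, the value a = 2000, and the function names are hypothetical choices made only for illustration; all other arguments of the marginal WTP function are treated as held fixed.

```python
import math

def marginal_wtp(q, a=2000.0):
    """Hypothetical marginal WTP curve a / q, declining as q rises."""
    return a / q

def pure_wtp(q0, q1, mwtp=marginal_wtp, steps=10_000):
    """Trapezoid-rule integral of the marginal WTP curve from q0 to q1."""
    width = (q1 - q0) / steps
    total = 0.5 * (mwtp(q0) + mwtp(q1))
    for k in range(1, steps):
        total += mwtp(q0 + k * width)
    return total * width

# For the curve a / q the integral has the closed form a * ln(q1 / q0),
# which gives a check on the numerical result.
numeric = pure_wtp(1.0, 3.0)
exact = 2000.0 * math.log(3.0)
```

With a known curve the integral is trivial; the difficulty discussed in the text is that a single market reveals only one point on each household's curve.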

RP Estimation

The model described thus far is usually presented as the conceptual basis for using observed housing market transactions, coupled with information about the households who purchased the homes, to estimate πq(x,q,h,U) in two steps. The first step involves combining transaction prices and the attributes of properties to estimate P(x,q) based on the econometric model

$$p_i = f(x_i,q_i;\theta) + \varepsilon_i \qquad [8]$$

where (pi,xi,qi) are the observed sale price and attributes for property i = 1,..., I, f(·) is a functional specification for the price schedule, θ is a vector of parameters to be estimated, and εi is a disturbance term. This is the first stage of hedonic estimation, and it is ubiquitous in applied analysis for several reasons. Perhaps most importantly, equation [6] suggests that an estimate of P(x,q) alone can provide a measure of each household’s marginal WTP for q at its observed choice. Thus we can obtain a single point on household i’s marginal WTP curve based on

$$\hat{p}_{qi} = \frac{\partial f(x_i,q_i;\hat{\theta})}{\partial q_i} \qquad [9]$$

where we use p̂qi to denote an estimate of the marginal implicit price of q for person i.
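The two operations in the first stage, fitting a price function and differentiating it at each household's observed q, can be sketched as follows. This is an illustration under strong assumptions: a single attribute q, a linear-log price function, and made-up noise-free data. The function names are hypothetical.

```python
import math

def fit_linear_log(prices, q_values):
    """OLS for p = theta0 + theta1 * ln(q) + error, with one regressor."""
    logs = [math.log(q) for q in q_values]
    n = len(prices)
    mean_l = sum(logs) / n
    mean_p = sum(prices) / n
    sxy = sum((l - mean_l) * (p - mean_p) for l, p in zip(logs, prices))
    sxx = sum((l - mean_l) ** 2 for l in logs)
    theta1 = sxy / sxx
    theta0 = mean_p - theta1 * mean_l
    return theta0, theta1

def implicit_price(theta1, q):
    """Derivative of the fitted linear-log price function with respect to q."""
    return theta1 / q

# Made-up, noise-free data generated from p = 100000 + 20000 * ln(q).
q_obs = [0.5, 1.0, 2.0, 4.0]
p_obs = [100000 + 20000 * math.log(q) for q in q_obs]

theta0_hat, theta1_hat = fit_linear_log(p_obs, q_obs)

# One estimated marginal implicit price per household, at its observed q.
mwtp_hat = [implicit_price(theta1_hat, q) for q in q_obs]
```

Each element of `mwtp_hat` is a single point on a different household's marginal WTP curve, which is all the first stage delivers.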

To estimate the full marginal WTP curve we need to conduct a second stage of estimation. Rosen (1974) suggests estimating an econometric model of the form

$$\hat{p}_{qi} = s(q_i,h_i,y_i;\beta) + \eta_i \qquad [10]$$

where the left-hand side variable is predicted from the first stage of estimation, s(·) is a specification for the ordinary inverse demand for q function, β is a vector of parameters to be estimated, and ηi is a disturbance term. Palmquist (2005) reviews how, in its original cross-sectional form, the second stage of hedonic estimation is fraught with conceptual and econometric identification challenges.

Consider the information needed to identify the parameters in s(·). For nonlinear specifications of f(·), there is cross-sectional variation in pqi and the right-hand-side variables (qi,hi,yi), so equation [10] can in principle be estimated. However, without imposing assumptions about the functional form for utility and the hedonic price function, a sample of size I does not in general contain enough information to trace out a function specific to household i in a single market. This is the Brown and Rosen (1982) critique, and it is best seen by looking at Figure 1. The top panel shows a cross section of the hedonic price function relating q to P(x,q) for given values of x. Equilibrium outcomes for two distinct households 1 and 2 are shown at the points of tangency between their respective bid functions and the hedonic price function. Specifically, household 1 locates at point a and consumes q1, and household 2 locates at point b and consumes q2. The lower panel shows how points a and b correspond to single points on each of two inverse demand for q functions. Note that for each household, πq(·) is traced out as the slope of its individual bid function as q changes. For household 1 we observe point a′ on πq(·,h1) but nothing more; likewise for household 2 we observe only b′ on πq(·,h2). For this example with I = 2, points a′ and b′ represent the sample of data referred to in equation [10]. The Brown and Rosen problem arises because the regression fits a function such as the dashed line in the lower panel, which is not an inverse demand curve for either of the sampled households, unless they have identical preferences, in which case the hedonic price and bid functions are the same. Thus, the main challenge in using a second-stage regression to estimate πq(x,q,h,u) is that a single housing market does not provide adequate variability in price/quantity space, since each household reveals only one price/quantity outcome.

This and other difficulties in estimating the parameters of the underlying bid function have long been recognized in the hedonic literature, and no single solution comes without its own assumptions and trade-offs.1 This presents a dilemma for nonmarket valuation using RP models of property markets. Specifically, while the first stage of estimation usually provides a solid estimate of households’ baseline marginal WTP, researchers typically rely on approximations of unknown quality, rather than second-stage estimates, to obtain the value of discrete changes in q.

SP Estimation

An alternative to the RP approach is to use SP methods to elicit households’ preferences for q as related to their choice of residential location. This has typically been done using choice experiments, or conjoint analysis, in which surveyed households are presented with hypothetical choices between homes of different configurations and asked to indicate their preferred option. Examples of this method are provided by Earnhart (2001, 2002), Braden et al. (2008), and Chattopadhyay, Braden, and Patunru (2005). A typical procedure is to ask respondents to compare their current home to a hypothetical home in which the attributes of interest (for example, the home price and level of q) are experimentally designed to vary away from their baseline levels, while all other attributes remain fixed. This gives rise to a discrete choice model in which the utility from a particular choice c is

$$U_{ic} = V\bigl(x_{ic},q_{ic},y_i - P(x_{ic},q_{ic}),h_i;\beta\bigr) + \varepsilon_{ic} \qquad [11]$$

where V(·) is the observable component of utility that person i gets from choosing option c, β is a vector of parameters characterizing utility, and we have substituted out for zic using the budget constraint yi = zic + P(xic,qic). The random variable εic accounts for the unobserved component of preferences, and it is assumed to have a known distribution. Finally, in what follows we use c = 0 to denote the actual home the person purchased, and c = 1 to denote a hypothetical home against which the actual home is compared. Under the assumption that people select the option with the greatest utility, we can use maximum likelihood to recover estimates of β and thereby a characterization of V(·). We discuss estimation of this model in detail in the following section.

Once we obtain a characterization of V(·), welfare analysis is relatively straightforward. The marginal WTP for q at baseline conditions is found by differentiating [11] with respect to q; we obtain

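$$\frac{\partial V(\cdot)}{\partial q} - \frac{\partial V(\cdot)}{\partial z}\,\frac{\partial P(x,q)}{\partial q} = 0 \qquad [12]$$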

which we can rewrite as

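$$\frac{\partial V(\cdot)/\partial q}{\partial V(\cdot)/\partial z} = \frac{\partial P(x,q)}{\partial q} \qquad [13]$$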

Note that the model once again suggests that the baseline marginal WTP is equal to the marginal implicit price of q in the market. In the SP approach, however, we calculate its magnitude from the utility function estimate rather than an estimate of the price schedule.

Indeed, the right-most expression in [13] is not available in a purely SP study. This is an important distinction that illustrates how RP and SP approaches rely on different information sources to predict similar quantities.

The value of a discrete change in q is found by integrating the marginal WTP function over the relevant range of q:

Embedded Image[14]

For the special case of a model that is linear in yi - P(xic,qic) and qic, the welfare measure reduces to the familiar expression

Embedded Image[15]

where βy is the coefficient on the budget constraint (i.e., the marginal utility of income) and βq is the coefficient on q. This definition of CV corresponds to the gross WTP, since it includes an adjustment for the change in purchase price but does not allow mobility. The pure WTP corresponding to equation [7] is simply the first term in the expression, that is, βq[qi1 - qi0]/βy.
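The pure WTP term for the linear case is simple enough to compute directly. The sketch below assumes utility is linear in income net of price and in q; the coefficient values and the function name are hypothetical and chosen only to show the arithmetic.

```python
def pure_wtp_linear(beta_q, beta_y, q0, q1):
    """Pure WTP for moving q from q0 to q1 under linear utility.

    beta_q is the coefficient on q and beta_y is the marginal utility
    of income, so beta_q / beta_y is the constant marginal WTP.
    """
    return beta_q * (q1 - q0) / beta_y

# Hypothetical coefficients: a marginal WTP of roughly 2,000 per unit of q,
# applied to a two-unit increase in q.
example = pure_wtp_linear(beta_q=0.4, beta_y=0.0002, q0=1.0, q1=3.0)
```

Because the marginal WTP is constant in this special case, the discrete-change value is just the marginal value scaled by the size of the change.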

The SP approach is attractive in that, via the designed experiment, respondents reveal trade-offs between different levels of q and home prices. In this sense it solves the fundamental dilemma of second-stage hedonic estimation in the RP context, because it delivers the variability in price/quality space that is needed to characterize household-specific marginal and nonmarginal change values for q. However, like all SP exercises, respondents do not bear the real consequences of their choices. This potential for hypothetical bias may therefore give one pause when interpreting estimates arising from purely SP methods, and it motivates our combined RP/SP approach.

III. A Combined RP/SP Approach Using GMM

Our proposal is based on equation [13], which illustrates how the RP and SP approaches overlap conceptually but differ empirically. Note in particular that both sides of the equation show the baseline marginal WTP for q; this equality links the conjoint and first-stage hedonic approaches to a common underlying model of preferences. The two approaches, however, obtain this estimate in different ways: the RP estimate is based on analyzing market transactions, and the SP estimate is based on hypothetical trade-offs. The former is likely to be a better baseline estimate, while the coefficient estimates from the latter expand the range of measurement possibilities to include analysis of nonmarginal changes.

To see how the RP and SP data can be combined, consider first the estimating equations for the conjoint model. For ease of notation and exposition let Wict = (xict,qict) and assume that Uict = Wict β + εict, where Wict can include interactions between q and household characteristics, K is the dimension of β, εict is distributed type I extreme value, and t indexes the different choice situations the person faces in the survey. In this case the probability Pritc of observing a particular choice has a simple closed form, and the sample log-likelihood function is given by

$$\ln L(\beta) = \sum_{i=1}^{I}\sum_{t=1}^{T}\sum_{c\in\{0,1\}} d_{itc}\,\ln \mathrm{Pr}_{itc} \qquad [16]$$

where I is the number of people, T is the number of choice occasions each person faces, and ditc = 1 if person i chooses house c on question t, and zero otherwise. The value of β that maximizes [16] is the maximum likelihood estimator.
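A minimal sketch of this log-likelihood for the binary case (actual home c = 0 versus hypothetical home c = 1) is given below. With type I extreme value errors the choice probability takes the logistic form in the attribute differences. The data, the two-attribute layout, and the function names are made up for illustration.

```python
import math

def choice_prob(w_diff, beta):
    """Logit probability of picking the hypothetical home (c = 1).

    w_diff holds W_i1t - W_i0t, the attribute differences between the
    hypothetical home and the home actually purchased.
    """
    index = sum(w * b for w, b in zip(w_diff, beta))
    return 1.0 / (1.0 + math.exp(-index))

def log_likelihood(beta, w_diffs, chose_hypothetical):
    """Sum of log choice probabilities over all (i, t) observations."""
    total = 0.0
    for w_diff, d in zip(w_diffs, chose_hypothetical):
        p1 = choice_prob(w_diff, beta)
        total += math.log(p1) if d == 1 else math.log(1.0 - p1)
    return total

# Made-up data: (price difference in $1,000s, distance difference in miles).
w_diffs = [(-5.0, 0.5), (10.0, 1.0), (0.0, -1.0), (-10.0, -0.5)]
choices = [1, 0, 0, 1]

ll_zero = log_likelihood((0.0, 0.0), w_diffs, choices)    # every probability is 0.5
ll_trial = log_likelihood((-0.1, 0.5), w_diffs, choices)  # a trial value of beta
```

An optimizer would search over β for the value that makes this sum as large as possible.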

Equivalently, we can interpret the estimator arising in [16] as a method of moments (MM) estimator. Define the K × 1 score vector sit(β) for an observation indexed (i,t) as

$$s_{it}(\beta) = \sum_{c\in\{0,1\}} d_{itc}\,\frac{\partial \ln \mathrm{Pr}_{itc}}{\partial \beta} \qquad [17]$$

and recall that the first-order conditions for the maximum likelihood estimator are

$$\sum_{i=1}^{I}\sum_{t=1}^{T} s_{it}(\beta) = 0 \qquad [18]$$

Equation [18] defines a just-identified MM estimator, in which the sums of the scores for the sample serve as the K moments. For future reference, we denote these as g1(β,W) and refer to them as the K SP moments.
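The link between the likelihood and the SP moments can be checked numerically. In the sketch below, which assumes the same binary logit setup with made-up data, the summed score has the closed form (d − Pr) times the attribute difference, and it is compared with a finite-difference gradient of the log-likelihood. All names are hypothetical.

```python
import math

def prob_hypothetical(w_diff, beta):
    """Logit probability of choosing the hypothetical home (c = 1)."""
    index = sum(w * b for w, b in zip(w_diff, beta))
    return 1.0 / (1.0 + math.exp(-index))

def log_likelihood(beta, w_diffs, choices):
    total = 0.0
    for w_diff, d in zip(w_diffs, choices):
        p1 = prob_hypothetical(w_diff, beta)
        total += math.log(p1 if d == 1 else 1.0 - p1)
    return total

def sp_moments(beta, w_diffs, choices):
    """Summed scores; for this binary logit s_it = (d_it1 - Pr_it1) * w_diff."""
    g1 = [0.0] * len(beta)
    for w_diff, d in zip(w_diffs, choices):
        resid = d - prob_hypothetical(w_diff, beta)
        for k, w in enumerate(w_diff):
            g1[k] += resid * w
    return g1

def numeric_gradient(beta, w_diffs, choices, step=1e-6):
    """Central finite-difference gradient of the log-likelihood."""
    grad = []
    for k in range(len(beta)):
        up = list(beta)
        up[k] += step
        down = list(beta)
        down[k] -= step
        diff = log_likelihood(up, w_diffs, choices) - log_likelihood(down, w_diffs, choices)
        grad.append(diff / (2 * step))
    return grad

w_diffs = [(-5.0, 0.5), (10.0, 1.0), (0.0, -1.0), (-10.0, -0.5)]
choices = [1, 0, 0, 1]
beta_trial = (-0.1, 0.5)

analytic = sp_moments(beta_trial, w_diffs, choices)
numeric = numeric_gradient(beta_trial, w_diffs, choices)
```

The two vectors agree up to numerical error, and setting them to zero gives the K moment conditions.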

Consider now adding an additional moment condition based on equation [13]. In particular, define

Embedded Image[19]

as a moment condition, where pi0 is the actual price individual i paid for his home. Note that this moment relates the prediction of baseline marginal WTP for person i in the SP model to the prediction for the person’s marginal WTP as given by the RP model. Assuming that we have effectively estimated a first-stage hedonic model, g2(·) can be computed as a function of data, predictions from the hedonic model, and the K unknown parameters. With equation [19] we now have a collection of K + 1 moments with which to estimate K unknowns, and our model is overidentified. Indeed, additional moments such as [19] can easily be written for any attributes (such as square feet of living space) that are common to both the hedonic and the conjoint models. In these cases a GMM estimator is appropriate, since there are more moments than parameters. For our example with one additional moment, the estimator is defined as the value of β that minimizes

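$$\begin{bmatrix} g_1(\beta,W) \\ g_2(\cdot) \end{bmatrix}' H_I \begin{bmatrix} g_1(\beta,W) \\ g_2(\cdot) \end{bmatrix} \qquad [20]$$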

where HI is a (K+1)×(K+1) dimension weighting matrix that is generally unknown. The feasible, two-step GMM estimator for β as described by Cameron and Trivedi (2005) is used to compute our combined RP/SP estimates for β. We provide additional details on the GMM estimator in the Appendix.
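The criterion being minimized is a quadratic form in the stacked moments, which the sketch below evaluates for K = 2. It is an illustration only: the RP moment assumes utility linear in q and income, so that the SP-implied marginal WTP is a constant ratio of coefficients, and this stand-in is not the exact moment in [19]. The numerical inputs and function names are hypothetical.

```python
def rp_moment(beta_q, beta_y, hedonic_mwtp):
    """Sum over people of (SP-implied marginal WTP - hedonic prediction).

    Assumes linear utility, so the SP-implied marginal WTP equals
    beta_q / beta_y for everyone (an illustrative simplification).
    """
    return sum(beta_q / beta_y - m for m in hedonic_mwtp)

def gmm_objective(moments, weight):
    """Quadratic form g' H g for stacked moments g and weight matrix H."""
    size = len(moments)
    return sum(moments[r] * weight[r][c] * moments[c]
               for r in range(size) for c in range(size))

# Hypothetical inputs: two SP score sums evaluated at a trial beta, and
# first-stage hedonic marginal WTP predictions for two people.
g1 = [0.3, -0.1]
g2 = rp_moment(beta_q=0.5, beta_y=0.05, hedonic_mwtp=[9.0, 11.0])
stacked = g1 + [g2]

# First step of a two-step procedure: identity weighting.
identity = [[1.0 if r == c else 0.0 for c in range(3)] for r in range(3)]
q_first_step = gmm_objective(stacked, identity)
```

In a two-step procedure the identity matrix would be replaced in the second step by the inverse of the estimated covariance of the moments, evaluated at the first-step estimates, and the criterion minimized again.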

Relationship to Other Literature

Our proposal is related to three strands of literature, the most obvious being the large body of work on combined RP and SP models. In particular, it is akin to early work by Cameron (1992) and Kling (1997) that focused on merging RP and SP information from different decision margins (recreation trips and contingent valuation in their cases) to estimate a single preference function. It is also related to more recent research by Whitehead et al. (2010) and von Haefen and Phaneuf (2008), which views the joint use of RP and SP data as a means of exploiting different sources of variability for a single inference task. Because we are asserting (rather than testing) that the property value and conjoint data are generated by the same underlying behavioral process, our approach is distinct from the strand of literature that uses joint models of common decision margins to test the validity of one or the other elicitation method (e.g., Azevedo, Herriges, and Kling 2003).

Our approach is also related to the literature on second-stage estimation of hedonic models. Purely RP strategies for solving the identification problem are based on using functional form and exclusion restrictions, or multiple markets that are spatially distinct, to generate the conditions needed for estimation. Our approach is related to the second of these, though our additional variability comes from a choice experiment in a single market, rather than multiple market RP datasets. However, many of the same notions apply. As in Bajari and Kahn (2005), our strategy relies on first estimating a hedonic price function, and then matching predictions for baseline marginal WTP to a parametric specification for the inverse demand function.

Finally, we borrow ideas from the small literature on preference calibration for benefits transfer (Smith, van Houtven, and Pattanayak 2002). This approach to benefits transfer begins with a parametric expression for the preference function and the implied analytic marginal WTP statements. Numeric estimates from the literature are then matched to their analytical counterparts, from which values of the structural parameters are obtained. In instances when more than one numeric value can be matched to an analytical expression, the authors advocate a moments-based approach. Our model is similar in its reliance on matching analytical expressions to their numeric equivalents generated elsewhere.

IV. Application

We investigate the performance of our combined RP/SP strategy via an application examining households’ WTP to avoid proximity of their primary residence to an aquatic hazardous waste site. The 1987 Amendments to the Great Lakes Water Quality Agreement between the United States and Canada designated 43 sites in the Laurentian Great Lakes, and their tributaries, as Areas of Concern (AOC). A common feature of these areas is the presence of toxic chemicals, notably polychlorinated biphenyls (PCBs), known to cause cancer and neurological defects in humans and to bio-accumulate in aquatic food webs. Since the 1987 Amendments, only one U.S. and two Canadian AOCs have been delisted. The remaining remedial activities on the U.S. side alone are expected to cost between $1.5 billion and $4.5 billion (Great Lakes Regional Collaboration 2005). There is considerable interest in discerning whether further expenditures on cleanup will produce benefits consonant with the costs.

Our analysis focuses on the Buffalo River, New York, area of concern, which is shown in Figure 2. The area consists of a commercial harbor and a 6.2-mile segment of the river running eastward from its terminus into Lake Erie. The AOC is flanked by a large industrial complex, which is in decline and contains many unused contaminated parcels. Nevertheless, there are private homes nearby: the 2000 Census counted 52,628 single-family homes within five miles of the AOC. Our objective is to measure how nearness to the AOC affects the market value of these private homes, and what the value to homeowners would be of remediation. We proxy the latter by analyzing a discrete change in distance to the AOC that is large enough to eliminate the external effects of the disamenity, holding all else constant.2 For these purposes both real estate transactions and survey data were collected; the latter provide both characteristics of households who purchased a home and the results of a conjoint experiment. We explain these two sources of data in turn.

Figure 2

Map of Buffalo River, New York, Area of Concern

Real Estate Data

Our analysis uses sales of single-family, owner-occupied homes that occurred between January 2002 and December 2004. The data were collected by Braden et al. (2006), and initial, separate analyses of the RP and SP data are reported by Braden et al. (2008). The present study is the first effort to combine the data for joint estimation as well as to explore the potential to use a GMM estimator in this context. The sample is limited to properties that lie within five linear miles of any point along the Buffalo River AOC. The study area encompasses most of the city of Buffalo, all of Lackawanna, and portions of Cheektowaga, Hamburg, and West Seneca, as well as two smaller municipalities.

Two primary databases were combined to characterize home sales in our study area. The first comes from local tax assessors and contains sales prices (normalized to 2004 dollars), transaction dates, and property characteristics that include lot size, square feet of living area, age of primary structure, and miscellaneous housing characteristics. The top section of Table 1 displays the names, definitions, and summary statistics for these variables. The second database describes spatial features of the properties that sold in our study area. Variables that are of particular interest include proximity of the house to the AOC; proximity of the house to other location-specific (dis)amenities, such as the shoreline of Lake Erie, local parks, transportation networks, and employment districts; and spatial units such as census tract and block, and school district. The proximity measures were created for each parcel using a GIS map of the Buffalo area. The lower sections of Table 1 show the names, descriptions, and summary statistics for these variables. In particular, the summaries show that 47% of the sales in our sample occurred north of the Buffalo River. This distinction becomes important when we discuss our estimation results. Also, the mean distance to the AOC is approximately three miles. Other summaries we examined indicated that 12% and 16% of homes north and south of the river, respectively, lie within 1.5 miles of the AOC.

Table 1

Variable Descriptions and Summaries for Property Value Data

Table 1 also indicates that there are 118 additional dummy variables for use in the analysis, each representing a census tract in which a property is located. Census tracts are designed to be relatively homogeneous with respect to population characteristics, economic status, and living conditions and generally contain between 2,500 and 8,000 individuals. Census tracts vary in size depending on the population density; in our study area they average less than one square mile. By including census tract identifiers, our analysis nonparametrically controls for infrastructure and demographic factors that influence home choices and prices across space. These spatial fixed effects help eliminate confounding between our distance measures of interest (e.g., miles to the AOC) and other factors that may be correlated with these distances but not included in our explanatory variables (Kuminoff, Parmeter, and Pope 2010).

Survey Data

Based on the home sales data, Braden et al. (2006) randomly selected 850 households that purchased a transacted property, each of whom was sent a survey.3 Among these, 315 were returned; excluding 63 undeliverable surveys, the response rate was 40.7%. Of the returned surveys, 281 were sufficiently complete to be of use for this analysis. The survey was designed to complement the real estate market data. Four categories of information were collected: verification of current home characteristics; measurement of respondent attitudes regarding the AOC; responses to conjoint questions; and household demographic information. The conjoint questions asked respondents to imagine that additional homes had been on the market during their recent home-buying experience. Hypothetical homes were then sequentially offered. Respondents were asked whether, at the time of purchase, they would have preferred the hypothetical home to the home they actually bought. A representative choice question is shown in Figure 3.

In order to focus respondents’ attention on variables of interest and to make the choices as concrete as possible, the hypothetical homes were described as being identical to the current home, aside from four designed attributes. Table 2 summarizes the designed attributes. These were chosen to focus on trade-offs between private aspects of homes (sale price, square feet of living area), and spatial aspects of the neighborhood (distance to the AOC, condition of the AOC). Values for the attributes were expressed in relation to the home/location as it existed at the time of purchase. For sale price and square feet of living area, the designed levels are proportions of the price and home size of the property actually purchased. For proximity, Braden et al. (2006) used nominal deviations from current distance and asked respondents to imagine the river being closer (further) to (from) their home without changing other features of the neighborhood. The environmental condition of the river was varied qualitatively, with toxic pollution increasing, decreasing, staying the same, or being completely eliminated. The four attributes with four levels each suggest there are 4^4 = 256 possible combinations of hypothetical homes. From these they constructed a fractional factorial design that allows estimation of main effects and interactions, while maximizing the efficiency of parameter estimates (Montgomery 2000). Sixty-four unique choice alternatives resulted from the design. Eight survey versions were created so that each version contained eight choice tasks, each comparing one of the hypothetical homes to the respondent’s actual purchase.
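The counts in this design can be reproduced with a short enumeration. The sketch below builds the full factorial and then keeps one regular quarter fraction; the selection rule (level codes summing to a multiple of four) and the split into versions are illustrative assumptions and are not the design the authors constructed.

```python
from itertools import product

LEVELS = range(4)  # four levels per attribute, coded 0 to 3

# Full factorial: every combination of the four designed attributes
# (price, living area, distance to the AOC, condition of the AOC).
full = list(product(LEVELS, repeat=4))

# One regular quarter fraction: keep rows whose level codes sum to 0 mod 4.
fraction = [row for row in full if sum(row) % 4 == 0]

# Split the retained alternatives into eight survey versions of eight tasks.
versions = [fraction[v::8] for v in range(8)]

# Each level of each attribute appears equally often in the fraction.
level_counts = [[sum(1 for row in fraction if row[a] == lev) for lev in LEVELS]
                for a in range(4)]
```

Here `full` has 256 rows, `fraction` has 64, and each version holds eight alternatives, matching the counts reported in the text.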

Table 2

Conjoint Experiment Design

V. Results

Hedonic Model

Our modeling approach involves first estimating the hedonic price function. Using the transactions and spatial data described above, Braden et al. (2006, 2008) estimated the price/AOC distance gradient using several parametric specifications and variable interactions. Here we focus on their preferred specification

Embedded Image[21]

where AOCi is the distance from property i to the area of concern (measured in tenths of miles), Ni is a dummy variable that takes the value one when the property is north of the Buffalo River, sflai is the size of the house, and xi is a vector of all other control variables thought to influence the sale price of the property as listed in Table 1. The two interaction terms with the dummy variable for north of the AOC are included to account for the on-the-ground realities of this particular market. The central business district (CBD) lies north of the AOC, on the westernmost portion of the river (see Figure 1). All else equal, proximity to the CBD is expected to positively influence prices. However, just north of the AOC and running parallel are significant railway networks and an interstate highway, as well as an industrial zone. These features generally lie between the AOC and housing on the north side of the river. In contrast, residential communities south of the AOC begin immediately adjacent to the AOC and continue southward with fewer confounding spatial features.

Figure 1

Identification Problem in Second-Stage Hedonic Estimation

Based on the specification in [21], the marginal WTP for a change in distance is

$$\mathrm{MWTP}_i^{AOC} = \frac{\theta_1 + \theta_2 N_i}{AOC_i} \qquad [22]$$

implying the marginal value of distance decreases as we move further away from the AOC. The more common specification in which the natural log of sales price is the dependent variable was also estimated. While results are qualitatively the same, there were quantitative differences (discussed below) that lead us to focus on the linear-log specification. More flexible specifications may ultimately be advantageous in applications of our proposed method, but we have chosen to stay with the simple and transparent linear-log form for this demonstration.
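As a small worked illustration of how the linear-log gradient behaves, the sketch below evaluates the expression (θ1 + θ2Ni)/AOCi at a few distances. The coefficient values are hypothetical placeholders chosen for round numbers; they are not the Table 3 estimates, and the function name is ours.

```python
def hedonic_mwtp(aoc_tenths, north, theta1, theta2):
    """Marginal WTP for one more tenth of a mile from the AOC under a
    linear-log price function: (theta1 + theta2 * N) / AOC."""
    if aoc_tenths <= 0:
        raise ValueError("distance must be positive")
    return (theta1 + theta2 * north) / aoc_tenths


# Hypothetical coefficients for illustration only (not the paper's estimates).
THETA1, THETA2 = 5000.0, -4000.0

south_near = hedonic_mwtp(10, 0, THETA1, THETA2)  # 1.0 mile away, south of the river
south_far = hedonic_mwtp(20, 0, THETA1, THETA2)   # 2.0 miles away, south of the river
north_near = hedonic_mwtp(10, 1, THETA1, THETA2)  # 1.0 mile away, north of the river

print(south_near, south_far, north_near)
```

Doubling the distance halves the marginal value, which is the sense in which the linear-log form forces the gradient to flatten with distance; the north dummy simply shifts the numerator.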

Selected coefficient estimates and robust standard errors obtained using ordinary least squares on equation [21] with I=3,474 are shown in Table 3. Estimates for the full set of parameters are given in Appendix Table A1. The results show that a larger distance to the AOC positively affects property values south of the Buffalo River. Note, however, that the estimated price gradient for properties north of the river, given by (θ1 + θ2Ni)/ AOCi, is positive but insignificantly different from zero (H0: θ1+θ2=0; F1,3323=0.35, p-value = 0.55). The rail lines, highway, and industrial area discussed above appear to act as a buffer between the residential real estate market to the north and the AOC, and they likely overwhelm the influence of the AOC.4

Table 3

Selected Results from First-Stage Hedonic Model

Table 3 also includes a summary of predictions for baseline marginal WTP (MWTP), based on point estimates obtained using [22]. The sample average MWTP for moving an additional 0.10 miles from the AOC is $295 for all properties, and $362 when computed for just those properties south of the Buffalo River. These are relatively small percentages of the sale price. As the distance to the AOC decreases, however, the effect on property values becomes more substantial. For example, the MWTP for homes located less than 0.3 miles from the AOC is on average almost 7% ($3,406) of the purchase price of the home, and the estimate for homes out to 0.5 miles is 3.5% ($1,548). These numbers generally imply economically significant effects on property values for houses near the AOC, but the effect decreases fairly rapidly as the distance increases. This is consistent with the highly localized effect of other types of land use externalities (Ihlanfeldt and Taylor 2004). Finally, our estimates imply MWTP for an additional square foot of living space for an average sized home of $21 in areas south of the AOC, and $49 in areas to the north. We also examined models with no difference in the marginal price of living space between areas north and south of the AOC; results for the AOC variables remain qualitatively unchanged but are somewhat less precisely estimated.5 Since the north/south distinction is clearly important empirically, we maintain it in our preferred specification.

SP and RP/SP Models

We turn now to the conjoint model. For our demonstration we consider two specifications for utility:

Embedded Image[23]

and

Embedded Image[24]

where Ritc is the alternative-specific constant that is equal to one if choice c is the house actually purchased and zero otherwise, yi is income, Fi is the number of people living in the household, Eietc is a dummy variable indicating the environmental condition of the AOC, and εitc is distributed type I extreme value. For the utility function in [23], the marginal WTP for distance to the AOC at baseline conditions is simply

Embedded Image[25]

and for the utility function in [24] it is

Embedded Image[26]

Note that in the latter specification, interactions between ln(AOCitc) and household characteristics imply the inverse demand for AOC is conditional on the type of household that chooses to occupy the house, and in particular the household’s income and size. The specific form for the implicit price function in [22] and the marginal WTP functions implied by [25] or [26] are substituted into [19] to fully specify the RP moment condition.

Estimation results for the simple utility function are shown in Table 4, and results for the utility function with interactions are shown in Table 5. In both tables, maximum likelihood estimates for the SP only model are shown in the left-hand columns, while GMM estimates for three versions of our RP/SP model are shown in the remaining columns. The model RP/SP-A uses an additional moment condition based on the hedonic estimates of MWTP for distance to the AOC. The model RP/SP-B also uses one additional moment condition, but here it is based on matching the marginal implicit price of house size (square feet of living space) from the hedonic model to the corresponding expression in the conjoint model. Finally, the model RP/SP-C estimates the parameters using both of these additional moments simultaneously. Standard errors and t-statistics for all the models shown in Tables 4 and 5 were computed by bootstrapping the data. In the case of the RP/SP models we first bootstrapped the hedonic model 200 times to obtain an empirical distribution of the parameters needed to construct the marginal implicit prices entering the moment conditions. For each of these we then bootstrapped the conjoint data; the resulting empirical distribution was used to calculate standard errors for each of the utility function parameters. This more involved routine was needed to account for the fact that the moment conditions used to estimate the utility function parameters are themselves functions of estimates containing sampling noise.
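The two-level bootstrap described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the routine used for Tables 4 and 5: the function name, the toy data, and the two placeholder estimators are ours, and we assume one conjoint resample per hedonic bootstrap draw. With repeated choice tasks per respondent, one would likely resample respondents rather than individual tasks.

```python
import numpy as np


def nested_bootstrap_se(rp_data, sp_data, fit_rp, fit_sp, n_outer=200, seed=0):
    """Bootstrap standard error of an SP-stage estimate that depends on an
    RP-stage estimate, propagating sampling noise from both stages."""
    rng = np.random.default_rng(seed)
    draws = []
    for _ in range(n_outer):
        # Outer step: resample the transactions data and re-estimate the
        # quantity (e.g., an implicit price) that enters the moment condition.
        rp_star = rp_data[rng.integers(0, len(rp_data), size=len(rp_data))]
        price_star = fit_rp(rp_star)
        # Inner step: resample the conjoint data and re-estimate the
        # preference parameter conditional on this bootstrap implicit price.
        sp_star = sp_data[rng.integers(0, len(sp_data), size=len(sp_data))]
        draws.append(fit_sp(sp_star, price_star))
    return np.asarray(draws, dtype=float).std(axis=0, ddof=1)


# Toy stand-ins (assumptions): the "RP estimate" is a sample mean, and the
# "SP estimate" is a sample mean shifted by the RP estimate.
rng = np.random.default_rng(1)
rp_toy = rng.normal(10.0, 2.0, size=400)
sp_toy = rng.normal(0.0, 1.0, size=300)

se_nested = nested_bootstrap_se(
    rp_toy, sp_toy, fit_rp=np.mean, fit_sp=lambda sp, price: sp.mean() + price
)
se_naive = sp_toy.std(ddof=1) / np.sqrt(len(sp_toy))  # ignores first-stage noise

print(se_nested, se_naive)
```

In the toy example the nested standard error exceeds the naive one because it carries the first-stage sampling noise, which is the reason given above for the more involved routine.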

Table 4

Estimates from SP and RP/SP Models without Interactions

Table 5

Estimates from SP and RP/SP Models with Interactions

Consider first Table 4. We find parameter signs that are consistent across all four models, and so we discuss the important qualitative interpretations jointly. As in the hedonic model, we find for all models that distance to the AOC is a “good,” that is, the marginal utility for ln(AOC) is positive. The interaction terms between ln(AOC) and the pollution status dummies demonstrate, however, that the external effect of proximity to the river can change when the pollution status changes. For example, β61 > 0 (the coefficient on the more pollution and ln(AOC) interaction) suggests that the marginal utility of distance to the AOC increases when the pollution status of the river worsens. Likewise β63 < 0 (the coefficient on the full cleanup and ln(AOC) interaction) suggests a fully restored river causes the effect to become smaller.6 These effects are not significantly different from zero, however. Finally, the coefficients on the pollution status dummies are consistent across models. We find that a full cleanup of pollution increases utility, and that utility decreases when there is a worsening of pollution. In terms of attributes not related to the externality, we find a positive coefficient on house size and a negative price effect. Taken as a whole, the signs of our parameter estimates suggest models that are qualitatively similar to each other, and generally reflective of rational trade-offs among survey respondents.

There are, however, important quantitative differences among the models, which we now discuss. The bottom of Table 4 shows estimates of marginal WTP for distance to the AOC under status quo conditions, at two different baseline distances. It also shows estimates for the marginal WTP for home size at a baseline of 1,500 square feet. These are useful for comparing the four models, since the marginal WTP estimates do not confound issues of scale, as do the level parameters. Three patterns are clear. First, the two RP/SP models using the distance to AOC moment (A and C) produce larger estimates of marginal WTP for distance than does the SP model, and this larger value corresponds closely to the hedonic estimate. Second, the RP/SP models that use the home size moment (B and C) produce smaller estimates of the marginal WTP for size than the models that do not (SP and RP/SP-A). By way of comparison, the average marginal WTP for home size from the hedonic model at 1,500 square feet is $34.81, and so the RP/SP models B and C have moved the conjoint estimates toward the actual market outcome. Finally, the estimate of the AOC distance effect is less precise for the RP/SP models relative to the SP-only model. This arises in our case study from the comparatively imprecise estimates coming from the hedonic model; in other applications our method may result in greater precision if estimates from the RP data are themselves less noisy. Our sense here, however, is that the differences in marginal WTP estimates across the models have economic, if not statistical, significance. Our preferred model (RP/SP-C) effectively calibrates the baseline marginal WTP for both distance and square feet to the market price and results in different estimates than the SP-only and other RP/SP models.

The figures in Table 5 are qualitatively similar to those in Table 4. Generally we find that people with higher income have a higher WTP for distance to the AOC, and that larger families have a smaller WTP. Neither of these effects, however, is statistically significant at conventional levels.

Welfare Effects

The main advantage of the SP and RP/SP models relative to the first-stage hedonic model is that they are both capable of delivering estimates of the WTP for discrete changes in the conditions of the AOC. As noted by Bockstael and McConnell (2007), however, there are different welfare measures we can use depending on whether or not households can move in response to a shock, and how we treat changes in prices. For our SP and RP/SP conjoint models, it is necessary to assume that households stay in their current home (i.e., they are not mobile), since the experimentally designed choice set does not characterize the true collection of homes a household might consider in the case of a move. This assumption is typical in most property value welfare measurements, given the difficulty of predicting counterfactual moves. Thus, the proper formula for welfare measurement using the utility function parameter estimates is given in equation [15], rather than by the log-sum expected utility formula used in other logit contexts. Note that this formula includes the price change effect, as well as the preference-based effect of the change in q. Since the former is typically not available in SP models, our welfare measures include only the first term in equation [15]—the pure WTP measure. We separately examine price changes using the hedonic model.

Table 6 contains point estimates for our counterfactual analysis of discrete changes in the AOC, computed for each of the conjoint models using the utility function without interactions. We examine a discrete change in distance to the AOC that implies no house is closer than 4 miles to the site to proxy elimination of the disamenity. This provides a direct means of comparing welfare measures from the conjoint models to predicted price changes from the hedonic estimates.7 The top panel of the table shows household-level WTP estimates, broken out by different baseline distances to the AOC. In particular, we consider the value of our remediation proxy for representative households living at five different distances from the contaminated site under current conditions. For all four models the WTP decreases as the baseline distance from the contaminated site increases, as one would expect. The magnitude of the welfare effect, however, differs substantially across the models. Point estimates from the RP/SP-A and RP/SP-C models are more than twice as large as the comparable estimate from the SP model. These are economically significant differences, though in our case they are not statistically so. Table 6 also displays predicted price changes from the hedonic model, computed using estimates from the full hedonic data set. For homes located close to the AOC the price change estimate is similar to but larger than the predictions from the RP/SP models including the AOC moment. Note that these price change predictions do not say anything about the accuracy of the welfare measures arising from either the SP or the RP/SP models, since price change predictions are measures of WTP only in special cases. For quality increases, standard theory holds that the price change is an upper bound on the pure WTP for the improvement, but the degree of overestimation is not generally known. Given this, one interpretation of our findings is that the predicted price change is not an unreasonable approximation to the utility theoretic welfare measure.
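To make the hedonic side of this comparison concrete, the sketch below computes a predicted price change for the "no house closer than 4 miles" proxy. It assumes that price is linear in ln(AOC) with slope θ1 + θ2Ni, as implied by the gradient in [22], and that the hedonic price function itself does not shift. The coefficient values and function name are hypothetical; they are not the estimates behind Table 6.

```python
import math


def hedonic_price_change(aoc_miles, north, theta1, theta2, target_miles=4.0):
    """Predicted price change from 'moving' a house out to target_miles when
    price is linear in ln(AOC). Houses already at or beyond the target are
    unaffected. The log difference does not depend on the distance units."""
    if aoc_miles <= 0:
        raise ValueError("distance must be positive")
    if aoc_miles >= target_miles:
        return 0.0
    slope = theta1 + theta2 * north
    return slope * (math.log(target_miles) - math.log(aoc_miles))


# Hypothetical coefficients for illustration only (not the paper's estimates).
THETA1, THETA2 = 5000.0, -4000.0

baseline_miles = [0.25, 0.5, 1.0, 2.0, 4.0]
changes_south = [hedonic_price_change(d, 0, THETA1, THETA2) for d in baseline_miles]

for d, dp in zip(baseline_miles, changes_south):
    print(f"baseline {d:>4} miles: predicted price change {dp:,.0f}")
```

The predicted change is the integral of the marginal gradient between the baseline and the target distance, so it shrinks as the baseline distance grows and is zero at the target, mirroring the pattern across baseline distances described above.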

Table 6

Nonmarginal Welfare Estimates for Different Baseline Distances to AOC

Although it is not the emphasis of this paper, our conjoint models allow us to compare our estimate of remediation based on the distance proxy to a direct estimate based on the pollution status variables. The second panel of Table 6 shows point estimates from the SP model and our three RP/SP models. For our preferred model (RP/SP-C) we find a point estimate that, for homes close to the AOC, is nearly identical to the point estimate based on the distance proxy. This result supports the thought experiment commonly employed in the literature to approximate the benefits associated with removal of a disamenity by virtually “moving” the home further from the offending site.

VI. Discussion

What are the advantages and disadvantages of our proposal relative to other options for measuring discrete change welfare effects using property value models? Consider first the disadvantage, which is that it is relatively data intensive since an SP survey is needed along with a property value database. Recall, however, that any effort to measure household-level preferences requires household-level data, usually obtained via a survey. This includes the second-stage hedonic model. Thus the extra data collection cost lies in the inclusion of a conjoint exercise; the fixed costs of a survey need to be borne in any case. Our sense is that the type of conjoint experiment a researcher would need to include in the survey is comparatively simple, and outside of the particular amenity being examined, could be standardized, thereby substantially reducing development costs for individual studies. Designing the SP questions for the particular amenity of interest would, of course, be study specific. The need to coordinate this with the amenity data collected for the hedonic model might also be viewed as a disadvantage of our approach. However, scoping exercises to determine how and which environmental conditions affect behavior (regardless of whether one uses an SP, RP, or combined technique) are a critical component of any nonmarket valuation study and not unique to our approach, except perhaps in degree.

There are four advantages to our approach, beyond those associated with combining complementary data sources. First, econometric innovations (including quasi-experimental methods) and the rich home sales databases suggest marginal WTP estimates from the first-stage hedonic model are of high and increasing quality. Our approach allows analysts to leverage this progress by coupling it to conjoint models, which provide greater flexibility in the valuation measures that can be provided. Second, the environment in which the SP data is collected favors its use as we have proposed, in that our target population for the survey is people who recently purchased a home, and hence have experience with the commodity. Thus we combine the accurate baseline characterization from the hedonic model with SP data that is gathered under circumstances favorable for minimizing scenario rejection. Third, like maximum likelihood, our approach requires nothing more than numerical optimization to implement. Since starting values can be obtained from a conjoint maximum likelihood estimation routine, our sense is that the computational burden of our GMM model should be comparable to maximum likelihood and small relative to the potential benefits.

The final advantage to our approach is the flexibility it offers an analyst. We have developed our ideas in this paper under the premise that the baseline marginal WTP estimates from the hedonic model are preferred to their SP counterparts. Given this, we argued that it makes sense to calibrate the SP amenity value estimates toward the RP predictions using our moments-based estimator. In some circumstances, however, hedonic-based estimates of the implicit price of an amenity may inspire less confidence. For example, the environmental condition of interest may be poorly measured by an available distance proxy, or data constraints more generally might limit identification. Even in these cases, however, the first-stage hedonic model is likely to provide reliable estimates of market-based attributes such as home and lot size, and distance to employment centers. Since these are also attributes that could be included in the conjoint design, matching moments based on them could provide a consequential baseline around which the SP parameters could be calibrated. Although this may not directly affect the marginal utility of the amenity of interest in the SP specification, it would affect how it is valued via its impact on the marginal utility of income and the valuation of the other attributes. In some contexts estimates from such an approach might be preferred to their SP-only counterparts.

Given these advantages we believe this model is worthy of additional investigation, and we are pursuing several avenues. First, there are econometric considerations. While our demonstration here has used a logit assumption to develop the K SP moments, it is not necessary in a GMM context to impose so much structure. Relaxing the distributional assumptions underlying the conjoint model may deliver additional flexibility. Likewise, estimates of the marginal implicit price from the hedonic model should be obtained from a model that is as flexible in its functional form as the data will support. Consistent with the recommendations of Ekeland, Heckman, and Nesheim (2004), this suggests that we investigate nonparametric or semi-nonparametric approaches for modeling how the variables of interest affect price.

Other avenues for research are application based. For example, the costs of future applications of this idea would be much reduced if a transferable template for a housing choice conjoint experiment (absent the application-specific amenity) were carefully developed and made available. This would provide a finite number of housing attributes to include in a choice experiment, which could then be linked to typical hedonic price estimates. Application-specific survey development efforts could then be spent on identifying and defining the environmental attributes that are of interest, and the extent to which they can be linked in the RP and SP models. Also, though we have described this idea in the context of combining RP and SP data, one could also view it through the lens of solutions to the identification problem in second-stage hedonic models. Research in this area might involve comparisons between our GMM model, traditional second-stage hedonic estimation, and the newer class of sorting models that have been developed.

Acknowledgments

This study was supported in part by grant no. GL-96553601 from the U.S. Environmental Protection Agency/Great Lakes National Program Office, Cooperative States Research Education and Extension Service, U.S. Department of Agriculture/Illinois Agricultural Experiment Station projects MRF 470311 and ILLU-470-316, and Illinois-Indiana Sea Grant project AOM NA06 OAR4170079 R/CC-06-06. Any opinions, interpretations, conclusions, and recommendations are entirely the responsibility of the authors and do not necessarily reflect the views of the aforementioned sponsors and individuals. We thank Raymond Palmquist, Walter Thurman, Christopher Timmins, and seminar participants at University of Manchester, University of Stavanger, University of Heidelberg, the University of Kiel, and the World Congress of Environmental and Resource Economists 2010 for useful comments on earlier versions of this paper. All errors and omissions are, of course, the responsibility of the authors.

Appendix

As described by Cameron and Trivedi (2005, 172–76), the GMM approach to estimation begins with the specification of r moment conditions. Define an r × 1 vector for person i by g(Wi,β), where β is a q × 1 parameter vector and the specific form for g(·) comes from the model assumptions. In our case, for example, equation [13] suggests one element of g(·), and the other elements come from knowing the distribution of the error term in the conjoint model. Define the population moment condition by E[g(Wi,β)] = 0, and define the sample analog to this moment condition by

$$\frac{1}{I}\sum_{i=1}^{I} g(W_i,\beta) = 0 \qquad [A1]$$

If r = q, then the model is just identified in that (given regularity conditions) we can solve for the q unknowns in β using the r equations in [A1]. If r > q, then the model is overidentified, and [A1] has no unique solution. In this case a GMM estimator can be used, which is defined as the solution to

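$$\min_{\beta}\; \left[\frac{1}{I}\sum_{i=1}^{I} g(W_i,\beta)\right]' H_I \left[\frac{1}{I}\sum_{i=1}^{I} g(W_i,\beta)\right] \qquad [A2]$$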

where HI is an r ×r weighting matrix that is symmetric, positive definite, and not a function of β. Different estimators arise from different choices of HI.

We implement the two-step feasible GMM estimator, which uses a consistent estimate of the particular weighting matrix that leads to minimum variance. In the first step we use the r-dimensional identity matrix for HI in [A2] to obtain β̂, which is a consistent (though not efficient) estimate of β. We then compute

$$\hat{S} = \frac{1}{I}\sum_{i=1}^{I} g(W_i,\hat{\beta})\, g(W_i,\hat{\beta})' \qquad [A3]$$

In the second stage we obtain the two-step GMM estimates βGMM by solving

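$$\min_{\beta}\; \left[\frac{1}{I}\sum_{i=1}^{I} g(W_i,\beta)\right]' \hat{S}^{-1} \left[\frac{1}{I}\sum_{i=1}^{I} g(W_i,\beta)\right] \qquad [A4]$$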

The usual two-step GMM estimator βGMM is asymptotically normal with mean β, and estimated variance

Embedded Image[A5]

where

Embedded Image

and the weighting matrix estimate in [A5] is computed as in [A3], but with βGMM rather than the first-step estimate. In our application, however, some of the moments are themselves functions of estimated parameters, and so [A5] would need to be altered to reflect this. The method of Murphy and Topel (2002) can in principle be used to derive an analytical expression for the variance matrix. In this paper we have used an alternative bootstrap method in which we first bootstrap the hedonic model to obtain an empirical distribution of the parameters needed to construct the marginal implicit prices that enter the moment conditions. For each of these bootstrap marginal price outcomes we then bootstrap the conjoint data. The resulting empirical distribution of utility function parameters is used to compute standard errors for our combined RP/SP models.
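The two-step procedure can be illustrated with a deliberately simple stand-in. The sketch below uses linear moment conditions g(Wi,β) = zi(yi − xi′β), chosen because each step then has a closed form; the moments in our application are nonlinear (logit-based SP moments plus the RP moment) and require numerical optimization instead. The function name, simulated data, and instrument setup are assumptions for illustration, and the variance formula shown is the standard textbook expression for the efficient two-step estimator without the adjustment for estimated moments discussed above.

```python
import numpy as np


def two_step_gmm_linear(y, X, Z):
    """Two-step GMM for linear moments g_i(beta) = z_i * (y_i - x_i' beta),
    with r = Z.shape[1] moments and q = X.shape[1] parameters (r >= q)."""
    n = len(y)
    ZX = Z.T @ X / n  # r x q sample cross-moment
    Zy = Z.T @ y / n  # r-vector

    def solve(H):
        # Closed-form minimizer of gbar(beta)' H gbar(beta) for linear moments.
        return np.linalg.solve(ZX.T @ H @ ZX, ZX.T @ H @ Zy)

    def moment_cov(beta):
        # (1/n) * sum_i g_i g_i' evaluated at beta.
        g = Z * (y - X @ beta)[:, None]
        return g.T @ g / n

    # Step 1: identity weighting matrix gives a consistent, inefficient estimate.
    beta_step1 = solve(np.eye(Z.shape[1]))
    # Step 2: reweight with the inverse of the estimated moment covariance.
    beta_gmm = solve(np.linalg.inv(moment_cov(beta_step1)))
    # Textbook variance estimate: (1/n) * (G' S^{-1} G)^{-1}, with G = -Z'X/n.
    G = -ZX
    V = np.linalg.inv(G.T @ np.linalg.inv(moment_cov(beta_gmm)) @ G) / n
    return beta_step1, beta_gmm, V


# Simulated overidentified example (r = 3 moments, q = 1 parameter).
rng = np.random.default_rng(0)
n = 5000
Z = rng.normal(size=(n, 3))
v = rng.normal(size=n)
x = Z @ np.array([1.0, 0.5, -0.5]) + v
y = 2.0 * x + 0.5 * v + rng.normal(size=n)  # true coefficient is 2.0

beta_step1, beta_gmm, V = two_step_gmm_linear(y, x[:, None], Z)
se_gmm = np.sqrt(np.diag(V))
print(beta_step1, beta_gmm, se_gmm)
```

With nonlinear moments the two calls to `solve` would be replaced by a numerical minimizer started, as noted in the discussion, from maximum likelihood estimates of the conjoint model.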

In our analysis, we have used Stata to execute the hedonic models and predict implicit prices, and purpose-written Matlab code to execute the GMM estimation.

Table A1.

Parameter Estimates from Hedonic Model (Spatial Fixed Effects Not Reported)

Footnotes

  • The authors are, respectively, associate professor, Department of Agricultural and Applied Economics, University of Wisconsin, Madison; professor, Department of Agricultural and Resource Economics, North Carolina State University; and professor emeritus, Department of Agricultural and Consumer Economics, University of Illinois, Champaign–Urbana.

  • 1 Beyond the issue of insufficient variation, Bartik (1987) discusses an econometric threat to identification of the second-stage equation that follows from the simultaneous determination of the implicit price and quantity consumed in the household’s decision. Research examining these two types of identification problems has not yet produced a single, preferred approach for consistent estimation of πq(·) for a single market area. While Ekeland, Heckman, and Nesheim (2004) suggest identification in a single market is assured by the properties of the hedonic equilibrium, the practical ramifications of this have not yet been fully explored.

  • 2 There are many issues associated with how one conceptualizes the value of remediation for contaminated sites and for using a distance-based proxy to measure it. For example, McCluskey and Rausser (2003) and Kiel and Williams (2007) argue that the value of remediation is not necessarily equivalent to property value improvements, because the externality may have induced local demography changes that alter the affected population’s underlying demand for remediation. Furthermore, the change in property values following remediation can differ from an estimate based on the distance proxy for reasons such as stigma effects, as well as the fact that continuous distance may be an imperfect measure of exposure to the externality. Although these issues are important for how one uses estimates from this application for actual benefits analysis, they are not the emphasis of this paper. We therefore proceed under the assumption that obtaining a distance-based proxy measure from hedonic and/or conjoint models is a legitimate inference objective and evaluate how well our proposed model does in achieving this objective.

  • 3 The survey instruments were developed with assistance from the University of Illinois Survey Research Laboratory and in cooperation with the Great Lakes Program, University at Buffalo. Early versions were assessed by focus groups held at a public library branch in West Seneca, New York, in early 2005. Advanced versions were pretested in Spring 2005. For the final survey, respondents could either mail back a completed questionnaire or complete an equivalent instrument using the Zoomerang.com commercial survey web site. Approximately 9% of the responses were received online.

  • 4 We recognize that the coefficient estimate on the interaction term between distance to the AOC and the north indicator is not statistically significant, implying that the gradient for the north is not significantly different than the gradient for the south. We elected to keep this interaction for several reasons. First, we have good economic reasons to believe that the gradient in the north will be different based on the spatial features of this market described earlier. Second, the lack of significance of the interaction term is not surprising, given the difficulty we had in distinguishing the effect of proximity to the AOC from proximity to the CBD in the north. This latter point is shown by models in which the AOC is left out of the specification. In these cases, proximity to the CBD is a significant amenity in the north. When the AOC proximity variable is included, proximity to the CBD loses statistical significance.

  • 5 The coefficient estimate for distance to the AOC is 5,173 (p-value = 0.083) and the coefficient estimate for the interaction term between distance and the north dummy variable is -6,797 (p-value = 0.428).

  • 6 It is possible for the river to become a landscape amenity rather than a disamenity, once it is fully restored. In this case we would expect the marginal utility of ln(AOC) to be negative under the full restoration scenario, implying |β63| > β1. Our results suggest residents would not view proximity to the river as a positive attribute of their homes, even if it were fully restored.

  • 7 Computationally we do this by setting the new value for AOC to four for all survey respondents with AOCi ≤ 4 at the observed baseline. Conceptually, we “move” the otherwise identical house a nonmarginal distance to eliminate exposure to the disamenity, and this measures the net benefit of removing exposure under certain conditions (e.g., that the amenity affects a small portion of the overall housing market and that there are zero transactions costs). See Ihlanfeldt and Taylor (2004), Chattopadhyay, Braden, and Patunru (2005), and Kaufman and Cloutier (2006) for benefit estimates based on a similar approach, and Taylor (2003) or Palmquist (2005) for a discussion of benefits estimation with the first-stage hedonic price function.

References