Abstract
Sensitivity to the scope of public good provision is an important indication of validity for the contingent valuation method. An online survey was administered to an opt-in nonprobability sample panel to estimate the willingness to pay to protect hemlock trees from a destructive invasive species on federal land in North Carolina. We collected survey responses from 907 North Carolina residents. We find evidence that attribute nonattendance (ANA) is a factor when testing for sensitivity to scope. When estimating the model with stated ANA, the ecologically and socially important scope coefficients become positive and statistically significant, with economically significant marginal willingness-to-pay estimates. (JEL Q51)
1. Introduction
The contingent valuation method (CVM) is a stated preference approach to the valuation of nonmarket goods (Johnston et al. 2017). Since the National Oceanic and Atmospheric Administration Panel’s report on contingent valuation (Arrow et al. 1993), the scope test, for which stated preference willingness-to-pay (WTP) estimates are expected to increase with the scope of a policy, has been an important validity test and a significant source of controversy. Tests of sensitivity to scope fall into two categories. External scope tests rely on between-respondent variation either through split sample designs or by restricting estimation to just one valuation question per respondent.1 Internal scope tests allow within-respondent variation to inform the test by using multiple responses from each survey participant.2
Not every CVM study passes the scope test (Desvousges, Mathews, and Train 2012). One explanation may be attribute nonattendance (ANA), a deviation from neoclassical theory and an example of preference heterogeneity.3 ANA arises when survey respondents ignore one or more of the choice attributes for a variety of reasons (Alemu et al. 2013). Generally, ANA is handled empirically by restricting estimated attribute coefficients to zero for individuals who do not have fully compensatory preferences. Stated ANA models rely on survey respondent statements about which attributes they ignored. Inferred ANA models allow the empirical model to provide clues about ANA.
The survey we use to test for sensitivity to scope in the presence of ANA presents respondents with plans to manage an invasive species in public forests and builds on the analysis of Moore, Holmes, and Bell (2011). Respondents are asked if they would be willing to pay additional taxes to support a forest protection plan described using three attributes: treatment of ecologically important acreage, treatment of socially important acreage, and treatment method. We employ a split sample design to perform our scope tests on two different CVM response formats. One subsample receives a multiple-bounded dichotomous choice survey in which respondents can respond to up to three tax amounts with other attributes held fixed, with follow-up tax amounts dependent on the response to the previous amount. The other subsample receives a repeated dichotomous choice survey in which all attributes can change over three questions.
This study design allows us to test for both external and internal sensitivity to scope. Because the first question for both samples follows the same single binary choice format, those responses can be used to estimate a marginal WTP surface that relies on between-respondent variation only. A statistically significant and positive marginal WTP for a given attribute is an indication of external sensitivity to scope.4 Next, by examining the sequential valuation questions, we test for internal scope.5 We repeat both scope tests using stated ANA models and random parameters logit to examine the impact of nonattendance and preference heterogeneity on scope effects. We also extend this line of tests to include inferred ANA using latent class logit models.
To our knowledge this is the first paper to treat CVM binary choice sequential data as a discrete choice experiment (DCE) and test for the effects of stated and inferred ANA. We find evidence that ANA is a factor when testing for sensitivity to scope. When estimating the model with stated ANA, the ecologically and socially important scope coefficients become positive and statistically significant. We find several differences between the multiple-bounded and repeated sequential contingent valuation data samples that support the use of the repeated sequential design. Our results suggest that accounting for ANA is one factor that identifies a core group of CVM respondents who exhibit sensitivity to scope. These results could have implications for the interpretation of past CVM studies that find insensitivity to scope and the implementation of public policy.
2. Literature Review
CVM and DCEs
In this paper we address the CVM scope test with strategies adopted by DCEs, which have proliferated in the environmental economics literature (Mahieu et al. 2017). The scope test validity issue is not an empirical concern for DCE researchers in the same way that it is for contingent valuation researchers. In short, CVM studies tend to conduct external scope tests, and DCE studies tend to conduct internal scope tests. DCEs employ more complicated designs by increasing the number of choice alternatives, increasing the number of choice questions, or separating the description of the public good into a number of varying attributes (Carson and Louviere 2011). Each of these may contribute to the finding of sensitivity to scope in, to our knowledge, all published DCEs. In this paper we consider increasing the number of binary choice questions and separating the description of the public good into several varying attributes, as do Boyle et al. (2016). While several studies have compared internal and external scope tests in the contingent valuation literature (e.g., Giraud, Loomis, and Johnson 1999; Veisten et al. 2004), to our knowledge only one DCE article has addressed external scope. Lew and Wallmo (2011) find that most differences in WTP estimates are statistically significant in external scope test comparisons. Most DCE researchers have focused on internal scope tests. Sequential binary choice question formats have been used to examine ordering effects (Holmes and Boyle 2005; Day et al. 2012; Nguyen, Robinson, and Kaneko 2015) but have not been explicitly concerned with a comparison of internal and external scope tests.
The single binary choice, or referendum, form of the CVM presents survey respondents with a policy description and a randomly assigned policy cost (e.g., a tax). Respondents reveal their preferences by indicating whether they would be willing to pay the randomly assigned cost in the hypothetical setting. Carson and Groves (2007, 2011) and Carson, Groves, and List (2014) describe the situations in which a single binary choice should lead to a truthful revelation of preferences. If the survey respondent believes the survey to be consequential and the policy description is for a public good, then the best response to the single binary choice is truth-telling (Zawojska and Czajkowski 2017). A consequential survey is one in which respondents care about the outcome and believe their answers are important to the policy process, as in an advisory referendum.
Our paper is similar to those of Christie and Azevedo (2009), Siikamäki and Larson (2015), and Petrolia, Interis, and Hwang (2014, 2018). All four compare valuation questions from two separate samples of respondents. In Christie and Azevedo’s (2009) paper, one sample received a survey with three CVM questions, while the other sample was presented with eight DCE questions. The authors find scope effects with the CVM and DCE questions, but WTP differs across question format.6 Similarly, Siikamäki and Larson (2015) asked two sets of “two and one-half bounded” binary choice valuation questions for an improvement in water quality. Their conditional logit models fail the scope test, but a mixed logit model that accounts for preference heterogeneity finds statistically and economically significant scope effects.7
Petrolia, Interis, and Hwang (2014) compare single binary choice and single multinomial choice valuation questions in split samples of survey respondents. The authors find scope effects for all three attributes in the multinomial choice version of the survey.8 Finally, Petrolia, Interis, and Hwang (2018) compare single multinomial choice and sequential multinomial choice question versions with and without consideration of ANA. They find scope effects in both survey versions and similarity in the magnitudes of scope effects across versions. In addition, Petrolia, Interis, and Hwang (2018) make a comparison of external and internal scope tests but with multinomial choice data.
Attribute Nonattendance
A standard assumption made when analyzing stated preference survey data is that respondents have unlimited substitutability (fully compensatory preferences) between attributes (Scarpa et al. 2009). Recent research, however, suggests that respondents often employ simplified heuristics such as threshold rules, attribute aggregation, or ANA when making choices (Swait 2001; Caparros, Oviedo, and Campos 2008; Greene and Hensher 2008; Hensher 2006, 2008; Puckett and Hensher 2008). ANA occurs when respondents fail to consider a particular attribute from a stated preference discrete choice (Alemu et al. 2013). Such discontinuous preference orderings violate the “continuity axiom” and thus pose problems for neoclassical analysis (Gowdy and Mayumi 2001; McIntosh and Ryan 2002; Rosenberger et al. 2003; Lancsar and Louviere 2006; Scarpa et al. 2009). In this context, ANA models can be thought of as an extension to other methods that account for preference heterogeneity, such as empirical methods including latent class and random parameters models.
Economic literature on ANA began appearing in the early 2000s (McIntosh and Ryan 2002; Sælensminde 2002; Lancsar and Louviere 2006; Hensher, Rose, and Greene 2005; Hensher 2006; Rose, Hensher, and Greene 2005; Campbell, Hutchinson, and Scarpa 2008; Scarpa et al. 2009). In our study, the use of a relatively inexpensive opt-in sample may raise the potential for ANA (i.e., insensitivity to scope for two of our three attributes), as respondents may pay less attention to survey details (Baker et al. 2010; Johnston et al. 2017). ANA tends to cause statistical insignificance of attribute coefficients or bias them toward zero. Perhaps most importantly, this body of literature has found that WTP and willingness-to-accept (WTA) estimates are lower when ANA is accounted for (McIntosh and Ryan 2002; Hensher, Rose, and Greene 2005; Hole 2011; Rose, Hensher, and Greene 2005; Campbell, Hutchinson, and Scarpa 2008; Scarpa et al. 2009; Scarpa et al. 2011; Scarpa et al. 2012; Hensher, Rose, and Greene 2015; Puckett and Hensher 2008; Koetse 2017).9 These inaccurate estimates could subsequently influence policy through benefit transfer (Glenk et al. 2015).
Several empirical strategies have arisen to account for ANA (Scarpa et al. 2009). There are two primary methods for identifying ANA: stated and inferred. In the former, a survey explicitly asks respondents to indicate the degree of attention they paid to each attribute that described the choice alternative. The researcher then uses these data to group respondents into an attendance class (Hensher, Rose, and Greene 2005; Carlsson, Kataria, and Lampi 2010; Hensher, Rose, and Greene 2012; Scarpa et al. 2012; Alemu et al. 2013; Kragt 2013). ANA may be inferred by use of latent class methods to separate respondents into different groups based on their preference orderings (Scarpa et al. 2009; Hensher, Rose, and Greene 2012; Scarpa et al. 2012; Kragt 2013; Glenk et al. 2015; Koetse 2017). Inferred models are important for two main reasons. First, not all surveys explicitly ask attribute attendance questions. Thus, analysis of ANA implications using past data is possible with inferred models. Second, there is evidence that respondents may not respond accurately to stated attribute attendance questions (Armitage and Conner 2001; Ajzen, Brown, and Carvajal 2004; Hensher, Rose, and Greene 2012; Scarpa et al. 2011; Kragt 2013; Carlsson, Kataria, and Lampi 2010; Hess and Hensher 2010; Scarpa et al. 2012; Alemu et al. 2013). Regardless of the empirical method chosen, estimated attribute coefficients are constrained to zero for cases of nonattendance (Hensher, Rose, and Greene 2012).
Our work also builds on the body of literature that has investigated preference heterogeneity of responses from choice survey data (Provencher, Baerenklau, and Bishop 2002; Boxall and Adamowicz 2002; Scarpa and Thiene 2005; Hensher and Greene 2003). Specifically, this paper is most closely related to those of Kragt (2013) and Koetse (2017) in that we explore whether stated or inferred ANA methods with respect to the scope attributes can lead to statistically significant scope effects. Like Koetse (2017), we focus on a particular subset of ANA, and, following Kragt (2013), we explore several different empirical specifications. In addition, our analysis is similar to that of Thiene, Scarpa, and Louviere (2015) in that we compare various numbers of potential classes.
3. Survey and Data
Our application is to the control of an invasive species, hemlock woolly adelgid (HWA), in public forests in North Carolina. We developed the binary choice question format with randomly assigned attributes for the SurveyMonkey online survey platform and pretested the survey with 62 respondents. In order to collect a large sample of data at relatively low cost, we conducted an internet survey with a nonprobability panel of respondents. So-called opt-in panels are becoming popular in social science research, but their ability to adequately represent sample populations and obtain high-quality data is still unresolved (Hays, Liu, and Kapteyn 2015). Yeager et al. (2011) find that nonprobability internet samples are less accurate than more representative probability samples for socioeconomic variables. Lindhjem and Navrud (2011) review the stated preference literature and find that internet panel data quality is no lower than that of more traditional survey modes, and that internet panel WTP estimates are lower.
Our Southern Appalachian Forest Management Survey was administered in September 2017. More than 8,400 individuals were invited to take the online survey, and roughly 13% opted to be panelists. About 83% of those panelists completed the survey, for a total of 974 respondents. We use a sample of 907 respondents who answered each of the choice questions. The survey questions fall into three categories. First, we asked preliminary questions about the respondents’ prior knowledge of HWA and recreational experiences in Pisgah National Forest, Nantahala National Forest, and Great Smoky Mountains National Park (see Appendix Figure A1). Second, we asked a series of either two or three referendum questions (called “situations” in the survey and in the rest of this paper), depending on the respondent’s survey treatment. Finally, we asked debriefing questions about consequentiality, ANA, and individual-specific characteristics.
Respondents were then led through a series of educational materials and instructions. First, they were given descriptions of ecologically important and socially important areas of hemlock-dominated forest and asked about the importance of each. Ecologically important areas contribute to natural diversity, provide habitat for rare plants and animals, and tend to be in remote areas of the forest. Socially important areas are used by visitors for recreation and are near parking areas or accessible by trail. Then, respondents were given descriptions of biological and chemical treatment methods and asked whether they agreed with their use. Finally, respondents were told that each choice situation would be described by treatment options and costs, and that the survey would include multiple situations.
Figure 1 shows two important features of our survey. First, it shows how the four attributes (ecologically important acreage, socially important acreage, treatment method, and annual cost over the next three years) describing the referendum policy varied in the first situation. There were four levels of acreage for each type of hemlock-dominated forest: 2,500, 5,000, 7,500, and 10,000. There were also four cost amounts in the first situation: $50, $100, $150, and $200. Second, Figure 1 shows the wording of the referendum questions and the answer choices for all situations. Respondents were asked how they would vote for the referendum and given three choices: “For,” “Against,” or “Don’t know.”10 In the remainder of the paper we combine the “Against” and “Don’t know” votes and treat the responses to choice situations as binary.11
Figure 1. Binary Choice Question
After the first choice situation respondents were randomly assigned to either a bounded or repeated referendum treatment.12 In the repeated treatment, all four attributes randomly varied in each situation.13 Conversely, in the bounded treatment only the cost attribute varied; the other three attributes remained constant throughout the survey. For example, respondents in the bounded treatment who answered affirmatively in the first situation were then asked in the second situation if they would vote for the same acreage to be treated using the same method if the cost was $250. Fifty-five percent of those respondents voted for the referendum at the higher cost. Alternatively, respondents who did not answer affirmatively in the first situation were then asked how they would vote if the cost was $25. Fifty percent of those respondents voted for the referendum at the lower cost.
Table 1 reports the referendum responses. In the bounded sample the percentage of “For” votes falls from 66% to 45% as the cost amount increases from $50 to $200.14 In the repeated sample the percentage of “For” votes falls from 60% to 36% as the cost amount rises. Table 1 also shows that in the repeated treatment the cost attribute’s range expanded to include the new costs in the bounded treatment.15 In the second choice situation of the repeated treatment, the “For” votes fell from 62% to 41% as the cost amount increased from $25 to $250. Finally, Table 1 shows some evidence of nonmonotonicity and fat tails, which can cause the range of WTP estimates measured with different approaches to be wide and the standard errors to be large.
Table 1. Binary Choice Sequence Referendum Votes
The third choice situation of the bounded sample was designed like the second situation. Respondents who voted for the policy at $250 were asked if they would vote for the referendum at $300. Seventy percent of those respondents voted for the referendum at the higher cost. Conversely, respondents who voted against the policy (or voted “Don’t know”) at $25 were asked how they would vote if the cost was $5. Thirty-one percent of those respondents voted for the referendum at the lowest cost. In the third situation of the repeated sample, the “For” votes decreased from 64% to 32% as the cost amount increased from $5 to $300.
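The follow-up bid structure of the bounded treatment described above can be sketched as a simple branching rule. The dollar amounts ($250 or $25 after the first situation, then $300 or $5) come from the text; the function itself is a hypothetical reconstruction for illustration, and paths the paper does not describe return `None`.

```python
# Hypothetical sketch of the bounded-treatment bid ladder described in the text.
# First-situation bids were randomly assigned from {50, 100, 150, 200}.

FIRST_BIDS = [50, 100, 150, 200]

def next_bid(prev_bid, voted_for):
    """Return the follow-up tax amount, or None for paths the survey text
    does not describe."""
    if prev_bid in FIRST_BIDS:            # situation 1 -> situation 2
        return 250 if voted_for else 25
    if prev_bid == 250 and voted_for:     # "For" at $250 -> asked $300
        return 300
    if prev_bid == 25 and not voted_for:  # non-"For" at $25 -> asked $5
        return 5
    return None
```

For example, a respondent who votes “For” at $150 is next asked about $250, and if they vote “For” again, about $300.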
Survey respondents were also asked how much attention they paid to each of the four attributes and given four choices: “A lot,” “Some,” “Not much,” and “None.” Respondents who chose “None” and “Not much” are classified as not attending to the attribute. Overall, about 13% of respondents did not attend to the “size of the ecologically important area treated” (see Appendix Table A1). Nearly 25% of respondents did not attend to the “size of the socially important area treated.” The “treatment method (chemical or biological)” was not attended to by about 15% of respondents. Almost 17% of respondents did not attend to the “cost over the next 3 years.”
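The attendance coding rule above (answers of “None” or “Not much” classified as nonattendance) can be sketched as follows; the attribute labels are illustrative abbreviations, not the survey’s exact wording.

```python
# Minimal sketch of the stated-ANA coding rule described in the text:
# respondents answering "None" or "Not much" are classified as not
# attending to that attribute.

NONATTENDING = {"None", "Not much"}

def attends(answer):
    """True if the Likert answer counts as attending to the attribute."""
    return answer not in NONATTENDING

# Illustrative respondent answers (attribute names are our abbreviations).
responses = {"eco_acres": "A lot", "social_acres": "Not much",
             "method": "Some", "cost": "None"}
attendance = {attr: attends(ans) for attr, ans in responses.items()}
# This respondent would be classified as ignoring social acreage and cost.
```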
4. Empirical Models
We employ several increasingly general models to analyze our data and incorporate ANA for our hypothesis tests. Our first test of external scope, for which only the first response is used in estimation, can be conducted with a standard discrete choice model. If the estimated coefficients on protection of ecological and social areas are positive and statistically significant, then our data exhibit external sensitivity to scope. Estimating the model based on repeated choice data provides the means to test for internal scope effects. Estimation of the coefficients in both cases is based on a linear utility function:
[1] U_nsj = V_nsj + ε_nsj = β′a_nsj + ε_nsj
The observable portion of individual n’s utility from choosing alternative j in situation s is a linear function of the attribute vector a_nsj and coefficient vector β. Total utility is the sum of observable utility (V_nsj) and an additive component that is unobservable to the researcher (ε_nsj).
A respondent will choose the alternative that yields the highest utility, but the choice may also depend on characteristics of that individual, such as income and education, which are placed in a vector x_n. Using the parameter vectors α and γ to capture these additional effects, and assuming the unobservable portion of utility ε_nsj is distributed type 1 extreme value (Gumbel), the probability that individual n will choose alternative j in situation s is
[2] P_nsj = exp(α_j + β′a_nsj + γ_j′x_n) / Σ_{i∈J} exp(α_i + β′a_nsi + γ_i′x_n)
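The choice probability in equation [2] takes the familiar conditional logit (softmax) form. As a minimal illustration, the following Python sketch computes the probability of a “For” vote for one hypothetical referendum; the coefficient values and attribute levels are invented for illustration and are not estimates from this study.

```python
import math

def logit_probs(V):
    """Type 1 extreme value errors imply softmax choice probabilities."""
    m = max(V)                              # subtract max for numerical stability
    expV = [math.exp(v - m) for v in V]
    total = sum(expV)
    return [e / total for e in expV]

# Illustrative coefficients: cost ($), eco acres (000s), social acres (000s).
beta = [-0.005, 0.02, 0.01]
a_for = [150.0, 7.5, 5.0]    # vote "For": pay $150, treat 7,500 / 5,000 acres
a_against = [0.0, 0.0, 0.0]  # status quo normalized to zero

V = [sum(b * x for b, x in zip(beta, a)) for a in (a_for, a_against)]
p_for, p_against = logit_probs(V)
```

With these invented numbers the observable utility of voting “For” is negative (the cost term dominates), so the predicted probability of a “For” vote is below one-half.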
Our first departure from the standard model in equation [2] is the stated ANA model, which relies on debriefing questions about attribute attendance to place respondents into classes. Each class is defined by a set of parameter restrictions, setting coefficients equal to zero when a respondent indicated that they did not attend to the corresponding attributes (Hensher, Rose, and Greene 2012; Kragt 2013; Koetse 2017). Generalizing our standard model to allow for classes of nonattending respondents yields the choice probabilities given by
[3] P_nsj|c = exp(α_j + β_c′a_nsj + γ_j′x_n) / Σ_{i∈J} exp(α_i + β_c′a_nsi + γ_i′x_n),

where the β vector is now indexed by class c, indicating which elements of β are restricted to zero.
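The class-specific restriction in equation [3] amounts to zeroing out the relevant coefficients before utilities are computed. A minimal sketch, with illustrative attribute names and coefficient values:

```python
def restrict_beta(beta, attended):
    """Zero out coefficients on attributes the class does not attend to,
    leaving attended coefficients unchanged (the equation [3] restriction)."""
    return {k: (b if attended[k] else 0.0) for k, b in beta.items()}

# Illustrative shared coefficients.
beta = {"cost": -0.005, "eco": 0.02, "social": 0.01}

# A class that ignores both scope attributes (scope ANA).
scope_ana = {"cost": True, "eco": False, "social": False}
beta_c = restrict_beta(beta, scope_ana)
# beta_c keeps the cost coefficient but sets both scope coefficients to zero.
```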
For k attributes describing a discrete alternative, there are a total of 2^k possible attribute (non)attendance classes (Hensher, Rose, and Greene 2012; Thiene, Scarpa, and Louviere 2015; Glenk et al. 2015). Thus, for our empirical setting with four attributes (cost over the next three years, ecologically important acreage, socially important acreage, and treatment method), there are 16 possible classes of attribute (non)attendance to which individual n may belong. We abstract away from the full set of possibilities and largely focus on two classes: total attendance, and nonattendance to at least one attribute. Such an assumption is not unconventional given our empirical focus. In fact, Koetse (2017) investigates how accounting for nonattendance to the cost attribute both corrects hypothetical bias and decreases the WTA-WTP disparity. Related to this issue is the relationship between ANA and consequentiality. Koetse (2017) argues that consequentiality, or rather the lack thereof, is the most important reason for respondents ignoring the cost attribute. While Scarpa et al. (2009) discuss how cost nonattendance is often correlated with nonattendance to other attributes, Thiene, Scarpa, and Louviere (2015) focus on nonattendance to a single attribute due to data limitations. Hensher, Rose, and Greene (2015) also agree that investigating all 2^k possible classes is not ideal.
We further generalize the stated ANA model by estimating a random parameters logit model, which allows coefficients that are not restricted to zero to vary over respondents (Train 2003). In the random parameters logit, the β vector is distributed multivariate normal so that β_nk = β_k + σ_k·ν_nk if attribute a_k is attended to by respondent n, and β_nk = 0 otherwise. Here β_k represents the mean of the distribution of marginal utilities across the sample population, σ_k represents the spread of preferences around the mean, and ν_nk is the random draw taken from the assumed distribution (Hensher, Rose, and Greene 2015).
Concerns about the accuracy of this self-reported information have led to the use of latent class models to separate individuals into classes and motivate our inferred ANA model (Scarpa et al. 2009; Scarpa et al. 2012; Hensher, Rose, and Greene 2012; Kragt 2013; Glenk et al. 2015; Thiene, Scarpa, and Louviere 2015). In this framework class membership is unknown to the analyst and is instead treated probabilistically. Estimation requires specifying the ANA class probabilities, π_nc, which are the probabilities that individual n belongs to class c (Hensher, Rose, and Greene 2015). These probabilities can be specified by the logit formula and estimated as a function of the choice-invariant characteristics in x_n:
[4] π_nc = exp(θ_c′x_n) / Σ_{l=1,…,C} exp(θ_l′x_n)
where θ_c is a vector of estimated parameters and C is the number of classes specified by the analyst. The class membership probabilities can be combined with the conditional probability of equation [3] to express the unconditional probability of individual n’s response via the law of total probability:
[5] P_n = Σ_{c=1,…,C} π_nc Π_s P_nsj|c
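Equations [4] and [5] combine as a finite mixture: logit class shares weight the class-conditional probabilities of the observed response sequence. A small Python sketch with invented parameter values:

```python
import math

def class_shares(theta, x):
    """pi_nc = exp(theta_c'x_n) / sum_l exp(theta_l'x_n), as in equation [4]."""
    scores = [sum(t * xi for t, xi in zip(tc, x)) for tc in theta]
    m = max(scores)                       # stabilize the exponentials
    expS = [math.exp(s - m) for s in scores]
    total = sum(expS)
    return [e / total for e in expS]

def unconditional_prob(pi, p_seq_given_class):
    """P_n = sum_c pi_nc * prod_s P_ns|c, the law of total probability
    in equation [5]."""
    return sum(p * math.prod(ps) for p, ps in zip(pi, p_seq_given_class))

# Illustrative two-class example: class 1's theta normalized to zero.
theta = [[0.0, 0.0], [0.5, -0.2]]
x = [1.0, 2.0]                            # constant and one characteristic
pi = class_shares(theta, x)

# Invented per-situation choice probabilities for each class (3 situations).
p_seq = [[0.6, 0.5, 0.4], [0.3, 0.3, 0.2]]
P_n = unconditional_prob(pi, p_seq)
```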
5. Results
We estimate the probability of voting for the hemlock woolly adelgid treatment policy as a function of its attributes using conditional logit, latent class, and random parameters logit models.16 The coefficients from the conditional logits are shown in Table 2.17 For the repeated sample, the estimated coefficient on cost is negative and statistically significant, as one would expect. The bounded sample, however, exhibits some strange behavior on the cost attribute. When considering only the first choice situation, the coefficient for the cost attribute has the correct sign, but its significance disappears when both the first and second choice situations are included. The statistical significance reappears when all three choice situations are included in the model. Most strikingly, Table 2 shows no robust evidence of scope effects. Thus, our simple models show limited evidence of the repeated sample passing the internal scope test. Furthermore, our simple models fail the external scope test. In other words, the coefficients on the ecologically important and socially important acreage amounts are not statistically different from zero.
Table 2. Conditional Logit Models
Table 3 reports estimated coefficients from the random parameters logit model. It shows that, as in Siikamäki and Larson (2015), the evidence for scope effects becomes stronger when we account for preference heterogeneity. In addition, evidence of scope effects for ecologically important acreage in the repeated sample appears when the first two choice situations are included in the model. We also estimated latent class models that show evidence of scope effects in the repeated sample.
Table 3. Random Parameters Logit Models
For example, in a two-class non-equality-constrained latent class model, we observe positive and statistically significant scope effects in the dominant class, class 1 (see Appendix Table A2).18 The coefficient for ecologically important acreage, however, is not distinguishable from zero unless all three choice situations are included in the model. Like the random parameters logit models, the two-class non-equality-constrained latent class model yields a smaller Akaike information criterion (AIC) than the conditional logit model. Thus, we have evidence that accounting for preference heterogeneity improves the fit of the model and that the data exhibit internal scope effects.
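The model comparisons above rely on the usual formula AIC = 2k − 2·ln L, where k is the number of estimated parameters and lower values indicate better fit. A minimal sketch with invented log-likelihoods (not estimates from this study):

```python
def aic(log_lik, n_params):
    """Akaike information criterion: AIC = 2k - 2*lnL; lower is better."""
    return 2 * n_params - 2 * log_lik

# Invented log-likelihoods and parameter counts for illustration only.
models = {
    "conditional logit": aic(-1500.0, 5),
    "two-class latent class": aic(-1450.0, 11),
}
best = min(models, key=models.get)
# best -> "two-class latent class" (AIC 2922 vs. 3010): the better
# log-likelihood outweighs the penalty for the extra parameters.
```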
We next investigate several ways of allowing for ANA. First, we compare the effects of accounting for stated ANA on the conditional logit and random parameters logit models. Second, we examine a number of inferred ANA models.
There are 2^k possible attribute (non)attendance classes in an equality-constrained latent class model framework. In our application k is equal to 4 and refers to the four attributes that describe the referenda (cost, ecologically important acreage, socially important acreage, and treatment). Appendix Table A3 illustrates how under the full 2^k (16-class) equality-constrained latent class model we estimate only five different coefficients (cost over the next three years, ecologically important acreage, socially important acreage, biological HWA treatment, and chemical HWA treatment). The estimated coefficient on attribute i (β_i) is the same for all classes that attend to the attribute.
If the attribute is not attended to, then the coefficient is restricted to be equal to zero. The same conditions apply to models with fewer classes. Appendix Table A3 also shows that 59% of respondents exhibit total attribute attendance, consistent with the continuity axiom of neoclassical theory.19 About 10% of the respondents ignored only the social scope of the policy, and about 4% ignored treatment. Almost 15% of respondents ignored a combination of attributes. Unlike in many studies (e.g., Kragt 2013), socially important acreage, not cost, was the most commonly ignored attribute. Similar to Thiene, Scarpa, and Louviere (2015) and Koetse (2017), we do not attempt to investigate all 2^k classes.20 We limit our analysis and consider three specific cases of stated ANA: scope nonattendance, cost nonattendance, and cost and scope nonattendance.
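The equality-constrained structure described above can be sketched by enumerating the 2^k attendance patterns and sharing a single coefficient per attribute across all attending classes; the attribute names and coefficient values below are illustrative.

```python
from itertools import product

# With k = 4 attributes there are 2^4 = 16 attendance patterns (classes),
# but only one coefficient per attribute is estimated; nonattending classes
# simply have that coefficient restricted to zero.
ATTRS = ["cost", "eco", "social", "method"]
patterns = list(product([True, False], repeat=len(ATTRS)))  # 16 classes

def class_beta(beta, pattern):
    """Shared coefficients, restricted to zero where the class does not
    attend (equality-constrained latent class structure)."""
    return {a: (beta[a] if attends else 0.0)
            for a, attends in zip(ATTRS, pattern)}

beta = {"cost": -0.005, "eco": 0.02, "social": 0.01, "method": 0.3}
full_attendance = class_beta(beta, (True, True, True, True))
cost_ignored = class_beta(beta, (False, True, True, True))
```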
We defined stated ANA using the respondent’s answer to a Likert-scale question. Specifically, the survey asked, “When you were making your decisions about the different alternatives, how much influence did each of the following have on your voting decision?” Respondents were able to answer “A lot,” “Some,” “Not much,” or “None,” or they could decline to answer. We report estimated coefficients for models where ANA has been defined as either “None” or “Not much” influence (see Appendix Table A1). Later we discuss the sensitivity of our results to this definition.21
Table 4 reports the conditional logit model that accounts for stated nonattendance to the scope attributes (ecologically and socially important acreage). The model is conceptually similar to an equality-constrained latent class model in that there are four (2^2) possible latent classes. Specifically, the four classes are (1) total attendance (class 16 in Appendix Table A3), (2) nonattendance to ecologically important acreage (class 2 in Appendix Table A3), (3) nonattendance to socially important acreage (class 3 in Appendix Table A3), and (4) nonattendance to both ecologically and socially important acreage (class 8 in Appendix Table A3).
Table 4. Conditional Logit Models Allowing for Stated Attribute Nonattendance on the Scope Attributes
Unlike our previous results, Tables 4 and 5 show that we find robust external and internal scope effects for both ecologically and socially important acreage in both the bounded and repeated samples. This is true for both the conditional logit (Table 4) and random parameters logit (Table 5) specifications.22 We find qualitatively similar results (robust scope effects in both bounded and repeated samples) when we account for both cost and scope stated ANA. However, as previously discussed, the estimated coefficient on the cost attribute is not always statistically significant in the bounded sample. In addition, Table 4 shows that we estimate a positive and statistically significant coefficient on cost using the bounded sample and all choice situations.23 When we account for stated ANA on the cost attribute alone we find no evidence of scope effects in the bounded sample, and nonrobust evidence of scope effects for ecologically important acreage in the repeated sample. Based on the AIC the best fit is provided by the random parameters logit model that accounts for ANA to the scope attributes only.
Our results do not change qualitatively as we change our measurement of stated ANA. We do observe less peculiar behavior in the estimated coefficient on the cost attribute for the bounded data as we broaden our definition from “none” to “none,” “not much,” and “some.” The most substantial difference in the results occurs when we extend our definition of stated ANA to include “some” influence. We find that the statistical significance of socially important acreage begins to decrease.
Table 5. Random Parameters Logit Models Allowing for Stated Attribute Nonattendance on the Scope Attributes
Because there is evidence in the literature that respondents may not answer ANA questions accurately, we also estimate equality-constrained latent class models (Armitage and Conner 2001; Ajzen, Brown, and Carvajal 2004; Carlsson, Kataria, and Lampi 2010; Hess and Hensher 2010; Hensher, Rose, and Greene 2012; Scarpa et al. 2011; Scarpa et al. 2012; Alemu et al. 2013; Kragt 2013). We examine whether these inferred ANA models show statistically significant scope effects. We consider several latent class specifications, all referencing Appendix Table A3: 16 classes (1–16), 9 classes (1–4, 8, and 14–16), 5 classes (3, 4, 8, 15, and 16), 4 classes (3, 8, 15, and 16), 3 classes (3, 8, and 16), and 2 classes (8 and 16, and 1 and 16). We find that when only the first choice situation is included in the model, the AIC generally decreases as we reduce the number of classes for both the bounded and repeated samples, suggesting that fewer classes fit better. These results are noisy when the first and second situations are included, but the AICs generally still decrease. Interestingly, the AIC seems to increase as models with fewer classes are estimated when all three choice situations are included. These findings are further confounded by the fact that we often estimate singular variance matrices.24
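The structure of an equality-constrained latent class likelihood can be sketched as follows. This is a simplified binary-logit version with illustrative class masks, not the specifications estimated here: every class shares one coefficient vector, and each class is defined only by which attributes it restricts to zero.

```python
import numpy as np

def ec_latent_class_loglik(beta, pi, X, y, masks):
    """Equality-constrained latent class log-likelihood (binary logit sketch).

    All classes share one coefficient vector `beta`; class c instead
    restricts the attributes flagged 0 in masks[c] to have no effect,
    mimicking nonattendance. pi holds the class shares (summing to one).
    X and y are lists over respondents: X[i] is (T, K) attribute levels
    for the program alternative and y[i] is (T,) with 1 for a 'for' vote.
    """
    ll = 0.0
    for Xi, yi in zip(X, y):
        class_lik = 0.0
        for share, mask in zip(pi, masks):
            v = Xi @ (beta * mask)               # restricted utility
            p = 1.0 / (1.0 + np.exp(-v))         # P(vote for) per situation
            class_lik += share * np.prod(np.where(yi == 1, p, 1.0 - p))
        ll += np.log(class_lik)
    return ll
```

Dropping a class amounts to deleting one mask from `masks`, which is how the 16-, 9-, 5-, 4-, 3-, and 2-class specifications above differ from one another.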
We report the inferred scope nonattendance and inferred cost nonattendance models in Appendix Table A4. The estimated coefficients once again illustrate that the lack of scope effects is robust.25 The first specification in Appendix Table A4 assumes one class does not attend to both ecologically and socially important acreage (scope ANA). The second specification, cost ANA, follows Koetse (2017): we restrict the estimated coefficient on the cost attribute to zero in the nonattending class. Unlike Kragt (2013) and Koetse (2017), our inferred models do not exhibit scope effects.
Table 6 reports the marginal WTP estimates calculated from coefficients presented in the previous tables. It once again illustrates our two main points. First, there is a lack of scope effects in models that do not account for ANA. Second, accounting for preference heterogeneity using either stated ANA or random parameters logit models can reveal scope effects. In general, when using stated ANA in conditional logit models we find that respondents are willing to pay between about $9 and $16 per acre. Our negative WTP estimates for treatment of ecologically and socially important acreage are the result of peculiarities in the bounded data. Specifically, the positive coefficient estimated on cost, shown in Table 4, causes this puzzling result. Random parameters logit models, however, suggest a lower range of marginal WTP of $5 to $14 per acre.
Table 6. Marginal Willingness to Pay (in Dollars)
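The marginal WTP figures in Table 6 are the usual negative ratio of an attribute coefficient to the cost coefficient. A minimal sketch, using hypothetical coefficient values rather than the estimates reported in Table 6:

```python
def marginal_wtp(beta_attr, beta_cost):
    """Marginal WTP for one unit (acre) of an attribute: -beta_attr / beta_cost."""
    return -beta_attr / beta_cost

# Hypothetical coefficients: a positive scope effect and the usual
# negative cost effect yield a positive WTP per acre.
wtp_usual = marginal_wtp(0.002, -0.0002)

# A positive cost coefficient, as estimated in the bounded sample in
# Table 4, flips the sign and produces the puzzling negative WTP
# discussed in the text.
wtp_flipped = marginal_wtp(0.002, 0.0002)
```

The sign flip in the second call is purely mechanical, which is why the negative WTP estimates for the bounded data trace back to the peculiar positive cost coefficient rather than to respondents disliking treatment.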
6. Conclusions
We have shown that, like many other contingent valuation studies, the 2017 Southern Appalachian Forest Management Survey data generally fail internal and external scope tests when naive models that assume fully compensatory preferences are used. This result is robust to the number of choice situations in the survey and to empirical specification. We employ several methods to account for preference heterogeneity: (1) stated ANA, (2) inferred ANA, (3) equality-constrained latent class models, (4) random parameters models, and (5) combinations of the four. Accounting for stated ANA allows our models to pass both internal and external scope tests, shedding light on a major criticism of contingent valuation. As noted by other researchers, such a result could have large policy implications (Scarpa et al. 2009; Kragt 2013; Glenk et al. 2015; Koetse 2017).
We compare bounded and repeated valuation questions and employ a number of different specifications to search for scope effects in single and sequential binary choice situations. While some models, such as random parameters logit and two-class non-equality-constrained latent class models, find scope effects for ecologically important acreage in the repeated data when we include all three choice situations, such results are not robust to specification. Robust scope effects appear only when we account for stated ANA on the scope attributes. The preferred models are stated ANA random parameters logit models using the repeated data. The bounded sample models typically fit worse than the repeated-data models for the same specifications, except when all three choice situations are included, and they provide less robust evidence of scope effects. As we change our definition of ANA, the qualitative results do not change substantially, although we begin to see less evidence of scope effects for socially important acreage. Our results conflict with those of Kragt (2013) and Koetse (2017): while they favor inferred models, our analysis favors stated models. A striking result from our investigation of the inferred models is that we find no evidence of scope effects in the bounded survey treatment sample, regardless of the number of latent classes we assume. If researchers cannot rely on inferred ANA via latent class models, then economic survey design should routinely include questions that capture stated ANA.
Discovery of a factor that provides a rationale for insensitivity to scope, a longstanding criticism of contingent valuation, warrants further investigation. In particular, researchers and policy makers need a thorough understanding of the effects of ANA and the implications of discontinuous preference orderings among survey respondents. Our results provide evidence of a need, and an opportunity, to revisit old data that exhibit insensitivity to scope using new techniques; it is possible that ANA methods could lead to some of these older studies passing scope tests. Our results also highlight the need for research on the determinants of ANA and ways to mitigate it. Our analysis suggests that accounting for stated ANA on scope attributes could help overcome other CVM problems such as fat tails and temporal insensitivity.
While the primary aim of our analysis does not require scaling household values up to the population level, such as for benefit-cost analysis, our results raise the question of how to treat respondents who do not attend to certain attributes—cost in particular—in that scaling exercise. To our knowledge the existing literature has not addressed this question directly, but generally, the difference between naive models and those that account for ANA is viewed as a bias arising from process heterogeneity (Glenk et al. 2015; Hensher, Rose, and Greene 2015). Taking that point of view, one would conclude that WTP values from models that correct for that bias could be applied to the larger population to estimate total social benefits. A more conservative approach would apply positive WTP only to the percentage of the population corresponding to the proportion of the sample that fully attended to the attributes, or just to the cost attribute if that is the primary concern. Alternatively, if a latent class model is used to correct for ANA and the class probability functions rely on data that are available for the larger population, those probabilities could be estimated for the population, thereby scaling the likelihood of nonattendance to the population as well. No doubt there are other possibilities that would provide additional population-level estimates to generate lower and upper bounds for comparison with total cost in program evaluations.
Acknowledgments
Previous versions of this paper were presented at the 2018 Society for Benefit-Cost Analysis meetings in Washington, D.C., and the 2018 Appalachian Experimental and Environmental Economics Workshop in Blowing Rock, North Carolina. The authors thank participants in those sessions, Dan Petrolia and Matt Interis, and two journal referees for helpful comments. The views and opinions expressed in this article are those of the authors and do not necessarily reflect the official policy or position of any agency of the United States (U.S.) government.
Footnotes
↵1 If the estimated WTP is greater for the larger policy intervention, the study passes an external scope test (Carson, Flores, and Meade 2001).
↵2 When individuals are asked multiple CVM questions, the response format is known as a discrete choice experiment and permits tests of internal scope (Carson and Louviere 2011; Carson and Czajkowski 2014). Internal scope tests verify consistency of preferences within an individual’s set of responses (Carson and Mitchell 1995).
↵3 Carson and Mitchell (1995) and Carson, Flores, and Meade (2001) review several other reasons why WTP may be insensitive to scope.
↵4 The economic importance of the scope test can be assessed with scope elasticity (Whitehead 2016). We include a table of marginal WTP estimates. In an attribute-based CVM experiment design the marginal WTP estimates serve as the slope of the WTP function, which is a ray from the origin. In our case and most others in the DCE literature, the scope elasticity is equal to one. In other words, the percentage change in treated acreage is equal to the percentage change in WTP.
↵5 While Carson and Groves (2007) discuss how sequential choice questions generally violate incentive compatibility, Vossler, Doyon, and Rondeau (2012) identify cases where such a survey design maintains the important property. See also Boyle et al. (2016).
↵6 Christie and Azevedo (2009) combine the CVM and DCE treatments, test for parameter equality, and find evidence of convergent validity between the two types of choice questions. Their comparison, however, is confounded by the number of valuation questions presented to respondents (three in repeated CVM and eight in the choice experiment) and the number of choice alternatives (two in the repeated CVM and three in the choice experiment), creating potentially large differences in cognitive burden between samples.
↵7 Siikamäki and Larson (2015) find scope effects with bounded sequential questions but do not compare their results with those from the single binary choice and do not ask repeated questions.
↵8 Petrolia, Interis, and Hwang (2014) did not conduct a scope test in their single binary choice question.
↵9 Meyerhoff and Liebe (2009) and Carlsson, Kataria, and Lampi (2010) do not find that WTP values declined when accounting for ANA. DeShazo and Fermo (2004) and Hensher, Rose, and Bertoia (2007), however, find higher marginal WTP when accounting for ANA.
↵10 In the first choice situation, for the bounded survey sample, 52% voted for the treatment referendum, 23% voted against, and 25% did not know how they would vote. Similarly, for the repeated survey sample in the first situation, 55% voted for the treatment referendum, 22% voted against, and 23% did not know how they would vote. These differences are not statistically significant across bounded and repeated survey samples. This is not surprising since there were no differences in the first situation between the two samples.
↵11 Carson et al. (1996) discuss how the 1993 replication of an Alaskan survey introduced the “would not vote” option, which did not have an effect on stated choices. Additionally, they mention that “would not vote” and “don’t know” responses were treated as choices against the plan (a conservative decision recommended by Schuman [1996]).
↵12 Our survey design is referred to as a binary choice sequence (BC-Seq) by Carson and Louviere (2011).
↵13 We do not use an efficient survey design. In the repeated treatment the attributes all vary independently from one another.
↵14 In dichotomous choice CVM the “fat tails” problem arises when survey respondents, in the aggregate, fail to respond rationally to increases in the cost of the policy (Ready and Hu 1995). This is often a problem when subsample sizes are small. Our data exhibit a fat tail from $150 (For = 45.13%, n = 113) to $200 (For = 45.16%, n = 124) with the repeated data. When the data on the first question are pooled, we no longer see the fat tail problem, as the percentage of “For” votes falls from 49.78% at $150 (n = 231) to 41.03% at $200 (n = 234). The difference is statistically significant at the 0.05 level in a one-tailed test.
↵15 We find no evidence of anchoring in our repeated treatment; however, data from the bounded treatment do exhibit anchoring. This suggests CVM researchers should move toward the repeated referendum type question with random cost and other attributes.
↵16 We find no differences in results for those who meet the conditions for survey consequentiality and those who do not (Petrolia, Interis, and Hwang 2014). Also, similar binary choice sequence question formats have been used to examine ordering effects (Holmes and Boyle 2005; Day et al. 2012; Nguyen et al. 2015). We find little evidence of ordering effects in our data.
↵17 While there is some evidence of incentive incompatibility and starting point bias, it is not a result that is robust enough to be cause for concern. For example, we generally observe such evidence only in the bounded data when controlling for both issues, but do find some evidence of starting point bias in the first two situations of the bounded data when we do not control for incentive compatibility.
↵18 We purposely omit the results from the first situation, and the first and second situation. The variance matrix for the repeated sample is singular in both cases, and the bounded sample does not exhibit scope effects for the first situation. Including both the first and second situation for the bounded sample yields statistically significant scope effects for ecological acreage in one class, while the estimated coefficient is negative and statistically significant at the 10% level in the other.
↵19 This is comparable to Hensher’s (2008) and Kragt’s (2013) results: 62% of Hensher’s sample exhibits total attribute attendance, and 55% of Kragt’s respondents attended to all attributes.
↵20 Our primary motivation for this is the small dataset, which is likely to yield singular variance estimates in the 2ᵏ-class case. For example, large numbers of latent classes lead to identification problems such as singular Hessians (Hensher, Rose, and Greene 2015). In fact, using our data, the 2ᵏ-class latent class model and several specifications of the 2ᵏ-class equality-constrained latent class model often fail to converge or are singular. While Scarpa et al. (2009) discuss how cost ANA is often correlated with nonattendance to other attributes, due to data limitations Thiene, Scarpa, and Louviere (2015) and Koetse (2017) focus on nonattendance to a single attribute.
↵21 Our qualitative results are robust to our definition of ANA (none, none and not much, none and not much and some). We chose to report answers of either not much or none as our baseline definition.
↵22 Our random parameters model uses 1,000 standard Halton sequence draws from normally distributed attribute parameters. The estimated standard deviations range from 0.02 times the corresponding mean (social) to 19.26 times the mean (social). On average, the standard deviations of the scope parameters are 2.53 times larger than the means.
↵23 Such peculiar results could be caused by a number of factors including model misspecification. In our context it is likely the result of our survey design and its effect on respondents’ answers. For example, the cost amount provided in follow-up situations may have changed relative to the other static attributes in a manner that is not seen as realistic. Thus follow-up situations may not be incentive compatible, leading to yea-saying behavior (Whitehead 2002).
↵24 We were unable to estimate a 2ᵏ-class non-equality-constrained latent class model due to data limitations.
↵25 We also tried to estimate a 16-class (non-equality-constrained) latent class model, but it did not converge.