Abstract
Proposition 21 on California’s 2010 ballot concerned an annual surcharge on vehicles to support state parks. Proposition 21 failed, leaving 25% of California state parks at risk of closure. We analyze voting patterns, which we show depend on the average gross price of the proposition, political ideology, environmental preferences, the availability of substitutes, and park salience. We simulate counterfactual scenarios under which Proposition 21 might have passed and use holdout samples to illustrate the predictive ability of our model. Heterogeneity across California makes our model potentially useful for predicting public sentiment for similar propositions, even for jurisdictions without direct democracy. (JEL H41, Q58)
I. INTRODUCTION
State park funding mechanisms across the United States have been shifting away from public funding for the past 20 years, leaning more heavily upon user-based fees instead. Parks typically represent a good that is publicly provided, although rival when congested, so parks would be underprovided without public funding. Furthermore, popular state parks may become congested in the absence of user fees. However, while state governments spend billions of tax dollars annually to maintain these parks for public use, the shift away from public funding has left many state parks around the country with substantial deferred maintenance and operating budget deficits. As a result, state parks have seen decreased access, diminished services, and in some cases they have been forced to close.1
The state park system of California has been significantly impacted by a decline in public funding appropriated from the general fund, an increasing reliance on user fees and concession revenues, as well as the recession that began in 2008. Per capita public expenditures on state parks had declined by 33% in the decade leading up to 2010.2 Furthermore, user fees and concession revenues together comprised over half of the state park department’s total revenue stream in the 2010–2011 fiscal year. In isolation, these trends may not be perceived as a problem; however, the amount of deferred maintenance in California state parks by 2010 was estimated to be in excess of $1 billion, suggesting that prevailing funding trends were insufficient to keep parks in good working order.3
As a result of these funding difficulties, the State Parks and Wildlife Conservation Trust Fund Act appeared as Proposition 21 on California’s November 2010 ballot. In this citizens’ initiative, voters were asked to authorize an $18 per year increase in vehicle license fees for noncommercial vehicles. In exchange for this fee, noncommercial vehicles with in-state registration would no longer have to pay a user fee at state parks and beaches.4 The revenue from the fee would have resulted in an additional $250 million per year being placed in a fund specifically designated for the state park system.5 However, fewer than 43% of voters approved the proposed shift in funding source, so Proposition 21 failed. As a consequence, 25% of California’s 278 state parks were slated to close by July 2012.6 While the state park system managed to partner at least temporarily with other organizations to save all but a handful of the threatened parks from closure in 2012, no long-term solution has yet been identified. If state parks are closed and sold to private interests, the land and its ecosystems may be irreversibly altered. The research described in this paper explores the determinants of the outcome of the 2010 state park funding ballot, in an effort to clarify why the measure failed.
There was substantial variation across California census tracts in voter support for Proposition 21. We exploit the state’s heterogeneity in income levels, sociodemographics, political ideologies, park salience, and the availability of substitutes for any particular state park to build a statistical model to explain the variation in voting outcomes for Proposition 21. Our primary research objective is to identify which factors best explain variation across census tracts in support for greater public funding of state parks. Second-order questions include the following: (1) Might there have been a lower statewide vehicle license surcharge at which Proposition 21 would have passed? (2) How different would other sociodemographic conditions need to have been for Proposition 21 to have passed? (3) Does our model have reasonable out-of-sample predictive power?
To answer these questions, we construct a dataset of Proposition 21 voting results along with other economic and sociodemographic characteristics. In particular, we construct variables at the census tract level that measure the gross price of Proposition 21 to households, household income, proxies for environmental preferences, the availability of substitutes for a particular state park, and the likely salience of state parks to voters in each census tract. Using these variables, we characterize variation in support for public funding for state parks. We also counterfactually simulate the extent to which conditions and/or preferences would have needed to differ for Proposition 21 to have passed, ceteris paribus. We find that support for Proposition 21 is sufficiently insensitive to gross price that there is no lower statewide vehicle license fee at which it would have passed. Our simulations do, however, show the extent to which more-liberal political preferences and greater sympathies for the environment might have led to a statewide majority in favor of Proposition 21. Finally, we use county-level holdout samples within California to demonstrate how well our empirical model performs in making out-of-sample predictions. This exercise supports the potential to transfer our model to other jurisdictions within the United States.
Several earlier studies have examined user responses to different park management policies (see Crompton and Lue 1992; Kerkvliet and Nowell 2000; Mansfield et al. 2008; Loomis and Keske 2009). There is also a body of work using stated preference methods to examine how people value public parks, both domestically and in other countries (see Carson, Wilks, and Imber 1994; Shechter, Reiser, and Zaitsev 1998; Baral, Stern, and Bhattarai 2008; Jacobsen and Hanley 2009; Jacobsen and Thorsen 2010). Some works apply novel estimation techniques to stated preference data to derive willingness-to-pay values for protection of state parks (see Fernández et al. 2004; Czajkowski and Hanley 2009; Hanley et al. 2009; Scarpa, Thiene, and Hensher 2010), whereas other authors undertake qualitative studies of attitudes toward public park funding (see Fix and Vaske 2007; Peters and Hawkins 2009).
There is also a class of literature that uses referendum voting results to reveal information about the possible nature of demand for public goods. Studies in this class include those by Deacon and Shapiro (1975); Kline and Wichelns (1994); Kahn and Matsusaka (1997); Kahn (2002); Vossler et al. (2003); Vossler and Kerkvliet (2003); and Messer et al. (2010), who use U.S. data to characterize support for environmental referenda. Outside the United States, Thalmann (2004); Halbheer, Niggli, and Schmutzler (2006); Kotchen and Powers (2006); Bornstein and Thalmann (2008); Bornstein and Lanz (2008); and Schulz and Schläpfer (2009) use Swiss voting data with similar objectives.
While we do not claim to derive a formal demand function for state parks in this paper, we do use methods prevalent in the related literature to address our research questions. Deacon and Shapiro (1975) conducted a seminal study in which they modeled California voters’ choices on referenda about conservation of the California coastline and public funding for rapid transit.7 The empirical strategy in this paper is an adaptation of the model developed by Deacon and Shapiro and is similar to that of Kahn and Matsusaka (1997), who characterize demand for environmental goods using voting results from all of California’s environmental public referenda between 1970 and 1994.8
In research following the 2012 temporary reprieve for California’s state parks, Walls (2013) considers different payment mechanisms for funding public parks. She notes that California created an enterprise fund for state parks. An enterprise fund would dictate that revenue raised from user fees goes toward park maintenance and not back to the general fund. Walls (2014) also examines the role of philanthropy in funding public parks.
The present paper makes two main contributions to the literature. First, we conduct our analysis at the census tract level and include a wider array of economically important explanatory variables than has been used in other analyses like this one, paying particular attention to the local availability of substitutes for the public good in question (namely, any specific park that might close) and to variables that are designed to capture variations in the salience of state parks to local voters. Second, we use the substantial heterogeneity in California’s population and geography to make a thorough assessment of our model’s capabilities in terms of out-of-sample forecasting.
II. DATA
We gather two types of data to address our research questions. The first type of data provides measures of our dependent and independent variables. The second type assists us in normalizing the spatial extent of each of our variables to the level of the census tract. We choose the census tract as our unit of observation because it represents the spatial extent at which we can construct the richest set of data.
The raw data for our dependent and independent variables, however, are measured at a variety of different spatial extents. To normalize the data on the common spatial unit of the census tract, we use ArcGIS (ESRI 2011). Circa 2010, California had 21,907 voting precincts, 7,049 census tracts, and 1,654 zip codes, but the larger polygons are not simply aggregates of the smaller ones. The boundaries often overlap. If we knew the geographic distribution of the population within a given polygon, we could use population weights when we aggregate precincts (or disaggregate zip codes) to the level of census tracts. Unfortunately, the absence of information about population densities within subareas of zip codes makes it necessary to assume that households are uniformly distributed across space in each overlapping jurisdiction. We therefore weight the characteristics of each measured spatial unit based on the geographic proportion of the spatial unit contained within the census tract in question.
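The area-weighting step described above can be illustrated with a minimal sketch. It assumes that vote counts (rather than shares) are apportioned, and the function name, data layout, and toy numbers are hypothetical; the actual overlay was carried out in ArcGIS.

```python
def areal_weight_votes(precincts, overlaps):
    """Allocate precinct vote counts to tracts in proportion to overlapping area.

    precincts: {precinct_id: {"area": float, "yes": float, "total": float}}
    overlaps:  list of (precinct_id, tract_id, overlap_area) tuples
    Returns {tract_id: share of votes in favor}.
    Assumes households are spread uniformly within each precinct.
    """
    tract_yes = {}
    tract_total = {}
    for precinct_id, tract_id, overlap_area in overlaps:
        p = precincts[precinct_id]
        # Fraction of the precinct's area that falls inside this tract.
        weight = overlap_area / p["area"]
        tract_yes[tract_id] = tract_yes.get(tract_id, 0.0) + weight * p["yes"]
        tract_total[tract_id] = tract_total.get(tract_id, 0.0) + weight * p["total"]
    return {t: tract_yes[t] / tract_total[t]
            for t in tract_total if tract_total[t] > 0}


# Hypothetical toy data: precinct P2 straddles tracts T1 and T2.
precincts = {
    "P1": {"area": 10.0, "yes": 400.0, "total": 1000.0},
    "P2": {"area": 20.0, "yes": 300.0, "total": 500.0},
}
overlaps = [
    ("P1", "T1", 10.0),  # all of P1 lies in T1
    ("P2", "T1", 5.0),   # one quarter of P2 lies in T1
    ("P2", "T2", 15.0),  # three quarters of P2 lie in T2
]
shares = areal_weight_votes(precincts, overlaps)
```

In this toy case, tract T1 receives all of P1 and one quarter of P2, so its yes share is 475/1,125. The same weights could be applied to zip-code-level counts such as vehicle registrations.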
One of the formats in which ArcGIS accepts spatial data is the so-called shapefile, which contains information describing spatial boundaries and the values of variables associated with that polygon. We obtained the voting precinct shapefile from the Statewide Database of California (SWDB).9 From the U.S. Census Bureau we obtained shapefiles for the state, counties, and census tracts.10 Finally, we acquired a shapefile with all California zip codes from ESRI, the software company that produces ArcGIS.
Our dependent variable is the proportion of total votes in favor of Proposition 21. We take the raw data for this variable at the precinct level from the SWDB and normalize them to the level of census tracts. To measure political preferences, we similarly generate variables for the proportion of total gubernatorial votes in favor of the Democratic, Republican, and other minor liberal and conservative candidates.
We argue that the total cost of Proposition 21 to each household is more likely to have affected voting decisions than the cost to individuals within a household. Therefore, the average gross price of Proposition 21 is based on the average number of vehicles per household in each census tract (i.e., the total number of registered vehicles divided by the number of households). This average, times $18, represents the gross price implied by the proposition, and variation in this gross price is created by the heterogeneity in average number of vehicles per household across census tracts.
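As a worked illustration of this definition, the sketch below computes the average gross price for a single tract. The function name and the tract counts are hypothetical; only the $18 surcharge and the vehicles-per-household ratio come from the text.

```python
SURCHARGE_PER_VEHICLE = 18.0  # dollars per vehicle per year under Proposition 21


def gross_price(registered_vehicles, households):
    """Average gross price per household: $18 times vehicles per household."""
    if households <= 0:
        raise ValueError("households must be positive")
    return SURCHARGE_PER_VEHICLE * registered_vehicles / households


# Hypothetical tract with 3,570 registered vehicles and 1,500 households,
# i.e. 2.38 vehicles per household.
price = gross_price(3_570, 1_500)
```

A tract with 2.38 vehicles per household therefore faces an average gross price of about $42.84 per household per year.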
To create our gross price variable, we purchased vehicle registration data for the state of California from Polk, Inc.11 This dataset contains over seven million vehicle counts, by make, model, and year, at the zip code level in California for October of 2010. These data are normalized to the census tract level prior to creating the gross price variable. It is important to emphasize that this gross price measure is not the net price of Proposition 21. The net price of the proposition to each household would also be affected by the fact that vehicles for which the surcharge had been paid would then be exempted from vehicle charges for park use. Thus, the net price per household will vary endogenously with expected park usage. There are no general-population data on state park usage by census tract in California, so we can only proxy for this expected park use measure using a set of indicators that we argue are likely to be correlated with average park usage in a census tract.
We seek to control for differences in environmental preferences in two different ways. Kahn and Vaughn (2009) discovered that hybrid vehicles tend to cluster in environmentalist communities. We utilize the Polk data on vehicle registrations by make, model, and year to proxy for environmental preferences using the proportion of hybrids per total vehicles in a census tract.12 We also use California’s League of Conservation Voters (LCV) scores as an additional measure of environmental preferences, although California’s 53 congressional districts are far larger, spatially, than census tracts or zip codes, implying rather poor spatial resolution for the sentiments associated with this variable.13
Park salience is also likely to be an important determinant of support for Proposition 21. Deacon and Schläpfer (2010) find that voter support for environmental improvements is related to the distance at which the improvements are likely to occur. The nearby area of state park land is likely to be correlated with support for Proposition 21 because the closer a household is to state park land, the lower will be the travel cost to access that land. We obtain a shapefile containing location and size of state, local, and national parks from the California Protected Areas Database.14 With this shapefile, we construct variables that permit us to control for the total area of state park land within buffers of designated widths around the geographic centroid of each census tract (giving us the option to consider 20 km, 50 km, or 100 km buffers).
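The buffer variables can be approximated with a short sketch. Note the simplifying assumption: each park is treated as a point at its own centroid and counted in full if that point lies within the buffer, whereas a GIS overlay would measure only the park area that falls inside the buffer. The function names, coordinates, and acreages are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def park_acres_within(centroid, parks, radius_km):
    """Sum the acreage of parks whose centroid lies within radius_km of the tract centroid."""
    lat0, lon0 = centroid
    return sum(acres for lat, lon, acres in parks
               if haversine_km(lat0, lon0, lat, lon) <= radius_km)


# Hypothetical tract centroid and parks given as (lat, lon, acres).
tract_centroid = (37.0, -122.0)
parks = [
    (37.1, -122.0, 1200.0),  # roughly 11 km away
    (37.0, -121.6, 5000.0),  # roughly 36 km away
    (37.8, -122.0, 9000.0),  # roughly 89 km away
]
acres_by_buffer = {r: park_acres_within(tract_centroid, parks, r)
                   for r in (20, 50, 100)}
```

Widening the buffer from 20 km to 100 km brings progressively more park land into the measure, which is why the choice of buffer width is left as an option.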
As another measure of park salience, we use participation in hunting, an ecosystem-based activity. From the California Department of Fish and Game Automated License Data System, we obtain zip-code-level data on licenses purchased between November 2010 and October 2011, this being the first full year for which digital data became available.15 These data give us the total number of licenses per zip code, which we then divide by the number of households to get the average number of licenses of each type per household.
Support for Proposition 21 is likely to be related to the availability of other types of non-state-park substitutes in the event of closure of any particular state park. To capture the local spatial density of substitutes for state parks, we identify the areas (or numbers) of different types of state park substitutes within designated buffers around the geographic centroid of each census tract (including 20 km, 50 km, or 100 km buffers). Substitutes include (1) local parks, which are parks managed at geographic areas smaller than the state, including anything falling into the category of local park, local beach, local recreation area, local wilderness area, or local historical site; (2) federally controlled land, which includes national parks, national seashores, national recreation areas, or national wildernesses; and (3) attractions (which include, e.g., historical houses and farms) and museums (which include, e.g., natural history and art museums).16
Finally, we turn to the American Community Survey17 to obtain five-year (2005–2009) average household income and other sociodemographic information by census tract. These data give us measures of income, industry of employment, age distribution, race/ethnicity, and some particular types of household structures (i.e., single parents).
Table 1 shows descriptive statistics broken out by category of variable. A few interesting observations are relevant. First, average tract level support for Proposition 21 is 43%, with some tracts showing as little as 9% support and others supporting the proposed change with nearly 85% of the vote (Figure 1). Second, the average household gross price in terms of vehicle fees is $42.83, although it varies widely across census tracts from $5.23 to $71.81.18 Consistent with California’s vast heterogeneity in rural and urban areas, the calculated number of vehicles per census tract varies from 82 to over 33,000, and average household size varies from 1.2 to about 6.4. A visual description of the distribution of the average household gross price of Proposition 21 across census tracts is shown in Figure 2.
Figure 1. Variation in Support for Proposition 21 across Census Tracts
Figure 2. Variation in Gross Price of Proposition 21 across Census Tracts
Table 1. Variable Descriptive Statistics (N = 6,699)
Third, at the census tract level, political preferences tend to lean more Democratic than Republican, with the average gubernatorial vote share for the Democratic candidate at 57.6% and the average Republican share at 36.8%. Fourth, although the proportion of hybrids to total vehicles is small, there is considerable variation in this proportion across census tracts. The average share of hybrid vehicles per total vehicles is 0.3%, although some census tracts have a share as large as 4.6%. Finally, there is a large amount of heterogeneity across census tracts in terms of the number of acres of state park land within specific buffers. For example, the average amount of state park land within a 50 km radius of the geographic centroid of a census tract is 30,000 acres, while some census tracts have zero acres of nearby state park land, and some have as many as 671,000 acres in that same radius. We exploit this variation to explain voter support for Proposition 21.
III. ECONOMETRIC SPECIFICATION
We estimate a log-odds reduced-form regression model to characterize variation across California census tracts in support for Proposition 21. The dependent variable in the model is
\[
\ln\left(\frac{S_i}{1 - S_i}\right) \qquad [1]
\]
where S_i is the share of total votes in favor of Proposition 21 in census tract i. This approach is common in the literature because this transformation allows the dependent variable to span the entire real line instead of being bounded between zero and one.19
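The log-odds transformation and its inverse can be sketched in a few lines. The function names are hypothetical and the natural logarithm is assumed; the inverse is what maps a fitted log-odds value back to a predicted vote share.

```python
import math


def log_odds(share):
    """Map a vote share in (0, 1) to the real line: ln(S / (1 - S))."""
    if not 0.0 < share < 1.0:
        raise ValueError("share must lie strictly between 0 and 1")
    return math.log(share / (1.0 - share))


def inverse_log_odds(z):
    """Map a log-odds value back to a share in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Shares below one half give negative log-odds; shares above give positive values.
z_mean = log_odds(0.43)   # average tract-level support reported in the data section
z_high = log_odds(0.85)   # near the highest tract-level support
```

Because the transformation is undefined at exactly zero or one, a tract with unanimous voting would need separate handling; the observed range of roughly 9% to 85% avoids that case.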
The equation we estimate for the full set of data takes the following general form:
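\[
\ln\left(\frac{S_i}{1 - S_i}\right) = \beta_0 + \beta_1\,\mathit{GrossPrice}_i + \beta_2\,\mathit{Income}_i + \mathit{Ideology}_i'\beta_3 + \mathit{EnvPref}_i'\beta_4 + \mathit{Salience}_i'\beta_5 + \mathit{Subst}_i'\beta_6 + \mathit{Socio}_i'\beta_7 + \varepsilon_i \qquad [2]
\]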
The categories of explanatory variables in equation [2] help us achieve our main objective, which is to explain variation in support for Proposition 21 as precisely as possible. GrossPrice_i is $18 × (Vehicles_i/Hhld_i), or the average gross price of Proposition 21 per household in tract i.20 Income_i is the logged value of median household income in thousands of dollars. Ideology_i is a vector of variables indicating the proportion of each census tract voting for the gubernatorial candidate of each of the parties running in the 2010 election.21 Inclusion of these variables dramatically increases the fit of our model by allowing us to control for political preferences. EnvPref_i is a vector of variables that includes hybrids as a share of vehicles and other vehicle-related variables, as well as LCV scores during the 2010 Congress. These variables give us a glimpse into how environmental preferences shift support.
Salience_i is a vector of state-park-specific variables including the logarithm of the thousands of acres of state park land within a 50 km buffer of the geographic centroid of the census tract, as well as the average number of hunting licenses purchased by households in the census tract. We use these as measures of how the relevance of state park land affects support for Proposition 21. Subst_i is a vector of variables including the area of local parks within a 20 km buffer of the centroid of the census tract, the area of federally controlled land within a 50 km buffer of the centroid, and the number of museums and other attractions within a 50 km buffer of the centroid. These variables allow us to determine whether support for state parks is higher when there are fewer recreational substitutes nearby for any park that might be closed. Finally, Socio_i is a vector of sociodemographic controls including information about the distribution of income, industry of employment, age, ethnicity, and family structure in each census tract.
Ordinary least squares (OLS) models may be inappropriate for estimation, given the potential for unobserved spatial heterogeneity in the data. We explore two strategies to accommodate spatial properties in the systematic and stochastic portions of the model. Following LeSage and Pace (2009) and Wu and Cutter (2011), we explore a spatial autoregressive model with spatially correlated disturbances (SAC model). We consider this specification because we cannot exclude the possibility that, due to unobserved factors, the share of total votes in favor of Proposition 21 in a given census tract is systematically correlated with the share of total votes in favor of Proposition 21 in neighboring census tracts. But we also consider an OLS specification with errors clustered at the county level (OLS model). Unfortunately, available algorithms for the SAC model do not also allow for clustered errors, and we note that clustering tends to reduce the statistical significance of several of the coefficients in the model. Both our SAC model and our OLS model accommodate a non-scalar-identity error variance-covariance matrix in different ways, but these specifications are nonnested. The implications of the two models are fortunately very similar.
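For orientation, a SAC specification of the kind described here is commonly written as follows. This is a sketch under stated assumptions: y denotes the vector of tract-level log-odds, X the matrix of regressors, and W a spatial weights matrix whose construction is not described in this section (using the same W in both places is also an assumption). The roles of λ and ρ follow the way the results discussion later describes them.

```latex
\begin{aligned}
y &= \lambda W y + X\beta + u, \\
u &= \rho W u + \varepsilon, \qquad \varepsilon \sim (0, \sigma^2 I).
\end{aligned}
```

Setting λ = 0 and ρ = 0 collapses this system to the linear model y = Xβ + ε, which is the mean function that the OLS specification with county-clustered errors estimates.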
While equation [2] has microfoundations (Deacon and Shapiro 1975), the data available for use in this study are aggregate. The dependent variable is a group average of a binary variable, and the explanatory variables are analogous group averages or proportions. We avoid the so-called ecological fallacy (Freedman 1999) and adopt the strategy of making inferences only about group behavior, using the census tract as the unit of observation. We do not claim to be able to recover individual preferences from our parameter estimates, although any regularities we discover may of course be suggestive of new stylized facts to be considered in future individual-level studies.22
IV. RESULTS
We first discuss model selection, after which we examine the main results of our econometric estimation, which appear in Table 2. Then we outline a couple of simple falsification tests and comment upon the robustness of our results to alternative specifications. We subsequently explore simulations, based on the estimated coefficients, that permit us to answer the questions posed in the introduction. Finally, we conduct a comprehensive holdout sample analysis whereby we assess the predictive power of our models.
Table 2. Main Results (Dependent Variable = Log Odds of Vote Share)
Our results are based on 6,699 of the 7,049 California census tracts for which we have complete data for estimation. We focus on a model that includes all seven categories of available regressors. While we provide the full set of parameter estimates, the ensuing detailed discussion focuses on the estimates for only the most important classes of explanatory variables. We first consider our preferred model, the spatial autoregressive model with spatially correlated errors (SAC) like that used by Wu and Cutter (2011). We also consider an OLS specification with standard errors clustered at the county level.
When considering the SAC specification, it is important to look first at the coefficients on λ and ρ, toward the bottom of Table 2. The parameter λ measures the extent of spatial autocorrelation in the dependent variable; ρ measures the extent to which there is spatial correlation in the error term. The coefficients on λ and ρ are bounded between –1 and 1. In this case, both estimates are positive and statistically significantly different from zero, indicating that our preferred model should be the spatial autoregressive model with spatially correlated disturbances (SAC), although the magnitudes of these correlations seem relatively small, at roughly 0.007 for λ and 0.115 for ρ. A simple comparison of the log-likelihoods for these two nonnested models likewise implies that the SAC model does a better job with this particular dataset. The log-likelihood is 3,229 for the SAC model versus 1,786 for the OLS model (note that the log-likelihoods are atypically positive for these models). The Akaike information criterion produces values of –6,379 for the SAC model versus –3,499 for the OLS model.23 Hence, the SAC specification is our preferred model.
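The relationship between the reported log-likelihoods and Akaike values can be checked with a small sketch, assuming the standard definition AIC = 2k − 2 ln L. The parameter counts k are not reported in this section, so the code backs out the penalty implied by the reported figures; the function names are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion under the standard definition 2k - 2 ln L."""
    return 2.0 * n_params - 2.0 * log_likelihood


def implied_penalty_params(aic_value, log_likelihood):
    """Back out the k implied by a reported AIC and log-likelihood."""
    return (aic_value + 2.0 * log_likelihood) / 2.0


# Reported values: (AIC, log-likelihood) for each model.
sac_k = implied_penalty_params(-6379.0, 3229.0)  # 39.5
ols_k = implied_penalty_params(-3499.0, 1786.0)  # 36.5

# The AIC gap is driven almost entirely by the log-likelihood gap.
aic_gap = -6379.0 - (-3499.0)           # -2880.0
loglik_gap = 2.0 * (3229.0 - 1786.0)    # 2886.0
```

The implied values are half-integers, which presumably reflects rounding in the reported log-likelihoods rather than the true parameter counts; this is an inference, not something stated in the text. Either way, the penalty terms are tiny relative to the gap in fit, so the lower AIC for the SAC model follows directly from its higher log-likelihood.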
Across the SAC and OLS models, the coefficients all bear the same signs, except for the coefficient on the interaction term between the gross price of Proposition 21 and the share of votes for the gubernatorial candidate of one of the conservative parties other than the Republican party (only about 3.2% of the vote, on average). For the effects of the sociodemographic control variables, the SAC coefficients tend to be somewhat smaller, for the most part, than their OLS counterparts, and they appear to be more precisely estimated. The reported t-test statistics for the OLS model are based on errors clustered at the county level, whereas those for the SAC model reflect the spatially autoregressive specification and the structured spatial correlation in the errors, so these nonnested models are not readily comparable. We will concentrate upon the SAC model in our discussion of the estimates but provide OLS model results to illustrate the robustness of our findings to the choice of specification.
Point Estimates
It is worth repeating that our price measure is a gross price and not a net price for Proposition 21. To obtain a net price, we would need to be able to measure and subtract the total expected user fees for state parks that households would avoid if Proposition 21 passed. Instead, we proxy for things that are likely to be positively correlated with expected state park user fee savings due to Proposition 21, but we refer to just the gross price in our discussion of price. As we expected, a higher gross price decreases the log-odds of the yes-vote share, which is illustrated by both the raw coefficient on gross price and its interaction terms.24
Our subsequent analysis relies heavily upon the estimated gross price coefficient, so it is worth asking how well our model of vote shares would perform in the absence of the gross price variable. We do this in two ways. First, we use a likelihood ratio test for the joint contribution of the suite of price variables (including interaction terms) to see how important they are in the context of the full model. The likelihood ratio test statistic of 155.8 indicates that we can soundly reject the null hypothesis that the suite of four price variables fails to make a significant contribution to explaining the variation in vote shares. Second, we estimate the model using only the suite of price variables and compare those results with the full model to see how well the price variables explain voting choices in the absence of the other explanatory variables. Evidence from this comparison confirms that the suite of price variables alone does not explain the variation in vote shares nearly as well as the full model.25
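To see how decisive a likelihood ratio statistic of 155.8 is, the sketch below evaluates its p-value under a chi-square distribution with four degrees of freedom, one per excluded price variable. It assumes the usual asymptotic chi-square reference distribution and uses the closed-form tail probability that holds for even degrees of freedom; the function names are hypothetical.

```python
import math


def lr_statistic(loglik_full, loglik_restricted):
    """Likelihood ratio statistic: twice the gap in maximized log-likelihoods."""
    return 2.0 * (loglik_full - loglik_restricted)


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable, valid for even df only.

    For df = 2m: P(X > x) = exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    term = 1.0   # j = 0 term of the sum
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


# Reported statistic for the joint exclusion of the four price variables.
p_value = chi2_sf_even_df(155.8, 4)

# For comparison, the conventional 1% critical value for 4 degrees of freedom.
p_at_critical = chi2_sf_even_df(13.277, 4)
```

The reported statistic is more than ten times the 1% critical value of about 13.28, so the resulting p-value is vanishingly small and the joint null is rejected at any conventional level.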
The overall effects of the variables capturing political preferences are as expected. The more Republican-leaning is a census tract, the less likely are its residents to vote for Proposition 21, compared with a Democratic-leaning tract, all else equal. This is consistent with Thalmann (2004), Bornstein and Lanz (2008), Bornstein and Thalmann (2008), and Coan and Holman (2008), who find that political ideology has substantial power to explain variation in support for an environmental proposition.
We turn next to environmental preferences. The share of hybrid vehicles has a coefficient that is positive and strongly significant across all specifications in Table 2, suggesting that stronger environmental preferences are associated with greater odds that census tract voters will, on average, favor Proposition 21. Our other measure of environmental preferences, the LCV scores for the legislator associated with each census tract, is statistically insignificant in both models. One reason for this apparently counterintuitive result may be the lesser degree of spatial resolution at the level of California’s 53 congressional districts (as opposed to its 7,049 census tracts). This higher level of spatial aggregation may attenuate the point estimate and prevent us from getting a clearer picture of the potential influence of the environmental ideology of elected representatives on referendum voting by individuals at the much finer spatial scale of the census tract. Nevertheless, we retain the LCV score in the model because it is a standard variable in so many other studies of this nature.
Related to environmental preferences, we expect that our proxies for the salience of state parks should be an important class of explanatory variables. The positive and significant estimated coefficient on state park area within a 50 km buffer of the geographic centroid of a census tract suggests that the more state park area there is nearby, the greater is support for Proposition 21.26 However, the average number of hunting licenses purchased per household in a tract is negatively associated with support for Proposition 21. This result could stem from the strong association of hunting participation with the degree of urbanization and its correlates. While 6% of the U.S. population hunts, participation is only 3% in large Metropolitan Statistical Areas (1 million or more inhabitants), but 18% outside MSAs. The negative coefficient on hunting licenses may also reflect that since state parks typically do not allow hunting, any public land devoted to state parks could be viewed by hunters as reducing their access to potential hunting areas.
In contemplating the potential transferability of our findings to other states, it is important to control for the availability of substitutes for state parks. California’s geography is large and diverse, but many other states do not enjoy California’s endowment of attractive natural areas. We include categories of potential substitutes that are both similar to and different from state parks. These potential substitutes include (1) local parks, (2) federally controlled land, and (3) museums and attractions. As mentioned above, a greater area of local parks may affect support for Proposition 21, although the number of local parks in a given area (as opposed to state or national parks) is also correlated with the degree of urbanization, and thus with urban amenities such as museums and nonpark attractions. People who choose to live in census tracts with many nearby local parks may also place a higher value on state parks. The coefficient on federally controlled land is negative and significant, suggesting that these types of lands may indeed be substitutes for state parks. Likewise, the estimated coefficient on the explicit number of museums and other nonpark attractions is negative and statistically significant, suggesting that these recreational options, too, are substitutes for state parks.
Falsification Tests
Before running our simulations and holdout sample analysis, we conduct falsification-type tests to determine whether our results are sensitive to specification decisions or whether we are merely measuring voter reactions to other propositions on the same ballot. We also ask whether our model merely explains voter turnout, rather than preferences concerning Proposition 21 in particular.
We first compare our fitted SAC model for Proposition 21 with analogous models to explain voting results for other propositions in the same election to see if the econometric model in this paper uniquely explains the variation in the vote for Proposition 21. Specifically, we consider Proposition 22 (The Local Taxpayer, Public Safety, and Transportation Protection Act) and Proposition 26 (Supermajority Vote to Pass New Taxes and Fees Act). Proposition 22 prohibits the state, even during a period of severe financial hardship, from delaying the distribution of tax revenues for transportation, redevelopment, or local government projects and services. Proposition 26 requires that certain state fees be approved by a two-thirds vote of the legislature and certain local fees be approved by two-thirds of voters. Both propositions concern how the government will handle tax revenues, but on the surface they have nothing in particular to do with parks or other environmental public goods.
We compare the coefficients for Proposition 21 to the coefficients for Proposition 22 (revenue distribution), and also compare the coefficients for Proposition 21 to the coefficients for Proposition 26 (two-thirds votes). Both comparisons show that the coefficients differ substantively across propositions by an order of magnitude. For good measure, we conduct Wald tests for the equivalence of estimated coefficients using pooled data for Proposition 21 and Proposition 22, and again for Proposition 21 and Proposition 26. The Wald tests soundly reject the equivalence of coefficients in both cases, which strengthens our confidence that the data-generating process that led to voting patterns for Proposition 21 is fundamentally different from that which led to voting patterns for Proposition 22 and Proposition 26. We run one final falsification test to examine whether our suite of regressors is merely explaining voter turnout. A Wald test similarly allows us to reject this possibility.27
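For reference, the test of coefficient equivalence across two propositions can be written in the standard Wald form. The notation here is ours rather than the paper's and is only a sketch: \hat\beta_a and \hat\beta_b denote the K x 1 coefficient vectors estimated for the two propositions, and \hat V_{aa}, \hat V_{bb}, and \hat V_{ab} denote the corresponding blocks of their joint covariance matrix, as would be obtained from seemingly unrelated regressions.

```latex
H_0:\ \beta_a = \beta_b, \qquad
W = (\hat\beta_a - \hat\beta_b)^{\top}
    \left[\hat V_{aa} + \hat V_{bb} - \hat V_{ab} - \hat V_{ab}^{\top}\right]^{-1}
    (\hat\beta_a - \hat\beta_b)
    \ \overset{a}{\sim}\ \chi^2_K \quad \text{under } H_0 .
```

A large value of W relative to the chi-squared critical value with K degrees of freedom leads to rejection of equal coefficients.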
Simulations
We turn now to some of the specific questions we posed at the beginning of the paper. We answer these questions by simulating counterfactual scenarios under which (1) the gross price of Proposition 21 is lower, (2) the population is more liberal, and (3) the population is more environmentally conscious.
To see whether merely a decrease in the size of the per vehicle surcharge might have allowed Proposition 21 to pass, we solve both of our fitted models for the vehicle surcharge, k_i, which would have led to a predicted breakeven vote in each individual census tract.28 The SAC model predicts that for 47% of tracts, even lowering the per vehicle surcharge all the way down to $0.01 would not have been sufficient to achieve a 50% vote for Proposition 21.29 For the other 53% of tracts, however, the simulated vehicle surcharges that would produce predicted yes votes of 50% in each tract are depicted across these tracts in Figure 3. The SAC model predicts that some tracts might have been willing to pay upward of $400 per vehicle and still pass Proposition 21 at 50%. This $400 per vehicle fee is, obviously, an out-of-sample extrapolation. It is likely that people in these areas are highly supportive of Proposition 21, and their individual votes would be largely the same given any reasonable vehicle fee.
Figure 3. Fitted Breakeven Fees
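The tract-level breakeven calculation can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes a plain linear log-odds model in which the gross price equals the per vehicle surcharge times average vehicles per household, and it ignores the spatial lag and spatial error terms of the SAC model. The names `base_log_odds`, `price_coef`, and `vehicles_per_household`, and all numbers, are hypothetical.

```python
from typing import Optional


def breakeven_surcharge(
    base_log_odds: float,
    price_coef: float,
    vehicles_per_household: float,
) -> Optional[float]:
    """Per vehicle surcharge k_i at which the predicted yes share is 50%.

    Assumed (hypothetical) model for tract i:
        log_odds_i = base_log_odds + price_coef * k * vehicles_per_household
    A 50% vote share corresponds to log_odds_i = 0, so we solve for k.
    Returns None when no positive surcharge yields a 50% vote.
    """
    if price_coef >= 0:
        raise ValueError("price_coef must be negative")
    if vehicles_per_household <= 0:
        raise ValueError("vehicles_per_household must be positive")
    k = -base_log_odds / (price_coef * vehicles_per_household)
    return k if k > 0 else None


# Hypothetical tracts: one supportive at a zero price, one opposed.
supportive = breakeven_surcharge(0.09, -0.01, 2.0)
opposed = breakeven_surcharge(-0.20, -0.01, 2.0)
```

In this sketch, a tract whose fitted log-odds are already negative at a zero price has no positive breakeven fee, which is the analogue of the tracts for which even a $0.01 surcharge would not suffice.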
Alternatively, we could ask whether there might have been some lower common statewide vehicle license surcharge, k ≤ $18, that would have allowed Proposition 21 to just pass. To obtain a point estimate of a breakeven vehicle surcharge, we set k_i = k for all census tracts and conduct a line search, starting at $18 and reducing the price incrementally. Using this strategy for both models, we find that even reducing the statewide vehicle surcharge all the way to $0.01 would still have been insufficient to pass Proposition 21 at the state level. Voters appear to be rather insensitive to the gross price at the state level. One explanation for this result is that people care more about the payment vehicle (method of payment) for funding Proposition 21 than they care about the actual cost of funding Proposition 21. This explanation is consistent with the results of Kotchen and Powers (2006) and Nelson, Uwasu, and Polasky (2007), who find that municipal open-space conservation referenda are less likely to pass when new taxes are involved. It is also consistent with the results of Johnson (1999), where potential price increases seem to have negligible effects on voter support for the proposed regulation. Additionally, Matsusaka (1995, 2005) finds that, on average, states with direct democracy, such as California, rely less on broad-based taxes and more on user fees. This result further supports the idea that a tax may not have been the most desirable payment vehicle for increasing overall state park revenue. However, that explanation is inconsistent with the findings of Halbheer, Niggli, and Schmutzler (2006), who use Swiss data to determine that voter support for environmental propositions does not seem to depend upon whether a proposal involves a tax. The explanation for this difference may lie in heterogeneous attitudes toward taxes in the United States versus Switzerland.
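The statewide line search described above can be illustrated with a short sketch. It rests on assumptions that are not taken from the paper: a non-spatial log-odds model, a statewide yes share computed as a vote-weighted average of tract shares, a one-cent step, and hypothetical tract data and field names (`votes`, `vehicles`, `base_log_odds`).

```python
import math
from typing import Dict, List, Optional

Tract = Dict[str, float]


def inv_logit(z: float) -> float:
    """Map log-odds back to a share in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def statewide_yes_share(tracts: List[Tract], fee: float, price_coef: float) -> float:
    """Vote-weighted yes share when every tract faces the same per vehicle fee."""
    total_votes = sum(t["votes"] for t in tracts)
    yes_votes = sum(
        t["votes"]
        * inv_logit(t["base_log_odds"] + price_coef * fee * t["vehicles"])
        for t in tracts
    )
    return yes_votes / total_votes


def line_search_fee(
    tracts: List[Tract],
    price_coef: float,
    start_cents: int = 1800,
    floor_cents: int = 1,
) -> Optional[float]:
    """Step the fee down from $18 in one-cent increments.

    Returns the highest fee at which the statewide share reaches 50%,
    or None if even the floor fee is insufficient.
    """
    for cents in range(start_cents, floor_cents - 1, -1):
        fee = cents / 100.0
        if statewide_yes_share(tracts, fee, price_coef) >= 0.5:
            return fee
    return None


# Hypothetical data: two tracts with different turnout and vehicle ownership.
EXAMPLE_TRACTS = [
    {"votes": 1000, "vehicles": 2.0, "base_log_odds": 0.12},
    {"votes": 600, "vehicles": 1.5, "base_log_odds": -0.05},
]
example_fee = line_search_fee(EXAMPLE_TRACTS, price_coef=-0.01)
```

A return value of `None` corresponds to the finding reported in the text: no positive statewide fee is low enough for the measure to pass.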
While a ceteris paribus decrease in the statewide fee associated with Proposition 21 might not have been enough to pass it, both of our models predict that more liberal ideologies might have been sufficient. Consider the effect of the tract-level share of votes for the Republican gubernatorial candidate. The point estimate suggests that the greater the share of Republican voters in a tract, the lower the support for Proposition 21. This effect is strongly significant in both models, which raises the analogous question: How much more liberal would the voting population have needed to be to just pass Proposition 21? We again simulate a counterfactual scenario in which the population is more liberal. We fix the handful of minor-party gubernatorial vote shares and concentrate on the margin between mainstream Democrats and mainstream Republicans, shifting only the swing voter between these two parties. Assuming that voter turnout is unaffected, we incrementally shift the log-odds of the proportion of Democratic votes in each census tract by an equal amount, each time checking the predicted effect on the statewide vote in favor of Proposition 21. We line-search until the statewide simulated share of yes votes for Proposition 21 reaches 50%. As Figure 4 shows, results are nearly identical across the SAC and OLS models. Both specifications point to the substantial increase in the proportion of votes for the Democratic candidate (as reflecting a shift in political ideology) that would have been necessary for Proposition 21 to pass, holding all else constant.
Figure 4. Simulated Political Ideology Distributions
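The ideology simulation can be sketched in the same spirit. The code below is an illustration under stated assumptions, not the authors' procedure: it interprets the shift as applying to the Democratic share of the two-party (Democratic plus Republican) vote, holds the combined two-party share fixed so that minor-party shares are unchanged, and uses a non-spatial log-odds model in which only the Republican share enters (the Democratic share being the base category). The coefficient `rep_coef`, the field names, and the data are hypothetical.

```python
import math
from typing import Dict, List, Optional

Tract = Dict[str, float]


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def inv_logit(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def shifted_republican_share(dem: float, rep: float, delta: float) -> float:
    """Shift the log-odds of the Democratic two-party share by delta.

    The combined Democratic + Republican share stays fixed, so minor-party
    shares are untouched and only the swing between the two parties moves.
    """
    two_party = dem + rep
    new_dem = two_party * inv_logit(logit(dem / two_party) + delta)
    return two_party - new_dem


def statewide_yes_share(tracts: List[Tract], rep_coef: float, delta: float) -> float:
    """Vote-weighted predicted yes share after the ideological shift."""
    total_votes = sum(t["votes"] for t in tracts)
    yes_votes = 0.0
    for t in tracts:
        rep = shifted_republican_share(t["dem"], t["rep"], delta)
        yes_votes += t["votes"] * inv_logit(t["other_log_odds"] + rep_coef * rep)
    return yes_votes / total_votes


def find_ideology_shift(
    tracts: List[Tract],
    rep_coef: float,
    step: float = 0.01,
    max_steps: int = 1000,
) -> Optional[float]:
    """Smallest common log-odds shift at which the statewide share hits 50%."""
    for i in range(max_steps + 1):
        delta = i * step
        if statewide_yes_share(tracts, rep_coef, delta) >= 0.5:
            return delta
    return None


# Hypothetical tracts; "other_log_odds" collects all non-ideology terms.
EXAMPLE_TRACTS = [
    {"votes": 1000, "dem": 0.45, "rep": 0.50, "other_log_odds": 0.5},
    {"votes": 500, "dem": 0.60, "rep": 0.35, "other_log_odds": 0.6},
]
example_shift = find_ideology_shift(EXAMPLE_TRACTS, rep_coef=-2.0)
```

Shifting in log-odds rather than in raw shares keeps every simulated share strictly between zero and one, which is one plausible reason to define the counterfactual this way.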
Our models also suggest that a ceteris paribus increase in preferences for the environment could have led to passage of Proposition 21. We simulate this increase by holding constant all other factors and increasing our indicator for the strength of environmental preferences (the proportion of hybrid vehicles) by a uniform amount across all census tracts until the predicted statewide vote reaches 50%. The average observed proportion of hybrids across census tracts in our sample is about 0.3%. Assuming unchanged voter turnout, the SAC model predicts that if environmental preferences had been such that the average share of hybrid vehicles had been approximately 3.7% (i.e., still less than the highest share in the data, which was 4.6%), Proposition 21 would have just passed at the statewide level. Inasmuch as the average share of hybrid vehicles is a measure of tract-level environmental preferences, this result suggests that increased environmental preferences could have resulted in the passage of Proposition 21.30
Holdout Samples and Out-of-Sample Predictions
A clearer understanding of the determinants of support for Proposition 21 across California may provide a tool to predict likely support for a similar proposition in other direct-democracy states. It may also be used to predict likely popular support for similar legislation in states that do not practice direct democracy. We thus explore the extent to which our two models might succeed in forecasting the vote for Proposition 21 in other jurisdictions. Ideally, we would gather analogous explanatory variables for census tracts in different states for this task. Given that no other state has conducted such a vote, we instead use holdout samples comprising individual counties in California to assess the predictive ability of our models.
For both the SAC and OLS models, we rotate through each of the counties of California, excluding the holdout county’s census tracts from the estimation process. We save the estimated parameters of the model and use them to predict the share of yes votes in each of the census tracts of the holdout county. Figure 5 summarizes the results of our holdout sample analysis for each county, for both the SAC model and the OLS model. Each point on the scatter plot represents the predicted county-level vote shares from a model that is estimated using census-tract-level data from all California counties other than the county in question. The three most obvious outliers are Monterey and Siskiyou counties (positive outliers with predicted vote shares greater than actual vote shares) and Imperial County (a negative outlier with a predicted vote share less than the actual vote share). This visual representation suggests that both models are relatively successful at predicting the actual vote share in a given county, using point estimates based on all census tracts in California outside that county. While the estimates provided by the SAC model may be more appropriate for inference, it appears that the OLS model still performs relatively well when comparing point predictions across models.31
Figure 5. Holdout Sample Analysis
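The county rotation can be sketched as a leave-one-county-out loop. To stay self-contained, the illustration below swaps in a single-regressor OLS fit on the log-odds of the yes share as a stand-in for the paper's full OLS and SAC specifications. The regressor `x`, the field names, and the aggregation of tract predictions to the county level by vote weights are all assumptions made for the example.

```python
import math
from typing import Dict, List, Tuple


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def inv_logit(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_simple_ols(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Return (intercept, slope) of a one-regressor least-squares fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("regressor has no variation in the training sample")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def leave_one_county_out(tracts: List[Dict]) -> Dict[str, Tuple[float, float]]:
    """For each county, fit on all other counties and predict the held-out one.

    Returns {county: (predicted_share, actual_share)}, both vote-weighted.
    """
    results = {}
    for county in sorted({t["county"] for t in tracts}):
        train = [t for t in tracts if t["county"] != county]
        held_out = [t for t in tracts if t["county"] == county]
        intercept, slope = fit_simple_ols(
            [t["x"] for t in train],
            [logit(t["yes_share"]) for t in train],
        )
        votes = sum(t["votes"] for t in held_out)
        predicted = sum(
            t["votes"] * inv_logit(intercept + slope * t["x"]) for t in held_out
        ) / votes
        actual = sum(t["votes"] * t["yes_share"] for t in held_out) / votes
        results[county] = (predicted, actual)
    return results
```

Plotting the predicted share against the actual share for each county would give a scatter of the kind summarized in Figure 5.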
V. CONCLUSION
The purpose of this paper has been to identify which factors are most important in explaining the substantial variation in support by California voters for Proposition 21, an $18 vehicle registration fee to support state parks. To accomplish this task, we construct a rich set of data that includes a larger suite of economically motivated explanatory variables than has been used previously in the related literature. We conduct simulations and comprehensive holdout sample analyses to answer the three main questions that motivate our study.
1. Could there have been a lower statewide vehicle-license surcharge at which Proposition 21 may have passed?
Nearly 43% of California voters voted in favor of Proposition 21. We explore heterogeneity in support for Proposition 21 by using our model to solve for the highest vehicle surcharge that would have produced 50% support in each census tract. These fitted vehicle surcharges display substantial variation across census tracts. In fact, a majority of census tracts in the state would have voted to pass Proposition 21 at some positive fee less than $18. Interestingly, our model also suggests that, in some census tracts, the proposition may have just passed with vehicle fees as high as $400.
We then explore whether Proposition 21 might have passed, had the statewide fee been less than $18 per vehicle. We use our estimated model to simulate the effects of lower and lower statewide fees per vehicle. The results of the simulation suggest that there is no lower positive statewide fee at which Proposition 21 would have passed, all other things equal.
2. How different would other economic and sociodemographic conditions have needed to be for Proposition 21 to have passed?
Our simulations suggest that a more liberal electorate and stronger preferences for the environment could have been enough to pass Proposition 21, ceteris paribus. The mix of political ideologies represented in each census tract has a large influence on the proportion of votes for Proposition 21. Our simulations suggest that it would have taken a substantial shift toward the ideological left by swing voters to have allowed Proposition 21 to pass at the statewide level, holding everything else constant. Similarly, stronger preferences for the environment, as proxied by the proportion of hybrid vehicles in a tract, might have allowed Proposition 21 to pass statewide. However, our simulations suggest that environmental preferences consistent with a 10-fold greater share of hybrid vehicle ownership would likely have been required to allow Proposition 21 to pass, holding all else constant.
3. Does our empirical model have predictive power?
Our county-level holdout sample analyses indicate good in-state predictive ability, as evidenced by the strong correlation between predicted vote share and the actual vote share. Inasmuch as the voters and conditions elsewhere in the United States have characteristics spanned by the conditions represented across California in November of 2010, it might be possible to use our model to forecast likely support for a similar measure elsewhere.
There are, of course, other potential explanations for why Proposition 21 failed. For example, Li et al. (2011) find that people are less likely to give to government-run charity organizations than to private charities, and that national organizations attract more donations than do local and state organizations. Moreover, they find that people are more willing to be voluntarily taxed when the public good in question is cancer research as opposed to education, national parks and wildlife, or disaster relief. It is possible that many people in California had other plans for charitable contributions on their minds at the time of the election.
There are also many reasons why the nature of state park funding mechanisms is a socially relevant issue. One is the potential distributional consequences of different possible funding mechanisms. There exists weak evidence that Proposition 21 may have been an inferior economic good at the means of the data, given that tracts with lower median incomes may have been, if anything, more likely to vote for the proposition. This could be true if a prolonged recession causes people to anticipate that they may need to substitute away from more expensive out-of-state vacations and perhaps visit state parks instead. Perhaps support for Proposition 21 was greater than it would otherwise have been, due to the concurrent recession. In addition, tracts with larger shares of Asian and African-American populations were less likely to vote in favor of Proposition 21, suggesting that these groups may derive less utility from access to state parks. However, there is evidence that higher-income Latino/Hispanic populations found Proposition 21 relatively more appealing.
Another reason that the mode of state park funding could matter to park users is congestion control. In addition to raising revenue from park visitors, user fees were initially implemented in part to manage congestion by imposing a positive marginal price per visit for park users. Depending upon the price elasticity of demand for visits to state parks, a per visit fee could discourage some types of households from using state parks, whereas the exogenous surcharge on vehicle registration fees would not be expected to have this effect of increasing the cost of park visits.
Finally, Siikamäki (2011) finds that the establishment and existence of state parks generates a substantial amount of nature recreation in the general population, the annual value of which he estimates at $14 billion. Given the growing obesity epidemic in the U.S. population, there may be an argument for positive health-related externalities associated with removing per trip user fees, since fees constitute a disincentive to engage in nature-based outdoor recreation. Given that state parks around the country continue to fall into disrepair because of a lack of funding, this study provides useful information for policy makers and environmental groups in other parts of the country as they try to solve their own state park funding problems.
Acknowledgments
We thank an anonymous reviewer for very helpful comments. In addition, we are grateful to Soren Anderson, Gulcan Cil, Tatyana Deryugina, Laura Grant, Catherine Hausman, Ben Hansen, Grant Jacobsen, Jason Lindo, Andrew Plantinga, John Whitehead, and JunJie Wu. For other helpful comments, we are grateful to participants at the Micro Group Workshops at the University of Oregon, the 2011 Oregon Resource and Environmental Economics Workshop, the 2011 Heartland Environmental and Resource Economics Workshop, the 2011 Southern Economic Association conference, the 2012 (13th) Southern California Occasional Workshop on Environmental and Resource Economics, and the 2013 conference of the Association of Environmental and Resource Economists. Finally, we are indebted to Andrew Cisakowski and Derek Wolfson for invaluable research assistance. Funding for this project was provided in part by the R. F. Mikesell Foundation at the University of Oregon, Alaska EPSCoR NSF award #OIA-1208927, and the state of Alaska.
Footnotes
The authors are, respectively, assistant professor, Department of Economics and Finance, Drake University, Des Moines, Iowa; and professor, Department of Economics, University of Oregon, Eugene.
↵1 One example of a user-based fee system for parks is the Northwest Forest Pass program, under which users purchase an annual pass for unlimited admission to publicly maintained areas in the Pacific Northwest.
↵2 Per capita expenditures are defined as expenditures per member of the population, regardless of park use.
↵3 Statistics concerning the recent history of park funding are summarized in the online supplement, Figures A1, A2, and A3, available at http://le.uwpress.org.
↵4 The full text of Proposition 21 can be found at www.ballotpedia.org/wiki/index.php/Text_of_Proposition_21,_the_State_Parks_and_Wildlife_Conservation_Trust_Fund_Act_(California_2010).
↵5 Gross revenues from the proposition were expected to be $500 million per year. The net figure accounts for the money that would have gone back to the general fund, and also reflects lost user fee revenue.
↵6 Figure A4 in the online supplement (available at http://le.uwpress.org) shows the locations of individual state parks within 12 regions of California; Figure A5 shows the locations of the 70 parks that were scheduled to close. The choice of which parks to close was made based on the following criteria: (1) protect the most significant natural and cultural resources, (2) maintain public access and revenue generation to the greatest extent possible, and (3) protect closed parks so that they remain attractive and usable for potential partners. Voters were not informed in voter pamphlets about which specific parks were scheduled to close.
↵7 Fischel (1979) uses a separate survey of voters to test different determinants of voting behavior in a referendum at the local level on a proposed paper and pulp mill in New Hampshire.
↵8 A recent working paper by Burkhardt and Chan (2015) expands this type of analysis to consider an array of California ballot propositions between 2006 and 2010 but excludes the 2010 state park funding proposition we analyze in this study.
↵10 See www.census.gov.
↵11 See www.ihs.com/btp/polk.html.
↵12 The classification criteria for hybrid vehicles are drawn primarily from online sources, including (1) www.flhsmv.gov/dmv/ilev-hybrid-vehicle-list.pdf; (2) www.hybridcars.com/hybrid-cars-list; (3) www.kbb.com/hybrid/?vehicleclass=newcar&intent=buynew&r=664452610111215600; (4) www.edmunds.com/finder/; and (5) www.carfax.com/.
↵13 See www.lcv.org.
↵14 See www.calands.org.
↵15 See www.dfg.ca.gov.
↵16 See www.mapcruzin.com/free-california-gis-shapefile.htm.
↵18 We dropped census tracts having more than four registered vehicles per household, on average, based on the Polk data. These tracts seem likely to be low-population areas with new/used car lots.
↵19 See Deacon and Shapiro (1975); Kline and Wichelns (1994); Kahn and Matsusaka (1997); Kotchen and Powers (2006); and Wu and Cutter (2011).
↵20 We ignore the possible income effect of the $18 fee for two reasons. First, this amount is unlikely to comprise a large proportion of a household’s income. Second, we are focused on group inference, so interpretations involving individual income effects are inappropriate in this instance.
↵21 The Democratic Party candidate’s share is designated as the base share since the Democratic candidate won.
↵22 In contrast, Burkhardt and Chan, in their 2015 working paper, use a similar specification to consider a variety of California ballot propositions and do seek to interpret their results as conveying willingness-to-pay information for the public goods involved.
↵23 The large difference in the log-likelihoods can be explained by the different assumptions made in estimation. We clustered standard errors at the county level in the OLS specification, which means that we assume some common error structure that varies independently at the county level. In contrast, the SAC specification allows both the dependent variable and the error term for any given census tract to be correlated with those for neighboring census tracts.
↵24 The marginal effect of price is –0.00238; it is statistically significant, with a standard error of 0.000787. A referee noted that household size is also likely to be positively correlated with the number of vehicles owned by a household. To address the possibility that household size alone is implicitly driving the size of the estimated price coefficient, we have also estimated our preferred specification with average household size included as a separate explanatory variable. The estimated coefficient on the added household size variable was significant and negative. Fortunately, the estimated coefficient on the gross price variable remains significant and negative, with a less than 1% change in its magnitude.
↵25 The full model exhibits a log-likelihood of 3,229, while the constrained model using the price variables alone yields a log-likelihood of only 1,600.
↵26 Of course, this may reflect endogenous residential location decisions by households who prefer easier access to state parks. We have also estimated models with state park area within 20 km and 100 km buffers. The 50 km measure had more explanatory power than either of the other buffer areas. Assuming a straight-line distance, 50 km represents about one hour of travel time at 31 mph, or half an hour of travel time at 62 mph.
↵27 Results appear in online supplement Table A1 (available at http://le.uwpress.org). We conduct the test of parameter equivalence across pairs of propositions using seemingly unrelated regressions with tests of cross-equation parameter restrictions. An analogous test in the context of the SAC model appears to be prohibitively difficult. We rely on the similarities between the SAC and OLS models, as well as the overwhelming rejection of parameter equivalence, to infer that the qualitative results would be the same if we could include spatial autoregression and error correlations in these pairwise comparisons across propositions.
↵28 In the body of this paper, we focus on the results based on the SAC model.
↵29 The simulation is a linear extrapolation and an out-of-sample prediction. Evidence of a discontinuous jump in demand at p = $0 exists in the development literature (e.g., Cohen and Dupas 2010), and we have no reason to expect it would be different here. Additionally, in the actual data, the cost to the household of Proposition 21 was constant, per vehicle, and we rely on variations in average vehicles per household to pin down the gross price effect.
↵30 The OLS model predicts that if environmental preferences were such that the share of hybrids was 3%, Proposition 21 might have passed, ceteris paribus.
↵31 The predicted vote shares represent the medians of the fitted conditional distribution of these shares, not the means (expected values). Both the SAC and OLS models are fitted in terms of the log-odds of the dependent variable, and this is a nonlinear transformation of the raw data on vote shares. Calculation of the predicted expected shares for each census tract in the holdout sample would involve using the fitted variance of the conditional distribution, which will depend upon the variance-covariance matrix for the parameter estimates and potentially even the error variance for the model. Interval estimates for the SAC model are especially difficult to calculate correctly, given the spatial dynamics in the model. Thus we opt to depict the median fitted values for both the SAC and OLS models as a qualitative summary of the abilities of out-of-sample predictions from these two models to track differences in the actual vote shares. The slight underprediction of the lowest shares, and slight overprediction of the highest shares, is an artifact of the use of medians rather than means for the predicted shares.