Article

Abstract

Pooled discrete choice models combine revealed preference (RP) data and stated preference (SP) data to exploit advantages of each. SP data is often treated with suspicion because consumers may respond differently in a hypothetical survey context than they do in the marketplace. However, models built on RP data can suffer from endogeneity bias when attributes that drive consumer choices are unobserved by the modeler and correlated with observed variables. Using a synthetic data experiment, we test the performance of pooled RP–SP models in recovering the preference parameters that generated the market data under conditions that choice modelers are likely to face, including (1) when there is potential for endogeneity problems in the RP data, such as omitted variable bias, and (2) when consumer willingness to pay for attributes may differ from the survey context to the market context. We identify situations where pooling RP and SP data does and does not mitigate each data source’s respective weaknesses. We also show that the likelihood ratio test, which has been widely used to determine whether pooling is statistically justifiable, (1) can fail to identify the case where SP context preference differences and RP endogeneity bias shift the parameter estimates of both models in the same direction and magnitude and (2) is unreliable when the product attributes are fixed within a small number of choice sets, which is typical of automotive RP data. Our findings offer new insights into when pooling data sources may or may not be advisable for accurately estimating market preference parameters, including consideration of the conditions and context under which the data were generated as well as the relative balance of information between data sources.
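The likelihood ratio test discussed in the abstract compares the pooled (restricted) model against the two separately estimated models. A minimal sketch of the test computation follows; the log-likelihood values and degrees of freedom are made-up illustrative inputs, and the closed-form chi-square tail shown applies only to even degrees of freedom:

```python
import math

def chi2_sf(x, df):
    """Chi-square survival function (closed form valid for even df only)."""
    assert df % 2 == 0, "closed form shown only for even df"
    t = x / 2.0
    return math.exp(-t) * sum(t**i / math.factorial(i) for i in range(df // 2))

def pooling_lr_test(ll_rp, ll_sp, ll_pooled, df):
    """LR test of H0: common utility parameters are equal across RP and SP.

    The unrestricted model fits each data source separately; the restricted
    (pooled) model constrains the common parameters to be equal, so its
    log-likelihood can never exceed the sum of the separate log-likelihoods.
    df is the number of restrictions imposed by pooling.
    """
    lr = 2.0 * ((ll_rp + ll_sp) - ll_pooled)
    return lr, chi2_sf(lr, df)

# Illustrative (made-up) log-likelihoods with 4 common parameters restricted:
lr, p = pooling_lr_test(ll_rp=-410.2, ll_sp=-388.7, ll_pooled=-805.4, df=4)
# lr = 13.0, p ≈ 0.011: pooling would be rejected at the 5% level here
```

A small p-value rejects parameter equality, i.e. it suggests pooling is not statistically justifiable; the abstract's point is that this test can fail when SP preference differences and RP endogeneity bias shift both models in the same direction.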




... The disadvantage is that such choices were made in the past (from 2010 onwards); hence, some of them might neither reflect actual market conditions nor fully incorporate current expectations regarding future developments of the car market. A potential solution is to pool the two datasets to exploit the advantages of each (Helveston, Feit, & Michalek, 2018; Guzman, Arellana, Cantillo-García, & Ortúzar, 2021). ...
... The null hypothesis is that the common utility parameters are equal. Helveston et al. (2018) performed a theoretical analysis testing their conclusions with a synthetic database. They acknowledged that pooling has the potential to improve the model by adding additional information about the parameters, reducing multicollinearity, and allowing the incorporation of attributes that do not appear in the market. ...
... Regarding the parameter interactions with the ASC BEV and the socio-economics, we had no specific a priori expectations; hence, these could be either data-specific or jointly estimated. As for the vehicle attribute coefficients, the literature (Brownstone et al., 2000; Helveston et al., 2018) suggests taking advantage of the data enrichment property of joint RP/SP estimation. ...
Article
Norway is the leading country in electric car adoption in the world, while in Italy electric cars are only recently gaining acceptance. We compared car choices in the two countries, highlighting commonalities and differences in the choice determinants and distinguishing between the small and the large car segments. We analyzed actual choices made under real-world conditions and stated choices under hypothetical scenarios. The comparison between the preference structures of the two countries shows important differences when the revealed preference dataset is analyzed, while the differences are much reduced with the stated preference dataset. All in all, we feel that the two countries present only differences associated with the longer car driving habits of Norwegian drivers, the higher percentage of large cars in Norway, and the more developed public charging infrastructure. Since the supply of cars is quite similar, this consideration leads us to believe that the huge discrepancy in electric car uptake is mainly due to the different car policies adopted in the two countries. The evolution of the policy setting and of the technology will determine whether Italy will follow the Norwegian model of gradual BEV uptake.
... The typical procedure to obtain these estimates is to divide the estimated parameters of a "preference space" utility model by the negative of the price parameter. Despite this common practice, it can yield unreasonable distributions of WTP across the population in heterogeneous random parameter (or "mixed logit") models (Train and Weeks 2005; Sonnier, Ainslie, and Otter 2007; Helveston, Feit, and Michalek 2018). For example, if the parameters for the price attribute and another non-price attribute are both assumed to be normally distributed across the population, then the resulting WTP estimate follows a Cauchy distribution, implying that WTP has an infinite variance across the population. ...
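The heavy-tailed WTP distribution described in this excerpt is easy to reproduce by simulation; the means and variances below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Both coefficients assumed normally distributed across the population
beta = rng.normal(1.0, 0.5, n)     # non-price attribute coefficient
alpha = rng.normal(-1.0, 0.5, n)   # price coefficient (its support crosses zero)

# Post-hoc WTP: divide by the negative of the price parameter
wtp = beta / -alpha

# The ratio of two normals has no finite moments: the handful of draws with
# alpha near zero produce extreme WTP values relative to the bulk
tail_ratio = np.abs(wtp).max() / np.median(np.abs(wtp))
```

With these settings the largest simulated WTP is orders of magnitude above the median, which is the practical symptom of the infinite-variance (Cauchy-like) result described above.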
... Adopting the same notation as in Helveston et al. (2018), consider the following utility model: ...
... This example illustrates the sensitivity of the WTP distribution to modeling choices made in preference space utility models. In general, WTP space models yield more reasonable estimates of WTP distributions across the population, a consistent finding across multiple prior studies (Train and Weeks 2005; Sonnier et al. 2007; Das, Anderson, and Swallow 2009; Helveston et al. 2018). ...
Article
This paper introduces the logitr R package for fast maximum likelihood estimation of multinomial logit and mixed logit models with unobserved heterogeneity across individuals, which is modeled by allowing parameters to vary randomly over individuals according to a chosen distribution. The package is faster than other similar packages such as mlogit, gmnl, mixl, and apollo, and it supports utility models specified with “preference space” or “willingness-to-pay (WTP) space” parameterizations, allowing for the direct estimation of marginal WTP. The typical procedure of computing WTP post-estimation using a preference space model can lead to unreasonable distributions of WTP across the population in mixed logit models. The paper provides a discussion of some of the implications of each utility parameterization for WTP estimates. It also highlights some of the design features that enable logitr's performant estimation speed and includes a benchmarking exercise with similar packages. Finally, the paper highlights additional features that are designed specifically for WTP space models, including a consistent user interface for specifying models in either space and a parallelized multi-start optimization loop, which is particularly useful for searching the solution space for different local minima when estimating models with non-convex log-likelihood functions.
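The relationship between the two parameterizations the package supports can be sketched as a simple algebraic reparameterization (the function and symbol names here are mine, not the package's API):

```python
import numpy as np

def to_wtp_space(beta, alpha):
    """Reparameterize a preference-space model u = beta'x + alpha*p + e
    into WTP space u = lam * (omega'x - p) + e, where lam = -alpha acts as
    a price/scale parameter and omega = beta / lam are marginal WTPs."""
    lam = -alpha
    return lam, beta / lam

beta = np.array([0.8, -0.4])   # illustrative preference-space coefficients
alpha = -0.2                   # price coefficient
lam, omega = to_wtp_space(beta, alpha)   # lam = 0.2, omega = [4.0, -2.0]

# Both parameterizations imply identical utilities for any attributes x, price p:
x, p = np.array([1.0, 2.0]), 3.0
u_pref = beta @ x + alpha * p
u_wtp = lam * (omega @ x - p)
```

The point of estimating in WTP space directly (rather than computing omega after the fact) is that distributional assumptions are then placed on the WTPs themselves rather than on a ratio of random coefficients.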
... The typical procedure to obtain these estimates is to divide the estimated parameters of a "preference space" utility model by the negative of the price parameter. Despite this common practice, it can yield unreasonable distributions of WTP across the population in heterogeneous random parameter (or "mixed logit") models (Train and Weeks 2005; Sonnier, Ainslie, and Otter 2007; Helveston, Feit, and Michalek 2018). For example, if the parameters for the price attribute and another non-price attribute are both assumed to be normally distributed across the population, then the resulting WTP estimate follows a Cauchy distribution, implying that WTP has an infinite variance across the population. ...
... Adopting the same notation as in Helveston et al. (2018), consider the following utility model: $u_j^* = \beta^* x_j + \alpha^* p_j + \varepsilon_j^*$, with $\varepsilon_j^* \sim \text{Gumbel}\left(0,\ \sigma^2 \tfrac{\pi^2}{6}\right)$ (1) ...
... This example illustrates the sensitivity of the WTP distribution to modeling choices made in preference space utility models. In general, WTP space models yield more reasonable estimates of WTP distributions across the population, a consistent finding across multiple prior studies (Train and Weeks 2005; Sonnier et al. 2007; Das, Anderson, and Swallow 2009; Helveston et al. 2018). ...
Preprint
This paper introduces the logitr R package for fast maximum likelihood estimation of multinomial logit and mixed logit models with unobserved heterogeneity across individuals, which is modeled by allowing parameters to vary randomly over individuals according to a chosen distribution. The package is faster than other similar packages such as mlogit, gmnl, mixl, and apollo, and it supports utility models specified with "preference space" or "willingness to pay (WTP) space" parameterizations, allowing for the direct estimation of marginal WTP. The typical procedure of computing WTP post-estimation using a preference space model can lead to unreasonable distributions of WTP across the population in mixed logit models. The paper provides a discussion of some of the implications of each utility parameterization for WTP estimates. It also highlights some of the design features that enable logitr's performant estimation speed and includes a benchmarking exercise with similar packages. Finally, the paper highlights additional features that are designed specifically for WTP space models, including a consistent user interface for specifying models in either space and a parallelized multi-start optimization loop, which is particularly useful for searching the solution space for different local minima when estimating models with non-convex log-likelihood functions.
... We instead specified a "willingness-to-pay" (WTP) space utility model in which coefficient estimates have units of dollars and represent the valuation of marginal changes in attribute values. This has several advantages, in particular the ability to directly interpret the coefficients independently of one another and across different models; in contrast, utility coefficients must be interpreted relative to one another within each model, as each model could have a different error scaling (53, 54). The general WTP space utility model is shown in Equation 2, ...
... where p_j is price, λ is a scale parameter, x_j is all nonprice attributes, and ν is a vector of WTP coefficients for the nonprice attributes. For MXL models, directly estimating WTP provides greater control over how WTP is assumed to be distributed across the population and has been found to yield more reasonable distributions of WTP compared with WTP computed from preference-space model coefficients (53-55). Equation 3 shows the full model used in the study, with explanations of the variable names in Table 3: ...
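The general WTP-space utility model this excerpt describes (a price, a scale parameter, nonprice attributes, and a vector of WTP coefficients) presumably takes the standard Train-and-Weeks form, sketched here with assumed symbols λ for the scale and ν for the WTP vector:

```latex
u_j = \lambda \left( \nu' x_j - p_j \right) + \varepsilon_j
```

Because price enters with a unit coefficient inside the parentheses, each element of ν is directly interpretable in price units as a marginal willingness to pay.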
Article
Automated vehicles (AVs) have the potential to dramatically disrupt current transportation patterns and practices. One particular area of concern is AVs' impacts on public transit systems. If vehicle automation enables significant price decreases or performance improvements for ride-hailing services, some fear that it could undercut public transit, which could have significant implications for the environment and transportation equity. The extent to which individuals adopt automated transportation modes will drive many system-level outcomes, and research on public preferences for AVs is immature and inconclusive. In this study, we used responses from an online choice-based conjoint survey fielded in the Washington, D.C. metropolitan region (N = 1,694) in October 2021 to estimate discrete choice models of public preferences for different automated (ride-hailing, shared ride-hailing, bus) and nonautomated (ride-hailing, shared ride-hailing, bus, rail) modes. We used the estimated models to simulate future marketplace competition across a range of trip scenarios. Respondents on average were only willing to pay a premium for automated modes when a vehicle attendant was also present, limiting the potential cost-savings that AV operators might achieve by removing the driver. Scenario analysis additionally revealed that for trips where good transit options were available, transit remained competitive with automated ride-hailing modes. These results suggest that fears of a mass transition away from transit to AVs may be limited by people's willingness to use AVs, at least in the short term. Future AV operators should also recognize the presence of an AV attendant as a critical feature for early AV adoption.
... where λ is the coefficient for the incentive amount a_j, and β is a vector of coefficients for all other attributes x_j. To make the results more easily interpretable, we specify the utility model in the 'willingness-to-pay' (WTP) space [19, 20] such that estimated model coefficients represent the marginal WTP (or valuation in the context of this study) for marginal changes in each attribute: ...
... low-income buyers (3), new vs. used vehicle buyers (4), and high vs. low budgets (5). All models are estimated in the 'Willingness to Pay' (WTP) space such that coefficients reflect preference values in dollars [19,20]. Table 2 shows the estimated coefficients from each model, which are in units of thousands of dollars. Since each respondent answered ten choice questions, the final dataset includes 21 700 choice observations from sets of four incentive types: sales tax exemption, tax credit, tax deduction, and rebate. ...
Article
Financial incentives, such as purchase subsidies, have been found to increase plug-in electric vehicle (PEV) adoption, but how these incentives are designed can impact their effectiveness as well as how equitably they are distributed. Using a national conjoint survey (N = 2,170 respondents), we quantify how U.S. vehicle buyers value different features of PEV financial incentives. Participants overwhelmingly prefer immediate rebates, on average valuing them by $580, $1,450, and $2,630 more than sales tax exemptions, tax credits, or tax deductions, respectively. These effects are significantly larger for lower-income households, used vehicle buyers, and buyers with lower budgets. We estimate that on average $2 billion could have been saved if the federal subsidy available between 2011 and 2019 were delivered as an immediate rebate instead of a tax credit, or $1,440 per PEV sold. Our results suggest that structuring incentives as immediate rebates would deliver a greater value to customers and would be more equitably distributed compared to the current tax credit scheme.
... This infer-and-plug-in strategy can be implemented in a number of research fields where combining stated and revealed preferences is common practice, for instance transportation (Helveston et al., 2018), health (Lambooij et al., 2015; Mark and Swait, 2004), and hedonic applications (Phaneuf et al., 2013). We also believe that the potential of this strategy goes beyond those fields. ...
Preprint
The travel cost (TC) method models the number of trips to a recreation site as a function of the costs to reach that site. The single-site TC equation is particularly vulnerable to endogeneity since travel costs are chosen by the visitor. This paper suggests a control function approach that breaks the correlation between travel costs and the error term by plugging inferred omitted variables into the TC equation. Inference of omitted variables is carried out on an endogeneity-free, stated preference equation that, arguably, shares omitted variables with the TC equation. By revisiting the TC and contingent valuation (CV) data analyzed by Fix and Loomis (1998), this paper infers the omitted variables from the CV equation via a finite mixture specification, an inference strategy whose justification resembles the use of heteroscedastic errors to construct instruments as suggested by Lewbel (2012). Results show that not controlling for endogeneity in this particular case produces an overestimation of welfare measures. Importantly, this infer-and-plug-in strategy is pursuable in a number of contexts beyond recreation demand applications.
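The control-function logic in this abstract can be illustrated with a generic two-stage sketch. Here an exogenous shifter z stands in for the paper's SP-based inference of the omitted variable, and all numbers are made up; the point is only that plugging the first-stage residual into the outcome equation removes the omitted-variable bias:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000
b_true = 2.0

z = rng.normal(size=n)            # exogenous shifter of travel cost
v = rng.normal(size=n)            # unobserved component of travel cost
x = 1.0 + 0.8 * z + v             # travel cost, endogenous through v
e = rng.normal(size=n)
y = b_true * x + 1.5 * v + e      # trips: v is the omitted variable

# Naive OLS: biased upward because x and the error both contain v
b_ols = np.polyfit(x, y, 1)[0]

# Control function: first stage recovers the residual v_hat, then plug it in
pi = np.polyfit(z, x, 1)
v_hat = x - np.polyval(pi, z)
X = np.column_stack([np.ones(n), x, v_hat])
b_cf = np.linalg.lstsq(X, y, rcond=None)[0][1]
```

With these settings the OLS slope is badly biased (roughly 1.5/var(x) above the truth), while the control-function estimate lands near the true value of 2.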
... The most popular method for DCE design is to utilize both RP and SP choice data [22]. In this research, the status quo option shows the average value of each attribute/characteristic in the real world. ...
Article
Local community acceptance, a key indicator of the socio-political risk of a project, is addressed through good stakeholder (community) engagement. Discrete choice modeling (DCM) enhances stakeholder analysis and has been widely applied to encourage community engagement in energy projects. However, very little detail is provided on how researchers design discrete choice experiments (DCEs). DCE design is the key step for effective and efficient data collection; without it, the discrete choice model may not be meaningful and may be misleading in the local community engagement effort. This paper presents a framework for mining community engagement DCE design that aims to determine (1) how to identify the optimum number of factors and (2) how to design and validate the DCE. Case studies designing discrete choice experiments for community acceptance of mining projects are used to address these two objectives. The results indicate that the four-factor design, which seeks to reduce cognitive burden and costs, is the optimal choice. A survey was used to examine the difficulty of the survey questions and the clarity of the instructions for the designs, showing that the DCE design imposes a reasonable cognitive burden. The results of this study will contribute to better design of choice experiments (surveys) for discrete choice modeling, leading to better policies for sustainable energy resource development.
... RP surveys investigate choice behaviour in actual scenarios. Although this method can be applied to observe realistic choice behaviour, it can only provide information about existing and past scenarios [22]. SP surveys overcome this problem by using a set of hypothetical situations to control the scenarios or demographic characteristics of the participants [23]. ...
Article
Understanding human decision making during emergency evacuations is important because decision making is crucial at each stage of the evacuation (e.g. route choice, exit choice, path finding). Previous studies have examined the influence of social interaction and environmental factors on exit choice. However, research on decision-making behaviour and risk attitudes towards such selection at the individual level is still limited. To fill this knowledge gap, we designed a series of virtual evacuation scenarios to examine exit choice behaviour and decision-making attitudes in uncertain-risk scenarios under different smoke conditions. Data collection was implemented via an online stated preference survey. The results revealed a systematic preference pattern in exit choice regardless of the decisions made by experienced evacuees. The weighted uncertainty risk for individuals was determined by considering the smoke height and smoke appearance frequency. The weighting function between the subjective and objective entities exhibited an S shape. In addition, the function was estimated using an empirical equation. A decision maker's attitude towards uncertain risk in evacuation scenarios was observed to be a rank- and reference-dependent preference rather than being fully rational. The results of the experiments conducted in the virtual environment agreed well with cumulative prospect theory. This research demonstrates the feasibility of using virtual environments for evacuation experiments, which can entirely avoid the risk of a stampede while still capturing individual decision-making behaviour.
... Combining hypothetical choice data and RP data to improve model fit and estimation accuracy is a common practice in choice modelling. Particularly in transport and travel behaviour studies this method and its various theoretical implications have been heavily discussed in a broad range of contexts and applications, including commuter valuations of travel time savings and reliability, preferences for transport modes or route choice (Abildtrup et al., 2015; Ben-Akiva et al., 1994; Ortúzar, 2002, 2006; Duann and Shiaw, 2001; Fifer et al., 2011; Haghani and Sarvi, 2016c, 2017; Haghani et al., 2016; Helveston et al., 2018; Hensher et al., 2008; Lavasani et al., 2017; Morikawa, 1994; Polydoropoulou and Ben-Akiva, 2001; Train and Wilson, 2008; van Essen et al., 2020; Wardman, 1988). This approach, however, has not been conventionally regarded as a mitigation method for HB. ...
Article
This paper follows the review of empirical evidence on the existence of hypothetical bias (HB) in choice experiments (CEs) presented in Part I of this study. It observes how the variation in operational definitions of HB has prohibited consistent measurement of HB in CE. It offers a unifying definition of HB and presents an integrative framework of how HB relates to but is also distinct from external validity (EV), with HB representing one component of the wider concept of EV. The paper further identifies major sources of HB and discusses explanations as well as possible moderating factors of HB. The paper reviews methods of HB mitigation identified in the literature and the empirical evidence of their effectiveness. The review includes both ex-ante and ex-post bias mitigation methods. Ex-ante bias mitigation methods include cheap talk, real talk, consequentiality scripts, solemn oath scripts, opt-out reminders, budget reminders, honesty priming, induced truth telling, indirect questioning, time to think and pivot designs. Ex-post methods include follow-up certainty calibration scales, respondent perceived consequentiality scales, and revealed-preference-assisted estimation. It is observed that the mitigation methods and their preferred use vary markedly across different sectors of applied economics. The existing empirical evidence points to the overall effectiveness of mitigation strategies in reducing HB, although there is some variation. The paper further discusses how each mitigation method can counter a certain subset of HB sources. Considering the prevalence of HB in CEs and the effectiveness of bias mitigation methods, it is recommended that implementation of at least one bias mitigation method (or a suitable combination where possible) becomes standard practice in conducting CEs to ensure that inferences and subsequent policy decisions are as much as possible free of HB.
... For example, discrete choice studies that had made use of both stated and revealed choices were included only if they reported explicit comparisons or at least had separate but comparable model estimates across the two data types. Those that exclusively focused on combining stated and revealed choices as a way of data enrichment, without explicitly testing for HB, did not meet the inclusion criteria (e.g., Helveston et al. (2018)). A similar approach was taken with respect to studies of economic choice where incentive-compatible experimental methods were adopted but no comparison with a pure hypothetical treatment was reported (e.g., Yang et al. (2018), Gilmour et al. (2019)). ...
Preprint
The notion of hypothetical bias (HB) constitutes, arguably, the most fundamental issue in relation to the use of hypothetical survey methods. Whether or to what extent choices of survey participants and subsequent inferred estimates translate to real-world settings continues to be debated. While HB has been extensively studied in the broader context of contingent valuation, it is much less understood in relation to choice experiments (CE). This paper reviews the empirical evidence for HB in CE in various fields of applied economics and presents an integrative framework for how HB relates to external validity. Results suggest mixed evidence on the prevalence, extent and direction of HB as well as considerable context and measurement dependency. While HB is found to be an undeniable issue when conducting CEs, the empirical evidence on HB does not render CEs unable to represent real-world preferences. While health-related choice experiments often find negligible degrees of HB, experiments in consumer behaviour and transport domains suggest that significant degrees of HB are ubiquitous. Assessments of bias in environmental valuation studies provide mixed evidence. Also, across these disciplines many studies display HB in their total willingness to pay estimates and opt-in rates but not in their hypothetical marginal rates of substitution (subject to scale correction). Further, recent findings in psychology and brain imaging studies suggest neurocognitive mechanisms underlying HB that may explain some of the discrepancies and unexpected findings in the mainstream CE literature. The review also observes how the variety of operational definitions of HB prohibits consistent measurement of HB in CE. The paper further identifies major sources of HB and possible moderating factors. Finally, it explains how HB represents one component of the wider concept of external validity.
... Combining hypothetical choice data and RP data to improve model fit and estimation accuracy is a common practice in choice modelling. Particularly in transport and travel behaviour studies this method and its various theoretical implications have been heavily discussed in a broad range of contexts and applications, including commuter valuations of travel time savings and reliability, preferences for transport modes or route choice (Abildtrup et al., 2015; Ben-Akiva et al., 1994; Ortúzar, 2002, 2006; Duann and Shiaw, 2001; Fifer et al., 2011; Haghani and Sarvi, 2017, 2019; Helveston et al., 2018; Hensher et al., 2008; Lavasani et al., 2017; Morikawa, 1994; Polydoropoulou and Ben-Akiva, 2001; Train and Wilson, 2008; van Essen et al., 2020; Wardman, 1988). This approach, however, has not been conventionally regarded as a mitigation method for HB. ...
Preprint
This paper reviews methods of hypothetical bias (HB) mitigation in choice experiments (CEs). It presents a bibliometric analysis and summary of empirical evidence of their effectiveness. The paper follows the review of empirical evidence on the existence of HB presented in Part I of this study. While the number of CE studies has rapidly increased since 2010, the critical issue of HB has been studied in only a small fraction of CE studies. The present review includes both ex-ante and ex-post bias mitigation methods. Ex-ante bias mitigation methods include cheap talk, real talk, consequentiality scripts, solemn oath scripts, opt-out reminders, budget reminders, honesty priming, induced truth telling, indirect questioning, time to think and pivot designs. Ex-post methods include follow-up certainty calibration scales, respondent perceived consequentiality scales, and revealed-preference-assisted estimation. It is observed that the use of mitigation methods markedly varies across different sectors of applied economics. The existing empirical evidence points to their overall effectiveness in reducing HB, although there is some variation. The paper further discusses how each mitigation method can counter a certain subset of HB sources. Considering the prevalence of HB in CEs and the effectiveness of bias mitigation methods, it is recommended that implementation of at least one bias mitigation method (or a suitable combination where possible) becomes standard practice in conducting CEs. Mitigation method(s) suited to the particular application should be implemented to ensure that inferences and subsequent policy decisions are as free of HB as possible.
... For example, identification and efficiency issues arise because of the lack of variability (due to spatial aggregation) (Hewko, Smoyer-Tomic, & Hodgson, 2002; Nilsson, 2017), multicollinearity, and the potential endogeneity of amenities (due to omitted variables, self-sorting, and individuals' endogenous preferences for amenities in their current city) when eliciting individuals' WTP using revealed preference data (or observed behavior) (Bishop & Timmins, 2019; Diamond, 2016; Helveston, Feit, & Michalek, 2018; Sheppard, 1999). Based on the results from the choice experiment, we are able to infer the monetary values associated with each hypothetical city amenity by computing a person's expected compensation (or willingness to pay) for marginal changes. ...
Article
Eliciting willingness to pay (WTP) for city amenities is not an easy task due to both endogeneity problems and unobserved heterogeneity in individuals' preferences. We address these two issues by using a city choice experiment to infer monetary values (WTP) associated with relocation attributes of a hypothetical job offer. Adopting a latent class logit (LCL) approach allows us to explore underlying unobserved preference heterogeneity. Benchmark results, without accounting for heterogeneity, suggest that commuting time, crime, and access to entertainment are very important. However, our LCL estimates show that focusing just on average WTP obscures the fact that preferences for amenities vary across individuals.
... This nested logit method can also be seen as a pooled estimation with heteroscedasticity across RP and SP [36, 25]. Cross-entropy loss is the same as negative log-likelihood, so minimizing the cross-entropy loss is the same as maximizing the log-likelihood. ...
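The equivalence noted in this excerpt is direct to verify: for a one-hot target, the cross-entropy sum collapses to the negative log-probability of the chosen alternative. A minimal sketch with illustrative choice probabilities:

```python
import math

def neg_log_likelihood(probs, chosen):
    """Negative log-likelihood of the chosen alternative."""
    return -math.log(probs[chosen])

def cross_entropy(probs, chosen):
    """Cross-entropy between a one-hot target at `chosen` and `probs`:
    -sum_k 1[k == chosen] * log(probs[k])."""
    return -sum((k == chosen) * math.log(p) for k, p in enumerate(probs))

probs = [0.2, 0.5, 0.3]   # illustrative choice probabilities
ce = cross_entropy(probs, 1)
nll = neg_log_likelihood(probs, 1)
```

Since only the chosen alternative's indicator is nonzero, both quantities equal -log(0.5) here, which is why minimizing cross-entropy and maximizing log-likelihood select the same parameters.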
Article
It is an enduring question how to combine revealed preference (RP) and stated preference (SP) data to analyze individual choices. While the nested logit (NL) model is the classical way to address the question, this study presents multitask learning deep neural networks (MTLDNNs) as an alternative framework, and discusses its theoretical foundation, empirical performance, and behavioral intuition. We first demonstrate that the MTLDNNs are theoretically more general than the NL models because of MTLDNNs' automatic feature learning, flexible regularizations, and diverse architectures. By analyzing the adoption of autonomous vehicles (AVs), we illustrate that the MTLDNNs outperform the NL models in terms of prediction accuracy but underperform in terms of cross-entropy losses. To interpret the MTLDNNs, we compute the elasticities and visualize the relationship between choice probabilities and input variables. The MTLDNNs reveal that AVs mainly substitute driving and ride hailing, and that the variables specific to AVs are more important than the socioeconomic variables in determining AV adoption. Overall, this work demonstrates that MTLDNNs are theoretically appealing in leveraging the information shared by RP and SP and capable of revealing meaningful behavioral patterns, although its performance gain over the classical NL model is still limited. To improve upon this work, future studies can investigate the inconsistency between prediction accuracy and cross-entropy losses, novel MTLDNN architectures, regularization design for the RP-SP question, MTLDNN applications to other choice scenarios, and deeper theoretical connections between choice models and the MTLDNN framework.
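The shared-layer-plus-task-heads idea described in this abstract can be sketched in miniature: a forward pass only, with random untrained weights and illustrative layer sizes (real MTLDNNs add depth, regularization, and training on both data sources):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# One shared layer feeds two task-specific output heads: RP choices over
# 3 alternatives and SP choices over 4 (e.g. an added hypothetical AV mode)
d_in, d_hidden, n_alt_rp, n_alt_sp = 6, 8, 3, 4
W_shared = rng.normal(scale=0.1, size=(d_in, d_hidden))
W_rp = rng.normal(scale=0.1, size=(d_hidden, n_alt_rp))
W_sp = rng.normal(scale=0.1, size=(d_hidden, n_alt_sp))

def forward(x, head):
    h = np.tanh(x @ W_shared)             # representation shared by both tasks
    W_out = W_rp if head == "rp" else W_sp
    return softmax(h @ W_out)             # task-specific choice probabilities

x = rng.normal(size=(5, d_in))            # 5 illustrative observations
p_rp, p_sp = forward(x, "rp"), forward(x, "sp")
```

The shared weights are what let information in the RP data constrain the SP task and vice versa, which is the multitask analogue of pooled RP-SP estimation.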
... For each model, price and vehicle attributes are the sales-weighted average of all model observations. Due to the high degree of correlation between various vehicle attributes in revealed preference data, choice models estimated using utility functions that are linear in attributes often yield estimated parameters with unexpected signs, even after accounting for unobserved product characteristics (Allcott and Wozny, 2014; Berry et al., 1995; Brownstone et al., 2000; Earnhart, 2002; Haaf et al., 2016; Helfand and Wolverton, 2011; Helveston et al., 2018; Mabit, 2014). To address the unexpected-sign issue, we utilize an approach based on multi-criteria decision-making (MCDM) (Dyer et al., 1992; Triantaphyllou, 2000; Wallenius et al., 2008) within a random utility maximization (RUM) framework. ...
Article
Full-text available
Subsidies for promoting plug-in electric vehicle (PEV) adoption are a key component of China's overall plan for reducing local air pollution and greenhouse gas emissions from the light-duty vehicle sector. In this paper, we explore the impact and cost-effectiveness of the Chinese PEV subsidy program. In particular, a vehicle choice model is estimated using a large random sample of individual level, model year 2017 Chinese new vehicle purchases. The choice model is then used to predict PEV market share under alternative policies. Simulation results suggest that the 2.5% PEV market share of Chinese new vehicle sales in 2017 resulted in China's new vehicle fleet fuel economy improving by roughly 2%, reducing total gasoline consumption by 6.66 billion liters. However, the current PEV subsidy in China is expensive, costing $1.90 per additional liter of gasoline saved. This is due to a large number of non-additional PEV buyers, particularly high income consumers, who would have purchased the PEV regardless of the subsidy. Eliminating the subsidy for high income consumers and increasing it for low income consumers could result in a substantially lower cost per additional PEV ($13,758 versus $24,506). This would allow for greater PEV adoption (3.11% versus 2.47% market share) for the same budget. In terms of the impact of the recently announced subsidy reduction, results suggest that the PEV market share in China would have declined by 21% had the subsidy been halved without any countervailing measures. Using the same reduced budget, had zero PEV subsidies been given to high-income consumers and higher subsidies been given to low-income consumers, the PEV market share would have declined by only 8%.
... Online surveys are the most often adopted approach for exploring consumers' heterogeneous engagement with DSR. Even though respondents' stated preferences could be different from their revealed preferences, this approach is seen as an efficient way to investigate consumers' possible behaviours (Helveston et al. 2018). Moreover, unlike participants of DSR trials who opt in, respondents of an online survey are more likely to represent the general public as a whole, as they have not been screened by DSR recruitment mechanisms. ...
Article
Full-text available
Demand-side response (DSR), the incentivised time-shifting of energy use by consumers away from peak times, is regarded as a potentially effective measure to balance electricity supply and demand. This will be even more important in the low-carbon energy system of the future, with a high share of non-dispatchable power, such as variable renewable energy and nuclear power. Most DSR programmes require consumers’ active engagement in shifting end-use activities. Previous studies have, however, rarely revealed socio-demographic factors influential for consumers’ willingness-to-shift specific end-use activities. This study thus aims to fill this research gap and, using a multinomial logistic model to analyse a nationwide survey, identify factors influential for DSR-related decisions. The nationwide survey for 1004 respondents was carried out to collect data about consumers’ willingness-to-shift their daily activities. We focused on the activities that constitute the major part of domestic energy consumption, i.e. cooking, dish-washing, entertainment, heating, laundry and showering. According to the results, consumers’ original timing of the end-use activities, socio-demographic factors, ownership of specific appliances and level of concern for energy-saving are influential for their willingness-to-shift activities. These findings can not only help policymakers make more targeted DSR promotion plans but also help to improve broader modelling tools to better consider consumers’ willingness-to-shift their demand.
... Scanner data can be integrated with stated choice experiment data, as was done in the seminal work of Adamowicz et al. [17] and in later studies [18][19][20][21] that combined revealed and stated preferences. Model estimates based on pooled data may reduce hypothetical bias and improve the goodness of fit of the models in preference analysis and prediction. ...
Article
Full-text available
The changing lifestyles and the growing health concerns towards the negative impact of the saturated fatty acids originating from animals have increased consumers’ preferences for dairy-alternative products. These products belong to the food and beverage classification that is similar to certain types of dairy-based products in terms of texture and flavor, and has similar nutritional benefits. In this context, we seek to identify the willingness to pay (WTP) for the most important attributes that consumers take into account when purchasing dairy-alternative drinks. A revealed preference discrete choice experiment was carried out using home-scan data belonging to ©Kantar Worldpanel (Barcelona, Spain) regarding the consumption of dairy-alternative drinks in Catalonia (Spain) in 343 households. Furthermore, factors that affect the purchasing frequency of this type of product were analyzed through the Poisson and negative binomial models. Results showed that price was the major driving factor, followed by the original non-dairy beverage flavor attribute: the original flavor commanded higher WTP than versions with other added ingredients and tastes. Marketing strategies should promote products by focusing on the “original” and “pure” version of the product without additional ingredients, or through reduction of the undesirable compounds if they exist in these kinds of beverages.
... This specification enables us to directly estimate WTP for marginal changes in nonprice attributes and also facilitates the direct comparison of model coefficients across different model specifications [32]. ...
Article
Full-text available
A decarbonized future will require a transition to lower carbon fuels for personal transportation. We study consumer preferences for combustion fuels including gasoline, diesel, natural gas, and E85 (85% ethanol and 15% gasoline) using consumer choice survey data from two settings: online ( n = 331) and in-person at refueling stations ( n = 127). Light-duty vehicle owners were asked in a series of choice tasks to choose among fuels that varied in type, price, CO 2 emissions, and location of origin for a hypothetical vehicle that could accept all fuels. We find that the majority of gasoline and E85 users are willing to substitute towards other fuels at today’s prices and attributes, while diesel users have a strong preference for diesel fuel. We also find that respondents are willing to pay on average $150/ton CO 2 avoided from fuel consumption—more than most estimates of the social cost of carbon. Thus, communicating the climate benefits from alternative fuels may be an important strategy for decarbonizing the transportation sector.
... More precisely, it is a method of pooled estimation with heteroscedasticity across RP and SP [3][4] (we call it an approach because it is not exactly a nested logit model: people make two choices, one in RP and one in SP, rather than the single choice assumed in the standard nested logit model). While this modeling idea has been used for decades and has become standard in textbooks [5][6], the nested logit structure can be behaviorally restrictive and pose intractable optimization challenges. ...
Preprint
Full-text available
It is an enduring question how to combine revealed preference (RP) and stated preference (SP) data to analyze travel behavior. This study presents a new approach using multitask learning deep neural networks (MTLDNNs) to combine RP and SP data, incorporating the traditional nested logit approach as a special case. Based on a combined RP and SP survey in Singapore examining the demand for autonomous vehicles (AVs), we designed, estimated, and compared one hundred MTLDNN architectures, with three major findings. First, the traditional nested logit approach of combining RP and SP can be regarded as a special case of MTLDNN and is only one of a large number of possible MTLDNN architectures; the nested logit approach imposes the proportional parameter constraint under the MTLDNN framework. Second, out of the 100 MTLDNN models tested, the best one has one shared layer and five domain-specific layers with weak regularization, but the nested logit approach with the proportional parameter constraint rivals the best model. Third, the proportional parameter constraint works well in the nested logit model, but is too restrictive for deeper architectures. Overall, this study introduces the MTLDNN model to combine RP and SP data, relates the nested logit approach to the hyperparameter space of MTLDNN, and explores hyperparameter training and architecture design for the joint demand analysis.
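The pooled RP–SP structure with a proportional parameter constraint that the abstract above generalizes can be sketched as follows (a standard textbook formulation, not taken from the paper itself; symbols are illustrative, with μ the SP scale relative to RP):

```latex
U^{RP}_{nj} = \beta' x^{RP}_{nj} + \varepsilon^{RP}_{nj}, \qquad
U^{SP}_{nj} = \beta' x^{SP}_{nj} + \varepsilon^{SP}_{nj}, \qquad
\frac{\operatorname{Var}(\varepsilon^{SP}_{nj})}{\operatorname{Var}(\varepsilon^{RP}_{nj})} = \frac{1}{\mu^{2}} .
```

Multiplying the SP utilities by μ equalizes the error variances, so the pooled log-likelihood \(\sum_n \ln P^{RP}_n(\beta) + \sum_n \ln P^{SP}_n(\mu\beta)\) is maximized jointly over \((\beta, \mu)\); the SP coefficients are constrained to be proportional (by the factor μ) to the RP coefficients, which is the "proportional parameter constraint" the MTLDNN framework relaxes.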
Article
Full-text available
Electric vehicle sales have been growing rapidly in the United States and around the world. This study explores the drivers of demand for electric vehicles, examining whether this trend is primarily a result of technology improvements or changes in consumer preferences for the technology over time. We conduct a discrete choice experiment of new vehicle consumers in the United States, weighted to be representative of the population. Results suggest that improved technology has been the stronger force. Estimates of consumer willingness to pay for vehicle attributes show that when consumers compare a gasoline vehicle to its battery electric vehicle (BEV) counterpart, the improved operating cost, acceleration, and fast-charging capabilities of today's BEVs mostly or entirely compensate for their perceived disadvantages, particularly for longer-range BEVs. Moreover, forecasted improvements of BEV range and price suggest that consumer valuation of many BEVs is expected to equal or exceed their gasoline counterparts by 2030. A suggestive market-wide simulation extrapolation indicates that if every gasoline vehicle had a BEV option in 2030, the majority of new car and near-majority of new sport-utility vehicle choice shares could be electric in that year due to projected technology improvements alone.
Article
Urban transportation systems involve thousands of individuals making choices between routes with multiple modes and transfers. For transportation system simulations to produce realistic results, modelers need to incorporate these users and their choices. Choice-based conjoint surveys provide an attractive solution for obtaining flexible utility models that can be used to predict choices for a wide variety of trips. In this study, we demonstrate an example using conjoint survey data of commuter mode choice in the Washington, D.C. metro area (N = 1651). We sample commuters who primarily drive and those who take transit. We examine preferences for different types of multimodal trips, including those with intramodal and intermodal transfers. We find that trips involving a bus transfer are the least preferred, while drivers and transit users both value metro similarly to driving. We also find that walking during transit trips is an important barrier, with the travel time penalty for walking being 60% higher than that of time in a vehicle. Our findings highlight the significance of accounting for differences in modal transfer types in transportation system simulations. Reducing arrival time uncertainty was not a significant factor in commuter mode choice, and commuters' value of time was similar across all vehicle types, suggesting that increasing the relative speed of transit modes may only have a marginal effect on commuter substitution away from personal vehicles.
Preprint
Full-text available
Automated vehicles (AVs) have the potential to dramatically disrupt current transportation patterns and practices. One particular area of concern is AVs’ impacts on public transit systems. If vehicle automation enables significant price decreases or performance improvements for ride-hailing services, some fear that it could undercut public transit, which could have important implications for the environment and transportation equity. The extent to which individuals adopt automated transportation modes will drive many system-level outcomes, and research on public preferences for AVs is immature and inconclusive. In this study, we use responses from an online choice-based conjoint survey fielded in the Washington, D.C. Metropolitan Region (N = 1,736) in October 2021 to estimate discrete choice models of public preferences for different automated (ride-hailing, shared ride-hailing, bus) and non-automated (ride-hailing, shared ride-hailing, bus, rail) modes. We use the estimated models to simulate future marketplace competition across a range of trip scenarios. Respondents on average were only willing to pay a premium for automated modes when a vehicle attendant was also present, limiting the potential cost savings that AV operators might achieve by removing the driver. Additionally, scenario analysis revealed that for trips where good transit options were available, transit remained competitive with automated ride-hailing modes. These results suggest that fears of a mass transition away from transit to AVs may be limited by people’s willingness to use AVs, at least in the short term. Future AV operators should also recognize the presence of an AV attendant as a critical feature for early AV adoption.
Article
The notion of hypothetical bias (HB) constitutes, arguably, the most fundamental issue in relation to the use of hypothetical survey methods. Whether or to what extent choices of survey participants and subsequent inferred estimates translate to real-world settings continues to be debated. While HB has been extensively studied in the broader context of contingent valuation, it is much less understood in relation to choice experiments (CE). While the number of CE studies has rapidly increased, the critical issue of HB has been studied in only a small fraction of CE studies. This paper provides macro-scale insights into the literature of CE and reviews the empirical evidence for HB in CE in various fields of applied economics as well as experimental psychology and behavioural neuroscience. Results suggest mixed evidence on the prevalence, extent and direction of HB as well as considerable context and measurement dependency. While HB is found to be an undeniable issue when conducting CEs, the empirical evidence on HB does not render CEs unable to represent real-world preferences. While health-related choice experiments often find negligible degrees of HB, experiments in consumer behaviour and transport domains suggest significant degrees of HB. Environmental valuation studies provide mixed evidence. Also, across these disciplines many studies display HB in their total willingness to pay estimates and opt-in rates but not in their hypothetical marginal rates of substitution (subject to scale correction). Further, recent findings in psychology and brain imaging studies identify neurocognitive mechanisms underlying HB that may explain some of the discrepancies and unexpected findings in the mainstream CE literature.
Article
Full-text available
Endogeneity arises for numerous reasons in models of consumer choice. It leads to inconsistency with standard estimation methods that maintain independence between the model's error and the included variables. The authors describe a control function approach for handling endogeneity in choice models. Observed variables and economic theory are used to derive controls for the dependence between the endogenous variable and the demand error. The theory points to the relationships that contain information on the unobserved demand factor, such as the pricing equation and the advertising equation. The authors' approach is an alternative to Berry, Levinsohn, and Pakes's (1995) product-market controls for unobserved quality. The authors apply both methods to examine households' choices among television options, including basic and premium cable packages, in which unobserved attributes, such as quality of programming, are expected to be correlated with price. Without correcting for endogeneity, aggregate demand is estimated to be upward-sloping, suggesting that omitted attributes are positively correlated with demand. Both the control function method and the product-market controls method produce downward-sloping demand estimates that are similar.
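The two-stage control-function logic described in the abstract above can be sketched on synthetic data. This is an illustrative simulation under assumed names and numbers (instrument `z`, unobserved quality `q`, true price coefficient of −1), not code from the paper: the endogenous price is regressed on an instrument, and the first-stage residual enters the choice model as a control for the unobserved demand factor.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000
z = rng.normal(size=n)        # instrument: cost shifter, independent of quality
q = rng.normal(size=n)        # unobserved quality: seen by consumers, not the modeler
price = 1.0 + 0.8 * z + 0.8 * q + 0.3 * rng.normal(size=n)  # price rises with quality
utility = -1.0 * price + 1.0 * q + rng.logistic(size=n)     # true price coef = -1
y = (utility > 0).astype(int)

# Naive logit: price is correlated with the omitted quality term.
naive = LogisticRegression(C=1e6, max_iter=1000).fit(price.reshape(-1, 1), y)

# Stage 1: regress the endogenous price on the instrument, keep the residual.
Z = np.column_stack([np.ones(n), z])
b1, *_ = np.linalg.lstsq(Z, price, rcond=None)
resid = price - Z @ b1

# Stage 2: the residual enters as a control for the unobserved demand factor.
cf = LogisticRegression(C=1e6, max_iter=1000).fit(
    np.column_stack([price, resid]), y)

print(naive.coef_[0][0], cf.coef_[0][0])
```

In this simulation the naive price coefficient is strongly biased toward zero (omitted quality is positively correlated with both price and demand), while the control-function estimate lands close to the true value of −1, mirroring the upward-sloping-demand correction the abstract reports.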
Article
Full-text available
Transport problems in developing countries are rapidly increasing due to intense vehicle ownership and usage. Investigating travel behavior is therefore an urgent issue in such countries. This study proposes a Nested Logit model to investigate household travel behavior, emphasizing vehicle ownership, mode choice, and trip chaining decisions. The model is estimated using combined Revealed Preference (RP) and Stated Preference (SP) data, since pooling RP/SP data in travel behavior models is an effective way to capture complex travel behavior and forecast travel demand for new transport services. In the proposed model, the nesting structure has two levels: the upper level represents car-owning, motorcycle-owning, and no-vehicle-owning choices, and the lower level represents the mode choice combinations for two-traveler households. Since trip chaining/sharing is a fairly common practice among vehicle-owning households in developing countries, it is treated as one of the mode-choice options in the proposed model. The proposed model is estimated with data from the Bangkok Metropolitan Region (BMR).
Article
Full-text available
In most marketing experiments, managerial decisions are not based directly on the estimates of the parameters but rather on functions of these estimates. For example, many managerial decisions are driven by whether or not a feature is valued more than the price the consumer will be asked to pay. In other cases, some managerial decisions are weighed more heavily than others. The standard measures used to evaluate experimental designs (e.g., D-efficiency or A-efficiency) do not accommodate these phenomena. We propose alternative “managerial efficiency” criteria (M-errors) that are relatively easy to implement. We explore their properties, suggest practical algorithms to decrease these errors, and provide illustrative examples. Realistic examples suggest improvements of as much as 30% in managerial efficiency. We close by considering approximations for nonlinear criteria and extensions to choice-based experiments.
Article
Full-text available
Our objective is to develop a unifying framework for the incorporation of different types of survey data in individual choice models. We describe statistical methodologies that combine multiple sources of data in the estimation of individual choice models and summarize the current state of the art of data combination methods that have been used with market research data. The most successful applications so far have combined revealed and stated preference data. We discuss different types of market and survey data and provide examples of research contexts in which one might wish to combine them. Although these methods show a great deal of promise and have already been used successfully in a number of applications, several important research issues remain. A discussion of these issues and directions for further research conclude the paper.
Chapter
Full-text available
In models with unobserved taste heterogeneity, distributional assumptions can be placed in two ways: (1) by specifying the distribution of coefficients in the utility function and deriving the distribution of willingness to pay (WTP), or (2) by specifying the distribution of WTP and deriving the distribution of coefficients. In general the two approaches are equivalent, in that any mutually compatible distributions for coefficients and WTP can be represented in either way. However, in practice, convenient distributions, such as normal or log-normal, are usually specified, and these convenient distributions have different implications when placed on WTPs than on coefficients. We compare models that use normal and log-normal distributions for coefficients (called models in preference space) with models using these distributions for WTP (called models in WTP space). We find that the models in preference space fit the data better but provide less reasonable distributions of WTP than the models in WTP space. Our findings suggest that further work is needed to identify distributions that either fit better when applied in WTP space or imply more reasonable distributions of WTP when applied in preference space.
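The two parameterizations compared above can be written explicitly. This is a sketch of the standard setup (symbols are illustrative: n indexes consumers, j alternatives, p price, x non-price attributes), not notation copied from the chapter:

```latex
\text{Preference space:}\quad
U_{nj} = -\alpha_n\, p_{nj} + \beta_n' x_{nj} + \varepsilon_{nj},
\qquad w_n = \beta_n / \alpha_n ;
\\[4pt]
\text{WTP space:}\quad
U_{nj} = -\alpha_n \left( p_{nj} - w_n' x_{nj} \right) + \varepsilon_{nj}.
```

The two forms are algebraically identical, but a convenient distribution placed on \(\beta_n\) in preference space implies that WTP \(w_n = \beta_n/\alpha_n\) is a ratio of random coefficients, which can be heavy-tailed when \(\alpha_n\) has mass near zero, whereas placing the distribution on \(w_n\) directly constrains the WTP distribution itself.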
Article
Full-text available
Discrete choice models estimated using hypothetical choices made in a survey setting (i.e., choice experiments) are widely used to estimate the importance of product attributes in order to make product design and marketing mix decisions. Choice experiments allow the researcher to estimate preferences for product features that do not yet exist in the market. However, parameters estimated from experimental data often show marked inconsistencies with those inferred from the market, reducing their usefulness in forecasting and decision making. We propose an approach for combining choice-based conjoint data with individual-level purchase data to produce estimates that are more consistent with the market. Unlike prior approaches for calibrating conjoint models so that they correctly predict aggregate market shares for a "baseline" market, the proposed approach is designed to produce parameters that are more consistent with those that can be inferred from individual-level market data. The proposed method relies on a new general framework for combining two or more sources of individual-level choice data to estimate a hierarchical discrete choice model. Past approaches to combining choice data assume that the population mean for the parameters is the same across both data sets and require that data sets are sampled from the same population. In contrast, we incorporate in the model individual characteristic variables, and assert only that the mapping between individuals' characteristics and their preferences is the same across the data sets. This allows the model to be applied even if the sample of individuals observed in each data set is not representative of the population as a whole, so long as appropriate product-use variables are collected that can explain the systematic deviations between them. The framework also explicitly incorporates a model for the individual characteristics, which allows us to use Bayesian missing-data techniques to handle the situation where each data set contains different demographic variables. This makes the method useful in practice for a wide range of existing market and conjoint data sets. We apply the method to a set of conjoint and market data for minivan choice and find that the proposed method predicts holdout market choices better than a model estimated from conjoint data alone or a model that does not include demographic variables.
Article
Full-text available
This paper develops new techniques for empirically analyzing demand and supply in differentiated products markets and then applies these techniques to analyze equilibrium in the U.S. automobile industry. Our primary goal is to present a framework which enables one to obtain estimates of demand and cost parameters for a broad class of oligopolistic differentiated products markets. These estimates can be obtained using only widely available product-level and aggregate consumer-level data, and they are consistent with a structural model of equilibrium in an oligopolistic industry. When we apply the techniques developed here to the U.S. automobile market, we obtain cost and demand parameters for (essentially) all models marketed over a twenty-year period.
Article
Full-text available
We investigate direct and indirect specification of the distribution of consumer willingness-to-pay (WTP) for changes in product attributes in a choice setting. Typically, choice models identify WTP for an attribute as a ratio of the estimated attribute and price coefficients. Previous research in marketing and economics has discussed the problems with allowing for random coefficients on both attribute and price, especially when the distribution of the price coefficient has mass near zero. These problems can be avoided by combining a parameterization of the likelihood function that directly identifies WTP with a normal prior for WTP. We show that the typical likelihood parameterization in combination with what are regarded as standard heterogeneity distributions for attribute and price coefficients results in poorly behaved posterior WTP distributions, especially in small sample settings. The implied prior for WTP readily allows for substantial mass in the tails of the distribution and extreme individual-level estimates of WTP. We also demonstrate the sensitivity of profit-maximizing prices to parameterization and priors for WTP.
Article
Full-text available
Surveys are frequently used by businesses and governments to elicit information about the public’s preferences. They have become the most common way to gather preference information regarding goods that are not (or are not yet) bought or sold in markets. In this paper we apply the standard neoclassical economic framework to generate predictions about how rational agents would answer such survey questions, which in turn implies how such survey data should be interpreted. In some situations, the standard economic model would be expected to have no predictive power. For situations where it does have predictive power, we compare different survey formats with respect to: (a) the information that the question itself reveals to the respondent, (b) the strategic incentives the respondent faces in answering the question, and (c) the information revealed by the respondent’s answer.
Article
Full-text available
Hungarian home gardens are small-scale farms managed by farm households using traditional management practices and family labor. They generate private benefits for farmers by enhancing diet quality and providing food when costs of transacting in local markets are high. Home gardens also generate public benefits for society by supporting long-term productivity advances in agriculture. In this paper, we estimate the private value to farmers of agrobiodiversity in home gardens. Building on the approach presented in EPTD Discussion Paper 117 (2004), we combine a stated preference approach (a choice experiment model) and a revealed preference approach (a discrete-choice, farm household model). Both models are based on random utility theory. To combine the models, primary data were collected from the same 239 farm households in three regions of Hungary. Combining approaches leads to a more efficient and robust estimation of the private value of agrobiodiversity in home gardens. Findings can be used to identify those farming communities which would benefit most from agri-environmental schemes that support agrobiodiversity maintenance, at least public cost.
Article
Multinomial logit (MNL) models are widely used in marketing research to analyze choice data, but it is not generally recognized that the unit of the utility scale in an MNL model is inversely related to the error variance. This means that, for instance, parameters of two identical utility specifications estimated from different data sources with unequal variances will necessarily differ in magnitude, even if the true model parameters that generated the utilities are identical in both data sets. Despite a growing number of papers that compare MNL coefficients, no examples of appropriate tests of the joint and separate hypotheses of scale and parameter equality in MNL models exist in the marketing literature. The purpose of this paper is to address the proper procedure for MNL parameter comparisons between different data sets and to propose a simple relative scaling test that can be implemented with standard MNL estimation software.
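The scale–variance confound described above can be shown with a quick synthetic simulation (illustrative numbers, not from the paper): two data sets share the same true utility parameter, but the second has twice the error scale, so its estimated coefficient is roughly halved.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n, beta = 40_000, 1.0            # same true beta in both data sources
x = rng.normal(size=n)

def fit_coef(error_scale):
    # Binary logit: y = 1 iff beta*x + error_scale*logistic_noise > 0,
    # equivalent to a logit model with coefficient beta/error_scale.
    y = (beta * x + error_scale * rng.logistic(size=n) > 0).astype(int)
    model = LogisticRegression(C=1e6, max_iter=1000).fit(x.reshape(-1, 1), y)
    return model.coef_[0][0]

b1, b2 = fit_coef(1.0), fit_coef(2.0)
print(b1, b2)   # roughly 1.0 and 0.5: estimation recovers beta/scale, not beta
```

The estimated coefficients differ by the ratio of the error scales even though the underlying preferences are identical, which is why comparing MNL parameters across data sets requires the kind of relative scaling test the paper proposes.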
Article
Choice designs traditionally have been built under the assumption that all coefficients are zero. The authors show that if there are reasonable nonzero priors for expected coefficients, then these can be used to generate more statistically efficient choice designs, because the alternatives in their choice sets are balanced in utility—they have more similar choice probabilities. The authors demonstrate that the appropriate measure of choice design efficiency requires probability centering and weighting of the rows of the design matrix, and they illustrate how this criterion enables the analyst to appropriately trade off utility balance against three other principles: orthogonality, level balance, and minimal overlap. Two methods, swapping and relabeling attribute levels, provide complementary ways to increase the utility balance of choice designs. The authors apply a process for generating utility-balanced designs to five different choice designs and show that it reduces by 10–50% the number of respondents needed to achieve a specific error level around the parameters. A sensitivity analysis reveals that these gains are diminished, but still substantial, despite strong misspecifications of prior parameter estimates.
Article
Stated choice experiments are a preeminent method for researchers and practitioners who seek to examine the behavior of consumers. However, the extent to which these experiments can replicate real markets continues to be debated in the literature, with particular reference to the potential for biased estimates as a result of the hypothetical nature of such experiments. In this paper, a first in the transportation literature, we compare stated choice responses to revealed preference behavior and examine three methods proposed in the literature for calibrating choice experiments via reported choice certainty. In doing so we provide evidence that the incorrect calibration of responses can produce stated choice results that are more biased than doing nothing at all; however, we show that jointly estimating choice and choice certainty yields a significant reduction in hypothetical bias, such that stated choice responses more closely replicate real behavior.
Article
Endogeneity often arises in discrete-choice models, precluding the consistent estimation of the model parameters, but it is habitually neglected in practical applications. The purpose of this article is to contribute to closing that gap by assessing five methods to address endogeneity in this context: the use of Proxies (PR); the two-step Control-Function (CF) method; the simultaneous estimation of the CF method via Maximum Likelihood (ML); the Multiple Indicator Solution (MIS); and the integration of Latent Variables (LV). The assessment is first made qualitatively, in terms of the formulation, normalization and data needs of each method. Then, the evaluation is made quantitatively, by means of a Monte Carlo experiment to study the finite sample properties under a unified data generation process, and to analyze the impact of common flaws. The methods studied differ notably in the range of problems that they can address; their underlying assumptions; the difficulty of gathering proper auxiliary variables needed to apply them; and their practicality, both in terms of the need for coding and their computational burden. The analysis developed in this article shows that PR is formally inappropriate for many cases, but it is easy to apply, and often corrects in the right direction. CF is also easy to apply with canned software, but requires instrumental variables, which may be hard to collect in various contexts. Since CF is estimated in two stages, it may also compromise efficiency and complicate the estimation of standard errors. ML guarantees efficiency and direct estimation of the standard errors, but at the cost of the larger computational burden required to estimate a multifold integral, with potential difficulties in identification, and it retains the difficulty of gathering proper instrumental variables. The MIS method appears relatively easy to apply and requires indicators that may be easier to obtain in various cases.
Finally, the LV approach appears to be the most versatile method, but at a high cost in computational burden, with problems of identification and limitations in the capability of writing proper structural equations for the latent variable.
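The two-step control-function (CF) method assessed above can be illustrated with a minimal binary-logit sketch on synthetic data. All variable names and the data-generating process here are assumptions of this illustration, not taken from the article: price is made endogenous by correlating it with an unobserved attribute, and the first-stage residual serves as the control function.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
N = 5000

# Synthetic data: price is endogenous because it is correlated with an
# unobserved attribute xi through the (assumed) pricing rule.
z = rng.normal(size=N)                      # instrument (e.g., a cost shifter)
xi = rng.normal(size=N)                     # unobserved attribute
price = 1.0 + 0.8 * z + 0.5 * xi + rng.normal(scale=0.3, size=N)
u = -1.0 * price + xi + rng.logistic(size=N)
y = (u > 0).astype(float)                   # buy vs. outside good

# Step 1: regress the endogenous variable on the instrument, keep the residual.
X1 = np.column_stack([np.ones(N), z])
b1, *_ = np.linalg.lstsq(X1, price, rcond=None)
resid = price - X1 @ b1

# Step 2: binary logit with the first-stage residual as a control function.
def nll(theta):
    a, bp, br = theta
    v = a + bp * price + br * resid
    return -np.sum(y * v - np.logaddexp(0.0, v))

naive = minimize(lambda t: nll(np.r_[t, 0.0]), x0=[0.0, 0.0], method="BFGS")
cf = minimize(nll, x0=[0.0, 0.0, 0.0], method="BFGS")
print("naive price coefficient:", naive.x[1])   # biased toward zero
print("CF price coefficient:   ", cf.x[1])      # closer to the true -1.0
```

As the abstract notes, the same correction can instead be estimated jointly by maximum likelihood, which restores efficiency and direct standard errors at a larger computational cost.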
Article
Marketing is a field that is rich in data. Our data is of high quality, often at a highly disaggregate level, and there is considerable variation in the key variables for which estimates of effects on outcomes such as sales and profits are desired. The recognition that, in some general sense, marketing variables are set by firms on the basis of information not always observable by the researcher has led to concerns regarding endogeneity and widespread pressure to implement instrumental variables methods in marketing problems. The instruments used in our empirical literature are rarely valid and the IV methods used can have poor sampling properties, including substantial finite sample bias and large sampling errors. Given the problems with IV methods, a convincing argument must be made that there is a first order endogeneity problem and that we have strong and valid instruments before these methods should be used. If strong and valid instruments are not available, then researchers need to look toward supplementing the information available to them. For example, if there are concerns about unobservable advertising or promotional variables, then the researcher is much better off measuring these variables rather than using instruments (such as lagged marketing variables) that are clearly invalid. Ultimately, only randomized variation in marketing variables (with proper implementation and large samples) can be argued to be a valid instrument without further assumptions.
Article
Choice designs traditionally have been built under the assumption that all coefficients are zero. The authors show that if there are reasonable nonzero priors for expected coefficients, then these can be used to generate more statistically efficient choice designs, because the alternatives in their choice sets are balanced in utility: they have more similar choice probabilities. The authors demonstrate that the appropriate measure of choice design efficiency requires probability centering and weighting of the rows of the design matrix, and they illustrate how this criterion enables the analyst to appropriately trade off utility balance against three other principles: orthogonality, level balance, and minimal overlap. Two methods, swapping and relabeling attribute levels, provide complementary ways to increase the utility balance of choice designs. The authors apply a process for generating utility-balanced designs to five different choice designs and show that it reduces by 10-50% the number of respondents needed to achieve a specific error level around the parameters. A sensitivity analysis reveals that these gains are diminished, but still substantial, despite strong misspecifications of prior parameter estimates.
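The probability-centered, probability-weighted efficiency criterion described above can be sketched as follows. This is a toy illustration: the design and prior values are invented for the sketch, not taken from the article.

```python
import numpy as np

def d_error(choice_sets, beta):
    """D-error of an MNL choice design under a prior beta: information is
    accumulated from probability-centered, probability-weighted attributes."""
    k = len(beta)
    info = np.zeros((k, k))
    for X in choice_sets:                  # X: alternatives x attributes
        p = np.exp(X @ beta)
        p /= p.sum()
        Z = X - p @ X                      # center attributes at choice probabilities
        info += Z.T @ np.diag(p) @ Z       # probability-weighted cross-product
    return np.linalg.det(info) ** (-1.0 / k)

# Two 2-alternative, 2-attribute choice sets with effects-coded levels.
design = [np.array([[1.0, 1.0], [-1.0, -1.0]]),
          np.array([[1.0, -1.0], [-1.0, 1.0]])]
beta_prior = np.array([0.5, 0.5])
print("D-error under the prior:", d_error(design, beta_prior))
print("D-error under zero prior:", d_error(design, np.zeros(2)))
```

Under a zero prior the criterion collapses to the conventional orthogonality-driven one; the article's swapping and relabeling operations work by lowering this D-error under the nonzero prior.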
Article
In most marketing experiments, managerial decisions are not based directly on the estimates of the parameters but rather on functions of these estimates. For example, many managerial decisions are driven by whether or not a feature is valued more than the price the consumer will be asked to pay. In other cases, some managerial decisions are weighed more heavily than others. The standard measures used to evaluate experimental designs (e.g., A-efficiency or D-efficiency) do not accommodate these phenomena. We propose alternative “managerial efficiency” criteria (M-errors) that are relatively easy to implement. We explore their properties, suggest practical algorithms to decrease errors, and provide illustrative examples. Realistic examples suggest improvements of as much as 30% in managerial efficiency. We close by considering approximations for nonlinear criteria and extensions to choice-based experiments.
Article
This study examines perceptions and objective attribute measures in discrete choice models of recreation site choice behavior. These forms of attribute measurement are examined in individual and combined revealed preference/stated preference models. Our results suggest that the model based on perceptions slightly outperforms the models based on objective attribute measures. However, issues such as the definition of the choice set and the measurement of welfare present significant challenges when using perceptions data.
Article
This paper proposes a methodology for modeling switching behavior using simultaneously cross-sectional revealed preference (RP) data and stated intentions, a type of stated preference (SP) data. With explicit consideration of biases and random errors potentially contained in SP data, combined estimation with RP and SP data can exploit the advantages of both data sources. An empirical analysis of commuters' mode choice shows that the stated intention data have predictive validity if their biases and errors are properly corrected.
Article
Multinomial logit (MNL) models are widely used in marketing research to analyze choice data, but it is not generally recognized that the unit of the utility scale in a MNL model is inversely related to the error variance. This means that, for instance, parameters of two identical utility specifications estimated from different data sources with unequal variances will necessarily differ in magnitude, even if the true model parameters that generated the utilities are identical in both sets. Despite a growing number of papers that compare MNL coefficients, no examples of appropriate tests of the joint and separate hypotheses of scale and parameter equality in MNL models exist in the marketing literature. The purpose of this paper is to address the proper procedure for MNL parameter comparisons between different data sets and to propose a simple relative scaling test that can be implemented with standard MNL estimation software. Several examples are given to illustrate the approach.
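The relative scaling test described in this abstract can be sketched with simulated data and a profiled scale parameter. This is a minimal binary-logit illustration; the true parameters, sample sizes, and scales are assumptions of the sketch.

```python
import numpy as np
from scipy.optimize import minimize, minimize_scalar

rng = np.random.default_rng(1)
beta_true = np.array([-1.0, 0.8])

def simulate(n, scale):
    """Binary logit choices; larger error variance means a smaller scale."""
    X = rng.normal(size=(n, 2))            # attribute differences (alt 1 - alt 2)
    p = 1.0 / (1.0 + np.exp(-scale * (X @ beta_true)))
    return X, (rng.uniform(size=n) < p).astype(float)

X_a, y_a = simulate(2000, 1.0)             # data set A: scale normalized to 1
X_b, y_b = simulate(2000, 0.5)             # data set B: noisier responses

def nll(beta, X, y, mu=1.0):
    v = mu * (X @ beta)
    return -np.sum(y * v - np.logaddexp(0.0, v))

def pooled_nll(mu):
    # Common beta, with data set B's utilities rescaled by the relative scale mu.
    f = lambda b: nll(b, X_a, y_a) + nll(b, X_b, y_b, mu)
    return minimize(f, x0=np.zeros(2), method="BFGS").fun

res = minimize_scalar(pooled_nll, bounds=(0.05, 3.0), method="bounded")

# LR test of parameter equality: pooled (2 betas + 1 scale) vs. separate (4 betas)
ll_sep = -(minimize(lambda b: nll(b, X_a, y_a), np.zeros(2)).fun
           + minimize(lambda b: nll(b, X_b, y_b), np.zeros(2)).fun)
lr = 2.0 * (ll_sep + res.fun)              # chi-squared, here with 1 df
print("estimated relative scale:", res.x)  # should be near 0.5
```

Because the pooled model is nested in the separate fits, the LR statistic is non-negative and, under parameter equality up to scale, small relative to the chi-squared critical value.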
Article
A combined stated preference (SP) and revealed preference nested logit model of access and main mode choice was developed for the Tel Aviv metropolitan area. The data used for model estimation include a conventional travel-activity survey and a customized, computerized SP survey. The model incorporates a generic mass transit alternative that represents any possible combination of existing and new technologies, such as heavy rail, enhanced bus, light rail, and suburban rail. The lower level of the nested logit model represents the access choice to the bus or mass transit alternatives, including walk, park and ride, kiss and ride, and bus access alternatives; the upper level represents the mode choice among bus, mass transit, car driver, and car passenger alternatives. Travel time reliability and seat availability are among the attributes of the public transportation alternatives.
Article
Hypothetical bias arises in stated preference valuation studies when respondents report a willingness to pay (WTP) that exceeds what they actually pay using their own money in laboratory or field experiments. Although this bias is not found in all stated preference surveys, hypothetical WTP typically exceeds the actual value by a factor of two to three. Unfortunately, there is no widely accepted general theory of respondent behaviour that explains hypothetical bias. Therefore, two meta-analyses are reviewed to test current hypotheses regarding the causes of this overstatement of WTP and the associated recommendations to mitigate the bias. Suggestions for future research are made including the development of a general theory.
Article
Applications of random utility models to scanner data have been widely presented in marketing for the last 20 years. One particular problem with these applications is that they have ignored possible correlations between the independent variables in the deterministic component of utility (price, promotion, etc.) and the stochastic component or error term. In fact, marketing-mix variables, such as price, not only affect brand purchasing probabilities but are themselves endogenously set by marketing managers. This work tests whether these endogeneity problems are important enough to warrant consideration when estimating random utility models with scanner panel data. Our results show that not accounting for endogeneity may result in a substantial bias in the parameter estimates.
Chapter
Economists have traditionally relied on retrospective analysis of actual consumer behaviour to understand the factors affecting the decisions of economic agents. This type of information is termed revealed preference (RP) data. Suppose we want to understand health plan choice. Typically, researchers will try to find a data set where consumers were offered and selected a number of different health plans. Then, they will examine the effect of the health plan attributes – price, services covered and quality ratings – on choice. The advantage of RP data is that it is based on actual decisions; thus, there is no need to assume that consumers will respond to simulated product markets as they do to actual market situations. This characteristic gives RP data high reliability and face validity. The disadvantage with this approach, in this case, is that the price of the insurance plan is highly correlated with the services covered and the actuarial value of the plan. What is needed is an estimate of the effect of prices for a wide range of benefit bundles.
Article
There is growing interest in exploring the view that both revealed preference (RP) and stated preference (SP) data have useful information and that their integration will enrich the overall explanatory power of RP choice models. These two types of data have been independently used in the estimation of a wide variety of discrete choice applications in marketing. In order to combine the two data sources, each with independent choice outcomes, allowance must be made for their different scaling properties. The approach uses a full information maximum likelihood estimation procedure of the hierarchical logit form to obtain suitable scaling parameters to make one or more data sets comparable. We illustrate the advantages of the dual data strategy by comparing the results with those obtained from models estimated independently with RP and SP data. Data collected as part of a study of high speed rail is used to estimate a set of illustrative mode choice models.
Article
This research examines the methods, viability, and benefits of pooling scanner panel choice data with compatible preference data from designed choice experiments. The fact that different choice data sources have diverse strengths and weaknesses suggests it might be possible to pool multiple sources to achieve improved models, due to offsetting advantages and disadvantages. For example, new attributes and attribute levels not included in the scanner panel data can be introduced via the choice experiment, while the scanner panel data captures preference dynamics, which is, at best, difficult with experimental data. Our application, involving liquid laundry detergent, establishes the feasibility and desirability of doing such augmentations of scanner panel data: The joint scanner panel/choice experiment model has significantly better prediction performance on a holdout data set than does a pure scanner panel model. Thus, we extend the concept of choice into another domain and demonstrate that data enrichment can add significantly to one's understanding of preferences reflected in scanner panel data.
Article
Revealed preference (RP) data and stated preference (SP) data have complementary characteristics for model estimation. To enhance the advantages of both data types, a combined estimation method is proposed. This paper discusses the method and practical considerations in applying it, and introduces a new method of considering serial correlation of RP and SP data. An empirical analysis is also presented.
Article
The possibility of and procedure for pooling RP and SP data have been discussed in recent research work. In that literature, the RP data has been viewed as the yardstick against which the SP data must be compared. In this paper we take a fresh look at the two data types. Based on the peculiar strengths and weaknesses of each we propose a new, sequential approach to exploiting the strengths and avoiding the weaknesses of each data source. This approach is based on the premise that SP data, characterized by a well-conditioned design matrix and a less constrained decision environment than the real world, is able to capture respondents' tradeoffs more robustly than is possible in RP data. (This, in turn, results in more robust estimates of share changes due to changes in independent variables.) The RP data, however, represent the current market situation better than the SP data, hence should be used to establish the aggregate equilibrium level represented by the final model. The approach fixes the RP parameters for independent variables at the estimated SP parameters but uses the RP data to establish alternative-specific constants. Simultaneously, the RP data are rescaled to correct for error-in-variables problems in the RP design matrix vis-à-vis the SP design matrix. All specifications tested are Multinomial Logit (MNL) models. The approach is tested with freight shippers' choice of carrier in three major North American cities. It is shown that the proposed sequential approach to using SP and RP data has the same or better predictive power as the model calibrated solely on the RP data (which is the best possible model for that data, in terms of goodness-of-fit figures of merit), when measured in terms of Pearson's Chi-squared ratio and the percent correctly predicted statistic. The sequential approach is also shown to produce predictions with lower error than produced by the more usual method of pooling the RP and SP data.
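The core sequential step above (hold the slope parameters at the SP estimates and re-estimate the alternative-specific constants on RP data) can be sketched as follows. The SP slopes, data-generating process, and all names are assumptions of this illustration; the abstract's additional rescaling of the RP data is not shown.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)

beta_sp = np.array([-0.9, 0.6])            # slope estimates taken from an SP model
N, J = 3000, 3
asc_true = np.array([0.0, 0.5, -0.3])      # RP alternative-specific constants

# Simulated RP choices consistent with the SP slopes
X = rng.normal(size=(N, J, 2))             # two attributes per alternative
y = (asc_true + X @ beta_sp + rng.gumbel(size=(N, J))).argmax(axis=1)

def nll(asc_free):
    asc = np.r_[0.0, asc_free]             # first constant normalized to zero
    v = asc + X @ beta_sp                  # slopes held fixed at SP values
    v -= v.max(axis=1, keepdims=True)
    p = np.exp(v)
    p /= p.sum(axis=1, keepdims=True)
    return -np.sum(np.log(p[np.arange(N), y]))

r = minimize(nll, x0=np.zeros(J - 1), method="BFGS")
print("recovered constants:", r.x)         # roughly [0.5, -0.3]
```

Only J - 1 constants are estimated, so this stage is cheap; it anchors the model's aggregate equilibrium level to the market shares observed in the RP data.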
Article
This paper compares results from evaluations of two recent road pricing demonstrations in southern California. These projects provide particularly useful opportunities for measuring commuters’ values of time and reliability. Unlike most revealed preference studies of value of time, the choice to pay to use the toll facilities in these demonstrations is relatively independent from other travel choices such as whether to use public transit. Unlike most stated preference studies, the scenarios presented in these surveys are real ones that travelers have faced or know about from media coverage. By combining revealed and stated preference data, some of the studies have obtained enough independent variation in variables to disentangle effects of cost, time, and reliability, while still grounding the results in real behavior. Both sets of studies find that the value of time saved on the morning commute is quite high (between $20 and $40 per hour) when based on revealed behavior, and less than half that amount when based on hypothetical behavior. When satisfactorily identified, reliability is also valued quite highly. There is substantial heterogeneity in these values across the population, but it is difficult to isolate its exact origins.
Article
We compare multinomial logit and mixed logit models for data on California households' revealed and stated preferences for automobiles. The stated preference (SP) data elicited households' preferences among gasoline, electric, methanol, and compressed natural gas vehicles with various attributes. The mixed logit models provide improved fits over logit that are highly significant, and show large heterogeneity in respondents' preferences for alternative-fuel vehicles. The effects of including this heterogeneity are demonstrated in forecasting exercises. The alternative-fuel vehicle models presented here also highlight the advantages of merging SP and revealed preference (RP) data. RP data appear to be critical for obtaining realistic body-type choice and scaling information, but they are plagued by multicollinearity and difficulties with measuring vehicle attributes. SP data are critical for obtaining information about attributes not available in the marketplace, but pure SP models with these data give implausible forecasts.
Article
We describe and apply choice models, including generalizations of logit called "mixed logits," that do not exhibit the restrictive "independence from irrelevant alternatives" property and can approximate any substitution pattern. The models are estimated on data from a stated-preference survey that elicited customers' preferences among gas, electric, methanol, and CNG vehicles with various attributes. ACKNOWLEDGEMENTS: David Bunch and Tom Gollob collected the data and conducted preliminary analyses upon which our analysis relies. We are grateful to them for allowing us to use the data. They are not, of course, responsible for any errors or representations that we make.
Article
This paper brings together several research streams and concepts that have been evolving in random utility choice theory: (1) it reviews the literature on stated preference (SP) elicitation methods and introduces the concept of testing data generation process invariance across SP and revealed preference (RP) choice data sources; (2) it describes the evolution of discrete choice models within the random utility family, where progressively more behavioural realism is being achieved by relaxing strong assumptions on the role of the variance structure (specifically, heteroscedasticity) of the unobserved effects, a topic central to the issue of combining multiple data sources; (3) particular choice model formulations incorporating heteroscedastic effects are presented, discussed and applied to data. The rich insights possible from modelling heteroscedasticity in choice processes are illustrated in the empirical application, highlighting its relevance to issues of data combination and taste heterogeneity.
Article
According to intuition and theories of diffusion, consumer preferences develop along with technological change. However, most economic models designed for policy simulation unrealistically assume static preferences. To improve the behavioral realism of an energy–economy policy model, this study investigates the “neighbor effect,” where a new technology becomes more desirable as its adoption becomes more widespread in the market. We measure this effect as a change in aggregated willingness to pay under different levels of technology penetration. Focusing on hybrid-electric vehicles (HEVs), an online survey experiment collected stated preference (SP) data from 535 Canadian and 408 Californian vehicle owners under different hypothetical market conditions. Revealed preference (RP) data was collected from the same respondents by eliciting the year, make and model of recent vehicle purchases from regions with different degrees of HEV popularity: Canada with 0.17% new market share, and California with 3.0% new market share. We compare choice models estimated from RP data only with three joint SP–RP estimation techniques, each assigning a different weight to the influence of SP and RP data in coefficient estimates. Statistically, models allowing more RP influence outperform SP influenced models. However, results suggest that because the RP data in this study is afflicted by multicollinearity, techniques that allow more SP influence in the beta estimates while maintaining RP data for calibrating vehicle class constraints produce more realistic estimates of willingness to pay. Furthermore, SP influenced coefficient estimates also translate to more realistic behavioral parameters for CIMS, allowing more sensitivity to policy simulations.
Article
We develop a combined, revealed and stated preference approach to identify discrete choice demand parameters in the presence of unobserved determinants of choice. Our approach overcomes difficulties associated with small choice sets, multicollinearity, and endogeneity that arise with revealed preference approaches. To illustrate our framework, we revisit two Canadian moose hunting datasets. Our empirical results suggest the potential gains from fusing revealed and stated preference data, but they also suggest its limitations when the data-generating processes for the data sources differ.
Article
This paper formulates and applies a unified mixed-logit framework for joint analysis of revealed and stated preference data that accommodates a flexible competition pattern across alternatives, scale difference in the revealed and stated choice contexts, heterogeneity across individuals in the intrinsic preferences for alternatives, heterogeneity across individuals in the responsiveness to level-of-service factors, state-dependence of the stated choices on the revealed choice, and heterogeneity across individuals in the state-dependence effect. The estimation of the mixed logit formulation is achieved using simulation techniques that employ quasi-random Monte Carlo draws. The formulation is applied to examine the travel behavior responses of San Francisco Bay Bridge users to changes in travel conditions. The data for the study are drawn from surveys conducted as part of the 1996 San Francisco Bay Area Travel Study. The results of the mixed logit formulation are compared with those of more restrictive structures on the basis of parameter estimates, implied trade-offs among level-of-service attributes, heterogeneity and state-dependence effects, data fit, and substantive implications of congestion pricing policy simulations.
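A minimal version of the simulated maximum-likelihood machinery described above, using quasi-random (Halton) draws for a single normally distributed coefficient, might look like the sketch below. The dimensions, true values, and names are invented for the illustration; the paper's full formulation additionally handles scale differences, state dependence, and flexible competition patterns.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm, qmc

rng = np.random.default_rng(2)
N, J, R = 1000, 3, 200                     # individuals, alternatives, draws

# One random taste coefficient: beta_i ~ N(b, s^2)
b_true, s_true = -1.0, 0.5
X = rng.normal(size=(N, J))                # a single attribute per alternative
beta_i = b_true + s_true * rng.normal(size=N)
y = (beta_i[:, None] * X + rng.gumbel(size=(N, J))).argmax(axis=1)

# Scrambled Halton points mapped to standard-normal nodes
halton = qmc.Halton(d=1, scramble=True, seed=0).random(R).ravel()
z = norm.ppf(halton)

def simulated_nll(theta):
    b, s = theta
    beta_r = b + s * z                     # R draws of the random coefficient
    v = beta_r[None, :, None] * X[:, None, :]        # N x R x J utilities
    p = np.exp(v - v.max(axis=2, keepdims=True))
    p /= p.sum(axis=2, keepdims=True)
    p_chosen = p[np.arange(N), :, y]       # N x R probabilities of the chosen alt
    return -np.sum(np.log(p_chosen.mean(axis=1)))    # average over draws first

est = minimize(simulated_nll, x0=[0.0, 0.3], method="Nelder-Mead")
print("mean and spread estimates:", est.x)
```

Averaging the conditional logit probabilities over the quasi-random draws before taking logs is what makes this a simulated likelihood; scrambled Halton sequences cover the mixing distribution more evenly than pseudo-random draws of the same size.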
Article
In the current paper, a departure time choice model including travel time variability is estimated, combining stated preference and revealed preference data. We account for response scale differences between RP and SP data and, applying the mixed logit model, test for correlation of scheduling sensitivity across RP and SP choices within individuals. The analysis implies systematic differences in the RP and SP data. With support of the evaluation from the Stockholm trial, this indicates that SP is less trustworthy for trip timing analysis and forecasting, presumably because there are temporal differences in RP and SP choice situations.
Article
There is growing interest in establishing the extent of differences in willingness to pay (WTP) for attributes, such as travel time savings, that are derived from real market settings and hypothetical (to varying degrees) settings. Non-experiment external validity tests involving observation of choice activity in a natural environment, where the individuals do not know they are in an experiment, are rare. In contrast, the majority of tests are a test of external validity between hypothetical and actual experiments. Deviation from real market evidence is referred to in the literature broadly as hypothetical bias. The challenge is to identify such bias, and to the extent to which it exists, establishing possible ways to minimise it. This paper reviews the efforts to date to identify and 'calibrate' WTP derived from one or more methods that involve assessment of hypothetical settings, be they (i) contingent valuation methods, (ii) choice experiments involving trading attributes between multiple alternatives, with or without referencing, or (iii) methods involving salient or non-salient incentives linked to actual behaviour. Despite progress in identifying possible contributions to differences in marginal WTP, there is no solid evidence, although plenty of speculation, to explain the differences between all manner of hypothetical experiments and non-experimental evidence. The absence of non-experimental evidence from natural field experiments remains a major barrier to confirmation of under- or over-estimation. We find, however, that the role of referencing of an experiment relative to a real experience (including evidence from revealed preference (RP) studies), in the design of choice experiments, appears to offer promise in the derivation of estimates of WTP that have a meaningful link to real market activity, closing the gap between RP and stated choice WTP outputs.
Article
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Civil Engineering, 1989. Supervised by Moshe E. Ben-Akiva. Includes bibliographical references (leaves 156-163).
Article
The use of stated preference analyses to evaluate choice of health care products has been growing in recent years. This paper shows how revealed preference data can be enriched with stated preference data and highlights the relative advantages of revealed and stated preference data. The techniques were applied to a study of determinants of physicians' prescriptions of alcoholism medications. Analyses were conducted on the relationship between physicians' perceptions of existing alcoholism medication attributes and their prescribing rates of those medications. Analyses were also conducted on physicians' decisions to prescribe hypothetical alcoholism medications with varying attributes such as efficacy, side effects, compliance, mode of action, and price. Finally, analyses were conducted on the combined stated and revealed preference data. Joint estimation suggests that parameters from the revealed and stated preference data are equal, up to scale. Joint analyses highlight how stated preference data can be used to estimate parameters for attributes that are not observed in the marketplace, that do not vary in the marketplace, or that are highly collinear with other attributes in actual markets.
Article
Because most conjoint studies are conducted in hypothetical situations with no consumption consequences for the participants, the extent to which the studies are able to uncover "true" consumer preference structures is questionable. Experimental economics literature, with its emphasis on incentive alignment and hypothetical bias, suggests that more realistic incentive-aligned studies result in stronger out-of-sample predictive performance of actual purchase behaviors and provide better estimates of consumer preference structures than do hypothetical studies. To test this hypothesis, the authors design an experiment with conventional (hypothetical) conditions and parallel incentive-aligned counterparts. Using Chinese dinner specials as the context, the authors conduct a field experiment in a Chinese restaurant during dinnertime. The results provide strong evidence in favor of incentive-aligned choice conjoint analysis, in that incentive-aligned choice conjoint outperforms hypothetical choice conjoint in out-of-sample predictions. To determine the robustness of the results, the authors conduct a second study that uses snacks as the context and considers only the choice treatments. This study confirms the results by providing strong evidence in favor of incentive-aligned choice analysis in out-of-sample predictions. The results provide a strong motivation for conjoint practitioners to consider conducting studies in realistic settings using incentive structures that require participants to "live with" their decisions.
Article
In this article we propose theoretically consistent welfare measurement of use and nonuse values for an improvement in environmental quality with revealed and stated preference data. An analytical model based on the comparative static analysis of the variation function that describes the relationship between recreation demand and dichotomous choice contingent valuation models is estimated. Our results show that revealed and stated data should not be combined under the same assumed preference structure unless the two decisions imply the same change in behavior induced by the quality change. In addition, our results indicate scope effects in willingness to pay measures estimated with stated preference data.
Article
Corruption in the public sector erodes tax compliance and leads to higher tax evasion. Moreover, corrupt public officials abuse their public power to extort bribes from the private agents. In both types of interaction with the public sector, the private agents are bound to face uncertainty with respect to their disposable incomes. To analyse effects of this uncertainty, a stochastic dynamic growth model with the public sector is examined. It is shown that deterministic excessive red tape and corruption deteriorate the growth potential through income redistribution and public sector inefficiencies. Most importantly, it is demonstrated that the increase in corruption via higher uncertainty exerts adverse effects on capital accumulation, thus leading to lower growth rates.
Article
The nonlinear fixed-effects model has two shortcomings, one practical and one methodological. The practical obstacle relates to the difficulty of computing the MLE of the coefficients of nonlinear models with possibly thousands of dummy variable coefficients. In fact, in many models of interest to practitioners, computing the MLE of the parameters of the fixed effects model is feasible even in panels with very large numbers of groups. The result, though not new, appears not to be well known. The more difficult, methodological issue is the incidental parameters problem, which raises questions about the statistical properties of the ML estimator. There is relatively little empirical evidence on the behaviour of the MLE in the presence of fixed effects, and that which has been obtained has focused almost exclusively on binary choice models. In this paper, we use Monte Carlo methods to examine the small sample bias of the MLE in the tobit, truncated regression and Weibull survival models as well as the binary probit and logit and ordered probit discrete choice models. We find that the estimator in the continuous response models behaves quite differently from the familiar and oft cited results.
Among our findings are: first, a widely accepted result that suggests that the probit estimator is actually relatively well behaved appears to be incorrect; second, the estimators of the slopes in the tobit model, unlike the probit and logit models that have been studied previously, appear to be largely unaffected by the incidental parameters problem, but a surprising result related to the disturbance variance estimator arises instead; third, lest one jump to the conclusion that the finite sample bias is restricted to discrete choice models, we submit evidence on the truncated regression, which is yet unlike the tobit in that regard; it appears to be biased towards zero; fourth, we find in the Weibull model that the biases in a vector of coefficients need not be in the same direction; fifth, as apparently unexamined previously, the estimated asymptotic standard errors for the ML estimators appear uniformly to be downward biased when the model contains fixed effects. In sum, the finite sample behaviour of the fixed effects estimator is much more varied than the received literature would suggest. Copyright Royal Economic Society 2004
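The flavor of such a Monte Carlo exercise can be sketched for the binary fixed-effects logit, where the T = 2 case has a known probability limit of twice the true coefficient. The sample sizes and parameter values below are assumptions of this sketch, not the paper's actual design; an analytic gradient is supplied so the many dummy coefficients are estimated quickly.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
N, T, reps = 500, 2, 20
beta_true = 1.0
estimates = []

for _ in range(reps):
    a = rng.normal(size=N)                 # individual fixed effects
    x = rng.normal(size=(N, T))
    p = 1.0 / (1.0 + np.exp(-(a[:, None] + beta_true * x)))
    y = (rng.uniform(size=(N, T)) < p).astype(float)

    def nll_grad(theta):
        b, alpha = theta[0], theta[1:]
        v = alpha[:, None] + b * x
        resid = y - 1.0 / (1.0 + np.exp(-v))        # y minus fitted probability
        ll = np.sum(y * v - np.logaddexp(0.0, v))
        return -ll, -np.r_[np.sum(resid * x), resid.sum(axis=1)]

    # One slope plus N dummy coefficients; bounds keep degenerate groups finite.
    r = minimize(nll_grad, x0=np.zeros(N + 1), jac=True, method="L-BFGS-B",
                 bounds=[(-15.0, 15.0)] * (N + 1))
    estimates.append(r.x[0])

print("mean fixed-effects logit estimate:", np.mean(estimates))  # near 2 x beta_true
```

The doubling of the slope estimate at T = 2 is the classic incidental-parameters result for the logit; the paper's contribution is to show that the pattern across other nonlinear models is far less uniform.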