Connection Between Built Environment and Travel Behavior

Giancarlos Troncoso Parady, Kiyoshi Takami, Noboru Harata 1
THE BUILT ENVIRONMENT-TRAVEL BEHAVIOR CONNECTION: A
PROPENSITY SCORE APPROACH UNDER A CONTINUOUS TREATMENT
REGIME
Giancarlos Troncoso Parady
(Corresponding Author)
Ph.D. Candidate
Department of Urban Engineering, Graduate School of Engineering
The University of Tokyo
7-3-1 Hongo, Bunkyo-Ku, Tokyo, Japan 113-8656
gtroncoso@ut.t.u-tokyo.ac.jp
+81-3-5841-6234
Kiyoshi Takami, Ph.D.
Assistant Professor
Department of Urban Engineering, Graduate School of Engineering
The University of Tokyo
7-3-1 Hongo, Bunkyo-Ku, Tokyo, Japan 113-8656
takami@ut.t.u-tokyo.ac.jp
Department of Urban Engineering, School of Engineering
+81-3-5841-6234
Noboru Harata, Ph.D.
Professor
Department of Urban Engineering, Graduate School of Engineering
The University of Tokyo
7-3-1 Hongo, Bunkyo-Ku, Tokyo, Japan 113-8656
nhara@ut.t.u-tokyo.ac.jp
+81-3-5841-6233
Word count: 5243 (Without counting tables and figures)
Number of tables: 3
Number of figures: 3
ABSTRACT
In recent years, the compact city concept has become a paradigm of sustainable urban
development under the premise that mixed use, high density cities can significantly reduce automobile
dependency and promote the use of alternative modes. This claim, however, hinges on the existence of
a true causal mechanism between the built environment and travel behavior. This study tackles the
causality problem using a propensity score approach, but differs from previous studies in that it relaxes
the binary treatment assumption (i.e. urban vs. suburban) and assumes a continuous treatment of
urbanization level, estimated as a latent variable. Methodologically, the propensity score stratification
method utilized is successful in mitigating residential self-selection bias on estimates of the effect of
the built environment on non-work trip frequency and traveled distance. Relative to the estimates
stratified on the propensity score, direct regression estimates overstate the effect by 6% to 36%.
Findings suggest the existence of a causal mode substitution mechanism from car to non-motorized
modes given positive increases in the latent score of urbanization level.
1. INTRODUCTION
The connection between the built environment and travel behavior has been the object of
interest of a considerable number of studies in the past twenty years. As concepts such as Smart Growth,
Compact Cities and New Urbanism permeate the sustainability discourse, the validity of the argument
that high density, compact and mixed use cities might reduce car use and promote the use of alternative
modes hinges on the existence of a true causal mechanism between the built environment and travel
behavior. Of particular importance to the establishment of this causal mechanism is the issue of
residential self-selection, where individuals choose their residential location in part to meet their
transport preferences (1). Failure to control for self-selection might therefore result in biased and
inconsistent estimators of the true effect of interest.
This study uses a propensity score approach to mitigate self-selection bias, yet differs from
previous studies in that it generalizes the binary treatment regime of urban vs. suburban neighborhoods.
Instead, a continuous treatment regime of urbanization level is assumed, thus accounting for more
variability in the built environment characteristics.
The rest of the paper is structured as follows. Section 2 discusses the residential self-selection
problem from a program evaluation perspective in the planning literature. Section 3 summarizes the
general characteristics of the study, while Section 4 details the properties of the propensity score and
elaborates on the propensity score generalization used in this study. Sections 5 and 6 present estimation
results for the latent variable for urbanization level and the estimated treatment effects, respectively,
followed by a discussion of findings in Section 7. Finally, Section 8 presents the general conclusions of the
analysis.
2. THE RESIDENTIAL SELF-SELECTION PROBLEM FROM A PROGRAM
EVALUATION PERSPECTIVE
The establishment of a causal relation between the built environment and travel behavior has
been a widely debated issue among researchers. Although a great many studies have found statistically
significant associations between built environment features and some dimensions of travel behavior,
establishing a causal relationship hinges on stronger conditions that might be hard to meet outside ideal
experimental conditions. In particular, the non-random allocation of the treatment of interest (in this case,
the built environment characteristics) might compromise the validity of results. In the context of the
built environment-travel behavior connection, non-random treatment assignment is most likely a result
of households self-selecting into neighborhoods that match their transport preferences.
In the absence of true experimental studies, researchers have attempted to establish causality
through either natural experiment studies or cross-sectional studies via econometric mechanisms. A
review of all the existing approaches is beyond the scope of this article and has been documented
elsewhere in the literature; interested readers are referred to Cao et al. (2) for a detailed review. The
present article focuses on the latter group because, for cost and other implementation reasons,
existing data is largely cross-sectional.
In the program evaluation literature several methods have been developed to address the
causality problem in the presence of non-random treatment allocation, out of which two main
approaches are highlighted: sample selection models and propensity score models.
The first step for both approaches is to estimate the conditional probability of seeking
treatment given a vector of covariates; in the case of a binary treatment, $P(X) = \Pr(z = 1 \mid X)$, where z is a
binary treatment that takes value 1 when the individual is treated and zero otherwise, and X is a vector
of conditioning covariates. For binary treatments, this probability is usually estimated via a binary probit or
logit model.
In sample selection models (Heckman’s sample selection), given an estimated treatment
probability, a sample correction coefficient is estimated and introduced in the regression of the outcome
variable of interest on all covariates to correct for selectivity bias (3). Using this approach, Zhou and
Kockelman (4) estimated household vehicle miles traveled (VMT) given a binary treatment of urban
vs. suburban residential location and found that 90% of the difference in VMT was attributable to the
treatment itself and that self-selection accounted for only 10% of the observed traveled distances. A
similar study by Cao (5) estimated that self-selection accounted for 19% of the observed VMT.
Like the sample selection models, propensity score models start from the estimation
of treatment probability. Propensity score matching consists of matching treated and untreated samples
with similar treatment probability scores; differences in outcomes between matched pairs are then
averaged to estimate the average treatment effect (ATE) (6). Alternatively, as Rosenbaum and Rubin
(6) show, the sample can be stratified on the estimated treatment probability and the outcomes of subgroups
compared. Using propensity score matching with different binary treatments (i.e. urban vs. suburban,
urban vs. exurban etc.), Cao et al. (7) found a positive association between vehicle miles driven and
distance from the city center, with the impact of self-selection ranging from 0.05% to 52%, depending
on the treatments considered. Concerning walking behavior, Boer et al. (8) estimated treatment effects
for several built environment features and found that higher levels of business diversity and four-way
intersections were on average associated with more walking. Through propensity score stratification,
Cao (9) found that residents living in neo-traditional neighborhoods tend to walk more than their
suburban counterparts; furthermore, he found that failure to account for self-selection might result in
overestimating the effect of the built environment on walking frequency by 64% and 17% for utilitarian
trips and recreational trips, respectively.
Although the studies discussed so far do demonstrate the potential of propensity score methods
to reduce selection bias, in most studies reviewed the built environment is polarized into a binary
treatment (usually urban vs. suburban), in itself a rather strong assumption that ignores the spectrum of
variability in terms of how “urban” or how “suburban” a neighborhood might be. In other words, a
binary treatment considers all neighborhoods within each treatment class identical in their built
environment features, thus making the estimates insensitive to variations in neighborhood composition.
As an aside, it is important to note that, as extensions to the binary treatment approach, multi-valued
treatment approaches have been proposed by Lee (10), in the form of a multinomial logit-OLS
two-stage model, and by Imbens (11), whose approach can also be applied to ordinal multi-level treatments. While
these approaches are certainly promising improvements, to the authors' best knowledge they have yet
to be operationalized in the transportation field.
3. GENERAL CHARACTERISTICS OF THE STUDY
In order to address the identified gap in the literature this analysis uses a propensity score
approach under a continuous treatment regime. In that sense, this study consists of two main parts, (i)
the estimation of a continuous treatment variable for urbanization level, and (ii) the estimation of its
effect on travel behavior. Furthermore, given a continuous treatment, and considering that under the
Gauss-Markov assumptions the OLS estimator is the best linear unbiased estimator, the performance of
the direct OLS estimates is tested against the propensity score estimates.
Data from an online survey conducted in the city of Hiroshima, Japan, were used for the analysis.
The survey was conducted in March 2013 through Rakuten Research, a company affiliated with Rakuten
Market, the largest internet shopping site in Japan, with over 2.3 million monitors all over the country.
The sample consisted of 600 individuals drawn through stratified random sampling from the
monitor list to match the population distribution of the 8 wards that compose the metropolitan area and
the overall age distribution in the city.
Certainly there are some issues regarding web surveys that might compromise the external
validity of results, particularly the issue of coverage error, which stems from the exclusion of people
who (i) do not have access to the internet or (ii) do not have enough digital literacy to adequately answer the
questionnaire (12). Regarding access to the internet, the Ministry of Internal Affairs and
Communications of Japan (MIC) estimated in their Communications Usage Trend Survey a penetration
rate of 79.1% for 2011, with a 90% penetration in the 13-49 years old cohort, and lower rates for the
60-64, 65-69, and 70-79 cohorts, with 73%, 60% and 42% diffusion rates respectively (13). Regarding
digital literacy, the same survey estimates that among internet users, 60% used the internet for online
purchases or trade of merchandise, although it noted a gap between users under 49 years old and users
above that threshold. This suggests that although there is a rather high internet diffusion rate, some
limitations do exist in terms of representativeness of the sample, especially for the older cohorts. At any
rate, it is important to note that even if the sample is not perfectly representative of the population of
interest, the main contribution of this study is of a methodological nature, hence the present data was
considered valid for the analysis in question.
Regarding data gathered in the survey, as illustrated in Table 1, information was gathered on
general characteristics of the household as well as individual characteristics of the respondent such as
car use habit, as measured by the response frequency index (14), an indicator variable for job location
in the city center, a set of indicator variables for the type of area where they grew up or spent most of
their childhood (i.e. large metropolitan area, suburbs of large metropolitan area, regional city, etc.) and
three measurements of individual attitudes and preferences estimated via confirmatory factor analysis
following Kitamura et al. (15).
General sample characteristics were compared against population characteristics from the 2010
national census to check the representativeness of the sample. Overall, the sample mean values
approximate the census mean values on the evaluated criteria; however, women are slightly under-represented
in the sample, by 4 percentage points on average, with a larger margin observed for
women over 60 years old. Although online surveys are on average expected to be biased towards the
young, the average age in the sample is approximately five years higher than the population average
for the city of Hiroshima. Sample households are also slightly larger, with a sample average of 2.66
against the population average of 2.29. Finally, the largest difference from the population values comes
from household income, where consistent with findings from the literature, higher income households
tend to be over-represented in web-survey samples (12). Compared against the Private Income
Statistical Survey for 2011 (16), income groups under US$4,000/year are underrepresented by 44%,
while the rest of the income cohorts are overrepresented. Other variables of interest however, such as
number of vehicles in the household and ratio of home-owners in the sample did not exhibit any
significant differences from the population values.
The socio-demographic composition data of all the districts of Hiroshima city, used to estimate
the continuous urbanization level indicator (see Section 5) were gathered from the 2005 national census,
as the GIS data for the 2010 census was not available at the time of writing. Finally, geo-referenced
land use data was gathered from the TelPOINT pack geo-referenced phonebook, developed by ZENRIN.
The dependent variables considered in this study were non-work trip frequency by car, non-
work trip frequency by non-motorized modes and total travel distance for non-work trips. Respondents
were asked to state the number of home-based trips (excluding return trips) taken during an average
week by purpose and the most frequently used mode for that type of trip. Trip frequencies lower than
once a week were treated as zero. Travel distances were calculated as a function of the stated trip time
given average speeds by mode. Although estimation errors are likely given the way travel distances are
calculated, when formulating the questionnaire, it was assumed that individuals were more likely to be
accurate in terms of travel time than in terms of actual travel distances.
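As a rough illustration of this calculation, the following Python sketch converts a stated one-way trip time into a weekly traveled distance. The speed values, the dictionary and the function name are hypothetical placeholders; the average speeds actually used in the study are not reported in this section.

```python
# Hypothetical average speeds in km/h (assumption: the study's values are not given here).
ASSUMED_SPEED_KMH = {"car": 30.0, "bicycle": 12.0, "walk": 4.8}


def weekly_distance_km(trips_per_week, minutes_per_trip, mode):
    """Weekly one-way distance implied by stated trip time and an assumed mode speed."""
    # Trip frequencies lower than once a week are treated as zero, as in the text.
    if trips_per_week < 1:
        return 0.0
    hours_per_trip = minutes_per_trip / 60.0
    return trips_per_week * hours_per_trip * ASSUMED_SPEED_KMH[mode]


car_example = weekly_distance_km(3, 20, "car")       # three 20-minute car trips
rare_example = weekly_distance_km(0.5, 20, "walk")   # less than one trip per week
print(car_example, rare_example)
```

Because distance is derived from time, any error in the assumed speed scales the distance proportionally, which is the estimation error the text acknowledges.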
4. PROPENSITY SCORE AND CAUSAL INFERENCE
The propensity score, defined as the conditional probability of treatment given observed
covariates, was proposed by Rosenbaum and Rubin (6) as a way to remove bias due to observed
covariates. By acting as a balancing score in a non-randomized treatment assignment context, the
propensity score makes inherently different groups comparable, the main advantage being the
possibility of balancing a potentially large set of covariates X using one single scalar function. In that
sense, given a binary treatment z, stratifying by the propensity score will balance X, so that conditional
on the propensity score function, the distribution of X is the same for treated and
untreated groups; that is, conditional on P(X), X and z are independent:
   . (1)
The main assumption behind the propensity score approach is the strong ignorability of
treatment, where, given equation (1), treatment outcomes (r0, r1) are independent from treatment
assignment given P(X):
$(r_0, r_1) \perp z \mid P(X)$. (2)
Given equations (1) and (2), the expected difference between outcomes given P(X) is the average
treatment effect (ATE) given P(X). Rosenbaum and Rubin (17) further show that a five-strata
subclassification on the propensity score might remove over 90% of the bias due to observed covariates.
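As a worked illustration of how subclassification yields an overall estimate in the binary case, the stratum-specific differences can be combined as shown below. This is the standard subclassification estimator written in the notation of equations (1) and (2). The symbols $\bar{r}_{1j}$ and $\bar{r}_{0j}$ (mean outcomes of treated and untreated units in stratum j), $n_j$ (stratum size) and $N$ (sample size) are introduced here for illustration only.

```latex
\widehat{\mathrm{ATE}} \;\approx\; \sum_{j=1}^{5} \frac{n_j}{N}
  \left( \bar{r}_{1j} - \bar{r}_{0j} \right)
```

With strata of roughly equal size, each weight $n_j/N$ is close to 1/5, so the estimate is approximately the simple mean of the five within-stratum differences.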
The analysis presented in this article follows the generalization of the propensity score method
proposed by Imai and van Dyk (18) to allow for arbitrary treatment regimes $T^A$. This section draws
heavily from Imai and van Dyk in order to summarize the method used. Readers are referred to the
original article for a more in-depth explanation.
Following the proposed generalization approach, under a continuous treatment regime, the
distribution of treatment $T^A$ given a vector of covariates X is modeled as $T^A \mid X \sim N(X^\top \beta, \sigma^2)$, where the
propensity score function P(X) is Gaussian distributed and parameterized by $\psi = (\beta, \sigma^2)$
and $\theta = X^\top \beta$; thus the propensity score function is solely characterized by the scalar $\theta$.
In practice, $\hat{\theta}$ is estimated through a linear regression of the treatment variable $T^A$ on all
covariates X, so that $\hat{\theta} = X^\top \hat{\beta}$; that is, the propensity score is uniquely characterized by the
conditional mean function of the regression.
Imai and van Dyk (18) also demonstrated that equations (1) and (2) can be extended to show
that even for non-binary treatments, the propensity score serves as a balancing score:
$X \perp T^A \mid P(X)$, (3)
and that the distribution of the outcome given a potential treatment $t^P$, $Y(t^P)$, is independent from
treatment assignment given P(X):
$Y(t^P) \perp T^A \mid P(X)$, (4)
for any $t^P \in \mathcal{T}$, where $\mathcal{T}$ is a set of potential treatment values. Thus, by averaging $p\{Y(t^P) \mid T^A = t^P, \theta\}$
over the distribution of P(X), which is characterized by $\theta$, the distribution of the outcome of interest can be obtained:
$p\{Y(t^P)\} = \int p\{Y(t^P) \mid T^A = t^P, \theta\}\, p(\theta)\, d\theta$. (5)
This integration can then be approximated parametrically as $p_{\phi}\{Y(t^P) \mid T^A = t^P\}$ stratified by the
propensity score θ, where $\phi$ parameterizes the distribution. Thus, the distribution of $Y(t^P)$ can be
approximated as the weighted average of the within-strata outcome distributions:
$p\{Y(t^P)\} \approx \sum_{j=1}^{J} p_{\hat{\phi}_j}\{Y(t^P) \mid T^A = t^P\}\, W_j$, (6)
where $\hat{\phi}_j$ is the within-strata estimate of the unknown parameter $\phi$ in stratum j, and $W_j$ is the relative
weight of stratum j. $\hat{\phi}_j$ can then be estimated from the within-stratum model
$p_{\phi_j}\{Y(t^P) \mid T^A = t^P, X\}$, (7)
where covariates X are included to control for variability of θ within strata. The average treatment effect
is then a function of $\hat{\phi}_j$; in this case, the weighted treatment coefficient of the regression of the outcome
variable $Y(t^P)$ on $t^P$ and all covariates, where weights are given by the sample relative weight $n_j/N$.
Imai and van Dyk (18) verified through simulation and empirical analysis that stratification
on the propensity score reduces bias of observed covariates by 16-95%, suggesting a superior
performance over the direct non-stratified treatment estimation.
Although the estimation might look cumbersome, in practice the procedure is rather simple.
First, the propensity score function is estimated through an OLS regression of the treatment variable
$T^A$ on all covariates X, where the conditional mean function $\hat{\theta} = X^\top \hat{\beta}$ characterizes the
propensity score. Using the score estimate $\hat{\theta}$, that is, the regression fitted values, the sample is then
stratified into approximately equal strata, and the outcome variables of interest are then regressed on
the treatment and the same covariates X within each stratum. The average treatment effect is simply the weighted average
of the within-strata treatment coefficients.
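The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are simulated with a single hypothetical confounder, the true effect (-0.15) is chosen arbitrarily, and the helper names `ols` and `stratified_effect` are introduced here. Plain Python is used so that the sketch has no dependencies.

```python
import random


def ols(rows, y):
    """Least-squares coefficients of y on rows (each row starts with 1.0 for the intercept)."""
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * v for r, v in zip(rows, y))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda m: abs(a[m][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for m in range(k):
            if m != col:
                f = a[m][col] / a[col][col]
                a[m] = [u - f * w for u, w in zip(a[m], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def stratified_effect(t, x, y, n_strata=5):
    """Weighted average of within-stratum treatment coefficients."""
    n = len(t)
    design = [[1.0] + list(row) for row in x]
    beta = ols(design, t)                        # step 1: treatment on covariates
    theta_hat = [sum(b * v for b, v in zip(beta, row)) for row in design]
    order = sorted(range(n), key=lambda i: theta_hat[i])  # step 2: sort by fitted values
    effect = 0.0
    for j in range(n_strata):
        idx = order[j * n // n_strata:(j + 1) * n // n_strata]
        rows = [[1.0, t[i]] + list(x[i]) for i in idx]
        coef = ols(rows, [y[i] for i in idx])    # step 3: outcome on treatment and covariates
        effect += coef[1] * len(idx) / n         # step 4: weight by n_j / N
    return effect


# Simulated data (hypothetical): one confounder drives both treatment and outcome.
rng = random.Random(0)
x = [[rng.gauss(0, 1)] for _ in range(600)]
t = [0.8 * xi[0] + rng.gauss(0, 1) for xi in x]
y = [-0.15 * ti + 0.5 * xi[0] + rng.gauss(0, 0.3) for ti, xi in zip(t, x)]

naive = ols([[1.0, ti] for ti in t], y)[1]       # no stratification, no covariates
stratified = stratified_effect(t, x, y, n_strata=5)
print(f"naive: {naive:.3f}, stratified: {stratified:.3f}, true effect: -0.150")
```

In this simulated setting the naive coefficient absorbs the confounder's influence, while the stratified estimate should land close to the assumed true effect. Real data offer no such guarantee, since unobserved covariates are not balanced by the score.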
5. URBANIZATION LEVEL AS A CONTINUOUS TREATMENT REGIME
The latent variable for urbanization level was estimated using confirmatory factor analysis
(CFA). Five indicators were specified to load on the latent variable urbanization level: population
density, average area of housing per person, ratio of households living in multifamily residences within
the district, ratio of renter households within the district and density of commercial facilities. Out of the
460 districts (Chōchōmoku in Japanese) that constitute the Hiroshima metropolitan area, the effective
sample size was 400 districts. Districts with zero population or housing area
were excluded from the sample as these areas are not inhabited. Industrial parks in the port area were
also excluded from the analysis. The model was estimated with Mplus 6, developed by Muthén &
Muthén, using the maximum likelihood estimator, which allows for calculation of goodness-of-fit
indices to evaluate the estimated factor solution. Acceptable goodness-of-fit thresholds were guided by
the values recommended by Hu and Bentler (19) as follows: root mean square error of approximation
RMSEA (≤0.06), standardized root mean square residual SRMR (≤0.08), comparative fit index CFI
(≥0.95) and Tucker-Lewis index TLI (≥0.95).
Goodness-of-fit indices suggest an acceptable model fit. The chi-square statistic with two degrees of
freedom was 1.813, yielding a p-value of 0.40; the lack of significance indicates that the
estimated variance-covariance matrix is not statistically different from the input matrix, thus suggesting
that the model adequately reproduces observed variations in the data. RMSEA was 0.000, with a CFit
value (probability RMSEA≤0.05) of 0.684. CFI and TLI were 1.000 and 1.001 respectively, while the
standardized root mean square residual (SRMR) was 0.008. No modification indices were above the
3.84 threshold. Completely standardized estimated parameters are illustrated in Figure 1.
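The reported fit statistics can be cross-checked with a short calculation. The sketch below recomputes the chi-square p-value and the RMSEA point estimate from the values quoted above and compares the reported indices against the Hu and Bentler cutoffs. The RMSEA formula with an (N - 1) scaling is one common convention and is an assumption here; the software used in the study may scale by N instead, which does not change the result in this case.

```python
import math

# Values reported in the text for the CFA model.
chi2, dof, n_districts = 1.813, 2, 400
reported = {"RMSEA": 0.000, "SRMR": 0.008, "CFI": 1.000, "TLI": 1.001}

# For two degrees of freedom the chi-square upper-tail probability is exp(-x / 2).
p_value = math.exp(-chi2 / 2.0)

# RMSEA point estimate (assumed convention); it is zero whenever chi2 <= dof.
rmsea = math.sqrt(max(chi2 - dof, 0.0) / (dof * (n_districts - 1)))

# Cutoffs as listed in the text: (direction, threshold).
cutoffs = {"RMSEA": ("<=", 0.06), "SRMR": ("<=", 0.08),
           "CFI": (">=", 0.95), "TLI": (">=", 0.95)}
passes = {name: (reported[name] <= c if op == "<=" else reported[name] >= c)
          for name, (op, c) in cutoffs.items()}

print(f"p = {p_value:.2f}, RMSEA = {rmsea:.3f}, all cutoffs met: {all(passes.values())}")
```

Because the chi-square statistic is smaller than its degrees of freedom, the RMSEA is truncated at zero, and a TLI slightly above 1 arises for the same reason.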
All estimated parameters were statistically significant at the 1% level. Factor loadings suggest
that all indicators are strongly related to the latent factor urbanization level (with explained variances
ranging from 0.52 to 0.85). The latent factor is positively correlated with all indicators except average housing
area per person, which exhibited a negative correlation, clearly illustrating the tradeoff between
accessibility and housing area that households face when deciding on residential location.
Figure 2 illustrates the geographical distribution of urbanization in the city. Urbanized areas
are mostly concentrated in flat areas, while the hilly areas that surround the city are less urbanized and
in most cases sparsely populated. Furthermore, the monocentric nature of the city is evidenced in the
distribution of the highly urbanized areas, concentrated near the harbor and gradually dispersing away.
6. TREATMENT EFFECT ESTIMATION
As explained in Section 4, the propensity score function $\hat{\theta}$ for the continuous
treatment variable urbanization level is estimated through an OLS regression. Covariates considered for
the estimation of the propensity score are summarized in Table 1. Covariate selection was based both
on findings from the literature and on theoretical considerations. Furthermore, variables that differed
considerably from the population distribution, as discussed in Section 3, were also introduced as
covariates. Estimation results are presented in Table 2. The R-squared of the final model was 0.37, suggesting
an acceptable model fit.
It is important to note that, as a prediction model, the object of interest of this regression is not
the individual coefficients of each explanatory variable but the scalar estimate $\hat{\theta}$, which, following the
balancing score assumption described in equation (3), balances all the covariates thought to affect
treatment allocation; this warrants the inclusion in the final model of variables that, although
theoretically significant, might be rendered insignificant due to multicollinearity.
To verify the balance of covariates given the estimated propensity score $\hat{\theta}$, as suggested
by Imai and van Dyk (18), each covariate was regressed against the original treatment variable; the same
regressions were run a second time, this time conditioning on $\hat{\theta}$. OLS was used for continuous
covariates while binary logit was used for dummy covariates. As Figure 3 illustrates, without controlling
for $\hat{\theta}$, most covariates are strongly correlated with the treatment, but once conditioned on the propensity
score, this correlation is considerably reduced, as is evident in the drop of the t-statistics for each covariate.
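The balance check for a continuous covariate can be illustrated as follows. This is a simplified sketch on simulated data, not the study's code: it covers only the OLS case (not the binary logit used for dummy covariates), and it uses the true conditional mean of the treatment in place of fitted values to keep the example short. The t-statistic is computed from the partial correlation, which is algebraically equivalent to the t-statistic of the treatment coefficient in the corresponding OLS regression. All names and coefficients are hypothetical.

```python
import math
import random


def demean(v):
    """Center a list of numbers at zero."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def residuals(y, x):
    """Residuals of a simple regression of y on x (with intercept)."""
    dx, dy = demean(x), demean(y)
    slope = sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)
    return [b - slope * a for a, b in zip(dx, dy)]


def t_stat(covariate, treatment, score=None):
    """t-statistic of the treatment when the covariate is regressed on it,
    optionally conditioning on the propensity score."""
    if score is None:
        rc, rt, dof = demean(covariate), demean(treatment), len(covariate) - 2
    else:
        rc, rt = residuals(covariate, score), residuals(treatment, score)
        dof = len(covariate) - 3
    r = sum(a * b for a, b in zip(rc, rt)) / math.sqrt(
        sum(a * a for a in rc) * sum(b * b for b in rt))
    return r * math.sqrt(dof / (1.0 - r * r))


# Simulated data (hypothetical): two covariates determine the treatment level.
rng = random.Random(1)
x1 = [rng.gauss(0, 1) for _ in range(600)]
x2 = [rng.gauss(0, 1) for _ in range(600)]
score = [0.8 * a + 0.5 * b for a, b in zip(x1, x2)]   # assumed known conditional mean
treatment = [s + rng.gauss(0, 1) for s in score]

before = t_stat(x1, treatment)           # covariate vs. treatment, no conditioning
after = t_stat(x1, treatment, score)     # same regression, conditioning on the score
print(f"t before: {before:.2f}, t after: {after:.2f}")
```

A large drop in the absolute t-statistic after conditioning is the pattern the text describes for Figure 3.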
Having verified that the estimate $\hat{\theta}$ balances observed covariates, the average treatment
effects of urbanization level on the outcome variables of interest were estimated, the outcomes being non-work
car trip frequency, non-work non-motorized trip frequency and their respective traveled distances;
all variables were introduced in the models in log form. As Table 3 illustrates, the sample is stratified
on $\hat{\theta}$ into roughly equal sub-classes j, and effect estimates are compared not only against the non-stratified
estimates but against the no-covariates estimates as well. The sensitivity of estimates to
stratification is tested by estimating treatment effects given different j; thus, the sample is stratified into
three, five and ten strata.
In general, all models support the hypothesis that higher urbanization levels have a negative
effect on car trip frequency and car traveled distances, and a positive effect on non-motorized trip
frequency and distances; however, the propensity score models, in particular those controlling for all
covariates, yield smaller effect magnitudes, suggesting that direct estimations (both with and without
covariate control) tend to overestimate the real effect of the built environment on trip frequency and
traveled distances for both modes.
In the case of car trip frequency, while the naïve estimate (no stratification, no covariates)
stands at -0.36, that is, a 36% reduction in car trips given a standardized unit increase in urbanization
level, stratified full model estimates (all covariates) range from -0.155 in the three strata estimate to
-0.128 in the ten strata estimate. In a similar manner, for the non-motorized trip frequency model, the
naïve estimate stands at 0.37, suggesting a 37% increase in walking and biking trips given a one unit
increase in the urbanization level index, while stratified full model effects range from 0.25 to 0.22, for
the three strata and the ten strata estimates respectively. Likewise, considerable reductions are observed
in the case of traveled distances.
When comparing the stratified full models against the non-stratified, all-covariates models,
although the direct estimation does indeed reduce bias, stratifying on the propensity score performs
better. Compared to the ten-strata full models, direct regression (all covariates) estimates are
overestimated by 36% and 6% in the car and non-motorized trip frequency cases respectively, and by 8%
and 20% in the cases of car and non-motorized travel distances.
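The overestimation percentages can be read as the relative gap between a direct estimate and the stratified one. The sketch below assumes that definition, which is not stated explicitly in the text, and applies it to the naïve and ten-strata trip frequency estimates quoted earlier in this section. The all-covariates direct estimates behind the 36% and 6% figures are reported in Table 3 and are not reproduced here, so the numbers computed below are for the naïve models only.

```python
def overestimation(direct, stratified):
    """Relative gap of a direct estimate over a stratified one (assumed definition)."""
    return (direct - stratified) / stratified


# Naive (no stratification, no covariates) vs. ten-strata full-model estimates from the text.
car = overestimation(-0.36, -0.128)
nonmotorized = overestimation(0.37, 0.22)
print(f"car: {car:.0%}, non-motorized: {nonmotorized:.0%}")
```

Under this definition a positive value means the direct estimate has the same sign as the stratified estimate but a larger magnitude.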
Certainly, given that both the true propensity score and the true population
parameter for the treatment of interest are unknown, the misspecification issue is a non-trivial one. In
spite of including a diverse range of covariates in the estimation of the propensity score function, and
verifying the balance of covariates after conditioning on $\hat{\theta}$, model misspecification is still possible.
Results suggest, however, that propensity score models are rather robust to model misspecification,
particularly as the number of strata j increases; that is, the range of estimates between the no-covariate models
and full-covariate models for each stratification scheme gets smaller as the number of strata increases, the
exception being the non-motorized traveled distance models. This exception might nevertheless be a result of
the aggregation of walk trips and bicycle trips, although this assertion needs to be verified through further
analysis.
7. DISCUSSION OF FINDINGS
Regarding the implications of the estimated results, empirical findings suggest the existence of a
mode substitution mechanism between car and non-motorized modes as a result of increases in
urbanization levels as measured by the estimated latent variable. Furthermore, high urbanization levels
were associated not only with fewer car trips but also with shorter car traveled distances, as well as a higher
non-motorized trip frequency and longer non-motorized traveled distances. These findings support the arguments of
advocates of compact cities as ways to reduce car dependency and promote travel by alternative modes.
Methodologically, stratification on the propensity score was shown to reduce estimation bias when
compared to both the naïve and direct regression estimates; furthermore, treatment effect estimates are
rather robust to misspecification, particularly as the number of strata gets larger.
It is important to highlight yet again the importance of the strong ignorability assumption.
That is, the assumption that the distribution of treatment outcomes is independent from the distribution
of treatment assignment given the propensity score is crucial to the unbiasedness of estimates. Nevertheless,
in practice it is impossible to know how well the estimated function approximates the true
population function. That being said, in order to estimate the propensity score function, variables widely
cited in the literature as relevant to residential location were introduced in the model; hence, it is
assumed that the estimated function is a good estimate of the true unknown function.
In terms of the overall contribution to the field, the presented methodology helps overcome
some of the limitations of existing program evaluation approaches in the transportation literature,
particularly the binary treatment assumption which usually polarizes the built environment into two
extremes, either urban or suburban, disregarding the large variability in district characteristics within
cities. A continuous urbanization level treatment, such as the one developed in this article, thus allows for a
more precise understanding of the built environment effect on travel behavior at all levels of the
urbanization spectrum without the need to arbitrarily draw a defining line between “urban” and
“suburban”, to which binary treatment models might be highly sensitive. In addition, the CFA estimation of
a latent variable score for urbanization level allows for the calculation of goodness-of-fit statistics to
evaluate the estimated solution and provides statistical support for the proposed index. Certainly,
methodologically there is still room for improvement. Subject to data availability, other better-fitting
model specifications are possible. Furthermore, the use of a regular spatial unit such as a rectangular
grid instead of the existing demarcations (e.g. census tracts, districts) might make results less prone to
modifiable areal unit problems.
8. CONCLUSIONS
This study evaluated the built environment-travel behavior connection using a propensity
score approach under a continuous treatment regime, thus overcoming the limitations of constraining
variations in the built environment to a binary treatment. To do so, a latent variable for urbanization
level was estimated and considered as the treatment of interest. A mode substitution effect from car to
non-motorized modes was observed given positive changes in the urbanization level. The implemented
propensity score stratification approach was also successful in mitigating self-selection bias. Compared
against the ten-strata weighted estimates, direct regression treatment effects are overestimated by 36% and
6% in the car trip frequency and non-motorized trip frequency models respectively, and by 8% and 20%
in the car and non-motorized traveled distance models. Findings provide supporting evidence
regarding a causal effect of the built environment on some dimensions of travel behavior, namely trip
frequency and travel distances for car and non-motorized modes.
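The overestimation figures can be approximately reproduced from the rounded coefficients in the "All covariates" rows of Table 3. The short sketch below assumes that overestimation is measured as the ratio of the direct regression coefficient (no stratification) to the ten-strata weighted coefficient, minus one; the article does not spell out this formula, so the definition and the names `table3` and `overestimation_pct` are illustrative.

```python
# (direct regression, ten-strata weighted) coefficients, "All covariates" rows of Table 3
table3 = {
    "Car trip frequency": (-0.174, -0.128),
    "NMM trip frequency": (0.234, 0.220),
    "Car traveled distance": (-0.366, -0.339),
    "NMM traveled distance": (0.243, 0.201),
}


def overestimation_pct(direct, stratified):
    """Relative excess of the direct estimate over the stratified one, in percent."""
    return (abs(direct) / abs(stratified) - 1.0) * 100.0


results = {model: overestimation_pct(*coefs) for model, coefs in table3.items()}
for model, pct in results.items():
    print(f"{model}: {pct:.1f}%")
```

With the rounded table values, the first three ratios round to the reported 36%, 6% and 8%. The last one comes out slightly below 21% rather than the reported 20%, a gap that may be due to rounding of the published coefficients.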
ACKNOWLEDGEMENTS
All spatial data used for the analysis presented in this article were provided by the Center for Spatial
Information Science of The University of Tokyo. CSIS joint research No.479.
This study was supported by JSPS KAKENHI Grant No. 23246091.
REFERENCES
(1) M. Boarnet and R. Crane. The influence of land use on travel behavior: specification and
estimation strategies. Transportation Research Part A, vol. 35, 2001, pp. 823-845.
(2) X. Cao, P. Mokhtarian and S. Handy. Examining the impacts of residential self-selection on travel
behavior. Transport Reviews, vol. 29, no. 3, 2009, pp. 359-395.
(3) J. Heckman. Sample selection bias as a specification error. Econometrica, vol. 47, no. 1, 1979,
pp. 153-162.
(4) B. Zhou and K. Kockelman. Self-selection in home choice: Use of treatment effects in evaluating
the relationship between the built environment and travel behavior. In Transportation Research
Record: Journal of the Transportation Research Board, No. 2077, Transportation Research Board
of the National Academies, Washington DC, 2008, pp. 54-61.
(5) X. Cao. Disentangling the influence of neighborhood type and self-selection on driving behavior:
an application of sample selection model. Transportation, vol. 36, 2009, pp. 207-222.
(6) P. Rosenbaum and D. Rubin. The central role of the propensity score in observational studies for
causal effects. Biometrika, vol. 70, no. 1, 1983, pp. 41-55.
(7) X. Cao, Z. Yu and Y. Fan. Exploring the connections among residential location, self-selection,
and driving: Propensity score matching with multiple treatments. Transportation Research Part A,
vol. 44, 2010, pp. 797-805.
(8) R. Boer, Y. Zheng, A. Overton, G. Ridgeway and D. Cohen. Neighborhood design and walking
trips in ten U.S. metropolitan areas. American Journal of Preventive Medicine, vol. 32, no. 4, 2007,
pp. 298-304.
(9) X. Cao. Exploring causal effects of neighborhood type on walking behavior using stratification of
propensity score. Environment and Planning A, vol. 42, 2010, pp. 487-504.
(10) L.-F. Lee. Generalized econometric models with selectivity. Econometrica, vol. 51, no. 2, 1983,
pp. 507-512.
(11) G. Imbens. The role of the propensity score in estimating dose-response functions. Biometrika, vol.
87, 2000, pp. 706-710.
(12) M. P. Couper. Web surveys: A review of issues and approaches. The Public Opinion Quarterly,
vol. 64, no. 4, 2000, pp. 464-494.
(13) Ministry of Internal Affairs and Communications (MIC). Communications usage trend survey in
2011 compiled. Tokyo, Japan, 2012.
(14) B. Verplanken, H. Aarts, A. van Knippenberg and C. van Knippenberg. Attitude versus general habit:
Antecedents of travel mode choice. Journal of Applied Social Psychology, vol. 24, no. 11, 1994,
pp. 285-300.
(15) R. Kitamura, P. Mokhtarian and L. Laidet. A micro-analysis of land use and travel in five
neighborhoods in the San Francisco Bay Area. Transportation, vol. 24, 1997, pp. 125-158.
(16) National Tax Agency. Heisei 23 nenbun minkan kyuuyo jittai chousa (Private income statistical
survey for 2011). Tokyo, Japan, 2012.
(17) P. Rosenbaum and D. Rubin. Reducing bias in observational studies using subclassification on the
propensity score. Journal of the American Statistical Association, vol. 79, no. 387, 1984, pp. 516-524.
(18) K. Imai and D. A. van Dyk. Causal inference with general treatment regimes: Generalizing the
propensity score. Journal of the American Statistical Association, vol. 99, no. 467, 2004, pp. 854-866.
(19) L. Hu and P. Bentler. Cutoff criteria for fit indexes in covariance structure analysis: Conventional
criteria versus new alternatives. Structural Equation Modeling, vol. 6, 1999, pp. 1-55.
List of table titles and figure captions
TABLE 1. Descriptive Statistics of Covariates
FIGURE 1. Path Diagram of “Urbanization Level” Latent Variable Estimation
FIGURE 2. Urbanization Level Map of Hiroshima City.
TABLE 2. Propensity Score OLS Estimation Results
FIGURE 3. Standard Normal Quantile Plots of t-Statistics
TABLE 3. Estimation Results of Causal Effects of Urbanization Level on Travel Behavior
TABLE 1. Descriptive Statistics of Covariates

| Variable | Mean | Std. Dev. | Min. | Max. |
|---|---|---|---|---|
| Household Characteristics | | | | |
| Household size | 2.667 | 1.231 | 1 | 7 |
| Single household | 0.173 | 0.379 | 0 | 1 |
| Pre-school children | 0.127 | 0.333 | 0 | 1 |
| Members over 65 years in household | 0.205 | 0.404 | 0 | 1 |
| Owner of detached house | 0.198 | 0.399 | 0 | 1 |
| Owner of apartment | 0.302 | 0.459 | 0 | 1 |
| Number of drivers in household | 2.420 | 0.892 | 1 | 5 |
| Income (<US$ 40,000) | 0.326 | 0.469 | 0 | 1 |
| Income (US$ 40,001-60,000) | 0.283 | 0.451 | 0 | 1 |
| Income (US$ 60,001-80,000) | 0.164 | 0.371 | 0 | 1 |
| Income (US$ 80,001-100,000) | 0.117 | 0.322 | 0 | 1 |
| Income (>US$ 100,001) | 0.108 | 0.211 | 0 | 1 |
| Number of bikes in household | 2.393 | 1.200 | 1 | 9 |
| Number of cars in household | 1.162 | 0.759 | 0 | 5 |
| Individual Characteristics | | | | |
| Gender | 0.522 | 0.500 | 0 | 1 |
| Age | 46.835 | 14.487 | 20 | 79 |
| Worker | 0.532 | 0.499 | 0 | 1 |
| Car habit (Response frequency index) | 5.745 | 3.214 | 0 | 10 |
| Grew up in large metropolitan area (Tokyo, Osaka, etc.) | 0.015 | 0.122 | 0 | 1 |
| Grew up in the suburbs of large metropolitan area | 0.047 | 0.211 | 0 | 1 |
| Grew up in a regional city (Hiroshima, Fukuoka, etc.) | 0.287 | 0.453 | 0 | 1 |
| Grew up in the suburbs of a regional city | 0.393 | 0.489 | 0 | 1 |
| Grew up in a small city | 0.143 | 0.351 | 0 | 1 |
| Grew up in a village | 0.065 | 0.247 | 0 | 1 |
| Grew up in the remote countryside | 0.050 | 0.218 | 0 | 1 |
| Attitudes: Car lover* | 0.000 | 1.000 | -2.912 | 1.632 |
| Attitudes: Pro-transit and non-motorized modes* | 0.000 | 1.000 | -4.971 | 2.206 |
| Attitudes: Suburban* | 0.000 | 1.000 | -2.069 | 3.026 |
| Works in city center (Naka, Minami and Higashi wards) | 0.412 | 0.493 | 0 | 1 |
| Dependent Variables** | | | | |
| Non-work car trip frequency | 2.4366 | 3.3815 | 0 | 23 |
| Non-work non-motorized trip frequency | 3.4066 | 3.941 | 0 | 20 |
| Non-work car traveled distances | 15.583 | 25.224 | 0 | 228 |
| Non-work non-motorized traveled distances | 6.073 | 10.067 | 0 | 92.93 |

*3 factor CFA solution using maximum likelihood estimator. Full estimation results available upon request.
**Dependent variables were log-transformed for model estimations.
FIGURE 1. Path Diagram of “Urbanization Level” Latent Variable Estimation
FIGURE 2. Urbanization Level Map of Hiroshima City.
TABLE 2. Propensity Score OLS Estimation Results

| Variable | Coefficient | S.E. | t stat. | P value |
|---|---|---|---|---|
| **Constant** | 0.854 | 0.188 | 4.546 | 0.000 |
| Age | 0.000 | 0.003 | -0.144 | 0.886 |
| **Male** | -0.120 | 0.058 | -2.053 | 0.040 |
| **Household size** | -0.109 | 0.033 | -3.276 | 0.001 |
| Single household | 0.065 | 0.097 | 0.665 | 0.506 |
| **Children in preschool/elementary school** | 0.125 | 0.075 | 1.680 | 0.093 |
| **Members over 65 in HH** | -0.163 | 0.092 | -1.768 | 0.077 |
| **Number of cars** | -0.105 | 0.050 | -2.071 | 0.038 |
| **Number of bicycles** | 0.085 | 0.022 | 3.945 | 0.000 |
| Middle income (US$40,000~100,000) | 0.051 | 0.067 | 0.750 | 0.453 |
| **High income (>US$100,001)** | 0.190 | 0.096 | 1.971 | 0.049 |
| **Car habit** | -0.049 | 0.012 | -4.260 | 0.000 |
| Grew up in large city or regional city | -0.089 | 0.087 | -1.017 | 0.309 |
| Grew up in large or regional city suburbs | -0.126 | 0.083 | -1.522 | 0.128 |
| **Grew up in the countryside or village** | -0.349 | 0.114 | -3.065 | 0.002 |
| **House owner** | -0.285 | 0.062 | -4.610 | 0.000 |
| Car lover | 0.002 | 0.032 | 0.052 | 0.958 |
| Pro-transit and non-motorized modes | 0.036 | 0.031 | 1.167 | 0.243 |
| **Suburban preference** | -0.140 | 0.029 | -4.873 | 0.000 |
| **Job located in city center** | 0.268 | 0.059 | 4.538 | 0.000 |

| Model statistic | Value |
|---|---|
| Standard deviation | 0.78 |
| Sum of squares | 195.500 |
| Number of observations | 517 |
| Standard error of e | 0.627 |
| Parameters | 20 |
| R2 | 0.370 |
| Degrees of freedom | 497 |
| Adjusted R2 | 0.346 |
| F[19, 497] (prob) | 15.38 (0.00) |

Variables in bold are significant at least at the p<0.10 level.
FIGURE 3. Standard Normal Quantile Plots of t-Statistics
[Figure: two normal quantile plots, "Without controlling for θ" and "After controlling for θ". Horizontal axis: standard normal quantiles; vertical axis: quantiles of t-statistics.]
TABLE 3. Estimation Results of Causal Effects of Urbanization Level on Travel Behavior

| Model | Covariates | No stratification | 3 Strata | 5 Strata | 10 Strata |
|---|---|---|---|---|---|
| Car trip frequency | No covariates | -0.362 (0.042) | -0.204 (0.094) | -0.182 (0.121) | -0.148 (0.053) |
| Car trip frequency | All covariates | -0.174 (0.048) | -0.155 (0.084) | -0.146 (0.110) | -0.128 (0.050) |
| NMM trip frequency | No covariates | 0.376 (0.046) | 0.289 (0.109) | 0.248 (0.149) | 0.199 (0.064) |
| NMM trip frequency | All covariates | 0.234 (0.059) | 0.255 (0.100) | 0.214 (0.129) | 0.220 (0.059) |
| Car traveled distance | No covariates | -0.753 (0.067) | -0.444 (0.155) | -0.388 (0.198) | -0.342 (0.087) |
| Car traveled distance | All covariates | -0.366 (0.076) | -0.361 (0.141) | -0.329 (0.180) | -0.339 (0.084) |
| NMM traveled distance | No covariates | 0.436 (0.056) | 0.322 (0.130) | 0.283 (0.175) | 0.317 (0.075) |
| NMM traveled distance | All covariates | 0.243 (0.071) | 0.256 (0.119) | 0.216 (0.145) | 0.201 (0.070) |
Variance for weighted coefficients (in parentheses) was calculated as [equation not recoverable], where Wj is the weight
of each stratum j, where [equation not recoverable].