Pedestrian Gestures Increase Driver Yielding at
Uncontrolled Mid-block Road Crossings
Xiangling Zhuang Changxu Wu*
To protect pedestrians, many countries give them priority at uncontrolled
mid-block crosswalks or pedestrian crossings. However, the actual driver yielding rate
is not always satisfactory (only 3.5% in this study). To increase the yielding rate, this
study proposed eleven pedestrian gestures to inform drivers of their intent to cross.
The gestures were evaluated based on the process of human interaction with
environment. Four gestures were selected as candidates to test in field experiments
based on scores for visibility, clarity, familiarity and courtesy (see illustration in
Figure 2): (1) right elbow bent with hands erect and palm facing left (R-bent-erect), (2)
left elbow bent with hands level and palm facing left (L-bent-level), (3) left arm
extended straight to left side with palm erect facing left (L-straight-erect), and (4) a ‘T’
gesture for “Time-out”. In the experiment, confederate pedestrians waiting at the
roadside displayed the gestures (baseline: no gesture) to 420 vehicles at 5 sites in
Beijing, China. When pedestrians used the L-bent-level gesture, the vehicle yielding
rate was more than triple that in the baseline condition. The L-bent-level gesture also
resulted in a significant decrease in driving with unchanged speed (63.5% to 38.8%)
and had no significant side effects in terms of drivers’ horn use or lane changing. The
effects of such gestures in other contexts such as when pedestrians are in the
crosswalk and when they are interacting with turning vehicles are discussed, together
with the applications in training vulnerable pedestrian groups (children or elderly) and
facilitating pedestrian detection by drivers.
Keywords: pedestrian gestures, yield rate, persuasion, road and street crossings,
uncontrolled mid-block crosswalks
1.1. Road rights at unsignalized crosswalks
Pedestrian safety worldwide is threatened: the numbers of pedestrian deaths and their
proportions among all road fatalities in low-, middle- and high-income countries are
227,835 (45%), 161,501 (29%) and 22,500 (18%), respectively (Naci et al., 2009). To protect
pedestrians, engineering approaches (e.g. traffic lights), together with educational
approaches have been stressed (Hebert Martinez and Porter, 2004). However, for
efficiency or cost reasons, traffic lights are usually not installed in places that do not
meet certain warrants on pedestrian or vehicle volume, etc. (General Administration
* Corresponding Author. Dr. Changxu Wu, PhD. University of Michigan. Email: firstname.lastname@example.org
of Quality Supervision, 2006). To compensate for the potential risk resulting from
limited protection facilities, traffic laws in many countries require drivers to yield to
pedestrians at these sites (e.g. Hakkert et al., 2002, China State Council 2005).
However, the marked crosswalks have still been found to be dangerous, even when
compared with unmarked ones (Koepsell et al., 2002). In fact, Zegeer et al.’s (2002)
comparison of 2000 marked and unmarked crosswalks in the USA showed that on
multi-lane roads with vehicle volume higher than 12,000 per day, marked crosswalks
could be riskier than their unmarked counterparts. Although this has been claimed to
be the result of pedestrians’ decreased carefulness in crossing (Leden et al., 2006),
drivers’ not obeying the yielding regulation contributes much to the problem. In
Ibrahim et al.’s (2005) observation in Malaysia, most pedestrians had difficulty in
crossing because the drivers did not yield to them. Várhelyi (1998) also observed that
95% of drivers in Sweden did not give way when pedestrians were present. It is
therefore important to explore which approaches may help to increase driver yielding.
1.2. Strategies to promote yielding
According to Lewin’s equation (Sansone et al., 2004), human behaviors are
determined both by the person and the environment:
Behavior = f(person, environment)
In the context of driver yielding behavior, the “person” element refers to top-down
factors like drivers’ attitude towards pedestrians, their understanding of the right of
way, or their driving skills. Such personal factors have been found to influence drivers’
yielding rate in natural observations. Piff et al.’s (2012) observations in San Francisco
found that drivers with higher social status are less willing to yield to pedestrians.
Ibrahim et al. (2005) also explained that drivers’ observed failure to stop was because
they either did not care about pedestrians or because of their misunderstanding of the
rules of the road. Meanwhile, environmental factors are bottom-up determinants of
behaviors. Researchers have identified several such factors influencing yielding
behavior including speed limits (Turner et al., 2007), pedestrian’s distance from the
kerb (Himanen and Kulmala 1988), pedestrian’s clothes (Harrell, 1993) and the
number of pedestrians waiting to cross (Sun et al., 2003).
Theoretically, both types of factor help to understand driver yielding behavior.
When it comes to actively manipulating factors to get a higher yielding rate, however,
personal factors like social status are impossible or much more difficult to control
than environmental factors. Therefore, previous studies aiming to increase driver
compliance have resorted to changing the latter, on the basis that environmental
information can make a difference when processed in the human mind properly. The
SIFT model (Straker, 2008) states that an individual’s inner process of interacting
with the outer world has four phases: sensing, inferring meaning, formulating intent
and translating into actions. Based on this model, the “person” element in Lewin’s
equation (Sansone et al., 2004) in the context of driver yielding can be elaborated as
in Figure 1. First, drivers sense the surrounding environment, mostly via vision. For
instance, drivers may see a line of white triangular markings on the road ahead of a
crossing. Second, drivers interpret what the scene means. In the above example, they
may remember that the marking is a reminder of crosswalks ahead, and they need to
yield to pedestrians. Third, considering that not yielding is against traffic regulations,
they form a yielding intention. Finally, the driver translates the intention into action:
braking. This process also holds when applied to drivers’ responses to other
treatments such as prompt signs that display the text “yield to pedestrians” (Van
Houten and Malenfant 1992, Huybers et al., 2004, Benekohal et al., 2007), pedestrian
activated flashing beacons (Schroeder, 2008) and responsive warning lights that flash
when pedestrians are detected (Hakkert et al., 2002).
Figure 1. Drivers’ interaction with the environment (elaborating Lewin’s equation in
the driver yielding context based on the SIFT model)
Emphasizing mental activities, the SIFT model focuses on the personal element
(Straker 2008). In Figure 1, requirements for “environment” elements corresponding
to the first three phases have also been added. “Visibility” refers to how easily a
treatment can be identified from its surroundings. “Clarity” means that the intended
meaning of a treatment should not be misinterpreted, and “Motive power” requires
that a treatment has to connect with a motivator that can push the driver towards a
desired action. In other words, a treatment should have high visibility to facilitate the
sensing phase, as well as high clarity to avoid misinterpretation, and a strong
connection with motivators to encourage intent formation. In fact, in traffic sign
design and evaluation, understandability (i.e. clarity) and conspicuity (i.e. visibility)
have been considered by experts to be the most important two principles (Dewar
Considering the three criteria, previous mainstream treatments can be assessed as
in Table 1 (for the moment, please ignore the grayed columns). All the treatments
have medium to high visibility, and can convey the meaning clearly after training.
Among them, prompt signs can stimulate different motivations, depending on the text
on the sign. Most of them can remind drivers of the law (Van Houten and Malenfant
1992, Huybers et al., 2004, Benekohal et al., 2007), while others may encourage
yielding via social approval (Nasar 2003). Advance yield markings ahead of
crosswalks can also increase yielding by informing drivers of approaching crossings
nearby (Huybers et al., 2004). In addition to these static approaches, flashing beacons
and responsive lights can dynamically show the position of the crosswalk, thus
increasing visibility and law awareness.
Table 1. Comparison of approaches aiming to increase driver yielding rate
Although the above treatments have been successful in terms of effectiveness,
hidden dimensions may undermine them (see the last 3 columns of Table 1). First, all
the facilities need to be built by a third party (e.g. the transport ministry) beyond the
drivers and pedestrians who are the main parties involved in the context. Another
important attribute of the treatments is whether they are responsive – i.e., can be
activated by the user. This is important because responsive treatments like pedestrian
activated flashing beacons (Schroeder, 2008) and responsive lights only operate when
needed, thus they are less disturbing to drivers when no pedestrians are around.
Compared with devices that operate regardless of pedestrians’ existence, the
responsiveness attribute of a signal also reinforces the connection between the yielding
behavior and the signal, thus facilitating drivers’ future responses to such warnings.
Unfortunately, responsive facilities are currently very expensive to install.
This study therefore aims to explore an alternative approach to traditional driver
warnings. Besides the three basic requirements (visibility, clarity, motive power), the
method must be able to work without any need to install equipment by a third party
and should also be responsive and cheap to apply. A promising candidate that satisfies
all the requirements is to allow pedestrians, in a sense, to “step out, tell drivers their
crossing intention, and ask drivers to yield”. Of course, the road context is often very
noisy and complex, thus potentially effective ways to “tell” and “ask” in this context
must be non-verbal. Some possible strategies can be gleaned from the way drivers
communicate with each other using blinkers, headlamps, horn-use, car movements
and gestures (Renge 2000). Among these approaches, gestures can be easily
understood, indicating their potential for use by pedestrians. In fact, Renge (2000)
found that, although novice drivers do not clearly understand informal device-based
signals (e.g. flashing headlights at cars cutting in), they perform better in explaining
informal gesture-based signals. Moreover, gestures are natural, cost-efficient, and can
be used whenever and wherever needed. These advantages therefore motivated us to
explore how they may influence drivers’ behaviors at uncontrolled crosswalks.
1.3. Pedestrian gestures as a candidate
Recall that we have stated three basic dimensions that can be used to assess a
signal: visibility, clarity and motive power. In terms of visibility, gesturing is dynamic
and therefore can be more prominent than using signs and markings. Underwood et
al.’s (2003) work showed that drivers’ attention was more easily attracted to moving
than static objects. In terms of clarity, different gestures have varying clarities, thus
we need to choose the most effective. For the motivation element, driver compliance
may be attainable due to commanding or polite gestures, which can map to two
compliance gaining strategies: “assertion” and “direct request” (Kellermann and Cole
1994). However, it is also possible that the gestures may not work because people are
more likely to comply with authority (the official traffic controls) than with ordinary
pedestrians (Cialdini and Goldstein 2004).
To our knowledge, the only study on pedestrian gestures in the literature is by
Crowley-Koch et al. (2011) conducted in Chicago and western Michigan. They
compared drivers’ yielding rates under three conditions: (1) no gesture or prompt, (2)
extended arm (where pedestrians “extend the right arm into the crosswalk at 90
degrees” with the palm facing drivers), and (3) raised arm (where pedestrians “held
the left hand up at chest height in front of the body with the elbow bent, palm facing
the driver”). The results indicated that the yielding rate increased in the latter two
conditions compared with no prompt.
However, some points need to be considered
in this pioneering work. First, seven of the ten sites in their study had only 2 lanes (the
other three had 3 or 4), and the reported vehicle volumes were low. In other words,
pedestrians’ risk at these sites was low. Since vehicle speed is usually lower on narrow
roads (Godley et al., 2004), drivers are more likely to yield at these sites. Second, the
study did not report the gestures’ effect in other aspects except for the actual “yielding
rate”. In some cases, drivers may have slowed down to see what was going on as a
natural response to novel stimuli (i.e. a gesture), without the real intention of yielding.
Although the overt behavior (slowing down) may be the same, the interpretation (i.e.
curiosity about a novel stimulus vs. wanting to yield) ultimately determines whether the
gestures will still be effective in the long term. In addition, gestures may be
effective in increasing yielding rates but at the same time cause other problems. For
example, the drivers may use the horn at pedestrians, which is frequently observed in
daily life (in this study, the horn-use rate is 15.3%). Side effects like this were not
evaluated. Finally, some gestures are culture dependent (Archer 1997), as a proper
gesture in one culture may cause problems in a different one. In fact, the “extended
arm” seems to be a gesture for “flagging a taxi” in China, which may not gain a high
yielding rate. To avoid such problems, an evaluation of gestures and of drivers’
understanding is needed, based on the three basic elements already mentioned:
visibility, clarity, and motive power.
No statistical test results were provided, but the baseline yielding rate was 1.9%~31.5% across the 10 sites, much
lower than the yielding rate for the latter two conditions which were respectively 9%~63.6%, and 18.5%~68.8%.
Gestures have good visibility due to their dynamic features when compared with
static signs (See Section 1.2). However, to ensure that drivers can see pedestrians at a
distance, comparison of the visibility of different gestures is still necessary. The
“motivator” of traditional treatments is based on enforcement of the law and driving
tests. Gestures do not currently have such a legal status. Nevertheless, a possible
motivator is the courtesy dimension indicating whether a gesture is politely requesting
or forcefully commanding a right to cross. “Assertion” and “Direct request” are
compliance gaining techniques that change behavior via social influence (Kellermann
and Cole 1994), so they may indirectly reflect whether a gesture has a strong driving
force. Moreover, this dimension is helpful in that when all other aspects are equivalent,
courteous gestures may help to maintain a harmonious relationship among road users.
In addition to these basic dimensions, general ergonomic principles for evaluating
traffic signs also include familiarity (Ben-Bassat and Shinar 2006), which can
facilitate the learning of signals.
To sum up, pedestrians’ rights at unsignalized crosswalks are potentially at risk in
many parts of the world. To increase driver yielding rates, researchers have proposed
several environmental changes to remind drivers including yield markings, prompt
signs, and responsive lights. Although such measures can be effective, they need to be
built by official authorities, and may have high cost implications. An alternative or
supplementary solution has therefore been proposed that pedestrians themselves
actively use arm gestures to ask drivers to stop. The following Section 2 evaluates
eleven gestures for visibility, clarity, familiarity and courtesy. Four of them were then
selected to be evaluated in field experiments described in Section 3 to explore how
different gestures influenced drivers’ yielding, horn-use and lane changing behaviors.
Section 4 discusses the implications and limitations of the gestures.
2. Evaluation of gestures
The evaluation process is necessary for two reasons. The first is that observation
of behavior can only reflect a response, but cannot reveal mental activities that
motivated it. For instance, a driver may slow down to see what is happening when the
pedestrian is making a gesture rather than yield due to understanding the pedestrians’
request. The other purpose of the evaluation is to screen the gestures in order to
balance the experiment implementation cost and inclusion of many gestures. To avoid
omitting potentially effective gestures, eleven were included in the study. However, if
all eleven had been tested directly in field experiments, 2,400 crossing attempts
would have been necessary. After the screening process, which resulted in four
gestures, only 1,000 attempts were needed.
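The attempt counts follow from the design described in the accompanying footnote: two crossing directions, one level per gesture plus a no-gesture baseline, and 100 trials per level. A minimal sketch of that arithmetic is given below; the function and parameter names are illustrative and do not come from the study.

```python
def crossing_attempts(n_gestures, trials_per_level=100, directions=2):
    """Attempts = directions x levels x trials per level.

    The number of levels is the number of gestures plus one
    no-gesture baseline condition.
    """
    levels = n_gestures + 1
    return directions * levels * trials_per_level


# Eleven candidate gestures versus the four retained after screening.
print(crossing_attempts(11))  # 2400
print(crossing_attempts(4))   # 1000
```

Under these assumptions, screening reduces the required field work by 1,400 attempts.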
2 Number of attempts for testing eleven gestures: 2 (back and forth) x 12 levels (baseline, eleven gestures) x 100
times / level = 2,400 times. Similarly, number of attempts needed for testing four gestures: 2 (back and forth) x 5
levels (baseline, four gestures) x 100 times / level = 1,000 times.
As part of the research, thirty-two drivers, recruited in a continuing education
class and on the road in Beijing, China, participated in the pedestrian gesture
evaluation. Two were taxi drivers and the others were private car owners. The age
range was from 25 to 56, with an average age of 36. Among them, 44% were males.
On average, they had been licensed drivers for 6.9 years and usually drove for 1.9
hours on a daily basis. They were paid 25 RMB for their participation.
After reviewing webpages searched with the keyword “pedestrian gestures” (in
Chinese) and referring to daily experience, eleven gestures were selected as
evaluation candidates. An illustration of the gestures can be seen in Figure 2. The
following shows the sources of these gestures (some have several).
G1, G5: Adapted from internet news introducing how to cross the road safely.
G1, G2, G3, G6: Daily observation in Beijing, China.
G3, G8: Used by Crowley-Koch et al. (2011).
G4, G7, G5, G9: Adapted from other contexts to show praise, respect or request.
G10: Integrating G2 and G6 to increase visibility.
G11: A gesture often used to represent “stop, terminate, and halt” in sports.
Figure 2. Illustration of the eleven proposed and evaluated gestures. The first row
shows five left-arm gestures: L-straight-erect, L-straight-level, L-bent-level, L-ok,
and L-thumb-up. The second row has three right-arm gestures: R-bent-erect, R-salute,
and R-straight-level. The final three gestures in the last row were performed with both
arms: Hold fist salute, L-straight-level-R-bent-erect, and T-gesture. “Straight” and
“bent” refer to the state of the arm or elbow, while “erect” and “level” refer to the state
of the hands. The order number “GX” means the Xth Gesture.
Example: news from Shenzhen city describing a social program called “Civilized and Courteous Zebra”:
http://dnsb.cnnb.com.cn/portal.php?mod=view&aid=3609 accessed 2013-4-7
Gesture photographs (not diagrams as in Figure 2) used in the evaluation were
taken from drivers’ viewpoints using models displaying the gestures at the kerb with
one foot on an unsignalized crosswalk. Both female and male versions of the pictures
were taken. Thus, 22 color pictures were used in the evaluation.
A short questionnaire with six questions was developed to evaluate each arm
gesture. The questions are as follows:
Q1. Can you see the pedestrian clearly? (7 point Likert scale, test visibility)
Q2. What do you think his/her gesture means in the current context? ____________
Q3. How definite is the gesture in conveying the meaning you answered in Q2? (7
point Likert scale, test clarity)
Q4. How often do you see this gesture on the road? (7 point Likert scale, test familiarity)
Q5. Is the pedestrian commanding or politely asking someone to do something? (7
point Likert scale, test courtesy)
Q6. What are you most likely to do in this case? (Multiple choice, choose among five
responses: Speed up; Do not change speed; Pass by the pedestrian with reduced
speed; Slow down to let the pedestrian cross; Stop to let pedestrians pass).
For all the 7-point Likert scales, higher values represent better visibility, stronger
clarity, higher familiarity, and more courtesy, and the neutral response is 4. The
questions were constructed so that all the dimensions stated in Section 1.3 could be
evaluated. An additional question about stated behavior when coming across such
gestures was included to compare with actual behavior in the subsequent field test.
The gesture photographs were printed out in color format and shown to the
drivers one by one in random order. First, participants were instructed to notice that
the pedestrians in all pictures were standing at a marked but uncontrolled mid-block
crosswalk. Then for each gesture, both male and female versions of the gestures were
given to participants before answering the corresponding survey questions. They were
told to ask the researchers standing beside if they could not see a gesture clearly in the
photograph. (In these rare instances, as they would not be able to answer the other
questions, researchers showed them the gesture to help them finish the evaluation.) To
minimize the experimenter effect, researchers did not look directly at participants
during the evaluation except when explaining the task. They changed the pictures
whenever a gesture evaluation was completed.
Since the meanings of the gestures were gained from an open-ended question, the
answers were coded into eight categories as shown in Table 2. The category “Yield”
includes answers such as “please let me cross first”, while “Yield (stop)” and “Yield
(slow down)” refer to answers that clearly stated the way of yielding, such as “stop, I
want to cross the road” or “I plan to cross, please slow down.” These three categories
were the “intended” meanings of requests to yield. However, there were other
answers: “Flag a taxi”, “Ask for a lift”, “Yield to drivers” (e.g. “you go
first”), and “Not clear”. The category “Other” refers to answers that only indicated
the general meaning of the gesture regardless of the current context (e.g. R-salute:
“salute”, L-ok: “Ok” and L-thumb-up: “praising me”).
In Table 2, darker backgrounds show more popular choices within each category
or higher score in the gesture attributes. The results indicate that all gestures had good
visibility, except G4 (L-ok) which, together with G7 (R-salute), was considered
unclear in conveying meaning. Meanwhile, although gestures G5 and G8 had good
clarity, this referred respectively to “asking for a lift” and “flagging a taxi” rather than
“yielding”. To rank the gestures in a systematic way, the evaluation criteria, in order
of significance, were as follows:
(1) Correct understanding of meaning
(2) Good visibility and clarity
(3) Greater familiarity and courtesy
(4) Drivers’ stated responses to the gestures (e.g. slow down, stop) were only
subsidiary references to rank the gestures.
The eleven gestures were ranked based on the above criteria by three researchers
with the paired comparison method. The final ranks were ordered according to their
increasing appropriateness and displayed in Table 2. In the table, the dotted line
separates desirable and undesirable understandings. Drivers’ understanding of the first
four gestures (G4, G8, G7 & G5) was mostly distributed below the dotted line. G4
was in fact interpreted as having the opposite meaning: yield to drivers. The middle
three gestures (G9, G2 & G10) were better understood but still caused confusion
among some: the meanings were misinterpreted or the clarity was not high. The
remaining four gestures (G6, G11, G3 & G1 in bold) outperformed the others.
Although none were considered particularly courteous and only G1 was familiar to
drivers, important aspects of the gestures were satisfactory, especially criteria (1) and
(2), with almost all correctly interpreted with high scores on visibility and clarity.
Therefore, these four (G6, G11, G3 & G1) were selected to be tested in field contexts
in Section 3.
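The text does not detail how the three researchers' paired comparisons were combined into a single ranking. One simple scheme, sketched below as an assumption rather than as the study's procedure, counts how often each item is preferred across all pairs and all judges, then orders the items from fewest to most wins (negative to positive). The items "A", "B" and "C", the scores, and the judge functions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def make_judge(scores):
    """Build a hypothetical judge that prefers the item with the higher score."""
    return lambda a, b: a if scores[a] >= scores[b] else b


def rank_by_paired_comparison(items, judges):
    """Order items from least to most preferred by counting pairwise wins."""
    wins = Counter({item: 0 for item in items})
    for a, b in combinations(items, 2):
        for prefer in judges:
            wins[prefer(a, b)] += 1
    # sorted() is stable, so tied items keep their input order.
    return sorted(items, key=lambda item: wins[item])


# Hypothetical preferences of three judges over three placeholder items.
judges = [
    make_judge({"A": 1, "B": 3, "C": 2}),
    make_judge({"A": 1, "B": 2, "C": 3}),
    make_judge({"A": 2, "B": 3, "C": 1}),
]
ranking = rank_by_paired_comparison(["A", "B", "C"], judges)
print(ranking)  # ['A', 'C', 'B']

# Eleven gestures give 55 distinct pairs for each judge to compare.
n_pairs = len(list(combinations(range(11), 2)))
print(n_pairs)  # 55
```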
Table 2. Evaluation results of eleven gestures by thirty-two drivers
Ranked gestures (negative → positive)
[Table body not recovered; surviving row labels: Yield (slow down), Yield to drivers, Flag a taxi, Ask for a lift, Pass (slow down)]
Note: X* denotes X is significantly lower than 4 (neutral value in the 7-point Likert
scale); X0 means X is not significantly different to 4, and all other attribute values are
significantly higher than 4.
3. Field experiments
The selected gestures from Section 2 were G6 (R-bent-erect), G11 (T gesture),
G3 (L-bent-level), and G1 (L-straight-erect). These gestures, together with a baseline
condition where no gesture was used, were the five levels of the independent variable
and were presented to drivers randomly in our field experiments carried out in China.
The dependent variable was driver responses to gestures including: speeding up, not
changing speed, slowing down when passing, slowing down to yield to pedestrians,
stopping to yield to pedestrians, changing lanes, and horn-use. In case of yielding, the
distance between the driver and the pedestrian (when they were in the same lane) was
also recorded to evaluate safety.
In China, sites with non-signalized crosswalks usually have two or four lanes.
The latter are wider and more dangerous for pedestrians and therefore made up
four of the five experiment sites selected. All these sites were in Beijing (their
characteristics are listed in Table 3). Image “a” in Figure 3 shows Cuiwei Road: on
either side of the traffic barrier in the middle, there are two motor lanes, a non-motor
lane and a green verge and sidewalk. Zhixin West Road has a similar layout except that
the road only has two lanes with no barrier dividing them. Image “b” shows Xicui
Road with a mid-road barrier. On either side of the barrier, the layout has the
following structure from middle of the road to roadside: two motor lanes, a green
verge, a non-motor lane and a sidewalk. Zhixin East Road has a similar layout except
that the gap between the barriers in the crosswalk is larger. Xueyuan South Road also
shares this layout except that big trees act as the separator between the motor and
non-motor lanes instead of a green verge. The signs with blue background and a white
triangle showing a person crossing the road indicate the position of the crosswalk, and
the diamond-shape road markings remind drivers of crosswalks ahead. At all these
sites, action is rarely taken against drivers who do not yield, unless an accident happens
due to non-compliance.
Table 3. Characteristics of the experiment sites
Zhixin East Rd.: residential area, restaurants
Xueyuan South Rd.: residential area, university
Zhixin West Rd.: residential area, park
Figure 3. Photographs of the experiment sites: “a” shows the layout of Cuiwei Road
(similar to Zhixin West Road), and “b” shows Xicui Road (similar to Zhixin East
Road and Xueyuan South Road).
As shown in Figure 4, three people were needed in each observation, one as a
pedestrian and two as observers. The pedestrian waited at the edge of the road at P0
while Observer 1 surveyed the whole scene to detect a target vehicle that could meet
the following criteria: the vehicle is on the same side of the road as the pedestrian,
with no other vehicles in adjacent lanes, and no real pedestrian crossing. Once the
target vehicle arrived at V0 (see the flag in Figure 4), Observer 1 signaled to Observer 2
and the confederate pedestrian (played by researchers). Then the pedestrian began to
walk towards the crosswalk until one of his/her feet was on it. The pedestrian stopped,
turned his/her head to the left, looked at the vehicle and presented a gesture (or no
gesture in the baseline condition) to the driver. If the driver yielded, the pedestrian
withdrew the gesture and crossed the street; otherwise he/she withdrew the gesture
and waited there until there was a gap large enough to cross. While the pedestrian was
walking, both observers recorded the drivers’ responses independently in a predefined
datasheet. When the road had been crossed, the confederate returned to P0 and waited
to start another trial.
Figure 4. Experiment procedure diagram
Since the distance to the pedestrian may influence drivers’ yielding behavior, the
onset of the gesture was made at point V1 when the distance was sufficient for the
fastest driver (calculated with the speed limit of the road) to stop if they wanted to
yield (V1P1 was kept roughly constant among gestures). Point V0 (when the observer
signaled to the pedestrian) was estimated to make sure that vehicles would arrive at
V1 when the pedestrian arrived at P1 from P0. Since the vehicle speed varied among
target vehicles, sometimes the point V0 was adjusted according to vehicle speed.
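The text does not state how the stopping distance or the position of V0 was calculated. One common approximation, shown here only as a sketch, adds the distance travelled during the driver's reaction time to the braking distance under constant deceleration, and places V0 far enough upstream that the vehicle reaches V1 just as the pedestrian reaches P1. The reaction time, deceleration, walking time and speed values below are hypothetical placeholders, not values reported in the study.

```python
def stopping_distance_m(speed_kmh, reaction_time_s=1.5, decel_ms2=3.4):
    """Reaction distance plus braking distance, in metres.

    reaction_time_s and decel_ms2 are assumed values for illustration.
    """
    v = speed_kmh / 3.6  # convert km/h to m/s
    return v * reaction_time_s + v ** 2 / (2.0 * decel_ms2)


def signal_distance_m(speed_kmh, gesture_distance_m, walk_time_s):
    """Distance from the crosswalk at which the observer should signal (V0).

    The vehicle must cover the stretch from V0 to V1 in the time the
    pedestrian needs to walk from P0 to P1.
    """
    v = speed_kmh / 3.6
    return gesture_distance_m + v * walk_time_s


# Hypothetical example: a 40 km/h speed limit and a 3-second walk to the kerb.
v1_distance = stopping_distance_m(40)
v0_distance = signal_distance_m(40, v1_distance, walk_time_s=3.0)
print(round(v1_distance, 1), round(v0_distance, 1))
```

Because V0 depends on vehicle speed, a faster target vehicle requires an earlier signal, which matches the adjustment of V0 described above.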
One hundred drivers were observed at each site, thus the sample size was 500.
Cases in which two observers did not have the same data recorded were excluded.
This resulted in 420 valid cases, with a valid rate of 84%. The following Table 4
shows behavior responses from the 420 drivers. No driver was observed to speed up,
so the table only includes six types of response.
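The exclusion rule can be expressed as a simple filter: a trial is kept only when both observers recorded the same response. The sketch below uses a handful of hypothetical records (not data from the study) and illustrative field names, then checks the reported valid rate of 420 out of 500.

```python
def agreed_trials(trials):
    """Keep only trials in which both observers recorded the same response."""
    return [t for t in trials if t["observer_1"] == t["observer_2"]]


def valid_rate(n_valid, n_total):
    """Share of observed trials retained for analysis."""
    return n_valid / n_total


# Hypothetical records, for illustration only.
trials = [
    {"trial": 1, "observer_1": "no change", "observer_2": "no change"},
    {"trial": 2, "observer_1": "yield (stop)", "observer_2": "yield (stop)"},
    {"trial": 3, "observer_1": "pass (slow down)", "observer_2": "no change"},
    {"trial": 4, "observer_1": "horn use", "observer_2": "horn use"},
]
kept = agreed_trials(trials)
print(len(kept), "of", len(trials), "hypothetical trials kept")

# Counts reported in the text: 420 valid cases out of 500 observations.
print(valid_rate(420, 500))  # 0.84
```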
Table 4. Drivers’ responses to selected gestures (G6, G11, G3, G1)
The overall yielding rate (including both slowing down and stopping to yield) at
all sites was 8.6%, with an average yielding distance of 8.1 meters ahead of
pedestrians. This rate is far lower than the driver self-reported yielding rate in Section
2 (91%). In the baseline condition (i.e. with no gesture), only 3.5% of the drivers
yielded to the pedestrian, and up to 63.5% of them did not even change speed. Instead,
they chose to use their horn (15.3%) or change lanes (5.9%) to make sure they could
continue forward without being disturbed.
Since the first four responses – no change, pass (slow down), yield (slow down),
yield (stop) – indicated increasing levels of yielding, they were regarded as the four
values of an ordinal variable: yielding degree. A rank-sum test using Kruskal-Wallis H
was conducted to explore the effects of the gestures. Although the yielding behaviors
differed among the gestures (χ2 = 12.8, df = 4, p = .012), a post hoc test with
Mann-Whitney U showed that only the last gesture in Table 4 (G3: L-bent-level)
differed significantly from the baseline on yielding behavior (Z = -3.45, p = .01, after
Dunn-Šidák correction). The other gestures did not differ significantly from each other
(p > .05). Fisher’s exact test also showed that drivers’ horn use did not differ in
response to different gestures (p = .405), nor in the case of lane changing behavior (p
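The Dunn-Šidák correction adjusts a raw p-value from one of m comparisons to 1 - (1 - p)^m. The sketch below shows this adjustment together with the conversion of a Z statistic to a two-sided p-value under the normal approximation. The number of comparisons is not stated in the text, so the value of m used here (each of the four gestures against the baseline) is a hypothetical choice, and the output is not meant to reproduce the reported p-value.

```python
import math


def two_sided_p_from_z(z):
    """Two-sided p-value for a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def sidak_adjust(p, m):
    """Dunn-Sidak adjusted p-value for one of m comparisons."""
    return 1.0 - (1.0 - p) ** m


raw_p = two_sided_p_from_z(-3.45)
m = 4  # hypothetical: four gestures, each compared with the baseline
adjusted_p = sidak_adjust(raw_p, m)
print(raw_p, adjusted_p)
```

The adjusted value is always at least as large as the raw one, so a comparison that remains below .05 after adjustment is significant under the stricter family-wise criterion.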
Consequently, the final selected gesture was L-bent-level (see Figure 5). While
the yielding rate was still not high with this best performing gesture, the increase in
drivers’ yielding rate (3.5% to 12.9%), and the decrease in drivers’ passing by with
unchanged speed (63.5% to 38.8%) indicated a promising prospect. Moreover, the
side effects of the gesture (horn use and lane changing) did not differ from those of the
Figure 5. Examples of the best-performing gesture: G3 (L-bent-level).
Although Chinese law gives the right of way to pedestrians at uncontrolled
mid-block crosswalks (China State Council 2005), the baseline yielding rate is only
3.5%. The evaluation of eleven proposed pedestrian gestures according to their
visibility, clarity, familiarity and courtesy resulted in four gestures – G6 (R-bent-erect),
G11 (T gesture), G3 (L-bent-level), and G1 (L-straight-erect) – that satisfied the major
criteria. Field experiments with these gestures identified G3 (L-bent-level) as the final
choice, as it increased yielding and slowing down without bringing about side effects. This
section first discusses the effect of gestures on yielding in the light of current
literature and then presents the potential theoretical and practical contributions of the
research. Possible limitations of the study are also discussed.
4.1. Effect of gestures
As noted above in section 1.3, Crowley-Koch et al. (2011) compared two gestures
(extended arm and raised arm) and their effects in increasing drivers’ yielding.
Although they did not illustrate the gestures with pictures, their verbal descriptions
(‘extend arm: extend the right arm into the crosswalk at 90 degree with the palm
facing drivers’; ‘raised arm: hold the left hand up at chest height in front of the body
with the elbow bent, palm facing the driver’) suggested that the extended arm gesture
was similar to G8 (R-straight-level) and the raised arm similar to G3 (L-bent-level).
Although the extended arm treatment was found to be effective in their study, the G8
(R-straight-level) was not selected in the evaluation step in the current study. The
evaluation results showed that although 41% of the participants reported yielding
when coming across this gesture, 34% of the surveyed drivers thought this gesture
meant flagging a taxi, and 13% even thought the pedestrian was yielding to drivers.
The heads of the models are masked to protect privacy, but the images still show the direction in which they are
looking, which is the direction from which vehicles approach. Note that in countries where vehicles drive on the
left side, the gestures and head direction should be mirrored to the right.
This suggests gestures are indeed not universal (Archer 1997).
Similar to Crowley-Koch et al.’s (2011) finding on the effectiveness of the raised
arm gesture, G3 (L-bent-level) proved effective in increasing yielding when compared
with the baseline. However, considering that this gesture was not significantly better
than the other three selected gestures (G6, G11, G1), it only stood out by a slim
margin. The evaluation results on clarity and familiarity show that this gesture is no
better than G1 (L-straight-erect). Crowley-Koch et al. (2011) explained that G3
(L-bent-level) was a ubiquitous representation of halting. This interpretation, however,
cannot explain the slight difference between G1 (L-straight-erect) and G3
(L-bent-level) since the former also means “stop” or “forbidden”. A possible reason is
the difference in visibility. In the evaluation, the drivers were not at the wheel and
the scene was static, so the visibility requirement was not high and both gestures
were rated as satisfactory. In a real setting, however, drivers are flooded with
complex visual information and first see pedestrians from a distance, so visibility
becomes more important. With G3 (L-bent-level),
drivers can see the whole arm of the pedestrian whereas only the palm can be seen in
It should be noted that even with the effective G3 (L-bent-level) gesture, the
yielding rate is still relatively low (12.9%). However, this does not mean that
pedestrian gestures cannot make a difference. For one thing, the share of drivers who
slowed down when passing by increased by 15.3 percentage points, which arguably means
that such drivers might yield if the gestures were implemented as part of a traffic safety campaign, for
instance. Moreover, the effectiveness of a gesture also depends on where, when, and
how long it is used. In the current study, the gestures were only displayed at the
roadside by stationary pedestrians for several seconds to the nearest driver. Without
these restrictions, a gesture may be more effective. There are two common scenarios
where gestures might be used with more relaxed limits: (1) pedestrians can cross the
road while maintaining the gesture toward subsequently approaching vehicles; (2) pedestrians
who previously stopped in the crossing can display gestures to approaching drivers for
a sufficient time to reinitiate crossing. Although the effects of gestures in these two
scenarios cannot be confirmed easily in field experiments, a higher yielding rate
might be expected for two reasons. First, previous research has found that assertive
pedestrians standing farther from the kerb were more likely to gain
the right of way (Himanen and Kulmala 1988, Harrell 1993). Second, when
pedestrians are crossing the road, the intended meanings of gestures (i.e. please yield)
are more obvious because some misinterpretations of the gestures can be excluded
naturally. For instance, in the case of pedestrians who used G8 (R-straight-level),
drivers will not think that they are flagging a taxi. Theoretically, these two reasons
To make the scenarios realistic, pedestrians need to be on the road when drivers are approaching. However,
the speeds of distant drivers are hard to predict, so the moment at which the pedestrian should appear cannot be
determined easily. Moreover, to compare the effectiveness of the gestures with that of the baseline, drivers should
be exposed to identical treatments in all aspects other than the gesture condition (e.g. pedestrians’ distance from
drivers). In the baseline condition of the “cross while displaying gesture” scenario, researchers acting as the
confederate pedestrians would therefore need to pass in front of vehicles without signalling their intention with
gestures. This implementation would be very risky, especially when multiple crossings are needed.
will lead to a higher yielding rate, which is consistent with the authors’ observations
of daily life in China. Another potential extended use of gestures would be at
intersections where vehicles turning right fail to yield to pedestrians, thus becoming
hazardous to pedestrians (Abdulsattar et al., 1996). Turning vehicles usually slow
down to ensure a safe turn. Given that lower speed can increase the possibility of
yielding (Himanen and Kulmala 1988, Turner et al., 2007), a higher yielding rate than
at mid-block crosswalks may be expected if gestures were applied in this situation.
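As a check on the figures quoted in this section, the arithmetic behind the reported changes between the baseline and G3 (L-bent-level) can be reproduced from the percentages given in the text. The sketch below is illustrative only. It assumes that the response categories (no change, pass while slowing down, yield) are exhaustive within each condition, so that the share of drivers who slowed down while passing is the remainder; the variable and function names are hypothetical.

```python
# Percentages reported in the text for the baseline (no gesture)
# and for G3 (L-bent-level). "yield" combines both yielding responses.
BASELINE = {"no_change": 63.5, "yield": 3.5}
G3 = {"no_change": 38.8, "yield": 12.9}


def slow_pass_share(condition: dict) -> float:
    """Share of drivers who passed while slowing down.

    Assumption: the categories are exhaustive, so this share is what
    remains after removing the no-change and yield shares.
    """
    return round(100.0 - condition["no_change"] - condition["yield"], 1)


# Relative change in the yielding rate (a ratio, not percentage points).
yield_ratio = G3["yield"] / BASELINE["yield"]

# Absolute changes, in percentage points.
yield_gain_points = round(G3["yield"] - BASELINE["yield"], 1)
slow_pass_gain_points = round(
    slow_pass_share(G3) - slow_pass_share(BASELINE), 1
)

if __name__ == "__main__":
    print(f"yielding rate ratio (G3 / baseline): {yield_ratio:.2f}")
    print(f"yielding gain: {yield_gain_points} percentage points")
    print(f"slow-pass gain: {slow_pass_gain_points} percentage points")
```

Under this assumption the derived slow-pass gain agrees with the 15.3 figure quoted in the text, which suggests that figure is a difference in percentage points rather than a relative increase.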
4.2. Theoretical implications and practical applications
The integration of existing theories, together with the three added features to
evaluate gestures, has theoretical implications beyond the context of this study. As
noted in Section 1.2, Lewin’s equation (Sansone et al., 2004) is a very general
conceptual framework about the interaction of person and environment, and is therefore
not easy to use in practice. On the other hand, the SIFT model (Straker, 2008) is a detailed
model that focuses on the inner activity of the person but largely neglects the influence of
the environment. Their integration makes it more practical for future studies to
evaluate existing countermeasures as well as develop new strategies based on
the psychological processes of perception and decision making. The three environmental
features added to three corresponding internal phases in order to induce desired
behaviors are visibility, clarity and motive power. In Fogg’s (2009) behavioral model
(FBM) of persuasive design, the power of a design in changing behavior depends on
three components: ability, motivation and trigger. In other words, people will behave
in an intended manner when a task is easy, motivating and accompanied by a signal related to
the intended behavior. Clearly, “visibility” makes the yielding task easier and can thus
be mapped to the “ability” component in FBM, whereas “motive power” is essentially
the same as the “motivation” component. The “clarity” feature means that the target
has only one meaning attached to it, which not only makes the task easier
but also functions like a “trigger”. This mapping between the three features and
such a widely used behavior model indicates that these features are not confined to the
current study but reflect general requirements for approaches that try to alter
behavior via environmental change. The rest of this section discusses the clarity
and motive power features in detail.
Overall, the gestures used in our research, together with responsive lights
(Hakkert et al., 2002) and signs (Van Houten and Malenfant, 1992) all try to
encourage drivers to yield. In this context, clarity requires these signals to have a strong
association with the request to yield to pedestrians or, more specifically, with
“the presence of pedestrians” and “the need to yield to them”. The signals’
connection with pedestrians’ presence can be assessed in terms of time and meaning.
In terms of time, prompt signs and yield markings can trigger drivers to pay attention
to approaching crosswalks or potential pedestrians. Their appearances are not always
associated with the presence of pedestrians waiting to cross, so the time connection
between signal and target is weak. In contrast, responsive lights and gestures are
always accompanied by pedestrians, thus the time connection is strong. In terms of
meaning association, gestures are a direct signal of pedestrians’ presence and intent,
but the traditional triggers such as prompt signs rely on a memory extraction of how
they are associated with pedestrians (as part of the ‘inferring meaning’ phase in the
SIFT model). The direct association is so important that ergonomic guidelines have
one requirement for traffic signs, known as “physical representation”, which stresses
the similarity between the content of the sign and the reality it represents (Ben-Bassat
and Shinar 2006). These time and meaning associations of gesture and pedestrian
mean that gestures outperform traditional treatments in their connection with
pedestrian presence. However, this does not mean that all gestures are effective, as
they must also have a conceptual association with “yielding to pedestrians”, beyond
merely signaling “the presence of pedestrians”. This is why the stated meaning and
clarity are very important in the evaluation of gestures in Section 2. Before the
evaluation, the attribute “courtesy” was considered important for social influence and
a harmonious transportation environment. However, all of the polite gestures were
excluded because courteous gestures such as G7 (R-salute) and G9 (Host fist salute)
had various interpretations. It was evident that they were associated with a request,
but the specific content of the request was not clear. This implies that two-step
gestures, which first show the request and then show gratitude if drivers yield,
may be effective and harmonious triggers, e.g. combining G3
(L-bent-level) and G5 (L-thumb-up). The reward may encourage more drivers to yield
voluntarily in future.
Traditional traffic management treatments mainly rely on respect for the law as
the motivation (see Table 1), which is very effective because authority is an important
determinant of compliance identified in social psychology (Cialdini and Goldstein
2004). In the current study, the gap between reported and observed yielding rates was
very large (91% vs. 12.9%, the latter with the best-performing gesture). Self-reported
measures usually suffer from bias in questions that involve social desirability, like
“yield to pedestrians” (Lajunen et al., 1997).
If this bias is the reason for the difference, it can be inferred that drivers know they
should yield, but they simply refuse to do so. Another explanation is that when
answering the survey, drivers need to “choose” a response among the available
answers, and this deliberate process led some drivers to choose yielding. In reality, however,
drivers may simply follow their habits and ignore the crosswalk without even making
the effort to “choose”. Whatever the explanation, it indicates that drivers lack the
motivation to yield. A possible reason is the low authority level (Cialdini and
Goldstein, 2004) when the gestures were used by the researchers. Had police
officers displayed the gestures, the yielding rate might have been much higher. This problem
might therefore be alleviated by integrating the gestures into traffic regulations.
Besides the lack of authority, there are several other potential causes of low
compliance (recall the top-down factors in Figure 1): drivers may be unwilling to be
interrupted when driving in a state of flow (Chen and Chen, 2011) or they
may misunderstand who has the right of way (Hatfield et al., 2007). These possibilities
indicate that although an environmental change (i.e. bottom-up factors in Figure 1) is
a quick and rough solution, understanding the intrinsic reasons for the low yielding rate
may offer alternative clues. It is therefore suggested that future research should
approach the driver yielding issue from the top down, looking at why drivers lack the
motivation to yield and how to stimulate it.
In addition to these theoretical implications for future work, a practical
implication from the study is that pedestrians should be trained to make the
‘L-bent-level’ gesture to approaching vehicles. Currently, a commonly seen slogan for
pedestrians is “first stop, second look, and third cross”. This way of crossing places
low demands on the driver’s side but may overload pedestrians, especially young
children and the elderly who are vulnerable groups worldwide (Zegeer and Bushell
2012). In future, we propose that an appropriate gesture become the third
step, as a signal of intent to cross. In this way, drivers share some of the responsibility, as
they need to look at the pedestrian gestures and act accordingly. For this reason, the
gesture should be included in the formal training required to obtain a driving license. For
pedestrians, the gesture is easy to learn and convenient to display with one
hand, so even small children can master it easily. Pedestrians could be educated in
several ways: children can be taught in school to cross marked crosswalks
with the gesture, and instructions on how to use the gesture might be posted on signs
near crosswalks or published on the official websites or microblogs of the transport authorities.
The gesture could improve pedestrian safety in another way. In intelligent
transportation systems, detection of pedestrians and their crossing intention via
machine learning is very important in assisting drivers in case of visual failure
(Kohler et al., 2012). Since shape-based detection has already been used as a cue in
pedestrian detection systems (Gavrila and Munder 2007), a pedestrian displaying a
gesture might make it easier to distinguish pedestrians who want to cross the
road from those who simply wander near the crosswalks.
The side effects of using gestures considered in this paper were horn use and lane
changing. In a survey conducted in Japan, 60.1% of pedestrians reported that horn use
was noisy and left them startled and irritated (Takada et al., 2012). Lane changing was
also treated as a side effect because it decreases the predictability of vehicle behavior. However, it is
notable that these two phenomena did not increase as a result of the gesture use in our
research. Nevertheless, there may be some other potential problems. For instance, if
the vehicle volume is high and only the vehicles close to the pedestrians yield, then it
is risky to cross since vehicles out of sight may dart out in an adjacent lane.
Pedestrians should be especially careful in such situations if the nearby vehicle is a
large one, such as a bus, that can block their line of sight. Another potential side effect
of using a gesture is that, because it is displayed to claim pedestrians’ right of way, it may
increase the perception of their assertiveness and aggression. It should also be noted
that the gestures were only tested in China, so whether the findings extend
to other cultures is unknown. Theoretically, so long as a gesture does not carry a
meaning that competes with “yielding”, it can be introduced in driver training to build a
connection with the need to yield, just as drivers are trained to recognize other traffic
signs and yield markings. However, a signal that was conventionally understood or even
already used by some pedestrians would be better. For instance, in this study, the G3
‘L-bent-level’ gesture is a natural response of many Chinese people when blocking or
stopping something undesirable.
Four out of eleven gestures (‘R-bent-erect’, ‘T gesture’, ‘L-bent-level’, and
‘L-straight-erect’) were rated highest by Chinese drivers when evaluated for
visibility, clarity and familiarity. Field experiments showed that only the
‘L-bent-level’ gesture significantly increased the drivers’ yielding rate (or drivers
slowing down when passing through). This gesture had no significant side effects in
terms of horn use or lane changing. Therefore, it is suggested that pedestrians be
trained to use the gesture and drivers be trained to properly interpret and respond to it.
This work is supported by the National Basic Research Program of China
(2011CB302201). The authors gratefully acknowledge the assistance of Shu Ma and
Rui Jiang in the preparation of the gestures and pilot test of experiments.
Abdulsattar, H., Tarawneh, M., McCoy, P., Kachman, S., 1996. Effect on
vehicle-pedestrian conflicts of "turning traffic must yield to pedestrians" sign.
Transportation Research Record: Journal of the Transportation Research
Board 1553 (1), 38-45.
Archer, D., 1997. Unspoken diversity: Cultural differences in gestures. Qualitative
Sociology 20 (1), 79-105.
Ben-Bassat, T., Shinar, D., 2006. Ergonomic guidelines for traffic sign design increase
sign comprehension. Human Factors: The Journal of the Human Factors and
Ergonomics Society 48 (1), 182-195.
Benekohal, R.F., Wang, M., Medina, J.C., 2007. Crosswalk signing and marking
effects on conflicts and pedestrian safety in UIUC campus. Traffic Operations
Laboratory, Department of Civil and Environmental Engineering, University
of Illinois at Urbana-Champaign.
Chen, C.-F., Chen, C.-W., 2011. Speeding for fun? Exploring the speeding behavior of
riders of heavy motorcycles using the theory of planned behavior and
psychological flow theory. Accident Analysis & Prevention 43 (3), 983-990.
Cialdini, R.B., Goldstein, N.J., 2004. Social influence: Compliance and conformity.
Annual Review of Psychology 55, 591-621.
China State Council, 2005. Traffic safety law of the People's Republic of China.
Available at http://www.gov.cn/banshi/2005-08/23/content_25579.htm
Crowley-Koch, B.J., Van Houten, R., Lim, E., 2011. Effects of pedestrian prompts on
motorist yielding at crosswalks. Journal of Applied Behavior Analysis 44 (1),
Dewar, R., 1988. Criteria for the design and evaluation of traffic sign symbols
Transportation Research Board, Transportation Research Record.
Fisher, D., Garay-Vega, L., 2012. Advance yield markings and drivers’ performance in
response to multiple-threat scenarios at mid-block crosswalks. Accident
Analysis & Prevention 44 (1), 35-41.
Fogg, B., 2009. A behavior model for persuasive design. In: Proceedings of the 4th
International Conference on Persuasive Technology, pp. 40.
Gavrila, D.M., Munder, S., 2007. Multi-cue pedestrian detection and tracking from a
moving vehicle. International Journal of Computer Vision 73 (1), 41-59.
General Administration of Quality Supervision, Inspection and Quarantine of the
People's Republic of China, 2006. Specification for setting and installation of
road traffic signals. Standardization Administration of the People's Republic of
China, pp. 36.
Godley, S.T., Triggs, T.J., Fildes, B.N., 2004. Perceptual lane width, wide perceptual
road centre markings and driving speeds. Ergonomics 47 (3), 237-256.
Hakkert, A.S., Gitelman, V., Ben-Shabat, E., 2002. An evaluation of crosswalk
warning systems: Effects on pedestrian and vehicle behaviour. Transportation
Research Part F: Traffic Psychology and Behaviour 5 (4), 275-292.
Harrell, W.A., 1993. The impact of pedestrian visibility and assertiveness on motorist
yielding. The Journal of Social Psychology 133 (3), 353-360.
Harrell, W.A., 1994. Effects of pedestrians' visibility and signs on motorists' yielding.
Perceptual and Motor Skills 78 (2), 355-362.
Hatfield, J., Fernandes, R., Job, R., Smith, K., 2007. Misunderstanding of
right-of-way rules at various pedestrian crossing types: Observational study
and survey. Accident Analysis & Prevention 39 (4), 833-842.
Hebert Martinez, K.L., Porter, B.E., 2004. The likelihood of becoming a pedestrian
fatality and drivers’ knowledge of pedestrian rights and responsibilities in the
commonwealth of Virginia. Transportation Research Part F: Traffic
Psychology and Behaviour 7 (1), 43-58.
Himanen, V., Kulmala, R., 1988. An application of logit models in analysing the
behaviour of pedestrians and car drivers on pedestrian crossings. Accident
Analysis & Prevention 20 (3), 187-197.
Huybers, S., Van Houten, R., Malenfant, J.E.L., 2004. Reducing conflicts between
motor vehicles and pedestrians: The separate and combined effects of
pavement markings and a sign prompt. Journal of Applied Behavior Analysis
37 (4), 445-456.
Ibrahim, N.I., Karim, M.R., Kidwai, F.A., 2005. Motorists and pedestrian interaction
at unsignalized pedestrian crossing. In: Proceedings of the Eastern Asia
Society for Transportation Studies, pp. 120-125.
Kellermann, K., Cole, T., 1994. Classifying compliance gaining messages: Taxonomic
disorder and strategic confusion. Communication Theory 4 (1), 3-60.
Koepsell, T., McCloskey, L., Wolf, M., Moudon, A.V., Buchner, D., Kraus, J.,
Patterson, M., 2002. Crosswalk markings and the risk of pedestrian–motor
vehicle collisions in older pedestrians. JAMA: the Journal of the American
Medical Association 288 (17), 2136-2143.
Kohler, S., Goldhammer, M., Bauer, S., Doll, K., Brunsmann, U., Dietmayer, K., 2012.
Early detection of the pedestrian's intention to cross the street. In: Proceedings
of the 15th International IEEE Conference on Intelligent Transportation
Systems (ITSC), pp. 1759-1764.
Lajunen, T., Corry, A., Summala, H., Hartley, L., 1997. Impression management and
self-deception in traffic behaviour inventories. Personality and Individual
Differences 22 (3), 341-353.
Leden, L., Gårder, P., Johansson, C., 2006. Safe pedestrian crossings for children and
elderly. Accident Analysis & Prevention 38 (2), 289-294.
Naci, H., Chisholm, D., Baker, T.D., 2009. Distribution of road traffic deaths by road
user group: A global comparison. Injury Prevention 15 (1), 55-59.
Nasar, J.L., 2003. Prompting drivers to stop for crossing pedestrians. Transportation
Research Part F: Traffic Psychology and Behaviour 6 (3), 175-182.
Piff, P.K., Stancato, D.M., Côté, S., Mendoza-Denton, R., Keltner, D., 2012. Higher
social class predicts increased unethical behavior. Proceedings of the National
Academy of Sciences 109 (11), 4086-4091.
Renge, K., 2000. Effect of driving experience on drivers' decoding process of
roadway interpersonal communication. Ergonomics 43 (1), 27-39.
Sansone, C., Morf, C.C., Panter, A.T., 2004. The sage handbook of methods in social
Schroeder, B.J., 2008. A behavior-based methodology for evaluating
pedestrian-vehicle interaction at crosswalks. North Carolina State University.
Straker, D., 2008. Changing minds: In detail. Syque Pub. Available at
Sun, D., Ukkusuri, S.V.S.K., Benekohal, R.F., Waller, S.T., 2003. Modeling of
motorist-pedestrian interaction at uncontrolled mid-block crosswalks. In:
Proceedings of the 82nd TRB Annual Meeting, Washington.
Takada, M., Murasato, H., Iwamiya, S.-I., 2012. Questionnaire survey on vehicle horn
use in urban areas of Japan. In: Proceedings of the INTER-NOISE and
NOISE-CON Congress, Paris, pp. 1698-1708.
Turner, S., Fitzpatrick, K., Brewer, M., Park, E., 2007. Motorist yielding to
pedestrians at unsignalized intersections: Findings from a national study on
improving pedestrian safety. Journal of the Transportation Research Board
Underwood, G., Chapman, P., Berger, Z., Crundall, D., 2003. Driving experience,
attentional focusing, and the recall of recently inspected events. Transportation
Research Part F: Traffic Psychology and Behaviour 6 (4), 289-304.
Várhelyi, A., 1998. Drivers' speed behaviour at a zebra crossing: A case study.
Accident Analysis & Prevention 30 (6), 731-743.
Van Houten, R., Malenfant, L., 1992. The influence of signs prompting motorists to
yield before marked crosswalks on motor vehicle-pedestrian conflicts at
crosswalks with flashing amber. Accident Analysis & Prevention 24 (3),
Van Houten, R., Malenfant, L., 2004. Effects of a driver enforcement program on
yielding to pedestrians. Journal of Applied Behavior Analysis 37 (3), 351.
Zegeer, C.V., Bushell, M., 2012. Pedestrian crash trends and potential
countermeasures from around the world. Accident Analysis & Prevention 44
Zegeer, C.V., Stewart, J.R., Huang, H.H., Lagerwey, P.A., 2002. Safety effects of
marked vs. unmarked crosswalks at uncontrolled locations. Technical Report
NO. FHWA-RD-01-075, U.S. Department of Transportation, Federal