Retail Investor Attention and Stock Liquidity
Rong Ding and Wenxuan Hou*
Following Da, Engelberg and Gao (2011), we use the search volume index (SVI) of the stock
ticker provided by Google Trends to capture the active attention retail investors pay to the
S&P 500 stocks. Based on the analysis of data from January 2004 to December 2009, we
show that the majority of the cross-sectional variation in the SVI cannot be explained by
passive attention measures, including online media coverage from Google News and
advertising expenditure. We further find that retail investor attention, reflected by the level
and change in the SVI, significantly enlarges the shareholder base and improves stock
liquidity. The results are robust to controlling for firm-specific characteristics that affect stock
liquidity and for year and industry fixed effects. Our results remain consistent under alternative
measures of stock liquidity and in a Granger causality test.
Keywords: Investors’ attention; breadth of ownership; liquidity; bid-ask spread; SVI; media
coverage; Google; retail investor
JEL Classification: G10, G14
The authors are grateful for helpful comments from the participants in the EFA (European Finance
Association) 2011 meeting in Stockholm, Michela Verardo (discussant) and John Doukas.
We are grateful to Lei Zhang for his assistance with the collection of the SVI data.
* Rong Ding is from the Management School of the University of Southampton, UK; Wenxuan Hou is from the
University of Edinburgh Business School, UK.
Address correspondence to Wenxuan Hou, 29 Buccleuch Place, Edinburgh, EH8 9JS, UK. Tel: +44 (0) 131 651
5319, Fax: +44 (0) 131 651 4389; Email: firstname.lastname@example.org
The “under-diversification puzzle” documented in the literature shows that investors have
“home bias” because they tend to favor investment in firms they are familiar with (French
and Poterba, 1991; Tesar and Werner, 1995; Cao, Han, Hirshleifer and Zhang, 2011). In order
to get familiar with such firms, investors have to spend time and effort collecting relevant
information, which suggests that attention from investors might predict the subsequent
trading activity. On the theoretical side, studies on asset pricing posit that investor attention is
a necessary condition for a stock price to fully reflect public information, as investors need to
be aware of the information before they can analyze and react to it (Hirshleifer and Teoh,
2003; Hou, Peng and Xiong, 2008; Hirshleifer, Lim and Teoh, 2011). However, because of
the limits on the information-processing capacity of human beings, attention is largely
concentrated on the stocks that investors are interested in or familiar with, which implies that
attention paid to stocks by investors could result in subsequent trading of these stocks. Our
study aims to provide fresh insights into the capital market consequences of investors’ attention.
Building on the assumption that investors passively attend to publicly available
information, previous studies have used advertising expenditure (Grullon, Kanatas and
Weston, 2004) and media coverage (Fang and Peress, 2009) to capture investors’ attention
and examine its implications for stock liquidity and stock returns. In this paper, we employ a
measure of active attention from investors, recently developed by Da, Engelberg and Gao
(2011), namely the aggregate search volume index (SVI) provided by Google Trends
(available from: www.google.com/trends), and test the impact of investors’ attention paid to
listed firms on two aspects of listed firms: breadth of ownership and liquidity. After
controlling for the passive attention measures documented in the literature, we find that
increased investors’ attention measured by the SVI contributes to a broader shareholder base.
This is in line with the argument of Barber and Odean (2008) that retail investors tend to
search for information about the firm’s history, product, environment and strategies when
selecting stocks, and can be interpreted with the “investor recognition hypothesis” (Merton,
1987), which states that the shareholder base measures the recognition of the firm among
investors, so that an enlarged shareholder base indicates that the firm has been well
recognized. In other words, potential investors have to be aware of a firm before they can
gradually become familiar with it and then eventually decide to invest, suggesting that
investor attention is a necessary condition for a firm to be recognized. The impact of passive
attention measures, however, is not always significant in the results, showing that retail
investors do not necessarily invest in firms with more advertising expenditure or media
coverage.2 Furthermore, we find that increased investors’ attention, as measured by the SVI,
results in reduced bid-ask spread, and our results remain consistent after controlling for the
passive attention measures, firm characteristics, and year and industry fixed effects. Our
findings remain robust to alternative liquidity measures, including effective spread, relative
effective spread, and turnover rate (trading volume divided by shares outstanding).
This paper makes three important contributions to the literature. First, our study contributes to
the broad literature on the “investor recognition hypothesis” (e.g., Merton, 1987; Grullon et
al., 2004; Tetlock, 2010; Fang and Peress, 2009).3 Merton (1987) posits that “ceteris paribus,
2 We argue that the SVI captures investor attention in a more timely and accurate manner than passive attention
measures for the following reasons: 1) media coverage of a firm is sporadic, while the SVI is continuous; 2)
media coverage does not necessarily guarantee attention unless investors attend to it, and the same news
coverage could generate different levels of investor attention for different stocks (Da et al., 2011).
3 Empirical evidence largely supports the investor recognition hypothesis. Chen, Noronha and Singal (2004)
report increased investor awareness after a firm is added to the S&P 500 index, which leads to a reduction
in both the information asymmetry component of the bid-ask spread and Merton’s (1987) cost of
under-diversification. By the same token, Lehavy and Sloan (2008) contend that an exchange listing increases
an increase in the relative size of the firm’s investor base will reduce the firm’s cost of capital
and increase the market value of the firm.” A stock’s visibility is associated with its price,
publicity and popularity of the core products and social image. However, we suggest that
these measures are passive, in that they implicitly assume that firms with high visibility will
attract more attention from investors, an assumption that is difficult to verify empirically. Our study is
built on an active measure of ex post attention, as Google search is a confirmed measure of
attention: if an individual intentionally searches for information about a stock, it is evident
that he or she is paying attention to it (Da et al., 2011).4 Furthermore, the Google search index captures
investors’ attention in a more timely way than passive measures of attention. When individual
investors actively search for a stock using Google, they acquire useful information relevant to
the stock, which mitigates the information asymmetry problem for these stocks. As a result,
liquidity improves for stocks with better investor recognition.
Second, our paper adds to the emerging literature on investor attention and asset pricing
dynamics, including Barber and Odean (2008) on investor attention and individual investors’
trading behavior, Engelberg and Parsons (2011) on the causal impact of local media coverage
on local trading, Da et al. (2011) on the impact of active attention on IPO returns and price
changes in subsequent periods, and Aouadi, Arouri and Teulon (2013) on the effect of
investor attention on stock market liquidity and volatility using French Google search data.
investor recognition of a firm. Furthermore, a positive association between investor recognition and
contemporaneous stock returns is documented. Bushee and Miller (2012) find that small and mid-cap firms can
enhance their visibility among investors and analysts by hiring an investor relations firm, which contributes to
improved market valuation.
4 Our study is related to, but different from, Grullon et al. (2004) because our paper focuses on the relation
between investors’ active attention (to a stock) and the firm’s shareholder base as well as its liquidity, while
Grullon et al. (2004) investigate firms’ advertising expenditure, as a (passive) approach used to reach a broad
audience, and its impact on breadth of ownership and liquidity.
Finally, our study extends the literature on the stock market consequence of investors’
information demand. For example, Vlastakis and Markellos (2011) use the Google search
volume of constituents of the Dow Jones Industrial Average Index as a proxy for investors’
information demand, and find that such information demand has a significant impact on stock
trading volume and the conditional variance of excess returns. Siganos (2013) uses the Google
search volume of target firms involved in mergers between 2004 and 2010 in the UK as a
proxy for investors’ information demand for the target firms, and finds that this measure can
explain a large percentage of the price increase in target firms prior to the merger.
Vozlyublennaia (2014) uses Google search to proxy for investor attention (investors’
information demand) and reports that attention has a short-lived influence on the performance of
stock, bond and commodity indices. In addition, attention weakens the predictability of
index returns because the additional information revealed by increased attention improves market
efficiency. We contribute to this stream of literature by showing that investors’ attention leads to
a larger shareholder base and improved stock liquidity.
The remainder of the paper is organized as follows. Section II describes the research design
and the data. Sections III and IV present the empirical results. Section V describes the
robustness checks. Section VI concludes by providing suggestions for future research.
II. Research Design and Data
A. Active Attention Measures
Since the beginning of 2004, Google Trends has provided data on the search frequencies of
terms on a weekly basis (http://www.google.com/trends).5 It shows how many searches have
5 http://www.google.com/intl/en/trends/about.html. The data are scaled to the average traffic for the term in
question over a fixed time period (usually January 2004).
been made for a specific keyword relative to the total number of searches over time.6
Following Da et al. (2011) and Drake, Roulstone and Thornock (2012), we proxy investor
attention by the search volume index (SVI) provided by Google Trends. Specifically, we
measure investor attention for a company based on the SVI for the stock ticker rather than the
company name, since searching for a stock using its ticker is less ambiguous (Da et al., 2011)
and searches using ticker symbols as the search term are more likely to reflect searches for
financial information than searches for non-financial information (Drake et al., 2012). We
download the weekly SVI for the ticker symbols of S&P500 stocks, which provides time-
series variations in the information searches for each firm. If a ticker is rarely searched for,
Google Trends will return a zero value. In addition, we exclude two types of noisy tickers.
First, we remove 12 companies whose tickers consist of one or two letters (e.g., “C” for
Citigroup, “M” for Macy’s and “AA” for Alcoa). Second, we exclude 23 companies whose
tickers have generic meanings (e.g., “DO” for Diamond Offshore Drilling, “GAS” for AGL
Resources, “LEG” for Leggett & Platt and “FAST” for Fastenal).7
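The two exclusion rules above can be sketched in a few lines of Python. This is only an illustration: the function name `is_noisy_ticker` is hypothetical, and the set `GENERIC_TICKERS` contains only the four examples named in the text, because the full list of 23 excluded tickers is not reported.

```python
# Examples of generic-meaning tickers taken from the text; the full list of 23
# excluded tickers is not given in the paper, so this set is incomplete.
GENERIC_TICKERS = {"DO", "GAS", "LEG", "FAST"}

def is_noisy_ticker(ticker, generic_tickers=GENERIC_TICKERS):
    """Return True if a ticker is too ambiguous to use as a search term."""
    symbol = ticker.strip().upper()
    # Rule 1: one- or two-letter tickers. Rule 2: tickers with generic meanings.
    return len(symbol) <= 2 or symbol in generic_tickers

tickers = ["C", "AA", "GAS", "MSFT", "FAST", "GOOG"]
kept = [t for t in tickers if not is_noisy_ticker(t)]
```

In this toy list only "MSFT" and "GOOG" survive both filters.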
We download weekly SVIs for all constituents of the S&P 500 index over a six-year period
from January 2004 to December 2009. A retail investor can easily obtain a firm’s ticker from
financial news, where tickers are often reported in parentheses. Following Da et al. (2011),
we exclude SVIs with a value of zero, and compute the change in SVI as follows:
6 In this study, search is defined as the activity of submitting an enquiry regarding a particular term using
Google. Consequently, the search volume is the number of enquiries submitted within a certain period.
7 To confirm that searches for tickers reflect retail investors’ attention to the stocks, we employ a new
application, “Google Correlate” (http://www.google.com/trends/correlate), which identifies the most correlated
SVIs. For example, the SVI of the ticker “AAPL” is highly correlated with the SVIs of “apple stocks” (correlation
of 0.894), “apple quotes” (0.867) and “apple stock price”, while the SVI of “apple” is highly correlated with
the SVIs of “apple store” (0.862), “iphone” (0.852) and “apple online store” (0.827). This indicates that
investors tend to use tickers to search for stock-related information, whereas consumers tend to search for the
company’s name for product and retail information, which justifies our strategy of using the SVI for the stock ticker
instead of the company name as a proxy for investors’ attention.
$$\Delta SVI_t = \ln(SVI_t) - \ln\left[\mathrm{Med}(SVI_{t-1}, \ldots, SVI_{t-8})\right] \qquad (1)$$
where $SVI_t$ is the search volume index during week $t$ obtained from Google Trends, and
$\mathrm{Med}(SVI_{t-1}, \ldots, SVI_{t-8})$ is the median value of the SVI during the previous eight weeks. Because a
positive ∆SVI indicates a surge in investor attention, it is more likely to
be followed by trading activity. Another benefit of using ∆SVI is that time trends and
low-frequency seasonality are removed (Da et al., 2011).
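As a minimal sketch, the computation in Equation (1) could be written as follows for a single ticker. The function name `delta_svi` and the list-based input are assumptions made for illustration. The sketch also returns no value for any week whose eight-week window is incomplete or contains a zero SVI, which is only one possible reading of the exclusion of zero-valued SVIs.

```python
import math
from statistics import median

def delta_svi(svi, window=8):
    """Abnormal attention: ln(SVI_t) minus ln(median SVI of the prior `window` weeks).

    `svi` is a list of weekly SVI values for one ticker, oldest first.
    Weeks with an incomplete window, or with a non-positive SVI at t or
    anywhere in the window, yield None (an assumption of this sketch).
    """
    result = []
    for t, current in enumerate(svi):
        history = svi[max(0, t - window):t]
        usable = len(history) == window and current > 0 and all(v > 0 for v in history)
        if not usable:
            result.append(None)
            continue
        result.append(math.log(current) - math.log(median(history)))
    return result

weekly = [10, 12, 11, 10, 13, 12, 11, 10, 22]
changes = delta_svi(weekly)
```

In this example the median of the first eight weeks is 11, so the ninth week, with an SVI of 22, has a ∆SVI of ln(2), or roughly 0.69.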
B. Passive Attention Measures
A commonly used passive attention measure is media coverage in newspapers. For example,
Fang and Peress (2009) focus on four daily newspapers with nationwide circulation in the
US: the New York Times, USA Today, the Wall Street Journal, and the Washington Post. We
argue, however, that the average retail investor is unlikely to subscribe to more than two to
three newspapers at the same time. A more convenient and inexpensive way for them to
obtain news is through the internet, and every piece of news on the internet has “global
circulation and access”.
The advanced “news search” section in Google News enables us to obtain a figure for the
total number of relevant news items per year, for each company in our sample, from 2004 to
2009.8 To obtain the number of news items, we use the company name instead of the ticker,
because tickers are only reported in financial newspapers but retail investors do not
necessarily get their information from financial newspapers only. The multiple meanings of
8 The number of news items related to a particular search term over a given period of time is available from
Google News (http://news.google.com) database, which aggregates news from 4,500 English-language news
sources worldwide. The stories are sorted without any consideration of political viewpoint or ideology.
the names of some companies may add noise to our data (e.g., Apple). However, due to the
large number of news items, it would be infeasible for us to read every article in order to
exclude the irrelevant ones. This noise, however, is expected to bias against our
obtaining consistent results. A feature of Google News is that it counts each appearance of
the same article in multiple newspapers. Thus, it also reflects the dissemination of news,
which is closely related to the passive attention of individuals.
Prior research also suggests that advertising expenditure is a measure of passive attention
because intensive advertising promotes awareness of the company’s products
among consumers and of its stock among investors (Grullon et al.,
2004). In our study, we control for both Google News coverage and the firm’s advertising expenditure so
that the incremental effect of active attention reflected by the SVI can be disentangled.
C. Research Design and Data
To investigate how investor attention affects the breadth of ownership and stock liquidity, we
incorporate the attention measures into the models of Grullon et al. (2004) as follows:
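$$\mathrm{LnNumS}_{i,t} = \beta_0 + \beta_1\,\mathrm{SVI}_{i,t} + \beta_2\,\mathrm{LnNews}_{i,t} + \beta_3\,\mathrm{LnAdv}_{i,t} + \beta_4\,(1/P)_{i,t} + \text{Controls}_{i,t} + \varepsilon_{i,t} \qquad (2)$$
$$\mathrm{RBAS}_{i,t} = \beta_0 + \beta_1\,\mathrm{SVI}_{i,t} + \beta_2\,\mathrm{LnNews}_{i,t} + \beta_3\,\mathrm{LnAdv}_{i,t} + \beta_4\,(1/P)_{i,t} + \text{Controls}_{i,t} + \varepsilon_{i,t} \qquad (3)$$
where the number of shareholders (LnNumS) and the relative bid-ask spread (RBAS) are
regressed against the search volume index (SVI), the number of news items available online
(LnNews), and the advertising expenses (LnAdv). To confirm our predictions, we expect to
find a significantly positive $\beta_1$ in equation (2) and a significantly negative $\beta_1$ in equation (3).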
We also use the change in SVI (DSVI) instead of the level of the SVI as a robustness check.
The change in SVI defined in Equation (1) reflects an abnormal “jump” in the SVI relative to
the “normal” level over a longer time period (the previous eight weeks). As explained earlier,
it can also remove time trends and low-frequency seasonality (Da et al., 2011). The annual
observations of the number of shareholders, advertising expenses and other accounting data
are obtained from Compustat. A large proportion of firms do not report their advertising
expenses. Replacing missing advertising expenditure with zero is an approach commonly
used in previous studies to maintain sample size (e.g., Grullon et al., 2004; Banker, Huang
and Natarajan, 2011). In this study, because we use the natural logarithm of advertising
expenditure in our analysis, we replace any missing values with $0.01 rather than zero. As a
robustness check, we also replicate the analysis based on the smaller sample excluding those
firms with missing advertising expenditure and the results are consistent.
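The treatment of missing advertising expenditure can be illustrated with a short Python sketch. The names `log_advertising` and `drop_missing` are hypothetical, and the sketch assumes that a missing value is represented as `None` and that the 0.01 replacement is expressed in the same unit as the reported expenditure.

```python
import math

def log_advertising(adv, fill_value=0.01):
    """Natural log of advertising expenditure, with missing values replaced.

    `adv` is the reported expenditure, or None when the firm does not report it.
    `fill_value` mirrors the 0.01 replacement described in the text; its unit is
    assumed to match the unit of the expenditure data.
    """
    return math.log(fill_value if adv is None else adv)

def drop_missing(rows):
    """Robustness variant: keep only firm-years with reported advertising."""
    return [row for row in rows if row["adv"] is not None]

firm_years = [
    {"firm": "A", "adv": 120.0},
    {"firm": "B", "adv": None},  # advertising expenditure not reported
]
ln_adv = [log_advertising(row["adv"]) for row in firm_years]
reported_only = drop_missing(firm_years)
```

Replacing missing values with a small positive number keeps the logarithm defined, whereas a zero replacement would not.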
We calculate the relative bid-ask spread as the monthly average of the ratio of the daily inside
spread to the midpoint of the daily inside spread from CRSP (Center for Research in Security
Prices). Chung and Zhang (2009) suggest the daily CRSP-based spread as a good substitute
for the TAQ-based spread in that the former represents at least 91% (78%) of the
cross-sectional variation in the latter for NASDAQ (NYSE/AMEX) stocks. Following Chung
and Zhang (2009), we filter the data for errors by dropping any observations of relative spread
that are greater than 50% of the midpoint. This removes 26,732 of the
original 10,238,830 daily observations (0.26%). We then transform the daily data into monthly data to perform the
analysis. For robustness checks, we replicate the analysis using alternative liquidity measures,
including the effective spread and the relative effective spread. The change in relative spread
is defined as the monthly change in relative spread in percentage terms. The effective spread
is constructed as twice the absolute difference between the transaction price and the spread midpoint.
The relative effective spread is the effective spread scaled by the midpoint of the spread.
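The spread measures and the 50% error filter described above can be sketched as follows. The function names are hypothetical, and the sketch assumes each daily observation is available as a bid and ask quote (plus a transaction price for the effective spread) rather than in any particular CRSP field layout.

```python
def midpoint(bid, ask):
    """Midpoint of the inside quotes."""
    return (bid + ask) / 2.0

def relative_spread(bid, ask):
    """Quoted spread scaled by the quote midpoint."""
    return (ask - bid) / midpoint(bid, ask)

def effective_spread(price, bid, ask):
    """Twice the absolute distance between the trade price and the midpoint."""
    return 2.0 * abs(price - midpoint(bid, ask))

def relative_effective_spread(price, bid, ask):
    """Effective spread scaled by the quote midpoint."""
    return effective_spread(price, bid, ask) / midpoint(bid, ask)

def filter_spreads(quotes, max_relative_spread=0.5):
    """Drop (bid, ask) pairs whose relative spread exceeds the error filter."""
    return [(b, a) for b, a in quotes if relative_spread(b, a) <= max_relative_spread]

quotes = [(9.9, 10.1), (19.5, 20.5), (1.0, 30.0)]  # last pair is a likely data error
clean = filter_spreads(quotes)
```

For the first quote pair, the midpoint is 10.0 and the relative spread is about 2%; the third pair is dropped because its spread far exceeds half of its midpoint.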
In order to perform the empirical analysis, we transform the daily spread measures
and the weekly attention measures (SVI and DSVI) into monthly observations by taking the
average in each calendar month. Then, we merge the annual observations of
advertising expenses into the firm-month panel data.
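The aggregation and merge steps can be sketched in plain Python. The function names `monthly_average` and `attach_annual` are hypothetical. The sketch assumes that each daily or weekly observation carries an ISO date string, that a weekly value is assigned to the month of its date stamp, and that annual data are matched on the calendar year; the text does not specify these details.

```python
from collections import defaultdict
from statistics import mean

def monthly_average(observations):
    """Average higher-frequency values within each firm and calendar month.

    `observations` is an iterable of (firm, 'YYYY-MM-DD', value) tuples.
    Returns a dict keyed by (firm, 'YYYY-MM').
    """
    buckets = defaultdict(list)
    for firm, date, value in observations:
        buckets[(firm, date[:7])].append(value)
    return {key: mean(values) for key, values in buckets.items()}

def attach_annual(monthly, annual):
    """Attach an annual variable to every firm-month of the same calendar year.

    `annual` is a dict keyed by (firm, 'YYYY'); months without a matching
    annual record receive None.
    """
    rows = []
    for (firm, month), value in sorted(monthly.items()):
        rows.append({
            "firm": firm,
            "month": month,
            "svi": value,
            "adv": annual.get((firm, month[:4])),
        })
    return rows

weekly_svi = [("A", "2004-01-05", 1.0), ("A", "2004-01-12", 3.0), ("A", "2004-02-02", 5.0)]
monthly_svi = monthly_average(weekly_svi)
panel = attach_annual(monthly_svi, {("A", "2004"): 120.0})
```

Each firm-month in the resulting panel repeats the firm's annual value, which is why the firm-level observations are not independent across months.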
Following Grullon et al. (2004), we control for other factors that may have an impact on
stock liquidity. The market microstructure model (Ho and Stoll, 1980) suggests that a high
trading volume reduces the inventory cost per trade and therefore leads to a smaller bid-ask
spread. Hence, stocks with a high trading volume are expected to have smaller spreads. We
control for share turnover (LnTurnover), which is constructed as the monthly average of the
share volume divided by shares outstanding from CRSP. Large firms tend to have high
trading volumes and thus smaller spreads, and therefore we also control for firm market
capitalization (lnMC) from CRSP. Investors may have a preference for stocks within a certain
price range, so we also include the inverse of the closing price from CRSP (1/P) in our
analysis. Return volatility and firm age are included to proxy for risks. Return volatility is the
monthly average of the standard deviation of daily returns, obtained from CRSP. Firm age is
the number of years for which the firm has been included in CRSP. Average monthly return
(RET) and return on assets (ROA) are used to control for market performance and
profitability. Average monthly return is the average of the daily stock returns from CRSP.
Return on assets is constructed from Compustat as the annual operating income before
depreciation, scaled by total assets. Finally, an exchange dummy (NASDAQ is assigned the
value 1 for firms listed on the NASDAQ, and 0 otherwise) is included to account for
systematic differences in the market microstructure. Following Grullon et al. (2004), for
some variables we take their natural logarithm, as shown in Equations (2) and (3), and we
include industry and year fixed effects in the analysis.9 The final sample consists of 14,690
firm-month observations over the period from 2004 to 2009. The top and bottom 0.5% of the
variables are winsorized to reduce the possible effects of spurious outliers.
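Winsorizing at the top and bottom 0.5% can be illustrated with the Python standard library. The function name `winsorize` is hypothetical, and the choice of the "inclusive" interpolation rule for the percentile cut-offs is an assumption, since the text does not state how the percentiles are computed.

```python
from statistics import quantiles

def winsorize(values, tail=0.005):
    """Clip values below the `tail` quantile and above the (1 - tail) quantile."""
    n_groups = round(1 / tail)  # 200 equal-probability groups for a 0.5% tail
    cuts = quantiles(values, n=n_groups, method="inclusive")
    low, high = cuts[0], cuts[-1]  # 0.5th and 99.5th percentiles
    return [min(max(v, low), high) for v in values]

sample = list(range(1, 202))  # 1, 2, ..., 201
clipped = winsorize(sample)
```

Unlike trimming, this keeps every observation in the sample and only pulls the most extreme values in to the cut-offs.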
D. Summary Statistics
Panel A of Table I presents the descriptive statistics of the variables. Both the mean and
median of the SVI change (ΔSVI) are positive, showing an upward trend in the attention paid
to the tickers of S&P 500 firms. The media coverage of the firms, according to Google News,
varies from 313 (25%) to 3,320 (75%), with a mean (median) of 9,914 (1,200). This is
substantially larger than the amount of newspaper coverage documented in Fang and Peress
(2009), where the mean (median) was 12 (5). The difference indicates that firms are better
covered by online media than by traditional media such as national newspapers. The mean
(median) advertising expenditure is $449 ($144) million, which is much larger than the
figures documented in Grullon et al. (2004) based on an earlier sample from 1993 to 1998.
This shows that firms are spending much more on advertising nowadays. The number of
shareholders ranges from 4,000 (25%) to 51,000 (75%), with a mean (median) of 74,000 (14,000). The
mean (median) of the relative spread is 0.029 (0.0241). The average firm in our sample is
older and larger than that in Grullon et al. (2004), presumably for two reasons. First, we only
include the constituents of the S&P 500, in which newly listed firms are less likely to be
9 Following Grullon et al. (2004), the average monthly return (RET) is only incorporated in equation (2), that
is, when the dependent variable is the number of shareholders (lnNumS).
included. Second, Google Trends reports SVIs only for terms whose search volume exceeds a threshold,
so SVIs are not available for some fledgling firms.
[Insert Table I about here]
E. The Active and Passive Attention Measures
In Table II, we explore how the SVI and ΔSVI, the newly proposed direct measures of active
attention, are related to the traditional passive attention measures, and to firm-specific
characteristics. Table II shows that news coverage and advertising expenditure are positively
associated with the SVI, which suggests that investors pay more attention to firms with
greater visibility in terms of news coverage and expenditure on advertising. The coefficients
of turnover, return on assets, firm size and return volatility are significantly positive, showing
that firms with good operating performance, actively traded stocks, high market value, and
higher risk grab more attention from retail investors. This is in line with the finding of
Seasholes and Wu (2007) that stocks with higher returns or higher risk receive more news
coverage and therefore attract more attention among investors. The coefficient of firm age is
significantly negative, and this might be due to the impact of newly founded Information
Technology glamour companies.10 Despite the significance of the explanatory variables, the
explanatory power of the model is low, and 95% of the variation in the active attention
measures remains unexplained. In model II, we regress the change in SVI against the same
set of explanatory variables. The only significant variable here is turnover, and more than
10 We partition the sample according to the median of firm age, and run the regressions again on the two
subsamples. We find that the coefficient of firm age is significantly positive (p < 0.01) in the subsample of older
firms, and significantly negative (p < 0.01) in the subsample of younger firms. Since there are more Information
Technology firms in the young subsample, we conjecture that the negative coefficient of firm age obtained for
the full sample is attributed to their impact.
99% of the variation is unexplained. This shows the distinction in the aspects of attention
captured by the active and passive measures.
[Insert Table II about here]
III. Active Attention Measure and Breadth of Ownership
We perform the regression analysis as shown in equation (2) to test the effect of investor
attention on the shareholder base.11 We regress the natural logarithm of the number of
shareholders against the active attention measures, passive attention measures including
online news coverage and advertising expenditure, and a set of control variables suggested in
Grullon et al. (2004) to explain cross-sectional variations in the breadth of ownership. The
results are reported in Panel A of Table III. In model I we include only the SVI and the
control variables. Here, the coefficient of SVI is significantly positive, which suggests that
active attention is positively associated with the size of the shareholder base. The coefficients
of firm age, firm size and return on assets are significantly positive, showing that profitable
firms, large firms and long-standing firms enjoy a larger shareholder base. The coefficient of
1/P is positive and significant, in line with the explanation that individual investors are likely
to buy stocks within a certain price range (i.e., higher 1/P).
We incorporate online news coverage in model II, and advertising expenditure in model III.
The effect of the SVI remains significant after controlling for the passive attention measures.
The coefficient of online news coverage is significantly positive, suggesting that firms that
11 As shown in Panel A of the appendix, we first conduct a univariate test in the following way: we classify the
firms into low-attention and high-attention subsamples based on the median value of the SVI. The former group
of firms is associated with 47,310 fewer shareholders on average, and this difference is significant at the 1%
level. We further classify the two subsamples into small and large firms, based on the median value of market
capitalization. The difference between the number of shareholders for the low-attention and high-attention sub-
samples is 2,270 and 74,380 for the small and large subsamples respectively, and the differences are statistically
significant. The results support our prediction that firms with a higher amount of active attention paid to them
will be associated with a larger shareholder base, no matter how large the firm is. The results of these tests are
available from the authors upon request.
are widely covered by news stories on the internet are associated with a larger shareholder
base. The impact of advertising expenditure is positive, but marginally insignificant. All
active and passive measures are incorporated in model III, and the positive effect of the SVI
on the shareholder base remains significant after controlling for the passive attention
measures. The results are also economically significant. According to model I, a one standard
deviation (1.34) increase in the SVI leads to an increase of 1,000 shareholders, which is 6%
of the median number of shareholders (15,000) for our sample firms.
Because S&P 500 firms are observed multiple times in the firm-month panel data, we cluster
the standard errors in model IV and apply a bootstrapped regression as a
robustness check. The coefficient of the SVI remains significantly positive. In model V, we
replicate the test based on a smaller sample of 6,742 observations by excluding firms with
missing advertising expenditure. Once again, consistent results are obtained. Overall, the
results reported in Panel A show that the positive impact of active attention on the
shareholder base is robust to the control of the passive attention measures, firm
characteristics, and year and industry fixed effects.
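The idea behind the bootstrapped robustness check can be sketched as a cluster bootstrap, in which whole firms rather than individual firm-months are resampled. This is a simplified illustration and not the authors' procedure: it assumes clustering at the firm level, uses a single regressor without controls or fixed effects, and the names `ols_slope` and `cluster_bootstrap_se` and all data values are hypothetical.

```python
import random
from statistics import mean, stdev

def ols_slope(pairs):
    """Slope from a simple regression of y on x, given (x, y) pairs."""
    x_bar = mean(x for x, _ in pairs)
    y_bar = mean(y for _, y in pairs)
    sxx = sum((x - x_bar) ** 2 for x, _ in pairs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    return sxy / sxx

def cluster_bootstrap_se(panel, n_boot=200, seed=0):
    """Bootstrap standard error of the slope, resampling whole firms.

    `panel` maps each firm to its list of (x, y) observations. Drawing firms
    with replacement preserves the within-firm dependence across months.
    """
    rng = random.Random(seed)
    firms = list(panel)
    slopes = []
    for _ in range(n_boot):
        drawn = [rng.choice(firms) for _ in firms]
        resample = [obs for firm in drawn for obs in panel[firm]]
        slopes.append(ols_slope(resample))
    return stdev(slopes)

# Made-up firm-month data: x plays the role of SVI, y the dependent variable.
panel = {
    "A": [(1, 2.1), (2, 4.0), (3, 6.2)],
    "B": [(2, 5.0), (3, 7.1), (4, 8.9)],
    "C": [(1, 1.5), (3, 5.4), (5, 9.6)],
    "D": [(2, 3.8), (4, 8.3), (6, 12.1)],
}
se = cluster_bootstrap_se(panel)
```

Resampling at the firm level is what distinguishes this from an ordinary bootstrap, which would treat repeated observations of the same firm as independent.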
In Panel B, we replicate the test by replacing the SVI with the change in SVI, and examine its
impact on the shareholder base. Consistent with our prediction, the coefficient of the change
in SVI is significantly positive in model I, showing that an increase in active attention leads
to a larger number of shareholders. A one standard deviation (0.12) increase in the change of
SVI leads to an increase of about 1,000 in the shareholder base, which is about 6% of the
corresponding figure (15,000) for a median firm. The positive impact, again, is robust to the
control of the passive attention measures, firm characteristics, and year and industry fixed
effects, as shown in models II and III. The signs of the coefficients for online news and
advertising expenditure and the other control variables are consistent with those reported in
Panel A. Finally, we cluster the standard errors and apply the bootstrapped regression in
model IV as a robustness check, and replicate the test based on a smaller sample of 6,742
firm-month observations excluding observations with missing advertising expenditure in
model V. The findings remain consistent. To sum up, the results reported in Table III show
that retail investors tend to become shareholders of the listed firms to which they pay
attention through internet searches. The results suggest that the internet has become an
important tool for retail investors to gather information and make investment decisions.
[Insert Table III about here]
IV. Active Attention and Stock Liquidity
Table IV reports the results for the impact of investor attention on stock liquidity. As
shown in Equation (3), we regress the relative bid-ask spread on the SVI, online media
coverage, advertising expenditure and a set of control variables.12 In model I, we include only
the SVI and the control variables. Capturing the impact of active attention, the coefficient of
the SVI is significantly negative, which suggests that a higher level of investor attention
reflected by search frequency leads to a reduced bid-ask spread and therefore improved stock
liquidity. Models II and III incorporate passive attention measures, including Google
online news coverage and advertising expenditure, while Model IV clusters the
standard errors and applies the bootstrapped regression. Model V is based on the reduced
12 As shown in Panel B of the Appendix, we divide our sample firms into low-attention and high-attention
subsamples based on the median level of the SVI and test the difference in the means of the relative bid-ask
spread. The results show that the relative bid-ask spread is significantly smaller in the high-attention subsample.
When we further divide the sub-samples according to market capitalization, into small and large firms, the
difference in the bid-ask spread still exists for both types of firms. The difference is more pronounced for smaller
firms because they are, in general, less recognized by investors, and therefore more likely to benefit from
increased active attention from investors. The results are available upon request.