Detecting Influenza Epidemics Using Search Engine Query Data

Google Inc., 1600 Amphitheatre Parkway, Mountain View, California 94043, USA.
Nature (Impact Factor: 41.46). 12/2008; 457(7232):1012-4. DOI: 10.1038/nature07634
Source: PubMed


Seasonal influenza epidemics are a major public health concern, causing tens of millions of respiratory illnesses and 250,000 to 500,000 deaths worldwide each year. In addition to seasonal influenza, a new strain of influenza virus against which no previous immunity exists and that demonstrates human-to-human transmission could result in a pandemic with millions of fatalities. Early detection of disease activity, when followed by a rapid response, can reduce the impact of both seasonal and pandemic influenza. One way to improve early detection is to monitor health-seeking behaviour in the form of queries to online search engines, which are submitted by millions of users around the world each day. Here we present a method of analysing large numbers of Google search queries to track influenza-like illness in a population. Because the relative frequency of certain queries is highly correlated with the percentage of physician visits in which a patient presents with influenza-like symptoms, we can accurately estimate the current level of weekly influenza activity in each region of the United States, with a reporting lag of about one day. This approach may make it possible to use search queries to detect influenza epidemics in areas with a large population of web search users.
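The correlation-then-estimate idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the model form, so the linear fit on the log-odds scale used here is an assumption made for illustration. The weekly numbers are synthetic, and all names (`query_fraction`, `ili_visit_fraction`, `estimate_ili`) are hypothetical rather than taken from the paper.

```python
import math


def logit(p):
    """Log-odds of a proportion p in the open interval (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: maps a real number back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    slope = cov / vx
    return my - slope * mx, slope


# Synthetic weekly data (NOT from the paper): fraction of searches that are
# flu-related, and fraction of physician visits with influenza-like illness.
query_fraction = [0.0010, 0.0012, 0.0018, 0.0030, 0.0045, 0.0038, 0.0022, 0.0013]
ili_visit_fraction = [0.010, 0.012, 0.019, 0.031, 0.048, 0.040, 0.023, 0.014]

# Assumed model: logit(ILI fraction) = b0 + b1 * logit(query fraction).
x = [logit(q) for q in query_fraction]
y = [logit(p) for p in ili_visit_fraction]
r = pearson(x, y)
b0, b1 = fit_line(x, y)


def estimate_ili(q):
    """Estimate the ILI visit fraction for a new week's query fraction q."""
    return inv_logit(b0 + b1 * logit(q))
```

Calling `estimate_ili` on the latest week's query fraction gives an estimate without waiting for physician-visit reports, which is the source of the short reporting lag the abstract describes.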



Available from: Mark Smolinski, Sep 17, 2015
    • "Such an extreme potential of Internet search data has been put into practice and it is now being used for tracking or even anticipating various social phenomena. The utilization ranges from influenza tracking (Dugas et al. 2012; Ginsberg et al. 2008), consumer interest and its impact on product sales (Choi and Varian 2009; Goel et al. 2010; Kulkarni 2012) to macroeconomic indicators (Askitas and Zimmermann 2009; Cooper et al. 2005; Preis et al. 2010). The work of Merton (1987) suggests that attention may be also relevant for the complex reality of financial markets and Preis et al. (2008) are among the first ones to support this hypothesis using the web search data to proxy attention. "
    ABSTRACT: Online activity of Internet users has proven very useful in modeling various phenomena across a wide range of scientific disciplines. In our study, we focus on two stylized facts or puzzles surrounding the initial public offerings (IPOs) - the underpricing and the long-term underperformance. Using the Internet searches on Google, we proxy the investor attention before and during the day of the offering to show that the high attention IPOs have different characteristics than the low attention ones. After controlling for various effects, we show that investor attention still remains a strong component of the high initial returns (the underpricing), primarily for the high sentiment periods. Moreover, we demonstrate that the investor attention partially explains the overoptimistic market reaction and thus also a part of the long-term underperformance.
    SpringerPlus 12/2015; 4(1):84. DOI:10.1186/s40064-015-0839-4
    • "Recent studies demonstrated that web search streams could be used to analyze trends about several phenomena (Choi and Varian, 2012) (Rose and Levinson, 2004) (Bordino et al., 2012). In one of the most interesting works, Ginsberg et al. proved that search query volume is a sophisticated way to detect regional outbreaks of influenza in USA almost 7 days before CDC surveillance (Ginsberg et al., 2009). There are also studies that report another use in a search engine, namely as a possible predictor of market trends. "
    ABSTRACT: In the last decade, Web 2.0 services have been widely used as communication media. Due to the huge amount of available information, searching has become dominant in the use of Internet. Millions of users daily interact with search engines, producing valuable sources of interesting data regarding several aspects of the world. Search queries prove to be a useful source of information in financial applications, where the frequency of searches of terms related to the digital currency can be a good measure of interest in it. Bitcoin, a decentralized electronic currency, represents a radical change in financial systems, attracting a large number of users and a lot of media attention. In this work we studied the existing relationship between Bitcoin's trading volumes and the queries volumes of Google search engine. We achieved significant cross correlation values, demonstrating search volumes power to anticipate trading volumes of Bitcoin currency.
    Information Filtering and Retrieval - DART 2015, Lisbon; 11/2015
    • "The number of online searches through Web search engines generates trend data, which can be subsequently analyzed over time. Pattern analysis of such big data or metadata can work as a real-time surveillance approach to complement more traditional data-gathering techniques (Ginsberg et al. 2009). Google is the most popular search engine worldwide (covering almost 90 % of the total online searches) (NetMarketShare 2015). "

    Natural Hazards 09/2015; DOI:10.1007/s11069-015-1961-x · 1.72 Impact Factor