Detecting Influenza Epidemics Using Search Engine Query Data

Google Inc., 1600 Amphitheatre Parkway, Mountain View, California 94043, USA.
Nature (Impact Factor: 41.46). 12/2008; 457(7232):1012-4. DOI: 10.1038/nature07634
Source: PubMed


Seasonal influenza epidemics are a major public health concern, causing tens of millions of respiratory illnesses and 250,000 to 500,000 deaths worldwide each year. In addition to seasonal influenza, a new strain of influenza virus against which no previous immunity exists and that demonstrates human-to-human transmission could result in a pandemic with millions of fatalities. Early detection of disease activity, when followed by a rapid response, can reduce the impact of both seasonal and pandemic influenza. One way to improve early detection is to monitor health-seeking behaviour in the form of queries to online search engines, which are submitted by millions of users around the world each day. Here we present a method of analysing large numbers of Google search queries to track influenza-like illness in a population. Because the relative frequency of certain queries is highly correlated with the percentage of physician visits in which a patient presents with influenza-like symptoms, we can accurately estimate the current level of weekly influenza activity in each region of the United States, with a reporting lag of about one day. This approach may make it possible to use search queries to detect influenza epidemics in areas with a large population of web search users.
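The relationship the abstract describes, a query frequency that tracks the percentage of influenza-like illness (ILI) physician visits, can be illustrated with a correlation and a simple linear fit. The sketch below is not the authors' model: the function names are hypothetical, the weekly numbers are made up purely for illustration, and an ordinary least-squares line on raw percentages is assumed as a stand-in for whatever the paper actually fits.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up weekly values for one region (illustrative only, not real data).
# query_fraction: flu-related queries as a percentage of all queries.
# ili_visits: percentage of physician visits with influenza-like symptoms.
query_fraction = [0.10, 0.15, 0.22, 0.30, 0.26, 0.18]
ili_visits = [1.2, 1.7, 2.5, 3.3, 2.9, 2.0]

r = pearson(query_fraction, ili_visits)
slope, intercept = fit_line(query_fraction, ili_visits)

def estimate_ili(q):
    """Estimate the ILI visit percentage from a new week's query fraction."""
    return intercept + slope * q

print(f"correlation: {r:.3f}")
print(f"estimated ILI % at query fraction 0.20: {estimate_ili(0.20):.2f}")
```

Once the line is fitted on historical weeks, a new week's query fraction yields an estimate as soon as the query logs are available, which is where the short reporting lag mentioned in the abstract would come from.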

Available from: Mark Smolinski, Sep 17, 2015
  • Source
    • "One of the most well-known applications is Google Flu Trends. In an influential article published in Nature, Ginsberg et al. (2009) explain how Google Trends can be used to improve the early detection of seasonal influenza by monitoring search engines like Google. This approach seems to work well because of the high correlation between the percentage of doctor visits and the relative frequency of specific queries on Google."

    Preview · Article · Dec 2016
  • Source
    • "New large-scale and high-frequency data sets have been presented in the academic literature with the promise of being able to improve macroeconomic measurement (see, for example, Aruoba and Diebold 2010). Previously, early studies have shown that Internet search query data might help predict influenza epidemics (Ginsberg et al. 2009), video game sales (Goel et al. 2010), and housing market transactions (Wu and Brynjolfsson 2015). However, the data has only been used in a handful of studies. [...] Askitas and Zimmermann 2009), the U.S. (Choi and Varian 2012; D'Amuri and Marcucci 2012), the UK (McLaren and Shanbhogue 2011), Israel (Suhoy 2009), Finland (Tuhkuri 2014), Italy (D'Amuri 2009), Norway (Anvik and Gjelstad 2010), Turkey (Chadwick and Sengul 2012), France (Fondeur and Karamé 2013), Spain (Vicente et al. 2015), Czech Republic, Hungary, Poland, and Slovakia (Pavlicek and Kristoufek 2014). The questions are also more generally relevant since none of them have been discussed in-depth in other contexts in which Internet search data could be useful."
    DESCRIPTION: Data on Google searches help predict the unemployment rate in the U.S. But the predictive power of Google searches is limited to short-term predictions, the value of Google data for forecasting purposes is episodic, and the improvements in forecasting accuracy are only modest. The results, obtained by (pseudo) out-of-sample forecast comparison, are robust to a state-level fixed effects model and to different search terms. Joint analysis by cross-correlation function and Granger non-causality tests verifies that Google searches anticipate the unemployment rate. The results illustrate both the potentials and limitations of using big data to predict economic indicators.
    Full-text · Working Paper · Feb 2016
  • Source
    • "Traditional methods of gathering ecological data can be supplemented with new technologies. Temporal fluctuations in Google search volume and Wikipedia logs have been used to forecast influenza, dengue or tuberculosis outbreaks (Generous, Fairchild, Deshpande, Del Valle, & Priedhorsky 2014; Ginsberg et al. 2008; McIver & Brownstein 2014). In a recent study, Google Trends were successfully used to collect national-scale data on fluctuations in rodent numbers, to study the role of rodent predation pressure in wood warbler (Phylloscopus sibilatrix) habitat selection (Szymkowiak & Kuczyński 2015"
    ABSTRACT: Lyme disease is a major zoonosis in the northern hemisphere. It is caused by the spirochete Borrelia burgdorferi, transmitted by ticks (genus Ixodes), and the abundance of infected tick nymphs determines the risk of the disease in humans. In the eastern USA, fluctuations in oak (Quercus spp.) acorn production (including mast seeding) determine rodent abundance, which has been linked with Lyme borreliosis risk in humans. However, the predictive power of masting on Lyme disease risk in other systems has never been tested. We used a combination of field and Internet data to trace the ecological chain reaction that links acorn production by oaks and Lyme borreliosis risk in European forests. We found a positive relationship between oak acorn production (Q. robur and Q. petraea) in year T and the number of Lyme borreliosis incidences in year T+2. Acorn production was also positively correlated with Google search volume for the terms “tick” and “Lyme disease” two years later. Our results suggest that acorn production influences tick population, leading to fluctuations in the intensity of interactions between humans and ticks that can be seen in Google search dynamics. Thus, mast seeding together with the volume of specific Internet web searches appears to be a promising tool that could be used to alert the public.
    Full-text · Article · Jan 2016 · Basic and Applied Ecology