Correlation between Google Trends on dengue fever and national surveillance report in Indonesia

Background: Digital traces have been increasingly used for health monitoring in recent years. This approach is growing as a consequence of the increased use of mobile phones, the Internet, and machine learning. Many studies have reported Google Trends data as a potential data source to assist traditional surveillance systems. The rise of Internet penetration (54.7%) and the high utilization of Google (98%) indicate the potential of Google Trends in Indonesia. No previous study has measured the correlation between country-wide official dengue reports and Google Trends data in Indonesia.
Objective: This study aims to measure the correlation between Google Trends data on dengue fever and the Indonesian national surveillance report.
Methods: This research was a quantitative study using time series data (2012–2016). The two data sets were analyzed using moving average analysis in Microsoft Excel. Pearson and time lag correlations were also used to measure the correlation between them.
Results: Moving average analysis showed that Google Trends data have a linear time series pattern with official dengue reports. Pearson correlation indicated high correlation for three defined search terms, with R-values ranging from 0.921 to 0.937 (p ≤ 0.05, overall period) and an increasing trend in the epidemic period (2015–2016). Time lag correlation also indicated that Google Trends data can potentially be used as an early warning system and as a novel tool to monitor public reaction before the increase of dengue cases and during the outbreak.
Conclusions: Google Trends data have a linear time series pattern and are statistically correlated with annual official dengue reports. Identification of information-seeking behavior is needed to support the use of Google Trends for disease surveillance in Indonesia.
Full Terms & Conditions of access and use can be found at
http://www.tandfonline.com/action/journalInformation?journalCode=zgha20
Global Health Action
ISSN: 1654-9716 (Print) 1654-9880 (Online) Journal homepage: http://www.tandfonline.com/loi/zgha20
Atina Husnayain, Anis Fuad & Lutfan Lazuardi
To cite this article: Atina Husnayain, Anis Fuad & Lutfan Lazuardi (2019) Correlation between
Google Trends on dengue fever and national surveillance report in Indonesia, Global Health Action,
12:1, 1552652, DOI: 10.1080/16549716.2018.1552652
To link to this article: https://doi.org/10.1080/16549716.2018.1552652
© 2019 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
Published online: 08 Jan 2019.
ORIGINAL ARTICLE
Atina Husnayain (a), Anis Fuad (b) and Lutfan Lazuardi (c)
(a) E-Health Division, Center for Health Policy and Management, Faculty of Medicine, Public Health and Nursing, Universitas Gadjah Mada, Yogyakarta, Indonesia; (b) Department of Biostatistics, Epidemiology, and Population Health, Faculty of Medicine, Public Health and Nursing, Universitas Gadjah Mada, Yogyakarta, Indonesia; (c) Department of Health Policy Management, Faculty of Medicine, Public Health and Nursing, Universitas Gadjah Mada, Yogyakarta, Indonesia
ARTICLE HISTORY
Received 4 September 2018
Accepted 21 November 2018
RESPONSIBLE EDITOR
Stig Wall, Umeå University, Sweden
KEYWORDS
Google Trends; information seeking; digital epidemiology; dengue; Indonesia
Background
Digital traces have become a potential data source for health-related purposes in the past few years. Digital epidemiology is a new field that uses digital traces to explore the patterns of disease and health dynamics in a population. Salathé [1] defines it as follows: 'Digital epidemiology is epidemiology that uses data that was generated outside the public health system, i.e. with data that was not generated with the primary purpose of doing epidemiology.'
As Internet penetration becomes more widespread, mobile phone usage increases, and machine learning advances, the field of digital epidemiology provides a promising approach to assist traditional surveillance systems [1,2]. This approach potentially fills the gap in conventional surveillance systems in developing countries, which often suffer from underreporting, limited timeliness, and insufficient budgets for physical needs, facilities, and infrastructure [3–6]. Data provided by conventional surveillance systems often require weeks or months to be collected.
In Indonesia, a Ministry of Health regulation requires hospitals to report any new dengue case to the district health office within 24 hours of a confirmed diagnosis [7]. However, no single application was available to capture the data electronically. Consequently, each district has its own database structure for dengue cases. Data from districts are submitted monthly to the provincial and national levels. Reports at the provincial and national levels are aggregated as the number of cases by district and age group. Top-down feedback is provided by the Sub-directorate of Vector-Borne Diseases and Zoonoses under the Directorate General of Disease Prevention and Control in the Ministry of Health of the Republic of Indonesia. These circumstances potentially delay the response and indicate the need for an alternative data source that depicts dengue cases in near real time.
CONTACT Lutfan Lazuardi, lutfan.lazuardi@ugm.ac.id, Department of Health Policy Management, Faculty of Medicine, Public Health and Nursing, Universitas Gadjah Mada, 55281 Yogyakarta, Indonesia
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Among the digital traces that are increasingly studied for epidemiology are those recorded in search engines [2]. These data provide information-seeking patterns for specified search terms in defined locations during a specific time period. Digitally recorded data provided by Google are displayed on the Google Trends website (https://trends.google.com/trends/). Many studies have shown that Google Trends data correlate well with traditional surveillance data [8–14]. Those studies reveal the potential of Google Trends data, which can be obtained earlier, more easily, and at little cost compared with conventional reporting systems. On the other hand, some studies reported weak potential for Google Trends data, finding that they are more influenced by media clamor than by the actual epidemiological burden [15,16].
Increasing Internet penetration in Indonesia, which has reached 54.7%, and the high utilization of Google (98%) indicate the potential of Google Trends in Indonesia [17,18]. This study was designed to validate the use of Google Trends data as an alternative or complementary data source for dengue surveillance in Indonesia. No previous study has measured the correlation between country-wide official dengue reports and Google Trends data in Indonesia. This is the first study to measure the correlation between Google Trends data on dengue fever and the Indonesian surveillance report at the national level.
Methods
This research was a quantitative study using time series data from 2012 to 2016. The framework of this study was adapted from a previous study that validated Google Trends data at the national level [10]. We used official dengue reports from the Department of Arbovirus, Health Ministry, Indonesia, and Google search volumes related to dengue in Indonesia from Google Trends. Official dengue reports were used as the gold standard to validate the Google Trends data.
Laboratory-confirmed dengue cases reported in official dengue reports from 34 provinces in Indonesia are available on a monthly basis. The data were cleaned to check their completeness. Missing values from five provinces (Lampung, North Sulawesi, West Sulawesi, Papua, and West Papua) were filled in using the Amelia package in RStudio. This approach uses multiple imputation and is frequently used to handle missing values in time series data [19]. Multiple imputation uses a Bayesian approach to replace missing values with values drawn from a predictive distribution based on the observed data [20].
The complete official dengue reports were then transformed to the same interval as the relative search volume (RSV) in Google Trends data, so that the official dengue reports and Google Trends data could be compared in a single graph. This approach was also used in a previous study to transform official reports into interval data ranging from 0 to 100 [10]. With this approach, 0 represents the absence of dengue cases and 100 represents the highest incidence of dengue cases during 2012–2016.
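The rescaling described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: it assumes the transformation divides each monthly count by the largest monthly count in the study period and multiplies by 100, which matches the stated end points (0 for no cases, 100 for the highest incidence). The function name `to_rsv_scale` and the sample counts are hypothetical.

```python
def to_rsv_scale(monthly_cases):
    """Rescale case counts to 0-100, with the largest monthly count mapped to 100.

    Assumption: zero cases stay at 0 (no minimum is subtracted), so the
    scale mirrors how relative search volume treats its peak value.
    """
    peak = max(monthly_cases)
    if peak == 0:
        # No cases at all in the period: every month stays at 0.
        return [0.0 for _ in monthly_cases]
    return [100.0 * count / peak for count in monthly_cases]


# Made-up monthly counts, for illustration only.
example_cases = [0, 8000, 16000, 32000]
print(to_rsv_scale(example_cases))  # [0.0, 25.0, 50.0, 100.0]
```

Dividing by the peak rather than applying min-max scaling keeps a month with zero cases at exactly 0, which is the interpretation given in the text.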
We compared the normalized dengue cases with dengue Google search volume for the same period. Dengue Google search volume describes how often a defined search term is used by Indonesians to search Google for online information related to dengue. Data were downloaded as comma-separated values (CSV) files from the Google Trends website (https://trends.google.com/trends/) and are available on a weekly basis. Data were obtained using 19 search terms related to disease definition, symptoms, treatment, and disease vector, which are listed in Table 1. Search terms were collected from Google Trends (the most frequently used search terms) and Google Correlate (search terms with a pattern similar to the search term 'demam berdarah dengue').
Data obtained from Google Trends were then converted from weekly to monthly values using the mean. This method was also used in a previous study [11] and allowed the two data sets to be compared in a single line chart in Microsoft Excel. Series with relatively similar patterns were then visualized using moving average analysis. Moving average analysis was used to assess in more detail the pattern similarity between official dengue reports and Google Trends data. Pattern similarity includes the linearity of the pattern, the similarity of jumps, and the similarity of dengue outbreak periods in Indonesia.
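The study performed these steps in Microsoft Excel. The Python sketch below only illustrates the same two operations, weekly-to-monthly averaging and a moving average, under stated assumptions: each week is assigned to the month of its start date, and the moving average is a trailing 3-month window (the window length is not given in the text). The function names and sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def weekly_to_monthly(weekly):
    """Average weekly (week_start_date, rsv) pairs into one value per (year, month).

    Assumption: a week belongs to the month in which it starts.
    """
    buckets = defaultdict(list)
    for week_start, rsv in weekly:
        buckets[(week_start.year, week_start.month)].append(rsv)
    return {month: mean(values) for month, values in sorted(buckets.items())}


def moving_average(values, window=3):
    """Trailing moving average; the first window-1 points produce no output."""
    return [
        mean(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


# Made-up weekly search volumes, for illustration only.
weekly_rsv = [
    (date(2016, 1, 3), 10),
    (date(2016, 1, 10), 20),
    (date(2016, 2, 7), 30),
]
monthly_rsv = weekly_to_monthly(weekly_rsv)
print(monthly_rsv)
print(moving_average([3, 6, 9, 12], window=3))
```

A trailing window is used here so that each smoothed point depends only on past months; a centered window would be an equally reasonable reading of the method.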
Pearson correlation was computed for the search terms with the highest pattern similarity to official dengue reports. High correlation was defined as a correlation coefficient (R-value) ≥ 0.7 (p ≤ 0.05). We also performed time lag correlation analysis, with the significance level at p ≤ 0.05, for the search terms with the highest correlation. Time lag correlation computes the correlation between lagged variables and the official dengue case history. Statistical analysis was conducted using Stata version 13.
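The study ran its correlation analysis in Stata 13. As a rough illustration of the two statistics, the plain-Python sketch below computes a Pearson coefficient and a lagged version of it. It assumes that a lag of k months pairs search volume in month t-k with reported cases in month t, and it does not compute p-values. The function names and sample series are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric series.

    Undefined (division by zero) when either series is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def lagged_r(searches, cases, lag):
    """Correlate search volume in month t-lag with cases in month t."""
    if lag == 0:
        return pearson_r(searches, cases)
    # Drop the last `lag` search values and the first `lag` case values
    # so that each search month lines up with the case month `lag` later.
    return pearson_r(searches[:-lag], cases[lag:])


# Made-up series in which cases repeat the search pattern one month later.
searches = [1, 3, 2, 5, 4, 6]
cases = [0, 1, 3, 2, 5, 4]
for lag in range(0, 3):
    print(lag, round(lagged_r(searches, cases, lag), 3))
```

In this toy example the lag-1 coefficient is the largest because the case series was built as a one-month shift of the search series; real data would also need a significance test before any lag is interpreted.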
Results
Figure 1 shows the time series of dengue cases in Indonesia from 2012 until 2016. There were four peaks of dengue cases, the highest in February 2016 with 32,117 dengue cases. Figure 1 also shows the dengue outbreak periods in Indonesia: cases tended to increase between October and January and then peaked in January to March.
The time series of dengue cases was then visualized in a single graph with Google Trends data. Figure 2 shows the moving averages of official dengue reports and of the Google Trends data with relatively similar patterns. The search terms 'gejala demam berdarah', 'demam berdarah', and 'dbd' appear to be in line with official dengue reports. Information seeking using the search term 'demam berdarah' increased at points 9 (22.6), 23 (27.2), 34 (26.8), and 46 (24.6). The search term 'dbd' increased at points 10 (11.6), 22 (13.4), 33 (15.5), and 46 (16.2), followed by the search term 'gejala demam berdarah', which increased at points 11 (15.5), 23 (18.3), 33 (20), and 48 (21.4). Compared with official dengue reports, which increased at points 11 (17.2), 25 (20), 34 (24.3), and 47 (18), the search terms 'gejala demam berdarah', 'demam berdarah', and 'dbd' increased 1 to 3 points before the increase of dengue cases. There were four peaks in the 5 years, visible in both the official dengue reports and the three search terms from Google Trends, at points 15, 27, 39, and 51. Figure 2 also shows that information seeking using the search term 'demam berdarah' tended to have higher values than official dengue reports, unlike the search terms 'gejala demam berdarah' and 'dbd', which tended to have lower values than official dengue reports.
The Pearson correlation results in Table 2 show high correlation (R-value ≥ 0.7 and p ≤ 0.05) between official dengue reports and Google Trends data. Correlations for the three search terms over the overall time period range from 0.921 to 0.937. The search term 'gejala demam berdarah' has the highest R-value over the overall time period. Over the 5 years, R-values appear to increase in the epidemic period (2015–2016), and the search terms 'gejala demam berdarah' and 'demam berdarah' appear to have stable R-values. The time lag correlation results in Table 3 show high correlation (R-value ≥ 0.7 and p ≤ 0.05) between official dengue reports and Google Trends data from a month earlier, with R-values ranging from 0.755 to 0.773. Information seeking using the search term 'gejala demam berdarah' in the prior month shows the highest correlation with official dengue reports (R-value = 0.773; p ≤ 0.05).
Discussion
Validation using moving average analysis showed that Google Trends data have a linear time series pattern correlated with official dengue reports. This finding is consistent with previous research by Cho [10]. Information seeking using the search terms 'gejala demam berdarah', 'demam berdarah', and 'dbd' followed the dengue outbreak period in Indonesia, which runs from October to January with a peak in January to March, during the epidemic years 2015 and 2016.
Table 1. List of search terms.
Num | Category | Search terms | Description | Source
1 | Disease definition | 'demam berdarah', 'dengue', 'dengue fever', 'fever', 'penyakit demam', 'penyakit demam berdarah' | Terms used to identify search patterns related to disease definition in Bahasa Indonesia | Google Correlate
1 | Disease definition | 'demam berdarah dengue pdf', 'dengue hemorrhagic fever', 'dhf', 'demam berdarah dengue', 'dbd' | Terms used to identify search patterns related to disease definition in Bahasa Indonesia | Google Trends
2 | Symptom | 'berdarah', 'demam' | Terms used to identify search patterns related to dengue symptoms in Bahasa Indonesia | Google Correlate
2 | Symptom | 'gejala demam berdarah dengue', 'gejala demam berdarah' | Terms used to identify search patterns related to dengue symptoms in Bahasa Indonesia | Google Trends
3 | Treatment | 'obat demam berdarah dengue' | Terms used to identify search patterns related to dengue treatment in Bahasa Indonesia | Google Trends
4 | Vector of disease | 'aedes', 'aedes aegypti', 'aegypti' | Terms used to identify search patterns related to the dengue vector in Bahasa Indonesia | Google Correlate
Figure 1. Time series of dengue cases in Indonesia (2012–2016).
Validation using Pearson correlation shows high correlation (R-value ≥ 0.7 and p ≤ 0.05) between official dengue reports and Google Trends data. This finding is consistent with previous studies by Althouse, Chan, Cho, Gluskin, Castro, Strauss, and Teng [8–14]. Those publications reported R-values ranging from 0.33 to 0.94. Studies in tropical countries [10,11,13,14] showed high correlation (R-values ranging from 0.82 to 0.94) between official surveillance data and Google Trends data, similar to the result of this study. In comparison, the R-values in this research are relatively high (0.921 to 0.937).
Figure 2. Moving average of dengue cases and information seeking using the search terms 'gejala demam berdarah', 'demam berdarah', and 'dbd' in Indonesia (2012–2016).
DOI for Dataset: 10.17632/x855pphhx9.1
Table 2. Results of Pearson correlation (R-values by search term).
Time period | 'gejala demam berdarah' (dengue symptom) | 'demam berdarah' (dengue) | 'dbd' (abbreviation of dengue)
Overall period | 0.937* | 0.931* | 0.921*
2012 | 0.936* | 0.918* | 0.862*
2013 | 0.847* | 0.850* | 0.719*
2014 | 0.844* | 0.814* | 0.570
2015 | 0.921* | 0.929* | 0.918*
2016 | 0.954* | 0.966* | 0.950*
*Significant at p ≤ 0.05.
Table 3. Results of time lag correlation.
Time lag (months) | 'gejala demam berdarah' (dengue symptom): r | p-value | 'demam berdarah' (dengue): r | p-value | 'dbd' (abbreviation of dengue): r | p-value
3 | 0.264* | 0.047 | 0.283* | 0.033 | 0.356* | 0.007
2 | 0.517* | <0.001 | 0.526* | <0.001 | 0.567* | <0.001
1 | 0.773* | <0.001 | 0.755* | <0.001 | 0.767* | <0.001
0 | 0.937* | <0.001 | 0.931* | <0.001 | 0.921* | <0.001
*Significant at p ≤ 0.05.
The high correlation between official dengue reports and Google Trends data in this study differs from the findings of Alicino, Cervellin, and Ellery [15,16,21]. Those studies found a dissociation between Google Trends data and disease occurrence, and found that Google Trends data are more influenced by media coverage than by the actual epidemiological burden. Thus, the potential of Google Trends data depends on media coverage, Internet penetration, and mobile phone use. In contrast to those findings, research by Chan found that information seeking related to dengue tended to be less influenced by media coverage [9].
According to the two validation steps, moving average analysis and Pearson correlation, Google Trends data are well correlated with official dengue reports. This research shows that Google Trends data are potentially useful as a complementary data source for disease surveillance in Indonesia, where Internet penetration reached 54.7% in 2017. One previous study suggested that Google Trends is better suited to developed countries with high Internet penetration [22].
The three search terms with a linear time series pattern and high correlation with official dengue cases were drawn from the disease definition and symptom categories. This finding is consistent with research by Althouse, Chan, Cho, and Kang [8–10,23]. Search terms that are in general use among netizens have higher correlation with official data [10,23]. This finding is also demonstrated in this study. Search terms in general use among Indonesian netizens, such as 'gejala demam berdarah', 'demam berdarah', and 'dbd', have higher correlation with official dengue reports than the search term 'demam berdarah dengue', even though that search term is the standard disease definition for dengue in Bahasa Indonesia. Unlike the search terms 'gejala demam berdarah' and 'demam berdarah', which had generally stable R-values as shown in Table 2, the search term 'dbd' had a fluctuating R-value. The search terms 'gejala demam berdarah' and 'demam berdarah' are specific search terms that return specific query results ('gejala demam berdarah' has 104,000 results and 'demam berdarah' has 2,560,000 results), whereas the search term 'dbd' is a broad query with 20,400,000 results.
Unlike previous research [8–10,23], search terms in this research were collected from Google Trends (the most frequently used search terms) and Google Correlate (search terms with a pattern similar to the search term 'demam berdarah dengue'). According to the Pearson correlation results in Table 2 and the list of search terms in Table 1, Google Trends and Google Correlate successfully describe keyword or search term use by Indonesians. Nevertheless, the accuracy of keyword or search term identification depends on information-seeking behavior, which is influenced by media trends, outbreak news briefs, disease occurrence, and Internet penetration [8,9,10,11,13,15,16,23]. Information-seeking behavior is also influenced by individual variables such as age, sex, level of education, cultural aspects, language, social class, marital status, level of healthcare utilization, and level of stress [23,25,26,27]. An additional factor that drives information-seeking behavior is keyword suggestion in Google. According to Wen and Sun [24], keyword suggestions are generated from previous queries using content-based re-ranking. Thus, keyword or search term use also depends on previous queries. In summary, the condition, distribution, and dynamics of the factors that influence information-seeking behavior may vary across the country and change over time. Therefore, identification of the factors that influence information-seeking behavior is needed to support the use of Google Trends for disease surveillance in Indonesia.
The increase in information seeking 1 to 3 points before the increase of dengue cases, as shown in Figure 2, and the high lag-1 correlation in the time lag analysis indicate the initial potential of Google Trends data as an early warning system. Some previous studies showed the potential of Google Trends data as an early warning system in countries with weak surveillance systems [8,9,11,13,28]. This finding also indicates the potential of Google Trends as a novel tool to monitor public reaction before the increase of dengue cases and during the outbreak. Google Trends can potentially capture public reaction in terms of worries, knowledge needs, and gaps, and these data can be obtained earlier, more easily, and at little cost [9,10,23,29].
Assuming that information seeking related to dengue tends to be less influenced by media coverage [9], Google Trends can potentially be used to capture knowledge needs and gaps between available and needed information. Gaps can be identified from search term or keyword use by Indonesians in Google Trends and Google Correlate before the increase of dengue cases and during the outbreak. Identified gaps can then be used to determine the topics or information to publish on official health websites and news channels. As the three most frequently accessed news channels in Indonesia, Tribunnews.com, Detik.com, and Liputan6.com could potentially be used to disseminate the needed information [30].
According to the World Health Organization, early warning systems are 'timely surveillance systems that collect information on epidemic-prone diseases in order to trigger prompt public health interventions' [31]. Using Google Trends for early warning and as a monitoring tool for public reaction is intended to assist traditional surveillance systems in order to improve the public health response to dengue in Indonesia. A study in Yogyakarta municipality reported that it takes on average 12 days to submit a report from the hospital to the district health office [32]. Because there is no standardized electronic dengue surveillance system, each district develops its system differently, with limited interoperability. Consequently, provincial health offices and the Ministry of Health lack disaggregated data for appropriate action.
In this regard, Google Trends is a prospective tool to overcome the timeliness problem in the conventional surveillance system, which requires weeks or months to collect data. In principle, this study offers opportunities to complement the existing surveillance system, especially in terms of early warning. However, some work remains to be done. This includes reducing noise to increase data quality, combining Google Trends with other data sources such as social media and online news data, and then creating an algorithm to produce early warnings. Google Trends can also be used to reveal what people search for in a defined time and location. By contrast, Google Trends can be used immediately as a monitoring tool for public reaction. Google Trends data can be used to monitor public interest and the most commonly searched topics.
According to Bragazzi [33], Google Trends can be used to monitor public interest and the most commonly searched topics, but it cannot disclose individual characteristics as a conventional surveillance system does. Therefore, Google Trends does not have the capacity to replace the existing surveillance systems [34] but can serve to supplement and complement them.
Implementation of Google Trends in Indonesia still poses some challenges related to Internet penetration and information-seeking behavior. As an island country, Indonesia faces disparities in infrastructure and literacy levels, which may vary widely across the country. Those factors can affect Internet utilization and information-seeking behavior in all regions of Indonesia. Google Trends could be used easily in a region with high Internet penetration and high dengue incidence. Nevertheless, how to implement Google Trends in a region with high dengue incidence but low Internet penetration remains challenging. Future studies need to validate the use of Google Trends data in regions with high dengue incidence and compare it between regions with high and low Internet penetration in Indonesia.
Googling for health or disease-related information does not always reflect the individual's health condition. Searches for dengue information could be performed by those infected at various points during the incubation period, by those with other diseases that have similar symptoms, or even by healthy people [9]. Research by Indriani revealed the similarity of the spatio-temporal patterns of dengue and chikungunya in Yogyakarta [35]. This issue poses a critical challenge for any Google Trends research. Thus, other variables that potentially influence googling behavior need to be included to weight the relative search volume and thereby improve the correlation analysis. Besides information-seeking behavior, other researchers could consider the Internet penetration rate by geographic area as a weighting variable for Google Trends data in order to increase data quality.
Conclusion
Google Trends data have a linear time series pattern and are statistically correlated with official dengue reports. Identification of information-seeking behavior is needed to support the use of Google Trends for disease surveillance in Indonesia.
Acknowledgments
We gratefully thank the Sub-directorate of Vector-Borne Diseases and Zoonoses under the Directorate General of Disease Prevention and Control within the Ministry of Health of the Republic of Indonesia for providing the official dengue surveillance data from all 34 provinces in Indonesia.
Author contributions
All authors contributed to the writing process. Atina Husnayain wrote the first draft, including study design, analysis, and interpretation of data. Anis Fuad provided the research conception and critically revised the draft for important intellectual content. Lutfan Lazuardi helped ensure the accuracy of the research and revised the final draft before submission.
Disclosure statement
No potential conflict of interest was reported by the
authors.
Ethics and consent
This research was approved by the Medical and Health Research Ethics Committee of the Faculty of Medicine, Public Health and Nursing, Universitas Gadjah Mada (reference number KE/FK/1061/EC/2017).
Funding information
This work was supported by the Faculty of Medicine,
Public Health and Nursing, Universitas Gadjah Mada
[Grant number: UPPM/221/M/05/04/05.17].
Paper context
This study has demonstrated that Google Trends data on dengue have a high potential to complement national-level surveillance data in Indonesia. In a dengue-endemic country with 54.7% Internet penetration, Google Trends data could be adopted as an early warning system and a tool for monitoring public responses to communicable diseases. Further validation studies at the sub-national level are necessary, since the district-level surveillance system is highly influenced by the decentralization policy and the interests of local leadership.
ORCID
Atina Husnayain http://orcid.org/0000-0003-3002-8728
Anis Fuad http://orcid.org/0000-0003-2303-5903
Lutfan Lazuardi http://orcid.org/0000-0001-5146-8162
References
[1] Salathé M. Digital epidemiology: what is it, and where is it going? J Life Sci Soc Policy. 2018;14:1–5.
[2] Salathé M, Bengtsson L, Bodnar TJ, et al. Digital epidemiology. PLoS Comput Biol. 2012;8:1–5.
[3] Runge-Ranzinger S, McCall PJ, Kroeger A, et al. Dengue disease surveillance: an updated systematic literature review. Trop Med Int Heal. 2014;19:1116–1160.
[4] Das S, Sarfraz A, Jaiswal N, et al. Impediments of reporting dengue cases in India. J Infect Public Health [Internet]. 2017;10:494–498.
[5] Sitepu FY, Suprayogi A, Pramono D. Evaluasi dan Implementasi Sistem Surveilans Demam Berdarah Dengue (DBD) di Kota Singkawang, Kalimantan Barat, 2010. J Litbang Pengendali Penyakit Bersumber Binatang Banjarnegara [Internet]. 2012;8:5–10. Available from: http://ejournal.litbang.depkes.go.id/index.php/blb/article/view/3259
[6] Stahl H, Butenschoen VM, Tran HT, et al. Cost of dengue outbreaks: literature review and country case studies. BMC Public Health. 2013;13:1–11.
[7] Indonesian Health Ministry. Health ministry regulation. 1501 Indonesia; 2010.
[8] Althouse BM, Ng YY, Cummings DAT. Prediction of dengue incidence using search query surveillance. PLoS Negl Trop Dis. 2011;5:1–7.
[9] Chan EH, Sahai V, Conrad C, et al. Using web search query data to monitor dengue epidemics: a new model for neglected tropical disease surveillance. PLoS Negl Trop Dis. 2011;5.
[10] Cho S, Sohn CH, Jo MW, et al. Correlation between national influenza surveillance data and Google Trends in South Korea. PLoS One. 2013;8:e81422.
[11] Gluskin RT, Johansson MA, Santillana M, et al. Evaluation of internet-based dengue query data: Google dengue trends. PLoS Negl Trop Dis. 2014;8:1–5.
[12] Castro JS, Torres J, Oletta J, et al. Google Trend tool as a predictor of chikungunya and zika epidemic in a environment with little epidemiological data, a Venezuelan case. Int J Infect Dis [Internet]. 2016;53:133–134.
[13] Strauss R, Castro JS, Reintjes R, et al. Google dengue trends: an indicator of epidemic behavior. The Venezuelan case. Int J Infect Dis [Internet]. 2017;53:119–120.
[14] Teng Y, Bi D, Xie G, et al. Dynamic forecasting of Zika epidemics using Google Trends. PLoS One. 2017;12:1–10.
[15] Alicino C, Bragazzi NL, Faccio V, et al. Assessing Ebola-related web search behaviour: insights and implications from an analytical study of Google Trends-based query volumes. Infect Dis Poverty [Internet]. 2015;4:1–13.
[16] Cervellin G, Comelli I, Lippi G. Is Google Trends a reliable tool for digital epidemiology? Insights from different clinical settings. J Epidemiol Glob Health. 2017;7:185–189.
[17] APJII. Penetrasi & Perilaku Pengguna Internet Indonesia 2017 [Internet]. Penetrasi Perilaku Pengguna Internet Indones. 2017. Jakarta; 2017. Available from: https://web.kominfo.go.id/sites/default/files/LaporanSurveiAPJII_2017_v1.3.pdf
[18] StatCounter Global Stats. Search engine market share in Indonesia [Internet]. 2017 [cited 2017 Apr 29]. Available from: http://gs.statcounter.com/search-engine-market-share/all/indonesia
[19] Zhang Z. Multiple imputation for time series data with Amelia package. Ann Transl Med [Internet]. 2016;4:56. Available from: http://www.ncbi.nlm.nih.gov/pubmed/26904578 and http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=PMC4740012
[20] Sterne JAC, White IR, Carlin JB, et al. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ [Internet]. 2009;338:1–12. Available from: https://www.bmj.com/content/338/bmj.b2393.long
[21] Ellery PJ, Vaughn W, Ellery J, et al. Understanding internet health search patterns: an early exploration into the usefulness of Google Trends. J Commun Healthc. 2008;1:1–5.
[22] Carneiro HA, Mylonakis E. Google Trends: a
web-based tool for real-time surveillance of disease
outbreaks. Clin Ifectious Dis. 2009;49:15571564.
[23] Kang M, Zhong H, He J, et al. Using Google Trends
for influenza surveillance in South China. PLoS One.
2013;8:16.
[24] Wen F, Sun J Google Patents: dynamic keyword sug-
gestion and image-search re-ranking [Internet].
Google Patents Dyn. keyword Suggest. image-search
re-ranking. 2010 [cited 2018 Feb 26]. Available from:
https://patents.google.com/patent/US20110179021A1/
en?q=suggested&q=keyword&oq=suggested+keyword
[25] Beck F, Richard J-B, Nguyen-Thanh V, et al. Use of the
internet as a health information resource among French
young adults: results from a nationally representative
survey. J Med Internet Res [Internet]. 2014;16:e128.
Available from: http://www.jmir.org/2014/5/e128/
[26] Nölke L, Mensing M, Krämer A, et al. Sociodemographic
and health-(care-)related characteristics of online health
information seekers: A cross-sectional German study.
BMC Public Health. 2015;15:112.
[27] Oh YS, Song NK. Investigating relationships between
health-related problems and online health information
seeking. J Comput Informatics, Nurs. 2017;35:2935.
[28] Seo D-W, Shin S-Y. Methods using social media and
search queries to predict infectious disease outbreaks.
Healthc Inform Res [Internet]. 2017;23:343348.
Available from: http://www.ncbi.nlm.nih.gov/
pubmed/29181246%0Ahttp://www.pubmedcentral.
nih.gov/articlerender.fcgi?artid=PMC5688036
GLOBAL HEALTH ACTION 7
[29] Adawi M, Bragazzi NL, Watad A, et al. Discrepancies
between classic and digital epidemiology in searching
for the mayaro virus: preliminary qualitative and
quantitative analysis of Google Trends. J Med
Internet Res. 2017;3:111.
[30] Alexa. Top sites in Indonesia [Internet]. Top Sites
Indones. 2018 [cited 2018 Apr 2]. p. 113. Available
from: https://www.alexa.com/topsites/countries/ID
[31] World Health Organization. Emergencies prepared-
ness, response early warning systems [Internet]. 2016
[cited 2018 Nov 12]. p. 1113. Available from: http://
www.who.int/csr/labepidemiology/projects/earlywarn
system/en/
[32] Cahyono AD, Satoto TBT, Lazuardi L. Kemanfaatan
pelaporan berbasis sms dan aplikasi surveilans demam
berdarah berbasis web di kota yogyakarta. Yogyakarta:
Universitas Gadjah Mada; 2018.
[33] Bragazzi NL, Barberis I, Rosselli R, et al. How often
people google for vaccination: qualitative and quanti-
tative insights from a systematic search of the
web-based activities using Google Trends. J Hum
Vaccines Immunother. 2018;13:120.
[34] Milinovich GJ, Williams GM, Clements ACA, et al.
Internet-based surveillance systems for monitoring
emerging infectious diseases. Lancet Infect Dis
[Internet]. 2014;14:160168.
[35] Indriani C, Fuad A, Kusnanto H. Spatial-temporal
pattern comparison between chikungunya outbreak
and dengue hemmorhagic fever incidence at Kota
Yogyakarta 2008. Ber Kedokt Masy. 2011;27:4150.
... The process involves mining textual user-generated data from the internet, systematically aggregating and analyzing these textual, unstructured data, then presenting findings as graphs, tables, and/or maps (14). GT has successfully demonstrated the ability to detect outbreaks of influenza, as well as early detection of infectious diseases, and implementing Google Trends studies results in forecasting, surveillance, and monitoring purposes in various health sectors (14,15). For example, a study on dengue disease found a surge in dengue disease-related public information-seeking behavior related to dengue outbreaks or rising cases (15). ...
... GT has successfully demonstrated the ability to detect outbreaks of influenza, as well as early detection of infectious diseases, and implementing Google Trends studies results in forecasting, surveillance, and monitoring purposes in various health sectors (14,15). For example, a study on dengue disease found a surge in dengue disease-related public information-seeking behavior related to dengue outbreaks or rising cases (15). Health studies using GT have shown great potential for monitoring health behavior problems (16)(17)(18)(19). ...
... GT data were downloaded for analysis on the same date to minimize the bias (February 11, 2022) (15,27). All searches used "all categories" and "web searches" (image, news, Google shopping, and YouTube searches) in "Indonesia." ...
Objective: This study set out to explore public interest through information search trends on diet and weight loss before and during the COVID-19 pandemic in Indonesia. Methods: The Google Trends database was evaluated for the relative internet search popularity of diet-related search terms, including top and rising diet-related terms. The search range was before and during the COVID-19 pandemic (April 2018 to January 2022) in the Indonesia region. We analyzed the Relative Search Volume (RSV) data using line charts, correlation, and comparison tests. Results: Search queries for “lose weight” were higher during the pandemic (58.34 ± 9.70 vs. 68.69 ± 7.72; p<0.05). No difference was found in diet-related searches before and after the pandemic. Public interest in diet was higher after Eid al-Fitr (the Muslim celebration marking the end of fasting) and after the new year. Many fad diet (FD) terms were found among the top and rising terms. Conclusion: The periods after Eid al-Fitr and the new year were susceptible times for promoting a healthy diet in Indonesia. There is a potential need, before those times, for education on including healthy food among the fatty and sugary menus associated with holidays and celebrations. Higher interest in “lose weight” was consistent with the heightened obesity risk during social restrictions and the heightened COVID-19 morbidity and mortality due to obesity. The high interest in rapid weight loss through FD needs to be addressed by promoting healthy diets with a more captivating message and messenger, such as consistently using top terms in the keywords of the official healthy diet guidance. Future research could explore the relationship between diet and other behaviors or non-communicable diseases.
... Six articles were excluded [Supplementary Material File 1]; three studies were not original research articles (e.g., the theoretical development of a mathematical model), one did not include an eligible disease, and two examined the wrong setting. Out of 13 included studies in the review, four focused on Dengue fever [25][26][27][28] , four on Zika virus [29][30][31][32] , and five on COVID-19 disease [33][34][35][36][37] . No study was identified that had met the inclusion criteria and focused on HIV infections, SARS, Lyme disease, E. coli, Hantavirus, or West Nile virus. ...
... The authors concluded that GTD is a useful source for surveillance and prevention of Dengue fever outbreaks in Brazil. In addition, Monnaka et al. [25] assessed GTD and [27] ; Husnayain et al. [26] ; Marques-Toledo et al. [28] , Monnaka et al. [25] ). GTD: Google Trends data. ...
Uncontrolled outbreaks of emerging infectious diseases can pose threats to livelihoods and can undo years of progress made in developing regions, such as Sub-Saharan Africa. Therefore, the surveillance and early outbreak detection of infectious diseases, e.g., Dengue fever, is crucial. As a low-cost and timely source, Internet search queries data [e.g., Google Trends data (GTD)] are used and applied in epidemiological surveillance. This review aims to identify and evaluate relevant studies that used GTD in prediction models for epidemiological surveillance purposes regarding emerging infectious diseases. A comprehensive literature search in PubMed/MEDLINE was carried out, using relevant keywords identified from up-to-date literature and restricted to low-to middle-income countries. Eight studies were identified and included in the current review. Three focused on Dengue fever, three analyzed Zika virus infections, and two were about COVID-19. All studies investigated the correlation between GTD and the cases of the respective infectious disease; five studies used additional (time series) regression analyses to investigate the temporal relation. Overall, the reported positive correlations were high for Zika virus (0.75-0.99) or Dengue fever (0.87-0.94) with GTD, but not for COVID-19 (-0.81 to 0.003). Although the use of GTD appeared effective for infectious disease surveillance in low-to middle-income countries, further research is needed. The low costs and availability remain promising for future surveillance systems in low-to middle-income countries, but there is an urgent need for a standard methodological framework for the use and application of GTD.
... Husnayain et al. [20] used MA to increase the correlation between the incidence of dengue fever with Google search activities for dengue and found that the correlation was very high between the two. Hu et al. [21] utilized MAs to reduce noise in water pH and water temperature data to improve the correlation of the two data with other water quality data to provide better performance in mariculture water quality forecasting. ...
... In the test results, the application of MAs to the movement data of the motion sensor results can increase the PC of the features on the actual presence of humans in the room. This is in accordance with existing studies, namely, [20][21][22][23] and [24]. The related studies use MA to increase the correlation of regression and classification features, among others, for noise reduction and forecasting. ...
Smart lighting systems utilize advanced data, control, and communication technologies and allow users to control lights in new ways. However, achieving user comfort, which should be the focus of smart lighting research, is challenging. One cause is the passive infrared (PIR) sensor that inaccurately detects human presence to control artificial lighting. We propose a novel classification-integrated moving average (CIMA) model method to solve the problem. The moving average (MA) increases the Pearson correlation (PC) coefficient of motion sensor features to human presence. The classification model is for a smart lighting intelligent control based on these features. Several classification models are proposed and compared, namely, k-nearest neighbor (KNN), support vector machine (SVM), decision tree (DT), näive Bayes (NB), and ensemble voting (EV). We build an Internet of things (IoT) system to collect movement data. It consists of a PIR sensor, a NodeMCU microcontroller, a Raspberry Pi-based platform, a relay, and LED lighting. With a sampling rate of 10 seconds and a collection period of 7 days, the system achieved 56852 data records. In the PC test, movement data from the PIR sensor has a correlation coefficient of 0.36 to attendance, while the MA correlation to attendance can reach 0.56. In an exhaustive search of an optimum classification model, KNN has the best and the most robust performance, with an accuracy of 99.8%. It is more accurate than direct light control decisions based on motion sensors, which are 67.6%. Our proposed method can increase the correlation value of movement features on attendance. At the same time, an accurate and robust KNN classification model is applicable for human presence-based smart lighting control.
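The moving-average smoothing mentioned in the abstract above can be sketched as follows. This is only an illustrative sketch, not the authors' CIMA implementation: the function name `moving_average`, the trailing window of 4 samples, and the toy `motion` series are all assumptions made for the example.

```python
def moving_average(values, window):
    """Trailing moving average: one output per full window of `window` samples."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])          # sum of the first full window
    smoothed = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the new sample, drop the oldest one.
        running += values[i] - values[i - window]
        smoothed.append(running / window)
    return smoothed


# Hypothetical binary motion readings (1 = movement detected, 0 = none).
motion = [0, 1, 0, 0, 1, 1, 0, 1]
smoothed_motion = moving_average(motion, 4)
print(smoothed_motion)
```

The smoothed series replaces isolated 0/1 readings with the fraction of recent samples that showed movement. Whether this raises the correlation with actual presence depends on the data and would need to be checked on real recordings.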
... The present study was similar to the studies conducted by Verma in India and Husnayain in Indonesia. 13,17 During the outbreak season in 2017, we found a 99% increase in search volume data across the entire keyword set we analyzed (Figures 1, 3, 4, and 6). We found that the constructed dengue search index on Google showed an obviously similar changing trend with dengue dynamics, suggesting that Google Trends is a good indicator for estimating intensity and peak incidence of dengue fever in India. ...
Millions of people worldwide search online for health-related information, and search engines have become an increasingly popular and valuable resource for accessing it. The keywords used, as well as the number and geographic location of searches, can provide trend data, available through Google Trends. This study explores this resource using dengue disease as an example. The objectives were to use Google Trends data for comparison across different locations in India for the past 5 years, to assess the specific search terms used in Google Trends data, and to correlate the real-time dengue outbreak of Tamil Nadu with Google Trends searches. It was a cross-sectional study. Data collection was done via Google search queries, and records were included. Weekly trends were accessed from Google Trends. The data are a randomly collected sample of real-time and non-real-time Google search queries. Search traffic for the string “dengue fever” reflected increased likelihood of exposure, and the string “dengue symptoms and treatment” had higher relative traffic during the rainy season. Cities and states with the highest amount of search traffic for “dengue disease” overlapped with areas where dengue is endemic. Search trend data produced by Google were found to approximate the seasonality of dengue disease, with spikes from September to November every year, as well as its geographic distribution. Web search query data were found to be capable of tracking dengue activity and predicting periods of large dengue incidence with high accuracy, and may prove useful.
... Importantly, our results demonstrate that the estimated correlations are as applicable to lower-income countries as they are for higher-income countries. Leveraging search interest data to forecast illness started with Google Flu Trends (GFT) to track influenza [15], where later work showed the potential of search interest to forecast other illnesses [31,42,30,37,38,22]. While GFT showed promise in tracking influenza, it ultimately faced challenges such as significantly overestimating ...
Real-time data is essential for policymakers to adapt to a rapidly evolving situation like the COVID-19 pandemic. Relying on Google search interest data across 207 countries and territories, we demonstrate the capacity of publicly-available, real-time data to anticipate COVID-19 cases; evaluate the economic, mental health, and social impacts of containment policies; and identify demand for (mis)information about COVID-19 vaccines. We show that: (1) search interest in COVID-specific symptoms can anticipate rising COVID-19 cases across both high- and low-income settings; (2) countries with more restrictive containment policies experienced larger socio-economic externalities; in addition, lower-income countries experienced less searches for unemployment, but more pronounced mental health externalities; and (3) high vaccination rates are associated with strong demand for information about vaccine appointments and side effects; in some settings, high interest in misinformation search terms is associated with low vaccination rates. Overall, the results demonstrate that real-time search interest data can be a valuable tool for both high- and low-income countries to inform policies across multiple stages of the pandemic.
... Many studies have been conducted in the field of health on the prevalence of diseases by using Google Trends (37)(38)(39)(40). It is noteworthy that one such study was on epidemiology (38). ...
Aim: Google Trends, which allows Internet users to interact with and search data, can provide in-depth information about new phenomena regarding population and health-related behavior and is a tool that can be accessed free of charge. With the widespread use of dental implants in almost every country in the world today, an increase has also been reported in the prevalence of peri-implantitis (PP), which is a peri-implant disease. The aim of this study is to determine whether the rates of PP that were obtained from previous studies on this disease are in line with the data obtained using Google Trends. Methodology: Using observational, ecological research, we searched Google Trends for the following query terms: peri implantitis + periimplantitis, to obtain the volume of this Internet search query. The queries were searched within Spain (ES), Germany (DE), the Netherlands (NL), the United Kingdom (UK), and Turkey from January 2010 to December 2019. Results: An examination of the search results for “peri-implantitis + peri-implantitis” on Google Trends found that the largest numbers of searches for these words were made from the country of ES, and the smallest numbers were made from Turkey. It took two years to make forecasts based on the results, and the study determined that there has been a change in the trends in countries that were searched for these words. Also, the results obtained in previous studies for the prevalence of peri-implantitis were not similar to the data obtained from Google Trends. Conclusion: We concluded in this study that Google Trends is not a reliable tool for dental epidemiology. How to cite this article: Üner DD, İzol BS. Is Google Trends a reliable way to determine digital dental epidemiology? Int Dent Res 2021;11(Suppl.1):38-46. https://doi.org/10.5577/intdentres.2021.vol11.suppl1.7 Linguistic Revision: The English in this manuscript has been checked by at least two professional editors, both native speakers of English.
Studies on disease surveillance for COVID-19 have utilized search interests such as from Google Trends. The selection of search terms can play a pivotal role and affect the validity of the results of such systems. The present study inventoried search terms from studies associated with outbreak detection or prevention with the intent of contributing to the process of deriving an optimal search strategy. The studies were retrieved from the Google Scholar database using the phrase coronavirus+ “relative search volume”. Seventy-nine (79) were found eligible for the period from 2020 to 2021. The collection of search terms obtained comprised of COVID-19 names, symptoms, public measures and protective measures. The network of search interests depicted disease-related terms and symptoms to be predominant. Further studies are directed to model search interests and incidence of the outbreak leading to the deployment of early warning systems geared for outbreak detection.
In this article, the authors investigate the opportunity to use information from online search engines (in this case Google Trends) as a source of surveillance data for addressing large-scale health problems (specifically the COVID-19 pandemic). It was found that the Google Trends search frequency for keywords associated with symptoms of the COVID-19 virus rose sharply several times. These increases in search frequency were followed by increases in COVID-19 infection cases in Indonesia. An increase was also found in searches for keywords related to a person's considerations in deciding whether to be vaccinated during the period from January 2021 to March 2022. January 2021 was the month in which the government began promoting the mass vaccination program in Indonesia. With the rising rate of internet use and of online search engine use in Indonesia, it is hoped that information on keyword search frequency in search engines, such as that from Google Trends, can be used as an alternative source of surveillance information.
The uncertainty of information on the development of COVID-19 cases in Indonesia has led the public to search for information independently through various sources, such as the mass media and the internet. One of the tools that can be used to obtain internet search data is Google Trends. This study aims to analyze the potential use of Google Trends as an information monitoring tool during the COVID-19 pandemic in Indonesia. The research was carried out during the pandemic period using Google Trends data from 1 January to 1 September 2020. The keywords were “corona outbreak”, “corona disease”, and “corona pandemic”. Each surge that occurred was analyzed and linked to searches for the keyword "corona drug" using Pearson correlation, with a significance level of <0.05. The query "corona disease" was more common than "corona outbreak" and "corona pandemic". Searches for the keywords "corona outbreak", "corona pandemic", and "corona disease" were associated with the keyword "corona medicine" (p value=0.000). Public interest in searching related keywords increased along with searches for information related to curative efforts. Therefore, Google Trends has the potential to be a monitoring tool for information searches during the COVID-19 pandemic in Indonesia. Keywords: Corona outbreaks, corona disease, corona pandemic, corona medicine, google trends
Background: Traditionally, dengue surveillance is based on case reporting to a central health agency. However, the delay between a case and its notification can limit the system responsiveness. Machine learning methods have been developed to reduce the reporting delays and to predict outbreaks, based on non-traditional and non-clinical data sources. The aim of this systematic review was to identify studies that used real-world data, Big Data and/or machine learning methods to monitor and predict dengue-related outcomes. Methodology/Principal findings: We performed a search in PubMed, Scopus, Web of Science and grey literature between January 1, 2000 and August 31, 2020. The review (ID: CRD42020172472) focused on data-driven studies. Reviews, randomized control trials and descriptive studies were not included. Among the 119 studies included, 67% were published between 2016 and 2020, and 39% used at least one novel data stream. The aim of the included studies was to predict a dengue-related outcome (55%), assess the validity of data sources for dengue surveillance (23%), or both (22%). Most studies (60%) used a machine learning approach. Studies on dengue prediction compared different prediction models, or identified significant predictors among several covariates in a model. The most significant predictors were rainfall (43%), temperature (41%), and humidity (25%). The two models with the highest performances were Neural Networks and Decision Trees (52%), followed by Support Vector Machine (17%). We cannot rule out a selection bias in our study because of our two main limitations: we did not include preprints and could not obtain the opinion of other international experts. Conclusions/Significance: Combining real-world data and Big Data with machine learning methods is a promising approach to improve dengue prediction and monitoring. Future studies should focus on how to better integrate all available data sources and methods to improve the response and dengue management by stakeholders.
Introduction: Dengue Haemorrhagic Fever (DHF) is still a public health problem in Singkawang Municipality, which is an endemic area. DHF surveillance is expected to provide information on the endemicity of an area, the season of transmission, and disease progression that can be used to make the system more effective and efficient. Methods: This was an observational study using a structured questionnaire. Interviews were conducted with all DHF surveillance officers. The input, process, and output variables of the surveillance system were evaluated. We conducted on-the-job training for all DHF surveillance officers after the evaluation. Results: 66.7% of officers had never received any surveillance training, 83.3% had double duties, and budgeting was limited to physical needs, facilities, and infrastructure. For the process variable, data collection was late; analysis and recommendations had not been directed to the distribution of cases, the relationship between risk factors and the mortality of DHF incidence, and environmental change; feedback and data distribution had not been implemented optimally. The output variable was still weak, with no surveillance epidemiology profile. Surveillance attributes such as simplicity, flexibility, and positive predictive value were good, but acceptability, sensitivity, representativeness, and timeliness were still weak. Short-term evaluation showed an increase in the knowledge of surveillance officers (p value <0.05). Mid-term evaluation showed an increase in the completeness and accuracy of DHF reports from 80% to 100%, active case finding, and epidemiological investigation conducted for all DHF cases. Discussion and Conclusions: The DHF surveillance system in Singkawang needs to be improved; many attributes of the surveillance system were not performing well. Training on the surveillance system is needed to improve the capability and capacity of the surveillance officers. Keywords: Evaluation, Surveillance, DHF, Singkawang
Digital Epidemiology is a new field that has been growing rapidly in the past few years, fueled by the increasing availability of data and computing power, as well as by breakthroughs in data analytics methods. In this short piece, I provide an outlook of where I see the field heading, and offer a broad and a narrow definition of the term.
Objectives: For earlier detection of infectious disease outbreaks, a digital syndromic surveillance system based on search queries or social media should be utilized. By using real-time data sources, a digital syndromic surveillance system can overcome the limitation of time-delay in traditional surveillance systems. Here, we introduce an approach to develop such a digital surveillance system. Methods: We first explain how the statistics data of infectious diseases, such as influenza and Middle East Respiratory Syndrome (MERS) in Korea, can be collected for reference data. Then we also explain how search engine queries can be retrieved from Google Trends. Finally, we describe the implementation of the prediction model using lagged correlation, which can be calculated by the statistical packages, i.e., SPSS (Statistical Package for the Social Sciences). Results: Lag correlation analyses demonstrated that search engine data/Twitter have a significant temporal relationship with influenza and MERS data. Therefore, the proposed digital surveillance system can be used to predict infectious disease outbreaks earlier. Conclusions: This prediction method could be the core engine for implementing a (near-) real-time digital surveillance system. A digital surveillance system that uses Internet resources has enormous potential to monitor disease outbreaks in the early phase.
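The lagged correlation described in the abstract above can be illustrated with a short sketch. The abstract refers to SPSS; plain Python is swapped in here purely for illustration. The function names `pearson` and `lagged_correlation`, the maximum lag of 4, and the synthetic `search` and `cases` series (where cases are constructed to follow searches by exactly two time steps) are all assumptions, not data or code from any of the studies listed.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def lagged_correlation(search, cases, max_lag):
    """Correlate search[t] with cases[t + lag] for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        x = search[:len(search) - lag]   # searches that have a matching later case count
        y = cases[lag:]                  # case counts `lag` steps later
        result[lag] = pearson(x, y)
    return result


# Synthetic weekly search volume (assumed values, for illustration only).
search = [10, 12, 20, 45, 80, 60, 30, 15, 11, 10, 12, 14]
# Synthetic case counts built to trail the search series by two weeks.
cases = [35, 35] + [3 * s + 5 for s in search[:-2]]

by_lag = lagged_correlation(search, cases, max_lag=4)
best_lag = max(by_lag, key=by_lag.get)
print(best_lag, round(by_lag[best_lag], 3))
```

A positive best lag means that searches lead reported cases by that many time steps, which is the property an early warning use would rely on. With real surveillance data the peak would be less clear-cut than in this constructed example.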
Background: Mayaro virus (MAYV), first discovered in Trinidad in 1954, is spread by the Haemagogus mosquito. Small outbreaks have been described in the past in the Amazon jungles of Brazil and other parts of South America. Recently, a case was reported in rural Haiti. Objective: Given the emerging importance of MAYV, we aimed to explore the feasibility of exploiting a Web-based tool for monitoring and tracking MAYV cases. Methods: Google Trends (GT) is an online tracking system. A Google-based approach is particularly useful to monitor especially infectious diseases epidemics. We searched GT from its inception (from January 2004 through to May 2017) for MAYV-related Web searches worldwide. Results: We noted a burst in search volumes in the period from July 2016 (relative search volume [RSV]=13%) to December 2016 (RSV=18%), with a peak in September 2016 (RSV=100%). Before this burst, the average search activity related to MAYV was very low (median 1%). MAYV-related queries were concentrated in the Caribbean. Scientific interest from the research community and media coverage affected digital seeking behavior. Conclusions: MAYV has always circulated in South America. Its recent appearance in the Caribbean has been a source of concern, which resulted in a burst of Internet queries. While GT cannot be used to perform real-time epidemiological surveillance of MAYV, it can be exploited to capture the public’s reaction to outbreaks. Public health workers should be aware of this, in that information and communication technologies could be used to communicate with users, reassure them about their concerns, and to empower them in making decisions affecting their health.
Internet-derived information has been recently recognized as a valuable tool for epidemiological investigation. Google Trends, a Google Inc. portal, generates data on geographical and temporal patterns according to specified keywords. The aim of this study was to compare the reliability of Google Trends in different clinical settings, for both common diseases with lower media coverage, and for less common diseases attracting major media coverage. We carried out a search in Google Trends using the keywords “renal colic”, “epistaxis”, and “mushroom poisoning”, selected on the basis of available and reliable epidemiological data. Besides this search, we carried out a second search for three clinical conditions (i.e., “meningitis”, “Legionella Pneumophila pneumonia”, and “Ebola fever”), which recently received major focus by the Italian media. In our analysis, no correlation was found between data captured from Google Trends and epidemiology of renal colics, epistaxis and mushroom poisoning. Only when searching for the term “mushroom” alone the Google Trends search generated a seasonal pattern which almost overlaps with the epidemiological profile, but this was probably mostly due to searches for harvesting and cooking rather than to for poisoning. The Google Trends data also failed to reflect the geographical and temporary patterns of disease for meningitis, Legionella Pneumophila pneumonia and Ebola fever. The results of our study confirm that Google Trends has modest reliability for defining the epidemiology of relatively common diseases with minor media coverage, or relatively rare diseases with higher audience. Overall, Google Trends seems to be more influenced by the media clamor than by true epidemiological burden.
Article
Dengue has emerged as one of the most important fatal mosquito-borne flaviviral diseases, apparently expanding as a global health problem. An estimated 3.6 billion people are at risk for dengue, with 50 million infections per year occurring across 100 countries. The annual number of dengue fever cases in India is many times higher than officially reported. This underreporting can play a major role in the government’s decision-making. Underestimation of the disease in India keeps its people from taking preventive measures, discourages efforts to trace the sources of the disease, and delays efforts for vaccine research. In this article, we highlight the probable impediments arising from underreporting and their impact on national and global public health, and we offer key remedies to address the issues effectively from the clinic to the community level.
Article
We developed a dynamic forecasting model for Zika virus (ZIKV), based on real-time online search data from Google Trends (GTs). It was designed to provide Zika virus disease (ZVD) surveillance and detection for Health Departments, along with predicted numbers of infection cases, which would allow them sufficient time to implement interventions. In this study, we found a strong correlation between Zika-related GTs and the cumulative numbers of reported cases (confirmed, suspected and total cases; p<0.001). Then, we used the correlation data from Zika-related online searches in GTs and ZIKV epidemics between 12 February and 20 October 2016 to construct an autoregressive integrated moving average (ARIMA) model (0, 1, 3) for the dynamic estimation of ZIKV outbreaks. The ARIMA model used the online search data as an external regressor to enhance the forecasts and to complement the historical epidemic data. Its predictions were quite similar to the actual data during the ZIKV epidemic in early November 2016. Integer-valued autoregression provides a useful base predictive model for ZVD cases. This is enhanced by the incorporation of GTs data, confirming the prognostic utility of search query based surveillance. This accessible and flexible dynamic forecast model could be used in the monitoring of ZVD to provide advanced warning of future ZIKV outbreaks.
Article
Introduction: Dengue fever is a neglected and growing public health threat. Developing countries face surveillance system problems such as delays and data loss. Lately, the access to and availability of health-related information on the Internet have changed what people seek on the web. In 2004 Google developed Google Dengue Trends (GDT), based on the number of search terms related to the disease in a given time and place. The goal of this review is to evaluate the accuracy of GDT in comparison with traditional surveillance systems in Venezuela. Methods: Weekly epidemic data from GDT, Official Reported Cases (ORC) and Expected Cases (EC) according to the Ministry of Health (MH) were obtained. Monthly and yearly correlations between GDT and ORC from 2004 until 2014 were calculated. Linear regressions taking the reported cases as the dependent variable were calculated. Results: The overall Pearson correlation between GDT and ORC was r = 0.87 (p < 0.001), while that between ORC and EC according to the MH was r = 0.33 (p < 0.001). After clustering the data into epidemic and non-epidemic weeks, the correlations with GDT were r = 0.86 (p < 0.001) and r = 0.65 (p < 0.001), respectively. Important interannual variation of the epidemic was observed. The model shows high accuracy in comparison with the EC, particularly when the incidence of the disease is higher. Conclusions: This early warning tool can be used as an indicator for other communicable diseases in order to apply effective and timely public health measures, especially in the setting of weak surveillance systems.
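The Pearson correlations reported above, and the time-lag correlation used in the main article, can be illustrated with a short sketch. The Python below is a minimal example under stated assumptions: the helper names `pearson_r` and `lagged_r` are hypothetical, the weekly numbers are invented for illustration (cases are constructed to follow searches by exactly one week), and nothing here reproduces any study's data or exact procedure.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_r(searches, cases, lag):
    """Correlate searches in week t with cases in week t + lag (lag >= 0)."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson_r(searches, cases)
    return pearson_r(searches[:-lag], cases[lag:])


# Invented weekly series: cases[t + 1] == 10 * searches[t].
searches = [5, 8, 20, 45, 30, 12, 6, 4]
cases = [40, 50, 80, 200, 450, 300, 120, 60]

for lag in range(3):
    print(lag, round(lagged_r(searches, cases, lag), 3))
```

In this toy series the correlation is highest at a lag of one week, which is the pattern an early-warning use of search data would depend on: search activity rising before reported cases do.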