Differences between Health Related News Articles from Reliable and Unreliable Media
Sameer Dhoju, Md Main Uddin Rony, Naeemul Hassan
Department of Computer Science and Engineering, The University of Mississippi
ABSTRACT
In this study, we examine a collection of health-related news articles published by reliable and unreliable media outlets. Our analysis shows that there are structural, topical, and semantic differences in the way reliable and unreliable media outlets conduct health journalism. We argue that the findings from this study will be useful for combating the health disinformation problem.
ACM Reference Format:
Sameer Dhoju, Md Main Uddin Rony, Naeemul Hassan. 2019. Differences between Health Related News Articles from Reliable and Unreliable Media.
In Proceedings of Computation+Journalism Symposium (C+J’19). ACM, New
York, NY, USA, Article 4, 5 pages. https://doi.org/10.475/1234
1 INTRODUCTION
Of the 20 most-shared articles on Facebook in 2016 with the word "cancer" in the headline, more than half were discredited by doctors and health authorities [6]. The spread of health-related hoaxes is not new. However, the advent of the Internet, social networking sites (SNS), and click-through-rate (CTR)-based pay policies has made it possible to create hoaxes/"fake news" that are published at a larger scale and reach a broader audience faster than ever before [14]. Misleading or erroneous health news can be dangerous because it can lead to critical situations. [12] reported a measles outbreak in Europe due to a lower immunization rate, which experts believed was the result of anti-vaccination campaigns fueled by false news about the MMR vaccine. Moreover, misinformation can spoil the credibility of health-care providers and create a lack of trust in medicine, food, and vaccines. Recently, researchers have started to address the fake news problem in general [19, 27]. However, health disinformation is a relatively unexplored area. According to a report from the Pew Research Center [7], 72% of adult internet users search online for information about a range of health issues. So, it is important to ensure that the health information available online is accurate and of good quality. There are some authoritative and reliable entities, such as the National Institutes of Health (NIH)¹ or Health On the Net², which provide high-quality health information.
Also, there are some fact-checking sites, such as Snopes.com³ and Quackwatch.org⁴, that regularly debunk health and medical misinformation. Nonetheless, these sites are incapable of busting the deluge of health disinformation continuously produced by unreliable health information outlets (e.g., RealFarmacy.com, Health Nut News). Moreover, bots in social networks significantly promote unsubstantiated health-related claims [8]. Researchers have tried developing automated health hoax detection techniques but have had limited success for several reasons, such as small training data sizes and a lack of awareness among users [10, 11, 18, 30].

1 https://www.nih.gov/
2 https://www.hon.ch/en/
3 https://www.snopes.com/
4 http://www.quackwatch.org/
The objective of this paper is to identify discriminating features that can potentially separate reliable health news from unreliable health news by leveraging a large-scale dataset. We examine how reliable and unreliable media outlets conduct health journalism. First, we prepare a large dataset of health-related news articles produced and published by a set of reliable and unreliable media outlets. Then, using a systematic content analysis, we identify the features which separate a health article sourced from a reliable outlet from one sourced from an unreliable outlet. These features capture the structural, topical, and semantic differences in health articles from these outlets. For instance, our structural analysis finds that the unreliable media outlets use clickbaity headlines in their health-related news significantly more often than reliable outlets do. Our semantic analysis shows that, on average, a health news article from reliable media contains more reference quotes than one from unreliable media. We argue that these features can be critical in understanding health misinformation and designing systems to combat such disinformation. In the future, our goal is to develop a machine learning model that uses these features to distinguish health news sourced from unreliable media from reliable articles.
2 RELATED WORK
There has been extensive work on how scientific medical research outcomes should be disseminated to the general public by following health journalism protocols [3, 4, 16, 26, 28]. For instance, [20] suggests that it is necessary to integrate journalism studies, strategic communication concepts, and health professional knowledge to successfully disseminate professional findings. Some researchers have particularly focused on the spread of health misinformation in social media. For example, [10] analyzes Zika⁵ related misinformation on Twitter. In particular, it shows that tracking health misinformation in social media is not trivial and requires some expert supervision. It used crowdsourcing to annotate a collection of tweets and used the annotated data to build a rumor classification model. One limitation of this work is that the dataset used is too small (6 rumors) to support a general conclusion.
Moreover, unlike our work, it did not consider the features in the actual news articles. [11] examines the individuals on social media who post questionable health-related information, in particular promoting cancer treatments which have been shown to be ineffective. It develops a feature-based supervised classification model to automatically identify users who are comparatively more susceptible to health misinformation. There are other works which focus on automatically identifying health misinformation. For example, [17] developed a classifier to detect misinformative posts in health forums. One limitation of this work is that the training data was labeled by only two individuals. Researchers have also worked on building tools that help users easily consume health information. [18] developed the "VAC Medi+board", an interactive visualization platform integrating Twitter data and news coverage from a reliable source called MediSys⁶. It covers the public debate related to vaccines and helps users easily browse health information on a certain vaccine-related topic.

5 https://en.wikipedia.org/wiki/Zika_virus
6 http://medisys.newsbrief.eu
Our study differs significantly from these existing studies. Instead of depending on a small sample of health hoaxes like some of the existing works, we take a different approach and focus on the source outlets. This gives us the benefit of investigating a larger dataset. We investigate the journalistic practice of reliable and unreliable health outlets, an area which, to our knowledge, has not been studied.
3 DATA PREPARATION
To investigate how reliable and unreliable media outlets portray health information, we need a reasonably sized collection of health-related news articles from both sides. Unfortunately, no available dataset is of adequate size. For this reason, we prepare a dataset of about 30,000 health-related news articles disseminated by reliable or unreliable outlets within the years 2015–2018. Below, we describe the preparation process in detail.
3.1 Media Outlet Selection
The first challenge is to identify reliable and unreliable outlets. The matter of reliability is subjective. We decided to consider only outlets which have been cross-checked as reliable or unreliable by credible sources.
3.1.1 Reliable Media. We identified 29 reliable media outlets from three sources: i) 11 of them are certified by Health On the Net [22], a non-profit organization that promotes transparent and reliable health information online and is officially related with the World Health Organization (WHO) [31]; ii) 8 are from U.S. government health-related centers and institutions (e.g., CDC, NIH, NCBI); and iii) 10 are from the most circulated broadcast [25] mainstream media outlets (e.g., CNN, NBC). Note that the mainstream outlets generally have a separate section for health information (e.g., https://www.cnn.com/health). As our goal is to collect health-related news, we restricted ourselves to their health portals only.
3.1.2 Unreliable Media. Dr. Melissa Zimdars, a communication and media expert, prepared a list of false, misleading, clickbaity, and satirical media outlets [33, 34]. Similar lists are also maintained by Wikipedia [32] and informationisbeautiful.net [13]. We identified 6 media outlets which primarily spread health-related misinformation and are present in these lists. Another source for identifying unreliable outlets is Snopes.com, a popular hoax-debunking website that fact-checks news of different domains including health. We followed the health and medical hoaxes debunked by Snopes.com and identified 14 media outlets which sourced those hoaxes. In total, we identified 20 unreliable outlets. Table 1 lists the Facebook page ids of all the reliable and unreliable outlets used in this study.

Reliable: everydayhealth, WebMD, statnews, AmericanHeart, BBCLifestyleHealth, CBSHealth, FoxNewsHealth, WellNYT, latimesscience, tampabaytimeshealth, philly.comhealth, AmericanHeart, AmericanCancerSociety, HHS, CNNHealth, cancer.gov, FDA, mplus.gov, NHLBI, kidshealthparents, ahrq.gov, healthadvocateinc, HealthCentral, eMedicineHealth, C4YWH, BabyCenter, MayoClinic, MedicineNet, healthline

Unreliable: liveahealth, healthexpertgroup, healthysolo, organichealthcorner, justhealthylifestyle1, REALfarmacyCOM, thetruthaboutcancer, BookforHealthyLife, viralstories.bm, justhealthyway, thereadersfile, pinoyhomeremedies, onlygenuinehealth, greatremediesgreathealth, HealthRanger, thefoodbabe, AgeofAutism, HealthNutNews, consciouslifenews, HealthImpactNews

Table 1: List of Facebook page ids of the reliable and unreliable outlets. Some of them are unavailable now.
3.2 Data Collection
The next challenge is to gather news articles published by the
selected outlets. We identied the ocial Facebook pages of each of
the 49 media outlets and collected all the link-posts
7
shared by the
outlets within January 1, 2015 and April 2, 2018
8
using Facebook
Graph API. For each post, we gathered the corresponding news
article link, the status message, and the posting date.
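For concreteness, a minimal sketch of this collection step is shown below, assuming a valid Graph API access token and public access to the pages; the API version, field names, and placeholder token are illustrative rather than the exact configuration used for this dataset.

```python
# Sketch of link-post collection from a Facebook page via the Graph API.
# ACCESS_TOKEN and the field/version choices are hypothetical placeholders.
import requests

ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"
BASE = "https://graph.facebook.com/v2.12"

def collect_link_posts(page_id, since="2015-01-01", until="2018-04-02"):
    """Page through a Facebook page's posts and keep only link-type posts."""
    url = f"{BASE}/{page_id}/posts"
    params = {"fields": "link,message,created_time,type",
              "since": since, "until": until, "limit": 100,
              "access_token": ACCESS_TOKEN}
    posts = []
    while url:
        resp = requests.get(url, params=params).json()
        for post in resp.get("data", []):
            if post.get("type") == "link" and post.get("link"):
                posts.append({"link": post["link"],
                              "message": post.get("message", ""),
                              "created_time": post["created_time"]})
        url = resp.get("paging", {}).get("next")  # follow pagination cursors
        params = {}  # the "next" URL already carries the query parameters
    return posts
```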
3.2.1 News Article Scraping. We used a Python package named Newspaper3k⁹ to gather the news article data. Given a news article link, this package provides the headline, body, author name (if present), and publish date of the article. It also provides the visual elements (images, videos) used in an article. In total, we collected 29,047 articles from reliable outlets and 15,017 from unreliable outlets.

9 https://newspaper.readthedocs.io/en/latest/
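A minimal sketch of this scraping step with Newspaper3k is given below; error handling and storage are simplified relative to the actual pipeline.

```python
# Sketch of per-article scraping with Newspaper3k.
from newspaper import Article

def scrape(url):
    """Return headline, body, authors, publish date, and media counts for a link."""
    article = Article(url)
    article.download()
    article.parse()
    return {
        "headline": article.title,
        "body": article.text,
        "authors": article.authors,          # empty list if no byline is found
        "publish_date": article.publish_date,
        "num_images": len(article.images),   # all image URLs found in the page
        "num_videos": len(article.movies),
        "top_image": article.top_image,
    }
```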
3.2.2 Filtering non-Health News Articles. Even though we restricted ourselves to health-related outlets, we observed that the outlets also published or shared non-health (e.g., sports, entertainment, weather) news. We removed these non-health articles from our dataset and only kept health, food & drink, or fitness & beauty related articles. Specifically, for each news article, we used the document categorization service provided by the Google Cloud Natural Language API¹⁰ to determine its topic. If an article does not belong to one of the three above-mentioned topics, it is filtered out. This step reduced the dataset size to 27,589; 18,436 from reliable outlets and 9,153 from unreliable outlets. We used only this health-related dataset in all the experiments of this paper. Figure 1 shows the health-related news percentage distribution for reliable and unreliable outlets using box-plots. For each of the 29 reliable outlets, we measure the percentage of health news and then use these 29 percentage values to draw the box-plot for the reliable outlets; likewise for the unreliable outlets. We observe that the reliable outlets (median 72%) publish comparatively less health news than the unreliable outlets (median 85%).

10 https://cloud.google.com/natural-language/
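A minimal sketch of this filtering step is shown below, assuming the google-cloud-language client library (2.x) and configured credentials; the category prefixes kept here are our reading of "health, food & drink, or fitness & beauty" rather than a verbatim copy of the filter that was used.

```python
# Sketch of the topic filter using Google Cloud Natural Language classification.
# KEEP_PREFIXES is an assumption approximating the three topics named in the text.
from google.cloud import language_v1

KEEP_PREFIXES = ("/Health", "/Food & Drink", "/Beauty & Fitness")

def is_health_related(text):
    """Classify an article body and keep it only if a kept category is returned."""
    client = language_v1.LanguageServiceClient()
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT)
    response = client.classify_text(request={"document": document})
    return any(cat.name.startswith(KEEP_PREFIXES) for cat in response.categories)
```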
Figure 1: Comparison between reliable and unreliable outlets with respect to the presence of health-related news content
4 ANALYSIS
Using this dataset, we conduct content analysis to examine structural, topical, and semantic differences in health news from reliable and unreliable outlets.
4.1 Structural Difference
4.1.1 Headline. The headline is a key element of a news article. According to a study by the American Press Institute and the Associated Press [15], only 4 out of 10 Americans read beyond the headline. So, it is important to understand how reliable and unreliable outlets construct the headlines of their health-related news. According to [1], a long headline results in a significantly higher click-through-rate (CTR) than a short headline does. We observe that the average headline length of an article from reliable outlets is 8.56 words, and from unreliable outlets 12.13 words. So, on average, an unreliable outlet's headline has a higher chance of receiving more clicks or attention than a reliable outlet's headline. To further investigate this, we examine the clickbaityness of the headlines. The term clickbait refers to a form of web content (headline, image, thumbnail, etc.) that employs writing formulas, linguistic techniques, and suspense-creating visual elements to trick readers into clicking links, but does not deliver on its promises [9]. Chen et al. [2] reported that clickbait usage is a common pattern in false news articles. We investigate to what extent the reliable and unreliable outlets use clickbait headlines in their health articles. For each article headline, we test whether it is clickbait or not using two supervised clickbait detection models: a sub-word embedding based deep learning model [24] and a feature engineering based Multinomial Naive Bayes model [21]. Agreement between these models was measured as 0.44 using Cohen's κ. We mark a headline as clickbait only if both models labeled it as clickbait. We observe that 27.29% (5,031 out of 18,436) of the headlines from reliable outlets are clickbait. In unreliable outlets, the percentage is significantly higher: 40.03% (3,664 out of 9,153). So, it is evident that the unreliable outlets use more clickbait than reliable outlets in their health journalism.
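A minimal sketch of how the two detectors can be combined and their agreement measured is shown below; detect_deep and detect_nb are placeholders for the models of [24] and [21], which are not reimplemented here.

```python
# Sketch of combining two clickbait detectors and measuring their agreement.
# detect_deep and detect_nb are hypothetical callables returning True/False.
from sklearn.metrics import cohen_kappa_score

def label_headlines(headlines, detect_deep, detect_nb):
    """Return per-headline clickbait labels (1 = clickbait) and model agreement."""
    deep_labels = [int(detect_deep(h)) for h in headlines]
    nb_labels = [int(detect_nb(h)) for h in headlines]
    kappa = cohen_kappa_score(deep_labels, nb_labels)  # inter-model agreement
    # A headline is marked clickbait only when both models say so.
    final = [int(d and n) for d, n in zip(deep_labels, nb_labels)]
    return final, kappa
```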
Figure 2: Distribution of clickbait patterns
We further investigate the linguistic patterns used in the clickbait headlines. In particular, we analyze the presence of some common patterns which are generally employed in clickbait according to [1, 23]. The patterns are:

• Presence of demonstrative adjectives (e.g., this, these, that)
• Presence of numbers (e.g., 10, ten)
• Presence of modal words (e.g., must, should, could, can)
• Presence of question or WH words (e.g., what, who, how)
• Presence of superlative words (e.g., best, worst, never)

Figure 2 shows the distribution of these patterns among the clickbait headlines of reliable and unreliable outlets. Note that one headline may contain more than one pattern. For example, the headline "Are these the worst 9 diseases in the world?" contains four of the above patterns. This is why the percentages do not sum to 100%. We see that unreliable outlets use demonstrative adjectives and numbers significantly more than the reliable outlets.
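A minimal sketch of this pattern check is given below; the word lists are small illustrative samples, not the exact lexicons derived from [1, 23].

```python
# Sketch of detecting clickbait patterns in a headline with simple regexes.
# The word lists below are illustrative assumptions, not the authors' lexicons.
import re

PATTERNS = {
    "demonstrative_adjective": r"\b(this|these|that|those)\b",
    "number": r"\b(\d+|one|two|three|ten)\b",
    "modal": r"\b(must|should|could|can)\b",
    "question_wh": r"\b(what|who|why|when|where|which|how)\b|\?",
    "superlative": r"\b(best|worst|never|most)\b",
}

def headline_patterns(headline):
    """Return the set of clickbait patterns present in a headline."""
    text = headline.lower()
    return {name for name, pattern in PATTERNS.items() if re.search(pattern, text)}

# The example headline from the text matches four patterns
# (demonstrative, number, question, superlative).
print(headline_patterns("Are these the worst 9 diseases in the world?"))
```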
Figure 3: Distribution of (Shared Date - Published Date) gaps
in days
4.1.2 Time-span Between Publishing and Sharing. We investigate the time difference between an article's publish date and its share date on Facebook. Figure 3 shows density plots of (Facebook Share Date - Article Publish Date) for reliable and unreliable outlets. We observe that both outlet categories share their articles on Facebook within a short period after publishing. However, unreliable outlets show a considerably larger time gap than reliable outlets. This could be due to re-sharing an article after a long period. To verify that, we checked how often an article is re-shared on Facebook. We find that, on average, a reliable article is shared 1.057 times whereas an unreliable article is shared 1.222 times.
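A minimal sketch of this gap computation with pandas is shown below; the column names are placeholders for however the collected posts and articles happen to be stored.

```python
# Sketch of computing share-publish gaps and the average re-share count.
# Column names ('shared_date', 'published_date', 'article_url') are assumptions.
import pandas as pd

def share_publish_gaps(df):
    """df has one row per Facebook post of an article."""
    shared = pd.to_datetime(df["shared_date"], utc=True)
    published = pd.to_datetime(df["published_date"], utc=True)
    gaps = (shared - published).dt.days                      # gap in whole days
    reshare_rate = df.groupby("article_url").size().mean()   # avg posts per article
    return gaps, reshare_rate
```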
Figure 4: Distribution of the average number of (a) images, (b) quotations, and (c) links per article from reliable and unreliable outlets.
Figure 5: Topic modeling (k = 3) of articles from reliable outlets (top row, denoted RT1–RT3) and from unreliable outlets (bottom row, denoted UT1–UT3).
4.1.3 Use of visual media. We examined how often the outlets use images in their articles. Our analysis finds that, on average, an article from reliable outlets uses 13.83 images and an article from unreliable outlets uses 14.22 images. Figure 4a shows density plots of the average number of images per article for both outlet categories. We observe that a good portion of unreliable outlet sourced articles uses a high number of images (more than 20).
4.2 Topical Difference
All the articles we examined are health-related. However, the health domain is considerably broad and covers many topics. We hypothesize that there are differences between the health topics discussed in reliable outlets and in unreliable outlets. To test this, we conduct an unsupervised and a supervised analysis.
4.2.1 Topic Modeling. We use the Latent Dirichlet Allocation (LDA) algorithm to model the topics in the news articles. The number of topics, k, was set to 3. Figure 5 shows three topics for each of the outlet categories. Each topic is modeled by the top-10 important words in that topic. The font size of each word is proportional to its importance. Figures 5a and 5d indicate that "cancer" is a common topic in reliable and unreliable outlets. However, the words study, said, percent, research and their font sizes in Figure 5a indicate that the topic "cancer" is associated with research studies, facts, and references in reliable outlets. On the contrary, unreliable outlets have the words vaccine, autism, and risk in Figure 5d, which suggests discussion regarding how vaccines put people at risk of autism and cancer, an unsubstantiated claim generally propagated by unreliable media¹¹,¹². Figures 5e and 5f suggest discussions about weight loss, skin, and hair care products (e.g., essential oil, lemon). Topics in Figures 5b and 5c mostly discuss flu, virus, skin infection, exercise, diabetes, and so on.

11 https://www.webmd.com/brain/autism/do-vaccines-cause-autism
12 https://www.skepticalraptor.com/skepticalraptorblog.php/polio-vaccine-causes-cancer-myth/
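A minimal sketch of this topic modeling step using gensim's LDA is given below; the paper does not specify the implementation or preprocessing, so the tokenization and stop-word removal here are illustrative.

```python
# Sketch of LDA topic modeling (k = 3) over article bodies with gensim.
from gensim import corpora, models
from gensim.parsing.preprocessing import STOPWORDS
from gensim.utils import simple_preprocess

def lda_topics(article_bodies, num_topics=3, top_words=10):
    """Fit LDA on a list of article bodies and return the top words per topic."""
    texts = [[w for w in simple_preprocess(doc) if w not in STOPWORDS]
             for doc in article_bodies]
    dictionary = corpora.Dictionary(texts)
    corpus = [dictionary.doc2bow(t) for t in texts]
    lda = models.LdaModel(corpus, num_topics=num_topics, id2word=dictionary,
                          passes=10, random_state=42)
    # Each topic is summarized by its top-weighted words, as in Figure 5.
    return [lda.show_topic(i, topn=top_words) for i in range(num_topics)]
```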
(a) Reliable (b) Unreliable
Figure 6: Top-10 topics in reliable and unreliable outlets.
4.2.2 Topic Categorization. In addition to topic modeling, we categorically analyze the articles' topics using the Google Cloud Natural Language API¹³. Figure 6 shows the top-10 topics in the reliable and unreliable outlets. In the case of reliable outlets, the distribution is significantly dominated by health condition. On the other hand, in the case of unreliable outlets, the percentages of nutrition and food are noticeable. Only 4 of the 10 categories are common to the two outlet groups. Unreliable topics include weight loss, hair care, and face & body care. This finding supports our claim from the topic modeling analysis.

13 https://cloud.google.com/natural-language/
4.3 Semantic Difference
We analyze what efforts the outlets make to construct logical and meaningful health news. Specifically, we consider to what extent the outlets use quotations and hyperlinks. The use of quotations and hyperlinks in a news article is associated with credibility [5, 29]. The presence of quotations and hyperlinks indicates that an article is logically constructed and supported with credible factual information.
4.3.1 Quotation. We use the Stanford QuoteAnnotator¹⁴ to identify the quotations in a news article. Figure 4b shows density plots of the number of quotations per article for reliable and unreliable outlets. We observe that unreliable outlets use fewer quotations than reliable outlets. We find that the average number of quotations per article is 1 in unreliable outlets and 3 in reliable outlets. This suggests that articles sourced from reliable outlets are more credible than those from unreliable outlets.

14 https://stanfordnlp.github.io/CoreNLP/quote.html
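A minimal sketch of this quotation count is shown below, assuming a Stanford CoreNLP server is running locally with the quote annotator enabled; the authors may have invoked CoreNLP differently, and the annotator set and output key here are our assumptions about the server's JSON interface.

```python
# Sketch of counting quotations via a locally running CoreNLP server (assumed
# at localhost:9000); the annotator list and "quotes" output key are assumptions.
import json
import requests

CORENLP_URL = "http://localhost:9000"

def count_quotations(article_body):
    """Send an article body to CoreNLP and count the quotations it extracts."""
    props = {"annotators": "tokenize,ssplit,pos,lemma,ner,depparse,coref,quote",
             "outputFormat": "json"}
    resp = requests.post(CORENLP_URL,
                         params={"properties": json.dumps(props)},
                         data=article_body.encode("utf-8"))
    return len(resp.json().get("quotes", []))
```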
4.3.2 Hyperlink. We examine the use of hyperlinks in the articles. On average, a reliable outlet sourced article contains 8.4 hyperlinks and an unreliable outlet sourced article contains 6.8 hyperlinks. Figure 4c shows density plots of the number of links per article for reliable and unreliable outlets. The peaks indicate that most of the articles from reliable outlets have close to 8 (median) hyperlinks. On the other hand, most of the unreliable outlet articles have fewer than 2 hyperlinks. This analysis again suggests that reliable sourced articles are more credible than unreliable outlet articles.
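A minimal sketch of this hyperlink count is given below, assuming the raw article HTML has been retained (Newspaper3k exposes it as article.html after download()).

```python
# Sketch of counting hyperlinks in an article's HTML body.
from bs4 import BeautifulSoup

def count_hyperlinks(article_html):
    """Count <a href=...> links inside an article's HTML."""
    soup = BeautifulSoup(article_html, "html.parser")
    return len(soup.find_all("a", href=True))
```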
5 CONCLUSION AND FUTURE WORK
In this paper, we closely looked at structural, topical, and semantic differences between articles from reliable and unreliable outlets. Our findings reconfirm some existing claims, such as that unreliable outlets use clickbaity headlines to catch the attention of users. In addition, this study finds new patterns that can potentially help separate health disinformation. For example, we find that fewer quotations and hyperlinks are associated with unreliable outlets. However, there are some limitations to this study. For instance, we did not consider videos, cited experts, user comments, and other information. In the future, we want to overcome these limitations and leverage the findings of this study to combat health disinformation.
REFERENCES
[1] Chris Breaux. (accessed September 28, 2018). "You'll Never Guess How Chartbeat's Data Scientists Came Up With the Single Greatest Headline". http://blog.chartbeat.com/2015/11/20/youll-never-guess-how-chartbeats-data-scientists-came-up-with-the-single-greatest-headline/
[2] Yimin Chen, Niall J Conroy, and Victoria L Rubin. 2015. Misleading online content: Recognizing clickbait as false news. In Proceedings of the 2015 ACM on Workshop on Multimodal Deception Detection. ACM, 15–19.
[3] Nicole K Dalmer. 2017. Questioning reliability assessments of health information on social media. Journal of the Medical Library Association: JMLA 105, 1 (2017), 61.
[4] Irja Marije de Jong, Frank Kupper, Marlous Arentshorst, and Jacqueline Broerse. 2016. Responsible reporting: neuroimaging news in the age of responsible research and innovation. Science and Engineering Ethics 22, 4 (2016), 1107–1130.
[5] Juliette De Maeyer. 2012. The journalistic hyperlink: Prescriptive discourses about linking in online news. Journalism Practice 6, 5-6 (2012), 692–701.
[6] Katie Forster. (accessed October 30, 2018). Revealed: How dangerous fake health news conquered Facebook. https://www.independent.co.uk/life-style/health-and-families/health-news/fake-news-health-facebook-cruel-damaging-social-media-mike-adams-natural-health-ranger-conspiracy-a7498201.html
[7] Susannah Fox. (accessed October 30, 2018). The social life of health information. http://www.pewresearch.org/fact-tank/2014/01/15/the-social-life-of-health-information/
[8] Gaby Galvin. (accessed October 30, 2018). How Bots Could Hack Your Health. https://www.usnews.com/news/healthiest-communities/articles/2018-07-24/how-social-media-bots-could-compromise-public-health
[9] Bryan Gardiner. (accessed September 28, 2018). "You'll Be Outraged at How Easy It Was to Get You to Click on This Headline". https://www.wired.com/2015/12/psychology-of-clickbait/
[10] Amira Ghenai and Yelena Mejova. 2017. Catching Zika Fever: Application of Crowdsourcing and Machine Learning for Tracking Health Misinformation on Twitter. In Healthcare Informatics (ICHI), 2017 IEEE International Conference on. IEEE, 518–518.
[11] Amira Ghenai and Yelena Mejova. 2018. Fake Cures: User-centric Modeling of Health Misinformation in Social Media. In 2018 ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW). ACM.
[12] Muiris Houston. (accessed October 31, 2018). Measles back with a vengeance due to fake health news. https://www.irishtimes.com/opinion/measles-back-with-a-vengeance-due-to-fake-health-news-1.3401960
[13] informationisbeautiful.net. 2016. Unreliable/Fake News Sites & Sources. https://docs.google.com/spreadsheets/d/1xDDmbr54qzzG8wUrRdxQlC1dixJSIYqQUaXVZBqsJs
[14] Mathew Ingram. (accessed October 30, 2018). The internet didn't invent viral content or clickbait journalism - there's just more of it now, and it happens faster. https://gigaom.com/2014/04/01/the-internet-didnt-invent-viral-content-or-clickbait-journalism-theres-just-more-of-it-now-and-it-happens-faster/
[15] American Press Institute and the Associated Press-NORC Center for Public Affairs Research. (accessed September 28, 2018). The Personal News Cycle: How Americans choose to get their news. https://www.americanpressinstitute.org/publications/reports/survey-research/how-americans-get-news/
[16] Marjorie Kagawa-Singer and Shaheen Kassim-Lakha. 2003. A strategy to reduce cross-cultural miscommunication and increase the likelihood of improving health outcomes. Academic Medicine 78, 6 (2003), 577–587.
[17] Alexander Kinsora, Kate Barron, Qiaozhu Mei, and VG Vinod Vydiswaran. 2017. Creating a Labeled Dataset for Medical Misinformation in Health Forums. In Healthcare Informatics (ICHI), 2017 IEEE International Conference on. IEEE, 456–461.
[18] Patty Kostkova, Vino Mano, Heidi J Larson, and William S Schulz. 2016. VAC Medi+board: Analysing vaccine rumours in news and social media. In Proceedings of the 6th International Conference on Digital Health Conference. ACM, 163–164.
[19] David MJ Lazer, Matthew A Baum, Yochai Benkler, Adam J Berinsky, Kelly M Greenhill, Filippo Menczer, Miriam J Metzger, Brendan Nyhan, Gordon Pennycook, David Rothschild, et al. 2018. The science of fake news. Science 359, 6380 (2018), 1094–1096.
[20] Felisbela Lopes, Teresa Ruão, Zara Pinto Coelho, and Sandra Marinho. 2009. Journalists and health care professionals: what can we do about it?. In 2009 Annual Conference of the International Association for Media and Communication Research (IAMCR), "Human Rights and Communication". 1–15.
[21] Saurabh Mathur. (accessed September 24, 2018). Clickbait Detector. https://github.com/saurabhmathur96/clickbait-detector
[22] Health On the Net. (accessed September 24, 2018). https://www.hon.ch/en/
[23] Matthew Opatrny. (accessed September 28, 2018). "9 Headline Tips to Help You Connect with Your Target Audience". https://www.outbrain.com/blog/9-headline-tips-to-help-marketers-and-publishers-connect-with-their-target-audiences/
[24] Md Main Uddin Rony, Naeemul Hassan, and Mohammad Yousuf. 2017. Diving Deep into Clickbaits: Who Use Them to What Extents in Which Topics with What Effects?. In Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017. ACM, 232–239.
[25] Michael Schneider. (accessed September 24, 2018). Most-Watched Television Networks: Ranking 2016's Winners and Losers. https://www.indiewire.com/2016/12/cnn-fox-news-msnbc-nbc-ratings-2016-winners-losers-1201762864/
[26] Gary Schwitzer. 2008. How do US journalists cover treatments, tests, products, and procedures? An evaluation of 500 stories. PLoS Medicine 5, 5 (2008), e95.
[27] Kai Shu, Amy Sliva, Suhang Wang, Jiliang Tang, and Huan Liu. 2017. Fake news detection on social media: A data mining perspective. ACM SIGKDD Explorations Newsletter 19, 1 (2017), 22–36.
[28] Miriam Shuchman and Michael S Wilkes. 1997. Medical scientists and health news reporting: a case of miscommunication. Annals of Internal Medicine 126, 12 (1997), 976–982.
[29] S Shyam Sundar. 1998. Effect of source attribution on perception of online news stories. Journalism & Mass Communication Quarterly 75, 1 (1998), 55–68.
[30] Emily K Vraga and Leticia Bode. 2017. Using Expert Sources to Correct Health Misinformation in Social Media. Science Communication 39, 5 (2017), 621–645.
[31] World Health Organization (WHO). (accessed September 24, 2018). http://www.who.int/
[32] Wikipedia. (accessed September 24, 2018). List of fake news websites. https://bit.ly/2moBDvA
[33] Wikipedia. (accessed September 24, 2018). Wikipedia:Zimdars' fake news list. https://bit.ly/2ziHafj
[34] Melissa Zimdars. 2016. My 'fake news list' went viral. But made-up stories are only part of the problem. https://www.washingtonpost.com/posteverything/wp/2016/11/18/my-fake-news-list-went-viral-but-made-up-stories-are-only-part-of-the-problem