The truth is out there: Who makes and influences health news on social media in the UK and internationally?

Unpublished paper but see blog: https://scotpublichealth.com/2018/10/31/healthnews/
Graham Mackenzie1
Christopher W Oliver2
Author Affiliations
1. Public Health and Health Policy, NHS Lothian, Edinburgh, UK
2. King James IV Professor RCSEd
Correspondence to: Graham Mackenzie
Twitter
@gmacscotland
@CyclingSurgeon
Abstract:
Aim: To use Twitter Big Data to identify the main influencers and disseminators of health news in
the UK, in the general and specialist press, and to place in an international context.
Methods: Health-related tweets from late January 2017 were extracted and mapped using NodeXL,
an Excel extension that identifies tweets meeting specified criteria (hashtag, word or username). Top
webpage Uniform Resource Locators (URL) were identified and mapped separately. Connections
were quantified using “betweenness centrality”, a measure of social media connections. The number
of followers for each of the top 20 tweeters in the main searches was obtained from Twitter.
Results: The Independent, Guardian, Mail Online, BBC News and Sky News Twitter accounts were
the prominent generators of UK health news stories during the period studied, with the BMJ at
number 6 and the Lancet at number 11. The BMJ ranked third in the global medical journal search.
Charities, NHS organisations and high profile individuals joined the list of influencers in a broader
UK-focused search of health-related tweets. Many of the more popular news stories identified in the
searches had a political dimension, reflective of recent political upheavals and polarisation between
science and “alternative facts”.
Discussion: The immediacy and reach of Twitter big data analysis are obvious strengths. There are,
nonetheless, limitations to the analysis (technical and methodological). Furthermore, Twitter is used
by only a minority of the population (around 14%).
Conclusion: This analysis gives a rapid snapshot of health-related social media activity in the UK and
beyond, during a period when understanding the process of health news creation and dissemination
has never been more important or more under threat. The medical press in the UK has national and
international influence that exceeds its immediate base of followers. NHS organisations, charities
and individuals also have an important influence in broadcasting health information. Ways to bring
together dissemination and constructive debate on health issues using social media are discussed.
Word count 3,495 (excluding title page, abstract, references, figures and tables)
Funding: Not funded. PC and NodeXL Pro were purchased by GM. Analysis and write up were
performed in own time.
Ethical approval: Not required. Information obtained from a publicly available source (Twitter).
Competing interest summary: No conflict of interest to declare.
Contributors and Guarantor: Graham Mackenzie performed the analysis and Graham Mackenzie
and Chris Oliver used the analysis to write the paper. Graham Mackenzie is the Guarantor.
Research checklist: Not required.
Patient involvement: Patients were not involved in this study. See also methodology section.
Data sharing: no additional data available.
What this study adds:
- Big data from Twitter (using the #NodeXL tool) can be used to extract, study and quantify the
social media interactions around recent health stories across media outlets;
- Medical journals such as BMJ have a major influence in making and disseminating health
stories on social media, at a UK and international level;
- Many of the big health news stories identified during the period studied (late January 2017)
focused on health and politics, reflecting concerns about the status of science and medicine
in a “post-truth” era;
- The method used for this study could be automated to produce a daily analysis of health-
related tweeting in the UK and internationally.
Update (October 2018): This work was not accepted for publication by two international medical
journals (February to March 2017). It is being shared via social media to illustrate the development
of interpretation of social network analysis outputs for health and public health.
Perhaps this work was ahead of its time: we were writing this and revising for resubmission to
journals just as Carole Cadwalladr was breaking the story that would unfold to expose the influence
of Facebook, Cambridge Analytica and other organisations on the Brexit and Trump votes. We were
keen to show that social media can also have a positive influence (eg in dissemination of clinical and
scientific knowledge), if used properly.
In November 2017 GM wrote a blog that explored the topic further, simplifying the output
into a Pareto chart of Twitter accounts rather than exploring individual stories. That blog is
now out of date as it linked to a number of Storify summaries (Storify has since gone
offline). Nonetheless, the main messages from that blog are still covered in sufficient detail
to remain useful: https://scotpublichealth.com/2017/11/09/who-are-the-top-3-of-influencers-on-health-healthcare-on-twitter-work-in-progress/
Further methodological advances in interpretation of social network analysis have been
shared in other blogs on the www.scotpublichealth.com website and in publications in peer
reviewed journals, with more in press and being prepared for submission. This current
article, however, provides a detailed view of a critical time (January 2017) when fake news,
social media influence and clouding of the truth were emerging rapidly into public
consciousness. Writing in October 2018 we therefore feel that it is important to share this
information even though the medical journals were not ready to publish at the time.
Introduction: In an era characterised by terms such as “post-truth”, “news bubble”, “alternative
facts” and “fake news”, and the downplaying of the role of “experts”, it is important to understand who
makes and influences news. This is directly relevant to science and health news coverage, where the
accurate reporting of evidence-based information is required to guide decision making by the public,
professionals and politicians alike. Examples include newspaper coverage and politician involvement
in cancer treatment, against the person-centred care plans increasingly recommended by healthcare
professionals.1,2
Data sources such as newspaper circulation figures and “most read” tables on newspaper websites
give only a partial picture of likely readership and do not allow comparison between news sources,
or the interactions between readers. Social media potentially provides a measurable comparison
across outlets. We set out to explore the use of “Big Data” from Twitter to identify the top UK health
news stories and influencers for a week (commencing 23 January 2017), using information from
Twitter accounts of mainstream UK newspapers and the top UK medical journals (BMJ and Lancet).
Extending the approach, we were able to compare the UK medical journals with other global medical
journals. Finally, we looked beyond the media outlets and health journals to compare their impact
with high profile opinion formers and influencers in the UK.
Methods: An Excel add-on (NodeXL Pro, Social Media Research Foundation3) was used to run a
series of Twitter searches on a PC constructed to perform large data analysis (Windows 10 Pro,
64-bit PC with 32 GB RAM, Intel® Core™ i7-6700K, Excel 2016). NodeXL extracts information on Twitter
users whose recent tweets contained the search term, or who were replied to or mentioned in those
tweets, taken from a data set limited to a maximum of 18,000 tweets (typically fewer; see individual
NodeXL outputs for details). Searches were conducted as far as possible to focus on the period 23-28
January 2017. NodeXL can extract tweets for up to 8-9 days into the past, though the way that
Twitter issues the tweets can be variable and the volume of tweets will vary depending on search.
Accordingly, later searches in this paper focused on a slightly more recent period.
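The extraction described above can be pictured as building an edge list in which each reply or mention links the author of a tweet to the account named. The following is a minimal sketch only, not NodeXL's implementation: the tweet records, the field names and the `build_edges` helper are hypothetical, and treating a tweet with no mention as a self-loop is an assumption made for illustration.

```python
import re

# Matches Twitter handles such as @bmj_latest (letters, digits, underscore).
MENTION = re.compile(r"@(\w{1,15})")

def build_edges(tweets):
    """Return (source, target) pairs: one edge per account named in a tweet.

    Assumption: a tweet that names nobody becomes a self-loop, so that
    its author still appears in the network as a vertex.
    """
    edges = []
    for tweet in tweets:
        author = tweet["user"].lower()
        targets = [name.lower() for name in MENTION.findall(tweet["text"])]
        if not targets:
            edges.append((author, author))
        for target in targets:
            edges.append((author, target))
    return edges

# Hypothetical example tweets (not taken from the study data).
tweets = [
    {"user": "reader_a", "text": "Worth reading, via @bmj_latest"},
    {"user": "reader_b", "text": "@reader_a agreed, thanks @bmj_latest"},
    {"user": "bmj_latest", "text": "New infographic on UK health spending"},
]
edges = build_edges(tweets)
```

Vertices are then the distinct account names in the edge list, and edges are the individual interactions, which is the sense in which the terms are used in the figure legends.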
Search 1: We started with a broad search of health news in the mainstream UK press and top UK
medical journals before focusing down on the top three stories (by count/retweet count) from
this search, identified by URL in the NodeXL report. Information about top URLs, hashtags, words
and “word sentiment” were extracted from the NodeXL report. The BMJ had the highest ranking of
medical journals in this search and therefore became the focus for search 2.
Search 2: A separate NodeXL search was run for the @bmj_latest Twitter handle, again repeating for
the top three stories identified by URL.
Search 3: A search of global medical journals was performed to compare the ranking of the top UK
medical journals with other global journals, repeating for the top three stories identified by URL.
Search 4: To study the influencers and disseminators of health and care messages, beyond the
media outlets themselves, a NodeXL search was conducted looking at the top general and health
media outlets from the first search, plus tweets from UK-based health tweeters (charities, think
tanks, a royal college, NHS organisations, Department of Health and high profile individuals) with a
large number of Twitter followers.
The full search terms and URLs to NodeXL maps for the main searches and related URL-based
searches are listed in the supplementary materials, providing a summary report of main URLs,
hashtags, words and influencers, plus a full download of the data extracted for each search.
Patient involvement: Patients were not involved in the development or conduct of this study. The
research did not directly impact on patient care. The dissemination and interpretation of health
news may impact on shared decision making, but that is beyond the scope of this paper.
Public involvement: Earlier NodeXL maps of health news tweets were uploaded to the NodeXL
gallery and tweeted in December 2016.
Results: Table 1 shows “betweenness centrality” for the general health news search (search 1): the
higher the measure, the broader the connections across the network.4 The Independent, Guardian,
Mail Online, BBC News and Sky News were the main media outlets in this map, but the BMJ was at
number 6, ranking higher than several popular national newspapers.
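To make the measure concrete, betweenness centrality counts how often a vertex lies on the shortest paths between other pairs of vertices. The sketch below uses Brandes' algorithm on a small hypothetical network. Whether NodeXL uses this exact algorithm, or applies any normalisation, is an assumption; the account names are invented for illustration.

```python
from collections import deque

def betweenness_centrality(adjacency):
    """Unnormalised betweenness for an undirected, unweighted graph.

    `adjacency` maps each vertex to the set of its neighbours.
    Uses Brandes' algorithm: one breadth-first search per vertex.
    """
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search, counting shortest paths to every vertex.
        order = []
        parents = {v: [] for v in adjacency}
        paths = {v: 0 for v in adjacency}
        dist = {v: -1 for v in adjacency}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    parents[w].append(v)
        # Walk back from the farthest vertices, accumulating dependencies.
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in parents[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair was counted once from each end.
    return {v: s / 2 for v, s in score.items()}

# Hypothetical network: a journal bridging two readers and a newspaper.
graph = {
    "journal": {"reader_a", "reader_b", "newspaper"},
    "reader_a": {"journal"},
    "reader_b": {"journal"},
    "newspaper": {"journal", "reader_c"},
    "reader_c": {"newspaper"},
}
scores = betweenness_centrality(graph)
```

In this toy network the journal scores highest because every path between the readers and the newspaper passes through it, while the accounts at the edge of the network score zero.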
Table 1) Betweenness centrality and number of followers for top 20 tweeters in general health
news search (23-27 January 2017) (search 1)
Ranking | Vertex | Betweenness centrality | No. of followers (2 Feb 2017)
1 | independent | 59,860,338 | 2.13M
2 | guardian | 51,457,004 | 6.26M
3 | mailonline | 23,498,056 | 1.89M
4 | bbcnews | 22,735,753 | 7.72M
5 | skynews | 12,399,063 | 3.78M
6 | bmj_latest | 10,785,563 | 235K
7 | telegraph | 6,096,766 | 1.97M
8 | daily_express | 6,095,199 | 625K
9 | dailymirror | 5,209,256 | 856K
10 | thetimes | 4,869,493 | 924K
11 | thelancet | 4,168,534 | 253K
12 | Thesun | 2,135,816 | 1.24M
13 | pd_health | 1,724,688 | 128
14 | pash22 | 1,659,415 | 17K
15 | thescotsman | 1,588,991 | 120K
16 | labour_zone | 1,360,319 | 1,116
17 | gpconsortia | 1,359,029 | 4,028
18 | Macaxe | 1,125,013 | 1,442
19 | nhsmillion | 1,095,450 | 124K
20 | cost_ofliving | 1,071,824 | 1,466
Source: data from NodeXL analysis (betweenness centrality) and Twitter (number of followers)
Figure 1 shows the map of the general health news search, and related maps for the top 3 URLs on
this map. The number of tweets and retweets for the top 3 URLs in this map are shown in square
brackets. Conducting separate NodeXL maps for these 3 URLs identified a larger number of tweeters
and tweets (vertices and edges provided in figure 1), as the URL-based searches identified tweets
that were posted without mentioning any of the search terms from search 1.
The top two URLs on the general health news search were from the Independent website, and the
map for these URL searches identified different groups tweeting on the topic of Trump, Brexit and
the future of the NHS. The third URL was from the Guardian website and this produced a much
simpler map, with tweets related to the Royal College of Paediatrics and Child Health (RCPCH)
report on child health, which identified that a fifth of children in the UK live in poverty.
The top 20 connections for tweets mentioning “@bmj_latest” (search 2) are shown in table 2,
consisting mainly of individual tweeters and a smaller number of Twitter accounts from healthcare
organisations. The NodeXL map for @bmj_latest search (figure 2) shows a network dominated by
tweets by the journal or mentioning the journal. The 3 URL-based maps from the BMJ search, in
order of number of tweets in the NodeXL report, included a story from the BMJ archive (from a Trish
Groves tweet), an infographic on UK health spending, and a Margaret McCartney comment piece on
celebrity endorsements. The first two searches produced simple maps, while the third map showed
two main groups, plus other smaller clusters, with a lot of tweets that did not mention @bmj_latest
in their tweet (53 tweets identified in the main bmj_latest search, 424 tweets in the URL-based
search).
Table 2) Top 20 for BMJ search (23-28 January 2017) (search 2)
Vertex | Betweenness centrality | No. of followers (1 Feb 2017)
bmj_latest | 4,500,150 | 235K
orthopodreg | 34,052 | 2,584
drdavidwarriner | 22,718 | 2,817
copddoc | 20,613 | 4,807
pash22 | 19,642 | 17K
fsrh_uk | 19,544 | 2,032
wittykidney | 17,068 | 673
carlofavaretti | 17,068 | 1,123
carlheneghan | 17,068 | 5,772
rcpsglasgow | 15,998 | 3,543
ted_melnick | 15,288 | 188
mancunianmedic | 14,892 | 8,499
drraggarwal | 14,831 | 2,643
hammeritoutuk | 12,837 | 111
barbarabulc | 12,804 | 887
hcuk_clare | 12,804 | 9,909
nizammamode | 12,789 | 196
frank_dor | 12,789 | 2,170
jckjsphs | 12,728 | 9
johncam_brennan | 11,723 | 26
Source: data from NodeXL analysis (betweenness centrality) and Twitter (number of followers)
Table 3 and figure 3 repeat the approach for health journals with a global reach (search 3). Nature,
NEJM and the BMJ were the top 3 Twitter accounts in this map. The top 2 URLs by number of tweets
were from Nature (climate change and dermatology research), the 3rd top from NEJM (a C. difficile
film), each of which produced straightforward NodeXL maps.
Table 3) Top 20 for global medical journals (25-28 January 2017) (search 3)
Ranking | Vertex | Betweenness centrality | No. of followers (1 Feb 2017)
1 | nature | 37,446,107 | 635K
2 | nejm | 17,848,413 | 388K
3 | bmj_latest | 10,990,092 | 235K
4 | thelancet | 8,342,886 | 253K
5 | jama_current | 7,993,000 | 192K
6 | erictopol | 1,588,822 | 98.8K
7 | gavinprestonmd | 1,123,961 | 44.7K
8 | pash22 | 1,076,431 | 17K
9 | pharmadecisions | 916,314 | 319
10 | gerardhough | 909,179 | 874
11 | healthuktd | 866,764 | 1,338
12 | vilavaite | 812,813 | 10.4K
13 | neilflochmd | 762,979 | 115K
14 | annalsofim | 679,679 | 21.8K
15 | quantifyhealth | 544,499 | 1,652
16 | hmkyale | 516,839 | 11.9K
17 | gladeolie | 515,840 | 937
18 | behdadnavabi | 497,705 | 73
19 | vicentelozadab | 475,125 | 10.7K
20 | jimjohnsonsci | 472,029 | 2,478
Source: data from NodeXL analysis (betweenness centrality) and Twitter (number of followers)
Table 4 and figure 4 extend the UK search to include key Twitter influencers from the UK, including
charities, a royal college, NHS, government and individual Twitter accounts (search 4). Separate
NodeXL searches using URLs have not been attempted because there were a smaller number of
tweeted URLs in this analysis. Full details are available, however, from the link in the supplementary
materials.
Table 4) Top 20 for UK search of influencers and disseminators (search 4)
Ranking | Vertex | Betweenness centrality | No. of followers (1 Feb 2017)
1 | cr_uk | 23,542,162 | 287K
2 | nhsengland | 21,963,976 | 155K
3 | independent | 12,163,849 | 2.13M
4 | bmj_latest | 11,339,377 | 236K
5 | guardian | 10,940,073 | 6.26M
6 | thebhf | 9,017,958 | 295K
7 | olivierbranford | 8,395,990 | 121K
8 | helenbevan | 8,232,637 | 40.9K
9 | alzheimerssoc | 7,671,859 | 142K
10 | mindcharity | 7,462,615 | 301K
11 | nhschoices | 6,475,434 | 201K
12 | thelancet | 5,569,996 | 253K
13 | thekingsfund | 3,745,105 | 100K
14 | bjsm_bmj | 3,481,951 | 37.2K
15 | wellcometrust | 3,306,834 | 102K
16 | prof_anil_jain | 3,264,548 | 1,716
17 | childrensociety | 2,825,856 | 61.3K
18 | liz_oriordan | 2,794,036 | 6,414
19 | jrf_uk | 2,779,942 | 140K
20 | blanchyshed | 2,727,807 | 181
Source: data from NodeXL analysis (betweenness centrality) and Twitter (number of followers)
Table 5 shows the word and sentiment analysis using information available from Twitter via NodeXL;
the results are shown as a count of words, and a ratio of positive to negative words (a figure of over
1 suggests more positive wording of tweets). Tweets on the Guardian coverage of the RCPCH report
on child poverty had a highly negative ranking, as did Margaret McCartney’s article on celebrity
endorsement. In the global journal search, the climate change editorial and C. difficile film also had a
strongly negative sentiment analysis.
Table 5) Words and sentiment analysis from NodeXL map reports
Search strategy | +ve words | -ve words | Total words | Ratio of +ve to -ve words
General health news search (search 1) | 4082 | 7653 | 189,874 | 0.53
Health news 1 URL | 2232 | 3568 | 90,487 | 0.63
Health news 2 URL | 4187 | 3071 | 104,907 | 1.36
Health news 3 URL | 34 | 328 | 4,370 | 0.10
BMJ tweets (search 2) | 1794 | 1698 | 44,898 | 1.06
BMJ 1 URL | 206 | 5 | 1,879 | 41.2
BMJ 2 URL | 69 | 0 | 629 | N/A
BMJ 3 URL | 118 | 381 | 6,043 | 0.31
Global medical journals (search 3) | 2885 | 4579 | 130,151 | 0.63
Global 1 URL | 1 | 1390 | 23,715 | 0.001
Global 2 URL | 8 | 8 | 3,413 | 1.00
Global 3 URL | 2 | 204 | 2,326 | 0.01
NodeXL map reports (see supplementary materials for URLs to each report)
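The ratio in the final column of table 5 is simple arithmetic on the two word counts. As a minimal sketch, the hypothetical `sentiment_ratio` helper below reproduces three rows of the table, returning no value where there are no negative words (shown as N/A in the table).

```python
def sentiment_ratio(positive, negative):
    """Ratio of positive to negative word counts; None when undefined."""
    if negative == 0:
        return None  # no negative words, so the ratio cannot be calculated
    return positive / negative

# Word counts copied from three rows of table 5.
rows = {
    "General health news search (search 1)": (4082, 7653),
    "BMJ 1 URL": (206, 5),
    "BMJ 2 URL": (69, 0),
}
ratios = {name: sentiment_ratio(pos, neg) for name, (pos, neg) in rows.items()}
for name, ratio in ratios.items():
    print(name, "N/A" if ratio is None else round(ratio, 2))
```

A ratio based on very few words (for example, 5 negative words in the BMJ 1 URL row) is sensitive to single words, so the larger searches give the more stable figures.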
Figure 5 plots connectedness (betweenness centrality) against number of followers for the accounts
in the general health news search (table 1). The Independent, Guardian, Daily Mail and
BMJ Twitter accounts had a higher than average connectedness for number of followers.
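One crude way to express connectedness relative to audience size is betweenness centrality per follower. The sketch below applies this to the top six accounts in table 1. It is an illustration only and an assumption on our part: it is not the method used to draw figure 5, and a simple ratio need not rank accounts in the same way as the plotted relationship.

```python
# Betweenness centrality and followers for the top six accounts in table 1.
# Follower counts are the rounded figures reported in the table.
table1 = {
    "independent": (59_860_338, 2_130_000),
    "guardian": (51_457_004, 6_260_000),
    "mailonline": (23_498_056, 1_890_000),
    "bbcnews": (22_735_753, 7_720_000),
    "skynews": (12_399_063, 3_780_000),
    "bmj_latest": (10_785_563, 235_000),
}

# Betweenness per follower: a rough index of reach beyond the follower base.
per_follower = {name: bc / followers for name, (bc, followers) in table1.items()}
ranked = sorted(per_follower, key=per_follower.get, reverse=True)
for name in ranked:
    print(f"{name:12s} {per_follower[name]:6.1f}")
```

On this measure the BMJ account ranks first among the six, consistent with the observation that its influence exceeds its immediate base of followers.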
Discussion: This analysis of recent health-related tweets in the UK and internationally has shown the
potential use of Big Data in understanding how health news is made and disseminated. The obvious
strengths are the immediacy, relative ease and accessibility of the outputs (maps and full NodeXL
reports). The NodeXL Pro add-on to Excel performs the analysis and uploads a report automatically
to the NodeXL gallery, with a summary of influencers, hashtags, words and URLs. Furthermore, the
full extract of data is available to download from that report. The searches performed for this paper
could be automated to provide a summary of tweets on UK health news on a regular basis.
Assessing the accuracy of NodeXL algorithms and outputs is beyond the scope of this paper as it
would require extraction and checking of tens of thousands of tweets. Twitter will sometimes not
provide a full extract, particularly if there is a lot of tweeting around the search term in question.
Nonetheless, the health stories identified, and the main influencers, appear to be reflective of the
main stories of this recent period. It is not possible to perform an analysis of tweets posted further
back into the past, so we cannot provide a commentary over time; to our knowledge this is the first
time this approach has been used to study UK health news. It is apparent, however, that NodeXL
looks beyond the URLs that obviously link to a story (though these are the links used in the NodeXL
report), and also includes shortened URLs that point to the same page. This means that NodeXL
extracts information that a casual inspection of Twitter using the inbuilt search function would miss.
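The effect of shortened URLs on tweet counts can be sketched as follows. This is a hypothetical illustration rather than a description of how NodeXL resolves links: the `count_by_destination` helper, the lookup table of expansions and the placeholder domains are all assumptions.

```python
from collections import Counter

def count_by_destination(tweeted_urls, expansions):
    """Count tweets per destination page, treating short links as their target."""
    return Counter(expansions.get(url, url) for url in tweeted_urls)

# Hypothetical short links and their destination (placeholder domains).
expansions = {
    "https://short.example/a1": "https://news.example/health-story",
    "https://short.example/b2": "https://news.example/health-story",
}
tweeted_urls = [
    "https://news.example/health-story",
    "https://short.example/a1",
    "https://short.example/b2",
    "https://news.example/other-story",
]
counts = count_by_destination(tweeted_urls, expansions)
```

Searching only for the full address would find one tweet of the first story here; grouping by destination finds three.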
Altmetrics are increasingly used in medical journals: they provide a cumulative measure of tweets
and other social media activity, plus a breakdown by geography and demographic (e.g. members of the
public, healthcare professionals, scientists etc).5 The NodeXL and altmetrics data are complementary
(the latter can be viewed through the metrics tab on BMJ articles). While NodeXL gives individual
breakdown and interaction of recent tweets (typically no more than 9 days into the past), altmetrics
data gives aggregate data, at considerable detail, on a cumulative basis. Of note, altmetrics suggest
that Twitter is much more commonly used for dissemination of information from the BMJ than other
social media platforms (e.g. Facebook and Google+), which strengthens the rationale for focusing on
Twitter in this initial NodeXL analysis.
Each NodeXL map needs careful interpretation, though only a sketch of selected maps is possible
here. If searching for a Twitter handle (Twitter username eg @bmj_latest), NodeXL identifies tweets
from that account, plus tweeters that interact with that account (e.g. a reply, retweet, or a tweet
mentioning that account). Somebody else mentioning this account in their tweet, with interactions
for some of these tweets, can appear prominently on a NodeXL map. This is potentially important
information, but does not necessarily equate to a positive interaction with the main account in the
search. NodeXL can also search for words and hashtags; we avoided hashtags for this analysis, as it
restricted the number of tweets identified (for example, a search term “#NHS” turns up many fewer
tweets than NHS, which captures tweets using either NHS or #NHS).
One of the striking points from the maps of general and specialist press (figures 1 and 4) is the lack
of direct interaction between the main opinion formers in these two sectors: a newspaper may
tweet a health story, and perhaps provide a link to the published study in the article, but will rarely
use the author or publisher Twitter handles in the article or tweets. The bridges between groups on
these maps are often individual tweeters with medium to large Twitter followings rather than the
publishers themselves. The lack of interaction between the publishers’ accounts is potentially a
missed opportunity: the newspaper, medical journal and clinician/scientist all have a part to play in
the dissemination of the story, and the interaction possible from a tweet could further the
audience’s interest and interaction in the story and ultimately perhaps the understanding of the
evidence; there is, of course, also the potential for more negative interactions/spam tweeting, so
there would be a balance to reach after testing out this idea.
The pattern observed in NodeXL maps also needs interpretation in the context of all other available
information. For example, the tweets around the search for the Margaret McCartney article (3rd URL
in @bmj_latest search (search 2)) clustered around 2 main groups and a smaller number of more
isolated tweeting. The two main groups had different membership, and different words/word pairs
and hashtags were dominant for the different groups in the NodeXL report, but these were two
groups of medical professionals in agreement over the general message: we need to promote
evidence-based approaches when others are all too keen to resort to celebrity endorsements. There
was quite a lot of cross-pollination between the two groups in the map. Beyond this particular map,
members of both groups are well known to each other and many would see each other as colleagues
and allies on Twitter and/or the real world. As another example, the more complex map of the
Trump/Brexit/NHS tweets in the first of the Independent stories identified in search 1 (general
health news) had more groups, with less interaction between groups, but again these groups were
tweeting similar messages, expressing concern about the potential US-UK trade deal, and its impact
on the NHS, albeit with a different vocabulary in different groups. Separation and lack of interaction
is different from polarisation, so caution is required in reading these maps. The groupings on these
maps are not examples of the “filter bubble”6; they are perhaps more accurately described as people
sharing the same bubble (i.e. reading the same newspapers, watching/listening to the same
programmes) but rarely encountering each other, perhaps as much on social media as in the real
world.
There are limitations to the analysis. Social analytical tools are limited by the information that
individual tweeters choose to share in their profile, with geolocation often incomplete or
ambiguous. Hence the decision to focus on UK media outlets, and inclusion of NHS in some of the
search terms. It should be noted that many tweeters do not include their location anyway, so a
location-based search would also have its limitations, restricting the number of tweets down to
substantially smaller numbers.
Even with a powerful PC there were challenges in extracting and processing all the information
required for this analysis, and this impacted the number of tweets extracted (limited to a maximum
of 10,000 tweets or fewer, rather than the 18,000 allowed by NodeXL). The time to run a single map
(up to 2 hours for the largest searches, often requiring refinement and rerunning before finding a
suitable search) also influenced the dates of the data extraction: with limited computer access for
extraction and analysis, and despite an attempt to run the analyses and write the paper in as short a
period as possible, the time span of the final search starts after the period of the first search.
The search terms used were necessarily quite generic: tweets without these terms (health, NHS,
hospital, depression etc) would not have been picked up in searches 1 or 4. However, searches 2 and
3 were designed to capture all tweets from the listed medical journals.
The sentiment analysis in NodeXL is designed for general use. It is likely that the positive and
negative word lists would need to be modified for scientific and/or clinical use. Words such as
“infection” are listed under “negative words”, so the NEJM tweets on the C. difficile film are
unsurprisingly identified as having a negative sentiment. The film itself, however, was well received
by the people tweeting about it, so it would be inappropriate to group this together with the
negative tweets about global warming in the Trump era. A similar effect is seen in the RCPCH tweets
from the Guardian website in search 1, some of which included negative words such as “poverty”
and a negative sentiment, despite the positive reception of the report. The NodeXL sentiment lists
are, however, adaptable,7 and this could be explored in further analysis. Further fine tuning of
methodology in the future will be possible, which makes the open access to previous searches and
data on the NodeXL graph gallery so helpful.
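The effect of adapting the word lists for clinical use can be illustrated with a minimal sketch. The word lists below are hypothetical and much shorter than any real list (NodeXL's own lists are not reproduced here), and the `count_sentiment` helper and the example tweet are invented for illustration.

```python
import re

def count_sentiment(text, positive_words, negative_words):
    """Count positive and negative words in lower-cased text."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(word in positive_words for word in words)
    neg = sum(word in negative_words for word in words)
    return pos, neg

# Hypothetical general-purpose word lists.
POSITIVE = {"great", "excellent", "helpful"}
NEGATIVE = {"infection", "poverty", "bad"}
# Hypothetical clinical terms to treat as neutral in health-related tweets.
CLINICAL_NEUTRAL = {"infection"}

tweet = "Excellent film on C. difficile infection, really helpful"
general = count_sentiment(tweet, POSITIVE, NEGATIVE)
clinical = count_sentiment(tweet, POSITIVE, NEGATIVE - CLINICAL_NEUTRAL)
```

With the general list the tweet scores two positive words and one negative word; with "infection" treated as neutral the negative count falls to zero, which better matches the favourable tone of the tweet.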
The analysis presented here represents a snapshot of tweets. The influencers and types of stories
will change over time. Previous simpler attempts at health news searches using NodeXL (available on
NodeXL graph gallery) showed different patterns, with slightly different orderings of the main media
outlets over time, largely the top 5 shuffling about as different stories dominate. The reasons for
searching on a regular basis become clear when pulling out specific examples: a preliminary NodeXL
search conducted for 23-27 November 2017 identified highly polarised news reporting of the same
story (the discredited claim that £350m/week would be available to NHS post Brexit): 506 tweets of
an Independent article “Brexit means no extra money for the NHS” versus 135 tweets of a Daily Mail
story that claimed “The Brexit dividend does exist”8. The balance in this Twitter analysis is in favour
of the Independent story, though the readership may be quite different in other contexts, both in
age profile and ability/likelihood to vote. This is a clear example of a “filter bubble”6, where
individuals’ views are highly polarised, presumably on both social media and in the real world, with
chosen media reinforcing already deeply ingrained views.
If only 14% of the population uses Twitter2 then we need to be careful in attempting to generalise
findings from this analysis. Global social media campaigns can influence tweets over a short period
of time: for example, #WorldCancerDay on 4 February 2017 meant that an analysis of tweets when running
and refining search 4 was dominated by hashtags relating to #WorldCancerDay (90% of the top ten
hashtags)9. Fortunately NodeXL and Twitter support time-limited searching using the “since” and
“until” search terms, so the search was repeated to run as close to earlier analyses as possible.
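A time-limited search string of this kind might be assembled as in the sketch below. The terms and dates are illustrative only and are not the full search strings used in this paper (see supplementary materials); the `build_query` helper is hypothetical, and the exact operator syntax and the handling of the end date should be checked against current Twitter documentation.

```python
from datetime import date

def build_query(terms, since, until):
    """Join search terms with OR and append since:/until: date operators."""
    joined = " OR ".join(terms)
    return f"{joined} since:{since.isoformat()} until:{until.isoformat()}"

# Illustrative terms and dates (assumptions, not the study's search strings).
query = build_query(["NHS", "health"], date(2017, 1, 23), date(2017, 1, 29))
print(query)
```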
A number of factors have an important influence on the impact of a tweet: its timing, use of image,
URL, hashtag and/or mention of another Twitter user. The newspaper sites are set up to embed an
image into a tweet automatically, thereby boosting the impact of the tweet. The medical journals do
not tend to provide this function, which is likely to reduce the impact of their stories. As described in
the supplementary materials to this paper, most of the tweets sent by individuals during the period
studied did not include the ingredients of a successful tweet (image, hashtag, mention of another
Twitter user; obviously all the tweets identified through the URL based search did include a weblink).
Tweets from the @BMJ_latest account did include these, and the large number of followers will also
have boosted the number of replies and retweets. Serendipity, luck, humour and capturing the
zeitgeist are also important, which means that much smaller tweeters can sometimes achieve a large
audience. Ease of access of the original article would also be expected to have an impact: the
Independent and Guardian have completely open sites, whereas the Times is only open to
subscribers, which may explain its lower ranking. The BMJ sits somewhere in between, with blogs
and selected scientific articles freely available, but others locked. The decision to unlock articles,
potentially increasing audience but reducing revenue, needs to be considered carefully.
Conclusions: This analysis has demonstrated the potential opportunities and pitfalls of using Big Data
from Twitter in exploring the main health stories, and the main generators, interactors and
disseminators of these stories. Options for automating these searches for regular extraction and
uploading to the NodeXL graph gallery will be discussed with the Social Media Research Foundation.
The BMJ is identified as a major producer and shaper of news stories in the UK and internationally.
There are opportunities to strengthen the social media interactions around health stories, including
the inclusion of links and Twitter handles in tweets by the major newspaper publishers, and perhaps
fostering relationships between the clinical/scientific community, journals and general press.
Building bridges between the tweeting communities we see in the NodeXL maps in this analysis
would help reach new audiences.
There are also areas for improvement in the general quality of tweeting, with individuals often
missing out the key ingredients of a successful tweet (image, URL, hashtag, mention).
NodeXL provides an opportunity to extract information, share that information with key opinion
makers, and analyse the impact of any changes. In a “post truth” era when evidence and an
understanding of the scientific method are struggling to be heard, we need to tweet more effectively,
reaching new audiences, ensuring that the truth is out there.
Acknowledgements: The authors wish to thank Mark Outhwaite and Marc Smith for their
introduction to NodeXL and ongoing support in using NodeXL.
Figure 1) NodeXL maps for general health search plus top 3 URLs within this search [including
number of retweets in square brackets]. Vertices = number of users tweeting; edges = number of
tweets (search 1)
Source: NodeXL searches, recorded on NodeXL Graph Gallery website under username
scotpublichealth http://nodexlgraphgallery.org/Pages/Default.aspx?search=scotpublichealth
3 sub searches for general health news were limited to 5,000 tweets
Figure 2) NodeXL maps for @bmj_latest search plus top 3 URLs within this search [including
number of retweets in square brackets]. Vertices = number of users tweeting; edges = number of
tweets (search 2)
Source: NodeXL searches, recorded on NodeXL Graph Gallery website
Figure 3) NodeXL maps for global health journal search plus top 3 URLs within this search
[including number of retweets in square brackets]. Vertices = number of users tweeting; edges =
number of tweets (search 3)
Source: NodeXL searches, recorded on NodeXL Graph Gallery website
Figure 4) UK influencers and disseminators (search 4)
Source: NodeXL searches, recorded on NodeXL Graph Gallery website
Figure 5) Scatter plot showing number of followers against a measure of connectedness
(“betweenness centrality”).
The labels and data are from table 1 (i.e. 1 = Independent, 2 = Guardian etc).
Source: data from NodeXL analysis (betweenness centrality) and Twitter (number of followers)
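For readers unfamiliar with the measure on the vertical axis, the sketch below illustrates the standard, unnormalised definition of betweenness centrality: for each account, the number of shortest paths between pairs of other accounts that pass through it. It uses Brandes' algorithm on a small invented network. The account names and connections are hypothetical and are not taken from the NodeXL data, and the sketch makes no claim about how NodeXL itself calculates or scales the measure.

```python
from collections import deque

def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph
    given as {node: set_of_neighbours}. Returns unnormalised scores."""
    score = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search from source, counting shortest paths.
        order = []
        preds = {v: [] for v in graph}
        n_paths = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:  # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each undirected pair was counted once from each end.
    return {v: s / 2 for v, s in score.items()}

# Hypothetical network: a news account and a journal account with a few users.
toy_network = {
    "news_account": {"user_a", "user_b", "journal_account"},
    "journal_account": {"news_account", "user_c"},
    "user_a": {"news_account"},
    "user_b": {"news_account"},
    "user_c": {"journal_account"},
}

centrality = betweenness_centrality(toy_network)
for account, value in sorted(centrality.items(), key=lambda item: -item[1]):
    print(account, value)
```

In this toy network the news account scores 5 and the journal account 3, while the three users who sit at the edge of the network score 0. A high score therefore reflects an account's position as a bridge between others rather than its number of followers.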
Supplementary materials:
Search terms
Search 1: Search term for general health news search:
(@guardian OR @independent OR @thetimes OR @mailonline OR @daily_express OR @thesun OR
@dailymirror OR @bbcnews OR @skynews OR @telegraph OR @thescotsman OR @thelancet OR
@bmj_latest) AND (Health OR NHS OR hospital OR diabetes OR obesity OR anxiety OR depression OR
flu OR influenza)
Searches for top 3 URLs identified through the general health news search were limited to 5,000
tweets because of computer processing power.
Search 2: Search term for BMJ search: @bmj_latest. Searches for top 3 URLs identified through the
BMJ search were not limited to an upper number of tweets because there was a lower volume of
tweeting than search 1.
Search 3: Search term for global medical journals (25 to 28 January 2017): @jama_current OR
@nejm OR @bmj_latest OR @thelancet OR @AnnalsofIM OR @Nature since:2017-01-23
until:2017-01-29. This search was limited to 8,000 tweets due to computer processing power.
Related URL-based tweets were not limited to an upper number of tweets.
Search 4: Search term for UK organisations tweeting on health issues: @mindcharity OR @jrf_uk OR
@thebhf OR @bmj_latest OR @thelancet OR @bjsm_bmj OR @olivierbranford OR @helenbevan OR
@rcplondon OR ((@guardian OR @independent) AND (NHS OR health)) OR @GdnHealthcare OR
@CR_UK OR @alzheimerssoc OR @DementiaUK OR @wellcometrust OR @HealthFdn OR
@thekingsfund OR @childrensociety OR @nhschoices OR @nhsengland OR @DHgovuk
since:2017-01-27 until:2017-01-29. This search was limited to 10,000 tweets due to computer
processing power. It was conducted later than the other searches, so looks at a slightly later period.
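The search terms above follow a regular pattern: a list of Twitter handles joined by OR, an optional list of topic keywords, and optional “since” and “until” date limits. The following Python sketch shows how such strings could be assembled programmatically, for example if repeat searches were to be automated. The function `build_search` is a hypothetical helper written for illustration; it is not part of NodeXL or Twitter, it only produces the text of a query and does not run any search. The grouping parentheses around the handles are the sketch's own choice.

```python
from datetime import date

def build_search(handles, keywords=(), since=None, until=None):
    """Combine handles, optional keywords and optional date limits
    into a single search string (hypothetical helper)."""
    query = "(" + " OR ".join(handles) + ")"
    if keywords:
        query += " AND (" + " OR ".join(keywords) + ")"
    if since is not None:
        query += f" since:{since.isoformat()}"
    if until is not None:
        query += f" until:{until.isoformat()}"
    return query

# Handles and dates as listed for search 3.
journals = ["@jama_current", "@nejm", "@bmj_latest",
            "@thelancet", "@AnnalsofIM", "@Nature"]
search3 = build_search(journals, since=date(2017, 1, 23), until=date(2017, 1, 29))
print(search3)

# The newspaper clause used inside search 4.
newspaper_clause = build_search(["@guardian", "@independent"], ["NHS", "health"])
print(newspaper_clause)
```

Keeping the date limits as explicit arguments makes it straightforward to rerun the same search for a different period, which is how the #WorldCancerDay effect on search 4 was avoided.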
As a by-product of this analysis we were also able to study the tweeting practices of different Twitter
users. Tweets are typically more effective if an image, hashtag, mention and/or URL are used. In
order to assess the “quality” of tweeting a search of individual tweets by or mentioning @bmj_latest
was conducted, using a URL-based search that identified a manageable but sufficient number of
tweets. Tweets were identified via the Twitter search function, and the use of image, hashtag and
“mention” of another Twitter user(s) was recorded, along with the number of retweets and likes.
This approach was repeated for tweets posted by the @bmj_latest account on 23 Jan 2017.
Using the Margaret McCartney comment piece in search 2, 23 tweets were identified from 23 users
in a Twitter search on 3 February 2017 (tweets posted between 25/1/2017 and 31/1/2017). These
tweets were from users with a median of 851 followers on Twitter (range 17 followers to 236k
followers). The tweets resulted in a median of 7.5 retweets (range 0-155) and 7.6 likes (0-134), with
16 (69.6%) of the tweets retweeted or liked. Only 4 (17.4%) tweets included an image, none used a
hashtag and 4 mentioned another Twitter account; overall 6 tweets (26.1%) included an image,
hashtag and/or mention (2 tweets included both a picture and mention). The first tweet (by
@bmj_latest), which included an image, had by far the most retweets and likes; given the small
number of other tweets that included these additional ingredients a statistical analysis has not been
attempted.
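The percentages in the paragraph above can be reproduced from the reported counts. The short Python sketch below does this as a worked check: because two tweets contained both an image and a mention, they must be counted only once when working out how many tweets contained at least one ingredient. The variable names are illustrative; the counts are those stated in the text.

```python
total_tweets = 23
with_image = 4
with_hashtag = 0
with_mention = 4
image_and_mention = 2   # tweets counted in both the image and mention groups
retweeted_or_liked = 16

# Tweets with at least one ingredient: add the groups, then subtract
# the overlap so that the two image-plus-mention tweets are counted once.
with_any = with_image + with_hashtag + with_mention - image_and_mention

def pct(count, total):
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)

print("image:", pct(with_image, total_tweets))                     # 17.4
print("any ingredient:", with_any, pct(with_any, total_tweets))    # 6 26.1
print("retweeted or liked:", pct(retweeted_or_liked, total_tweets))  # 69.6
```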
The low percentage of tweets with image, hashtag or mention is not reflected in the BMJ’s own
tweets: extracting the 17 tweets on the @BMJ_latest timeline for 23 January 2017, all had a URL and
15/17 (88%) had picture, URL and/or mention, and all but one tweet had two or more of these
ingredients.
Links to reports and full extracts from these NodeXL searches
General health news map (search 1):
http://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=92439
Maps for top URLs in general health news search:
1) Independent article on Theresa May / Donald Trump meeting that mentioned NHS
http://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=92491
2) Another Independent article on Theresa May / Donald Trump meeting
http://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=92494
3) Guardian article on RCPCH report:
http://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=92495
BMJ map (search 2): http://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=92796
Maps for top URLs in BMJ_latest search:
1) The scandal of poor medical research (BMJ Editorial, 1994, tweeted by Trish Groves)
http://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=92486
2) NHS in 2017: the long arm of government (BMJ Feature, January 2017)
http://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=92781
3) Margaret McCartney: Swapping systematic reviews for celebrity endorsements (BMJ Views and
Reviews, January 2017)
http://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=92783
Global health journal search (search 3):
http://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=93155
Maps for top URLs in global health journal search:
1) Base the social cost of carbon on the science (Nature, 18 January 2017)
http://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=93159
2) Dermatologist-level classification of skin cancer with deep neural networks (Nature, 2 February
issue)
http://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=93161
3) Preventing Clostridium difficile Infection Recurrence (Video) (NEJM)
http://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=93162
UK influencers and broadcasters (search 4):
https://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=93536
References
1 Choosing Wisely website: http://www.choosingwisely.co.uk/ (accessed 4 February 2017)
2 Realistic Medicine. Chief Medical Officer’s Annual Report 2014-15 (Scottish Government, January 2016).
www.gov.scot/Resource/0049/00492520.pdf (accessed 4 February 2017)
3 Social Media Research Foundation website http://www.smrfoundation.org/ (accessed 4 February 2017)
4 PewResearchCenter in association with SMRF. How we analysed Twitter social media networks with NodeXL
http://www.pewinternet.org/files/2014/02/How-we-analyzed-Twitter-social-media-networks.pdf (accessed 4
February 2017)
5 Bower, C. BMJ Blog: Redefining impact altmetrics now on journals from BMJ
http://blogs.bmj.com/bmj-journals-development-blog/2013/10/21/redefining-impact-altmetrics-now-on-journals-from-bmj/
(accessed 4 February 2017)
6 BMJ Blog. Evidence Live 2016: Whither evidence in the social media world?
http://blogs.bmj.com/bmj/2016/06/09/evidence-live-2016-whither-evidence-in-the-social-media-world/ (9
June 2016, accessed 4 February 2017)
7 SMRF. Sentiment analysis: don’t take our word for it
http://www.smrfoundation.org/2016/09/15/sentiment-analysis-dont-take-our-word-for-it/ (accessed 4 February 2017)
8 An earlier general news NodeXL search (without medical journals included)
https://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=85029 (search conducted 27 November 2016,
uploaded on 16 December 2016)
9 A subsequent NodeXL search of health tweeting early Feb 2017
https://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=93480 (uploaded 3 February 2017)