Conference Paper

Measuring Correlation-to-Causation Exaggeration in Press Releases

... While Sumner et al. (2014) and Bratton et al. (2019) performed manual analyses to understand the prevalence of exaggeration in press releases of scientific papers from a variety of sources, recent work has attempted to expand this using methods from NLP (Yu et al., 2019, 2020; Li et al., 2017). These works focus on the problem of automatically detecting the difference in the strength of causal claims made in scientific articles and press releases. ...
... At test time, these classifiers are then applied to document pairs (t, s), and the predicted claim strengths (l_s, l_t) are compared to get the final label l. Previous work has used this formulation to estimate the prevalence of correlation-to-causation exaggeration in press releases (Yu et al., 2020), but has not evaluated it on paired labeled instances. ...
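The pairing step described in the snippet above can be sketched as follows. This is an illustrative reconstruction, not the cited authors' code: the function name is hypothetical, and the 0-3 ordinal strength scale is an assumption about how claim strength is encoded.

```python
# Hypothetical sketch of the formulation above: two sentence-level classifiers
# predict claim strengths for the paper sentence s and press-release sentence t,
# and the pair label l is derived by comparing them.
# Assumed ordinal scale: 0 = no claim, 1 = correlational,
# 2 = conditional causal, 3 = direct causal.

def label_exaggeration(strength_source: int, strength_target: int) -> str:
    """Compare predicted strengths (l_s for the paper, l_t for the press
    release) to obtain the final pair label l."""
    if strength_target > strength_source:
        return "exaggerates"
    if strength_target < strength_source:
        return "downplays"
    return "same"

# Example: the paper reports a correlation (1), the press release
# states direct causation (3).
print(label_exaggeration(1, 3))  # exaggerates
```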
... Following previous work (Yu et al., 2020), we simplify the problem by focusing on detecting when the main finding of a paper is exaggerated. The first step is then to identify the main finding from s, and the sentence describing the main finding in s from t. ...
Preprint
Public trust in science depends on honest and factual communication of scientific papers. However, recent studies have demonstrated a tendency of news media to misrepresent scientific papers by exaggerating their findings. Given this, we present a formalization of and study into the problem of exaggeration detection in science communication. While there are an abundance of scientific papers and popular media articles written about them, very rarely do the articles include a direct link to the original paper, making data collection challenging. We address this by curating a set of labeled press release/abstract pairs from existing expert annotated studies on exaggeration in press releases of scientific papers suitable for benchmarking the performance of machine learning models on the task. Using limited data from this and previous studies on exaggeration detection in science, we introduce MT-PET, a multi-task version of Pattern Exploiting Training (PET), which leverages knowledge from complementary cloze-style QA tasks to improve few-shot learning. We demonstrate that MT-PET outperforms PET and supervised learning both when data is limited, as well as when there is an abundance of data for the main task.
... One of them is altmetrics (Sud and Thelwall, 2014), which measures the social impact of a research work based on its mentions in news and social media. Another example is evaluating information quality in science communication, e.g., detecting whether a news article misinterprets a research work, such as making causal claims from correlational findings, or extrapolating animal study results to humans (Sumner et al., 2014; Yu et al., 2020). ...
... However, an institutional press release serves the dual purpose of responsible science reporting and marketing (Carver, 2014; Caulfield and Ogbogu, 2015; Samuel et al., 2017). With press releases becoming the dominant link between academia and news media, concerns have also increased regarding exaggeration in press releases, such as reporting correlational findings as causal and extrapolating results from animal studies to humans (Sumner et al., 2014; Li et al., 2017; Yu et al., 2020). ...
... To identify and correct exaggeration in press releases, the first step is to link press releases to the original research papers; different versions of claims can then be compared to determine whether overstatements occurred and how to correct them (Yu et al., 2019, 2020). However, linking press releases to the original papers is not a trivial task. ...
Preprint
Accurately linking news articles to scientific research works is a critical component in a number of applications, such as measuring the social impact of a research work and detecting inaccuracies or distortions in science news. Although the lack of links between news and literature has been a challenge in these applications, it is a relatively unexplored research problem. In this paper we designed and evaluated a new approach that consists of (1) augmenting latest named-entity recognition techniques to extract various metadata, and (2) designing a new elastic search engine that can facilitate the use of enriched metadata queries. To evaluate our approach, we constructed two datasets of paired news articles and research papers: one is used for training models to extract metadata, and the other for evaluation. Our experiments showed that the new approach performed significantly better than a baseline approach used by altmetric.com (0.89 vs 0.32 in terms of top-1 accuracy). To further demonstrate the effectiveness of the approach, we also conducted a study on 37,600 health-related press releases published on EurekAlert!, which showed that our approach was able to identify the corresponding research papers with a top-1 accuracy of at least 0.97.
... The strong need for understanding the fast-growing scientific evidence has led to the creation of specialized data hubs and search platforms (e.g., Chen et al., 2020; Wang et al., 2020; Hope et al., 2020). Nevertheless, they have not been able to provide functions to support the direct retrieval of health advice. ...
... If further combined with other NLP tools, such as claim and stance classification, the health advice service would be able to compare and summarize the evidence strength of recommendations for or against certain policies or treatments. The prediction model may be further extended to detect exaggerated health advice in science communication by comparing advice given in research papers against its counterparts in press releases, news articles, and social media posts (Yu et al., 2020). In future work, we will extend health advice identification to news and social media. ...
... Finally, causal relations could perhaps be used to verify false claims in advertising, such as health claims (Yu et al., 2020). Proving false advertising claims can be a laborious task, and therefore an automated system could be used to expedite false advertising legal processes. ...
Article
Causation in written natural language can express a strong relationship between events and facts. Causation in the written form can be referred to as a causal relation, where a cause event entails the occurrence of an effect event. A cause-and-effect relationship is stronger than a correlation between events, and therefore aggregated causal relations extracted from large corpora can be used in numerous applications, such as question answering and summarisation, to produce superior results to those of traditional approaches. Techniques like logical consequence allow causal relations to be used in niche practical applications such as event prediction, which is useful for diverse domains such as security and finance. Until recently, the use of causal relations was a relatively unpopular technique because the causal relation extraction techniques were problematic, and the relations returned were incomplete, error prone or simplistic. The recent adoption of language models and improved relation extractors for natural language, such as Transformer-XL (Dai et al., 2019, arXiv:1901.02860), has seen a surge of research interest in the possibilities of using causal relations in practical applications. Until now, there has not been an extensive survey of the practical applications of causal relations; therefore, this survey is intended precisely to demonstrate the potential of causal relations. It is a comprehensive survey of the work on the extraction of causal relations and their applications, while also discussing the nature of causation and its representation in text.
... Efforts to study and address promotional language use in scientific and biomedical research tend to focus on hype in popular media [33-36]. Within this framework, much of the research on hype or overstatement evaluates mismatches between the underlying research and the presentation of findings in press releases and news articles. Similar research also evaluates inconsistencies between research results and the presentation of findings in abstracts or published articles. ...
Preprint
Background Recent advances in artificial intelligence (AI) have the potential to substantially improve healthcare across clinical areas. However, there are concerns health AI research may overstate the utility of newly developed systems and that certain metrics for measuring AI system performance may lead to an overly optimistic interpretation of research results. The current study aims to evaluate the relationship between researcher choice of AI performance metric and promotional language use in published abstracts. Methods and findings This cross-sectional study evaluated the relationship between promotional language and use of composite performance metrics (AUC or F1). A total of 1200 randomly sampled health AI abstracts drawn from PubMed were evaluated for metric selection and promotional language rates. Promotional language evaluation was accomplished through the development of a customized machine learning system that identifies promotional claims in abstracts describing the results of health AI system development. The language classification system was trained with an annotated dataset of 922 sentences. Collected sentences were annotated by two raters for evidence of promotional language. The annotators achieved 94.5% agreement (κ = 0.825). Several candidate models were evaluated, and the bagged classification and regression tree (CART) achieved the highest performance at Precision = 0.92 and Recall = 0.89. The final model was used to classify individual sentences in a sample of 1200 abstracts, and a quasi-Poisson framework was used to assess the relationship between metric selection and promotional language rates. The results indicate that use of AUC predicts a 12% increase (95% CI: 5-19%, p = 0.00104) in abstract promotional language rates and that use of F1 predicts a 16% increase (95% CI: 4% to 30%, p = 0.00996).
Conclusions Clinical trials evaluating spin, hype, or overstatement have found that the observed magnitude of increase is sufficient to induce misinterpretation of findings in researchers and clinicians. These results suggest that efforts to address hype in health AI need to attend to both underlying research methods and language choice.
... Prior work (Yu et al., 2019, 2020; Li et al., 2017) uses datasets based on PubMed abstracts and paired press releases from EurekAlert. Their core limitation is that they cover only observational studies from PubMed, which have structured abstracts, strongly simplifying the task of identifying the main claims of a paper. This also holds for the test settings they consider, meaning that the proposed models have limited applicability. ...
Preprint
Most work on scholarly document processing assumes that the information processed is trustworthy and factually correct. However, this is not always the case. There are two core challenges, which should be addressed: 1) ensuring that scientific publications are credible -- e.g. that claims are not made without supporting evidence, and that all relevant supporting evidence is provided; and 2) that scientific findings are not misrepresented, distorted or outright misreported when communicated by journalists or the general public. I will present some first steps towards addressing these problems and outline remaining challenges.
Chapter
Dynamic environments can be modeled as a series of events and facts that interact with each other, these interactions being characterised by different relations, including temporal and causal ones. These have largely been studied in knowledge management, information retrieval and natural language processing, leading to several strategies aiming at extracting these relationships from textual documents. However, more relation types exist between events, which are insufficiently covered by existing data models and datasets if one needs to train a model to recognise them. In this paper, we use semantic web technologies to design FARO, an ontology for representing event and fact relations. FARO allows representing up to 25 distinct relationships (including logical constraints), making it a possible bridge between (otherwise incompatible) datasets. We describe the modeling decisions of this ontology resource. In addition, we have re-annotated two existing datasets with some of the FARO properties.

Keywords: Semantic Web, Ontology, Event Relations
Article
Full-text available
Background: Patients increasingly turn to search engines and online content before, or in place of, talking with a health professional. Low quality health information, which is common on the internet, presents risks to the patient in the form of misinformation and a possibly poorer relationship with their physician. To address this, the DISCERN criteria (developed at University of Oxford) are used to evaluate the quality of online health information. However, patients are unlikely to take the time to apply these criteria to the health websites they visit. Methods: We built an automated implementation of the DISCERN instrument (Brief version) using machine learning models. We compared the performance of a traditional model (Random Forest) with that of a hierarchical encoder attention-based neural network (HEA) model using two language embeddings, BERT and BioBERT. Results: The HEA BERT and BioBERT models achieved average F1-macro scores across all criteria of 0.75 and 0.74, respectively, outperforming the Random Forest model (average F1-macro = 0.69). Overall, the neural network based models achieved 81% and 86% average accuracy at 100% and 80% coverage, respectively, compared to 94% manual rating accuracy. The attention mechanism implemented in the HEA architectures not only provided 'model explainability' by identifying reasonable supporting sentences for the documents fulfilling the Brief DISCERN criteria, but also boosted F1 performance by 0.05 compared to the same architecture without an attention mechanism. Conclusions: Our research suggests that it is feasible to automate online health information quality assessment, which is an important step towards empowering patients to become informed partners in the healthcare process.
Article
Full-text available
Motivation: Biomedical text mining is becoming increasingly important as the number of biomedical documents rapidly grows. With the progress in natural language processing, extracting valuable information from biomedical literature has gained popularity among researchers, and deep learning has boosted the development of effective biomedical text mining models. However, directly applying the advancements in natural language processing to biomedical text mining often yields unsatisfactory results due to a word distribution shift from general domain corpora to biomedical corpora. In this paper, we investigate how the recently introduced pre-trained language model BERT can be adapted for biomedical corpora. Results: We introduce BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining), which is a domain specific language representation model pre-trained on large-scale biomedical corpora. With almost the same architecture across tasks, BioBERT largely outperforms BERT and previous state-of-the-art models in a variety of biomedical text mining tasks when pre-trained on biomedical corpora. While BERT obtains performance comparable to that of previous state-of-the-art models, BioBERT significantly outperforms them on the following three representative biomedical text mining tasks: biomedical named entity recognition (0.62% F1 score improvement), biomedical relation extraction (2.80% F1 score improvement), and biomedical question answering (12.24% MRR improvement). Our analysis results show that pre-training BERT on biomedical corpora helps it to understand complex biomedical texts. Availability and implementation: We make the pre-trained weights of BioBERT freely available at https://github.com/naver/biobert-pretrained, and the source code for fine-tuning BioBERT available at https://github.com/dmis-lab/biobert. Supplementary information: Supplementary data are available at Bioinformatics online.
Conference Paper
Full-text available
The spread of ‘fake’ health news is a big problem with even bigger consequences. In this study, we examine a collection of health-related news articles published by reliable and unreliable media outlets. Our analysis shows that there are structural, topical, and semantic patterns which are different in contents from reliable and unreliable media outlets. Using machine learning, we leverage these patterns and build classification models to identify the source (reliable or unreliable) of a health-related news article. Our model can predict the source of an article with an F-measure of 96%. We argue that the findings from this study will be useful for combating the health disinformation problem.
Article
Full-text available
Background: Exaggerated or simplistic news is often blamed for adversely influencing public health. However, recent findings suggested many exaggerations were already present in university press releases, which scientists approve. Surprisingly, these exaggerations were not associated with more news coverage. Here we test whether these two controversial results also arise in press releases from prominent science and medical journals. We then investigate the influence of mitigating caveats in press releases, to test assumptions that caveats harm news interest or are ignored. Methods and findings: Using quantitative content analysis, we analyzed press releases (N = 534) on biomedical and health-related science issued by leading peer-reviewed journals. We similarly analysed the associated peer-reviewed papers (N = 534) and news stories (N = 582). Main outcome measures were advice to readers and causal statements drawn from correlational research. Exaggerations in press releases predicted exaggerations in news (odds ratios 2.4 and 10.9, 95% CIs 1.3 to 4.5 and 3.9 to 30.1) but were not associated with increased news coverage, consistent with previous findings. Combining datasets from universities and journals (996 press releases, 1250 news), we found that when caveats appeared in press releases there was no reduction in journalistic uptake, but there was a clear increase in caveats in news (odds ratios 9.6 and 9.5 for caveats for advice and causal claims, CIs 4.1 to 24.3 and 6.0 to 15.2). The main study limitation is its retrospective correlational nature. Conclusions: For health and science news directly inspired by press releases, the main source of both exaggerations and caveats appears to be the press release itself. However we find no evidence that exaggerations increase, or caveats decrease, the likelihood of news coverage. 
These findings should be encouraging for press officers and scientists who wish to minimise exaggeration and include caveats in their press releases.
Article
Full-text available
Science-related news stories can have a profound impact on how the public make decisions. The current study presents 4 experiments that examine how participants understand scientific expressions used in news headlines. The expressions concerned causal and correlational relationships between variables (e.g., “being breast fed makes children behave better”). Participants rated or ranked headlines according to the extent that one variable caused the other. Our results suggest that participants differentiate between 3 distinct categories of relationship: direct cause statements (e.g., “makes,” “increases”), which were interpreted as the most causal; can cause statements (e.g., “can make,” “can increase”); and moderate cause statements (e.g., “might cause,” “linked,” “associated with”), but do not consistently distinguish within the last group despite the logical distinction between cause and association. On the basis of this evidence, we make recommendations for appropriately communicating cause and effect in news headlines.
Article
Full-text available
Background The increasing push to commercialize university research has emerged as a significant science policy challenge. While the socio-economic benefits of increased and rapid research commercialization are often emphasized in policy statements and discussions, there is less mention or discussion of potential risks. In this paper, we highlight such potential risks and call for a more balanced assessment of the commercialization ethos and trends. Discussion There is growing evidence that the pressure to commercialize is directly or indirectly associated with adverse impacts on the research environment, science hype, premature implementation or translation of research results, loss of public trust in the university research enterprise, research policy conflicts and confusion, and damage to the long-term contributions of university research. Summary The growing emphasis on commercialization of university research may be exerting unfounded pressure on researchers and misrepresenting scientific research realities, prospects and outcomes. While more research is needed to verify the potential risks outlined in this paper, policy discussions should, at a minimum, acknowledge them.
Article
Full-text available
Science press officers can play an integral role in helping promote expectations and hype about biomedical research. Using this as a starting point, this article draws on interviews with 10 UK-based science press officers, which explored how they view their role as science reporters and as generators of expectations. Using Goodwin's notion of 'professional vision', we argue that science press officers have a specific professional vision that shapes how they produce biomedical press releases, engage in promotion of biomedical research and make sense of hype. We discuss how these insights can contribute to the sociology of expectations, as well as inform responsible science communication. © The Author(s) 2015.
Article
Full-text available
Fake medical Web sites have become increasingly prevalent. Consequently, much of the health-related information and advice available online is inaccurate and/or misleading. Scores of medical institution Web sites are for organizations that do not exist, and more than 90% of online pharmacy Web sites are fraudulent. In addition to monetary losses exacted on unsuspecting users, these fake medical Web sites have severe public safety ramifications. According to a World Health Organization report, approximately half the drugs sold on the Web are counterfeit, resulting in thousands of deaths. In this study, we propose an adaptive learning algorithm called recursive trust labeling (RTL). RTL uses underlying content and graph-based classifiers, coupled with a recursive labeling mechanism, for enhanced detection of fake medical Web sites. The proposed method was evaluated on a test bed encompassing nearly 100 million links between 930,000 Web sites, including 1,000 known legitimate and fake medical sites. The experimental results revealed that RTL was able to significantly improve fake medical Web site detection performance over 19 comparison content and graph-based methods, various meta-learning techniques, and existing adaptive learning approaches, with an overall accuracy of over 94%. Moreover, RTL was able to attain high performance levels even when the training dataset comprised as few as 30 Web sites. With the increased popularity of eHealth and Health 2.0, the results have important implications for online trust, security, and public safety.
Article
Full-text available
The media have a key role in communicating advances in medicine to the general public, yet the accuracy of medical journalism is an under-researched area. This project adapted an established monitoring instrument to analyse all identified news reports (n = 312) on a single medical research paper: a meta-analysis published in the British Journal of Cancer which showed a modest link between processed meat consumption and pancreatic cancer. Our most significant finding was that three sources (the journal press release, a story on the BBC News website and a story appearing on the 'NHS Choices' website) appeared to account for the content of over 85% of the news stories which covered the meta-analysis, with many of them being verbatim or moderately edited copies and most not citing their source. The quality of these 3 primary sources varied from excellent (NHS Choices, 10 of 11 criteria addressed) to weak (journal press release, 5 of 11 criteria addressed), and this variance was reflected in the accuracy of stories derived from them. Some of the methods used in the original meta-analysis, and a proposed mechanistic explanation for the findings, were challenged in a subsequent commentary also published in the British Journal of Cancer, but this discourse was poorly reflected in the media coverage of the story.
Article
Full-text available
To identify the source (press releases or news) of distortions, exaggerations, or changes to the main conclusions drawn from research that could potentially influence a reader's health related behaviour. Retrospective quantitative content analysis. Journal articles, press releases, and related news, with accompanying simulations. Press releases (n=462) on biomedical and health related science issued by 20 leading UK universities in 2011, alongside their associated peer reviewed research papers and news stories (n=668). Advice to readers to change behaviour, causal statements drawn from correlational research, and inference to humans from animal research that went beyond those in the associated peer reviewed papers. 40% (95% confidence interval 33% to 46%) of the press releases contained exaggerated advice, 33% (26% to 40%) contained exaggerated causal claims, and 36% (28% to 46%) contained exaggerated inference to humans from animal research. When press releases contained such exaggeration, 58% (95% confidence interval 48% to 68%), 81% (70% to 93%), and 86% (77% to 95%) of news stories, respectively, contained similar exaggeration, compared with exaggeration rates of 17% (10% to 24%), 18% (9% to 27%), and 10% (0% to 19%) in news when the press releases were not exaggerated. Odds ratios for each category of analysis were 6.5 (95% confidence interval 3.5 to 12), 20 (7.6 to 51), and 56 (15 to 211). At the same time, there was little evidence that exaggeration in press releases increased the uptake of news. Exaggeration in news is strongly associated with exaggeration in press releases. Improving the accuracy of academic press releases could represent a key opportunity for reducing misleading health related news. © Sumner et al 2014.
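The odds ratios quoted in the abstract above can be reproduced (approximately) from the rounded exaggeration percentages. This is an illustrative calculation only; the published CIs come from the raw counts, so values computed from rounded percentages differ slightly.

```python
# Odds ratio from two proportions: the odds of exaggerated news given an
# exaggerated press release, divided by the odds given a non-exaggerated one.

def odds_ratio(p_exposed: float, p_unexposed: float) -> float:
    """odds(p) = p / (1 - p); the OR is the ratio of the two odds."""
    return (p_exposed / (1 - p_exposed)) / (p_unexposed / (1 - p_unexposed))

# Advice category: 58% of news exaggerated when the press release did,
# 17% when it did not. The paper reports OR = 6.5 from raw counts;
# the rounded percentages give a value close to that.
print(round(odds_ratio(0.58, 0.17), 1))  # 6.7
```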
Article
Full-text available
There is growing competition among publicly funded scientific institutes and universities to attract staff, students, funding and research partners. As a result, there has been increased emphasis on science communication activities in research institutes over the past decade. But are institutes communicating science simply for the sake of improving the institute’s image? In this set of commentaries we explore the relationship between science communication and public relations (PR) activities, in an attempt to clarify what research institutes are actually doing. The overall opinion of the authors is that science communication activities are almost always a form of PR. The press release is still the most popular science communication and PR tool. There is however disagreement over the usefulness of the press release and whether or not gaining public attention is actually good for science.
Article
Full-text available
The news media play an important role in informing the public about scientific and technological developments. Some argue that restructuring and downsizing result in journalists coming under increased pressure to produce copy, leading them to use more public relations material to meet their deadlines. This article explores science journalism in the highly commercialised media market of New Zealand. Using semi-structured interviews with scientists, science communication advisors and journalists, the study finds communication advisors and scientists believe most media outlets, excluding public service media, report science poorly. Furthermore, restructuring and staff cuts have placed the journalists interviewed under increasing pressure. While smaller newspapers appear to be printing press releases verbatim, metropolitan newspaper journalists still exercise control over their use of such material. The results suggest these journalists will continue to resist increasing their use of public relations material for some time to come.
Conference Paper
Full-text available
The NLP community has shown a renewed interest in deeper semantic analyses, among them automatic recognition of relations between pairs of words in a text. We present an evaluation task designed to provide a framework for comparing different approaches to classifying semantic relations between nominals in a sentence. This is part of SemEval, the 4th edition of the semantic evaluation event previously known as SensEval. We define the task, describe the training/test data and their creation, list the participating systems and discuss their results. There were 14 teams who submitted 15 systems.
Article
Full-text available
Understanding how genetic science is communicated to the lay public is of great import. To address this issue, this study examines the presentation of genetic research relating to cancer outcomes and behaviors (i.e., prostate cancer, breast cancer, colon cancer, smoking and obesity) in both the press release (n = 23) and its subsequent news coverage (n = 71). Data suggest that genetic discoveries are presented in a biologically deterministic and simplified manner 67.5% of the time. The introduction of deterministic language is attributed equally to both press releases and news coverage. Also, there are substantive differences between content introduced in the press release and content presented in subsequent press coverage; in fact, when two sources report on the same scientific discovery, the information is inconsistent more than 40% of the time. These findings suggest that the intermediary press release may serve as a source of distortion in the dissemination of science to the lay public.
Article
The kappa statistic is frequently used to test interrater reliability. The importance of rater reliability lies in the fact that it represents the extent to which the data collected in the study are correct representations of the variables measured. Measurement of the extent to which data collectors (raters) assign the same score to the same variable is called interrater reliability. While there have been a variety of methods to measure interrater reliability, traditionally it was measured as percent agreement, calculated as the number of agreement scores divided by the total number of scores. In 1960, Jacob Cohen critiqued the use of percent agreement due to its inability to account for chance agreement. He introduced Cohen's kappa, developed to account for the possibility that raters actually guess on at least some variables due to uncertainty. Like most correlation statistics, kappa can range from -1 to +1. While kappa is one of the most commonly used statistics to test interrater reliability, it has limitations. Judgments about what level of kappa should be acceptable for health research are questioned. Cohen's suggested interpretation may be too lenient for health-related studies because it implies that a score as low as 0.41 might be acceptable. Kappa and percent agreement are compared, and levels for both kappa and percent agreement that should be demanded in healthcare studies are suggested.
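The two measures contrasted above can be sketched in a few lines. This is an illustrative implementation (function names are our own), not code from any of the cited studies:

```python
from collections import Counter

def percent_agreement(r1, r2):
    """Fraction of items on which two raters assign the same label."""
    assert len(r1) == len(r2)
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1, r2):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(r1)
    p_o = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Expected chance agreement: product of each rater's marginal label rates,
    # summed over all labels either rater used.
    p_e = sum((c1[k] / n) * (c2[k] / n) for k in set(r1) | set(r2))
    return (p_o - p_e) / (1 - p_e)
```

For two raters who agree on half the items purely by coincidence, percent agreement is 0.5 but kappa is 0, which is exactly the chance-agreement correction the abstract describes.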
Article
To determine whether the quality of press releases issued by medical journals can influence the quality of associated newspaper stories. Retrospective cohort study of medical journal press releases and associated news stories. We reviewed consecutive issues (going backwards from January 2009) of five major medical journals (Annals of Internal Medicine, BMJ, Journal of the National Cancer Institute, JAMA, and New England Journal of Medicine) to identify the first 100 original research articles with quantifiable outcomes and that had generated any newspaper coverage (unique stories ≥100 words long). We identified 759 associated newspaper stories using Lexis Nexis and Factiva searches, and 68 journal press releases using Eurekalert and journal website searches. Two independent research assistants assessed the quality of journal articles, press releases, and a stratified random sample of associated newspaper stories (n=343) by using a structured coding scheme for the presence of specific quality measures: basic study facts, quantification of the main result, harms, and limitations. Proportion of newspaper stories with specific quality measures (adjusted for whether the quality measure was present in the journal article's abstract or editor note). We recorded a median of three newspaper stories per journal article (range 1-72). Of 343 stories analysed, 71% reported on articles for which medical journals had issued press releases. 9% of stories quantified the main result with absolute risks when this information was not in the press release, 53% did so when it was in the press release (relative risk 6.0, 95% confidence interval 2.3 to 15.4), and 20% when no press release was issued (2.2, 0.83 to 6.1). 133 (39%) stories reported on research describing beneficial interventions. 
24% mentioned harms (or specifically declared no harms) when harms were not mentioned in the press release, 68% when mentioned in the press release (2.8, 1.1 to 7.4), and 36% when no press release was issued (1.5, 0.49 to 4.4). 256 (75%) stories reported on research with important limitations. 16% reported any limitations when limitations were not mentioned in the press release, 48% when mentioned in the press release (3.0, 1.5 to 6.2), and 21% if no press release was issued (1.3, 0.50 to 3.6). High quality press releases issued by medical journals seem to make the quality of associated newspaper stories better, whereas low quality press releases might make them worse.
Conference Paper
Finding temporal and causal relations is crucial to understanding the semantic structure of a text. Since existing corpora provide no parallel temporal and causal annotations, we annotated 1000 conjoined event pairs, achieving inter-annotator agreement of 81.2% on temporal relations and 77.8% on causal relations. We trained machine learning models using features derived from WordNet and the Google N-gram corpus, and they outperformed a variety of baselines, achieving an F-measure of 49.0 for temporals and 52.4 for causals. Analysis of these models suggests that additional data will improve performance, and that temporal information is crucial to causal relation identification.
Conference Paper
We present the second version of the Penn Discourse Treebank, PDTB-2.0, describing its lexically-grounded annotations of discourse relations and their two abstract object arguments over the 1 million word Wall Street Journal corpus. We describe all aspects of the annotation, including (a) the argument structure of discourse relations, (b) the sense annotation of the relations, and (c) the attribution of discourse relations and each of their arguments. We list the differences between PDTB-1.0 and PDTB-2.0. We present representative statistics for several aspects of the annotation in the corpus.
Article
To assess the inappropriate use of causal language in studies on obesity and nutrition. Titles and abstracts of 525 peer-reviewed papers in the 4 leading journals in the fields of obesity and nutrition were scrutinized for language implying causality in observational studies published in 2006. Such misleading language appeared in 161 papers (31%), independent of funding source. Remarkably, 49% of studies lacking statistically significant primary outcomes used misleading language, compared to 29% of those with p values ≤0.05 (chi-square p < 0.001). Exculpatory language was present in the body of the text in 19% of the 161 studies. We suggest that editors and reviewers evaluate submissions for misleading reporting.
Article
Scientific journals issue press releases to disseminate scientific news about articles they publish. To assess whether press releases about journal articles were associated with publication of subsequent newspaper stories. Retrospective content analysis of newspaper stories, journal press releases, and journal tables of contents. From December 1, 1996, to February 28, 1997, press releases and tables of contents were collected from BMJ, Nature, Science, and The Lancet, along with newspaper stories on scientific research published in The New York Times (United States), Le Figaro and Le Monde (France), El País and La Vanguardia (Spain), La Repubblica (Italy), and the International Herald Tribune. Number of newspaper stories that contained reference to articles appearing in the 4 scientific journals, number of newspaper stories that referred to journal articles described in press releases, and the order in which journal articles were mentioned in press releases. Of the 1060 newspaper stories analyzed, 142 referred to journal articles; of these, 119 (84%) referred to articles mentioned in press releases and 23 (16%) referred to journal articles not mentioned in press releases (comparison of proportions, P=.03). Articles described first or second were referenced in more newspapers than articles described later in the press release (P=.01 by chi2 analysis). Journal articles described in press releases, in particular those described first or second in the press release, are associated with the subsequent publication of newspaper stories on the same topic.
Article
HealthNewsReview.org evaluates US health news coverage of claims made about medical interventions. Gary Schwitzer reports on the project's findings after evaluation of 500 health news stories.
Article
Background: Exaggerations in health news were previously found to strongly associate with similar exaggerations in press releases. Moreover, such exaggerations did not appear to attract more news. Here we assess whether press release practice changed after these reported findings; simply drawing attention to the issue may be insufficient for practical change, given the challenges of media environments. Methods: We assessed whether rates of causal over-statement in press releases based on correlational data were lower following a seminal paper on the topic, compared to an equivalent baseline period in the preceding year. Results: We found that over-statements in press releases reduced from 28% (95% confidence interval = 16% to 45%) in 2014 to 13% (95% confidence interval = 6% to 25%) in 2015. A corresponding numerical reduction in exaggerations in news was not significant. The association between over-statements in news and press releases remained strong. Conclusions: Press release over-statements were less frequent following publication of Sumner et al. (2014), indicating that press release practice is malleable. However, this is correlational evidence and the reduction may be due to other factors.
Article
Scientific institutions have long known the importance of framing and owning stories about science. They also know the effective way of communicating science in a press release. This is part of the institution's public relations. Enhanced competition among research institutions has led to a buildup of communicative competences and professionalization of public relations inside the institutions, and the press release has become an integrated part of science communication from these institutions. Changing working conditions in the media, where fewer people have to publish more, have made press releases from trustworthy scientific institutions into free and easily copied content for editors. In this commentary I investigate and discuss the communicative ecosystem of the university press release. I take a particularly close look at the role of the critical and independent science journalist in relation to this corporate-controlled communication.
Article
Causal relation extraction is a challenging yet very important task for Natural Language Processing (NLP). There are many existing approaches developed to tackle this task, either rule-based (non-statistical) or machine-learning-based (statistical) methods. For rule-based methods, extensive manual work is required to construct handcrafted patterns; even so, precision and recall are low due to the complexity of causal relation expressions in natural language. For machine-learning-based methods, current approaches either rely on sophisticated feature engineering, which is error-prone, or on large amounts of labeled data, which is impractical for the causal relation extraction problem. To address the above issues, we propose a Knowledge-oriented Convolutional Neural Network (K-CNN) for causal relation extraction in this paper. K-CNN consists of a knowledge-oriented channel that incorporates human prior knowledge to capture the linguistic clues of causal relationships, and a data-oriented channel that learns other important features of causal relations from the data. The convolutional filters in the knowledge-oriented channel are automatically generated from lexical knowledge bases such as WordNet and FrameNet. We propose filter selection and clustering techniques to reduce dimensionality and improve the performance of K-CNN. Furthermore, additional semantic features that are useful for identifying causal relations are created. Three datasets have been used to evaluate the ability of K-CNN to effectively extract causal relations from texts, and the model outperforms current state-of-the-art models for relation extraction.
Book
Analyzing the role of journalists in science communication, this book presents a perspective on how this is going to evolve in the twenty-first century. The book takes three distinct perspectives on this interesting subject. Firstly, science journalists reflect on their ‘operating rules’ (science news values and news making routines). Secondly, a brief history of science journalism puts things into context, characterising the changing output of science writing in newspapers over time. Finally, the book invites several international journalists or communication scholars to comment on these observations thereby opening the global perspective.
Article
Background: To earn HONcode certification, a website must conform to the 8 principles of the HONcode of Conduct. In the current manual process of certification, a HONcode expert assesses the candidate website using precise guidelines for each principle. In the scope of the European project KHRESMOI, the Health on the Net (HON) Foundation has developed an automated system to assist in detecting a website's HONcode conformity. Automated assistance in conducting HONcode reviews can expedite the current time-consuming tasks of HONcode certification and ongoing surveillance. Additionally, an automated tool used as a plugin to a general search engine might help to detect health websites that respect HONcode principles but have not yet been certified. Objective: The goal of this study was to determine whether the automated system is capable of performing as well as human experts for the task of identifying HONcode principles on health websites. Methods: Using manual evaluation by HONcode senior experts as a baseline, this study compared the capability of the automated HONcode detection system to that of the HONcode senior experts. A set of 27 health-related websites were manually assessed for compliance to each of the 8 HONcode principles by senior HONcode experts. The same set of websites were processed by the automated system for HONcode compliance detection based on supervised machine learning. The results obtained by these two methods were then compared. Results: For the privacy criterion, the automated system obtained the same results as the human expert for 17 of 27 sites (14 true positives and 3 true negatives) without noise (0 false positives). The remaining 10 false negative instances for the privacy criterion represented tolerable behavior because it is important that all automatically detected principle conformities are accurate (ie, specificity [100%] is preferred over sensitivity [58%] for the privacy criterion).
In addition, the automated system had precision of at least 75%, with a recall of more than 50% for contact details (100% precision, 69% recall), authority (85% precision, 52% recall), and reference (75% precision, 56% recall). The results also revealed issues for some criteria such as date. Changing the “document” definition (ie, using the sentence instead of whole document as a unit of classification) within the automated system resolved some but not all of them. Conclusions: Study results indicate concordance between automated and expert manual compliance detection for authority, privacy, reference, and contact details. Results also indicate that using the same general parameters for automated detection of each criterion produces suboptimal results. Future work to configure optimal system parameters for each HONcode principle would improve results. The potential utility of integrating automated detection of HONcode conformity into future search engines is also discussed. http://www.jmir.org/2015/6/e135/
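The specificity-over-sensitivity trade-off described in the abstract follows directly from the confusion-matrix counts it reports for the privacy criterion (14 true positives, 0 false positives, 3 true negatives, 10 false negatives). A minimal illustrative sketch, with a helper name of our own choosing:

```python
def confusion_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # recall: share of truly conforming sites flagged
    specificity = tn / (tn + fp)  # share of non-conforming sites correctly rejected
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, specificity, precision

# Privacy-criterion counts reported in the abstract.
sens, spec, prec = confusion_metrics(tp=14, fp=0, tn=3, fn=10)
```

With zero false positives, specificity and precision are perfect while sensitivity drops to 14/24 (about 58%), matching the figures quoted in the abstract.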
Article
Journal publication has long been relied on as the only required communication of results, tasking journalists with bringing news of scientific discoveries to the public. Output of science papers increased 15% between 1990 and 2001, with total output over 650,000. But fewer than 0.013% to 0.34% of papers gained attention from mass media, with health/medicine papers taking the lion's share of coverage. Fields outside of health/medicine had an appearance rate of only 0.001% to 0.005%. In light of findings that show scientific literacy declining despite growing public interest and scientific output, this study attempts to show that reliance on journal publication and subsequent coverage by the media as the sole form of communication en masse is failing to communicate science to the public.
Article
Commentary: When this essay first appeared more than 10 years ago, it built on a small but substantial body of scholarship that declared scientific writing an appropriate field for rhetorical analysis. In the last 10 years, studies of scientific writing for both expert and lay audiences have increased exponentially, drawing on the long-established disciplines of the history and philosophy of science. These newer studies, however, differ widely in approach. Many take the perspective of cultural critique (e.g., the work of Bruno Latour and Stephen Woolgar), whereas others use the tools of discourse analysis (e.g., Greg Myers, M.A.K. Halliday, and J. R. Martin). But, application of rhetorical theory also thrives in the work of John Angus Campbell, Alan Gross, Charles Bazerman, Jean Dietz Moss, Lawrence J. Prelli, Carolyn Miller, and many others. Randy Allen Harris offers a useful introduction to this field in Landmark Essays on Rhetoric in Science (1997). “Accommodating Science” applies ideas from classical rhetoric and techniques of close reading typical of discourse analysis to the question of what happens when scientific reports travel from expert to lay publications. This change in forum causes a shift in genre from forensic to celebratory and a shift in stasis from fact and cause to evaluation and action. These changes in genre, audience, and purpose inevitably affect the material and manner of re-presentation in predictable ways. Two concerns informed this study 10 years ago: the impact of science reporting on public deliberation and the nature of technical and professional writing courses. These concerns have, if anything, increased (e.g., the campaign on global warming), warranting continued scholarly investigation of the gap between the public's right to know and the public's ability to understand.
Article
The suggestion that the activities of public relations professionals and news agencies help to shape news content in national and local news media is increasingly commonplace among journalists, academics and public relations professionals. The findings from this study provide substantive empirical evidence to support such claims. The study analyses the domestic news content of UK national “quality” newspapers (2207 items in the Guardian, The Times, Independent, Daily Telegraph and the mid-market Daily Mail) and radio and television news reports (402 items broadcast by BBC Radio 4, BBC News, ITV News and SkyNews), across two week-long sample periods in 2006, to identify the influence of specific public relations materials and news agency copy (especially reports provided by the UK Press Association) in published and broadcast news contents. The findings illustrate that journalists’ reliance on these news sources is extensive and raises significant questions concerning claims to journalistic independence in UK news media and journalists’ role as a fourth estate. A political economy analysis suggests that the factors which have created this editorial reliance on these ‘information subsidies’ seems set to continue, if not increase, in the near future.
Article
There is a general assumption that the inverted pyramid (lead-and-body principle, answers to four or five w-questions at the beginning of the article) became a professional standard during the American Civil War (1861–65), either because of the unreliability of the new telegraph technology (technological explanation); or because of the information policy of the Union (political explanation); or because of the increasing commercial interests of publishers and competition between them (economical explanation). But a content analysis of the New York Herald and the New York Times shows that the inverted pyramid became commonplace only two decades later. Between 1880 and 1890, moreover, publishers and editors attempted systematically to enhance the comprehensibility of their products by using, for example, headlines and illustrations. The author therefore favours the thesis that the journalistic routine and genre of the inverted pyramid resulted from the professional effort to strengthen the communicative quality of news.
Article
The news media are often criticized for exaggerated coverage of weak science. Press releases, a source of information for many journalists, might be a source of those exaggerations. To characterize research press releases from academic medical centers. Content analysis. Press releases from 10 medical centers at each extreme of U.S. News & World Report's rankings for medical research. Press release quality. Academic medical centers issued a mean of 49 press releases annually. Among 200 randomly selected releases analyzed in detail, 87 (44%) promoted animal or laboratory research, of which 64 (74%) explicitly claimed relevance to human health. Among 95 releases about primary human research, 22 (23%) omitted study size and 32 (34%) failed to quantify results. Among all 113 releases about human research, few (17%) promoted studies with the strongest designs (randomized trials or meta-analyses). Forty percent reported on the most limited human studies (those with uncontrolled interventions, small samples (<30 participants), surrogate primary outcomes, or unpublished data), yet 58% lacked the relevant cautions. The effects of press release quality on media coverage were not directly assessed. Press releases from academic medical centers often promote research that has uncertain relevance to human health and do not provide key facts or acknowledge important limitations. National Cancer Institute.
Article
While medical journals strive to ensure accuracy and the acknowledgment of limitations in articles, press releases may not reflect these efforts. Telephone interviews conducted in January 2001 with press officers at 9 prominent medical journals and analysis of press releases (n = 127) about research articles for the 6 issues of each journal preceding the interviews. Seven of the 9 journals routinely issue releases; in each case, the editor with the press office selects articles based on perceived newsworthiness and releases are written by press officers trained in communications. Journals have general guidelines (eg, length) but no standards for acknowledging limitations or for data presentation. Editorial input varies from none to intense. Of the 127 releases analyzed, 29 (23%) noted study limitations and 83 (65%) reported main effects using numbers; 58 reported differences between study groups and of these, 26 (55%) provided the corresponding base rate, the format least prone to exaggeration. Industry funding was noted in only 22% of 23 studies receiving such funding. Press releases do not routinely highlight study limitations or the role of industry funding. Data are often presented using formats that may exaggerate the perceived importance of findings.
Article
To analyse the reviews of medical news articles posted on media doctor, a medical news-story monitoring website. A descriptive summary of operating the media doctor website between 1 February and 1 September 2004. Consensus scores for 10 assessment criteria for the medical intervention described in the article (novelty, availability in Australia, alternative treatment options given, evidence of "disease mongering", objective supportive evidence given, quantification of benefits, coverage of harms, coverage of costs, independent sources of information, and excessive reliance on a press release); cumulative article rating scores for major media outlets. 104 news articles were featured on media doctor in the study period. Both online and print media scored poorly, although the print media were superior: mean total scores 56.1% satisfactory for print and 40.1% for online; percentage points difference 15.9 (95% CI, 8.3-23.6). The greatest differences were seen for the use of independent information sources, quantification of benefits and coverage of potential harms. Australian lay news reporting of medical advances, particularly by the online news services, is poor. This might improve if journals and researchers became more active in communicating with the press and the public.
Deborah Charnock, Sasha Shepperd, Gill Needham, and Robert Gann. 1999. Discern: an instrument for judging the quality of written consumer health information on treatment choices. Journal of Epidemiology & Community Health, 53(2):105-111.
Enyan Dai, Yiwei Sun, and Suhang Wang. 2020. Ginger cannot cure cancer: Battling fake health news with a comprehensive data repository. In Proceedings of the International AAAI Conference on Web and Social Media, volume 14, pages 853-862.
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Phil Galewitz. 2006. Ongoing newsroom cutbacks hit health reporting ranks. Association of Health Care Journalists. HealthBeat, 9:1-12.
Winfried Göpfert. 2007. The strength of PR and the weakness of science journalism. In MW Bauer and M Bucchi, editors, Journalism, Science and Society: Science Communication Between News and Public Relations, chapter 20, pages 215-226. Routledge, New York.
Iris Hendrickx, Su Nam Kim, Zornitsa Kozareva, Preslav Nakov, Diarmuid Ó Séaghdha, Sebastian Padó, Marco Pennacchiotti, Lorenza Romano, and Stan Szpakowicz. 2009. SemEval-2010 Task 8: Multi-way classification of semantic relations between pairs of nominals. In Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions, pages 94-99. Association for Computational Linguistics.
Paramita Mirza and Sara Tonelli. 2014. An analysis of causality between events and its relation to temporal information. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pages 2097-2106.
Haichen Zhou and Bei Yu. 2020. Information quality of reddit link posts on health news. In Proceedings of the 2020 iConference, pages 186-197. Springer.
Mark Zweig and Emily DeVoto. 2018. Observational studies: Does the language fit the evidence? Association vs. causation. http://www.healthnewsreview.org/toolkit/tips-for-understanding-studies/.