MTMvis built on the topic model obtained from the abstracts of the citing entities, shown across the three periods P1-P3. For each period, the visualization plots the topic distribution (e.g., topic 3 is the dominant topic in all three periods: P1, P2, and P3).

Source publication
Article
Full-text available
In this article, we show the results of a quantitative and qualitative analysis of open citations to a popular and highly cited retracted paper: “Ileal-lymphoid-nodular hyperplasia, non-specific colitis and pervasive developmental disorder in children” by Wakefield et al., published in 1998. The main purpose of our study is to understand the beha...

Contexts in source publication

Context 1
... MTMvis visualizations are plotted over the periods P1-P3 (Fig. 6) and the subject areas of the citing articles (Fig. 7). As shown in Fig. 6, topics 1, 2, and 5 steadily increased their percentages over time while, conversely, topics 4 and 9 decreased. Along the same lines, topics 3 and 11 showed a very similar pattern across the three periods. As shown in Fig. 7, some ...
Context 2
... MTMvis visualizations are plotted over the periods P1-P3 (Fig. 6) and the subject areas of the citing articles (Fig. 7). As shown in Fig. 6, topics 1, 2, and 5 steadily increased their percentages over time while, conversely, topics 4 and 9 decreased. Along the same lines, topics 3 and 11 showed a very similar pattern across the three periods. As shown in Fig. 7, some subject areas, such as medicine and social sciences, referred to almost all the ...
Context 3
... Fig. 16, we investigated the sections of the in-text citations marked as credits and cites as evidence. On the one hand, the credits citations were mostly distributed over descriptive sections (i.e., introduction, discussion, and background) during all three periods. Fig. 15 The four graphs illustrate the way the use of citation intents ...
Context 4
... Fig. 15 The four graphs illustrate the way the use of citation intents changed over time (i.e., the three periods P1, P2 and P3) and according to their perceived sentiment. The citation intents cites as evidence, critiques and credits are illustrated in separate charts, which show an increase in negative sentiment across the three periods. Fig. 16 The cites as evidence and credits citation intents distributions among the sections (the recognizable ones) and during the three periods (i.e., P1-P3). Fig. 17 The evolution over time of three groups of topics defined from the citation contexts of the in-text citations to WF-PUB-1998. On the other hand, the cites as evidence citations ...

Citations

... The percentage of citations that mention the retraction has grown over time: it was 33% in 2015 and 61% in 2017, the last year studied (Heibi & Peroni, 2021; cf. Suelzer et al., 2019). ...
... To efficiently identify the themes, we provided keywords to explore the possibilities within each cluster. To discover topics of the scholarly documents in each cluster, we employed the Latent Dirichlet Allocation (LDA) [25,26] approach for topic modeling. LDA represents each scholarly document as a distribution of topics and each topic as a distribution of keywords. ...
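As a rough illustration of the two-level representation described in this excerpt, here is a minimal topic-modeling sketch using gensim; the toy corpus, token lists, and parameter values are illustrative assumptions, not the cited study's actual data or settings:

```python
# Minimal LDA sketch (gensim): each document becomes a distribution over
# topics, and each topic a distribution over keywords. Toy data only.
from gensim.corpora import Dictionary
from gensim.models import LdaModel

docs = [
    ["vaccine", "autism", "retraction", "citation"],
    ["topic", "model", "abstract", "citation"],
    ["vaccine", "retraction", "journal", "paper"],
]  # pre-tokenized abstracts (assumed input format)

dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(d) for d in docs]  # bag-of-words vectors

lda = LdaModel(corpus=corpus, id2word=dictionary,
               num_topics=2, random_state=0, passes=10)

print(lda.get_document_topics(corpus[0]))  # document as topic distribution
print(lda.show_topic(0, topn=4))           # topic as keyword distribution
```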
Article
Full-text available
Multiple studies have investigated bibliometric features and uncategorized scholarly documents for the influential scholarly document prediction task. In this paper, we describe our work that attempts to go beyond bibliometric metadata to predict influential scholarly documents. Furthermore, this work also examines the influential scholarly document prediction task over categorized scholarly documents. We also introduce a new approach to enhance the document representation method with a domain-independent knowledge graph to find the influential scholarly document using categorized scholarly content. As the input collection, we use the WHO corpus with scholarly documents on the theme of COVID-19. This study examines different document representation methods for machine learning, including TF-IDF, BOW, and embedding-based language models (BERT). The TF-IDF document representation method works better than the others. Of the various machine learning methods tested, logistic regression outperformed the others for scholarly document category classification, and the random forest algorithm obtained the best results for influential scholarly document prediction, with the help of a domain-independent knowledge graph, specifically DBpedia, to enhance the document representation for predicting influential scholarly documents with categorical scholarly content. In this case, our study combines state-of-the-art machine learning methods with the BOW document representation method. We also enhance the BOW document representation with the direct type (RDF type) and unqualified relation from DBpedia. In this experiment, we did not find any impact of the enhanced document representation on scholarly document category classification, but we did find an effect on influential scholarly document prediction with categorical data.
... One such example is the continued citation of Andrew Wakefield's article, whose purported link between autism and vaccination has been debunked [24], but which may still be influential amongst anti-vaccination activists and believers. While analyses of post-retraction citations of Wakefield's article indicated that many of these are made in a negative sense and that its retraction is widely acknowledged [25][26][27], concerns remain: some citations did not document the retraction [25], and, as pointed out by Leta and colleagues, "recent citing articles are highly cited and, even in a negative context, they contribute to the diffusion of a fraudulent article in the science context" [27]. Another potentially worrying example of post-retraction citation pertains to citations of retracted COVID-19 papers [28,29], which are largely made without reference to their retractions and in a non-critical manner. ...
Article
Full-text available
Once retracted, the citation count of a research paper might be intuitively expected to drop precipitously. Here, we assessed the post-retraction citation of life and medical sciences papers from two top-ranked, multidisciplinary journals, Nature and Science, from 2010 to 2018. Post-retraction citations accounted for a staggering 47.7% and 40.9% of total citations (median values), respectively, of the papers included in our analysis. These numbers are comparable with those from two journals with lower impact factors, and with retracted papers from the physical sciences discipline. A more qualitative assessment of five papers from the two journals with a high percentage (>50%) of post-retraction citations, all of which are associated with misconduct, reveals different contributing reasons and factors. Retracted papers associated with highly publicized misconduct cases are more prone to being cited with the retraction status indicated, or projected negatively (such as in the context of research ethics and misconduct discussions), with the latter also indicated by cross-disciplinary citations by humanities and social sciences articles. Retracted papers that retained significant validity in their main findings/conclusions may receive a large number of neutral citations that are somewhat blind to the retraction. Retracted papers in popular subject areas with massive publication outputs, particularly secondary publications such as reviews, may also have a high background citation noise. Our findings add further insights into the nature of post-retraction citations beyond the plain notion that these are largely made through sheer ignorance or negligence by the citing authors.
... Research on the citation behavior of retracted articles has focused on quantitative aspects, such as citation growth and altmetrics [20]. In 2021, Heibi and Peroni performed a citation analysis of Wakefield's retracted work [21], which claimed a (non-existent) association between vaccinations and autism [22]. They found that citations to Wakefield's paper continued to increase after retraction, but that most citations occur in general discussions. ...
Chapter
Full-text available
The amount of information in digital libraries (DLs) has been experiencing rapid growth. With the intense competition for research breakthroughs, researchers often intentionally or unintentionally fail to adhere to scientific standards, leading to the retraction of scientific articles. When a paper gets retracted, all its citing articles have to be verified to ensure the overall correctness of the information in digital libraries. Since this subjective verification is extremely time- and resource-consuming, we propose a triage process that focuses on papers that imply a dependence on retracted articles, thus requiring further reevaluation. This paper seeks to establish a systematic approach for identifying and scrutinizing scholarly works that draw upon retracted work through direct citations, thus emphasizing the importance of further evaluation within the scholarly discourse. Firstly, we categorized and identified the intention in the citation context using verbs with predicative complements and cue phrases. Secondly, we classified the citation intentions of the retracted articles into dependent (if the citing paper is based on or incorporates part of the cited retracted work) and non-dependent (if the citing article discusses, criticizes, or negates the cited work). Finally, we compared our approach with the existing state-of-the-art literature and found that our proposed triage process can aid in ensuring the integrity of scientific literature, thereby enhancing its quality.
... In contrast, for non-CS papers, problematic data, results, and duplicate and plagiarised papers dominated. Heibi and Peroni [25] have also noted how computer science seems distinct from other disciplines. However, the single largest difference is not apparent from the chart: for approximately 56% of CS papers, but only 26% of all other papers, little or no information is publicly available as to why the paper was retracted. ...
Article
Full-text available
Context: The retraction of research papers, for whatever reason, is a growing phenomenon. However, although retracted paper information is publicly available via publishers, it is somewhat distributed and inconsistent. Objective: The aim is to assess: (i) the extent and nature of retracted research in Computer Science (CS), (ii) the post-retraction citation behaviour of retracted works, and (iii) the potential impact upon systematic reviews and mapping studies. Method: We analyse the Retraction Watch database and take citation information from the Web of Science and Google Scholar. Results: We find that of the 33,955 entries in the Retraction Watch database (16 May 2022), 2,816 are classified as CS, i.e., ≈ 8%. For CS, 56% of retracted papers provide little or no information as to the reasons. This contrasts with 26% for other disciplines. There is also some disparity between different publishers, a tendency for multiple versions of a retracted paper to be available beyond the Version of Record (VoR), and for new citations long after a paper is officially retracted (median = 3; maximum = 18). Systematic reviews are also impacted, with ≈ 30% of the retracted papers having one or more citations from a review. Conclusions: Unfortunately, retraction seems to be a sufficiently common outcome for a scientific paper that we as a research community need to take it more seriously, e.g., standardising procedures and taxonomies across publishers and the provision of appropriate research tools. Finally, we recommend particular caution when undertaking secondary analyses and meta-analyses which are at risk of becoming contaminated by these problem primary studies.
... 42 In 2021, Heibi and Peroni studied the citation patterns in a twenty-year window (1998-2018) of a highly cited paper that was published in 1998 in The Lancet, partially retracted in 2004, and fully retracted in 2010, finding that the paper had accumulated 130, 148, and 337 citations by these three dates, respectively, many of which offered positive support to the paper or even failed to indicate that it was retracted. 43 Ultimately, for journals that employ those citations for metrics-based 'rankings,' such as the Clarivate Analytics journal impact factor (JIF) or Elsevier's CiteScore, if such citations are abused, for example, JIF or CiteScore built on retracted papers (i.e., potentially invalid citations), then such journals and publishers might benefit unfairly. 44 In such cases, to be fair to other journals whose JIF, CiteScore or other metrics might not have been dependent on the citation of retracted literature, those metrics need to be adjusted downward as a corrective, not punitive, measure. ...
Article
Full-text available
Citations in a scientific paper reference other studies and form the information backbone of that paper. If cited literature is valid and non-retracted, an analysis of citations can offer unique perspectives on the supportive or contradictory nature of a statement. Yet, such analyses are still limited by the relative lack of access to open citation data. The creation of open citation databases (OCDs) allows data analysts, bibliometric specialists and other academics interested in such topics to independently verify the validity and accuracy of a citation. Since the strength of an individual's curriculum vitae can be based on, and assessed by, metrics (citation counts, altmetric mentions, journal ranks, etc.), there is interest in appreciating citation networks and their link to research performance. Open citations would thus not only benefit career, funding and employment initiatives; they could also be used to reveal citation rings, abusive author-author or journal-journal citation strategies, or to detect false or erroneous citations. OCDs should be open to the public, and publishers have a moral responsibility to release citation data for free use and academic exploration. Some challenges remain, including long-term funding, and data and information security.
... In the second step, we created vectors for each of the generated tokens using a Bag of Words (BoW) model 7, which we considered appropriate for our study given our direct experience in previous findings (Heibi & Peroni, 2021a) and the suggestions by Bengfort et al. (2018) on the same issue. Finally, to build the LDA topic model, we determined in advance the number of topics to retrieve for the examined corpus using a popular method based on the value of the topic coherence score, as suggested in (Schmiedel et al., 2019), which can be used to measure the degree of semantic similarity between high-scoring words in the topic. ...
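One common way to operationalize this coherence-based choice of the number of topics is sketched below with gensim; the helper name, the candidate range, and all parameter values are assumptions for illustration, not the study's actual pipeline:

```python
# Sketch: pick the number of LDA topics by maximizing the c_v coherence
# score over a candidate range. Inputs (tokenized texts) are assumed.
from gensim.corpora import Dictionary
from gensim.models import CoherenceModel, LdaModel

def best_num_topics(texts, candidates=range(2, 15)):
    dictionary = Dictionary(texts)
    corpus = [dictionary.doc2bow(t) for t in texts]
    scores = {}
    for k in candidates:
        lda = LdaModel(corpus=corpus, id2word=dictionary,
                       num_topics=k, random_state=0, passes=5)
        cm = CoherenceModel(model=lda, texts=texts,
                            dictionary=dictionary, coherence="c_v")
        scores[k] = cm.get_coherence()  # higher = more coherent topic words
    return max(scores, key=scores.get), scores
```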
... The opposite trend was observed in other disciplines, according to prior studies, such as biomedicine (Dinh et al., 2019) and psychology. However, prior studies such as (Heibi & Peroni, 2021a) and (Schneider et al., 2020) also observed that in the health sciences domain there were cases where either a single or a few popular cases of retraction were characterized by an increase in citations after the retraction. This might suggest that the discipline related to the retracted publication is not the only central factor to consider for predicting the citation trend after the retraction. ...
... (Peroni & Shotton, 2012), an ontology for the characterization of factual and rhetorical bibliographic citations. We used the decision model developed and adopted in (Heibi & Peroni, 2021a) to decide which citation function to select to label an in-text citation. Figure 4 shows part of the decision model; it presents the case when the intent of the citation is "Reviewing and eventually giving an opinion on the cited entity" and the citation function is part of one of the following groups: "Consistent with", "Inconsistent with", or "Talking about". ...
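As a purely hypothetical illustration of what that branch of such a decision model might look like in code, the sketch below maps a reviewing-type in-text citation to one of the three function groups named in the excerpt; the predicate name and the CiTO property suggestions are assumptions, not the authors' actual model:

```python
# Hypothetical sketch of one branch of the citation-function decision model:
# intent = "Reviewing and eventually giving an opinion on the cited entity".
# The group names come from the excerpt; the logic here is illustrative.
from typing import Optional

def citation_function_group(agrees_with_cited: Optional[bool]) -> str:
    """Map a reviewing-type citation to a function group (CiTO-style)."""
    if agrees_with_cited is True:
        return "Consistent with"    # e.g., cito:agreesWith
    if agrees_with_cited is False:
        return "Inconsistent with"  # e.g., cito:disagreesWith
    return "Talking about"          # neutral review, e.g., cito:discusses
```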
Article
Full-text available
In this article, we show and discuss the results of a quantitative and qualitative analysis of open citations to retracted publications in the humanities domain. Our study was conducted by selecting retracted papers in the humanities domain and marking their main characteristics (e.g., retraction reason). Then, we gathered the citing entities and annotated their basic metadata (e.g., title, venue, etc.) and the characteristics of their in-text citations (e.g., intent, sentiment, etc.). Using these data, we performed a quantitative and qualitative study of retractions in the humanities, presenting descriptive statistics and a topic modeling analysis of the citing entities’ abstracts and the in-text citation contexts. As part of our main findings, we noticed that there was no drop in the overall number of citations after the year of retraction, with few entities either mentioning the retraction or expressing a negative sentiment toward the cited publication. In addition, on several occasions, we noticed a higher concern/awareness about citing a retracted publication among citing entities belonging to the health sciences domain, compared to the humanities and the social sciences domains. Philosophy, arts, and history are the humanities areas that showed the highest concern toward the retraction.
... The study was fully retracted in 2010 due to lack of ethical approval, flawed study design, and "incorrect" elements [7], but has continued to receive citations in recent years. The majority of these citing papers disagree with the Wakefield paper and/or acknowledge its retraction [8,9], perhaps in part due to the amount of attention its retraction received. However, despite its retraction, the paper may have had a negative impact on public opinion of the MMR vaccine [10], as well as on vaccination rates, e.g., in Britain [11]. ...
Article
Full-text available
Background Retraction is a mechanism for alerting readers to unreliable material and other problems in the published scientific and scholarly record. Retracted publications generally remain visible and searchable, but the intention of retraction is to mark them as “removed” from the citable record of scholarship. However, in practice, some retracted articles continue to be treated by researchers and the public as valid content as they are often unaware of the retraction. Research over the past decade has identified a number of factors contributing to the unintentional spread of retracted research. The goal of the Reducing the Inadvertent Spread of Retracted Science: Shaping a Research and Implementation Agenda (RISRS) project was to develop an actionable agenda for reducing the inadvertent spread of retracted science. This included identifying how retraction status could be more thoroughly disseminated, and determining what actions are feasible and relevant for particular stakeholders who play a role in the distribution of knowledge. Methods These recommendations were developed as part of a year-long process that included a scoping review of empirical literature and successive rounds of stakeholder consultation, culminating in a three-part online workshop that brought together a diverse body of 65 stakeholders in October–November 2020 to engage in collaborative problem solving and dialogue. Stakeholders held roles such as publishers, editors, researchers, librarians, standards developers, funding program officers, and technologists and worked for institutions such as universities, governmental agencies, funding organizations, publishing houses, libraries, standards organizations, and technology providers. Workshop discussions were seeded by materials derived from stakeholder interviews (N = 47) and short original discussion pieces contributed by stakeholders. The online workshop resulted in a set of recommendations to address the complexities of retracted research throughout the scholarly communications ecosystem. Results The RISRS recommendations are: (1) Develop a systematic cross-industry approach to ensure the public availability of consistent, standardized, interoperable, and timely information about retractions; (2) Recommend a taxonomy of retraction categories/classifications and corresponding retraction metadata that can be adopted by all stakeholders; (3) Develop best practices for coordinating the retraction process to enable timely, fair, unbiased outcomes; and (4) Educate stakeholders about pre- and post-publication stewardship, including retraction and correction of the scholarly record. Conclusions Our stakeholder engagement study led to 4 recommendations to address inadvertent citation of retracted research, and formation of a working group to develop the Communication of Retractions, Removals, and Expressions of Concern (CORREC) Recommended Practice. Further work will be needed to determine how well retractions are currently documented, how retraction of code and datasets impacts related publications, and to identify if retraction metadata (fails to) propagate. Outcomes of all this work should lead to ensuring retracted papers are never cited without awareness of the retraction, and that, in public fora outside of science, retracted papers are not treated as valid scientific outputs.
... In contrast, for non-CS papers, problematic data, results, and duplicate and plagiarised papers dominated. Heibi and Peroni [20] have also noted how computer science seemed to be distinct from other disciplines. ...
Preprint
Full-text available
Context: The retraction of research papers, for whatever reason, is a growing phenomenon. However, although retracted paper information is publicly available via publishers, it is somewhat distributed and inconsistent. Objective: The aim is to assess: (i) the extent and nature of retracted research in Computer Science (CS), (ii) the post-retraction citation behaviour of retracted works, and (iii) the potential impact on systematic reviews and mapping studies. Method: We analyse the Retraction Watch database and take citation information from the Web of Science and Google Scholar. Results: We find that of the 33,955 entries in the Retraction Watch database (16 May 2022), 2,816 are classified as CS, i.e., approximately 8.3%. For CS, 56% of retracted papers provide little or no information as to the reasons. This contrasts with 26% for other disciplines. There is also a remarkable disparity between different publishers, a tendency for multiple versions of a retracted paper over and above the Version of Record (VoR), and for new citations long after a paper is officially retracted. Conclusions: Unfortunately, retraction seems to be a sufficiently common outcome for a scientific paper that we as a research community need to take it more seriously, e.g., standardising procedures and taxonomies across publishers and the provision of appropriate research tools. Finally, we recommend particular caution when undertaking secondary analyses and meta-analyses which are at risk of becoming contaminated by these problem primary studies.
... (Peroni & Shotton, 2012), an ontology for the characterization of factual and rhetorical bibliographic citations. We used the decision model developed and summarized in Figure 4, already adopted in (Heibi & Peroni, 2021a), to decide which citation function to select to label an in-text citation. We do not introduce the full details of the labelling process due to space constraints; an extensive introduction and explanation can be found in (Heibi & Peroni, 2021b). ...
... In the second step, we created vectors for each of the generated tokens using a Bag of Words (BoW) model (Brownlee, 2019), which we considered appropriate to model our study considering our direct experience in previous findings (Heibi & Peroni, 2021a) and the suggestions by Bengfort et al. (2018) on the same issue. Finally, to build the LDA topic model, we determined in advance the number of topics to retrieve according to the examined corpus using a popular method based on the value of the topic coherence score, as suggested in (Schmiedel et al., 2019), which can be used to measure the degree of the semantic similarity between high-scoring words in the topic. ...
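The BoW vectorization step described here can be sketched as follows with gensim; the toy token lists and the pruning thresholds are assumptions for illustration, not the study's settings:

```python
# Sketch of the BoW step: map tokenized texts to sparse (token_id, count)
# vectors. The corpus and the pruning thresholds below are toy assumptions.
from gensim.corpora import Dictionary

tokens = [["citation", "retraction", "humanities", "citation"],
          ["citation", "topic", "model", "context"]]

dictionary = Dictionary(tokens)
# Prune tokens appearing in more than half of the documents ("citation"
# here) and, in real corpora, tokens appearing in too few documents.
dictionary.filter_extremes(no_below=1, no_above=0.5)
bow_corpus = [dictionary.doc2bow(t) for t in tokens]
print(bow_corpus[0])  # sparse [(token_id, count), ...] vector
```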
... psychology. However, prior studies such as (Heibi & Peroni, 2021a) and (Schneider et al., 2020) also observed that in the health sciences domain there were cases where either a single or a few popular cases of retraction were characterized by an increase in citations after the retraction. This might suggest that the discipline related to the retracted article is not the only central factor to consider for predicting the citation trend after the retraction, and that other factors might play a crucial role, such as the popularity of and media attention to the retraction case, as discussed in the studies by Mott et al. (2019) and Bar-Ilan and Halevi (2017). ...
Preprint
Full-text available
In this article, we show and discuss the results of a quantitative and qualitative analysis of citations to retracted publications in the humanities domain. Our study was conducted by selecting retracted papers in the humanities domain and marking their main characteristics (e.g., retraction reason). Then, we gathered the citing entities and annotated their basic metadata (e.g., title, venue, subject, etc.) and the characteristics of their in-text citations (e.g., intent, sentiment, etc.). Using these data, we performed a quantitative and qualitative study of retractions in the humanities, presenting descriptive statistics and a topic modeling analysis of the citing entities' abstracts and the in-text citation contexts. As part of our main findings, we noticed a continuous increase in the overall number of citations after the retraction year, with few entities either mentioning the retraction or expressing a negative sentiment toward the cited entities. In addition, on several occasions we noticed a higher concern and awareness about citing a retracted article among citing entities belonging to the health sciences domain, compared to the humanities and the social sciences domains. Philosophy, arts, and history are the humanities areas that showed the highest concern toward the retraction.